Studies in Computational Intelligence 1108
Paolo Barsocchi · Naga Srinivasu Parvathaneni · Amik Garg · Akash Kumar Bhoi · Filippo Palumbo Editors
Enabling Person-Centric Healthcare Using Ambient Assistive Technology Personalized and Patient-Centric Healthcare Services in AAT
Studies in Computational Intelligence Volume 1108
Series Editor Janusz Kacprzyk, Polish Academy of Sciences, Warsaw, Poland
The series “Studies in Computational Intelligence” (SCI) publishes new developments and advances in the various areas of computational intelligence, quickly and with high quality. The intent is to cover the theory, applications, and design methods of computational intelligence, as embedded in the fields of engineering, computer science, physics and life sciences, as well as the methodologies behind them. The series contains monographs, lecture notes and edited volumes in computational intelligence spanning the areas of neural networks, connectionist systems, genetic algorithms, evolutionary computation, artificial intelligence, cellular automata, self-organizing systems, soft computing, fuzzy systems, and hybrid intelligent systems. Of particular value to both the contributors and the readership are the short publication timeframe and the world-wide distribution, which enable both wide and rapid dissemination of research output. Indexed by SCOPUS, DBLP, WTI Frankfurt eG, zbMATH, SCImago. All books published in the series are submitted for consideration in Web of Science.
Editors Paolo Barsocchi National Research Council Institute of Information Science and Technologies Pisa, Italy Amik Garg KIET Group of Institutions Ghaziabad, India Filippo Palumbo National Research Council Institute of Information Science and Technologies Pisa, Italy
Naga Srinivasu Parvathaneni Department of Computer Science and Engineering Prasad V Potluri Siddhartha Institute of Technology Vijayawada, India Akash Kumar Bhoi KIET Group of Institutions Ghaziabad, India Sikkim Manipal University Sikkim, India
ISSN 1860-949X ISSN 1860-9503 (electronic) Studies in Computational Intelligence ISBN 978-3-031-38280-2 ISBN 978-3-031-38281-9 (eBook) https://doi.org/10.1007/978-3-031-38281-9 © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Switzerland AG The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Preface
The healthcare industry is evolving from essential and non-profitable services to patient-centric healthcare services where individuals pay for the quality of services. Generally, the approach considers the provider’s performance by analyzing key actions that result in better health outcomes and patient satisfaction. The patient-centric model is a process for health services to establish relationships with professionals, individuals, and their communities to better align decisions with the patient’s goals, requirements, and preferences. Smart environments are being created as we move away from personal computers and toward a future where powerful, small devices are dispersed across the user’s surroundings, allowing the ambient augmentation of healthcare procedures with the capacity to detect, analyze, and integrate data for better surveillance of patients. This connectivity of gadgets, devices, and smart sensors allows the dynamic creation, analysis, and exchange of many kinds of data, resulting in a rise in the operational efficiency and efficacy of current healthcare services. In such a context, Ambient Assistive Technology (AAT) plays a critical role in increasing the sustainability of healthcare services, making them accessible to elderly adults, and keeping individuals safe and comfortable in their residential locations. Human activity recognition and behavior comprehension are among the main research areas in Ambient Assisted Living (AAL) technology, which aims to accurately identify movements, activities, and circumstances in the surroundings. Elements of AAL can be found in remote monitoring and domotic assistive technology. Technical upgrades in sensors, smaller and cheaper devices, and increasing computer processing capacity have led to a rise in research on the subject in recent times. AAL applications are designed to allow the elderly or handicapped to live independently and comfortably in their living environment for as long as feasible. 
Living surroundings include the house and the community, shopping complexes, and other public venues. Through AAT, every individual would be treated independently, and the technology will assist in person-centric healthcare. AAT applications are made up of complicated ecosystems of disparate data appliances and intelligent artifacts that may help persons with special needs in various ways. AAL and medical IoT technologies are covered in this book in diverse ways. It
will be beneficial to those in bioengineering, healthcare informatics, and the Internet of Medical Things. Chapter “Sensor Datasets for Human Daily Safety and Well-Being” presents a study on sensor datasets for daily human safety and well-being, where the authors discuss air quality sensors, RGB/RGB-D cameras, and video mocap sensors and their associated case studies, covering the role of sensors in real-time implementation and their tradeoffs. Chapter “Habitpad: A Habit-Change Person-Centric Healthcare Mobile Application with Machine Leaning and Gamification Features for Obesity” presents a person-centric healthcare mobile application named Habitpad for tracking obesity. Habitpad is designed to keep track of obese patients’ data and habit changes in order to assist them in maintaining a balanced and healthy lifestyle. The application relies on data analytics through machine learning algorithms. Chapter “Human Centered Mathematics: A Framework for Medical Applications Based on Extended Reality and Artificial Intelligence” presents a framework named HuMath for medical applications based on Extended Reality and Artificial Intelligence, which uses Human-Centered mathematics, a concept that uses very complicated representations to construct the underlying architecture of advancements in physical and emotional rehabilitation, prioritization, and decision assistance. Chapter “Attentive Vision-Based Model for Sarcopenia Screening by Automating Timed Up-and-Go (TUG) Test” presents an attentive vision-based model for sarcopenia screening in elderly people. The sarcopenia screening is done by observing body posture and movement patterns, and the model has obtained a promising accuracy of 93.7%. Chapter “AAL with Deep Learning to Classify the Diseases Remotely from the Image Data” presents Ambient Assistive Living technology along with deep-learning algorithms for classifying diseases remotely from image data. 
The study has briefly discussed the recent advancements in Ambient Assistive Living technology and its implementation with various deep-learning models. Chapter “Heart Failure Prediction Using Radial Basis with Metaheuristic Optimization” presents the heart failure prediction model using a radial basis with metaheuristic optimization. Authors have proposed a model based on the Radial Basis Function neural network with Genetic Algorithm optimization to precisely predict heart failure, and the experimental results have proven that the model is able to attain an accuracy of 92.6%. In Chapter “Healthcare Management and Prediction of Future Illness Through Autonomous Intelligent Advisory System Using AAT Computational Framework”, the authors have proposed an AAT computational framework named H-Pilot that relies on the Social Internet of Things for predicting future illness. The authors have discussed the recent advancements in ambient assisted living technology and various applications that rely on the AAT, categories of AAT, and Tools and Services rendered by AAT. The study also outlines the challenges associated with AAL at various levels. Chapter “ResNet-50-CNN and LSTM Based Arrhythmia Detection Model Based on ECG Dataset” presents the Arrhythmia detection model based on the ECG dataset using deep-learning models like ResNet-50 using a Conv-1D model with LSTM for the classification, and the proposed model
is able to obtain a reasonable accuracy of 98.7%. Chapter “A Review of Brain-Computer Interface (BCI) System: Advancement and Applications” presents a review of the Brain-Computer Interface system. The authors have discussed recent trends in brain-computer interfaces along with their challenges, applications of BCI, and implementation aspects of BCI. Chapter “Optimized TSA ResNet Architecture with TSH—Discriminatory Features for Kidney Stone Classification from QUS Images” presents a study on kidney stone classification from QUS Images using the Optimized TSA-ResNet over the statistical and histogram-based features. The experimental results obtained an accuracy of 98.9%, and the authors have discussed the architecture of the TSA-ResNet model, and the experimental results are presented over the divergent datasets. Chapter “Ambient Healthcare: A New Paradigm in Medical Zone” presents the study on Ambient Healthcare, motivation, and associated challenges in ambient healthcare. The study includes various relevant models like body area networks, Mesh Sensor Networks, and some associated real-time applications. Chapter “Illuminating Unexplored Corners in Healthcare Space Using Ambience Intelligence” presents the study of ambient intelligence and artificial intelligence in the healthcare domain. The authors have discussed challenges in healthcare and delivered an appropriate solution for stakeholders pertaining to human behavior analysis and discussion on home-assisted living arrangements for elderly people. Chapter “Depression Assessment in Youths Using an Enhanced Deep Learning Approach” discusses the topic of depression assessment in youths using an enhanced deep-learning approach. The proposed model works on a support vector machine model with a convolutional neural network computed feature vector, and the experimental results have proven that the proposed model has yielded an accuracy of 92.5%. 
Chapter “Telemedicine Enabled Remote Digital Healthcare System” presents the study on telemedicine Enabled Remote Digital Healthcare Systems over an Internet of Medical Things platform. The authors have discussed the implementation details of telemedicine with the future scope of the technology.
The editors wish to thank all the authors who have contributed to this book and sincerely thank the Springer Nature Editorial and production team for their constant support. We send our best wishes to our readers!
Paolo Barsocchi, Pisa, Italy
Naga Srinivasu Parvathaneni, Vijayawada, India
Amik Garg, Ghaziabad, India
Akash Kumar Bhoi, Ghaziabad, India
Filippo Palumbo, Pisa, Italy
Contents
Sensor Datasets for Human Daily Safety and Well-Being
Nino Cauli, Silvia M. Massa, Diego Reforgiato Recupero, and Daniele Riboni

Habitpad: A Habit-Change Person-Centric Healthcare Mobile Application with Machine Leaning and Gamification Features for Obesity
Wan Chai Xuan and Pantea Keikhosrokiani

Human Centered Mathematics: A Framework for Medical Applications Based on Extended Reality and Artificial Intelligence
Yarlin A. Ortiz-Toro, O. L. Quintero, and Christian Andrés Diaz León

Attentive Vision-Based Model for Sarcopenia Screening by Automating Timed Up-and-Go (TUG) Test
H. M. K. K. M. B. Herath, A. G. B. P. Jayasekara, B. G. D. A. Madhusanka, and G. M. K. B. Karunasena

AAL with Deep Learning to Classify the Diseases Remotely from the Image Data
A. Sharmila, E. L. Dhivya Priya, K. S. Tamilselvan, and K. R. Gokul Anand

Heart Failure Prediction Using Radial Basis with Metaheuristic Optimization
Varshitha Vankadaru, Greeshmanth Penugonda, Naga Srinivasu Parvathaneni, and Akash Kumar Bhoi

Healthcare Management and Prediction of Future Illness Through Autonomous Intelligent Advisory System Using AAT Computational Framework
Haritha Akkineni, Madhu Bala Myneni, Y. Suresh, Siva Velaga, and P. Phani Prasanthi

ResNet-50-CNN and LSTM Based Arrhythmia Detection Model Based on ECG Dataset
Ojaswa Yadav, Ayush Singh, Aman Sinha, Chirag Vinit Garg, and P. Sriramalakshmi

A Review of Brain-Computer Interface (BCI) System: Advancement and Applications
Bishal Kumar Gupta, Tawal Kumar Koirala, Jyoti Rai, Baidyanath Panda, and Akash Kumar Bhoi

Optimized TSA ResNet Architecture with TSH—Discriminatory Features for Kidney Stone Classification from QUS Images
P. Nagaraj, V. Muneeswaran, Josephine Selle Jeyanathan, Baidyanath Panda, and Akash Kumar Bhoi

Ambient Healthcare: A New Paradigm in Medical Zone
Sreemoyee Samanta, Adrija Mitra, Sushruta Mishra, and Naga Srinivasu Parvathaneni

Illuminating Unexplored Corners in Healthcare Space Using Ambience Intelligence
Sagnik Ghosh, Dibyendu Mehta, Shubham Kumar, Sushruta Mishra, Baidyanath Panda, and Naga Srinivasu Parvathaneni

Depression Assessment in Youths Using an Enhanced Deep Learning Approach
Shainee Pattnaik, Anwesha Nayak, Sushruta Mishra, Biswajit Brahma, and Akash Kumar Bhoi

Telemedicine Enabled Remote Digital Healthcare System
Shambhavi Singh, Nigar Hussain, Sushruta Mishra, Biswajit Brahma, and Akash Kumar Bhoi
About the Editors
Paolo Barsocchi is Senior Researcher at the Information Science and Technologies Institute of the National Research Council in Pisa, Italy. In 2008, he was Visiting Researcher at the Universitat Autònoma de Barcelona, Spain. Since 2017, he has been the Head of the Wireless Networks Research Laboratory. He is included in the World’s Top 2% Scientists according to the Stanford University List in 2020 and 2021. His research interests are in the areas of the Internet of Things, wireless sensor networks, cyber-physical systems, machine learning and data analysis techniques, smart environments, ambient assisted living, activity recognition, and indoor localization. He was nominated as a regional competence reference person for advanced manufacturing solutions in Industry 4.0 in 2017, and as a contact person in the Cluster-PON call in 2017 for the CNR Department DIITET. He has been, and currently is, involved in several European, national, and regional projects. The overall amount of attracted and managed funds at both European and national level is about €4M.
Naga Srinivasu Parvathaneni is Associate Professor in the Computer Science and Engineering Department at Prasad V. Potluri Siddhartha Institute of Technology, India. He obtained his Bachelor’s degree in Computer Science Engineering from SSIET, JNTU Kakinada (2011), and a Master’s degree in Computer Science Technology from GITAM University, Visakhapatnam (2013). He was awarded a doctoral degree by GITAM University for his thesis on Automatic Segmentation Methods for Volumetric Estimate of damaged Areas in Astrocytoma instances Identified from the 2D Brain MR Imaging. His fields of study include biomedical imaging, soft computing, explainable AI, and healthcare informatics. He has published numerous papers in reputed peer-reviewed journals and has edited book volumes with various publishers like Springer, Elsevier, IGI Global, and Bentham Science. 
He was an active reviewer for more than 40 journals indexed in Scopus and Web of Science. He also served as guest editor and technical advisory board member for various internationally reputed conferences.
Dr. Amik Garg is currently serving as Director, KIET Group of Institutions (Delhi-NCR), Ghaziabad. He received his B.E. in Mechanical Engineering from Delhi Technological University (erstwhile Delhi College of Engineering) in 1986 and subsequently both his M.Tech and Ph.D. (Industrial Engineering) from IIT Delhi. He also received an award of commendation for innovation at the workplace from the Government in 2004. He is an accomplished engineering professional based in Delhi with 35+ years of experience in industry and academia, which mainly consists of working with different government and private organizations in various leadership roles. He has published several papers in international journals, and his research areas are maintenance management, supply chain management, information systems, performance measurement, etc. As an academic leader, his focus has always been to create experienced engineers duly aligned with the needs of Industry 4.0.
Akash Kumar Bhoi [B.Tech, M.Tech, Ph.D.] is listed in the World’s Top 2% of Scientists for single-year impact for the year 2022 (compiled by John P.A. Ioannidis, Stanford University & published by Elsevier BV) and is currently associated with the Directorate of Research, Sikkim Manipal University as Adjunct Research Faculty and also with the KIET Group of Institutions, India as Adjunct Faculty. He is also working as a Research Associate at the Wireless Networks (WN) Research Laboratory, Institute of Information Science and Technologies, National Research Council (ISTI-CNR), Pisa, Italy. He was awarded the honorary title of “Adjunct Fellow” at the Institute for Sustainable Industries & Liveable Cities (ISILC), Victoria University, Melbourne, Australia, for the period from August 1, 2021, to July 31, 2022. He was the University Ph.D. Course Coordinator for “Research & Publication Ethics (RPE)” at SMU. He is a former Assistant Professor (SG) of Sikkim Manipal Institute of Technology, where he served for about 10 years. 
He is a member of IEEE, ISEIS, and IAENG, an associate member of IEI, UACEE, and an editorial board member reviewer of Indian and International journals. He is also a regular reviewer of reputed journals, namely IEEE, Springer, Elsevier, Taylor and Francis, Inderscience, etc. His research areas are Biomedical Technologies, the Internet of Things, Computational Intelligence, Antenna, and Renewable Energy. He has published several papers in national and international journals and conferences and 150+ publications registered in the Scopus database. He has also served on numerous organizing panels for international conferences and workshops. He is currently editing several books with Springer Nature, Elsevier, and Routledge & CRC Press. He is also serving as guest editor for special issues of journals like Springer Nature, Wiley | Hindawi, and Inderscience. Filippo Palumbo received a Ph.D. in Computer Science from the University of Pisa, Italy, in 2016, and an M.Sc. Computer Science Engineering, with honors, from Polytechnic University of Bari, Italy, in 2010. His research interests include the application of AI to wireless sensor networks for intelligent system design and software development in distributed systems. He has participated in several EU and national funded research actions in the areas of Ambient Intelligence.
Sensor Datasets for Human Daily Safety and Well-Being Nino Cauli, Silvia M. Massa, Diego Reforgiato Recupero, and Daniele Riboni
Abstract Improvements in global wealth and well-being result in an increase in the life expectancy of the population and, consequently, determine an increase in the number of people with physical or mental impairments. This slice of the population needs daily assistance and monitoring to live a safe and productive life. For this reason, several researchers are focusing their work on health care technologies. A broad collection of sensors has been developed to monitor our health status, our daily behavior, and the environment where we live. In this chapter we present two types of non-invasive sensors used in the health care domain: air quality sensors and cameras. For each type of sensor, we describe the health care application areas, the technology used to implement them, the existing datasets, and the data collection process and issues. Moreover, we analyze a real-life health care application for both air quality sensors and cameras. Keywords Health care · Air quality sensors · Cameras · Active aging · Deep learning · Robotics
N. Cauli · S. M. Massa · D. R. Recupero (B) · D. Riboni Mathematics and Computer Science Department, University of Cagliari, Via Ospedale 72, 09124 Cagliari, Italy
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 P. Barsocchi et al. (eds.), Enabling Person-Centric Healthcare using Ambient Assistive Technology, Studies in Computational Intelligence 1108, https://doi.org/10.1007/978-3-031-38281-9_1
1 Introduction In order to guarantee well-being and proper health care for the population, there is a need for effective solutions for responsive assistance to people in need, regular monitoring of everyday activities, and safety evaluations of the environments where we live. These services should always be available in both the public and private areas where we spend our lives. It is unreasonable to think that medical doctors and caregivers would be able to constantly monitor and provide daily assistance to each of their patients without the support of autonomous systems and dedicated technology. In the last 10 years, during the Fourth Industrial Revolution (Industry 4.0), there was rapid growth and change in technology due to the increase of interconnectivity and the introduction of smart automation. Industrial processes and health care are now supported by advanced and smart technologies like big data, cloud computing, deep learning, Internet of Things (IoT), mobile devices, and robotics. In particular, the smart health care (s-health) paradigm uses intelligent models, smart sensors, cloud services, and communication networks to provide context-aware health care services. By exploiting contextual and user-centric data extracted from smart sensors, s-health provides more effective and personalized health care services to patients. Today smart devices and smart environments (homes, hospitals, and cities) are equipped with a wide range of sensors capable of extracting multiple types of data. Each application scenario presents different requirements in accuracy, dimensions, costs, invasiveness, and reliability, and the proper set of sensors must be selected. Sensors can be divided into two groups according to the type of sensed data: user-centric and contextual sensors [1]. User-centric sensors acquire data related to the individuals, while contextual sensors focus on contextual attributes of the environment. 
User-centric sensors are used to extract information on physiological parameters and health status of an individual, like cardiovascular activity, brain activity, muscular activity, body temperature, electrodermal activity, body motion, and body tracking. Most of these sensors can be easily integrated into wearable devices. Wearable devices span from highly invasive (implants and body-worn accessories) to low invasive (on-skin tattoos, smart textiles, and ingestible sensors). Typical biosignal sensors integrated within wearable devices are skin electrodes, pressure sensors, pulse oximeters, electrochemical sensors, photo-sensors, thermistors, pyroelectric sensors, impedance sensors, accelerometers, and IMUs. Some biosignals like body motion, body location, and body temperature can be captured using external sensors that are not attached to the body of the patients. RGB, infrared, and depth cameras are powerful tools able to capture body features and motion. Supported by machine learning models, RGB cameras can be used to precisely track the location of bodies and objects, track their skeletal pose, and detect their motion. External cameras have the advantage of being non-invasive compared to wearable sensors. Contextual sensors extract information from the environment, particularly regarding air quality (i.e., pollution, temperature, humidity, and pressure), water
contamination, acoustic contamination, and electromagnetic radiation. This type of sensor tends to be less invasive compared to user-centric ones. When we deal with medical data, one important issue is the shortage of available datasets. Health, biomedical, and physiological records are sensitive data and, for this reason, their disclosure is severely limited by strict security and privacy standards. Data collected in hospitals or care centers are usually managed by the medical facility and are kept private. Another problem is the difficulty in generating large datasets. Collecting data from patients is time-consuming (e.g., guaranteeing the safety of the patients, obtaining permissions from the hospital and the patients, and setting up the experiments), and the recording sessions can be tiring for the patients. In some cases (e.g., motion analysis), data from different domains can be adapted to medical studies, but more often data augmentation is needed. To better understand all the procedures and problems involved in the implementation of a health care monitoring system, in the remainder of the chapter we analyse in detail two types of sensors (one contextual and the other user-centric), and for each of them we present a case study. We decided to focus our attention on non-invasive, low-cost, and popular sensors: air quality sensors and cameras. The air quality sensor has applications in several domains, one of which is health care. The composition of the air tells us whether the place where we are is safe for our health and whether, and which, events are taking place. In the health care context, we will show the use of the uHoo sensor combined with a statistical feature extraction method and a deep neural network for recognizing cooking activities to keep a food diary. RGB cameras and RGB-D depth sensors are cheap and powerful tools for recognizing and predicting human action and motion. 
RGB videos store information on colors, lighting, appearance, and motion, while from depth sensors it is possible to derive information on 3D shapes, positioning, and distance. As a case study, in this chapter, we present a robotic coach to promote active aging for seniors (Dr VCoach). The robotic coach is in charge of proposing a proper exercise schedule based on human directives, monitoring the exercise performed by the patient/senior, and correcting the exercise in case of mistakes. Air quality sensors are addressed in Sect. 2 while Sect. 3 focuses on RGB/RGB-D cameras and Video Mocap Sensors. Finally, Sect. 4 draws the conclusions for this chapter.
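To make the idea of statistical feature extraction from air quality readings more concrete, the following is a minimal sketch, not the method used in the chapter's case study. It assumes a plain list of readings sampled at a fixed rate and summarizes each window with a few common statistics; the function name `window_features`, the chosen statistics, and the sample values are all illustrative assumptions.

```python
from statistics import mean, pstdev


def window_features(readings, window_size, step):
    """Summarize each sliding window of readings with simple statistics.

    Assumes window_size >= 2 so that the slope is defined.
    """
    features = []
    for start in range(0, len(readings) - window_size + 1, step):
        window = readings[start:start + window_size]
        features.append({
            "mean": mean(window),
            "std": pstdev(window),  # population standard deviation
            "min": min(window),
            "max": max(window),
            # Average change per sample between first and last reading.
            "slope": (window[-1] - window[0]) / (window_size - 1),
        })
    return features


# Hypothetical CO2 readings in ppm, one per minute (illustrative values only).
co2_ppm = [420.0, 425.0, 430.0, 520.0, 610.0, 700.0]
feats = window_features(co2_ppm, window_size=3, step=3)
for f in feats:
    print(f)
```

Feature vectors of this kind could then be passed to a classifier such as a neural network; the choice of window length and statistics would depend on the sensor's sampling rate and the activity being recognized.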
2 Air Quality Sensors Air quality sensors are widely applied in the health care domain. Low-cost ones [2–5] are particularly popular because, compared to reference-grade air quality monitors, the purchase and operating costs are lower, the spatial density is higher, the acquired data can be displayed with different time-resolutions, and field distribution, data collection, and transmission are easier to implement [6].
More and more of these low-cost air quality sensors are available in the market. Their characteristics are not standard and vary from sensor to sensor. These are commonly reported by the manufacturer in the sensor manual and include the following features: general operation such as charging mode, data storage and retrieval mode, operating conditions, possible expiration date, calibration mode, performance (accuracy and bias), maintenance mode, the response time when there is a change in conditions, pollutants detected, and known interference. Therefore, before buying an air quality sensor, we need to ask ourselves several questions, such as: What do we want to measure? What is the sensor’s ability to be accurate when it is far from the gas source or when the gas concentration is very low or very high? What are the accuracy and bias of the measurements? Is calibration necessary, and how is it done? What is the response time? What is the quality and durability of the hardware? Is the sensor usable for end users? How much does it cost? [7]. The sensors produced can monitor one or more air quality parameters; the most common are: ozone (O3), carbon monoxide (CO), carbon dioxide (CO2), sulfur dioxide (SO2), nitrogen dioxide (NO2), particulate matter (PM), volatile organic compound (VOC), temperature, and humidity. The possible sources of the mentioned gases are varied and often unknown, although we deal with them daily. Some of these are: electric utilities; gasoline vapors; unventilated fuel and gas-type space heaters; tobacco smoke; gas-type water heaters; wood stoves and fireplaces; gas-powered equipment; worn or poorly-adjusted and maintained combustion devices; people’s breath; burning of fossil fuels by means of transport; cattle farming; rice production and other fruit and vegetable cultivation; combustion of coal, oil, and gas that contains sulfur [6, 8]. 
Some companies that produce low-cost air quality monitor sensors with the respective air parameters monitored by their products are: . . . . . . . . . . . . . . . . . .
GSS: CO2 [9], Sharp Microelectronics: VOC, PM [10], CO2 Meter: CO2 , temperature, humidity [11], FIGARO: CO, NO2 , SO2 [12], Netatmo: CO2 , humidity, temperature [13], Nissha: CO, VOC, O3 [14], Scienoc: O3 , CO, SO2 , NO2 [15], Foobot: PM, VOC, temperature, humidity [16], Ohmetech.io: VOC, CO2 , temperature, humidity [17], SPEC Sensors: NO2 , SO2 , CO, O3 [18], Yoctopuce: CO2 , VOC, temperature, humidity [19], AWAIR: CO2 , VOC, PM, temperature, humidity [20], Air Mentor: VOC, PM, CO2 , temperature, humidity [21], ELT SENSOR: CO, CO2 , NO2 , SO2 , VOC [22], Monnit: PM, CO, CO2 , temperature, humidity [23–26], Plantower: PM, VOC, CO2 , temperature, humidity [27], Sensirion: VOC, CO2 , PM, humidity, temperature [28], SGX SensorTech: CO, CO2 , SO2 , NO2 , VOC [29],
Sensor Datasets for Human Daily Safety and Well-Being
• Shinyei: PM, O3, CO2, humidity, temperature [30–32],
• Telaire: CO2, PM, VOC, humidity, temperature [33],
• Renesas: NO2, O3, VOC, CO2, temperature, humidity [34–36],
• HANWEI: PM, CO2, VOC, CO, temperature, humidity [37, 38],
• Alphasense: CO, VOC, NO2, O3, PM, SO2, temperature, humidity [39],
• uHoo: VOC, O3, PM, NO2, CO, CO2, humidity, temperature [40],
• Winsen: CO2, CO, PM, O3, SO2, VOC, temperature, humidity [41].
When air quality parameters take abnormal values, various health disorders can arise, such as fatigue; dizziness; nausea; eye, nose, and throat irritation; headache; flu-like symptoms; airway inflammation; respiratory disease; airway narrowing; chest pain; angina; reduced brain function; impaired vision and coordination; various degrees of toxic symptoms; lung infections; vascular and endothelial dysfunction; alterations in heart rate variability; coagulation; liver, kidney, and central nervous system damage; cancer; and fetal death [6, 7]. As can be seen, exposure to polluted air can lead to consequences that vary in severity and include death [42]. Moreover, air pollutants can impact our lives by damaging vegetation, reducing visibility, and affecting global climate conditions [7]. It is, therefore, necessary to monitor air quality, especially for population groups that are most vulnerable to air pollution and most prone to develop a disease or an abnormal condition. These groups include children aged 13 years or younger, the elderly aged 65 years or older, young people aged 18 years or younger with asthma, adults with asthma, and people with chronic obstructive pulmonary disease (COPD) [43]. Several guidelines, provided by different international agencies, report the acceptable values of air pollutants for health. The National Ambient Air Quality Standards (NAAQS) [44] and the Canadian Ambient Air Quality Standards (CAAQS) [45] provide information about pollutants that are common in outdoor air. The Occupational Safety and Health Administration (OSHA) [46], the National Institute for Occupational Safety and Health (NIOSH) [47], and the American Conference of Governmental Industrial Hygienists (ACGIH) [48] provide information to ensure safe and healthy working conditions in workplaces.
The American Society of Heating, Refrigerating and Air-Conditioning Engineers (ASHRAE) [49] provides information for indoor environments, and the World Health Organization (WHO) [50] provides information for air quality in general. Furthermore, there are several open-source air quality datasets available online. For instance, the non-profit organization OpenAQ [51] provides open air quality data collected from 426 data sources whose sensors are placed in 47,295 locations around the world. The parameters principally monitored are PM2.5, PM10, O3, SO2, NO2, CO, and black carbon (BC). Air quality monitoring sensors can be anchored to the ground and immobile, or attached to moving objects or vehicles. Data are produced by a wide variety of sources: community entities, such as community-led organizations or student activist groups; research entities, such as research institutions or academic institutions; and government entities, such as city authorities and national governments. The division of the data into low-cost and regular grade is not done according to the
N. Cauli et al.
price range of the sensor used, because what is considered “low-cost” varies according to socioeconomic and regional variables. Therefore, OpenAQ tags data as reference-grade only when the source is a government entity. The most common purpose for which these sensors are employed is to provide information on air healthiness [52, 53]. However, the data acquired through these types of sensors can be used for more complex tasks, such as alerting a person when a specific event occurs and providing detailed information about a detected problem [54–57]. In this context, in Sect. 2.1 we illustrate a context-aware application in which air quality parameters are monitored in order to identify meal preparation and to support food journaling [58].
2.1 Case Study: Maintenance of a Food Diary
A possible use case of the air quality sensor in health care is the maintenance of a food diary in which the food consumed and its quantity are reported [59]. Diet data analysis is extremely important for evaluating the healthiness of an individual’s nutrition and for setting up interventions when necessary. The 2018 Global Nutrition Report [60] of the World Health Organization reveals that malnutrition affects, in different forms, every country of the world. Following an improper diet can be a health risk because it can lead to various diseases such as diabetes, heart disease, stroke, several types of cancer, musculoskeletal disorders, and respiratory symptoms [61]. Moreover, underweight and higher levels of obesity have been associated with increased mortality compared to the normal-weight category [62]. Despite all the efforts of health education to improve eating habits, good dietary practices are often neglected, and there is a growing need for new technologies that can assist individuals in following good dietary practices [63]. Traditional methods for keeping food diaries are based on interviews and questionnaires to assess the patients’ eating routines [64]. However, several solutions allow users to keep a food diary independently and can reduce the burden of data acquisition [65]. These generally consist of mobile apps that frequently use smartphone sensors, such as the camera or the microphone [66–69], and store food information and calorie counts locally or in the cloud for long-term monitoring [66]. Casas et al. proposed an elementary text-based conversational agent [70]. Zhu et al. employ computer vision tools and pictures taken before and after food consumption to accurately recognize the kind of food and estimate the quantity eaten [67]. Sen et al. [71] created a smartwatch-based system to detect eating gestures and recognize food through pictures and computer vision software. Chi et al. [72] provide accurate calorie counts using a combination of cameras, connected kitchen scales, and food databases. Yordanova et al. use data acquired from various sensors in a smart kitchen to identify cooking activities, particularly data regarding temperature, humidity, light/noise/dust levels, individual movements, use of certain objects, water, and electricity [73, 74]. Other solutions rely on the automatic classification of chewing sounds [68],
Fig. 1 System that automatically recognizes cooking activities using an air quality sensor
and recognition of eating moments through analysis of heart rate and activity patterns [75]. These solutions are not privacy-friendly, are obtrusive, and require effort by users in the long term [76], especially if the users are elderly and suffer from forgetfulness or physical problems. Below we illustrate a system that addresses these problems and automatically recognizes food preparation at home using an air quality sensor (Fig. 1). In this system, data are acquired through the uHoo air quality sensor, a commercial sensor that does not require calibration or manual settings. Figure 2 shows an example of such an air quality sensor. It can monitor several parameters, including temperature, humidity, carbon dioxide, volatile organic compounds, particulate matter, nitrogen dioxide, carbon monoxide, and ozone. The readings are taken and sent to the cloud every minute, where a DATA CLEANING module preprocesses the data to eliminate possible errors in sensor readings. Open APIs are used to query the measured data. Then, in the FEATURE EXTRACTION module, features related to the change in gas trends and environmental parameters, due to activity in the kitchen and the use of appliances such as the oven or gas stove, are extracted by considering 30-min time windows. Finally, these features are used by the deep neural network in the ONLINE RECOGNITION module to recognize food preparation in real time. The network was trained using a dataset collected from different participants in real homes and under various conditions over 8 months. In the following sections, we explain in more detail the modules that compose the system (Sects. 2.1.1, 2.1.2, and 2.1.3), the dataset used to train the neural network (Sect. 2.1.4), and an initial proposal for coupling the system with a robotic assistant (Sect. 2.1.5). The system presented is an initial investigation of the use of an air quality sensor to identify cooking activity and therefore has limitations.
It was designed for people
Fig. 2 a The air quality sensor used in the experimental setup. b The hourly trend of carbon dioxide in the kitchen in a day. Each point represents the average carbon dioxide value in the kitchen during a given hour. The points can take on different colors: green represents comfortable values for human life; red represents uncomfortable values; yellow represents intermediate values
who eat most of their meals at home, as this is a typical situation for many elderly people. To compile the food diary when the user consumes meals outside the home, the system could be extended with mobile solutions. The current system is suitable for recognizing the preparation of hot meals; to recognize the preparation of cold meals, it could be extended with other sensors capable of recognizing the interaction with kitchen components. Scenarios with more inhabitants will be studied in the future. Furthermore, advanced solutions for conversational interfaces [77], human-computer interaction [78], usability [79], and diet analysis [80], including calorie count [81] and adaptive interfaces for supporting behavior change [82], will be addressed in future work.
2.1.1 Data Cleaning Module
Data acquired through sensors are generally affected by a considerable level of noise; consequently, they must be pre-processed. Many air quality monitors, such as the uHoo, perform internal pre-processing of raw data by smoothing the values of consecutive readings before sending them to the user application. In this work, the only additional pre-processing is to discard any set of consecutive data in which more than 50% of the values are missing due to network errors or power outages.
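The discard rule described above can be sketched as follows. This is only a minimal illustration under stated assumptions, not the code used in the study: the function name `drop_sparse_windows`, the use of `None` to mark a missing reading, and the fixed, non-overlapping window length are all hypothetical choices, since the text does not specify how a "set of consecutive data" is delimited.

```python
from typing import List, Optional, Sequence


def drop_sparse_windows(
    readings: Sequence[Optional[float]],
    window: int = 30,          # assumed window length, in one-minute readings
    max_missing: float = 0.5,  # discard when MORE than 50% of values are missing
) -> List[List[Optional[float]]]:
    """Split readings into consecutive windows and keep only those whose
    fraction of missing values (None) does not exceed max_missing."""
    kept = []
    for start in range(0, len(readings), window):
        chunk = list(readings[start:start + window])
        missing = sum(1 for value in chunk if value is None)
        if missing / len(chunk) <= max_missing:
            kept.append(chunk)
    return kept
```

Note that a window with exactly 50% missing values is kept, because the rule in the text applies only when more than 50% are missing.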
2.1.2 Feature Extraction Module
To be trained properly, the neural network needs relevant data as input. Therefore, the trends of air quality data acquired during cooking and non-cooking periods were carefully analyzed to extract features useful for discriminating between ‘cooking’ and ‘not-cooking’ activities. The feature vectors were created considering six sensor parameters: temperature, humidity, carbon dioxide, volatile organic compounds, particulate matter, and nitrogen dioxide. Multiple parameters were used because a single parameter is not sufficient to recognize cooking activities. For instance, increased CO2 levels indicate food preparation with a gas stove, but false positives may occur when
more than one person is present in the kitchen, especially if the window is closed and the kitchen is small or poorly ventilated. Moreover, when a meal is prepared without the use of a gas stove, CO2 levels tend to be stable and false negatives occur. Another example is temperature, since its variations are strongly influenced by climatic factors and other external conditions. To support real-time predictions, the feature vector consists not only of the absolute value of each parameter but also of the differences between the current value and past values, together with statistical features. More precisely, it includes the differences between the most recent value and the values recorded 5, 10, 15, 20, and 25 min earlier; the average, minimum, and maximum values over the last 5 min; and the standard deviation of these values. The time of day is another important indicator of food preparation, since cooking is normally carried out at specific times. The current time of day is therefore encoded as the number of minutes elapsed since midnight.
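The per-parameter features listed above can be sketched as follows. This is an illustrative reconstruction, not the authors' implementation: the names `parameter_features` and `minutes_since_midnight` are hypothetical, one reading per minute is assumed, "the last 5 min" is taken to mean the last five readings, and the population standard deviation is used because the text does not say which variant was adopted.

```python
import statistics
from datetime import datetime
from typing import List, Sequence

LAGS_MIN = (5, 10, 15, 20, 25)  # lags named in the text, in minutes


def parameter_features(series: Sequence[float]) -> List[float]:
    """Features for one parameter sampled once per minute.

    series[-1] is the most recent reading; at least 26 readings are required
    so that the 25-minute lag is available.
    """
    if len(series) < max(LAGS_MIN) + 1:
        raise ValueError("need at least 26 one-minute readings")
    current = series[-1]
    # Difference between the current value and the value `lag` minutes earlier.
    diffs = [current - series[-1 - lag] for lag in LAGS_MIN]
    last5 = list(series[-5:])
    stats = [
        statistics.mean(last5),
        min(last5),
        max(last5),
        statistics.pstdev(last5),  # assumption: population standard deviation
    ]
    return [current] + diffs + stats


def minutes_since_midnight(ts: datetime) -> int:
    """Encode the time of day as minutes elapsed since midnight."""
    return ts.hour * 60 + ts.minute
```

In this sketch each parameter contributes ten values; concatenating the vectors of the six parameters and appending the time-of-day value would give one input vector for the network.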
2.1.3 Online Recognition Module
A deep neural network, a Multilayer Perceptron (MLP) composed of four layers, was used for cooking activity recognition. If nF denotes the number of input features, the number of units per layer is nF/2 for the input layer, nF for the first hidden layer, 2nF for the second hidden layer, and one for the output layer, since we are performing binary classification. The layers are fully connected (dense). The activation functions chosen are the Leaky Rectified Linear Unit (LeakyReLU) for the input and hidden layers and the sigmoid function for the output layer, since we need a binary output value. To increase the stability of the neural network, a Batch Normalization layer was added before the input layer and before every LeakyReLU layer. To prevent over-fitting, Dropout layers with a drop rate of 0.5 were added after every LeakyReLU layer. The learning rate was set to 0.0001, the optimizer was Adam, and the loss function was binary cross-entropy.
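To make the layer ordering easier to follow, the architecture described above can be written down as a plain layer plan. This is a framework-independent sketch, not the authors' model definition: the function name `mlp_layer_plan`, the tuple representation, the integer division used for nF/2, and the exact placement of Batch Normalization between each dense layer and its LeakyReLU are our reading of the description and should be treated as assumptions.

```python
from typing import List, Optional, Tuple

Layer = Tuple[str, Optional[float]]  # (layer type, main hyperparameter or None)


def mlp_layer_plan(n_features: int, dropout: float = 0.5) -> List[Layer]:
    """Ordered layer list for the four-layer MLP described in the text."""
    # Units of the input layer, first hidden layer, and second hidden layer.
    units = [n_features // 2, n_features, 2 * n_features]
    plan: List[Layer] = [("BatchNorm", None)]  # normalisation before the input layer
    for u in units:
        plan += [
            ("Dense", u),
            ("BatchNorm", None),   # before every LeakyReLU
            ("LeakyReLU", None),
            ("Dropout", dropout),  # after every LeakyReLU
        ]
    plan += [("Dense", 1), ("Sigmoid", None)]  # single-unit binary output
    return plan


# Training settings reported in the text.
TRAINING = {"optimizer": "Adam", "learning_rate": 1e-4, "loss": "binary_crossentropy"}
```

Such a plan can be translated one-to-one into the sequential model of any deep learning framework; the dictionary `TRAINING` only collects the reported settings and is not tied to a specific library.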
2.1.4 Dataset
The acquisition of the dataset took 8 months and involved middle-aged single inhabitants, couples, families with children, groups of student roommates, a senior living alone, and a senior living with a middle-aged person. The volunteers lived in homes with different characteristics, so the data are subject to many variables: the kind of person who cooked most in the house (3 men and 5 women, with ages ranging from 20 to 72); the area where the house is situated (city vs countryside); the distance of the sensor from the domestic appliances used for cooking (ranging from 5 cm to 1.5 m); the presence of air conditioning, a pellet stove, or windows in the kitchen; the house structure (kitchen separate from the dining room or open space); and the season in which the data were acquired.
The participants self-annotated for one month: date, start and end time of cooking, cooked foods, domestic appliances used for cooking, and presence of an open window. The dataset was acquired in real-world environments and in naturalistic conditions; we did not rely on multiple annotators and could not evaluate inter-rater reliability, so the self-annotations inevitably may contain missing or wrong labels. At the end of the month, the annotations were digitized. Each data record contains:
1. date,
2. time,
3. temperature (in °C),
4. relative humidity (in percentage %),
5. PM2.5 (Fine Particulate Matter in µg/m3),
6. TVOC (Total Volatile Organic Compound in ppb),
7. CO2 (Carbon Dioxide in ppm),
8. CO (Carbon Monoxide in ppm),
9. air pressure (in hPa),
10. O3 (Ozone in ppb),
11. NO2 (Nitrogen Dioxide in ppb),
12. current activity (1 if the user is cooking a meal, 0 otherwise),
13. type of cooked food (e.g., rice, salad).
The final dataset is composed of 16,323 cooking records, and 334,228 not-cooking records, so the classes are strongly unbalanced.
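The degree of imbalance follows directly from the two record counts: cooking records are about 4.7% of the total, i.e. roughly 20 not-cooking records for every cooking record. The short computation below reproduces these figures and, purely as an illustration, derives inverse-frequency class weights. The weighting scheme is a common remedy that we assume here for the example; the text does not state how the imbalance was handled during training.

```python
N_COOKING = 16_323
N_NOT_COOKING = 334_228

total = N_COOKING + N_NOT_COOKING
cooking_share = N_COOKING / total            # fraction of positive records
imbalance_ratio = N_NOT_COOKING / N_COOKING  # negatives per positive

# "Balanced" inverse-frequency weights: total / (n_classes * class_count).
# Shown only as one common option, not as the authors' choice.
class_weight = {
    1: total / (2 * N_COOKING),
    0: total / (2 * N_NOT_COOKING),
}

print(f"cooking share: {cooking_share:.2%}, ratio: {imbalance_ratio:.1f}:1")
```

With these weights, both classes contribute the same total weight to the loss, so the rare ‘cooking’ class is not dominated by the majority class.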
2.1.5 Use of a Robotic Assistant
The system can be coupled with personal digital assistants or dialogue systems such as the Zora robot. A preliminary study has been proposed in which this humanoid robot interacts with the user when the system recognizes that something has been cooked, as shown in Fig. 3. More specifically, the robot is informed when new data read by the uHoo sensor are classified as ‘cooking’ by the neural network. Before starting the interaction with the user, the robot checks whether anyone is in the kitchen by taking some pictures of the room. For privacy reasons, detecting people in the environment is optional, and the user can decide whether to activate it. If the robot identifies one or more people in the pictures, it asks what food the user is preparing. If camera-based recognition is disabled, the robot still asks its question. Once the user answers, the robot performs speech-to-text processing and sends the extracted food and sensor data to the neural network, which extends the training data with the new annotations and, periodically, retrains the overall model.
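The decision of whether the robot should start the dialogue can be summarised in a few lines. The sketch below is hypothetical: the function name `should_ask_user` and its parameters are introduced only for illustration, and the behaviour when the camera is enabled but nobody is detected (the robot stays silent) is our assumption, since the text describes only the other two cases explicitly.

```python
from typing import Optional


def should_ask_user(
    predicted_cooking: bool,
    camera_enabled: bool,
    people_detected: Optional[int] = None,
) -> bool:
    """Return True when the robot should ask what is being cooked."""
    if not predicted_cooking:
        return False  # nothing to annotate
    if not camera_enabled:
        return True   # optional people detection skipped: ask anyway
    # Camera enabled: ask only if at least one person was seen (assumption).
    return bool(people_detected)
```

For example, `should_ask_user(True, False)` returns `True`, which corresponds to the case in which camera-based recognition is disabled and the robot still asks its question.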
Fig. 3 A preliminary architecture where the cooking recognition system is coupled with the robotic assistant Zora. Note that point (4) is optional: the user may enable or not the camera-based object detection of the robot. If (4) is disabled, the action goes directly from point (3) to point (5)
3 Camera RGB/RGB-D and Video Mocap Sensors
As already mentioned, monitoring and analysing the daily activities and movements of individuals provides important information to doctors and caregivers. In order not to create discomfort or influence subjects’ daily behaviour, monitoring sensors need to be non-invasive and, at the same time, able to extract meaningful information. Image-based sensors are a valid candidate for this task. RGB and depth cameras have several advantages over other types of sensors: they are non-invasive (external, no need to be worn), cover a large operating area (they work over long distances and wide fields of view), can be set up as a multi-camera system, can track multiple targets, provide visual and motion information, and offer high precision. Camera sensors are widely used for health monitoring in different scenarios. A straightforward application of camera sensors in health care is surveillance systems. Surveillance cameras can be used to monitor patients and elders during their everyday activities for fall detection, homeland security, and ambient assisted living [83, 84]. Video emotion recognition is another field of application for camera sensors. A system able to recognise emotions from videos can be extremely useful in support of depression or mental stress treatments [85]. Technologies able to support doctors and therapists can boost the efficiency of physical rehabilitation and monitor daily physical training for elders. Recently, several researchers have focused on autonomous video monitoring systems for rehabilitation [86] and active aging promotion [87]. Camera sensor rigs come in different types and setups depending on their use. The following are the most common image sensor systems used for monitoring in health care:
• RGB cameras: RGB cameras capture visible light in red, green, and blue (RGB) wavelengths. An internal matrix-shaped sensor is able to convert the light into
electrical signals and store them as a matrix of pixels. Each pixel represents the intensity of the red, green, and blue light components with 3 integer values. RGB cameras tend to be sensitive to light changes, motion blur, occlusions, and clutter, but have the advantage of being relatively cheap and able to capture motion, colours, objects’ shape, and scene structure information. Multi-camera systems can be used to enlarge the area covered by the sensors, reduce occlusion problems, improve scene understanding, and improve tracking precision. Omnidirectional cameras are another possible solution to extend the working area. Omnidirectional cameras have a 360° field of view in the horizontal plane. The extracted omnidirectional image can then be projected into perspective 2D images. Due to the recent development of Deep Learning (DL), Convolutional Neural Networks (CNNs), and dedicated processing hardware such as graphics cards and accelerators, the use of RGB images as the main input for automated systems is increasing rapidly.
• Depth sensors (RGB-D cameras): One drawback of RGB data is the difficulty of extracting 3D information from single images. RGB-D cameras overcome this problem by capturing both RGB and depth information. One of the first RGB-D cameras, which revolutionised both the gaming industry and Computer Vision research, was the Microsoft Kinect sensor (the latest version is named Azure Kinect [88]). In addition to an RGB camera, the Kinect is equipped with a depth sensor and a multi-array microphone. The Kinect measures the distance of the objects in its surroundings by calculating the time of flight of infrared light emitted by the Kinect itself. Kinect cameras are able to generate sequences of RGB images, 3D point clouds, and skeleton data of multiple tracked subjects. A direct competitor of the Kinect is the Intel RealSense [89].
The RealSense uses a stereo camera setup enhanced by infrared data to calculate the depth information, which makes it better suited to collecting data outdoors. The sensor generates RGB images and point cloud data, while skeleton data can be obtained using third-party libraries. Unfortunately, the data from RGB-D sensors become very noisy for distant objects, and the sensors can be used only within a limited distance range.
• Video MoCap systems: Optical Motion Capture (MoCap) systems consist of a setup of multiple cameras or optical sensors able to track the position and motion of subjects at very high frequency and precision. There are three types of optical MoCap systems: Passive-Markers, Active-Markers, and Markerless. Passive-Markers systems track retroreflective markers via infrared cameras, and they are the most accurate and common method. Active-Markers systems use special cameras able to track LED markers emitting light. Markerless techniques, on the other hand, use specialised software to track the subjects. Markerless systems are less invasive, but they offer lower tracking precision and data collection frequency. Vicon Motion Systems [90] offers a large range of Passive-Marker tracking systems with different prices and performances. Vicon systems are widely used in military, entertainment, sports, medical applications, and robotics.
In most of the areas of application of cameras for health care, video monitoring systems must be able to track and identify the subjects of interest, and understand their
behaviour. For this reason, video action recognition and prediction are fundamental features of camera-based health monitoring systems. Video action recognition can be split into two main steps: action representation and action classification [91]. Traditionally, handcrafted features are used to represent the actions [92, 93] and standard classifiers are used to recognise the action (e.g. SVM, k-means). With the improvements in computational power and the increase in quality and size of video action recognition datasets [94–96], CNNs are now the de facto standard to extract features for action representation [97–99]. Since most of the models used to process videos from cameras are based on DL and CNNs, dataset collection is a crucial factor in training the models successfully. DL models are data-hungry, and their training process requires large and varied datasets. While several large RGB video datasets for action classification exist [100–108], RGB-D datasets tend to be smaller and less varied [86, 109–115]. With an efficient labeling process, large and varied RGB datasets can be created from existing videos extracted from movies or online video sharing platforms like YouTube [100, 101, 103, 104, 106, 107]. On the other hand, RGB-D datasets must be collected for each specific case and, due to the limitations of the sensors, are usually staged indoors and in controlled environments. Another issue is the scarcity of video datasets specific to health care and rehabilitation (examples include the Human Mortality Database (HMD) to track head movements [116] and the University of Idaho-Physical Rehabilitation Movement Data (UI-PRMD) [86], with RGB, depth and skeleton data of rehabilitation exercises). In health care video analysis, datasets are often collected in hospitals, care centers, or other medical facilities, with patients as subjects. For this reason, data collection is governed by several safety protocols and the process becomes very time-consuming.
Moreover, medical data are protected by strict privacy protocols, making it difficult to obtain past recordings from hospitals. Pre-training DL models on large generic video datasets and then fine-tuning them is a common strategy to mitigate the problem of data shortage. For example, Kinetics [106] and UCF101 [107] are large RGB video action recognition datasets collected from YouTube that can be used for this purpose. Moreover, datasets collected for other application areas can contain subjects performing movements and actions similar to those performed in health care and rehabilitation scenarios. Therefore, DL models for health care action recognition can be trained using datasets of sport [100, 104], day-life activities [102, 105, 108–111] or surveillance videos. The Toyota Smarthome dataset [109], for example, contains a relatively large collection of RGB, depth, and skeleton videos of elders performing day-life activities. Data augmentation techniques address the lack of data by artificially generating new samples. In this way, small datasets collected from cameras can be expanded and used for training. Image data augmentation techniques range from the most basic ones (noise injection, cropping, flipping, rotating, translating, and histogram and RGB values alteration) [117] to more complex ones (data augmentation using generative adversarial networks (GANs) [118], domain randomization [119], and data augmentation using simulation [120]). Thanks to advances in the video game industry, modern game engines (Unity [121] and Unreal Engine [122] among others) are able to generate photo-realistic images in a few milliseconds, simulate realistic physical
interaction between objects, and offer powerful scripting and design tools to recreate detailed artificial scenes. By exploiting game engines, researchers can create varied and realistic synthetic datasets to train large DL models for video action recognition. A deeper analysis of video data augmentation through simulation is provided in the recent survey by N. Cauli and D. Reforgiato Recupero [123]. In the next section, we present the Dr. VCoach project [124], a case study of video action recognition in a health care scenario. The goal of this ongoing project is the study and development of a robotic coach able to help and monitor elders during daily light training sessions, which are crucial to promoting active aging. The robotic coach will be equipped with an action recognition and prediction DL model working only on RGB camera image streams. Using the action recognition and prediction module, the robot will monitor the execution of the exercises performed by the elders and correct them in case of mistakes. The project will also introduce a procedure for the creation of a new video dataset containing sets of exercises performed by elders that will be augmented in simulation.
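Returning to the basic image augmentation techniques listed earlier in this section, three of them (flipping, cropping, and noise injection) can be sketched in a few lines. This is a deliberately simplified illustration: images are represented as nested lists of grey-level integers rather than RGB arrays, and the function names are hypothetical. In practice an image processing library would be used instead.

```python
import random
from typing import List

Image = List[List[int]]  # grey-level image as rows of 0-255 integers (simplification)


def flip_horizontal(img: Image) -> Image:
    """Mirror the image left to right."""
    return [row[::-1] for row in img]


def crop(img: Image, top: int, left: int, height: int, width: int) -> Image:
    """Extract a height x width patch whose upper-left corner is (top, left)."""
    return [row[left:left + width] for row in img[top:top + height]]


def add_noise(img: Image, max_delta: int, rng: random.Random) -> Image:
    """Add uniform integer noise in [-max_delta, max_delta], clipped to 0-255."""
    return [
        [min(255, max(0, p + rng.randint(-max_delta, max_delta))) for p in row]
        for row in img
    ]
```

Applying such transformations with different random parameters to every frame of a short recording yields several new training samples from a single original video, at the cost of adding no genuinely new motion content.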
3.1 Case Study: Robotic Coach for Active Ageing
This section introduces an application of camera sensors in the health care domain: the Dr. VCoach project. The Dr. VCoach project received funding as an Individual Fellowship in the European Union’s Marie Sklodowska-Curie Actions programme. The project started in September 2021 and will end in December 2023. The output of the Dr. VCoach project will be a robotic coach able to assist elders in their physical training routine in order to promote active aging. The increase in global wealth and well-being results in a growing percentage of elderly people in the population. This older segment of the population needs daily assistance and monitoring in order to live a safe and productive life. Several researchers have focused their work on assistive technologies and monitoring systems for elders, but prevention is better than cure. For this reason, technologies promoting active aging appear to be the right solution to increase elders’ independence and well-being [125, 126]. In order to guarantee that people stay in charge of their own lives for as long as possible, regular physical activity is critical. Elders must follow a daily training schedule regularly and correctly. Usually, this is not the case, and the presence of a caregiver to monitor the elder’s activities and progress is needed. This solution is not scalable, and researchers are trying to automate the teaching and monitoring process. The idea is to have a virtual coach able to propose exercises, monitor the execution (possibly correcting errors), and send the results to a doctor [126–128]. Even if a computer, a screen, and a camera are enough to implement a virtual coach, humanoid robots appear to be a promising solution [129]. Interacting with a robot able to move and show emotions is more engaging than staring at a computer screen. Also, a humanoid robot is able to demonstrate the exercise by itself, making the movement easier for the elder to understand.
For this particular setup, a small humanoid robot like the Nao [130] is an optimal choice. In scenarios with
complex interaction levels (supporting the elder during exercise, preventing falls, or assisting the elder), more advanced robotic platforms are needed (e.g. Pepper [131], DoRo [125]). The core feature of a robotic coach is the ability to recognise and predict human actions from its onboard RGB/RGB-D cameras. To exploit the temporal information of video sequences, researchers use different approaches: 3D convolutions that take both space and time into consideration [97], multi-stream networks using both optical flow and RGB images as input [98], and hybrid networks that combine CNNs with recurrent neural network (RNN) architectures [99]. The most recent action prediction systems tend to use a combination of CNNs and RNNs [132]. To monitor the correct execution of a particular action, there is no need to explicitly classify the action performed. In the Predictive Coding theory of cognition [133], the brain constantly predicts the sensory outcome (top-down process) and compares it with the actual one. At the same time, the error between predicted and actual sensory stimuli is back-propagated to the highest layers (bottom-up process) in order to revise and update the internal predictive models (a similar idea applied to robot control was studied by the Principal Investigator (PI) under the name of Expected Perception [134, 135]). Several researchers have already implemented models able to predict future frames based on an expected action and past frames [136, 137]. Jun Tani implemented models based on the Predictive Coding paradigm on robotic platforms [138]. One of the most recent is the Predictive Visuo-Motor Deep Dynamic Neural Network (P-VMDNN) [139]. This deep RNN model can be used both to predict visual and proprioceptive stimuli and to recognise an action performed by a human placed in front of the robot. By comparing the predicted motion with the one performed by the elder, it is possible to detect mistakes in the execution of the exercise.
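The idea of detecting a mistake from the mismatch between predicted and observed motion can be illustrated with a very small example. This is a sketch under explicit assumptions, not the method of the project: the motion state is reduced to a vector of numeric features, the root-mean-square error is chosen as the discrepancy measure, and a fixed `threshold` decides when the discrepancy counts as a mistake. The function names are hypothetical.

```python
import math
from typing import Sequence


def prediction_error(predicted: Sequence[float], observed: Sequence[float]) -> float:
    """Root-mean-square error between predicted and observed feature vectors."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("vectors must be non-empty and of equal length")
    squared = sum((p - o) ** 2 for p, o in zip(predicted, observed))
    return math.sqrt(squared / len(predicted))


def is_execution_mistake(
    predicted: Sequence[float], observed: Sequence[float], threshold: float
) -> bool:
    """Flag a mistake when the prediction error exceeds the chosen threshold."""
    return prediction_error(predicted, observed) > threshold
```

In a real system the threshold would have to be tuned per exercise and per user, since an error that signals a mistake for one person may simply reflect a physical limitation of another.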
The comparison between the predicted and actual state (motion features or directly the entire frame) can also be used by a robot to spot the physical limitations of the elder, and the predictive model can be updated accordingly. To successfully understand the directives of a human user, a robotic coach also needs speech recognition and natural language processing (NLP) modules. With the introduction of cloud processing power and the advancement in machine learning and artificial neural networks, robust speech recognition software is now available (e.g. Google Cloud Speech-to-Text [140]). The main objective of the Dr. VCoach project is the following:
Design and implement an affordable robotic coach able to propose a proper exercise schedule based on human directives, monitor the exercise performed by the patient/elder and correct it in case of mistakes.
The output of the project will be a robotic coach to assist elders during their daily physical training. The robot will be able to understand verbal commands from the user (e.g. “What is today’s training schedule?” or “Today I feel tired, can we do a lighter training?”), to define the sequence of exercises to be performed, to show the exercises to the user (with a verbal description and by performing them itself), to
N. Cauli et al.
Fig. 4 The robotic coach architecture
monitor the user performing the exercises using RGB cameras, to detect any errors in the execution, and to suggest a correction for each mistake. Figure 4 shows the proposed architecture for a robotic coach. The whole system can be divided into four modules:
• Speech module: This is the module in charge of the vocal interaction between the robot and the elders. A Natural Language Processing (NLP) sub-module interprets the elders’ instructions received via the robot’s embedded microphones. The instructions are used to select the proper exercise to be sent to the Exercise scheduler. A speech generator sub-module informs the elders, through the robot’s speakers, of how well they are performing the exercises and what they need to improve in their execution.
• Exercise scheduler: The role of this module is to break the selected exercise into atomic actions and send these actions to the Error detector and the Action recognition and prediction modules. Moreover, this module contains a controller that translates the actions into motor commands to send to the robot.
• Error detector: The Error detector module analyses the results coming from the Action recognition and prediction module against the required actions received from the Exercise scheduler. After evaluating the performed action, the module sends an evaluation report to the Speech module in order to inform the elder.
• Action recognition and prediction: This module has the dual task of recognising the action performed by the elder using the RGB videos coming from its embedded camera and predicting the future visuo-proprioceptive stimuli based on the action that is being performed. The predicted frames and encoders’ positions are sent to the Error detector module to be analysed.
Even if several datasets for video action recognition exist, none of them completely covers the type of movements and labels needed by the project.
For this reason, a new dataset must be created to train the deep learning action prediction system. An Intel RealSense D455 depth camera (see Fig. 5) will be used to collect a relatively small dataset of RGB-D videos of people performing the exercises in the lab environment. In parallel, a set of patients will be selected to expand this initial dataset with clinical subjects. The core dataset will be augmented in simulation using a synthetic image generator implemented in Unity by one of the authors of this chapter (see Sect. 3.1.1
Sensor Datasets for Human Daily Safety and Well-Being
Fig. 5 Intel RealSense D455 camera
for more information). A large augmented dataset will be created by animating simulated avatars with the originally recorded data and randomizing the motion, background, lighting, and camera position. The system will be implemented on Zora [141] (a NAO robot with a software layer that makes it usable by people without an ICT background). The NAO robot is frequently used in studies on assistive robotics focusing on social interactions and affectivity. In medical and physical interventions, the NAO robot is mainly used as a motivator or a demonstrator (see the complete review by A. Robaczewski et al. [142]). The research focus of the project will be on the action recognition and prediction module (introduction of a new action recognition model, creation of a novel dataset, and definition of the actions). At the end of the project, the robotic coach will be deployed and tested on patients in a real-life scenario (clinical study). The focus of the project will be on developing technologies to promote active aging. Even so, the impact of the project is not limited to the area of active aging. A system able to predict the intended actions of humans filmed by a camera is extremely important in multiple areas: visual surveillance, video retrieval, entertainment, human-robot interaction, and autonomous vehicles, to mention a few. Also, making available a tool that generates an action prediction dataset in simulation can save a considerable amount of time in the labeling process.
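The data flow between the four modules of the architecture in Fig. 4 can be sketched as follows. This is a simplified illustration under stated assumptions, not the project's implementation: the exercise catalogue, the keyword matching that stands in for the NLP sub-module, and all function names are hypothetical, and the output of the Action recognition and prediction module is represented by a plain list of recognised action labels.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical exercise catalogue: exercise name -> ordered atomic actions.
EXERCISES: Dict[str, List[str]] = {
    "arm_raise": ["raise_arms", "hold", "lower_arms"],
    "squat": ["bend_knees", "hold", "stand_up"],
}


def select_exercise(utterance: str) -> str:
    """Speech module stand-in: keyword matching instead of real NLP."""
    text = utterance.lower()
    if "tired" in text or "lighter" in text:
        return "arm_raise"
    return "squat"


def schedule(exercise: str) -> List[str]:
    """Exercise scheduler: break the selected exercise into atomic actions."""
    return list(EXERCISES[exercise])


@dataclass
class Evaluation:
    required: str    # action requested by the Exercise scheduler
    recognised: str  # action reported by the recognition module

    @property
    def correct(self) -> bool:
        return self.required == self.recognised


def detect_errors(required: List[str], recognised: List[str]) -> List[Evaluation]:
    """Error detector: compare each required action with the recognised one."""
    return [Evaluation(r, o) for r, o in zip(required, recognised)]


def report(evaluations: List[Evaluation]) -> str:
    """Speech generator stand-in: turn the evaluation into feedback text."""
    wrong = [e for e in evaluations if not e.correct]
    if not wrong:
        return "Well done, every step was correct."
    steps = ", ".join(e.required for e in wrong)
    return f"Please repeat these steps: {steps}."


if __name__ == "__main__":
    exercise = select_exercise("Today I feel tired, can we do a lighter training?")
    actions = schedule(exercise)
    # In the real system these labels would come from the recognition model.
    recognised = ["raise_arms", "lower_arms", "lower_arms"]
    print(report(detect_errors(actions, recognised)))
```

Keeping each module behind a small function boundary mirrors the separation in the figure, so the keyword matcher or the label comparison could be swapped for a learned component without touching the other modules.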
3.1.1 Synthetic Video Generator for Data Augmentation
During the project, a new dataset containing video clips and skeleton data of single subjects performing specific training exercises will be collected. Recording enough data to train the action recognition and prediction model would be excessively time-consuming. To solve this data shortage problem, we implemented a synthetic video generator using Unity and skeleton data collected via an Intel RealSense D455 camera (a screenshot of the generator rendering a video sequence is depicted in Fig. 6). The videos rendered by the generator are highly randomizable. By tweaking specific parameters, it is possible to generate videos with random lighting, skybox images, camera resolution, recording fps, camera position, animations, animation speed, animation mirroring, subject 3D models, and floor textures (Fig. 7 shows the settings menu of the generator). The output of the generator consists of the rendered RGB frames and the skeleton joints’ positions for each frame. Using the generator, we are able to augment an original small video action recognition dataset with highly randomized synthetic images,
Fig. 6 The synthetic video generator in action. The program shows the randomized video scene that is being rendered
Fig. 7 The settings menu of the synthetic video generator
improving the training performance and making training feasible even for large models, for which the variability and size of the training set are crucial.
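The randomization described above can be sketched as the sampling of one rendering configuration per synthetic clip. The generator itself is implemented in Unity; the Python below only illustrates the idea, and every name, asset label, and value range in it is a hypothetical placeholder, not the generator's actual settings. The mirroring helper assumes joints are named with `left_`/`right_` prefixes and that x is the left-right axis.

```python
import random
from dataclasses import dataclass
from typing import Dict, Tuple

Joint = Tuple[float, float, float]

# Hypothetical asset lists; the real generator exposes its own options.
SKYBOXES = ["indoor_gym", "living_room", "park"]
RESOLUTIONS = [(640, 480), (1280, 720), (1920, 1080)]
MODELS = ["avatar_a", "avatar_b", "avatar_c"]
FLOORS = ["wood", "carpet", "tiles"]


@dataclass(frozen=True)
class RenderConfig:
    light_intensity: float
    skybox: str
    resolution: Tuple[int, int]
    fps: int
    camera_position: Joint
    animation_speed: float
    mirrored: bool
    subject_model: str
    floor_texture: str


def sample_config(rng: random.Random) -> RenderConfig:
    """Draw one random rendering configuration (ranges are assumptions)."""
    return RenderConfig(
        light_intensity=rng.uniform(0.3, 1.5),
        skybox=rng.choice(SKYBOXES),
        resolution=rng.choice(RESOLUTIONS),
        fps=rng.choice([15, 30, 60]),
        camera_position=(
            rng.uniform(-1.0, 1.0),
            rng.uniform(1.0, 2.0),
            rng.uniform(2.0, 4.0),
        ),
        animation_speed=rng.uniform(0.8, 1.2),
        mirrored=rng.random() < 0.5,
        subject_model=rng.choice(MODELS),
        floor_texture=rng.choice(FLOORS),
    )


def mirror_skeleton(joints: Dict[str, Joint]) -> Dict[str, Joint]:
    """Mirror a skeleton left-right: negate x and swap left_/right_ names."""
    mirrored: Dict[str, Joint] = {}
    for name, (x, y, z) in joints.items():
        if name.startswith("left_"):
            name = "right_" + name[len("left_"):]
        elif name.startswith("right_"):
            name = "left_" + name[len("right_"):]
        mirrored[name] = (-x, y, z)
    return mirrored


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so a dataset can be regenerated
    for _ in range(3):
        print(sample_config(rng))
```

Passing an explicitly seeded `random.Random` instance is a design choice that makes each augmented dataset reproducible from its seed.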
4 Conclusion and Open Challenges
The increase in life expectancy has led to an increase in the number of people needing specific support for carrying out their daily living activities, including the elderly, frail, and disabled persons. Fortunately, the recent availability of sensor devices and artificial intelligence methods provides us with novel opportunities to support those people in an effective and sustainable manner. In this chapter, we considered two relevant classes of sensors: air quality sensors and environmental cameras. We have presented the current technology, its applications for human daily safety and well-being, the data collection process, and the available datasets. We also illustrated advanced systems that exploit air quality sensors and cameras for context awareness and interaction with the user. Although these technologies provide exciting opportunities for improving ambient assisted living tools, several research challenges should be addressed. A fundamental issue regards the protection of privacy. The observation of videos, especially in private homes, may put the user’s privacy at risk. Similarly, the analysis of air quality data may reveal private aspects such as the execution of certain activities (e.g., cooking), or the presence/absence of people in given rooms [143]. Hence, those solutions should be implemented by adopting a privacy-by-design approach [144]. It is also necessary to devise effective cyber-security methods to avoid the involuntary disclosure of sensitive sensor data. This aspect is particularly important in Internet of Things environments, which are prone to several kinds of attacks [145]. Finally, the use of AI-based methods for supporting frail and disabled people poses different challenges regarding ethical and legal aspects, including algorithmic fairness and bias, informed consent acquisition, and safety [146].
References
1. Batista, E., Moncusi, M. A., López-Aguilar, P., Martínez-Ballesté, A., & Solanas, A. (2021). Sensors for context-aware smart healthcare: A security perspective. Sensors, 21(20), 6886.
2. Abdullah, A., Ismael, A., Rashid, A., Abou-ElNour, A., & Tarique, M. (2015). Real time wireless health monitoring application using mobile devices. International Journal of Computer Networks & Communications (IJCNC), 7(3), 13–30.
3. Agg, C., & Khimji, S. (2021). Perception of wellbeing in educational spaces. Building Services Engineering Research and Technology, 42(6), 677–689.
4. Angelucci, A., Kuller, D., & Aliverti, A. (2020). A home telemedicine system for continuous respiratory monitoring. IEEE Journal of Biomedical and Health Informatics, 25(4), 1247–1256.
5. Rantas, J., Wang, D., Jarrard, W., Sterchi, J., Wang, A., Varnosfaderani, M. P., & Heydarian, A. (2021). A user interface informing medical staff on continuous indoor environmental quality to support patient care and airborne disease mitigation. In 2021 Systems and Information Engineering Design Symposium (SIEDS) (pp. 1–6). IEEE.
6. Zhang, H., & Srinivasan, R. (2020). A systematic review of air quality sensors, guidelines, and measurement studies for indoor air quality management. Sustainability, 12(21), 9045.
7. Williams, R., Kilaru, V., Snyder, E., Kaufman, A., Dye, T., Rutter, A., Russell, A., & Hafner, H. (2014). Air sensor guidebook. US Environmental Protection Agency.
8. Epa sources greenhouse gas emissions page. https://www.epa.gov/ghgemissions/sources-greenhouse-gas-emissions. Accessed 12 October 2022.
9. Gss homepage. https://www.gassensing.co.uk/. Accessed 12 October 2022.
10. Sharp microelectronics environmental-sensors page. https://www.mouser.it/c/sensors/environmental-sensors/. Accessed 12 October 2022.
11. Co2 meter homepage. https://www.co2meter.com/. Accessed 12 October 2022.
12. Figaro homepage. https://www.figaro.co.jp/en/. Accessed 12 October 2022.
13. Netatmo air quality sensors page. https://www.netatmo.com/en-gb/aircare/homecoach. Accessed 12 October 2022.
14. Nissha gas sensors page. https://www.nissha.com/english/products/allproducts/gas.html. Accessed 12 October 2022.
15. Scienoc gas sensors and detectors page. https://www.scienoc.com/gas_sensors_and_detectors.html. Accessed 12 October 2022.
16. Foobot homepage. https://foobot.io/features/. Accessed 12 October 2022.
17. Ohmtech.io homepage. https://ohmtech.io/. Accessed 12 October 2022.
18. Spec sensors homepage. https://www.spec-sensors.com/. Accessed 12 October 2022.
19. Yoctopuce/Usb environmental sensors page. https://www.yoctopuce.com/EN/products/category/usb-environmental-sensors. Accessed 12 October 2022.
20. Awair homepage. https://uk.getawair.com/. Accessed 12 October 2022.
21. Air mentor homepage. http://www.airmentor.eu/products.html. Accessed 12 October 2022.
22. Elt sensor homepage. http://eltsensor.co.kr/main. Accessed 12 October 2022.
23. Monnit air quality sensors page. https://www.monnit.com/products/sensors/air-quality/pmmeter/. Accessed 12 October 2022.
24. Monnit gas detection sensors page. https://www.monnit.com/products/sensors/gas-detection/. Accessed 12 October 2022.
25. Monnit humidity sensors page. https://www.monnit.com/products/sensors/humidity/rh/. Accessed 12 October 2022.
26. Monnit temperature sensors page. https://www.monnit.com/products/sensors/temperature/. Accessed 12 October 2022.
27. Plantower homepage. https://www.plantower.com/en/. Accessed 12 October 2022.
28. Sensirion homepage. https://sensirion.com/. Accessed 12 October 2022.
29. Sgx sensortech homepage. https://www.sgxsensortech.com/. Accessed 12 October 2022.
30. Shinyei humidity sensors page. https://www.shinyei.co.jp/stc/eng/products/humidity/index.html. Accessed 12 October 2022.
31. Shinyei iaq sensors page. https://www.shinyei.co.jp/stc/eng/products/iaq/index.html. Accessed 12 October 2022.
32. Shinyei particle sensors page. https://www.shinyei.co.jp/stc/eng/products/optical/index.html. Accessed 12 October 2022.
33. Telaire homepage. https://www.amphenol-sensors.com/en/telaire. Accessed 12 October 2022.
34. Renesas environmental sensors page. https://www.renesas.com/sg/en/products/sensor-products/environmental-sensors. Accessed 12 October 2022.
35. Renesas humidity sensors page. https://www.renesas.com/sg/en/products/sensor-products/humidity-sensors. Accessed 12 October 2022.
36. Renesas temperature sensors page. https://www.renesas.com/sg/en/products/sensor-products/temperature-sensors. Accessed 12 October 2022.
37. Hanwei domestic gas alarm products page. https://www.hwsensor.com/domestic-gas-alarmproducts. Accessed 12 October 2022.
38. Hanwei industrial gas detections page. https://www.hwsensor.com/industrial-gas-detection/. Accessed 12 October 2022.
39. Alphasense homepage. https://www.alphasense.com/. Accessed 12 October 2022.
40. uhoo homepage. https://getuhoo.com/. Accessed 12 October 2022.
41. Winsen homepage. https://www.winsen-sensor.com/. Accessed 12 October 2022.
42. World Health Organization, et al. (2016). Ambient air pollution: A global assessment of exposure and burden of disease.
43. Centers for Disease Control and Prevention (CDC), et al. Populations at risk from particulate air pollution—United States, 1992. MMWR. Morbidity and Mortality Weekly Report, 43(16), 290–293.
44. National ambient air quality standards page. https://www.epa.gov/naaqs. Accessed 12 October 2022.
45. Canadian ambient air quality standards page. https://ccme.ca/en/air-quality-report. Accessed 12 October 2022.
46. Office of safety and health administration page. https://www.osha.gov/indoor-air-quality. Accessed 12 October 2022.
47. National institute for occupational safety and health page. https://www.cdc.gov/niosh/. Accessed 12 October 2022.
48. American conference of governmental industrial hygienists page. https://www.acgih.org/. Accessed 12 October 2022.
49. American society of heating, refrigerating and air-conditioning engineers page. https://www.ashrae.org/technical-resources/bookstore/indoor-air-quality-guide. Accessed 12 October 2022.
50. World Health Organization page. https://www.who.int/news-room/feature-stories/detail/what-are-the-who-air-quality-guidelines. Accessed 12 October 2022.
51. Openaq air quality datasets download page. https://openaq.org/#/locations?page=1. Accessed 12 October 2022.
52. Chen, M., Yang, J., Hu, L., Hossain, M. S., & Muhammad, G. (2018). Urban healthcare big data system based on crowdsourced and cloud-based air quality indicators. IEEE Communications Magazine, 56(11), 14–20.
53. Ramos, F., Trilles, S., Muñoz, A., & Huerta, J. (2018). Promoting pollution-free routes in smart cities using air quality sensor networks. Sensors, 18(8), 2507.
54. Jaimini, U., Banerjee, T., Romine, W., Thirunarayan, K., Sheth, A., & Kalra, M. (2017). Investigation of an indoor air quality sensor for asthma management in children. IEEE Sensors Letters, 1(2), 1–4.
55. Semple, S., Ibrahim, A. E., Apsley, A., Steiner, M., & Turner, S. (2015). Using a new, low-cost air quality sensor to quantify second-hand smoke (shs) levels in homes. Tobacco Control, 24(2), 153–158.
56. Peladarinos, N., Cheimaras, V., Piromalis, D., Arvanitis, K. G., Papageorgas, P., Monios, N., Dogas, I., Stojmenovic, M., & Tsaramirsis, G. (2021). Early warning systems for COVID-19 infections based on low-cost indoor air-quality sensors and lpwans. Sensors, 21(18), 6183.
57. Iskandaryan, D., Ramos, F., & Trilles, S. (2020). Air quality prediction in smart cities using machine learning technologies based on sensor data: A review. Applied Sciences, 10(7), 2401.
58. Gerina, F., Massa, S. M., Moi, F., Reforgiato Recupero, D., & Riboni, D. (2020). Recognition of cooking activities through air quality sensor data for supporting food journaling. Human-centric Computing and Information Sciences, 10(1), 1–26.
59. DiFilippo, K. N., Huang, W. H., Andrade, J. E., & Chapman-Novakofski, K. M. (2015). The use of mobile apps to improve nutrition outcomes: A systematic literature review. Journal of Telemedicine and Telecare, 21(5), 243–253.
60. World Health Organization global nutrition report page. https://www.who.int/nutrition/globalnutritionreport/en/. Accessed 12 October 2022.
61. World Health Organization obesity and overweight detail page. https://www.who.int/news-room/fact-sheets/detail/obesity-and-overweight. Accessed 12 October 2022.
62. Flegal, K. M., Graubard, B. I., Williamson, D. F., & Gail, M. H. (2005). Excess deaths associated with underweight, overweight, and obesity. JAMA, 293(15), 1861–1867.
63. Bouwman, L., Hiddink, G., Koelen, M., Korthals, M., Van’t Veer, P., & Van Woerkum, C. (2005). Personalized nutrition communication through ICT application: How to overcome the gap between potential effectiveness and reality. European Journal of Clinical Nutrition, 59(1), S108–S116.
64. Marr, J. W. (1971). Individual dietary surveys: Purposes and methods. World Review of Nutrition and Dietetics, 13, 105–164.
65. Bruno, V., Resende, S., & Juan, C. (2017). A survey on automated food monitoring and dietary management systems. Journal of Health & Medical Informatics, 8(3). https://doi.org/10.4172/2157-7420.1000272
66. Ahmed, S., Srinivasu, P. N., Alhumam, A., & Alarfaj, M. (2022). AAL and internet of medical things for monitoring type-2 diabetic patients. Diagnostics, 12(11), 2739. https://doi.org/10.3390/diagnostics12112739
67. Zhu, F., Bosch, M., Woo, I., Kim, S., Boushey, C. J., Ebert, D. S., & Delp, E. J. (2010). The use of mobile devices in aiding dietary assessment and evaluation. IEEE Journal of Selected Topics in Signal Processing, 4(4), 756–766.
68. Amft, O., Stäger, M., Lukowicz, P., & Tröster, G. (2005). Analysis of chewing sounds for dietary monitoring. International Conference on Ubiquitous Computing (pp. 56–72). Springer.
69. Mankoff, J., Hsieh, G., Hung, H. C., Lee, S., & Nitao, E. (2002). Using low-cost sensing to support nutritional awareness. International Conference on Ubiquitous Computing (pp. 371–378). Springer.
70. Casas, J., Mugellini, E., & Khaled, O. A. (2018). Food diary coaching chatbot. In Proceedings of the 2018 ACM International Joint Conference and 2018 International Symposium on Pervasive and Ubiquitous Computing and Wearable Computers (pp. 1676–1680).
71. Sen, S., Subbaraju, V., Misra, A., Balan, R., & Lee, Y. (2018). Annapurna: Building a real-world smartwatch-based automated food journal. In 2018 IEEE 19th International Symposium on “A World of Wireless, Mobile and Multimedia Networks” (WoWMoM) (pp. 1–6). IEEE.
72. Chi, P. Y. P., Chen, J. H., Chu, H. H., & Lo, J. L. (2008). Enabling calorie-aware cooking in a smart kitchen. International Conference on Persuasive Technology (pp. 116–127). Springer.
73. Yordanova, K., Lüdtke, S., Whitehouse, S., Krüger, F., Paiement, A., Mirmehdi, M., Craddock, I., & Kirste, T. (2019). Analysing cooking behaviour in home settings: Towards health monitoring. Sensors, 19(3), 646.
74. Yordanova, K., Whitehouse, S., Paiement, A., Mirmehdi, M., Kirste, T., & Craddock, I. (2017). What’s cooking and why? Behaviour recognition during unscripted cooking tasks for health monitoring. In 2017 IEEE International Conference on Pervasive Computing and Communications Workshops (PerCom Workshops) (pp. 18–21). IEEE.
75. Oh, H., Nguyen, J., Soundararajan, S., & Jain, R. (2018). Multimodal food journaling. In Proceedings of the 3rd International Workshop on Multimedia for Personal Health and Health Care (pp. 39–47).
76. Cordeiro, F., Epstein, D. A., Thomaz, E., Bales, E., Jagannathan, A. K., Abowd, G. D., & Fogarty, J. (2015). Barriers and negative nudges: Exploring challenges in food journaling. In Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems (pp. 1159–1162).
77. Celino, I., & Calegari, G. R. (2020). Submitting surveys via a conversational interface: An evaluation of user acceptance and approach effectiveness. International Journal of Human-Computer Studies, 139, 102410.
78. Riboni, D. (2019). Opportunistic pervasive computing: Adaptive context recognition and interfaces. CCF Transactions on Pervasive Computing and Interaction, 1(2), 125–139.
79. Wildenbos, G. A., Peute, L., & Jaspers, M. (2018). Aging barriers influencing mobile health usability for older adults: A literature based framework (mold-us). International Journal of Medical Informatics, 114, 66–75.
80. Guenther, P. M., Casavale, K. O., Reedy, J., Kirkpatrick, S. I., Hiza, H. A., Kuczynski, K. J., Kahle, L. L., & Krebs-Smith, S. M. (2013). Update of the healthy eating index: Hei-2010. Journal of the Academy of Nutrition and Dietetics, 113(4), 569–580.
81. Romano, K. A., Swanbrow Becker, M. A., Colgary, C. D., & Magnuson, A. (2018). Helpful or harmful? The comparative value of self-weighing and calorie counting versus intuitive eating on the eating disorder symptomology of college students. Eating and Weight Disorders-Studies on Anorexia, Bulimia and Obesity, 23(6), 841–848.
82. Michie, S., West, R., Sheals, K., & Godinho, C. A. (2018). Evaluating the effectiveness of behavior change techniques in health-related behavior: A scoping review of methods used. Translational Behavioral Medicine, 8(2), 212–224.
83. Dhiman, C., & Vishwakarma, D. K. (2019). A review of state-of-the-art techniques for abnormal human activity recognition. Engineering Applications of Artificial Intelligence, 77, 21–45. https://www.sciencedirect.com/science/article/pii/S0952197618301775
84. Rajavel, R., Ravichandran, S. K., Harimoorthy, K., Nagappan, P., & Gobichettipalayam, K. R. (2022). Iot-based smart healthcare video surveillance system using edge computing. Journal of Ambient Intelligence and Humanized Computing, 13(6), 3195–3207.
85. Joshi, M. L., & Kanoongo, N. (2022). Depression detection using emotional artificial intelligence and machine learning: A closer review. Materials Today: Proceedings, 58, 217–226.
86. Vakanski, A., Jun, H. P., Paul, D., & Baker, R. (2018). A data set of human body movements for physical rehabilitation exercises. Data, 3(1), 2.
87. Nino, C., & Diego, R. (2020). Video action recognition and prediction architecture for a robotic coach. In 1st Workshop on Smart Personal Health Interfaces, SmartPhil 2020, 2596, 69–77.
88. Microsoft: Azure kinect homepage. https://azure.microsoft.com/en-us/products/kinect-dk/#overview. Accessed 5 October 2022.
89. Intel: Realsense homepage. https://www.intel.com/content/www/us/en/architecture-and-technology/realsense-overview.html. Accessed 5 October 2022.
90. Ltd, V. M. S.: Vicon homepage. https://www.vicon.com/. Accessed 5 October 2022.
91. Kong, Y., & Fu, Y. (2018). Human action recognition and prediction: A survey. arXiv:1806.11230.
92. Jia, K., & Yeung, D. Y. (2008). Human action recognition using local spatio-temporal discriminant embedding. In 2008 IEEE Conference on Computer Vision and Pattern Recognition (pp. 1–8). IEEE.
93. Yuan, C., Wu, B., Li, X., Hu, W., Maybank, S., & Wang, F. (2016). Fusing R features and local features with context-aware kernels for action recognition. International Journal of Computer Vision, 118(2), 151–171.
94. Abu-El-Haija, S., Kothari, N., Lee, J., Natsev, P., Toderici, G., Varadarajan, B., & Vijayanarasimhan, S. (2016). Youtube-8m: A large-scale video classification benchmark. arXiv:1609.08675.
95. Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., et al. (2017). The kinetics human action video dataset. arXiv:1705.06950.
96. Monfort, M., Andonian, A., Zhou, B., Ramakrishnan, K., Bargal, S. A., Yan, Y., Brown, L., Fan, Q., Gutfreund, D., Vondrick, C., et al. (2019). Moments in time dataset: One million videos for event understanding. IEEE Transactions on Pattern Analysis and Machine Intelligence.
97. Ji, S., Xu, W., Yang, M., & Yu, K. (2012). 3d convolutional neural networks for human action recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(1), 221–231.
98. Simonyan, K., & Zisserman, A. (2014). Two-stream convolutional networks for action recognition in videos. In Advances in Neural Information Processing Systems (pp. 568–576).
99. Yue-Hei Ng, J., Hausknecht, M., Vijayanarasimhan, S., Vinyals, O., Monga, R., & Toderici, G. (2015). Beyond short snippets: Deep networks for video classification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 4694–4702).
100. Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., & Fei-Fei, L. (2014). Large-scale video classification with convolutional neural networks. CVPR (pp. 1725–1732).
101. Naga Srinivasu, P., JayaLakshmi, G., Jhaveri, R. H., & Phani Praveen, S. P. (2022). Ambient assistive living for monitoring the physical activity of diabetic adults through body area networks. Mobile Information Systems, Article ID 3169927, 18. https://doi.org/10.1155/2022/3169927
102. Li, W., Mahadevan, V., & Vasconcelos, N. (2013). Anomaly detection and localization in crowded scenes. IEEE Transactions on Pattern Analysis and Machine Intelligence, 36(1), 18–32.
103. Marszałek, M., Laptev, I., & Schmid, C. (2009). Actions in context. In IEEE Conference on Computer Vision & Pattern Recognition.
104. Rodriguez, M. D., Ahmed, J., & Shah, M. (2008). Action mach a spatio-temporal maximum average correlation height filter for action recognition. 2008 IEEE Conference on Computer Vision and Pattern Recognition (pp. 1–8). IEEE.
105. Singh, S., Velastin, S. A., & Ragheb, H. (2010). Muhavi: A multicamera human action video dataset for the evaluation of action recognition methods. In Advanced Video and Signal Based Surveillance (AVSS), 2010 Seventh IEEE International Conference on (pp. 48–55). IEEE.
106. Smaira, L., Carreira, J., Noland, E., Clancy, E., Wu, A., & Zisserman, A. (2020). A short note on the kinetics-700-2020 human action dataset. arXiv:2010.10864.
107. Soomro, K., Zamir, A. R., & Shah, M. (2012). Ucf101: A dataset of 101 human actions classes from videos in the wild. arXiv:1212.0402.
108. Weinland, D., Ronfard, R., & Boyer, E. (2006). Free viewpoint action recognition using motion history volumes. Computer Vision and Image Understanding, 104(2–3), 249–257.
109. Das, S., Dai, R., Koperski, M., Minciullo, L., Garattoni, L., Bremond, F., & Francesca, G. (2019). Toyota smarthome: Real-world activities of daily living. In Proceedings of the IEEE/CVF International Conference on Computer Vision (pp. 833–842).
110. Liu, J., Shahroudy, A., Perez, M., Wang, G., Duan, L. Y., & Kot, A. C. (2019). Ntu rgb+ d 120: A large-scale benchmark for 3d human activity understanding. IEEE Transactions on Pattern Analysis and Machine Intelligence, 42(10), 2684–2701.
111. Wang, J., Liu, Z., Wu, Y., & Yuan, J. (2012). Mining actionlet ensemble for action recognition with depth cameras. In 2012 IEEE Conference on Computer Vision and Pattern Recognition (pp. 1290–1297). IEEE.
112. Xia, L., Chen, C., & Aggarwal, J. (2012). View invariant human action recognition using histograms of 3d joints. In Computer Vision and Pattern Recognition Workshops (CVPRW), 2012 IEEE Computer Society Conference (pp. 20–27). IEEE.
113. XN, W. L. Northwestern-UCLA multiview action 3d dataset. http://wangjiangb.github.io/my_data.html. Accessed 5 October 2022.
114. Yun, K., Honorio, J., Chattopadhyay, D., Berg, T. L., & Samaras, D. (2012). Two-person interaction detection using body-pose features and multiple instance learning. In 2012 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (pp. 28–35). IEEE.
115. Zhang, J., Li, W., Wang, P., Ogunbona, P., Liu, S., & Tang, C. (2016). A large scale rgb-d dataset for action recognition. In International workshop on understanding human activities through 3D sensors (pp. 101–114). Springer.
116. Corbillon, X., De Simone, F., & Simon, G. (2017). 360-degree video head movement dataset. Proceedings of the 8th ACM on Multimedia Systems Conference (pp. 199–204).
117. Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). Imagenet classification with deep convolutional neural networks. In F. Pereira, C. J. C. Burges, L. Bottou, K. Q. Weinberger (Eds.), Advances in neural information processing systems (vol. 25). Curran Associates.
118. Karras, T., Laine, S., & Aila, T. (2019). A style-based generator architecture for generative adversarial networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 4401–4410).
119. Tobin, J., Fong, R., Ray, A., Schneider, J., Zaremba, W., & Abbeel, P. (2017). Domain randomization for transferring deep neural networks from simulation to the real world. In 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (pp. 23–30). IEEE.
120. Tremblay, J., To, T., Sundaralingam, B., Xiang, Y., Fox, D., & Birchfield, S. (2018). Deep object pose estimation for semantic robotic grasping of household objects. arXiv:1809.10790.
121. Technologies, U. Unity homepage. https://unity.com/. Accessed 5 October 2022.
122. Games, E. Unreal engine homepage. https://www.unrealengine.com/en-US/. Accessed 5 October 2022.
123. Cauli, N., & Reforgiato Recupero, D. (2022). Survey on videos data augmentation for deep learning models. Future Internet, 14(3), 93.
124. Cauli, N. Dr.vcoach website. https://drvcoach.unica.it/about.html. Accessed 5 October 2022.
125. Cavallo, F., Limosani, R., Manzi, A., Bonaccorsi, M., Esposito, R., Di Rocco, M., Pecora, F., Teti, G., Saffiotti, A., & Dario, P. (2014). Development of a socially believable multi-robot solution from town to home. Cognitive Computation, 6(4), 954–967.
126. Parra, C., Silveira, P., Far, I. K., Daniel, F., De Bruin, E. D., Cernuzzi, L., D’Andrea, V., Casati, F., et al. (2014). Information technology for active ageing: A review of theory and practice. Foundations and Trends® in Human–Computer Interaction, 7(4), 351–448.
127. Albaina, I. M., Visser, T., Van Der Mast, C. A., & Vastenburg, M. H. (2009). Flowie: A persuasive virtual coach to motivate elderly individuals to walk. In 2009 3rd International Conference on Pervasive Computing Technologies for Healthcare (pp. 1–7). IEEE.
128. Institute for Systems and Robotics, & Lisbon, R. Aha project homepage. http://welcome.isr.tecnico.ulisboa.pt/aha-project/. Accessed 5 October 2022.
129. Čaić, M., Avelino, J., Mahr, D., Odekerken-Schröder, G., & Bernardino, A. (2020). Robotic versus human coaches for active aging: An automated social presence perspective. International Journal of Social Robotics, 12(4), 867–882.
130. Aldebaran: Nao robot homepage. https://www.aldebaran.com/en/nao. Accessed 5 October 2022.
131. Aldebaran: Pepper robot homepage. https://www.aldebaran.com/en/pepper. Accessed 5 October 2022.
132. Lee, N., Choi, W., Vernaza, P., Choy, C. B., Torr, P. H., & Chandraker, M. (2017). Desire: Distant future prediction in dynamic scenes with interacting agents. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 336–345).
133. Rao, R. P., & Ballard, D. H. (1999). Predictive coding in the visual cortex: A functional interpretation of some extra-classical receptive-field effects. Nature Neuroscience, 2(1), 79.
134. Barrera, A., & Laschi, C. (2010). Anticipatory visual perception as a bio-inspired mechanism underlying robot locomotion. In 2010 Annual International Conference of the IEEE Engineering in Medicine and Biology (pp. 3206–3209). IEEE.
135. Cauli, N., Falotico, E., Bernardino, A., Santos-Victor, J., & Laschi, C. (2016). Correcting for changes: Expected perception-based control for reaching a moving target. IEEE Robotics & Automation Magazine, 23(1), 63–70.
136. Finn, C., Goodfellow, I., & Levine, S. (2016). Unsupervised learning for physical interaction through video prediction. Advances in Neural Information Processing Systems (pp. 64–72).
137. Jung, M., Matsumoto, T., & Tani, J. (2019). Goal-directed behavior under variational predictive coding: Dynamic organization of visual attention and working memory. arXiv:1903.04932.
138. Tani, J. (2016). Exploring robotic minds: Actions, symbols, and consciousness as self-organizing dynamic phenomena. Oxford University Press.
139. Hwang, J., Kim, J., Ahmadi, A., Choi, M., & Tani, J. (2018). Dealing with large-scale spatiotemporal patterns in imitative interaction between a robot and a human by using the predictive coding framework. IEEE Transactions on Systems, Man, and Cybernetics: Systems.
140. Google: Speech to text homepage. https://cloud.google.com/speech-to-text/. Accessed 5 October 2022.
141. Robotics, Z. Zora/nao robot homepage. https://www.zorarobotics.be/robots/nao. Accessed 5 October 2022.
142. Pulido, J. C., Suarez-Mejias, C., Gonzalez, J. C., Ruiz, A. D., Ferri, P. F., Sahuquillo, M. E. M., De Vargas, C. E. R., Infante-Cossio, P., Calderon, C. L. P., & Fernandez, F. (2019). A socially assistive robotic platform for upper-limb rehabilitation: A longitudinal study with pediatric patients. IEEE Robotics & Automation Magazine, 26(2), 24–39.
143. Gerina, F., Pes, B., Reforgiato Recupero, D., & Riboni, D. (2019). Toward supporting food journaling using air quality data mining and a social robot. European Conference on Ambient Intelligence (pp. 318–323). Springer.
26
N. Cauli et al.
144. Bettini, C., & Riboni, D. (2015). Privacy protection in pervasive systems: State of the art and technical challenges. Pervasive and Mobile Computing, 17, 159–174. 145. Toch, E., Bettini, C., Shmueli, E., Radaelli, L., Lanzi, A., Riboni, D., & Lepri, B. (2018). The privacy implications of cyber security systems: A technological survey. ACM Computing Surveys (CSUR), 51(2), 1–27. 146. Gerke, S., Minssen, T., & Cohen, G. (2020). Ethical and legal challenges of artificial intelligence-driven healthcare. Artificial intelligence in healthcare (pp. 295–336). Elsevier.
Habitpad: A Habit-Change Person-Centric Healthcare Mobile Application with Machine Learning and Gamification Features for Obesity
Wan Chai Xuan and Pantea Keikhosrokiani
Abstract Obesity increases the risk of chronic diseases and malignancies. Individuals may develop poor habits that lead to weight gain and excess body fat. To address this chronic health concern, various treatments such as dietary changes, physical exercise, weight-loss training, and the adoption of health programs promoting a balanced diet are recommended. In order to support obese patients in maintaining a healthy lifestyle, a mobile application called Habitpad is proposed in this chapter. Habitpad aims to track the data and habit changes of obese patients, providing them with valuable insights. This study utilizes data analytics and machine learning techniques to classify the level of obesity based on patients’ habit data. Several machine learning algorithms were compared, and the decision tree algorithm demonstrated the highest performance with 94% accuracy. To encourage the use of this application, gamification features including challenges, point rewards, and leaderboards are incorporated. This study anticipates reducing the obesity rate by promoting positive behavior change and healthier habits among patients.
Keywords Obesity · Habit Change · Mobile Application · Data analytics · Machine learning · Classification · Gamification
W. C. Xuan · P. Keikhosrokiani (corresponding author), School of Computer Sciences, Universiti Sains Malaysia, Minden, Penang, Malaysia
P. Keikhosrokiani, Faculty of Information Technology and Electrical Engineering, University of Oulu, Oulu, Finland; Faculty of Medicine, University of Oulu, Oulu, Finland
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023. P. Barsocchi et al. (eds.), Enabling Person-Centric Healthcare using Ambient Assistive Technology, Studies in Computational Intelligence 1108, https://doi.org/10.1007/978-3-031-38281-9_2

1 Introduction
Obesity is a complex non-communicable disease (NCD) defined as an excess of body fat that can impair health. Obesity is not just a cosmetic concern; it is one of the most dangerous health issues in the world [1, 2]. Reference [1] estimates that
in 2016 around 39% of persons aged 18 and older were overweight and 13% were obese. Malaysia has the highest rate of adult obesity in Southeast Asia: 50.1% of adults are either overweight (30.4%) or obese (19.7%) [3]. This condition arises from people’s unhealthy habits and lifestyles. Obesity is a substantial risk factor for a variety of chronic diseases, including cardiovascular disorders such as heart disease and stroke, which are among the major causes of death worldwide. Obesity affects not only people’s health but also their social relationships, through poor self-image, low self-esteem, and limitations in productivity and mobility. Patients with obesity or other chronic conditions may suffer negative effects from unhealthy behaviors, including smoking, poor eating habits, sleep deprivation, and lack of exercise. Researchers’ attention has been drawn to the significance of early illness detection through healthcare systems employing data analytics [4–15]. Therefore, to avoid future health problems, a change of habits in a healthy direction is necessary. The developed mobile application keeps track of patients’ routines and everyday activities and records the information in a database. The standard of patient care and support may be raised with the right application and adoption of analytical techniques in healthcare [5, 16]. This chapter proposes a habit-change mobile application with gamification features for obese patients, named Habitpad. The application helps obese patients keep track of their living habits. Furthermore, Habitpad can classify a patient’s obesity level from the patient’s historical habit data. Hence, changes in a patient’s habits can be analyzed using this application and shown to the doctor in graph form.
Habitpad also implements gamification features that allow users to earn points for completing tasks and redeem the points for rewards. Users can also receive badges by reaching step count targets, and the accumulated total step count is shown on a leaderboard that is visible to every user. Persuasive features consisting of reminders, tips, information, and appointment making are also included in Habitpad. Gamification and persuasive features can increase engagement and encouragement among patients, as well as the usage rate of Habitpad. In the proposed work, machine learning classification algorithms (KNN, decision tree, and SVM) are used in implementing Habitpad. The main contributions of this chapter are as follows:
• A habit-change mobile application with gamification features for obese patients, named Habitpad.
• An analytical model for the classification of a patient’s obesity level based on historical habit data.
• A comparison of machine learning models for seven obesity classes.
The remaining sections of this chapter are as follows: Sect. 2 reviews the related works and compares the existing systems. Section 3 includes the development methodology and introduces the proposed Habitpad mobile application. The system requirements, design, development methodology, data analytics, tests, and evaluation are included in this section. Section 4 consists of analysis, results, and discussion. Finally, this chapter closes with the deployment strategy (Sect. 5) and with concluding remarks and future directions (Sect. 6).
2 Related Work
2.1 Habit Change
According to [8, 17, 18], a habit is a routine of behavior that is repeated regularly and tends to occur subconsciously. Through the process of habit formation, new behaviors can become automatic. Because the behavioral patterns that humans repeat become imprinted in brain circuits, old habits are difficult to break and new habits are difficult to form, although new habits can be formed through repetition. Habits can have a huge impact: bad habits can ruin people’s lives, while good habits can make them extremely successful. Obesity and sedentary lifestyles are quickly becoming healthcare concerns in modern countries due to lack of physical activity and poor diet. Hence, it is important for patients to change their poor habits to lead a healthier lifestyle. In modern societies, habit change can be tracked and monitored through IoT devices such as smartwatches, fitness bands, and smartphones to promote healthy changes in behavior. Sophisticated sensor technologies that provide physiological data such as step count, calories burnt, heart rate, water intake, and blood pressure improve the effectiveness of habit-change tracking. Having this habit data can help patients and doctors make better decisions in the future [9, 19].
2.2 Gamification
Gamification is a technique for integrating game mechanics and design features into non-game environments. It may also be described as a collection of techniques and methods for problem-solving that make use of the characteristics of game components [20, 21]. Some classic game elements are points, rewards, badges and leaderboards.
Points are basic game elements used in a wide range of games and gamified applications. They are often awarded for completing certain tasks in a gamified environment, and they quantitatively reflect a user’s progress. Points serve a variety of objectives, one of which is to offer feedback: they track a user’s in-game activity and provide constant and rapid feedback as well as rewards.
Rewards can be either extrinsic (gifts, loyalty points) or intrinsic (social status, progress). Rewards trigger the release of dopamine in the brain, and recent research has shown that motivation rises sharply when rewards are unpredictable. Gamification concentrates attention and boosts motivation by linking behaviors to a changing reward schedule.
Badges are gamification features in the form of incentives that represent a user’s achievements. For example, a user who has just completed a 3,000-step count can be given a badge to show them that their accomplishment is recognized and valued, which can frequently motivate them to attain even larger goals.
Users are ranked on leaderboards based on their relative success, which is measured against a set of success criteria. Leaderboards help users achieve their full potential by competing against their peers and proving their competency to themselves. For example, to reach the top spot or advance one rung up the ladder, the user must earn a certain number of points.
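The game elements described in this section (points, rewards, badges and leaderboards) can be illustrated with a minimal Python sketch. It is not Habitpad’s implementation: the task names, point values and badge thresholds are hypothetical, apart from the 3,000-step example mentioned in the text.

```python
from dataclasses import dataclass, field

# Hypothetical values for illustration; the chapter does not specify them,
# apart from the 3,000-step badge example.
TASK_POINTS = {"log_meal": 10, "log_water": 5, "log_workout": 20}
BADGE_STEP_TARGETS = [3_000, 5_000, 10_000]


@dataclass
class Patient:
    name: str
    points: int = 0
    total_steps: int = 0
    badges: list = field(default_factory=list)

    def complete_task(self, task: str) -> None:
        """Award the points attached to a completed task."""
        self.points += TASK_POINTS[task]

    def add_steps(self, steps: int) -> None:
        """Accumulate steps and grant any newly reached step badges."""
        self.total_steps += steps
        for target in BADGE_STEP_TARGETS:
            badge = f"{target}-step badge"
            if self.total_steps >= target and badge not in self.badges:
                self.badges.append(badge)

    def redeem(self, cost: int) -> bool:
        """Spend points on a reward; refuse if the balance is too low."""
        if cost > self.points:
            return False
        self.points -= cost
        return True


def leaderboard(patients):
    """Rank patients by accumulated step count, highest first."""
    return sorted(patients, key=lambda p: p.total_steps, reverse=True)
```

Keeping badges as a function of the accumulated step count, rather than of individual sessions, mirrors the description of badges as markers of overall achievement.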
2.3 Machine Learning Classification Algorithms
Classification is a type of “pattern recognition” in which classification algorithms are applied to training data in order to locate the same pattern in new data sets. Common machine learning classification algorithms include K-Nearest Neighbors (KNN), decision trees (DT) and Support Vector Machines (SVM).
K-Nearest Neighbors (KNN) is one of the simplest supervised classification algorithms. The KNN algorithm stores all previously collected data and classifies incoming data points according to how closely they resemble the stored data. New data can therefore be assigned to an appropriate category as soon as it arrives. KNN is also known as a lazy learner algorithm because it keeps the dataset and defers computation until classification time rather than learning a model from the training set in advance [15].
A decision tree (DT) is a supervised machine learning approach that may be applied to both classification and regression problems. It is a tree-structured classifier in which internal nodes represent dataset features, branches represent decision rules, and each leaf node holds an outcome. It serves as a graphic representation of all the possible solutions to a problem or decision under given conditions [22, 23].
The Support Vector Machine (SVM) algorithm is one of the most popular supervised learning techniques for classification and regression problems. The goal of the SVM method is to determine the optimal decision boundary that separates n-dimensional space into classes so that subsequent data points can be assigned to the correct category. This optimal decision boundary is known as a hyperplane [15].
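The “lazy learner” behavior of KNN described above can be made concrete with a short, self-contained Python sketch. It is an illustration only: the two features and the toy data are hypothetical, and the chapter’s own models were not necessarily built this way.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # "Lazy" learning: no model is fitted; distances are computed at query time.
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_x, train_y)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Toy habit data (hypothetical): [daily steps in thousands, water intake in litres]
train_x = [[2.0, 1.0], [2.5, 1.2], [3.0, 0.8], [9.0, 2.5], [10.0, 2.0], [8.5, 2.2]]
train_y = ["obese", "obese", "obese", "normal", "normal", "normal"]

print(knn_predict(train_x, train_y, [2.8, 1.1]))
print(knn_predict(train_x, train_y, [9.2, 2.1]))
```

Because KNN relies on raw distances, features on very different scales would normally be normalized first; that step is omitted here for brevity.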
2.4 Ambient Assisted Living Technology and Remote Person-Centric Healthcare
The delivery of health and social care services for patients is increasingly dependent on ambient assisted living technology. In order to boost older patients’ use of modern technology, a variety of actions have been implemented globally. The creation of unique digital services that improve wellness or address social issues has received a lot of attention in this area [24]. The integration of information and communication technology in a patient’s life that helps the patient to stay active longer, maintain social connections, and live independently is referred to as an ambient assisted living setting. Low levels of physical activity are linked to greater death rates among the elderly, higher blood pressure, heart disease, obesity, weakened immune systems, depression, anxiety, and lower cognitive function. Ambient assisted living technologies provide an important solution to monitor patients remotely [24]. For instance, in order to manage hospital home care services, a study [25] describes an ambient assisted living healthcare architecture. The proposed remote healthcare system depends on implementing an event manager to combine sources ranging from mobile phones to web-based applications. Another study [26] detects physical activity within lifelogs to help with ambient assisted living and avoid obesity.
2.5 Comparison of Existing Habit Change Healthcare Applications
Some existing applications that fall into the same category of habit-change mobile application as our proposed solution, Habitpad, are studied to determine their features and the techniques they use. The following sections discuss three applications that can be found in the Google Play Store: HealthifyMe, Health Infinity and Garmin Connect. Table 1 shows the comparison of features implemented in HealthifyMe, Health Infinity and Garmin Connect.
Table 1 Features comparison of existing applications
Feature | HealthifyMe | Health Infinity | Garmin Connect
User registration | Yes, sign up using phone number, email address, Facebook or Google account | Yes, sign up using email address or Google account | Yes, sign up using email address
Profile management | Yes | Yes | Yes
Statistics | Yes, each tracker has a weekly statistic | Yes, available in daily and monthly statistics | Yes, available in daily, weekly, monthly, and yearly averages of health statistics
Trackers | Calories, water, steps, workout, weight, sleep, medicine, and hand wash trackers | Calories, water, weight, heart rate, route, and sleep trackers | Calories, hydration, steps, workout, weight, sleep, heart rate, and menstrual cycle trackers
Gamification | Yes, contains various tasks with points for levelling up and leaderboard | No | Yes, contains various challenges with badges and leaderboard
Reminders | Yes, each tracker has its own reminder | Yes, contains weight, meals, challenges, water, and medication reminders | No
Tips & information | Yes, contains blog, Q&A, health advice, recipes, workout & diet videos, tips, lessons and more | No | Yes, contains personalized advice, helpful videos, and articles (in Garmin Coach features)
Sync data | Syncs activity & step counter data by seamlessly integrating with Samsung Health, Google Fit, Garmin, and Fitbit | Connects to Google Fit and syncs health data with other apps (PRO feature) | Syncs with other apps like MyFitnessPal and Strava
Other features | One-on-one coaching (PRO feature); shares statistics to other platforms; chatbot; join group (PRO feature) | Health calculator; calories calculator; notes | Garmin Coach for customized workouts; tracks lifespan of gear like shoes and bicycles; builds connections with other users

2.5.1 HealthifyMe
HealthifyMe [27] is an Indian health and fitness application developed by HealthifyMe, Inc. to help users improve their health and lose weight. It focuses on helping users lose weight and become fit by providing weight-loss diet plans and personal trainer services. It helps users practice self-care by setting different goals, known as “trackers”. Each tracker carries a different goal, and trackers are recommended to users based on an onboarding assessment that studies an individual’s lifestyle and background information. HealthifyMe is known as India’s best dietitian app because it has the largest database of Indian foods, including international cuisines and healthy recipes, from dal to dosa, with Indian serving sizes. HealthifyMe uses AI techniques: users can snap pictures of food to calculate nutritional values from the food database. Besides, users can get personalized health and weight-loss suggestions 24/7 from Ria, the world’s first AI-powered nutritionist. Ria is driven by over 200 million food and gym logs that allow users to get instant answers, insights, and feedback on diet plans and workouts. In HealthifyMe, a ‘Discover’ function guides users to health advice, recipes, videos, articles and a daily dose of motivation for fitness goals through fresh content on the application’s feed that is renewed every day. The application also includes gamification features such as tasks and leaderboards, and persuasive features that allow users to join groups to communicate and discuss with other app users and coaches. These features further motivate users to keep using the app, as they create a sense of companionship and peer motivation so that users do not feel alone. HealthifyMe is proven to be able to assist users in losing weight through the ‘Transformation’ function within the app.
2.5.2 Health Infinity
Health Infinity [28] is an all-in-one health and fitness tracker developed by Droid Infinity, Inc. to help people reach their health goals, lose weight, make better food choices and stay fit. It focuses on tracking the health activities of application users. It tracks only calories, water intake, weight, heart rate, route and sleep cycle to generate simple statistics. It has a calorie calculator and a health calculator to calculate ideal weight, body fat, metabolic rate, muscle, target heart rate and much more. One of the special features of this application is that it does not require any external device such as a smartwatch or fitness band. Instead, it uses only the phone’s own sensors, such as the camera and flash, to monitor heart rate accurately, and it records GPS-based activities with precise stats such as pace, route, distance, and calories burned. It also uses the phone’s sensors to detect the sleep cycle and workout challenges such as pull-ups, push-ups, sit-ups and squats.
2.5.3 Garmin Connect
Garmin Connect [29] is a mobile application created by Garmin Ltd. for monitoring, analyzing, and sharing health and fitness activities captured by associated Garmin devices. It is not just about the data: beyond tracking activities, its digital insights provide customized workouts that adapt to the user’s ability, personalized advice, and helpful articles and videos. Garmin Connect focuses on providing detailed visual analysis and health statistics. Users can view daily, weekly, monthly and yearly averages of health statistics, historic tracking of their favorite activities and more. On the app’s dashboard, users can customize what they want to see and in what order. Besides, Garmin Connect has a ‘Challenges’ feature through which users can collect and manage badges upon completing certain challenges. There is also a ‘New Feeds’ feature where users can share workout data with other users. These features help to boost users’ motivation and give them encouragement. Other than tracking health activities, it has a ‘Garmin Coach’ feature. This feature lets users select a running or walking race goal and provides expert coaching and a dynamic training plan that shifts based on their goals and performance.
3 Development Methodology
Agile development is the chosen development methodology for this project, as its concepts are in line with our objectives. It is an iterative approach in which requirements and proposed solutions evolve through the collective effort of self-organizing, cross-functional teams in discussion with end-users. This development method focuses on disciplined project management to support rapid change and adaptation, alongside good cooperation and appropriate practices for quick delivery of high-quality applications. Rapid system modification helps the system deal with a variety of users. As a result, an agile methodology is used to guarantee that changes can be made as the project is being developed. The processes involved in the Agile methodology are planning, requirement analysis, designing, development, testing and integration, and maintenance. Figure 1 shows the Agile Methodology model in this project.
3.1 Planning and Requirement Analysis Phase
After project bidding and confirmation, the first meeting with the supervisor was held to discuss the details of the project. Following the discussion, it was agreed that Habitpad, a habit-change tracking application with gamification features, would be dedicated to obese patients. In this phase, users’ requirements were elicited by distributing an online survey. Google Forms was used as the platform to create the questionnaire.
Fig. 1 Agile methodology model
A survey was conducted for approximately one week, from November 22 to November 28, 2021, and 30 responses were collected. The proposed mobile application was demonstrated to the respondents via a recorded video, and the respondents filled in the online questionnaire after watching it. Based on the survey results shown in Table 2, 43.3% of the respondents are male and 56.7% are female, and the most common age range is 18–24 years. Five of the 30 respondents have an obesity problem, and they do not agree that one’s habits cause obesity, because obesity may be caused by genetics. The remaining 83.3% of respondents do not have an obesity problem and believe that one’s habits cause obesity. Besides, 70% of respondents prefer digital habit tracking to analog tracking (notebook, bulletin board, calendar). Only 11 respondents, however, have habit tracker applications installed on their laptop or mobile device; the other 19 have none. This suggests that not many individuals are paying attention to their lifestyle habits. Among the 30 respondents, 21 think that habit tracker applications with gamification features (points, levels, badges, leaderboard) are attractive and effective in ensuring continued use. Lastly, 63.3% of respondents agree that habit change for obese patients may help reduce the obesity rate, which supports and encourages this project.
Habitpad has several functional requirements. The application can be used by three categories of users: patient, doctor, and admin. The application should provide registration for patients and doctors to obtain all basic demographic details. The details of patients and doctors can be updated at any time. The application should allow the patient to track his/her habits of diet, exercise, water intake and step count.
Based on the tracked habits, the application classifies the obesity level of the patient and provides detailed statistics. To increase the usage rate of the application, it should allow patients to accumulate reward points for successfully completing each task and then to use those points to redeem rewards. The application also allows patients to collect step count badges monthly, view the step count leaderboard monthly, and join various step count challenges. The application should be able to set reminders for each tracker in order to remind patients to track their habits at a fixed time. In addition, the application should provide health tips, articles, and information about obesity. The application should allow patients to hold follow-up sessions with doctors, as well as to schedule and manage appointments. Besides, the application should allow doctors to review and analyze patients’ habits and their progress over time. In addition, the Habitpad application must enable doctors to counsel patients and write prescriptions. Admins should be able to manage tips, rewards, and step count challenges through the application.
Non-functional requirements are system quality characteristics that describe how the system works and place constraints on its functionality. Four categories make up the Habitpad application’s non-functional requirements: (1) Usability, (2) Reliability, (3) Security, and (4) Performance. As part of the usability requirements, the design of Habitpad focuses on simplicity so that it can be easily navigated by users. Simple and friendly interfaces also provide a great user experience when using habit tracking applications for obese patients. As for reliability, Habitpad provides 24/7 service to ensure its availability and readiness to receive or serve users’ requests at all times, except when there is no Internet connection or the connection is poor. For security, every user must log in to the application by entering their own unique password. Patients’ profile data should be secured and protected since those data are private. Last but not least, Habitpad is stable and optimized to provide the best performance and efficiency to all users.

Table 2 Survey result
Question | Answer | Frequency | Percentage (%)
What is your gender? | Male | 13 | 43.3
 | Female | 17 | 56.7
What is your age? | Less than 18 | 2 | 6.7
 | 18–24 | 14 | 46.7
 | 25–34 | 6 | 20
 | 35–44 | 4 | 13.3
 | 45–54 | 2 | 6.7
 | 55 or older | 2 | 6.7
Do you have an obesity problem? | Yes | 5 | 16.7
 | No | 25 | 83.3
Do you think that one’s habit will cause obesity? | Yes | 25 | 83.3
 | No | 5 | 16.7
Do you prefer analog (notebook, bulletin board, calendar) or digital habit tracking? | Analog | 9 | 30
 | Digital | 21 | 70
Have you installed a habit-change tracker application or system on your mobile or laptop? | Yes | 11 | 36.7
 | No | 19 | 63.3
Is a habit-change tracker application with gamification features (points, level, badges, leaderboard) attractive to you, or does it make you continue using the application? | Yes | 21 | 70
 | No | 9 | 30
Do you think a habit-change tracker application for obese patients can reduce the obesity rate? | Yes | 19 | 63.3
 | No | 11 | 36.7
3.2 Designing Phase
In the design phase, the requirement specifications from the first phase are studied in detail. System design includes the system architecture, use case diagram, flowchart, database design diagram and system interface design. These diagrams visualize how the requirements are implemented in the proposed system and how the system interacts with users. System design also aids in specifying the hardware and software required to realize the system functionalities. As this project involves a mobile application, Android Studio is used as the IDE for coding.
3.2.1 System Architecture
Figure 2 provides a detailed description of the “Habitpad” application’s system architecture. The Habitpad app gathers and keeps track of patient data, including demographic information, diet, step count, activity level, and water intake habits. The collected data are then stored in MySQL for use by the Habitpad application. The Habitpad application is crucial in gathering, organizing, and evaluating patient data so that patients can maintain a fit and healthy lifestyle and avoid probable chronic diseases. The application pulls information from the database and uses machine learning tools and techniques to help clinicians classify patients’ levels of obesity and track changes in their behaviors. In addition, the data are used to create visualizations that support analysis and decision-making by clinicians.
Figure 3 shows the detailed system architecture for obesity level classification, with habit data as the input and the user’s obesity level displayed in the application as the output. Classifying a user’s obesity level requires deploying machine learning in the Android application. First, classification is carried out with a model that has been trained beforehand. To make the model available to the Android application, whose front end is written in Java, a Flask API is implemented; it runs the machine learning model and returns the output in JSON format. The habit data are taken as input from the Android application and sent to the Flask API, and the response from the Flask API is displayed back in the Android application. Because the Android application cannot reach a Flask API that is running locally on the development machine, Heroku is used to deploy the Flask API online. Finally, the output is displayed in the application.
Fig. 2 Habitpad system architecture
Fig. 3 Detailed system architecture for obese level classification
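The request and response exchange described for Fig. 3 can be sketched without any web framework. In the minimal Python sketch below, `handle_classify_request` stands in for the body of a route in the chapter’s Flask API; the field names, the `classify` stub, its threshold rule and the returned labels are hypothetical placeholders, not the trained model or the actual API.

```python
import json

# Hypothetical input fields; the chapter does not list the exact feature names.
REQUIRED_FIELDS = ("steps", "water_litres", "calories", "workout_minutes")


def classify(features):
    """Placeholder for the trained model; the real system would call it here."""
    # Toy rule used only so the sketch runs end to end.
    return "Normal_Weight" if features["steps"] >= 5000 else "Overweight"


def handle_classify_request(body: str) -> str:
    """Parse a JSON request body and return a JSON response string."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return json.dumps({"error": "invalid JSON"})
    if not isinstance(payload, dict):
        return json.dumps({"error": "expected a JSON object"})
    missing = [name for name in REQUIRED_FIELDS if name not in payload]
    if missing:
        return json.dumps({"error": "missing fields", "fields": missing})
    return json.dumps({"obesity_level": classify(payload)})
```

Validating the payload before calling the model keeps malformed requests from the mobile client from reaching the classifier, and the client only has to parse one small JSON object in either case.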
3.2.2 Module Design
Figure 4 displays a module diagram for Habitpad. Four core modules make up the Habitpad application: (1) patient management; (2) habit-change tracking and analytics; (3) gamification; and (4) persuasive features. These modules are designed to serve the application’s users, including patients, physicians, and administrators.
Fig. 4 Module diagram
3.2.3 General Use Case Diagram
The Habitpad use case diagram is illustrated in Fig. 5.
The Patient Management module includes the submodules register account, manage account, and statistics. When using Habitpad for the first time, a patient is required to register an account and enter basic information such as age, gender, height, weight, daily activity level, medical condition, etc. The information and the patient’s profile can be updated at any time. The weight and height are used to calculate the body mass index (BMI). Total Daily Energy Expenditure (TDEE), which measures the number of calories burned per day, is also calculated from the patient’s basic information. The data collected from patients are visualized to assist patients and doctors in decision-making for further prescription.
The Habit-Change Tracking and Analytics module includes the submodules habit trackers and classify obesity level. The habit data of patients can be tracked by different habit trackers in Habitpad, such as diet, workout, step count and water intake. The step count is detected by the smartphone’s motion sensor. Other habit data such as diet, workout and water intake are entered manually by the user. Big data analytics is then applied to these habit data to classify the obesity level.
The Gamification module includes the submodules points and rewards, badges and leaderboard, and challenges. There are various tasks with different points. Patients are awarded points for the successful completion of every task. The accumulated points are important to patients, as points can be used to redeem rewards such as vouchers or gifts. Patients’ total accumulated step count will affect the badges achieved and the
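The BMI and TDEE calculations mentioned under Patient Management can be sketched as follows. BMI is the standard weight divided by height squared. The chapter does not state which TDEE formula Habitpad uses, so the sketch assumes the widely used Mifflin-St Jeor equation with conventional activity multipliers; the function and parameter names are illustrative.

```python
# Activity multipliers commonly paired with the Mifflin-St Jeor equation
# (an assumption; the chapter does not state which TDEE formula Habitpad uses).
ACTIVITY_FACTORS = {
    "sedentary": 1.2,
    "light": 1.375,
    "moderate": 1.55,
    "active": 1.725,
    "very_active": 1.9,
}


def bmi(weight_kg: float, height_cm: float) -> float:
    """Body mass index: weight in kg divided by height in metres squared."""
    height_m = height_cm / 100
    return weight_kg / height_m ** 2


def tdee(weight_kg, height_cm, age, sex, activity):
    """Estimate daily energy expenditure as BMR times an activity factor."""
    bmr = 10 * weight_kg + 6.25 * height_cm - 5 * age
    bmr += 5 if sex == "male" else -161
    return bmr * ACTIVITY_FACTORS[activity]


print(round(bmi(80, 170), 1))
print(round(tdee(80, 170, 30, "male", "sedentary")))
```

Both values depend only on the registration data (age, gender, height, weight and activity level), which is why they can be recomputed whenever the patient updates the profile.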
… 170° and ZH < 0 | Backward walking | Target-to-chair walk
moved as quickly as possible but felt safe and comfortable. The following series of activities was performed during the experiment.
1. Get up from the chair (if necessary, the subject can use the support of the chair arms).
2. Walk a 3 m (10 feet) distance to a destination marked with masking tape.
3. Turn 180 degrees at the marked destination.
4. Walk back the 3 m distance to the starting location.
5. Turn 180 degrees at the starting location.
6. Sit down on the chair (if necessary, the subject can use the support of the chair arms).
During the above series of activities, we measured the times T1 (Activity 1), T2 (Activities 2, 3, 4, and 5), and T3 (Activity 6). We measured TUG time using both the traditional method (stopwatch) and the automated TUG tool to evaluate the accuracy. The next section examines the findings and their interpretation.
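The phase timing described above can be sketched in a few lines of Python. This is an illustrative sketch, not the authors’ tool: it assumes that the start and end of each phase are available as timestamps, that the total TUG time is the sum of T1, T2 and T3, and that accuracy is judged by the absolute difference from the stopwatch value. The function names and timestamps are hypothetical.

```python
def tug_phase_times(t_start, t_stood_up, t_back_at_chair, t_seated):
    """Split one TUG trial into T1, T2, T3 (seconds) from event timestamps."""
    t1 = t_stood_up - t_start            # Activity 1: sit-to-stand
    t2 = t_back_at_chair - t_stood_up    # Activities 2-5: walk, turn, walk, turn
    t3 = t_seated - t_back_at_chair      # Activity 6: stand-to-sit
    return t1, t2, t3


def absolute_error(automated_s, stopwatch_s):
    """Difference between the automated TUG time and the stopwatch reference."""
    return abs(automated_s - stopwatch_s)


# Hypothetical timestamps (seconds since the start of the recording).
t1, t2, t3 = tug_phase_times(0.0, 1.5, 9.5, 11.0)
total = t1 + t2 + t3
print(t1, t2, t3, total)
print(absolute_error(total, 11.5))
```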
4 Results and Analysis
This section discusses the results and analysis of the proposed attentive vision model. Figure 6 shows the body landmarks during the different activities. The image shows that the knee and hip angles vary most during the TUG activities. Figures 7, 8, and 9 show the developed attentive vision model for sarcopenia screening. The test case’s real-time knee angle was calculated from the hip, knee, and ankle coordinates generated during the TUG activities. Initially, a 3 m test arena and chair arrangement must be set up. The front foot of the chair must be aligned with the right green line, and the 3 m arena must always lie within the green lines. The red line indicated in the image was used to evaluate the direction of the hip’s z-coordinate. This direction was used to determine whether the test case was moving forward or backward. Figures 8 and 9 show the forward and backward directions during the experiment. The knee angle variation during the TUG test is shown in Fig. 10. We observed that the graph’s breadth varies during the experiment for several test scenarios, but the graph’s form does not. The graph’s width increases when the test case has sarcopenia. As a result, we can recognize the severity of the sarcopenia condition. We also found that those over 70 needed support (from a person or a gait aid) with sit-to-stand and stand-to-sit activities.
H. M. K. K. M. B. Herath et al.
Fig. 6 Human pose extraction using landmarks
Fig. 7 The subject performs sit-to-stand in the TUG test
Fig. 8 The subject performs a 3 m forward walk in the TUG test
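The knee angle computed from hip, knee, and ankle coordinates, as described above, can be illustrated with a small geometric sketch. This is not the authors' implementation: the function name `joint_angle` and the sample coordinates are hypothetical, and the sketch simply assumes the angle is taken at the knee between the knee-to-hip and knee-to-ankle vectors.

```python
import math

def joint_angle(a, b, c):
    """Angle ABC in degrees, measured at vertex b, for 2D or 3D points."""
    ba = [p - q for p, q in zip(a, b)]  # vector from knee to hip
    bc = [p - q for p, q in zip(c, b)]  # vector from knee to ankle
    dot = sum(x * y for x, y in zip(ba, bc))
    norm = math.sqrt(sum(x * x for x in ba)) * math.sqrt(sum(x * x for x in bc))
    if norm == 0:
        raise ValueError("coincident landmarks; angle undefined")
    # Clamp to [-1, 1] to guard against floating-point overshoot before acos.
    cos_theta = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos_theta))

# Hypothetical normalized image coordinates (x, y), for illustration only.
standing = joint_angle((0.50, 0.40), (0.50, 0.60), (0.50, 0.80))  # straight leg
seated = joint_angle((0.30, 0.60), (0.50, 0.60), (0.50, 0.80))    # thigh horizontal
```

With these sample points the straight leg gives an angle close to 180° and the seated pose gives 90°, which matches the range of knee angles reported for walking and sitting in this section.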
Attentive Vision-Based Model for Sarcopenia Screening by Automating …
Fig. 9 The subject performs a 3 m backward walk in the TUG test
Fig. 10 Knee angle variation during the TUG experiment
Table 7 shows the knee angle variation together with the individual’s body pose and direction during the experiment. The body pose is walking from time frames 3 through 12, and the knee angle gradually increases until it reaches its maximum. This indicates that the person has been walking for some time with a straight or nearly straight leg. At time frame 13, the knee angle starts to decline from its maximum of 180 degrees and drops to 176.33 degrees. The body pose then changes back to sitting, and the knee angle drops to 98.10 and 89.97 degrees in the last two frames. This suggests the individual has stopped moving and sat down again. The body pose data can be important for assessing sarcopenia: the ability to go from sitting to walking and back again indicates that the person can move and function physically to some extent. The variation in knee angle for several test cases is shown in Fig. 11. Clinical tests employing the conventional TUG test and a hand grip experiment revealed that test cases S-1 and S-2 had sarcopenia. As seen in Fig. 11, S-1 and S-2 displayed higher TUG times, indicative of severe sarcopenia. As seen for S-2, the test case struggled with the sit-to-stand and stand-to-sit activities. This was also visible on the graph, which shows a more significant variation because the test case struggled to perform the activity.

Table 7 Knee angle vs. activity and TUG movement identification (Fwd: forward direction, Rev: backward direction)
Time frame/(s)   Knee angle (ZK)/(°)   Body pose   Direction
0                89.97                 Sit         Fwd
1                98.84                 Sitting     Fwd
2                118.56                Sitting     Fwd
3                170.26                Walking     Fwd
4                178.58                Walking     Fwd
5                178.63                Walking     Fwd
6                180.00                Walking     Fwd
7                172.76                Walking     Fwd
8                180.10                Walking     Fwd
9                180.0                 Walking     Rev
10               170.31                Walking     Rev
11               180.05                Walking     Rev
12               180.00                Walking     Rev
13               176.33                Walking     Rev
14               98.10                 Sitting     Fwd
15               89.97                 Sit         Fwd

Table 8 presents the experimental results of the proposed attentive vision method. The times are defined as follows: T1 is the time taken for sit-to-stand; T2 is the time taken for the 3 m forward walk → 180° turn → 3 m backward walk → 180° turn; T3 is the time taken for stand-to-sit; and TL is the total TUG time. Figure 12 compares the TUG time obtained with the traditional method and with the proposed attentive method. We observed that human error by the timekeeper led to inaccuracy in the conventional technique. The automated system is not subject to this timekeeping error. As a result, the proposed approach is typically ±0.8 seconds more accurate. As shown in Fig. 12, the proposed method shows promising results in automating the TUG test. The experiment was conducted with six adults and four healthy elders. The results suggest that the proposed TUG time-generating methodology achieves 93.7% accuracy. The TUG time baseline for healthy subjects was identified as 13.07 seconds.
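The segment times T1, T2, and T3 defined above could, in principle, be derived from per-frame pose labels such as those in Table 7. The sketch below is only one possible reading and not the method described in the chapter: it assumes that "Sitting" marks the transitional frames (rising or sitting down), that "Sit" marks the static seated pose, that T2 spans the first to the last "Walking" frame, and that frames are 1 s apart. The function name `tug_segment_times` is hypothetical.

```python
def tug_segment_times(poses, frame_s=1.0):
    """Return (T1, T2, T3) in seconds from a list of per-frame pose labels.

    Assumption: "Sitting" labels transitional frames, "Sit" the static seat.
    """
    walking = [i for i, p in enumerate(poses) if p == "Walking"]
    if not walking:
        raise ValueError("no walking frames found")
    first, last = walking[0], walking[-1]
    # T1: transitional frames before the first walking frame (sit-to-stand).
    t1 = sum(1 for p in poses[:first] if p == "Sitting") * frame_s
    # T2: everything from the first to the last walking frame (walks and turns).
    t2 = (last - first + 1) * frame_s
    # T3: transitional frames after the last walking frame (stand-to-sit).
    t3 = sum(1 for p in poses[last + 1:] if p == "Sitting") * frame_s
    return t1, t2, t3

# Pose column of Table 7, time frames 0 to 15.
table7_poses = ["Sit", "Sitting", "Sitting"] + ["Walking"] * 11 + ["Sitting", "Sit"]
t1, t2, t3 = tug_segment_times(table7_poses)
```

At the coarse 1 s resolution of Table 7 this yields whole-second segment times; a real system would presumably work on the full frame rate of the camera.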
Fig. 11 Knee angle variation of the test cases

Table 8 Automated TUG time for different test cases
Test case   Test time/(s)
            T1     T2     T3     TL
S-1         1.32   8.35   3.09   12.76
S-2         1.62   7.95   2.91   12.48
S-3         0.95   4.34   1.8    7.09
S-4         1.33   8.71   2.32   12.36
S-5         1.27   8.25   1.9    11.42
S-6         1.54   7.85   2.2    11.59
S-7         1.98   8.56   2.3    12.84
S-8         1.33   8.31   2.49   12.13
S-9         1.14   9.49   1.75   12.38
S-10        1.8    8.12   2.4    12.32
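The values in Table 8 are consistent with the total TUG time being the sum of the three segment times, TL = T1 + T2 + T3. The short check below makes that relationship explicit using the table's own numbers; the variable names are illustrative and the 0.005 s tolerance is an assumption chosen only to absorb floating-point rounding.

```python
# (T1, T2, T3, TL) in seconds, copied from Table 8.
table8 = {
    "S-1": (1.32, 8.35, 3.09, 12.76),
    "S-2": (1.62, 7.95, 2.91, 12.48),
    "S-3": (0.95, 4.34, 1.80, 7.09),
    "S-4": (1.33, 8.71, 2.32, 12.36),
    "S-5": (1.27, 8.25, 1.90, 11.42),
    "S-6": (1.54, 7.85, 2.20, 11.59),
    "S-7": (1.98, 8.56, 2.30, 12.84),
    "S-8": (1.33, 8.31, 2.49, 12.13),
    "S-9": (1.14, 9.49, 1.75, 12.38),
    "S-10": (1.80, 8.12, 2.40, 12.32),
}

# Test cases whose reported TL differs from T1 + T2 + T3.
mismatches = [
    case
    for case, (t1, t2, t3, tl) in table8.items()
    if abs((t1 + t2 + t3) - tl) > 0.005
]

# Test case with the longest total TUG time.
slowest = max(table8, key=lambda case: table8[case][3])
```

Every row satisfies the sum, so `mismatches` is empty, and the longest total time in the table belongs to S-7.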
Fig. 12 TUG time variation of automated method vs. traditional method
5 Discussion
Vision systems with automated TUG have the potential to change the way sarcopenia is recognized and treated. Using computer vision and machine learning techniques, these technologies can precisely track and analyze older persons’ motions during the TUG test. One possible advantage of vision technologies with automated TUG is the ability to assess mobility and functional performance objectively. This can aid the early detection of sarcopenia and the tracking of its development over time, resulting in earlier treatment and better outcomes. Slower TUG times are linked to a higher incidence of falls and disability, as well as to a loss of muscle mass and strength. The TUG time is generally slower in people with sarcopenia than in people without, reflecting a loss of mobility and functional ability. The vision system with automated TUG and the conventional method show a high degree of agreement, which suggests that the vision system correctly records the movements of older individuals during the TUG test. This is significant because precise assessment of TUG time is essential for identifying people at risk of accidents, impairment, and sarcopenia, as well as for tracking the development of these conditions over time. However, vision systems with automated TUG also present certain limitations and difficulties. For instance, older persons may struggle to adapt to the technology or feel uneasy about being watched by a camera. There can also be concerns about data security and privacy. Along with the TUG test, gait speed testing can improve the accuracy of the sarcopenia prediction model. Gait speed measures how long it takes someone to cover a certain distance at their usual pace. Together with reduced mobility, a greater risk of falling, and an increased risk of disability and death, sarcopenia is also characterized by a slower gait speed. The TUG test and gait speed are both simple, affordable tests that can easily be carried out in clinical settings to detect people at risk of sarcopenia. The next step is to increase the accuracy of sarcopenia prediction by combining the 3 m walk test (3mWT) and TUG into a single system. Furthermore, we want to evaluate the system with elders living in nursing homes and hospitals. Overall, the results imply that a vision system with automated TUG is a valid and accurate instrument for measuring TUG time in older individuals. This technology should be further investigated and developed as a tool for clinical practice, since it can potentially enhance the diagnosis and management of sarcopenia, falls, and disability.
6 Conclusion
Sarcopenia, a disorder marked by decreased muscle mass and strength with aging, can be assessed using an automated TUG test. The TUG test is a quick and simple exercise that measures how long it takes someone to get up from a chair, walk three meters, turn around, walk back, and sit down again. This research presented a novel attentive vision model for identifying the functional mobility of elders in the domestic environment. The attentive vision system was developed by automating the conventional TUG test. The proposed approach produced promising results, with 93.7% accuracy. The developed system is simple to use and inexpensive. Elders can readily take part in the testing because the proposed technology is non-wearable and requires no interaction. The next step of this research is to develop an attentive vision model that embeds both the TUG test and the 3mWT (3 m walk test) to evaluate sarcopenia with higher sensitivity.
References 1. Siddhisena, K. A. P. (2005). Socio-economic implications of ageing in Sri Lanka: An overview (Oxford Institute of Ageing Working Papers, pp. 1–27). Oxford Institute of Ageing. 2. Rosenberg, I. H. (1989). Summary comments. The American Journal of Clinical Nutrition, 50(5), 1231–1233. 3. Rosenberg, I. H. (1997). Sarcopenia: Origins and clinical relevance. The Journal of Nutrition, 127(5), 990S–991S. 4. Rathnayake, N., Alwis, G., Lenora, J., & Lekamwasam, S. (2019). Cutoff values for the determination of sarcopenia and the prevalence of the condition in middle-aged women: A study from Sri Lanka. Ceylon Medical Journal, 64(1), 9–16. 5. Kenner, A. M. (2008). Securing the elderly body: Dementia, surveillance, and the politics of “aging in place.” Surveillance & Society, 5(3), 252–269. 6. Milte, R., & Crotty, M. (2014). Musculoskeletal health, frailty and functional decline. Best Practice & Research Clinical Rheumatology, 28(3), 395–410. 7. Beaudart, C., Zaaria, M., Pasleau, F., Reginster, J. Y., & Bruyère, O. (2017). Health outcomes of sarcopenia: A systematic review and meta-analysis. PLoS One, 12(1), e0169548.
102
H. M. K. K. M. B. Herath et al.
8. Van Ancum, J. M., Alcazar, J., Meskers, C. G., Nielsen, B. R., Suetta, C., & Maier, A. B. (2020). Impact of using the updated EWGSOP2 definition in diagnosing sarcopenia: A clinical perspective. Archives of Gerontology and Geriatrics, 90, 104125. 9. Dent, E., Morley, J. E., Cruz-Jentoft, A. J., Arai, H., Kritchevsky, S. B., Guralnik, J., Bauer, J. M., Pahor, M., Clark, B. C., Cesari, M., Ruiz, J., Sieber, C. C., Aubertin-Leheudre, M., Waters, D. L., Visvanathan, R., Landi, F., Villareal, D. T., Fielding, R., Won, C. W., … Vellas, B. (2018). International clinical practice guidelines for sarcopenia (ICFSR): screening, diagnosis and management. The Journal of Nutrition, Health & Aging, 22, 1148–1161. 10. Lee, W. J., Liu, L. K., Peng, L. N., Lin, M. H., Chen, L. K., & ILAS Research Group. (2013). Comparisons of sarcopenia defined by IWGS and EWGSOP criteria among older people: Results from the I-Lan longitudinal aging study. Journal of the American Medical Directors Association, 14(7), 528-e1. 11. Morley, J. E., Baumgartner, R. N., Roubenoff, R., Mayer, J., & Nair, K. S. (2001). Sarcopenia. Journal of Laboratory and Clinical Medicine, 137(4), 231–243. 12. Cruz-Jentoft, A. J., Baeyens, J. P., Bauer, J. M., Boirie, Y., Cederholm, T., Landi, F., Martin, F. C., Michel, J.-P., Rolland, Y., Schneider, S. M., Topinková, E., Vandewoude, M., & Zamboni, M. (2010). Sarcopenia: European consensus on definition and diagnosis. Age and Ageing, 39(4), 412–423. 13. Pauzi, A. S. B., Mohd Nazri, F. B., Sani, S., Bataineh, A. M., Hisyam, M. N., Jaafar, M. H., & Mohamed, A. S. A. (2021, November 23–25). Movement estimation using mediapipe blazepose. In Advances in Visual Informatics: 7th International Visual Informatics Conference, IVIC 2021. Proceedings 7 (pp. 562–571). Springer International Publishing. 14. Bazarevsky, V., Grishchenko, I., Raveendran, K., Zhu, T., Zhang, F., & Grundmann, M. (2020). Blazepose: On-device real-time body pose tracking. arXiv preprint arXiv:2006.10204 15. 
Sprint, G., Cook, D. J., & Weeks, D. L. (2015). Toward automating clinical assessments: A survey of the timed up and go. IEEE Reviews in Biomedical Engineering, 8, 64–77. 16. Dhar, M., Kapoor, N., Suastika, K., Khamseh, M. E., Selim, S., Kumar, V., Raza, S. A., Azmat, U., Pathania, M., Mahadeb, Y. P. R., Singhal, S., Naseri, M. W., Aryana IGP, S., Thapa, S. D., Jacob, J., Somasundaram, N., Latheef, A., Dhakal, G. P., & Kalra, S. (2022). South Asian Working Action Group on SARCOpenia (SWAG-SARCO)—A consensus document. Osteoporosis and Sarcopenia, 8(2), 35–57. 17. Choo, P. L., Tou, N. X., Pang, B. W. J., Lau, L. K., Jabbar, K. A., Seah, W. T., Chen, K. K., Ng, T. P., & Wee, S. L. (2021). Timed Up and Go (TUG) reference values and predictive cutoffs for fall risk and disability in Singaporean community-dwelling adults: Yishun cross-sectional study and Singapore longitudinal aging study. Journal of the American Medical Directors Association, 22(8), 1640–1645. 18. Martinez, B. P., Gomes, I. B., Oliveira, C. S. D., Ramos, I. R., Rocha, M. D. M., Forgiarini Júnior, L. A., Camelier, F. W. R., & Camelier, A. A. (2015). Accuracy of the Timed Up and Go test for predicting sarcopenia in elderly hospitalized patients. Clinics, 70, 369–372. 19. Bischoff, H. A., et al. (2003). Identifying a cut-off point for normal mobility: A comparison of the timed ‘up and go’ test in community-dwelling and institutionalised elderly women. Age and Ageing, 32(3), 315–320. 20. Filippin, L. S., et al. (2017). Timed Up and Go test as a sarcopenia screening tool in homedwelling elderly persons. Revista Brasileira de Geriatria e Gerontologia, 20, 556–561. 21. Teixeira, E., Bohn, L., Guimarães, J. P., & Marques-Aleixo, I. (2022). Portable digital monitoring system for sarcopenia screening and diagnosis. Geriatrics, 7(6), 121. 22. Kim, J. K., Bae, M. N., Lee, K., Kim, J. C., & Hong, S. G. (2022). 
Explainable artificial intelligence and wearable sensor-based gait analysis to identify patients with osteopenia and sarcopenia in daily life. Biosensors, 12(3), 167. 23. Ko, J. B., Kim, K. B., Shin, Y. S., Han, H., Han, S. K., Jung, D. Y., & Hong, J. S. (2021). Predicting sarcopenia of female elderly from physical activity performance measurement using machine learning classifiers. Clinical Interventions in Aging, 16, 1723–1733.
Attentive Vision-Based Model for Sarcopenia Screening by Automating …
103
24. Zarzeczny, R., Nawrat-Szołtysik, A., Polak, A., Maliszewski, J., Kiełtyka, A., Matyja, B., Dudek, M., Zborowska, J., & Wajdman, A. (2017). Aging effect on the instrumented TimedUp-and-Go test variables in nursing home women aged 80–93 years. Biogerontology, 18, 651– 663. 25. Savoie, P., Cameron, J. A., Kaye, M. E., & Scheme, E. J. (2019). Automation of the TimedUp-and-Go test using a conventional video camera. IEEE Journal of Biomedical and Health Informatics, 24(4), 1196–1205. 26. Kim, Y. J., Choi, J., Moon, J., Sung, K. R., & Choi, J. (2021). A sarcopenia detection system using an RGB-D camera and an ultrasound probe: Eye-in-hand approach. Biosensors, 11(7), 243.
AAL with Deep Learning to Classify the Diseases Remotely from the Image Data
A. Sharmila, E. L. Dhivya Priya, K. S. Tamilselvan, and K. R. Gokul Anand
Abstract In the past decade, the deployment of ambient assistive living (AAL) technology to promote independent living has kept increasing [1]. Population growth and the demographic shift toward an aged population pose unfamiliar challenges to society from both an economic and a social perspective. AAL technology can offer a range of solutions for improving quality of life, allowing people to stay healthy and live alone for longer, and assisting people with disorders. The technology also offers immense support to caregivers and medical assistants. An extensive survey is presented to identify the main trends in the development of AAL technology and the need for it in independent living [2]. The technology incorporates deep learning techniques [3–8] to analyze the information gathered by the system and to reduce the need for expert advice.
Keywords Ambient assistive living · Wearable gadgets · Smart devices · Habitat sensors · Data collection · Methodologies for data analysis · Deep learning · Sensor technologies · Graphical User Interface · Deep Learning Architectures
A. Sharmila (B)
Department of Electronics and Communication, Bannari Amman Institute of Technology, Erode, India
e-mail: [email protected]
E. L. D. Priya
Department of Electronics and Communication, Erode Sengunthar Engineering College, Erode, India
K. S. Tamilselvan
Department of Electronics and Communication, KPR Institute of Engineering and Technology, Coimbatore, India
K. R. G. Anand
Department of Electronics and Communication, Dr Mahalingam College of Engineering and Technology, Coimbatore, India
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023
P. Barsocchi et al. (eds.), Enabling Person-Centric Healthcare using Ambient Assistive Technology, Studies in Computational Intelligence 1108, https://doi.org/10.1007/978-3-031-38281-9_5
1 Introduction
Ambient Assistive Living (AAL) technology employs emerging techniques to ensure the safety of individuals and to promote comfortable living. The techniques used in AAL include smart gadgets, wireless sensor networks, and sensors for health monitoring. The technology promotes independent living [9]. AAL technology incorporates artificial intelligence and deep learning techniques to make daily life easier. People wish to enjoy a good quality of life at every stage of life, and AAL technology makes it possible for them to live unattended and independently. The techniques used in AAL are user friendly and closely integrated with people’s living environment [10]. AAL technology offers a range of solutions to improve quality of life and to encourage healthy, independent living. The system greatly assists caregivers and medical experts, and it is a gift for people with impairments. AAL technology promotes self-dependency among impaired and elderly people. Apart from traditional wearable and self-monitoring devices, the technology incorporates mechanisms to detect falls, to monitor the movements of individuals, and to help them carry out their daily routines through reminders [11]. The system greatly assists in keeping people with disabilities out of hospital. AAL technology monitors the complete status of elderly or impaired people by employing various sensors: passive and active infrared sensors and ultrasonic sensors examine their motions, a glucometer measures glucose levels, and appropriately positioned sensors monitor muscle motion and heart activity.
A notable achievement of AAL technology is the use of video sensors to ensure the safety of individuals. Sound can be captured with microphones to examine the activities and needs of impaired people [11]. In addition, cloud-based AAL technology aims to gather data from the AAL sensors and to offer the solutions required to take care of elders [12]. Numerous technologies are used to improve the rate at which data are transferred from the sensors or gadgets to the cloud, reducing the delay in acquiring the signals so that immediate measures can be taken to save lives. After collecting data from the deployed gadgets, the AAL system processes the signals and sends an immediate response to the caretakers. AAL technology additionally employs numerous artificial intelligence and deep learning techniques to predict the state of individuals accurately. With artificial intelligence techniques, the system lessens the need for medical professionals in predicting the disease affecting an individual and suggests suitable remedies [12]. The technology has become more powerful and beneficial with the emergence of techniques for transferring the signals and predicting the health state of impaired people. AAL technology also monitors numerous parameters of the elder person’s environment, such as temperature and emissions of hazardous gases, to ensure the safety of elderly or independent individuals. The system also monitors the interactions of impaired people with others [13]. Deep learning or machine learning technologies can also be incorporated into AAL technology to examine the behavior of impaired people. The system compares an individual’s behavior with a dataset gathered from the routines of elders to predict any problem that may have occurred. The system offers independent individuals the kind of attentive care given to a child. Machine learning algorithms, including the support vector machine, are used in AAL technology for fall detection. The system predicts abnormal activities of elders by examining their daily activities [14]. It builds the dataset by capturing the daily routines and numerous activities of the individuals [15]. The system aims to detect falls in emergency situations and to take immediate measures to ensure the safety of impaired people. It lessens false predictions by gathering a large dataset on the daily activities of elderly or impaired people, which helps lessen the tension of the caretakers [15]. The caretakers are alerted when there is an emergency and are not disturbed by false falls, which the individuals can handle on their own.
1.1 Motivation for Designing Ambient Assisted Living
Population growth and demographic change led to the introduction of Ambient Assistive Living (AAL) technologies. Although implementing AAL technologies is costly, the benefits are worth it [1]. People wish to lead an independent life at every stage of life. The proposed system supports a self-dependent life by providing appropriate gadgets. It gives people the courage to live alone, without anyone’s guidance. The proposed system addresses flexibility requirements and reliability concerns by connecting to the social surroundings. The everyday situations that benefit from AAL technologies are listed below.
1. Smart kitchen: AAL technology helps protect us from fire accidents in the kitchen and promotes automation in cooking [2]. The kitchen, a common location for fire injuries, benefits greatly from AAL technology. The system integrates wireless sensor networks, radio frequency identification modules, multiple cooperating devices, and deep learning techniques.
2. Theft protection: AAL technology guards against theft and keeps us secure.
3. Turning the lights on and off according to context (smart home).
4. Setting the temperature of the room with a smart thermostat.
5. Controlling the music according to the user’s choice.
6. Intelligent detection of smoke with a smart smoke detector.
7. Use during hiking.
8. In the case of a self-learning system, networking of the sensors.
The fusion and evaluation of data help in performing routine tasks. The system provides assistance according to the user’s age to resolve issues. AAL technology is adaptable to the needs of each person. People wish to lead a self-dependent life, and the technology offers great assistance for independent living [10]. People of all ages benefit from AAL technology; both elderly people and the younger generation can use it. The technology promises to improve quality of life. People with impairments and their family members, doctors, caregivers, and nurses find it beneficial, and it helps prevent serious problems. The technology also improves communication and interaction with the community.
1. Reducing the number of caregivers with ambient assisted living: Elderly people at home and people with physical impairments need the assistance of caregivers to carry out their daily routines, for instance to walk or to eat. AAL technology is an alternative to caregivers, allowing elderly people to become self-dependent. The technology greatly reduces the need for caregivers [9]. Deployment of AAL technology is a boon for dependent people. Without the intervention of caregivers, they are able to switch lights, fans, and the television on and off, control door access to prevent unauthorized entry, and prevent fire accidents in the kitchen.
2. Ambient Intelligence (AMI): Ambient intelligence in AAL technology is the ability of numerous devices, that is, computers, to work together on the same problems, including inspecting the state of the living environment and reporting it to the individuals concerned.
AAL technology includes artificial intelligence to track the progress and health status of individuals and to offer essential services through real-time decision-making techniques [1]. These techniques are particularly valuable for disease detection, fall detection, immediate service in emergencies, medication reminders, and reducing the frequency of hospital visits.
3. Emergency response approach: Smartphones have become part of everyday human life. AAL technology helps provide a high level of security by using mobile phones as an emergency response mechanism [2]. The vital requirements of the emergency response system include fall detection and monitoring by video surveillance. The system alerts the caregivers and family members so that immediate measures can be taken and the emergency can be handled calmly.
4. Assistance of ambient assisted living in flexibility and automation: AAL technologies reassure elderly people and people with disabilities that they can look after themselves and carry out their routines without the intervention of caregivers [10]. The proposed system relieves the hesitation of the dependents and reduces the need for caregivers. The AAL technology continually keeps an eye on the affected individuals, records their data on a regular basis, and raises alerts in an emergency.
5. Ambient assisted living for fitness signals: The use of sensor technologies in smart watches and phones greatly assists AAL technology. In particular, motion sensors, pressure sensors, pulse oximeters, global positioning sensors, accelerometers, temperature sensors, and optical sensors detect the fitness status of individuals, including blood pressure, sugar levels, and heart-related issues, and detect the mobility of the dependents [9]. Smart watches and phones, together with wearable gadgets, send health signals about the fitness status of the dependents to their caregivers at specified intervals.
6. Ill-health supervision: The deployment of information technology greatly assists ambient living. AAL technology monitors the dependents at all times and offers adequate suggestions and assistance whenever there is an emergency. The technology keeps people with impairments out of hospital, prevents sudden hospital visits, and lessens the tension of the patients and their caregivers.
It also saves the caregivers money. In an emergency, the system raises an alert and sends signals to the caregivers of the dependents. The system also helps in obtaining advice from medical experts without visiting the hospital.
7. Dosette box: Impaired or elderly people have to take medicine daily as prescribed by the medical professional. Owing to ill health and aging, they often have to take many tablets to stay fit [1], and they may forget to take them on time. The dosette box prevents them from missing their medicine. The box comprises numerous partitions that specify the time at which particular tablets have to be taken. The caregivers place the tablets in the partitions of the box according to timing and dosage. Once the prescribed time is reached, the system sends out an alert signal and drops the tablet into the partition from which the patient takes the medicine. The alert signal is a reminder for the person to take the tablets [1]. If the dependents are unable to take the tablets or miss the alert signal, a message is sent to the caregivers, who can then remind them so that the tablets are taken as prescribed. AAL technology is truly hands-on assistance for impaired people, dependents, and their caregivers. By deploying the technology, dependents become self-dependent.
8. Handy robots: Robots can assist impaired people in all aspects of life. They lessen the need for caregivers and the dependency of older people on them. With the assistance of robots, impaired people can go out and enjoy their time, and dependent people can carry out their routines independently [2].
AAL technologies monitor the fitness and actions of impaired people. They monitor numerous parameters of elderly and dependent people using sensor technologies. The data received from the sensors are checked by the elders themselves and by the caregivers. In an emergency, such as a fall caused by a serious health issue and detected through video surveillance, the caregivers monitoring the person remotely have to take adequate steps. In such cases the system has to analyze numerous parameters such as blood pressure, heart rate, sugar level, and stress level.
Deep learning techniques that have already been trained on the daily activities of the impaired or elderly person can identify the reason for the fall and convey it to the caregivers so that they can take action [10]. By incorporating deep learning techniques, AAL technology protects impaired people and lessens the tension of the caregivers. It employs numerous deep learning techniques trained on the activities, routines, and health status of impaired people. AAL technology provides the required services to them through automation based on deep learning.
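The dosette box behaviour described in this section (alert the patient at the prescribed time, then message the caregiver if the dose is missed) can be sketched as a small piece of reminder logic. This is an illustrative sketch only: the `Dose` class, the `check_dose` function, the action names, and the 30-minute grace period are all hypothetical and are not taken from the chapter.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass
class Dose:
    due: datetime                     # prescribed time for this partition
    taken: bool = False
    patient_alerted: bool = False
    caregiver_notified: bool = False

def check_dose(dose, now, grace=timedelta(minutes=30)):
    """Return the actions to trigger at time `now` for one partition.

    `grace` is an assumed waiting time before the caregiver is messaged.
    """
    actions = []
    if dose.taken or now < dose.due:
        return actions                # nothing to do yet, or already taken
    if not dose.patient_alerted:
        dose.patient_alerted = True
        actions.append("alert_patient")
    if now >= dose.due + grace and not dose.caregiver_notified:
        dose.caregiver_notified = True
        actions.append("notify_caregiver")
    return actions

due = datetime(2023, 1, 1, 8, 0)
dose = Dose(due=due)
before = check_dose(dose, due - timedelta(minutes=5))   # too early
on_time = check_dose(dose, due)                         # reminder to patient
missed = check_dose(dose, due + timedelta(minutes=45))  # escalate to caregiver
```

Keeping the two flags on the dose record means each alert fires only once, which mirrors the escalation from patient reminder to caregiver message described above.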
2 Implementing AAL Technology
Ambient assistive living technology incorporates sensor technologies, wearable devices, and video surveillance to continuously observe the actions of older people and people with impairments. The system monitors the fitness status of individuals through various sensors attached to the wearable devices [9]. Smart watches are able to record the wellbeing of the wearer. They record the respiration rate, pulse rate, walking steps, sleep data including sleeping hours and wake-up time,
AAL with Deep Learning to Classify the Diseases Remotely …
blood pressure, body temperature, calorie level, and so on. Other fitness parameters such as sugar level, stress level, and oxygen level can be measured by attaching sensors to the person or through wearable technologies. All these parameters are examined by the ambient assistive living system [9]. Whenever there is an emergency, the system sends an alert to the caregivers and doctors of the respective dependents. It sends the patients' records and findings to the internet through gateways; the records are stored in the cloud and forwarded to the caregivers, doctors, or other medical professionals. The records of people with impairments are monitored constantly by means of remote monitoring techniques [10]. Fig. 1 shows the wearable sensors connected to the person to record fitness status, and the transfer of that information through the gateways to the internet and the application server so that the actions of the dependents can be monitored continuously and remotely. The parameters observed from the dependents are evaluated by deep learning techniques behind the graphical user interface. Medical professionals can confirm the fitness state of an individual from deep learning techniques trained on that person's actions and fitness status [10]. The techniques spare the dependents from rushing to hospitals and from worrying about their fitness status.
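The alerting step described above can be sketched as a simple range check on the recorded readings. This is only an illustration: the parameter names, the numeric ranges, and the message format are hypothetical and are not taken from the chapter or from clinical guidance.

```python
# Hypothetical normal ranges; illustrative only, not clinical guidance.
NORMAL_RANGES = {
    "pulse_rate_bpm": (50, 110),
    "respiration_rate_bpm": (10, 24),
    "body_temperature_c": (35.5, 38.0),
    "oxygen_level_pct": (92, 100),
}


def find_abnormal(reading):
    """Return the vitals in `reading` that fall outside their normal range."""
    abnormal = {}
    for name, value in reading.items():
        low, high = NORMAL_RANGES[name]
        if not (low <= value <= high):
            abnormal[name] = value
    return abnormal


def build_alert(person_id, reading):
    """Return an alert message for caregivers, or None if all vitals are normal."""
    abnormal = find_abnormal(reading)
    if not abnormal:
        return None
    details = ", ".join(f"{name}={value}" for name, value in sorted(abnormal.items()))
    return f"ALERT for {person_id}: {details}"
```

In a deployed system the ranges would presumably be set per person by medical professionals, and the returned message would be forwarded through the gateway to the caregivers.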
Fig. 1 AAL technology
A. Sharmila et al.
3 Deep Learning for Disease Classification
Deep learning is attracting increasing attention and is being deployed in almost every field. It has become hard to avoid because it can provide solutions to a range of complex problems, and owing to its notable success it is employed in a very wide range of applications. Deep learning is loosely modeled on the human brain [4]: it builds models from artificial neural networks. It is widely applied to image recognition, image classification, language translation, natural language processing, disease detection, computer vision problems, and business analytics. On many of these tasks it outperforms traditional learning models. Detecting a disease in its early stages helps to prevent its spread and to guard against severe loss [16]. A deep learning framework enables a machine to accomplish a task without human intervention, reducing the time and manpower the task requires. Classifying disease remotely from image data involves classifying images by supervised learning methods. In supervised learning the model is trained on images with labels, that is, with known data. Initially the images in the dataset are divided into training and testing sets [17]. Based on what it learned during training, the model classifies a given image as healthy or diseased. During training, the loss is calculated with an appropriate loss function and the weights are updated through backpropagation to improve the model's performance; testing then evaluates the model. The proposed work compares several deep learning architectures by performance, computational speed, and accuracy in predicting diseases in the initial stages. Among deep learning architectures, convolutional neural networks currently receive the most attention in disease detection.
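The training cycle described above (predict, compute the loss, update the weights) can be sketched on a toy problem. The sketch below uses a single-layer softmax classifier as a stand-in for a deep network, so the "backward pass" reduces to one gradient formula. The two-feature samples, the learning rate, and the number of epochs are illustrative assumptions, not values from the chapter.

```python
import math


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def class_scores(weights, features):
    """Weighted sum of the input features for each class."""
    return [sum(w * x for w, x in zip(row, features)) for row in weights]


def train_step(weights, features, label, lr=0.5):
    """One gradient-descent update; returns the loss measured before the update."""
    probs = softmax(class_scores(weights, features))
    loss = -math.log(probs[label])  # cross-entropy for one sample
    for c, row in enumerate(weights):
        # Gradient of the loss with respect to the score of class c.
        error = probs[c] - (1.0 if c == label else 0.0)
        for j, x in enumerate(features):
            row[j] -= lr * error * x
    return loss


def predict(weights, features):
    """Return the index of the most probable class."""
    probs = softmax(class_scores(weights, features))
    return probs.index(max(probs))


# Toy data: two features per "image"; label 0 = healthy, 1 = diseased.
samples = [([1.0, 0.1], 0), ([0.9, 0.2], 0), ([0.1, 1.0], 1), ([0.2, 0.8], 1)]
weights = [[0.0, 0.0], [0.0, 0.0]]
losses = []
for epoch in range(50):
    epoch_loss = sum(train_step(weights, x, y) for x, y in samples)
    losses.append(epoch_loss / len(samples))
```

The average loss stored in `losses` falls from the first epoch to the last, which is the behaviour the loss function and weight updates are meant to produce.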
Supervised learning, deep learning, disease detection, training, testing, and performance are the key terms for ambient assistive living technology combined with deep learning techniques.
3.1 Architectures of Deep Learning
The history of deep learning starts in 1943, when a model was developed to imitate the human brain. The backpropagation procedure used to train multilayer networks was popularized by Rumelhart, Hinton, and Williams in 1986. Even though deep learning models arrived early, they gained wide attention only in the recent decade, after the evolution of convolutional neural network architectures. New deep learning architectures keep emerging. The categories of deep learning are shown in Fig. 2. Deep learning architectures are broadly classified into
1. Supervised Learning
2. Unsupervised Learning
Fig. 2 Architectures of deep learning models
The proposed work focuses on remote data to classify the type of disease. Supervised learning models suit disease detection from remote data because they are trained on labeled data. Supervised learning is subdivided into
1. Convolutional Neural Networks
2. Recurrent Neural Networks
Convolutional neural networks are used predominantly for image data, and recurrent neural networks for text data. The system is required to classify image data by type of disease, so the focus here is on convolutional neural networks. The architectures of convolutional neural networks are summarized as
1. AlexNet (2012): lessens overfitting by using dropout.
2. VGGNet (2014): increases the number of channels after each pooling layer.
3. Inception Net (2014): uses filters of diverse sizes to make the model wider.
4. ResNet (2015): increases the number of convolutional layers.
5. DenseNet (2016): reduces the vanishing gradient problem.
6. MobileNet (2017): lessens the parameters with improved performance.
7. NASNet: based on reinforcement learning; it does not involve the remote data.
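The MobileNet entry in the list mentions fewer parameters. MobileNet is generally described as achieving this by replacing standard convolutions with depthwise separable ones; that detail is not stated in this chapter and is used here as an assumption. The sketch below only compares weight counts for one layer, with an illustrative kernel size and channel numbers.

```python
def standard_conv_params(kernel, in_channels, out_channels):
    """Weights in a standard convolution (bias terms ignored)."""
    return kernel * kernel * in_channels * out_channels


def depthwise_separable_params(kernel, in_channels, out_channels):
    """Weights in a depthwise convolution followed by a 1x1 pointwise convolution."""
    depthwise = kernel * kernel * in_channels
    pointwise = in_channels * out_channels
    return depthwise + pointwise


# Illustrative layer: 3x3 kernel, 64 input channels, 128 output channels.
standard = standard_conv_params(3, 64, 128)
separable = depthwise_separable_params(3, 64, 128)
reduction = standard / separable
```

For this layer the separable form needs roughly one eighth of the weights, which is the kind of saving the list entry refers to.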
4 Proposed Work
The proposed deep learning architectures are supervised learning models, as they use the remote data to classify the type of disease [18]. About seventy percent of the dataset is used for training and thirty percent for testing the accuracy of the system. The steps involved in the proposed work are shown in Fig. 4, which shows the process carried out by deep learning. Preprocessing is the
Fig. 3 Layers of CNN models
initial step to be performed in deep learning, in which the images are resized to a prescribed size so as to lessen the computational time. The layers of deep learning techniques are shown in Fig. 3.
Fig. 4 Overview of the proposed system
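The seventy/thirty split and the resizing step described in this section can be sketched as follows. The file names, the random seed, and the nearest-neighbour resizing method are illustrative assumptions; the chapter does not say how the split or the resizing is implemented.

```python
import random


def split_dataset(items, train_fraction=0.7, seed=0):
    """Shuffle `items` and split them into training and testing lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]


def resize_nearest(image, new_h, new_w):
    """Resize a 2-D grid of pixel values with nearest-neighbour sampling."""
    old_h, old_w = len(image), len(image[0])
    return [
        [image[r * old_h // new_h][c * old_w // new_w] for c in range(new_w)]
        for r in range(new_h)
    ]


# Hypothetical file names standing in for labelled images.
images = [f"image_{i:03d}.png" for i in range(100)]
train_set, test_set = split_dataset(images)

# A 4x4 grid reduced to the prescribed 2x2 size.
small = resize_nearest(
    [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]], 2, 2
)
```

Shuffling before the split keeps the training and testing sets from inheriting any ordering in the original collection.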
4.1 Layers of Deep Learning Architectures
1. Convolution layer
2. Pooling layers
3. Fully connected layers
4. Batch normalization
5. Transition layer
6. Dense block layer
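The convolution and pooling layers in the list above, followed by a flattening step, can be sketched on a small grid of numbers. The 5x5 input and the 2x2 filter are hypothetical values chosen only to make the arithmetic easy to follow; a trained network would learn its filters.

```python
def conv2d(image, kernel):
    """Stride-1 convolution without padding, as used in CNNs
    (strictly a cross-correlation: the kernel is not flipped)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def relu(feature_map):
    """Replace negative values with zero."""
    return [[max(0, v) for v in row] for row in feature_map]


def max_pool(feature_map, size=2):
    """Non-overlapping max pooling with a size x size window."""
    rows = len(feature_map) // size
    cols = len(feature_map[0]) // size
    return [
        [
            max(feature_map[r * size + i][c * size + j]
                for i in range(size) for j in range(size))
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def flatten(feature_map):
    """Turn a 2-D feature map into a 1-D list."""
    return [v for row in feature_map for v in row]


image = [
    [1, 2, 0, 1, 3],
    [0, 1, 2, 3, 1],
    [1, 0, 1, 2, 2],
    [2, 1, 0, 1, 0],
    [1, 3, 1, 0, 1],
]
kernel = [[1, 0], [0, -1]]  # hypothetical 2x2 filter

features = flatten(max_pool(relu(conv2d(image, kernel))))
```

The flattened list is what a fully connected layer would receive as its input.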
The second step is feature extraction, which draws the vital attributes out of the images. These attributes are needed to classify the type of disease in a given image. Feature extraction involves the convolution and pooling layers [16–20]. The convolution layers apply filters that draw out the important attributes from the image. They are followed by an activation function, which decides which attributes to pass on to the next layer and avoids repetition of information. The convolutional layer is followed by the pooling layer. Max pooling is preferred for extracting the most prominent attributes from the images. Striding and padding are applied so as not to miss required attributes in the image data. Once the attributes are extracted, the model enters the classification stage [3–5, 16–18, 20]. A flattening layer reduces the feature maps to a single dimension before the information is passed to the classification layer, which helps to increase the computational speed of the CNN model. The classification layer is a fully connected layer or a dense block layer. It uses the softmax activation function to classify the type of disease in the given image data. This forward flow from input to output is called the feed-forward pass. After it, the loss is calculated with the cross-entropy loss function. The Adam optimizer uses the loss to update the weights by propagating backwards from the output layer. This process of reducing the error rate is known as backpropagation, and it improves the performance of the model [3]. The layers vary from architecture to architecture. The performance of the model is then measured using the true positive, true negative, false positive, and false negative values (Fig. 4). Accuracy, precision, recall, and F1 score can be computed for all the CNN models to evaluate their performance.
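The four evaluation measures named above follow directly from the true positive, true negative, false positive, and false negative counts. The sketch below uses the standard definitions; the counts passed in at the end are hypothetical and are not results from the chapter.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute accuracy, precision, recall and F1 from confusion-matrix counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical counts for a healthy/diseased classifier on 100 test images.
metrics = classification_metrics(tp=40, tn=45, fp=5, fn=10)
```

Reporting precision and recall alongside accuracy matters for disease data, where a model can reach high accuracy while still missing many diseased cases.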
Numerous CNN techniques have evolved to provide better accuracy in predicting diseases from image data. The performance of deep learning techniques has improved over the past decade, and they are deployed in all fields, particularly in medicine to diagnose diseases.
5 Incorporating Deep Learning in AAL Technology
Over the decades, deep learning has employed numerous algorithms, improving performance by changing the layers of the networks. From AlexNet to NASNet, the performance of deep learning techniques has improved to a great extent. Once the fitness of an individual is recorded by the ambient assistive living technology, the information is conveyed from the cloud through the gateways to the graphical user interface. Caregivers use the deep learning techniques to learn the status of the dependents, as shown in Fig. 5, and to take adequate measures in emergencies [6]. This reduces how often people with impairments need to visit doctors. It also reduces the burden on doctors, who in an emergency can treat patients remotely with the assistance of the caregivers.
5.1 AlexNet Algorithm
Lung cancer is a deadly disease. It is hard to find the cancer in its initial stages, and determining the stage of an individual's cancer is quite hard with traditional methods [7]. Deep learning models, particularly AlexNet, can predict the stage of cancer with high precision. By analyzing an image obtained from the individual, the AlexNet model can predict the stage of the person's cancer [7]. AlexNet also performs well in detecting blood cancer from image data [8]. Traditional methods of detecting blood cancer take more time and are tiring. They require a microscope, skilled labor, and experts to identify the presence of cancer cells [40]. Deep learning techniques detect the disease in seconds and more accurately [8]. They are trained on images obtained from cancer patients and can detect the cancer in its initial stages.
Fig. 5 Deep Learning Model
Fig. 6 Alex net architecture
The AlexNet architecture comprises convolutional layers that draw out the attributes of the images. The architecture employs five convolution layers, with the Rectified Linear Unit as the activation function that decides which attributes are forwarded and which are skipped. The pooling layers draw out the prime attributes. A flattening layer lessens the dimensions, and the fully connected layers discard the unwanted attributes. The activation function used in the final fully connected layer is softmax, which is trained with the categorical cross-entropy loss. Finally, the fully connected layer gives the output as normal or abnormal. AlexNet provides 98 percent accuracy in predicting the cancer, as shown in Fig. 6. Alzheimer's disease is a dangerous disease that affects the neurons in the brain. Deep learning architectures, particularly AlexNet, can detect the disease in its early stages after training on the dataset. ResNet detects the disease by means of transfer learning [21]. AlexNet, incorporating convolutional, pooling, flattening, and fully connected layers with ReLU and softmax activation functions, detects Alzheimer's disease with high accuracy. AlexNet performed well in comparison with ResNet. The datasets used to train the AlexNet model are drawn from Magnetic Resonance Imaging scans available on Kaggle.
5.2 VGGNet 16 (2014)
Cancer is one of the deadliest diseases and a major cause of the increased death rate in humans over recent decades. There are a variety of cancers, including blood cancer, brain cancer, skin cancer, and breast cancer. Cancer may be cured if it is detected in the initial stages. Deep learning techniques can detect cancer cells even when they are barely visible under the microscope [21]. These techniques predict the disease in less time and save human lives. The deep learning model AlexNet can detect the existence of lung cancer. VGGNet 16 comprises sixteen weighted layers to obtain precise results. It employs 13 convolution layers of various dimensions to extract the required attributes, together with five pooling layers of various sizes that draw the maximum attributes from the given image, followed by three dense fully connected layers that predict the output from the attributes extracted from the input image. The fully connected layer comprises
a thousand neurons. The architecture of VGGNet 16 uses two activation functions [22]. The hidden layers use the rectified linear unit, which outputs zero for negative values and passes positive values through unchanged. The rectified linear unit is used to lessen the vanishing gradient problem. Softmax is used as the activation function in the fully connected output layer to produce the output, shown in Fig. 7, from the input diseased image [21, 22]. VGGNet provides precise results in comparison with the other convolutional models in detecting lung cancer.
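The remark that the rectified linear unit lessens the vanishing gradient problem can be illustrated by chaining local derivatives across layers. This is a deliberately simplified sketch: it ignores the weights, assumes the same positive pre-activation value at every layer, and uses a depth of sixteen only to echo the layer count of this section.

```python
import math


def relu(x):
    """Zero for negative inputs, the input itself for positive inputs."""
    return max(0.0, x)


def relu_grad(x):
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise."""
    return 1.0 if x > 0 else 0.0


def sigmoid_grad(x):
    """Derivative of the sigmoid; it never exceeds 0.25."""
    s = 1.0 / (1.0 + math.exp(-x))
    return s * (1.0 - s)


def chained_gradient(grad_fn, pre_activation, depth):
    """Multiply the same local derivative across `depth` layers."""
    product = 1.0
    for _ in range(depth):
        product *= grad_fn(pre_activation)
    return product


relu_signal = chained_gradient(relu_grad, 1.0, depth=16)
sigmoid_signal = chained_gradient(sigmoid_grad, 1.0, depth=16)
```

With ReLU the chained factor stays at one for positive inputs, while with a sigmoid it shrinks towards zero as depth grows, which is why early layers of deep sigmoid networks learn slowly.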
5.3 VGGNet 19
Diabetic retinopathy results in loss of sight and is associated with cardiovascular diseases. The disease is difficult to detect in its initial stages [23]. If it is detected early, damage to the retina and other vision-related problems can be prevented. Deep learning techniques are incorporated to detect the disease in the earlier stages. From the damage visible in retinal images, the deep learning techniques classify the stage of the disease, for example as mild or severe. The VGGNet 19 architecture comprises sixteen convolutional layers that extract the prime attributes from the given input image. The pooling layers extract the minimum, average, or maximum attributes from the image as required. Three fully connected layers decide the result. The activation functions employed in the layers of VGGNet 19 are the Rectified Linear Unit (ReLU) and softmax [23]. The activation functions decide which attributes to pass on to the next layer of the convolutional neural network, as shown in Fig. 8. The output range differs from one activation function to another.
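The layer counts quoted for VGGNet 16 and VGGNet 19 can be checked against the commonly published VGG configurations. The channel widths in the lists below come from those published configurations rather than from this chapter, and the helper name `count_layers` is hypothetical.

```python
# "M" marks a max-pooling layer; each number is the output channel count
# of one 3x3 convolution layer.
VGG16 = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
         512, 512, 512, "M", 512, 512, 512, "M"]
VGG19 = [64, 64, "M", 128, 128, "M", 256, 256, 256, 256, "M",
         512, 512, 512, 512, "M", 512, 512, 512, 512, "M"]
FULLY_CONNECTED = 3  # both variants end with three fully connected layers


def count_layers(config):
    """Count convolution, pooling and weighted layers in a VGG configuration."""
    conv = sum(1 for item in config if item != "M")
    pool = sum(1 for item in config if item == "M")
    return {"conv": conv, "pool": pool, "weighted": conv + FULLY_CONNECTED}


vgg16_layers = count_layers(VGG16)
vgg19_layers = count_layers(VGG19)
```

The numbers 16 and 19 in the model names count only the layers that carry weights, so the five pooling layers are not included.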
5.4 ResNet
Examining lung sounds is the traditional way for doctors to detect defects in the lungs. At present, deep learning techniques are used in the detection of lung diseases. Images of diseased lungs, that is, lungs affected by cancer or any other lung disease, are collected to form the dataset [24]. The dataset is used to train and test the ResNet model of deep learning and to obtain its accuracy in detecting lung diseases. In the ResNet architecture, the input image is first zero padded. Zero padding is the process of adding zeros around the border of the input image to avoid losing features at the borders. The input is next fed to the convolution layer, which draws out the required attributes [25]. Batch normalization protects the input from the problem of overfitting. ReLU serves as the activation function: it passes positive values and replaces negative values with zeros. Max pooling
Fig. 7 VGG Net 16
Fig. 8 VGG Net 19 architecture
draws out the maximum attributes from the image. The pooling layer is followed by a convolution layer and the identity block. In the identity block a shortcut carries the input forward, so that when the intermediate layers add nothing the output is the same as the input. Average pooling comes next and takes the average attributes from the output of the identity block. A flattening layer reduces the dimensions of the feature maps so as to lessen the computation time and increase the speed of computation. The fully connected layer, which leads to the output, comprises 4096 neurons to improve the accuracy of the prediction. The activation function in the output layer is softmax [26], which categorizes the input image as normal or disordered. ResNet provides high accuracy in lung disease detection, as shown in Fig. 9. Tumors in the liver are also identified using the ResNet architecture. The ResNet architecture comprises numerous layers to preserve the
Fig. 9 RESNET architecture
attributes in the input image, and it avoids the problem of overfitting. The ResNet model gives precise results in tumor detection [27]. The residual neural network architecture provides an accuracy of ninety-eight percent in the detection of lung tumors. The ResNet model thus offers good accuracy in predicting lung diseases and lung tumors. Breast cancer can also be predicted by the ResNet architecture [28]. The ResNet model is combined with a decision tree to provide high accuracy in cancer prediction. Residual neural networks work well in cancer prediction.
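Two ResNet ingredients described in this section, zero padding and the identity block, can be sketched in a few lines. The identity block is shown in its usual residual form, where the input is added back to the output of the intermediate layers; the `transform` argument is a hypothetical stand-in for those layers, and the input values are illustrative.

```python
def zero_pad(image, pad=1):
    """Surround a 2-D grid with `pad` rows and columns of zeros."""
    width = len(image[0]) + 2 * pad
    blank = [0] * width
    padded = [list(blank) for _ in range(pad)]
    for row in image:
        padded.append([0] * pad + list(row) + [0] * pad)
    padded.extend(list(blank) for _ in range(pad))
    return padded


def relu(values):
    """Replace negative values with zero."""
    return [max(0.0, v) for v in values]


def identity_block(x, transform):
    """Residual identity block: output = ReLU(transform(x) + x)."""
    fx = transform(x)
    return relu([f + xi for f, xi in zip(fx, x)])


# If the intermediate layers output zeros, the input passes through unchanged.
passthrough = identity_block([1.0, 2.0, 3.0], lambda v: [0.0 for _ in v])
# A non-zero transform shifts the values before the ReLU is applied.
shifted = identity_block([1.0, 2.0, 3.0], lambda v: [-2.0 for _ in v])
```

The shortcut gives gradients a direct path through the block, which is what allows residual networks to stack many layers.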
5.5 DenseNet
Leukocytes, the white blood cells (WBC), protect humans from diseases and fight infections. They act as a protective barrier. The leukocyte count should be stable to maintain health; an increase or decrease in the count has serious effects and results in numerous diseases [29]. Deep learning techniques detect the diseases that occur in humans as a result of an unstable WBC count, and DenseNet detects them with high accuracy. The DenseNet architecture comprises a convolutional layer, pooling layer, dense block layer, transition layer, dropout, and batch normalization layer. Initially the image from the dataset is fed into the convolution layer, which extracts the attributes from the input image. The pooling layer then lessens the dimensions of the input [30]. It is followed by the dense block layer, which is composed of a batch normalization layer, the Rectified Linear Unit activation function, a convolution layer, and a dropout layer. Batch normalization lessens overfitting. The Rectified Linear Unit outputs zero for negative values and passes positive values unchanged. The dropout layer also counters overfitting by preventing a few neurons from carrying their output forward [31]. The fully connected layer decides the output from the attributes gained in the previous layers. In the DenseNet architecture, all the layers are connected to one another, as shown in Fig. 10. X-rays are relied upon for the prediction of certain diseases, and prediction by the naked eye may fail in some cases. Deep learning techniques, which are deployed in all fields, work well in the prediction of heart-related diseases. A DenseNet architecture with a varying number of layers is a good choice for predicting lung diseases with high accuracy. Glaucoma results in loss of sight. It is curable in its initial stages.
Glaucoma shows no symptoms in its initial stages, so early prediction is hard [32, 33]. Routine eye checks should be done to protect against glaucoma. Deep learning techniques can predict glaucoma from images of retinal tissue. The DenseNet model uses numerous layers for the accurate prediction of glaucoma
Fig. 10 DenseNet architecture
[34]. The DenseNet model of deep learning provides an accuracy of ninety-seven percent in the prediction of glaucoma.
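The dense connectivity described in this section, where each layer is connected to the layers before it, can be sketched as repeated concatenation. The layer used here is a hypothetical stand-in for the batch normalization, ReLU, and convolution sequence: it simply emits a fixed number of new values computed from everything it receives.

```python
def dense_block(inputs, layers):
    """Run a DenseNet-style block on a list of feature values.

    Each layer receives the block input together with the outputs of all
    earlier layers, and its own output is appended for the layers after it.
    """
    features = list(inputs)
    for layer in layers:
        features = features + layer(features)
    return features


def make_layer(growth_rate):
    """Hypothetical layer: emits `growth_rate` values, each the ReLU of the
    mean of everything it receives."""
    def layer(features):
        mean = sum(features) / len(features)
        return [max(0.0, mean)] * growth_rate
    return layer


block_output = dense_block([1.0, 3.0], [make_layer(2) for _ in range(3)])
```

Starting from two input values and adding two per layer, three layers give eight values, and the original inputs are still present at the front of the list.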
5.6 MobileNet
Hand, foot, and mouth disease is an epidemic neonatal disease whose symptoms include rashes on the mouth, feet, and hands, fever, and ulcers. Predicting hand, foot, and mouth disease is quite intricate because the images are similar to cancer images [35]. Using deep learning techniques for its detection gives high precision and accuracy. Several deep learning techniques are combined in the proposed work to lessen discrepancies in classification. A multilayer perceptron and MobileNet draw the attributes from the input images, allow only the required attributes to pass through, drop the unwanted ones, reduce the dimensions of the images, and improve classification accuracy. The hybrid deep learning model provides ninety-nine percent accuracy. Brain disease can be detected from Magnetic Resonance Images (MRI) with deep learning models. Collecting a large number of images for a dataset is quite arduous, so researchers employ transfer learning to overcome the difficulties of dataset collection [36]. Transfer learning works well with small datasets. The MobileNetV1 and V2 architectures extract the attributes from the input image. In addition, the proposed work employs bilinear methods to combine the distinguishing characteristics from the MobileNetV1 and V2 architectures. The system provides an accuracy of ninety-eight percent in prediction. Alzheimer's disease can also be predicted with the MobileNet model of deep learning. The disease is more manageable when predicted in the earlier stages [37].
Proper medication can effectively assist in managing the disease. The proposed work produces two outputs: one detects the presence of the disease, and the other determines its severity level. MobileNet offers an accuracy of ninety-one percent in predicting Alzheimer's disease in the initial stages. Skin cancer can be detected using the camera of a mobile phone. Deep learning models such as MobileNet and Faster R-CNN are deployed on Android mobile phones to predict skin cancer. The dataset comprises six hundred images used to predict two crucial kinds of cancer. The Android phone is combined with the MobileNet architecture and Faster R-CNN, along with Jupyter Notebook, to offer high precision in predicting skin cancer in real time [38]. The MobileNet architecture comprises convolution, pooling, and fully connected layers that extract the attributes from the input image, draw out the required maximum attributes, and skip the attributes that are not required, which lessens the difficulties in the deep learning models. The activation function decides which attributes to carry forward and which to drop. The sigmoid activation function decides the type of skin cancer present [39]. The activation function in the convolutional layers is the Rectified Linear Unit, with sigmoid at the fully connected layer for accurate prediction. Deep learning techniques thus predict the disease from the image data, as shown in Fig. 11. Once the image data is fed as input to the neural network, the various architectures employed can predict the disease. The image data is given as input through the graphical user interface (GUI), where the neural network can be run to predict the state of the patient with respect to the disease, that is, initial stage or at-risk condition. Medical professionals can then start treatment based on the data collected from the deep learning techniques.
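The role of the sigmoid at the fully connected layer, choosing between two kinds of skin cancer, can be sketched as a threshold on a single output. The class names, the threshold of 0.5, and the raw scores are hypothetical; the chapter does not give these details.

```python
import math


def sigmoid(score):
    """Squash a raw score into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-score))


def classify(score, labels=("type_a", "type_b"), threshold=0.5):
    """Pick one of two classes from a single sigmoid output."""
    probability = sigmoid(score)
    label = labels[1] if probability >= threshold else labels[0]
    return label, probability
```

A raw score of zero sits exactly on the boundary, negative scores map to the first class, and positive scores map to the second.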
Fig. 11 Mobile Net architecture
6 Conclusion
Ambient assistive living technology can provide a comfortable life for people with impairments and for elders through complete monitoring by caregivers and medical professionals. Diseases can be identified by combining deep learning techniques with ambient assistive living technologies. The technologies record numerous parameters such as blood pressure, pulse rate, oxygen level, sugar level, and temperature, and they support the detection of various categories of cancer and of falls [42–43]. Disease detection that combines ambient assistive living technologies with deep learning techniques can predict disease with high precision. The state of the impaired or older person is recorded by collecting data from the sensors, which pass through the gateway to the cloud and on to the graphical user interface (GUI). The images received from the person are given as input to the deep learning model to predict the disease in the absence of medical professionals. The technologies spare people with impairments from rushing to hospitals and lessen the strain on the individuals and their caregivers. With ambient assistive living technology, people with impairments become self-reliant and feel safe and secure.
References
1. Jovanovic, M., Mitrov, G., Zdravevski, E., Lameski, P., Colantonio, S., Kampel, M., Tellioglu, H., & Florez-Revuelta, F. (2022). Ambient assisted living: Scoping review of artificial intelligence models, domains, technology, and concerns. Journal of Medical Internet Research, 24(11), e36553. https://doi.org/10.2196/36553
2. Gams, M., Gu, I. Y.-H., Härmä, A., Muñoz, A., & Tam, V. (2019). Artificial intelligence and ambient intelligence. Journal of Ambient Intelligence and Smart Environments, 11(1), 71–86. https://doi.org/10.3233/AIS-180508
3. Dai, Y., Shen, L., Cao, Y., Lei, T., & Qiao, W. (2019). Detection of vegetation areas attacked by pests and diseases based on adaptively weighted enhanced global and local deep features. IGARSS 2019 - 2019 IEEE International Geoscience and Remote Sensing Symposium, 6495–6498. https://doi.org/10.1109/IGARSS.2019.8898517
4. Marefat, M., & Juneja, A. (2019). Serverless data parallelization for training and retraining of deep learning architecture in patient-specific arrhythmia detection. IEEE EMBS International Conference on Biomedical & Health Informatics (BHI), 1–4. https://doi.org/10.1109/BHI.2019.8834566
5. Shu, M. (2019). Deep learning for image classification on very small datasets using transfer learning. Creative Components. 345. https://lib.dr.iastate.edu/creativecomponents/345
6. Shaheen, M., Khan, R., Biswal, R. R., Ullah, M., Khan, A., Uddin, M. I., Zareei, M., & Waheed, A. (2021). Acute Myeloid Leukemia (AML) detection using AlexNet model. Complexity, 2021(Article ID 6658192), 8. https://doi.org/10.1155/2021/6658192
7. Agarwal, A., Patni, K., & Rajeswari, D. (2021). Lung cancer detection and classification based on Alexnet CNN. 2021 6th International Conference on Communication and Electronics Systems (ICCES), 1390–1397. https://doi.org/10.1109/ICCES51350.2021.9489033
8. Alkafrawi, I. M. I., & Dakhell, Z. A. (2022). Blood cells classification using deep learning technique. International Conference on Engineering & MIS (ICEMIS), 1–6. https://doi.org/10.1109/ICEMIS56295.2022.9914281
9. Bastaki, B. B., Bosakowski, T., & Benkhelifa, E. (2017). Intelligent assisted living framework for monitoring elders. 2017 IEEE/ACS 14th International Conference on Computer Systems and Applications (AICCSA), 495–500. https://doi.org/10.1109/AICCSA.2017.213
10. Ziefle, M., Rocker, C., & Holzinger, A. (2011). Perceived usefulness of assistive technologies and Electronic services for ambient assisted living. 2011 5th International Conference on Pervasive Computing Technologies for Healthcare (PervasiveHealth) and Workshops, 585–592. https://doi.org/10.4108/icst.pervasivehealth.2011.246044
11. Hartanto, C. A., & Wibowo, A. (2020). Development of mobile skin cancer detection using faster R-CNN and MobileNet v2 model. 2020 7th International Conference on Information Technology, Computer, and Electrical Engineering (ICITACEE), 58–63. https://doi.org/10.1109/ICITACEE50144.2020.9239197
12. Saha, S., Bhadra, R., & Kar, S. (2021). Diagnosis of COVID-19 & Pneumonia from Chest x-ray Scans using modified MobileNet architecture. 2021 IEEE Mysore Sub Section International Conference (MysuruCon), 793–798. https://doi.org/10.1109/MysuruCon52639.2021.9641739
13. Naga Srinivasu, P., JayaLakshmi, G., Jhaveri, R. H., & Praveen, S. P. (2022). Ambient assistive living for monitoring the physical activity of diabetic adults through body area networks. Mobile Information Systems, Article ID 3169927. https://doi.org/10.1155/2022/3169927
14. Parvin, P., Paternó, F., & Chessa, S. (2018, June 25–28). Anomaly detection in the elderly daily behavior. In Proceedings of the 14th International Conference on Intelligent Environments. IEEE.
15. Forkan, A., Khalil, I., & Tari, Z. (2014). CoCaMAAL: A cloud-oriented context-aware middleware in ambient assisted living. Future Generation Computer Systems, 35, 114–127.
16. Yang, Y., Gu, H., Han, Y., & Li, H. (2020). An end-to-end deep learning change detection framework for remote sensing images. IGARSS 2020 - 2020 IEEE International Geoscience and Remote Sensing Symposium, 652–655. https://doi.org/10.1109/IGARSS39084.2020.9324076
17. Latha, R. S., Sreekanth, G. R. R., Suganthe, R. C., & Selvaraj, R. E. (2021). A survey on the applications of Deep Neural Networks. 2021 International Conference on Computer Communication and Informatics (ICCCI), 1–3. https://doi.org/10.1109/ICCCI50826.2021.9457016
18. Valarmathi, S., & Vijayabhanu, R. (2021). A survey on diabetic retinopathy disease detection and classification using deep learning techniques. 2021 Seventh International conference on Bio Signals, Images, and Instrumentation (ICBSII), 1–4. https://doi.org/10.1109/ICBSII51839.2021.9445163
19. Muhammad, K., Khan, S., Ser, J. D., & Albuquerque, V. H. C. d. (2021, February). Deep learning for multigrade brain tumor classification in smart healthcare systems: A prospective survey. IEEE Transactions on Neural Networks and Learning Systems, 32(2), 507–522. https://doi.org/10.1109/TNNLS.2020.2995800
20. He, Z. (2020). Deep learning in image classification: A survey report. 2020 2nd International Conference on Information Technology and Computer Application (ITCA), 174–177. https://doi.org/10.1109/ITCA52113.2020.00043
21. Aziz, S., Bilal, M., Khan, M. U., & Amjad, F. (2020). Deep learning-based automatic morphological classification of leukocytes using blood smears. 2020 International Conference on Electrical, Communication, and Computer Engineering (ICECCE), 1–5. https://doi.org/10.1109/ICECCE49384.2020.9179246
22. Al-Adhaileh, M. H. (2021). Diagnosis and classification of Alzheimer’s disease by using a convolution neural network algorithm. 2021 Soft Computing. https://doi.org/10.21203/rs.3.rs1021353/v1
23. Thanzeem Mohamed Sheriff, S., Venkat Kumar, J., Vigneshwaran, S., Jones, A., & Anand, J. (2021). Lung cancer detection using VGG NET 16 architecture. International Conference on Physics and Energy 2021 (ICPAE 2021). https://doi.org/10.1088/1742-6596/2040/1/012001
24. Zakaria, N., Mohamed, F., Abdelghani, R., & Sundaraj, K. (2021). Three ResNet deep learning architectures applied in pulmonary pathologies classification. 2021 International Conference on Artificial Intelligence for Cyber Security Systems and Privacy (AI-CSP), 1–8. https://doi.org/10.1109/AI-CSP52968.2021.9671211
126
A. Sharmila et al.
25. Budhiman, A., Suyanto, S., & Arifianto, A. (2019). Melanoma cancer classification using ResNet with data augmentation. 2019 International Seminar on Research of Information Technology and Intelligent Systems (ISRITI), 17–20. https://doi.org/10.1109/ISRITI48646.2019. 9034624 26. Sirco, A., Almisreb, A., Tahir, N. M., & Bakri, J. (2022). Liver tumour segmentation based on ResNet technique. 2022 IEEE 12th International Conference on Control System, Computing and Engineering (ICCSCE), 203–208. https://doi.org/10.1109/ICCSCE54767.2022.9935636 27. Praveen, S. P., Jyothi, V. E., Anuradha, C., VenuGopal, K., Shariff, V., & Sindhura, S. (2022). Chronic kidney disease prediction using ML-based Neuro-Fuzzy model. International Journal of Image and Graphics, 2340013. https://doi.org/10.1142/S0219467823400132 28. Zheng, Z., Zhang, H., Li, X., Liu, S., & Teng, Y. (2021). ResNet-based model for cancer detection. 2021 IEEE International Conference on Consumer Electronics and Computer Engineering (ICCECE), 325–328. https://doi.org/10.1109/ICCECE51280.2021.9342346 29. Bing-jin, L., Jian, Y., Yan-jun, L., Liang, P., & Guo-xiong, L. (2020). Research and practice of X-ray chest film disease classification based on DenseNet. 2020 International Conference on Artificial Intelligence and Education (ICAIE), 241–244. https://doi.org/10.1109/ICAIE50891. 2020.00063 30. Lalitha, V., Raghul, G., & Premkumar, A. R. (2020). Leukocyte counting and reporting using densenet deep learning. 2020 International Conference on Power, Energy, Control and Transmission Systems (ICPECTS), 1–6. https://doi.org/10.1109/ICPECTS49113.2020.933 7022 31. Wang, Q., Yang, B., Liu, W., & Chen, G. (2021). X-ray images detection of COVID-19 based on deepwise separable DenseNet. 2021 IEEE 6th International Conference on Signal and Image Processing (ICSIP), 294–298. https://doi.org/10.1109/ICSIP52628.2021.9688876 32. Tiwari, R., Verma, M., & Sar, S. K. (2022). Detecting different thoracic disease using CNNmodel. 
2022 International Conference for Advancement in Technology (ICONAT), 1–11. https:/ /doi.org/10.1109/ICONAT53423.2022.9725940 33. He, G., Ping, A., Wang, X., & Zhu, Y. (2019). Alzheimer’s disease diagnosis model based on three dimensional full convolutional DenseNet. 2019 10th International Conference on Information Technology in Medicine and Education (ITME), 13–17. https://doi.org/10.1109/ ITME.2019.00014 34. Ovreiu, S., Paraschiv, E.-A., & Ovreiu, E. (2021). Deep learning & digital fundus images: Glaucoma detection using DenseNet. 2021 13th International Conference on Electronics, Computers and Artificial Intelligence (ECAI), 1–4. https://doi.org/10.1109/ECAI52376.2021. 9515188 35. Naga Srinivasu, P., Krishna, T. B., Ahmed, S., Almusallam, N., Khaled Alarfaj, F., & Allheeib, N. (2023). January 17). Variational autoencoders-basedself-learning model for tumor identification and impact analysis from 2-D MRI images. Journal of Healthcare Engineering, 2023, 1–17. https://doi.org/10.1155/2023/1566123 36. Verma, S., Razzaque, M. A., Sangtongdee, U., Arpnikanondt, C., Tassaneetrithep, B., & Hossain, A. (2021). Digital diagnosis of hand, foot, and mouth disease using hybrid deep neural networks. IEEE Access, 9, 143481–143494. https://doi.org/10.1109/ACCESS.2021.3120199 37. Francis, A., & Pandian, I. A. (2021). Early detection of Alzheimer’s disease using ensemble of pre-trained models. 2021 International Conference on Artificial Intelligence and Smart Systems (ICAIS), 692–696. https://doi.org/10.1109/ICAIS50930.2021.9395988 38. Rumala, D. J., et al. (2021). Bilinear MobileNets for multi-class brain disease classification based on magnetic resonance images. 2021 IEEE Region 10 Symposium (TENSYMP), 1–6. https://doi.org/10.1109/TENSYMP52854.2021.9550987 39. Ahmed, S., Srinivasu, P., Alhumam, A., & Alarfaj, M. (2022, November 9). AAL and internet of medical things for monitoring type-2 diabetic patients. Diagnostics, 12(11), 2739. https:// doi.org/10.3390/diagnostics12112739
AAL with Deep Learning to Classify the Diseases Remotely …
127
40. Prawira, R., Bustamam, A., & Anki, P. (2021). Multi label classification of retinal disease on fundus images using AlexNet and VGG16 architectures. 2021 4th International Seminar on Research of Information Technology and Intelligent Systems (ISRITI), 464–468. https://doi. org/10.1109/ISRITI54043.2021.9702817 41. Badgujar, S., & Pillai, A. S. (2020, July 1–3). Fall detection for elderly people using machine learning. In Proceedings of the 11th International Conference on Computing, Communication and Networking Technologies (ICCCNT). IEEE. 42. Sarabia, D., Usach, R., Palau, C., & Esteve, M. (2020). Highly-efficient fog-based deep learning AAL fall detection system. Internet Things, 11, 100185. https://doi.org/10.1016/j.iot.2020. 100185 43. Srinivasu, P. N., Bhoi, A. K., Jhaveri, R. H., Reddy, G. T., & Bilal, M. (2021, July 17). Probabilistic deep Q network for real-time path planning in censorious robotic procedures using force sensors. Journal of Real-Time Image Processing, 18(5), 1773–1785. https://doi.org/10. 1007/s11554-021-01122-x
Heart Failure Prediction Using Radial Basis with Metaheuristic Optimization Varshitha Vankadaru, Greeshmanth Penugonda, Naga Srinivasu Parvathaneni, and Akash Kumar Bhoi
Abstract Heart failure is a major cause of morbidity and death worldwide. Early detection of heart failure is critical to improving patient outcomes. Machine learning techniques have been used for heart failure detection in the past few years. This study proposes a novel approach for heart failure detection using a Radial Basis Function (RBF) neural network with Genetic Algorithm (GA) optimization. The approach is applied to a publicly available heart failure dataset, and its performance is evaluated using various performance metrics. The proposed method uses 12 features to train the RBF network, and the GA is used to find the optimal network parameters. Its efficacy was assessed using a dataset of heart failure patients and healthy individuals. The proposed method achieved an accuracy of 92.6%, outperforming several existing methods and demonstrating its potential for heart failure detection. The proposed method can provide a reliable, non-invasive, and cost-effective tool for the early detection of heart failure, which can help reduce the burden of this disease on individuals and healthcare systems.
Keywords Heart Failure Prediction · Radial basis function · Ambient Assisted Living · Metaheuristic Optimization · Correlation Heatmap
V. Vankadaru · N. S. Parvathaneni (B) Department of Computer Science and Engineering, Prasad V Potluri Siddhartha Institute of Technology, Vijayawada 520007, India e-mail: [email protected] G. Penugonda Department of Computer Science and Engineering, V.R. Siddhartha Engineering College, Vijayawada 520007, India A. K. Bhoi KIET Group of Institutions, Ghaziabad 201206, India Directorate of Research, Sikkim Manipal University, Gangtok 737102, Sikkim, India A. K. Bhoi e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 P. Barsocchi et al. (eds.), Enabling Person-Centric Healthcare using Ambient Assistive Technology, Studies in Computational Intelligence 1108, https://doi.org/10.1007/978-3-031-38281-9_6
1 Introduction
Artificial intelligence is making significant strides in the medical field [1]. It is now used extensively for disease diagnosis and prognosis, often with high accuracy, including in cases where traditional methods are ineffective. One of the major branches of artificial intelligence is machine learning, which spans a wide range of techniques. Machine learning in healthcare has gained immense popularity over the years, with a wide range of applications that enhance the quality of medical services. Machine learning algorithms can analyze vast amounts of medical data, including electronic health records, lab results, and demographic information [2–4]. This leads to improved diagnosis and treatment, personalized patient care, and the identification of high-risk patients. One of the most important applications of machine learning in healthcare is predictive analytics. Predictive models can estimate the likelihood of a patient developing certain conditions, enabling healthcare providers to take preventative measures. For diagnosis, we use deep learning methods, that is, deep neural networks, which take even fine details of the data into account when computing results, making them highly effective.
Machine learning has a promising future in heart failure, a prevalent and complex medical condition. Machine learning algorithms can analyze vast amounts of patient data to identify individuals at high risk of developing heart failure. Predictive models can also estimate the likelihood of readmission or hospitalization for patients with heart failure, allowing for proactive interventions. Machine learning algorithms can improve the accuracy of heart failure diagnosis by analyzing ECG signals, medical images, and lab results [5]. Machine learning can also monitor and track the progression of heart failure, enabling medical professionals to provide more effective treatment.
Integrating machine learning in heart failure care can potentially improve patient outcomes and reduce the burden on healthcare systems. Congestive heart failure, also known as heart failure, happens when the heart muscle cannot pump blood as efficiently as needed. As a result, blood often accumulates, and fluid builds up in the lungs, frequently leading to breathing difficulties [6–8]. The following are the most common symptoms of heart failure:
• Shortness of breath: This is a common symptom of heart failure, especially during physical activity or while lying down. As a result of the heart’s inability to pump blood efficiently, fluid builds up in the lungs, causing shortness of breath.
• Fatigue: Heart failure can cause a decrease in blood flow, leading to fatigue and weakness.
• Swelling: Fluid buildup in the body can cause swelling in the legs, ankles, and feet, making it difficult to move.
• Rapid or irregular heartbeat: Heart failure can cause an irregular heartbeat, leading to fainting and dizziness.
• Chest pain: In some cases, heart failure can cause chest pain due to fluid buildup in the lungs.
• Coughing: Heart failure can cause a persistent cough that produces pink, frothy mucus.
• Decreased appetite and weight gain: Fluid buildup in the body can cause weight gain and a decrease in appetite.
Radial basis function (RBF) networks are a type of machine learning algorithm that can be used to support the diagnosis and treatment of heart failure [9]. Figure 1 illustrates the use of RBF over IoT. RBF networks can analyze large amounts of medical data, including electronic health records, demographic information, and lab results, to identify the risk factors associated with heart failure. They can be used to develop predictive models that estimate the likelihood of a patient developing heart failure, enabling healthcare providers to take preventative measures. Additionally, RBF networks can be used to analyze electrocardiograms (ECGs) and medical images to detect early signs of heart failure and improve diagnostic accuracy. They can also be used to develop decision support systems that provide physicians with real-time guidance during the diagnosis and treatment of heart failure. The ability of RBF networks to analyze vast amounts of data and make predictions based on this information makes them a valuable tool in managing heart failure. In conclusion, RBF networks have the potential to revolutionize the diagnosis and treatment of heart failure, enabling healthcare providers to identify high-risk patients, improve diagnosis accuracy, and provide personalized patient care. With continued advancements in RBF networks [10], the future of heart failure management looks promising.
Fig. 1 Diagram representing RBF over IoT
The objectives of the current study are listed below.
• Our proposed model aims to address the limitations identified in the continuous monitoring of patients by studying similar technologies.
• Anomalies in the feature values of the dataset are identified and located.
• An architecture incorporating future perspective models is used to demonstrate the role of RBF.
• We evaluate the performance of the classification model by examining the outcomes of the suggested model.
2 Role of Ambient-Assisted Living in Heart Failure Prediction
AAL (Ambient Assisted Living) [11] is a specialized nursing technology that enhances the health of elderly people by utilizing ambient intelligence. By employing AAL technologies, it is possible to prevent, treat, and manage the health issues of the elderly, allowing them to live independently and safely. Alert systems and daily activity tracking are two more AAL options that might be helpful. The Internet of Things (IoT), which enables numerous items to interact, analyze, perceive, and act, is a natural step forward in contemporary communication. IoT sensors in an AAL environment can support senior citizens’ independence and safety. Advanced technologies such as Keep In Touch (KIT) are utilized to ensure that seniors are as secure and autonomous as possible. Closed-loop medical services may evaluate crucial data using KIT technologies and create lines of communication with senior patients’ surroundings, diverse groups of carers, and healthcare professionals. In AAL scenarios, KIT technology and closed-loop healthcare services collaborate to build an IoT infrastructure.
The three words “who, where, and why” may summarise the factors to consider while developing an AAL system. The technology must monitor people with various ailments and elderly people to assist them in daily activities. It is also important to highlight that an AAL system performs various tasks, from straightforward warnings to more thorough psychological profiling (why). Using the IoT framework and biosensors for continuous monitoring and for raising alarms at abnormal levels, the suggested solution is compatible with the present healthcare situation of elderly people with heart failure.
In conclusion, AAL technology and IoT solutions can potentially improve the quality of life of the elderly. IoT solutions such as KIT and location-based services can make aging in one’s own residence easier and promote functional independence. Developing an AAL system involves considerations such as whom the technology is designed for, where it will be deployed, and why it is needed. Overall, AAL and IoT solutions are valuable tools to provide better care for the elderly and enhance their quality of life.
3 Dataset
A heart failure clinical records dataset is a collection of medical data related to patients diagnosed with heart failure. Such a dataset typically includes demographic information, medical history, laboratory results, and medication records [12]. It may also include electrocardiogram (ECG) results, other medical imaging data, and patient outcomes and follow-up information. A heart failure clinical records dataset can be used to develop predictive models and decision support systems that help healthcare providers diagnose and treat heart failure more accurately. It can also be used to analyze trends and risk factors associated with heart failure, enabling the identification of high-risk patients and the development of preventative measures [13]. The attributes of the dataset are explained in Table 1. The distributions of the attributes (12 predictors and 1 target) are shown as graphs in Fig. 2.
4 Correlation Heatmap
Correlation between dataset attributes describes how two or more variables change in relation to one another, that is, the extent to which they are related. When two variables are positively correlated, one tends to rise as the other rises. When two variables are negatively correlated, one tends to fall as the other rises. The Pearson correlation coefficient, Spearman’s rank correlation coefficient, and Kendall’s rank correlation coefficient are among the measures used to assess the relationship between two variables. Pearson’s correlation coefficient, which ranges from −1 to 1, quantifies the linear relationship between two variables: a value of −1 represents a perfect negative correlation, 1 a perfect positive correlation, and 0 no correlation.
In data analysis, understanding the correlation between variables is important because it provides insight into the relationships between variables and can help to identify which variables to include in a model or analysis. For example, if two variables are highly correlated, including both in a model may not be necessary, as one of them may provide redundant information. Additionally, if two variables are negatively correlated, they may be important to consider together in a model, as they may have opposing effects on the outcome being studied. Figure 3 shows the correlation matrix of attributes.
In conclusion, the correlation between attributes of a dataset is an important concept in data analysis, as it provides insight into the relationships between variables and helps to identify important variables to include in a model or analysis. It is also useful in identifying confounding variables and avoiding incorrect inferences about relationships between variables (Table 2). The Pearson’s correlation coefficient formula is shown in Eq. 1.

$$ r = \frac{\sum (x - x_{mean}) \times (y - y_{mean})}{n \times std_x \times std_y} \tag{1} $$

where:
• x and y are the two variables being compared.
• x_mean and y_mean are the mean values of x and y, respectively.
• n is the sample size.
• std_x and std_y are the standard deviations of x and y, respectively.

Table 1 Summary of the dataset
| No | Symbol | Description | Measurement | Type | Data range | Mean | STD | Heart failure (true) Mean | Heart failure (true) STD | Heart failure (false) Mean | Heart failure (false) STD |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Age | Patient’s age | Years | Numeric | Numeric | 60.833 | 11.894 | 58.761906 | 10.63789 | 65.215281 | 13.214556 |
| 2 | Anaemia | Reduction in the number of red blood cells or hemoglobin | Boolean | Binary | 0 = No; 1 = Yes | – | – | – | – | – | – |
| 3 | Creatinine_phosphokinase | Amount of CPK in blood | mcg/L | Numeric | [23–7861] | 581.849 | 970.287 | 540.054187 | 753.7995 | 670.197917 | 1316.58064 |
| 4 | Diabetes | If the patient has diabetes or not | Boolean | Binary | 0 = No; 1 = Yes | – | – | – | – | – | – |
| 5 | Ejection_fraction | Percentage of blood leaving the heart at each contraction | Percentage | Numeric | [14–80] | 38.083 | 11.834 | 40.266010 | 10.85996 | 33.468750 | 12.525303 |
| 6 | High_blood_pressure | Whether the patient is hypertensive or not | Boolean | Binary | 0 = No; 1 = Yes | – | – | – | – | – | – |
| 7 | Platelets | Platelets in the blood | Kilo platelets/mL | Numeric | [25100–850000] | 263358.02 | 97804.23 | 266657.489 | 97531.20228 | 256381.044792 | 98525.6828 |
| 8 | Serum_creatinine | Level of serum creatinine in the blood | mg/dL | Numeric | [0.5–9.4] | 1.393 | 1.034 | 1.184877 | 0.654083 | 1.835833 | 1.46856 |
| 9 | Serum_sodium | Level of serum sodium in the blood | mEq/L | Numeric | [113–148] | 4.412 | 137.216749 | 137.21674 | 3.982923 | 135.375000 | 5.00157 |
| 10 | Sex | Woman or man | Binary | Binary | 0 = Female; 1 = Male | – | – | – | – | – | – |
| 11 | Smoking | If the patient smokes or not | Boolean | Binary | 0 = No; 1 = Yes | – | – | – | – | – | – |
| 12 | Period | Follow-up period | Days | Numeric | [4–285] | 130.260 | 77.614 | 158.33990 | 67.742872 | 70.885417 | 62.378281 |

Fig. 2 Graph for distributions of attributes (Rows in Dataset: 299. Death Event True: 96. Death Event False: 203)

Fig. 3 Correlation Matrix of attributes of the dataset

Table 2 Attributes and correlation coefficient relation
| Attribute | Correlation coefficient |
|---|---|
| Age | 0.25 |
| ejection_fraction | −0.27 |
| serum_creatinine | 0.29 |
| serum_sodium | −0.20 |
| Time | −0.53 |
“Age”, “ejection_fraction”, “serum_creatinine”, “serum_sodium”, and “time” all significantly correlate with the characteristic “DEATH EVENT”
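As a minimal sketch of Eq. 1, the snippet below computes the Pearson coefficient directly from the definition, using population standard deviations so that the factor n in the denominator matches. The function name `pearson_r` and the toy arrays are illustrative assumptions and are not taken from the heart failure dataset.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation following Eq. 1 (population standard deviations)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    cov_sum = np.sum((x - x.mean()) * (y - y.mean()))
    # np.std uses ddof=0 by default, which pairs with the factor n in Eq. 1.
    return cov_sum / (n * x.std() * y.std())

# Hypothetical toy values: a follow-up time and a binary outcome.
follow_up_days = np.array([10.0, 30.0, 60.0, 120.0, 200.0, 250.0])
death_event = np.array([1.0, 1.0, 1.0, 0.0, 0.0, 0.0])

r = pearson_r(follow_up_days, death_event)
print(round(r, 3))  # negative: longer follow-up goes with fewer events here
```

Applying the same function to every pair of columns would give a correlation matrix of the kind summarized in Fig. 3.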
5 Proposed Methodology
The proposed methodology is illustrated in Fig. 4, and this section looks into each aspect in more detail. The primary components of the methodology are (1) preprocessing the data, (2) RBF with metaheuristic optimization, and (3) the output predictions, which give the binary classification of heart failure.
Fig. 4 A proposed methodology for RBF with Genetic algorithm for Heart failure prediction
5.1 Preprocessing
5.1.1 Missing Data
Missing data can pose a significant challenge when preprocessing data for a machine learning model. When data is missing, it can affect the quality of the model and its predictions. Handling missing data involves identifying the missing values, analyzing their pattern, and then either imputing them or removing the records that contain them. In this dataset, none of the columns have any missing values.
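The three steps described above (identify, then impute or remove) can be sketched as follows. The small matrix is a hypothetical example with NaN marking missing entries; the dataset used in this chapter has no missing values, so this is only an illustration of the general procedure.

```python
import numpy as np

# Hypothetical 4x3 feature matrix; NaN marks a missing entry.
X = np.array([
    [60.0, 38.0, 1.1],
    [55.0, np.nan, 0.9],
    [72.0, 25.0, np.nan],
    [48.0, 45.0, 1.0],
])

# Step 1: identify missing values per feature (column).
missing_per_column = np.isnan(X).sum(axis=0)

# Option A: remove the records that contain any missing value.
complete_rows = X[~np.isnan(X).any(axis=1)]

# Option B: impute each missing value with the mean of the observed
# values in the same column.
col_means = np.nanmean(X, axis=0)
X_imputed = np.where(np.isnan(X), col_means, X)

print(missing_per_column)
```

Mean imputation is only one possible choice; the appropriate strategy depends on the pattern of the missing data.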
5.1.2 Balancing Dataset
Imbalanced datasets can lead to biased model predictions and poor performance on minority class samples. This can result in poor accuracy and increased false negatives, leading to a skewed understanding of the data distribution. Additionally, it can be challenging to train machine learning algorithms effectively on imbalanced datasets. A balanced dataset can be obtained with one of the following three approaches:
• Undersampling the majority class: This can result in the loss of important information and lead to overgeneralization. It can also increase the variance of the model, making it less robust and potentially biasing results toward the minority class.
• Oversampling the minority class: This can lead to overfitting, decreased diversity in the training set, and an unrealistic representation of the true class distribution.
• Oversampling the minority class using SMOTE: SMOTE (Synthetic Minority Oversampling Technique) is a popular oversampling technique for balancing imbalanced datasets [14]. Instead of just reproducing instances that already exist, it creates synthetic samples of the minority class, as represented in Fig. 5. The synthetic samples are generated by interpolating between existing minority samples in feature space, using nearest neighbors. The advantage of SMOTE is that it can increase the size of the minority class without introducing duplicates, which reduces overfitting and maintains diversity in the training set. SMOTE can also help improve the performance of machine learning algorithms on imbalanced datasets.
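The interpolation step at the core of SMOTE can be sketched as below. This is a simplified illustration written with NumPy only, not the implementation used in the study; the function name `smote_sketch`, the neighbour count `k`, and the toy points are assumptions made for the example.

```python
import numpy as np

def smote_sketch(X_min, n_new, k=3, seed=0):
    """Create n_new synthetic minority samples by interpolating between a
    randomly chosen minority sample and one of its k nearest minority neighbours."""
    rng = np.random.default_rng(seed)
    X_min = np.asarray(X_min, dtype=float)
    # Pairwise Euclidean distances within the minority class.
    dists = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    np.fill_diagonal(dists, np.inf)                # a sample is not its own neighbour
    neighbours = np.argsort(dists, axis=1)[:, :k]  # indices of the k nearest neighbours

    synthetic = np.empty((n_new, X_min.shape[1]))
    for j in range(n_new):
        i = rng.integers(len(X_min))               # pick a minority sample
        nn = X_min[rng.choice(neighbours[i])]      # pick one of its neighbours
        gap = rng.random()                         # interpolation factor in [0, 1)
        synthetic[j] = X_min[i] + gap * (nn - X_min[i])
    return synthetic

# Hypothetical minority-class points in a two-dimensional feature space.
X_minority = np.array([[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3]])
X_new = smote_sketch(X_minority, n_new=6)
print(X_new.shape)
```

Because every synthetic point lies on a segment between two existing minority samples, the new samples stay inside the region already occupied by the minority class rather than duplicating it.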
5.1.3 Outlier Analysis
An outlier is an observation that differs considerably from most data points in a dataset. Such observations can result from measurement errors or incorrect data entry, or can simply represent a rare occurrence. Outliers can significantly affect statistical analysis and machine learning algorithms, leading to incorrect results or conclusions [15], so it is important to identify and handle them carefully.
DBSCAN (Density-Based Spatial Clustering of Applications with Noise) is an unsupervised, density-based clustering method that can be used for outlier identification. It does not require the number of clusters to be specified in advance, which makes it well-suited for outlier detection in large and complex datasets. It can identify arbitrarily shaped clusters, including clusters with irregular or elongated shapes, and it does not require knowledge of the distribution of the data, making it suitable for non-normally distributed data. The two most crucial parameters that must be specified for DBSCAN are epsilon (ε) and MinPts, and each has its own impact on the result.
Fig. 5 Oversampling the minority class using smote
(a) Selection of EPS using silhouette_score
In a clustering technique, the silhouette score measures how well-separated the clusters are. It takes a value between −1 and 1 and compares how similar a sample is to its own cluster with how similar it is to other clusters. The silhouette score can be used to evaluate the performance of a clustering algorithm and to choose the right number of clusters or parameters for the algorithm. A high silhouette score indicates that the samples in a cluster are similar to each other and different from the samples in other clusters. A low silhouette score suggests that the samples in a cluster are not sufficiently distinct from the samples in other clusters. A single score for the entire dataset is obtained by calculating the silhouette score for each sample and then averaging the results. Equation 2 shows the silhouette score for a single sample.

$$ s = \frac{b - a}{\max(a, b)} \tag{2} $$

where a is the mean distance between the sample and all the other samples in its cluster, and b is the mean distance between the sample and all the samples in the nearest cluster. A score close to 1 means that the sample is well-separated from the samples in other clusters, and a score close to −1 means that the sample is poorly separated from the samples in other clusters. The silhouette score is then computed for a range of eps values to obtain the relationship between the eps value and the silhouette score, which can be used to choose the right eps value. A higher silhouette score indicates better separation of the clusters, so the eps value corresponding to the highest silhouette score is a good choice. Hence, eps = 2.25 is selected.
(b) Selection of MinPts
The number of dimensions D of the dataset (the number of attributes in our case, D = 12) may be used as a general guideline for the lowest MinPts, namely MinPts ≥ D + 1, so MinPts must be at least 13. Very small values are not meaningful: with MinPts = 1, every single point would constitute a cluster on its own, and with MinPts ≤ 2, the outcome is the same as hierarchical clustering using the single-link metric, with the dendrogram cut at height ε. Larger values often perform better for noisy data sets and produce more meaningful clusters. MinPts = 2 × D can be used as a general guideline, which gives MinPts = 24 here, although greater values may be necessary for very large data, noisy data, or data that contains many duplicates. The clusters formed by DBSCAN are shown in Fig. 6. By executing DBSCAN, a total of 10 outliers were detected and removed.
Fig. 6 DBSCAN clustering for Heart Failure (eps = 2.25, min_samples = 24)
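The eps selection procedure can be sketched as a small sweep: run DBSCAN for each candidate eps, compute the mean silhouette score of Eq. 2 over the clustered samples, and keep the eps with the highest score. The code below is a simplified NumPy illustration under stated assumptions: `dbscan_sketch` and `mean_silhouette` are hypothetical helper names, the data are two toy grids plus one isolated point, and the eps and MinPts values are chosen for this toy data rather than the values eps = 2.25 and MinPts = 24 reported above.

```python
import numpy as np

def pairwise_dist(X):
    return np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)

def dbscan_sketch(X, eps, min_pts):
    """Return cluster labels; -1 marks noise points (outliers)."""
    D = pairwise_dist(X)
    neigh = [np.flatnonzero(D[i] <= eps) for i in range(len(X))]  # includes the point itself
    core = np.array([len(nb) >= min_pts for nb in neigh])
    labels = np.full(len(X), -1)
    cluster = 0
    for i in range(len(X)):
        if labels[i] != -1 or not core[i]:
            continue
        labels[i] = cluster
        stack = [i]
        while stack:
            p = stack.pop()
            if not core[p]:
                continue  # border points join a cluster but do not expand it
            for q in neigh[p]:
                if labels[q] == -1:
                    labels[q] = cluster
                    stack.append(q)
        cluster += 1
    return labels

def mean_silhouette(X, labels):
    """Average of s = (b - a) / max(a, b) over clustered (non-noise) samples."""
    mask = labels != -1
    X, labels = X[mask], labels[mask]
    ids = np.unique(labels)
    if len(ids) < 2:
        return float("nan")  # undefined for fewer than two clusters
    D = pairwise_dist(X)
    scores = []
    for i in range(len(X)):
        same = labels == labels[i]
        if same.sum() == 1:
            scores.append(0.0)  # convention for singleton clusters
            continue
        a = D[i, same].sum() / (same.sum() - 1)  # exclude the sample itself
        b = min(D[i, labels == c].mean() for c in ids if c != labels[i])
        scores.append((b - a) / max(a, b))
    return float(np.mean(scores))

# Toy data: two 5x5 grids far apart plus one isolated point.
g = np.arange(5) * 0.3
grid = np.array([[x, y] for x in g for y in g])
X = np.vstack([grid, grid + 5.0, [[3.0, 12.0]]])

results = {}
for eps in (0.2, 0.5, 1.0):
    results[eps] = mean_silhouette(X, dbscan_sketch(X, eps=eps, min_pts=4))
best_eps = max(results, key=lambda e: -np.inf if np.isnan(results[e]) else results[e])
print(results, best_eps)
```

In this toy setting the smallest eps labels every point as noise, so its score is undefined, while the larger values recover both grids and flag the isolated point as an outlier.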
5.1.4 Feature Scaling and Selection
Normalization is a data preprocessing technique used in machine learning and data analysis to scale the values of variables to a similar range. In the context of a heart failure dataset, normalization can be used to bring the values of different variables onto a similar scale, allowing the algorithms to treat all variables equally. Normalization has several advantages for the heart failure dataset:
1. Improved Algorithm Performance: Normalization helps improve the performance of machine learning algorithms by reducing the influence of large-scale variables on the results.
2. Reduced Computational Complexity: Normalization can reduce the computational cost of algorithms. Normalizing the data to a smaller scale reduces the number of digits required to represent the variables, which reduces memory requirements and computation time.
3. Improved Data Interpretability: Normalization makes the results of machine learning algorithms easier to interpret by removing the influence of large-scale variables, so that results can be more easily compared and understood.
Min-Max Normalization: This technique scales the values of each variable to a range of 0 to 1 by subtracting the minimum value of the variable and dividing the result by the range of the variable.
In conclusion, normalization is an important preprocessing step in machine learning and data analysis, and it is particularly useful for the heart failure dataset. It helps improve the performance of algorithms, reduce computational cost, and improve data interpretability. We normalized the values of the platelets attribute to reduce computation and increase the algorithm’s efficiency.
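Min-max normalization as described above maps each value x to (x − min) / (max − min). A minimal sketch follows; the function name `min_max_scale` and the four platelet values are illustrative assumptions chosen within the data range listed in Table 1.

```python
import numpy as np

def min_max_scale(values):
    """Scale values to [0, 1]: (x - min) / (max - min)."""
    values = np.asarray(values, dtype=float)
    lo, hi = values.min(), values.max()
    if hi == lo:
        return np.zeros_like(values)  # constant column: map everything to 0
    return (values - lo) / (hi - lo)

# Illustrative platelet counts within the range given in Table 1.
platelets = np.array([25100.0, 263358.0, 451000.0, 850000.0])
scaled = min_max_scale(platelets)
print(scaled)
```

In practice the minimum and maximum would be computed on the training split only and then reused for the test split, so that no information from the test data leaks into preprocessing.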
5.1.5 Data Splitting
The ratio in which the heart failure dataset should be split into training and testing sets depends on several factors, including the size of the dataset, the desired level of accuracy in the model, and the complexity of the problem being solved. A commonly used ratio is 70:30, where 70% of the dataset is used for training and 30% for testing. This ratio provides a good balance between the size of the training set, which is necessary for the model to learn, and the size of the testing set, which is necessary for evaluating the model’s performance.
However, in some cases, it may be necessary to adjust the ratio to better fit the specific needs of the problem. For example, if the dataset is small, it may be necessary to use a larger testing set to ensure that the model is evaluated on a sufficient amount of data. In contrast, if the dataset is very large, it may be possible to use a smaller testing set without sacrificing the accuracy of the evaluation.
In addition to the ratio, it is also important to consider the method used for splitting the dataset. Stratified sampling, for example, can be used to ensure that the training and testing sets accurately reflect the class distribution of the original dataset. This can be particularly important for imbalanced datasets, where some classes may be underrepresented, to ensure that the model is trained and tested on a representative sample of the data.
In conclusion, a 70:30 split is a common starting point, but the ratio may need to be adjusted depending on the specific needs of the problem, and the splitting method should ensure that the training and testing sets represent the original data.
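A stratified 70:30 split can be sketched as follows: the indices of each class are shuffled separately and 30% of each class is assigned to the test set, so both sets keep the original class proportions. The function name `stratified_split` is a hypothetical helper, and the label vector simply reuses the class counts reported in the caption of Fig. 2 (203 and 96 records), before any balancing or outlier removal.

```python
import numpy as np

def stratified_split(y, test_fraction=0.3, seed=0):
    """Return (train_idx, test_idx) preserving the class proportions of y."""
    rng = np.random.default_rng(seed)
    y = np.asarray(y)
    train_idx, test_idx = [], []
    for cls in np.unique(y):
        idx = rng.permutation(np.flatnonzero(y == cls))  # shuffle within the class
        n_test = int(round(test_fraction * len(idx)))
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return np.array(train_idx), np.array(test_idx)

# Labels built from the class counts in the caption of Fig. 2.
y = np.array([0] * 203 + [1] * 96)
train_idx, test_idx = stratified_split(y)
print(len(train_idx), len(test_idx))
```

With these counts, the test set receives 61 records of one class and 29 of the other, which keeps the positive-class share close to the 32% of the full dataset.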
5.2 RBF
Artificial neural networks that employ radial basis functions as activation functions are known as radial basis function (RBF) neural networks. An RBF network is a type of feedforward neural network, meaning that information flows in only one direction through the network. RBF networks have an input layer, a single hidden layer, and an output layer. This section describes how each layer works and how the network is trained.
The input layer is the network’s first layer, consisting of input neurons that receive the input data. The input data is usually normalized so that all inputs are on the same scale. Normalization is important because some input features may have a much larger range of values than others, which can bias the network towards the features with the larger range.
The single hidden layer is the most significant layer of the network, and it contains the radial basis functions. Every neuron in the hidden layer computes the distance between its input and a center vector, and applies a radial basis function to this distance to produce its output. The center vectors are usually determined by clustering the input data using algorithms such as k-means clustering. Once the center vectors are determined, the activation of each neuron in the hidden layer is given by the radial basis function shown in Eq. 3.
$$\phi_i = e^{-\frac{\lVert \bar{X} - \mu_i \rVert^2}{2 \times \sigma_i^2}} \tag{3}$$
where
• $\bar{X}$ is the input vector
• $\mu_i$ is the prototype vector of the ith neuron
Heart Failure Prediction Using Radial Basis with Metaheuristic …
• $\sigma_i$ is the bandwidth of the ith neuron
• $\phi_i$ is the output of the ith neuron
The parameters $\mu_i$ and $\sigma_i$ are learned in an unsupervised manner, for example with a clustering algorithm.
The output layer is the network’s final layer, consisting of output neurons that produce the network’s output. The output neurons are usually linear neurons that compute a weighted sum of the activations of the hidden neurons, as shown in Eq. 4.
$$y = \sum_{i=1}^{n} w_i \phi_i \tag{4}$$
where
• $w_i$ is the weight of the connection from the ith hidden neuron
• $\phi_i$ is the output of the ith hidden-layer neuron
• $y$ is the predicted result
Training
Training an RBF neural network involves selecting the center vectors and adjusting the weights. The center vectors are selected using clustering algorithms such as k-means clustering. The k-means algorithm clusters the input data into k clusters, and the center of each cluster is used as the center vector for one RBF neuron in the hidden layer. After the center vectors are selected, the network weights are adjusted using a linear regression algorithm. The output of each neuron in the hidden layer is combined with the weights of the output neurons to produce the final output of the network. The weights are adjusted using the least-squares method to minimize the sum of the squared errors between the output of the network and the desired output.
RBF neural networks have several advantages over other neural network architectures. One of their key advantages is their capacity to approximate any continuous function to arbitrary accuracy, given enough hidden neurons. This property makes them very useful in function approximation and data modeling applications. RBF neural networks are also less sensitive to overfitting than other neural network architectures, such as multilayer perceptrons (MLPs). This is because they have a small number of weights, which reduces the risk of overfitting. Another advantage of RBF neural networks is their ability to handle noisy data. They are more robust to noise than MLPs, which makes them more suitable for real-world applications.
However, RBF neural networks also have some disadvantages. One of the main disadvantages is their computational complexity. Training an RBF neural network requires the solution of a linear system of equations, which can be computationally expensive for large datasets. Another disadvantage of RBF neural networks is their sensitivity to the number and placement of the center vectors.
Choosing the right number of center vectors and their placement is crucial for the network’s performance.
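To make the forward pass (Eqs. 3 and 4) and the least-squares step concrete, the following is a minimal pure-Python sketch, not the authors' implementation. The helper names, the toy XOR data, and the bandwidth value are assumptions; for brevity the centers are placed directly on the four training points rather than obtained with k-means, and the output weights are found by solving the normal equations with Gaussian elimination.

```python
import math

def rbf_activations(x, centers, sigmas):
    """Eq. 3: phi_i = exp(-||x - mu_i||^2 / (2 * sigma_i^2)) for every hidden neuron."""
    phis = []
    for mu, sigma in zip(centers, sigmas):
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, mu))
        phis.append(math.exp(-sq_dist / (2.0 * sigma ** 2)))
    return phis

def predict(x, centers, sigmas, weights):
    """Eq. 4: y = sum_i w_i * phi_i."""
    return sum(w * p for w, p in zip(weights, rbf_activations(x, centers, sigmas)))

def solve_linear_system(a, b):
    """Solve a @ w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    w = [0.0] * n
    for r in range(n - 1, -1, -1):                   # back substitution
        w[r] = (m[r][n] - sum(m[r][c] * w[c] for c in range(r + 1, n))) / m[r][r]
    return w

def fit_output_weights(xs, ys, centers, sigmas):
    """Least squares via the normal equations (Phi^T Phi) w = Phi^T y."""
    phi = [rbf_activations(x, centers, sigmas) for x in xs]
    k = len(centers)
    gram = [[sum(row[i] * row[j] for row in phi) for j in range(k)] for i in range(k)]
    rhs = [sum(row[i] * y for row, y in zip(phi, ys)) for i in range(k)]
    return solve_linear_system(gram, rhs)

# Hypothetical toy data (XOR); centers sit on the inputs instead of coming from k-means.
xs = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
ys = [0.0, 1.0, 1.0, 0.0]
centers = list(xs)
sigmas = [0.5] * 4
weights = fit_output_weights(xs, ys, centers, sigmas)
predictions = [predict(x, centers, sigmas, weights) for x in xs]
```

With one center per training point the network interpolates the targets, which also illustrates the sensitivity to the number and placement of centers noted above: fewer or poorly placed centers would leave a residual error.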
V. Vankadaru et al.
Fig. 7 Network architecture of RBF
The architecture of the RBF network is depicted in Fig. 7, with three layers: input, hidden, and output.
5.3 RBF with Metaheuristic Optimization
Radial basis function (RBF) is a popular activation function used in artificial neural networks that maps inputs to outputs in a smooth and non-linear manner. RBF activation functions are commonly used in various problems, including regression, classification, and function approximation. The RBF activation function approximates a target function, allowing the neural network to learn complex relationships between inputs and outputs.
Metaheuristic optimization, on the other hand, is a class of optimization algorithms inspired by natural processes such as evolution, swarm behavior, and others. One of the popular metaheuristic algorithms is the genetic algorithm (GA) [16], which is used to solve optimization problems that are difficult to solve using traditional optimization methods.
Integrating RBF with metaheuristic optimization has proven to be a powerful combination for solving various optimization problems. In this approach, RBF is used as a basis function to approximate the target function in the optimization problem, and the metaheuristic algorithm optimizes the parameters of the basis function to minimize the objective function. In the case of GA [17], the RBF parameters are optimized using a genetic representation and evolution operations such as selection, crossover, and mutation.
One of the advantages of using RBF with GA is that it helps the optimization algorithm to converge faster to the global optimum. The smooth mapping provided
by the RBF activation function reduces the risk of getting stuck in local optima, as the GA can explore the entire search space more effectively. Additionally, using RBF helps reduce the computational cost associated with evaluating the objective function, as the basis function can be used to approximate the objective function rather than evaluate it explicitly. Another advantage of using RBF with GA is that it enables the optimization algorithm to handle noisy and uncertain data effectively. The smooth mapping provided by the RBF activation function reduces the impact of noise and uncertainty on the optimization process, making it more robust and reliable. This is especially important in real-world applications where data is often noisy and uncertain.
In GA, the optimization problem is modeled as a search for the set of RBF parameters that minimizes the objective function. The GA generates a population of candidate solutions, and each candidate solution is represented as a set of parameters for the RBF basis function. The objective function is evaluated for each candidate solution, and the fitness of each candidate solution is determined from its objective function value. The GA then applies evolution operations such as selection, crossover, and mutation to generate new candidate solutions for the next generation.
Selection determines which candidate solutions will survive and contribute to the next generation. The selection operation is typically implemented using a fitness-proportionate method, such as roulette wheel selection, in which candidate solutions with higher fitness values are more likely to be selected.
Crossover generates new candidate solutions by combining the parameters of two or more selected parent solutions. Crossover can be performed using various techniques, such as single-point crossover, in which a single point is randomly selected along the parameter vector, and the parameters on either side of the point are swapped.
Mutation is used to introduce random changes into the candidate solutions. Mutation can be performed using various techniques, such as Gaussian mutation, in which the parameters of a candidate solution are perturbed by adding random noise drawn from a Gaussian distribution.
The GA continues to iterate through these evolution operations until a stopping criterion is met. The stopping criterion can be based on a maximum number of generations, a minimum change in the objective function value, or other criteria. The candidate solution with the best fitness value is then considered the solution to the optimization problem.
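The GA loop described above (roulette wheel selection, single-point crossover, Gaussian mutation, and a generation limit as stopping criterion) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' code: the function name `evolve`, all hyperparameter values, the `1 / (1 + objective)` fitness mapping, and the use of elitism are choices made for the example, and a simple sphere function stands in for the RBF network's error, which in the chapter's setting would presumably be evaluated on the heart failure data.

```python
import random

def evolve(objective, n_params, pop_size=30, generations=50, sigma=0.1, seed=0):
    """Minimize a non-negative `objective` over real-valued parameter vectors."""
    rng = random.Random(seed)
    population = [[rng.uniform(-1.0, 1.0) for _ in range(n_params)]
                  for _ in range(pop_size)]
    best = min(population, key=objective)
    history = [objective(best)]
    for _ in range(generations):                     # stopping criterion: generation limit
        # Lower objective -> higher, strictly positive fitness for the roulette wheel.
        fitness = [1.0 / (1.0 + objective(ind)) for ind in population]
        children = [best[:]]                         # elitism (an assumption of this sketch)
        while len(children) < pop_size:
            # Roulette wheel (fitness-proportionate) selection of two parents.
            mother, father = rng.choices(population, weights=fitness, k=2)
            point = rng.randint(1, n_params - 1)     # single-point crossover
            child = mother[:point] + father[point:]
            child = [g + rng.gauss(0.0, sigma) for g in child]  # Gaussian mutation
            children.append(child)
        population = children
        best = min(population, key=objective)
        history.append(objective(best))
    return best, history

def sphere(params):
    """Hypothetical stand-in objective; a real run would return the network's error."""
    return sum(p * p for p in params)

best, history = evolve(sphere, n_params=3)
```

Because the best candidate is copied unchanged into each new generation, the best objective value recorded in `history` never gets worse from one generation to the next.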
6 Experimental Setup and Results
The diagnosis of heart failure is expressed in binary form, where 0 indicates the absence of heart failure and 1 indicates its presence. The data have been divided into training and testing sets in the ratio 70:30. The results have been observed and used to calculate various metrics [18] that measure the efficiency of the model. In the formulas below, TP, TN, FP, and FN denote the numbers of true positives, true negatives, false positives, and false negatives, respectively.
Accuracy
Accuracy is a measure of the correctness of a prediction or classification model. It is defined as the ratio of the number of correct predictions or classifications to the total number of predictions or classifications made by the model. The formula for accuracy is shown in Eq. 5.
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{5}$$
Recall
Recall measures the proportion of actual positive cases correctly identified by the model in a binary classification problem. The formula for recall is shown in Eq. 6.
$$\text{Recall} = \frac{TP}{TP + FN} \tag{6}$$
Precision
Precision is a metric that evaluates how many of the positive predictions are correct. It therefore measures the accuracy specifically for the positive class. It is calculated by dividing the number of correctly predicted positive instances by the total number of instances predicted as positive. Equation 7 shows the formula for precision.
$$\text{Precision} = \frac{TP}{TP + FP} \tag{7}$$
F1-Score
The F1-score summarizes a model’s predictive ability by combining precision and recall, which are typically conflicting measures. The F1-score is often considered a more valuable metric than accuracy since it represents the harmonic mean of precision and recall. It ranges from a perfect score of 1 to a poor score of 0, making it a valuable tool for assessing a model’s performance. The formula to calculate the F1-score is shown in Eq. 8.
$$\text{F1-score} = 2 \times \frac{\text{Recall} \times \text{Precision}}{\text{Recall} + \text{Precision}} \tag{8}$$
The confusion matrix of the model is shown in Fig. 8. The accuracy of the model is 0.926829268292683. The classification report of the proposed methodology is given in Table 3.
Fig. 8 Confusion matrix
Table 3 Classification report
| Class | Precision | Recall | F1-score | Support |
|---|---|---|---|---|
| 0.0 | 0.89 | 0.94 | 0.92 | 35 |
| 1.0 | 0.96 | 0.91 | 0.93 | 47 |
| Accuracy | | | 0.93 | 82 |
| Macro avg | 0.92 | 0.93 | 0.93 | 82 |
| Weighted avg | 0.93 | 0.93 | 0.93 | 82 |
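As a worked check of Eqs. 5 to 8, the sketch below computes the four metrics from confusion-matrix counts. The counts are an assumption: Fig. 8 is not reproduced here, so TP = 43, TN = 33, FP = 2, and FN = 4 were chosen because they agree with the support values (35 and 47), the rounded class-1 row of Table 3, and the reported accuracy. They should not be read as the authors' published matrix.

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, recall, precision, and F1-score as defined in Eqs. 5-8."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    f1 = 2 * recall * precision / (recall + precision)
    return {"accuracy": accuracy, "recall": recall, "precision": precision, "f1": f1}

# Assumed counts (class 1 = heart failure), inferred from Table 3 rather than reported.
metrics = classification_metrics(tp=43, tn=33, fp=2, fn=4)
```

Under these assumed counts, 76 of the 82 test cases are classified correctly, and the precision, recall, and F1-score round to 0.96, 0.91, and 0.93.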
ROC Curve
The ROC curve can be used to evaluate the performance of a model that predicts whether a patient will develop heart failure. The model predicts a probability of heart failure for each patient. A classification threshold is set to convert these probabilities into binary predictions (heart failure or no heart failure). The ROC curve is then used to assess the ability of the model to discriminate between patients who develop heart failure (positive samples) and those who do not (negative samples). The ROC curve is important in heart failure prediction because it comprehensively evaluates the model’s performance, considering both true positives (correctly identified patients with heart failure) and false positives (patients incorrectly identified as having heart failure). By plotting sensitivity (the true positive rate) against 1 − specificity (the false positive rate), the ROC curve allows the model to be evaluated at different classification thresholds, helping to choose the best threshold for a given situation. The ROC curve for the model, drawn between the true positive rate and the false positive rate, is shown in Fig. 9.
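The threshold sweep behind an ROC curve can be illustrated with a short sketch. The labels and predicted probabilities below are hypothetical and chosen only for illustration; the function names `roc_points` and `auc` are likewise assumptions, and the area is computed with the trapezoidal rule.

```python
def roc_points(y_true, scores):
    """Return (FPR, TPR) pairs obtained by sweeping the threshold over all distinct scores."""
    positives = sum(y_true)
    negatives = len(y_true) - positives
    points = [(0.0, 0.0)]                  # threshold above every score: nothing predicted positive
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Hypothetical labels (1 = heart failure) and predicted probabilities.
y_true = [0, 0, 1, 1, 0, 1, 1, 0]
scores = [0.10, 0.35, 0.40, 0.80, 0.45, 0.90, 0.70, 0.20]
points = roc_points(y_true, scores)
area = auc(points)
```

Each point corresponds to one classification threshold, so choosing a threshold amounts to choosing a point on the curve that trades the true positive rate against the false positive rate.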
Fig. 9 Graph between true positive rate and false positive rate
Performance Comparison
To assess our model’s performance, we have compared it with other existing techniques; the results are reported in Table 4.
Table 4 Performance comparison between different models with the proposed methodology
| Model name | Accuracy |
|---|---|
| Random forest | 0.740 |
| Decision tree | 0.737 |
| Gradient boosting | 0.738 |
| Linear regression | 0.730 |
| Naive Bayes | 0.729 |
| RBF | 0.852 |
| Proposed methodology | 0.926 |
7 Conclusion
In conclusion, the proposed approach for heart failure detection using a Radial Basis Function (RBF) neural network with Genetic Algorithm (GA) optimization is a promising method for improving the early detection of heart failure. The results demonstrate that the proposed approach outperforms other machine learning techniques reported in the literature, achieving high accuracy. GA optimization enhances the performance of the RBF neural network by finding the optimal values for the network parameters. The proposed approach has the potential to be used as a productive early identification tool for heart failure, which can help improve patient outcomes and reduce healthcare costs. Early detection of heart failure can lead to early intervention and management, improving patient prognosis and quality of life. Future research can explore the use of the proposed approach in a clinical setting to evaluate its effectiveness in real-world scenarios. Furthermore, the approach can be extended to include other relevant clinical variables to improve the accuracy of heart failure detection. Overall, the proposed approach shows promise in improving the early detection of heart failure, leading to better patient outcomes and reduced healthcare costs.
References
1. Allahabadi, H., et al. (2022, December). Assessing trustworthy AI in times of COVID-19: Deep learning for predicting a multiregional score conveying the degree of lung compromise in COVID-19 patients. IEEE Transactions on Technology and Society, 3(4), 272–289. https://doi.org/10.1109/TTS.2022.3195114
2. Holzinger, A., Kargl, M., Kipperer, B., Regitnig, P., Plass, M., & Müller, H. (2022). Personas for Artificial Intelligence (AI) an open source toolbox. IEEE Access, 10, 23732–23747. https://doi.org/10.1109/ACCESS.2022.3154776
3. Wigan, M. (2022, September). Cyber security and securing subjective patient quality engagements in medical applications: AI and vulnerabilities. IEEE Transactions on Technology and Society, 3(3), 185–188. https://doi.org/10.1109/TTS.2022.3190766
4. Ju, L., et al. (2022, June). Improving medical images classification with label noise using dual-uncertainty estimation. IEEE Transactions on Medical Imaging, 41(6), 1533–1546. https://doi.org/10.1109/TMI.2022.3141425
5. Spanakis, E. G., Psaraki, M., & Sakkalis, V. (2018). Congestive Heart failure risk assessment monitoring through internet of things and mobile personal health systems. 2018 40th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), 2925–2928. https://doi.org/10.1109/EMBC.2018.8513024
6. Alnosayan, N., et al. (2014). MyHeart: An intelligent mHealth home monitoring system supporting heart failure self-care. 2014 IEEE 16th International Conference on e-Health Networking, Applications, and Services (Healthcom), 311–316. https://doi.org/10.1109/HealthCom.2014.7001860
7. Jin, B., Che, C., Liu, Z., Zhang, S., Yin, X., & Wei, X. (2018). Predicting the risk of heart failure with EHR sequential data modeling. IEEE Access, 6, 9256–9261. https://doi.org/10.1109/ACCESS.2017.2789324
8. Zhang, P., Zhou, X., Pelliccione, P., & Leung, H. (2017). RBF-MLMR: A multi-label metamorphic relation prediction approach using RBF neural network. IEEE Access, 5, 21791–21805. https://doi.org/10.1109/ACCESS.2017.2758790
9. Wang, B., et al. (2019). A multi-task neural network architecture for renal dysfunction prediction in heart failure patients with electronic health records. IEEE Access, 7, 178392–178400. https://doi.org/10.1109/ACCESS.2019.2956859
10. Yang, L., Mingyong, L., Xiaojian, Z., & Xingguang, P. (2018, August). Global approximation based adaptive RBF neural network control for supercavitating vehicles. Journal of Systems Engineering and Electronics, 29(4), 797–804. https://doi.org/10.21629/JSEE.2018.04.14
11. Ahmed, S., Srinivasu, P. N., Alhumam, A., & Alarfaj, M. (2022). AAL and internet of medical things for monitoring type-2 diabetic patients. Diagnostics, 12, 2739. https://doi.org/10.3390/diagnostics12112739
12. Chicco, D., & Jurman, G. (2020). Machine learning can predict survival of patients with heart failure from serum creatinine and ejection fraction alone. BMC Medical Informatics and Decision Making, 20, 16. https://doi.org/10.1186/s12911-020-1023-5
13. Ahmad, T., Munir, A., Bhatti, S. H., Aftab, M., & Raza, M. A. (2017). Survival analysis of heart failure patients: A case study. PLoS One, 12(7), e0181001. https://doi.org/10.1371/journal.pone.0181001
14. Chawla, N. V., et al. (2002). Smote: Synthetic minority over-sampling technique. Journal of Artificial Intelligence Research, 16, 321–357. https://doi.org/10.1613/jair.953
15. Monalisa, S. (2018). Analysis outlier data on RFM and LRFM models to determining customer loyalty with DBSCAN algorithm. 2018 International Symposium on Advanced Intelligent Informatics (SAIN), 1–5. https://doi.org/10.1109/SAIN.2018.8673380
16. Qian, Y.-l., Zhang, H., Peng, D.-g., & Huang, C.-h. (2012). Fault diagnosis for generator unit based on RBF neural network optimized by GA-PSO. 2012 8th International Conference on Natural Computation, 233–236. https://doi.org/10.1109/ICNC.2012.6234708
17. Li, H.-b., Hao, S., Zhang, X.-l., Lai, Y.-j., & Qi, Q. (2016). Nonlinear identification of triple inverted pendulum based on GA-RBF-ARX. 2016 35th Chinese Control Conference (CCC), 1975–1980. https://doi.org/10.1109/ChiCC.2016.7553656
18. Srinivasu, P. N., Shafi, J., Krishna, T. B., Sujatha, C. N., Praveen, S. P., & Ijaz, M. F. (2022). Using recurrent neural networks for predicting type-2 diabetes from genomic and tabular data. Diagnostics, 12(12), 3067. https://doi.org/10.3390/diagnostics12123067
Healthcare Management and Prediction of Future Illness Through Autonomous Intelligent Advisory System Using AAT Computational Framework
Haritha Akkineni, Madhu Bala Myneni, Y. Suresh, Siva Velaga, and P. Phani Prasanthi
Abstract The technological evolution of IT healthcare systems has moved from Electronic Health Records (EHR) to personalized health support systems with ambient assistive technologies (AAT). This chapter covers healthcare management tools and techniques, with a major focus on technological innovations and disruptions; services and tools; recommendation systems in health care; and the proposed Autonomous Intelligent Advisory System, termed H-Pilot, built on the Social Internet of Things (SIoT) computational framework. The SIoT framework in H-Pilot is an integrated application that supports the flow of app information from interconnected apps in various verticals. It provides various AAT services such as alerts and recommendations, activity monitoring with reports, communication among various devices, feedback support, navigation, and emergency health monitoring through various existing autonomous apps. The information must be blended to generate overall actions that increase efficiency and effectiveness in satisfying the tasks and needs of individuals, with personalized recommendations for improving quality of life (QOL). To make the study’s contributions more evident, a sample user story is depicted to show the usage of H-Pilot in enabling person-centric healthcare using Ambient Assistive Technologies.
H. Akkineni (B) · Y. Suresh
Department of IT, PVP Siddhartha Institute of Technology, Vijayawada, India
e-mail: [email protected]
M. Bala Myneni
Department of CSE, VNR Vignana Jyothi Institute of Engineering and Technology, Hyderabad, India
S. Velaga
Sri Ramanuja Junior College, Visakhapatnam, India
P. Phani Prasanthi
Department of Mechanical Engineering, PVP Siddhartha Institute of Technology, Vijayawada, India
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023
P. Barsocchi et al. (eds.), Enabling Person-Centric Healthcare using Ambient Assistive Technology, Studies in Computational Intelligence 1108, https://doi.org/10.1007/978-3-031-38281-9_7
H. Akkineni et al.
1 Introduction
1.1 The Role of Information Technology Services in Healthcare
Technological advancements and the influence of technology on healthcare management have been a great advantage in recent times. Technology has not only helped reduce the cost of managing an individual’s health, but has also brought efficiency to the diagnosis and treatment of individual ailments, besides supporting the smooth functioning of healthcare services. Historically, the adoption of technology in healthcare was delayed, but once adopted, technology has been a key driver in introducing ground-breaking procedures in critical surgeries and treatments of various ailments. This has not only improved efficiency in hospital care but has also had a direct influence on various support systems and governing/regulatory bodies in healthcare.
Traditionally, medical facilities depend on the individual expertise of super specialists, experienced doctors, nurses, scientists, etc. To build robust tools and applications that can support a high-quality patient healthcare experience, the US Congress, through the Health Information Technology for Economic and Clinical Health (HITECH) Act of 2009, compelled medical facilities to use electronic health records (EHR), thereby revolutionizing their IT systems. EHRs with well-maintained patient data can be deciding factors for the many healthcare activities that depend heavily on data. Due to the continuous modernization and digitization of medical facilities, the need for robust and responsive IT support systems is increasing. The quality and accuracy of EHR data have assisted hospital management in making many key decisions at various levels. A survey of 6375 doctors in the United States, published in Mayo Clinic Proceedings, found that 84.5% of them used EHRs.
Over time, it has become clear that the EHRs used across the medical landscape have quite a lot of issues that must be addressed before they can live up to expectations. Many hospital-based doctors use EHR systems, and most of these doctors are calling for change: 63% of doctors recognize that EHRs help in improving patient care, 71% of doctors think that EHRs are a cause of physician burnout, and 69% of doctors indicated that most of their time was spent writing and reading EHRs, leaving less time for patients. This is a significant underlying factor in the demand for change in the available platforms, and innovations in robust IT applications and other ICT products are essential [1].
At its outset, IT in healthcare is all about means of communication between team members, devices, patients, their medical providers, etc. This has been implemented by establishing secure communication channels that are integrated with various supporting applications and can share critical patients’ various decision
support systems. These components are scalable and flexible so that efficiency and effectiveness will improve in delivering healthcare services. Technology is used effectively only when it is designed ergonomically, with robust interfaces and a focus on human factors, to support implementation and maintenance activities. These IT services help the systems to become independent, autonomous, and self-driven.
With rapid evolution, IT in healthcare focuses on the development of data-centric desktop, web-based, and mobile applications that support data collection, analysis, and visualization. These applications meet the demand for hospital care, quality delivery, and cost efficiency. But this approach has not addressed the major problem of delivering healthcare services on time when demanded. This has led to the need for applications that are robust and supported by real-time and complex algorithms, leading to Artificial Intelligence (AI) based systems and processes. Medical data processing through machine learning gives important insights, improving health outcomes and patient experiences. IBM Watson incorporates new technology in the medical domain to comprehend a medical condition, analyze data, and identify the result. It assists clients with digital solutions and professional advising, enabling health providers to become more efficient, resilient, and robust. The services extend to diagnostic and treatment assistance and evidence-based insights, and offer collaborative medicine in the field of healthcare. These AI-based systems help improve healthcare efficiency by navigating between structured and unstructured data, quality standards, and the most recent evidence-based medicine, generating individualized care plans and providing treatment recommendations [2]. The services include accelerating the drug recovery process and addressing long waits for clinical trials of medicines; reducing pre-screening wait times; and helping healthcare transformation and value-based care [3].
In the next era of the evolution of IT in healthcare, digital health trends like virtual reality surgical training, telemedicine, and IoT devices continue to attract massive investment, assisting the industry in improving health equity globally. This led to the evolution of multi-agent systems (MAS) [4]. An individual agent represents an autonomous entity capable of managing complex problems through interaction with other agents. This has been made possible by the use of Ambient Intelligence (AmI). AmI is compatible with the idea of a “disappearing computer” [5]. It is closely associated with the domains of human-centric computer interaction design, pervasive computing, and ubiquitous computing. These systems exhibit features like being embedded, context-aware, personalized, adaptive, anticipatory, unobtrusive, and non-invasive. The inclusion of Ambient Intelligence has eased access to and use of ambient assistive technologies. In earlier days the major focus was on doctors having access to patients’ health records through EHR; this has been transformed into personalized health support systems with the use of AAT. The various technological evolutions in IT healthcare systems are depicted in Fig. 1.
Fig. 1 Technological evolutions of IT systems in healthcare
1.2 Ambient Assistive Technologies
Dependency on technology is increasing due to smartphones and virtual personal assistants. Technological innovations focus on improving quality of life through features such as adaptivity, context awareness, and embeddedness. Innovations in Ambient Assistive Technologies (AAT) draw on ambient intelligence to achieve these features, allowing older adults to lead an active, independent life and stay socially connected in old age. The AAT market is becoming a vital component in providing health and social care to the elderly to promote active aging. Engaging older adults with these technologies is challenging. Research states that the number of senior citizens requiring assisted living will increase by 75% in the US, and that the world population in the senior citizen age group will exceed 2 billion by 2050. The aging of society, higher life expectancies, low fertility, and fewer caregivers can be considered the main contributors to such high dependency among older people. There is a dire need for human-centric technologies or devices that intelligently support people in their day-to-day living and provide necessary care for the elderly [6]. To address the issues related to multimorbidity in older adults, geriatric medicine has shown considerable reliance on ambient assistive technologies. The advancement in modern medical technology, ranging from complex health-supporting systems to diagnostic and drug-delivery devices and surgical tools for older adults, has changed the face of assistive environments and, with it, geriatric medicine. The delivery of assistive products and services through ambient and mobile systems, enabling the elderly to live independently and promoting their health and well-being, is known as assistive technology [7]. Older people prefer to spend their retirement in their familiar home environment rather than at aged care centres, where they feel a loss of independence. The trend of aging in place is becoming more common because it gives them the feeling that they can still exercise their decision-making skills. Assistive technologies make care at home easier. Such inclusion of and reliance on Ambient Assistive Technologies can benefit not only older adults but also their caretakers.
1.2.1 Generations of Ambient Assistive Technologies
The categorization of technologies based on generations is shown in Fig. 2.
Fig. 2 Categorization based on generations
The First Generation of AAL Technologies: These technologies mainly focus on personal response systems using wearable devices equipped with an alarm. In an emergency such as a fall, the older adult presses a button, which raises the alarm and calls for help [8]. This alerts the personnel at the call center, who initiate further steps. One such example is Life Alert, a medical alert system specifically designed for senior citizens to enable independent living by protecting them during home health emergencies. User initiation is the key enabler in these technologies [9]. These technologies proved advantageous in meeting the safety and security requirements of older adults, for example by reducing stress levels and hospital admissions. They have limitations in situations where the person is not in a state to trigger the alarm or is not wearing the device.
Second-Generation Technologies: The downsides became evident through observation and extended use of first-generation technologies, which mostly rely on user initiation. In contrast, second-generation technologies integrate electronic components, called sensors, to monitor senior citizens. The core principle of these technologies is self-detection and autonomous action: emergencies are detected directly, without user involvement. For example, sensors can directly sense a gas leak in a home where an elderly person is incapacitated and trying to seek help [10]. This proves more usable and beneficial for older adults with mild cognitive impairments, who face many difficulties in using household appliances, such as forgetting to switch off electrical appliances. The acceptance of these systems is limited because users find them intrusive.
Third-Generation Technologies: The advancement of ICT led to the emergence of AAL technologies. Third-generation technology goes far beyond self-detection and autonomous action; it employs the concept of ambient intelligence to focus on preventive mechanisms. AAL systems possess ambient intelligence if they are sensitive and responsive and are unobtrusively embedded into the daily environment [11]. One such system is the smart home, which employs non-intrusive methods and reduces the dependency on manual supervision. It uses various components like sensors, household appliances, actuators, and human–machine interfaces for communication, and analyses the collected data about the activities of the residents. The system can integrate assistive devices with cloud technology into everyday living, realizing the cloud of care concept. It supports older people in managing daily activities by observing changes in activity patterns and monitoring any situation that might need attention.
Intelligent systems and remote services are being increasingly adopted in this domain with a potential benefit in the reduction of stigma associated with monitoring and assistance devices by embedding the technology invisibly within everyday objects.
1.2.2 AAT Categorization Based on Features
The assistive technologies can also be categorized based on the activity they take up which aims at delivering comprehensive solutions to improve quality of life. The categorization of AAT is shown in Fig. 3. Social inclusion and communication: Getting socially connected with the activities like communication and interaction with family and friends can greatly help in preventing functional decline. Assistive Technologies that support social inclusion are very useful for older adults to improve their well-being. One such example is MobiAssist [12] which supports social interaction and increases empowerment using serious games, and simultaneously builds intergenerational relationships between the carer and the individual to improve the quality of care [13].
Healthcare Management and Prediction of Future Illness Through …
Fig. 3 Feature-based categorization of AAT
Telemedicine, Telehealth and Telecare: To support the concept of aging in place, elderly people can be monitored remotely for health problems. Such a service, termed telehealth or telecare, improves comfort, care management, and autonomy, and removes the constraints of travel and distance in accessing appropriate healthcare facilities. MEDIBUDDY is one such example. It provides an end-to-end digital platform connecting doctors, hospitals, healthcare providers, pharmaceutical companies, and insurance companies for people looking for healthcare assistance. Its services include online doctor consultations, lab test bookings, medicine delivery, and corporate health and wellness services.
Entertainment and Media: Physical activity and cognitive training are essential for the elderly. Ambient assistive technologies provide them by integrating serious games and smart objects that combine multimedia, entertainment, and training; applying these in therapy for chronic conditions such as dementia can bring additional benefits. Exercise gaming at home can delay the occurrence of symptoms such as forgetfulness or disorientation [14, 15] and has positive effects on pain perception, health, and the overall well-being of the elderly. One example is “Cook it Right”, a game designed to promote executive functioning by incorporating display surfaces and sensors in AAL [16].
Sensing and Interacting: Under the active aging paradigm, assistive technologies can give the elderly a familiar, intelligent home environment enabled with monitoring,
H. Akkineni et al.
from localization and fall detection to non-intrusive healthcare services that provide the necessary care at critical times. One such case is a non-invasive glucose-level sensing approach [17] for the timely delivery of patient data to healthcare professionals [18].
1.3 Ambient Assistive Living
Ambient Assistive Living (AAL) aims to enable aging in place by ensuring that the elderly can stay safely at home using ambient assistive technologies such as smart devices, wireless networks, software applications, and sensors. It can be regarded as a boon to older adults, as it supports their wellness through prevention and cure. Demographic change has created a huge demand for AAT and, in turn, for AAL. It can be a path breaker in providing a superior quality of life by enabling the elderly to live by themselves without anyone’s support. The basic aspects that need to be focused on during the design and development of AAL systems are shown in Fig. 4.
Fig. 4 Fundamental aspects of AAL systems
1. Targeted users: All elderly people and people with different abilities who need to be monitored in order to prolong their autonomous life are generally the primary target group. Targeted users are categorized into three groups: youngest-old, middle-old, and oldest-old. According to medical research, these age-based categories are the youngest-old, who
are in the 65–74 age group; the middle-old, who are in the 75–84 age group; and the oldest-old, who are aged 85 years or older.
2. Application environments: Both indoor and outdoor spaces are considered as application environments. For indoor spaces, new systems and methods have been developed to access data from heterogeneous sources for constant monitoring of behavioural and environmental data and for assessing their outcomes [19]. To support active aging, the elderly are encouraged to be spiritually and socially active. As part of this, they are guided to spend time in outdoor spaces with AAL systems that help them check routes when they become confused and lose their sense of place, systems that recognize anomalous events such as falls, systems for evaluating motion activities, and so on.
3. Features: Features can span different domains, such as vital sign monitoring, disease management, behavioural change management, environment monitoring, and social activity monitoring. A system can range from a simple alerting system to one with a user-centered design and non-obtrusive, human-like components in smart homes that improve residents’ safety by predicting situations such as falls.
From a socio-economic perspective, Ambient Assistive Living focuses on factors such as the aging population, increasing dependency, a decreasing supply of healthcare, higher costs of care, and aging at home. As health ailments and dependency increase with age, demand for living concepts and supply measures has risen. Aging in place is the “Ability to live in one’s own home and community independently, safely and comfortably, regardless of age, income, morbidity level” [20]. AAL supports aging in place through various verticals: providing life satisfaction and well-being, sensing, and supporting housing solutions that are adaptive, sensitive, and responsive to human needs. The concept of AAL has seen a gradual improvement in using AAT.
It evolved from the use of devices such as wheelchairs, blood pressure monitors, vital parameter monitors, and floor sensors for fall detection [21]. Ambient Assistive Technologies thus strive to satisfy the psychophysiological needs and demands of the elderly, aiming to create a “cloud of care” [22]. AAT and AAL are becoming topics of global interest as they address the socioeconomic needs of nations. AAL is an evolving interdisciplinary field that provides solutions to the challenges of a growing aging population. It involves knowledge and communication technology in health and telemonitoring systems. Today’s lifestyle integrates the digital environment into everyday activities. Advancements in mobile, portable, and digital sensor technology, mobile networks, and health-tracking apps [23] have been the main contributors to realizing the vision of AAL. AAL uses ambient intelligence strategies to provide comfortable living for the elderly, ensuring their protection and well-being despite their behavioural problems. It also enables independent
monitoring of their medical ailments with the support of AAT, and as such reduces the unforeseen costs that are likely to arise in future healthcare.
1.4 Ambient Assistive Technologies for Healthcare
Ambient Assistive Technologies for the healthcare industry began with Internet of Things based solutions. A series of technological and cultural revolutions in the medical industry led to the invention of products that support the monitoring, diagnosis, and treatment of health ailments. Products equipped with connected medical devices are known as the Internet of Medical Things (IoMT). The increase in connected medical devices paved the way for IoMT, together with Ambient Assistive Technologies, to transform healthcare. IoMT bridges the physical and digital worlds by generating, collecting, and transmitting health data to healthcare providers’ networks, the cloud, and internal servers. By combining these technologies, it is possible to monitor patients’ behavior in real time, manage patients’ chronic conditions, streamline clinical processes and information flows, and bring people, processes, data, and enablers together to improve healthcare delivery.
A Few Prominent Frameworks of AAT in Healthcare: The 4P medicine framework addresses predictive, preventive, personalized, and participatory solutions with AAT embedding IoMT. The data generated from these processes can support new models of healthcare and deliver improved patient outcomes efficiently. These outcomes include improved drug management, decreased costs, improved diagnosis and treatment, enhanced patient experience, and improved disease management. These frameworks consider various environments or devices, such as smart things, telehealth, wearables, remote health monitoring, and medication management, connected by physicians, family, and caretakers to support the elderly. A general framework of Ambient Assistive Technologies (AAT) in healthcare is shown in Fig. 5.
Drug management solutions: AAT has emerged as a provider of innovative drug management solutions, including medical dispensers that detect whether or not a drug has been taken. Devices with reminder functions that monitor water and medication intake are extremely useful in geriatric care.
Behavior Monitoring and modification: The physical activity and behavior of the elderly are monitored so that caretakers can tailor supportive tasks to the needs and habits of each individual, and so that the right care can be given at the right time. AAT has explored various prediction methods that use environmental changes and intelligent mechanisms for accurate real-time monitoring of basic Activities of Daily Living (ADLs).
Hand Hygiene Compliance: Some real-time vision-based systems have been designed to support people with dementia in washing their hands. Such a system captures video input and provides visual or verbal assistance for sequential activities like
Fig. 5 A general framework of Ambient Assistive technologies (AAT) in healthcare
tracking hands and towels. The AAL system can adapt to the user’s psychological state, such as awareness and responsiveness [24].
Remote Patient Monitoring: AAT provides people with remote assistance for chronic disease prevention. It can detect anomalous daily behaviors in elderly people that need immediate medical attention, such as the inability to stand up, a loss of consciousness, and prolonged periods of inactivity.
Telehealth: Telehealth solutions such as vAssist provide voice-controlled communication and home care services. vAssist provides telemedical and communication features using multilingual natural voice interaction, thereby improving the quality of home care and offering interaction mechanisms that facilitate aging in place [25].
Assistive Living: AAT integrates modern technologies into the lives and homes of the elderly, ensuring safety and security through devices for emergency tracking and health monitoring. These systems use television and telephone platforms to monitor vital signs and connect individuals with emergency services, allowing easy access to health information [26].
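As an illustration of the prolonged-inactivity detection mentioned under Remote Patient Monitoring, the following minimal Python sketch flags long gaps between consecutive motion-sensor events. The function name `find_inactivity_gaps`, the two-hour threshold, and the sample timestamps are assumptions made for this example; they are not taken from any system cited in this chapter.

```python
from datetime import datetime, timedelta


def find_inactivity_gaps(event_times, max_gap):
    """Return (start, end) pairs where consecutive events are more than max_gap apart."""
    ordered = sorted(event_times)
    gaps = []
    # Compare each event with the one that follows it.
    for previous, current in zip(ordered, ordered[1:]):
        if current - previous > max_gap:
            gaps.append((previous, current))
    return gaps


# Hypothetical motion-sensor timestamps for one morning.
events = [
    datetime(2023, 1, 1, 8, 0),
    datetime(2023, 1, 1, 8, 20),
    datetime(2023, 1, 1, 12, 30),
    datetime(2023, 1, 1, 12, 45),
]

# Assumed threshold: more than two hours without motion is worth a caretaker alert.
gaps = find_inactivity_gaps(events, timedelta(hours=2))
for start, end in gaps:
    print(f"No activity between {start:%H:%M} and {end:%H:%M}")
```

A deployed system would likely adapt the threshold to the time of day and to the resident's usual routine, since a long gap at night is normal whereas the same gap at midday may not be.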
1.5 Benefits of AAL Technology Usage
The benefits of AAL technology usage accrue mainly to inhabitants and caretakers and are shown in Fig. 6. AAL technology addresses activities of daily living from the perspective of safety, independence, and emergencies, and also helps caretakers or family members look after the elderly with minimal effort.
Fig. 6 Benefits of AAL technology
1.6 Challenges at the User and Caretaker Level
Various user characteristics affect the acceptance of AAL technology along several evaluation dimensions, including benefits, challenges, technologies, usage of and access to data, and storage duration. Various challenges or barriers to Ambient Assistive Living (AAL) technology usage are given in Fig. 7.
User Acceptance and Privacy: The adoption of AAT in healthcare faces numerous challenges, the major one being acceptance of these technologies by the elderly. The elderly may feel that constant monitoring by various sensors is too intrusive into their personal lives and that their data might be misused. Such worries and
Fig. 7 Various challenges included in the usage of Ambient Assistive Living (AAL) technology
misconceptions must be handled properly, with a careful balance between privacy and protection.
Behavior Recognition in Complex Scenes: Occlusions of the sensor view pose the main challenge in Ambient Assistive Living scenarios for the elderly, because health-critical activities might go undetected and the main purpose of behavior recognition is then not achieved. Dealing with such situations when they arise remains a challenge.
Data Abuse by Third Parties: Medical datasets have in general been biased, which adversely affects clinical outcomes. If an individual is missing specific attributes, whether owing to data-collection constraints or societal factors, algorithms could misinterpret their entire record, resulting in higher levels of predictive error.
Confrontation with New Technology: Confronting new technology is a challenge, especially for the elderly, because it imposes difficulty on seniors; user acceptance is therefore the deciding factor. The development of new technologies should be based on end users’ perspectives, wants, and needs.
Handling Seems to Be Too Complex: Ambient Assisted Living devices have received positive overall feedback from the elderly and their caretakers. Greater willingness to use these products depends on a favorable cost–benefit ratio, high accessibility, reliable functioning, and increased assistance. The acceptance rate is lower among professional caretakers, as these technologies are considered anti-human in the care context. To overcome these issues, thorough training in product handling is essential.
1.7 Motivation
IT services play a prominent role in the use of Ambient Assistive Technologies, especially in the medical landscape. Ambient assistive technologies enable ambient assistive living and have evolved through generations to provide services in multiple domains. Different devices and apps have been developed to make ambient assistive technologies convenient to use. Self-advocacy and recommender systems have gained importance in the healthcare sector. Many apps are available for various health verticals, such as activity tracking, smart drugs, telehealth, socialization, and communication. Each application is developed for a specific scenario, such as smart daily activity tracking, smart medication, or health monitoring. This led to the initiation of the Autonomous Intelligent Advisory System (H-Pilot), which works across multiple verticals to improve the quality of life (QOL) of the elderly through active aging. The SIoT framework in H-Pilot, as an integrated application, provides various AAT services, such as alerts and recommendations, activity monitoring with reports, communication among devices, social and feedback support, navigation, and emergency health monitoring, through various existing autonomous apps.
1.8 Contribution of the Current Study
• Insights into IT healthcare technologies and how AAT has evolved through generations, from EHRs to Ambient Assistive Technologies.
• Categorization of AAT based on features.
• Extensive analysis of different tools (apps) and devices embedding Ambient Assistive Technologies in healthcare.
• Challenges faced in using Ambient Assistive Technologies.
• Proposal of an Autonomous Intelligent Advisory System (H-Pilot) using the SIoT framework, which works on multiple verticals.
• Sample user story depicting the usage of H-Pilot, enabling person-centric healthcare using Ambient Assistive Technologies.
1.9 Organization of the Chapter
The chapter is organized into six sections. Section 1, the Introduction, covers the evolution of technologies in healthcare and the associated challenges, and concludes with the motivation and contribution of this study. Section 2 discusses healthcare management tools and techniques, with a major focus on technological innovations and disruptions, services, and tools. Section 3 illustrates recommendation systems in healthcare. Section 4 presents the Autonomous Intelligent Advisory System using the SIoT computational framework. Section 5 depicts a sample application in health scenarios for person-centric healthcare using Ambient Assistive Technologies. Section 6 is the conclusion.
2 AAT Common Services, Tools/Devices in Healthcare
2.1 Categorization of Services in Assistive Living
AAT services such as alerts with recommendations, activity monitoring with reports, communication among devices, social and feedback support, navigation, and emergency health monitoring are essential in supporting an assistive living environment. The categories of services in healthcare are shown in Fig. 8.
Activity Monitoring: The activity and behavioral patterns of elderly persons are monitored over a period of time with the help of environmental and wearable sensors, so that changes in health status can be easily assessed. Environmental sensors such as door sensors, motion capture systems such as Kinect, and sensors that measure physical quantities such as light and temperature are utilized for
Fig. 8 Categorization of services in healthcare
activity monitoring. A chair occupancy sensor can report whether a chair is in use [27].
Alerts: Because AAT aims to let the elderly live independently, they must use various home appliances for their daily chores. Dangerous or risky situations are detected and the user is alerted. When the system detects behavioral changes, anomalies, or an abnormal situation, it alerts both the elderly person and the caretakers.
Communication: The user’s intentions must be captured accurately through interaction mechanisms such as speech and gestures. AAT enables communication in the form of medical consultations and contact with family and friends. AAT thus extends people’s abilities to communicate, both with family and friends to maintain a social life and with the devices that enable assistive living.
Emergency: Emergency detection and monitoring is an important component in supporting older adults and patients with potential health-related emergencies to live independently. Events such as falls, the distinction between safe and unsafe events, physiological signals and physical illness, functional aspects such as general activities and meal intake, and safety aspects all need to be monitored.
Feedback Support: AAT integrates adaptable methodologies based on the characteristics of the users and provides feedback on their actions. Wearable devices and smartphones provide new opportunities for collecting health and disease-related data. Older adults’ daily activities are enhanced by mobile-assisted recommendations and advice.
Navigation: Travel and the use of transportation should be easy for older people. Assistance, recommendations, and advice can be given on their mobile or wearable devices [28]. Positioning software and location tracking methods can be used to keep watch over the elderly during their outdoor
activities. To make street navigation easier for the elderly, a mobile phone with integrated sensors acts as an intelligent navigator: it remembers the route taken and guides the user back.
Health Monitoring: Day-to-day health monitoring is essential for the elderly living independently with the support of ambient assistive technologies. Devices are available to track vitals and to connect to healthcare providers, services, and information. Patients with chronic diseases can also be monitored after discharge from hospital.
Social Support: Social inclusion and interactive social relationships keep the aging population active. Active aging is made possible by enabling social connectedness through social network services and thereby integration into society. Combining real-life and virtual social network elements is necessary to prevent and overcome loneliness in the aging population [29].
2.2 AAT Tools/Devices
Technological innovations using ambient assistive technologies are made possible by various tools and devices, broadly grouped as health tracking, smart care, telehealth, mobility and navigation, smart hygiene and drugs, activity tracking, and social inclusion.
Health Tracking:
Tool/Device | Description
Motiva [30] | Older people with chronic diseases are connected to a health channel for health monitoring via a TV platform
Simply Home System [31] | Uses sensors to track activities of daily living and behavior changes
Companion Able [32] | Uses AI and robotic technology for cognitive stimulation and health management of older adults
EDLAH [33] | Keeps track of health status, provides recommendations for task completion, and provides medication reminders via tablet
INHOME [34] | A multipurpose monitoring platform based on a sensor network for e-Health applications, ranging from after-discharge care to elderly monitoring
eHealthCOM [35] | A health platform tailored to localized health services that monitors the physical and emotional well-being of aged people
U-Health [36] | Ubiquitous health (U-Health) provides customized medical services for diet, e-health monitoring, and exercise in the ubiquitous environment
vAssist [37] | Targets seniors with chronic diseases and fine-motor problems, offering multilingual natural speech interaction for home care and communication services
VirtualECare [38] | Adapts its problem-solving strategy for decision-making to offer off-site health monitoring
Smart Care: Some devices and tools assist in smart care for the management of specific health conditions.
Tool/Device | Description
Health@Home [39] | Focuses on the work of health information management in households
DIA [40] | Assists aged people with diabetes with insulin therapy dosage and correct infusion calculation
CAMMInA [41] | Uses usability evaluation to motivate elders to exercise
BEDMOND [42] | Assists in self-management for people suffering from chronic obstructive pulmonary disease (COPD)
Smart Dispenser [43] | Supports older people by storing medications, issuing reminders, and dispensing medications at timely intervals
Telehealth:
Tool/Device | Description
Keep-In-Touch (KIT) [44] | Elderly people are connected in a closed loop with different groups of caregivers through smart objects and technologies for tele-monitoring
I-Living Project [45] | The home of the elderly is equipped with integrated sensors that work autonomously by communicating with Bluetooth-enabled medical devices
Project HOME SWEET HOME [46] | Concentrates on telemonitoring of the elderly to assist them in their daily living activities by collecting data provided by sensors, video conferencing, and support services
AiperCare [47] | An integrated solution for a person in need of care. It detects critical situations and notifies the caregivers. It offers telemonitoring and connects to emergency services
TeleStation [48] | Chronic disease monitoring is done remotely by sending vital sign data to care providers
Smart Hygiene and Drugs:
Tool/Device | Description
BelAmI [49] | Used in evaluating bathroom activities, tasks associated with hygiene, and meal preparation
Smart pill box [50] | Timings, pill dosage, and service times are continuously monitored. Reminders are sent based on the times set
Medication event monitoring systems (MEMS) [51] | A medication event monitoring system that handles the transfer of information between the patient and the caregiver, in the form of a device that simulates a pharmacy vial. Attributes of each opening, such as date and time, are displayed, and the patient is alerted to take medication
Automated medication dispensers [52] | Remind users of medications through voice, video, or telephone calls and dispense medication; they include medication caps equipped with digital alarm clocks
HealthWatch, Beep N Tel [53] | Dispense medication at preprogrammed intervals. Notifications are sent to caregivers or family members, and reminders are given in audible and visual forms
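The reminder behavior described for the smart pill box above (reminders sent according to preset times) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function name `due_reminders`, the three daily dose times, and the set of doses already recorded as taken are all hypothetical and do not describe the logic of any device in the table.

```python
from datetime import datetime, time


def due_reminders(schedule, taken, now):
    """Return today's scheduled dose times that have passed but are not marked as taken."""
    current = now.time()
    return [dose for dose in sorted(schedule) if dose <= current and dose not in taken]


# Hypothetical daily schedule and intake log.
schedule = [time(8, 0), time(14, 0), time(20, 0)]
taken = {time(8, 0)}  # the morning dose was recorded as taken

pending = due_reminders(schedule, taken, datetime(2023, 1, 1, 15, 0))
for dose in pending:
    print(f"Reminder: the {dose.strftime('%H:%M')} dose has not been recorded")
```

In this sketch a missed dose stays pending until it is recorded; a real dispenser might instead escalate to a caregiver notification after a grace period.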
Mobility and Navigation:
Tool/Device | Description
ELDERHOP [54] | Mostly intended to provide retail assistance to the elderly; it facilitates mobility within venues so that shopping becomes a pleasurable interaction rather than a burdensome activity
MobileSage [55] | A device incorporating location awareness that provides geographical advice related to commuting issues
WayFiS [56] | Assists older adults in planning their personal routes and in using public transport and walking paths, guided by positioning software
IWalkActive [57] | A mechanically powered walking aid with an attractive, open walker platform that assists the mobility of the user
Activity Tracking:
Tool/Device | Description
UbiFit [58] | Targeted at improving physical mobility with the help of Ubiquitous Fitness Influencing Technology, focusing on general fitness or on targets set for a specific time
EMBASSI [59] | Uses a dedicated interface to offer assistance in daily activities
cAALyx-mv [60] | Consists of a wearable sensor that tracks vital signs, location, etc., and connects to emergency services
Mobiserv [60] | Supports indoor daily living situations with a social robot that provides prompts for health, communication, and nutrition
Exergaming [61] | A combination of gaming and exercise that involves older people in physical activity through various games
Social Inclusion:
Tool/Device | Description
3rD-LIFE [62] | A 3D virtual environment that allows the elderly to establish social relationships and learn new things
SoMedAll [63] | Social media for elders, equipped with easy-to-use user interfaces to maintain and share online information
Co-living [64] | An ICT-based Virtual Collaborative Social Living Community for Elderly (Co-LIVING) people that aims to stimulate and prolong their independent living in an outward environment by improving their social interaction, thereby contributing positively to their well-being. It fosters community life and enables social interaction, enhancing active living among the elderly through a virtual social network
2.3 Disrupting Technologies in AAL
The term ambient assisted living is frequently used to refer to a variety of AI-enhanced applications that aid people in need with a range of tasks, including daily living and health. AAL can offer a range of solutions for enhancing people’s quality of life. To personalize their services and improve outcomes, these systems employ a variety of techniques to gather information about their users and to create automated judgments, or AI models. Healthcare systems are intricate and varied. The global pandemic hastened the implementation of many digital projects to improve daily living, but it also left many health systems in disarray and painfully conscious of their digital limitations. As a result, the healthcare industry is primed for digital disruption, and the range of applications for digital solutions in areas such as clinical decision support, telehealth, and healthcare IT systems has expanded dramatically. The transformations noted in AAL with digital technologies are shown in Fig. 9.
A scenario of clinical staff maintaining patient records: With the help of AI technology, patient care is transformed from human-centric to technology-centric.
Fig. 9 Transform patient care with the latest in Ambient Intelligence
An AI-powered ambient clinical intelligence (ACI) solution, Nuance DAX, now records patient contacts automatically and effectively at the point of care. To provide physicians and patients with improved healthcare experiences, DAX goes beyond the capabilities of a virtual or on-site scribe.
3 Self-Advocacy or Recommendation Systems in Healthcare
The continuous growth in smartphone usage has driven considerable development of mobile apps in the health sector. Ambient Assistive Living to support senior citizens involves many technologies, such as the Internet of Medical Things and the Internet of Health Things. Ambient assistive living solutions cannot be proposed or designed without the involvement of mobile apps. Smartphone-based healthcare solutions are spread across different verticals, including diagnosis, clinical communication, and medical education. Many apps have been developed for activity monitoring, smart drugs, telehealth, and smart care, and to provide assistance in daily activities in order to deliver a cloud of care and enable active aging. These apps mainly aim at improving the cognitive functions of the elderly with chronic diseases while maintaining social interactions. Some of these apps can help to reduce the anxiety of family caretakers by monitoring the person in and around the home in real time, estimating the probability of wandering using geolocation, and facilitating care management and services by healthcare professionals [65]. Apps for self-advocacy in healthcare are shown in Fig. 10.
Fig. 10 Apps for recommendation systems in Healthcare
3.1 Apps in Health Monitoring
Various apps are available for health monitoring, with specific functionalities such as monitoring glucose levels, managing heart disease, monitoring sleep, detecting falls, and maintaining health logs.
Dexcom Continuous Glucose Monitor: This app monitors changes in the glucose levels of elderly people with diabetes and advises them to take timely action.
Kardia: The elderly can easily record an EKG with the help of wearable devices to manage heart disease. The recorded information can be communicated immediately to doctors for analysis.
Sleep Monitor: Sleeping patterns and habits have an indirect impact on an individual’s health. The app monitors changes in sleeping habits and their correlation with conditions such as heart disease, depression, and obesity.
Fade (Fall Detector): Its main purpose is to alert caregivers when a person suffers a fall.
Health Log: This app keeps a log of health-related information, from general to extreme conditions. It records the severity of the condition, the kind of pain and its causes and location, medication intake, and the extent of relief the patient has obtained.
3.2 Apps for Smart Care
Various apps are available for smart care, with specific functionalities such as activity monitoring, yoga training, medication reminders, and fall alerts.
My Fitness Pal: It focuses on maintaining logs of exercise and calorie intake, and guides users in changing dietary habits to meet health goals. Its guiding principle is that no calorie goes uncounted: everything the user consumes can be recorded. It integrates with other apps to sync all workouts.
Pocket Yoga: Yoga is one of the best exercise choices for older people. This app provides instructor-led videos and yoga poses so that users can practice personalized yoga at their convenience without visiting an instructor.
Sensorfall: It detects falls by sensing the resulting acceleration through the smartphone’s accelerometer.
Medisafe: It acts as a medication reminder. It uses Just in Time interventions technology and promotes a patient-centric approach to managing medication engagement.
CareClinic: It tracks a complete set of health statistics: weight, sleep, blood pressure, energy levels, pulse, pain, temperature, and mood. Medications and symptoms can be tracked in seconds. It helps people manage their health and check in on their current state.
H. Akkineni et al.
Pill Reminder: It provides medication reminders on a regular basis, with instructions on the amount and timing of each medicine.
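The accelerometer-based sensing described for Sensorfall can be illustrated with a minimal threshold sketch. This is not the app's actual algorithm, which the text does not specify. It assumes that a fall appears as a brief free-fall dip in acceleration magnitude followed shortly by an impact spike. The function names, the thresholds (0.4 g and 2.5 g), and the sample window are hypothetical values chosen only for illustration.

```python
import math

def magnitude(sample):
    """Euclidean norm of one (x, y, z) accelerometer sample, in units of g."""
    x, y, z = sample
    return math.sqrt(x * x + y * y + z * z)

def detect_fall(samples, free_fall_g=0.4, impact_g=2.5, max_gap=10):
    """Return True if a free-fall dip is followed by an impact spike
    within max_gap samples. Thresholds are illustrative assumptions."""
    last_free_fall = None
    for i, sample in enumerate(samples):
        g = magnitude(sample)
        if g < free_fall_g:
            last_free_fall = i  # remember the most recent near-weightless sample
        elif (g > impact_g and last_free_fall is not None
              and i - last_free_fall <= max_gap):
            return True
    return False

# Toy data: a phone at rest reads about 1 g on one axis.
resting = [(0.0, 0.0, 1.0)] * 20
fall = resting[:5] + [(0.0, 0.0, 0.1)] * 3 + [(1.5, 1.0, 2.5)] + resting[:5]

print(detect_fall(resting))
print(detect_fall(fall))
```

Requiring the dip before the spike is a simple way to avoid treating every hard bump of the phone as a fall; a real detector would need calibration against recorded falls.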
3.3 Apps for Telehealth
Various apps are available for telehealth, with specific functionalities such as online doctor consultations, pharmacy services, diagnostic lab tests, and chronic care management.
Apollo 24|7: It is an everything-in-one-place platform for the health domain, combining a pharmacy, doctor consultations, and diagnostic lab tests. With Apollo 24|7, one can obtain personalized solutions for any health problem through prompt online doctor consultation.
MDLive: Online doctor consultation is the primary focus of this app. The elderly can get care from doctors or therapists and can schedule or change appointments. The app also provides telehealth-focused training so that every medical institution can get involved, thereby reducing patient wait times and saving caregiving costs.
Babylon Health: It provides medical consultations with face-to-face appointments and personalized healthcare 24/7. Questions related to the medical condition of the elderly are answered by an AI bot, which uses a mix of AI and human expertise to give solutions.
Teladoc: It connects patients to doctors through phone visits. Prescriptions can be sent directly to the pharmacy.
Doctor on Demand: It facilitates online consultations and provides chronic care management, multi-language consultations, preventive health, behavioural health, etc.
3.4 Apps for Smart Drugs
Various apps are available for smart drugs, with specific functionalities such as online medical stores, lab tests, and first-aid instructions.
1mg: A single integrated app for all healthcare needs. It is an online medical store where people can view medicine details, buy medicines, book lab tests, and get online doctor consultations.
PharmEasy: It is a one-stop destination for medicines and healthcare essentials. It also offers lab tests at home. For those who wish for active aging in place, this is essential, as they need not step out for their healthcare essentials.
First Aid: This app gives step-by-step instructions for handling common first-aid emergencies. It provides an option to call emergency services at any time. It focuses on safety and preparedness tips to follow before reaching a doctor.
Healthcare Management and Prediction of Future Illness Through …
3.5 Apps for Activity Tracking / Activities of Daily Living
Various apps are available for managing activities of daily living, with specific functionalities such as monitoring health and wellness, inculcating healthy cooking and dietary habits, managing finances independently, and looking after the safety of the elderly.
Fitbit: This app is generally connected to a Fitbit device, which supplies the input data. It lets seniors monitor many aspects of health and wellness.
HealthifyMe: The platform offers a variety of workout genres, including Strength and Conditioning and special sessions for senior citizens, along with healthy cooking habits, diet charts, and meal planners. It is a comprehensive fitness app with customized programs for every diet preference. Seniors can get expert guidance from trainers and nutritionists.
Coin Calculator: Senior citizens can press the photograph of each coin, along with the + and - buttons, to get the total. This app allows them to manage their finances independently.
iDressforWeather: It acts like a virtual clothing closet with photos of the clothing. It allows seniors to choose appropriate clothing based on the weather.
Senior Safety App: It allows seniors to stay connected with their families. Caregivers can monitor inactivity alerts for elderly people who live alone and receive an alert when the phone has been immobile for an extended period, which can be customized based on individual lifestyle. It also monitors the battery life of the phone.
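The customizable inactivity alert described for the Senior Safety App amounts to comparing the time since the phone last moved against a caregiver-chosen threshold. The following is a minimal sketch under that assumption. The function name `inactivity_alert` and the default of 8 hours are hypothetical and are not taken from the app.

```python
from datetime import datetime, timedelta

def inactivity_alert(last_movement, now, threshold_hours=8.0):
    """Return True when the phone has been immobile for longer than the
    threshold. The default threshold is an illustrative assumption."""
    return (now - last_movement) > timedelta(hours=threshold_hours)

now = datetime(2023, 1, 1, 18, 0)
last_moved = datetime(2023, 1, 1, 7, 0)  # 11 hours before `now`

print(inactivity_alert(last_moved, now))                      # default threshold
print(inactivity_alert(last_moved, now, threshold_hours=12))  # relaxed threshold
```

Exposing the threshold as a parameter reflects the point in the text that the period should be adjustable to the individual's lifestyle.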
3.6 Apps for Communication and Socialization
Various apps are available for communication and socialization, with specific functionalities such as dialling family and friends, supporting the emotional well-being of the elderly, sending emergency texts to predetermined contacts, and staying connected with family and friends.
SpeedDial, Voice Dial and Photo Dial: These apps turn the phone into a customizable photo/voice dialling system; pressing a person's photo dials that person. This makes it easy for senior citizens to stay in touch with family and friends.
Evergreen Club: A digital platform launched for the emotional well-being of the elderly. It assists seniors in regaining a sense of belonging by attending a variety of exciting events and making new, meaningful connections. Seniors can 'e-meet' and socialize with people who share their interests. They can also take part in interactive sessions such as Antakshari, Tambola, Open Mic, and Karaoke, as well as expert-led learning sessions such as Gardening, Yoga, Dance Therapy, and Crafts. They can develop creative hobbies and nurture new talents while engaging in fun and recreational activities.
ReactMobile: It sends an emergency text and GPS information to predetermined contacts at the touch of a large button, which makes it handy for older adults.
Dragon Dictionary: A voice recognition app that allows seniors to speak in order to send text messages and emails.
Skype: It connects the user with family and friends through video calling.
Each of the existing apps serves a specific functionality. Hence, a user has to connect with individual apps and check their notifications from time to time, and those notifications are general rather than personalized. To overcome this difficulty, a unified framework that connects all existing apps is essential.
4 Autonomous Intelligent Advisory System Using SIoT Computational Framework
Many apps are available for various verticals of health, such as activity tracking, smart drugs, telehealth, socialization, and communication. The existing apps are specific to one vertical. Each application is developed for a certain scenario, such as smart daily activity tracking, smart medication, or health monitoring. These are highly efficient autonomous units that make numerous features possible to support day-to-day activities. The main drawback is that these units, the social things (apps), cannot communicate (talk) with one another. This leads to inefficient exploitation of the services offered by existing applications. This drawback can be overcome through the Social Internet of Things (SIoT) framework by considering mobile apps as things in the IoT. Through this, communication is established among the applications of various verticals on the user's mobile. The SIoT framework aims to develop an Autonomous Intelligent Advisory System (H-Pilot) that works across multiple verticals to improve the quality of life (QOL) of the elderly with active aging. The implications of the proposed framework in health scenarios are:
• The personalized recommendations and notifications are based on health issues and metadata collected from various connected mobile apps, including information related to exercise, food, health history, sleep patterns, EHR records, diagnostic test reports, and other information such as headaches, coughs, colds, and allergies. Through this, up-to-date electronic health records can be accessed for online/offline doctor consultations.
• The H-Pilot establishes semantic interoperability between various verticals, whether in-mobile, on-site, or in the cloud. Hence, it aims to establish a flow of information among intra-connected and interconnected apps across various verticals.
• The outcomes of H-Pilot assure a high impact on individual data, enhancing the efficiency of health-related emergency services and supporting assistive living.
The Social IoT (SIoT) framework consists of various layers that establish a deeply interconnected system, as shown in Fig. 1. The concept of exploiting SIoT for recommendation services among various verticals is realized in a framework of 5 layers. Interoperability is required between verticals that reside in-mobile (HealthifyMe, 1mg, Apollo 24|7, etc.), on-site (e-health kiosks, health monitoring devices, wearables, etc.), or in the cloud (EHR databases, collaborative care, etc.). The proposed SIoT computational framework is given in Fig. 11.
Sensor layer: It contains a varied range of heterogeneous devices such as sensors, actuators, RFID, and cameras, responsible for sensing and collecting information from the apps present on the user's mobile.
Mobile app layer: It contains apps in various verticals such as health, travel, wealth, entertainment, safety, and family. These apps communicate with the sensor layer for information and establish social relationships and friendship circles. The actuated information is forwarded to the network layer.
Network layer: It is composed of private wireless networks, public mobile networks, satellite networks, Wi-Fi, and the Internet. The data received is forwarded to the next layer for processing. At present, each app in the mobile app network works in one vertical. Each app is application-specific, for example for activity tracking or fall detection. These are highly efficient autonomous units that make numerous features possible to support day-to-day activities. The SIoT-based H-Pilot requires data sharing among different verticals to provide a service.
Semantic interoperability layer: Semantic interoperability is required to share data among various verticals. oneM2M and FIWARE are interoperability platforms that can be used to enable data sharing among multiple verticals.
The overall development of H-Pilot, an autonomous intelligent advisory system, has many phases: the identification of existing mobile apps with inbuilt sensors in multiple verticals, semantic interoperability with appropriate communication protocols, services through H-Pilot such as automated recommendations and assistance, and the development of the cloud-based mobile application.
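The role of the semantic interoperability layer can be sketched as a set of adapters that translate app-specific payloads into one shared record, over which a cross-vertical rule can then run. The sketch below is only an illustration of that idea and does not use the oneM2M or FIWARE APIs. The `Observation` schema, the payload shapes, the adapter names, and the 6-hour sleep rule are all hypothetical assumptions.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Observation:
    """Common record shared across verticals (hypothetical schema)."""
    source: str
    vertical: str
    metric: str
    value: float
    unit: str

def from_fitness_app(raw):
    # Hypothetical payload shape: {"kcal_burnt": 320}
    return Observation("fitness_app", "activity", "calories_burnt",
                       float(raw["kcal_burnt"]), "kcal")

def from_sleep_app(raw):
    # Hypothetical payload shape: {"sleep_minutes": 300}; converted to hours
    return Observation("sleep_app", "health", "sleep_duration",
                       raw["sleep_minutes"] / 60.0, "h")

ADAPTERS = {"fitness_app": from_fitness_app, "sleep_app": from_sleep_app}

def normalize(source, raw):
    """Translate an app-specific payload into the shared Observation form."""
    return ADAPTERS[source](raw)

def advise(observations, min_sleep_h=6.0):
    """Toy cross-vertical rule: flag short sleep on a day with recorded exercise."""
    by_metric = {o.metric: o for o in observations}
    sleep = by_metric.get("sleep_duration")
    activity = by_metric.get("calories_burnt")
    if sleep and activity and sleep.value < min_sleep_h and activity.value > 0:
        return "Short sleep after exercise: consider a lighter routine today."
    return None

obs = [normalize("fitness_app", {"kcal_burnt": 320}),
       normalize("sleep_app", {"sleep_minutes": 300})]
print(advise(obs))
```

The recommendation here depends on data from two different verticals, which is the kind of service that single-vertical apps cannot provide on their own.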
Fig. 11 The proposed SIoT computational framework
5 Sample Application in Health Scenario
A sample health scenario is depicted in Fig. 12. Consider a person named Ram who uses an SIoT-based H-Pilot system for his health-related recommendations and notifications. In this scenario, the H-Pilot system collects all his personal information related to exercise, food intake, health history, sleep patterns, EHR records, diagnostic test reports, and health-related information
Fig. 12 Health scenario
such as headaches, coughs, colds, cramps, and allergies from the various connected apps on his mobile. By considering metadata from these apps, it provides recommendations on his health issues and up-to-date digital health records for online/offline doctor consultations. It notifies Ram about taking his medicines at regular intervals according to the prescription.
1. Ram wakes up and exercises, which is tracked by My Fitness Pal. It keeps track of Ram's exercise statistics and records the calories burnt. Based on his personal health goals, it recommends that Ram change his eating habits.
2. While doing a complicated workout, Ram feels a mild pain in his chest. Apollo 24|7 immediately tracks his health vitals through his wearable devices. The symptom tracker advises Ram based on his vitals and symptoms and books lab tests and preventive health check-ups with home pick-up.
3. His SIoT health network activates the Health Log app, and any health-related information such as headaches, coughs, colds, and allergies is noted. The severity and the kind of pain are also recorded.
4. Ram collects the diagnostic test reports. The online doctor consultation feature in Apollo 24|7 connects him to top doctors in India via video/audio call or chat.
5. Up-to-date digital health records for online consultations can be retrieved using the EHR/EMR Health Records app.
6. After observing the patterns from the Sleep Monitor and Health Log apps and the reports retrieved from the EHR Records app, the doctor prescribes medicines during the online consultation.
7. The medicines are made available through the 1mg app, and the SIoT health recommendation system notifies Ram about taking the medicines at regular intervals according to the prescription.
8. HealthifyMe gets inputs from all these apps through the SIoT network and recommends a specialized diet plan, which lets him easily manage dietary health conditions and helps his overall immune system fight viral and bacterial infections.
It also helps Ram find health advice and recipes, and motivates him towards a healthy lifestyle.
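The flow of information in the scenario above, where an event in one app triggers actions in others, can be sketched as a small publish/subscribe hub. This is an illustrative stand-in for the SIoT network, not the H-Pilot implementation, which the chapter does not give in code. The class `SocialBus`, the topic names, and the payload fields are hypothetical assumptions.

```python
from collections import defaultdict

class SocialBus:
    """Minimal in-memory publish/subscribe hub (hypothetical stand-in
    for the SIoT network that lets apps talk to each other)."""

    def __init__(self):
        self._subscribers = defaultdict(list)
        self.log = []  # every published (topic, payload) pair, in order

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, payload):
        self.log.append((topic, payload))
        for handler in self._subscribers[topic]:
            handler(payload)

bus = SocialBus()
notifications = []

def log_symptom(p):
    # Health log app: records every reported symptom with its severity.
    notifications.append(f"Logged symptom: {p['name']} ({p['severity']})")

def suggest_consultation(p):
    # Telehealth app: chest pain triggers a consultation suggestion.
    if p["name"] == "chest pain":
        bus.publish("consultation", {"reason": p["name"]})

def remind(p):
    # Reminder app: one notification per prescribed intake time.
    notifications.extend(f"Take {p['drug']} at {t}" for t in p["times"])

bus.subscribe("symptom", log_symptom)
bus.subscribe("symptom", suggest_consultation)
bus.subscribe("consultation",
              lambda p: notifications.append(f"Consultation suggested for {p['reason']}"))
bus.subscribe("prescription", remind)

bus.publish("symptom", {"name": "chest pain", "severity": "mild"})
bus.publish("prescription", {"drug": "medicine A", "times": ["08:00", "20:00"]})

for line in notifications:
    print(line)
```

Because the apps only share topics and payloads, a new app can join the flow by subscribing to an existing topic without changes to the others.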
6 Conclusions
The proposed SIoT computational framework underlies the development of H-Pilot, an integrated application that supports the flow of information among interconnected apps in various verticals such as health, smart care, activity, telehealth, communication, and socialization. Various AAT services, such as alerts and recommendations, activity monitoring with reports, communication among various devices, social and feedback support, navigation, and emergency health monitoring, are provided through various existing autonomous apps. The information must be blended to generate overall actions that increase efficiency and effectiveness in meeting the tasks and needs of individuals, with personalized recommendations for improving QOL. H-Pilot ensures that all services are provided at a personalized level, with extended cloud care in support of an assistive living environment. To make the study's contributions more evident, a sample user story shows how H-Pilot enables person-centric healthcare using Ambient Assistive Technologies.
References 1. Collier, R. (2017, November 13). Electronic health records contributing to physician burnout. CMAJ, 189(45), E1405–E1406. https://doi.org/10.1503/cmaj.109-5522. PMID: 29133547; PMCID: PMC5687935. 2. High, R. (2012). The era of cognitive systems: An inside look at IBM Watson and how it works. IBM Corporation, Redbooks, 1, 16. 3. Lee, K. Y., & Kim, J. (2016). Artificial intelligence technology trends and IBM Watson references in the medical field. Korean Medical Education Review, 18(2), 51–57. 4. Sanchez-Pi, N., Mangina, E., Carbo, J., & Molina, J. (2010). Multi-agent System (MAS) Applications in Ambient Intelligence (AmI) Environments (pp. 493–500). https://doi.org/10.1007/ 978-3-642-12433-4_58 5. Gams, M., Gu, I. Y.-H., Harma, A., Munoz, A., & Tam, V. (2019). Artificial intelligence and ambient intelligence. Journal of Ambient Intelligence and Smart Environments, 11, 1. 6. Living with Ambient Intelligence: So at Home with Technology, Retrieved from https://www. infosys.com/insights/ai-automation/ambient-intelligence.html, Accessed 11 November 2022. 7. World Health Organization. Assistive Technology. (2018). Retrieved from: https://www.who. int/news-room/fact-sheets/detail/assistive-technology, Accessed 20 July, 2020. 8. Sixsmith, A., Gibson, G., Orpwood, R. D., & Torrington, J. M. (2007). New technologies to support independent living and quality of life for people with dementia. Alzheimer’s Care Today, 7, 194–202. 9. Sixsmith, A. (2000). An evaluation of an intelligent home monitoring system. Journal of Telemedicine and Telecare, 6, 63–72. https://doi.org/10.1258/1357633001935059 10. Sixsmith, A. (2006). New technologies to support living and quality of life for older people with dementia. Alzheimer’s Care Today, 7, 194–202.
Healthcare Management and Prediction of Future Illness Through …
179
11. Kleinberger, T., Becker, M., Ras, E., Holzinger, A., & Muller, P. (2007). Ambient intelligence in assisted living: enable elderly people to handle future interfaces. In C. Stephanidis (Ed.), Universal access in HCI (Part II, pp. 103–112). Springer. 12. Unbehaun, D., Vaziri, D. D., Aal, K., Wieching, R., Tolmie, P., & Wulf, V. (2018). Exploring the potential of exergames to affect the social and daily life of people with dementia and their caregivers. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems; Montreal, QC, Canada. 21–26 April 2018 (pp. 62–77). ACM. 13. Unbehaun, D., Vaziri, D. D., Aal, K., Li, Q., Wieching, R., Wulf, V. (2018). Video-game based exergames for people with dementia and their caregivers. In Proceedings of the 2018 ACM conference on supporting groupwork; Sanibel Island, FL, USA. 7–10 January 2018 (pp. 401–405). ACM. 14. Dietlein, C., Eichberg, S., Fleiner, T., & Zijlstra, W. (2018). Feasibility and effects of serious games for people with dementia: A systematic review and recommendations for future research. Gerontechnology, 17, 1–17. 15. Sayago, S., Rosales, A., Righi, V., Ferreira, S. M., Coleman, G. W., Blat, J. (2016). On the conceptualization, design, and evaluation of appealing, meaningful, and playable digital games for older people. Games and Culture, 11, 53–80. 16. Wittland, J., Brauner, P., & Ziefle, M. (2015). Serious games for cognitive training in ambient assisted living environments—A technology acceptance perspective. In J. Abascal, S. Barbosa, M. Fetter, et al. (Eds.), Proceedings of the 15th INTERACT 2015 conference, LNCS volume 9296 (pp. 453–471). Springer International Publishing. 17. Istepanian, R. S. H., Hu, S., Philip, N. Y., & Sungoor, A. (2011). The potential of internet of mhealth things m-IoT for non-invasive glucose level sensing. In Annual International Conference of the IEEE Engineering in Medicine and Biology Society. 18. 
Marcelino, I., Laza, R., Domingues, P., Gómez-Meire, S., Fdez-Riverola, F., & Pereira, A. (2018). Active and assisted living ecosystem for the elderly. Sensors, 18, 1246. 19. Cicirelli, G., Marani, R., Petitti, A., Milella, A., & D’Orazio, T. (2021). Ambient assisted living: A review of technologies, methodologies and future perspectives for healthy aging of population. Sensors, 21(10), 3549. https://doi.org/10.3390/s21103549 20. Centers for Disease Control and Prevention. Healthy Places Terminology. 2018. https:// www.cdc.gov/healthyplaces/terminology.htm, Accessed 31 October 2022, WebCite Cache ID 6zziV6N5A. 21. Biermann, H., Offermann-van Heek, J., Himmel, S., & Ziefle, M. (2018, December). Ambient assisted living as support for aging in place: Quantitative users’ acceptance study on ultrasonic whistles. JMIR Aging, 12, 1(2), e11825. https://doi.org/10.2196/11825. PMID: 31518245; PMCID: MC6715023. 22. Praveen, S. P., Ali, M. H., Jaber, M. M., Buddhi, D., Prakash, C., Rani, D. R., & Thirugnanam, T. (2022). IOT-enabled healthcare data analysis in virtual hospital systems using industry 4.0 smart manufacturing. International Journal of Pattern Recognition and Artificial Intelligence. https://doi.org/10.1142/S0218001423560025 23. Rodrigues, J. J., Segundo, D. B. D. R., Junqueira, H. A., Sabino, M. H., Prince, R. M., AlMuhtadi, J., & De Albuquerque, V. H. C. (2018). Enabling technologies for the internet of health things. IEEE Access, 6, 13129–13141. https://doi.org/10.1109/ACCESS.2017.2789329 24. Hoey, J., Poupart, P., Bertoldi, A., Craig, T., Boutilier, C., & Mihailidis, A. (2010). Automated handwashing assistance for persons with dementia using video and a partially observable markov decision process. Computer Vision and Image Understanding, 114, 503–519. https:// doi.org/10.1016/j.cviu.2009.06.008 25. Srinivasu, P. N., Ijaz, M. F., Shafi, J., Wo´zniak, M., & Sujatha, R. (2022). 6G driven fast computational networking framework for healthcare applications. 
IEEE Access, 10, 94235– 94248. https://doi.org/10.1109/ACCESS.2022.3203061
180
H. Akkineni et al.
26. Ahmed, S., Srinivasu, P. N., Alhumam, A., & Alarfaj, M. (2022). AAL and internet of medical things for monitoring type-2 diabetic patients. Diagnostics, 12(11), 2739. https://doi.org/10. 3390/diagnostics12112739 27. Haque, A., Milstein, A., & Fei-Fei, L. (2020). Illuminating the dark spaces of healthcare with ambient intelligence. Nature, 585, 193–202. https://doi.org/10.1038/s41586-020-2669-y 28. Ge, C., Yin, C., Liu, Z., Fang, L., Zhu, J., & Ling, H. (2020). A privacy preserve big data analysis system for wearable wireless sensor network. Computers and Security, 96, 101887. https://doi.org/10.1016/j.cose.2020.101887 29. Tantinger, D., & Braun, A. (2011). Virtual coach reaches out to me: The V2me-project. ERCIM News, 87, 34–35. 30. Koninklijke Philips Electronics N. V. (2022). Telehealth, Retrieved March 1, 2022, from http:/ /telehealth.philips.com/. 31. Simply Home. (2013). How it works, Retrieved March 1, 2013, from http://www.simplyhome. com/HowItWorks.html. 32. CompanionAble. (2013). Integrated cognitive assistive and domotic companion robot systems for ability and security, Retrieved March 1, 2013, from companionable.net. 33. Grguric, A., Gil, A. M. M., Huljenic, D., Car, Z., Podobnik, V. (2016). A survey on user interaction mechanisms for enhanced living environments. In ICT Innovations 2015 (pp. 131– 141). Springer International Publishing. 34. Junnila, S., Kailanto, H., Merilahti, J., Vainio, A. M., Vehkaoja, A., Zakrzewski, M., & Hyttinen, J. (2010). Wireless, multipurpose in-home health monitoring platform: Two case trials. IEEE Transactions on Information Technology in Biomedicine, 14, 447–455. 35. Byrne, C. A., Collier, R., & O’Hare, G. M. P. (2018). A review and classification of assisted living systems. Information, 9(7), 182. https://doi.org/10.3390/info9070182 36. Jung, E. Y., Kim, J. H., Chung, K. Y., et al. (2013). Home health gateway based healthcare services through U-health platform. Wireless Personal Communications, 73, 207–218. 
https:// doi.org/10.1007/s11277-013-1231-8 37. Schlögl, S., Chollet, G., Milhorat, P., Deslis, J., Feldmar, J., Boudy, J., Garschall, M., Tscheligi, M. (2013). Using wizard of Oz to collect interaction data for voice controlled home care and communication services. In Proceedings of the international conference on signal processing, pattern recognition and applications, Innsbruck, Austria, 12–14 February 2013. 38. Costa, R., Novais, P., Lima, L., Carneiro, D., Samico, D., Oliveira, J., Machado, J., & Neves, J. (2009). Virtualecare: Intelligent assisted living. In Electronic healthcare (pp. 138–144). Springer. 39. Moen, A., & Brennan, P. F. (2005, November). Health@Home: The work of Health Information Management in the Household (HIMH): Implications for Consumer Health Informatics (CHI) innovations. Journal of the American Medical Informatics Association, 12(6), 648–656. https:/ /doi.org/10.1197/jamia.M1758 40. Jara, A. J., Zamora, M. A., & Skarmeta, A. F. G. (2011). An internet of things–based personal device for diabetes therapy management in ambient assisted living (AAL). Personal and Ubiquitous Computing, 15, 431–440. https://doi.org/10.1007/s00779-010-0353-1 41. Rodríguez, M. D., Roa, J. R., Morán, A. L., et al. (2013). CAMMInA: A mobile ambient information system to motivate elders to exercise. Personal and Ubiquitous Computing, 17, 1127–1134. https://doi.org/10.1007/s00779-012-0561-y 42. Bedmond. (2010). Behaviour pattern-based assistant for early detection and management of neurodegenerative diseases, Retrieved March 1, 2013, from http://www.aladdin-project.eu/ home.aspx 43. Antoun, W, Abdo, A., Al-Yaman, S., Kassem, A., Hamad, M., & El-Moucary, C. (2018). Smart Medicine Dispenser (SMD). In 2018 IEEE 4th Middle East Conference on Biomedical Engineering (MECBME) (pp. 20–23). https://doi.org/10.1109/MECBME.2018.8402399
Healthcare Management and Prediction of Future Illness Through …
181
44. Rzepka, A., Modre-Osprian, R., Drobics, M., Hayn, D., Schreier, G. (2010). The internet of things for ambient assisted living. In Third international conference on information technology: New generations (804–809). https://doi.org/10.1109/ITNG.2010.104 45. Wang, Q., Shin, W., Liu, X., Zeng, Z., Oh, C., Al Shebli, B. K., Sha, L. (2006). I-Living: An open system architecture for assisted living. In Proceedings of the 2006 IEEE international conference on systems, man and cybernetics, Taipei, Taiwan, 8–11 October 2006 (pp. 4268– 4275). 46. Messens, L., Quinn, S., Saez, I., & Squillace, P. (2013). Home Sweet Home: Health monitoring and sOcial integration environMent for Supporting WidE ExTension of independent life at HOME. ICT PSP—Health, Ageing and Inclusion Programme. 47. Aipermon. (2013). Products, Retrieved March 1, 2013, from http://www.aipermon.com/pro dukte-aipercarenutzung.htm. 48. Koninklijke Philips Electronics N. V. (2013). Telehealth, Retrieved March 1, 2013, from http:/ /telehealth.philips.com/. 49. Environment for Supporting Wide Extension of Independent Life at HOME. (2013). ICT PSP—health, ageing and inclusion programme; Grant Agreement No 250449; Document D7.3; Intermediate Trial Evaluation Report; ICT PSP. 50. Minaam, D. S. A., & Abd-ELfattah, M. (2018). Smart drugs: Improving healthcare using smart pill box for medicine reminder and monitoring system. Future Computing and Informatics Journal, 3, 443–456. 51. Sterns, A., Hughes, J., Masstandrea, N., & Smith, J. (2012). Medication event monitoring system. U.S. Patent 14,357,052, 12 November 2012. 52. Marek, K. D., Stetzer, F., Ryan, P. A., Bub, L. D., Adams, S. J., Schlidt, A., Lancaster, R., & O’Brien, A. M. (2013). Nurse care coordination and technology effects on health status of frail elderly via enhanced self-management of medication: Randomized clinical trial to test efficacy. Nursing Research, 62, 269–278. 53. MacLaughlin, E. J., Raehl, C. L., Treadway, A. K., Sterling, T. L., Zoller, D. 
P., & Bond, C. A. (2005). Assessing medication adherence in the elderly. Drugs and Aging, 22, 231–255. 54. Mayer, C., Morandell, M., Gira, M., Hackbarth, K., Petzold, M., Fagel, S. (2012). AALuis, a user interface layer that brings device independence to users of AAL systems (pp. 650–657). Springer. 55. Røssvoll, T. H. (2013). The European MobileSage project—Situated adaptive guidance for the mobile elderly: Overview, status, and preliminary results. In Proceedings of the Sixth International Conference on Advances in Computer-Human Interactions (ACHI), Nice, France, 24 February–1 March 2013. 56. Flagships, Fet, MárioCampolargo and ElisabettaSonnino. “WayFiS : Way finding for Seniors.” (2018). 57. Morandell, M., Rumsch, A., Biallas, M., Kindberg, S., Züsli, R., Lurf, R., & Fuxreiter, T. (2013). iWalkActive: An active walker for active people. Assistive Technology: From Research to Practice, 33, 216–221. 58. Consolvo, S., McDonald, D. W., Toscos, T., Chen, M. Y., Froehlich, J., Harrison, B., Klasnja, P., LaMarca, A., LeGrand, L., Libby, R., et al. (2008). Activity sensing in the wild: A field trial of ubifit garden. In Proceedings of the SIGCHI conference on human factors in computing systems, Florence, France, 5–10 April 2008 (pp. 1797–1806). 59. Hildebrand, A., Sá, V. (2000). Embassi: Electronic multimedia and service assistance. In Proceedings of the IMC 2000, London, UK, 26–31 July 2000 (pp. 50–59). 60. Smart Homes. (2013). Complete ambient assisted living, Retrieved March 1, 2013, from http:/ /www.smarthomes.nl/Innovatie/Europees-Onderzoek/Caalyx.aspx 61. Brox, E., & Hernandez, J. E. G. (2011). Exergames for elderly: Social exergames to persuade seniors to increase physical activity. In Proceedings of the 5th IEEE International Conference on Pervasive Computing Technology for Healthcare, Dublin, Ireland, 23–26 May 2011 (pp. 546– 549).
182
H. Akkineni et al.
62. Castro, M. D., Ruiz-Mezcua, B., Sánchez-Pena, J. M., García-Crespo, Á., Iglesias, A., & Pajares, J. L. (2012). Tablets helping elderly and disabled people, smart homes (pp. 237–244). Dutch Expert Centre on Home Automation, Smart Living & E-Health. 63. Lattanzio, F., Abbatecola, A. M., Bevilacqua, R., Chiatti, C., Corsonello, A., Rossi, L., Bustacchini, S., & Bernabei, R. (2014). Advanced technology care innovation for older people in Italy: Necessity and opportunity to promote health and wellbeing. Journal of the American Medical Directors Association, 15, 457–466. 64. Mathisen, B. M., Kofod-Petersen, A., & Olalde, I. (2012). Co-Living social community for elderly (pp. 38–46). 65. Moreira, H., Oliveira, R., Flores, N. (2013). STAlz: Remotely supporting the diagnosis, tracking and rehabilitation of patients with Alzheimer’s. In IEEE 15th international conference on ehealth networking, applications and services (Healthcom); 2013, October 9–12 (pp. 584–1). IEEE.
ResNet-50-CNN and LSTM Based Arrhythmia Detection Model Based on ECG Dataset
Ojaswa Yadav, Ayush Singh, Aman Sinha, Chirag Vinit Garg, and P. Sriramalakshmi
Abstract The ECG is a critical component of computer-aided arrhythmia detection systems, since it helps to curb the rising death rate from disorders of the circulatory system. However, due to the intricate variations and imbalance of electrocardiogram beats, arrhythmia detection is a difficult problem to solve. This study presents an enhanced ResNet-50 model that combines a Conv-1D Convolution Neural Network (CNN) with Long Short Term Memory (LSTM) for arrhythmia identification from ECG data, including proper parameter optimization and model training. The results of applying the proposed model to the MIT-BIH arrhythmia database demonstrate that it performs better than other classification methods, with an accuracy of 98.7% and an MSE of 0.06.
Keywords Convolution Neural Network (CNN) · Long Short Term Memory (LSTM) · ResNet-50 · Cardiovascular disease (CVD) · Electrocardiogram (ECG)
O. Yadav · A. Singh · A. Sinha · C. Vinit Garg · P. Sriramalakshmi (B)
School of Electrical Engineering, Vellore Institute of Technology, Chennai, India
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023
P. Barsocchi et al. (eds.), Enabling Person-Centric Healthcare using Ambient Assistive Technology, Studies in Computational Intelligence 1108, https://doi.org/10.1007/978-3-031-38281-9_8
1 Introduction
In recent years, as the economy has picked up, cardiovascular disease (CVD) incidence and deaths have continued to rise, and the trend is becoming more visible, particularly among young people. Worldwide, CVDs constitute the most prominent cause of mortality [1]. According to the WHO, CVDs killed 17.9 million people in 2016, accounting for 31% of all deaths. Arrhythmia is a frequent condition that can result in cardiac arrest or death [1]. Most patients with acute CVDs lose consciousness shortly after the onset of symptoms, which may lead to death if not treated within 24 hours [2]. As a result, detecting irregular heartbeats in electrocardiograms (ECGs) in a timely and precise manner has become a major challenge in the medical industry. Worldwide, CVDs constitute the leading cause of
O. Yadav et al.
Fig. 1 ECG signal example
death [3]. Arrhythmia is caused by abnormal intracardiac conduction which changes the structure of the heart or causes it to beat irregularly [4]. An electrocardiogram is a visual representation of the human heart’s electrical activity. Collecting signals to get the exact physiological condition of different sections of your heart is an essential aspect of clinical diagnosis. Arrhythmia has a clinical diagnostic reference value for automatic analysis and diagnosis based on Electrocardiogram dataset [5]. As some arrhythmia forms are extremely uncommon [6] patients must be watched for an extended period of time to determine, kind of arrhythmia. The electrocardiogram has long been the primary tool for diagnosing Cardiovascular diseases [7], and it is especially useful for detecting arrhythmia Fig. 1 shows the ECG signal, which is made up of three waves [8]. A PQRST complex is part of the ECG. The P wave is produced by the sinoatrial node (SA), which is the heart’s pacemaker. The atrioventricular node generates the QRS wave (AV). Atrial depolarization is shown by the P wave in an ECG complex. Ventricular depolarization is caused by the QRS, while ventricular repolarization is caused by the T wave. In clinical practice, clinicians use visual inspection and manual interpretation methods to identify changes in ECG values in order to identify cardiovascular diseases. CVD markers, on the other hand, can happen at random times due to the non-stationary and non-linear character of electrocardiogram signals [9]. Because of this and other considerations, classifying arrhythmic heartbeats in electrocardiogram data is difficult and time-consuming and that is practically impossible to complete manually. As a result, automatic classification systems that examine recordings and categorize pulse types have become increasingly significant. 
Researchers in the field of AI have created several machine learning (ML) and deep learning (DL) algorithms to categorize arrhythmias during the last few decades. Artificial neural networks (ANNs), for example, are widely used for pattern recognition, classification, and other tasks, and they come in a variety of shapes and sizes. In recent years, artificial intelligence has been employed in the analysis of electrocardiogram waveforms. Various ML and, in particular, DL algorithms have been shown to be useful in identifying abnormal ECG events, improving diagnostic accuracy for a variety of heart disorders.

Many approaches for classifying ECGs automatically have been proposed in the literature. Time-domain [10], wavelet transform [11], genetic algorithm [12], SVM [13], Bayesian [14], and other approaches are used to differentiate the kind of ECG beat. Although these approaches produce high accuracy on experimental data, their performance depends strongly on manually designed feature extraction. In terms of data processing, one option is to treat the electrocardiogram signal as a 1D dataset and process it with typical methods for plain text [15]. De Chazal et al. [17] separated the dataset into a training set and a testing set, making the testing results more realistic. Saini et al. [18] classified the beats of an electrocardiogram signal into four categories. Thomas et al. [19] used a dual-tree wavelet transform on an electrocardiogram dataset to achieve automated feature extraction.

The more recently developed deep learning models are strong analytical models that considerably reduce the use of hand-crafted features, although they are computationally costly [9]. DL models use deep neural networks (DNNs), which are further categorized into CNNs, recurrent neural networks (RNNs), and long short-term memory (LSTM) networks. CNNs are frequently employed in a variety of disciplines. Deep learning is at the forefront of pattern recognition and machine learning. It establishes a framework in which feature extraction and categorization are carried out simultaneously [20]. Image categorization [21], target recognition [22], and illness prediction [23] are only a few of the applications of deep learning. It is also useful for deciphering bioinformatics signals [24–28]. To automatically detect five ECG beat types, Acharya et al. [24] suggested a 9-layer CNN. For arrhythmia identification, Yildirim et al. [26] developed an end-to-end 1D-CNN model. Hannun et al. [27] built a deep neural network (DNN) to recognize 12 rhythm classes in ECGs. Oh et al. [28] utilized a U-Net autoencoder to detect five arrhythmias. Xu et al. [29] employed a DNN to categorize ECG data end to end, indicating the feasibility of comprehensive intelligent ECG analysis. Xiang et al. [30] presented a two-level CNN that uses the R-R interval difference (the time difference between two consecutive R waves of the QRS complex). Jiang and Kong [31] constructed a block-based neural network. Sellami and Hwang [32] proposed a deep CNN with a batch-weighted loss.

Although the findings of the aforementioned studies are impressive, deeper information cannot be recovered because of the limited number of layers in these networks. The learning ability of a neural network naturally rises as the number of layers grows. Deepening the network, on the other hand, may cause gradient dissipation (vanishing gradients), limiting the model's performance [8] and preventing it from converging. The ResNet structure [33] is a common choice among researchers in this area for dealing with this problem. Before the proposed methodology was implemented, numerous existing models were reviewed. After this survey, new concepts were introduced into the proposed model to give the most efficient and accurate results. In this paper, a CNN technique is used to classify electrocardiogram heartbeats for the purpose of diagnosing arrhythmia.
The major contributions of the article are as follows:
• Null values are removed and the dataset is visualized, with its division into five classes based on the type of condition.
• An improved ResNet model is proposed and explained for electrocardiogram heartbeat classification; it uses LSTM and CNN based techniques to give better overall accuracy.
• Using the MIT-BIH arrhythmia database, the proposed model is compared with other models, revealing that it achieves better accuracy.

The organization of the article is as follows: Sect. 2 elaborates the proposed methodology and Sect. 3 explains the overall process of the proposed system. Section 4 presents the results, and Sect. 5 discusses the evaluation metrics and inferences. The last section concludes the article and presents the future scope of work.
2 The Proposed Methodology
This work is divided into two parts. The first part applies data preprocessing to make the dataset balanced and ready for use. The second part uses Convolutional Neural Network classification techniques to detect arrhythmia. Figure 2 shows the roadmap for the entire methodology.
3 The Overall Process of the Proposed System
3.1 Data Acquisition
In this study, the dataset was obtained from the MIT-BIH Arrhythmia data records. The arrhythmia database is a publicly available dataset that contains standard material for cardiac arrhythmia detection. It has been used since 1980 for basic research on heart rhythm and related illnesses and for the evaluation and development of medical devices. The aim of the database is to support automated arrhythmia detectors that cope with signal diversity and can perform automated cardiac diagnosis based on that information. The complexities of the ECG, such as changes in the heartbeat waveform and the corresponding cardiac pulse, as well as the significant effects of artifacts and noise, make signal processing challenging. Recording ECG signals, by contrast, is easy to automate, and a variety of publicly available databases store recorded ECG signals for future medical use.
3.2 Data Preprocessing and Visualization
Even though ECG data taken from the MIT-BIH dataset is unlikely to have as many null and redundant values as data received directly from a patient or a hospital, it still contains noise that needs to be addressed before the further steps of the system. First, the dataset is examined and null values are removed. The dataset is then divided into two parts for training and testing the model in a ratio of 80:20: 80% is used as the training dataset and 20% as the testing dataset. The two parts are stored in two different csv files. The dataset consists of five classes, namely normal beats, supraventricular premature beats, premature ventricular contractions, a mixture of ventricular and normal beats, and unclassified beats. Data visualization is performed on the two csv files to get a better idea of the five classes and the number of cases available in each of them. The training dataset is visualized as a graph in Fig. 3, and the complete data is given in Table 1. Similarly, the testing dataset is visualized and described in Fig. 4 and Table 2. Then, to understand the ECG of the different classes better, a 1-beat ECG for every class is prepared and shown in Fig. 5.

Fig. 2 Flowchart for methodology
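The null-value removal and 80:20 split described above can be sketched in plain Python as follows. This is a minimal illustration, not the authors' code: the helper names `drop_null_rows` and `stratified_split` are hypothetical, the toy rows stand in for the ECG csv data, and splitting each class separately (stratification) is an assumption, since the chapter only states the overall ratio.

```python
import random
from collections import defaultdict


def drop_null_rows(rows):
    """Return only the rows that contain no None or NaN values."""
    def is_null(value):
        # NaN is the only float that is not equal to itself.
        return value is None or (isinstance(value, float) and value != value)
    return [row for row in rows if not any(is_null(v) for v in row)]


def stratified_split(rows, train_ratio=0.8, seed=0):
    """Split rows into train/test per class; the label is the last column."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for row in rows:
        by_class[row[-1]].append(row)
    train, test = [], []
    for label in sorted(by_class):
        members = by_class[label]
        rng.shuffle(members)
        cut = round(len(members) * train_ratio)
        train.extend(members[:cut])
        test.extend(members[cut:])
    return train, test


# Toy data (hypothetical): two features plus a class label in the last column.
toy = [[0.1 * i, 0.2 * i, i % 2] for i in range(20)]
toy.append([None, 0.5, 1])          # row with a missing value
toy.append([float("nan"), 0.3, 0])  # row with a NaN value

clean = drop_null_rows(toy)
train_rows, test_rows = stratified_split(clean)
print(len(clean), len(train_rows), len(test_rows))
```

Splitting per class keeps the rare classes represented in both parts, which matters for a dataset as imbalanced as the one in Tables 1 and 2.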
Table 1 Training classes with different ECGs

| Class | Number of beats |
|---|---|
| Normal beats | 72,471 |
| Supraventricular premature beats | 2223 |
| Premature ventricular contractions | 5788 |
| Mixture of ventricular and normal beats | 641 |
| Unclassified beats | 6431 |
Fig. 3 Training dataset distribution among 5 classes
Table 2 Testing classes with different ECGs

| Class | Number of beats |
|---|---|
| Normal beats | 18,118 |
| Supraventricular premature beats | 556 |
| Premature ventricular contractions | 1448 |
| Mixture of ventricular and normal beats | 162 |
| Unclassified beats | 1608 |
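The class imbalance visible in Tables 1 and 2 can be quantified with a few lines of Python. The counts below are copied from the two tables; the inverse-frequency class weights are shown only as one common way to express the imbalance and are an assumption, not a step the chapter reports using.

```python
# Class counts copied from Table 1 (training) and Table 2 (testing).
TRAIN_COUNTS = {
    "Normal beats": 72471,
    "Supraventricular premature beats": 2223,
    "Premature ventricular contractions": 5788,
    "Mixture of ventricular and normal beats": 641,
    "Unclassified beats": 6431,
}
TEST_COUNTS = {
    "Normal beats": 18118,
    "Supraventricular premature beats": 556,
    "Premature ventricular contractions": 1448,
    "Mixture of ventricular and normal beats": 162,
    "Unclassified beats": 1608,
}


def class_shares(counts):
    """Fraction of all beats that falls into each class."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


def inverse_frequency_weights(counts):
    """Hypothetical helper: weight = total / (number of classes * count)."""
    total = sum(counts.values())
    k = len(counts)
    return {name: total / (k * n) for name, n in counts.items()}


train_total = sum(TRAIN_COUNTS.values())
test_total = sum(TEST_COUNTS.values())
train_fraction = train_total / (train_total + test_total)

shares = class_shares(TRAIN_COUNTS)
weights = inverse_frequency_weights(TRAIN_COUNTS)
for name in TRAIN_COUNTS:
    print(f"{name}: share={shares[name]:.4f}, weight={weights[name]:.2f}")
print(f"training fraction of all beats: {train_fraction:.4f}")
```

The totals confirm the 80:20 split stated in Sect. 3.2, and the shares show that normal beats make up more than four fifths of the training data.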
Fig. 4 Testing dataset distribution among 5 classes

3.3 ECG Classification
A novel CNN classification technique is developed for the classification and detection of arrhythmia using the ECG signal. The model uses ResNet, LSTM, and Conv-1D models as its backbone. It outperforms various other machine learning and deep learning classification models, with an accuracy of 98.6%.

(a) CNNs, ResNet and LSTM
CNNs have two benefits over standard neural networks, weight sharing and local connections, which improve their capacity to extract features while minimizing the number of training parameters. An input layer, a convolution layer, a pooling layer, a fully connected layer, and an output layer make up the basic structure of a CNN, with the output of one layer functioning as the input of the next layer in the network, as depicted in Fig. 6. In most cases, the convolution and pooling layers are employed in the architecture in that order. The convolution layer, which is at the heart of the CNN, is made up of numerous feature maps, each of which comprises various neurons. When a CNN is used to classify images, the convolution layer scans the image with the convolution kernel and extracts image features using the information from neighboring regions of the image. Figure 6 shows the structure of the CNN.
Fig. 5 1 beat ECG for each category
Fig. 6 Structure of CNN
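The output of a convolution layer is computed as

$$X_j^{l+1} = f\left( \sum_{i \in M_j} X_i^{l} \cdot k_{ij}^{l+1} + b_j^{l+1} \right)$$

where $X_j^{l+1}$ denotes the $j$th feature map of the $(l+1)$th convolution layer, $M_j$ denotes the set of input feature maps connected to the $j$th output, $k_{ij}^{l+1}$ and $b_j^{l+1}$ are the convolution kernel and the bias, and $f$ denotes the activation function. When used for image classification, the pooling layer mimics the human visual system in order to reduce the data dimensions and represent the image with higher-level features, as follows:

$$X_j^{l+1} = X_j^{l} \otimes k_j^{l+1} + b_j^{l+1}$$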
Here ⊗ represents the pooling operation. A CNN's performance generally improves as it gets deeper. However, as the network deepens, two key issues emerge: (1) the gradient vanishes, which hampers network convergence, and (2) the accuracy tends to saturate. ResNets are easier to optimize and can gain accuracy from significantly increased depth. Figure 7, taken from [32], shows the ResNet building blocks with x as input. An LSTM is a recurrent neural network. It can bridge long delays between relevant events, which makes it suited for time-series prediction of critical occurrences [29]. The network is capable of retaining past data and learning long-term dependency information. To update and maintain historical information, the long short-term memory network comprises an input gate, a forget gate, an output gate, and a cell unit. An LSTM block is depicted in Fig. 8.
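The convolution and pooling steps described above can be sketched for a one-dimensional signal as follows. This is an illustrative sketch under stated assumptions: the function names are hypothetical, the convolution is written as the sliding dot product used by most deep learning frameworks, ReLU is chosen as the activation f, and max pooling without the optional weight and bias terms stands in for the pooling operation.

```python
def relu(x):
    """Rectified linear unit, used here as the activation function f."""
    return max(0.0, x)


def conv1d_feature_map(inputs, kernels, bias, activation=relu):
    """Compute one output feature map from several input maps.

    inputs  : list of input feature maps X_i (lists of equal length)
    kernels : one kernel k_ij per input map (lists of equal length)
    bias    : scalar bias b_j added before the activation
    """
    width = len(kernels[0])
    out_len = len(inputs[0]) - width + 1
    output = []
    for t in range(out_len):
        total = bias
        for x, k in zip(inputs, kernels):
            total += sum(x[t + m] * k[m] for m in range(width))
        output.append(activation(total))
    return output


def max_pool1d(x, size=2):
    """Non-overlapping max pooling; halves the length when size is 2."""
    return [max(x[t:t + size]) for t in range(0, len(x) - size + 1, size)]


# Toy signal (hypothetical values) with a single peak, and a difference kernel.
signal = [0.0, 1.0, 3.0, 1.0, 0.0, -1.0]
feature = conv1d_feature_map([signal], [[1.0, 0.0, -1.0]], bias=0.0)
pooled = max_pool1d(feature)
print(feature)  # [0.0, 0.0, 3.0, 2.0]
print(pooled)   # [0.0, 3.0]
```

The kernel responds to the falling edge of the peak, and pooling keeps the strongest response in each window while halving the length of the feature map.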
Fig. 7 Res-net building blocks
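The idea behind the residual building block, in which the block learns a correction F(x) that is added to its own input x, can be sketched as below. The sketch is a simplification under stated assumptions: small fully connected layers replace the convolution layers of a real ResNet block, and all names and weights are hypothetical.

```python
def relu_vec(values):
    """Apply ReLU element-wise to a list of numbers."""
    return [max(0.0, v) for v in values]


def dense(values, weights, bias):
    """Fully connected layer: one weight row and one bias per output unit."""
    return [sum(w * v for w, v in zip(row, values)) + b
            for row, b in zip(weights, bias)]


def residual_block(x, w1, b1, w2, b2):
    """Return ReLU(F(x) + x), where F is two stacked layers."""
    hidden = relu_vec(dense(x, w1, b1))
    correction = dense(hidden, w2, b2)
    # Skip connection: the input is added back before the final activation.
    return relu_vec([c + xi for c, xi in zip(correction, x)])


# With all weights at zero, F(x) is zero and the block passes ReLU(x) through.
zeros = [[0.0] * 3 for _ in range(3)]
no_bias = [0.0] * 3
x = [1.0, -2.0, 3.0]
identity_out = residual_block(x, zeros, no_bias, zeros, no_bias)
print(identity_out)  # [1.0, 0.0, 3.0]
```

Because the skip connection carries the input forward unchanged, a block that has learned nothing still passes its input through, which is why adding such blocks does not make a deep network harder to train.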
Fig. 8 Structure of LSTM
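A single time step of an LSTM cell, with the input, forget, and output gates and the cell unit mentioned above, can be sketched for scalar states as follows. The parameter names in the dictionary are hypothetical, and real implementations use weight matrices over vectors rather than scalars.

```python
import math


def sigmoid(z):
    """Logistic function, which squashes a value into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, p):
    """One LSTM time step with scalar input, hidden state and cell state."""
    i = sigmoid(p["wi"] * x + p["ui"] * h_prev + p["bi"])    # input gate
    f = sigmoid(p["wf"] * x + p["uf"] * h_prev + p["bf"])    # forget gate
    o = sigmoid(p["wo"] * x + p["uo"] * h_prev + p["bo"])    # output gate
    g = math.tanh(p["wg"] * x + p["ug"] * h_prev + p["bg"])  # candidate value
    c = f * c_prev + i * g   # keep part of the old memory, add new content
    h = o * math.tanh(c)     # expose part of the memory as the output
    return h, c


# With all parameters at zero, every gate equals 0.5 and the candidate is 0,
# so the cell state is simply halved at each step.
names = ["wi", "ui", "bi", "wf", "uf", "bf", "wo", "uo", "bo", "wg", "ug", "bg"]
zero_params = {name: 0.0 for name in names}
h, c = lstm_step(x=1.0, h_prev=0.0, c_prev=2.0, p=zero_params)
print(h, c)
```

The forget gate controls how much of the previous cell state survives, which is the mechanism that lets the network retain information over long intervals.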
(b) Proposed Model
The proposed model combines the three models explained earlier into a novel model with an increased accuracy of 98.6%. The suggested model can extract various ECG features from the same input, resulting in an effective representation of the internal structural characteristics of the ECG data and hence improved classification accuracy. Based on the MIT-BIH database, the updated ResNet model is employed to achieve high-precision detection and classification of the five heartbeat categories. First, the ResNet section processes the data, which is then passed to the novel section. This section is built from different CNN networks combined with an LSTM, which serves as its backbone, to finally classify and detect the presence of arrhythmia in the ECG. The model must be compiled before the CNN is trained. Various blocks, such as max pooling, batch normalization, convolutional blocks, and identity blocks, are combined to make the model efficient. The optimizer, the loss function, and the learning rate are all declared when the model is compiled. The optimizer and the loss function are critical components that allow the convolutional neural network to handle the database effectively. The learning rate of the neural network is determined by the optimizer's settings. During the compilation of the model, the optimizer is set to Adam, which combines two gradient descent methodologies. The loss is set to categorical cross-entropy to quantify the errors of the proposed classification model. The model is represented in Fig. 9.
Fig. 9 Structure of proposed model
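The two compile-time choices named above, the categorical cross-entropy loss and the Adam optimizer, can be illustrated in plain Python. This is a sketch under stated assumptions: the function names are hypothetical, a single scalar parameter is updated, and the hyperparameter values are the commonly cited Adam defaults, since the chapter does not report the values it used.

```python
import math


def categorical_cross_entropy(y_true, y_pred, eps=1e-12):
    """Loss for one sample: y_true is one-hot, y_pred holds probabilities."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))


def adam_step(param, grad, m, v, t,
              lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a scalar parameter at step t (t starts at 1).

    m is a running mean of the gradients (momentum-like term) and v is a
    running mean of the squared gradients (per-parameter step scaling).
    """
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad * grad
    m_hat = m / (1 - beta1 ** t)   # bias correction for the early steps
    v_hat = v / (1 - beta2 ** t)
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v


# One sample from a five-class problem whose true class is the third one.
loss = categorical_cross_entropy([0, 0, 1, 0, 0], [0.1, 0.1, 0.6, 0.1, 0.1])
print(round(loss, 4))

# First Adam step on a parameter whose current gradient is 2.0.
new_param, m, v = adam_step(param=1.0, grad=2.0, m=0.0, v=0.0, t=1)
print(new_param)
```

On the first step the bias-corrected ratio is close to the sign of the gradient, so the parameter moves by roughly the learning rate regardless of the gradient's magnitude.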
Fig. 10 Model accuracy for test data
4 Results
The proposed model is implemented in Jupyter Notebook. The hardware consists of an Intel Core i5 CPU, a GTX 1650 graphics processing unit, and 16 GB of system memory. The operating system is Windows 11, and the coding environment is Python 3.8.6 with TensorFlow as the framework. Kaggle is used as the source of the dataset. ReLU and sigmoid are used as activation functions. In the experiments, the model is trained for 50 epochs, and accuracy and mean squared error are calculated for every epoch. Figures 10 and 11 show accuracy and loss, respectively, as functions of the number of epochs. As the number of epochs increases, the classification accuracy increases and the model loss decreases significantly. After the 45th epoch, both curves stabilize and no longer change significantly, so the model reaches a stable classification accuracy and a minimal loss by the 50th epoch. Table 3 compares the modified model described in this research with other state-of-the-art models that used the same dataset. The proposed model surpasses the other models used for heartbeat classification in terms of overall accuracy. Figure 12 depicts the confusion matrix of the proposed model.
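The two quantities tracked per epoch, accuracy and mean squared error, can be computed from predicted class probabilities as sketched below. The chapter does not state how its MSE is defined, so averaging the squared difference between one-hot labels and predicted probabilities is an assumption, and the sample values are hypothetical.

```python
def accuracy(labels, probabilities):
    """Fraction of samples whose most probable class equals the true label."""
    correct = 0
    for label, probs in zip(labels, probabilities):
        predicted = max(range(len(probs)), key=probs.__getitem__)
        if predicted == label:
            correct += 1
    return correct / len(labels)


def mean_squared_error(labels, probabilities):
    """Mean squared difference between one-hot labels and probabilities."""
    total, count = 0.0, 0
    for label, probs in zip(labels, probabilities):
        for k, p in enumerate(probs):
            target = 1.0 if k == label else 0.0
            total += (target - p) ** 2
            count += 1
    return total / count


# Three hypothetical samples from a three-class problem.
labels = [0, 2, 1]
probabilities = [
    [0.8, 0.1, 0.1],   # correct and confident
    [0.2, 0.2, 0.6],   # correct
    [0.5, 0.4, 0.1],   # wrong: class 0 is predicted, class 1 is true
]
acc = accuracy(labels, probabilities)
mse = mean_squared_error(labels, probabilities)
print(round(acc, 4), round(mse, 4))
```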
Fig. 11 Model loss for test data

Table 3 Overall accuracy of the proposed model compared with other models

| Models | Accuracy (%) |
|---|---|
| Ensemble learning [16] | 94.20 |
| BbNNs [31] | 94 [18] |
| End-to-end DNN [29] | 94.70 [15] |
| 1D-CNN [15] | 95.13 |
| Improved ResNet-18 [34] | 96.50 |
| Proposed model | 98.4 |

Fig. 12 Confusion Matrix of the model

5 Discussions
The study employed precision, recall, F1-score, and a confusion matrix to evaluate the model's performance. These metrics are based on the numbers of true positives (tp), false positives (fp), true negatives (tn), and false negatives (fn).

Precision reflects how well the model performed on the test data. It is the proportion of samples predicted as positive that are actually positive:

$$p = \frac{tp}{tp + fp} \tag{1}$$

Recall is defined as the proportion of all relevant results that are correctly categorized by the algorithm:

$$r = \frac{tp}{tp + fn} \tag{2}$$

The F1-score is the harmonic mean of precision and recall:

$$F1\text{-score} = 2 \cdot \frac{\text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}} \tag{3}$$

Fig. 13 Classification Report for the model
Figure 13 gives the classification report of the proposed model for all five arrhythmia classes. The report consists of the scores for precision, recall, and F1-score along with the final accuracy. In addition, the model achieves a test MSE of 0.06 and a test accuracy of 0.987.
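A classification report of this kind can be derived from a confusion matrix as sketched below, following the definitions of precision, recall, and F1-score given above. The function name and the small two-class matrix are hypothetical and are not the values of Fig. 12; rows are assumed to hold the true classes and columns the predicted classes.

```python
def per_class_metrics(confusion):
    """Precision, recall and F1 per class from a square confusion matrix.

    confusion[r][c] counts the samples of true class r predicted as class c.
    """
    n = len(confusion)
    report = []
    for cls in range(n):
        tp = confusion[cls][cls]
        fp = sum(confusion[r][cls] for r in range(n)) - tp  # column minus tp
        fn = sum(confusion[cls]) - tp                        # row minus tp
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            f1 = 2 * precision * recall / (precision + recall)
        else:
            f1 = 0.0
        report.append({"precision": precision, "recall": recall, "f1": f1})
    return report


# Hypothetical two-class example.
toy_confusion = [
    [8, 2],   # true class 0: 8 correct, 2 predicted as class 1
    [1, 9],   # true class 1: 1 predicted as class 0, 9 correct
]
report = per_class_metrics(toy_confusion)
for cls, scores in enumerate(report):
    print(cls, {name: round(value, 4) for name, value in scores.items()})
```

Reporting the metrics per class matters here because overall accuracy is dominated by the normal beats, which form the large majority of the dataset.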
6 Conclusion
In this research, a novel RES-CNN-LSTM based model for ECG heartbeat classification is proposed, which combines CNNs and RNNs. From the results and findings of the experiments, it is clear that the proposed model can be used to detect arrhythmia successfully. Furthermore, the results show that the suggested model outperforms other models in terms of accuracy, with a rate of 98% and a precision of 99%. Five ECG beat types are studied in this article. Various amounts of noise can be introduced into the ECG readings to examine how the performance changes. Further studies can include several kinds of beats and multiple beats to generalize the findings.
References
1. Guleria, P., Naga Srinivasu, P., Ahmed, S., Almusallam, N., & Alarfaj, F. K. (2022). XAI framework for cardiovascular disease prediction using classification techniques. Electronics, 11(24), 4086. https://doi.org/10.3390/electronics11244086
2. Saya, S., Hennebry, T. A., Lozano, P., Lazzara, R., & Schechter, E. (2008). Coronary slow flow phenomenon and risk for sudden cardiac death due to ventricular arrhythmias: A case report and review of literature. Clinical Cardiology, 31(8), 352–355.
3. World Health Organization. (2019). Cardiovascular Diseases (CVDs). WHO.
4. National Heart, Lung, and Blood Institute. (2019). Arrhythmia, National Heart, Lung, and Blood Institute.
5. Min, S., Lee, B., & Yoon, S. (2017). Deep learning in bioinformatics. Briefings in Bioinformatics, 18, 851–869.
6. Sanders, R. A., Kurosawa, T. A., & Sist, M. D. (2018). Ambulatory electrocardiographic evaluation of the occurrence of arrhythmias in healthy Salukis. Journal of the American Veterinary Medical Association, 252(8), 966–969.
7. Sannino, G., & de Pietro, G. (2018). A deep learning approach for ECG-based heartbeat classification for arrhythmia detection. Future Generation Computer Systems, 86(Sep.), 446–455.
8. Wang, J., Ye, Y., Pan, X., & Gao, X. (2015). Parallel-type fractional zero-phase filtering for ECG signal denoising. Biomedical Signal Processing and Control, 18, 36–41.
9. Xie, L., Li, Z., Zhou, Y., He, Y., & Zhu, J. (2020). Computational diagnostic techniques for electrocardiogram signal analysis. Sensors, 20(21), 6318.
10. Katircioglu-Öztürk, D., Güvenir, H. A., Ravens, U., & Baykal, N. (2017). A window-based time series feature extraction method. Computers in Biology and Medicine, 89, 466–486.
11. Jung, Y., & Kim, H. (2017). Detection of PVC by using a wavelet-based statistical ECG monitoring procedure. Biomedical Signal Processing and Control, 36, 176–182.
12. Naga Srinivasu, P., Srinivas, G., & Srinivas Rao, T. (2016). An Automated Brain MRI image segmentation using a Generic Algorithm and TLBO. International Journal of Control Theory and Applications, 9(32), 233–241.
13. Raj, S., Ray, K. C., & Shankar, O. (2016). Cardiac arrhythmia beat classification using DOST and PSO tuned SVM. Computer Methods and Programs in Biomedicine, 136, 163–177.
14. Casas, M. M., Avitia, R. L., Gonzalez-Navarro, F. F., Cardenas-Haro, J. A., & Reyna, M. A. (2018). Bayesian classification models for premature ventricular contraction detection on ECG traces. Journal of Healthcare Engineering, 2018, Article ID 2694768, 7 pages.
15. Kiranyaz, S., Ince, T., & Gabbouj, M. (2016). Real-time patient-specific ECG classification by 1-D convolutional neural networks. IEEE Transactions on Biomedical Engineering, 63(3), 664–675.
16. Dózsa, T., Bognár, G., & Kovács, P. (2020). Ensemble learning for heartbeat classification using adaptive orthogonal transformations. In Computer aided systems Theory–EUROCAST 2019. Lecture notes in computer science (vol. 12014). Springer.
17. de Chazal, P., O’Dwyer, M., & Reilly, R. B. (2004). Automatic classification of heartbeats using ECG morphology and heartbeat interval features. IEEE Transactions on Biomedical Engineering, 51(7), 1196–1206.
18. Saini, I., Singh, D., & Khosla, A. (2014). Electrocardiogram beat classification using empirical mode decomposition and multiclass directed acyclic graph support vector machine. Computers & Electrical Engineering, 40(5), 1774–1787.
19. Thomas, M., Das, M. K., & Ari, S. (2015). Automatic ECG arrhythmia classification using dual tree complex wavelet based features. AEU-International Journal of Electronics and Communications, 69(4), 715–721.
20. Srinivasu, P. N., Shafi, J., Krishna, T. B., Sujatha, C. N., Praveen, S. P., & Ijaz, M. F. (2022). Using recurrent neural networks for predicting Type-2 diabetes from genomic and tabular data. Diagnostics, 12(12), 3067. https://doi.org/10.3390/diagnostics12123067
21. Maggiori, E., Tarabalka, Y., Charpiat, G., & Alliez, P. (2017). Convolutional neural networks for large-scale remote-sensing image classification. IEEE Transactions on Geoscience and Remote Sensing, 55(2), 645–657.
22. Russakovsky, O., Deng, J., Su, H., et al. (2015). ImageNet large scale visual recognition challenge. International Journal of Computer Vision, 115(3), 211–252.
23. Lu, P., Guo, S., Zhang, H., et al. (2018). Research on improved depth belief network-based prediction of cardiovascular diseases. Journal of Healthcare Engineering, 2018, Article ID 8954878, 9 pages.
24. Acharya, U. R., Oh, S. L., Hagiwara, Y., et al. (2017). A deep convolutional neural network model to classify heartbeats. Computers in Biology and Medicine, 89, 389–396.
25. Li, W., & Li, J. (2018). Local deep field for electrocardiogram beat classification. IEEE Sensors Journal, 18(4), 1656–1664.
26. Yıldırım, O., Pławiak, P., Tan, R.-S., & Acharya, U. R. (2018). Arrhythmia detection using deep convolutional neural network with long duration ECG signals. Computers in Biology and Medicine, 102, 411–420.
27. Hannun, A. Y., Rajpurkar, P., Haghpanahi, M., et al. (2019). Cardiologist-level arrhythmia detection and classification in ambulatory electrocardiograms using a deep neural network. Nature Medicine, 25(1), 65–69.
28. Oh, S. L., Ng, E. Y. K., Tan, R. S., & Acharya, U. R. (2019). Automated beat-wise arrhythmia diagnosis using modified U-net on extended electrocardiographic recordings with heterogeneous arrhythmia types. Computers in Biology and Medicine, 105, 92–101.
29. Xu, S. S., Mak, M. W., & Cheung, C. C. (2019). Towards end-to-end ECG classification with raw signal extraction and deep neural networks. IEEE Journal of Biomedical and Health Informatics, 23(4), 1574–1584.
30. Xiang, Y., Luo, J., Zhu, T., Wang, S., Xiang, X., & Meng, J. (2018). ECG-based heartbeat classification using two-level convolutional neural network and RR interval difference. IEICE Transactions on Information and Systems, E101.D(4), 1189–1198.
31. Wei Jiang, & Seong Kong, G. (2007). Block-based neural networks for personalized ECG signal classification. IEEE Transactions on Neural Networks, 18(6), 1750–1761.
32. Sellami, A., & Hwang, H. (2019). A robust deep convolutional neural network with batch-weighted loss for heartbeat classification. Expert Systems with Applications, 122(May), 75–84.
33. He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image recognition. In IEEE conference on computer vision and pattern recognition, Las Vegas, NV, USA.
34. Enbiao Jing, Haiyang Zhang, ZhiGang Li, Yazhi Liu, Zhanlin Ji, Ivan Ganchev. (2021). ECG heartbeat classification based on an improved ResNet-18 model. Computational and Mathematical Methods in Medicine, 2021, Article ID 6649970, 13 pages. https://doi.org/10.1155/2021/6649970
A Review of Brain-Computer Interface (BCI) System: Advancement and Applications
Bishal Kumar Gupta, Tawal Kumar Koirala, Jyoti Rai, Baidyanath Panda, and Akash Kumar Bhoi
Abstract Brain-Computer Interface (BCI) is a cutting-edge and diverse area of ongoing research based on neuroscience, signal processing, biomedical sensors, and hardware. Numerous ground-breaking studies have been conducted in this area over the last few decades. However, the BCI domain has yet to be the subject of a thorough examination. As a result, this study provides an in-depth analysis of the BCI issue. In addition, this research supports this field’s importance by examining several BCI applications. Finally, each BCI system component is briefly explained, including procedures, datasets, feature extraction techniques, evaluation measurement matrices, current BCI algorithms, and classifiers. A basic overview of BCI sensors is also presented. Next, the study describes some unsolved BCI issues and possible remedies.
Keywords Brain-computer interface · BCI sensors · Classifiers · BCI domain · Diverse area
B. K. Gupta · T. K. Koirala · J. Rai Department of Computer Science and Engineering, Sikkim Manipal Institute of Technology, Sikkim Manipal University, Majhitar, Sikkim 737136, India e-mail: [email protected] T. K. Koirala e-mail: [email protected] J. Rai e-mail: [email protected] B. Panda LTIMindtree, 1 American Row, 3Rd Floor, Hartford, CT 06103, USA A. K. Bhoi (B) Directorate of Research, Sikkim Manipal University, Gangtok, Sikkim 737102, India e-mail: [email protected] KIET Group of Institutions, Delhi-NCR, Ghaziabad 201206, India © The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 P. Barsocchi et al. (eds.), Enabling Person-Centric Healthcare using Ambient Assistive Technology, Studies in Computational Intelligence 1108, https://doi.org/10.1007/978-3-031-38281-9_9
1 Introduction
For scientists and academics, pursuing the direct connection between a human and a machine has long held appeal. BCI technology connects the brain to the world. Brainwaves connect the BCI system to the computer. Brain activity can control external gadgets. It’s an interesting study topic and a probable brain-technology link. Numerous research and development initiatives have exploited this notion, one of the fastest-growing scientific domains. Several scientists tried various BCI types of human-computer interaction. Modern signal detection, recording, and analysis systems are complicated, having evolved from a basic notion in the early digital age. Hans Berger [1] invented the first electroencephalogram (EEG) [2] in 1929 to monitor brain electrical activity through a human scalp. EEG signals have since been used therapeutically to find abnormalities in the brain. The phrase “brain-computer interface” was originally coined in 1973 by Vidal [3], who attempted the first human-computer interaction using EEG. The author thoroughly broke down every component required to create an effective BCI.

People have always been intrigued by the idea of integrating technology with brainpower. This idea has recently come to fruition because of developments in engineering and neurology that have made it possible to mend and perhaps improve human physical and mental capabilities. Deaf people can have cochlear implants [4], and Parkinson’s patients can get deep brain stimulation. Every BCI-based application employs a distinct strategy and methodology. Each technique offers a unique combination of advantages and disadvantages. BCI technology’s future depends on performance improvements. To identify the most promising BCI strategies, objectively assessing and comparing BCI methods is necessary. BCI’s fundamental components include signal capture, preprocessing, feature extraction, categorization, and device control. Signal acquisition is needed to link a brain to a computer and extract information from signals. The associated signal is more usable after preprocessing, feature extraction, and classification [5] (Table 1).

Abiri et al. evaluated different EEG-based BCI system experimental paradigms [6]. Each experiment examined multiple EEG decoding and classification systems. Tiwari et al. [7] outlined the development of BCI and introduced brain activity. The human brain, BCI, its phases, and signal extraction techniques and algorithms for utilizing data have been extensively reviewed. The authors described BCI stages, including signal acquisition, feature extraction, and categorization. Brain complexity makes human thoughts and signals non-stationary and nonlinear. Hence, eliciting deeper brain comprehension is difficult. These deeper insights will improve BCI application. Vasiljevic et al. conducted a systematic literature review (SLR) on BCI games using standard technology [8]. The authors examined the data to describe the industry and HCI’s issues in BCI-based games that employ consumer-grade technology. The results show that more BCI games, particularly those geared towards pleasure, have more user-friendly controls and were designed for study. However, the process’s search and categorization phases are covered here.
Table 1 Current research and reviews on BCI technology

| Ref. No | Purposes | Challenges |
|---|---|---|
| [6] | The EEG-based BCI paradigm’s benefits, drawbacks, decoding algorithms, and classification approaches are assessed | Signal processing, new decoders, closed-loop supervisory control, training, and fatigue |
| [7] | A thorough analysis of BCI phases, signal extraction techniques, classifiers, and brain anatomy | Human thought is non-stationary, and the signals it generates are nonlinear |
| [8] | A detailed examination of BCI’s challenges and current EEG-based BCI gaming research | Biased search and categorization |
| [9] | A thorough review of BCI sensors that detect brain signals | Neurosurgery is required to implant brain sensors |
| [10] | An overview of current features, classifiers, and the usual invasive and non-invasive BCI approaches | To develop high-resolution, low-density electrode brain signal-capturing devices |
| [11] | The use of BCI and neurofeedback about haptic technology is briefly described in this work | Only a limited portion of BCI is covered in this research (haptic technology) |
| [12] | This work employs EEG-based BCI to recognize emotion | Literature couldn’t portray contradicting sentiments since there are no real-life event databases |
| [13] | This work discusses BCI experiments connected to deep learning using purely non-invasive approaches | This study only focuses on non-invasive brain signals |
| [14] | This study’s subject was popular methods, including deep learning models and developments in signal sensing technology | There is no discussion or evaluation of popular feature extraction techniques |
As BCI technology evolved, [9] reviewed the sensing modalities used to capture brain signals. For system applications, the BCI sensor circuit is evaluated in terms of brain pattern recognition. Sensors differed in spatial, temporal, and practical factors, including portability, pervasiveness, and upkeep. EEG, ECoG, MEG, and MRI were used to assess brain responses [10]. Each application needs machine learning and pattern recognition to interpret such responses. That article briefly reviews feature extraction and classification methods for brain data. Fleury et al. examined the SMR, P300, and SSVEP paradigms for haptic interfaces, together with device design [11]. The authors identified key trends in applying haptics to BCIs and neurofeedback (NF), and they assessed the efficacy of alternative methods. Haptic interfaces might improve productivity and the usefulness of feedback, particularly in SMR-based motor rehabilitation. Torres et al. [12] reviewed the scientific literature from 2015 to 2020 on EEG-based emotion recognition. The review compares BCI implementation methodologies and trends, covering emotions, classification algorithms, datasets, and performance assessments. Zhang et al. [13] studied deep learning and non-invasive brain signal classification. This study discusses brain
B. K. Gupta et al.
signals and deep learning for BCI research. Gu et al. [14] reviewed research from 2015 to 2019 on computational intelligence and EEG-based BCI to address gaps in earlier reviews. They covered advanced signal sensing and enhancement methods for improving EEG data. The present paper investigates current concerns in BCI and suggests future research topics. Its contributions are as follows: it presents a BCI taxonomy, describes a few classic BCI systems with their workflow and architectural concepts, and summarises advancements in brain-computer interface (BCI) technology.
2 Application of BCIs
BCI design depends on the intended use. According to Nijholt [15], BCI applications are either command-based or observation-based. Most command applications use electrodes to capture brain signals and translate them into control of an external device. Observation-based applications focus on understanding a subject's mental and emotional state so that the system can react to the subject's surroundings. The following is a list of some BCI [16] usability applications.
2.1 Potential Medical Uses
In medicine, BCIs are generally used to replace or restore brain function lost due to injury. For biomedical purposes, BCIs benefit diagnostics, treatment, and motor rehabilitation. By empowering people with mobility issues and providing monitoring and protection, biomedical technology and applications may reduce long-term illness and aid rehabilitation. However, a serious difficulty in creating such platforms is building precise technology that can handle the unexpected brain responses that may arise from conditions such as stroke [17]. More information on each of these applications is provided in the following subsections.
2.1.1 Substitute for CNS
These substitutes can restore or replace CNS functionality lost due to conditions such as paralysis and spinal cord damage brought on by a stroke or other traumatic event. People with such disorders may also have altered brain function, which makes such technology challenging to design. Myoelectric control, which records motor action potentials (the electrical impulses in muscles), is used in many robotic prostheses today. Bousseta et al. [18] developed an experimental technology for maneuvering a robotic prosthetic arm in four directions.
2.1.2 Evaluation and Diagnosis
BCIs may improve healthcare assessment and diagnosis. Perales et al. [19] proposed a BCI for gauging the attention span of children with cerebral palsy during gameplay. Another study [20] investigated the possibility of diagnosing schizophrenia by using a BCI to record EEG features. Other diagnostic techniques include those for detecting brain tumors [21], locating breast cancer [22], and identifying Parkinson's disease [23]. Children may have epilepsy, neurological problems, motor impairments, inattentiveness, or various forms of ADHD [24], among other ailments. Diagnostic tools are vital to patient health, so their operation must meet industry standards for security, acceptability, and accuracy.
2.1.3 Rehabilitation or Treatment
BCI is used for neurological applications, prosthetics, and therapy [25]. Among the various uses of BCI, post-stroke motor rehabilitation yields encouraging outcomes. Stroke is a condition in which obstructed blood flow can leave the body permanently disabled and prevent motor or strenuous activity [26]. Stroke therapy applications claim to support the user's movements or motor imagery through a robot or other equipment [27]. Tinnitus, cluster headaches, and Parkinson's disease (PD) are also treated. Deep brain stimulation (DBS) sends electrical impulses to the brain region that produces PD symptoms [28].
2.2 Non-biomedical Uses
BCI technology has economic promise, especially in non-biomedical applications. Most of these applications are games, affective computing, or entertainment programs. Research on medical and military applications, in contrast, focuses on robustness and high efficiency. A few common non-biomedical applications are described below.
2.2.1 Gaming
Gaming-focused BCIs are being studied, but BCIs cannot yet replace gaming controllers [34], and improving game usability with BCI requires further study. Dynamic Difficulty Adjustment (DDA) is triggered when a player's enthusiasm declines [29]. BCI games benefit from EEG data, but building such systems requires fine-tuning the game algorithms. At present, BCI powers various games with dated visuals.
2.2.2 Industry
Industrial robots may use EEG-based BCIs to keep employees safe. These technologies may replace the laborious button and joystick control of industrial robots. They may also stop machinery if a person is too tired or sick [30].
2.2.3 Artistic Application
Artistic BCIs are classified as passive, selective, direct, and collaborative. Passive artistic BCIs deliver pre-programmed responses to brain activity without user involvement. Selective systems give the user limited control over the process but no control over the creative output. Some artistic BCIs let users choose among several brush types and stroke motions [31].
2.2.4 Transport
In transportation, BCI is used to monitor alertness, for example to gauge driver fatigue and to improve airline pilot performance. Because these are critical applications, errors in the BCI system may have a serious financial and human impact on the parties concerned [32].
3 Principles of the BCI's Operation
Each user activity generates feedback of some type. For example, a command may instruct a robotic arm to move based on an imagined hand motion. Various internal systems are at work in this apparently simple arm movement: the brain's extensive network of synapses and nerves collaborates to communicate the intent. Figure 1 shows how brain signals are translated into usable commands.
1. Signal acquisition: Brain activity signals are collected so that they can be converted into instructions for a virtual or real-world application.
2. Preprocessing: Once the signals have been captured, preprocessing is required. The signals that the brain produces are often noisy and distorted by artifacts. This stage uses various techniques, including filters, to remove the noise and artifacts.
3. Feature extraction: This stage involves analyzing the signal and extracting data from it. Obtaining relevant information from the complex brain activity signal through simple analysis is challenging. Therefore, processing algorithms that extract mental attributes such as intention are essential.
4. Classification: The artifact-free signal is then passed to classification algorithms. Classification helps to identify the kind of mental task the individual is performing.
Fig. 1 BCI system architecture
5. Device control: The application or feedback device receives the command produced by classification. In a computer, the command moves the cursor; in a robotic arm, it moves the arm.
BCIs may be categorized according to several factors, including dependability, invasiveness, and autonomy, as shown in Fig. 1.
1. Dependability: BCIs are either dependent or independent. Dependent BCIs need some form of motor control, such as gaze control, from the operator or healthy participant. In contrast, independent BCIs do not require any motor control from the user, which makes them suitable for stroke patients or those with severe disabilities.
2. Invasiveness: BCIs are divided into three categories: invasive, partially invasive, and non-invasive. Invasive BCIs are surgically implanted within the brain during neurosurgery. Because they directly monitor brain cell activity, they are more accurate, and they can pick up signals from many regions. Partially invasive BCIs use electrocorticography (ECoG), with electrodes on the brain's accessible surface, to monitor the electrical activity of the cerebral cortex. Non-invasive BCIs sense brain activity externally. EEG, MEG, PET, fMRI, and fNIRS are non-invasive modalities. EEG is the most popular technology due to its affordability and portability.
3. Autonomy: BCIs may function in synchronous or asynchronous mode. In a synchronous BCI, the interaction takes place within a set period after a trigger from the system. In an asynchronous BCI, the user initiates the interaction by performing a mental task. Synchronous BCIs are simpler to develop but less user-friendly.
3.1 Invasive
An invasive BCI is implanted in the brain through neurosurgery. Invasive BCIs are more accurate because they monitor neuronal activity directly. They come in two types: single-unit BCIs, which detect signals from a single area of brain cells, and multi-unit BCIs, which detect signals from several areas [31]. However, the neurosurgical procedure may cause scar tissue: the body reacts to the foreign object by forming a scar around the electrodes, which reduces signal strength. Invasive BCIs are mainly used with blind and paralyzed patients.
3.2 Partially-Invasive
Partially invasive BCIs monitor brain activity with ECoG, using electrodes placed on the brain's cortex. Involuntary actions such as blinking also produce electrical activity in the brain. Because this activity interferes with the signals of interest, it is usually treated as noise when examining signals. Compared to non-invasive BCI, ECoG is less affected by noise, making interpretation simpler [33].
3.3 Electrocorticography (ECoG)
Electrocorticography (ECoG) is a partially invasive method of recording brain activity. The electrodes are placed on the surface of the brain, beneath the skull, which requires surgery. Compared with ECoG, EEG signals are far less precise: being closer to the source of brain activity improves the signal-to-noise ratio. Blinks and eye movements have little effect on ECoG signals. Unfortunately, ECoG is very difficult to use outside a surgical context and is only useful in the accessible parts of the brain [34].
3.4 Non-invasive
Non-invasive neuroimaging technologies have long been employed in human studies, and most BCI research involves non-invasive EEG-based BCIs. Non-invasive EEG technologies and interfaces have many additional uses. Since they do not need brain surgery, non-invasive applications and technologies have become popular recently (Fig. 2). Non-invasive technology uses a helmet-like electrode cap to monitor the brain's electrical activity from outside the skull. Non-invasive modalities for assessing brain activity include EEG, MEG, fMRI, fNIRS, and positron emission tomography (PET). These techniques are explained in detail below.
Fig. 2 Basic BCI control signals
3.4.1 Electroencephalography
EEG tracks electrical activity on the scalp caused by the firing of brain neurons. This activity is recorded rapidly using several electrodes on the head, especially over the cortex. With its high temporal resolution, electroencephalography (EEG) is the most common, safe, and economical technology for recording brain activity. EEG uses active or passive electrodes. Active electrodes amplify signals internally, whereas passive electrodes need an external amplifier. Amplification, whether internal or external, reduces the background noise and signal errors induced by cable movement. One issue with EEG is that gel or saline solution must be used to lower the skin-electrode contact resistance. In addition, the signal quality is poor, and background noise skews it. The International 10–20 system specifies how electrodes are placed across the scalp for recording [35]. EEG comprises electrical activity that spans many frequency bands.
3.4.2 Magnetoencephalography
Magnetoencephalography (MEG) measures the magnetic fields produced by electrical currents in the brain. Because magnetic fields pass through the skull with less distortion than electric fields, MEG offers better spatial resolution than EEG. It is a functional neuroimaging method that measures and analyzes the brain's magnetic field. MEG operates from outside the skull and is now often used in clinical practice. It has grown in importance, particularly for individuals with epilepsy and brain tumors: in patients with epilepsy, tumors, or other mass lesions, it can help identify brain areas with normal function. MEG complements EEG by measuring magnetic rather than electrical signals, and it can record signals with high temporal and spatial resolution. Because cerebral activity generates only minute magnetic fields, the sensors must be placed close to the surface of the head. Consequently, MEG needs specialized sensors, such as superconducting quantum interference device (SQUID) sensors [36].
3.4.3 fMRI
Functional magnetic resonance imaging (fMRI) is a non-invasive method that measures the changes in blood oxygen levels associated with brain activity. Its high spatial resolution lets fMRI localize brain activity [37], but its temporal resolution is limited to 1–2 s [38], and head movement may cause artifacts. fMRI was invented in the 1990s. It is non-invasive, radiation-free, and easy to use, and it provides high spatial resolution. Neurons get oxygen from the hemoglobin in capillary red blood cells, and an increased demand for oxygen increases blood flow. The magnetic properties of hemoglobin differ depending on whether it is oxygenated. Because of this difference, MRI equipment, a cylindrical tube with a powerful magnet, can identify active brain areas. In diffusion-weighted magnetic resonance imaging (DWI or DW-MRI), the images vary with the diffusion of water particles in the brain. In diffusion, particles move stochastically. Diffusion in the brain depends on the temperature, the structure of the microenvironment, and the particles under examination [39]. Diffusion tensor imaging (DTI) examines the three-dimensional diffusion tensor. It is a powerful MRI modality that gives the direction of water motion in each voxel and reveals microscopic tissue characteristics noninvasively [40].
3.4.4 fNIRS
Functional near-infrared spectroscopy (fNIRS) [41] directs near-infrared light into the brain and tracks changes at certain wavelengths in the reflected light. fNIRS typically detects regional blood volume and oxygenation. When a certain section of the brain is active, it needs more oxygen, which is delivered to the neurons by capillary red blood cells, so blood flow increases in the parts of the brain that are most active at that moment. fNIRS monitors the changes in oxygen levels brought on by different activities. The resulting images have a low temporal resolution (>2–5 s) but a high spatial resolution (1 cm), comparable to functional magnetic resonance imaging standards.
4 Signal Preprocessing and Signal Enhancement
Datasets are frequently corrupted by noise. The recorded data may be degraded by physiological activity such as heartbeats and eye blinks. This noise is removed during the preprocessing stage to provide clean data for feature extraction and classification. Because of its role in cleaning BCI signals, the preprocessing component is also known as the signal enhancement unit. The next subsections discuss several signal enhancement techniques used in BCI systems.
4.1 Independent Component Analysis
Independent component analysis (ICA) separates noise from EEG signals by treating them as independent sources. The underlying data are retained while the noise is removed. The approach splits the EEG signals into components that are fixed in space and independent in time. As a result, ICA is efficient both computationally and in separating noise [42].
4.2 Common Average Reference
The common average reference (CAR) is most typically used as a basic dimensionality reduction approach. It reduces noise that is common to all recorded channels, but it does not address channel-specific noise and may introduce noise into a channel that would otherwise be noise-free. CAR is a spatial filter: by subtracting the EEG activity shared across electrodes, it leaves only the activity specific to each EEG electrode [42].
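As an illustration of the subtraction described above, the following minimal sketch re-references a multichannel recording to its common average. The data layout (a list of time points, each holding one value per channel) and the function name are assumptions made for this example, not part of any cited method.

```python
def common_average_reference(samples):
    """Subtract the across-channel mean from every channel at each time point.

    `samples` is a list of time points; each time point is a list holding one
    value per EEG channel (a hypothetical layout chosen for this sketch).
    """
    referenced = []
    for frame in samples:
        mean = sum(frame) / len(frame)  # activity shared by all channels
        referenced.append([value - mean for value in frame])
    return referenced


# Two time points recorded on three channels.
eeg = [[11.0, 9.0, 10.0], [2.0, 4.0, 6.0]]
car = common_average_reference(eeg)
```

After re-referencing, the channel values at each time point sum to zero, which is why noise present on only one channel is spread, with reduced amplitude, onto the others.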
4.3 Adaptive Filters
An adaptive filter is a computational tool that models the relationship between its input and output signals iteratively. It uses an adaptive algorithm to adjust its own coefficients, so that its behavior changes according to the characteristics of the signals under examination [43].
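One widely used adaptive algorithm is least mean squares (LMS). The sketch below assumes a noise-cancellation setup that the text does not specify: a separate reference channel records the noise source (for example an eye-movement electrode), and the filter learns how that noise appears in the primary EEG channel. The function name, tap count, and step size `mu` are illustrative choices.

```python
import math


def lms_filter(reference, primary, n_taps=4, mu=0.01):
    """Adaptive noise cancellation with the LMS algorithm.

    reference: samples of the noise source (assumed to be recorded separately)
    primary:   samples of the contaminated channel
    Returns the cleaned signal (the filter error) and the final weights.
    """
    weights = [0.0] * n_taps
    buffer = [0.0] * n_taps          # most recent reference samples
    cleaned = []
    for x, d in zip(reference, primary):
        buffer = [x] + buffer[:-1]
        estimate = sum(w * b for w, b in zip(weights, buffer))  # noise estimate
        error = d - estimate                                    # cleaned sample
        # Self-adjusting coefficients: move each weight along the error gradient.
        weights = [w + 2 * mu * error * b for w, b in zip(weights, buffer)]
        cleaned.append(error)
    return cleaned, weights


# Slow "brain" rhythm contaminated by a scaled copy of a faster noise source.
noise = [math.sin(0.9 * k) for k in range(2000)]
brain = [math.sin(0.05 * k) for k in range(2000)]
recorded = [s + 0.5 * n for s, n in zip(brain, noise)]
cleaned, weights = lms_filter(noise, recorded)
```

The step size trades convergence speed against stability; too large a value makes the weights diverge.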
4.4 Principal Component Analysis
Principal Component Analysis (PCA) finds patterns in data by rotating the coordinate axes. Rather than being aligned with single time points, the rotated axes represent signal patterns as linear combinations of groups of time points. PCA rotates the axes while keeping them orthogonal so as to maximize the variance along the first axis. It reduces the feature dimension and aids data classification by ranking the components. Whether or not the noise is eliminated, PCA compresses distinct data more effectively than ICA [44].
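The axis rotation can be made concrete for the two-dimensional case, where the direction of maximum variance has a closed form. This is a minimal sketch for two channels only; real EEG data has many more dimensions and would need a general eigendecomposition. All names are illustrative.

```python
import math


def pca_2d(points):
    """PCA for 2-D data: first principal axis, variances, and projections."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    # Entries of the 2x2 covariance matrix [[a, b], [b, c]].
    a = sum((p[0] - mx) ** 2 for p in points) / n
    b = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    c = sum((p[1] - my) ** 2 for p in points) / n
    # Rotation angle that maximizes the variance along the first axis.
    angle = 0.5 * math.atan2(2 * b, a - c)
    axis = (math.cos(angle), math.sin(angle))
    half_trace = (a + c) / 2
    spread = math.sqrt(((a - c) / 2) ** 2 + b ** 2)
    variances = (half_trace + spread, half_trace - spread)  # ranked components
    scores = [(p[0] - mx) * axis[0] + (p[1] - my) * axis[1] for p in points]
    return axis, variances, scores


# Two perfectly correlated channels collapse onto a single component.
axis, variances, scores = pca_2d([(1.0, 1.0), (2.0, 2.0), (3.0, 3.0)])
```

Keeping only the components with the largest variances is what reduces the feature dimension.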
4.5 Surface Laplacian (SL)
The surface Laplacian (SL) is a technique for viewing EEG data at high spatial resolution. Because its estimates are reference-free, SL can be computed from EEG recorded with any reference scheme. It estimates the current density entering or exiting the scalp through the skull based only on the external shape of the volume conductor, and it does not need details of volume conduction. SL improves the spatial resolution of EEG. It requires no additional neuroanatomical assumptions, but it is sensitive to artifacts and to the spline parameters used [45].
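A simple way to see what SL does is the nearest-neighbour (Hjorth-style) approximation, in which each electrode's potential is compared with the average of its neighbours. This is a deliberately simplified stand-in for the spline-based estimators mentioned above; the neighbour map and function name are assumptions for the example.

```python
def hjorth_laplacian(potentials, neighbors):
    """Approximate the surface Laplacian at each electrode that has neighbours.

    potentials: dict mapping electrode name to its potential at one time point
    neighbors:  dict mapping electrode name to a list of surrounding electrodes
    """
    result = {}
    for name, value in potentials.items():
        nearby = neighbors.get(name, [])
        if not nearby:
            continue  # no estimate without surrounding electrodes
        local_mean = sum(potentials[n] for n in nearby) / len(nearby)
        result[name] = value - local_mean
    return result


# C3 surrounded by four electrodes; only C3 has a defined neighbourhood here.
potentials = {"C3": 8.0, "FC3": 4.0, "CP3": 4.0, "C1": 6.0, "C5": 2.0}
neighbors = {"C3": ["FC3", "CP3", "C1", "C5"]}
laplacian = hjorth_laplacian(potentials, neighbors)
```

Activity shared by an electrode and its neighbours cancels, so spatially broad components are attenuated and local sources stand out.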
4.6 Denoising Signal
EEG data recorded from the brain are routinely contaminated by artifacts. To extract useful information from EEG data, these artifacts must be eliminated. Denoising is a method for removing noise or artifacts from EEG data [46]. Common denoising techniques include the following.
4.6.1 Wavelet Denoising and Thresholding
Multi-resolution analysis transforms the EEG data into the discrete wavelet domain. The coefficients associated with the noise signal are then reduced using a contrasting or adaptive threshold level [47]. In a wavelet representation that matches the signal well, the noise tends to produce smaller coefficients across time and scale, so thresholding can separate the signal from the noise. Threshold selection is therefore the most important factor in a good wavelet denoising process, and thresholding techniques exist in several forms. Hard thresholding sets coefficients below the threshold to zero. Soft thresholding additionally shrinks the remaining coefficients toward zero [48].
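The two thresholding rules can be sketched as follows, using their usual definitions: hard thresholding zeroes small coefficients and keeps the rest unchanged, while soft thresholding also shrinks the surviving coefficients by the threshold. The function names and example values are illustrative.

```python
def hard_threshold(coeffs, threshold):
    """Zero every coefficient whose magnitude does not exceed the threshold."""
    return [c if abs(c) > threshold else 0.0 for c in coeffs]


def soft_threshold(coeffs, threshold):
    """Zero small coefficients and shrink the others toward zero."""
    shrunk = []
    for c in coeffs:
        magnitude = abs(c) - threshold
        if magnitude <= 0:
            shrunk.append(0.0)
        else:
            shrunk.append(magnitude if c > 0 else -magnitude)
    return shrunk


# Wavelet detail coefficients: two large (signal) and two small (noise).
detail = [4.0, -0.5, 0.25, -3.0]
hard = hard_threshold(detail, 1.0)   # [4.0, 0.0, 0.0, -3.0]
soft = soft_threshold(detail, 1.0)   # [3.0, 0.0, 0.0, -2.0]
```

The denoised signal is obtained by applying the inverse wavelet transform to the thresholded coefficients.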
4.6.2 Empirical Mode Decomposition (EMD)
Empirical mode decomposition (EMD) is a technique for multivariate signal analysis. It splits the signal into a series of zero-mean signals with modulated frequency and amplitude, called intrinsic mode functions (IMFs). EMD is comparable to wavelet decomposition, which also separates a signal into several components, but EMD derives its IMFs through a dynamic, data-driven mechanism. An IMF has a mean value of zero and only one extremum between successive zero crossings. Decomposition into IMFs leaves a residue. These IMFs are needed for signal characterization [49]. According to reference [50], Surface Laplacian (SL) is used in 32% of BCI designs to extract features, whereas 22% use PCA or ICA, 14% use CSP, and 11% use CAR.
5 Feature Extraction Techniques
To find the best BCI classifier, one must understand the features, their properties, and how to use them. Feature extraction is vital in BCI because the accuracy and efficiency of a classification system depend on the features of the samples [51]. Most non-invasive BCI devices use neuroimaging techniques such as MEG and MRI. However, EEG's high temporal resolution and low cost make it the most widely used technology [52]. Therefore, EEG signal feature extraction is essential for mental state classification in a BCI system.
5.1 Extracting Features from EEG Data
Electroencephalography is the most common neuroimaging approach in BCI for recognising acquired events. Therefore, the classification step of a BCI system relies on the EEG feature extraction technique. Following [53], the sections below elaborate on three groups of feature extraction methods: time-domain, frequency-domain, and time-frequency-domain features.
5.1.1 Time Domain
A time-frequency representation shows how signal energy is distributed in the time-frequency plane [54], and time-frequency analysis is useful for understanding rhythmic information within EEG data. The time-domain features of EEG are simple to compute, but they have the drawback that EEG signals are non-stationary and vary over time. In time-domain techniques, features are often determined from signal amplitude values, which may be altered by interference such as noise during EEG recording (Table 2).
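As an illustration of amplitude-based time-domain features, the sketch below computes a few common statistics for one EEG epoch. The particular feature set (mean, variance, peak-to-peak amplitude, zero crossings) is an assumption chosen for this example; the text does not prescribe specific features.

```python
def time_domain_features(signal):
    """Compute simple amplitude-based features for one epoch of samples."""
    n = len(signal)
    mean = sum(signal) / n
    variance = sum((x - mean) ** 2 for x in signal) / n
    peak_to_peak = max(signal) - min(signal)
    # Count sign changes between consecutive samples.
    zero_crossings = sum(1 for a, b in zip(signal, signal[1:]) if a * b < 0)
    return {
        "mean": mean,
        "variance": variance,
        "peak_to_peak": peak_to_peak,
        "zero_crossings": zero_crossings,
    }


features = time_domain_features([1.0, -1.0, 2.0, -2.0])
```

Because every feature depends directly on the amplitudes, a single noisy spike changes the variance and peak-to-peak values, which is the sensitivity to interference noted above.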
5.1.2 Frequency Domain
Frequency-domain analysis considers the spectral properties of a signal. For example, the frequency-domain representation shows what proportion of the signal lies within a given frequency range. Power spectral density (PSD) is often used to obtain frequency-domain features. These features are discussed in more detail below.
Table 2 An overview of the various feature selection methods
Methods | Type | Mean classification accuracy (%) | Comments
Genetic Algorithm (GA) | Metaheuristic | 59.86 | PSO was proven to be more accurate while being slower
Differential Evolution (DE) | Metaheuristic | 95 | With a high capacity for convergence, comparable to GAs
Simulated Annealing | Probabilistic | 87.43 | Looks for the global maximum
Firefly Algorithm | Metaheuristic | 70.4 | To prevent local minima, a learning approach was created
Ant Colony Optimization | Metaheuristic | 85.53 | Employs population-based and directed search techniques
Principal Component Analysis (PCA) | Statistical | 76.32 | Assumes data is concentrated in high-variance components
Artificial Bee Colony (ABC) Optimization | Metaheuristic | 95.47 | Explores many areas of the solution space to find each position's best candidate [84]
Filter Bank Selection | Various | N/A | Particularly for CSP band selection [45]
Particle Swarm Optimization | Metaheuristic | 91.3 | Exploration and exploitation using powerful directed search and population-based search

Fast Fourier Transform
The fast Fourier transform can convert any time-domain signal to its frequency-domain representation. The most popular Fourier transforms used for EEG-based emotion identification are the discrete Fourier transform (DFT) [55], the Short-Time Fourier Transform (STFT) [56], and the Fast Fourier Transform (FFT) [57]. A wireless device created by Djamal et al. [58] records a player's brain activity and uses the Fast Fourier Transform to distinguish each movement. Since FFT is the fastest technique currently in use, it may be used in real-time applications. It is a useful tool for processing stationary signals. Its drawbacks are the restricted range of waveform data that it can convert and the need to apply a window weighting function to the waveform to account for spectral leakage.
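To show how a frequency-domain feature such as band power is obtained, the sketch below evaluates the DFT directly and sums the power of the bins inside a frequency band. It uses the plain O(n²) definition of the DFT to stay dependency-free; an FFT computes the same coefficients faster. The band limits, sampling rate, and function names are assumptions for the example.

```python
import cmath
import math


def dft(signal):
    """Direct evaluation of the discrete Fourier transform."""
    n = len(signal)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(signal))
        for k in range(n)
    ]


def band_power(signal, fs, low, high):
    """Sum the power of the positive-frequency bins lying in [low, high] Hz."""
    n = len(signal)
    spectrum = dft(signal)
    power = 0.0
    for k in range(n // 2 + 1):
        freq = k * fs / n                      # centre frequency of bin k
        if low <= freq <= high:
            power += abs(spectrum[k]) ** 2 / n ** 2
    return power


# One second of a 10 Hz sinusoid sampled at 64 Hz.
fs = 64
epoch = [math.sin(2 * math.pi * 10 * t / fs) for t in range(fs)]
alpha = band_power(epoch, fs, 8, 13)   # contains the 10 Hz component
beta = band_power(epoch, fs, 13, 30)   # essentially empty for this signal
```

The sinusoid here completes a whole number of cycles in the epoch, so no leakage occurs; for arbitrary signals a window function would be applied first, as the text notes.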
Common Spatial Patterns (CSP)
This spatial filtering technique is often used in EEG- and ECoG-based BCIs to extract information relevant to classification [59]. For two classes of data, it maximizes the ratio of their variances to improve their separation. With regard to dimensionality reduction, CSP appears to be more effective and to generalize better when a separate dimension reduction phase precedes it.
Table 3 MI-EEG-based BCI feature extraction, selection, and categorization
Sl. No | Feature extraction techniques | Feature selection techniques | Classification techniques
1. | Time-Domain Techniques | Principal Component Analysis | Linear Discriminant Analysis
2. | Frequency Domain Techniques | Filter Bank Techniques | Support Vector Machine
3. | Time-Frequency Domain Techniques | Evolutionary Algorithms | k-Nearest Neighbor
4. | Common Spatial Pattern Techniques | | Recurrent Neural Networks, Naïve Bayes, Regression Trees, Fuzzy Classifiers
High-Order Spectral (HOS)
The power spectrum and the auto-correlation function are second-order signal measures. If the signal is Gaussian, second-order measures work as expected. However, the majority of real-world signals are not Gaussian. Higher-Order Spectra (HOS) [60] extend the second-order measures and perform well for non-Gaussian signals. Additionally, the majority of physiological signals are nonlinear and non-stationary, so HOS are regarded as useful for identifying changes in the linearity or stationarity of the signal. They are computed at different frequencies using the Fourier Transform (Table 3).
5.1.3 Time-Frequency Domain
In the time-frequency domain, signals are analyzed in time and frequency simultaneously. Wavelet transform-based analysis is an advanced form of time-frequency representation. There are several time-frequency domain (TFD) models.
Autoregressive Model
The autoregressive (AR) model is used in EEG analysis to model the EEG as an autoregressive process. The parameters and order of the approximating AR model are chosen so that it closely matches the recorded EEG. The AR spectrum shows spurious peaks when the model order is too high and is overly smooth when the order is too low [61]. In spectral estimation, AR reduces leakage and improves frequency resolution.
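One standard way to estimate AR parameters is to solve the Yule-Walker equations with the Levinson-Durbin recursion, sketched below. The text does not state which estimator the cited work uses, so this choice, the biased autocorrelation estimate, and the function names are assumptions.

```python
def autocorrelation(signal, max_lag):
    """Biased autocorrelation estimates r[0..max_lag] of the centred signal."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [x - mean for x in signal]
    return [
        sum(centered[t] * centered[t + lag] for t in range(n - lag)) / n
        for lag in range(max_lag + 1)
    ]


def fit_ar(signal, order):
    """Fit x[t] = a1*x[t-1] + ... + ap*x[t-p] + e[t] by Levinson-Durbin.

    Returns the coefficients [a1, ..., ap] and the prediction-error variance.
    """
    r = autocorrelation(signal, order)
    coeffs = []
    error = r[0]
    for k in range(1, order + 1):
        acc = r[k] - sum(coeffs[j] * r[k - 1 - j] for j in range(k - 1))
        reflection = acc / error
        # Update the lower-order coefficients, then append the new one.
        coeffs = [coeffs[j] - reflection * coeffs[k - 2 - j] for j in range(k - 1)]
        coeffs.append(reflection)
        error *= 1.0 - reflection ** 2
    return coeffs, error


# A short alternating sequence is strongly negatively correlated at lag 1.
coeffs, noise_variance = fit_ar([1.0, -1.0, 1.0, -1.0], order=2)
```

The fitted coefficients can serve directly as features, or be used to compute a smooth spectral estimate; the `order` argument is the model order whose choice the text identifies as critical.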
Table 4 Comparison of MI BCI feature extraction techniques
Techniques | Advantages | Limitations | Analysis method
Autoregressive Model | It offers a decent resolution for frequencies. For short lengths, it has accurate spectral estimations | The correct choice of model order is essential to the model's validity | Frequency
FFT | FFT accurately determines the signal frequency. Its speed surpasses all others | FFT is inadequate for the analysis of nonlinear signals. Information about time is not taken into consideration | Frequency
CSP | CSP is suitable for multichannel signal analysis | CSP cannot handle time-dependent dynamics | Spatial filters
Wavelet Transform | Window length and spectral resolution are better balanced with WT. It works better with abrupt signal shifts | It is essential to choose the right mother wavelet | Time-Frequency
Wavelet Transform
The wavelet transform (WT) encodes EEG data using wavelets. It explores irregular data patterns using windows of varying size: large windows for low frequencies and small windows for high frequencies. Localization in the time-frequency domain makes WT more complex. Wavelets represent signal properties in the time-frequency domain. EEG analysis employs both the discrete and the continuous wavelet transform (DWT/CWT) [62]. Because of the high redundancy of CWT, DWT is more popular for signal processing. DWT retains temporal information by computing approximation and detail coefficients for different frequency bands. Because choosing a mother wavelet is difficult, most studies assess several wavelets before picking one. Features are extracted using the db4 Daubechies wavelet [63] (Table 4).
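The split into approximation and detail coefficients can be illustrated with a single decomposition level. For brevity, this sketch uses the Haar wavelet instead of the db4 wavelet mentioned above, because its filters have only two taps; the function names are illustrative and the input length is assumed to be even.

```python
import math


def haar_dwt(signal):
    """One level of the Haar DWT: return (approximation, detail) coefficients."""
    scale = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / scale for i in range(0, len(signal) - 1, 2)]
    detail = [(signal[i] - signal[i + 1]) / scale for i in range(0, len(signal) - 1, 2)]
    return approx, detail


def haar_idwt(approx, detail):
    """Invert one level of the Haar DWT."""
    scale = math.sqrt(2.0)
    signal = []
    for a, d in zip(approx, detail):
        signal.extend([(a + d) / scale, (a - d) / scale])
    return signal


samples = [4.0, 2.0, 5.0, 5.0]
approx, detail = haar_dwt(samples)        # low-frequency and high-frequency parts
restored = haar_idwt(approx, detail)      # reproduces the original samples
```

Applying `haar_dwt` again to the approximation coefficients yields the next, coarser level; statistics of the coefficients at each level are typical wavelet features.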
6 Classification Techniques
Classification involves predicting the target variables or classes from the input. To build the classification model, the training phase uses a learning algorithm to adjust the model's parameters. During the testing phase, the same model is then used to produce the output. In a motor imagery brain-computer interface, classification algorithms map the features obtained from feature extraction to motor imagery tasks, such as hand and foot movements, word production, and similar activities.
6.1 Linear Classifiers
Linear classifiers are discriminant algorithms that use linear functions to distinguish classes. Most BCI systems use this kind of algorithm. LDA and SVM are the linear classifiers used in BCI design.
6.1.1 Linear Discriminant Analysis
Linear discriminant analysis (LDA) uses a hyperplane to divide the classes. In a two-class problem, the class of a feature vector is determined by the side of the hyperplane on which it lies. LDA assumes normally distributed data with the same covariance matrix for both classes. The separating hyperplane maximizes the distance between the class means while minimizing the intraclass variance [64]. Asynchronous BCI, P300 speller, and MI-based BCI systems employ this classifier because of its simplicity, reliability, and excellent results. Because LDA is linear, it may be inappropriate for nonlinear EEG data.
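The hyperplane described above can be computed in closed form for two classes. The sketch below handles two-dimensional feature vectors only, so that the pooled scatter matrix can be inverted by hand, and it places the threshold midway between the class means (an equal-prior assumption). The names and toy data are illustrative.

```python
def mean_vector(points):
    n = len(points)
    return [sum(p[0] for p in points) / n, sum(p[1] for p in points) / n]


def scatter(points, mean):
    """Entries (sxx, sxy, syy) of the 2x2 within-class scatter matrix."""
    sxx = sum((p[0] - mean[0]) ** 2 for p in points)
    sxy = sum((p[0] - mean[0]) * (p[1] - mean[1]) for p in points)
    syy = sum((p[1] - mean[1]) ** 2 for p in points)
    return sxx, sxy, syy


def fit_lda(class0, class1):
    """Return the weight vector and bias of the separating hyperplane."""
    m0, m1 = mean_vector(class0), mean_vector(class1)
    a0, b0, c0 = scatter(class0, m0)
    a1, b1, c1 = scatter(class1, m1)
    a, b, c = a0 + a1, b0 + b1, c0 + c1      # pooled scatter [[a, b], [b, c]]
    det = a * c - b * b
    dx, dy = m1[0] - m0[0], m1[1] - m0[1]
    # w = inverse(pooled scatter) * (m1 - m0)
    w = [(c * dx - b * dy) / det, (a * dy - b * dx) / det]
    midpoint = [(m0[0] + m1[0]) / 2, (m0[1] + m1[1]) / 2]
    bias = -(w[0] * midpoint[0] + w[1] * midpoint[1])
    return w, bias


def predict(w, bias, point):
    """Class 1 if the point lies on the positive side of the hyperplane."""
    return 1 if w[0] * point[0] + w[1] * point[1] + bias > 0 else 0


rest = [(1.0, 2.0), (2.0, 1.0), (2.0, 3.0), (3.0, 2.0)]
imagery = [(5.0, 6.0), (6.0, 5.0), (6.0, 7.0), (7.0, 6.0)]
w, bias = fit_lda(rest, imagery)
```

For higher-dimensional features the same formula applies, with the hand-written 2x2 inverse replaced by a general linear solver.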
6.1.2 Support Vector Machine
SVMs are prominent in BCI research. An SVM chooses the hyperplane that maximizes the distance to the nearest training points. A linear SVM uses linear decision boundaries, whereas a nonlinear SVM uses kernel functions [65]. SVM variants for MI BCI include the time-dependent SVM (TD-SVM) and the SVM-based evolved filters (EF-SVM). TD-SVM separates the classification problem into two subproblems: recognizing class transitions and classifying the instances between transitions. In SVM-based evolved filters, the Covariance Matrix Adaptation Evolution Strategy improves the spatial and frequency-selection filters. SVM is robust to the curse of dimensionality and generalizes well [66].
6.2 Neural Networks
BCI systems employ neural networks (NN) alongside linear classifiers because NNs can produce nonlinear decision boundaries [67]. The most common NN in BCI is the multilayer perceptron (MLP).
6.2.1
Deep Learning Models
Weights must be properly selected in a typical neural network. This is a significant barrier to the neural networks’ usefulness in many BCI applications. Recent studies have used deep learning techniques because they have high descriptive power, increasing the system’s accuracy. Deep learning has performed well in computer vision and has recently been utilized to identify motor imagery problems [68]. Due
216
B. K. Gupta et al.
to their generalized linear structure and translation invariance, Convolutional Neural Networks (CNNs) are effective for identifying motor imagery tasks [69]. CNNs are deep feedforward networks that build on the multilayer perceptron. In a basic CNN, each layer transforms one volume of activations into another through a differentiable function. The CNN architecture comprises input, convolution, pooling, fully connected, and output layers. The convolutional layer is the primary computational component of a CNN. The pooling layer reduces the spatial dimension, whereas each neuron in the fully connected layer is connected to every activation of the preceding layer (Table 5). Sakhavi et al. [75] and Yang et al. [76] proposed CNN architectures that classify multiclass motor imagery EEG data using dynamic energy-based features. Reference [75] used a CNN with dropout regularisation for dynamic energy features and a three-layer MLP for static energy features; the outputs of the two networks are averaged, and the framework outperforms the SVM in classification accuracy. Reference [76] proposed an enhanced CSP feature extraction architecture in which a CNN is trained on energy features arranged as a 2D matrix and, following convolution, a map-selection step selects the feature maps. A CNN has also been used to classify left and right motor imagery from a time-frequency representation [77]: Bump and Morlet wavelets were used with the Continuous Wavelet Transform to learn features, and the convolution layer performs 1D convolution to learn spectral features across time. This methodology yielded promising outcomes. A more recent CNN approach uses temporal representations to classify multiclass motor imagery tasks [75]. Another multiclass motor imagery system [78] uses a classifier that learns end to end with temporal and spatial feature extractors. The framework uses recurrent convolution layers and has shown satisfactory performance (Table 6).

Table 5 Classifier comparison utilising popular datasets and attributes

| Ref. No | Dataset | Feature | Classifier | Accuracy |
|---|---|---|---|---|
| [70] | BCI Competition III | WT | SVM | 85.54% |
| [74] | BCI Competition IV-2a | CSP single-channel | LDA | 62.7% |
| [70] | BCI Competition III | WT | NN | 83.44% |
| [74] | BCI Competition IV-2a | CSP single-channel | MLP | 62.8% |
| [73] | BCI Competition III | WT | CNN | 87.42% |
| [74] | BCI Competition IV-2a | CSP single-channel | KNN | 63.4% |
| [72] | BCI Competition III | WT | LDA | Misclassified rate: 0.1287 |
| [71] | BCI Competition IV-2b | CWT | CNN | Bump: 77.24%, Morlet: 79.12% |
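As a rough illustration of the layer sequence described in the text (convolution, pooling, fully connected output), the following sketch passes a short single-channel signal through a 1D convolution, a ReLU, non-overlapping max pooling, and one dense unit. The kernel, the dense weights, and the toy signal are made-up values chosen for readability; they are not taken from any of the reviewed papers, and a real CNN would learn them from data.

```python
def conv1d(signal, kernel):
    """Valid 1D cross-correlation, the operation deep learning libraries call convolution."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]


def relu(values):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in values]


def max_pool1d(values, size):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [max(values[i:i + size])
            for i in range(0, len(values) - size + 1, size)]


def dense(values, weights, bias):
    """One fully connected unit: every input contributes to the output."""
    return sum(v * w for v, w in zip(values, weights)) + bias


# Hypothetical 9-sample signal and a 3-tap difference kernel.
signal = [0.0, 1.0, 3.0, 2.0, 0.0, -1.0, -2.0, 0.0, 1.0]
kernel = [1.0, 0.0, -1.0]

feature_map = relu(conv1d(signal, kernel))   # 9 - 3 + 1 = 7 activations
pooled = max_pool1d(feature_map, 2)          # 3 activations after pooling
score = dense(pooled, [0.5, -0.25, 1.0], 0.1)
```

The pooling step is what shrinks the representation (7 activations down to 3 here), while the dense unit combines all pooled activations into a single score.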
Table 6 MI BCI classifiers and their strengths and limitations

| Technique | Advantages | Limitations |
|---|---|---|
| Linear discriminant analysis | LDA requires little processing power | Not appropriate for complex nonlinear EEG data |
| Support vector machine | SVM generalizes better | It cannot handle signal dynamics |
| Neural networks | NN offers a fair trade-off between accuracy and speed | Weights must be properly selected |
| Deep neural networks | It can concurrently train the classifier and the discriminative features from unprocessed EEG data | DNN training and testing involve a lot of computation |
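To make the "little processing power" entry for LDA concrete, here is a minimal two-class LDA sketch on two-dimensional features. Training reduces to two class means and one pooled 2x2 covariance inverse, and prediction is a single dot product. The feature values are hypothetical toy numbers, and the function names are illustrative rather than any library's API.

```python
def mean(rows):
    """Per-dimension mean of a list of 2D samples."""
    n = len(rows)
    return [sum(r[d] for r in rows) / n for d in range(2)]


def scatter(rows, mu):
    """2x2 scatter matrix: sum of outer products of the centered samples."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - mu[0], r[1] - mu[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s


def fit_lda(class0, class1):
    """Return weights w and bias b of the decision function w.x + b."""
    mu0, mu1 = mean(class0), mean(class1)
    s0, s1 = scatter(class0, mu0), scatter(class1, mu1)
    n = len(class0) + len(class1) - 2
    cov = [[(s0[i][j] + s1[i][j]) / n for j in range(2)] for i in range(2)]
    det = cov[0][0] * cov[1][1] - cov[0][1] * cov[1][0]
    inv = [[cov[1][1] / det, -cov[0][1] / det],
           [-cov[1][0] / det, cov[0][0] / det]]
    diff = [mu1[0] - mu0[0], mu1[1] - mu0[1]]
    w = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
         inv[1][0] * diff[0] + inv[1][1] * diff[1]]
    # Equal priors assumed: the boundary passes through the midpoint of the means.
    mid = [(mu0[0] + mu1[0]) / 2, (mu0[1] + mu1[1]) / 2]
    b = -(w[0] * mid[0] + w[1] * mid[1])
    return w, b


def predict(w, b, x):
    return 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else 0


# Hypothetical feature vectors for two imagery classes.
class0 = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.2), (1.2, 2.5)]
class1 = [(4.0, 4.5), (4.5, 4.0), (5.0, 5.2), (4.2, 4.8)]
w, b = fit_lda(class0, class1)
```

Because the decision boundary is linear, this kind of model cannot separate classes whose distributions overlap in a nonlinear way, which is the limitation the table notes.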
7 Literature Cited
According to the studies evaluated in this review, most of the work is based on publicly accessible datasets and is restricted to upper-limb imagery. Among the classification techniques, SVM is often used and shows promising outcomes. Deep neural networks, however, still perform poorly because large training datasets are lacking, despite the encouraging results of shallow CNNs in MI BCI research. Early research concentrated on two-class motor imagery, while more recent investigations have begun to emphasize multiclass and multilabel motor imagery. The literature review for the motor imagery brain-computer interface is summarised in Table 7. Table 8 compares the effectiveness of Common Spatial Pattern and Wavelet Transform with different classification methods on the publicly available BCI Competition III dataset. The table shows that the Wavelet Transform with a CNN classifier attains 86.20%, the highest accuracy figure listed. Moreover, Common Spatial Pattern works well with classification methods based on Deep Learning or Deep Neural Networks (Tables 7 and 8).
8 Challenges
Although a variety of feature extraction and classification algorithms have been used successfully for EEG-based BCI for motor imagery tasks and have produced high accuracy, several open problems and difficulties remain the focus of researchers from many different fields.
Table 7 MI BCI literature citation summary

| Ref. No | Motor imagery | EEG features | Class | Feature extraction technique | Classification | Dataset | Accuracy |
|---|---|---|---|---|---|---|---|
| [80] | Right and left motor imagery | Spatial characteristics | 2 | CSP | LDA | Author created | 91.25% |
| [79] | Rest state, compound (both hands, left hand + right foot, right hand + left foot) | Band power | 7 | CSP | SVM | Author created | 70% |
| [75] | Right hand, left hand, tongue and feet | Based on energy | 4 | FBCSP | CNN | BCI Competition IV dataset 2A | 70.60% |
| [68] | Right-left motor imagery | Frequency-domain characteristics | 2 | Wavelet packet decomposition and FFT | Deep neural network | BCI Competition IV dataset 2B | Not given |
| [81] | Movement with both feet, both fists, right fist, and the left fist | Motor-sensory rhythms | Multi-class | Decomposition with wavelets | NN | Physionet dataset | 93.05% |
| [75] | Right and left feet and tongue | Temporal | 4 | FBCSP | CNN | BCI Competition IV dataset 2A | 74.46% |
| [82] | Both hands | Time-frequency representations | 2 | STFT | CNN | Author created | SELU-CNN (92.73%), RELU-CNN (86.74%) |
Table 8 Wavelet transform versus common spatial pattern on classifiers

| Ref. No | Feature extraction | Classification | Dataset | Accuracy |
|---|---|---|---|---|
| [83] | CSP | RLDA | BCI Competition III | 74.28% |
| [85] | WT | LDA | BCI Competition III | Misclassified rate: 0.1368 |
| [84] | CSP | DNN | BCI Competition III | Error percentage: 10% |
| [86] | WT | SVM | BCI Competition III | 85.54% |
| [86] | WT | NN | BCI Competition III | 81.34% |
| [87] | WT | CNN | BCI Competition III | 86.2% |
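Table 8 reports results in mixed forms: accuracy, misclassified rate, and error percentage. Assuming that a misclassified rate is the fraction of trials classified wrongly and that an error percentage is the same quantity scaled by 100 (an assumption, since the cited papers may define them differently), the entries can be put on a common accuracy scale with a small helper. The function name is illustrative only.

```python
def accuracy_from_error(error, as_percent=False):
    """Convert an error (fraction, or percentage if as_percent) to accuracy in percent."""
    error_fraction = error / 100.0 if as_percent else error
    if not 0.0 <= error_fraction <= 1.0:
        raise ValueError("error must lie in [0, 1], or [0, 100] when given as a percentage")
    return round((1.0 - error_fraction) * 100.0, 2)


# Values copied from Table 8; the conversion itself is the assumption.
wt_lda_accuracy = accuracy_from_error(0.1368)                # misclassified rate
csp_dnn_accuracy = accuracy_from_error(10, as_percent=True)  # error percentage
```

Under that assumption, every row of the table can be read as an accuracy before the methods are compared.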
8.1 Feature Extraction
Because EEG signals are often quite noisy and time-varying, extracting important information within a limited time frame can be difficult. Although Common Spatial Pattern (CSP) and its variants are widely employed in BCI, they do not take the signal's temporal structure into account, which causes a loss of temporal information [88]. As a result, time series modeling methods that capture temporal dynamics are needed. Recent studies [75, 89] have considered temporal dynamics, which has somewhat increased classification accuracy.
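The loss of temporal information can be made concrete. CSP-style features are typically the log-variance of spatially filtered signals, and a variance does not depend on the order of the samples. The sketch below applies a fixed spatial filter to a two-channel toy trial and computes the feature before and after shuffling the samples in time. The filter weights and the signal are hypothetical; the filter is not a trained CSP filter.

```python
import math
import random


def spatial_filter(trial, weights):
    """Project a multi-channel trial (one tuple of channel values per sample) onto one virtual channel."""
    return [sum(w * x for w, x in zip(weights, sample)) for sample in trial]


def log_variance(signal):
    """Log of the population variance, the usual form of a CSP feature."""
    m = sum(signal) / len(signal)
    var = sum((s - m) ** 2 for s in signal) / len(signal)
    return math.log(var)


# Toy two-channel trial: 200 samples of a 10 Hz rhythm sampled at 100 Hz (made-up values).
trial = [(math.sin(2 * math.pi * 10 * t / 100),
          0.5 * math.cos(2 * math.pi * 10 * t / 100))
         for t in range(200)]
weights = (0.8, -0.6)  # hypothetical spatial filter

feature = log_variance(spatial_filter(trial, weights))

shuffled = trial[:]
random.Random(0).shuffle(shuffled)  # destroy the temporal order of the samples
feature_shuffled = log_variance(spatial_filter(shuffled, weights))
```

The two features agree up to floating-point rounding even though the shuffled trial no longer contains a rhythm, which is why methods that model temporal dynamics can add information that CSP alone discards.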
8.2 Classification
Three articles have examined the difficulties with classification. Lu et al. [68] noted that existing classification methods require heavy computation and are unsuited to online processing. It is therefore necessary to test and evaluate classification algorithms online to confirm that they are computationally efficient and can be employed in real time. Moreover, robust classifiers that remain effective on non-stationary data must be developed to strike a suitable compromise between accuracy and efficiency. To guarantee effective brain-computer interfaces, the authors of Reference [90] advised developing a new generation of classification algorithms that include the user in the loop (Table 9).
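One simple way to cope with non-stationary features in an online setting, sketched here as an assumption rather than a method taken from the reviewed papers, is to re-center each incoming feature with an exponentially weighted running mean before it reaches a fixed classifier. The update costs a few arithmetic operations per trial. The class name, the smoothing factor, and the toy feature stream are hypothetical.

```python
class RunningCenter:
    """Exponentially weighted running mean used to re-center a drifting feature."""

    def __init__(self, alpha=0.1, initial_mean=0.0):
        self.alpha = alpha
        self.mean = initial_mean

    def update(self, x):
        """Return x re-centered by the current mean, then fold x into the mean."""
        centered = x - self.mean
        self.mean += self.alpha * (x - self.mean)
        return centered


def classify(feature):
    # Fixed threshold, standing in for any pre-trained linear classifier.
    return 1 if feature > 0 else 0


center = RunningCenter(alpha=0.2)

# Toy feature stream: alternating classes 0, 1, 0, 1, ... riding on a slow upward drift.
stream = [(-1.0 if i % 2 == 0 else 1.0) + 0.05 * i for i in range(100)]
labels = [i % 2 for i in range(100)]

adaptive = [classify(center.update(x)) for x in stream]
static = [classify(x) for x in stream]

adaptive_correct = sum(p == y for p, y in zip(adaptive, labels))
static_correct = sum(p == y for p, y in zip(static, labels))
```

On this toy stream the fixed threshold starts to fail once the drift exceeds the class separation, while the re-centered version keeps tracking it. Real EEG drift is less regular, so this only illustrates the idea of a cheap online update.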
8.3 BCI Functionality and Hardware
Praveen et al. [91] suggested using LDA and CSP for feature extraction to reduce misclassification, and this method works well. Algorithm-device integration was also suggested to boost system performance. In traditional signal processing systems, feature extraction and classification had to be completed separately, which came at a high computational cost. To perform binary classification, a pipeline based on neural networks integrates feature extraction and classification [68].

Table 9 Summary of BCI research papers offering new approaches

| Model used | Novelty | Feature extraction methods | Architecture | Limitations of each approach |
|---|---|---|---|---|
| WOLA [95] | Dynamic EEG signal filtering | CSP | System for embedded-BCI (EBCI) | This model excludes muscle and eye blinking |
| CNN categorization for P300-BCI [97] | P300 wave detection | CNN's spatial filters | NN architecture | Subject diversity, recognizing key layers |
| Adaptive Extended Kalman LDA [98] | Online control of a virtual robot | LDA classifiers | On-demand event detection | Two-class limit |
| Polynomial kernel, Gaussian [100] | MI EEG classification using MKELM | CSP | MKELM-BCI | Framework expansion and precision are needed |
| ELM and SB learning combined [99] | Algorithm based on Sparse Bayesian ELM (SBELM) | CSP | EEG categorization using SBELM for motor imagery | Multiband optimizations improve precision |
| RCNN, pCNN, and LSTM [96] | DL models decode motor imagery movements online | CSP and log-BP characteristics | Motor imagery classification | Data is scarce for models |
| ERN, P300, MRCP, and SMR [92] | EEG-based BCI compact convolutional neural network | Bandpass filtration | EEGNet | The techniques only function if the feature is known |
| SVM [94] | Fatigue detector | FFT | Detecting train driver vigilance | NA |
| P300, SSVEP [93] | BCI-controlled healthcare | (SSVEP + FDA) Kernel P300 detector | P300 healthcare system with SSVEP | SSVEP stimulates accuracy |
8.4 Information Gathering Mode
The learning and training capabilities of deep architectures are well suited to computer vision and other applications. However, EEG-based BCI classification of MI tasks involves subject-specific data and non-stationary features. In addition, limitations in signal modality and shortages of training data can hinder brain-computer interface development [75]. These problems may be solved by combining signal processing, hardware requirements, and machine learning.
9 Conclusion
The BCI domain comprises studying, supporting, enhancing, and experimenting with brain signal activity. This review thoroughly analyzed popular feature extraction methods applied to EEG-based BCI for motor imagery tasks. At present, the most widely used feature extraction technique is CSP. The reviewed material highlighted several characteristics that are crucial to CSP performance, including the frequency band, the spatial filters, and the presence of artifacts in the signal. This study also covered the numerous motor imagery BCI classification techniques, which fall under the linear, nonlinear, neural network, and deep learning categories. Because they are largely immune to the curse of dimensionality, support vector machines are the most often used classifiers. However, several deep learning architectures have also been explored recently as a classification strategy for motor imagery tasks, with shallow convolutional neural networks emerging as the dominant architecture and outperforming more established classification techniques. Future research on MI BCI should concentrate on creating feature extraction methods that automatically take subject-relevant temporal information into account. Moreover, robust classifiers must be developed to handle high-dimensional data and noisy signals. Finally, to create an accurate and effective BCI system, it is also necessary to create a new generation of classification algorithms that include the user in the loop and offer feedback from which the user may learn.
References
1. Berger, H. (1929). Über das Elektrenkephalogramm des Menschen. Archives für Psychiatrie, 87, 527–570. https://doi.org/10.1007/BF01797193
2. Lindsley, D. B. (1952). Psychological phenomena and the electroencephalogram. Electroencephalography and Clinical Neurophysiology, 4(4), 443–456. https://doi.org/10.1016/00134694(52)90075-8. https://www.sciencedirect.com/science/article/pii/0013469452900758
3. Vidal, J. J. (1973). Toward direct brain-computer connection. Annual Review of Biophysics and Bioengineering, 2, 157–180.
4. Zeng, F. G., Rebscher, S., Harrison, W., Sun, X., & Feng, H. (2008). Cochlear implants: System design, integration, and evaluation. IEEE Reviews in Biomedical Engineering, 1, 115–142. https://doi.org/10.1109/RBME.2008.2008250. Epub 2008 November 5. PMID: 19946565; PMCID: PMC2782849.
5. Nicolas-Alonso, L. F., & Gomez-Gil, J. (2012). Brain computer interfaces, a review. Sensors, 12, 1211–1279. https://doi.org/10.3390/s120201211
6. Abiri, R., Zhao, X., Jiang, Y., Sellers, E. W., & Borhani, S. (2019). A thorough analysis of brain-computer interaction paradigms based on EEG. Journal of Neural Engineering, 16(1), 011001.
7. Tiwari, N., Edla, D. R., Dodia, S., & Bablani, A. (2018). Brain computer interface: A comprehensive survey. Biologically Inspired Cognitive Architectures, 26, 118–129.
8. Vasiljevic, G. A. M., & de Miranda, L. C. (2020). Brain-computer interface games based on consumer-grade EEG devices: A comprehensive literature analysis. International Journal of Human Computer Interaction, 36, 105–142.
9. Panov, F., Oxley, T., Yaeger, K., Oermann, E. K., Opie, N. L., & Martini, M. L. (2020). A thorough literature evaluation of sensor modalities for brain-computer interface technologies. Neurosurgery, 86, E108–E117.
10. Bablani, A., Edla, D. R., Tripathi, D., & Cheruku, R. (2019). Brain-computer interface survey: An emergent computational intelligence paradigm. ACM Computing Surveys (CSUR), 52, 20.
11. Fleury, M., Lioi, G., Barillot, C., & Lécuyer, A. (2020). A survey of haptic feedback's application to neurofeedback and brain-computer interfaces. Frontiers in Neuroscience, 14, 528.
12. Yoo, S. G., Hernández-Álvarez, M., & Torres, P. E. P. (2020). EEG-based BCI emotion recognition: A survey. Sensors, 20, 5083.
13. Wang, X., Zhang, Y., Zhang, X., Yao, L., Monaghan, J. J., & Mcalpine, D. (2021). A review of recent developments and uncharted territory in deep learning-based non-invasive brain signals. Journal of Neural Engineering, 18, 031002.
14. Gu, X., Cao, Z., Jolfaei, A., Xu, P., Wu, D., Jung, T. P., & Lin, C. T. (2021). EEG-based brain-computer interfaces (BCIs): A summary of contemporary works on signal detecting technologies, computational intelligence techniques, and their applications. IEEE/ACM Transactions on Bioinformatics and Computing.
15. Nijholt, A. (2016). Brain-computer interaction in the future (keynote paper). In 5th International Conference on Informatics, Electronics, and Vision (ICIEV), Dhaka, Bangladesh, 13–14 May 2016, pp. 156–161.
16. Padfield, N., Zabalza, J., Zhao, H., Masero, V., & Ren, J. (2019). EEG-based brain-computer interfaces with motor imagery: Methods and problems. Sensors, 19, 1423.
17. Hara, Y. (2015). Brain plasticity and rehabilitation in stroke patients. Journal of Nippon Medical School, 82, 4–13.
18. Bousseta, R., El Ouakouak, I., Gharbi, M., & Regragui, F. (2018). EEG based brain computer interface for controlling a robot arm's movement with thoughts. Irbm, 39, 129–135.
19. Perales, F. J., Riera, L., Ramis, S., & Guerrero, A. (2019). Using binaural auditory stimulation, a VR system for pain management is evaluated. Medical Tool Applications, 78, 32869–32890.
20. Shim, M., Hwang, H. J., Kim, D. W., Lee, S. H., & Im, C. H. (2016). Machine-learning-based schizophrenia diagnosis utilising sensor-level and source-level EEG characteristics. Schizophrenia Research, 176, 314–319.
21. Sharanreddy, M., & Kulkarni, P. (2013). Identification of primary brain tumour using wavelet transform and neural network in EEG signal. International Journal of Biomedical Research, 4, 2855–2859.
22. Poulos, M., Felekis, T., & Evangelou, A. (2012). Is it feasible to obtain a breast cancer fingerprint using EEG analysis? Medical Hypotheses, 78, 711–716.
23. Christensen, J. A., Koch, H., Frandsen, R., Kempfner, J., Arvastson, L., Christensen, S. R., Sorensen, H. B., & Jennum, P. (2013). Patients with iRBD and Parkinson's disease are classified based on their eye movements during sleep. In Proceedings of the 2013 IEEE Engineering in Medicine and Biology Society (EMBC) 35th Annual International Conference, Osaka, Japan, 3–7 July 2013, pp. 441–444.
24. Mikołajewska, E., & Mikołajewski, D. (2014). The potential of brain-computer interface applications in children. Open Medicine, 9, 74–79.
25. Mane, R., Chouhan, T., & Guan, C. (2020). BCI for stroke rehabilitation: Motor and beyond. Journal of Neural Engineering, 17, 041001.
26. Van Dokkum, L., Ward, T., & Laffont, I. (2015). Brain-computer interfaces for neurorehabilitation: Its present standing as a post-stroke rehabilitation method. Annals of Physical and Rehabilitation Medicine, 58, 3–8.
27. Soekadar, S. R., Silvoni, S., Cohen, L. G., & Birbaumer, N. (2015). Brain-machine interfaces in stroke neurorehabilitation. In Clinical systems neuroscience (pp. 3–14). Springer.
28. Beudel, M., & Brown, P. (2016). Adaptive deep brain stimulation for Parkinson's disease. Parkinsonism & Related Disorders, 22, S123–S126.
29. Stein, A., Yotam, Y., Puzis, R., Shani, G., & Taieb-Maimon, M. (2018). EEG-triggered dynamic difficulty modification for multiplayer games. Entertainment Computing, 25, 14–25.
30. Zhang, B., Wang, J., & Fuhlbrigge, T. (2010). A review of commercial brain-computer interface technologies from the standpoint of industrial robotics. In Proceedings of the 2010 IEEE International Conference on Automation and Logistics, 16–20 August 2010, Hong Kong, China, pp. 379–384.
31. Todd, D., McCullagh, P. J., Mulvenna, M. D., & Lightbody, G. (2012). Examining the use of brain-computer connection to boost creativity. In Proceedings of the 3rd International Augmented Human Conference, Megève, France, March 8–9, pp. 1–8.
32. Binias, B., Myszor, D., & Cyran, K. A. (2018). A machine learning technique to detecting a pilot's reaction to unexpected occurrences using EEG data. Computer Intelligence and Neuroscience, 2703513.
33. Panoulas, K. J., Hadjileontiadis, L. J., & Panas, S. M. (2010). Brain-computer interface (BCI): Types, processing views, and applications (pp. 299–321). Springer.
34. Flink, R., & Kuruvilla, A. (2003). Intraoperative electrocorticography in epilepsy surgery: Helpful or not? Seizure, 12, 577–584.
35. Homan, R. W., Herman, J., & Purdy, P. Cerebral implantation of international 10–20 system electrodes.
36. Wilson, J. A., Felton, E. A., Garell, P. C., Schalk, G., & Williams, J. C. (2006). ECoG variables underlie multimodal control of a brain-computer interface. IEEE Transactions on Neural Systems and Rehabilitation Engineering, 14, 246–250.
37. Weiskopf, N., Veit, R., Erb, M., Mathiak, K., Grodd, W., Goebel, R., & Birbaumer, N. (2003). Methods and example data for physiological self-regulation of regional brain activity using real-time functional magnetic resonance imaging (fMRI). NeuroImage, 19, 577–586.
38. Ramadan, R. A., & Vasilakos, A. V. (2017). Brain computer interface: Control signals review. Neurocomputing, 223, 26–44.
39. Huisman, T. (2010). Diffusion-weighted and diffusion tensor imaging of the brain simplified. Cancer Imaging, 10, S163.
40. Borkowski, K., & Krzyzak, A. T. (2018). Error analysis and correction in DTI-based tractography due to diffusion gradient inhomogeneity. Journal of Magnetic Resonance, 296, 5–11.
41. Purnell, J., Klopfenstein, B., Stevens, A., Havel, P. J., Adams, S., Dunn, T., Krisky, C., & Rooney, W. (2011). Brain functional magnetic resonance imaging response to glucose and fructose infusions in humans. Diabetes, Obesity & Metabolism, 13, 229–234.
42. Lahane, P., Jagtap, J., Inamdar, A., Karne, N., & Dev, R. (2019). A look at current developments in EEG-based Brain-Computer Interface. In Proceedings of the 2019 International Conference on Computational Intelligence in Data Science (ICCIDS), 21–23 February 2019, Chennai, India, pp. 1–6.
43. Deng, S., Winter, W., Thorpe, S., & Srinivasan, R. (2011). EEG Surface Laplacian with realistic head geometry. International Journal of Bioelectromagn., 13, 173–177.
44. Shaw, L., & Routray, A. (2016). Statistical features extraction for multivariate pattern analysis in meditation EEG using PCA. In Proceedings of the 2016 IEEE EMBS International Student Conference (ISC), May 29–31, 2016, Ottawa, ON, Canada, pp. 1–4.
45. Subasi, A., & Gursoy, M. I. (2010). Classification of EEG signals using PCA, ICA, LDA, and support vector machines. Expert Systems with Applications, 37, 8659–8666.
46. Jannat, N., Sibli, S. A., Shuhag, M. A. R., & Islam, M. R. (2020). EEG motor signal analysis-based improved motor activity recognition using optimal denoising algorithm (pp. 125–136). Springer.
47. Vahabi, Z., Amirfattahi, R., & Mirzaei, A. (2011). Improving the P300 wave of BCI systems using negentropy in adaptive wavelet denoising. Medical Signals and Sensors, 1, 165.
48. Johnson, M. T., Yuan, X., & Ren, Y. (2007). Adaptive wavelet thresholding for speech signal augmentation. Speech Communication, 49, 123–133.
49. Islam, M. R., Rahim, M. A., Akter, H., Kabir, R., & Shin, J. (2018). Using EEG information, optimal IMF selection of EMD for sleep problem diagnosis. In Proceedings of the 3rd International Conference on Applications in Information Technology, 1–3 November 2018, Aizu-Wakamatsu, Japan, pp. 96–101.
50. Bashashati, A., Fatourechi, M., Ward, R. K., & Birch, G. E. (2007). An overview of signal processing techniques used in electrical brain signals-based brain-computer interfaces. Journal of Neural Engineering, 4, R32.
51. Aborisade, D., Ojo, J., Amole, A., & Durodola, A. (2014). Compare textural characteristics generated from GLCM for ultrasound liver image categorization. International Journal of Computer Trends and Technology, 11, 6.
52. He, B., Yuan, H., Meng, J., & Gao, S. (2020). Brain-computer interfaces (pp. 131–183). Springer.
53. Phadikar, S., Sinha, N., & Ghosh, R. (2019). An overview of feature extraction strategies for emotion identification using EEG. In International Conference on Innovation in Contemporary Science and Technology (pp. 31–45). Springer.
54. Vaid, S., Singh, P., & Kaur, C. (2015). EEG signal analysis for BCI interface: A review. In Proceedings of the 2015 5th International Conference on Advanced Computing and Communication Technologies, Haryana, India, 21–22 February 2015, pp. 143–147.
55. Smith, J. O. (2007). Mathematics of the Discrete Fourier Transform (DFT): With audio applications. W3K Publishing.
56. Zabidi, A., Mansor, W., Lee, Y., & Fadzal, C. C. W. (2012). Short-time fourier transform analysis of the EEG signal obtained during simulated writing. In Proceedings of the 2012 International Conference on System Engineering and Technology (ICSET), Bandung, Indonesia, September 11–12, pp. 1–4.
57. Al-Fahoum, A. S., & Al-Fraihat, A. A. (2014). Techniques of extracting EEG signal characteristics using linear analysis in the frequency and time-frequency domains. International School Research Notices, 730218.
58. Djamal, E. C., Abdullah, M. Y., & Renaldi, F. (2017). Brain computer interface game control utilising rapid fourier transform and learning vector quantization. Journal of Telecommunication Electronic and Computer Engineering, 9, 71–74.
59. Conneau, A. C., & Essid, S. (2014). Evaluating novel spectral characteristics for emotion identification using EEG. In Proceedings of the 2014 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Florence, Italy, 4–9 May 2014, pp. 4698–4702.
60. Petropulu, A. P. (2018). Higher-order spectral analysis. Handbook of Digital Signal Processing.
61. LaFleur, K., Cassady, K., Doud, A., Shades, K., Rogin, E., & He, B. (2013). Non-invasive motor imagery-based brain-computer interface quadcopter control in three dimensions. Journal of Neural Engineering, 10, 046003.
62. Darvishi, S., & Al-Ani, A. (2007). Brain-computer interface analysis utilising continuous wavelet transform and adaptive neuro-fuzzy classifier. In Proceedings of the IEEE Engineering in Medicine and Biology Society's 29th Annual International Conference, Lyon, France, 22–26 August 2007, pp. 3220–3223.
63. Nivedha, R., Brinda, M., Vasanth, D., Anvitha, M., & Suma, K. (2017). SVM and PSO are used to recognise emotions in EEG data. In Proceedings of the 2017 International Conference on Intelligent Computing, Instrumentation, and Control Technologies (ICICICT), 6–7 July 2017, Kerala, India, pp. 1597–1600.
64. Xanthopoulos, P., Pardalos, P. M., & Trafalis, T. B. (2013). Linear discriminant analysis. In Robust data mining (pp. 27–33). Springer.
65. Temiyasathit, C. (2014). Improving four-class classification performance for motor-imagery-based brain-computer interface. In 2014 International Conference on Computer, Information, and Telecommunication Systems (CITS). IEEE.
66. Millan, J. R., Renkens, F., Mourino, J., & Gerstner, W. (2004). Human EEG-based non-invasive brain-actuated control of a mobile robot. IEEE Transactions on Biomedical Engineering, 51, 1026–1033.
67. Sridhar, G., & Rao, P. M. (2012). A neural network method for EEG classification in BCI. International Journal of Computer Science Telecommunications, 3, 44–48.
68. Lu, N., Li, T., Ren, X., & Miao, H. (2017). A deep learning approach based on limited Boltzmann machines for motor imagery categorization. IEEE Transactions on Neural Systems and Rehabilitation Engineering, 25, 566–576.
69. Zhao, Y., Yao, S., Hu, S., Chang, S., Ganti, R., Srivatsa, M., Li, S., & Abdelzaher, T. (2017). On the enhancement of identifying EEG recordings using neural networks (Big Data). In 2017 IEEE International Conference on Big Data. IEEE.
70. Mohamed, E. A., Yusoff, M. Z. B., Selman, N. K., & Malik, A. S. (2014). Wavelet transform enhancement of EEG signals in brain computer interface. International Journal of Information and Electronics Engineering, 4, 234.
71. Sakhavi, S., Guan, C., & Yan, S. (2015). Motor imagery categorization using a parallel convolutional-linear neural network. In Proceedings of the 2015 23rd European Signal Processing Conference (EUSIPCO), Nice, France, 31 August–4 September 2015, pp. 2736–2740.
72. Carrera-Leon, O., Ramirez, J. M., Alarcon-Aquino, V., Baker, M., D'Croz-Baron, D., & Gomez-Gil, P. (2012). A motor imagery BCI experiment using wavelet analysis and feature extraction from spatial patterns. In Proceedings of the 2012 Workshop on Engineering Applications, 2–4 May 2012, Bogota, Colombia, pp. 1–6.
73. Yang, J., Yao, S., & Wang, J. (2018). Deep fusion feature learning network for MI-EEG categorization. IEEE Access, 6, 79050–79059.
74. Kanoga, S., Kanemura, A., & Asoh, H. (2018). An investigation of characteristics and classifiers in single-channel EEG-based motor imagery BCI. In Proceedings of the 2018 IEEE Global Conference on Signal and Information Processing (GlobalSIP), Anaheim, CA, USA, November 26–29, pp. 474–478.
75. Yan, S., Sakhavi, S., & Guan, C. (2015). Parallel convolutional-linear neural network for motor imagery categorization. In 23rd European Signal Processing Conference (EUSIPCO). IEEE.
76. Yang H., and co. (2015). The use of convolutional neural networks and enhanced CSP features for multiclass motor imagery categorization of EEG data. In 2015 IEEE 37th Annual International Conference on Engineering in Medicine and Biology Society (EMBC). IEEE.
77. Choi, Y.-S., & Lee, H. K. (2018). A convolution neural networks technique for categorization of motor imagery EEG based on wavelet time-frequency picture. In International Conference on Information Networking (ICOIN). IEEE.
78. Ko, W., Yoon, J., Kang, E., Jun, E., Choi, J. S., & Suk, H. I. (2018). Deep recurrent spatiotemporal neural network for BCI based on motor imagery. In 2018 6th International Conference on Brain-Computer Interface (BCI). IEEE.
79. Yi, W., Qiu, S., Qi, H., Zhang, L., Wan, B., & Ming, D. (2013). EEG feature comparison and categorization of simple and complex limb motor imagery. Journal of Neuroengineering and Rehabilitation, 10, 106.
80. Chen, C.-Y. et al. (2014). A new categorization approach for motor images based on brain-computer interface, neural networks (IJCNN). IEEE.
81. Sagee, G. S., & Hema, S. (2017). EEG feature extraction and classification in multiclass multiuser motor imagery brain computer interface using Bayesian Network and ANN. In IEEE International Conference on Intelligent Computing, Instrumentation and Control Technologies (ICICICT).
82. Gong, X., Zhang, J., & Yan, C. (2017). Deep convolutional neural network for brain computer interface decoding based on motor imagery. In IEEE International Conference on Signal Processing, Communications, and Computing (ICSPCC).
83. Schirrmeister, R. T., Springenberg, J. T., Fiederer, L. D., Glasstetter, M., Eggensperger, K., Tangermann, M., Hutter, F., Burgard, W., & Ball, T. (2017). EEG decoding and visualisation using deep learning using convolutional neural networks. Human Brain Mapping, 38, 5391–5420.
84. Vesin, J.-M., Garcia, G. N., & Ebrahimi, T. (2003). Neural engineering categorization of EEG support vectors in the Fourier and time-frequency domains. In Proceedings of the First International IEEE EMBS Conference on Neural Engineering. IEEE.
85. Carrera-Leon, O., Ramirez, J. M., Alarcon-Aquino, V., Baker, M., D'Croz-Baron, D., & Gomez-Gil, P. (2012). A motor imagery BCI experiment using wavelet analysis and extraction of spatial pattern features. In 2012 Workshop on Engineering Applications (pp. 1–6). IEEE.
86. Mohamed, E. A., Yusoff, M. Z. B., Selman, N. K., & Malik, S. A. (2014). Using wavelet transform to boost EEG signals in the brain-computer interface. International Journal of Information and Electronics Engineering, 4(3).
87. Jun, Y., Shaowen, Y., & Jin, W. (2018). Deep learning fusion of features for MI-EEG categorization. IEEE Access, 6, 79050–79059.
88. Chavarriaga, R., Fried-Oken, M., Kleih, S., Lotte, F., & Scherer, R. (2017). Destining new shores! Addressing BCI design pitfalls. Brain Computing Interfaces, 4, 60–73.
89. Kirar, J. S., & Agrawal, R. K. (2018). Relevant feature selection from a mix of spectral-temporal and spatial variables for categorization of motor imagery EEG. Journal of Medical Systems, 42, 78.
90. He, B., LaFleur, K., Cassady, K., Doud, A., Shades, K., Rogin, E., & Rogin, E. (2013). Control of a quadcopter in three-dimensional space via a non-invasive brain–computer interface based on motor imagery. Journal of Neural Engineering, 10, 046003.
91. Praveen, S. P., Murali Krishna, T. B., Anuradha, C. H., Mandalapu, S. R., Sarala, P., & Sindhura, S. (2022). A robust framework for handling health care information based on machine learning and big data engineering techniques. International Journal of Healthcare Management, 1–18. https://doi.org/10.1080/20479700.2022.2157071
92. Lawhern, V. J., Solon, A. J., Waytowich, N. R., Gordon, S. M., Hung, C. P., & Lance, B. J. (2018). A convolutional neural network for EEG-based brain–computer interactions. Journal of Neural Engineering, 15, 056013.
93. Liu, Y. H., Wang, S. H., & Hu, M. R. (2016). An autonomous P300 healthcare brain-computer interface system with SSVEP-based switching control and kernel FDA+ SVM-based detector. Applied Sciences, 6, 142.
94. Zhang, X., Li, J., Liu, Y., Zhang, Z., Wang, Z., Luo, D., Zhou, X., Zhu, M., Salman, W., Hu, G., & Wang, C. (2017). The development of a tiredness detection system for high-speed trains based on the attentiveness of the driver utilising a wireless worn EEG. Sensors, 17, 486.
95. Belwafi, K., Romain, O., Gannouni, S., Ghaffari, F., Djemal, R., & Ouni, B. (2018). An embedded implementation for brain–computer interface systems based on an adaptive filter bank. Journal of Neuroscience Methods, 305, 1–16.
96. Tayeb, Z., Fedjaev, J., Ghaboosi, N., Richter, C., Everding, L., Qu, X., Wu, Y., Cheng, G., & Conradt, J. (2019). Validation of deep neural networks for online decoding of motor imagery movements extracted from EEG data. Sensors, 19, 210.
97. Convolutional neural networks for the identification of P300 with application to brain-computer interfaces. IEEE Transactions on Pattern Analysis and Machine Intelligence, 33, 433–445.
98. Tsui, C. S. L., Gan, J. Q., & Roberts, S. J. (2009). A self-paced brain–computer interface for commanding a robot simulator: An online event labelling paradigm and an extended Kalman filter based algorithm for online training. Medical and Biological Engineering and Computing, 47(2), 257–267.
99. Jin, Z., Zhou, G., Gao, D., & Zhang, Y. (2020). EEG classification with sparse Bayesian extreme learning machine for brain–computer interface. Neural Computing and Applications, 32, 6601–6609.
100. Zhang, Y., Wang, Y., Zhou, G., Jin, J., Wang, B., Wang, X., & Cichocki, A. (2018). Multikernel extreme learning machine for EEG categorization in brain-computer interfaces. Expert System in Artificial Intelligence, 96, 302–310.
Optimized TSA ResNet Architecture with TSH—Discriminatory Features for Kidney Stone Classification from QUS Images
P. Nagaraj, V. Muneeswaran, Josephine Selle Jeyanathan, Baidyanath Panda, and Akash Kumar Bhoi
Abstract Kidney diseases are the major reason for renal failure. They range from calcium deposits and stones to chronic kidney disease at the extreme, and several of these conditions may cause renal failure and account for a large proportion of mortality. Qualitative Ultrasound (QUS) images are usually preferred as the basis for examining the kidney in medical contexts. In recent times, Computer-Aided Diagnosis for kidney health analysis has paved the way for the effective detection of diseases at early stages by employing Convolutional Neural Networks and allied deep learning technologies. These algorithms yield better results in a simulated environment than on images taken in real-time cases, and their performance is confined to a limited level of metrics such as accuracy and sensitivity. To address these issues, we have focussed on building an automated diagnosis of kidney diseases that classifies them according to the features visible in the QUS images. The proposed methodology merges texture, statistical, and histogram-based (TSH) features, which are more discriminative than the other features exhibited by QUS images; these TSH features are then fed to a ResNet architecture for recognition of kidney diseases. The reduction in accuracy caused by improper tuning of CNN hyperparameters such as momentum and learning rate is eliminated by using a position-based optimization algorithm, namely the Tree Seed Algorithm (TSA). The classification output was analysed through a performance analysis on the optimization-tuned standard kidney image dataset. The results from the ResNet model with TSA optimization show that an algorithmic approach to tuning deep learning architectures is quite efficient. Further exploration of the momentum and learning rate of the ResNet architecture makes the proposed TSH-TSA-ResNet architecture outperform the existing method and provide a classification accuracy of 98.9%.
Keywords Renal failure · Image classification · Kidney diseases · Deep learning · Optimization

P. Nagaraj (B) School of Computing, Department of Computer Science and Engineering, Kalasalingam Academy of Research and Education, Krishnankoil, Tamil Nadu, India
V. Muneeswaran · J. S. Jeyanathan School of Electronics Electrical and Biomedical Technology, Department of Electronics and Communication Engineering, Kalasalingam Academy of Research and Education, Krishnankoil, Tamil Nadu, India
B. Panda LTIMindtree, Hartford, CT, USA
A. K. Bhoi KIET Group of Institutions, Uttar Pradesh, Ghaziabad, India; Sikkim Manipal University, Gangtok, Sikkim, India
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 P. Barsocchi et al. (eds.), Enabling Person-Centric Healthcare using Ambient Assistive Technology, Studies in Computational Intelligence 1108, https://doi.org/10.1007/978-3-031-38281-9_10
1 Introduction
Medicine and healthcare are among the most significant fields in which knowledge-based expert systems have found use. For instance, machine detection and diagnosis systems can speed up the process or improve the diagnostic abilities of medical experts. Many of them are based on imaging modalities such as chest X-rays, mammograms, and brain CT and MRI scans. Decision support systems are another type of expert system; their main function is to analyse data and present a result so that decisions can be made more quickly, rather than to produce a diagnosis. At the heart of these systems are frequently classifiers, which are trained on data samples and produce discrete predictions, along with confidence scores or probabilities for each class, when a new sample is presented. This is the case with the research presented in this paper, which addresses the classification of kidney stones.
Kidney diseases can affect anyone, regardless of age or gender. As with all other diseases, early diagnosis is crucial. Chronic kidney disease can be fatal if untreated [1]. Cysts, hydronephrosis, and kidney stones must be identified and treated quickly. Chronic kidney diseases can be avoided by early detection of kidney stones and small tumors [2]. Globally, the number of kidney patients is steadily increasing, but nephrologists are scarce, especially in developing nations. As a result, many kidney patients cannot get proper care. Regular screening of kidney patients should include an investigation using imaging technology [3]. For nephrologists, these routine procedures take time, and their busy schedules could result in a misdiagnosis; the shortage of nephrologists relative to the number of patients may also cause a significant delay in diagnosis. To get around these issues and detect kidney disorders early, computer-aided healthcare systems have been developed [4]. AI-based systems lessen the workload of clinical staff and their potential for making mistakes. Additionally, obtaining objectively accurate results is always helpful. Consequently, such a system is intended to be consistently accurate and reliable.
Nephrolithiasis, also known as kidney stone disease, is a condition in which large objects form inside the renal pelvis and tubules because of the urine's supersaturation with solutes [1]. Globally, the prevalence of kidney stone disease is increasing, and it now affects 12% of the global population. The male-to-female ratio among patients with kidney stones is about 3:1 [3]. Kidney stones can be categorized by their composition into calcium oxalate (CaOx), calcium phosphate (CaP), uric acid (UA), cystine, and struvite stones, among others [4]. CaOx stones are the most prevalent of these. In a clinical study of kidney stone patients, CaOx stones made up 79% of the cases, UA stones 11%, and other stones 10% [3]. CaOx stone disease is brought on by many factors, including diet, medication, urine pH, hypercalcemia, and hyperoxaluria [4, 5].
This research paper presents a novel architecture of the ResNet model [6], called TSA-ResNet, for the classification of kidney stones from ultrasound (QUS) images. The architecture is based on a combination of the Temporal Shift Module (TSM) and the Temporal Spatial Hashing (TSH) discriminative features. The TSM module captures temporal information and extracts discriminative features from it, while the TSH features are extracted from the spatial information. The combination of the two modules provides an optimal feature representation for the classification of kidney stones from QUS images [7]. The proposed architecture achieved a state-of-the-art accuracy of 95.76% in the classification task. The paper also discusses several techniques used to improve the performance of the architecture, such as data augmentation, dropout, and batch normalization.
The results demonstrate that the combination of the TSM and TSH modules can provide an efficient and robust architecture for kidney stone classification from QUS images.
The paper also investigates a new architecture for the classification of kidney stones from ultrasound images. The authors propose a deep learning architecture called TSA-ResNet (Time Series Analysis-Recurrent Neural Network), which uses temporal sequences of ultrasound images for training and classification [8]. The authors also propose a new feature extraction method based on TSH-discriminatory features, which helps to identify the discriminative features of kidney stones [9]. The proposed architecture is evaluated on a public dataset of ultrasound images, and the results show that it outperforms existing methods in terms of accuracy, precision, and recall. The results also demonstrate that the proposed architecture can accurately classify different types of kidney stones, with a maximum precision of 96.83%. The paper also provides insights into how the proposed architecture can be used for other medical imaging applications.
This work is based on Ambient Assistive Technology (AAT) [10], which is designed to help people with disabilities and special needs. The work focuses on developing an optimized TSA ResNet architecture with TSH-discriminatory features for kidney stone classification from QUS images. The research team proposes a TSA ResNet architecture, which is a deep learning model, to extract discriminative features from the QUS images. The architecture consists of three main components: a segmentation
module, a feature extraction module, and a classification module. The segmentation module separates the kidney stone from the background image, and the feature extraction module extracts features from the kidney stone. Finally, the classification module classifies the kidney stone into one of the three categories based on the extracted features [11, 12]. The proposed architecture is tested and evaluated on a dataset of kidney stone QUS images, and its performance is compared with that of existing methods. The results show that the proposed architecture outperforms existing methods in terms of accuracy, precision, recall, and F1 score.
The major objectives of the work are summarized as follows:
1. Develop an optimized TSA ResNet architecture with TSH-discriminatory features for kidney stone classification from QUS images.
2. Improve the accuracy of kidney stone classification from QUS images by utilizing the optimized TSA ResNet architecture and TSH-discriminatory features.
3. Compare the performance of the optimized TSA ResNet architecture with existing methods.
4. Analyze and evaluate the effectiveness of the proposed architecture and features for kidney stone classification from QUS images.
The subsequent sections are structured as follows. Section 2 presents the current scenario and compares the proposed method with the existing literature. Section 3 discusses the importance of deep learning networks for the classification of kidney images and the necessity of using the ResNet model. Section 4 presents the proposed model and the flow of the whole classification process, including the dataset collected from the clinic and online databases, the pre-processing steps applied to the acquired images, and the classification steps and layers of the deep neural network. Section 5 describes the results obtained, which are then discussed in the discussion and conclusion.
2 Literature Survey Scherer et al. [13] demonstrate that their proposed method can accurately differentiate between calcium oxalate, calcium phosphate, and uric acid stones with an accuracy of 91.2%. Furthermore, the authors also show that their method can be used to differentiate between different sizes and shapes of stones. The proposed technique is based on the dark-field radiography technique, which is a newer technology for imaging that allows for the visualization of low-contrast objects. The authors also discuss the potential applications of this technique in clinical settings and its implications for the diagnosis and treatment of kidney stones.
Kahani et al. [14] proposed a method for accurately identifying and classifying urinary stones using DEKUB images. The proposed approach used image preprocessing techniques and a convolutional neural network (CNN) to classify stones according to their size, shape, and composition. The authors evaluated the proposed approach using a dataset of images from over 400 patients. The results showed that the proposed method was able to accurately identify and classify stones with an accuracy of up to 97.4%. This approach provides a promising solution for automated urinary stone classification and may help to reduce the workload of radiologists. Thongprayoon et al. [15] discuss the prevalence of kidney stones and their potential to cause complications and health problems. The authors examine the link between kidney stone disease and chronic kidney disease, as well as other conditions such as hypertension, diabetes, and obesity. The paper also looks at the differences in the incidence and prevalence of kidney stones between different populations and age groups. Finally, the paper provides recommendations for improving the diagnosis and management of kidney stone disease. Duan et al. [16] used CT scans to measure the morphological parameters of the stones, including the stone volume, surface area, sphericity, and edge sharpness. They found that COD stones had significantly larger volumes, larger surface areas, higher sphericity, and sharper edges compared to COM stones. They concluded that quantitative morphological information from micro-computerized and clinical computerized tomography scans can be used to differentiate between COM and COD stones. This could be of clinical importance, as COM stones are more commonly associated with renal failure, while COD stones are more likely to obstruct the urinary tract. Singh et al. [17] used computed tomography (CT) scans to determine the composition of the stones in their study population. 
They found that the majority of stones were composed of calcium oxalate, followed by calcium phosphate, uric acid, and struvite. They also found that patients with metabolic risk factors had a higher prevalence of calcium oxalate stones than those without metabolic risk factors. The authors concluded that their findings support the need for tailored prevention strategies to reduce the risk of developing kidney stones. Motamedinia et al. [18] present a systematic approach to evaluating renal stones based on the size, location, and composition of the stone. They also discuss the use of cross-sectional imaging, such as computed tomography (CT) and ultrasound, to provide precise measurements and improved accuracy in assessing the complexity of stones. The authors also discuss the potential implications of this approach, such as the implications for medical decision-making and the cost-effectiveness of CT and ultrasound. Finally, the authors suggest further research to improve and expand the use of cross-sectional imaging for the assessment of renal stones. Caroli et al. [19] provide an overview of the different imaging techniques available for kidney imaging, including radiography, ultrasound, computed tomography (CT), magnetic resonance imaging (MRI), and nuclear medicine imaging. The paper also discusses the potential applications of these imaging techniques and the potential risks associated with them. Furthermore, it covers the pathophysiology of kidney diseases and how imaging can help in their diagnosis. Finally, the paper outlines the
current research efforts in the field of kidney imaging and the potential for further advances. D’costa et al. [20] looked at a total of 689 patients with a history of kidney stones over 5 years and followed up with them 1, 3, and 5 years after their initial presentation. The study found that the most common symptoms and radiographic manifestations associated with kidney stone recurrence were increased pain, increased urinary frequency, and the presence of hydronephrosis. The study also found that certain risk factors, such as increasing age, male sex, and history of prior stone episodes, were predictive of kidney stone recurrence. This study provides important information about the clinical course of kidney stone recurrence and its associated symptoms and radiographic manifestations. Nestler et al. [21] discuss a range of imaging techniques, including X-ray, ultrasound, computed tomography (CT), and magnetic resonance imaging (MRI), and their potential use in the diagnosis and treatment of stones. The authors also discuss the advantages and disadvantages of each technique and how they can be best used in the clinical setting. The paper concludes by highlighting the importance of appropriate imaging techniques in the management of urinary stone disease. Cui et al. [22] used Raman spectroscopy to collect data on the chemical composition of kidney stones and then used a machine-learning algorithm to classify the stones into five different classes. The results of the study show that Raman spectroscopy is a useful tool for classifying kidney stones, and can be used to better understand the causes and treatments of kidney stones. Jendeberg et al. [23] trained a convolutional neural network on a dataset of computed tomography (CT) images to classify the stones as either distal ureteral or pelvic phleboliths. The results showed that the model was able to accurately differentiate between the two types of stones with an accuracy of 96%. 
The authors also found that the model was able to identify subtle differences between the two types of stones, such as the degree of mineralization, which was not possible with traditional methods. The findings demonstrate the potential for convolutional neural networks to improve the accuracy and speed of diagnosis for urolithiasis. Wang et al. [24] conducted the study at four hospitals in the Netherlands. The results of the study showed that the STONE score was a reliable predictor of the need for ureteral stone removal and was able to accurately classify patients into low- and high-risk categories. The researchers concluded that the STONE score is a useful tool for predicting the need for ureteral stone removal and should be used in clinical decision-making. Schütz et al. [25] used an optical spectroscopy system, a customized light source, and a spectrometer to measure the autofluorescence of kidney stones under various conditions. The results showed that the spectra of the stones were significantly different from the tissue spectra and that the spectra of the stones could be detected even under low-light conditions. The authors concluded that intraoperative stone-tissue-instrument analysis using autofluorescence could be a promising approach for the early diagnosis and management of kidney stones. Kavoussi et al. [26] used a dataset from a cohort of patients with kidney stone disease, collected from a healthcare system, to train a variety of machine learning
models. The models were evaluated on their ability to predict abnormal 24-h urinary results. The results showed that the Random Forest model had the best performance, with an accuracy of 92.5%, followed by the Support Vector Machine (88.3%) and the k-Nearest Neighbors (87.0%) models. The authors concluded that machine learning models are useful for predicting 24-h urinary abnormalities associated with kidney stone disease and provide a promising tool for clinical decision support. Han et al. [27] discuss the various types of kidney stones, the risk factors associated with their formation, and the treatments available to treat them. It also outlines the dietary modifications that can be made to reduce the risk of kidney stone formation, such as reducing the intake of high-oxalate foods and increasing fluid intake. Additionally, the paper reviews the medical treatments available for kidney stones, including medications and surgery. Finally, the paper provides recommendations for the prevention and management of kidney stones. Williams et al. [28] concluded that both urine and stone analysis provides useful information for the diagnosis and management of renal stone formers. Urine analysis is recommended for the identification of metabolic abnormalities that may increase the risk for stone formation and for determining the composition of stones, which is important for appropriate treatment and prevention. Stone analysis is also recommended for diagnosis and for determining the cause of recurrent stone formation, as well as to determine the effectiveness of current treatments. The paper provides a detailed description of the techniques used in urine and stone analysis, including how to collect the samples, what to look for, and the interpretation of results. The paper concludes with a summary of the recommendations of the consensus conference. Deng et al. 
[29] investigate the use of radiomics and machine learning to detect early diabetic kidney damage in a cohort of type 2 diabetic patients. They found that radiomics-based markers were significantly correlated with renal function and other measures of kidney damage. The results show that radiomics-based markers could be used to detect early diabetic kidney damage and may provide a valuable tool for early diagnosis and intervention.
3 Importance of Deep Learning in Medical Image Classification Deep learning is an advanced form of machine learning that utilizes neural networks to process data and make decisions. It is used to identify patterns and correlations in data and can be used for a variety of tasks such as computer vision, natural language processing, and data analysis. Deep learning models can learn from data more accurately and reliably than traditional machine learning models, making them more effective for complex tasks. Deep learning is also used to improve the accuracy, speed, and scalability of many AI applications. Its importance continues to grow as it is used in more and more applications, ranging from healthcare to autonomous systems.
Medical image classification is an important tool for improving the accuracy and efficiency of medical diagnosis. It allows for automated detection and recognition of medical images, enabling medical professionals to identify diseases and other conditions more quickly and accurately. Additionally, it can provide valuable insights into the underlying causes of diseases and help inform treatment decisions. Medical image classification can also be used to detect abnormalities in images and provide early diagnosis of serious illnesses such as cancer. Finally, it can be used to monitor the progress of a patient's condition over time.
Deep learning has become one of the most important technologies in medical image classification. It has revolutionized the way medical images are analyzed and interpreted. Deep learning algorithms have enabled medical practitioners to detect and classify medical images accurately and quickly. Deep learning algorithms are particularly useful in classifying images that contain a large number of features or classes. They can detect subtle differences between classes and can identify patterns in complex data. The use of deep learning in medical image classification has enabled medical professionals to make better and more accurate decisions, resulting in improved patient care.
The key factors behind the importance of deep learning in medical image classification are:
1. Accurate Diagnosis: Deep learning algorithms can provide more accurate diagnoses than traditional methods, leading to improved patient outcomes.
2. Improved Performance: Deep learning algorithms can improve the performance of medical image classification tasks, such as tissue segmentation, disease diagnosis, and medical image registration.
3. Cost Reduction: Deep learning algorithms can reduce the cost of medical image classification tasks by automating tedious and time-consuming manual processes.
4. Robustness: Deep learning algorithms are more robust to noise and data variability than traditional methods, allowing more reliable medical image classification tasks.
5. Personalization: Deep learning algorithms can be adapted to individual patient data, allowing for personalized medical image classification tasks.
6. Scalability: Deep learning algorithms can scale to large datasets, allowing for a more comprehensive evaluation of medical image classification tasks.
4 Proposed TSH-TSA-ResNet Architecture The proposed TSH-TSA-ResNet architecture for kidney stone classification from QUS images is a three-stage approach. In the first stage, a two-dimensional Discrete Wavelet Transform (DWT) is applied to the input QUS images to extract the wavelet coefficients. In the second stage, a Transferable Self-Attention (TSA) network is used to identify the most important wavelet coefficients. The output of this stage is then used as the input to the third stage, which is a Residual Neural Network. The ResNet is used to classify the input image into one of the three classes of kidney stones. The
proposed architecture is expected to perform better than existing methods in terms of both accuracy and speed. The proposed TSH-TSA-ResNet architecture is shown in Fig. 1.
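As a rough illustration of the first stage described above, a one-level two-dimensional DWT can be sketched with the Haar wavelet. The chapter does not name the wavelet family, so the Haar choice, the averaging normalisation, and the function names below are assumptions made purely for illustration.

```python
def haar_dwt_1d(signal):
    """One Haar level on an even-length list: pairwise averages and half-differences."""
    half = len(signal) // 2
    approx = [(signal[2 * i] + signal[2 * i + 1]) / 2.0 for i in range(half)]
    detail = [(signal[2 * i] - signal[2 * i + 1]) / 2.0 for i in range(half)]
    return approx, detail


def _column_pass(block):
    """Apply the 1D Haar step down every column of a 2D list."""
    lows, highs = [], []
    for column in zip(*block):
        approx, detail = haar_dwt_1d(list(column))
        lows.append(approx)
        highs.append(detail)
    # Transpose back so the result is indexed [row][column] again.
    return [list(r) for r in zip(*lows)], [list(r) for r in zip(*highs)]


def haar_dwt_2d(image):
    """One-level 2D Haar DWT of an image with even height and width.

    Returns (ll, lh, hl, hh): ll is the approximation sub-band; the other
    three hold detail coefficients (first letter = row filter, second = column filter).
    """
    row_low, row_high = [], []
    for row in image:
        approx, detail = haar_dwt_1d(row)
        row_low.append(approx)
        row_high.append(detail)
    ll, lh = _column_pass(row_low)
    hl, hh = _column_pass(row_high)
    return ll, lh, hl, hh
```

With this averaging normalisation, each entry of `ll` is simply the mean of a 2 x 2 pixel block, which makes the output easy to sanity-check by hand.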
Fig. 1 Working flow of TSH-TSA
4.1 Data Collection of Kidney Stone Classification from QUS Images
Data collection for kidney stone classification from QUS images can be done using a variety of methods, including manual annotation, semi-automatic annotation, and machine learning-based annotation.
1. Manual Annotation: Manual annotation involves manually labeling the stones in the images with the correct class. The annotator must be familiar with the different types of stones to accurately identify them. This method is time-consuming and can be prone to errors.
2. Semi-Automatic Annotation: Semi-automatic annotation involves using a computer program to pre-label the stones in the images. This method is more accurate than manual annotation and can be completed in a shorter amount of time.
3. Machine Learning-Based Annotation: Machine learning-based annotation uses algorithms to automatically label the stones in the images. This method is the most accurate and efficient way to collect data for kidney stone classification from QUS images.
4.1.1 Data Set Details
Kidney ultrasound images were obtained from databases and with the expertise of radiologists [30]. The clinical images were acquired between August 2022 and December 2022 at the Kalasalingam Hospital in Krishnankoil, Tamil Nadu, India. In total, 3420 augmented images are available (2420 images in the training set and 1000 in the testing set). Sample ultrasound images from the dataset are displayed in Fig. 2.
Fig. 2 Different types of kidney abnormalities: (a) Normal kidney, (b) Cyst kidney, (c) Stone kidney, (d) Tumor kidney
Fig. 3 Pre-processing of kidney images (a) Original image, (b) Grayscale image, (c) Median filter
4.1.2 Working Flow of TSH-TSA
Input: Ultrasound Image of Kidney Stone An ultrasound image of a kidney stone is a type of medical imaging used to detect and diagnose kidney stones. It is a non-invasive procedure that uses sound waves to produce an image of the affected area. The ultrasound image provides a detailed view of the kidney stone, including its size, shape, location, and other characteristics. The ultrasound image can also be used to determine if a person is at risk of developing a kidney stone, or if a current stone can be monitored or treated. Ultrasound is a safe, low-cost imaging technique that is widely used to diagnose and monitor kidney stones.
Pre-processing
Pre-processing is an essential step for any machine learning or deep learning model. The purpose of pre-processing is to prepare the data so that it is suitable for the model to learn from. In the case of the TSH-TSA-ResNet architecture for kidney stone classification from QUS images, pre-processing involves normalizing the images, cropping and resizing them, and applying any necessary transformations such as adding noise or blurring. Additionally, pre-processing may involve applying various filters to the images to enhance their features. Finally, the images should be split into training and testing sets to evaluate the model's performance. Pre-processing for the TSH-TSA-ResNet architecture involves the following steps; the results are shown in Fig. 3 for reference:
1. Image Acquisition: The first step is to acquire the QUS images of kidney stones as shown in Fig. 2. This can be done by using a high-resolution ultrasound device.
2. Image Pre-processing: The acquired images should be pre-processed to remove any noise and artifacts and to ensure that the image is of good quality. This can be done by using image enhancement techniques such as image filtering, contrast enhancement, and sharpening.
3. Image Resizing: The images should then be resized to a uniform size so that the network processes all images at the same size.
4. Image Augmentation: To increase the amount of data available for training, data augmentation techniques such as rotation, translation, scaling, and flipping can be used on the images.
5. Image Feature Extraction: Features such as texture, shape, and intensity should be extracted from the images. This can be done using techniques such as Gabor filtering, Histogram of Oriented Gradients (HOG), and Local Binary Patterns (LBP).
A. Resize the image to a suitable size
First, crop the image to the desired region using an image editing program, then resize it to the desired size. Next, apply image sharpening, which helps to make the image look crisper. Finally, apply color correction to adjust the color balance of the image, which helps to make the image look more natural.
B. Convert the image to grayscale
Converting the image to grayscale reduces the amount of data needed to represent it. The conversion eliminates the need to store color information, which reduces the size of the image file and makes it easier to process. Additionally, grayscale images are easier to analyze because they contain fewer colours. The following steps are followed to convert the image to grayscale:
1. Use the read command to read the image you want to convert to grayscale.
2. Use the rgb2gray command to convert the image to grayscale.
3. Use the show command to view the image in grayscale.
4. Use the write command to save the image in grayscale.
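The grayscale conversion above and the median filtering shown in Fig. 3 can be sketched in a few lines. This is a minimal illustration rather than the authors' MATLAB pipeline: images are plain nested lists, the luminance weights are the commonly used ITU-R BT.601 coefficients, and the 3 x 3 window size and the border handling are assumptions.

```python
from statistics import median


def rgb_to_gray(rgb_image):
    """Convert rows of (R, G, B) tuples to luminance values.

    The weights are the widely used BT.601 luminance coefficients (an assumption;
    the chapter only names the rgb2gray command).
    """
    return [[0.2989 * r + 0.5870 * g + 0.1140 * b for (r, g, b) in row]
            for row in rgb_image]


def median_filter_3x3(gray):
    """3x3 median filter; border pixels use only the neighbours that exist."""
    height, width = len(gray), len(gray[0])
    filtered = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            window = [gray[j][i]
                      for j in range(max(0, y - 1), min(height, y + 2))
                      for i in range(max(0, x - 1), min(width, x + 2))]
            filtered[y][x] = median(window)
    return filtered
```

A median filter is a common choice for ultrasound because it suppresses isolated speckle-like outliers while keeping edges sharper than a mean filter would.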
Feature Extraction
Feature extraction is the process of extracting useful information from images to make them easier to analyze. It is commonly used in computer vision applications such as object recognition and image segmentation. We have used the Scale-Invariant Feature Transform (SIFT), a technique that extracts distinctive features from an image and is used for object recognition and image matching.
A. Extraction of TSH and TSA using Radon-based transform
The Radon-based transform is a powerful transform used to extract features from images and other data. To extract TSH and TSA using the Radon-based transform, the image is first converted into a 2D array of pixels. Next, the Radon transform is applied to the image array, which decomposes it into sinusoidal components. The TSH and TSA components are then extracted from the sinusoidal components by
Fig. 4 Deep network architecture for ultrasound kidney images residual blocks (100 convolution layers in total) as a CNN-based feature extractor and three fully connected layers of 512, 512, and 256 neurons as a regressor for eGFR prediction
isolating the components that have the highest amplitude in the frequency domain. Finally, the extracted TSH and TSA components can be used for further analysis.
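The abstract describes the TSH descriptor as a merge of texture, statistical, and histogram-based features. A minimal sketch of such a feature vector is given below; the specific features chosen (horizontal-neighbour contrast, mean and standard deviation, an 8-bin histogram) and all function names are illustrative assumptions, and no Radon step is included.

```python
import math


def texture_contrast(gray):
    """Mean squared difference of horizontally adjacent pixels.

    This equals the GLCM contrast for the offset (0, 1) computed without
    grey-level quantisation.
    """
    diffs = [(row[i] - row[i + 1]) ** 2
             for row in gray for i in range(len(row) - 1)]
    return sum(diffs) / len(diffs)


def statistical_features(gray):
    """Mean and (population) standard deviation of all pixel intensities."""
    pixels = [v for row in gray for v in row]
    mean = sum(pixels) / len(pixels)
    variance = sum((v - mean) ** 2 for v in pixels) / len(pixels)
    return [mean, math.sqrt(variance)]


def histogram_features(gray, bins=8, max_value=256):
    """Normalised intensity histogram whose entries sum to 1."""
    counts = [0] * bins
    total = 0
    for row in gray:
        for v in row:
            counts[min(int(v * bins / max_value), bins - 1)] += 1
            total += 1
    return [c / total for c in counts]


def tsh_vector(gray):
    """Concatenate texture, statistical, and histogram features (1 + 2 + bins values)."""
    return ([texture_contrast(gray)]
            + statistical_features(gray)
            + histogram_features(gray))
```

A vector of this kind could be passed to a classifier alongside, or instead of, raw pixels; how the chapter combines it with the ResNet input is not specified in the text.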
Classification
a. TSH and TSA as inputs to ResNet
1. Pre-process the TSH and TSA data: Start by normalizing the data so that it has a mean of zero and a standard deviation of one. This ensures that the data is centered around zero and makes it easier for the neural network to learn. Then, split the data into training, validation, and test sets. This allows the neural network to be tested on unseen data to ensure that it is performing well.
2. Create a ResNet architecture: Design a ResNet architecture that takes in the TSH and TSA data and processes it through a series of convolutional and pooling layers as shown in Fig. 4. This will allow the network to extract features from the data and learn patterns that can be used to make predictions.
3. Train the ResNet model: Once the architecture is in place, train the model by feeding it the training data and adjusting the weights and biases as necessary. Monitor the accuracy of the model on the validation set and adjust the hyperparameters as necessary.
4. Evaluate the model: Once the model is trained, use it to make predictions on the test set and evaluate its performance. This can be done by comparing the predicted output to the ground truth labels.
b. Classification of Kidney Images with ResNet
The performance evaluation uses the metrics accuracy, precision, and recall.
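The hyperparameter adjustment in step 3 is attributed in the abstract to the Tree Seed Algorithm, which tunes momentum and learning rate. A basic version of that search might look like the sketch below. The quadratic objective is a hypothetical stand-in for validation error (a real run would train and validate the ResNet for every candidate), and the population size, iteration count, search tendency of 0.1, and parameter bounds are all assumed values.

```python
import random


def tree_seed_search(objective, bounds, n_trees=10, n_iters=50,
                     search_tendency=0.1, rng_seed=0):
    """Minimise `objective` inside box `bounds` with a basic Tree Seed Algorithm."""
    rng = random.Random(rng_seed)
    dim = len(bounds)

    def clip(value, j):
        low, high = bounds[j]
        return min(max(value, low), high)

    # Each tree is one candidate hyperparameter vector.
    trees = [[rng.uniform(low, high) for (low, high) in bounds]
             for _ in range(n_trees)]
    scores = [objective(t) for t in trees]
    best_i = min(range(n_trees), key=lambda k: scores[k])
    best, best_score = list(trees[best_i]), scores[best_i]

    # Each tree produces between roughly 10% and 25% of the population in seeds.
    min_seeds = max(1, int(0.10 * n_trees))
    max_seeds = max(min_seeds, int(0.25 * n_trees))

    for _ in range(n_iters):
        for i in range(n_trees):
            top_seed, top_score = None, float("inf")
            for _ in range(rng.randint(min_seeds, max_seeds)):
                r = rng.choice([k for k in range(n_trees) if k != i])
                candidate = []
                for j in range(dim):
                    alpha = rng.uniform(-1.0, 1.0)
                    if rng.random() < search_tendency:
                        # Move relative to the best location found so far.
                        step = alpha * (best[j] - trees[r][j])
                    else:
                        # Move relative to a randomly chosen other tree.
                        step = alpha * (trees[i][j] - trees[r][j])
                    candidate.append(clip(trees[i][j] + step, j))
                score = objective(candidate)
                if score < top_score:
                    top_seed, top_score = candidate, score
            # A seed replaces its parent tree only if it is better.
            if top_score < scores[i]:
                trees[i], scores[i] = top_seed, top_score
                if top_score < best_score:
                    best, best_score = list(top_seed), top_score
    return best, best_score


def toy_validation_error(params):
    """Hypothetical surrogate with its minimum at lr = 0.01, momentum = 0.9."""
    learning_rate, momentum = params
    return (learning_rate - 0.01) ** 2 + (momentum - 0.9) ** 2


BOUNDS = [(1e-4, 0.1), (0.5, 0.99)]  # assumed ranges for (learning rate, momentum)
best_params, best_error = tree_seed_search(toy_validation_error, BOUNDS)
```

Because every objective call would correspond to a full training run in practice, the population size and iteration count dominate the cost and would need to be kept small.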
Accuracy: Accuracy is a measure of how close a model's predictions are to the true values. It is typically expressed as a percentage and is calculated by dividing the number of correct predictions by the total number of predictions made.
\[ \text{Accuracy} = \frac{\text{True Positives} + \text{True Negatives}}{\text{Total Predictions}} \]
Precision: Precision measures how many of the model's positive predictions are correct. It is the ratio of correctly predicted positive observations to the total predicted positive observations. High precision means that an algorithm returned a high proportion of relevant results among the total results, while low precision means that it returned a low proportion of relevant results.
\[ \text{Precision} = \frac{\text{True Positives}}{\text{True Positives} + \text{False Positives}} \]
Recall: Recall is a measure of a model's ability to correctly identify positive cases in a dataset. It measures the proportion of actual positive cases that are correctly identified by the model.
\[ \text{Recall} = \frac{\text{True Positives}}{\text{True Positives} + \text{False Negatives}} \]
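The three definitions above translate directly into code. The sketch below computes them from the four confusion-matrix counts of a binary problem; the function name and the example counts are hypothetical, and returning 0.0 when a denominator is zero is a convention chosen here, not something the chapter specifies.

```python
def classification_metrics(tp, tn, fp, fn):
    """Return (accuracy, precision, recall) from binary confusion-matrix counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    # Guard against empty denominators (no positive predictions / no positives).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Hypothetical counts, for illustration only.
accuracy, precision, recall = classification_metrics(tp=90, tn=85, fp=15, fn=10)
```

For a multi-class task such as the kidney categories in Fig. 2, these quantities would be computed per class (one class against the rest) and then averaged.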
Output
a. Class label of kidney stone
The class label of a kidney stone should be allocated based on the size, location, and shape of the stone. Generally, stones less than 5 mm in size are considered small, stones between 5 and 10 mm are considered medium, and stones larger than 10 mm are considered large. Additionally, stones located in the kidney are labeled as renal, stones located in the ureter are labeled as ureteral, and stones located in the bladder are labeled as bladder stones. Lastly, stones can be described as round, egg-shaped, jagged, or smooth based on their shape.
b. Confidence Score
The confidence score is a measure of how reliable a particular prediction is. It is a numerical value that indicates the probability that the prediction is accurate. It can be calculated by taking the ratio of the number of correct predictions to the total number of predictions.
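The size and location rules described in part (a) can be written as a short lookup, sketched below. The function names are hypothetical, and treating exactly 5 mm and exactly 10 mm as "medium" is an assumption, since the text does not say which category the boundary values belong to.

```python
# Location names as given in the text: kidney -> renal, ureter -> ureteral, bladder -> bladder.
LOCATION_LABELS = {"kidney": "renal", "ureter": "ureteral", "bladder": "bladder"}

def size_category(diameter_mm):
    """Map a stone diameter in millimetres to small / medium / large."""
    if diameter_mm < 0:
        raise ValueError("diameter must be non-negative")
    if diameter_mm < 5:
        return "small"
    if diameter_mm <= 10:  # boundary handling is an assumption
        return "medium"
    return "large"

def stone_label(diameter_mm, location, shape):
    """Combine size, location, and shape into one descriptive label."""
    return f"{size_category(diameter_mm)} {LOCATION_LABELS[location]} stone, {shape}"
```

For example, `stone_label(7.5, "ureter", "jagged")` yields the label `medium ureteral stone, jagged`.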
Fig. 5 Comparison of the proposed method’s performance with that of the existing techniques during training and testing using dataset D1
5 Experimental Results
5.1 Dataset: Image Acquisition
Two datasets are used, one from the web (D1) and another from a hospital (D2). The performance of the suggested approach is compared with that of the most recent methodologies and current DNNs [30–35]. As shown in Fig. 5, the proposed technique outperforms the current methods with a classification accuracy of 98.9%. This indicates that the suggested method can categorize renal ultrasound images accurately and efficiently.
5.2 Implementation Platform
The testing was performed on a computer with an Intel(R) Core(TM) i7-9700 processor @ 3.00 GHz and 16 GB of RAM, running the 64-bit (x64) Windows 10 Pro operating system. The simulations were run in MATLAB R2019b.
5.3 Performance Analysis
The performance of the network model in a noisy environment is assessed using dataset D2. This dataset includes RN, PH, and high-quality training images. Images are typically distorted by noise in actual applications; hence it is more useful to test the model’s effectiveness in noisy environments. The networks’ performance was found to decrease as the noise level increased: a rise in noise variance causes a considerable decline in DNN classification accuracy. This means that every approach now in use is noise-sensitive. As noise levels rise, performance falls, because the quality of the images used for training differs from that of the images used for testing. DNN performance is therefore quite low in this setting, but it may be improved by training and testing on datasets whose images are of comparable quality.
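A noise-robustness check of this kind can be sketched as follows. The chapter does not state which noise model was used, so zero-mean Gaussian noise on images scaled to [0, 1] is an assumption here, and `predict` is a hypothetical callable standing in for any trained classifier.

```python
import numpy as np

def add_gaussian_noise(images, variance, rng):
    """Add zero-mean Gaussian noise and clip back to the [0, 1] range."""
    noise = rng.normal(loc=0.0, scale=np.sqrt(variance), size=images.shape)
    return np.clip(images + noise, 0.0, 1.0)

def accuracy_under_noise(predict, images, labels, variances, seed=0):
    """Return {variance: accuracy} for a classifier evaluated on noisy copies.

    `predict` takes a batch of images and returns one label per image.
    """
    rng = np.random.default_rng(seed)
    images = np.asarray(images, dtype=np.float64)
    labels = np.asarray(labels)
    results = {}
    for var in variances:
        noisy = add_gaussian_noise(images, var, rng)
        results[var] = float(np.mean(predict(noisy) == labels))
    return results
```

Plotting the returned accuracies against the noise variance gives the kind of degradation curve described above.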
6 Discussion
The confusion matrix is used to show the success of the classification through the true positive and false negative rates. The sensitivity and specificity of the presented network model are used to assess its performance, based on the number of samples that were classified correctly and incorrectly. Table 1 lists the performance metrics derived from the confusion matrix. The confusion matrix makes it easier to see how many mistakes the network model made when predicting each class. It is important to note that the cyst category in Table 1 achieves 98.9% sensitivity on D1. This shows that the suggested approach classifies nearly all cyst images correctly. Table 1 also lists the results for D2, where the cyst category reaches a sensitivity of 97.2% and a specificity of 98.8%, reflecting accurate predictions of truly positive and truly negative instances.

Table 1 Confusion matrix and its performance metrics during training and testing using kidney ultrasound image datasets D1 and D2

Dataset         Performance measure   Normal   Cyst   Stone   Tumor
D1 [online]     Sensitivity %         97.6     98.9   90.1    98.5
D1 [online]     Specificity %         97.5     98.6   99.4    98.2
D2 [Hospital]   Sensitivity %         90.2     97.2   92.3    96.7
D2 [Hospital]   Specificity %         96.4     98.8   96.5    98.9
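Per-class sensitivity and specificity of the kind reported in Table 1 can be derived from a multi-class confusion matrix with a one-vs-rest calculation, sketched below. The function name and the convention that rows are true classes and columns are predicted classes are assumptions; the chapter does not give the underlying confusion matrix, so no attempt is made to reproduce its numbers.

```python
import numpy as np

def per_class_sensitivity_specificity(cm):
    """One-vs-rest sensitivity and specificity for each class.

    cm[i, j] counts samples of true class i predicted as class j.
    """
    cm = np.asarray(cm, dtype=np.float64)
    tp = np.diag(cm)
    fn = cm.sum(axis=1) - tp          # true class i, predicted as something else
    fp = cm.sum(axis=0) - tp          # predicted class i, actually something else
    tn = cm.sum() - tp - fn - fp      # everything not involving class i
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity
```

For a four-class problem (Normal, Cyst, Stone, Tumor), passing the 4 × 4 confusion matrix returns two arrays of length four, one value per class.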
7 Conclusion
In this work, we have discussed a novel architecture that combines traditional and deep learning methods to extract discriminative features from the images. The proposed TSA-ResNet architecture is optimized with an evolutionary algorithm, which proved to be a successful and reliable method for accurate kidney stone classification. We evaluated the proposed architecture on a dataset of patient-derived QUS images. The results show that it outperforms existing methods, achieving an accuracy of 98.9%, and that it can effectively extract discriminative features from QUS images for the classification of kidney stones. This high accuracy indicates its potential for use in clinical settings. Further research should be conducted to improve the accuracy and performance of the model and to establish whether it generalizes to a wider range of medical imaging scenarios. Overall, the results of this study suggest that the optimized TSA-ResNet architecture is a promising tool for automated kidney stone classification from QUS images.
Acknowledgements All the authors would like to thank both the Department of Computer Science and Engineering and the Department of Electronics and Communication Engineering at Kalasalingam Academy of Research and Education (Deemed to be University) for permission to conduct the research and for providing computational facilities for the analysis of the images.
References
1. Edvardsson, V. O., Indridason, O. S., Haraldsson, G., Kjartansson, O., & Palsson, R. (2013). Temporal trends in the incidence of kidney stone disease. Kidney International, 83(1), 146–152.
2. Kumar, K., & Abhishek, B. (2012). Artificial neural networks for diagnosis of kidney stones disease (Vol. 10). GRIN Verlag.
3. Serrat, J., Lumbreras, F., Blanco, F., Valiente, M., & López-Mesas, M. (2017). myStone: A system for automatic kidney stone classification. Expert Systems with Applications, 89, 41–51.
4. Howles, S. A., & Thakker, R. V. (2020). Genetics of kidney stone disease. Nature Reviews Urology, 17(7), 407–421.
5. Schaeffer, A. J., Feng, Z., Trock, B. J., Mathews, R. I., Neu, A. M., Gearhart, J. P., & Matlaga, B. R. (2011). Medical comorbidities associated with pediatric kidney stone disease. Urology, 77(1), 195–199.
6. Praveen, S. P., Srinivasu, P. N., Shafi, J., Wozniak, M., & Ijaz, M. F. (2022). ResNet-32 and FastAI for diagnoses of ductal carcinoma from 2D tissue slides. Scientific Reports, 12, 20804. https://doi.org/10.1038/s41598-022-25089-2
7. Whitehurst, L., Jones, P., & Somani, B. K. (2019). Mortality from kidney stone disease (KSD) as reported in the literature over the last two decades: A systematic review. World Journal of Urology, 37(5), 759–776.
8. Kazemi, Y., & Mirroshandel, S. A. (2018). A novel method for predicting kidney stone type using ensemble learning. Artificial Intelligence in Medicine, 84, 117–126.
9. Novak, T. E., Lakshmanan, Y., Trock, B. J., Gearhart, J. P., & Matlaga, B. R. (2009). Sex prevalence of pediatric kidney stone disease in the United States: An epidemiologic investigation. Urology, 74(1), 104–107.
10. Ahmed, S., Srinivasu, P. N., Alhumam, A., & Alarfaj, M. (2022). AAL and internet of medical things for monitoring type-2 diabetic patients. Diagnostics, 12, 2739. https://doi.org/10.3390/diagnostics12112739
11. Sood, A., Sarangi, S., Pandey, A., & Murugiah, K. (2011). YouTube as a source of information on kidney stone disease. Urology, 77(3), 558–562.
12. Matlaga, B. R., Schaeffer, A. J., Novak, T. E., & Trock, B. J. (2010). Epidemiologic insights into pediatric kidney stone disease. Urological Research, 38(6), 453–457.
13. Scherer, K., Braig, E., Willer, K., Willner, M., Fingerle, A. A., Chabior, M., Herzen, J., Eiber, M., Haller, B., Straub, M., Schneider, H., & Pfeiffer, F. (2015). Non-invasive differentiation of kidney stone types using X-ray dark-field radiography. Scientific Reports, 5(1), 1–7.
14. Kahani, M., Tabrizi, S. H., Kamali-Asl, A., & Hashemi, S. (2020). A novel approach to classify urinary stones using dual-energy kidney, ureter and bladder (DEKUB) X-ray imaging. Applied Radiation and Isotopes, 164, 109267.
15. Thongprayoon, C., Krambeck, A. E., & Rule, A. D. (2020). Determining the true burden of kidney stone disease. Nature Reviews Nephrology, 16(12), 736–746.
16. Duan, X., Qu, M., Wang, J., Trevathan, J., Vrtiska, T., Williams, J. C., Krambeck, A., Lieske, J., & McCollough, C. (2013). Differentiation of calcium oxalate monohydrate and calcium oxalate dihydrate stones using quantitative morphological information from micro-computerized and clinical computerized tomography. The Journal of Urology, 189(6), 2350–2356.
17. Singh, P., Enders, F. T., Vaughan, L. E., Bergstralh, E. J., Knoedler, J. J., Krambeck, A. E., Lieske, J. C., & Rule, A. D. (2015, October). Stone composition among first-time symptomatic kidney stone formers in the community. Mayo Clinic Proceedings, 90(10), 1356–1365. Elsevier.
18. Motamedinia, P., Okhunov, Z., Okeke, Z., & Smith, A. D. (2015). Contemporary assessment of renal stone complexity using cross-sectional imaging. Current Urology Reports, 16(4), 1–7.
19. Caroli, A., Remuzzi, A., & Lerman, L. O. (2021). Basic principles and new advances in kidney imaging. Kidney International, 100(5), 1001–1011.
20. D’costa, M. R., Haley, W. E., Mara, K. C., Enders, F. T., Vrtiska, T. J., Pais, V. M., Jacobsen, S. J., McCollough, C. H., Lieske, J. C., & Rule, A. D. (2019). Symptomatic and radiographic manifestations of kidney stone recurrence and their prediction by risk factors: A prospective cohort study. Journal of the American Society of Nephrology, 30(7), 1251–1260.
21. Nestler, T., Haneder, S., & Hokamp, N. G. (2019). Modern imaging techniques in urinary stone disease. Current Opinion in Urology, 29(2), 81–88.
22. Cui, X., Zhao, Z., Zhang, G., Chen, S., Zhao, Y., & Lu, J. (2018). Analysis and classification of kidney stones based on Raman spectroscopy. Biomedical Optics Express, 9(9), 4175–4183.
23. Jendeberg, J., Thunberg, P., & Lidén, M. (2021). Differentiation of distal ureteral stones and pelvic phleboliths using a convolutional neural network. Urolithiasis, 49(1), 41–49.
24. Wang, R. C., Rodriguez, R. M., Moghadassi, M., Noble, V., Bailitz, J., Mallin, M., Carbo, J., Kang, T. L., Chu, P., Shiboski, S., & Smith-Bindman, R. (2016). External validation of the STONE score, a clinical prediction rule for ureteral stone: An observational multi-institutional study. Annals of Emergency Medicine, 67(4), 423–432.
25. Schütz, J., Miernik, A., Brandenburg, A., & Schlager, D. (2019). Experimental evaluation of human kidney stone spectra for intraoperative stone-tissue-instrument analysis using autofluorescence. The Journal of Urology, 201(1), 182–188.
26. Kavoussi, N. L., Floyd, C., Abraham, A., Sui, W., Bejan, C., Capra, J. A., & Hsi, R. (2022). Machine learning models to predict 24 hour urinary abnormalities for kidney stone disease. Urology, 169, 52–57.
27. Han, H., Mutter, W. P., & Nasser, S. (Eds.). (2019). Nutritional and medical management of kidney stones. Springer International Publishing.
28. Williams, J. C., Gambaro, G., Rodgers, A., Asplin, J., Bonny, O., Costa-Bauzá, A., Ferraro, P. M., Fogazzi, G., Fuster, D. G., Goldfarb, B. S., Grases, F., & Robertson, W. G. (2021). Urine and stone analysis for the investigation of the renal stone former: A consensus conference. Urolithiasis, 49(1), 1–16.
29. Deng, Y., Yang, B. R., Luo, J. W., Du, G. X., & Luo, L. P. (2020). DTI-based radiomics signature for the detection of early diabetic kidney damage. Abdominal Radiology, 45(8), 2526–2531.
30. Singla, R., Ringstrom, C., Hu, G., Lessoway, V., Reid, J., Nguan, C., & Rohling, R. (2022). The open kidney ultrasound data set. arXiv preprint arXiv:2206.06657.
31. Kuo, C. C., Chang, C. M., Liu, K. T., Lin, W. K., Chiang, H. Y., Chung, C. W., Ho, M. R., Sun, P. R., Yang, R. L., & Chen, K. T. (2019). Automation of the kidney function prediction and classification through ultrasound-based kidney imaging using deep learning. NPJ Digital Medicine, 2(1), 29.
32. Sudharson, S., & Kokil, P. (2020). An ensemble of deep neural networks for kidney ultrasound image classification. Computer Methods and Programs in Biomedicine, 197, 105709.
33. Verma, J., Nath, M., Tripathi, P., & Saini, K. K. (2017). Analysis and identification of kidney stone using Kth nearest neighbour (KNN) and support vector machine (SVM) classification techniques. Pattern Recognition and Image Analysis, 27, 574–580.
34. Selvarani, S., & Rajendran, P. (2019). Detection of renal calculi in ultrasound image using meta-heuristic support vector machine. Journal of Medical Systems, 43(9), 300.
35. Kokil, P., & Sudharson, S. (2019). Automatic detection of renal abnormalities by Off-the-shelf CNN features. IETE Journal of Education, 60(1), 14–23.
Ambient Healthcare: A New Paradigm in Medical Zone Sreemoyee Samanta, Adrija Mitra, Sushruta Mishra, and Naga Srinivasu Parvathaneni
Abstract Ambient intelligence (AmI), a new paradigm in artificial intelligence, seeks to enhance people’s abilities by utilizing sensitive digital surroundings that are responsive to human requirements, routines, activities, and emotions. Through perceptive communications that are unobtrusive and proactive, people in future society will be able to interact with one another and with machines in novel ways. Such cutting-edge interaction paradigms make AmI technology an appropriate choice for developing a variety of workable solutions, notably in the field of healthcare. To provide the necessary context for the scientific community, this survey explores the development of AmI approaches in the healthcare industry. We discuss the infrastructure and technology required to implement AmI’s vision, such as smart environments and wearable medical technologies. We outline the most recent artificial intelligence (AI) development approaches used to produce AmI systems in the healthcare area, which make use of a variety of learning strategies (to learn from user interaction), reasoning techniques (to reason about the aims and objectives of users), and planning approaches (for organising activities and interactions). We also go over the possible advantages of AmI technology for those with various long-term physical, mental, or emotional conditions. In order to determine new avenues for future studies, we showcase some of the successful case studies in the field and examine current and upcoming difficulties.
Keywords Ambient Intelligence (AmI) · Healthcare · Sensor networks · Smart environments · Communication · Healthcare sectors · Emergency detection
S. Samanta · A. Mitra · S. Mishra (B) Kalinga Institute of Industrial Technology, Deemed to Be University, Bhubaneswar, India e-mail: [email protected] N. S. Parvathaneni Department of Computer Science and Engineering, Prasad V Potluri Siddhartha Institute of Technology, Vijayawada 520007, India © The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 P. Barsocchi et al. (eds.), Enabling Person-Centric Healthcare using Ambient Assistive Technology, Studies in Computational Intelligence 1108, https://doi.org/10.1007/978-3-031-38281-9_11
1 Introduction
Traditional healthcare and services are typically provided in hospitals or medical facilities. Chronic diseases (CDs) are increasingly the main causes of death, and they are the leading cause of death in EU nations [1]. According to the United States National Center for Health Statistics, major CDs such as heart disease, cerebrovascular disease, and diabetes caused 35.6% of all deaths in the US in 2005 [2]. In Sudan, according to WHO data released in April 2011 [3], coronary heart disease (CHD) accounted for 10.67% of all deaths. The need to provide ongoing care and assistance to aged people, disabled people, and CD patients, together with the desire to deliver such care more efficiently, has become a significant challenge for the scientific community [4]. Additionally, individuals in the post-surgery stage require ongoing monitoring of their health, particularly of their vital signs, until their health status stabilizes. Patients and their families must work with their doctor and other medical experts to learn about their conditions. Up until now, these people’s health conditions have typically been monitored in medical facilities or hospital settings. As a result, vital sign measurements and the associated diagnoses are made in carefully regulated settings. However, this procedure is costly, inefficient, and burdensome for individuals who need routine examinations, because patients must visit the hospital frequently, sometimes daily, or, worse, require a prolonged stay. Routine medical exams and healthcare services must be moved from the hospital setting to the home environment in order to free up hospital beds and other scarce resources for patients with urgent needs. Ambient intelligence (AmI) for healthcare monitoring and tailored healthcare is one potential approach to providing effective medical services that might significantly lower healthcare expenses.
AmI [5], a newly developed multidisciplinary field based on ubiquitous computing, has an impact on the creation of protocols, communications, systems, devices, etc. [6]. AmI offers creative ways for people to interact with technology, personalising it for each user’s needs and the environment they live in. In an effort to match technology to human needs, AmI [7] suggests a new method of interaction between humans and technology in which the latter is adapted to individuals and their setting. It accomplishes this by utilising components of ubiquitous computing that communicate with one another invisibly. The context contains details about the users and the surrounding area. The data could include numerous factors, such as the condition of the building (e.g., temperature or light) and vital signs (e.g., heartbeat or blood pressure). Wireless sensor networks (WSNs) are used to collect the data necessary for AmI settings. Some potential WSN technologies are Bluetooth, ZigBee, and Radio Frequency Identification (RFID). Simply gathering context-related data is not enough, however. Because the quality of decision-making depends on the quality of the information, the data must also be processed, analyzed, and reasoned about. In this regard, many architectures and concepts have been applied in the creation of ambient systems. The overall manuscript is structured as follows. Section 1 highlights the emergence of ambient intelligence in many diverse domains. Section 2 outlines the importance of
ambient healthcare along with its challenges and use cases. Section 3 discusses the prime motivation and objective of the analysis. Section 4 discusses some vital existing models on ambient healthcare. Section 5 presents some critical real time applications related to ambient healthcare. Section 6 highlights a case study illustrating the use of ambient healthcare in emergency scenarios. Section 7 discusses the future scope of the topic. Finally, Sect. 8 concludes the study. The main motives of this research work are:
• It makes use of situational and contextual data.
• Without the individual’s conscious intervention, it can predict their requirements.
• It is incorporated and embedded in our daily environments.
• It blends unobtrusively into the backdrop of our daily lives.
2 Ambient Healthcare
The European Commission’s Information Society Technologies Advisory Group (ISTAG) proposed AmI initially [8, 9]. AmI has been characterized in a variety of ways by researchers, as shown below. The characteristics that are anticipated in AmI technologies (sensitive, responsive, adaptive, transparent, omnipresent, and intelligent) are highlighted in these definitions.
• AmI is a growing multidisciplinary field based on ubiquitous computing that influences the design of protocols, communications, systems, devices, etc. [5].
• AmI provides new ways to connect with technology, modifying it to meet people’s needs and the environment in which they live [6].
• AmI uses ubiquitous computing components that connect with one another in order to adapt technology to the requirements of people [7].
• A new technology that will gradually increase the sensitivity and responsiveness of our surroundings to our demands [8].
• In a possible future, sentient items will surround us, and the environment will be able to detect our presence and respond to it covertly [8].
• AmI suggests that there is intelligence everywhere around us [11].
• The existence of a digital world that is perceptive to human presence and reacting to it [12].
• For distributed, non-intrusive, and intelligent software systems, a new study topic has emerged [13].
• A non-intrusive digital setting that aids people in going about their regular tasks [14].
2.1 Ambient Intelligence Based Health Care Challenges
Most industrialized nations today have serious problems with the availability, price, and quality of numerous healthcare and wellness services. These issues will be further
exacerbated as the population ages, as seen in the rising prevalence of chronic diseases and the high demand for various healthcare services. Because of this, industrialized nations may not be able to sustain the high cost of the healthcare sector. As a result, they must identify and develop policies and methods to make better use of their limited economic resources. The goal of sustainable healthcare systems raises many scientific and technological issues. If these issues can be overcome, it will benefit our global society and economy. It will be especially advantageous to use information and communication technology to establish autonomous and proactive healthcare services. In the past decades, consumer-driven healthcare, in conjunction with web-based platforms and electronic health records, has led to an assortment of better healthcare options. We have also seen the advent of numerous smartphone apps in recent years that are easily accessible for tracking physiological conditions. Although these solutions are a crucial step toward customized medicine, they frequently have scalability, security, and privacy problems. Furthermore, rather than offering a continuous picture of a person’s general health across many years, such systems can only offer a snapshot of physiological parameters. Thanks to recent developments in sensor network research, we are moving closer to the development of novel, cost-effective health monitoring systems that are integrated into houses and other living spaces. AmI systems, in particular, have the potential to significantly improve the healthcare sector. For instance, ambient intelligence technology can be used to track the well-being of senior people or those suffering from long-term conditions, as well as to offer assistance to those with physical or mental impairments. It can be used to develop services that rely on persuasion to motivate people to lead healthier lives. Finally, by providing state-of-the-art monitoring and communication capabilities, it can aid the medical team. These systems will provide health monitoring in an unobtrusive and transparent way.
2.2 Use Cases Driven by Ambient Intelligence
Although AmI should not be confused with any one of them specifically, Fig. 1 demonstrates that it incorporates aspects from a number of computer science domains. Networks, sensors, human computer interfaces (HCI), ubiquitous computing, and artificial intelligence (AI) do not conceptually encapsulate the full breadth of AmI, despite the fact that they are all significant and interrelated. AmI assembles all of these resources to provide flexible and intelligent services to users acting in their environments.
Fig. 1 Areas related with Ambient Intelligence
3 Motivation and Purpose
Ambient intelligence may support professional judgement, lessen employee stress, cut expenses, and increase patient safety. By relieving healthcare professionals of repetitive and documentation tasks, ambient intelligence improves clinical workflow, which raises the quality of care and overall productivity. For many people, the expense of healthcare is a major barrier to access. Healthcare will continue to be pricey unless there is a way to reduce the employee fatigue and operational risks that raise expenses. A rise in demand for products and services targeted towards those with chronic illnesses is anticipated. People with ailments including diabetes, arthritis, senile dementia, Alzheimer’s, and heart-related disorders, among many others, need continual healthcare monitoring as well as assistance. The global COVID-19 situation showed how IoT can enable healthcare: when there were fewer healthcare professionals on staff and the number of patients was rising quickly, it offered individuals attention [4]. COVID-19 caused chaos in society because it was a contagious illness and because people were afraid to approach COVID-19 patients. The IoT technologies employed in this scenario monitored the patient’s blood pressure, heart rate, oximeter reading, level of exercise, and several medical conditions. The main effect of IoT on the healthcare sector is cost reduction. It allows the patient to monitor their vitals in real time and greatly reduces the number of unnecessary doctor visits. Additionally, it gives doctors the ability to base their choices
entirely on the available data. However, it has been said that every advantage has a corresponding disadvantage. The doctor may use the real-time vitals remotely, which saves money on trips, speeds up therapy, and allows for constant contact. Additionally, it minimizes the usage of hospital resources. By observing recovery-related behaviours, reducing unintentional physician errors, supporting the ageing population, and monitoring patients with chronic diseases, ambient intelligence has the ability to shed light on the healthcare delivery process. This paper conducts a meta-analysis of ambient intelligence in healthcare, examining how the market’s influence on healthcare and professional culture has affected patients’ timely access to high-quality care and physicians’ working circumstances. Ambient intelligence is a technology that can be used in both residential and commercial settings to analyse conditions, communicate with other devices, carry out management tasks, and send data to the outside world. The goal of ambient intelligence (AmI), a new paradigm in information technology, is to improve people’s skills by utilising digital environments that are observant, flexible, and responsive to human needs, habits, gestures, and emotions. With the patient’s consent, it can be used in the healthcare industry to capture patient health statistics and update the patient’s Electronic Medical Record (EMR) to provide a better and more accurate narrative. By evaluating patient data, including prior treatments and the patient’s allergic reactions, it can help healthcare professionals such as doctors and nurses provide quality care. In nations with a large population of senior citizens, ambient intelligence assists the elderly through Ambient Assisted Living (AAL) technology, which remotely monitors their health and enables them to live independently. Overall, using this technology will improve patient satisfaction, physician satisfaction, and care quality.
4 Analysis of Relevant Models on Ambient Healthcare
This section presents and details the technologies and supporting infrastructure utilised by AmI systems in the context of the healthcare sector. We discuss current advances in sensor technologies, such as epidermal electronics and MEMS sensors, as well as Body Area Networks (BANs) and Dense/Mesh Sensor Networks in Smart Homes.
4.1 BANs: Body Area Networks
Body Area Networks (BANs) have been made possible by expanding wireless connectivity and the continuing miniaturization of electronic components [8]. As part of a BAN, several sensors may be positioned on the body, attached to clothing, or implanted beneath the skin [9]. By continuously tracking health indicators like heart rate, body
temperature, physical activity, ECG, EEG, and EMG (electromyography), this new communication technique offers a number of innovative, practical, and creative applications for improving human health and quality of life [10]. BANs offer a technological infrastructure for remotely streaming sensor data to a medical professional’s location for a real-time diagnosis, to a medical database for record keeping, or to a corresponding piece of technology that can proactively and autonomously issue an emergency alert or intelligently manage this information in order to take suitable actions and enhance human life [8]. The use of wireless BANs in healthcare applications has a number of advantages, chief among them being cost- and communication-effectiveness. Physiological signals from body sensors can be processed effectively to produce trustworthy and accurate physiological estimates. These sensors also have extremely low power consumption, which extends the life of their batteries. Additionally, sensors will become cheaper to build as the demand for body sensors rises in the electronics sector, particularly for medical uses. The two most important advantages of BANs are their scalability and their interoperability with other systems. BANs may link to the internet and to other wireless technologies such as Bluetooth Low Energy (BLE, formerly known as WiBree), wireless local area networks (WLANs), cellular networks, wireless sensor networks (WSNs), and radio frequency identification (RFID) tags [15, 16]. As a result of all of these considerable benefits, the area of ubiquitous computing is opening up new market opportunities for high-end consumer gadgets. The communication architecture of BANs is depicted in Fig. 2 as three tiers: tier-1 intra-BAN, tier-2 inter-BAN, and tier-3 beyond-BAN communications.
These architectural layers, which address many communication-related subjects spanning from low-level to high-level design considerations, enable the development of a component-based, efficient BAN system for a number of applications. The phrase “intra-BAN communications” refers to radio transmissions that take place within around 2 m of a person’s body. Figure 2 illustrates a division of these transmissions into two categories: those between body sensors and those involving portable Personal Server (PS) devices like PDAs. The architecture of intra-BAN communications is essential given how closely body sensors and BANs are coupled. It is difficult to create an energy-efficient MAC protocol with QoS provisioning, since most current body sensor devices are battery-operated and have low data rates. “Inter-BAN communications” enable the body sensors to communicate with one or more access points (APs). The APs can be deployed as a part of the infrastructure or placed strategically in a dynamic environment to handle emergency scenarios. Accordingly, tier-2 networks (Fig. 2) are used to link BANs with the networks that consumers commonly use, such as the Internet and mobile networks. We categorise the inter-BAN communication paradigms into two classes: infrastructure-based architecture and ad hoc-based architecture. In contrast to infrastructure-based designs, which offer more bandwidth with centralised control and flexibility, ad hoc-based solutions enable rapid deployment in response
S. Samanta et al.
Fig. 2 A three-tier architecture of the BAN communication system
to changing conditions, such as emergency medical treatment or a disaster site (e.g., AID-N systems). The majority of BAN applications employ infrastructure-based inter-BAN communications, which assume a site of limited extent, such as a hospital waiting room, a home, or a workplace. Ad hoc networks have the drawback of lacking centralised administration and security control, whereas infrastructure-based networks provide both [17]. Owing to this centralised structure, the AP also serves as the database server in several applications, such as SMART and CareNet. Depending on the user’s priority for the service and/or the doctor’s availability, the doctor may need access to the user’s information. At the same time, automatic notifications based on this information may be sent to the person’s loved ones through a number of communication channels. The design of beyond-BAN communications is application-specific and should take into account the needs of services tailored to particular users. For instance, if any irregularities are discovered in the most recent body signal sent to the database, the patient or the doctor may be notified via email or short message service (SMS). In practice, a doctor may be able to establish a diagnosis over the Internet by combining video conversations with the patient with physiological data stored in a database or received from a BAN worn by the patient.
Ambient Healthcare: A New Paradigm in Medical Zone
4.2 Dense/Mesh Sensor Networks for Smart Living Environments
Sensors may be incorporated into our surroundings in order to create intelligent, proactive living spaces that can assist and improve daily life, particularly for the elderly or people with cognitive or motor impairments. Wireless Mesh Sensor Networks (WMSNs) can be used to build unobtrusive, networked, adaptable, dynamic, and intelligent environments, especially with processors and sensors embedded in everyday objects (clothing, home appliances, furniture, etc.) [18]. Sensors embedded in such settings are commonly referred to as “ambient sensors” [19]. To enhance residents’ comfort and quality of life, the ambient sensors gather a variety of data in order to recognise residents’ activities and anticipate their needs. WMSNs are built on a mesh networking topology, in which each node relays data for other nodes in addition to capturing and disseminating its own. In other words, the sensors must cooperate for data to propagate through the network. The primary advantages of WMSNs are their capacity for dynamic self-organization and self-configuration, with the network independently forming and maintaining mesh connections among sensors [20]. WMSNs are especially well suited to complex and dynamic contexts such as living environments, since they do not require centralised access points to manage wireless communication [21]. As shown in Fig. 3, the overall WMSN architecture comprises three kinds of wireless network components: (i) network gateways; (ii) access points; and (iii) mobile and stationary nodes. These components are frequently referred to as “mesh nodes” (MNs). Each node in a WMSN acts as both a client and a router. Mesh networks relay a data
Fig. 3 Wireless sensor network
request until a network connection is found, rather than requiring a direct Internet connection as WiFi hotspots do. WMSN architectures fall into three categories: infrastructure/backbone WMSNs, client WMSNs, and hybrid WMSNs. In infrastructure WMSNs, mesh routers form an infrastructure for clients. In client WMSNs, the client nodes themselves make up the network and carry out routing and configuration. Hybrid WMSNs are a fusion of the first two, in which mesh clients can perform mesh functions with other clients as well as access the network. With current WMSN networking technology, smart environments can provide highly dependable and power-efficient solutions. A building’s existing sensor devices can simply be connected to low-profile mesh modules to create seamless networks, which gives WMSNs great flexibility and scalability. In intelligent environments, WMSNs usually provide the following features:
• Faster retrofitting: One of the main reasons for higher costs and longer turnaround times in office space conversion is the labor-intensive relocation of utility wiring to fit the new wall layout. With WMSNs, system designers can relocate sensors quickly and easily, without labor-intensive, costly, or disruptive rewiring.
• Maintainability: Low maintenance costs are a top priority when building a sensor network. The self-configuring and self-healing properties of WMSNs, together with their low power consumption, make them an effective answer to the maintenance problem.
• Smooth updates and modifications: Thanks to the convergence and collaboration of major communication standards bodies such as the ZigBee Alliance and the ASHRAE BACnet Committee, the transition to a wireless solution is not an all-or-nothing decision. WMSNs can therefore be introduced gradually, one room, section, floor, or building at a time.
• Flexibility: System designers can place wireless controllers almost anywhere and deploy a WMSN without worrying about wiring. This saves time and money and yields systems that are easily reconfigured to provide flexible workspaces or to upgrade existing network infrastructure less obtrusively.
Both the HomeMesh and Siemens APOGEE projects have provided examples of WMSNs for intelligent living environments. Both demonstrate that, building on the characteristics of WMSNs, it will be possible to develop living environments specifically tailored to supporting the capacities of the elderly or people with disabilities in order to improve their quality of life.
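The relaying and self-healing behaviour of mesh nodes described in this section can be illustrated with a small sketch. The Python fragment below is a minimal model under stated assumptions, not the routing protocol of HomeMesh, APOGEE, or any other real WMSN: the node names, the `links` topology, and the `find_route` helper are hypothetical, and a breadth-first search stands in for whatever routing a deployed mesh would actually use.

```python
from collections import deque


def find_route(links, source, gateway, failed=frozenset()):
    """Return a shortest hop-by-hop route from source to gateway, or None.

    links:  dict mapping each node to the set of neighbours in radio range
    failed: nodes that are currently down and therefore cannot relay
    """
    if source in failed or gateway in failed:
        return None
    previous = {source: None}            # node -> node it was reached from
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == gateway:
            route = []
            while node is not None:      # walk back towards the source
                route.append(node)
                node = previous[node]
            return route[::-1]
        for neighbour in sorted(links.get(node, ())):
            if neighbour not in previous and neighbour not in failed:
                previous[neighbour] = node
                queue.append(neighbour)
    return None


# Hypothetical home topology: every node also relays for its neighbours.
links = {
    "bed_sensor": {"lamp", "door"},
    "lamp": {"bed_sensor", "fridge"},
    "door": {"bed_sensor", "fridge"},
    "fridge": {"lamp", "door", "gateway"},
    "gateway": {"fridge"},
}

normal = find_route(links, "bed_sensor", "gateway")
healed = find_route(links, "bed_sensor", "gateway", failed={"door"})
cut_off = find_route(links, "bed_sensor", "gateway", failed={"fridge"})
```

In this toy topology the reading is rerouted through the lamp when the door node fails, which is the self-healing property in miniature. When the only node next to the gateway fails, no route remains, which suggests why redundant links near the gateway matter in practice.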
5 Applications of Ambient Intelligence in Healthcare
5.1 Continuous Health Monitoring
Numerous noninvasive sensors have been created during the last ten years to measure and track a wide range of physiological indicators, including ECG, EEG, electrodermal activity (EDA), respiration, and even biochemical processes such as wound healing. Some of these sensors take the form of wearable devices such as wristbands, while others are integrated into textiles, known as smart fabrics or e-textiles. Although the majority of these sensors allow noninvasive monitoring of physiological signs, some measurements still require more obtrusive devices (e.g., measuring EEG requires the use of electrodes). Whatever their form, the sensors provide continuous monitoring and detection of abnormal situations, empowering people with chronic conditions to take charge of their health. Continuous monitoring is essentially impossible in traditional health care systems, where measurements are typically taken only during occasional doctor appointments. Healthy people will also be able to use these sensors to check their health, track their progress, and make the required lifestyle changes.
5.2 Continuous Behavioral Monitoring
Behavioural monitoring is another potential monitoring method in addition to physiological monitoring. It can be useful both for monitoring people with mental disorders and in assisted living facilities. Such technologies can assess cognitive and mental health continuously and in a naturalistic way, and they can lessen carers’ stress by offering automated support. In some cases, a single activity is tracked. For example, Nambu et al. monitored TV viewing to identify health issues. Most research projects track only a subset of daily activities. For instance, the CASAS project keeps track of a patient’s daily tasks to make sure they are completed on time and in full. The IMMED project uses a wearable camera to monitor instrumented activities of daily living (IADL) in dementia patients as their motor or cognitive abilities deteriorate. Other studies, notably in senior care facilities, have focused on identifying social interaction. Any change in behaviour may be a sign of deteriorating cognitive or physical capabilities. Early signs of dementia include changes in movement patterns, walking speed, number of outings, and sleeping patterns.
5.3 Assisted Living
With the use of home automation and AmI technology, people with disabilities can live more independently, receive ongoing cognitive and physical monitoring, and, if necessary, receive real-time assistance. These technologies are particularly beneficial for older persons who are experiencing physical and cognitive decline. Most older persons take a variety of drugs and, because of cognitive decline, frequently forget medication dosages and schedules. Using the contextual information gathered from various sensors, medication reminders can be delivered in a context-aware and adaptable way, and care personnel can be contacted if non-compliance is detected. For instance, John will not be reminded about his medication while he is on the phone or watching his favourite TV show, but he will be reminded as soon as he has finished breakfast. If he skips more than a certain number of doses, which depends on the drug, his physician will be alerted automatically. Despite significant advances, context awareness in current medication management systems is still far from complete.
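The reminder logic in the example of John can be written down as a few explicit rules. The sketch below is only an illustration of that scenario, assuming the sensors have already been reduced to three boolean context flags; the `Context` class, the function names, and the missed-dose limit are hypothetical and do not come from any existing medication management system.

```python
from dataclasses import dataclass


@dataclass
class Context:
    """Snapshot of what the (hypothetical) home sensors currently report."""
    on_phone: bool
    watching_tv: bool
    finished_breakfast: bool


def should_remind(dose_due, ctx):
    """Remind only when a dose is due and the moment is suitable."""
    if not dose_due:
        return False
    if ctx.on_phone or ctx.watching_tv:
        return False                 # do not interrupt; try again later
    return ctx.finished_breakfast


def should_alert_physician(missed_doses, max_missed):
    """Escalate once more than max_missed doses have been skipped.

    max_missed is drug-specific and would be set by the physician.
    """
    return missed_doses > max_missed


busy = Context(on_phone=True, watching_tv=False, finished_breakfast=True)
free = Context(on_phone=False, watching_tv=False, finished_breakfast=True)

remind_while_busy = should_remind(True, busy)
remind_when_free = should_remind(True, free)
escalate = should_alert_physician(missed_doses=3, max_missed=2)
```

A real system would infer these flags from noisy sensor data rather than receive them as clean booleans, which is one reason the text describes context awareness as still far from complete.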
5.4 Emotional Well-Being
Recent discoveries in neuroscience and psychology have demonstrated the significance of emotions in many facets of our lives, particularly with regard to our health and well-being. Studies indicate that negative emotions can adversely affect the immune system. People convey their emotions through speech, body language, and facial expressions, as well as through internal physiological changes in, for example, blood pressure, heart rate, or respiration. AmI sensor-based infrastructures could be a useful tool for recognising and managing emotions as well as for enhancing well-being. The Wearable Acoustic Monitor (WAM), developed by McNaney et al., supports several aspects of social and emotional well-being by estimating levels of social engagement and vocal emotional expression. By recognising vocal features such as amplitude, pitch, rate of speech, and pause length, it can monitor and assess the wearer’s voice to give an indication of their emotional state at any given time. The ability to reflect on situations or experiences that were particularly gratifying or distressing may affect how a person acts in the future. AffectAura is another concept that combines ambient sensors with AmI to evaluate emotional well-being. Using data from a webcam, Kinect sensor, microphone, electrodermal activity sensor, GPS, file activity sensor, and calendar scraper, the system continuously predicts the user’s valence, arousal, and engagement. Users were able to use AffectAura cues to recall specific events, or the emotional tone associated with them, even after they had forgotten them.
5.5 Smart Hospitals
AmI technology can also assist other parties, such as nurses, physicians, and other healthcare professionals, especially by enhancing communication among them. The iHospital project of Sanchez et al. [22] provides context-aware communication based on activity recognition. Contextual information, such as location, time, the roles of the persons present, and RFID-tagged objects, is captured and used to support communication and decision-making. Middleware for AmI-based healthcare has also been the subject of some initiatives. SALSA, developed by Rodriguez et al. [23], is an agent-based middleware intended to make it simpler to meet the particular requirements of hospital staff and patients. SALSA takes into account the high mobility of doctors and the resulting distributed pattern of access: a doctor must communicate with staff members scattered throughout the hospital, obtain patient clinical data, and use medical equipment located throughout the building. To locate people, Rodriguez et al. build a signal propagation model and estimate the distance between mobile devices and access points from the radio frequency signal strength.
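Estimating distance from radio signal strength, as mentioned at the end of this section, can be sketched with a standard propagation model. The chapter does not say which model Rodriguez et al. use, so the fragment below assumes the widely used log-distance path loss model, in which the received signal strength falls by 10·n dB for every tenfold increase in distance. The calibration values (−40 dBm at 1 m, exponent n = 2.5) and the access point names are hypothetical and would have to be measured on site.

```python
def estimate_distance_m(rssi_dbm, rssi_at_1m_dbm=-40.0, path_loss_exponent=2.5):
    """Invert the log-distance model  RSSI(d) = RSSI(1 m) - 10 * n * log10(d)."""
    return 10 ** ((rssi_at_1m_dbm - rssi_dbm) / (10 * path_loss_exponent))


def nearest_access_point(readings, **model):
    """Return (access point, estimated distance) for the closest access point.

    readings: dict mapping an access point name to the RSSI it reports (dBm)
    model:    optional calibration overrides passed to estimate_distance_m
    """
    distances = {ap: estimate_distance_m(rssi, **model)
                 for ap, rssi in readings.items()}
    ap = min(distances, key=distances.get)
    return ap, distances[ap]


# Hypothetical readings for one doctor's mobile device.
readings = {"ward_a": -65.0, "ward_b": -52.5, "corridor": -77.5}
closest, distance = nearest_access_point(readings)
```

Indoor signal strength fluctuates considerably with walls, people, and equipment, so estimates of this kind are usually treated as coarse, room-level hints rather than precise positions.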
5.6 Reducing ICU Stay
ICU expenses in the United States are $108 billion annually, or 16% of all hospital expenses. Additionally, illnesses acquired in the ICU have the potential to double the annual death rate. The continuous and detailed information provided by ambient sensors is valuable for supporting early patient mobilization in ICUs. These sensors can pick up on interactions with the physical surroundings or assistance from other people. Nurses can identify dementia in patients early by using AmI insights. AmI may also help to clarify the relationship between patient mobilization and recovery. For instance, researchers using such sensors observed significantly fewer head movements amongst delirious ICU patients.
5.7 Monitoring for Emergency Detection
Various initiatives to monitor emergencies have also been undertaken. In the UK, British Telecom (BT) and Liverpool City Council developed a telecare technology project that monitors residents using a variety of sensors, including passive infrared (PIR) sensors [23]. If a hazard is detected, the system asks the occupants whether they are all right and, if they are not, alerts the selected personnel. Fall detection is a crucial component of emergency response and is especially helpful to the elderly, since falls are a significant cause of illness and mortality in this population. Wearable devices, environmental sensors, and cameras are all examples of the technology used in
fall detection systems [24]. Wearable fall detection systems monitor posture, acceleration, and orientation using gyroscope and accelerometer sensors [25, 26]. Ambient fall detection systems rely on environmental sensors, such as pressure and passive infrared (PIR) sensors, and also use techniques such as floor vibration monitoring and ambient audio analysis to detect potential falls [27, 28]. Finally, vision-based fall detection systems extract video features such as 3D motion, shape, and inactivity [29, 30]. Fall prevention solutions are also available. One example is the smart cane created by Wu et al., which classifies cane use and walking patterns and alerts the elderly user when there is a high risk of falling [31]. The possibility of combining and fusing data from different sources, such as physiological sensors with electronic health records (EHR) or daily activity data, should also be recognised [32]. By employing continuous monitoring to spot diseases early, the healthcare system may shift from treating individuals’ ailments to preventing them. As a result, fewer people will need care at a facility.
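To make the wearable approach more concrete, the following sketch shows one simple way accelerometer data could be screened for a fall. It is not the method of any of the cited systems: it assumes a common threshold heuristic in which a fall appears as a sharp impact followed by a period of stillness, and the thresholds, window length, and sample data are hypothetical values chosen for illustration only.

```python
import math


def magnitude_g(sample):
    """Length of a 3-axis accelerometer sample, in units of g."""
    ax, ay, az = sample
    return math.sqrt(ax * ax + ay * ay + az * az)


def detect_fall(samples, impact_g=2.5, still_tolerance_g=0.2, still_samples=5):
    """Return the index of the impact sample if a fall pattern is found, else None.

    A fall is flagged when a spike of at least impact_g is followed by
    still_samples consecutive readings within still_tolerance_g of 1 g,
    i.e. the wearer lies motionless after the impact.
    """
    mags = [magnitude_g(s) for s in samples]
    for i, m in enumerate(mags):
        if m < impact_g:
            continue
        after = mags[i + 1 : i + 1 + still_samples]
        if len(after) == still_samples and all(
            abs(a - 1.0) <= still_tolerance_g for a in after
        ):
            return i
    return None


# Hypothetical traces: ordinary walking, a fall, and a stumble with recovery.
walking = [(0.1, 0.0, 1.0), (0.3, 0.1, 1.2), (0.0, 0.2, 0.9)] * 3
fall = [(0.0, 0.0, 1.0)] * 3 + [(2.0, 1.5, 2.0)] + [(0.0, 1.0, 0.05)] * 5
stumble = [(0.0, 0.0, 1.0)] * 3 + [(2.0, 1.5, 2.0)] + [(0.5, 0.2, 1.4)] * 5

walking_result = detect_fall(walking)
fall_result = detect_fall(fall)
stumble_result = detect_fall(stumble)
```

The stillness check is what separates the fall from the stumble here. Fixed thresholds are prone to false alarms and misses, so a practical detector would typically add gyroscope orientation or a trained classifier on top of such a heuristic.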
5.8 Therapy and Rehabilitation
According to the World Health Organization’s (WHO) Disability and Rehabilitation Team, 1.5% of the world’s population requires rehabilitation services [33]. However, the demand for rehabilitation cannot be fully met by today’s medical technology and health care. In these circumstances, AmI can support innovative rehabilitation strategies that help individuals gain access to rehabilitation resources. This can be achieved by developing ad hoc rehabilitation systems based on sensor networks and other technical advances, such as robots and brain-computer interfaces (BCI). Technology based on sensor networks has the potential to have a large impact on medical care, particularly rehabilitation [34]. For instance, Jarochowski et al. [35] propose the Ubiquitous Rehabilitation Center, a system that combines a Zigbee-based wireless network with sensors that monitor patients and rehabilitation equipment. These sensors are coupled to Zigbee motes, which are in turn connected to a server application that manages the entire facility and enables rehabilitation specialists to assign prescriptions to patients. Piotrowicz et al. [36] present the requirements for a home-based cardiac tele-rehabilitation system, focusing on the elements governing a physical exercise training session, which must recognise and react properly to critical patient conditions through continuous monitoring (based on AmI technology). The health-related information acquired during the tele-rehabilitation session gives cardiologists relevant information for patient care. The rehabilitation programme proposed by Helmer et al. [37] is designed to improve the quality of life of individuals with chronic obstructive pulmonary disease (COPD). The system enables automatic monitoring of therapeutic exercises and adjusts the target load of the exercise in light of the patient’s vital data.
6 Case Study of Monitoring for Emergency Detection
Various emergency monitoring initiatives have also been launched that use sensors, including PIR sensors, to keep watch on individuals. If a risk is detected, the system asks the users whether they are safe; if they are not, it notifies the appropriate personnel. Since falls are a major source of illness and mortality among the elderly, fall detection is vital to emergency intervention and especially beneficial to them. Wearable devices, environmental sensors, and cameras are the main types of technology used in fall detection systems. Using accelerometer and gyroscope sensors, wearable fall detection systems measure posture, acceleration, and rotation. Ambient systems detect falls with the help of environmental sensors such as pressure and passive infrared (PIR) sensors, and also employ methods such as floor vibration monitoring and ambient acoustic analysis to identify probable falls. Finally, vision-based systems extract video features such as 3D motion, shape, and inactivity. Fall prevention tools are also available, such as the smart cane of Wu et al., which classifies cane use and movement patterns and informs the elderly user when there is a high risk of falling. The potential for combining and fusing data from various sources, such as sensing devices with electronic health records (EHR) or daily activity data, should also be recognised. By employing continuous monitoring to detect diseases early, the healthcare system may shift away from treating people’s ailments and toward preventing them, which reduces the need for institutional care. This chapter describes an approach for ambient intelligence (AmI) platforms that supports real-time remote monitoring and patient emergency detection. 
Its distinguishing features are (i) the integration of personalized monitoring of the patient’s health status and risk stage; (ii) intelligent alerting of the dedicated physician through on-the-fly construction of medical workflows; and (iii) dynamic adaptation of the vital-sign monitoring environment to any available device, for example a smartphone, located close to the physician, in response to new medical measurements, additional disease specifications, or infrastructure failure. The intelligence comes from the application of semantics, which enables personalized and automated emergency alerting that reaches the physician seamlessly irrespective of location, ensuring a prompt response during an emergency. To validate the AmI framework, we designed a patient monitoring and doctor alert scenario. During normal operation, the AmI monitoring environment gathers and evaluates real-time patient data and flags possible deviations from normal levels. When a particular threshold is exceeded, the responsible physician is located and alerted (Fig. 4). Any nearby devices capable of displaying vital signs are identified, and either a summary or a detailed overview of the patient’s status is shown, depending on each device’s capabilities. The clinician can then be directed to a larger display to assess the vital signs in greater detail. In short, when thresholds are exceeded, medical personnel are located and contacted, and ad hoc information about the patient’s condition is shown on any device within the physician’s reach.
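The scenario above combines two decisions: whether a patient’s readings have left their personal limits, and what each device near the physician should display. The sketch below illustrates both with plain thresholds. It is a simplification under stated assumptions: the framework described in the text derives these decisions from semantic models, whereas the `Device` class, the function names, the limits, and the screen-size rule used here are all hypothetical, and the numeric limits are illustrative rather than clinical guidance.

```python
from dataclasses import dataclass


@dataclass
class Device:
    name: str
    distance_m: float        # distance from the physician
    screen_inches: float


def out_of_range(vitals, limits):
    """Return the names of vital signs outside the patient's personal limits."""
    return sorted(
        name for name, value in vitals.items()
        if not (limits[name][0] <= value <= limits[name][1])
    )


def plan_displays(devices, reach_m=10.0, detailed_from_inches=10.0):
    """Choose a view for every device within the physician's reach.

    Small screens get a summary; larger ones get the detailed overview.
    """
    plan = {}
    for d in devices:
        if d.distance_m <= reach_m:
            big = d.screen_inches >= detailed_from_inches
            plan[d.name] = "detailed" if big else "summary"
    return plan


# Hypothetical per-patient limits, current readings, and nearby devices.
limits = {"heart_rate": (50, 110), "spo2": (92, 100)}
vitals = {"heart_rate": 128, "spo2": 95}
devices = [
    Device("smartphone", 0.5, 6.1),
    Device("wall_display", 8.0, 42.0),
    Device("lobby_tv", 60.0, 55.0),
]

alerts = out_of_range(vitals, limits)
plan = plan_displays(devices) if alerts else {}
```

Keeping the limits per patient is what makes the monitoring personalized, and re-running `plan_displays` as the physician moves or a device fails corresponds to the dynamic adaptation described in the text.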
Fig. 4 Identification and notification of the relevant physician in an emergency
7 Future Challenges of Ambient Healthcare
7.1 Artificial Intelligence
Since the advent of modern computing in the 1950s, both scientists and medical professionals have been fascinated by the promise of artificial intelligence (AI) approaches in applications for medicine and healthcare. Indeed, traditional AI approaches such as machine learning, expert systems, and knowledge representation techniques have been used heavily to build and implement medical applications in fields such as diagnosis, prognosis, and medical training. The most popular type of AI in medicine is the expert or knowledge-based system. Such systems usually hold medical knowledge specific to a narrowly focused task, and they reach reasoned conclusions from the “facts” of a specific patient, frequently by applying a set of rules based on some form of logical inference. When there is insufficient knowledge to create a medical expert system, machine learning techniques can be used to analyse a collection of clinical cases and provide a systematic description of the clinical knowledge that characterises the clinical conditions of a particular patient or disease. Formal knowledge representation approaches (such as ontologies) have also been used for formal medical knowledge acquisition and for the implementation of tutoring programmes for new physicians or nurses. These are only a few instances of how artificial intelligence is being applied in the healthcare sector, but generally speaking, all AI-based health care systems entail the manipulation and transformation of data and information. In this scenario, methods based on pervasive or ubiquitous hardware will be highly helpful for advancing the current state of AI technology in healthcare. In fact, AmI features may make it possible for system developers to create sophisticated software architectures that can analyse the knowledge that permeates the environment and, as a result, learn distributed expert
systems that classify diseases or other health disorders based on the environmental content and yield outcomes that are superior to those obtained by traditional data mining techniques.
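The rule-based reasoning of the expert systems discussed in this section can be illustrated with a very small forward-chaining engine. This is a sketch of the general technique only, not of any deployed medical system: the `forward_chain` function is hypothetical, and the two rules are invented for illustration and carry no clinical validity.

```python
def forward_chain(facts, rules):
    """Apply if-then rules repeatedly until no new conclusion can be derived.

    facts: iterable of strings known to be true for one patient
    rules: list of (conditions, conclusion) pairs, where conditions is a
           frozenset of facts that must all hold for the conclusion to fire
    """
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            if conclusion not in known and conditions <= known:
                known.add(conclusion)
                changed = True
    return known


# Invented example rules; a real knowledge base would come from clinicians.
rules = [
    (frozenset({"fever", "cough"}), "possible_respiratory_infection"),
    (frozenset({"possible_respiratory_infection", "low_spo2"}),
     "refer_to_physician"),
]

derived = forward_chain({"fever", "cough", "low_spo2"}, rules)
unchanged = forward_chain({"fever"}, rules)
```

The second rule fires only because the first one added an intermediate conclusion, which is the chaining step. In an AmI setting the input facts would come from ambient and wearable sensors rather than being entered by hand.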
7.2 Design and Human Factors
The purpose of the next generation of AmI systems is to improve the quality of human life through the creation of ever more inventive applications for the healthcare sector or, more modestly, through the provision of high levels of comfort while wisely managing scarce resources. This benefit will be achieved principally by creating intelligent surroundings that are densely covered with various types of sensors, whose wireless connections may increase people’s exposure to radio frequency electromagnetic fields. Wireless sensors create very small electromagnetic fields, yet prolonged exposure may have negative effects on health. Some studies have reported a link between exposure to radiation from wireless devices and an increased risk of acoustic neuroma, a tumour of the nerve that links the ear to the brain. For this reason, additional issues such as sensor location, sensor mobility, and sensor radiation must be considered when designing the next generation of AmI systems. Future AmI frameworks will be able to improve human health without causing harmful side effects only if they adhere to such design principles. This challenge can be overcome only if computer scientists, architects, physicians, physicists, and telecommunication engineers work together. Government regulatory agencies must also make wise policy decisions to ensure that people have a useful and safe future.
7.3 Security and Infrastructure
The abundance of information gathered by AmI systems can be useful in a variety of ways. However, it also raises a number of security concerns. Privacy and security are already extremely complicated issues, particularly for health care systems, and the inclusion of many sensors and devices will create new difficulties. Wright et al. discuss a number of AmI security concerns and urge the creation of safeguards. They outline several dark scenarios in which AmI might lead to significant security lapses. For instance, a burglar might gather information about an elderly person living alone, including his routine of leaving the house at a specific time, and use that information to break into the smart home system, potentially putting the person’s life in jeopardy. Access restrictions and a lack of interoperability could also be problems. For instance, outdated health monitoring equipment may cause paramedics to fatally misdiagnose a patient during an accident, or incompatible equipment used in different geographic locations may prevent paramedics from accessing such information. The sensors employed in AmI systems could also be a cause for worry. For
instance, there have already been numerous security discussions involving RFID tags. The ability to identify an item using an RFID tag and trace it back to its owner raises a number of privacy concerns. Furthermore, because many AmI sensors and devices will rely on wireless protocols, and wireless transmissions are much easier to intercept than wired ones, all communication must be encrypted and secured. In order to prevent data tampering, personal monitoring systems should use distinctive biometrics or key physiological signals to verify the identity of their users (owner-aware devices).
7.4 Socio-Ethical Issues
An early European Commission (ISTAG) report identified a number of qualities that AmI systems in general need in order to achieve social acceptance, including the capacity to facilitate human contact, an orientation towards enhancing communities and cultures, the ability to inspire confidence and trust, and the ability to be controlled by ordinary people. An off switch should always be within reach. If people with special needs rely too heavily on AmI systems, they run the risk of prematurely losing their confidence and their capacity to manage their own lives. Precautions should also be taken to make sure that AmI does not benefit only the wealthy, because it could be equally beneficial to the less fortunate. Another ethical issue raised by several studies is reduced communication and patient isolation. Finally, finding the root cause of a misdiagnosis will become increasingly difficult in such a complex system, leading to ethical and legal problems.
8 Conclusion
The AmI paradigm exemplifies the computing of the future. AmI relies on a variety of computing and networking approaches to ensure the accurate gathering and interpretation of contextual information, and it provides a discreet and user-friendly interface. Because of these characteristics, AmI systems have the potential to significantly improve many aspects of our daily lives. The healthcare sector is one of the areas that promises to adopt this paradigm extensively. In this chapter, we examined the use of ambient intelligence in healthcare from a number of perspectives. We discussed the possibility of employing AmI to support people with medical conditions, including long-term illnesses, mental or physical disabilities, or situations requiring rehabilitation. We discussed today’s infrastructure and technology, such as wearable sensors, smart fabrics, smart environments, and assistive robots. We also provided a high-level review of a number of AmI techniques used in the healthcare sector, including automated decision-making, planning strategies, activity recognition, and many others.
Although we are aware that the goals set for AmI in healthcare are challenging and that many problems remain to be resolved, this area of study is gaining momentum. Researchers from several disciplines are advancing the state of the art of AmI in healthcare by addressing fundamental issues concerning human factors, the design and implementation of intelligence, security, social and ethical considerations, and other relevant areas. We are therefore confident that this coordinated effort will realize the full vision of AmI, including its applications to healthcare and human well-being. Ambient intelligence technologies make hospitals safer and help reduce healthcare costs by giving staff more freedom and convenience while they provide patient care. Under the value-based model of the Affordable Care Act in the USA, hospitals with higher quality treatment may have lower tax obligations, while those with worse ratings may face penalties. According to HealthPayerIntelligence, the Affordable Care Act rewards medical staff “for the quality of care they deliver as a component of the quadruple aim: improves patient care, promotes substantial health, decreases costs, and increases supplier satisfaction.” Healthcare systems that are aware of this opportunity stand to benefit from incorporating ambient intelligence into their patient care settings.
References
1. European Commission Eurostat: Causes of death statistics 2011. Available from: http://epp.eurostat.ec.europa.eu/statistics_explained/index.php/Causes_of_death_statistics.
2. Sahoo, P. K., Mishra, S., Panigrahi, R., Bhoi, A. K., & Barsocchi, P. (2022). An improvised deep-learning-based mask R-CNN model for laryngeal cancer detection using CT images. Sensors, 22(22), 8834.
3. Srinivasu, P. N., Sandhya, N., Jhaveri, R. H., & Raut, R. (2022). From blackbox to explainable AI in healthcare: Existing tools and case studies. Mobile Information Systems, 20. Article ID 8167821. https://doi.org/10.1155/2022/8167821
4. Mishra, S., Thakkar, H. K., Singh, P., & Sharma, G. (2022). A decisive metaheuristic attribute selector enabled combined unsupervised-supervised model for chronic disease risk assessment. Computational Intelligence and Neuroscience.
5. Weiser, M. (1993). Some computer science issues in ubiquitous computing. Special issue on computer augmented environments: Back to the real world. Communications of the ACM, 36(7), 75–84.
6. Tapia, D. I., Abraham, A., Corchado, J. M., & Alonso, R. S. (2010). Agents and ambient intelligence: Case studies. Journal of Ambient Intelligence and Humanized Computing. https://doi.org/10.1007/s12652-009-0006-2
7. Aarts, E., & Roovers, R. (2003). Embedded system design issues in ambient intelligence. In T. Basten, M. Geilen, & H. D. Groot (Eds.), Ambient intelligence: Impact on embedded system design (pp. 11–29). Kluwer.
8. Chen, M., Gonzalez, S., Vasilakos, A., Cao, H., & Leung, V. C. (2011, April). Body area networks: A survey. Mobile Networks and Applications, 16(2), 171–193. https://doi.org/10.1007/s11036-010-0260-8.
9. Latré, B., Braem, B., Moerman, I., Blondia, C., & Demeester, P. (2011, January). A survey on wireless body area networks. Wireless Networks, 17(1), 1–18.
10. Lyytinen, K., & Yoo, Y. (2002). Issues and challenges in ubiquitous computing. Communications of the ACM, 45(12), 63–65.
11. Sivani, T., & Mishra, S. (2022). Wearable devices: Evolution and usage in remote patient monitoring system. In Connected e-Health (pp. 311–332). Springer. 12. Mohapatra, S. K., Mishra, S., Tripathy, H. K., & Alkhayyat, A. (2022). A sustainable datadriven energy consumption assessment model for building infrastructures in resource constraint environment. Sustainable Energy Technologies and Assessments, 53, 102697 13. Phillips Research. (2007). Ambient intelligence: Changing lives for the better. www.research. phillips.com/. 14. Rech, J., & Althoff, K.-D. (2004). Artificial intelligence and software engineering: Status and future trends. Themenschwerpunkt K & SE, KI, 3, 5–11. 15. Mishra, S., Jena, L., Tripathy, H. K., & Gaber, T. (2022). Prioritized and predictive intelligence of things enabled waste management model in smart and sustainable environment. PloS One, 17(8), e0272383. 16. Guleria, P., Ahmed, S., Alhumam, A., & Srinivasu, P. N. (2022). Empirical study on classifiers for earlier prediction of COVID-19 infection cure and death rate in the Indian states. Healthcare, 10(1), 85. https://doi.org/10.3390/healthcare10010085 17. Praveen, S. P., Jyothi, V. E., Anuradha, C., VenuGopal, K., Shariff, V., & Sindhura, S. (2022). Chronic kidney disease prediction using ML-based neuro-fuzzy model. International Journal of Image and Graphics, 2340013. https://doi.org/10.1142/S0219467823400132 18. Gao, T., Massey, T., Selavo, L., Crawford, D., Rong Chen, B., Lorincz, K., Shnayder, V., Hauenstein, L., Dabiri, F., Jeng, J., Chanmugam, A., White, D., Sarrafzadeh, M., & Welsh, M. (2007, September). The advanced health and disaster aid network: A light-weight wireless medical system for triage. IEEE Transactions on Biomedical Circuits and Systems, 1(3), 203– 216. 19. He, D., Chen, C., Chan, S., Bu, J., & Vasilakos, A. (2012, July). Retrust: Attack-resistant and lightweight trust management for medical sensor networks. 
IEEE Transactions on Information Technology in Biomedicine, 16(4), 623–632. 20. Pauwels, E., Salah, A., & Tavenard, R. (2007, October). Sensor networks for ambient intelligence. In Proceedings of IEEE 9th Workshop Multimedia Signal Processing, October 2007, pp. 13–16. 21. Mishra, S., Panda, A., & Tripathy, K. H. (2018). Implementation of re-sampling technique to handle skewed data in tumor prediction. Journal of Advanced Research in Dynamical and Control Systems, 10(14), 526–530. 22. Guo, W. W., Healy, W. M., & Zhou, M. (2011, March). Wireless mesh networks in intelligent building automation control: A survey. International Journal of Intelligent Control Systems, 16(1), 28–36. 23. Mishra, S., Dash, A., & Mishra, B. K. (2020). An insight of Internet of Things applications in pharmaceutical domain. In Emergence of pharmaceutical industry growth with industrial IoT approach (pp. 245–273). Academic Press. 24. Mishra, S., Tripathy, H. K., & Acharya, B. (2021). A precise analysis of deep learning for medical image processing. In Bio-inspired neurocomputing (pp. 25–41). Springer. 25. Wu, G., & Xue, S. (2008). Portable pre impact fall detector with inertial sensors. IEEE Transactions on Neural Systems and Rehabilitation Engineering, 16(2), 178–183. 26. Lai, C., Chang, S., Chao, H., & Huang, Y. (2011). Detection of cognitive injured body region using multiple triaxial accelerometers for elderly falling. IEEE Sensors Journal, 11(3), 763–770. 27. Zhuang, X., Huang, J., Potamianos, G., & Hasegawa-Johnson, M. (2009). Acoustic fall detection using Gaussian mixture models and GMM supervectors. In IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2009 (pp. 69–72). IEEE. 28. Alwan, M., Rajendran, P., Kell, S., Mack, D., Dalai, S., Wolfe, M., & Felder, R. (2006). A smart and passive floor-vibration based fall detector for elderly. In 2006 2nd International Conference on Information and Communication Technologies, ICTTA’06 (Vol. 1, pp. 1003–1007). IEEE.
29. Mukherjee, D., Tripathy, H. K., & Mishra, S. (2021). Scope of medical bots in clinical domain. In Technical advancements of machine learning in healthcare (pp. 339–363). Springer. 30. Shi, G., Chan, C., Li, W., Leung, K., Zou, Y., & Jin, Y. (2009). Mobile human airbag system for fall protection using MEMS sensors and embedded SVM classifier. Sensors, 9(5), 495–503. 31. Wu, W., Au, L., Jordan, B., Stathopoulos, T., Batalin, M., Kaiser, W., Vahdatpour, A., Sarrafzadeh, M., Fang, M., & Chodosh, J. (2008). The smartcane system: An assistive device for geriatrics. In International Conference on Body Area Networks (pp. 1–4). 32. Haux, R. (2006). Individualization, globalisation and health about sustainable information technologies and the aim of medical informatics. International Journal of Medical Informatics, 75, 795–808. 33. Jena, K. C., Mishra, S., Sahoo, S., & Mishra, B. K. (2017, January). Principles, techniques and evaluation of recommendation systems. In 2017 International Conference on Inventive Systems and Control (ICISC) (pp. 1–6). IEEE. 34. Patel, S., Park, H., Bonato, P., Chan, L., & Rodgers, M. (2012). A review of wearable sensors and systems with application in rehabilitation. Journal of NeuroEngineering and Rehabilitation, 9(1), 21. Available: http://www.jneuroengrehab.com/content/9/1/21 35. Jarochowski, B. P., Shin, S., Ryu, D., & Kim, H. (2007). Ubiquitous rehabilitation center: An implementation of a wireless sensor network based rehabilitation management system. In Proceedings of the 2007 International Conference on Convergence Information Technology, ser. ICCIT’07 (pp. 2349–2358). IEEE Computer Society, Washington, DC, USA. https://doi.org/10.1109/ICCIT.2007.383 36. Piotrowicz, E., Jasionowska, A., Banaszak-Bednarczyk, M., Gwilkowska, J., & Piotrowicz, R. (2012). ECG telemonitoring during home-based cardiac rehabilitation in heart failure patients. Journal of Telemedicine and Telecare, 18(4), 193–197. 37. Helmer, A., Song, B., Ludwig, W., Schulze, M., Eichelberg, M., Hein, A., Tegtbur, U., Kayser, R., Haux, R., & Marschollek, M. (2010). A sensor-enhanced health information system to support automatically controlled exercise training of COPD patients. In 2010 4th International Conference on Pervasive Computing Technologies for Healthcare (PervasiveHealth), March, pp. 1–6.
Illuminating Unexplored Corners in Healthcare Space Using Ambience Intelligence Sagnik Ghosh, Dibyendu Mehta, Shubham Kumar, Sushruta Mishra, Baidyanath Panda, and Naga Srinivasu Parvathaneni
Abstract The term “Ambient Intelligence” (AmI) describes physical environments that perceive and respond to human presence. It has been made possible by developments in machine learning and contactless sensors. In this paper, we discuss how AmI technology might assist people with various physical, mental or chronic illnesses. We examine how this technology could improve our understanding of the hidden areas of healthcare. Early implementations in hospitals may soon allow for more effective clinical processes and improve patient safety in operating rooms and intensive care units. We focus on four key facets of trustworthiness: privacy, fairness, transparency and research ethics. We also address a few technological problems, and potential solutions, associated with acquiring knowledge from large volumes of data and detecting unexpected events in clinical settings. By using this technology carefully, we may come to understand the intricate interactions between crucial human behaviours and the physical environment.
Keywords Ambience Intelligence · Artificial Intelligence · Healthcare · Sensory computing · Ubiquity · Big data
S. Ghosh · D. Mehta · S. Kumar · S. Mishra (B)
Kalinga Institute of Industrial Technology, Deemed to Be University, Bhubaneswar, India
B. Panda
LTIMindtree, 1 American Row, 3rd Floor, Hartford, CT 06103, USA
N. S. Parvathaneni
Department of Computer Science and Engineering, Prasad V Potluri Siddhartha Institute of Technology, Vijayawada 520007, India
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023
P. Barsocchi et al. (eds.), Enabling Person-Centric Healthcare using Ambient Assistive Technology, Studies in Computational Intelligence 1108, https://doi.org/10.1007/978-3-031-38281-9_12

1 Introduction
Visualize a future in which a small gadget continuously assesses your health, diagnoses potential illnesses, talks with you to encourage lifestyle changes for better health and, if necessary, contacts your doctor. It might be built into the fibers of your everyday clothing as small sensors, and it could connect with other nearby devices, such as the many sensors built into your home to keep track of your daily activities [1]. For instance, based on the items in your fridge and the food you usually eat outside, it could infer that your diet is not balanced. While these possibilities may currently seem like science fiction, many respected experts working in the Ambient Intelligence field believe they will become a reality in the not-too-distant future [2]. The Ambient Intelligence paradigm represents the future of intelligent computing. In this paradigm, traditional input and output media give way to sensors and processors embedded in everyday objects, which cooperate to assist the occupants [3]. AmI promises to analyse the large quantity of contextual data gathered from such embedded sensors using a variety of Artificial Intelligence algorithms. It will also transparently and proactively adapt the environment to the needs of the user. Several traits characterize an AmI system:
• Context-aware: It makes use of situational and contextual data.
• Transparent: It quietly blends into the background of our regular activities.
• Anticipatory: It may predict an individual's requirements without their conscious intervention.
• Ubiquitous: It is embedded in our daily settings.
• Adaptive: It adjusts to people's shifting demands.
• Personalized: It takes each person's demands into account.
The main highlights of the study are as follows:
• It emphasizes issues faced in healthcare and proposes suitable solutions for stakeholders, relating to the analysis of human behaviour in complicated contexts and learning from big data.
• It considers home-assisted living arrangements for elderly people that enable remote management of life-supporting services and biometric data.
• It proposes a Disease Management System using Ambient Intelligence. In this system, medical sensors transmit health records through device applications; the network captures these records and passes them on to the server. The server searches for the best query and returns it to the network and then finally back to the application.
• It covers monitoring human behaviour, recognizing unusual events and acquiring knowledge from big data.
The paper is organized as follows. Section 1 states the recent trends that are causing a significant shift in the healthcare industry. Section 2 briefly surveys important literature related to ambient intelligence in the health sector. Section 3 explains how disease risks can be managed using ambient intelligence. Section 4 highlights various categories of stakeholders in the healthcare industry that have been identified in addition to these worldwide trends and problems. Section 5 discusses some basic functional prerequisites for using ambient intelligence in healthcare. Section 6 presents a novel disease management framework using ambient computing. Section 7 describes some common use cases where this computing can be used in everyday life. Section 8 discusses some ethical concerns of the domain. Section 9 gives feasible solutions to overcome the technical concerns. Section 10 deals with constraints in using ambient healthcare. Section 11 concludes the topic.
2 Related Works and Background Study
Several AmI applications for healthcare have emerged in both academia and industry. This section reviews frameworks from both research and practice as background on some important applications. Gouaux et al. describe wearable personal ECG monitoring (PEM) equipment for early detection of cardiac events, which detects abnormalities and reports them by raising different alert levels. Several health monitoring devices are commercially available today, such as HealthBuddy by Bosch, TeleStation by Philips, HealthGuide by Intel and Genesis by Honeywell. Several academic initiatives have attempted to integrate monitoring technologies into textiles, including the WEALTHY, BIOTEX and MagIC projects. Sometimes diagnosing a health concern requires tracking just one behaviour; for example, Nambu et al. monitor TV watching. Most research studies track only a subset of everyday activities. For instance, the CASAS project tracks a subset of a patient's daily chores to determine whether they are performed consistently and completely. There are also fall prevention solutions, such as the smart cane created by Wu et al., which classifies cane usage and walking styles and alerts the elderly user when the risk of falling is high. Another area in which elderly people might benefit greatly is medication management. Chumkamon et al. created a tracking system for indoor navigation of blind people using RFID tags. To improve navigation, Jinying et al. embedded RFID tags into the tiles of a blind path. Some systems, e.g., the SAWN system, also employ an audio interface to tell users the names of important locations. Several applications, e.g., the ShopTalk project, support routine tasks like shopping.
The Ubiquitous Rehabilitation Center system, suggested by Jarochowski et al., combines a Zigbee-based wireless network with sensors for monitoring patients and rehabilitation equipment. These sensors are connected to Zigbee motes, which connect to a server program that controls every part of the rehabilitation facility and enables rehabilitation professionals to issue patients' prescriptions. Piotrowicz et al. discuss the requirements of a home cardiac tele-rehabilitation system, in particular the components that control a physical exercise training session, which must observe and analyze the state of a critical patient through constant monitoring and react accordingly.
3 Disease Management with Ambient Intelligence
• Population ageing and demographic shifts: Developed nations must contend with an ageing population and a rising standard of living. According to research released by the UN's Department of Economic and Social Affairs, 22% of people in the world will be 60 years of age or older in 2050. Europe will be particularly affected, since 35% of its population will be 60 years of age or older. Elderly citizens in particular will require medical help [4].
• Challenges due to rising costs: The demographic transition is severely straining some social healthcare systems in affluent nations, because declining or constant tax revenues cannot support the rising healthcare expenses of an older population. Populations in emerging nations are growing, which results in an overall rise in healthcare demand [5]. As a result, both the demand for medical care and its overall cost have already begun to rise sharply.
• Urbanization-related impacts on the environment: Because of better career opportunities and a potentially higher quality of life, more and more individuals are relocating from rural to urban areas [6]. Air pollution and other harmful environmental effects of urban overpopulation will increase the number of people who suffer from serious ailments, including respiratory illnesses.
• Environmental changes due to industrialization and climate change: Scientists have observed a drastic change in the climate and an increase in natural catastrophes because of the earth's rising temperature, which is brought on by CO2 emissions and industrial pollutants. Cardiovascular and respiratory disorders might become more prevalent over time [7].
4 Critical Stakeholders in Healthcare Space
Each group of stakeholders faces its own trends and challenges with regard to its tasks and goals.
• Patients who need treatment are eager to participate in medical decisions and have a greater need for cancer screenings and other preventive tests. A significant portion of patients are prepared to pay extra for their health and wellbeing.
• Cost bearers, such as insurance firms that pay for medical treatment, are scrambling to find innovative ways to cut costs as a result of skyrocketing health prices.
• Medical providers, such as physicians, nurses, hospitals and nursing homes, are dealing with increasing expenses and a shortage of capacity [8]. As a result, they must apply innovative techniques for providing medical treatment and support in order to optimize their resource allocation and boost efficiency.
• Government organizations are finding it difficult to maintain the quality of national health systems despite rising expenses.
However, the changing demographics and the severe cost pressure present an opportunity for businesses providing services in the healthcare industry to develop novel ideas like integrated supply, medical wellness, prevention and supported living. Hospitals and health insurance providers in particular are on the lookout for cutting-edge ideas and IT solutions that may help control expenses and manage the growing patient population [9]. Therefore, one way to cut costs and keep the expanding patient population manageable, at a stable minimum level of quality of medical care, seems to be disease management strategies that support assisted living, enhance medical treatments, offer patient monitoring and boost effectiveness within health systems.
5 Functional Requirements in Clinical Domain
People who require daily support, especially the elderly and those with restricted capacities, might benefit from ambient assisted living. Such home-assisted living arrangements require specialized IT systems that enable the remote management of life-supporting services and biometric data, allowing medical experts like physicians and nurses to oversee and direct caregiving activities [10]. For this, the systems composed of devices, services and applications should:
– enable transmission of information through a variety of communication channels, depending on a dynamic networking environment.
– support seamless interconnection of devices, including wireless sensors, consumer devices and medical equipment.
– implement environmental safety and security, a vast issue in its own right, especially in the health sector.
– provide conveniences like recognizing and dealing with a patient's situation, including any critical conditions.
Ideally, an underlying system platform would address such needs while supporting individual applications for various healthcare objectives. A platform needs to cope with a huge variety of devices. Medical technology is widely available nowadays [11]. Some devices have wired or wireless interfaces (IR or Bluetooth, with future devices perhaps supporting emerging protocols such as Wireless USB). Bio-sensors must be modelled so that they can be integrated seamlessly. Additionally, managing several diseases at once requires managing several sensors at once. Dealing with the multiplicity of low-footprint protocols (in terms of energy consumption and resources used) implemented on standard bio-sensors remains a challenge today. Beyond multiprotocol compatibility,
the platform’s strong communication mechanisms will need to offer sophisticated synchronisation techniques that allow for brief offline conditions. Contextual awareness features will need to facilitate federated or at least decentralised decision making that is supported by centralised control. In order to provide prompt response in an emergency situation and to monitor the service level for invoicing and other purposes, the platform will also need to handle dependability issues.
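The synchronisation requirement above, tolerating brief offline periods, can be illustrated with a store-and-forward buffer. The following Python sketch is a minimal illustration under stated assumptions and is not part of the platform described here: the class name `StoreAndForwardBuffer`, the `send_batch` callback and the `max_size` limit are all hypothetical. Readings are held locally and uploaded in bulk once the link is available.

```python
from collections import deque
from typing import Callable, Deque, Dict, List

Reading = Dict[str, float]  # hypothetical reading format, e.g. {"heart_rate_bpm": 72.0}


class StoreAndForwardBuffer:
    """Holds readings locally and uploads them in one batch when a link is available."""

    def __init__(self, send_batch: Callable[[List[Reading]], bool], max_size: int = 1000):
        # send_batch is an assumed transport callback: it returns True on success.
        self._send_batch = send_batch
        # With maxlen set, the oldest readings are dropped first when the buffer is full.
        self._pending: Deque[Reading] = deque(maxlen=max_size)

    def record(self, reading: Reading) -> None:
        """Store a reading locally, regardless of connectivity."""
        self._pending.append(reading)

    def flush(self) -> int:
        """Try to upload everything pending; return how many readings were sent."""
        if not self._pending:
            return 0
        batch = list(self._pending)
        if self._send_batch(batch):
            self._pending.clear()
            return len(batch)
        return 0  # upload failed: keep the data for the next attempt

    def pending(self) -> int:
        """Number of readings still waiting to be uploaded."""
        return len(self._pending)
```

A caller would invoke `record` for every measurement and call `flush` periodically; while the link is down, `flush` returns 0 and nothing is lost until the buffer reaches `max_size`.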
6 Disease Regulation and Healthcare Scenario
Clinicians and developers create a wide range of smart low-power sensors for wireless, self-configuring body networks that connect to existing healthcare infrastructure [12]. Because these systems are dependable and secure, doctors rely heavily on remote information access to provide diagnoses and analyse long-term risks [13]. These requirements are reflected in the following aspects of a typical illness management program:
• Application configuration: The physical setup of the measuring environment must be configured in order to determine application settings, such as the conditions under which alarms should be raised. Such conditions define a situation that, when met, calls for particular measures.
• Monitoring: Biodata is sensed and then transferred automatically to a background system that records the data and grants secured access to privileged users, such as medical professionals. The data can be transferred through a variety of communication routes, such as asynchronous SMS transfers over cellular networks or a regular fixed-line internet connection [14].
• Device configuration: The sensors and other applications must be set up in a practical and ideally automated manner before any measurements are taken.
• Management and analysis of data: To be ready for temporary offline scenarios, data needs to be stored locally and sent in bulk to be synced with the background system. The analysis of the gathered data will be made available to health professionals as well as to patients who have received the necessary education.
The devices and their interactions are depicted in Fig. 1. A web application running on the controlling device, which is hosted by a web application server, initiates the entire process of configuring the measurement environment.
The web application server manages the included sensors and the metadata of the proxy devices (information about device settings), and it also manages the databases that store the measured data. Additionally, the application server is linked to the proxy device in both directions by a network, so that once the user launches the web application, the application server manages the configuration of the chosen sensor as well as the proxy and starts the asynchronous discovery of sensing devices on the proxy. This requires prior installation of the device program that uses the proxy. We assume that the application for the device is built as a proxy service that has been customized
before being deployed to the proxy so that it is aware of the user and the URL of the application on the application server [15]. As a result, there is a one-to-one connection between the user and the proxy service. Table 1 highlights the different kinds of sensory information aggregated using Ambient Intelligence in the healthcare sector.
Fig. 1 A typical disease management system using ambient intelligence
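The alarm conditions mentioned under application configuration can be expressed as declarative threshold rules that the monitoring step evaluates against each incoming reading. The sketch below is illustrative only: `AlarmRule`, `evaluate`, the signal names and the threshold values are assumptions made for this example, and the numbers are not clinical guidance.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class AlarmRule:
    """One configured alarm condition for a single signal (hypothetical schema)."""
    signal: str             # key of the measured value in a reading
    low: Optional[float]    # alarm if the value falls below this bound (None = no lower bound)
    high: Optional[float]   # alarm if the value rises above this bound (None = no upper bound)
    level: str              # alert level, e.g. "warning" or "critical"


def evaluate(reading: Dict[str, float], rules: List[AlarmRule]) -> List[str]:
    """Return one message for every rule whose bounds the reading violates."""
    alerts = []
    for rule in rules:
        value = reading.get(rule.signal)
        if value is None:
            continue  # this signal was not part of the reading
        too_low = rule.low is not None and value < rule.low
        too_high = rule.high is not None and value > rule.high
        if too_low or too_high:
            alerts.append(f"{rule.level}: {rule.signal}={value}")
    return alerts


# Illustrative thresholds only; real limits would be set by clinicians per patient.
RULES = [
    AlarmRule("heart_rate_bpm", low=40.0, high=130.0, level="critical"),
    AlarmRule("body_temp_c", low=None, high=38.0, level="warning"),
]
```

Keeping the rules as data rather than code means the web application could, in principle, change them per patient without redeploying the proxy service.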
Table 1 Sensory data gathering for ambient computing in healthcare
| | Thermal sensor | Camera | Audio sensor | Depth sensor |
| Functions | Measures the temperature of an object | Uses visible light to perceive the world | Recognises sound waves by their intensity and transforms them into electrical signals | Recognises the presence of any object nearby and calculates the distance to it |
| Uses | Determines a patient’s body temperature | Person detection, object identification | Event detection, voice recognition | Motion tracking, 3D object detection |
| Sensor used | IR sensor | Ultrasonic proximity sensors | Electromagnetic Articulography (EMA) | Light Detection and Ranging (LiDAR) |
7 Ambient Intelligence and Its Usage in Daily Life
There are two vital use cases in which Ambient Intelligence can help manage activities of daily life and meet societal needs.
• Chronic disease management: Gait analysis is a crucial technique for diagnosing illness and gauging the effectiveness of treatment, with applications in both physical rehabilitation and the management of chronic conditions [16]. For instance, regular and reliable gait analysis might help cerebral palsy patients recover more quickly after surgery or allow Parkinson's disease to be identified up to 4 years earlier. Gait analysis, formerly confined to research labs with motion capture equipment, is increasingly being done with wearable technology. In one study, accelerometers were used to estimate the 6-minute walking distance of 30 individuals with chronic lung disease, with a mean absolute error rate of 6%. One drawback of wearables is that they must be physically attached to the body, which makes them uncomfortable for patients. As an alternative, contactless sensors might be used to develop interactive, home-based rehabilitation programmes and to assess gait continually with increased accuracy. Several investigations have analysed gait in unstructured environments using cameras, depth sensors, radar and microphones. One study used depth sensors to measure the walking patterns of nine Parkinson's disease patients. Using a high-end motion capture system as the ground truth, it found that depth sensors could track vertical knee movements to within four centimetres [17]. Another study used depth sensors to develop an exercise game for people with cerebral palsy. According to the Tinetti test, patients who played the game for 24 weeks showed an 18% improvement in balance and gait. Although promising, each of these investigations looked at a single sensor modality.
In laboratory tests, adding microphones to wearable sensors improved gait detection by 3–7%. Future studies might examine how different sensing modalities, such as wearable cameras, touch sensors and passive infrared motion sensors, could work together.
• Elderly lifestyle: By 2050, the number of people aged 65 and older will rise from 650 million to 1.4 billion. Activities of daily living (ADLs), such as eating, dressing and taking a shower, are essential for this population's independence and well-being. ADL impairment is linked to up to a fivefold increase in one-year mortality and a twofold increase in the risk of falling. Early identification of impairments may offer the chance to deliver prompt clinical care, potentially improving ADL performance by a factor of two. ADLs are currently evaluated manually by caregivers or through self-reported questionnaires, even though these evaluations are infrequent and biased. In contrast, wearable technology (such as accelerometers or ECG sensors) can monitor not just ADLs but also heart rate, blood sugar levels and breathing rate [18]. Wearable technology cannot, however, determine whether a patient received assistance with an ADL, which is a crucial aspect of ADL assessments. Contactless ambient sensors may be able to detect a wider variety of activities while also recognising these clinical subtleties. A sample demonstration is shown in Fig. 2: (a) an elderly person's home is equipped with one ambient sensor, and the green frustum represents the sensor's coverage area (the range for audio sensors and the field of view for visual sensors such as cameras); (b) an Ambient Intelligence algorithm processes the sensor's thermal and depth data to classify activities such as walking and sleeping.
Fig. 2 Ambient Intelligence for elderly citizens
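The error figure quoted for the walking-distance study in the chronic disease discussion can be made concrete. Assuming it denotes a mean absolute error expressed as a percentage of the measured distance (the source does not state the exact formula), it could be computed as in the following sketch. The function name and the sample distances in the usage note are hypothetical and are not data from the cited study.

```python
from typing import Sequence


def mean_absolute_percentage_error(measured: Sequence[float], estimated: Sequence[float]) -> float:
    """Average of |estimated - measured| / measured over all subjects, in percent."""
    if len(measured) != len(estimated) or len(measured) == 0:
        raise ValueError("measured and estimated must be non-empty and of equal length")
    if any(m == 0 for m in measured):
        raise ValueError("measured distances must be non-zero")
    total = sum(abs(e - m) / abs(m) for m, e in zip(measured, estimated))
    return 100.0 * total / len(measured)
```

For two hypothetical subjects with measured distances of 400 m and 500 m and wearable-based estimates of 380 m and 525 m, each estimate is off by 5% of the measured value, so the function returns 5%.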
8 Social and Ethical Issues
Within the growing body of research on trustworthy AI, we focus on four distinct aspects of trustworthiness: privacy, fairness, transparency and research ethics [19]. Experts in computer science, medicine, law and public policy must work closely together to develop the technology while considering the following factors:
• Privacy: Ambient sensors are intended to monitor their surroundings continually and can reveal new details about how human behaviour affects the provision of healthcare. For instance, sensors can detect vital signs from a distance. Although convenient, such information may be exploited to infer private medical conditions. People all across the world are becoming increasingly sensitive to mass data collection, which raises concerns about the confidentiality, sharing and retention of this information [20]. It is therefore crucial to develop this technology collaboratively with privacy and security in mind, not just in the technology itself but also through ongoing participation of all stakeholders throughout the development process. We outline some new and established privacy-preserving methods. One approach is de-identification, which removes individual identities from the data. Another is to minimise data collection, transmission and human bycatch; for example, an ambient system could pause when the patient is not present in a hospital room. Even if data are de-identified, it may still be possible to re-identify a specific person. Super-resolution techniques could enable re-identification by partially undoing the effects of face blurring and dimensionality reduction. Accordingly, data should remain stored locally on devices to lower the risk of unauthorised access and re-identification. Figure 3 highlights some basic ethical issues concerned with the study. Legal and societal complexities will inevitably arise. There have been instances in which businesses were compelled to give law enforcement data from ambient speakers and cameras. Although these devices were located inside potential crime scenes, this raises the question of when incidental findings beyond the crime scene, such as unintentional admissions, should be disclosed. In connection with data sharing, some healthcare companies have disclosed patient data to other parties such as data brokers [21]. To reduce this, patients can proactively ask healthcare professionals to use privacy-preserving procedures. Additionally, to create governance frameworks for ambient systems, physicians and technologists must interact with crucial stakeholders, including patients, family members and carers, legal professionals and policymakers [22].
Fig. 3 Ethical issues in healthcare with Ambient Intelligence
• Fairness and bias: Ambient Intelligence will interact with large patient populations, perhaps on a scale several orders of magnitude greater than what is now possible for physicians. This compels us to consider carefully whether ambient systems are fair. Fairness is a complicated and multifaceted subject that numerous research communities have studied. Here we address two aspects of algorithmic fairness as examples: dataset bias and model performance. The majority of machine-learning systems are built on labelled datasets, but medical datasets were skewed even before deep learning. These biases may lead to worse clinical outcomes for certain populations. If some features are missing, whether because of restrictions on data collection or social factors, algorithms may misinterpret an individual's entire record, leading to higher levels of prediction error [23]. One way to detect bias is to analyse model performance across multiple groups. In one study, error rates for predicting 30-day psychiatric readmission differed between ethnic groups. A more stringent approach would be to test for equal sensitivity and positive predictive value across groups. However, because certain populations may have inherent physiological differences, equal model performance may not result in equal clinical outcomes. Nevertheless, efforts are being made to mitigate bias.
• Consent: Participants in studies that use Ambient Intelligence for data collection have the same rights and considerations as participants in other types of human research. When deciding whether to join a study, people need to be informed of the various uses of their data, including how their data may be used for the specific research under way, for future research initiatives and in prospective collaborations with other researchers [24].
The patient and their family should be informed that sensor data cannot be expected to provide real-time warnings of patient concerns, since a significant amount of time may pass between the recording of sensor data and its examination; other potential expectations surrounding the data should be addressed in the same way. Patients should be aware that enrolling in or withdrawing from the study will not affect their care (unless the study is designed to do so). Additionally, patients need to be aware that their care staff is not their research team and will not annotate their data [25]. The informed consent requirement may be waived if the institutional review board determines that there is minimal risk to participants, that it would be impractical to conduct the study without a waiver, that the waiver will not adversely affect participants’ rights or welfare, and that additional information about their involvement will be provided to participants. If privacy concerns account for most of the risk to patients or participants, many Ambient Intelligence initiatives are likely to be categorised as minimal risk. In contrast, other project designs or circumstances may involve more access to health data or more significant privacy implications, posing a risk to participants that would require their full consent.
S. Ghosh et al.
Some hospitals or other healthcare facilities may already include notification of, or consent for, research on the forms or paperwork presented to patients. Ambient Intelligence initiatives may therefore need to take into account the consent procedures that already apply at the institution.

• Transparency
Ambient Intelligence can uncover the effect of human behaviour on healthcare delivery. These findings may surprise researchers, in which case physicians and patients need to be able to trust the results before using them. Instead of opaque, black-box models, Ambient Intelligence systems should produce interpretable, descriptive, predictive and useful results. This can help with the difficult work of winning over stakeholders, since technical ignorance and model opacity can hinder the application of Ambient Intelligence in healthcare. Transparency is not limited to the algorithm. If a dataset’s transparency (a thorough trail of how the dataset was conceived, gathered and annotated) can be demonstrated, special precautions can be taken for future uses, such as educating human annotators or changing a study’s inclusion and exclusion criteria. Official transparency guidelines are actively being developed. Model cards are succinct reports that compare an algorithm’s performance across populations and describe its evaluation procedures.

• Research ethics
Ethical research covers themes such as the protection of human participants, independent evaluation and public benefit. Respect for persons is a core tenet of laws governing research involving human subjects. In research, this manifests as participants’ informed consent. However, some policies permit research to proceed without consent if it poses negligible risk to participants or if obtaining consent is not practicable.
Because of automated de-identification methods, obtaining informed consent for large-scale Ambient Intelligence studies might be challenging and, in some situations, impossible. Deliberative democracy or public participation can be alternatives in such situations. Relying entirely on the honesty of the lead investigators to ensure ethical research may give rise to conflicts of interest. To reduce this danger, academic research involving human subjects must have institutional review board approval. Public health surveillance, which aims to enhance health and prevent the spread of illness, does not require independent evaluation. Depending on the application, Ambient Intelligence might fall into either category. Researchers are encouraged to confer with legal and ethical professionals to decide the best course of action for safeguarding all human participants while maximising the benefit to society [26].
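The group-wise check described under Fairness and bias in this section (comparing sensitivity and positive predictive value across groups) can be illustrated with a minimal sketch. The function name `groupwise_rates`, the group labels and the toy data are hypothetical and are not taken from any study cited here; the sketch only shows the bookkeeping involved, not a complete fairness audit.

```python
from collections import defaultdict


def groupwise_rates(y_true, y_pred, groups):
    """Return {group: (sensitivity, ppv)} for binary labels where 1 is positive."""
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "fn": 0})
    for truth, pred, group in zip(y_true, y_pred, groups):
        c = counts[group]  # touching the key also registers groups with no positives
        if pred == 1 and truth == 1:
            c["tp"] += 1
        elif pred == 1 and truth == 0:
            c["fp"] += 1
        elif pred == 0 and truth == 1:
            c["fn"] += 1
    rates = {}
    for group, c in counts.items():
        actual_pos = c["tp"] + c["fn"]
        flagged = c["tp"] + c["fp"]
        # None marks a rate that is undefined for this group (empty denominator).
        sensitivity = c["tp"] / actual_pos if actual_pos else None
        ppv = c["tp"] / flagged if flagged else None
        rates[group] = (sensitivity, ppv)
    return rates


# Hypothetical toy data: true outcome, model prediction and group label per patient.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 0, 1, 1, 0, 0, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

for group, (sens, ppv) in sorted(groupwise_rates(y_true, y_pred, groups).items()):
    print(group, sens, ppv)
```

A large gap between groups in either rate would flag the model for closer review; as noted above, equal rates alone do not guarantee equivalent clinical outcomes.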
9 Technical Concerns and Methods to Overcome

In this section, we focus on some technical issues and potential solutions relating to the analysis of human behaviour in complicated contexts, learning from big data and recognising unusual events in clinical settings.

• Behaviour recognition
Comprehending complicated human behaviours in healthcare settings requires research spanning several fields of Artificial Intelligence, such as human pose estimation and models of human-object interaction. Consider hospital morning rounds. Each patient in a hospital unit is routinely reviewed and visited by up to a dozen physicians. During this time, doctors may block the sensor’s view of a patient, so that potentially life-threatening events go unnoticed. Tracking algorithms can estimate the location of an object after occlusion if it was moving previously. For longer occlusions, matrix completion techniques, such as image inpainting, can fill in the occluded region. Similar methods can denoise audio spectrograms. Ambient Intelligence must also comprehend how individuals relate to objects and to other people. One class of techniques finds visually grounded relationships in images in the form of a scene graph. A scene graph is a network of connected nodes in which each node is an object in the image and each link shows how two objects are related to one another. Scene graphs might not only improve the understanding of human behaviour but also increase the transparency of Ambient Intelligence.

• Learning from big data
Ambient sensors in homes and hospitals will generate petabytes of data. This necessitates new machine-learning techniques that can handle huge data volumes and model unusual events. Training large-scale activity-understanding models may take days and may require a massive cluster of specialised hardware.
Although expensive, cloud servers are a viable answer, since Ambient Intelligence may need a lot of storage, processing power and network bandwidth [27–29]. Improved gradient-based optimizers and neural network architectures may shorten training time. Fast model training does not, however, ensure that the model will be fast during inference. For instance, video-based activity recognition models are typically slow, operating at 1–10 frames per second. Even highly optimised models running at 100 frames per second may struggle to analyse terabytes of data per day. Quantization and model compression are two methods that can lower the amount of storage and processing needed. Rather than analysing audio or video at full spatial or temporal resolution, some techniques rapidly select portions of interest, known as proposals. These proposals are then passed to heavy-duty modules for highly precise yet computationally demanding activity recognition.

• Clinical behaviours
Even though ambient sensors provide a lot of data, some clinical events are rare. Understanding these health-critical behaviours requires discovering such long-tail events. Consider fall detection. Because the bulk of ambient data reflects normal behaviour, the resulting label imbalance biases the algorithm. In a broader sense, statistical bias can relate to any type of data, including characteristics of protected classes. One option is to statistically calibrate the algorithm so that error rates are consistent across all of the supplied characteristics. However, falls may be more frequent in particular healthcare settings than in the initial training set. This calls for generalisation, an algorithm’s capacity to function on unseen distributions. Instead of training one model for all distributions, transfer learning takes an existing model and fine-tunes it for the new distribution. Another approach, called domain adaptation, reduces the discrepancy between the distributions used for training and testing, frequently by improving feature representations. Healthcare practitioners with limited resources may employ few-shot learning, which allows algorithms to learn from just one or two examples.
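To make the label-imbalance problem in fall detection concrete, the sketch below computes inverse-frequency class weights, a common way to stop the majority class from dominating training. This is a different and simpler remedy than the statistical calibration mentioned above, and it is offered only as an illustration: the function name, the class names and the 98-to-2 split are assumptions, not figures from the chapter.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


# Hypothetical ambient-sensor labels: falls are rare compared with normal activity.
labels = ["normal"] * 98 + ["fall"] * 2
weights = inverse_frequency_weights(labels)

for cls, weight in sorted(weights.items()):
    print(cls, round(weight, 3))
```

With these weights each class contributes the same total mass to a weighted loss, so a model can no longer reach a low loss simply by always predicting normal behaviour. When the deployment setting has a different fall rate, the weights would need to be recomputed or the model adapted, as discussed above.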
10 Limitations of Using Ambient Intelligence in Healthcare

• Privacy: Researchers working on Ambient Intelligence applications need to consider carefully a number of project-related factors: the environments in which sensing data will be collected, the kinds of data that the sensors may capture, the conclusions that may be drawn from that data and the design precautions that may be necessary to protect it, especially given that efforts to de-identify information may not be as complete as is sometimes imagined.
• Management of data: Supervision of the data is an important privacy principle in human research. Effective supervision includes ensuring that only members of the research team have access to the study data, that team members are trained in data privacy and security and have signed privacy agreements with the sponsoring institutions, and that data practices limit access to fully identifiable data as much as possible.
• Use of sensors: Ambient Intelligence applications rely on wireless sensors, and their continuous operation places a heavy drain on sensor battery life.
11 Conclusion

The paradigm of Ambient Intelligence offers a glimpse into computing’s future. It promises efficient gathering and analysis of contextual data by combining a number of computing and networking approaches, and it provides a discreet and user-friendly interface. Thanks to these capabilities, AmI systems can improve many parts of our daily lives. Healthcare is one of the sectors that promises to make extensive use of this ground-breaking paradigm. In this chapter, we covered a variety of angles on how Ambient Intelligence is being used in healthcare. We discussed employing AmI in healthcare according to the medical issues that individuals face, such as physical or mental limitations, chronic illnesses or conditions needing rehabilitation. We described today’s infrastructure and technology, including wearable sensors, smart textiles, smart environments and assistive robots. We also provided a high-level review of several AmI techniques used in the healthcare sector, including automated decision making, planning and activity recognition. We are aware that the goals set for AmI in healthcare are challenging to attain and that many challenges remain to be solved; as a result, this research field is gaining more and more momentum. Researchers from many backgrounds are advancing the state of the art of AmI in healthcare by addressing fundamental issues in human factors, intelligence design and implementation, security, social and ethical considerations and other pertinent themes. We are therefore confident that this synergistic approach will bring AmI’s full vision to fruition, including all of its applications to healthcare and human well-being.
References

1. Sadri, F. (2011). Ambient intelligence: A survey. ACM Computing Surveys, 43, 36.
2. Dobre, C., Mavromoustakis, C. X., Garcia, N., Mastorakis, G., & Goleva, R. I. (2016). Introduction to the AAL and ELE systems. In C. Dobre, C. X. Mavromoustakis, N. Garcia, R. I. Goleva & G. Mastorakis (Eds.), Ambient assisted living and enhanced living environments: Principles, technologies and control (1st ed.). Butterworth-Heinemann. ISBN 978-0-12-805195-5
3. Yin, J., Yang, Q., & Pan, J. J. (2008). Sensor-based abnormal human-activity detection. IEEE Transactions on Knowledge and Data Engineering, 20, 1082–1090.
4. Meng, L., Miao, C., & Leung, C. (2017). Towards online and personalized daily activity recognition, habit modeling, and anomaly detection for the solitary elderly through unobtrusive sensing. Multimedia Tools and Applications, 76, 10779–10799.
5. Rashidi, P., & Mihailidis, A. (2013). A survey on ambient-assisted living tools for older adults. IEEE Journal of Biomedical and Health Informatics, 17, 579–590.
6. Sahoo, P. K., Mishra, S., Panigrahi, R., Bhoi, A. K., & Barsocchi, P. (2022). An Improvised deep-learning-based mask R-CNN model for laryngeal cancer detection using CT images. Sensors, 22(22), 8834.
7. Guleria, P., Naga Srinivasu, P., Ahmed, S., Almusallam, N., & Alarfaj, F. K. (2022). XAI framework for cardiovascular disease prediction using classification techniques. Electronics, 11(24), 4086. https://doi.org/10.3390/electronics11244086
8. Sivani, T., & Mishra, S. (2022). Wearable devices: Evolution and usage in remote patient monitoring system. In Connected e-Health (pp. 311–332). Springer.
9. Anuradha, C., Swapna, D., Thati, B., Sree, V. N., & Praveen, S. P. (2022). Diagnosing for liver disease prediction in patients using combined machine learning models. In 2022 4th International Conference on Smart Systems and Inventive Technology (ICSSIT), Tirunelveli, India, pp. 889–896. https://doi.org/10.1109/ICSSIT53264.2022.9716312
10. Van Kasteren, T., Noulas, A., Englebienne, G., & Kröse, B. (2008). Accurate activity recognition in a home setting. In Proceedings of the 10th International Conference on Ubiquitous Computing, Seoul, Korea, 21–24 September 2008, pp. 1–9.
11. Cheng, Z., Qin, L., Huang, Q., Jiang, S., Yan, S., & Tian, Q. (2011). Human group activity analysis with fusion of motion and appearance information. In Proceedings of the 19th ACM International Conference on Multimedia, Scottsdale, AZ, USA, 28 November–1 December 2011, pp. 1401–1404.
12. Saguna, S., Zaslavsky, A., & Chakraborty, D. (2013). Complex activity recognition using context-driven activity theory and activity signatures. ACM Transactions on Computer-Human Interaction, 20, 32.
13. Skocir, P., Krivic, P., Tomeljak, M., Kusek, M., & Jezic, G. (2016). Activity detection in smart home environment. Procedia Computer Science, 96, 672–681.
14. Angelini, L., Nyffeler, N., Caon, M., Jean-Mairet, M., Carrino, S., Mugellini, E., & Bergeron, L. (2013). Designing a desirable smart bracelet for older adults. In Proceedings of the 2013 ACM Conference on Pervasive and Ubiquitous Computing, Zurich, Switzerland, 8–12 September 2013, pp. 425–433.
15. Dai, J., Bai, X., Yang, Z., Shen, Z., & Xuan, D. (2010). PerFallD: A pervasive fall detection system using mobile phones. In Proceedings of the 8th IEEE International Conference on Pervasive Computing and Communications Workshops, Mannheim, Germany, 29 March–2 April 2010, pp. 292–297.
16. Kong, X., Meng, Z., Meng, L., & Tomiyama, H. (2021). A neck-floor distance analysis-based fall detection system using deep camera. In Advances in artificial intelligence and data engineering (pp. 1113–1120). Springer. ISBN 978-981-15-3514-7
17. Praveen, S. P., Murali Krishna, T. B., Anuradha, C. H., Mandalapu, S. R., Sarala, P., & Sindhura, S. (2022). A robust framework for handling health care information based on machine learning and big data engineering techniques. International Journal of Healthcare Management, 1–18. https://doi.org/10.1080/20479700.2022.2157071
18. Mohapatra, S. K., Mishra, S., Tripathy, H. K., & Alkhayyat, A. (2022). A sustainable data-driven energy consumption assessment model for building infrastructures in resource constraint environment. Sustainable Energy Technologies and Assessments, 53, 102697.
19. Schroeter, C., Mueller, S., Volkhardt, M., Einhorn, E., Huijnen, C., van den Heuvel, H., van Berlo, A., Bley, A., & Gross, H. M. (2013). Realization and user evaluation of a companion robot for people with mild cognitive impairments. In Proceedings of the 2013 IEEE International Conference on Robotics and Automation (ICRA), Karlsruhe, Germany, 6–10 May 2013, pp. 1153–1159.
20. Sharkey, A., & Sharkey, N. (2011). Children, the elderly, and interactive robots. IEEE Robotics and Automation Magazine, 18(1), 32–38.
21. Praveen, S. P., Ali, M. H., Jaber, M. M., Buddhi, D., Prakash, C., Rani, D. R., & Thirugnanam, T. (2022). IOT-enabled healthcare data analysis in virtual hospital systems using Industry 4.0 smart manufacturing. International Journal of Pattern Recognition and Artificial Intelligence. https://doi.org/10.1142/S0218001423560025
22. Perlman, D. (2004). European and Canadian studies of loneliness among seniors. Canadian Journal on Aging/La Revue canadienne du vieillissement, 23, 181–188.
23. Van Tilburg, T., Havens, B., & de Jong Gierveld, J. (2004). Loneliness among older adults in the Netherlands, Italy, and Canada: A multifaceted comparison. Canadian Journal on Aging/La Revue canadienne du vieillissement, 23, 169–180.
24. Moren-Cross, J. L., Lin, N., Binstock, R. H., & George, L. K. (2006). Social networks and health. In Handbook of aging and the social sciences (pp. 111–126). Elsevier.
25. Tripathy, H. K., Mishra, S., Suman, S., Nayyar, A., & Sahoo, K. S. (2022). Smart COVID-shield: An IoT driven reliable and automated prototype model for COVID-19 symptoms tracking. Computing, 1–22.
26. Moak, Z. B., & Agrawal, A. (2009). The association between perceived interpersonal social support and physical and mental health: Results from the National Epidemiological Survey on Alcohol and Related Conditions. Journal of Public Health, 32, 191–201.
27. Suman, S., Mishra, S., Sahoo, K. S., & Nayyar, A. (2022). Vision navigator: A smart and intelligent obstacle recognition model for visually impaired users. Mobile Information Systems.
28. Mishra, N., Mishra, S., & Tripathy, H. K. (2023, January). Rice yield estimation using deep learning. In Proceedings of the First International Conference on Innovations in Intelligent Computing and Communication, ICIICC 2022, Bhubaneswar, Odisha, India, December 16–17 (pp. 379–388). Springer International Publishing.
29. Patel, S., Bakaraniya, P. V., Mishra, S., & Singh, P. (2022). Security issues in deep learning. In Predictive data security using AI: Insights and issues of Blockchain, IoT, and DevOps (pp. 151–183). Springer Nature Singapore.
Depression Assessment in Youths Using an Enhanced Deep Learning Approach

Shainee Pattnaik, Anwesha Nayak, Sushruta Mishra, Biswajit Brahma, and Akash Kumar Bhoi
Abstract Depression is a common and severe mental disorder with considerable effects on functioning and well-being, and with significant negative consequences for individuals, families, and society. Depression among college students is on the rise, and students’ psychological well-being has a significant impact on their general academic success. Neglecting it can lead to tension, anxiety, sadness, and other issues. These problems need to be identified and addressed as early as possible if the student’s mental state is to improve. Spotting depression in a large group of college students can be challenging. Most students have no idea that they might be depressed, and some keep their depression a secret from everyone, if they even know they have it. Accurate and timely identification of depression-related symptoms could offer numerous advantages for both healthcare professionals and students. An automated system is therefore needed to identify students who are struggling with depression. Here, a system is proposed that uses frontal face images of college students and analyses the facial features in each video frame to look for signs of depression. The system will be developed using frontal face images with expressions of contentment, scorn, revulsion and disgust. These features will be examined in the video frames in order to predict student depression or mental state. The evaluated model recorded an optimal accuracy of 92.5% and a relatively low latency of 2.23 s. The precision, recall and F-score values were 92.4, 92.2 and 91.5%, respectively.

Keywords Depression detection · Facial feature extraction · Machine learning · Image processing

S. Pattnaik · A. Nayak · S. Mishra (B)
Kalinga Institute of Industrial Technology, Bhubaneswar, Odisha, India
e-mail: [email protected]
A. Nayak
e-mail: [email protected]
B. Brahma
McKesson Corporation, San Francisco, CA, USA
A. K. Bhoi
KIET Group of Institutions, Uttar Pradesh, Ghaziabad, India
Sikkim Manipal University, Gangtok, Sikkim, India

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023
P. Barsocchi et al. (eds.), Enabling Person-Centric Healthcare using Ambient Assistive Technology, Studies in Computational Intelligence 1108, https://doi.org/10.1007/978-3-031-38281-9_13
1 Introduction

Depression is a mental illness that results in a constant feeling of dejection and apathy. It has no single specific symptom and can show up in many ways, which makes it very difficult to detect at an early stage. Despite tremendous internal torment, severe depression might occasionally have only minor behavioral or visible symptoms. Mental disturbance caused by difficult decisions and upsetting life events such as death, divorce, or job and money problems may, over time, end in mental breakdowns or random emotional outbursts, leading to extreme outcomes such as suicidal ideation or intent, a drop in productivity and the perception that life is hopeless. Depression affects a person’s body, feelings, beliefs, and behaviors, and its symptoms differ from individual to individual: the way one person shows symptoms is not always the way they appear in another. Prior to 2020, mental illnesses were leading contributors to the global health burden, with anxiety and depressive disorders accounting for the majority of this burden. The global outbreak of COVID-19 has further worsened public mental health. With countries locked down and places of learning moved online, times have never been tougher for students. In the 2021 Healthy Minds Study, 22% of college students screened positive for major depression, and 41% screened positive for depression overall. The shift to online education over the last 2 years in the wake of the pandemic has been new and unfamiliar to many. Prolonged and excessive use of digital devices affects mental health and can lead to depression. One study looked for a connection between students’ internet use and depressive symptoms and found that depressed students used the internet more frequently than their non-depressed peers [1].
In this digital age, college students face many examinations, assignments, continuous evaluations, and practical assessments. This academic pressure is often ignored and dismissed as normal. Students are frequently subjected to humiliation and criticism from their instructors, guardians, and sometimes even fellow classmates, and they experience negative peer pressure. Besides having too much on their plate all the time, they also face family and societal expectations and health- and academic-related decisions. All of these lead to depression among students. One study affirms that the symptoms of depression appear in the following order: depressed mood, avoidance of all activities, changes in weight and sleep, body agitation, energy loss, exhaustion, a sense of worthlessness, diminished ability to make decisions, and lastly suicidal thoughts [2]. Students often feel hesitant to talk openly about their mental health because of the shame mistakenly associated with depression and the fear of social stigma. Sometimes they are unaware of the symptoms and of their own mental condition. Hence, they do not seek help, which ultimately worsens their health and well-being. Early detection, recognition, and treatment of depressive symptoms are crucial and can curtail the negative impact of depression on an individual’s mental, physical, and socioeconomic life. An automatic depression detector is therefore essential; college counselors can use it to keep track of a student’s mental well-being, and students can use it to become aware of their own psychological state and feel encouraged to seek professional help as soon as possible. As already noted, depression is a problem that can be prevented if it is caught early enough in the college years. Our goals are to:

• Facilitate the detection of depressive symptoms.
• Help students assess their mental well-being and relieve their pressure.
• Spread awareness to counter the social stigma associated with mental health.

Facial expression is the most crucial non-verbal cue [3]. Our proposed system is designed to collect facial expressions from video captured while students answer different questionnaires, for the analysis of depression. The system will use Regions with Convolutional Neural Networks (R-CNN), a deep convolutional network used for object detection that combines Convolutional Neural Network (CNN) features with Support Vector Machine (SVM) classifiers. The extracted facial features will be used to estimate the level of depression and classify it as low, medium, or high.
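As a rough illustration of the last step, the sketch below maps per-frame expression labels to one of the three levels by the share of frames that show a negative expression. It is a minimal sketch under stated assumptions, not the authors’ implementation: the label names, the function `depression_level` and the two cut-off values are hypothetical, and a real system would have to derive its thresholds from validated data.

```python
# Expression labels treated as negative; the label names are assumed for illustration.
NEGATIVE = {"sad", "contempt", "disgust"}


def depression_level(frame_labels, low_cut=0.33, high_cut=0.66):
    """Map per-frame expression labels to 'low', 'medium' or 'high'.

    low_cut and high_cut are hypothetical thresholds on the share of
    negative frames; they are not values reported in this chapter.
    """
    if not frame_labels:
        raise ValueError("no frames to score")
    negative = sum(1 for label in frame_labels if label in NEGATIVE)
    negative_share = negative / len(frame_labels)
    if negative_share >= high_cut:
        return "high"
    if negative_share >= low_cut:
        return "medium"
    return "low"


# Hypothetical classifier output for ten key frames of one recording.
frames = ["happy", "sad", "sad", "contempt", "happy",
          "disgust", "sad", "happy", "sad", "contempt"]
print(depression_level(frames))
```

Keeping the aggregation rule separate from the frame-level classifier makes it easy to revise the thresholds without retraining the model.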
Figure 1 shows the vital symptoms of depression in college youths.
Fig. 1 Graphical view illustrating prime symptoms of depression
The major highlights of this research work are as follows.

• Analysis: Depression has become a major concern for college students, and experts worldwide are increasingly engaged in depression research. As college students, we have personally experienced depression and observed our peers suffer from it in many ways. Depression manifests itself in its own ways, and it is frequently disregarded, discounted, and difficult to recognize.
• Solution: Our work intends to map the current knowledge from depression research and use it to aid depression evaluation, making that evaluation easily accessible to everyone and allowing for speedier depression diagnosis.
• Algorithm: We employed R-CNN, a deep convolutional network method that combines CNN and SVM classifiers. Our model uses CNN-based Key Frame Extraction (KFE) for frame extraction, region proposals for detection, and Gabor filters for facial expression extraction.
• Result: Our study closes with an automatic depression evaluation test that assesses the subjects and rates their depression as high, medium, or low, allowing them to work on themselves and get help if necessary.

The chapter is structured as follows. Section 1 discusses the impact of depression on today’s youths and how machine learning can be used to deal with it. Section 2 reviews the relevant literature on depression analysis. Section 3 presents the proposed model for depression detection. Section 4 evaluates the findings and examines how the model performed in application. Section 5 concludes the work.
2 Related Work

The human face is extremely expressive and can convey countless emotions. Unlike several other non-verbal communication techniques, facial expressions are universal. Hence, expressions act as a key to identifying a depressed person. Various studies have been conducted to identify definitive depression-related expressions on the face. One specific system for describing such expressions is based on Action Units (AU) [4]. In severely depressed individuals, AU10 and AU14, which are associated with negative emotions such as anger and contempt, respectively, were clearly evident, whereas AU12, which is linked to happiness, was rarely seen. The study drew on video data collected from both depressed and non-depressed people. Its conclusions show that Action Unit 14, related to contempt, appeared to be very effective for classification and depression detection. A primary effect of depression is on a student’s ability to think clearly: affected students constantly feel sluggish and dull in class. Further observations show that such students are the least attentive and least active during lectures. It is possible to determine whether a student is depressed if their behaviour can be linked to their facial expressions of positive or negative emotions. As discussed in [5], characteristic eye movements and head position are other important indicators of depression. Classifying features related to eye activity has been identified as a better means of categorising depressed people. When students’ videos are captured, Multiscale Entropy (MSE, described as the amount of variance per pixel in a video) aids in detecting depression. MSE values measured for non-depressed individuals, who appear quite expressive, were high compared with those for depressed people. Datasets that deal with depression analysis include BlackDog (the depression dataset from the Black Dog Institute), Pitt (the depression dataset from the University of Pittsburgh) and AVEC (the audio/visual emotion challenge depression dataset), among others [7]. These three datasets were combined and investigated for depression traits. In both the individual and the combined datasets, the eye-activity modality performed best. This suggests that higher levels of unpredictability lead to better test outcomes. Another study offered a method for diagnosing depression that combines face geometry analysis with voice analysis [5]. It found that depression-related symptoms were less common in shorter videos; longer videos are thus required for effective depression identification. Datasets were created by recording subjects as they answered questions in a professional clinical interview, from the subject’s diagnosis until he or she showed significant improvement [4, 6]. Studies also showed substantial correlations between vocal behaviour and the facial features of depressed people [8]. In another study, participants were given wearable devices that monitored their physical health, group interaction, and mood states in order to detect depression [9].
Researchers also gathered information by having respondents perform a basic task of distinguishing between positive and negative facial expressions in various photos [10]. According to one study, examining the complete video as a whole yielded better results for diagnosing depression than studying it frame by frame. First, the patient’s face region is initialized manually [11]. Throughout the video, the face is then tracked using the Kanade-Lucas-Tomasi (KLT) tracker. The KLT tracker collects curvature information from images; for example, the corners of the mouth would be curved down for a sad emotion. The video-based technique was more accurate because it generalized the face region more correctly, so that minute motions related to depression are also taken into account within the identified facial region. The study makes use of a conventional face recognition system, which entails the three processing steps of face detection, feature extraction, and face recognition [15]. After the face detection and tracking module, it adds a Face Quality Assessment (FQA) module, and together these form the system’s front end for key frame extraction (KFE). The FQA module evaluates the quality of every detected face image and then saves the key frame, that is, the frame with the best face image. When features are extracted from the video and audio data of the recorded material, a metric called the Motion History Histogram (MHH) is used to capture slight variations in the patient’s facial expressions and speech characteristics [12]. Additionally, the Gabor wavelet technique has been suggested as a way to recognize facial features [13]. Here, the aim is to identify faces independently of posture and orientation. Gabor filters are used to extract the facial features because they have fundamental invariance properties that make face recognition independent of the orientation and pose of the face. They also assist in edge direction determination, image segmentation, and texture analysis. The features employed are the mean or standard deviation of the responses of about 40 Gabor filters. Classification is done using an SVM classifier. The eye, nose, and mouth regions are the main face areas selected because they can be retrieved using the Haar feature-based AdaBoost algorithm. This method is said to reduce the processing time for face recognition, especially for large databases. Additionally, combinations of the detected facial action units can discriminate between complex emotions for expression analysis [14]. The Viola-Jones face detection algorithm is reported to have the highest true-positive rates across a wide range of illumination conditions, making it the most trustworthy face detection system. It is also essential for identifying face characteristics such as the mouth, nose, and eyes.
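As a minimal sketch of the Gabor feature step described above, the following Python code builds a small filter bank and reduces each filter response to its mean and standard deviation. The split into 5 scales by 8 orientations (40 filters), the kernel size, the sigma and wavelength schedule, and all function names are illustrative assumptions, not values or code reported by the cited study.

```python
import math
from statistics import mean, pstdev


def gabor_kernel(size, sigma, theta, wavelength, gamma=0.5, psi=0.0):
    """Real part of a 2D Gabor kernel as a size x size list of lists (size must be odd)."""
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate the coordinates by the filter orientation theta.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2 * sigma ** 2))
            carrier = math.cos(2 * math.pi * xr / wavelength + psi)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel


def filter_valid(image, kernel):
    """Slide the kernel over the image ('valid' positions only); return a flat list of responses."""
    k = len(kernel)
    h, w = len(image), len(image[0])
    responses = []
    for i in range(h - k + 1):
        for j in range(w - k + 1):
            responses.append(sum(kernel[a][b] * image[i + a][j + b]
                                 for a in range(k) for b in range(k)))
    return responses


def gabor_features(image, n_scales=5, n_orientations=8, size=5):
    """Mean and standard deviation of each filter response (2 * 40 = 80 values by default)."""
    features = []
    for s in range(n_scales):
        # Assumption: the wavelength starts at 4 pixels and grows by sqrt(2) per scale.
        wavelength = 4.0 * (2 ** (s / 2))
        sigma = 0.56 * wavelength  # assumed bandwidth relation, for illustration only
        for o in range(n_orientations):
            theta = o * math.pi / n_orientations
            response = filter_valid(image, gabor_kernel(size, sigma, theta, wavelength))
            features.extend([mean(response), pstdev(response)])
    return features
```

The resulting feature vector could then be passed to a classifier such as an SVM; in practice a vectorized library implementation would replace the explicit loops.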
3 Proposed System for Depression Assessment
The system suggested in this study will assist in identifying depression among college students. The system will be trained on the facial expressions of happiness, sadness, contempt, and disgust [15]. Then, during the testing phase, images of college students responding to various surveys will be gathered. Frames will be extracted using KFE, faces will be detected using region proposals, and facial features will be extracted with Gabor filters. So that features are detected consistently across the picture, the students’ facial features will be extracted and normalized. The facial features of the test dataset will then be extracted, and depression will be detected using an SVM classifier. Depression will be assessed based on the overall distribution of happy, sad, contempt, and disgust features throughout the frames. The student’s level of depression will be classified as low, medium, or high. The amount of negativity in the video will reveal the severity of the depression: fewer pleasant features means more negative features, and vice versa. If the amount of negativity is high, the student will be classified as severely depressed, moderately depressed if it is moderate, and not depressed if it is much lower. The architecture of the suggested automated system is made up of several modules, as seen in Fig. 2.
Fig. 2 Working model of the proposed depression detection model
The algorithm represented in the architecture diagram operates in stages, as follows. Two sets of questionnaires are given to the student: one is intended to identify depression, while the other is unrelated to depression. The front camera captures the student’s responses to the questions. While completing the questionnaire about depression, students could hide their emotions, which is why two questionnaires are offered. The video recorded while the student completes the non-stress questionnaire can therefore provide information about their actual facial expression. In this study, we use a key-frame extraction (KFE) engine based on convolutional neural networks (CNNs) that extracts key-frames based on face quality, enhancing face recognition by reducing data volume and providing key-frames with high-quality
faces. This approach relies on spatial segmentation of each frame to recognize critical occurrences. Our model uses region proposals (~2k), generated by a kind of selective search, to locate the face in an input image by extracting data about the regions of interest. Rectangular bounding boxes are used to represent each region of interest. Depending on the circumstances, there can be more than 2000 regions of interest. The CNN processes each region of interest to produce output features. Face detection is followed by the use of Gabor filters to extract facial features, which gives a feature set describing the subject’s face. A typical video at 30 frames per second yields 1800 frames per minute. As the student completes the surveys, two videos can be recorded, and a frame can be collected every few seconds for testing. Gabor filters can be used to identify important facial traits, and the different features are recovered using a Gabor filter bank. The filters are oriented, in order, at 0, 22, 44, 67, 90, 112, 132, and 155 degrees. The scale value begins at 4 and increases as required. An example dataset is displayed in Fig. 3. The roughly 2000 candidate region proposals are warped into a square and fed into a convolutional neural network, which produces a 4096-dimensional feature vector for each. The CNN serves as a feature extractor, and the image features obtained via max pooling are passed to the output dense layer. Max pooling is a pooling operation that takes the maximum value of each patch of a feature map and uses it to produce a downsampled (pooled) feature map. It is usually applied after a convolutional layer and can be used to reduce variability by downsampling the convolutional output bands. The max-pooling operator returns the highest value obtained from a group of R activations.
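The max-pooling operation described above can be illustrated with a short sketch. The 2 x 2 window, the stride of 2, and the toy feature map are assumptions chosen for readability; the chapter does not state the pooling configuration that was used.

```python
def max_pool_2d(feature_map, pool=2, stride=2):
    """Downsample a 2D feature map by keeping the maximum of each pool x pool patch."""
    height, width = len(feature_map), len(feature_map[0])
    pooled = []
    for i in range(0, height - pool + 1, stride):
        row = []
        for j in range(0, width - pool + 1, stride):
            # The patch is the group of R = pool * pool activations under the window.
            patch = [feature_map[i + a][j + b]
                     for a in range(pool) for b in range(pool)]
            row.append(max(patch))
        pooled.append(row)
    return pooled


# Toy 4 x 4 feature map (hypothetical values).
fmap = [
    [1, 3, 2, 0],
    [4, 6, 5, 1],
    [7, 2, 9, 8],
    [0, 1, 3, 4],
]
print(max_pool_2d(fmap))  # [[6, 5], [7, 9]]
```

Each output value summarizes R = 4 activations, so the 4 x 4 map is reduced to 2 x 2 while the strongest response in each patch is preserved.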
Fig. 3 Facial expression dataset samples
An SVM classifies the object’s presence within each candidate region proposal using the extracted CNN features. The support vector machine (SVM) is a supervised machine learning method whose two main applications are classification and regression. SVM classification is concerned with constructing a line or hyperplane that separates the classes. The SVM classifier can be trained using the features of happiness, contempt, and disgust. During the testing phase, the features of the faces in the test dataset are extracted and passed to this trained SVM to identify a happy, disgusted, or contemptuous face. An SVM with a Radial Basis Function (RBF) kernel replaces the final layer for classification. Over-fitting can happen as a result of the large number of parameters. During the training stage, the SVM classifier is trained using happy, disgust, and contempt features derived from the input facial expression dataset. The trained SVM then evaluates the features extracted from each frame of the video dataset to determine which of these expressions is present in the frame. The level of these features can indicate the level of depression, as follows. If the student’s happy features appear in few frames, it suggests that he or she is unhappy, even if the student is unaware of his or her emotional state. If there are also signs of contempt or disgust in the video, it could suggest that the person is in a bad mood. Together these represent the face’s overall negativity, and as a result the student may be labeled as depressed. The measure of depression may be deemed low if the presence of happy features is strong and the prevalence of contempt and disgust features is low.
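The mapping from per-frame expression labels to a depression level can be sketched as follows. This is only an illustration of the rule described in this section: the label names, the function names, and in particular the two thresholds are hypothetical placeholders, since the chapter does not report the cut-off values it uses.

```python
from collections import Counter

# Labels treated as negative expressions (assumed label names).
NEGATIVE_LABELS = {"sad", "contempt", "disgust"}


def negativity_ratio(frame_labels):
    """Fraction of frames whose predicted expression is negative."""
    counts = Counter(frame_labels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("at least one classified frame is required")
    negative = sum(counts[label] for label in NEGATIVE_LABELS)
    return negative / total


def depression_level(frame_labels, medium_at=0.4, high_at=0.7):
    """Map the negativity ratio to 'low', 'medium' or 'high'.

    medium_at and high_at are placeholder thresholds, not values from the study.
    """
    ratio = negativity_ratio(frame_labels)
    if ratio >= high_at:
        return "high"
    if ratio >= medium_at:
        return "medium"
    return "low"


# Hypothetical per-frame SVM outputs for one recorded session.
labels = ["happy"] * 2 + ["sad"] * 5 + ["disgust"] * 3
print(depression_level(labels))  # high
```

Fewer happy frames automatically raise the ratio, which mirrors the statement that fewer pleasant features means more negative features.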
4 Results and Analysis
This section analyzes the findings obtained after implementing the suggested model in Python. Various performance parameters were used for the evaluation [16, 17, 18]. A machine learning model’s accuracy is a metric for determining which model is best at identifying correlations and patterns between variables in a dataset based on the input, or training, data [19, 20, 21, 22]. Our proposed R-CNN (CNN + SVM) algorithm was evaluated and compared with other machine learning algorithms, and it generated the highest accuracy of 92.5%, as shown in Fig. 4. KNN produced an accuracy of 88.9%, the Decision Tree classifier 87.5%, and the SVM method 83.5%. The Naive Bayes algorithm gave the lowest accuracy of 81.2%. Any machine learning model’s effectiveness is measured by how well it performs in comparison to other models already in use. An R-CNN is primarily used to detect objects. A plain CNN can only provide the object’s class, not its location. CNN bounding box regression cannot work well if there are multiple objects in the visual field, due to interference. In the R-CNN, on the other hand, regions are located using a selective search strategy and then resized to the same size before being fed to a CNN for classification and bounding box regression. As a result, the suggested classification
Fig. 4 Classification accuracy of proposed model compared to previous works
methodology was compared with other popular classification techniques such as SVM, KNN, Decision Tree, and Naive Bayes, with R-CNN (CNN + SVM) being more accurate and outperforming all others (Fig. 5). Latency is a metric used in machine learning to assess how quickly a model responds in a given situation. If only one unit of data is processed at a time, latency is the amount of time it takes to process one unit of data. An analysis of execution-time latency was performed over the face datasets, comparing the R-CNN approach to classification with the other methods. The R-CNN algorithm, which is based on CNN + SVM, proved to be efficient and fast, with very little latency. According to the results, the proposed classification model appears to be quite successful and accurate. It can support and guide physicians in the right treatment of mental health issues and can be used to help diagnose depression. The most crucial question, then, is not which detector is the most effective, which may be impossible to answer. The key question is which detector and configuration provide the optimal combination of speed and accuracy for a given application.
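Latency as defined above, the time taken to process one unit of data, can be measured with a sketch like the following. The function name, the warm-up convention, and the stand-in predictor are assumptions for illustration; the chapter does not describe its timing procedure.

```python
import time
from statistics import mean


def measure_latency(predict, samples, warmup=1):
    """Return the mean time in seconds that predict() needs per single sample."""
    # Warm-up calls are excluded so one-off setup costs do not distort the result.
    for sample in samples[:warmup]:
        predict(sample)
    timings = []
    for sample in samples:
        start = time.perf_counter()
        predict(sample)  # one unit of data at a time
        timings.append(time.perf_counter() - start)
    return mean(timings)


def dummy_classifier(features):
    """Stand-in for a trained model; a real classifier would be passed instead."""
    return "happy" if sum(features) > 0 else "sad"


latency = measure_latency(dummy_classifier, [[0.2, 0.1], [-0.4, 0.3], [0.5, 0.5]])
print(f"mean latency per frame: {latency:.6f} s")
```

Running the same harness over each candidate classifier with identical samples gives latencies that can be compared directly.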
Fig. 5 Comparison of execution times of the proposed model and previous works
Table 1 Performance metrics analysis for developed depression analysis model
Model      Precision (%)   F-score (%)   Recall (%)
KNN        88.50           88.10         88.23
DT         87.41           87.20         87.30
SVM        84.30           84.10         84.00
NB         81.2            81.00         81.42
CNN-SVM    92.4            92.2          91.5
Since it is assumed that just one object of interest will predominate in a particular region, which reduces interference, R-CNN forces the CNN to concentrate on one region at a time [23, 24, 25]. The image is first divided into 2000 region proposals, and the CNN (ConvNet) is then applied to each region independently. Once the size of the regions has been fixed, each region is passed to the neural network. The time this process takes compared to other algorithms is by far its greatest benefit. The CNN receives each component of the image separately. Training requires 84 h, and prediction takes 47 s. Table 1 highlights the metrics against which the proposed model was evaluated. As can be seen, the maximum precision, recall, and F-score were found to be 92.4, 92.2, and 91.5%, respectively, while Naive Bayes showed lower performance than the other models.
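For reference, precision, recall, and F-score are derived from the counts of true positives, false positives, and false negatives. The sketch below uses the standard definitions; the counts in the example are hypothetical and are not taken from the study.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall and F1 from true-positive, false-positive and false-negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts for one class.
p, r, f1 = precision_recall_f1(tp=90, fp=10, fn=30)
print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```

Because F1 is a harmonic mean, it always lies between the precision and the recall it is computed from.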
5 Conclusion and Future Work
Depression is a prevalent, incapacitating, and frequently challenging condition. In today’s world, a rising number of young people suffer from depression and other mental health risks. Over time, there has been a significant improvement in our knowledge of the neurology of depression and its treatment, which has greatly increased our capacity to manage the illness. Yet a number of challenges remain, the greatest of which is identifying the disease as early as possible. An automated predictive system can help to address this issue. In this research, a hybrid deep learning model is developed and evaluated to help assess depression symptoms with high accuracy by analyzing facial expression features. The validated model generated the best performance when implemented and compared with existing models. It recorded a maximum accuracy of 92.5% and a minimum response time of 2.23 s. The precision, recall, and F-score values were 92.4, 92.2, and 91.5%, respectively. Thus, it can be concluded that the suggested model can be used effectively by medical professionals to assess the risk of teenage depression. Our study considers only the most recent images of the students. For a more precise diagnosis of depression, the student’s history should also be taken into account. Therefore, additional photos of the same student captured at various times can be used as input in subsequent work. This might prove useful in analysing the
student’s mental states in the past and present, comparing them, and determining how severe their depression is. The questionnaire can also be improved to account for the respondent’s prior experiences: the questions may be designed around the subject’s prior experiences with mental illness.
References
1. Katikalapudi, R., Chellappan, S., Montgomery, F., Wunsch, D., & Lutzen, K. (2012). Associating internet usage with depressive behaviour among college students. IEEE Technology and Society Magazine, 31(4), 73–80.
2. Jena, L., Mishra, S., Nayak, S., Ranjan, P., & Mishra, M. K. (2021). Variable optimization in cervical cancer data using particle swarm optimization. In Advances in electronics, communication and computing (pp. 147–153). Springer.
3. Jena, L., Kamila, N. K., & Mishra, S. (2014). Privacy preserving distributed data mining with evolutionary computing. In Proceedings of the International Conference on Frontiers of Intelligent Computing: Theory and Applications (FICTA) 2013 (pp. 259–267). Springer.
4. Girard, J. M., Cohn, J. F., Mahoor, M. H., Mavadati, S., & Rosenwald, D. P. (2013). Social risk and depression: Evidence from manual and automatic facial expression analysis. In 10th IEEE International Conference and Workshops on Automatic Face and Gesture Recognition (FG) (pp. 1–8).
5. Alghowinem, S., Goecke, R., Cohn, J. F., Wagner, M., Parker, G., & Breakspear, M. (2015). Cross-cultural detection of depression from nonverbal behaviour. In 11th IEEE International Conference and Workshops on Automatic Face and Gesture Recognition (FG) (vol. 1, pp. 1–8).
6. Pampouchidou, A., Simantiraki, O., Vazacopoulos, C-M., Chatzaki, C., Pediaditis, M., Maridaki, A., Marias, K., Simos, P., Yang, F., Meriaudeau, F., & Tsiknakis, M. (2017). Facial geometry and speech analysis for depression detection. In 39th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC) (pp. 1433–1436).
7. Harati, S., Crowell, A., Mayberg, H., Kong, J., & Nemati, S. (2016). Discriminating clinical phases of recovery from major depressive disorder using the dynamics of facial expression. In 38th Annual International Conference of the Engineering in Medicine and Biology Society (EMBC) (pp. 2254–2257).
8. Cohn, J. F., Kruez, T. S., Matthews, I., Yang, Y., Nguyen, M. H., Padilla, M. T., Zhou, F., & De la Torre, F. (2009). Detecting depression from facial actions and vocal prosody. In 3rd International Conference on Affective Computing and Intelligent Interaction and Workshops (pp. 1–7).
9. Tasnim, M., Shahriyar, R., Nahar, N., & Mahmud, H. (2016). Intelligent depression detection and support system: Statistical analysis, psychological review and design implication. In 18th International Conference on e-Health Networking, Applications and Services (Healthcom) (pp. 1–6).
10. Pampouchidou, A., Marias, K., Tsiknakis, M., Simos, P., Yang, F., & Meriaudeau, F. (2015). Designing a framework for assisting depression severity assessment from facial image analysis. In International Conference on Signal and Image Processing Applications (ICSIPA) (pp. 578–583).
11. Mishra, S., Tripathy, H. K., & Panda, A. R. (2018). An improved and adaptive attribute selection technique to optimise dengue fever prediction. International Journal of Engineering & Technology, 7, 480–486.
12. Meng, H., Huang, D., Wang, H., Yang, H., Al-Shuraifi, M., & Wang, Y. (2013). Depression recognition based on dynamic facial and vocal expression features using partial least square regression. In Proceedings of the 3rd ACM international workshop on Audio/Visual emotion challenge (pp. 21–30).
13. Rath, M., & Mishra, S. (2020). Security approaches in machine learning for satellite communication. In Machine learning and data mining in aerospace technology (pp. 189–204). Springer.
14. Dutta, A., Misra, C., Barik, R. K., & Mishra, S. (2021). Enhancing mist assisted cloud computing toward secure and scalable architecture for smart healthcare. In Advances in communication and computational technology (pp. 1515–1526). Springer.
15. Sahoo, S., Das, M., Mishra, S., & Suman, S. (2021). A hybrid DTNB model for heart disorders prediction. In Advances in electronics, communication and computing (pp. 155–163). Springer.
16. Mishra, S., Mallick, P. K., Tripathy, H. K., Jena, L., & Chae, G. S. (2021). Stacked KNN with a hard voting predictive approach to assist the hiring process in IT organisations. The International Journal of Electrical Engineering & Education, 0020720921989015.
17. Mishra, S., Jena, L., Tripathy, H. K., & Gaber, T. (2022). Prioritised and predictive intelligence of things enabled waste management models in a smart and sustainable environment. PLoS ONE, 17(8), e0272383.
18. Tripathy, H. K., Mishra, S., Suman, S., Nayyar, A., & Sahoo, K. S. (2022). Smart COVID-shield: An IoT driven reliable and automated prototype model for COVID-19 symptoms tracking. Computing, 1–22.
19. Praveen, S. P., Srinivasu, P. N., Shafi, J., et al. (2022). ResNet-32 and FastAI for diagnoses of ductal carcinoma from 2D tissue slides. Scientific Reports, 12, 20804. https://doi.org/10.1038/s41598-022-25089-2
20. Sahoo, P. K., Mishra, S., Panigrahi, R., Bhoi, A. K., & Barsocchi, P. (2022). An Improvised Deep-Learning-Based Mask R-CNN Model for Laryngeal Cancer Detection Using CT Images. Sensors, 22(22), 8834.
21. Chakraborty, S., Mishra, S., & Tripathy, H. K. (2023, January). COVID-19 Outbreak Estimation Approach Using Hybrid Time Series Modelling. In Innovations in Intelligent Computing and Communication: First International Conference, ICIICC 2022, Bhubaneswar, Odisha, India, December 16–17, 2022, Proceedings (pp. 249–260). Springer International Publishing.
22. Raghuwanshi, S., Singh, M., Rath, S., & Mishra, S. (2022). Prominent cancer risk detection using ensemble learning. In Cognitive Informatics and Soft Computing: Proceeding of CISC 2021 (pp. 677–689). Springer Nature Singapore.
23. Patnaik, M., & Mishra, S. (2022). Indoor positioning system assisted big data analytics in smart healthcare. Connected e-Health: Integrated IoT and Cloud Computing (pp. 393–415). Springer International Publishing.
24. Sivani, T., & Mishra, S. (2022). Wearable devices: Evolution and usage in remote patient monitoring system. Connected e-Health: Integrated IoT and Cloud Computing (pp. 311–332). Springer International Publishing.
25. Verma, S., & Mishra, S. (2022). An exploration analysis of social media security. In Predictive Data Security using AI: Insights and Issues of Blockchain, IoT, and DevOps (pp. 25–44). Springer Nature Singapore.
Telemedicine Enabled Remote Digital Healthcare System
Shambhavi Singh, Nigar Hussain, Sushruta Mishra, Biswajit Brahma, and Akash Kumar Bhoi
Abstract This article presents Telemedicine, a remote healthcare system. Telemedicine is a platform that connects the patient and the doctor. The platform is a component of the Internet of Medical Things (IoMT): it allows numerous medical sensors to interact with a server, either directly or indirectly, via connection technologies such as GSM, Bluetooth, or Wi-Fi. In this paper we discuss the telemedical ecosystem. The system collects data from various sensors and delivers it through an Arduino board used in conjunction with one of the aforementioned technologies, and the interface is built using Matlab and C#. As an extra benefit, the three communication techniques used to connect the medical sensors to the server are compared.
Keywords Telemedicine · Internet of Things (IoT) · Internet of Medical Things (IoMT) · Radio Frequency (RF) · Global System for Mobile communication (GSM)
S. Singh · N. Hussain · S. Mishra (corresponding author)
Kalinga Institute of Industrial Technology, Bhubaneswar, Odisha, India
B. Brahma
McKesson Corporation, San Francisco, CA, USA
A. K. Bhoi
KIET Group of Institutions, Uttar Pradesh, Ghaziabad, India
Sikkim Manipal University, Gangtok, Sikkim, India
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023. P. Barsocchi et al. (eds.), Enabling Person-Centric Healthcare using Ambient Assistive Technology, Studies in Computational Intelligence 1108, https://doi.org/10.1007/978-3-031-38281-9_14
1 Introduction
One source estimated that by 2021 the global health sector would be valued at $136.8 billion. The Internet of Medical Things (IoMT) promises to protect people’s health and safety while also lowering future healthcare costs. The healthcare industry has adopted the Internet of Things (IoT) more slowly than most other sectors, though. IoMT was suggested as a way to lessen the work required for patient monitoring, updating, and alerting. Additionally, it can give clinicians statistics and real data that they can use to understand and pinpoint the origins of medical conditions [1]. These days, telemedicine, a term that is still relatively new, is widely employed. Telemedicine might entail receiving advice from a doctor remotely via a chat platform, scheduling consultations with doctors, or employing sophisticated sensors to identify anomalies in a patient’s body. Bluetooth is used by several fitness and wearable headset apps to interact with devices. Wireless LAN technology is another alternative for linking PCs. Compared with other options, however, deploying it in this manner is considered a power-hungry strategy. Cellular data, such as GPRS, is notorious for its sluggish data rates, although it is nevertheless employed in some healthcare settings [2]. All remote healthcare solutions are built on the Internet of Things, which may help us assess blood pressure or deliver data on a specific organ within the body. This research presents a model for understanding how memory works. It focuses on the function of perceptual signals in recollection [3]. The Internet of Things (IoT) is a promising field that is altering the way we live by combining breakthroughs in wireless networking with small sensors, devices, actuators, and embedded microprocessors/controllers. These technologies have the capacity to satisfy the demands of a wide range of applications that exist in the world today.
Wireless Sensor Networks are based on the perception and processing of information, which adds to the wide range of applications that are conceivable. Disease detection devices were previously only accessible in hospitals; they were big and cumbersome, and they required a sophisticated circuit system with significant power consumption and skill to operate [4]. The rise of the semiconductor industry has resulted in smaller, faster sensors and microcontrollers with lower power consumption and cost. There is no one-size-fits-all approach to IoT because of the diversity in smart sensors’ communication, networking, and architecture in the IoT ecosystem. The cloud is required in the IoT ecosystem to access sensed data for additional processing and decision making; the ecosystem would be incomplete without it. An electronic healthcare monitoring system is an extension of a hospital medical system in which a patient’s vital signs are monitored remotely with roughly the same efficiency as in a traditional medical system. Telemedicine refers to the delivery of healthcare services via digital and communication networks. Cross-border collaboration between healthcare professionals and patients is now available for activities such as patient management, health promotion, diagnosis, and follow-up. The telemedicine platform began as a basic
radiography picture transmission service between two locations [5]. Remote health monitoring is provided by devices such as fitness bands, remote heart rate monitors, and gadgets that track and report on implanted artificial organs in the human body. Even today, the end user’s experience with such technology is challenging. Because attaching sensors may be difficult for the end user, the devices should be simple to use and user-friendly. Devices for physiological parameter analysis are still being developed using open-source hardware and software. This article presents a platform for IoMT that enables patients to consult a doctor from a distance. The Internet of Things (IoT) is a critical area that has been forecast as a potential technological driver for automation and control in nearly every industry, and it has been the topic of much investigation in recent years. IoT-enabled health monitoring devices in rural regions have a substantial influence on the present healthcare system. The cost of medical treatment is rising, and the cost of chronic diseases is much higher, affecting people’s quality of life. The number of elderly people is steadily increasing, which affects the availability of medical facilities and services [6]. Wearable technology and other medical applications have taken advantage of various IoT benefits in the healthcare business. Various hospitals use IoT devices to track the whereabouts of their employees, patients, and medical care teams. Owing to an IoT-based e-healthcare system, medical practitioners can access information from anywhere via applications on their mobile devices, which lets them treat and monitor patients more successfully. To support the patient by supplying as much diagnostic data as feasible via sensors, any healthcare system meant for remote monitoring must guarantee continual data analysis.
Remote monitoring reduces the need for back-and-forth doctor appointments and aids in crisis situations. Such a service is beneficial for elderly or chronically unwell people who wish to avoid a protracted hospital stay. A variety of wireless sensors and hardware may be used to gather and transmit data signals, and processors can be programmed to automatically send and receive data and alerts in order to assess sensor data. All connected sensors send data in an unstructured manner that is challenging to manage and analyze. As a result, more complex and hybrid database management systems (DBMS) are required. The system should offer wearable sensors that provide real-time, long-term monitoring, a long smartphone battery life by design, and aid for the chronically ill and the elderly, and it should have few buttons and be easy to use [7] (Fig. 1).
Fig. 1 IoT in healthcare space
1.1 Main Highlights of the Work Are as Follows
• The first section of the paper describes the IoT in healthcare and the motivation behind the research work.
• The second section of the paper contains the overall summary of the paper in brief.
• The third part of the paper includes the background and related work done in the telemedicine field prior to this paper.
• The fourth part of the paper describes our proposed telemedical ecosystem model.
• The fifth part of the paper describes the implementation of the telemedical ecosystem model.
• The sixth part of the paper discusses the various technologies used in this field.
• The seventh part describes the superiority of the proposed model and the constraints related to it.
• The eighth part of the paper describes the future scope of telemedicine.
• The ninth part of the paper describes the various obstacles faced in telemedicine.
• The tenth part describes the specific challenges faced in the field of telemedicine.
• The eleventh part of the paper describes the current situation of telemedicine.
• The last part of the paper contains the overall sum-up of the research.
2 Background and Related Work
The foundation for the development of the telemedicine platform was the transmission of radiography pictures between two locations in the 1940s. In 1950, a Canadian scientist developed the idea and constructed the first sharing hub for Montreal. Later, video conferencing was used in hospitals to communicate with specialists about remote surgical instructions and to address patient issues. One of the things that can enable telemedicine is the Internet of Things, which allows remote health monitoring via wearables such as fitness bands, remote heart rate monitors, and gadgets that monitor and deliver updates on implanted artificial organs in the human body. In hospitals today, it is common to find “smart beds” that track patients’ vitals as they sleep and can be used to send alarms in an emergency or adjust the
bed’s comfort level without consulting the personnel. As of the publication date of this study, the US spends an estimated $300 billion annually on medical innovation. Systems of this kind should be easy to use and user-friendly, since attaching sensors may be a challenge for the end user. Devices that assess physiological parameters are still being created using open-source and generally available hardware and software. Work currently in progress targets the limitations of the frameworks that existing products use. Before telemedicine can be used widely, numerous obstacles remain to be removed. Currently, in order to administer medications and treat patients, specialists in the United States must obtain licensure in each state. Furthermore, a legal priority system for remote danger has not yet been established. Additionally, reimbursement methods are not yet fully understood. The principal telemedicine procedures that consistently receive reimbursement are tele-radiology examinations. The lack of sufficient data transmission capacity in rural areas is a significant barrier to telemedicine that persists even as telecom companies across the country upgrade to support the national information infrastructure (NII). Additionally, the ongoing costs of telecom services and the underlying costs of hardware are enormous. However, the Digital Imaging and Communications in Medicine (DICOM) standard and the recent acceptance and spread of picture archiving and communication systems (PACS) in hospitals can provide a basis to support the use of telemedicine frameworks. In order to standardize the digital transmission of medical images, the American College of Radiology and the National Electrical Manufacturers Association began creating the DICOM standard in 1983. The resulting ACR-NEMA standard was initially published in 1985 and was updated to version 2.0 in 1988.
The name was changed to Digital Imaging and Communications in Medicine with the introduction of version 3.0 in 1993. In addition to information objects for images, DICOM defines information objects for patients, studies, and other entities. DICOM aims to promote interoperability and improve workflow efficiency between imaging equipment and other information systems in healthcare facilities all over the world. Every large vendor of diagnostic medical imaging equipment has incorporated the standard into the design of its devices, and most contribute actively to its improvement. The vast majority of professional societies around the world also support the standard and participate in its development.
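As a small illustration of how a fixed file layout supports interoperability, the following Python sketch checks whether a byte string begins the way a DICOM Part 10 file does, that is, with a 128-byte preamble followed by the four-byte prefix DICM. The function name is hypothetical, and the check is only a plausibility test, not a parser or a conformance validator.

```python
PREAMBLE_LENGTH = 128   # DICOM Part 10 files start with a 128-byte preamble
DICOM_PREFIX = b"DICM"  # followed by this four-byte prefix


def looks_like_dicom(data: bytes) -> bool:
    """Return True if the bytes carry the DICOM Part 10 prefix.

    Hypothetical helper for illustration; it does not read any
    data elements and does not validate the rest of the file.
    """
    end = PREAMBLE_LENGTH + len(DICOM_PREFIX)
    return len(data) >= end and data[PREAMBLE_LENGTH:end] == DICOM_PREFIX
```

A receiving system could run such a check before handing a file to a full DICOM library.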
3 Telemedical Ecosystem Model and Interfaces
This section covers the network model and its interfaces. As shown in Fig. 2, the system is viewed as a platform that uses servers to connect a large number of patients with a large number of clinicians. The system consists of two GUIs, one for the patient and one for the doctor. The doctor’s GUI enables doctors to monitor their patients’ health and advise them on the best course of treatment [8]. A conventional chat application is also available in case the patient has any queries for the doctor.
S. Singh et al.
Fig. 2 Telemedicine network framework
MATLAB is used to design the GUIs because of its ability to work with large amounts of data and complex matrix calculations. However, the server connection between the doctor and the patient must also carry other types of data, such as the patient picture, the chat, and the medical data mentioned above. A high-level language, C# in Visual Studio, was therefore used to communicate with the SQL server and to send this more complex data through a dedicated database [9, 10, 11]. Because the system is many-to-many (many patients can be examined through the system and many doctors are available), it comprises two databases, one for the doctors and one for the patients. Every entry in the database tables of the system represents a session between a doctor and a patient. A session is assigned automatically when the patient’s session becomes available and can be changed by the doctor whether the two parties are online or offline [12, 13, 14]. The dialogue box also features visible and active flags that allow users to send messages to each other asking for guidance or making requests. The doctor begins the session by clicking the visible flag, and the active flag indicates that the session is ongoing. Other columns in the database hold additional data, such as a distinct Visit ID that stops two doctors from attending the same session. The “Patient Image” column stores the current patient image as a plain binary array. Because of space restrictions, only one photo should be supplied at the start of each conversation. The final columns track the chat between the doctor and the patient as well as each party’s online or offline state, using two strings and two flags, one of each per party. The goal of this study is to give the database engine the ability to choose which doctor to call for each patient. The C# code discussed later is used to accomplish this. For simplicity, the programme was designed to serve as both the doctor portal and the patient portal, with certain variations depending on the user’s selected role [15]. A webcam, a keyboard, and a medical kit that measures the patient’s vital signs all provide input to the patient portal. The application then transmits all of this data to the database server over the cloud. The database server maintains the data and makes it available to doctors.
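The session record described above can be sketched as follows. This is a minimal illustration in Python with SQLite rather than the C# and SQL Server implementation used in the study, and it collapses the two databases into a single hypothetical sessions table. All table, column, and function names are assumptions made for the example.

```python
import sqlite3


def create_schema(conn):
    # One row per doctor-patient session, as described in the text.
    # Column names are illustrative assumptions.
    conn.execute("""
        CREATE TABLE sessions (
            visit_id       INTEGER PRIMARY KEY,        -- distinct Visit ID
            patient_id     INTEGER NOT NULL,
            doctor_id      INTEGER,                    -- NULL until a doctor attends
            visible        INTEGER NOT NULL DEFAULT 1, -- session offered to doctors
            active         INTEGER NOT NULL DEFAULT 0, -- session ongoing
            patient_image  BLOB,                       -- one photo per conversation
            patient_chat   TEXT NOT NULL DEFAULT '',
            doctor_chat    TEXT NOT NULL DEFAULT '',
            patient_online INTEGER NOT NULL DEFAULT 0,
            doctor_online  INTEGER NOT NULL DEFAULT 0
        )
    """)


def open_session(conn, patient_id, image=None):
    """Create a visible session for a patient and return its Visit ID."""
    cur = conn.execute(
        "INSERT INTO sessions (patient_id, patient_image, patient_online) "
        "VALUES (?, ?, 1)",
        (patient_id, image),
    )
    conn.commit()
    return cur.lastrowid


def claim_session(conn, visit_id, doctor_id):
    """Mark the session active for one doctor.

    Returns False if another doctor has already taken this Visit ID.
    """
    cur = conn.execute(
        "UPDATE sessions SET doctor_id = ?, active = 1, doctor_online = 1 "
        "WHERE visit_id = ? AND doctor_id IS NULL",
        (doctor_id, visit_id),
    )
    conn.commit()
    return cur.rowcount == 1
```

Because the update only succeeds while no doctor is recorded for the Visit ID, a second doctor who tries to attend the same session is refused, which is the behaviour the Visit ID column is meant to guarantee.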
4 Implementation of the Telemedical Ecosystem Model
This section illustrates how the network model may be used to link a doctor with a patient over various wireless communication technologies, to monitor the patient, and to deliver consultation messages in real time [16]. Figure 3 shows the network model that helps to keep the network running smoothly. To give a thorough picture of the communication channels available in this network, a network model using three wireless communication technologies has been built. The device has Wi-Fi, GSM, and Bluetooth connectivity and, with heart rate, humidity, temperature, blood pressure, and alcohol sensors, can assess four different medical vital indicators. For testing, three Arduino ATmega 32 boards are used. Each controller has one UART (RX and TX) and a solar cell that keeps the system running even when the power is off. A Wi-Fi module relays the data from the humidity sensor. The heart-rate sensor is connected to the Bluetooth and GSM modules. The temperature and alcohol sensors are also connected through GSM [17]. The device contains a push button that enables the patient to call for an ambulance if necessary. The system uses the ESP8266 (ESP-01) Wi-Fi module, the SIM908 GSM module, and the HC-05 Bluetooth module, together with an alcohol sensor, an LM35 temperature sensor, a heart-rate sensor, and a humidity sensor. The hardware for the system is shown in Fig. 4. Several communication technologies are used so that their effectiveness can be tested to determine which one best satisfies the requirements of the planned system.
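The sensor-to-module wiring described above can be summarized in a short Python sketch. The mapping follows the text, while the message format and function names are hypothetical. The temperature conversion relies on the LM35 scale factor of 10 mV per degree Celsius and assumes a 10-bit ADC with a 5 V reference, which the chapter does not state.

```python
# Which radio module carries each sensor, following the description in the text.
SENSOR_CHANNELS = {
    "humidity": ["wifi"],
    "heart_rate": ["bluetooth", "gsm"],
    "temperature": ["gsm"],
    "alcohol": ["gsm"],
}

ADC_STEPS = 1024   # assumption: 10-bit ADC
V_REF = 5.0        # assumption: 5 V analog reference


def lm35_celsius(adc_reading, v_ref=V_REF):
    """Convert a raw ADC reading from an LM35 to degrees Celsius."""
    volts = adc_reading * v_ref / ADC_STEPS
    return volts * 100.0  # LM35 outputs 10 mV per degree Celsius


def build_messages(sensor, value):
    """Return one message per channel assigned to the sensor (hypothetical format)."""
    return [
        {"channel": channel, "sensor": sensor, "value": value}
        for channel in SENSOR_CHANNELS[sensor]
    ]
```

For example, a heart-rate reading produces two messages, one for the Bluetooth module and one for the GSM module, which is what allows the same measurement to be compared across technologies.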
Fig. 3 Flowchart for telemedicine ecosystem model
5 Discussion
The three technologies are contrasted in Table 1. To compare the three approaches, transmission latency, throughput, price, range, and infrastructure were taken into account. GSM can cover very long distances (1 m–40 km, depending on base station type). However, it offers only low data rates (on the order of kbps) and requires cellular infrastructure. Additionally, compared with the other technologies, the transmission cost is high and the transmission delay is low [18]. With proper routing and internet infrastructure, Wi-Fi can operate at intermediate distances of up to 90 m for outdoor networks and 1 to 40 m for indoor networks. Its transmission delay is large, but high data rates (54 Mbps) are possible at a reasonable cost for the router and internet subscription. The range of Bluetooth is limited (1–10 m), but no infrastructure is necessary, and moderate data rates (24 Mbps) can be achieved with a small transmission delay and a low price. This study suggests using Bluetooth in networks where all system components are in the same room and an internet connection is not required [19]. In contrast, Wi-Fi is advised for a system that needs internet access, since it enables high data rates with relatively little infrastructure and cost.
Fig. 4 A sample implementation of the system model
Table 1 Comparison among relevant technologies used in telemedicine
Criterion          | GSM                          | Wi-Fi                             | Bluetooth
Transmission delay | Moderate                     | High                              | Low
Throughput         | Low                          | High                              | Moderate
Cost               | High                         | Moderate                          | Low
Range (distance)   | Very long                    | Moderate                          | Limited
Infrastructure     | Cellular network is required | Routing and Internet are required | Not required
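The recommendation in the discussion can be expressed as a simple decision rule. The Python sketch below encodes the ratings of Table 1 and the two cases the study names: Wi-Fi when internet access is needed, and Bluetooth when all components share a room and no internet connection is required. The fallback to GSM for the remaining case is an assumption based on its long range, not a recommendation made in the text, and the function and variable names are hypothetical.

```python
# Ratings transcribed from Table 1.
TABLE_1 = {
    "GSM": {"delay": "moderate", "throughput": "low", "cost": "high",
            "range": "very long", "infrastructure": "cellular network"},
    "Wi-Fi": {"delay": "high", "throughput": "high", "cost": "moderate",
              "range": "moderate", "infrastructure": "routing and internet"},
    "Bluetooth": {"delay": "low", "throughput": "moderate", "cost": "low",
                  "range": "limited", "infrastructure": "none"},
}


def recommend_technology(same_room: bool, needs_internet: bool) -> str:
    """Pick a link technology following the discussion's guidance."""
    if needs_internet:
        return "Wi-Fi"       # advised when internet access is required
    if same_room:
        return "Bluetooth"   # advised for same-room, offline networks
    return "GSM"             # assumption: only long-range option left
```

A deployment tool could call recommend_technology(same_room=True, needs_internet=False) and then look up the chosen entry in TABLE_1 to report the expected cost and range.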
6 Superiority of Telemedical-Ecosystem Model
Adopting the most recent telemedicine ideas can benefit your clinic in many ways. Telemedicine can improve patient access to healthcare services, cut healthcare expenses, increase efficiency and income, and ultimately result in patients who are happier and healthier and stay with your organization [20].
1. More Convenient and Accessible Patient Care
2. Telemedicine Cost Effectiveness & Healthcare Savings
3. Virtual Care Can Expand Your Patient Base
4. Virtual Care is Cost-Efficient
5. Engage Patients and Get Better Patient Outcomes
While many in the industry remain optimistic about the possibilities of virtual care, others have reservations. Virtual care technology has improved tremendously, albeit it is far from flawless. The regulatory framework has struggled to keep up with the rapid advancement of telehealth technologies [21].
The most noticeable disadvantage of virtual care is the continuing need for more precise, simplified laws and standards for telehealth practice, which would facilitate adoption. The main drawbacks are:
1. Regulatory and Industry Obstacles.
2. The physical examination is restricted.
3. Telemedicine Equipment and Technology.
7 Future Scope
7.1 Telemedicine Will Develop into a Common Service
Patients who have grown used to the level of access that telemedicine offers will not go back. Telemedicine has developed steadily over the past ten years, becoming an important and disruptive force not only technologically but also socially and economically, because it offers solutions to pressing problems. These include the rising demand for health care, the aging of the population, and the need to manage vast volumes of data.
7.2 Will Develop Medical Services for Connected Health in Remote Areas
The development of new technologies is a key component in the expansion of telemedicine and in applying the ideas of globality and interoperability in healthcare organizations. Ultimately, it enables and encourages structured work environments that go beyond the simple use of telemedicine for healthcare services, particularly in geographically isolated areas. This trend, together with telemedicine, will make it possible to provide healthcare in remote areas and to introduce a multidisciplinary approach that strengthens the role of the primary care physician.
7.3 Remote Patient Care Will Be a Reality
Remote medical services delivered through connections to professionals in other areas are one of the most frequently used applications of telemedicine. In most nations, specialized medical professionals such as physicians are concentrated in a single area, forcing patients in remote areas to travel great distances to access the care they require.
7.4 Telemonitoring
Telemonitoring will allow biological, physiological, and biometric parameters to be incorporated into the follow-up of patients (often chronic patients). It can be a crucial tool for giving patients more control over their own health, because it enables them to participate actively in their care while requiring the least amount of hospital time.
7.5 Teleconsultation
Teleconsultation benefits patients by offering timely diagnosis, therapy, or remote patient monitoring. However, it requires the sharing of clinical data. Services in this category include the transmission of X-rays or comparable images (teleradiology), the transmission of laboratory results or electronic health records, and consultations in specialties such as dermatology, psychiatry, and cardiology, among others.
8 Main Obstacles Faced in Telemedicine
Healthcare has been reinvented with the implementation of telemedicine services. The pandemic has hastened the expansion of an already fast-growing sector. The telemedicine sector is anticipated to grow from $3 billion to $250 billion as spending on virtual care increases and new government rules extend remote healthcare capabilities [22]. However, there are roadblocks to telemedicine’s effectiveness. This section examines the obstacles of telemedicine and looks for answers to present issues. The pandemic has not only increased telemedicine adoption but also created a need for telemedicine services. In a 2020 McKinsey poll, over half of patients preferred telemedicine over canceling medical appointments. The pandemic has also prompted individuals who were previously hesitant to try telemedicine. However, several obstacles remain to be overcome.
8.1 Misdiagnoses
From 2014 to 2018, 66% of telemedicine malpractice claims were attributable to misdiagnosis, according to CRICO’s CBS National Database. Misdiagnosis is a severe issue that slows recovery and harms patient health. High misdiagnosis rates affect both patient health and healthcare provider costs: wrong diagnoses are costly in medication and lead to inappropriate and potentially dangerous treatment adjustments. Telemedicine is part of healthcare and shares its common problems, and misdiagnosis is not the only drawback of telemedicine screening.
8.2 Compensation and Parity
Telemedicine has two significant legal limitations, which concern payment parity and coverage parity. Laws governing payment parity and coverage parity are subject to both federal and state authority. Payment parity laws require that on-site and remote medical services be compensated equally. Coverage parity laws require that telemedicine services be covered. Coverage parity has greatly improved: 43 states had implemented service parity (also known as utility parity) by 2021, whereas only about 20 states had comparable coverage in 2019. When it comes to payment parity, though, the picture is different. Only 14 states had implemented payment parity by 2021, ensuring equal compensation for telemedicine and in-person medical consultations. Legal ambiguity jeopardizes financial investments and distracts healthcare providers from their work.
8.3 Cybersecurity
The comprehensive digitization of personal data raises the issue of data security. According to a 2021 study, the healthcare sector has ranked first in reported data breaches since 2017. Medical records contain sensitive and private information, which makes them a valuable target for hackers. New technologies, such as telemedicine and video communication, create new opportunities for data breaches. As a result, telemedicine integration requires new security technologies, encryption methods, and data interoperability platforms, and it requires medical practitioners to spend more on security. These additional costs, notably for security, are viewed as a key barrier to telehealth deployment.
8.4 Privacy Rule and Telemedicine
Any conversation about health data security will include the HIPAA Privacy Rule. The strict HIPAA regulations are designed to protect privacy and sensitive personal information, with severe and costly consequences for violators. For example, the maximum penalty for violating the privacy regulations is $1.5 million per year, and some violations are treated as criminal offenses. During the pandemic, HIPAA enforcement for telemedicine services was relaxed by waiving penalties for some types of video and voice communications. However, these exceptions are temporary. Much of the current law needs to be reformed to ensure a long-term safe investment climate for telemedicine.
9 Specific Challenges of Telemedicine
The obstacles to telemedicine implementation differ by area and stakeholder. The following are examples of issues confronting particular telemedicine areas.
9.1 Rural Areas
Unfortunately, there is still a disparity in healthcare between metropolitan and rural areas, and telemedicine is no exception. The digitalization of data, together with widespread skepticism toward technology-related medical procedures, is a contentious subject, particularly in rural regions.
9.2 On-Demand Nursing
Licensing and accreditation are among the biggest challenges caregivers face in telemedicine. Licenses are often state-specific, and where no multistate license is available, nurses cannot care for patients in other states.
9.3 Patient Satisfaction
Patient hurdles to telemedicine adoption have already been highlighted, and patient satisfaction shows a similar pattern. Although over 75% of patients report a pleasant first telemedicine experience, many remain hesitant to substitute telemedicine for “conventional” treatment. The primary factors inhibiting telemedicine adoption are a lack of computer literacy, a lack of advanced skills, and distrust.
10 Current Telemedicine Situation
Overcoming major difficulties takes time, money, and collaboration. However, progress is being made. The following are the main ideas for accelerating telemedicine integration and removing the barriers described in this chapter.
10.1 Government Cooperation
Cooperation between federal and state governments is critical to ensuring multistate licensing, payment parity, and the expansion of telemedicine services.
10.2 Viable Software
Healthcare providers should invest in software that is accessible to physicians, nurses, and patients alike. This requires assembling a team of programmers with expertise in the healthcare sector and an understanding of the complexity of medical services on both sides.
10.3 Software Legal Compliance
Legal difficulties are significant roadblocks at several stages of telemedicine integration. In addition to being user-friendly, software must adhere to HIPAA requirements.
10.4 Staff Training
New approaches and equipment require training, and telemedicine is no different. Healthcare providers should create engaging training sessions on telemedicine software. This addresses the inability to use new tools and reduces distrust of new digital methods. Physicians who are familiar with telemedicine software have additional options for providing care, and they can help reassure patients that remote care is as effective as an in-person consultation.
10.5 Identifying Sectors
Although remote care provides many options, certain services can only be provided in person. It is critical to understand when remote consultation is possible and when a hospital visit is required. Telemedicine is transforming the United States’ healthcare system by blurring the lines between patients and medical personnel. During the pandemic, telemedicine demonstrated its advantages and its potential. However, to be truly integrated, the telemedicine sector must be aware of the barriers that impede progress and apply suitable solutions.
11 Conclusion
We presented a platform for IoMT in this study, through which patients may interact remotely with one of their doctors. While a session is open, sensors may be attached to upload the patient’s vital signs to the system. Doctors can review the measurement data online and advise patients on suitable therapy. The system was tested with a total of five sensors: using three ATmega 32 Arduino boards, it can measure heart rate, humidity, temperature, blood pressure, and alcohol levels. Bluetooth, GSM, and Wi-Fi were used to link the sensors to the server, and a comparative analysis of these three sensor-to-server methods was presented. The best technology depends on the nature of the application. Bluetooth and Wi-Fi are recommended for comparable networks, depending on whether an Internet connection is required. Such networks do not need to cover great distances and can support high data rates for video streaming. The suggested network architecture may be expanded by incorporating patient-to-physician payment methods inside a secure framework. In addition, integrating GPS into such a system would improve location accuracy and help rescue services locate the patient in an emergency.
References
1. Manogaran, G., Thota, C., Lopez, D., & Sundarasekar, R. (2017). Big data security intelligence for healthcare industry 4.0. In L. Thames & D. Schaefer (Eds.), Cybersecurity for Industry 4.0. Springer Series in Advanced Manufacturing. Springer International Publishing. ISBN 9783319506593.
2. Pace, P., Aloi, G., Gravina, R., Caliciuri, G., Fortino, G., & Liotta, A. (2018). An edge-based architecture to support efficient applications for Healthcare Industry 4.0. IEEE Transactions on Industrial Informatics, 15, 481–489.
3. Srinivasu, P. N., Ijaz, M. F., Shafi, J., Woźniak, M., & Sujatha, R. (2022). 6G driven fast computational networking framework for healthcare applications. IEEE Access, 10, 94235–94248. https://doi.org/10.1109/ACCESS.2022.3203061
4. Mohapatra, S. K., Mishra, S., Tripathy, H. K., & Alkhayyat, A. (2022). A sustainable data-driven energy consumption assessment model for building infrastructures in resource constraint environment. Sustainable Energy Technologies and Assessments, 53, 102697.
5. Mishra, S., Jena, L., Tripathy, H. K., & Gaber, T. (2022). Prioritized and predictive intelligence of things enabled waste management model in smart and sustainable environment. PLoS ONE, 17(8), e0272383.
6. Hathaliya, J., Sharma, P., Tanwar, S., & Gupta, R. (2019, December 13–14). Blockchain-based remote patient monitoring in Healthcare 4.0. In Proceedings of the 2019 IEEE 9th International Conference on Advanced Computing, IACC 2019 (pp. 87–91).
7. Yan, H., Da Xu, L., Bi, Z., Pang, Z., Zhang, J., & Chen, Y. (2015). An emerging technology wearable wireless sensor networks with applications in human health condition monitoring. Journal of Management Research and Analysis, 2, 121–137.
8. Jiang, X., Xie, H., Tang, R., Du, Y., Li, T., Gao, J., Xu, X., Jiang, S., Zhao, T., Zhao, W., & Sun, X. (2021). Characteristics of online health care services from China’s largest online medical platform: Cross-sectional survey study. Journal of Medical Internet Research, 23, e25817.
9. McCarthy, J., Minsky, M. L., Rochester, N., & Shannon, C. E. (2006). A proposal for the Dartmouth summer research project on artificial intelligence. AI Magazine, 27, 12–14.
10. Anane-Sarpong, E., Wangmo, T., Ward, C. L., Sankoh, O., Tanner, M., & Elger, B. S. (2017). You cannot collect data using your own resources and put it on open access: Perspectives from Africa about public health data-sharing. Developing World Bioethics, 18, 394–405.
11. Van Panhuis, W. G., Paul, P., Emerson, C., Grefenstette, J., Wilder, R., Herbst, A. J., Heymann, D., & Burke, D. S. (2014). A systematic review of barriers to data sharing in public health. BMC Public Health, 14, 1144.
12. Aboalshamat, K. T. (2020). Awareness of, beliefs about, practices of, and barriers to teledentistry among dental students and the implications for Saudi Arabia Vision 2030 and coronavirus pandemic. Journal of International Society of Preventive & Community Dentistry, 10, 431.
13. Mohaya, M. A. A., Almaziad, M. M., Al-Hamad, K. A., & Mustafa, M. (2021). Telemedicine among oral medicine practitioners during covid-19 pandemic and its future impact on the specialty. Risk Management and Healthcare Policy, 14, 4369.
14. Al Mutlaqah, M. A., Baseer, M. A., Ingle, N. A., Assery, M. K., & Khadhari, M. A. A. (2018). Factors affecting access to oral health care among adults in Abha City, Saudi Arabia. Journal of International Society of Preventive & Community Dentistry, 8, 431.
15. Al-Jaber, A., & Da’ar, O. B. (2016). Primary health care centers, extent of challenges and demand for oral health care in Riyadh, Saudi Arabia. BMC Health Services Research, 16, 628.
16. Awaji, N. N. A., AlMudaiheem, A., & Mortada, E. M. (2022). Changes in speech, language and swallowing services during the Covid-19 pandemic: The perspective of speech-language pathologists in Saudi Arabia. PLoS ONE, 17, e0262498.
17. Sahoo, P. K., Mishra, S., Panigrahi, R., Bhoi, A. K., & Barsocchi, P. (2022). An improvised deep-learning-based mask R-CNN model for Laryngeal cancer detection using CT images. Sensors, 22(22), 8834.
18. Praveen, S. P., Jyothi, V. E., Anuradha, C., VenuGopal, K., Shariff, V., & Sindhura, S. (2022). Chronic kidney disease prediction using ML-based Neuro-Fuzzy Model. International Journal of Image and Graphics, 2340013. https://doi.org/10.1142/S0219467823400132
19. Srinivasu, P. N., Sandhya, N., Jhaveri, R. H., & Raut, R. (2022). From blackbox to explainable AI in healthcare: Existing tools and case studies. Mobile Information Systems, 2022(8167821), 20. https://doi.org/10.1155/2022/8167821
20. Lewis, A., Cave, P., Stern, M., Welch, L., Taylor, K., Russell, J., Doyle, A.-M., Russell, A.-M., McKee, H., Clift, S., & Bott, J. (2016). Singing for lung health: A systematic review of the literature and consensus statement. NPJ Primary Care Respiratory Medicine, 26, 16080.
21. Tripathy, H. K., Mishra, S., Suman, S., Nayyar, A., & Sahoo, K. S. (2022). Smart COVID-shield: An IoT driven reliable and automated prototype model for COVID-19 symptoms tracking. Computing, 1–22.
22. Praveen, S. P., Ali, M. H., Jaber, M. M., Buddhi, D., Prakash, C., Rani, D. R., & Thirugnanam, T. (2022). IOT-enabled healthcare data analysis in virtual hospital systems using Industry 4.0 smart manufacturing. International Journal of Pattern Recognition and Artificial Intelligence. https://doi.org/10.1142/S0218001423560025