
Smart Innovation, Systems and Technologies 192

Yen-Wei Chen
Satoshi Tanaka
Robert J. Howlett
Lakhmi C. Jain
Editors

Innovation in Medicine and Healthcare Proceedings of 8th KES-InMed 2020


Smart Innovation, Systems and Technologies Volume 192

Series Editors Robert J. Howlett, Bournemouth University and KES International, Shoreham-by-sea, UK Lakhmi C. Jain, Faculty of Engineering and Information Technology, Centre for Artificial Intelligence, University of Technology Sydney, Sydney, NSW, Australia

The Smart Innovation, Systems and Technologies book series encompasses the topics of knowledge, intelligence, innovation and sustainability. The aim of the series is to make available a platform for the publication of books on all aspects of single and multi-disciplinary research on these themes in order to make the latest results available in a readily-accessible form. Volumes on interdisciplinary research combining two or more of these areas are particularly sought. The series covers systems and paradigms that employ knowledge and intelligence in a broad sense. Its scope is systems having embedded knowledge and intelligence, which may be applied to the solution of world problems in industry, the environment and the community. It also focusses on the knowledge-transfer methodologies and innovation strategies employed to make this happen effectively. The combination of intelligent systems tools and a broad range of applications introduces a need for a synergy of disciplines from science, technology, business and the humanities. The series will include conference proceedings, edited collections, monographs, handbooks, reference books, and other relevant types of book in areas of science and technology where smart systems and technologies can offer innovative solutions. High quality content is an essential feature for all book proposals accepted for the series. It is expected that editors of all accepted volumes will ensure that contributions are subjected to an appropriate level of reviewing process and adhere to KES quality principles. ** Indexing: The books of this series are submitted to ISI Proceedings, EI-Compendex, SCOPUS, Google Scholar and Springerlink **

More information about this series at http://www.springer.com/series/8767

Yen-Wei Chen
Satoshi Tanaka
Robert J. Howlett
Lakhmi C. Jain





Editors

Innovation in Medicine and Healthcare Proceedings of 8th KES-InMed 2020


Editors Yen-Wei Chen College of Information Science and Engineering Ritsumeikan University Kusatsu, Shiga, Japan Robert J. Howlett Bournemouth University Poole, UK

Satoshi Tanaka College of Information Science and Engineering Ritsumeikan University Kusatsu, Shiga, Japan Lakhmi C. Jain Faculty of Engineering and Information Technology University of Technology Sydney Broadway, NSW, Australia Liverpool Hope University Liverpool, UK

ISSN 2190-3018 ISSN 2190-3026 (electronic) Smart Innovation, Systems and Technologies ISBN 978-981-15-5851-1 ISBN 978-981-15-5852-8 (eBook) https://doi.org/10.1007/978-981-15-5852-8 © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2020 This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd. The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721, Singapore

InMed 2020 Organization

Honorary Chair Lakhmi C. Jain, University of Technology Sydney, Australia; Liverpool Hope University, UK and KES International, UK

Executive Chair Robert J. Howlett, Bournemouth University, UK

General Chair Yen-Wei Chen, Ritsumeikan University, Japan

Program Chair Satoshi Tanaka, Ritsumeikan University, Japan

International Program Committee Members
Marco Anisetti, University of Milan, Italy
Ahmad Taher Azar, Prince Sultan University, Saudi Arabia
Adrian Barb, Penn State University, USA


Smaranda Belciug, University of Craiova, Romania
Jenny Benois-Pineau, Université Bordeaux 1, France
Isabelle Bichindaritz, State University of New York at Oswego, USA
Christopher Buckingham, Aston University, UK
Cecilia Dias Flores, UFCSPA, Brazil
Amir H. Foruzan, Shahed University, Iran
Arnulfo Alanis Garza, Instituto Tecnologico de Tijuana, Mexico
Georgiana Gavril, National Institute of Research Development for Biological Sciences, Romania
Arfan Ghani, University of Bolton, Greater Manchester, United Kingdom
Juan Gorriz, University of Granada, Spain
Tomio Goto, Nagoya Institute of Technology, Japan
Aboul Ella Hassanien, Cairo University, Egypt
Elena Hernandez-Pereira, University of A Coruna, Spain
Chieko Kato, Toyo University, Japan
Dalia Kriksciuniene, Vilnius University, Lithuania
Liang Li, Ritsumeikan University, Japan
Giosuè Lo Bosco, Università degli Studi di Palermo, Italy
Francisco J. Martinez-Murcia, University of Malaga, Spain
Yoshimasa Masuda, Keio University, Japan
Tadashi Matsuo, Ritsumeikan University, Japan
Rashid Mehmood, King Abdul Aziz University, Saudi Arabia
Victor Mitrana, Polytechnic University of Madrid, Spain
Susumu Nakata, Ritsumeikan University, Japan
Marek R. Ogiela, AGH University of Science and Technology, Krakow, Poland
Manuel Penedo, Research Center CITIC, Spain
Dorin Popescu, University of Craiova, Romania
Xu Qiao, Shandong University, China
Ana Respício, Universidade de Lisboa, Portugal
John Ronczka, SCOTTYNCC Independent Research Scientist, Australia
Yves Rybarczyk, Dalarna University, Sweden
Naohisa Sakamoto, Kobe University, Japan
Donald Shepard, Brandeis University, USA
Catalin Stoean, University of Craiova, Romania
Ruxandra Stoean, University of Craiova, Romania
Kenji Suzuki, Tokyo Institute of Technology, Japan
Kazuyoshi Tagawa, Aichi University, Japan
Eiji Uchino, Yamaguchi University, Japan
Zhongkui Wang, Ritsumeikan University, Japan
Junzo Watada, Waseda University, Japan
Yoshiyuki Yabuuchi, Shimonoseki City University, Japan
Shuichiro Yamamoto, Nagoya University, Japan
Hiroyuki Yoshida, Harvard Medical School/Massachusetts General Hospital, USA


Organization and Management KES International (www.kesinternational.org) in partnership with the Institute of Knowledge Transfer (www.ikt.org.uk)

Preface

The 8th KES International Conference on Innovation in Medicine and Healthcare (InMed-20) was held as a virtual conference on June 17–19, 2020. InMed-20 is the 8th edition of the InMed series of conferences. The conference focuses on major trends and innovations in modern intelligent systems applied to medicine, surgery, healthcare, and the issues of an aging population, including recent hot topics on artificial intelligence for medicine and healthcare. The purpose of the conference is to exchange new ideas, new technologies, and current research results in these research fields. We received submissions from many countries around the world. All submissions were carefully reviewed by at least two reviewers of the International Program Committee. Finally, 20 papers were accepted for presentation in these proceedings, which cover a number of key areas in smart medicine and healthcare, including: (1) Biomedical Engineering, Trends, Research and Technologies; (2) Advanced ICT for Medicine and Healthcare; (3) Statistical Signal Processing and Artificial Intelligence; and (4) Support System for Medicine and Healthcare. In addition to the accepted research papers, we organised the services of leading researchers to present keynote addresses at the conference. We would like to thank Dr. Kyoko Hasegawa and Ms. Yuka Sato of Ritsumeikan University for their valuable editing assistance for this book. We are also grateful to the authors and reviewers for their contributions.

Kusatsu, Japan
Kusatsu, Japan
Poole, UK
Sydney, Australia/Liverpool, UK
June 2020

Editors: Yen-Wei Chen Satoshi Tanaka Robert J. Howlett Lakhmi C. Jain


Contents

Biomedical Engineering, Trends, Research, and Technologies

Vision Paper for Enabling Internet of Medical Robotics Things in Open Healthcare Platform 2030 . . . 3
Yoshimasa Masuda, Donald S. Shepard, Osamu Nakamura, and Tetsuya Toma

Stumbling Blocks of Utilizing Medical and Health Data: Success Factors Extracted from Australia–Japan Comparison . . . 15
Yoshiaki Fukami and Yoshimasa Masuda

Digital Financial Incentives for Improved Population Health in the Americas . . . 27
Donald S. Shepard and Yoshimasa Masuda

Advanced ICT for Medicine and Healthcare

Trial Run of a Patient Call System Using Mobile Devices . . . 41
Kei Teramoto and Hiroshi Kondoh

Advance Watermarking Algorithm Using SURF with DWT and DCT for CT Images . . . 47
Saqib Ali Nawaz, Jingbing Li, Uzair Aslam Bhatti, Muhammad Usman Shoukat, and Anum Mehmood

Improving Depth Perception using Multiple Iso-Surfaces for Transparent Stereoscopic Visualization of Medical Volume Data . . . 57
Daimon Aoi, Kyoko Hasegawa, Liang Li, Yuichi Sakano, and Satoshi Tanaka

Design and Simulation of a Robotic Manipulator for Laparoscopic Uterine Surgeries . . . 67
H. A. G. C. Premachandra, K. M. Thathsarana, H. M. A. N. Herath, D. L. F. M. Liyanage, Y. W. R. Amarasinghe, D. G. K. Madusanka, and M. A. M. M. Jayawardane

Self-Skill Training System for Chest Compressions in Neonatal Resuscitation Workshop . . . 81
Noboru Nishimoto, Reiji Watanabe, Haruo Noma, Kohei Matsumura, Sho Ooi, Kogoro Iwanaga, and Shintaro Hanaoka

Statistical Signal Processing and Artificial Intelligence

Comparative Study of Pattern Recognition Methods for Predicting Glaucoma Diagnosis . . . 93
Louis Williams, Salman Waqar, Tom Sherman, and Giovanni Masala

Research on Encrypted Face Recognition Algorithm Based on New Combined Chaotic Map and Neural Network . . . 105
Jiabin Hu, Jingbing Li, Saqib Ali Nawaz, and Qianguang Lin

A 3D Shrinking-and-Expanding Module with Channel Attention for Efficient Deep Learning-Based Super-Resolution . . . 117
Yinhao Li, Yutaro Iwamoto, and Yen-Wei Chen

Dynamic Facial Features in Positive-Emotional Speech for Identification of Depressive Tendencies . . . 127
Jia-Qing Liu, Yue Huang, Xin-Yin Huang, Xiao-Tong Xia, Xi-Xi Niu, Lanfen Lin, and Yen-Wei Chen

Hand-Crafted and Deep Learning-Based Radiomics Models for Recurrence Prediction of Non-Small Cells Lung Cancers . . . 135
Panyanat Aonpong, Yutaro Iwamoto, Weibin Wang, Lanfen Lin, and Yen-Wei Chen

Weakly and Semi-supervised Deep Level Set Network for Automated Skin Lesion Segmentation . . . 145
Zhuofu Deng, Yi Xin, Xiaolin Qiu, and Yeda Chen

Support System for Medicine and Healthcare

A Transcriptional Study of Oncogenes and Tumor Suppressors Altered by Copy Number Variations in Ovarian Cancer . . . 159
Giorgia Giacomini, Gabriele Ciravegna, Marco Pellegrini, Romina D'Aurizio, and Monica Bianchini

Analysis of Acoustic Features Affected by Residual Food in the Piriform Fossa Toward Early-Detection of Dysphagia . . . 171
Tomoki Hosoyama, Masahiro Koto, Masafumi Nishimura, Masafumi Nishida, Yasuo Horiuchi, and Shingo Kuroiwa

Automatic Joint Space Distance Measurement Method for Rheumatoid Arthritis Medical Examinations . . . 179
Tomio Goto, Yoshiki Sano, and Koji Funahashi

Development of an Active Compression System for Venous Disease . . . 191
L. S. Paranamana, S. K. M. M. Silva, M. A. S. V. Gunawardane, Indrajith D. Nissanka, Y. W. R. Amarasinghe, and Gayani K. Nandasiri

Design and Development of a Droplet-Based Microfluidics System Using Laser Fabrication Machining Techniques for a Lab on a Chip Device . . . 201
W. H. P. Sampath, S. P. Hettiarachchi, N. H. R. G. Melroy, and Y. W. R. Amarasinghe

Design of a Novel MEMS-Based Microgripper with Hybrid Actuation to Determine Circulating Tumor Cell (CTC) Progression . . . 211
M. P. Suriyage, T. A. B. Prabath, Y. L. G. C. L. Wickramathilaka, S. K. M. M. Silva, M. A. S. V. Gunawardane, J. A. D. N. Jayawardana, H. M. N. W. Bandara, W. P. V. V. Withanapathirana, and Y. W. R. Amarasinghe

Author Index . . . 221

About the Editors

Yen-Wei Chen received his Ph.D. degree from Osaka University, Japan, in 1990. He is currently a Professor at the College of Information Science and Engineering, Ritsumeikan University, Japan. He is also a Visiting Professor at the College of Computer Science and Technology, Zhejiang University, China. His research interests include medical image analysis and pattern recognition.

Satoshi Tanaka received his Ph.D. in Theoretical Physics at Waseda University, Japan, in 1987. After serving as an Assistant Professor, Senior Lecturer and Associate Professor at Waseda and Fukui Universities, he became a Professor at Ritsumeikan University in 2002. His current research focuses on computer visualization of complex 3D shapes such as 3D scanned cultural heritage objects, internal structures of the human body, and fluid simulations.

Robert J. Howlett is the Executive Chair of KES International, a non-profit organization that facilitates knowledge transfer and the dissemination of research findings in areas such as intelligent systems, sustainability and knowledge transfer. He is a Visiting Professor at Bournemouth University in the UK. He has technical expertise in the use of intelligent systems to solve industrial problems, and his current research focuses on the use of smart microgrids to reduce energy costs and lower carbon emissions in areas such as housing and protected horticulture.

Lakhmi C. Jain, Ph.D., M.E., B.E. (Hons), Fellow of Engineers Australia, works for the University of Technology Sydney, Australia, and Liverpool Hope University, UK. He also serves at KES International, providing the professional community with opportunities for knowledge exchange, cooperation and teaming. Involving around 5,000 researchers drawn from universities and companies worldwide, KES facilitates international cooperation and generates synergy in teaching and research.


Biomedical Engineering, Trends, Research, and Technologies

Vision Paper for Enabling Internet of Medical Robotics Things in Open Healthcare Platform 2030

Yoshimasa Masuda, Donald S. Shepard, Osamu Nakamura, and Tetsuya Toma

1 Introduction

There is a movement toward the increasing integration of robot functionalities with the Internet of Things (IoT), Ambient Assisted Living (AAL) systems, and Electronic Health Records (EHR) in the field of healthcare robotics. The concept of cloud robotics was introduced in 2010, when Kuffner described robots that rely on data or code from a network, supported by cloud platforms, for their operations [1, 2]. Modern medical robots have been deployed and used broadly to provide all kinds of services, from monitoring of patients to the operation of critical and unstable tasks such as nursing, diagnosis, and surgical applications [3]. The Internet of Medical Robotics Things (IoMRT) is also playing an important role in medical fields in enhancing the effectiveness and speed of using medical devices and improving operational accuracy. The IoMRT can be utilized to collect patients' health data using sensors and devices connected to health monitoring systems via the Internet through online networks. The concept of IoMRT is the integration of a robotic technology interface with computer networks, the web, and digital platforms, which collects patients' health information in real time and allows doctors to direct the services required for the patients [3–6].

Y. Masuda (B) The School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA e-mail: [email protected]; [email protected] D. S. Shepard The Heller School for Social Policy and Management, Brandeis University, Boston, MA, USA Y. Masuda · O. Nakamura Graduate School of Media and Governance, Keio University, Kanagawa, Japan T. Toma Graduate School of System Design and Management, Keio University, Kanagawa, Japan © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2020 Y.-W. Chen et al. (eds.), Innovation in Medicine and Healthcare, Smart Innovation, Systems and Technologies 192, https://doi.org/10.1007/978-981-15-5852-8_1


Continual change is a hallmark of information societies, such as the development of new technologies, globalization, shifts in customer needs, and new business models. Significant changes in cutting-edge IT technology due to recent developments in Cloud computing and Mobile IT (such as progress in big data technology) have emerged as new trends in information technology. Furthermore, major advances in these technologies and processes have created a “digital IT economy,” bringing about business opportunities along with business risks, and forcing enterprises to innovate or face the consequences [7]. Enterprise Architecture (EA) usefully contributes to the design of large integrated systems, helping to address a major technical challenge toward the era of cloud, mobile IT, big data, and digital IT in digital transformation. From a comprehensive perspective, EA encompasses all enterprise artifacts, such as businesses, organizations, applications, data, and infrastructure, to establish the current architecture visibility and future architecture/roadmap. On the other hand, EA frameworks need to embrace change in ways that consider the emerging new paradigms and requirements affecting EA, such as mobile IT and the cloud [8, 9]. In light of these developments, a previous study proposed the “Adaptive Integrated EA framework,” which should align with IT strategy promoting cloud, mobile IT, Digital Platform, and verified this in the case study [10]. Those authors named the EA framework suitable for the era of Digital IT as “Adaptive Integrated Digital Architecture Framework—AIDAF” [11].

2 Recent Trends in Digital Healthcare, IoMRT and EA

2.1 Cloud Computing, Big Data, Internet of Things, and Digital Healthcare

Cloud computing is an economical option for acquiring strong computing resources to deal with large-scale data, with substantial adoption in the healthcare industry [12]. The main cloud-based healthcare and biomedicine applications have been reviewed as well [13]; for instance, Veeva Systems is a cloud-based software-as-a-service (SaaS) company focused on pharmaceutical industry applications. The implementation of big data analytics in healthcare is making progress, enabling the examination of large data sets involving EHR to uncover hidden patterns, unknown correlations, and other useful information [14, 15]. The advances in big data analytics can help transform research situations from being descriptive to predictive and prescriptive [16]. Big data analytics in healthcare can contribute to evidence-based medicine, patient profile analyses, and many other applications. Furthermore, big data analytics can effectively reduce healthcare concerns, for example through the improvement of healthcare-related systems [17].


The term "Internet of Things (IoT)" is used to mean "the collection of uniquely identifiable objects embedded in or accessible by Internet hosts" [7], such as interaction devices, smart homes, and other SmartLife scenarios. The current state of research on Internet of Things architecture [18] lacks an integral understanding of EA and management [19–21], and shows a number of physical standards, methods, tools, and a large number of heterogeneous IoT devices [22]. Zimmermann et al. proposed a first Reference Architecture (RA) for the IoT [22]. It can be mapped to a set of open source products. This RA covers aspects like "cloud server-side architecture," "monitoring/management of IoT devices, services," "specific lightweight RESTful communications," and "agent, code on small, low-power devices." Layers can be instantiated by suitable technologies for the IoT [22]. IoT can be the main enabler for distributed healthcare applications [23]; therefore, it can potentially contribute to an overall decrease of healthcare costs while increasing health outcomes, although behavioral changes of the stakeholders are required [16, 23]. Medical, diagnostic, and imaging sensor devices with wireless technologies constitute a core part of the IoT [24], though general-purpose smart devices such as smartphones or PDAs are leveraged in several healthcare applications [24, 25]. The Internet of Medical Things (IoMT) [26] refers to applications enabled with personal healthcare systems and consists of wearable sensors and devices connected to a personal health hub (e.g., a smartphone) that is connected to the Internet.
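As an illustration of the IoMT pattern just described, the following minimal Python sketch models a wearable sensor reading being collected by a personal health hub and packaged for upload; the class and field names are hypothetical and are not taken from any system cited above.

import json
import time
from dataclasses import dataclass, asdict

@dataclass
class SensorReading:
    # Hypothetical wearable-sensor reading collected by a personal health hub
    patient_id: str
    sensor_type: str      # e.g., "heart_rate", "spo2"
    value: float
    unit: str
    timestamp: float

class PersonalHealthHub:
    """Aggregates readings from wearables before uploading to a monitoring service."""
    def __init__(self):
        self.buffer = []

    def collect(self, reading: SensorReading) -> None:
        self.buffer.append(reading)

    def flush_to_json(self) -> str:
        # In a real deployment this payload would be sent over the Internet
        # to a health monitoring system; here it is only serialized.
        payload = json.dumps([asdict(r) for r in self.buffer])
        self.buffer.clear()
        return payload

hub = PersonalHealthHub()
hub.collect(SensorReading("patient-001", "heart_rate", 72.0, "bpm", time.time()))
print(hub.flush_to_json())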

2.2 IoRT and IoMRT

The Internet of Things (IoT) can change the face of robotics by enabling a next-generation class of intelligent robotics, the "Internet of Robotic Things (IoRT)," in the near future, in collaboration with artificial intelligence, machine learning, deep learning, and cloud computing [27]. The primary element of the design and development of IoRT is "cloud robotics." Cloud robotics is designed as a transition state between preprogrammed robotics and networked robotics [27]. IoRT can be positioned on top of the cloud robotics paradigm, leveraging aspects of cloud computing such as virtualization technology and the three cloud service models (i.e., SaaS, platform as a service (PaaS), and infrastructure as a service (IaaS)), while utilizing IoT and related technologies in designing and implementing new applications for networked robotics [28]. The Internet of Medical Robotics Things (IoMRT) is playing a crucial role in medical environments in enhancing the effectiveness, speed, and operating accuracy of medical devices. The IoMRT can be utilized to collect patients' health data with sensors and devices connected to Internet-based health monitoring systems through online networks [3]. Clinical, diagnostic, and therapeutic services in particular are incorporated into IoMRT, which will lead to an extension to surgical operation services, including nursing, laboratory test analysis, rehabilitation services, clinical diagnosis [29, 30], therapeutics [3, 31], medical information transformation [32, 33], operating diagnosis devices, treatment, and surgical assistance. Robotic technology


with IoT enables physicians or other medical professionals to use a smartphone or a wearable device to control robots' operations and perform specific tasks in real time [3]. The concept of IoMRT should be the integration of a robotic technology interface with a network, through the web, that can collect a patient's information in real time and can also enable a physician or other medical professional to direct the services they ordered for their patient [3–6]. Cloud robotics can be treated as a growing area of robotic autonomy built on distributed computing and services, which enables robots to take advantage of the effective computational, storage, and communication methods of the cloud, while removing overheads for assignments, with additional activity costs determined by cloud information exchange rates [3, 28].

2.3 AIDAF Framework

In the past 10 years, EA has become an important framework to model the relationship between the overall image of a corporation and its individual systems. In ISO/IEC/IEEE 42010:2011, an architecture framework is defined as "conventions, principles, and practices for the description of architecture established within a specific domain of application and/or community of stakeholders" [34]. EA is an essential element of corporate IT planning and offers benefits to companies, such as coordination between business and IT [35]. Chen et al. have discussed the integration of EA with Service-Oriented Architecture (SOA) [36]. OASIS, a public standards group [37], introduces an SOA reference model. Meanwhile, attention has been focused on microservice architecture, which allows rapid adoption of new technologies like mobile IT, IoT, and cloud computing [38]. SOA and microservices vary greatly from the viewpoint of service characteristics [39]. Microservices are an approach for distributed systems built from two basic forms of service: functional services exposed through an Application Programming Interface (API) layer, and infrastructure services [38]. In terms of cloud computing, many mobile IT applications operate with SaaS cloud-based software [40]. Traditional EA approaches require months to develop an EA to achieve a cloud adoption strategy, and organizations will demand an adaptive EA to iteratively develop and manage an EA for cloud technologies [41]. Moreover, few studies have discussed EA integration with mobile IT [11]. From the standpoint of EA for cloud computing, the suitable option is an adaptive EA framework that supports the elements of cloud computing [42]. Moreover, according to previous survey research [42], when promoting cloud/mobile IT in a strategic manner, a company that has applied TOGAF or FEAF can adopt the integrated framework using the adaptive EA framework supporting elements of cloud computing. Following this approach, a preliminary study by the authors proposed an adaptive integrated EA framework, depicted in Fig. 1 of that paper, which should integrate with IT strategy promoting cloud, mobile IT,


digital IT, and verified this in a case study [10]. The proposed model is an EA framework integrating an adaptive EA cycle with TOGAF or a simple EA framework for different business units, shown in the upper part of the diagram in [10]. The primary author named this EA framework the "Adaptive Integrated Digital Architecture Framework (AIDAF)" [11]. In the adaptive EA cycle, project plan documents, including the architecture for new digital IT projects, should be made on a short-term basis in the context phase, referring to the materials of the defining phase (e.g., architectural guidelines aligned with IT strategy) per business needs. During the assessment/architecture review phase, the Architecture Board (AB) reviews the architecture in the initiation documents for the IT project. In the rationalization phase, the stakeholders and the AB decide which systems will be replaced or decommissioned by the proposed new information systems. In the realization phase, the project team begins to implement the new IT project after deliberating on issues and action items [10, 11]. Within the adaptive EA cycle, corporations can adopt an EA framework such as TOGAF or a simple EA framework per operational division, as in the top part of Fig. 1 of [10, 11], aligning the EA guiding principles with each division's principles, which can accommodate differing strategies across business divisions in the mid-to-long term [10, 11].
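To make the order of the adaptive EA cycle easier to follow, the sketch below encodes the four phases described above as plain Python data and walks a project through them; the project fields and wording of the activities are illustrative assumptions, not artifacts defined by AIDAF itself.

from dataclasses import dataclass, field

# The four phases of the adaptive EA cycle, in order (after [10, 11]).
ADAPTIVE_EA_CYCLE = [
    ("context", "Draft a short-term project plan, referring to architecture guidelines aligned with IT strategy"),
    ("assessment/architecture review", "Architecture Board reviews the architecture in the project initiation documents"),
    ("rationalization", "Stakeholders and the Architecture Board decide which systems are replaced or decommissioned"),
    ("realization", "Project team implements the new IT project after resolving issues and action items"),
]

@dataclass
class DigitalITProject:
    name: str
    completed_phases: list = field(default_factory=list)

def advance(project: DigitalITProject) -> str:
    """Return the next phase the project has to go through, or 'done'."""
    idx = len(project.completed_phases)
    if idx >= len(ADAPTIVE_EA_CYCLE):
        return "done"
    phase, activity = ADAPTIVE_EA_CYCLE[idx]
    project.completed_phases.append(phase)
    return f"{phase}: {activity}"

p = DigitalITProject("IoMRT telemetry platform")
for _ in range(5):
    print(advance(p))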

3 Architecture of IoRT and IoMRT

3.1 Overview of IoRT Architecture

Figure 1 depicts a general architecture for IoRT-based robotic systems, showing the in-depth architecture of IoRT today. The applications covered by IoRT can enable robots to operate in smart environments and perform a wide range of tasks with high accuracy [27]. The conceptual architecture of IoRT can be divided into four layers: (1) the hardware layer, (2) the network and Internet layers, (3) the infrastructure layer, and (4) the application layer [28]. The bottommost layer of the conceptual architecture of IoRT is the hardware layer, which can consist of various robots and additional devices such as sensors, smartphones, defense equipment, weather sensors, underwater/personal equipment, home appliances, vehicles, and industrial sensors [28]. The next layers are the network layer and the Internet layer. The network layer/connectivity enables connectivity between sensors and robots and even facilitates machine-to-machine communication. Various network connectivity options are available at the network layer, such as cellular, short-range, and medium-to-long-range technologies [27, 28]. The Internet layer/connectivity plays a central role in all communication in the IoRT architecture. As IoRT builds on IoT, various IoT


Fig. 1 Conceptual architecture of IoRT [27, 28]

standardized protocols are utilized to facilitate machine-to-machine and machine-to-human communication. Protocols that operate in the Internet layer include MQTT, IPv6, UDP, uIP, and DDS [28]. The second layer from the top is the IoRT infrastructure layer, which is the most valuable architectural layer of all in terms of the service-centric approaches of cloud, middleware, business process, and big data taken together. Furthermore, this layer consists of five different but related elements: the robotic cloud platform, Machine-to-Machine-to-Actuator (M2M2A) cloud platform support, IoT business cloud services, big data services, and the IoT cloud robotics infrastructure, as shown in Fig. 1 [27, 28]. The topmost layer of the IoRT conceptual architecture is the application layer [27]. Robots, when combined with IoT and the cloud, can be utilized in fields like data centers, EC sites, WSDL [43] interfaces, business shows, research and development centers, and many more [28].
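Since MQTT is listed among the Internet-layer protocols, the hedged sketch below shows how a networked robot might publish one telemetry message to a cloud-side broker with the paho-mqtt client; the broker address, topic name, and payload are placeholders and are not part of the reference architecture in [22, 27, 28].

import json
import time
import paho.mqtt.client as mqtt  # pip install paho-mqtt (1.x-style constructor used here)

BROKER_HOST = "broker.example.org"   # placeholder cloud-side broker
TOPIC = "iort/robot-17/telemetry"    # hypothetical topic naming scheme

client = mqtt.Client(client_id="robot-17")
client.connect(BROKER_HOST, 1883, keepalive=60)

# Publish one machine-to-machine telemetry message with QoS 1 (at least once).
payload = json.dumps({"battery": 0.82, "pose": [1.2, 3.4, 0.0], "ts": time.time()})
client.publish(TOPIC, payload, qos=1)
client.disconnect()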

3.2 Overview of IoMRT Architecture

The architecture of IoMRT is split into four layers: (1) the sensor/actuator layer, (2) the network layer and Internet layer, (3) the IoMRT infrastructure layer (transport layer, service layer), and (4) the application layer, as shown in Fig. 2 [3]. The bottom layer shown in Fig. 2 is the sensor/actuator layer, consisting of robotic tools such as sensors, a robotic scrub nurse, surgical tools, a 3D camera, etc. The 3D camera captures patient/surgical robot movements and sends them to the physician's room, with instructions given via voice recognition. The second bottommost layers are the network and Internet layers, which offer many methods of network connectivity, such as cellular networks, e.g., 3G and LTE/4G,


Fig. 2 Conceptual architecture of IoMRT [3]

short-range communication technologies, e.g., Wi-Fi and Li-Fi, Bluetooth Low Energy (BLE), etc., and medium/long-range communication technologies, e.g., WiMAX, Z-Wave, 5G network slicing, etc. This layer supports several protocols and uses mainly Li-Fi, which has faster access than Wi-Fi and transmits data at low cost. IoMRT communication protocols are included in this layer for energy-efficient, capability-constrained data processing in robotic systems, together with TCP-based protocols. The second layer from the top is the IoMRT infrastructure layer, which consists of the transport layer and the service layer. This layer is an aggregate of five related parts: robot platform support, M2M cloud platform support, QoS device management, IoMRT security support, and big data services (service layer), as shown in Fig. 2. The QoS device management above covers identity management, authentication, event tracing, privacy and security policies, etc. The IoMRT security support involves, for example, image processing, video processing, face/voice recognition, and robot operation monitoring [3]. The uppermost layer is the application layer, which covers patient data access, remote monitoring, drug administration, medical academic research, and healthcare policymaking, among others. The IoMRT design will ease clinical practice in investigating and introducing robotic surgical operations that can be performed using the above robotics technology [3].
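The sketch below walks a single sensor event through the layers just described: a frame arrives at the network/Internet layer, the device is checked by the infrastructure layer (identity management/authentication under QoS device management), and only then does the event reach an application-layer handler. The registry contents, message format, and function names are illustrative assumptions, not part of the IoMRT design in [3].

# Hypothetical registry kept by the QoS/device-management component
REGISTERED_DEVICES = {"3dcam-01": "operating-room-3", "vitals-07": "ward-2"}

def network_layer_receive(raw: bytes) -> dict:
    # Network/Internet layer: a frame received over BLE/Wi-Fi/Li-Fi is decoded.
    device_id, kind, value = raw.decode("utf-8").split(",")
    return {"device_id": device_id, "kind": kind, "value": float(value)}

def infrastructure_layer_authenticate(event: dict) -> bool:
    # Identity management/authentication: reject events from unknown devices.
    return event["device_id"] in REGISTERED_DEVICES

def application_layer_handle(event: dict) -> None:
    # Application layer: remote monitoring consumes the validated event.
    location = REGISTERED_DEVICES[event["device_id"]]
    print(f"monitoring update from {location}: {event['kind']}={event['value']}")

frame = b"vitals-07,heart_rate,74"
event = network_layer_receive(frame)
if infrastructure_layer_authenticate(event):
    application_layer_handle(event)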

4 AIDAF Application for IoMRT and Digital Platforms

The author of this paper proposed an adaptive integrated EA framework to align with IT strategy promoting cloud, mobile IT, big data, and digital IT, and verified it in our case study [10]. Furthermore, the author has named the EA framework suitable


for the era of Digital IT the "Adaptive Integrated Digital Architecture Framework (AIDAF)" [11]. Figure 2 of [44] illustrates this proposed AIDAF model in the Open Healthcare Platform 2030 (OHP2030) community [44]. The OHP2030 community comprises healthcare organizations such as hospitals, medical development and healthcare companies serving aged patients, and pharmaceutical companies, as well as the OHP2030 initiative and government, as depicted in Fig. 2 of [44]. AIDAF will be applied to the above-mentioned cross-functional healthcare community in OHP2030. AIDAF begins with the context phase, while referring to the defining phase (i.e., architecture design guidelines related to digital IT and IoMRT, aligned with the digital platform strategy). During the assessment and architecture review, the AB reviews the initiation documents and related architectures for platform projects in the healthcare community in OHP2030.

5 Cases of Enabling Medical Robotics and Digital Platforms

5.1 Robotic Nursing Related Case

According to Fosch-Villaronga et al. [1], robotic nursing is one of the most important areas for medical robotics and cloud platforms. In this section, the characteristics of robotic nursing are described together with its challenges. The General Data Protection Regulation (GDPR) is a crucial element for security (Table 1).

Challenge: There could be challenges in identifying controller and processor responsibilities and roles in cloud robotics ecosystems, data security and safety in cloud robotics, etc.

Summary: Robotic nursing can become popular through humanoid robots in cloud robotics, such as Zenbo, NAO, and Unibo, with the above advantages of cloud robotics [1].

Table 1 Characteristics for robotic nursing
Robot type: Cloud robot, humanoid
Examples: Zenbo, NAO, Unibo
Security: Data security is significant and should be ensured for personal data under Art. 32 GDPR, with encryption such as SSL/TLS
Issues: GDPR is difficult to apply; transparency and user rights for metadata; increased data security in cloud robotics
Advantages, use case summary: Can react in emergency cases (Zenbo); helps children on the autism spectrum (NAO); interacts with people in more natural ways (Unibo)
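Table 1 notes that personal data handled by a nursing robot should be protected in transit (Art. 32 GDPR, SSL/TLS). As a minimal sketch of that idea using only the Python standard library, a socket connection can be wrapped in TLS before any personal data leaves the robot; the host name and request below are placeholders, not an interface of any robot named above.

import socket
import ssl

HOST = "cloud-robotics.example.org"  # placeholder cloud endpoint
PORT = 443

context = ssl.create_default_context()  # verifies the server certificate by default

with socket.create_connection((HOST, PORT)) as raw_sock:
    with context.wrap_socket(raw_sock, server_hostname=HOST) as tls_sock:
        # Personal data leaves the robot only over the encrypted channel.
        tls_sock.sendall(b"GET /status HTTP/1.1\r\nHost: cloud-robotics.example.org\r\n\r\n")
        print(tls_sock.version())  # e.g., 'TLSv1.3'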


Table 2 Characteristics for the robotic surgery
Robot type: Cloud robotic with Li-Fi
Examples: da Vinci, Verb Surgical, ROBODOC, MazorX
Security: Inflexible procedures are required to protect the technology's integrity, trust, and privacy and to secure IoMRT information
Issues: Longer setup times; emerging technology risks; minimal haptic feedback
Advantages, summary: Less incidence of wound infection; excellent patient outcomes; less postoperative pain; extremely precise; bloodless surgery

5.2 Robotic Surgery Related Case

According to Guntur et al. [3], the robotic surgery market is expected to be worth US $20 billion by next year. In this section, the characteristics of robotic surgery are described together with its challenges. Medical robots such as ROBODOC are widely used in orthopedic surgery [45] (Table 2).

Challenge: Challenges include the accuracy of data acquisition, interoperability between hardware and software, design optimization, quality of health services, etc.

Summary: Robotic surgery has gained acceptance in many surgical specialties because it is often less expensive and does not fatigue like a human surgeon.

5.3 Other Medical Robotics Cases Such as Robotic Diagnosis and Robotic Rehabilitation

Robotic Diagnosis: Some robotic systems are used for diagnosis based on data from specific environments, such as medical practice, and can facilitate projection of diseases at an initial stage, diagnosis, and treatment with management. Robotic systems are able to serve long-term clinical diagnostic practices and communicate with doctors [3]. Robotic tele-echography (TER) can help in the real-time diagnosis of a patient at a remote location using the generated tele-echography data [46], where an expert operator performs the examination [45].

Robotic Rehabilitation: Rehabilitation robots have promising capacity after various kinds of surgery. Robotic systems can serve long-term rehabilitation practices and communicate with doctors [3]. Rehabilitation robots are widely used in the healthcare sector and can help restore normal form and function after injury and illness. Rehabilitation engineering is intended to give assistive equipment to the injured. The


devices used in rehabilitation include tactile sensors such as force sensors, tactile skin, thermal sensors, and touch reception [45].

6 Discussion and Challenges

IoMRT-related standardized cloud platforms are challenging to design because of the many types of medical robots, cloud robotics, and humanoid robots. It will be more pragmatic to design standardized robotics cloud platforms based on each medical robot's category and character. The robotics industry is changing very rapidly. There is also an approach of defining the cloud platform's specification for medical robotics from the structure of existing, equivalent medical applications, covering the technology architecture structure in each IoMRT-related cloud platform, system, and project in OHP2030. Furthermore, several case studies on IoMRT-related cloud platforms need to be undertaken and verified in the near future.

7 Final Thoughts for IoMRT and Digital Platforms in OHP2030

In this paper, we described the vision for enabling Internet of Medical Robotics Things (IoMRT)-related digital platforms for medical robotics in OHP2030, with several examples. This research initiative, named OHP2030, aims at the exploration and definition of a digital healthcare platform covering technologies such as IoT and big data, including the IoMRT-based digital platforms for medical robotics. Furthermore, we would like to systematize the digital healthcare application systems and digital platforms in the healthcare and medical industry, while ensuring information security, privacy, compliance, validation, and other priorities.

References 1. Fosch-Villaronga, E., Felzmann, H., Ramos-Montero, M., Mahler, T.: Cloud services for robotic nurses? In: IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Spain (2018) 2. Kuffner, J.: Cloud-enabled humanoid robots. In: Humanoids, 2010 Workshop “What’s Next”. Google Research. RI CMU (2010) 3. Guntur, S.R., Gorrepati, R.R., Dirisala, V.R.: Machine Learning in Bio-Signal Analysis and Diagnostic Imaging. Elsevier, 293–318 (2019) 4. Pandikumar, S., Vetrivel, R.S.: Internet of things based architecture of web and smart home interface using GSM. IJIRSET 3 (2014) 5. Tan, L., Wang, N.: Future internet: the internet of things. In: 3rd International Conference on Advanced Computer Theory and Engineering, vol. 5, pp. 376–380 (2010)


6. Wu, M., Lu, T.J., Ling, F.Y., Sun, J., Du, H.Y.: Research on the architecture of internet of things. In: 3rd International Conference on Advanced Computer Theory and Engineering, vol. 5, pp. 484–487 (2010) 7. Boardman, S., Harrington, E.: Snapshot-open platform 3.0™. The Open Group (2015) 8. Alwadain, A., Fielt, E., Korthaus, A., Rosemann, M.: A comparative analysis of the integration of SOA elements in widely-used enterprise architecture frameworks. Int. J. Intell. Inf. Technol. 54–70 (2014) 9. Buckl, S., Matthes, F., Schulz, C., Schweda, C.M.: Exemplifying a framework for interrelating enterprise architecture concerns. In: Sicilia, M.A., Kop, C., Sartori, F. (eds.) Ontology, Conceptualization and Epistemology for Information Systems, Software Engineering and Service Science, pp. 33–46. Springer, Berlin, Heidelberg, New York (2010) 10. Masuda, Y., Shirasaka, S., Yamamoto, S., Hardjono, T.: Int. J. Enterp. Inf. Syst. IJEIS, (IGI Global) 13, 1–22 (2017) 11. Masuda, Y., Shirasaka, S., Yamamoto, S., Hardjono, T.: Architecture board practices in adaptive enterprise architecture with digital platform: a case of global healthcare enterprise. Int. J. Enterp. Inf. Syst. (IGI Global) 14, 1 (2018) 12. Aceto, G., Persico, V., Pescapéa, A.: The role of information and communication technologies in healthcare: taxonomies, perspectives, and challenges. J. Netw. Comput. Appl. 107, 125–154 (22018) 13. Calabrese, B., Cannataro, M.: Cloud computing in healthcare and biomedicine. Scalable Comput. Pract. Exp 16, 1–18 (2015) 14. Archenaa, J., Anita, E.M.: A survey of big data analytics in healthcare and government. Procedia Comput. Sci. 50, 408–413 (2015) 15. Chawla, N.V., Davis, D.A.: Bringing big data to personalized healthcare: a patient-centered framework. J. Gen. Internal Med. 28, 660–665 (2013) 16. Osmani, V., Balasubramaniam, S., Botvich, D.: Human activity recognition in pervasive healthcare: supporting efficient remote collaboration. J Netw Comput Appl 31, 628–655 (2008) 17. Jee, K., Kim, G.-H.: Potentiality of big data in the medical sector: focus on how to reshape the healthcare system. Healthcare Inf Res 19, 79–85 (2013) 18. Patel, P., Cassou, D.: Enabling high-level application development for the internet of things. J. Syst. Softw. (Elsevier), 1–26 (2015) 19. Iacob, M.E., et al.: Delivering business Outcome with TOGAF® and ArchiMate® : BiZZde-sign (2015) 20. Johnson, P., et al.: IT Management with enterprise architecture stockholm: KTH (2014) 21. The Open Group. TOGAF Version 9.1: Van Haren Publishing (2011) 22. Zimmermann, A., Schmidt, R., Sandkuhl, K., Jugel, D.: Digital enterprise architecture—Transformation for the internet of things. In: IEEE 19th International Enterprise Distributed Object Computing Workshop (EDOCW) (2015) 23. Couturier, J., Sola, D., Borioli, G.S., Raiciu, C.: How can the internet of things help to overcome current healthcare challenges. Commun. Strat. 87, 67–81 (2012) 24. Islam, S.M.R., Kwak, D., Kabir, M.H., Hossain, M., Kwak, K.S.: The internet of things for health care: a comprehensive survey. IEEE Access 3, 678–708 (2015) 25. Yeole, A.S., Kalbande, D.: Use of internet of things (iot) in healthcare: a survey. In: Proceedings of the ACM Symposium on Women in Research, pp. 71–76 (2016) 26. Jha, N.K.: Internet-of-medical-things. In: Proceedings of the Great Lakes Symposium on VLSI, GLSVLSI ’17 ACM, p. 7, New York, NY, USA (2017) 27. 
Nayyar, A., Batth, R.S., Nagpal, A.: Internet of robotic things: driving intelligent robotics of future-concept, architecture, applications and technologies. In: 4th IEEE International Conference, pp. 151–160 (2018) 28. Wan, J., Tang, S., Yan, H., Li, D., Wang, S., Vasilakos, A.V.: Cloud robotics: current status and open issues. IEEE Access 4, 2797–2807 (2016) 29. Annavarapu, A., Borra, S., Kora, P.: ECG signal dimensionality reduction-based atrial fibrillation detection. In: Dey, N., Ashour, A., Borra, S., (eds.) Classification in BioApps, pp. 383–406. Springer, Cham (22018)


30. Kora, P., Annavarapu, A., Borra, S.: ECG based myocardial infarction detection using different classification techniques. In Dey, N., Ashour, A., Borra, S. (eds.) Classification in BioApps, pp. 57–77. Springer, Cham (2018) 31. Thanki, R., Borra, S., Dey, N., Ashour, A.S.: Medical imaging and its objective quality assessment: an introduction. In Classification in BioApps (Springer), pp. 3–32 (2018) 32. Dey, N., Ashour, A.S., Borra, S.: Classification in BioApps: automation of Decision Making. Springer, Cham (2017) 33. Vidyasree, P., Madhavi, G., Viswanadharaju, S., Borra, S.: A bio-application for accident victim identification using biometrics. In: Dey, N., Ashour, A., Borra S. (eds.) Classification in BioApps, pp. 407–447. Springer, Cham (2018) 34. Garnier, J.-L., Bérubé, J., Hilliard, R.: Architecture guidance study report 140430, ISO/IEC JTC 1/SC 7 Software and systems engineering (2014) 35. Tamm, T., Seddon, P.B., Shanks, G., Reynolds, P.: How does enterprise architecture add value to organizations? Commun. Assoc. Inf. Syst. 28, 10 (2011) 36. Chen, H.-M., Kazman, R., Perry, O.: From software architecture analysis to service engineering: an empirical study of methodology development for enterprise SOA implementation. IEEE Trans. Serv. Comput. 3, 145–160 (2014) 37. MacKenzie, C.M., Laskey, K., McCabe, F., Brown, P.F., Metz, R.: Reference model for SOA 1.0. Technical Report. In: Advancing Open Standards for the Information Society (2006) 38. Newman, S.: Building Microservices: O’Reilly Media (2015) 39. Richards, M.: Microservices versus. In: Service-Oriented Architecture, 1st edn. O’Reilly Media (2015) 40. Muhammad, K., Khan, M.N.A.: Augmenting mobile cloud computing through enterprise architecture: survey paper. Int. J. Grid Distrib. Comput. 8, 323–336 (2015) 41. Gill, A.Q., Smith, S., Beydoun, G., Sugumaran, V.: Agile enterprise architecture: a case of a cloud technology-enabled government enterprise transformation. In: Proceedings of the 19th Pacific Asia Conference on Information Systems (PACIS), pp. 1–11 (2014) 42. Masuda, Y., Shirasaka, S., Yamamoto, S.: Integrating mobile IT/cloud into enterprise architecture: a comparative analysis. In: Proceedings of the 21th Pacific Asia Conference on Information Systems (PACIS), p. 4 (2016) 43. Gao, S., Gui, Z., Wu, H.: Extending WSDL for describing complex geodata in GIS service. In: Proceedings of the IEEE 3rd International Conference on Agro-Geoinformatics (AgroGeoinformatics), pp. 1–6 (2014) 44. Masuda, Y., Toma, T.: A Vision paper for enabling digital healthcare applications in OHP2030: KES2018, In: 6th International KES Conference on Innovation in Medicine & Healthcare (2018) 45. Daneshmand, M., Bilici, O., Bolotnikova, A., Anbarjafari, G.: Medical robots with potential applications in participatory and opportunistic remote sensing: a review. Elsevier 95, 160–180 (2014) 46. Vilchis, A., Troccaz, J., Cinquin, P., Masuda, K., Pellissier, F.: A new robot architecture for tele-echography. IEEE Trans. Robot. Autom. 19, 922–926 (2003)

Stumbling Blocks of Utilizing Medical and Health Data: Success Factors Extracted from Australia–Japan Comparison

Yoshiaki Fukami and Yoshimasa Masuda

1 Background

The use of information technology is effective in improving the efficiency of medical and welfare measures [1–3]. The OECD uses two indicators, (1) technical and operational readiness and (2) data governance readiness, to assess health data use in member countries. Countries such as the Scandinavian countries, those of North America, the United Kingdom, and Singapore are rated highly on both indicators. However, both Japan and Australia receive low ratings, even though Australia introduced a nationwide Electronic Health Record (EHR), known as the "Personally Controlled Electronic Health Record" (PCEHR), relatively early, and Japan is known for its high aging rate and rate of population decline [4]. Despite the countries' poor ratings from the OECD, there are some regional health and medical information platforms that have been successful in Japan. Tamba City, Hyogo Prefecture, has succeeded with an immunization determination system and is extending it into a regional comprehensive care support platform [5]. In this paper, we compare the PCEHR in Australia with the comprehensive community care platform in Tamba City, Hyogo Prefecture, to examine the factors that hinder the utilization of healthcare data and the factors that enable effective use, from institutional and technological points of view (Fig. 1).

Y. Fukami (B) Keio University, 5322 Endo, Fujisawa city, Kanagawa 2520882, Japan e-mail: [email protected] Y. Masuda Carnegie Mellon University, 5000 Forbes Ave, Pittsburgh, PA 15213, USA e-mail: [email protected] © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2020 Y.-W. Chen et al. (eds.), Innovation in Medicine and Healthcare, Smart Innovation, Systems and Technologies 192, https://doi.org/10.1007/978-981-15-5852-8_2


Fig. 1 Technical, operational, and governance readiness to use EHR data, 2016 [4]

2 Related Work

The use of data to maintain and improve medical standards has been promoted for a long time. An Electronic Medical Record (EMR) is a real-time patient health record with access to evidence-based decision support tools that can be used to aid clinicians in decision-making [6]. The ISO standard defines an EHR as a repository of information regarding the health status of a subject of care in computer-processable form [7]. An EHR can include past medical histories and medications, immunizations, laboratory data, radiology reports, and vital signs, as well as patient demographics [8]. More specifically, an EHR is a longitudinal electronic record of patient health information generated by one or more encounters in any care delivery setting, reporting episodes of care across multiple care delivery organizations within a community, region, or state [6]. EHR design is essentially a consolidation of data held by diverse medical institutions, since not everyone is tested and consulted at a single medical institution for a lifetime. Therefore, security for data transactions/sharing and interoperability are very important issues. At the same time, healthcare budgets are tight, and systems tend to be designed to ensure security at low cost. There are obstacles to implementation, such as a lack of funding and of interoperability with current systems, which decelerate the adoption of EHRs [9]. At the same time, there are barriers to the introduction of the EHR, as it aggregates important medical and health information. In addition, there is a high risk that the collected data will not be used effectively or will be used improperly. The goal is not to introduce an EHR itself, but to make data-based medical services more efficient and of higher quality.
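To make the notion of a longitudinal EHR concrete, the sketch below models a record that consolidates encounters reported by different care-delivery organizations; the field names and record shape are illustrative only and do not follow the ISO model cited above or any other standard.

from dataclasses import dataclass, field
from typing import List

@dataclass
class Encounter:
    # One episode of care reported by a single care-delivery organization
    organization: str
    date: str                 # ISO date string, e.g., "2019-09-12"
    diagnoses: List[str]
    prescriptions: List[str]

@dataclass
class ElectronicHealthRecord:
    patient_id: str
    encounters: List[Encounter] = field(default_factory=list)

    def add_encounter(self, encounter: Encounter) -> None:
        self.encounters.append(encounter)

    def medication_history(self) -> List[str]:
        # A longitudinal view assembled across organizations
        return [p for e in sorted(self.encounters, key=lambda e: e.date)
                  for p in e.prescriptions]

ehr = ElectronicHealthRecord("patient-42")
ehr.add_encounter(Encounter("City Hospital", "2019-09-12", ["hypertension"], ["amlodipine"]))
ehr.add_encounter(Encounter("Station Clinic", "2020-01-20", [], ["amlodipine", "metformin"]))
print(ehr.medication_history())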


3 Research Methodology

In this paper, we conduct a comparative analysis between the PCEHR in Australia and the comprehensive community care platform in Tamba City, Japan, using the OECD's indicators: data governance readiness and technical and operational readiness [10]. In addition, we decided to use three additional indicators: data analysis mechanism, security and privacy protection, and diffusion and coverage, to discuss the driving factors for progress in diffusion and utilization.

4 Overview of Each Case

4.1 Tamba City, Japan

Tamba City, Hyogo Prefecture, is located in a mountainous area near the center of Honshu, Japan. Tamba's efforts to build a medical and health information platform began with the introduction of a vaccination ledger-based management system managed by the city. Tablets owned by the city are distributed to vaccination medical institutions throughout the city and connected via a closed mobile network to the system in the city hall. At the same time, IC cards were distributed to those who were to be vaccinated. At the time of vaccination, the person to be vaccinated is authenticated with the IC card by the reader of the tablet. Then the available vaccinations and the target vaccine are displayed, to prevent mistakes in vaccination [5]. The immunization determination system has succeeded in reducing the number of vaccination accidents to almost zero (Table 1). This success encouraged the city to expand the system into a regional comprehensive care support platform. The system is being developed to provide the history of vaccinations, the results of medical examinations, and prescriptions to the electronic charts in hospitals, clinics, visiting care offices, and pharmacies. All data is provided through tablets owned by the municipal office over the closed network, in the same manner as the immunization records. However, the function of importing information from electronic medical charts into the platform is not implemented. Tamba City has adopted a gradual and evolutionary approach, and this approach does not share any medical information, including diagnoses, which doctors are resistant to sharing. The available information is limited

Table 1 Changes in the number of vaccination accidents (actual count / rate per 100,000 population)

Year | Hyogo pref. (excluding Tamba City) | Tamba City
FY2015 | 268/4.89 | 12/17.95
FY2016 | 305/5.58 | 4/6.05
FY2017 | 427/7.83 | 0/0.00
FY2018 (by November) | 281/5.17 | 1/1.55


Collection of prescription data began in September 2019, ahead of the other features. Tamba City, the medical association, the dental association, the pharmacists' association, the four hospitals, the social welfare council, and the system vendor established the Tamba Medical Care Collaboration (MCC) promotion organization to develop and operate the regional comprehensive care system. The prescription data is analyzed by city officials, and the results of the analysis are made available to MCC member companies and organizations.
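As a concrete sketch of the immunization determination flow described at the start of this subsection (authenticate the IC card, consult the ledger, and display only the vaccines the person may currently receive), the check might resemble the following. The ledger fields, age window, and dose interval are illustrative assumptions, not the actual Tamba City rules.

```python
from datetime import date

# Illustrative only: the ledger layout, age window, and dose interval are
# assumed values, not the actual Tamba City specification.
VACCINE_RULES = {
    "measles-rubella (MR), 1st dose": {"min_age_months": 12, "max_age_months": 24,
                                       "min_interval_days": 27},
}

def months_between(born: date, today: date) -> int:
    return (today.year - born.year) * 12 + (today.month - born.month)

def eligible_vaccines(person, today=None):
    """Return the vaccines that may be administered to this person today."""
    today = today or date.today()
    age_m = months_between(person["birth_date"], today)
    last = person.get("last_vaccination_date")
    available = []
    for name, rule in VACCINE_RULES.items():
        if name in person.get("completed", []):
            continue                                   # already given
        if not (rule["min_age_months"] <= age_m <= rule["max_age_months"]):
            continue                                   # outside the age window
        if last and (today - last).days < rule["min_interval_days"]:
            continue                                   # too soon after the last shot
        available.append(name)
    return available

# After the IC card is read, the tablet would display only these names.
person = {"birth_date": date(2019, 3, 1), "completed": [], "last_vaccination_date": None}
print(eligible_vaccines(person, today=date(2020, 6, 1)))
```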

4.2 Australian National Government

The PCEHR, the EHR system of the Australian national government, is known as "My Health Record" (MyHR) among residents. The PCEHR was designed by the National E-Health Transition Authority (NEHTA) together with the Australian Government. The PCEHR system enables the secure sharing of health information among healthcare providers while enabling individuals to control who can access their PCEHR. The PCEHR provides global access controls, sets of Medicare and healthcare records for patients, sets of PCEHR clinical documents from hospitals, and a consolidated view created from these data.

5 Comparative Analysis

5.1 Data Governance Readiness

Tamba's system has a very centralized architecture. Tamba City has distributed tablets to the core hospitals, clinics, the municipal medical checkup center, pharmacies, and visiting care offices, and these are connected to a database in the municipal office via a closed network. Tablets are also owned by the municipal office. The system is owned and operated by the municipal office, and the responsibility for privacy protection rests solely with it. The Australian PCEHR system, on the other hand, is designed to interface with a number of existing and new systems. These interfaces include consumer-oriented systems, healthcare provider-oriented systems, repositories such as Medicare data and pharmaceutical data, and foundation infrastructure such as Healthcare Identifiers and Clinical Terminology [11]. While Tamba City adopts a centralized governance model in which the municipal office owns all facilities, networks, and data, the Australian government has built a platform that connects systems managed by multiple stakeholders, which makes disciplined governance more difficult for the PCEHR to implement. This evaluation must be premised on the limited geographic coverage, the types of data collected, and the limited set of stakeholders of the Tamba platform, which make such a centralized architecture feasible.



5.2 High Technical and Operational Readiness

Tamba City plans for the data of the regional comprehensive care system to consist of vaccination records from the vaccination ledger in the municipal office, results of medical examinations from the municipal medical checkup center, and prescriptions from hospitals, clinics, and pharmacies. In addition, prescription data that has already been digitized and collected by the government is utilized. On the other hand, a function for importing the electronic medical charts of hospitals and clinics is not implemented, and although there is a function for storing nursing care records, it was not fully operational as of January 2020. In other words, the Tamba platform has unified and expanded the data operated and collected by the government, and no other data has been collected. The PCEHR, by contrast, is designed to aggregate and manage data collected from diverse sources such as EMRs at each medical institution, personally managed PHRs, and Medicare history. Assuming that data from various origins is handled in an integrated manner, the Australian government has formulated various specifications for the construction and operation of the PCEHR, including data models [11], a reference model [13], and an interoperability framework [14]. The platform in Tamba is designed to share data owned and generated by the municipal office with distributed clinical sites; the system is therefore designed in an integrated way from technical and operational perspectives. While the Australian government has developed specifications and models for multi-stakeholder participation, challenges remain in realizing interoperability among institutions. Lwin et al. (2019) pointed out that Computerized Physician Order Entry (CPOE) decreases the cost of medicine and reduces accidents [12]. Tamba City, however, does not implement a function to feed prescription history data back to doctors. Also, because of the small administrative area, it is not possible to accumulate enough data for CPOE to work. To improve the efficiency of diagnosis and prescription, it is necessary to expand the covered area and thereby increase the available learning data.

5.3 Data Analysis Mechanism

The immunization determination system of Tamba was developed for the operation of the subsidy system. Therefore, the analysis functions required by the government have been implemented from the beginning. In expanding the applications to regional comprehensive care, the first implementation was the import of prescription histories.


These histories already had machine-readable data compatible with national standard specifications: the New Standard Interface of Pharmacy-system Specifications (NSIPS), which allows in-pharmacy systems to respond to electronic prescriptions, and the specification of the Japan Association of Healthcare Information Systems Industry (JAHIS), an industry organization of medical information system vendors. By comparing the two types of prescription data, it is possible to confirm differences between the doctor's prescription and the actual medication, and to realize more effective medical guidance and analysis of drug efficacy. However, unstructured data has not been utilized. The PCEHR is compatible with HL7, and there are data models [11], a reference model [13], and an interoperability framework [14]. However, the sharing of machine-readable structured data between medical institutions has not yet been fully implemented because the system is a collection of systems built by three system vendors. The introduction of the PCEHR was supposed to help solve this problem, but it has not yet been solved. While Tamba City has succeeded in implementing a structured data architecture because of its small, centralized system, Australia is still struggling to introduce data specifications that realize interoperability among EHRs.
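A minimal sketch of the comparison just described, matching prescription records against dispensing records to flag discrepancies, might look like the following. The record fields and matching keys are illustrative assumptions and do not reproduce the NSIPS or JAHIS formats.

```python
# Sketch of the prescription comparison described above. Record fields and
# matching keys are illustrative; they do not reproduce NSIPS or JAHIS data.

prescribed = [   # from hospital/clinic records
    {"patient": "P-0001", "drug": "metformin 500 mg", "days": 30},
    {"patient": "P-0001", "drug": "atorvastatin 10 mg", "days": 30},
]
dispensed = [    # from pharmacy records
    {"patient": "P-0001", "drug": "metformin 500 mg", "days": 14},
]

def find_discrepancies(prescribed, dispensed):
    """Flag drugs that were prescribed but not dispensed as written."""
    dispensed_index = {(d["patient"], d["drug"]): d for d in dispensed}
    issues = []
    for p in prescribed:
        d = dispensed_index.get((p["patient"], p["drug"]))
        if d is None:
            issues.append(("not dispensed", p))
        elif d["days"] != p["days"]:
            issues.append(("quantity differs", p, d))
    return issues

for issue in find_discrepancies(prescribed, dispensed):
    print(issue)
```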

5.4 Security and Privacy Protection

Because of its centralized architecture, the Tamba City system protects privacy well. The distributed devices are Android tablets issued to medical institutions; they do not store user records locally and display records accumulated on the municipal office server only after the user has been identified. The PCEHR was developed with an emphasis on personal data control rights. However, because it is a nationwide platform on which personal information is shared between medical institutions, the risk of "secondary use" of personal information by the 900,000 healthcare workers who can access the system has become apparent. Additionally, health records can be created without the person's consent [15]. The privacy chief of the Australian EHR quit to take responsibility for these security and privacy issues [16]. The platform of Tamba City is provided only in a small area with limited functions; these restrictions allow unified management by the city and enhance the effectiveness of privacy protection. By contrast, the Australian platform accumulates various kinds of medical and healthcare information from medical institutions all over the nation, but with this comes a risk of data misuse by the many healthcare professionals involved. Although data control rights are held by individuals, the data itself can be generated by healthcare personnel. There is a trade-off between functionality and security.
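The tablet access pattern described above for Tamba (records are returned from the central server only after the user is identified, and nothing is persisted on the device) could be sketched roughly as follows; the mapping table, function name, and record format are assumptions for illustration only.

```python
# Illustrative sketch only: function and field names are assumptions, not
# the actual system's API.

AUTHORIZED_USERS = {"card-12345": "P-0001"}   # IC card ID -> patient ID (assumed mapping)
CENTRAL_RECORDS = {"P-0001": ["2020-01-15 prescription: amlodipine 5 mg"]}

def fetch_records(ic_card_id: str):
    patient_id = AUTHORIZED_USERS.get(ic_card_id)
    if patient_id is None:
        raise PermissionError("unknown card: access denied")
    # Returned for on-screen display only; the client never writes this
    # to local storage.
    return CENTRAL_RECORDS.get(patient_id, [])

print(fetch_records("card-12345"))
```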


5.5 Diffusion/Coverage

Registration for the Tamba City platform is optional. Nevertheless, it has succeeded in accumulating enough data to contribute to improving administrative operations and planning policies, by suppressing vaccination accidents among children and by starting the analysis of prescription data for the elderly. The system was developed in two stages: first as a vaccination determination system, then extended to a comprehensive community care system. The registration rate for children under 3 years exceeded 89% in 2018 because public health nurses register e-mail addresses when visiting newborns. As of December 2019, there were 2,270 system registrants (ID cards issued), against 21,675 registered residents aged 65 and over, meaning that the registration rate for the elderly is just under 10%. Registration of elderly people is carried out through solicitation during consultations at medical institutions and through promotion at shopping malls and the like. Tamba City started prescription data analysis by city staff in September 2019, aiming to balance administrative spending with maintaining and improving public health standards, mainly by improving the use of generic drugs in patients with lifestyle-related diseases. The results of the analysis are shared within the member organizations of the Tamba Medical Care Collaboration.

The PCEHR is designed to aggregate medical and health data such as medical institution EMRs and personally managed PHRs, as well as Medicare history (Table 2). The data to be aggregated is diverse and includes much unstructured data. Since it is assumed that various data will be accumulated in an integrated manner, the system is implemented in such a way that individual data control rights are maintained. In other words, a sufficient amount of data will not be stored unless many people agree to provide their data. The PCEHR initially registered users on an opt-in basis, but switched to opt-out in 2015 because of the lack of registered users. Medical records were available for 71,132 of the 1.7 million people who had agreed to utilize their records (4.2%) [17]. In 2016, following the introduction of the opt-out format, the number of individual registrants and registered medical institutions both increased significantly. At the end of 2019, 22.68 million of the 25 million residents were registered as individuals [18]. 51% of community pharmacies had registered as of June 2018, and 74% of public hospitals and health services were connected to the PCEHR in 2018 [19]. However, some articles report that analysis and utilization have not progressed even though data accumulation has [20]. While the use of structured data managed by the government is progressing in the small area of Tamba, the Australian national government has collected diverse data from the entire country; however, the diversity of the data, and especially the large amount of unstructured data, is a barrier to utilization [21].

Table 2 Document types registered to the PCEHR [19]

Clinical documents: Shared health summary; Discharge summary; Event summary; Specialist letter; eReferral note; Pathology report; Diagnostic imaging report
Prescription and dispense documents: Prescription; Dispense
Consumer documents: Consumer entered health summary; Consumer entered notes; Advance care directive custodian report; Advance care planning document
Medicare documents: Australian immunization register; Australian organ donor register; Medicare/DVA benefits report; Pharmaceutical benefits report
Child My Health Record documents: Personal health observation; Personal health achievement; Child parent questionnaire

6 Discussion

We have conducted a comparative analysis, using five indicators, of two contrasting cases that are both evaluated as low by the OECD. We conclude that each of the two cases has advantages and disadvantages (Table 3). Tamba City has succeeded in improving the efficiency of local medical and welfare measures by utilizing existing structured data in a small area. However, there are significant barriers to generating and utilizing new data and to expanding the target area. Also, as long as the centralized operation by the city hall is maintained, it is difficult to realize a system in which the accumulated data is analyzed by a variety of experts. The Tamba City case maximizes optimization for the current situation through the best use of existing data, to the extent that it complies with the existing legal system.


Table 3 Comparison of the two cases

Indicator | Platform of Tamba | PCEHR of Australia
Data governance readiness | Highly integrated operation of the municipal office; high barriers to transition to multi-stakeholder operation | Connecting systems built by multi-stakeholders
High technical and operational readiness | Realized by the centralized architecture of the municipal office | Specifications and models are being developed for multi-stakeholder participation
Data analysis mechanism | Efficient analysis using existing structured data only (no unstructured data) | There remain challenges to realizing interoperability among institutions
Security and privacy protection | Unified governance of the municipal office | Fostering individuals' data control rights, but the risk of secondary use remains
Diffusion/coverage | Small area, limited stakeholders | Nationwide, inclusion of various kinds of professionals

Note: This evaluation must be premised on the limited geographic coverage, the types of data collected, and the limited set of stakeholders of the Tamba platform, which make such a centralized architecture feasible.

On the other hand, the PCEHR has been developed as a nationwide system from the beginning, with the aim of utilizing various data resources in an integrated manner. At the same time, it employs a design that emphasizes individual data control rights. Despite detailed specifications, including the data model, interoperability between data held by various medical institutions has been only partly realized. The problem of lagging personal registration was resolved by switching from opt-in to opt-out, but the decision took a long time. What both cases show is the need for a strategy that can be applied consistently through conception, development, and social implementation, and for a method that can be updated quickly according to the situation. The Australian government has recently applied the Adaptive Integrated Digital Architecture Framework (AIDAF) [22] to the PCEHR to address its privacy issues [15]. The AIDAF is "an enterprise architecture framework model integrating an adaptive Enterprise Architecture (EA) cycle" [23] with the "STrategic Risk Mitigation Model (STRMM) for Digital Transformation" [24]. It may also be an effective framework for Tamba's platform; we have already used the AIDAF to analyze Tamba's platform from a data architecture perspective [5]. Developing a robust diffusion strategy from the beginning and updating it flexibly according to the situation would make it easier to overcome barriers to diffusion and to avoid over-adaptation to particular situations.


Acknowledgment This work was supported by JSPS Grant-in-Aid for Early-Career Scientists, Grant Number JP18K12858.

References 1. Barrows, R.C., Clayton, P.D.: Privacy, confidentiality, and electronic medical records (1996). https://academic.oup.com/jamia/article-abstract/3/2/139/708745, https://doi.org/10. 1136/jamia.1996.96236282 2. Wang, S.J., Middleton, B., Prosser, L.A., Bardon, C.G., Spurr, C.D., Carchidi, P.J., Kittler, A.F., Goldszer, R.C., Fairchild, D.G., Sussman, A.J., Kuperman, G.J., Bates, D.W.: A cost-benefit analysis of electronic medical records in primary care. Am. J. Med. 114, 397–403 (2003). https://doi.org/10.1016/S0002-9343(03)00057-3 3. Miller, R.H., Sim, I.: Physicians’ use of electronic medical records: barriers and solutions. Health Aff. 23, 116–126 (2004). https://doi.org/10.1377/hlthaff.23.2.116 4. OECD: Health in the 21st Century: putting data to work for stronger health systems (2019) 5. Fukami, Y., Masuda, Y.: Success factors for realizing regional comprehensive care by EHR with administrative data. In: Smart Innovation, Systems and Technologies, pp. 35–45. Springer (2019). https://doi.org/10.1007/978-981-13-8566-7_4 6. Aceto, G., Persico, V., Pescapé, A.: The role of information and communication technologies in healthcare: taxonomies, perspectives, and challenges (2018) 7. International Organization for Standardization: ISO/TR 20514:2005—Health informatics— Electronic health record—Definition, scope and context (2005) 8. World Health Organization: Management of patient information: Trends and challenges in member states (2012) 9. Devkota, B., Devkota, A.: Electronic health records: advantages of use and barriers to adoption. Heal. Renaiss. 11, 181–184 (2014). https://doi.org/10.3126/hren.v11i3.9629 10. Oderkirk, J.: Readiness of electronic health record systems to contribute to national health information and research (2017). https://doi.org/10.1787/9e296bf3-en 11. National E-Health Transition Authority of Australia: High-Level System Architecture PCEHR System Version 1.35 12. Lwin, A.K., Shepard, D.S., Masuda, Y.: Monetary and health benefits from better health data: estimating lives and dollars saved from universal adoption of the leapfrog safety and quality standards. In: Innovation in Medicine and Healthcare Systems, and Multimedia, pp. 21–33. Springer (2019) 13. National E-Health Transition Authority of Australia: eHealth Reference Model (2014) 14. National E-Health Transition Authority of Australia: Interoperability Framework v2.0 (2007) 15. Masuda, Y., Shepard, D.S., Yamamoto, S.: Adaptive governance on electronic health record in a digital IT era. In: 25th Americas Conference on Information Systems, AMCIS 2019, pp. 1–10 (2019) 16. Grubb, B.: My health record’s privacy chief quits amid claims agency not listening (2018) 17. Dearne, K.: An analysis of Commonwealth Government annual reports covering e-health and PCEHR activities in 2013–2014 (2014) 18. Australian Digital Health Agency: Statistics and Insights: Mar 2019–Dec 2019 (2019) 19. Australian Digital Health Agency: Annual Report 2017–2018 (2017) 20. Cowan, P.: Most Australian GP clinics aren’t using e-health records (2016). https://www.itn ews.com.au/news/mostaustralian-gp-clinics-arent-using-e-health-records-417807 21. Black, A.S.: eHealth-as-a-Service: a service based design approach for large scale eHealth architecture (2018) 22. Masuda, Y., Viswanathan, M.: Enterprise Architecture for Global Companies in a Digital IT Era: Adaptive Integrated Digital Architecture Framework (AIDAF). Springer (2019). https:// doi.org/10.1007/978-981-13-1083-6


23. Masuda, Y., Shirasaka, S., Yamamoto, S., Hardjono, T.: Architecture board practices in adaptive enterprise architecture with digital platform. Int. J. Enterp. Inf. Syst. 14, 1–20 (2018). https:// doi.org/10.4018/ijeis.2018010101 24. Masuda, Y., Shirasaka, S., Yamamoto, S., Hardjono, T.: Risk management for digital transformation in architecture board: a case study on global enterprise. In: 2017 6th IIAI International Congress on Advanced Applied Informatics (IIAI-AAI), pp. 255–262. IEEE (2017). https:// doi.org/10.1109/IIAI-AAI.2017.79

Digital Financial Incentives for Improved Population Health in the Americas Donald S. Shepard and Yoshimasa Masuda

1 Introduction

The United States, like most countries around the world, is striving to improve its population's cardiovascular health. Recent results, however, are mixed. On the favorable side, clinical medicine continues to improve. For example, a December 2019 review chronicled 39 improvements in the diagnosis and management of cardiovascular disease in the preceding six months alone [1]. On the worrying side, however, several lifestyle risk factors continue to worsen. The National Health and Nutrition Examination Survey (NHANES) documented significant increases in the prevalence of obesity from 1999–2000 to 2015–2016 (the most recent data) in both adults and youth [2]. More recent self-reported data through 2018 show that the prevalence of both obesity [3] and sedentary lifestyles has continued to worsen [4]. Obesity increases risk not only for cardiovascular disease but also for many types of cancer [5].

These challenges occur in a turbulent environment, including new technologies, globalization, shifts in customer needs, and new business models. Significant changes in cutting-edge information technology (IT), driven by recent developments in cloud computing and mobile IT (such as progress in big data technology), have emerged as new trends in IT. These new technologies are creating a "digital IT economy." They generate both business opportunities and business risks and force enterprises to innovate or face the consequences [6]. Enterprise architecture (EA) should be effective in contributing to the design of large integrated systems, which represent a major technical challenge in the era of cloud, mobile IT, big data, and digital IT in digital transformation.


From a comprehensive perspective, EA encompasses all enterprise artifacts, such as business, organization, applications, data, and infrastructure, to establish visibility of the current architecture and the future architecture/roadmap. To be effective, EA frameworks need to incorporate mobile IT and cloud IT [7, 8]. Previous papers teamed EA with an IT strategy promoting cloud and mobile IT in the "Adaptive Integrated EA Framework" [9] and, when linked with digital IT, the "Adaptive Integrated Digital Architecture Framework" (AIDAF) [10]. AIDAF should be well suited to manage the proposed fitness monitoring system described below, as it involves mobile IT, cloud and big data, and electronic health records (EHRs). This paper is organized as follows: the next section presents background and related work, followed by a proposed digital architecture framework for the fitness monitoring system with EHR using the AIDAF and plans for its evaluation. Finally, it outlines future issues and conclusions.

2 Background

2.1 Financial Incentives

In general, the challenge for improving population health is implementing and sustaining approaches that halt, and then reverse, the disturbing trends of rising obesity and declining cardiovascular health. Accumulating evidence suggests that smart financial incentives offer a promising approach. Several randomized trials have shown how financial incentives can improve cardiovascular indicators. For example, a 16-week trial evaluated both a deposit contract (in which participants deposited money, matched by the program, that could return up to $252 per month) and a lottery program (daily chances of $10 or $100) based on daily weight, compared to daily monitoring without incentives. Participants in both incentive conditions had significantly greater odds of meeting the program goal of losing one pound per week than control participants. The odds ratios were 7.7 (95% CI, 1.4–42.7) for the deposit approach and 9.4 (95% CI, 1.7–52.7) for the lottery [11]. A subsequent one-year randomized trial with weekly incentives or penalties of $20 found a highly significant 6.5-pound reduction (standard error 1.92, p < 0.001) in the intervention group compared to controls [12]. A multi-site trial of financial incentives targeting lipid reductions compared incentives to physicians, patients, or both against usual care [13]. Only the incentives shared between physicians and patients achieved significant improvements over the control group (p = 0.002), obtaining an improvement about one third larger than that in the control group alone.

In the future, smart digital systems could improve cardiovascular and cancer health through many types of reminders and incentives. Smart pill bottles can remind users to take prescribed medicines and record when the bottle is opened. Smart bathroom scales can record, time stamp, and transmit weight and body fat.


Electronic blood pressure cuffs can monitor blood pressure and help clinicians adjust medications. Continuous glucose monitors can help diabetics monitor blood sugar. Many types of organizations benefit from healthy members and could gain by implementing such incentive plans. Potential implementers include employers; medical providers; health plans; and health, life, and disability insurers. These organizations would realize the greatest gains from members for whom the scope for improvement is large. This applies to persons currently at elevated risk due to higher age, relevant co-morbidity (e.g., diabetes, hypertension, or elevated lipids), other risk factors (e.g., smoking, alcohol abuse, or obesity), past high medical costs, and sedentary lifestyles. Digital health systems could facilitate such targeted enrollment by collaborating with medical providers or health plans that already possess electronic health information. The providers and plans could ask their members whether they were interested in exploring their eligibility for such a plan. Those interested would allow the relevant data to be passed anonymously through an algorithm and receive the results, and those eligible would be invited to enroll. Recently, Apple published an evaluation of its Apple Watch system for detecting cardiac arrhythmias, which alerted the wearer to seek medical advice when needed [7].
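A hedged sketch of the anonymous screening step described above, in which a member's data is passed through a simple rule set to decide whether to invite enrollment, could look like the following. The thresholds, the cut-off, and the field names are assumptions for illustration only.

```python
# Hedged sketch of the screening algorithm mentioned above. Thresholds,
# weights, and field names are illustrative assumptions only.

def screen_for_enrollment(member: dict) -> bool:
    """Return True if the (anonymized) member record suggests large scope
    for improvement and therefore an enrollment invitation."""
    risk_points = 0
    if member.get("age", 0) >= 50:
        risk_points += 1
    if member.get("bmi", 0) >= 30:                      # obesity
        risk_points += 1
    if member.get("smoker"):
        risk_points += 1
    if member.get("diagnoses", set()) & {"diabetes", "hypertension", "hyperlipidemia"}:
        risk_points += 1
    if member.get("annual_medical_cost", 0) > 10_000:   # past high costs (USD, assumed)
        risk_points += 1
    if member.get("daily_steps", 10_000) < 5_000:       # sedentary lifestyle
        risk_points += 1
    return risk_points >= 2                             # assumed cut-off

candidate = {"age": 58, "bmi": 32, "smoker": False, "diagnoses": {"hypertension"},
             "annual_medical_cost": 4_200, "daily_steps": 3_800}
print(screen_for_enrollment(candidate))   # True -> invite to enroll
```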

2.2 Electronic Health Records

Over the 50 years that followed the first implementation of computerized patient medical records in the 1960s, technological advances in computing opened the way for advancements in EHRs and health care [14]. With software applications and stand-alone computer systems, record keeping migrated from paper documentation of patient data to digital forms [15]. The International Organization for Standardization (ISO) defines an electronic health/healthcare record (EHR) as a repository of information regarding a patient's health status and medical care in computer-processable form [16]. EHRs can include past medical history and medications, immunizations, laboratory data, radiology reports, and vital signs, as well as patient demographics [17]. Ideally, an EHR aligns information system strategies with the operations of the hospital or other healthcare organization.

2.3 Fitness Monitoring Systems

A patient-centered integrated healthcare system, such as a fitness monitoring system, is designed to monitor individuals' health in real time, to guide preventive care, and to increase health awareness and self-monitoring [18]. Outside the area of controlled research, many commercial entities are establishing incentive programs for customers and workers, and digital options are creating promising new alternatives.



For example, the Vitality Program offered in the United States by John Hancock Insurance gives participants a means to receive and keep a free Apple Watch [19]. The program builds on the psychological principle of loss aversion: if the member does not meet his or her fitness goal, he or she must pay for the previously free watch. The online bank FitnessBank offers another innovative approach [20]. The interest rate that an account holder receives each month depends on his or her average daily number of steps in the preceding month. The bank provides a standard rate schedule for working-age adults and an easier, senior schedule for savers age 65 and above. An account holder can earn the current maximum annual interest rate (2.2% in December 2019) by documenting a daily average of 12,500 or more steps (for a working-age adult) or 10,000 or more steps (for a senior) in the preceding month. By contrast, the highest-yielding general online account then paid only 1.85%. See Fig. 1.

Fig. 1 Example of a fitness monitoring system

To access FitnessBank's tiered rates, the saver must install the bank's Step Tracker app on his or her phone and authorize it to access data from Apple "Health," Google "Play," or a similar app. These apps obtain the steps from a smart watch or a Bluetooth-enabled wrist band (available for as little as $10), or from a pedometer built into late-model mobile phones. The Step Tracker automatically retrieves each day's steps, allows the user to monitor his or her progress, and transfers the monthly average to the bank. The smart feature of this system is that it requires no ongoing user action after setting up Step Tracker. This avoids both the potential errors of self-reported data and the inconvenience of regular visits to a gym, clinic, or laboratory. The bank's tiered interest rates provide ongoing health motivation for the customer and a source of new customers and referrals for the bank. On a global scale, a Health Awareness Monitoring Program offers real-time health monitoring in India [18]. Malekian Borujeni and colleagues have developed a real-time system for monitoring coronary indicators for patients at risk of heart failure [21].
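The tiered-rate logic described for FitnessBank could be sketched as below. The two step thresholds and the 2.2% maximum come from the description above, while the rate applied when the goal is missed is an assumed value for illustration.

```python
# Sketch of the tiered-rate logic described in the text. The thresholds and
# the 2.2% maximum come from the description above; the base rate for savers
# who miss the goal is an assumption.

def monthly_interest_rate(avg_daily_steps: float, age: int) -> float:
    """Annual rate (in percent) applied for the coming month, based on the
    previous month's average daily steps."""
    goal = 10_000 if age >= 65 else 12_500     # senior vs. working-age schedule
    top_rate = 2.2                             # maximum annual rate, Dec 2019
    base_rate = 1.0                            # assumed rate when the goal is missed
    return top_rate if avg_daily_steps >= goal else base_rate

print(monthly_interest_rate(13_100, age=45))   # 2.2 -- goal met
print(monthly_interest_rate(8_400, age=70))    # 1.0 -- senior goal (10,000) missed
```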


2.4 Cloud Computing, Big Data, Digital Healthcare, and Smartphones

Cloud computing is an economical option for accessing strong computing resources to analyze big data, with many applications in the healthcare industry [22]. For example, Veeva Systems is a cloud-computing-based software-as-a-service (SaaS) company focused on global life sciences and pharmaceutical applications [23]. The increasing implementation of big data analytics in healthcare provides the ability to examine large EHR data sets to uncover hidden patterns, unknown correlations, and other useful information [24, 25]. These advances in big data analytics are transforming analyses from descriptive to predictive and prescriptive [26]. Big data analytics in healthcare can contribute to evidence-based medicine, patient profile analyses, and improving healthcare systems [27]. Smartphones allow consumers to contribute and retrieve data directly from cloud-based systems, as would be required by most digital monitoring systems. According to the latest data, 65% of users with a mobile connection in Central and South America, and 80% in North America (the highest rate worldwide), had a smartphone in 2018. By 2025, these rates are forecast to grow to 79% and 90%, respectively [28]. Thus, smartphones are becoming widely available in the Americas.

2.5 Direction of Enterprise Architecture in a Digital IT Era

In the past 10 years, EA has become an important method for modeling the relationship between corporate and individual systems. ISO/IEC/IEEE 42010:2011 defines an architecture framework as "conventions, principles, and practices for the description of architecture established within a specific domain of application and/or community of stakeholders" [29]. EA is an essential element of corporate IT planning and offers benefits to companies, such as coordination between business and IT [30]. Chen et al. reported on EA integration with service-oriented architecture (SOA) [31]. OASIS, a public standards group [32], introduced a SOA reference model. Attention has also been focused on microservice architecture, which allows rapid adoption of new technologies such as mobile IT applications and cloud computing [33]. SOA and microservices differ greatly in terms of service characteristics [34]. Microservices are an approach for distributed systems, defined in terms of two basic forms: functional services exposed through an application program interface (API) layer, and infrastructure services [33]. Multiple microservices cooperating to function together enable implementation as a mobile IT application [35]. In terms of cloud computing, many mobile IT applications operate with SaaS cloud-based software [36]. Traditional EA approaches require months to develop an EA to achieve a cloud adoption strategy, and organizations demand adaptive EA that can iteratively develop and manage an EA for cloud technologies [37]. Moreover, few studies report on EA integration with mobile IT [10].


From the standpoint of EA for cloud computing, Masuda and colleagues recommend an adaptive EA framework supporting elements of cloud computing [38]. When promoting cloud and mobile IT, companies that have applied The Open Group Architecture Framework (TOGAF) or the Federal Enterprise Architecture Framework (FEAF) have adopted an integrated framework using the adaptive EA framework supporting elements of cloud computing [38]. In that approach, Masuda et al. proposed an adaptive integrated EA framework that supports IT strategies promoting cloud, mobile IT, and digital IT. In their case study, Masuda and colleagues proposed an EA framework integrating an adaptive EA cycle with TOGAF or a simple EA framework for different business units [9]. In the adaptive EA cycle, they recommend that project plan documents, including the architecture for new digital IT projects, be prepared on a short-term basis at the beginning of a project, referring to materials from the defining phase (e.g., architectural guidelines for security and digital IT, aligned with IT strategy) according to business needs. The assessment/architecture review phase examines the initiation documents for the IT project. In the rationalization phase, they recommend that stakeholders and the architecture board decide which systems will be replaced or decommissioned by the proposed new information systems. In the realization phase, they recommend that the project team begin implementing the new IT project after deliberating on issues and action items [9, 10]. In the adaptive EA cycle, corporations can adopt an EA framework such as TOGAF or a simple EA framework per operational division, aligning EA guiding principles with each division's principles; this accommodates varying strategies across business divisions in the mid to long term [9, 10].

2.6 Adaptive Integrated Digital Architecture Framework—Aligned with Digital IT Strategy

To be successful, digital financial incentives would need to be embedded in an appropriate IT strategy. Previous research has shown that the "Adaptive Integrated Digital Architecture Framework" (AIDAF) provides the flexibility to meet this need [9, 10]. It can address data security, responsiveness to operational units, and the needs of the sponsoring organization.

3 Plan for Fitness Monitoring with EHR Using AIDAF

Based on our literature review, Fig. 2 presents our plan for a fitness monitoring system. It could be extended to a broader health support system that also supports diagnosis, medication-taking, healthy food purchases, monitoring of and response to clinical signs, and other capabilities.


Fig. 2 Schema for a fitness monitoring system

A fitness monitoring system links several key elements, as depicted in Fig. 2. First, it entails outreach to make the population aware of the system and encourage participation. An employment-based system would invite and encourage workers and their families to join. A health or life insurance system, whether public or private, would first notify those covered. Social media can then be invaluable for encouraging enrollment. A snowball system, which encourages and potentially incentivizes members to enroll friends and co-workers, can boost participation. Next, it entails persons joining the system to allow their fitness to be monitored. This requires wearing a device, such as a smart watch or step counter, which tracks fitness activities and transmits the data to a central, cloud-based system. The Vitality Program, offered by Discovery Insurance in South Africa and John Hancock Insurance in the United States, offers participants an Apple Watch up front on an incentive-based installment plan; fulfilling a specified fitness goal each month waives that month's $25 installment fee [19]. An additional, optional element is to link parts of an EHR to the system. This could entail a health practitioner's confirmation of screening for risk factors and of progress in controlling them (e.g., blood pressure, weight, and cholesterol ratio). Buying healthy foods through linked grocery chains is another optional element of the Vitality Program. Members enter their subscriber number when making a purchase. The participating chain has previously classified items as healthy (e.g., vegetables) or unhealthy (e.g., candy), and the grocery store's electronic scanning automatically tallies the balance of healthy and unhealthy purchases, thereby extending fitness monitoring beyond the individual to the household for whom he or she is buying food. Suitably motivated and monitored, the participant regularly engages in health-promoting activities, such as walking, swimming, or team or individual sports.


The health benefits of fitness require a long-term lifestyle commitment. Thus, the behaviors and their monitoring need to be ones that participants enjoy and can maintain. Once established, the system can provide feedback on progress to the participant and their medical professionals. If desired, family or friends could also be included to offer encouragement. Messages can be tailored to encourage recipients to treat themselves when a reward has been earned, or to offer proven suggestions for improvement (e.g., one day at a time) where needed. Finally, the system provides immediate rewards. These can include an immediate gift, such as a voucher for a free cup of coffee at a national coffee chain, a voucher reducing the cost of the next grocery purchase, or reward points for buying merchandise online or at chain stores. The proposed model integrates the fitness monitoring system with other IT components, such as the EHR, within the AIDAF EA framework.
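The monthly feedback and reward step could be sketched roughly as follows. The step goal, the $25 installment waiver taken from the Vitality description above, and the points formula for grocery purchases are illustrative assumptions rather than any program's actual rules.

```python
# Sketch of the feedback/reward step. The monthly step goal, the $25
# installment waiver, and the points formula are illustrative assumptions.

def monthly_rewards(avg_daily_steps: float, purchases: list) -> dict:
    step_goal_met = avg_daily_steps >= 10_000            # assumed monthly goal
    healthy = sum(1 for p in purchases if p["category"] == "healthy")
    unhealthy = sum(1 for p in purchases if p["category"] == "unhealthy")
    grocery_points = max(0, (healthy - unhealthy) * 10)  # assumed points formula
    return {
        "watch_installment_waived": step_goal_met,       # waives the $25 fee
        "grocery_points": grocery_points,
        "message": ("Great month - treat yourself!" if step_goal_met
                    else "Try adding a short walk, one day at a time."),
    }

purchases = [{"item": "spinach", "category": "healthy"},
             {"item": "candy", "category": "unhealthy"},
             {"item": "apples", "category": "healthy"}]
print(monthly_rewards(11_200, purchases))
```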

4 Evaluation

The concept here could be evaluated through a partnership between academic researchers and a commercial partner to initiate a multi-phased observational study or trial of digital financial incentives for improving population health. The first phase would be a feasibility and acceptability pilot involving a company in the Americas providing life and/or health insurance, an interested medical organization (a large clinic and/or hospital), and interested subscribers covered by that insurance who receive care from the participating medical organization. The subscribers would agree to share health and physical activity data confidentially with the insurer. These would include data generated directly by the subscriber, such as steps walked or minutes of active exercise measured by a personal fitness device or smart watch, and data from medical providers on control of risk factors (e.g., blood pressure, smoking, weight, and lipid profiles). The insurer, in turn, would use this information to provide incentives to the subscriber, for example in the form of "points" or cash credits redeemable toward merchandise, entertainment, telephone airtime, or a free coffee voucher. The medical provider would also be informed so that the organization can provide encouragement or counseling. All participating organizations (the consumer, the insurer, and the medical provider) would give feedback to the researchers about strengths, weaknesses, and suggestions for the system. The insurer should gain through lower costs from healthier subscribers and a reputation for innovation. Using that feedback, the research group would develop the next version of the system in a process similar to beta testing. Once a system has earned favorable ratings from stakeholders, it would move to the evaluation phase, in which a randomized trial would be initiated. Interested consumers who volunteer would be offered a financial incentive for making their data available. Those consenting to join would be randomized between intervention and control arms, and those in the intervention arm would be offered digital incentives for healthy indicators.
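As a sketch of the assignment step in the evaluation phase, consenting volunteers could be randomized 1:1 between the intervention and control arms as follows; the fixed seed is only to make the illustration reproducible.

```python
import random

# Sketch of a simple 1:1 randomization of consenting volunteers between the
# intervention and control arms described above.

def randomize(participant_ids, seed=2020):
    rng = random.Random(seed)
    ids = list(participant_ids)
    rng.shuffle(ids)
    half = len(ids) // 2
    return {"intervention": ids[:half], "control": ids[half:]}

arms = randomize(["S001", "S002", "S003", "S004", "S005", "S006"])
print(arms["intervention"], arms["control"])
```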


5 Discussion

Incentive-based insurance is well established in South Africa (through the insurer Discovery), in Canada (through Manulife), and in the US (through John Hancock's Vitality Program) [19]. However, few incentive programs currently exist with the public providers and systems of South America. There, several countries have national insurance systems covering either the whole population (e.g., Colombia) or formal-sector workers (e.g., Mexico). Expansion to persons covered through public programs offers important opportunities to help achieve public health goals. By using AIDAF, the proposed fitness monitoring system helps ensure that incentive-based insurance meets the goal of improving population health and allows multiple devices to interconnect.

The proposed plan for digital financial incentives to improve population health faces technical, architectural, behavioral, and financial challenges. The technical challenge is obtaining the requisite data from both subscribers and health providers and transmitting it promptly, securely, and reliably to the monitoring organization. The architectural challenge is ensuring privacy, security, and responsiveness to each country's characteristics. The behavioral challenge is ensuring that the plan is sufficiently attractive to enroll and retain numerous subscribers; high enrollment ensures public health contributions and sufficient data for a robust evaluation. The financial challenge is to ensure that the plan is financially viable for the insurer, so that it can afford to continue offering incentives based on the calculated savings and reputational advantages.

6 Conclusions

Worsening population health affects many countries globally [39]. Fitness activities could help halt and reverse these trends. Our proposed model, based on AIDAF, offers a promising approach to motivating individuals to initiate and maintain fitness activities. First, by analyzing big data, it can identify ways to use social media, newsletters, and messaging from insurers and other organizations to encourage the initiation of fitness activities. Mobile devices can effortlessly record and relay such activities, from a few minutes of walking to an intensive gym workout. Incentives from employers and insurers can offer ongoing motivation to sustain these activities. Flexible, adaptive digital frameworks with cloud computing can link an individual's fitness activities, advice from his or her medical provider, and financial payments from insurers and other payers in a virtuous circle. These systems also generate the data needed to evaluate and refine such models and build on best practices.

Funding and Acknowledgments Prof. Shepard received salary support during the preparation of this chapter from Centers of Biomedical Research Excellence award P20GM103644 from the National Institute of General Medical Sciences. The authors thank Clare L. Hurley for editorial assistance.


References 1. Saperia, G.M., Yeon, S.B., Downey, B.: What’s new in cardiovascular medicine?. Last accessed 12 Feb 2020 2. Centers for Disease Control and Prevention. National Health and Nutrition Examination Survey, pp. 1–2, USA (2017) 3. Trust for America’s Health (TFAH). The State of Obesity (2019) 4. Yang, L., Cao, C., Kantor, E.D., Nguyen, L.H., Zheng, X., Park, Y., Giovannucci, E.L., Matthews, C.E., Colditz, G.A., Cao, Y.: Trends in sedentary behavior among the US population. JAMA 321, 1587–1597 (2019) 5. Lauby-Secretan, B., Scoccianti, C., Loomis, D., Grosse, Y., Bianchini, F., Straif, K., for the International Agency for Research on Cancer Handbook Working Group.: Body fatness and cancer—viewpoint of the IARC Working Group. N. Engl. J. Med. 375, 794–798 (2016) 6. Boardman, S., Harrington, E.: Open group snapshot-open platform 3.0TM . The Open Group (2015) 7. Alwadain, A., Fielt, E., Korthaus, A., Rosemann, M.: A comparative analysis of the integration of SOA elements in widely-used enterprise architecture frameworks. Int. J. Intell. Inf. Technol. 54–70 (2014) 8. Buckl, S., Matthes, F., Schulz, C., Schweda, C.M.: Exemplifying a framework for interrelating enterprise architecture concerns. In: Sicilia, M.A., Kop, C., Sartori, F. (eds.) Ontology, Conceptualization and Epistemology for Information Systems, Software Engineering and Service Science, pp. 33–46. Springer, Berlin, Heidelberg, New York (2010) 9. Masuda, Y., Shirasaka, S., Yamamoto, S., Hardjono, T.: Int. J. Enterp. Inf. Syst. IJEIS (IGI Global) 13, 1–22 (2017) 10. Masuda, Y., Shirasaka, S., Yamamoto, S., Hardjono, T.: Architecture board practices in adaptive enterprise architecture with digital platform. A Case Glob. Healthc. Enterp. Int. J. Enterp. Inf. Syst. (IGI Global) 14, 1 (2018) 11. Volpp, K.G., John, L.K., Troxel, A.B., Norton, L., Fassbender, J., Loewenstein, G.: Financial incentive-based approaches for weight loss: a randomized trial. JAMA 300, 2631–2637 (2008) 12. Driver, S.L., Hensrud, D.: Financial incentives for weight loss: a one–year randomized controlled clinical trial. J. Am. Coll. Cardiol. 61, 10 (2013) 13. Asch, D.A., Troxel, A.B., Stewart, W.F., Sequist, T.D., Jones, J.B., Hirsch, A.G., Hoffer, K., Zhu, J., Wang, W., Hodlofski, A., Frasch, A.B., et al.: Effect of financial incentives to physicians, patients, or both on lipid levels: a randomized clinical trial. JAMA 314, 1926–1935 (2015) 14. Turk, M.: Electronic health records: how to suture the gap between privacy and efficient delivery of healthcare. Brooklyn Law Rev. 80, 565–597 (2015) 15. Murphy-Abdouch, K., Biedermann, S.: The electronic health record. In: Fenton, S.H., Biedermann S. (eds.) Introduction to Healthcare Informatics, pp. 25–70. AHIMA Press, Chicago, IL (2014) 16. International Organization for Standardization (ISO): ISO/TR 20514 2005 Health Informatics—Electronic Health Record—Definition, Scope and Context. https://www.iso.org/sta ndard/39525.html. Last accessed 14 Feb 2020 17. World Health Organization. Management of Patient Information. http://apps.who.int/iris/bitstr eam/10665/76794/1/9789241504645_eng.pdf 18. Nedungadi, P., Jayakumar, A., Raman, R.: Personalized health monitoring system for managing well-being in rural areas. J. Netw. Comput. Appl. 42, 22 19. John Hancock Insurance. The Vitality Program. https://www.johnhancockinsurance.com/vit ality-program.html 20. Fitness Bank. Step Track and Earn. www.fitnessbank.fit. Last accessed 14 Feb 2020 21. 
Malekian, B.A., Fathy, M., Mozayani, N.: A hierarchical, scalable architecture for a real-time monitoring system for an electrocardiography, using context-aware computing. J. Biomed. Infom. 96, 103251 (2019) 22. Calabrese, B., Cannataro, M.: Cloud computing in healthcare and biomedicine. Scalable Comput.: Pract. Exp. 16, 1–18 (2015)


23. Veeva Systems. Veeva Medical Suite. https://www.veeva.com/products/medical-suite/ 24. Archenaa, J., Anita, E.M.: A survey of big data analytics in healthcare and government. Procedia Comput. Sci. 50, 408–413 (2015) 25. Chawla, N.V., Davis, D.A.: Bringing big data to personalized healthcare: a patient-centered framework. J. Gen. Internal Med. 28, 660–665 (2013) 26. Chang, H., Choi, M.: Big data and healthcare: Building an augmented world. Health Inf. Res. 22, 153–155 (2016) 27. Jee, K., Kim, G.-H.: Potentiality of big data in the medical sector: focus on how to reshape the healthcare system. Healthc. Inf. Res. 19, 79–85 (2013) 28. Richter F. Where smartphone adoption is still poised for growth. Statista. https://www.statista. com/chart/17148/smartphone-adoption-by-world-region/. Last accessed 22 Mar 2020 29. Garnier, J.-L., Bérubé, J., Hilliard, R.: Architecture guidance study report 140430, ISO/IEC JTC 1/SC 7 Software and Systems Engineering (2014) 30. Tamm, T., Seddon, P.B., Shanks, G., Reynolds, P.: How does enterprise architecture add value to organizations? Commun. Assoc. Inf. Syst. 28, 10 (2011) 31. Chen, H.-M., Kazman, R., Perry, O.: From software architecture analysis to service engineering: an empirical study of methodology development for enterprise SOA implementation. IEEE Trans. Serv. Comput. 3, 145–160 (2014) 32. MacKenzie, C.M., Laskey, K., McCabe, F., Brown, P.F., Metz, R.: Reference model for SOA 1.0. Technical report. In: Advancing Open Standards for the Information Society (2006) 33. Newman, S.: Building Microservices. O’Reilly Media (2015) 34. Richards, M.: Microservices Versus Service-Oriented Architecture, 1st edn. O’Reilly Media (2015) 35. Familiar, B.: Microservices, IoT, and Azure: Leveraging DevOps and Microservice Architecture to Deliver SaaS Solutions. Apress Media, LLC (2015) 36. Muhammad, K., Khan, M.N.A.: Augmenting mobile cloud computing through enterprise architecture: survey paper. Int. J. Grid Distrib. Comput. 8, 323–336 (2015) 37. Gill, A.Q., Smith, S., Beydoun, G., Sugumaran, V.: Agile enterprise architecture: a case of a cloud technology-enabled government enterprise transformation. In: Proceedings of the 19th Pacific Asia Conference on Information Systems (PACIS), pp. 1–11 (2014) 38. Masuda, Y., Shirasaka, S., Yamamoto, S.: Integrating mobile IT/cloud into enterprise architecture: a comparative analysis. In: Proceedings of the 21th Pacific Asia Conference on Information Systems (PACIS), Paper 4 (2016) 39. Institute for Health Metrics and Evaluation, IHME. http://www.healthdata.org. Last accessed 23 Mar 2020

Advanced ICT for Medicine and Healthcare

Trial Run of a Patient Call System Using Mobile Devices Kei Teramoto and Hiroshi Kondoh

1 Introduction

The waiting time between registration at the reception desk of a hospital and being called for examination is indefinite and a major cause of anxiety in outpatients [1–12]. Most of these patients must sit in the waiting room in a highly alert state of mind because they do not know when they will be called in. This waiting time is stressful, as the patients are limited in their ability to move around. Some medical institutions provide Personal Handy-phone System (PHS) terminals or bell devices to call patients right before their examination in order to relieve the burden of waiting. However, the use of PHS and bell devices is accompanied by challenges such as a limited range of usage and hardware costs. Therefore, to address these challenges, we developed a smartphone application that calls patients by sending them push notifications [13–15]. Subsequently, we conducted a trial run in four clinical departments at Tottori University Hospital. The patient call application has two main functions: (i) it allows the patient to leave the hospital while remaining registered, as long as they stay within a radius of 500 m; and (ii) it calls the patient to the examination room prior to the start of the examination by causing the patient's personal smartphone to play a sound. This paper reports the design of the proposed application and retrospectively evaluates the results of a post-trial patient survey.



Fig. 1 The system design of the patient call system

2 Method

The patient call system causes the patient's smartphone to play a sound and display a message when the doctor uses the patient call function in the electronic medical record. This chapter provides a general overview of the patient call system and describes each of its functions.

2.1 System Design

Steps 1–4 detail the process flow of the proposed call system (Fig. 1).

1. The patient installs the application on their smartphone and receives a registration ID from a Cloud Messaging Service (CMS). This ID is required for receiving messages through the application.
2. The patient then links their registration ID to their patient ID registered in the Electronic Medical Record (EMR) system. This allows the EMR to send messages to the patient through the CMS.
3. Using the EMR client, the doctor selects the desired patient and sends a message calling them to the examination room.
4. The patient checks the message displayed on their smartphone and makes their way to the waiting room.

The EMR has a function that sends a message to the patient through the CMS via a gateway server. The message takes the form of a push notification.
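The message flow in steps 2 and 3 could be sketched as follows. The CMS endpoint, API key, payload fields, and ID-mapping table are placeholders for illustration; they are not the actual cloud messaging service or the hospital's implementation.

```python
import requests

# Sketch of steps 2-3 above. The CMS endpoint, API key, and payload fields
# are placeholders, not the actual service or hospital implementation.

CMS_ENDPOINT = "https://cms.example.com/v1/send"        # hypothetical
CMS_API_KEY = "REPLACE_ME"
registrations = {"EMR-1001": "cms-token-abc123"}        # EMR patient ID -> CMS registration ID

def call_patient(emr_patient_id: str, room: str) -> bool:
    """Triggered from the EMR client; relays a push notification via the
    gateway server to the patient's smartphone."""
    token = registrations.get(emr_patient_id)
    if token is None:
        return False                                    # patient has not linked the app
    payload = {
        "to": token,
        "notification": {"title": "Please come to the examination room",
                         "body": f"Examination room {room} is ready for you."},
    }
    resp = requests.post(CMS_ENDPOINT,
                         json=payload,
                         headers={"Authorization": f"Bearer {CMS_API_KEY}"},
                         timeout=5)
    return resp.status_code == 200

# Example: the doctor selects the patient in the EMR client.
# call_patient("EMR-1001", room="3")
```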

2.2 Evaluation of the Proposed System

To evaluate the effectiveness of the proposed system and the number of patients using it, we estimated the number of patients who registered


during the four months of operation of the application, starting from the commencement of the trial run, and subsequently conducted a user survey. The number of registered patients was calculated from the number of EMR patient IDs that were linked at the time of registration. We evaluated the results of the user survey, which was automatically sent to patients after they used the application.

3 Results

3.1 Number of Users Registered

The total number of patients registered on the system during the roughly four-month trial period was estimated to be 1,015. Another 689 patients wanted to register but were unable to do so owing to issues related to the features or version of their Android smartphones. The mean age of the registered patients was 48 (SD ± 18.0) years. The patients comprised 90% male and 40% female.

3.2 Survey Results

Patients who used the proposed system rated it on a 5-point scale (1: bad to 5: extremely good) in an anonymous web-form user survey. Figure 2 shows the results of the user survey.

4 Discussion

Eighty-four percent of the Japanese population uses cellular phones, and 60.9% of them use smartphones [16]. Providing a smartphone application with a patient call function enables outpatients to use their time effectively while waiting to be called for their examination. Additionally, because the use of smartphones eliminates the need for medical institutions to procure physical devices to call patients, the proposed system does not require any additional hardware or hardware maintenance costs. However, there are ongoing costs for maintaining the quality of the smartphone application itself. In particular, different versions of the application must be maintained because smartphone users are divided between operating systems (iOS or Android), and Android especially requires different versions depending on the OS version in use. For this reason, this trial was available only to users with Android OS version 8.0 or higher. As a result, 10% of the patients at TUH with Android smartphones were unable to use the call function because their Android OS versions were lower than 8.0.


Fig. 2 Results of the 5-point scale survey from the patients for the proposed patient call system

The functions and personal customizations of each smartphone model also affect the use of the application. Therefore, the use of this application requires a significant amount of testing and analysis on the part of the application provider. From the survey results, we found that 86% of patients rated this system favorably. We also observed that the system is effective in alleviating the stress outpatients experience while waiting to be called for their examinations. We were, however, unable to sufficiently verify the medical history of the patients who used the patient call system, because the web survey for this trial was anonymous. Linking patients' medical histories to the survey results would clarify which groups of patients (by clinical department, illness, age, gender, etc.) proactively use the patient call system. The system was operated for a trial period of four months, and the trial was conducted in a limited number of clinical departments; we could therefore not confirm the percentage of patients using the system by department. Increasing the number of clinical departments using the system will enable us to understand differences in usage by clinical condition and patient history. The number of applications that can manage and record an individual's biological information on their smartphone has increased with the advancement of smartphones. If this information can be collected and imported into a patient's EMR through the patient call system, it may be valuable for outpatient examinations. The patient call system can be equipped to import smartphone information because it is linked to the EMR.


5 Conclusions

In this study, a patient call application for smartphones was developed with the aim of alleviating the stress experienced by outpatients while waiting to be called for their examinations. Over 86% of the outpatients who used the application rated it favorably; the patient call system was therefore verified to be effective. The system can be used by hospitals and institutions where long outpatient waiting times are a problem.

References 1. Carter, O., Pannekoek, L., Fursland, A., Allen, K.L., Lampard, A.M., Byrne, S.M.: Increased wait-list time predicts dropout from outpatient enhanced cognitive behaviour therapy (CBT-E) for eating disorders. Behav. Res. Ther. 50, 487–492 (2012) 2. Chung, S., Johns, N., Zhao, B., Romanelli, R., Pu, J., Palaniappan, L.P., Luft, H.: Clocks moving at different speeds: cultural variation in the satisfaction with wait time for outpatient care. Med. Care 54, 269–276 (2016) 3. Cole, F.L., Mackey, T.A., Lindenberg, J.: Wait time and satisfaction with care and service at a nurse practitioner managed clinic. J. Am. Acad. Nurse Pract. 13, 467–472 (2001) 4. Conley, K., Chambers, C., Elnahal, S., Choflet, A., Williams, K., DeWeese, T., Herman, J., Dada, M.: Using a real-time location system to measure patient flow in a radiation oncology outpatient clinic. Pract. Radiat. Oncol. 8, 317–323 (2018) 5. Esimai, O.A., Omoniyi-Esan, G.O.: Wait time and service satisfaction at Antenatal Clinic, Obafemi Awolowo University Ile-Ife. East Afr. J. Public Health 6, 309–311 (2009) 6. Gjolaj, L.N., Campos, G.G., Olier-Pino, A.I., Fernandez, G.L.: Delivering patient value by using process improvement tools to decrease patient wait time in an outpatient oncology infusion unit. J. Oncol. Pract. 12, e95–e100 (2016) 7. Gjolaj, L.N., Gari, G.A., Olier-Pino, A.I., Garcia, J.D., Fernandez, G.L.: Decreasing laboratory turnaround time and patient wait time by implementing process improvement methodologies in an outpatient oncology infusion unit. J. Oncol. Pract. 10, 380–382 (2014) 8. Guerrero, E., Andrews, C.M.: Cultural competence in outpatient substance abuse treatment: measurement and relationship to wait time and retention. Drug Alcohol Depend. 119, e13–22 (2011) 9. Herd, T.J., Nopper, A.J., Horii, K.A.: Effect of a referral-only policy on wait time for outpatient pediatric dermatology appointments. Pediatr. Dermatol. 34, 369–370 (2017) 10. Lewis, A.K., Harding, K.E., Snowdon, D.A., Taylor, N.F.: Reducing wait time from referral to first visit for community outpatient services may contribute to better health outcomes: a systematic review. BMC Health Serv. Res. 18, 869 (2018) 11. Okotie, O.T., Patel, N., Gonzalez, C.M.: The effect of patient arrival time on overall wait time and utilization of physician and examination room resources in the outpatient urology clinic. Adv. Urol. 507436 (2008) 12. Parikh, A., Gupta, K., Wilson, A.C., Fields, K., Cosgrove, N.M., Kostis, J.B.: The effectiveness of outpatient appointment reminder systems in reducing no-show rates. Am. J. Med. 123, 542–548 (2010) 13. Yoo, S., Jung, S.Y., Kim, S., Kim, E., Lee, K.H., Chung, E., Hwang, H.: A personalized mobile patient guide system for a patient-centered smart hospital: lessons learned from a usability test and satisfaction survey in a tertiary university hospital. Int. J. Med. Inform. 91, 20–30 (2016) 14. Baek, M., Koo, B.K., Kim, B.J., Hong, K.R., Kim, J., Yoo, S., Hwang, H., Seo, J., Kim, D., Shin, K.: Development and utilization of a patient-oriented outpatient guidance system. Healthc Inform Res 22, 172–177 (2016)


15. Yoo, S., Kim, S., Kim, E., Jung, E., Lee, K.H., Hwang, H.: Real-time location system-based asset tracking in the healthcare field: lessons learned from a feasibility study. BMC Med. Inform. Decis. Mak. 18, 80 (2018) 16. Homepage. http://www.soumu.go.jp/johotsusintokei/statistics/statistics05.html, Last accessed 5 Jan 2020

Advance Watermarking Algorithm Using SURF with DWT and DCT for CT Images Saqib Ali Nawaz, Jingbing Li, Uzair Aslam Bhatti, Muhammad Usman Shoukat, and Anum Mehmood

1 Introduction

With the continuous development of biomedical engineering, digitization has entered the medical field. A large number of medical images such as Computed Tomography (CT), Magnetic Resonance Imaging (MRI), and Ultrasound (US) images are widely used [1], and information exchange is becoming increasingly common.


However, in the process of medical image transmission there is also the possibility of information destruction, and the security of these images has attracted the attention of many experts and scholars. Digital watermarking technology can effectively improve the security of medical images: it embeds watermark identity information in a way that does not affect the use value of the image itself and is not easily perceived by the human eye [2]. At present, research on digital watermarking is mainly divided into two approaches: the spatial domain and the transform domain (DCT [3], DWT [4], DFT [5], SURF [6]). Transform-domain watermarking algorithms offer high concealment and security and are widely used, but their computational complexity is also high. SURF has been used in watermarking algorithms and has good characteristics [7–9]. Combining SURF and DCT can improve the invisibility of the watermark, but because the watermark is embedded in all sub-bands, the embedding and extraction process becomes complicated and time-consuming. Applying the DWT to obtain four sub-bands can greatly reduce the complexity of the algorithm. Figure 1 shows different types of watermarking methods. Drawing on previous work in digital watermarking and compressed sensing, this paper proposes a medical CT image watermarking algorithm based on compressed sensing and DWT–DCT that addresses the requirements medical images place on watermarking technology. The algorithm combines the advantages of DWT, DCT, and SURF to enhance the concealment of the watermark information. At the same time, compressed sensing is applied to the watermark information, which not only encrypts the watermark but also reduces the amount of watermark information and the complexity of the algorithm.

Fig. 1 Different types of watermarking techniques


Simulation results show that the scheme meets the requirements of medical image watermarking technology: while enhancing the concealment of the watermark, it can also resist common image attack methods.

2 Background

2.1 Discrete Wavelet Transform

After a first-level discrete wavelet decomposition, an image is split into four frequency bands. LL contains the low-frequency coefficients, which concentrate most of the energy in the image and therefore have good robustness and strong resistance to attack. LH and HL contain the intermediate-frequency coefficients, and HH the high-frequency coefficients. The medium- and high-frequency coefficients carry more of the edge and detail information of the original image; they contain less of its energy, are more vulnerable to external attacks, and are less stable.
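For illustration, the short sketch below (not part of the paper's implementation; PyWavelets and the Haar wavelet are assumptions) shows how a one-level 2D DWT of an image yields the four sub-bands described above.

```python
# Minimal sketch: one-level 2D DWT of a stand-in CT slice using PyWavelets.
import numpy as np
import pywt

image = np.random.rand(512, 512)            # stand-in for a 512x512 CT slice

# 'haar' is an assumed wavelet; PyWavelets returns the approximation band and
# three detail bands (the LH/HL naming convention can vary between tools).
LL, (LH, HL, HH) = pywt.dwt2(image, 'haar')

print(LL.shape, LH.shape, HL.shape, HH.shape)   # each sub-band is 256 x 256
```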

2.2 Discrete Cosine Transform

In the field of digital image processing, the discrete cosine transform has the advantages of a high compression ratio, low computational complexity, low information complexity, and a low bit error rate [10–13]. It concentrates the energy of a data block into the low-frequency coefficients in the upper-left corner of the coefficient matrix.

2.3 SURF (Speeded Up Robust Features)

As an image local feature descriptor based on scale space, SURF is an improvement of the SIFT operator [14]. The SURF algorithm uses the Hessian matrix for extreme-point detection; the Hessian matrix at scale $\sigma$ can be expressed as

\[
H(X,\sigma) = \begin{bmatrix} L_{xx}(X,\sigma) & L_{xy}(X,\sigma) \\ L_{yx}(X,\sigma) & L_{yy}(X,\sigma) \end{bmatrix} \tag{1}
\]

where $L_{xx}(X,\sigma)$ is the convolution of the Gaussian second-order derivative with the image $I(x,y)$ at point $X$, and $L_{xy}(X,\sigma)$, $L_{yx}(X,\sigma)$, and $L_{yy}(X,\sigma)$ have similar meanings [15, 16].
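To make Eq. (1) concrete, the following sketch (my own illustration, not the authors' code) computes the Gaussian second-derivative responses with SciPy and forms the Hessian determinant that SURF-style detectors threshold to find interest points.

```python
# Sketch of the scale-space Hessian of Eq. (1): Gaussian second derivatives
# of the image at scale sigma and the determinant response. Illustrative only;
# the SURF box-filter approximation and its weighting are omitted.
import numpy as np
from scipy.ndimage import gaussian_filter

def hessian_response(image, sigma):
    Lxx = gaussian_filter(image, sigma, order=(0, 2))   # d^2/dx^2
    Lyy = gaussian_filter(image, sigma, order=(2, 0))   # d^2/dy^2
    Lxy = gaussian_filter(image, sigma, order=(1, 1))   # d^2/(dx dy)
    return Lxx * Lyy - Lxy ** 2                         # det(H) per pixel

response = hessian_response(np.random.rand(512, 512), sigma=2.0)
```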


Fig. 2 Medical image with watermark

3 Proposed Algorithm

3.1 Watermark Preprocessing

The digital watermark is a self-made grayscale image with a size of 512 × 512, as shown in Fig. 2. Before the compressed-sensing processing, the signal must be represented on a suitable sparse basis; this paper uses the wavelet transform to sparsify the watermark image. The compression ratio is set to 0.8, and the perception (measurement) matrix is a Gaussian random matrix. When extracting the watermark, one must not only select a suitable reconstruction algorithm but also know the perception matrix and the related parameters. Because the perception matrix can be constructed in different ways, it is essentially impossible to determine it without knowing its construction conditions. The compression ratio is also one of the keys. The perception matrix and the compression ratio therefore doubly encrypt the watermark, which greatly improves the security of the watermark information.
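A minimal numerical sketch of this preprocessing is shown below, under stated assumptions: the wavelet basis, the vectorization of the coefficients, and the small watermark size (used only to keep the dense measurement matrix manageable) are mine, not the paper's.

```python
# Sketch of the watermark preprocessing: wavelet sparsification followed by a
# Gaussian random measurement at compression ratio 0.8. Illustrative only.
import numpy as np
import pywt

rng = np.random.default_rng(0)
watermark = (rng.random((32, 32)) > 0.5).astype(float)   # small stand-in watermark

# 1) Sparse representation via a 2D wavelet transform ('haar' is assumed).
coeffs = pywt.wavedec2(watermark, 'haar', level=2)
x, _slices = pywt.coeffs_to_array(coeffs)
x = x.ravel()

# 2) Gaussian random perception (measurement) matrix, compression ratio 0.8.
m = int(0.8 * x.size)
Phi = rng.standard_normal((m, x.size)) / np.sqrt(m)
y = Phi @ x                      # compressed, encrypted watermark measurements

# Phi, the wavelet basis, and the compression ratio act as keys: without them
# the watermark cannot be reconstructed (e.g., by OMP or basis pursuit).
```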

3.2 Watermark Embedding

(1) Perform DWT decomposition on the medical carrier image to obtain four sub-bands: LL1, HL1, LH1, and HH1. Select the HH1 sub-band as the watermark-embedding sub-band to improve the invisibility of the watermark.
(2) Perform the DCT on the HH1 sub-band; denote the result as DCTA1.
(3) Perform SURF on DCTA1, i.e., DCTA1 = U1 S1 V1^T; S1 is key 1.


Fig. 3 Watermark embedding flowchart

(4) Apply the watermark encryption (compressed-sensing) processing to the watermark image to obtain the perceptual image; here the wavelet basis, perception matrix, and compression ratio can be regarded as a key.
(5) Perform the same operations (1) and (2) on the perceptual image obtained from the compressed-sensing processing of the watermark image to obtain DCTB1, and perform SURF on DCTB1, i.e., DCTB1 = U2 S2 V2^T; S2 can also be regarded as a key.
(6) Watermark embedding: S_C1 = S3 = S1 + e·S2, where e is the embedding strength.
(7) Reconstruct the DCT matrix as U1 S3 V1^T = DCTA11.
(8) Finally, use the IDCT and IDWT to reconstruct the watermarked image.

The watermark embedding flowchart is shown in Fig. 3; a code sketch of these steps is given below.
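The sketch below follows steps (1)–(8) with NumPy, SciPy, and PyWavelets; the U S Vᵀ factorization written in step (3) is computed here with an SVD, the Haar wavelet and the embedding strength e are assumed values, and the compressed-sensing step is represented by a preprocessed watermark input.

```python
# Sketch of embedding steps (1)-(8). Not the authors' code; an illustration
# of the DWT -> DCT -> U*S*V^T -> S3 = S1 + e*S2 pipeline described above.
import numpy as np
import pywt
from scipy.fft import dctn, idctn

def embed(carrier, wm_image, e=0.05):
    """carrier, wm_image: 2D arrays of equal size; e: assumed embedding strength."""
    LL1, (LH1, HL1, HH1) = pywt.dwt2(carrier, 'haar')      # (1) DWT, take HH1
    A1 = dctn(HH1, norm='ortho')                           # (2) DCT of HH1
    U1, S1, V1t = np.linalg.svd(A1)                        # (3) A1 = U1 S1 V1^T
    _, (_, _, HHw) = pywt.dwt2(wm_image, 'haar')           # (5) same on watermark
    _, S2, _ = np.linalg.svd(dctn(HHw, norm='ortho'))
    S3 = S1 + e * S2                                       # (6) embed singular values
    HH1_marked = idctn(U1 @ np.diag(S3) @ V1t, norm='ortho')   # (7)-(8) rebuild
    marked = pywt.idwt2((LL1, (LH1, HL1, HH1_marked)), 'haar')
    return marked, S1                                      # S1 is key 1

marked, key_S1 = embed(np.random.rand(512, 512), np.random.rand(512, 512))
```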

3.3 Extraction of Watermark

(1) Perform DWT decomposition on the watermarked image to obtain four sub-bands: LL2, HL2, LH2, and HH2.
(2) Perform the DCT on the HH2 sub-band; denote the result as DCTA2.
(3) Perform SURF on DCTA2, i.e., DCTA2 = UC1 SC1 VC1^T.
(4) Watermark extraction: SC2 = (SC1 − S1)/e.
(5) Use the IDCT and IDWT to reconstruct the watermark perceptual image.


Fig. 4 Watermark extraction flowchart

(6) Perform sparse reconstruction of the perceptual image using the proposed algorithm, as shown in Fig. 4. A code sketch of the extraction steps is given below.
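The following sketch mirrors extraction steps (1)–(4) for the embedding sketch above; steps (5)–(6), the inverse transforms and the compressed-sensing reconstruction, are omitted here and would require the perception-matrix key.

```python
# Sketch of extraction steps (1)-(4), reusing the conventions of the embedding
# sketch (Haar wavelet, orthonormal DCT, SVD in place of the U*S*V^T notation).
import numpy as np
import pywt
from scipy.fft import dctn

def extract_singular_values(marked, key_S1, e=0.05):
    _, (_, _, HH2) = pywt.dwt2(marked, 'haar')             # (1) DWT of marked image
    A2 = dctn(HH2, norm='ortho')                           # (2) DCT of HH2
    _, S_C1, _ = np.linalg.svd(A2)                         # (3) obtain S_C1
    return (S_C1 - key_S1) / e                             # (4) S_C2 = (S_C1 - S1)/e
```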

4 Simulation Experiments and Results Analysis

In order to verify the effectiveness of the algorithm, we use Matlab 2015a as the experimental environment for the simulation experiments. A 512 × 512 brain CT grayscale image is used as the original carrier image, and a self-made watermark image is used as the embedded watermark. The simulation is divided into two experiments: experiment one verifies the effect of the algorithm on watermark embedding and extraction; experiment two is an attack test, i.e., it evaluates how robust the watermark is. We use NC, MSE, and PSNR (Peak Signal-to-Noise Ratio) to evaluate the image quality after embedding the watermark:

\[
\mathrm{PSNR} = 10\,\lg\!\left[\frac{M N \max_{i,j}\bigl(I_{i,j}\bigr)^{2}}{\sum_{i}\sum_{j}\bigl(I_{i,j}-I'_{i,j}\bigr)^{2}}\right] \tag{2}
\]

\[
\mathrm{MSE} = \frac{1}{M\times N}\sum_{i=1}^{M}\sum_{j=1}^{N}\bigl(W_{i,j}-W'_{i,j}\bigr)^{2} \tag{3}
\]

\[
\mathrm{NC} = \frac{\sum_{i}\sum_{j} W_{(i,j)}\,W'_{(i,j)}}{\sum_{i}\sum_{j} W_{(i,j)}^{2}} \tag{4}
\]
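For reference, NumPy versions of Eqs. (2)–(4) are sketched below, where I and I′ denote the original and watermarked images and W and W′ the original and extracted watermarks; the prime notation follows the reconstruction above.

```python
# Sketch of the evaluation metrics in Eqs. (2)-(4).
import numpy as np

def psnr(I, I_marked):
    """Eq. (2): peak signal-to-noise ratio of the watermarked image (dB)."""
    M, N = I.shape
    return 10 * np.log10(M * N * I.max() ** 2 / np.sum((I - I_marked) ** 2))

def mse(W, W_extracted):
    """Eq. (3): mean squared error between original and extracted watermarks."""
    return np.mean((W - W_extracted) ** 2)

def nc(W, W_extracted):
    """Eq. (4): normalized correlation between original and extracted watermarks."""
    return np.sum(W * W_extracted) / np.sum(W ** 2)
```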

Therefore, it can be seen that the algorithm provides good concealment of the embedded watermark. This shows that the proposed algorithm meets the requirements that medical CT images place on watermarking technology and achieves the goal of this experiment (Table 1). Experiment two is an attack test: noise, filtering, and geometric attacks are applied to the watermarked image, and the algorithm of this paper is then used to extract the watermark information.


Table 1 Results after conventional attacks

Conventional attack    Gaussian noise                      JPEG compression
                       4%        6%        8%              10%       20%       30%
PSNR (dB)              15.82     14.21     13.09           33.52     37.37     39.6
NC                     0.85      0.9       0.9             0.81      0.86      0.9

The NC values are approximately 0.90 or higher, which shows that our algorithm is robust and secure. Figure 5 and Table 2 show the results after attacks.

Fig. 5 Results after different attacks: a rotation 40° anticlockwise, b Gaussian noise (10%), c translation left 15%, d scaling 0.4

Table 2 PSNR and NC under geometric attacks based on SURF with DCT and DWT

Geometric attack           Attack strength      PSNR (dB)              NC
Rotation (clockwise)       5°, 10°, 15°         17.82, 16.75, 16.13    0.91, 0.85, 0.85
Rotation (anticlockwise)   10°, 30°, 40°        14.14, 14.84, 14.16    0.90, 0.81, 0.94
Scaling                    ×0.4, ×0.6           –, –                   0.73, 0.81
Translation (left)         3%, 15%, 25%         15.50, 12.04, 11.67    0.90, 0.82, 0.81
Translation (down)         8%, 15%, 20%         15.87, 13.55, 12.45    1, 1, 0.95
Clipping (Y direction)     5%, 15%              –, –                   0.90, 0.90
Clipping (X direction)     10%, 30%             –, –                   0.90, 0.85


5 Conclusion

A new watermarking algorithm for medical CT images is proposed. The algorithm uses compressed sensing to preprocess the watermark image and, by combining the advantages of DCT, DWT, and SURF, satisfies the requirements that medical CT images place on watermarking technology. The simulation results show that the algorithm is suitable for watermarking medical CT images: it offers high concealment and can resist common attack methods.

Acknowledgments This work is supported by the Hainan Provincial Natural Science Foundation of China [No. 2019RC018], the Natural Science Foundation of Hainan [617048, 2018CXTD333], the Science and Technology Research Project of Chongqing Education Commission [KJQN201800442], and the Special Scientific Research Project of Philosophy and Social Sciences of Chongqing Medical University [201703].

References 1. Wu, X., et al.: Contourlet-DCT based multiple robust watermarkings for medical images. Multimed. Tools Appl. 78(7), 8463–8480 (2019) 2. Jayashree, N., Bhuvaneswaran, R.S.: A robust image watermarking scheme using Z-transform, discrete wavelet transform and bidiagonal singular value decomposition. Comput. Mater. Contin. 58(1), 263–285 (2019) 3. Jiansheng, M., Sukang, L., Xiaomei, T.: A digital watermarking algorithm based on DCT and DWT. In: Proceedings. The 2009 International Symposium on Web Information Systems and Applications (WISA 2009). Academy Publisher (2009) 4. Feng, J-B., et al. (2006) Reversible watermarking: current status and key issues. IJ Netw. Secur. 2(3), 161–170 (2006) 5. Bianchi, T., Piva, A.: Secure watermarking for multimedia content protection: a review of its benefits and open issues. IEEE Signal Process. Mag. 30(2), 87–96 (2013) 6. Xiang, L., Li, Y., Hao, W., Yang, P., Shen, X.: Reversible natural language watermarking using synonym substitution and arithmetic coding. Comput. Mater. Contin. 55(3), 541–559 (2018) 7. Bhatti, U., Huang, M., Di, W., Zhang, Y., Mehmood, A., Han, H.: Recommendation system using feature extraction and pattern recognition in clinical care systems. Enterp. Inf. Syst. 13(3), 329–351 (2019) 8. Wang, Y., Ni, R., Zhao, Y., Xian, M.: Watermark embedding for direct binary searched halftone images by adopting visual cryptography. Comput. Mater. Contin. 55(2), 255–265 (2018) 9. Liu, Y., Li, J., Liu, J., Bhatti, U.A., Chen, Y., Hu, S.: Watermarking algorithm for encrypted medical image based on DCT-DFRFT. In: Chen, Y.W., Zimmermann, A., Howlett, R., Jain, L. (eds.) Innovation in Medicine and Healthcare Systems, and Multimedia. Smart Innovation, Systems and Technologies, vol. 145. Springer, Singapore (2019) 10. Luo, H., et al.: A robust image watermarking based on image restoration using SIFT. Radioengineering 20(2), 525–532 (2011) 11. Liu, J., Li, J., Zhang, K., Bhatti, U.A., Ai, Y.: Zero-watermarking algorithm for medical images based on dual-tree complex wavelet transform and discrete cosine transform. J. Med. Imaging Health Inform. 9(1), 188–194 (2019) 12. Dai, Q., Li, J., Bhatti, U.A., Chen, YW., Liu, J.: SWT-DCT-based robust watermarking for medical image. In: Chen, Y.W., Zimmermann, A., Howlett, R., Jain, L. (eds.) Innovation in

medicine and healthcare systems, and multimedia. Smart Innovation, Systems and Technologies, vol. 145. Springer, Singapore (2019)
13. Dai, Q., Li, J., Bhatti, U.A., Cheng, J., Bai, X.: An automatic identification algorithm for encrypted anti-counterfeiting tag based on DWT-DCT and Chen's Chaos. In: International Conference on Artificial Intelligence and Security, pp. 596–608. Springer, Cham (2019)
14. Wu, X., et al.: Logistic map and contourlet-based robust zero watermark for medical images. Innov. Med. Healthc. Syst. Multimed., pp. 115–123. Springer, Singapore (2019)
15. Liu, J., Li, J., Chen, Y., Zou, X., Cheng, J., Liu, Y., Bhatti, U.A.: Robust zero-watermarking based on SIFT-DCT for medical images in the encrypted domain
16. Nawaz, S.A., Li, J., Liu, J., Bhatti, U.A., Zhou, J., Ahmad, R.M.: A feature-based hybrid medical image watermarking algorithm based on SURF-DCT. In: Liu, Y., Wang, L., Zhao, L., Yu, Z. (eds.) Advances in Natural Computation, Fuzzy Systems and Knowledge Discovery. ICNC-FSKD 2019. Advances in Intelligent Systems and Computing, vol. 1075. Springer, Cham (2020)

Improving Depth Perception using Multiple Iso-Surfaces for Transparent Stereoscopic Visualization of Medical Volume Data Daimon Aoi, Kyoko Hasegawa, Liang Li, Yuichi Sakano, and Satoshi Tanaka

1 Introduction

Medical volume data, i.e., volume data of the human body, can now easily be acquired by Computed Tomography (CT) and Magnetic Resonance Imaging (MRI). In this situation, 3D transparent visualization of the acquired volume data is becoming important. The difficulty of medical visualization, however, is that the human body has very complex internal 3D structures with many kinds of irregularly shaped objects, such as bones, blood vessels, and internal organs. This complexity makes the created images less comprehensible. In our recent papers [1, 2], we have reported that transparent stereoscopic visualization is effective in reducing this complexity. However, we have also found that we cannot always perceive the depth of each object correctly in transparent stereoscopic visualization. This paper aims at solving this problem. Recent research has revealed that depth perception is affected by many kinds of information [3]. In other words, a human uses many "hints" to perceive depths when observing a 3D scene. This means the correctness of perceived depths is improved if we can provide useful hints.


In this paper, we focus on providing a hint for the iso-surface visualization of human volume data. The hint we propose is to place an additional iso-surface near the target iso-surface to be analyzed. In other words, we overlap an iso-surface with a slightly different iso-value with the target iso-surface. This means that we use the technique of "multiple iso-surfaces" for volume-data visualization. Mutual occlusion between the target and the additional iso-surfaces becomes the hint that improves depth perception. In our implementation, the multiple iso-surfaces are visualized using Stochastic Point-Based Rendering (SPBR) [4, 5], a point-based transparent rendering method. We convert the surfaces to point sets according to the prescription of reference [6] and apply SPBR to the created data.

2 Visual Assistance Based on Multiple Iso-Surfaces (Proposed Method)

Based on the idea mentioned above, that is, using an additional iso-surface to improve depth perception, the prescription of our proposed method is as follows:

1. Determine the iso-value C0 of the target iso-surface to be observed.
2. Determine an additional iso-value C1, slightly different from C0, such that the additional iso-surface with iso-value C1 is placed near and inside the target iso-surface.
3. Execute the transparent stereoscopic visualization of the target iso-surface together with the additional iso-surface.

The additional iso-surface has a shape similar to the target iso-surface and is placed near and inside it. We put the additional iso-surface inside because this allows better observation of the target iso-surface than putting it outside. Note that the inner surface also creates occlusion, because both the target and the additional iso-surfaces are observed in transparent visualization. In this visualization of the multiple (dual) iso-surfaces, we can also use two rendering effects to improve depth perception: "luminance contrast" and "luminance gradient". We investigate these two effects in the experiments in the following sections.

Let us explain the two effects. The luminance contrast effect is related to the selection of the background color in the visualization. It is known that the perceived depth order depends on the luminance contrast between the visualized objects and the background [7–9]: an observer tends to perceive an object with higher contrast against the background as closer. This perceptual effect of luminance contrast corresponds to that of aerial perspective [9]. The luminance gradient is a kind of shading effect inherent to SPBR [10, 11]. SPBR controls surface opacity through point density. If a surface portion is inclined away from the viewing direction, its apparent area for the observer becomes smaller than in the perpendicular case.


Assuming that the points, which are the rendering primitives of SPBR, are distributed uniformly on the visualized surface, an inclined portion of the iso-surface becomes more opaque than the other portions. This increased opacity works as a kind of shading effect and becomes a hint for depth perception.
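As an illustration of preparing such dual iso-surfaces, the sketch below extracts the target and the additional iso-surface and scatters points uniformly over their triangles, with the point density standing in for the desired opacity. scikit-image's marching cubes and the specific iso-values are assumptions; SPBR itself is a separate renderer and is not reproduced here.

```python
# Sketch: extract two iso-surfaces and convert them to point sets whose
# densities differ, as a stand-in for opacity control in point-based rendering.
import numpy as np
from skimage import measure

def iso_surface_points(volume, iso_value, points_per_triangle):
    """Extract an iso-surface and sample points uniformly over its triangles."""
    verts, faces, _, _ = measure.marching_cubes(volume, level=iso_value)
    tri = verts[faces]                                   # (n_faces, 3, 3)
    n = len(tri) * points_per_triangle
    idx = np.repeat(np.arange(len(tri)), points_per_triangle)
    r1, r2 = np.random.rand(2, n)
    u = 1 - np.sqrt(r1)                                  # uniform barycentric sampling
    v = np.sqrt(r1) * (1 - r2)
    w = np.sqrt(r1) * r2
    return (u[:, None] * tri[idx, 0] + v[:, None] * tri[idx, 1]
            + w[:, None] * tri[idx, 2])

volume = np.random.rand(64, 64, 64)                      # stand-in volume
target_pts = iso_surface_points(volume, iso_value=0.50, points_per_triangle=8)
inner_pts = iso_surface_points(volume, iso_value=0.55, points_per_triangle=4)
```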

3 Experiments

3.1 Experimental Conditions

To present the stimulus images, we used a 42-inch autostereoscopic display utilizing a parallax barrier (TRIDELITY Display Solutions LLC, New Jersey, United States). This 3D display presented five views that provided binocular disparity and motion parallax, so that the observers could perceive the presented images in 3D without wearing special 3D goggles. The image resolution for each view was 1920 × 1080. The viewing distance for the best 3D image quality of the display was 350 cm, and the subjects viewed the display at this distance. The subjects observed the images binocularly or monocularly; in the monocular conditions, the non-dominant eye was occluded by an opaque eye patch so that no binocular disparity was provided. While observing the images, the subjects either kept their heads fixed or moved their heads laterally. In the no-head-motion condition, the subjects placed their chins on a chin rest so that no motion parallax was provided. Eighteen volunteers in their 20s participated in the experiments. They wore their own glasses or contact lenses if needed. All had normal or corrected-to-normal visual acuity and normal stereo vision. Before the experiments, we confirmed that all subjects could perceive 3D images through motion parallax alone on the autostereoscopic 3D display. The stimulus images were presented in a random order for each subject.

3.2 Determination of Color Considering Luminance Contrast

The color of the inner iso-surface was determined in consideration of the luminance contrast described in Sect. 2. In the experiments, the RGB values were adjusted until the surfaces had roughly equal luminance, so that perception was not greatly influenced by luminance contrast. Consequently, the color of the outer surface was (0, 255, 0) and that of the inner surface was (255, 125, 0). Because overlapping points are drawn in a darker color, the background color was set to white.


Fig. 1 Outline of the experimental images: a 100 mm, b 150 mm, c 200 mm

3.3 Experiment Using Cuboid Images

The test stimulus was a cuboid of surface data. The frontal surface was a square of 100 mm × 100 mm, and the depth was 100 mm (i.e., a cube), 150 mm, or 200 mm (see Fig. 1). In some conditions, a smaller cuboid of surface data was also presented inside the cuboidal test surface to improve the accuracy of the perceived depth of the test surface. The length of each side of the inner surface was half or 3/4 of that of the outer test surface. The opacity (α) of the test surface was 0.2, while that of the inner surface was 0.1 (brighter; Figs. 2a and 3a) or 0.3 (darker; Figs. 2b and 3b). Therefore, there were four conditions of the inner surface in addition to the condition without the inner surface. Taken together, there were 60 experimental conditions (three magnitudes of depth of the test surface, five conditions of the inner surface, with and without binocular disparity, and with and without motion parallax). The subjects were asked to report the depth magnitude of the outer test surface by giving a number based on the assumption that the length of one side of the front square of the test surface was one. The correct answer was 1, 1.5, or 2. The trial order of all 60 conditions was randomized for each subject. Figure 4 shows the experimental results under the different conditions. As can be seen in Fig. 4, when there was no inner surface, the depth magnitude of the test surface was underestimated by approximately 20–65% in all the conditions tested. This result is consistent with those of previous studies [1, 2].

Fig. 2 The central view of the 150-mm-deep outer surface with the 1/2-sized cubic inner surface at different opacities: a α = 0.1, b α = 0.3

Fig. 3 The central view of the 150-mm-deep outer surface with the 3/4-sized cubic inner surface at different opacities: a α = 0.1, b α = 0.3

Fig. 4 Results of the cuboid experiment. The error bars indicate the Standard Error of the Mean (SEM)

However, this depth underestimation was alleviated by introducing binocular disparity and motion parallax. This result suggests that binocular disparity and motion parallax provided by an autostereoscopic display are effective for improving the accuracy of the perceived depth of a transparently visualized object.


Fig. 5 Enlarged view of the upper right area of Fig. 3b

Under the condition in which neither binocular disparity nor motion parallax was provided, the inner surface alleviated the depth underestimation of the test surface irrespective of the simulated depth or the opacity of the inner surface. This alleviation could be due to a visual contrast effect in terms of size; that is, the test surface could have been perceived as larger because of the presence of the smaller inner surface. When the inner surface was half the size of the outer surface, the depth underestimation was alleviated irrespective of the presence of binocular disparity or motion parallax. On the other hand, when the inner surface was 3/4 the size of the outer surface, the alleviation effect was not consistently observed. This difference between the two inner-surface sizes can be attributed to the fact that, while the half-sized inner surface did not overlap with the test surface in any view, the 3/4-sized inner surface did so in some views (Fig. 5). Such overlap may have prevented the visual system from detecting the binocular disparity and motion parallax of the overlapped parts in those views, that is, the vertical back edge of the test surface and the vertical front edge of the inner surface.

3.4 Experiments on Medical Images

The medical data used in this section are three-dimensional volume data generated from CT images. The value stored in the volume data is the density value (hereafter called the "iso-value"). We prepared four kinds of medical surface data with iso-values of 78.90, 119.75, 201.44, and 242.49 as iso-surfaces. We denote the iso-values by C0 and C1, and the opacity by α. The outer iso-surface of all the stimulus images was C0 = 78.90 (Fig. 6a). The inner iso-surface was C1 = 119.75, 201.44 (Fig. 6b), or 242.49. We prepared a single iso-surface and multiple iso-surfaces for each figure. In the case of multiple iso-surfaces, we prepared images with different inner sizes and inner opacities. There were three inner iso-surfaces (C1 = 119.75, 201.44, and 242.49) and two opacities α: one with a darker inside (α = 0.09) and one with a lighter inside (α = 0.03) than the opacity of the outer iso-surface (α = 0.06).

Fig. 6 The outer test iso-surface (a: C0 = 78.90, α = 0.06) and the inner iso-surface (b: C1 = 201.44, α = 0.03)

Fig. 7 Multiple iso-surfaces of the outer test iso-surface (Fig. 6a) and the inner iso-surface (Fig. 6b)

Fig. 8 Multiple iso-surfaces with the higher opacity of the inner surface (0.09)

Figure 7 shows a case where the inner iso-surface was lighter, and Fig. 8 shows a case where the inner iso-surface was darker and overlapped the outer surface. These images were presented with or without binocular disparity and motion parallax. The subjects were asked to report the distance in the depth direction from the frontal cross section of the spine (the lower cylinder) to a point on the outer edge of the heart (the orange circle in Fig. 6a), by giving a number based on the assumption that the radius of this circle was one. The correct answer was 2.45. All 28 stimulus images were presented in a random order for each subject.


Fig. 9 Results of the experiment using medical images. The error bars indicate the Standard Error of the Mean (SEM)

As in the experiment using cuboid images, the depth magnitude of the test surface was underestimated when the inner surface was not presented (Fig. 9). This depth underestimation was slightly alleviated by providing motion parallax. The depth underestimation of the test surface was alleviated by introducing an inner surface, and this alleviation was clearer when binocular disparity was provided. On the other hand, effects of the iso-value and opacity were not clearly observed.

3.5 Comparison of the Results of the Two Experiments

Most importantly, it was common to the two experiments that the depth of the test surface was underestimated, yet this depth underestimation was alleviated either by introducing an inner surface or by motion parallax. On the other hand, in the experiment using medical data, the alleviation of the depth underestimation seemed somewhat weaker than in the cuboid experiment, including in the condition under which no binocular disparity was provided.


Fig. 10 Enlarged view of the upper left area of Fig. 8

Since there were many differences between the images presented in the two experiments, we cannot determine exactly what caused the differences in their results. Nevertheless, there are at least two possible reasons. First, the medical inner surface was not a similar figure to the test surface (Fig. 10); such similarity could have enhanced the alleviation of the depth underestimation in the cuboid experiment. Second, in the experiment using the medical data, the inner surface could have been too close to the test surface, as with the 3/4-sized inner surface used in the cuboid experiment. Determining what caused the differences between the results of the two experiments remains future work.

4 Conclusion

Through two psychophysical experiments, we found that the perceived depth of a transparently visualized closed surface presented on an autostereoscopic 3D display can be underestimated, and that this underestimation can be alleviated by presenting an inner surface that, together with the outer closed surface, constitutes multiple iso-surfaces. This alleviation effect appears pronounced when the inner surface is not too close to the outer surface. Further study may formulate the effects of the opacity and of the distance between the inner and outer surfaces on the accuracy of the perceived 3D structure of transparently visualized medical data.

References 1. Kitaura, Y., Hasegawa, K., Sakano, Y., Lopez-Gulliver, R., Li, L., Ando, H., and Tanaka, S.: Effects of depth cues on the recognition of the spatial position of a 3d object in transparent stereoscopic visualization. In: The 5th International KES Conference on Innovation in Medicine and Healthcare (KES-InMed-17), (Smart Innovation, Systems and Technologies, vol. 71, pp. 277-282 (Short Papers)), Vilamoura, Portugal (2017)


2. Sakano, Y., Kitaura, Y., Hasegawa, K., Lopez-Gulliver, R., Li, L., Ando, H., and Tanaka, S.: Quantitative evaluation of perceived depth of transparently-visualized medical 3D data presented with a multi-view 3D display. Int. J. Model. Simul. Sci. Comput. 9(3), 1840009 (16 pages) 3. Sekuler, R., Blake, R.: Perception, 2nd edn. McGraw-Hill, New York (1990) 4. Sakamoto, N., Kawamura, T., Koyamada, K.: Improvement of particle-based volume rendering for visualizing irregular volume data sets. Comput. Graph. 34(1), 34–42 (2010) 5. Tanaka, S., Hasegawa, K., Shimokubo, Y., Kaneko, T., Kawamura, T., Nakata, S., Ojima, S., Sakamoto, N., Tanaka, H., and Koyamada, K.: Particle-based transparent rendering of implicit surfaces and its application to fused visualization, EuroVis 2012, pp. 25-29 (short paper), Vienna, Austria (2012) 6. Hasegawa, K., Ojima, O., Shimokubo, Y., Nakata, S., Hachimura, K., Tanaka, S.: Particlebased transparent fused visualization applied to medical volume data. Int. J. Model. Simul. Sci. Comput. 4:1341003, (11 pages) (2013) 7. Farne, M.: Brightness as an indicator to distance: relative brightness per se or contrast with the background? Perception 6(3), 287–293 (1977) 8. Egusa, H.: Effect of brightness on perceived distance as a figure-ground phenomenon. Perception 11(6), 671–676 (1982) 9. O’ Shea, R.P., Blackburn, S.G., Ono, H.: Contrast as a depth cue. Vis. Res. 34(12), 1595–1604 (1994) 10. Miyawaki, M., Hasegawa, K., Li, L., Tanaka, S.: Transparent fused visualization of surface and volume based on iso-surface highlighting. In: The 6th International KES Conference on Innovation in Medicine and Healthcare (KES-InMed-18), (Smart Innovation, Systems and Technologies, vol. 71, pp. 267–276 (Short Papers)), GoldCoast, Australia (2018) 11. Sakamoto, N., Koyamada, K., Saito, A., Kimura, A., Tanaka, S.: Multi-volume rendering using particle fusion. In: Poster Proceedings of IEEE Pacific Visualization Symposium 2008 (PacificVis 2008), pp. 33–34, Kyoto, Japan (2008)

Design and Simulation of a Robotic Manipulator for Laparoscopic Uterine Surgeries H. A. G. C. Premachandra, K. M. Thathsarana, H. M. A. N. Herath, D. L. F. M. Liyanage, Y. W. R. Amarasinghe, D. G. K. Madusanka, and M. A. M. M. Jayawardane

1 Introduction

Laparoscopic surgery, also known as minimally invasive surgery, is carried out using special instruments inserted through small incisions in the abdominal wall. The internal organs and structures are viewed using a camera known as the laparoscope. This surgical procedure is widely used in gynaecology and obstetrics because of its benefits over open surgery. There are mainly two surgical setups: telesurgery and conventional laparoscopic surgery. In telesurgery, the surgeon carries out the procedure remotely through a telesurgical system such as the Da Vinci system. In the conventional setup, the surgeon performs the laparoscopic surgery directly on the patient without any involvement of robotics. Either method requires at least one assistant in addition to the main surgeon: one for laparoscope manipulation and another for uterus manipulation. These assistants handle the laparoscope and the uterine positioner (surgical tool) under the verbal guidance of the main surgeon. The uterine positioner is inserted into the patient's uterus through the vaginal opening, and the assistant who positions the uterus ensures its proper orientation in the pelvic region during the surgery. Limitations of verbal commands lower the overall efficiency of the surgical procedure. In recent years, robotic assistants have been developed to address this gap and achieve assistant-less solo laparoscopy. Several studies have developed robotic systems to handle the laparoscope [1, 2].


However, a suitable economical solution for uterine positioning is yet to be introduced. Yip et al. [3, 4] developed a robotic assistant for uterine positioning with three Degrees Of Freedom (DOF) and a mechanically constrained Remote Center of Motion (RCM), controlled through a joystick. The commercially available Double ViKY® system achieves solo laparoscopy by using two robotic manipulators to control the laparoscope and the uterine positioner with interchangeable pedal control or Bluetooth voice control [5, 6]. It was noted that the ViKY® system has a conical workspace, has its RCM at the entry point of the vagina, and uses a commercially available disposable uterine positioner, which is expensive and non-reusable. The cost of uterine manipulators can account for as much as 8% of the total procedure cost during laparoscopic hysterectomy [7]. The adaptability of touchless interaction systems such as the LEAP® Motion Controller (LMC) and Microsoft Kinect to surgical tasks has been studied and proven to be promising [8–10]. The LMC has a small footprint, and its contact-free operation allows the surgeon to interact with the robot without compromising sterility at the patient's bedside. This paper presents a five-DOF robotic manipulator with gesture control using the LMC. A reusable uterine positioner can be attached to the proposed setup, thus providing an economical solution. Ultimately, the assistant doctor for uterine manipulation is replaced by a robot that can be remotely controlled by the main surgeon.

2 Proposed System

2.1 Surgical Theatre Setup

The Robotic Uterine Manipulator (RUM) is a five-DOF wall-mounted serial manipulator. It carries a uterine sound surgical instrument at the end effector; the uterine sound (Fig. 1a) is used as a reusable uterine positioner. The surgical bed is often tilted to the Trendelenburg position during gynaecological laparoscopy.

Fig. 1 a Proposed surgical theatre setup (main surgeon, surgical bed, required workspace, RUM, and LEAP® Motion Controller); b uterine sound (end effector of the RUM)


The RUM is fixed to the surgical bed to ensure that there is no relative motion between the workspace of the manipulator and the patient during bed tilting. Usually, the LMC is placed on the left side for a right-handed surgeon, at lower-chest height, considering ergonomics (Fig. 1b).

2.2 Working Principle

The assistant doctor handles the uterine positioner to perform four motions: anteversion/retroversion (transverse motion), lateral motion, tensioning, and twisting. When performing these motions, large motions at the cervix inside the body may harm the patient owing to anatomical constraints. Hence, the motion of the instrument is constrained at the cervix, allowing only translation along its axis and rotation about the cervix. This is known as a Remote Center of Motion (RCM) at the cervix. The developed manipulator has a programmable RCM, which makes it a flexible and space-efficient solution for different types of uterine positioners.

3 Workspace Analysis

The workspace requirement of the manipulator was identified in order to determine its kinematic design. Workspace boundaries were identified in both the lateral and the transverse planes with the help of a medical resource person. The operating workspace was modelled in Solidworks, as in Fig. 2c, using the identified boundaries. It is a spherical sector whose center coincides with the cervix of the patient. The kinematics of the RUM was developed to achieve the RCM.

Fig. 2 Workspace boundaries in the (a) lateral and (b) transverse plane of the patient (c) 3D representation of the workspace


4 Kinematic Analysis

The inverse problem determines the joint angles given the end effector position and orientation. The proposed manipulator has five DOF (all revolute joints), so there are five variables to be determined. The inverse kinematics problem can be decoupled into inverse position kinematics and inverse orientation kinematics. Inverse position kinematics determines the position of the differential joint of this manipulator (also called the wrist center in manipulator terminology; refer to Fig. 3). The orientation of the end effector is then obtained using inverse orientation kinematics. Three DOFs are needed to position the wrist center in the 3D workspace; another two DOFs are needed to orient the end effector with the RCM and to twist it around its axis.

4.1 Inverse Position Kinematics

The first three joints of the manipulator form a standard articulated configuration (Fig. 4). A geometric approach was followed to derive the inverse position kinematics. The wrist center of the RUM is positioned in the workspace using Eqs. (1)–(4):

\[
\theta_1 = \tan^{-1}\!\left(\frac{y_c}{x_c}\right) \tag{1}
\]

\[
\gamma = \tan^{-1}\!\left(\frac{L_3 s_3}{L_2 + L_3 c_3}\right) \tag{2}
\]

where $\gamma > 0$ corresponds to the elbow-up configuration and $\gamma < 0$ to the elbow-down configuration.

Fig. 3 RUM terminology: base joint, shoulder joint, elbow joint, differential bevel joint at the wrist center, Link_1, Link_2, uterine sound, RCM, and the workspace frame


Fig. 4 Inverse position kinematics for articulated configuration

\[
\theta_2 = \tan^{-1}\!\left(\frac{z_c - L_1}{\sqrt{x_c^2 + y_c^2}}\right) - \gamma \tag{3}
\]

\[
\theta_3 = \tan^{-1}\!\left(\frac{s_3}{c_3}\right) \tag{4}
\]
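A small sketch of Eqs. (1)–(4) is given below. The paper does not spell out how $s_3$ and $c_3$ are obtained, so the two-link (law-of-cosines) derivation used here is an assumption, as are the example coordinates.

```python
# Sketch of the inverse position kinematics in Eqs. (1)-(4).
import numpy as np

def inverse_position(xc, yc, zc, L1, L2, L3, elbow_up=True):
    theta1 = np.arctan2(yc, xc)                                   # Eq. (1)
    r = np.hypot(xc, yc)
    # Assumed derivation of c3, s3 from the planar two-link geometry.
    c3 = (r**2 + (zc - L1)**2 - L2**2 - L3**2) / (2 * L2 * L3)
    s3 = np.sqrt(max(0.0, 1 - c3**2))
    if not elbow_up:                                              # gamma < 0 branch
        s3 = -s3
    theta3 = np.arctan2(s3, c3)                                   # Eq. (4)
    gamma = np.arctan2(L3 * s3, L2 + L3 * c3)                     # Eq. (2)
    theta2 = np.arctan2(zc - L1, r) - gamma                       # Eq. (3)
    return theta1, theta2, theta3

# Example with the link lengths listed later in the text (in metres).
print(inverse_position(0.3, 0.1, 0.4, L1=0.12, L2=0.34, L3=0.30))
```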

4.2 Inverse Orientation Kinematics The wrist has a differential bevel joint with two DOF (Fig. 5) which determines the orientation of the end effector (uterine sound) and is responsible for maintaining the RCM stationary and performing the twisting motion. Given the orientation α1 and the twist α2 , pinion angles can be calculated.

Fig. 5 Differential bevel base joint at the wrist center

θ4 = α1 + α2  (5)

θ5 = α1 − α2  (6)


4.3 The Mapping Algorithm

The mapping algorithm was developed in order to map the four output parameters obtained from the LMC to the inverse kinematics of the manipulator. This algorithm maintains a software-based RCM instead of a mechanical RCM. The four parameters are as follows:

1. X coordinate of the workspace (X)
2. Y coordinate of the workspace (Y)
3. Uterus tensioning distance (r)
4. Angle of twist (twist)

The LMC was programmed to output these four parameters to control the manipulator. Referring to Fig. 6a, the maximum workspace radius (R) depends on the initial length of the uterus (D) and the length of the uterine positioner (U):

R = U − D  (7)

The typical length of the uterine sound used as the uterine positioner (U) is 330 mm, and the initial length of a normal uterus was taken as 110 mm.

Step 1. Calculate the Z coordinate of P using X and Y (from the LMC) and R:

\[
Z = +\sqrt{R^2 - X^2 - Y^2} \tag{8}
\]

Step 2. Find OP′, where P′ is the actual position of the wrist center of the manipulator, and calculate the components of OP′ along X, Y, and Z (refer to Fig. 6b). Here r is the uterus tensioning distance (from the LMC):

OP′ = R − r  (9)

Fig. 6 a Geometrical parameters of the RUM; b variables in the workspace


Fig. 7 Wrist center movement being proportional to the uterus movement

x_W = OP′ sin β cos φ  (10)

y_W = OP′ sin β sin φ  (11)

z_W = OP′ cos β  (12)

These are the coordinates of the wrist center in the workspace coordinate frame. P is the position derived from the X and Y outputs of the LMC; P is always on the outer surface of the workspace, while P′ moves inside the workspace. The user controls the position of P using X and Y, and the distance OP′ using r (from the LMC). As illustrated in Fig. 7, there is no need to compute and control the coordinates of the end effector tip: uterine manipulation is achieved by controlling the wrist center inside the workspace. The known variables in Fig. 6a are as follows: S = clearance distance (300 mm), U = uterine positioner length (330 mm), L1 = distance from the base to joint 2 (120 mm), L2 = link length (340 mm), and L3 = link length (300 mm).

Step 3. Transform the wrist center coordinates from the W frame to the M frame:

x_M = −x_W  (13)

y_M = y_W  (14)

z_M = L1 + S + R − z_W  (15)

Use Eqs. (1)–(4) to find the first three joint angles (θ1, θ2, and θ3); the wrist center of the manipulator is thus positioned using the inverse position kinematics.

Step 4. The RCM is maintained by the differential bevel joint. First, find α1 such that the end effector (uterine sound) passes through the RCM, as in Fig. 8b. Then take the angle of twist (α2) from the LMC output and calculate the pinion angles using Eqs. (5) and (6); a code sketch of Steps 1–4 is given below.

α1 = −θ2 − θ3 + β + 90°  (16)
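The sketch below strings Steps 1–4 together. The interpretation of β and φ as the polar and azimuth angles of P in the workspace frame (Fig. 6b) is an assumption, angles are kept in radians, and the compact inverse-kinematics helper repeats the assumed derivation from Sect. 4.1.

```python
# Sketch of the mapping algorithm: LMC outputs (X, Y, r, twist) -> joint angles.
import numpy as np

U, D = 330.0, 110.0                     # uterine positioner / initial uterus length (mm)
S, L1, L2, L3 = 300.0, 120.0, 340.0, 300.0
R = U - D                               # Eq. (7)

def _ik3(xc, yc, zc):
    # Compact form of Eqs. (1)-(4), elbow-up branch (assumed c3/s3 derivation).
    t1 = np.arctan2(yc, xc)
    rp = np.hypot(xc, yc)
    c3 = (rp**2 + (zc - L1)**2 - L2**2 - L3**2) / (2 * L2 * L3)
    s3 = np.sqrt(max(0.0, 1 - c3**2))
    gamma = np.arctan2(L3 * s3, L2 + L3 * c3)
    return t1, np.arctan2(zc - L1, rp) - gamma, np.arctan2(s3, c3)

def lmc_to_joints(X, Y, r, twist):
    Z = np.sqrt(R**2 - X**2 - Y**2)                  # Step 1, Eq. (8)
    OP = R - r                                       # Step 2, Eq. (9)
    beta = np.arccos(Z / R)                          # assumed polar angle of P
    phi = np.arctan2(Y, X)                           # assumed azimuth of P
    xW = OP * np.sin(beta) * np.cos(phi)             # Eq. (10)
    yW = OP * np.sin(beta) * np.sin(phi)             # Eq. (11)
    zW = OP * np.cos(beta)                           # Eq. (12)
    xM, yM, zM = -xW, yW, L1 + S + R - zW            # Step 3, Eqs. (13)-(15)
    theta1, theta2, theta3 = _ik3(xM, yM, zM)
    alpha1 = -theta2 - theta3 + beta + np.pi / 2     # Step 4, Eq. (16), radians
    theta4, theta5 = alpha1 + twist, alpha1 - twist  # Eqs. (5), (6)
    return theta1, theta2, theta3, theta4, theta5

print(lmc_to_joints(X=50.0, Y=30.0, r=20.0, twist=0.1))
```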


Fig. 8 a Workspace coordinate frame W and the manipulator’s coordinate frame M b Find α1 such that RCM is obtained

5 Dynamic Analysis

The maximum angular acceleration of a joint was identified by defining a suitable velocity profile in the joint space (Fig. 9). According to the defined velocity profile, the maximum angular acceleration of a joint is 150 deg/s². Two motions were identified for which the torque at each joint reaches its maximum (Figs. 10 and 11). Each joint was given a constant angular acceleration of 180 deg/s² (>150 deg/s²). The force applied near the tip of the uterine positioner was taken as 2 N (the approximate weight of a normal uterus).

Fig. 9 Defined motor curve for the worst case (60° in 1.5 s) using Solidworks Motion

Fig. 10 Motion_1 along the YZ plane. Base joint is not rotating


Fig. 11 Motion_2 along the XY plane. Only the base joint is rotating

Motion_1 in Fig. 10 gives the maximum torques at the last three joints (ignoring twist), whereas Motion_2 in Fig. 11 gives the maximum torque at the base joint.

5.1 Topology Optimization

Topology optimization was carried out for each component in order to reduce its weight and thereby the torque requirements at the joints. The optimization sequence proceeds from the end effector to the base. Motion_1 and Motion_2 were simulated in Solidworks Motion, and the boundary loads on each component were obtained. The topology optimization procedure was then carried out for each critical component (Fig. 12). A design validation was performed using a further static structural simulation under the same boundary conditions, with a safety factor of 1.5 on the maximum von Mises stress. Figure 13 shows the material layouts of Linkage_2 before and after optimization.

Fig. 12 Flow chart of topology optimization

Fig. 13 Topology optimization example (Linkage_2)


Fig. 14 Overview of the control architecture

Machining feasibility was also considered during the optimization process. A significant reduction (around 20%) in joint torques was obtained after topology optimization.

6 Control Architecture

Figure 14 depicts the overview of the proposed control architecture of the RUM.

7 Testing and Validation in the Virtual Environment

The designed RUM was modelled in the CoppeliaSim EDU virtual environment to test and validate the developed high-level control algorithms, as in Fig. 15.

Fig. 15 Inverse kinematics testing process with CoppeliaSim EDU virtual model


7.1 Joint Space Versus Task Space Control Strategies

An initial position and a goal position in the workspace were defined to visualize all possible motion patterns of the RUM, and the input coordinates were given to the inverse kinematic model to calculate the joint angles for each point; the trajectory of the end effector was then tracked. Joint space control is more convenient for the controller because it needs less computational power and a less demanding transmission rate between the motor controller and the high-level controller. However, it results in ambiguity of the end effector path (Fig. 16), which cannot be tolerated. For task space control, a desired path between the initial position and the goal is generated that lies within the workspace, and the motion of the end effector is controlled such that deviations from the generated path are minimized to the required level. This guarantees that each motion is bounded by the workspace boundary constraints. The large number of samples used here makes the joint motion between two consecutive samples negligibly small, keeping the end effector fully under control with the task space control strategy. Finally, the four required motions were obtained in task space, as in Figs. 17 and 18.
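A minimal sketch of the task-space idea is shown below: the wrist-center path between the start and the goal is densely sampled and the inverse kinematics is solved at every sample. The straight-line path and the sample count are assumptions for illustration; the actual path generation in the virtual model is not reproduced here.

```python
# Sketch of task-space control: per-sample inverse kinematics along a densely
# sampled path, so consecutive joint displacements stay negligibly small.
import numpy as np

def task_space_trajectory(p_start, p_goal, ik, n_samples=500):
    """ik: callable mapping (x, y, z) to joint angles, e.g. the IK sketch above."""
    p_start, p_goal = np.asarray(p_start, float), np.asarray(p_goal, float)
    joints = []
    for s in np.linspace(0.0, 1.0, n_samples):
        p = (1 - s) * p_start + s * p_goal      # desired wrist-center sample
        joints.append(ik(*p))
    return np.asarray(joints)
```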

Fig. 16 a Reaching the goal in joint space control; b reaching the goal in task space control

Fig. 17 a Anteversion and retroversion motion; b lateral motion in the virtual model


Fig. 18 a Tensioning motion; b twisting motion in the virtual model

8 Conclusion

The five-DOF Robotic Uterine Manipulator with a programmable RCM can mimic all four motions performed by the assistant doctor. The LMC-based gesture control provides an easy-to-learn, intuitive interface through which the main surgeon can control the manipulator by himself using hand gestures, eliminating the problem of miscommunication during surgery. The proposed RUM has been simulated in the CoppeliaSim virtual environment using task space control for kinematic validation. The proposed RUM is expected to be fabricated and controlled through the LMC using hand gestures of the main surgeon. The deviation of the RCM will be evaluated to validate that it remains stationary, based on the kinematic model. The fabricated prototype will be tested using a medical manikin (female pelvis model). Additional safety features will be integrated, and ex vivo experiments will be performed on a human cadaver before proceeding to clinical trials.

Acknowledgments The authors would like to express their gratitude to the Accelerating Higher Education Expansion and Development (AHEAD)—Development Oriented Research (DOR) grant of the Centre for Advanced Mechatronic Systems (CFAMS), University of Moratuwa, for the financial contribution, and to the CFAMS for its valuable advice and guidance towards the success of this research.

References 1. Gilbert, J.M.: The EndoAssist robotic camera holder as an aid to the introduction of laparoscopic colorectal surgery. Ann. R. Coll. Surg. Engl. 91(5), 389–393 (2009). Jul 2. Nelson, C.A., Zhang, X., Shah, B.C., Goede, M.R., Oleynikov, D.: Multipurpose surgical robot as a laparoscope assistant. Surg. Endosc. 24(7), 1528–1532 (2010). Jul 3. Yip, H.M., Li, P., Navarro-Alarcon, D., Liu, Y.: Towards developing a robot assistant for uterus positioning during hysterectomy: system design and experiments. Robot. Biomim. 1(1), 9 (2014). Dec


4. Yip, H.M., Wang, Z., Navarro-Alarcon, D., Li, P., Liu, Y., Cheung, T.H.: A new robotic uterine positioner for laparoscopic hysterectomy with passive safety mechanisms: design and experiments. In: IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 3188–3194 (2015) 5. Akrivos, N., Barton-Smith, P.: A pilot study of robotic uterine and vaginal vault manipulation: the ViKY Uterine PositionerTM . J. Robot. Surg. 7(4), 371–375 (2013). Dec 6. Swan, K., Kim, J., Advincula, A.P.: Advanced uterine manipulation technologies. Surg. Technol. Int. 20, 215–220 (2010). Oct 7. Croft, K., Mattingly, P.J., Bosse, P., Naumann, R.W.: Physician education on controllable costs significantly reduces cost of laparoscopic hysterectomy. J. Minim. Invasive Gynecol. 24(1), 62–66 (2017) 8. Liu, J., Tateyama, T., Iwamoto, Y., Chen, Y.-W.: A preliminary study of kinect-based real-time hand gesture interaction systems for touchless visualizations of hepatic structures in surgery. Med. Imaging Inf. Sci. 36(3), 128–135 (2019) 9. Kim, Y., Kim, P.C.W., Selle, R., Shademan, A., Krieger, A.: Experimental evaluation of contact-less hand tracking systems for tele-operation of surgical tasks. In: IEEE International Conference on Robotics and Automation (ICRA), pp. 3502–3509 (2014) 10. Fujii, R., Tateyama, T., Kitrungrotsakul, T., Tanaka, S., Chen, Y.-W.: A touchless visualization system for medical volumes based on kinect gesture recognition, In: Innovation in Medicine and Healthcare, pp. 209–215. Cham (2016)

Self-Skill Training System for Chest Compressions in Neonatal Resuscitation Workshop Noboru Nishimoto, Reiji Watanabe, Haruo Noma, Kohei Matsumura, Sho Ooi, Kogoro Iwanaga, and Shintaro Hanaoka

1 Introduction

Fifteen percent of neonates need resuscitation for respiratory stability immediately after birth [1]. Therefore, all staff attending a birth are expected to be trained in neonatal resuscitation procedures. The Japan Society of Perinatal and Neonatal Medicine has been holding the Neonatal Cardio-Pulmonary Resuscitation (NCPR) training workshop widely since 2007 [2, 3]. The workshop consists of three parts: a lecture in the classroom, basic technique exercises, and scenario training. In the lecture, an instructor explains the algorithm and basic knowledge of NCPR. Trainees then practice chest compression and artificial respiration in the basic technique exercises. In the scenario training, the trainee performs resuscitation training using a training simulator that imitates the shape of a neonate. There is a large difference in chest compression and artificial respiration skills between experts and beginners [4]. Moreover, neonatal chest compressions are needed only in serious situations that rarely occur in daily clinical practice, so it is difficult to maintain the technique through routine work. The skill can be learned by attending a workshop, but health professionals are busy and the number of workshops is limited. Therefore, self-training is necessary to maintain the skill level. Additionally, the current training method relies mainly on oral instruction, practical skills, and subjective evaluation, and it has been reported that oral instruction tends to lead to inaccurate technique, such as incorrect depth and position of chest compression. An objective review is therefore important for this kind of skill training.

N. Nishimoto (B) · R. Watanabe · H. Noma · K. Matsumura · S. Ooi College of Information Science and Engineering, Ritsumeikan University, Kyoto, Japan e-mail: [email protected] K. Iwanaga · S. Hanaoka Department of Pediatrics, Kyoto University Hospital, Kyoto, Japan © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2020 Y.-W. Chen et al. (eds.), Innovation in Medicine and Healthcare, Smart Innovation, Systems and Technologies 192, https://doi.org/10.1007/978-981-15-5852-8_8


In this study, we focused on chest compression as a basic technique in neonatal resuscitation. We developed an objective self-training system aimed at helping medical workers maintain their skills and demonstrated the effect of the proposed method through an experiment. We aim to realize an effective and inexpensive training system.

2 Proposed System We developed a personal training system that enables medical staff to self-train at any time and review the training results by themselves. As a use case, we envision the system being installed at a nurse station in a ward, with medical staff training voluntarily during spare moments. Figure 1 shows the overall flowchart of a training procedure using this system. First, a trainee logs in to the system with their ID and password and then selects "Chest Compression Training" or "View Training Log" from the menu. When the user selects "Chest Compression Training," the screen switches to a countdown. When the countdown reaches 0, training starts and the trainee performs chest compression actions for 30 s. During this time, the monitoring system records the timing and the applied compression force. In general, the chest compression procedure is performed by repeating a set of three chest compressions and one artificial respiration every 2 s. Therefore, to teach the correct timing of chest compression and artificial respiration, we employed an automatic timing cue indicated by sound. In Japanese hospitals, medical staff perform this procedure according to the call of "One, Two, Three and Bag!".

Fig. 1 Proposed system flowchart
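As a rough illustration of the cueing logic described above, the following Python sketch emits the "One, Two, Three and Bag!" cues every 2 s for a 30-s session. The function names and the use of a console print in place of an audio cue are illustrative assumptions, not part of the actual system.

```python
import time

def play_cue(word: str) -> None:
    print(word)  # placeholder for an actual audio cue in the real system

def run_session(duration_s: float = 30.0, set_period_s: float = 2.0) -> None:
    """Emit 3 compression cues and 1 ventilation cue every 2 s for 30 s."""
    start = time.time()
    while time.time() - start < duration_s:
        for word in ("One", "Two", "Three", "Bag"):
            play_cue(word)
            time.sleep(set_period_s / 4)  # 4 cues spread evenly over each 2-s set

if __name__ == "__main__":
    run_session()
```

With these settings the trainee hears 15 sets of cues, i.e. 45 compression cues, in 30 s, matching the target values in Table 1.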


Table 1 Review items
• Number of compression actions: total number of compression actions in 30 s; the appropriate number is 45 in 30 s
• Number of proper compression actions: number of compression actions with proper strength in 30 s
• Appropriate compression rate: ratio of the number of proper compression actions to the total number of compression actions
• Number of sets: one set consists of 3 chest compressions and 1 artificial respiration; the appropriate number of sets is 15 in 30 s

Second, after the 30-s training, the system shows a time-series graph of the last session and some objective analysis for self-review. We evaluate the trainee's actions based on the four review items shown in Table 1: the number of compression actions, the number of proper compression actions, the proper compression rate, and the number of sets. This self-review phase allows the trainee to check the quality of the last procedure. Last, after one round of training, the trainee selects retry or finish. We designed the system so that trainees can learn not only the skills but also knowledge about NCPR: at the end of the training, the system presents five knowledge questions related to NCPR terminology and chest compression technique. The results of the skill training and the knowledge questions are stored in a local database, and the system sends them to the trainee and the instructor simultaneously. Both the trainees and their instructor can browse past training results at any time and check the training status. This learning-history browsing function helps keep trainees motivated and maintains the skills of the entire team.
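The following minimal Python sketch illustrates how the four review items of Table 1 could be computed from the peak forces recorded in one 30-s session. The data structure and the force limits are assumptions for illustration; counting sets from compressions alone is a simplification, since a real set also includes one artificial respiration.

```python
# Hypothetical review-item computation for one 30-second session.
LOWER_LIMIT = 2.0   # assumed lower bound of a proper compression (arbitrary units)
UPPER_LIMIT = 4.0   # assumed upper bound

def review(peaks: list[float]) -> dict:
    n_compressions = len(peaks)
    n_proper = sum(LOWER_LIMIT <= p <= UPPER_LIMIT for p in peaks)
    return {
        "compressions": n_compressions,                 # target: 45 in 30 s
        "proper_compressions": n_proper,
        "proper_rate": n_proper / n_compressions if n_compressions else 0.0,
        "sets": n_compressions // 3,                    # 3 compressions per set; target: 15
    }

print(review([3.1, 2.8, 3.5] * 15))  # an ideal 45-compression session
```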

3 Chest Compression Monitoring System The chest compression monitoring system records chest compression behavior by attaching a thin-film pressure sensor to the chest of a simulation model imitating a newborn baby (Fig. 2). The system measures the compression rate and depth, and the results are displayed on an LCD monitor in real time (Fig. 3). The system transmits each set of pressure sensor records to a PC, which evaluates the results of the chest compression actions and displays them. The results are stored in a built-in database and used for managing the training status. In our design, the pressure sensor can easily be mounted on any simulator model, and widely available small computers are used; therefore, the system can be built at a very low installation cost (Fig. 4). Next, we describe the procedure for the skill evaluation process using the proposed chest compression monitoring system. Figure 5 is a schematic diagram of the time-series


Fig. 2 Proposed chest compression monitoring system

Fig. 3 Real-time graph on LCD monitor

Fig. 4 Training mode

results obtained from a chest compression action. The example waveform shows three peaks, meaning that the trainee compressed the chest three times. First of all, it is necessary to push down the neonate's sternum sufficiently with each compression. The upper and lower limits in Fig. 5 indicate the suitable compression force; in other words, the peak of each compression should lie between the upper and lower limits. We adopt keeping the compression peak between the upper and lower limits as an evaluation item.


Fig. 5 Schematic diagram of time-series data

Furthermore, recoil is important in chest compressions. During chest compression, the trainee's fingertips should always remain in touch with the newborn's chest, and after each compression they need to release the compression force enough to let the sternum return to its original position. To evaluate the recoil action, we added two evaluation indexes: "Threshold 1," which detects that the compression has been released sufficiently after the peak, and "Threshold 2," which determines whether the fingertips remain in contact with the sternum. The small computer acquires pressure data at 40 Hz and, when it detects a compression peak, sends the peak measurement value to the host PC. In this paper, we employed the peak value and its timing as the evaluation parameters as a first trial in the later discussion.
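A minimal sketch of this peak and recoil evaluation is given below, assuming a 40-Hz pressure signal. All threshold and limit values are placeholders rather than the system's actual calibration.

```python
SAMPLE_RATE_HZ = 40
UPPER_LIMIT = 4.0    # suitable compression force, upper bound (placeholder)
LOWER_LIMIT = 2.0    # suitable compression force, lower bound (placeholder)
THRESHOLD_1 = 0.8    # compression considered released below this value
THRESHOLD_2 = 0.1    # fingertips considered off the chest below this value

def evaluate(signal: list[float]) -> list[dict]:
    """Return one record per detected compression peak with recoil/contact flags."""
    records = []
    for i in range(1, len(signal) - 1):
        is_peak = signal[i] > THRESHOLD_1 and signal[i] > signal[i - 1] and signal[i] >= signal[i + 1]
        if is_peak:
            window = signal[i + 1:i + SAMPLE_RATE_HZ]   # roughly the next second
            records.append({
                "peak": signal[i],
                "proper_force": LOWER_LIMIT <= signal[i] <= UPPER_LIMIT,
                "recoil_released": any(v < THRESHOLD_1 for v in window),
                "fingertips_in_contact": all(v > THRESHOLD_2 for v in window),
            })
    return records
```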

4 Evaluation Experiment To verify the effectiveness of this system from the trainee's standpoint, we interviewed 21 nurses at the NICU of Kyoto University Hospital who conducted chest compression training using the self-training system. In the experimental design, each subject underwent three chest compression training sessions. To verify the effect of watching the waveform of the chest compression action on the monitor, in the second session we asked the subjects to look at the compression waveform displayed on the computer in real time, whereas in the first and third sessions they looked only at the simulator baby. There was no break between the three sessions, which were performed consecutively. Then, on another day, we interviewed the subjects about the system.


5 Consideration Firstly, we analyzed the procedure data and questionnaire results obtained from the activity evaluation. As shown in Fig. 6, the average compression adequacy rate of subjects 14, 17, 20, and 21 was around 80%, meaning that they did not always apply enough compression force; the remaining subjects achieved more than 90%. For the low-compression-force cases, we found that two subjects compressed a position on the model away from where the sensor was mounted, so the system did not measure the whole applied force. The average number of compressions was 45, with a minimum of 39 and a maximum of 50, as shown in Fig. 7. These results show that all subjects could perform chest compression well. Secondly, we focused on performance with and without the real-time monitor. Figures 8 and 9 show the proper compression rate and the number of proper compressions for each session. T-test results show no significant difference in the proper compression rate for any combination of sessions, indicating that the real-time waveform monitor did not make a noticeable difference. However, for the number of proper compressions, there was a significant difference between the first and second sessions and between the first and third sessions; we consider that the subjects became more proficient as they repeated the sessions. Thirdly, we describe the results of the questionnaire survey. For the questionnaire on the effectiveness of the system, we prepared eight questions and collected answers from 17 subjects. No subject remarked in any answer that the proposed training method posed a risk to chest compression skills, so we consider that there are no side effects. For question 1, "Please tell us how you felt that the training was useful or useless for acquiring or maintaining your skills," every trainee answered that the training system was useful. We received comments such as: "I can compress the chest with a good rhythm, but the pressure is weaker than with a real baby" (ID:009); "I felt it was very useful for acquiring techniques. I got a knowledge test after the training, and I felt that I wanted to practice assessments such as setting the situation before resuscitation" (ID:007); and "I have never performed chest compressions on a real baby, so it is good to train the timing and pressing force of chest compression. However, I don't know whether I can take advantage of this training when I need to do it for a real baby" (ID:012).

Fig. 6 Average compression appropriate rate

Fig. 7 Average compression appropriate rate and the number of compression actions

Fig. 8 Proper compression rate by trial order

Fig. 9 Number of proper compressions by trial order

For question 6, "If you plan to train chest compression skills using the system, how often would you want to repeat the training?", Figure 9 shows the results. The graph shows that most respondents want to train once every six months, and secondarily once every three months. In particular, some subjects said, "I want to use this system on a daily basis" (ID:008) and "I want to use it once or twice a year" (ID:008).

These results indicate that voluntary continuous training is required to maintain these skills. In the future, we plan to verify the effects of using the system over a long period (Table 2). Finally, a summary of the subjects' free-form opinions about the system is shown in Table 3; the number in parentheses indicates the user ID. Among them, we should focus on the comments about the real-time waveform monitor shown while practicing. In our experimental results, we did not find any significant difference in the suitability of the procedure with or without the monitor. Needless to say, in an actual clinical setting, maximum attention must be paid to the patient rather than to the display of medical devices. We should therefore carefully consider that a real-time display on the monitor may carry potential risks, even during training.

Table 2 Interview on the system
Q.1: Please tell us how you felt that the training was useful or useless for acquiring or maintaining your skills
Q.2: Please tell us how you felt that the report, such as the appropriate number of compressions, was useful or useless for acquiring or maintaining your skills
Q.3: Please tell us the evaluation items you want to add
Q.4: Did you find the voice support useful or useless?
Q.5: Only for those who answered "useless" in Q.4: why did you find it useless, or what kind of information presentation would help you in the training?
Q.6: If you plan to train chest compression skills using the system, how often would you want to repeat the training?

Table 3 Opinions on the system
• I felt less compression pressure than with a real baby in the system (ID:014)
• The feeling of the compression action is different from an actual baby (ID:009)
• I want to add recoil as an evaluation item (ID:011)
• I want to do chest compression training with neonatal models of different sizes, such as a smaller one (ID:008)
• When I looked at the real-time compression depth on the monitor, I was too conscious of the display; I felt more at ease without looking at the real-time display (ID:014)
• I want more practice questions after the training (ID:007)
• I want to use the system on a daily basis (ID:008)
• I want to use it once or twice a year (ID:008)
• I want artificial respiration to be evaluated as well (ID:021)

6 Conclusion In this paper, we developed a self-training system and conducted an experimental evaluation. The results of the questionnaire show that all subjects supported the effectiveness of the proposed training method. As future work, we will first add chest compression recoil as an additional evaluation item. Next, we received 15 requests for training in artificial respiration techniques to be conducted simultaneously with the resuscitation procedure, so we will add training for artificial respiration skills; it is possible to monitor the air pressure in the bag and mouthpiece. Last, we will add a support function for remote instruction. Currently, the database is located on a local PC, so the instructor needs to access the system from the same PC on which it is installed. If we move the database to an Internet cloud service, it will be possible to access it from any place and to manage the training results of several clinics at once. Acknowledgments This research and development work were supported by the MIC/SCOPE # 181607012.

References 1. Soar, J., Donnino, M.W., Maconochie, I., et al.: 2018 International Consensus on Cardiopulmonary Resuscitation and Emergency Cardiovascular Care Science With Treatment Recommendations Summary. https://www.sciencedirect.com/science/article/pii/S0300957218310013 (2018) 2. Neonatal Cardio-Pulmonary Resuscitation (NCPR). https://www.ncpr.jp/eng/; McNeal, G.J.: Simulation and nursing education. ABNF J. (2010) 3. Nehring, W.M.: U.S. boards of nursing and the use of high-fidelity patient simulators in nursing education. J. Prof. Nurs. (2008) 4. Kaku, N., Muguruma, T., Ide, K.: Are simulation-based programs for resuscitation training effective for pediatric residents? https://www.jstage.jst.go.jp/article/jjaam/24/12/24_984/_article/-char/ja (2013)

Statistical Signal Processing and Artificial Intelligence

Comparative Study of Pattern Recognition Methods for Predicting Glaucoma Diagnosis Louis Williams, Salman Waqar, Tom Sherman, and Giovanni Masala

1 Introduction Glaucoma is a group of diseases characterised by degeneration of the optic nerve with particular patterns of corresponding defects in the visual field. It is usually, but not exclusively, associated with raised intraocular pressure. Visual field defects are permanent, and if the disease progresses far enough it can result in permanent blindness; glaucoma is the second leading cause of blindness globally. However, early detection and treatment can often protect against serious vision loss [1]. Repeated measurements of the visual field over time are required to detect early changes. Typically, eyes affected by glaucoma have a raised intraocular pressure. This can be related to anatomical factors affecting the depth of the front chamber of the eye. Where the front chamber is too narrow, the condition is termed chronic narrow-angle glaucoma (CNAG). Where glaucoma occurs despite no narrowing, it is termed primary open-angle glaucoma (POAG). POAG accounts for the vast majority of glaucoma cases. In some cases of glaucoma, there are clear signs of optic nerve degeneration when the nerve is examined clinically or imaged. However, there may be no evidence of visual field defect, in which case the term preperimetric glaucoma is applied. The biological mechanism that allows the visual field to remain intact despite optic nerve damage is not fully understood, as one would expect a strong anatomical-physiological correlation.

L. Williams University of Plymouth, Plymouth, UK e-mail: [email protected] S. Waqar · T. Sherman Royal Eye Infirmary, Derriford Hospital, Plymouth, UK e-mail: [email protected] T. Sherman e-mail: [email protected] G. Masala (B) Manchester Metropolitan University, Manchester, UK e-mail: [email protected] © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2020 Y.-W. Chen et al. (eds.), Innovation in Medicine and Healthcare, Smart Innovation, Systems and Technologies 192, https://doi.org/10.1007/978-981-15-5852-8_9

Imaging of the optic nerve is performed by a technique termed optical coherence tomography (OCT). The most common abnormality of the optic nerve in glaucoma is termed 'cupping'. Some optic nerves display signs of cupping but never develop field defects; in this situation, the changes are termed physiologic disc cupping (PDC). Ocular hypertension (OHT) exists where there is a raised intraocular pressure but no sign of optic nerve damage. A small proportion of people with OHT will eventually develop glaucoma, so regular monitoring is necessary to detect these cases. There have been many previous works on the classification of glaucoma by machine classifiers. Burgansky-Eliash et al. [2] used a support vector machine with eight parameters, achieving an AROC of 0.981 when classifying healthy, early, and advanced glaucoma; linear discriminant analysis (LDA), a generalised linear model (GLM), and a generalised additive model (GAM) were also tested. Asaoka et al. [3] focused on predicting preperimetric glaucoma using a DNN with 52 parameters and two hidden layers connected with stacked denoising autoencoders, producing an AUC of 92.6% and distinguishing preperimetric glaucomatous visual fields from healthy visual fields. Phan et al. [4] evaluated the performance of deep convolutional neural networks (DCNNs) for glaucoma discrimination using images, producing AUCs of 90% or more. We aim to produce results with fewer parameters, as found in actual practice due to incomplete data and pre-guideline recording, while diagnosing a wider range of glaucomatous diseases than works that only distinguish between healthy and glaucomatous visual fields. Whilst pathognomonic changes in the intraocular pressure, disc appearance, disc/macula OCT, and visual field tests are easily interpreted as glaucomatous, we aim to show that a system analysing just the visual field and optic disc OCT can have sensitivity and specificity similar to a clinical observer. This opens up the possibility of glaucoma diagnosis and monitoring with limited tests (whilst also opening up a discussion around prognostic indicators) and without the need for image analysis of the optic disc appearance (which to date remains a challenge to replicate for automated systems). This comparative study of deep neural networks, decision trees, support vector machines, and k-nearest neighbours for the classification of glaucoma will determine which method performs best for this task. The proposed system aims to predict diagnoses for ocular hypertension, chronic glaucoma, and healthy patients. The remaining sections are organised as follows. Section 2 describes the concepts of the classification algorithms, including deep neural networks. Section 3 discusses the configuration and implementation of these algorithms. Section 4 describes the dataset used for training and testing, detailing its demographics and composition. Section 5 discusses the results. Finally, in Sect. 6, conclusions regarding optimal methods and possible future works are presented.


2 Background on Classifiers 2.1 Deep Neural Network Deep neural networks (DNNs), also known as deep feedforward neural networks, are a popular method for classification and regression [5]. This bio-inspired method is modelled on how our brains are structured: it is built from layers of neurons (nodes) which react to a given stimulus (input) and produce a response. These neurons are modelled by an activation function such as a sigmoid or Gaussian function. The method is trained by adjusting the weights which connect the nodes to obtain a desired decision surface, so that when given input x a prediction of y is produced. DNNs are artificial neural networks with multiple layers of neurons [6]; this increases the network's ability to accurately represent a decision boundary. For multiclass classification, the softmax function is used on the output layer. Softmax normalises the output over a discrete variable with n possible values, producing a categorical probability distribution, i.e. the probability of each class being correct.
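For reference, the softmax function mentioned above can be written as follows (this is the usual textbook definition, not a formula taken from the paper):

```latex
% Standard softmax on the output layer: raw scores z_1,...,z_n are mapped to a
% categorical probability distribution over the n classes.
\mathrm{softmax}(z)_i = \frac{e^{z_i}}{\sum_{j=1}^{n} e^{z_j}}, \qquad i = 1,\dots,n
```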

2.2 k-Nearest Neighbours k-Nearest Neighbours (KNN) is a simple algorithm which stores the given data and produces predictions based on a similarity measure, such as Euclidean distance. A prediction is given by a majority vote over the k closest data-points [7]. For this type of deterministic classifier, it is necessary to have a training set which is not too small and a well-discriminating distance. KNN performs well in simultaneous multiclass problems. There exists an optimal choice of the parameter k which results in the best performance of the classifier.

2.3 Support Vector Machine The support vector machine (SVM) is a kernel-based method. Given training examples labelled either 'yes' or 'no', a maximum-margin hyperplane is identified which splits the 'yes' from the 'no' training examples, such that the distance between the hyperplane and the closest examples (the margin) is maximised [8], since a larger margin generally equates to a lower generalisation error. Non-linear classifiers can be created by applying the kernel trick to maximum-margin hyperplanes; the resulting algorithm is formally similar, except that every dot product is replaced by a non-linear kernel function.


The margin determines the offset of the hyperplane from the data-points; γ controls the range in which points are included. The points on the edge of the optimal hyperplane are known as support vectors, giving the method its name.

2.4 Decision Trees A decision tree consists of a set of ordered rules for classifying data. Each node in the tree addresses an input variable, and leaves assign labels or values to the data. Decision trees are quite different from the methods already described as they are not kernel based. They are intuitive compared to ANNs, because the reasoning behind a classification can be easily followed. However, if greedy learning algorithms are used, then DTs can be prone to getting stuck in local optima. Furthermore, such algorithms can produce overly complex or large trees that do not generalise well (causing overfitting). Decision trees are constructed by splitting the dataset recursively into subsets based on an input variable. This is done until all subsets have the same class, the sets become too small, or splitting no longer improves classification [9]. The quality of a split is measured by one of the following commonly used metrics: Gini impurity or Shannon entropy. Both measure the diversity of a set of discrete data, also known as its purity [10].
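For reference, the two purity metrics mentioned above take the following standard forms for a node with class proportions p_1, ..., p_K (textbook definitions, not formulas from the paper):

```latex
% Gini impurity G and Shannon entropy H; both vanish for a pure node and are
% maximal when the classes are evenly mixed.
G = 1 - \sum_{k=1}^{K} p_k^{2}, \qquad H = -\sum_{k=1}^{K} p_k \log_2 p_k
```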

3 Methods 3.1 DNN The deep neural network architecture used consisted of a four-layer network with 6 input nodes, two hidden layers with 30 and 20 nodes, respectively, and four output nodes (one for each class) using softmax. The input features are detailed in Sect. 4.1. The hidden layers used a rectified linear unit (ReLU) [11] activation function, as this produced the best results in testing. The AdaGrad algorithm, a modified stochastic gradient descent, was used for gradient-based optimisation [12]. AdaGrad adapts learning rates by scaling them inversely proportionally to the square root of the sum of all historic values of the gradient; the effect is greater progress in the more gently sloped directions of the parameter space [5]. The addition of a dropout layer was tested but did not improve the results. The DNN was implemented using TensorFlow [13].
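A minimal TensorFlow/Keras sketch of the configuration described above is given below. The paper only states the layer sizes, activation, optimiser, and output type; the loss, learning rate, and training settings shown here are illustrative assumptions.

```python
import tensorflow as tf

# 6 inputs -> 30 ReLU -> 20 ReLU -> 4 softmax outputs, trained with AdaGrad.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(30, activation="relu", input_shape=(6,)),
    tf.keras.layers.Dense(20, activation="relu"),
    tf.keras.layers.Dense(4, activation="softmax"),
])
model.compile(
    optimizer=tf.keras.optimizers.Adagrad(learning_rate=0.01),  # assumed learning rate
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)
# model.fit(X_train, y_train, epochs=200, batch_size=16)  # X: (n, 6), y: class indices
```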


3.2 SVM The SVM was created with the scikit-learn library [14], as were all algorithms except the DNN. A range of C values and kernels were tested: sigmoid, radial basis function (RBF), polynomial, and linear. A configuration with C = 50, γ = 1/n_features, and a non-linear RBF kernel proved to be the optimal configuration for the model, outperforming the other tested configurations. C is the penalty parameter for the error term; it balances the size of the margin against the number of data-points correctly classified, and defines how much one wants to avoid misclassifying a data-point. The larger C is, the smaller the margin of the selected hyperplane, although this hyperplane classifies more of the training points correctly.
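A short scikit-learn sketch of this configuration might look as follows; note that gamma="auto" reproduces γ = 1/n_features, while the probability flag and variable names are illustrative assumptions.

```python
from sklearn.svm import SVC

# RBF kernel, C = 50, gamma = 1/n_features, as described above.
svm = SVC(kernel="rbf", C=50, gamma="auto", probability=True)
# svm.fit(X_train, y_train); y_pred = svm.predict(X_test)
```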

3.3 KNN The KNN uses k = 3 neighbours with the Minkowski metric with a power of 2 (p = 2), which is equivalent to the Euclidean distance. Uniform weights are used, so all points in a neighbourhood contribute equally.
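The corresponding scikit-learn configuration could be sketched as follows (variable names are illustrative):

```python
from sklearn.neighbors import KNeighborsClassifier

# k = 3, Minkowski metric with p = 2 (Euclidean distance), uniform weights.
knn = KNeighborsClassifier(n_neighbors=3, metric="minkowski", p=2, weights="uniform")
# knn.fit(X_train, y_train); y_pred = knn.predict(X_test)
```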

3.4 DT The implemented decision tree uses the Gini impurity metric as the classification criterion. When creating a split, the best split is taken. A maximum depth was set to avoid all leaves becoming pure. Random forests with different numbers of estimators were also tested for performance.
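A scikit-learn sketch of these models is shown below; the maximum depth and number of estimators are illustrative values, since the paper does not state them.

```python
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier

# Gini criterion, best split, depth limited to avoid fully pure leaves.
tree = DecisionTreeClassifier(criterion="gini", splitter="best", max_depth=5)
forest = RandomForestClassifier(n_estimators=100, criterion="gini", max_depth=5)
```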

4 Dataset The initial dataset consisted of 119 patients' OCT nerve fibre layer thicknesses and visual field sensitivity indices over three visits starting in 2014/2015, totalling 236 eyes. All data were completely anonymised and only included test readings (shared prior to the introduction of the GDPR regulations). The dataset comprises multiple diagnoses: OHT, PDC, normal tension glaucoma (NTG), POAG, CNAG, and healthy. CNAG and PDC cases were excluded due to the limited number of examples in the dataset, which could lead to inaccurate predictions. After cleaning, the dataset was reduced to 196 eyes; Fig. 1 shows the exact breakdown of classes.


Fig. 1 A breakdown of classes in the dataset (Healthy 39, OHT 84, POAG 54, NTG 19). An uneven class distribution within the dataset can be seen

Table 1 Dataset demographics
Parameter | Healthy (n = 39) | Glaucoma (n = 115) | OHT (n = 84)
Gender (male/female) | 20/19 | 57/58 | 43/41
Age, years (mean ± SD) | 69 ± 14.10 | 72 ± 9.56 | 62 ± 10.88
Visual field (MD) | −1.78 ± 2.71 | −5.51 ± 6.88 | −2.16 ± 2.32
Visual field (PSD) | 2.18 ± 1.53 | 4.26 ± 3.37 | 2.51 ± 1.40

Only the most recent visit readings were used for the methods tested, as data were incomplete for previous visits; not all of the tests and measurements take place at every visit. Any patterns with missing data were removed. Table 1 shows the demographics of the dataset by class.

4.1 Feature Selection Features were selected based on their relevance to diagnosis: mean deviation (MD) and pattern standard deviation (PSD) of the visual field, and the superior (Sup), inferior (Inf), nasal, and temporal (Temp) average RNFL segment thicknesses from the disc OCTs. Principal component analysis (PCA) was carried out but produced the same results as the manually selected features. OCT disc retinal nerve fibre layer thickness readings were only provided as four segment averages; because only a limited number of parameters were available from the OCT, PCA did not produce a better feature selection. Various methods were tested to normalise the dataset: centring each feature on its mean and scaling it component-wise to unit variance produced the best results. A small amount of noise was added to the dataset to increase the number of examples. This was implemented using Gaussian noise with a standard deviation of σ = 0.2, where for each original sample X we create a virtual sample X + noise. The generated noise was added to the whole set, doubling the number of examples. This resulted in a slight improvement in classifier accuracy of a couple of percent.
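The following sketch illustrates this preprocessing with scikit-learn and NumPy; the array names, random seed, and the assumption that the noise is added after scaling are illustrative choices.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

def normalise_and_augment(X, y, sigma=0.2, seed=0):
    """Centre/scale features, then double the set with Gaussian-noise virtual samples."""
    X_scaled = StandardScaler().fit_transform(X)          # zero mean, unit variance
    rng = np.random.default_rng(seed)
    X_noisy = X_scaled + rng.normal(0.0, sigma, size=X_scaled.shape)
    X_aug = np.vstack([X_scaled, X_noisy])                # original + virtual samples
    y_aug = np.concatenate([y, y])                        # labels are unchanged by the noise
    return X_aug, y_aug
```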


5 Experimental Results All results are described in terms of sensitivity and specificity; in this case they are the weighted averages of each class's sensitivity and specificity. Results are also measured by the AUC, also known as AUROC, the area under the receiver operating characteristic curve.

5.1 General Chronic Glaucoma Diagnosis In this experiment, all types of glaucoma are defined under the same label, e.g. POAG, CNAG → Glaucoma. This was carried out to test the general ability to classify between glaucoma, OHT and healthy cases.

5.1.1 DNN With this dataset, a sensitivity of 77.52% and a specificity of 86.63% were achieved with k-fold (k = 5) cross validation (CV) on the DNN [15]. Figure 2 shows the receiver operating characteristic (ROC) curve for each class.

5.1.2 SVM Table 2 shows the results obtained using leave-one-out cross validation (LooCV).

Fig. 2 Receiver operating characteristic for each class using DNN

100

L. Williams et al.

Fig. 3 Receiver operating characteristic for each class using SVM

Table 2 Comparison of classifier results, using LooCV (k = n)
Method | Sensitivity | Specificity | AUC
Voting (SVM + KNN) | 81.30 ± 0.13 | 87.92 ± 0.05 | 0.85 ± 0.04
SVM | 80.67 ± 0.14 | 86.80 ± 0.06 | 0.84 ± 0.04
DNN | 78.99 ± 0.08 | 86.87 ± 0.05 | 0.83 ± 0.02
k-Neighbours | 76.68 ± 0.10 | 86.31 ± 0.04 | 0.81 ± 0.03
Random forest | 68.07 ± 0.24 | 77.28 ± 0.12 | 0.73 ± 0.06
Decision tree | 66.60 ± 0.19 | 76.71 ± 0.11 | 0.72 ± 0.04

5.2 More Detailed Glaucoma Diagnosis This experiment aims to gain finer resolution in diagnosing different types of glaucoma. Labels are kept as the diagnoses given by doctors, which consist of four classes: OHT, POAG, NTG, and healthy cases.

5.2.1 SVM With this dataset, the best results were produced by the SVM, with a sensitivity of 82.90%, specificity of 92.18%, and AUC of 0.87. However, as shown in Fig. 4, the NTG ROC curve, with an AUC of 0.88, does not perform as well as those of the other classes. This is due to the low number of patterns contained in the dataset for this class, which causes a weaker 'understanding' of it in the model. Table 3 shows results for all the classifiers tested.


Fig. 4 Class ROC for SVM. Effects of low NTG class count can be seen

Table 3 Comparison of classifier results, using LooCV (k = n)
Method | Sensitivity | Specificity | AUC
Voting (KNN + SVM) | 81.37 ± 0.12 | 91.93 ± 0.03 | 0.87 ± 0.05
DNN | 80.61 ± 0.12 | 92.33 ± 0.02 | 0.86 ± 0.05
SVM | 79.85 ± 0.11 | 91.14 ± 0.03 | 0.85 ± 0.04
k-Neighbours | 75.51 ± 0.14 | 88.16 ± 0.06 | 0.82 ± 0.04
Decision tree | 64.29 ± 0.34 | 79.40 ± 0.14 | 0.72 ± 0.11
Random forest | 63.52 ± 0.28 | 78.42 ± 0.15 | 0.71 ± 0.08

5.3 Voting Classifier A combination of the SVM and the KNN was used to produce a voting classifier using majority rule voting, in an attempt to improve classification accuracy. As shown in Table 3, this was successful.
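A scikit-learn sketch of such a majority-rule (hard) voting ensemble is shown below, reusing the SVM and KNN settings from Sect. 3; it is illustrative rather than the authors' exact implementation.

```python
from sklearn.ensemble import VotingClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

voting = VotingClassifier(
    estimators=[
        ("svm", SVC(kernel="rbf", C=50, gamma="auto")),
        ("knn", KNeighborsClassifier(n_neighbors=3)),
    ],
    voting="hard",  # majority rule on the predicted labels
)
# voting.fit(X_train, y_train); y_pred = voting.predict(X_test)
```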

6 Conclusion In this paper, we show that the proposed systems can predict diagnoses for ocular hypertension, chronic glaucoma, and healthy patients with reasonable confidence. Health services around the globe face challenges dealing with the growing demand placed on them by an increasing prevalence of glaucoma. Glaucoma clinics have particular difficulties compared to other eye diseases, given that the condition is a lifelong disease requiring multiple visits to the hospital to diagnose and monitor. The


use of artificial intelligence to streamline the diagnosis of glaucoma means there is improved clinician availability for complex cases. Furthermore, a system such as the one we have detailed could potentially be used in triaging community-based referrals to hospital eye services: it could 'flag' a referral that is more likely to be true glaucoma rather than ocular hypertension or a healthy case. At present, a referral is graded as urgent or routine by an optician, but there is no grading system for the 'likelihood of glaucoma'. An alert added to referrals would facilitate the task of triaging for the clinician and improve timely access to glaucoma clinics for patients according to clinical need. The resolution of the data collected limits the results of this study; if we had the full set of parameters, a deeper analysis could have been carried out to select better features for the input space. Results are promising, with the DNN and SVM methods producing diagnoses with accuracy similar to healthcare professionals [16], with an AUC of 87%. This could be improved in further work with a larger sample and increased resolution of the data collected. Acknowledgments Authors of this paper acknowledge the funding provided by the Interreg 2 Seas Mers Zeeën AGE'In project (2S05-014) to support the work in the research described in this publication.

References 1. Kingman, S.: Glaucoma is second leading cause of blindness globally. Bull. World Health Organ. 82, 887–888 (2004) 2. Burgansky-Eliash, Z., Wollstein, G., Chu, T., Ramsey, J.D., Glymour, C., Noecker, R.J, Ishikawa, H., Schuman, J.S.: Optical coherence tomography machine learning classifiers for glaucoma detection: a preliminary study. Investig. Ophthalmol. Vis. Sci. 46(11), 4147–4152 (2005) 3. Asaoka, R., Murata, H., Iwase, A., Araie, M.: Detecting preperimetric glaucoma with standard automated perimetry using a deep learning classifier. Ophthalmology 123(9), 1974–1980 (2016) 4. Phan, S., Satoh, S., Yoda, Y., Kashiwagi, K., Oshika, T., Group, J.O.I.R.R. et al.: Evaluation of deep convolutional neural networks for glaucoma detection. Jpn. J. Ophthalmol. 63(3), 276–283 (2019) 5. Goodfellow, I., Bengio, Y., Courville, A., Bengio, Y.: Deep Learning, vol. 1. MIT press Cambridge (2016) 6. Fulcher, J.: Computational intelligence: an introduction. In: Computational Intelligence: a Compendium, pp. 3–78. Springer (2008) 7. Duda, R.O., Hart, P.E., Stork, D.G.: Pattern Classification. Wiley (2012) 8. Haykin, S.: Neural Networks: a Comprehensive Foundation. Prentice Hall PTR (1994) 9. Quinlan, J.R.: Induction of decision trees. Mach. Learn. 1(1), 81–106 (1986) 10. Ripley, B.D.: Pattern Recognition and Neural Networks. Cambridge University Press (2007) 11. Nair, V., Hinton, G.E.: Rectified linear units improve restricted boltzmann machines. In: Proceedings of the 27th International Conference on Machine Learning (ICML-10), pp. 807–814 (2010) 12. Duchi, J., Hazan, E., Singer, Y.: Adaptive subgradient methods for online learning and stochastic optimization. J. Mach. Learn. Res. 12, 2121–2159


13. Abadi, M., Barham, P., Chen, J., Chen, Z., Davis, A., Dean, J., Devin, M., Ghemawat, S., Irving, G., Isard, M., et al.: TensorFlow: a system for large-scale machine learning. OSDI 16, 265–283 (2016) 14. Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M., Duchesnay, E.: Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011) 15. Mosteller, F., Tukey, J.W.: Data Analysis and Regression: A Second Course in Statistics. Addison-Wesley Series in Behavioral Science: Quantitative Methods (1977) 16. Andersson, S., Heijl, A., Bizios, D., Bengtsson, B.: Comparison of clinicians and an artificial neural network regarding accuracy and certainty in performance of visual field assessment for the diagnosis of glaucoma. Acta Ophthalmol. 91(5), 413–417 (2013)

Research on Encrypted Face Recognition Algorithm Based on New Combined Chaotic Map and Neural Network Jiabin Hu, Jingbing Li, Saqib Ali Nawaz, and Qianguang Lin

1 Introduction In the current environment of big data information transmission and processing, the use of traditional dynamic passwords and mobile phone authentication of personal identity brings information security issues of varying severity, such as information leakage, misappropriation, and loss. Biometric identification technology was subsequently proposed to address this problem and has, to a certain extent, solved the information security problems brought about by traditional authentication methods, such as personal information theft. However, the cloud computing information security problem brought about by biometric technology is a key problem that urgently needs to be solved, and this has become a major limitation on the development of biometric technology and cloud computing. At present, there are many algorithms for image encryption and face recognition. The principle of image encryption based on modern cryptography is to treat the image data as a binary stream and directly use modern cryptography for encryption and decryption, while modern cryptosystems are targeted at text data (one-dimensional data).

J. Hu · J. Li (B) · S. A. Nawaz · Q. Lin College of Information and Communication Engineering, Hainan University, Haikou 570228, China e-mail: [email protected] J. Hu e-mail: [email protected] S. A. Nawaz e-mail: [email protected] Q. Lin e-mail: [email protected] State Key Laboratory of Marine Resource Utilization in the South China Sea, Hainan University, Haikou 570228, Hainan Province, China © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2020 Y.-W. Chen et al. (eds.), Innovation in Medicine and Healthcare, Smart Innovation, Systems and Technologies 192, https://doi.org/10.1007/978-981-15-5852-8_10

Such encryption designs do not take the characteristics of image data into account, so it is difficult to meet the needs of practical applications [1]. Modern cryptosystems often have complex structures, large amounts of computation, and low encryption efficiency, and are not suitable for image encryption [2]. The basic idea of image encryption algorithms based on matrix transformation is to perform a limited number of elementary matrix transformations on the image matrix, which can effectively disrupt the order of the input plaintext and then cover the plaintext information to achieve the purpose of encryption. Common matrix transformation methods include magic square transformation [3] and Arnold transformation [4]. However, this type of algorithm only scrambles the pixel positions and does not change the pixel values; the histogram of the image before and after scrambling does not change, so it is difficult to resist statistical attacks. In addition, the key space of this type of algorithm is small [5], and it cannot resist key-exhaustion attacks. The principle of chaotic image encryption algorithms is to treat the initial values and parameters of the chaotic system as keys, use the keys and chaotic systems to generate real-valued chaotic sequences, and quantise them into integer chaotic sequences (generally with elements that are integers between 0 and 255). The integer chaotic sequence interacts with the original image through some reversible rule to achieve image encryption. The security of this type of encryption algorithm depends on the randomness of the key stream (chaotic sequence): the closer the key stream is to a random sequence, the higher the algorithm's security; otherwise, it is easy to break [6]. According to the encryption object, image encryption can also be divided into spatial-domain (time-domain) encryption and transform-domain (frequency-domain) encryption. However, because the correlation between pixels is destroyed by the encryption process, the effect of image compression becomes worse [7]. Image encryption algorithms based on the frequency domain encrypt the transform coefficients. They can encrypt only some important data (called selective or local encryption) according to the characteristics of the human visual system, which can significantly reduce the amount of encrypted data and improve encryption efficiency. At the same time, such algorithms can be combined well with compression algorithms to reduce the amount of data transmitted over the network. However, this type of algorithm requires image data conversion from the spatial domain to the transform domain and back again, which increases the amount of additional computation, and decrypted images often show a certain degree of distortion [8]. The face image database is the focus of face recognition security. Erkin et al. first proposed a homomorphic encryption-based privacy-preserving face recognition scheme in 2009 [9], which can operate on encrypted data without leaking the original data during the operation; the decrypted results are the same as those obtained on the data before encryption. It guarantees the security of the original data and is widely used in cloud data storage. Bringer et al. completed the authentication of face images using additive homomorphism in 2013 and summarised additively homomorphic biometric authentication, but did not further study the recognition of encrypted face images on this basis [10].
In 2018, Wang [11] proposed a KDSPE algorithm combining homomorphic encryption with an oblivious transfer protocol based on an identity encryption system. The accuracy of this method was improved, but the computational burden was too large and the execution efficiency was low. Aiming at the security problems of face recognition technology, and from the perspective of protecting data privacy and security, this paper proposes an image encryption method based on Logistic and Sine dual chaotic mapping, combining an image encryption algorithm with a face recognition algorithm. At the same time, a neural network is used to realise encrypted face recognition. Experimental analysis shows that the algorithm has good encryption performance and a good face recognition rate, which could solve the security problems existing in current face recognition systems.

2 Theoretical Basis 2.1 Logistic Chaotic Mapping The Logistic mapping is a nonlinear iterative equation, defined as follows:

x_{k+1} = μ x_k (1 − x_k)    (1)

When 3.5699 < μ ≤ 4, the system is in a chaotic state and generates a random, ergodic sequence. When μ = 4, the chaotic equation becomes x_{n+1} = 4 x_n (1 − x_n) and the chaotic effect is best; however, because the chaotic parameter is then a constant, the security of data encryption is weakened. Therefore, an expression whose value stays close to 4 is used instead of the constant chaotic parameter μ, and the Logistic mapping equation is improved to

x_{n+1} = [3.5699 + (4 − 3.5699) · sin(π x_n / 2)] · x_n (1 − x_n)    (2)

where the value range of x_n is 0 < x_n < 1 and the sine argument is in radians. The following shows that the function y = 3.5699 + (4 − 3.5699) · sin(π x / 2) keeps formula (2) in the chaotic regime: when x ∈ (0, 1), sin(π x / 2) ∈ (0, 1), so the value range of y is (3.5699, 4), and formula (2) satisfies the condition for the system to enter chaos. As the parameter changes from iteration to iteration, the security of the encryption is strengthened.

2.2 Sine Chaotic Mapping Among classic chaotic mapping equations, the sine function occupies an important position. The Sine chaotic mapping is a chaotic map based on the sine function, defined as

x_{n+1} = μ1 · sin(π x_n)    (3)

The Sine map exhibits chaotic behaviour when the control parameter μ1 ∈ (0.87, 0.93) or μ1 ∈ (0.95, 1); the closer the control parameter μ1 is to 1, the better the chaotic performance.

2.3 Combined Chaotic Mapping Combining the Sine map and the Logistic map gives a new combined chaotic map:

x_{n+1} = a · μ1 · sin(π x_n) + (1 − a) · μ2 · x_n (1 − x_n),  a ∈ (0, 1)    (4)
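A short Python sketch of iterating Eq. (4) is given below. The initial value x0 and μ1 follow Sect. 4.1, and μ2 = 4 matches the Logistic parameter used there; the mixing weight a and the length of the discarded transient are assumptions, since the paper does not specify them.

```python
import math

def combined_map(x0=0.135, a=0.5, mu1=0.9, mu2=4.0, length=1000, transient=200):
    """Iterate Eq. (4); the first `transient` values are discarded (cf. step (3) of Sect. 3)."""
    x, seq = x0, []
    for i in range(transient + length):
        x = a * mu1 * math.sin(math.pi * x) + (1 - a) * mu2 * x * (1 - x)
        if i >= transient:
            seq.append(x)
    return seq
```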

2.4 BP Neural Network The backpropagation (BP) algorithm includes two processes: forward propagation of signals and backward propagation of errors. That is, the error of the output is calculated in the direction from input to output, and the weights and thresholds are adjusted in the direction from output to input. During forward propagation, the input signal acts on the output nodes through the hidden layers; after the nonlinear transformations, the output signal is generated. If the actual output does not match the expected output, the error is passed to the backward propagation process. Error backpropagation transmits the output error layer by layer through the hidden layers back to the input layer and distributes the error to all units in each layer; the error signal obtained for each layer is used as the basis for adjusting the weights of each unit. By adjusting the connection strengths between the input nodes and the hidden-layer nodes, between the hidden-layer nodes and the output nodes, and the threshold values, the error decreases along the gradient direction. After repeated learning and training, the network parameters (weights and thresholds) corresponding to the minimum error are determined and training stops. The trained neural network can then process input information from similar samples by itself, applying the learned non-linear transformation with the smallest output error. In this paper, the number of nodes in the input layer is 70, there are 2 hidden layers with 60 and 15 nodes, and the number of nodes in the output layer is 40.

3 Image Encryption Algorithm A single chaotic system has a relatively simple form, which is vulnerable to attack and deciphering. To effectively protect the original data image, Logistic and Sine dual


chaotic systems are used to construct an encryption matrix to encrypt the image. The detailed encryption steps are as follows: (1) Read the plaintext image and store it as a two-dimensional matrix P; obtain the height H and width W of the plaintext image, and calculate the sum of all element values in P (denoted SUM). (2) Rotate P clockwise by 180° to get P′; starting with element P′(1, 1), arrange the elements one by one, then take P′(2, 2), P′(3, 3), …, P′(n, n), and convert the elements of the two-dimensional matrix P′ into a one-dimensional array A. (3) Iterate Eq. (4) t (≥ 200) times to remove the influence of the initial value, and then iterate 3 × H × W more times to generate three sequences S1, S2, and S3 of length H × W. (4) The sequence S1 is arranged in ascending order to obtain a new sequence S11, which is stored in a one-dimensional array K. (5) Exchange the elements A(i) and A(K(i)) in the one-dimensional array A to complete the scrambling process. (6) Convert the element values in S2 into an integer sequence X according to the following formula:

X(i) = floor(|S2(i)| × A(i) × 2^32 / SUM) mod 256,  i ∈ [1, H × W]    (5)

(7) Convert the elements in S3 into an integer sequence Y according to the following formula:

Y(i) = floor(|S3(i)| × A(i) × 10^13 / SUM) mod 256,  i ∈ [1, H × W]    (6)

(8) Diffuse the elements in the one-dimensional array A according to the following formula:

C(i) = ((X(i) ⊕ A(i)) + Y(i)) mod 256,  i ∈ [1, H × W]    (7)

(9) Convert the one-dimensional matrix C into a two-dimensional matrix to obtain a digital matrix of the encrypted image. The image encryption process is shown in Fig. 1. The decryption process is the opposite of the encryption process. The detailed steps for the entire face recognition are as follows: (1) First, encrypt the original image database using the proposed encryption algorithm. (2) Use the PCA algorithm to extract features from the encrypted image database to obtain the projection matrix U.


Fig. 1 Image encryption flowchart

(3) The training samples are projected through the projection matrix U and used as neural network inputs to train the neural network. (4) Encrypt the image to be tested using the proposed encryption algorithm to obtain an encrypted image E(j). (5) Project the encrypted image E(j) through the projection matrix U to obtain the reduced-dimensional matrix E′(j). (6) Finally, the dimensionality-reduced matrix E′(j) is input to the trained neural network to complete face recognition. A simplified sketch of the scrambling and diffusion steps above is given below.
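The following NumPy sketch gives a simplified reading of scrambling and diffusion steps (4)–(8) and Eqs. (5)–(7). It is illustrative only: the exact element ordering in steps (2) and (4), and whether A is taken before or after scrambling in Eqs. (5)–(6), are interpretation choices rather than details confirmed by the paper.

```python
import numpy as np

def encrypt(P, s1, s2, s3):
    """P: grayscale image (H, W) of uint8; s1, s2, s3: chaotic sequences of length H*W."""
    s1, s2, s3 = map(np.asarray, (s1, s2, s3))
    H, W = P.shape
    SUM = int(P.sum())
    A = np.rot90(P, 2).flatten().astype(np.int64)            # steps (1)-(2), simplified ordering
    K = np.argsort(s1)                                        # step (4): ascending-order indices
    for i in range(H * W):                                    # step (5): scramble by swapping
        A[i], A[K[i]] = A[K[i]], A[i]
    X = (np.floor(np.abs(s2) * A * 2**32 / SUM) % 256).astype(np.int64)   # Eq. (5)
    Y = (np.floor(np.abs(s3) * A * 10**13 / SUM) % 256).astype(np.int64)  # Eq. (6)
    C = (np.bitwise_xor(X, A % 256) + Y) % 256                # Eq. (7): diffusion
    return C.reshape(H, W).astype(np.uint8)                   # step (9)
```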

4 Experimental Results and Test Analysis The experiments were implemented in MATLAB 2016a. The face images used were from the ORL database, which contains 40 people labeled with serial numbers 1–40; each person has 10 facial images with different expressions. The pixel size of each image is 92 × 112 with 256 gray levels. The first 5 expression images of each person were selected as training samples for the neural network, and the last 5 expression images of each person were used to test the recognition rate.

Fig. 2 Original images and the corresponding encrypted images

4.1 Encryption Effect The key parameters used to encrypt the image are as follows: Logistic mapping selects the initial value x 0 = 0.135, Logistic parameter μ = 4, Sine mapping select μ1 = 0.9, and the first expression face of the top 10 people in the ORL face database is used as a test image. Original image and encrypted image are shown in Fig. 2, respectively. It can be seen from the figure that the encrypted image works well and is completely different from the original image.

4.2 Sensitivity Analysis One of the emoticons in the sample was taken as the subject for decryption analysis. The original face image and the corresponding encrypted image are shown in Fig. 3a, b, respectively. Figure 3c is the image after the decryption with the correct initial value as the key. As shown in Fig. 3c, the decrypted image is the same as the original image. When the wrong key is used for decryption, it can be seen from Fig. 3d that the wrong key cannot decrypt the encrypted image, and the image obtained is inconsistent with the original image. This shows that new combined chaotic map encryption method is highly sensitive.

112

J. Hu et al.

(a) Original face image

(b) Encrypted face image

(d) The image

(c) The image after proper

after incorrect

decryption

decryption

Fig. 3 a Original face image b Encrypted face image c The image after proper decryption d The image after incorrect decryption

Table 1 Correct recognition rate of unencrypted and encrypted algorithms
Recognition method | Recognition rate (%) | Recognition time (s) | Number of iterations
Unencrypted face | 81.5 | 9 | 1820
Encrypted face | 92.5 | 9.5 | 2000

4.3 Algorithm Recognition Rate The recognition rates of the unencrypted face method in reference [12] and of the encrypted face algorithm proposed in this paper are shown in Table 1. As can be seen, the recognition rate for unencrypted faces is 81.5%, while the recognition rate of the proposed encryption algorithm is 92.5%. The proposed method not only improves the accuracy of face recognition but also enhances the security of the image data.

4.4 Algorithm Robustness The first expression image in the ORL face database was taken as the experimental object, and the face image was subjected to conventional attacks, geometric attacks, and occlusion attacks to test robustness. The experimental data obtained with the method of reference [12] and with the proposed encryption algorithm are shown in Table 2; S, M, and L in Table 2 indicate the light intensity and the relative size of the occlusion area (L > M > S). Figure 4 shows the original images under some of the attacks and the corresponding encrypted images. From the data in Table 2, it can be seen that for conventional attacks, as the intensity of the Gaussian noise and JPEG compression attacks gradually increases, both the new combined chaotic map encryption algorithm proposed in this paper and the unencrypted algorithm accurately identify the person with serial number 1, indicating that the new combined chaotic map encryption algorithm has good resistance to conventional attacks. Both the encrypted algorithm and the unencrypted


Table 2 Experimental results of different attacks on unencrypted and encrypted face algorithms
Way of attack | Attack strength coefficient | PSNR | Unencrypted | Encrypted
Gaussian noise (%) | 10% | 10.68 | True | True
Gaussian noise (%) | 20% | 9.16 | True | True
Gaussian noise (%) | 30% | 8.13 | True | True
JPEG compression (%) | 5% | 24.25 | True | True
JPEG compression (%) | 10% | 26.96 | True | True
JPEG compression (%) | 15% | 29.32 | True | True
Rotation (clockwise) | 20° | 12.25 | False | True
Rotation (clockwise) | 30° | 9.62 | False | True
Rotation (clockwise) | 40° | 8.46 | False | True
Rotation (clockwise) | 50° | 7.83 | False | True
Rotation (clockwise) | 60° | 7.53 | False | False
Occlusion | S | 22.32 | True | True
Occlusion | M | 20.78 | True | True
Occlusion | L | 17.64 | False | True
Illumination | −L | 14.33 | True | True
Illumination | −M | 17.42 | True | True
Illumination | −S | 25.65 | True | True
Illumination | S | 25.61 | True | True
Illumination | M | 17.35 | True | True
Illumination | L | 14.24 | True | True

Fig. 4 Original images and corresponding encrypted images under different attacks: a 10% Gaussian noise, b 5% JPEG compression, c 15° clockwise rotation, d occlusion, e 10% downward shift, f illumination


algorithm can identify the first person when the downward shift is between 5% and 9%. However, when the image is rotated clockwise by between 20° and 60°, the unencrypted algorithm misidentifies the subject as samples with other serial numbers, whereas the encrypted algorithm is not misled until the rotation reaches 60°. This shows that the anti-geometric-attack capability of the encryption algorithm is better than that of the unencrypted algorithm. Occlusion is a difficult problem in face recognition; the experimental data in Table 2 show that, under certain occlusion attacks, both the unencrypted algorithm and the new combined chaotic map encryption algorithm can identify the test subjects, indicating that both have a certain anti-occlusion capability. In summary, the new combined chaotic map encryption algorithm proposed in this paper not only improves the security of the data but also has good robustness. Figure 4 shows the original images and corresponding encrypted images under different attacks.

5 Conclusion This paper presented a new algorithm for face recognition based on a combined chaotic map and a neural network. First, a key was generated from the chaotic sequence and the face images were encrypted with this key; then the PCA method and a neural network were used for face recognition. Experiments verified that the encryption algorithm has good robustness, with good resistance to conventional attacks, geometric attacks, and occlusion attacks. It not only improves the accuracy of face recognition but also enhances the security of the image data, and therefore has broad application prospects. Acknowledgments This work is supported by the Hainan Provincial Natural Science Foundation of China [No. 2019RC018], the Natural Science Foundation of Hainan [617048, 2018CXTD333], the Science and Technology Research Project of Chongqing Education Commission [KJQN201800442], and the Special Scientific Research Project of Philosophy and Social Sciences of Chongqing Medical University [201703].

References 1. Sun, F., Liu, S., Li, Z., et al.: A novel image encryption scheme based on spatial chaos map. Chaos Solitons Fractals 38(3), 631–640 (2008) 2. Behnia, S., Akhshani, A., Mahmodi, H., et al.: A novel algorithm for image encryption based on mixture of chaotic maps. Chaos Solitons Fractals 35(2), 408–419 (2008) 3. Zhang, L., Ji, S., Xie, Y., et al.: Principle of image encrypting algorithm based on magic cube transformation (2005) 4. Sharma, P., Patel, D., Shah, D., et al.: Image security using Arnold method in tetrolet domain. In: 2016 4th International Conference on Parallel, Distributed and Grid Computing, pp. 312–315 (2016)


5. Chen, D.: A feasible chaotic encryption scheme for image. In: International Workshop on Chaos-fractals Theories & Applications. IEEE (2009) 6. Xiaofeng, L., et al.: A novel image encryption algorithm based on self-adaptive wave transmission. Signal Process. (2010) 7. Wang, H., Wang, J., Geng, Y.-C.: Quantum image encryption based on iterative framework of frequency-spatial domain transforms. Int. J. Theor. Phys. 56(8):1–21 (2017) 8. Dubey, A.K.: Chaos based encryption and decryption of image and video in time and frequency domain 9. Erkin, Z., Franz, M., Guajardo, J., et al.: Privacy-Preserving Face Recognition. Privacy Enhancing Technologies (2009) 10. Bringer, J., Chabanne, H., Favre, M., Patey, A., Schneider, T., Zohner, M.: Faster privacypreserving distance computation and biometric Identification. In: Proceedings of the 2nd ACM Workshop on Information Hiding and Multimedia Security, pp. 187–198 (2014) 11. Wang, S.: Design and research of face recognition and homomorphic encryption based on image subspace and kernel sparse representation (2018) 12. Zhao, L.: Principal component analysis (PCA) face recognition algorithm based on BP neural network. Comput. Eng. Appl. 43(36), 226–229 (2007)

A 3D Shrinking-and-Expanding Module with Channel Attention for Efficient Deep Learning-Based Super-Resolution

Yinhao Li, Yutaro Iwamoto, and Yen-Wei Chen

Y. Li · Y. Iwamoto · Y.-W. Chen (B): Graduate School of Information Science and Engineering, Ritsumeikan University, Shiga, Japan. e-mail: [email protected]

1 Introduction

Recently, deep learning-based super-resolution (SR) has become an effective and popular approach for image restoration [1–10]. For example, in 2014, Dong et al. proposed a neural network model consisting of three convolutional layers, known as the SR convolutional neural network (SRCNN) [1]. SRCNN established that SR-based convolutional neural networks (CNNs) consist of three procedures: feature extraction, non-linear mapping, and reconstruction. In [2], Kim et al. proposed a combination of residual learning and 20 convolutional layers for SR, which improves the capability of CNNs for SR. Moreover, Shi et al. proposed a CNN that learns and constructs sub-pixel images through pixel shuffling, known as the efficient sub-pixel CNN [3]. Zhang et al. combined the advantages of residual connections and dense skip connections in the residual dense network (RDN) [4]. The RDN optimizes feature-map extraction from the low-resolution image and the non-linear mapping used to restore a high-resolution image.

Nevertheless, most previous deep learning-based SR methods were proposed for 2-dimensional (2D) natural image restoration. Although some SR methods have been proposed for 2D medical image SR [5–7], better results can be obtained with a 3D model. The work of [8] demonstrates that SR of medical volumetric data using a 3D CNN outperforms slice-by-slice 2D processing. Converting state-of-the-art 2D SR models into 3D versions and improving parts of their structure are the most common ways to achieve better accuracy. However, training and testing 3D models is challenging because they have more parameters, consume more memory, and are more computationally intensive.




Recently, the DCSRN [9] and DCSRN-generative adversarial network (GAN) [10] models by Chen et al. achieved visually good results with relatively few parameters. However, a network with few layers is not sufficient for satisfactory results, and GAN-based models may tend to yield false details, which is a critical problem in medical image processing. Hence, motivated by existing efficient models and by squeeze-and-excitation networks [11], we propose an improved 3-dimensional (3D) CNN-based SR method for medical volumetric data. We reformulate the standard 3D convolution layer by shrinking the input feature dimension with a 3D pointwise convolution before non-linear mapping and expanding it back afterward. Unlike tensor-decomposition-based convolution, all 3D channel-wise convolutions operate across channels. A squeeze-and-excitation architecture (similar to a channel attention model) is then used to improve network accuracy with little increase in parameters. We also demonstrate that the proposed module can be applied to several 3D CNNs, and that applying our architecture to state-of-the-art SR models significantly improves speed and accuracy with fewer parameters.

In summary, our main contribution is an improved 3D RDN structure for SR: we propose an improved 3D RDN that uses a shrinking-and-expanding module with channel attention for 3D volumetric image SR, which effectively speeds up processing while maintaining accuracy. Since a 3D model has far more parameters than its 2D counterpart, the effect of this weight reduction is significant. We also demonstrate that the proposed module can be combined with other arbitrary SR networks to improve accuracy.

The rest of the paper is organized as follows. Section 2 describes the proposed 3D SR models. Section 3 presents experiments and comparisons against state-of-the-art networks. Finally, Sect. 4 concludes the work.

2 A 3D Shrinking-and-Expanding Module with Channel Attention for Volumetric Image Super-Resolution

2.1 Standard 3D Convolution

Generally, weights in 2D CNNs can be described by 3D filters as $W \in \mathbb{R}^{X \times Y \times S}$, where $S$ is the number of input channels and $X$ and $Y$ are the spatial dimensions of the filter. Specifically, the convolution for each output channel can be expressed as follows:

$$F_t(x, y) = \sum_{x'=-X/2}^{X/2} \sum_{y'=-Y/2}^{Y/2} \sum_{s=1}^{S} I_{2D}(x - x', y - y', s)\, W_t(x', y', s), \qquad (1)$$

where $I_{2D} \in \mathbb{R}^{H \times W \times S}$ is a set of 2D input feature maps of size $H \times W$ and $t$ is the output channel index.



Fig. 1 a Standard 3D convolution, b 3D shrinking-and-expanding module, c proposed module (3D shrinking-and-expanding module with channel attention)

Weights in standard 3D CNNs can be described by four-dimensional filters as $W \in \mathbb{R}^{X \times Y \times Z \times S}$, where $S$ is the number of input channels and $X$, $Y$, and $Z$ are the spatial dimensions of the filter. In previous SR models, the standard 3D convolution shown in Fig. 1a can generally be expressed as follows:

$$F_t(x, y, z) = \sum_{x'=-X/2}^{X/2} \sum_{y'=-Y/2}^{Y/2} \sum_{z'=-Z/2}^{Z/2} \sum_{s=1}^{S} F_{3D}(x - x', y - y', z - z', s)\, W_t(x', y', z', s), \qquad (2)$$

where $F_{3D} \in \mathbb{R}^{H \times W \times L \times S}$ is the set of 3D feature maps output by the previous layer, $t$ is the output channel index (ranging from 1 to $T$, the number of output channels), and $H$, $W$, and $L$ are the spatial dimensions of the input. A 2D convolution layer has $X \times Y \times S \times T$ trainable parameters, where $X \times Y$ is the kernel size, $S$ is the number of input channels, and $T$ is the number of output channels. Hence, the corresponding 3D convolution layer has $X \times Y \times Z \times S \times T$ parameters, i.e., the number of parameters increases by a factor of $Z$ when a 2D network is converted into a 3D model. Building highly efficient and accurate 3D CNN SR models for medical volumetric data is therefore challenging.
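To make the factor-of-$Z$ increase concrete, the following minimal sketch compares the weight counts of a 2D and a 3D convolution with the same kernel size and channel counts. It assumes tf.keras; the layer sizes are arbitrary and chosen only for illustration.

```python
# Illustrative comparison of 2D vs. 3D convolution parameter counts
# (biases excluded via use_bias=False). With X = Y = Z = 3 and S = T = 32,
# the 3D layer has Z = 3 times as many weights as the 2D layer.
import tensorflow as tf
from tensorflow.keras import layers

conv2d = tf.keras.Sequential([tf.keras.Input((64, 64, 32)),
                              layers.Conv2D(32, 3, use_bias=False)])
conv3d = tf.keras.Sequential([tf.keras.Input((64, 64, 64, 32)),
                              layers.Conv3D(32, 3, use_bias=False)])

print(conv2d.count_params())  # 3*3*32*32   = 9,216
print(conv3d.count_params())  # 3*3*3*32*32 = 27,648
```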

2.2 The Architecture of the 3D Shrinking-and-Expanding Module with Channel Attention

Figure 1b shows the basic improved architecture for reducing parameters and computational complexity. In the first layer, a 3D pointwise (1 × 1 × 1) convolution shrinks the channels (feature maps) from $S$ to $R$ before non-linear mapping, which is expressed as

$$F_t(x, y, z) = \sum_{s=1}^{S} F_{3D}(x, y, z, s)\, W_t(1, 1, 1, s), \qquad (3)$$

where $t$ is the output channel index, ranging from 1 to $R$. Then, the following standard 3D convolution is applied:

$$F_t(x, y, z) = \sum_{x'=-X/2}^{X/2} \sum_{y'=-Y/2}^{Y/2} \sum_{z'=-Z/2}^{Z/2} \sum_{r=1}^{R} F_{3D}(x - x', y - y', z - z', r)\, W_t(x', y', z', r), \qquad (4)$$

where $F_{3D} \in \mathbb{R}^{H \times W \times L \times R}$ is the set of 3D feature maps output by the previous layer, and $H$, $W$, and $L$ are the spatial dimensions of the input; here, too, $t$ ranges from 1 to $R$. In the final layer, we use a 1 × 1 × 1 convolution to expand the channels from $R$ to $T$. The operation is expressed as

$$F_t(x, y, z) = \sum_{r=1}^{R} F_{3D}(x, y, z, r)\, W_t(1, 1, 1, r), \qquad (5)$$

where $t$ ranges from 1 to $T$. Thus, the architecture in Fig. 1b requires $SR + XYZR^2 + RT$ parameters. With this architecture, the compression ratio $E$ is expressed as follows:

$$E = \frac{XYZST}{SR + XYZR^2 + RT}. \qquad (6)$$

Although this architecture reduces the parameters significantly, we found that the network accuracy decreases. Thus, inspired by squeeze-and-excitation networks, we add a squeeze-and-excitation block after the architecture introduced above, as shown in Fig. 1c, to maintain accuracy with as few additional parameters as possible. Finally, the compression ratio $E_{attention}$ is expressed as

$$E_{attention} = \frac{XYZST}{SR + XYZR^2 + RT + (\sigma + 1)T}, \qquad (7)$$

where $\sigma$ is a compression ratio controlling the number of compressed channels in the squeeze-and-excitation block. The experimental results demonstrate that the proposed structure restores the accuracy to the level of the original backbone model.
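As a concrete illustration of Eqs. (3)–(7), the following is a minimal tf.keras sketch of the shrinking-and-expanding module with channel attention (Fig. 1c). The function names, activation choices, and the use of global average pooling for the squeeze step are our own assumptions for illustration; the paper does not specify these implementation details.

```python
# Hedged sketch of the proposed module (Fig. 1c): 1x1x1 shrink (S -> R),
# k x k x k mapping (R -> R), 1x1x1 expand (R -> T), then channel attention.
import tensorflow as tf
from tensorflow.keras import layers

def se_block_3d(x, channels, sigma=0.5):
    """Squeeze-and-excitation over the channel axis of a 5D feature map."""
    w = layers.GlobalAveragePooling3D()(x)                       # squeeze: (B, T)
    w = layers.Dense(int(channels * sigma), activation='relu')(w)
    w = layers.Dense(channels, activation='sigmoid')(w)          # excitation weights
    w = layers.Reshape((1, 1, 1, channels))(w)
    return layers.Multiply()([x, w])                             # channel re-weighting

def shrink_expand_module(x, out_channels, r=None, kernel=3, sigma=0.5):
    """R defaults to T/2, matching the 50% channel compression used in the paper."""
    r = r or out_channels // 2
    h = layers.Conv3D(r, 1, padding='same', activation='relu')(x)        # Eq. (3)
    h = layers.Conv3D(r, kernel, padding='same', activation='relu')(h)   # Eq. (4)
    h = layers.Conv3D(out_channels, 1, padding='same')(h)                # Eq. (5)
    return se_block_3d(h, out_channels, sigma)

# Worked example of Eq. (6), before the attention block: with S = T = 32, R = 16,
# and a 3x3x3 kernel, a standard Conv3D needs 3*3*3*32*32 = 27,648 weights, while
# the shrink-expand path needs 32*16 + 3*3*3*16*16 + 16*32 = 7,936, so E ~ 3.5.
inp = layers.Input((16, 16, 16, 32))
out = shrink_expand_module(inp, out_channels=32)
model = tf.keras.Model(inp, out)
```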



Fig. 2 a Illustration of layers, b 3D RDN and c improved 3D RDN. In our proposed method, all convolutional layers except the first and last can be replaced by the proposed module (illustrated in Fig. 1c)

2.3 Combining the Shrinking-and-Expanding Module with the 3D Residual Dense Net

Although several deep learning networks have been proposed for SR, as discussed in Sect. 1, the RDN achieves the best performance among them on 2D images. Hence, we converted its 2D convolution layers to 3D convolution layers, as shown in Fig. 2b. We then replaced most of the 3D convolution layers with our proposed module, so that the residual dense blocks (RDBs) of the original RDN become 3D RDBs. A voxel shuffle (i.e., 3D up-scaling) is realized by a combination of a 3D upsampling layer and a 3D convolution layer, as shown in Fig. 2c.
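The following hedged sketch shows how a 3D RDB and the voxel-shuffle tail described above could be assembled. Plain Conv3D layers stand in for the proposed module of Fig. 1c; the block depth and growth rate follow Sect. 3.1 (growth rate 32, six layers per RDB), while all other details are illustrative assumptions rather than the authors' exact implementation.

```python
# Hypothetical sketch of a 3D residual dense block (RDB) and x2 voxel up-scaling.
import tensorflow as tf
from tensorflow.keras import layers

def rdb_3d(x, growth_rate=32, n_layers=6):
    """Dense connections inside the block, 1x1x1 fusion, then a local residual."""
    feats = [x]
    for _ in range(n_layers):
        inp = layers.Concatenate()(feats) if len(feats) > 1 else feats[0]
        feats.append(layers.Conv3D(growth_rate, 3, padding='same',
                                   activation='relu')(inp))
    fused = layers.Conv3D(x.shape[-1], 1, padding='same')(layers.Concatenate()(feats))
    return layers.Add()([x, fused])

def upscale_3d(x, scale=2, channels=32):
    """3D up-scaling realized as a 3D upsampling layer followed by a 3D convolution."""
    x = layers.UpSampling3D(size=scale)(x)
    return layers.Conv3D(channels, 3, padding='same')(x)

inp = layers.Input((16, 16, 16, 32))
out = upscale_3d(rdb_3d(inp))
model = tf.keras.Model(inp, out)  # one 3D RDB followed by x2 voxel up-scaling
```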

3 Experiments

3.1 Data Preparation and Parameter Settings

In this paper, we used T1 images from the IXI database [12], a large public database of brain MRI images. We used a total of 50 training, 10 validation, and 100 test volumes of size 250 × 250 × 150 voxels to compare the quality of our networks. These data serve as ground-truth (GND) HR images and are degraded to low-resolution images at a ×2 scale in the X, Y, and Z directions.
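The paper does not state which operator was used to degrade the ground-truth volumes; the sketch below uses cubic spline downsampling from scipy purely as one plausible, hypothetical choice.

```python
# Hypothetical x2 degradation of a ground-truth volume in X, Y, and Z.
# The spline-based operator is an assumption; the paper does not specify it.
import numpy as np
from scipy.ndimage import zoom

def degrade(hr_volume, scale=2):
    """250 x 250 x 150 voxels -> 125 x 125 x 75 at scale 2."""
    return zoom(hr_volume, 1.0 / scale, order=3)

hr = np.random.rand(250, 250, 150).astype(np.float32)  # placeholder for an IXI T1 volume
print(degrade(hr).shape)  # (125, 125, 75)
```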



We implemented all the models in Keras with an NVIDIA Titan X Pascal GPU. We randomly extracted eight low-resolution patches of 16 × 16 × 16 voxels as inputs in each training batch. The growth rate T and the number of proposed modules in one RDB were set to 32 and 6, respectively, and σ in the squeeze-and-excitation block was set to 0.5 empirically. The loss functions of the compared methods were set according to the existing papers, and our models were optimized with the l1 loss function. We used Adam as the optimizer and set the learning rate to $10^{-4}$.
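The training configuration above can be summarized in the following hedged sketch. Only the stated hyperparameters (batches of eight 16 × 16 × 16 patches, l1 loss, Adam with learning rate $10^{-4}$) come from the text; the patch-sampling helper, the stand-in network, and the placeholder volumes are illustrative assumptions.

```python
import numpy as np
import tensorflow as tf

def sample_patches(lr_vol, hr_vol, n=8, size=16, scale=2):
    """Randomly crop n paired LR/HR patches; LR patches are size^3 voxels."""
    lows, highs = [], []
    for _ in range(n):
        x, y, z = (np.random.randint(0, s - size) for s in lr_vol.shape)
        lows.append(lr_vol[x:x+size, y:y+size, z:z+size])
        highs.append(hr_vol[x*scale:(x+size)*scale,
                            y*scale:(y+size)*scale,
                            z*scale:(z+size)*scale])
    return np.stack(lows)[..., None], np.stack(highs)[..., None]

# Stand-in network: a single Conv3D plus x2 upsampling keeps the sketch
# self-contained; in the paper this would be the improved 3D RDN
# (growth rate 32, six modules per RDB, sigma = 0.5).
model = tf.keras.Sequential([
    tf.keras.Input((16, 16, 16, 1)),
    tf.keras.layers.Conv3D(32, 3, padding='same', activation='relu'),
    tf.keras.layers.UpSampling3D(2),
    tf.keras.layers.Conv3D(1, 3, padding='same'),
])
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
              loss='mean_absolute_error')                      # l1 loss

lr_vol = np.random.rand(125, 125, 75).astype(np.float32)       # placeholder volumes
hr_vol = np.random.rand(250, 250, 150).astype(np.float32)
x_batch, y_batch = sample_patches(lr_vol, hr_vol)              # batch of 8 patches
model.train_on_batch(x_batch, y_batch)
```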

3.2 Precision Evaluation

Table 1 presents the quantitative results of our proposed models (with attention, Fig. 1c, and without attention, Fig. 1b) and state-of-the-art SR models. Although the RDN is one of the state-of-the-art 2D SR models, its performance on medical volume data SR is worse than that of the most primitive 3D SR model (3D SRCNN). Among the 3D SR models, 3D SRCNN has the fewest parameters, but its performance is not good. DCSRN and mDCSRN, the state-of-the-art CNN models for 3D medical volume data SR, outperform 3D SRCNN. Nevertheless, the 3D version of the RDN outperforms these networks. To show the effect of the attention module, we compare the models without and with the attention module (Fig. 1b and c, respectively). The results show that the model with the attention module slightly increases both the accuracy and the processing time.

Table 1 Quantitative results. The best results are in bold. (The compression ratio of channels in each proposed module is set as 50%, i.e., R = T/2)

Method                            RMSE   PSNR   SSIM    Time (s)  #param (M)
Tricubic                          6.752  31.54  0.9820  –         –
2D RDN [4]                        5.511  33.31  0.9872  5.062     6.601
3D SRCNN [8]                      5.187  33.83  0.9900  0.531     0.053
DCSRN [9]                         4.789  34.53  0.9915  2.035     0.224
mDCSRN [10]                       4.667  34.75  0.9919  5.328     0.425
3D RDN                            4.366  35.33  0.9933  2.654     5.976
Proposed method (w/o attention)   4.319  35.42  0.9933  1.400     0.667
Proposed method                   4.308  35.45  0.9933  1.532     0.736



The results in Table 1 demonstrate the following: (1) deeper 3D SR models achieve higher accuracy, but their parameter counts increase greatly; (2) our proposed method effectively improves the performance of 3D SR CNNs in terms of both accuracy and speed compared with state-of-the-art deep 3D models.

3.3 Visual Quality Comparison

Figure 3 shows examples of the restored HR images together with the GND, with three typical examples showing a restored 3D MRI in the axial, coronal, and sagittal planes, respectively. We present the results of several state-of-the-art methods and our proposed networks for visual comparison.

Fig. 3 Illustration of SR results with isotropic voxel upsampling (scale factor ×2 in each direction; the compression ratio of the channels in each proposed module is set to 50%, i.e., R = T/2). To show the differences more clearly, a colormap of the residual image relative to the GND image is attached next to each resulting image



To show the differences more clearly, we attach a residual image relative to the GND image beside each resulting image. The figure illustrates that the proposed method performs best: the difference between the MRI data recovered by our approaches and the GND is smaller than for the state-of-the-art methods. It is also worth noting that the results of our proposed model appear very similar to the GND HR image while using only approximately 1/8 of the parameters and 3/5 of the processing time of the standard 3D RDN.

4 Conclusion

In this paper, we proposed an improved 3D CNN model for medical volumetric data SR and demonstrated the superiority of the proposed architecture by comparing it with typical state-of-the-art SR models. The proposed architecture reduces the network parameters considerably and hence decreases the computational complexity. Consequently, our network is well suited to practical situations such as small training samples, short processing times, and embedding into chips.

Acknowledgments This work is supported in part by the Japan Society for the Promotion of Science (JSPS) under Grant No. 19J13820 and by Grants-in-Aid for Scientific Research from the Japanese Ministry of Education, Culture, Sports, Science and Technology (MEXT) under Grant Nos. 18K18078 and 18H03267.

References

1. Dong, C., Loy, C.C., He, K., Tang, X.: Image super-resolution using deep convolutional networks. IEEE Trans. Pattern Anal. Mach. Intell. 38(2), 295–307 (2016)
2. Kim, J., Kwon Lee, J., Mu Lee, K.: Accurate image super-resolution using very deep convolutional networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1646–1654 (2016)
3. Shi, W., Caballero, J., Huszár, F., Totz, J., Aitken, A.P., Bishop, R., Rueckert, D., Wang, Z.: Real-time single image and video super-resolution using an efficient sub-pixel convolutional neural network. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1874–1883 (2016)
4. Zhang, Y., Tian, Y., Kong, Y., Zhong, B., Fu, Y.: Residual dense network for image super-resolution. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2472–2481 (2018)
5. Zhao, C., Carass, A., Dewey, B.E., Prince, J.L.: Self super-resolution for magnetic resonance images using deep networks. In: IEEE 15th International Symposium on Biomedical Imaging, pp. 365–368 (2018)
6. Shi, J., Li, Z., Ying, S., Wang, C., Liu, Q., Zhang, Q., Yan, P.: MR image super-resolution via wide residual networks with fixed skip connection. IEEE J. Biomed. Health Inform. 23(3), 1129–1140 (2018)
7. Zhao, X., Zhang, Y., Zhang, T., Zou, X.: Channel splitting network for single MR image super-resolution. IEEE Trans. Image Process. 28(11), 5649–5662 (2019)
8. Pham, C.H., Ducournau, A., Fablet, R., Rousseau, F.: Brain MRI super-resolution using deep 3D convolutional networks. In: IEEE 14th International Symposium on Biomedical Imaging, pp. 197–200 (2017)



9. Chen, Y., Xie, Y., Zhou, Z., Shi, F., Christodoulou, A.G., Li, D.: Brain MRI super resolution using 3D deep densely connected neural networks. In: IEEE 15th International Symposium on Biomedical Imaging, pp. 739–742 (2018)
10. Chen, Y., Shi, F., Christodoulou, A.G., Xie, Y., Zhou, Z., Li, D.: Efficient and accurate MRI super-resolution using a generative adversarial network and 3D multi-level densely connected network. In: International Conference on Medical Image Computing and Computer-Assisted Intervention, pp. 91–99. Springer, Cham (2018)
11. Hu, J., Shen, L., Sun, G.: Squeeze-and-excitation networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7132–7141 (2018)
12. IXI dataset. https://brain-development.org/ixi-dataset/

Dynamic Facial Features in Positive-Emotional Speech for Identification of Depressive Tendencies

Jia-Qing Liu, Yue Huang, Xin-Yin Huang, Xiao-Tong Xia, Xi-Xi Niu, Lanfen Lin, and Yen-Wei Chen

Jia-Qing Liu and Yue Huang contributed equally to this work and are co-first authors. J.-Q. Liu · Y.-W. Chen: Information Science and Engineering, Ritsumeikan University, Shiga, Japan. Y. Huang · X.-Y. Huang (B) · X.-T. Xia · X.-X. Niu: School of Education, Soochow University, Jiangsu, China. e-mail: [email protected]. L. Lin · Y.-W. Chen: College of Computer Science and Technology, Zhejiang University, Hangzhou, China. Y.-W. Chen: Research Center for Healthcare Data Science, Zhejiang Lab, Hangzhou, China.

1 Introduction

The depression rate among Chinese university students has recently risen to 23.8% [1]. Early identification and treatment of depression are essential for promoting remission of this disease [2]. Depression is traditionally diagnosed through questionnaires and interviews, which rely on patients' and clinicians' reports; the evaluation can therefore be subjective and inconsistent. Additionally, early signs of depressive tendencies are difficult to detect and quantify. Numerous depression-detection approaches have been investigated to date [3–6]. Most studies have been evaluated through the series of international Audio/Visual Emotion Recognition Challenges (AVEC) [7], which focus on clinically depressed patients from western cultures in an interview scenario. Therefore, they may not accurately detect depression in non-clinical individuals in experimental scenarios.

The core symptom of depression is low mood, and facial expressions are the most intuitive indicators of mood. The emotions of patients with mental illness are better detected by facial expression recognition than by questionnaire evaluations, because the former distinguishes nuances among patients, whereas questionnaires can hardly




distinguish different symptoms [8]. Recent studies have shown that facial cues might help identify depression [9, 10]. This study proposes a new approach for predicting depressive tendencies. The proposed method fuses dynamic facial features in positive-emotional speech. We first investigate the discriminative power of synchronized dynamic facial features for depression identification at the sentence level, and then present a committee fusion system that combines different sentences for depression identification. Combining the sentence classifiers is intended to gather the strengths of the individual classifiers and enhance the classification performance. The main contributions of this study are twofold: (1) fusion of dynamic facial features in positive-emotional speech to predict depressive tendencies; (2) a novel preprocessing step that generates synchronized face images using the Penn Forced Aligner (P2FA).

2 Methodology

The methodology proceeds in three steps: (1) forced alignment of the speaking-face data and detection of key-frame faces; (2) per-frame feature generation using the pre-trained VGG-Face model [11], with the resulting features fed into a Long Short-Term Memory (LSTM) model for training; (3) combination of sentence-level predictions by committee fusion. These three components are detailed in the following subsections.

2.1 Preprocessing: Forced Alignment for Word Timings

This section describes the multi-stage pipeline that automatically generates a synchronized face dataset for dynamic facial depression recognition. The processing pipeline is summarized in Fig. 1; most of the steps are based on the approaches described in [12, 13]. The precise timestamp of each uttered word is obtained with the Chinese version of P2FA [14]. For each uttered word, the frame at the middle of its duration is selected as the key frame, and the facial data synchronized with each word are recorded in a transcript file. Faces in each frame are detected with the Dlib [15] frontal face detector (an example word-level alignment is shown in Table 1).
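A minimal sketch of this preprocessing step is given below: for each aligned word, the frame at the midpoint of its interval is taken as the key frame and the face is cropped with the Dlib frontal detector. The alignment tuple format, the frame rate, and the function names are illustrative assumptions, not the authors' exact pipeline.

```python
# Hedged sketch of key-frame extraction: take the frame at the midpoint of each
# aligned word and crop the detected face with Dlib.
import cv2
import dlib

detector = dlib.get_frontal_face_detector()

def extract_key_faces(video_path, alignment, fps=30):
    """alignment: list of (start_time, stop_time, token) from the forced aligner."""
    cap = cv2.VideoCapture(video_path)
    faces = []
    for start, stop, token in alignment:
        if token == 'Short-pause':
            continue
        mid_frame = int(((start + stop) / 2.0) * fps)      # middle of the word
        cap.set(cv2.CAP_PROP_POS_FRAMES, mid_frame)
        ok, frame = cap.read()
        if not ok:
            continue
        dets = detector(cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY))
        if dets:
            d = dets[0]
            faces.append(frame[d.top():d.bottom(), d.left():d.right()])
    cap.release()
    return faces
```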

2.2 Classification Model

The classification model is illustrated in Fig. 2. To capture the continuous evolution of temporal information in the data, we developed an architecture that combines Convolutional Neural Networks (CNNs) and recurrent neural networks.



Fig. 1 Pipeline of generating the dataset

Table 1 Example of the word-level forced alignment

Start_time  Stop_time  Chinese character
0.3425      0.5625
0.5625      0.7925
0.7925      0.8425     Short-pause
0.8425      1.0725
1.0725      1.2425
1.2425      1.3025     Short-pause
1.3025      1.5625
1.5625      1.9925
1.9925      2.2425
2.2425      2.4025
2.4025      2.5925
2.5925      2.8225
2.8225      3.025
3.025       3.1825
3.1825      3.6425     Short-pause

Feature extraction is performed by VGG-Face, a network pre-trained on 2.6 M facial images for face recognition [11]. The sequence-learning part is a single LSTM model composed of 128-dimensional LSTM cells. The features derived from the VGG-Face model are stacked horizontally and then fed into the LSTM model for training. To prevent overfitting, dropout and weight decay are applied to the LSTM cells and the final fully connected layer, respectively. Hyperparameters such as the batch size are chosen by grid search to optimize performance.
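A hedged sketch of this second stage is given below, assuming that per-frame VGG-Face descriptors (e.g., 4096-dimensional) have already been extracted. Apart from the 128-dimensional LSTM cells stated in the text, the layer sizes, dropout rate, and weight-decay value are illustrative assumptions; in the paper such hyperparameters are chosen by grid search.

```python
# Hedged sketch of the sentence-level classifier: stacked per-frame VGG-Face
# features -> 128-unit LSTM -> 2-way softmax (HP vs. DP).
import tensorflow as tf
from tensorflow.keras import layers, regularizers

def build_sentence_classifier(max_frames, feat_dim=4096):
    inputs = layers.Input(shape=(max_frames, feat_dim))          # n key frames per sentence
    x = layers.Masking(mask_value=0.0)(inputs)                   # pad variable-length sentences
    x = layers.LSTM(128, dropout=0.5, recurrent_dropout=0.5)(x)  # sequence learning
    outputs = layers.Dense(2, activation='softmax',
                           kernel_regularizer=regularizers.l2(1e-4))(x)
    return tf.keras.Model(inputs, outputs)

model = build_sentence_classifier(max_frames=20)
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
```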



Fig. 2 Our proposed two-stage model. The first stage is a pre-trained VGG-Face [11] convolutional neural network. The second stage is a long short-term memory network. n is the number of key frames, which depends on the sentence lengths

2.3 Fusion Architecture

Figure 3 shows the architecture of the committee fusion. After training, the networks can classify each sentence separately. The depression classification label $m^*$ is determined by fusing the posterior probabilities as

$$m^* = \arg\max_m \frac{1}{N} \sum_{k=1}^{N} P_k(m), \qquad (1)$$

where $N$ is the number of sentences used in the experiments (N = 7 in this paper), $m$ is the class index (healthy person (HP): m = 1; depressive person (DP): m = 2), and $P_k(m)$ is the posterior probability of class $m$ from the k-th sentence classifier (the output of the k-th model).
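The fusion rule of Eq. (1) amounts to averaging the per-sentence posteriors and taking the argmax, as in the short sketch below; the array shapes and example values are illustrative.

```python
# Minimal sketch of the committee fusion in Eq. (1): average the per-sentence
# posteriors and take the argmax.
import numpy as np

def committee_fusion(posteriors):
    """posteriors: array of shape (N_sentences, N_classes), rows sum to 1."""
    mean_posterior = posteriors.mean(axis=0)          # (1/N) * sum_k P_k(m)
    return int(np.argmax(mean_posterior)) + 1         # class index m* (1 = HP, 2 = DP)

# Example with N = 7 sentence classifiers and two classes (HP, DP):
p = np.array([[0.6, 0.4], [0.3, 0.7], [0.55, 0.45], [0.2, 0.8],
              [0.45, 0.55], [0.5, 0.5], [0.35, 0.65]])
print(committee_fusion(p))  # -> 2 (DP), since the mean DP posterior is higher
```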

Fig. 3 The committee fusion architecture for combining different sentences



3 Dataset

The dataset was part of our previously built multimodal behavioral dataset of Chinese university students [16]. To create this dataset, participants were recorded by a web camera (Logitech C920) while uttering emotional text materials. Three text materials with positive, negative, and neutral valences were selected based on related work [17]. This study used only the dynamic facial features in positive-emotional speech; the positive-emotional sentences are listed in Table 2. 102 valid participants (Chinese university students) were included in the data analysis. They were divided into two groups, depressive persons (DP) and healthy persons (HP), according to their scores on standardized self-report questionnaires (BDI-II [18] and CES-D [19]). The DP group included 51 participants (26 male, 25 female): BDI-II ≥14 and CES-D ≥16. The HP group included 51 participants (26 male, 25 female): BDI-II