Innovation in Medicine and Healthcare: Proceedings of 9th KES-InMed 2021 (Smart Innovation, Systems and Technologies, 242) 9811630127, 9789811630125


English Pages 282 [260] Year 2021


Table of contents:
InMed 2021 Organization
Preface
Contents
About the Editors
COVID-19
Influence of Telehealth Intervention on Knowledge of Danger Signs in Pregnancy, Childbirth and Postpartum During the Health Emergency by COVID-19 in Peru
1 Introduction
2 Methodology
2.1 Location and Duration of Research
2.2 Research Design
2.3 Participants
2.4 Variables
2.5 Data Collection Procedure
2.6 Statistical Analysis
2.7 Ethical Considerations
3 Results
4 Discussion
5 Conclusions
References
Support for COVID-19 Vaccination in Tamba City’s Regional Comprehensive Care System
1 Background
2 Related Work
2.1 Information Technology to Fight COVID-19
2.2 Utilization of Digital Identity
2.3 Utilization of Artificial Intelligence
2.4 Digital Healthcare for the Prevention of COVID-19
2.5 Enterprise Architecture for Healthcare System
3 Case Study
3.1 The Existing Immunization Determination System
3.2 System Requirements for COVID-19 Vaccination
3.3 Improved Resident Coverage Rate of the System User
3.4 Differences in Data Associated with ID of the System and Limitations
3.5 Differences in Data Associated with ID of the System and Limitation
3.6 Limitation of Shared Data
4 Desired Data Model and Architectural Governance with AIDAF
5 Discussion
References
Biological Engineering, Research and Technologies
Knowledge Distillation with Teacher Multi-task Model for Biomedical Named Entity Recognition
1 Introduction
2 Knowledge Distillation
3 Our Proposal
4 Experiments
5 Results and Discussion
6 Conclusions
References
Genomics-Based Models for Recurrence Prediction of Non-small Cells Lung Cancers
1 Introduction
2 Materials and Methods
2.1 Dataset
2.2 Gene Selection
2.3 Gene Quantization
2.4 Classification
3 Experiments
3.1 Genes Selection Results
3.2 Genes Quantization Results
4 Conclusions
References
IDH Mutation Status Prediction by Modality-Self Attention Network
1 Introduction
2 Method
2.1 Multi-Modality Attention Block
2.2 Self-Attention Network [7]
3 Experiments
3.1 Dataset
3.2 Result
4 Conclusion
References
Medical Watermarking
A Novel Robust Watermarking Algorithm for Encrypted Medical Image Based on Bandelet-DCT
1 Introduction
2 The Fundamental Theory
2.1 SIFT (Scale-Invariant Feature Transform)
2.2 Bandelet Transform
2.3 The Discrete Wavelet Transform (DWT)
3 The Proposed Method
3.1 Encryption of Original Medical Images
3.2 Watermarking Embedding Algorithm
3.3 Watermarking Extraction Algorithm
4 Experimental Results and Performance Analysis
4.1 Common Attacks
4.2 Geometric Attacks
4.3 Comparison With Unencrypted Algorithm
4.4 Comparison With Other Encrypted Algorithms
5 Conclusion
References
Robust Zero Watermarking Algorithm for Encrypted Medical Images Based on DWT-Gabor
1 Introduction
2 Basic Theory
2.1 Discrete Wavelet Transform (DWT)
2.2 Gabor Transform
3 Algorithm Process
3.1 Medical Image Encryption
3.2 Feature Vector Extraction
3.3 Watermark Embedding
3.4 Watermark Extraction
4 Experimental Results
4.1 Conventional Attack
4.2 Geometric Attack
4.3 Algorithm Comparison
5 Conclusion
References
A Zero Watermarking Scheme for Encrypted Medical Images Based on Tetrolet-DCT
1 Introduction
2 Theoretical Knowledge
2.1 DFT
2.2 Tetrolet
2.3 DCT
2.4 Logistic Mapping
3 Medical Image Encryption
4 Embedding and Extraction of Zero Watermark
4.1 Watermark Embedding
4.2 Extract Watermark
5 Experiments
5.1 Data from Various Attacks
5.2 Comparison in Plaintext and Encryption Domain
5.3 Feature Extraction Algorithm Comparison
6 Conclusion
References
A Robust Zero-Watermarking Algorithm Based on PHTs-DCT for Medical Images in the Encrypted Domain
1 Introduction
2 Fundamental Theory
2.1 Polar Harmonic Transforms (PHTs)
2.2 Logistic Map
3 Proposed Algorithm
3.1 Medical Image Encryption
3.2 Feature Extraction and Watermark Embedding
3.3 Watermark Extraction and Decryption
4 Experiments and Results
4.1 Simulation Experiment
4.2 Attacks Results
4.3 Comparison With Unencrypted Algorithm
4.4 Comparison With Other Encrypted Algorithms
5 Conclusion
References
Support System for Medicine and Healthcare
Recent Advancements on Smartwatches and Smartbands in Healthcare
1 Introduction
2 Literature Analysis
3 Results
3.1 Healthy People
3.2 Nursing Assistance
3.3 Mood State
3.4 Heart Diseases
3.5 Tremor
3.6 Memory Loss
3.7 Sleep Pathologies
3.8 Other Pathologies and Physical Activity
4 Discussion
5 Conclusions
References
A Proposal of Architecture Framework and Performance Indicator Derivation Model for Digitalization of Quality Management System
1 Introduction
2 Background and Related Research
2.1 Quality Management System
2.2 Performance Measurement
2.3 Digitalization in Manufacturing Control and Quality Control
2.4 Enterprise Architecture Framework for the Digital Era
3 AF for Digitalization of Quality Management System and Proposal of Performance Indicator Derivation Model
4 Results
4.1 Case of Quality Management in a Manufacturer
5 Discussion
6 Issues and Future Research
7 Conclusion
References
Support System for Medical/Hospital Management
Prediction of Length of Stay Using Vital Signs at the Admission Time in Emergency Departments
1 Introduction
2 Materials and Method
2.1 Data Collection
2.2 Model Development
3 Results
4 Discussion
5 Conclusion
References
Regulated Digital Pharmacy Based on Electronic Health Record to Improve Prescription Services
1 Introduction
2 Related Works and the Direction of RDP
2.1 Current Healthcare Ecosystem in Japan
2.2 Electronic Health Record (EHR)
2.3 Computerized Physician Order Entry (CPOE)
2.4 Regulated Digital Pharmacy (RDP)
2.5 Adaptive Integrated Digital Architecture Framework (AIDAF)
3 Research Methodology
4 Results
4.1 RDP Framework
4.2 Case Study Results
4.3 Information System Selection
4.4 Comparison of Three Systems
4.5 Recommendation of an Optimal System to Implement in Japan
5 Discussion
5.1 Ensuring Patients’ Privacy
5.2 Comparison of Healthcare Insurance and Information Between Japan and America
5.3 Future Challenges and Research
6 Conclusion
References
Performance Verification of a Text Analyzer Using Machine Learning for Radiology Reports Toward Phenotyping
1 Introduction
1.1 Background
1.2 Phenotyping
1.3 Related Work
2 Data
3 Methods
3.1 Classification
3.2 Feature Selection
3.3 Feature Extraction and Evaluation
4 Results
4.1 Feature Words by Classification
4.2 Prediction for Performance Measures
4.3 Evaluation for Performance Measures
5 Discussion
6 Conclusion and Future Work
References
An Optimization Model for the Tradeoff Between Efficiency and Equity for Mobile Stroke Unit Placement
1 Introduction
2 Related Work
3 Time to Treatment Estimation Model
4 Tradeoff Between the Efficiency and Equity Perspectives
5 Scenario Study
6 Results and Discussion
7 Conclusion
References
Method for Supporting Diagnostics
Automatic Joint Position Estimation Method for Diagnosis Support System in Rheumatoid Arthritis
1 Introduction
2 Diagnostic Imaging Support System
3 Proposed Method
3.1 Finger Region Extraction Process
3.2 Finger Centerline Extraction Process
3.3 Joints Detection Process
3.4 Feature Calculation
3.5 Joints Alignment Process
4 Experimental Results
4.1 Experiments on Accuracy of Joints Detection
4.2 Experiment on Accuracy of Joint Position Estimation
5 Conclusion
References
Computer-Aided Diagnosis of Peritonitis on Cine-MRI Using Deep Optical Flow Network
1 Introduction
2 Proposed Method
2.1 Overview
2.2 Flow Layer
2.3 Classification
2.4 Region-Division Method
3 Experiments
3.1 Data Set
3.2 Results
3.3 Comparison with the State-of-the-Art
4 Conclusion
References
Automated Retrieval of Focal Liver Lesions in Multi-phase CT Images Using Tensor Sparse Representation
1 Introduction
2 Materials
3 Methods
3.1 Retrieval of Focal Liver Lesions in Multi-phase CT Images
4 Experiments and Results
4.1 Retrieval Performance Evaluation Method
4.2 Experimental Results
5 Conclusion
References
Colorization for Medical Images Based on Patient-Specific Prior Information and GAN Features
1 Introduction
2 Related Work
3 Method
3.1 Preprocessing
3.2 Extraction of Explicit Prior Features Images
3.3 Loss Function Optimization
3.4 Loss Function Optimization
4 Experiment
4.1 Data
4.2 Network Training
4.3 Result
5 Conclusion
References
Case Discrimination: Self-supervised Feature Learning for the Classification of Focal Liver Lesions
1 Introduction
2 Method
2.1 Pre-processing
2.2 Pre-training Stage
2.3 Fine-Tuning Stage
3 Experiments and Results
3.1 Dataset and Implementation
3.2 Experimental Setting
3.3 Ablation Study and Comparison of Different Transfer Learning Methods
3.4 Small Scale Annotation Training Data
4 Conclusions
References
Content-Based Retrieval of Focal Liver Lesions Using Geometrical and Textural Features of Multi-Phase CT-Scan Images
1 Introduction
1.1 A Subsection Sample
2 The Proposed Method
2.1 Input Data
2.2 Preprocessing
2.3 Conventional Feature Vectors
2.4 Geometric Feature Vector
2.5 Similarity Measure
3 Results and Discussions
4 Conclusion and Future Works
References
Author Index



Smart Innovation, Systems and Technologies 242

Yen-Wei Chen Satoshi Tanaka Robert J. Howlett Lakhmi C. Jain   Editors

Innovation in Medicine and Healthcare Proceedings of 9th KES-InMed 2021

Smart Innovation, Systems and Technologies Volume 242

Series Editors
Robert J. Howlett, Bournemouth University and KES International, Shoreham-by-Sea, UK
Lakhmi C. Jain, KES International, Shoreham-by-Sea, UK

The Smart Innovation, Systems and Technologies book series encompasses the topics of knowledge, intelligence, innovation and sustainability. The aim of the series is to make available a platform for the publication of books on all aspects of single and multi-disciplinary research on these themes in order to make the latest results available in a readily-accessible form. Volumes on interdisciplinary research combining two or more of these areas are particularly sought. The series covers systems and paradigms that employ knowledge and intelligence in a broad sense. Its scope is systems having embedded knowledge and intelligence, which may be applied to the solution of world problems in industry, the environment and the community. It also focusses on the knowledge-transfer methodologies and innovation strategies employed to make this happen effectively. The combination of intelligent systems tools and a broad range of applications introduces a need for a synergy of disciplines from science, technology, business and the humanities. The series will include conference proceedings, edited collections, monographs, handbooks, reference books, and other relevant types of book in areas of science and technology where smart systems and technologies can offer innovative solutions. High quality content is an essential feature for all book proposals accepted for the series. It is expected that editors of all accepted volumes will ensure that contributions are subjected to an appropriate level of reviewing process and adhere to KES quality principles. Indexed by SCOPUS, EI Compendex, INSPEC, WTI Frankfurt eG, zbMATH, Japanese Science and Technology Agency (JST), SCImago, DBLP. All books published in the series are submitted for consideration in Web of Science.

More information about this series at http://www.springer.com/series/8767

Yen-Wei Chen · Satoshi Tanaka · Robert J. Howlett · Lakhmi C. Jain Editors

Innovation in Medicine and Healthcare Proceedings of 9th KES-InMed 2021

Editors

Yen-Wei Chen
Ritsumeikan University, Kyoto, Japan
Zhejiang Lab, Hangzhou, China

Satoshi Tanaka
Ritsumeikan University, Kyoto, Japan

Robert J. Howlett
‘Aurel Vlaicu’ University of Arad, Arad, Romania
Bournemouth University, Poole, UK
KES International Research, Shoreham-by-Sea, UK

Lakhmi C. Jain
University of Technology Sydney, Sydney, Australia
Liverpool Hope University, Liverpool, UK

ISSN 2190-3018 ISSN 2190-3026 (electronic)
Smart Innovation, Systems and Technologies
ISBN 978-981-16-3012-5 ISBN 978-981-16-3013-2 (eBook)
https://doi.org/10.1007/978-981-16-3013-2

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021

This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd. The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721, Singapore

InMed 2021 Organization

Honorary Chair Lakhmi C. Jain, University of Technology Sydney, Australia and Liverpool Hope University, UK

Executive Chair Robert J. Howlett, ‘Aurel Vlaicu’ University of Arad, Romania, Bournemouth University, UK & KES International Research, Shoreham-by-Sea, UK

General Chair Yen-Wei Chen, Ritsumeikan University, Japan and Zhejiang Lab, China

Program Chair Satoshi Tanaka, Ritsumeikan University, Japan

International Program Committee Members

Marco Anisetti, University of Milan, Italy
Ahmad Taher Azar, Prince Sultan University, Saudi Arabia
Adrian Barb, Penn State University, USA
Smaranda Belciug, University of Craiova, Romania
Vitoantonio Bevilacqua, Polytechnic University of Bari, Italy
Isabelle Bichindaritz, State University of New York at Oswego, USA
Christopher Buckingham, Aston University, UK
Chunhua Dong, Fort Valley State University, USA
Andrew Edwards, TAFE NSW University, Australia
Massimo Esposito, National Research Council of Italy (ICAR-CNR)
Amir H. Foruzan, Shahed University, Iran
Yoshiaki Fukami, Keio University/Gakushuin University, Japan
Luigi Gallo, National Research Council of Italy, Italy
Oana Geman, Stefan cel Mare University of Suceava, Romania
Arfan Ghani, University of Bolton, Greater Manchester, United Kingdom
Tomio Goto, Nagoya Institute of Technology, Japan
Kyoko Hasegawa, Ritsumeikan University, Japan
Yutaro Iwamoto, Ritsumeikan University, Japan
Titinunt Kitrungrotsakul, Zhejiang Lab, China
Huiyan Jiang, Northeastern University, China
Dalia Kriksciuniene, Vilnius University, Lithuania
Liang Li, Ritsumeikan University, Japan
Jing Liu, Zhejiang Lab, China
Giosuè Lo Bosco, Università degli Studi di Palermo, Italy
Cristina Manresa-Yee, University of Balearic Islands, Spain
Yoshimasa Masuda, Keio University, Japan
Tadashi Matsuo, Ritsumeikan University, Japan
Andrea Matta, The Polytechnic University of Milan, Italy
Rashid Mehmood, King Abdul Aziz University, Saudi Arabia
Mayuri Mehta, Sarvajanik College of Engineering and Technology, India
Aniello Minutolo, Institute for High Performance Computing and Networking, ICAR-CNR, Italy
Marek R. Ogiela, AGH University of Science and Technology, Krakow, Poland
Manuel Penedo, Research Center CITIC, Spain
Marco Pota, National Research Council of Italy (ICAR-CNR), Italy
Margarita Ramirez Ramirez, Universidad Autonoma de Baja California, Mexico
Ana Respício, Universidade de Lisboa, Portugal
Luis Enrique Sanchez Crespo, University of Castilla-La Mancha, Spain
Donald Shepard, Brandeis University, USA
Yu Song, Ritsumeikan University, Japan
Catalin Stoean, University of Craiova, Romania
Ruxandra Stoean, University of Craiova, Romania
Kenji Suzuki, Tokyo Institute of Technology, Japan
Kazuyoshi Tagawa, Aichi University, Japan
Eiji Uchino, Yamaguchi University, Japan
Eloisa Vargiu, Eurecat, Centre Tecnològic de Catalunya, Spain
Xiong Wei, Institute for Infocomm Research, Singapore
Yoshiyuki Yabuuchi, Shimonoseki City University, Japan
Shuichiro Yamamoto, Nagoya University, Japan
Hiroyuki Yoshida, Harvard Medical School/Massachusetts General Hospital, USA

Organization and Management

KES International (www.kesinternational.org) in partnership with the Institute of Knowledge Transfer (www.ikt.org.uk)

Preface

The 9th KES International Conference on Innovation in Medicine and Healthcare (InMed-21) was held online on 14–16 June 2021. InMed-21 is the 9th edition of the InMed series of conferences. The conference focuses on major trends and innovations in modern intelligent systems applied to medicine, surgery, healthcare, and the issues of an aging population, including recent hot topics on artificial intelligence for medicine and healthcare. The purpose of the conference is to exchange new ideas, new technologies, and current research results in these research fields.

We received submissions from many countries around the world. All submissions were carefully reviewed by at least two reviewers of the International Program Committee. Finally, 21 papers were accepted for these proceedings, which cover a number of key areas in smart medicine and healthcare, including: (1) COVID-19; (2) Biomedical Engineering, Research and Technologies; (3) Medical Watermarking; (4) Support System for Medicine and Healthcare; (5) Support System for Medical/Hospital Management; and (6) Method for Supporting Diagnostics. In addition to the accepted research papers, a number of keynote speeches by leading researchers were presented at the conference.

We would like to thank Dr. Kyoko Hasegawa and Ms. Yuka Sato of Ritsumeikan University for their valuable assistance in editing this volume. We are also grateful to the authors and reviewers for their contributions.

Kyoto, Japan/Hangzhou, China
Kyoto, Japan
Arad, Romania/Poole, UK/Shoreham-by-Sea, UK
Sydney, Australia/Liverpool, UK
June 2021

Yen-Wei Chen Satoshi Tanaka Robert J. Howlett Lakhmi C. Jain


Contents

COVID-19

Influence of Telehealth Intervention on Knowledge of Danger Signs in Pregnancy, Childbirth and Postpartum During the Health Emergency by COVID-19 in Peru . . . 3
Augusto Felix Olaza-Maguiña and Yuliana Mercedes De La Cruz-Ramirez

Support for COVID-19 Vaccination in Tamba City’s Regional Comprehensive Care System . . . 15
Yoshiaki Fukami and Yoshimasa Masuda

Biological Engineering, Research and Technologies

Knowledge Distillation with Teacher Multi-task Model for Biomedical Named Entity Recognition . . . 29
Tahir Mehmood, Alberto Lavelli, Ivan Serina, and Alfonso Gerevini

Genomics-Based Models for Recurrence Prediction of Non-small Cells Lung Cancers . . . 41
Panyanat Aonpong, Yutaro Iwamoto, Weibin Wang, Lanfen Lin, and Yen-Wei Chen

IDH Mutation Status Prediction by Modality-Self Attention Network . . . 51
Xinran Zhang, Yutaro Iwamoto, Jingliang Cheng, Jie Bai, Guohua Zhao, Xian-Hua Han, and Yen-Wei Chen

Medical Watermarking

A Novel Robust Watermarking Algorithm for Encrypted Medical Image Based on Bandelet-DCT . . . 61
Yangxiu Fang, Jing Liu, Jingbing Li, Dan Yi, Wenfeng Cui, Xiliang Xiao, Baoru Han, and Uzair Aslam Bhatti

Robust Zero Watermarking Algorithm for Encrypted Medical Images Based on DWT-Gabor . . . 75
Xiliang Xiao, Jingbing Li, Dan Yi, Yangxiu Fang, Wenfeng Cui, Uzair Aslam Bhatti, and Baoru Han

A Zero Watermarking Scheme for Encrypted Medical Images Based on Tetrolet-DCT . . . 87
Wenfeng Cui, Jing Liu, Jingbing Li, Yangxiu Fang, Dan Yi, Xiliang Xiao, Uzair Aslam Bhatti, and Baoru Han

A Robust Zero-Watermarking Algorithm Based on PHTs-DCT for Medical Images in the Encrypted Domain . . . 101
Dan Yi, Jingbing Li, Yangxiu Fang, Wenfeng Cui, Xiliang Xiao, Uzair Aslam Bhatti, and Baoru Han

Support System for Medicine and Healthcare

Recent Advancements on Smartwatches and Smartbands in Healthcare . . . 117
Marco Cipriano, Gennaro Costagliola, Mattia De Rosa, Vittorio Fuccella, and Sergiy Shevchenko

A Proposal of Architecture Framework and Performance Indicator Derivation Model for Digitalization of Quality Management System . . . 129
Kasei Miura, Nobuyuki Kobayashi, Tetsuro Miyake, Seiko Shirasaka, and Yoshimasa Masuda

Support System for Medical/Hospital Management

Prediction of Length of Stay Using Vital Signs at the Admission Time in Emergency Departments . . . 143
Amin Naemi, Thomas Schmidt, Marjan Mansourvar, Ali Ebrahimi, and Uffe Kock Wiil

Regulated Digital Pharmacy Based on Electronic Health Record to Improve Prescription Services . . . 155
Junhao Zhong, Zhengjia Mao, Hangpeng Li, Yoshimasa Masuda, and Tetsuya Toma

Performance Verification of a Text Analyzer Using Machine Learning for Radiology Reports Toward Phenotyping . . . 171
Takanori Yamashita, Rieko Izukura, and Naoki Nakashima

An Optimization Model for the Tradeoff Between Efficiency and Equity for Mobile Stroke Unit Placement . . . 183
Saeid Amouzad Mahdiraji, Johan Holmgren, Radu-Casian Mihailescu, and Jesper Petersson

Method for Supporting Diagnostics

Automatic Joint Position Estimation Method for Diagnosis Support System in Rheumatoid Arthritis . . . 197
Tomio Goto, Ryota Fujimura, and Koji Funahashi

Computer-Aided Diagnosis of Peritonitis on Cine-MRI Using Deep Optical Flow Network . . . 207
Toshiki Kawahara, Akitoshi Inoue, Yutaro Iwamoto, Akira Furukawa, and Yen-Wei Chen

Automated Retrieval of Focal Liver Lesions in Multi-phase CT Images Using Tensor Sparse Representation . . . 217
Jian Wang, Junlin Zhao, Xian-Hua Han, Lanfen Lin, Hongjie Hu, Yingying Xu, Qingqing Chen, Yutaro Iwamoto, and Yen-Wei Chen

Colorization for Medical Images Based on Patient-Specific Prior Information and GAN Features . . . 229
Yonglong Zhang, Yizhou Chen, Wenbo Pang, and Huiyan Jiang

Case Discrimination: Self-supervised Feature Learning for the Classification of Focal Liver Lesions . . . 241
Haohua Dong, Yutaro Iwamoto, Xianhua Han, Lanfen Lin, Hongjie Hu, Xiujun Cai, and Yen-Wei Chen

Content-Based Retrieval of Focal Liver Lesions Using Geometrical and Textural Features of Multi-Phase CT-Scan Images . . . 251
Saeed Moslehi, Amir Hossein Foruzan, Yen-Wei Chen, and Hongjie Hu

Author Index . . . 265

About the Editors

Yen-Wei Chen was born in Hangzhou, China, in 1962. He received his B.E. degree in 1985 from Kobe University, Kobe, Japan, and his M.E. degree in 1987 and D.E. degree in 1990, both from Osaka University, Osaka, Japan. He was a Research Fellow at the Institute of Laser Technology, Osaka, from 1991 to 1994. From October 1994 to March 2004, he was an Associate Professor and a Professor in the Department of Electrical and Electronic Engineering, University of the Ryukyus, Okinawa, Japan. He is currently a Professor at the College of Information Science and Engineering, Ritsumeikan University, Kyoto, Japan. He is also a Visiting Professor at the Research Center for Healthcare Data Science, Zhejiang Lab, Hangzhou, China, and at the College of Computer Science and Technology, Zhejiang University, Hangzhou, China. He was a Visiting Scholar at Oxford University, Oxford, UK, in 2003 and at Pennsylvania State University, Pennsylvania, USA, in 2010. His research interests include medical image analysis and pattern recognition. He has published more than 300 research papers and has received many distinguished awards, including the Best Scientific Paper Award of ICPR2013 and the Outstanding Chinese Oversea Scholar Fund of the Chinese Academy of Science. He is the Principal Investigator of several projects in biomedical engineering and image analysis, funded by the Japanese Government.

Prof. Satoshi Tanaka received his Ph.D. in theoretical physics from Waseda University, Japan, in 1987. After serving as an Assistant Professor, a Senior Lecturer, and an Associate Professor at Waseda University and Fukui University, he became a Professor at Ritsumeikan University in 2002. His current research target is computer visualization of complex 3D shapes such as 3D-scanned cultural heritage objects, internal structures of the human body, and fluid simulation results.
Recently, he was the President of JSST (the Japan Society for Simulation Technology) and the President of ASIASIM (the Federation of Asia Simulation Societies). Currently, he is the Vice-President of the VSJ (Visualization Society of Japan) and a cooperation member of the Japan Science Council. He has won best paper awards at the Asia Simulation Conference 2012, from the Journal of Advanced Simulation in Science and Engineering in 2014, and many others.

Robert J. Howlett is the Executive Chair of KES International, a non-profit organization that facilitates knowledge transfer and the dissemination of research results in areas including Intelligent Systems, Sustainability, and Knowledge Transfer. He is a Visiting Professor at Bournemouth University in the UK. His technical expertise is in the use of intelligent systems to solve industrial problems. He has been successful in applying artificial intelligence, machine learning and related technologies to sustainability and renewable energy systems; condition monitoring, diagnostic tools and systems; and automotive electronics and engine management systems. His current research work is focussed on the use of smart microgrids to achieve reduced energy costs and lower carbon emissions in areas such as housing and protected horticulture.

Dr. Lakhmi C. Jain, Ph.D., M.E., B.E. (Hons), Fellow (Engineers Australia), is with the University of Technology Sydney, Australia, and Liverpool Hope University, UK. Professor Jain serves KES International, which provides the professional community with opportunities for publication, knowledge exchange, cooperation, and teaming. Involving around 5,000 researchers drawn from universities and companies worldwide, KES facilitates international cooperation and generates synergy in teaching and research. KES regularly provides networking opportunities for the professional community through one of the largest conferences of its kind.

COVID-19

Influence of Telehealth Intervention on Knowledge of Danger Signs in Pregnancy, Childbirth and Postpartum During the Health Emergency by COVID-19 in Peru

Augusto Felix Olaza-Maguiña and Yuliana Mercedes De La Cruz-Ramirez

Abstract The health emergency due to COVID-19 has restricted access to health services, including care for pregnant women, whose in-person prenatal control is limited in Peru. The objective of the research was to evaluate the influence of a telehealth intervention on the knowledge of danger signs in pregnancy, childbirth and postpartum in pregnant women during the health emergency due to COVID-19. A quasi-experimental study was carried out with 64 pregnant women attended by telehealth (experimental group) and 64 pregnant women attended by face-to-face appointments (control group) in the city of Huaraz (Ancash, Peru, 3,052 m.a.s.l.). The telehealth intervention was applied only to the experimental group between September and November 2020, and knowledge of danger signs in pregnancy, childbirth and postpartum was assessed using a previously validated questionnaire. The SPSS v22.0 statistical package and the chi-square test were used. After the intervention, the proportion of pregnant women with a high level of knowledge was higher in the experimental group than in the control group with respect to the danger signs in pregnancy [43 (67.2%) vs. 28 (43.8%)], childbirth [42 (65.6%) vs. 27 (42.2%)] and postpartum [47 (73.4%) vs. 31 (48.4%)], with statistically significant differences between the experimental and control groups (p < 0.05). It was concluded that the telehealth intervention increased the knowledge of danger signs in pregnancy, childbirth and postpartum in pregnant women during the health emergency due to COVID-19.

Keywords Telehealth · Signs and symptoms · COVID-19 · Midwifery
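The group comparisons reported in the abstract rely on the chi-square test (run in SPSS in the study itself). As an illustrative sketch only, using the pregnancy danger-sign counts above, the same test for a 2×2 table can be reproduced with the Python standard library; the helper name `chi2_2x2` is ours, not from the paper:

```python
from math import erfc, sqrt

def chi2_2x2(table, yates=True):
    """Chi-square test of independence for a 2x2 contingency table."""
    (a, b), (c, d) = table
    n = a + b + c + d
    rows, cols = (a + b, c + d), (a + c, b + d)
    stat = 0.0
    for i, obs in enumerate((a, b, c, d)):
        expected = rows[i // 2] * cols[i % 2] / n
        # Yates continuity correction, as commonly applied to 2x2 tables
        diff = abs(obs - expected) - (0.5 if yates else 0.0)
        stat += diff * diff / expected
    # For 1 degree of freedom, the p-value is erfc(sqrt(stat / 2))
    p_value = erfc(sqrt(stat / 2))
    return stat, p_value

# High vs. not-high knowledge of danger signs in pregnancy:
# experimental 43 of 64, control 28 of 64
stat, p = chi2_2x2([[43, 64 - 43], [28, 64 - 28]])
print(round(stat, 2), p < 0.05)  # 6.2 True
```

With the Yates correction the test still rejects at the 5% level (p ≈ 0.013), consistent with the significance reported in the abstract.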

A. F. Olaza-Maguiña (B) · Y. M. De La Cruz-Ramirez
Universidad Nacional Santiago Antúnez de Mayolo, Centenario Huaraz, Peru
e-mail: [email protected]

Y. M. De La Cruz-Ramirez
e-mail: [email protected]

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021
Y.-W. Chen et al. (eds.), Innovation in Medicine and Healthcare, Smart Innovation, Systems and Technologies 242, https://doi.org/10.1007/978-981-16-3013-2_1


1 Introduction Prenatal control is an essential and important activity carried out by obstetric professionals in order to promote and ensure the health of the pregnant woman, fetus and her family [1], providing not only care, but also educational activities aimed at the prevention of complications [2, 3], in which the information provided regarding the danger signs in pregnancy, childbirth and postpartum plays a very important role, the knowledge of which is essential for the timely search for care in health centers [1, 4, 5]. However, the aforementioned has been seriously affected by the arrival of the COVID-19 pandemic, which has caused worldwide restrictions in the care of pregnant women [6, 7], whose prenatal control in person in Peru it is limited, which is why obstetricians, who are the health professionals in charge of the care of pregnant women in that country, have had to resort to different strategies aimed at ensuring the follow-up and well-being of their patients, especially in cases of social and economic vulnerability. 
On the other hand, telehealth has represented a great advance and has been an available resource in various countries for several years [8, 9], without having had the same development opportunity in Peru [10] as a result of several factors, among them the lack of interest on the part of the authorities. This situation changed completely with the health emergency caused by COVID-19, when the necessary technological means had to be adopted very quickly to cover patients' need for care in different medical specialties [6, 11], including obstetrics [1, 2, 7, 12]. To date, however, no research has been carried out in Peru on the results of the application of telehealth in the care of pregnant women compared with face-to-face assistance in circumstances as difficult as those caused by COVID-19, especially in distant places in the interior of the country, such as the city of Huaraz, located in a geographical region of the Andean Peruvian highlands (3,052 m.a.s.l.), the majority of whose population is in a situation of social and economic vulnerability [13]. For these reasons, the present research was carried out with the objective of evaluating the influence of a telehealth intervention on the knowledge of danger signs in pregnancy, childbirth and postpartum in pregnant women during the health emergency due to COVID-19 in the city of Huaraz.

2 Methodology

2.1 Location and Duration of Research

The research was carried out between September and November 2020 in the city of Huaraz, located in the province of the same name (Ancash region, Peru) at 3,052 m above sea level [13].


2.2 Research Design

A quasi-experimental study with a control group was carried out, in which knowledge of danger signs in pregnancy, childbirth and postpartum was compared according to participation or not in the telehealth intervention.

2.3 Participants

The participants were pregnant women in the first, second or third trimester of pregnancy, over 18 years of age, with habitual residence in the city of Huaraz. The experimental group was made up of pregnant women who participated in the telehealth intervention, and the control group of pregnant women attended through face-to-face appointments without participation in the intervention. The inclusion criteria for both groups were a confirmed diagnosis of pregnancy, age over 18 years and habitual residence in the city of Huaraz; the exclusion criteria were a diagnosis of high obstetric risk, the presence of chronic diseases and illiteracy.

The sample size was calculated from existing bibliographic information on the percentage observed in the control group (38%) and that expected in the experimental group (65%) of pregnant women with a high level of knowledge of danger signs in pregnancy, childbirth and postpartum, considering a two-sided hypothesis test with an alpha coefficient of 0.05 and a beta coefficient of 0.20; this yielded a minimum of 53 pregnant women per group. Additionally, a 20% loss rate was anticipated, which meant that 64 pregnant women should be included in each group.

The selection of the pregnant women was carried out in two stages. In the first stage, a sampling frame of 351 pregnant women was used, registered as patients treated at the health center of the city of Huaraz through telephone follow-up (249) or through face-to-face appointments (102); from this frame, random sampling was carried out by lottery, identifying 84 pregnant women for each group. In the second stage, the previously identified pregnant women were invited consecutively by telephone to participate in the study, after a prior explanation of the actions to be developed, until the required number of 64 pregnant women per group was reached. In all cases the established inclusion and exclusion criteria were verified, guaranteeing the homogeneity of the experimental and control groups, as evidenced in the results section.
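The sample-size calculation described above can be reproduced with the standard normal-approximation formula for comparing two independent proportions. The sketch below is illustrative rather than the authors' actual procedure; it uses only the Python standard library and the figures reported in the text (38% vs. 65%, two-sided α = 0.05, β = 0.20, 20% anticipated loss).

```python
import math
from statistics import NormalDist

def sample_size_two_proportions(p1: float, p2: float,
                                alpha: float = 0.05, beta: float = 0.20) -> int:
    """Per-group n for detecting a difference between two proportions
    (two-sided test, normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # ~1.96 for alpha = 0.05
    z_beta = NormalDist().inv_cdf(1 - beta)         # ~0.84 for beta = 0.20
    p_bar = (p1 + p2) / 2
    num = (z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
           + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return math.ceil(num / (p1 - p2) ** 2)

n = sample_size_two_proportions(p1=0.38, p2=0.65)   # control 38%, expected 65%
n_with_losses = math.ceil(n * 1.20)                 # 20% anticipated loss
print(n, n_with_losses)  # 53 64
```

The result matches the 53 pregnant women per group reported in the text, and inflating by 20% gives the 64 actually enrolled per group.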


2.4 Variables

Independent Variables. Participation in the experimental group (telehealth intervention) or in the control group (routine clinical practice in face-to-face appointments). Age (18–30 years, ≥ 30 years), marital status (single, married, cohabiting, other), educational level (primary, secondary, higher), number of births (1–2 births, ≥ 3 births).

Dependent Variables. Knowledge of danger signs in pregnancy, childbirth and postpartum (low level, medium level, high level), measured both in the pre-test and in the post-test.

Description of the Telehealth Intervention. The telehealth intervention was developed over three months (September to November 2020) only with the experimental group. The authors designed and elaborated different synchronous and asynchronous educational activities, consisting of information directly related to the danger signs during pregnancy, childbirth and postpartum, using free digital resources such as Google Sites, Google Meet and WhatsApp, whose links were sent to the pregnant women's personal email after their voluntary acceptance to participate in the research. The intervention included three weekly asynchronous sessions of review and feedback on the educational content hosted on the private website created for this purpose, according to each pregnant woman's time availability, plus a weekly synchronous videoconference with all pregnant women in the experimental group; personal tracking and monitoring was carried out through WhatsApp. In the control group, no telehealth intervention was performed; these pregnant women attended their prenatal control through scheduled face-to-face appointments at the health center.

2.5 Data Collection Procedure

Data were collected using a questionnaire prepared by the study authors to measure knowledge of danger signs in pregnancy, childbirth and postpartum, applied as a pre-test and post-test to the pregnant women of the control and experimental groups before and after the telehealth intervention; an online form was used, whose link was sent to the pregnant women's personal email. The questionnaire consisted of 28 questions divided into three parts: danger signs in pregnancy (11 questions: headache, tinnitus, blurred vision, epigastric pain, hyperemesis gravidarum, decreased or absent fetal movements, edema of the face, hands and feet, vaginal bleeding, loss of amniotic fluid, uterine contraction pain and fever), danger signs during childbirth (9 questions: loss of amniotic fluid without the presence of uterine contractions, umbilical cord prolapse, prolapse of


fetal parts, vaginal bleeding, frequent and severe uterine contractions, decreased or absent fetal movements, headache, tinnitus and blurred vision) and danger signs in the postpartum (8 questions: vaginal bleeding, fever, abdominal pain, foul-smelling leukorrhea, headache, tinnitus, blurred vision and mastitis). The questions were worded in simple, everyday terms. Each correct answer was assigned a score of 1 point and each incorrect answer 0 points; according to the final scores obtained, knowledge of danger signs in pregnancy, childbirth and postpartum was classified as low, medium or high level, taking as reference what was developed in previous studies [4, 5].

The content validity of the questionnaire was evaluated through expert judgment: 5 recognized experts with work experience in obstetric care in person and by telehealth completed an evaluation form of 8 questions, and the Kendall concordance test demonstrated the validity of the questionnaire with a significance level of 0.001. Likewise, the internal consistency of the instrument was evaluated through a pilot test with 30 pregnant women, and the reliability of the questionnaire was demonstrated with a Cronbach's alpha of 0.894.
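An internal-consistency figure of the kind reported above (Cronbach's alpha = 0.894) follows the usual item-variance formula, α = k/(k−1)·(1 − Σσ²ᵢ/σ²ₜₒₜ). The sketch below computes Cronbach's alpha for dichotomous 0/1 items using only the Python standard library; the pilot response matrix shown is hypothetical, not the study's data.

```python
from statistics import pvariance

def cronbach_alpha(responses: list[list[int]]) -> float:
    """Cronbach's alpha for a matrix of respondents x items (0/1 scores)."""
    k = len(responses[0])                       # number of items
    item_vars = [pvariance([row[i] for row in responses]) for i in range(k)]
    total_var = pvariance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Hypothetical pilot data: 4 respondents answering 3 questions (1 = correct).
pilot = [
    [1, 1, 0],
    [0, 0, 0],
    [1, 1, 1],
    [1, 0, 1],
]
print(round(cronbach_alpha(pilot), 3))
```

With perfectly consistent items (every respondent answering all items identically), the function returns exactly 1.0, which is a useful sanity check for an implementation like this.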

2.6 Statistical Analysis

The SPSS V22.0 statistical package was used for a descriptive analysis based on percentages and absolute frequencies. Pearson's Chi square test and the Chi square test for linear trend were used for nominal and ordinal variables, respectively, to evaluate possible differences between the experimental and control groups, with a significance level of p < 0.05, at which the null hypothesis of no differences between the groups would be rejected; 95% confidence intervals for the difference of proportions (CI) are reported.
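The Chi square test for linear trend can be sketched as the Mantel–Haenszel trend statistic, (N − 1)·r², where r is the correlation between group membership and the ordinal knowledge score; whether SPSS computes exactly this variant is an assumption here. Applied to the post-test counts reported in Table 2 (low/medium/high: 5, 15, 44 in the experimental group vs. 14, 21, 29 in the control group), this sketch reproduces the published p-value of about 0.004.

```python
import math

def chi2_linear_trend(counts_a, counts_b):
    """Mantel-Haenszel chi-square for linear trend in a 2 x k table.
    Rows: two groups; columns: ordered categories scored 1..k."""
    xs, ys = [], []
    for score, (a, b) in enumerate(zip(counts_a, counts_b), start=1):
        xs += [1] * a + [0] * b          # group indicator, one entry per subject
        ys += [score] * (a + b)          # ordinal score, one entry per subject
    n = len(xs)
    sxy = sum(x * y for x, y in zip(xs, ys)) - sum(xs) * sum(ys) / n
    sxx = sum(x * x for x in xs) - sum(xs) ** 2 / n
    syy = sum(y * y for y in ys) - sum(ys) ** 2 / n
    chi2 = (n - 1) * sxy ** 2 / (sxx * syy)
    p = math.erfc(math.sqrt(chi2 / 2))   # upper tail of chi-square with 1 df
    return chi2, p

# Post-test counts from Table 2: low, medium, high level of knowledge.
chi2, p = chi2_linear_trend([5, 15, 44], [14, 21, 29])
print(round(chi2, 2), round(p, 3))  # about 8.26 and 0.004
```

The one-degree-of-freedom tail probability uses the identity P(χ²₁ > x) = erfc(√(x/2)), which avoids any dependency outside the standard library.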

2.7 Ethical Considerations

The right to privacy and confidentiality of the data of all participating pregnant women was respected in accordance with the World Medical Association and the Declaration of Helsinki [14]. Informed consent was requested together with voluntary acceptance and was completed virtually through an online form. The research protocol was likewise submitted to the Ethics Committee of the Santiago Antúnez de Mayolo National University, which authorized its execution (Registry N° 006–2020).


3 Results

All the pregnant women in the experimental (64) and control (64) groups completed their participation in the research, and no losses occurred during the three months of the study. Table 1 shows the comparison of the characteristics of the pregnant women in both groups; no statistically significant differences were found (p > 0.05), including with respect to the pre-test results on knowledge of danger signs in pregnancy, childbirth and postpartum at the beginning of the study, thus establishing the homogeneity of the experimental and control groups.

Table 1 Comparison of characteristics of pregnant women

| Characteristic | Experimental group n (%) | Control group n (%) | p | 95% CI |
|---|---|---|---|---|
| Age^a: | | | 0.859 | 17.2–20.3% |
| – 18–30 years | 36 (56.3) | 35 (54.7) | | |
| – ≥ 30 years | 28 (43.7) | 29 (45.3) | | |
| Marital status^a: | | | 0.981 | |
| – Single | 15 (23.4) | 16 (25.0) | | 14.1–17.3% |
| – Married | 14 (21.9) | 13 (20.3) | | 17.3–20.4% |
| – Cohabiting | 30 (46.9) | 29 (45.3) | | 14.8–18.0% |
| – Other | 5 (7.8) | 6 (9.4) | | 9.7–12.8% |
| Educational level^a: | | | 0.855 | |
| – Primary | 12 (18.8) | 14 (21.9) | | 12.4–18.6% |
| – Secondary | 37 (57.8) | 34 (53.1) | | 14.1–23.5% |
| – Higher | 15 (23.4) | 16 (25.0) | | 14.8–18.0% |
| Number of births^a: | | | 0.849 | 16.2–19.3% |
| – 1–2 births | 44 (68.8) | 43 (67.2) | | |
| – ≥ 3 births | 20 (31.2) | 21 (32.8) | | |
| Pre-test about knowledge of danger signs^b: | | | 0.742 | |
| – Low level | 19 (29.7) | 17 (26.6) | | 14.0–20.3% |
| – Medium level | 22 (34.4) | 23 (35.9) | | 16.5–19.7% |
| – High level | 23 (35.9) | 24 (37.5) | | 16.7–19.8% |

^a Pearson's Chi square test. ^b Chi square test for linear trend. CI: Confidence interval for the difference of proportions.

Table 2 shows the findings of the general evaluation of knowledge of danger signs in pregnancy, childbirth and postpartum at the end of the research, highlighting that the highest percentage of pregnant women with a high level of


Table 2 General assessment about knowledge of danger signs in the post-test

| Knowledge | Experimental group n (%) | Control group n (%) | p* | 95% CI |
|---|---|---|---|---|
| Low level | 5 (7.8) | 14 (21.9) | | 0.4–27.7% |
| Medium level | 15 (23.4) | 21 (32.8) | | 7.7–26.4% |
| High level | 44 (68.8) | 29 (45.3) | | 5.2–41.7% |
| Total | 64 (100) | 64 (100) | 0.004 | |

* Chi square test for linear trend. CI: Confidence interval for the difference of proportions.

knowledge is observed in the experimental group (68.8%) compared to the control group (45.3%), this difference being statistically significant (p < 0.05).

Table 3 Knowledge of danger signs in the post-test

| Danger signs | Low level n (%) | Medium level n (%) | High level n (%) | p* | 95% CI^a |
|---|---|---|---|---|---|
| In pregnancy: | | | | 0.005 | 5.1–41.7% |
| – Experimental group | 6 (9.4) | 15 (23.4) | 43 (67.2) | | |
| – Control group | 15 (23.4) | 21 (32.8) | 28 (43.8) | | |
| In childbirth: | | | | 0.003 | 5.1–41.8% |
| – Experimental group | 5 (7.8) | 17 (26.6) | 42 (65.6) | | |
| – Control group | 15 (23.4) | 22 (34.4) | 27 (42.2) | | |
| In postpartum: | | | | 0.003 | 7.1–42.9% |
| – Experimental group | 4 (6.3) | 13 (20.3) | 47 (73.4) | | |
| – Control group | 12 (18.8) | 21 (32.8) | 31 (48.4) | | |

* Chi square test for linear trend. CI: Confidence interval for the difference of proportions. ^a Result corresponding to the category "High level" of knowledge.

On the other hand, Table 3 shows that, according to the results of the post-test, pregnant women in the experimental group presented higher percentages of a high level of knowledge than the control group in relation to the danger signs corresponding to


pregnancy (67.2% vs. 43.8%), childbirth (65.6% vs. 42.2%) and postpartum (73.4% vs. 48.4%), finding statistically significant differences (p < 0.05) in all cases.

4 Discussion

The findings regarding the favorable influence of the telehealth intervention on the knowledge of danger signs in pregnancy, childbirth and postpartum among pregnant women of the experimental group during the health emergency due to COVID-19, which constituted the main objective of this research, are evidence in favor of the application of telehealth in different areas and specialties of medicine, in this case obstetrics. Similar evidence has been reported in other studies [2, 6, 8, 15], which highlight that technological advances, videoconferencing applications, the increasingly common use of smartphones and the regulatory opening by the authorities have allowed the accelerated and massive application of telehealth, whose scope and development during the COVID-19 pandemic should be used for its continuity in health systems worldwide [3, 9, 11].

In this sense, the higher percentage of pregnant women with a high level of knowledge of danger signs during pregnancy observed in the experimental group after the telehealth intervention may be attributed to the virtual educational activities developed as part of that intervention, in which interactive information based on videos and clinical cases predominated, in addition to continuous personal monitoring with permanent resolution of doubts and queries, a characteristic highly valued by patients [9, 15, 16], especially in health emergency situations such as those caused by COVID-19. This is consistent with other research in the obstetric specialty [2, 15], which highlights the potential of telehealth to reduce geographical and economic inequalities in the care of pregnant women [17, 18], so its implementation is recommended as a health policy in countries with problems in their health systems [3, 15], a scenario in which Peru unfortunately finds itself.
Regarding the difference in favor of the experimental group in knowledge of danger signs during childbirth, this result may be related to the interest generated in patients when they are cared for more closely and assertively by health professionals, especially in circumstances of emotional support, as evidenced in the weekly synchronous meetings, where there was an atmosphere of solidarity and fellowship among the participants of the experimental group. Various studies [19] have found this to be favorable when there is additionally the accompaniment and guidance of a professional, who constitutes a reliable source of information for patients [20]. Likewise, in a manner similar to the present research, other studies have concluded that the results of the use of telehealth are comparable to, and even more effective than, those obtained with traditional medical care [9, 21], provided that adequate technological accessibility is ensured, as well as the necessary


training in the management of digital resources, with the proper supervision and commitment of management personnel [9].

In relation to the higher proportion of pregnant women in the experimental group with a high level of knowledge of danger signs during the postpartum period, various studies have found that a telehealth service based on technological tools permanently within patients' reach generates a feeling of satisfaction due to its convenience and quality [16, 20, 21, 22]. These results may be related to the fact that telehealth contributes to empowering pregnant women by demedicalizing the natural experience of pregnancy and providing educational support adapted to people's daily lives and particular activities [7, 19]. Another important aspect is the active and participatory exchange of information between patients and health providers, according to their own needs, through virtual self-assessment tools and flexible access to the team of professionals [23, 24].

On the other hand, it is important to note that the findings in the control group could be explained by the fact that these pregnant women were attended in person amid the restrictive conditions of the COVID-19 pandemic, which may have limited the time devoted to educational and orientation activities during prenatal control; this is an aspect to be taken into account in future studies.
The limitations of this research should also be highlighted, such as the lack of control of some variables that could have influenced the knowledge of the pregnant women, for example other sources of information or additional information-seeking activities at home, which, despite the efforts made, could not be fully controlled owing to the circumstances of the health emergency caused by COVID-19. A further limitation is the possible occurrence of unexpected results and a placebo-like effect in the pregnant women of the experimental group, who knew they would be evaluated on their knowledge of a certain topic, especially considering that none of the actions to be carried out were concealed during the research. Nevertheless, these limitations do not undermine the strengths of the present research, whose findings may be useful for further studies on the applications of telehealth in obstetrics; the methodology used may also be considered for the design and implementation of comprehensive educational plans by the Peruvian authorities, constituting a research experience that contributes to the recognition of the use of telehealth for the benefit of the population.

5 Conclusions

The application of the telehealth intervention increased the knowledge of danger signs in pregnancy, childbirth and postpartum in the pregnant women participating in the research during the health emergency due to COVID-19, which is why the Peruvian health


authorities should promote the implementation of formal care systems based on technological resources such as telehealth, in order to meet the health needs of the population. Future research should address the urgent need to improve and/or replace traditional health services with telehealth, through intervention experiences in different medical specialties, not only in the treatment of diseases but also in the prevention and promotion of health.

References

1. Holcomb, D., Faucher, M.A., Bouzid, J., Quint-Bouzid, M., Nelson, D.B., Duryea, E.: Patient perspectives on audio-only virtual prenatal visits amidst the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) pandemic. Obstet. Gynecol. 136(2), 317–322 (2020). https://doi.org/10.1097/aog.0000000000004026
2. Futterman, I., Rosenfeld, E., Toaff, M., Boucher, T., Golden-Espinal, S., Evans, K., Clare, C.A.: Addressing disparities in prenatal care via telehealth during COVID-19: prenatal satisfaction survey in East Harlem. Am. J. Perinatol. 38(1), 88–92 (2021). https://doi.org/10.1055/s-0040-1718695
3. Shields, A.D., Wagner, R.K., Knutzen, D., Deering, S., Nielsen, P.E.: Maintaining access to maternal fetal medicine care by telemedicine during a global pandemic. J. Telemed. Telecare (2020). https://doi.org/10.1177/1357633x20957468
4. Al-Ali, Z., Kadhum, S.: Knowledge about obstetric warning signs during pregnancy among mothers attending the primary health care centers in Hilla city. Indian J. Forensic Med. Toxicol. 14(2), 894–901 (2020)
5. Bustamante, G., Mantilla, B., Cabrera-Barona, P., Barragán, E., Soria, S., Quizhpe, E., Jiménez, A.P., Hinojosa, M.H., Wang, E., Grunauer, M.: Awareness of obstetric warning signs in Ecuador: a cross-sectional study. Public Health 172, 52–60 (2019). https://doi.org/10.1016/j.puhe.2019.04.013
6. Cohen, E., Cohen, M.: COVID-19 will forever change the landscape of telemedicine. Curr. Opin. Cardiol. 36(1), 110–115 (2021). https://doi.org/10.1097/hco.0000000000000806
7. Sawyer, M.R., Jaffe, E.F., Naqvi, M., Sarma, A., Barth, W.H., Goldfarb, I.T.: Establishing better evidence on remote monitoring for postpartum hypertension: a silver lining of the coronavirus pandemic. Am. J. Perinatol. Rep. 10, e315–e318 (2020). https://doi.org/10.1055/s-0040-1715169
8. Stevens, N.R., Miller, M.L., Soibatian, C., Otwell, C., Rufa, A.K., Meyer, D.J., Shalowitz, M.U.: Exposure therapy for PTSD during pregnancy: a feasibility, acceptability, and case series study of Narrative Exposure Therapy (NET). BMC Psychol. 8, 130 (2020). https://doi.org/10.1186/s40359-020-00503-4
9. Lowery, C., DeNicola, N.: American College of Obstetricians and Gynecologists: implementing telehealth in practice. Obstet. Gynecol. 135(2), e73–e79 (2020). https://doi.org/10.1097/aog.0000000000003671
10. Marini, T.J., Oppenheimer, D.C., Baran, T.M., Rubens, D.J., Toscano, M., Drennan, K., Garra, B., Miele, F.R., Garra, G., Noone, S.J., Tamayo, L., Carlotto, C., Trujillo, L., Waks, E., Garra, K., Egoavil, M.S., Berrospi, J., Castaneda, B.: New ultrasound telediagnostic system for low-resource areas: pilot results from Peru. J. Ultrasound Med. (2020). https://doi.org/10.1002/jum.15420
11. Hron, J.D., Parsons, C.R., Williams, L.A., Harper, M.B., Bourgeois, F.C.: Rapid implementation of an inpatient telehealth program during the COVID-19 pandemic. Appl. Clin. Inform. 11(3), 452–459 (2020). https://doi.org/10.1055/s-0040-1713635
12. Whittington, J.R., Magann, E.F.: Telemedicine in high-risk obstetrics. Obstet. Gynecol. Clin. N. Am. 47(2), 249–257 (2020). https://doi.org/10.1016/j.ogc.2020.02.007


13. Instituto Nacional de Estadística e Informática: Encuesta nacional de hogares. INEI, Lima (2018)
14. World Medical Association: Declaration of Helsinki – ethical principles for medical research involving human subjects. https://www.wma.net/policies-post/wma-declaration-of-helsinki-ethical-principles-for-medical-research-involving-human-subjects/, last accessed 2021/01/08
15. Kern-Goldberger, A.R., Srinivas, S.K.: Telemedicine in obstetrics. Clin. Perinatol. 47(4), 743–757 (2020). https://doi.org/10.1016/j.clp.2020.08.007
16. Kobayashi, H., Sado, T.: Satisfaction of a new telephone consultation service for prenatal and postnatal health care. J. Obstet. Gynaecol. Res. 45(7), 1376–1381 (2019). https://doi.org/10.1111/jog.13987
17. Marcin, J.P., Shaikh, U., Steinhorn, R.H.: Addressing health disparities in rural communities using telehealth. Pediatr. Res. 79, 169–176 (2016). https://doi.org/10.1038/pr.2015.192
18. Whittington, J.R., Ramseyer, A.M., Taylor, C.B.: Telemedicine in low-risk obstetrics. Obstet. Gynecol. Clin. N. Am. 47(2), 241–247 (2020). https://doi.org/10.1016/j.ogc.2020.02.006
19. Meylor, M., Hodny, R., O'Neil, D., Gardner, M., Beaver, M., Brown, A., Barry, B., Ross, L., Jasik, A., Nesbitt, K., Sobolewski, S., Skinner, S., Chaudhry, R., Brost, B., Gostout, B., Harms, R.: OB Nest: reimagining low-risk prenatal care. Mayo Clin. Proc. 93(4), 458–466 (2018). https://doi.org/10.1016/j.mayocp.2018.01.022
20. Tarqui-Mamani, C., Sanabria-Rojas, H., Portugal-Benavides, W., García, J.C., Castro-Garay, W., Escalante-Lazo, R., Calderón-Bedoya, M.: Eficacia de la tecnología móvil y ganancia de peso en gestantes en Callao, Perú. Rev. Salud Pública 20(1), 67–72 (2018). https://doi.org/10.15446/rsap.v20n1.63488
21. Leighton, C., Conroy, M., Bilderback, A., Kalocay, W., Henderson, J.K., Simhan, H.N.: Implementation and impact of a maternal–fetal medicine telemedicine program. Am. J. Perinatol. 36(7), 751–758 (2019). https://doi.org/10.1055/s-0038-1675158
22. Hoppe, K., Williams, M., Thomas, N., Zella, J.B., Drewry, A., Kim, K., Havighurst, T., Johnson, H.: Telehealth with remote blood pressure monitoring for postpartum hypertension: a prospective single-cohort feasibility study. Pregnancy Hypertens. 15, 171–176 (2019). https://doi.org/10.1016/j.preghy.2018.12.007
23. DeNicola, N., Grossman, D., Marko, K., Sonalkar, S., Butler, Y.S., Ganju, N., Witkop, C.T., Henderson, J.T., Butler, J.L., Lowery, C.: Telehealth interventions to improve obstetric and gynecologic health outcomes: a systematic review. Obstet. Gynecol. 135(2), 371–382 (2020). https://doi.org/10.1097/aog.0000000000003646
24. Aziz, A., Zork, N., Aubey, J.J., Baptiste, C.D., D'Alton, M.E., Emeruwa, U.N., Fuchs, K.M., Goffman, D., Gyamfi-Bannerman, C., Haythe, J.H., LaSala, A.P., Madden, N., Miller, E.C., Miller, R.S., Monk, C., Moroz, L., Ona, S., Ring, L.E., Sheen, J.J., Spiegel, E.S., Simpson, L.L., Yates, H.S., Friedman, A.M.: Telehealth for high-risk pregnancies in the setting of the COVID-19 pandemic. Am. J. Perinatol. 37(8), 800–808 (2020). https://doi.org/10.1055/s-0040-1712121

Support for COVID-19 Vaccination in Tamba City's Regional Comprehensive Care System

Yoshiaki Fukami and Yoshimasa Masuda

Abstract The COVID-19 vaccine was developed less than a year after the global pandemic first began to spread, but it requires storage at ultralow temperatures and multiple doses. The logistics for the successful acquisition of herd immunity are complex and difficult to prepare. Tamba City, Japan, is trying to quickly build a vaccination system by utilizing the existing regional comprehensive care system linked to the Basic Resident Register. For this system to contribute to suppressing the spread of COVID-19 infection and to the efficiency of the treatment system, its architecture must evolve, including the data architecture, to handle distributed and diversified data.

Keywords EHR · Regional comprehensive care · AIDAF · Data architecture · Standardization

1 Background

The first case of coronavirus disease 2019 (COVID-19) was identified in December 2019. Less than a year later, the vaccine was developed and inoculation started. However, cryopreservation and multiple doses are required for the vaccine to be effective. It is therefore important to keep vaccination records for the subjects and to manage vaccination based on those records.

It is important to increase the inoculation rate for the vaccine to be effective. Vaccination must be carried out by the government, not by medical institutions individually, and municipalities need a mechanism to centrally manage the vaccination status of residents. In Japan, the government has undertaken vaccinations only for children and the elderly.

Y. Fukami (B)
Gakushuin University, 1-5-1 Mejiro Toshima, Tokyo 1718588, Japan
e-mail: [email protected]

Y. Masuda
Carnegie Mellon University, 5000 Forbes Ave, Pittsburgh, PA 15213, USA
e-mail: [email protected]

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021
Y.-W. Chen et al. (eds.), Innovation in Medicine and Healthcare, Smart Innovation, Systems and Technologies 242, https://doi.org/10.1007/978-981-16-3013-2_2

However, the COVID-19 vaccine needs to be taken


universally by people in a wider range of age groups, and many municipalities have not built a means of vaccination management for a wide range of age groups. Tamba City, Japan, on the other hand, has already introduced an immunization implementation determination system connected to the Basic Resident Register, which will also be used for COVID-19 vaccination.

It is not yet clear how long the efficacy of the COVID-19 vaccine will last. In addition, each time a variant of the virus emerges, it may be necessary to develop an additional vaccine corresponding to that variant. Therefore, the system for COVID-19 vaccination is likely to remain in continuous operation. The existing system operated by Tamba City is planned to be refurbished in a short period of time to support COVID-19 vaccination. This is an advanced example of the external use of master data managed by the government, and its use has been expanded to support comprehensive community care within the region [1]. The response to COVID-19 vaccination has the potential to significantly increase the proportion of residents covered by the system, but at the same time it poses new privacy and ethical issues.

In this paper, we analyze the impact of the COVID-19 pandemic on regional medical information systems through a case analysis of Tamba City and examine the system architecture requirements for regional medical and public health measures.
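To illustrate the kind of record-keeping a multi-dose vaccination management system implies, the sketch below models a resident-linked register that tracks doses per resident and determines when the next dose is due. All class names, fields and parameters here are hypothetical illustrations, not Tamba City's actual system or data model.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta

@dataclass
class Resident:
    resident_id: str          # hypothetical ID linked to the Basic Resident Register
    name: str
    birth_date: date

@dataclass
class VaccinationRegister:
    """Tracks doses per resident and flags who is due for the next one."""
    doses_required: int = 2
    min_interval: timedelta = timedelta(days=21)     # illustrative dose spacing
    records: dict = field(default_factory=dict)      # resident_id -> [dose dates]

    def record_dose(self, resident: Resident, when: date) -> None:
        self.records.setdefault(resident.resident_id, []).append(when)

    def doses_given(self, resident: Resident) -> int:
        return len(self.records.get(resident.resident_id, []))

    def next_dose_due(self, resident: Resident, today: date):
        given = self.records.get(resident.resident_id, [])
        if len(given) >= self.doses_required:
            return None                              # fully vaccinated
        if not given:
            return today                             # first dose due now
        return max(today, given[-1] + self.min_interval)

reg = VaccinationRegister()
alice = Resident("R-0001", "Alice", date(1950, 4, 1))
reg.record_dose(alice, date(2021, 5, 10))
print(reg.next_dose_due(alice, today=date(2021, 5, 20)))  # 2021-05-31
```

Keying every record to a register-linked resident ID is what allows coverage to be computed per resident rather than per clinic visit, which is the central design point of the Tamba City approach described above.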

2 Related Work

2.1 Information Technology to Fight COVID-19

The response to COVID-19, which spreads by droplet infection, is very costly in terms of both prevention and treatment. Since the pandemic began to spread across the world, various projects have therefore been initiated to control the spread of infection and support the treatment of infected people [2]. The concept of the Internet of Medical Things, which applies IoT technology in the medical field, has been advocated, and research and development on the use of drones, artificial intelligence, blockchain, and 5th generation mobile communication technology has been promoted [3]. There are security issues to be resolved regarding the utilization of digital technologies, especially when devices are connected to the Internet [4].

Contact tracing applications have been widely deployed around the world to prevent the spread of infection. They use the Bluetooth communication function and the APIs of Android and iOS to report on the contact history of infected persons. Since mobility history data are utilized, the benefit of suppressing the spread of infection comes at a trade-off with privacy protection [5]. In addition, since governments need to collect data on infection history, issues such as data management, proximity estimation and attack vulnerability arise [6]. Deficiencies in privacy protection have been found in the contact tracing apps developed by many countries [7]. Not only technical measures but also legal measures


were rushed through after the pandemic began. In Japan, part of the Act on the Protection of Personal Information was amended to permit the provision of personal information to a third party without obtaining the principal's consent, but only in the case of the COVID-19 response.

2.2 Utilization of Digital Identity

Technologies that use wearable devices to prevent infection have also been developed [8]. To integrally process the data collected by multiple devices in order to maintain social distance and prevent secondary infections from infected persons, it is necessary to identify the data subject and integrate the data. The use of blockchain technology has been advocated to address the vulnerabilities of personal information protection and data management [9]. In particular, the introduction of decentralized ID technology has been advocated for personal authentication [10], and standardization is being promoted by the World Wide Web Consortium and the Decentralized Identity Foundation.

The issuance of digital IDs by the state has been implemented as part of the promotion of digital government [11]. Government-issued digital IDs, such as the Estonian government's e-Estonia and India's Aadhaar, are already in operation. Moreover, the United Nations Sustainable Development Goals include the target "By 2030, provide legal identity for all, including birth registration" (16.9), and the UN promotes the use of digital identity to achieve this goal.

2.3 Utilization of Artificial Intelligence

Attempts to utilize artificial intelligence (AI) in the medical field are spreading across various domains [12, 13]. Avoiding bias is a major issue in the use of AI for medicine [14]. However, methods to avoid bias and safely utilize AI in medical practice have been developed [15, 16]. In response to COVID-19, infection prevention requires collecting data on all unaffected citizens and intervening in their daily lives. In addition, the treatment of infected persons constantly involves unknown situations, such as new coronavirus variants, so it is necessary to collect as much data as possible and use machine learning to support decision-making in treatment. Clinical decision support systems (CDSSs) [17] are needed to keep the treatment of COVID-19 up to date.


Y. Fukami and Y. Masuda

2.4 Digital Healthcare for the Prevention of COVID-19

People with lifestyle-related diseases and the elderly are at high risk of severe COVID-19. Therefore, data on individuals' medical history and lifestyle are effective for prevention and treatment. The computerization of data generation and storage in the medical field dates back to the 1960s. Technological advances in computing opened the way for electronic medical records (EMRs) and advancements in health care [18]. An EMR is a real-time patient health record with access to evidence-based decision support tools that can aid clinicians in decision-making [19]. More specifically, an electronic health record (EHR) is a longitudinal electronic record of patient health information generated by one or more encounters in any care delivery setting, covering episodes of care across multiple care delivery organizations within a community, region, or state [19]. EHR design is essentially a consolidation of the data held by diverse medical institutions, since hardly anyone receives all of their consultations and tests at a single medical institution over their lifetime.

2.5 Enterprise Architecture for Healthcare System

In many countries and regions, medical and health information are treated as distinct, and the measures constructed for using the data are as disparate as EHRs and personal health records (PHRs). Moreover, EHRs and PHRs are managed separately. To overcome these barriers to the effective prevention and treatment of COVID-19, a framework must be developed for the system while strategic planning, implementation and operation continue in line with regional circumstances. To solve such problems, studies on enterprise architecture for the construction of medical and health information platforms have been conducted [20]. The Adaptive Integrated Digital Architecture Framework (AIDAF) [21] is a suitable tool for this situation. The AIDAF is "an enterprise architecture framework model integrating an adaptive EA cycle in the bottom portion with a TOGAF or simple EA (framework) by business division units in the upper portion" [22].

3 Case Study

3.1 The Existing Immunization Determination System

The Immunization Implementation Determination System of Tamba City was developed to support vaccination subsidies for children between 0 and 15 years old. To carry out vaccination, it is necessary to refer to the patient's vaccination history and resident


registration status. The system enables clinics in the city to inquire whether a subject is due for vaccination and which vaccines apply. IC cards are distributed to children eligible for vaccination. Android tablets owned by the city hall, connected to the city hall's vaccination ledger via a closed network, were distributed to the clinics (Fig. 1). An IC card reader/writer is mounted on each tablet (Fig. 2); identity verification is performed by reading the IC card distributed by the city, and the vaccination eligibility and the vaccine to be administered are displayed. With the introduction of the regional comprehensive care system, the scope of this system was expanded to elderly individuals. A QR code is used to verify the identity of the elderly: cards printed with QR codes are distributed to elderly people eligible for vaccination, and at the time of vaccination, the QR code is read with the tablet's camera to verify the individual's identity.
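The determination flow described above — read an ID from an IC card or QR code, resolve it against the city's vaccination ledger over the closed network, and display the applicable vaccines — can be sketched roughly as follows. All names, the vaccine set, and the data layout are hypothetical illustrations, not the actual Tamba City system.

```python
from dataclasses import dataclass, field

@dataclass
class LedgerEntry:
    """Hypothetical ledger record: resident registration plus vaccination history."""
    resident: bool
    received: set = field(default_factory=set)  # vaccines already administered

# Illustrative vaccine schedule, not the city's real one
SCHEDULE = {"measles", "rubella", "influenza"}

def determine(ledger: dict, person_id: str) -> set:
    """Return the vaccines the identified person is still eligible for."""
    entry = ledger.get(person_id)
    if entry is None or not entry.resident:
        return set()  # unknown ID or no resident registration: nothing to administer
    return SCHEDULE - entry.received

ledger = {"ID-001": LedgerEntry(resident=True, received={"measles"})}
print(determine(ledger, "ID-001"))  # remaining vaccines for this resident
```

The tablet-side code would only differ in how the ID is obtained (IC card reader versus QR camera); the lookup against the ledger is the same in both paths.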

Fig. 1 Immunization determination system with a closed network in Tamba City

Fig. 2 Tablet with IC card reader and QR code reader


Fig. 3 Assessment meta-model in architecture board

3.2 System Requirements for COVID-19 Vaccination

So far, the populations targeted by existing public vaccination programs have been limited to certain age groups, and the inoculation time differed across individuals. In contrast, the COVID-19 vaccine needs to be administered to as many residents as possible, as soon as possible. Therefore, a QR code, which costs less than an IC card, is used for personal authentication. Vaccination vouchers with printed QR codes are planned to be sent to residents who have not been covered by the immunization determination system or the regional comprehensive care system. In addition, since the vaccine needs to be stored at an ultralow temperature, vaccination should be carried out by gathering a large number of residents at a small number of venues rather than being distributed among clinics. For this reason, tablets will be brought to the newly established vaccination sites. The city needs to buy new tablets, but only a small number; in other words, the additional cost of supporting COVID-19 vaccination is small.

3.3 Improved Resident Coverage Rate of the System User

The target population of the immunization determination system was those aged 15 or younger, and the target population of the regional comprehensive care system was those aged 65 or older. Therefore, the population between the ages of 16 and 64 was not covered by the Tamba City system. Although the system's population coverage has been low due to its limited use [23], the addition of the COVID-19 vaccination management function is expected to significantly increase ID prevalence in the population.


Tamba City operates its own health examination facility and conducts regular examinations for National Health Insurance holders. As a result, it will be possible to comprehensively manage the prescription data, medical examination records, and COVID-19 vaccination records of National Health Insurance holders. In addition, the ID registration rate for people aged 65 and older, the target population of the comprehensive community care system, is expected to improve.

3.4 Differences in Data Associated with ID of the System and Limitations

Japan has a universal health insurance system, but it is divided into two schemes: National Health Insurance, managed by municipalities, and employer-based health insurance, covered by business organizations. Municipalities directly provide medical insurance services and regular medical examination services only to National Health Insurance members. Therefore, the health examination records and prescription data that can be handled by this system are only those of National Health Insurance members. By managing COVID-19 vaccination with this system, Tamba City will be able to issue and manage IDs for most residents. However, health examination records, vaccination history, and prescription data are linked to the ID only for National Health Insurance and nursing-care insurance subscribers; for other residents, only the COVID-19 vaccination record is linked to the ID.
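The asymmetry described in this subsection can be summarized as a simple lookup from insurance scheme to linkable record categories. The scheme names and category labels below are invented for illustration; they are not fields of the actual system.

```python
# Hypothetical summary of which record categories can be linked to a
# resident's ID, depending on the insurance scheme they belong to.
LINKED_DATA = {
    "national_health_insurance": {"health_exams", "prescriptions",
                                  "vaccination_history", "covid19_vaccination"},
    "nursing_care_insurance":    {"health_exams", "prescriptions",
                                  "vaccination_history", "covid19_vaccination"},
    "employer_based_insurance":  {"covid19_vaccination"},
}

def linkable(scheme: str) -> set:
    """Record categories that can be associated with the ID for this scheme."""
    return LINKED_DATA.get(scheme, set())

print(sorted(linkable("employer_based_insurance")))
```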

3.5 Differences in Data Associated with ID of the System and Limitation

The system in Tamba City was initially developed for the efficient implementation of vaccination assistance and the prevention of vaccination accidents. The COVID-19 vaccine is ineffective with a single inoculation. Therefore, it is necessary to inoculate residents who have not yet been vaccinated the specified number of times, encourage revaccination within the deadline, and encourage residents to continue to reduce contact with others. In other words, for COVID-19 vaccination to be effective, it is not enough to gather the vaccination history; functions that allow the state to intervene in residents' behavior are also necessary. On the other hand, for people suffering from lifestyle-related diseases, sharing the health condition, medical examination results, medication status, and vaccination history between the family doctor and the health center is desirable. The integrated management of medical and health data can only be realized through the aggregation of the data stored in EHRs and PHRs. The regional comprehensive


care system is designed to provide efficient and appropriate medical and nursing care by sharing data among doctors, dentists, care workers, pharmacies, and governments. The number of people eligible for ID issuance is planned to increase to support COVID-19 vaccination, but to control COVID-19 infection in the region and improve public health, cooperation among regional stakeholders will be required. It is necessary to add functions that enable this cooperation and to develop operational methods for the system.

3.6 Limitation of Shared Data

Since the system was originally developed for regional comprehensive care, stakeholders such as the city hall, medical associations, dentist associations, home-visit care providers, and others jointly established an organization (the Tamba Medical Care Collaboration: MCC) for the collaborative governance of the system. This organization supports the cooperation between stakeholders that is necessary for the treatment and infection control of COVID-19. Tamba City staff started prescription data analysis in September 2019, aiming to balance administrative spending with maintaining and improving public health standards, mainly by increasing the use of generic drugs among patients with lifestyle-related diseases. The results of the analysis are shared within the member organizations of the MCC [23]. The sharing of more diverse data is necessary to improve diagnostic efficiency and prevent institutional infections.

4 Desired Data Model and Architectural Governance with AIDAF

Converting the Immunization Implementation Determination System in Tamba City to support COVID-19 vaccination operation management will enable rapid preparation at low cost. In addition, when the vaccine is administered to the 16–64-year-old population, which was previously not targeted, the system's coverage of residents will significantly increase. As COVID-19 spreads, a large number of infected people cannot be accommodated in infectious disease wards due to a lack of beds. The daily health status of patients with lifestyle-related diseases, the results of regular health examinations, long-term care records for the elderly, and similar data can be used to triage the many infected patients hospitalized in infectious disease wards and to determine treatment policies. In addition, information such as vaccination history and preexisting illnesses is very useful for preventing secondary infections in hospitals when examining patients with fever. Data sharing among healthcare stakeholders is very important in areas where healthcare resources are limited.


In Tamba City, there is already an information sharing platform common to medical institutions. However, there are no common data architectures, procedures or rules for utilizing the data on the platform at each facility [24]. Moreover, for people suffering from lifestyle-related diseases, it is desirable to share health conditions, medical examination results, medication status, and other data. Due to the COVID-19 pandemic, the demand for data sharing among medical and health professionals and for operational support in utilizing data has become apparent. This demand is not transient but continuous, as it serves to maintain local public health and reduce health care costs. The application of the governance model proposed by AIDAF [25] to the MCC is shown in Fig. 3. The model argues that each new project planning document should be made by or submitted to the Architecture Board for review and evaluation. The review results are then published on the portal, and the project is endorsed after the necessary action items have been addressed [25]. With this model, it is possible to address the urgent issues of controlling COVID-19 infection and maintaining a treatment system. It also enables the efficient management of regional medical resources and the long-term maintenance and improvement of public health and welfare standards.
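The review flow this governance model prescribes — plans are submitted to the Architecture Board, review results are published on a portal, and a project is endorsed once its action items are addressed — can be sketched as follows. The plan names and fields are invented purely for illustration.

```python
def review(plan: dict) -> dict:
    """Hypothetical Architecture Board review: a plan with no open action
    items is endorsed; otherwise it waits until the items are addressed."""
    action_items = plan.get("open_issues", [])
    return {
        "plan": plan["name"],
        "action_items": action_items,
        "endorsed": len(action_items) == 0,
    }

portal = []  # published review results, visible to stakeholders

for plan in [{"name": "vaccination-booking"},
             {"name": "data-sharing", "open_issues": ["define data model"]}]:
    portal.append(review(plan))

print([(r["plan"], r["endorsed"]) for r in portal])
```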

5 Discussion

Tamba City has realized rapid vaccine dissemination and made progress in preventing accidents associated with vaccination. However, due to the limited functions, target ages, and target areas, the data to be accumulated and shared remain limited. Until now, the importance of data sharing has not been widely recognized among healthcare professionals, and many doctors have resisted the introduction of EHRs. However, Tamba City has been able to quickly prepare the information infrastructure necessary for COVID-19 vaccination because a platform specializing in vaccination management had already been established by the government. Since this system is connected to the basic resident register and has functions for running comprehensive community care, it not only helps manage vaccination but also suppresses the spread of infection and supports the treatment of infected persons. The need to respond to COVID-19 helps many stakeholders recognize the need to utilize the system and share data. Since the system is closely linked to the resident management work of the municipal office, it is difficult to extend its coverage outside the administrative area. The system is designed on the premise of an ID management scheme that differs from Japan's national digital identity management system. How to expand its scale is an important issue for future research. Infectious diseases such as COVID-19 require coordinated responses across regions and borders. In addition, to utilize AI analysis of big PHR data and massive data from distributed sensors and medical instruments for the diagnosis of infected people and for behavioral restrictions on residents, it is necessary to expand


the amount and diversity of aggregated data and to unify the data architecture across medicine, healthcare and other fields through AIDAF-based governance.

Acknowledgements This work was supported by JSPS Grant-in-Aid for Early-Career Scientists Grant Number JP18K12858.

References

1. Fukami, Y., Masuda, Y.: Success Factors for Realizing Regional Comprehensive Care by EHR with Administrative Data. In: Chen, Y.-W., Zimmermann, A., Howlett, R.J., Jain, L.C. (eds.) Smart Innovation, Systems and Technologies, pp. 35–45. Springer (2019)
2. Ting, D.S.W., Carin, L., Dzau, V., Wong, T.Y.: Digital technology and COVID-19 (2020). https://doi.org/10.1038/s41591-020-0824-5
3. Chamola, V., Hassija, V., Gupta, V., Guizani, M.: A Comprehensive Review of the COVID-19 Pandemic and the Role of IoT, Drones, AI, Blockchain, and 5G in Managing its Impact. IEEE Access 8, 90225–90265 (2020)
4. Yaacoub, J.P.A., Noura, M., Noura, H.N., Salman, O., Yaacoub, E., Couturier, R., Chehab, A.: Securing internet of medical things systems: Limitations, issues and recommendations. Future Gener. Comput. Syst. 105, 581–606 (2020)
5. Cho, H., Ippolito, D., Yu, Y.W.: Contact Tracing Mobile Apps for COVID-19: Privacy Considerations and Related Trade-offs. arXiv (2020)
6. Ahmed, N., Michelin, R.A., Xue, W., Ruj, S., Malaney, R., Kanhere, S.S., Seneviratne, A., Hu, W., Janicke, H., Jha, S.K.: A Survey of COVID-19 Contact Tracing Apps. IEEE Access 8, 134577–134601 (2020)
7. Segal, E., Zhang, F., Lin, X., King, G., Shalem, O., Shilo, S., Allen, W.E., Alquaddoomi, F., Altae-Tran, H., Anders, S., Balicer, R., Bauman, T., Bonilla, X., Booman, G., Chan, A.T., Cohen, O., Coletti, S., Davidson, N., Dor, Y., Drew, D.A., Elemento, O., Evans, G., Ewels, P., Gale, J., Gavrieli, A., Geiger, B., Grad, Y.H., Greene, C.S., Hajirasouliha, I., Jerala, R., Kahles, A., Kallioniemi, O., Keshet, A., Kocarev, L., Landua, G., Meir, T., Muller, A., Nguyen, L.H., Oresic, M., Ovchinnikova, S., Peterson, H., Prodanova, J., Rajagopal, J., Rätsch, G., Rossman, H., Rung, J., Sboner, A., Sigaras, A., Spector, T., Steinherz, R., Stevens, I., Vilo, J., Wilmes, P.: Building an international consortium for tracking coronavirus health status. Nat. Med. 26, 1161–1165 (2020)
8. Seshadri, D.R., Davies, E.V., Harlow, E.R., Hsu, J.J., Knighton, S.C., Walker, T.A., Voos, J.E., Drummond, C.K.: Wearable Sensors for COVID-19: A Call to Action to Harness Our Digital Infrastructure for Remote Patient Monitoring and Virtual Assessments. Front. Digit. Health 1 (2020)
9. Abd-alrazaq, A.A., Alajlani, M., Alhuwail, D., Erbad, A., Giannicchi, A., Shah, Z., Hamdi, M., Househ, M.: Blockchain technologies to mitigate COVID-19 challenges: A scoping review. Comput. Methods Programs Biomed. Update, 100001
10. Eisenstadt, M., Ramachandran, M., Chowdhury, N., Third, A., Domingue, J.: COVID-19 Antibody Test/Vaccination Certification: There's an App for That. IEEE Open J. Eng. Med. Biol. 1, 148–155 (2020)
11. Jacobovitz, O.: Blockchain for Identity Management (2016)
12. Ramesh, A.N., Kambhampati, C., Monson, J.R.T., Drew, P.J.: Artificial intelligence in medicine. Ann. R. Coll. Surg. Engl. 86, 334–338 (2004)
13. He, J., Baxter, S.L., Xu, J., Xu, J., Zhou, X., Zhang, K.: The practical implementation of artificial intelligence technologies in medicine (2019). https://doi.org/10.1038/s41591-018-0307-0
14. Sipior, J.C.: Considerations for development and use of AI in response to COVID-19. Int. J. Inf. Manage. 55 (2020)
15. Parikh, R.B., Teeple, S., Navathe, A.S.: Addressing Bias in Artificial Intelligence in Health Care. JAMA 322, 2377–2378 (2019)
16. Challen, R., Denny, J., Pitt, M., Gompels, L., Edwards, T., Tsaneva-Atanasova, K.: Artificial intelligence, bias and clinical safety. BMJ Qual. Saf. 28, 231–237 (2019)
17. El-Sappagh, S.H., El-Masri, S.: A Proposal of Clinical Decision Support System Architecture for Distributed Electronic Health Records. In: Proceedings of the International Conference on Bioinformatics & Computational Biology (BIOCOMP), p. 1 (2011)
18. Turk, M.: Electronic Health Records: How to Suture the Gap Between Privacy and Efficient Delivery of Healthcare. Brooklyn Law Rev. 80, 565–597 (2015)
19. Aceto, G., Persico, V., Pescapé, A.: The role of Information and Communication Technologies in healthcare: taxonomies, perspectives, and challenges (2018)
20. Toma, T., Masuda, Y., Yamamoto, S.: Vision Paper for Enabling Digital Healthcare Applications in OHP2030. In: Smart Innovation, Systems and Technologies, pp. 186–197 (2019)
21. Masuda, Y., Viswanathan, M.: Enterprise Architecture for Global Companies in a Digital IT Era: Adaptive Integrated Digital Architecture Framework (AIDAF). Springer (2019)
22. Masuda, Y., Shirasaka, S., Yamamoto, S., Hardjono, T.: Architecture Board Practices in Adaptive Enterprise Architecture with Digital Platform. Int. J. Enterp. Inf. Syst. 14, 1–20 (2018)
23. Fukami, Y., Masuda, Y.: Stumbling blocks of utilizing medical and health data: Success factors extracted from Australia–Japan comparison. In: Chen, Y.-W., Tanaka, S., Howlett, R.J., Jain, L.C. (eds.) Innovation in Medicine and Healthcare. Smart Innovation, Systems and Technologies, vol. 192, pp. 15–25. Springer, Singapore (2020)
24. Fukami, Y., Masuda, Y.: Governance for realization of medical, nursing and administration data integration system. In: Proceedings of the Biennial Conference of the Asia-Pacific Association for Medical Informatics, Hamamatsu, Shizuoka (2020)
25. Masuda, Y., Shirasaka, S., Yamamoto, S., Hardjono, T.: An Adaptive Enterprise Architecture Framework and Implementation. Int. J. Enterp. Inf. Syst. 13, 1–22 (2017)

Biological Engineering, Research and Technologies

Knowledge Distillation with Teacher Multi-task Model for Biomedical Named Entity Recognition Tahir Mehmood, Alberto Lavelli, Ivan Serina, and Alfonso Gerevini

Abstract A multi-task model (MTM) learns shared and task-specific features across different tasks, an approach that has proved effective for tasks where limited data is available to train the model. In this research, we exploit this characteristic of MTMs through knowledge distillation to enhance the performance of a single-task model (STM). STMs have difficulty learning complex feature representations from a limited amount of annotated data. Distilling knowledge from an MTM helps the STM learn more complex feature representations during its training phase. We use feature representations from different layers of an MTM to teach the student model during its training. Our approach shows clear improvements in F1-score with respect to the STM. We further performed a statistical analysis to investigate the effect of different teacher models on different student models, and found that a Softmax-based teacher model is more effective for token-level knowledge distillation than a CRF-based teacher model.

Keywords Biomedical named entity recognition · Multi-task learning · Knowledge distillation · Deep learning · Long short-term memory

T. Mehmood (B) · I. Serina · A. Gerevini University of Brescia, 25121 Brescia, Italy e-mail: [email protected]; [email protected] I. Serina e-mail: [email protected] A. Gerevini e-mail: [email protected] A. Lavelli Fondazione Bruno Kessler, 38123 Povo, Trento, Italy e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Y.-W. Chen et al. (eds.), Innovation in Medicine and Healthcare, Smart Innovation, Systems and Technologies 242, https://doi.org/10.1007/978-981-16-3013-2_3



T. Mehmood et al.

1 Introduction

The biomedical named entity recognition (BioNER) task identifies biomedical entities and categorizes them into predefined categories such as disease, gene, and chemical. Tasks such as relation extraction (e.g., chemical-induced disease relations, drug–drug interactions, …) include BioNER as a preliminary step [8]. However, biomedical texts are more complex than general texts and carry unusual characteristics, e.g., spelling variations (10-Ethyl-5-methyl-5,10-dideazaaminopterin vs 10EMDDA), long multi-word expressions (10-ethyl-5-methyl-5,10-dideazaaminopterin), and ambiguous words (TNF alpha can refer to both a DNA and a protein) [4]. Such characteristics make BioNER more challenging than the standard named entity recognition task.

State-of-the-art methods for BioNER are based on deep learning (DL) and have shown promising results [7]. However, unusual naming conventions in the biomedical literature and limited annotated data are still challenging for DL systems. DL methods also depend on the amount of training data and produce promising results when more data is available. Furthermore, in most cases, state-of-the-art DL systems are based on complex structures. Sutskever et al. [10] proposed a DL model with 4 layers of long short-term memory (LSTM), each with 1000 hidden units. Similarly, a multi-level LSTM-based model with 512 hidden units at each level was proposed by Zhou et al. [14]. Much computational power is required to train these DL models, which have millions of parameters. Such complex models also require more storage space, which makes them unsuitable for deployment on systems with limited storage capacity, e.g., cell phones. In such situations, these complex models must be compressed without much degradation in performance.
In this regard, a knowledge distillation technique [5] is used, where one model teaches another through its learned knowledge. This supervision is done through predictions: the learning student model mimics the predictions of the teacher model. In this way, the simple model can learn the same generalization as its teacher model and therefore represents a more compact form of the teacher. The student model learns from two signals, its own loss on the true labels and the loss with respect to the teacher's outputs, and for this reason it can produce better results.

This research proposes to use the knowledge distillation approach to improve the performance of an STM for the BioNER task using the MTM's feature representations. The MTM learns shared and task-specific features while simultaneously training on different tasks. This helps the MTM generalize well for any specific task, and this generalization can be used to teach the STM to learn more complex and useful features. The logits (the input to the Softmax layer) [5] of the MTM are used to perform knowledge distillation. Logits are unnormalized predictions and carry more information than the final Softmax output, since their values range over (−∞, +∞). In other words, the STM (student) matches the true labels as well as the logits of the MTM (teacher) during its training. We also utilized the shared and task-specific feature representations of the intermediate layers of the teacher MTM for the student


STM. We also adopted an ensemble approach in which the knowledge of both Softmax-based and CRF-based MTMs is used for the training of the student model.

2 Knowledge Distillation

In transfer learning, the representation learned in a source domain is utilized in another, related domain. In contrast, the objective of knowledge distillation is to train a model with the knowledge learned by another model: a simple (student) model is trained on the knowledge learned by a complex (teacher) model. Complex models or ensemble methods (combining several models) usually produce better results than a simple STM, but they are more computationally expensive to train. The knowledge distillation approach helps the simple (student) model produce better results than the same model trained standalone, approaching the generalization of the teacher or ensemble. In this way, the student model can be trained on fewer training examples, since it also consumes the knowledge learned by the teacher during training. The idea is that the complex model has already generalized over the data during its training, which helps the student model achieve, or nearly achieve, the generalization level of the teacher. The student model learns not only through its own gradient but also through another source of knowledge.

Transferring knowledge from a teacher model is usually done in the form of the probabilities predicted by the teacher. The objective of any learning model is to predict the correct class for an input example, assigning a high probability to the target class and small probabilities to the remaining classes. The probabilities associated with the incorrect classes are not assigned randomly but depict how a specific model has generalized over the classes in the dataset. For instance, there is very little chance of misclassifying a motorbike image as a car, but the probability of misclassifying it as a truck is usually significantly higher.
This imprecise information, referred to as "dark" knowledge [5], can be distilled into another model. The Softmax activation function outputs a probability distribution over the possible classes for a specific instance; these probabilities sum to 1. The Softmax probabilities give more information than the one-hot "hard labels". For instance, the Softmax probabilities [0.7, 0.2, 0.1] show the ranking of the classes; such information cannot be found in the hard labels, e.g., [1, 0, 0]. Therefore, the posterior probabilities can pass an extra useful signal to the student model during its training.
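A minimal numeric illustration of soft versus hard labels (the logit values are invented for the example):

```python
import math

def softmax(logits):
    """Turn unnormalized logits into a probability distribution."""
    m = max(logits)                              # subtract max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for three classes, e.g. (motorbike, truck, car)
soft = softmax([2.0, 0.8, -0.5])
hard = [1, 0, 0]  # one-hot hard label: the relative ranking of wrong classes is lost

print([round(p, 2) for p in soft])  # soft labels preserve the class ranking
```

The soft distribution still shows that the second class is far more plausible than the third, a signal the hard label erases.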


3 Our Proposal

Figure 1 introduces our proposed knowledge distillation approach. The teacher model is an MTM taking the word- and character-level input of the sentences. We use a bidirectional LSTM (BiLSTM) to process the sequence in both directions [9]. The upper layers of the MTM, shown in the black rounded rectangle, are shared among all the datasets. The bottom layers, shown in the red rounded rectangle, are dataset-specific, and the Softmax function is used for output labelling. In the multi-task learning (MTL) approach, shared layers help one task be learned better with the help of another task. Training jointly on related tasks helps the MTM learn features common to different tasks through the shared layers [1]. The task-specific layers learn features that are more related to the current task. Training related tasks together helps the model optimize its parameter values. Joint learning also lowers the chance of overfitting to any specific task. Therefore, we assume that the student model will also be less prone to overfitting thanks to knowledge distillation from the MTM.

In this research, we aim to perform knowledge distillation at the token level and are therefore interested in the logits produced at the token level. For this reason, we use the Softmax function, which produces a probability distribution at the token level. Token-level knowledge distillation is not possible with conditional random fields (CRFs), as they predict the labels of the whole sequence. A CRF-based model labels the sequence by considering the associations between neighboring labels, which limits knowledge distillation from the teacher model [12]. To verify this hypothesis, we used two different teacher MTMs, one with a CRF at the output layer and one with Softmax. The student model is a counterpart STM of the MTM; therefore, the structures of both models are the same.
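The shared/task-specific split of the MTM can be caricatured in a few lines. The code below is a toy linear stand-in for the BiLSTM layers, with invented scales and task names; it only illustrates that every task's forward pass flows through the same shared transform before its own head.

```python
# Toy stand-in for the MTM structure: one shared transform feeds
# per-task heads, so every task trains the shared parameters.
def make_mtm(shared_scale, task_scales):
    def forward(task, x):
        hidden = [shared_scale * v for v in x]   # shared layers (all tasks)
        w = task_scales[task]                    # task-specific head
        return sum(w * v for v in hidden)
    return forward

mtm = make_mtm(shared_scale=2.0, task_scales={"disease": 0.5, "gene": 1.5})
print(mtm("disease", [1.0, 2.0]))  # → 3.0
```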
In this research, we perform knowledge distillation using the teacher (MTM) logits z_t, i.e., the input of the Softmax layer [11]. The logits can take values in (−∞, +∞) and carry more information than the output layer, where the information is processed to produce labels. In addition to the logits of the MTM, we also use feature representations from different layers of the MTM, including the shared as well as the task-specific BiLSTM layers, for knowledge distillation to the STM. During training, the student model considers the hard labels as well as the distillation loss, which depends on matching the outputs of the teacher model (MTM). The loss function of our student model is given in Eq. (1), where the distillation loss aims to minimize the mean-squared error (MSE) between the student predictions and the teacher predictions at different layers. Here, x represents the input, W represents the student model's parameters, H is the cross-entropy loss, y corresponds to the true labels, and σ is the Softmax function. The α and β are hyperparameters weighting each loss.

L(x; W) = α · H(y, σ(z_s, z_t)) + β · MSE    (1)
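Equation (1) can be read as the following scalar sketch in plain Python. This is not the paper's BiLSTM implementation: the toy vectors are invented, and the Softmax term σ(z_s, z_t) is simplified to the student's predicted probabilities.

```python
import math

def cross_entropy(y_true, probs):
    """H(y, p): negative log-probability of the true class (one-hot y)."""
    return -sum(t * math.log(p) for t, p in zip(y_true, probs) if t > 0)

def mse(a, b):
    """Mean-squared error between two equally sized feature vectors."""
    return sum((u - v) ** 2 for u, v in zip(a, b)) / len(a)

def student_loss(y_true, student_probs, student_feats, teacher_feats,
                 alpha=0.5, beta=0.5):
    """Eq. (1): alpha * hard-label loss + beta * teacher-matching MSE."""
    return (alpha * cross_entropy(y_true, student_probs)
            + beta * mse(student_feats, teacher_feats))

# Toy token: student predicts [0.7, 0.2, 0.1]; teacher features differ slightly
loss = student_loss([1, 0, 0], [0.7, 0.2, 0.1], [0.5, 0.3], [0.6, 0.2])
print(round(loss, 4))
```

Setting beta to zero recovers an ordinary cross-entropy training objective, which is the mechanism behind the α = 1 configuration discussed in the experiments.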


4 Experiments

As a first step, the MTM, shown on the right side of Fig. 1, is trained separately. This MTM is then used to distill knowledge to the student model. Different teacher MTMs are used for knowledge distillation: a CRF-based MTM, which uses a CRF at the output layer, and a Softmax-based MTM, which uses Softmax. We additionally use an ensemble approach, where the logits of the CRF-based and Softmax-based MTMs are averaged to teach the student models. The rationale behind this approach is that different MTMs learn different feature sets, and therefore we train our student models on a wider range of features. Similarly, we use two different kinds of student models: in the first approach, a CRF-based student model uses a CRF at the output layer, whereas in the second, Softmax is used (Softmax-based student model). We perform three different experiments: in the first, the logits (output of the hidden layer) are used for knowledge distillation. In the second, the output of the shared BiLSTM is used for knowledge distillation, while in the last, both the shared and the task-specific BiLSTM outputs of the teacher model are used. In all the experiments, MSE is computed to match the outputs of the intermediate layers. For our first experiment, where the logits are considered for knowledge distillation, we use β to weight the distillation loss, i.e., β · MSE_logits, as shown in Eq. (2). We

Fig. 1 Proposed knowledge distillation approach (colored circles show embeddings)


T. Mehmood et al.

We perform experiments for different values of α, i.e., [0, 0.5, 1], with β = 1 − α. No hyperparameter tuning is done for α; the values are selected in a simple, straightforward way. If α = 0, the student model learns with only the distillation loss, while α = 0.5 means that the student loss and the distillation loss are considered equally. Finally, α = 1 allows the student model to consider only the student loss. It should be noted that for α = 1 the student model becomes the STM, but it still considers the logits of the teacher model during its training phase. When we use the shared BiLSTM along with the logits of the hidden layer, we introduce another parameter, γ ∈ {0, 0.5, 1}, to weight the shared-layer matching loss, as shown in Eq. (3). Similarly, when the output of the task-specific BiLSTM is considered, the parameter κ ∈ {0, 0.5, 1} is introduced, and the new loss is given in Eq. (4). However, in the latter cases, we keep α = 1 and perform hyperparameter tuning for β, γ, and κ to select the best value from {0, 0.5, 1} for each parameter.

L(x; W) = α · H(y, σ(z_s, z_t)) + β · MSE_logits    (2)

L(x; W) = α · H(y, σ(z_s, z_t)) + β · MSE_logits + γ · MSE_shared    (3)

L(x; W) = α · H(y, σ(z_s, z_t)) + β · MSE_logits + γ · MSE_shared + κ · MSE_task-specific    (4)
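Equations (2)–(4) differ only in which matching terms are active, so a single hedged NumPy sketch covers all three. The dictionary keys and names are illustrative, not from the paper:

```python
import numpy as np

def mse(a, b):
    # mean-squared error between two layer outputs
    return float(np.mean((np.asarray(a, dtype=float) - np.asarray(b, dtype=float)) ** 2))

def student_total_loss(ce_hard, student, teacher, alpha=1.0, beta=1.0, gamma=0.0, kappa=0.0):
    """Eqs. (2)-(4): `ce_hard` is the cross-entropy with hard labels;
    `student`/`teacher` are dicts holding 'logits', 'shared' and 'task'
    layer outputs. With gamma = kappa = 0 this recovers Eq. (2)."""
    loss = alpha * ce_hard + beta * mse(student["logits"], teacher["logits"])
    if gamma:  # Eq. (3): shared-BiLSTM matching term
        loss += gamma * mse(student["shared"], teacher["shared"])
    if kappa:  # Eq. (4): task-specific BiLSTM matching term
        loss += kappa * mse(student["task"], teacher["task"])
    return loss
```

Setting the weights to values from {0, 0.5, 1} reproduces the hyperparameter grid described above.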

The input words are represented with the WikiPubMed-PMC word embedding [3], while the character embedding is initialized randomly and further processed by a BiLSTM. In this paper, we perform experiments on the 15 biomedical datasets¹ also used by Crichton et al. [2] and Wang et al. [13]. A description of these datasets can be found in [6]. We follow the same experimental setup adopted by Wang et al.,² which uses both the training and the development sets for training the model. We report the average F1-score over five runs for the results comparison in the next section.

5 Results and Discussion

Table 1 shows the results of the CRF-based student model trained with the CRF-based MTM's logits and with different α values. The best results are shown in bold, while the second-best score is shown in italics. It can be noticed that all the student models show a performance gain for most of the datasets compared with the STM. However, comparing with the teacher model

¹ The datasets can be found at https://github.com/cambridgeltl/MTL-Bioinformatics-2016.

² https://github.com/yuzhimanhua/Multi-BioNER.


(MTM), a performance gain is observed only for the BC4CHEMD and linnaeus datasets. The performance increase on BC4CHEMD might be due to the single-task learning approach of the student model, as this dataset obtains its best score with the STM, which can be observed by comparing the STM and MTM F1-scores. We found that the CRF-based MTM does not transfer enough knowledge at the token level [12], and therefore we notice performance degradation for the corresponding student models. To support our hypothesis, we extended the same experiments with the logits of the Softmax-based MTM, shown in Table 2. The logits of the Softmax-based MTM could provide the CRF-based student model with additional information, as the Softmax function independently assigns a probability distribution to each token. We observe that the student models show a noticeable performance gain compared with the STM, and results are also improved for most of the datasets compared with their teacher MTM. It should be noted that in this table we compare the results of the student models with their corresponding teacher model, which is the Softmax-based MTM, whereas in Table 1 the student models are compared with their corresponding teacher model, i.e., the CRF-based MTM. The BC4CHEMD dataset is the only one that shows its best results using the STM. It can be noticed that with the MTM there is a degradation of the BC4CHEMD result, and this worsening propagates to the student models, as can be observed in Table 2. From these results, we conclude that the Softmax-based teacher model is more beneficial for the knowledge distillation approach than the CRF-based teacher model. In another experiment, an ensemble approach is adopted where the logits of both the CRF-based and Softmax-based MTMs are used to teach Softmax-based student models

Table 1 Results comparison of the proposed CRF-based student model (the α columns refer to the student model trained with the CRF-based teacher MTM)

Datasets      STM    MTM    α = 0  α = 0.5  α = 1
AnatEM        86.83  87.70  87.66  87.67    87.62
BC2GM         81.82  81.68  80.17  80.23    80.34
BC4CHEMD      90.43  89.10  90.23  90.10    90.07
BC5CDR        88.68  88.45  88.31  88.38    88.29
BioNLP09      87.88  89.08  88.03  88.09    88.03
BioNLP11EPI   83.41  85.28  84.09  83.94    83.97
BioNLP11ID    86.21  87.66  86.84  87.10    86.76
BioNLP13CG    83.18  84.70  83.42  83.39    83.25
BioNLP13GE    76.65  80.93  77.04  76.88    76.94
BioNLP13PC    87.69  89.35  88.17  88.06    88.29
CRAFT         85.11  84.49  84.38  84.41    84.37
ExPTM         73.54  82.44  76.02  76.06    75.95
JNLPBA        72.26  72.84  71.41  71.20    71.22
linnaeus      87.86  88.32  88.86  89.50    88.92
NCBI          84.85  86.00  84.97  85.07    84.99


Table 2 Results comparison of the proposed CRF-based student models (the α columns refer to the student model trained with the Softmax-based MTM's logits)

Datasets      STM    MTM    α = 0  α = 0.5  α = 1
AnatEM        86.53  86.78  87.48  87.59    87.49
BC2GM         81.07  79.68  81.26  81.36    81.20
BC4CHEMD      90.24  86.80  89.81  89.71    89.78
BC5CDR        88.09  87.49  88.33  88.31    88.30
BioNLP09      87.37  88.40  88.94  88.79    88.75
BioNLP11EPI   82.58  84.56  83.92  84.40    84.38
BioNLP11ID    85.58  87.26  86.82  86.66    86.69
BioNLP13CG    82.11  83.83  83.12  83.31    83.36
BioNLP13GE    75.38  80.06  77.89  77.80    77.97
BioNLP13PC    87.26  88.17  88.33  88.25    88.28
CRAFT         84.27  81.96  84.10  84.35    84.32
ExPTM         73.06  80.69  76.08  76.08    76.38
JNLPBA        70.86  70.40  71.82  71.89    71.72
linnaeus      87.88  88.32  88.94  89.67    88.78
NCBI          83.98  84.50  85.33  85.37    85.54

as shown in Table 3. The logits of both MTMs (CRF-based and Softmax-based) are combined and averaged during the training of the student models. However, instead of using a CRF in the student model, the Softmax function is used at the output layer. Unlike the previous results, a noticeable improvement can be seen for the student models, which also improve the results for those datasets where the previous approaches failed to produce an improvement. To extract more knowledge from the teacher model, knowledge distillation is also performed at different intermediate layers. The output of both the shared and task-specific BiLSTMs (see Fig. 1) is used for knowledge distillation along with the logits of the hidden layer, and the results are reported in Table 4. The MSE is computed between the outputs of the teacher's BiLSTM and the student's BiLSTM. The proposed approach with the task-specific BiLSTM shows a remarkable gain over the STM for all datasets, while compared with the MTM an increase in performance is noted for 9 datasets. Like our previous approaches, it again fails to improve performance for most of the protein datasets. However, with the introduction of the shared BiLSTM layer along with the task-specific BiLSTM layer, the performance of the student models increases. We also performed a statistical analysis using the Friedman test to identify models whose results differ significantly from each other, using the output ranks from the test to see which models are statistically better than others. Figure 2 depicts the statistical analysis of the CRF-based student models. MTM(CRF) and STM(CRF) represent the models with a CRF at the output layer, unlike the other MTM and


Table 3 Results comparison of the proposed Softmax-based student models (the α columns refer to the student model trained with both the Softmax-based and CRF-based MTMs)

Datasets      STM    MTM    α = 0  α = 0.5  α = 1
AnatEM        86.53  86.78  87.57  85.17    87.83
BC2GM         81.07  79.68  81.41  81.35    81.34
BC4CHEMD      90.24  86.80  90.09  90.02    90.21
BC5CDR        88.09  87.49  88.44  88.50    88.50
BioNLP09      87.37  88.40  88.74  88.92    88.90
BioNLP11EPI   82.58  84.56  84.47  84.51    84.67
BioNLP11ID    85.58  87.26  87.48  87.54    87.26
BioNLP13CG    82.11  83.83  83.74  83.66    83.71
BioNLP13GE    75.38  80.06  78.21  78.12    78.16
BioNLP13PC    87.26  88.17  88.34  88.41    88.35
CRAFT         84.27  81.96  84.48  84.58    84.57
ExPTM         73.06  80.69  76.23  76.11    76.25
JNLPBA        70.86  70.40  72.00  71.95    72.01
linnaeus      87.88  88.32  89.35  89.02    89.17
NCBI          83.98  84.50  85.45  85.17    85.53
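The ensemble teaching signal behind Table 3 is a simple per-token average of the two teachers' logits; a minimal sketch (the function and array names are ours, not the authors'):

```python
import numpy as np

def ensemble_teacher_logits(crf_logits, softmax_logits):
    """Average the logits of the CRF-based and Softmax-based teacher MTMs
    (arrays of shape [tokens, labels]) to form the ensemble signal."""
    return (np.asarray(crf_logits, dtype=float)
            + np.asarray(softmax_logits, dtype=float)) / 2.0
```

The averaged logits then replace z_t in the distillation loss of Eq. (2).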

Table 4 Results comparison of the proposed Softmax-based student model

Dataset       STM    MTM    Student (task)  Student (task+shared)
AnatEM        86.53  86.78  87.85           87.93
BC2GM         81.07  79.68  81.85           81.59
BC4CHEMD      90.24  86.80  90.43           90.61
BC5CDR        88.09  87.49  88.79           88.84
BioNLP09      87.37  88.40  88.33           88.68
BioNLP11EPI   82.58  84.56  84.06           84.59
BioNLP11ID    85.58  87.26  86.77           86.83
BioNLP13CG    82.11  83.83  83.65           83.61
BioNLP13GE    75.38  80.06  77.29           78.18
BioNLP13PC    87.26  88.17  88.27           88.27
CRAFT         84.27  81.96  85.24           85.05
ExPTM         73.06  80.69  74.94           74.49
JNLPBA        70.86  70.40  71.68           71.55
linnaeus      87.88  88.32  89.70           88.67
NCBI          83.98  84.50  85.61           85.60

Student (task): trained with the task-specific intermediate BiLSTM layer of the Softmax-based MTM. Student (task+shared): trained with both the shared and task-specific intermediate BiLSTM layers of the Softmax-based MTM


Fig. 2 Statistical analysis of the student models with CRF at the output layer. Models are shown according to their ranks, from best to worst, left to right. Soft represents the student model trained with the Softmax-based MTM

Fig. 3 Statistical analysis of the student models with Softmax at the output layer. Models are shown according to their ranks, from best to worst, left to right

STM, where Softmax is used at the output layer. The arrows indicate statistically significant differences between models. We notice that the CRF-based student models are statistically worse than their corresponding teacher MTM(CRF), whereas the Softmax-based student models produce statistically better results than their corresponding teacher MTM. We also see that no student model produces statistically better results than another student model. However, the student models (Soft_α) trained with the Softmax-based MTM are statistically better than both the CRF-based and Softmax-based STMs, whereas the student models (CRF_α) fail to produce statistically different results from the CRF-based STM. In Fig. 3, we analyze the student models that use Softmax at the output layer. All the student models produce statistically better results than the STM and the teacher model (MTM), but no student model produces statistically significant results compared with any other variant of the student models. We can say that the Softmax-based student models are statistically better than the CRF-based student models, since only the former produce statistically significant improvements over the STM. The student model based on the output of the task-specific and shared BiLSTMs (Task+Shared) obtains the best ranks among all the approaches.
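The Friedman test described above can be reproduced with SciPy. The F1 values below are made-up stand-ins for per-dataset scores of three models, shown only to illustrate the call:

```python
from scipy.stats import friedmanchisquare

# hypothetical per-dataset F1 scores for three models (one value per dataset)
stm     = [86.8, 81.8, 90.4, 88.7, 87.9, 83.4]
mtm     = [87.7, 81.7, 89.1, 88.5, 89.1, 85.3]
student = [87.7, 81.4, 90.1, 88.4, 88.1, 84.1]

# ranks the models within each dataset and tests whether the rank
# distributions differ; a small p-value indicates a significant difference
stat, p = friedmanchisquare(stm, mtm, student)
```

The per-model mean ranks underlying this statistic are what Figs. 2 and 3 order the models by.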

6 Conclusions

In this research, we have used knowledge distillation methods to boost the performance of an STM. We proposed an MTM as a teacher model and used its different layers, including the shared and task-specific layers, for knowledge distillation. Two different MTMs are used, differing in the function at the output layer: one uses Softmax while the other uses a CRF. Our proposed approach showed noticeable performance gains for STMs using knowledge distillation. Results showed


that the Softmax-based MTM is more effective for token-level knowledge distillation than the CRF-based MTM. We also showed that distilling knowledge from the intermediate layers of the MTM is more effective. Finally, a statistical analysis demonstrated that knowledge distillation has a statistically significant impact on the STM's performance.

References

1. Bansal, T., Belanger, D., McCallum, A.: Ask the GRU: multi-task learning for deep text recommendations. In: Proceedings of the 10th ACM Conference on Recommender Systems, pp. 107–114. ACM (2016)
2. Crichton, G.K.O., Pyysalo, S., Chiu, B., Korhonen, A.: A neural network multi-task learning approach to biomedical named entity recognition. BMC Bioinform. 18(1), 368 (2017)
3. Giorgi, J.M., Bader, G.D.: Transfer learning for biomedical named entity recognition with neural networks. Bioinformatics 34(23), 4087–4094 (2018)
4. Gridach, M.: Character-level neural network for biomedical named entity recognition. J. Biomed. Inform. 70, 85–91 (2017)
5. Hinton, G.E., Vinyals, O., Dean, J.: Distilling the knowledge in a neural network. CoRR abs/1503.02531 (2015)
6. Mehmood, T., Gerevini, A., Lavelli, A., Serina, I.: Leveraging multi-task learning for biomedical named entity recognition. In: AI*IA 2019 – Advances in Artificial Intelligence. LNCS, vol. 11946, pp. 431–444. Springer (2019)
7. Mehmood, T., Serina, I., Lavelli, A., Gerevini, A.: Knowledge distillation techniques for biomedical named entity recognition. In: Proceedings of the 4th Workshop on Natural Language for Artificial Intelligence (NL4AI 2020). CEUR Workshop Proceedings, vol. 2735, pp. 141–156. CEUR-WS.org (2020)
8. Putelli, L., Gerevini, A., Lavelli, A., Serina, I.: Applying self-interaction attention for extracting drug-drug interactions. In: AI*IA 2019 – Advances in Artificial Intelligence. LNCS, vol. 11946, pp. 445–460. Springer (2019)
9. Putelli, L., Gerevini, A.E., Lavelli, A., Serina, I.: The impact of self-interaction attention on the extraction of drug-drug interactions. In: Proceedings of the Sixth Italian Conference on Computational Linguistics. CEUR Workshop Proceedings, vol. 2481. CEUR-WS.org (2019)
10. Sutskever, I., Vinyals, O., Le, Q.V.: Sequence to sequence learning with neural networks. In: Advances in Neural Information Processing Systems 27, pp. 3104–3112 (2014)
11. Tang, R., Lu, Y., Liu, L., Mou, L., Vechtomova, O., Lin, J.: Distilling task-specific knowledge from BERT into simple neural networks. CoRR abs/1903.12136 (2019)
12. Wang, X., Jiang, Y., Bach, N., Wang, T., Huang, F., Tu, K.: Structure-level knowledge distillation for multilingual sequence labeling. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pp. 3317–3330. Association for Computational Linguistics (2020)
13. Wang, X., Zhang, Y., Ren, X., Zhang, Y., Zitnik, M., Shang, J., Langlotz, C., Han, J.: Cross-type biomedical named entity recognition with deep multi-task learning. Bioinformatics 35(10), 1745–1752 (2019)
14. Zhou, J., Cao, Y., Wang, X., Li, P., Xu, W.: Deep recurrent models with fast-forward connections for neural machine translation. Trans. Assoc. Comput. Linguist. 4, 371–383 (2016)

Genomics-Based Models for Recurrence Prediction of Non-small Cells Lung Cancers Panyanat Aonpong, Yutaro Iwamoto, Weibin Wang, Lanfen Lin, and Yen-Wei Chen

Abstract This research examines non-small cell lung cancer (NSCLC) recurrence prediction using genomics information to reach the maximum accuracy. The raw gene data show very good performance but require precise examination. This work studies how to reduce the complexity of the gene data with minimal information loss; the processed gene data can achieve reasonable prediction results with a faster process. We present a comparison of two processing steps, gene selection and gene quantization (linear quantization and K-mean quantization), using associated genes selected from 88 patient samples from the open-access dataset of non-small cell lung cancer in The Cancer Imaging Archive Public Access. We vary the number of groups and compare the recurrence prediction performance of both quantization methods. The results of this study show that the F-test method provides the gene set best related to NSCLC recurrence. With the F-test and without quantization, prediction accuracy improves from 81.41% (using 5587 genes) to 91.83% (using 294 selected genes). With quantization, a suitable number of gene groups raises the accuracy to up to 93.42% using K-mean quantization.

Keywords Non-small cell lung cancer · Recurrence prediction · Genomics · Genetic analysis

P. Aonpong (B) · Y. Iwamoto · W. Wang · Y.-W. Chen College of Information Science and Engineering, Ritsumeikan University, Shiga 525-8577, Japan e-mail: [email protected] Y.-W. Chen e-mail: [email protected] Y.-W. Chen Research Center for Healthcare Data Science, Zhejiang Lab, Hangzhou, China L. Lin · Y.-W. Chen College of Computer Science and Technology, Zhejiang University, Hangzhou, China © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Y.-W. Chen et al. (eds.), Innovation in Medicine and Healthcare, Smart Innovation, Systems and Technologies 242, https://doi.org/10.1007/978-981-16-3013-2_4




1 Introduction

Lung cancer is one of the most dangerous diseases and has been the most common type of cancer for several decades [1, 2]. Lung cancer can be divided into two types: small cell lung cancer (SCLC) and non-small cell lung cancer (NSCLC) [1–3]. Statistically, NSCLC accounts for more than 80 percent of all lung cancer patients [3, 4]. Early NSCLC surgery gives patients the best hope of a cure [3–5]. However, the rate of recurrence after surgery remains high [3, 5]. Patients with recurrence are at risk and should receive special care and observation, because half of all patients die after recurrence [6]. If doctors and patients prepare for recurrence in advance, the chance of death may be reduced [3]. Nowadays, the analysis of lung cancer recurrence is an active topic of study in both the medical and statistical fields [3–12]. Good NSCLC recurrence prediction performance allows doctors and patients to better prepare to care for and observe the symptoms [8–11]. Machine learning has become one of the important contributors to a good diagnosis of recurrence [8–11]. Radiomics models [7–11] have been developed to work directly with medical tasks. Although the use of CT images produces satisfactory results, CT images cannot fully capture some very detailed information about each patient, so the prediction is limited [13]. To overcome this limitation, we use gene data, which can characterize the patient's health much better than CT images [14]. In this work, we compare NSCLC recurrence prediction in two respects. First, we compare the use of all existing gene data against three different gene selection methods to find the best gene set for NSCLC recurrence prediction. Second, using the best gene set, we compare non-quantization, linear quantization, and K-mean quantization for grouping the genes before they are used to predict NSCLC recurrence. For the quantization methods, we also compare prediction performance for different group numbers. We use accuracy and the area under the receiver operating characteristic curve (AUC) to evaluate the performance of the models [15].

2 Materials and Methods

The genomics-based method works from knowledge of the genes and of NSCLC recurrence. The model uses gene expression as input. We first select only the specific genes involved in recurrence. For each gene, we then quantize its values across all patients, grouping patients whose expression of that gene is similar, before the quantized values are fed into a neural network to analyze the information about recurrence. The overall model is shown in Fig. 1.


Fig. 1 Overview of the genomics-based method

2.1 Dataset

An open-access dataset of non-small cell lung cancer [16] in The Cancer Imaging Archive (TCIA) Public Access [17] was used in this study. The dataset was collected from an NSCLC cohort of 211 patients. To obtain the gene data, biologists collected tumor samples from NSCLC patients who had not received drug treatment before collection. Tissue was taken along the longest axis of the resected tumor and frozen within 30 min of biopsy, then analyzed using RNA sequence expression. Whether RNA-sequencing (RNA-seq) data exist depends on the presence of genes and the tissue quality. The RNA-seq analysis was performed with a HiSeq 2500 (Illumina) machine according to the manufacturer's instructions. The 130 tissue samples were sequenced in 3 batches of sizes 16, 66, and 48 [16]. The collected RNA-seq data were pre-processed by CenTrillion Bioscience, and gene expression was estimated in Fragments per Kilobase of Transcript per Million (FPKM) [16, 18]. Finally, 22,127 gene values were collected for each patient [16]. Ambiguous gene expression values were reported as N/A; these genes were removed from our work, leaving only 5,587 of the 22,127 genes for our study. Patient samples were then screened by the preset criteria and by the availability of the information needed for inspection: patients without recurrence data, and patients without a recurrent tumor who died before recurrence could occur, were excluded. Eventually, 88 patients remained in the dataset. The details of the dataset are shown in Table 1.

Table 1 Clinical characteristics of the screened subjects

Total n = 88 patients

Age (year)    46–85 (median = 69)
Gender        Male: 64 (72.72%);  Female: 24 (27.27%)
Cell type     ADC: 68 (77.27%);  SQC: 17 (19.30%);  Not otherwise specified: 3 (3.41%)
Recurrence    No: 29 (32.95%);  Yes: 59 (67.05%)
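The N/A filtering described in this section (dropping any gene whose expression is missing for some patient) can be sketched with pandas; the gene names and values below are made up for illustration:

```python
import pandas as pd

# hypothetical FPKM matrix: rows = patients, columns = genes
expr = pd.DataFrame({
    "GENE_A": [1.2, 0.8, 2.4],
    "GENE_B": [None, 3.1, 0.5],   # ambiguous (N/A) expression for one patient
    "GENE_C": [2.0, 2.5, 1.9],
})
clean = expr.dropna(axis=1)        # remove genes with any N/A value
```

Applied to the real matrix, this kind of filtering is what reduces 22,127 genes to the 5,587 usable ones.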


2.2 Gene Selection

The gene expression data in the NSCLC radiogenomics dataset is large, containing more than 20,000 gene values for each patient. Most of this information is not related to NSCLC recurrence. Feeding all the genes into the model causes a significant increase in computational cost and a reduction in recurrence prediction quality. For this reason, selecting the specific genes involved is better than using the whole gene information, in both computation time and prediction precision. We apply feature selection methods, namely the least absolute shrinkage and selection operator (LASSO) [19], F-test (ANOVA) [20], and CHI-2 [21, 22], to select only the related genes and remove non-related genes from the dataset.
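A hedged scikit-learn sketch of the F-test selection with the p < 0.05 threshold used in this paper. The expression matrix below is synthetic; LASSO and chi-squared selection follow the same pattern (chi-squared additionally requires non-negative features):

```python
import numpy as np
from sklearn.feature_selection import SelectFpr, f_classif

rng = np.random.default_rng(0)
X = rng.normal(size=(88, 200))            # 88 patients x 200 synthetic "genes"
y = rng.integers(0, 2, size=88)           # made-up recurrence labels
X[:, 0] += 3.0 * y                        # one gene made strongly recurrence-related

# keep only features whose ANOVA F-test p-value is below 0.05
selector = SelectFpr(score_func=f_classif, alpha=0.05)
X_selected = selector.fit_transform(X, y)
```

`selector.get_support()` returns the boolean mask of retained genes, which is how the 294-gene subset of Sect. 3.1 would be obtained.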

2.3 Gene Quantization

Since the values representing each gene vary over a wide range, appropriate estimation is difficult due to high noise and very specific values. Our experiments show that gene quantization can improve the accuracy of gene estimation and of recurrence prediction using the estimated genes. The gene quantization methods used in the experiments are linear quantization [23] and the K-mean method [24, 25].

Linear quantization. Linear quantization is a method to change continuous values into discrete values [23]. In this research, we apply linear quantization to the gene values to reduce the classification range from a continuous range to only k classes. Linear quantization simply divides the range of the data into k equal intervals and changes each value to the middle of the interval it belongs to. This allows the gene estimation step to proceed more easily.

K-mean. K-mean clustering is an algorithm to group objects or vectors into K groups [24, 25]. Grouping with K-mean is done by minimizing the sum of the squared distances between the data and the corresponding cluster centroids [24, 25]. Although linear quantization can split the information into multiple parts, it does not partition according to the overall data distribution but uses the same interval width throughout. As a result, a large group of similar values may be split apart if it falls exactly on a separation threshold. The K-mean method, on the other hand, groups similar data together by considering the distance between every gene value and every centroid, which makes the range of each data group more flexible. In this work, linear quantization and K-mean quantization are applied to quantize the same gene across all patients. For every gene, all patients' values are considered and quantized using linear quantization and K-mean quantization. We do

Genomics-Based Models for Recurrence Prediction …

45

Fig. 2 The artificial neural network used in NSCLC recurrence prediction with quantized genes

the quantization repeatedly for every gene. The value range of each gene across all patients is quantized to only k classes. The middle value of each range (for linear quantization) or the initially trained and saved centroid (for K-mean) is then assigned as the quantized gene value [25].
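The two quantizers can be sketched in NumPy. This is an illustrative implementation under our own naming, not the authors' code:

```python
import numpy as np

def linear_quantize(values, k):
    """Split [min, max] into k equal ranges; map each value to its range midpoint."""
    lo, hi = values.min(), values.max()
    edges = np.linspace(lo, hi, k + 1)
    idx = np.clip(np.digitize(values, edges[1:-1]), 0, k - 1)
    mids = (edges[:-1] + edges[1:]) / 2.0
    return mids[idx]

def kmean_quantize(values, k, iters=50, seed=0):
    """1-D K-mean: map each value to the centroid of its nearest cluster."""
    rng = np.random.default_rng(seed)
    centroids = rng.choice(values, size=k, replace=False).astype(float)
    for _ in range(iters):
        # assign every value to its nearest centroid, then recompute centroids
        assign = np.abs(values[:, None] - centroids[None, :]).argmin(axis=1)
        for j in range(k):
            if np.any(assign == j):
                centroids[j] = values[assign == j].mean()
    return centroids[assign]
```

Run per gene over the patient axis, either function collapses that gene's values to at most k distinct levels, mirroring the procedure described above.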

2.4 Classification

In this process, an artificial neural network (ANN) [26] is used to classify the quantized genes. The ANN is designed as shown in Fig. 2.

3 Experiments

All experiments in this study were performed with ten-fold cross-validation to obtain the average accuracy. In each fold, the data of 79–80 patients were used as the training set, and the data of the remaining 7–8 patients were assigned to the validation set. The area under the receiver operating characteristic (ROC) curve (AUC) [15] is also provided to evaluate the performance of each method. We ran the experiments on a personal computer with an Intel® Core™ i7-8700K CPU @ 3.20–4.60 GHz, 48 GB of random-access memory (RAM), and an RTX 2060 graphics card. The Keras-GPU library version 2.2.4 on Python 3.6 was used to run the experiments. The processed genes (selected and quantized with several different methods) are compared with the non-processed genes in terms of NSCLC recurrence prediction performance.
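The evaluation protocol can be sketched with scikit-learn, using an MLP as a stand-in for the Keras ANN of Fig. 2. The data are synthetic and the layer size and iteration count are illustrative assumptions:

```python
import numpy as np
from sklearn.model_selection import KFold
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(88, 294))     # 88 patients x 294 selected (synthetic) gene values
y = rng.integers(0, 2, size=88)    # made-up recurrence labels

accs = []
for train_idx, val_idx in KFold(n_splits=10, shuffle=True, random_state=0).split(X):
    clf = MLPClassifier(hidden_layer_sizes=(64,), max_iter=300, random_state=0)
    clf.fit(X[train_idx], y[train_idx])          # train on ~79-80 patients
    accs.append(clf.score(X[val_idx], y[val_idx]))  # validate on ~8-9 patients
mean_acc = float(np.mean(accs))
```

The reported numbers in Sect. 3 correspond to `mean_acc` (plus the analogous fold-averaged AUC) computed on the real data.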

3.1 Gene Selection Results

This section compares NSCLC recurrence prediction using different gene selection methods: non-selected, LASSO [19], F-test [20], and CHI-2 [21, 22]. In LASSO, the zero-coefficient genes are removed [19]. In

Table 2 Performance of the genomics method using different feature selection methods: selection efficiency (number of selected genes) and accuracy

Feature selection method   Selected genes   Accuracy
Non-selected               5587             0.8141
LASSO                      1123             0.8297
F-test                     294              0.9183
CHI-2                      2641             0.8122

F-test and CHI-2, the genes whose P-values are higher than the threshold of P < 0.05 are removed [20–22]. Only the genes that remain in the dataset are fed to the neural network. The gene selection efficiency (the number of selected genes) and the prediction performance of each selected gene set are shown in Table 2. From Table 2, the F-test method shows the highest efficiency in selecting the associated genes, reducing the genes from 5587 to only 294. Furthermore, the selected genes show better accuracy compared with the non-selected method, which uses all the genes as the model input.

3.2 Gene Quantization Results

This section compares NSCLC recurrence prediction using the different gene quantization methods with different group counts. For both the linear quantization method and the K-mean quantization method, the number of groups is varied from 2 to 10. The results are shown as line graphs for both accuracy and AUC in Fig. 3, where bold lines show accuracy and dotted lines show AUC. From Fig. 3, for group numbers higher than 3, the K-mean quantization results surpass the linear quantization results in both accuracy and AUC. Furthermore, when the group number is set higher than 5, the accuracy of the K-mean method exceeds that of the genomics method without gene quantization.

4 Conclusions

In this study, a genomics method for NSCLC recurrence prediction has been proposed. We use gene selection to select the genes related to recurrence and improve performance, and then pass the selected genes to a quantization method. We found that the best gene selection method is the F-test, which improves the accuracy from 81.41% (non-selected) to 91.83%. Besides the accuracy improvement, reducing the number of genes from 5587 to only 294 also reduces the computational cost. For gene quantization, K-mean quantization tends to be the most suitable


Fig. 3 Gene quantization performance comparison for recurrence prediction without quantization, with linear quantization, and with K-mean quantization for different group numbers

method because, with a suitable group number, the accuracy and AUC increase over the non-quantized gene set, from 91.83% (AUC = 89.13) to up to 93.42% (AUC = 91.83). Although gene selection and gene quantization can improve NSCLC recurrence prediction over the original gene data, gene data are relatively expensive compared with other kinds of data, such as computed tomography (CT) images, which are also popular in NSCLC recurrence prediction studies. In future work, we will apply CT images and gene data together to predict recurrence with maximum accuracy.

Acknowledgements This work was supported in part by the Grant-in-Aid for Scientific Research from the Japanese Ministry of Education, Culture, Sports, Science and Technology (MEXT) under Grants No. 20KK0234, No. 18H03267, and No. 20K21821.


References

1. Zarogoulidis, K., Zarogoulidis, P., Darwiche, K., Boutsikou, E., Machairiotis, N., Tsakiridis, K., Spyratos, D., et al.: Treatment of non-small cell lung cancer (NSCLC). J. Thorac. Dis. 5, S389 (2013)
2. Jemal, A., et al.: Global cancer statistics. CA Cancer J. Clin. 61(2), 69–90 (2011)
3. Thomas, P., Rubinstein, L., Lung Cancer Study Group: Cancer recurrence after resection: T1 N0 non-small cell lung cancer. Ann. Thorac. Surg. 49(2), 242–247 (1990)
4. Bareschino, M.A., et al.: Treatment of advanced non-small cell lung cancer. J. Thorac. Dis. 3(2), 122 (2011)
5. Uramoto, H., Tanaka, F.: Recurrence after surgery in patients with NSCLC. Transl. Lung Cancer Res. 3(4), 242 (2014)
6. Lee, E.-S., et al.: Prediction of recurrence-free survival in postoperative non-small cell lung cancer patients by using an integrated model of clinical information and gene expression. Clin. Cancer Res. 14(22), 7397–7404 (2008)
7. Huynh, E., et al.: Associations of radiomic data extracted from static and respiratory-gated CT scans with disease recurrence in lung cancer patients treated with SBRT. PLoS One 12(1) (2017)
8. Kato, S., et al.: Computed tomography appearances of local recurrence after stereotactic body radiation therapy for stage I non-small-cell lung carcinoma. Jpn. J. Radiol. 28(4), 259–265 (2010)
9. Fehrenbach, U., et al.: Tumour response in non-small-cell lung cancer patients treated with chemoradiotherapy—can spectral CT predict recurrence? J. Med. Imaging Radiat. Oncol. 63(5), 641–649 (2019)
10. Mattonen, S.A., et al.: Early prediction of tumor recurrence based on CT texture changes after stereotactic ablative radiotherapy (SABR) for lung cancer. Med. Phys. 41(3), 033502 (2014)
11. Aonpong, P., et al.: Comparison of machine learning-based radiomics models for early recurrence prediction of hepatocellular carcinoma. J. Image Graph. 7(4) (2019)
12. Kuang, P., Wei-Na, C., Qiao, W.: Preview on structures and algorithms of deep learning. In: 2014 11th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP). IEEE (2014)
13. Pennes, D.R., et al.: Chest wall invasion by lung cancer: limitations of CT evaluation. Am. J. Roentgenol. 144(3), 507–511 (1985)
14. Buettner, R., Wolf, J., Thomas, R.K.: Lessons learned from lung cancer genomics: the emerging concept of individualized diagnostics and treatment. J. Clin. Oncol. 31(15), 1858–1865 (2013)
15. Fawcett, T.: An introduction to ROC analysis. Pattern Recogn. Lett. 861–874 (2006)
16. Bakr, S., Gevaert, O., Echegaray, S., Ayers, K., Zhou, M., Shafiq, M., Zheng, H., Zhang, W., et al.: Data for NSCLC radiogenomics collection. Cancer Imaging Arch. (2017). https://doi.org/10.7937/K9/TCIA.2017.7hs46erv
17. Clark, K., Vendt, B., Smith, K., Freymann, J., Kirby, J., Koppel, P., Moore, S., Phillips, S., Maffitt, D., Pringle, M., Tarbox, L., Prior, F.: The Cancer Imaging Archive (TCIA): maintaining and operating a public information repository. J. Digit. Imaging 26(6), 1045–1057 (2013)
18. Lambin, P., et al.: Radiomics: extracting more information from medical images using advanced feature analysis. Eur. J. Cancer 48(4), 441–446 (2012)
19. Ying, Z., Lan, H., Yanqi, H., Shuting, C., Penqi, W., Weitao, Y., Zaiyi, L., Changhong, L.: CT-based radiomics signature: a potential biomarker for preoperative prediction of early recurrence in hepatocellular carcinoma. Abdom. Radiol. (2017)
20. Gaddis, M.L.: Statistical methodology: IV. Analysis of variance, analysis of covariance, and multivariate analysis of variance. Acad. Emerg. Med. 5(3), 258–265 (1998)
21. Lancaster, H.O.: The Chi-squared Distribution. Wiley (1969)
22. McHugh, M.L.: The Chi-square test of independence. Biochem. Med. (Zagreb), 143–149 (2013)
23. Gray, R.M., Neuhoff, D.L.: Quantization. IEEE Trans. Inf. Theory 44(6), 2325–2383 (1998)
24. Teknomo, K.: K-means clustering tutorial. Medicine 100(4), 3 (2006)

Genomics-Based Models for Recurrence Prediction …

49

25. Ahmad, A., Lipika, D.: A k-mean clustering algorithm for mixed numeric and categorical data. Data Knowl. Eng. 63(2), 503–527 (2007) 26. Hassoun, M.H.: Fundamentals of Artificial Neural Networks. MIT press (1995)

IDH Mutation Status Prediction by Modality-Self Attention Network Xinran Zhang, Yutaro Iwamoto, Jingliang Cheng, Jie Bai, Guohua Zhao, Xian-Hua Han, and Yen-Wei Chen

Abstract Isocitrate dehydrogenase (IDH) status is an important basis for the diagnosis of gliomas in the 2016 World Health Organization classification scheme, and a strong relationship exists between IDH mutation status and glioma prognosis. The preoperative prediction of IDH status is therefore essential for the treatment of gliomas; however, existing medical methods cannot predict IDH status before an operation. In this study, we propose a modality self-attention network to predict IDH mutation status from multi-modality magnetic resonance imaging images. The proposed method predicts the importance of each modality for the classification task, calculates weights, and then uses the weighted images for training. Moreover, we select a light and high-performance self-attention network for the classification to address the overfitting problem on the glioma dataset of the First Affiliated Hospital of Zhengzhou University (FHZU). The proposed method achieved an F1-score of 0.6570 on the FHZU dataset, which is better than SE-Net (0.2563), the method proposed by Choi et al. (0.3999), and SA-Net (0.5245).

Keywords Isocitrate dehydrogenase · Glioma · Attention · Deep learning

X. Zhang · Y. Iwamoto · Y.-W. Chen (B)
Ritsumeikan University, 1-1-1 Noji-higashi, Kusatsu 525-8577, Shiga, Japan
e-mail: [email protected]

J. Cheng · J. Bai · G. Zhao
Department of Magnetic Resonance Imaging, The First Affiliated Hospital of Zhengzhou University, Zhengzhou 450004, China

X.-H. Han
Yamaguchi University, 1677-1 Yoshida, Yamaguchi 753-8511, Japan

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021
Y.-W. Chen et al. (eds.), Innovation in Medicine and Healthcare, Smart Innovation, Systems and Technologies 242, https://doi.org/10.1007/978-981-16-3013-2_5

1 Introduction

Brain tumors are classified as primary and secondary. Glioma is the most prevalent primary brain tumor [1], and glioblastoma (GBM) is the most aggressive glioma. Less than 5% of patients survive five years following a diagnosis of glioblastoma [2].


Magnetic resonance imaging (MRI) is a common method for glioma diagnosis. MRI of glioma usually produces images in four modalities, i.e., T1-weighted imaging (T1WI), T2-weighted imaging (T2WI), contrast-enhanced T1-weighted imaging (CE-T1WI), and fluid-attenuated inversion recovery (FLAIR). Each modality has its own characteristics: the T1WI image shows the structure clearly, the T2WI image is used to localize the tumor, CE-T1WI reveals the internal condition of the tumor, and the FLAIR image shows the lesion's location with water suppression. Isocitrate dehydrogenase (IDH) status is an important basis for diagnosis in the 2016 World Health Organization (WHO) classification scheme for gliomas [3]. In low-grade gliomas, IDH mutated gliomas show a prognosis similar to IDH wildtype gliomas; however, IDH mutated GBM shows a better prognosis than IDH wildtype GBM [4]. Thus, the preoperative prediction of IDH status is essential for glioma treatment, yet conventional methods cannot predict IDH mutation status before surgery and use it for treatment planning. A convolutional neural network (CNN) is a representative deep learning method that automatically extracts the features required for image classification, and CNN-based IDH prediction methods have recently been proposed [5]. However, the existing methods use the four modalities as a 4-channel image, and the differences among individual modalities are not considered. In this study, we propose a modality self-attention network to predict IDH mutation status from multi-modality MRI images. The proposed method first predicts the importance of each modality for the classification task and calculates weights, and then uses the weighted images for training. Furthermore, we selected a light and high-performance self-attention network for classification and obtained the best result on the FHZU dataset.

2 Method

The proposed network architecture is shown in Fig. 1. The input is a 4-channel image, where each channel is an MRI image in a different modality. We used the multi-modality attention block to calculate the weights of each modality. Finally, we used the weighted images for IDH mutation status prediction.

Fig. 1 Architecture of the proposed modality self-attention network


2.1 Multi-Modality Attention Block

Inspired by SE-Net [6], we propose a multi-modality attention block, whose architecture is shown in Fig. 2. First, we use global average pooling (GAP) to calculate the average of each modality image:

z_k = \frac{1}{H \times W} \sum_{i=1}^{H} \sum_{j=1}^{W} x_k(i, j)    (1)

Here, k = 1, 2, \ldots, C, and C indicates the number of modalities; W and H represent the width and height of the input image, z is the result of GAP, and x indicates the pixel value. We use the averaged feature vector to predict the weight of each modality with fully connected (FC) layers:

s = [s_1, s_2, \ldots, s_C] = \sigma(W_2 \,\mathrm{ReLU}(W_1 z))    (2)

Here, s is the weight vector for the four modalities, \sigma indicates the sigmoid function, and W_1 and W_2 represent the weight matrices of the FC layers. Finally, each modality image is multiplied by its weight:

\tilde{x}_k = s_k x_k    (3)

Fig. 2 Architecture of the multi-modality attention block


Here, \tilde{x}_k are the weighted multi-modality MRI images. Note that we did not apply channel attention to each convolutional layer, because the additional parameters would lead to overfitting.
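The three equations above amount to only a few lines of code. The following pure-Python sketch is our own illustration, not the authors' implementation; the weight matrices `W1` and `W2` stand in for the trained FC layers, and the bottleneck size is arbitrary:

```python
import math

def modality_attention(images, W1, W2):
    """Sketch of the multi-modality attention block.
    images: list of C modality images (2-D lists of pixel values).
    W1 (r x C) and W2 (C x r) stand in for the two FC layers."""
    # Eq. (1): global average pooling of each modality
    z = [sum(map(sum, img)) / (len(img) * len(img[0])) for img in images]
    # Eq. (2): s = sigmoid(W2 . ReLU(W1 . z))
    h = [max(0.0, sum(w * v for w, v in zip(row, z))) for row in W1]
    s = [1.0 / (1.0 + math.exp(-sum(w * v for w, v in zip(row, h)))) for row in W2]
    # Eq. (3): multiply each modality image by its predicted weight
    return s, [[[s[k] * v for v in row] for row in img]
               for k, img in enumerate(images)]
```

In a real network, `W1` and `W2` would be learned jointly with the downstream classifier, so the block discovers which modalities matter for the task.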

2.2 Self-Attention Network [7]

The convolution operation can be defined as follows:

Y_i = \sum_{j \in R(i)} X_j k_j    (4)

Here, Y denotes the features computed by the convolution, X is the input, k is the convolution kernel, and R(i) is the receptive field of position i. The convolution operation has the following drawbacks: it lacks rotation invariance; the number of parameters increases as the receptive field becomes large; and the filter is stationary (the same kernel is applied at every position). Self-attention [7] was proposed to solve these problems. We define the self-attention operation as follows:

Y_i = \sum_{j \in R(i)} \gamma(\delta(x_i, x_j)) \odot \beta(x_j)    (5)

Here, \delta is the relation function between the target pixel and its local neighbors. We use subtraction as the relation function:

\delta(x_i, x_j) = \varphi(x_i) - \psi(x_j)    (6)

The output of the relation function is a single vector that represents the features x_i and x_j; \gamma, \varphi, \psi, and \beta are 1 × 1 convolution layers, and the output of \gamma is the weight of pixel x_j. The self-attention network does not use a stationary convolution kernel. Instead, the weights are calculated from the relationship between the target pixel and its local neighbor pixels using 1 × 1 convolution layers. In this case, the number of parameters does not increase when the receptive field becomes larger, and rotation invariance is achieved. The architecture of the self-attention operation is shown in Fig. 3.
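As an illustration of Eqs. (5) and (6), here is a minimal 1-D sketch in which scalar multipliers stand in for the 1 × 1 convolution layers \gamma, \varphi, \psi, and \beta (an assumption made purely to keep the example self-contained):

```python
def self_attention_1d(x, phi, psi, beta, gamma, radius=1):
    """Sketch of the self-attention operation (Eqs. 5-6) on a 1-D signal.
    phi, psi, beta, gamma are scalars standing in for the 1 x 1 conv layers;
    radius defines the local footprint R(i)."""
    n, y = len(x), []
    for i in range(n):
        acc = 0.0
        for j in range(max(0, i - radius), min(n, i + radius + 1)):
            delta = phi * x[i] - psi * x[j]   # Eq. (6): subtraction relation
            weight = gamma * delta            # gamma maps the relation to a weight
            acc += weight * (beta * x[j])     # Eq. (5): weighted aggregation
        y.append(acc)
    return y
```

In the actual network these operators act on feature vectors and the aggregation runs over 2-D footprints, but the structure — weights computed from pairwise relations rather than from a fixed kernel — is the same.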

Fig. 3 Architecture of the self-attention operation

3 Experiments

3.1 Dataset

The dataset for this study was provided by the FHZU; details are shown in Table 1. The dataset contains 197 patients with IDH mutation and 555 patients with IDH wildtype. Each patient has four-modality MR images (i.e., T1, T2, T1ce, and FLAIR images). All IDH mutation labels are based on patient clinical outcomes. Based on the imaging presentation, the physicians also labeled the abnormal areas of individual tumors, which are used for region-of-interest (ROI) cuts. The four modality images of a typical tumor are shown in Fig. 4; note that the images of the different modalities correspond to each other. The ROI images are used as the input images for the prediction.

Table 1 Dataset details

Dataset   IDH mutated                 IDH wildtype
Train     384 slices (177 patients)   491 slices (276 patients)
Test      33 slices (20 patients)     64 slices (29 patients)

We divided the dataset into training and test sets, as shown in Table 1. We trained SE-Net [6] and the method of Choi et al. [5] from a model pretrained on ImageNet for 100 epochs, and trained the proposed network from scratch for 100 epochs. We used Adam with batch size 8 and a 0.01 learning rate on an Nvidia GeForce RTX 2080Ti 11 GB GPU. We registered all MRI images to 0.85 × 0.85 × 6.5 mm and used random horizontal flips for data augmentation. We selected only the 1–3 slices with the largest tumor area for each case, because the border slices contain less information.

3.2 Result

Fig. 4 Representative images

Table 2 Result on FHZU dataset

Method            Accuracy   Precision   Recall   F1 score
SE-Net-18 [6]     0.7010     0.8333      0.1515   0.2563
Choi et al. [5]   0.7216     0.75        0.2727   0.3999
SA-Net [7]        0.7010     0.5714      0.4848   0.5245
Proposed method   0.7525     0.6216      0.6969   0.6570

The results on the FHZU dataset are shown in Table 2. Although the traditional deep learning methods (e.g., SE-Net and the method of Choi et al.) have high precision, most IDH mutated cases are not detected, resulting in a low F1-score. The self-attention network [7] has a lower precision; however, approximately 50% of the IDH mutated cases are detected, so it achieves a better F1-score than the traditional methods. Building on SA-Net [7], the proposed method further improves the detection rate of mutated cases and achieves the best F1-score.
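As a sanity check, the F1-scores in Table 2 are consistent with the reported precision and recall, since F1 is their harmonic mean:

```python
def f1_score(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)
```

For example, `f1_score(0.8333, 0.1515)` is approximately 0.2564, matching the SE-Net-18 row, which makes explicit how a high precision combined with a very low recall still yields a poor F1-score.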

4 Conclusion In this study, we proposed a modality self-attention network to predict IDH mutation status on multi-modality MRI images. We considered the effect of each modality on the prediction results of IDH and achieved the best performance on the FHZU dataset. Acknowledgements This work is supported in part by the Grant-in-Aid for Scientific Research from the Japanese Ministry for Education, Science, Culture and Sports (MEXT) under the Grant No. 20KK0234, No. 20K21821, and in part by Zhejiang Lab Program under the Grant No. 2020ND8AD01.

References

1. Ostrom, Q.T., Gittleman, H., Xu, J., et al.: CBTRUS statistical report: primary brain and other central nervous system tumors diagnosed in the United States in 2009–2013. Neuro Oncol. 18(5), v1–v75 (2016)
2. De Vleeschouwer, S. (ed.): Glioblastoma [Internet]. Codon Publications, Brisbane (AU) (2017)
3. Louis, D.N., Perry, A., Reifenberger, G., et al.: The 2016 World Health Organization classification of tumors of the central nervous system: a summary. Acta Neuropathol. 131(6), 803–820 (2016)
4. Yan, H., Parsons, D.W., Jin, G., et al.: IDH1 and IDH2 mutations in gliomas. N. Engl. J. Med. 360(8), 765–773 (2009)
5. Choi, Y.S., Bae, S., Chang, J.H., et al.: Fully automated hybrid approach to predict the IDH mutation status of gliomas via deep learning and radiomics. Neuro Oncol. (2020)
6. Hu, J., Shen, L., Sun, G.: Squeeze-and-excitation networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2018)
7. Zhao, H., Jia, J., Koltun, V.: Exploring self-attention for image recognition. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (2020)

Medical Watermarking

A Novel Robust Watermarking Algorithm for Encrypted Medical Image Based on Bandelet-DCT Yangxiu Fang, Jing Liu, Jingbing Li, Dan Yi, Wenfeng Cui, Xiliang Xiao, Baoru Han, and Uzair Aslam Bhatti

Abstract Digital watermarking technology is a good way to solve the problem of image information security. However, in previous studies the watermark is encrypted but the host image is not, which causes leakage of the private information of the images themselves; when the carrier image is a medical image, even more care must be taken. Therefore, a new robust zero-watermarking algorithm based on Bandelet-DCT (Bandelet transform and discrete cosine transform) for medical images in the DWT-DCT (discrete wavelet transform and discrete cosine transform) encryption domain is proposed. Firstly, the original medical image is encrypted in a transform domain based on a Logistic chaotic map to enhance its security. Then Bandelet-DCT is applied to extract the feature sequences of the encrypted medical images, which are preprocessed by SIFT (scale-invariant feature transform). In the watermark embedding and extraction stage, zero-watermarking technology is used to ensure that the region of interest of the medical image remains unchanged, and the robustness of the algorithm is evaluated through the correlation coefficient between the original watermark and the attacked watermark. Experimental results show that the proposed algorithm has strong robustness, can effectively solve the problem of information leakage, and has a strong anti-attack ability and certain application prospects.

Keywords Medical images · Zero-watermarking · Encrypted domain · Bandelet-DCT · Robustness · Scale invariant feature transform

Y. Fang · J. Li (B) · D. Yi · W. Cui · X. Xiao
School of Information and Communication Engineering, Hainan University, Haikou, Hainan, P.R. China
e-mail: [email protected]

Y. Fang
e-mail: [email protected]

D. Yi
e-mail: [email protected]

W. Cui
e-mail: [email protected]

X. Xiao
e-mail: [email protected]

J. Li
State Key Laboratory of Marine Resource Utilization, South China Sea, Hainan University, Haikou, Hainan, P.R. China

J. Liu
Research Center for Healthcare Data Science, Zhejiang Lab, Hangzhou, Zhejiang, P.R. China
e-mail: [email protected]

B. Han
College of Medical Informatics, Chongqing Medical University, Chongqing, P.R. China
e-mail: [email protected]

U. A. Bhatti
School of Geography (Remote Sensing and GIS Lab), Nanjing Normal University, Nanjing, Jiangsu, P.R. China
e-mail: [email protected]

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021
Y.-W. Chen et al. (eds.), Innovation in Medicine and Healthcare, Smart Innovation, Systems and Technologies 242, https://doi.org/10.1007/978-981-16-3013-2_6

1 Introduction

In recent years, with the gradual maturation of 5G technology, traditional medical care has increasingly moved to the cloud, and modern diagnostic tools such as medical images have been widely used for prediction and diagnosis in the medical industry [1–3]. More and more medical images containing a large amount of private patient information circulate on the Internet or are stored in the cloud. Preventing the leakage of patients' private information has always been a major difficulty in medical image research, and digital watermarking technology is a good solution to this problem [4]. The particularity of medical images requires that embedding a watermark must not affect the doctor's diagnosis [5]. Most commonly used digital image watermarking technologies embed and extract the watermark in the plaintext domain. Therefore, once a plaintext-domain medical image is intercepted during Internet transmission, the private information carried by the carrier image is exposed [6]. In other words, watermarking in the plaintext domain cannot guarantee the security of the carrier image itself, and we should be especially cautious when the carrier image is a medical image. Performing the watermarking operations in the ciphertext domain is therefore a good solution [7, 8]: unlike watermarking in the plaintext domain, embedding and extracting the watermark in the ciphertext domain better protects the information security of the carrier image. Homomorphic encryption allows the encrypted carrier image and watermark to be safely handed over to a third party for processing, without worrying about hidden security issues such as information leakage, and makes it convenient to use the third party's powerful resources to implement the watermarking [9–11]. To perform related


watermarking operations in the ciphertext domain, we need to encrypt the carrier image. Many researchers have made outstanding contributions to image encryption. Avudaiappan et al. [3] used a dual encryption procedure to encrypt medical images. Xiong et al. [12] proposed a reversible image hiding algorithm based on the integer wavelet transform, histogram shifting, and orthogonal decomposition. Nematzadeh et al. [7] proposed a medical image encryption method based on a hybrid model of a modified genetic algorithm (MGA) and coupled map lattices, which not only has excellent encryption performance but can also resist various typical attacks. Although there are many image encryption methods, image encryption algorithms that can be used for robust watermarking are not very mature, especially for medical images. Bouslimi et al. [13] proposed a new encryption scheme for image data hiding, but the encrypted image cannot guarantee the robustness of the embedded watermark. Ismail et al. [14] proposed an image encryption algorithm based on a generalized double-humped (DH) logistic map generated by a pseudo-random sequence, but a watermark embedded after encryption has poor quality under traditional and geometric attacks. In summary, we observed that directly applying some plaintext-domain robust watermarking algorithms to the encrypted domain yields unsatisfactory results. For these reasons, this paper proposes a new robust watermarking algorithm for encrypted medical images based on Bandelet-DCT and chaotic mapping. It adopts a zero-watermarking algorithm: by selecting appropriate visual feature vectors to process the watermark in the ciphertext domain, the security of both the watermark and the medical image itself is guaranteed. We show that the algorithm has good robustness against common attacks and geometric attacks.

2 The Fundamental Theory 2.1 SIFT(scale-invariant feature transform) The SIFT was proposed by Lowe [15], and proved that it has ideal robustness to the rotation, translation, scaling, and projection transformation. This algorithm is a local feature description operator that uses local image features to extract key points. These key points refer to the maximum value of the image gray value. Its ingenuity lies in the adoption of the gaussian difference pyramid (see Fig. 1a). SIFT mainly realizes feature recognition through three processes: (1) Extract key points, as shown in Fig. 1b; (2) Add key point description information to obtain feature vectors (see Fig. 1c). Perform mechanism detection in the DOG (Difference of Gaussian) scale space to identify potential points of interest that are unchanged in scale and direction. Finally, the location and scale of the feature points are determined by fitting a fine model,


Fig. 1 a Gaussian pyramid and DOG pyramid; b Extract extreme key points; c Get SIFT feature vector

and the low-contrast and unstable edge feature points are removed to obtain a stable feature area.
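The Gaussian-difference idea behind the DOG pyramid can be sketched in a few lines. The following 1-D version is our own simplification (SIFT operates on 2-D images across a full scale pyramid); it shows that the DOG response peaks at an intensity extremum, which is what makes such points keypoint candidates:

```python
import math

def gaussian_blur_1d(signal, sigma):
    """Blur a 1-D signal with a sampled Gaussian kernel (radius 3*sigma)."""
    r = max(1, int(3 * sigma))
    kernel = [math.exp(-(t * t) / (2.0 * sigma * sigma)) for t in range(-r, r + 1)]
    total = sum(kernel)
    kernel = [k / total for k in kernel]
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for t in range(-r, r + 1):
            j = min(max(i + t, 0), len(signal) - 1)  # clamp at the borders
            acc += kernel[t + r] * signal[j]
        out.append(acc)
    return out

def dog(signal, sigma1, sigma2):
    """Difference of Gaussians: a band-pass response whose extrema
    are candidate keypoints (1-D analogue of the DOG scale space)."""
    b1 = gaussian_blur_1d(signal, sigma1)
    b2 = gaussian_blur_1d(signal, sigma2)
    return [a - b for a, b in zip(b1, b2)]
```

Subtracting the wider blur from the narrower one cancels smooth regions and leaves a strong response only where the intensity changes locally.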

2.2 Bandelet Transform

The second-generation Bandelet transform was proposed by Peyré and Mallat [16] in 2005. It effectively simplifies the first-generation algorithm and is an adaptive multi-scale geometric analysis method. Based on the geometric flow direction of the image gray levels, which vary regularly, a bandelet basis can be constructed that better extracts geometric features such as complex textures and edges of the image (as shown in Fig. 2); there is no edge effect when reconstructing the image, and the algorithm process is simple. The Bandelet transform has two advantages over the wavelet transform: (1) by taking full advantage of geometric regularity, high-frequency sub-band energy is


Fig. 2 Bandelet Transform: a The geometric flow direction of the original medical image; b Schematic diagram of geometric flow direction in sub-block

more concentrated, and non-zero coefficients are relatively reduced under the same quantization step; (2) Due to the quadtree structure and geometric flow information, the Bandelet coefficients can be rearranged, and the coefficient scanning mode is more flexible when encoding.

2.3 The Discrete Wavelet Transform (DWT)

In the field of digital image processing, the discrete wavelet transform [17] is a spatial-frequency multi-scale analysis method for an image. It can characterize the local features of the image in both the spatial domain and the frequency domain, and when used in digital watermarking it improves the robustness of the watermark against noise attacks and compression attacks. After a first-level wavelet decomposition, the image is split into four subbands with different directions and resolutions: the low-frequency approximation subband (LL), the horizontal high-frequency detail subband (HL), the vertical high-frequency detail subband (LH), and the diagonal high-frequency subband (HH). A two-level wavelet transform is shown in Fig. 3.

Fig. 3 Representation of 2-Levels of Wavelet Transform


The high-frequency sub-band embodies the detailed information of the image, that is, the texture area of the image, while the low-frequency sub-band contains the main visual content of the image and reflects the general characteristics of the image.
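A one-level decomposition is easy to sketch. The following pure-Python example uses the Haar wavelet purely for illustration (the paper does not fix a wavelet family): averages give the low-pass band and differences give the high-pass bands:

```python
def haar_dwt2(img):
    """One-level 2-D Haar DWT sketch (Haar chosen for simplicity).
    img: 2-D list with even dimensions. Returns the four subbands
    LL, HL, LH, HH as in the decomposition described in the text."""
    rows, cols = len(img), len(img[0])
    # horizontal pass: low-pass (pairwise averages) and high-pass (differences)
    low = [[(r[2 * j] + r[2 * j + 1]) / 2.0 for j in range(cols // 2)] for r in img]
    high = [[(r[2 * j] - r[2 * j + 1]) / 2.0 for j in range(cols // 2)] for r in img]

    def vertical(m):
        lo = [[(m[2 * i][j] + m[2 * i + 1][j]) / 2.0 for j in range(len(m[0]))]
              for i in range(rows // 2)]
        hi = [[(m[2 * i][j] - m[2 * i + 1][j]) / 2.0 for j in range(len(m[0]))]
              for i in range(rows // 2)]
        return lo, hi

    LL, LH = vertical(low)    # approximation and vertical detail
    HL, HH = vertical(high)   # horizontal and diagonal detail
    return LL, HL, LH, HH
```

Applying the same decomposition again to LL yields the two-level structure of Fig. 3.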

3 The Proposed Method

3.1 Encryption of Original Medical Images

Due to the particularity of medical images, it is necessary to perform our watermarking algorithm in the ciphertext domain. Figure 4 shows the encryption scheme for medical images. The specific steps are as follows:

(1) Perform the DWT on the original medical image I(i, j) to obtain the LL, HL, LH, and HH subband wavelet coefficients; then apply the DCT to each subband to obtain the coefficient matrix D(i, j):

{LL, HL, LH, HH} = DWT2(I(i, j))    (1)

D(i, j) = DCT2(LL, HL, LH, HH)    (2)

(2) Generate the Tent-map chaotic sequence X(j) and process it through the sign function sgn(x) to obtain the binary encryption matrix C(i, j); then apply the element-wise (dot) product to D(i, j) and C(i, j) to obtain the encrypted coefficient matrix ED′(i, j):

ED′(i, j) = D(i, j) ∗ C(i, j)    (3)

(3) Perform the IDCT on ED′(i, j) to obtain the encrypted subband wavelet coefficient matrix ED(i, j), and then perform the IDWT on ED(i, j) to obtain the encrypted medical image E(i, j):

ED(i, j) = IDCT2(ED′(i, j))    (4)

E(i, j) = IDWT2(ED(i, j))    (5)

Fig. 4 The process of original medical image encryption

Fig. 5 Watermarking embedding algorithm
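The chaotic-sequence encryption of step (2) can be sketched as follows; the Tent-map parameter `mu`, the seed, and the 0.5 binarization threshold are illustrative choices, not values from the paper:

```python
def tent_sequence(x0, n, mu=1.99):
    """Tent-map chaotic sequence X(j); mu and x0 are illustrative parameters."""
    xs, x = [], x0
    for _ in range(n):
        x = mu * x if x < 0.5 else mu * (1.0 - x)
        xs.append(x)
    return xs

def binary_matrix(seq, rows, cols):
    """C(i, j): sgn-style binarization of the chaotic sequence to +/-1."""
    return [[1 if seq[i * cols + j] >= 0.5 else -1 for j in range(cols)]
            for i in range(rows)]

def encrypt_coeffs(D, C):
    """Eq. (3): element-wise product ED'(i, j) = D(i, j) * C(i, j)."""
    return [[d * c for d, c in zip(dr, cr)] for dr, cr in zip(D, C)]
```

Because C(i, j) contains only ±1, multiplying by it a second time restores D(i, j) exactly; this is what makes decryption possible and underlies the homomorphic behavior discussed in Sect. 4.3.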

3.2 Watermarking Embedding Algorithm

Because of the particularity of medical images, watermark embedding and extraction must preserve the integrity of the medical image; this paper uses zero-watermarking technology to achieve this. To improve the robustness of the algorithm against attacks, SIFT preprocessing is applied to the encrypted medical image before feature extraction, establishing the local feature invariant region IR(i, j) of the encrypted medical image. Figure 5 shows the watermark embedding algorithm. The key is generated by combining the feature vector with the encrypted watermark:

Key(i, j) = HV(j) ⊕ EW(i, j)    (6)

3.3 Watermarking Extraction Algorithm

Figure 6 shows the watermark extraction algorithm. First, the feature vector V′(j) of the encrypted medical image E′(i, j) under test is extracted; then the hash function is used to process the key Key(i, j) and the feature vector V′(j) to extract the encrypted watermark EW′(j), and the restored watermark is obtained by decrypting it.
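The zero-watermarking step reduces to an XOR between the image's feature bits and the (encrypted) watermark bits; the helper names below are our own illustration of Eq. (6) and the extraction of Fig. 6:

```python
def make_key(feature_bits, watermark_bits):
    """Eq. (6): Key = feature bits XOR encrypted watermark bits.
    The key is stored; the image itself is never modified (zero-watermarking)."""
    return [f ^ w for f, w in zip(feature_bits, watermark_bits)]

def extract_watermark(key_bits, feature_bits):
    """XOR the stored key with the test image's feature bits; feature bits
    that survive an attack unchanged recover the watermark bits exactly."""
    return [k ^ f for k, f in zip(key_bits, feature_bits)]
```

Because XOR is an involution, the watermark is recovered perfectly wherever the feature vector is unaffected by the attack; the NC measure of Sect. 4 then quantifies how many bits survived.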


Fig. 6 Watermarking extraction algorithm

4 Experimental Results and Performance Analysis

To verify the effectiveness of the algorithm, an abdominal medical image (see Fig. 7a) and a 32-bit binary watermark image (see Fig. 7c) were selected for the experiments, and the peak signal-to-noise ratio (PSNR) and normalized correlation coefficient (NC) were used to evaluate the robustness of the algorithm against attacks. In the experiments, we carried out various common attacks and geometric attacks on the images embedded with watermarks.

\mathrm{PSNR} = 10 \log_{10} \frac{MN \max_{i,j} (I(i, j))^2}{\sum_i \sum_j (I(i, j) - I'(i, j))^2}    (7)

\mathrm{NC} = \frac{\sum_i \sum_j W(i, j) W'(i, j)}{\sum_i \sum_j W(i, j)^2}    (8)
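Eqs. (7) and (8) can be computed directly; the following sketch assumes the images and watermarks are given as 2-D lists of numbers:

```python
import math

def psnr(orig, attacked):
    """Eq. (7): peak signal-to-noise ratio (the two images must differ somewhere,
    otherwise the error sum is zero and PSNR is undefined)."""
    m, n = len(orig), len(orig[0])
    peak = max(max(row) for row in orig) ** 2
    err = sum((orig[i][j] - attacked[i][j]) ** 2
              for i in range(m) for j in range(n))
    return 10.0 * math.log10(m * n * peak / err)

def nc(w, w_attacked):
    """Eq. (8): normalized correlation between the original watermark
    and the watermark extracted after an attack."""
    num = sum(a * b for ra, rb in zip(w, w_attacked) for a, b in zip(ra, rb))
    den = sum(a * a for row in w for a in row)
    return num / den
```

NC equals 1.0 for a perfectly recovered watermark and decreases as extracted bits diverge from the original, which is how the tables below should be read.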

Fig. 7 Medical pictures and watermarks: a Original medical image; b The encrypted medical image; c Original binary watermark; d The encrypted watermark


Table 1 The values of PSNR and NC under common attacks based on Bandelet-DCT

                 Gaussian noise          JPEG compression
Common attacks   5%      15%     25%     1%      10%     20%
PSNR (dB)        13.42   10.08   8.99    25.87   31      33.23
NC               0.94    0.94    0.89    0.84    0.89    0.89

Fig. 8 Medical images under common attacks: a Gaussian noise level of 25%; b the extracted watermark of (a); c JPEG compression quality 1%; d the extracted watermark of (c)

4.1 Common Attacks

We conducted common attacks of different strengths on the encrypted medical images and calculated the NC values to test the robustness of the new algorithm (see Table 1). We observed that under the Gaussian noise attack, even when the noise intensity reaches 30%, the NC value is still 0.84, and when the JPEG compression quality is only 1%, the NC value also reaches 0.84. Figure 8 shows some of the attacked medical images and the extracted watermarks. These results show that the algorithm has strong robustness against common attacks.

4.2 Geometric Attacks

The robustness of watermarking algorithms against geometric attacks has long been a problem that researchers have been committed to solving. Table 2 shows the performance of the test images under common geometric attacks. From the data in the table, we can see the significant advantages of the proposed algorithm against geometric attacks. Even when the image is subjected to high-intensity geometric attacks, such as rotation by 40 degrees, scaling to 0.2, downward translation of 30%, or a square crop of 1/4, the NC value can still be maintained above 0.7, and the features of the extracted watermark images remain distinguishable (as shown in Fig. 9).


Fig. 9 Images under geometrical attacks: a Rotation (clockwise) 40°; b the extracted watermark of (a); c Scaling factor 0.2; d the extracted watermark of (c); e right translation 30%; f the extracted watermark of (e); g down distance 30%; h the extracted watermark of (g); i Middle square cropping 25%; j the extracted watermark of (i)

Table 2 The values of PSNR and NC under geometrical attacks based on Bandelet-DCT

Geometrical attacks    Intensity of attacks   PSNR (dB)   NC
Rotation (clockwise)   12°                    17.44       0.83
                       27°                    15.42       0.81
                       40°                    14.80       0.75
Scaling                × 0.2                  -           0.89
                       × 1.4                  -           0.81
                       × 2.4                  -           0.94
Up translation         5%                     18.92       0.82
                       15%                    15.36       0.83
                       30%                    12.38       0.69
Right translation      5%                     18.19       0.81
                       15%                    14.68       0.83
                       30%                    12.18       0.81
Cropping (square)      Upper left (1/8)       -           0.86
                       Middle (1/4)           -           0.89
                       Bottom right (1/8)     -           0.94

4.3 Comparison with the Unencrypted Algorithm

Using the same feature extraction method, we tested the robustness in both the plaintext domain and the ciphertext domain; the test results are shown in Fig. 10. The robustness of the algorithm in the encryption domain is similar to that in the plaintext domain, indicating that the encryption algorithm is homomorphic. A to G respectively represent the attacks: Gaussian noise 30%, JPEG compression 10%,


Fig. 10 Comparison with unencrypted algorithm

Rotation 16° (clockwise), Scaling × 0.6, Translation 15% (left), Translation 15% (up), Square Cropping (Middle 25%).

4.4 Comparison with Other Encrypted Algorithms

To better demonstrate the performance of the algorithm in this paper, a robustness comparison with other encryption algorithms is added. As can be seen from Fig. 11, the encryption algorithm in this paper is more stable and more robust against both common attacks and geometric attacks.

5 Conclusion

This paper proposed a new robust watermarking algorithm for encrypted medical images based on Bandelet-DCT and chaotic mapping. By selecting appropriate and reliable visual feature vectors to embed and extract the watermark in the ciphertext domain, the security of both the watermark and the medical image itself is guaranteed. No changes need to be made to the original medical image, and the encrypted image does not need to be decrypted to complete the entire process. Owing to the homomorphism of the encryption algorithm, we can safely hand over the watermark and carrier image to a third party for processing. The experimental results proved


Fig. 11 Comparison with other encrypted algorithms

that the proposed algorithm has good robustness against both common attacks and geometric attacks.

Acknowledgements This work was supported in part by the Natural Science Foundation of China under Grant 62063004 and 61762033, in part by the Hainan Provincial Natural Science Foundation of China under Grant 2019RC018 and 619QN246, by the Postdoctoral Science Foundation under Grant 2020TQ0293, by the Science and Technology Research Project of Chongqing Education Commission under Grant KJQN201800442, and by the General Project of Chongqing Natural Science Foundation under Grant cstc2020jcyj-msxmX0422.


Robust Zero Watermarking Algorithm for Encrypted Medical Images Based on DWT-Gabor

Xiliang Xiao, Jingbing Li, Dan Yi, Yangxiu Fang, Wenfeng Cui, Uzair Aslam Bhatti, and Baoru Han

Abstract To ensure the safety of medical images and patient information, this paper proposes a robust watermarking algorithm based on encrypted medical images. Firstly, the medical image is encrypted by combining DWT-DCT and tent mapping. Then, the features of the encrypted medical image are extracted by the DWT-Gabor transform. Finally, the scrambled watermark is embedded into the encrypted medical image using zero watermark technology. Experiments show that the algorithm in this paper has good robustness and invisibility, and can resist a certain degree of conventional attacks and geometric attacks, especially translation and crop attacks.

Keywords Robustness · Zero watermark · Encrypted domain · DWT-Gabor

X. Xiao · J. Li (B) · D. Yi · Y. Fang · W. Cui
School of Information and Communication Engineering, Hainan University, Haikou, Hainan, P.R. China
X. Xiao e-mail: [email protected]
D. Yi e-mail: [email protected]
Y. Fang e-mail: [email protected]
W. Cui e-mail: [email protected]

J. Li
State Key Laboratory of Marine Resource Utilization in the South China Sea, Hainan University, Haikou, Hainan, P.R. China

U. A. Bhatti
School of Geography (Remote Sensing and GIS Lab), Nanjing Normal University, Nanjing, Jiangsu, P.R. China

B. Han
College of Medical Informatics, Chongqing Medical University, Chongqing, P.R. China
e-mail: [email protected]

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021
Y.-W. Chen et al. (eds.), Innovation in Medicine and Healthcare, Smart Innovation, Systems and Technologies 242, https://doi.org/10.1007/978-981-16-3013-2_7


1 Introduction

Despite the rapid development of digitization, network security issues remain unavoidable. Medical images and patient information disseminated on the Internet face many hidden security risks: they are vulnerable to external damage, tampering, and leakage, and may even be used maliciously [1, 2]. Therefore, copyright protection and information protection for medical images are particularly important. Digital watermarking technology [3, 4], a branch of information hiding, conceals or embeds hidden information in images, audio, or video through signal processing, which can effectively improve the security and concealment of information. In recent years, digital watermarking has also been widely used in digital medical systems to protect medical images and patient information [5, 6].

Digital watermarking techniques can be roughly divided into two categories. The first embeds the watermark directly in the carrier image, realizing embedding and extraction in the spatial domain. For example, Eswaraiah et al. proposed a fragile watermark based on the region of interest (ROI) and the least significant bit (LSB), which embeds the watermark information in the region of interest of the medical image [7]; however, this method does not withstand geometric attacks well, and the choice of the region of interest is critical. The second category comprises robust watermarking algorithms based on the transform domain [8]; for example, Zermi et al. combined DWT-SVD with blind watermarking to ensure the integrity of image data and improve the robustness of the watermark [9].

These methods reduce the damage to the image and improve robustness, but they do not protect the original image and the watermark themselves. In medical image watermarking, the patient's identity information is implicit in the watermark, and the carrier image represents the patient's pathological information, so image encryption technology is particularly important [10, 11]. Miao et al. encrypted the watermark by Arnold scrambling, which realizes dual protection of the watermark and improves the security of the watermark information [12]. Rajagopalan et al. used a logistic map, DNA coding, and an LFSR to encrypt medical images [13]. Chiun et al. combined the logistic map with the sine map and the tent map to form an image encryption system and compared their advantages and disadvantages [14]. These image encryption techniques can protect the watermark and the original image well and further improve the security of the watermarking algorithm.

The Gabor transform [15–17] is widely used in image processing because of its strength in extracting image texture features, with applications such as fingerprint recognition, face recognition, and image segmentation; it has also been applied to digital watermarking [18, 19]. Fan et al. proposed a zero-watermarking algorithm based on the Gabor and DCT transforms: the Gabor transform is first applied to the image, DCT and SVD are then used to obtain the feature matrix, and finally an XOR with the encrypted watermark yields the key, realizing the watermark embedding [20]. This paper uses DWT-DCT to encrypt the image in the frequency domain,

Fig. 1 Discrete wavelet transform of an image: a single-level decomposition, b two-level decomposition, c three-level decomposition

and at the same time performs chaotic scrambling on the watermark; it then uses the DWT-Gabor algorithm to extract the features of the encrypted image and combines zero-watermark technology to embed and extract the watermark. This method guarantees the robustness and invisibility of the watermark information and also enhances the security of the watermark and the carrier medical image during transmission.

2 Basic Theory

2.1 Discrete Wavelet Transform (DWT)

The discrete wavelet transform decomposes a signal at different scales. It is composed of a set of low-pass and high-pass filters that extract the low-frequency and high-frequency components of the signal: the low-frequency component represents the approximate information of the signal, and the high-frequency component represents its details. The 2D-DWT applies the one-dimensional discrete wavelet transform to the image in the horizontal and vertical directions, so that four subbands are obtained, as shown in Fig. 1.
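The single-level decomposition described above can be sketched in a few lines. This is a minimal illustration using unnormalized Haar averaging/differencing filters; the function name and the subband labeling convention are assumptions of this sketch (library implementations such as PyWavelets use orthonormal filters):

```python
import numpy as np

def haar_dwt2(img):
    """One level of a 2D Haar DWT: returns the LL, HL, LH, HH subbands."""
    img = img.astype(float)
    # 1D Haar along rows: average (low-pass) and difference (high-pass)
    lo = (img[:, 0::2] + img[:, 1::2]) / 2.0
    hi = (img[:, 0::2] - img[:, 1::2]) / 2.0
    # 1D Haar along columns of each result gives the four subbands
    ll = (lo[0::2, :] + lo[1::2, :]) / 2.0
    lh = (lo[0::2, :] - lo[1::2, :]) / 2.0
    hl = (hi[0::2, :] + hi[1::2, :]) / 2.0
    hh = (hi[0::2, :] - hi[1::2, :]) / 2.0
    return ll, hl, lh, hh

img = np.arange(16.0).reshape(4, 4)
ll, hl, lh, hh = haar_dwt2(img)
print(ll.shape)  # (2, 2) -- each subband is half the size in each direction
```

Repeating the function on the LL subband yields the two- and three-level decompositions of Fig. 1.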

2.2 Gabor Transform

The Gabor transform is obtained from a Gaussian function and a complex sine function. It is a windowed Fourier transform with good direction and scale selectivity. By generating a set of self-similar filters, it can extract the texture features of images at different directions and scales. The expression of the Gabor mother wavelet g(x, y) is as follows:

g(x, y) = \frac{1}{2\pi \sigma_x \sigma_y} \exp\left[-\frac{1}{2}\left(\frac{x^2}{\sigma_x^2} + \frac{y^2}{\sigma_y^2}\right)\right] \exp(2\pi j w x)    (1)


where \sigma_x and \sigma_y are the standard deviations of the Gaussian function in the x- and y-directions, and w is the frequency of the complex sine function. A self-similar filter bank is generated from the mother wavelet by rotation and scaling:

g_{u,v}(x, y) = \alpha^{-u} g(x', y'), \quad \alpha > 1; \quad
x' = \alpha^{-u}(x\cos\theta + y\sin\theta); \quad
y' = \alpha^{-u}(-x\sin\theta + y\cos\theta)    (2)

where \alpha^{-u} is the scale modulation factor and \theta is the direction, with \alpha = (U_h/U_l)^{1/(S-1)} and \theta = v\pi/K, where S is the number of scales and K is the number of directions. The filter kernels are shown in Fig. 2 (5 scales and 8 directions are selected in this article).

Fig. 2 Gabor filter kernels

The image is convolved with each of the above filter kernels to extract image features:

W_{uv}(x, y) = \sum_{x_1} \sum_{y_1} I(x - x_1, y - y_1)\, g_{uv}(x_1, y_1)    (3)
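The filter bank of Eqs. (1)–(3) can be sketched as follows. The kernel size, sigma, and the center frequencies Ul and Uh are illustrative assumptions (the paper does not state them), and the circular FFT convolution stands in for Eq. (3):

```python
import numpy as np

def gabor_kernel(sigma_x, sigma_y, w, theta, scale, size=31):
    """One Gabor kernel: the mother wavelet of Eq. (1) evaluated on the
    rotated, scaled coordinates of Eq. (2)."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    xr = scale * (x * np.cos(theta) + y * np.sin(theta))
    yr = scale * (-x * np.sin(theta) + y * np.cos(theta))
    g = (1.0 / (2.0 * np.pi * sigma_x * sigma_y)
         * np.exp(-0.5 * (xr ** 2 / sigma_x ** 2 + yr ** 2 / sigma_y ** 2))
         * np.exp(2j * np.pi * w * xr))
    return scale * g  # the alpha^{-u} amplitude factor of Eq. (2)

S, K = 5, 8                 # 5 scales, 8 directions, as in the paper
Ul, Uh = 0.05, 0.4          # assumed lower/upper center frequencies
alpha = (Uh / Ul) ** (1.0 / (S - 1))
bank = [gabor_kernel(2.0, 2.0, Uh, v * np.pi / K, alpha ** (-u))
        for u in range(S) for v in range(K)]

# mean-magnitude response per scale/direction (Eq. (3) as circular
# convolution via the FFT), giving a 5 x 8 mean matrix M(u, v)
img = np.random.default_rng(0).random((64, 64))
M = np.empty((S, K))
for u in range(S):
    for v in range(K):
        resp = np.fft.ifft2(np.fft.fft2(img) *
                            np.fft.fft2(bank[u * K + v], s=img.shape))
        M[u, v] = np.abs(resp).mean()
print(M.shape)  # (5, 8)
```

The resulting 5 × 8 mean matrix is exactly the structure used by the feature extraction in Sect. 3.2.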

3 Algorithm Process

3.1 Medical Image Encryption

This paper uses a combination of the DWT-DCT algorithm and the Tent map to encrypt the image in the frequency domain, and the image can be restored well after decryption. The decryption process is the same as the encryption process (Fig. 3).

Fig. 3 Medical image encryption flowchart

The encryption process is as follows. First, perform a DWT on the original image to obtain the coefficient matrices LL, HL, LH, and HH, and then perform a DCT on these four matrices to obtain four DCT coefficient matrices (C1, C2, C3, C4). Meanwhile, set the initial value, generate a chaotic sequence through the Tent map and binarize it, select a sequence in an appropriate range, and dot-multiply it with the DCT coefficients. Finally, perform IDCT and IDWT transformations to obtain the encrypted medical image.
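The chaotic masking step can be sketched as follows. The tent-map parameter, the initial value, and the median-based binarization to a ±1 mask are assumptions of this illustration; it also shows why decryption can reuse the encryption process (multiplying by the same ±1 mask twice is the identity):

```python
import numpy as np

def tent_sequence(x0, n, mu=1.99):
    """Tent-map chaotic sequence: x_{k+1} = mu * min(x_k, 1 - x_k)."""
    xs = np.empty(n)
    x = x0
    for k in range(n):
        x = mu * min(x, 1.0 - x)
        xs[k] = x
    return xs

def binarize(seq):
    """Binarize the chaotic sequence to a +/-1 mask (median threshold)."""
    return np.where(seq >= np.median(seq), 1.0, -1.0)

mask = binarize(tent_sequence(0.3456, 16))
coeffs = np.arange(16.0)       # stand-in for a row of DCT coefficients
encrypted = coeffs * mask      # dot multiplication with the binary sequence
restored = encrypted * mask    # applying the same mask again restores them
print(np.allclose(restored, coeffs))  # True
```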

3.2 Feature Vector Extraction

In this paper, DWT and Gabor transforms are used to extract the features of the encrypted medical image and obtain a 32-bit feature vector, as shown in Fig. 4.

First, perform a 3-level DWT decomposition on the encrypted medical image, extract the low-frequency coefficient matrix LL3, and perform a Gabor transformation on it to obtain Gabor feature matrices at 5 scales and 8 directions; take the mean value of each matrix to form a 5 × 8 mean matrix M(u, v). Sort the mean values of the different directions at the same scale in descending order, read the first 32 values starting from the first column, and compare each value with the next one (if the previous value is greater than the next, the bit is judged as 1, otherwise 0) to obtain the 32-bit feature vector.

Fig. 4 Feature vector extraction based on DWT-Gabor
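The sign-of-difference step can be sketched as follows. How exactly the 32 values are read off the sorted 5 × 8 matrix is an assumption here (this sketch sorts each scale's directions descending and reads the values in row order):

```python
import numpy as np

def feature_bits(M):
    """Derive a 32-bit feature vector from a 5 x 8 mean matrix M.

    Each row (one scale) is sorted in descending order; 33 consecutive
    values then give 32 pairwise comparisons: bit = 1 if the previous
    value is greater than the next, else 0. The read-off order is an
    illustrative assumption.
    """
    s = np.sort(M, axis=1)[:, ::-1].ravel()
    vals = s[:33]
    return (vals[:-1] > vals[1:]).astype(int)

rng = np.random.default_rng(1)
bits = feature_bits(rng.random((5, 8)))
print(len(bits))  # 32
```

Because only ordering relations (not coefficient magnitudes) enter the bits, the vector is stable under attacks that roughly preserve the ranking of the Gabor responses.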

Fig. 5 Watermark embedding flow chart

3.3 Watermark Embedding

First, encrypt the watermark image: generate a chaotic sequence X(j) through the logistic map and binarize it to obtain a binary encryption matrix C(i, j), which is XORed with the original watermark W(i, j) to obtain the chaotic encrypted watermark BW(i, j). Second, use the DWT-Gabor algorithm described above to extract the 32-bit feature vector V(j) of the encrypted medical image. Finally, XOR the encrypted watermark with the feature vector to obtain the logical secret key K(i, j), which is saved for watermark extraction (Fig. 5).

3.4 Watermark Extraction

Watermark extraction is the reverse of watermark embedding: extract the feature vector V'(j) of the encrypted medical image to be tested, XOR it with the reserved secret key K(i, j) to recover the encrypted watermark BW'(i, j), and then XOR the encrypted watermark with the same binary encryption matrix to restore the watermark (Fig. 6).
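The embedding (Sect. 3.3) and extraction (Sect. 3.4) steps reduce to XOR operations, so the whole round trip can be checked with a few lines. The vector sizes here are illustrative (the paper's watermark is 32 × 32):

```python
import numpy as np

rng = np.random.default_rng(42)

V = rng.integers(0, 2, 32)   # 32-bit feature vector of the encrypted image
W = rng.integers(0, 2, 32)   # watermark bits (flattened, illustrative size)
C = rng.integers(0, 2, 32)   # binary encryption matrix from the logistic map

# embedding: nothing is written into the image (zero watermark); only a key is kept
BW = W ^ C                   # chaotically encrypted watermark
K = BW ^ V                   # logical secret key, saved for extraction

# extraction: recompute the feature vector of the (possibly attacked) image
V2 = V.copy()                # here: no attack, so the features are unchanged
BW2 = K ^ V2                 # recover the encrypted watermark
W2 = BW2 ^ C                 # undo the chaotic scrambling
print(np.array_equal(W2, W))  # True
```

Under an attack, V2 differs from V in some bits, and exactly those bits flip in the recovered watermark, which is what the NC value measures.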

4 Experimental Results

In this paper, experiments are carried out on a 512 × 512 pixel shoulder joint medical image, which is encrypted and then attacked; a 32 × 32 pixel image containing "HN" is used as the watermark. The PSNR value reflects the degree of damage to the attacked image: the lower the value, the more severe the damage. The NC value represents the degree of correlation between the original watermark and the extracted, restored watermark (Fig. 7).


Fig. 6 Watermark extraction flow chart

Fig. 7 a original image, b encrypted image, c restored image, d original watermark, e encrypted watermark

PSNR = 10 \log_{10} \frac{MN \max_{i,j} (I(i,j))^2}{\sum_i \sum_j (I(i,j) - I'(i,j))^2}    (4)

NC = \frac{\sum_i \sum_j W(i,j)\, W'(i,j)}{\sum_i \sum_j W(i,j)^2}    (5)

where I(i, j) and I'(i, j) denote the original and attacked images of size M × N, and W(i, j) and W'(i, j) denote the original and extracted watermarks.
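Eqs. (4) and (5) can be computed directly; the helper names below are illustrative:

```python
import numpy as np

def psnr(orig, attacked):
    """Eq. (4): peak signal-to-noise ratio in dB."""
    orig = orig.astype(float)
    mse = np.mean((orig - attacked.astype(float)) ** 2)
    return 10.0 * np.log10(orig.max() ** 2 / mse)

def nc(w, w2):
    """Eq. (5): normalized correlation of original and extracted watermark."""
    w = w.astype(float)
    w2 = w2.astype(float)
    return (w * w2).sum() / (w * w).sum()

a = np.full((8, 8), 200.0)
b = a + 10.0                     # uniform distortion of 10 gray levels
print(round(psnr(a, b), 2))      # 26.02
w = np.ones(32)
print(nc(w, w))                  # 1.0 for a perfectly recovered watermark
```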

4.1 Conventional Attacks

Three conventional attacks are tested in this article: Gaussian noise, JPEG compression, and median filtering. It can be seen from Table 1 that when the image suffers 25% Gaussian noise, the NC value is 0.73; when the image suffers 1% JPEG compression, the NC value is 1.00; and when the image is subjected to 7 × 7 median filtering applied thirty times, the NC value is 0.87. It can be seen from Fig. 8 that the watermark can still be restored well after the image undergoes fairly strong conventional attacks. Thus, the algorithm is robust to these conventional attacks.

Table 1 PSNR and NC values under different intensities of conventional attacks

Attack                         Strength   PSNR/dB   NC
Gaussian noise                 5%         13.42     0.87
Gaussian noise                 15%        10.07     0.86
Gaussian noise                 25%        8.98      0.73
JPEG compression               1%         25.58     1.00
JPEG compression               10%        31.53     1.00
JPEG compression               20%        34.32     0.87
Median filter (thirty times)   3 × 3      35.35     0.91
Median filter (thirty times)   5 × 5      28.66     0.87
Median filter (thirty times)   7 × 7      25.60     0.87


Fig. 8 Encrypted medical images and restored watermarks under conventional attacks

4.2 Geometric Attacks

The geometric attacks tested in this paper are rotation, translation, and cropping. As shown in Table 2, when the encrypted medical image undergoes geometric attacks of various strengths, its NC value remains at 0.69 or above; at the same time, it can be seen from Fig. 9 that when the image is subjected to a certain degree of rotation, translation, and cropping, the restored watermark is still clear. Therefore, the algorithm in this paper is highly robust to these kinds of geometric attacks.

Table 2 PSNR and NC values under geometric attacks

Geometric attack           Attack strength   PSNR/dB   NC
Rotation (clockwise)                         17.25     0.75
Rotation (clockwise)       10°               15.09     0.87
Rotation (clockwise)       20°               13.38     0.74
Translation (left)         5%                15.55     0.83
Translation (left)         15%               12.02     0.87
Translation (left)         25%               10.23     0.87
Translation (up)           5%                15.86     1.00
Translation (up)           10%               13.44     1.00
Translation (up)           20%               10.97     0.96
Clipping (Y direction)     5%                /         0.87
Clipping (Y direction)     15%               /         0.87
Clipping (Y direction)     25%               /         0.69
Clipping (X direction)     5%                /         0.91
Clipping (X direction)     15%               /         0.91
Clipping (X direction)     25%               /         0.76

Fig. 9 Encrypted medical images under different geometric attacks and restored watermarks: a rotate 20°, b move left 25%, c move up 20%, d crop 15% (Y direction), e crop 15% (X direction)

4.3 Algorithm Comparison

The algorithm in this article is a robust watermarking algorithm based on encrypted medical images. Here we focus on comparing the performance of the algorithm before and after encrypting the original medical image, as shown in Tables 3 and 4. It can be seen from Table 3 that, under the same attack, the plaintext-domain and encrypted-domain algorithms show little difference in watermark restoration, and the NC values are close. Thus, the algorithm in this paper is strongly robust in both the plaintext domain and the encrypted domain, and meets the requirements of homomorphic encryption.

It can be seen from Table 4 that, after the image is encrypted with the same encryption method, the classic DCT feature extraction algorithm performs well under conventional attacks but poorly under geometric attacks, while the performance of this algorithm under conventional attacks is similar to the classic DCT

Table 3 Algorithm comparison between the encrypted domain and the plaintext domain

Attack strength                        Plaintext domain        Encrypted domain
                                       PSNR/dB     NC          PSNR/dB     NC
Gaussian noise 15%                     10.74       0.87        10.07       0.86
JPEG compression 1%                    25.49       0.91        25.58       1.00
Median filter 7 × 7 (thirty times)     24.08       1.00        25.60       0.87
Rotation 20°                           13.06       0.81        13.38       0.74
Translation (left) 25%                 11.22       1.00        10.23       0.87
Translation (up) 20%                   11.10       0.91        10.97       0.96
Crop (Y direction) 15%                 /           0.86        /           0.87
Crop (X direction) 15%                 /           0.91        /           0.91

Table 4 Comparison of different feature extraction algorithms in the encrypted domain

Attack strength                        NC (DWT-Gabor)   NC (DCT)
Gaussian noise 15%                     0.86             0.90
JPEG compression 1%                    1.00             0.90
Median filter 7 × 7 (thirty times)     0.87             0.90
Rotation 20°                           0.74             0.29
Translation (left) 25%                 0.87             0.46
Translation (up) 20%                   0.96             0.54
Crop (Y direction) 15%                 0.87             0.61
Crop (X direction) 15%                 0.91             0.63

algorithm, but it performs well under geometric attacks, especially under cropping and translation attacks.

5 Conclusion

This paper proposes a robust watermarking algorithm for encrypted medical images based on DWT-Gabor. First, the DWT-DCT transform is combined with tent mapping to encrypt the medical image. Second, the combination of the DWT and Gabor transforms is used to extract the features of the encrypted image and obtain a representative feature vector, and the zero-watermark algorithm is used to embed the watermark and to extract it from the encrypted medical image after an attack. Experimental data show that the algorithm in this paper has high robustness against both conventional and geometric attacks, and can well protect the original medical image and the watermark information.

Acknowledgements This work was supported in part by the Natural Science Foundation of China under Grant 62063004 and 61762033, in part by the Hainan Provincial Natural Science Foundation of China under Grant 2019RC018 and 619QN246, by the Postdoctoral Science Foundation under Grant 2020TQ0293, by the Science and Technology Research Project of Chongqing Education Commission under Grant KJQN201800442, and by the General Project of Chongqing Natural Science Foundation under Grant cstc2020jcyj-msxmX0422.


References

1. Memon, N.A., Alzahrani, A.: Prediction-based reversible watermarking of CT scan images for content authentication and copyright protection. IEEE Access (2020). https://doi.org/10.1109/ACCESS.2020.2989175
2. Hurrah, N.N., Parah, S.A., Loan, N.A., Sheikh, J.A., Elhoseny, M., Muhammad, K.: Dual watermarking framework for privacy protection and content authentication of multimedia. Futur. Gener. Comput. Syst. 94, 654–673 (2019). https://doi.org/10.1016/j.future.2018.12.036
3. Singh, A.K., Thakur, S., Jolfaei, A., Srivastava, G., Mohan, A.: Joint encryption and compression-based watermarking technique for security of digital documents. ACM Trans. Internet Technol. 21, 1–20 (2021). https://doi.org/10.1145/3414474
4. Yang, Z.D., Jing, L.U., Wang, K.Q., Liu, S.L., Guo, Y.X.: Application of digital watermarking technology in electric power information security (2016)
5. Singh, A.K., Kumar, B., Singh, G., Mohan, A.: Medical image watermarking. Multimedia Syst. Appl. (2017). https://doi.org/10.1007/978-3-319-57699-2
6. Nyeem, H., Boles, W., Boyd, C.: A review of medical image watermarking requirements for teleradiology. J. Digit. Imaging 26, 326–343 (2013). https://doi.org/10.1007/s10278-012-9527-x
7. Eswaraiah, R., Reddy, E.S.: A fragile ROI-based medical image watermarking technique with tamper detection and recovery. In: Fourth International Conference on Communication Systems & Network Technologies (2014). https://doi.org/10.1109/CSNT.2014.184
8. Fares, K., Khaldi, A., Redouane, K., Salah, E.: DCT & DWT based watermarking scheme for medical information security. Biomed. Signal Process. Control 66, 102403 (2021). https://doi.org/10.1016/j.bspc.2020.102403
9. Zermi, N.N., Amine, K., Redouane, K., Fares, K., Salah, E.: A DWT-SVD based robust digital watermarking for medical image security. Forensic Sci. Int. 110691 (2021). https://doi.org/10.1016/j.forsciint.2021.110691
10. Kim, M., Lauter, K.: Private genome analysis through homomorphic encryption. BMC Med. Inform. Decis. Mak. 15, S3 (2015). https://doi.org/10.1186/1472-6947-15-s5-s3
11. Tong, X.J., Zhu, W., Miao, Z., Yang, L.: A new algorithm of the combination of image compression and encryption technology based on cross chaotic map. Nonlinear Dyn. 72, 229–241 (2013). https://doi.org/10.1007/s11071-012-0707-5
12. Miao, S., Li, J., Dong, C., Yong, B.: The encrypted watermarking for medical image based on Arnold scrambling and DWT. J. Converg. Inf. Technol. (2013). https://doi.org/10.4156/jcit.vol8.issue5.104
13. Rajagopalan, S., Janakiraman, S., Rengarajan, A.: Medical image encryption (2019)
14. Chiun, L.C., Mandangan, A., Daud, A., Che, H., Che, H.: Image encryption and decryption by using Logistic-Sine chaotic system and Logistic-Tent chaotic system. In: 4th International Conference on Mathematical Sciences (2017). https://doi.org/10.1063/1.4980898
15. Zhang, Q., Li, H., Li, M., Ding, L.: Feature extraction of face image based on LBP and 2-D Gabor wavelet transform. Math. Biosci. Eng. 17, 1578–1592 (2020). https://doi.org/10.3934/mbe.2020082
16. Li, H.-A., Fan, J., Zhang, J., Li, Z., Zhang, Y.: Facial image segmentation based on Gabor filter. Math. Probl. Eng. 2021, 1–7 (2021). https://doi.org/10.1155/2021/6620742
17. Manjunath, B.S., Ma, W.Y.: Texture features for browsing and retrieving of large image data. IEEE Trans. Pattern Anal. Mach. Intell. https://doi.org/10.1109/34.531803
18. Wei, K., Zhang, M.: An efficient image watermarking scheme based on real-valued discrete Gabor transform (2015)
19. Juck-Sik, L.: Blind digital image watermarking methods based on QIM. J. Korean Inst. Inf. Technol. 15, 107–115 (2017). https://doi.org/10.14801/jkiit.2017.15.9.107
20. Fan, D., Li, Y., Gao, S., Chi, W., Lv, C.: A novel zero watermark optimization algorithm based on Gabor transform and discrete cosine transform. Concurrency Comput. Pract. Experience (2020). https://doi.org/10.1002/cpe.5689

A Zero Watermarking Scheme for Encrypted Medical Images Based on Tetrolet-DCT

Wenfeng Cui, Jing Liu, Jingbing Li, Yangxiu Fang, Dan Yi, Xiliang Xiao, Uzair Aslam Bhatti, and Baoru Han

Abstract Aiming at the problem of medical image leakage during transmission, and in order to enhance the concealment of patient data, a Tetrolet-DCT based zero watermarking scheme for encrypted medical images is proposed. Firstly, DFT and Logistic mapping are used to encrypt the original medical image. Then, Tetrolet-DCT is used to extract the features of the encrypted medical image. After that, zero watermark technology is applied to embed

W. Cui · J. Li (B) · Y. Fang · D. Yi · X. Xiao
School of Information and Communication Engineering, Hainan University, Haikou, Hainan, P.R. China
e-mail: [email protected]
W. Cui e-mail: [email protected]
Y. Fang e-mail: [email protected]
D. Yi e-mail: [email protected]
X. Xiao e-mail: [email protected]

J. Liu · J. Li
State Key Laboratory of Marine Resource Utilization in the South China Sea, Hainan University, Haikou, Hainan, P.R. China
e-mail: [email protected]

J. Liu
Research Center for Healthcare Data Science, Zhejiang Lab, Hangzhou, Zhejiang, P.R. China

U. A. Bhatti
School of Geography (Remote Sensing and GIS Lab), Nanjing Normal University, Nanjing, Jiangsu, P.R. China
e-mail: [email protected]

B. Han
College of Medical Informatics, Chongqing Medical University, Chongqing, P.R. China
e-mail: [email protected]

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021
Y.-W. Chen et al. (eds.), Innovation in Medicine and Healthcare, Smart Innovation, Systems and Technologies 242, https://doi.org/10.1007/978-981-16-3013-2_8


and extract the watermark information of patients. Experimental results show that the proposed scheme ensures the concealment of the original medical image and can effectively extract the watermark information. It has good robustness against both traditional attacks and geometric attacks.

Keywords Image encryption · Zero watermark · Logistic · Tetrolet

1 Introduction

With the rapid development of digital information technology and the rapid spread of digital images on the Internet, the security of images cannot be guaranteed, especially in sensitive fields such as medicine, the military, and government. During transmission, images are vulnerable to external attacks and may face security problems such as copying, cutting, and tampering, and even more serious risks of image information leakage [1]. To solve this problem, a large number of scholars have studied image encryption. At present, chaotic encryption methods for digital images include Logistic chaotic mapping, Chebychev mapping, piecewise linear chaotic mapping, Cubic mapping, Henon mapping, Lorenz chaotic mapping, Chua's chaos, the Rossler chaotic system, the two-dimensional Sinai map, and Chen's chaotic system [2]. The authors of [3] performed image encryption through the Radon and Fourier transforms, and [4] proposed an image encryption algorithm based on Chen hyperchaos and DNA encoding. Deng Xiaohong [5] proposed an encryption algorithm based on a 2D sine-logistic chaotic map. Jyoti Bharti [6] used QR codes for image encryption; this method responds quickly and is often used in applications. Since common image encryption techniques handle a large amount of information and process it very slowly, Le Kexin [7] studied image encryption schemes in the cloud environment, using cloud computing capabilities to improve efficiency. Gao Haojiang [8] proposed a new encryption algorithm based on a power function and a tangent function, which is safe and efficient.

Digital watermarking technology is a supplement to traditional encryption technology. Watermarking comprises the generation, embedding, and extraction of watermarks. It mainly embeds the watermark information into the feature vector of the image, realizing watermark embedding without changing the original image [9].

Currently, digital watermarking technology focuses mainly on the spatial domain and the transform domain. Spatial-domain methods mainly work by changing the gray values of pixels; the classic algorithms are the LSB and Patchwork algorithms. Transform-domain methods mainly embed the watermark by changing coefficient values; the common classical algorithms are the DCT, DWT [10], and DFT transforms [11]. The authors of [12] used DWT and FWHT to embed and extract the watermark. However, the wavelet transform can only reflect the zero-dimensional characteristics of a signal. To represent and process image data more effectively, multi-scale geometric analysis has gradually become a research hotspot. At present,


researchers have proposed a series of multi-scale geometric analysis tools, such as Ridgelet, Curvelet, Contourlet, Brushlet, Wedgelet, Bandlet, Shearlet, and Tetrolet [13]. Jens Krommweh [14] proposed the Tetrolet transform, and on this basis [15] proposed a robust zero-watermark scheme based on Tetrolet. Therefore, to solve the problems of information security and copyright, this paper combines image encryption technology with digital watermarking technology and proposes a Tetrolet-DCT based zero watermarking scheme for encrypted medical images. DFT and Logistic mapping are used to encrypt the medical image, Tetrolet-DCT is used to extract the features of the encrypted medical image, and zero watermarking technology is applied to embed and extract the watermark.

2 Theoretical Knowledge

2.1 DFT

The two-dimensional discrete Fourier transform is commonly used to process images. It can convert an image from the spatial domain to the frequency domain; similarly, the inverse discrete Fourier transform converts it back from the frequency domain to the spatial domain. The fast algorithm for the discrete Fourier transform in MATLAB is the FFT. Owing to the symmetry, periodicity, and orthogonality of the basis functions, the redundancy of the algorithm can be reduced, thereby greatly improving its efficiency.

The Discrete Fourier Transform (DFT) formula is:

F(u, v) = \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} f(x, y)\, e^{-j 2\pi x u / M} e^{-j 2\pi y v / N},
\quad u = 0, 1, \ldots, M-1; \; v = 0, 1, \ldots, N-1    (1)

The Inverse Discrete Fourier Transform (IDFT) formula is:

f(x, y) = \frac{1}{MN} \sum_{u=0}^{M-1} \sum_{v=0}^{N-1} F(u, v)\, e^{j 2\pi (ux/M + vy/N)},
\quad x = 0, 1, \ldots, M-1; \; y = 0, 1, \ldots, N-1    (2)
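Eqs. (1) and (2) correspond to numpy's `fft2`/`ifft2` pair (the 1/MN factor sits in `ifft2`), which a quick round trip confirms:

```python
import numpy as np

rng = np.random.default_rng(0)
img = rng.random((8, 8))          # small stand-in for an image

F = np.fft.fft2(img)              # spatial -> frequency domain, Eq. (1)
restored = np.fft.ifft2(F)        # frequency -> spatial domain, Eq. (2)

print(np.allclose(restored.real, img))  # True
```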


2.2 Tetrolet

The Tetrolet transform is a new adaptive Haar-type wavelet transform proposed by Jens Krommweh in 2009 [14]. This transform gives a good sparse representation, and the transformed image coefficients are more concentrated. The transform first divides the image into 4 × 4 blocks and then determines the tetromino partition in each block; the partitions are combined according to the geometric structure. There are 22 basic forms of tetromino combination; if rotations and reflections are considered, there are 117 solutions (Fig. 1).

The decomposition steps are as follows:
Step 1: Divide the image into 4 × 4 blocks.
Step 2: Find the sparsest tetromino representation of each block.
Step 3: Rearrange the low-pass and high-pass parts of each block into 2 × 2 blocks.
Step 4: Save the high-pass part of the Tetrolet decomposition coefficients.
Step 5: Repeat steps 1–4 on the low-pass part until the decomposition is complete.

Fig. 1 Tetrolet decomposition flowchart

A Zero Watermarking Scheme for Encrypted Medical …


2.3 DCT

The discrete cosine transform has strong robustness and is compatible with international image compression standards. After the DCT, the low- and middle-frequency energy is concentrated in the upper-left corner, so the watermark information can be embedded in the low- and middle-frequency regions of the DCT domain. The 2D-DCT is:

F(u, v) = c(u) c(v) \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} f(x, y) \cos\frac{\pi(2x+1)u}{2M} \cos\frac{\pi(2y+1)v}{2N}    (3)

where u = 0, 1, ..., M − 1; v = 0, 1, ..., N − 1, and

c(u) = \sqrt{1/M} for u = 0, \sqrt{2/M} for u = 1, 2, ..., M − 1; \quad c(v) = \sqrt{1/N} for v = 0, \sqrt{2/N} for v = 1, 2, ..., N − 1    (4)

where x, y denote sampling values in the spatial domain, u and v denote sampling values in the frequency domain, and M and N denote the dimensions of the image.

2.4 Logistic Mapping

The logistic map is a simple, classic, and ergodic chaotic dynamical system. Its characteristic property is high sensitivity to the initial value, which is why it is widely used in the encryption field. It is described by the following nonlinear equation:

x_{n+1} = \mu x_n (1 - x_n)    (5)

When 3.5699456 < μ ≤ 4, the logistic map enters a chaotic state. When μ = 4, the probability density function ρ(x) of the logistic chaotic sequence is:

\rho(x) = \frac{1}{\pi \sqrt{x(1-x)}}, \quad 0 < x < 1    (6)

This formula shows that the logistic sequence is ergodic and that its probability density distribution does not depend on the initial value.
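The iteration in Eq. (5) is easy to sketch. The following Python fragment (illustrative only; the paper's experiments use MATLAB) generates a logistic sequence and demonstrates the sensitivity to the initial value. The seed and burn-in length are arbitrary choices, not values from the paper:

```python
def logistic_sequence(x0, mu, length, burn_in=100):
    """Iterate the logistic map x_{n+1} = mu * x_n * (1 - x_n).

    A burn-in period is discarded so the returned values lie on the
    chaotic attractor rather than on the initial transient.
    """
    x = x0
    seq = []
    for i in range(burn_in + length):
        x = mu * x * (1 - x)
        if i >= burn_in:
            seq.append(x)
    return seq

# Sensitivity to the initial value: two nearby seeds diverge quickly.
a = logistic_sequence(0.3500000, 4.0, 50)
b = logistic_sequence(0.3500001, 4.0, 50)
print(max(abs(u - v) for u, v in zip(a, b)))  # far from zero
```

Two seeds differing by only 10^-7 become uncorrelated within a few dozen iterations, which is what makes the sequence usable as a key stream.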


3 Medical Image Encryption

To verify the performance of the proposed algorithm, we use a 512 pixel × 512 pixel brain CT image for image encryption. First, perform the DFT on the original medical image to obtain the coefficient matrix. Then use the logistic chaotic sequence to construct the encryption matrix C(i, j), and dot-multiply the encryption matrix with the coefficient matrix. Finally, perform the IDFT to obtain the encrypted image (Figs. 2 and 3). The flowchart and the encrypted medical image are shown below:
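The encryption pipeline above can be sketched end to end. This is an illustrative Python version (the paper uses MATLAB on a 512 × 512 image; here a 4 × 4 block and a naive O(N^4) DFT keep it self-contained), and the way the logistic sequence is binarised into a ±1 key matrix is an assumption, since the paper does not spell that step out. Because the key entries are ±1, applying the same key a second time decrypts the image:

```python
import cmath

def dft2(f):
    """Naive 2-D DFT of Eq. (1); fine for tiny demonstration matrices."""
    M, N = len(f), len(f[0])
    return [[sum(f[x][y] * cmath.exp(-2j * cmath.pi * (u * x / M + v * y / N))
                 for x in range(M) for y in range(N))
             for v in range(N)]
            for u in range(M)]

def idft2(F):
    """Naive 2-D inverse DFT of Eq. (2)."""
    M, N = len(F), len(F[0])
    return [[sum(F[u][v] * cmath.exp(2j * cmath.pi * (u * x / M + v * y / N))
                 for u in range(M) for v in range(N)) / (M * N)
             for y in range(N)]
            for x in range(M)]

def chaotic_sign_matrix(x0, mu, rows, cols):
    """Logistic sequence binarised to +/-1 and reshaped into a key matrix.
    The 0.5 threshold is an assumption; the paper does not give this detail."""
    x, signs = x0, []
    for _ in range(rows * cols):
        x = mu * x * (1 - x)
        signs.append(1 if x >= 0.5 else -1)
    return [signs[r * cols:(r + 1) * cols] for r in range(rows)]

def apply_key(img, key):
    """DFT -> dot-multiply by the key matrix -> IDFT.
    Since the key entries are +/-1, applying it twice restores the image."""
    F = dft2(img)
    FC = [[F[i][j] * key[i][j] for j in range(len(key[0]))]
          for i in range(len(key))]
    return idft2(FC)

image = [[10, 20, 30, 40], [50, 60, 70, 80],
         [90, 100, 110, 120], [130, 140, 150, 160]]
C = chaotic_sign_matrix(0.31, 4.0, 4, 4)
encrypted = apply_key(image, C)      # complex-valued cipher image
decrypted = apply_key(encrypted, C)  # the same key recovers the original
```

Decryption works because C ∘ C is the all-ones matrix, so the second application cancels the first in the DFT domain.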

Fig. 2 Medical image encryption flowchart

Fig. 3 a Original medical image b Encrypted medical image c Decrypted medical image


Fig. 4 Watermark embedding flowchart

4 Embedding and Extraction of the Zero Watermark

The scheme proposed in this paper operates on the encrypted image and mainly includes watermark preprocessing, watermark embedding, and watermark extraction. It uses Tetrolet combined with DCT for feature extraction.

4.1 Watermark Embedding

First, extract the features of the encrypted medical image to obtain a feature vector V(j) (Fig. 4). Similarly, an encrypted two-dimensional matrix BW(i, j) is obtained via the logistic logical sequence X(j). Then, the feature vector and the encrypted matrix are XORed to obtain a logical key K(i, j). The whole process does not change the image itself yet embeds the watermark information into the image; this is the whole process of zero-watermark embedding.

4.2 Watermark Extraction

Perform feature extraction on the encrypted and attacked medical image to obtain the feature vector V'(j), and XOR it with the logical key K(i, j) to obtain the encrypted watermark BW'(i, j) (Fig. 5). Then, an XOR operation with the encryption matrix C(i, j) finally extracts the watermark information.
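Both directions reduce to XOR operations, which can be made concrete in a few lines (an illustrative sketch; the 8-bit vectors are hypothetical stand-ins for the real 32-bit feature vector and encrypted watermark):

```python
def xor_bits(a, b):
    """Bitwise XOR of two equal-length binary sequences."""
    return [x ^ y for x, y in zip(a, b)]

# Feature vector V extracted from the encrypted image (hypothetical values)
# and the chaotically encrypted watermark BW.
V  = [1, 0, 1, 1, 0, 0, 1, 0]
BW = [0, 1, 1, 0, 0, 1, 0, 1]

# Embedding: only the key is stored (e.g. with a third party);
# the carrier image itself is never modified.
Key = xor_bits(V, BW)

# Extraction: features V2 of the (possibly attacked) image recover BW.
V2 = [1, 0, 1, 1, 0, 0, 1, 0]   # identical here, i.e. no attack
assert xor_bits(V2, Key) == BW
```

Because XOR is its own inverse, any bit of the feature vector that survives an attack recovers the corresponding watermark bit exactly; robustness of the scheme therefore rests entirely on the stability of the extracted features.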


Fig. 5 Watermark extraction flowchart

5 Experiments

5.1 Data from Various Attacks

To objectively demonstrate the performance of the watermarking algorithm, the normalized correlation coefficient (NC) is used to evaluate the quality of the extracted watermark, and the peak signal-to-noise ratio (PSNR) is used to evaluate the quality of the watermarked image. To verify the robustness of the watermarking algorithm, conventional attacks and geometric attacks were applied separately. First, conventional attacks were performed on the encrypted watermarked images; the experimental data are shown in Table 1, and Fig. 6 shows the encrypted medical images and the extracted watermark images after these attacks. Although the encrypted medical image has undergone three types of attacks, the NC value remains high and the extracted watermark image is clear, hardly affected by the attacks. Table 2 gives the geometric-attack data for the encrypted image, and Fig. 7 shows the watermarked images of the encrypted medical image under the geometric attacks together with the extracted watermarks. From the watermarks extracted after the geometric attacks, we find that the algorithm's resistance to geometric attacks is not ideal, but the watermark image is still visible to the human eye.
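The two quality measures used throughout this section have the standard definitions sketched below (a minimal Python version; images are plain nested lists, and the 2 × 2 examples are hypothetical):

```python
import math

def psnr(original, distorted):
    """Peak signal-to-noise ratio between two same-sized grayscale images."""
    M, N = len(original), len(original[0])
    peak = max(max(row) for row in original)
    err_sum = sum((original[i][j] - distorted[i][j]) ** 2
                  for i in range(M) for j in range(N))
    if err_sum == 0:
        return float("inf")        # identical images
    return 10 * math.log10(M * N * peak ** 2 / err_sum)

def nc(w, w_extracted):
    """Normalized correlation coefficient between binary watermarks."""
    num = sum(a * b for row_a, row_b in zip(w, w_extracted)
              for a, b in zip(row_a, row_b))
    den = sum(a * a for row in w for a in row)
    return num / den

w = [[1, 0], [1, 1]]
print(nc(w, w))  # 1.0 for an identical watermark
```

An NC of 1.0 means the extracted watermark matches the embedded one bit for bit; values above 0.5 are treated in this paper as a still-recognisable watermark.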

Table 1 The encrypted medical watermarked image data under traditional attacks

Conventional attack          Intensity   PSNR (dB)   NC
Gaussian noise /%            10          11.13       0.92
                             20          9.40        0.86
                             30          8.63        0.80
JPEG attack /%               5           26.91       1.00
                             10          30.71       1.00
                             20          34.23       1.00
Median filtering/10 times    [3 × 3]     35.30       1.00
                             [5 × 5]     25.45       1.00
                             [7 × 7]     22.58       1.00

Fig. 6 The encrypted medical images and extracted watermark images under traditional attacks: a Gaussian noise 20% (PSNR = 9.41 dB, NC = 0.86), b median filter 10 times (PSNR = 25.45 dB, NC = 1.00), c JPEG compression 40% (PSNR = 37.07 dB, NC = 1.00)

Table 2 The encrypted medical watermarked image data under geometric attacks

Geometric attack             Intensity   PSNR (dB)   NC
Scaling                      × 0.25      –           0.94
                             × 0.5       –           1.00
                             × 1.5       –           1.00
Rotation (clockwise) /°      2           17.44       0.87
                             10          13.74       0.67
                             15          12.99       0.62
Movement (down) /%           2           16.18       0.88
                             5           13.83       0.81
                             10          12.33       0.65
Cropping (Y direction) /%    5           –           0.94
                             10          –           0.63
                             20          –           0.57


Fig. 7 The encrypted medical images and extracted watermark images under the geometric attacks: a Scaling × 0.5 (NC = 1.00), b Rotation 10° (clockwise) (PSNR = 13.74 dB, NC = 0.67), c Movement (down) 10% (PSNR = 12.33 dB, NC = 0.65), d Cropping Y direction 10% (NC = 0.63)

According to the experimental data and images after the attacks, the proposed algorithm keeps an NC value above 0.5 under all tested attacks, so it has good robustness. Under cropping attacks the decrease is obvious at 20% cropping, but the extracted watermark is still distinguishable and effective. The watermarks extracted under conventional attacks are clearly visible and the NC values are very high, especially for compression and median filtering attacks.

5.2 Comparison in the Plaintext and Encrypted Domains

Table 3 compares the NC values of watermarks extracted from the medical image before and after encryption. The experimental data show that under several attacks the NC values in the encrypted domain are lower than in the plaintext domain, but the extraction is still effective (Table 3).

5.3 Feature Extraction Algorithm Comparison

We compared the NC values of the feature extraction algorithm in this paper with those of other algorithms in the plaintext domain (Table 4). Compared with the DWT and Curvelet multi-scale geometric tools, the performance under JPEG compression, median filtering, and Gaussian-noise attacks is improved, and Tetrolet is also better under rotation attacks. Overall, the experimental data show that the proposed method has good robustness.

Table 3 NC comparison in the plaintext and encrypted domains

Attacks                              Plaintext domain   Encrypted domain
JPEG attack 40%                      1.00               1.00
Median filtering [5 × 5]/10 times    1.00               1.00
Gaussian noise 20%                   0.88               0.86
Scaling × 0.5                        1.00               1.00
Movement (down) 10%                  1.00               0.65
Movement (left) 5%                   0.92               0.67
Rotation 10° (clockwise)             0.81               0.67
Cropping Y direction 10%             0.83               0.63
Cropping X direction 10%             0.81               0.72

Table 4 Comparison between the proposed algorithm and other algorithms

Attacks                     Parameter   DWT-DCT            Curvelet-DCT       Tetrolet-DCT
                                        PSNR (dB)  NC      PSNR (dB)  NC      PSNR (dB)  NC
JPEG attack /%              1           26.28      0.94    26.28      1.00    26.28      1.00
                            5           28.43      0.94    28.43      1.00    28.43      1.00
                            10          31.29      0.94    31.29      1.00    31.29      1.00
Median filtering/10 times   [3 × 3]     33.99      0.94    33.99      1.00    33.99      1.00
                            [5 × 5]     28.55      0.94    28.55      1.00    28.55      1.00
                            [7 × 7]     26.39      0.89    26.39      1.00    26.39      1.00
Gaussian noise /%           2           17.77      0.89    17.75      1.00    17.77      1.00
                            10          11.89      0.83    11.85      0.80    11.86      1.00
                            20          9.81       0.83    9.78       0.79    9.78       0.87
Movement (down) /%          5           18.99      0.94    15.10      1.00    15.10      1.00
                            10          16.19      0.94    14.73      0.89    14.73      1.00
                            15          15.33      0.94    14.28      0.57    14.28      0.75
Rotation (clockwise) /°     5           18.00      0.76    18.00      0.71    18.00      0.81
                            10          15.59      0.76    15.59      0.63    15.59      0.81
                            15          14.86      0.68    14.86      0.62    14.86      0.81


6 Conclusion

A zero-watermarking scheme for medical images based on Tetrolet-DCT is proposed. With the medical image encrypted, zero-watermarking technology is used to embed and extract the watermark information of the encrypted medical image effectively. A chaotic map is used to encrypt the watermark information, realizing zero embedding of the watermark. According to the watermarks extracted in the experiments and the NC values used to evaluate them, the scheme has good robustness against traditional attacks. Under this dual protection, the invisibility of the medical image is guaranteed and the information of the image owner is not disclosed. The scheme improves the security of medical image transmission.

Acknowledgements This work was supported in part by the Natural Science Foundation of China under Grants 62063004 and 61762033, in part by the Hainan Provincial Natural Science Foundation of China under Grants 2019RC018 and 619QN246, by the Postdoctoral Science Foundation under Grant 2020TQ0293, by the Science and Technology Research Project of Chongqing Education Commission under Grant KJQN201800442, and by the General Project of Chongqing Natural Science Foundation under Grant cstc2020jcyj-msxmX0422.

References

1. Xia, Z., Wang, X., Li, X., Wang, C., Unar, S., Wang, M., Zhao, T.: Efficient copyright protection for three CT images based on quaternion polar harmonic Fourier moments. Sig. Process. 164, 368–379 (2019)
2. Chang-ci, W., Qin, W., Xiao-ning, M., Xiang-hong, L., Yang-xiang, P.: Digital image encryption: a survey. Comput. Sci. 39(6–9), 24 (2012)
3. Yao, X.-C., L.X.: Image encryption based on linear random Radon transform and Fourier transform. Comput. Inf. Technol. 28, 11–13 (2020)
4. Yanqi, L.Z.J.A.: Image encryption algorithm based on Chen hyperchaos and DNA coding. J. Nat. Sci. Heilongjiang Univ. 37, 602–609 (2020)
5. Deng, X., L.D.L.H.: Medical image encryption algorithm in the frequency domain based on 2D sine logistic chaotic mapping. Appl. Res. Comput. (2020)
6. Bharti, J., A.S.: A review on image encryption using QR-code. In: IDES Joint International Conferences on IPC and ARTEE (2017)
7. Kexin, L., Liangliang, W.: Image encryption scheme based on chaos in a cloud environment. J. Shanghai Univ. Electr. Power 36, 500–504 (2020)
8. Gao, H., Zhang, Y., Liang, S., Li, D.: A new chaotic algorithm for image encryption. Chaos, Solitons Fractals 29, 393–399 (2006)
9. Nie, X., H.D.G.D.: Overview of digital image watermarking technology. Comput. Knowl. Technol. (2015)
10. Jeevitha, S., Amutha Prabha, N.: Novel medical image encryption using DWT block-based scrambling and edge maps. J. Amb. Intel. Hum. Comp. (2020)
11. Hong-lan, Y.: The image copyright protection method based on DFT and Arnold technology: design and implementation. J. Hefei Univ. (2018)
12. Savakar, D.G., Pujar, S.: Digital image watermarking using DWT and FWHT. Int. J. Image Graphics Signal Process. 10, 50–67 (2018)
13. Cai-Lian, L., J.S.Y.K.: Image multi-scale geometric analysis. Nat. Sci. J. Hainan Univ. 29, 275–283 (2011)


14. Krommweh, J.: Tetrolet transform: a new adaptive Haar wavelet algorithm for sparse image representation. J. Vis. Commun. Image R. 21, 364–374 (2010)
15. Mastan Vali, S.K., Naga Kishore, K.L., Prathibha, G.: Notice of Retraction: Robust image watermarking using tetrolet transform, pp. 1–5. IEEE (2015)

A Robust Zero-Watermarking Algorithm Based on PHTs-DCT for Medical Images in the Encrypted Domain

Dan Yi, Jingbing Li, Yangxiu Fang, Wenfeng Cui, Xiliang Xiao, Uzair Aslam Bhatti, and Baoru Han

Abstract In the process of network transmission, medical images can be maliciously attacked, causing the loss of diagnostic data or the leakage of patient privacy. To solve this problem, this paper proposes a robust watermarking algorithm for encrypted medical images based on PHTs-DCT. First, DWT-DCT and a tent chaotic sequence are used to encrypt the medical image, which increases the security of the carrier medical image information. Then, the PHTs-DCT transform is performed on the encrypted medical image to extract the feature vector. Finally, using the homomorphism of the encryption algorithm, combined with the concept of a trusted third party, zero-watermarking technology is used to embed and extract the watermark of the encrypted medical image. A large number of experimental results on the MATLAB simulation platform show that the proposed algorithm has good robustness to conventional attacks and geometric attacks, especially scaling, compression, and translation attacks.

Keywords PHTs · DWT-DCT · Medical image · Robustness · Encrypted domain

D. Yi · J. Li (B) · Y. Fang · W. Cui · X. Xiao
School of Information and Communication Engineering, Hainan University, Haikou, Hainan, P.R. China
J. Li
State Key Laboratory of Marine Resource Utilization in South China Sea, Hainan University, Haikou, Hainan, P.R. China
U. A. Bhatti
School of Geography (Remote Sensing, GIS Lab), Nanjing Normal University, Nanjing, Jiangsu, P.R. China
B. Han
College of Medical Informatics, Chongqing Medical University, Chongqing, P.R. China

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021
Y.-W. Chen et al. (eds.), Innovation in Medicine and Healthcare, Smart Innovation, Systems and Technologies 242, https://doi.org/10.1007/978-981-16-3013-2_9

1 Introduction

With the rapid development of communications and computer networks, big data, smart cities, and smart medical care have emerged. The 2009 Health Information Technology for Economic and Clinical Health (HITECH) Act began to encourage the use of electronic health records. Electronic medical records improve the accessibility of data and simplify computer data updates [1]. As an important carrier of medical information, medical images play an important role in real-time diagnosis, in understanding critical diseases, and in avoiding misdiagnosis. Unlike ordinary images, medical images store the patient's private personal information and diagnosis records. During watermark embedding we must ensure not only the clarity of the image but also its integrity and security during network transmission [2].

Digital watermarking is considered an effective method to ensure the authenticity and integrity of medical images. Authenticity refers to the ability to identify the source of the information and prove that the data are relevant to the patient; integrity refers to the ability to ensure that the information is not changed without authorization [3, 4]. Because of the particularity of medical images, we introduce zero-watermarking technology, which embeds the watermark information without changing the visual appearance of the original medical image, preserving the clarity of the medical image [5].

Nowadays, digital watermarking algorithms mostly operate in the plaintext domain [6–8]; that is, watermark embedding and extraction are performed on unencrypted carrier medical images. If the carrier image is intercepted during transmission, its information is exposed.
Therefore, we encrypt the original medical image, embed the watermark information in the encrypted domain, and use the homomorphism of the encryption algorithm to transmit the encrypted watermark and medical image to a trusted third party [1, 9]. This allows the watermark to be extracted without the original image and without decrypting the image. This method simultaneously ensures the safety, integrity, and robustness of the watermark information and of the medical image itself.

Feature extraction is a key link in watermark embedding, and previous work offers many results. The Discrete Cosine Transform (DCT) [10], Discrete Wavelet Transform (DWT) [11], Zernike Moments (ZMs) [12], Pseudo-Zernike Moments (PZMs) [13, 14], and Polar Harmonic Transforms (PHTs) [15, 16] are commonly used feature extraction algorithms. The polar harmonic transforms proposed by Yap et al. [17] include the polar complex exponential transform (PCET), polar cosine transform (PCT), and polar sine transform (PST). Compared with other orthogonal moments, they have lower noise sensitivity, and the kernel function is simple to compute, without any numerical stability problems. Li et al. [18] evaluated the image representation capabilities of ZMs, PZMs, and PHTs: when the ZM/PZM order exceeds a certain value, the quality of the reconstructed image drops rapidly, whereas PHTs do not have this problem and perform better. To compute PHTs faster, Singh et al. [19] proposed a method that exploits recursion of the kernel function and its 8-way symmetric/antisymmetric properties; clustering pixels at eight radially symmetric positions improves the computation speed by a factor of 3 to 4 over the previous method.

In this article, we combine zero-watermarking, chaotic encryption technology, and the 8-way symmetric/antisymmetric PHTs algorithm, and propose a robust zero-watermarking algorithm for encrypted medical images based on polar harmonic transforms and the discrete cosine transform.

2 Fundamental Theory

2.1 Polar Harmonic Transforms (PHTs)

Polar harmonic transforms are orthogonal moments defined in polar coordinates, whose domain is the unit circle. Projecting the horizontal and vertical coordinates of an N × N image onto the interval [−1, 1] gives the PHTs in polar coordinates. The definition is:

H_{nm} = \lambda \int_0^{2\pi} \int_0^1 f(r, \theta) \, V_{nm}^{*}(r, \theta) \, r \, dr \, d\theta    (1)

where n, m are arbitrary integers, that is, n, m = 0, ±1, ±2, …; n is also called the order, m is the number of repetitions, and V_{nm}^{*}(r, θ) is the conjugate of the kernel function. From Eq. (2), the basis function V_{nm}(r, θ) is composed of two parts: a radius-related component R_n(r) and an angle-related component e^{jm\theta}. According to the radius-related component R_n(r), PHTs are divided into the Polar Complex Exponential Transform (PCET), Polar Sine Transform (PST), and Polar Cosine Transform (PCT).

V_{nm}(r, \theta) = R_n(r) \, e^{jm\theta}    (2)

R_n(r) = e^{j2\pi n r^2} for PCET; \; \cos(\pi n r^2) for PCT; \; \sin(\pi n r^2) for PST    (3)


PHTs basis functions are orthogonal, and any image can be reconstructed by the following formula:

f(r, \theta) = \sum_{n=-\infty}^{\infty} \sum_{m=-\infty}^{\infty} H_{nm} V_{nm}(r, \theta)    (4)

It can be seen from definition (1) that when n, m are limited to fixed maximum values, an approximation of the original image can be obtained. As n, m increase, the reconstructed image gradually approaches the original image.
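A direct (unoptimised) discretisation of Eq. (1) for the PCT kernel looks as follows. This Python sketch omits the normalisation constant λ and uses a deliberately tiny 8 × 8 grid; for a constant image the angular factor e^{−jmθ} integrates to zero for m ≠ 0, which the example checks:

```python
import cmath
import math

def pct_moment(img, n, m):
    """Polar Cosine Transform moment H_nm of a square image, discretised on
    the Cartesian grid mapped into the unit circle (pixels outside the circle
    are ignored). The normalisation constant lambda is omitted here."""
    N = len(img)
    total = 0j
    for i in range(N):
        for j in range(N):
            x = (2 * j + 1 - N) / N          # pixel centres mapped to [-1, 1]
            y = (2 * i + 1 - N) / N
            r = math.hypot(x, y)
            if r > 1.0:
                continue
            theta = math.atan2(y, x)
            # Conjugated PCT kernel of Eqs. (2)-(3): cos(pi n r^2) e^{-jm theta}
            kernel = math.cos(math.pi * n * r * r) * cmath.exp(-1j * m * theta)
            total += img[i][j] * kernel * (2 / N) ** 2   # area element
    return total

flat = [[1.0] * 8 for _ in range(8)]
# A constant image has no angular variation, so moments with m != 0 vanish:
print(abs(pct_moment(flat, 0, 1)))    # ~0 (pixels at theta and theta + pi cancel)
print(pct_moment(flat, 0, 0).real)    # 3.25 on this coarse grid (~ disc area pi)
```

The fast 8-way symmetric method of Singh et al. [19] computes the same moments by evaluating the kernel once per symmetric octet of pixels; the brute-force loop above is only meant to make the definition concrete.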

2.2 Logistic Map

The logistic map is a widely studied chaotic model. It is defined as follows:

x_{k+1} = \mu \, x_k (1 - x_k)    (5)

where k is the number of iterations, μ is the control parameter, and x_k ∈ (0, 1) is the system variable. When 3.569945 ≤ μ ≤ 4, the system enters a chaotic state; the closer μ is to 4, the stronger the chaos. Since the logistic map is extremely sensitive to its initial value, it can be used as an ideal logical key sequence.

3 Proposed Algorithm

3.1 Medical Image Encryption

The flow chart of medical image encryption is shown in Fig. 1. First, use the DWT to obtain the subband wavelet coefficients cA, cH, cV, cD of the original medical image, and perform the DCT on each wavelet coefficient to obtain the coefficient matrix D(i, j). Pass the tent chaotic sequence through the sgn(x) function to obtain the binary encryption matrix C(i, j):

sgn(x) = 1 for x(n) ≥ 0; \; −1 for x(n) < 0    (6)

X(j) = sgn(X(j))    (7)

C(i, j) = reshape(X(j))    (8)


Fig. 1 Medical image encryption process

Do a dot product of the D(i, j) and C(i, j) matrices to obtain the encrypted coefficient matrix E_D(i, j). Perform the IDCT on E_D(i, j) to obtain the encrypted subband wavelet coefficients ED(i, j), and perform the IDWT on ED(i, j) to obtain the encrypted medical image E(i, j).

E_D(i, j) = D(i, j) \cdot C(i, j)    (9)

ED(i, j) = IDCT2(E_D(i, j))    (10)

E(i, j) = IDWT2(ED(i, j))    (11)

Figure 2 shows the original medical images, and Fig. 3 shows the corresponding medical images encrypted by DWT-DCT and chaotic tent mapping.

Fig. 2 The original medical images: a Wrist, b Brain, c Coronary artery, d Arm, e Abdomen, f Foot, g Maxilla, h Lumbar spine


Fig. 3 Corresponding encrypted medical images

3.2 Feature Extraction and Watermark Embedding

We randomly select a medical image, perform DWT-DCT encryption and the PHTs-DCT transformation on it, and apply various attacks to the transformed image. We find that when we take a 4 × 8 coefficient matrix from the low-frequency coefficients and compare each coefficient with the mean value of the 32 coefficients in the matrix, the pattern of larger/smaller relations is relatively stable. Therefore, in this experiment, we compare the coefficients with the mean of these 32 coefficients and stipulate that coefficients greater than or equal to the mean are replaced by "1" and coefficients less than the mean by "0"; the resulting 32-bit binary sequence is the feature sequence.

Based on this method, we extracted the encrypted features of all the medical images in Fig. 2 and calculated the normalized correlation coefficients between all the feature sequences, as shown in Table 1. Every image has an NC value of 1 with itself, and the NC values between different images are all less than 0.5, indicating that the low-frequency coefficients of encrypted medical images after the PHTs-DCT transformation can be used as effective visual feature vectors.

Table 1 Values of the correlation coefficients between different encrypted medical images

Image   (a)     (b)     (c)     (d)     (e)     (f)     (g)     (h)
(a)     1.00    0.12    −0.09   −0.25   0.23    0.18    0.21    0.21
(b)     0.12    1.00    −0.06   −0.17   0.16    −0.15   −0.08   0.36
(c)     −0.09   −0.06   1.00    −0.10   −0.08   0.37    −0.05   −0.05
(d)     −0.25   −0.17   −0.10   1.00    0.40    0.33    0.18    0.18
(e)     0.23    0.16    −0.08   0.40    1.00    0.23    0.24    0.24
(f)     0.18    −0.15   0.37    0.33    0.23    1.00    0.21    −0.12
(g)     0.21    −0.08   −0.05   0.18    0.24    0.21    1.00    −0.07
(h)     0.21    0.36    −0.05   0.18    0.24    −0.12   −0.07   1.00

The flowchart of feature extraction and watermark embedding is shown in Fig. 4. We perform the DCT on the encrypted medical image E(i, j) of 512 pixels × 512 pixels and select a 128 pixel × 128 pixel matrix to form the coefficient matrix A(i, j), which reduces data redundancy and improves computational efficiency. We perform the PHTs transformation on the coefficient matrix A(i, j) to obtain approximate coefficients, then perform the DCT on the approximate coefficients to obtain the coefficient matrix F(i, j). We select the 4 × 8 matrix at the low frequencies of F(i, j) and generate the 32-bit binary feature sequence V(i, j) of the medical image through the hash function.

[HnmReal, HnmImg] = PHTs(H(i, j))    (12)

F(i, j) = DCT2(HnmImg(i, j))    (13)

Fig. 4 Feature extraction and watermark embedding process
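The mean-comparison rule described above is easy to state precisely (illustrative Python; the 2 × 4 block of coefficients is hypothetical, standing in for the paper's 4 × 8 low-frequency DCT block):

```python
def feature_bits(coeff_block):
    """Turn a low-frequency coefficient block into a binary feature sequence:
    coefficients >= the block mean map to 1, the rest to 0 (Sect. 3.2)."""
    flat = [c for row in coeff_block for c in row]
    mean = sum(flat) / len(flat)
    return [1 if c >= mean else 0 for c in flat]

block = [[8.0, 2.0, 1.0, 5.0],
         [3.0, 9.0, 4.0, 4.0]]   # hypothetical stand-in coefficients, mean 4.5
print(feature_bits(block))       # [1, 0, 0, 1, 0, 1, 0, 0]
```

Because only the ordering of each coefficient relative to the block mean matters, moderate distortions of the coefficient magnitudes leave most bits unchanged, which is what makes this feature sequence attack-tolerant.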

The initial value x0 is passed through the logistic map to obtain the chaotic sequence X(j). Sort the values of X(j) in ascending order, then scramble the position space of the watermark pixels according to the position changes of each value of X(j) before and after sorting, obtaining the encrypted watermark BW(i, j). The feature vector V(i, j) and BW(i, j) are XORed bit by bit, which completes the embedding of the watermark and at the same time yields the logical key Key(i, j).

Key(i, j) = BW(i, j) \oplus V(i, j)    (14)
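The sort-based scrambling and its inverse can be sketched as follows (an illustrative Python version; an 8-bit watermark stands in for the 32 × 32 image, and the key parameters are hypothetical):

```python
def logistic_sequence(x0, mu, length):
    """Logistic map iterates used as the scrambling key stream."""
    x, seq = x0, []
    for _ in range(length):
        x = mu * x * (1 - x)
        seq.append(x)
    return seq

def scramble(bits, chaos):
    """Permute watermark pixels by the ascending-sort order of the chaotic
    sequence (Sect. 3.2): the bit at chaos-rank k moves to position k."""
    order = sorted(range(len(bits)), key=lambda k: chaos[k])
    out = [0] * len(bits)
    for new_pos, old_pos in enumerate(order):
        out[new_pos] = bits[old_pos]
    return out

def unscramble(bits, chaos):
    """Invert the permutation with the same key (x0, mu)."""
    order = sorted(range(len(bits)), key=lambda k: chaos[k])
    out = [0] * len(bits)
    for new_pos, old_pos in enumerate(order):
        out[old_pos] = bits[new_pos]
    return out

w = [1, 0, 0, 1, 1, 0, 1, 0]
key = logistic_sequence(0.2, 4.0, len(w))
assert unscramble(scramble(w, key), key) == w
```

Only the pair (x0, μ) needs to be kept secret: regenerating the same chaotic sequence reproduces the same sort permutation, so the scrambling is exactly invertible.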

3.3 Watermark Extraction and Decryption

The flow chart of watermark extraction and decryption is shown in Fig. 5. Perform the PHTs-DCT transformation on the encrypted medical image E'(i, j) after the attack, and extract the characteristic sequence V'(i, j). The characteristic sequence V'(i, j) and the logical key Key(i, j) are XORed to obtain the encrypted watermark BW'(i, j).

BW'(i, j) = Key(i, j) \oplus V'(i, j)    (15)

Fig. 5 Watermark extraction and restoration process

Using the same method as in the watermark encryption process, sort the values of the logistic chaotic sequence X(j) in ascending order; according to the position changes of each value of X(j) before and after sorting, the position space of the watermark pixels is restored, yielding the recovered watermark W'(i, j).

4 Experiments and Results

4.1 Simulation Experiment

We choose MATLAB 2018b as the simulation platform. The experimental image is a 512 pixel × 512 pixel brain CT image (see Fig. 2b), and the watermark image is a 32 pixel × 32 pixel "CN" letter image. For the initial values and control parameters in the encryption process, the medical image uses x0 = 0.34, μ = 0.8, and the watermark image uses x0' = 0.2, μ = 4. The peak signal-to-noise ratio (PSNR) is used as the objective evaluation standard for medical image quality, and the normalized correlation coefficient (NC) measures the similarity between the embedded original watermark and the extracted watermark. The mathematical expressions are as follows:

PSNR = 10 \log_{10} \frac{MN \max_{i,j} (I_{i,j})^2}{\sum_i \sum_j (I_{i,j} - I'_{i,j})^2}    (16)

NC = \frac{\sum_i \sum_j W_{i,j} W'_{i,j}}{\sum_i \sum_j W_{i,j}^2}    (17)


4.2 Attack Results

Taking the brain medical image in Fig. 2b as an example, we perform conventional attacks and geometric attacks. The test results are shown in Table 2, and part of the attacked encrypted medical images with the corresponding extracted watermarks are shown in Fig. 6.

Table 2 The PSNR and NC values under conventional and geometric attacks

Attack type                 Intensity   PSNR (dB)   NC
Gaussian noise              3%          15.39       0.93
                            10%         11.14       0.94
                            15%         10.07       0.81
JPEG compression            1%          24.83       0.76
                            8%          30.43       0.80
                            15%         33.43       1.00
Median filter (10 times)    [3 × 3]     36.17       1.00
                            [5 × 5]     28.58       0.81
                            [7 × 7]     25.73       0.81
Scaling                     × 0.25      –           1.00
                            × 0.5       –           1.00
                            × 4         –           1.00
Rotation (clockwise)        8°          16.44       0.71
                            18°         14.73       0.76
                            21°         14.45       0.54
Movement (right)            3%          17.36       0.86
                            10%         14.69       0.80
                            20%         12.22       0.68
Cropping (Y direction)      2%          –           0.86
                            5%          –           0.80
                            8%          –           0.63
Cropping (X direction)      3%          –           0.94
                            10%         –           0.80
                            18%         –           0.61

Fig. 6 Medical images and corresponding watermark images: (a) Gaussian noise (10%); (c) Median filter [7 × 7] (10 times); (e) Scaling (× 0.25); (g) Rotation (clockwise, 18°); (i) Translation (10%, right); (k) Cropping (3%, X direction); (b), (d), (f), (h), (j), and (l) are the extracted watermarks

It can be seen from Table 2 that as the attack intensity increases, the quality of the extracted watermark gradually decreases. With Gaussian noise reaching 15%, JPEG compression at 8%, and median filtering of size 11 × 11 applied 10 times, the NC value of the extracted watermark is still above 0.8, indicating that the proposed algorithm is quite robust against conventional attacks.

Geometric attacks such as rotation are a difficult problem in digital watermarking. In this experiment, when the clockwise rotation angle reaches 18°, NC = 0.76, and it can be seen from Fig. 6h that watermark information with high similarity is still extracted. When panning to the right or X-direction cropping reaches 10%, and when the scaling factor is as small as 0.25, the NC value of the extracted watermark stays above 0.8 and the watermark information is visible, indicating that the proposed medical image encryption algorithm also has good robustness against geometric attacks.

4.3 Comparison with the Unencrypted Algorithm

We compare the proposed algorithm with the same algorithm applied to unencrypted medical images. The results are shown in Table 3. As can be seen from the table, the PSNR and NC values of the plaintext domain and the encrypted domain are not much different, indicating that the proposed encryption algorithm has good homomorphism.

Table 3 The PSNR and NC values in the PHTs-DCT plaintext and encrypted domains

Attacks                              PSNR (dB)                      NC value
                                     Plaintext    Encrypted         Plaintext    Encrypted
Gaussian noise 15%                   10.59        10.07             0.90         0.81
JPEG attack 10%                      33.81        31.58             0.90         0.93
Median filtering [7 × 7]/10 times    26.40        25.73             0.90         0.81
Scaling × 0.3                        –            –                 0.90         1.00
Rotation 8° (clockwise)              16.21        16.44             0.90         0.71
Movement (right) 20%                 11.98        12.22             0.74         0.68
Cropping Y direction 3.5%            –            –                 0.79         0.81
Cropping X direction 15%             –            –                 0.74         0.74

Table 4 Comparison with other encryption algorithms (NC values)

Attacks                              DCT     DWT     DWT-DCT
Gaussian noise 15%                   0.76    0.68    0.81
JPEG attack 10%                      0.84    0.83    0.93
Median filtering [7 × 7]/10 times    0.66    0.70    0.81
Scaling × 0.3                        0.89    0.87    1.00
Rotation 8° (clockwise)              0.89    0.76    0.71
Movement (right) 20%                 0.71    0.70    0.68
Cropping Y direction 3.5%            0.83    0.76    0.81
Cropping X direction 15%             0.77    0.68    0.74

4.4 Comparison with Other Encryption Algorithms

We use the same method to perform DCT-only and DWT-only encryption experiments on the original medical images and compare the results with the DWT-DCT encryption used in this article. The results are shown in Table 4. The DWT-DCT encryption algorithm is only slightly inferior to the DWT encryption algorithm under rotation and translation attacks and performs better under the other attacks. The DCT encryption algorithm performs well against rotation attacks but is inferior to the proposed algorithm under conventional attacks and scaling attacks; for translation and cropping, it is similar to the proposed algorithm. On the whole, the proposed algorithm is more robust across the various attacks.

5 Conclusion

This paper proposes a robust zero-watermarking algorithm based on PHTs-DCT feature extraction and DWT-DCT medical image encryption. We exploit the strong stability and resistance to geometric attacks of the polar harmonic transforms, and then use the DCT to concentrate the energy in the low-frequency region for feature extraction. The algorithm uses zero-watermarking technology to make the embedded watermark invisible, ensuring the clarity and integrity of medical images. Using the homomorphism of the encryption algorithm, the watermark can also be extracted without the original image and without decrypting the image. Encrypting the medical images protects the information of the medical images themselves and greatly improves the security of medical images in network medical diagnosis. The simulation results show that the proposed algorithm is numerically stable and robust to various attacks.

Acknowledgements This work was supported in part by the Natural Science Foundation of China under Grants 62063004 and 61762033, in part by the Hainan Provincial Natural Science Foundation of China under Grants 2019RC018 and 619QN246, by the Postdoctoral Science Foundation under Grant 2020TQ0293, by the Science and Technology Research Project of Chongqing Education Commission under Grant KJQN201800442, and by the General Project of Chongqing Natural Science Foundation under Grant cstc2020jcyj-msxmX0422.


Support System for Medicine and Healthcare

Recent Advancements on Smartwatches and Smartbands in Healthcare

Marco Cipriano, Gennaro Costagliola, Mattia De Rosa, Vittorio Fuccella, and Sergiy Shevchenko

Abstract The purpose of this paper is to present recent advances regarding the use of smartwatches and smartbands in healthcare settings. To this end, we conducted a careful review of recent studies in this area, which revealed a significant increase in the literature on this topic in recent years. A substantial part of the works concerns the monitoring of daily motor activities in order to support a healthy lifestyle, but there are also works on hospitalization and more serious diseases. In general, some results confirm that these devices can help maintain a healthy lifestyle, and they can also be used successfully for more serious and debilitating diseases, although in the latter case more rigorous research is needed, especially on larger populations, before they can be definitively integrated into clinical practice.

Keywords Smartwatch · Smartband · Healthcare · Mobile health

1 Introduction

Smartwatches and smartbands have become increasingly popular in the past few years. For instance, the global smartwatch market was valued at $20.64 billion in 2019 and is projected to reach $96.31 billion by 2027 [32], with many leading vendors such as Apple, Xiaomi, and Fitbit operating in this sector. Such devices have similarities with widespread smartphones (e.g., hardware, UI, software, operating system), but one major difference is that they are designed to be worn continuously even when not directly used (e.g., even during sleep), without impeding the user's daily activities. In contrast, smartphones, when not in use, are typically placed in different locations (in a bag, in a pocket, on a desk, etc.).

M. Cipriano · G. Costagliola · M. De Rosa (B) · V. Fuccella · S. Shevchenko Department of Informatics, University of Salerno, Via Giovanni Paolo II, Fisciano, (SA), Italy e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Y.-W. Chen et al. (eds.), Innovation in Medicine and Healthcare, Smart Innovation, Systems and Technologies 242, https://doi.org/10.1007/978-981-16-3013-2_10


Smartwatches and smartbands are typically equipped with many sensors, some already found in smartphones (e.g., accelerometers, gyroscopes, GPS) and others that take advantage of being worn by the user (e.g., heart rate). This is particularly advantageous for applications that require continuous monitoring of the physical state (and the physical activity performed by the user), for example to give alerts or suggestions based on changes in the parameters monitored by the sensors and on the location of the user. Moreover, alarms and notifications can also be noticed more easily than on smartphones, as users can be alerted with vibrations, sounds, or screen text while wearing the device [33], and they can directly respond to such notifications, for example by writing a reply [34–37] to a just-received instant message. In fact, the user-friendliness and the possibility of wearing them all the time are the most important factors that allow continuous and long-term monitoring.

In addition to continuously tracking data related to the physical state, the presence of many other practical features makes smartwatches good platforms for healthcare applications [1]. There is high development potential with regard to the use of smartwatches in healthcare applications, and the ability to run software applications (apps) on them allows their use in a variety of healthcare scenarios, depending on user needs. This possibility can play a very important role in improving the treatment and prevention of many diseases that afflict humans today.

In recent years, research in this area has increased greatly, also taking advantage of the new devices that have entered the market. The purpose of this paper is to analyze the recent literature in this area that exploits these new devices. For analyses of less recent studies, we refer to previous surveys [38, 39].

The paper is organized as follows: Sect. 2 describes how we performed the literature review, while Sect. 3 describes the results of our review for the categories we have identified. Finally, Sect. 4 discusses our findings and Sect. 5 concludes the paper.
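As a toy illustration of the threshold-alert pattern described in the introduction (our own sketch; the safe band is arbitrary and not clinical guidance):

```python
def check_heart_rate(samples, low=40, high=150):
    """Flag heart-rate samples outside a configurable safe band.

    `samples` is a list of (timestamp, bpm) pairs; the thresholds are
    illustrative defaults, not clinical guidance.
    """
    return [(t, bpm) for t, bpm in samples if not low <= bpm <= high]

alerts = check_heart_rate([(0, 72), (1, 160), (2, 38)])
# -> [(1, 160), (2, 38)]
```

A real wearable app would run such a check on a continuous sensor stream and raise a vibration or on-screen notification when the returned list is non-empty.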

2 Literature Analysis

Recent studies on smartwatches/smartbands have highlighted that, although there have been several research studies on them, there is a lack of user-centered design and usability testing prior to implementation [39]. In this paper, we therefore examine the use of smartwatches/smartbands in the healthcare sector, analyzing in which cases they are used, with particular attention to the obtained results.


In order to find recent studies on the use of smartwatches/smartbands for health, we conducted a bibliographic search using dblp (https://dblp.org), IEEE Xplore (https://ieeexplore.ieee.org), ScienceDirect (https://www.sciencedirect.com), the ACM Digital Library (https://dl.acm.org) and PubMed (https://pubmed.ncbi.nlm.nih.gov), with a focus on recent literature. We immediately noticed a wide range of studies, which highlights that the scientific community believes in the use of these devices for healthcare. In our searches on the digital libraries we used queries including "smartwatch application", "smartwatch app", "smartwatch healthcare" (and their variants with "smart watch", "smartband", "smart band"), and the most notable brands of smartwatch/smartband: "Android Wear", "Apple Watch", "Xiaomi", "Amazfit", "Garmin", "Samsung Galaxy", "Fitbit". After collecting the results, we first eliminated duplicates, then read the abstracts and eliminated papers that were obviously not relevant. Then, we read the remaining documents and excluded the non-relevant ones, while checking the references for possibly missing papers. Finally, we analyzed the remaining papers and categorized them. We also identified, for each of them, the target, the number of experiment participants, the sensors used, and the setting.
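The de-duplication step of such a screening pipeline might be sketched as follows (a hypothetical helper of ours; real library exports carry far more metadata than a title):

```python
def deduplicate(records):
    """Drop duplicate search hits by a normalised title key.

    Illustrative screening step: `records` are dicts with a 'title'
    field, as one might export from dblp, IEEE Xplore, or PubMed.
    Normalisation makes case and punctuation differences harmless.
    """
    seen, unique = set(), []
    for rec in records:
        key = "".join(ch for ch in rec["title"].lower() if ch.isalnum())
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique

hits = [{"title": "SmartFall: A Smartwatch-Based Fall Detection System"},
        {"title": "Smartfall: a smartwatch-based fall detection system"}]
# deduplicate(hits) keeps only the first record
```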

3 Results

Our literature review led us to an in-depth analysis of 31 papers. The selected articles were published in 2014 [1, 2] (n = 2), 2015 [3] (n = 1), 2016 [4] (n = 1), 2017 [5, 6] (n = 2), 2018 [7–16] (n = 10), and 2019 [17–31] (n = 15). In the papers we analyzed, most researchers focused on the use of smartwatches/smartbands to monitor daily motor activities (to enable a healthy lifestyle) [9, 10, 12, 18–26], but we have seen that the scientific community has also moved to more serious pathologies (see Table 1). In particular, we tried to categorize the recent literature according to the targets in which the devices were used. We found the following categories: healthy people (without evident pathologies), nursing assistance, mood state, heart diseases, tremor, memory loss, multiple sclerosis, sleep pathologies, other pathologies and physical activity. In the next sections, we describe these categories and their papers in detail.

Table 1 Summary of the reviewed literature

Paper | Category | Sensors | Experiment participants
1 | Nursing assistance | Multiple | –
2 | Tremor | Accelerometer | 41
3 | Nursing assistance | Accelerometer |
4 | Physical activity (stroke) | Accelerometer | 37
5 | Physical activity (stroke) | Multiple | 10
6 | Heart disease | Heart rate | 20
7 | Physical activity (diabetes) | Accelerometer | 138
8 | Mood state | Accelerometer | 50
9 | Healthy people | Accelerometer | 35
10 | Healthy people | Accelerometer |
11 | Nursing assistance | | 10
12 | Physical activity (cancer) | Accelerometer | 20
13 | Heart disease | Accelerometer | 20
14 | Mood state (PTSD) | Accelerometer, heart rate | 13
15 | Sleep pathologies | Sleep | 223
16 | Tremor | Accelerometer, heart rate |
17 | Mood state (depressive disorder) | Accelerometer, heart rate | 30
18 | Healthy people | GPS, accelerometer | 5
19 | Healthy people | Accelerometer, heart rate |
20 | Healthy people | Heart rate | 65
21 | Healthy people | Multiple | 1
22 | Healthy people | Heart rate | 20
23 | Healthy people | Multiple |
24 | Healthy people | Accelerometer | 38
25 | Healthy people | Accelerometer | 18
26 | Healthy people | Accelerometer, heart rate | 4
27 | Heart disease | Heart rate | 40
28 | Physical activity (osteoarthritis) | Heart rate | 15
29 | Memory losses | Camera | 48
30 | Physical activity (multiple sclerosis) | Accelerometer | 629
31 | Sleep walking | Accelerometer | 2

3.1 Healthy People

In recent years, wearable devices have revolutionized the fitness field [23]. In [9], among other things, pedometers/accelerometers were compared with the Fitbit Flex smartband in assessing sedentary time in a study with 35 participants (with the ability to walk independently). Sedentary time from all devices was significantly correlated, with a 23.3% difference between the time recorded by the Fitbit Flex and the hip-worn Actical accelerometer.

To monitor the user's state, wearable fitness trackers (WFTs) with heart-rate monitoring are also used [26]. The studies in [22] show that, although the WFTs were consumer-level devices, they produced results very close to those of dedicated medical devices. In fact, the mean absolute percent difference between the WFTs and a chest strap
heart rate monitor (HRM) was 6.9%. Overall, >75% of the WFTs were within 5–10% of the HRM. It is important to highlight that the participants were children (19 in the first experiment and 20 in the second one), monitored not only during school activity but also during rest, play, etc.

In addition to measuring heart rate and the number of steps walked, wearable devices also include sensors that allow measuring cardiorespiratory fitness (CRF). In [20], the Fitbit Charge 2's measure of CRF was compared with gold-standard V̇O2max testing in an experiment with 65 participants (aged 18–45) involved for one week, in which they completed a qualifying outdoor run. The results show that the absolute percentage error was less than 10% for each comparison and that the device has an acceptable level of accuracy for the tested population.

Given the accuracy of such devices, it is possible to build mathematical models that help people who practice strenuous physical activity (endurance activities) to do so efficiently, avoiding injuries and cardiovascular problems. Users, in fact, are ready to integrate smart devices into their lives to achieve these results, although such use may create privacy issues [19].

Smartwatches/smartbands, being devices with limited computing power, may utilize smartphones to perform more expensive computations, especially when it comes to mathematical models based on machine learning. Using this architecture, a monitoring system for the elderly [10] makes use of the wearable device's raw accelerometer data to allow intelligent fall detection, demonstrating greater accuracy than traditional systems.

Smart devices evolve very rapidly, and medical scientists are not always able to implement the software themselves, since very specific knowledge is required. To solve this problem, frameworks have been proposed [21] that allow the creation of complex processing systems using sets of ready-to-use tools.
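A crude baseline for accelerometer-based fall detection might look as follows. This is a hypothetical threshold heuristic of ours, not the deep-learning model of [10]; the free-fall and impact thresholds and the look-ahead window are assumptions:

```python
def detect_fall(mags_g, free=0.5, impact=2.5):
    """Naive fall heuristic on acceleration magnitudes (in g): a
    free-fall dip below `free` followed shortly by an impact spike
    above `impact`. A threshold baseline, not a learned model."""
    for i, m in enumerate(mags_g):
        if m < free:
            # Look for an impact within the next few samples.
            if any(x > impact for x in mags_g[i:i + 10]):
                return True
    return False

walking = [1.0, 1.2, 0.9, 1.1, 1.0]
fall = [1.0, 0.3, 0.1, 3.2, 1.0]
# detect_fall(walking) -> False, detect_fall(fall) -> True
```

The appeal of the learned approach in [10] is precisely that it avoids hand-tuning such thresholds, which generalize poorly across users and activities.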
Moreover, one of the strengths of smart wearable devices is that they can potentially be used by people of any age, including for mobility monitoring [18].

In recent years there has been interest in student monitoring. In [25], wearable devices are used to monitor student activity to improve their learning ability during school lessons. Thanks to a study with 18 participants, it has been shown that students have specific movement patterns depending on whether they are concentrating or not. In [24], the authors performed a 12-week pilot study with 38 college students, in which social media and smartwatches (to estimate the number of steps walked) were used. The authors show that it is possible to improve physical activity and dietary behaviors, although the use of smartwatches may not have provided an additional benefit (with respect to social media alone).

3.2 Nursing Assistance

The use of smartwatches in hospitals and nursing homes has great potential. In [11], a smartwatch-based system is used to provide real-time vital-sign monitoring, threshold alarms, and to-do list handling. In fact, the decrease in reaction time is evident, and in


some cases, the presence of smart wearables allows for more efficient communication between the medical staff within the facility. Thanks to the smartwatch system, staff response time to call lights was reduced by 58% for bathrooms, 40% for bedrooms, and 29% for bed exit alarms. On the other hand, the use of to-do lists has not shown a notable improvement in task management, but the smartwatch system reduced the perceived workload by about 50%. Another smartwatch system to help nurses in hospital settings is presented in [3]. In [1], instead, the authors looked at the advantages and disadvantages of smartwatches for supporting older adults at home, concluding that such devices show promise, but that significant effort is needed in designing appropriate user interfaces and hardware to address the constraints associated with potential physical and cognitive impairments.

3.3 Mood State

Mood and emotional state are very important aspects of everyone's daily life. In [8], thanks to sensors (e.g., accelerometers), the authors were able to recognize the emotional state and consequently act on it, showing that the way a person walks reflects their current state of mind. In the experimental condition, the identification accuracy for sadness and happiness was 78% using only a smartwatch accelerometer as a sensor. Smartwatches have also been used [17] to provide real-time assessment of mood and cognition in people with major depressive disorder. The increase in the health data that can be obtained through wearable devices has offered, and still offers, new opportunities in therapies for the treatment of mental health, as in the case of post-traumatic stress disorder (PTSD) that afflicts former war veterans [14]. In this study with 13 subjects, the use of these devices increased self-awareness and supported social interactions, including interaction with other veterans.

3.4 Heart Diseases

A sensor that monitors the heart rate is present in most wearable devices on the retail market. Therefore, it can be useful to compare the accuracy of different devices, both smartwatches and smartbands, especially if one plans to use them in healthcare. The study conducted in [6] shows that both the Apple Watch and the Fitbit are suitable devices to measure heart rate in patients with heart failure (with a slight advantage for the Fitbit in some measurements) and show promise for the prognosis and continuous monitoring of outpatients.

In [13], the use of the Fitbit Flex smartband to monitor step count and distance walked in post-cardiac surgery patients was assessed. However, in this case, the


device showed too high an error in measuring steps and distance traveled for use in a medical setting. In the case of atrial fibrillation (AF), smartwatches are promising for non-invasive monitoring, combined with other precautions. The algorithm used in [27] is able, with 98.1% accuracy, to identify an irregular pulse, patients prone to AF, and those with AF in the past. Despite their advanced age, patients were inclined to use the device, and some felt safer wearing a smartwatch that could alert others if their condition worsened.
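A common simple surrogate for irregular-pulse detection is the variability of beat-to-beat (RR) intervals. The sketch below uses the coefficient of variation with an arbitrary threshold; it is our illustration, not the proprietary algorithm evaluated in [27]:

```python
from statistics import mean, pstdev

def irregular_pulse(rr_ms, cv_threshold=0.15):
    """Flag an irregular pulse when RR intervals vary widely.

    Uses the coefficient of variation of RR intervals (in ms), a
    common simple surrogate for rhythm irregularity; the threshold
    is an assumption, not a clinically validated cut-off.
    """
    cv = pstdev(rr_ms) / mean(rr_ms)
    return cv > cv_threshold

regular = [800, 810, 795, 805, 800]          # steady ~75 bpm
afib_like = [620, 980, 540, 1100, 700, 900]  # chaotic intervals
# irregular_pulse(regular) -> False, irregular_pulse(afib_like) -> True
```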

3.5 Tremor

Tremor is a disease that obviously has the potential of being monitored by motion-detecting sensors. In [2], the authors used smartwatches to distinguish between Parkinson's disease postural tremor and essential tremor with an accuracy of 90.9%. In [16], instead, the author considers whether wearable technologies can provide diagnostically relevant information about essential tremor and can be used for its early detection, diagnosis, education, and treatment. Regarding this topic, however, further research is needed.

3.6 Memory Loss

The number of people with dementia worldwide is estimated to increase from 50 million to 152 million by 2050 [29]. Technology is seen as a promising tool to improve the lives of people who unfortunately suffer ongoing memory loss. Social Support Aid (SSA) [29] is a smartphone application, paired with a smartwatch, that provides facial recognition. However, there was no evidence of improvement in the quality of social interaction or the quality of life of people with this disease. Nonetheless, the obtained results bode well for the development and evaluation of future assistive technologies.

3.7 Sleep Pathologies

The relationship between sleep and body mass index (BMI) is well known, and the use of wearable devices has allowed researchers to evaluate it objectively. In [15], the authors used the Fitbit smartband data of 223 participants to show that the average hours of sleep per day were negatively associated with BMI, and observed that such devices allow for continuous, accurate data collection.

Sleep disorders that can cause severe behaviors include REM sleep behavior disorder and sleepwalking. These pathologies cause the person to perform, while in a state of low consciousness, activities usually performed in a state of full consciousness. In [31], the


authors consider how using Fitbit devices to monitor steps and movements during the night may be crucial for identifying and managing these pathologies.
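The negative sleep-BMI association reported in [15] is the kind of result one can check on exported tracker data with a plain Pearson correlation (illustrative code with made-up numbers, not the analysis of [15]):

```python
import math

def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical per-participant averages (hours of sleep vs. BMI).
sleep_hours = [5.5, 6.0, 7.0, 7.5, 8.0]
bmi = [29.0, 28.1, 25.3, 24.0, 23.2]
# pearson(sleep_hours, bmi) is strongly negative (close to -1)
```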

3.8 Other Pathologies and Physical Activity

Exercise, in particular aerobic exercise, is an integral part of diabetes treatment, as physical activity is considered an excellent medicine. In [7], however, the use of the Fitbit smartband did not lead to a steady increase in the daily steps of patients with this disease. A total of 138 subjects, divided into 3 groups, participated in the study: Fitbit only (48), Fitbit with a daily text message reminding them to wear the Fitbit (44), and a daily text message about a step goal (46). In all groups, after the novelty effect wore off in the first few days, the number of daily steps steadily decreased.

The use of smartwatches/smartbands is common among people with multiple sclerosis, as frequent physical activity helps contain and manage the symptoms. People who use such devices, in a survey with 629 participants [30], report being inclined and driven to more physical activity (compared to those who do not use them).

In [28], the authors evaluated the performance of a Fitbit device in estimating physical activity in 25 people with osteoarthritis, as physical activity improves the patient's quality of life. However, this study did not show great results; rather, the authors concluded that prudence should be exercised when the Fitbit device is used by patients with this pathology.

Finally, in [4, 5] the focus was on monitoring physical activity in individuals who had experienced a stroke, while in [12] the focus was on cancer survivors.

4 Discussion

The literature study shows a strong message from the scientific community regarding the need to devise ways to use smartwatches/smartbands to aid in health protection and prevention. As of 2020, ten years have passed since retail wearable devices first appeared, and the technology has continued to evolve. The market has gone from smartbands to smartwatches and, in recent years, has returned to smartbands, as they are more comfortable in everyday use, have longer battery life, and can be used conveniently at night. Smartwatches have more potential in cases that involve notifying the wearer, for the simple reason that they have a larger display. They are also already worn by many people, as they are normally used for other purposes as well. On the other hand, smartbands are mostly used in monitoring motor and cardiac activity, both for people with and without diseases. The sensors in these devices have been shown to have accuracies comparable to those of dedicated medical devices, as well as being easy to use.


5 Conclusions

The main advantage of using smartwatches/smartbands is their small size and shape, which allows them to be worn and used all day long. As can be seen from the papers we have described, they are very useful and helpful tools to monitor steps, movement, and heart rate. The use of these devices by patients with chronic diseases and conditions is promising; we have also seen that, in order to achieve broad acceptance by healthcare professionals, rigorous research on their accuracy, completeness, and effect on the physician's work must be conducted before finally integrating them into clinical practice. Studies with users (patients) are needed to investigate the features that clinicians cannot ignore, user interface design, and usability for the wide variety of clinical settings that may arise. Finally, we can say that more research is needed to understand the impact of these devices in clinical practice.

Acknowledgements This work was partially supported by the grant "Fondo FSC 2014 2020 per il Piano Stralcio Ricerca ed Innovazione 2015–2017—MIUR—progetto MEDIA" (project code: PON03PE_00060_5/12 - D54G14000020005).

References

1. Frederic, E., Christian, L.: Supporting elderly homecare with smartwatches: advantages and drawbacks. Studies in Health Technology and Informatics 205 (e-Health – For Continuity of Care), 667–671 (2014). https://doi.org/10.3233/978-1-61499-432-9-667
2. Wile, D.J., Ranawaya, R., Kiss, Z.H.: Smart watch accelerometry for analysis and diagnosis of tremor. Journal of Neuroscience Methods 230, 1–4 (2014). https://www.sciencedirect.com/science/article/pii/S016502701400137X
3. Bang, M., Solnevik, K., Eriksson, H.: The nurse watch: design and evaluation of a smart watch application with vital sign monitoring and checklist reminders. In: AMIA Annual Symposium Proceedings, vol. 2015, pp. 314–319. American Medical Informatics Association (2015). https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4765660/
4. Rozanski, G.M., Aqui, A., Sivakumaran, S., Mansfield, A.: Consumer wearable devices for activity monitoring among individuals after a stroke: a prospective comparison. JMIR Cardio 2(1), e1 (2018). https://doi.org/10.2196/cardio.8199
5. Mooney, R., Quinlan, L.R., Corley, G., Godfrey, A., Osborough, C., ÓLaighin, G.: Evaluation of the Finis Swimsense® and the Garmin Swim™ activity monitors for swimming performance and stroke kinematics analysis. PLOS ONE 12(2), 1–17 (2017). https://doi.org/10.1371/journal.pone.0170902
6. Moayedi, Y., Abdulmajeed, R., Posada, J.D., Foroutan, F., Alba, A.C., Cafazzo, J., Ross, H.J.: Assessing the use of wrist-worn devices in patients with heart failure: feasibility study. JMIR Cardio 1(2), e8 (2017). https://doi.org/10.2196/cardio.8301
7. Polgreen, L.A., Anthony, C., Carr, L., Simmering, J.E., Evans, N.J., Foster, E.D., Segre, A.M., Cremer, J.F., Polgreen, P.M.: The effect of automated text messaging and goal setting on pedometer adherence and physical activity in patients with diabetes: a randomized controlled trial. PLOS ONE 13(5), 1–12 (2018). https://doi.org/10.1371/journal.pone.0195797
8. Quiroz, J.C., Geangu, E., Yong, M.H.: Emotion recognition using smart watch sensor data: mixed-design study. JMIR Mental Health 5(3), e10153 (2018). https://mental.jmir.org/2018/3/e10153/
9. Donahoe, K., MacDonald, D.J., Tremblay, M.S., Saunders, T.J.: Validation of PiezoRx pedometer derived sedentary time. International Journal of Exercise Science 11(7), 552–560 (2018). https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6033509/
10. Mauldin, T., Canby, M., Metsis, V., Ngu, A., Rivera, C.: SmartFall: a smartwatch-based fall detection system using deep learning. Sensors 18(10), 3363 (2018). https://doi.org/10.3390/s18103363
11. Ali, H., Li, H.: Evaluating a smartwatch notification system in a simulated nursing home. International Journal of Older People Nursing 14(3), e12241 (2019). https://onlinelibrary.wiley.com/doi/abs/10.1111/opn.12241
12. Hardcastle, S.J., Galliott, M., Lynch, B.M., Nguyen, N.H., Cohen, P.A., Mohan, G.R., Johansen, N.J., Saunders, C.: Acceptability and utility of, and preference for wearable activity trackers amongst non-metropolitan cancer survivors. PLOS ONE 13(12), 1–13 (2019). https://doi.org/10.1371/journal.pone.0210039
13. Daligadu, J., Pollock, C.L., Carlaw, K., Chin, M., Haynes, A., Thevaraajah Kopal, T., Tahsinul, A., Walters, K., Colella, T.J.: Validation of the Fitbit Flex in an acute post-cardiac surgery patient population. Physiotherapy Canada 70(4), 314–320 (2018). https://doi.org/10.3138/ptc.2017-34
14. Ng, A., Reddy, M., Zalta, A.K., Schueller, S.M.: Veterans' perspectives on Fitbit use in treatment for post-traumatic stress disorder: an interview study. JMIR Mental Health 5(2), e10415 (2018). https://mental.jmir.org/2018/2/e10415/
15. McDonald, L., Mehmud, F., Ramagopalan, S.V.: Sleep and BMI: do (Fitbit) bands aid? F1000Research 7(511) (2018). https://doi.org/10.12688/f1000research.14774.2
16. Daneault, J.F.: Could wearable and mobile technology improve the management of essential tremor? Frontiers in Neurology 9, 257 (2018). https://www.frontiersin.org/article/10.3389/fneur.2018.00257
17. Cormack, F., McCue, M., Taptiklis, N., Skirrow, C., Glazer, E., Panagopoulos, E., van Schaik, T.A., Fehnert, B., King, J., Barnett, J.H.: Wearable technology for high-frequency cognitive and mood assessment in major depressive disorder: longitudinal observational study. JMIR Mental Health 6(11), e12814 (2019). https://mental.jmir.org/2019/11/e12814
18. Kheirkhahan, M., Nair, S., Davoudi, A., Rashidi, P., Wanigatunga, A.A., Corbett, D.B., Mendoza, T., Manini, T.M., Ranka, S.: A smartwatch-based framework for real-time and online assessment and mobility monitoring. Journal of Biomedical Informatics 89, 29–40 (2019). https://www.sciencedirect.com/science/article/pii/S1532046418302120
19. Nelson, E.C., Verhagen, T., Vollenbroek-Hutten, M., Noordzij, M.L.: Is wearable technology becoming part of us? Developing and validating a measurement scale for wearable technology embodiment. JMIR Mhealth Uhealth 7(8), e12771 (2019). https://mhealth.jmir.org/2019/8/e12771/
20. Klepin, K., Wing, D., Higgins, M., Nichols, J., Godino, J.G.: Validity of cardiorespiratory fitness measured with Fitbit compared to V̇O2max. Medicine & Science in Sports & Exercise 51(11), 2251–2256 (2019). https://doi.org/10.1249/mss.0000000000002041
21. Shah, L.M., Yang, W.E., Demo, R.C., Lee, M.A., Weng, D., Shan, R., Wongvibulsin, S., Spaulding, E.M., Marvel, F.A., Martin, S.S.: Technical guidance for clinicians interested in partnering with engineers in mobile health development and evaluation. JMIR Mhealth Uhealth 7(5), e14124 (2019). https://mhealth.jmir.org/2019/5/e14124/
22. Brazendale, K., Decker, L., Hunt, E.T., Perry, M.W., Brazendale, A.B., Weaver, R.G., Beets, M.W.: Validity and wearability of consumer-based fitness trackers in free-living children. International Journal of Exercise Science 12(5), 471–482 (2019). https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6413843/
23. Arigo, D., Jake-Schoffman, D.E., Wolin, K., Beckjord, E., Hekler, E.B., Pagoto, S.L.: The history and future of digital health in the field of behavioral medicine. Journal of Behavioral Medicine 42(1), 67–83 (2019). https://doi.org/10.1007/s10865-018-9966-z
24. Pope, Z., Barr-Anderson, D., Lewis, B., Pereira, M., Gao, Z.: Use of wearable technology and social media to improve physical activity and dietary behaviors among college students: a 12-week randomized pilot study. International Journal of Environmental Research and Public Health 16(19), 3579 (2019). https://doi.org/10.3390/ijerph16193579
25. Liang, J.M., Su, W.C., Chen, Y.L., Wu, S.L., Chen, J.J.: Smart interactive education system based on wearable devices. Sensors 19(15), 3260 (2019). https://doi.org/10.3390/s19153260
26. Collins, T., Woolley, S.I., Oniani, S., Pires, I.M., Garcia, N.M., Ledger, S.J., Pandyan, A.: Version reporting and assessment approaches for new and updated activity and heart rate monitors. Sensors 19(7), 1705 (2019). https://doi.org/10.3390/s19071705
27. Ding, E.Y., Han, D., Whitcomb, C., Bashar, S.K., Adaramola, O., Soni, A., Saczynski, J., Fitzgibbons, T.P., Moonis, M., Lubitz, S.A., Lessard, D., Hills, M.T., Barton, B., Chon, K., McManus, D.D.: Accuracy and usability of a novel algorithm for detection of irregular pulse using a smartwatch among older adults: observational study. JMIR Cardio 3(1), e13850 (2019). https://doi.org/10.2196/13850
28. Silva, G.S., Yang, H., Collins, J.E., Losina, E.: Validating Fitbit for evaluation of physical activity in patients with knee osteoarthritis: do thresholds matter? ACR Open Rheumatology 1(9), 585–592 (2019). https://onlinelibrary.wiley.com/doi/abs/10.1002/acr2.11080
29. McCarron, H.R., Zmora, R., Gaugler, J.E.: A web-based mobile app with a smartwatch to support social engagement in persons with memory loss: pilot randomized controlled trial. JMIR Aging 2(1), e13378 (2019). https://doi.org/10.2196/13378
30. Silveira, S.L., Motl, R.W.: Activity monitor use among persons with multiple sclerosis: report on rate, pattern, and association with physical activity levels. Multiple Sclerosis Journal – Experimental, Translational and Clinical 5(4) (2019). https://doi.org/10.1177/2055217319887986
31. Somboon, T., Grigg-Damberger, M.M., Foldvary-Schaefer, N.: Night stepping: Fitbit cracks the case. Journal of Clinical Sleep Medicine 15(02), 355–357 (2019). https://jcsm.aasm.org/doi/abs/10.5664/jcsm.7646
32. Tewari, D., Patil, A.: Smartwatch market by product, application and operating system: global opportunity analysis and industry forecast, 2020–2027. https://www.alliedmarketresearch.com/smartwatch-market (2020)
33. O'Donnell, B.: Smartwatches: the new smartphones jr.? https://eu.usatoday.com/story/tech/2015/04/02/smartwatches-the-new-smartphones/70825490/ (2015)
34. Costagliola, G., De Rosa, M., Fuccella, V.: Handwriting on smartwatches: an empirical investigation. IEEE Transactions on Human-Machine Systems 47(6), 1100–1109 (2017)
35. Costagliola, G., De Rosa, M., D'Arco, R., De Gregorio, S., Fuccella, V., Lupo, D.: C-QWERTY: a text entry method for circular smartwatches. In: The 25th International DMS Conference on Visualization and Visual Languages, pp. 51–57 (2019)
36. Costagliola, G., D'Arco, R., De Gregorio, S., De Rosa, M., Fuccella, V., Lupo, D.: Text entry on circular smartwatches: the C-QWERTY layout. Journal of Visual Language and Computing 2019(2), 127–133 (2019)
37. De Rosa, M., Fuccella, V., Costagliola, G., Adinolfi, G., Ciampi, G., Corsuto, A., Di Sapia, D.: T18: an ambiguous keyboard layout for smartwatches. In: 2020 IEEE International Conference on Human-Machine Systems (ICHMS), pp. 1–4 (2020)
38. Lukowicz, P., Kirstein, T., Tröster, G.: Wearable systems for health care applications. Methods of Information in Medicine 43(03), 232–238 (2004). https://pubmed.ncbi.nlm.nih.gov/15227552/
39. Lu, T.C., Fu, C.M., Ma, M., Fang, C.C., Turner, A.: Healthcare applications of smart watches. Applied Clinical Informatics 07(03), 850–869 (2016). https://doi.org/10.4338/aci-2016-03-r-0042

A Proposal of Architecture Framework and Performance Indicator Derivation Model for Digitalization of Quality Management System

Kasei Miura, Nobuyuki Kobayashi, Tetsuro Miyake, Seiko Shirasaka, and Yoshimasa Masuda

Abstract Manufacturers of pharmaceuticals and medical devices are accountable for the quality and stable supply of their health-related products. To guarantee the provision of products that meet quality requirements, they have established and operate quality management systems. At the same time, as the risk of unstable supply grows due to pandemics and the outsourcing of supply chains, there is a need to strengthen monitoring capabilities for the early detection of quality problems and countermeasures. Recently, with the advancement of digital technology, IoT, automation, and blockchain technologies are expected to be used in the field of quality management. Meanwhile, quality management systems must continuously maintain their performance throughout the process of change. However, no method has been established to systematically promote the digitalization of quality management systems and to evaluate the performance of the transition process. In this paper, we propose an architecture framework and a model for deriving performance indicators that promote the digitalization of quality management systems and evaluate the performance of the transition process. Issues and plans for future work in this field are also discussed.

Keywords Quality management system · Digitalization · Architecture framework · Performance measurement

K. Miura (B) · S. Shirasaka · Y. Masuda
Graduate School of System Design and Management, Keio University, Kanagawa, Japan
e-mail: [email protected]

N. Kobayashi
The System Design and Management Research Institute of Graduate School of System Design and Management, Keio University, Kanagawa, Japan

K. Miura · T. Miyake
Bayer Yakuhin Ltd, Osaka, Japan

Y. Masuda
The School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021
Y.-W. Chen et al. (eds.), Innovation in Medicine and Healthcare, Smart Innovation, Systems and Technologies 242, https://doi.org/10.1007/978-981-16-3013-2_11


1 Introduction

A quality management system is a cross-departmental system that is established and operated within a company to ensure the quality of products offered by a manufacturer [1]. Quality management system standards such as ISO 9001 have been published and are accepted internationally [1]. In addition, in the healthcare industry, ICH Q10 and PIC/S GMP/GDP provide quality management system models for the manufacture and distribution of pharmaceuticals [2–4]. Manufacturers must establish and maintain a quality management system according to regulatory requirements. A quality management system includes two types of monitoring and evaluation. The first is monitoring and evaluation of pre-planned operations. The second is monitoring and evaluation of changes affecting products, processes, and quality management systems, such as product strategies, regulations, medical needs, and supply chains, in order to develop or change plans. A quality management system is a system that supports these two types of monitoring and evaluation to make quality decisions in a knowledge-intensive manner [5]. In addition, the performance of a quality management system itself needs to be properly monitored and evaluated to ensure that requirements are being met on an ongoing basis [6]. However, the collection, storage, analysis, and use of the data needed for this extensive monitoring and evaluation are a challenge [7]. Additionally, the complexity of supply chain systems is increasing due to the active use of outsourcing, and the risk of an unstable supply of products is growing due to pandemics and other factors [8]. To detect quality problems at an early stage and take countermeasures, companies must strengthen the monitoring and evaluation capabilities of their quality management systems [9]. Meanwhile, recent advances in digital technology are expected to transform business at various levels, such as creating new value, improving operational efficiency, and building new business models.
Companies are developing digital strategies to transform their businesses. Enterprise Architecture (EA) frameworks have been used to build and deploy digital strategies [10]. EA is a systematic approach that encompasses all the artifacts of the enterprise, including business, applications, data, and infrastructure, and supports the creation of a roadmap from a current architecture to a future architecture [11]. Furthermore, an adaptive integrated digital framework has been proposed and is expected to be used to drive digital transformation [12]. A quality management system is a comprehensive system that monitors the quality of all of a company's deliverables. Applying the adaptive integrated digital framework is expected to be useful because a quality management system requires the integrated management of departments that each have their own organizational strategies. Therefore, the current study proposes an architecture framework (AF) that combines the Adaptive Integrated Digital Architecture Framework (AIDAF) with quality management systems to support their digitalization. Additionally, we propose a performance indicator derivation model to be used within the AF and evaluate the model's effectiveness. To utilize the proposed AF and the performance indicator derivation model in the context of quality management, we designed a quality management system architecture that is referenced within a company. To

evaluate the effectiveness of the proposed AF and performance indicator derivation model, a case study of a quality management system in a manufacturer was used. In the case study, the following research questions (RQs) were evaluated:

RQ1: How can the proposed AF help manage the digitalization of quality management systems?

RQ2: How can the proposed performance indicator derivation model manage performance measurement in the digitalization of quality management systems?

This paper is organized as follows: first, the research background and related studies are introduced. Then, we propose an architecture framework and a model for deriving performance indicators for promoting the digitalization of quality management systems and evaluating performance in this process. Next, a case study of their application in a manufacturer is reported. Finally, open issues and future work are presented.

2 Background and Related Research

2.1 Quality Management System

In the healthcare industry, companies are accountable for the products they provide and must establish a quality management system to comply with regulatory requirements. The requirements and models for quality management systems include ISO 13485, ICH Q10, and PIC/S GMP/GDP [2–4, 6]. A quality management system requires various monitoring activities to implement quality management and provide products that meet requirements [13]. The first step is to monitor the requirements, i.e., regulatory requirements and medical needs. Products, processes, and quality management systems are planned based on the identified requirements. Next, there is monitoring of operations, i.e., the operational status of the planned products, processes, and quality management systems. This includes monitoring throughout the product life cycle and across the entire supply chain. Finally, there is monitoring of changes, i.e., changes that affect the products, processes, and quality management systems. In addition, there is monitoring of the performance of the quality management system itself to verify that all of this monitoring is being done properly. In other words, the effectiveness, adequacy, and suitability of a quality management system must be evaluated and maintained. Moreover, a quality management system is organized across departments within a company, and each individual department is managed according to its own organizational strategy. Therefore, inter-organizational coordination is necessary for the formulation and development of strategies in a quality management system.
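The monitoring categories described above can be sketched as a small data model. This is an illustrative sketch only: the category descriptions follow the text, while the activity names and department assignments are hypothetical examples.

```python
from dataclasses import dataclass
from enum import Enum

# The four monitoring categories described in the text.
class MonitoringType(Enum):
    REQUIREMENTS = "regulatory requirements and medical needs"
    OPERATIONS = "operation of planned products, processes, and QMS"
    CHANGES = "changes affecting products, processes, and QMS"
    PERFORMANCE = "effectiveness, adequacy, and suitability of the QMS"

@dataclass
class MonitoringActivity:
    name: str
    type: MonitoringType
    owner_department: str  # hypothetical owning department

activities = [
    MonitoringActivity("Track PIC/S GDP revisions", MonitoringType.REQUIREMENTS, "Regulatory Affairs"),
    MonitoringActivity("Batch release trend review", MonitoringType.OPERATIONS, "Quality Control"),
    MonitoringActivity("Supplier change notifications", MonitoringType.CHANGES, "Procurement"),
    MonitoringActivity("Annual management review", MonitoringType.PERFORMANCE, "Quality Assurance"),
]

# A QMS should cover all four categories; flag any gaps.
covered = {a.type for a in activities}
gaps = [t.name for t in MonitoringType if t not in covered]
print(gaps)  # → []
```

A check like this makes the cross-departmental nature of the system explicit: each category is owned by a different department, yet coverage must be evaluated for the system as a whole.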

2.2 Performance Measurement

Performance measurement plays an important role in embodying strategies, communicating them to stakeholders, and assessing their effectiveness [14]. Recently, to respond to the increasing variability, uncertainty, complexity, and ambiguity of the environment surrounding organizations, organizations are changing their strategies more dynamically to achieve their objectives. However, it is sometimes difficult to align performance measurement with changes in strategy, resulting in inconsistency, so methods are needed to manage performance measurement appropriately and dynamically. Against this background, previous studies have proposed a framework for deriving performance indicators based on the linkage between an organization's states and the interventions that realize them [15]. This framework consists of four steps:

1. Estimate the transition states of the organization
2. Analyze the organizational enablers of each state
3. Analyze the functions of strategic interventions
4. Derive performance indicators from the results of the analysis of organizational enablers and strategic interventions

With this framework, organizational performance is expected to be measured and evaluated in a timely manner.
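The four steps above can be sketched as follows. All state, enabler, and intervention names here are hypothetical examples invented for illustration; the framework itself [15] prescribes only the derivation steps, not these values.

```python
# Step 1: estimate the transition states of the organization.
states = ["paper-based records", "partially digitized", "fully digital QMS"]

# Step 2: analyze the organizational enablers of each (target) state.
enablers = {
    "partially digitized": ["document scanning workflow", "staff IT training"],
    "fully digital QMS": ["integrated data platform", "automated deviation alerts"],
}

# Step 3: analyze the functions of the strategic interventions behind each enabler.
intervention_functions = {
    "document scanning workflow": "reduce manual transcription errors",
    "staff IT training": "raise tool adoption",
    "integrated data platform": "enable cross-department monitoring",
    "automated deviation alerts": "shorten time-to-detection of quality issues",
}

# Step 4: derive one candidate indicator per enabler/intervention pair.
def derive_indicators(target_state):
    return [
        f"KPI: {intervention_functions[e]} ({e})"
        for e in enablers.get(target_state, [])
    ]

for kpi in derive_indicators("partially digitized"):
    print(kpi)
```

The point of the sketch is the dependency chain: indicators are not chosen directly, but fall out of the state → enabler → intervention analysis, so they can be re-derived whenever the strategy (and hence the target state) changes.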

2.3 Digitalization in Manufacturing Control and Quality Control

IoT in a factory is expected to enable unattended production control and quick responses to problems based on constant monitoring [16]. The use of IoT in manufacturing has been studied and reviewed [17]. Moreover, cloud manufacturing is being explored to make manufacturing capabilities and resources available on a cloud platform [18]. The timely analysis of large-scale data can reveal anomalies, unknown correlations, and other useful information that are difficult for humans to detect [19]. Advances in big data analytics are helping the research landscape evolve from retrospective to predictive and prescriptive [20]. Big data analytics can contribute to a steady, data-driven supply of high-quality products.

2.4 Enterprise Architecture Framework for the Digital Era

The development of digital-related technologies is providing enterprises with opportunities for transformation. Enterprises are leveraging digital technologies to deliver

new value, improve operational efficiency, and create new business models. Enterprise architecture frameworks have been used to transform integrated enterprise systems using digital technologies [10]. Enterprise architecture frameworks must be effective in order to leverage digital technologies and solve organization-wide challenges. Therefore, an EA framework takes a comprehensive perspective: it encompasses all the artifacts of an enterprise, including business, applications, data, and infrastructure, and is structured to build a roadmap from a current architecture to a future architecture [11]. On the other hand, EA frameworks themselves have also been examined to keep pace with the evolution of digital technologies. Earlier studies have proposed the Adaptive Integrated Digital Architecture Framework (AIDAF), which is aligned with digital strategy [12]. In the adaptive integrated digital cycle, new digitalization projects need to be driven in the short term. The cycle starts with a context phase, in which the project leader develops a plan according to the business needs. In the next assessment/architecture review phase, the architecture board (AB) reviews the architecture described in the IT project plan. In the rationalization phase, the stakeholders and AB decide whether the proposed new IT system is applicable. In the realization phase, the project team initiates the new IT project after deliberating on the issues and action items. In the adaptive integrated digital cycle, different EA frameworks can be adopted to accommodate the differences in strategies among the relevant organizations [12, 21]. With this framework, each department can continue to use the existing EA that suits it, while the planning of systems is managed dynamically through the adaptive EA lifecycle.
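The adaptive integrated digital cycle described above can be sketched as a simple phase walk. The phase names follow the text; the approval callbacks and project name are hypothetical stand-ins for the real architecture-board and stakeholder decisions.

```python
PHASES = ["context", "assessment/architecture review", "rationalization", "realization"]

def run_cycle(project, architecture_board_approves, stakeholders_approve):
    """Walk a digitalization project through the adaptive integrated digital
    cycle, returning the phase where it stopped ('realization' if it passed)."""
    # Context: the project leader drafts a plan from the business needs.
    plan = {"project": project, "phase": "context"}
    # Assessment/architecture review: the architecture board (AB) reviews
    # the architecture described in the IT project plan.
    if not architecture_board_approves(plan):
        return "assessment/architecture review"
    # Rationalization: stakeholders and the AB decide on applicability.
    if not stakeholders_approve(plan):
        return "rationalization"
    # Realization: the project team initiates the new IT project.
    return "realization"

# Hypothetical project that passes both reviews.
print(run_cycle("e-signature rollout", lambda p: True, lambda p: True))  # → realization
```

The sketch highlights the gatekeeping structure: a project only reaches realization after passing both the AB review and the joint stakeholder/AB decision, which is what lets each department keep its own EA while the cycle governs the whole.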

3 AF for Digitalization of Quality Management System and Proposal of Performance Indicator Derivation Model

The proposed architecture framework, which combines the digitalization of the quality management system with AIDAF in the transition process, is shown in Fig. 1. The upper part shows that a quality management system in a company is built cross-departmentally and refers to enterprise-wide quality and digital strategies. The bottom part shows an architecture review board and adaptive integration cycle for managing a quality management system that corresponds to the context of each organization and the company-wide strategy. In the rationalization/assurance phase of the adaptive integration cycle, performance indicators are discussed by referring to the performance measurement model. This enables decision makers to recognize that their quality management system consists of independent departments and to align it with the company's overall digital and quality strategies. In addition, the architecture review board and the adaptive integration cycle in the bottom part of Fig. 1 allow a quality management system to be coordinated with inter-organizational and enterprise-wide strategies and to dynamically manage the performance indicators that correspond to those strategies. The procedures for building this framework are described in Fig. 2.

Fig. 1 AF combining AIDAF for the digitalization of a quality management system (the figure shows Departments A–C and Headquarters, each with its own Mission, Process, and Deliverable, connected to the QMS Architecture, the Quality Standard & Strategy, and the Digital Strategy; below, the Architecture Review Board & Adaptive Integration Cycle runs through Strategy/Context, Assessment, Defining/Requirement, Rationalization/Assurance, and Realization phases, referring to the Performance Measurement Model)

Fig. 2 Architecture to be referenced for quality assurance (a three-axis model: a vertical Requirement & Risk axis, e.g. data integrity, together with the Quality & Digital Strategy, e.g. security policy; a Lifecycle axis spanning Development, Validation, and Utilization; and a Resources/function-allocation axis spanning internal and external entities, covering processes, computerized processes, infrastructure, applications, cloud, and servers)

Figure 2 shows an architecture that is referenced within a company for quality assurance. This architecture is used to design and monitor the quality management system in the architecture framework shown in Fig. 1. Quality management systems support the monitoring and evaluation needed to assure the quality of products and their realization activities [22]. This study focuses on a quality management system from the perspective of quality assurance, addressing the following three perspectives:

1. Management of processes and resources to meet objectives and requirements;
2. Management throughout the life cycle;
3. Allocation of functions to internal and external entities.

The objective-and-requirement perspective on the vertical axis is used to discuss the realization of products, processes, and quality management systems. One horizontal axis, the lifecycle perspective, is used to discuss the enablers of realization in each lifecycle phase. The other horizontal axis, the function-allocation perspective, is used to consider the division of responsibilities for realization. By combining the requirement, lifecycle, and function-allocation perspectives, a comprehensive performance evaluation of quality-related suitability, adequacy, and effectiveness can be conducted. RAMI 4.0 (Reference Architecture Model Industry 4.0), which is based on the Smart Grid Architecture Model, is a reference architecture used in Industry 4.0

[23]. RAMI 4.0 is a three-dimensional model that consists of one vertical axis describing the system layers and two horizontal axes. One of the horizontal axes is the system lifecycle perspective, based on IEC 62890 [22]. The other is the perspective of function allocation in the system, based on IEC 62264. The lifecycle perspective is necessary to keep track of the relationships between the phases and to ensure that changes are appropriate. The function-allocation perspective is necessary to track the division of functions in networked systems. In addition, ISO 9001, ICH Q10, and PIC/S GMP/GDP all have both a lifecycle perspective and a function-allocation perspective, so the proposed architecture is beneficial for managing quality from a bird's-eye view. Additionally, this architecture allows us to examine the quality assurance of products, processes, quality management systems, and digitalization projects. IoT has been reported to be applicable across value chains [17], and our proposed architecture has a lifecycle perspective to support IoT. Also, the domain perspective proposed in the cloud manufacturing architecture can be supported by the function-allocation perspective [18]. Based on the above, the components of the architecture in Fig. 2 are considered to meet the requirements, as they cover the objective-and-requirement, lifecycle, and function-allocation perspectives.

Figure 3 shows a model for deriving performance indicators in the digitalization of quality management systems. This performance indicator derivation model can be used to consider performance indicators of digitalization in the rationalization/assurance phase of the adaptive integration cycle. The organization's state and enabler perspectives correspond to the analysis of the future and current architectures of a quality management system. The digitalization measures correspond to the digital technologies to be introduced. Prior studies have reported that AIDAF drives cloud, mobile, and digital IT strategies. Figure 1 shows an architectural framework that combines AIDAF with the digitalization of quality management systems. This architecture framework is a model that integrates the adaptive integration cycle in the bottom part with the strategy of the headquarters department or the internal organization in the upper part. In the rationalization phase of the adaptive integration cycle, performance indicators can be derived using the performance indicator derivation model. The quality management system in the upper part of the model can refer to the architecture in Fig. 2 to examine the quality of products, processes, and quality management systems.

Fig. 3 Performance indicators derivation model for digitalization of quality management systems (the model relates the target and current architectures of the quality management system through state and enabler analysis, digitalization measures such as IoT, RPA, and cloud, and performance measurement)

In summary, by combining AIDAF with the quality-assurance architecture and the performance indicator derivation model, this proposal differs from previous studies on digitalization in that it not only considers digitalization but also adapts to the various requirements for the supply of quality-assured products and supports their performance measurement.
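The three perspectives of the quality-assurance architecture can be treated as axes against which QMS elements are tagged, so that coverage can be queried mechanically. This is a hypothetical sketch: the axis values and the two example elements are illustrative, not taken from the case study.

```python
from dataclasses import dataclass

# Illustrative axis values for the lifecycle and function-allocation perspectives.
LIFECYCLE = ["development", "validation", "utilization"]
ALLOCATION = ["internal", "external"]

@dataclass
class QmsElement:
    name: str
    requirement: str   # objective/requirement axis (e.g. data integrity)
    lifecycle: str     # lifecycle axis
    allocation: str    # function-allocation axis

# Hypothetical elements of a quality management system.
elements = [
    QmsElement("audit-trail review SOP", "data integrity", "utilization", "internal"),
    QmsElement("cloud platform qualification", "security policy", "validation", "external"),
]

def uncovered_cells(elements):
    """Return (lifecycle, allocation) cells with no element assigned yet."""
    covered = {(e.lifecycle, e.allocation) for e in elements}
    return [(lc, al) for lc in LIFECYCLE for al in ALLOCATION if (lc, al) not in covered]

print(len(uncovered_cells(elements)))  # → 4
```

Querying the uncovered cells is one simple way to operationalize the "comprehensive performance evaluation of suitability, adequacy, and effectiveness" the architecture is meant to support: gaps in the lifecycle × function-allocation grid become candidate topics for the rationalization/assurance phase.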

4 Results

4.1 Case of Quality Management in a Manufacturer

The architecture framework of a quality management system was introduced to local departments in a global company and evaluated for effectiveness. The following results were obtained:

R1. The architecture review board deliberating on the digitalization recognized the differences in quality management processes and IT management processes among the departments.
R2. In the process of reviewing digitalization projects, common needs among departments were identified and reflected in the priorities for investment.
R3. Up to the digitalization concept validation phase, the evaluation of digitalization projects was conducted based on common criteria.
R4. From the development phase of the IT system, the quality control processes and standards of each department were applied.
R5. For the performance evaluation, in the concept validation phase, indicators related to operational risks, such as error rates, were selected and used for evaluation.
R6. In the actual operation phase, indicators of the expected effects and risks were selected and used for evaluation.
R7. To evaluate the performance of the quality management system in actual operation, evaluation by the manager and supervisor was required.
R8. At the beginning of the project, the local digital strategy was not yet established, so an incremental objective state was set.
R9. By referring to the global headquarters' digital strategy, a local roadmap was created using the company-wide roadmap as a baseline.
R10. Once the local digital strategy was established, the prioritization of digitalization projects was clarified on the basis of that strategy.
R11. Based on the roadmap for digitalization, the digitization of documents, which was prioritized, was adopted as a performance indicator.
R12. The results of the performance measurement discussions provided input for developing a validation plan for the digitalization of the quality management system.

5 Discussion

The deliberation by the architecture review board identified differences in governance among departments (R1, R3, R4) and common needs (R2), suggesting that it is valuable for making efficient investment decisions. Additionally, the local digital strategy was discussed by referring to the digital strategy of global headquarters (R9, R10), suggesting that the model was useful for recognizing external factors and devising a streamlined roadmap. This suggests the effectiveness of the proposal with respect to RQ1 in terms of inter-organizational coordination and alignment with company-wide objectives. Furthermore, for each phase of digitalization, the performance indicators were changed from risk-focused indicators to balanced indicators of outcome and risk (R5, R6), suggesting that the model can support the review of performance indicators to match the lifecycle phase. According to the maturity of the strategy, the performance indicators were changed from bottom-up to priority-based performance indicators (R8, R9, R10, R11), suggesting that the model enables us to support the revision of performance indicators according to the strategy. When considering the validation of computerized systems, the discussion during the formulation of performance indicators was used, suggesting that the model supports identifying the information necessary for validation during quality assurance (R12). In summary, the effectiveness with respect to RQ2 was suggested in terms of dynamic performance indicator management and quality assurance.

6 Issues and Future Research

In this case study, the proposal was applied to a limited number of projects in the initiation phase of digitalization. Therefore, to evaluate the effectiveness of this proposal, it is necessary to apply it to the operation phase as well, to consider its application in each phase, and to conduct a quantitative evaluation. Since the discussion has excluded external risks such as information security, a future research topic is how to assess whether the system can maintain its performance through changes in the internal and external environment. The proposed AF has the potential to cover quality management systems consisting of multiple organizations and multiple independent IT systems, such as RPA and IoT. To deal with organizations and IT systems with independent objectives in an integrated manner, future research should address how to clarify, coordinate, and describe the constraints and priorities for achieving common and respective objectives.

7 Conclusion

In this paper, we proposed an AF and a performance indicator derivation model for the digitalization of a quality management system. In addition, we verified, for a quality management system in a manufacturer, that the AF and the performance indicator derivation model can address the issues in the digitalization of quality management systems, leading to answers to RQ1 and RQ2. The main limitation of this study is that it was a single case study within one manufacturer, which limits the scope of the research. Future research should address how to assess system resilience and how to handle complex organizations and IT systems when using the AF and the performance indicator derivation model.

References

1. International Organization for Standardization: ISO 9001:2008 Quality management systems—requirements (2008)
2. ICH: ICH Harmonized Tripartite Guideline: Pharmaceutical Quality System Q10 (2008)
3. Pharmaceutical Inspection Co-operation Scheme: PIC/S GMP Guides (2013)
4. Pharmaceutical Inspection Co-operation Scheme: PIC/S GDP Guides (2014)
5. Garstenauer, A., Blackburn, T., Olson, B.: A knowledge management based approach to quality management for large manufacturing organizations. Eng. Manage. J. 26(4), 47–58 (2014)
6. International Organization for Standardization: ISO 13485:2016 Medical devices—quality management systems—requirements for regulatory purposes (2016)
7. Mazumder, B., Bhattacharya, S., Yadav, A.: Total quality management in pharmaceuticals: a review. Int. J. PharmTech Res. 3(1), 365–375 (2014)
8. Kuo, S., Ou, H.T., Wang, C.J.: Managing medication supply chains: lessons learned from Taiwan during the COVID-19 pandemic and preparedness planning for the future. J. Am. Pharm. Assoc. 61(1), e12–e15 (2021)
9. Gray, J.V., Roth, A.V., Leiblein, M.J.: Quality risk in offshore manufacturing: evidence from the pharmaceutical industry. J. Oper. Manage. 29(7), 737–752 (2011)
10. Buckl, S., Matthes, F., Schulz, C., Schweda, C.M.: Exemplifying a framework for interrelating enterprise architecture concerns. In: Sicilia, M.A., Kop, C., Sartori, F. (eds.) Ontology, Conceptualization and Epistemology for Information Systems, Software Engineering and Service Science, vol. 62, pp. 33–46. Springer, New York (2010)
11. Tamm, T., Seddon, P.B., Shanks, G., Reynolds, P.: How does enterprise architecture add value to organizations? Commun. Assoc. Inf. Syst. 28, 10 (2011)
12. Masuda, Y., Shirasaka, S., Yamamoto, S., Hardjono, T.: Architecture board practices in adaptive enterprise architecture with digital platform: a case of global healthcare enterprise. Int. J. Enterp. Inf. Syst. 14, 1 (2018). IGI Global
13. Lawrence, X.Y., Kopcha, M.: The future of pharmaceutical quality and the path to get there. Int. J. Pharmaceut. 528(1–2), 354–359 (2017)
14. Dixon, J.R., Nanni, A.J., Vollmann, T.E.: New Performance Challenge: Measuring Operations for World-Class Competition (Irwin/Apics Series in Production Management). McGraw-Hill Professional Publishing, Homewood (1990)
15. Miura, K., Kobayashi, N., Shirasaka, S.: A strategic performance indicator deriving framework for evaluating organizational change. Rev. Integr. Bus. Econ. Res. 9(4), 36–46 (2020)
16. Singh, M., Sachan, S., Singh, A., Singh, K.K.: Internet of things in pharma industry: possibilities and challenges. In: Emergence of Pharmaceutical Industry Growth with Industrial IoT Approach, pp. 195–216. Academic Press (2020)

17. Sharma, A., Kaur, J., Singh, I.: Internet of things (IoT) in pharmaceutical manufacturing, warehousing, and supply chain management. SN Comput. Sci. 1(4), 1–10 (2020)
18. Fisher, O., Watson, N., Porcu, L., Bacon, D., Rigley, M., Gomes, R.L.: Cloud manufacturing as a sustainable process manufacturing route. J. Manuf. Syst. 47, 53–68 (2018)
19. Barenji, R.V., Akdag, Y., Yet, B., Oner, L.: Cyber-physical-based PAT (CPbPAT) framework for pharma 4.0. Int. J. Pharmaceut. 567, 118445 (2019)
20. Szlezak, N., Evers, M., Wang, J., Pérez, L.: The role of big data and advanced analytics in drug discovery, development, and commercialization. Clin. Pharmacol. Therapeut. 95(5), 492–495 (2014)
21. Masuda, Y., Shirasaka, S., Yamamoto, S., Hardjono, T.: Int. J. Enterp. Inf. Syst. 13(3), 1–22 (2017). IGI Global. https://doi.org/10.4018/ijeis.2017070101
22. Arling, E.R., Dowling, M.E., Frankel, P.A.: Creating and managing a quality management system. Pharmaceut. Sci. Encycl. Drug Discov. Dev. Manuf. 1–48
23. Resman, M., Pipan, M., Šimic, M., Herakovič, N.: A new architecture model for smart manufacturing: a performance analysis and comparison with the RAMI 4.0 reference model. Adv. Prod. Eng. Manag. 14(2), 153–165 (2019)

Support System for Medical/Hospital Management

Prediction of Length of Stay Using Vital Signs at the Admission Time in Emergency Departments

Amin Naemi, Thomas Schmidt, Marjan Mansourvar, Ali Ebrahimi, and Uffe Kock Wiil

Abstract Length of Stay (LOS) prediction at the time of admission can give clinicians insight into the illness severity of patients and enable them to prevent complications and adverse events. It can also help hospitals to manage their facilities and manpower more efficiently. This paper first applies Borderline-SMOTE and multivariate Gaussian process imputer techniques to overcome data skewness and handle missing values, which have been ignored by most studies. Then, based on our conversations with clinicians, patients are stratified into five classes according to their LOS. Finally, five machine learning algorithms, including support vector machine, deep neural networks, random forest, extreme gradient boosting, and decision tree, are developed to predict the LOS of unselected patients admitted to the emergency department at Odense University Hospital. These models utilize information about patients at the time of admission, including age, gender, heart rate, respiratory rate, oxygen saturation, and systolic blood pressure. The performance of the predictive models on the data before and after imputation and class balancing is investigated using the area under the curve metric, and the results show that our proposed solutions to the data skewness and missing values challenges improve the performance of the predictive models by an average of 13%.

Keywords Length of stay · LOS · Machine learning · Deep learning · Health informatics · Vital signs · Emergency department · Data skewness · Missing values

1 Introduction

A. Naemi (B) · T. Schmidt · M. Mansourvar · A. Ebrahimi · U. K. Wiil
Center for Health Informatics and Technology, The Maersk Mc-Kinney Moller Institute, University of Southern Denmark, Odense, Denmark
e-mail: [email protected]

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021
Y.-W. Chen et al. (eds.), Innovation in Medicine and Healthcare, Smart Innovation, Systems and Technologies 242, https://doi.org/10.1007/978-981-16-3013-2_12

Length of Stay (LOS) normally refers to the duration that a patient stays at the hospital during a single admission, and it is considered one of the key indicators of hospital resource consumption [1]. In recent years, there have been ongoing attempts to force hospitals to adopt policy changes introduced for decreasing the LOS and


readmission rates, which makes hospitals reassess their processes to become more cost-efficient [2]. Therefore, cost management, reducing the number of admissions, and reducing LOS have become among the most important challenges for healthcare systems today [3].

LOS offers a better understanding of patients' movement through a healthcare system, which is necessary for assessing the clinical and operational functions of such a system [4]. It has been shown that a longer LOS in the Intensive Care Unit (ICU), approximately more than 3 days, is associated with increased long-term mortality [5]. Hence, LOS is an important metric for both healthcare providers and patients. Predicting LOS accurately would enable hospitals to anticipate the discharge time of patients, which could help them scale their capacity in long-term strategic planning as well as estimate healthcare costs. LOS is one of the easiest outcomes to extract and can be used as a substitute for other outcomes, such as in-hospital mortality or ICU mortality. LOS is also used to identify illness severity and the utilization of healthcare resources [1].

In recent years, Emergency Department (ED) crowding has continued to be a serious issue in developed countries, and LOS is the most significant indicator for monitoring the throughput process because it can be both the cause and the result of ED crowding [6]. Therefore, predicting LOS can be beneficial for avoiding and controlling ED crowding as well as improving resource allocation and reducing healthcare costs. Various studies have shown that patients' LOS can be predicted effectively by Machine Learning (ML) and Deep Neural Network (DNN) techniques.
Different ML techniques, including Logistic Regression (LR), Decision Tree (DT), Multi-Layer Perceptron (MLP) neural networks, K-Nearest Neighbors (KNN), Support Vector Machine (SVM), and Neural Networks (NN), have been implemented to predict the LOS of patients with specific diseases or in specific clinical environments [2, 7–10]. Several studies have investigated the application of ML techniques for prediction of LOS at EDs. Combes et al. [11] utilized several classifiers, including LR, DT, Random Forest (RF), and SVM, to predict patients' LOS at the ED. Rahman et al. [12] developed a DT-based model for prediction of LOS greater than four hours in an Australian ED. Barnes et al. [13] implemented RF and LR models to predict LOS at the ED and showed that both techniques had higher sensitivity and lower specificity than hospital staff.

Although various ML algorithms have been applied to predict LOS, some aspects still need further research. First, most studies have neglected central challenges in clinical data, such as handling missing values and the skewness of the LOS distribution; Carter et al. [2] have criticized many studies for not considering the skewed nature of LOS data. Second, various studies have simplified the problem to a binary classification (long versus short LOS). Third, most studies have focused on a specific type of patient, such as diabetic or cardiology patients. This study addresses these challenges by using a multivariate Gaussian Process (GP) imputer to fill the missing values of incomplete records and Borderline-SMOTE, a suitable oversampling technique, to overcome the skewness of the data.


Moreover, we considered five common categories of LOS based on a conversation with clinicians, and, finally, five ML-based models, including SVM, DNN, RF, Extreme Gradient Boosting (XGB), and DT, were developed to predict the LOS of an unselected group of patients at the ED.

2 Materials and Methods

2.1 Data Collection

Data was collected from patients admitted to the ED of Odense University Hospital (OUH) between June 2018 and April 2019. Data was gathered from the HL7 interface of Philips IntelliVue patient monitors. The HL7 messages included Arterial Blood Oxygen Saturation (SpO2) and Pulse Rate (PR) calculated by pulse oximetry, and Respiration Rate (RR) measured using 3-lead electrocardiography. A Noninvasive Blood Pressure (NBP) cuff was used to measure Systolic Blood Pressure (SBP) [14]. Other information, such as age, gender, and admission and discharge dates, was added to the dataset [15].

Missing Values

One of the most common challenges in real-world datasets, especially Electronic Health Record (EHR) data, is missing values [16]. There are various techniques to handle them; however, most traditional algorithms, such as univariate imputation, mean substitution, and moving average, fill the missing data using summary statistics (mean, median, or most frequent value). These techniques ignore the correlations between variables. For example, in EHR data, a patient's temperature and HR may be highly correlated, so if a missing HR value is imputed with the overall (low) HR mean while the temperature is high, it could impact prediction accuracy. To address this problem, multivariate imputation techniques should be used: the missing values of a variable are imputed based not only on that variable but also on the other variables.

In this study, we considered six regression techniques for multivariate imputation: KNN, Bayesian Ridge (BR), DT, RF, Gradient Boosting (GB), and GP. KNN is a non-parametric method used for classification and regression; in both cases, the output depends on the K closest training samples in the feature space. BR is linear regression using probability distributions rather than point estimates; in other words, the target variable is not estimated as a single point but is assumed to be drawn from a probability distribution. DT is a non-parametric supervised learning technique for regression and classification that builds a model predicting the target value by learning simple decision rules inferred from the data features. RF is an Ensemble Learning (EL) technique for regression and classification tasks that constructs multiple decision trees. GB is also an EL technique for regression, producing a prediction model in the form of an ensemble of weak prediction models, typically DTs. Finally, GP is a stochastic process, a collection of random variables such that every finite subset of them has a multivariate normal distribution; GP regression is a supervised learning technique that can be used for both classification and regression problems [17].

To evaluate the performance of the imputers, 75% of the patients without missing values were used to train the imputation models and the remaining 25% were used for testing. At each iteration, 25% of the training set was used for validation, and fivefold cross-validation was applied to obtain a better estimate of model performance. The hyperparameters of the imputers are shown in Table 1.

Table 1 Imputation model hyperparameters

  KNN  N = 10
  BR   Iterations = 100, α = λ = 1e-6
  DT   Criterion = MSE, max depth = 30
  RF   Criterion = MSE, max depth = 30, number of DTs = 50
  GB   Iterations = 100, loss = MSE, learning rate = 0.1, max depth = 30, number of DTs = 50
  GP   Kernel = N(0, 1), optimizer = L-BFGS

Data Imbalance

LOS data is often highly skewed [18]. Therefore, in this study, a resampling technique [19] was used to overcome the skewness and imbalanced-class problem. There are two kinds of resampling: undersampling and oversampling. Oversampling increases the number of records in the minority class, while undersampling decreases the number of samples in the majority class. The Synthetic Minority Oversampling Technique (SMOTE) [20] is one of the most suitable ways to deal with class imbalance. SMOTE selects samples that are close in the feature space, draws a line between them, and synthesizes a new sample along that line; in this way, as many synthetic samples as needed can be created. One of the main drawbacks of this technique is that synthetic samples are generated without considering the majority class, so if the classes overlap strongly, it can produce ambiguous samples. To address this challenge, various extensions of SMOTE have been introduced [21]. One popular extension is Borderline-SMOTE [22], which focuses on the difficult instances and provides more resolution where it is required. The main idea behind this technique is that borderline samples in the minority class are more easily misclassified than samples located far from the border. Hence, this method oversamples only the minority samples placed near the borderline, while plain SMOTE oversamples across all examples of the minority class or a random subset of it [22].
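The borderline idea can be illustrated with a small, self-contained NumPy sketch. This is our own illustration under simplifying assumptions, not the authors' implementation: minority samples whose k-nearest neighbourhood is dominated (but not fully occupied) by the majority class are treated as "danger" points, and only those are interpolated with nearby minority neighbours.

```python
import numpy as np

def borderline_smote(X, y, minority=1, k=5, n_new=100, seed=0):
    """Minimal Borderline-SMOTE sketch: oversample only minority samples
    whose k nearest neighbours are mostly (but not all) majority class."""
    rng = np.random.default_rng(seed)
    X_min = X[y == minority]
    # k nearest neighbours of each minority sample within the whole set
    d = np.linalg.norm(X[None, :, :] - X_min[:, None, :], axis=2)
    nn = np.argsort(d, axis=1)[:, 1:k + 1]                # column 0 is the sample itself
    maj_frac = (y[nn] != minority).mean(axis=1)
    danger = X_min[(maj_frac >= 0.5) & (maj_frac < 1.0)]  # borderline minority points
    if len(danger) == 0:
        return X, y
    # interpolate between danger points and their nearest minority neighbours
    d_min = np.linalg.norm(X_min[None, :, :] - danger[:, None, :], axis=2)
    nn_min = np.argsort(d_min, axis=1)[:, 1:k + 1]
    base = rng.integers(0, len(danger), n_new)
    neigh = X_min[nn_min[base, rng.integers(0, k, n_new)]]
    gap = rng.random((n_new, 1))                          # random point on the segment
    X_new = danger[base] + gap * (neigh - danger[base])
    return np.vstack([X, X_new]), np.concatenate([y, np.full(n_new, minority)])
```

In practice, a library implementation such as `BorderlineSMOTE` from the imbalanced-learn package would normally be preferred over hand-rolled code.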

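To make the multivariate imputation step concrete, here is a hedged scikit-learn sketch, an illustration under our own assumptions rather than the authors' code, using `IterativeImputer` with a Gaussian process regressor: each column with missing entries is regressed on the remaining columns, so correlated vital signs inform the imputed values.

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401 (activates IterativeImputer)
from sklearn.impute import IterativeImputer
from sklearn.gaussian_process import GaussianProcessRegressor

# Toy vital-sign matrix (columns: PR, RR, SpO2, SBP); RR is made correlated
# with PR so that the multivariate imputer can exploit the correlation.
rng = np.random.default_rng(0)
pr = rng.normal(85, 12, 120)
rr = 0.2 * pr + rng.normal(0.0, 2.0, 120)
spo2 = rng.normal(96, 2, 120)
sbp = rng.normal(128, 15, 120)
X = np.column_stack([pr, rr, spo2, sbp])

X_miss = X.copy()
X_miss[rng.random(X.shape) < 0.05] = np.nan   # ~5% of values missing at random

imputer = IterativeImputer(estimator=GaussianProcessRegressor(),
                           max_iter=5, random_state=0)
X_filled = imputer.fit_transform(X_miss)      # no NaNs remain afterwards
```

The choice of `GaussianProcessRegressor` mirrors the paper's GP imputer; swapping in any other regressor from the six candidates is a one-line change.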

Data Scaling

Data scaling is the process of changing the ranges of all features to the same scale. There are two common techniques for this purpose: standardization and normalization. Normalization refers to rescaling the values of all features to values between 0 and 1, while standardization transforms all variables to have a mean of zero and a standard deviation of 1 [23]. In this study, we used the normalization technique for data scaling.
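The two scaling options can be contrasted in a few lines (illustrative values, not the study's data):

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# Two features on very different scales, e.g. SpO2 (%) and SBP (mmHg)
X = np.array([[92.0, 110.0],
              [96.0, 129.0],
              [99.0, 147.0]])

X_norm = MinMaxScaler().fit_transform(X)   # normalization: each column in [0, 1]
X_std = StandardScaler().fit_transform(X)  # standardization: mean 0, std 1 per column
```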

2.2 Model Development

After addressing the data skewness and missing values challenges, ML predictive models were implemented to predict the LOS of patients at the ED. These models were trained using the information available at admission time, including vital signs (PR, RR, SpO2, SBP), age, and gender. Five ML algorithms were considered: SVM, DNN, RF, XGB, and DT.

SVM is a supervised, instance-based ML algorithm that creates a boundary, called a hyperplane, between classes; its primary aim is to maximize the margin between classes. DNNs are a subset of artificial neural networks with multiple layers between the input and output layers. There are different types of DNN, such as feedforward, convolutional, and long short-term memory networks [17]; in this study, we used a feedforward DNN to predict patients' LOS.

EL is among the state-of-the-art ML approaches for solving various problems. It refers to combining different individual models to achieve a model with superior performance compared to the primary models. There are two main kinds of EL, bagging and boosting; in this study, we considered RF, a bagging technique, and XGB, a boosting technique. In bagging, several subsets of the data are chosen randomly with replacement and each subset is used to train one model, such as a DT, while in boosting, models are trained sequentially, with each new model focusing on the errors of the previous ones. EL has advantages such as preventing overfitting, avoiding local optima, and mitigating the curse of dimensionality [24].

In this study, 75% of the patients were used to train the predictive models and the remaining 25% were used as test data. At each iteration, 25% of the training set was used for validation, on which the optimal hyperparameters of the models were selected using the Mean Square Error (MSE) metric. The hyperparameters of the predictive models are shown in Table 2.

148

A. Naemi et al.

Table 2 Predictive model hyperparameters

  DT    Max depth = 30
  SVM   Kernel = RBF, C = 10, gamma = 0.001
  DNN   Hidden layers = 4, nodes = (6, 10, 20, 20, 10, 5), epochs = 1000, optimizer = Adam
  RF    Number of DTs = 50, max depth = 30
  XGB   Learning rate = 0.1, number of DTs = 50
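A sketch of the five-model setup, hedged rather than definitive: the data is a synthetic stand-in for the six admission features and five LOS classes, scikit-learn's `GradientBoostingClassifier` stands in for the XGBoost library to keep the example self-contained, and hyperparameters follow Table 2 where they map directly (the MLP hidden layers follow the node counts (10, 20, 20, 10)).

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import roc_auc_score

# Synthetic stand-in for the six admission features and the five LOS classes
X, y = make_classification(n_samples=1000, n_features=6, n_informative=5,
                           n_redundant=0, n_classes=5, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

models = {
    "DT": DecisionTreeClassifier(max_depth=30),
    "SVM": SVC(kernel="rbf", C=10, gamma=0.001, probability=True),
    "DNN": MLPClassifier(hidden_layer_sizes=(10, 20, 20, 10), max_iter=1000,
                         random_state=0),
    "RF": RandomForestClassifier(n_estimators=50, max_depth=30, random_state=0),
    "XGB*": GradientBoostingClassifier(n_estimators=50, learning_rate=0.1),
}
aucs = {}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    # macro one-vs-rest AUC handles the five-class setting
    aucs[name] = roc_auc_score(y_te, model.predict_proba(X_te), multi_class="ovr")
```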

3 Results

In this section, the findings of this study are presented. The description of the data is shown in Table 3.

Table 3 Data description (patients = 6,027)

  Variable                      Statistics       Missing ratio (%)
  Gender, male, n (%)           3,195 (53%)      0
  Age, median (IQR)             68 (52–80)       0
  PR (min⁻¹), median (IQR)      83 (69–99)       2.45
  RR (min⁻¹), median (IQR)      18 (14–22)       3.01
  SpO2 (%), median (IQR)        96 (92–99)       2.35
  SBP (mmHg), median (IQR)      129 (110–147)    7.78

First, we investigated the relation between LOS and patient outcomes. Of the 6,027 hospitalized patients, 101 experienced clinical deterioration, including receiving aid from the ICU, ICU transfer, and death. To examine the relation between these events and LOS, we compared the LOS of stable and deteriorated patients using the Kruskal–Wallis test, with p < 0.05 as the significance level. The result showed a statistically significant difference between the two groups (p < 0.0001). Figure 1a shows the boxplot of LOS for stable and deteriorated patients; the dashed red line represents the median value. As Fig. 1a shows, the LOS of deteriorated patients is significantly higher.

Fig. 1 a LOS boxplot of stable and deteriorated patients. b LOS distribution by hour

Of the 6,027 OUH patients, 5,331 records had no missing values, and these records were used to evaluate the performance of the imputers. The performance of the regressors for imputing the missing values of each vital sign was calculated using the Normalized Root Mean Square Error (NRMSE). Based on Table 4, GP had the best performance, so GP was used to impute the missing values of the incomplete records.

Table 4 Imputation model performance, NRMSE (%)

            KNN     BR      DT      RF      GB      GP
  PR        6.4     6.8     9.7     6.9     6.6     7.9
  SBP       10.8    10.6    15.0    11.6    10.7    10.5
  RR        10.2    10.0    14.4    10.9    10.1    7.8
  SpO2      5.3     5.5     7.6     5.5     5.4     5.3
  Average   8.175   8.225   11.675  8.725   8.200   7.875

Through a conversation with clinicians, patients were divided into five classes based on their LOS: (0–4 h), (4–8 h), (8–24 h), (24–48 h), and (>48 h). The distribution of patients' LOS by hour is shown in Fig. 1b. As seen, the distribution of LOS is skewed. Thus, Borderline-SMOTE oversampling was applied to increase the number of samples in the minority classes.

Following the imputation of missing values and the handling of the imbalanced-data problem, the next phase was building the ML models to predict LOS. In this study, SVM, DNN, RF, XGB, and DT were implemented. The performance of the models was investigated using Receiver Operating Characteristic (ROC) curves and the Area Under the Curve (AUC) for the dataset before and after applying the GP imputer and Borderline-SMOTE. As shown in Fig. 2, applying the imputation (Fig. 2b) and oversampling (Fig. 2c) techniques improved the performance of all predictive models. Before oversampling, DNN had the best performance in terms of AUC, followed by the two EL algorithms (RF, XGB); after oversampling, the EL models (RF, XGB) performed best.

4 Discussion

In this paper, the LOS of patients admitted to the ED was predicted based on admission information, including age, gender, HR, RR, SpO2, and SBP. To the best of our knowledge, most studies in this area have ignored the skewness of LOS data, which is


Fig. 2 Prediction performance: a before imputation and before oversampling (original data), b after imputation and before oversampling, c after imputation and after oversampling

the main challenge for predicting patients' LOS. Therefore, in this study, we presented a solution based on the Borderline-SMOTE oversampling technique and investigated its effect on the performance of the ML models. The results showed that the AUC of all ML models improved.

Missing values are another challenge in EHR data. Some studies [2, 8, 9, 11, 13] have not mentioned how they dealt with incomplete records, while other researchers have used strategies such as the mean value [10], values from healthy patients' records [7], or replacing missing values with specific characters, e.g., 'NA' [12]. In this study, we implemented an iterative multivariate imputer based on ML regression techniques, in which the missing values of a variable are imputed based on that variable and its correlation with the other variables, leading to more accurate results. For this purpose, six different ML regression techniques were applied, and based on the results, the GP imputer, which had the best performance, was selected to fill the missing values. Finally, five ML-based models were built to predict patients' LOS at the ED; the results showed that RF, XGB, and DNN had promising performance.

Moreover, various studies have defined a threshold, e.g., the median LOS, and categorized patients into two classes such as long and short LOS [7, 9, 10, 12], while in this study, we grouped patients into five classes, (0–4 h), (4–8 h), (8–24 h), (24–48 h), and (>48 h), based on clinicians' point of view and the way they manage patients at the ED. Our results indicated that our proposed solutions for the imbalanced data and missing values problems improved the performance of the ML predictive models by an average of 13% (Fig. 2). For example, the performance of RF rose from AUC = 0.81 on the original data to AUC = 0.93 after applying the missing values imputation and oversampling techniques.
Based on [25], which considers AUC > 0.90 an excellent performance, the performance of RF thus changed from good to excellent. The results of our study also indicated a statistically significant relationship between LOS and clinical outcomes, meaning that the LOS of deteriorated


patients was much higher compared to stable patients, so LOS can be used as an index to identify patients at risk of deterioration. As Barnes et al. [13] have indicated, such a system can predict LOS and discharge time more accurately than clinicians, so it can be applied at the ED to predict patients' LOS at admission time and help hospitals estimate patients' needs and illness severity as well as avoid ED crowding.

Inadequate information about patients at admission time, which can potentially affect their LOS, is one limitation of this study. Nevertheless, our proposed model is flexible, and new variables can easily be added to the model as soon as they are collected.

5 Conclusion

This study aimed to predict the LOS of unselected patients admitted to the ED using admission information, including vital signs (HR, RR, SpO2, BP), age, and gender. Developing accurate predictive models faces unavoidable challenges such as data skewness and missing values. Ignoring data skewness can result in underprediction of LOS and reduce model accuracy. Similarly, removing incomplete patient records or applying inaccurate imputation techniques can lead to information loss or information pollution. To address these challenges, we proposed a method comprising three stages: missing values imputation, class balancing, and predictive modeling. First, we presented an iterative multivariate imputer based on GP to impute the missing values of incomplete records, which achieved around 92% accuracy. Next, the Borderline-SMOTE oversampling technique was applied to overcome the skewness of the LOS data, which has been ignored by most studies. Finally, five ML models, including SVM, DNN, RF, XGB, and DT, were applied to predict patients' LOS.

This study has demonstrated that our proposed solutions for the missing values and LOS data skewness challenges play a key role in improving the performance of ML predictive models. The results showed that the proposed imputation and oversampling techniques improved the performance of the models by an average of 4% and 9%, respectively. This improvement can significantly help hospitals manage their resources and help clinicians better understand patients' health conditions and needs. The performance of these predictive models could be improved further by using a regression approach, as it overcomes the limitations of the classification method.
Thus, in our future study, instead of considering LOS prediction as a classification task and grouping patients into a limited number of classes, we will adopt a regression approach to predict patients' LOS at the ED.


References

1. Awad, A., Bader-El-Den, M., McNicholas, J.: Modeling and predicting patient length of stay: a survey. Int. J. Adv. Sci. Res. Manage. 1(8), 90–102 (2016)
2. Carter, E.M., Potts, H.W.W.: Predicting length of stay from an electronic patient record system: a primary total knee replacement example. BMC Med. Inform. Decis. Mak. 14(1), 26 (2014)
3. Roberts, A., Marshall, L., Charlesworth, A.: A Decade of Austerity: The Funding Pressures Facing the NHS from 2010
4. Lim, A., Tongkumchum, P.: Methods for analyzing hospital length of stay with application to inpatients dying in Southern Thailand. Glob. J. Health Sci. 1(1), 27 (2009)
5. Vincent, J.-L., Singer, M.: Critical care: advances and future perspectives. Lancet 376(9749), 1354–1361 (2010)
6. Chaou, C.-H., Chen, H.-H., Chang, S.-H., Tang, P., Pan, S.-L., Yen, A.M.-F., et al.: Predicting length of stay among patients discharged from the emergency department—using an accelerated failure time model. PLoS One 12(1), e0165756 (2017)
7. Houthooft, R., Ruyssinck, J., van der Herten, J., Stijven, S., Couckuyt, I., Gadeyne, B., et al.: Predictive modelling of survival and length of stay in critically ill patients using sequential organ failure scores. Artif. Intell. Med. 63(3), 191–207 (2015)
8. Kudyba, S., Gregorio, T.: Identifying factors that impact patient length of stay metrics for healthcare providers with advanced analytics. Health Inform. J. 16(4), 235–245 (2010)
9. Cheng, T.-H., Hu, P.J.-H.: A data-driven approach to manage the length of stay for appendectomy patients. IEEE Trans. Syst. Man Cybern. Part A Syst. Hum. 39(6), 1339–1347 (2009)
10. Hachesu, P.R., Ahmadi, M., Alizadeh, S., Sadoughi, F.: Use of data mining techniques to determine and predict length of stay of cardiac patients. Healthc. Inform. Res. 19(2), 121–129 (2013)
11. Combes, C., Kadri, F., Chaabane, S.: Predicting hospital length of stay using regression models: application to emergency department (2014)
12. Rahman, M.A., Honan, B., Glanville, T., Hough, P., Walker, K.: Using data mining to predict emergency department length of stay greater than 4 hours: derivation and single-site validation of a decision tree algorithm. Emerg. Med. Australas. 32(3), 416–421 (2020)
13. Barnes, S., Hamrock, E., Toerper, M., Siddiqui, S., Levin, S.: Real-time prediction of inpatient length of stay for discharge prioritization. J. Am. Med. Inform. Assoc. 23(e1), e2–e10 (2016)
14. Schmidt, T., Wiil, U.K.: Designing a 3-stage patient deterioration warning system for emergency departments
15. Naemi, A., Mansourvar, M., Schmidt, T., Wiil, U.K.: Prediction of patients severity at emergency department using NARX and ensemble learning. In: 2020 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2793–2799. IEEE (2020)
16. Hu, Z., Melton, G.B., Arsoniadis, E.G., Wang, Y., Kwaan, M.R., Simon, G.J.: Strategies for handling missing clinical data for automated surgical site infection detection from the electronic health record. J. Biomed. Inform. 68, 112–120 (2017)
17. Bonaccorso, G.: Machine Learning Algorithms. Packt Publishing Ltd. (2017)
18. Turgeman, L., May, J.H., Sciulli, R.: Insights from a machine learning model for predicting the hospital Length of Stay (LOS) at the time of admission. Expert Syst. Appl. 78, 376–385 (2017)
19. Awad, A., Bader-El-Den, M., McNicholas, J., Briggs, J.: Early hospital mortality prediction of intensive care unit patients using an ensemble learning approach. Int. J. Med. Inform. 108, 185–195 (2017)
20. Chawla, N.V., Bowyer, K.W., Hall, L.O., Kegelmeyer, W.P.: SMOTE: synthetic minority over-sampling technique. J. Artif. Intell. Res. 16, 321–357 (2002)
21. Kovács, G.: An empirical comparison and evaluation of minority oversampling techniques on a large number of imbalanced datasets. Appl. Soft Comput. 83, 105662 (2019)


22. Han, H., Wang, W.-Y., Mao, B.-H.: Borderline-SMOTE: a new over-sampling method in imbalanced data sets learning. In: International Conference on Intelligent Computing, pp. 878–887. Springer (2005)
23. Alhassan, Z., Budgen, D., Alshammari, R., Daghstani, T., McGough, A.S., Al Moubayed, N.: Stacked denoising autoencoders for mortality risk prediction using imbalanced clinical data. In: 2018 17th IEEE International Conference on Machine Learning and Applications (ICMLA), pp. 541–546. IEEE (2018)
24. Sagi, O., Rokach, L.: Ensemble learning: a survey. Wiley Interdiscip. Rev. Data Min. Knowl. Discov. 8(4), e1249 (2018)
25. Safari, S., Baratloo, A., Elfil, M., Negida, A.: Evidence based emergency medicine; part 5: receiver operating curve and area under the curve. Emergency 4(2), 111 (2016)

Regulated Digital Pharmacy Based on Electronic Health Record to Improve Prescription Services

Junhao Zhong, Zhengjia Mao, Hangpeng Li, Yoshimasa Masuda, and Tetsuya Toma

Abstract In Japan, the paper-based process of obtaining prescription medications leads to problems including inconvenience for patients, high government expenditure, and a lack of regulation of online counterfeit drug transactions, especially given Japan's aging society and long-established National Health Insurance (NHI) system. A regulated digital pharmacy (RDP) could be an innovative solution based on Electronic Health Record (EHR) and Computerized Physician Order Entry (CPOE) systems. To make the implementation of RDP more effective and secure, the authors reviewed the effectiveness of three systems that manage EHRs and recommended an optimal system as a basis for RDP in Japan. The authors also suggest integration with the recently proposed Adaptive Integrated Digital Architecture Framework (AIDAF), and integration schemes are discussed. Moreover, the challenges and future research are addressed.

Keywords Digital pharmacy · Online pharmacy · Prescription · Medical service · Enterprise architecture

J. Zhong (B)
Graduate School of Media and Governance, Keio University, Tokyo, Japan

Z. Mao
Department of Mechanical Engineering, University of Wisconsin-Madison, Madison, USA

H. Li
Department of Biotechnology, University of Pennsylvania, Philadelphia, USA

Y. Masuda
School of Computer Science, Carnegie Mellon University, Pittsburgh, USA

Y. Masuda · T. Toma
Graduate School of System Design and Management, Keio University, Tokyo, Japan

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021
Y.-W. Chen et al. (eds.), Innovation in Medicine and Healthcare, Smart Innovation, Systems and Technologies 242, https://doi.org/10.1007/978-981-16-3013-2_13


1 Introduction

The paper-based procedures for obtaining prescriptions in Japan lead to inefficient drug-dispensing processes, unnecessary labor costs, and extra hospital visit fees. Under the National Health Insurance (NHI) system, the government spends a significant portion of the annual budget to cover citizens' healthcare costs, and the expenditure continues to increase. Since online drug-trading policies are still highly restricted, the existence of illicit and unregistered online pharmacies facilitates tax evasion and creates drug-safety issues for patients [1].

The present trend under the influence of big data focuses on digitizing traditional industries, standardizing existing information, and making information input more precise. Scholars worldwide have investigated storing data in the cloud or on blockchains instead of using the traditional client-server model. Digital pharmacy is an innovative pharmaceutical model based on the Computerized Physician Order Entry (CPOE) system [2]. Nevertheless, a regulated digital pharmacy, one that builds a network that is inter-organizational, intra-organizational, and cross-disciplinary, is the ultimate goal for the healthcare ecosystem in Japan. A regulated digital pharmacy (RDP) industry could potentially bring profound benefits to the healthcare ecosystem in Japan beyond convenience.

Since RDP is a novel concept that has not yet been implemented anywhere in Japan, challenges including data security and privacy concern healthcare providers and patients. To address these problems, selecting appropriate systems and an Enterprise Architecture (EA) should be an effective management strategy. This paper discusses how an RDP can be constructed and demonstrates its potential based on related case studies. The paper also compares three EHR systems that could support RDP for the management of data and sensitive information, and recommends an optimal system. Moreover, the related privacy policies are discussed at the end of the paper.

2 Related Works and the Direction of RDP

2.1 Current Healthcare Ecosystem in Japan

National health insurance (NHI) was first introduced in 1961, and since then, Japanese citizens have had access to healthcare facilities with high medical coverage according to their age [3]. The NHI system requires an increasing amount of tax subsidy year by year, while the stagnant economy of Japan and the growing aging population add pressure on the government's social security system. It is essential to identify and solve healthcare issues in order to maintain the sustainability of social welfare [4]. The specific issues are listed below.


Pressure of Healthcare Cost. Given that the coverage rate of NHI is age-dependent, the aging society has become a financial problem for the Japanese government. Each year, 25% of national GDP goes directly to healthcare. The national expenditure on medical care was 43 trillion yen (around 403 billion dollars) in 2017 and continues to increase at an average rate of 2.2%. Up-to-date medical equipment, the aging population, and hospital visits take up the majority of government social security expenditure (in % of GDP). 72.2% of patients' medical expenses go to hospital and clinic visits, and 17.1% goes to pharmacies. Within the expenses for hospitals and clinics, 46.9% of the budget is paid to medical professionals [5].

Quality of Care. Safety, effectiveness, timeliness, efficiency, equitability, and patient-centeredness are the six main components of quality care [6]. Japan was rated ninth in the world for performance on health level [7]. However, Japan's universal healthcare system has potential defects, especially with respect to the aging population and healthcare expenditure. The current policy for the Japanese healthcare ecosystem might have worked decades ago; nonetheless, years of demographic change toward an older population with no political response have cultivated a loose operational environment. Studies have shown an increase in patients' safety concerns due to medical malpractice. Government officials and healthcare providers have to take action to ensure the quality of healthcare and patients' rights [8].

Distribution in Retail Pharmacy. Japan has a large number of retail pharmacies in urbanized areas but lacks access to retail pharmacies outside of major cities. Retailers also face an inventory management crisis when it comes to storing adequate medication. Patients suffer from drug supply shortages and interruptions, and are forced to wait an average of two days to acquire medication
[9, 10] The inadequate drug supply results and furthers nonadherence to medication. Hospital Capacity during COVID Pandemic. The healthcare system is under unprecedented pressure during the COVID pandemic, according to the Ministry of Health, Labor and Welfare, the average hospital beds have been occupied more than 50% in each major city [11]. For instance, Tokyo’s hospital occupancy rate for critical patients has exceeded 113%, Kanagawa’s hospital occupancy rate for critical patients has exceeded 94.4% in January 2021[12]. According to the Japan Times, related hospital managers state that increasing hospital beds and staff (doctors and nurses) is impossible on top of the existing utilization [13]. The current healthcare system is overwhelmed, and needs to be overturned by applying a new economic-efficient architecture enterprise.

2.2 Electronic Health Record (EHR)

Electronic Health Record (EHR) was first introduced in the 1960s in the United States. It is a digital record of a patient's medical and wellness information that can


J. Zhong et al.

be accessed by healthcare providers [14]. Through over 60 years of continuous development, EHR has accomplished more than migrating patient records from paper to digital form. It also enables new medical services, such as keeping track of patients' medications, immunizations, allergy data, and laboratory results [14, 15]. EHR measures medical progress, improves quality of care, and ensures accuracy when aligned with appropriate IT systems and platforms [16]. Studies indicate that hospitals that implement EHRs experience reduced costs and enhanced efficiency, availability, accuracy, and fulfillment of medical services.

2.3 Computerized Physician Order Entry (CPOE)

Computerized Physician Order Entry (CPOE) refers to electronic prescribing systems in which orders are entered electronically into computers by physicians. Compared to a plain paper prescription, CPOE systems allow healthcare providers to access a patient's comprehensive health information. In the United States, 95.6% of hospitals have implemented CPOE to improve the prescribing process with clinical decision support [17]. Because these systems are more transparent and traceable, healthcare providers can track processes, acknowledge orders, and give suggestions through CPOE systems. CPOE guarantees safety and measures outcomes by checking drug–drug interactions, drug–disease interactions, and drug dosages for different age groups. This safety management feature has significantly reduced prescription and related medical errors [18].
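The safety checks described above can be sketched as a simple rule engine. The drug names, interaction pairs, and dose limits below are hypothetical illustrations, not clinical data, and the function names are our own assumptions rather than part of any real CPOE product:

```python
# Illustrative sketch of CPOE safety checks: drug-drug interactions and
# age-dependent dose limits. All data here is hypothetical, not clinical.

AGE_DOSE_LIMITS_MG = {                 # max daily dose (mg) per age group
    "drug_a": {"child": 250, "adult": 1000, "elderly": 500},
}
INTERACTIONS = {frozenset({"drug_a", "drug_b"})}   # known interacting pairs

def age_group(age: int) -> str:
    if age < 18:
        return "child"
    return "elderly" if age >= 65 else "adult"

def check_order(patient_age: int, current_meds: list[str],
                drug: str, daily_dose_mg: float) -> list[str]:
    """Return safety warnings for a newly entered electronic order."""
    warnings = []
    # drug-drug interaction check against the patient's current medications
    for med in current_meds:
        if frozenset({med, drug}) in INTERACTIONS:
            warnings.append(f"interaction: {drug} with {med}")
    # age-dependent dosage check
    limit = AGE_DOSE_LIMITS_MG.get(drug, {}).get(age_group(patient_age))
    if limit is not None and daily_dose_mg > limit:
        warnings.append(f"dose {daily_dose_mg} mg exceeds limit {limit} mg")
    return warnings
```

In a real deployment these rules would be backed by a clinical decision support knowledge base rather than hard-coded tables; the sketch only shows where such checks sit in the ordering flow.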

2.4 Regulated Digital Pharmacy (RDP)

The concept of a regulated digital pharmacy (RDP) is a platform that enables patients to interact with medical professionals remotely, under regulation, to obtain prescriptions and other medical services. The platform operates on top of CPOE systems and EHR. The word "regulated" is added because online drug trading in Japan is still an untapped niche market lacking regulation. RDP aims to optimize the prescription dispensing process by replacing paper-based prescriptions and in-person drug purchasing with electronic prescriptions and remote drug delivery. Refills can be sent to the patient automatically if there is no change in the prescription, with patients' information stored through EHR. Not only can the expense and time spent on travel be largely reduced, but doctors and pharmacists can also spend more time on patients with urgent needs. Besides, RDP provides advantages in medication availability, prescription history tracking, and drug safety guarantees.


2.5 Adaptive Integrated Digital Architecture Framework (AIDAF)

Enterprise Architecture (EA) is a conceptual framework that illustrates the relationships among the different components of a collaborative collection. EA is helpful in transforming a corporation or system to align with newly emerging technologies, regulations, and customer needs [19]. In this case, that means transforming traditional healthcare systems into cloud-based systems. Coherence is a major challenge faced by healthcare providers as a result of rapidly emerging technologies such as cloud, big data, and digital IT. Currently, many institutes have not yet adopted CPOE systems, so it is difficult to transfer stored patient data to RDPs while ensuring data security and user privacy. Designing a suitable EA enables better communication, planning, and cooperation between IT companies and healthcare institutes. It could be an essential method for implementing RDP across different platforms and healthcare institutes. In the diagram in Sect. 4.1, the data flow goes through several different stages. Because different stages usually have inconsistent software and database platforms, managing the data flow becomes a challenge. Studies reveal defects of traditional EAs when it comes to cloud computing implementation and hospital affiliation. The Adaptive Integrated Digital Architecture Framework (AIDAF) is a recently proposed concept that comprises adaptive and flexible models [20]. Compared to traditional EAs, AIDAF reduces the risk of disordered and disorganized categories among the different domains.

3 Research Methodology

Related literature in the healthcare field, mainly from Japan and the United States, was reviewed, compared, and analyzed. Up-to-date statistics from Japanese government databases were used as evidence of the current burden on the Japanese healthcare ecosystem. Two research questions were posed in this paper to investigate the effectiveness of RDP for the Japanese healthcare industry. Because the implementation of RDP is based on EHR and CPOE, the authors studied the applications of EHR and CPOE in other countries and used the outcomes to assess the potential of RDP. To recommend an optimal information system for secure and effective EHR management, the authors also reviewed the literature and existing technologies of EHR systems and listed three major systems for comparison: client-server systems, cloud systems, and blockchain systems. The comparison was evaluated on three factors: functionality, security, and cost. The authors also addressed the challenges of implementing RDP and the related privacy policies in the Japanese healthcare industry. In general, the following RQs were verified and evaluated.


RQ1: How is the RDP framework designed, and why does it benefit the stakeholders?
RQ2: How is the architecture of RDP constructed to achieve effective and secure management?

4 Results

4.1 RDP Framework

The RDP framework comprises three user groups: patients, physicians, and pharmacists. The data flows are illustrated in Fig. 1. For a patient who needs prescription medications, the prescription issued by the physician is sent to the RDP electronically, where it is stored and linked to the patient's EHR. The RDP then grants the pharmacist access to the electronic prescription for preparing or selecting the medications. Once confirmed, the medications are dispensed from the RDP warehouse to the patient by delivery or pickup. The RDP then records this medication use in the patient's medical history for future reference. Lastly, the

Fig. 1 Regulated Digital Pharmacy Architecture within the CPOE system


prescription data on the physicians' and pharmacists' ends will be synchronized and checked by artificial intelligence. In urgent cases where delivery might be too slow, the RDP can serve as an electronic reference that lets patients obtain medications at local pharmacies.
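The prescription flow just described can be sketched as a small state model. The class and field names below are illustrative assumptions for exposition, not a real RDP API:

```python
# Minimal sketch of the RDP prescription flow: physician -> RDP (linked to the
# patient's EHR) -> pharmacist -> dispensing -> medication history update.
from dataclasses import dataclass, field

@dataclass
class EHR:
    patient_id: str
    prescriptions: list = field(default_factory=list)
    medication_history: list = field(default_factory=list)

class RDP:
    def __init__(self):
        self.records: dict[str, EHR] = {}   # EHRs keyed by patient ID

    def receive_prescription(self, patient_id: str, drug: str) -> dict:
        """Steps 1-2: store the physician's e-prescription and link it to the
        patient's EHR; return the access record handed to the pharmacist."""
        ehr = self.records.setdefault(patient_id, EHR(patient_id))
        ehr.prescriptions.append(drug)
        return {"patient_id": patient_id, "drug": drug}

    def dispense(self, rx: dict, method: str = "delivery") -> str:
        """Steps 3-4: after pharmacist confirmation, dispense from the warehouse
        and record the medication use in the patient's history."""
        ehr = self.records[rx["patient_id"]]
        ehr.medication_history.append((rx["drug"], method))
        return f"{rx['drug']} dispensed by {method}"
```

A run through the flow looks like `rx = rdp.receive_prescription("P001", "drug_x")` followed by `rdp.dispense(rx)`, leaving both the prescription and the dispensing event on the patient's record for future reference.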

4.2 Case Study Results

Improve quality of care by optimizing the data interface. A data interface is a method that connects practice management systems with the computer systems of healthcare trading partners. It enables more efficient EHR transmission from one provider to others [21]. As a branch application of eHealth, the concept of a regulated digital pharmacy (RDP) serves to provide a platform that enables patients to interact with medical professionals effectively to obtain prescriptions or other medical services. Moreover, with RDP, patients do not have to go to small local pharmacies where medications are unavailable due to supply shortages.

Reduce the re-visit rate. The RDP platform operates on the CPOE model using digitized health records. Not only can the expense and time spent on travel be largely reduced, but doctors and pharmacists can also spend more time on patients with urgent needs.

Reduce cost, boost profit. A regulated digital pharmacy could greatly reduce patients' medical costs, as the frequency of unnecessary hospital visits is greatly reduced. It could also cut the government's expenditure on the national health insurance system by providing generic drugs at lower prices. Furthermore, as a digital pharmacy can potentially replace illegal pharmacy websites, issues like tax evasion could be largely mitigated. On average, hospitals that implement EHRs save 9.6%, or $730, in medical costs per patient [22]. Furthermore, better communication, unified management, and detailed digitized records can reduce medical errors and save billions of dollars in malpractice lawsuits and mistake recovery [23].

Maximize hospital capacity. If there is no major adjustment to the details of a prescription, medications can be sent automatically to the patient. This eliminates the need for patients to physically collect medications from pharmacies with a paper prescription and gives patients a simple online shopping experience. It helps divert the existing congestion, expand the capacity of existing services, and achieve centralized care for patients in the most urgent situations.

Medication adherence and medical error reduction. The proposed RDP will be supported by EHR to better manage a patient's prescription history and other medical information. Studies indicate that EHRs improve the efficiency, availability, accuracy, and fulfillment of medical services and reduce the costs of prescriptions, chart pulls, storage, and re-filing [24, 25]. As a patient's health records can be tracked easily, pharmacists can ensure medication adherence by auto-filling the needed medication. Additionally, an automated medication system could not only minimize


distribution errors but also provide full insight into a patient's previous prescription history. Potential medical errors can be reduced, as doctors and pharmacists would be informed of patients' potential allergies and adverse drug reactions in time.

4.3 Information System Selection

Because the proposed RDP will process and store enormous amounts of data associated with EHRs, security and privacy are matters of keen concern. A poorly designed RDP system can negatively impact patients and healthcare providers. These concerns can potentially lead to problems such as data errors, data loss, jumbled information, and sensitive information leaks, which expose users to safety risks [26]. To select an optimal way of managing data, three EHR systems were chosen for comparison as a baseline for constructing an RDP system: client–server systems, cloud-based systems, and blockchain-based systems. The following sections briefly introduce, evaluate, and compare the three systems based on the criteria of data storage functionality, data security, and cost of implementation.

Client–server systems have been the most widely used model in hospitals internationally since the 1990s. A client–server model has a centralized computer that manages and exchanges information with distributed workstations. Such systems require frequent data exchange between the server and mobile terminals. Individual EHRs are documented and governed by a central server, and the systems usually adopt a unified description language for information exchange across different databases [27]. Client–server systems are traditional but continuously improving, integrating new technologies and algorithms to increase system capacity.

Cloud-based systems are also called web-based systems because they allow an application to run through an internet browser on any device rather than requiring software installation and a centralized server. Cloud-based systems shift information storage from in-house servers to cloud servers, and EHRs are stored and retrieved from the cloud [28]. These cloud servers are usually managed by host companies.

Blockchain-based systems store information in a blockchain. A blockchain is distributed and immutable, delivering improvements in data security and calculation speed. The workflow starts with the registration of patients through the client application by requesting a certificate and obtaining a private key with a new ID. All transactions between patients and clinicians are then committed to the blockchain network and distributed over the ledger network. Each transaction requires validation by all verified users, a property called consensus. After validation, a new block is appended to the chain, and the transaction is officially committed to the network [29].
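The commit-after-consensus workflow above can be illustrated with a toy hash chain. This is a didactic sketch: real systems use asymmetric keys and a distributed consensus protocol, whereas here consensus is reduced to a unanimous-approval stub and all names are our own:

```python
# Toy ledger illustrating the described workflow: a transaction is validated by
# all verified users (consensus), then appended as a block that references the
# hash of its predecessor, which makes later tampering detectable.
import hashlib
import json

class Validator:
    """Stand-in for a verified network user."""
    def approve(self, tx) -> bool:
        return True   # real validators would check signatures and permissions

class Ledger:
    def __init__(self, validators):
        self.validators = validators
        self.chain = [{"index": 0, "prev": "0" * 64, "tx": "genesis"}]

    def _hash(self, block) -> str:
        return hashlib.sha256(json.dumps(block, sort_keys=True).encode()).hexdigest()

    def commit(self, tx) -> bool:
        # consensus: every verified user must validate the transaction
        if not all(v.approve(tx) for v in self.validators):
            return False
        new = {"index": len(self.chain), "prev": self._hash(self.chain[-1]), "tx": tx}
        self.chain.append(new)
        return True

    def verify(self) -> bool:
        # each block must reference the hash of its predecessor
        return all(blk["prev"] == self._hash(self.chain[i])
                   for i, blk in enumerate(self.chain[1:]))
```

Because every block stores the hash of the previous one, altering any committed block breaks the chain of hashes behind it, which is the immutability property the paper relies on.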


4.4 Comparison of the Three Systems

Storage Functionality. Client–server systems are still widely used for their easy maintenance, scalability, centralized control, and security [30]. Since client–server systems are usually installed locally, they can be customized to the needs of hospitals. Running on a local network guarantees a reliable connection [31]. These systems are also easy to use and maintain, but they possess limited physical scalability and require frequent hardware and software upgrades [32]. A client–server system's performance depends closely on the robustness of the central server, and it always requires a reliable local data backup [33]. Cloud-based systems offer flexible and scalable data storage and exchange [34]. A cloud-based system does not depend on a centralized server. Instead, external host companies provide customized data storage and backup, and healthcare providers access and manage the data remotely through a browser or dedicated software. Moreover, the framework of cloud-based systems can easily incorporate Artificial Intelligence (AI) features such as natural language recognition [28]. So far, cloud-based systems show no noteworthy functional shortcomings. Blockchain-based systems store information in a blockchain in a distributed and immutable manner. A blockchain network has no centralized dependency, which guarantees stability. Calculation speed and accessibility are improved because a transaction in a blockchain uses the power of multiple inner nodes. The network includes indexes, time stamps, and other structural variables that enable doctors to easily trace heterogeneous health records [29]. However, the prominent shortcoming of a blockchain-based system is its limited capacity and scalability [35]. Many current studies point out that the majority of current medical systems have focused excessively on storage and capacity while forgoing the true value of blockchains as a stable data carrier [36].

Data Security. For a client–server system, the centralized control and in-house information storage deliver high data security and patient privacy [30]. Because data is stored in a local centralized database and on distributed workstations, it is relatively secure unless people with access carelessly or deliberately breach the hardware. Also, because each client–server system is specially designed for a given institute, converting data formats between different client–server systems is risky and unreliable [37]. The most considerable disadvantage of a cloud-based system is that it is not hack-proof, because data is stored in a remote cloud database. These systems are quite vulnerable to attacks, and any unauthorized access to the databases is a threat to sensitive user information and privacy [34]. Moreover, the ultimate ownership of cloud databases belongs to host companies, not medical institutes. One concern in the healthcare industry is public trust in EHRs, and that trust is not guaranteed when placed in the host companies that provide cloud services. In short, purely cloud-based systems are not secure enough to protect users' privacy against collusion, cross-domain data-sharing mechanisms, and conflicts with host companies' Service Level Agreements (SLA) [36].


Blockchain-based systems offer stronger security. A blockchain is unmodifiable and unbreakable [37]; it only allows a new block to be appended to the chain. Scholars also claim that blockchain makes the system tamper-resistant. The consensus requirement guarantees the security and authorization of each transaction [29]. Much emerging research also suggests novel blockchain-based system designs to achieve fine-grained access control that differentiates the access of different ends to sensitive medical data and user privacy [36]. The drawback is that blockchain technology is still relatively immature, so there is no standard in the blockchain industry [38].

Cost of Implementation. Client–server systems are usually expensive because of the installation of hardware and software involving both the central server and distributed workstations. The software is developed specifically for each institute depending on its requirements, which is why end-user and support labor accounts for 56% of the cost of setting up a client–server system [37]. The initial setup of a client–server system usually costs more than $40,000, excluding the license fee [32]. Moreover, a client–server system requires frequent manual upgrades and maintenance to keep pace with other technological developments, and the total five-year costs for client–server systems range from $40,000 to $70,000 depending on the enterprise size and structure. Note that these figures are based on A Guide for Estimating Client/Server Costs released by the Gartner Group in 1994; nowadays few purely client–server systems exist without integrated cloud or computing features. Cloud-based systems eliminate the physical installation cost because they run applications through browsers on any device. Meanwhile, host companies usually provide various kinds of service packages, from the personal level to the enterprise level. The cost is therefore flexible and usually ranges from $10,000 to $70,000 per system [39]. Also, host companies provide professional technical support for maintenance and upgrades without additional charge [32]. Since blockchain technology is still immature, there is no narrowly specified price for setting up a blockchain-based EHR system. The cost of a blockchain project usually ranges from $5,000 to $200,000, depending on the host and needs [40]. An existing application of a blockchain-based system for managing solid waste indicates that the current implementation costs of a blockchain-based system are quite high, involving development costs, energy consumption costs, financial transactions, cloud computing, software interfaces, smart contracts, etc. The cost, however, can be optimized using economic models [41].

Comparison Results. Overall, while each system exposes some security limitations, cloud-based systems offer the most advantages in storage functionality. Blockchains provide the most reliable data security owing to their unique structure, but blockchain-based systems still need more research and applications to prove their reliability. Cloud-based systems offer the most predictable, flexible, and transparent cost of implementation.


4.5 Recommendation of an Optimal System to Implement in Japan

The comparison of the three proposed EHR systems based on the criteria of storage functionality, data security, and cost of implementation shows that cloud-based systems and blockchain-based systems overall offer more advantages than client–server systems. Since cloud-based systems and blockchain-based systems each have their own limitations, a hybrid implementation of cloud and blockchain offers more advantages and fewer shortcomings [36, 42]. Cloud servers offer a less expensive way for the RDP to create and execute patients' orders, while the blockchain system secures the EHR. Many encryption techniques have been used to strengthen cloud systems' resistance to attacks. The healthcare ecosystem is a network of highly complicated, organized, and affiliated sectors. As Fig. 2 shows, patients, the RDP, and healthcare providers are connected in the hybrid-implementation system. RDP with the AIDAF framework provides enhanced communication to meet the demands of healthcare consumers, synchronizes up-to-date technologies and regulations between providers, and ensures data security and user privacy simultaneously. The three parties register identities within the network to access EHRs. Healthcare providers create patients' smart contracts and upload each unit as a unique hash in the blockchain. When the RDP or a patient requests an EHR by providing legitimate identification, the cloud servers decrypt the stored EHR and return the results to the user. One concern with most existing encryption techniques is that they do not prevent inner collusion between medical institutions and host companies [36, 38]. In the last century in Japan, doctors colluded with pharmacists by taking advantage of fee-for-service, under which they were rewarded for the volume and quantity of services,

Fig. 2 Hybrid-implementation of cloud server and blockchain between healthcare users


and the law subsequently prohibited the direct connection between them [43]. Blockchain can prevent collusion by requiring verification on the patients' side and generating transactions in a tamper-proof and distributed manner [38]. Therefore, while cloud-based systems provide flexible and cost-effective data storage, blockchain has great potential to remedy the security limitations of cloud-based systems. Based on the hybrid implementation, RDP can largely guarantee the protection of medical data and user privacy. As the sections above have shown, RDP is based on the CPOE system and suited to an enterprise architecture that improves quality of care, reduces unnecessary healthcare costs, maximizes hospital utilization, and controls medical error. RDP utilizes the hybrid system to transfer healthcare data and secure personal data effectively.
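The hybrid design can be sketched in a few lines: the cloud holds the encrypted EHR, the blockchain holds only its hash, so any tampering with the cloud copy is detectable on retrieval. The XOR "cipher" below is a stand-in for a real scheme such as attribute-based encryption, and all names and stores are illustrative assumptions:

```python
# Sketch of the hybrid cloud + blockchain design: encrypted EHRs live in the
# cloud; an immutable hash anchored on-chain lets any party verify integrity.
import hashlib

cloud_store: dict[str, bytes] = {}   # host company's cloud database
chain_hashes: dict[str, str] = {}    # hashes anchored on the blockchain

def _xor(data: bytes, key: bytes) -> bytes:
    # toy cipher standing in for real encryption; NOT secure
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

def upload(record_id: str, ehr: bytes, key: bytes) -> None:
    """Provider encrypts the EHR into the cloud and anchors its hash on-chain."""
    cloud_store[record_id] = _xor(ehr, key)
    chain_hashes[record_id] = hashlib.sha256(ehr).hexdigest()

def retrieve(record_id: str, key: bytes) -> bytes:
    """RDP or patient decrypts the cloud copy, then checks it against the chain."""
    ehr = _xor(cloud_store[record_id], key)
    if hashlib.sha256(ehr).hexdigest() != chain_hashes[record_id]:
        raise ValueError("cloud copy does not match on-chain hash")
    return ehr
```

Even if a host company or a colluding insider alters the cloud copy, the mismatch with the on-chain hash surfaces at retrieval time, which is the collusion-resistance argument made above.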

5 Discussion

5.1 Ensuring Patients' Privacy

Protecting patients' private medical data is a major task in designing any enterprise architecture. Data privacy cannot be protected by impregnable security software alone; it also requires supporting policies. In the United States, the Health Insurance Portability and Accountability Act of 1996 (HIPAA) is both a guideline for healthcare professionals and the foundation of the health data security system. In Japan, the Act on the Protection of Personal Information (APPI) was amended and passed in June 2020. APPI is a comprehensive guideline covering all personal information; in the medical sector especially, healthcare professionals have a moral duty to avoid any sort of health data leakage, such as discussing cases with unrelated third parties, and must log in and out with verified identities to access medical data [44]. It is necessary to consider healthcare professionals, one of the major stakeholder groups, during the development of an enterprise architecture. Strict regulations enforce and administer all security rules in every healthcare practice [45]. AIDAF gives faster access and connects different healthcare users together; on the other hand, users must follow APPI to create an information-secure environment.

5.2 Comparison of Healthcare Insurance and Information Between Japan and America

In the US, private insurance and government insurance complement each other. Millions of the poorest and most vulnerable US citizens are covered by Medicaid. Medicaid bolsters the private insurance market by filling the gaps in private health insurance and alleviating financial burdens. People excluded from private insurance plans (because of low income, disability, or poor health status) will mostly be


covered by Medicaid [46]. The Japanese government, on the other hand, has long borne financial pressure from the National Health Insurance (NHI) system. This difference underlies the need to implement RDP in Japan. Computerized Physician Order Entry (CPOE) has been widely implemented in the United States [17]. Its emergence has been shown to greatly improve the interaction between patients and medical providers and to guarantee accuracy during the prescription process [18]. The success of CPOE applications in other countries reinforces the need for the current Japanese healthcare ecosystem to utilize an innovative pharmaceutical model.

5.3 Future Challenges and Research

Based on the above research, adopting cloud systems with appropriate EAs and encryption techniques could bring profound benefits to operating electronic prescription services through RDP. As this paper serves as a conceptual proposal, future research should focus on examining various cloud systems and encryption techniques for protecting user privacy. It is also important to investigate the compatibility of traditional software and databases with RDP. Researchers may develop and test software and databases guided by well-designed EAs. Meanwhile, case studies should be conducted in certified hospitals to prove the actual effectiveness of RDP in improving medical service quality, providing convenience to patients, reducing government cost, and better regulating online counterfeit drug trading. Furthermore, RDP is expected to restructure the traditional Japanese pharmacy industry. The reformation of the industry may negatively impact some stakeholders' profits in the early stage of implementation. While RDP will encourage innovation, the chain effects are still unknown. Future research should also focus on the impact of RDP on the Japanese healthcare ecosystem from economic and political perspectives.

6 Conclusion

It is a deliberate decision for researchers and healthcare providers to prepare for the potential discontinuity or retention of the industry, given that not advancing simply means regression. In this paper, data evidence demonstrates the current problems of the Japanese health ecosystem. Healthcare stakeholders face various problems ranging from inconvenience to medical safety. The literature review suggests the benefits of EHR, CPOE, and AIDAF in reducing medical cost and error, improving quality, and regulating counterfeit drug trading. As the proposed RDP platform is based on EHR and CPOE, it will process enormous amounts of data and sensitive information. The comparison of the three common EHR systems provides insight into each major information system and helps software developers select an optimal system as a basis. While the actual software-based construction and implementation


of RDP remains a challenge, this paper suffices to argue that RDP is an innovation whose benefits outweigh its harms. It serves to conceptually propose the design and provide a guideline for management.

References

1. Williams: The real impact of counterfeit medications. https://www.uspharmacist.com/article/counterfeit-meds. Retrieved 13 Feb 2020
2. Masuda, Y., Viswanathan, M.: Future direction—open healthcare platform 2030 and IoT healthcare platform. In: Enterprise Architecture for Global Companies in a Digital IT Era: Adaptive Integrated Digital Architecture Framework (AIDAF), pp. 147–151 (2019)
3. Matsuda, S.: Health policy in Japan—current situation and future challenges. JMA J. 2(1), 1–10 (2019). https://www.jmaj.jp/detail.php?id=10.31662%2Fjmaj.2018-0016
4. Uetsuka, Y.: Characteristics of Japan's healthcare systems and the problems (2012). https://www.med.or.jp/english/journal/pdf/2012_04/330_333.pdf
5. Ministry of Finance: Japanese public finance fact sheet (2020). https://www.mof.go.jp/english/budget/budget/fy2021/01.pdf
6. WHO: What is quality of care and why is it important. http://158.232.12.119/maternal_child_adolescent/topics/quality-of-care/definition/en/
7. World Health Organization: The World Health Report, Health Systems: Improving Performance (2000). https://apps.who.int/iris/handle/10665/42281
8. Hirose, M., Imanaka, Y., Ishizaki, T., Evans, E.: How can we improve the quality of health care in Japan? Learning from JCQHC hospital accreditation. Health Pol. 66(1), 29–49 (2003). https://doi.org/10.1016/s0168-8510(03)00043-5. PMID: 14499164
9. Flam, F.: The unseen crisis of drug shortages. https://www.japantimes.co.jp/opinion/2019/04/27/commentary/world-commentary/unseen-crisis-drug-shortages/. Retrieved 09 Mar 2021
10. Acosta, A., Vanegas, E.P., Rovira, J., Godman, B., Bochenek, T.: Medicine shortages: gaps between countries and global perspectives. Front. Pharmacol. 10, 763 (2019). https://doi.org/10.3389/fphar.2019.00763
11. Buchholz, K., Richter, F.: Infographic: Where COVID-19 is putting a strain on Japanese hospitals (2021). https://www.statista.com/chart/22483/hospital-occupancy-japanese-prefectures-covid-19/
12. Japan Broadcasting Corporation: COVID-19 Tokyo Osaka Aichi Hokkaido number of in-patients, severely ill patients, and usage of hospital beds (2021). https://www3.nhk.or.jp/news/special/coronavirus/severe/
13. KYODO: Overloaded hospitals: 'Medical care system is already in a state of collapse' (2021). https://www.japantimes.co.jp/news/2021/01/09/national/overloaded-hospitals-japan-coronavirus/
14. Byrd, G.D., Wei, D.: Leveraging the electronic health record system to enhance hand surgery practice. Hand Clin. 36(2), 181–188 (2020). https://doi.org/10.1016/j.hcl.2020.01.016. PMID: 32307048
15. Turk, M.: Electronic health records: how to suture the gap between privacy and efficient delivery of healthcare. 80 Brook. L. Rev. (2015). https://brooklynworks.brooklaw.edu/blr/vol80/iss2/8
16. Silverman, R.D.: EHRs, EMRs, and health information technology: to meaningful use and beyond: a symposium introduction and overview. J. Leg. Med. 34(1), 1–6 (2013). https://doi.org/10.1080/01947648.2013.768134. PMID: 23550980
17. Pedersen, C.A., Schneider, P.J., Scheckelhoff, D.J.: ASHP national survey of pharmacy practice in hospital settings: prescribing and transcribing—2016. Am. J. Health Syst. Pharm. 74(17), 1336–1352 (2017). https://doi.org/10.2146/ajhp170228. PMID: 28743758
18. Connelly, T.P., Korvek, S.J.: Computer provider order entry. In: StatPearls [Internet]. Treasure Island (FL): StatPearls Publishing (2021). PMID: 29261903


19. Masuda, Y., Shepard, D.S., Yamamoto, S., Toma, T.: Clinical decision-support system with electronic health record: digitization of research in pharma. In: Jain, L.C., Chen, Y.-W., Zimmermann, A., Howlett, R.J. (eds.) Innovation in Medicine and Healthcare Systems, and Multimedia—Proceedings of KES-InMed 2019 and KES-IIMSS 2019 Conferences, pp. 47–57. Smart Innovation, Systems and Technologies, vol. 145. Springer Science and Business Media Deutschland GmbH (2019). https://doi.org/10.1007/978-981-13-8566-7_5
20. Masuda, Y., Shirasaka, S., Yamamoto, S., Hardjono, T.: Architecture board practices in adaptive enterprise architecture with digital platform: a case of global healthcare enterprise. Int. J. Enterp. Inf. Syst. 14, 1–20 (2018). https://doi.org/10.4018/ijeis.2018010101
21. Plastiras, P., O'Sullivan, D.: Exchanging personal health data with electronic health records: a standardized information model for patient generated health data and observations of daily living. Int. J. Med. Inform. 120, 116–125 (2018). https://doi.org/10.1016/j.ijmedinf.2018.10.006. PMID: 30409336
22. Kazley, A.S., Simpson, A.N., Simpson, K.N., Teufel, R.: Association of electronic health records with cost savings in a national sample. Am. J. Manage. Care 20(6), e183–e190 (2014). PMID: 25180501
23. NueMD: EHR software helps both patients and providers save money (2016). https://nuemd.com/news/2016/07/22/ehr-software-helps-both-patients-providers-save-money
24. Poon, E.G., Wright, A., Simon, S.R., Jenter, C.A., Kaushal, R., Volk, L.A., Cleary, P.D., Singer, J.A., Tumolo, A.Z., Bates, D.W.: Relationship between use of electronic health record features and health care quality: results of a statewide survey. Med. Care 48(3), 203–209 (2010). https://doi.org/10.1097/MLR.0b013e3181c16203. PMID: 20125047
25. Ayaad, O., Alloubani, A., ALhajaa, E.A., Farhan, M., Abuseif, S., Al Hroub, A., Akhu-Zaheya, L.: The role of electronic medical records in improving the quality of health care services: comparative study. Int. J. Med. Inform. 127, 63–67 (2019). https://doi.org/10.1016/j.ijmedinf.2019.04.014. PMID: 31128833
26. Bowman, S.: Impact of electronic health record systems on information integrity: quality and safety implications. Perspect. Health Inf. Manage. 10(Fall), 1c (2013)
27. Yoshihara, H.: Development of the electronic health record in Japan. Int. J. Med. Inform. 49, 53–58 (1998). https://doi.org/10.1016/S1386-5056(98)00010-0
28. Liu, Z., Weng, J., Li, J., et al.: Cloud-based electronic health record system supporting fuzzy keyword search. Soft Comput. 20, 3243–3255 (2016). https://doi.org/10.1007/s00500-015-1699-0
29. Tanwar, S., Parekh, K., Evans, R.: Blockchain-based electronic healthcare record system for healthcare 4.0 applications. J. Inf. Secur. Appl. 50, 102407 (2020). https://doi.org/10.1016/j.jisa.2019.102407
30. Google Sites: Advantages and disadvantages of client-server architecture. https://sites.google.com/site/clientserverarchitecture/advantages-of-client-server-architecture
31. EHRIntelligence: EHR best practices: choosing a client-server EHR (2017). https://ehrintelligence.com/news/ehr-best-practices
32. O'Connor, S.: Cloud-based EHR vs. client-server EHR: 4 key differences (2017). https://www.adsc.com/blog/cloud-based-ehr-vs.-client-server-ehr-4-key-differences
33. Patient Account Services: Web-based vs client/server comparison: articles. https://www.patientaccountservices.com/articles/
34. Xhafa, F., et al.: Designing cloud-based electronic health record system with attribute-based encryption. Multimed. Tools Appl. 74, 3441–3458 (2015). https://doi.org/10.1007/s11042-013-1829-6
35. MTBC: 5 benefits & challenges of blockchain technology in healthcare industry (2018). https://www.mtbc.com/learningcenter/blockchain-technology-benefits-challenges/
36. Huang, H., Sun, X., Xiao, F., Zhu, P., Wang, W.: Blockchain-based eHealth system for auditable EHRs manipulation in cloud environments. J. Parallel Distrib. Comput. 148, 46–57 (2021). https://doi.org/10.1016/j.jpdc.2020.10.002

170

J. Zhong et al.

37. Mignerey, L.J.: Client/server conversions: balancing benefits and risks. Cause/Effect 19(3), 40–45 (1996). https://www.educause.edu/ir/library/html/cem/cem96/cem9638.html 38. Cao, S., et al.: Cloud-assisted secure ehealth systems for tamper-proofing EHR via blockchain. Inf. Sci. 485 (2019). https://doi.org/10.1016/j.ins.2019.02.038 39. HealthIT: How much is this going to cost me? (2014). https://www.healthit.gov/faq/how-muchgoing-cost-me 40. Azati: How much does it cost to develop blockchain in 2019 (2019). https://azati.ai/how-muchdoes-it-cost-to-blockchain/ 41. Gopalakrishnan, P.K., Hall, J., Behdad, S.: Cost analysis and optimization of Blockchain-based solid waste management traceability system. Waste Manage. (New York, N.Y.), 120, 594–607 (2021). https://doi.org/10.1016/j.wasman.2020.10.027 42. Kim, M. et al.: Design of secure protocol for cloud-assisted electronic health record system using blockchain. Sensors (Basel, Switzerland), 20(10), 2913 (2020). https://doi.org/10.3390/ s20102913 43. Kaneko, M., Matsushima, M.: Current trends in Japanese health care: establishing a system for board-certificated GPs. Br. J. Gen. Pract. J. R. Coll. Gen. Practition. 67(654), 29 (2017). https://doi.org/10.3399/bjgp17X688669 44. Ministry of Health, Labour and Welfare: Guidelines for Medical Information System of Privacy Management (2017). https://www.mhlw.go.jp/file/05-Shingikai-12601000-Seisakuto ukatsukan-Sanjikanshitsu_Shakaihoshoutantou/0000166260.pdf 45. HHS Office of the Secretary, & Office for Civil Rights: Summary of the HIPAA Security Rule (2013). https://www.hhs.gov/hipaa/for-professionals/security/laws-regulations/index.html 46. Robin Rudowitz Follow, R.: 10 things to know About Medicaid: Setting the facts straight (2020). https://www.kff.org/medicaid/issue-brief/10-things-to-know-about-medicaid-settingthe-facts-straight/. Retrieved 10 Mar 2021

Performance Verification of a Text Analyzer Using Machine Learning for Radiology Reports Toward Phenotyping Takanori Yamashita, Rieko Izukura, and Naoki Nakashima

Abstract The medical field is embracing the information age: the rapidly increasing volume of medical data generated by hospital information systems signals the advent of Big Data in the healthcare arena, such that real-time data are now available to support many clinical decisions. Real World Data (RWD) from hospital information systems comprise structured numerical data and unstructured text data, and it is imperative that phenotyping reproducibly extracts patients with an accurate phenotype from RWD using a rule-based approach. In this study, from a sample of computed tomography reports from 100 patients, 48 were diagnosed with interstitial pneumonia. Three machine learning methods (Support Vector Machine (SVM), feature selection, and Gradient Boosting Decision Tree (GBDT)) were combined to develop a text phenotyping method, which achieved prediction with good performance. We extracted several feature words that predict true cases of interstitial pneumonia and found that the effect of feature selection was evident in the good AUC performance of GBDT. We also found that, when applying machine learning to text-based RWD, the variables have to be narrowed down.

Keywords Interstitial pneumonia · Support vector machine · Gradient boosting decision tree

1 Introduction

1.1 Background

Utilization of Real World Data (RWD) and Big Data is a burgeoning area in medicine. The hospital information system is an integral part of medical institutions and reflects the progress made in the digitalization of medical practice. Medical data comprise structured numerical data and unstructured text data.

T. Yamashita (B) · R. Izukura · N. Nakashima
Medical Information Center, Kyushu University Hospital, Kyushu, Japan
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021
Y.-W. Chen et al. (eds.), Innovation in Medicine and Healthcare, Smart Innovation, Systems and Technologies 242, https://doi.org/10.1007/978-981-16-3013-2_14

The Japanese


hospital information system has evolved from the medical accounting system established in the 1980s, through the ordering system, to the electronic medical record system in use today. While medicines, laboratory tests, diagnoses, and surgeries can be expressed as structured numerical data, the patient's condition, the purpose of medical treatment, and patient outcomes are often described as free text.

1.2 Phenotyping

Originally, medical data were associated with the treatment process as "patient condition" → "laboratory test/imaging test" → "diagnosis by a doctor" → "medicine, operation, and rehabilitation" → "outcome". However, in the current Japanese hospital information system, data on each medical treatment are gathered independently, making it difficult to accurately extract outcomes from the disease, laboratory test, and treatment data. Phenotyping is a technique that accurately and reproducibly extracts diagnoses from RWD using a rule-based approach [1]. Four issues (completeness, accuracy, complexity, and bias) are important considerations when phenotyping from RWD [2]. Data modalities and methods used for phenotype algorithms have been published as a phenotype knowledgebase.¹ These algorithms demonstrate that a broad range of algorithms mining electronic health record data from different health systems can be developed with high positive predictive value (PPV), and that algorithms developed at one site are generally transportable to others [3]. While structured RWD is better suited for accurate phenotyping, it is essential to be able to use unstructured text data as well.
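A rule-based phenotype of the kind described above can be expressed as a boolean rule over structured record fields. The sketch below is purely illustrative: the disease, field names, codes, and thresholds are invented assumptions, not any published PheKB algorithm.

```python
# Illustrative rule-based phenotype (hypothetical type 2 diabetes rule):
# diagnosis code AND (lab evidence OR medication evidence).
# All field names and thresholds here are made up for the example.

def has_t2dm_phenotype(patient):
    has_code = any(code.startswith("E11") for code in patient["icd10_codes"])
    has_lab = any(hba1c >= 6.5 for hba1c in patient["hba1c_results"])
    has_med = any(med in {"metformin", "insulin"} for med in patient["medications"])
    return has_code and (has_lab or has_med)

patient = {
    "icd10_codes": ["E11.9", "I10"],
    "hba1c_results": [7.1, 6.8],
    "medications": ["metformin"],
}
print(has_t2dm_phenotype(patient))  # True
```

Because the rule is explicit, applying it to a cohort is reproducible across sites, which is exactly the transportability property reported for PheKB-style algorithms.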

1.3 Related Work

Unstructured text data contain a wide variety of expressions, including free text written by medical staff such as doctors and nurses, who are at the frontline of patient care. Medical text data within an electronic medical record include medical reports and discharge summaries, whose analysis is expected to improve the medical process and support clinical decisions [4–6]. These data are currently not sufficiently utilized because their analysis requires considerable time and effort. It is therefore crucial to establish methods that facilitate this analysis, thereby contributing significantly to improvement in health care quality. In this context, it has been shown through modeling approaches that natural language processing (NLP) can automatically populate pertinent parts of a model from unstructured text reports [7]. However, when NLP and machine learning were applied to medical text data, performance evaluation did not yield significant feedback for the direct benefit of medical practice [8].

¹ PheKB, https://www.phekb.org/phenotypes.


Our previous investigations focused on the use of machine learning and feature selection for the prediction of patient hospitalization from medical text data [9]; subsequently, feature selection applying a medical dictionary achieved optimal prediction performance [10]. The objective of this study was to verify the effectiveness of diagnosis extraction from unstructured text data to improve phenotyping.

2 Data

First, we set eight key words related to interstitial pneumonia, as diagnosed by a chest physician. Then, we collected computed tomography (CT) scan reports from the medical records of 5,141 patients who visited Kyushu University Hospital between 2014 and 2015. The CT reports have several attributes, including patient ID, objective, region, diagnosis, and text-based observation; we focused on the text-based observation. To analyze the CT reports of 100 cases, we constructed a search engine to extract data from the textual records, using the GETA system available from the NII GETA project.²

3 Methods

From a random sample of 100 patients, 48 were diagnosed with interstitial pneumonia by a chest physician. We applied three machine learning methods (Support Vector Machine (SVM), feature selection, and Gradient Boosting Decision Tree (GBDT)) to their CT reports to obtain true diagnoses for clinical interpretation (Fig. 1).

3.1 Classification

We applied SVM to the CT reports from the 100 patients following a specific procedure. All CT reports were vectorized after morphological analysis using a medical dictionary (about 80,000 words). A case was labeled positive if diagnosed as interstitial pneumonia by a chest physician, and negative otherwise. A classification model was constructed using SVM (SVM-light [12]). The SVM score of a word wi, denoted score(wi), was obtained by applying the model to an imaginary document that consists of only the single word wi.

² GETA, http://geta.ex.nii.ac.jp/geta.html.
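For a linear SVM over bag-of-words vectors (a linear kernel is the SVM-light default), applying the model to an "imaginary document" containing only word wi reduces the decision value w·x + b to that word's learned weight plus the bias. The following toy sketch illustrates this with invented weights; it is not the trained model from the study.

```python
# Toy linear-SVM word scoring: the weights and bias below are made up
# purely to illustrate scoring an imaginary one-word document.

weights = {"reticulate": 0.9, "granular": -0.7, "effusion": 0.2}
bias = -0.1

def svm_decision(doc_words):
    # Bag-of-words decision function of a linear SVM: f(x) = w.x + b
    return sum(weights.get(w, 0.0) for w in doc_words) + bias

def word_score(word):
    # score(wi): apply the model to a document containing only word wi
    return svm_decision([word])

for w in weights:
    print(w, round(word_score(w), 2))
```

A positive score marks a word that pushes a report toward the interstitial pneumonia class, and a negative score marks a word that pushes it away, which is how the positive/negative word lists in Table 1 arise.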


Fig. 1 Analysis procedure

3.2 Feature Selection

The score(wi) has previously been used for feature selection [11]. We proposed two additional measures to evaluate the importance of each word [10]. The first, score(wi)·df(wi), is the product of the score and the document frequency df(wi) of the word. The second, score(wi)·log(df(wi)), is the product of the score and the logarithm of the document frequency. The three measures are denoted w.o, d.o, and l.o, respectively. We applied the feature selection method to the feature words [9, 10]. The top N positive and negative words by SVM score were used to build a model to predict whether interstitial pneumonia had been diagnosed, varying the number of words N (N = 1, 2, ..., 10, 20, ..., 100, 200, 300). We used 5-fold cross validation in the evaluation experiment and evaluated the prediction performance of each measure using precision, recall, accuracy, and F-measure.
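The three importance measures and the top-N selection can be sketched as follows. This assumes l.o multiplies the score by the log of the document frequency, as described above; the word scores and document frequencies are invented for illustration.

```python
import math

# Word -> (SVM score, document frequency); values are made up for the sketch.
words = {
    "reticulate": (0.29, 59),
    "activity": (-0.23, 11),
    "effect": (0.15, 99),
    "obsolete": (-0.16, 22),
}

def measures(score, df):
    # w.o: SVM score alone; d.o: score weighted by document frequency;
    # l.o: score weighted by log document frequency (assumed form).
    return {"w.o": score, "d.o": score * df, "l.o": score * math.log(df)}

def top_n(measure, n):
    # Rank words by the chosen measure; return top-N positive and negative.
    ranked = sorted(words, key=lambda w: measures(*words[w])[measure])
    return ranked[-n:], ranked[:n]

pos, neg = top_n("l.o", 1)
print(pos, neg)
```

The selected top-N positive and negative words would then form the reduced feature set evaluated with 5-fold cross validation.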

3.3 Feature Extraction and Evaluation

We applied the Gradient Boosting Decision Tree (GBDT) [13], a powerful technique for building predictive models, to the words obtained by feature selection, including those of cases incorrectly classified by the classification model. In GBDT, a new model is fitted to the samples with large errors under the previous model, so that the value of the loss function decreases iteratively. This method works satisfactorily even with many explanatory variables, and its variable importance supports interpretation of the results. Feature selection in this study was analyzed by increasing the number of words as described above; we then targeted the highest-ranked words instead of all words. Next, we


analyzed whether the words that showed good performance in feature selection predicted the true cases judged by a chest physician, using GBDT. We then compared the area under the curve (AUC) obtained with each pattern.
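The boosting idea just described (fit each new model to the residual errors of the previous ensemble so the loss decreases) can be illustrated with a minimal least-squares boosting sketch using one-split regression stumps, plus a rank-based AUC. This is not the tooling used in the study; the stump learner, learning rate, and toy word-indicator data are all invented.

```python
# Minimal gradient-boosting sketch with regression stumps (squared loss),
# plus a rank-based AUC, to illustrate the GBDT + AUC evaluation.

def fit_stump(X, residuals):
    # Find the binary feature whose split best fits the residuals.
    best = None
    for j in range(len(X[0])):
        left = [r for x, r in zip(X, residuals) if x[j] == 0]
        right = [r for x, r in zip(X, residuals) if x[j] == 1]
        if not left or not right:
            continue
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, j, lm, rm)
    _, j, lm, rm = best
    return lambda x: rm if x[j] == 1 else lm

def gbdt(X, y, rounds=20, lr=0.3):
    pred = [0.0] * len(y)
    stumps = []
    for _ in range(rounds):
        # Residuals are the negative gradient of the squared loss.
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        stump = fit_stump(X, residuals)
        stumps.append(stump)
        pred = [pi + lr * stump(xi) for pi, xi in zip(pred, X)]
    return lambda x: sum(lr * s(x) for s in stumps)

def auc(scores, y):
    # AUC as the probability a positive outranks a negative (ties count 0.5).
    pos = [s for s, yi in zip(scores, y) if yi == 1]
    neg = [s for s, yi in zip(scores, y) if yi == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy word-indicator matrix; columns are hypothetical feature words.
X = [[1, 0, 1], [1, 0, 0], [0, 1, 0], [0, 1, 1], [1, 1, 0], [0, 0, 1]]
y = [1, 1, 0, 0, 1, 0]
model = gbdt(X, y)
print(round(auc([model(x) for x in X], y), 3))
```

Comparing AUCs across different input word sets, as in Sect. 4.3, amounts to rerunning this loop on each feature subset and reading off the AUC.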

4 Results

4.1 Feature Words by Classification

Morphological analysis generated 2,301 words from the CT reports of the 100 patients, which were narrowed down to 789 words after excluding stop words, single-character words, and symbols. Words were classified as positive or negative with respect to the true cases of interstitial pneumonia by the SVM. The top 10 positive and negative feature words by SVM score are shown in Table 1. The extracted words represent lung sites and their conditions.

4.2 Prediction for Performance Measures

Three measures (w.o, d.o, and l.o), defined in Sect. 3.2, were used for feature selection, and the prediction performance was evaluated with respect to each measure. The baseline precision using all words was 0.700. The l.o measure attained two peaks of 0.894 (N = 2) and 0.795 (N = 4). The w.o and d.o measures attained peaks of 0.828 and 0.773 (N = 20), respectively (Fig. 2). The

Fig. 2 Precision in feature selection


Table 1  Top 10 positive and negative words by SVM score

Positive words
Seq   SVM score   Df¹
1     0.289       59
2     0.195       63
3     0.184       12
4     0.179       17
5     0.147       99
6     0.145       9
7     0.139       27
8     0.133       11
9     0.131       22
10    0.131       33
English glosses of the positive words, in the original order: reticulate, double lung, after resecting, right lung lower lobe, effect, shade and shadow, subject, acanthosis, same

Negative words
Seq   SVM score   Df¹
1     -0.227      11
2     -0.213      21
3     -0.193      38
4     -0.185      73
5     -0.160      30
6     -0.157      22
7     -0.156      7
8     -0.151      12
9     -0.150      13
10    -0.148      13
English glosses of the negative words, in the original order: activity, granular shadow, inflammatory, mediastinum hilar, obsolete, hepatosplenomegaly, left lung lower lobe, degree, lung field

¹ Document frequency

baseline recall using all words was 0.337. The l.o and w.o measures attained peaks of 0.975 (N = 3) and 0.914 (N = 6), respectively; other peaks appeared at N = 30–60 (Fig. 3). The baseline accuracy using all words was 0.597, with peaks of 0.774 at N = 9 and 0.905 at N = 60 (w.o) (Fig. 4). The baseline F-measure using all words was 0.419, with peaks of 0.794 at N = 9 and 0.906 at N = 60 (w.o) (Fig. 5). The w.o and l.o measures exhibited high performance in the feature selection.

4.3 Evaluation for Performance Measures

The GBDT was applied to the following patterns:
– Positive and negative words (789 words)
– Positive words (177 words)


Fig. 3 Recall in feature selection

Fig. 4 Accuracy in feature selection


Fig. 5 F-measure in feature selection

Fig. 6 GBDT (all words)


Fig. 7 GBDT (only positive words)

– Positive and negative words at N = 60 (w.o) (120 words)
– Positive and negative words at N = 30 (l.o) (60 words)

The AUCs obtained with each pattern were as follows: all words, AUC = 0.643 (Fig. 6); only the positive words by the SVM, AUC = 0.642 (Fig. 7); words with high SVM score (w.o), AUC = 0.762 (Fig. 8); words with high SVM log score (l.o), AUC = 0.751 (Fig. 9). The words "reticulate," "inflammatory," "mediastinum," "tumor," "shade and shadow," and "activity" were commonly recognized among the top words in Figs. 8 and 9.

5 Discussion

Using the CT reports, words were classified with respect to the true diagnosis of interstitial pneumonia by the SVM, and we then evaluated the prediction performance of these words under feature selection. We compared the words with high performance in feature selection against all other words using the GBDT. Finally, we extracted words that predict the diagnosis of interstitial pneumonia. In feature selection by SVM, the precision of l.o and the recall of w.o and


Fig. 8 GBDT (positive and negative words at N = 60 (w.o))

l.o showed that prediction performance was good with small N. The accuracy gradually increased with N for all indexes, while the F-measure was better for w.o than for the other indexes. The performance of the GBDT was better for the words selected by SVM feature selection than for all words; the effect of feature selection was confirmed by the AUC of the GBDT. Accurate analysis of RWD is an arduous task, particularly because text-based RWD is not structured. When applying machine learning to text-based RWD, it is therefore necessary to narrow down the variables.

6 Conclusion and Future Work

Although machine learning can achieve satisfactory prediction performance with many variables, text-based medical data are hard to process and interpret. Therefore, in this study, we combined three machine learning methods to achieve prediction with good performance, and extracted some words that could be specifically interpreted.


Fig. 9 GBDT (positive and negative words at N = 30 (l.o))

Consequently, we could construct a text phenotyping method. In the future, we aim to create an accurate rule to extract true cases using words extracted from RWD comprising medical text data in electronic medical records, including, but not limited to, admission reports, clinical notes, operation reports, and discharge summaries.

References
1. Newton, K.M., Peissig, P.L., Kho, A.N., Bielinski, S.J., Berg, R.L., Choudhary, V., Basford, M., Chute, C.G., Kullo, I.J., Li, R., Pacheco, J.A., Rasmussen, L.V., Spangler, L., Denny, J.C.: Validation of electronic medical record-based phenotyping algorithms: results and lessons learned from the eMERGE network. J. Am. Med. Inform. Assoc. 20(e1), e147–e154 (2013)
2. Hripcsak, G., Albers, D.J.: Next-generation phenotyping of electronic health records. J. Am. Med. Inform. Assoc. 20(1), 117–121 (2013)
3. Kirby, J.C., Speltz, P., Rasmussen, L.V., Basford, M., Gottesman, O., Peissig, P.L., Pacheco, J.A., Tromp, G., Pathak, J., Carrell, D.S., Ellis, S.B., Lingren, T., Thompson, W.K., Savova, G., Haines, J., Roden, D.M., Harris, P.A., Denny, J.C.: PheKB: a catalog and workflow for creating electronic phenotype algorithms for transportability. J. Am. Med. Inform. Assoc. 23(6), 1046–1052 (2016)


4. Demner-Fushman, D., Chapman, W.W., McDonald, C.J.: What can natural language processing do for clinical decision support? J. Biomed. Inform. 42(5), 760–772 (2009)
5. Zhu, F., Patumcharoenpol, P., Zhang, C., Yang, Y., Chan, J., Meechai, A., Vongsangnak, W., Shen, B.: Biomedical text mining and its applications in cancer research. J. Biomed. Inform. 46(2), 200–211 (2013)
6. Meystre, S.M., Savova, G.K., Kipper-Schuler, K.C., Hurdle, J.F.: Extracting information from textual documents in the electronic health record: a review of recent research. Yearb. Med. Inform. 2008, 128–144 (2008)
7. Coden, A., Savova, G., Sominsky, I., Tanenblatt, M., Masanz, J., Schuler, K., Cooper, J., Guan, W., de Groen, P.C.: Automatically extracting cancer disease characteristics from pathology reports into a disease knowledge representation model. J. Biomed. Inform. 42(5), 937–949 (2009)
8. Xu, H., Fu, Z., Shah, A., Chen, Y., Peterson, N.B., Chen, Q., Mani, S., Levy, M.A., Dai, Q., Denny, J.C.: Extracting and integrating data from entire electronic health records for detecting colorectal cancer cases. AMIA Annu. Symp. Proc. 2011, 1564–1572 (2011)
9. Yamashita, T., Wakata, Y., Hamai, S., Nakashima, Y., Iwamoto, Y., Flanagan, B., Nakashima, N., Hirokawa, S.: Extraction of key factors from operation records by support vector machine and feature selection. Indian J. Med. Inform. 8(2), 70–71 (2014)
10. Yamashita, T., Wakata, Y., Nakashima, N., Hamai, S., Nakashima, Y., Iwamoto, Y., Flanagan, B., Hirokawa, S.: Presumption model for postoperative hospital days from operation records. Int. J. Comput. Inf. Sci. 16(1), 50–59 (2015)
11. Sakai, T., Hirokawa, S.: Feature words that classify problem sentence in scientific article. In: Proceedings of the 14th International Conference on Information Integration and Web-based Applications and Services, pp. 360–367 (2012)
12. Joachims, T.: Making large-scale support vector machine learning practical. In: Advances in Kernel Methods, pp. 169–184 (1999)
13. Friedman, J.H.: Greedy function approximation: a gradient boosting machine. Ann. Stat. 29(5), 1189–1232 (2001)

An Optimization Model for the Tradeoff Between Efficiency and Equity for Mobile Stroke Unit Placement Saeid Amouzad Mahdiraji , Johan Holmgren, Radu-Casian Mihailescu, and Jesper Petersson

Abstract A mobile stroke unit (MSU) is an ambulance in which stroke patients can be diagnosed and treated. Recently, the placement of MSUs has been studied focusing on either maximum population coverage or equal service for all patients, termed efficiency and equity, respectively. In this study, we propose an unconstrained optimization model for the placement of MSUs, designed to introduce a tradeoff between efficiency and equity. The tradeoff is based on the concepts of weighted average time to treatment and the difference between the expected times to treatment for different geographical areas. We conduct a case study for Sweden's Southern Health care Region (SHR), generating three scenarios (MSU1, MSU2, and MSU3) including 1, 2, and 3 MSUs, respectively. We show that our proposed optimization model can tune the tradeoff between the efficiency and equity perspectives for MSU allocation. This enables a high level of equal service for most inhabitants, as well as a reduced time to treatment for most inhabitants of a geographic region. In particular, when placing three MSUs in the SHR with the proposed tradeoff, the share of inhabitants who are expected to receive treatment within an hour improved by about a factor of 14 in our model.

Keywords Driving time estimation · Efficient coverage · Equal treatment · Mobile stroke unit · Time to treatment · Tradeoff function

S. A. Mahdiraji (B) · J. Holmgren · R.-C. Mihailescu
Malmö University, Bassänggatan 2, 21119 Malmö, Sweden
e-mail: [email protected]
J. Holmgren e-mail: [email protected]
R.-C. Mihailescu e-mail: [email protected]
J. Petersson
Region Skåne, Fritz Bauersgatan 5, 21428 Malmö, Sweden
Lund University, Entrégatan 7, 22242 Lund, Sweden
J. Petersson e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Y.-W. Chen et al.
(eds.), Innovation in Medicine and Healthcare, Smart Innovation, Systems and Technologies 242, https://doi.org/10.1007/978-981-16-3013-2_15


1 Introduction Stroke, which is a medical condition resulting in reduced blood flow in the brain, is the second most common cause of death worldwide and a leading cause of permanent physical and cognitive disability, leaving people paralyzed and unable to perform their daily activities [1]. There are three main types of stroke: ischemic, hemorrhagic, and transient ischemic attack (TIA), each requiring specific treatment. In ischemic stroke, one or more clots reduce the blood flow inside the brain, and patients should receive thrombolysis and sometimes thrombectomy if the clot is large. A hemorrhagic stroke occurs when a blood vessel in the brain ruptures and blood flows into the surrounding tissues. Current guidelines for hemorrhagic stroke recommend very early start of blood pressure lowering therapy. A TIA occurs when the blood flow is temporarily blocked during a short period of time, allowing the brain function to fully recover. The type of stroke is determined by performing a computed tomography (CT) scan on the brain of the patient. It is well established that immediate treatment is essential for all types of stroke; in particular, the term “golden hour” is proposed for ischemic stroke, asserting that the patients who receive treatment within an hour of symptom onset, have a much higher chance for full recovery than the patients with later start of treatment [2]. However, due to logistical challenges, it is often difficult to provide fast enough treatment for stroke patients. As CT scanners are typically only available at hospitals, the patient first must be transported to an acute hospital for diagnosis and treatment. Mobile stroke units (MSUs), in addition to regular ambulances, have been deployed in some areas (for example: Berlin, Cleveland and Melbourne) as an alternative for prehospital diagnosis and care [2]. 
An MSU is a specialized ambulance equipped with a CT scanner, allowing the ambulance personnel to diagnose stroke patients and provide intravenous stroke treatment in the ambulance. The use of MSUs therefore reduces at least the time required to transport the patient to the hospital and to diagnose the patient there. However, MSUs are expensive, and it is important to locate them so that they provide maximum benefit for the patients. Optimization modeling has been used to identify efficient ambulance/MSU locations in a region using two perspectives: efficiency and equity [3–5]. The majority of the existing studies aim to reach maximum population coverage by considering efficiency, i.e., placing ambulances in an optimal way to cover as many persons as possible. Equity, which is covered in some studies, seeks to provide equal service for all patients, regardless of where they live. It should be emphasized that each of these perspectives has a bias towards a particular group of inhabitants, i.e., residents of densely populated or rural areas, respectively. We contribute an objective function, used in optimization modeling, to tune the tradeoff between efficiency and equity for the optimal placement of MSUs in a geographic region. In a scenario study, we evaluate the proposed model by comparing the current situation in Sweden's Southern Health care Region (SHR) with three generated MSU scenarios. The computational results show that the presented approach can be used to tune the tradeoff between efficiency and equity.


The rest of this paper is organized as follows. Section 2 reviews the related work. Section 3 introduces the time to treatment estimation model. Section 4 presents an optimization model that makes a tradeoff between efficiency and equity for optimal MSU placement. Our scenario study is presented in Sect. 5, and the results and discussion are given in Sect. 6. Finally, Sect. 7 concludes the paper.

2 Related Work

In the emergency medical services (EMS) literature, efficiency and equity have been assessed by the coverage of urban and rural areas, respectively. Efficiency emphasizes the deployment of an MSU in a place where it potentially helps a larger number of people to get a shorter time to thrombolysis. The focus of equity is to cut down the time to thrombolysis for patients residing far from the hospitals providing thrombolysis and CT scans. However, while the goal of efficiency is to provide maximum population coverage, there is no agreed definition of equity or of how it should be measured. Equitable service could be measured using the range [6], variance [7], mean absolute deviation [8], squared coefficient of variation [9], Gini coefficient [10], and envy criteria [11]. The tradeoff between efficiency and equity for the optimal placement of regular ambulances has recently gained attention in the research community. Enayati et al. [10] use multi-objective optimization for location and dispatching problems and achieve a balanced solution considering both efficiency and equity. Chanta et al. [12] use a bicriteria optimization framework to combine efficiency and equity. Toro-Díaz et al. [9] develop a large-scale EMS system considering the efficiency and equity criteria, using a Tabu Search-based heuristic with an embedded approximation procedure. Some articles address the optimal placement of MSUs, with a main focus on the benefits for residents in urban areas (efficiency perspective) [3, 4] or rural areas (equity perspective) [5]. Rhudy Jr. et al. [4] employ a geospatial analysis of the distribution of MSUs to optimize service delivery for stroke patients in the city of Memphis. Phan et al. [3] use Google Maps to find the optimal location of an MSU in Sydney. Dahllöf et al. [13] use expected value optimization to identify the optimal placement of an MSU in the Skåne county of Sweden.
To the best of our knowledge, no prior studies have tried to optimally place an MSU in a geographical region considering a tradeoff between efficiency and equity. In the current study, we extend the work of Dahllöf et al. [13] by explicitly incorporating the tradeoff between efficiency and equity. While the abovementioned studies have a restricted set-up with only one MSU, in the present paper, we provide a generalized analysis for one or more MSUs. Finally, while prior studies mostly assess their approaches in highly populated areas, we apply our approach in Sweden’s Southern Health care Region, which includes both urban and rural areas.


3 Time to Treatment Estimation Model

In this section, we present our time to treatment estimation model for patients located at different places in a geographical region. See the companion paper by Amouzad Mahdiraji et al. [14] for a more detailed description of the model. We divided the region of study (ROI) into a set $R$ of disjoint $1 \times 1\,\mathrm{km}^2$ sub-regions, enabling us to take into account the variation of the population density and of the expected time to treatment over various parts of the ROI. In our calculations, we assumed that all of the patients in a square $r \in R$ are located in its center $c_r$, which means that all transports to and from sub-region $r \in R$ are assumed to be made to and from the center $c_r$ (of $r$).

Let $L^{AMB}$, $L^{MSU}$, and $L^{H}$ denote the sets of regular ambulance sites, MSU locations, and acute hospital locations in the ROI, respectively. For a regular ambulance located at ambulance site $l \in L^{AMB}$, we let $t_l^{AMBRESP}$ be the expected response time, i.e., the time from an emergency call until an ambulance dispatches; $t_{lr}^{AMBLR}$ the expected time to drive from ambulance location $l$ to the centroid $c_r$ of sub-region $r$; $t^{AMBLAY}$ the expected layover time, i.e., the time from when the ambulance has arrived at the patient until it departs; $t_{rh}^{AMBRH}$ the expected time to drive from $c_r$ to acute hospital location $h \in L^{H}$; and $t_h^{DTN}$ the expected time for diagnosis at acute hospital $h \in L^{H}$, i.e., the expected time from the arrival of the patient at the hospital until treatment is initiated. For an MSU located at ambulance site $l \in L^{MSU}$, $t_l^{MSURESP}$ is the expected time from an emergency call until an MSU starts driving towards the patient site, $t_{lr}^{MSULR}$ is the expected time to drive from $l$ to $c_r$, $t^{MSULAY}$ is the expected layover time for an MSU, and $t^{MSUDIAG}$ is the expected time to diagnose a stroke patient inside an MSU.

The expected time to treatment, using only the regular ambulances located in $L^{AMB}$, for a patient located in square $r \in R$ is estimated as:

$$t_r^{AMBTT} = \min_{l \in L^{AMB},\, h \in L^{H}} \left( t_l^{AMBRESP} + t_{lr}^{AMBLR} + t^{AMBLAY} + t_{rh}^{AMBRH} + t_h^{DTN} \right) \quad (1)$$

Equation (1) is the minimum expected time to treatment with respect to the nearest ambulance site $l$ and the nearest acute hospital $h$ for the patient located in square $r$. Assuming that only the MSUs located in $L^{MSU}$ can be used, the expected time to treatment for a patient located in square $r \in R$ is estimated as:

$$t_r^{MSUTT} = \min_{l \in L^{MSU}} \left( t_l^{MSURESP} + t_{lr}^{MSULR} + t^{MSULAY} + t^{MSUDIAG} \right) \quad (2)$$

When both regular ambulances and MSUs are available, the expected time to treatment for a patient located in square $r \in R$ is estimated as:

$$t_r^{TT} = \min\left( t_r^{AMBTT},\, t_r^{MSUTT} \right) \quad (3)$$
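The estimates in Eqs. (1)–(3) can be sketched in Python. This is a minimal illustration, not the authors' implementation; the site and hospital identifiers and the time tables passed in are hypothetical stand-ins for the OSM-derived driving times described later in the scenario study.

```python
# Sketch of Eqs. (1)-(3); all times are in hours, and the lookup tables
# (t_resp, t_drive_lr, t_drive_rh, t_dtn) are hypothetical inputs.

def amb_time_to_treatment(r, amb_sites, hospitals,
                          t_resp, t_drive_lr, t_lay, t_drive_rh, t_dtn):
    """Eq. (1): best combination of ambulance site l and hospital h for r."""
    return min(t_resp[l] + t_drive_lr[l][r] + t_lay + t_drive_rh[r][h] + t_dtn[h]
               for l in amb_sites for h in hospitals)

def msu_time_to_treatment(r, msu_sites, t_resp, t_drive_lr, t_lay, t_diag):
    """Eq. (2): an MSU diagnoses and treats on site, so there is no hospital leg."""
    return min(t_resp[l] + t_drive_lr[l][r] + t_lay + t_diag for l in msu_sites)

def time_to_treatment(t_amb, t_msu):
    """Eq. (3): with both vehicle types available, the faster option is taken."""
    return min(t_amb, t_msu)
```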

An Optimization Model for the Tradeoff …

187

4 Tradeoff Between the Efficiency and Equity Perspectives

In the health care domain, efficiency can be loosely defined as helping as many patients as possible as much as possible. Efficiency can be measured in terms of the total cost for society or the average time to treatment for patients with the same diagnosis. In the current study, we chose the time to treatment as the efficiency measure; in particular, we used the weighted average time to treatment (WATT) $t^{TT}$, which is calculated as:

$$t^{TT} = \sum_{r \in R} t_r^{TT} \cdot Q_r \quad (4)$$

where $Q_r$ is the share of stroke cases (in the ROI) that is expected to take place in sub-region $r \in R$ ($\sum_{r \in R} Q_r = 1$) and $t_r^{TT}$ is the expected time to treatment for a patient located in sub-region $r \in R$. Efficiency aims to place MSUs where they reduce the expected time to treatment most for the whole region. Urban areas seem to be favorable MSU locations in this perspective because they are highly populated. Our choice of efficiency definition aims to find an MSU location set that minimizes Eq. (4). This corresponds to the optimization problem:

$$\arg\min_{s \in S} t_s^{TT} = \arg\min_{s \in S} \sum_{r \in R} t_{r,s}^{TT} \cdot Q_r \quad (5)$$

where $S$ denotes the set of all possible MSU allocations, or in other words, all possible solutions to the optimization problem, and $s \in S$ denotes a particular MSU allocation. It is worth noting that $s$ defines whether or not an MSU is allocated to each of the ambulance locations. Furthermore, $t_s^{TT}$ is the WATT for the whole ROI considering the MSU allocation $s$, and $t_{r,s}^{TT}$ is the expected time to treatment for sub-region $r \in R$ considering allocation $s$.

In a system with perfect equity, all patients have the same time to treatment, regardless of who they are and where they are located. Several quantitative measures have been suggested in order to measure the equity of a system. Among the equity measures described in Sect. 2, we believe that the range, variance, mean absolute deviation, squared coefficient of variation, and Gini index are the most practical for the MSU location problem, as they are among the strictest measures of equity and they do not have a built-in tradeoff due to variation of the population density. We compared the mentioned equity measures and decided to employ the range as our measure of equity. The range measure strives to minimize the difference between the expected times to treatment for patients located at different places in the ROI. The optimal MSU locations, considering the equity perspective, can be identified by solving the following optimization model:

188

S. A. Mahdiraji et al.

$$\arg\min_{s \in S} \left( \max_{r \in R}\{ t_{r,s}^{TT} \} - \min_{r \in R}\{ t_{r,s}^{TT} \} \right) \quad (6)$$

We make use of the efficiency and equity equations defined above to establish our tradeoff function, which is the objective function in the optimization problem:

$$\arg\min_{s \in S} \left( (1 - W)\, F_{\mathrm{efficiency}} + W\, G_{\mathrm{equity}} \right) \quad (7)$$

where $F_{\mathrm{efficiency}}$ and $G_{\mathrm{equity}}$ are the efficiency (see Eq. 5) and equity (see Eq. 6) functions, respectively. $W$ is a controlling weight, varying from 0 to 1, which determines the impact of efficiency and equity on the allocation of MSUs. After assigning a value to $W$, the optimization problem is solved by finding the MSU allocation $s$ that minimizes Eq. (7). In particular, when $W = 0$ or $W = 1$, only efficiency or equity is considered, respectively. If, for example, $W = 0.5$, equal weight is given to efficiency and equity. We applied the weight $W$ to the tradeoff function in order to support the public health authorities' priorities regarding efficiency and equity when deciding where to place the MSU(s). For smaller problem instances, our optimization problem can be solved using exhaustive search; however, larger problem instances might require more sophisticated solution methods. It is also possible to add further constraints to the presented optimization models, e.g., a limit on the number of MSUs to locate, which is a restriction of the solution set (i.e., $S$).
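For small instances, the tradeoff can indeed be minimized by exhaustive search. The sketch below uses illustrative names, a hypothetical table `t_tt` mapping each candidate allocation to per-sub-region treatment times, and omits the min-max normalization applied later in the scenario study:

```python
import itertools

def watt(allocation, t_tt, Q):
    """Eq. (5) inner sum: weighted average time to treatment for one allocation."""
    return sum(t_tt[allocation][r] * Q[r] for r in Q)

def time_range(allocation, t_tt):
    """Eq. (6) inner term: range of the expected times to treatment."""
    times = t_tt[allocation].values()
    return max(times) - min(times)

def best_allocation(candidate_sites, n_msus, t_tt, Q, W):
    """Eq. (7): exhaustive search over all ways to place n_msus MSUs."""
    return min(itertools.combinations(candidate_sites, n_msus),
               key=lambda s: (1 - W) * watt(s, t_tt, Q) + W * time_range(s, t_tt))
```

With `W = 1` the search returns the allocation with the smallest range; with `W = 0` it returns the one with the smallest WATT.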

5 Scenario Study

In order to evaluate the proposed optimization model for trading off between efficiency and equity when placing MSUs, we applied our optimization model (see Eq. 7) to Sweden's Southern Health Care Region (SHR), where about 3900 stroke incidents occur annually [15]. The SHR consists of 4 counties and 49 municipalities, and contains 13 acute hospitals equipped with a CT scanner and 39 ambulance sites. An overview of the SHR is provided in Fig. 1, where the green triangles and purple circles represent the locations of ambulance sites and acute hospitals, respectively. For future reference, each circled number in Fig. 1 specifies the corresponding ambulance site ID.

We used two types of data: demographic data from Statistics Sweden [15] and stroke data for 2018, provided by Sweden's Southern Regional Health Care Committee [16]. The ROI (i.e., the SHR) was divided into a disjoint set of $1 \times 1\,\mathrm{km}^2$ squares, denoted by $r \in R$; the union of all squares, $\bigcup_{r \in R} r$, equals the SHR. The locations of ambulance sites and acute hospitals were acquired using Google Maps and official documentation provided by the health care authorities in the region. The stroke data included the number of stroke cases for 21 age groups, $\{[0, 4), [4, 8), \ldots, [95, 99), [100, \infty)\}$. In addition, the demographic data contained


Fig. 1 An overview of SHR, where ambulance sites and acute hospital locations are shown by green triangles and purple circles, respectively. The circled numbers show the corresponding ambulance site ID

the number of inhabitants for each age group and each of the $1 \times 1\,\mathrm{km}^2$ squares covering the SHR. Using this information, we calculated, for each age group, the likelihood that a person will suffer a stroke during a specific time period, i.e., the year 2018 in this study. Using the calculated stroke likelihoods, we computed the expected number of stroke incidents $I_r$ for each sub-region $r \in R$. The share of the stroke incidents that is expected to occur in sub-region $r \in R$ is given by:

$$Q_r = \frac{I_r}{I} \quad (8)$$

where $I = \sum_{r \in R} I_r$ is the expected number of stroke cases in the SHR. It should be emphasized that efficiency and equity may have different ranges of values; hence, it is likely that one of them dominates the other in the tradeoff function


(Eq. 7). In order to tackle this problem, we normalized the values of each perspective, using min–max scaling, before applying them to Eq. (7).

In our scenario study, we created four different scenarios for possible MSU allocations. The first one is a baseline scenario, the current situation where only regular ambulances are located at all 39 ambulance sites in the SHR. We also created three scenarios including 1 (called MSU1), 2 (called MSU2), and 3 MSUs (called MSU3), respectively, in addition to the regular ambulances. In MSU1–MSU3, we assumed that all 39 ambulance sites in the SHR are candidate locations for MSUs. For example, in MSU2, the MSU location set $S$ contains all possible ways to allocate two MSUs, i.e., the following combinations of ambulance locations: $\{[1, 2], [1, 3], \ldots, [1, 39], \ldots, [38, 39]\}$. For example, $[1, 3]$ means that MSUs are located at ambulance sites 1 and 3 (see Fig. 1).

For each of the considered scenarios (baseline, MSU1, MSU2, and MSU3), we solved our tradeoff optimization problem (Eq. 7) for each of a number of tradeoff weights $W = \{0, 0.05, 0.1, \ldots, 0.95, 1\}$. It should be noted that each optimization problem contains $2^n$ possible solutions (i.e., possible MSU allocations $S$), where $n$ is the number of candidate ambulance sites. In order to solve the optimization problem, we enumerated over all of the possible solutions. We estimated the driving times of both regular ambulances and MSUs using the driving time generation functionality provided by OpenStreetMap (OSM). Due to limitations of the available data, we consulted a neurologist with insight into stroke logistics in order to make a few additional assumptions.
We assumed that a regular ambulance drives 5% faster than a normal car, and an MSU drives at the same speed as a normal car; the response time for both regular ambulances and MSUs is 3 min, i.e., $t_l^{AMBRESP} = t_l^{MSURESP} = 0.05$ h; the layover time for both regular ambulances and MSUs is 15 min, i.e., $t^{AMBLAY} = t^{MSULAY} = 0.25$ h; the time for diagnosis of a patient inside an MSU is 15 min, i.e., $t^{MSUDIAG} = 0.25$ h; the expected time for diagnosis at each of the considered hospitals is 35 min, i.e., $t_h^{DTN} = 0.583$ h; ambulance sites and hospitals are open and can provide service 24/7; the MSUs in each MSU scenario can provide service over the whole SHR; and all stroke cases occur at the patients' homes.
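Some of the quantities above can be sketched directly: the stroke shares of Eq. (8), the min-max scaling used to make the two perspectives comparable, and the size of the MSU2 candidate set (function names are illustrative, not the authors' code):

```python
from math import comb

def stroke_shares(expected_incidents):
    """Eq. (8): the share Q_r of expected stroke cases per sub-region."""
    total = sum(expected_incidents.values())
    return {r: i / total for r, i in expected_incidents.items()}

def min_max_scale(values):
    """Min-max scaling so the efficiency and equity values share the
    [0, 1] range before they are combined in Eq. (7)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# In MSU2, every unordered pair of the 39 ambulance sites is a candidate:
n_msu2_allocations = comb(39, 2)  # 741 candidate allocations
```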

6 Results and Discussion

In Fig. 2, we display, for each of our MSU scenarios, the performance of the tradeoff function regarding the WATT and the range when solving the optimization model for different values of $W$. In each of the plots, the x-axis is the weight $W$, which varies between 0 and 1, and the y-axis is the normalized value. When $W = 0$, the tradeoff represents the efficiency perspective; as $W$ increases towards 1, the tradeoff is gradually shifted towards equity.

In Table 1, we compare the performance of the efficiency, equity, and tradeoff perspectives (when efficiency and equity have the same weight in the tradeoff, $W = 0.5$) for each of the three MSU scenarios with the baseline scenario. The results indicate that after placing MSUs in the SHR, the WATT will decrease for all MSU scenarios, and the


Fig. 2 Performance of the tradeoff regarding the WATT and the range for different values of W, concerning a MSU1, b MSU2, and c MSU3

Table 1 WATT (in hours), range (in hours), and the share of the inhabitants in the SHR who are expected to receive treatment within 60, 75, and 90 min for each scenario and perspective. The information within square brackets is the optimal ambulance site IDs

| Perspective | Scenario | MSU sites | WATT $t^{TT}$ (h) | Range (h) | < 60 min (%) | < 75 min (%) | < 90 min (%) |
|---|---|---|---|---|---|---|---|
| Efficiency | Baseline | – | 1.33 | 1.91 | 3.96 | 47.76 | 80.89 |
| | MSU1 | [38] | 1.16 | 2.24 | 40.13 | 66.76 | 86.59 |
| | MSU2 | [13, 38] | 1.09 | 2.21 | 51.54 | 74.71 | 89.55 |
| | MSU3 | [13, 17, 38] | 1.04 | 2.21 | 57.67 | 81.25 | 92.44 |
| Equity | MSU1 | [3] | 1.28 | 1.67 | 9.2 | 51.37 | 84.43 |
| | MSU2 | [2, 3] | 1.21 | 1.62 | 18.85 | 61.86 | 88.44 |
| | MSU3 | [10, 14, 22] | 1.21 | 1.57 | 12.24 | 64.33 | 91.13 |
| Tradeoff [0.5, 0.5] | MSU1 | [3] | 1.28 | 1.67 | 9.2 | 51.37 | 84.43 |
| | MSU2 | [3, 38] | 1.11 | 1.67 | 45.37 | 70.38 | 90.13 |
| | MSU3 | [2, 3, 38] | 1.05 | 1.60 | 55.02 | 78.52 | 93.21 |

higher the number of MSUs, the greater the expected reduction in the WATT. The tradeoff, in contrast, significantly shortens not only the WATT but also the range for all three MSU scenarios in comparison to the baseline. In particular, the tradeoff with weight 0.5 reduced the WATT by 3, 13.2, and 16.8 min for MSU1, MSU2, and MSU3, respectively. Furthermore, the range measure is expected to be reduced from 1.91 h for the baseline to 1.67 h for MSU1 and MSU2, and to 1.6 h for MSU3.

In Table 1, we also present the share of the total population of the SHR, i.e., 1,687,190 inhabitants, that is expected to get treatment within 60, 75, and 90 min for each of the scenarios and perspectives. The share of the population expected to receive treatment within 60 min increases from 3.96% for the baseline to 9.2%, 45.37%, and 55.02% for MSU1, MSU2, and MSU3, respectively. In particular, for MSU3, placing 3 MSUs in the SHR using the proposed tradeoff is expected to lead to an


improvement of approximately a factor of 14 concerning the share of persons who get treatment within an hour. It is noteworthy that comparing the results for the different perspectives verifies that the WATT has an inverse relationship with the share of persons expected to receive treatment within 60 min. The results also indicate that after placing MSU(s) using the tradeoff weight $W = 0.5$, the time to treatment is expected to decrease for 29% (187,150 inhabitants), 59% (1,003,167 inhabitants), and 75% (1,272,676 inhabitants) of the total population considering MSU1, MSU2, and MSU3, respectively. In particular, the time to treatment of 142,604, 799,336, and 1,023,684 inhabitants is expected to decrease by up to 30 min.

7 Conclusion

We contribute an unconstrained optimization model designed to introduce a tradeoff between the two perspectives of efficiency and equity when allocating mobile stroke units (MSUs) in a geographical region. The weighted average time to treatment (WATT) and the range were our choices of efficiency and equity measures, respectively. In a scenario study, we evaluated our optimization model by comparing the current situation in Sweden's Southern Health Care Region (SHR), represented by a baseline scenario, with three MSU scenarios including 1, 2, and 3 MSUs, respectively. Our experimental results show that the use of the proposed tradeoff function has the potential to balance efficiency and equity by reducing both the WATT and the range compared to the baseline scenario. We also conclude that the use of the proposed optimization model for MSU placement may contribute to equal service for most inhabitants as well as substantially increase the number of persons who are expected to get treatment within 60 min.

Funding This work was partially funded by Sweden's Southern Regional Health Care Committee and The Kamprad Family Foundation for Entrepreneurship, Research & Charity.

References

1. World Stroke Organization. https://www.world-stroke.org/world-stroke-day-campaign/whystroke-matters/learn-about-stroke/. Accessed 20 Dec 2019
2. Ebinger, M., Winter, B., Wendt, M., Weber, J.E., Waldschmidt, C., Rozanski, M., Kunz, A., Koch, P., Kellner, P.A., Gierhake, D.: Effect of the use of ambulance-based thrombolysis on time to thrombolysis in acute ischemic stroke: a randomized clinical trial. JAMA 311(16), 1622–1631 (2014)
3. Phan, T.G., Beare, R., Srikanth, V., Ma, H.: Googling location for mobile stroke unit hub in metropolitan Sydney. Front. Neurol. 10, 810 (2019)


4. Rhudy Jr., P., Alexandrov, A.W., Rike, J., Bryndziar, T., Maleki, A.H.Z., Swatzell, V., Dusenbury, W., Metter, E.J., Alexandrov, A.V.: Geospatial visualization of mobile stroke unit dispatches: a method to optimize service performance. Intervent. Neurol. 7(6), 464–470 (2018)
5. Mathur, S., Walter, S., Grunwald, I.Q., Helwig, S.A., Lesmeister, M., Fassbender, K.: Improving prehospital stroke services in rural and underserved settings with mobile stroke units. Front. Neurol. 10 (2019)
6. Drezner, T., Drezner, Z.: Equity models in planar location. Comput. Manage. Sci. 4(1), 1–16 (2007)
7. Drezner, T., Drezner, Z.: A note on equity across groups in facility location. Nav. Res. Logist. (NRL) 58(7), 705–711 (2011)
8. Mulligan, G.F.: Equality measures and facility location. Pap. Region. Sci. 70(4), 345–365 (1991)
9. Toro-Díaz, H., Mayorga, M.E., McLay, L.A., Rajagopalan, H.K., Saydam, C.: Reducing disparities in large-scale emergency medical service systems. J. Oper. Res. Soc. 66(7), 1169–1181 (2015)
10. Enayati, S., Mayorga, M.E., Toro-Díaz, H., Albert, L.A.: Identifying trade-offs in equity and efficiency for simultaneously optimizing location and multipriority dispatch of ambulances. Int. Trans. Oper. Res. 26(2), 415–438 (2019)
11. Espejo, I., Marín, A., Puerto, J., Rodríguez-Chía, A.M.: A comparison of formulations and solution methods for the minimum-envy location problem. Comput. Oper. Res. 36(6), 1966–1981 (2009)
12. Chanta, S., Mayorga, M.E., McLay, L.A.: Improving emergency service in rural areas: a bi-objective covering location model for EMS systems. Ann. Oper. Res. 221(1), 133–159 (2014)
13. Dahllöf, O., Hofwimmer, F., Holmgren, J., Petersson, J.: Optimal placement of mobile stroke units considering the perspectives of equality and efficiency. Proc. Comput. Sci. 141, 311–318 (2018)
14. Amouzad Mahdiraji, S., Dahllöf, O., Hofwimmer, F., Holmgren, J., Mihailescu, R.-C., Petersson, J.: Mobile stroke units for acute stroke care in the south of Sweden. Cogent Eng. 8(1) (2021). https://doi.org/10.1080/23311916.2021.1874084
15. Statistics Sweden. https://www.scb.se/. Accessed 10 July 2018
16. Sweden's Southern Regional Health Care Committee. https://sodrasjukvardsregionen.se/. Accessed 10 July 2018

Method for Supporting Diagnostics

Automatic Joint Position Estimation Method for Diagnosis Support System in Rheumatoid Arthritis

Tomio Goto, Ryota Fujimura, and Koji Funahashi

Abstract Rheumatoid arthritis is a chronic inflammatory disease characterized by inflammation of the synovial membrane of the joints. The destruction of the joints results in a narrowing of the interarticular space, distortion of the joints, and deformity of the fingers. Currently, the mainstream method of diagnosing such symptoms is direct visual inspection of X-ray images by a physician. However, diagnosing the progression of joint destruction requires much time and effort, because the number of joints to be observed is large and it is difficult to accurately read minor pathological changes. In addition, the subjective evaluation by the physician suffers from large intra- and inter-examiner variability and lacks reproducibility, which is a major problem. To address these problems, the goal of this work is to develop a quantitative, labor-saving diagnosis support system based on image processing. In this paper, we propose a method to improve the accuracy of joint position estimation in X-ray images in order to obtain the temporal variation between two images, and we confirm the effectiveness of the proposed method by experiments.

Keywords Joint position estimation · Automatic measurement · Rheumatoid arthritis · Medical examinations

T. Goto (B) · R. Fujimura
Department of Computer Science, Nagoya Institute of Technology, Gokiso-cho, Showa-ku, Nagoya 466-8555, Japan
e-mail: [email protected]
URL: http://iplab.web.nitech.ac.jp/
K. Funahashi
Orthopaedic Surgery, Kariya Toyota General Hospital, 5-15 Sumiyoshi-cho, Kariya, Aichi 448-8505, Japan
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021
Y.-W. Chen et al. (eds.), Innovation in Medicine and Healthcare, Smart Innovation, Systems and Technologies 242, https://doi.org/10.1007/978-981-16-3013-2_16

1 Introduction

X-ray images have been widely used to diagnose various kinds of diseases. However, to reduce the patient's exposure to radiation, the X-ray dosage must be minimized as


Fig. 1 Overview of rheumatoid arthritis

much as possible. As a result, X-ray images contain a significant amount of noise and resolution is compromised. Thus, we proposed a super-resolution system [1, 2] for X-ray images, which consists of total variation (TV) regularization [3–6], a shock filter [7, 8], and a median filter. In addition, we proposed a measurement algorithm for the treatment of rheumatoid arthritis, using X-ray images generated by our proposed super-resolution system [9]. We also proposed an automatic Joint Space Distance (JSD) measurement method to measure JSDs automatically [10].

2 Diagnostic Imaging Support System

Rheumatoid arthritis, illustrated in Fig. 1, is a disease that affects about 56,000 patients in Japan. In rheumatoid arthritis patients, significant damage is detected radiographically within two years after early symptoms appear, so appropriate treatment should be administered as soon as possible. However, it is difficult to find the early symptoms of rheumatoid arthritis, so an accurate measurement method is required. The Sharp score [11, 12] has been proposed as a method to evaluate the progression of rheumatoid arthritis by image analysis. Because this method has only 5 to 6 evaluation stages, it is unable to evaluate minute changes in the early stage of the disease, resulting in missed opportunities for treatment. Therefore, in a previous study [10], an application to measure the interarticular space distance was developed for the purpose of quantitatively understanding the disease state. However, in the machine learning method [13] using AdaBoost, the detection range is limited to the MCP joint (third joint) to reduce false positives caused by the low accuracy of joint detection. Furthermore, it is difficult to quantify minor changes in the joint space distance because the distance between joints is calculated using the vertices of edge curves. Therefore, in this paper, we propose a system to quantify small changes by superimposing images from two time points.

Fig. 2 Block diagram of proposed method
[Flowchart: past and current X-ray input images → finger region extraction → finger centerline extraction → joints detection → joint position estimation → joints alignment → quantitative evaluation → output results]

3 Proposed Method

The process flow of the proposed method is shown in Fig. 2, and each step is described below.

3.1 Finger Region Extraction Process

Figure 3 shows an example of an X-ray image. As shown in Fig. 3, an X-ray image may contain information other than the hand, such as the photographing equipment and the photographing situation. Such regions must be excluded because they greatly affect the accuracy of joints detection. Therefore, the purpose of this process is to extract the finger regions, the palm region, and the finger regions containing joints from the input image by automatically identifying the hand structure.

First, we binarize the input image using the percentile method [14]. The parameter p of this method was set to 50% in preliminary experiments; this value is used throughout the paper because binarization was performed correctly for all of the experimental X-ray images with it. The background component is removed by the region expansion (region growing) method on this binary image. Since the input image is an X-ray image of both hands, the left hand region can be extracted by using the center point of the left half of the image as the starting point, and the right hand region by using the center point of the right half. The left hand region extracted in this way is shown in Fig. 4b.

Next, we perform the Euclidean distance transform [15] on the hand region image. Since the Euclidean distance transform is a transform that gives the shortest


Fig. 3 An example of X-ray image

distance from each pixel to the contour of a figure, the palm region can be extracted by considering the point with the largest distance value as the center of the palm and performing the inverse distance transform. Furthermore, the finger region can be extracted by extending the hand region upward from the palm region. Figure 4c, d show the distance transformed image and the identified region, respectively.
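The binarization and distance-transform steps can be illustrated as follows. This is a toy sketch on nested lists, not the authors' code: the brute-force distance transform is quadratic and merely stands in for an efficient algorithm such as [15] (or scipy.ndimage.distance_transform_edt), and the percentile thresholding is one plausible reading of the method.

```python
import math

def percentile_binarize(img, p=50):
    """Percentile-method binarization: choose the threshold so that about
    p% of the pixels fall below it (p = 50 in this paper), then keep the
    brighter pixels as foreground."""
    pixels = sorted(v for row in img for v in row)
    thresh = pixels[min(int(len(pixels) * p / 100), len(pixels) - 1)]
    return [[v > thresh for v in row] for row in img]

def distance_transform(mask):
    """Brute-force Euclidean distance transform: each foreground pixel gets
    its distance to the nearest background pixel (illustration only)."""
    h, w = len(mask), len(mask[0])
    bg = [(y, x) for y in range(h) for x in range(w) if not mask[y][x]]
    return [[min(math.hypot(y - by, x - bx) for by, bx in bg) if mask[y][x] else 0.0
             for x in range(w)] for y in range(h)]

def palm_center(mask):
    """Palm center = the foreground pixel with the largest distance value."""
    dist = distance_transform(mask)
    h, w = len(mask), len(mask[0])
    return max(((y, x) for y in range(h) for x in range(w)),
               key=lambda p: dist[p[0]][p[1]])
```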

3.2 Finger Centerline Extraction Process

In order to calculate the features of a finger region, it is necessary to know in advance the position and inclination of the bones. In this process, we use the distance transformed image obtained in Sect. 3.1 to determine the skeleton of the finger and extract the finger centerline. First, we define a maximum in the image as a pixel whose value is greater than or equal to the distance values of its 8 neighboring pixels. Then, we extract the maxima in the finger region from the distance transformed image. However, since some maxima appear at non-skeletal locations, we exclude outliers by considering the five regions with the largest areas as the finger skeleton. The coordinates of the obtained maxima are then thinned out and interpolated using a spline function, which produces a smoothed finger centerline. However, in our method, the maxima near the MCP joint are tilted toward the center of the palm because the skeleton is extracted based on the distance transformed image. To solve this problem, we use the coordinates on the basal bone along the centerline generated above to obtain the inclination of the basal bone by the least-squares method and perform a scaling interpolation. Since the finger deformity caused by rheumatoid arthritis originates from the joint, the finger centerline passing through the MCP joint can be generated by calculating the slope of the basal bone unless the symptoms are

Fig. 4 An example of finger region extraction: a left hand region extraction, b binarization, c distance transformed image, d identified region image

severe. Figure 5 shows an example of the finger centerline extracted by the above process.
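The maxima extraction on the distance-transformed image can be sketched as a simple 8-neighbour test over interior pixels; outlier removal and the spline interpolation are omitted, and the function name is illustrative.

```python
def local_maxima(dist):
    """Return pixels whose distance value is greater than or equal to all
    8 neighbors (skeleton candidates); borders and zero pixels are skipped."""
    h, w = len(dist), len(dist[0])
    out = []
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            nb = [dist[y + dy][x + dx]
                  for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                  if (dy, dx) != (0, 0)]
            if dist[y][x] > 0 and dist[y][x] >= max(nb):
                out.append((y, x))
    return out
```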

3.3 Joints Detection Process

In general, the joint region contains more edges than the bone region due to the existence of the joint cavity. Using this feature, we identify the joint position by calculating the brightness gradient of the finger region.


Fig. 5 An example of finger centerline

Fig. 6 A definition of rectangular region

3.4 Feature Calculation

First, we define a rectangular region along the finger centerline as shown in Fig. 6. The rectangle size is set according to the finger width; in this paper, it is set to 20 pixels in length and 80 pixels in width. Then, we calculate the luminance gradient in the rectangular region. In order to consider the luminance gradient component perpendicular to the joint, let $h$ be the luminance gradient vector and $\theta$ the angle between $h$ and the tangential direction of the finger centerline. The luminance gradient component $h_t$ in the tangential direction is:

$$h_t = |h \cos \theta| \quad (1)$$


Fig. 7 A definition of tangential luminance gradient (gradient vector h, tangential component h_t, angle θ)
[Fig. 8 plot: feature values versus distance from the fingertips, with the first, second, and third domains and the detection positions marked]

Fig. 8 An example of transition of resulting feature values

This relationship is shown in Fig. 7. Next, we define the average value of the tangential luminance gradient in the rectangular region as the feature value, and repeat the same calculation for the rectangular region from the fingertip to the palm region. Figure 8 shows an example of the transition of the resulting feature values.
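Equation (1) and the per-rectangle feature can be sketched as follows, assuming the gradient magnitudes and their angles to the centerline tangent are precomputed (the helper names are hypothetical):

```python
import math

def tangential_component(grad, theta):
    """Eq. (1): component of the luminance-gradient magnitude along the
    finger-centerline tangent, h_t = |h cos(theta)|."""
    return abs(grad * math.cos(theta))

def rectangle_feature(grads_and_angles):
    """Feature value: average tangential gradient over one rectangular
    region along the centerline."""
    vals = [tangential_component(h, th) for h, th in grads_and_angles]
    return sum(vals) / len(vals)
```

Sliding such rectangles from the fingertip toward the palm produces a profile like Fig. 8, whose peaks indicate joint candidates.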

3.5 Joints Alignment Process

When superimposing images taken at two points in time, it is necessary to calculate the amount of rotation in addition to the amount of translation between the two images, because not only the positions but also the orientations of the imaged joints


Fig. 9 An example of the joints alignment process: a target image, b after alignment, c template image

vary depending on the shooting conditions. Therefore, in this section, we perform a more detailed alignment based on the results of joint position detection in Sect. 3.3. The outline of the process is as follows. We use the joint image A of a subject as the template image, and the joint image B of the same subject after the temporal change as the target image. We translate the template image in one-pixel steps within a range of 10 pixels with respect to the center coordinate of the target image, and rotate it within a range of ±3 degrees. We then search for the position where the mutual information is maximized. The result of the joints alignment process is shown in Fig. 9.
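The alignment criterion, mutual information, can be estimated from a joint histogram. Below is a minimal sketch for flattened intensity sequences scaled to [0, 1); the bin count of 8 is an arbitrary illustration, not a value from the paper.

```python
import math
from collections import Counter

def mutual_information(a, b, bins=8):
    """Mutual information (in bits) between two equal-length intensity
    sequences, estimated from a joint histogram; higher means the two
    patches agree better."""
    qa = [min(int(v * bins), bins - 1) for v in a]
    qb = [min(int(v * bins), bins - 1) for v in b]
    n = len(qa)
    pj = Counter(zip(qa, qb))           # joint counts
    pa, pb = Counter(qa), Counter(qb)   # marginal counts
    return sum((c / n) * math.log2((c / n) / ((pa[i] / n) * (pb[j] / n)))
               for (i, j), c in pj.items())
```

In the alignment search, this score would be computed for each candidate translation and rotation of the template, keeping the transform that maximizes it.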

4 Experimental Results

In order to show the effectiveness of the proposed method, we conducted experiments on joints detection and joints alignment.

4.1 Experiments on Accuracy of Joints Detection

We prepared 28 X-ray images of rheumatoid arthritis patients and conducted experiments. In the experiments, after extracting the finger centerline, the joints are detected by calculating the feature values as shown in Fig. 10. If the output image contains more than 80% of the joint cavity, it is evaluated as a correct detection; otherwise, as a false detection. We use the previous method [13] as the conventional method. The experimental results are shown in Table 1. The proposed method successfully detects all 784 joints to be detected, and the detection accuracy is improved compared to the conventional method, which only targets MCP joints.

Fig. 10 Experimental results: a feature value, b joint position indication

Table 1 Experimental results for joints detection

| Joints | Proposed method: True | False | Accuracy (%) | Conventional method: Accuracy (%) |
|---|---|---|---|---|
| Distal InterPhalangeal (DIP) joints | 280 | 0 | 100 | – |
| Proximal InterPhalangeal (PIP) joints | 280 | 0 | 100 | – |
| Meta CarpoPhalangeal (MCP) joints | 224 | 0 | 100 | 80.4 |

Table 2 Comparison of positioning accuracy

| Detection method | MSE of 60 MCP joints |
|---|---|
| Proposed method | 2.25 × 10^5 |
| Conventional method | 5.89 × 10^5 |

4.2 Experiment on Accuracy of Joint Position Estimation

Using the MCP joint images cropped by each method, we align the images between two time points. In the experiment, the X-ray image taken at the initial stage is used as the template image, and the sum of squared errors at the position where the mutual information is maximized is calculated by applying translation and rotation around the center coordinate of the target image after the temporal change. The experimental results are shown in Table 2, which confirms that the proposed method reduces the sum of squared errors compared to the conventional method. This result indicates that the displacements between images are within the search range of the proposed method, and there is little variation among images. Therefore, it is confirmed that the proposed method can estimate the joint position more accurately than the conventional method.
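The evaluation measure, the sum of squared errors between the template patch and the aligned target patch, can be sketched as (function name is illustrative):

```python
def sum_squared_error(template, target):
    """Sum of squared intensity differences between a template joint image
    and the aligned target image (the measure reported in Table 2)."""
    return sum((a - b) ** 2
               for row_a, row_b in zip(template, target)
               for a, b in zip(row_a, row_b))
```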


5 Conclusion

In this paper, we proposed a joint position estimation method focusing on the features along the finger centerline. The proposed method successfully detects all 784 joints at the target locations. Furthermore, we confirmed that the proposed method can estimate the joint positions with higher accuracy than the conventional method by using the mutual information as an index. Future work includes the study of the optimal parameter settings for the image alignment process and of the quantification method for the change over time.

References

1. Sakurai, M., Sakuta, Y., Watanabe, M., Goto, T., Hirano, S.: Super-resolution through nonlinear enhancement filters. In: IEEE International Conference on Image Processing (ICIP), pp. 854–858 (2013)
2. Goto, T., Mori, T., Kariya, H., Shimizu, M., Sakurai, M., Funahashi, K.: Super-resolution technology for X-ray images and its application for rheumatoid arthritis medical examinations. Smart Innov. Syst. Technol. 60, 217–226 (2016)
3. Osher, S.J., Fatemi, E.: Nonlinear total variation based noise removal algorithms. Phys. D 60, 259–268 (1992)
4. Meyer, Y.: Oscillating patterns in image processing and nonlinear evolution equation. In: Fifteenth Dean Jacqueline B. Lewis Memorial Lectures, University Lecture Series, vol. 22 (2001)
5. Vese, L.A., Osher, S.J.: Modeling textures with total variation minimization and oscillating patterns in image processing. J. Sci. Comput. 19(5), 553–572 (2003)
6. Chambolle, A.: Total variation minimization and a class of binary MRF models. In: International Workshop on Energy Minimization Methods in Computer Vision and Pattern Recognition, vol. 60, pp. 136–152 (2005)
7. Osher, S.J., Rudin, L.I.: Feature-oriented image enhancement using shock filters. SIAM J. Numer. Anal. 310–340 (1990)
8. Alvarez, L., Mazorra, L.: Signal and image restoration using shock filters and anisotropic diffusion. SIAM J. Numer. Anal. 590–605 (1994)
9. Goto, T., Sano, Y., Mori, T., Shimizu, M., Funahashi, K.: Joint image extraction algorithm and super-resolution algorithm for rheumatoid arthritis medical examinations. Smart Innov. Syst. Technol. 98, 267–276 (2018)
10. Goto, T., Sano, Y., Funahashi, K.: Automatic joint space distance measurement method for rheumatoid arthritis medical examinations. Smart Innov. Syst. Technol. 192, 179–189 (2020)
11. Sharp, J.T., Lidsky, M.D., Collins, L.C., Moreland, J.: Methods of scoring the progression of radiologic changes in rheumatoid arthritis. Correlation of radiologic, clinical and laboratory abnormalities. Arthritis Rheum. 14(6), 706–720 (1971)
12. van der Heijde, D.: How to read radiographs according to the Sharp/van der Heijde method. J. Rheumatol. 26(3), 743–745 (1999)
13. Goto, T., Sano, Y., Funahashi, K.: Improving measurement accuracy for rheumatoid arthritis medical examinations. Smart Innov. Syst. Technol. 71, 157–164 (2017)
14. Hyndman, R.J., Fan, Y.: Sample quantiles in statistical packages. Am. Stat. 50(4), 361–365 (1996)
15. Kato, T., Hirata, T., Saito, T., Kise, K.: An efficient algorithm for the Euclidean distance transformation. Syst. Comput. Jpn. 27(7), 18–24 (1996)

Computer-Aided Diagnosis of Peritonitis on Cine-MRI Using Deep Optical Flow Network Toshiki Kawahara, Akitoshi Inoue, Yutaro Iwamoto, Akira Furukawa, and Yen-Wei Chen

Abstract Cine magnetic resonance imaging (MRI) analysis is used to diagnose peritonitis, a life-threatening disease associated with decreased intestinal peristalsis. However, the variable reproducibility of Cine MRI assessments of bowel peristalsis is a critical issue that needs to be addressed. Computer-aided diagnosis could help address this issue; however, it faces two main challenges: the need to extract temporal and spatial features from Cine MRI, and the need to extract both global and local features. In this paper, we apply the deep optical flow network (DOFN), a deep learning-based optical flow calculation method inspired by the TV-L1 algorithm, to the diagnosis of peritonitis. In the proposed method, the temporal Cine MRI frames serve as input to the DOFN for optical flow computation over the abdominal region. The computed optical flows are used as global temporal and spatial features for diagnosing peritonitis. In addition, we divide the abdominal region into four subregions, then calculate the optical flow and classify peritonitis in each subregion. The final decision is made based on the results from the four subregions. Furthermore, we visualize small bowel motility through the computed optical flow. The area under the curve of the proposed method is about 0.72, even without intravenous administration of contrast media. Keywords Deep learning · Optical flow · Computer-aided diagnosis · Cine MRI · Small bowel

T. Kawahara (B) · Y. Iwamoto · Y.-W. Chen Graduate School of Information Science and Engineering, Ritsumeikan University, Kyoto, Japan e-mail: [email protected] A. Inoue National Hospital Organization Higashi-Ohmi General Medical Center, Higashiomi, Japan A. Furukawa Tokyo Metropolitan University, Hachioji, Japan © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Y.-W. Chen et al. (eds.), Innovation in Medicine and Healthcare, Smart Innovation, Systems and Technologies 242, https://doi.org/10.1007/978-981-16-3013-2_17


1 Introduction The small bowel is an important organ that absorbs and transports the nutrients essential to life-supporting activities. These functions are enabled by the complex peristalsis of the small bowel. If peristalsis is abnormal, intestinal contents cannot be properly transported and digested, potentially leading to diseases such as indigestion and intestinal obstruction. Therefore, establishing a measurement method that captures the characteristics of small intestinal peristalsis could lead to the early detection and prevention of disease and pave the way for optimal treatment.

In clinical practice, small bowel examinations are typically performed using double-balloon or capsule endoscopes. However, these techniques are invasive because they require inserting an endoscope into the patient's body. Furthermore, endoscopic examinations cannot evaluate the peristalsis of the small bowel. A reproducible and less-invasive method that can reveal whole small bowel peristalsis is therefore highly desirable. Based on the medical finding that patients with peritonitis show slower peristalsis in the small bowel region, Wakamiya et al. [1] and Inoue et al. [2] proposed the use of Cine magnetic resonance imaging (MRI) for analyzing small intestinal peristalsis. Although this method lightens the burden on patients by making the examination less invasive, the burden on physicians remains high because they have to analyze the results manually. In addition, the results highly depend on each physician, causing variable reproducibility. To reduce the burden on physicians, Otsuki [3] proposed small bowel region segmentation and peristaltic analysis methods. The segmentation method used deep learning networks, namely U-Net [4] and 3D U-Net [5], and achieved sufficiently accurate results [6]. Otsuki et al. also tried to use the contraction frequency of the segmented small bowel to differentiate peritonitis; however, the accuracy was not high enough because only a local segmented region was used for the analysis. The global small bowel motility should be used instead.

In this paper, we apply the deep optical flow network (DOFN), a deep learning-based optical flow calculation method inspired by the TV-L1 algorithm [7], to diagnose peritonitis. In the proposed method, the temporal Cine MRI frames serve as input to the DOFN for optical flow computation over the abdominal region. The computed optical flows are used as global temporal and spatial features for diagnosing peritonitis. In addition, we divide the abdominal region into four subregions, then calculate the optical flow and classify (diagnose) peritonitis in each subregion. The final decision is made based on the results from the four subregions. Furthermore, we visualize small bowel motility through the computed optical flow.


2 Proposed Method In this section, we first review optical flow and then describe our method: Sects. 2.1–2.4 present the overview of the proposed method, the flow layer (an important part of this study), the classification method, and the region-division method, respectively. Optical flow is widely used because it represents the motion of objects. There are several traditional methods for computing optical flow, e.g., Lucas and Kanade [8] and Horn and Schunck [9]. Recently, several deep learning methods for computing optical flow have been proposed [10–13]. Piergiovanni and Ryoo [14] proposed a DOFN inspired by the TV-L1 optical flow estimation algorithm and applied it to action recognition by combining it with ResNet [15]. Its parameters for iterative flow optimization are learned end to end together with the other convolutional neural network (CNN) parameters. It is faster than state-of-the-art methods with the same or better accuracy, and it outperforms existing motion representation methods, including TVNet [16] and OFF [17], in both speed and accuracy. Because of this excellent performance, we use DOFN as our backbone network to calculate the optical flow for differentiating peritonitis.

2.1 Overview Figure 1 shows the entire network. The input is the temporal abdominal Cine MRI. We use ResNet (blue blocks 1–5 in Fig. 1) to extract features from each frame. CNN feature maps may have hundreds or thousands of channels, and the flow layer computes a flow for each channel, which can take a significant amount of time and memory. To address this issue, we apply a reduced channel layer (green block

Fig. 1 Overview of the proposed method


in Fig. 1) to reduce the number of channels from 256 to 1 before the flow layer. In the normalized feature layer (orange block in Fig. 1), for numerical stability, we normalize this feature map to the range [0, 255], matching standard image values; Piergiovanni and Ryoo [14] found that raw CNN features are quite small on average compared with standard image values.

2.2 Flow Layer The flow layer estimates the flow $u = (u_x, u_y)$ between two consecutive (feature) frames $F_1$ and $F_2$ by iterating the TV-L1 updates, where $\rho(u) = u \cdot \nabla F_2 + F_2 - F_1$ denotes the linearized brightness-constancy residual. Each iteration updates an auxiliary variable $v$, the flow $u$, and a dual variable $p$:

$$
v^i =
\begin{cases}
u^i + \lambda\theta\,\nabla F_2 & \text{if } \rho(u^i) < -\lambda\theta\,|\nabla F_2|^2 \\
u^i - \lambda\theta\,\nabla F_2 & \text{if } \rho(u^i) > \lambda\theta\,|\nabla F_2|^2 \\
u^i - \rho(u^i)\,\dfrac{\nabla F_2}{|\nabla F_2|^2} & \text{otherwise}
\end{cases}
\tag{2}
$$

$$
u^{i+1} = v^i + \theta \cdot \operatorname{divergence}(p^i)
\tag{3}
$$

$$
p^{i+1} = \frac{p^i + \frac{\tau}{\theta}\,\nabla u^{i+1}}{1 + \frac{\tau}{\theta}\,|\nabla u^{i+1}|}.
\tag{4}
$$

Here, θ controls the weight of the TV-L1 regularization term, λ controls the output smoothness, and τ controls the time step. These hyperparameters are manually set.
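For intuition, the iterative scheme of Eqs. (2)–(4) can be sketched in NumPy. This is a minimal single-scale illustration, not the learned flow layer of the paper: the forward-difference gradient, the small epsilon added for numerical stability, and the values of λ, θ, and τ are illustrative assumptions.

```python
import numpy as np

def forward_grad(a):
    """Forward-difference gradient with zeros at the right/bottom border."""
    gx = np.zeros_like(a)
    gy = np.zeros_like(a)
    gx[:, :-1] = a[:, 1:] - a[:, :-1]
    gy[:-1, :] = a[1:, :] - a[:-1, :]
    return gx, gy

def divergence(px, py):
    """Backward-difference divergence, the adjoint of forward_grad."""
    dx = np.zeros_like(px)
    dx[:, 0] = px[:, 0]
    dx[:, 1:] = px[:, 1:] - px[:, :-1]
    dy = np.zeros_like(py)
    dy[0, :] = py[0, :]
    dy[1:, :] = py[1:, :] - py[:-1, :]
    return dx + dy

def tvl1_flow(F1, F2, lam=0.15, theta=0.3, tau=0.25, n_iter=50):
    """Toy single-scale TV-L1 flow between frames F1 and F2 (Eqs. (2)-(4))."""
    F2x, F2y = forward_grad(F2)
    grad_f2 = np.stack([F2x, F2y])
    grad_mag2 = F2x ** 2 + F2y ** 2 + 1e-12        # |grad F2|^2 (eps for stability)
    u = np.zeros((2,) + F1.shape)                  # flow field (u_x, u_y)
    p = np.zeros((2, 2) + F1.shape)                # dual variables per component
    for _ in range(n_iter):
        # Linearized brightness-constancy residual rho(u)
        rho = u[0] * F2x + u[1] * F2y + F2 - F1
        # Eq. (2): per-pixel soft thresholding
        v = np.where(rho < -lam * theta * grad_mag2, u + lam * theta * grad_f2,
            np.where(rho > lam * theta * grad_mag2, u - lam * theta * grad_f2,
                     u - (rho / grad_mag2) * grad_f2))
        for c in range(2):
            # Eq. (3): combine data term v with TV smoothing via dual variable p
            u[c] = v[c] + theta * divergence(p[c, 0], p[c, 1])
            # Eq. (4): dual update with reprojection
            gx, gy = forward_grad(u[c])
            mag = np.sqrt(gx ** 2 + gy ** 2)
            p[c, 0] = (p[c, 0] + (tau / theta) * gx) / (1 + (tau / theta) * mag)
            p[c, 1] = (p[c, 1] + (tau / theta) * gy) / (1 + (tau / theta) * mag)
    return u
```

In the learned flow layer, these same updates are unrolled for a fixed number of iterations and λ, θ, and τ become trainable parameters.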


To compute the divergence, we zero-pad $p$ on the first column (x-direction) or row (y-direction) and then convolve it with weights $w_x$, $w_y$:

$$
\operatorname{divergence}(p) = p_x * w_x + p_y * w_y,
\tag{5}
$$

where, initially, $w_x = [-1 \;\; 1]$ and $w_y = \begin{bmatrix} -1 \\ 1 \end{bmatrix}$. Note that these parameters are also differentiable and can be learned with backpropagation. To derive the final output of the flow layer, we compute $\nabla u$ as follows:

$$
\nabla u_x = \begin{bmatrix} 1 & 0 & -1 \\ 2 & 0 & -2 \\ 1 & 0 & -1 \end{bmatrix} * u_x, \qquad
\nabla u_y = \begin{bmatrix} 1 & 2 & 1 \\ 0 & 0 & 0 \\ -1 & -2 & -1 \end{bmatrix} * u_y.
\tag{6}
$$
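As a concrete reading of Eq. (6), the two Sobel-style kernels can be applied to one flow component with a small zero-padded sliding window. This is a sketch implemented as correlation; whether the layer uses convolution or correlation only flips the sign of these symmetric-up-to-sign kernels.

```python
import numpy as np

SOBEL_X = np.array([[1, 0, -1], [2, 0, -2], [1, 0, -1]], dtype=float)
SOBEL_Y = np.array([[1, 2, 1], [0, 0, 0], [-1, -2, -1]], dtype=float)

def sobel_grad(u):
    """Apply the Eq. (6) kernels to one flow component (zero-padded, 'same' size)."""
    h, w = u.shape
    up = np.pad(u, 1)
    gx = np.empty_like(u)
    gy = np.empty_like(u)
    for i in range(h):
        for j in range(w):
            win = up[i:i + 3, j:j + 3]
            gx[i, j] = np.sum(SOBEL_X * win)  # horizontal derivative of u
            gy[i, j] = np.sum(SOBEL_Y * win)  # vertical derivative of u
    return gx, gy
```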

2.3 Classification In this study, a fully connected (FC) layer is used as the classification layer. The features extracted from each flow are classified by the FC layer, and their average is taken as the final result for one case. The model is trained to minimize the cross-entropy:

$$
L(v, c) = -\sum_{i=1}^{K} \mathbb{1}(c = i) \log(p_i),
\tag{7}
$$

where $p = M(v)$, $v$ is the Cine MRI, $M$ is the classification CNN, and $c$ indicates which of the $K$ classes $v$ belongs to. This means that the parameters in the flow layers are trained together with the other layers, thereby maximizing the final classification accuracy.
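The per-flow loss of Eq. (7) and the averaging of per-flow scores into one case-level result can be illustrated with a toy sketch; the function names are ours, not from the paper.

```python
import math

def cross_entropy(probs, c):
    """Eq. (7): negative log-likelihood of the true class c,
    given softmax probabilities probs."""
    return -math.log(probs[c])

def case_score(flow_probs):
    """Average the per-flow peritonitis probabilities to obtain
    the final score for one case."""
    return sum(flow_probs) / len(flow_probs)
```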

2.4 Region-Division Method The inflammatory process is limited to the area around the causative disease at the first stage of peritonitis, and the inflammation then spreads to the whole abdomen. For example, in acute appendicitis, localized peritonitis causes decreased bowel peristalsis in the right lower abdomen at an early stage. Considering this characteristic, we propose the region-division method. Figure 2a shows a typical frame of Cine MRI. In addition to using the whole abdominal region (yellow bounding box in Fig. 2b) as an input image for global analysis, we divide the abdominal region into four subregions (top-left, top-right, bottom-left, bottom-right), as


Fig. 2 a Typical frame of Cine MRI, b definition of ROI, and c subregion division

Fig. 3 Region-division method

shown in Fig. 2c. Each subregion is used as an input, and classification (diagnosis) of peritonitis is conducted for each subregion, as shown in Fig. 3. The final decision is made based on the results of the four subregions.
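The division and the per-subregion decision fusion described above can be sketched as follows. This is a hedged illustration; the max-count fusion rule here follows the description in Sect. 3.1, and the helper names are ours.

```python
import numpy as np

def split_roi(roi):
    """Divide an abdominal ROI into four equal subregions:
    top-left, top-right, bottom-left, bottom-right (Fig. 2c)."""
    h, w = roi.shape[0] // 2, roi.shape[1] // 2
    return [roi[:h, :w], roi[:h, w:], roi[h:, :w], roi[h:, w:]]

def fuse_subregion_decisions(peritonitis_counts, m_samples, cutoff):
    """Combine per-subregion peritonitis counts into one case decision:
    the maximum count over the subregions, normalized by the number of samples M."""
    prob = max(peritonitis_counts) / m_samples
    return prob, prob >= cutoff
```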

3 Experiments 3.1 Data Set The Cine MRI data set used in this study was provided by the Higashi-Ohmi General Medical Center. It includes 12 peritonitis and 58 nonperitonitis cases diagnosed by other imaging modalities, laboratory tests, and clinical follow-up. MRI examinations were performed using a 1.5-T MRI system (Magnetom Aera, Siemens Healthineers, Forchheim, Germany) and a phased-array coil (1.5-T Tim Coil, Siemens Healthineers). Patients were placed on the scanner in the supine position. Before the Cine MRI scans, coronal images (true FISP: TR, 4.26 ms; TE, 2.13 ms; flip angle,


60°; slice thickness, 5 mm; matrix, 320 × 320; field of view, 400 mm) of the entire abdomen were scanned, and three representative slices were selected by a radiological technologist. Cine MRI scans of 70 images over 35 s (2 fps) were obtained in the selected three coronal planes (Fig. 2a). All data were confirmed and annotated by two abdominal radiologists. The data set was divided into six groups, each including 2 peritonitis cases and 9–10 nonperitonitis cases. We used sixfold cross-validation: one group was chosen as the test data set and the remaining five groups as the training data set, and the average of the six experiments was taken as our final result. Each case had 25–60 frames. For each case, we used 15 frames (temporal images) as a (temporal) sample and generated M = 10–45 samples by shifting the frames. The probability of peritonitis for each case is defined as F/M, where F is the number of samples classified as peritonitis among the M samples. For the region-division method, the optical flow between each pair of frames was passed through the network, and the FC layer determined whether it indicated peritonitis; each subregion thus yields a two-class output. In general, the bowel of patients with peritonitis does not move, but a subregion may contain movement of arteries and other organs at certain times. The number of peritonitis decisions in each subregion was counted, and the maximum value divided by M gives the probability of peritonitis. This probability was then used for receiver operating characteristic curve analysis. The cutoff value was determined using Youden's index [18], and a case was classified as peritonitis if its probability exceeded the cutoff value. The abdominal region (region of interest (ROI)) used for the analysis is shown in Fig. 2b with a yellow bounding box. The orange circle marks the anterior superior iliac spine, which is used as an anatomic landmark to determine the ROI. The horizontal red line passes through the center point of the anterior superior iliac spine, and the vertical red line is the centerline between the left and right sides of the image. The intersection of the two lines is the center of the ROI. The size of the ROI is 128 × 128 pixels. For the region-division method, the four subregions are divided as shown in Fig. 2c; the size of each subregion is 64 × 64 pixels.
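The case-level probability F/M and the Youden-index cutoff can be sketched as follows; a minimal illustration in which `youden_cutoff` scans candidate thresholds by brute force rather than using an ROC library.

```python
import numpy as np

def case_probability(sample_predictions):
    """F/M: the fraction of a case's M temporal samples classified as peritonitis."""
    return float(np.mean(sample_predictions))

def youden_cutoff(case_probs, labels):
    """Pick the cutoff maximizing Youden's J = sensitivity + specificity - 1."""
    case_probs = np.asarray(case_probs)
    labels = np.asarray(labels)
    best_cut, best_j = 0.0, -1.0
    for cut in np.unique(case_probs):
        pred = case_probs >= cut
        tp = np.sum(pred & (labels == 1))
        fn = np.sum(~pred & (labels == 1))
        tn = np.sum(~pred & (labels == 0))
        fp = np.sum(pred & (labels == 0))
        sens = tp / max(tp + fn, 1)
        spec = tn / max(tn + fp, 1)
        j = sens + spec - 1
        if j > best_j:
            best_j, best_cut = j, float(cut)
    return best_cut
```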

3.2 Results Typical optical flows of nonperitonitis and peritonitis cases are shown in Fig. 4a, b, respectively. As shown in Fig. 4, the small bowel in the nonperitonitis case moves actively, while the small bowel in the peritonitis case moves slowly. This flow difference between normal and peritonitis cases helps determine whether peritonitis is present. Table 1 summarizes the differentiation (classification) results. Two experiments were performed. In the first experiment, we used the whole ROI image (Fig. 2b) as the input image (without the region-division method). In the second experiment, we used the


Fig. 4 Optical flows of a nonperitonitis case and b peritonitis case

Table 1 Comparison of the whole ROI image and the region-division method

           Whole ROI image   Region-division method
AUC        0.48              0.72
Accuracy   0.69              0.64
Recall     0.17              0.67

region-division method (Fig. 2c). As shown in Table 1, without the region-division method, the accuracy (accuracy = (TP + TN)/(TP + TN + FP + FN)), recall, and area under the curve (AUC) were 0.69, 0.17, and 0.48, respectively. The region-division method boosted the diagnostic performance: the recall and AUC improved to 0.67 and 0.72, respectively, with an accuracy of 0.64. The region-division method improves diagnostic performance because it accounts for the inflammatory process and combines local and global information to make a decision. Without the region-division method, decreased bowel peristalsis in one region may be canceled out by normal peristalsis in other regions. Table 2 shows the confusion matrix of the classification results of experiment 2 (region-division method). There are 4 false-negative cases among the 12 positive (peritonitis) cases and 21 false-positive cases among the 58 negative (nonperitonitis) cases. Reducing both false positives and false negatives will be the subject of our future work.

Table 2 Confusion matrix of the region-division method (Experiment 2)

Prediction   Peritonitis   Nonperitonitis
Positive     8             21
Negative     4             37


3.3 Comparison with the State-of-the-Art The latest study, conducted by Otsuki et al. [3], used segmentation results of the small bowel region to differentiate peritonitis by frequency analysis and area variation. These results were evaluated using the t-test, with both methods having 26 degrees of freedom and a significance level of α = 0.05. The peristaltic frequency analysis showed a t-value of −1.080 and failed to show a significant difference between the peritonitis and nonperitonitis cases. The area variation analysis between slices by segmentation showed a t-value of −3.026. Our results cannot be directly compared with those of Otsuki et al. because the evaluation methods differ. However, our method is superior in terms of reproducibility: in the study of Otsuki et al., the input ROI size was 16 × 16, and the observation positions varied greatly depending on the doctor, whereas we use a 128 × 128 ROI divided into four subregions, which considers both global and local information and yields the same ROI for any doctor.
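For reference, the pooled two-sample t statistic used in such a comparison (with n_a + n_b − 2 degrees of freedom) can be computed as below; a generic sketch, not tied to Otsuki et al.'s data.

```python
import statistics

def pooled_t_statistic(a, b):
    """Two-sample t statistic with pooled variance and
    len(a) + len(b) - 2 degrees of freedom."""
    na, nb = len(a), len(b)
    # Pooled (weighted) sample variance of the two groups
    sp2 = ((na - 1) * statistics.variance(a) +
           (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    t = (statistics.mean(a) - statistics.mean(b)) / (sp2 * (1 / na + 1 / nb)) ** 0.5
    return t, na + nb - 2
```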

4 Conclusion In this paper, we proposed a deep learning-based method for the computer-aided diagnosis of peritonitis with Cine MRI. In our proposed method, we applied DOFN, which inserts an optical flow calculation method into the classification network, to diagnose peritonitis. We considered both global and local information by dividing the input image into four subregions. We succeeded in improving the accuracy by using optical flow and considering temporal information. The accuracy, recall, and AUC were 0.64, 0.67, and 0.72, respectively. We also mapped the flow onto the MRI to provide visual diagnostic support. In future work, we will further improve the method by addressing the decreased classification accuracy caused by areas without bowel and by the influence of arteries. We will also compare the optical flow generated by the deep learning-based method with that obtained by conventional methods, such as the Lucas-Kanade method. Acknowledgements This work was supported in part by a Grant-in-Aid for Scientific Research from the Japanese Ministry of Education, Culture, Sports, Science and Technology (MEXT) under Grants No. 20KK0234 and No. 20K21821.


References
1. Wakamiya, M., Furukawa, A., Kanasaki, S., Murata, K.: Assessment of small bowel motility function with cine-MRI using balanced steady-state free precession sequence. J. Magn. Reson. Imaging 33, 1235–1240 (2011)
2. Inoue, A., et al.: Acceleration of small bowel motility after oral administration of dai-kenchu-to (TJ-100) assessed by cine magnetic resonance imaging. PLoS One 13(1), e0191044 (2018). https://doi.org/10.1371/journal.pone.0191044
3. Otsuki, K.: Small bowel region extraction and peristaltic analysis of abdominal cine-MRI using deep learning to differentiate peritonitis. Master's thesis, Graduate School of Information Science and Engineering, Ritsumeikan University
4. Ronneberger, O., Fischer, P., Brox, T.: U-Net: convolutional networks for biomedical image segmentation. In: MICCAI 2015, pp. 234–241 (2015)
5. Cicek, O., Abdulkadir, A., Lienkamp, S.S., Brox, T., Ronneberger, O.: 3D U-Net: learning dense volumetric segmentation from sparse annotation. In: MICCAI 2016, pp. 424–432 (2016)
6. Otsuki, K., et al.: Cine MR image segmentation for assessment of small bowel motility function using 3D U-Net. J. Image Graph. 7, 134–139 (2019)
7. Zach, C., Pock, T., Bischof, H.: A duality based approach for realtime TV-L1 optical flow. Pattern Recogn. 1(1), 214–223 (2007)
8. Lucas, B., Kanade, T.: An iterative image registration technique with an application to stereo vision. In: International Joint Conference on Artificial Intelligence, pp. 674–679 (1981)
9. Horn, B., Schunck, B.: Determining optical flow. Artif. Intell. 17(1–3), 185–203 (1981)
10. Revaud, J., et al.: EpicFlow: edge-preserving interpolation of correspondences for optical flow. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1164–1172 (2015)
11. Dosovitskiy, A., et al.: FlowNet: learning optical flow with convolutional networks. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 2758–2766 (2015)
12. Ilg, E., et al.: FlowNet 2.0: evolution of optical flow estimation with deep networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2462–2470 (2017)
13. Guan, S., Li, H., Zheng, W.-S.: Unsupervised learning for optical flow estimation using pyramid convolution LSTM. In: 2019 IEEE International Conference on Multimedia and Expo (ICME), pp. 181–186 (2019)
14. Piergiovanni, A.J., Ryoo, M.S.: Representation flow for action recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2019)
15. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016)
16. Fan, L., Huang, W., Gan, C., Ermon, S., Gong, B., Huang, J.: End-to-end learning of motion representation for video understanding. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018)
17. Sun, S., Kuang, Z., Sheng, L., Ouyang, W., Zhang, W.: Optical flow guided feature: a fast and robust motion representation for video action recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018)
18. Youden, W.J.: Index for rating diagnostic tests. Cancer 3, 32–35 (1950)

Automated Retrieval of Focal Liver Lesions in Multi-phase CT Images Using Tensor Sparse Representation Jian Wang, Junlin Zhao, Xian-Hua Han, Lanfen Lin, Hongjie Hu, Yingying Xu, Qingqing Chen, Yutaro Iwamoto, and Yen-Wei Chen

Abstract Content-based image retrieval (CBIR), which searches for similar images in a large database, has been attracting increasing research interest recently, and it has been applied to medical image characterization for sharing experts' experiences. One challenging task in CBIR is extracting features for effective image representation. To this end, bag-of-visual-words (BoVW) has been proven effective for extracting middle-level features for image analysis. However, conventional BoVW must first vectorize the two- or three-dimensional local structure for analysis, which destroys the spatial relationships of nearby voxels. In this study, we propose a tensor sparse coding method, a multilinear generalization of conventional sparse coding (soft assignment in BoVW), to learn features from multi-dimensional medical images. We regard high-dimensional local structures as tensors and propose a K-CP (CANDECOMP/PARAFAC) algorithm to learn an overcomplete tensor dictionary iteratively. Using the learned overcomplete tensor dictionary, sparse coefficients of tensor local structures are calculated with the tensor orthogonal matching pursuit (Tensor-OMP) algorithm, an extended multilinear version of the conventional vector-based OMP. The proposed method is applied to the retrieval of focal liver lesions (FLLs) using a medical database consisting of contrast-enhanced multi-phase computed tomography (CT) images. Experiments show that the proposed tensor sparse coding method achieves better retrieval performance than conventional methods. Keywords Multi-phase CT · Tensor analysis · Sparse coding · CBMIR · Focal liver lesion

J. Wang (B) · Y. Iwamoto · Y.-W. Chen College of Information Science and Engineering, Ritsumeikan University, Shiga, Japan e-mail: [email protected]
J. Wang · J. Zhao School of Information Science and Engineering, Shandong Normal University, Jinan 250358, China
X.-H. Han Faculty of Science, Yamaguchi University, Yamaguchi, Japan
L. Lin College of Computer Science and Technology, Zhejiang University, Hangzhou, China
H. Hu · Q. Chen Department of Radiology, Sir Run Run Shaw Hospital, College of Medicine, Zhejiang University, Hangzhou, Zhejiang, China
Y. Xu Zhejiang Lab, Research Center for Healthcare Data Science, Hangzhou, China
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Y.-W. Chen et al. (eds.), Innovation in Medicine and Healthcare, Smart Innovation, Systems and Technologies 242, https://doi.org/10.1007/978-981-16-3013-2_18

1 Introduction Liver cancer is one of the leading causes of death worldwide. Early detection of liver cancers through the analysis of medical images is a helpful way to reduce deaths due to liver cancer. Based on clinical observations, different types of liver lesions exhibit different visual characteristics at various time points after intravenous contrast injection. To capture the visual feature transitions of liver tumors over time, multi-phase contrast-enhanced computed tomography (CT) scanning is generally performed on patients who are thought to have liver problems. In the multi-phase contrast-enhanced CT scan procedure, noncontrast-enhanced (NC) phase images are obtained before contrast injection. Images of three additional phases are then scanned at different times after contrast injection: the arterial (ART) phase 25–40 s after injection, the portal venous (PV) phase 60–75 s after injection, and the delayed (DL) phase 3–5 min after injection. Medical images, accompanied by diagnosis reports, are accumulated so that doctors can be more confident in diagnosis decisions and treatment planning by reviewing similar past cases. High-definition medical images and large unorganized medical datasets, however, pose challenges to doctors from the viewpoint of analysis and review. Content-based medical image retrieval (CBMIR) systems help in these situations by searching an accumulated dataset for past cases similar to any under-study (query) medical data. A CBMIR system performs two main tasks: feature learning, to calculate the inherent and core information of a medical image, and similarity assessment, to measure the similarity between image representations. In this study, we focus on learning a discriminative and effective representation of medical images for focal liver lesion (FLL) characterization. Characterization of FLLs has attracted considerable research interest recently. Most current studies focus on low-level and middle-level features. We first give a brief overview of the studies using low-level features. Intuitive statistical features, such as intensity and variance, are simple but important for gray-scale medical image representation. In addition to first-order statistical features, texture is a major source of feature information in medical image analysis [1]. Duda et al. [2] proposed extracting four first-order parameters, Law's filters, run-length matrix features, and co-occurrence parameters for recognizing hepatic primary tumors. Roy et al. [3] used four types of features, that is, density, temporal density, texture, and temporal texture, derived from four-phase medical images, to retrieve the most similar images of five types of liver lesions. The texture features are six coefficients computed


from a three-dimensional (3D) gray-level co-occurrence matrix. Shape features were adopted in [4] in combination with density and texture features for retrieving five types of FLLs; the shape feature was derived from three principal components of 3D surface models of FLLs. Compared with low-level features, the middle-level bag-of-visual-words (BoVW) feature has proved to be considerably more effective for classifying and retrieving natural images, and there are many variants and enhanced versions of BoVW for learning the inherent features of medical images. The performance achieved using a universal dictionary versus category-specific dictionaries was compared in [5] for the retrieval of three types of FLLs. In [6, 7], partition techniques were applied and representations were learned using BoVW based on partitioned regions. Diamant et al. [7] learned BoVW representations of the interior and boundary regions of FLLs for classifying three types of FLLs from single-phase CT images. A variant of BoVW called bag of temporal co-occurrence words (BoTCoW) was proposed by Xu et al. [8]. In BoTCoW, BoVW is applied to temporal co-occurrence images, constructed by connecting the intensities of multi-phase images, to extract temporal features for retrieving five types of FLLs from triple-phase CT images. After a common codebook learning procedure, Diamant et al. [9] proposed a visual word selection method based on mutual information to select more meaningful visual words for a specific classification task. In addition to these variants and enhanced versions of BoVW based on the hard-assignment mechanism, Wang et al. [10] learned sparse representations of local structures of multi-phase CT scans, a soft-assignment BoVW method, for FLL retrieval. As mentioned above, many variants and enhanced versions of the BoVW model have been used in early works on FLL characterization. All of these studies used raw patches as local descriptors, to retain the detailed tiny structures, instead of the SIFT descriptors widely used in natural image analysis. However, all these methods used vectorized 2D patches as local descriptors, which neglect the spatial structure of 3D medical images. Therefore, in this study, we explore a multilinear generalization of soft-assignment BoVW, that is, a tensor sparse coding approach, for robust analysis of high-dimensional medical local structures. Tensor representation of high-dimensional medical structures not only preserves the inherent spatial structure of high-dimensional medical data, but also captures the temporal information that lies in multi-phase medical images. In this study, we propose a novel tensor sparse coding method for learning tensor representations of high-dimensional medical structures. The rest of this paper is organized as follows: the data used in the experiments are briefly described in Sect. 2; the proposed feature learning method is introduced in Sect. 3; in Sect. 4, we discuss the experimental settings, parameter optimization, and results; and in Sect. 5, we provide concluding remarks and an outline of future work.
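To make the contrast with vectorized 2D patches concrete, a multi-phase local structure can be kept as a small tensor with its spatial and phase axes intact. This is a sketch; the patch size, stride, and (H, W, P) stacking convention are illustrative assumptions, not the paper's settings.

```python
import numpy as np

def extract_tensor_patches(stack, size=5, stride=5):
    """Extract size x size x P local structures from a multi-phase
    image stack of shape (H, W, P), preserving spatial and phase axes
    instead of flattening each patch into a vector."""
    h, w, _ = stack.shape
    patches = [stack[i:i + size, j:j + size, :]
               for i in range(0, h - size + 1, stride)
               for j in range(0, w - size + 1, stride)]
    return np.stack(patches)  # shape (N, size, size, P)
```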


2 Materials Single-phase images are used in common codebook models for classification or retrieval. However, multi-phase data have proven critical in the medical field, especially for detecting liver lesions. A multi-phase medical dataset was constructed with the help of radiologists to evaluate the performance of the proposed method. The dataset comprises five types of FLLs from 137 medical cases. For each medical case, the CT images used in clinics were collected, including triple-phase (NC/ART/PV) low-resolution CT images as well as high-resolution CT images in the PV phase. In a few medical cases, the tumor contrast was gradually enhanced and the typical enhancement features were visible in the DL-phase CTs, especially for large tumors of the HEM type; low-resolution DL-phase CT images were available for these cases. However, for data consistency across all medical cases, the DL-phase CT images were not used in our experiments. Here, "low-resolution" refers to a CT volume with a spacing of (0.5–0.8) × (0.5–0.8) × 5 (or 7) mm³, and "high-resolution" refers to a CT volume with equal spacing in all three directions (0.5–0.8 mm in each direction). The size of a CT slice was fixed at 512 × 512 pixels, while the number of CT slices depended on the region scanned (full body or only the abdomen). All tumors in each CT image were manually marked by an experienced medical doctor. In our experiments, however, only the major tumor in a medical case, that is, the tumor with the largest volume, was considered, for two reasons: (1) to simplify the data acquisition process, and (2) the other FLLs in an image are invariably too small to exhibit typical features. As a result, 137 FLLs were selected for use in our experiments: 38 lesions of the cyst class, 28 cases each of hepatocellular carcinoma (HCC) and hemangioma (HEM), 22 cases of focal nodular hyperplasia (FNH), and 21 cases of metastasis (METS). Examples of the five types of FLLs are shown in Fig. 1. Each FLL region was extracted from a representative slice of the major tumor in the triple-phase low-resolution CTs.

3 Methods

First, we introduce the notation used throughout this paper. A vector is denoted by a lowercase boldface letter, e.g., x. A matrix is denoted by an uppercase boldface letter, e.g., X. A tensor is denoted by an uppercase boldface sans-serif letter, e.g., Y. We define tensor multiplication in a way similar to that in [11]. The n-mode product of a tensor D ∈ R^{I_1 × I_2 × ··· × I_N} with a matrix X ∈ R^{M × I_n} is D ×_n X ∈ R^{I_1 × ··· × I_{n−1} × M × I_{n+1} × ··· × I_N}, where each element is defined as follows:

$$(\mathcal{D} \times_n \mathbf{X})_{i_1 \ldots i_{n-1}\, m\, i_{n+1} \ldots i_N} = \sum_{i_n = 1}^{I_n} d_{i_1 i_2 \ldots i_N}\, x_{m i_n}$$

The n-mode product of tensor D with a vector x ∈ R^{I_n} is denoted as D ×̄_n x, which results in an (N − 1)th-order tensor and can be used to calculate the inner product of the mode-n sub-tensors of

Automated Retrieval of Focal Liver Lesions in Multi-phase …

221

Fig. 1 Examples of each lesion type in the three phases. Each row shows images from the same contrast phase, while each column shows images of the same lesion type: cyst, FNH, HCC, HEM, METS

tensor D with vector x. The elements are calculated as follows:

$$(\mathcal{D} \bar{\times}_n \mathbf{x})_{i_1 \ldots i_{n-1} i_{n+1} \ldots i_N} = \sum_{i_n = 1}^{I_n} d_{i_1 i_2 \ldots i_N}\, x_{i_n}$$

The OMP algorithm is a greedy algorithm that finds sparse coefficients of vector-based signals using a given codebook, whose codewords (atoms) are also vectors. OMP is widely used in conventional sparse coding implementations owing to its simplicity and efficiency. We propose a straightforward generalization of vector-based OMP to multilinear algebra, called Tensor-OMP. By using Tensor-OMP, sparse coefficients of tensor input signals over a given tensor codebook can be calculated without vectorization. Given a collection of samples Y = [Y_1, Y_2, …, Y_N], where each Y_i ∈ R^{I_1 × I_2 × ··· × I_M}, i = 1, 2, …, N, is an Mth-order tensor, Y ∈ R^{I_1 × I_2 × ··· × I_M × N} is an (M + 1)th-order tensor. Suppose a codebook D comprises K tensor codewords D_k ∈ R^{I_1 × I_2 × ··· × I_M}; then D is an (M + 1)th-order tensor. As in conventional vector-based OMP, any sample Y_i can be approximated by a linear combination of codewords, within which the sparse coefficients can be calculated with Tensor-OMP by solving the following objective function:

$$\min_{\mathbf{x}_i} \; \| \mathcal{Y}_i - \mathcal{D} \bar{\times}_{(M+1)} \mathbf{x}_i \|_2^2, \; i = 1, 2, \ldots, N \quad \text{s.t. } \| \mathbf{x}_i \|_0 \le T, \; \forall i \qquad (1)$$

where a column vector x_i in X represents a combination of the codewords that approximates a sample Y_i, and T is a sparsity measure. CP decomposes a Pth-order tensor D into a sum of rank-one tensors [11].

222

J. Wang et al.

$$\mathcal{D} \approx \sum_{r=1}^{R} \lambda_r \, (\mathbf{d}_r^1 \circ \mathbf{d}_r^2 \circ \cdots \circ \mathbf{d}_r^P) \qquad (2)$$

where ◦ denotes the outer product, each vector d_r^p is normalized to unit length, and λ_r is the weight of each rank-one tensor. The implementation of the proposed K-CP method comprises two iterated stages: calculation of the sparse coefficients with the codebook fixed, and codeword update based on the calculated sparse coefficients. The first stage is solved easily by the proposed Tensor-OMP. In the codeword update stage, each tensor codeword is updated individually. To update codeword D_k, we first find the row vector x_k^T in X, in which each entry corresponds to the coefficient of a sample in Y with respect to D_k. Then, we define the approximation error without codeword D_k as follows:

$$\mathcal{E}_k = \mathcal{Y} - \sum_{j \neq k} \mathcal{D}_j \circ \mathbf{x}_j^T \qquad (3)$$

The total reconstruction error can then be written as follows:

$$\| \mathcal{Y} - \mathcal{D} \times_{(M+1)} \mathbf{X} \|^2 = \| \mathcal{E}_k - \mathcal{D}_k \circ \mathbf{x}_k^T \|^2 \qquad (4)$$

Our aim is to find the optimal D_k that best approximates the reconstruction error E_k in Eq. (4), which can be solved easily by applying CP decomposition to E_k. However, applying CP to E_k directly would fill the coefficient vector x_k^T, destroying its sparsity. Therefore, we construct a constraint set ω_k = {i | 1 ≤ i ≤ N, x_k^T(i) ≠ 0} that captures the nonzero entries of x_k^T. According to ω_k, we restrict E_k and x_k^T to E_k^R and x_k^R, respectively. By applying CP to E_k^R with a single rank-one tensor component, D_k can be updated from the decomposition result, and the coefficient vector x_k^T can be updated by zero-padding the weight λ, as in Eq. (2). This process of applying CP to the reconstruction residual tensor is executed K times, updating the tensor codewords one at a time. The two stages are iterated until a pre-specified reconstruction error is achieved or the maximum number of iterations is reached.
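The codeword-update stage above can be sketched for the simplest case of second-order (matrix) samples. As a simplification, the sketch takes the leading SVD pair of the unfolded restricted residual as the rank-one update (a K-SVD-style shortcut); a faithful K-CP would instead run a rank-one CP (e.g., via ALS) on the restricted residual tensor. All names and shapes here are illustrative, not the paper's implementation.

```python
import numpy as np

def update_codeword(Y, D, X, k):
    """One K-CP-style codeword update, sketched for matrix samples.

    Y : (I1, I2, N) stack of N matrix samples
    D : (I1, I2, K) stack of K matrix codewords
    X : (K, N) sparse coefficient matrix
    Only samples that actually use codeword k (the nonzero set
    omega_k of the constraint vector) enter the update, so the
    sparsity pattern of x_k^T is preserved.
    """
    omega = np.nonzero(X[k])[0]            # indices with x_k^T(i) != 0
    if omega.size == 0:
        return D[:, :, k], X[k].copy()
    # residual E_k restricted to omega, leaving codeword k out (Eq. 3)
    E = Y[:, :, omega].astype(float).copy()
    for j in range(D.shape[2]):
        if j != k:
            E -= D[:, :, j][:, :, None] * X[j, omega][None, None, :]
    # best rank-one term of the unfolded residual via the leading SVD pair
    E_mat = E.reshape(-1, omega.size)
    U, s, Vt = np.linalg.svd(E_mat, full_matrices=False)
    new_Dk = U[:, 0].reshape(Y.shape[0], Y.shape[1])  # unit-norm codeword
    new_xk = np.zeros(X.shape[1])
    new_xk[omega] = s[0] * Vt[0]           # zero-padded coefficients
    return new_Dk, new_xk
```

Because the rank-one term is the optimal approximation of the restricted residual, one update can never increase the total reconstruction error of Eq. (4).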

3.1 Retrieval of Focal Liver Lesions in Multi-phase CT Images

For each patient in the dataset, two types of data are used in our experiments: triple-phase low-resolution (slice thickness of 5–7 mm) CT images and single-phase high-resolution (slice thickness of 0.5–0.8 mm) CT images. The dataset is described in Sect. 2. Based on the structure of the dataset, spatiotemporal features and 3D spatial


Fig. 2 Learning 3D spatial and spatiotemporal features via the proposed tensor sparse coding method from single-phase high-resolution images and multi-phase low-resolution images, respectively

features are extracted from the two types of CT images by using the BoVW models, in which codebooks are learned by the proposed tensor sparse coding method, as shown in Fig. 2. 3D spatial information was captured from the high-resolution data (PV phase), where small cubes were extracted as local descriptors. The spatiotemporal information refers to the 2D spatial structure and temporal co-occurrence in multi-phase CT images. To capture the spatiotemporal information, volumes with three layers were constructed by stacking the corresponding representative slices in triple-phase low-resolution CT images. This operation transforms temporal information into spatial information in the third dimension of the constructed volumes. In addition, small cubes were extracted as local descriptors. Codebooks were learned using the proposed tensor sparse coding method for spatial and temporal information, respectively. The final representation of each medical case was obtained by combining tensor sparse representations (coefficient vectors) of the spatial and temporal information. The overall workflow of feature extraction is shown in Fig. 2.
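The stacking operation described above can be sketched as follows; the cube size and stride are illustrative choices, not values reported in the paper.

```python
import numpy as np

def build_spatiotemporal_volume(nc_slice, art_slice, pv_slice):
    """Stack the representative NC/ART/PV slices into a 3-layer volume,
    turning phase (time) into a third spatial dimension."""
    return np.stack([nc_slice, art_slice, pv_slice], axis=-1)  # (H, W, 3)

def extract_cubes(volume, size, stride):
    """Slide over the volume and cut out small cubes (local descriptors).
    Each cube spans the full third dimension, so a descriptor from the
    3-layer volume covers all three contrast phases."""
    H, W, _ = volume.shape
    return np.array([volume[i:i + size, j:j + size, :]
                     for i in range(0, H - size + 1, stride)
                     for j in range(0, W - size + 1, stride)])
```

The same `extract_cubes` routine applies to the high-resolution PV-phase volume, where the third dimension is genuinely spatial.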

4 Experiments and Results

4.1 Retrieval Performance Evaluation Method

Considering the size of the constructed dataset, the leave-one-out cross-validation method is used for evaluation. Given a test case, i.e., the ith case, the K-nearest-neighbors (KNN) method is used to find the k cases most similar to it. Let N_ik be the number of cases, among the k retrieved similar cases, that belong to the same type as the ith case:


Fig. 3 Retrieval performance comparison using various features. Red lines use spatial features learned from single-phase high-resolution images; green lines use temporal features learned from multi-phase low-resolution images; blue lines use the combination of spatial and temporal features. Solid lines correspond to features learned by the proposed tensor sparse coding method (TSC) with tensor training samples, whereas dotted lines correspond to features learned by conventional sparse coding with vectorized patches

$$N_{ik} = \sum_{j=1}^{k} l_j, \qquad l_j = \begin{cases} 1, & \text{if } T_j = T_i \\ 0, & \text{if } T_j \neq T_i \end{cases}$$

$$P_{ik} = \frac{N_{ik}}{k}, \qquad Prec@k = \frac{1}{N} \sum_{i=1}^{N} P_{ik} \qquad (5)$$

where T_j represents the type of the jth retrieved case, and T_j = T_i stands for a correct retrieval for the ith case. From these definitions, the retrieval precision of the ith case when k similar cases are selected, P_ik, can be calculated. We use the mean precision value Prec@k, when the k cases most similar to each query are retrieved, to demonstrate the overall retrieval performance.
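The evaluation protocol of Eq. (5) with leave-one-out KNN retrieval can be sketched as follows; Euclidean distance on the case representations stands in here for whatever similarity measure the retrieval system uses.

```python
import numpy as np

def prec_at_k(features, labels, k):
    """Mean leave-one-out retrieval precision Prec@k (Eq. 5).

    features : (N, d) array of case representations
    labels   : (N,) array of lesion types
    """
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    precisions = []
    for i in range(len(labels)):
        d = np.linalg.norm(features - features[i], axis=1)
        d[i] = np.inf                    # leave the query case out
        neighbors = np.argsort(d)[:k]    # k most similar cases
        # P_ik = N_ik / k: fraction of retrieved cases of the query's type
        precisions.append(np.mean(labels[neighbors] == labels[i]))
    return float(np.mean(precisions))
```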

4.2 Experimental Results

The retrieval performance Prec@k (k = 1, 2, …, 15) is shown in Fig. 3, comparing the performance achieved using only 3D spatial information, only spatiotemporal information, or their combination, learned by either conventional sparse coding or the proposed tensor sparse coding method. As shown in Fig. 3, the combined


Table 1 The retrieval performance compared with state-of-the-art methods

| Methods | Prec@1 | Prec@2 | Prec@6 | Prec@10 | Prec@15 | Average of Top 15 |
|---|---|---|---|---|---|---|
| Temporal density and GLCM [3] | 0.6058 | 0.5985 | 0.6017 | 0.6038 | 0.5932 | 0.6017 |
| Combined features [4] | 0.6742 | 0.6515 | 0.6363 | 0.6045 | 0.5742 | 0.6281 |
| Global BoVW [5] | 0.7424 | 0.7083 | 0.6325 | 0.5962 | 0.5500 | 0.6242 |
| Category-specific BoVW [5] | 0.7500 | 0.7386 | 0.6489 | 0.5992 | 0.5530 | 0.6347 |
| BoTCoW [8] | 0.6591 | 0.6932 | 0.6679 | 0.6500 | 0.6268 | 0.6554 |
| Dual BoVW [7] | 0.7424 | 0.6856 | 0.6603 | 0.6416 | 0.6116 | 0.6683 |
| BoVW with MI criterion [9] | 0.6818 | 0.6629 | 0.5997 | 0.5394 | 0.4904 | 0.5958 |
| BoVW based on sparse coding [10] | 0.6818 | 0.7008 | 0.6591 | 0.6295 | 0.6076 | 0.6558 |
| The proposed TSC-based BoVW | 0.7727 | 0.7348 | 0.6717 | 0.6371 | 0.6111 | 0.6855 |

feature always yields the best result for both the conventional sparse coding method and the proposed tensor sparse coding method, and the proposed tensor sparse coding method outperforms the conventional sparse coding method when using the same features (3D spatial, spatiotemporal, or their combination). A comparison of the performance of the proposed method with those of the state-of-the-art methods is given in Table 1. The proposed tensor sparse coding method outperforms the other methods by preserving the spatial structure of medical images and combining the 3D spatial features captured from high-resolution CT images with the spatiotemporal features captured from multi-phase CT images. Considerable research effort has been invested in exploring variants and enhanced versions of the BoVW model for FLL characterization, and Table 1 compares the proposed method with several other BoVW models. Diamant et al. [9] proposed a feature selection method based on mutual information for specific classification tasks; however, our experimental results show that such feature selection does not improve the retrieval performance of the proposed image representation method.


5 Conclusion

In this paper, we proposed the Tensor-OMP and K-CP methods to learn tensor sparse representations of multi-phase medical images. Core information of the 3D spatial structures of CT volumes and the spatiotemporal features of multi-phase CT images was extracted separately by using BoVW models with codebooks learned by the proposed method. Experiments on FLL retrieval showed that the proposed method outperformed the conventional vector-based sparse coding method. By retrieving FLLs from multi-phase CT with the proposed method, the diagnostic accuracy of both intern radiology students and resident doctors can be improved.

Acknowledgements This research was supported in part by the Shandong Provincial Natural Science Foundation under Grant No. ZR2019BF035, in part by a Grant-in-Aid for Scientific Research from the Japanese Ministry of Education, Culture, Sports, Science and Technology (MEXT) under Grant No. 18H03267, and in part by the Zhejiang Lab Program under Grant No. 2020ND8AD01.

References

1. Mir, A.H., Hanmandlu, M., Tandon, S.N.: Texture analysis of CT-images. IEEE Eng. Med. Biol. 5, 781–786 (1995)
2. Duda, D., Kretowski, M., Bezy-Wendling, J.: Texture characterization for hepatic tumor recognition in multiphase CT. Biocybern. Biomed. Eng. 26(4), 15–24 (2006)
3. Roy, S., Chi, Y., Liu, J., Venkatesh, S.K., Brown, M.S.: Three-dimensional spatiotemporal features for fast content-based retrieval of focal liver lesions. IEEE Trans. Biomed. Eng. 61(11), 2768–2778 (2014)
4. Xu, Y., Lin, L., Hu, H., Yu, H., Jin, C., Wang, J., Han, X.-H., Chen, Y.-W.: Combined density, texture and shape features of multi-phase contrast-enhanced CT images for CBIR of focal liver lesions: a preliminary study. In: Innovation in Medicine and Healthcare 2015. Kyoto, Japan (2015)
5. Yang, W., Lu, Z., Yu, M., Huang, M., Feng, Q., Chen, W.: Content-based retrieval of focal liver lesions using bag-of-visual-words representations of single- and multi-phase contrast-enhanced CT images. J. Digit. Imaging 25, 708–719 (2012)
6. Yu, M., Feng, Q., Yang, W., Gao, Y., Chen, W.: Extraction of lesion-partitioned features and retrieval of contrast-enhanced liver images. Comput. Math. Methods Med. (2012)
7. Diamant, I., Hoogi, A., Beaulieu, C.F., Safdari, M., Klang, E., Amitai, M., Greenspan, H., Rubin, D.L.: Improved patch-based automated liver lesion classification by separate analysis of the interior and boundary regions. IEEE J. Biomed. Health Inf. 20(6), 1585–1594 (2016)
8. Xu, Y., Lin, L., Hu, H., Wang, D., Liu, Y., Wang, J., Chen, Y.-W., Han, X.: Bag of temporal co-occurrence words for retrieval of focal liver lesions using 3D multiphase contrast-enhanced CT images. In: 2016 23rd International Conference on Pattern Recognition (ICPR 2016) (2016)
9. Diamant, I., Klang, E., Amitai, M., Konen, E., Goldberger, J., Greenspan, H.: Task-driven dictionary learning based on mutual information for medical image classification. IEEE Trans. Biomed. Eng. 64(6), 1380–1392 (2017)
10. Wang, J., Han, X.H., Xu, Y., Lin, L., Hu, H., Jin, C., Chen, Y.W.: Sparse codebook model of local structures for retrieval of focal liver lesions using multi-phase medical images. Int. J. Biomed. Imaging, 13 pp. (2017). Article ID 1413297
11. Kolda, T.G., Bader, B.W.: Tensor decompositions and applications. SIAM Rev. 51(3), 455–500 (2009)

Colorization for Medical Images Based on Patient-Specific Prior Information and GAN Features Yonglong Zhang, Yizhou Chen, Wenbo Pang, and Huiyan Jiang

Abstract Medical images are mostly gray-scale images. Colorization of medical images can give them a better visualization effect and provide more diagnostic information for doctors. This paper proposes a colorization method for medical images based on prior explicit features and Generative Adversarial Network implicit features. First, patient attribute features are extracted based on clinical prior knowledge, including age, gender, drug intake, etc. Meanwhile, image domain features, such as texture and intensity values, are extracted from the gray image. Then, the feature value of each pixel is calculated to obtain the prior feature map. Second, the prior feature map and the original image are input into a Generative Adversarial Network for end-to-end training. In addition, the method introduces a new loss function with prior explicit features and GAN features to constrain the network's learning direction. The method achieves good performance on both the e-anatomy dataset and our local dataset: the mean square error is reduced by 30.889 and the peak signal-to-noise ratio is increased by 5.116. Experimental results show that our method is superior in colorization for medical images.

Keywords Image colorization · Medical image · Generative adversarial networks · Deep learning · Feature fusion

Y. Zhang · Y. Chen · W. Pang (B) · H. Jiang Northeastern University, Shenyang 110819, China e-mail: [email protected] Y. Zhang e-mail: [email protected] Y. Chen e-mail: [email protected] H. Jiang e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Y.-W. Chen et al. (eds.), Innovation in Medicine and Healthcare, Smart Innovation, Systems and Technologies 242, https://doi.org/10.1007/978-981-16-3013-2_19

229


1 Introduction

With the further development of medical technology, medical images have become more and more widely used in modern medicine. However, in current medical image processing, the original images are mostly gray-scale medical images based on optical theory. Because of the human eye's limited ability to distinguish gray levels, it is difficult to identify the focus area and read the pathological information. Giving color information to medical images can greatly improve their visualization, highlight the focus area and pathological information, and help doctors achieve a more accurate diagnosis. Traditional image colorization algorithms are mainly based on color transfer and user interaction. Colorization based on color transfer applies the colors of a reference image that resembles the input image, so that the target image acquires features of both the source and reference images. Reinhard et al. [1] proposed a color transfer algorithm that automatically colorizes an image from a reference image, but it cannot handle color transfer from grayscale images and cannot transmit the remaining image statistics. Welsh et al. [2] improved on Reinhard's result by matching brightness and texture information between the grayscale target image and the color source image, transferring color while retaining the brightness information and thereby solving color transfer between grayscale and color images. Levin et al. [3] proposed an optimization-based framework that solves a quadratic cost function over each pixel and its neighboring pixels for grayscale image colorization. Huang et al. [4] improved the method of Levin et al. to prevent color from overflowing object boundaries. Yatziv and Sapiro [5] proposed a fast chromaticity-blending colorization technique based on weighted geodesic distance.
Luan et al. [6] used texture similarity to spread colors more effectively. However, these methods rely heavily on user input and require trial and error to obtain acceptable results. In recent years, image colorization algorithms based on deep learning have shown impressive performance. Cheng et al. [7] proposed a fully automatic colorization method that uses a small neural network to extract various features and colorize different color blocks of the image, combined with bilateral filtering to improve the result. The image colorization method proposed by Zhang et al. [8] designs an appropriate loss function to handle the multi-modal uncertainty of the colorization problem and maintain color diversity. The deep colorization method proposed by Carlucci et al. [9] uses a residual deep neural network to learn how to map depth data to three-channel images; qualitative analysis of the resulting images clearly shows that learning the optimal mapping preserves the richness of depth information better than existing manual methods. Although the above work has achieved good results, the low interpretability of deep learning networks and the lower signal-to-noise ratio of medical images compared with natural images mean that higher-precision colorization results are required. In


this paper, we introduce prior features of medical images and an additional loss-function term, and propose a colorization method for medical images based on prior explicit features and the implicit features of Generative Adversarial Networks [10]. Our method consists of three parts: an explicit prior feature extraction model, a GAN implicit feature extraction and colorization model, and a loss function optimization model. The main contributions of this paper are as follows:
• For each pixel of the original image, a series of prior features, including patient attribute features and image domain features, are extracted and fused into a feature map to improve the interpretability of the colorization model.
• An additional objective function is introduced to optimize the implicit feature extraction of the GAN and the loss calculation of the colorization model, constraining the network's learning direction to improve colorization performance.
• Compared with current mainstream image colorization methods, our model achieves a better colorization effect.

2 Related Work

Generative Adversarial Networks (GAN) [10] are a kind of deep neural network. A deep neural network consists of many layers and predicts an output from a given input. The relationship between the input and output of one layer is as follows:

$$\mathbf{y} = \sigma(\mathbf{b} + \mathbf{W}\mathbf{x}) \qquad (1)$$

where x ∈ R^n and y ∈ R^m are the input and output of the layer, W is an m × n weight matrix, b ∈ R^m is the bias, and σ is a nonlinear transfer function, so that the layer maps R^n → R^m. Both the weights W and the bias b are obtained through back-propagation learning, in which the chain rule is used to propagate the loss value and update the parameters. The loss value represents the error between the network's prediction and the training data. The Generative Adversarial Network is based on ideas from game theory. The network usually consists of two modules: a generative model (Generator) and a discriminative model (Discriminator). The Generator generates new data, not present in the training data, based on the training data, while the Discriminator estimates the probability that a sample comes from the training data. The two learn by playing against each other to produce better outputs. In practical applications of GANs, both the Generator and the Discriminator are usually deep neural networks.
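The single-layer relationship of Eq. (1) can be written out directly; the sigmoid is used here as one common choice of transfer function σ.

```python
import numpy as np

def sigmoid(z):
    """Map real values to (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def dense_layer(x, W, b):
    """One fully connected layer, y = sigma(b + W x) as in Eq. (1):
    x in R^n, W an m-by-n weight matrix, b in R^m, so the layer
    maps R^n to R^m."""
    return sigmoid(b + W @ x)
```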


3 Method

This paper proposes a colorization method for medical images based on prior explicit features and GAN implicit features (PGF). The method includes three modules: prior explicit feature extraction, GAN implicit feature extraction and colorization model establishment, and loss function optimization. Figure 1 shows the flow chart of the medical image colorization model.

Fig. 1 Medical image colorization model based on prior explicit features and GAN implicit features


3.1 Preprocessing

The medical image prior feature extraction process includes a preprocessing module and a prior feature extraction module. The image preprocessing module grays the color images in the dataset to obtain the corresponding gray images. Then, the color and gray images are enhanced and augmented simultaneously. Image enhancement denoises the images and enhances their contrast without losing useful information, in order to highlight the human tissue structure. Data augmentation expands the original image dataset into a deep learning dataset by rotation, mirroring, transformation, and other methods.
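A minimal sketch of the graying and mirroring steps; the luminance weights and the flip probability are assumptions for illustration, as the paper does not specify the conversion formula.

```python
import numpy as np

def to_gray(rgb):
    """Gray a color image with standard luminance weights
    (an assumed conversion; the paper does not specify one)."""
    return rgb @ np.array([0.299, 0.587, 0.114])

def augment_pair(gray, color, rng):
    """Mirror the gray input and its color label identically,
    each axis flipped with 50% probability, to expand the dataset."""
    if rng.random() < 0.5:
        gray, color = gray[:, ::-1], color[:, ::-1]
    if rng.random() < 0.5:
        gray, color = gray[::-1, :], color[::-1, :]
    return gray, color
```

Applying the same random flips to both images keeps the gray input aligned with its color label, which is essential for supervised colorization training.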

3.2 Extraction of Explicit Prior Features of Images

The explicit prior feature extraction module includes two types of features: image domain features Pt and patient attribute features Pp. Image domain features include image texture features, such as the gray level co-occurrence matrix, local binary pattern, and histogram of oriented gradients, as well as intensity-value characteristics such as the pixel mean and median. Table 1 shows the important image domain features. The patient characteristics include age, gender, height and weight, body fat, drug intake, and metabolic rate. Table 2 shows the important patient attribute domain features used in this paper. The explicit prior feature extraction module extracts n image domain features and m patient attribute features. Taking the gray level co-occurrence matrix method [11] as an example, suppose it is the ith prior feature extraction method. Using the center pixel, the gray level co-occurrence matrix is obtained, and then the ASM energy, contrast, and the other k features of the pixel are calculated in turn, yielding a k × 1 feature vector for the pixel, denoted Pi. After all the prior features are extracted, m + n feature vectors which

Table 1 Image domain features

| Features | Description |
|---|---|
| Gray level co-occurrence matrix | Image texture features that express the spatial relations of pixels in the image |
| Local binary pattern | Texture features that are nonparametric descriptors of the local gray-level range, insensitive to gray-level changes |
| Histogram of oriented gradients | Texture feature computed by counting the gradient-direction histogram of the local area around the target pixel |
| Pixel mean | Pixel intensity feature: the average value in the 8- or 24-pixel neighborhood of the target pixel |
| Pixel mode | Pixel intensity feature: the most frequent value in the 8- or 24-pixel neighborhood of the target pixel |


Table 2 Patient attribute domain features

| Features | Description |
|---|---|
| Patient age | Provided by the case or image header file, with values from 1 to 100, normalized to (0, 1] |
| Patient gender | Provided by the case or image header file; male is 0, female is 1 |
| Patient height and weight | Provided by the case or image header file, standardized to (0, 1] |
| Drug intake | The injection dose given to the patient during imaging; 0 if no drug was injected |
| Spatial structure information | The location of the tissue structure shown in the medical image: head area, chest and abdomen area, lower limb area, and so on |
| Metabolic rate | Considers the drug absorption and attenuation rates; 0 if no drug was injected |

are of size k × 1 and are finally obtained; each feature vector is normalized to [0, 1]. Then the feature vectors are fused into a k × 1 feature vector P as follows:

$$\mathbf{P} = \sum_{i=1}^{m+n} W_i \mathbf{P}_i + \mathbf{b} \qquad (2)$$

where P_i represents the ith feature vector (i ∈ [1, m + n]), W_i represents the weight corresponding to that feature, and b represents the offset. Next, we take the transpose P^T of P and multiply P^T by P to obtain the final feature value P_0 of the pixel:

$$P_0 = \mathbf{P}^T \cdot \mathbf{P} \qquad (3)$$

P^T is a 1 × k matrix and P is a k × 1 matrix, so P_0 is 1 × 1, i.e., the scalar feature value of the pixel. Assuming the input original image is of size h × h, the pixel-level feature of each pixel is calculated as above, so that the image-level feature map Fm corresponding to the original gray image can be obtained; Fm is also h × h. The calculated feature map Fm and the image dataset are input into the deep learning network together to establish the image colorization model. In addition, the prior feature map is flattened into a feature vector and fused with the input features of the fully connected layer in the Discriminator, and the loss value is calculated with the softmax loss function [12]; the details are given in Sect. 3.4. The prior features of medical images can provide richer discriminative information for the deep learning network, such as local smoothness, non-local self-similarity, non-Gaussianity, statistical characteristics, and sparsity. For medical images, prior information can help our model better capture the lesion


region and pathological information, and segment images of different organs, in order to optimize the colorization result of the whole model.
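Equations (2) and (3) for one pixel can be sketched as follows; the weights and offset are illustrative placeholders, since the paper does not report their values.

```python
import numpy as np

def fuse_pixel_features(feature_vectors, weights, b):
    """Fuse the m+n normalized k-by-1 feature vectors of one pixel
    into a single scalar feature value (Eqs. 2-3):
    P = sum_i W_i * P_i + b, then P0 = P^T . P."""
    P = b + sum(w * p for w, p in zip(weights, feature_vectors))  # (k,)
    return float(P @ P)   # P^T . P, a 1x1 scalar
```

Repeating this over every pixel of an h × h gray image yields the h × h image-level prior feature map Fm described above.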

3.3 GAN Implicit Feature Extraction and Colorization Model

In this paper, we use Generative Adversarial Networks [10] to extract the implicit features of medical images and to colorize them. A Generative Adversarial Network usually consists of two modules: a generative model (Generator) and a discriminative model (Discriminator). The objective function of the GAN is V(D, G): the Generator tries to reduce V(D, G) so that the data it generates cannot be distinguished by the Discriminator, while the Discriminator tries to increase V(D, G) to efficiently distinguish real data from generated data. The original gray image, the prior feature map, and the color image are input into the deep learning network. Moreover, ResNet-18 [13] is used as the discriminant network, and its feature map is converted into a 1 × 512 feature vector in the fully connected layer, which is also used in the subsequent loss function optimization.

3.4 Loss Function Optimization

In this paper, softmax is introduced as a new objective function to constrain the direction of network learning: a softmax loss is computed over the implicit features extracted by the Discriminator in the GAN together with the prior explicit features, and it is combined with the sigmoid function in the original network. The sigmoid function is a common nonlinear function that maps real values to values between 0 and 1. In the Discriminator, features are extracted from the color image generated by the Generator and transformed into a one-dimensional feature vector P0 in the fully connected layer. This feature vector is input into the sigmoid function, and loss_BCE (Binary Cross Entropy Loss [14]) is calculated. In addition, the softmax function is applied to the fully connected layer features and the prior features in the Discriminator. The softmax function normalizes a vector: it converts multi-class output values into a probability distribution with values in [0, 1] that sum to 1. loss_CE (Cross Entropy Loss [14]) is based on the softmax function, and the feature vector is also input into loss_CE. Finally, loss_BCE and loss_CE are combined as the final loss of the whole network:

$$Loss = u \cdot loss_{BCE} + \theta \cdot loss_{CE} + b, \quad \text{s.t. } u + \theta = 1 \qquad (4)$$


The network optimization method proposed in this paper improves the interpretability of the network training process and makes the learning direction more consistent with the expected goal.
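A NumPy sketch of the combined objective in Eq. (4); the weights u, θ and offset b are illustrative defaults, as the paper does not report the values used.

```python
import numpy as np

def loss_bce(p, y):
    """Binary cross-entropy on sigmoid outputs p against targets y."""
    p = np.clip(p, 1e-7, 1.0 - 1e-7)
    return float(-np.mean(y * np.log(p) + (1.0 - y) * np.log(1.0 - p)))

def loss_ce(logits, target_idx):
    """Softmax cross-entropy of a logit vector against a class index."""
    z = logits - logits.max()                 # stabilized softmax
    log_probs = z - np.log(np.exp(z).sum())
    return float(-log_probs[target_idx])

def combined_loss(p, y, logits, target_idx, u=0.5, b=0.0):
    """Loss = u*loss_BCE + theta*loss_CE + b with u + theta = 1 (Eq. 4)."""
    theta = 1.0 - u
    return u * loss_bce(p, y) + theta * loss_ce(logits, target_idx) + b
```

The constraint u + θ = 1 makes the final loss a convex combination of the adversarial (BCE) term and the prior-feature (softmax CE) term, plus the offset b.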

4 Experiment

4.1 Data

This paper uses the image data [15] of the Visible Human Project of the U.S. National Library of Medicine as the training dataset, comprising 1095 medical images, including 229 head images, 66 neck images, 203 chest images, and others. In addition, we used 103 local data cases, including 29 head, 13 neck, and 38 chest cases, among others.

4.2 Network Training

The prior feature extraction module extracts 56 kinds of image domain features, such as the gray level co-occurrence matrix, and 36 kinds of patient-attribute prior features, such as gender, age, and height and weight. Each kind of feature is converted into a 1 × 512 feature vector. The input image size is 512 × 512. To improve the robustness and generalization ability of the model and avoid overfitting, each image is randomly flipped horizontally and vertically with 50% probability. In addition, batch normalization [16] is used to accelerate learning and ease the training of deep networks, and the Adadelta optimizer [17] is used for training.

4.3 Result

In this paper, 20 cases of medical images were selected for the colorization test and compared with existing mature colorization methods. Figure 2 shows the colorization results of the proposed method. Colorization experiments were carried out on medical images of the head, neck, chest, lower limbs, and other parts of the human body. For each part, the first row is the original gray medical image, the second row is the corresponding color image, which serves as the network's label, and the third row is the colorization result of our method. The experimental results show that our model can effectively colorize medical images of different body parts. The colorized medical images retain the structure, texture, and other important information of the original images, and perform well on multiple organs.

Colorization for Medical Images Based on Patient-Specific …


Fig. 2 Medical images and colorized results for (a) head, (b) neck, (c) chest, and (d) other body parts. For each part, the columns show the gray image, the ground truth, and the proposed method's result

We also conducted several comparative experiments: a comparison before and after fusing the prior features, and a comparison before and after optimizing the loss function. Figure 3 shows the results of the colorization network fused with prior features compared with the original colorization network. It can be seen from the figure that the original network's colorization result is dark in some structures, and the texture


Y. Zhang et al.

Fig. 3 Experimental comparison before and after fusion of prior features. a The results of colorization without fusion of prior features. b The enlarged result of the blue area of a. c The results of colorization by fusing prior features. d The enlarged result of the blue area of c

Fig. 4 Experimental comparison of loss function before and after optimization. a The results of colorization before loss function optimization, b The results of colorization after loss function optimization

is quite different from the surrounding area. The colorization network with prior features significantly alleviates this problem and restores image details. Figure 4 shows the experimental results before and after the loss function optimization. Experiments show that the colorization result without loss function optimization exhibits more noise in certain areas; the optimized network effectively suppresses this problem, leaving almost no noise in the originally noisy regions. We also selected the mature colorization method of Iizuka [18] for comparison. The results show that Iizuka's method retains the structure of the original image; however, because that network was designed to colorize natural images, the overall color of the medical images tends toward purple. The results of our method are more realistic, with a more pronounced coloring effect. Iizuka's method performs better in the head area, while our method performs better in the neck and chest. Figure 5 shows the comparison between our results and those of Iizuka. In addition, the mean squared error (MSE), peak signal-to-noise ratio (PSNR), and structural similarity (SSIM) were used as quantitative criteria for the comparative experiments. Table 3 shows the performance indices of our colorization results and Iizuka's. The MSE of our colorization results is 13.742, which is 30.889 lower than that of Iizuka's method, and the PSNR is 36.750, which is 5.116 higher. Although the structural similarity is slightly lower than Iizuka's by 0.007, both are close to 1. Compared with Iizuka's method, the MSE of our colorization results is significantly reduced, and the PSNR is significantly


Fig. 5 Comparison between Iizuka and the proposed method (columns: gray image, ground truth, Iizuka [18], proposed)

Table 3 Quantitative results of the comparison between Iizuka and the proposed method

             MSE     PSNR    SSIM
Iizuka [18]  44.631  31.634  0.987
Proposed     13.742  36.750  0.980

increased. Iizuka's method retains more structural information, while the color information of our method is more prominent.
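The MSE and PSNR criteria used above can be reproduced with a few lines of stdlib Python (SSIM is omitted here, since it requires windowed local statistics). Treating images as flat pixel lists and assuming an 8-bit peak value of 255 are simplifications for illustration.

```python
import math

def mse(a, b):
    """Mean squared error between two images given as flat pixel lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def psnr(a, b, max_val=255.0):
    """Peak signal-to-noise ratio in dB: 10*log10(MAX^2 / MSE)."""
    err = mse(a, b)
    return float("inf") if err == 0 else 10.0 * math.log10(max_val ** 2 / err)

ref = [0, 0, 0, 0]
out = [10, 10, 10, 10]
# mse(ref, out) == 100.0; psnr(ref, out) is about 28.13 dB
```

Lower MSE and higher PSNR both indicate a result closer to the ground truth, which is why the two metrics in Table 3 move together.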

5 Conclusion In this paper, a colorization method for medical images based on Generative Adversarial Networks is proposed. The method fuses explicit prior features and optimizes the network loss function. We also introduce an image-size processing module, which allows input images of any resolution and lets us train the model end-to-end to achieve accurate and realistic color. Finally, we evaluated the colorization model on many test images. Experiments show that the proposed method achieves a very significant colorization effect.

References

1. Reinhard, E., Adhikhmin, M., Gooch, B., et al.: Color transfer between images. IEEE Comput. Graph. Appl. 21(5), 34–41 (2001)
2. Welsh, T., Ashikhmin, M., Mueller, K.: Transferring color to greyscale images. In: Proceedings of the 29th Annual Conference on Computer Graphics and Interactive Techniques, pp. 277–280 (2002)
3. Levin, A., Lischinski, D., Weiss, Y.: Colorization using optimization. In: ACM SIGGRAPH 2004 Papers, pp. 689–694 (2004)
4. Huang, Y.C., Tung, Y.S., Chen, J.C., et al.: An adaptive edge detection based colorization algorithm and its applications. In: Proceedings of the 13th Annual ACM International Conference on Multimedia, pp. 351–354 (2005)
5. Yatziv, L., Sapiro, G.: Fast image and video colorization using chrominance blending. IEEE Trans. Image Process. 15(5), 1120–1129 (2006)
6. Luan, Q., Wen, F., Cohen-Or, D., et al.: Natural image colorization. In: Proceedings of the 18th Eurographics Conference on Rendering Techniques, pp. 309–320 (2007)
7. Cheng, Z., Yang, Q., Sheng, B.: Deep colorization. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 415–423 (2015)
8. Zhang, R., Isola, P., Efros, A.A.: Colorful image colorization. In: European Conference on Computer Vision, pp. 649–666. Springer, Cham (2016)
9. Carlucci, F.M., Russo, P., Caputo, B.: (DE)^2CO: Deep depth colorization. IEEE Robot. Autom. Lett. 3(3), 2386–2393 (2018)
10. Goodfellow, I.J., Pouget-Abadie, J., Mirza, M., et al.: Generative adversarial networks (2014). arXiv:1406.2661
11. Dewi, D.E.O., Bakar, N.A., Hamid, H.A.: Myocardial tissue characterization of cardiac magnetic resonance images using gray-level co-occurrence matrix: case studies on normal and dilated cardiomyopathy. Int. J. Cardiol. 273, 18–19 (2018)
12. Mikolov, T., Kombrink, S., Burget, L., et al.: Extensions of recurrent neural network language model. In: 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5528–5531. IEEE (2011)
13. He, K., Zhang, X., Ren, S., et al.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016)
14. Ho, Y., Wookey, S.: The real-world-weight cross-entropy loss function: modeling the costs of mislabeling. IEEE Access 8, 4806–4813 (2019)
15. https://www.imaios.com/cn/e-anatomy/node_49402/node_127139
16. Ioffe, S., Szegedy, C.: Batch normalization: accelerating deep network training by reducing internal covariate shift. In: International Conference on Machine Learning, pp. 448–456. PMLR (2015)
17. Zeiler, M.D.: Adadelta: an adaptive learning rate method (2012). arXiv:1212.5701
18. Iizuka, S., Simo-Serra, E., Ishikawa, H.: Let there be color! Joint end-to-end learning of global and local image priors for automatic image colorization with simultaneous classification. ACM Trans. Graph. (ToG) 35(4), 1–11 (2016)

Case Discrimination: Self-supervised Feature Learning for the Classification of Focal Liver Lesions Haohua Dong, Yutaro Iwamoto, Xianhua Han, Lanfen Lin, Hongjie Hu, Xiujun Cai, and Yen-Wei Chen

Abstract Deep Learning provides exciting solutions to problems in medical image analysis and is regarded as a key method for future applications. However, only a few annotated medical image datasets exist compared with the abundance of natural images. One solution to this problem is transfer learning from ImageNet. However, because the domain of ImageNet differs from that of medical images, the results of transfer learning are not always good. Therefore, we propose a model that investigates transfer learning by self-supervised learning using medical images. Computerized Tomography (CT) scans produce 3D volume images, and a CT or Magnetic Resonance Imaging scan contains many slices. This suggests treating the slices of one case as a single class, and we formulate this intuition as self-supervised feature learning at the case level. The results of our experiments demonstrate that, under self-supervised feature learning settings, our method surpasses the transfer learning method that uses ImageNet for classification. In experiments with unannotated datasets, our method consistently improves test performance with only a few annotated data. By fine-tuning the learned features, we obtained competitive results for self-supervised learning and classification tasks.

Keywords Self-supervised learning · Medical image processing · Image classification

H. Dong · Y. Iwamoto · Y.-W. Chen (B) College of Information Science and Engineering, Ritsumeikan University, Shiga, Japan e-mail: [email protected] X. Han Artificial Intelligence Research Center, Yamaguchi University, Yamaguchi, Japan L. Lin (B) · Y.-W. Chen College of Computer Science and Technology, Zhejiang University, Hangzhou, China e-mail: [email protected] H. Hu (B) · X. Cai Sir Run Run Shaw Hospital, Hangzhou, China e-mail: [email protected] Y.-W. Chen Zhejiang Lab, Research Center for Healthcare Data Science, Hangzhou, China

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Y.-W. Chen et al. (eds.), Innovation in Medicine and Healthcare, Smart Innovation, Systems and Technologies 242, https://doi.org/10.1007/978-981-16-3013-2_20


1 Introduction Liver cancer has the fifth highest cancer death rate among men and the ninth highest among women, and it ranks first among new cancer cases [1]. Radiological procedures, such as computed tomography (CT) and magnetic resonance imaging (MRI), are the main methods used to detect liver tumors. Medical image analysis plays an important role in the early, accurate detection and classification of focal liver lesions [2–5]. In the last few years, Deep Learning has been the state-of-the-art method for medical image analysis [6–10]. Most successful models have been trained using supervised learning, which requires a large number of fully annotated datasets. However, annotated medical image datasets are few, because such annotation often requires doctors with many years of clinical experience. Hence, the available data are insufficient for the model to learn from, greatly limiting its ability to generalize. One solution to this problem is pre-training [9]. A model pre-trained on a large and diverse dataset such as ImageNet captures universal features like curves and edges in its early layers, which is relevant and useful to most classification problems [11]. However, because the domain of ImageNet differs from that of medical images, the results of transfer learning are not always good. Therefore, to pre-train a model, we propose a new method that can capture the universal features of medical images from the data itself or from additional unlabeled data. Apparent similarity is learned not from semantic annotations but from the visual data themselves [12]. We can learn the difference between instances to obtain a representation of the similarity among them. However, this method still poses a challenge: it involves a huge number of 'classes'. For ImageNet, it would be 1.2 million 'classes', which is not feasible for the neural network.
Therefore, the key to using this method is how to train the neural network with a huge number of 'classes'. Instance Discrimination [12] and MoCo [13] used contrastive learning to solve this problem. They proposed a structure called the Memory Bank, which stores the trained features in system memory to save GPU memory. Through contrastive learning, the features are distributed as sparsely as possible in the feature space to achieve the purpose of classification. In this paper, we propose a self-supervised method, called Case Discrimination, based on the properties of 3D volumetric medical images. CT or MRI scans create 2D images of thin slices of the body, and several such slices compose a case. Usually, only one lesion is present in a case. Thus, the slices from the same case can be assigned the same pseudo-label, and self-supervised learning can learn the similarities between them. Figure 1 compares Case Discrimination with Instance Discrimination. The main contributions of our work are summarized as follows: (1) We train a feature extraction model using self-supervised learning without annotation information and keep it frozen; we then use a non-linear Multilayer Perceptron (MLP) classifier, which achieves the same level of accuracy as supervised learning.


Fig. 1 Comparison of a the proposed Case Discrimination and b Instance Discrimination. In this liver CT case, a single tumor spreads across multiple slices. These slices share features such as curves and edges; thus, they can be combined under the same pseudo-label to learn the similarities within a case

(2) We use the feature extraction model as a pre-trained model for supervised classification and improve accuracy. Our proposed method can be applied to a small number of annotated datasets.

2 Method Figure 2 shows a flowchart of transfer learning and the proposed Case Discrimination. Compared with traditional transfer learning, Case Discrimination uses case-level pseudo-labels to learn feature information from the data itself by self-supervised learning and replaces the Fully Connected (FC) layer with an MLP layer. The feature extraction model trained with Case Discrimination achieves a robust feature representation, which can be transferred to classification by fine-tuning. Self-supervised learning is a new paradigm of unsupervised learning. The pipeline has two steps: (1) pre-training a feature extraction model on a pretext task with an unannotated dataset; (2) fine-tuning or freezing the pre-trained model for the target task with an annotated dataset. In related work, pretext tasks include predicting the relative position between two random patches from one image [14], predicting the rotation [15], and solving a jigsaw puzzle [16]. Our goal is to learn a feature extraction model FV = f_θ(x) without labels, where f_θ(x) is a deep neural network with parameters θ that maps an input image x to the feature vector FV. The pretext task is case-level classification: the slices (i.e., images) from the same case form a class of their own, and a classifier is trained to differentiate classes by case.
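The case-level pseudo-labeling can be sketched as follows. The data layout (a list of cases, each holding its slices) and the function name are assumptions for illustration; in practice the slices would be image arrays rather than strings.

```python
def case_pseudo_labels(cases):
    """Give every slice the index of its parent case as a pseudo-label,
    so that each case becomes one 'class' for the pretext task."""
    images, labels = [], []
    for case_idx, slices in enumerate(cases):
        for s in slices:
            images.append(s)
            labels.append(case_idx)
    return images, labels

# Three toy cases with 2, 1 and 3 slices respectively.
cases = [["c0_s0", "c0_s1"], ["c1_s0"], ["c2_s0", "c2_s1", "c2_s2"]]
imgs, labs = case_pseudo_labels(cases)
# labs == [0, 0, 1, 2, 2, 2] -- three pseudo-classes, one per case
```

No human annotation is needed: the grouping of slices into cases, which the scanner provides for free, is the only supervision signal.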


Fig. 2 Comparison with a transfer learning from ImageNet, and b the proposed Case Discrimination

2.1 Pre-processing A 3D medical volume consists of several thin slices per case. Our goal is to let the neural network learn high-level semantic features for case classification; to prevent it from merely learning the texture continuity between adjacent slices, the slices we choose are not adjacent. Prior to input, each image undergoes random data augmentation, including random-angle rotation and random horizontal flipping, which has proven very effective for self-supervised learning [13, 17].
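One simple way to satisfy the non-adjacency constraint is strided sampling. The stride-2 policy below is an illustrative assumption; the paper only states that the selected slices are not adjacent.

```python
def select_non_adjacent(slices, stride=2):
    """Keep every `stride`-th slice so that no two selected slices are
    neighbors, discouraging reliance on inter-slice texture continuity."""
    return slices[::stride]

picked = select_non_adjacent([0, 1, 2, 3, 4, 5])
# picked == [0, 2, 4]
```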

2.2 Pre-training Stage As Fig. 2b shows, the backbone of the feature extractor can be any commonly used Convolutional Neural Network, such as ResNet [18]. Texture is a key characteristic in medical image analysis and often requires only a simple feature extractor; thus, in this paper, ResNet18 was used as the backbone. The feature from the backbone is given as input to the classifier. The case-level classification is formulated by softmax: assume there are n images x_1, x_2, ..., x_n in m classes, with feature vectors FV_1, FV_2, ..., FV_n given by FV_i = f_θ(x_i). The softmax probability of identifying the jth class for image x_i with feature vector FV_i is

P(j | FV_i) = exp(w_j^T σ(FV_i)) / Σ_{k=1}^{m} exp(w_k^T σ(FV_i))    (1)

where w_k is the weight vector of class k, w_k^T FV_i measures how well image i fits the kth class, and σ is a ReLU nonlinearity function.
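Eq. (1) can be checked with a small numeric sketch. The feature vector and class weights below are made-up toy values, and the helper names are hypothetical.

```python
import math

def relu(vec):
    return [max(0.0, v) for v in vec]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def softmax_case_prob(fv, class_weights):
    """Eq. (1): P(j | FV_i) = exp(w_j^T sigma(FV_i)) / sum_k exp(w_k^T sigma(FV_i))."""
    z = relu(fv)
    scores = [math.exp(dot(w, z)) for w in class_weights]
    total = sum(scores)
    return [s / total for s in scores]

# Two toy classes; after ReLU the feature is [0.5, 0.0, 2.0], so the class
# whose weight picks out the third component should get the higher probability.
probs = softmax_case_prob([0.5, -1.0, 2.0],
                          [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
```

The probabilities always sum to 1 and the class with the larger dot product wins, which is all the pretext task needs to separate cases.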

2.3 Fine-Tuning Stage In the fine-tuning stage, a non-linear MLP with one hidden layer (h-dimensional) is used instead of the FC layer in the classifier. The probability of identifying the jth class then becomes

P(j | FV_i) = Σ_{l=1}^{h} exp(w_{l→j}^T σ(h_{i→l}^T σ(FV_i))) / Σ_{k=1}^{m} Σ_{l=1}^{h} exp(w_{l→k}^T σ(h_{i→l}^T σ(FV_i)))    (2)

where w_{l→k} is the weight from hidden unit l to class k, h_{i→l}^T is the weight vector from feature vector i to hidden unit l, and σ is a ReLU nonlinearity function.
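Read literally, Eq. (2) exponentiates each hidden unit's contribution to a class and sums these before normalizing over classes. The toy sketch below follows that literal reading; the weight values and function names are invented for illustration.

```python
import math

def relu(vec):
    return [max(0.0, v) for v in vec]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def mlp_case_prob(fv, hidden_w, out_w):
    """Eq. (2): hidden activations sigma(h_{i->l}^T sigma(FV_i)), then
    per-class scores sum_l exp(w_{l->j} * activation_l), normalized."""
    z = relu(fv)
    hidden = relu([dot(hw, z) for hw in hidden_w])   # one value per unit l
    n_classes = len(out_w[0])
    scores = [sum(math.exp(out_w[l][j] * hidden[l])
                  for l in range(len(hidden)))
              for j in range(n_classes)]
    total = sum(scores)
    return [s / total for s in scores]

# Two hidden units, two classes, toy weights.
probs = mlp_case_prob([1.0, -0.5],
                      hidden_w=[[1.0, 0.0], [0.5, 0.5]],
                      out_w=[[0.2, 0.8], [0.6, 0.4]])
```

As with Eq. (1), the output is a valid probability distribution over the m classes; the extra hidden layer is what lets the non-linear MLP outperform the plain FC classifier in the ablation.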

3 Experiments and Results 3.1 Dataset and Implementation A total of 89 cases of CT liver volumes comprising 489 slice images were used. They contained four types of lesions (i.e., Cyst, FNH, HCC, and HEM). The region of interest of each case was first outlined by experienced radiologists. The dataset distribution is shown in Table 1. Each CT case has three phases (i.e., NC, ART, and PV). The CT images in our dataset were acquired from 2015 to 2017. The slice collimation was 5–7 mm, the matrix was 512 × 512 pixels, and the in-plane resolution was 0.57–0.59 mm. After pre-processing, the 3-channel (NC, ART, PV) input image was resized to 128 × 128 × 3. In our experiments, the dataset was split into three groups to perform 3-fold cross-validation. To increase the amount of data in the dataset, several slices centered on representative slices of the CT images were selected.
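Because slices from one case share a pseudo-label (and usually a lesion), the cross-validation folds must be split at the case level rather than the slice level, as Table 1 does. A minimal sketch of such a split, using a hypothetical round-robin assignment:

```python
def case_folds(case_ids, n_folds=3):
    """Assign whole cases to folds (round-robin over unique case IDs) so
    that every slice of a case lands in the same cross-validation fold."""
    folds = [[] for _ in range(n_folds)]
    for i, cid in enumerate(sorted(set(case_ids))):
        folds[i % n_folds].append(cid)
    return folds

# Six slices from four cases; the four cases are spread over three folds.
folds = case_folds(["a", "a", "b", "c", "c", "d"])
# folds == [["a", "d"], ["b"], ["c"]]
```

Splitting at the slice level instead would leak near-duplicate slices of the same tumor between train and test folds and inflate accuracy.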


Table 1 Dataset distribution

                       Cyst      FNH      HCC       HEM       Total
Group 1: case (slice)  10 (59)   6 (33)   6 (29)    6 (30)    28 (151)
Group 2: case (slice)  10 (41)   7 (23)   7 (68)    7 (28)    31 (160)
Group 3: case (slice)  10 (49)   6 (20)   7 (65)    7 (44)    30 (178)
Total: case (slice)    30 (149)  19 (76)  20 (162)  20 (102)  89 (489)

Table 2 Ablation study of the proposed method (classification accuracy (%), mean ± standard deviation); the ablated components are Case Discrimination, fine-tuning, and the FC versus MLP classifier, all on a ResNet18 backbone

Model                          Acc.
ResNet18                       78.9 ± 2.56
Model 1                        82.34 ± 1.80
Model 2                        82.77 ± 1.23
Model 3                        84.4 ± 2.27
Model 4 (the proposed method)  85.44 ± 2.09

3.2 Experimental Setting The dimension of the backbone network's output feature vector is 128. The non-linear MLP has one hidden layer (512-dimensional). In the pre-training stage, the model was trained for 500 epochs using Stochastic Gradient Descent with momentum; the batch size was 32 and the learning rate 0.1. In the fine-tuning stage, the model was trained for 100 epochs with the same optimizer, batch size, and learning rate.

3.3 Ablation Study and Comparison of Different Transfer Learning Methods We conducted an ablation study to evaluate non-linear MLP and the proposed method. The results are shown in Table 2. By comparison, it can be seen that non-linear MLP improved accuracy by about 1% and Case Discrimination improved accuracy by about 3.44%.


Table 3 Comparison of different transfer learning methods (accuracy, %)

Method                                  Cyst          FNH            HCC           HEM            Avg.
Train from scratch                      97.26 ± 1.00  61.37 ± 15.03  84.73 ± 1.44  59.54 ± 13.81  78.90 ± 2.56
ImageNet                                92.00 ± 2.78  89.51 ± 4.50   72.87 ± 2.75  75.69 ± 10.91  82.03 ± 2.49
Instance Discrimination                 92.85 ± 4.01  77.00 ± 10.11  83.48 ± 6.12  76.56 ± 6.20   83.61 ± 1.26
Case Discrimination (proposed method)   96.68 ± 2.35  90.74 ± 0.54   81.56 ± 1.84  68.82 ± 11.08  85.44 ± 2.09

Fig. 3 Changes in accuracy brought about by reducing the number of supervised learning training sets. N is the number of training samples shown in Table 1

Also, we compared the different transfer learning methods: ImageNet, Instance Discrimination, and Case Discrimination. For the fine-tuned ImageNet and Instance Discrimination baselines, we use the right-hand model in Fig. 2a, meaning that the parameters of the feature extraction model are updated during training. We further added supervised learning trained from scratch. Table 3 summarizes the quantitative comparison. We confirmed that the accuracy of the proposed method exceeds that obtained by fine-tuning with ImageNet, and that it improves on training from scratch by 6.54%.

3.4 Small-Scale Annotated Training Data To simulate self-supervised learning scenarios with limited annotation, the number of supervised training samples was reduced to N/10, N/3, and N/2 of the original, where N is the number of training samples shown in Table 1. The experimental setting is otherwise the same as before.
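The reduced annotation budgets can be simulated by sampling a fraction of the training items. The sketch below is illustrative only; the seeding and rounding policy are assumptions, and the paper does not specify how the subsets were drawn.

```python
import random

def subsample(train_items, fraction, seed=0):
    """Keep roughly `fraction` of the training samples (at least one),
    simulating the reduced N/10, N/3 and N/2 annotation budgets."""
    rng = random.Random(seed)
    k = max(1, round(len(train_items) * fraction))
    return rng.sample(train_items, k)

subset = subsample(list(range(30)), 1 / 3)
# len(subset) == 10
```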


As shown in Fig. 3, we confirmed that the accuracy of the proposed method was better than fine-tuning with ImageNet, both in absolute accuracy and in how gracefully accuracy degrades. Thus, the proposed method achieves a better feature representation even when the number of annotated samples is reduced.

4 Conclusions In this paper, we proposed a method that uses self-supervised learning to classify focal liver lesions. Experiments on our dataset demonstrated that the proposed method improves the classification accuracy of focal liver lesions. In future work, we will apply the proposed method to other tasks, such as segmentation, to determine whether their accuracy can also be improved. Acknowledgements We would like to thank Sir Run Run Shaw Hospital for providing medical data and helpful advice on this research. This work is supported in part by the Grant-in-Aid for Scientific Research from the Japanese Ministry of Education, Culture, Sports, Science and Technology (MEXT) under Grants No. 20KK0234, No. 20K21821, and No. 18H03267, and in part by the Zhejiang Lab Program under Grant No. 2020ND8AD01.

References

1. Henley, S.J., et al.: Annual report to the nation on the status of cancer, part I: national cancer statistics. Cancer 126(10), 2225–2249 (2020)
2. Roy, S., et al.: Three-dimensional spatiotemporal features for fast content-based retrieval of focal liver lesions. IEEE Trans. Biomed. Eng. 61(11), 2768–2778 (2014)
3. Diamant, I., et al.: Improved patch-based automated liver lesion classification by separate analysis of the interior and boundary regions. IEEE J. Biomed. Health Inform. 20(6), 1585–1594 (2015)
4. Xu, Y., et al.: Texture-specific bag of visual words model and spatial cone matching-based method for the retrieval of focal liver lesions using multiphase contrast-enhanced CT images. Int. J. Comput. Assist. Radiol. Surg. 13(1) (2018)
5. Wang, J., et al.: Tensor-based sparse representations of multi-phase medical images for classification of focal liver lesions. Pattern Recognit. Lett. 130, 207–215 (2020)
6. Maayan, F.-A., et al.: Modeling the intra-class variability for liver lesion detection using a multi-class patch-based CNN. In: Patch-Based Techniques in Medical Imaging, pp. 129–137. Springer International Publishing, Cham (2017)
7. Yasaka, K., et al.: Deep learning with convolutional neural network for differentiation of liver masses at dynamic contrast-enhanced CT: a preliminary study. Radiology 286(3), 887–896 (2018)
8. Liang, D., et al.: Combining convolutional and recurrent neural networks for classification of focal liver lesions in multi-phase CT images. In: International Conference on Medical Image Computing and Computer-Assisted Intervention, pp. 666–675. Springer (2018)
9. Wang, W., et al.: Classification of focal liver lesions using deep learning with fine-tuning. In: Proceedings of the 2018 International Conference on Digital Medicine and Image Processing, pp. 56–60 (2018)
10. Wang, W., et al.: Deep fusion models of multi-phase CT and selected clinical data for preoperative prediction of early recurrence in hepatocellular carcinoma. IEEE Access 8, 139212–139220 (2020)
11. Russakovsky, O., et al.: ImageNet large scale visual recognition challenge. Int. J. Comput. Vis. (IJCV) 115(3), 211–252 (2015)
12. Wu, Z., et al.: Unsupervised feature learning via non-parametric instance discrimination. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3733–3742 (2018)
13. He, K., et al.: Momentum contrast for unsupervised visual representation learning. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2020)
14. Doersch, C., et al.: Unsupervised visual representation learning by context prediction. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 1422–1430 (2015)
15. Gidaris, S., et al.: Unsupervised representation learning by predicting image rotations (2018). arXiv:1803.07728
16. Noroozi, M., et al.: Unsupervised learning of visual representations by solving jigsaw puzzles. In: European Conference on Computer Vision, pp. 69–84. Springer (2016)
17. Chen, T., et al.: A simple framework for contrastive learning of visual representations (2020). arXiv:2002.05709
18. He, K., et al.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016)

Content-Based Retrieval of Focal Liver Lesions Using Geometrical and Textural Features of Multi-Phase CT-Scan Images Saeed Moslehi, Amir Hossein Foruzan, Yen-Wei Chen, and Hongjie Hu

Abstract The most critical step in content-based medical image retrieval systems is feature extraction. The objective of the feature-extraction phase is to discriminate between different types of lesions; preparing useful features improves diagnosis by an image retrieval system and supports successful treatment. We propose a new set of geometrical features based on physicians' descriptions of different types of tumors and integrate them with textural elements to enhance the discrimination of five kinds of focal liver lesions. The abnormal region is divided into three sections: central, middle, and border regions. The textural features are obtained in each part individually, while the geometrical characteristics are calculated for the whole zone of a lesion. Evaluation of the results shows an improvement in prec@6 and mAP, which is attributed to the proposed geometric characteristics. Moreover, our method increases prec@6 by 3.9% compared with recent studies. Keywords Content-based medical image retrieval · Focal liver lesions (FLLs) · Gray-level co-occurrence matrix · Multi-phase CT images

S. Moslehi · A. H. Foruzan (B) Department of Biomedical Engineering, Shahed University, Tehran, Iran e-mail: [email protected] S. Moslehi e-mail: [email protected] Y.-W. Chen Intelligent Image Processing Lab, College of Information, Science and Engineering, Ritsumeikan University, Shiga, Japan e-mail: [email protected] H. Hu Department of Radiology, Sir Run Run Shaw Hospital, Zhejiang University, Hangzhou, Zhejiang, China e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Y.-W. Chen et al. (eds.), Innovation in Medicine and Healthcare, Smart Innovation, Systems and Technologies 242, https://doi.org/10.1007/978-981-16-3013-2_21


1 Introduction Content-Based Medical Image Retrieval (CBMIR) systems can assist physicians in diagnosing the type of malignancy from a query image. These systems use non-invasive procedures and remove the need for tumor biopsy, which is an invasive task. A Content-Based Image Retrieval algorithm consists of four modules: (1) preprocessing; (2) feature extraction; (3) similarity measurement; and (4) lesion classification. The similarity index and feature extraction are considered the two crucial steps [1]. Concerning similarity measures, numerous studies have introduced a variety of distance indices, including Euclidean, Mahalanobis, Chamfer, Chessboard, City block, and Manhattan distances, RLDA, LDP, histogram intersection, and manifold learning [2–6]. Regarding feature extraction, there are two trends for improving retrieval results: (1) developing more complex low-level features, and (2) employing physician-based high-level attributes. There is a gap between low-level features prepared by a machine and high-level attributes obtained by a physician using semantic concepts. Visual characteristics cannot represent semantic relevance for an image; therefore, some recent studies develop features that are explainable by a specialist [7]. Among low-level attributes, textural features play a central role in CBMIR systems. The recent trend is to develop more sophisticated features by integrating a large number of visual characteristics. Roy et al. and Chi et al. extracted local textural information from the Gray-Level Co-occurrence Matrix (GLCM) [8, 9]. Yang et al. used a Bag Of Visual Words (BOVW) based on raw intensity [10]. Xu et al. integrated the GLCM and BOVW methods into a model called Bag Of Textural Words (BOTW) [11]. Wang et al. used Multi-linear Sparse Coding to model nodule textures [12]. Xu et al. used texture-specific BOVW and Spatial Cone Matching-based BOVW to extract local texture [13]. Alahmar et al.
used the GLCM matrix and the Gabor transform to represent texture [14]. Napel et al. modeled image texture using gray-level histograms and the low-frequency coefficients of a three-level Haar wavelet transform [15]. Yu et al. modeled local and global textural information using BOVW, the GLCM matrix, the Gabor transform, and high-frequency coefficients from a four-level Daubechies wavelet decomposition [2]. Qian et al. modeled image texture using the GLCM matrix and the wavelet-Gabor transform [16]. Gu et al. extracted a set of features from multi-resolution data [4]. Spainer et al. used standard histogram-based features to retrieve similar liver lesions automatically [17]. Tuyet et al. applied the shearlet transform to prepare retrieval features and used their method to retrieve identical images from a mixture of different medical images [18]. Along with the textural and temporal features [8] obtained in different acquisition phases, other features such as density, temporal density, intensity, size, number of tumors, lesion shape, marginal attributes, boundary sharpness, and area have been proposed in previous works [8, 14, 15].


With the advent of Convolutional Neural Networks (CNNs), several studies have used CNNs in the feature extraction stage and in lesion classification algorithms [7, 19, 20]. Although CNNs are treated as black boxes, eXplainable Artificial Intelligence (XAI) aims to fill the semantic gap between opaque results and human interpretations. In [20], a combination of CNNs and hashing functions was proposed, which also reduces the dimension of the feature vector. Owais et al. improved retrieval results by using ResNet for feature extraction [21]. Concerning high-level features, Trojachanec et al. used temporal information that reflects neurodegeneration over time for a better representation of Alzheimer's Disease [22]. In this paper, we propose several geometric characteristics based on physicians' descriptions of lesions and combine them with standard textural features for accurate retrieval of focal liver lesions. The proposed method is given in Sect. 2, the results and discussion in Sect. 3, and conclusions in Sect. 4.

2 The Proposed Method The flowchart of the proposed framework is shown in Fig. 1. In the first step, multi-phase abdominal Computed Tomography (CT) images and the tumor masks are read, and the three volumes of the candidate FLLs are detected. The input images correspond to the non-contrast, arterial, and portal-vein phases. Next, the volumetric regions of interest (VOI) of the FLLs in the three phases are aligned using B-spline registration. Each FLL in the database is represented quantitatively using three-dimensional temporal, spatial, and geometric attributes extracted from different areas within its VOI. A database is constructed from the feature vectors and the physicians' annotations. Finally, we use the L2-norm index to measure the distance between a query lesion and the lesions in the database, and list the retrieved outcomes by similarity score to assist a physician in deciding on the type of a new tumor. Fig. 1 The flowchart of the proposed CBMIR system


S. Moslehi et al.

2.1 Input Data

The contrast-enhanced CT images of the abdomen, known as multi-phase images, form the input of the CBMIR system. The image of the first phase is obtained before the injection of a contrast medium and is known as the Non-Contrast (NC) phase. The second phase is acquired 25–40 s after injection of the contrast medium and is called the Arterial (ART) phase. The final image is obtained 60–75 s later and is termed the Portal-Venous (PV) phase. Using multi-phase data allows the extraction of spatial-temporal features and enables a more accurate diagnosis than single-phase images.

2.2 Preprocessing

In the preprocessing step, we follow the approach in [23] to detect FLLs in a 3D image. To ensure correspondence between the voxels of the three phases, we use a B-spline registration scheme. Then, the volume of a lesion is divided into three partitions representing the innermost, intermediate, and outermost regions, denoted by Pt1, Pt2, and Pt3, respectively. The enhancement of the contrast medium differs across the three regions: Pt1 captures the attributes of the tumor core, Pt2 models the tumor-tissue characteristics in the middle zone, and Pt3 illustrates the pattern of the boundary. The partitions are obtained automatically by a Euclidean distance transform. After the distance map is normalized, Pt1, Pt2, and Pt3 are bounded by normalized distances of 0.4, 0.7, and 1 from the center of the tumor, respectively. In the future, we plan to evaluate the sensitivity of retrieval to the size and accuracy of the partitions.
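As a concrete illustration, the distance-based partitioning can be sketched in a few lines. The exact normalization of the distance map (0 at the center, 1 at the boundary) and all function names are our assumptions, not the paper's implementation.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def partition_lesion(mask):
    """Split a binary lesion mask into three nested partitions (Pt1..Pt3).

    The Euclidean distance transform gives each voxel's distance to the
    lesion boundary; normalizing and inverting it yields a value of 0 at
    the center and 1 at the boundary.  The thresholds 0.4 and 0.7 follow
    the paper; the normalization itself is our assumption.
    """
    edt = distance_transform_edt(mask)
    d = 1.0 - edt / edt.max()          # 0 at the center, 1 at the boundary
    pt1 = mask & (d <= 0.4)
    pt2 = mask & (d > 0.4) & (d <= 0.7)
    pt3 = mask & (d > 0.7)
    return pt1, pt2, pt3

# toy example: a solid sphere inside a 3D volume
z, y, x = np.ogrid[:40, :40, :40]
mask = (z - 20) ** 2 + (y - 20) ** 2 + (x - 20) ** 2 <= 15 ** 2
pt1, pt2, pt3 = partition_lesion(mask)
```

The three partitions are disjoint by construction and cover the whole mask.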

2.3 Conventional Feature Vectors

We consider various characteristics of a tumor, including its relative size and location, the profile of its boundary, and its textural and temporal attributes. The variation of intensity values across acquisition phases, known as the temporal detail, is a crucial feature. The gray-level co-occurrence matrix encodes local features of hepatic tumors and is the most widely used attribute for the retrieval of liver CT images. We define four sets of 3D feature vectors to model a tumor, as described below.

Density Features: The average intensity of a partition is divided by the mean liver intensity to measure the enhancement of the region compared to the liver parenchyma. The resulting vector, F1, is described by (1) and (2).

F1 = [D^{NC}, D^{ART}, D^{PV}].   (1)

Content-Based Retrieval of Focal Liver Lesions ...


D^{NC} = [d^{NC}_{Pt1}/d^{NC}_{Liver}, d^{NC}_{Pt2}/d^{NC}_{Liver}, d^{NC}_{Pt3}/d^{NC}_{Liver}].   (2)

D^{ART} and D^{PV} are defined similarly. F1 is a 9 × 1 vector.

Temporal Density Features: F2 represents the relative enhancement of a tumor in the ART and PV phases compared to the NC phase, and it is defined in (3)–(6).

F2 = [TD^{ART/NC}, TD^{PV/NC}].   (3)

TD^{ART/NC} = [td^{ART/NC}_{Pt1}, td^{ART/NC}_{Pt2}, td^{ART/NC}_{Pt3}].   (4)

td^{ART/NC}_{Pti} = (d^{ART}_{Pti} - d^{NC}_{Pti}) / d^{NC}_{Pti},   i = 1, 2, 3.   (5)

td^{PV/NC}_{Pti} = (d^{PV}_{Pti} - d^{NC}_{Pti}) / d^{NC}_{Pti},   i = 1, 2, 3.   (6)
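A minimal sketch of the density (F1) and temporal-density (F2) computations above, assuming the mean voxel intensity per partition as d; the data layout and all names are hypothetical.

```python
import numpy as np

def density_features(phases, partitions, liver_masks):
    """Density (F1) and temporal-density (F2) features of Sect. 2.3.

    `phases` maps phase name -> 3D intensity volume, `partitions` holds the
    (Pt1, Pt2, Pt3) boolean masks, and `liver_masks` maps phase name ->
    liver mask.  Function and argument names are ours, not the paper's.
    """
    # mean partition intensity per phase (the d values of Eqs. 2, 5, 6)
    d = {p: np.array([phases[p][pt].mean() for pt in partitions])
         for p in ("NC", "ART", "PV")}
    # Eqs. (1)-(2): normalize by the mean liver intensity of the same phase
    f1 = np.concatenate([d[p] / phases[p][liver_masks[p]].mean()
                         for p in ("NC", "ART", "PV")])   # 9-vector
    # Eqs. (3)-(6): relative enhancement with respect to the NC phase
    f2 = np.concatenate([(d[p] - d["NC"]) / d["NC"] for p in ("ART", "PV")])
    return f1, f2

# toy example: constant phase volumes make the expected values obvious
shape = (8, 8, 8)
phases = {"NC": np.full(shape, 1.0), "ART": np.full(shape, 2.0),
          "PV": np.full(shape, 3.0)}
liver = {p: np.ones(shape, bool) for p in phases}
pt1 = np.zeros(shape, bool); pt1[0:2] = True
pt2 = np.zeros(shape, bool); pt2[2:4] = True
pt3 = np.zeros(shape, bool); pt3[4:6] = True
f1, f2 = density_features(phases, (pt1, pt2, pt3), liver)
```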

Textural Features: We use a 3D GLCM to quantify the gray-tone distribution in the tumor subvolumes [8]. There are eight textural coefficients: energy, entropy, inverse difference moment (IDM), inertia, cluster shade, correlation, dissimilarity, and homogeneity. They define the texture feature vector F3 through (7)–(19).

Energy = t1 = Σ_{i=0}^{N-1} Σ_{j=0}^{N-1} g(i, j)^2.   (7)

Entropy = t2 = Σ_{i=0}^{N-1} Σ_{j=0}^{N-1} g(i, j) log2[g(i, j)].   (8)

IDM = t3 = Σ_{i=0}^{N-1} Σ_{j=0}^{N-1} g(i, j) / (1 + (i - j)^2).   (9)

Inertia = t4 = Σ_{i=0}^{N-1} Σ_{j=0}^{N-1} (i - j)^2 g(i, j).   (10)

Cluster Shade = t5 = Σ_{i=0}^{N-1} Σ_{j=0}^{N-1} (i + j - μ_i - μ_j)^3 g(i, j).   (11)

μ_i = Σ_{i=0}^{N-1} i Σ_{j=0}^{N-1} g(i, j).   (12)

Correlation = t6 = Σ_{i=0}^{N-1} Σ_{j=0}^{N-1} (i - μ_i)(j - μ_j) g(i, j) / (σ_i σ_j).   (13)

σ_i = Σ_{i=0}^{N-1} (i - μ_i)^2 Σ_{j=0}^{N-1} g(i, j).   (14)

Dissimilarity = t7 = Σ_{i=0}^{N-1} Σ_{j=0}^{N-1} |i - j| g(i, j).   (15)

Homogeneity = t8 = Σ_{i=0}^{N-1} Σ_{j=0}^{N-1} g(i, j) / (1 + |i - j|).   (16)

F3 = [T^{ART}, T^{PV}].   (17)

T^{ART} = [T^{ART}_{Pt1}, T^{ART}_{Pt2}, T^{ART}_{Pt3}].   (18)

T^{ART}_{Pti} = [t1^{ART}_{Pti}, t2^{ART}_{Pti}, ..., t8^{ART}_{Pti}],   i = 1, 2, 3.   (19)
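The coefficients of (7)–(16) can be sketched as follows. This is a single-offset simplification of the 3D GLCM (the paper aggregates several offsets), and we use the conventional minus sign for entropy and a square root for σ, which the printed formulas omit.

```python
import numpy as np

def glcm(volume, levels=8, offset=(0, 0, 1)):
    """Normalized gray-level co-occurrence matrix for one 3D offset."""
    # quantize intensities in [0, 1] into `levels` gray levels
    q = np.minimum((volume * levels).astype(int), levels - 1)
    dz, dy, dx = offset
    a = q[:q.shape[0] - dz, :q.shape[1] - dy, :q.shape[2] - dx].ravel()
    b = q[dz:, dy:, dx:].ravel()
    g = np.zeros((levels, levels))
    np.add.at(g, (a, b), 1.0)          # count co-occurring level pairs
    return g / g.sum()                 # g(i, j) as a probability

def texture_coefficients(g):
    """Eight GLCM coefficients corresponding to Eqs. (7)-(16)."""
    n = g.shape[0]
    i, j = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
    mu_i, mu_j = (i * g).sum(), (j * g).sum()
    sigma_i = np.sqrt(((i - mu_i) ** 2 * g).sum())
    sigma_j = np.sqrt(((j - mu_j) ** 2 * g).sum())
    nz = g > 0                         # avoid log2(0)
    return np.array([
        (g ** 2).sum(),                                             # energy
        -(g[nz] * np.log2(g[nz])).sum(),                            # entropy
        (g / (1 + (i - j) ** 2)).sum(),                             # IDM
        ((i - j) ** 2 * g).sum(),                                   # inertia
        ((i + j - mu_i - mu_j) ** 3 * g).sum(),                     # cluster shade
        ((i - mu_i) * (j - mu_j) * g).sum() / (sigma_i * sigma_j),  # correlation
        (np.abs(i - j) * g).sum(),                                  # dissimilarity
        (g / (1 + np.abs(i - j))).sum(),                            # homogeneity
    ])

vol = np.arange(64).reshape(4, 4, 4) / 64.0   # toy ramp volume
g = glcm(vol)
t = texture_coefficients(g)
```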

In (7)–(16), N is the number of rows and columns of the co-occurrence matrix, and g(i, j) is the probability of co-occurrence of gray levels i and j. The texture coefficients of the six partitions (three in each of the ART and PV phases) constitute a 48-dimensional feature vector.

Temporal Texture Features: F4 is the normalized difference in texture between the two enhancement phases, defined in (20)–(24).

F4 = [TT^{ART}, TT^{PV}].   (20)

TT^{ART} = [TT^{ART}_{Pt1}, TT^{ART}_{Pt2}, TT^{ART}_{Pt3}].   (21)

TT^{ART}_{Pti} = [tt1^{ART}_{Pti}, tt2^{ART}_{Pti}, ..., tt8^{ART}_{Pti}],   i = 1, 2, 3.   (22)

TT^{PV}_{Pti} = [tt1^{PV}_{Pti}, tt2^{PV}_{Pti}, ..., tt8^{PV}_{Pti}],   i = 1, 2, 3.   (23)

The derivatives of the coefficients are described in (24).

ttk^{ART}_{Pti} = (tk^{ART}_{Pti} - median_{p∈(ART,PV)}(tk^{p}_{Pti})) / (max_{p∈(ART,PV)}(tk^{p}_{Pti}) - min_{p∈(ART,PV)}(tk^{p}_{Pti})),   k = 1, 2, ..., 8.   (24)
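Equation (24) for one partition can be sketched as below. Note that with only the two phases ART and PV, the median is the midpoint and max − min is the absolute difference, so the normalized values reduce to 0 or ±0.5; we keep the formula exactly as printed. The zero-range guard is our addition.

```python
import numpy as np

def temporal_texture(t_art, t_pv):
    """Normalized texture difference of Eq. (24) for one partition.

    `t_art` and `t_pv` are the eight GLCM coefficients of a partition in
    the ART and PV phases.
    """
    t = np.stack([t_art, t_pv])            # shape (2, 8)
    med = np.median(t, axis=0)             # midpoint of the two phases
    rng = t.max(axis=0) - t.min(axis=0)    # max - min over the phases
    rng = np.where(rng == 0, 1.0, rng)     # guard: identical coefficients
    return (t_art - med) / rng, (t_pv - med) / rng

t_art = np.array([1.0, 2, 3, 4, 5, 6, 7, 8])
t_pv = np.array([2.0, 2, 1, 4, 9, 6, 3, 8])
tt_art, tt_pv = temporal_texture(t_art, t_pv)   # entries are -0.5, 0, or +0.5
```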



TT^{PV} is the texture derivative in the PV phase; its formula is similar to that of TT^{ART}.

2.4 Geometric Feature Vector

We propose a set of geometric features based on the characteristics of different types of liver lesions as defined in medical resources [15, 24]. These features are therefore considered high-level semantic characteristics of tumors. Some studies even show a relationship between the enhancement of a tumor and its size/texture [25]; thus, we cannot depend on geometric features alone to discriminate lesions.

Size: Some types of lesions may appear in large volumes, and therefore the relative size of a tumor can be used as a discriminative attribute. We define the relative size of a tumor as the volumetric ratio of the lesion to the liver.

Location: Some varieties of tumors lie at the boundary of the parenchyma, while others lie in the central part of the liver. We employ the Hausdorff distance as another feature. The Hausdorff distance from a point set A to a point set B is defined in (25).

h(A, B) = max_{a∈A} { min_{b∈B} [d(a, b)] }.   (25)
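The directed Hausdorff distance of (25) is available, for example, in SciPy. Applying it to tumor-surface and liver-surface point sets is our reading of the location feature; the paper does not spell out the exact point sets.

```python
import numpy as np
from scipy.spatial.distance import directed_hausdorff

# Directed Hausdorff distance h(A, B) of Eq. (25) between two point sets.
A = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])   # e.g. tumor surface points
B = np.array([[0.0, 0.0, 0.0], [4.0, 0.0, 0.0]])   # e.g. liver surface points

h_AB, _, _ = directed_hausdorff(A, B)   # 1.0: farthest point of A is 1 from B
h_BA, _, _ = directed_hausdorff(B, A)   # 3.0: the measure is not symmetric
```

The asymmetry (h(A, B) ≠ h(B, A)) matches the directed definition in (25).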

Boundary Sharpness: Sharpness determines the quality of detail in an imaging system. For biological reasons, the boundaries of some liver lesions include sharp edges, i.e., the regions of abnormal and healthy tissue can be distinguished unambiguously [15]. Based on previous research [26], the shape of a smoothed intensity profile perpendicular to a boundary point follows a sigmoid function. Therefore, we model the intensity profile normal to the surface of a tumor by the sigmoid function in (26).

f(x) = 1 / (1 + e^{-c1 (x - c2)}).   (26)

The slope of the sigmoid function indicates the strength or weakness of the edge; a steep sigmoid represents a clear boundary. The parameters c1 and c2 are estimated by an iterative least-squares method, with initial values c1 = 1 and c2 = L/2, where L is the length of the profile vector. We introduce c1 as another geometric feature (Fig. 2). More details on the parameter-estimation technique are given in [27].

Aspect Ratio: Since some kinds of tumors have a spherical shape and others have root-like forms, we employ the aspect ratio of a tumor volume as a geometric index. The aspect ratio of a lesion is calculated from the eigenvalues of the 3D VOI image matrix; we employ Principal Component Analysis to obtain them.
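A sketch of the two estimators, using scipy.optimize.curve_fit as a stand-in for the iterative least-squares scheme of [27], and the covariance eigenvalues of the voxel coordinates for the PCA-based aspect ratio. The concrete definitions (and all names) are our assumptions.

```python
import numpy as np
from scipy.optimize import curve_fit

def sigmoid(x, c1, c2):
    """Eq. (26): boundary model; c1 is the slope (sharpness) feature."""
    return 1.0 / (1.0 + np.exp(-c1 * (x - c2)))

def boundary_sharpness(profile):
    """Fit Eq. (26) to a normalized intensity profile and return c1.

    The initial guess c1 = 1, c2 = L/2 follows the paper.
    """
    x = np.arange(len(profile), dtype=float)
    (c1, c2), _ = curve_fit(sigmoid, x, profile,
                            p0=[1.0, len(profile) / 2.0])
    return c1

def aspect_ratio(mask):
    """PCA-based aspect ratio: ratio of the extreme standard deviations of
    the voxel coordinates (our concrete reading of the paper's description)."""
    pts = np.argwhere(mask).astype(float)
    ev = np.linalg.eigvalsh(np.cov(pts, rowvar=False))  # ascending order
    return np.sqrt(ev[-1] / ev[0])

# recover the slope of a synthetic profile and the elongation of a box mask
profile = sigmoid(np.arange(21.0), 2.5, 10.0)
c1_est = boundary_sharpness(profile)
mask = np.zeros((30, 10, 10), dtype=bool)
mask[5:25, 3:7, 3:7] = True      # box elongated along the first axis
ar = aspect_ratio(mask)
```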



Fig. 2 The plot of a sigmoid function is shown (c1 controls the slope)

2.5 Similarity Measure

The affinity between a query Focal Liver Lesion (FLL1) and a member of the database (FLL2) is calculated using the L2 distance between their feature vectors (27).

D_{L2}(FLL1, FLL2) = Σ_{i=1}^{8} ω_i (F_i^{FLL1} - F_i^{FLL2})^2.   (27)
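Equation (27) and the subsequent ranking can be sketched as follows, with the eight feature groups flattened into one toy vector per lesion (a simplification of the grouped form; data and names are illustrative, not the paper's).

```python
import numpy as np

def weighted_l2(f_query, f_db, w):
    """Eq. (27) against a whole database: rows of `f_db` are lesion vectors."""
    return (w * (f_db - f_query) ** 2).sum(axis=1)

rng = np.random.default_rng(0)
f_db = rng.normal(size=(10, 8))     # 10 database lesions, toy 8-D features
f_query = f_db[3] + 0.01            # a query very close to lesion 3
w = np.ones(8)                      # uniform weights for the sketch

dist = weighted_l2(f_query, f_db, w)
ranking = np.argsort(dist)          # most similar lesions first
```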

In (27), ω_i is a normalizing factor. After calculating the similarity of a query image to all members of the database, the results are sorted by distance, and the pathological type of the FLL is predicted using the Bulls Eye Performance (BEP). BEP is the percentage of correct retrievals within the top 2C results, where C is the size of the query FLL's class [8]. In our experiments, C is 29, 16, 24, 24, and 18 for cyst, HEM, FNH, METS, and HCC, respectively. We assign the query image to the category with the highest BEP score (28).

Query ⊆ C_i   if BEP(C_i) = max_k [BEP(C_k)],   k = 1, 2, ..., 5.   (28)
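A possible reading of the BEP-based decision rule (28); the labels and class sizes below are illustrative, not the paper's data.

```python
import numpy as np

def bep_scores(retrieved_labels, class_sizes):
    """Bulls Eye Performance per class, as we read Eq. (28).

    `retrieved_labels` holds the class label of each database item, sorted
    by similarity to the query; for a class c with C members, BEP(c) is the
    fraction of c's members found within the top 2C results.
    """
    return {c: np.sum(np.asarray(retrieved_labels[:2 * size]) == c) / size
            for c, size in class_sizes.items()}

# toy ranking over three classes
ranked = ["cyst", "cyst", "HEM", "cyst", "HEM", "FNH", "FNH", "HEM"]
scores = bep_scores(ranked, {"cyst": 3, "HEM": 3, "FNH": 2})
predicted = max(scores, key=scores.get)   # class with the highest BEP
```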

3 Results and Discussions

We evaluated the proposed CBMIR framework on a database of 116 multi-phase CT images comprising five pathological types, namely, Cyst, Hemangioma (HEM), Focal Nodular Hyperplasia (FNH), Metastasis (METS), and Hepatocellular Carcinoma (HCC). The data were collected between 2011 and 2015 at Sir Run Run Shaw Hospital, Medical School, Zhejiang University, Hangzhou, China. There were 30



Fig. 3 The visual appearance of various FLLs over the three phases is shown

cases of Cyst, 17 instances of HEM, 25 of FNH, 25 of METS, and 19 of HCC. Examples of the tumors are shown in Fig. 3. Precision was used to quantify the retrieval performance of the proposed framework. For a given query image, Precision is the ratio of correctly retrieved instances; its variants are Prec@6, Prec@10, and the mean Average Precision (mAP). Prec@k is the ratio of correctly retrieved images among the top k retrieved results. Average Precision is defined in (29).

AP = (1/R) Σ_{k=1}^{n} Prec@k · rel(k).   (29)
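Equation (29) in code, with rel(k) gating the Prec@k terms; the function name is ours.

```python
def average_precision(rel, R):
    """Average Precision of Eq. (29).

    `rel` is the binary relevance of each retrieved item (1 if it belongs
    to the query's class), and R is the size of the query's class.
    """
    hits, ap = 0, 0.0
    for k, r in enumerate(rel, start=1):
        hits += r
        if r:                       # rel(k) = 1: add Prec@k = hits / k
            ap += hits / k
    return ap / R

# relevant items at ranks 1 and 3, class of size 2
ap = average_precision([1, 0, 1, 0, 0], R=2)   # (1/1 + 2/3) / 2 = 5/6
```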

In (29), Prec@k is the Precision at the top k retrieved instances, rel(k) is a binary function that is 1 when the kth result belongs to the class of the query image, n is the total number of retrieved results, and R is the size of the class corresponding to the query image. We employ a leave-one-out cross-validation scheme. We optimized the weights of the feature vector for the best results. Experimentally, the best weight vector was found to be ω = [0.20, 0.15, 0.09, 0.25, 0.13, 0.01, 0.01, 0.15], whose elements weight the density, temporal density, temporal texture, boundary sharpness, texture, aspect ratio, size, and Hausdorff-distance features, respectively. The low weights for aspect ratio and size may be related to a deficiency in defining these parameters mathematically.



Table 1 Results of retrieval experiments

Texture feature | Size | Location | Boundary sharpness | Aspect ratio | mAP | Prec@6 | Prec@10
X | | | | | 0.7495 | 0.7016 | 0.6434
X | X | | | | 0.7495 | 0.7016 | 0.6434
X | X | X | | | 0.7496 | 0.7060 | 0.6429
X | X | X | X | | 0.7499 | 0.7106 | 0.6374
X | X | X | X | X | 0.7499 | 0.7106 | 0.6374

Table 2 Comparison with other retrieval methods

Methods | Features | mAP | Prec@6 | Prec@10
GLCM [8] | Texture | 0.3790 | 0.2727 | 0.2697
LBP histogram [28] | Texture | 0.4478 | 0.2576 | 0.2424
Global BoVW [10] | Texture | 0.6746 | 0.5758 | 0.5273
Category-specific BoVW [10] | Texture | 0.7045 | 0.5808 | 0.5303
Texture-specific BoVW [13] | Texture | 0.7563 | 0.6717 | 0.6515
Proposed method | Texture + Geometric | 0.7499 | 0.7106 | 0.6374

In Table 1, the performance of the geometric attributes is compared with that of standard GLCM features. Based on these results, the size and aspect ratio of a tumor do not help in distinguishing the tumor type; the reason may be either the unimportance of these characteristics or an improper definition of the features. However, both the location and the boundary sharpness are essential elements. As shown in Table 2, our approach achieved the highest Prec@6 among the compared algorithms, which can be attributed to the introduction of the geometric features as a new global characteristic. While the improvement from including the geometric features seems small compared to the textural attributes alone, we obtained a significant gain of 4% in Prec@6 for the HEM data. This class contains the fewest samples compared to the other categories, so the possibility of misclassification increases in such a case. This result reveals the potential of the proposed features to improve the discrimination ability when class imbalance exists in the available data. In Fig. 4, the variations of Prec@k for the HEM data are shown. Since a physician discriminates different lesions using geometric characteristics, better results could be obtained if the mathematical descriptions of the above attributes were further improved.

If we decide on the final class of a test image based on the frequency of the ten most similar retrieved data and calculate the accuracy of this retrieval mode, the results corresponding to the textural and textural + geometric features are 79.3% and 82.0%, respectively, which reveals the superiority of the proposed technique compared to the textural-based approach.

Fig. 4 Variations of Prec@k for the HEM data are shown

4 Conclusion and Future Works

We introduced new geometric features and added them to the textural and spatio-temporal attributes to improve the performance of a CBMIR system. The new features were the relative size and location of the lesion, the boundary sharpness, and the aspect ratio. The quantitative evaluation revealed an improvement in the accuracy of the retrieval system. In the future, we intend to employ non-Euclidean measurement tools, including manifold-learning techniques.

Acknowledgements The authors would like to thank Prof. Hongjie Hu of Sir Run Run Shaw Hospital, Medical School, Zhejiang University, for providing the CT data and for advice on data processing. This research was supported in part by a Grant-in-Aid for Scientific Research from the Japanese Ministry of Education, Culture, Sports, Science and Technology (MEXT) under Grant No. 18H03267.

References

1. Costa, M.J., Tsymbal, A., Hammon, M., Cavallaro, A., Suhling, M., Seifert, S., et al.: A discriminative distance learning-based CBIR framework for characterization of indeterminate liver lesions. In: MICCAI International Workshop on Medical Content-Based Retrieval for Clinical Decision Support, 22 Sep 2011, pp. 92–104 (2012)
2. Yu, M., Lu, Z., Feng, Q., Chen, W.: Liver CT image retrieval based on non-tensor product wavelet. In: International Conference of Medical Image Analysis and Clinical Application (MIACA), pp. 67–70 (2010)



3. Lei, B., Yang, P., Zhuo, Y., Zhou, F., Ni, D., Chen, S., Xiao, X., Wang, T.: Neuroimaging retrieval via adaptive ensemble manifold learning for brain disease diagnosis. IEEE J. Biomed. Health Inform. 23(4), 1661–1673 (2019)
4. Gu, Y., Yang, J.: Densely-connected multi-magnification hashing for histopathological image retrieval. IEEE J. Biomed. Health Inform. 23(4), 1683–1691 (2019)
5. Veerashetty, S., Patil, N.B.: Manhattan distance-based histogram of oriented gradients for content-based medical image retrieval. Int. J. Comput. Appl. 1–7 (2019)
6. Mirasadi, M.S., Foruzan, A.H.: Content-based medical image retrieval of CT images of liver lesions using manifold learning. Int. J. Multimed. Inf. Retr. 8(4), 233–240 (2019)
7. Swati, Z.N.K., Zhao, Q., Kabir, M., Ali, F., Ali, Z., Ahmed, S., Lu, J.: Content-based brain tumor retrieval for MR images using transfer learning. IEEE Access 7, 17809–17822 (2019)
8. Roy, S., Chi, Y., Liu, J., Venkatesh, S.K., Brown, M.S.: Three-dimensional spatiotemporal features for fast content-based retrieval of focal liver lesions. IEEE Trans. Biomed. Eng. 61(11), 2768–2778 (2014)
9. Chi, Y., Zhou, J., Venkatesh, S.K., Tian, Q., Liu, J.: Content-based image retrieval of multi-phase CT images for focal liver lesion characterization. Med. Phys. 40(10) (2013)
10. Yang, W., Lu, Z., Yu, M., Huang, M., Feng, Q., Chen, W.: Content-based retrieval of focal liver lesions using bag-of-visual-words representations of single- and multi-phase contrast-enhanced CT images. J. Digit. Imaging 25(6), 708–719 (2012)
11. Xu, Y., Lin, L., Hu, H., Wang, D., Liu, Y.: A retrieval system for 3D multi-phase contrast-enhanced CT images of focal liver lesions based on combined bags of visual words and texture words. In: 9th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics (CISP-BMEI), pp. 806–810 (2016)
12. Wang, J., Han, X.H., Xu, Y., Lin, L., Hu, H., Jin, C.: Tensor sparse representation of temporal features for content-based retrieval of focal liver lesions using multi-phase medical images. In: IEEE International Symposium on Multimedia, pp. 507–510 (2017)
13. Xu, Y., Lin, L., Hu, H., Wang, D., Zhu, W., Wang, J., et al.: Texture-specific bag of visual words model and spatial cone matching-based method for the retrieval of focal liver lesions using multi-phase contrast-enhanced CT images. Int. J. Comput. Assist. Radiol. Surg. 13(1), 151–164 (2018)
14. Alahmera, H., Ahmeda, A.: Computer-aided classification of liver lesions from CT images based on multiple ROI. Proc. Comput. Sci. 90, 80–86 (2016)
15. Napel, A., Beaulieu, F., Rodriguez, C., Cui, J., Xu, J., Gupta, A., et al.: Automated retrieval of CT images of liver lesions on the basis of image similarity. Radiology 256(1), 243–252 (2010)
16. Qian, Y., Gao, X., Loomes, M., Comley, R., Barn, B., Hui, R., et al.: Content-based retrieval of 3D medical images. In: The Third International Conference on eHealth, Telemedicine, and Social Medicine (eTELEMED), pp. 7–12 (2011)
17. Spanier, A.B., Caplan, N., Sosna, J., Acar, B., Joskowicz, L.: A fully automatic end-to-end method for content-based image retrieval of CT scans with similar liver lesion annotations. Int. J. Comput. Assist. Radiol. Surg. 13(1), 165–174 (2018)
18. Tuyet, V., Hien, N., Quoc, P., Son, N., Binh, N.: Adaptive content-based medical image retrieval based on local features extraction in shearlet domain. EAI Endorsed Trans. Context. Syst. Appl. 6(17), 159351 (2019)
19. Liang, D., Lin, L., Hu, H., Zhang, Q., Chen, Q., Iwamoto, Y., et al.: Combining convolutional and recurrent neural networks for classification of focal liver lesions in multi-phase CT images. In: Frangi, A., Schnabel, J., Davatzikos, C., Alberola-López, C., Fichtinger, G. (eds.) Medical Image Computing and Computer-Assisted Intervention – MICCAI 2018, Lecture Notes in Computer Science, LNCS 7951, pp. 666–675. Springer (2018)
20. Cai, Y., Li, Y., Qiu, C., Ma, J., Gao, X.: Medical image retrieval based on convolutional neural network and supervised hashing. IEEE Access 7, 51877–51885 (2019)
21. Owais, M., Arsalan, M., Choi, J., Park, K.R.: Effective diagnosis and treatment through content-based medical image retrieval (CBMIR) by using artificial intelligence. J. Clin. Med. 8(4), 462 (2019)



22. Trojachanec, K., Kitanovski, I., Dimitrovski, I., Loshkovska, S.: Longitudinal brain MRI retrieval for Alzheimer's disease using different temporal information. IEEE Access 6, 9703–9712 (2017)
23. Chi, Y., Zhou, J., Venkatesh, S.K., Huang, S., Tian, Q., Liu, J.: Computer aided focal liver lesion detection. Int. J. Comput. Assist. Radiol. Surg. 8(4), 511–525 (2013)
24. Radiology Assistant. https://radiologyassistant.nl/abdomen/liver/common-liver-tumors. Accessed 18 June 2020
25. Hodler, J., Kubik-Huch, R.A., von Schulthess, G.K.: Diseases of the Abdomen and Pelvis 2018–2021: Diagnostic Imaging, IDKD Book. Springer Nature (2018)
26. Foruzan, A.H., Chen, Y.-W.: Improved segmentation of low-contrast lesions using sigmoid edge model. Int. J. Comput. Assist. Radiol. Surg. 11(7), 1267–1283 (2016)
27. Esfandiarkhani, M., Foruzan, A.H.: A generalized active shape model for segmentation of liver in low-contrast CT volumes. Comput. Biol. Med. 82, 59–70 (2017)
28. Foncubierta-Rodríguez, A., Seco de Herrera, A.G., Müller, H.: Medical image retrieval using bag of meaningful visual words: unsupervised visual vocabulary pruning with PLSA. In: Proceedings of the 1st ACM International Workshop on Multimedia Indexing and Information Retrieval for Healthcare, pp. 75–82 (2013)

Author Index

A
Aonpong, Panyanat, 41

B
Bai, Jie, 51
Bhatti, Uzair Aslam, 61, 75, 87, 101

C
Cai, Xiujun, 241
Cheng, Jingliang, 51
Chen, Qingqing, 217
Chen, Yen-Wei, 41, 51, 207, 217, 241, 251
Chen, Yizhou, 229
Cipriano, Marco, 117
Costagliola, Gennaro, 117
Cui, Wenfeng, 61, 75, 87, 101

D
De La Cruz-Ramirez, Yuliana Mercedes, 3
De Rosa, Mattia, 117
Dong, Haohua, 241

E
Ebrahimi, Ali, 143

F
Fang, Yangxiu, 61, 75, 87, 101
Foruzan, Amir Hossein, 251
Fuccella, Vittorio, 117
Fujimura, Ryota, 197
Fukami, Yoshiaki, 15
Funahashi, Koji, 197
Furukawa, Akira, 207

G
Gerevini, Alfonso, 29
Goto, Tomio, 197

H
Han, Baoru, 61, 75, 87, 101
Han, Xian-Hua, 51, 217, 241
Holmgren, Johan, 183
Hu, Hongjie, 217, 241, 251

I
Inoue, Akitoshi, 207
Iwamoto, Yutaro, 41, 51, 207, 217, 241
Izukura, Rieko, 171

J
Jiang, Huiyan, 229

K
Kawahara, Toshiki, 207
Kobayashi, Nobuyuki, 129

L
Lavelli, Alberto, 29
Li, Hangpeng, 155
Li, Jingbing, 61, 75, 87, 101
Lin, Lanfen, 41, 217, 241
Liu, Jing, 61, 87

M
Mahdiraji, Saeid Amouzad, 183
Mansourvar, Marjan, 143
Mao, Zhengjia, 155
Masuda, Yoshimasa, 15, 129, 155
Mehmood, Tahir, 29
Mihailescu, Radu-Casian, 183
Miura, Kasei, 129
Miyake, Tetsuro, 129
Moslehi, Saeed, 251

N
Naemi, Amin, 143
Nakashima, Naoki, 171

O
Olaza-Maguiña, Augusto Felix, 3

P
Pang, Wenbo, 229
Petersson, Jesper, 183

S
Schmidt, Thomas, 143
Serina, Ivan, 29
Shevchenko, Sergiy, 117
Shirasaka, Seiko, 129

T
Toma, Tetsuya, 155

W
Wang, Jian, 217
Wang, Weibin, 41
Wiil, Uffe Kock, 143

X
Xiao, Xiliang, 61, 75, 87, 101
Xu, Yingying, 217

Y
Yamashita, Takanori, 171
Yi, Dan, 61, 75, 87, 101

Z
Zhang, Xinran, 51
Zhang, Yonglong, 229
Zhao, Guohua, 51
Zhao, Junlin, 217
Zhong, Junhao, 155

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021
Y.-W. Chen et al. (eds.), Innovation in Medicine and Healthcare, Smart Innovation, Systems and Technologies 242, https://doi.org/10.1007/978-981-16-3013-2