12th Asian-Pacific Conference on Medical and Biological Engineering: Proceedings of APCMBE 2023, May 18–21, 2023, Suzhou, China―Volume 2: ... Biology (IFMBE Proceedings, 104) 303151484X, 9783031514845



English Pages [439]


Table of contents:
APCMBE2023 Committees
Preface
Contents
Computer-Aided Surgery
Inside-Out Accurate Head Tracking with Head-Mounted Augmented Reality Device
1 Introduction
2 Methods
2.1 Retro-reflective Tool Definition and Detection
2.2 Camera Depth Undistortion
2.3 Preoperative Registration
2.4 Dynamic Head Tracking
3 Experiments and Results
3.1 Preoperative Registration
3.2 Tracking Stability
3.3 System Performance
4 Discussions
5 Conclusions
References
A Model-Guided Method for Ultrasound Probe Calibration
1 Introduction
2 Methods
2.1 Calibration Model Design and General Ideas
2.2 3D Reconstruction Based on an Uncalibrated Probe
2.3 Image Registration
2.4 Probe Calibration
3 Experiments and Results
4 Discussions and Conclusions
References
Real-Time Medical Tool Runout Monitor Based on Dual Laser Displacement Sensors
1 Introduction
2 Methods
2.1 Error Model
2.2 Dual Laser Displacement Sensors Tracking Structure and Calibration Method
2.3 Cutting Tool Tip Position Measurement Method
3 Experiments and Results
3.1 Experimental Setup
3.2 Stability and Precision
3.3 Use Case: Drilling
3.4 System Performance
4 Discussions and Conclusions
References
Correction of Premature Closure of Sagittal Suture with Small-Incision Traction Bow
1 Introduction
2 Materials and Methods
2.1 Patients
2.2 Surgical Method
2.3 Results
3 Discussion
4 Conclusions
References
A Home-Style Intelligent Monitoring Sanitize Robot
1 Introduction
2 Home Monitoring Sanitize Overall Design of Robots
3 Home Monitoring Sanitize Control of Various Functions of Robots
3.1 Sanitize Mode Control
3.2 Intelligent Temperature Monitoring Mode Control
4 Monitoring Sanitize Robot Control Testing
4.1 Robot Temperature Detection Error Analysis
4.2 Robot Driving Stability Simulation
5 Conclusions
References
YOLOv7-Based Multiple Surgical Tool Localization and Detection in Laparoscopic Videos
1 Introduction
2 Methodology and Experiment
2.1 YOLOv7 Algorithm
2.2 Data Pre-Processing
2.3 Training Process
2.4 Evaluation Metrics
2.5 Experiment Implement
3 Results
4 Discussion
5 Conclusion
References
A Frequency-Based Analysis Method to Improve Adversarial Robustness of Neural Networks for EEG-Based Brain-Computer Interfaces
1 Introduction
2 Proposed Approach
2.1 Framework
2.2 EEG Adversarial Example Distribution and Influence
2.3 Frequency Analysis of EEG Adversarial Example
2.4 Variation Coefficient-Based EEG Adversarial Example Detection
3 Experiments and Results
3.1 Experimental Setup
3.2 Evaluating the Variation Coefficient
3.3 Detection Results of EEG Adversarial Examples
4 Conclusions
References
Robot-Assisted Optical Coherence Tomography for Automatic Wide-Field Scanning
1 Introduction
2 Methods
3 Experiments and Results
4 Discussion and Conclusion
References
Adversarial Detection and Defense for Medical Ultrasound Images: From a Frequency Perspective
1 Introduction
2 Ultrasound Adversarial Images Detection Method: From a Frequency Perspective
2.1 Overview
2.2 Feature Distribution of Adversarial Examples
2.3 Difference of Frequency Domain
2.4 Coefficient of Variation
3 Experiments and Results
3.1 Datasets and Implementation Details
3.2 Adversarial Examples Detection on Ultrasound Images
4 Conclusion
References
A Novel Model-Independent Approach for Autonomous Retraction of Soft Tissue
1 Introduction
2 Methods
2.1 State Representation Learning (SRL) Setup
2.2 Deep Reinforcement Learning (DRL) Setup
3 Results
3.1 Sample Efficiency
3.2 Accomplishment Degree of the Task
3.3 Safety Level Indicator of the Task
4 Discussion and Conclusion
References
A Soft Robot Based on Magnetic-Pneumatic Hybrid Actuation for Complex Environments
1 Introduction
2 Design Concept and Fabrication
3 Motion Gait and Analysis
4 Experiment
5 Conclusion
6 Discussion
References
A VR Environment for Cervical Tumor Segmentation Through Three-Dimensional Spatial Interaction
1 Introduction
2 Methods
2.1 Overview of Our Environment
2.2 Visualization Platform of Cervical Imaging
2.3 3D Spatial Interaction and 3D Tumor Segmentation
3 Results and Discussion
3.1 System Architecture and Technical Details
3.2 User Study
3.3 Discussion
4 Conclusion
References
An Image Fusion Method Combining the Advantages of Dual-Mode Optical Imaging in Endoscopy
1 Introduction
2 Methods
3 Experiments and Results
4 Discussion
5 Conclusions
References
An End-to-End Spatial-Temporal Transformer Model for Surgical Action Triplet Recognition
1 Introduction
2 Methods
2.1 End-to-End Spatial-Temporal Transformer
2.2 Multi-task Auxiliary Supervisions
3 Experiments
3.1 Dataset Description
3.2 Evaluation Metrics
3.3 Implementation Details
4 Results
4.1 Ablation Studies
4.2 Comparisons with State-of-the-Arts
5 Conclusions
References
2D/3D Reconstruction of Patient-Specific Surface Models and Uncertainty Estimation via Posterior Shape Models
1 Introduction
2 Method
2.1 Overview of the Present Method
2.2 Construction of the Posterior Shape Model
2.3 Surface Model Reconstruction and Uncertainty Estimation via the Posterior Shape Model
3 Experiments and Results
3.1 Experimental Setup
3.2 Metrics
3.3 Evaluation on the Synthetic Cases
3.4 Evaluation on the Cadaveric Cases
4 Conclusion
References
Semantics-Preserved Domain Adaptation with Target Diverse Perturbation and Test Ensembling for Image Segmentation
1 Introduction
2 Methods
2.1 Preliminary
2.2 Image Translation
2.3 Semantics Preservation Regularization (SPR)
2.4 Target Diverse Perturbation (TDP)
2.5 Test-Time Ensembling (TTE) for CDIS
3 Experiments
3.1 Dataset and Preprocessing
3.2 Comparison and Ablation
3.3 Evaluation and Implementation
4 Results
4.1 Comparison Performance
4.2 Ablation Studies
5 Conclusion
References
Biomechanics
A New Mathematical Model for Assessment of Bleeding and Thrombotic Risk in Three Different Types of Clinical Ventricular Assist Devices
1 Introduction
2 Materials and Methods
2.1 Studied Models
2.2 Hydraulic Performance Prediction
2.3 Shear Stress and Residence Time Prediction
2.4 Bleeding and Thrombosis Prediction Model Building
2.5 Mesh Details and Sensitivity Analysis
2.6 CFD Methods
3 Results
3.1 Hydraulic Performances and Flow Field
3.2 Shear Stress and Residence Time
3.3 Bleeding Probability
3.4 Thrombotic Potential
4 Discussion
5 Conclusions
References
Analysis of YAP1 Gene as a Potential Immune-Related Biomarker and Its Relationship with the TAZ Expression
1 Introduction
2 Materials and Methods
2.1 UCSC Xena Dataset Analysis
2.2 GEO Dataset Acquirement and Analysis
2.3 cBioPortal Dataset Analysis
2.4 UALCAN Dataset Analysis
2.5 Kaplan–Meier Plotter Analysis
2.6 TISIDB Immune Analysis
2.7 CIBERSORT Immune Analysis
2.8 GSEA Analysis
2.9 TIMER Database Analysis
2.10 STRING Protein Network Analysis
2.11 CYTOSCAPE Network Analysis
2.12 GEPIA Dataset Analysis
2.13 Statistical Analysis
3 Results
3.1 Pan-Cancer Analysis of YAP1 Expression, Stage, Molecular Subtypes, and Methylated Level in Human Cancers
3.2 Genetic Alterations in YAP1 in Human Cancers
3.3 Prognostic Value of YAP1 Expression in Cancers
3.4 Correlation Between YAP1 Expression and Immune Cells Infiltration and Immune Subtypes in Human Cancers
3.5 Significant Pathways Influenced by YAP1 in Different Cancers
3.6 The Correlation Between YAP1 Expression and TAZ Expression and Regulate Downstream Genes
4 Discussion
5 Conclusions
References
Morphological Feature Recognition of Induced ADSCs Based on Deep Learning
1 Introduction
2 Materials and Methods
3 Results
4 Discussion
5 Conclusions
References
Micromechanical Properties Investigation of Rabbit Carotid Aneurysms by Atomic Force Microscopy
1 Introduction
2 Materials and Methods
2.1 Carotid Aneurysm Creation
2.2 Histological Examination
2.3 AFM Experiment
2.4 Finite-element Method Simulation
3 Results
3.1 Histology
3.2 Comparison Between Aneurysm Arteries and Healthy Arteries
3.3 Comparison of Indentation at Different Depths by FEM
4 Discussion
5 Conclusions
References
The Development of the “Lab-In-Shoe” System Based on an Instrumented Footwear for High-Throughput Analysis of Gait Parameters
1 Introduction
2 Materials and Methods
2.1 “Lab-in-shoe” Design
2.2 Implementation of the Algorithm
2.3 Experimental Validations
3 Results
3.1 Validity of the System
3.2 Measurement Reliability Analysis
4 Discussion
5 Conclusions
References
3D-Printed Insole Designs for Enhanced Pressure-Relief in Diabetic Foot Based on Functionally-Graded Stiffness Properties
1 Introduction
2 Materials and Methods
2.1 3D Geometry of the Diabetic Foot Insoles
2.2 Functionally-Graded Stiffness Design
2.3 Patient-Specific Plantar Pressure Analysis
3 Results
4 Discussions
5 Conclusions
References
A Novel Force Platform for Assessing Multidimensional Plantar Stresses in the Diabetic Foot—A Deep Learning-Based Decoupling Approach
1 Introduction
2 Proposed Method
2.1 Physical Principle of the Method
2.2 Optical Flow Calculation
2.3 Neural Network-Based Calibration of the Force Platform
3 Results
4 Discussions
5 Conclusions
References
MicroNano Bioengineering
A Nanoparticle Tracking Analysis Algorithm for Particle Size Estimation
1 Introduction
2 Materials and Methods
2.1 Principle of Nanoparticle Tracking Analysis
2.2 Algorithm of Nanoparticle Tracking Analysis
2.3 Experimental Setup
3 Results
4 Discussion
5 Conclusions
References
Biomaterials
Self-adaptive Dual-Inducible Nanofibers Scaffolds for Tendon-To-Bone Interface Synchronous Regeneration
1 Introduction
2 Materials and Method
2.1 Reagents and Materials
2.2 Preparation of Sr-MBG
2.3 Fabrication of Nanofibers Scaffolds
2.4 Characterizations of Sr-MBG and Nanofibers Scaffolds
2.5 Release of Ions by Nanofibers Scaffolds
2.6 Cytocompatibility of Nanofiber Scaffolds
2.7 In Vitro Cell Induction
2.8 In Vivo Experiments
2.9 Statistical Analysis
3 Result
3.1 Characterization of Sr-MBG
3.2 Characterization of Nanofibers Scaffolds
3.3 Biocompatibility of Nanofibers Scaffolds
3.4 Self-adaptive Dual-Inducible Effect of SDNS in Vitro
3.5 Effect of SDNS on Tendon-to-Bone Interface Synchronous Regeneration in Vivo
4 Discussion
5 Conclusion
References
Medical Informatics
A Multifunctional Image Processing Tool for CT Data Standardization
1 Introduction
2 Methods
2.1 IO Module
2.2 Denoising Module
2.3 Resampling Module
2.4 Registration Module
2.5 Logging Module
2.6 Tag Modifying Module
3 Results
4 Discussion
5 Conclusions
References
Effect of Schroth Exercise on Pulmonary Function and Exercise Capacity in Patients with Severe Adolescent Idiopathic Scoliosis
1 Introduction
2 Materials and Methods
2.1 Subjects
2.2 Interventions
2.3 Data Collection
2.4 Statistical Analysis
3 Results
3.1 Subjects
3.2 Pulmonary Function
3.3 Exercise Capacity
4 Discussion
5 Conclusions
References
An Imputation Approach to Electronic Medical Records Based on Time Series and Feature Association
1 Introduction
2 Related Work
2.1 Statistical Prediction Imputation Methods
2.2 Data Mining Prediction Imputation Methods
2.3 Deep Learning Prediction Imputation Methods
3 Missing Value Imputation of Electronic Medical Records
3.1 Description of the Problem
3.2 Our Approach—GRU-RMF
4 Experimental Method
4.1 Experimental Setup
4.2 Evaluation Criteria
5 Results
6 Discussions
6.1 Usefulness of the Model
6.2 Impact of Data Dimension on the Model
7 Conclusion
References
A Software Tool for Anomaly Detection and Labeling of Ventilator Waveforms
1 Introduction
2 Materials and Methods
2.1 Data Collection
2.2 Design of Software Functions
2.3 Automatic Extraction of Breathing Cycles
2.4 Anomaly Detection of Mechanical Ventilation Waveforms
3 Results
4 Discussion
5 Conclusion
References
A Machine Learning Approach for Predicting the Time Point of Achieving a Negative Fluid Balance in Patients with Acute Respiratory Distress Syndrome
1 Introduction
2 Materials and Methods
2.1 Source of Data
2.2 Participants
2.3 Definition of Time Window and Label
2.4 Predictors
2.5 Training and Testing Model
3 Results
4 Discussion
5 Conclusions
References
3D Simulation Model for Urine Detection in Human Bladder by UWB Technology
1 Introduction
2 Methods
2.1 Antenna Model
2.2 The 3D Bladder Model
2.3 Imaging Configuration
2.4 Imaging Algorithm
3 Results
4 Discussion
5 Conclusions
References
AI in Medicine
A Nearest Neighbor Propagation-Based Partial Label Learning Method for Identifying Biotypes of Psychiatric Disorders
1 Introduction
2 Material and Method
2.1 Materials
2.2 Method
2.3 Evaluation of Biotypes
3 Results
4 Conclusions
References
Predicting Timing of Starting Continuous Renal Replacement Therapy for Critically Ill Patients with Acute Kidney Injury Using LSTM Network Model
1 Introduction
2 Materials and Methods
2.1 Source of Data
2.2 Research Population
2.3 Data Extraction and Feature Screening
2.4 Time Series Forecasting Events Via Sliding Window
2.5 Model Training and Prediction
3 Results
4 Discussion
5 Conclusions
References
An End-To-End Seizure Prediction Method Using Convolutional Neural Network and Transformer
1 Introduction
2 Materials and Methods
2.1 Data Description
2.2 Preprocessing
2.3 Convolutional Neural Networks
2.4 Transformer Block
3 Results
4 Discussion
5 Conclusion
References
Ensemble Feature Selection Method Using Similarity Measurement for EEG-Based Automatic Sleep Staging
1 Introduction
2 Methods
2.1 Generation of Original Feature Set
2.2 Generation of Feature Subsets with Desired Similarity
2.3 Similarity Measurement Based on Base Feature Selectors
3 Two-Stage Majority Voting Method
4 Experiments and Results
4.1 Dataset Description and Preprocessing
4.2 Results and Analysis
5 Conclusions
References
Biomedical Photonics
Rapid Virus Detection Using Recombinase Polymerase Amplification Assisted by Computational Amplicon-Complex Spectrum
1 Introduction
2 Method
3 Results
3.1 Absorption Spectra Calculation of 1× GelRed Solution
3.2 Determination of Reasonable GelRed Concentration
3.3 Detection of Influenza A Virus
4 Discussion
5 Conclusions
References
Neuromodulation with Submillimeter Spatial Precision by Optoacoustic Fiber Emitter
1 Introduction
2 Methods
2.1 CPOF Fabrication
2.2 Optoacoustic Signal Measurement
2.3 Cortical Neuron Culture
2.4 Calcium Imaging and Neural Stimulation
3 Results
4 Discussion
5 Conclusions
References
Medical Laboratory Engineering
A Novel Poly(3-hexylthiophene) Microelectrode for Ascorbic Acid Monitoring During Brain Cytotoxic Edema
1 Introduction
2 Materials and Methods
2.1 Carbon Fiber Microelectrodes (CFMEs) Preparation
2.2 Fabrication of CFME/P3HT-N-MWCNTs Sensor
2.3 Brain Slices Experiments
2.4 HT22 Cells’ Experiments
3 Results
3.1 Preparation Process and Electrochemical Performance of CFME/P3HT-N-MWCNTs
3.2 Monitoring of HT22 Cells During Cytotoxic Edema
3.3 Monitoring of Brain Slices During Cytotoxic Edema
4 Discussion
5 Conclusions
References
Health Engineering
Radar Translator: Contactless Eyeblink Detection Assisting Basic Daily Intension Voice for the Paralyzed Aphasia Patient Using Bio-Radar
1 Introduction
2 Principle Introduction and Operating Scenario
2.1 Principle of Radar-Based Eyelid Blink Detection
2.2 Operating Scenario
3 Signal Processing and System Control
4 Experiments and Validation
4.1 Effectiveness Verification of Blink Detection
4.2 Quantitative System Performance Investigation
4.3 Practical Usability Testing of the Radar-Based Translator
5 Conclusions
References
Image Reconstruction Algorithm in Electrical Impedance Tomography Based on Improved CNN-RBF Model
1 Introduction
2 Method
2.1 Electrical Impedance Tomography (EIT)
2.2 Preparing Data
2.3 Network Architecture for EIT Reconstruction
3 Result
3.1 Evaluation Metrics
3.2 Correlation Analysis of Boundary Voltages
3.3 Training and Evaluating the Model
4 Discussion
5 Conclusions
References
An Explainable Assessment for Depression Detection Using Frontal EEG
1 Introduction
2 Methods
2.1 Data and Processing
2.2 Feature Extraction
2.3 Statistics and Classification
2.4 Feature Validity Assessment
3 Result
4 Discussion
5 Conclusion
References
Heart Murmur Detection in Phonocardiogram Signals Using Support Vector Machines
1 Introduction
2 Methods
2.1 Data
2.2 Pre-processing
2.3 Segmentation
2.4 Feature Extraction
2.5 Classification Algorithm
3 Results
4 Discussion
5 Conclusions
References
Computational Systems, Modeling and Simulation in Medicine, Multiscale Modeling and Synthetic Biology
SCpipeline: The Tool and Web Service for Identifying Potential Drug Targets Based on Single-Cell RNA Sequencing Data
1 Introduction
2 Materials and Methods
3 Results
4 Discussion
References
Study on the Detection of Vertigo Induced by GVS Based on EEG Signal Feature Binary Classification
1 Introduction
2 Methods
2.1 Subjects
2.2 Experimental Method
2.3 Signal Feature Extraction Method
2.4 Classification Algorithm
3 Results
3.1 DHI Scale Analysis
3.2 Binary Classification Results Based on Different Feature Data Sets
3.3 Verification Results
4 Discussion
5 Conclusions
References
Therapeutic and Diagnostic Systems and Technologies
CT Images-Based Automatic Path Planning for Pedicle Screw Placement Incorporating Anatomical and Biomechanical Considerations
1 Introduction
2 Methods
2.1 Dataset Production and Pre-processing
2.2 Training of Mechanical Distributions Generating Subnetwork
2.3 Path Points Location Subnetwork
2.4 Evaluation
3 Results and Discussion
3.1 Results
3.2 Discussion
4 Conclusions
References
Reciprocal Unlocking Between Autoinhibitory CaMKII and Tiam1: A Simulation Study
1 Introduction
2 Methods
2.1 Docking
2.2 Visualization
2.3 PDB Entries
3 Results
4 Conclusions
5 Discussion
References
Author Index

IFMBE Proceedings 104

Guangzhi Wang · Dezhong Yao · Zhongze Gu · Yi Peng · Shanbao Tong · Chengyu Liu Editors

12th Asian-Pacific Conference on Medical and Biological Engineering Proceedings of APCMBE 2023, May 18–21, 2023, Suzhou, China—Volume 2: Computer-Aided Surgery, Biomechanics, Health Informatics, and Computational Biology

IFMBE Proceedings

104

Series Editor
Ratko Magjarević, Faculty of Electrical Engineering and Computing, ZESOI, University of Zagreb, Zagreb, Croatia

Associate Editors
Piotr Ładyżyński, Warsaw, Poland
Fatimah Ibrahim, Department of Biomedical Engineering, Faculty of Engineering, Universiti Malaya, Kuala Lumpur, Malaysia
Igor Lackovic, Faculty of Electrical Engineering and Computing, University of Zagreb, Zagreb, Croatia
Emilio Sacristan Rock, Mexico DF, Mexico

The IFMBE Proceedings Book Series is an official publication of the International Federation for Medical and Biological Engineering (IFMBE). The series gathers the proceedings of various international conferences, which are either organized or endorsed by the Federation. Books published in this series report on cutting-edge findings and provide an informative survey on the most challenging topics and advances in the fields of medicine, biology, clinical engineering, and biophysics. The series aims at disseminating high quality scientific information, encouraging both basic and applied research, and promoting world-wide collaboration between researchers and practitioners in the field of Medical and Biological Engineering. Topics include, but are not limited to:

• Diagnostic Imaging, Image Processing, Biomedical Signal Processing
• Modeling and Simulation, Biomechanics
• Biomaterials, Cellular and Tissue Engineering
• Information and Communication in Medicine, Telemedicine and e-Health
• Instrumentation and Clinical Engineering
• Surgery, Minimal Invasive Interventions, Endoscopy and Image Guided Therapy
• Audiology, Ophthalmology, Emergency and Dental Medicine Applications
• Radiology, Radiation Oncology and Biological Effects of Radiation
• Drug Delivery and Pharmaceutical Engineering
• Neuroengineering, and Artificial Intelligence in Healthcare

IFMBE proceedings are indexed by SCOPUS, EI Compendex, Japanese Science and Technology Agency (JST), SCImago. They are also submitted for consideration by WoS. Proposals can be submitted by contacting the Springer responsible editor shown on the series webpage (see “Contacts”), or by getting in touch with the series editor Ratko Magjarevic.

Editors
Guangzhi Wang, Department of Biomedical Engineering, School of Medicine, Tsinghua University, Beijing, China
Dezhong Yao, School of Life Science and Technology, University of Electronic Science and Technology, Chengdu, China
Zhongze Gu, State Key Laboratory of Bioelectronics, School of Biological Science and Medical Engineering, Southeast University, Nanjing, China
Yi Peng, Institute of Basic Medical Sciences, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
Shanbao Tong, School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, China
Chengyu Liu, State Key Laboratory of Bioelectronics, School of Instrument Science and Engineering, Southeast University, Nanjing, China

ISSN 1680-0737
ISSN 1433-9277 (electronic)
IFMBE Proceedings
ISBN 978-3-031-51484-5
ISBN 978-3-031-51485-2 (eBook)
https://doi.org/10.1007/978-3-031-51485-2

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2024

This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland

Paper in this product is recyclable.

APCMBE2023 Committees

Presidium

President
Xuetao Cao, Chinese Academy of Medical Sciences

Executive President
Shengshou Hu, Fuwai Hospital, Chinese Academy of Medical Sciences

Vice Presidents
Jing Cheng, Tsinghua University
Hui Chi, Medical Information Research Institute of Chinese Academy of Medical Sciences
Xiaosong Gu, Nantong University
Jinxiang Han, Shandong First Medical University
Deyu Li, Beihang University
Suiren Wan, Southeast University
Guangzhi Wang, Tsinghua University
Guosheng Wang, Henan Tuoren Medical Device Co., Ltd.
Dezhong Yao, University of Electronic Science and Technology of China
Yiwu Zhao, Naton Medical Group
Hairong Zheng, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences

Secretary-General
Hui Chi, Medical Information Research Institute of Chinese Academy of Medical Sciences

Scientific Committee

Director
Guangzhi Wang, Tsinghua University

Deputy Directors
Luming Li, Tsinghua University
Xuemin Xu, Shanghai Jiao Tong University
Hairong Zheng, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences

Members
Yilin Cao, The Ninth People’s Hospital Affiliated to Shanghai Jiao Tong University School of Medicine
Zhengtao Cao, Air Force Specialty Medical Center
Jin Chang, Tianjin University
Jianghua Chen, The First Affiliated Hospital, Zhejiang University School of Medicine
Weiyi Chen, Taiyuan University of Technology
Wenjuan Chen, Huawei Technologies Co., Ltd.
Hui Chi, Medical Information Research Institute of Chinese Academy of Medical Sciences
Yazhou Cui, Shandong Pharmaceutical Biotechnology Research Center
Jianrong Dai, Cancer Hospital, Chinese Academy of Medical Sciences
Yubo Fan, Beihang University
Jinxiang Han, Shandong First Medical University
Feilong Hei, Beijing Anzhen Hospital, Capital Medical University
Xiaotong Hou, Beijing Anzhen Hospital, Capital Medical University
Jingbo Kang, The Sixth Medical Center of PLA General Hospital
Deling Kong, Nankai University
Deyu Li, Beihang University
Jinsong Li, Zhejiang University
Zongjin Li, Nankai University
Hongen Liao, Tsinghua University
Kangping Lin, Chung Yuan Christian University
Chengyu Liu, Southeast University
Hongbing Lu, Air Force Medical University
Changsheng Ma, Beijing Anzhen Hospital, Capital Medical University
Dong Ming, Tianjin University
Hongwei Ouyang, Zhejiang University
Yingxin Qi, Shanghai Jiaotong University
Qizhu Tang, Wuhan University
Jie Tian, Key Laboratory of Molecular Imaging, Chinese Academy of Sciences
Suiren Wan, Southeast University
Tao Wan, The Second Military Medical University
Shulin Wu, Guangdong Provincial People’s Hospital
Yifei Wang, Jinan University
Zhibiao Wang, Chongqing Medical University
Mengyu Wei, University of Macau
Xunbin Wei, Peking University
Huayuan Yang, Shanghai University of Traditional Chinese Medicine
Dezhong Yao, University of Electronic Science and Technology of China
Ming Zhang, The Hong Kong Polytechnic University
Yiwu Zhao, Naton Medical Group
Changren Zhou, Jinan University

Organizing Committee

Director
Qingming Luo, Hainan University

Deputy Directors
Zhongze Gu, Southeast University
Xueqing Yu, Guangdong Provincial People’s Hospital
Qiang Zhang, Shanghai United Imaging Healthcare Co., Ltd.
Hairong Zheng, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences

Local Organizing Committee

Members
Xin Chen, Shenzhen University
Xun Chen, University of Science and Technology of China
Xiang Dong, Naton Medical Group
Jianzeng Dong, Beijing Anzhen Hospital, Capital Medical University
Qianjin Feng, Southern Medical University
Feng Fu, Air Force Medical University
Xingming Guo, Chongqing University
Gang Huang, Shanghai University of Medicine and Health Sciences
Baohua Ji, Zhejiang University
Linhong Ji, Tsinghua University
Hua Jiang, Zhejiang University
Xieyuan Jiang, Beijing Jishuitan Hospital
Xinquan Jiang, The Ninth People’s Affiliated Hospital of Shanghai Jiao Tong University School of Medicine
Yan Kang, Shenzhen Technology University
Xixiong Kang, Beijing Tiantan Hospital, Capital Medical University
Bin Li, Shanghai Sixth People’s Hospital Affiliated to Shanghai Jiaotong University
Changying Li, Beijing Aeonmed Co., Ltd.
Tao Li, Tianjin Third Central Hospital
Pengcheng Li, Huazhong University of Science and Technology
Jun Liang, The Second Affiliated Hospital of the Air Force Military Medical University
Peixue Ling, National Glycoengineering Research Center
Gang Liu, Xiamen University
Hui Liu, Chinese Academy of Medical Sciences
Lihua Liu, The Fourth Hospital of Hebei Medical University
Yajun Liu, Beijing Jishuitan Hospital of Capital Medical University
Zhicheng Liu, Capital Medical University
Aili Lu, Tsinghua University
Jiaxin Liu, Chinese Academy of Medical Sciences
Ling Lv, The Fourth School of Clinical Medicine of Nanjing Medical University
Zhenhe Ma, Northeast University
Chenxi Ouyang, Fuwai Hospital, Chinese Academy of Medical Sciences
Zhaolian Ouyang, Medical Information Research Institute of Chinese Academy of Medical Sciences
Yi Peng, Institute of Basic Medical Sciences, Chinese Academy of Medical Sciences
Fang Pu, Beihang University
Xianzheng Sha, China Medical University
Zhu Shen, Sichuan Academy of Medical Sciences/Sichuan Provincial People’s Hospital
Guosheng Wang, Henan Tuoren Medical Device Co., Ltd.
Shunren Xia, Hangzhou City University
Guimin Zhang, Lunan Pharmaceutical Group
Songgen Zhang, TINAVI Medical Technologies Co., Ltd.

International Organizing Committee

Members
Michael B. Flood, Locus Consulting, Australia
Luming Li, Tsinghua University, China
Hong’en Liao, Tsinghua University, China
Chengyu Liu, Southeast University, China
Jaw-Lin Wang, Taiwan University, Chinese Taipei
Chih-Chung Huang, Cheng Kung University, Chinese Taipei
Min Wang, University of Hong Kong, Hong Kong, SAP
Ming Zhang, The Hong Kong Polytechnic University, Hong Kong, SAP
Ichiro Sakuma, The University of Tokyo, Japan
Keiko Fukuta, Japanese Association for Clinical Engineers, Japan
Mang I. Vai, University of Macau, Macau, SAP
Peng Un Mak, University of Macau, Macau, SAP
Lodge Pun, University of Macau, Macau, SAP
Chulhong Kim, Pohang University of Science and Technology, South Korea
Suparerk Janjarasjitt, Ubon Ratchathani University, Thailand

Preface

The 12th IFMBE Asian-Pacific Conference on Medical and Biological Engineering (APCMBE2023) was held in Suzhou, China, from 18 to 21 May 2023. The conference was organized by the Chinese Society of Biomedical Engineering (CSBME) and endorsed by the International Federation for Medical and Biological Engineering (IFMBE). Aiming to bring together talent from medicine, industry, research, and education, APCMBE2023 focused on key fields and key technologies of biomedical engineering and promoted the integration of multiple disciplines. Special attention was paid to the frontiers of biomedical engineering, including medical artificial intelligence, neural engineering, medical imaging, computer-aided surgery, biosensors, rehabilitation engineering, medical informatics, biomechanics, and other hot topics and key issues.

The progress of biomedical engineering has provided strong support for the realization of translational medicine and personalized medicine based on interdisciplinary cooperation and information sharing. Improving medical standards and ensuring people’s health is a long journey and a highly challenging undertaking. We need to maintain an open, innovative, and cooperative spirit, jointly address the challenges, and promote the continuous progress and transformation of technology in biomedical engineering.

In total, we received 363 contributions, of which 181 were full-length scientific papers and the rest were short abstract submissions. Of these, 100 papers met the standards for publication in the Proceedings of APCMBE2023.

We, the local organizers, would like to thank IFMBE for its support in organizing APCMBE2023. Our thanks go to the members of the International Organizing Committee for their contribution. We extend our thanks to the organizers of topical sessions and the reviewers. They made the creation of these Proceedings possible by devoting their time and expertise to reviewing the received manuscripts and thus allowed us to maintain a high standard in selecting the papers for the Proceedings. And last, but certainly not least, we would like to thank the Springer Nature publishing company for its support and assistance in publishing these Proceedings.

Xuetao Cao
President, APCMBE2023

Contents

Computer-Aided Surgery

Inside-Out Accurate Head Tracking with Head-Mounted Augmented Reality Device
Haowei Li, Wenqing Yan, Yuxing Yang, Zhe Zhao, Hui Ding, and Guangzhi Wang

A Model-Guided Method for Ultrasound Probe Calibration
Jiasheng Zhao, Haowei Li, Sheng Yang, Chaoye Sui, Hui Ding, and Guangzhi Wang

Real-Time Medical Tool Runout Monitor Based on Dual Laser Displacement Sensors
Sheng Yang, Haowei Li, Hui Ding, and Guangzhi Wang

Correction of Premature Closure of Sagittal Suture with Small-Incision Traction Bow
Shanshan Du, Li Wen, Zhenmin Zhao, and Junchen Wang

A Home-Style Intelligent Monitoring Sanitize Robot
B. Liu, J. Yang, J. Ding, D. Zhao, and J. Wang

YOLOv7-Based Multiple Surgical Tool Localization and Detection in Laparoscopic Videos
Md Foysal Ahmed and Gang He

A Frequency-Based Analysis Method to Improve Adversarial Robustness of Neural Networks for EEG-Based Brain-Computer Interfaces
Sainan Zhang, Jian Wang, and Fang Chen

Robot-Assisted Optical Coherence Tomography for Automatic Wide-Field Scanning
Yangxi Li, Yingwei Fan, and Hongen Liao

Adversarial Detection and Defense for Medical Ultrasound Images: From a Frequency Perspective
Jian Wang, Sainan Zhang, Yanting Xie, Hongen Liao, and Fang Chen

A Novel Model-Independent Approach for Autonomous Retraction of Soft Tissue
Jiaqi Chen, Longfei Ma, Xinran Zhang, and Hongen Liao

A Soft Robot Based on Magnetic-Pneumatic Hybrid Actuation for Complex Environments
Zhuxiu Liao, Jiayuan Liu, Longfei Ma, and Hongen Liao

A VR Environment for Cervical Tumor Segmentation Through Three-Dimensional Spatial Interaction
Nan Zhang, Tianqi Huang, Jiayuan Liu, Yuqi Ji, Longfei Ma, Xinran Zhang, and Hongen Liao

An Image Fusion Method Combining the Advantages of Dual-Mode Optical Imaging in Endoscopy
Shipeng Zhang, Ye Fu, Xinran Zhang, Longfei Ma, Hui Zhang, Tianyu Xie, Zhe Zhao, and Hongen Liao

An End-to-End Spatial-Temporal Transformer Model for Surgical Action Triplet Recognition
Xiaoyang Zou, Derong Yu, Rong Tao, and Guoyan Zheng

2D/3D Reconstruction of Patient-Specific Surface Models and Uncertainty Estimation via Posterior Shape Models
Wenyuan Sun, Yuyun Zhao, Jihao Liu, and Guoyan Zheng

Semantics-Preserved Domain Adaptation with Target Diverse Perturbation and Test Ensembling for Image Segmentation
Xiaoru Gao, Runze Wang, Rong Tao, and Guoyan Zheng

Biomechanics

A New Mathematical Model for Assessment of Bleeding and Thrombotic Risk in Three Different Types of Clinical Ventricular Assist Devices
Yuan Li and Zengsheng Chen

Analysis of YAP1 Gene as a Potential Immune-Related Biomarker and Its Relationship with the TAZ Expression
Shan He, Rushuang Xu, Qing Luo, and Guanbin Song

Morphological Feature Recognition of Induced ADSCs Based on Deep Learning
Ke Yi, Cheng Xu, Guoqing Zhong, Zhiquan Ding, Guolong Zhang, Xiaohui Guan, Meiling Zhong, Guanghui Li, Nan Jiang, and Yuejin Zhang

Micromechanical Properties Investigation of Rabbit Carotid Aneurysms by Atomic Force Microscopy
Guixue Wang, Jingtao Wang, Xiangxiu Wang, Juhui Qiu, and Zhiyi Ye

The Development of the “Lab-In-Shoe” System Based on an Instrumented Footwear for High-Throughput Analysis of Gait Parameters
Ji Huang, Xin Ma, and Wen-Ming Chen

3D-Printed Insole Designs for Enhanced Pressure-Relief in Diabetic Foot Based on Functionally-Graded Stiffness Properties
Xingyu Zhang, Pengfei Chu, Xin Ma, and Wen-Ming Chen

A Novel Force Platform for Assessing Multidimensional Plantar Stresses in the Diabetic Foot—A Deep Learning-Based Decoupling Approach
Hu Luo, Xin Ma, and Wen-Ming Chen

MicroNano Bioengineering

A Nanoparticle Tracking Analysis Algorithm for Particle Size Estimation
Song Lang, Yanwei Zhang, Hanqing Zheng, and Yan Gong

Biomaterials

Self-adaptive Dual-Inducible Nanofibers Scaffolds for Tendon-To-Bone Interface Synchronous Regeneration
A. Haihan Gao, B. Liren Wang, C. Tonghe Zhu, D. Jinzhong Zhao, and E. Jia Jiang

Medical Informatics

A Multifunctional Image Processing Tool for CT Data Standardization
Yiwei Gao, Jinnan Hu, Peijun Hu, Chao Huang, and Jingsong Li

Effect of Schroth Exercise on Pulmonary Function and Exercise Capacity in Patients with Severe Adolescent Idiopathic Scoliosis
Wei Liu, Christina Zong-Hao Ma, Chang Liang Luo, Yu Ying Li, and Hui Dong Wu

An Imputation Approach to Electronic Medical Records Based on Time Series and Feature Association
Y. F. Yin, Z. W. Yuan, J. X. Yang, and X. J. Bao

A Software Tool for Anomaly Detection and Labeling of Ventilator Waveforms
Cheng Chen, Zunliang Wang, Chuang Chen, Xuan Wang, and Songqiao Liu

A Machine Learning Approach for Predicting the Time Point of Achieving a Negative Fluid Balance in Patients with Acute Respiratory Distress Syndrome
Haowen Lei, Zunliang Wang, and Songqiao Liu

3D Simulation Model for Urine Detection in Human Bladder by UWB Technology
Mengfei Jiang, Liping Qin, Hui Zhen, and Gangmin Ning

AI in Medicine

A Nearest Neighbor Propagation-Based Partial Label Learning Method for Identifying Biotypes of Psychiatric Disorders
Yuhui Du, Bo Li, Ju Niu, and Vince D. Calhoun

Predicting Timing of Starting Continuous Renal Replacement Therapy for Critically Ill Patients with Acute Kidney Injury Using LSTM Network Model
Chengyuan Li, Zunliang Wang, Lu Niu, and Songqiao Liu

An End-To-End Seizure Prediction Method Using Convolutional Neural Network and Transformer
Yiyuan Wang and Wenshan Zhao

Ensemble Feature Selection Method Using Similarity Measurement for EEG-Based Automatic Sleep Staging
Desheng Zhang and Wenshan Zhao

Biomedical Photonics

Rapid Virus Detection Using Recombinase Polymerase Amplification Assisted by Computational Amplicon-Complex Spectrum
F. Yang, Y. Su, F. G. Li, T. Q. Zhou, X. S. Wang, H. Li, S. L. Zhang, and R. X. Fu

Neuromodulation with Submillimeter Spatial Precision by Optoacoustic Fiber Emitter
Ninghui Shao and Hyeon Jeong Lee

Medical Laboratory Engineering

A Novel Poly(3-hexylthiophene) Microelectrode for Ascorbic Acid Monitoring During Brain Cytotoxic Edema
Zexuan Meng, Yuchan Zhang, Lu Yang, Shuang Zhao, Qiang Zhou, Jiajia Chen, Jiuxi Sui, Jian Wang, Lizhong Guo, Luyue Chang, Guixue Wang, and Guangchao Zang

Health Engineering

Radar Translator: Contactless Eyeblink Detection Assisting Basic Daily Intension Voice for the Paralyzed Aphasia Patient Using Bio-Radar
Fugui Qi, Jiani Li, Haoyang Yu, Haoxin Han, Wei Huang, Guohua Lu, and Jianqi Wang

Image Reconstruction Algorithm in Electrical Impedance Tomography Based on Improved CNN-RBF Model
Liyuan Zhang, Xuechao Liu, Lei Li, Feng Fu, Li Jin, and Bin Yang

An Explainable Assessment for Depression Detection Using Frontal EEG
Feifei Chen, Lulu Zhao, Licai Yang, Jianqing Li, and Chengyu Liu

Heart Murmur Detection in Phonocardiogram Signals Using Support Vector Machines
Foli Fan, Yuwei Zhang, Chenxi Yang, Jianqing Li, and Chengyu Liu

Computational Systems, Modeling and Simulation in Medicine, Multiscale Modeling and Synthetic Biology

SCpipeline: The Tool and Web Service for Identifying Potential Drug Targets Based on Single-Cell RNA Sequencing Data
Lu Lin, Qianghan Shao, Xiao Sun, and Hongde Liu

Study on the Detection of Vertigo Induced by GVS Based on EEG Signal Feature Binary Classification
Y. Geng and W. Xue

Therapeutic and Diagnostic Systems and Technologies

CT Images-Based Automatic Path Planning for Pedicle Screw Placement Incorporating Anatomical and Biomechanical Considerations
Xintong Yang, Yunning Wang, Yajun Liu, Xuquan Ji, Anyi Guo, Yan Hu, and Wenyong Liu

Reciprocal Unlocking Between Autoinhibitory CaMKII and Tiam1: A Simulation Study
Zhen Yu, Xiaonian Ji, Jiaqi Zuo, and Xiaodong Liu

Author Index

Computer-Aided Surgery

Inside-Out Accurate Head Tracking with Head-Mounted Augmented Reality Device

Haowei Li¹, Wenqing Yan², Yuxing Yang¹, Zhe Zhao³,⁴, Hui Ding¹, and Guangzhi Wang¹(B)

1 Department of Biomedical Engineering, Tsinghua University, Beijing, China
2 School of Medicine, Tsinghua University, Beijing, China
3 Department of Orthopaedics, Beijing Tsinghua Changgung Hospital, Beijing, China
4 School of Clinical Medicine, Tsinghua University, Beijing, China

Abstract. Objective: External ventricular drainage (EVD) is a widely used neurosurgical procedure whose accuracy and reproducibility are limited by free-hand operation. Augmented reality (AR) improves the puncture success rate by superimposing virtual paths on the operation area. However, the effectiveness of surgical guidance depends on tracking accuracy. This paper aims to achieve accurate and stable head tracking during EVD surgery. Methods: We propose a dynamic inside-out tracking method combining retro-reflective markers and point clouds. First, the built-in infrared depth sensor of HoloLens 2 is used to identify markers pasted on the patient's head for coarse registration of the preoperative images and the intraoperative patient. Real-time 3D point clouds and point-to-plane ICP registration are then used to further improve tracking accuracy and stability. Meanwhile, we calibrate and correct the depth distortion of the HoloLens 2 depth sensor on different materials, improving the accuracy of point cloud-based tracking methods. Results: The root mean square error (RMSE) of preoperative registration is less than 1.6 mm; the average RMSE of intraoperative head tracking is less than 1.28 mm. Meanwhile, the average angular tracking jitter is reduced by more than 40% when point clouds are integrated. The proposed method achieves tracking at 37.7 fps. Conclusion: The hybrid retro-reflective marker and point cloud tracking method in this paper can achieve high-precision real-time head tracking, providing the potential for accurate visual guidance in EVD surgery.

Keywords: Augmented Reality · Head Tracking · Point Cloud · Surface Reconstruction

Haowei Li and Wenqing Yan have contributed equally to this work.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024
G. Wang et al. (Eds.): APCMBE 2023, IFMBE Proceedings 104, pp. 3–9, 2024. https://doi.org/10.1007/978-3-031-51485-2_1

1 Introduction

EVD is a neurosurgical procedure widely used for acute hydrocephalus, intraventricular hemorrhage and intracranial hypertension [1, 2]. The accuracy of drainage tube insertion affects the probability of brain injury and complications, and thus determines the success of the operation. The traditional freehand puncture process uses surface feature points as a reference and relies strongly on the surgeon's experience [3], leading to low accuracy and reproducibility, a higher probability of complications and higher operative risk. Image-guided surgery uses infrared tools to track the patient and the surgical tools and provides visual guidance on 2D monitors. However, this leads to a complex system setup, distracts the surgeon's attention and impairs hand-eye coordination [4].

Augmented reality enhances the accuracy of neurosurgical operations by providing visual guidance as superimposed virtual images [5]. A core problem in this procedure is tracking the patient accurately with the AR device. Liebmann et al. used a pointer with an AprilTag for spine surface digitization [6], and then used point cloud registration for spine tracking. However, the tracking accuracy is restricted by the marker size and the sparsity of the point cloud. Kunz et al. used the infrared depth sensor of HoloLens for retro-reflective marker tracking [7], reaching submillimeter translation error, although this approach was later shown to perform relatively poorly in angular error [8]. Point cloud registration has been used for marker-less tracking, which can potentially provide more stable results. However, Gu et al. showed that a systematic depth error exists in the infrared depth sensor integrated in commercial AR devices [9], leading to large tracking errors.

To provide accurate and stable tracking results in EVD, this work proposes a method integrating retro-reflective marker tracking and point cloud registration for real-time accurate head tracking. A calibration and correction procedure is first carried out for the HoloLens 2 depth sensor on different materials. A surface reconstruction procedure is then proposed for accurate preoperative marker-patient registration. Finally, marker tracking and point cloud registration are used in sequence to provide real-time stable tracking.

2 Methods

This paper proposes a method that uses only the built-in depth sensor of HoloLens 2 to enable accurate and stable head tracking by combining infrared retro-reflective markers and point clouds (Fig. 1). To provide a larger field of view and higher resolution, the Articulated Hand Tracking (AHAT) mode of the HoloLens 2 depth sensor [10] is used to acquire data for both markers and point clouds. To take advantage of both methods, retro-reflective markers are pasted on the patient's head and registered with preoperative images. Intraoperatively, markers are used to provide coarse tracking results; point clouds are then used for pose enhancement. This section details the framework of the method, including tool definition and detection, depth camera undistortion, preoperative registration and dynamic head tracking. To demonstrate the capability of our method, we conducted a series of experiments on preoperative registration, intraoperative tracking and efficiency.

Fig. 1. Proposed EVD intraoperative head tracking framework.

2.1 Retro-reflective Tool Definition and Detection

Retro-reflective tapes are cut into circles and pasted on the patient's head (n ≥ 3) to provide coarse tracking results. A procedure consisting of marker detection, tool definition and tool detection is then used for coarse tracking. During marker detection, 2D marker centers are first recognized from the image with a threshold criterion and connected component detection. Given the camera intrinsic parameters and depth values, the marker positions in camera space C can be calculated as $X_i = d_i \, [x_i, y_i, 1]^T / \| [x_i, y_i, 1]^T \|_2$. Given a specific tool shape A, the difference between it and a marker frame can be evaluated with the rigid body fitting error. The geometric mean of this error over multiple frames is then used as the objective for tool shape optimization. Finally, the distances between different markers in a frame and the tool definition are used to recognize the tool from sensor data and then to calculate the tool pose $T_A^C$.

2.2 Camera Depth Undistortion

Gu et al. [9] showed that a systematic depth error exists in the AHAT camera, which introduces a large error into point cloud-based tracking methods. Meanwhile, because of different textures, colors and other surface characteristics, different materials may present different errors. To achieve higher tracking accuracy, a calibration and undistortion procedure is carried out on four materials: nylon, photopolymer, PC-ABS and PLA. Because the retro-reflective tracking method has been shown to present a low systematic error, the retro-reflective material is regarded as the reference material during calibration. As shown in Fig. 2a, a structure was designed to keep all materials on one plane. The structure was fixed at different depths and angles, and 250 frames of sensor data were collected at each position. The standard errors of the depth over 250 frames were used to describe the depth stability. Point clouds from the reference material were used to fit a reference plane, and the depth error δ of a given point was calculated as the distance from the point to the reference plane along the depth direction. First, a single point cloud showed that different materials behaved differently in both error and stability (Fig. 2b).
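The back-projection of Section 2.1 and the per-point depth error of Section 2.2 can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: `(x, y)` are taken to be normalized image coordinates, the AHAT value `d` is treated as a range along the viewing ray (as the normalization in the formula suggests), the reference plane is assumed to be already fitted and given as `n . X = offset`, and the sign convention (measured minus reference) is chosen so that subtracting the error corrects the measurement. The function names are hypothetical.

```python
import math

def back_project(x, y, d):
    """Camera-space point X = d * [x, y, 1] / ||[x, y, 1]||_2.

    (x, y): normalized image coordinates (assumption).
    d: range value measured along the viewing ray, in mm.
    """
    norm = math.sqrt(x * x + y * y + 1.0)
    return (d * x / norm, d * y / norm, d / norm)

def depth_error_along_ray(x, y, d, plane_normal, plane_offset):
    """Signed depth error of one pixel relative to a reference plane.

    The plane is n . X = plane_offset in camera space. The error is the
    measured range minus the range at which the same viewing ray meets
    the plane (assumed sign convention).
    """
    norm = math.sqrt(x * x + y * y + 1.0)
    ray = (x / norm, y / norm, 1.0 / norm)  # unit viewing ray
    denom = sum(n * r for n, r in zip(plane_normal, ray))
    if abs(denom) < 1e-12:
        raise ValueError("viewing ray is parallel to the reference plane")
    d_ref = plane_offset / denom  # range from the camera to the plane
    return d - d_ref

# Example: a fronto-parallel reference plane at z = 500 mm.
on_axis = depth_error_along_ray(0.0, 0.0, 511.04, (0.0, 0.0, 1.0), 500.0)
```

Averaging such per-pixel errors over a material patch and over the recorded frames would give one value per depth and angle, which is the kind of quantity the calibration reports.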
In the depth stability test, depth values on the retro-reflective material were shown to be more stable than those on other materials (p < 0.5). In terms of depth error, a significant difference was found between any two different materials at all depths and angles (p < 0.5), and a single material also behaved differently at different depths or angles (p < 0.5). To undistort the depth value, the average depth error at 15 different depths from 382 mm to 655 mm was used to represent the systematic depth error of a given material. As a result, nylon, PLA, photopolymer and PC-ABS presented depth errors of 11.04 mm, 25.79 mm, 23.91 mm and −6.63 mm, respectively.

Fig. 2. a Structure for AHAT depth calibration. b Single frame point cloud of the calibration structure at different views. c Instability of the depth value at different depths. d, e Depth value error of different materials at different angles and depths.

2.3 Preoperative Registration

During preoperative registration, multiple frames of sensor data are used to register the retro-reflective markers and the medical images. The tool shape A is first optimized and then used to calculate the relative pose between the markers and the AHAT camera in each frame. Depth values from the sensor are then corrected according to the target material as $d_i \leftarrow d_i - \delta$. Then, GPU-based TSDF [11] reconstruction is performed for the head surface. By registering this surface and the surface extracted from the images with the point-to-plane ICP algorithm, the spatial transformation from the preoperative image to the retro-reflective markers can be calculated as $T_I^A$.

2.4 Dynamic Head Tracking

The stability of retro-reflective marker tracking is limited by depth data quality and is relatively poor in rotation. To utilize both the speed of marker tracking and the stability of point cloud registration, markers are first used to obtain a coarse tracking result as $T_I^{C,\mathrm{coarse}} = T_A^C \, T_I^A$. Point-to-plane point cloud registration is then used to refine the pose and obtain the final tracking result $T_I^{C,\mathrm{fine}}$.
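The depth correction of Section 2.3 and the coarse pose of Section 2.4 can be illustrated as follows. This is a minimal sketch rather than the authors' code: the per-material offsets are the mean errors reported in Section 2.2, while the function names, the 4x4 row-major list representation of poses and the assumption of known point correspondences in the residual are all hypothetical choices made for illustration. The ICP refinement itself is omitted; only the point-to-plane residual that such a refinement would minimize is shown.

```python
# Mean systematic depth errors (mm) per material, as reported in Section 2.2.
DEPTH_ERROR_MM = {"nylon": 11.04, "PLA": 25.79, "photopolymer": 23.91, "PC-ABS": -6.63}

def correct_depth(d_mm, material):
    """Apply d <- d - delta for the given target material."""
    return d_mm - DEPTH_ERROR_MM[material]

def compose(T_left, T_right):
    """Product of two 4x4 homogeneous transforms (row-major nested lists)."""
    return [[sum(T_left[i][k] * T_right[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transform_point(T, p):
    """Apply a 4x4 homogeneous transform to a 3D point."""
    ph = (p[0], p[1], p[2], 1.0)
    return tuple(sum(T[i][k] * ph[k] for k in range(4)) for i in range(3))

def point_to_plane_residual(T, src, dst, dst_normal):
    """Signed residual n . (T src - dst) for one assumed correspondence."""
    moved = transform_point(T, src)
    return sum(n * (m - q) for n, m, q in zip(dst_normal, moved, dst))

# Illustrative poses: image-to-tool (90 degree turn about z plus a shift)
# and tool-to-camera (500 mm along the optical axis).
T_I_A = [[0.0, -1.0, 0.0, 10.0], [1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
T_A_C = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 500.0], [0.0, 0.0, 0.0, 1.0]]
T_I_C_coarse = compose(T_A_C, T_I_A)  # coarse image-to-camera pose
```

The order of the product matters: the image-to-tool transform is applied first and the tool-to-camera transform second, which is why the tool-to-camera matrix is the left factor.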

3 Experiments and Results

3.1 Preoperative Registration

To test the precision of preoperative registration, a head phantom was 3D printed in nylon. In each group, 3, 4, 5 or 6 markers were attached to the phantom and 1200 frames of sensor data were recorded for surface reconstruction and registration. As shown in Fig. 3a, b, the reconstructed surface provided more detail and was smoother than a single-frame point cloud. The RMSE of preoperative registration was smaller and the inlier point rate was higher when depth distortion was considered (Fig. 3f, g). When three markers were used, the RMSE was less than 1.6 mm. Meanwhile, the registration error tended to decrease as the number of markers increased, which may be due to the more stable tool poses provided by additional markers. Moreover, another experiment was conducted to reconstruct a more complete phantom surface with the guidance of 9 markers (Fig. 3e). The result showed that the reconstructed surface fit the real model surface better when depth distortion was considered, and that a large systematic error would exist in point cloud-based tracking if this distortion were neglected, because of the large gap between the reconstructed surface and the ground truth (GT).

3.2 Tracking Stability

To evaluate the effect of adding point cloud registration, another 1200 frames were collected while the HoloLens 2 and the phantom were kept static. Tracking results were calculated with both retro-reflective marker tracking and the proposed method. Tracking RMSEs and angular differences between frames were then calculated for evaluation. As shown in Fig. 3h, i, the proposed method was more stable than retro-reflective marker tracking in angular tracking under all conditions. The mean angular jitter was 0.61 mm when 3 markers were used for tracking. The mean angular tracking jitter decreased by more than 40% for all marker numbers. However, the proposed method presented higher mean tracking RMSEs (1.28 mm at most), which may be due to the higher depth instability on nylon compared with retro-reflective tape.

3.3 System Performance

The performance of the proposed method was further tested on an Intel 13900K CPU and an RTX 4090 GPU. Preoperative reconstruction and dynamic tracking reached 217.4 fps and 37.7 fps, respectively, demonstrating the potential for real-time tracking.
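The stability metrics of Section 3.2 can be computed along the following lines. The paper does not give the exact definitions, so this is a sketch under assumptions: angular jitter is taken as the mean rotation angle between consecutive frame poses of a static scene, and RMSE is the usual root mean square of per-point residuals. All function names are hypothetical.

```python
import math

def rot_z(deg):
    """Rotation matrix for a rotation of `deg` degrees about the z axis."""
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rotation_angle_deg(R_a, R_b):
    """Angle of the relative rotation R_a^T R_b, in degrees."""
    # trace(R_a^T R_b) equals the sum of element-wise products.
    trace = sum(R_a[i][j] * R_b[i][j] for i in range(3) for j in range(3))
    cos_theta = max(-1.0, min(1.0, (trace - 1.0) / 2.0))  # clamp rounding noise
    return math.degrees(math.acos(cos_theta))

def mean_angular_jitter(rotations):
    """Mean frame-to-frame rotation angle over a sequence of poses."""
    if len(rotations) < 2:
        raise ValueError("need at least two frames")
    diffs = [rotation_angle_deg(a, b) for a, b in zip(rotations, rotations[1:])]
    return sum(diffs) / len(diffs)

def rmse(residuals):
    """Root mean square of a list of residuals."""
    return math.sqrt(sum(r * r for r in residuals) / len(residuals))

def relative_reduction(baseline, proposed):
    """Fractional decrease of `proposed` with respect to `baseline`."""
    return (baseline - proposed) / baseline

# Synthetic example: three poses that differ by 1 and 2 degrees.
jitter = mean_angular_jitter([rot_z(0.0), rot_z(1.0), rot_z(3.0)])
```

With real data, `rotations` would hold the rotation part of the tracked pose in each of the 1200 static frames, once for marker-only tracking and once for the hybrid method, and `relative_reduction` would compare the two jitter values.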

4 Discussions

This paper proposes a method for accurate and stable inside-out head tracking during EVD surgery using an infrared depth sensor. According to our experiments, retro-reflective materials produced more stable depth values than other materials, which may ensure a relatively accurate pose when the number of markers is limited. Meanwhile, different materials were found to have different systematic depth errors compared with retro-reflective materials. The mean error over 15 different depths was used to correct the depth images. The reconstruction test showed that the depth information after correction fit the GT better (Fig. 3e). Preoperative registration showed an error of less than 1.6 mm and an inlier point rate higher than 99.8%, demonstrating high agreement between the reconstructed surface and the GT. A mean RMSE of less than 1.5 mm was found during dynamic tracking. It was also shown that the proposed method could decrease angular jitter by more than 40%, and thus could potentially provide more accurate tracking results during EVD surgery. Finally, the performance test showed that the method could run at more than 37 fps during dynamic tracking.

Fig. 3. a A frame of point cloud from AHAT camera. b Reconstructed head surface. c Preoperative registration result of reconstructed head surface and image. d Intraoperative registration result of single frame point cloud and image. e Comparison of reconstructed surface with and without depth undistortion. f, g Registration RMSE and inlier point ratio of reconstructed surface and surface from CT image. h, i Angular tracking jitter and RMSE of two different tracking methods under different retro-reflective marker quantities.

In comparison with previous work, the main contributions of our method are:
1. Correcting the depth error of the AR built-in sensor, which increases the accuracy of point cloud-based tracking.
2. Proposing a method for high-resolution, smooth surface reconstruction with the built-in AR sensor.
3. Integrating infrared markers and point clouds for more accurate and stable tracking.

Despite the promising tracking stability and accuracy of the proposed method, certain limitations exist. The mean depth error over 15 different depths was used to correct the depth images. However, the depth error was shown to vary with depth and angle, which led to local distortion during reconstruction (Fig. 3e). Meanwhile, different materials have different depth errors, so complex scenarios in which surfaces of multiple materials need to be reconstructed simultaneously may be difficult. Therefore, more studies are needed to better calibrate and correct the depth error before the sensor can provide more accurate information.

5 Conclusions

This paper proposes an inside-out head tracking method that uses the built-in depth sensor of a head-mounted mixed reality device, integrating retro-reflective marker detection and point cloud registration. The method enables stable, real-time head tracking without additional electronics, fixed rigid tracking tools or prior tool shape information. The proposed method showed a registration error of RMSE < 1.6 mm during the preoperative procedure. It was also shown that the method can decrease the intraoperative angular tracking jitter by over 40%. Therefore, the proposed tracking method can potentially provide stable and accurate real-time tracking information for visual guidance in EVD surgery.

Acknowledgments. This study is supported by NSFC (U20A20389), National Key R&D Program of China (2022YFC2405304), Tsinghua University Clinical Medicine Development Fund (10001020508) and Tsinghua ISRP (20197010009).

References

1. Srinivasan, V.M., O’Neill, B.R., Jho, D., et al.: The history of external ventricular drainage: Historical vignette. J. Neurosurg. 120(1), 228–236 (2014)
2. Kakarla, U.K., Kim, L.J., Chang, S.W., et al.: Safety and accuracy of bedside external ventricular drain placement. Neurosurgery 63(1 Suppl_1), ONS162–ONS167 (2008)
3. Pelargos, P.E., Nagasawa, D.T., Lagman, C., et al.: Utilizing virtual and augmented reality for educational and clinical enhancements in neurosurgery. J. Clin. Neurosci. 35, 1–4 (2017)
4. Fried, H.I., Nathan, B.R., Rowe, A.S., et al.: The insertion and management of external ventricular drains: an evidence-based consensus statement. Neurocrit. Care 24(1), 61–81 (2016)
5. Meola, A., Cutolo, F., Carbone, M., et al.: Augmented reality in neurosurgery: a systematic review. Neurosurg. Rev. 40(4), 537–548 (2017)
6. Liebmann, F., Roner, S., von Atzigen, M., et al.: Pedicle screw navigation using surface digitization on the Microsoft HoloLens. Int. J. Comput. Assist. Radiol. Surg. 14(7), 1157–1165 (2019)
7. Kunz, C., Maurer, P., Kees, F., et al.: Infrared marker tracking with the HoloLens for neurosurgical interventions. Curr. Dir. Biomed. Eng. 6(1) (2020)
8. Martin-Gomez, A., Li, H.W., Song, T.Y., et al.: STTAR: surgical tool tracking using off-the-shelf augmented reality head-mounted displays. IEEE Trans. Vis. Comput. GR (2023). https://doi.org/10.1109/TVCG.2023.3238309
9. Gu, W.H., Shah, K.J., Knopf, J., et al.: Feasibility of image-based augmented reality guidance of total shoulder arthroplasty using microsoft HoloLens 1. Comput. Meth. Biomech. Biomed. Eng. Imaging Vis. 9(3), 261–270 (2021)
10. Ungureanu, D., Bogo, F., Galliani, S., et al.: Hololens 2 research mode as a tool for computer vision research (2020). arXiv:2008.11239
11. Curless, B., Levoy, M.: A volumetric method for building complex models from range images, ACM. Proc. Annu. Conf. Comput. Graph. Interact. Tech. New Orleans, LA, USA 1996, 303–312 (1996)

A Model-Guided Method for Ultrasound Probe Calibration

Jiasheng Zhao, Haowei Li, Sheng Yang, Chaoye Sui, Hui Ding, and Guangzhi Wang(B)

Department of Biomedical Engineering, Tsinghua University, Beijing, China

Abstract. Objective: Ultrasound (US) probe calibration is critical for localized ultrasound systems. Key points and surfaces are often used for calibration, but their accuracy is restricted by the ultrasound volume effect. The aim of this paper is to accurately calibrate the US probe under the volume effect. Method: We present a model-guided ultrasound probe calibration method that provides accurate calibration results under the volume effect. First, we design a rotationally symmetric calibration phantom unit to provide image areas weakly affected by the volume effect during continuous scanning. Second, US images from the uncalibrated probe are used to reconstruct the 3D image volume. Finally, we use image registration and point registration for super-resolution unification of the US images, the model and the tracking device. Results: In multiple probe calibration experiments at different probe depths, the average calibration precision was 0.163 mm; in the needle tip tracking experiment, the average detection accuracy was 0.335 mm. Conclusion: Guided by the specially designed model, our method can achieve precise and accurate ultrasound probe calibration under the volume effect.

Keywords: Ultrasound probe calibration · Volume effect · Model-guided · 3D reconstruction

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024
G. Wang et al. (Eds.): APCMBE 2023, IFMBE Proceedings 104, pp. 10–17, 2024. https://doi.org/10.1007/978-3-031-51485-2_2

1 Introduction

Localized ultrasound systems combine an ultrasound probe with a tracking device (e.g., an optical tracker, a magnetic tracker or the end of a robot arm) to provide spatial information. This combination adds spatial information to ultrasound imaging, which is already temporally advantageous, and extends its application scenarios. In previous studies, localized ultrasound systems have been used to provide quantitative spatial information for ultrasound-guided punctures [1, 2] and 3D reconstructions [3–5].

Accurate US probe calibration is critical for obtaining the correct positions of US images. Existing calibration methods often use the positions of key points and lines in tracking space and image space. However, US images represent the accumulation of reflection signals within an area, leading to the volume effect, which generates artifacts near the key features and reduces the calibration accuracy.

Prager et al. [6] used a mechanical device to keep the US probe perpendicular to the calibration plane, in order to ensure that the plane appears at the correct position in the US images. This method requires a complex setup. Chen et al. [7] calculated the thickness of the acoustic beam from the artifact distribution and used it to further eliminate the error caused by the volume effect. However, this method is time-consuming and prone to human error. The N-line phantom is widely used in US probe calibration. Researchers [8, 9] have used various image-processing algorithms to automatically extract the intersection positions of the phantom and the US image planes from images that include volume artifacts. Wang et al. [10] integrated spatial positions from line phantoms and geometric features from the model's surface for higher calibration accuracy. However, this method is still strongly influenced by the volume effect.

To achieve accurate US probe calibration under the volume effect, we propose a model-guided calibration method that takes advantage of the artifact's shape. First, a model structure is designed to obtain image areas weakly affected by the volume effect during continuous scanning. US 3D reconstruction is then used to reconstruct the model with an uncalibrated US probe. After that, the 3D image is registered with the model surface to obtain super-resolution locations of the key features. Finally, the model is used to connect the phantom and the US image to achieve accurate calibration.

2 Methods

2.1 Calibration Model Design and General Ideas

The ultrasound volume effect expands key features of the target that are not perpendicular to the image plane and blurs the image. This phenomenon affects the precision with which key features can be extracted, thus reducing US probe calibration accuracy. Meanwhile, surfaces with different slopes have different volume artifact areas. Therefore, we propose a rotationally symmetrical calibration unit with continuously varying slopes. A rotationally specific calibration structure is then designed with multiple calibration units (Fig. 1a). As shown in Fig. 1b, during continuous US scanning of the calibration unit, the area of the artifact caused by the volume effect changes with the angle between the US image and the phantom surface. Generally, the artifact width in the normal direction of the edge increases when the probe moves from the center of the calibration unit to the margin. The artifact caused by the volume effect is minimal when the US plane passes through the center line of the calibration unit. Based on these features, we register the artifact-free model surface to the 3D reconstructed US image to realize super-resolution US feature extraction.

The complete workflow is shown in Fig. 2. The model is registered with the reconstructed US image and with the phantom, and serves as the bridge between the ultrasound image and the phantom for accurate calibration.

2.2 3D Reconstruction Based on an Uncalibrated Probe

A 6-degree-of-freedom tracking target M is rigidly fixed on the US probe, and an external tracking device W is used to track the probe and give the pose of the target as $T_M^W = (R_M^W, t_M^W)$. The US reconstruction is then completed by calculating the position in space W of every pixel in each frame.


Fig. 1. a Calibration phantom containing 3 * 3 basic units. b The areas influenced by volume effect in a cross-section. c Image acquired at the centerline of the calibration unit. d Image acquired at other locations.

Fig. 2. System working flow: a Calibration model. b Reconstructed 3D ultrasound image. c Points picked from the phantom. d Registration results of the points and the phantom. e The surface extracted from the model. f Registration result of the model surface and US image.

Given the pixel spacing $(s_x, s_y)$ of the US image, the position of pixel $(u_i, v_i)$ in the US probe coordinate system U can be calculated as $X_i = [u_i s_x, v_i s_y, 0]^T$. Therefore, the position of the ith pixel of the jth frame in space W can be represented as:

$$t_{i,j}^W = R_{M,j}^W \left( R_U^M X_{i,j} + t_U^M \right) + t_{M,j}^W \tag{1}$$

where $(R_U^M, t_U^M)$ is the spatial transformation between the US probe and the tracking target. Moreover, $R_U^M$ can be directly determined by the assembly structure.

For an uncalibrated US probe, we designed a value $f_{i,j}^W$, which has a form similar to that of $t_{i,j}^W$:

$$f_{i,j}^W = R_{M,j}^W R_U^M X_{i,j} + t_{M,j}^W \tag{2}$$

Using the properties of rigid spatial transformations, the difference between $t_{i,j}^W$ and $f_{i,j}^W$ can be calculated:

$$t_{i,j}^W - f_{i,j}^W = R_{M,j}^W t_U^M \tag{3}$$
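Equations (1)–(3) can be checked numerically. The following is a minimal sketch in plain Python, not the authors' implementation: the poses, the pixel spacing, and the translation `t_MU` are made-up values, and all function names are hypothetical. It illustrates that, when the probe rotation is held fixed, the uncalibrated position of Eq. (2) differs from the true position of Eq. (1) by the same vector $R_{M,j}^W t_U^M$ in every frame.

```python
import math

def rot_z(deg):
    """3x3 rotation about the z-axis (illustrative pose only)."""
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def matvec(r, v):
    """Multiply a 3x3 matrix by a 3-vector."""
    return [sum(r[i][k] * v[k] for k in range(3)) for i in range(3)]

def add(a, b):
    return [x + y for x, y in zip(a, b)]

def sub(a, b):
    return [x - y for x, y in zip(a, b)]

def pixel_to_probe(u, v, sx, sy):
    """X_i = [u*sx, v*sy, 0]^T in the probe coordinate system U."""
    return [u * sx, v * sy, 0.0]

def true_position(R_WM, t_WM, R_MU, t_MU, X):
    """Eq. (1): requires the (normally unknown) calibration translation t_MU."""
    return add(matvec(R_WM, add(matvec(R_MU, X), t_MU)), t_WM)

def uncalibrated_position(R_WM, t_WM, R_MU, X):
    """Eq. (2): same form as Eq. (1), but without t_MU."""
    return add(matvec(R_WM, matvec(R_MU, X)), t_WM)

# Assumed values, for illustration only.
R_WM = rot_z(30.0)        # probe rotation, held fixed during the sweep
R_MU = rot_z(-10.0)       # taken as known from the assembly structure
t_MU = [5.0, -2.0, 12.0]  # unknown in practice (mm)
frames = [[0.0, 0.0, 0.0], [0.0, 0.5, 0.0], [0.0, 1.0, 0.0]]  # t_WM per frame

offsets = []
for t_WM in frames:
    X = pixel_to_probe(120, 80, 0.1, 0.1)
    t = true_position(R_WM, t_WM, R_MU, t_MU, X)
    f = uncalibrated_position(R_WM, t_WM, R_MU, X)
    offsets.append(sub(t, f))

expected = matvec(R_WM, t_MU)  # right-hand side of Eq. (3)
```

Every entry of `offsets` equals `expected` up to rounding, which is why a purely translational sweep can be reconstructed in a shifted space without knowing `t_MU`.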

For a scanning process in which the rotation of the US probe $R_{M,j}^W$ does not change, the displacement of any pixel between its real 3D position and the position calculated with $f_{i,j}^W$ remains constant. Thus, we define a space $W'$ based on $f_{i,j}^W$, which differs from space W only by a translation: $t_{W'}^W = R_M^W t_U^M$. By converting the pixels in every frame to space $W'$, we can reconstruct the US image without probe calibration. To acquire US images with a fixed rotation, the probe is fixed on a robot arm for translational movement. After acquiring a series of 2D US images $I_j$, the 3D spatial positions of their pixels are calculated using Eq. (2) to form point clouds $\{P_j\}$ with intensity information. After that, voxel downsampling and Gaussian filtering are used to generate the 3D reconstructed US image $I^{3D}$ (Fig. 2b).

2.3 Image Registration

To provide super-resolution information for probe calibration, this section registers the model space I to the US reconstruction space $W'$ under the volume effect, using the reconstructed image $I^{3D}$ and the outer surface of the designed model. In the cross-sectional direction (Fig. 1b), during a scanning procedure in which the probe rotation is fixed, the width of the artifact caused by the volume effect is smallest when the US image crosses the centerline of the calibration unit, and increases when the probe moves to the margin. Meanwhile, the intensity of US images is higher near the true surface and decreases on both sides (Fig. 1d). The intensity outside the artifact area can be regarded as 0. Based on these volume artifact features, we first extract the outer surface of the model with morphological processing (Fig. 2e) and rigidly register the surface to the reconstructed image $I^{3D}$ based on Mattes mutual information [11, 12].
When the model surface is not completely aligned with the real surface in the US image, some areas of the model surface fall in the low-intensity artifact region or outside the artifact region of the US image, so the error of the similarity measurement increases. When they are perfectly aligned, the error of the similarity measurement reaches its minimum. Therefore, the relationship $T_I^{W'}$ between the model space I and the space $W'$ can be obtained by optimizing the Mattes mutual information value through stepwise iterations. At the same time, the size of the volume artifact varies at different depths of the probe, so the 3D rigid registration can take advantage of the global volume artifact properties to improve the registration accuracy.


For practical implementation, we use a model surface with a certain thickness for registration to address the effects of model-surface voxelization, US image resolution, and the various noise sources in real US images.

2.4 Probe Calibration

To finally calibrate the US probe, this section first registers the model space I to the tracking device space W. A tracking probe is first used to localize the coordinates of the conical tips, from which a coarse registration result $T_I^{W,coarse}$ can be calculated. After that, the tracking probe is further used to fetch surface points on the model. Using $T_I^{W,coarse}$ for initialization and the geometric mean of the minimum Euclidean distances from the fetched points to the model for evaluation, an iterative optimization method is used to obtain the transformation from the model to the tracking device, $T_I^W$ (Fig. 2d).

Finally, the transform between the tracking device space W and the space $W'$ can be calculated as $T_{W'}^W = T_I^W \left( T_I^{W'} \right)^{-1}$, and the US probe calibration result $t_U^M$ can therefore be calculated:

$$t_U^M = \left( R_M^W \right)^{-1} t_{W'}^W \tag{4}$$
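The final step can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: rigid transforms are represented as hypothetical `(R, t)` pairs, the two registration results are synthetic, and the rotation inverse is taken as the transpose, which holds for any rotation matrix.

```python
import math

def rot_z(deg):
    """3x3 rotation about the z-axis (illustrative pose only)."""
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def transpose(r):
    return [list(row) for row in zip(*r)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def matvec(r, v):
    return [sum(r[i][k] * v[k] for k in range(3)) for i in range(3)]

def invert(T):
    """Inverse of a rigid transform (R, t): (R^T, -R^T t)."""
    R, t = T
    Rt = transpose(R)
    return Rt, [-x for x in matvec(Rt, t)]

def compose(A, B):
    """Apply B first, then A: (Ra Rb, Ra tb + ta)."""
    (Ra, ta), (Rb, tb) = A, B
    return matmul(Ra, Rb), [x + y for x, y in zip(matvec(Ra, tb), ta)]

def calibration_translation(T_I_W, T_I_Wp, R_WM):
    """Eq. (4): t_U^M = (R_M^W)^-1 t_{W'}^W, with T_{W'}^W = T_I^W (T_I^{W'})^-1."""
    _, t_shift = compose(T_I_W, invert(T_I_Wp))
    return matvec(transpose(R_WM), t_shift)

# Synthetic registration results built from an assumed ground truth.
R_WM = rot_z(30.0)
t_MU_true = [5.0, -2.0, 12.0]
shift = matvec(R_WM, t_MU_true)           # offset between W' and W
R_I, t_I = rot_z(75.0), [10.0, 20.0, 30.0]
T_I_Wp = (R_I, t_I)                       # model -> W'
T_I_W = (R_I, [a + b for a, b in zip(t_I, shift)])  # model -> W

t_MU = calibration_translation(T_I_W, T_I_Wp, R_WM)
```

With noise-free inputs `t_MU` reproduces `t_MU_true`; with real registrations, the registration errors of both steps propagate directly into the recovered translation.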

3 Experiments and Results

To verify the performance of the proposed ultrasound probe calibration method, we fabricated the designed calibration phantom in resin using stereolithography 3D printing. The model was placed in a water tank kept constantly at 37 °C to simulate the speed of sound in the human body [9]. The localized ultrasound system included a 2D ultrasound system (Mindray DC8 with probe L12-3E), a magnetic tracking device (Northern Digital Incorporated Aurora®), and a graphics computing server (Intel® Core™ i9-10900X CPU @ 3.70 GHz × 20, Ubuntu 20.04.5 LTS). The ultrasound image was acquired on the graphics server via a video capture card. To resolve the time asynchrony between the ultrasound image and the tracking information from the different devices, we used a method based on plane imaging and periodic motion [9] for time calibration. The time delay was measured to be approximately 106 ms.

The precision of the calibration results was verified first. Four sets of calibration experiments were conducted at probe depths of 30 mm, 35 mm, and 40 mm, and the root mean square error (RMSE) of the calibrated translation was calculated on each axis to evaluate the calibration precision. The results are shown in Table 1, where the precision ranges from 0.14 mm to 0.19 mm.

Table 1. Probe calibration precision: translational RMSE (mm)

| Probe depth | X-direction | Y-direction | Z-direction | Space |
|---|---|---|---|---|
| 30 mm | 0.010 | 0.012 | 0.044 | 0.190 |
| 35 mm | 0.153 | 0.026 | 0.018 | 0.150 |
| 40 mm | 0.008 | 0.047 | 0.037 | 0.148 |
| All | 0.092 | 0.134 | 0.048 | 0.163 |

After the calibration procedure, feature point picking errors were used to evaluate the accuracy of the calibration results. A wooden needle tip was used as the test target. A magnetic tracking probe was used to pick the tip position 10 times, and the mean value was used as the experiment’s ground truth (GT). The positions of the needle tip were extracted from the US images at each probe depth, and the 3D spatial positions were then calculated based on the calibration results. The standard deviation of the picked points on each axis and the Euclidean distance from these points to the GT were used to evaluate the accuracy. In this experiment, the needle tip picking with US was repeated twice for each probe depth (30 mm, 35 mm, and 40 mm). The results are shown in Table 2. The average point-picking accuracy was 0.335 mm over all six experiments.

Table 2. Probe calibration accuracy: error (mm)

| X-direction | Y-direction | Z-direction | Space |
|---|---|---|---|
| 0.223 | 0.168 | 0.142 | 0.335 |
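The two kinds of metric reported in Tables 1 and 2 can be computed along the following lines. The text does not give the exact formulas, so this sketch rests on assumptions: precision is taken as the RMSE of repeated calibration translations about their mean, and accuracy as the mean Euclidean distance from the picked points to the ground truth. Function names and sample values are hypothetical.

```python
import math

def per_axis_rmse(samples):
    """RMSE of each coordinate about the sample mean, plus the spatial RMSE.

    Assumption: deviations are measured from the mean of the repeated results.
    """
    n = len(samples)
    mean = [sum(p[a] for p in samples) / n for a in range(3)]
    axis = [math.sqrt(sum((p[a] - mean[a]) ** 2 for p in samples) / n)
            for a in range(3)]
    spatial = math.sqrt(sum(math.dist(p, mean) ** 2 for p in samples) / n)
    return axis, spatial

def mean_distance_to_gt(points, gt):
    """Mean Euclidean distance from the picked points to the ground truth."""
    return sum(math.dist(p, gt) for p in points) / len(points)

# Made-up calibration translations (mm) from repeated runs.
runs = [(10.0, 5.0, 2.0), (10.2, 5.0, 2.0), (10.1, 5.1, 2.0), (10.1, 4.9, 2.0)]
axis_rmse, spatial_rmse = per_axis_rmse(runs)
```

Under this definition the spatial RMSE is the root of the summed squared per-axis values, so it can never be smaller than any single axis.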

4 Discussions and Conclusions

This paper proposes a model-guided ultrasound probe calibration method, which makes full use of the variations of the artifacts caused by the volume effect under different conditions. A specially designed model was used as a feature guide connecting ultrasound images and the phantom for probe calibration. The calibration precision experiment showed that the calibration results for the three probe depths have translational errors of 0.190 mm, 0.150 mm and 0.148 mm, respectively. This result indicates that the proposed calibration method is highly reproducible. Although calibration results from different depths showed high accordance (Fig. 3a), systematic differences exist between results at different probe depths (Fig. 3b). This may be due to morphological differences in volume artifacts at different depths. The point-picking experiment showed that the pixel localization accuracy reached 0.335 mm in practical use, which is an improvement compared to relevant works, as shown in Table 3. Moreover, the proposed method is not restricted to setups in which a magnetic tracking system provides the tracking information. It can be extended to situations where optical tracking devices or robot arms are used to provide the pose of the probe.

Fig. 3. a Vector diagram of calibration results. b Vector diagram of relative error of multiple calibration results.

Table 3. Comparison of results of relevant works

| Method | Precision (mm) | Accuracy (mm) |
|---|---|---|
| Prager [6] | 0.92 | |
| Chen [7] | | 0.72 |
| Chen [8] | | 0.66 |
| Shen [9] | 0.896 | 1.022 |
| Wang [10] | | 0.759 |
| Pu [13] | | 0.694 |
| Our Method | 0.163 | 0.335 |

Compared to previous works, the main contributions of the proposed method are:

1. Designing a model structure that makes use of the shape of the volume artifacts to achieve accurate registration.
2. Providing an ultrasound 3D reconstruction method based on an uncalibrated probe.
3. Using a 3D model to connect ultrasound images and the phantom to achieve super-resolution calibration.

Although the proposed method can effectively account for the characteristics of the volume effect and reduce its influence on the accuracy of US probe calibration, certain limitations exist. First, the method requires the rotation of the US probe to be fixed during data acquisition and needs the assistance of a robot arm or a multi-axis translation structure, leading to a relatively complicated calibration. In addition, the volume effect varies with probe depth and probe frequency, which may impact the registration accuracy of the proposed method. Further studies are needed to achieve accurate and convenient probe calibration under different conditions.

Acknowledgment. This work was supported by National Key R&D Program of China (2022YFC2405304), NSFC (U20A20389), Tsinghua University Initiative Scientific Research Program (20197010009) and a grant from the Guoqiang Institute, Tsinghua University.


References

1. Chen, K., Li, Z., Li, L., et al.: Three dimensional ultrasound guided percutaneous renal puncture: a phantom study. In: Proceedings of 2012 IEEE-EMBS International Conference on Biomedical Health Information. IEEE, pp. 683–686 (2012)
2. März, K., Franz, A.M., Seitel, A., et al.: Interventional real-time ultrasound imaging with an integrated electromagnetic field generator. Int. J. Comput. Assist. Radiol. Surg. 9(5), 759–768 (2014)
3. Zhou, X., Papadopoulou, V., Leow, C.H., et al.: 3-D flow reconstruction using divergence-free interpolation of multiple 2-D contrast-enhanced ultrasound particle imaging velocimetry measurements. Ultrasound Med. Biol. 45(3), 795–810 (2019)
4. Dai, X., Lei, Y., Wang, T., et al.: Self-supervised learning for accelerated 3D high-resolution ultrasound imaging. Med. Phys. 48(7), 3916–3926 (2021)
5. Song, S., Huang, Y., Li, J., et al.: Development of implicit representation method for freehand 3D ultrasound image reconstruction of carotid vessel. In: 2022 IEEE International Ultrasonic Symposium. IEEE, pp. 1–4 (2022)
6. Prager, R.W., Rohling, R.N., Gee, A., et al.: Rapid calibration for 3-D freehand ultrasound. Ultrasound Med. Biol. 24(6), 855–869 (1998)
7. Chen, T.K., Ellis, R.E., Abolmaesumi, P.: Improvement of freehand ultrasound calibration accuracy using the elevation beamwidth profile. Ultrasound Med. Biol. 37(8), 1314–1326 (2011)
8. Chen, T.K., Thurston, A.D., Ellis, R.E., et al.: A real-time freehand ultrasound calibration system with automatic accuracy feedback and control. Ultrasound Med. Biol. 35(1), 79–93 (2009)
9. Shen, C., Lyu, L., Wang, G., et al.: A method for ultrasound probe calibration based on arbitrary wire phantom. Cogent Eng. 6(1), 1592739 (2019)
10. Wang, K.-J., Chen, C.-H., Lo, C.-Y., et al.: Ultrasound calibration for dual-armed surgical navigation system. J. Healthc. Eng. (2022)
11. Rahunathan, S., Stredney, D., Schmalbrock, P., et al.: Image registration using rigid registration and maximization of mutual information. In: 13th Annual Med Meets Virtual Reality Conference (2005)
12. Mattes, D., Haynor, D.R., Vesselle, H., et al.: Nonrigid multimodality image registration. Med. Imaging 2001: Image Process. 4322, SPIE, 1609–1620 (2001)
13. Pu, G., Jiang, S., Yang, Z., et al.: A novel ultrasound probe calibration method for multimodal image guidance of needle placement in cervical cancer brachytherapy. Phys. Medica 100, 81–89 (2022)

Real-Time Medical Tool Runout Monitor Based on Dual Laser Displacement Sensors

Sheng Yang, Haowei Li, Hui Ding, and Guangzhi Wang(B)

Department of Biomedical Engineering, Tsinghua University, Beijing, China
[email protected]

Abstract. Objective: Robot-assisted drilling and milling are widely used in bone surgery, but their precision and safety are limited by tool bit runout. This paper aims to achieve accurate real-time monitoring of the runout status to support safer surgeries. Methods: First, a rigid tool runout model is constructed. An orthogonal dual laser tracking structure and the relevant calibration methods are then proposed. Finally, an algorithm is proposed that uses the tool model and the tracking structure to track the tool tip under eccentric load. Results: The experiment under no load indicated a high accordance between the model and the measurement. The accuracy tests showed relative errors of 8.11% and 14.96% on x and y, respectively, while the precision tests showed absolute errors of 5.5 μm and 6.2 μm. Moreover, the method was able to differentiate the stages of drilling under eccentric loads. The calculation efficiency reaches 1736 fps. Conclusion: The proposed method can provide accurate real-time position status of the surgical drilling tool, showing the potential to improve the safety of surgeries.

Keywords: Robotic Surgery · Drilling · Runout · Tracking

1 Introduction

Robot-assisted drilling and milling are widely used in various orthopedic procedures such as hip replacement, intervertebral fusion, and cochlear implantation [1, 2]. Different surgical drilling tools are used to create surgical channels, facilitating implant placement and other procedures. The stability of rotating cutting tools is essential for safe and precise surgery. However, because of deviations introduced when setting up the tool and the eccentric load that arises when working on a rugged surface, the tools are deflected during surgery, with an amplitude related to various factors including tool length, rotation speed, drilling depth, and tool shape [3]. In some surgeries, long drills are used, which are prone to serious runout, potentially threatening vessels and nerves [4]. The runout effect also increases cutting force, affects surface smoothness and increases heat generation [5]. Therefore, tool runout significantly impacts the precision and safety of the surgery. By monitoring tool runout, the safety of the execution can be evaluated in real time and fed back to the robot control parameters, which is significant for the safe execution of robot-assisted orthopedic surgeries.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024 G. Wang et al. (Eds.): APCMBE 2023, IFMBE Proceedings 104, pp. 18–25, 2024. https://doi.org/10.1007/978-3-031-51485-2_3


During manual drilling and milling, surgeons adjust the operation according to feel and experience. Researchers in machining optimize the tools’ structure by modeling to reduce runout [6, 7], but real-time runout monitoring is not considered. In robotic surgeries, some researchers use vibration sensors to detect tool wear and bone breakage [8, 9]. Others use force sensors to identify bone type during drilling to assess the risk of breakthrough [9, 10]. However, little research has tracked the tool tip directly. To provide effective tool runout information in robot-assisted surgeries, we propose an orthogonal laser tool tip tracking method. The medical tool runout is first modeled under eccentric load. A dual laser displacement sensor tracking structure is then designed and calibrated. An algorithm is finally designed for real-time tool tip tracking. The proposed method is tested in different scenarios to demonstrate its capability for accurate tool monitoring.

2 Methods

2.1 Error Model

Fig. 1. a Tool rotation model with the shaft offset. b Simplified model of the tool. c, d Schematic diagram of the shaft section at different depths.

This section models the drilling tool under an eccentric load during surgery. Compared to bones, drilling tools have higher stiffness and can be considered rigid, and most of the tools are cylindrical. The runout of the tool mainly comes from the setup procedure and from the tool-holder radial deformation caused by the cutting force during execution. The first source includes errors from the non-parallelism between the tool and the holder and from radial displacement. In most cases, the connection between the tool and the holder is well adjusted, leading to little error. However, tool-holder radial deformation caused by eccentric load leads to non-parallelism between the tool and the holder, contributing most to the runout effect. As shown in Fig. 1a, b, the drilling tool under runout can be idealized as a rigid body with a fixed endpoint and a deflection error θ. In different execution states, the tool rotates differently due to the cutting force.


2.2 Dual Laser Displacement Sensors Tracking Structure and Calibration Method

An orthogonal dual laser tracking structure is proposed to realize tool monitoring (Fig. 2a). Two laser displacement sensors are fixed to the motor orthogonally at the same height by a mounting structure. By calculating the tool deflection angle from the sensor data, the tool tip position can be estimated.

The laser displacement sensors measure the closest distances from the sensors to the tool, $s_{x,0}$ and $s_{y,0}$, each in its own sensor’s coordinate. To unify the coordinates of the sensors, their origin positions are first calibrated. As shown in Fig. 2b, a 3-dimensional micro-positioning platform is used to move a cylindrical pin gauge with radius r horizontally near the detection area. The assembly structure guarantees the parallelism between the moving directions and the sensors. First, the pin gauge is moved along the x-axis until the y-axis sensor reaches its minimum value $s_y$; then, it is moved along the y-axis until the x-axis sensor reaches its minimum value $s_x$. Finally, the offset of each sensor’s origin can be calculated as $\delta_x = s_x - r$, $\delta_y = s_y - r$. The distances detected by the sensors in the unified coordinate in the x and y directions can therefore be represented as $s_x = s_{x,0} - \delta_x$, $s_y = s_{y,0} - \delta_y$.
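The origin calibration amounts to two subtractions per sensor. The sketch below assumes, by symmetry, that both offsets are obtained the same way (minimum reading on the pin gauge minus the gauge radius); the readings and the gauge radius are made-up numbers, and the function names are hypothetical.

```python
def origin_offsets(s_min_x, s_min_y, r):
    """Offset of each sensor origin: minimum pin-gauge reading minus gauge radius."""
    return s_min_x - r, s_min_y - r

def to_unified(s_x0, s_y0, delta):
    """Map raw sensor readings (s_x0, s_y0) into the shared coordinate frame."""
    return s_x0 - delta[0], s_y0 - delta[1]

# Made-up calibration readings (mm) on a pin gauge of radius 2 mm.
delta = origin_offsets(1.398, 2.384, 2.0)

# With the gauge still centred, both unified readings equal the gauge radius.
s_x, s_y = to_unified(1.398, 2.384, delta)
```

After calibration, a cylinder of radius r centred on the origin reads r on both sensors, which is the property the later ellipse equations rely on.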

Fig. 2. a Dual laser displacement sensors tracking structure. b Origin calibration procedure of the tracking structure.

To estimate the tool tip position from sensor data, the distance $l_a$ from the laser displacement sensors to the tool-holder connection point and the distance $l_t$ from the tool tip to the connection point need to be calibrated (Fig. 1b). The distance between the sensors and the tool tip, $l_t - l_a$, can be measured directly. By generating two different relative displacements between the tool tip and the motor, the relative movement can be calculated at the tool tip as $\sigma_t$ and at the laser sensors as $\sigma_a$. Finally, the lengths $l_a$ and $l_t$ can be solved separately with $\sigma_t / \sigma_a = l_t / l_a$.

2.3 Cutting Tool Tip Position Measurement Method

As shown in Fig. 3a, the cross-section of the tool on the plane of the laser displacement sensors is an ellipse when deflection exists. The deflection angle θ can be determined from $\tan\theta = \|X_0 - X_c\|_2 / l_a$, where $X_0$ represents the rotation center and $X_c$ represents the center of the tool. The focal distance of the ellipse can be calculated as $|F_1 F_2| = 2 r_t \tan\theta$, where $r_t$ is the radius of the tool. Therefore, the positions of the foci are $F_1, F_2 = (1 \pm k) X_c \mp k X_0$, where $k = r_t / l_a$, so that $|F_1 F_2| = 2k \|X_0 - X_c\|_2 = 2 r_t \tan\theta$. The measured points from the sensors, $S_X(s_x, 0)$ and $S_Y(0, s_y)$, lie on the boundary of the ellipse. Therefore, $X_c$ and the radial deviation $X_{\delta a} = X_c - X_0$ can be solved from the following equations:

$$\begin{cases} \|F_1 - S_X\|_2 + \|F_2 - S_X\|_2 = 2 r_t \\ \|F_1 - S_Y\|_2 + \|F_2 - S_Y\|_2 = 2 r_t \end{cases} \tag{1}$$

Finally, the runout of the tool tip can be calculated as $X_\delta = l_t \cdot X_{\delta a} / l_a$.

Fig. 3. a Shape of the cross-section when the tool is deflected. b The relationship between the tool and the cross-section.

To determine the rotation center $X_0$ of the tool, sensor data $\{(s_{x,i}^0, s_{y,i}^0)\}$ was collected under no load. First, the cross-sections of the tool are idealized as circles to calculate the tool centers $\{X_{c,i}^0\}$ and fit the rotation center $X_0^0$. After that, new tool centers $\{X_{c,i}^{k+1}\}$ are calculated at every iteration based on the previous rotation center $X_0^k$, and a new rotation center $X_0^{k+1}$ is fitted. The iterative procedure runs until the change of the center position is lower than a certain threshold: $\|X_0^k - X_0^{k+1}\|_2 < \delta$. The final rotation center is obtained as $X_0 = X_0^k$.
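The pair of focal-sum equations has no convenient closed form, so a numerical solve is one option. The sketch below is an illustration rather than the authors' algorithm: it uses a two-variable Newton iteration with a finite-difference Jacobian, takes $k = r_t / l_a$ (which follows from the focal distance $2 r_t \tan\theta$), keeps the right-hand side $2 r_t$ as written in the text, and assumes the rotation center `x0` is already known. All function names are hypothetical.

```python
import math

def foci(xc, x0, k):
    """Foci F1, F2 = (1 ± k) Xc ∓ k X0 of the elliptical cross-section."""
    dx, dy = xc[0] - x0[0], xc[1] - x0[1]
    return (xc[0] + k * dx, xc[1] + k * dy), (xc[0] - k * dx, xc[1] - k * dy)

def focal_sum(p, xc, x0, k):
    """Sum of the distances from point p to the two foci."""
    f1, f2 = foci(xc, x0, k)
    return math.dist(f1, p) + math.dist(f2, p)

def residuals(xc, x0, sx, sy, rt, la):
    """Left side minus right side of the two focal-sum equations."""
    k = rt / la
    return (focal_sum((sx, 0.0), xc, x0, k) - 2.0 * rt,
            focal_sum((0.0, sy), xc, x0, k) - 2.0 * rt)

def solve_center(x0, sx, sy, rt, la, iters=50, tol=1e-12, h=1e-7):
    """Newton iteration on the tool center Xc = (cx, cy)."""
    cx, cy = sx - rt, sy - rt  # rough initial guess: near-circular section
    for _ in range(iters):
        r0 = residuals((cx, cy), x0, sx, sy, rt, la)
        rx = residuals((cx + h, cy), x0, sx, sy, rt, la)
        ry = residuals((cx, cy + h), x0, sx, sy, rt, la)
        j11, j12 = (rx[0] - r0[0]) / h, (ry[0] - r0[0]) / h
        j21, j22 = (rx[1] - r0[1]) / h, (ry[1] - r0[1]) / h
        det = j11 * j22 - j12 * j21
        dx = (j22 * r0[0] - j12 * r0[1]) / det  # solve J d = r (Cramer's rule)
        dy = (j11 * r0[1] - j21 * r0[0]) / det
        cx, cy = cx - dx, cy - dy
        if math.hypot(dx, dy) < tol:
            break
    return cx, cy

def tip_runout(xc, x0, lt_over_la):
    """X_delta = (l_t / l_a) * (Xc - X0)."""
    return (lt_over_la * (xc[0] - x0[0]), lt_over_la * (xc[1] - x0[1]))
```

The two equations admit a second, mirrored solution on the far side of the measured points; starting from `(sx - rt, sy - rt)` keeps the iteration on the physically meaningful branch for small deflections.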

3 Experiments and Results

3.1 Experimental Setup

In the experiment, Panasonic HG-C1030 laser displacement sensors (measurement range: ±5 mm, beam diameter: 50 μm, configured as analog output) were used for tracking. The sensor signals were sampled by an analog-to-digital converter (AD7606, 16-bit, 1 kHz sampling rate). The motor used for the experiments was a Maxon EC32 (maximum speed: 11000 rpm). The tools were a twist drill (diameter: 4 mm, length: 74 mm) and an endmill (diameter: 4 mm, length: 50 mm). The tool holder was an ER11 flexible chuck. A three-axis motion mechanism was used as the motion actuator. Sensor calibration was executed 5 times, giving $\delta_x = -0.602 \pm 0.105$ mm and $\delta_y = 0.384 \pm 0.097$ mm.

3.2 Stability and Precision

First, runout data of the twist drill under no load and at different rotation speeds were recorded to evaluate the stability of the calibration. As shown in Fig. 4b, the drill center moved in a circle around the rotation center, in accordance with the proposed model. When the rotation speed increased from 10% to 100% in steps of 10%, the calibrated rotation center remained stable, with differences lower than 0.01 mm on both the x and y axes between any two test conditions (Fig. 4a). The average calibration result over the 10 rotation speeds was $X_0 = (-0.71\ \text{mm}, 0.45\ \text{mm})$.

Fig. 4. a Calibrated rotation center under no load when rotation speeds increase from 10% to 100%. b Recorded tool centers at 50% rotation speed. c, d Tool tip positions estimated with the proposed method when the tool tip is moved 0.1 mm each time.

To evaluate the tracking accuracy of the tool tip, a milling cutter was fixed on the motor, and the tip was forced to move along the x and y axes by a micro-positioning platform. The offset increased from 0.1 mm to 0.8 mm in increments of 0.1 mm. The estimated tool tip movement was compared to that provided by the micro-positioning platform to evaluate the tracking accuracy and precision. The milling tool was calibrated before the experiment using the method proposed in Sect. 2.2 by forcing the tool tip to perform movements of 0.1 mm and 0.9 mm. As a result, the tool parameters were $l_a = 12.945$ mm and $l_t / l_a = 5.232$. As shown in Fig. 4c, d, the proposed tracking method presented relative errors of 8.11% and 14.96% in accuracy along the x and y axes, respectively. The standard error of the detected tip position along the targeting axis was used to evaluate tracking precision, and was 5.5 μm and 6.2 μm, respectively. In general, the proposed method presented high accuracy and precision. A comparatively large error was found on the y axis, which may be due to a small relative movement between the milling tool and the micro-positioning platform during the experiment.
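The reported tool parameters follow from the two relations of Sect. 2.2, $l_t - l_a = d$ (the directly measured sensor-to-tip distance) and $l_t / l_a = \sigma_t / \sigma_a$. A minimal sketch of that solve is shown below; the function name is hypothetical, and the distance used in the example is back-computed from the reported $l_a$ and ratio rather than taken from the text.

```python
def solve_lengths(tip_to_sensor, sigma_t, sigma_a):
    """Solve l_a and l_t from l_t - l_a = tip_to_sensor and l_t / l_a = sigma_t / sigma_a."""
    ratio = sigma_t / sigma_a
    if ratio <= 1.0:
        raise ValueError("the tip must move more than the sensor plane")
    la = tip_to_sensor / (ratio - 1.0)
    return la, ratio * la

# Illustrative input: d chosen so that the reported values are reproduced.
d = 12.945 * (5.232 - 1.0)               # about 54.78 mm, an assumed distance
la, lt = solve_lengths(d, 5.232, 1.0)    # la close to 12.945 mm, lt close to 67.73 mm
```

Because the tip runout is obtained by multiplying the deviation at the sensor plane by $l_t / l_a$, any sensor-plane error is amplified by the same factor, roughly five for this tool.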


3.3 Use Case: Drilling

To evaluate the capability of the proposed method to monitor tool status during execution, a drilling experiment was carried out under different eccentric loads. As shown in Fig. 5a, a bull barrel bone was used for the drilling test. The drilling process was performed along the z axis, and three entry points with different slopes were picked to generate different eccentric loads.

Fig. 5. a Drilling on bull barrel bone at different eccentric loads. Rotation speed 50%, forward speed 0.5 mm/s. b–d Tip positions measured by the proposed method at drilling paths 1, 2 and 3.

The drill was first calibrated as $l_t / l_a = 6.08$. Figure 5b–d displays the tool tip movement during the drilling procedure. Three stages can be observed during drilling. Before the drill contacted the bone, the tool tip moved in a circle around the rotation center. When the drill contacted the bone, a radial offset appeared, whose amplitude was influenced by the strength of the eccentric load. When drilling at a larger slope, the non-uniform cutting force led to a larger tool tip offset. In the case with the largest slope, the offset was close to 1 mm (Fig. 5d), which would threaten the safety of the surgery. When the drill entered the bone, the deflection was restricted. In general, the proposed method could differentiate the drilling status both across the stages of execution and across different external drilling conditions. Therefore, it has the potential to provide dynamic tool information during surgeries.

3.4 System Performance

A graphics computing server (Intel® Core™ i7-7700 CPU @ 3.60 GHz, Ubuntu 20.04.5 LTS) was used to verify the efficiency of the proposed method. The average time consumed per frame was 0.576 ms (1736 fps) in a test of 100,000 frames, a processing rate higher than the 1 kHz sample rate.
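As a quick check of the real-time claim, the reported per-frame time can be converted to a frame rate and compared with the 1 ms sampling period; the margin figure below is derived here and is not stated in the text.

```latex
\frac{1}{0.576\ \text{ms}} \approx 1736\ \text{frames/s},
\qquad
\frac{0.576\ \text{ms}}{1\ \text{ms}} \approx 58\%\ \text{of the sampling period at } 1\ \text{kHz}
```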


4 Discussions and Conclusions

This study proposes a method using orthogonal dual laser displacement sensors to track the tool tip position for robot-assisted drilling and milling surgeries. The experiment under no load showed a high accordance between the proposed model and the measured data. Discrepancies were less than 0.01 mm under different conditions, indicating the high stability of the calibration method. The tracking precision was 5.5 μm and 6.2 μm along the x and y axes when a sub-millimeter tool tip offset existed, while the relative tracking errors were 8.11% and 14.96%. Therefore, the proposed method can provide an accurate tool tip position. In the drilling experiments, the proposed method was able to differentiate the tool status under different drilling stages and different eccentric loads, indicating a capability for dynamic monitoring of tool status during execution. The computation efficiency was 1736 fps, which meets the requirements of real-time monitoring at a 1 kHz sample rate. In general, the proposed method can provide accurate real-time tool tip information during robot-assisted drilling and milling tasks, which can potentially improve the safety of the surgeries.

The main contributions of this work are:

1. Proposing and validating a drilling tool error model applicable to medical scenarios.
2. Proposing a method for accurate real-time tool tip tracking.
3. Evaluating the tool runout procedure when drilling real bones under different eccentric loads.

However, certain limitations exist for the proposed method. The assembly of the proposed tracking structure depends strongly on machining and mounting accuracy, so improper assembly would lead to poor tracking accuracy. Therefore, future research is needed to analyze and eliminate these effects. Moreover, more studies are needed to combine the proposed tracking method with robot control strategies to improve the safety of robot-assisted drilling and milling surgeries.

Acknowledgments. This work was supported by National Key R&D Program of China (2022YFC2405304), NSFC (U20A20389), Tsinghua University Initiative Scientific Research Program (20197010009), Beijing Municipal Science & Technology Commission (L192046) and a grant from the Guoqiang Institute, Tsinghua University.

References

1. Tauscher, S., Fuchs, A., Baier, F., et al.: High-accuracy drilling with an image guided light weight robot: autonomous versus intuitive feed control. Int. J. Comput. Assist. Radiol. Surg. 12(10), 1763–1773 (2017)
2. Weber, S., Gavaghan, K., Wimmer, W., et al.: Instrument flight to the inner ear. Sci. Robot. 2(4), 12 (2017)
3. Schmitz, T.L., Couey, J., Marsh, E., et al.: Runout effects in milling: Surface finish, surface location error, and stability. Int. J. Mach. Tools Manuf 47(5), 841–851 (2007)
4. Boiadjiev, T., Boiadjiev, G., Delchev, K., et al.: Feed rate control in robotic bone drilling process. Proc. Inst. Mech. Eng. [H] 235(3), 273–280 (2021)
5. Feldmann, A., Gavaghan, K., Stebinger, M., et al.: Real-time prediction of temperature elevation during robotic bone drilling using the torque signal. Ann. Biomed. Eng. 45(9), 2088–2097 (2017)
6. Krüger, M., Denkena, B.: Model-based identification of tool runout in end milling and estimation of surface roughness from measured cutting forces. Int. J. Adv. Manuf. Technol. 65(5–8), 1067–1108 (2013)
7. Jing, X., Tian, Y., Yuan, Y., et al.: A runout measuring method using modeling and simulation cutting force in micro end-milling. Int. J. Adv. Manuf. Technol. 91(9–12), 4191–4201 (2017)
8. Dai, Y., Xue, Y., Zhang, J.X.: Bioinspired integration of auditory and haptic perception in bone milling surgery. IEEE-ASME Trans. Mechatron. 23(2), 614–623 (2018)
9. Dai, Y., Xue, Y., Zhang, J.X., et al.: Burr wear detection based on vibration sense during surgical milling. In: 2016 35th Chinese Control Conference, IEEE, pp. 6307–6310 (2016)
10. Boiadjiev, T., Boiadjiev, G., Delchev, K., et al.: Far cortex automatic detection aimed for partial or full bone drilling by a robot system in orthopaedic surgery. Biotechnol. Biotechnol. Equip. 31(1), 200–205 (2017)

Correction of Premature Closure of Sagittal Suture with Small-Incision Traction Bow

Shanshan Du1,2(B), Li Wen1, Zhenmin Zhao2, and Junchen Wang1

1 School of Mechanical Engineering and Automation, Beihang University, No. 37 Xueyuan Road, Haidian District, Beijing 100191, China
2 Department of Plastic Surgery, Peking University Third Hospital, No. 49 North Garden Road, Haidian District, Beijing 100191, China

Abstract. Premature closure of a cranial suture is a condition in which the suture fuses too early, deforming the skull, impairing brain development, and in severe cases endangering life. Premature closure of different cranial sutures leads to different cranial deformities. The sagittal suture usually begins to close at about 18 months, and its premature closure can lead to scaphoid (long) head deformity. Traditional surgery removes part of the skull after craniotomy to enlarge the cranial cavity and avoid restricting the growth and development of the brain. However, the operation is traumatic, and the sagittal sinus, which carries a large blood supply, lies directly below the sagittal suture, so the risk of massive hemorrhage is very high. Therefore, based on simulation analysis of skull reconstruction in children, this study designed a minimally invasive incision combined with a self-developed memory alloy traction arch, which avoids damage to the sagittal sinus. With a binocular vision navigation system applied during the operation, the preoperatively planned position of the traction arch and the osteotomy line can be accurately navigated in the operation field, which greatly improves the accuracy of the operation and reduces its risk.

Keywords: Early closure of sagittal suture · Memory alloy traction arch · Visual navigation

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024 G. Wang et al. (Eds.): APCMBE 2023, IFMBE Proceedings 104, pp. 26–37, 2024. https://doi.org/10.1007/978-3-031-51485-2_4

1 Introduction

During embryonic development, the cranial fornix develops from the mesenchymal tissue. First, it forms a capsule around the developing brain; then, the outer mesenchymal layer is gradually formed by intramembranous ossification. During development, the brain is surrounded by dural fibers that are closely connected to the suture system [1–3]. A skull suture is formed in the approximate position of the membranous bone during embryonic development and later functions as the main site of bone expansion [4–7]. Premature closure of cranial sutures is the premature bony fusion of one or more cranial sutures in the cranial fornix. The growth of the skull and brain tissue under the fused suture is limited, resulting in "compensatory" growth of the head in other parts and leading to skull and facial deformities. The characteristics of craniofacial malformations are related to early closure of cranial sutures [8].

Premature closure of the sagittal suture is a common form of premature closure of cranial sutures. After premature closure of the sagittal suture, the skull cannot grow perpendicular to the suture and instead grows parallel to it, resulting in various types of boat-shaped head deformities. These are characterized by a long anteroposterior diameter and a narrow transverse diameter of the whole head, with or without symptoms of increased intracranial pressure [9]. Scholars worldwide have long explored the surgical treatment of premature closure of the sagittal suture, but there is no unified standard treatment. This kind of corrective surgery usually requires craniotomy to remove part of the bone, so as to increase the cranial cavity space and ensure brain development. It carries a high probability of complications related to surgical trauma, bleeding, and postoperative infection.

This study adopted a surgical design with a small incision and non-whole-skull craniotomy. Before surgery, digital technology was applied to reconstruct the skull, sagittal sinus, and dural space of the children and to simulate the surgical process, improving the safety of the operation. The personalized traction bow, whose parameters were measured before the operation, was positioned and placed in real time with an optical navigation system. This allows accurate positioning of the preoperatively designed placement points without large-scale craniotomy, with less trauma and less bleeding. The stress response of different positions of the skull and cranial sutures to the traction bow was simulated through finite element analysis, in order to treat non-syndromic boat-shaped head deformity in children and track the treatment effect [10–13].

2 Materials and Methods

2.1 Patients

The clinical data of three children with non-syndromic scaphoid head deformities treated in the plastic surgery department of the Third Hospital of Peking University from September 2018 to August 2020 were retrospectively analyzed. Their ages ranged from 6 to 22 months, with an average of 11.3 months. All children had normal nervous system development and no symptoms of intracranial hypertension. The preoperative three-dimensional computed tomography (CT) scan of the head showed that no open sagittal suture was visible (Fig. 1). To create a skull model, the anteroposterior and transverse diameters of the head and the volume of the skull cavity were measured, and the skull index (maximum transverse diameter of the head / maximum anteroposterior diameter of the head × 100) was calculated (Table 1). The operation design scheme and risks were explained to the children's family members before the operation. The family members signed the operation consent form and agreed to provide their data for clinical research. This study adheres to the principles of the Declaration of Helsinki.
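The skull index defined above can be checked directly against the tabulated values. The following is a minimal sketch, assuming only the formula in the text and the preoperative diameters of cases 1 and 2 listed in Table 1; the function and variable names are illustrative and not part of the original study.

```python
def cranial_index(transverse_mm: float, anteroposterior_mm: float) -> float:
    """Skull index = max transverse diameter / max anteroposterior diameter x 100."""
    if anteroposterior_mm <= 0:
        raise ValueError("anteroposterior diameter must be positive")
    return transverse_mm / anteroposterior_mm * 100.0


# Preoperative diameters in mm (anteroposterior, transverse) as listed in Table 1.
table1 = {"case 1": (150.99, 118.06), "case 2": (139.00, 98.66)}

# Round to two decimals, the precision used in the table.
indices = {case: round(cranial_index(td, ap), 2) for case, (ap, td) in table1.items()}
print(indices)
```

Under these inputs the computed indices agree with the cranial index column of Table 1 to two decimal places.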

Fig. 1. CT images of the cranial sutures. a 3D reconstruction showing the coronal suture, the closed sagittal suture, and the lambdoid ("herringbone") suture. b 2D cranial suture imaging

Table 1. Head measurement indexes before and after operation

Case     Anteroposterior diameter (mm)   Transverse diameter (mm)   Cranial index   Cranial volume (cm3)
case 1   150.99                          118.06                     78.19           851.8
case 2   139                             98.66                      70.98           570.25

2.2 Surgical Method

1. Measurement of traction bow parameters

The personalized traction bow is made of nickel-titanium shape memory alloy with the following specifications: length of 60 mm, width of 30 mm, and wire diameters of 1.5 and 2 mm. The traction bow was placed on a force-measuring support for a repeated extrusion test. The measurement process and data are shown in Fig. 2.

Fig. 2. Extrusion test of the traction bow. a–c Pressure sensors were used while repeatedly opening and closing the traction bows with diameters of 1.5 mm and 2 mm, respectively. d Test data show that the traction bow with a diameter of 2 mm (blue) generates 12 N of traction under the minimum compression of 10 mm, and the traction bow with a diameter of 1.5 mm (red) generates 8.3 N of traction under the minimum compression of 10 mm. Traction gradually decreases as the opening and closing width increases. At a width of 43 mm, the 1.5-mm-diameter traction bow generates 1 N of traction and the 2-mm-diameter traction bow generates 2.8 N of traction. Repeated experiments showed that the stability of the traction bow was not affected by long-term extrusion.

2. Surgical design

The CT data of the child's head were imported into Mimics software (version 20.0.0.691, Materialise Inc, Belgium) to design a three-dimensional simulation of the surgical approach. The sagittal sinus and dural space of the child were reconstructed, and simulated surgical planning was performed. Based on the reconstructed three-dimensional image, the osteotomy line was designed to avoid the sagittal sinus: taking the closed sagittal suture as the center, it opens 15 mm on both sides and runs from the coronal suture at the front to the herringbone suture at the back on both sides. The width of the osteotomy strip was 1.5 mm. A personalized, custom-made memory alloy traction bow of 2 mm in diameter (Fig. 3) was placed. The traction bow exerts a traction force of 12 N, which constrains the growth direction of the child's skull and thereby adjusts the shape of the head.

Fig. 3. Simulated osteotomy line. a Sagittal sinus region. b, c The osteotomy line was designed according to the position of the sagittal sinus; it was about 15 mm away from the midline and 1.5 mm wide

3. Finite element analysis and design of auxiliary traction hook placement point

The reverse engineering software Geomagic Studio 12 (Geomagic, USA) was used to optimize and refine the model to obtain the meshed model of the skull body, which was then imported into ABAQUS/CAE 2016 to divide the mesh and to assign the material parameters and the loading force application positions of the skull according to relevant experience and formulas. The force application points were set at the junctions of the front, middle, and rear thirds of the osteotomy strip, and the results were computed with the structural statics calculation method of ABAQUS. The stress contour plot (nephogram) of the skull model under the current loading force (5–10 N) and small displacement (0–0.5 mm) was obtained. As shown in Fig. 4, the stress is concentrated mainly at the herringbone suture and the frontotemporal suture, which is consistent with the selected area of skull measurement. According to the simulation results, the stress at the occipital protrusion point is 1083.3% of the overall average stress, and the stress at the cephalic points on the two sides is 2583.3% and 2252.4% of the overall average stress, respectively.

4. Optical surgical navigation system (OSNS)

Fig. 4. Finite element analysis of traction force. a Force application points selected for applying the traction force (in newtons). b Stress area of the skull corresponding to the deformation

Fig. 5. The workflow of the whole process [14]

Figure 5 shows the workflow applied by surgeons and mechanical engineers [14]. The OSNS used in this study consisted of a real-time binocular tracking system, visual markers attached to both the patient and the surgical tools, a display monitor, and a workstation computer. The binocular camera used two lenses with a focal length of 12 mm, the baseline distance of the binocular camera was set to 300 mm (suitable for the medical field of view), the image acquisition frequency was 80 Hz, and the image resolution was 2048 × 2048. The binocular tracking system [14], built from two CCD cameras, acts as the "eye": it acquires real-time images and transmits them to the workstation computer for processing. The workstation computer acts as the "brain" and performs the algorithmic analysis. Stereo rectification of the two cameras was completed after lens distortion correction. X-shaped feature corner points detected by binocular vision (as shown in Fig. 6) were used to calculate the disparity between each pair of matching feature points. The 3D coordinates of the feature points in the camera coordinate system were then calculated according to the triangulation principle. Preoperatively, a local coordinate system was established using principal component analysis, and the coordinates of the feature points in this coordinate system were recorded and stored as a marker template. Intraoperatively, template matching was performed based on the detected feature points. The position and pose of the successfully matched visual markers in the camera coordinate system were then calculated.
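The triangulation and marker-template steps described above can be sketched as follows. This is an illustrative example under stated assumptions, not the system's actual implementation: the focal length (12 mm) and baseline (300 mm) come from the text, while the pixel pitch, the principal point, and all function names are hypothetical, and the images are assumed to be ideally rectified.

```python
import numpy as np

FOCAL_MM = 12.0        # lens focal length stated in the text
BASELINE_MM = 300.0    # camera baseline stated in the text
PIXEL_MM = 0.0055      # ASSUMED pixel pitch (not given in the text)
CX = CY = 1024.0       # ASSUMED principal point: centre of the 2048 x 2048 image
F_PX = FOCAL_MM / PIXEL_MM  # focal length expressed in pixels


def triangulate(u_left, v_left, u_right):
    """3D point (mm, left-camera frame) from a matched corner in rectified images."""
    disparity = u_left - u_right
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    z = F_PX * BASELINE_MM / disparity
    x = (u_left - CX) * z / F_PX
    y = (v_left - CY) * z / F_PX
    return np.array([x, y, z])


def marker_template(points):
    """Express marker corners in a PCA-based local frame (origin at the centroid)."""
    pts = np.asarray(points, dtype=float)
    centroid = pts.mean(axis=0)
    _, _, vt = np.linalg.svd(pts - centroid)  # rows of vt: principal directions
    if np.linalg.det(vt) < 0:                 # keep a right-handed frame
        vt[2] *= -1
    return (pts - centroid) @ vt.T


def project(point):
    """Helper for the demo only: ideal rectified projection of a 3D point."""
    x, y, z = point
    u_left = CX + F_PX * x / z
    u_right = CX + F_PX * (x - BASELINE_MM) / z
    v = CY + F_PX * y / z
    return u_left, v, u_right


# Hypothetical marker corners (mm) used to exercise the two steps.
corners = np.array([[50.0, -20.0, 800.0], [80.0, -20.0, 805.0],
                    [50.0, 10.0, 810.0], [85.0, 15.0, 800.0]])
measured = np.array([triangulate(*project(p)) for p in corners])
template = marker_template(measured)
```

Because the template is expressed relative to the marker's own centroid and principal axes, the distances between its corners do not depend on where the marker sits in the camera frame, which is what makes intraoperative template matching possible.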

Fig. 6. Working principle of the optical navigation system. The binocular camera identifies the probe and the marker on the brace to localize them on the model, while the probe and the reconstructed skull model are displayed in real time on the monitor [14].

5. Operation method

The child was placed in the prone position after successful endotracheal intubation and general anesthesia. Routine iodophor disinfection was applied to the operation site, and sterile drapes were placed.

5.1 To expose the operation site, two parallel arc incision lines were drawn with a marking pen at the center line of the head, approximately 3 cm away from the anterior fontanel and the herringbone suture. The front end of the incision reached the anterior fontanel and the rear end reached the herringbone suture. Local infiltration anesthesia was administered with 0.25% lidocaine injection and 1:400,000 adrenaline along the design line of the scalp incision. After local anesthesia was achieved, the scalp was incised along the design line down to the superficial periosteum, and the superficial periosteum was sharply separated beneath the galea aponeurotica (cap aponeurosis). The separation extended forward to the anterior fontanel, backward across the herringbone suture, and laterally to approximately 2.5 cm from the midline (Fig. 7).

Fig. 7. Preoperative scribing design

5.2 Skull fenestration and parietal osteotomy

Two osteotomy lines parallel to the midline of the sagittal suture were designed on the parietal surface on both sides, approximately 7.5 mm away from the midline of the sagittal suture, to avoid damaging the sagittal sinus during the operation. The skull fenestration positions were marked with a marker on the parietal osteotomy line corresponding to the scalp incision line, with two fenestration positions on each side of the osteotomy line. The milling cutter head was inserted from the fenestration position to the inner side of the parietal bone, the bone was cut along the parietal osteotomy line, and the bone was ground at the fenestration position with a skull ring drill to expose the epidural space. A meningeal stripper was used to free the epidural space along the inner side of the parietal bone.

5.3 For the placement of the bow distractor, the bow distractor was customized according to the traction distance estimated in the child's preoperative evaluation. After osteotomy, the opening section of the bow distractor was tied with silk thread to match the size of the window opening. The two bow distractors were placed into the parietal bone window openings, with the bow of the front distractor protruding backward and the bow of the rear distractor protruding forward. The distractors were set in place through the parietal skin incision, and the silk thread was removed.

5.4 After sufficient hemostasis, the galea aponeurotica and scalp were closed with interrupted sutures, and negative-pressure drainage was placed under the skin (Fig. 8).

Fig. 8. Surgical process diagram. a Fenestration and osteotomy. b Traction arch placement. c Suturing and drain placement

5.5 Postoperative treatment

After the operation, the patient was transferred to the intensive care unit (ICU) for 1–2 days and then transferred to the general ward. Third-generation cephalosporins were routinely administered for 7–10 days.

2.3 Results

In this group, three cases of early closure of the sagittal suture were corrected, and the resulting skull shape was satisfactory. At follow-up of 12–24 months, the average anteroposterior cranial diameter had increased by 16.72 mm (11%), from 151.07 mm before the operation to 167.79 mm after the operation. The average transverse diameter between the temporal regions was 112.29 mm before the operation and 131 mm after the operation, an increase of 18.71 mm (16.7%). The ratio of the transverse diameter to the anteroposterior diameter of the head changed from 1:1.35 before the operation to 1:1.28 after the operation. The average head index before and after the operation was 74.26% and 78.29%, respectively. The volume of the skull cavity increased from 775.28 cm3 before the operation to 1071.17 cm3 after the operation. The boat-shaped head deformity improved significantly (Tables 2 and 3).
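The growth values and proportions reported above follow from the pre- and postoperative averages by simple arithmetic. The sketch below recomputes them from the averages quoted in the text; the dictionary keys and function name are illustrative only.

```python
# Average values quoted in the Results (diameters in mm, volume in cm3).
pre = {"anteroposterior": 151.07, "transverse": 112.29, "index": 74.26, "volume": 775.28}
post = {"anteroposterior": 167.79, "transverse": 131.00, "index": 78.29, "volume": 1071.17}


def change(before: float, after: float) -> tuple:
    """Return (absolute growth, growth as a percentage of the preoperative value)."""
    growth = after - before
    return round(growth, 2), round(growth / before * 100.0, 2)


changes = {name: change(pre[name], post[name]) for name in pre}

# Ratio of transverse to anteroposterior diameter, expressed as 1:r.
ratio_pre = round(pre["anteroposterior"] / pre["transverse"], 2)
ratio_post = round(post["anteroposterior"] / post["transverse"], 2)

for name, (growth, percent) in changes.items():
    print(f"{name}: +{growth} ({percent}%)")
print(f"ratio before 1:{ratio_pre}, after 1:{ratio_post}")
```

The recomputed absolute growth values match those in the text, and the percentages agree with the reported figures up to the rounding used in the paper.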

3 Discussion

Cranial malformations with early sagittal closure are complex and variable because the skull contour depends on the starting time, the initial fusion site, and the degree of fusion progress. Traditional surgical methods include cranial suture reconstruction, such as David's "I"-shaped cranial suture reconstruction. This method requires open surgery and is suitable for children under 3 months of age with mild cranial malformation. An orthopedic cap needs to be worn for a long time after the operation, which increases the discomfort of the children and the number of repeated visits. After surgery, incomplete correction and recurrence may occur due to re-fusion of the bone gap, and a satisfactory skull index and shape may not be obtained. For children over 6 months old with significant cranial deformity, the traditional method adopts partial or even whole cranioplasty, including floating cranial flap cranioplasty, plum blossom cranial flap cranioplasty, and skull flap remodeling and displacement. Its advantage is that it can address both the restricted skull and the skull with compensatory hyperplasia at the same time, and the correction can be completed in one operation for older children and children with severe deformity. It also results in better correction of skull shape and reduction of intracranial pressure. However, this type of operation involves large trauma and increased bleeding and leaves a dead space between the transplanted bone flap and the dura mater after the operation. The blood supply between the dura mater and the bone flap is destroyed, and dead bone formation and intracranial infection can occur. All of these defects limit the wide application of this operation. Distraction osteogenesis was first performed in a patient with premature closure of the sagittal suture in 1998 by Sugawara et al. Distraction osteogenesis assisted by a distractor can be applied to older patients and those with severe cranial deformities. 
A better skull shape can be obtained through extension perpendicular to the sagittal suture after the operation. However, because the distractor extends only in the left-right direction, it cannot shorten the compensatory overgrowth in the anteroposterior direction. The boat-shaped head therefore often merely becomes a smaller boat-shaped head, and the skull index cannot return to normal. After the operation, a cranial brace is still required for orthopedic treatment.

Table 2. Preoperative and postoperative last follow-up measurements (AP: anteroposterior diameter; TD: transverse diameter; CI: cranial index; CV: cranial volume)

         Preoperative                         Postoperative
Case     AP       TD       CI      CV         AP       TD       CI      CV
case 1   150.99   118.06   78.19   851.80     154.93   128.53   82.96   1065.35
case 2   139.00   98.66    70.98   570.25     179.04   131.78   73.60   1085.94
case 3   163.23   120.14   73.60   903.78     169.41   132.69   78.30   1162.22

Table 3. Analysis of changes before operation and the last follow-up after operation

                           Anteroposterior diameter (mm)   Transverse diameter (mm)   Cranial index   Cranial volume (cm3)
Preoperative average       151.07                          112.29                     74.26           775.28
Postoperative average      167.79                          131.00                     78.29           1071.17
Average growth value       16.72                           18.71                      4.03            295.89
Proportion of growth (%)   11                              16.7                       5.4             38.17

Preoperative ratio: 1:1.35
Postoperative proportion: 1:1.28

In 2019, Weimin et al. introduced and improved the combined distraction technology, multi-block osteotomy, and multi-directional distraction technology. The advantage of this method is that it is suitable for the treatment of various types of scaphoid head deformities. The combined extension of multiple bone flaps avoids the movement of large bone flaps in a single direction, maintains the skull top stretched under a certain radius, shortens the extension cycle compared with unidirectional extension, and corrects both the left-right and the front-back cranial deformities after the operation. However, this method has some disadvantages. The operation requires careful preoperative design; during osteotomy, the sinus, brain, and fontanelle must be avoided. It needs to be performed in a unit with rich experience in craniofacial surgery, it requires ICU monitoring after surgery, and the skin around the skull top extender requires frequent nursing and strict parental care. In addition, the cost of multiple extenders is high. Based on previous experience, we further improved the design scheme and used a minimally invasive small-incision traction bow procedure to correct the boat-shaped head. The nickel-titanium alloy traction bow was designed and manufactured, and its performance was tested to ensure its safety and effectiveness. At the same time, the placement position of the traction bow and the influence of the traction force on skull deformation were predicted through finite element analysis. Advantages of this method include the following: (1) A minimally invasive incision avoids high-risk steps such as a coronal scalp incision, large-area dissection of the scalp flap with heavy bleeding, and craniotomy. It reduces surgical trauma in children, reduces the risk of infection and bleeding, and reduces the difficulty of the operation, which can be performed without the participation of neurosurgeons. 
(2) The personalized, custom-made memory alloy traction bow is cost-effective and easy to place and use during the operation; it has a stable, continuous effect with no midway adjustment, has no impact on normal life and activities, and reduces discomfort and pain in children. (3) The surgical design can be simulated on a three-dimensional printed model before the operation, resulting in more accurate measurements and a shortened operation time. (4) Postoperative recovery is rapid and no external regulator is needed, which reduces the risk of infection and discomfort in children; the incision is very small and does not affect appearance or shape, which improves the satisfaction of the children and their families. However, there are problems and deficiencies still to be solved: (1) The parameters of the personalized memory alloy traction bow are set in advance, so its mechanical characteristics are relatively fixed. As traction proceeds, the required traction force cannot be measured, fed back, and adjusted at any time. The traction effect can only be evaluated through measurement and re-examination of the child's head appearance and head circumference contour, and the traction force cannot be changed. (2) We cannot predict whether the left-right diameter will narrow again in the long term after traction. It is not known whether the force of the traction bow will affect the normal bone sutures and skull development of children; therefore, long-term follow-up is necessary.

4 Conclusions

Surgical treatment of early closure of the sagittal suture in infants is required. Scholars worldwide are constantly exploring new surgical methods to achieve better results and minimize trauma and risk. The use of a small-incision traction bow to correct premature closure of the sagittal suture is therefore worth popularizing and perfecting. Compared with previous surgical methods, this operation is favored by an increasing number of children and their families because it has low risk, low cost, and high efficiency. Remaining problems in the surgical method, design, and other aspects need to be addressed through further studies.

Acknowledgments. I would like to thank all those who have helped me make this paper possible. I would like to thank my two tutors, Professor Wen Li and Professor Wang Junchen, for their guidance, and the team of Director Zhao Zhenmin of the Plastic Surgery Department of the Third Hospital of Peking University for their support. I would like to thank all the relevant personnel in the operating room of the Third Hospital of Beijing Medical University and the students in the Soft Robot Laboratory of Beihang University for their help. This work was supported in part by the National Key Research and Development Program of China under Grant 2022YFC2405401.

References

1. Kjaer, I.: Neuro-osteology. Crit. Rev. Oral Biol. Med. 9(2), 224–244 (1998)
2. Chai, Y., Maxson, R.E., Jr.: Recent advances in craniofacial morphogenesis. Dev. Dyn. 235(9), 2353–2375 (2006)
3. Nah, H.D., Pacifici, M., Gerstenfeld, L.C., Adams, S.L., Kirsch, T.: Transient chondrogenic phase in the intramembranous pathway during normal skeletal development. J. Bone Miner. Res. 15(3), 522–533 (2000)
4. Gong, S.G.: Cranial neural crest: migratory cell behaviour and regulatory networks. Exp. Cell Res. 325(2), 90–95 (2014)
5. Twigg, S.R., Wilkie, A.O.: A genetic-pathophysiological framework for craniosynostosis. Am. J. Hum. Genet. 97(3), 359–377 (2015)
6. Slater, B.J., Lenton, K.A., Kwan, M.D., Gupta, D.M., Wan, D.C., Longaker, M.T.: Cranial sutures: a brief review. Plast. Reconstr. Surg. 121(4), 170e–178e (2008)
7. Zhao, X., Qu, Z., Tickner, J., Xu, J., Dai, K., Zhang, X.: The role of SATB2 in skeletogenesis and human disease. Cytokine Growth Factor Rev. 25(1), 35–44 (2014)
8. Ann, L., Michael, W., Jeffrey, L., James, M.: Effect of premature sagittal suture closure on craniofacial morphology in a prehistoric male Hopi. Cleft Palate Craniofac. J. 31(5), 385–396 (1994)
9. Governale, L.S.: Craniosynostosis. Pediatr. Neurol. 53(5), 394–401 (2015)
10. Slater, B.J., Lenton, K.A., Kwan, M.D., et al.: Cranial sutures: a brief review. Plast. Reconstr. Surg. 121(4), 170–178 (2008)
11. Sharma, R.K.: Craniosynostosis. Indian J. Plast. Surg. 46(1), 18–27 (2013)
12. Mulliken, J.B., Vander Woude, D.L., Hansen, M., LaBrie, R.A., Scott, R.M.: Analysis of posterior plagiocephaly: deformational versus synostotic. Plast. Reconstr. Surg. 103(2), 371–380 (1999)
13. Zaleckas, L., Neverauskienė, A., Daugelavicius, V., Šidlovskaitė-Baltakė, D., Raugalas, R., Vištartaitė, B., et al.: Diagnosis and treatment of craniosynostosis: Vilnius team experience. Acta Med. Litu. 22(2), 111–121 (2015)
14. Chen, Y., Du, S., Lin, Z., Zhang, P., Zhang, X., Bin, Y., Wang, J., Zhao, Z.: Application of trans-sutural distraction osteogenesis based on an optical surgical navigation system to correct midfacial dysplasia. Sci. Rep. 12, 13181 (2022)

A Home-Style Intelligent Monitoring Sanitize Robot

B. Liu1, J. Yang1(B), J. Ding1, D. Zhao1, and J. Wang2

1 School of Electrical Engineering, Shenyang University of Technology, Shenyang, Liaoning, China
2 China Medical University, Shenyang, Liaoning, China

Abstract. To address the problems and needs faced by existing sanitize robots in home scenarios, this paper designs a robot with two modes: sanitizing and intelligent temperature monitoring. In the intelligent temperature monitoring mode, the robotic arm delivers the temperature measuring instrument to the appropriate measurement spot, producing measurements with an accuracy of up to 0.2 °C and raising an alarm for abnormal body temperature. The sanitize mode uses two movement control modes, real-time control and coordinate control, and is equipped with an obstacle avoidance control system so that the driving accuracy reaches 1–3 cm. Based on the Cartesian spatial trajectory planning method, the sanitize trajectory of the robotic arm is simulated and planned. With the disinfection device mounted on the robotic arm, the sanitized area of the household environment can reach 20 m2/min and the spraying area can be greater than 1 dm2, which improves sanitize efficiency and accuracy.

Keywords: Monitor sanitize robot · Infrared temperature sensor · Family scenes

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024 G. Wang et al. (Eds.): APCMBE 2023, IFMBE Proceedings 104, pp. 38–47, 2024. https://doi.org/10.1007/978-3-031-51485-2_5

1 Introduction

During the coronavirus disease 2019 (COVID-19) pandemic, "disinfection robots" provided disinfection services for hospitals and other places, greatly reducing the risk of infection. However, faced with the varied environments in the home, people often cannot clean and disinfect properly. The existing "no contact" disinfection methods also face many problems when applied to home scenes. For example:

• The disinfection is not accurate enough, and the entire room cannot be disinfected without dead space.
• The disinfection robot is large in size, inconvenient to transport, and difficult to carry to the work site.

In view of the above problems and needs, this paper designs a home intelligent monitoring sanitize robot, which realizes health detection and disinfection work in the home scene. The monitoring and disinfection robot is small in size, easy to transport and move, and uses a robotic arm equipped with a disinfection device, so that the disinfection is more accurate. This helps keep family members away from bacteria and viruses. The main contents of this article include the following aspects:

• Control of the intelligent disinfection function with an intelligent obstacle avoidance system in home scenes.
• Control of the intelligent temperature measurement mode, which can effectively and accurately detect the human body temperature signal and raise an alarm for abnormal temperature data.
• Testing of the home intelligent monitoring sanitize robot to verify the accuracy and reliability of the detection method. The results showed that, to a certain extent, the robot can issue an early warning prompt when the human body temperature is high and successfully complete the disinfection work.

2 Home Monitoring Sanitize Overall Design of Robots

The home-type intelligent monitoring sanitize robot in this paper is composed of a disinfection mechanism, a main control mechanism, a robotic arm mechanism, and a walking mechanism, of which the walking mechanism adopts a crawler design. The system circuit uses an STM32F103ZET6 as the core processor and is equipped with the following peripheral circuits: buck voltage stabilization circuit, motor drive circuit, voice control circuit, robotic arm control circuit, lighting circuit, water pump control circuit, MCU90615-BBC temperature acquisition circuit, touch display circuit, alarm circuit, GY-53L1 infrared obstacle avoidance circuit, and MPU6050 attitude detection circuit. The block diagram of the system design is shown in Fig. 1. The voice controller is connected to a relay to realize voice wake-up of the robot. After initialization is completed, the user can choose to enter the intelligent temperature monitoring mode or the sanitize mode by voice or through the TFT LCD screen. When the temperature monitoring command is detected, the robot automatically enters intelligent temperature measurement control; in the same way, when the intelligent sanitize control command is detected, it automatically enters sanitize control. Mode switching and parameter setting can be performed through screen touch or voice control. The control flow of the overall control system is shown in Fig. 2.
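The mode selection described above can be pictured as a small state machine. The following sketch is only an illustration of that flow under stated assumptions: the state names, command strings, and transition table are hypothetical and are not taken from the robot's firmware, which runs on the STM32 processor rather than in Python.

```python
from enum import Enum, auto


class Mode(Enum):
    STANDBY = auto()       # waiting for voice wake-up
    MENU = auto()          # initialized, waiting for a mode command
    TEMPERATURE = auto()   # intelligent temperature monitoring mode
    SANITIZE = auto()      # sanitize mode


# Hypothetical commands, assumed to arrive from either voice control or the touch screen.
TRANSITIONS = {
    (Mode.STANDBY, "wake"): Mode.MENU,
    (Mode.MENU, "temperature"): Mode.TEMPERATURE,
    (Mode.MENU, "sanitize"): Mode.SANITIZE,
    (Mode.TEMPERATURE, "sanitize"): Mode.SANITIZE,
    (Mode.SANITIZE, "temperature"): Mode.TEMPERATURE,
    (Mode.TEMPERATURE, "exit"): Mode.MENU,
    (Mode.SANITIZE, "exit"): Mode.MENU,
}


def step(mode: Mode, command: str) -> Mode:
    """Return the next mode; unknown or invalid commands leave the mode unchanged."""
    return TRANSITIONS.get((mode, command), mode)


def run(commands, mode: Mode = Mode.STANDBY) -> Mode:
    """Apply a sequence of commands starting from the given mode."""
    for command in commands:
        mode = step(mode, command)
    return mode


print(run(["wake", "temperature", "sanitize"]))
```

Keeping the transitions in a single table makes it easy to see that a mode command issued before wake-up is ignored, which mirrors the wake-then-select order described in the text.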

3 Home Monitoring Sanitize Control of Various Functions of Robots

3.1 Sanitize Mode Control

The user selects sanitizing control by voice or on the TFT LCD screen. To disinfect the hands, the user selects hand disinfection by voice or on the TFT LCD screen and places the hands at the optimal disinfection point following the infrared ranging prompts; the hands are then disinfected. After the disinfection is completed, the user can follow the voice or TFT LCD screen prompt either to exit the hand disinfection mode and return to the main control interface, or to remain in the hand disinfection mode and disinfect the hands again.


B. Liu et al.

Fig. 1. General block diagram of the system hardware. Blocks: power; buck regulator circuit; STM32F103ZET6 minimal system circuit; robotic arm control circuit; water pump control circuit; GY-53L1 infrared obstacle avoidance circuit; MPU6050 attitude detection circuit; motor drive circuit; voice control circuit; touch display circuit; lighting circuits; GY-MCU90615 temperature acquisition circuit; alarm circuit.

If hand disinfection is not selected, the robot automatically switches to ambient sanitization control, at which point the robotic arm automatically activates and moves to the optimal position. Next, the water pump draws the disinfectant solution and sprays it evenly through the nozzle. At the same time, the processor controls and coordinates the mobile platform in real time over Bluetooth, and the robot automatically avoids any obstacles it encounters. Once no obstacle is detected, the robot stops, records the rotation angle, drives straight for 3 s, rotates to the left by the recorded angle, drives straight for another 3 s, rotates by a certain angle, and continues straight. Autonomous obstacle avoidance during driving helps prevent collisions and damage to items in the home environment. When the area to be disinfected is too narrow for the mobile platform to move conveniently, accurate disinfection is achieved by moving the robotic arm. After disinfection is completed, the user can exit the area disinfection mode by voice or through the TFT LCD screen and return to the main control interface. The sanitize mode control flowchart is shown in Fig. 3.

1. Robotic arm control

This research employs a robotic arm carrying disinfection and temperature monitoring equipment to precisely disinfect, and measure temperature in, the various complicated situations found in a house. The robotic arm thus makes effective epidemic prevention at home possible. The six-degree-of-freedom robotic arm is used to disinfect the complicated surroundings of the home and performs trajectory planning so that the end effector can complete the sanitizing work. Using the Cartesian-space trajectory planning method, the inverse kinematics of the manipulator is solved at the known path points, converting each sanitizing position into the angles of the individual joints so that the manipulator can take the appropriate actions to finish the sanitizing task. The control process of the sanitizing manipulator is shown in Fig. 4.
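Cartesian-space planning of this kind starts from end-effector positions interpolated along straight segments between path points. The following is a minimal Python sketch of that interpolation for a rectangular path; the function names, corner coordinates, and number of points per edge are illustrative assumptions, not values from the paper, and the inverse-kinematics step that would convert each point into joint angles is omitted because it depends on arm geometry the text does not give.

```python
import numpy as np

def interpolate_line(p_start, p_end, n_points):
    """Return n_points Cartesian positions evenly spaced from p_start to p_end."""
    p_start = np.asarray(p_start, dtype=float)
    p_end = np.asarray(p_end, dtype=float)
    s = np.linspace(0.0, 1.0, n_points)[:, None]  # path parameter in [0, 1]
    return (1.0 - s) * p_start + s * p_end

def rectangle_path(corners, n_per_edge):
    """Chain straight segments through the corners and back to the first one."""
    corners = [np.asarray(c, dtype=float) for c in corners]
    segments = []
    for a, b in zip(corners, corners[1:] + corners[:1]):
        # Drop the last point of each edge: it is the first point of the next.
        segments.append(interpolate_line(a, b, n_per_edge)[:-1])
    return np.vstack(segments)

# Hypothetical rectangle (metres) in a vertical plane in front of the robot.
corners = [(0.3, -0.1, 0.2), (0.3, 0.1, 0.2), (0.3, 0.1, 0.4), (0.3, -0.1, 0.4)]
path = rectangle_path(corners, n_per_edge=5)
print(path.shape)  # (16, 3): 4 edges x 4 points each
```

Each row of `path` is one interpolation point that an inverse-kinematics solver would then map to six joint angles.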
The spatial straight-line interpolation method is used to make the end trajectory of the robotic arm a straight line. The trajectory of the robotic arm is simulated in MATLAB, and the coordinates of each interpolation point are determined so that the sanitizing trajectory of the robotic arm is rectangular. The spatial arc interpolation method

A Home-Style Intelligent Monitoring Sanitize Robot


Fig. 2. General block diagram of system software. Flow: start; voice control; the robot starts working; initialize; voice control or screen controls; select the mode (temperature monitoring mode control or sanitize mode control).

[Sanitize mode control flowchart, referred to in the text as Fig. 3. Nodes: start; initialize; voice control or screen controls; sanitize mode; hand sanitizer? (Y/N); robotic arm control; water pump work; infrared ranging; motor driven; obstacles? (Y: obstacle avoidance); exit?]

0.9999 and abs(Hreal (3, 4) − Hdes (3, 4)) < 0.02, that is, keeping the laser beam as perpendicular to the sample surface as possible while maintaining the sample in focus range.

Fig. 2. a Real-time OCT image processing for robotic pose control. b Coarse-to-fine 3D image stitching of wide-field scanning.

(c) 6-DOF 3D image stitching Since both translation and rotation are involved in the pose adjustment process of the scanning head, it is necessary to develop an image stitching algorithm with 6 DOF. We first perform coarse alignment between each 3D volume using the pose information of the robot at each adjusted imaging point. Due to the weak contrast of OCT images, there are few features available for image registration, so we resort to point cloud registration for fine alignment. For each A-line in the 3D volume, the surface position is extracted through threshold peak detection. Then, we convert all surface positions into a point cloud and use the iterative closest point (ICP) algorithm for one-by-one registration of those surface point clouds with initial transformations set to coarse-alignment transformations. In overlapping areas, we perform average blending to construct consecutive wide-field images.
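The fine-alignment steps described above (surface extraction per A-line, conversion to a point cloud, and ICP seeded with the coarse robot-pose transform) can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation: the surface is taken as the first depth sample above a threshold, nearest neighbours are found by brute force, and the function names are hypothetical.

```python
import numpy as np

def extract_surface(volume, threshold):
    """Return an (N, 3) point cloud of surface voxels, one per A-line at most.

    volume has shape (nx, ny, nz); volume[i, j, :] is one A-line. The surface
    is the first depth sample whose intensity exceeds `threshold` (a simple
    stand-in for threshold peak detection).
    """
    above = volume > threshold
    has_surface = above.any(axis=2)
    first_idx = above.argmax(axis=2)          # first True along depth
    xs, ys = np.nonzero(has_surface)
    zs = first_idx[xs, ys]
    return np.column_stack([xs, ys, zs]).astype(float)

def best_rigid_transform(src, dst):
    """Least-squares rigid transform (Kabsch) mapping paired src onto dst."""
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)
    U, _, Vt = np.linalg.svd(H)
    # Guard against a reflection in the SVD solution.
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T
    t = dst_c - R @ src_c
    return R, t

def icp(src, dst, R0, t0, iterations=30):
    """Point-to-point ICP started from the coarse transform (R0, t0)."""
    R, t = R0.copy(), t0.copy()
    for _ in range(iterations):
        moved = src @ R.T + t
        # Brute-force nearest neighbour in dst for every moved source point.
        d2 = ((moved[:, None, :] - dst[None, :, :]) ** 2).sum(axis=2)
        nearest = dst[d2.argmin(axis=1)]
        R, t = best_rigid_transform(src, nearest)
    return R, t
```

In practice the brute-force neighbour search would be replaced by a spatial index for full-size volumes, and the coarse transform `(R0, t0)` would come from the robot poses recorded at each imaging point.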

3 Experiments and Results

(d) Analysis of automatic pose adjustment

A multi-angle imaging experiment was carried out using an infrared laser viewing card (VRC2, Thorlabs) as the imaging sample to verify that the system can automatically reach an appropriate imaging pose from various initial states. In this set of experiments, the number of scanning points was set to 1. We saved the pose of the robot after each adjustment step and calculated the pose error relative to the last state. We use 3-axis Euler angles (rotation order Z-Y-X) to represent the rotation, and the curves of rotation and translation errors are shown in Fig. 3b. The four groups of experimental

Robot-Assisted Optical Coherence Tomography for Automatic Wide-Field Scanning


data presented in Fig. 3 are representative to a certain extent. The initial states of the first and second groups were normal images acquired at different angles, while the initial image of the third group was inverted, indicating that the distance between the scan head and the tissue was small. The system handled these situations properly and finally reached an ideal imaging pose. In the fourth group, the initial state had no image signal because the scan head was too far from or too close to the sample, completely outside the focus range. In this case, the robot first needs to find the sample surface through an axial movement (about 10 steps, during which the rotation angle does not change), after which it can smoothly reach the ideal pose.
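The pose error described above, expressed as Z-Y-X Euler angles plus a translation, can be computed from two homogeneous poses as in the sketch below. It assumes 4 × 4 homogeneous matrices and takes the error as the relative transform from a reference pose to the current pose; the function names and the choice of reference are illustrative assumptions rather than the authors' exact definition.

```python
import numpy as np

def rot_zyx(yaw, pitch, roll):
    """R = Rz(yaw) @ Ry(pitch) @ Rx(roll), angles in radians."""
    cz, sz = np.cos(yaw), np.sin(yaw)
    cy, sy = np.cos(pitch), np.sin(pitch)
    cx, sx = np.cos(roll), np.sin(roll)
    Rz = np.array([[cz, -sz, 0.0], [sz, cz, 0.0], [0.0, 0.0, 1.0]])
    Ry = np.array([[cy, 0.0, sy], [0.0, 1.0, 0.0], [-sy, 0.0, cy]])
    Rx = np.array([[1.0, 0.0, 0.0], [0.0, cx, -sx], [0.0, sx, cx]])
    return Rz @ Ry @ Rx

def euler_zyx(R):
    """Recover (yaw, pitch, roll) in radians from R = Rz @ Ry @ Rx."""
    pitch = -np.arcsin(np.clip(R[2, 0], -1.0, 1.0))
    yaw = np.arctan2(R[1, 0], R[0, 0])
    roll = np.arctan2(R[2, 1], R[2, 2])
    return np.array([yaw, pitch, roll])

def pose_error(H, H_ref):
    """Rotation error (degrees, Z-Y-X order) and translation error of H w.r.t. H_ref."""
    dH = np.linalg.inv(H_ref) @ H
    return np.degrees(euler_zyx(dH[:3, :3])), dH[:3, 3]
```

Plotting the three angles and three translation components after every adjustment step gives error curves of the kind shown in Fig. 3b.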

Fig. 3. Analysis of automatic pose adjustment. a Point cloud and central B-scans (lower right) acquired before pose adjustment. b Rotation (upper) and translation (lower) error curves during pose adjustment (Red, green and blue represent the x, y, and z axis, respectively). c Point cloud and central B-scans (lower right) acquired after pose adjustment.

We calculated the angle between the normal direction of the point cloud obtained at the final state and the axial direction of the image coordinate system; the average is 0.1501 ± 0.0974 degrees. This further supports that the system can accurately adjust the imaging pose.

(e) Wide-field scanning and stitching

We also carried out large-field scanning experiments on human skin to verify the feasibility and accuracy of the automatic scanning and image stitching algorithm. Figure 4 presents the images obtained by the proposed system after scanning human skin (back of the right hand) with a 3 × 3 grid trajectory. The lateral FOV of the stitched image reached 22.2 × 21.9 mm². Figure 4a was obtained by transforming and superimposing the original OCT volumes using the transformation matrices resulting from the 3D stitching, and the en face and side views (Fig. 4b, c) were generated using 3D Slicer [17]. From the enlarged views in Fig. 4a, we can see that in the overlapping areas, the


Y. Li et al.

adjacent volumes have good consistency and the original structures are continuous without artifacts, which demonstrates the good accuracy of the proposed stitching algorithm. The proposed system can also handle more complex, irregular tissue surfaces, as shown in Fig. 5.
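The normal-to-axis angle reported earlier in this section (0.1501 ± 0.0974 degrees) is the kind of quantity that can be obtained by fitting a plane to the surface point cloud. A minimal sketch follows, assuming the normal is taken as the direction of least variance of the points (via SVD) and that the axial direction is the z axis of the image coordinate system; the paper does not state how the normal was estimated, so this is one plausible choice.

```python
import numpy as np

def normal_angle_deg(points, axis=(0.0, 0.0, 1.0)):
    """Angle in degrees between the best-fit plane normal of `points` and `axis`."""
    points = np.asarray(points, dtype=float)
    centered = points - points.mean(axis=0)
    # The right singular vector with the smallest singular value is the normal.
    _, _, Vt = np.linalg.svd(centered, full_matrices=False)
    normal = Vt[-1]
    axis = np.asarray(axis, dtype=float)
    axis = axis / np.linalg.norm(axis)
    cos_angle = abs(normal @ axis)            # sign of the normal is arbitrary
    return np.degrees(np.arccos(np.clip(cos_angle, 0.0, 1.0)))

# Synthetic surface tilted by 5 degrees about the y axis.
gx, gy = np.meshgrid(np.linspace(-1, 1, 20), np.linspace(-1, 1, 20))
tilted = np.column_stack([gx.ravel(), gy.ravel(), np.tan(np.radians(5.0)) * gx.ravel()])
print(normal_angle_deg(tilted))
```

For the synthetic tilted plane the function returns approximately 5 degrees; a well-adjusted scan pose would give a value close to zero.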

Fig. 4. Wide-field OCT imaging of human skin. a B-scan view and enlarged views of overlapped areas. b En face view. c Lateral view. d 3D rendering of stitched volumes.

Fig. 5. Flexible and automatic wide-field OCT imaging on complex and irregular skin surface (6 scans stitched).

4 Discussion and Conclusion

The proposed system was verified through qualitative and quantitative experiments. The system could find appropriate imaging positions and orientations on irregular sample surfaces and ensure the image quality. However, the current stitching algorithm requires a long computation time and lacks quantitative evaluation, which are problems we need to address in the future. We will also carry out studies on wide-field OCT imaging for the diagnosis of soft tissue diseases, so as to improve the ease of use and portability of the system from the perspective of clinical needs. In conclusion, we presented a robot-assisted OCT imaging system capable of automatic wide-field scanning. We used the real-time 3D OCT data obtained from sparse scanning as visual feedback to achieve the precise adjustment of the robotic scanner’s


pose, and successfully realized automatic localization and image stitching on complex skin tissues. The system would be beneficial for high-resolution lesion boundary recognition, precise surgical guidance, and intelligent theranostics.

Acknowledgment. The authors acknowledge support from the National Natural Science Foundation of China (82027807, U22A2051, 82172112), the National Key Research and Development Program of China (2022YFC2405200), the Beijing Municipal Natural Science Foundation (7212202), the Institute for Intelligent Healthcare, Tsinghua University (2022ZLB001), and the Tsinghua-Foshan Innovation Special Fund (2021THFS0104).

References

1. Huang, D., Swanson, E.A., Lin, C.P., et al.: Optical coherence tomography. Science 254, 1178–1181 (1991)
2. Marvdashti, T., Duan, L., Aasi, S.Z., et al.: Classification of basal cell carcinoma in human skin using machine learning and quantitative features captured by polarization sensitive optical coherence tomography. Biomed. Opt. Express 7, 3721 (2016)
3. Zhang, R., Fan, Y., Qi, W., et al.: Current research and future prospects of IVOCT imaging-based detection of the vascular lumen and vulnerable plaque. J. Biophotonics 15, e202100376 (2022)
4. Luo, S., Fan, Y., Chang, W., et al.: Classification of human stomach cancer using morphological feature analysis from optical coherence tomography images. Laser Phys. Lett. 16, 095602 (2019)
5. Kut, C., Chaichana, K.L., Xi, J., et al.: Detection of human brain cancer infiltration ex vivo and in vivo using quantitative optical coherence tomography. Sci. Transl. Med. 7, 292ra100–292ra291 (2015)
6. Li, Y., Fan, Y., Hu, C., et al.: Intelligent optical diagnosis and treatment system for automated image-guided laser ablation of tumors. Int. J. Comput. Assist. Radiol. Surg. 16, 2147–2157 (2021)
7. Ji, Y., Zhou, K., Ibbotson, S.H., et al.: A novel automatic 3D stitching algorithm for optical coherence tomography angiography and its application in dermatology. J. Biophotonics 14, e202100152 (2021)
8. Wang, Z., Potsaid, B., Chen, L., et al.: Cubic meter volume optical coherence tomography. Optica 3, 1496–1503 (2016)
9. Song, S., Xu, J., Wang, R.K.: Long-range and wide field of view optical coherence tomography for in vivo 3D imaging of large volume object based on akinetic programmable swept source. Biomed. Opt. Express 7, 4734–4748 (2016)
10. Laves, M.-H., Kahrs, L.A., Ortmaier, T.: Volumetric 3D stitching of optical coherence tomography volumes. Curr. Directions Biomed. Eng. 4, 327–330 (2018)
11. Sprenger, J., Saathoff, T., Schlaefer, A.: Automated robotic surface scanning with optical coherence tomography. In: 2021 IEEE 18th International Symposium on Biomedical Imaging (ISBI). IEEE (2021)
12. Draelos, M., Ortiz, P., Qian, R., et al.: Contactless optical coherence tomography of the eyes of freestanding individuals with a robotic scanner. Nature Biomedical Engineering (2021)
13. Ortiz, P., Draelos, M., Narawane, A., et al.: Robotically-aligned optical coherence tomography with gaze tracking for live image montaging of the retina. In: 2022 International Conference on Robotics and Automation (ICRA). IEEE (2022)
14. Huang, Y., Li, X., Liu, J., et al.: Robotic-arm-assisted flexible large field-of-view optical coherence tomography. Biomed. Opt. Express 12 (2021)
15. Dahroug, B., Tamadazte, B., Andreff, N.: PCA-based visual servoing using optical coherence tomography. IEEE Robot. Autom. Lett. 5, 3430–3437 (2020)
16. Tsai, R.Y., Lenz, R.K.: A new technique for fully autonomous and efficient 3D robotics hand/eye calibration. IEEE Trans. Robot. Autom. 5, 345–358 (1989)
17. Fedorov, A., Beichel, R., Kalpathy-Cramer, J., et al.: 3D Slicer as an image computing platform for the quantitative imaging network. Magn. Reson. Imag. 30, 1323–1341 (2012)

Adversarial Detection and Defense for Medical Ultrasound Images: From a Frequency Perspective

Jian Wang1, Sainan Zhang1, Yanting Xie1, Hongen Liao2, and Fang Chen1(B)

1 Department of Computer Science and Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing, China [email protected]
2 Department of Biomedical Engineering, School of Medicine, Tsinghua University, Beijing, China

Abstract. B-mode ultrasound imaging is a popular medical imaging technique, and, as in other image processing tasks, deep neural networks (DNNs) have become popular for the analysis of B-mode ultrasound images. However, a recent study has demonstrated that medical deep learning systems can be compromised by carefully engineered adversarial attacks with small imperceptible perturbations. Therefore, adversarial attacks on ultrasound images are analyzed from a frequency perspective for the first time, and an easy-to-use adversarial example defense method is proposed, which generalizes to different attack methods without retraining. Extensive experiments on two publicly available ultrasound datasets, i.e., the Breast ultrasound and Thyroid ultrasound datasets, demonstrate that the proposed defense method can detect adversarial attacks on ultrasound images with a high mean accuracy.

Keywords: Adversarial attacks · Ultrasound images · Defense

1 Introduction

Given that ultrasound is cost-effective, non-hazardous, portable, and routinely performed in various clinical examinations, ultrasound imaging has been widely used in medical diagnosis [1]. In traditional analysis of ultrasound images, experienced physicians differentiate malignant from benign findings mainly on the basis of information such as shape and texture [2]. The development of deep learning has made deep neural networks (DNNs) increasingly popular for the analysis of B-mode ultrasound images. For example, Guo et al. [3] achieved good performance in the automatic classification of thyroid ultrasound standard plane images using the 8-layer CNN model ResNet. Lazo et al. [4] compared different DNNs, including VGG-16 and Inception V3, for the task of automated breast tumor classification. These results indicate the potential applicability of DNN-based ultrasound image analysis systems in clinical diagnosis.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024 G. Wang et al. (Eds.): APCMBE 2023, IFMBE Proceedings 104, pp. 73–82, 2024. https://doi.org/10.1007/978-3-031-51485-2_9


J. Wang et al.

Although DNN-based ultrasound image analysis systems offer the opportunity to further improve lesion diagnosis, the discovery of adversarial examples poses a security threat to these systems [5]. Several methods are commonly used to generate adversarial examples for attacking DNN models. Published work on adversarial examples can be broadly divided into two categories: (1) gradient-based attack algorithms, such as the single-step attack FGSM [6] and the multi-step attacks I-FGSM [7] and PGD [8]; and (2) optimization-based attack methods, such as C&W [9] and DeepFool [10]. In Fig. 1, the ultrasound adversarial examples are generated by adding carefully designed perturbations to the original ultrasound images using different attack methods. As shown in Fig. 1, adversarial ultrasound images with perturbations imperceptible to humans can cause DNNs to misclassify a normal image as benign or malignant. Adversarial ultrasound examples can therefore fool DNNs into making wrong predictions, which undermines the security of these systems.
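To make the single-step gradient attack concrete, the sketch below applies the FGSM update x_adv = x + ε · sign(∇ₓ loss) to a toy logistic-regression classifier. The classifier, its weights, and the input vector are illustrative assumptions standing in for a DNN and an ultrasound image; only the update rule itself is FGSM.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cross_entropy(x, y, w, b):
    """Binary cross-entropy of a logistic-regression model on one sample."""
    p = sigmoid(x @ w + b)
    return -(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))

def fgsm(x, y, w, b, eps):
    """One-step FGSM with an L-infinity budget eps, pixel values kept in [0, 1]."""
    p = sigmoid(x @ w + b)
    grad_x = (p - y) * w                 # gradient of the loss with respect to x
    x_adv = x + eps * np.sign(grad_x)    # step in the direction that raises the loss
    return np.clip(x_adv, 0.0, 1.0)

rng = np.random.default_rng(0)
w, b = rng.normal(size=16), 0.1          # hypothetical model parameters
x, y = rng.uniform(0.2, 0.8, size=16), 1.0
x_adv = fgsm(x, y, w, b, eps=0.02)
print(cross_entropy(x, y, w, b), cross_entropy(x_adv, y, w, b))
```

The perturbation never exceeds ε per pixel, yet the loss on the true label increases; iterating this step with projection back into the ε-ball gives multi-step attacks of the I-FGSM and PGD type.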

Fig. 1. Ultrasound adversarial examples attack the classifier. The left image depicts the original ultrasound image with normal prediction results. Images on the right are three ultrasound adversarial examples generated by different attack methods, which disturb the classifier and cause wrong predictions.

Many methods have been proposed to defend against adversarial attacks. One common strategy is adversarial training, which augments the training set with adversarial examples to retrain the DNN model and improve its robustness. However, adversarial training is not well suited to medical imaging datasets, since a large number of adversarial images injected into the training dataset may significantly affect the accuracy. To tackle this problem, adversarial example detection methods have been proposed, which mainly distinguish the examples by exploiting differences between original and adversarial examples. For example, by analyzing the difference in data manifold between original and adversarial examples, Feinman et al. [11] trained a logistic regression (LR) detector on features obtained from density estimation and Bayesian uncertainty of the models. The Detection by Attack (DBA) method presented in [12] detects adversarial examples by exploiting the asymmetric vulnerability of original and adversarial examples.


Different from existing studies, this paper is the first to analyze adversarial attacks on ultrasound images from a frequency perspective, and it proposes an easy-to-use adversarial example defense method that generalizes to different attack methods. Specifically, the proposed defense method requires only a simple frequency domain conversion, with no need for retraining. Extensive experiments on two publicly available ultrasound datasets demonstrate the effectiveness of the proposed defense method in detecting adversarial attacks on ultrasound images with high mean accuracy.

Fig. 2. The framework of our method, which can be divided into two modules including the ultrasound adversarial examples analysis and ultrasound adversarial examples detection

2 Ultrasound Adversarial Images Detection Method: From a Frequency Perspective

2.1 Overview

As shown in Fig. 2, the framework of the proposed method can be divided into two modules: ultrasound adversarial examples analysis and ultrasound adversarial examples detection. In the analysis module, the original images are used to train the ultrasound classification models, and different attack algorithms are adopted to generate aggressive adversarial examples. For these adversarial examples, the feature distribution is analyzed, and the difference between adversarial and original ultrasound images in the frequency domain is compared. Based on these analysis results, adversarial ultrasound examples are detected from a frequency perspective: only a simple frequency domain conversion is required to calculate the variation coefficient of the spectrum, which determines whether an unseen ultrasound image is normal or adversarial.

2.2 Feature Distribution of Adversarial Examples

Exploring the specific characteristics of adversarial ultrasound images is the basis for developing an effective detection method. First, the difference in feature distribution between original and adversarial ultrasound images is analyzed.


Figure 3 visualizes the feature distributions of original and adversarial ultrasound samples extracted from the last layer of ResNet18 [13] on Thyroid ultrasound datasets using the t-SNE method. It can be clearly observed that features of the original images (green circles) and the adversarial ultrasound images (blue circles) sharing the same class label of “normal” are obviously different. In addition, the generated ultrasound adversarial examples cross the original decision boundary of the DNN model, which undoubtedly fools the ultrasound classification models and influences the diagnosis security.

Fig. 3. The distribution of the original examples (green circles and yellow stars) and the generated adversarial examples (blue circles and red stars) of Thyroid datasets, with the grey line representing the decision boundary of the DNN model.

2.3 Difference of Frequency Domain

To a human observer, the perturbations added by different attack methods share some similarities with speckle noise, an inherent interference in ultrasound imaging systems [14]. Previous studies on speckle reduction have explored the significance of frequency domain information for analyzing speckle noise [15–17]. Motivated by this observation, the differences between the original and adversarial ultrasound images are analyzed here from a frequency perspective. In this section, 100 original images are randomly selected and the corresponding adversarial images are generated for analysis. First, the grayscale histograms of the original and adversarial ultrasound images are compared in the spatial domain. As shown in Fig. 4a, the original and adversarial ultrasound images show no major difference in the spatial domain. Second, the same ultrasound samples are transformed from the spatial domain into the frequency domain for further analysis of the differences. The transformation uses the Fourier transform:

$$F(u, v) = \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} f(x, y)\, e^{-i 2\pi \left( \frac{ux}{M} + \frac{vy}{N} \right)} \tag{1}$$

where M and N denote the width and height of the ultrasound images, respectively; f(x, y) is the value of the ultrasound image at position (x, y) in the spatial domain; (u, v) is the coordinate of a spatial frequency on the frequency spectrum; F(u, v) is the complex frequency value; and e and i are Euler’s number and the imaginary unit, respectively. By Euler’s formula, the Fourier transform can also be written as:

$$F(u, v) = \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} f(x, y) \left[ \cos 2\pi \left( \frac{ux}{M} + \frac{vy}{N} \right) - i \sin 2\pi \left( \frac{ux}{M} + \frac{vy}{N} \right) \right] \tag{2}$$
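As a quick check of Eq. (1), the sketch below evaluates the double sum directly and compares it with NumPy's fast Fourier transform, which uses the same sign convention and no normalization in the forward direction. The small random array is an illustrative stand-in for an ultrasound image; here M and N are simply the sizes of the two array axes.

```python
import numpy as np

def dft2_direct(f):
    """Evaluate Eq. (1) term by term for a small 2D array f of shape (M, N)."""
    M, N = f.shape
    x = np.arange(M)[:, None]
    y = np.arange(N)[None, :]
    F = np.zeros((M, N), dtype=complex)
    for u in range(M):
        for v in range(N):
            F[u, v] = np.sum(f * np.exp(-2j * np.pi * (u * x / M + v * y / N)))
    return F

rng = np.random.default_rng(0)
image = rng.uniform(0.0, 1.0, size=(4, 5))     # toy "image"
F_direct = dft2_direct(image)
F_fast = np.fft.fft2(image)
print(np.allclose(F_direct, F_fast))           # True
magnitude = np.abs(F_fast)                     # spectrum used for histograms
```

The direct sum costs O(M²N²) operations and is only practical for tiny arrays; `np.fft.fft2` gives the same values efficiently for full-size images.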

Fig. 4. a Grayscale histograms of the original and adversarial examples using different attack methods in the spatial domain; b Spectrum histograms of the original and adversarial examples using different attack methods in the frequency domain.

As shown in Fig. 4b, the difference between the spectrum histogram distributions of the original and adversarial ultrasound images is more obvious in the frequency domain than in the spatial domain.

2.4 Coefficient of Variation

Given these findings, it is worth designing a defense method that exploits the distribution difference between the original and adversarial ultrasound examples in the frequency domain. To this end, a new and easy-to-use evaluation measure, the Coefficient of Variation (C.V) of the frequency spectrum, is proposed from a frequency perspective to distinguish original from adversarial ultrasound examples with no need for retraining. This measure is related to the distribution of the spectrum histograms and is calculated as:

$$C.V = \alpha \cdot \frac{\sqrt{\frac{1}{WH} \sum_{x=0}^{W-1} \sum_{y=0}^{H-1} \left( g(x, y) - \bar{g} \right)^{2}}}{\frac{1}{WH} \sum_{x=0}^{W-1} \sum_{y=0}^{H-1} g(x, y)} \tag{3}$$

where W × H is the spectrum size; (x, y) is a coordinate in the spectrum; g(x, y) is the value at point (x, y); $\bar{g}$ is the average value of the pixels in the whole spectrum; and α is a hyperparameter. The C.V values of the 100 randomly selected original images and the corresponding adversarial examples are calculated in the spatial and frequency domains, respectively. As shown in Fig. 5a, the C.V values of the adversarial images tend to be the same as those of the original images in the spatial domain. In contrast, as shown in Fig. 5b, the C.V values in the frequency domain are separated by clear intervals, indicating the feasibility of detecting ultrasound adversarial examples using C.V values from a frequency perspective.
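Eq. (3) is the population standard deviation of the spectrum divided by its mean, scaled by α. A minimal sketch of the computation is given below; taking g as the log-scaled magnitude of the 2D FFT is an assumption made for illustration (the text only says g is the value at each spectrum point), and the function names and default α = 1 are likewise hypothetical.

```python
import numpy as np

def coefficient_of_variation(g, alpha=1.0):
    """Eq. (3): alpha * (population standard deviation of g) / (mean of g)."""
    g = np.asarray(g, dtype=float)
    return alpha * g.std() / g.mean()    # np.std uses ddof=0, i.e. the 1/(WH) form

def spectrum_cv(image, alpha=1.0):
    """C.V of the log-magnitude Fourier spectrum of a 2D image (assumed choice of g)."""
    magnitude = np.abs(np.fft.fft2(image))
    return coefficient_of_variation(np.log1p(magnitude), alpha)

rng = np.random.default_rng(0)
image = rng.uniform(0.0, 1.0, size=(64, 64))                   # toy stand-in image
perturbed = np.clip(image + 0.02 * np.sign(rng.normal(size=image.shape)), 0.0, 1.0)
print(spectrum_cv(image), spectrum_cv(perturbed))
```

A detector in the spirit of the text would compare `spectrum_cv` of an unseen image against an interval or threshold estimated from known original images; how that boundary is chosen is not specified in the source.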

Fig. 5. The C.V value of the original examples and adversarial examples in the spatial domain (left) and the frequency domain (right), respectively.

3 Experiments and Results

3.1 Datasets and Implementation Details

Datasets

(1) Breast dataset: The breast ultrasound dataset [18] contains 780 ultrasound images from patients aged 25 to 75 years. These breast ultrasound images, with an average size of 500 × 500 pixels, are divided into three categories: benign, malignant, and normal.
(2) Thyroid dataset: The thyroid ultrasound dataset [19] is from the Chinese Medical Ultrasound Artificial Intelligence Alliance (CMUAA) and contains 3,644 thyroid ultrasound images, divided into two categories: normal and malignant.

Implementation Details

The model was implemented in PyTorch, and the code was run on a single NVIDIA RTX 12 GB GPU. In this experiment, three common CNN classifiers, i.e., VGG13 [20], ResNet18 [13], and Inception-V3 [21], were used as the target models. For training, the images in both datasets were resized to 224 × 224 for VGG13 and ResNet18 and to 299 × 299 for Inception-V3. In addition, three types of attacks, namely the FGSM and PGD white-box attacks and the SPSA [22] black-box attack, were used to craft the adversarial ultrasound examples for evaluating the frequency-domain-based adversarial detection method.


3.2 Adversarial Examples Detection on Ultrasound Images

This section presents the detection results for the aggressive adversarial ultrasound examples under different attacks and perturbation levels. The detection accuracy of the proposed method was measured according to the following formula:

$$\mathrm{Acc.} = \frac{TA + TO}{TA + FA + TO + FO} \tag{4}$$

where TA and TO represent the numbers of correctly detected ultrasound adversarial examples and original examples, respectively; FA is the number of ultrasound original examples mistakenly detected as adversarial examples; and FO is the number of ultrasound adversarial examples mistakenly detected as original examples.

(1) Adversarial examples detection under different attacks

To demonstrate the advantages of the proposed adversarial detection approach under different attacks with the L∞ perturbation ε = 0.02, it was compared with related defense methods, namely DBA [12] and LR [11]. Table 1 shows the detection accuracy for ultrasound adversarial examples crafted by different attacks. In Table 1, the proposed approach achieves the best detection results on both datasets, outperforming the DBA and LR methods by at least 9.11 and 5.5 percentage points, respectively. Besides, this frequency-based detection method is easy to use with no need for model training.

Table 1. Accuracy (%) of three methods for ultrasound adversarial examples detection. For each attack method, the same L∞ perturbation ε = 0.02 was set on the BUSI and Thyroid datasets.

| Datasets | Attacks | VGG13 DBA | VGG13 LR | VGG13 Ours | ResNet18 DBA | ResNet18 LR | ResNet18 Ours | InceptionV3 DBA | InceptionV3 LR | InceptionV3 Ours |
|----------|---------|-----------|----------|------------|--------------|-------------|---------------|-----------------|----------------|------------------|
| BUSI     | FGSM    | 79.84     | 83.52    | 95.67      | 82.31        | 84.51       | 96.37         | 81.39           | 83.67          | 91.79            |
| BUSI     | PGD     | 76.25     | 81.34    | 93.76      | 79.68        | 83.23       | 95.16         | 77.64           | 79.94          | 87.96            |
| BUSI     | SPSA    | 77.64     | 80.13    | 95.89      | 80.02        | 87.43       | 96.51         | 78.98           | 81.64          | 91.37            |
| Thyroid  | FGSM    | 82.17     | 85.97    | 93.78      | 83.45        | 87.06       | 92.56         | 80.94           | 86.58          | 93.60            |
| Thyroid  | PGD     | 79.06     | 83.49    | 90.37      | 78.94        | 85.04       | 91.46         | 77.06           | 84.41          | 93.89            |
| Thyroid  | SPSA    | 79.95     | 84.07    | 94.29      | 81.46        | 85.94       | 93.31         | 78.98           | 85.13          | 95.88            |
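The accuracy in Eq. (4) and the margins quoted from Table 1 can be reproduced with a few lines of Python. In the sketch below, the confusion counts passed to `detection_accuracy` are hypothetical, while the accuracy triples (DBA, LR, Ours) are copied from Table 1; the variable names are illustrative.

```python
def detection_accuracy(ta, to, fa, fo):
    """Eq. (4): fraction of adversarial and original examples labelled correctly."""
    return (ta + to) / (ta + fa + to + fo)

# Hypothetical counts: 45 adversarial and 50 original examples detected correctly,
# 0 originals flagged as adversarial, 5 adversarial examples missed.
print(detection_accuracy(ta=45, to=50, fa=0, fo=5))  # 0.95

# Table 1 accuracies as (DBA, LR, Ours) triples, in row order
# (BUSI then Thyroid; FGSM, PGD, SPSA; VGG13, ResNet18, InceptionV3).
table1 = [
    (79.84, 83.52, 95.67), (82.31, 84.51, 96.37), (81.39, 83.67, 91.79),
    (76.25, 81.34, 93.76), (79.68, 83.23, 95.16), (77.64, 79.94, 87.96),
    (77.64, 80.13, 95.89), (80.02, 87.43, 96.51), (78.98, 81.64, 91.37),
    (82.17, 85.97, 93.78), (83.45, 87.06, 92.56), (80.94, 86.58, 93.60),
    (79.06, 83.49, 90.37), (78.94, 85.04, 91.46), (77.06, 84.41, 93.89),
    (79.95, 84.07, 94.29), (81.46, 85.94, 93.31), (78.98, 85.13, 95.88),
]
min_gain_vs_dba = min(ours - dba for dba, _, ours in table1)
min_gain_vs_lr = min(ours - lr for _, lr, ours in table1)
print(round(min_gain_vs_dba, 2), round(min_gain_vs_lr, 2))  # 9.11 5.5
```

Both minimum margins occur for ResNet18 under FGSM on the Thyroid dataset.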

(2) Adversarial examples detection under different perturbation levels

The L∞ perturbation levels were set to 0.010, 0.015, 0.020, 0.025, and 0.030 when generating ultrasound adversarial examples. Figure 6 shows the detection rate of the ultrasound adversarial examples under different perturbation levels. For the three classification models and attack methods, the detection rate of the proposed


method is no less than 80%, even under a small perturbation level of 0.010. In addition, the detection success rate increases steadily with the perturbation level, which may be because larger perturbations cause more significant differences between the original and adversarial examples in the frequency domain, making detection easier. The frequency-domain-based defense method may achieve high accuracy for the following reasons: (1) For ultrasound analysis tasks, frequency information has shown great potential [23, 24], and the proposed method makes full use of the difference between the original and adversarial ultrasound images in the frequency domain; (2) Existing defense methods designed for natural images fail to consider the characteristics of ultrasound images. In contrast, here the adversarial attacks on ultrasound images are analyzed from a frequency perspective, and the defense method is designed according to the characteristics of ultrasound images.

Fig. 6. The accuracy of our proposed ultrasound adversarial examples detection method. a, b and c represent the accuracy of the method for three ultrasound classification model (i.e. VGG13, ResNet18, Inception-V3) with three attack algorithms on two datasets, respectively.

4 Conclusion

This study is the first to analyze adversarial attacks on ultrasound images from a frequency perspective, and it proposes an easy-to-use adversarial example defense method that needs only a simple frequency domain conversion to detect ultrasound adversarial examples, with no need for model retraining. Extensive experiments on two publicly available ultrasound datasets show that this method generalizes to different attack methods and achieves high defense accuracy. In short, the present work emphasizes the importance of frequency analysis when designing defenses for ultrasound images.

Acknowledgments. The authors acknowledge support from National Natural Science Foundation of China grants (62271246, U20A20389, 61901214), the China Postdoctoral Science Foundation (2021T140322, 2020M671484), the Jiangsu Planned Projects for Postdoctoral Research Funds (2020Z024), and the High-level Innovation and Entrepreneurship Talents Introduction Program of Jiangsu Province of China. Fang Chen is the corresponding author.


References

1. Ma, X., Niu, Y., Gu, L., et al.: Understanding adversarial attacks on deep learning based medical image analysis systems. Pattern Recognit. 110, 107332 (2021)
2. Acharya, U.R., et al.: Thyroid lesion classification in 242 patient population using gabor transform features from high resolution ultrasound images. Knowl. Based Syst. 107, 235–245 (2016)
3. Guo, M., Du, Y.: Classification of thyroid ultrasound standard plane images using resnet-18 networks. In: 2019 IEEE 13th International Conference on ASID. IEEE, pp. 324–328 (2019)
4. Lazo, J.F., Moccia, S., Frontoni, E., et al.: Comparison of different cnns for breast tumor classification from ultrasound images. arXiv preprint. arXiv:2012.14517 (2020)
5. Byra, M., et al.: Adversarial attacks on deep learning models for fatty liver disease classification by modification of ultrasound image reconstruction method. In: 2020 IEEE IUS. IEEE, pp. 1–4 (2020)
6. Goodfellow, I.J., Shlens, J., Szegedy, C.: Explaining and harnessing adversarial examples. arXiv preprint. arXiv:1412.6572 (2014)
7. Kurakin, A., Goodfellow, I., Bengio, S.: Adversarial machine learning at scale. In: 5th ICLR, ICLR 2017, Toulon, France, April 24–26, Conference Track Proceedings (2017)
8. Madry, A., Makelov, A., Schmidt, L., et al.: Towards deep learning models resistant to adversarial attacks. arXiv preprint. arXiv:1706.06083 (2017)
9. Carlini, N., Wagner, D.: Towards evaluating the robustness of neural networks learned by transduction. In: The 10th ICLR, ICLR 2022, Virtual Event, April 25–29 (2022)
10. Moosavi-Dezfooli, S.M., Fawzi, A., Frossard, P.: Deepfool: a simple and accurate method to fool deep neural networks. In: IEEE CVPR, pp. 2574–2582 (2016)
11. Feinman, R., Curtin, R.R., Shintre, S., et al.: Detecting adversarial samples from artifacts. arXiv preprint. arXiv:1703.00410 (2017)
12. Zhou, Q., Zhang, R., Wu, B., Li, W., Mo, T.: Detection by attack: Detecting adversarial samples by undercover attack. In: ESORICS. Springer, pp. 146–164 (2020)
13. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: IEEE CVPR, pp. 770–778 (2016)
14. Tai, T.M., Jhang, Y.J., Hwang, W.J., et al.: Speckle image restoration without clean data. arXiv preprint. arXiv:2205.08833 (2022)
15. Trahey, G.E., Allison, J.W., Smith, S.W., et al.: A quantitative approach to speckle reduction via frequency compounding. Ultrason. Imag. 8(3), 151–164 (1986)
16. Park, J., Kang, J.B., Chang, J.H., et al.: Speckle reduction techniques in medical ultrasound imaging. BMEL 4(1), 32–40 (2014)
17. Yoon, C., Kim, G.D., Yoo, Y., et al.: Frequency equalized compounding for effective speckle reduction in medical ultrasound imaging. Biomed. Signal Process. Control 8(6), 876–887 (2013)
18. Al-Dhabyani, W., Gomaa, M., Khaled, H., et al.: Dataset of breast ultrasound images. Data Brief 28, 104863 (2020)
19. Zhou, J., et al.: Thyroid nodule segmentation and classification in ultrasound images. In: International Conference on MICCAI (2020)
20. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint. arXiv:1409.1556 (2014)
21. Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., Wojna, Z.: Rethinking the inception architecture for computer vision. In: IEEE CVPR, pp. 2818–2826 (2016)
22. Uesato, J., O’Donoghue, B., Kohli, P., et al.: Adversarial risk and the dangers of evaluating against weak attacks. In: International Conference on Machine Learning. PMLR, pp. 5025–5034 (2018)
23. Hacihaliloglu, I., Abugharbieh, R., Hodgson, A., et al.: Bone segmentation and fracture detection in ultrasound using 3d local phase features. In: MICCAI. Springer, pp. 287–295 (2008)
24. Xian, M., et al.: Fully automatic segmentation of breast ultrasound images based on breast characteristics in space and frequency domains. Pattern Recogn. 48(2), 485–497 (2015)

A Novel Model-Independent Approach for Autonomous Retraction of Soft Tissue

Jiaqi Chen, Longfei Ma, Xinran Zhang, and Hongen Liao(B)

Department of Biomedical Engineering, School of Medicine, Tsinghua University, Beijing 100084, China
[email protected]

Abstract. Surgical robots have played an important role in guiding surgery toward precise interventions and targeted therapy. By providing standardized surgical solutions, autonomous surgical robotic systems can potentially improve the reproducibility and consistency of robot-assisted surgery. To enhance the autonomy of surgical robots, we propose a novel model-independent training methodology for the retraction and manipulation of deformable tissue in robotic surgery. We use contrastive optimization to learn the underlying visual latent representations of deformable tissue. We then apply an improved version of Deep Deterministic Policy Gradients (DDPG) with asymmetric inputs to train an agent in a simulation environment. We evaluated the effectiveness and validity of our method on three different criteria. Compared with the state-of-the-art method, our method can, in particular, accomplish safe tissue retraction in constrained situations without collecting expert demonstrations.

Keywords: Robotic surgery · Deformable tissue manipulation · Autonomous manipulation · Deep reinforcement learning

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024 G. Wang et al. (Eds.): APCMBE 2023, IFMBE Proceedings 104, pp. 83–90, 2024. https://doi.org/10.1007/978-3-031-51485-2_10

1 Introduction
Advanced surgical techniques have led to better prognostic outcomes for patients, but have also placed greater demands on surgeons' perception, reasoning and decision-making abilities. In response to these clinical demands, surgical robots have played an important role in guiding surgery toward precise interventions and targeted therapy. However, current robotic surgery relies exclusively on teleoperation control strategies, which introduces human-related risk factors into the surgical system. Autonomous surgical robotic systems aim to provide standardized surgical solutions, potentially reducing human error and improving the reproducibility and consistency of robot-assisted surgery [1].

In robot-assisted minimally invasive surgery, a large portion of the time is spent on repeated retraction and manipulation of tissue to expose the area of interest [2]. For example, in laparoscopic adrenalectomy, retraction of the fat tissue is essential for better exposure of the adrenal gland and the kidney. This subtask is rather stressful for surgeons, since it may have to be repeated several times because of slippage between the tissue and the instrument. Tissue retraction requires extensive interaction with soft tissue that has heterogeneous physical and geometric properties, with high inter- and intra-subject variability. Limited perception brings risks such as instrument collision and tissue damage that can negatively affect the procedure [3].

Methods for deformable object manipulation can be divided into two categories: model-based control methods and data-driven control methods. Model-based techniques for autonomous manipulation of an unknown homogeneous or heterogeneous deformable tissue might not be feasible in a real surgical site, since the mechanical properties of the tissue may vary widely. Moreover, the surrounding anatomy may constrain the deformation of the tissue and make the deformation harder to model [4]. Model-independent, data-driven and learning-based control methods, on the other hand, may be a practical solution for deformable object manipulation. However, acquiring datasets and expert demonstrations is rather expensive and unsafe for the tissue retraction task [5].

In this study, we propose a novel model-independent training methodology for the retraction and manipulation of deformable tissue in robotic surgery. To automate deformable tissue retraction, we use contrastive optimization to learn the underlying visual latent representations of deformable tissue. We then use an improved version of DDPG [6] with asymmetric inputs to train an agent in a simulation environment. To thoroughly evaluate the proposed approach, we tested it on three different criteria. We demonstrated substantial improvements in constrained deformable tissue retraction over other approaches.

2 Methods
We consider a constrained surgical scene for the tissue retraction task. The objective of the task is to grasp and retract soft tissue optimally and efficiently so as to expose a hidden Region of Interest (ROI) to an intra-operative fixed-position camera. An efficient retraction policy should avoid pulling highly constrained areas that are adjacent to the attachment points (APs). Such policies could empirically minimize the risk of tissue damage and task failure. We use the da Vinci Research Kit [7] and the UnityFlexML [8, 9] framework to simulate the manipulation and retraction of fat tissue. The UnityFlexML framework is based on NVIDIA FleX [10] for soft object deformation simulation. We combine State Representation Learning (SRL) and asymmetric Actor-Critic Deep Reinforcement Learning (DRL) to learn the optimal policy for the retraction of constrained deformable tissue, as described in Fig. 1.

2.1 State Representation Learning (SRL) Setup
To overcome the challenge of high-dimensional perception, SRL methods can be used to create a latent representation that contains only the information useful for controlling the robot. SRL approaches also play an important role in the state estimation and representation of deformable objects. As shown in Fig. 2, we use an SRL method based on contrastive optimization.


Fig. 1. Overview of our SRL and DRL method

Fig. 2. SRL setup of our method

The observation $I_t$ is embedded into a latent space through the network $\phi$ as $z_t = \phi(I_t)$. Given a transition $(z_t, a_t, z_{t+1})$ in the latent space, the forward dynamics model $f$ predicts the next latent state as $\hat{z}_{t+1} = f(z_t, a_t)$. The forward dynamics model $f$ is trained by minimizing a contrastive prediction loss. The contrastive loss shapes the latent space by pulling representations from the same class together and pushing those from different classes apart. We use the InfoNCE [11] contrastive prediction loss as follows:

$$L = -\mathbb{E}_{D}\left[\log \frac{h(\hat{z}_{t+1}, z_{t+1})}{\sum_{i=1}^{k} h(\hat{z}_{t+1}, \hat{z}_i)}\right] \qquad (1)$$

where $\hat{z}_i$ is one of the negative samples. In practice, we randomly pick $B$ trajectories other than the anchor trajectory in the same batch, and then randomly choose $k$ samples from these trajectories as negative samples. $h$ is the similarity function, for which we use:

$$h(z_1, z_2) = \exp\left(-\lVert z_1 - z_2 \rVert^2\right) \qquad (2)$$

where the norm is the $\ell_2$-norm. We use a dataset of 2132 trajectories collected from the simulation environment to train the contrastive optimization model. The dataset is composed of a pre-trained part and a random exploration part. Each trajectory contains 30 timesteps of images from the RGB camera (128 × 128 × 3) and other environment states. We adopt random crop [12] for data augmentation. Augmentations are applied randomly across the batch but consistently across each trajectory, which allows the augmentation to retain temporal information.
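The loss in Eqs. (1) and (2) can be illustrated with a short, dependency-free sketch. It is not the authors' implementation: the function names `similarity` and `infonce_loss` are hypothetical, latent vectors are plain Python lists, and the loss is evaluated for a single transition rather than as an expectation over the dataset.

```python
import math


def similarity(z1, z2):
    """Eq. (2): h(z1, z2) = exp(-||z1 - z2||^2) for two equal-length vectors."""
    return math.exp(-sum((a - b) ** 2 for a, b in zip(z1, z2)))


def infonce_loss(z_pred, z_next, negatives):
    """Eq. (1) for one transition.

    z_pred    -- predicted next latent state f(z_t, a_t)
    z_next    -- encoded true next latent state (the positive sample)
    negatives -- list of k latent vectors drawn from other trajectories
    """
    positive = similarity(z_pred, z_next)
    negative_sum = sum(similarity(z_pred, z_n) for z_n in negatives)
    return -math.log(positive / negative_sum)


# A perfect prediction yields a lower loss than a prediction that is off by one unit.
negatives = [[1.0, 1.0, 1.0]] * 4
good = infonce_loss([0.0, 0.0, 0.0], [0.0, 0.0, 0.0], negatives)
bad = infonce_loss([0.0, 0.0, 0.0], [1.0, 0.0, 0.0], negatives)
```

Following Eq. (1) as written, the denominator sums over the negative samples only; some InfoNCE variants also include the positive pair there.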


2.2 Deep Reinforcement Learning (DRL) Setup
We use DDPG [6] with asymmetric inputs to learn tissue retraction policies. Numerous algorithms have been derived from DDPG to improve its performance, among which the Twin Delayed Deep Deterministic Policy Gradient algorithm (TD3) [13] addresses the overestimation bias of the Q function. We employ TD3 with asymmetric inputs to accelerate deep reinforcement learning training by exploiting privileged information available in the simulation environment, as shown in Fig. 3. The asymmetric-input network structure uses state representations of high-dimensional observations as inputs to the actor network and low-dimensional environmental states (end effector position, robotic arm states, etc.) as inputs to the critic network. This network structure greatly reduces the number of trainable parameters, accelerates the learning process and improves the accuracy of the critic output.
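The asymmetric split can be sketched as follows: the actor sees only the learned latent state, while the two critics see the privileged low-dimensional simulator state, and the bootstrapped target takes the smaller of the two critic estimates as in TD3 [13]. This is a minimal sketch under stated assumptions: the function names, the callable stand-ins for the networks and the default discount factor are hypothetical, and TD3's target-policy smoothing noise and delayed updates are omitted.

```python
def td3_target(reward, done, q1_next, q2_next, gamma=0.99):
    """Clipped double-Q target: bootstrap from the smaller critic estimate."""
    return reward + gamma * (1.0 - done) * min(q1_next, q2_next)


def asymmetric_target(actor, critic1, critic2,
                      latent_next, state_next, reward, done, gamma=0.99):
    """Build the critic target with asymmetric inputs.

    actor(latent)          -- policy acting on the image latent z_{t+1}
    critic(state, action)  -- Q estimate from privileged low-dimensional state
    """
    action_next = actor(latent_next)            # actor input: latent only
    q1 = critic1(state_next, action_next)       # critic input: simulator state
    q2 = critic2(state_next, action_next)
    return td3_target(reward, done, q1, q2, gamma)


# Toy stand-ins for the networks (assumptions for illustration only).
toy_actor = lambda z: [0.5 * x for x in z]
toy_critic1 = lambda s, a: sum(s) + sum(a)
toy_critic2 = lambda s, a: sum(s) - sum(a)

y = asymmetric_target(toy_actor, toy_critic1, toy_critic2,
                      latent_next=[1.0, 1.0], state_next=[1.0, 2.0],
                      reward=1.0, done=0.0, gamma=0.9)
```

Because the critics are only needed during training, the privileged state never has to be available at deployment time; only the actor's latent input does.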

Fig. 3. DRL setup of our method

The agent of our DRL setup is represented by the end effector (EE) of the da Vinci PSM. The initial position of the kidney and tumor is defined as $p_k$. The initial position of the EE is defined as $p_0$, its position at time $t$ as $p_t$, and its target position as $p_T$. The positions of the attachment points (APs) are defined as $p_{A1}$ and $p_{A2}$. We use $c_t \in \{0, 1\}$ to define the state of the EE gripper (open or closed). Inspired by the Dense2Sparse reward shaping approach in [14], we combine a dense reward and a sparse reward. Before episode $k$, we use a dense reward function, which has the advantage of faster learning. We also deploy an "anti-goal" depending on $p_{A1}$ and $p_{A2}$, thus specifying a state that the agent should avoid.

$$r_d = \begin{cases} \min\left(0,\; -\lVert p_t - p_k \rVert + d_t \cdot \alpha_1\right), & \text{if } c_t = 0 \\ \min\left(0,\; -\lVert p_t - p_T \rVert + d_t \cdot \alpha_2\right), & \text{if } c_t = 1 \\ 1, & \text{if } \lVert p_t - p_T \rVert < \delta \end{cases} \qquad (3)$$

where $d_t$ is the weighted Euclidean distance between $p_t$ and $p_{A1}$, $p_{A2}$, $\delta$ is the threshold for the EE reaching the target position, and $\alpha_1$ and $\alpha_2$ are normalization factors. After episode $k$, we use a sparse reward function, which has the advantage of being less susceptible to interference and noise in the environment.

$$r_s = \begin{cases} 1, & \text{if } \lVert p_t - p_T \rVert < \delta \\ 0, & \text{otherwise} \end{cases} \qquad (4)$$
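The switch between Eqs. (3) and (4) can be sketched in a few lines. Several details below are assumptions rather than the authors' choices: the function names, the equal weights used for the weighted distance $d_t$, checking the goal condition before the two gripper cases, and treating $c_t = 0$ as the open gripper.

```python
import math


def dense_reward(p_t, c_t, p_k, p_T, p_A1, p_A2,
                 alpha1, alpha2, delta, weights=(0.5, 0.5)):
    """Eq. (3). The goal case is checked first (assumption)."""
    if math.dist(p_t, p_T) < delta:
        return 1.0
    # Anti-goal term: weighted distance to the two attachment points.
    d_t = weights[0] * math.dist(p_t, p_A1) + weights[1] * math.dist(p_t, p_A2)
    if c_t == 0:   # before grasping: approach the tissue over the kidney
        return min(0.0, -math.dist(p_t, p_k) + d_t * alpha1)
    return min(0.0, -math.dist(p_t, p_T) + d_t * alpha2)   # after grasping


def sparse_reward(p_t, p_T, delta):
    """Eq. (4): 1 when the EE is within delta of the target, else 0."""
    return 1.0 if math.dist(p_t, p_T) < delta else 0.0


def shaped_reward(episode, k, p_t, c_t, p_k, p_T, p_A1, p_A2,
                  alpha1, alpha2, delta):
    """Dense reward before episode k, sparse reward from episode k onward."""
    if episode < k:
        return dense_reward(p_t, c_t, p_k, p_T, p_A1, p_A2,
                            alpha1, alpha2, delta)
    return sparse_reward(p_t, p_T, delta)


# Illustrative geometry (made-up values, not from the paper).
scene = dict(p_k=(1.0, 0.0, 0.0), p_T=(0.0, 0.0, 1.0),
             p_A1=(0.0, 2.0, 0.0), p_A2=(0.0, -2.0, 0.0),
             alpha1=0.1, alpha2=1.0, delta=0.05)
r_early = shaped_reward(0, 100, (0.0, 0.0, 0.0), 0, **scene)
r_late = shaped_reward(100, 100, (0.0, 0.0, 0.0), 0, **scene)
```

In this reading, moving away from the attachment points increases $d_t$ and therefore raises the dense reward toward its cap of 0, which is how the anti-goal discourages grasping near constrained regions.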

3 Results
To evaluate the effectiveness of our method, we tested our approach on three different criteria: sample efficiency, the accomplishment degree of the task, and a safety level indicator of the task. Our virtual scene simulates the tissue retraction task during an adrenalectomy procedure. The yellow tissue represents the renal adipose tissue that needs to be retracted to expose the tumor (green sphere) embedded in the underlying kidney (red object). We compared our approach against the state-of-the-art method CURL [15], which employs auxiliary self-supervision tasks to accelerate the learning progress of RL methods. Our method can, in particular, accomplish safe tissue retraction in constrained situations.

3.1 Sample Efficiency
Learning curves obtained with our method and CURL are shown in Fig. 4. The shaded area spans the range of reward curves obtained from five different initialization seeds. Our method reached higher reward values than CURL and showed a smoother learning pattern and more consistent performance across seeds.

Fig. 4. Learning curves of normalized reward for our method and the baseline CURL

3.2 Accomplishment Degree of the Task
To evaluate the accomplishment degree of the task, we additionally specified a target plane, as shown in Fig. 5, to one side of which the deformable tissue needed to be retracted. The proportion of fat tissue particles that met this position condition was calculated and defined as η. The task success rate was averaged over the different combinations of AP positions (8 cases) and randomized initial EE positions. The task success rates of our method and CURL under different task accomplishment conditions are shown in Table 1.
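The quantity η and the resulting success rate can be computed as in the sketch below. The plane parameterization (a point and a normal), the choice of the positive side as the "retracted" side, and the function names are assumptions made for illustration; the particle coordinates are made up.

```python
def retraction_ratio(particles, plane_point, plane_normal):
    """Fraction eta of particles lying on the positive side of the target plane."""
    def signed_offset(p):
        # Dot product of (p - plane_point) with the plane normal.
        return sum((pi - qi) * ni
                   for pi, qi, ni in zip(p, plane_point, plane_normal))

    retracted = sum(1 for p in particles if signed_offset(p) > 0.0)
    return retracted / len(particles)


def success_rate(etas, threshold):
    """Share of evaluation runs whose eta reaches the given threshold."""
    return sum(1 for eta in etas if eta >= threshold) / len(etas)


particles = [(0.0, 0.0, 1.0), (0.0, 0.0, 2.0), (0.0, 0.0, -1.0), (0.0, 0.0, 3.0)]
eta = retraction_ratio(particles, plane_point=(0.0, 0.0, 0.0),
                       plane_normal=(0.0, 0.0, 1.0))
rate = success_rate([0.75, 0.90, 0.95], threshold=0.80)
```

Sweeping `threshold` over 0.80, 0.85 and 0.90 gives one success rate per accomplishment condition, which is the structure of Table 1.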


Fig. 5. Visualization of target planes for tissue retraction

Table 1. Task success rates over different task accomplishment conditions of our method and the baseline CURL

| Task accomplishment condition | Our method (%) | CURL method (%) |
|-------------------------------|----------------|-----------------|
| Successful grasp with η < 80% | 98.5           | 94.8            |
| Successful grasp with η ≥ 80% | 92.6           | 89.9            |
| Successful grasp with η ≥ 85% | 85.0           | 80.0            |
| Successful grasp with η ≥ 90% | 73.5           | 67.4            |

Sequences of action frames for task completion with our method and with CURL are shown in Fig. 6. Regardless of the random conditions, our method was able to grasp farther from the APs, expose the tumor and kidney, and avoid overstretching and damaging the deformable tissue.

Fig. 6. Sequences of action frames for tissue retraction task completion of our method (top) and CURL method (bottom)

3.3 Safety Level Indicator of the Task
We applied a novel evaluation approach to assess the safety level of the tissue retraction task. We calculated the entropy distribution of the histogram of feature angles of the fat tissue particles. We then evaluated the safety level of the learned policies based on this additional spatial information. Forty trajectories with a target threshold of 80% were collected for each method, and 10 deformable tissue states in each trajectory were recorded and evaluated. If the proportion of tissue states evaluated as safe in a trajectory was higher than 90%, the trajectory was considered a safe retraction. The proportion of safe retraction trajectories among the successful trajectories was 92.5% for our method and 80.0% for CURL. Our method could thus perform safer tissue retraction with less tissue overstretch.
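The two ingredients of this indicator, the entropy of an angle histogram and the 90% rule per trajectory, can be sketched as below. The paper does not specify the bin count, the angle range, or how an entropy value is mapped to a safe or unsafe tissue state, so the defaults and function names here are assumptions, and the per-state safety flags are simply taken as input.

```python
import math


def histogram_entropy(angles, bins=18, lo=0.0, hi=180.0):
    """Shannon entropy (in nats) of a fixed-width histogram of angles in [lo, hi]."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for angle in angles:
        index = min(int((angle - lo) / width), bins - 1)  # clamp angle == hi
        counts[index] += 1
    total = len(angles)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def is_safe_trajectory(state_is_safe, min_fraction=0.9):
    """A trajectory is safe if MORE than min_fraction of its states are safe."""
    return sum(state_is_safe) / len(state_is_safe) > min_fraction


def safe_trajectory_share(trajectories):
    """Share of trajectories classified as safe retractions."""
    return sum(1 for t in trajectories if is_safe_trajectory(t)) / len(trajectories)


concentrated = histogram_entropy([5.0, 5.0, 5.0, 5.0])   # all angles in one bin
spread = histogram_entropy([5.0, 5.0, 95.0, 95.0])       # two equally filled bins
```

Read literally, "higher than 90%" with 10 recorded states per trajectory means that a single unsafe state already disqualifies the trajectory; this follows from the strict inequality and may differ from the authors' exact rule.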

4 Discussion and Conclusion
The results show that our method achieves a higher accomplishment degree of the task despite the random states of the environment and can, in particular, accomplish safe tissue retraction in constrained situations while avoiding overstretching of the tissue. However, the proposed method must still be tested on a hardware system to further validate its efficacy. Subsequent work will focus on sim-to-real transfer of the trained policies and on transferring the learned policies to different scenes. In conclusion, we present a novel model-independent, data-driven method for the retraction and manipulation of deformable tissue in robotic surgery. This method does not require human demonstrations and can learn visual latent representations of deformable tissue from RGB images alone. The proposed method indicates the potential for autonomous control methods to be used in clinical environments.

Acknowledgment. The authors acknowledge support from the National Key Research and Development Program of China (2022YFC2405200), the National Natural Science Foundation of China (82027807, U22A2051), the Beijing Municipal Natural Science Foundation (7212202), the Tsinghua University Spring Breeze Fund (2021Z99CFY023), the Institute for Intelligent Healthcare, Tsinghua University (2022ZLB001), and the Tsinghua-Foshan Innovation Special Fund (2021THFS0104).

References
1. Wagner, M., Bodenstedt, S., Daum, M., et al.: The importance of machine learning in autonomous actions for surgical decision making. Art. Int. Surg. 2(2), 64–79 (2022). https://doi.org/10.20517/ais.2022.02
2. Patil, S., Alterovitz, R.: Toward automated tissue retraction in robot-assisted surgery. In: ICRA, IEEE, Anchorage, AK, USA, pp. 2088–2094 (2010). https://doi.org/10.1109/ROBOT.2010.5509607
3. Attanasio, A., Scaglioni, B., Leonetti, M., et al.: Autonomous tissue retraction in robotic assisted minimally invasive surgery – a feasibility study. IEEE Robot. Autom. Lett. 5(4), 6528–6535 (2020). https://doi.org/10.1109/LRA.2020.3013914
4. Tagliabue, E., Dall’Alba, D., Pfeiffer, M., et al.: Data-driven intra-operative estimation of anatomical attachments for autonomous tissue dissection. IEEE Robot. Autom. Lett. 6(2), 1856–1863 (2021). https://doi.org/10.1109/LRA.2021.3060655
5. Retana, M., Nalamwar, K., Conyers, D.T., et al.: Autonomous data-driven manipulation of an unknown deformable tissue within constrained environments: a pilot study. In: ISMR, IEEE, Georgia, USA, pp. 1–7 (2022). https://doi.org/10.1109/ISMR48347.2022.9807519
6. Lillicrap, T.P., Hunt, J.J., Pritzel, A., et al.: Continuous control with deep reinforcement learning. In: ICLR, San Diego, CA, USA (2015). http://arxiv.org/abs/1509.02971
7. Kazanzides, P., Chen, Z., Deguet, A., et al.: An open-source research kit for the da Vinci® surgical system. In: ICRA, IEEE, Hong Kong, pp. 6434–6439 (2014). https://doi.org/10.1109/ICRA.2014.6907809
8. Tagliabue, E., Pore, A., Dall’Alba, D., et al.: Soft tissue simulation environment to learn manipulation tasks in autonomous robotic surgery. In: IROS, IEEE, Las Vegas, NV, USA, pp. 3261–3266 (2020). https://doi.org/10.1109/IROS45743.2020.9341710
9. Pore, A., Corsi, D., Marchesini, E., et al.: Safe reinforcement learning using formal verification for tissue retraction in autonomous robotic-assisted surgery. In: IROS, IEEE, Prague, Czech Republic, pp. 4025–4031 (2021). https://doi.org/10.1109/IROS51168.2021.9636175
10. NVIDIA gameworks. Nvidia FleX. https://developer.nvidia.com/flex
11. Oord, A., Li, Y., Vinyals, O.: Representation learning with contrastive predictive coding. arXiv e-prints (2018). https://doi.org/10.48550/arXiv.1807.03748
12. Laskin, M., Pinto, L., Lee, K., et al.: Reinforcement learning with augmented data. arXiv e-prints (2020). https://doi.org/10.48550/arXiv.2004.14990
13. Fujimoto, S., van Hoof, H., Meger, D.: Addressing function approximation error in actor-critic methods. arXiv e-prints (2018). http://arxiv.org/abs/1802.09477
14. Luo, Y., Dong, K., Zhao, L., et al.: Balance between efficient and effective learning: dense2sparse reward shaping for robot manipulation with environment uncertainty. arXiv e-prints (2020). http://arxiv.org/abs/2003.02740
15. Srinivas, A., Laskin, M., Abbeel, P.: CURL: contrastive unsupervised representations for reinforcement learning. arXiv e-prints (2020). http://arxiv.org/abs/2004.04136

A Soft Robot Based on Magnetic-Pneumatic Hybrid Actuation for Complex Environments

Zhuxiu Liao, Jiayuan Liu, Longfei Ma, and Hongen Liao(B)

Department of Biomedical Engineering, School of Medicine, Tsinghua University, Beijing, China
[email protected]

Abstract. Compared with their conventional rigid counterparts, soft robots offer greater value in medical and industrial applications because of their potential for miniaturization, inherent compliance, and adaptability to unstructured environments. However, it remains challenging to guarantee load-carrying capacity and flexible movement at the same time in unstructured environments and confined spaces. In the present work, we propose a novel magnetic-pneumatic hybrid-actuated soft robot for locomotion in unstructured environments. This robot design yields multiple degrees of freedom, strong shape deformability, and great adaptability to irregular terrain and to scenes with different slopes and roughness. Locomotion experiments in different environments were conducted to validate its performance. Our work pioneers a novel design and actuation approach for soft robots that shows great potential in industrial and medical applications.

Keywords: Soft robot · Magnetic-pneumatic actuation · Unstructured environment

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024 G. Wang et al. (Eds.): APCMBE 2023, IFMBE Proceedings 104, pp. 91–97, 2024. https://doi.org/10.1007/978-3-031-51485-2_11

1 Introduction
Soft robots are of great interest for industrial [1] and medical applications such as minimally invasive surgery [2], endoscopic procedures [3], and drug delivery [4]. Owing to their potential for miniaturization, flexible architecture, and adaptability to unstructured environments, soft robots can non-invasively enter narrow areas inside the body and perform diagnoses or treatments [5]. However, challenges such as motion control and biocompatibility still hinder the wide use of soft robots, especially in medical applications.

There are several methods for actuating soft millirobots: electrical [6], chemical [7], optical [8], tendon [9], hydraulic [10], pneumatic [11], and magnetic [12] actuation. Electrically driven soft robots tend to require high voltage, while chemically driven robots are prone to overheating and toxic side effects. Considering biocompatibility and safety for human interaction, electrical and chemical actuation are therefore unsuitable for medical applications. They also cannot satisfy safety requirements in industrial pipelines because of potential electrical sparks or chemical reactions. When the operating position is located deep in the human body or in an industrial pipeline, it is difficult for a light-driven robot to ensure effective transmission of light. Tendon-driven soft robots often can only produce bending motion and have poor stretching performance, and thus lack effective autonomous motion capability. In addition, when a tendon-driven robot reaches deep into the human body, the intertwined coupling of the tendons affects their final output force and hence the robot's movement.

Pneumatically and hydraulically driven soft robots can produce movements such as extension, bending, and twisting by changing the pressure inside the soft robot, because of the anisotropic nature of soft robots. They combine relatively strong body support with shape deformability, which increases their obstacle-crossing and load-carrying capacities. Compared with hydraulically driven robots, pneumatically driven robots deform faster and are lighter, providing the basis for rapid robot motion and drug delivery. However, the heading directions and moving range of pneumatic soft millirobots are limited by the number and position of the actuation chambers.

Complementary to pneumatic soft robots, magnetic soft robots are actuated by an external field. They move flexibly and perform robustly, especially in wet or smooth conditions. However, because of their relatively weak shape deformability, magnetic soft millirobots are inferior to pneumatic soft millirobots in load carrying and obstacle crossing, especially in rough conditions. Magnetically driven robots are commonly controlled by magnetic fields generated with three-dimensional Helmholtz coils or with movable magnets or electromagnets.
Three-dimensional Helmholtz coils consist of an array of electromagnetic coils that offer many magnetic field degrees of freedom and good controllability, but they require the robot to be in the middle of the coil array, which limits the robot's working space and restricts its application in medical and industrial scenarios. Movable magnets or electromagnets provide a larger working space. However, precise control of the direction of the magnetic field and the magnitude of the magnetic force then requires sufficient degrees of freedom in the motion system. Driving soft robots purely with magnetic fields therefore often leads to complex motion systems, which makes application in medical and industrial scenarios more difficult.

In the present work, we propose a novel hybrid-actuated soft robot for locomotion in unstructured environments, as shown in Fig. 1. We fuse magnetic and pneumatic actuation on a millimeter scale. Combining the advantages of pneumatic and magnetic actuation, the proposed millirobot performs robustly even in harsh environments such as the colon or industrial pipelines. In medical applications, this is of great significance for reducing patients' discomfort, postoperative recovery time, and miss rates in the diagnosis of colonic disorders.

In this study, the hybrid-actuated soft robot is tested in different conditions, including locomotion on varied planes, slopes, and pipes. To explore its potential for clinical application, we tested the robot's motility in scenarios such as a gastric phantom model, a respiratory tract phantom model, a vascular phantom model, a large intestine phantom model, and a porcine colon. In the experiments of this study, the soft robot can walk fast on varied surfaces and surmount different obstacles using only one air tube and one solenoid coil. Vertical crawling is achieved by using two electromagnetic coils and one air tube. The robot's ability to move while carrying an image acquisition device is also tested. In addition, for surgical treatment applications, the air chamber of the proposed robot not only enables the robot to deform but also allows drugs to be delivered for treatment. The soft robot can walk on an uneven and wet chicken skin surface and release drugs at the target location using only one tube.

Fig. 1. The schematic diagram of the potential application of the soft robot in the stomach and the intestine.

2 Design Concept and Fabrication
Inspired by worms, which achieve locomotion by telescopic movement and switching of the fixed part, our proposed soft millirobot mainly consists of three parts: a high-hardness soft body, a low-hardness soft body, and a foot. The configuration is shown in Fig. 2a. The robot uses a pneumatic-network air chamber made up of a series of channels. Because of the hardness difference between the upper and lower layers, the soft millirobot bends when it is inflated (Fig. 2b).

Fig. 2. Design concept of the hybrid-actuated soft robot. a The structure of the hybrid-actuated soft robot. b The bending performance of the hybrid-actuated soft robot. c The schematic diagram of the principle of the robot’s bending motion. d The schematic diagram of robot steering.

For the magnetically actuated part, we use one solenoid coil to generate magnetic torque as well as pushing and pulling forces. Notably, the magnetic force can be controlled by changing the magnitude and direction of the current in the electromagnetic coil and by adjusting the orientation of the coil. In the initial state, the internal magnetic field of the soft robot is oriented horizontally to the right. When the external magnetic field is oriented vertically upward, the internal magnetic field of the robot tends to align with the external magnetic field, causing the proximal end of the robot to be fixed while the distal end curls upward. Conversely, when the external magnetic field is oriented vertically downward, the distal end is fixed.

Fig. 3. The schematic diagram of the motion gait of the worm and the hybrid-actuated soft robot. a The schematic diagram of worm movement. b The schematic diagram of the worm-like gait of the hybrid-actuated soft robot. c Demonstration of the soft robot’s worm-like gait on the surface.

When the distal end of the soft robot curls upward, the contact area between the robot and the moving surface becomes smaller and the frictional force between them is reduced, which makes it easier to adjust the heading direction. When the direction of the magnetic field changes, omnidirectional motion can be achieved. The fabrication process is divided into three steps: preparation of the low-hardness bottom structure and foot structure, preparation of the cavity structure and the upper high-hardness structure, and magnetization. With this processing method, robots of different hardness and sizes can be manufactured for the corresponding scenarios.

3 Motion Gait and Analysis
Figure 3a shows that worm movement can be simplified into three steps:
1. Lift the head section and adjust the heading direction.
2. Fix the end section and extend the soft body to displace the head section.
3. Fix the head section and shrink the soft body to realize the overall displacement.

Inspired by worms, our soft robot can achieve fixation and omnidirectional movement in harsh environments by adjusting the magnetic field, which makes it a possible solution for industrial and medical applications. As shown in Fig. 3b, the worm-like autonomous movement of the proposed soft millirobot is achieved in three steps:
1. Fix the proximal end of the soft robot by applying a vertically upward magnetic field. Elongate the soft robot's body and lift its distal end by inflating the air chamber.
2. Change the direction of the external magnetic field to vertically downward, so that the soft robot switches from proximal fixation to distal fixation and changes its posture and position.
3. Keep applying the downward magnetic field and evacuate the air chamber. The soft robot returns to its original state, and the overall displacement is completed.

In the worm-like gait of the soft robot, both worm-like elongation and heading direction adjustment can be achieved. This capability enables autonomous locomotion on unstructured surfaces. Because it relies on changing the direction of the external magnetic field rather than on a magnet/electromagnet following system, this worm-like motion has a large working space. At the same time, the worm-like motion of the soft robot is based on bending and therefore offers better motion performance and efficiency when crossing obstacles.
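The three-step gait can be illustrated with an idealized one-dimensional sketch. The model is an assumption made for illustration only: the anchored end never slips, inflation is reduced to a fixed increase in the robot's horizontal span (`stroke`), and the names `GAIT_CYCLE` and `run_gait` as well as the numeric values are hypothetical.

```python
# Each step of the gait: (direction of the external field, chamber inflated?)
GAIT_CYCLE = [
    ("up", True),     # step 1: field up fixes the proximal end; inflate
    ("down", True),   # step 2: field down, fixation switches to the distal end
    ("down", False),  # step 3: keep the field down; evacuate the chamber
]


def run_gait(cycles, rest_length, stroke):
    """Return (proximal, distal) end positions after a number of gait cycles."""
    proximal, distal = 0.0, rest_length
    for _ in range(cycles):
        for field, inflated in GAIT_CYCLE:
            length = rest_length + stroke if inflated else rest_length
            if field == "up":
                # Proximal end is fixed, so the distal end moves.
                distal = proximal + length
            else:
                # Distal end is fixed, so the proximal end moves.
                proximal = distal - length
    return proximal, distal


ends = run_gait(cycles=3, rest_length=10.0, stroke=2.0)
```

Under these assumptions the robot advances by one `stroke` per cycle, so after three cycles both ends have moved forward by 6 units relative to the start.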

Fig. 4. The schematic diagram of the fabrication process of the hybrid-actuated soft robot.

We assume that the soft millirobot bends uniformly, the radius of curvature is Rr , the bending angle is θr , the external magnetic field remains uniform and the angle between the external magnetic field and the vertical direction is θB . According to the kinetic model of the robot (Fig. 4), the acceleration of the robot in all directions can be defined as: mr ax = FBx − Ftx − fx − Fax mr ay = FBy − Fty − fy − Fay mr az = mr g − FBz − Ftz − FN − Faz

(1)

→ a (x, y, z) represents the acceleration of the where mr represents the mass of the robot, − − → − → robot, FB (x, y, z) represents the magnetic force on the distal end, Ft (x, y, z) represents − → − → the tensile force driven by the air tube, f (x, y) represents the friction force, Fa (x, y, z) − → represents the force on the robot driven by air pressure, FN represents the support force on the robot, g represents the gravitational acceleration. Considering that the weight of − → the air tube is small enough, the tensile force Ft driven by the air tube can be neglected.


The magnetic force on the robot can be defined as:

$$
\vec{F}_B = V_r \left( \vec{M}_r \cdot \nabla \right) \vec{B}(x, y, z) \tag{2}
$$

where $V_r$ represents the volume of the robot, $\vec{M}_r$ represents the magnetization of the robot, and $\vec{B}(x, y, z)$ represents the magnetic flux density at the robot. Assuming that the magnetization of the robot is uniform, the magnetization of the soft robot can be expressed as:

$$
\vec{M}_r(x, y, z) = \left( 0,\; M \cdot \frac{y}{\sqrt{z^2 + y^2}},\; -M \cdot \frac{z}{\sqrt{z^2 + y^2}} \right) \tag{3}
$$

where $M$ represents the magnetization intensity amplitude of the robot.
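To make Eqs. (2) and (3) concrete, the sketch below evaluates the magnetization vector and approximates the force with central finite differences. It is an illustrative calculation only: the function names, the step size `h`, and the callable `B` are assumptions, and units are left to the caller.

```python
import math


def magnetization(y, z, M):
    """Eq. (3): magnetization vector for cross-section coordinates (y, z)."""
    r = math.hypot(y, z)
    if r == 0.0:
        raise ValueError("direction is undefined on the axis y = z = 0")
    return (0.0, M * y / r, -M * z / r)


def magnetic_force(B, point, Mr, Vr, h=1e-6):
    """Eq. (2): F_B = V_r (M_r . grad) B, via central differences with step h.

    B is a callable that returns (Bx, By, Bz) at a point (x, y, z).
    """
    force = [0.0, 0.0, 0.0]
    for axis in range(3):
        plus, minus = list(point), list(point)
        plus[axis] += h
        minus[axis] -= h
        b_plus, b_minus = B(tuple(plus)), B(tuple(minus))
        for j in range(3):
            # Partial derivative of component j of B along this axis.
            d_bj = (b_plus[j] - b_minus[j]) / (2.0 * h)
            force[j] += Vr * Mr[axis] * d_bj
    return tuple(force)
```

The magnitude of the vector returned by `magnetization` is always `M`, consistent with $M$ being the amplitude. In a uniform field every derivative vanishes and the force is zero, so a net magnetic force requires a field gradient.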

Fig. 5. Applications of the soft robot in different environments. a Locomotion in a rugged pipe. b Locomotion in a soft pipe. c Locomotion on a curve. d Locomotion on a Y-like pipe with the integration of a CMOS camera. e Locomotion on a bronchus phantom. f Locomotion on a femoral artery phantom. g Locomotion on stomach phantom. h Vertical climbing on a pipe. i Vertical climbing on a pipe with various diameters. j Locomotion on a U-like pipe. k Locomotion on an intestine phantom. l Drug release on pig intestine surface.

4 Experiment

Different environments are used to evaluate our robot's capability of locomotion in harsh environments, as shown in Fig. 5. The robot proposed in this paper can move flexibly and stably on varied surfaces. The phantom experiments demonstrate its application potential in different fields. The combination of magnetic and pneumatic actuation gives our robot multiple degrees of freedom, strong shape deformability, and great adaptability to various environments, which makes it possible for the robot to fulfill complex tasks.


5 Conclusion

This paper presents the design, fabrication, and motion gait analysis of hybrid-actuated soft millirobots. We fuse magnetic and pneumatic actuation methods to meet application requirements for unstructured environments. Locomotion experiments demonstrate that the proposed soft millirobot has potential industrial and medical application value for close inspection and manipulation in unstructured and constrained environments.

6 Discussion

In future work, we will further improve the locomotion speed, carrying capacity, biocompatibility, and obstacle-crossing ability of the proposed robot by tailoring design parameters.

Acknowledgment. The authors acknowledge support from National Natural Science Foundation of China (82027807, U22A2051), National Key Research and Development Program of China (2022YFC2405200), Beijing Municipal Natural Science Foundation (7212202), Institute for Intelligent Healthcare, Tsinghua University (2022ZLB001), Tsinghua-Foshan Innovation Special Fund (2021THFS0104), and Tsinghua University Spring Breeze Fund (2021Z99CFY023).


A VR Environment for Cervical Tumor Segmentation Through Three-Dimensional Spatial Interaction

Nan Zhang, Tianqi Huang, Jiayuan Liu, Yuqi Ji, Longfei Ma, Xinran Zhang, and Hongen Liao(B)

Department of Biomedical Engineering, School of Medicine, Tsinghua University, Beijing, China
[email protected]

Abstract. Brachytherapy is an effective clinical treatment for cervical cancer, which is one of the most common gynecological malignancies. Traditional manual cervical tumor segmentation is a laborious and time-consuming task, and we aim to improve its efficiency. In this study, we present a novel virtual reality environment that supports 3D cervical tumor segmentation using the user's hands and a spatial pen. We created a cervical imaging visualization platform that includes 2D and 3D medical image display and an interactive plane that connects the 2D and 3D medical images. Furthermore, we propose a 3D spatial interaction method that combines free-hand interaction with an optical localization-based interactive pen, which is both convenient and efficient. We built a prototype and verified the feasibility of the proposed environment through a user study. Our prototype was well received by the participants, and the average tumor segmentation time was around 5 min. We hope to improve our environment in the future.

Keywords: Virtual reality · Human-computer interaction · In-situ interaction · 3D volume rendering · Floating autostereoscopic display

1 Introduction

Cervical cancer [1] is one of the most common gynecological malignancies. In recent years, cervical cancer has been diagnosed at increasingly younger ages. In clinical practice, brachytherapy of cervical tumors [2, 3], in which the radiation source is placed within the body, can achieve a cure rate of more than 80%. Brachytherapy relies on accurate cervical tumor segmentation [4] and path planning [5]. However, cervical tumor segmentation is a laborious and time-consuming task, which slows the progress of treatment.

Existing cervical tumor segmentation methods are usually classified into manual and automatic segmentation. With the development of computer technology, automatic cervical tumor segmentation methods have attracted wide attention. Torheim et al. [6] proposed an automatic method for delineating locally advanced cervical cancers using a machine learning approach, which achieved a Dice similarity coefficient of 0.44. Lin et al. [4] developed a U-Net-based deep learning method for cervical tumor segmentation, which achieved a Dice similarity coefficient of 0.84. However, it remains challenging to segment a cervical tumor that exhibits unclear boundaries and a complex anatomical structure. Differences and changes among patient images prevent automatic segmentation methods from being fully adopted. In clinical practice, most segmentation is still performed manually. The expert's judgment of the tumor area is a powerful advantage of manual segmentation. However, manual segmentation based on two-dimensional visualization and interaction with a keyboard and mouse is an error-prone and time-consuming task for experts. New human-computer interaction modalities with three-dimensional (3D) perception and precise, fast interaction are necessary.

Based on the above considerations, we propose a virtual reality (VR) environment for cervical tumor segmentation via 3D visualization and 3D spatial interaction. Our 3D spatial interaction method supports image processing at free positions in 3D space, which makes tumor segmentation more convenient. Our contributions are: (1) a novel cervical tumor segmentation VR environment: we provide an integrated image processing VR platform for tumor segmentation and complete image processing tasks in 3D space; (2) 3D spatial interaction with gestures and a spatial pen: we realize 3D spatial interaction with the user's hands and a pen for in-situ tumor segmentation, and the generated interactive plane also helps the user perceive the 3D spatial structure of medical images.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024. G. Wang et al. (Eds.): APCMBE 2023, IFMBE Proceedings 104, pp. 98–105, 2024. https://doi.org/10.1007/978-3-031-51485-2_12

Fig. 1. a Overview of the environment, b moving the interactive plane, and c changing the visualization parameters.

2 Methods

2.1 Overview of Our Environment

In this study, we present a cervical tumor segmentation VR environment (Fig. 1), which realizes manual segmentation with 3D spatial interaction. The proposed VR environment supports the visualization of 2D and 3D medical imaging and interaction with medical images and a button bar. The workflow of our proposed method includes two parts: medical image processing and VR environment processing. In the medical image processing part (as shown in Fig. 2), the original MR image of cervical cancer is preprocessed, which includes reading the data, rearranging it, and generating 3D texture data. The data are then input into the developed VR environment. Using the visualization platform, 3D spatial interaction, and 3D segmentation label generation, the cervical tumor can be segmented.

2.2 Visualization Platform of Cervical Imaging

In clinical practice, manual tumor segmentation requires experts to delineate the tumor boundary layer by layer on a 2D display device. Because the human body is inherently a 3D structure, segmentation based on multi-layer 2D images places high demands on doctors' spatial imagination and clinical experience. In our environment, a visualization platform for cervical imaging is developed, as shown in Fig. 1a. We display the cervical images in both 2D and 3D, and provide a real-time synchronous interactive plane that helps doctors identify the spatial structure at the interactive positions. As shown in Fig. 1b, c, the interactive plane is optional and free-moving. It connects the visualization of the 2D and 3D medical images: the view of the 2D slice is determined by where the interactive plane cuts the 3D volume-rendered image. The 3D volume-rendered image, which is based on the volume ray casting algorithm, makes up for the lack of intuitive 3D visualization. In the rendering process, we sample and composite from the front to the back of the image,

$$
\mathrm{Color}'_i = \left( 1 - \mathrm{Color}'_{i-1}.a \right) * \mathrm{Color}_i + \mathrm{Color}'_{i-1} \tag{1}
$$

where $\mathrm{Color}'_i$ is the cumulative color and $\mathrm{Color}_i$ is the sample point color.
To increase rendering specificity, we use three parameters ($I$, $D$, and $K$) to process the color of sample points,

$$
\begin{aligned}
\mathrm{Color}_i.rgb &= I * \mathrm{Color}_i.rgb \\
\mathrm{Color}_i.a &= D * \mathrm{pow}(\mathrm{Color}_i.a, K) \\
\mathrm{Color}_i.rgb &= \mathrm{Color}_i.rgb * \mathrm{Color}_i.a
\end{aligned}
\tag{2}
$$

where $I$ changes the color intensity of the sample points, $D$ determines the sampling depth, and $K$ changes the contrast between the colors of the sample points. We also normalize the voxel values to 0–1 and map these values to a pseudo-color map. A suitable pseudo-color map can enhance parts of the displayed image (as shown in Fig. 1b, c). In addition, we use 3D floating autostereoscopic visualization to realize a naked-eye display of the environment, following a previous study [7]. The 3D floating autostereoscopic display can provide rich parallax information and depth perception without glasses. The rendering principle [8] consists of multi-view capture and optical reconstruction of the 3D scene. In this study, we adjust the virtual camera array in the VR environment according to the depth of the floating medical image, which ensures hand-eye coordination and accurate in-situ interaction.
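The per-sample adjustment and the front-to-back accumulation described by the two rendering equations can be sketched for a single ray as follows. This is a minimal CPU illustration rather than the shader used in the prototype: the function names are hypothetical, and the alpha clamp and early ray termination are added safeguards that the text does not mention.

```python
def shade(sample, I, D, K):
    """Adjust one RGBA sample with intensity I, depth D, and contrast K.

    Alpha is clamped to [0, 1]; this clamp is an added safeguard.
    """
    r, g, b, a = sample
    a = min(max(D * pow(a, K), 0.0), 1.0)
    # Scale the colour by I, then premultiply by the adjusted alpha.
    return (I * r * a, I * g * a, I * b * a, a)


def composite(samples, I=1.0, D=1.0, K=1.0):
    """Accumulate RGBA samples ordered front to back along one ray."""
    acc = (0.0, 0.0, 0.0, 0.0)
    for sample in samples:
        s = shade(sample, I, D, K)
        w = 1.0 - acc[3]  # remaining transparency of the accumulated colour
        acc = tuple(w * s[c] + acc[c] for c in range(4))
        if acc[3] >= 0.999:  # early ray termination, a common optimisation
            break
    return acc
```

For example, a half-transparent red sample in front of an opaque green one yields equal red and green contributions, while an opaque front sample hides everything behind it.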


Fig. 2. Workflow of the cervical tumor segmentation environment.

2.3 3D Spatial Interaction and 3D Tumor Segmentation

To achieve accurate spatial localization and interaction, we integrated an optical localization-based interactive pen into the system. We constructed the pen by 3D printing, with three optical spheres on its end (Fig. 3a). The position and pose of the pen are tracked with the OptiTrack optical positioning system [9]. We wrote a Unity interface that receives the poses of the optical spheres in real time and translates them into pen tip positions and rotation angles for input into our VR environment. For the human-computer interaction environment, we use Leap Motion to drive the virtual interactive hand used in the segmentation interaction. The virtual hand model changes in real time according to the real human hand detected by the Leap Motion sensor, which ensures real-time interaction. We use the cooperation of two hands to implement a variety of incremental interactive gestures, which include changing the rotation and position of the interaction plane (Fig. 4a, b), changing the field of view and position (Fig. 4c, d), and others.
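One way to translate three tracked sphere positions into a pen tip position is to build an orthonormal frame from the markers and apply a fixed tip offset in that frame. The sketch below illustrates this idea only: the marker layout, the `tip_offset` value, and all function names are hypothetical, and the prototype's Unity interface may work differently.

```python
import math


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def normalize(v):
    n = math.sqrt(sum(x * x for x in v))
    if n == 0.0:
        raise ValueError("markers must not coincide or be collinear")
    return tuple(x / n for x in v)


def pen_frame(p0, p1, p2):
    """Right-handed orthonormal frame (x, y, z) spanned by three markers."""
    x = normalize(sub(p1, p0))
    z = normalize(cross(x, sub(p2, p0)))
    y = cross(z, x)
    return x, y, z


def pen_tip(p0, p1, p2, tip_offset):
    """Tip position: marker p0 plus a calibrated offset in the pen frame."""
    x, y, z = pen_frame(p0, p1, p2)
    return tuple(
        p0[i] + tip_offset[0] * x[i] + tip_offset[1] * y[i] + tip_offset[2] * z[i]
        for i in range(3)
    )
```

The three axes returned by `pen_frame` also form the columns of a rotation matrix, from which rotation angles can be derived. In practice the tip offset would come from a calibration step such as pivoting the pen about its tip.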

Fig. 3. Spatial pen interaction and tumor segmentation. The contours are drawn on different planes, such as (a) the horizontal plane and (b) the sagittal plane.

We propose a method for 3D tumor segmentation based on gesture interaction and pen interaction. The basic idea of the method is to generate 3D tumor segmentation labels from a small number of in-plane segmentation results. First, the plane for segmentation is selected in our VR environment by gesture interaction (Fig. 4a, b). Then, the tumor contour is drawn on the selected 2D plane using the interactive pen (Fig. 3a). As shown in Fig. 3a, b, the contours are drawn on several planes at different angles, which are selected by gesture interaction around a fixed rotation axis. After that, the drawn contours are paired on the planes perpendicular to the rotation axis and fitted with a cubic spline curve to generate 3D segmentation labels. The segmented labels can be exported from the VR environment in the data format required by the user.
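The contour-to-label step can be illustrated for a single slice perpendicular to the rotation axis. Each drawn contour crosses that slice twice, at its plane angle and at the opposite angle, and a closed cubic curve is fitted through the crossing points. The sketch below uses a closed Catmull-Rom spline as a stand-in cubic interpolant, since the exact spline formulation is not specified; all names and the input layout are assumptions.

```python
import math


def catmull_rom(p0, p1, p2, p3, t):
    """Point at parameter t in [0, 1] on the cubic segment from p1 to p2."""
    return tuple(
        0.5 * (2 * b + (c - a) * t
               + (2 * a - 5 * b + 4 * c - d) * t ** 2
               + (-a + 3 * b - 3 * c + d) * t ** 3)
        for a, b, c, d in zip(p0, p1, p2, p3)
    )


def slice_points(radii_by_angle):
    """Map {plane angle in degrees: (r_plus, r_minus)} to ordered (x, y) points.

    r_plus is the crossing at the plane angle, r_minus at angle + 180.
    """
    polar = []
    for angle, (r_plus, r_minus) in radii_by_angle.items():
        polar.append((angle % 360, r_plus))
        polar.append(((angle + 180) % 360, r_minus))
    polar.sort()
    return [(r * math.cos(math.radians(a)), r * math.sin(math.radians(a)))
            for a, r in polar]


def closed_contour(points, samples_per_segment=8):
    """Closed interpolating curve through points given in angular order."""
    n = len(points)
    if n < 4:
        raise ValueError("need at least four crossing points")
    curve = []
    for i in range(n):
        p0, p1, p2, p3 = (points[(i + k - 1) % n] for k in range(4))
        for s in range(samples_per_segment):
            curve.append(catmull_rom(p0, p1, p2, p3, s / samples_per_segment))
    return curve
```

Repeating this for every slice along the rotation axis and filling each closed curve would give a voxel label volume. The filling step is omitted here.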

3 Results and Discussion

3.1 System Architecture and Technical Details

The proposed VR cervical tumor segmentation prototype is implemented in Unity 2021.3.8f1c1. The hardware of the prototype (Fig. 5) is configured according to the specifications in Table 1. Figure 5a, b show the user interacting with the medical image in the front and side views, respectively. Figure 6a, b show the rendered elemental image array and the floating horizontal parallax of cervical imaging for the 3D floating autostereoscopic display, respectively.

3.2 User Study

We recruited five students to participate in the user study, and they voluntarily signed the consent form. Each participant segmented the same two cervical tumor images with our prototype (Fig. 7). We recorded their experience with and evaluation of the prototype, as well as the segmentation time. The average segmentation time was around 5 min. Participants considered that the proposed VR environment has a good interaction design, which makes the segmentation task easy to complete.

3.3 Discussion

Our goal was to test whether a VR-based cervical tumor segmentation approach is feasible and whether it improves segmentation efficiency. In this study, a cervical tumor segmentation VR environment was developed, and the proposed methods demonstrated the benefits of constructing an image processing platform in a VR environment, such as multimodal integration and high interactivity. The user study showed that the segmentation task can be completed within minutes, and the environment was well accepted. In addition, we proposed an interactive plane that associates 3D and 2D medical images in real time. Our volunteers praised this design. They reported that the interaction plane helps them determine the position of the 2D slice to be processed in the 3D image while providing good 3D perception. However, this study has several limitations. The segmentation results cannot be evaluated in the developed environment. Although the proposed environment was well received in terms of segmentation efficiency, the evaluation of segmentation accuracy remains to be completed in follow-up work.


Fig. 4. Free-hand interaction. a, b change the rotation and position of interaction plane, c, d change the field of vision and position.

Table 1. Configuration parameters of the prototype

Equipment | Parameters | Specification
Liquid crystal display panel | Size | 344 × 194 mm
Liquid crystal display panel | Resolution | 3840 × 2160
Liquid crystal display panel | PPI | 282
Lenticular lens array | Pitch | 0.124 mm
Lenticular lens array | Gap | 1.05 mm
Gesture recognition device | Hardware | Leap Motion
Computer workstation | CPU | Intel Core i7-9750H
Computer workstation | GPU | NVIDIA GeForce GTX 1660 Ti

Fig. 5. User interacted with the medical image in the a front and b side view.


Fig. 6. a The rendered elemental imaging array and b floating horizontal parallax of cervical imaging for 3D floating autostereoscopic display.

Fig. 7. User segmentation results in different slices.

4 Conclusion

Our research demonstrates the beneficial role of 3D visualization and spatial interaction in medical image processing. In this study, cervical tumors can be rapidly segmented in our proposed VR environment. The proposed visualization platform provides the visualization of 3D and 2D images with intuitive connectivity, and the hand-pen interaction method ensures hand-eye coordination and convenient interaction. This spatial interaction modality not only retains the advantages of 3D spatial perception, but also reduces the degrees of freedom of the interaction through 2D image processing, thus avoiding the accuracy loss of 3D spatial interaction to a certain extent. In addition, the 3D floating autostereoscopic visualization provides 3D parallax of the VR environment without glasses. We propose a novel VR method for 3D cervical tumor segmentation, in which the cervical tumor is segmented using the hands and a pen. The proposed method enables fast segmentation.

Acknowledgment. The authors acknowledge support from National Natural Science Foundation of China (U22A2051, 82027807, 62101305), National Key Research and Development Program of China (2022YFC2405200), Beijing Municipal Natural Science Foundation (7212202), Institute for Intelligent Healthcare, Tsinghua University (2022ZLB001), and Tsinghua-Foshan Innovation Special Fund (2021THFS0104).


References

1. Buskwofie, A., David-West, G., Clare, C.A.: A review of cervical cancer: incidence and disparities. J. Natl. Med. Assoc. 112(2), 229–232 (2020)
2. Holschneider, C.H., Petereit, D.G., Chu, C., Hsu, I.-C., Ioffe, Y.J., Klopp, A.H., Pothuri, B., Chen, L.-M., Yashar, C.: Brachytherapy: a critical component of primary radiation therapy for cervical cancer: from the Society of Gynecologic Oncology (SGO) and the American Brachytherapy Society (ABS). Brachytherapy 18(2), 123–132 (2019)
3. Pötter, R., Tanderup, K., Schmid, M.P., Jürgenliemk-Schulz, I., Haie-Meder, C., Fokdal, L.U., Sturdza, A.E., Hoskin, P., Mahantshetty, U., Segedin, B.: MRI-guided adaptive brachytherapy in locally advanced cervical cancer (EMBRACE-I): a multicentre prospective cohort study. Lancet Oncol. 22(4), 538–547 (2021)
4. Lin, Y.-C., Lin, C.-H., Lu, H.-Y., Chiang, H.-J., Wang, H.-K., Huang, Y.-T., Ng, S.-H., Hong, J.-H., Yen, T.-C., Lai, C.-H.: Deep learning for fully automated tumor segmentation and extraction of magnetic resonance radiomics features in cervical cancer. Eur. Radiol. 30, 1297–1305 (2020)
5. Fields, E.C., Hazell, S., Morcos, M., Schmidt, E.J., Chargari, C., Viswanathan, A.N.: Image-guided gynecologic brachytherapy for cervical cancer. Seminars in Radiation Oncology 30(1), 16–28 (2020)
6. Torheim, T., Malinen, E., Hole, K.H., Lund, K.V., Indahl, U.G., Lyng, H., Kvaal, K., Futsaether, C.M.: Autodelineation of cervical cancers using multiparametric magnetic resonance imaging and machine learning. Acta Oncologica 56(6), 806–812 (2017)
7. Zhang, N., Huang, T., Meng, Y., Zhang, X., Liao, H.: A novel in-situ interactive 3D floating autostereoscopic display system with aerial imaging plate. In: International Conference on Display Technology, pp. 244–247 (2020)
8. Zhang, N., Wang, H., Huang, T., Zhang, X., Liao, H.: A VR environment for human anatomical variation education: modeling, visualization and interaction. IEEE Transactions on Learning Technologies 17, 391–403 (2024)
9. Furtado, J.S., Liu, H.H., Lai, G., Lacheray, H., Desouza-Coelho, J.: Comparative analysis of OptiTrack motion capture systems. SMRC'18, pp. 15–31 (2019)

An Image Fusion Method Combining the Advantages of Dual-Mode Optical Imaging in Endoscopy

Shipeng Zhang1, Ye Fu2, Xinran Zhang1, Longfei Ma1, Hui Zhang1, Tianyu Xie2, Zhe Zhao3,4, and Hongen Liao1(B)

1 Department of Biomedical Engineering, School of Medicine, Tsinghua University, No. 30, Shuangqing Road, Beijing, China
[email protected]
2 Department of Biomedical Engineering, College of Future Technology, Peking University, Beijing, China
3 Beijing Tsinghua Changgung Hospital, Beijing, China
4 School of Clinical Medicine, Tsinghua University, Beijing, China

Abstract. In recent years, endoscopic imaging technology has developed rapidly. Dual-mode optical imaging (DMOI) technology in endoscopy allows for rapid imaging of living tissue at the same position. It includes white light imaging (WLI) and compound band imaging (CBI), with each mode offering complementary advantages. WLI is the most commonly used mode in endoscopy. CBI is a virtual optical staining technique that highlights the small vascular structures of the gastrointestinal mucosa. It is essential to fuse their features to improve the image quality of endoscopy. This paper utilizes an image fusion network based on proportional maintenance of gradient and intensity for image fusion. A volunteer experiment was conducted in this study. The oral images were taken for the training and testing of the fusion network, and the results were obtained. We objectively evaluated the quality of the fused images and compared them with the WLI images. The results show that the fused images have natural color and rich details, and the image quality is significantly improved. This study is the first to propose the image fusion of DMOI in endoscopy. Our approach combines the strengths of both source images, which can assist medical professionals in making more precise diagnoses during clinical examinations. We think it has important significance for improving the detection rate of early lesions. Keywords: Endoscopy · Dual-mode optical imaging · Image fusion · PMGI

1 Introduction

Endoscopy is vital in diagnosing and treating human digestive system diseases [1]. Endoscopic imaging techniques are constantly evolving to meet diverse clinical needs. Endoscopic optical staining technology can observe small mucosal structures and abnormal changes [2], providing a means to highlight the color and detail contrast of the lesion.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024. G. Wang et al. (Eds.): APCMBE 2023, IFMBE Proceedings 104, pp. 106–113, 2024. https://doi.org/10.1007/978-3-031-51485-2_13


Conventional endoscopy is based on white light imaging (WLI), while optical staining techniques change the illumination source for staining. Such techniques include compound band imaging (CBI) [3], narrow band imaging (NBI) [4], and blue light imaging (BLI) [5]. These optical staining techniques play the role of optical biopsy to a certain extent and are widely favored by clinicians because they are simple to operate. However, they have specific requirements for light sources and cannot be used simultaneously with WLI. Dual-mode optical imaging (DMOI) can acquire WLI images and CBI images in rapid succession. CBI uses blue and green narrow-band light, which is strongly absorbed by hemoglobin in the superficial mucosa [6]. This is why CBI can emphasize the microvascular structure of the mucosal surface, which is essential for detecting early lesions. However, CBI images show other tissue features poorly. WLI images are bright, their color is realistic, and they convey comprehensive information on the mucosal tissue. Still, the contrast of the superficial blood vessels is often low. Therefore, image fusion that combines the benefits of both modes has important clinical significance.

Effective image fusion can extract meaningful information from images and retain it to improve their analysis and visual quality [7]. WLI images are bright but contain a smaller proportion of detail features. CBI images, on the other hand, have low brightness but are rich in detail features. Considering the characteristics of the two images, this paper adopts an image fusion network based on proportional maintenance of gradient and intensity (PMGI) [8]. PMGI formulates image fusion as the proportional preservation of the texture and intensity of the source images. We carried out fusion tests and evaluated the quality of the fused images. The results showed that our method retained the color and brightness of WLI images and contained the rich vascular features of CBI images.

2 Methods

The DMOI endoscopy system adds a real-time rotating color wheel to the light source. It can capture images of the same living tissue location in both modes within a short period, which provides hardware support for fusing the images of the two modes.

In recent years, deep learning-based methods for image fusion have emerged as a popular research field [9]. Traditional fusion methods require manual design and adjustment of fusion rules, such as multi-scale decomposition and other operations chosen according to image requirements, which is a complicated process. Deep learning methods use deep neural networks to automatically extract image features. The model is trained repeatedly under the constraints of the loss function until it can output ideal fusion results [10]. After the network structure is manually constructed, subsequent fusion processes do not require human intervention. The network can automatically extract the deep features of the image and directly output the fusion results. Our image fusion is an unsupervised learning task. That is, no standard reference image is available, so supervised learning cannot be used for training. Unsupervised learning is a self-learning process that finds the regularity of data samples. It avoids problems such as the difficulty, high cost, and inaccuracy of manual labeling.

The goal of image fusion is to merge the most essential information from multiple source images, resulting in a single fused image that is more informative and visually appealing. In the case of DMOI images, the WLI image appears bright, but its details are limited. In contrast, the CBI image has low brightness but contains highly detailed features, such as blood vessels.

Taking into account the image features of the two modes, we adopted PMGI [8] to fuse the WLI and CBI images. In a typical encoder-decoder fusion network, each encoder receives the data of one modality. In contrast, this model can feed a mix of images from both modalities into each encoder for fusion, and an information transfer block is added between the two encoders. PMGI accomplishes the image fusion task end-to-end, formulating image fusion as the maintenance of the gradient and intensity ratios of the source images. The pixel is the most fundamental element of an image, and the histogram distribution of the image reflects the intensity of the pixels. The differences between pixels constitute gradients, which are expressed as texture details in the image. Therefore, the PMGI network offers a comprehensive representation of the entire image based on gradient and pixel intensity, which is reflected in both the network structure and the loss function.

As shown in Fig. 1, the PMGI network is divided into two information extraction paths: intensity and gradient. The intensity path is responsible for extracting intensity information, and the gradient path is responsible for extracting texture information, that is, high-frequency features. The gradient and intensity information must be extracted and preserved simultaneously. The inputs to the intensity and gradient paths are formed by concatenating the source images along the channel dimension, which maintains their potential correlation. In both paths, the PMGI network adopts four convolutional layers for feature extraction, drawing on DenseNet [11] and making dense connections within the same path to achieve feature reuse. During convolution, some loss of information is inevitable. However, feature reuse helps reduce this loss and improves the utilization of the features.

The PMGI network uses the path information transfer module for communication between the two paths. The inputs of the third and fourth convolutional layers depend on the outputs of all preceding layers on the same path and the outputs of previous layers on the other path. The path information transfer module can pre-fuse gradient and intensity information and enhance the information extracted by the lower layers. The first convolutional layer uses a 5 × 5 kernel, and the last three use 3 × 3 kernels. The path information transfer block, shown in Fig. 2, uses a 1 × 1 convolution kernel. These convolutional layers all combine batch normalization (BN) and the LReLU activation function.

The PMGI network uses concatenation and convolution strategies to fuse the extracted features. It concatenates the feature maps of the two paths, again following the idea of feature reuse. The final concatenation involves eight feature maps from the two paths. The last convolutional layer has a 1 × 1 kernel with a tanh activation function.

The PMGI network uses a unified form of loss function designed according to the nature of images, which includes two types of loss terms: gradient loss and intensity loss. The intensity loss term constrains the coarse pixel distribution, while the gradient loss term enhances texture details. Combining the two constraints gives the fused image a reasonable intensity distribution and rich texture details.


Fig. 1. The network structure of PMGI

Fig. 2. The structure of the path information transfer block


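The loss function is constructed as follows:

$$
L_{PMGI} = \lambda_{A_{int}} L_{A_{int}} + \lambda_{A_{grad}} L_{A_{grad}} + \lambda_{B_{int}} L_{B_{int}} + \lambda_{B_{grad}} L_{B_{grad}} \tag{1}
$$

where $A$ and $B$ represent the two source images, $L_{(\cdot)_{int}}$ represents the intensity loss term of one image, $L_{(\cdot)_{grad}}$ represents the gradient loss term of one image, and $\lambda_{(\cdot)}$ represents the weight of each loss term. The intensity loss is defined as:

$$
L_{A_{int}} = \frac{1}{H \times W} \left\| I_{fused} - I_A \right\|_2^2 \tag{2}
$$

$$
L_{B_{int}} = \frac{1}{H \times W} \left\| I_{fused} - I_B \right\|_2^2 \tag{3}
$$

where $I_{fused}$ represents the fused image, $I_A$ and $I_B$ represent the source images, and $H \times W$ represents the image size. The gradient loss is defined as:

$$
L_{A_{grad}} = \frac{1}{H \times W} \left\| \nabla I_{fused} - \nabla I_A \right\|_2^2 \tag{4}
$$

$$
L_{B_{grad}} = \frac{1}{H \times W} \left\| \nabla I_{fused} - \nabla I_B \right\|_2^2 \tag{5}
$$

where $\nabla$ represents the gradient operator.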
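The five loss equations can be checked numerically on small images. The sketch below is a plain-Python illustration, not the training code: images are nested lists of numbers, the gradient operator is assumed to be a 4-neighbour Laplacian with replicated borders (the text does not specify the operator), and the function names and weight keys are hypothetical.

```python
def laplacian(img):
    """4-neighbour Laplacian with replicated borders (assumed operator)."""
    h, w = len(img), len(img[0])

    def px(i, j):
        # Clamp indices so border pixels reuse their nearest neighbour.
        return img[min(max(i, 0), h - 1)][min(max(j, 0), w - 1)]

    return [
        [px(i - 1, j) + px(i + 1, j) + px(i, j - 1) + px(i, j + 1) - 4 * img[i][j]
         for j in range(w)]
        for i in range(h)
    ]


def mean_sq_diff(a, b):
    """Squared L2 norm of (a - b), divided by the image size H x W."""
    h, w = len(a), len(a[0])
    total = sum((a[i][j] - b[i][j]) ** 2 for i in range(h) for j in range(w))
    return total / (h * w)


def pmgi_loss(fused, src_a, src_b, weights):
    """Weighted sum of the intensity and gradient terms for both sources."""
    lap_fused = laplacian(fused)
    terms = {
        "a_int": mean_sq_diff(fused, src_a),
        "a_grad": mean_sq_diff(lap_fused, laplacian(src_a)),
        "b_int": mean_sq_diff(fused, src_b),
        "b_grad": mean_sq_diff(lap_fused, laplacian(src_b)),
    }
    return sum(weights[name] * value for name, value in terms.items())
```

A uniform brightness offset between the fused image and a source changes only the intensity terms, because the Laplacian of a constant is zero. This illustrates how the two kinds of terms separate brightness from texture.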

3 Experiments and Results

We conducted volunteer experiments using the AQ-200L endoscopy system (Aohua, Shanghai, China) and captured 24 pairs of human oral DMOI images, each of size 1080 × 1352, as training data. To obtain more training data, we augmented the dataset by cropping and decomposing the images into small 132 × 132 patches, which represent different image features. We expect the fused image to retain the hue and brightness of the WLI image and to integrate more details of the CBI image, so that it serves as a better inspection mode than the WLI mode. Therefore, the weights of the loss function $\lambda_{(\cdot)}$ should satisfy:

$$
\lambda_{WHITE_{int}} > \lambda_{CBI_{int}}, \quad \lambda_{WHITE_{grad}} < \lambda_{CBI_{grad}} \tag{6}
$$

where the subscripts WHITE and CBI denote the WLI and CBI source images, respectively.
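The patch cropping used to enlarge the training set can be sketched as a sliding window. The stride is not stated in the text, so the default below (non-overlapping patches, stride equal to the patch size) is an assumption, and the function names are hypothetical.

```python
def patch_origins(height, width, patch=132, stride=132):
    """Top-left corners of all patches that fit fully inside the image."""
    rows = range(0, height - patch + 1, stride)
    cols = range(0, width - patch + 1, stride)
    return [(r, c) for r in rows for c in cols]


def crop(image, origin, patch=132):
    """Cut one square patch out of an image stored as a list of rows."""
    r, c = origin
    return [row[c:c + patch] for row in image[r:r + patch]]
```

With these assumed defaults, one 1080 × 1352 image yields 8 × 10 = 80 patches. A smaller stride produces overlapping patches and therefore more training samples.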

In this study, the DMOI images were first converted to luminance images while the chromaticity information of the WLI image was preserved. The luminance images were then fused to obtain the final luminance image. Finally, color restoration was carried out using the chromaticity information of the WLI image so that the colors of the fused image appear realistic and natural. In other words, more detailed features from the CBI image were added to the WLI image. Figure 3 shows two pairs of full-size test images, which were fed into the trained network. Figure 4 shows the fused images.

Fig. 3. Two pairs of DMOI images

Fig. 4. The fused images

We evaluated the quality of the fused images. Standard deviation (SD), average gradient (AG), spatial frequency (SF), entropy (EN), and feature mutual information (FMI) are representative no-reference image quality evaluation indicators [12]. They were used to evaluate the quality of the fused images objectively. We compared the evaluation results of the fused images with those of the WLI images. Tables 1 and 2 show the evaluation results.

Table 1. Evaluation results of the first WLI image and fused image

| Image | SD | AG | SF | EN | FMI |
|---|---|---|---|---|---|
| WLI image1 | 50.092 | 1.4279 | 5.1145 | 3.8576 | - |
| Fused image1 | 61.298 | 2.7563 | 14.564 | 5.9045 | 9.5655 |

Table 2. Evaluation results of the second WLI image and fused image

| Image | SD | AG | SF | EN | FMI |
|---|---|---|---|---|---|
| WLI image2 | 38.285 | 0.89103 | 4.4242 | 2.2477 | - |
| Fused image2 | 45.313 | 2.0694 | 13.738 | 4.7217 | 9.3871 |

Figure 4 shows that the brightness of the images fused by the PMGI network is consistent with that of the WLI images, and the natural color of the WLI images is preserved. Meanwhile, the blood vessels visible in the CBI images are more prominent in the fused images. In terms of objective evaluation, all indexes of the fused images are markedly higher than those of the WLI images. The FMI values indicate that our method effectively combines the information of the DMOI images. The objective quality evaluation results of the fused images are in line with the subjective visual impression. This fusion method not only preserves the natural color and brightness of the WLI images but also incorporates the clinical details of the CBI images.
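For orientation, four of the indicators in Tables 1 and 2 can be sketched with commonly used definitions. The paper does not state its exact formulas, so the normalizations below (for example, averaging over the available pixel differences) are assumptions and will not necessarily reproduce the tabulated values; FMI is omitted because it also requires the source images.

```python
import numpy as np


def standard_deviation(image):
    """SD: spread of the pixel intensities around their mean."""
    return float(np.std(np.asarray(image, dtype=float)))


def entropy(image, levels=256):
    """EN: Shannon entropy (bits) of the gray-level histogram."""
    hist, _ = np.histogram(np.asarray(image), bins=levels, range=(0, levels))
    p = hist[hist > 0] / hist.sum()
    return float(-np.sum(p * np.log2(p)))


def spatial_frequency(image):
    """SF: sqrt(RF^2 + CF^2) from horizontal and vertical differences."""
    img = np.asarray(image, dtype=float)
    rf2 = np.mean((img[:, 1:] - img[:, :-1]) ** 2)  # row frequency squared
    cf2 = np.mean((img[1:, :] - img[:-1, :]) ** 2)  # column frequency squared
    return float(np.sqrt(rf2 + cf2))


def average_gradient(image):
    """AG: mean magnitude of the local intensity gradient."""
    img = np.asarray(image, dtype=float)
    dx = img[:-1, 1:] - img[:-1, :-1]
    dy = img[1:, :-1] - img[:-1, :-1]
    return float(np.mean(np.sqrt((dx ** 2 + dy ** 2) / 2.0)))
```

Under these definitions, higher values indicate more contrast (SD), more edge content (AG, SF), or more information (EN), which is the direction of improvement reported for the fused images.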

4 Discussion

Considering the characteristics of DMOI images, this paper adopts the PMGI network for image fusion. Our method effectively extracts and combines the advantages of DMOI images, producing a fused image with natural colors and rich details, which is what we expect to see in clinical settings. However, this study only conducted volunteer experiments using oral images, whose structures are similar to those of the gastrointestinal mucosa, and has yet to conduct clinical trials with patients suffering from gastrointestinal diseases. In future research, clinical trials should be conducted to evaluate the effectiveness of our proposed method.

5 Conclusions

In this paper, we present a fusion method that combines two typical examination modes in endoscopy, WLI and CBI, using the DMOI technology. Our approach relies on an image fusion network based on proportional maintenance of gradient and intensity to fuse the DMOI images. We also conducted a quality evaluation of the fused images and compared them with the WLI images commonly used in clinical practice. The results demonstrate that our proposed method achieves high fusion performance and combines the advantages of the two inspection modes. Our approach facilitates the detection of early lesions and enables physicians to make more accurate clinical judgments. We believe that our method has significant clinical application potential and can improve the detection rate of lesions.

Acknowledgment. The authors acknowledge support from the National Key Research and Development Program of China (2022YFC2405200), the National Natural Science Foundation of China (82027807, U22A2051), the Beijing Municipal Natural Science Foundation (7212202), the Institute for Intelligent Healthcare, Tsinghua University (2022ZLB001), and the Tsinghua-Foshan Innovation Special Fund (2021THFS0104).


References

1. Nakamoto, S., et al.: Indications for the use of endoscopic mucosal resection for early gastric cancer in Japan: a comparative study with endoscopic submucosal dissection. Endoscopy 41(09), 746–750 (2009)
2. He, Z., Wang, P., Liang, Y., Fu, Z., Ye, X.: Clinically available optical imaging technologies in endoscopic lesion detection: current status and future perspective. J. Healthc. Eng. 2021 (2021)
3. Joren, R., Oldenburg, B.: Surveillance of long-standing colitis: the role of image-enhanced endoscopy. Best Pract. Res. Clin. Gastroenterol. 29(4), 687–697 (2015)
4. Yao, K.: Principles of magnifying endoscopy with narrow-band imaging. In: Zoom Gastroscopy, pp. 49–56. Springer (2014)
5. Yoshida, N., et al.: Ability of a novel blue laser imaging system for the diagnosis of colorectal polyps. Dig. Endosc. 26(2), 250–258 (2014)
6. Kuznetsov, K., Lambert, R., Rey, J.-F.: Narrow-band imaging: potential and limitations. Endoscopy 38(01), 76–81 (2006)
7. Meher, B., Agrawal, S., Panda, R., Abraham, A.: A survey on region based image fusion methods. Inf. Fusion 48, 119–132 (2019)
8. Zhang, H., Xu, H., Xiao, Y., Guo, X., Ma, J.: Rethinking the image fusion: a fast unified image fusion network based on proportional maintenance of gradient and intensity. In: Proceedings of the AAAI Conference on Artificial Intelligence (2020)
9. Zhang, H., Xu, H., Tian, X., Jiang, J., Ma, J.: Image fusion meets deep learning: a survey and perspective. Inf. Fusion 76, 323–336 (2021)
10. Li, Y., Zhao, J., Lv, Z., Li, J.: Medical image fusion method by deep learning. Int. J. Cogn. Comput. Eng. 2, 21–29 (2021)
11. Huang, G., Liu, Z., Van Der Maaten, L., Weinberger, K.Q.: Densely connected convolutional networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2017)
12. Kaur, H., Koundal, D., Kadyan, V.: Image fusion techniques: a survey. Arch. Comput. Methods Eng. 28(7), 4425–4447 (2021)

An End-to-End Spatial-Temporal Transformer Model for Surgical Action Triplet Recognition

Xiaoyang Zou, Derong Yu, Rong Tao, and Guoyan Zheng(✉)

Institute of Medical Robotics, School of Biomedical Engineering, Shanghai Jiao Tong University, Dongchuan Road, Shanghai, China

Abstract. Surgical activity recognition plays an important role in computer-assisted surgery. Recently, the surgical action triplet has become the representative definition of fine-grained surgical activity; it is a combination of three components in the form of ⟨instrument, verb, target⟩. In this work, we propose an end-to-end spatial-temporal transformer model trained with multi-task auxiliary supervisions, establishing a powerful baseline for surgical action triplet recognition. Rigorous experiments are conducted on the publicly available CholecT45 dataset for ablation studies and comparisons with state-of-the-art methods. Experimental results show that our method outperforms state-of-the-art methods by 6.8%, achieving 36.5% mAP for triplet recognition. Our method won 2nd place in the action triplet recognition track of the CholecTriplet 2022 Challenge, which further demonstrates its capability.

Keywords: Action recognition · Surgical action triplet · Transformer · Self-attention · Auxiliary supervision

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024 G. Wang et al. (Eds.): APCMBE 2023, IFMBE Proceedings 104, pp. 114–120, 2024. https://doi.org/10.1007/978-3-031-51485-2_14

1 Introduction

Accurate recognition of surgical activity is essential for the development of intelligent surgery, as it can help to improve safety and automation in the modern operating room. Surgical activities can be graded from the coarse-grained to the fine-grained level [1]. For example, typical coarse-grained surgical activities include surgical phases and steps, which are often used to depict the general surgical workflow, while fine-grained surgical activities focus on the description of specific surgical actions. Recent studies on surgical activity recognition mainly address surgical phase recognition and tool presence detection [2–5], which are still at a relatively coarse-grained level. Activity recognition at the fine-grained level remains challenging. To meet this challenge, Nwoye et al. defined surgical action triplets to better standardize and facilitate research on fine-grained surgical activity recognition [1]. Because each specific surgical action should have an instrument as the subject and a target to be acted on, a surgical action triplet is defined as a combination of three types of components, denoted as ⟨instrument, verb, target⟩. Examples of surgical action triplets are illustrated in Fig. 1. Our goal is to realize accurate action triplet recognition based on surgical videos.


Fig. 1. Examples of surgical action triplets

However, it is quite challenging to recognize surgical action triplets directly from the video frames. First, surgical video frames contain rich and complex visual information, with significant intra-category differences and limited inter-category differences, making it difficult to extract features of action triplets directly from the video frames. Second, surgical videos are time-dependent, while the action triplets can change frequently. Therefore, modelling the temporal dependency between adjacent frames is a key point in developing triplet recognition methods. Third, the large number of triplet categories and the unbalanced data distribution can have a negative impact on triplet recognition.

Multiple methods have been proposed for extracting reliable visual features from surgical videos. Twinanda et al. [6] were the first to use a convolutional neural network (CNN) to learn visual features from surgical videos for phase recognition. To enhance the model's temporal modelling capability, several methods incorporate long short-term memory networks or temporal convolutional networks for sequence processing [3]. More recently, the transformer [7] has demonstrated great power in sequential data processing and has had a profound impact on the development of computer vision. Some studies have used it for temporal modelling in surgical workflow analysis [3–5]. However, most existing methods are trained in two stages and do not exploit the spatial feature extraction capability of the transformer.

Research on surgical action triplet recognition is still in its infancy. Nwoye et al. first presented Tripnet [1], which uses an instrument activation map as guidance and proposes a trainable 3D interaction space to model the inter-relationships between triplet components. Building on this study, they further proposed Rendezvous [8], a transformer-based model with new forms of spatial attention and semantic attention to capture associations between a triplet and its components.

Current studies mainly focus on triplet interactions and overlook the importance of establishing a robust baseline model and introducing temporal information. In this work, we propose an end-to-end spatial-temporal transformer model with multi-task auxiliary supervisions. Our method exploits the capability of the transformer for both spatial feature extraction and temporal modelling, and can be trained in an end-to-end manner, establishing a strong baseline model for surgical action triplet recognition.


Fig. 2. Schematic overview of our proposed end-to-end spatial-temporal transformer with multi-task auxiliary supervisions.

2 Methods

2.1 End-to-End Spatial-Temporal Transformer

In this work, an end-to-end spatial-temporal transformer is developed for surgical action triplet recognition, as illustrated in Fig. 2. It seamlessly integrates a Swin-B Transformer as the backbone for spatial feature extraction and self-attention layers as the temporal modelling module. Since action triplets change frequently during surgery, we believe that triplet recognition has a weak demand for modelling long-term dependency, and we therefore use video clips with a limited sequence length as input to the model. Let $b$ denote the batch size and $L$ the sequence length of the input video clip.

For the spatial feature extractor, the input video frames are first processed by the Swin-B Transformer [9] at a resolution of 192 × 192, outputting 1024-dimensional spatial features. Compared with CNN-based feature extractors, the Swin Transformer fully exploits the advantages of the attention mechanism and builds hierarchical feature maps with shifted-window-based self-attention, facilitating the perception of action triplets in intricate surgical scenes.

The extracted spatial features can be permuted as $F_s \in \mathbb{R}^{b \times L \times 1024}$, which are then fed to two self-attention blocks for temporal modelling. Each self-attention block consists of a masked multi-head attention layer and a feed-forward layer. A residual shortcut and layer normalization follow each sub-layer. For each masked multi-head attention layer, features are projected to query ($Q$), key ($K$) and value ($V$) vectors in 8 heads using three linear layers. Scaled dot-product attention is calculated for each head:

$$\mathrm{Attention}(Q, K, V) = \mathrm{Softmax}\left(\mathrm{Mask} \circ \frac{QK^{T}}{\sqrt{d}}\right)V \tag{1}$$

where the feature dimension is $d = 128$ and an upper triangular attention mask is adopted to block future information. The outputs of all heads are concatenated and reprojected to 1024 dimensions. It should be noted that only the feature at the current time, $F_t \in \mathbb{R}^{b \times 1024}$, is extracted as the final output of our temporal modelling module.

2.2 Multi-task Auxiliary Supervisions

Compared with surgical action triplet recognition, it is simpler to recognize a triplet's sub-components. Therefore, we can simplify the task of triplet recognition and let the network learn valuable information from simple tasks to assist triplet recognition. For this purpose, we introduce multi-task auxiliary supervisions for the training of our transformer model. In addition to the supervision of surgical action triplets, we decouple the triplet labels ($y_{IVT}$) into sub-components ($y_I$, $y_V$, $y_T$), enabling the supervision of instruments, verbs and targets. Since the recognition of a triplet and its sub-components is a multi-label classification problem, a Sigmoid function is applied for the final prediction. Weighted binary cross-entropy losses ($L_{WBCE}$) are calculated for instruments ($L_I$), verbs ($L_V$), targets ($L_T$) and triplets ($L_{IVT}$):

$$L_{C \in \{I, V, T, IVT\}} = L_{WBCE}(p_C, y_C) \tag{2}$$

where $p_C$ and $y_C$ denote the predicted probabilities and the labels of component $C$.

Surgical phase information is also introduced for auxiliary supervision, because a dependency between coarse-grained and fine-grained surgical activities probably exists, which can help the network to learn features of surgical action triplets. Since surgical phase recognition is a multi-class classification problem, a Softmax function is employed for the final output, and a weighted cross-entropy loss ($L_{WCE}$) is used for supervision, denoted as $L_P$:

$$L_P = L_{WCE}(p_P, y_P) \tag{3}$$

All the weights are determined by median frequency balancing [8]. The total loss of our multi-task supervision is:

$$L = L_I + L_V + L_T + L_P + L_{IVT} \tag{4}$$
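The loss terms of Eqs. (2)–(4) can be illustrated with a small pure-Python sketch. Several details are assumptions rather than the authors' implementation: median frequency balancing is taken in its common form (median class frequency divided by each class frequency, with every class assumed to occur at least once), each class weight is assumed to scale that class's whole cross-entropy term, and all function names are hypothetical.

```python
import math
from statistics import median


def median_frequency_weights(counts):
    """Class weights: median class frequency divided by each class frequency.

    Assumes every class occurs at least once (no zero counts).
    """
    total = sum(counts)
    freqs = [c / total for c in counts]
    med = median(freqs)
    return [med / f for f in freqs]


def weighted_bce(probs, targets, weights, eps=1e-7):
    """Multi-label weighted binary cross-entropy, as in Eq. (2)."""
    loss = 0.0
    for p, y, w in zip(probs, targets, weights):
        p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
        loss += -w * (y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return loss / len(probs)


def weighted_ce(probs, target_index, weights, eps=1e-7):
    """Multi-class weighted cross-entropy for the phase branch, Eq. (3)."""
    return -weights[target_index] * math.log(max(probs[target_index], eps))


def total_loss(losses):
    """Eq. (4): unweighted sum of the five supervision terms."""
    return sum(losses[key] for key in ("I", "V", "T", "P", "IVT"))
```

Rare classes receive weights above 1 and frequent classes weights below 1, which counteracts the unbalanced triplet distribution noted in the introduction.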

3 Experiments

3.1 Dataset Description

Experiments are conducted on the publicly available CholecT45 dataset [10], which is the only officially released dataset for surgical action triplet recognition. The dataset contains 45 laparoscopic cholecystectomy videos from the Cholec80 dataset [6] with a resolution of 1920 × 1080 or 854 × 480, sampled at 1 frame per second. Labels are provided for 100 classes of surgical action triplets, which are the available combinations of 6 instrument classes, 10 verb classes, and 15 target classes. Multiple triplets may be present in one video frame. Moreover, since all the videos come from the Cholec80 dataset, which contains labels for 7 surgical phases, we are able to incorporate phase information into the supervision. In this work, we follow the official five-fold data partition scheme of CholecT45 [10] and conduct rigorous five-fold cross-validation experiments.

3.2 Evaluation Metrics

In line with previous studies [1, 8, 11], we adopt mean average precision (mAP) as the metric to evaluate the performance of surgical action triplet recognition. We evaluate not only the triplets ($AP_{IVT}$) but also their sub-components ($AP_I$ for instruments, $AP_V$ for verbs and $AP_T$ for targets) and tool-centric interactions ($AP_{IV}$ for instrument-verbs and $AP_{IT}$ for instrument-targets). Among them, $AP_{IVT}$ is the main metric for evaluating action triplet recognition, and all the other mAP metrics are calculated from the final probability outputs of the action triplets, which is helpful for detailed evaluation. Specifically, mAP scores are calculated as follows. We first calculate the average precision (AP) score for each category in each video. Then, for each category, we average the AP scores across all videos to obtain the category AP score. Finally, the mAP score is derived by averaging all category AP scores.

Table 1. Ablative testing results (%) on one validation fold. Spatial represents the spatial feature extractor, Temporal represents the temporal modelling module, $L_I + L_V + L_T$ represents the supervisions of triplet sub-components, and $L_P$ represents the supervision of surgical phase.

| Spatial | Temporal | $L_I + L_V + L_T$ | $L_P$ | $AP_I$ | $AP_V$ | $AP_T$ | $AP_{IV}$ | $AP_{IT}$ | $AP_{IVT}$ |
|---|---|---|---|---|---|---|---|---|---|
| √ | | | | 81.5 | 56.0 | 39.0 | 41.9 | 34.3 | 31.6 |
| √ | √ | | | 83.2 | 59.4 | 39.5 | 48.9 | 37.1 | 36.4 |
| √ | √ | √ | | 90.0 | 61.7 | 41.9 | 51.6 | 44.3 | 40.7 |
| √ | √ | √ | √ | 93.5 | 71.0 | 49.4 | 53.1 | 45.6 | 41.8 |

Table 2. Quantitative comparison results with state-of-the-art methods through five-fold cross-validation. The average mAP values (%) and standard deviations (±) over five validation folds are reported.

| Method | $AP_I$ | $AP_V$ | $AP_T$ | $AP_{IV}$ | $AP_{IT}$ | $AP_{IVT}$ |
|---|---|---|---|---|---|---|
| Tripnet [1] | 89.9 ± 1.0 | 59.9 ± 0.9 | 37.4 ± 1.5 | 31.8 ± 4.1 | 27.1 ± 2.8 | 24.4 ± 4.7 |
| Attention Tripnet [8] | 89.1 ± 2.1 | 61.2 ± 0.6 | 40.3 ± 1.2 | 30.0 ± 2.9 | 29.4 ± 1.2 | 27.2 ± 2.7 |
| Rendezvous [8] | 89.3 ± 2.1 | 62.0 ± 1.3 | 40.0 ± 1.4 | 34.0 ± 3.3 | 30.8 ± 2.1 | 29.4 ± 2.8 |
| RiT [11] | 88.6 ± 2.6 | 64.0 ± 2.5 | 43.4 ± 1.4 | 38.3 ± 3.5 | 36.9 ± 1.0 | 29.7 ± 2.6 |
| Ours | 92.4 ± 1.5 | 70.6 ± 1.1 | 48.7 ± 3.5 | 46.3 ± 5.4 | 44.1 ± 3.1 | 36.5 ± 3.7 |

3.3 Implementation Details

Our method is implemented using the PyTorch framework and trained on an NVIDIA TITAN RTX GPU. All the experiments are conducted in an online and end-to-end manner. In each iteration, video clips consisting of 5 consecutive frames are used for training. Video frames are resized to 192 × 192, and black margins are removed automatically using morphological operations. During training, we apply random horizontal flipping, rotation and color jittering (adjusting brightness, saturation and contrast) for data augmentation. The batch size is set to 12. The Adam optimizer with a learning rate of 1e-4 and a weight decay of 1e-5 is used for training.
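To make the masked temporal attention of Sect. 2.1 concrete for a 5-frame clip, the following NumPy sketch computes a single attention head. It is a simplification under stated assumptions: the learned linear projections, the 8-head concatenation, the residual connections and the layer normalization are omitted, the mask is applied by setting future positions to negative infinity before the softmax, and the function name is hypothetical.

```python
import numpy as np


def masked_attention(q, k, v):
    """Single-head scaled dot-product attention with a causal mask.

    q, k, v have shape (L, d); position i may only attend to positions <= i.
    """
    length, d = q.shape
    scores = q @ k.T / np.sqrt(d)
    # Upper triangular mask (above the diagonal) blocks future frames.
    future = np.triu(np.ones((length, length), dtype=bool), k=1)
    scores = np.where(future, -np.inf, scores)
    # Numerically stable softmax over each row.
    scores = scores - scores.max(axis=1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)
    return weights @ v, weights


rng = np.random.default_rng(0)
clip = rng.standard_normal((5, 128))  # L = 5 frames, d = 128 per head
output, attention = masked_attention(clip, clip, clip)
current_frame_feature = output[-1]  # only the last time step is kept
```

The last row of `output` attends to all five frames, which corresponds to keeping only the feature at the current time as the output of the temporal module.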


4 Results

4.1 Ablation Studies

Comprehensive ablation studies are conducted to validate the contribution of each component of our method. The ablative testing results on one validation fold are presented in Table 1. Four components are considered in the ablation studies: the spatial feature extractor, the temporal modelling module, the supervisions of triplet sub-components and the supervision of surgical phase. The improvement contributed by each component is substantial. Note that $L_{IVT}$ is the only loss used for supervision when the supervisions of both sub-components and surgical phase are unavailable. Auxiliary branches are built only when the corresponding supervisions are used.

As shown in Table 1, when the temporal modelling module is used, a performance boost of 4.8% is observed in terms of $AP_{IVT}$ compared with using only the spatial feature extractor. This indicates that our temporal modelling module successfully learns the temporal information embedded in the input video clips, facilitating the recognition of surgical action triplets. In addition, when the supervisions of triplet sub-components are adopted, the performance of triplet recognition improves by 4.3% in $AP_{IVT}$, which reveals that simplifying triplet recognition to the recognition of sub-components for auxiliary supervision is beneficial for handling the complex triplet recognition problem. Furthermore, when the supervision of surgical phase is incorporated, the improvement is also substantial, especially in the recognition of sub-components, with a performance boost of 9.3% in terms of $AP_V$ and 7.5% in terms of $AP_T$. We believe that the dependency between coarse-grained and fine-grained surgical activities is helpful for surgical action triplet recognition. The model with all four components achieves the best recognition performance in all metrics, with 41.8% in terms of $AP_{IVT}$.

4.2 Comparisons with State-of-the-Arts

Four state-of-the-art (SOTA) methods are selected for comparison through five-fold cross-validation: Tripnet [1], Attention Tripnet [8], Rendezvous [8] and RiT [11]. The quantitative comparison results are reported in Table 2 and show that our method outperforms the state-of-the-art methods by a large margin. Our method achieves an average $AP_{IVT}$ of 36.5% over the five validation folds, which outperforms the best SOTA method, RiT, by 6.8%. Furthermore, our method won 2nd place among 10 competitive teams from all over the world in the CholecTriplet 2022 Challenge (action triplet recognition sub-task) [12], which further demonstrates the recognition capability of our proposed method.
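The mAP values reported in Tables 1 and 2 follow the video-wise averaging described in Sect. 3.2. A minimal pure-Python sketch of that procedure is given below. Two points are assumptions not stated in the text: AP is computed as the mean precision at the rank of each positive frame, and videos in which a category never occurs are skipped for that category. The function names are hypothetical and this is not the official evaluation code.

```python
def average_precision(scores, labels):
    """AP for one category in one video: mean precision at each positive."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    hits = 0
    precisions = []
    for rank, index in enumerate(order, start=1):
        if labels[index]:
            hits += 1
            precisions.append(hits / rank)
    # None marks a category that never occurs in this video (assumption).
    return sum(precisions) / hits if hits else None


def video_wise_map(videos):
    """videos: list of (scores, labels); each is a per-frame list of
    per-category values. Returns the mean over categories of the
    per-category AP averaged across videos."""
    n_categories = len(videos[0][0][0])
    category_aps = []
    for c in range(n_categories):
        aps = []
        for scores, labels in videos:
            ap = average_precision([f[c] for f in scores],
                                   [f[c] for f in labels])
            if ap is not None:
                aps.append(ap)
        if aps:
            category_aps.append(sum(aps) / len(aps))
    return sum(category_aps) / len(category_aps)
```

Averaging per video first gives each video equal influence on a category's score regardless of its length, which is one motivation for this order of averaging.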

5 Conclusions

In this paper, we propose an end-to-end spatial-temporal transformer-based framework for surgical action triplet recognition, trained with multi-task auxiliary supervisions. Our method outperforms state-of-the-art methods by 6.8% in terms of triplet recognition mAP through five-fold cross-validation on the CholecT45 dataset, and won 2nd place in the CholecTriplet 2022 Challenge. These results indicate that our method has the potential to serve as a strong baseline for surgical action triplet recognition, which should be helpful for future work.

Acknowledgment. This study was partially supported by Shanghai Municipal Science and Technology Commission via Project 20511105205 and by the National Natural Science Foundation of China via Project U20A20199.

References

1. Nwoye, C.I., Gonzalez, C., Yu, T., et al.: Recognition of instrument-tissue interactions in endoscopic videos via action triplets. In: Medical Image Computing and Computer Assisted Intervention–MICCAI 2020: 23rd International Conference, pp. 364–374, Lima, Peru (2020)
2. Wang, S., Xu, Z., Yan, C., et al.: Graph convolutional nets for tool presence detection in surgical videos. In: Information Processing in Medical Imaging: 26th International Conference, IPMI 2019, pp. 467–478, Hong Kong, China (2019)
3. Jin, Y., Long, Y., Gao, X., et al.: Trans-SVNet: hybrid embedding aggregation transformer for surgical workflow analysis. Int. J. Comput. Assist. Radiol. Surg. 17(12), 1–10 (2022)
4. Czempiel, T., Paschali, M., Ostler, D., et al.: Opera: attention-regularized transformers for surgical phase recognition. In: Medical Image Computing and Computer Assisted Intervention–MICCAI 2021: 24th International Conference, Strasbourg, France, pp. 604–614 (2021)
5. Zou, X., Liu, W., Wang, J., et al.: ARST: auto-regressive surgical transformer for phase recognition from laparoscopic videos. Comput. Methods Biomech. Biomed. Eng. Imaging Vis. 1–7 (2022)
6. Twinanda, A.P., Shehata, S., Mutter, D., et al.: Endonet: a deep architecture for recognition tasks on laparoscopic videos. IEEE Trans. Med. Imaging 36(1), 86–97 (2016)
7. Vaswani, A., Shazeer, N., Parmar, N., et al.: Attention is all you need. Adv. Neural Inf. Process. Syst. 30 (2017)
8. Nwoye, C.I., Yu, T., Gonzalez, C., et al.: Rendezvous: attention mechanisms for the recognition of surgical action triplets in endoscopic videos. Med. Image Anal. 78, 102433 (2022)
9. Liu, Z., Lin, Y., Cao, Y., et al.: Swin transformer: hierarchical vision transformer using shifted windows. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 10012–10022 (2021)
10. Nwoye, C.I., Padoy, N.: Data splits and metrics for method benchmarking on surgical action triplet datasets (2022). arXiv:2204.05235
11. Sharma, S., Nwoye, C.I., Mutter, D., et al.: Rendezvous in time: an attention-based temporal fusion approach for surgical triplet recognition (2022). arXiv:2211.16963
12. Nwoye, C.I., Yu, T., Sharma, S., et al.: CholecTriplet2022: Show me a tool and tell me the triplet – an endoscopic vision challenge for surgical action triplet detection (2023). arXiv:2302.06294

2D/3D Reconstruction of Patient-Specific Surface Models and Uncertainty Estimation via Posterior Shape Models

Wenyuan Sun, Yuyun Zhao, Jihao Liu, and Guoyan Zheng(✉)

Institute of Medical Robotics, School of Biomedical Engineering, Shanghai Jiao Tong University, Dongchuan Road, Shanghai, China

Abstract. In this paper, we propose a novel method for 3D reconstruction of patient-specific surface models from 2D X-ray images and uncertainty quantification via posterior shape models. Taking the silhouette point cloud generated from biplanar X-ray images as the given partial information, a posterior shape model is constructed to compute the posterior distribution of the surface model given the silhouette. By sampling surface models from the posterior distribution, we can not only compute the patient-specific 3D reconstruction but also quantify the reconstruction uncertainty. Comprehensive experiments were conducted on 25 synthetic cases and 10 cadaveric cases of the proximal femur. Both quantitative and qualitative results demonstrated the effectiveness of the posterior shape model-based reconstruction method.

Keywords: 2D/3D reconstruction · Posterior shape model · Bone surface model · Biplanar X-ray imaging · Orthopaedic surgery

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024 G. Wang et al. (Eds.): APCMBE 2023, IFMBE Proceedings 104, pp. 121–127, 2024. https://doi.org/10.1007/978-3-031-51485-2_15

1 Introduction

Intra-operative fluoroscopic imaging has been widely applied in image-guided orthopaedic surgeries due to its ability to visualize underlying bones. Compared with three-dimensional (3D) intra-operative imaging based on computed tomography (CT), fluoroscopic imaging has a lower radiation dose and acquisition cost [1], but one dimension of information is lost due to the projection. To guide the operation, surgeons have to imagine the 3D spatial information based on one or multiple two-dimensional (2D) X-ray images, which is time-consuming and error-prone [2].

To address this issue, 2D/3D reconstruction, which aims to reconstruct a 3D representation of bones from 2D X-ray images, has been extensively studied. To incorporate prior knowledge into this ill-posed reconstruction problem, various methods have been proposed, which can be split into two main categories: 3D surface model reconstruction methods [3–5] and 3D volume reconstruction methods [6–8]. Methods belonging to the former category reconstruct 3D bone surface models by matching statistical shape models (SSM) with the input 2D images based on anatomical landmarks or image contours [3–5], while 3D volume reconstruction methods estimate deformation fields from 2D images to generate 3D patient-specific volumes by warping an atlas based on statistical shape and intensity models (SSIM) [6–8].

Meanwhile, the estimation of uncertainty in 2D/3D bone surface model reconstruction is also highly desirable. Most traditional optimization methods present only a single maximum point estimate instead of a probabilistic distribution [3, 4, 6–8]. Thus, these methods cannot provide an uncertainty estimation. Thusini et al. [5] framed 2D/3D reconstruction as Bayesian inference and estimated the uncertainty by computing the full posterior distribution of model instances given the input images. However, the Markov Chain Monte Carlo (MCMC)-based approach used in this method requires a time-consuming propose-and-verify inference procedure.

In this paper, we propose a novel method for 3D reconstruction of patient-specific surface models from 2D X-ray images and uncertainty quantification via posterior shape models [9]. Taking the silhouette point cloud generated from biplanar X-ray images as the given partial information, a posterior shape model is constructed to estimate the posterior distribution of the surface model given the silhouette. By sampling surface models from the estimated probabilistic distribution, we can not only compute the patient-specific 3D reconstruction but also quantify the reconstruction uncertainty.

2 Method

2.1 Overview of the Present Method

In this study, a training set of $n$ bone surface models $\{P_1, \ldots, P_n\}$ is employed, where each surface model is described as a vector $P_i$ with $N$ vertices:

$$P_i = \{x_1, y_1, z_1, x_2, y_2, z_2, \ldots, x_N, y_N, z_N\}^T \tag{1}$$

All the bone surface models are aligned with each other using a scaled rigid registration, and the vertex-to-vertex correspondences are built using the method introduced in [10]. Based on the correspondences, a mean model is estimated:

$$\bar{P} = \frac{1}{n} \sum_{i=1}^{n} P_i \tag{2}$$

Figure 1a illustrates a schematic view of a biplanar fluoroscopic imaging setup. In the setup, two X-ray images are acquired from the anterior-posterior (AP) direction and the lateral (LAT) direction, respectively, which are assumed to be calibrated and co-registered to an imaging system coordinate system. To reconstruct the bone surface model, we estimate a scaled rigid transformation from the mean model to the imaging system coordinate system. Therefore, the reconstruction of bone surface models can be performed in the mean model space. A silhouette point cloud of the aligned mean model, denoted as $\bar{S}$, is then extracted from both the AP and LAT directions; it contains $M$ silhouette points. Meanwhile, the 2D contours of the bony structure in both the AP and LAT images are extracted. Based on $\bar{S}$ and the contours, a target silhouette point cloud $S$ is then generated using the method presented in [3], as shown in Fig. 1b. Thus, $\bar{S}$ and $S$ are paired.

Fig. 1. Method overview: (a) a schematic view of a biplanar fluoroscopic imaging setup; (b) the generation of silhouette point clouds from the aligned mean model and the extracted image contour.

To reconstruct the bone surface model and estimate the reconstruction uncertainty, we construct a posterior shape model based on the training set, which takes the paired silhouette point clouds as the given partial information using the method presented in [9]. By sampling $K$ surface models from the posterior shape model, we then reconstruct the bone surface model by averaging the $K$ samples, and estimate the reconstruction uncertainty based on the distance standard deviation of each vertex on the reconstructed surface model.

2.2 Construction of the Posterior Shape Model

To construct the posterior shape model, we first build an SSM based on the training set. Based on the mean model $\bar{P}$, a covariance matrix $\Sigma$ is calculated by:

$$\Sigma = \frac{1}{n} \sum_{i=1}^{n} \left(P_i - \bar{P}\right)\left(P_i - \bar{P}\right)^T \tag{3}$$

$\Sigma$ is then decomposed as $\Sigma = U D^2 U^T$ with a principal component analysis (PCA), where $U \in \mathbb{R}^{3N \times n}$ is composed of principal components which represent the characteristic shape variations of the surface model, and $D \in \mathbb{R}^{n \times n}$ is composed of the eigenvalues which quantify their corresponding variance. Therefore, an instance can be generated from the SSM with a coefficient vector $\alpha = \{\alpha_1, \ldots, \alpha_n\}^T$:

$$P = \bar{P} + U D \alpha = \bar{P} + Q \alpha \tag{4}$$

where $Q = UD$.
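Equations (2)–(4) can be sketched with NumPy as follows. The sketch assumes the training shapes are already aligned and in correspondence, and it obtains the PCA through a singular value decomposition of the centered data instead of forming the $3N \times 3N$ covariance matrix; the function names are hypothetical.

```python
import numpy as np


def build_ssm(shapes):
    """Build a statistical shape model from aligned shape vectors.

    shapes: array of shape (n, 3N), one surface model per row.
    Returns the mean model and Q = U D, so that Q Q^T equals the
    covariance matrix of Eq. (3).
    """
    shapes = np.asarray(shapes, dtype=float)
    n = shapes.shape[0]
    mean = shapes.mean(axis=0)                      # Eq. (2)
    centered = shapes - mean
    # SVD of the centered data: covariance = U (s^2 / n) U^T.
    u, s, _ = np.linalg.svd(centered.T, full_matrices=False)
    d = s / np.sqrt(n)                              # square roots of variances
    return mean, u * d                              # columns of U scaled by d


def ssm_instance(mean, q, alpha):
    """Eq. (4): generate a shape instance from a coefficient vector."""
    return mean + q @ np.asarray(alpha, dtype=float)
```

Setting `alpha` to zero returns the mean model, and drawing `alpha` from a standard normal distribution produces random plausible shapes.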

We assume that α is distributed according to a multivariate normal distribution  N (0, In ). Therefore, the bone surface model is distributed according to N P,  . Based on the SSM, the posterior shape model is then constructed to compute a conditional distribution of the whole surface model given the silhouette point cloud. We construct a matrix s by selecting rows from  that correspond to the silhouette points in S. The given silhouette point S is then represented as [9]: S = S + s α + ε

(5)


W. Sun et al.

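where $\varepsilon$ is a noise variable which is distributed according to a normal distribution $\mathcal{N}(0, \sigma^{2} I_{3M})$. As introduced in [9], we choose $\sigma^{2} = \frac{1}{M}\|S - \hat{S}\|_2^{2}$, which is the mean squared error between $S$ and its best approximation in the SSM, denoted as $\hat{S}$. Therefore, the posterior distribution of the whole surface model given the silhouette point cloud is calculated as [9]:

$$p(P \mid S) = \mathcal{N}\left(\bar{P}_c, \Sigma_c\right) \tag{8}$$

where

$$\bar{P}_c = \bar{P} + \Phi\left(\Phi_s^{T}\Phi_s + \sigma^{2} I_n\right)^{-1}\Phi_s^{T}\left(S - \bar{S}\right) \tag{9}$$

$$\Sigma_c = \sigma^{2}\,\Phi\left(\Phi_s^{T}\Phi_s + \sigma^{2} I_n\right)^{-1}\Phi^{T} \tag{10}$$

Then, by performing a PCA on $\Sigma_c = U_c D_c^{2} U_c^{T}$, we can generate a conditional instance with a coefficient vector $\alpha = \{\alpha_1, \ldots, \alpha_n\}^{T}$:

$$P^{c} = \bar{P}_c + U_c D_c \alpha = \bar{P}_c + \Phi_c\alpha \tag{11}$$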

2.3 Surface Model Reconstruction and Uncertainty Estimation via the Posterior Shape Model

Based on the posterior shape model, we can model the characteristic shape variations of the bone surface model given the silhouette point cloud. To reconstruct the surface model, we sample K coefficient vectors $\alpha_k$ from $\mathcal{N}(0, I_n)$ to generate K surface models. We then take the average surface model as the reconstruction result:

$$P_{rec} = \bar{P}_c + \Phi_c\left(\frac{1}{K}\sum_{k=1}^{K}\alpha_k\right) \tag{12}$$

Meanwhile, we calculate the standard deviation of the distance at each vertex as its reconstruction uncertainty. The uncertainty of the i-th vertex $p_i$ of the reconstructed model, denoted as $\rho_i$, is calculated by:

$$\rho_i = \sqrt{\frac{1}{K-1}\sum_{k=1}^{K}\left\|p_i^{(k)} - p_i\right\|_2^{2}} \tag{13}$$

where $p_i^{(k)}$ is the i-th vertex of the k-th sampled surface model.
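The pipeline of Eqs. (3)–(13) can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`build_ssm`, `posterior`, `reconstruct`) are hypothetical, each shape is assumed to be stored as a flat vector (x1, y1, z1, x2, ...), the noise variance is passed in rather than estimated, and the PCA of the posterior covariance in Eq. (11) is replaced by a Cholesky-based factor that has the same covariance and therefore produces samples from the same distribution.

```python
import numpy as np

def build_ssm(shapes):
    """shapes: (n, 3N) training shape vectors. Returns the mean and Phi = U D (Eqs. 3-4)."""
    n = shapes.shape[0]
    mean = shapes.mean(axis=0)
    # Thin SVD of the centred data: Sigma = U D^2 U^T with D the scaled singular values.
    U, d, _ = np.linalg.svd((shapes - mean).T / np.sqrt(n), full_matrices=False)
    return mean, U * d

def posterior(mean, Phi, rows, S, sigma2):
    """Posterior mean (Eq. 9) and a factor Phi_c with Phi_c Phi_c^T = Sigma_c (Eq. 10).

    rows: indices of the entries of the shape vector that are observed (silhouette points).
    S:    observed values at those entries.
    """
    Phi_s = Phi[rows]
    A = Phi_s.T @ Phi_s + sigma2 * np.eye(Phi.shape[1])
    mean_c = mean + Phi @ np.linalg.solve(A, Phi_s.T @ (S - mean[rows]))
    # Assumption: Cholesky factor instead of the PCA of Sigma_c used in Eq. (11).
    L = np.linalg.cholesky(A)
    Phi_c = np.sqrt(sigma2) * np.linalg.solve(L, Phi.T).T
    return mean_c, Phi_c

def reconstruct(mean_c, Phi_c, K=20, seed=0):
    """Average of K posterior samples (Eq. 12) and per-vertex uncertainty (Eq. 13)."""
    rng = np.random.default_rng(seed)
    alphas = rng.standard_normal((K, Phi_c.shape[1]))
    samples = mean_c + alphas @ Phi_c.T              # (K, 3N)
    rec = samples.mean(axis=0)
    sq_dist = ((samples - rec).reshape(K, -1, 3) ** 2).sum(axis=2)  # (K, N)
    rho = np.sqrt(sq_dist.sum(axis=0) / (K - 1))
    return rec, rho
```

Working with the factor `Phi_c` avoids forming the full 3N × 3N covariance matrix, which matters little for a toy example but keeps memory use linear in the number of vertices.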

3 Experiments and Results

3.1 Experimental Setup

In this study, we used the proximal femur to evaluate the proposed method. 80 surface models, which were derived from the SSM of the complete femur used in [11], were employed to construct the posterior shape model. Each surface model contained 2917 vertices and 5786 triangles.


25 synthetic cases and 10 cadaveric cases were used to evaluate the proposed method. The synthetic cases were generated by simulating the biplanar X-ray images of another 25 surface models from [11]. The cadaveric cases were generated from 10 dry cadaveric left femurs. Details about the experimental setup for these femurs were presented previously in [7]. Specifically, two fluoroscopic images were acquired using a calibrated C-arm from the anteroposterior (AP) and lateral (LAT) directions, respectively. For each femur, 96 3D points were randomly digitized on the surface using a trackable pointer. We then evaluated the reconstruction accuracy by computing the distance between these points and the reconstructed surface model. In this study, we chose K = 20.

3.2 Metrics

In this study, we evaluated the reconstruction accuracy by the average surface distance (ASD) and a modified Hausdorff distance. ASD measured the average distance from each vertex of the reconstructed surface model to the ground-truth surface model, and the modified Hausdorff distance was an undirected 95th-percentile Hausdorff distance (HD95). The lower the ASD and HD95, the better the reconstruction.

3.3 Evaluation on the Synthetic Cases

In this experiment, we evaluated the proposed reconstruction method on the 25 synthetic cases, and compared it with the SSM-based instantiation method introduced in [3]. The experimental results are summarized in Table 1 and Fig. 2. The posterior shape model-based method achieved a mean ASD of 0.91 mm and a mean HD95 of 2.26 mm. In contrast, the mean HD95 achieved by SSM-based instantiation was larger (2.32 mm), while the mean ASD was the same. The uncertainty estimation is shown in the last row of Fig. 2 using a color-coded visualization.

Table 1. Quantitative results of the evaluation on 25 synthetic cases.

| Method | Mean ASD (mm) | Mean HD95 (mm) |
|---|---|---|
| SSM-based instantiation [3] | 0.91 ± 0.16 | 2.32 ± 0.40 |
| Posterior shape model-based reconstruction | 0.91 ± 0.19 | 2.26 ± 0.47 |
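The two metrics of Sect. 3.2 can be sketched as follows. This is an illustrative approximation rather than the evaluation code used in the study: the distance from a point to a surface is approximated by the distance to the nearest vertex of that surface, and HD95 is taken as the larger of the two directed 95th-percentile distances, which is one common way to make the measure undirected. The function names are hypothetical.

```python
import numpy as np

def nearest_distances(a, b):
    """For each point in a (Na, 3), the distance to its nearest point in b (Nb, 3)."""
    diff = a[:, None, :] - b[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2)).min(axis=1)

def asd(rec, gt):
    """Average distance from the reconstructed vertices to the ground-truth vertices."""
    return nearest_distances(rec, gt).mean()

def hd95(rec, gt):
    """Undirected 95th-percentile Hausdorff distance (assumed: max of both directions)."""
    forward = np.percentile(nearest_distances(rec, gt), 95)
    backward = np.percentile(nearest_distances(gt, rec), 95)
    return max(forward, backward)
```

The brute-force pairwise distance matrix is adequate for meshes of a few thousand vertices, such as the 2917-vertex models used here; a spatial index would be preferable for much larger meshes.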

3.4 Evaluation on the Cadaveric Cases

In this experiment, we evaluated the proposed method on the 10 cadaveric cases. The experimental results are summarized in Table 2 and Fig. 3. The posterior shape model-based method achieved a mean ASD of 1.02 mm and a mean HD95 of 2.28 mm, which were 0.02 mm and 0.03 mm lower, respectively, than those obtained by the SSM-based instantiation method [3]. We also visualized the uncertainty estimated by the proposed method, as shown in the last row of Fig. 3.


Fig. 2. Comparison of the results achieved by the SSM-based instantiation method [3] and the proposed method, and a color-coded visualization of the reconstruction uncertainty estimated by the posterior shape model-based reconstruction method.

Fig. 3. Visualization of the model reconstruction and the uncertainty estimation by the posterior shape model-based reconstruction method.

Table 2. Quantitative results of the evaluation on 10 cadaveric cases.

| Method | Mean ASD (mm) | Mean HD95 (mm) |
|---|---|---|
| SSM-based instantiation [3] | 1.04 ± 0.42 | 2.31 ± 0.79 |
| Posterior shape model-based reconstruction | 1.02 ± 0.33 | 2.28 ± 0.64 |

4 Conclusion

In this paper, we propose a novel method for 3D reconstruction of patient-specific surface models from 2D X-ray images and for uncertainty quantification via posterior shape models. Taking the silhouette point cloud generated from biplanar X-ray images as the given partial information, a posterior shape model is constructed to estimate the posterior distribution of the surface model given the silhouette. By sampling surface models from the estimated probabilistic distribution, we can not only compute the patient-specific 3D reconstruction but also quantify the reconstruction uncertainty. Experiments were conducted on 25 synthetic cases and 10 cadaveric cases of the proximal femur. Both quantitative and qualitative results demonstrated the effectiveness of the proposed method.

Acknowledgment. This study was partially supported by Shanghai Municipal Science and Technology Commission via Project 20511105205 and by the National Natural Science Foundation of China via project U20A20199.

References

1. Huppertz, A., Radmer, S., Asbach, P., et al.: Computed tomography for preoperative planning in minimal-invasive total hip arthroplasty: radiation exposure and cost analysis. Eur. J. Radiol. 78, 406–413 (2011)
2. Valenti, M., Ferrigno, G., Martina, D., et al.: Gaussian mixture models based 2D–3D registration of bone shapes for orthopedic surgery planning. Med. Biol. Eng. Comput. 54, 1727–1740 (2016)
3. Zheng, G., Gollmer, S., Schumann, S., et al.: A 2D/3D correspondence building method for reconstruction of a patient-specific 3D bone surface model using point distribution models and calibrated X-ray images. Med. Image Anal. 13, 883–899 (2009)
4. Baka, N., Kaptein, B.L., de Bruijne, M., et al.: 2D–3D shape reconstruction of the distal femur from stereo X-ray imaging using statistical shape models. Med. Image Anal. 15, 840–850 (2011)
5. Thusini, X.O., Reyneke, C.J.F., Aellen, J., et al.: Uncertainty reduction in contour-based 3D/2D registration of bone surfaces. In: Shape in Medical Imaging: International Workshop, ShapeMI 2020, Held in Conjunction with MICCAI 2020, Proceedings, pp. 18–29. Lima, Peru (2020)
6. Zheng, G.: Personalized X-ray reconstruction of the proximal femur via intensity-based non-rigid 2D-3D registration. In: Medical Image Computing and Computer-Assisted Intervention–MICCAI 2011: 14th International Conference, Proceedings, Part II, pp. 598–606. Toronto, Canada (2011)
7. Yu, W., Chu, C., Tannast, M., et al.: Fully automatic reconstruction of personalized 3D volumes of the proximal femur from 2D X-ray images. Int. J. Comput. Assisted Radiol. Surg. 11, 1673–1685 (2016)
8. Yu, W., Tannast, M., Zheng, G.: Non-rigid free-form 2D–3D registration using a B-spline-based statistical deformation model. Pattern Recogn. 63, 689–699 (2017)
9. Albrecht, T., Lüthi, M., Gerig, T., et al.: Posterior shape models. Med. Image Anal. 17, 959–973 (2013)
10. Meller, S., Kalender, W.A.: Building a statistical shape model of the pelvis. Int. Congr. Ser. 1268, 561–566 (2004)
11. Zheng, G., Hommel, H., Akcoltekin, A., et al.: A novel technology for 3D knee prosthesis planning and treatment evaluation using 2D X-ray radiographs: a clinical evaluation. Int. J. Comput. Assisted Radiol. Surg. 13, 1151–1158 (2018)

Semantics-Preserved Domain Adaptation with Target Diverse Perturbation and Test Ensembling for Image Segmentation

Xiaoru Gao, Runze Wang, Rong Tao, and Guoyan Zheng (corresponding author)

Institute of Medical Robotics, School of Biomedical Engineering, Shanghai Jiao Tong University, Dongchuan Road, Shanghai, China
[email protected]

Abstract. Despite the remarkable advances in unsupervised cross-domain image segmentation, existing works suffer from two main limitations. They either ignore semantics preservation when transferring knowledge from the source domain to the target domain, or fail to fully exploit the rich information in the large amount of unlabeled data in the target domain, leading to a biased segmentation model. To address these issues, we propose a novel semantics-preserved cross-domain image segmentation method with a new diverse image perturbation in the target domain, improving the capacity and robustness of the model. We also propose an effective test-time ensembling strategy for a more confident prediction result.

Keywords: Unsupervised · Domain adaptation · Image segmentation · Semantics-preservation · Diverse perturbation

1 Introduction

Deep neural networks have achieved significant progress in semantic image segmentation when a large amount of labeled training data is available in a specific domain [1]. However, when such a model is applied to other unseen domains, it can fail completely due to the excessive domain discrepancy/shift [2, 3]. Since pixel-wise annotations are laborious and hard to obtain, many researchers have turned to unsupervised domain adaptation (UDA) [4] for cross-domain image segmentation (CDIS), aiming to adapt the knowledge learned from the labeled source domain to the unlabeled target domain. Instead of explicitly reducing the domain discrepancy between source and target domains by minimizing a pre-defined distance metric [5], recent works on unsupervised cross-domain image segmentation have achieved great progress by separating the task into two sequential steps [6–8]. Generally, these approaches first construct an image translation module for cross-domain image synthesis, and then either train the segmentation model with pseudo labels generated from synthetic images [6, 7] or add a discriminator for feature alignment [8].

Despite this remarkable progress, we find that two key ingredients are lacking in previous works. Firstly, most of them have no explicit mechanism for preserving semantics during image translation, leading to synthesized images with a loss of structural characteristics. As a result, the synthesized images may generate noisy and erroneous pseudo labels for the segmentation network, which bring unreliable knowledge into the target domain. Secondly, previous studies did not fully exploit the large set of unlabeled target data, which may lead to a biased segmentation model.

To address these issues, in this paper, we propose a novel method for cross-domain image segmentation. Specifically, we first introduce a semantics preservation regularization applied to both the intra-domain reconstruction and the cross-domain translation processes to guarantee reliable image translation for accurate knowledge transfer. Then we propose a new target diverse perturbation, which consists of a self-diverse perturbation and a cross-diverse perturbation applied to the target domain, to make the model learn rich semantic features of the unlabeled target domain. We further propose a simple test-time ensembling strategy to better handle unseen difficult samples. Experiments on a cross-domain lumbar spine segmentation task demonstrate the effectiveness of the proposed method.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024 G. Wang et al. (Eds.): APCMBE 2023, IFMBE Proceedings 104, pp. 128–135, 2024. https://doi.org/10.1007/978-3-031-51485-2_16

2 Methods

2.1 Preliminary

Let $x \in X$ and $y \in Y$ be images from the source domain and the target domain, respectively, and let $l_{gt} \in L_X$ be the annotations corresponding to $x$. Note that we have no annotations for the target domain in the training phase.

2.2 Image Translation

The image translation module in our framework is adapted from MUNIT [9]. It disentangles the images into a domain-invariant content space C and a domain-specific attribute space A via an encoder-decoder-like network. The schematic diagram of the image translation module is shown in the left of Fig. 1. We denote the content and attribute encoders by $\{E_x^c, E_y^c\}$ and $\{E_x^a, E_y^a\}$, respectively, and the generators by $\{G_x, G_y\}$. For given images $x$ and $y$, we obtain the content features $c_x = E_x^c(x)$ and $c_y = E_y^c(y)$ and the corresponding attributes $a_x = E_x^a(x)$ and $a_y = E_y^a(y)$. The self-reconstructed images are obtained by recombining the content and attribute features, i.e., $x' = G_x(c_x, a_x)$ and $y' = G_y(c_y, a_y)$. The translated images are obtained by recombining the content features with an attribute code randomly sampled from the reference domain, i.e., $\tilde{x} = G_x(c_y, a_x^{*})$ and $\tilde{y} = G_y(c_x, a_y^{*})$. The translated images can be encoded again to obtain the cross-cycle reconstruction images $\hat{x}$ and $\hat{y}$ with the initially encoded attributes, i.e., $\hat{x} = G_x(c'_x = E_y^c(\tilde{y}), a_x)$ and $\hat{y} = G_y(c'_y = E_x^c(\tilde{x}), a_y)$. The losses for image translation are composed of the L1 norm between the original inputs and the corresponding reconstructed images, the content and attribute feature reconstruction losses, and the generative adversarial losses between the translated images and the reference images.


Fig. 1. Left: A schematic diagram of image translation. Right: The illustration of the phase component of the Fourier transform.

2.3 Semantics Preservation Regularization (SPR)

Recent style transfer-based methods in UDA aligned image distributions by domain discriminators but ignored the preservation of semantics. Moreover, cycle-consistency only enforces the invertibility of the learned transformation functions. To address this issue, we propose a semantics preservation regularization (SPR) in both image space and Fourier space from a global and local perspective.

Image Space: We extract the high-frequency information of an image with the Gaussian kernel, which is defined as:

$$K_\sigma(m, n) = \frac{1}{\sqrt{2\pi\sigma^{2}}}\exp\left(-\frac{m^{2} + n^{2}}{2\sigma^{2}}\right) \tag{1}$$

where $(m, n)$ denotes the spatial location in an image, and $\sigma^{2}$ denotes the variance. Then we can obtain the average smooth image $g_l$ by:

$$g_l(m, n) = K_\sigma(m, n) * I(m, n) \tag{2}$$

where $*$ represents the convolution operation and $I$ represents the grayscale of the input image. Finally, we can obtain the image containing the sharp semantic information by:

$$H(x) = I - g_l \tag{3}$$

where $x$ is the input image.

Fourier Space: The phase component of the Fourier Transform contains the semantic information of the image. As shown in the right of Fig. 1, when we change the amplitude of the Fourier Transform, the semantics of the reconstructed image are not changed. Moreover, the semantic structures can be completely reconstructed with only the phase. Thus, we consider that the phase should be consistent before and after image translation. We enforce this constraint by minimizing the cosine dissimilarity:

$$P(x, \tilde{x}) = -\frac{\langle F(x), F(\tilde{x})\rangle}{\|F(x)\|_2 \cdot \|F(\tilde{x})\|_2} \tag{4}$$

where $F$ is the Fourier Transform, $\langle\cdot,\cdot\rangle$ is the dot product, and $\|\cdot\|_2$ is the L2 norm.
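The two ingredients of SPR, the high-pass image of Eqs. (1)–(3) and the Fourier-space term of Eq. (4), can be sketched with NumPy as follows. This is a hedged illustration rather than the authors' code: the function names are hypothetical, the Gaussian kernel is normalized to sum to one and applied by circular (FFT-based) convolution for brevity, and the dot product in Eq. (4) is interpreted as the real inner product of the complex spectra, with real and imaginary parts treated as separate coordinates.

```python
import numpy as np

def gaussian_kernel(shape, sigma):
    """Gaussian kernel of Eq. (1) on a wrapped grid, normalized to sum to 1 (assumption)."""
    h, w = shape
    m = np.fft.fftfreq(h) * h          # integer offsets 0, 1, ..., -2, -1
    n = np.fft.fftfreq(w) * w
    mm, nn = np.meshgrid(m, n, indexing="ij")
    k = np.exp(-(mm ** 2 + nn ** 2) / (2.0 * sigma ** 2))
    return k / k.sum()

def high_pass(img, sigma=2.0):
    """H(x) = I - g_l, with g_l the Gaussian-smoothed image (Eqs. 2-3)."""
    k = gaussian_kernel(img.shape, sigma)
    low = np.real(np.fft.ifft2(np.fft.fft2(img) * np.fft.fft2(k)))
    return img - low

def fourier_consistency(x, x_t):
    """Negative cosine similarity between the spectra of x and its translation (Eq. 4)."""
    fx = np.fft.fft2(x).ravel()
    ft = np.fft.fft2(x_t).ravel()
    dot = np.real(np.vdot(ft, fx))     # real inner product of the complex spectra
    return -dot / (np.linalg.norm(fx) * np.linalg.norm(ft))
```

The term reaches its minimum of -1 when the two spectra are aligned, so minimizing it pushes the translated image toward the frequency content of the input. In a training loop these operations would be written with a differentiable framework; NumPy is used here only to show the arithmetic.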


Intra-domain SPR: We employ SPR on the intra-domain reconstruction images with the L1 norm, i.e., for the source domain, the semantics should be preserved between the original image $x$, the self-reconstructed image $x'$, and the cross-cycle reconstruction image $\hat{x}$. The same is done for the target domain.

Cross-domain SPR: We further employ SPR on the cross-domain images with the L1 norm, i.e., the semantics should be preserved between the input image $x$ and the translated image $\tilde{x}$. The same is done for image $y$ and its translated images.

2.4 Target Diverse Perturbation (TDP)

To fully explore and leverage the labeled image information of the source domain and the unlabeled image information of the target domain, we propose a novel online perturbation strategy named Target Diverse Perturbation (TDP), which is composed of a self-diverse perturbation and a cross-diverse perturbation, to improve the capacity and generalization of the segmentation model. The segmentation model is then trained on these texture-variant images to learn texture-invariant features under cross consistency. The schematic diagram is shown in the left of Fig. 2.

Fig. 2. Left: A schematic diagram of the proposed target diverse perturbation. Right: Visualization of the target diverse perturbations.

Self-diverse perturbation: This perturbation is applied to the target domain images. Given an input image $y$, the self-diverse perturbation is generated by combining its content with different attribute codes sampled from a Gaussian distribution: $\tilde{y}_s^{i} = G_y(c_y, a_y^{i})$. These images are then used to train the segmentation model.

Cross-diverse perturbation: To transfer knowledge from the labeled source domain to the unlabeled target domain, we introduce a cross-diverse perturbation on the annotated source data for accurate training. Given an input image $x$, the cross-diverse perturbation is generated by combining its content with different attribute codes sampled from the target domain: $\tilde{y}_c^{i} = G_y(c_x, a_y^{i})$. These images are then used to train the segmentation model.

Segmentation model for cross-domain segmentation: We employ ResUnet [10] as our segmentation model. $S_x$ and $S_y$ are the segmentation models for the source domain and the target domain, respectively. The left part of Fig. 3 shows the input and output of our segmentation model. We use the cross-entropy loss to train the segmentation models.


Fig. 3. Left: The segmentation module in our method. Right: The illustration of the mechanism of test-time ensembling strategy.

2.5 Test-Time Ensembling (TTE) for CDIS

Most common methods test the learned deep model directly on the unseen test dataset. However, some difficult and noisy cases are beyond the ability of the model. We introduce a new test-time ensembling strategy for the cross-domain segmentation task, which proves to be effective for unseen domains with a large domain shift. The right part of Fig. 3 shows the schematic diagram of TTE. For a given test image $y$, we first generate diverse images as described for the self-diverse perturbation. We then input these images into $S_y$ separately and obtain multiple predictions. After the softmax operation, the results are summed channel by channel. The final prediction is obtained by taking, for each pixel, the channel with the maximum summed value.
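The ensembling step can be sketched as below. This is a minimal illustration under assumptions: the per-perturbation logits are taken as given (in the described pipeline they would come from applying $S_y$ to each self-diverse perturbation of the test image), the arrays have shape (classes, height, width), and the final label at each pixel is read as the class with the largest summed probability. The names `softmax` and `tte_predict` are hypothetical.

```python
import numpy as np

def softmax(logits, axis=0):
    """Numerically stable softmax along the class axis."""
    z = logits - logits.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def tte_predict(logits_per_perturbation):
    """Sum class probabilities over all perturbed inputs, then pick the best class per pixel.

    logits_per_perturbation: iterable of arrays with shape (C, H, W).
    Returns an integer label map with shape (H, W).
    """
    prob_sum = sum(softmax(l, axis=0) for l in logits_per_perturbation)
    return prob_sum.argmax(axis=0)
```

Summing probabilities rather than hard labels lets a confident prediction on one perturbation outweigh several uncertain ones, which is one plausible reading of why the ensemble helps on difficult cases.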

3 Experiments

3.1 Dataset and Preprocessing

The spine MR dataset is from the MRSpineSeg Challenge [11], which provides 172 annotated T2-weighted MR volumetric images as the training dataset. We considered this dataset as the target domain. We focus on the segmentation of the lumbar spine in this paper. We first reoriented all the MR images and resampled them to a unified spacing of 1 mm. Then we cropped them to an in-plane size of 192 × 224. For each case, we chose 20 slices before and after the middle slice. We split the dataset into 120 cases for training and 52 cases for testing.

The spine CT dataset is considered as the source dataset, since it is easy to obtain annotations of the bony structures. It is an in-house dataset consisting of 90 spine CT scans with annotations. We first reoriented all the CT images and resampled them to isotropic 1 mm. Then they were cropped to an in-plane size of 192 × 224. For each case, we chose 20 slices before and after the middle slice.

3.2 Comparison and Ablation

For comparison, we first trained a model with only the source domain and directly tested it on the target domain to measure the domain discrepancy, named No Adaptation. We also trained a fully supervised model on the target domain, referred to as Fully Supervised. This can be considered as the upper bound. Furthermore, we compared our proposed method with two SOTA cross-domain segmentation methods, i.e., CyCMIS [7] and SIFA [8].

We designed and conducted four ablation studies to investigate the effectiveness of the proposed components, i.e., the semantics preservation regularization (SPR), the target diverse perturbation (TDP) and the test-time ensembling strategy (TTE). Our baseline model is the proposed framework after removing all three components, dubbed BaseModel. Then we gradually added these components to evaluate their efficacy.

3.3 Evaluation and Implementation

The performance of all comparison methods was evaluated on the same testing dataset with the Dice coefficient (Dice). All the experiments were conducted on a GTX 1080 Ti GPU with 12 GB memory.
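For reference, the Dice coefficient used for evaluation can be computed as in the short sketch below. It handles a single binary mask; how classes and cases were averaged in the study is not stated in the text, and the convention of returning 1.0 when both masks are empty is an assumption of this example.

```python
import numpy as np

def dice(pred, gt):
    """Dice = 2 |A ∩ B| / (|A| + |B|) for two binary masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    gt = np.asarray(gt, dtype=bool)
    denom = pred.sum() + gt.sum()
    if denom == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * np.logical_and(pred, gt).sum() / denom
```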

4 Results

4.1 Comparison Performance

Comparison results are shown in Table 1. The No Adaptation model completely failed to recognize semantic structures, with an average Dice of 0.12%, while the Fully Supervised model achieved an average Dice of 88.10%, which demonstrates the excessive domain shift between CT and MRI. CyCMIS achieved an average Dice of 76.67%, 2.29 percentage points higher than SIFA (74.38%), which implies that online data perturbation under consistency regularization of the labeled source domain is beneficial for model learning. Our method achieved an average Dice of 78.38%, which is 1.71 and 4.00 percentage points higher than CyCMIS and SIFA, respectively. The left part of Fig. 4 shows the qualitative comparison results with the ground truth labels overlaid.

Table 1. Comparison results on the test dataset.

| Method | Dice (%) |
|---|---|
| No adaptation | 0.12 ± 0.42 |
| Fully supervised | 88.10 ± 2.78 |
| SIFA [8] | 74.38 ± 4.81 |
| CyCMIS [7] | 76.67 ± 2.44 |
| Ours | 78.38 ± 2.03 |

4.2 Ablation Studies

Effectiveness of Semantics Preservation Regularization: Table 2 shows the results of the ablation studies. The right part of Fig. 4 qualitatively shows the capability of the proposed SPR. The model without SPR over-adapted to the source domain and ignored the original semantic structure characteristics during image translation, especially on the spinous process. Such deformation will generate inaccurate pseudo labels for the target domain. However, by adding our proposed semantics preservation loss, the spinous process structures are well preserved. As shown in Table 2, the BaseModel achieved an average Dice coefficient of 76.13%. By adding SPR, the Dice coefficient is improved to 76.89%.

Fig. 4. Left: Qualitative comparison results. Right: Comparison visualization of image translation w/o and w/ the proposed SPR

Table 2. Results of the ablation studies.

| BaseModel | SPR | TDP | TTE | Dice (%) |
|---|---|---|---|---|
| ✓ | | | | 76.13 ± 2.16 |
| ✓ | ✓ | | | 76.89 ± 2.68 |
| ✓ | ✓ | ✓ | | 77.45 ± 2.22 |
| ✓ | ✓ | ✓ | ✓ | 78.38 ± 2.03 |

Effectiveness of Target Diverse Perturbation: With the proposed TDP, the Dice coefficient is further improved to 77.45%. This illustrates that the proposed TDP can improve the robustness and capacity of the model. The right part of Fig. 2 shows examples of diverse perturbation.

Effectiveness of Test-time Ensembling: The test-time ensembling strategy provides more confident and accurate prediction results. The Dice coefficient is increased by nearly one percentage point, from 77.45% to 78.38%, as shown in Table 2.

5 Conclusion

In this paper, we proposed an end-to-end cross-domain segmentation model based on semantics-preserved image translation and target diverse perturbation for effective learning of the target domain. We additionally introduced a test-time ensembling strategy to further improve the capacity of the model to handle difficult and out-of-distribution cases. The experimental results demonstrated that the proposed method achieved better results than the SOTA methods.


Acknowledgment. This study was partially supported by the National Natural Science Foundation of China via project U20A20199 and by Shanghai Municipal Science and Technology Commission via Project 20511105205.

References

1. Liu, X., et al.: A review of deep-learning-based medical image segmentation methods. Sustainability 13(3), 1224 (2021)
2. Wilson, G., Cook, D.J.: A survey of unsupervised deep domain adaptation. ACM Trans. Intell. Syst. Technol. (TIST) 11(5), 1–46 (2020)
3. Wilson, G., Cook, D.J.: A survey of unsupervised deep domain adaptation. ACM TIST 11(5), 1–46 (2020)
4. Liu, X., Yoo, C., et al.: Deep unsupervised domain adaptation: a review of recent advances and perspectives. APSIPA Trans. Inf. Process. 11(1) (2022)
5. Wu, F., Zhuang, X.: CF distance: a new domain discrepancy metric and application to explicit domain adaptation for cross-modality cardiac image segmentation. IEEE Trans. Med. Imaging 39(12), 4274–4285 (2020)
6. Li, Y., et al.: Bidirectional learning for domain adaptation of semantic segmentation. CVPR, pp. 6936–6945 (2019)
7. Wang, R., Zheng, G.: Cycmis: cycle-consistent cross-domain medical image segmentation via diverse image augmentation. Med. Image Anal. 76, 102328 (2022)
8. Chen, C., Dou, Q., et al.: Unsupervised bidirectional cross-modality adaptation via deeply synergistic image and feature alignment for medical image segmentation. TMI 39(7), 2494–2505 (2020)
9. Huang, X., et al.: Multimodal unsupervised image-to-image translation. In: ECCV, pp. 172–189 (2018)
10. Diakogiannis, F.I., et al.: ResUNet-a: a deep learning framework for semantic segmentation of remotely sensed data. ISPRS J. Photogramm. Remote. Sens. 162(2020), 94–114 (2020)
11. Pang, S., Pang, C., et al.: Spineparsenet: spine parsing for volumetric MR image by a two-stage segmentation framework with semantic image representation. TMI 40(1), 262–273 (2020)

Biomechanics

A New Mathematical Model for Assessment of Bleeding and Thrombotic Risk in Three Different Types of Clinical Ventricular Assist Devices

Yuan Li and Zengsheng Chen (corresponding author)

Key Laboratory of Biomechanics and Mechanobiology (Beihang University), Ministry of Education, Beijing Advanced Innovation Center for Biomedical Engineering, School of Biological Science and Medical Engineering, Beihang University, Beijing 100083, China
[email protected]

Abstract. In this study, our new shear-induced thrombosis and bleeding mathematical model, which integrates variations in platelet and vWF morphology, function, and binding ability, was utilized to identify regions at high risk of induced blood damage and to assess the bleeding and thrombosis risk in three different types of clinical ventricular assist devices (VADs): HVAD, CentriMag and HeartMate II. It was found that extracorporeal VADs require a higher pressure head than intracorporeal VADs to overcome body and pipeline resistance. Centrifugal VADs have a better work ability than axial VADs. Extracorporeal VADs cause more blood damage than intracorporeal VADs. For bleeding probability, CentriMag > HVAD > HeartMate II. Narrow regions of the VAD (such as the hydrodynamic clearance in HVAD, the side clearance in CentriMag and the blade tip clearance in HeartMate II) contribute significantly to device-induced bleeding. For thrombotic potential, CentriMag > HeartMate II > HVAD. The distribution of regions of high thrombotic potential in VADs is similar to the distribution of long residence time regions (such as the clearance between rotor and guide cone in HVAD, the back clearance and impeller eye in CentriMag, and the straightener and rotor inlet in HeartMate II). Flow separation regions in VADs, which combine long residence time with high shear stress, also contributed to the risk of bleeding and thrombosis. Further analysis found that the hemocompatibility of VADs was strongly negatively correlated with efficiency (r < −0.80). In summary, CentriMag > HVAD > HeartMate II in terms of bleeding probability and CentriMag > HeartMate II > HVAD in terms of thrombosis potential; narrow regions and flow separation regions in VADs contribute to blood damage; and the efficiency of VADs is highly negatively correlated with hemocompatibility.

Keywords: Ventricular assist devices · Bleeding · Thrombosis · Shear stress · Efficiency

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024 G. Wang et al. (Eds.): APCMBE 2023, IFMBE Proceedings 104, pp. 139–152, 2024. https://doi.org/10.1007/978-3-031-51485-2_17


1 Introduction

Heart failure (HF) has high morbidity and mortality, with a mortality rate of more than 40% within 5 years [1, 2]. To address this problem, ventricular assist devices (VADs) have been proposed and have evolved over time to become the standard clinical therapy for the treatment of HF [3]. The continuous technological advances in VADs and the constant updating of VAD types have led to a one-year survival rate of approximately 90% for patients using VADs [1, 2, 4], which is close to the survival rate of heart transplant recipients [5]. However, the complex geometry and mechanical rotation of VADs lead to non-physiological flow patterns such as high shear stress and flow stagnation [3, 6, 7]. These non-physiological hemodynamic phenomena are often associated with serious complications, such as hemolysis [8, 9], thrombosis [10], bleeding [11, 12], and inflammation [13, 14], which seriously affect clinical outcomes and patient recovery. In order to develop high-performance, low-blood-damage VADs, it is essential to evaluate existing VADs to understand their differences in hemocompatibility and to identify the high-risk regions that lead to blood damage. Owing to their reliable predictive and detailed analytical abilities, computational fluid dynamics (CFD) methods have enabled researchers to evaluate and improve existing VADs or to design new VADs, alleviating the cost of multiple experiments [15]. The hemocompatibility evaluation of VADs is achieved through blood damage models. Most of the current blood damage models associated with VADs are implemented by shear stress thresholds or hemolysis models.
However, blood damage is a complex biochemical process involving functional variations of multiple cells and proteins, and a single threshold shear stress can only partially describe the blood damage caused by VADs, so the conclusions obtained from these threshold models may deviate from experimental or clinical results [16]. On this basis, more accurate mathematical models that can describe the biochemical processes need to be constructed. The most commonly constructed blood damage model is the model of red blood cell damage (hemolysis) based on shear stress and residence time, which is usually in the form of a power-law function; the hemolysis models constructed by different researchers differ in their coefficients. The predictions of hemolysis models correlate well with experimental results [17–20], and these models are commonly used for blood damage assessment and design optimization of VADs, which has improved the hemolysis performance of VADs [7, 21, 22]. Bleeding and thrombosis are more serious clinical complications than hemolysis, yet the thrombotic risk models constructed so far are mostly applicable to microchannels or stenotic vessels and are rarely applicable to predicting thrombosis in VADs [10, 23–26]. Moreover, there are no mathematical models that can predict the bleeding risk induced by VADs. In the present study, our new shear-induced thrombosis and bleeding numerical model was utilized to identify regions that lead to a high risk of blood damage and to predict the thrombosis and bleeding risk in three different types of clinical VADs. Our model considers the contribution of platelet activation, platelet receptor shedding, vWF unfolding and HMWM-vWF cleavage, and variations in platelet-vWF binding ability to the risk of bleeding and thrombosis.
Representative second and third generation, intracorporeal and extracorporeal VADs (HVAD, HeartMate II and CentriMag) were employed to compare their hemodynamic performance and hemocompatibility, and to identify

A New Mathematical Model for Assessment of Bleeding

regions within the pump that are at high risk of causing blood damage. These VADs differ in support mode (intracorporeal long-term support: HVAD and HeartMate II; extracorporeal short-term support: CentriMag), rotor type (centrifugal: HVAD and CentriMag; axial: HeartMate II), and levitation mode (full magnetic levitation: CentriMag; mechanical levitation: HeartMate II; hydrodynamic levitation: HVAD). Analyzing flow conditions and blood damage, comparing the blood damage of the different types of VADs, and elucidating the reasons behind the differences helps to better understand VAD-induced blood damage and supports clinical VAD selection as well as VAD design and optimization.

2 Materials and Methods

2.1 Studied Models

Three-dimensional geometric models of three clinical VADs were obtained by inverse modeling. HVAD (Medtronic/HeartWare, Framingham, MA, USA) is a hydrodynamically levitated centrifugal VAD for intracorporeal long-term support. Its rotor has four thick blades and is kept suspended in the casing by a very narrow hydrodynamic clearance (Fig. 1a). CentriMag (Abbott, Thoratec, Pleasanton, CA, USA) is a fully magnetically levitated centrifugal VAD for extracorporeal short-term support. Its rotor has four main blades and four splitters; magnetic levitation keeps the rotor stable in the axial and radial directions and drives its rotation (Fig. 1b). HeartMate II (Abbott, Pleasanton, CA, USA) is a mechanically levitated axial VAD for intracorporeal long-term support. It consists of a rotor with two blades, a straightener and a diffuser (Fig. 1c). All of these pumps are representative products that are commonly used in clinical practice. Comparing these three blood pumps, which differ in support mode (intracorporeal long-term support: HVAD and HeartMate II; extracorporeal short-term support: CentriMag), rotor type (centrifugal: HVAD and CentriMag; axial: HeartMate II), and levitation mode (full magnetic levitation: CentriMag; mechanical levitation: HeartMate II; hydrodynamic levitation: HVAD), is of interest for exploring VAD-induced complications and for guiding newer iterations of VADs. The comparison clarifies the hemocompatibility differences between the different types of VADs and summarizes the high-risk regions for VAD-induced blood damage. The typical clinical working conditions of these three VADs in heart failure (HF) support mode are shown in Table 1.
2.2 Hydraulic Performance Prediction

The pressure head is the useful work output of the blood pump and is defined as follows:

\[ P = P_{\mathrm{outlet}} - P_{\mathrm{inlet}} \]

where P_inlet and P_outlet are the total pressures at the inlet and outlet of the blood pump, respectively.

Y. Li and Z. Chen

Fig. 1. Study models: a HVAD, b CentriMag, c HeartMate II.

Table 1. Flow rate and pressure head that the VADs need to deliver in heart failure support mode

VAD name        Flow rate [L/min]    Pressure head [mmHg]
HVAD            4.5                  75
CentriMag       4.5                  150
HeartMate II    4.5                  75

Efficiency is defined as the ratio of the useful work output of the blood pump to the total work input to the blood pump:

\[ \eta\,[\%] = \frac{Q \times P}{T \times \omega} \times 100 \]

where Q is the blood pump flow rate, P is the pressure head output by the blood pump, T is the torque, and ω is the angular velocity of the rotor.

2.3 Shear Stress and Residence Time Prediction

The shear stress generated by the blood pump depends on the velocity gradient and viscosity and is calculated as follows [27]:

\[ \tau = \left[ \frac{1}{6} \sum \left( \sigma_{ii} - \sigma_{jj} \right)^{2} + \sum \sigma_{ij}\sigma_{ij} \right]^{1/2} \qquad (4) \]

where σ is the shear stress tensor, which is calculated by multiplying the shear rate tensor with blood viscosity.
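As an illustration of Eq. (4), the following minimal sketch evaluates the scalar shear stress from a velocity-gradient tensor. It assumes that the first sum runs over the three distinct pairs of normal stresses and the second over the three distinct off-diagonal components, and that the stress tensor is the Newtonian viscous stress; the function names and the sample velocity gradient are hypothetical and are not taken from the study.

```python
import math

MU_BLOOD = 0.0035  # Pa*s, blood viscosity used in the CFD setup


def viscous_stress(grad_u, mu=MU_BLOOD):
    """Newtonian viscous stress sigma_ij = mu * (du_i/dx_j + du_j/dx_i), in Pa."""
    return [[mu * (grad_u[i][j] + grad_u[j][i]) for j in range(3)]
            for i in range(3)]


def scalar_shear_stress(sigma):
    """Scalar shear stress of Eq. (4) for a symmetric 3x3 stress tensor (Pa).

    Assumption: sums run over the three distinct index pairs.
    """
    normal = ((sigma[0][0] - sigma[1][1]) ** 2
              + (sigma[1][1] - sigma[2][2]) ** 2
              + (sigma[2][2] - sigma[0][0]) ** 2)
    shear = sigma[0][1] ** 2 + sigma[1][2] ** 2 + sigma[2][0] ** 2
    return math.sqrt(normal / 6.0 + shear)


# Hypothetical simple shear flow: du_x/dy = 1.0e4 1/s, all other gradients zero.
grad_u = [[0.0, 1.0e4, 0.0],
          [0.0, 0.0, 0.0],
          [0.0, 0.0, 0.0]]
tau = scalar_shear_stress(viscous_stress(grad_u))
print(f"scalar shear stress: {tau:.2f} Pa")
```

For simple shear the scalar stress reduces to viscosity times shear rate, which gives a quick sanity check on the summation convention assumed here.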

Residence time can be used to describe the flow and recirculation of cells or particles in the blood. The source term of residence time is defined as [28]:

\[ S_{RT} = 1 \qquad (5) \]
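For orientation, a passive-scalar transport equation with a unit source, in the form commonly used for residence time, can be written as below. This is a sketch that assumes constant density; the velocity symbol \mathbf{u} and the exact form are assumptions and are not quoted from the study, while D_{RT} denotes the diffusion coefficient of the residence-time scalar.

```latex
\[
\frac{\partial RT}{\partial t} + \mathbf{u} \cdot \nabla RT
  = D_{RT}\, \nabla^{2} RT + S_{RT},
\qquad S_{RT} = 1
\]
```

With a unit source, the scalar RT grows by one second for every second a fluid parcel spends in the domain, so it accumulates in stagnant and recirculating regions.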

Residence time is solved with a transport equation that uses this source term and a diffusion coefficient D_RT of 1.14 × 10^-11 m^2/s [28].

2.4 Bleeding and Thrombosis Prediction Model Building

The device-induced risk of bleeding and thrombosis is reflected by the variation in platelet-vWF binding ability. To facilitate the construction of the mathematical model, the platelet-vWF binding process is simplified into four steps: (1) In the absence of shear stress, platelets are not activated (resting platelets) and vWF is not unfolded (globular vWF) (Fig. 2a). (2) Under the influence of shear stress, globular vWF unfolds, exposing the platelet binding site and the enzymatic cleavage site. A portion of the resting platelets become activated, and activated platelets release stimulants that further activate resting platelets (Fig. 2b). (3) Regardless of platelet activation, the GPIbα receptors on the platelet surface can bind to the A1 site of vWF. When platelets are activated, GPIIb/IIIa receptors on the platelet surface additionally bind to the C1-6 site of vWF (Fig. 2c, d). (4) Under shear, receptors on the platelet surface are shed, making the platelets less able to adhere to vWF (Fig. 2c, d). At the same time, the cleavage enzyme ADAMTS13 acts on the unfolded vWF at the A2 cleavage site, degrading HMWM-vWF and weakening the ability of vWF to bind platelets (Fig. 2b). The difference between the true binding ability of platelet-vWF (accounting for platelet receptor shedding and HMWM-vWF degradation) and the ideal binding ability (assuming no platelet receptor shedding and no HMWM-vWF degradation) was used to describe the bleeding risk. The true binding ability between platelets and vWF was used to assess thrombotic potential.
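The two risk measures defined at the end of this section can be sketched as follows. This is only a minimal illustration of the definitions: the binding abilities are treated as given scalar values, the function names and numbers are hypothetical, and the underlying platelet and vWF sub-models (described in [29]) are not reproduced here.

```python
def bleeding_risk(ideal_binding, actual_binding):
    """Bleeding risk as the loss of platelet-vWF binding ability.

    ideal_binding: binding ability without receptor shedding or HMWM-vWF degradation.
    actual_binding: binding ability with shedding and degradation included.
    """
    return ideal_binding - actual_binding


def thrombotic_potential(actual_binding):
    """Thrombotic potential, assessed from the actual binding ability itself."""
    return actual_binding


# Hypothetical normalized binding abilities at a pump outlet (not study data).
ideal = 1.0
actual = 0.7
print("bleeding risk:", bleeding_risk(ideal, actual))
print("thrombotic potential:", thrombotic_potential(actual))
```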
The above process is formulated as functions and embedded in the commercial software ANSYS CFX, where it is solved with transport equations. A more detailed description can be found in our previous study [29].

2.5 Mesh Details and Sensitivity Analysis

The tetrahedral mesh was generated with the commercial software ANSYS Meshing (ANSYS Inc., Canonsburg, PA, USA). A six-layer prismatic mesh was added in the near-wall region to capture the near-wall flow and to meet the requirements of the turbulence model. The mesh sizes for the three studied VADs are 11.2 million (HVAD), 12 million (CentriMag) and 3.6 million (HeartMate II) elements. Mesh-independence validation was performed; the details can be found in our previous study [10].

Fig. 2. Platelet-vWF adhesion processes: a resting platelets and globular vWF; b platelet activation, vWF unfolding, platelet-vWF adhesion and HMWM-vWF degradation; c platelet GPIbα receptor binding to the A1 site of vWF and shedding; d platelet GPIIb/IIIa receptor binding to the C1-6 site of vWF and shedding.

2.6 CFD Methods

The density and viscosity of blood were defined in the commercial software ANSYS CFX as 1055 kg/m^3 and 0.0035 Pa s, respectively [10, 21]. For all VADs, the convective terms were solved with the high-resolution scheme and the SST k-ω turbulence model was employed. A mass flow rate boundary condition was applied at the inlet and a pressure boundary condition (0 mmHg) at the outlet. The rotating speed of HVAD, CentriMag and HeartMate II was adjusted until each pump achieved its target pressure head of 75 mmHg (intracorporeal VADs) or 150 mmHg (extracorporeal VAD).
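The efficiency definition in Sect. 2.2 can be evaluated at an operating point such as those in Table 1 once the units are made consistent. The sketch below is illustrative only: the torque and rotating speed are hypothetical placeholders rather than values from the study, and the function names are assumptions.

```python
import math

MMHG_TO_PA = 133.322          # 1 mmHg in Pa
LPM_TO_M3S = 1.0e-3 / 60.0    # 1 L/min in m^3/s


def hydraulic_power_w(flow_lpm, head_mmhg):
    """Useful hydraulic power Q * P in watts."""
    return (flow_lpm * LPM_TO_M3S) * (head_mmhg * MMHG_TO_PA)


def efficiency_percent(flow_lpm, head_mmhg, torque_nm, speed_rpm):
    """Efficiency = Q * P / (T * omega) * 100, with omega in rad/s."""
    omega = speed_rpm * 2.0 * math.pi / 60.0
    return hydraulic_power_w(flow_lpm, head_mmhg) / (torque_nm * omega) * 100.0


# Operating point from Table 1 (4.5 L/min, 75 mmHg);
# torque and speed below are hypothetical, for illustration only.
eff = efficiency_percent(flow_lpm=4.5, head_mmhg=75.0,
                         torque_nm=0.006, speed_rpm=3000.0)
print(f"hydraulic power: {hydraulic_power_w(4.5, 75.0):.3f} W")
print(f"efficiency: {eff:.1f} %")
```

Converting flow rate and pressure head to SI units first keeps the ratio dimensionless; in a CFD post-processing step the torque would typically be integrated over the rotor surfaces.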

3 Results

3.1 Hydraulic Performances and Flow Field

Among the three VADs, HVAD and HeartMate II are intracorporeal support VADs, so they need to provide a pressure head of 75 mmHg to meet the requirements (Table 1). The CentriMag, as an extracorporeal support VAD, needs to overcome more resistance from the body and pipeline, so it needs to provide a pressure head of 150 mmHg to meet the support requirements (Table 1). HVAD and CentriMag, as centrifugal VADs, require a significantly lower speed to achieve the target pressure head than the axial blood pump HeartMate II (Fig. 3a). This is due to the larger rotor diameter of the centrifugal VADs, which gives them a greater ability to do work. The HVAD and HeartMate II have similar efficiencies, both higher than that of the CentriMag (Fig. 3b), which means that the flow losses in the CentriMag are greater. Flow details inside the blood pumps are shown to further localize the regions causing flow losses. For HVAD, flow separation develops in the impeller passages (Fig. 3c). For CentriMag, flow separation is mainly found in the secondary flow passages (such as the blade tip clearance and the impeller eye), and flow separation in the impeller

passages and diffuser pipe regions is also identified (Fig. 3d). For HeartMate II, flow separation was evident in the straightener and rotor inlet regions, and flow separation was also developed in the region near the diffuser domain (Fig. 3e).

Fig. 3. Hydraulic performance and flow fields of three VADs: a comparison of rotating speeds required to achieve the target pressure head (compared with experiments [16, 17]); b comparison of efficiency; c flow fields of HVAD; d flow fields of CentriMag; e flow fields of HeartMate II.

3.2 Shear Stress and Residence Time

Shear stress is an important cause of blood damage [3, 17]. The distribution of shear stress in the three VADs is shown in Fig. 4. The high shear stress region of HVAD was mainly found in the blade tip clearance and blade back clearance (Fig. 4a). This is because HVAD is a hydrodynamically levitated VAD, which has a very narrow clearance between the rotor and the casing. Some high shear stress regions are also present at the trailing edge of the blades of HVAD (Fig. 4a). In HVAD, the interface between the flow separation region and the main flow region also produces high shear stress because of the large velocity gradient, for example in the impeller passage regions (Fig. 4a). The high shear stress regions of CentriMag are mainly found at the trailing edge of the blade and in the side clearance (Fig. 4b). The high shear stress in the side clearance arises because the high-velocity blood accelerated by the blade enters the

narrow clearance, resulting in an increased velocity gradient. In addition, high shear stress in CentriMag is also present in the regions where the flow separation meets the main flow, such as the blade tip clearance, impeller eye, impeller passages and diffuser pipe (Fig. 4b). High shear stress in HeartMate II is mainly identified in the blade tip clearance (Fig. 4c), because of the narrow clearance and high blood velocity in this region. Regions of high shear stress were also found in the diffuser region (Fig. 4c). This is caused by high-velocity blood entering the diffuser without being fully decelerated and pressurized. In addition, high shear stress is also found in the regions where the flow separation meets the main flow, such as the straightener and rotor inlet areas and the diffuser region (Fig. 4c). Comparing the high shear stress volumes of the three VADs shows that, for shear stress greater than 50 and 100 Pa, the volume is largest in CentriMag, followed by HVAD, and lowest in HeartMate II (Fig. 4). This correlates with the highest pressure head output of CentriMag (Table 1). For volumes of shear stress greater than 150 Pa, HVAD > CentriMag ≈ HeartMate II. This correlates with the narrow hydrodynamic clearance of HVAD (Fig. 4).
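The volume comparison above can be reproduced from exported cell data by summing the volumes of all mesh cells whose scalar shear stress exceeds each threshold. The sketch below assumes per-cell volumes and stresses are available as plain lists; the function name and the sample values are hypothetical and are not data from the study.

```python
def volume_above(cell_volumes, cell_stress, threshold_pa):
    """Total volume of cells whose scalar shear stress exceeds threshold_pa."""
    return sum(v for v, s in zip(cell_volumes, cell_stress) if s > threshold_pa)


# Hypothetical per-cell data: volumes in mm^3 and scalar shear stress in Pa.
cell_volumes_mm3 = [2.0, 1.0, 0.5, 0.25]
cell_stress_pa = [20.0, 60.0, 120.0, 200.0]

# Thresholds used in the comparison: 50, 100 and 150 Pa.
for threshold in (50.0, 100.0, 150.0):
    vol = volume_above(cell_volumes_mm3, cell_stress_pa, threshold)
    print(f"volume with shear stress > {threshold:.0f} Pa: {vol} mm^3")
```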

Fig. 4. Shear stress distribution of three VADs: a HVAD; b CentriMag; c HeartMate II

Residence time is another important indicator of blood damage [10]. The long residence time region of the HVAD is located in the clearance between the rotor and the guide cone (Fig. 5a). The long residence time in this region arises because the thick blades of the HVAD impede the flow of blood, causing it to recirculate constantly in this region (Fig. 3c). Secondly, the flow separation region in the impeller flow passage of HVAD also has a longer residence time due to blood stagnation (Fig. 5a). The long residence time region of CentriMag was observed in the back clearance (Fig. 5b), owing to the slower blood flow velocity there (Fig. 3d). Secondly,

flow separation regions such as the impeller eye, the blade tip clearance, the diffuser pipe and the impeller passage were also identified as some long residence time regions (Fig. 5b). The long residence time regions of the HeartMate II were closely related to the expected flow separation regions (Fig. 5c). The most significant residence time is found in the straightener and rotor inlet, followed by the region near the diffuser. Overall, HeartMate II exceeded the theoretical passage time (0.6 s) by the highest percentage, followed by CentriMag, and HVAD by the lowest (Fig. 5).

Fig. 5. Residence time distribution of three VADs: a HVAD; b CentriMag; c HeartMate II

3.3 Bleeding Probability

The bleeding probability caused by the three VADs was evaluated, with CentriMag > HVAD > HeartMate II (Fig. 6a). The bleeding probability of HVAD and HeartMate II was consistent with the clinical statistics based on 8000 cases [30]. The regions of HVAD that contributed most to bleeding were the blade tip clearance and blade back clearance, owing to the high shear stress in these regions (Fig. 6b). In addition, the impeller passage of HVAD also contributed to bleeding because shear stress and residence time are superimposed there (Fig. 6b). The side clearance and blade trailing edge of CentriMag contributed strongly to bleeding because of high shear stress (Fig. 6c). The regions of CentriMag where residence time and shear stress are superimposed (such as the impeller eye and blade tip clearance) were also observed to contribute to bleeding (Fig. 6c). The regions of HeartMate II that contributed most to bleeding were the blade tip clearance and diffuser regions, because of the high shear stress in these regions (Fig. 6d). The regions where residence time and shear stress are superimposed (rotor inlet and diffuser) also contributed to bleeding (Fig. 6d).

Fig. 6. Comparison of bleeding risk for three VADs: a bleeding probability for the three VADs (compared with clinical statistics [30]); b regional distribution of the contribution to bleeding within HVAD; c regional distribution of the contribution to bleeding within CentriMag; d regional distribution of the contribution to bleeding within HeartMate II

3.4 Thrombotic Potential

The thrombotic potential of the three VADs was evaluated, with CentriMag > HeartMate II > HVAD (Fig. 7a). The thrombotic potential of HVAD and HeartMate II was consistent with clinical statistics based on 2000 cases [31]. The clearance between the rotor and guide cone and the impeller passage of HVAD were identified as high thrombosis regions (Fig. 7b) because of the long residence time. Thrombotic potential was also observed in the blade tip clearance of the HVAD (Fig. 7b), where high shear stress leads to the activation of platelets. The regions of high thrombotic potential in CentriMag coincide with its long residence time regions or with regions where residence time and shear stress are superimposed, such as the back clearance, blade tip clearance, impeller eye, impeller passages and diffuser pipe (Fig. 7c). The regions of high thrombotic potential in HeartMate II were the straightener and rotor inlet (Fig. 7d). Thrombotic potential was also identified in the diffuser and the rotor region (Fig. 7d). These are regions of long residence time or regions where residence time and shear stress are superimposed. The predicted regions of high thrombotic potential were consistent with those observed in experiments and in the clinic [32–34].

Fig. 7. Comparison of thrombotic risk for three VADs: a thrombotic potential for the three VADs (compared with clinical statistics [31]); b regional distribution of the contribution to thrombosis within HVAD (comparison with experimental/clinical statistical regions [32]); c regional distribution of the contribution to thrombosis within CentriMag; d regional distribution of the contribution to thrombosis within HeartMate II (comparison with experimental/clinical statistical regions [33, 34])

4 Discussion

Bleeding and thrombosis are two of the most important clinical complications. Numerical models that evaluate the risk of bleeding and thrombosis and identify the regions contributing to these two complications are still lacking. In the present study, a VAD-related thrombosis and bleeding numerical model was used to identify regions that lead to a high risk of blood damage and to predict the risk of thrombosis and bleeding in three different types of clinical VADs. The three representative clinical VADs are HVAD (intracorporeal long-term support, hydrodynamically levitated centrifugal blood pump), CentriMag (extracorporeal short-term support, fully magnetically levitated centrifugal blood pump) and HeartMate II (intracorporeal long-term support, mechanically levitated axial blood

pump). Mathematical models integrating platelet activation, platelet receptor shedding, vWF unfolding, HMWM-vWF degradation and platelet-vWF adhesion ability were constructed to assess the risk of bleeding and thrombosis induced by VADs. The ranking of bleeding and thrombosis risk predicted by these models for the different VADs was consistent with the clinical statistics. Further, the regions of high thrombotic risk predicted by the mathematical models were consistent with experimental or clinical results. These results demonstrate the reliability of the constructed mathematical models for assessing bleeding and thrombotic risk and the reliability of the conclusions of this study. This study found that extracorporeal VADs require a higher pressure head than intracorporeal VADs to overcome the resistance of the body and the pipeline. Centrifugal VADs have a better ability to do work than axial VADs. The hemocompatibility of the extracorporeal VAD is lower than that of the intracorporeal VADs. CentriMag has the highest bleeding probability, followed by HVAD, and HeartMate II has the lowest. The regions where VADs contribute to bleeding are similar to the high shear stress regions. Narrow regions in VADs (such as the hydrodynamic clearance in HVAD, the side clearance in CentriMag and the blade tip clearance in HeartMate II) contribute significantly to device-induced bleeding. The interface between flow separation and main flow in the VADs also contributes to bleeding. CentriMag had the highest thrombotic potential, followed by HeartMate II, and HVAD had the lowest. The regions of high thrombotic potential in VADs are similar to the long residence time regions (such as the clearance between rotor and guide cone in HVAD, the back clearance and impeller eye in CentriMag, and the straightener and rotor inlet in HeartMate II). 
Flow separation in VADs contributes strongly to thrombotic potential. The regions where residence time and shear stress are superimposed also contribute to the thrombotic potential. Further analysis showed that the risk of blood damage in VADs was strongly negatively correlated with efficiency, with a correlation of r = −0.80 between efficiency and bleeding probability and r = −0.93 between efficiency and thrombotic potential. This study indicates that narrow regions should be avoided as much as possible in VADs to prevent high shear stress and blood damage. VAD design should also pay attention to flow separation regions. On the one hand, flow separation regions in VADs can lead to blood stagnation and increase thrombotic potential. On the other hand, flow separation regions can disturb the flow field, leading to flow losses and increased shear stress. Flow separation also decreases the efficiency of VADs, so that they require a higher rotating speed to output the target pressure head, which increases the shear stress generated by the mechanical motion of the rotor. This study also has some limitations. First, only three VADs were evaluated, and more VADs need to be included to further refine the conclusions. Secondly, for the hydrodynamically and fully magnetically levitated VADs, the axial position of the rotor in the casing varies with the rotating speed and flow rate, but this factor was not considered in this study. Finally, although some experimental or clinical findings support the conclusions of this study, further validation is needed.
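The correlation coefficients quoted in this discussion are of the Pearson type in the sketch below, which is an assumption, since the study does not state which coefficient was used. The function name and the sample numbers are hypothetical placeholders and are not the efficiencies or risk values of the three pumps.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Hypothetical values for three devices (placeholders, not study results).
efficiency_percent = [50.0, 45.0, 30.0]
bleeding_probability = [0.25, 0.20, 0.50]

r = pearson_r(efficiency_percent, bleeding_probability)
print(f"r = {r:.2f}")
```

With three devices the coefficient is computed from three points, which is consistent with the limitation noted above that more VADs are needed to refine the conclusions.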

5 Conclusions

In terms of bleeding probability, CentriMag > HVAD > HeartMate II; in terms of thrombotic potential, CentriMag > HeartMate II > HVAD. Narrow regions and flow separation regions in VADs contribute to blood damage. The efficiency of VADs is highly negatively correlated with the risk of blood damage.

Acknowledgment. This work was supported by the National Key R&D Program of China (Grant no. 2020YFC0862900, 2020YFC0862902, 2020YFC0862904 and 2020YFC0122203), the Beijing Municipal Science and Technology Project (Grant no. Z201100007920003), the Fundamental Research Funds for the Central Universities, and the National Natural Science Foundation of China (Grant no. 32071311).

References

1. Benjamin, E.J., et al.: Heart disease and stroke statistics-2017 update: a report from the American heart association. Circulation 135(10), e146–e603 (2017)
2. Slaughter, M.S., et al.: Advanced heart failure treated with continuous-flow left ventricular assist device. N. Engl. J. Med. 361(23), 2241–2251 (2009)
3. Xi, Y., et al.: The impact of ECMO lower limb cannulation on the aortic flow features under differential blood perfusion conditions. Med. Novel Technol. Dev. 16 (2022)
4. Najjar, S.S., et al.: An analysis of pump thrombus events in patients in the HeartWare Advance bridge to transplant and continued access protocol trial. J. Heart Lung. Transp. 33(1), 23–34 (2014)
5. Starling, R.C., et al.: Results of the post-U.S. Food and Drug Administration-approval study with a continuous flow left ventricular assist device as a bridge to heart transplantation: a prospective study using the INTERMACS (Interagency Registry for Mechanically Assisted Circulatory Support). J. Am. Coll. Cardiol. 57(19), 1890–1898 (2011)
6. Chen, Z., et al.: Flow features and device-induced blood trauma in CF-VADs under a pulsatile blood flow condition: a CFD comparative study. Int. J. Numer Method Biomed. Eng. 34(2) (2018)
7. Wu, P., et al.: On the optimization of a centrifugal maglev blood pump through design variations. Front. Physiol. 12, 699891 (2021)
8. Lansink-Hartgring, A.O., et al.: Changes in red blood cell properties and platelet function during extracorporeal membrane oxygenation. J. Clin. Med. 9(4) (2020)
9. Chandler, W.L.: Platelet, red cell, and endothelial activation and injury during extracorporeal membrane oxygenation. ASAIO J. 67(8), 935–942 (2021)
10. Li, Y., et al.: The impact of a new arterial intravascular pump on aorta hemodynamic surrounding: a numerical study. Bioengineering (Basel) 9(10) (2022)
11. Valladolid, C., Yee, A., Cruz, M.A.: Von Willebrand factor, free Hemoglobin and Thrombosis in ECMO. Front. Med. (Lausanne) 5, 228 (2018)
12. Mazzeffi, M., et al.: Von Willebrand factor concentrate administration for acquired Von Willebrand syndrome-related bleeding during adult extracorporeal membrane oxygenation. J. Cardiothorac. Vasc. Anesth. 35(3), 882–887 (2021)
13. Meyer, A.D., et al.: Effect of blood flow on platelets, leukocytes, and extracellular vesicles in thrombosis of simulated neonatal extracorporeal circulation. J. Thromb. Haemost. 18(2), 399–410 (2020)
14. Ki, K.K., et al.: Current understanding of leukocyte phenotypic and functional modulation during extracorporeal membrane oxygenation: a narrative review. Front. Immunol. 11, 600684 (2020)
15. Birschmann, I., et al.: Ambient hemolysis and activation of coagulation is different between HeartMate II and HeartWare left ventricular assist devices. J. Heart Lung. Transp. 33(1), 80–87 (2014)
16. Thamsen, B., et al.: Numerical analysis of blood damage potential of the HeartMate II and HeartWare HVAD rotary blood pumps. Artif. Organs. 39(8), 651–659 (2015)
17. Fraser, K.H., et al.: A quantitative comparison of mechanical blood damage parameters in rotary ventricular assist devices: shear stress, exposure time and hemolysis index. J. Biomech. Eng. 134(8), 081002 (2012)
18. Heuser, G., Opitz, R.: A Couette viscometer for short time shearing of blood. Biorheology 17(1–2), 17–24 (1980)
19. Zhang, T., et al.: Study of flow-induced hemolysis using novel Couette-type blood-shearing devices. Artif. Organs. 35(12), 1180–1186 (2011)
20. Song, X., et al.: Computational fluid dynamics prediction of blood damage in a centrifugal pump. Artif. Organs. 27(10), 938–941 (2003)
21. Li, Y., et al.: Investigation of the influence of blade configuration on the hemodynamic performance and blood damage of the centrifugal blood pump. Artif. Organs. 46(9), 1817–1832 (2022)
22. Li, Y., et al.: A new way to evaluate thrombotic risk in failure heart and ventricular assist devices. Med. Novel Technol. Dev. 16 (2022)
23. Sorensen, E.N., et al.: Computational simulation of platelet deposition and activation: II. Results for Poiseuille flow over collagen. Ann. Biomed. Eng. 27(4), 449–458 (1999)
24. Wu, W.T., et al.: Multi-constituent simulation of thrombus deposition. Sci. Rep. 7, 42720 (2017)
25. Wu, W.T., et al.: Simulation of thrombosis in a stenotic microchannel: The effects of vWF-enhanced shear activation of platelets. Int. J. Eng. Sci. 147 (2020)
26. Zhussupbekov, M., et al.: Influence of shear rate and surface chemistry on thrombus formation in micro-crevice. J. Biomech. 121, 110397 (2021)
27. Taskin, M.E., et al.: Evaluation of Eulerian and Lagrangian models for hemolysis estimation. ASAIO J. 58(4), 363–372 (2012)
28. Menichini, C., Xu, X.Y.: Mathematical modeling of thrombus formation in idealized models of aortic dissection: initial findings and potential applications. J. Math. Biol. 73(5), 1205–1226 (2016)
29. Li, Y., et al.: A mathematical model for assessing shear induced bleeding risk. Comput. Methods Programs Biomed. 231, 107390 (2023)
30. Cho, S.M., Moazami, N., Frontera, J.A.: Stroke and Intracranial Hemorrhage in HeartMate II and HeartWare left ventricular assist devices: a systematic review. Neurocrit. Care. 27(1), 17–25 (2017)
31. Suarez-Pierre, A., et al.: Early outcomes after heart transplantation in recipients bridged with a HeartMate 3 device. Ann. Thorac. Surg. 108(2), 467–473 (2019)
32. Schalit, I., et al.: Accelerometer detects pump thrombosis and thromboembolic events in an in vitro HVAD circuit. ASAIO J. 64(5), 601–609 (2018)
33. Rowlands, G.W., Antaki, J.F.: High-speed visualization of ingested, ejected, adherent, and disintegrated thrombus in contemporary ventricular assist devices. Artif. Organs. 44(11), E459–E469 (2020)
34. Koliopoulou, A., et al.: Bleeding and thrombosis in chronic ventricular assist device therapy: focus on platelets. Curr. Opin. Cardiol. 31(3), 299–307 (2016)

Analysis of YAP1 Gene as a Potential Immune-Related Biomarker and Its Relationship with the TAZ Expression

Shan He, Rushuang Xu, Qing Luo, and Guanbin Song(B)

Key Laboratory of Biorheological Science and Technology, Ministry of Education, College of Bioengineering, Chongqing University, Shapingba District, No. 174, Shazheng Street, Chongqing, People’s Republic of China
[email protected]

Abstract. Background: YAP1 is a transcription factor closely related to mechanical stress and participates in the Hippo signaling pathway. However, the prognostic functions and clinical significance of YAP1 in many cancers are unknown. In this study, using bioinformatics analysis of public databases, we aimed to characterize its expression profile, prognostic value, immune infiltration pattern, and biological function. Methods: The Kaplan–Meier survival estimate and the Mantel–Cox test were used to assess whether YAP1 expression predicted prognosis. CIBERSORT and TIMER were used to characterize the components of the immune microenvironment. YAP1 pathways were then identified with Gene Set Enrichment Analysis (GSEA). Additionally, we analyzed potential Protein-Protein Interactions (PPI) of YAP1 and TAZ using GEO, STRING, and CYTOSCAPE. Results: YAP1 is expressed heterogeneously across cancers and is genetically regulated. YAP1 was enriched in pathways associated with immunity, metabolism, cancer, cell apoptosis, and the cytoskeleton in various malignant cancers. In particular, the YAP1 and TAZ analysis showed that immunity, metabolism, and cytoskeleton pathways were highly enriched. Conclusions: These findings indicate that YAP1 plays a significant role in many kinds of cancers and lay the groundwork for an in-depth investigation into the function of YAP1 and TAZ. Keywords: YAP1 · TAZ · Immune · Biomarker · Pan-cancer analysis

1 Introduction

As a downstream molecule of the Hippo signaling pathway, YAP1 is known to play a role in cancer cell proliferation, invasion, and apoptosis [1, 2]. Previous studies have shown that YAP1 performs different roles in different kinds of tumors and influences resistance to anti-tubulin drugs in most cancers [3]. Because of their functional similarities, YAP1 and TAZ are frequently referred to as a single entity. YAP1/TAZ can participate in mechanical regulation, and F-actin cytoskeleton tension and structural organization affect cell metabolism [4, 5]. YAP1/TAZ also interacts with a variety of signaling pathways, including WNT, transforming growth factor, and epidermal growth factor receptor signaling, to respond to cell signals [6, 7]. Furthermore, using the GEO database, we investigated the relation between YAP1 and TAZ expression in human malignancies and identified the genes and pathways that these two proteins control. Because it can recognize and destroy most cancer cells in their early stages, the immune system is critical to tissue homeostasis. As a result, it plays a role in the multilayered defense against cancer that occurs at the tissue level. Recent studies reveal that YAP1 and TAZ may regulate the transcription of PD-L1 in cancer cells, including mesothelioma, lung cancer, and melanoma. Furthermore, YAP1 deletion was linked to a lower presence of Treg cells in the tumor as well as increased production of interferon-γ and tumor necrosis factor [8]. Although YAP1 has emerged as an oncogene in many tumor types, its specific mechanism of action remains unclear. We used bioinformatics analysis tools to investigate the transcriptional expression of YAP1, immune infiltration, and prognosis in multiple cancer types. We also examined the genetic alterations and potential pathways of YAP1 in different cancers. Our research highlights the vital position of YAP1 in human cancers, suggests a potential relationship between YAP1 and the tumor microenvironment, and aims to provide additional information on the importance of YAP1 and TAZ in various malignancies.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024 G. Wang et al. (Eds.): APCMBE 2023, IFMBE Proceedings 104, pp. 153–166, 2024. https://doi.org/10.1007/978-3-031-51485-2_18

2 Materials and Methods

2.1 UCSC Xena Dataset Analysis

UCSC Xena is a user-friendly cancer genomics visualization tool. More than 1,600 gene expression datasets from 50 cancer types are available, including TCGA, ICGC, and TCGA Pan-Cancer Atlas data [9]. UCSC Xena data centers also provide GTEx and TCGA gene expression data and clinical stages, which we used for a pan-cancer differential expression analysis of YAP1.

2.2 GEO Dataset Acquisition and Analysis

To investigate the upstream and downstream regulatory genes of YAP1/TAZ, two distinct datasets (GSE66949 and GSE59230) were retrieved from the GEO database.

2.3 cBioPortal Dataset Analysis

The cBio Cancer Genomics Portal (cBioPortal) is a free tool for interactively exploring multidimensional cancer genomics data sets and currently includes more than 5,000 tumor samples [10]. We examined YAP1 mutations and copy number alterations (CNA) in various malignancies.

2.4 UALCAN Dataset Analysis

UALCAN is an interactive web portal that allows users to perform in-depth analyses of TCGA gene expression data. We employed UALCAN to assess the differential DNA methylation of YAP1 across cancer types.

2.5 Kaplan–Meier Plotter Analysis

Using the Kaplan–Meier plotter, we evaluated the prognostic value of YAP1 mRNA expression in various tumors, mostly from TCGA.

2.6 TISIDB Immune Analysis

TISIDB is a web portal for gene and tumor-immune interactions. We used TISIDB to examine YAP1 expression across immune subtypes, namely C1 (wound healing), C2 (IFN-γ dominant), C3 (inflammatory), C4 (lymphocyte depleted), C5 (immunologically quiet), and C6 (TGF-β dominant).

2.7 CIBERSORT Immune Analysis

We used the ESTIMATE algorithm to calculate immune scores for each tumor sample, together with the abundances of 22 immune cell types estimated using CIBERSORT.

2.8 GSEA Analysis

GSEA was performed on YAP1 and its co-expressed genes to identify pathways based on KEGG and GO terms. Results were considered statistically significant when the normalized enrichment score (NES) was ≥ 1.0 and the false discovery rate (FDR) was < 0.25.

2.9 TIMER Database Analysis

TIMER2.0 uses six state-of-the-art algorithms to provide a more robust estimation of immune infiltration levels for tumor profiles supplied by users or taken from TCGA.

2.10 STRING Protein Network Analysis

The STRING database is a useful tool for examining functional relationships between proteins. Using STRING, we predicted the upstream and downstream regulatory proteins of YAP1/TAZ, as well as the regulatory relationships between these proteins, in GSE66949 and GSE59230.

2.11 CYTOSCAPE Network Analysis

The CYTOSCAPE platform visualizes molecular interaction networks and biological pathways and integrates these networks with gene expression profiles and other data. In this study, CYTOSCAPE was used to build the PPI networks for YAP1 and TAZ.

2.12 GEPIA Dataset Analysis

GEPIA provides interactive analysis and profiling of cancer and normal gene expression. The prognostic value of YAP1 was analyzed with GEPIA.
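The significance criteria stated for GSEA (NES ≥ 1.0 and FDR < 0.25) amount to a simple filter over the enrichment results. The following is a minimal sketch of that filter, not the authors' pipeline. The record type `GseaResult`, its fields, and the example terms and values are hypothetical and chosen only for illustration.

```python
from typing import NamedTuple


class GseaResult(NamedTuple):
    """One enrichment result (hypothetical record layout)."""
    term: str
    nes: float  # normalized enrichment score
    fdr: float  # false discovery rate


# Thresholds as stated in the Methods text.
NES_MIN = 1.0
FDR_MAX = 0.25


def is_significant(result):
    """Apply the stated criteria: NES >= 1.0 and FDR < 0.25."""
    return result.nes >= NES_MIN and result.fdr < FDR_MAX


def significant_terms(results):
    """Keep significant results, ordered by decreasing NES."""
    kept = [r for r in results if is_significant(r)]
    return sorted(kept, key=lambda r: r.nes, reverse=True)


# Invented example values, not results from this study.
example = [
    GseaResult("pathway_A", 1.8, 0.02),
    GseaResult("pathway_B", 0.9, 0.01),   # fails the NES threshold
    GseaResult("pathway_C", 1.2, 0.30),   # fails the FDR threshold
    GseaResult("pathway_D", 1.0, 0.249),  # passes both thresholds
]
```

Because the criterion is written as NES ≥ 1.0, this sketch keeps positively enriched terms only. Whether negatively enriched terms were also considered is not stated in the text.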

2.13 Statistical Analysis

Data were analyzed using GraphPad Prism 8, TBtools, and Hiplot. Survival was assessed using Kaplan–Meier curves and the Mantel–Cox (log-rank) test. Student's t-test was used for comparisons between two groups. Differences with p < 0.05 were considered statistically significant.
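To make the survival comparison concrete, the sketch below implements a two-group Mantel–Cox (log-rank) test in plain Python. It is an illustrative implementation under stated assumptions, not the procedure run in GraphPad Prism or the web tools used in this study. The function name `logrank_test` and its argument layout are hypothetical. Each group is given as a list of follow-up times and a parallel list of event flags (1 = event observed, 0 = censored).

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank (Mantel-Cox) test.

    Returns (chi2, p), where chi2 has one degree of freedom.
    """
    # Pool both groups as (time, event, group) records.
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]

    # Distinct times at which at least one event occurred.
    event_times = sorted({t for t, e, _ in data if e})

    observed_a = 0.0
    expected_a = 0.0
    variance = 0.0
    for t in event_times:
        # Subjects still at risk just before time t, per group.
        n_a = sum(1 for u, _, g in data if u >= t and g == 0)
        n_b = sum(1 for u, _, g in data if u >= t and g == 1)
        # Events occurring exactly at time t, per group.
        d_a = sum(1 for u, e, g in data if u == t and e and g == 0)
        d_b = sum(1 for u, e, g in data if u == t and e and g == 1)
        n, d = n_a + n_b, d_a + d_b

        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            # Hypergeometric variance of the event count in group A.
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)

    if variance == 0:
        return 0.0, 1.0
    chi2 = (observed_a - expected_a) ** 2 / variance
    # Survival function of a chi-square variable with 1 degree of freedom.
    p = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p
```

For example, `logrank_test([1, 2, 3, 4, 5], [1] * 5, [6, 7, 8, 9, 10], [1] * 5)` compares two invented groups with no censoring and would be judged against the p < 0.05 threshold given above.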

3 Results

3.1 Pan-Cancer Analysis of YAP1 Expression, Stage, Molecular Subtypes, and Methylation Level in Human Cancers

We analyzed tumor and normal samples from public databases to assess whether YAP1 transcriptional expression correlates with cancer and to characterize YAP1 mRNA expression. YAP1 expression was downregulated in most cancers, including bladder urothelial carcinoma (BLCA), breast invasive carcinoma (BRCA), kidney chromophobe (KICH), kidney renal clear cell carcinoma (KIRC), kidney renal papillary cell carcinoma (KIRP), lung adenocarcinoma (LUAD), lung squamous cell carcinoma (LUSC), pheochromocytoma and paraganglioma (PCPG), prostate adenocarcinoma (PRAD), stomach adenocarcinoma (STAD) and uterine corpus endometrial carcinoma (UCEC), whereas it was upregulated in cholangiocarcinoma (CHOL), colon adenocarcinoma (COAD), glioblastoma multiforme (GBM), liver hepatocellular carcinoma (LIHC) and thyroid carcinoma (THCA) (Fig. 1a). Additionally, we explored the relationship between YAP1 and clinical stage and other cancer-related phenotypes. YAP1 expression differed significantly among the molecular subtypes of BLCA, COAD, head and neck squamous cell carcinoma (HNSC), KIRP, brain lower grade glioma (LGG), ovarian serous cystadenocarcinoma (OV), PCPG, PRAD, STAD and UCEC (Fig. 1b). Moreover, YAP1 expression varied significantly across the clinical stages of adrenocortical carcinoma (ACC), COAD, and pancreatic adenocarcinoma (PAAD) (Fig. 1c). Then, using UALCAN and TCGA, we examined the relationship between DNA methylation levels and YAP1 expression. We found that YAP1 expression was low in BRCA, KIRC, KIRP, LUAD and LUSC, whereas its DNA methylation level was higher than in adjacent normal tissues (Fig. 1d). This suggests that abnormal methylation may lead to altered YAP1 expression, and that YAP1 expression in some cancers may vary with methylation level, stage, and molecular subtype.

3.2 Genetic Alterations in YAP1 in Human Cancers

Genetic alterations of YAP1 were investigated using the cBioPortal database. The mutation frequency of YAP1 was low across cancers (Fig. 2a). To evaluate the variation in gene expression associated with copy number changes, we investigated the link between gene expression and relative linear copy number values. The results showed that

Fig. 1. The transcription levels of YAP1 in human cancers. a The level of YAP1 expression in different tumor types from the TCGA database in UCSC Xena; b YAP1 expression in different molecular subtypes of cancers via TISIDB; c association between YAP1 expression and clinical stage; d DNA methylation beta values, ranging from 0 (unmethylated) to 1 (fully methylated), determined by UALCAN (*p < 0.05, **p < 0.01, ***p < 0.001).

YAP1 expression was positively correlated with CNA in ACC, BLCA, cervical squamous cell carcinoma and endocervical adenocarcinoma (CESC), BRCA, ESCA, HNSC, KICH, KIRC, LIHC, LUAD, PRAD, STAD, and THCA, whereas LGG showed a negative association (Fig. 2b).

3.3 Prognostic Value of YAP1 Expression in Cancers

To determine the prognostic value of YAP1, we used the Kaplan–Meier plotter database and GEPIA to investigate the relationship between YAP1 mRNA expression and overall survival (OS) and relapse-free survival (RFS). In the Kaplan–Meier plotter database, higher YAP1 expression was associated with longer OS in ESCA and KIRC and with longer RFS in ESCA. In contrast, elevated YAP1 expression was linked to shorter OS in BLCA, LUAD, PAAD, and thymoma (THYM), and to shorter RFS in BLCA, sarcoma (SARC), PAAD, and testicular germ cell tumors (TGCT). YAP1 showed no prognostic value in the other cancers (Fig. 3a). To further examine whether YAP1 expression is a prognostic factor for patients, the Cox regression model in GEPIA was employed for survival analysis. High YAP1 expression was related to shorter OS in ACC, LGG, and PAAD and to shorter RFS in ACC and BLCA. Additionally, in ESCA and KIRC, greater YAP1 expression

Fig. 2. Genetic alterations of YAP1 in cancers. a Genetic alterations of YAP1 in different cancer samples and their proportions; b correlation between relative linear copy-number values and YAP1 expression (Spearman correlation test, p < 0.05 was considered significant).

was linked to longer OS, which is consistent with the findings from the Kaplan–Meier plotter database (Fig. 3b).

3.4 Correlation Between YAP1 Expression and Immune Cell Infiltration and Immune Subtypes in Human Cancers

YAP1 expression was associated with the state of infiltrating immune cells. We used two algorithms for evaluation: TIMER predicts the abundance of immune cells in the wider tumor microenvironment, whereas CIBERSORT infers the relative fractions of immune subsets in the total leukocyte population. Using TIMER and CIBERSORT, we determined the correlation between immune cell infiltration levels in malignancies and YAP1 expression. The findings demonstrated a significant relationship between YAP1 expression and tumor-infiltrating immune cells (TIICs). The clustering heat map from CIBERSORT showed that YAP1 was positively correlated with resting memory CD4+ T cells in most cancers, especially HNSC (rho = 0.31, p = 6.63E-11), SKCM (rho = 0.3, p = 7.7E-10) and LUSC (rho = 0.25, p = 3.78E-07). However, YAP1 was negatively correlated with CD8+ T cells in most cancers, especially HNSC (rho = −0.35, p = 6.42E-14) and BRCA (rho = −0.21, p = 1.73E-10). In some other cancers, YAP1 also showed notable relationships with immune cells. For instance, YAP1 expression in LIHC was positively correlated with infiltrating levels of memory B cells (rho = 0.15, p = 5.15E-03) and M0 macrophages

Fig. 3. The prognostic value of YAP1 expression in various cancers. a Prognostic HRs of YAP1 in different cancers assessed by Kaplan–Meier plotter; b association of YAP1 expression with patient OS and RFS across cancers, assessed with the Mantel–Cox test in the GEPIA database (p < 0.05 was considered significant).

(rho = 0.14, p = 9.45E-03), and negatively correlated with infiltrating levels of CD8+ T cells (rho = −0.165, p = 2.17E-03) and activated NK cells (rho = −0.125, p = 2.06E-02) (Fig. 4a). The TIMER platform also confirmed the association between YAP1 expression and immune cell infiltration (Fig. 4b). We then evaluated YAP1 mRNA expression in the immune subtypes using TISIDB and observed significant differences in BRCA, LGG, LIHC, LUAD, PCPG, PRAD, SARC, STAD, TGCT, and UCEC across the C1 to C6 subtypes (Fig. 4c). This suggests that the different functions of YAP1 in human cancers may be related to immune subtypes.

3.5 Significant Pathways Influenced by YAP1 in Different Cancers

We selected YAP1 and its co-expressed genes for GSEA to identify potential pathways. The KEGG results showed that “Regulation of actin cytoskeleton” and “TGF-beta signaling pathway” had high enrichment scores. YAP1 also affected several cancer-related pathways, including “Focal adhesion,” “WNT signaling pathway,” and “Vascular smooth muscle contraction” (Fig. 5a). In addition, “Regulation of cellular response to stress,” “Oxidative phosphorylation,” “Tube development,” “NOTCH signaling pathway,” and “Apoptotic process” in GO BP were observed in ACC, BLCA, KIRC, and MESO. YAP1 also affected “Hippo signaling,” “DNA metabolic process,” “Regulation of WNT signaling pathway” and “Cell-cell signaling” in several cancers (Fig. 5b). In

Fig. 4. The connection between YAP1 expression and tumor-infiltrating immune cells. a Immune infiltration level and YAP1 expression were correlated using CIBERSORT; b immune infiltration level and YAP1 expression were correlated using TIMER; c YAP1 mRNA expression in different immune subtypes from TISIDB (p < 0.05 was considered significant).

terms of GO MF, “Transcription regulator activity,” “Cell adhesion molecule binding,” “Transcription coactivator activity,” “Oxidoreductase activity,” and “NADH dehydrogenase activity” had high enrichment scores (Fig. 5c). GO CC analysis showed that “Oxidoreductase complex,” “NADH dehydrogenase complex,” and “Membrane protein complex” were enriched in ACC, BLCA, KICH, KIRC, LIHC, and THCA (Fig. 5d). These results indicate that YAP1-related pathways are significantly associated with immune regulation, cell metabolism, and cell assembly.

3.6 Correlation Between YAP1 and TAZ Expression and Their Regulated Downstream Genes

In cancer, metastasis, embryonic development, and tissue regeneration, YAP1 and its paralog, the transcriptional co-activator with PDZ-binding motif (TAZ), act as transcriptional regulators. We therefore examined the association between YAP1 and TAZ expression in human tumors. YAP1 was positively correlated with TAZ in ACC, COAD, LIHC, TGCT, and uveal melanoma (UVM). However, YAP1 was negatively correlated with TAZ in BRCA Basal, BRCA LumA, BRCA LumB, BRCA, HNSC HPV+, HNSC, KIRC, KIRP, LGG, LUAD, PRAD, rectum adenocarcinoma (READ), SARC, SKCM Metastasis, skin cutaneous melanoma (SKCM), THCA, THYM, and UCEC (Fig. 6a).

Fig. 5. Significant pathways influenced by YAP1. a Associations between YAP1 and KEGG pathways from TCGA analyzed by GSEA; b associations between YAP1 and GO BP terms from TCGA analyzed by GSEA; c associations between YAP1 and GO MF terms from TCGA analyzed by GSEA; d associations between YAP1 and GO CC terms from TCGA analyzed by GSEA (NES ≥ 1.0, FDR < 0.25).

To further explore the potential function of YAP1/TAZ in tumors, we used the GEO database to identify genes and differential signaling pathways that interact with YAP1/TAZ. First, we identified YAP1/TAZ differentially expressed genes using GEO2R, and then constructed a PPI network using STRING and CYTOSCAPE. As shown in Fig. 6b, YAP1/TAZ interacted with CYR61, ANKRD1, the MMP gene family, and others. Furthermore, ClueGO was used to identify functional enrichment for YAP1/TAZ. This analysis showed that “collagen catabolic process,” “response to amyloid-beta,” “regulation of mononuclear cell migration,” “granulocyte chemotaxis,” “interleukin-12 production,” “wound healing,” “killing of cells of other organisms,” “monocyte chemotaxis” and “collagen binding” were enriched in oral squamous cell carcinoma (OSCC) (Fig. 6b). According to Fig. 6c, YAP1/TAZ interacts with the TFF gene family, COL6A3, CDCA7, and others. The ClueGO results showed that “cell cycle DNA replication,” “condensed chromosome, centromeric region,” and “metaphase plate congression” were highly enriched (Fig. 6c).

Fig. 6. Correlation between YAP1 and TAZ expression and their regulated downstream genes. a Relationships between YAP1 and TAZ in human cancers (Spearman correlation test, *p < 0.05 was considered significant); b PPI network for YAP1/TAZ constructed with STRING and CYTOSCAPE, and bar graph of enriched GO and KEGG terms from GSE66949, colored by p-value; c PPI network for YAP1/TAZ constructed with STRING and CYTOSCAPE, and enriched GO and KEGG terms from GSE59230, colored by p-value. Red represents core genes.
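The correlation analyses reported in Figs. 2b and 6a rely on the Spearman correlation test, which is the Pearson correlation computed on ranks. As a minimal sketch of how the coefficient rho is obtained (not the code behind the web tools used in this study), the following plain-Python example ranks both variables, averaging tied ranks, and correlates the ranks. The function names are hypothetical, and no p-value is computed here.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, assigning tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j over the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x = sum(rx) / len(rx)
    mean_y = sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in rx))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in ry))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("rho is undefined when one variable is constant")
    return cov / (sd_x * sd_y)
```

Because only ranks enter the calculation, any monotonic relationship between two expression profiles gives rho close to 1 or −1 regardless of the scale of the measurements.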

4 Discussion

The Hippo signaling pathway regulates the activation of YAP1, a transcriptional coactivator. YAP1 is involved in cancer aggressiveness, metastasis, and therapeutic resistance, as well as in normal tissue homeostasis and regeneration [11]. YAP1 expression and nuclear localization have been observed in a variety of human malignancies, including liver cancer, colon cancer, ovarian cancer, lung cancer, and prostate cancer [12]. Consistently, the proliferation, survival, stemness, invasiveness, metastatic behavior, and therapeutic resistance of tumor cells in various cancers have all been found to be promoted by increased YAP1 activity [13]. YAP1 is strongly expressed in self-renewing embryonic stem cells (ESCs). Even under differentiation conditions, overexpression of YAP1 maintains stem-like characteristics and self-renewal while inhibiting ESC differentiation [14, 15]. Furthermore, YAP1 activity influences cell fate and function in the liver, skin, heart, and nervous system [16]. YAP1/TAZ are known to have a variety of effects on immune cell lineage commitment, differentiation, and function, particularly CD4+ T cell differentiation and function [8], dendritic cell activity, CD8+ T cell differentiation, and macrophage recruitment and differentiation [17]. Furthermore, YAP1/TAZ is required for the control of ferroptosis in cancer cells [18].

The involvement of YAP1/TAZ in the growth and dissemination of malignancies has been demonstrated in numerous investigations. Studies have shown that YAP1 exerts its transcriptional function mainly through nucleocytoplasmic shuttling. In the normal physiological state, YAP1 stays in the cytoplasm or is degraded and has no transcriptional activity [19]. However, when the Hippo signaling pathway is shut down or inactivated, YAP1 enters the nucleus in a non-phosphorylated form and binds to transcription factors such as the TEAD protein family, which in turn regulates downstream gene expression [20]. In LIHC [21], COAD [22], and BRCA [23], YAP1 regulates F-actin, the structural arrangement of the cytoskeleton, and stress, and influences tumor development and metastasis by receiving mechanical stimuli from integrins, focal adhesions, and other mechanosensory receptors. YAP1 can also regulate numerous signaling pathways, including WNT [24], transforming growth factor-β [25], Hedgehog [26], and epidermal growth factor receptor signaling [27]. Existing studies have focused on the function of YAP1 in tumor cells, whereas the downstream signaling molecules of YAP1, its spatiotemporal characteristics, and the regulatory mechanism of YAP1 and TAZ as a whole remain unclear. More sophisticated mouse models and newer technologies (e.g. single-cell sequencing and spatiotemporal omics) will be required in the future to investigate the function of YAP1 in cancer.

We performed a systematic bioinformatics study of available data to gain further insight into the clinical significance and biological functions of YAP1. We observed that YAP1 is abnormally expressed in most cancers and correlates with tumor stage and molecular subtype. In our data, YAP1 is strongly expressed in COAD and correlates strongly with stage and subtype. However, published reports indicate that YAP1 expression correlates with tumor stage in BRCA and STAD, which differs from our data; we attribute this to differences in samples and to tumor heterogeneity. Recent evidence also indicates that DNA methylation has an important relationship with the expression of YAP1. Our study found that YAP1 DNA methylation levels in cancer tissues were considerably higher than in normal tissue. In addition, in BRCA, KIRC, KIRP, LUAD, and LUSC, we observed that YAP1 expression is low while methylation levels are increased. This also suggests that abnormal methylation in tumors may lead to changes in the expression of YAP1, and that YAP1 may be an important molecule in tumor methylation.

We analyzed genetic alteration data for tumors from cBioPortal. We found a low mutation rate for YAP1, whereas mutations in TP53 and EGFR may result in altered YAP1 expression. We propose that the impact of YAP1 on tumor initiation and progression is due to its nucleocytoplasmic shuttling rather than to mutations. DNA copy number is strongly associated with disease progression; in most tumors, YAP1 expression correlates positively with DNA copy number, consistent with our findings. Thus, targeting gene products that are amplified or deleted in a given disease may have a significant therapeutic impact.

YAP1 expression has been associated with a poor prognosis. To determine whether YAP1 could be employed as a cancer prognostic marker, we investigated the prognostic value of YAP1 expression in tumors. In our cohort, aberrant expression of YAP1 was associated with poor prognosis. Notably, other research has found that YAP1 is connected to the prognosis of STAD, HNSC, and BRCA.

This suggests that data from TCGA still have limitations and that further human samples are needed to identify the specific mechanism of YAP1 in tumors.

Immune infiltration is substantially correlated with YAP1 expression. We found a significant link between YAP1 expression and B cell, CD4+ T cell, CD8+ T cell, neutrophil, macrophage, and DC infiltration. The difference between the CIBERSORT and TIMER data may be due to selection bias. Similarly, depending on the type of cancer, YAP1 is associated with a variety of immune subtypes and tumor-infiltrating immune cells. Studies show that YAP1 is an important immunosuppressant and that disrupting YAP1 activity results in enhanced T-cell activation, differentiation, and tumor infiltration. Our results also showed that YAP1 may be a key inducer of CD4+ and CD8+ T cell activation. Furthermore, we identified pathways involved in various immune-related processes, including the TGF-β signaling pathway, the chemokine signaling pathway, the immune effector process, and the interleukin-6-mediated pathway. These findings indicate that targeting YAP1 might be a way to boost the effectiveness of immunotherapy.

In addition to playing a part in immunological control, YAP1 was linked to carcinogenic pathways according to GSEA (e.g. MAPK, WNT, Hippo, and p53). YAP1 is closely linked to cell metabolism and was the most important signaling pathway in the modulation of glycolysis in PAAD and LIHC. How each cell reacts to the physical signals it receives from its immediate environment depends on how YAP1 and TAZ interpret those signals, which in turn depends on the structural organization and tension of the cytoskeleton [2]. Our GSEA results showed that pathways related to cell metabolism and cell assembly were enriched for YAP1. We used GEO RNA-seq expression data for validation in this study. We identified 25 and 160 genes in the two datasets and found that the SLC2A3 gene was retained after filtering and positively associated with YAP/TAZ in both GEO datasets. Studies have shown that activation of Glut3-YAP in colorectal cancer can regulate metabolic reprogramming to promote metastasis, but this has not been reported in other cancers. SLC2A3 may therefore be a crucial locus in YAP1’s regulation of tumor metabolism, and its function can be studied further.

Our research has some limitations. First, our data come from public databases, and more accurate human data are needed for further verification. Second, we did not explore how YAP1 affects tumor immune processes or the genes and mechanisms involved.

5 Conclusions

In conclusion, this data-mining study revealed the clinical significance of YAP1 in human cancers. We examined the characteristics of YAP1 expression, its prognostic significance, and its relationship to tumor-infiltrating immune cells in a variety of malignancies. Our findings provide evidence that YAP1 may be a useful target for cancer treatment and shed light on the roles that YAP1 plays in tumor defense, metabolic activity, and prognosis. Future functional investigations can build on these results to confirm YAP1’s function in malignancies.

Acknowledgment. This work was financially supported by the Key Program of the National Natural Science Foundation of China (11832008) and the Natural Science Foundation of Chongqing (cstc2020jcyjmsxmX0545).

References

1. Zhang, J., Deng, X.: Effects of miR-599 targeting YAP1 on proliferation, invasion and apoptosis of bladder urothelial carcinoma cells. Exp. Mol. Pathol. 118, 104599 (2021). https://doi.org/10.1016/j.yexmp.2020.104599
2. Zanconato, F., Cordenonsi, M., Piccolo, S.: YAP and TAZ: a signalling hub of the tumour microenvironment. Nat. Rev. Cancer 19, 454–464 (2019). https://doi.org/10.1038/s41568-019-0168-y
3. Kim, M.H., Kim, J.: Role of YAP/TAZ transcriptional regulators in resistance to anti-cancer therapies. Cell. Mol. Life Sci. 74, 1457–1474 (2017). https://doi.org/10.1007/s00018-016-2412-x
4. Panciera, T., Azzolin, L., Cordenonsi, M., et al.: Mechanobiology of YAP and TAZ in physiology and disease. Nat. Rev. Mol. Cell Biol. 18, 758–770 (2017). https://doi.org/10.1038/nrm.2017.87
5. Koo, J.H., Guan, K.L.: Interplay between YAP/TAZ and metabolism. Cell Metab. 28, 196–206 (2018). https://doi.org/10.1016/j.cmet.2018.07.010
6. Hansen, C.G., Moroishi, T., Guan, K.L.: YAP and TAZ: a nexus for Hippo signaling and beyond. Trends Cell Biol. 25, 499–513 (2015). https://doi.org/10.1016/j.tcb.2015.05.002
7. Lee, Y., Finch-Edmondson, M., Cognart, H., et al.: Common and unique transcription signatures of YAP and TAZ in gastric cancer cells. Cancers (Basel) 12, 3667 (2020). https://doi.org/10.3390/cancers12123667
8. Ni, X., Tao, J., Barbi, J., et al.: YAP is essential for Treg-mediated suppression of antitumor immunity. Cancer Discov. 8, 1026–1043 (2018). https://doi.org/10.1158/2159-8290
9. Goldman, M.J., Craft, B., Hastie, M., et al.: Visualizing and interpreting cancer genomics data via the Xena platform. Nat. Biotechnol. 38, 675–678 (2020). https://doi.org/10.1038/s41587-020-0546-8
10. Gao, J., Aksoy, B.A., Dogrusoz, U., et al.: Integrative analysis of complex cancer genomics and clinical profiles using the cBioPortal. Sci. Signal. 6, pl1 (2013). https://doi.org/10.1126/scisignal.2004088
11. Szulzewsky, F., Holland, E.C., Vasioukhin, V.: YAP1 and its fusion proteins in cancer initiation, progression and therapeutic resistance. Dev. Biol. 475, 205–221 (2021). https://doi.org/10.1016/j.ydbio.2020.12.018
12. Zhao, B., Li, L., Lei, Q., et al.: The Hippo-YAP pathway in organ size control and tumorigenesis: an updated version. Genes Dev. 24, 862–874 (2010). https://doi.org/10.1101/gad.1909210
13. Thompson, B.J.: YAP/TAZ: drivers of tumor growth, metastasis, and resistance to therapy. BioEssays 42, e1900162 (2020). https://doi.org/10.1002/bies.201900162
14. Wang, Y., Yu, A., Yu, F.X.: The Hippo pathway in tissue homeostasis and regeneration. Protein Cell 8, 349–359 (2017). https://doi.org/10.1007/s13238-017-0371-0
15. Tamm, C., Böwer, N., Annerén, C.: Regulation of mouse embryonic stem cell self-renewal by a Yes-YAP-TEAD2 signaling pathway downstream of LIF. J. Cell Sci. 124, 1136–1144 (2011). https://doi.org/10.1242/jcs.075796
16. Lian, I., Kim, J., Okazawa, H., et al.: The role of YAP transcription coactivator in regulating stem cell self-renewal and differentiation. Genes Dev. 24, 1106–1118 (2010). https://doi.org/10.1101/gad.1903310
17. Huang, Z., Zhou, J., Leung, W.T., et al.: The novel role of Hippo-YAP/TAZ in immunity at the mammalian maternal-fetal interface: opportunities, challenges. Biomed. Pharmacother. 126, 110061 (2020). https://doi.org/10.1016/j.biopha.2020.110061
18. Dai, C., Chen, X., Li, J., et al.: Transcription factors in ferroptotic cell death. Cancer Gene Ther. 27, 645–656 (2020). https://doi.org/10.1038/s41417-020-0170-2

19. Yu, F.X., Zhao, B., Guan, K.L.: Hippo pathway in organ size control, tissue homeostasis, and cancer. Cell 163, 811–828 (2015). https://doi.org/10.1016/j.cell.2015.10.044
20. Marti, P., Stein, C., Blumer, T., et al.: YAP promotes proliferation, chemoresistance, and angiogenesis in human cholangiocarcinoma through TEAD transcription factors. Hepatology 62, 1497–1510 (2015). https://doi.org/10.1002/hep.27992
21. van der Stoel, M., Schimmel, L., Nawaz, K., et al.: DLC1 is a direct target of activated YAP/TAZ that drives collective migration and sprouting angiogenesis. J. Cell Sci. 133, jcs239947 (2020). https://doi.org/10.1242/jcs.239947
22. Yu, M., Cui, R., Huang, Y., et al.: Increased proton-sensing receptor GPR4 signalling promotes colorectal cancer progression by activating the hippo pathway. EBioMedicine 48, 264–276 (2019). https://doi.org/10.1016/j.ebiom.2019.09.016
23. Zheng, L., Xiang, C., Li, X., et al.: STARD13-correlated ceRNA network-directed inhibition on YAP/TAZ activity suppresses stemness of breast cancer via co-regulating Hippo and Rho-GTPase/F-actin signaling. J. Hematol. Oncol. 11, 72 (2018). https://doi.org/10.1186/s13045-018-0613-5
24. Abduch, R.H., Carolina Bueno, A., Leal, L.F., et al.: Unraveling the expression of the oncogene YAP1, a Wnt/beta-catenin target, in adrenocortical tumors and its association with poor outcome in pediatric patients. Oncotarget 7, 84634–84644 (2016). https://doi.org/10.18632/oncotarget.12382
25. Zhang, X., Fan, Q., Li, Y., et al.: Transforming growth factor-beta1 suppresses hepatocellular carcinoma proliferation via activation of Hippo signaling. Oncotarget 8, 29785–29794 (2017). https://doi.org/10.18632/oncotarget.14523
26. Wang, C., Cheng, L., Song, S., et al.: Gli1 interacts with YAP1 to promote tumorigenesis in esophageal squamous cell carcinoma. J. Cell. Physiol. 235, 8224–8235 (2020). https://doi.org/10.1002/jcp.29477
27. Kurppa, K.J., Liu, Y., To, C., et al.: Treatment-induced tumor dormancy through YAP-mediated transcriptional reprogramming of the apoptotic pathway. Cancer Cell 37, 104–122 (2020). https://doi.org/10.1016/j.ccell.2019.12.006

Morphological Feature Recognition of Induced ADSCs Based on Deep Learning

Ke Yi1, Cheng Xu1, Guoqing Zhong1, Zhiquan Ding1, Guolong Zhang1, Xiaohui Guan2, Meiling Zhong3, Guanghui Li1, Nan Jiang1, and Yuejin Zhang1(B)

1 School of Information Engineering, East China Jiaotong University, Nanchang, China
2 The National Engineering Research Center for Bioengineering Drugs and the Technologies, Nanchang University, Nanchang, China
3 School of Materials Science and Engineering, East China Jiaotong University, Nanchang 330031, China

Abstract. To accurately identify the morphological features of induced Adipose Derived Stem Cells (ADSCs) at different differentiation stages and to judge their differentiation types, a deep-learning-based morphological feature recognition method is proposed. Super-resolution images of the different stages of induced ADSC differentiation are first acquired using stimulated emission depletion imaging. Noise is then removed and image quality is optimized with a denoising model based on low-rank nonlocal sparse representation. The denoised images serve as the input to a recognition method based on an improved Visual Geometry Group (VGG-19) convolutional neural network. Using the improved VGG-19 network together with class activation mapping, the morphological features at the different stages of induced ADSC differentiation are recognized and the recognition results are visualized. Tests show that the method can accurately identify the morphological features of induced ADSCs at different differentiation stages and is practical to use.

Keywords: Deep learning · Induced ADSCs · Differentiation · Morphological features · Recognition

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024 G. Wang et al. (Eds.): APCMBE 2023, IFMBE Proceedings 104, pp. 167–175, 2024. https://doi.org/10.1007/978-3-031-51485-2_19

1 Introduction

In the 1950s, Trowell found that adipocytes are similar to fibroblasts and suggested that, under appropriate conditions, adipocytes can differentiate into fibroblasts, or that the process can be reversed, with fibroblasts developing into adipocytes [1]. In 2001, Zuk et al. obtained an adipose tissue suspension by liposuction for the first time, cultured pluripotent stem cells from it and named them Processed Lipoaspirate Cells (PLC), which laid a solid foundation for research on adipose stem cells [2]. Cells extracted in this way have gone by many names, including adipose-derived stromal cells, adipose tissue-derived mesenchymal stem cells, preadipocytes and the Stromal Vascular Fraction (SVF) of adipose tissue [3]. In 2004, at the second Annual Conference on International Adipose Application Technology, these cells were uniformly named Adipose Derived Stem Cells (ADSCs) [4].

Adipose stem cells are easy to extract, proliferate strongly, are biocompatible, cause little pain on harvesting, permit autologous transplantation and raise no ethical problems [5]. Their differentiation ability can be used to repair defects, while their paracrine function can regulate the function of surrounding tissues [6]. Many preclinical studies and preliminary clinical trials have been carried out in bone, myocardium, nerves and blood vessels, accumulating rich laboratory data for therapeutic use. These biological characteristics give adipose stem cells broad application prospects in tissue engineering, cell transplantation and treatment, epidermal wound repair, plastic surgery and other areas [7].

However, most research on adipose stem cells is still carried out in vitro or in animals, and the inducers used often contain growth factors related to tumor occurrence and metastasis. Adipose stem cells themselves also secrete many highly bioactive molecules, and it is unclear whether these promote the growth of malignant cells. Most importantly, local application of ADSCs can trigger systemic reactions and raise the level of signal molecules in serum. Many questions therefore remain to be tested before ADSCs reach real clinical application.
In addition, the specific factors and regulatory mechanisms that govern the proliferation and differentiation of adipose stem cells are not completely clear, and the safety, feasibility, reliability and effectiveness of their clinical application need further study. As ADSC research continues and clinical trials in various fields accumulate, understanding of ADSCs keeps deepening, and their use in clinical treatment is believed to be no longer a distant prospect. With advances in imaging technology, the differentiation state of ADSCs can now be displayed as super-resolution images [8]. To accurately identify the differentiation characteristics of ADSCs and judge their differentiation state [9], this paper proposes a deep-learning-based method for recognizing the morphological features of induced ADSCs at different differentiation stages and verifies its effect experimentally [10]. The results show that the method is an effective technique for identifying the differentiation state of induced ADSCs.

2 Materials and Methods

This paper uses a Stimulated Emission Depletion (STED) super-resolution imaging system to obtain super-resolution images of ADSC differentiation [11]. One laser beam excites the fluorescent substance on the sample surface. A second, high-energy depletion laser with a longer wavelength and a ring-shaped focus then quenches, by stimulated emission, most of the fluorescence at the periphery of the excitation spot. This shrinks the effective fluorescent spot below the diffraction-limited area and yields a high-resolution image [12]. A schematic of STED excitation is shown in Fig. 1.

[Figure: panels labeled excitation light spots, excitation laser, stimulated emission depletion laser spots, phase modulation, lens, detector, quenching process and effective fluorescent spot (PSF); axes x, y, z.]

Fig. 1. A diagram of STED excitation schematic.

The blue spot is the excitation laser, shown in blue in the adjacent panel, and the orange spot is the stimulated emission depletion laser. The STED beam is phase-modulated to form the focal doughnut shown in the top right panel. The orange STED laser quenches the periphery of the blue excitation spot, which significantly reduces the size of the green spot of excited fluorescent molecules; the profile of the Green Fluorescent Protein (GFP) has a full width at half maximum (FWHM) of 66 nm [13].

Super-resolution images of differentiated ADSCs obtained with the STED imaging system inevitably contain noise [14]. To improve the recognition accuracy of the morphological features, this paper denoises the images with a model based on low-rank nonlocal sparse representation. The denoised super-resolution images form the sample data set for the recognition method based on the improved VGG-19 convolutional neural network. First, the improved VGG-19 model is constructed to recognize the morphological features of the ADSC differentiation images; the recognition results are then visualized with the class activation mapping method.

The super-resolution ADSC differentiation images are divided into five types according to differentiation stage, so recognition is a five-class classification task. The three fully connected layers of the traditional VGG-19 have a very large number of parameters, which does not improve the classification accuracy of the model [15]. This paper therefore modifies the fully connected layers of VGG-19 to maintain the efficiency and accuracy of classification while saving time and space.

The improved FC2_VGG-19 network structure is shown in Fig. 2. The three-layer fully connected structure [4096, 4096, 1000] of VGG-19 is replaced by a two-layer fully connected structure [T, 5].
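To give a feel for the saving, the following minimal sketch counts the weights and biases of the two fully connected heads. It assumes the standard VGG-19 input size of 224 × 224, so that the last pooling layer yields a flattened vector of 7 × 7 × 512 = 25088 values; the paper does not state the input size. The helper name `fc_param_count` is hypothetical, and T = 70 is the value the Results section settles on.

```python
def fc_param_count(input_dim, layer_sizes):
    """Count weights and biases of a stack of fully connected layers."""
    total = 0
    prev = input_dim
    for size in layer_sizes:
        total += prev * size + size  # weight matrix plus bias vector
        prev = size
    return total

# Assumption: 224 x 224 input, so the flattened feature vector has 7 * 7 * 512 values.
FLATTENED = 7 * 7 * 512

original_head = fc_param_count(FLATTENED, [4096, 4096, 1000])  # VGG-19 head
improved_head = fc_param_count(FLATTENED, [70, 5])             # [T, 5] with T = 70

print(original_head, improved_head)
```

Under these assumptions the head shrinks from roughly 123.6 million to roughly 1.76 million parameters.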

[Figure: convolution + ReLU layers Conv1-1 to Conv5-4 with max pooling, followed by two fully connected layers of sizes T and 5 (fully connected + ReLU, softmax). Output classes: mesenchymal stem cells, adipoblasts, precursor adipocytes, immature adipocytes, mature adipocytes.]

Fig. 2. Improved FC2_VGG-19 network structure.

Here T is the number of neurons in the first fully connected layer; choosing T yields the optimized FC2_VGG-19 network. When a super-resolution ADSC differentiation image is propagated forward through the FC2_VGG-19 network, each convolution layer produces a three-dimensional data block. In this paper these three-dimensional blocks are collectively called the morphological feature maps of ADSC differentiation; each is composed of the two-dimensional outputs of the individual convolution kernels. These two-dimensional outputs, together with their visualizations, are called activation maps, and the visualization method is the class activation mapping method. Class activation mapping is used to generate the visual display, which makes the recognition results visible, enhances their reliability, and provides a reference for recognizing the differentiation features of induced ADSCs.
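As an illustration of the idea, the sketch below computes a class activation map in its classic formulation, where the map for a class is the weighted sum of the last convolution layer's feature maps. It assumes per-class weights for each feature map are available (as in a global-average-pooling classifier), which the paper does not confirm for FC2_VGG-19. The function name `class_activation_map` and the tiny inputs are hypothetical, and plain Python lists are used to keep the sketch dependency-free.

```python
def class_activation_map(feature_maps, class_weights):
    """Weighted sum of 2-D feature maps for one class, scaled to [0, 1].

    feature_maps: list of K maps, each a list of H rows with W values.
    class_weights: list of K weights, one per feature map (assumed available).
    """
    height, width = len(feature_maps[0]), len(feature_maps[0][0])
    cam = [[0.0] * width for _ in range(height)]
    for fmap, weight in zip(feature_maps, class_weights):
        for i in range(height):
            for j in range(width):
                cam[i][j] += weight * fmap[i][j]
    # Keep only positive evidence for the class, then normalize by the peak.
    cam = [[max(value, 0.0) for value in row] for row in cam]
    peak = max(max(row) for row in cam)
    if peak > 0:
        cam = [[value / peak for value in row] for row in cam]
    return cam

# Hypothetical example: two 2 x 2 feature maps with weights +1 and -1.
maps = [[[1, 0], [0, 2]], [[0, 1], [1, 0]]]
heatmap = class_activation_map(maps, [1.0, -1.0])
print(heatmap)
```

In practice the resulting map would be upsampled to the image size and overlaid on the input image as a heat map.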

3 Results

The proposed method was tested in MATLAB. Super-resolution images of induced ADSC differentiation inevitably contain noise, which degrades image quality, so the images are first denoised with the model based on low-rank nonlocal sparse representation. Comparing the originally collected images of the five differentiation stages with the denoised images shows that, before denoising, the super-resolution images of all five stages are of poor quality and contain noise points. After denoising with the proposed method, the images of all five stages are noticeably clearer.

When recognizing the morphological features of the five differentiation stages of induced ADSCs, the loss function Loss is used to judge the recognition effect. The smaller the loss, the more accurate the recognition result. If the predicted value equals the real value, there is no loss:

\[
\mathrm{Loss} = -\sum_{j=1}^{m} x_j \lg q_j \tag{1}
\]

where x_j is the morphological feature label, m is the number of morphological feature categories, and q_j is the predicted probability of category j. x_j is 1 when category j is the true label and 0 otherwise.

To recognize the morphological features of ADSCs at different differentiation stages, the method improves the VGG-19 convolutional neural network by reducing its fully connected structure to the two layers [T, 5] (denoted VGG-19*). To verify the applicability of the model to different data sets, the super-resolution images of the five differentiation stages are flipped horizontally, flipped vertically and rotated by 180° to obtain an augmented sample set. Figure 3 shows the influence of the number of neurons in the fully connected layer on the recognition performance for the five differentiation stages of induced ADSCs.
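The loss in Eq. (1) and the three augmentation operations can be sketched in a few lines of plain Python. This is an illustrative sketch, not the authors' MATLAB code: it assumes that lg denotes the base-10 logarithm (a natural logarithm would only rescale the loss), represents an image as a list of pixel rows, and uses hypothetical function names.

```python
import math

def cross_entropy(labels, probs):
    """Eq. (1): labels is the one-hot list x_j, probs the predicted q_j."""
    # Terms with x_j = 0 contribute nothing, so they are skipped (avoids log of 0).
    return sum(-x * math.log10(q) for x, q in zip(labels, probs) if x)

def augment(image):
    """Return the horizontal flip, vertical flip and 180-degree rotation."""
    flipped_h = [row[::-1] for row in image]          # mirror left-right
    flipped_v = image[::-1]                           # mirror top-bottom
    rotated_180 = [row[::-1] for row in image[::-1]]  # both flips combined
    return flipped_h, flipped_v, rotated_180

# Five-class example in which the true stage is the third class.
labels = [0, 0, 1, 0, 0]
perfect = cross_entropy(labels, [0.0, 0.0, 1.0, 0.0, 0.0])
poor = cross_entropy(labels, [0.2, 0.2, 0.1, 0.3, 0.2])
print(perfect, poor)

print(augment([[1, 2], [3, 4]]))
```

A confident correct prediction gives zero loss, while assigning the true class a probability of only 0.1 gives a clearly larger loss, which is the behavior the text describes.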

Fig. 3. Effects of the number of neurons in fully connected layer on classification performance.

Figure 3 shows that, as the number of neurons T in the fully connected layer increases, the loss of the recognition result gradually decreases and the recognition accuracy gradually improves. When T is 70, the loss reaches its minimum and the recognition accuracy is high. Beyond that point, the recognition accuracy for the five differentiation stages of induced ADSCs stabilizes, and adding further fully connected parameters no longer improves it. The improved VGG-19* fully connected structure is therefore [70, 5].

Figure 4 shows how the loss function of the method changes during training. As the number of iterations increases, the maximum loss on the training set is 0.020 and the maximum loss on the test set is 0.015. The training and test losses are essentially the same, indicating neither overfitting nor underfitting.

Fig. 4. Schematic diagram of loss function curve of this method.

After the morphological features of the different differentiation stages of ADSCs are identified, the recognition result is displayed next to each image, directly indicating its differentiation stage; the visualization effect is good.

4 Discussion

In light of this work, the application prospects of ADSCs are discussed as follows.

(1) Seed cells for tissue engineering

The repair of tissues, especially those lacking regenerative ability such as cartilage, muscle, myocardium and nerve, has always been a difficult problem in medicine. Stem-cell-based tissue engineering offers a possible solution. Ideal stem cells for regenerative medicine or tissue engineering are considered to meet the following conditions: (a) they are abundant; (b) the separation method is harmless; (c) their multi-lineage differentiation potential is adjustable and renewable; (d) autologous or allogeneic transplantation is efficient and safe; (e) operational standards exist that comply with current medical practice guidelines.

ADSCs are abundant, the separation method is simple, liposuction is safe and does little harm to patients, and adipose tissue regenerates easily and can be harvested repeatedly. ADSCs do not express major histocompatibility complex class II or the human leukocyte antigen DR. It has recently been reported that ADSCs exert negative immune regulation by acting on dendritic cells and take part in inducing immune tolerance. Moreover, when ADSCs are co-cultured with allogeneic peripheral blood monocytes, they do not stimulate a mixed lymphocyte response, which indicates that ADSCs can be used for allogeneic transplantation. Although the regulation of ADSC differentiation is not fully understood, the Wnt signaling pathway is known to regulate their adipogenic and osteogenic differentiation. Some scholars have also reported the successful repair of a patient's skull defect by autologous ADSC transplantation. The application of ADSCs may therefore be safe, and ADSCs may be stem cells that meet the requirements of regenerative medicine and are more suitable for tissue engineering than other stem cells.

(2) Direct cell therapy

ADSCs not only have multidirectional differentiation potential but also secrete a variety of cytokines, such as TGF-β, interleukin 6, interleukin 8, vascular endothelial growth factor, hepatocyte growth factor and tumor necrosis factor. ADSCs may therefore also promote tissue repair through paracrine cytokines.
Intravenously injected ADSCs can spontaneously gather at a lesion and take part in repairing the injury. The specific mechanism of this “homing” effect of ADSCs is unknown. It has been reported that the migration of ADSCs can be enhanced by increasing the expression of membrane type 1 matrix metalloproteinase (MT1-MMP) and promoting the activation of the MMP-2 zymogen. Recent studies have found that ADSCs can transfer new mitochondria to damaged cells and thereby rescue aerobic metabolism, which shows that ADSCs can take part in tissue repair through a variety of mechanisms.

(3) Vector for gene therapy

Because foreign genes are easily introduced into ADSCs in vitro and can be stably expressed in vivo, ADSCs can also be used as vectors for gene therapy. Some scholars have used ADSCs as drug-carrying cells and, through the “homing” effect, successfully inhibited the growth of mouse melanoma.

The morphological features extracted in this paper therefore have practical significance and can serve as a reference technique for the study of ADSC differentiation.


5 Conclusions

Morphological feature recognition is one of the key application areas of computer vision. Computer vision based on deep learning has strong feature recognition and analysis ability and has become a research trend in the field in recent years. This paper takes induced ADSCs as the research object and uses the improved VGG-19 classification network to recognize the morphological features of their different differentiation stages with high accuracy. The class activation mapping method is used to generate the visual display, which makes the recognition results visible, enhances their reliability, and provides a reference for recognizing the differentiation features of induced ADSCs.

Acknowledgment. This work was supported in part by the National Natural Science Foundation of China under Grant 92159102, Grant 11862006, Grant 62062034, Grant 62172160.

References

1. Sabol, R.A., Giacomelli, P., Beighley, A., Bunnell, B.A.: Concise review: adipose stem cells and cancer. Stem Cells 37, 1261–1266 (2019). https://doi.org/10.1002/stem.3050
2. Ebihara, M., Katayama, K.: Anomalous charge carrier decay spotted by clustering of a time-resolved microscopic phase image sequence. J. Phys. Chem. C 124, 23551–23557 (2020). https://doi.org/10.1021/acs.jpcc.0c07609
3. Zheng, T., Zhang, T.B., Zhang, W.X., et al.: Icariside II facilitates the differentiation of ADSCs to schwann cells and restores erectile dysfunction through regulation of miR-33/GDNF axis. Biomed. Pharmacother. 125, 10 (2020). https://doi.org/10.1016/j.biopha.2020.109888
4. Liu, M., Lei, Y.Z., Yu, L., et al.: Super-resolution optical microscopy using cylindrical vector beams. Nanophotonics 11, 3395–3420 (2022). https://doi.org/10.1515/nanoph-2022-0241
5. Wang, J.L., Zhang, J., et al.: Dual-color STED super-resolution microscope using a single laser source. J. Biophotonics 13, 10 (2020). https://doi.org/10.1002/jbio.202000057
6. Wang, Z., Zhao, Q.S., Li, X.Q., et al.: MYOD1 inhibits avian adipocyte differentiation via miRNA-206/KLF4 axis. J. Anim. Sci. Biotechnol. 12, 13 (2021). https://doi.org/10.1186/s40104-021-00579-x
7. Zongda, W., Xuan, S., et al.: How to ensure the confidentiality of electronic medical records on the cloud: A technical perspective. Computers in Biology and Medicine 147, 105726 (2022). https://doi.org/10.1016/j.compbiomed.2022.105726
8. Nachar, R., Inaty, E.: An effective segmentation method for iris recognition based on fuzzy logic using visible feature points. Multimed. Tools Appl. 81, 9803–9828 (2022). https://doi.org/10.1007/s11042-022-12204-8
9. Gu, Y., Vyas, K., et al.: Transfer recurrent feature learning for endomicroscopy image recognition. IEEE Trans. Med. Imaging 38, 791–801 (2019). https://doi.org/10.1109/TMI.2018.2872473
10. Ye, N., Yang, Y.F., et al.: Ghrelin promotes the osteogenic differentiation of rMSCs via miR-206 and the ERK1/2 pathway. Cytotechnology 72, 707–713 (2020). https://doi.org/10.1007/s10616-020-00413-8
11. Zongda, W., Shen, S., et al.: A dummy-based user privacy protection approach for text information retrieval. Knowl.-Based Syst. 195, 105679 (2020). https://doi.org/10.1016/j.knosys.2020.105679
12. Jahr, W., Velicky, P., et al.: Strategies to maximize performance in STimulated Emission Depletion (STED) nanoscopy of biological specimens. Methods 174, 27–41 (2020). https://doi.org/10.1016/j.ymeth.2019.07.019
13. Wang, S., et al.: Multi-scale context-guided deep network for automated lesion segmentation with endoscopy images of gastrointestinal tract. IEEE J. Biomed. Health Inform. 25, 514–525 (2021). https://doi.org/10.1109/JBHI.2020.2997760
14. Daradkeh, Y.I., et al.: Development of effective methods for structural image recognition using the principles of data granulation and apparatus of fuzzy logic. IEEE Access 9, 13417–13428 (2021). https://doi.org/10.1109/ACCESS.2021.3051625
15. Kurdi, B., Ababneh, N., et al.: Use of conditioned media (CM) and xeno-free serum substitute on human adipose-derived stem cells (ADSCs) differentiation into urothelial-like cells. PeerJ 9, e10890 (2021). https://doi.org/10.7717/peerj.10890

Micromechanical Properties Investigation of Rabbit Carotid Aneurysms by Atomic Force Microscopy

Guixue Wang1,2(✉), Jingtao Wang1, Xiangxiu Wang1,2, Juhui Qiu1,2, and Zhiyi Ye1,2

1 Key Laboratory for Biorheological Science and Technology of Ministry of Education, State and Local Joint Engineering Laboratory for Vascular Implants, Bioengineering College of Chongqing University, Chongqing 400030, China
[email protected]
2 JinFeng Laboratory, Chongqing 401329, China

Abstract. The lack of research on the micromechanical properties of cerebral aneurysms limits understanding of the mechanism of aneurysm rupture. The aim of this biomechanical study is to explore the differences in micromechanical properties between the fusiform aneurysm and the healthy artery. Rabbit models were used to study the hemodynamics of intracranial aneurysms. Atomic force microscopy (AFM) was applied to indent the media and adventitia of aneurysmal and healthy arteries. The elastic moduli of the media and adventitia were larger in aneurysms than in healthy arteries. This study provides valuable data for understanding the micromechanical properties of the fusiform aneurysm.

Keywords: Rabbit carotid aneurysm · Atomic force microscopy · Elasticity modulus · Finite element method · Biomechanics

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024 G. Wang et al. (Eds.): APCMBE 2023, IFMBE Proceedings 104, pp. 176–182, 2024. https://doi.org/10.1007/978-3-031-51485-2_20

1 Introduction

Intracranial aneurysms are classified morphologically as cystic or non-cystic [1, 2]. The fusiform aneurysm, a kind of non-cystic aneurysm, is a relatively rare type that accounts for about 3–13% of intracranial aneurysms [3]. Its structural features are loss of the elastic plates, degradation of the vascular media and proliferation of the vascular intima. Because they contain elastin and collagen, the medial and adventitial layers are responsible for the elasticity and tensile strength of the vessel. The components of the vascular wall change during the development of a fusiform aneurysm, and the tissue stiffness increases as a result. Biaxial tensile tests have measured the elastic moduli of the fusiform aneurysm, and previous similar studies have revealed that the micro-biomechanical properties of fusiform aneurysms and healthy arteries are clearly different [4]. AFM has been applied to measure the nanoscale biomechanical properties of artery tissue [5, 6]. However, there has been no research measuring the individual layers of fusiform aneurysms by AFM. Our team has applied AFM to study the biomechanical properties of cells and tissues over the past few years and has made a series of achievements, providing a good foundation for the present research [7–9].

The goal of this project is to understand the differences in micromechanical properties between the fusiform aneurysm and the healthy artery. The elastic moduli of the vascular media and outer layer in aneurysm and healthy artery samples were analyzed by AFM using a nanoscale probe.

2 Materials and Methods

2.1 Carotid Aneurysm Creation

Three rabbits were used to create aneurysms and another three served as controls. The rabbits were fasted and weighed before the experiment, and a muscle relaxant was injected intramuscularly at a dose of 0.25 mL/kg. Anesthesia was induced with 10% chloral hydrate (2.5 mL/kg, intraperitoneal). The right common carotid artery (RCCA) was located and separated along the lateral margin of the right sternocleidomastoid muscle, and a 2-cm segment of the RCCA was exposed and isolated. A 2 cm × 1.2 cm piece of aseptic latex sheet was carefully wrapped around this segment. Surgical sutures were used to ligate the latex sheet at both ends to form an arc-shaped groove 2 cm long. Then 1 mL of a solution composed of 0.5 mL of elastase (90 units) and 0.5 mL of CaCl2 (0.5 M) was injected into the groove, and the 2-cm segment of the RCCA was bathed for 20 min. The latex sheet was then removed and the artery was washed with normal saline. The surgical incision was sutured and the rabbits were allowed to recover from anesthesia.

2.2 Histological Examination

Histological examination was performed 3 weeks after the operation, when all rabbits were killed. Both healthy and aneurysm specimens were made into slices with a thickness of 5 µm and fixed in 4% paraformaldehyde for more than 24 h. Next, the sample slices were washed and dehydrated to make slices with a thickness of 4 µm. The samples were stained with H&E and Elastic Van Gieson (EVG) stains, respectively.

2.3 AFM Experiment

Each healthy or aneurysm sample was cut into flat circular sections with a height of 40 µm. The sections were attached to a petri dish containing PBS and measured by indentation with a pyramidal probe. Five points of the layer to be tested were measured, and ten indentation curves were recorded at each point, giving 50 curves in total. The same measurement method was used for healthy and aneurysm samples. The AFM experiment was conducted with a JPK NanoWizard II at room temperature [8]. The elastic modulus was calculated from the following equation [8]:

\[
E_{\mathrm{pyramidal}} = \frac{\sqrt{2}\,F\,(1-\nu^{2})}{\delta^{2}\tan\theta} \tag{1}
\]

where F is the indentation force, θ is the half-angle to the face of the probe, δ is the indentation depth of the probe, E_pyramidal is the elastic modulus obtained with the pyramidal probe, and ν is Poisson's ratio.
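A minimal numerical sketch of Eq. (1) follows. It assumes the equation has the form E = √2·F·(1 − ν²)/(δ²·tan θ), the usual contact model for a four-sided pyramidal indenter. The force, depth and half-angle values are hypothetical and chosen only to exercise the formula, and SI units are used throughout.

```python
import math

def pyramidal_modulus(force, depth, half_angle_deg, poisson):
    """Elastic modulus (Pa) from one point of an indentation curve, Eq. (1)."""
    tan_theta = math.tan(math.radians(half_angle_deg))
    return math.sqrt(2) * force * (1 - poisson**2) / (depth**2 * tan_theta)

def pyramidal_force(modulus, depth, half_angle_deg, poisson):
    """Inverse relation: force (N) needed to reach a given indentation depth."""
    tan_theta = math.tan(math.radians(half_angle_deg))
    return modulus * depth**2 * tan_theta / (math.sqrt(2) * (1 - poisson**2))

# Hypothetical reading: 0.5 nN at 100 nm depth, 20-degree half-angle, nu = 0.49.
E = pyramidal_modulus(0.5e-9, 100e-9, 20.0, 0.49)
print(f"E = {E / 1e3:.1f} kPa")
```

In the experiment described above, such a value would be obtained for each of the 50 curves per layer and then averaged.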


2.4 Finite-element Method Simulation

Simulation software (Abaqus/CAE 6.14-2) was used to examine the results of the AFM indentation test. The vessel model was assumed to be linearly elastic, with an elastic modulus of 2 kPa and a Poisson's ratio of 0.49, and its height was 40 µm. The pyramidal probe had a height of 2 µm and a radius of 60 nm. The indentation depth (axial displacement) of the rigid probe was 20 nm or 40 nm.

3 Results

3.1 Histology

HE staining clearly revealed the boundaries of the inner, middle and outer layers of the healthy arterial wall (Fig. 1A). The aneurysm wall became thin and lost its native vascular structure, which led to an enlarged lumen (Fig. 1B). Smooth muscle was abundant in the healthy arterial vessel (Fig. 1C). EVG staining showed that the elastic fibres of the aneurysm were destroyed (Fig. 1D).

Fig. 1. Histological sections of aneurysms and healthy arteries. A and B represent HE-stained sections of healthy arteries and aneurysms, respectively. C and D represent EVG-stained sections of healthy arteries and aneurysms, respectively.

3.2 Comparison Between Aneurysm Arteries and Healthy Arteries

The elastic moduli of the aneurysm were greater than those of the healthy arterial vessel (Fig. 2A and B). The elastic moduli of the media were 18.18 ± 35.80 kPa in the healthy artery and 44.53 ± 39.83 kPa in the aneurysm (Table 1). The elastic moduli of the adventitia were 7.529 ± 7.67 kPa in the healthy artery and 52.19 ± 43.91 kPa in the aneurysm. The biomechanical properties of the aneurysm were markedly altered, especially in the adventitia.


Fig. 2. Comparison between medium stratum and arterial adventitia elasticity modulus in aneurysm arteries and healthy arterial vessel. A: the media stratum elasticity modulus results. B: the arterial adventitia elasticity modulus results. The elasticity modulus results are shown as the mean ± SD; **** represents P < 0.0001.

Table 1. The average of elastic modulus of aneurysm and healthy artery

Sample                Elasticity modulus (kPa)
                      Medium      Adventitia
Aneurysm    Mean      44.53       52.19
Healthy     Mean      18.18       7.529
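The layer-wise differences implied by Table 1 can be checked with a short calculation. The sketch below only re-uses the four mean values from the table; the dictionary layout and variable names are illustrative.

```python
# Mean elastic moduli from Table 1, in kPa.
moduli = {
    "media": {"aneurysm": 44.53, "healthy": 18.18},
    "adventitia": {"aneurysm": 52.19, "healthy": 7.529},
}

summary = {}
for layer, values in moduli.items():
    increase = values["aneurysm"] - values["healthy"]  # absolute change in kPa
    ratio = values["aneurysm"] / values["healthy"]     # fold change
    summary[layer] = (round(increase, 2), round(ratio, 2))
    print(f"{layer}: +{increase:.2f} kPa, {ratio:.2f}-fold")
```

The increases of about 26 kPa in the media and 45 kPa in the adventitia match the figures quoted in the Discussion.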

3.3 Comparison of Indentation at Different Depths by FEM

The results revealed that stress concentration was more pronounced at the deeper indentation and that the stress increased with displacement (Fig. 3A and B). This indicates that AFM indentation is applicable to measuring the microscopic mechanical properties of arteries at the nanoscale.

Fig. 3. Simulation results of indentation depth at 40 and 20 nm. A represents the results of indentation depth at 40 nm. B represents the results of indentation depth at 20 nm.
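For orientation, the pyramidal contact model of Sect. 2.3 also predicts how the indentation force should scale between the two simulated depths. The following is a worked estimate only: it assumes Eq. (1) has the form E = √2·F·(1 − ν²)/(δ²·tan θ) and that E, ν and θ are the same at both depths. It is not a result of the FEM simulation itself.

```latex
F(\delta) = \frac{E\,\delta^{2}\tan\theta}{\sqrt{2}\,(1-\nu^{2})}
\quad\Longrightarrow\quad
\frac{F(40\ \mathrm{nm})}{F(20\ \mathrm{nm})} = \left(\frac{40}{20}\right)^{2} = 4
```

Under this model, doubling the indentation depth quadruples the required force, which is in line with the stronger stress concentration reported for the 40 nm case.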


4 Discussion

The arterial wall displays nonlinear elasticity and anisotropic mechanical behavior because of its structural shape and composition [10]. The carotid artery is composed of three main components, elastin, vascular smooth muscle and collagen, which are heterogeneously distributed throughout the artery. Elastin provides vascular compliance, and collagen provides the stretch resistance required at high pressure. In this study, elastase was used to create the aneurysm and causes the artery to dilate. After elastase dissolves and destroys the elastic layer of the arterial vessel, the compliance of the tissue is almost lost, which leads to an initial weakening of the tissue and an increased risk of rupture due to calcified deposits within the aneurysm.

Indentation experiments using AFM were performed to investigate micromechanical properties and to obtain more detailed microstructural mechanical data. Many previous studies investigated the macroscopic biomechanical properties of aneurysms, and most of those experiments rely on stress-strain results from uniaxial or biaxial tensile tests [11]. Such experiments characterize only the bulk biomechanical properties of blood vessels and cannot be closely related to their physiological structure. Given the anisotropy and heterogeneity of the aneurysm wall, tensile experiments also lack the spatial resolution needed to estimate its biomechanical parameters. Some micromechanical studies report that Young's modulus appears to be higher in the ruptured region of an aneurysm than in the unruptured part. The significant difference in Young's modulus demonstrates the weakening of tissue compliance.

In terms of the physiological structure of blood vessels, this study revealed that the mean Young's modulus of the media of aneurysms was 26 kPa higher than that of normal vessels, and that of the adventitia was 45 kPa higher. The increase in Young's modulus of the media mainly reflects the decrease in its internal elastin, while the decrease in collagen fibre is caused more by the damage done when calcium chloride enters the adventitia and forms deposits. Because the aneurysm was created by applying elastase and calcium chloride to the adventitia rather than by infusing elastase into the vessel interior, the increase in Young's modulus is somewhat higher in the adventitia than in the media. In addition, we found no obvious difference in Young's modulus between the media and adventitia of aneurysms, whereas a disparity exists in normal vessels, which indicates that in the normal physiological state the biomechanical properties of the vessel wall are intact and follow the structural components of each layer. The formation of aneurysms caused disorderly arrangement and disorganized connection of the fibres in each layer.


5 Conclusions

In summary, this study applied AFM to study the differences in micromechanical properties between aneurysm arteries and healthy arteries. For the aneurysm arteries, the experimental results revealed that the elasticity modulus of the media and adventitia increased. Histology confirmed the loss of the elastic plates, which led to the changes in mechanical properties. After aneurysm formation, the micromechanical properties of the vessel wall change significantly, and Young's modulus of the media and adventitia of the aneurysmal wall is higher than that of a normal vessel. This increase in elastic modulus may increase the risk of aneurysm rupture by increasing resistance to blood flow, resulting in material deposition and intravascular blockage. The increase in Young's modulus of the media and adventitia also means that the vessel wall loses its original elasticity; under the impact of blood flow, the vessel wall will continue to thin, and other complications may arise. Moreover, the FEM simulation results revealed that stress was more concentrated with increased indentation depth. Finally, our study provides valuable information for understanding the micromechanical properties of the fusiform aneurysm.

Acknowledgment. This study was supported by the National Natural Science Foundation of China (12032207, 31971242) and the Science and Technology Innovation Project of JinFeng Laboratory, Chongqing, China (jfkyjf202203001).

References

1. Fusiform aneurysm after surgery for Craniopharyngioma. J. Neurosurg. (1991). https://doi.org/10.3171/jns.1991.75.4.0670a
2. Shoji, T., Breuer, C., Shinoka, T.: Tissue-engineered vascular grafts for children. Tissue-Eng. Vasc. Grafts, 533–548 (2020). https://doi.org/10.1007/978-3-030-05336-9_19
3. Carver, W., Esch, A.M., Fowlkes, V., Goldsmith, E.C.: The biomechanical environment and impact on tissue fibrosis. Immune Response Implant. Mater. Devices 169–188 (2016). https://doi.org/10.1007/978-3-319-45433-7_9
4. Meekel, J.P., Mattei, G., Costache, V.S., et al.: A multilayer micromechanical elastic modulus measuring method in ex vivo human aneurysmal abdominal aortas. Acta Biomater. 96, 345–353 (2019). https://doi.org/10.1016/j.actbio.2019.07.019
5. How, T.V.: Mechanical properties of arteries and arterial grafts. Cardiovasc. Biomater. 1–35. https://doi.org/10.1007/978-1-4471-1847-3_1
6. Sicard, D., Haak, A.J., Choi, K.M., et al.: Aging and anatomical variations in lung tissue stiffness. Am. J. Physiol.-Lung Cell. Mol. Physiol. (2018). https://doi.org/10.1152/ajplung.00415.2017
7. Lundkvist, A., Lilleodden, E., Siekhaus, W., et al.: Viscoelastic properties of healthy human artery measured in saline solution by AFM-based indentation technique. MRS Proc. (1996). https://doi.org/10.1557/proc-436-353
8. Peng, X.B., Kristi, N., Gafur, A., et al.: Micromechanical property analyses of decellularized vessels by atomic force microscopy. J. Phys. D Appl. Phys. 52(42), 425401 (2019). https://doi.org/10.1088/1361-6463/ab33ce
9. Zhao, Y., Zang, G., Yin, T., et al.: A novel mechanism of inhibiting in-stent restenosis with arsenic trioxide drug-eluting stent: Enhancing contractile phenotype of vascular smooth muscle cells via YAP pathway. Bioactive Materials 6, 375–385 (2021). https://doi.org/10.1016/j.bioactmat.2020.08.018
10. Ben Bouallègue, F., Vauchot, F., Mariano-Goulart, D.: Comparative assessment of linear least-squares, nonlinear least-squares, and Patlak graphical method for regional and local quantitative tracer kinetic modeling in cerebral dynamic 18F-FDG PET. Med. Phys. 46, 1260–1271 (2019). https://doi.org/10.1002/mp.13366
11. Yadav, U., Ghosh, S.: An atomistic-based finite deformation continuum membrane model for monolayer transition metal dichalcogenides. J. Mech. Phys. Solids 168, 105033 (2022). https://doi.org/10.1016/j.jmps.2022.105033

The Development of the “Lab-In-Shoe” System Based on an Instrumented Footwear for High-Throughput Analysis of Gait Parameters

Ji Huang1, Xin Ma1,3, and Wen-Ming Chen1,2,3(B)

1 Academy for Engineering & Technology, Fudan University, Shanghai, China


2 Institute of Biomedical Engineering & Technology, Fudan University, Shanghai, China
3 Department of Orthopedics, Huashan Hospital Affiliated to Fudan University, Shanghai, China

Abstract. Acquisition of spatiotemporal gait parameters outside the laboratory remains a challenge in the current clinical setting. In this study, an instrumented footwear, namely the “Lab-in-shoe” system, was implemented based on an inertial measurement unit (IMU) and distributed plantar pressure sensors to facilitate high-throughput analysis of gait parameters. The system uses the plantar pressure sensor to determine the moment when the velocity of the IMU reaches zero, and then uses the zero-velocity update (ZUPT) algorithm to mitigate sensor drift effects. In addition, simultaneous visualization of foot positioning and plantar pressure data in gait is achieved, which allows clinicians to obtain more intuitive information about gait alterations. To confirm the reliability and validity of the system, the estimates of spatiotemporal parameters were compared with the gold standard optical motion capture system (Vicon). Inter-system validity was assessed by Bland-Altman plots, and reliability was assessed by intra-class correlation (ICC 2, k). The results showed that the “Lab-in-shoe” system has good consistency in stride length compared to the Vicon regardless of the walking speed (0.68–1.4 m/s), with an average error of approximately 1%. The ICC values were at least 0.961, indicating almost perfect agreement. The results suggest that the reliability of spatiotemporal gait parameters obtained with the “Lab-in-shoe” system is acceptable and offer fundamental evidence that the system could be used to quantify gait parameters in an open environment without the constraint of laboratory settings.

Keywords: Biomechanics · Spatiotemporal parameters · Gait · ZUPT · Inertial sensors

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024 G. Wang et al. (Eds.): APCMBE 2023, IFMBE Proceedings 104, pp. 183–191, 2024. https://doi.org/10.1007/978-3-031-51485-2_21

1 Introduction

Gait analysis plays an important role in the quantitative assessment of gait disorders due to aging-related diseases. Alterations in gait parameters such as walking speed, stride length, width, period, and plantar pressure patterns, as well as limb asymmetries, are closely associated with Parkinson’s disease [1], stroke [2, 3], and knee osteoarthritis [4]. Spatiotemporal gait parameters can be obtained using commercial equipment, such as optical motion capture systems (VICON), force plates (AMTI), and electronic walkways (GAITRite). However, these types of equipment are expensive, confined to laboratories, and not easily accessible to patients for daily use.

With the development of miniature inertial measurement units (IMU), many researchers have tried to obtain gait parameters through IMU-based wearables. One method is to establish correlations between gait parameters and inertial sensor measurements through machine learning [5]. However, this approach is not readily applicable to patients with gait abnormalities or individuals with large variability in gait function. Another method is to directly integrate the acceleration and angular velocity data to obtain the displacement and direction of the foot stride [6]. The direct-integration method is less likely to be affected by abnormal gait patterns of the user, but the drift problems of inertial sensors are hard to tackle. The drift in orientation may be reduced by the Kalman filter, and the acceleration drift may be addressed with the zero-velocity update (ZUPT) algorithm [7, 8]. However, kinetic data, in particular dynamic plantar pressure data, are usually unavailable in current IMU-based wearables, which significantly limits their clinical impact.

In this study, an instrumented footwear, called the “Lab-in-shoe” system, was developed by combining inertial with distributed pressure sensors. The pressure sensor can be placed in-shoe to accurately delineate the temporal phases of gait, in particular the intervals when the velocity reaches zero. Thus, with the aid of the ZUPT algorithm, we were able to obtain a large amount of gait data without the drift effects.
The proposed wearable gait detection system “Lab-in-shoe” and data analysis method can be used to quantify walking performance in the daily life of healthy and diseased people in an open environment without laboratory space limitations.

2 Materials and Methods

2.1 “Lab-in-shoe” Design

The developed “Lab-in-shoe” system consists of a 9-DOF inertial measurement unit (IMU, WitiMotion Shenzhen Ltd) placed on the dorsal side of the shoe and an array of pressure-sensing films placed in-shoe in direct contact with the plantar surface (Fig. 1). An insole-based plantar pressure mapping sensor was customized for the “Lab-in-shoe” gait detection system. This plantar pressure sensor is composed of polyester film with excellent comprehensive mechanical properties, highly conductive materials, and nano-level pressure-sensitive materials. It consists of two layers: the bottom layer is a flexible film with a conductive layer laminated on it, and the top layer is a flexible film with a pressure-sensitive material laminated on it. When the force-sensing area is pressed, the disconnected lines of the bottom layer are connected through the pressure-sensitive layer of the top layer, and the output resistance decreases with increasing pressure. As shown in Fig. 1, this sensor uses a 15-row, 7-column sensing area layout with a total of 22 output interface leads connected to the data acquisition unit. There are a total of 99 independent sensing zones on the sensor.


Fig. 1. The schematics of the “Lab-in-shoe” system in the optical motion capture lab. The IMU’s coordinate system is shown as x′, y′ and z′, while the coordinate system of the walking direction is shown as x, y and z.

2.2 Implementation of the Algorithm

The algorithm for the reproduction of the foot position during gait was developed based on four main components: (a) gait phase division, (b) orientation estimation and acceleration transformation, (c) zero-velocity update, and (d) decoupling of the stride width from the stride length, as shown in Fig. 2. The details are as follows. Firstly, the gait phase is delineated according to the temporal patterns of pressure distribution. Secondly, the attitude of the IMU is solved by a Kalman filter based on the angular velocity, gravity vector and magnetic field distribution. The roll, pitch and yaw angles are converted into a rotation matrix that transforms the acceleration data into the Earth coordinate system. Thirdly, the velocity is calculated by integrating the transformed acceleration over each stride. A zero-velocity update is then applied: a linear error resulting from the accumulation of sensor noise is subtracted from the previously calculated velocity [7]. The ZUPT algorithm requires zero-velocity intervals to be detected in repetitive walking trials. Conventional foot-mounted IMUs often have large displacement errors due to poor detection of the zero-velocity intervals. In this study, the temporal patterns of pressure distribution data obtained in-shoe were used to delineate the temporal phases of gait, such that the zero-velocity intervals can be accurately detected. Finally, the trajectory of the foot is calculated by integrating the corrected velocity. The motion of the IMU in the Earth coordinate system is obtained with the sensor fusion algorithm described above, but the motion of the IMU relative to the forward direction is still unknown. Therefore, principal component analysis was used to decouple the displacement in the stride length direction from that in the stride width direction.
Specifically, if the user’s walking route is straight, the principal direction of the motion trajectory can be regarded as the forward direction, which is also the direction of stride length. All these calculations are implemented in custom MATLAB code (MathWorks, Inc., Natick, MA). By combining the plantar pressure distribution and the foot trajectory, the foot position during gait can be reproduced. A large number of gait parameters can then be calculated, including stride length, stride width, stride speed, support phase time, plantar pressure distribution and the COP (center of pressure) curve, as shown in Fig. 3 and Tables 1 and 2.

Fig. 2. Flow chart of the algorithm for reproduction of the foot position data in gait. The algorithm consists of four components: (a) gait phase division, (b) orientation estimation and acceleration transformation, (c) zero-velocity update, and (d) decoupling of the stride width from the stride length.
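The zero-velocity update described in Sect. 2.2 can be illustrated with a minimal Python sketch. It is not the authors’ MATLAB implementation: the function names, the pressure threshold, and the assumptions that the acceleration is already expressed in the Earth frame with gravity removed and that each stride starts and ends at a pressure-detected zero-velocity instant are all hypothetical choices made for illustration.

```python
def detect_stance(total_pressure, threshold):
    """Flag samples where the summed plantar pressure exceeds a threshold.

    The threshold is a hypothetical parameter; the foot is assumed to be
    at rest (zero velocity) wherever the flag is True.
    """
    return [p > threshold for p in total_pressure]


def zupt_velocity(accel, dt):
    """Integrate one stride of acceleration and remove linear drift.

    accel: Earth-frame acceleration samples (gravity removed) for one axis,
           spanning a stride that starts and ends at zero velocity.
    dt:    sampling interval in seconds.
    """
    # Cumulative integration, starting from rest.
    vel = [0.0]
    for a in accel[1:]:
        vel.append(vel[-1] + a * dt)
    n = len(vel) - 1
    if n == 0:
        return vel
    # Any velocity left at the end of the stride is treated as drift that
    # grew linearly in time, and that ramp is subtracted.
    drift_end = vel[-1]
    return [v - drift_end * i / n for i, v in enumerate(vel)]


def integrate(values, dt):
    """Running trapezoidal integral, e.g. velocity to displacement."""
    out = [0.0]
    for prev, cur in zip(values, values[1:]):
        out.append(out[-1] + 0.5 * (prev + cur) * dt)
    return out
```

With a constant accelerometer bias and no true motion, `zupt_velocity` returns a velocity that is zero throughout, so the integrated displacement no longer grows with time.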

Fig. 3. A typical reproduction of foot positioning and plantar pressure data in gait as well as the automatically detected gait phases
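The principal-component decoupling of stride length and stride width described in Sect. 2.2 can be sketched as follows. This is an illustrative Python example under the stated straight-walking assumption, not the authors’ code; the function names are hypothetical, and the closed-form orientation of the major axis of a 2 × 2 covariance matrix is used in place of a general PCA routine.

```python
import math


def principal_direction(xs, ys):
    """Unit vector along the first principal component of 2D positions."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs) / n
    syy = sum((y - my) ** 2 for y in ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    # Orientation of the major axis of the 2x2 covariance matrix.
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return math.cos(theta), math.sin(theta)


def decouple(xs, ys):
    """Split horizontal foot positions into forward and lateral components.

    The forward axis (stride length direction) is the principal direction
    of the trajectory; the lateral axis (stride width direction) is
    perpendicular to it. Positions are measured from the first sample.
    """
    fx, fy = principal_direction(xs, ys)
    x0, y0 = xs[0], ys[0]
    # PCA leaves the sign of the axis undetermined; orient it so that the
    # net displacement of the walk is positive.
    if (xs[-1] - x0) * fx + (ys[-1] - y0) * fy < 0:
        fx, fy = -fx, -fy
    forward = [(x - x0) * fx + (y - y0) * fy for x, y in zip(xs, ys)]
    lateral = [-(x - x0) * fy + (y - y0) * fx for x, y in zip(xs, ys)]
    return forward, lateral
```

For a trajectory lying exactly on a straight line, the lateral component is zero and the forward component equals the distance walked, whatever the heading of the line in the Earth frame.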

2.3 Experimental Validations

To confirm the validity of the proposed system, the estimates of spatiotemporal parameters were compared with the gold standard Vicon motion capture system. Two young healthy males (age: 24.5 ± 0.5 years; height: 176.0 ± 2 cm) participated in the study. After calibrating the Vicon cameras and preparing the system for data acquisition in an 8-m-long motion capture lab, we placed the IMUs, insole-based plantar pressure mapping sensors and markers on the shoes, as shown in Fig. 1. Afterwards, we asked participants to walk straight within the capture volume (about 4 m), during which both Vicon and “Lab-in-shoe” data were collected. At the beginning of the experiment, the participant stomped his right foot to enable later event synchronization of the Vicon data and the “Lab-in-shoe” data. Participants repeated the experiments three times at three different self-selected walking speeds: comfortable (1.1 ± 0.07 m/s), slower (0.68 ± 0.05 m/s) and faster (1.4 ± 0.13 m/s).

Table 1. Bilateral gait parameters obtained from “Lab-in-shoe” system

Bilateral parameters    Left (n = 15)    Right (n = 14)
Cycle time (s)          1.14 ± 0.07      1.13 ± 0.09
Stride length (m)       1.19 ± 0.15      1.21 ± 0.09
Swing (%GC)             40.81 ± 4.52     41.03 ± 1.76
Support (%GC)           59.19 ± 4.52     58.97 ± 1.76

Table 2. Gait parameters obtained from “Lab-in-shoe” system

Parameters                   Value
Stride width (m)             0.27 ± 0.17
Single support (%GC)         78.57%
Double support (%GC)         21.43%
Distance (m)                 17.32
Velocity (m/s)               1.21
Step frequency (step/min)    105

3 Results

3.1 Validity of the System

Validity between the gait detection system “Lab-in-shoe” and the gold standard optical motion capture system (Vicon) was evaluated using Bland-Altman 95% limits of agreement, as shown in Fig. 4. Bland-Altman plots for all outcomes (n = 67) showed a very small bias of 0.03 m, approximately 3% relative error of stride length. The majority of the “Lab-in-shoe” measures fall within the 95% limits of agreement, demonstrating good validity. In addition, the differences in the measurements between the two systems at slow speeds are more concentrated around zero, while those at fast speeds appear more diffuse.

Fig. 4. Bland-Altman plots for stride length from the “Lab-in-shoe” system and the gold standard motion capture system (Vicon)

3.2 Measurement Reliability Analysis

Reliability was assessed with intraclass correlation (ICC 2, k). As shown in Fig. 5, the stride length from “Lab-in-shoe” is plotted against the motion capture measurement, along with the intra-class correlation (ICC) and the corresponding linear fit. Perfect agreement would yield a line of unity slope. The ICC value for the stride length was 0.99 at a lower speed (n = 22), 0.984 at a comfortable speed (n = 28), and 0.961 at a faster speed (n = 17). ICC values were interpreted as >0.75 being excellent, 0.40–0.75 as good, and <0.40 as poor [9]. These values indicate that the “Lab-in-shoe” estimates of the gait parameters agree well with the motion capture measurements, regardless of the walking speed. However, the ICC value decreases as the walking speed increases.
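The two agreement statistics used in Sects. 3.1 and 3.2 can be sketched in Python as follows. This is an illustrative example only: the function names are hypothetical, the 1.96 multiplier assumes approximately normally distributed differences, and ICC(2,k) is computed with the standard two-way random-effects, absolute-agreement, average-measures formula of Shrout and Fleiss rather than taken from the authors’ code.

```python
import math


def bland_altman(a, b):
    """Return (bias, lower limit, upper limit) of agreement between a and b."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    bias = sum(diffs) / n
    # Sample standard deviation of the paired differences.
    sd = math.sqrt(sum((d - bias) ** 2 for d in diffs) / (n - 1))
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


def icc_2k(ratings):
    """ICC(2,k) for a table with one row per stride and one column per system."""
    n = len(ratings)
    k = len(ratings[0])
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]
    # Two-way ANOVA sums of squares without replication.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (ms_rows + (ms_cols - ms_err) / n)
```

Here each row of `ratings` would hold one stride length as measured by the two systems, for example `[lab_in_shoe, vicon]`. A constant offset between the systems lowers ICC(2,k) because the formula penalizes absolute disagreement, not only poor correlation.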

4 Discussion

Most existing studies have focused on inertial sensors to measure gait kinematic parameters [11, 12], but little has been done to incorporate plantar pressure sensors into a single platform [13]. However, changes in dynamic plantar pressure data have been associated with many diseases, especially in neurological diseases such as diabetes, stroke and Parkinson’s disease [14]. This study presented an instrumented footwear as a wearable gait detection system. The main advantage of the system is the combination of plantar pressure distribution sensors with the IMU, which enriches the data available from current gait detection systems and simplifies finding the zero-velocity moment for the application of the ZUPT algorithm. In addition, by simultaneously reproducing the foot positioning and plantar pressure data, one can easily obtain more intuitive information about subtle alterations in gait parameters. We used the “Lab-in-shoe” system to quantify the gait parameters of a healthy subject (age: 25 years, height: 178 cm) in an open environment over a long walking distance (17.32 m) without the constraint of the laboratory environment; the results are shown in Tables 1 and 2. The gait parameters obtained are generally consistent with the previous study by Huang et al. [10], except for the stride width data, which is due to the unavoidable cumulative error in the stride width. To some extent, this also confirmed the measurement accuracy of the developed system. It is interesting to note that, contrary to our findings, the gait data obtained in the previous study by Ma et al. [15] were shown to be more accurate at a faster walking speed. In our study, however, the error increased with increasing walking speed. This may be attributed to differences in detecting zero-velocity intervals. The study by Ma et al. relied on IMU data to judge the zero-velocity moment, which is more evident in IMU data at faster walking speeds. In contrast, we estimate the zero-velocity moment by using the plantar pressure data, and due to the sampling rate of the plantar pressure sensor, the accuracy of the zero-velocity moment in fast walking may be compromised.

A limitation of this study is that the experiments were conducted in healthy adults; further studies should be conducted on the elderly and patients with gait disorders to consolidate the current results. Caution should also be exercised when extrapolating the findings from this study to other populations.

Fig. 5. Correlation analysis between stride measurements using “Lab-in-shoe” and the gold standard motion capture system (Vicon).

5 Conclusions

In conclusion, the “Lab-in-shoe” system was developed for wearable gait detection, and the system can obtain both kinematic and kinetic gait parameters with acceptable reliability and validity in healthy individuals. In addition, the system can visualize the spatiotemporal parameters and plantar pressure data simultaneously, which may aid clinical diagnosis and fast assessment of gait disorders. In the future, the system should also be tested for longer time periods in both the elderly and gait-impaired populations to fully understand its clinical efficacy.

Acknowledgment. This study was funded by the National Key R&D Program of China (2021YFC200235), the Shanghai Science and Technology Development Funds (No. 20S31901000 & No. 21511102200), and the Medical Engineering Fund of Fudan University (No. yg2021–019 & yg2022–5).

References

1. Sofuwa, O., Nieuwboer, A., Desloovere, K., Willems, A.M., Chavret, F., Jonkers, I.: Quantitative gait analysis in Parkinson’s disease: comparison with a healthy control group. Arch. Phys. Med. Rehabil. 86(5), 1007–1013 (2005)
2. Boudarham, J., Roche, N., Pradon, D., Bonnyaud, C., Bensmail, D., Zory, R.: Variations in kinematics during clinical gait analysis in stroke patients. PLoS ONE 8(6), e66421 (2013)
3. Mohan, D.M., Khandoker, A.H., Wasti, S.A., Alali, I.I.I., S., Jelinek, H.F., Khalaf, K.: Assessment methods of post-stroke gait: A scoping review of technology-driven approaches to gait characterization and analysis. Front. Neurol. 12, 650024 (2021)
4. Kaufman, K.R., Hughes, C., Morrey, B.F., Morrey, M., An, K.N.: Gait characteristics of patients with knee osteoarthritis. J. Biomech. 34(7), 907–915 (2001)
5. Baghdadi, A., Megahed, F.M., Esfahani, E.T., Cavuoto, L.A.: A machine learning approach to detect changes in gait parameters following a fatiguing occupational task. Ergonomics 61(8), 1116–1129 (2018)
6. Jiménez, A.R., Seco, F., Zampella, F., Prieto, J.C., Guevara, J.: PDR with a foot-mounted IMU and ramp detection. Sensors 11(10), 9393–9410 (2011)
7. Foxlin, E.: Pedestrian tracking with shoe-mounted inertial sensors. IEEE Comput. Graphics Appl. 25(6), 38–46 (2005)
8. Rebula, J.R., Ojeda, L.V., Adamczyk, P.G., Kuo, A.D.: Measurement of foot placement and its variability with inertial sensors. Gait Posture 38(4), 974–980 (2013)
9. Lynall, R.C., Zukowski, L.A., Plummer, P., Mihalik, J.P.: Reliability and validity of the protokinetics movement analysis software in measuring center of pressure during walking. Gait Posture 52, 308–311 (2017)
10. Huang, P., Zhong, H.M., Chen, B., Qi, J., Qian, N.D., Deng, L.F.: Three-dimensional gait analysis in normal young adults: temporal, kinematic and mechanical parameters. Chin. J. Tissue Eng. Res. 19(24), 3882 (2015)
11. Lu, C., Uchiyama, H., Thomas, D., Shimada, A., Taniguchi, R.I.: Indoor positioning system based on chest-mounted IMU. Sensors 19(2), 420 (2019)
12. Guimarães, V., Sousa, I., Correia, M.V.: Orientation-invariant spatio-temporal gait analysis using foot-worn inertial sensors. Sensors 21(11), 3940 (2021)
13. Lin, F., Wang, A., Zhuang, Y., Tomita, M.R., Xu, W.: Smart insole: A wearable sensor device for unobtrusive gait monitoring in daily life. IEEE Trans. Industr. Inf. 12(6), 2281–2291 (2016)
14. Shalin, G., Pardoel, S., Nantel, J., Lemaire, E.D., Kofman, J.: Prediction of freezing of gait in Parkinson’s disease from foot plantar-pressure arrays using a convolutional neural network. In: 2020 42nd Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC), pp. 244–247. IEEE (2020)
15. Ma, M., Song, Q., Gu, Y., Li, Y., Zhou, Z.: An adaptive zero velocity detection algorithm based on multi-sensor fusion for a pedestrian navigation system. Sensors 18(10), 3261 (2018)

3D-Printed Insole Designs for Enhanced Pressure-Relief in Diabetic Foot Based on Functionally-Graded Stiffness Properties

Xingyu Zhang1, Pengfei Chu1, Xin Ma1,3, and Wen-Ming Chen1,2,3(B)

1 Academy for Engineering & Technology, Fudan University, Shanghai, China


2 Institute of Biomedical Engineering & Technology, Fudan University, Shanghai, China
3 Department of Orthopedics, Huashan Hospital Affiliated to Fudan University, Shanghai, China

Abstract. To effectively reduce the incidence of foot ulcerations in people with diabetes, a novel workflow was proposed to design, manufacture, and evaluate offloading insoles for the diabetic foot. In this study, the stiffness and morphological aspects of the insole were altered to accommodate the diabetic foot, and four types of insoles were designed, namely the ordinary flat insole, the flat insole with graded-stiffness, the total contact insole (TCI) and the TCI with graded-stiffness. Porous elements with different elastic moduli were designed and used in different sub-regions of the insoles to realize functionally-graded stiffness properties. The relationship between the structural parameters and the modulus was determined through mechanical tests and finite element analysis. The insoles were manufactured using fused deposition modelling (FDM) 3D-printing technology, and their unloading capacities were evaluated using patient-specific plantar pressure analysis. The results revealed that the TCI with graded-stiffness achieved the best offloading effect, reducing the peak pressure of the foot by 52.8% in the static standing state and the pressure-time integral by 18.43% in gait conditions. The methodology framework built in the study can serve as a useful foundation for producing functional insole designs that meet patient-specific needs. Such functionally-graded-stiffness insoles can potentially be made as an off-the-shelf product, which is critical for the prevention and treatment of foot ulcers in people with diabetes.

Keywords: Biomechanics · Diabetic foot · Plantar pressure · Graded-stiffness · 3D printing

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024 G. Wang et al. (Eds.): APCMBE 2023, IFMBE Proceedings 104, pp. 192–199, 2024. https://doi.org/10.1007/978-3-031-51485-2_22

1 Introduction

For diabetic patients who are at high risk of foot ulceration, excessive plantar pressure is one of the key contributing factors to ulcer formation [1, 2]. Orthoses, such as therapeutic insoles, can relieve the symptoms and protect patients by redistributing the contact loads to reduce the focal pressures under the foot [2–5]. Recent clinical trials show that ulcer recurrence risk could be reduced by orthoses when the peak plantar pressure and pressure-time integral could be reduced by 13.6–44% (especially in high-risk areas vulnerable to ulcers) and 33–51%, respectively [3–8].


The main design factors for diabetic foot insoles include geometrical contour and stiffness characteristics [3, 4]. It has been well documented that, compared to ordinary flat insoles, total contact insoles (TCI) can be more effective in reducing the peak plantar pressure (by up to 56.8%) [6]. More specifically, the reduced pressure could be redistributed to the midfoot region, which is known as a low-risk area for ulcerations in these patients [7]. However, the diabetic foot may be affected by structural deformity (such as Charcot fractures and clawed toes), plantar tissue stiffness (due to glycosylation), gait patterns, and aging (decreased tissue thickness) [2]. These pathological features of the diabetic foot may result in considerably localized pressure with high patient-specificity, causing abnormal stress concentration, especially in the forefoot and rearfoot regions. In this case, we cannot simply adjust the geometric contour of the insole as a whole [3, 4]. In contrast, the use of regional structural or material designs with graded-stiffness may help in decreasing the pressure peaks in a specific area of the foot. The mechanism is to place lower-stiffness materials under the highly pressurized area, causing adjacent elements to share the loads [2, 5]. In addition, the combination of geometry and graded-stiffness properties may contribute more to the stress reduction at the foot-insole interface than either factor alone (e.g., adding graded-stiffness to the TCI can result in an additional 5–10% reduction in pressure) [3, 4, 8], which has brought more opportunities to further optimize the insoles. Unfortunately, the clinical use of insoles with graded-stiffness remains constrained because of the high patient-specificity required in manufacturing.
With the development of 3D printing technology, such as fused deposition modeling (FDM), it is now possible to use porous microstructures to fabricate structures with different mechanical properties from homogeneous soft materials [9]. In this work, we designed, manufactured, and evaluated therapeutic insoles with customized geometrical contour and functionally-graded stiffness properties for people with diabetes. The goal is to explore the feasibility of producing novel insole designs with enhanced pressure-reducing capacities at a patient-specific level.

2 Materials and Methods

2.1 3D Geometry of the Diabetic Foot Insoles

An optical scanning device UPOD-S (ScanPod3D Inc., Canada) was used to acquire the geometric contour of the foot of a male diabetic patient (height 185 cm, weight 65 kg), following approval by the Ethics Committee of Huashan Hospital. A flat insole matching the footbed size was drawn based on anthropometric data. Then, the mesh corresponding to the flat insole was mapped onto the plantar surface of the patient’s foot (Fig. 1). The deformed mesh defined the customized upper surface of the total contact insole (TCI).

Fig. 1. Geometrical design workflow of the insoles. (a) Plantar contour and flat insole, (b) mesh mapping procedure, (c) TCI.

2.2 Functionally-Graded Stiffness Design

The thermoplastic urethane (TPU) used in FDM is a homogeneous, isotropic, and soft elastic material with a modulus of 11.7 MPa and a Poisson’s ratio of 0.45. However, with a porous structural design, the mechanical properties of the insole can be adjusted by changing the topology of the porous structure, i.e., the unit type and the dimensions of the rod’s cross-section. We then used FE analysis and experimental tests to validate the calculated mechanical properties of the designed structures. An assembly model with 64 porous units (8 × 8 × 1) was created using ABAQUS and manufactured by FDM 3D printing. A 3D tetrahedral mesh was generated for the entire model using the FE pre-processor Hypermesh (Altair). The average element edge length was set at 2 mm based on the model convergence analysis [10]. The stress-strain curve of the assembly was determined by mechanical tests. An Instron 5966 universal testing machine (Instron Corporation, USA) was used to perform mechanical tests according to the ISO 604–2002 standard with a compression speed of 1 mm/min [11] and a sampling frequency of 50 Hz (Fig. 2a–c). The porous units with different cross-sectional diameters of support rods are shown in Fig. 2d.
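One common way to turn a measured compressive stress-strain curve into an equivalent elastic modulus is a least-squares line fit over the initial, approximately linear part of the curve. The Python sketch below illustrates that step only; the function name and the strain limit of 0.05 are hypothetical choices, and the source does not state how the authors extracted the modulus from their test data.

```python
def compressive_modulus(strain, stress, max_strain=0.05):
    """Least-squares slope of stress versus strain up to max_strain.

    strain:     nominal strain samples (dimensionless).
    stress:     nominal stress samples, e.g. in MPa.
    max_strain: upper bound of the region assumed to be linear
                (a hypothetical default, to be chosen from the data).
    Returns the slope in the units of stress.
    """
    pts = [(e, s) for e, s in zip(strain, stress) if e <= max_strain]
    n = len(pts)
    if n < 2:
        raise ValueError("need at least two points in the linear region")
    mean_e = sum(e for e, _ in pts) / n
    mean_s = sum(s for _, s in pts) / n
    num = sum((e - mean_e) * (s - mean_s) for e, s in pts)
    den = sum((e - mean_e) ** 2 for e, _ in pts)
    if den == 0:
        raise ValueError("strain values in the linear region are identical")
    return num / den
```

Restricting the fit to small strains keeps the densification stage of a porous specimen, where rods start to touch, from inflating the estimate.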

Fig. 2. Mechanical tests on porous specimens manufactured by the FDM printing. The nominal strain levels up to 0.5 were achieved (a−c). Dimension (i.e., diameter) of the rod in the porous structure (d).

The experimental stress-strain curves of the specimens at different rod diameters are shown in Fig. 3. In this study, five porous structural units were designed. The FE-predicted modulus is used as a reference. It can be seen from Fig. 4 that the equivalent elastic modulus of the porous unit is related to the cross-sectional size (i.e., rod diameter). The root-mean-square error (RMSE) between the two curves was approximately 0.03 MPa. The inconsistency appears at large deformation, presumably because the spacing between the rods becomes smaller and local fusion occurs in the large-rod cases. These five structural units then formed a design library for assembling the insole model.

Fig. 3. Stress-strain curves of the specimens composed of porous units designed at different rod diameters (d = 1.6 mm, 1.8 mm, 2.0 mm).

Fig. 4. Mechanical testing results compared with the FE predictions for design validation purposes.

2.3 Patient-Specific Plantar Pressure Analysis

Experiments to measure foot-insole contact pressure were conducted with film-type pressure sensors (XSensor Inc., Canada). The measurement system had 230 flexible pressure sensors evenly distributed over the surface area, with a thickness of (1.8 ± 0.2) mm. The sampling frequency was set to 100 Hz, and the load measurement range was from 7 to 880 kPa. All four types of fabricated insoles were evaluated: the ordinary flat insole, the flat insole with graded-stiffness, the TCI and the TCI with graded-stiffness. In a previous study, a method was proposed to adapt materials with graded-stiffness using iterative optimization [5]. The results showed that the porous units with the lowest modulus should be placed in the high-pressure zones, with a ring of high-modulus cells at the boundary of each zone. Based on the plantar pressure distribution at static standing (Fig. 5a), the insole was divided into zones according to the graded variation of plantar pressure. Porous units with different moduli were used in different regions to obtain the flat insole with graded-stiffness (Fig. 5b). The TCI with graded-stiffness was obtained through a Boolean operation with the TCI (Fig. 5c).
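The zoning rule summarized in Sect. 2.3 (softest units under high pressure, a ring of stiff units around each such zone) can be illustrated with a small Python sketch on a pressure grid. This is only a schematic reading of that rule: the function name, the labels, the 4-neighbour definition of the ring, and the 70% of peak threshold are hypothetical, and the authors’ actual zone boundaries were derived from the measured pressure map rather than from this code.

```python
def assign_units(pressure, high_fraction=0.7):
    """Label each grid cell as 'soft', 'stiff' or 'medium'.

    pressure:      2D list of plantar pressures (rows of equal length).
    high_fraction: cells at or above this fraction of the peak pressure
                   count as high-pressure (hypothetical threshold).
    """
    rows, cols = len(pressure), len(pressure[0])
    peak = max(max(row) for row in pressure)
    high = [[p >= high_fraction * peak for p in row] for row in pressure]
    labels = []
    for r in range(rows):
        row_labels = []
        for c in range(cols):
            if high[r][c]:
                # Lowest-modulus unit directly under the pressure peak.
                row_labels.append("soft")
                continue
            neighbours = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
            touches_high = any(
                0 <= i < rows and 0 <= j < cols and high[i][j]
                for i, j in neighbours
            )
            # High-modulus ring around the zone, intermediate units elsewhere.
            row_labels.append("stiff" if touches_high else "medium")
        labels.append(row_labels)
    return labels
```

In a design library such as the five porous units described above, each label would then be mapped to a rod diameter, with the smallest diameter assigned to the "soft" cells.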

Fig. 5. Design process of flat insoles with graded-stiffness and TCI with graded-stiffness: (a) plantar pressure distributions measured by film-type pressure sensors, (b) material properties of different units in flat insoles with graded-stiffness, (c) TCI with graded-stiffness properties.

3 Results

During static measurements, the patient was instructed to maintain a standing posture, and plantar pressure distribution data were acquired with the four types of designed insoles. Dynamic walking tests were performed at a constant speed of 5 km/h on a treadmill. The experimental results showed that, compared with the ordinary flat insole, the static peak plantar pressure decreased by 38.5, 41.1 and 52.8% and the contact area increased by 19.3, 52.4 and 54.8% for the flat insole with graded-stiffness, the TCI and the TCI with graded-stiffness, respectively (Fig. 6). In static conditions, the TCI with graded-stiffness properties has the greatest pressure-relieving capacity in terms of the reduction of peak plantar pressure during stance. In the gait conditions, the complete gait cycle was divided and then normalized to obtain the force-time integral. Compared with the ordinary flat insole, the time integral of the peak pressure decreased by 8.84, 2.47, and 18.43% for the flat insole with graded-stiffness, the TCI and the TCI with graded-stiffness, respectively (Fig. 7).
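The time integral of the peak plantar pressure over a gait cycle, and its percentage reduction relative to the flat insole, can be computed as in the following Python sketch. The trapezoidal rule and the function names are assumptions made for illustration; the 100 Hz default matches the sampling frequency stated in Sect. 2.3, but the authors’ exact normalization of the gait cycle is not reproduced here.

```python
def pressure_time_integral(peak_pressure_kpa, sample_rate_hz=100.0):
    """Trapezoidal integral of a peak-pressure trace, in kPa·s."""
    dt = 1.0 / sample_rate_hz
    return sum(
        0.5 * (a + b) * dt
        for a, b in zip(peak_pressure_kpa, peak_pressure_kpa[1:])
    )


def percent_reduction(baseline, value):
    """Reduction of value relative to baseline, in percent."""
    return 100.0 * (baseline - value) / baseline
```

For example, a trace held at 100 kPa for one second (101 samples at 100 Hz) integrates to 100 kPa·s, and halving an integral corresponds to a 50% reduction.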

Fig. 6. Static plantar pressure distributions with different types of insoles.

Fig. 7. The fluctuation of the peak plantar pressure during a gait cycle (about 450 ms) when the subject wore different insoles.


4 Discussions
This paper presented a novel workflow for designing, manufacturing, and evaluating patient-specific insoles for people with diabetes. The main strength of the study is the use of porous microstructures as building blocks to realize variable stiffness properties for therapeutic insoles that can effectively relieve regional focal pressure under the patient’s foot. Existing studies have shown that TCI can reduce the peak contact pressure during static standing by 30−40% (vs. 41.1% in this study) [12]. In our experiments, the TCI with graded-stiffness produced a 52.8% reduction in static peak pressure (in relative terms, 37.14 and 28.47% more than the flat insole with graded-stiffness and the TCI, respectively) and a 54.8% increase in the contact area (183.94 and 4.58% more than the flat insole with graded-stiffness and the TCI, respectively). While these two methods could both reduce the volumetric exposure of the soft tissue to stresses, they have different mechanisms. Graded-stiffness redistributes the heel stresses via large deformation under the bony prominence (where high pressure will occur, i.e., the high-risk area), allowing adjacent elements to share the loads. This results in the so-called accommodative behavior of shoes for the neuropathic foot [2]. In contrast, the TCI increases the contact area, especially in the arch region, to reduce the mean pressure acting on the entire plantar surface rather than on a specific area [3, 4]. Because the mechanisms differ, the two approaches do not conflict when combined but rather enhance unloading [2–4]. Cheung and Zhang found that the geometry contour was the dominant factor in determining the peak contact pressure, and the stiffness was the secondary factor [13]. The optimal combination of the two factors remains to be explored in the future.

In the recent literature, the graded-stiffness design has attracted increasing research interest. Finite element analyses indicate that optimal insoles with a continuously variable stiffness could produce a smooth pressure distribution for maximum comfort [3, 4]. However, it remains difficult to change stiffness continuously in manufacturing because the side length of the porous structures is 6 mm, owing to the printing limitations of FDM [5]. In the future, the development of 3D printing technologies is essential to improve the continuity of insole stiffness variations. Moreover, other factors contributing to the unloading efficacy should be considered. Cheung and Zhang showed that increasing the insole thickness beyond 9 mm could lead to increased foot pressure [13]. This may be related to a diminished arch support function of the orthosis. The body mass index (BMI) of participants should also be taken into consideration, i.e., heavier subjects required a relatively larger insole stiffness to minimize stress [5, 14]. Furthermore, Cavanagh and Bus found considerable variability among individuals in the efficiency of custom insole designs, although most pressure redistribution occurred from the lateral heel to the medial midfoot regions [1]. In addition, custom-molded orthoses with an additional metatarsal pad or rearfoot wedges were found to be more effective than custom-molded orthoses alone [13]. It should also be noted that insoles prepared by an FDM printer would have morphological defects at the microscopic level, which may cause fatigue damage of the insoles during daily use.
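The relative comparisons quoted in the Discussion (37.14, 28.47, 183.94 and 4.58%) can be reproduced from the percentages reported in the Results. The short sketch below is only an arithmetic check; the dictionary keys and function names are illustrative labels, not terms used by the authors.

```python
# Percent changes versus the ordinary flat insole, as reported in the Results.
peak_pressure_reduction = {"flat_graded": 38.5, "tci": 41.1, "tci_graded": 52.8}
contact_area_increase = {"flat_graded": 19.3, "tci": 52.4, "tci_graded": 54.8}


def relative_gain(best, other):
    """Percentage by which `best` exceeds `other`, relative to `other`."""
    return (best / other - 1.0) * 100.0


def gains_over_others(table, best_key="tci_graded"):
    """Relative gain of the best design over every other design in the table."""
    best = table[best_key]
    return {
        name: round(relative_gain(best, value), 2)
        for name, value in table.items()
        if name != best_key
    }


print(gains_over_others(peak_pressure_reduction))
print(gains_over_others(contact_area_increase))
```

The comparison is a ratio of two percentage reductions, so "37.14% more" means 52.8/38.5 = 1.3714, not a difference of 37.14 percentage points.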


5 Conclusions
This study demonstrated a workflow to design and manufacture patient-specific insoles with functionally graded stiffness properties to enhance pressure-relief capacities in the diabetic foot. The study also showed that, with the use of 3D printing technology and porous structures, more effective insoles can be made as an off-the-shelf product for diabetic patients. Future efforts should be made to build FE foot models with anatomically accurate tissue structures (i.e., bones, fat, skin, tendon, and cartilage) of the patient to evaluate the internal stress conditions when using therapeutic insoles [3, 4, 15, 16], and to optimize the insole design virtually before manufacturing [12, 13].

Acknowledgment. The research is sponsored by the National Key R&D Programmes of China (2021YFC200235), the Shanghai Science and Technology Development Funds (No. 20S31901000 & No. 21511102200) and the Medical Engineering Fund of Fudan University (No. yg2021–019 & yg2022–5).

References
1. Cavanagh, P.R., Bus, S.A.: Off-loading the diabetic foot for ulcer prevention and healing. J. Am. Podiatr. Med. Assoc. 100(5), 360–368 (2010)
2. Cavanagh, P.R., Ulbrecht, J.S.: The biomechanics of the foot in diabetes mellitus. In: Bowker, J.H., Pfeifer, M.A. (eds.) Levin and O’Neal’s: the diabetic foot, pp. 115–184
3. Shaulian, H., Gefen, A., Solomonow-Avnon, D., Wolf, A.: Finite element-based method for determining an optimal offloading design for treating and preventing heel ulcers. Comput. Biol. Med. 131, 104261 (2021)
4. Shaulian, H., Gefen, A., Solomonow-Avnon, D., Wolf, A.: A novel graded-stiffness footwear device for heel ulcer prevention and treatment: a finite element-based study. Biomech. Model. Mechanobiol. 21(6), 1703–1712 (2022)
5. Tang, L., Wang, L., Bao, W., Zhu, S., Li, D., Liu, C.: Functional gradient structural design of customized diabetic insoles. J. Mech. Behav. Biomed. Mater. 94, 279–287 (2019)
6. Chen, W.P., Ju, C.W., Tang, F.T.: Effects of total contact insoles on the plantar stress redistribution: a finite element analysis. Clin. Biomech. 18(6), S17–S24 (2003)
7. Telfer, S., Woodburn, J., Collier, A., Cavanagh, P.R.: Virtually optimized insoles for offloading the diabetic foot: a randomized crossover study. J. Biomech. 60, 157–161 (2017)
8. Owings, T.M., Woerner, J.L., Frampton, J.D., Cavanagh, P.R., Botek, G.: Custom therapeutic insoles based on both foot shape and plantar pressure measurement provide enhanced pressure relief. Diabetes Care 31(5), 839–844 (2008)
9. Davia-Aracil, M., Hinojo-Pérez, J.J., Jimeno-Morenilla, A., Mora-Mora, H.: 3D printing of functional anatomical insoles. Comput. Ind. 95, 38–53 (2018)
10. Chen, W.M., Cai, Y.H., Yu, Y., Geng, X., Ma, X.: Optimal mesh criteria in finite element modeling of human foot: the dependence for multiple model outputs on mesh density and loading boundary conditions. J. Mech. Med. Biol. 21(09), 2140034 (2021)
11. Wang, L., Kang, J., Sun, C., Li, D., Cao, Y., Jin, Z.: Mapping porous microstructures to yield desired mechanical properties for application in 3D printed bone scaffolds and orthopaedic implants. Mater. Des. 133, 62–68 (2017)
12. Chen, W.M., Lee, S.J., Lee, P.V.S.: Plantar pressure relief under the metatarsal heads: therapeutic insole design using three-dimensional finite element model of the foot. J. Biomech. 48(4), 659–665 (2015)
13. Cheung, J.T., Zhang, M.: Parametric design of pressure-relieving foot orthosis using statistics-based finite element method. Med. Eng. Phys. 30(3), 269–277 (2008)
14. Jafarzadeh, E., Soheilifard, R., Ehsani-Seresht, A.: Design optimization procedure for an orthopedic insole having a continuously variable stiffness/shape to reduce the plantar pressure in the foot of a diabetic patient. Med. Eng. Phys. 98, 44–49 (2021)
15. Chen, W.M., Lee, T., Lee, P.V.S., Lee, J.W., Lee, S.J.: Effects of internal stress concentrations in plantar soft-tissue: a preliminary three-dimensional finite element analysis. Med. Eng. Phys. 32(4), 324–331 (2010)
16. Chen, W.M., Lee, P.V.S.: Explicit finite element modelling of heel pad mechanics in running: inclusion of body dynamics and application of physiological impact loads. Comput. Methods Biomech. Biomed. Engin. 18(14), 1582–1595 (2015)

A Novel Force Platform for Assessing Multidimensional Plantar Stresses in the Diabetic Foot—A Deep Learning-Based Decoupling Approach

Hu Luo1, Xin Ma1,3, and Wen-Ming Chen1,2,3(B)

1 Academy for Engineering & Technology, Fudan University, Shanghai, China
2 Institute of Biomedical Engineering & Technology, Fudan University, Shanghai, China
3 Department of Orthopedics, Huashan Hospital Affiliated to Fudan University, Shanghai, China

Abstract. The multidimensional force platform has gained popularity in assessing the plantar pressures and shear stresses in the diabetic foot, which are believed to be promising indices for early diagnostics of foot ulcerations and for proposing effective protective strategies. The effectiveness of decoupling the 3D force components with minimum cross-talk dictates the capacity of the force detection platform. This work presents a deep-learning-based method for multidimensional force decoupling using an optical force detection platform. An optical flow algorithm was utilized to track the displacement field of the elastomeric sensing layer of the force platform under known external loads. A decoupling model was established by training the U-Net convolutional neural network with 15,000 sets of multidimensional force and optical flow data collected by a customized force calibration system. After training, the model can achieve real-time vertical (i.e., normal) and shear force detection with an accuracy of 0.089 and 0.020 N, respectively. Visualization of the multidimensional forces indicated good generalization capability of the trained neural network model. The proposed method is expected to be used as an efficient tool for multidimensional force decoupling in assessing plantar stresses, and it provides a foundation for further research on the biomechanical etiology of foot ulceration in people with diabetes.

Keywords: Biomechanics · Diabetic foot ulcers · Multidimensional force decoupling · Deep learning · Shear stresses

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024 G. Wang et al. (Eds.): APCMBE 2023, IFMBE Proceedings 104, pp. 200–208, 2024. https://doi.org/10.1007/978-3-031-51485-2_23

1 Introduction
Diabetic foot is one of the serious complications of diabetes, and diabetic foot ulcer (DFU) is its most common manifestation [1]. DFU patients are known to have plantar numbness and sensory loss due to peripheral neuropathy, and they cannot perceive plantar skin trauma caused by repetitive stresses during daily activities. This minor trauma can progress to deep plantar soft tissue injury


(i.e., foot ulcer) and even gangrene, eventually leading to lower extremity amputation [2, 3]. As many as 140.9 million people suffer from diabetes in China, of whom about 15−20% will develop foot ulcers over the course of the disease. Every 20 s, a limb is amputated due to diabetic foot ulcers [4]. Studies have found a significant relationship between plantar mechanical stresses and DFU risk [5]. A large number of institutions worldwide have now adopted plantar pressure platforms to detect abnormal gait loads in the diabetic foot. However, many studies have shown that peak plantar pressure and DFU are only moderately correlated, and patients with high peak plantar pressure may have only a 65% chance of developing foot ulcers [6]. In fact, the forces acting on the sole of the foot are multidimensional in nature, including the shear forces in the horizontal plane [7]. Many clinicians also believe that shear forces may be more detrimental to plantar tissue than plantar pressure [8]. However, there is still a lack of biomechanical sensing systems that can effectively capture the plantar shear force in DFU patients. In particular, there is still no suitable method for decoupling the 3D force components with minimum cross-talk effects [9].

Tappin et al. reported the world’s first plantar multidimensional force detection platform in the literature [10]. Its force decoupling method is based on the principle of magnetoresistance. The disadvantage of this type of sensor is that it can only measure shear forces acting in a single direction, either anterior-posterior (AP) or medial-lateral (ML). Chen et al. proposed a gait platform to measure isolated plantar metatarsal forces during walking [11]. The force sensor was designed based on the shear-web principle and strain-gauge techniques. In use, the 3D force vector applied at the sensing surface was decoupled using the different strain fields at each shear-web structure in the sensor body. Wang and Ledoux proposed a new method of measuring plantar pressure and shear stress using fiber optic sensors [12]. Based on the observed macroscopic bending of the fiber sensor, they predicted the magnitude of the shear force in the orthogonal direction by measuring the intensity attenuation of two adjacent perpendicular fibers due to physical deformation. However, the optical fiber-based measurement had poor resistance to interference, which makes it poorly suited for clinical applications.

Recently, vision-based force detection platforms have been developed. Fonov et al. used optical methods to record the displacements of elastic body structures under external loads through RGB cameras and used digital image correlation for post-processing analysis [13, 14]. While this work opened new opportunities for decoupling multidimensional force vectors, it only measured the pressure due to normal forces; for the shear forces in the AP and ML directions, an effective decoupling approach has not been established. Therefore, in this work we propose a novel decoupling method dedicated to multidimensional plantar force detection. This method converts the deformation information into optical flow data in real time and then calculates the magnitude and direction of the contact forces using a trained neural network model. Calibration experiments were conducted to demonstrate the robustness and high accuracy of the force platform.


2 Proposed Method
2.1 Physical Principle of the Method
The core structure of the force detection platform features a vision-based design, with specific emphasis on the assessment of multidimensional forces during foot-ground interactions. As shown in Fig. 1, the force platform consists of an elastomeric sensing layer (containing random particles), an RGB camera, and LED lights.

Fig. 1. The structure and schematics of vision-based force platform for assessing multidimensional plantar stresses

In this vision-based force platform, multidimensional force decoupling is realized by establishing the conversion relationship between the displacement field of the sensing layer and the externally imposed forces.

Fig. 2. Principles of particle displacement in the sensing layer due to externally applied forces

In order to obtain the 3D force vector from the movement patterns of particles in the sensing layer, we use the theory of elasticity, assuming that the sensing layer’s material is a semi-infinite linear elastic half-space [15]. The Cartesian coordinate system of the force platform is established with the camera as a reference, and the image recorded by the camera is the projection of the three-dimensional data onto the two-dimensional plane. As shown in Fig. 2, we set the external force vector as $\vec{f} = (f_x, f_y, f_z)$ and the movement vector of the interior point $\vec{r} = (x, y, z)$ as $\vec{u} = (u_x, u_y)$. Equations 1 and 2 express the relations between these vectors, where $r = |\vec{r}|$. $\sigma$ is the Poisson ratio, which is set to 0.48 (for elastomeric materials). $E$ is Young’s modulus and must be appropriately defined according to the actual elastic body used.

$$u_x = \frac{1+\sigma}{2\pi E}\left\{\left[\frac{xz}{r^3} - \frac{(1-2\sigma)x}{r(r+z)}\right] f_z + \frac{2(1-\sigma)r+z}{r(r+z)}\, f_x + \frac{\left[2r(\sigma r+z)+z^2\right]x}{r^3(r+z)^2}\left(x f_x + y f_y\right)\right\} \tag{1}$$

$$u_y = \frac{1+\sigma}{2\pi E}\left\{\left[\frac{yz}{r^3} - \frac{(1-2\sigma)y}{r(r+z)}\right] f_z + \frac{2(1-\sigma)r+z}{r(r+z)}\, f_y + \frac{\left[2r(\sigma r+z)+z^2\right]y}{r^3(r+z)^2}\left(x f_x + y f_y\right)\right\} \tag{2}$$
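To make Eqs. 1 and 2 concrete, the following minimal sketch evaluates the in-plane displacement of an interior point of an elastic half-space under a point force at the origin. It assumes the standard half-space point-load solution with denominators r(r+z) and r^3(r+z)^2; the function name `surface_displacement`, the sample Young's modulus of 100 kPa, and the sample geometry are illustrative assumptions, not values taken from the paper.

```python
import math


def surface_displacement(x, y, z, fx, fy, fz, E, sigma=0.48):
    """In-plane displacement (ux, uy) at point (x, y, z) for a point force (fx, fy, fz).

    SI units are assumed (m, N, Pa). Valid for r > 0 and r + z > 0.
    """
    r = math.sqrt(x * x + y * y + z * z)
    prefactor = (1.0 + sigma) / (2.0 * math.pi * E)
    inv_rrz = 1.0 / (r * (r + z))
    # Coefficient of the tangential force component acting along the same axis.
    direct = (2.0 * (1.0 - sigma) * r + z) * inv_rrz
    # Coefficient of the coupling term x*fx + y*fy.
    coupling = (2.0 * r * (sigma * r + z) + z * z) / (r ** 3 * (r + z) ** 2)
    shear_projection = x * fx + y * fy

    ux = prefactor * (
        (x * z / r ** 3 - (1.0 - 2.0 * sigma) * x * inv_rrz) * fz
        + direct * fx
        + coupling * x * shear_projection
    )
    uy = prefactor * (
        (y * z / r ** 3 - (1.0 - 2.0 * sigma) * y * inv_rrz) * fz
        + direct * fy
        + coupling * y * shear_projection
    )
    return ux, uy


# Hypothetical example: 1 N shear force along x, marker 2 mm below the load point.
E_ASSUMED = 1.0e5  # Pa, placeholder modulus for a soft elastomer
print(surface_displacement(0.0, 0.0, 0.002, 1.0, 0.0, 0.0, E=E_ASSUMED))
```

Directly beneath the load (x = y = 0) the expression reduces to u_x = (1+σ)(3−2σ) f_x / (4πEz), which gives a quick sanity check on the implementation.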

For the RGB camera, the movement vector of a particle in the sensing layer is calculated at a certain depth, $x = x_1$, $y = y_1$, $z = z_1$. The movement vectors due to a unit force in each direction can be written as $u_{f_x} = (h_{xx1}, h_{yx1})$, $u_{f_y} = (h_{xy1}, h_{yy1})$, and $u_{f_z} = (h_{xz1}, h_{yz1})$, so the movement $h$ can be considered an impulse response to a unit force in each direction applied at the origin. Thus, the movement vector $\vec{m}_1(x, y) = (m_{x1}(x, y), m_{y1}(x, y), m_{z1}(x, y))$ at point $(x, y)$ in the plane is calculated in the form of a convolution:

$$\begin{aligned} m_{x1} &= h_{xx1} * f_x + h_{xy1} * f_y + h_{xz1} * f_z \\ m_{y1} &= h_{yx1} * f_x + h_{yy1} * f_y + h_{yz1} * f_z \end{aligned} \tag{3}$$

Converting Eq. 3 to matrix form:

$$\begin{bmatrix} M_{x1} \\ M_{y1} \end{bmatrix} = \begin{bmatrix} H_{xx1} & H_{xy1} & H_{xz1} \\ H_{yx1} & H_{yy1} & H_{yz1} \end{bmatrix} \begin{bmatrix} F_x \\ F_y \\ F_z \end{bmatrix} \tag{4}$$

Equation 4 calculates the movement of the particles in the sensing layer, and its inverse gives a formula that can be used to calculate the force vector:

$$F = H^{-1} M \tag{5}$$
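As written, the matrix H in Eq. 4 has two rows and three columns, so it has no ordinary inverse. One common reading, which is an assumption here and is not stated by the authors, is to stack the displacement components of several tracked points and solve Eq. 5 in the least-squares sense. The sketch below does this with plain Python for a small hypothetical H; all numerical values are made up for illustration.

```python
def transpose(matrix):
    return [list(column) for column in zip(*matrix)]


def solve_square(a, b):
    """Solve a x = b for square a by Gaussian elimination with partial pivoting."""
    n = len(a)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def least_squares_force(H, M):
    """Least-squares estimate of F from M = H F via the normal equations."""
    Ht = transpose(H)
    normal = [[sum(a * b for a, b in zip(row, col)) for col in Ht] for row in Ht]
    rhs = [sum(h * m for h, m in zip(row, M)) for row in Ht]
    return solve_square(normal, rhs)


# Hypothetical response matrix: two tracked points, two displacement components each.
H = [
    [0.9, 0.1, 0.3],
    [0.1, 0.8, -0.2],
    [0.7, -0.1, -0.4],
    [0.0, 0.6, 0.5],
]
F_true = [0.5, -0.2, 3.0]                     # (Fx, Fy, Fz), arbitrary units
M = [sum(h * f for h, f in zip(row, F_true)) for row in H]
F_est = least_squares_force(H, M)
print(F_est)
```

With noise-free data and a full-rank H the estimate recovers the applied force; with measured displacements the same solve returns the force that best fits the observations.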

Based on Eq. 5, the principle of the multidimensional force decoupling method is to calculate the matrices $H$ and $M$. Thus, in Sect. 2.2 we propose an optical flow-based algorithm to calculate the matrix $M$, and in Sect. 2.3 a convolutional neural network is used to calculate the matrix $H$.

2.2 Optical Flow Calculation
In Sect. 2.1, the physical model of the force detection platform was presented, including the two important matrices $H$ and $M$. In this section, we introduce a method, optical flow, to quickly calculate the matrix $M$. Optical flow algorithms are widely used in computer science for tracking, activity detection and recognition. They extract meaningful features by computing each pixel’s displacement in the image frame by frame (120 Hz) [16]. As shown in Fig. 3, we implemented the algorithm in OpenCV to calculate the optical flows due to the externally applied vertical and horizontal shear forces. Converting the


Fig. 3. The calculated optical flow data under the vertical and shear forces externally applied on the sensing surface

movement vector $\vec{m} = (m_x, m_y)$ to the polar coordinate system $(\rho, \theta)$ gives $\rho$, the magnitude of the optical flow, and $\theta$, its direction. Finally, we obtain a matrix with a size of 640 × 480 in which each element carries two values, $\rho$ and $\theta$. Since the pixels in the image are equivalent to the displacement data of the particles embedded in the sensing layer, the matrix calculated above is the matrix $M$ in Eq. 5.

2.3 Neural Network-Based Calibration of the Force Platform
Traditional force decoupling methods, such as the digital image correlation (DIC) method [17] and the inverse finite element method (iFEM) [18], need to solve the matrix $H$ accurately. The problem with these two methods is that they require the material properties of the sensing layer to be defined accurately. For most elastomeric materials, such as silicone rubber, it can be difficult to define the elastic modulus accurately because of their nonlinear material responses. The recent development of artificial intelligence (AI) has opened up opportunities to solve this problem. In this study, the neural network model learns to solve Eq. 5 from a large number of paired matrices $F$ and $M$. Consequently, the trained neural network model plays the role of the matrix $H$ in solving Eq. 5.

The neural network-based calibration of the force platform is detailed as follows. First, a high-quality dataset was produced to train the neural network model. As shown in Fig. 4, we built a three-axis force calibration system and collected 15,000 sets of optical flow and multidimensional force data (the maximum force was 20 N). Each set in the multidimensional force dataset has a size of 640 × 480 × 2. Then, an appropriate neural network model had to be chosen. The problem that the deep learning method needs to solve is essentially an image-to-image segmentation problem. Therefore, in this work we chose the convolutional neural network U-Net [19] as the deep learning model. Figure 5 shows the structure of U-Net. The main contents of the U-Net-based model learning include: (1) learning the position of the applied forces; (2) learning the mapping relationship between the magnitude and direction of the optical flow and the multidimensional forces. In this work, we use the root mean square error (RMSE) to evaluate the training outcomes of the U-Net model.
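The conversion of the optical flow field from Cartesian components (m_x, m_y) to the polar form (ρ, θ) described in Sect. 2.2 can be sketched as follows. This is a plain-Python illustration on a tiny made-up flow field, not the authors' implementation; the function name `flow_to_polar` is hypothetical. In an OpenCV pipeline, `cv2.cartToPolar` performs the same conversion on whole arrays.

```python
import math


def flow_to_polar(flow):
    """Convert a flow field of (mx, my) pairs into magnitude and direction channels.

    `flow` is a list of rows, each row a list of (mx, my) tuples in pixels.
    Returns (rho, theta) with the same row/column layout; theta is in radians
    in the range (-pi, pi], measured from the +x axis.
    """
    rho = [[math.hypot(mx, my) for mx, my in row] for row in flow]
    theta = [[math.atan2(my, mx) for mx, my in row] for row in flow]
    return rho, theta


# Tiny hypothetical 2 x 2 flow field (the platform's field would be 640 x 480).
example_flow = [
    [(3.0, 4.0), (0.0, 0.0)],
    [(0.0, -2.0), (-1.0, 0.0)],
]
rho, theta = flow_to_polar(example_flow)
print(rho)
print(theta)
```

Stacking `rho` and `theta` as two channels yields an input of the form 640 × 480 × 2 when applied to the full camera frame.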


Fig. 4. Dataset collection by the customized multidimensional force calibration system

Fig. 5. The structure of U-Net for the calibration of the force platform

3 Results
During model training, the RMSE values of the vertical and shear forces gradually converged to 0.089 and 0.02 N, respectively. These results indicate that the model is stable and can be used as an equivalent solution method for the matrix $H$ in multidimensional force decoupling. Under the current model, the detection accuracy achieved by the force platform is 0.089 N for vertical forces within a range of 20 N and 0.02 N for shear forces within a range of 3.5 N. Figure 6 shows the RMSE curve of the vertical force during training. Figure 7 shows the visualization of the multidimensional force detection platform, which can display the pressure and shear forces in real time. Preliminary results also showed that the model had good generalization capability. For example, the multidimensional force datasets collected during calibration (Sect. 2.3) were all based on single-point indentations. However, when the model is presented with forces at two or more points, it can still perform the calculations well. This suggests that the model was able to learn the true regularity represented by the dataset. The detection platform also appears robust: in actual tests, it worked stably and safely in the presence of abnormal data and environmental interference.
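The RMSE metric used to report the 0.089 N and 0.02 N figures can be written as a short function. The sketch below uses made-up force values purely to show the calculation; the function name `rmse` and the sample numbers are illustrative assumptions.

```python
import math


def rmse(predicted, measured):
    """Root mean square error between predicted and reference force values (N)."""
    if len(predicted) != len(measured) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_error = sum((p - m) ** 2 for p, m in zip(predicted, measured))
    return math.sqrt(squared_error / len(predicted))


# Hypothetical vertical-force predictions against calibration references (N).
reference = [5.0, 10.0, 15.0, 20.0]
prediction = [5.1, 9.9, 15.1, 19.9]
print(rmse(prediction, reference))
```

Evaluating the vertical and shear components separately, as the paper does, keeps the much smaller shear range (3.5 N) from being masked by the larger vertical range (20 N).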


Fig. 6. The RMSE convergence curve of vertical force during training

Fig. 7. Visualization of the multi-dimensional force detection

4 Discussions
In this study, the feasibility of using a neural network-based force platform for decoupling 3D force components was demonstrated. The method can measure vertical (i.e., pressure) and shear forces simultaneously, with a detection accuracy of 0.089 N within a 20 N force range and 0.02 N within a 3.5 N force range, respectively. It is expected to be used in clinical environments for assessing three-dimensional plantar stresses in diabetic patients. The classical force decoupling methods for vision-based force platforms are mainly based on the DIC method [17] and iFEM [18]. The results of this work showed that a convolutional neural network model can also be used as a decoupling method to accurately solve the matrix required to calculate the 3D forces. Compared with the two methods above, the advantages of the neural network method are that the calibration process is relatively simple, the generalization ability is good, and forces can be detected in real time. However, the disadvantage is that, because a deep learning method is used to solve the matrix $H$, the matrix cannot be accurately defined, which may affect the detection accuracy of the platform. Importantly, a multidimensional force platform suitable for diabetic foot populations needs to meet the following requirements: (1) Suitable for use in complex clinical environments. Most piezoresistive, capacitive, inductive and fiber optic multidimensional force detection platforms are susceptible to interference in clinical environments [20]. (2) Flexible interface and high spatial resolution for contact force assessments.


(3) Efficient and robust multidimensional force decoupling. The multidimensional force platform proposed in this work can meet these requirements, and it is expected to be eventually applied in clinical practice after future improvements. The limitations of this work are: (1) As a preliminary study, the measuring ranges of the detection platform for vertical (i.e., pressure) and shear forces are only 20 N and 3.5 N, respectively. In the future, the stiffness properties of the sensing layer need to be adjusted so that the platform can cover the multidimensional forces acting on the foot sole during gait. (2) There is still a lack of in-depth comparison of different deep learning methods to select the most suitable learning algorithm. Thus, in future work, it is necessary to compare the pros and cons of various deep learning algorithms to optimize the force detection performance of the platform.

5 Conclusions
In this study, we proposed a new force decoupling method for multidimensional force detection. The preliminary results showed that the method could meet the robustness and detection accuracy requirements of plantar multidimensional force measurements. In the future, we will continue to optimize the force detection range and develop a force platform that is truly suitable for clinical environments, for early diagnostics of foot ulcerations in people with diabetes.

Acknowledgment. The research is sponsored by the National Key R&D Program of China (2022YFC2009500), the National Natural Science Foundation of China (NSFC, No. 12372322), the Shanghai Science and Technology Development Funds (No. 20S31901000 & No. 21511102200) and the Medical Engineering Fund of Fudan University (No. yg2021–019 & yg2022–5).

References
1. Armstrong, D.G., Boulton, A.J.M., Bus, S.A.: Diabetic foot ulcers and their recurrence. N. Engl. J. Med. 376(24), 2367–2375 (2017)
2. Bakker, K., Apelqvist, J., Lipsky, B.A., et al.: The 2015 IWGDF guidance documents on prevention and management of foot problems in diabetes: development of an evidence-based global consensus. Diabetes Metab. Res. Rev. 32, 2–6 (2016)
3. Wang, A., Xue, J., Zhangrong, X.: Progress and prospect of clinical treatment of diabetic foot. Chin. J. Diabetes 07, 643–649 (2022)
4. Gnanasundaram, S., Ramalingam, P., Das, B.N., et al.: Gait changes in persons with diabetes: early risk marker for diabetic foot ulcer. Foot Ankle Surg. 26(2), 163–168 (2020)
5. Wrobel, J.S., Najafi, B.: Diabetic foot biomechanics and gait dysfunction. J. Diabetes Sci. Technol. 4(4), 833–845 (2010)
6. Henderson, A.D., Johnson, A.W., Ridge, S.T., et al.: Diabetic gait is not just slow gait: gait compensations in diabetic neuropathy. J. Diabetes Res. 2019, 1–9 (2019)
7. Cavanagh, P.R., Ulbrecht, J.S., Caputo, G.M.: New developments in the biomechanics of the diabetic foot. Diabetes Metab. Res. Rev. 16(S1), S6–S10 (2000)
8. Yavuz, M.: Plantar shear stress: is it the H pylori of diabetic foot ulcers? Clin. Biomech. (Bristol, Avon) 92, 105581 (2022)
9. Jones, A.D., De Siqueira, J., Nixon, J.E., et al.: Plantar shear stress in the diabetic foot: a systematic review and meta-analysis. Diabet. Med. 39(1) (2022)
10. Tappin, J.W., Pollard, J., Beckett, E.A.: Method of measuring ‘shearing’ forces on the sole of the foot. Clin. Phys. Physiol. Meas. 1(1), 83 (1980)
11. Chen, W.M., Vee-Sin Lee, P., Park, S.B., et al.: A novel gait platform to measure isolated plantar metatarsal forces during walking. J. Biomech. 43(10), 2017–2021 (2010)
12. Wang, W.C., Ledoux, W.R., Sangeorzan, B.J., et al.: A shear and plantar pressure sensor based on fiber-optic bend loss. J. Rehabil. Res. Dev. 42(3), 315–325 (2005)
13. Fonov, S.D., Goss, L., Jones, E.G., et al.: Identification of pressure measurement system based on surface stress sensitive films. In: ICIASF 2005 Record, International Congress on Instrumentation in Aerospace Simulation Facilities, Sendai, Japan, pp. 123–127 (2005)
14. Fonov, S.D., Jones, E.G., Crafton, J.W., et al.: Using surface stress sensitive films for pressure and friction measurements in mini- and micro-channels. In: 2007 22nd International Congress on Instrumentation in Aerospace Simulation Facilities, Pacific Grove, CA, USA, pp. 1–7 (2007)
15. Kamiyama, K., Vlack, K., Mizota, T., et al.: Vision-based sensor for real-time measuring of surface traction fields. IEEE Comput. Graphics Appl. 25(1), 68–75 (2005)
16. Kroeger, T., Timofte, R., Dai, D., Gool, L.V.: Fast optical flow using dense inverse search. In: ECCV 2016, 14th European Conference, Amsterdam, The Netherlands, pp. 471–488 (2016)
17. Stucke, S., McFarland, D., Goss, L., et al.: Spatial relationships between shearing stresses and pressure on the plantar skin surface during gait. J. Biomech. 45(3), 619–622 (2012)
18. Ma, D., Donlon, E., Dong, S., Rodriguez, A.: Dense tactile force estimation using GelSlim and inverse FEM. In: 2019 International Conference on Robotics and Automation (ICRA), Montreal, QC, Canada, pp. 5418–5424 (2019)
19. Ronneberger, O., Fischer, P., Brox, T.: U-Net: convolutional networks for biomedical image segmentation. In: MICCAI 2015, Munich, Germany, pp. 234–241 (2015)
20. Wang, L., Jones, D., Chapman, G.J., et al.: A review of wearable sensor systems to monitor plantar loading in the assessment of diabetic foot ulcers. IEEE Trans. Biomed. Eng. 67(7), 1989–2004 (2020)

MicroNano Bioengineering

A Nanoparticle Tracking Analysis Algorithm for Particle Size Estimation

Song Lang1,2, Yanwei Zhang1,2(B), Hanqing Zheng1, and Yan Gong1,2

1 Suzhou Institute of Biomedical Engineering and Technology, Chinese Academy of Sciences, Suzhou 215163, China
2 School of Biomedical Engineering (Suzhou), University of Science and Technology of China, Suzhou 215163, China

Abstract. To improve the accuracy of nanoparticle size estimation in polydisperse samples and flowing samples, we discuss a novel nanoparticle tracking analysis (NTA) algorithm in detail. A local adaptive threshold segmentation algorithm is used to improve the recognition efficiency of particles with weak light intensity, and a new particle matching strategy including multiple optimization steps is proposed to improve the accuracy of trajectory tracking. A flow correction algorithm is introduced to remove the flow vector from the particle motion displacement and retain only the Brownian motion, which enables the NTA algorithm to be applied to the detection of flowing samples. The particle size measurement experiment on a mixed nanosphere solution shows that the particle sizes of the polydisperse sample measured by the NTA algorithm are close to the true values. The particle size measurement experiment on a flowing nanosphere sample shows that the measurement result with the flow correction algorithm is closer to the true value. In summary, we have demonstrated that the proposed NTA algorithm can accurately measure particle sizes of polydisperse samples and flowing samples.

Keywords: Nanoparticle tracking analysis · Nanoparticle size estimation · Polydisperse samples · Flowing samples

1 Introduction
Nanoparticle tracking analysis (NTA) is a relatively new characterization method for nanoparticles. With this technology, an optical microscopy imaging system is used to record the Brownian motion of particles in solution, and the particle size is obtained by deduction according to the Stokes-Einstein equation. This technology has the advantages of intuitive visualization, high resolution, and fast detection speed, while particle-specific detection can also be realized when it is combined with fluorescence labeling. In recent years, NTA technology has been successfully applied in industry and chemistry, especially in biomedical fields, such as drug delivery, toxicology, biological macromolecules, and extracellular vesicle research [1, 2].
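The step from a measured diffusion coefficient to a particle size relies on the Stokes-Einstein relation for a sphere, d = k_B T / (3πηD). The sketch below shows the conversion in both directions. The temperature (25 °C) and the viscosity of water (0.89 mPa·s) are assumed example conditions, and the function names are hypothetical.

```python
import math

BOLTZMANN = 1.380649e-23  # J/K (exact SI value)


def diameter_from_diffusion(diffusion, temperature=298.15, viscosity=0.89e-3):
    """Hydrodynamic diameter (m) from a diffusion coefficient (m^2/s).

    Default temperature (K) and viscosity (Pa*s, water near 25 degC) are
    assumptions for illustration.
    """
    return BOLTZMANN * temperature / (3.0 * math.pi * viscosity * diffusion)


def diffusion_from_diameter(diameter, temperature=298.15, viscosity=0.89e-3):
    """Diffusion coefficient (m^2/s) of a sphere with the given diameter (m)."""
    return BOLTZMANN * temperature / (3.0 * math.pi * viscosity * diameter)


d_100nm = diffusion_from_diameter(100e-9)
print(f"D for a 100 nm sphere: {d_100nm:.3e} m^2/s")
print(f"Recovered diameter: {diameter_from_diffusion(d_100nm) * 1e9:.1f} nm")
```

Because the diameter is inversely proportional to D, any relative error in the estimated diffusion coefficient carries over directly to the estimated size.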

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024 G. Wang et al. (Eds.): APCMBE 2023, IFMBE Proceedings 104, pp. 211–218, 2024. https://doi.org/10.1007/978-3-031-51485-2_24


S. Lang et al.

Due to the stochastic nature of Brownian motion, a large statistical error is introduced if the particle trajectory is short. It has been shown that the uncertainty in particle size measurements is inversely proportional to the square root of the trajectory length [3]. In response to this problem, researchers have described the particle size distribution using the maximum likelihood estimation (MLE) principle to reduce the influence of statistical error [4]. Kim et al. compared the direct calculation method, the finite trajectory length adjustment method based on MLE, and the iterative method based on MLE. They showed that although the latter two methods could somewhat reduce the width of the particle size distribution, the finite trajectory length adjustment method was not suitable for polydisperse samples. In addition, the iterative method was more complex, and its results relied heavily on the original data distribution and the choice of initial values [5].

In this paper, we discuss a novel NTA algorithm in detail. A local adaptive threshold segmentation algorithm is used to improve the recognition efficiency of particles with weak light intensity, and a particle matching strategy including multiple optimization steps is proposed to improve the accuracy of trajectory tracking. In addition, a flow correction algorithm is introduced so that the NTA algorithm can be applied to flowing samples. Finally, nanosphere experiments demonstrate that the proposed NTA algorithm can accurately measure particle sizes in polydisperse and flowing samples.

2 Materials and Methods

2.1 Principle of Nanoparticle Tracking Analysis

NTA is a light-scattering method based on Brownian motion. The operating principle of the technique is illustrated in Fig. 1. When a laser beam is incident on the sample solution, the illuminated particles scatter light or emit fluorescence. This light is collected by the microscope objective and imaged on the camera through the tube lens. To ensure that all particles in the observation field are imaged clearly and there are no defocused particles, a sheet laser is used for illumination. The depth of the illumination beam along the optical axis of the imaging system is equivalent to the focal depth of the objective lens. Videos of the Brownian motion of particles in the solution are obtained by recording the images continuously. Finally, the average diffusion coefficient of the particles is obtained using the particle tracking algorithm, and their hydrodynamic diameters are obtained according to the Stokes-Einstein equation,

$$D = \frac{4kT}{3\eta\pi d} \qquad (1)$$

where D is the average diffusion coefficient of the particles, k is Boltzmann’s constant, T is the temperature, η is the viscosity coefficient of the solvent, and d is the particle’s sphere-equivalent hydrodynamic diameter.

2.2 Algorithm of Nanoparticle Tracking Analysis

The NTA algorithm consists of particle recognition, particle matching, and particle size calculation steps.
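Before turning to the individual steps, the size calculation in Eq. (1) can be sketched in Python. This is a minimal illustration rather than the authors' implementation: the function names are hypothetical, SI units are assumed, and the example values for water (298.15 K, 0.89 mPa·s) are assumptions that do not come from the paper. The factor of 4 in Eq. (1), compared with the textbook form D = kT/(3πηd), is consistent with D being defined from two-dimensional squared displacements per unit time, as in Eq. (2).

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K


def diameter_from_diffusion(D, temperature, viscosity):
    """Invert Eq. (1): d = 4kT / (3 * pi * eta * D).

    D is in m^2/s (as defined by the paper's Eq. (2)), temperature in K,
    viscosity in Pa*s. Returns the hydrodynamic diameter in metres.
    """
    return 4.0 * K_B * temperature / (3.0 * math.pi * viscosity * D)


def diffusion_from_diameter(d, temperature, viscosity):
    """Eq. (1) as written: D = 4kT / (3 * eta * pi * d)."""
    return 4.0 * K_B * temperature / (3.0 * math.pi * viscosity * d)


# Assumed example conditions (not from the paper): water near 25 degrees C.
T_ASSUMED = 298.15      # K
ETA_ASSUMED = 0.89e-3   # Pa*s

D_100nm = diffusion_from_diameter(100e-9, T_ASSUMED, ETA_ASSUMED)
d_back = diameter_from_diffusion(D_100nm, T_ASSUMED, ETA_ASSUMED)
```

Because d is inversely proportional to D, any systematic overestimate of the displacement (for example, from uncorrected flow) shows up as an underestimate of the particle size.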

A Nanoparticle Tracking Analysis Algorithm for Particle Size Estimation


Fig. 1. Principle of nanoparticle tracking analysis

(a) Particle Recognition

The aim of particle recognition is to identify and locate all particles in all image frames. This process involves image preprocessing, image binarization, particle recognition, and centroid localization. Since the size, position, and motion state of each particle are different, some particles in the images have strong light intensity, while others have weak light intensity. If the images are binarized with a global threshold, the particles with weak light intensity are unlikely to be identified. Here, a local adaptive threshold segmentation method is adopted to accurately identify particles of all types.

(b) Particle Matching

The aim of particle matching is to determine which two particles in adjacent frames are the same particle. This process is complex because the actual motion of a particle is complicated and changeable. For example, if a particle moves beyond the focal plane or field of view, it cannot be observed continuously. In addition, some particles lie close together, which makes them prone to matching errors. A good matching strategy should be able to deal with these situations and ensure trajectory tracking accuracy. The matching strategy adopted in this paper involves multiple optimization steps based on the principle of minimum distance. The distance, size, light intensity, and unit displacement are considered to improve the accuracy of trajectory tracking. A summary of the matching process is shown in Fig. 2.

Our particle matching algorithm operates as follows. With the number of particles in the current frame defined as M and the number of particles in the next frame defined as N, a distance matrix, Q, and a matching matrix, A, are established. Element q_ij of the distance matrix is the distance between particle i in the current frame and particle j in the next frame, where i = 1, 2, …, M and j = 1, 2, …, N, and all elements of the matching matrix A are initialized to 0. A distance threshold is then set such that if the distance between two particles is greater than this threshold, the two particles cannot be matched. Matching particle I in the current frame means locating the particle in the next frame with the smallest distance from particle I. If such a particle exists, its particle number is defined as J, and the two particles have been matched successfully. In this case, A(I, J) = 1. If no such particle exists, matching fails for particle I. After all particles have been processed, the initial matching matrix A is obtained. These are step 1 and step 2 of the particle matching algorithm.
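Steps 1 and 2 as described above can be sketched in Python with NumPy. This is an illustrative reading of the text, not the authors' code: the function name `initial_match`, the use of 2D centroid coordinates, and the rule that a next-frame particle already claimed by an earlier particle is skipped are assumptions.

```python
import numpy as np


def initial_match(curr, nxt, max_dist):
    """Greedy nearest-neighbour matching in a fixed processing order.

    curr, nxt: centroid coordinates of shape (M, 2) and (N, 2).
    Returns the distance matrix Q (M x N) and the matching matrix A (M x N).
    """
    curr = np.asarray(curr, dtype=float).reshape(-1, 2)
    nxt = np.asarray(nxt, dtype=float).reshape(-1, 2)
    M, N = len(curr), len(nxt)

    # q_ij: distance between particle i (current frame) and j (next frame).
    Q = np.linalg.norm(curr[:, None, :] - nxt[None, :, :], axis=2)
    A = np.zeros((M, N), dtype=int)    # all elements start at 0
    taken = np.zeros(N, dtype=bool)    # assumption: one-to-one matching

    for i in range(M):
        # Exclude candidates beyond the distance threshold or already matched.
        candidates = np.where(taken | (Q[i] > max_dist), np.inf, Q[i])
        if N and np.isfinite(candidates.min()):
            j = int(np.argmin(candidates))
            A[i, j] = 1
            taken[j] = True
    return Q, A


# Two particles compete for the same neighbour; the first one processed wins.
Q_demo, A_demo = initial_match([[0.0, 0.0], [1.0, 0.0]],
                               [[0.9, 0.0], [5.0, 5.0]], max_dist=2.0)
```

In the small demo, particle 0 claims next-frame particle 0 even though particle 1 is closer to it, which is exactly the order dependence that the later optimization step is meant to repair.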

Fig. 2. Summary of the particle matching strategy

Since particle matching is performed in a specific order, the initial matches are not necessarily the closest ones. For instance, if particle I1 and particle J1 are matched first, it may turn out that the distance between particle I2 and particle J1 is smaller than the distance between I1 and J1. Step 3 re-pairs such particles whose distance is less than the threshold, optimizing the matching matrix according to the minimum distance principle.

Due to the randomness of particle motion, particles that are close to each other during measurement are very likely to cause matching errors. Hence, an effective screening strategy is included in the matching algorithm, namely step 4 and step 6. In step 4, the differences in size and light intensity between the two successfully matched particles are analyzed. If a difference is greater than a set threshold, the two particles are likely to be different, and the match is canceled. After particle matching has been completed between all adjacent image frames, the trajectory of a particle can be drawn. In step 6, the displacement of the particle between adjacent frames on one trajectory is calculated. Since the average diffusion coefficient of a single particle is constant, the unit displacement should fluctuate within a certain range. An obvious mutation node indicates that the trajectory is likely to contain a tracking error at that node. Therefore, the mutation nodes must be removed.

NTA is based on the statistical principle that the uncertainty in particle size is inversely proportional to the square root of the track length. Hence, if the trajectory is short, a large statistical error is introduced. Therefore, only trajectories longer than a specific threshold (set to 20 in this paper) are selected to calculate the average particle diffusion coefficient.
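The re-pairing of step 3 and the displacement check of step 6 can be sketched as follows. The paper does not give the exact rules, so everything here is an assumption made for illustration: the function names, the swap rule (an unmatched particle takes over a partner when it is strictly closer than the current owner), and the outlier criterion (a step longer than `factor` times the median step of the track).

```python
import numpy as np


def refine_matches(Q, A, max_dist):
    """Step 3 (sketch): let unmatched particles claim closer partners."""
    Q = np.asarray(Q, dtype=float)
    A = np.array(A, dtype=int)
    changed = True
    while changed:
        changed = False
        for i in np.flatnonzero(A.sum(axis=1) == 0):   # unmatched rows
            for j in np.argsort(Q[i]):                 # nearest first
                if Q[i, j] > max_dist:
                    break
                owner = np.flatnonzero(A[:, j])
                if owner.size == 0:                    # free partner
                    A[i, j] = 1
                    changed = True
                    break
                if Q[i, j] < Q[owner[0], j]:           # closer than owner
                    A[owner[0], j] = 0
                    A[i, j] = 1
                    changed = True
                    break
    return A


def flag_mutation_steps(track, factor=3.0):
    """Step 6 (sketch): flag steps much longer than the track's median step."""
    steps = np.linalg.norm(np.diff(np.asarray(track, float), axis=0), axis=1)
    return steps > factor * np.median(steps)


def long_enough(track, min_length=20):
    """Keep only trajectories longer than the threshold used in the paper."""
    return len(track) > min_length


Q_demo = np.array([[0.9, 7.07], [0.1, 6.40]])
A_refined = refine_matches(Q_demo, [[1, 0], [0, 0]], max_dist=2.0)
flags = flag_mutation_steps([[0, 0], [1, 0], [2, 0], [12, 0], [13, 0]])
```

Each swap strictly shortens the matched distance for the contested particle, so the loop terminates. A globally optimal assignment (for example, with the Hungarian algorithm) would be an alternative design, but the text describes a local minimum-distance correction, which is what this sketch follows.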
(c) Particle Size Calculation

With NTA, particle size is deduced from Brownian motion by analyzing the average diffusion coefficient. However, in actual measurements, the temperature and pressure differences caused by the illumination and the channel make the particles in solution flow slowly. In this study, flow rate correction is included in the particle size calculation. The correction relies on the randomness of Brownian motion. Because Brownian motion is equally probable in all directions, it cancels out when the motion vectors of all observed particles are summed, and the constant flow vector remains. The flow vector is then subtracted from each particle motion vector, leaving only the Brownian motion vector, which is used to calculate the average diffusion coefficient. As a result, the average diffusion coefficient (D) can be expressed as

$$D = \frac{1}{K-1} \sum_{k=1}^{K-1} \frac{|s_k|^2}{t} \qquad (2)$$

where s_k is the Brownian motion displacement vector of the particle between adjacent frames, K is the length of the particle’s trajectory, i.e., the number of frames, and t is the time interval of image acquisition. Finally, the hydrodynamic diameter of the particle can be deduced using Eq. (1).

2.3 Experimental Setup

A self-developed nanoparticle tracking analyzer was used in this study. A 405 nm laser was used for illumination. After shaping, the depth of the beam along the optical axis of the imaging system was 12 μm. A microscope objective with a long working distance, a numerical aperture (NA) of 0.25, and 10× magnification was employed. The pixel width of the camera was 5.86 μm, and the sensor resolution was 1920 pixels × 1200 pixels. Two different sizes of polystyrene nanospheres, obtained from Duke, were tested. The average particle size of the first set of nanospheres was 100 nm ± 4 nm (k = 2), while that of the second set was 510 nm ± 7 nm (k = 2), as certified by NIST. The nanospheres were diluted with ultrapure water and then injected into the sample cell. The instrument measured 21 different positions and recorded 30 consecutive frames at each position. The size distribution of particles in the solution was obtained using the nanoparticle tracking algorithm.
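The flow correction and Eqs. (1) and (2) of Sect. 2.2 can be sketched together in Python. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, the flow vector is taken as the mean (rather than the raw sum) of all frame-to-frame displacements, and positions are assumed to be already converted from pixels to metres.

```python
import numpy as np

K_B = 1.380649e-23  # Boltzmann constant in J/K


def flow_vector(tracks):
    """Estimate the constant flow per frame interval.

    Averaging the displacement vectors of all particles cancels the
    Brownian part, which is equally probable in all directions.
    """
    steps = np.vstack([np.diff(np.asarray(t, float), axis=0) for t in tracks])
    return steps.mean(axis=0)


def diffusion_coefficient(track, dt, flow=(0.0, 0.0)):
    """Eq. (2): mean of |s_k|^2 / dt over the K-1 steps of one trajectory.

    s_k is the frame-to-frame displacement with the flow vector removed.
    """
    s = np.diff(np.asarray(track, float), axis=0) - np.asarray(flow, float)
    return float(np.mean(np.sum(s ** 2, axis=1)) / dt)


def hydrodynamic_diameter(D, temperature, viscosity):
    """Eq. (1) solved for d: d = 4kT / (3 * pi * eta * D)."""
    return 4.0 * K_B * temperature / (3.0 * np.pi * viscosity * D)
```

A typical call sequence would estimate `flow_vector` once per video, pass it to `diffusion_coefficient` for every trajectory that meets the length requirement, and convert each result with `hydrodynamic_diameter`. Skipping the flow term inflates the squared displacements and therefore shrinks the computed diameter.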

3 Results

Both sets of nanospheres were mixed in solution. Since the light intensity scattered by the 100 nm nanospheres was far lower than that scattered by the 510 nm nanospheres, the former were almost invisible in the resulting images, as shown in Fig. 3a and b. To improve the image contrast, Fig. 3c was obtained by gamma transformation (γ = 0.1) of Fig. 3b, after which some small particles could be seen. A threshold was calculated for Fig. 3c using the maximum between-class variance (Otsu) algorithm, and the image was then binarized, as shown in Fig. 3d. Although small particles were identified using this method, it was not suitable for large particles, as their boundaries were not clear enough. Hence, the local adaptive threshold segmentation method was employed for binarization in this study, the results of which are shown in Fig. 3e. Here, it can be seen that the recognition efficiency for small particles is improved dramatically, while the outlines of the large particles are clearer and neater. Figure 3f shows the entire image after recognition.

Fig. 3. Result of particle recognition. a Original image; b Local zoom of (a); c Image after gamma transformation; d Binary image obtained using Otsu algorithm; e Binary image obtained using local adaptive threshold segmentation method; f Entire image after recognition
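The difference between global and local adaptive thresholding discussed above can be illustrated with a short NumPy sketch. The paper does not specify which local adaptive method was used, so a mean-based rule (a pixel is foreground if it exceeds the mean of its neighbourhood by an offset) is assumed here. The function names, window size, and offset are hypothetical, and a fixed level stands in for the Otsu threshold.

```python
import numpy as np


def local_mean(img, win):
    """Mean over a win x win neighbourhood (win odd), via an integral image."""
    r = win // 2
    padded = np.pad(img.astype(float), r, mode="reflect")
    integral = np.cumsum(np.cumsum(padded, axis=0), axis=1)
    integral = np.pad(integral, ((1, 0), (1, 0)))  # leading zero row/column
    h, w = img.shape
    total = (integral[win:win + h, win:win + w]
             - integral[:h, win:win + w]
             - integral[win:win + h, :w]
             + integral[:h, :w])
    return total / (win * win)


def adaptive_threshold(img, win=15, offset=0.0):
    """Foreground where a pixel exceeds its local mean by more than offset."""
    return img > local_mean(img, win) + offset


def global_threshold(img, level):
    """Foreground where a pixel exceeds a single image-wide level."""
    return img > level


# Synthetic frame: one bright and one dim particle on a dark background.
frame = np.zeros((32, 32))
frame[5, 5] = 200.0
frame[20, 20] = 5.0
mask_global = global_threshold(frame, 100.0)
mask_local = adaptive_threshold(frame, win=7, offset=1.0)
```

In this toy frame the global level keeps only the bright particle, while the local rule keeps both, which mirrors the behaviour reported for the weak 100 nm nanospheres.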

Following particle identification and localization, particles in all 30 frames were matched according to the particle matching strategy. The solution flow was corrected for trajectories that met the length requirement, and the average diffusion coefficient was then estimated. Finally, the size of every observed particle was calculated to obtain the size distribution of the tested sample. The results are shown in Fig. 4. Two peaks can be seen in Fig. 4, indicating that the sample contained two kinds of nanospheres. A Gaussian fit to the curve places the peaks at 107.4 nm and 490.0 nm, respectively. The particle size measured by NTA is the hydrodynamic diameter, which is somewhat different from the geometric diameter.
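As a quick check on the two fitted peaks, their relative deviations from the certified mean diameters given in Sect. 2.3 (100 nm and 510 nm) work out as follows. The comparison is only indicative, since it sets a hydrodynamic diameter against a certified geometric one.

```latex
\frac{107.4 - 100}{100} \approx +7.4\,\%,
\qquad
\frac{490.0 - 510}{510} \approx -3.9\,\%
```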

Fig. 4. Size distribution of the polydisperse particle solution

To verify the accuracy of the flow correction algorithm, 510 nm polystyrene nanospheres were diluted with ultrapure water and injected into the sample cell. Due to the temperature difference, the particles in solution flowed in one direction. Four motion videos were recorded at different flow rates and then processed by the algorithm described in this paper. Figure 5a and b show the 2D particle trajectories at the minimum and maximum flow rates, respectively. The peaks of the particle size distribution were obtained with (circular markers) and without (square markers) the flow correction algorithm, as shown in Fig. 5c. Figure 5d and e show the particle size distribution with and without the flow correction algorithm when the flow rate is 7.63 pixel/s. It can be seen from Fig. 5 that the measured particle size decreases as the flow rate increases, while the flow correction algorithm keeps the measurement valid and accurate even when the sample has a large flow rate.

Fig. 5. Detection of 510 nm nanospheres in flowing solution. a Particle trajectories with the minimum flow rate; b Particle trajectories with the maximum flow rate; c The peaks of particle size distribution with and without flow correction algorithm; d the particle size distribution with flow correction algorithm when the flow rate is 7.63 pixel/s; e the particle size distribution without flow correction algorithm when the flow rate is 7.63 pixel/s

4 Discussion

The algorithm proposed in this paper can improve the accuracy of an NTA system. Note that the hydrodynamic diameter depends on the particle material, surface charge, solvent, and other factors. We suggest that the NTA system be calibrated with a similar standard sample before measurement, so that the results are closer to the geometric diameter of the particles.

5 Conclusions

In this paper, an image processing algorithm for nanoparticle tracking analysis was described in detail. First, the local adaptive threshold segmentation method was used to improve the recognition efficiency and localization precision of particles. Second, a new particle matching strategy was proposed to improve the accuracy of trajectory tracking. Finally, a flow correction algorithm was introduced, which enables the NTA algorithm to be applied to flowing samples. The nanosphere experiments demonstrate that the proposed NTA algorithm can accurately measure particle sizes in polydisperse and flowing samples.

Acknowledgments. This work was funded by Scientific Research and Equipment Development Project of CAS (YJKYYQ20210031).


References

1. Bachurski, D., Schuldner, M., Nguyen, P.H., et al.: Extracellular vesicle measurements with nanoparticle tracking analysis–An accuracy and repeatability comparison between NanoSight NS300 and ZetaView. J. Extracell. Vesicles 8(1), 1596016 (2019)
2. Desgeorges, A., Hollerweger, J., et al.: Differential fluorescence nanoparticle tracking analysis for enumeration of the extracellular vesicle content in mixed particulate solutions. Methods 177, 67–73 (2020)
3. Gardiner, C., Ferreira, Y.J., Dragovic, R.A., et al.: Extracellular vesicle sizing and enumeration by nanoparticle tracking analysis. J. Extracell. Vesicles 2(1), 19671 (2013)
4. Walker, J.G.: Improved nano-particle tracking analysis. Meas. Sci. Technol. 23, 065605 (2012)
5. Kim, A., Ng, W.B., Bernt, W., et al.: Validation of size estimation of nanoparticle tracking analysis on polydisperse macromolecule assembly. Sci. Rep. 9(1), 1–14 (2019)

Biomaterials

Self-adaptive Dual-Inducible Nanofibers Scaffolds for Tendon-To-Bone Interface Synchronous Regeneration

A. Haihan Gao1, B. Liren Wang1, C. Tonghe Zhu2, D. Jinzhong Zhao1, and E. Jia Jiang1(B)

1 Department of Sports Medicine, Shanghai Sixth People’s Hospital Affiliated to Shanghai Jiao Tong University School of Medicine, 600 Yishan Road, Xuhui District, Shanghai 200233, China [email protected]
2 School of Chemistry and Chemical Engineering, Shanghai Engineering Research Center of Pharmaceutical Intelligent Equipment, Shanghai Frontiers Science Research Center for Druggability of Cardiovascular Non-coding RNA, Institute for Frontier Medical Technology, Shanghai University of Engineering Science, 333 Longteng Rd., Shanghai 201620, People’s Republic of China

Abstract. It has been established that one major cause of retear following rotator cuff repair is the failure of the native tendon-to-bone interface to regenerate. Therefore, identifying novel strategies to promote synchronous regeneration of the different tissues at the inhomogeneous tendon-to-bone interface is a critical challenge for tendon-bone healing after rotator cuff repair. Chondrogenic and osteogenic induction microenvironments coexist at the tendon-to-bone interface; however, they are inadequate to support fibrocartilage and bone regeneration during tendon-bone healing. Thus, inspired by the intrinsic induction microenvironments present at the tendon-to-bone interface, self-adaptive dual-inducible nanofibers scaffolds (SDNS) were designed to significantly improve chondrogenic and osteogenic ability in response to local induction microenvironments. Strontium-doped mesoporous bioglass nanoparticles (Sr-MBG) were first synthesized by a modulated sol-gel method and then loaded into the nanofibers of SDNS. SDNS were capable of releasing strontium ions continuously, thereby improving osteogenic and chondrogenic differentiation of mesenchymal stem cells in response to specific induction microenvironments. SDNS stimulated synchronous regeneration of the fibrocartilage and bone layers at the tendon-to-bone interface after implantation in the torn rotator cuff, significantly improving the biomechanical strength of the supraspinatus tendon–humerus complexes. Our study provides a novel perspective for promoting in situ regeneration of inhomogeneous tissue by regulating the local intrinsic microenvironments through metal ions.

Keywords: Rotator cuff tear · Tendon-to-bone interface · Strontium · Mesoporous bioglass nanoparticles · Nanofibers scaffolds

Supplementary Information The online version contains supplementary material available at https://doi.org/10.1007/978-3-031-51485-2_25. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2024 G. Wang et al. (Eds.): APCMBE 2023, IFMBE Proceedings 104, pp. 221–239, 2024. https://doi.org/10.1007/978-3-031-51485-2_25


A. H. Gao et al.

1 Introduction

Rotator cuff tear (RCT) is one of the most common causes of shoulder pain, and it also leads to weakness and a limited range of motion [1]. The incidence of rotator cuff tears in the general population is approximately 30.24%, and it can exceed 50% in people over 60 years old [1]. The conventional surgical treatment involves reattaching the torn rotator cuff to the humeral head footprint, but a high retear rate occurs after this procedure. According to the statistics, 20–94% of patients who undergo rotator cuff repair will experience a retear [2, 4]. It has been reported that failure of the native tendon-to-bone interface structure to regenerate is the primary cause of retear. The tendon-to-bone interface is composed of four consecutive but heterogeneous layers, namely the tendon layer, uncalcified fibrocartilage layer, calcified fibrocartilage layer, and bone layer [3, 6]. The tendon-to-bone interface connects tendon and bone tissue with different mechanical properties. An interface between materials with dissimilar mechanical properties frequently tears because of stress concentration, but the tendon-to-bone interface can transfer stress from tendon to bone smoothly and reduce the stress concentration [4, 8]. However, even after surgical repair of RCT, it is difficult for the fibrocartilage layer at the tendon-bone interface to regenerate, and the greater tuberosity of the humerus will experience bone loss, which can markedly diminish mechanical properties [5, 9]. Therefore, promoting the synchronous regeneration of the fibrocartilage and bone layers in the tendon-to-bone interface is essential for optimal tendon-bone healing.

Mesenchymal stem cells (MSCs) are multipotent and capable of differentiating into diverse cell lineages, and the microenvironment is the critical factor determining the differentiation direction of MSCs [6, 11]. In an elegant study, Kim et al. co-cultured mesenchymal stem cells and osteoblasts using a novel 3D co-culture system and found that osteoblasts could significantly promote the osteogenic differentiation of the mesenchymal stem cells, which may be related to the paracrine effect of osteoblasts [7, 12]. In the process of tendon-bone healing, the mesenchymal stem cells are affected not only by the cells present on the bone side but also by those on the tendon side. In addition, Lu et al. co-cultured MSCs at the interface between fibroblasts and osteoblasts and observed that the expression of chondrogenic markers such as collagen II in MSCs was significantly increased [8, 13]. The interaction between fibroblasts and osteoblasts can generate a chondrogenic microenvironment for potential fibrocartilage regeneration at the tendon-to-bone interface [8, 13]. These findings imply that osteogenic and chondrogenic microenvironments coexist at the tendon-to-bone interface from the bone side to the tendon side. These microenvironments, however, have been found to be insufficient to support regeneration of the fibrocartilage and bone layers during tendon-bone healing without external intervention. Therefore, designing a novel scaffold that can adapt to the different microenvironments at the tendon-to-bone interface and significantly enhance the chondrogenesis and osteogenesis of mesenchymal stem cells is conducive to the regeneration of the tendon-to-bone interface.

Strontium is a trace element that predominantly exists in human bones [9, 15]. Strontium ions (Sr²⁺) can improve osteogenesis and suppress the process of osteolysis [10, 16]. Consequently, strontium is frequently incorporated into bone tissue engineering scaffolds to alleviate osteoporosis and to stimulate bone regeneration [11, 18]. However, in recent years, it has been reported that Sr²⁺ can stimulate chondrogenesis in a chondrogenic microenvironment. Yu et al. found that strontium ranelate could effectively promote chondrogenesis of MSCs in vitro and enhance articular cartilage regeneration in vivo [12, 20]. Deng et al. fabricated Sr5(PO4)2SiO4 bioactive ceramic scaffolds by 3D printing, which could release Sr²⁺ to alleviate the cartilage destruction caused by osteoarthritis (OA) and promote articular cartilage regeneration by activating the hypoxia-inducible factor-1 pathway [13, 21]. These findings indicate that Sr²⁺ can adapt to divergent induction microenvironments and thereby enhance the osteogenic and chondrogenic differentiation of MSCs. We anticipate that Sr²⁺ could markedly stimulate the synchronous regeneration of the fibrocartilage and bone layers in response to the local microenvironments during tendon-bone healing.

In this study, we aimed to fabricate a scaffold capable of adapting to the various induction microenvironments and inducing dual differentiation of MSCs into fibrochondrocytes and osteocytes. To achieve this aim, we synthesized a novel strontium-doped mesoporous bioglass nanoparticle (Sr-MBG), which was loaded into the nanofibers scaffolds by electrospinning to prepare SDNS (Fig. 1). SDNS could adapt to the different induction microenvironments and promote both the osteogenesis and chondrogenesis of MSCs in vitro (Fig. 1). Histological results showed that SDNS promoted fibrocartilage regeneration at the tendon-to-bone interface. Micro-CT revealed that the bone at the tendon-to-bone interface was effectively restored. Synchronous regeneration of the fibrocartilage and bone layers significantly increased the mechanical strength of the supraspinatus tendon–humerus complexes. These results confirm that self-adaptive dual-inducible nanofibers scaffolds were successfully fabricated, providing a general strategy for promoting in situ tissue regeneration at heterogeneous interfaces, such as the tendon-to-bone and osteochondral interfaces, by adaptively regulating the local induction microenvironments through metal ions.

2 Materials and Method

2.1 Reagents and Materials

Cetyltrimethylammonium bromide (CTAB), triethanolamine, tetraethyl silicate (TEOS), triethyl phosphate (TEP), calcium nitrate tetrahydrate (Ca(NO3)2·4H2O), strontium nitrate (Sr(NO3)2), and gelatin were purchased from Sigma-Aldrich (Missouri, USA). Poly (ester urethane) urea (PEUU) was synthesized as described in a previous report [14, 22]. 1,1,1,3,3,3-hexafluoro-2-propanol (HFIP) was obtained from Chembee (Shanghai, China). Dulbecco’s Modified Eagle’s Medium (DMEM), Minimum Essential Medium α (α-MEM), phosphate buffered saline (PBS), penicillin-streptomycin, trypsin, cell culture freezing medium, and 4′,6-diamidino-2-phenylindole (DAPI) were bought from Gibco (New York, USA). Cell Counting Kit-8 (CCK-8), Calcein/Propidium Iodide (AM/PI) cell viability/cytotoxicity Assay Kit, Alizarin Red Staining solution, and Alkaline Phosphatase Assay Kit were bought from Beyotime (Shanghai, China). Osteogenic and chondrogenic differentiation induction media were purchased from Cyagen (Shanghai, China). EZ-press RNA Purification Kit, 2× SYBR Green qPCR Master Mix, and 4× Reverse Transcription were purchased from EZBioscience (Minnesota, USA).


Fig. 1. Self-adaptive dual-inducible nanofibers scaffolds (SDNS) which released strontium ions sustainably were fabricated by electrospinning. SDNS were able to improve chondrogenesis and osteogenesis adapting to induction microenvironments. The in situ implantation of SDNS for rotator cuff tear promoted fibrocartilage regeneration and new bone formation in vivo.

2.2 Preparation of Sr-MBG

Strontium-doped mesoporous bioglass nanoparticles were synthesized through the modulated sol-gel method. Briefly, 2 g CTAB and 0.08 g triethanolamine were dispersed in 20 ml ultrapure water and vigorously stirred for 1 h at 95 °C. Thereafter, 1.5 ml TEOS was added to the above solution dropwise and stirred for one hour. Then 0.03 mol Ca(NO3)2·4H2O, 0.03 mol Sr(NO3)2, and 0.01 mol TEP were added in turn, followed by one hour of constant stirring. The precipitate was collected and washed with ethanol and ultrapure water. It was then dried in an oven, transferred to a muffle furnace, and calcined at 650 °C for three hours to obtain Sr-MBG.

2.3 Fabrication of Nanofibers Scaffolds

The nanofibers scaffolds were fabricated by electrospinning. First, 1.2 g PEUU was dissolved in HFIP to form 10 ml of a 12% w/v solution (solution 1). PEUU and gelatin (75:25) were then dissolved in HFIP to form 10 ml of a 12% w/v solution (solution 2). Thereafter, 100 mg Sr-MBG was added to solution 2 and stirred until the particles were distributed evenly, which formed solution 3. Solution 1 and solution 2 were electrospun to fabricate the PEUU nanofibers scaffolds (NS) and the PEUU/gelatin nanofibers scaffolds (NS1), respectively. Solution 3 was electrospun to fabricate the self-adaptive dual-inducible nanofibers scaffolds (SDNS). All electrospinning processes used the following parameters: an applied voltage of 12 kV, a syringe needle tip to collector distance of 15 cm, a flow rate of 1 ml/h, and surrounding conditions of 25 °C and 25% ± 5% relative humidity.
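The solution compositions in Sect. 2.3 can be checked with a short worked calculation. Two points are assumptions rather than statements from the text: the 75:25 PEUU-to-gelatin ratio is read as a mass ratio, and the Sr-MBG loading is expressed relative to the 1.2 g of polymer implied by 10 ml of a 12% w/v solution.

```latex
\begin{aligned}
m_{\text{polymer}} &= 0.12\ \mathrm{g/ml} \times 10\ \mathrm{ml} = 1.2\ \mathrm{g} \\
m_{\text{PEUU}} &= 0.75 \times 1.2\ \mathrm{g} = 0.9\ \mathrm{g}, \qquad
m_{\text{gelatin}} = 0.25 \times 1.2\ \mathrm{g} = 0.3\ \mathrm{g} \\
\frac{m_{\text{Sr-MBG}}}{V_{\text{solution}}} &= \frac{0.1\ \mathrm{g}}{10\ \mathrm{ml}} = 1\%\ \text{w/v}, \qquad
\frac{m_{\text{Sr-MBG}}}{m_{\text{polymer}}} = \frac{0.1\ \mathrm{g}}{1.2\ \mathrm{g}} \approx 8.3\%
\end{aligned}
```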


2.4 Characterizations of Sr-MBG and Nanofibers Scaffolds

The microstructures and nanostructures of Sr-MBG and the nanofibers scaffolds were observed using scanning electron microscopy (SEM, RISE-MANGA, Czech) and transmission electron microscopy (TEM, TALOS F200X, USA). The average diameters of Sr-MBG and the nanofibers were measured with Image-J software (USA). The water contact angle was measured with a DSA100 (Krüss GmbH, Germany). The mechanical properties of the nanofibers scaffolds were analyzed with a material testing machine (Instron 5969, USA). The porosity of the nanofibers scaffolds was measured with Image-J software (USA). Fourier-transform infrared (FTIR) spectra were obtained in absorption mode at 2 cm⁻¹ intervals in the wavenumber range 400–4000 cm⁻¹ using an infrared spectrometer (Nicolet 6700, Thermo Scientific, USA). The elemental composition and valence distribution of Sr-MBG were determined by X-ray photoelectron spectroscopy (XPS, AXIS UltraDLD, China).

2.5 Release of Ions by Nanofibers Scaffolds

Nanofibers scaffolds measuring 1 × 1 cm² were soaked in 2 ml of PBS (pH = 7.4) for 1, 3, 5, 7, 14, 21, and 28 days, respectively. The concentrations of Sr, Ca, and Si in the PBS supernatant were then measured using an inductively coupled plasma emission spectrometer (ICP-OES, Avio500, PerkinElmer, Singapore).

2.6 Cytocompatibility of Nanofiber Scaffolds

The cytocompatibility of the nanofibers scaffolds in vitro was examined by CCK-8 and AM/PI staining. After sterilization, each nanofibers scaffold was placed in a 24-well plate, and 1.0 × 10⁴ rat bone marrow derived stem cells (BMSCs, Pricella, China) were seeded on the scaffold. BMSCs were cultured on the nanofibers scaffolds for 1, 3, and 5 days and then incubated in CCK-8 working solution for 1 h at 37 °C. The working solution was transferred to 96-well plates, and the absorbance at 450 nm was measured with an automatic microplate reader (MPR-A9600, Thomas Scientific, USA). BMSCs were seeded on nanofibers scaffolds as described above, and AM/PI staining was conducted at 1 and 5 days after seeding. A fluorescence microscope (STELLARIS 8, Leica, Germany) was used to observe the cells; living cells were stained green, whereas dead cells were stained red.

2.7 In Vitro Cell Induction

To investigate the potential effect of the nanofibers scaffolds on osteogenesis and chondrogenesis of BMSCs, NS1 and SDNS were soaked in osteogenic and chondrogenic induction medium at a ratio of 100 cm²/20 ml for 4 weeks to obtain the extract induction medium. The extract induction medium was then used to assess the osteogenic and chondrogenic effects on BMSCs.


2.7.1 Osteogenic Induction

1.0 × 10⁷ BMSCs were seeded on six-well tissue culture plates and cultured with osteogenic extract induction medium for 14 days. Reverse transcription quantitative polymerase chain reaction (RT-qPCR) and western blot (WB) were used to detect osteogenic marker genes and protein expression. 5.0 × 10⁴ BMSCs were seeded on 24-well tissue culture plates and cultured with osteogenic extract induction medium for 7 and 14 days. Alizarin Red Staining (ARS), Alkaline Phosphatase Staining (ALP), and immunofluorescent staining were used to analyze the osteogenic ability.

2.7.2 Chondrogenic Induction

1.0 × 10⁷ BMSCs were seeded on six-well tissue culture plates and cultured with chondrogenic extract induction medium for 21 days. RT-qPCR and WB were employed to detect the expression of various chondrogenic marker genes and proteins. 5.0 × 10⁴ BMSCs were seeded on 24-well tissue culture plates and cultured with chondrogenic extract induction medium for 21 days. Immunofluorescent staining was carried out to analyze the chondrogenic ability. The experimental details for RT-qPCR, WB, and immunofluorescent staining are described in Sect. S1.1 in the ESM.

2.8 In Vivo Experiments

2.8.1 Establishment of Rat RCT Model

All animal experimental procedures were approved and supervised by the Institutional Animal Care and Use Committee (IACUC) of Shanghai Sixth People’s Hospital Affiliated to Shanghai Jiao Tong University School of Medicine (Animal Experiment Registration No. DWSY2021-0113). A total of 128 mature male rats (weight 250 g ± 50 g) were used in this study. All the rats were randomly allocated into four groups: (1) defect group; (2) simple repair group; (3) repair with NS1 (NS1 group); (4) repair with SDNS (SDNS group). In brief, the right supraspinatus tendon was first detached from the humerus and then reattached to its original footprint area with a transosseous suture. The details of the surgical procedure are described in Sect. S1.2 in the ESM. At 4 and 8 weeks, six rats were sacrificed for histology, immunofluorescence, and Micro-CT, and four other rats were sacrificed for the biomechanical test.

2.8.2 Histology and Immunofluorescence Staining

The supraspinatus tendon-humerus complexes were fixed in 10% formalin for 3 days and decalcified for one month. After dehydration and embedding, serial sections four micrometers thick were cut for hematoxylin and eosin (HE), Safranin O/Fast Green, toluidine blue, and collagen II immunohistochemical staining. All images of HE, Safranin O/Fast Green, toluidine blue, and immunohistochemical staining were captured with a microscope (DM4000B, Leica, Germany). A histological score system was then used for the semiquantitative evaluation of the regenerated tendon-to-bone interface [15, 23]. The details of the histological score system are given in Sect. S1.3 in the ESM.

Self-adaptive Dual-Inducible Nanofibers Scaffolds


2.8.3 Microcomputed Tomography (Micro-CT) Analysis

Bone mineral density (BMD) and bone volume/total volume (BV/TV) of the greater tubercles were measured by microcomputed tomography (eXplore Locus SP, Canada). Each specimen was scanned at 270 mA, 90 kV, and 18 μm voxel size. CTVox software (USA) was used for three-dimensional reconstruction. The region of interest was selected as the tendon-humerus complex footprint. BMD and BV/TV of each specimen were analyzed with SkyScan CT Analyser software (Bruker).

2.8.4 Biomechanical Testing

The cross-sectional area of the tendon at the tendon-to-bone interface was measured with a digital caliper (Tajima) after the supraspinatus tendon-humerus complexes were harvested from the experimental rats. The biomechanical tests were carried out on a universal testing machine (Instron 5569). The supraspinatus tendon was fixed in a clamp and the proximal humerus was fixed with a Kirschner wire. All the supraspinatus tendon-humerus complexes were evaluated as described previously [16, 24]. The test was completed when the supraspinatus tendon-humerus complex broke during the procedure. The failure load, stiffness, and stress were calculated by plotting load-strain curves.

2.9 Statistical Analysis

SPSS 21 software (IBM, USA) and Origin 2021 software (USA) were used to analyze and visualize the experimental data. Significance levels are denoted as follows: *P < 0.05, **P < 0.01, and ***P < 0.001.
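The three quantities named in Sect. 2.8.4 can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' analysis: it assumes a load-displacement record, takes the failure load as the peak load, estimates stiffness as the least-squares slope of the loading branch between 20% and 80% of the peak (a hypothetical window), and computes stress as the failure load divided by the measured cross-sectional area. All function names and sample values are hypothetical.

```python
def linear_slope(xs, ys):
    """Least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def biomech_metrics(displacement_mm, load_n, area_mm2, window=(0.2, 0.8)):
    """Return (failure load in N, stiffness in N/mm, stress in N/mm^2).

    Assumptions (not taken from the paper): failure load is the peak load,
    and stiffness is fitted on the loading branch between window[0] and
    window[1] of the peak load.
    """
    if len(displacement_mm) != len(load_n) or len(load_n) < 2:
        raise ValueError("need two equally long series with at least 2 points")
    failure_load = max(load_n)
    peak = load_n.index(failure_load)
    low, high = window[0] * failure_load, window[1] * failure_load
    # Keep only points on the loading branch (up to the peak) inside the window.
    pts = [(d, f)
           for d, f in zip(displacement_mm[:peak + 1], load_n[:peak + 1])
           if low <= f <= high]
    if len(pts) < 2:
        raise ValueError("too few points in the fitting window")
    stiffness = linear_slope([p[0] for p in pts], [p[1] for p in pts])
    stress = failure_load / area_mm2
    return failure_load, stiffness, stress


# Hypothetical record: linear loading up to 8 N, then rupture.
disp = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5]
load = [0.0, 2.0, 4.0, 6.0, 8.0, 3.0]
failure, stiffness, stress = biomech_metrics(disp, load, area_mm2=4.0)
```

With this made-up record the peak load is 8 N, the fitted slope is 4 N/mm, and the stress is 2 N/mm². The fitting window is a design choice; a different definition of the linear region would change the stiffness but not the failure load or stress.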

3 Results

3.1 Characterization of Sr-MBG

Figure 2a depicts the schematic diagram for the synthesis of strontium-doped bioglass nanoparticles. The micromorphology of Sr-MBG and its distribution in nanofibers were observed by both SEM and TEM (Fig. 2b, c, d). All strontium-doped bioglass nanoparticles were regularly spherical with smooth surfaces, evenly arranged pores, and a diameter of about 40.40 ± 4.14 nm. Sr-MBG were uniformly dispersed in nanofibers without apparent morphological changes. Wide-scan XPS analysis showed that Si, Ca, and Sr were contained in the Sr-MBG (Fig. 2f), and the narrow scans of Si 2p (Fig. 2g), Ca 2p (Fig. 2h), and Sr 3d (Fig. 2i) indicated a shift in binding energy in the Sr-MBG. The elemental composition of the Sr-MBG was determined by energy dispersive spectroscopy (EDS). EDS mapping revealed that Sr, Ca, and Si were distributed within the Sr-MBG (Fig. 2j). The XPS and EDS results confirmed that strontium was successfully doped into the bioglass nanoparticles.


Fig. 2. Characterization of the Sr-MBG. a The schematic diagram for the synthesis of strontium-doped bioglass nanoparticles. b SEM image of Sr-MBG. c and d TEM images of Sr-MBG and its distribution in nanofibers. Orange arrow: Sr-MBG in nanofiber. e The average diameter of Sr-MBG. f Wide-scan XPS of Sr-MBG. g–i Narrow scan of Si 2p, Ca 2p, and Sr 3d. j EDS mapping of Sr-MBG.

3.2 Characterization of Nanofibers Scaffolds

The micromorphology and average diameters of NS, NS1, and SDNS are shown in Fig. 3a. SEM images of all samples revealed that the nanofibers were randomly oriented and had three-dimensional porous structures. The average diameter of NS was 0.99 ± 0.15 nm, whereas the average diameter of NS1 increased to 1.47 ± 0.24 nm. The presence of Sr-MBG did not appear to influence the morphology of the nanofibers, and the average diameter of SDNS was 1.46 ± 0.23 nm. The porosity of NS (0.84 ± 0.03%), NS1 (0.77 ± 0.04%), and SDNS (0.75 ± 0.03%) did not differ significantly (Fig. 3b). PEUU is a hydrophobic material whose nanofibers scaffolds are not favorable to cell spreading and proliferation [17, 25]. Thus, to improve the hydrophilicity of PEUU, gelatin was blended with PEUU for electrospinning. The water contact angle of NS was 124.35 ± 3.92°, whereas that of NS1 was 47.35 ± 1.16°. The addition of Sr-MBG further decreased the water contact angle of SDNS (34.63 ± 2.38°) (Fig. 3a, c). Representative mechanical properties of NS, NS1, and SDNS are shown in Fig. 3d. Young’s modulus, tensile strength, and elongation at break were calculated from the stress-strain curves as previously described [18, 26]. Young’s modulus of NS, NS1


Fig. 3. Characterization of the nanofibers scaffolds. a SEM images and average diameters of NS, NS1, and SDNS. b The porosity of NS, NS1, and SDNS. c The water contact angle of NS, NS1, and SDNS. d Mechanical properties of nanofibers scaffolds: Stress-strain curves, Young’s modulus, Tensile strength, and Elongation at break. e FTIR spectra of nanofibers scaffolds and Sr-MBG. f–h The ion release curves of Sr, Ca, and Si from NS, NS1, and SDNS. (*p < 0.05, **p < 0.01, and ***p < 0.001).

and SDNS was 11.40 ± 0.86 MPa, 6.07 ± 0.24 MPa, and 9.08 ± 0.26 MPa, respectively. The tensile strength was reduced when gelatin and Sr-MBG were incorporated into the nanofibers. The tensile strengths of NS, NS1, and SDNS were 17.77 ± 1.04 MPa, 11.88 ± 1.02 MPa, and 8.24 ± 0.51 MPa, respectively. The trend of elongation at break was similar to that of tensile strength. Figure 3e shows the FTIR spectra of the Sr-MBG, NS, NS1, and SDNS. The stretching vibration of O–H, present in large quantity in the gelatin, was reflected in the broad peak at 3500–3300 cm⁻¹. A high-intensity gelatin peak, attributable to N–H bonds, was observed at 1570 cm⁻¹ in NS1 and SDNS. Compared with NS and NS1, an extra peak was detected in SDNS at 1071 cm⁻¹, which corresponds to the stretching vibration of Si–O–Si. The FTIR spectra thus confirmed that gelatin and Sr-MBG were incorporated in SDNS. The strontium, calcium, and silicon ion release curves are shown in Fig. 3f, g, h. It was observed that the rate of ion release was


fast during the first seven days, whereas it slowed down significantly after that. Ions were still being released at 28 days, which indicated that SDNS exhibited a controlled release behavior.

3.3 Biocompatibility of Nanofibers Scaffolds

PEUU and gelatin are materials with good biocompatibility, but it was still necessary to confirm that SDNS was suitable for cell proliferation. Live/Dead staining and CCK-8 assays were carried out to measure the biocompatibility of the nanofibers scaffolds (Fig. 4a, b). At 1 and 5 days after seeding, no dead cells were observed on any of the nanofiber scaffolds, and BMSCs proliferated and migrated rapidly. The absorbance index of the samples did not differ significantly at one and three days. The absorbance index of SDNS (1.24 ± 0.05) was significantly higher than that of NS1 (0.89 ± 0.17) at five days, which might be related to the fact that strontium can promote BMSC proliferation [19, 27]. In addition, the gradually increasing absorbance index indicated that BMSCs could proliferate well on NS1 and SDNS. Taken together, these results confirmed that SDNS could support the migration and proliferation of BMSCs.

Fig. 4. Biocompatibility of nanofibers scaffolds. a Fluorescence images of living and dead BMSCs seeded on empty wells, NS1 and SDNS. b The absorbance index of BMSCs seeded on empty wells, NS1 and SDNS at 1, 3, and 5 days. (*p < 0.05).

3.4 Self-adaptive Dual-Inducible Effect of SDNS in Vitro

To explore whether SDNS was able to enhance stem cell differentiation in a manner adapted to the induction environment, the osteogenic and chondrogenic abilities of BMSCs were examined. The expression of osteogenesis markers was detected by immunofluorescent staining, RT-qPCR, and western blot to assess the effect of the nanofibers scaffolds on BMSC osteogenesis. After 14 days of osteogenic induction with the extract of nanofibers scaffolds, more green fluorescence could be observed in the cytoplasm, indicating that SDNS promoted the expression of collagen I and osteocalcin (Fig. 5a). The immunofluorescent intensity of Runx2, a transcription factor present in the nucleus, was also increased


with the SDNS extract induction medium (Fig. 5a). The gene expression fold changes of Col1a1, OCN, and Runx2 were significantly increased with the extract induction medium of SDNS (6.94 ± 0.31, 3.01 ± 0.49, and 2.09 ± 0.45, respectively) in comparison with those of NS1 and control (Fig. 5b, c, d). Western blot results were consistent with those of immunofluorescent staining and RT-qPCR (Fig. 5e). ARS and ALP staining were also used to evaluate the osteogenic ability of the extract of nanofiber scaffolds. After 14 days of induction, a greater extent of cellular mineralization and more ALP-positive cells were observed in cultures with the extract induction medium of SDNS (Fig. S1 in the ESM). The above results confirmed that the extract of SDNS could significantly promote osteogenesis in vitro.
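For readers unfamiliar with how RT-qPCR fold changes such as those above are typically obtained, the following sketch shows the widely used 2^(−ΔΔCt) calculation. The text defers the RT-qPCR details to the ESM and does not state which quantification method was applied, so this method, the use of a single reference gene, and all Ct values below are assumptions for illustration only.

```python
def fold_change(ct_target_treated, ct_ref_treated,
                ct_target_control, ct_ref_control):
    """Relative expression by the 2^(-ddCt) method (assumed, not confirmed).

    Each argument is a cycle threshold (Ct): the target gene and a
    reference gene, measured in the treated and the control sample.
    """
    delta_treated = ct_target_treated - ct_ref_treated   # dCt, treated
    delta_control = ct_target_control - ct_ref_control   # dCt, control
    delta_delta = delta_treated - delta_control          # ddCt
    return 2.0 ** (-delta_delta)


# Hypothetical Ct values: the target amplifies two cycles earlier in the
# treated sample while the reference gene is unchanged.
fc = fold_change(22.0, 18.0, 24.0, 18.0)
```

In this made-up case ΔΔCt is −2, so the fold change is 4. The calculation assumes roughly 100% amplification efficiency for both genes.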

Fig. 5. Effect of nanofibers scaffolds on osteogenesis in vitro. a Immunofluorescent staining of collagen I, OCN, and Runx2 of BMSCs after 14 days of osteogenic induction with the extract of nanofibers scaffolds. b–d RT-qPCR for Col1a1, OCN, and Runx2 mRNA expression change of BMSCs after 14 days of osteogenic induction with the extract of nanofibers scaffolds. e Western blot for collagen I, OCN, and Runx2 protein expression. (*p < 0.05, **p < 0.01, and ***p < 0.001).


The extract was also used to assess the effect of the nanofibers scaffolds on BMSC chondrogenesis. Interestingly, higher immunofluorescent intensities of collagen II and aggrecan in the cytoplasm and of Sox9 in the nucleus were observed after 21 days of chondrogenic induction with the extract induction medium of SDNS (Fig. 6a). RT-qPCR showed that the extract induction medium of SDNS promoted the expression of the chondrogenesis marker genes Col2a1, Aggrecan, and Sox9 (2.20 ± 0.30, 2.23 ± 0.44, and 1.68 ± 0.21, respectively) compared with the extract induction medium of NS1 and control (Fig. 6b, c, d). The protein expression of collagen II, aggrecan, and Sox9 was also detected by western blot, and the results were consistent with those of immunofluorescent staining and RT-qPCR (Fig. 6e). These findings suggested that SDNS could promote both chondrogenesis and osteogenesis of BMSCs, adapting to the induction microenvironment in vitro.

Fig. 6. Effect of nanofibers scaffolds on chondrogenesis in vitro. a Immunofluorescent staining of collagen II, Aggrecan, and Sox9 of BMSCs after 21 days of chondrogenic induction with the extract of nanofibers scaffolds. b–d RT-qPCR for Col2a1, Aggrecan, and Sox9 mRNA expression change of BMSCs after 21 days of chondrogenic induction with the extract of nanofibers scaffolds. e Western blot for collagen II, Aggrecan, and Sox9 protein expression. (*p < 0.05, **p < 0.01, and ***p < 0.001).


3.5 Effect of SDNS on Tendon-to-Bone Interface Synchronous Regeneration in Vivo

It has been reported that after rotator cuff tears, fibrocartilage layers at the tendon-to-bone interface fail to regenerate, and the greater tuberosity of the humerus experiences bone loss [20, 28]. Therefore, the most critical aspect of tendon-bone healing is to promote fibrocartilage layer regeneration while simultaneously enhancing bone regeneration in the greater tuberosity. In the native tendon-to-bone interface, the fibrochondrocytes were arranged in columns, and a tidemark could be observed between the uncalcified and calcified fibrocartilage layers (Fig. S2a in the ESM). Safranin O/Fast Green and toluidine blue staining showed apparent fibrocartilage layers between the tendon and bone (Fig. S2a in the ESM). At 4 and 8 weeks postoperatively, histology, Micro-CT, biomechanical testing, and immunofluorescence were performed to evaluate the regeneration of the tendon-to-bone interface (Fig. S2b in the ESM).

Micro-CT was used to analyze BMD and BV/TV of the greater tuberosity. The cross-section view at the maximum diameter of the humeral head showed that bone loss primarily occurred in the greater tuberosity of the humerus in the defect group, and there was only a small degree of bone regeneration in the simple repair and NS1 groups at four weeks. In contrast with the other three groups, SDNS significantly promoted the bone regeneration of the greater tuberosity at four and eight weeks (Fig. 7a). The analysis of BMD demonstrated that SDNS (0.63 ± 0.05 g/cm³) could increase bone density in comparison with the other groups (Fig. 7b). SDNS (58.36 ± 8.54%) was able to increase the BV/TV significantly at four weeks; however, there was no significant difference in BV/TV between the SDNS group (64.58 ± 3.85%) and the NS1 group (54.35 ± 5.47%) at eight weeks (Fig. 7c). These results indicated that SDNS facilitated bone regeneration at the tendon-to-bone interface.

HE and Safranin O/Fast Green staining showed no fibrocartilage regeneration in the defect group at four weeks. There was significantly more fibrocartilage regeneration in the SDNS group at four weeks than in either the simple repair group or the NS1 group (Fig. 7d, e). Little regenerated fibrocartilage was observed in the defect group, simple repair group, and NS1 group at eight weeks. In contrast, more regenerated fibrocartilage was observed and the tidemark was partially restored in the SDNS group, which indicated that SDNS was able to promote fibrocartilage layer regeneration at the tendon-to-bone interface (Fig. 7d, e). Toluidine blue staining showed that the metachromatic area of the SDNS group ((3.35 ± 0.60) × 10⁵ μm²) was larger than that of the simple repair group ((1.48 ± 0.43) × 10⁵ μm²) and the NS1 group ((2.03 ± 0.57) × 10⁵ μm²) at eight weeks (Fig. S3 in the ESM). The SDNS group exhibited the highest histological score (9.50 ± 1.64) among the four groups at eight weeks, implying that SDNS promoted tendon-to-bone interface regeneration effectively (Fig. 7f). Immunohistochemical staining was used to evaluate the fibrocartilage layer regeneration of the tendon-to-bone interface. A substantial amount of collagen II was observed in the SDNS group in contrast to the other three groups (Fig. 8a).
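The BV/TV percentages reported above express the bone volume as a fraction of the total volume of the region of interest. As a rough sketch of that ratio, assuming the region of interest is available as a nested list of voxel intensities and that bone is segmented by a single global threshold (both assumptions; the actual analysis was done in the SkyScan CT Analyser software, whose settings are not given here):

```python
def bv_tv_percent(roi_voxels, threshold):
    """BV/TV in percent for a 3D region of interest.

    roi_voxels: nested list [slice][row][column] of voxel intensities.
    threshold:  hypothetical global threshold; voxels at or above it
                are counted as bone.
    """
    total = 0
    bone = 0
    for slice_ in roi_voxels:
        for row in slice_:
            for value in row:
                total += 1
                if value >= threshold:
                    bone += 1
    if total == 0:
        raise ValueError("empty region of interest")
    return 100.0 * bone / total


# Hypothetical 2 x 2 x 2 region with four voxels above the threshold.
roi = [[[0, 200], [300, 50]],
       [[400, 10], [20, 500]]]
ratio = bv_tv_percent(roi, threshold=150)
```

Here four of eight voxels count as bone, giving 50%. Because all voxels have the same size, counting voxels is equivalent to comparing volumes.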


Fig. 7. Micro-CT and Histology analysis in vivo. a Micro-CT images of the cross-section view of the humeral head at four and eight weeks. b and c BMD and BV/TV of the tendon-to-bone interface. d and e HE and Safranin O/Fast Green staining of the tendon-to-bone interface. f Histological score of the defect group, simple repair group, NS1 group, and SDNS group. T: tendon, FC: fibrocartilage, B: bone. The area circled by the green dashed line represents the fibrocartilage. (*p < 0.05, **p < 0.01, and ***p < 0.001).

As previously reported, the mechanical properties were measured to assess the regenerative effect in vivo [21, 29]. A slight difference in the cross-sectional area was observed between the simple repair group and the defect group at four and eight weeks (Fig. 8b). At four weeks, the failure load of the SDNS group (14.40 ± 1.36 N) was significantly higher than that of the defect group (6.48 ± 0.29 N), simple repair group (8.53 ± 0.51 N), and NS1 group (8.93 ± 0.56 N) (Fig. 8c). The failure load of the SDNS group was significantly higher than that of the other three groups at eight weeks as well (Fig. 8c). The stiffness of the SDNS group (5.75 ± 1.10 N/mm) was higher than that of the other three groups at four weeks (Fig. 8d). However, there was no significant difference in


Fig. 8. Fibrocartilage regeneration and mechanical properties of the tendon-to-bone interface. a Immunohistochemical staining of collagen II at the tendon-to-bone interface. b–e Cross-section area, failure load, stiffness, and stress of the tendon-to-bone interface. T: tendon; FC: fibrocartilage; B: bone. (*p < 0.05, **p < 0.01, and ***p < 0.001).

stiffness between the SDNS group (5.75 ± 1.10 N/mm) and the NS1 group (5.70 ± 0.60 N/mm) at eight weeks. At four and eight weeks, the stress of the SDNS group (2.71 ± 0.34 N/mm² and 5.63 ± 0.58 N/mm², respectively) was significantly higher than that of the other three groups (Fig. 8e).

4 Discussion

In this study, we successfully manufactured SDNS, which could adaptively promote MSC osteogenesis and chondrogenesis depending on the specific induction microenvironment and enhance tendon-bone healing in a rotator cuff repair model. SDNS exhibited a controlled ion release profile and continued to release strontium ions until 28 days. In the different induction microenvironments, the extract of SDNS stimulated the osteogenic and chondrogenic differentiation of MSCs. SDNS promoted synchronous regeneration of fibrocartilage and bone layers and improved the biomechanical strength in vivo.

The tendon-to-bone interface is the structure connecting the tendon and the bone, two tissues with considerable discrepancies in mechanical properties [22, 31]. The primary function of the tendon-to-bone interface is to reduce stress concentration, which is closely related to its native inhomogeneous structure [23, 32]. In a prior study, Moffat et al. observed that tissue stiffness and Young’s modulus increased gradually from the fibrocartilage layer to the bone layer [24, 33]. This region-dependent mechanical inhomogeneity can reduce the mechanical property disparities between the different layers when subjected to an external force, thereby substantially decreasing the stress concentration and preventing excessive stress at the tendon-to-bone interface [24, 33]. However, even after surgical repair, the uncalcified and calcified fibrocartilage layers were found to be replaced by fibrous scar instead of regenerating [25, 34]. The fibrous scar is incapable of reducing stress concentrations, thus resulting in retears and adversely


affecting the surgical prognosis [5, 9]. In addition, the poor prognosis has also been associated with bone loss at the tendon-to-bone interface after injury, which might result from decreased mechanical loading and raised osteoclast activity [26, 36]. However, some tissue engineering scaffolds for tendon-bone healing mainly focus on bone regeneration at the tendon-to-bone interface but ignore the role and regeneration of the fibrocartilage layers [27, 38]. Consequently, encouraging the synchronous regeneration of fibrocartilage and bone layers at the tendon-to-bone interface to restore its functions can be an optimal strategy for facilitating tendon-bone healing.

Due to their multilineage potential, mesenchymal stem cells are ideal for regenerating a variety of tissues [28, 39]. The surrounding microenvironments have a significant impact on MSC differentiation. For instance, Lu et al. investigated the possible effect of the interaction between osteoblasts and fibroblasts on MSC differentiation using a coculture system and discovered that the expression of the fibrocartilage-specific markers collagen II and Sox9 in MSCs increased at the interface between osteoblasts and fibroblasts, implying that a chondrogenic environment might exist at the tendon-to-bone interface [8, 13]. Chae et al. prepared tendon decellularized matrix (TdECM bioink) and bone decellularized matrix (BdECM bioink) as bioinks loaded with mesenchymal stem cells [29, 40]. 3D printing was used to create a gradient multi-tissue scaffold with a biomimetic tendon-to-bone interface structure, in which the two ends of the scaffold were printed with TdECM bioink or BdECM bioink, and the middle was printed with a combination of the two bioinks [29, 40]. In their in vitro experiments, the mRNA and protein expression levels of collagen II and Sox9 were significantly higher in the middle part of the scaffold than at either end, indicating the existence of a chondrogenic microenvironment between the tendon and bone [29, 40]. MSCs at the tendon-to-bone interface are influenced not only by the tendon side but also by the bone side. The osteogenic differentiation ability of MSCs was significantly elevated after coculturing with osteoblasts, which may be related to growth factors secreted by osteoblasts and cell-to-cell interaction [5, 12]. Cell-to-cell interaction and factors secreted by the different cells exert synergetic effects on the behaviors of MSCs. Hence, we hypothesized that chondrogenic and osteogenic microenvironments coexist at the tendon-to-bone interface from the tendon side to the bone side. Nevertheless, these microenvironments are insufficient to support MSC chondrogenesis and osteogenesis, which makes tendon-to-bone interface regeneration arduous.

Inspired by the local microenvironments and the challenges of tendon-bone healing, we fabricated SDNS that can adapt to the different microenvironments and enhance the osteogenic and chondrogenic differentiation of MSCs, which promoted synchronous regeneration of fibrocartilage and bone layers in vivo. Strontium was first reported to promote the process of osteogenesis [30, 42]. It was found that incorporating strontium into bone tissue engineering scaffolds could stimulate osteogenic differentiation, inhibit osteoclast formation, and consequently encourage bone regeneration [31, 44]. Strontium can stimulate osteogenesis by activating the calcium-sensing receptor as well as the Wnt/β-catenin pathway and can suppress osteoclast


formation by inhibiting the receptor activator of nuclear factor-kappaB/receptor activator of nuclear factor-kappaB ligand/osteoprotegerin system. Recent studies have demonstrated that strontium can induce MSC chondrogenesis in vitro, alleviate osteoarthritis, and enhance articular cartilage regeneration in vivo [32]. The effects of strontium on enhancing chondrogenesis and easing osteoarthritis have been attributed to the activation of the hypoxia-inducible factor-1 pathway and the suppression of inflammation [13]. These findings indicated that strontium can regulate the biological behavior of mesenchymal stem cells in response to their surroundings. At 21 days, the ion release curve of SDNS revealed that the strontium concentration was approximately 0.25 mM, which was close to the reported working concentration of strontium [11]. Therefore, we postulated that the self-adaptive and dual-inducible effects of SDNS could be primarily related to the released strontium ions.

Metal ions can regulate tissue homeostasis and regeneration, but their therapeutic window is narrow [33]. Moreover, their concentration often exceeds the toxicity threshold when they are administered orally or by injection, leading to adverse effects [34]. To achieve controlled ion release and reduce the associated systemic side effects, we doped strontium into mesoporous bioglass nanoparticles and loaded Sr-MBG in SDNS by electrospinning. Sr-MBG was initially released from the nanofibers during the degradation process, and then Sr-MBG was further degraded to release strontium ions, thereby achieving a gradual and sustained release of ions.

However, our study has a few limitations. Firstly, the underlying mechanisms of the adaptive regulation of MSC differentiation by SDNS remain to be explored, which limits further research on SDNS. Secondly, the blood concentration of strontium ions and potential side effects were not measured. Thirdly, an acute rotator cuff injury in rats was used as the animal model in this study. However, the regenerative ability of rats is better than that of humans, and the most common rotator cuff tears seen in the clinic are chronic. Therefore, further testing and validation in a large animal model of chronic rotator cuff tears would enhance the credibility of SDNS in promoting tendon-bone healing.

In our current study, we successfully fabricated nanofibers scaffolds which could adapt to surrounding induction microenvironments and improve MSC osteogenesis as well as chondrogenesis. Strontium ions released from SDNS can function as the key regulatory factor for MSC differentiation. After implanting SDNS in the torn rotator cuff, synchronous regeneration of fibrocartilage and bone was observed in the tendon-to-bone interface. These results confirmed that SDNS was ideal for rapid and efficient tendon-bone healing. In addition to the tendon-to-bone interface, the motor system has other inhomogeneous interface tissues, such as osteochondral and myotendinous interfaces. By regulating distinct induction microenvironments, our findings offer a novel strategy for promoting, through metal ions, the in situ regeneration of inhomogeneous interfaces that are difficult to regenerate.

5 Conclusion

In summary, this study successfully synthesized strontium-doped mesoporous bioglass nanoparticles through a modulated sol-gel method. Sr-MBG were loaded in nanofibers by electrospinning to fabricate self-adaptive dual-inducible nanofibers scaffolds. SDNS


exhibited good biocompatibility and enhanced osteogenesis and chondrogenesis of mesenchymal stem cells in response to surrounding induction microenvironments. After implantation in the injured tendon-to-bone interface, SDNS was capable of adapting to distinct microenvironments and stimulated synchronous regeneration of fibrocartilage and bone layers at the tendon-to-bone interface, which significantly restored the biomechanical strength of the rotator cuff. Taken together, our study provides a novel perspective on the use of electrospun scaffolds for the in situ regeneration of inhomogeneous interface tissue by regulating microenvironments through metal ions.

Acknowledgment. This work was supported by the National Natural Science Foundation of China (Nos. 81871753, 31972923, 82272570), the Shanghai Talent Development Fund (No. 2021057), and the Shanghai Jiao Tong University Science and Technology Innovation Special Fund (No. 2021JCPT02).

References

1. Ahmad, Z., Al-Wattar, Z., Rushton, N., Akinfala, M., Dawson-Bowling, S., Ang, S.: Holding on by a thread: the continuing story of rotator cuff tears. Br. J. Hosp. Med. (Lond.) 82, 1–10 (2021)
2. Galatz, L.M., Ball, C.M., Teefey, S.A., Middleton, W.D., Yamaguchi, K.: The outcome and repair integrity of completely arthroscopically repaired large and massive rotator cuff tears. J. Bone Joint Surg. Am. 86A, 219–224 (2004)
3. Lei, T., et al.: Biomimetic strategies for tendon/ligament-to-bone interface regeneration. Bioact. Mater. 6, 2491–2510 (2021)
4. Lu, H.H., Thomopoulos, S.: Functional attachment of soft tissues to bone: development, healing, and tissue engineering. Annu. Rev. Biomed. Eng. 15, 201–226 (2013)
5. Zhu, C., Qiu, J., Thomopoulos, S., Xia, Y.: Augmenting tendon-to-bone repair with functionally graded scaffolds. Adv. Healthc. Mater. 10, e2002269 (2021)
6. Kovacevic, D., Suriani, R.J., Jr., Levine, W.N., Thomopoulos, S.: Augmentation of rotator cuff healing with orthobiologics. J. Am. Acad. Orthop. Surg. 30, e508–e516 (2022)
7. Kim, H., et al.: A novel 3D indirect co-culture system based on a collagen hydrogel scaffold for enhancing the osteogenesis of stem cells. J. Mater. Chem. B 8, 9481–9491 (2020)
8. Wang, I.E., Bogdanowicz, D.R., Mitroo, S., Shan, J., Kala, S., Lu, H.H.: Cellular interactions regulate stem cell differentiation in tri-culture. Connect. Tissue Res. 57, 476–487 (2016)
9. Dahl, S.G., et al.: Incorporation and distribution of strontium in bone. Bone 28, 446–453 (2001)
10. Jimenez, M., Abradelo, C., San Roman, J., Rojo, L.: Bibliographic review on the state of the art of strontium and zinc based regenerative therapies. Recent developments and clinical applications. J. Mater. Chem. B 7, 1974–1985 (2019)
11. Liu, D., et al.: 3D printed PCL/SrHA scaffold for enhanced bone regeneration. Chem. Eng. J. 362, 269–279 (2019)
12. Yu, H., et al.: Strontium ranelate promotes chondrogenesis through inhibition of the Wnt/beta-catenin pathway. Stem Cell Res. Ther. 12, 296 (2021)
13. Deng, C., et al.: Bioactive scaffolds for regeneration of cartilage and subchondral bone interface. Theranostics 8, 1940–1955 (2018)
14. Yu, K., et al.: Fabrication of poly(ester-urethane)urea elastomer/gelatin electrospun nanofibrous membranes for potential applications in skin tissue engineering. RSC Adv. 6, 73636–73644 (2016)


15. Bolam, S.M., et al.: Obesity impairs enthesis healing after rotator cuff repair in a rat model. Am. J. Sports Med. 49, 3959–3969 (2021)
16. Su, W., Wang, Z., Jiang, J., Liu, X., Zhao, J., Zhang, Z.: Promoting tendon to bone integration using graphene oxide-doped electrospun poly(lactic-co-glycolic acid) nanofibrous membrane. Int. J. Nanomed. 14, 1835–1847 (2019)
17. Wang, L., et al.: Crimped nanofiber scaffold mimicking tendon-to-bone interface for fatty-infiltrated massive rotator cuff repair. Bioact. Mater. (2022). https://doi.org/10.1016/j.bioactmat.2022.01.031
18. Huang, K., Su, W., Zhang, X., Chen, C., Zhao, S., Yan, X., Jiang, J., Zhu, T., Zhao, J.: Cowpea-like bi-lineage nanofiber mat for repairing chronic rotator cuff tear and inhibiting fatty infiltration. Chem. Eng. J. 392 (2020)
19. Li, Y., et al.: Strontium regulates stem cell fate during osteogenic differentiation through asymmetric cell division. Acta Biomater. 119, 432–443 (2021)
20. Huang, K., Du, J., Xu, J., Wu, C., Chen, C., Chen, S., Zhu, T., Jiang, J., Zhao, J.: Tendon-bone junction healing by injectable bioactive thermo-sensitive hydrogel based on inspiration of tendon-derived stem cells. Mater. Today Chem. 23 (2022)
21. Song, W., Ma, Z., Wang, C., Li, H., He, Y.: Pro-chondrogenic and immunomodulatory melatonin-loaded electrospun membranes for tendon-to-bone healing. J. Mater. Chem. B 7, 6564–6575 (2019)
22. Roffino, S., Camy, C., Foucault-Bertaud, A., Lamy, E., Pithioux, M., Chopard, A.: Negative impact of disuse and unloading on tendon enthesis structure and function. Life Sci. Space Res. (Amst.) 29, 46–52 (2021)
23. Deymier, A.C., et al.: The multiscale structural and mechanical effects of mouse supraspinatus muscle unloading on the mature enthesis. Acta Biomater. 83, 302–313 (2019)
24. Moffat, K.L., et al.: Characterization of the structure-function relationship at the ligament-to-bone interface. Proc. Natl. Acad. Sci. U.S.A. 105, 7947–7952 (2008)
25. Zumstein, M.A., Ladermann, A., Raniga, S., Schar, M.O.: The biology of rotator cuff healing. Orthop. Traumatol. Surg. Res. 103, S1–S10 (2017)
26. Kim, D.M., et al.: A combination treatment of raloxifene and vitamin D enhances bone-to-tendon healing of the rotator cuff in a rat model. Am. J. Sports Med. 48, 2161–2169 (2020)
27. Liao, H., et al.: Amorphous calcium phosphate nanoparticles using adenosine triphosphate as an organic phosphorus source for promoting tendon-bone healing. J. Nanobiotechnol. 19, 270 (2021)
28. Chen, Z., et al.: Challenges and perspectives of tendon-derived cell therapy for tendinopathy: from bench to bedside. Stem Cell Res. Ther. 13, 444 (2022)
29. Chae, S., et al.: 3D cell-printing of gradient multi-tissue interfaces for rotator cuff regeneration. Bioact. Mater. 19, 611–625 (2023)
30. Meunier, P.J., et al.: The effects of strontium ranelate on the risk of vertebral fracture in women with postmenopausal osteoporosis. N. Engl. J. Med. 350, 459–468 (2004)
31. Wu, Q., et al.: Strontium-incorporated bioceramic scaffolds for enhanced osteoporosis bone regeneration. Bone Res. 10, 55 (2022)
32. Cai, Z., Li, Y., Song, W., He, Y., Li, H., Liu, X.: Anti-inflammatory and prochondrogenic in situ-formed injectable hydrogel crosslinked by strontium-doped bioglass for cartilage regeneration. ACS Appl. Mater. Interfaces 13, 59772–59786 (2021)
33. Farmani, A.R., Salmeh, M.A., Golkar, Z., Moeinzadeh, A., Ghiasi, F.F., Amirabad, S.Z., Shoormeij, M.H., Mahdavinezhad, F., Momeni, S., Moradbeygi, F., Ai, J., Hardy, J.G., Mostafaei, A.: Li-doped bioactive ceramics: promising biomaterials for tissue engineering and regenerative medicine. J. Funct. Biomater. 13 (2022)
34. Todd, T., et al.: LiF@SiO2 nanocapsules for controlled lithium release and osteoarthritis treatment. Nano Res. 11, 5751–5760 (2018)

Medical Informatics

A Multifunctional Image Processing Tool for CT Data Standardization

Yiwei Gao1, Jinnan Hu1, Peijun Hu1, Chao Huang1, and Jingsong Li1,2(B)

1 Research Center for Healthcare Data Science, Zhejiang Laboratory, Yu Hang Street, Hangzhou, China
2 Engineering Research Center of EMR and Intelligent Expert System, Ministry of Education, College of Biomedical Engineering and Instrument Science, Zhejiang University, Hangzhou, China

Abstract. Computed tomography (CT) is one of the most commonly used imaging examinations in clinical practice, and CT images have been widely used in computer-aided diagnosis and disease-related studies. Because the content and format of original CT data are diverse, standardized operations, known as data preprocessing, are usually required before image processing or deep learning. The complexity of CT data often makes preprocessing time-consuming and labor-intensive, especially for multi-cohort studies of multi-sequence CT images, and the results directly affect the subsequent algorithms. This study establishes a universal CT data standardization process and tool that can greatly improve the efficiency of preprocessing. The established pipeline includes de-privacy, format conversion, data check, denoising, resampling and registration. The standardization process built into the tool is applicable to the preprocessing of most studies, and a variety of common algorithms are integrated to extend the user's choice.

Keywords: CT · Standardization · Preprocessing

1 Introduction

Computed tomography (CT) imaging plays an important role in the diagnosis of diseases. More and more researchers are committed to mining the potential value of CT images to promote the development of medical technology [1]. With the development of deep learning, research that combines CT images and neural networks to build automated medical diagnosis methods has become popular [2]. CT scans retrieved from different clinical institutions may vary in coordinate system, spatial resolution, gray-scale deviation, etc., because of different scanners and protocols. In addition, an established CT cohort may suffer from missing data or inconsistent data sizes, which impede subsequent automated image processing. Standardizing CT data through preprocessing is therefore necessary and important for subsequent image processing and research. However, manual preprocessing such as image registration, resolution resampling, data check and de-privacy takes a long time and cannot be performed uniformly across cohorts. Establishing a highly automated preprocessing pipeline or toolbox that outputs standard data is meaningful for efficient medical image studies and clinical research [3].

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024 G. Wang et al. (Eds.): APCMBE 2023, IFMBE Proceedings 104, pp. 243–250, 2024. https://doi.org/10.1007/978-3-031-51485-2_26


Common data preprocessing steps can be summarized as format conversion, data cleaning, and data normalization. CT data preprocessing involves processing both text and images with a wide variety of methods whose results are unstable. Researchers need to try different methods to find the one that performs best on their own data, which makes preprocessing a complex and time-consuming procedure. First, researchers need to convert the CT data from DICOM format to NIFTI or other formats that are convenient for feature extraction in deep learning. Second, the large amount of information in the metadata must be filtered to retain what is useful for the study, eliminate what is not, and protect patient privacy as much as possible. Finally, researchers need to selectively perform operations such as denoising, resampling, normalization, and registration on the initial CT images according to the needs of the research, to ensure that all CT images have the same structural information in parts that are not related to the research [4]. Researchers must also judge the quality of the preprocessing results themselves, which requires a certain amount of experience. As research based on CT images is increasing, developing an integrated CT image standardization tool is a good way to improve research efficiency [5].

CT data need to be preprocessed and standardized before a specific study is conducted. CT data from different sources have many quality issues, and individual studies require different preprocessing steps. Many researchers have explored CT data processing software. Jenkinson et al. developed FSL, a software package that can be used for MRI and CT data analysis but does not have strong preprocessing capabilities for CT [6]. Yushkevich et al. developed ITK-SNAP to help users with semi-automatic segmentation of 3D medical images; it contains some common preprocessing steps but is designed to serve segmentation studies and has limited generality for other research tasks [7]. Liu et al. released a toolkit called MedicalSeg, which supports the whole segmentation process including data preprocessing, model training, and model deployment [8]. MedicalSeg makes it convenient for medical practitioners to quickly establish a medical recognition model and perform image recognition efficiently, but it is too cumbersome to use as a preprocessing tool. These tools can standardize CT data to a certain extent, but their scope of application is limited, and general standardization processes and tools are still lacking.

The CT preprocessing requirements of different studies can be summarized into a common standardization process, which can guide the development of configurable CT image standardization tools that speed up studies for users. In view of this, an integrated and modular CT data standardization tool was developed to provide users with universal data preprocessing functions for research. While ensuring readability and operability, it gives users greater flexibility and makes it convenient to implement custom operations. By using this tool, users can save time on preprocessing and improve their efficiency.


2 Methods

The CT data standardization tool integrates several common preprocessing methods and supports unified standardization of CT data: it converts DICOM format into NIFTI and performs data check, denoising, resampling and registration operations on the target data. By filtering the tags in the metadata, the tool obtains the information related to the study and outputs it to the corresponding JSON file, while hiding the information irrelevant to the study. The process of CT data standardization is shown in Fig. 1. After CT data and configuration parameters are input, the tool processes the target data according to the requirements and outputs standardized CT image data without privacy information.

The CT data standardization tool contains 6 modules: the IO module, denoising module, resampling module, registration module, logging module and tag modifying module. In the IO, denoising, resampling and registration modules, users can choose whether to perform each operation and modify the relevant parameters. The logging module and tag modifying module are embedded in the above 4 modules and run automatically with them. The architecture of the tool is shown in Fig. 2.

2.1 IO Module

The IO module includes two main functions: format conversion and data check. Original CT data are mostly in DICOM format and stored slice by slice as 2D data, so a single scan generates a large number of files. To obtain the NIFTI format, which stores 3D data and is more convenient for deep learning research, this tool supports the conversion of DICOM format into NIFTI format and, at the same time, works with the tag modifying module to complete the de-privacy operation. The data check function includes 4 operations: duplicate check, tag check, mask check and coordinate check. The duplicate check operation finds whether the data set contains duplicate data and outputs their names.
This is more convenient and faster than manual deduplication. The tag check operation checks each data item for specified tags and fills missing values with specified content, so that the information contained in the metadata is consistent. By running the mask check operation, the tool can verify whether a mask of the same size as the image exists; for multi-sequence CT, it can determine whether a mask file is missing.

The anatomical coordinate system is the most important coordinate system in medical imaging. The anatomical coordinate systems currently used in CT are generally LPS (Left, Posterior, Superior), RAS (Right, Anterior, Superior), RPI (Right, Posterior, Inferior) and LAS (Left, Anterior, Superior). If the coordinate systems of the CT data are inconsistent, operations such as feature extraction, 3D reconstruction, or comparison of image results can produce unexpected effects. Unifying the coordinate system of the CT data is therefore a step that cannot be bypassed before the research starts. To ensure that the deep learning operations are carried out in a unified orientation, this tool can resample CT data into the coordinate system specified by the user. Instead of tedious manual checking, the coordinate check operation checks the coordinate system of the data and converts it to the user-specified one.

In this tool, the path parameters of the research data are split into multiple keywords of different levels, namely ‘data_dir’, ‘task_name’, ‘dataset_name’ and ‘study_name’. Users can establish the superior-subordinate relationship of CT data according to their needs. The tool can preprocess multiple studies and multiple cohorts, and can input or output data at a given level to facilitate subsequent algorithm research.
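The path keywords and two of the data check operations described for the IO module can be illustrated with a minimal Python sketch. It is not the tool's actual implementation: the function names, the nesting order of the four path keywords, the use of a SHA-256 file hash to detect duplicates, and the fill value "UNKNOWN" are all assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def study_path(data_dir, task_name, dataset_name, study_name):
    """Join the four path keywords into one directory (nesting order is assumed)."""
    return Path(data_dir) / task_name / dataset_name / study_name


def find_duplicates(root):
    """Return (first_seen, duplicate) pairs of files whose bytes are identical."""
    first_seen = {}
    duplicates = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        if digest in first_seen:
            duplicates.append((first_seen[digest], path))
        else:
            first_seen[digest] = path
    return duplicates


def fill_missing_tags(metadata, required_tags, fill_value="UNKNOWN"):
    """Return a copy of metadata with absent or empty required tags filled in."""
    filled = dict(metadata)
    for tag in required_tags:
        if filled.get(tag) in (None, ""):
            filled[tag] = fill_value
    return filled
```

Hashing file contents rather than comparing file names catches copies that were renamed between cohorts, at the cost of reading every file once.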

Fig. 1. Process of CT data standardization tool

Fig. 2. The architecture of the CT data standardization tool

2.2 Denoising Module

The denoising module denoises the data to reduce the influence of noise. It integrates 7 commonly used filtering methods, which suit various CT data and meet the basic needs of research: mean filter, median filter, Gaussian filter, bilateral filter, Non-Local Means (NLM) filter, box filter, and Block-Matching and 3D (BM3D) filter. Users can quickly compare the results of the various filtering methods and select the best one, saving time in finding and debugging denoising methods.

2.3 Resampling Module

The resampling module resamples data to the same origin, orientation and spatial resolution, solving the problem of varying slice thickness and size in the original CT data. This module integrates 4 resampling methods: nearest-neighbor resampling, linear resampling, B-spline resampling, and deep-learning-based super-resolution resampling along the Z-axis. Users can choose a method according to their needs and set the origin, direction and spatial resolution used to resample the target data.
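As a minimal sketch of what resampling to a common spatial resolution involves, the following Python code computes the number of output samples along one axis and linearly interpolates a 1D intensity profile. The function names are hypothetical, and the 5 mm input spacing is an assumption chosen so that 41 slices become 205 at 1 mm, which matches the slice counts reported for Data 1 in Table 2. The tool itself works on 3D volumes.

```python
def resampled_length(n_samples, old_spacing, new_spacing):
    """Number of samples needed to cover the same extent at the new spacing."""
    return max(1, round(n_samples * old_spacing / new_spacing))


def linear_resample_1d(values, old_spacing, new_spacing):
    """Linearly interpolate a 1D profile onto a grid with the new spacing."""
    if not values:
        return []
    n_out = resampled_length(len(values), old_spacing, new_spacing)
    last = len(values) - 1
    out = []
    for i in range(n_out):
        pos = i * new_spacing / old_spacing  # position in input index units
        lo = min(int(pos), last)
        hi = min(lo + 1, last)               # clamp at the final sample
        frac = pos - lo if lo < last else 0.0
        out.append(values[lo] * (1.0 - frac) + values[hi] * frac)
    return out


# Assumed example: 41 slices at 5 mm resampled to 1 mm.
print(resampled_length(41, 5.0, 1.0))  # 205
```

Nearest-neighbor resampling would replace the weighted sum with the closer of the two samples, which is the usual choice for label masks because it never creates new label values.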


2.4 Registration Module

The purpose of registration is to align CT images of different phases or time series, which is an essential step in image processing and clinical cohort studies. The tool offers 8 registration methods. Three are deep-learning-based methods for non-rigid registration, namely VoxelMorph [9], MONAI [10] and DeepReg [11]; the other five use mature frameworks, namely ANTs [12], NIFTY [13], DeedsBCV [14], SimpleElastix [15] and Demons. Among the mature frameworks, ANTs, SimpleElastix and Demons support both rigid and non-rigid registration, whereas NIFTY and DeedsBCV support only non-rigid registration. The tool also provides 4 evaluation methods: Mean Square Displacement (MSD), Mutual Information (MI), Normalized Correlation Coefficient (NCC), and Mean Square (MS). With this tool, rigid and non-rigid registration can be performed with simple operations, and the registration results can be judged directly from the evaluation metrics.

2.5 Logging Module

The logging module outputs the user's operation logs for easy viewing and troubleshooting. The logs include the start time, run time, and user-specified operation parameters. If an operation succeeds, the result is output; otherwise, the reason for the failure is output. Log files are named by date, with a new file created each day and files kept for a 30-day cycle.

2.6 Tag Modifying Module

The tag modifying module extracts the metadata of the original data and performs the de-privacy operation. The tool selects 49 tags that may be related to the research and stores them in JSON files that contain only metadata. Four of the 49 tags and their names are shown in Table 1. The tool treats all other tags as irrelevant to the research by default and hides them to achieve de-privacy.
During subsequent operations, if the values in these 49 tags are changed, the corresponding changes will be made in the JSON file. This module is embedded in the other modules of the tool and runs by default.

Table 1. Four of the 49 tags and their names

Tag         Name
00080021    Series date
00180015    Body part examined
00281050    Window center
00180050    Slice thickness
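The whitelist approach of the tag modifying module can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the metadata is represented as a plain dictionary keyed by tag number, only the four tags of Table 1 are whitelisted (the tool keeps 49), and the function names, JSON layout, and example values are hypothetical.

```python
import json

# The four tags listed in Table 1; the tool's full whitelist has 49 entries.
KEPT_TAGS = {
    "00080021": "Series date",
    "00180015": "Body part examined",
    "00281050": "Window center",
    "00180050": "Slice thickness",
}


def filter_tags(metadata, kept_tags=KEPT_TAGS):
    """Keep only whitelisted tags; every other tag is dropped."""
    return {
        tag: {"name": kept_tags[tag], "value": value}
        for tag, value in metadata.items()
        if tag in kept_tags
    }


def to_json(metadata):
    """Serialize the filtered metadata as a JSON string."""
    return json.dumps(filter_tags(metadata), indent=2, sort_keys=True)


# Hypothetical metadata: the patient name (0010,0010) is not whitelisted.
example = {
    "00100010": "DOE^JANE",
    "00080021": "20230518",
    "00180050": "5.0",
}
```

Calling `to_json(example)` yields a JSON document containing only the series date and slice thickness. A whitelist fails safe: a tag that nobody anticipated is hidden by default rather than leaked.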


3 Results

To evaluate the effectiveness of the proposed universal procedure, 6 abdominal CT data sets of varying size and resolution were used as the test set. The test set in DICOM format was input directly into the tool without any additional processing, and NIFTI files were output after de-privacy, mean denoising, linear resampling and MONAI registration, which aligned the arterial phase data with the venous phase data. The results of the CT data after the denoising, resampling and registration operations are shown in Table 2. Peak signal-to-noise ratio (PSNR) is used as the evaluation index before and after denoising. The resampling operation resamples all data to a resolution of (1, 1, 1), and the table shows the number of slices before and after the operation. NCC is used to evaluate the results of registration.

Table 2. Results of CT data after each operation

Module                         Data 1        Data 2        Data 3        Data 4        Data 5        Data 6
Denoising (PSNR)               44.6          36.29         41.51         37.81         37.58         37.74
Resampling (before/after)      41/205        48/240        43/215        46/230        46/230        41/205
Registration (before/after)    −0.92/−0.85   −0.98/−0.92   −0.92/−0.86   −0.98/−0.91   −0.98/−0.93   −0.90/−0.87
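For reference, the evaluation indices named above can be written down in a short Python sketch. It uses the textbook definitions on flattened lists of voxel intensities and is not the tool's code; the function names and the `max_value` argument are assumptions. Sign conventions also differ between toolkits (some report the negated correlation so that lower is better), whereas this sketch returns the plain coefficient in [-1, 1].

```python
import math


def mean_squares(a, b):
    """Mean squared intensity difference between two equally sized images."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def psnr(reference, test, max_value):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    mse = mean_squares(reference, test)
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)


def ncc(a, b):
    """Normalized correlation coefficient of two equally sized images."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(
        sum((x - mean_a) ** 2 for x in a) * sum((y - mean_b) ** 2 for y in b)
    )
    if den == 0:
        raise ValueError("NCC is undefined for a constant image")
    return num / den
```

PSNR depends on the intensity range passed as `max_value`, so values are only comparable between data sets when the same range is used.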

The CT data standardization tool can greatly improve efficiency and reduce the time spent on manual work. The average time taken by each module in processing the 6 cases is shown in Table 3. Times are given in seconds, and the size of the 6 cases is 16.3 ± 1.1 MB. The tool quickly carries out CT data standardization, which greatly saves the user's time.

Table 3. Time consumed by each module

Module      IO       Denoising    Resampling    Registration    Total
Time (s)    14.95    7.17         41.74         492.65          556.51

4 Discussion

Integrated CT data standardization tools are important because it is time-consuming to define standardization processes and methods individually. Although many medical imaging data standardization tools have been developed to simplify data preprocessing, they are rarely designed for CT data and are mostly applied to MRI data. This study establishes a universal process and an integrated standardization tool for CT data preprocessing in deep learning, which can be used to improve the efficiency of studies. The tool shortens the time needed for standardization and is effective in practical applications.


Although many studies have focused on CT data standardization, we would like to highlight the following advantages of our tool. First, this tool proposes a universal CT data standardization process that is applicable to most deep learning studies. Second, this tool integrates a variety of common preprocessing methods, including traditional algorithmic methods and deep learning-based methods, to meet as many needs as possible for users. Third, this tool modularizes each step to make it easy to choose, allowing users to use it flexibly.

5 Conclusions

The proposed universal process is well suited to the task of CT data standardization in deep learning and can be gradually applied to other clinical data. The standardization tool for original CT data proposed in this paper is feasible and practical for improving the efficiency and speed of CT data preprocessing for research purposes. Future work will include (1) carrying out clinical data standardization in more domains to test and verify the universal standardization process; (2) improving the performance and content of the current tool and optimizing the user experience by studying how users work with it; and (3) adding a visual interface to give users a more convenient way to use the tool.

Acknowledgment. The project is supported in part by the National Natural Science Foundation of China (No. 12101571) and Key Research Project of Zhejiang Laboratory (No. 2022ND0AC01).

References

1. Niyaz, U., Singh Sambyal, A., Devanand: Advances in deep learning techniques for medical image analysis. J. Grid Comput. 271–277 (2018)
2. Singh, S.P., Wang, L., Gupta, S., Goli, H., Padmanabhan, P., Gulyás, B., et al.: 3D deep learning on medical images: a review. Sensors 20(18), 5097 (2020)
3. Egmont-Petersen, M., de Ridder, D., Handels, H.: Image processing with neural networks – a review. BPRA Pattern Recognit. 35(10), 2279–2301 (2002)
4. Avants, B.B., Tustison, N.J., Stauffer, M., Song, G., Wu, B., Gee, J.C.: The insight ToolKit image registration framework. Front. Neuroinform. 28(8), 44 (2014)
5. Beers, A., et al.: DeepNeuro: an open-source deep learning toolbox for neuroimaging. Neuroinformatics 19(1), 127–140 (2019)
6. Jenkinson, M., Beckmann, C.F., Behrens, T.E., Woolrich, M.W., Smith, S.M.: FSL. NeuroImage 62, 782–790 (2012)
7. Yushkevich, P.A., Piven, J., Hazlett, H.C., Smith, R.G., Ho, S., Gee, J.C., Gerig, G.: User-guided 3D active contour segmentation of anatomical structures: significantly improved efficiency and reliability. Neuroimage 31(3), 1116–1128 (2006)
8. Yi, L., Lutao, C., Guowei, C., Zewu, W., Zeyu, C., Baohua, L., Yuying, H., et al.: PaddleSeg: A high-efficient development toolkit for image segmentation (2021). arXiv preprint arXiv:2101.06175
9. Guha, B., Amy, Z., Mert, R.S., John, V.G., Adrian, V.D., et al.: VoxelMorph: a learning framework for deformable medical image registration. Comput. Vis. Pattern Recognit. (2019). arXiv:abs/1809.05231(8), 1788–1800
10. M. J C, Wenqi L, Richard B, et al.: MONAI: An open-source framework for deep learning in healthcare (2022). arXiv preprint arXiv:2211.02701
11. Fu, Y., et al.: DeepReg: a deep learning toolkit for medical image registration. J. Open Source Softw. 5(55), 2705 (2020)
12. Avants, B.B., Tustison, N.J., Song, G., Cook, P.A., Klein, A., Gee, J.C.: A reproducible evaluation of ANTs similarity metric performance in brain image registration. Neuroimage 54, 2033–2044 (2011). 10.1016
13. Wenqi, L., Guotai, W., Lucas, F., Sébastien, O., Jorge Cardoso, M., Tom, V., et al.: On the compactness, efficiency, and representation of 3d convolutional networks: brain parcellation as a pretext task. Comput. Res. Repository 10265, 348–360 (2017)
14. Mattias, P.H., Mark, J., Michael, B., Julia, A.S., et al.: MRF-based deformable registration and ventilation estimation of lung CT. IEEE Trans. Med. Imaging 32(7), 1239–1248 (2013)
15. Marstal, K., Berendsen, F., Staring, M., Klein, S.: SimpleElastix: a user-friendly, multi-lingual library for medical image registration. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), pp. 574–582 (2016)

Effect of Schroth Exercise on Pulmonary Function and Exercise Capacity in Patients with Severe Adolescent Idiopathic Scoliosis

Wei Liu1,2, Christina Zong-Hao Ma1,3, Chang Liang Luo1,2, Yu Ying Li2, and Hui Dong Wu2(B)

1 Department of Biomedical Engineering, The Hong Kong Polytechnic University, Hong Kong SAR, China
2 Department of Prosthetic and Orthotic Engineering, School of Rehabilitation, Kunming Medical University, Kunming, China
3 Research Institute for Smart Ageing, The Hong Kong Polytechnic University, Hong Kong SAR, China

Abstract. Purpose: This study aimed to investigate the effect of Schroth exercise on pulmonary function and exercise capacity in patients with severe adolescent idiopathic scoliosis (AIS). Methods: Forty subjects with severe AIS were assigned to a Schroth exercise group or an aerobic exercise group and received the corresponding intervention. Pulmonary function and exercise capacity were assessed at baseline and at the 12-week follow-up visit. Results: After 12 weeks of intervention, all subjects presented improved pulmonary function and exercise capacity (p < 0.05). In the inter-group comparison, the Schroth exercise group demonstrated significantly greater enhancement in pulmonary function than the aerobic exercise group (p < 0.05). In addition, subjects in the Schroth exercise group demonstrated a significantly longer average walking distance in the 6-min walk test (p < 0.05) and a lower average Borg fatigue score (p < 0.05). Conclusion: Schroth exercise offered better improvement of pulmonary function and exercise capacity in patients with severe AIS than aerobic exercise.

Keywords: Schroth exercise · Adolescent idiopathic scoliosis · Pulmonary function · Exercise capacity

1 Introduction

Adolescent idiopathic scoliosis (AIS) is a complex three-dimensional (3D) spinal deformity of unknown cause that occurs in adolescents. As it becomes more severe, it can affect the physiological shape [1], volume [2], and movement [2] of the rib cage, which may result in impaired pulmonary function such as a restrictive pattern of changes [3], impairment of the respiratory muscles [4], restricted and asymmetric movement of the chest [5], and restrictive ventilation abnormalities during activity [6]. Impaired pulmonary function may subsequently influence exercise capacity to different degrees [7].

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024 G. Wang et al. (Eds.): APCMBE 2023, IFMBE Proceedings 104, pp. 251–258, 2024. https://doi.org/10.1007/978-3-031-51485-2_27


According to the guideline recommended by the Scoliosis Research Society (SRS) [8], growing patients with Cobb > 45° are commonly prescribed surgical treatment to correct the spinal deformity and alleviate the impairment in pulmonary function and exercise capacity. However, postoperative complications have always been a concern for families. Pulmonary complications are among the most common and have been the leading cause of postoperative morbidity and mortality [9]. Although there is no proof of a clear link between preoperative pulmonary function and the incidence or seriousness of postoperative complications, effective management of preoperative pulmonary function helps prevent postoperative complications [10]. Koumbourlis [1] reported a higher risk of postoperative respiratory failure in patients with a vital capacity less than 40% of the standard value and maximum inspiratory/expiratory pressure […] 40° were analyzed, and results demonstrated advantageous impacts of Schroth exercise on pulmonary function and Cobb angle [22]. To strengthen the evidence, this study investigated the effect of Schroth exercise on pulmonary function and exercise capacity in patients with severe AIS.

2 Materials and Methods

2.1 Subjects

The subjects were selected from a local scoliosis center between January 2018 and December 2019. All subjects were in the preoperative stage. The inclusion criteria were: diagnosed with severe AIS (Cobb > 40°), aged 10–16 years, no contraindications to exercise training, and a signed consent form. Those who suffered from other cardiopulmonary diseases, nervous system diseases, musculoskeletal diseases, cognitive disorders, or mental diseases, or who were to receive surgery within 12 weeks, were excluded from this study. The eligible subjects were assigned to either the Schroth exercise group or the conventional aerobic exercise group. The authors' Institutional Review Board provided ethical approval.

2.2 Interventions

All subjects were prescribed either Schroth exercise or conventional aerobic exercise 4 times a week for 12 weeks [22]. The exercise load of each 60-min session was evaluated using the Rating of Perceived Exertion and kept at a score of 13–15, at which the subject felt the exercise to be slightly strenuous. Subjects exercised under the supervision of an experienced therapist.

Subjects in the Schroth exercise group received: (1) preparation, 10 min: low-load aerobic walking; (2) stretching of the chest, 5 min; (3) main exercise, 40 min: lying right/left back concave, side-lying static postural control training, sitting posture adjustment exercise, and stability and resistance exercises of the back and neck muscles, with heart rate (HR) at 60–85% of maximum HR (maximum HR = 220 − age); (4) moving the ribs, 5 min. The Schroth exercise was performed according to each subject's spinal shape and was accompanied by a 3D rotational breathing exercise [23].

Subjects in the conventional aerobic group received: (1) preparation, 10 min: low-load aerobic walking; (2) aerobic walking on a treadmill, 40 min, with HR at 60–85% of maximum HR (maximum HR = 220 − age); (3) relaxation, 10 min: low-dose aerobic walking and adequate manual massage.

2.3 Data Collection

Pulmonary function: Static pulmonary function of all subjects was evaluated at the baseline and 12-week follow-up visits using the Pulmonary Function Equipment (K4b2, COSMED S.R.L.). During the test, the subjects sat in a quiet room with adequate temperature and humidity; the equipment was checked and calibrated before the test began.
The parameters analyzed were forced vital capacity (FVC), forced expiratory volume in 1 s (FEV1), FEV1/FVC ratio, maximum inspiratory pressure (MIP), maximum expiratory pressure (MEP), and peak expiratory flow (PEF) [18]. To ensure accuracy, each pulmonary function test was conducted three times, and the mean values of the parameters were used for analysis.

Exercise capacity: The exercise capacity of all subjects was assessed at baseline and at 12 weeks using the 6-min walk test (6MWT). All assessments were done at around 3 pm. During the 6MWT, the subject wore light and comfortable clothing and shoes and was instructed to walk back and forth along a quiet 30-m corridor for 6 min at the fastest self-selected speed. At the end of each minute, the subject was encouraged and told the remaining time. If the subject had any physical discomfort during the test, such as chest pain or breathing difficulties, the test was stopped, and the walking time, distance, and reason were recorded. After the 6MWT, the total walking distance and immediate respiratory rate (breaths/minute) were recorded. Subjects were also asked to complete the modified Borg scale to assess their subjective fatigue after the 6MWT. The modified Borg scale score ranges from 0 to 10; a higher score represents more subjective fatigue.
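The heart-rate target prescribed for the main exercise in both interventions (60–85% of maximum HR, with maximum HR = 220 − age) is a simple calculation, sketched below in Python. The function name is hypothetical, and the age of 13 years is only an illustrative value within the 10–16 year inclusion range.

```python
def target_hr_range(age_years, low=0.60, high=0.85):
    """Return the (lower, upper) target heart rate in beats per minute."""
    max_hr = 220 - age_years  # age-predicted maximum heart rate
    return low * max_hr, high * max_hr


# Illustrative subject aged 13: maximum HR is 207 beats per minute.
lower, upper = target_hr_range(13)
print(f"{lower:.1f} to {upper:.1f} beats per minute")
```

For this illustrative age, the target zone is roughly 124 to 176 beats per minute.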


2.4 Statistical Analysis

SPSS (version 21, IBM, Chicago, IL, USA) was used to perform the statistical analysis with a significance level of 0.05. The normality of all data was assessed using the Shapiro–Wilk test, and results are presented as mean ± standard deviation. The independent-samples t-test (2-tailed) was used to compare results between groups, and the paired t-test (2-tailed) was used to evaluate changes within groups.
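To make the between-group comparison concrete, the following Python sketch computes the independent-samples t statistic from summary statistics alone. It is an illustration and not the SPSS analysis performed in the study: it assumes equal variances (the pooled form), the function name is hypothetical, and the inputs are the 12-week FEV1/FVC values reported in Table 2 (85.1 ± 3.2 vs. 80.2 ± 6.3, n = 20 per group).

```python
import math


def independent_t(mean1, sd1, n1, mean2, sd2, n2):
    """Pooled-variance t statistic and degrees of freedom from summary data."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    std_error = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (mean1 - mean2) / std_error, n1 + n2 - 2


# 12-week FEV1/FVC (%) from Table 2: Schroth group vs. aerobic group.
t, df = independent_t(85.1, 3.2, 20, 80.2, 6.3, 20)
print(round(t, 2), df)  # 3.1 38
```

With 38 degrees of freedom, the two-tailed critical value at the 0.05 level is about 2.02, so a statistic of roughly 3.10 is consistent with the significant inter-group difference marked in Table 2. The paired t-test used for within-group changes needs the individual pre/post differences and cannot be reproduced from group means and standard deviations.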

3 Results

3.1 Subjects

Sixty-eight subjects were assessed, and forty were eligible for this study (20 in each group). Table 1 lists the demographic information of the subjects in each group. There was no significant difference between the two groups at baseline (p > 0.05).

Table 1. Baseline of the Schroth and aerobic groups (mean ± standard deviation)

               Schroth group (n = 20, 5M/15F)    Aerobic group (n = 20, 3M/17F)
Age (year)     13.2 ± 2.6                        13.5 ± 2.3
Height (cm)    158.3 ± 6.7                       160.2 ± 8.3
Weight (kg)    47.7 ± 4.9                        48.3 ± 6.6
BMI (kg/m2)    47.7 ± 4.9                        48.3 ± 6.6
Cobb (°)       59.4 ± 9.5                        58.1 ± 8.5

M: male; F: female; BMI: body mass index

3.2 Pulmonary Function

As shown in Table 2, after 12 weeks of intervention, the pulmonary function parameters of all subjects were significantly improved (p < 0.05). In the inter-group comparison, FEV1/FVC, MIP, MEP, and PEF were significantly higher in the Schroth group than in the aerobic group after 12 weeks (p < 0.05).

3.3 Exercise Capacity

After the 12-week exercise, the walking distance, Borg score, and respiratory rate of all subjects were significantly improved (p < 0.05), as shown in Table 3. In addition, the Borg score and respiratory rate were significantly lower in the Schroth group than in the aerobic group at the 12-week follow-up visit (p < 0.05).


Table 2. Intra- and inter-group comparisons of parameters of pulmonary function at pre-exercise and after 12-week exercise (mean ± standard deviation)

                  Schroth group                      Aerobic group
                  Pre-exercise     After 12-week     Pre-exercise     After 12-week
FVC (L)           2.8 ± 0.3        3.1 ± 0.4         2.7 ± 0.4        3.0 ± 0.4
FVC pred%         86.0 ± 4.2       91.1 ± 3.9        86.0 ± 4.0       90.6 ± 3.2
FEV1 (L)          2.3 ± 0.3        2.6 ± 0.4         2.3 ± 0.4        2.4 ± 0.3
FEV1 pred%        85.4 ± 4.8       91.5 ± 4.2        85.3 ± 5.4       89.7 ± 4.6
FEV1/FVC (%)      81.2 ± 5.1       85.1 ± 3.2*       83.2 ± 5.3       80.2 ± 6.3*
MIP (cmH2O)       42.0 ± 3.4       49.2 ± 3.7*       42.2 ± 3.0       46.3 ± 3.3*
MEP (cmH2O)       62.4 ± 3.7       72.3 ± 6.1*       61.8 ± 4.0       67.7 ± 4.0*
PEF (L/s)         5.3 ± 0.4        6.4 ± 0.5*        5.3 ± 0.5        5.9 ± 0.5*

* Inter-group difference was statistically significant (p < 0.05)

Table 3. Intra- and inter-group comparisons of parameters of exercise capacity at pre-exercise and after 12-week exercise (mean ± standard deviation)

                                    Schroth group                    Aerobic group
                                    Pre-exercise     12-week         Pre-exercise     12-week
Walking distance (m)                398.2 ± 10.4     449.1 ± 26.0    398.0 ± 10.3     436.0 ± 22.4
Borg score                          6.0 ± 0.8        3.6 ± 0.8*      6.0 ± 0.8        4.5 ± 0.9*
Respiratory rate (breaths/minute)   25.0 ± 2.7       21.2 ± 2.3*     25.3 ± 2.9       24.0 ± 2.5*

* Inter-group difference was statistically significant (p < 0.05)

4 Discussion

Because of the complex interconnections between the spine, ribs, and sternum, AIS can profoundly affect the shape and volume of the thorax [25], impede the movement of the ribs [1], reduce the compliance of the chest wall directly and the compliance of the lungs indirectly [1], and impair exercise capacity [26]. The effect may be associated with the severity of AIS. Patients with Cobb > 45° have a higher risk of respiratory disorder and respiratory muscle fatigue [1, 27] and may present decreased exercise tolerance [1, 7]. Surgical treatment, commonly recommended for these severe cases, can be accompanied by postoperative complications. Among the complications, respiratory dysfunction and decreased exercise capacity are two primary concerns for clinicians and families. Respiratory function has been considered an essential indicator for preoperative evaluation of patients with severe AIS [27]. Therefore, exploring a helpful approach to improving pulmonary function and exercise capacity in severe cases is attractive. The main finding of this study was that the Schroth exercise was superior to conventional aerobic exercise in improving pulmonary function and exercise capacity.

After the 12-week exercise, all subjects showed significant improvement in respiratory function (p < 0.05), in line with previous reports [16, 18, 20, 22]. Compared with the conventional aerobic exercise group, the FEV1/FVC of subjects in the Schroth exercise group was significantly greater (85% vs. 80%, p < 0.05), suggesting that the Schroth exercise could better relieve airflow obstruction. Furthermore, the MIP and MEP values of subjects in the Schroth group were significantly higher than those in the conventional aerobic exercise group (49 vs. 46 and 72 vs. 68 cmH2O, respectively, p < 0.05). This may be due to the improvement of respiratory muscle strength after the 12-week exercise [28], in which the Schroth exercise was superior to the conventional aerobic exercise. The PEF was significantly higher in the Schroth exercise group than in the conventional aerobic exercise group, indicating that Schroth exercise may better alleviate airway narrowing. These positive outcomes may be attributed to the integration of sensorimotor, postural and rotational breathing exercises with muscle strength training. The 3D rotational breathing changes the abnormal breathing strategy by inhaling into the concave side of the trunk and exhaling from the convex side. The pressure that respiration generates on the chest wall would benefit the correction of the deformed dimensions and geometry of the thoracic cage and increase lung volume and compliance, improving pulmonary function [29].
The exercise capacity of all subjects was significantly improved after the 12-week exercise (p < 0.05), which was in line with the previous reports [16, 18, 20]. The Schroth exercise was significantly superior to the conventional aerobic exercise in improving subjects’ respiratory rate and Borg score (3.6 vs. 4.5, 21.2 vs. 24.0, p < 0.05). Although there was no evidence to support the direct relationship between pulmonary function and exercise capacity, the improvement of exercise capacity was associated with improved pulmonary function. The superior outcome of Schroth exercise on pulmonary function could result in a superior outcome on the exercise capacity. For the walking distance, subjects in the Schroth exercise walked an average of 13 m longer than those in the conventional aerobic exercise group after 12-week exercise. However, the difference (13 m) was smaller than the minimum clinically significant difference (32 m) [30], so this finding reserved further investigation. There were several limitations to this study. Only the short-term effect of the Schroth exercise was studied, so the long-term effect was reserved for further study. Besides, this study was not a large-sample study. Future studies should recruit more eligible subjects to confirm the related findings. Nevertheless, this study provided evidence for applying Schroth exercise in patients with severe AIS and established a base for future relevant studies. Due to its complexity, the Schroth exercise should be performed under the supervision of a professional physical therapist.


5 Conclusions

The Schroth exercise was superior to conventional aerobic exercise in improving the respiratory function and exercise capacity of patients with severe AIS. It has the potential to be used as a helpful method for improving the preoperative respiratory function and exercise capacity of these patients, potentially decreasing the incidence of postoperative complications.

References

1. Koumbourlis, A.C.: Scoliosis and the respiratory system. Paediatr. Respir. Rev. 7, 152–160 (2006)
2. Yu, W., Zhang, Y.G., Zheng, G.Q., et al.: Influencing factors of preoperative pulmonary function and clinical significance for patients with adolescent idiopathic scoliosis. J. Spinal Surg. 12, 81–86 (2014)
3. Posadzki, P., Lee, M.S., Ernst, E.: Osteopathic manipulative treatment for pediatric conditions: a systematic review. Pediatrics 132, 140–152 (2013)
4. Weiss, H.R., Maier-Hennes, A.: Specific exercises in the treatment of scoliosis – differential indication. Stud. Health Technol. Inf. 135, 173–190 (2008)
5. Leong, J.C., Lu, W.W., Luk, K.D., et al.: Kinematics of the chest cage and spine during breathing in healthy individuals and in patients with adolescent idiopathic scoliosis. Spine 24, 1310–1315 (1999)
6. Smyth, R.J., Chapman, K.R., Wright, T.A., et al.: Ventilatory patterns during hypoxia, hypercapnia, and exercise in adolescents with mild scoliosis. Pediatrics 77, 692–697 (1986)
7. Agabegi, S., Kazemi, N., Sturm, P., et al.: Natural history of adolescent idiopathic scoliosis in skeletally mature patients: a critical review. J. Am. Acad. Orthop. Surg. 23, 714–723 (2015)
8. Scoliosis Research Society. Adolescent idiopathic scoliosis-treatment [Internet]. srs.org. [cited 2021 Apr 13]
9. Tsiligiannis, T., Grivas, T.: Pulmonary function in children with idiopathic scoliosis. Scoliosis 7, 7 (2012)
10. Negrini, S., Donzelli, S., Aulisa, A.G., et al.: 2016 SOSORT guidelines: orthopaedic and rehabilitation treatment of idiopathic scoliosis during growth. Scoliosis 13, 3 (2018)
11. Matsuo, T., So, R., Shimojo, N., et al.: Effect of aerobic exercise training followed by a low-calorie diet on metabolic syndrome risk factors in men. Nutr. Metab. Cardiovasc. 25, 832–838 (2015)
12. Negrini, S., Aulisa, A.G., Aulisa, L., Circo, A.B., de Mauroy, J.C., Durmala, J., et al.: 2011 SOSORT guidelines: orthopaedic and rehabilitation treatment of idiopathic scoliosis during growth. Scoliosis 7(1), 3 (2012)
13. Bettany-Saltikov, J., Parent, E., Romano, M., Villagrasa, M., Negrini, S.: Physiotherapeutic scoliosis-specific exercises for adolescents with idiopathic scoliosis. Eur. J. Phys. Rehabil. Med. 50(1), 111–121 (2014)
14. Fusco, C., Zaina, F., Atanasio, S., Romano, M., Negrini, A., Negrini, S.: Physical exercises in the treatment of adolescent idiopathic scoliosis: an updated systematic review. Physiother. Theory Pract. 27(1), 80–114 (2011)
15. Hennes, A.: Schroth-Method, p. 1. Asklepios Katharina Schroth Klinik, Bad Sobernheim (2011)
16. Schreiber, S., Parent, E.C., Khodayari Moez, E., et al.: Schroth physiotherapeutic scoliosis-specific exercises added to the standard of care lead to better Cobb angle outcomes in adolescents with idiopathic scoliosis – an assessor and statistician blinded randomized controlled trial. PLoS One 11, e0168746 (2016)
17. Kuru, T., Yeldan, I., Dereli, E.E., et al.: The efficacy of three-dimensional Schroth exercises in adolescent idiopathic scoliosis: a randomised controlled clinical trial. Clin. Rehabil. 30, 181–190 (2016)
18. Burger, M., Coetzee, W., du Plessis, L.Z., et al.: The effectiveness of Schroth exercises in adolescents with idiopathic scoliosis: a systematic review and meta-analysis. S. Afr. J. Physiother. 75, 904 (2019)
19. Fan, Y., Ren, Q., To, M.K.T., et al.: Effectiveness of scoliosis-specific exercises for alleviating adolescent idiopathic scoliosis: a systematic review. BMC Musculoskel. Dis. 21, 495 (2020)
20. Schreiber, S., Parent, E.C., Moez, E.K., et al.: The effect of Schroth exercises added to the standard of care on the quality of life and muscle endurance in adolescents with idiopathic scoliosis-an assessor and statistician blinded randomized controlled trial: "SOSORT 2015 Award Winner." Scoliosis 10, 24 (2015)
21. Rigo, M., Reiter, C., Weiss, H.R.: Effect of conservative management on the prevalence of surgery in patients with adolescent idiopathic scoliosis. Pediatr. Rehabil. 6(3–4), 209–214 (2003)
22. Kim, K.D., Hwangbo, P.N.: Effects of the Schroth exercise on the Cobb's angle and vital capacity of patients with idiopathic scoliosis that is an operative indication. J. Phys. Ther. Sci. 28, 923–926 (2016)
23. Kim, B.J.: A comparison on the influences of Schroth-based static scoliosis exercise and asymmetric scoliosis exercise on the patients with scoliosis. Ph.D. thesis, Graduate School of Daegu University (2014)
24. Rekha, Y.B., Rao, V.S.P.: Evaluation of pulmonary function in adolescent idiopathic thoracic scoliosis. Int. J. Orthop. Sci. 3, 665–670 (2017)
25. Liu, W., Wu, H.D., Liu, Y., et al.: Comparison of orthosis and exercise training for adolescent idiopathic scoliosis. Chin. J. Rehabil. Theory Pract. 25, 869–874 (2019)
26. Fornias, S.E., Carlos, V.M., Sales, A.A., et al.: Functional exercise capacity, lung function and chest wall deformity in patients with adolescent idiopathic scoliosis. Fisioter. Mov. 28, 563–572 (2015)
27. Lao, L.F., Shen, J.X., Weng, X.S., et al.: Evaluation of preoperative pulmonary function test in severe scoliosis and its clinical significance. Chin. J. Clin. 7, 5880–5883 (2013)
28. Alves, V.L.D.S., Avanzi, O.: Respiratory muscle strength in idiopathic scoliosis after training program. Acta Ortop. Bras. 24, 296–299 (2016)
29. Lin, Y., Tan, H., Rong, T., et al.: Impact of thoracic cage dimension and geometry on cardiopulmonary function in patients with congenital scoliosis: a prospective study. Spine 44, 1441–1448 (2019)
30. Shoemaker, M.J., Curtis, A.B., Vangsnes, E., et al.: Clinically meaningful change estimates for the six-minute walk test and daily activity in individuals with chronic heart failure. Cardiopulm. Phys. Ther. J. 24, 21–29 (2013)

An Imputation Approach to Electronic Medical Records Based on Time Series and Feature Association

Y. F. Yin1(B), Z. W. Yuan1, J. X. Yang1, and X. J. Bao2

1 College of Computer Science, Chongqing University, Shapingba District, No. 174 Shazhengjie, Chongqing, China
[email protected]
2 Maharishi University of Management, Fairfield, IA, USA

Abstract. Because of interruptions in network transmission, collected electronic medical records are often incomplete. Imputing the missing values is therefore of great significance, and the main challenge is to predict the missing data accurately. Current state-of-the-art methods often employ temporal relationships to predict the missing data while ignoring the correlation between similar features. This motivates us to explore the association relationships of similar features and to fuse them with time series data mining. In this paper, we propose a novel time series and feature association approach, GRU-RMF, which can effectively improve the accuracy of missing value imputation in electronic medical records. In GRU-RMF, we use the matrix factorization principle to decompose electronic medical records into a time feature matrix and a spatial feature matrix; we design a nonlinear regularized GRU deep neural network that learns the temporal relationships in the time feature matrix; and we design a time-space feature fusion method with alternating solving that fuses the feature correlation and the time correlation, improving the overall accuracy of missing value imputation. Experimental results on several public datasets show that the proposed GRU-RMF achieves higher imputation accuracy and better scalability.

Keywords: Deep learning · Electronic medical · GRU · Missing value imputation · Time series data

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024
G. Wang et al. (Eds.): APCMBE 2023, IFMBE Proceedings 104, pp. 259–276, 2024. https://doi.org/10.1007/978-3-031-51485-2_28

1 Introduction

Due to interruptions in network communication, data loss is a very common phenomenon [1, 2]. If the collected electronic medical records contain missing values, the normal use of the data is affected. Current state-of-the-art methods [3, 4] usually employ temporal relations to predict the missing values and then use the predicted values for imputation. However, if the data loss is severe, such as continuous and long-term loss, the temporal relations cannot be used to predict the missing values. Given these practical demands, matrix factorization methods have been studied more and more in recent years, and time regularized matrix factorization is a hot and difficult research topic. Early neural network and fuzzy membership methods [5, 6] can be leveraged to alleviate the above problems. Recently, fog computing methods [3, 4] and generative adversarial network methods [7, 8] have emerged, which can also improve the accuracy of missing value imputation to a certain extent.

Despite these successes, the association of similar features and its integration with time series methods have not been considered, although it could provide another way to alleviate the problem of severe missingness. For example, in Fig. 1, a4 is heavily missing; if only the historical data of a4 are used to predict its missing data, the prediction accuracy is very low. This motivates us to explore other features similar to a4 and to exploit them to assist the imputation of a4. To characterize the time and spatial features of a4, we use the matrix factorization principle to decompose electronic medical records into a time feature matrix and a spatial feature matrix. Further, to learn the temporal relationships in the time feature matrix, we design a nonlinear regularized GRU deep neural network. Moreover, to fuse the feature correlation and the time correlation, we design a time-space feature fusion method with alternating solving, which can improve the overall accuracy of missing value imputation. The experimental results show that our approach, GRU-RMF, has higher prediction accuracy than the current state-of-the-art methods [3, 4].

Fig. 1. Missing data of medical devices. In Fig. 1, a medical device collects data from 6 different channels at the same time: a1, a2, a3, a4, a5, and a6, where the missing problem of a4 is the most serious. If the missing data of a4 are only imputed by its own time series relationship, the imputation accuracy is very low. If the relationship between a4 and a1, a2, a3, a5, a6 can be discovered, and this relationship can be leveraged to assist the imputation of a4, the imputation accuracy of a4 will certainly be improved.

In this paper, our motivation is to exploit electronic medical records with few missing values to assist the prediction and imputation of records with many missing values, according to their association relationships. Our approach is a comprehensive method that combines approximate matrix factorization, GRU-based nonlinear gating regularization, and alternating solving. Our main contributions are outlined as follows:


(1) We propose a novel missing value imputation approach, GRU-RMF, which can alleviate the problem of consecutively missing electronic medical records.
(2) We design the GRU-RMF neural network model, which includes a missing value imputation module based on matrix factorization, a time relationship mining module based on GRU, a fusion module, etc.
(3) We demonstrate theoretically and experimentally that our proposed GRU-RMF has better performance.

2 Related Work

2.1 Statistical Prediction Imputation Methods

The earliest method of dealing with missing values in electronic health records is "deletion" [9]. Its main idea is to keep complete records and discard incomplete ones. However, this method may remove important information, resulting in large errors when the data are used, and the error becomes unusually large when values are missing consecutively. Therefore, missing values can instead be estimated by mathematical statistical methods [10, 11], such as mean value estimation and median value estimation. When electronic medical features follow a normal distribution, statistical prediction imputation performs better; however, this type of method does not take advantage of the correlation between values at different times, and its main disadvantage is that the imputation accuracy is very low in most cases.

2.2 Data Mining Prediction Imputation Methods

Data mining methods [12, 13] are an important way to discover the underlying knowledge and rules in data. Such methods include neural networks, decision trees, fuzzy mathematics, etc. For data of a certain scale, data mining methods can extract a pattern and then impute missing values according to the discovered pattern. Frequently used methods include the K-nearest neighbor (KNN) algorithm [14], neural network methods [15], maximum likelihood estimation, and the expectation maximization algorithm [16, 17]. This kind of method relies heavily on prior knowledge and does not adequately take into account the temporal characteristics of time series data, making it difficult to deal with data with high missing rates.

2.3 Deep Learning Prediction Imputation Methods

Deep learning methods [18, 19] can learn the distribution of the data and impute the missing values accordingly. Such methods include matrix factorization [4], denoising encoders [20], generative adversarial networks [21, 22], etc. These methods require a lot of training before imputation. For example, Yoon et al. [23] used a generative adversarial network in which additional information is provided to the discriminator as a hint, so that the generator can generate samples according to the distribution of the existing data. However, this method focuses on the imputation of discrete data and does not consider correlations across time. Yu et al. [4] improved the matrix factorization method and proposed taking time series data as the input of the matrix factorization model, which improved the accuracy of missing value imputation. Yoon et al. [24] proposed a bidirectional neural network model for missing value imputation. Cao et al. [25] studied the relationship between missing variables and used graph neural networks to impute missing values. Ahn et al. [26] investigated methods for imputing missing values in time series data. References [27, 28] estimated the missing values of multivariate time series from the local mean of the data and the last observed value. However, when dealing with consecutively missing time series data, the imputation accuracy of these methods drops sharply as the length of the missing period increases.

3 Missing Value Imputation of Electronic Medical Records

To alleviate the low imputation accuracy for consecutively missing electronic medical records, this paper proposes the GRU-RMF model, which is based on time series and feature association. In this section, the problem of missing electronic medical records is presented first, followed by an introduction to the GRU-RMF model.

3.1 Description of the Problem

Let the variable $Y$ satisfy Eq. (1).

$$Y = [y_1, \ldots, y_i, \ldots, y_T], \quad \text{where } y_i = \langle f_1, f_2, \ldots, f_n \rangle. \tag{1}$$

In Eq. (1), $Y$ represents the time series data in electronic medical records, and $y_i$ is the data at time $i$, which is a multivariate vector. If the dimension of $y_i$ is $n$, it is characterized by $\langle f_1, f_2, \ldots, f_n \rangle$. Taken as the research object, $Y$ is a tensor. For convenience, let $y_1, y_2, \ldots, y_T$ be the state values of $Y$ at times $1, 2, \ldots, T$. If only a few values are missing, they can be imputed from the time series relationship alone. However, when values are missing over a long period, imputation based only on the time series relationship performs poorly. Figure 2 shows a matrix of electronic medical records with consecutively missing values. The problem of imputing missing values in electronic medical records can be simplified to the problem of imputing the missing matrix in Fig. 2. In the next section, we explore a method that fuses matrix factorization and GRU regularization for missing value imputation in electronic medical records.

Fig. 2. Missing matrix. In Fig. 2, the part shaded in green represents consecutively missing data.


3.2 Our Approach: GRU-RMF

To effectively address the imputation of consecutively missing values, we propose GRU-RMF, which is designed around the features of electronic medical records. GRU-RMF can discover both the temporal relationships within multivariate time series data and the correlations between different features. Moreover, it leverages the temporal relationships and the feature relationships together to impute the missing values comprehensively. Figure 3 shows the structural framework of GRU-RMF.

Fig. 3. Framework of GRU-RMF model. In Fig. 3, the left side is the matrix factorization, and the right side is the nonlinear regularized GRU. The original input is decomposed into time feature matrix (denoted as Time slots) and spatial feature matrix (denoted as Features). The output of GRU regularization is the nonlinear regularization term, RGRU , which is used as a component of the loss function. * indicates matrix multiplication.

GRU-RMF has a nested structure. First, an approximate matrix factorization is performed on the electronic medical records with missing values (denoted as $Y = [y_{i,t}] \in \mathbb{R}^{n \times T}$). Next, the GRU nonlinear gating regularization method is introduced into the iterative update of the decomposed time feature matrix $X$. Finally, an alternating solving method is used to complete the matrix factorization and the iterative optimization of the parameters of the GRU neural network. Here, res is the residual between the result of matrix factorization and the original matrix, $R_x$ is the regularization term of the time feature matrix, $R_w$ is the regularization term of the spatial feature matrix, and $R_{GRU}$ is the GRU regularization term. The key parts of GRU-RMF are therefore the matrix factorization-based missing value imputation and the GRU-based time relationship mining.

Imputation of Missing Values Based on Matrix Factorization: The matrix factorization algorithm decomposes and then reconstructs the original data in order to find the correlation between the data and impute the missing values, which is an idea borrowed from collaborative filtering. In recent years, methods based on matrix factorization have been introduced into the field of time series data imputation [4]. Generally, matrix factorization-based methods decompose a data matrix into two low-dimensional matrices, which simultaneously extract relevant features from the original data, and then attempt to reconstruct the original matrix. During reconstruction, missing values are imputed. The data matrix to be factorized is $Y = [y_{i,t}] \in \mathbb{R}^{n \times T}$, abbreviated as $Y$. Taking the time series data of human physiological indicators as an example, the matrix $Y$ is obtained by the interaction of the feature vector and the time vector, where the dimension of the feature vector is $n$ and the dimension of the time vector is $T$. In this paper, we expand the feature vector into an $n \times r$ matrix, denoted as the spatial feature matrix $W \in \mathbb{R}^{n \times r}$, where $r$ is the embedding dimension. In the same way, the time vector is expanded into an $r \times T$ matrix, denoted as the time feature matrix $X \in \mathbb{R}^{r \times T}$. The matrix $Y$ is approximated by the product of the low-rank spatial feature matrix $W$ and the time feature matrix $X$, which completes the imputation of missing values, as shown in Eq. (2).

$$Y \approx W \ast X. \tag{2}$$
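As a minimal illustration of Eq. (2), the following NumPy sketch builds a toy low-rank matrix and fills a consecutive gap from the product of the two factors. The sizes, the random factors, and the use of NaN to mark missing entries are assumptions made for the example; in GRU-RMF the factors are learned rather than drawn at random.

```python
import numpy as np

rng = np.random.default_rng(0)
n, T, r = 4, 8, 2                      # features, time steps, embedding dimension

W = rng.normal(size=(n, r))            # spatial feature matrix, one row w_i per feature
X = rng.normal(size=(r, T))            # time feature matrix, one column x_t per time step
Y_hat = W @ X                          # low-rank reconstruction, shape (n, T)

# Pretend Y is only partially observed; NaN marks a missing entry.
Y = Y_hat.copy()
Y[2, 3:6] = np.nan                     # a consecutive gap in feature 2
mask = ~np.isnan(Y)                    # True where y_{i,t} is observed

# Keep observed values and fill the gaps from the reconstruction.
Y_imputed = np.where(mask, Y, Y_hat)
```

Each entry of the reconstruction is the inner product of one row of `W` with one column of `X`, which is why information from other features can reach a position where a feature's own history is missing.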

To impute missing values, a minimization problem is solved, for example by gradient descent or least squares, so that the matrix product approximates the data [4]. Equation (3) is the optimization objective.

$$\min_{W, X} \sum_{(i,t) \in \Omega} \left( y_{i,t} - w_i^T x_t \right)^2 + \lambda_w R_w(W) + \lambda_x R_x(X) \tag{3}$$

where $\Omega$ is the set of positions of non-zero elements in the original data matrix $Y$; $\sum_{(i,t) \in \Omega} \left( y_{i,t} - w_i^T x_t \right)^2$ is the square of the F-norm of the residual matrix, which describes the difference between $W \ast X$ and the original matrix $Y$; and $R_w$ and $R_x$ are the regularization terms of the matrices $W$ and $X$, respectively, which are used to prevent overfitting. Here $w_i$ denotes the $i$-th row of $W$ written as a vector and $x_t$ the $t$-th column of $X$. $R_w(W)$ and $R_x(X)$ are calculated as shown in Eqs. (4) and (5).

$$R_w(W) = \|W\|_F^2 = \sum_{i=1}^{n} \|w_i\|^2 = \sum_{i=1}^{n} w_i^T w_i. \tag{4}$$

$$R_x(X) = \|X\|_F^2 = \sum_{t=1}^{T} \|x_t\|^2 = \sum_{t=1}^{T} x_t^T x_t. \tag{5}$$
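The objective of Eq. (3) can be evaluated in a few lines. The sketch below is only an illustration: the function name `mf_objective` is hypothetical, and it assumes that missing entries are stored as NaN and described by a Boolean mask, whereas the text describes $\Omega$ as the positions of non-zero elements.

```python
import numpy as np

def mf_objective(Y, mask, W, X, lam_w, lam_x):
    """Eq. (3): squared error over observed entries plus Frobenius penalties."""
    residual = np.where(mask, Y - W @ X, 0.0)   # positions outside Omega contribute 0
    fit = np.sum(residual ** 2)
    r_w = np.sum(W ** 2)                        # Eq. (4): ||W||_F^2
    r_x = np.sum(X ** 2)                        # Eq. (5): ||X||_F^2
    return fit + lam_w * r_w + lam_x * r_x
```

Only the observed positions enter the data-fit term, so the factors are never penalized for disagreeing with a value that was not recorded.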

According to Eqs. (2)–(5), the interaction between the features in the original data matrix $Y$ can be captured. It should be noted that when $Y \in \mathbb{R}^{n \times T}$ is decomposed into $W \in \mathbb{R}^{n \times r}$ and $X \in \mathbb{R}^{r \times T}$, not only are the subsequent operations simple, but it is also easier to explain the interactions between $W$ and $X$, among the components of $W$, and among the components of $X$.

Time Relationship Mining Based on GRU: To mine the time correlation in electronic medical records, it is necessary, on the basis of Eq. (2), to further mine the time correlation between the components of $X$. For this purpose, we design a GRU-based nonlinear gating regularization method to mine long-term and short-term dependencies between time series. Since GRU is a deep neural network, the network needs to be trained so that it can learn the correlation between time series. Generally, the time correlation in the time feature matrix $X \in \mathbb{R}^{r \times T}$ refers to the relationship between the columns of $X$. Let $x_t$ and $x_{t-l}$ represent any two different columns of $X$, and construct a time correlation as in Eq. (6).

$$x_t \approx \sum_{l \in L} \theta_l \ast x_{t-l} \tag{6}$$

where $L$ is a set that stores the correlation between $t$ and $t-l$ in $X$, and $\theta_l$ is the coefficient vector of $x_{t-l}$. Because $l$ is arbitrary, combining all $\theta_l$ yields an association matrix, denoted as $\Theta$. Based on the time correlation among the columns of the time feature matrix $X \in \mathbb{R}^{r \times T}$, we introduce the GRU nonlinear gating regularization term $R_{GRU}(X \mid L, \Theta)$, as shown in Eq. (7). The matrix $\Theta$ and the time feature matrix $X$ are learned using a GRU deep neural network.

$$R_{GRU}(X \mid L, \Theta) = \frac{1}{2} \sum_{t = l_d + 1}^{T} \left\| x_t - f\left( x_{t-l_1}, x_{t-l_2}, \ldots, x_{t-l_d} \right) \right\|_2^2, \quad l_d \in L \tag{7}$$

where $X$ is the time feature matrix, $L$ is the set that stores the correlations, and $\Theta$ is the association matrix. Suppose $f$ is the transfer function of the neural network; then the historical time vectors $x_{t-l_1}, x_{t-l_2}, \ldots, x_{t-l_d}$ can be input into $f$ to obtain the prediction vector $\hat{x}_t$, that is, $\hat{x}_t = f\left( x_{t-l_1}, x_{t-l_2}, \ldots, x_{t-l_d} \right)$. Therefore, the optimization objective of the GRU-based nonlinear gating regularization is defined by Eq. (8).

$$\min_{W, X, \Theta} \frac{1}{2} \sum_{(i,t) \in \Omega} \left( y_{i,t} - w_i^T x_t \right)^2 + \frac{\lambda_w \eta}{2} \|W\|_F^2 + \frac{\lambda_x \eta}{2} \|X\|_F^2 + \frac{\lambda_x}{2} R_{GRU}(X \mid L, \Theta). \tag{8}$$
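To make the regularization term of Eq. (7) concrete, the sketch below evaluates it for any predictor `f` and supplies a tiny untrained GRU cell as one possible `f`. Everything about the network here is an assumption for illustration (a single cell, hidden size equal to $r$, no biases, the final hidden state used directly as the prediction, and one common convention for the gate update); the architecture actually used in GRU-RMF is not specified at this point in the text.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

class TinyGRU:
    """Hypothetical stand-in for f: a single GRU cell whose hidden size equals r."""
    def __init__(self, r, seed=0):
        rng = np.random.default_rng(seed)
        # Input weights (Wz, Wr, Wh) and recurrent weights (Uz, Ur, Uh), untrained.
        self.Wz, self.Wr, self.Wh, self.Uz, self.Ur, self.Uh = (
            0.1 * rng.normal(size=(r, r)) for _ in range(6))

    def __call__(self, history):
        """history: list of r-vectors ordered from oldest to newest."""
        h = np.zeros_like(history[0])
        for x in history:
            z = sigmoid(self.Wz @ x + self.Uz @ h)              # update gate
            g = sigmoid(self.Wr @ x + self.Ur @ h)              # reset gate
            h_tilde = np.tanh(self.Wh @ x + self.Uh @ (g * h))  # candidate state
            h = (1.0 - z) * h + z * h_tilde
        return h                                                # used as the prediction of x_t

def gru_regularizer(X, lags, f):
    """Eq. (7): 0.5 * sum over t = l_d + 1..T of ||x_t - f(x_{t-l_1}, ..., x_{t-l_d})||_2^2."""
    l_d = max(lags)
    total = 0.0
    for t in range(l_d, X.shape[1]):                            # 0-based columns l_d .. T-1
        history = [X[:, t - l] for l in sorted(lags, reverse=True)]
        total += np.sum((X[:, t] - f(history)) ** 2)
    return 0.5 * total
```

Because `gru_regularizer` accepts any callable, a trained recurrent network from a deep learning library could be substituted for `TinyGRU` without changing the rest of the computation.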

In Eq. (8), the first term is the residual sum of squares; the second and third terms are the L2-norm penalty terms, which are used to prevent overfitting; and the fourth term is the GRU time regularization term. $\Omega$ is the set of positions of non-zero elements in the original data matrix, and $\lambda_w$, $\lambda_x$, and $\eta$ are hyper-parameters.

Fusion Module: To achieve an effective fusion of feature correlation and time correlation, we use an alternating solving method to iteratively optimize the parameters of each matrix and of the neural network. The specific method is a "two-step crossing iteration", that is, each iteration includes two steps:

Step 1: Use Alternating Least Squares (ALS) [29] to update the spatial feature matrix $W$ and the time feature matrix $X$;
Step 2: Update the parameters $\Theta$ of the GRU neural network using back propagation.

In Step 1, the following operations are performed. First, fix the neural network parameters $\Theta$ of the GRU and calculate $\hat{x}_t = f_\theta\left( x_{t-l_1}, x_{t-l_2}, \ldots, x_{t-l_d} \right)$, $t = l_d + 1, \ldots, T$.


Next, the least-squares optimization objective for the spatial feature matrix $W$ and the time feature matrix $X$ is constructed, as shown in Eq. (9).

$$\min_{W, X} \frac{1}{2} \sum_{(i,t) \in \Omega} \left( y_{i,t} - w_i^T x_t \right)^2 + \frac{\lambda_w \eta}{2} \|W\|_F^2 + \frac{\lambda_x \eta}{2} \|X\|_F^2. \tag{9}$$

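Then, the least squares method is used to update the spatial feature vectors $w_i$, $i = 1, 2, \ldots, n$, as shown in Eq. (10).

$$w_i = \left( \sum_{t:(i,t) \in \Omega} x_t x_t^T + \lambda_w \eta I \right)^{-1} \left( \sum_{t:(i,t) \in \Omega} y_{i,t} x_t \right) \tag{10}$$

where $I$ is the identity matrix. The least squares method is used to update the time feature vectors $x_t$, $t = 1, 2, \ldots, l_d$, as shown in Eq. (11).

$$x_t = \left( \sum_{i:(i,t) \in \Omega} w_i w_i^T + \lambda_x \eta I \right)^{-1} \left( \sum_{i:(i,t) \in \Omega} y_{i,t} w_i \right). \tag{11}$$

For $t = l_d + 1, l_d + 2, \ldots, T$, the time feature vector $x_t$ is updated by the least squares method according to Eq. (12), which additionally uses the prediction $\hat{x}_t$.

$$x_t = \left( \sum_{i:(i,t) \in \Omega} w_i w_i^T + \lambda_x I + \lambda_x \eta I \right)^{-1} \left( \sum_{i:(i,t) \in \Omega} y_{i,t} w_i + \lambda_x \hat{x}_t \right). \tag{12}$$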
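The three closed-form updates of Step 1 can be sketched directly from Eqs. (10)–(12). The function names `update_W` and `update_X`, the NaN-plus-mask encoding of missing entries, and the argument `X_hat` (a matrix whose columns hold the network predictions $\hat{x}_t$) are assumptions of this example rather than details given in the paper.

```python
import numpy as np

def update_W(Y, mask, X, lam_w, eta):
    """Row-wise least-squares update of Eq. (10)."""
    n, r = Y.shape[0], X.shape[0]
    W = np.zeros((n, r))
    for i in range(n):
        obs = mask[i]                              # time steps t with (i, t) in Omega
        Xi = X[:, obs]                             # (r, number of observed steps)
        A = Xi @ Xi.T + lam_w * eta * np.eye(r)    # sum_t x_t x_t^T + lam_w * eta * I
        b = Xi @ Y[i, obs]                         # sum_t y_{i,t} x_t
        W[i] = np.linalg.solve(A, b)
    return W

def update_X(Y, mask, W, X_hat, l_d, lam_x, eta):
    """Column-wise updates: Eq. (11) for the first l_d columns, Eq. (12) afterwards."""
    r, T = W.shape[1], Y.shape[1]
    X = np.zeros((r, T))
    for t in range(T):
        obs = mask[:, t]                           # features i with (i, t) in Omega
        Wt = W[obs]                                # (number of observed features, r)
        A = Wt.T @ Wt + lam_x * eta * np.eye(r)    # sum_i w_i w_i^T + lam_x * eta * I
        b = Wt.T @ Y[obs, t]                       # sum_i y_{i,t} w_i
        if t >= l_d:                               # 1-based t = l_d + 1, ..., T
            A += lam_x * np.eye(r)
            b += lam_x * X_hat[:, t]
        X[:, t] = np.linalg.solve(A, b)
    return X
```

When every feature is missing at some time step $t > l_d$, the sums over observed features vanish and the update reduces to $x_t = \hat{x}_t / (1 + \eta)$, which shows how the predicted vector carries the imputation through a consecutive gap.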

In Step 2, the following operations are performed. First, a GRU-based optimization objective is constructed, as shown in Eq. (13).

$$\min_{\Theta} \sum_{t = l_d + 1}^{T} \left\| x_t - f\left( x_{t-l_1}, x_{t-l_2}, \ldots, x_{t-l_d} \right) \right\|_2^2. \tag{13}$$

Second, through the gradient descent and back propagation mechanisms of the GRU deep neural network, the neural network parameters $\Theta$ of the GRU are iteratively updated. Finally, convergence is evaluated according to the condition in Eq. (14).

$$C = \frac{\left\| {W^{(p+1)}}^{T} X^{(p+1)} - {W^{(p)}}^{T} X^{(p)} \right\|_F^2}{\left\| {W^{(p)}}^{T} X^{(p)} \right\|_F^2} \le \epsilon, \tag{14}$$

where $p$ and $p + 1$ denote iteration numbers, and the value of $\epsilon$ is $1 \times 10^{-4}$.
An Imputation Approach to Electronic Medical Records

267

The GRU-RMF model uses the current observation data $y_t$ to correct the previously predicted time feature vector $\hat{x}_t$. The corrected time feature vectors are then optimized by the least squares method to obtain the optimized feature vectors $x_t$. The optimized feature vector $x_t$ is combined with the pre-trained spatial feature matrix $W$ to impute the current observations, as shown in Eq. (15).

$$\hat{y}_t = W^T x_t. \tag{15}$$

By using the time regularization term based on the GRU deep neural network, the time feature vector $\hat{x}_{t+1}$ can be predicted at time $t$, as shown in Eq. (16).

$$\hat{x}_{t+1} = f_\theta\left( x_{t+1-l_1}, x_{t+1-l_2}, \ldots, x_{t+1-l_d} \right). \tag{16}$$

The predicted value at time $t + 1$ can be obtained by combining the predicted time feature vector $\hat{x}_{t+1}$ with the pre-trained spatial feature matrix $W$, as shown in Eq. (17).

$$y_{t+1} = W^T \hat{x}_{t+1}. \tag{17}$$

Thus, an iterative training process is formed, which continues until the training target is reached or training is terminated early.
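Equations (15)–(17) amount to two matrix-vector products and one call to the predictor. In the sketch below, the function names are hypothetical, `f` is any callable that maps a list of lagged vectors (oldest first) to a predicted vector, and $W$ is stored as an $n \times r$ array, so `W @ x` plays the role of $W^T x$ in the equations.

```python
import numpy as np

def impute_current(W, x_t):
    """Eq. (15): reconstruct all n features at time t from the optimized x_t."""
    return W @ x_t

def predict_next(W, X, lags, f):
    """Eqs. (16) and (17): predict the next time feature vector, then map it to feature space."""
    t_next = X.shape[1]                                    # 0-based index of the step after the last column
    history = [X[:, t_next - l] for l in sorted(lags, reverse=True)]
    x_next = f(history)                                    # Eq. (16)
    return x_next, W @ x_next                              # Eq. (17)
```

For example, with a simple linear extrapolation standing in for the trained network, `predict_next(W, X, [1, 2], lambda h: 2 * h[-1] - h[-2])` returns both the predicted time feature vector and the predicted values of all features.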

4 Experimental Method

To verify the effectiveness of the GRU-RMF model, we carried out experimental studies that address two questions: (1) whether the imputation accuracy of the GRU-RMF model is superior under different missing rates; (2) how the dimensions of the dataset affect the GRU-RMF model.

4.1 Experimental Setup

Like current state-of-the-art methods [3, 4], we used three publicly available e-health datasets, viz., Health-care, Perf-DS1, and Perf-DS2 [3, 30]. These electronic medical records cover human health indicators, such as physiological indicators of patients in the Intensive Care Unit (ICU) [3], as well as monitoring data for online medical care [30]. The data are provided by intensive care units and community hospitals and consist of human physiological indicators, such as body temperature, heart rate, blood sugar content, and electrocardiogram, among 35 indicators; each record is a multivariate time series with a length of 24–36 h. The average missing rate of the Health-care dataset is 80.67%, and its main task is patient classification. The Perf-DS1 dataset has a missing rate of 13% and shows obvious periodicity. The Perf-DS2 dataset suffers from severe consecutive missingness, with an average missing rate of 50%; its values are relatively stable over time, that is, there are no obvious peaks and troughs. Each of the three datasets is split into a training set and a test set in a ratio of 7:3. Since missing value imputation based on traditional statistical methods does not require training, those methods are applied directly to the test set. For a fair comparison, a secondary random missing rate is generated on top of the original datasets, with the random seed set to 1024.

In the comparison experiments, we compared the following methods: Zero [10], Mean [11], Last [12], KNN [14, 26], TRMF [4], BTMF [3, 4], and GRU-RMF.

• Zero [10]: The missing values are imputed by zero.
• Mean [11]: A global mean imputation algorithm, which uses the global mean to impute missing values.
• Last [12]: The last observation is used to impute the missing values.
• KNN [14, 26]: A k-nearest neighbor imputation algorithm, which uses KNN to find similar samples and imputes each missing value with the weighted average of its neighbors.
• TRMF [4]: A revised matrix factorization method, which assists the matrix factorization by defining a temporal relationship equation.
• BTMF [3, 4]: A Bayesian temporal factorization method, which uses vector autoregression to model the time factor matrix.
• GRU-RMF: The method proposed in this paper, which integrates GRU-based nonlinear gating regularization into the iterative optimization process of matrix factorization. It exploits both the correlation between time series data and the correlation between spatial features to perform comprehensive imputation.

The deep neural network model involves a large number of parameters that need to be trained. Like current state-of-the-art methods [3, 4], we use a normal distribution to initialize all parameters of the model; this initialization is carried out before training. The model time set l_d is determined by the time series length. During initialization, the dimensions of W and X are both 100, and the number of training epochs is 200.
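The secondary random masking and the three statistical baselines described above (Zero, Mean, Last) can be sketched as follows. This is a minimal illustration, not the authors' code: the paper only states that the seed is 1024, so the per-series masking scheme, the fallback of 0.0 for a leading gap in `impute_last`, and all function names are assumptions made here.

```python
import random
from typing import List, Optional, Sequence, Tuple

Series = List[Optional[float]]  # None marks a missing value


def add_random_missing(series: Sequence[Optional[float]], rate: float,
                       seed: int = 1024) -> Tuple[Series, List[int]]:
    """Hide about `rate` of the observed entries; return the masked copy and hidden indices."""
    rng = random.Random(seed)
    observed = [i for i, v in enumerate(series) if v is not None]
    hidden = sorted(rng.sample(observed, k=round(rate * len(observed))))
    masked = list(series)
    for i in hidden:
        masked[i] = None
    return masked, hidden


def impute_zero(series: Sequence[Optional[float]]) -> List[float]:
    """Zero baseline: every missing value becomes 0."""
    return [0.0 if v is None else v for v in series]


def impute_mean(series: Sequence[Optional[float]]) -> List[float]:
    """Mean baseline: every missing value becomes the mean of the observed values."""
    observed = [v for v in series if v is not None]
    mean = sum(observed) / len(observed) if observed else 0.0
    return [mean if v is None else v for v in series]


def impute_last(series: Sequence[Optional[float]]) -> List[float]:
    """Last baseline: carry the last observation forward (0.0 before the first one)."""
    filled: List[float] = []
    last: Optional[float] = None
    for v in series:
        if v is not None:
            last = v
        filled.append(last if last is not None else 0.0)
    return filled


# Toy body-temperature series with gaps.
series = [36.5, None, 37.0, None, None, 38.0]
zero_filled = impute_zero(series)
mean_filled = impute_mean(series)
last_filled = impute_last(series)

# Secondary masking of a fully observed toy series at a 30% rate.
full = [float(v) for v in range(1, 11)]
masked, hidden = add_random_missing(full, rate=0.3)
```

The hidden indices returned by `add_random_missing` identify the entries whose original values are known, which is what makes it possible to score an imputation against ground truth.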
The Adam optimizer is used for stochastic gradient descent training with a learning rate of 0.001 and a batch size of 64. To prevent the distribution of the dataset from adversely affecting the training process, all data are normalized to zero mean and unit variance. In the experiments, variables were mapped to the range between 0 and 1 using Sigmoid as the activation function.

4.2 Evaluation Criteria

To directly measure the imputation accuracy, the evaluation criteria used in this paper are the Root Mean Squared Error (RMSE) and the Mean Absolute Percentage Error (MAPE) between the original data and the imputed data, as shown in Eqs. (18) and (19).

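$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2} \tag{18}$$

$$\mathrm{MAPE} = \frac{100\%}{n}\sum_{i=1}^{n}\left|\frac{y_i - \hat{y}_i}{y_i}\right| \tag{19}$$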




In Eqs. (18) and (19), $n$ represents the number of samples, and $y_i$ and $\hat{y}_i$ represent the actual value and the predicted value at time $i$, respectively. RMSE and MAPE measure the gap between the original data and the imputed data: the smaller the RMSE and MAPE, the closer the imputed values are to the original data, and the better the imputation effect. To evaluate the classification effect, the AUC (Area Under Curve) criterion is adopted in this paper. AUC is the area under the Receiver Operating Characteristic (ROC) curve. Since the ROC curve is traced out by varying the decision threshold of the classification model, AUC is not sensitive to the proportion of positive and negative samples. Compared with evaluation criteria that are sensitive to this proportion, such as precision and recall, AUC can better distinguish the quality of a binary classification model. A higher AUC indicates better performance.
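For concreteness, the three criteria can be written in a few lines. The sketch below follows Eqs. (18) and (19) directly and computes AUC through its pairwise-ranking interpretation (the fraction of positive/negative pairs in which the positive sample receives the higher score, with ties counted as one half); the function names and toy values are ours, and `mape` assumes that no actual value is zero.

```python
import math
from typing import Sequence


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Eq. (18): root mean squared error."""
    n = len(actual)
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(actual, predicted)) / n)


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Eq. (19): mean absolute percentage error, in percent (actual values must be nonzero)."""
    n = len(actual)
    return 100.0 * sum(abs((y - p) / y) for y, p in zip(actual, predicted)) / n


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for lab, s in zip(labels, scores) if lab == 1]
    neg = [s for lab, s in zip(labels, scores) if lab == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


actual = [100.0, 200.0, 400.0]
imputed = [110.0, 180.0, 400.0]
rmse_value = rmse(actual, imputed)
mape_value = mape(actual, imputed)
zero_mape = mape(actual, [0.0, 0.0, 0.0])
auc_value = auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

Note that imputing zero for every entry makes each term of Eq. (19) equal to 1, so the MAPE of a Zero baseline is exactly 100% regardless of the data.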

5 Results

In the experiments, we implemented all the evaluation testbeds using PyTorch. To evaluate the imputation accuracy under different missing rates, the models are evaluated and compared on the three datasets under different missing rates. In the result tables, "underlined" identifies the three models that performed better, and "bold" identifies the model that performed best. Table 1 shows the imputation accuracy of the models Zero, Mean, Last, KNN, TRMF, BTMF, and GRU-RMF on the Health-care dataset under different missing rates.

Table 1. Imputation accuracy at different missing rates (Health-care dataset). The first column lists the models, the second and third columns give RMSE and MAPE under a random missing rate of 10%, and the fourth and fifth columns give RMSE and MAPE under a random missing rate of 20%.

| Models | RMSE (10%) | MAPE (10%) | RMSE (20%) | MAPE (20%) |
|---|---|---|---|---|
| Zero [10] | 144.16 | 100% | 156.02 | 100% |
| Mean [11] | 136.35 | 330.69% | 149.97 | 311.56% |
| Last [12] | 121.95 | 45.04% | 135.55 | 46.04% |
| KNN [26] | 162.44 | 46.71% | 140.18 | 42.12% |
| TRMF [4] | 89.53 | 43.30% | 104.99 | 45.35% |
| BTMF [3] | 82.77 | 40.61% | 97.50 | 42.33% |
| GRU-RMF | 73.91 | 26.48% | 87.31 | 30.97% |

As can be seen from Table 1, the imputation effects of different models on the Health-care dataset differ greatly. The GRU-RMF model has the best imputation effect. At the 10% missing rate, its RMSE is 49% lower than that of the Zero model and 10% lower than that of the BTMF model; its MAPE is about 73 percentage points lower than that of the Zero model and about 14 percentage points lower than that of the BTMF model. Because of the high missing rate of the Health-care dataset, imputation is difficult. If the imputation relies only on historical data of the same kind, for example using only the existing body temperature data to impute the missing body temperature data, the imputation effect is poor. The weak results of models such as Last and KNN in Table 1 illustrate this point. The GRU-RMF model leverages both the spatial feature relationship between dimensions and the time feature relationship between time series data, so its imputation effect is better.

Table 2 shows the imputation accuracy of the models Zero, Mean, Last, KNN, TRMF, BTMF, and GRU-RMF on the Perf-DS1 dataset under different missing rates. In Table 2, the GRU-RMF model exhibits outstanding performance on the Perf-DS1 dataset. On the RMSE criterion, the GRU-RMF model achieves an average improvement of 91% over the Zero model and 4% over the BTMF model. On the MAPE criterion, it improves on the Zero model by 91% and on the BTMF model by 1%. It is worth noting that the imputation effect of the BTMF model on the Perf-DS1 dataset is second only to the GRU-RMF model, and the KNN model ranks third. An analysis of the Perf-DS1 dataset shows that its missing rate is low, so the correlation between features and the correlation between time series are easy to capture. On the whole, the GRU-RMF model proposed in this paper captures both correlations better.
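The improvement figures quoted for the Health-care dataset can be checked against Table 1. The short calculation below assumes that the percentages are relative reductions computed from the 10% missing-rate column; this reading is ours, since the text does not state the formula. The helper name is hypothetical.

```python
def relative_reduction(baseline: float, proposed: float) -> float:
    """Percentage by which `proposed` is lower than `baseline`."""
    return 100.0 * (baseline - proposed) / baseline


# RMSE values from Table 1, 10% random missing rate.
rmse_zero, rmse_btmf, rmse_gru_rmf = 144.16, 82.77, 73.91
rmse_vs_zero = relative_reduction(rmse_zero, rmse_gru_rmf)  # about 48.7
rmse_vs_btmf = relative_reduction(rmse_btmf, rmse_gru_rmf)  # about 10.7

# MAPE values (in percent) from Table 1, 10% random missing rate.
mape_zero, mape_btmf, mape_gru_rmf = 100.0, 40.61, 26.48
mape_gap_vs_zero = mape_zero - mape_gru_rmf  # percentage points
mape_gap_vs_btmf = mape_btmf - mape_gru_rmf  # percentage points
```

Under this reading, the RMSE figures round to the 49% and 10% stated in the text, and the MAPE figures correspond to differences of roughly 73 and 14 percentage points.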
Table 3 shows the imputation accuracy of the models Zero, Mean, Last, KNN, TRMF, BTMF, and GRU-RMF on the Perf-DS2 dataset under different missing rates. It can be seen from Table 3 that the GRU-RMF model has the best average imputation effect on the Perf-DS2 dataset, followed by the Last model, with the BTMF model third. Because consecutive missingness in the Perf-DS2 dataset is severe and the average missing rate reaches over 50%, using the time series relationship alone or the spatial feature relationship alone does not impute well. The weak results of the TRMF and KNN models in Table 3 illustrate this point. The GRU-RMF model introduces GRU time regularization on top of feature correlation mining based on matrix factorization, and thereby captures and integrates both the time correlation and the feature correlation. Therefore, GRU-RMF achieves the best imputation effect.

Table 2. Imputation accuracy at different missing rates (Perf-DS1 dataset). The first column of the table is the models, the second column is the evaluation criteria, and the third to tenth columns are the experimental results of the criteria RMSE and MAPE under different random missing rates.

| Models | Evaluation | 10% | 20% | 30% | 40% | 50% | 60% | 70% | 80% |
|---|---|---|---|---|---|---|---|---|---|
| Zero [10] | RMSE | 57.58 | 57.60 | 57.59 | 57.57 | 57.59 | 57.59 | 57.59 | 57.60 |
| | MAPE | 100.0% | 100.0% | 100.0% | 100.0% | 100.0% | 100.0% | 100.0% | 100.0% |
| Mean [11] | RMSE | 10.00 | 12.14 | 12.25 | 12.23 | 12.16 | 12.16 | 12.17 | 12.19 |
| | MAPE | 23.71% | 31.00% | 31.79% | 31.08% | 31.00% | 30.79% | 30.89% | 31.07% |
| Last [12] | RMSE | 7.63 | 7.96 | 8.30 | 8.76 | 9.31 | 10.04 | 11.06 | 12.57 |
| | MAPE | 11.65% | 12.20% | 12.91% | 13.52% | 14.38% | 15.60% | 17.28% | 19.97% |
| KNN [26] | RMSE | 5.02 | 5.19 | 5.33 | 5.50 | 5.73 | 6.02 | 6.44 | 7.15 |
| | MAPE | 9.60% | 9.98% | 10.42% | 10.90% | 11.51% | 12.29% | 13.59% | 15.87% |
| TRMF [4] | RMSE | 8.09 | 8.42 | 8.75 | 8.99 | 9.21 | 9.38 | 9.56 | 9.76 |
| | MAPE | 19.47% | 20.20% | 21.19% | 21.70% | 22.29% | 22.63% | 23.15% | 23.69% |
| BTMF [3] | RMSE | 4.75 | 4.97 | 5.11 | 5.20 | 5.24 | 5.32 | 5.47 | 5.59 |
| | MAPE | 8.52% | 8.74% | 8.92% | 9.17% | 9.50% | 9.58% | 10.24% | 11.32% |
| GRU-RMF | RMSE | 4.66 | 4.74 | 4.82 | 4.89 | 5.00 | 5.07 | 5.27 | 5.53 |
| | MAPE | 8.57% | 8.73% | 8.89% | 9.04% | 9.32% | 9.52% | 10.08% | 10.75% |

Table 3. Imputation performance under different missing rates (Perf-DS2 dataset). The first column of the table is the models, the second column is the evaluation criteria, and the third to eighth columns are the experimental results of the criteria RMSE and MAPE under different random missing rates.

| Models | Evaluation | 10% | 20% | 30% | 40% | 50% | 60% |
|---|---|---|---|---|---|---|---|
| Zero [10] | RMSE | 48.54 | 49.08 | 49.50 | 49.39 | 49.23 | 49.37 |
| | MAPE | 100.0% | 100.0% | 100.0% | 100.0% | 100.0% | 100.0% |
| Mean [11] | RMSE | 36.52 | 36.94 | 37.57 | 37.65 | 37.72 | 38.14 |
| | MAPE | 1107% | 1089% | 1102% | 1097% | 1095% | 1066% |
| Last [12] | RMSE | 12.66 | 13.96 | 15.67 | 16.83 | 17.87 | 19.54 |
| | MAPE | 27.66% | 31.03% | 33.29% | 36.89% | 41.42% | 47.59% |
| KNN [26] | RMSE | 14.33 | 15.28 | 17.02 | 18.97 | 22.40 | 27.21 |
| | MAPE | 37.03% | 40.06% | 44.30% | 53.32% | 70.96% | 104.2% |
| TRMF [4] | RMSE | 19.63 | 20.65 | 22.51 | 23.36 | 23.99 | 25.12 |
| | MAPE | 111.7% | 121.3% | 128.9% | 136.6% | 143.4% | 149.0% |
| BTMF [3] | RMSE | 12.54 | 13.46 | 15.86 | 17.02 | 18.25 | 20.56 |
| | MAPE | 44.65% | 48.42% | 57.74% | 61.40% | 68.55% | 74.38% |
| GRU-RMF | RMSE | 10.99 | 10.95 | 13.20 | 13.47 | 14.23 | 15.09 |
| | MAPE | 44.99% | 47.49% | 50.76% | 56.99% | 61.40% | 66.02% |

6 Discussions

6.1 Usefulness of the Model

The integrity of electronic medical records is of great significance to the construction of intelligent medical care. The main challenge of electronic medical record imputation is to improve the accuracy of missing data prediction. Existing methods use historical data to predict missing data, and few consider the association relationships of similar features. We use a deep learning method to integrate the mining of similar feature association relationships with time series prediction, so as to improve the accuracy of missing value imputation in electronic medical records. In the field of electronic medical record imputation, our method improves by an average of 0.7123 on the RMSE criterion and 0.683 on the MAPE criterion compared to the state-of-the-art methods [3, 4]. Finding the association relationships of similar features can not only improve the accuracy of missing value imputation, but also provide a reference for online medical diagnosis research.

The imputed electronic medical records can be used for classification and regression prediction, and we discuss the performance of the imputed datasets on both tasks. The evaluation criterion for the classification task is AUC, and the evaluation criterion for the regression prediction task is RMSE. The classifier and the regression model are built from recurrent neural network (RNN) layers [2, 14], and the imputed datasets are used as input to train the models. The models are trained for 30 epochs with a learning rate of 0.005 and a dropout of 0.5, and the dimension of the hidden state in the RNN is 64.

Since the Health-care dataset is suitable for classification tasks, classification experiments are performed on the imputed Health-care dataset, as shown in Fig. 4. As can be seen from Fig. 4, the classification performance of each model is closely tied to its imputation effect. The GRU-RMF model has the best classification performance, followed by BTMF, with TRMF third. The poorly performing models are Zero and KNN, and the KNN model is even worse than the Zero model. Since the Perf-DS1 and Perf-DS2 datasets are suitable for regression prediction tasks, regression prediction experiments are performed on the imputed Perf-DS1 and Perf-DS2 datasets, as shown in Fig. 5.

Fig. 4. Classification effect on the imputed Health-care dataset. The abscissa shows the models Zero, Mean, Last, KNN, TRMF, BTMF, and GRU-RMF. The ordinate is the evaluation criterion AUC of each model. For classification tasks, the larger the AUC, the better the classification effect.

Fig. 5. Regression prediction effect on the imputed Perf-DS1 and Perf-DS2 datasets. The abscissa shows the models Zero, Mean, Last, KNN, TRMF, BTMF, and GRU-RMF. The ordinate is the evaluation criterion RMSE of each model. The left panel shows the experimental results on the Perf-DS1 dataset, and the right panel shows the experimental results on the Perf-DS2 dataset. For regression prediction tasks, the smaller the RMSE, the better the regression prediction effect.

It can be seen from Fig. 5 that when the missing rate of the original dataset is small, as in the Perf-DS1 dataset, a simple statistical imputation method can already achieve good results. For example, on the left side of Fig. 5, the regression prediction effect of the Mean model reaches RMSE = 0.489. When consecutive missingness in the original dataset is severe, as in the Perf-DS2 dataset, the statistical imputation methods are basically unusable, while our GRU-RMF still shows good performance. To sum up, our GRU-RMF outperforms the existing models in both classification and regression prediction.

6.2 Impact of Data Dimension on the Model

The impact of data dimension on the model refers to the impact of the number of features contained in the dataset on the model. To explore the impact of data dimension on the GRU-RMF model, it is necessary to select a dataset with many features. In the experimental study, the Perf-DS1 dataset has the largest number of features. Therefore, the Perf-DS1 dataset is used as an example to verify the impact of the data dimension on the GRU-RMF model. RMSE and MAPE are still chosen as the evaluation criteria. Figure 6 shows the experimental results of the impact of data dimension on the GRU-RMF model.

Fig. 6. Impact of different data dimensions on GRU-RMF. In Fig. 6, the abscissa is the data dimension, and the ordinate is the evaluation criteria RMSE and MAPE.

As can be seen from Fig. 6, with the change of data dimensions, the criteria RMSE and MAPE of the model GRU-RMF do not change significantly. This shows that the data dimension has little impact on the GRU-RMF model.

7 Conclusion

The imputation of missing values in electronic medical records is a research topic of great significance. For consecutively missing electronic medical data, we have proposed a novel approach, GRU-RMF, which uses the correlation of similar features and the correlation between time series to improve the accuracy of imputation. By investigating "matrix factorization" and "GRU time series mining", we have designed the GRU-RMF neural network, which includes a missing value imputation module based on matrix factorization, a time relationship mining module based on GRU, and a fusion module. Experimental studies have been conducted on three public electronic medical datasets, and the experimental results show that our GRU-RMF exceeds the current state-of-the-art methods. In addition, our GRU-RMF is not affected by the dimensions of the dataset and can well support downstream classification and regression tasks. In the future, we will explore the missing value imputation of electronic medical records based on few-shot learning, which we believe will motivate new algorithmic findings.

References

1. Wang, R.J., Pei, X.K., Zhu, J.Y., et al.: Multivariable time series forecasting using model fusion. Inf. Sci. 585, 262–274 (2022)
2. Li, Z.K., Liu, H., Zhao, J.B., et al.: A power system disturbance classification method robust to PMU data quality issues. IEEE Trans. Ind. Inf. 18(1), 130–142 (2022)
3. Duhayyim, M.A.I., Al-Wesabi, F.N., Marzouk, R.: Integration of fog computing for health record management using blockchain technology. CMC Comput. Mater. Continua 71(2), 4135–4149 (2022)
4. Yu, H.F., Rao, N., Dhillon, I.S.: Temporal regularized matrix factorization for high-dimensional time series prediction. Adv. Neural. Inf. Process. Syst. 29, 847–855 (2016)
5. Xu, H., Sun, G.P., Jiang, P., et al.: Water quality monitoring missing data filling method based on improved OCS-FCM. In: Proceedings of 2019 Chinese Automation Congress (CAC 2019), pp. 4291–4296 (2019)
6. Yu, J.Y., He, Y.L., Huang, J.S., et al.: A two-stage missing value imputation method based on autoencoder neural network. In: Proceedings of 2021 IEEE International Conference on Big Data (Big Data 2021), pp. 6064–6066 (2021)
7. Pati, S.K., Gupta, M.K., Shai, R., et al.: Missing value estimation of microarray data using Sim-GAN. Knowl. Inf. Syst. 64(10), 2661–2687 (2022)
8. Xiao, X., Zhang, Y.L., Yang, S., et al.: Efficient missing counts imputation of a bike-sharing system by generative adversarial network. IEEE Trans. Intell. Transp. Syst. 23(8), 13443–13451 (2022)
9. Sánchez-Morales, A., Sancho-Gómez, J., Martínez-García, J.-A.: Improving deep learning performance with missing values via deletion and compensation. Neural Comput. Appl. 32(17), 13233–13244 (2020)
10. Park, S., Li, C.T., Han, S., et al.: Learning sleep quality from daily logs. In: 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (KDD), pp. 2421–2429 (2019)
11. Robertson, T., Beveridge, G., Bromley, C.: Allostatic load as a predictor of all-cause and cause-specific mortality in the general population: evidence from the Scottish Health Survey. 12(8), 1–14 (2017)
12. Nickerson, P., Baharloo, R., Davoudi, A., et al.: Comparison of Gaussian processes methods to linear methods for imputation of sparse physiological time series. In: Annual International Conference of the IEEE Engineering in Medicine and Biology Society, pp. 4106–4109 (2018)
13. Lena, P.D., Sala, C., Prodi, A., et al.: Missing value estimation methods for DNA methylation data. Bioinformatics 35(19), 3786–3793 (2019)
14. García-Laencina, P.J., Sancho-Gómez, J.L., Figueiras-Vidal, A.R., et al.: K nearest neighbours with mutual information for simultaneous classification and missing data imputation. Neurocomputing 72(7–9), 1483–1493 (2009)
15. Iranfar, A., Arza, A., Atienza, D.: ReLearn: a robust machine learning framework in presence of missing data for multimodal stress detection from physiological signals. In: Annual International Conference of the IEEE Engineering in Medicine and Biology Society, pp. 535–541 (2021)
16. Liu, C.L., Soong, R.S., Lee, W.C., et al.: Predicting short-term survival after liver transplantation using machine learning. Sci. Rep. 10(1), 1–10 (2020)
17. Zhang, X.J., Savalei, V.: Examining the effect of missing data on RMSEA and CFI under normal theory full-information maximum likelihood. Struct. Equ. Model. 27(2), 219–239 (2020)
18. Lin, W.-C., Tsai, C.-F., Zhong, J.R.: Deep learning for missing value imputation of continuous data and the effect of data discretization. Knowl. Based Syst. 239 (2022)
19. Samad, M.D., Abrar, S., Diawara, N.: Missing value estimation using clustering and deep learning within multiple imputation framework. Knowl. Based Syst. 249 (2022)
20. Gondara, L., Wang, K.: Multiple imputation using deep denoising autoencoders. In: Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD 2018): Advances in Knowledge Discovery and Data Mining, pp. 260–272 (2018)
21. Yu, L., Zhang, W., Wang, J., et al.: Seqgan: sequence generative adversarial nets with policy gradient. In: Proceedings of the AAAI Conference on Artificial Intelligence, pp. 2852–2858 (2017)
22. Fedus, W., Goodfellow, I., Dai, A.M.: Maskgan: better text generation via filling in the_. In: Sixth International Conference on Learning Representations, ICLR 2018, pp. 1–18 (2018)
23. Yoon, J., Jordon, J., Schaar, M.: Gain: missing data imputation using generative adversarial nets. In: International Conference on Machine Learning, PMLR 2018, pp. 5689–5698 (2018)
24. Yoon, J., Zame, W.R., van der Schaar, M.: Estimating missing data in temporal data streams using multi-directional recurrent neural networks. IEEE Trans. Biomed. Eng. 66(5), 1477–1490 (2019)
25. Cao, W., Wang, D., Li, J., et al.: BRITS: bidirectional recurrent imputation for time series. In: 32nd Conference on Neural Information Processing Systems (NIPS 2018), pp. 1–11 (2018)
26. Ahn, H., Sun, K., Kim, K.P.: Comparison of missing data imputation methods in time series forecasting. CMC Comput. Mater. Continua 70(1), 767–779 (2022)
27. Zhang, Y., Zhou, B., Cai, X., et al.: Missing value imputation in multivariate time series with end-to-end generative adversarial networks. 551, 67–82 (2021)
28. Che, Z., Purushotham, S., Cho, K., et al.: Recurrent neural networks for multivariate time series with missing values. Sci. Rep. 8(1), 1–12 (2018)
29. Pablo, S.-Q.: A regularized alternating least-squares method for minimizing a sum of squared Euclidean norms with rank constraint. J. Appl. Math. 2022 (2022)
30. Lee, Y.K., Pae, D.S., Hong, D.K., et al.: Emotion recognition with short-period physiological signals using bimodal sparse autoencoders. Intell. Autom. Soft Comput. 32(2), 657–673 (2022)

A Software Tool for Anomaly Detection and Labeling of Ventilator Waveforms

Cheng Chen1, Zunliang Wang1(B), Chuang Chen2, Xuan Wang1, and Songqiao Liu2,3(B)

1 State Key Laboratory of Bioelectronics, School of Biological Science and Medical Engineering, Southeast University, Si Pai Lou 2, Nanjing, China [email protected]
2 Jiangsu Provincial Key Laboratory of Critical Care Medicine, Department of Critical Care Medicine, School of Medicine, Zhongda Hospital, Southeast University, Dingjiaqiao 87, Nanjing, China [email protected]
3 Department of Critical Care Medicine, Nanjing Lishui People’s Hospital, Zhongda Hospital Lishui Branch, Nanjing, China

Abstract. Patient-ventilator asynchronies (PVA) during mechanical ventilation can lead to a prolonged duration of ventilation and increased mortality. Identifying asynchronies still relies mainly on clinical experience because effective, accurate automation tools are lacking. Machine learning promises to offer an automated solution for PVA detection. However, accurately labelling normal or abnormal patterns in massive ventilator data is extremely challenging and requires intensive manual labor. This leads to a lack of well-annotated training sets for automatic PVA identification. In this work, we designed an annotation software tool based on the Python programming language. The software provides 4 basic functions: (1) loading and reconstructing ventilator data files; (2) extracting breathing cycles from ventilation waveforms; (3) anomaly detection of mechanical ventilation waveforms; (4) generating annotation files. A total of 400 h of ventilator waveform data from 10 patients, collected at Zhongda Hospital, Southeast University, Nanjing, China, was used for software validation. The accuracy of automatic breathing cycle extraction is 99.98%. For each breathing cycle, abnormal waveform detection was completed with a high accuracy of 95.00%.

Keywords: Mechanical ventilation · Patient-ventilator asynchronies · Anomaly detection · Machine learning · Data labeling

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024. G. Wang et al. (Eds.): APCMBE 2023, IFMBE Proceedings 104, pp. 277–283, 2024. https://doi.org/10.1007/978-3-031-51485-2_29

1 Introduction

Mechanical ventilation (MV) is the most important supportive treatment for respiratory failure [1, 2]. Patient-ventilator asynchrony (PVA) during ventilatory support is an abnormal breathing event characterized by an abnormal ventilator waveform, caused by a mismatch between the respiratory drive of the mechanically ventilated patient and ventilator function [3]. Such asynchrony is commonly associated with poor clinical outcomes such as patient discomfort and increased mortality [4]. Almost all commercially available alarm systems for mechanical ventilators rely on simple threshold settings, which frequently cause false-positive alarms [5]. Moreover, these false alarms can lead to alarm fatigue and inappropriate clinical responses [6–8].

Data-driven machine learning approaches are promising for automatic PVA detection in massive ventilator waveforms [9, 10]. Such approaches rely heavily on a reliable training set, whose labelling is labor-intensive and time-consuming and requires extensive clinical experience [11–13]. Extracting breathing cycles from ventilation waveforms is the first step in creating a reliable dataset. Since ventilator waveforms are very sensitive to random disturbances induced by various clinical factors [14], it is difficult to automatically detect a single breathing cycle with high accuracy using a simple threshold setting method alone [2]. Identifying abnormal cycles in massive waveform data is another crucial step. However, the abnormal cycles reflected in PVA, like many adverse clinical events, tend to occur with relatively low probability [15, 16], which poses a great challenge for anomaly detection of ventilator waveforms.

In this work, we developed a software tool for semi-automatic detection and labeling of abnormal breathing cycle waveforms. We designed an interactive user interface for the software and proposed an automatic extraction algorithm that separates breathing cycles using two geometric features, "movement trend" and "optimal slope". We then combined the correlation coefficient method [17] with the K-nearest neighbor (KNN) algorithm [18] to determine whether the waveform within a respiratory cycle is normal or abnormal. All the ventilator waveform data was obtained from Zhongda Hospital, Southeast University, Nanjing, China. A total of 400 h of ventilator waveform data from 10 patients was screened. The experimental results demonstrate that our software can efficiently and accurately extract abnormal cycles of ventilator waveforms from massive ventilator data of clinical patients with little manual intervention.

2 Materials and Methods

2.1 Data Collection

A total of 400 h of ventilator waveforms were used for the algorithm and software development in this study. They were collected from 10 ventilated patients at Zhongda Hospital, Southeast University, Nanjing, China. The continuous airway signals were sampled at a frequency of 62.5 Hz using the built-in data acquisition module of the ventilator (Dräger Evita 4, Germany).

2.2 Design of Software Functions

The graphical user interface (GUI) of our software was created using the Python Tkinter library, and its data visualization was implemented with the Python Matplotlib library. The GUI of the software is shown in Fig. 1. The basic functions of this software are as follows:

1. Display scalar waveforms, with pressure, flow, or volume on the y-axis versus time on the x-axis.
2. Label abnormal/normal waveform data manually to create a training set for anomaly detection. The sample to be labeled can be provided automatically by the software or freely selected.
3. Perform semi-automatic anomaly detection and labeling in ventilator waveforms.
4. Generate annotated files that clinicians can use to further identify PVA.

Fig. 1. Graphical user interface of the software. The blue line denotes the labeled waveforms, and the red waveform between the two dashed lines denotes the extracted breathing cycle to be confirmed for data labeling by clinicians.

The workflow for data pre-processing and anomaly detection of ventilator waveforms is shown in Fig. 2.

2.3 Automatic Extraction of Breathing Cycles

In this work, we developed an automatic extraction method using two geometric features: movement trend and optimal slope. Movement trend describes whether the ventilator waveform time series is rising, falling, or moving horizontally within a short time. The adjacent slopes of a point are obtained through formula (1); if more than half of these slopes are above the average slope of the waveform when there is no flow, the movement trend at this point is taken to be upward.

$$k_{m,i} = \frac{P_{m+i} - P_m}{t_{m+i} - t_m}, \quad i = 1, 2, 3, \ldots, n \tag{1}$$

where $P_m$ is the flow value corresponding to point $m$, $t_m$ is the corresponding time value, and $k_{m,i}$ is the slope calculated from point $m$ and its $i$-th consecutive adjacent point.


Fig. 2. Workflow diagram of the software in this work

The optimal slope at each time point can be determined by the average slope value slope(m) of the subsequent slopes:

$$\mathrm{slope}(m) = \frac{\sum_{i=1}^{5} k_{m,i}}{5} \tag{2}$$

At the beginning of each breathing cycle, the slope of the flow waveform will become significantly larger, forming a noticeable inflection point. The slope after the inflection point is $k_0$. At the same time, the volume waveform is at a position close to 0. The automatic extraction algorithm of the breathing cycle is shown in Fig. 3.

Fig. 3. Flowchart of the automatic extraction algorithm of breathing cycle
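The two geometric features in formulas (1) and (2) can be sketched as follows. This is an illustrative reading of the text rather than the authors' implementation: the function names, the `baseline_slope` argument (standing for the average slope of the waveform when there is no flow), and the toy flow signal are assumptions.

```python
from typing import List, Sequence


def adjacent_slopes(values: Sequence[float], times: Sequence[float],
                    m: int, n: int) -> List[float]:
    """Formula (1): slopes k_{m,i} between point m and its i-th following point, i = 1..n."""
    return [(values[m + i] - values[m]) / (times[m + i] - times[m])
            for i in range(1, n + 1)]


def optimal_slope(values: Sequence[float], times: Sequence[float], m: int) -> float:
    """Formula (2): mean of the first five adjacent slopes at point m."""
    return sum(adjacent_slopes(values, times, m, 5)) / 5


def trend_is_upward(values: Sequence[float], times: Sequence[float],
                    m: int, n: int, baseline_slope: float) -> bool:
    """Movement trend: upward if more than half of the adjacent slopes exceed the baseline."""
    slopes = adjacent_slopes(values, times, m, n)
    return sum(k > baseline_slope for k in slopes) > n / 2


# Toy flow signal sampled at 62.5 Hz: flat, then a steady rise starting at index 2.
SAMPLING_HZ = 62.5
flow = [0.0, 0.0, 0.0, 2.0, 4.0, 6.0, 8.0, 10.0, 12.0]
times = [i / SAMPLING_HZ for i in range(len(flow))]

slope_at_onset = optimal_slope(flow, times, m=2)
rising = trend_is_upward(flow, times, m=2, n=5, baseline_slope=0.0)
flat = trend_is_upward([0.0] * 9, times, m=2, n=5, baseline_slope=0.0)
```

In the toy signal the flow rises by 2 units per sample from index 2 onward, so every adjacent slope at that point equals 2 × 62.5 = 125 units per second; a cycle-start detector would compare such a value against the slope $k_0$ expected after the inflection point.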

2.4 Anomaly Detection of Mechanical Ventilation Waveforms

We used 10 periodic waveforms with a standard normal cycle as templates to filter out most abnormal cycles from the ventilator waveforms using correlation coefficients. If the correlation coefficient is less than 0.4, the waveform is classified as an abnormal cycle; all other waveforms are temporarily classified as normal cycles for further detection. The division into normal and abnormal cycle waveforms using the correlation coefficient method is demonstrated in Fig. 4.

Fig. 4. Anomaly detection of breathing cycle waveform using correlation coefficient method, the red line denotes the normal cycle template, blue line represents the normal waveforms, and green line corresponds to the abnormal waveforms.
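A minimal sketch of this template screening step is given below. The 0.4 threshold comes from the text; everything else is an assumption made for illustration: the cycle is presumed to be resampled to the template length beforehand, the best match over all templates is used because the text does not say how the 10 templates are combined, and the function names are hypothetical.

```python
import math
from typing import Sequence


def pearson(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0.0 or var_b == 0.0:
        return 0.0  # a constant signal carries no shape information
    return cov / math.sqrt(var_a * var_b)


def screen_cycle(cycle: Sequence[float],
                 templates: Sequence[Sequence[float]],
                 threshold: float = 0.4) -> str:
    """Label a cycle 'abnormal' if its best template correlation is below the threshold."""
    best = max(pearson(cycle, template) for template in templates)
    return "abnormal" if best < threshold else "normal-candidate"


# Toy template and two candidate cycles of the same length.
template = [0.0, 1.0, 2.0, 1.0, 0.0, -1.0, -2.0, -1.0]
similar_cycle = [2.0 * v for v in template]   # same shape, larger amplitude
inverted_cycle = [-v for v in template]       # opposite shape

label_similar = screen_cycle(similar_cycle, [template])
label_inverted = screen_cycle(inverted_cycle, [template])
```

Because the correlation coefficient is insensitive to amplitude and offset, this step screens on waveform shape only; cycles labeled `normal-candidate` would still go through the subsequent KNN stage.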

After confirmation by clinicians, the normal waveforms can be used for further anomaly detection. In our work, ventilator waveforms were collected from 10 patients, and about 300 breathing cycles per patient (normal : abnormal = 1 : 1) were labeled by experienced clinicians. Then, a KNN-based model was trained on the labeled data of each patient to discriminate between unlabeled normal and abnormal breathing-cycle waveforms. 10-fold cross-validation was used for model verification.
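The per-patient training and verification procedure can be illustrated with a small, self-contained KNN and 10-fold cross-validation. This is a sketch under stated assumptions, not the authors' model: the paper does not specify the features, the value of k, or the distance metric, so two-dimensional toy features, k = 3, and Euclidean distance are used here, and the function names are hypothetical.

```python
import math
import random
from collections import Counter
from typing import List, Sequence


def knn_predict(train_X: Sequence[Sequence[float]], train_y: Sequence[int],
                x: Sequence[float], k: int = 3) -> int:
    """Majority vote among the k training samples closest to x (Euclidean distance)."""
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def cross_validate(X: Sequence[Sequence[float]], y: Sequence[int],
                   folds: int = 10, k: int = 3, seed: int = 0) -> float:
    """Mean accuracy over `folds` disjoint test splits of a shuffled dataset."""
    indices = list(range(len(X)))
    random.Random(seed).shuffle(indices)
    accuracies: List[float] = []
    for fold in range(folds):
        test = indices[fold::folds]
        held_out = set(test)
        train = [i for i in indices if i not in held_out]
        train_X = [X[i] for i in train]
        train_y = [y[i] for i in train]
        correct = sum(knn_predict(train_X, train_y, X[j], k) == y[j] for j in test)
        accuracies.append(correct / len(test))
    return sum(accuracies) / folds


# Toy per-patient dataset: 20 normal (label 0) and 20 abnormal (label 1) cycles,
# each described by two hypothetical features.
X = [[0.1 * i, 0.1 * i] for i in range(20)] + [[10 + 0.1 * i, 10 + 0.1 * i] for i in range(20)]
y = [0] * 20 + [1] * 20
mean_accuracy = cross_validate(X, y)
```

With well-separated toy clusters the cross-validated accuracy is 1.0; real breathing-cycle features would overlap, which is where accuracies below 100% would arise.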

3 Results

With our software, each breathing cycle in the ventilator waveforms can be automatically extracted. According to the clinicians' confirmation, the accuracy of the cycle extraction algorithm is about 99.98%. Using the correlation coefficient method, the original waveform data was first divided into abnormal and normal patterns. The KNN algorithm was then used to classify the breathing-cycle dataset of each patient separately. As shown in Table 1, the accuracy of anomaly detection ranges from 94.4% to 100% in 9 patients and is 89.5% in the remaining patient. The detection method also shows excellent sensitivity, specificity, and F1 score. Overall, our software can successfully detect and classify abnormal breathing cycles.
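For reference, the four reported measures follow from the confusion counts of a binary classifier. The sketch below uses the standard definitions and assumes that "abnormal" is the positive class, which the text does not state explicitly; the counts are hypothetical and are not taken from the study.

```python
from typing import Dict


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Sensitivity, specificity, F1 score and accuracy from confusion counts."""
    sensitivity = tp / (tp + fn)   # recall on the positive class
    specificity = tn / (tn + fp)   # recall on the negative class
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
        "accuracy": accuracy,
    }


# Hypothetical test fold of 30 cycles (15 abnormal, 15 normal).
fold_metrics = binary_metrics(tp=14, fp=2, tn=13, fn=1)
```

Averaging such per-fold values over the 10 folds, together with their spread, would give entries of the "mean ± deviation" form used in Table 1.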


4 Discussion

In this work, only a small amount of clinician labeling is required for our software to perform accurate anomaly detection on ventilation waveforms. This greatly improves the efficiency of dataset labeling while reducing the errors of purely manual labeling. In Table 1, both the detection accuracy and the F1 score of the prediction model on the data of patient 2 are relatively low. This may mainly result from unstable ventilation waveforms induced by patient movement or condensate backflow during ventilator data acquisition, which can lower the performance of the detection algorithm. In future work, we will further improve the accuracy and robustness of anomaly detection for breathing cycle waveforms by exploiting the pattern features of pressure waveforms. A cross-labelling strategy will be introduced to further reduce manual labelling errors.

Table 1. Accuracy of anomaly detection

No   Sensitivity      Specificity      F1 score         Accuracy
1    0.974 ± 0.016    0.972 ± 0.007    0.972 ± 0.010    0.956 ± 0.015
2    0.912 ± 0.028    0.912 ± 0.045    0.935 ± 0.011    0.895 ± 0.016
3    0.992 ± 0.016    0.952 ± 0.029    0.971 ± 0.017    0.973 ± 0.016
4    0.992 ± 0.016    0.957 ± 0.019    0.974 ± 0.017    0.974 ± 0.012
5    0.980 ± 0.025    0.932 ± 0.027    0.956 ± 0.013    0.944 ± 0.019
6    1.000 ± 0.000    1.000 ± 0.000    1.000 ± 0.000    1.000 ± 0.000
7    0.974 ± 0.023    0.958 ± 0.019    0.964 ± 0.010    0.961 ± 0.013
8    1.000 ± 0.000    1.000 ± 0.000    1.000 ± 0.000    1.000 ± 0.000
9    0.952 ± 0.019    0.972 ± 0.019    0.965 ± 0.005    0.945 ± 0.005
10   1.000 ± 0.000    1.000 ± 0.000    1.000 ± 0.000    1.000 ± 0.000

5 Conclusion

In this work, we designed a software tool for the anomaly detection and labeling of 400 h of ventilator waveform data collected from 10 patients. The verification results show that breathing cycles can be extracted accurately by our software. Abnormal cycle waveforms can be detected precisely through semiautomatic labeling of the waveform data. Overall, our software can substantially reduce the workload of manually labeling ventilator waveform data, helping clinicians build high-quality labeled training datasets for PVA identification.

Acknowledgment. This research was financially supported by the Jiangsu Provincial Special Program of Medical Science, China (BE2020786), and the National Natural Science Foundation of China (81971885).



A Machine Learning Approach for Predicting the Time Point of Achieving a Negative Fluid Balance in Patients with Acute Respiratory Distress Syndrome

Haowen Lei1, Zunliang Wang1(B), and Songqiao Liu2,3(B)

1 State Key Laboratory of Bioelectronics, School of Biological Science and Medical Engineering, Southeast University, Si Pai Lou 2, Nanjing, China
[email protected]
2 Jiangsu Provincial Key Laboratory of Critical Care Medicine, Department of Critical Care Medicine, School of Medicine, Zhongda Hospital, Southeast University, Dingjiaqiao 87, Nanjing, China
[email protected]
3 Department of Critical Care Medicine, Nanjing Lishui People’s Hospital, Zhongda Hospital Lishui Branch, Nanjing, China

Abstract. It is beneficial for patients with acute respiratory distress syndrome (ARDS) to achieve a negative fluid balance at an appropriate time, but harmful to achieve it too early or too late. At present, how to determine the appropriate time point is still unclear. In this work, we developed an XGBoost-based machine learning model to predict the time point of reaching a negative fluid balance in ARDS patients, and explored the relevant influencing factors. A total of 8,685 samples were drawn from 494 ARDS patients with negative fluid balance, including 1,441 positive samples and 7,244 negative samples. Our model shows good prediction performance (AUC: 0.950, accuracy: 92.2%). Furthermore, we found that the cumulative fluid balance, mild liver disease, dopamine dosage, fraction of inspired O2 (FiO2), central venous pressure (CVP) and base excess (BE) are important factors for the prediction accuracy of the model. These results suggest that our method can help clinicians better determine the time point of negative fluid balance in ARDS patients, and thus optimize the fluid management strategy in time to obtain a better treatment effect.

Keywords: ARDS · Fluid management · Negative fluid balance · MIMIC-IV · Intelligent decision making

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024 G. Wang et al. (Eds.): APCMBE 2023, IFMBE Proceedings 104, pp. 284–290, 2024. https://doi.org/10.1007/978-3-031-51485-2_30

1 Introduction

Acute respiratory distress syndrome (ARDS) is a syndrome with acute progressive respiratory failure as the principal clinical manifestation, and it can arise from a variety of causes. Its etiology and pathogenesis are complex, with a mortality rate of up to 40% [1]. A conservative fluid management strategy is one of the supportive treatments that can improve patient prognosis. Research shows that a conservative fluid management strategy can help reduce the duration of mechanical ventilation and ICU stay [2]. Patients with ARDS who achieve a negative fluid balance during fluid management have lower mortality than those with a continuous positive fluid balance [3]. It is beneficial for patients with ARDS to achieve a negative fluid balance at an appropriate time, but because of the high heterogeneity of ARDS, it is hard to determine when a patient can achieve it [4]. Some studies have tried to distinguish patients with different responses to negative fluid balance in order to guide doctors in personalizing fluid management strategies, but they have not explained when patients can achieve a negative fluid balance [5, 6]. The key factors affecting when patients can reach a negative fluid balance have not been determined, making it difficult for clinicians to intervene early to improve patients’ fluid therapy strategies.

The XGBoost model has superior learning efficiency and is insensitive to missing data [7]. In recent years, the XGBoost algorithm has been increasingly used in clinical research, including the early prediction of ARDS [8]. In this study, we developed an XGBoost-based machine learning method to determine the time point of negative fluid balance during the conservative fluid management of patients with ARDS. The prediction results may assist doctors in adjusting fluid management strategies in a timely manner to improve the treatment of patients with ARDS. In addition, the factors that affect changes in the patients' fluid balance are analyzed, which provides further support for the conservative fluid management of patients with ARDS.

2 Materials and Methods

2.1 Source of Data

The data of patients with ARDS come from the Medical Information Mart for Intensive Care (MIMIC) database, a large open clinical database containing health data of patients from the intensive care unit at Beth Israel Deaconess Medical Center [9]. New versions of MIMIC are released over time. The data in this paper are from the latest MIMIC-IV database.

2.2 Participants

The population in this study is defined as patients with ARDS during admission who underwent adequate fluid resuscitation and finally reached a stable negative fluid balance. According to the Berlin definition and the relevant research on ARDS fluid management [10–12], the patient selection criteria for this study are defined as: (1) the patient is older than 18; (2) the patient was mechanically ventilated for more than 48 h in the ICU; (3) the patient's P/F value dropped below 300 mmHg at a certain time after ICU admission for more than 24 h; (4) within 6 h after the start of vasopressor treatment, the patient received ≥20 mL/kg of fluid and the central venous pressure reached ≥8 mmHg; (5) after adequate fluid resuscitation, the patient finally reached and maintained a negative balance. In total, 494 patients were selected for this study. The flow chart of patient selection is shown in Fig. 1. Of the 494 patients who experienced the change from long-term positive fluid balance to long-term negative fluid balance, 21 died.

Fig. 1. Flow chart of patient selection

2.3 Definition of Time Window and Label

As shown in Fig. 2, we set three types of time windows: an observation window, a prediction gap window and a prediction window. Considering the distribution of patient data, the effect on model training and the need for timely intervention in patients with ARDS, the three windows were set to 6 h, 10 h and 6 h, respectively. The prediction model uses the 6 h of clinical data before the current time point as input to predict the fluid balance status of the patient 10 h later. The label for the patient’s fluid balance status is defined from the patient’s cumulative fluid balance data covered by the 6-h prediction window.
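The window arithmetic can be sketched as below for hourly data. Several details are assumptions made for illustration: the 1-h sliding step, reading "stable negative" as every hourly cumulative balance in the prediction window being below zero, and the function name `make_samples`. The survival condition follows the labeling rule given later in this section.

```python
OBS_H, GAP_H, PRED_H = 6, 10, 6  # window lengths in hours, as stated in the text

def make_samples(cumulative_balance, survived):
    """Slide over hourly cumulative fluid balance values (mL) of one patient.

    Returns (observation_window, prediction_window, label) tuples, where
    each window is a half-open (start_hour, end_hour) index range.
    """
    samples = []
    n = len(cumulative_balance)
    for t in range(OBS_H, n - GAP_H - PRED_H + 1):  # t = current time point
        obs = (t - OBS_H, t)
        pred = (t + GAP_H, t + GAP_H + PRED_H)
        window = cumulative_balance[pred[0]:pred[1]]
        # Assumed rule: positive only if the patient survived and the
        # cumulative balance stays negative across the prediction window.
        label = int(survived and all(v < 0 for v in window))
        samples.append((obs, pred, label))
    return samples
```

Each sample therefore needs 22 h of record: 6 h observed, a 10-h gap, and 6 h used only for the label.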

Fig. 2. The time windows used in this study.

At present, it is very difficult to judge whether a negative fluid balance is beneficial to ARDS patients, because there is no unified view [13]. In this work, patient survival is therefore taken into account in the label definition. Within the 6-h prediction window, time points at which a surviving patient reaches a stable negative fluid balance are given positive labels, and all other time points, at which the patient is in a positive fluid balance state or has died, are given negative labels. In total, 8,685 samples were drawn from the 494 ARDS patients with negative fluid balance, including 1,441 positive samples and 7,244 negative samples.

2.4 Predictors

The predictors are taken from the data within the observation window and fall into the following categories: patient basic information, fluid input and output, vital signs, vasopressor doses, and blood gas. They are listed in Table 1. The selection of predictors does not follow complex or strict criteria; instead, as many factors as possible are included to maximize the statistical power of the model [14].

Table 1. Predictors used in the model

Data type: Data description
Patient basic information: Age, height, sex, score on the first day of admission to ICU, complications
Fluid input and output: Fluid input per hour, fluid output per hour, cumulative fluid balance
Vasopressor: Dobutamine, dopamine, epinephrine, norepinephrine, milrinone, phenylephrine, vasopressin
Blood gas: AaDO2, base excess, bicarbonate, total CO2, chloride, calcium, hematocrit, hemoglobin, lactate, FiO2, pCO2, PEEP, pH, pO2, potassium, sodium
Vital sign: Heart rate, respiratory rate, systolic blood pressure, diastolic blood pressure, mean arterial pressure, temperature, glucose, central venous pressure

2.5 Training and Testing Model

Figure 3A and B show the training and testing processes of the XGBoost model in this study, respectively. During training, each patient's data are binned into 1-h intervals, and data in 6-h intervals are collected according to the size of the selected observation window. Before the data are input into the model, features are extracted by calculating the maximum, minimum, mean, standard deviation and coefficient of variation of the data in each 6-h interval. Patient basic information is treated as constant. For blood gas variables the 6-h maximum is taken, and for vasopressors the 6-h total is taken. XGBoost is selected as the prediction model according to the data characteristics. When new clinical data of an ARDS patient are input, the model predicts whether a negative fluid balance will occur 10 h later based on the patient’s most recent 6 h of data. Accurately predicting the fluid balance status in the next stage from clinical data will help clinicians adjust fluid management early for patients with ARDS and achieve better outcomes.
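The summary statistics computed over each 6-h observation window can be sketched as follows. The use of the population standard deviation and the handling of a zero mean are assumptions, since the text does not specify them, and `window_features` is an illustrative name.

```python
import statistics

def window_features(hourly_values):
    """Summarize one variable over a 6-h observation window (six hourly bins)."""
    mean = statistics.fmean(hourly_values)
    std = statistics.pstdev(hourly_values)  # population std; an assumption
    return {
        "max": max(hourly_values),
        "min": min(hourly_values),
        "mean": mean,
        "std": std,
        # Coefficient of variation is undefined when the mean is zero.
        "cv": std / mean if mean != 0 else float("nan"),
    }
```

Applied to every time-varying predictor, this turns each window into a fixed-length feature vector regardless of how the raw measurements were spaced.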

Fig. 3. A Training and B testing process of the XGBoost model used for the prediction in this study

3 Results

We divided the data into a training set and a test set: 80% of the samples were used for training and 20% for testing. 5-fold cross-validation and grid search were adopted for model training. To evaluate the predictive model, we compared the prediction performance of XGBoost with that of the support vector machine (SVM) model and the random forest (RF) model [15, 16] on the same test set. Table 2 compares the performance metrics of the models for this prediction task. The test results show that the XGBoost model achieves the best prediction performance, with an accuracy of 92.2%, an area under the ROC curve (AUC) of 0.950, a sensitivity of 0.826 and a specificity of 0.931.

Table 2. Comparison of prediction performance

Metrics       XGBoost   SVM     RF
AUC           0.950     0.889   0.876
Accuracy      0.922     0.825   0.815
Sensitivity   0.826     0.805   0.786
Specificity   0.931     0.830   0.831
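The AUC values in Table 2 can be read as the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one. The sketch below computes the AUC directly from that definition; it is a plain pairwise implementation for illustration, not the evaluation code used in the study.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count one half)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))
```

Unlike accuracy, this measure does not depend on a decision threshold, which is useful here because positive samples are the minority class (1,441 of 8,685).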

The contributions of different predictive features on the prediction performance of our XGBoost model were ranked by using feature importance analysis [17]. The top 20 most important features are demonstrated in Fig. 4.


Fig. 4. Ranking of feature importance for the XGBoost model

4 Discussion

In this work, we proposed an intelligent prediction method to determine the time point at which ARDS patients can reach a negative fluid balance. This method can be integrated with current clinical information systems to help doctors optimize the fluid management strategy for ARDS patients. Compared with the SVM and RF models, XGBoost demonstrated clearly better prediction performance, which is mainly attributed to its stronger ability to handle clinical time series data and missing data. However, the generalizability of the model is limited by the small amount of ARDS patient data. We sampled the clinical data of ARDS patients by setting a time window [18]. The data covered by the time window are used to describe the change in the patient’s fluid balance status within that window. This sampling method enriches the sample space and thus improves the generalization of the model when ARDS data are insufficient. For better generalization, more ARDS patient data are required. A previous study found that the FiO2 of ARDS patients who responded better to negative fluid balance increased to a certain extent after they entered negative fluid balance [19]. In Fig. 4, FiO2 also shows its importance in predicting the time point of entering negative fluid balance. In future work, the effects of different factors on the fluid balance status of ARDS patients are worthy of further study.

5 Conclusions

In this study, we developed an XGBoost-based machine learning method to predict the time point of reaching a negative fluid balance in patients with ARDS. The model shows higher prediction performance than the SVM and RF models. Several important factors were also identified that can help clinicians optimize the fluid therapy scheme for ARDS patients. In future work, further clinical validation is necessary to optimize the performance of the model.


Acknowledgment. This research was financially supported by the Jiangsu Provincial Special Program of Medical Science, China (BE2020786), and the National Natural Science Foundation of China (81971885).

References

1. Máca, J., Jor, O., Holub, M., et al.: Past and present ARDS mortality rates: a systematic review. Resp. Care 62(1), 113–122 (2017)
2. National Heart, Lung, and Blood Institute Acute Respiratory Distress Syndrome (ARDS) Clinical Trials Network: Comparison of two fluid-management strategies in acute lung injury. N. Engl. J. Med. 354(24), 2564–2575 (2006)
3. van Mourik, N., Metske, H.A., Hofstra, J.J., et al.: Cumulative fluid balance predicts mortality and increases time on mechanical ventilation in ARDS patients: an observational cohort study. PLoS One 14(10), e0224563 (2019)
4. Chacko, B., Peter, J.V.: Negative fluid balance: beneficial or harmful? Crit. Care Update 2019, 3 (2019)
5. Messmer, A.S., Moser, M., Zuercher, P., et al.: Fluid overload phenotypes in critical illness: a machine learning approach. J. Clin. Med. 11(2), 336 (2022)
6. Sinha, P., Calfee, C.S.: Phenotypes in ARDS: moving towards precision medicine. Curr. Opin. Crit. Care 25(1), 12 (2019)
7. Chen, T., Guestrin, C.: XGBoost: a scalable tree boosting system. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 785–794 (2016)
8. Yun, H., Choi, J., Park, J.H.: Prediction of critical care outcome for adult patients presenting to emergency department using initial triage information: an XGBoost algorithm analysis. JMI 9(9), e30770 (2021)
9. Johnson, A.E.W., Pollard, T.J., Shen, L., et al.: MIMIC-III, a freely accessible critical care database. Sci. Data 3(1), 1–9 (2016)
10. Mahmoud, O.: Mechanical power is associated with increased mortality and worsened oxygenation in ARDS. Chest 158(4), A679 (2020)
11. Ferguson, N.D., Fan, E., Camporota, L., et al.: The Berlin definition of ARDS: an expanded rationale, justification, and supplementary material. Intensive Care Med. 38, 1573–1582 (2012)
12. Murphy, C.V., Schramm, G.E., Doherty, J.A., et al.: The importance of fluid management in acute lung injury secondary to septic shock. Chest 136(1), 102–109 (2009)
13. Vignon, P., Evrard, B., Asfar, P., et al.: Fluid administration and monitoring in ARDS: which management? Intensive Care Med. 46, 2252–2264 (2020)
14. Bhattarai, S., Gupta, A., Ali, E., et al.: Can big data and machine learning improve our understanding of acute respiratory distress syndrome? Cureus 13(2), e13529 (2021)
15. Jakkula, V.: Tutorial on support vector machine (SVM). School of EECS, Washington State University 37(2.5), 3 (2006)
16. Biau, G., Scornet, E.: A random forest guided tour. Test 25, 197–227 (2016)
17. Zhang, Z., Ho, K.M., Hong, Y.: Machine learning for the prediction of volume responsiveness in patients with oliguric acute kidney injury in critical care. Crit. Care 23(1) (2019)
18. Johnson, A.E.W., Mark, R.G.: Real-time mortality prediction in the intensive care unit. AMIA annual symposium proceedings. J. Am. Med. Inf. Assn. 2017, 994 (2017)
19. Allyn, J., Allou, N., Dib, M., et al.: Echocardiography to predict tolerance to negative fluid balance in acute respiratory distress syndrome/acute lung injury. J. Crit. Care 28(6), 1006–1010 (2013)

3D Simulation Model for Urine Detection in Human Bladder by UWB Technology

Mengfei Jiang1, Liping Qin2, Hui Zhen2, and Gangmin Ning3,4(B)

1 Polytechnic Institute, Zhejiang University, Hangzhou, China
2 Zhejiang Institute of Medical Device Testing, Hangzhou, China
3 College of Biomedical Engineering and Instrument Science, Zhejiang University, Hangzhou, China
[email protected]
4 Research Center for Healthcare Data Science, Zhejiang Lab, Hangzhou, China

Abstract. This work develops a 3D simulation model for non-contact detection of urine volume in the human bladder by ultra-wideband (UWB) imaging technology. A 3D simulation model for urine detection was established, which consists of an elliptical cylindrical fat layer and a spherical urine layer and takes the dielectric permittivity of the tissues into account. A UWB Vivaldi antenna is used in the simulation and sweeps the model in a linear form. By mimicking the action of a UWB radar system, the signals reflected from the boundary between fat and urine are obtained at each aperture position. To generate the imaging results, a synthetic aperture (SA)-based imaging algorithm is applied to the calibrated signals. The estimated radius of the bladder agrees with the theoretical value with a bias of 11.1%. The results show that it is possible to detect the volume of urine in the human bladder and that the UWB imaging technique has strong application prospects in medical diagnosis.

Keywords: UWB · Model simulation · Urine detection · SAR imaging

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024 G. Wang et al. (Eds.): APCMBE 2023, IFMBE Proceedings 104, pp. 291–298, 2024. https://doi.org/10.1007/978-3-031-51485-2_31

1 Introduction

Ultra-wideband (UWB) technology has become an emerging trend in the medical field in recent years. UWB devices can emit radio pulses with a duration of nanoseconds owing to their huge bandwidth in the 3.1–10.6 GHz range. UWB can achieve very high data rates and multipath resolution. Unlike X-ray imaging and computerized tomography, UWB radar uses non-ionizing electromagnetic (EM) waves, which makes it virtually harmless to users. UWB has great penetrating ability and offers the potential for imaging internal organs of the human body, whereas ultrasound cannot penetrate bone and air [1]. In addition, the low complexity of UWB systems makes them less expensive than MRI technology. All these properties make UWB technology attractive for medical diagnosis [2].

The detection of urine accumulation in the human body is a valuable topic in the medical field [3]. Monitoring urine accumulation in the bladder benefits patients with bladder control problems (e.g. urinary retention) by determining urination intervals from the outside, which avoids permanent intubation. In patients with urological disorders, monitoring the amount of residual urine in the bladder after voiding helps estimate the severity of obstruction of the urinary system and kidneys [4].

This paper aims to develop a 3D simulation model for detecting urine accumulation in the human bladder by an SA-based UWB imaging technique. The technique is based on the fact that the permittivity of urine is much larger than that of the surrounding tissues (e.g. fat), so the incident UWB pulses are reflected more strongly at the urine boundary. The presence of urine in the human bladder can be estimated from the time delay of the reflections [5].

The paper is organized as follows: Sect. 2 discusses the simulation of the antenna and the bladder model, including the geometrical structure and tissue parameters, as well as the signal processing. Section 3 presents the results, and the corresponding discussion is given in Sect. 4.
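The permittivity contrast that the technique relies on can be illustrated with the normal-incidence reflection coefficient between two media. The sketch below is a simplification under stated assumptions: both media are treated as lossless and non-magnetic, the interface is taken as planar, and the relative permittivities of fat and urine (5.1641 and 67.308) are the 3.6 GHz values reported later in Table 2.

```python
import math

def reflection_coefficient(eps_r1, eps_r2):
    """Field reflection coefficient for a wave going from medium 1 to medium 2.

    Assumes normal incidence on a planar interface between lossless,
    non-magnetic media, so the wave impedance scales as 1 / sqrt(eps_r).
    """
    n1, n2 = math.sqrt(eps_r1), math.sqrt(eps_r2)
    return (n1 - n2) / (n1 + n2)

fat_to_urine = reflection_coefficient(5.1641, 67.308)
air_to_fat = reflection_coefficient(1.0, 5.1641)
```

Under these assumptions the fat-urine interface reflects with a magnitude of roughly 0.57, compared with roughly 0.39 at the air-fat interface, which is consistent with the stronger reflection described above.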

2 Methods

The simulations are carried out using the electromagnetic simulation software CST Microwave Studio [6], which is based on the finite integration technique (FIT) [7]. In this section, the simulation scenario, including the model construction, the imaging scan configuration and the image reconstruction algorithm, is discussed in detail.

2.1 Antenna Model

The UWB Vivaldi antenna is used both as a transmit (Tx) antenna to send the pulse stimulation signal and as a receive (Rx) antenna to collect the reflected signal in the simulation scenario. The antenna has been designed to operate over the whole UWB band of 3.1–10.6 GHz. It was chosen for its good time-domain and frequency-domain properties, in particular low signal distortion and high directivity. The antenna structure, designed with reference to [8], is illustrated in Fig. 1, and its dimensions are summarized in Table 1. The distance between the antenna and the surface of the bladder model is 20 cm, and the main beam of the antenna is oriented perpendicular to the surface of the model.

2.2 The 3D Bladder Model

Since human tissues have different permittivities [9, 10], the human bladder can be modeled as a multilayer structure in CST Microwave Studio. The model consists of two layers, fat and urine, as shown in Fig. 2. The fat layer is designed as an elliptical cylinder, following the shape of the human abdomen, and its overall dimensions are restricted to 81 × 54 × 90 mm3 to reduce the computing resources required. The urine layer is modeled as a sphere [11] with a radius of 18 mm.

Fig. 1. Configuration of Vivaldi antenna: (a) top view, (b) bottom view, (c) side view

Table 1. Dimensions (in mm) of the antenna structure

W    L    Th   W1   W2   W3   L1   L2   L3   L4
90   90   1    41   20   1    15   25   4    29

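According to the Cole-Cole dispersion, the complex relative permittivity of human tissues is described by Eq. 1 [12], in which $\varepsilon_\infty$ is the permittivity at infinite frequency, $\Delta\varepsilon_n$ indicates the magnitude of the dispersion, $\omega$ is the angular frequency, $\tau_n$ is the relaxation time, $\alpha_n$ is the distribution parameter, $\sigma_i$ is the conductivity, and $\varepsilon_0$ is the permittivity of free space.

\[
\varepsilon(\omega) = \varepsilon_\infty + \sum_{n} \frac{\Delta\varepsilon_n}{1 + (j\omega\tau_n)^{(1-\alpha_n)}} + \frac{\sigma_i}{j\omega\varepsilon_0} \tag{1}
\]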

The dielectric properties of various tissues at 3.6 GHz frequency are summarized in Table 2.
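Evaluating the Cole-Cole expression numerically is straightforward with complex arithmetic, as in the sketch below. The function name and argument layout are illustrative, and the pole parameters used in any call must come from a tissue database; none are supplied by the text.

```python
import math

EPS0 = 8.8541878128e-12  # permittivity of free space, F/m

def cole_cole(freq_hz, eps_inf, poles, sigma_i):
    """Complex relative permittivity from the Cole-Cole model (Eq. 1).

    poles is a list of (delta_eps, tau_seconds, alpha) tuples, one per
    dispersion region; sigma_i is the ionic conductivity in S/m.
    """
    omega = 2 * math.pi * freq_hz
    eps = complex(eps_inf, 0.0)
    for delta_eps, tau, alpha in poles:
        eps += delta_eps / (1 + (1j * omega * tau) ** (1 - alpha))
    eps += sigma_i / (1j * omega * EPS0)
    return eps
```

With a single pole, alpha = 0 and zero conductivity, the model reduces to the Debye form, which gives a convenient sanity check: at the frequency where omega * tau = 1, the dispersion term equals delta_eps / (1 + j).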

Fig. 2. Geometry of the simulation model: (a) top view, (b) side view

Table 2. Dielectric properties of body tissues

Tissue   Conductivity (S/m)   Relative permittivity   Loss tangent
Fat      0.16086              5.1641                  0.15554
Urine    3.5783               67.308                  0.26545
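The three columns of Table 2 are linked by the standard relation tan δ = σ / (ω ε0 εr), so the tabulated loss tangents can be cross-checked from the conductivity and relative permittivity at 3.6 GHz. The sketch below assumes that the conductivity column is the effective conductivity at that frequency.

```python
import math

EPS0 = 8.8541878128e-12  # permittivity of free space, F/m

def loss_tangent(sigma, eps_r, freq_hz):
    """Loss tangent from effective conductivity (S/m) and relative permittivity."""
    return sigma / (2 * math.pi * freq_hz * EPS0 * eps_r)

fat = loss_tangent(0.16086, 5.1641, 3.6e9)    # approximately 0.1555
urine = loss_tangent(3.5783, 67.308, 3.6e9)   # approximately 0.2654
```

Both values agree with the tabulated loss tangents to within rounding.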

2.3 Imaging Configuration

In the SA-based UWB imaging system described below, a 2D scan scheme is adopted [11]. The synthetic array lies in the x-y plane with a total area of 40 × 50 mm2. A scan step of 10 mm is used in both the x and y directions, so 30 (5 × 6) scan positions cover the whole scan area. Because the scan area is symmetric in the x and y directions, only 9 (3 × 3) scan positions are simulated, as shown in Fig. 3, and the remaining scan positions are obtained from these by symmetry. The simulation is performed in CST Microwave Studio with the time domain solver [13].
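The count of 30 scan positions and the reduction to 9 simulated positions can be reproduced with a few lines. The sketch assumes the scan area is centred on the model axis, which is what makes the mirror symmetry exact; the function names are illustrative.

```python
def scan_grid(width_mm=40, height_mm=50, step_mm=10):
    """All scan positions (x, y) in mm, centred on the origin."""
    xs = [x - width_mm // 2 for x in range(0, width_mm + 1, step_mm)]
    ys = [y - height_mm // 2 for y in range(0, height_mm + 1, step_mm)]
    return [(x, y) for y in ys for x in xs]

def unique_positions(grid):
    """Positions that must actually be simulated, one per mirror-equivalent set."""
    return sorted({(abs(x), abs(y)) for x, y in grid})

def mirror(position):
    """Expand one simulated position to its mirror images in x and y."""
    x, y = position
    return {(sx * x, sy * y) for sx in (1, -1) for sy in (1, -1)}
```

With five x positions (-20 to 20 mm) and six y positions (-25 to 25 mm), the unique set contains x in {0, 10, 20} and y in {5, 15, 25}, that is 3 × 3 = 9 positions.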

Fig. 3. Scan configurations of the SA-based imaging system

2.4 Imaging Algorithm

Two different simulations are needed: a simulation of the background and a simulation of the bladder model. The background simulation yields the surrounding signal, which is used to eliminate the effect of everything other than the target, including measurement instruments and surrounding objects, from the subsequent reflected signals. In the calibration procedure, the surrounding signal is subtracted from the reflected signal at each aperture position to remove the undesired parts of the reflected signals, including environment noise and antenna coupling. In the imaging procedure, the processed signals are summed to create an energy profile. Reflections from the urine site sum coherently, whereas clutter adds incoherently, so returns other than the urine response are suppressed.

To create the energy profiles, the SA-based imaging technique is adopted [14]. Each image pixel is filled with the appropriate samples. The appropriate data sample in time, which is related to the pulse round-trip time between each aperture position and the image pixel, is found using the following calculations. First, the round-trip distance is calculated by

\[
d = \sqrt{(x - x_i)^2 + (y - y_i)^2 + (z - z_i)^2} \tag{2}
\]

where $x$, $y$ and $z$ are the pixel’s location in the reconstructed image, and $x_i$, $y_i$ and $z_i$ are the position of the Tx and Rx antennas. The round-trip time $t$ is found by dividing the round-trip distance $d$ by the EM wave velocity $v_m$:

\[
t = \frac{d}{v_m} \tag{3}
\]

\[
v_m = \frac{c}{\sqrt{\varepsilon_r}} \tag{4}
\]

where $v_m$ is the velocity of the EM wave in a certain medium, $c$ is the speed of light and $\varepsilon_r$ denotes the relative permittivity of the medium. The index of the appropriate time sample is obtained by dividing the round-trip time by the time resolution $\Delta t$:

\[
\text{index} = \operatorname{round}\left(\frac{t}{\Delta t}\right) \tag{5}
\]

The precise sample corresponding to each image pixel is picked using the nearest neighbor method [15]. A larger sample population results in a smaller time resolution $\Delta t$, which increases the probability of choosing a more accurate sample when the number is rounded to obtain the time index; in other words, a larger sample population leads to better reconstructed images. The image is formed by summing the selected samples:

\[
I = \sum_{n=1}^{p} S_a(p, \text{index}) \tag{6}
\]

where $I$ denotes the reconstructed SAR image and $p$ is the aperture position.
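Equations 2 to 6 amount to a delay-and-sum reconstruction, sketched below. This is an illustration under simplifying assumptions, not the authors' code: the whole path is treated as a single homogeneous medium with relative permittivity `eps_r`, the antenna-to-pixel distance from Eq. 2 is doubled to obtain a round trip for a co-located Tx/Rx antenna, and all function and parameter names are hypothetical.

```python
import math

C0 = 299_792_458.0  # speed of light in vacuum, m/s

def sample_index(pixel, antenna, eps_r, dt):
    """Time-sample index for one pixel and one aperture position (Eqs. 2-5)."""
    d = 2.0 * math.dist(pixel, antenna)  # assumed round trip: out and back
    v_m = C0 / math.sqrt(eps_r)          # Eq. 4
    t = d / v_m                          # Eq. 3
    return round(t / dt)                 # Eq. 5, nearest-neighbor sample

def delay_and_sum(signals, antennas, pixels, eps_r, dt):
    """Sum the selected sample of every calibrated signal per pixel (Eq. 6)."""
    image = []
    for pixel in pixels:
        total = 0.0
        for signal, antenna in zip(signals, antennas):
            k = sample_index(pixel, antenna, eps_r, dt)
            if 0 <= k < len(signal):     # ignore delays outside the record
                total += signal[k]
        image.append(total)
    return image
```

Samples from a true reflector line up across aperture positions and add coherently, while samples at other pixels come from mismatched delays and largely cancel or stay small.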

3 Results

The UWB Vivaldi antenna sweeps in a linear form over 30 positions around the bladder model constructed in CST (Fig. 3). The urine layer is modeled as a sphere of radius 18 mm placed in the middle of the fat layer. First, the background reflections and antenna coupling are calibrated, as can be seen in Fig. 4b. The unwanted background reflections cannot be subtracted completely; however, the urine layer has a stronger amplitude than in the original signal in Fig. 4a. Figure 4c is obtained after removing the unwanted signal range and moving the antenna in the x-y plane, and the urine response is further enhanced in Peak No.1 and No.2, which identify the urine volume. Specifically, Peak No.1 can be considered as the fat-urine layer boundary, since the distance between the fat and urine layers is only 9 mm, which almost coincides in time. Peak No.2 is the urine-fat boundary, although it is very faint.


Then, according to the urine peak response, images are reconstructed using the SA-based imaging algorithm. To make the results more apparent, Fig. 5 is a superposition of all the achieved images in the x-y plane. Since the urine layer is set as a sphere, its radius can be inferred from the x-y plane as 20 mm, which is slightly greater than the original size (18 mm). That is, the estimated bladder radius has an accuracy of 88.9% relative to the theoretical value. The discrepancy is attributed to the weak reflections when EM waves reach the urine layer, caused by attenuation of the waves passing through the fat and urine layers. In addition, the fat and urine layers are arc-shaped, which makes the reflected EM waves divergent and further weakens the reflection. Moreover, the FIT used by CST causes numerical dispersion, which distorts the reflected signals.
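As a quick check of the figures quoted above, and assuming that accuracy here means one minus the relative error of the radius (the text does not define it explicitly), the arithmetic works out as follows. The symbols $r_{\text{est}}$ and $r_{\text{true}}$ are introduced only for this illustration.

```latex
\[
\text{relative error} = \frac{|r_{\text{est}} - r_{\text{true}}|}{r_{\text{true}}}
  = \frac{|20 - 18|}{18} \approx 11.1\%,
\qquad
\text{accuracy} = 1 - 11.1\% = 88.9\%.
\]
```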

Fig. 4. Reflected signals processing. a Reflected signal and noise, b Calibrated signal, c Calibrated signals when moving the antenna in x-y plane

Fig. 5. Scan configurations of the SA-based imaging system


4 Discussion

The results demonstrate the feasibility of detecting human bladder volume. Compared with the reported study [11], our work has a more realistic simulation setting and higher accuracy. The study in [11] set a distance of 5 mm between the antenna and the outside of the bladder model to strengthen the energy penetrating the model, and achieved an accuracy of 72%. In contrast, the present work uses a realistic working distance of 20 cm and, more importantly, has an error of 11.1%.

The major limitations of the present work should be noted. First, the model of the bladder was simplified as an ideal sphere, which differs somewhat from the actual situation. Second, the FIT used by CST causes numerical dispersion and distorts the reflected signals. Third, because of the limited detection depth of the current antenna, a real 3D image cannot be reconstructed. In future work, the model should be built from physiological data so that it is close to the actual human bladder shape. It is also necessary to further denoise the reflected signals to address the problem caused by numerical dispersion. Meanwhile, the design of the UWB Vivaldi antenna should be improved by strengthening its radiation intensity and focusing its directivity, so that the real bladder volume can be obtained by acquiring effective signals in the x-y, y-z, and x-z planes.

5 Conclusions

This work demonstrates a method for detecting urine volume in the human bladder through 3D simulation. Simulation models of the UWB Vivaldi antenna and a two-layer bladder model are built in CST, in which realistic dielectric properties of the antenna substrate and different human tissues are taken into account. Because the Rx antenna receives more energy when illuminating the urine, the volume of the urine can be determined by shifting the antenna to scan the bladder model. The analytical signal and imaging method are used to eliminate noise and construct images. The reconstructed image demonstrates that an SA-based UWB imaging system for detecting urine in the human bladder is feasible and that the UWB technique has strong application prospects in medical diagnosis. Owing to the characteristics of SA-based imaging, more positions within the synthetic array yield higher-precision images but cost more computation. Therefore, there is a tradeoff between the cost of the system and the accuracy of the achieved images. Numerical dispersion and signal attenuation should be considered, and the imaging quality needs to be further improved.

Acknowledgement. This work was supported by Key Research Project of Zhejiang Province (2020C03073) and Key Research Project of Zhejiang Lab (2022ND0AC01). The authors thank Prof. Karumudi Rambabu and Gary Chen at the University of Alberta, Canada for providing the UWB prototype and helpful suggestions.


References

1. Metev, S.M., Veiko, V.P.: Laser-Assisted Microtechnology. Springer Science & Business Media (2013)
2. Staderini, E.M.: UWB radars in medicine. IEEE Aerosp. Electron. Syst. Mag. 17(1), 13–18 (2002)
3. Schmid, J., Niestoruk, L., Lamparth, S., et al.: Ultra-wideband signals for the detection of water accumulations in the human body. In: The 3rd International Conference on Bio-Inspired Systems and Signal Processing (2010)
4. Nasrabadi, M.Z., Tabibi, H., Salmani, M., et al.: A comprehensive survey on non-invasive wearable bladder volume monitoring systems. Med. Biol. Eng. Comput. 59(7–8), 1373–1402 (2021)
5. Li, X., Pancera, E., Niestoruk, L., et al.: Performance of an ultra wideband radar for detection of water accumulation in the human bladder. In: The 7th European Radar Conference, IEEE (2010)
6. CST Microwave Studio. http://www.cst.com
7. Orfanidis, S.J.: Electromagnetic waves and antennas (2002). http://www.ece.rutgers.edu/~orfanidi/ewa/
8. Chan, K.K.-M., Tan, A.E.-C., Rambabu, K.: Decade bandwidth circularly polarized antenna array. IEEE Trans. Antennas Propag. 61(11), 5435–5443 (2013)
9. Gabriel, C., Gabriel, S., Corthout, Y.: The dielectric properties of biological tissues: I. Literature survey. Phys. Med. Biol. 41(11), 2231 (1996)
10. Gabriel, S., Lau, R., Gabriel, C.: The dielectric properties of biological tissues: II. Measurements in the frequency range 10 Hz to 20 GHz. Phys. Med. Biol. 41(11), 2251 (1996)
11. Li, X., Jalilvand, M., Zwirello, L., et al.: Synthetic aperture-based UWB imaging system for detection of urine accumulation in human bladder. In: 2011 IEEE International Conference on Ultra-Wideband (ICUWB), IEEE (2011)
12. Gabriel, S., Lau, R., Gabriel, C.: The dielectric properties of biological tissues: III. Parametric models for the dielectric spectrum of tissues. Phys. Med. Biol. 41(11), 2271 (1996)
13. Studio, M.: CST-computer simulation technology. Bad. Nuheimer. Str. 19(64289) (2008)
14. Chitradevi, B., Srimathi, P.: An overview on image processing techniques. Int. J. Innov. Res. Comput. Commun. Eng. 2(11), 6466–6472 (2014)
15. Parker, J.A., Kenyon, R.V., Troxel, D.E.: Comparison of interpolating methods for image resampling. IEEE Trans. Med. Imaging 2(1), 31–39 (1983)

AI in Medicine

A Nearest Neighbor Propagation-Based Partial Label Learning Method for Identifying Biotypes of Psychiatric Disorders

Yuhui Du1(B), Bo Li1, Ju Niu1, and Vince D. Calhoun2

1 School of Computer and Information Technology, Shanxi University, Taiyuan, China
[email protected]
2 Tri-Institutional Center for Translational Research in Neuroimaging and Data Science (TReNDS), Georgia State University, Georgia Institute of Technology, Emory University, Atlanta, GA, USA

Abstract. Diagnoses of psychiatric disorders based only on clinical presentation have limited reliability. In clinical practice, it is difficult to distinguish bipolar disorder with psychosis (BPP), schizoaffective disorder (SAD), and schizophrenia (SZ), as they have many overlapping symptoms. Therefore, there is an urgent need to develop new methods that help increase diagnostic reliability, or even explore biotypes for psychiatric disorders, by using neuroimaging measures such as brain functional connectivity (FC). Partial label learning can extract valid information from subjects whose labels are not fully accurate; however, it has not been well studied in the neuroscience field. Here, we propose a new partial label learning method to explore transdiagnostic biotypes using FC estimated from functional magnetic resonance imaging (fMRI) data. Our method iteratively mines reliable information from available subjects and then propagates the gained knowledge in a typical K + N graph structure. Based on fMRI data from 113 BPP patients, 113 SAD patients, 113 SZ patients, and 113 healthy controls (HC), meaningful biotypes are obtained using our method, showing significant differences in FC. In conclusion, the proposed method is promising for extracting biotypes of psychiatric disorders.

Keywords: Psychiatric disorders · Bipolar disorder with psychosis · Schizoaffective disorder · Schizophrenia · Partial label learning · Biotype

1 Introduction

At present, the diagnoses of psychiatric disorders are mainly based on the subjective description of patients’ symptoms and the observation of patients’ behavior. However, many psychiatric disorders are highly heterogeneous in terms of pathological and biological basis. Therefore, the clinical diagnoses obtained only through clinical evaluation or patient self-assessment often have the disadvantage of low accuracy due to a lack of biological evidence [1, 2].

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024 G. Wang et al. (Eds.): APCMBE 2023, IFMBE Proceedings 104, pp. 301–308, 2024. https://doi.org/10.1007/978-3-031-51485-2_32


There are already many well-established methods to explore biotypes in psychiatric disorders. The most commonly applied unsupervised clustering methods ignore diagnostic information entirely. To address this issue, semi-supervised learning methods that use a portion of labeled subjects as guidance have gained much attention from researchers. Ge et al. used Heterogeneity through Discriminative Analysis (HYDRA) to explore subtypes of youth aged 9–10 years with mood and anxiety disorders compared to typically developing youth [3]. Honnorat et al. used a semi-supervised clustering method called CHIMERA to discover three subtypes of schizophrenia [4]. Chand et al. applied a semi-supervised machine learning approach to structural MRI, which yielded two different biological types of schizophrenia (SZ) [5]. However, these semi-supervised learning approaches did not take into account data with inaccurate diagnosis labels. Partial label learning, by contrast, works under the assumption that the initial labels of subjects are ambiguous and constructs candidate label sets, which makes it very suitable for the study of mental disorders. However, to the best of our knowledge, no work has used partial label learning to study biotypes of mental disorders. Furthermore, traditional partial label learning methods have some shortcomings. For example, additional class labels are often added to the candidate label sets of subjects at random. Another limitation is that important subjects with more reliable diagnoses are not well exploited. These disadvantages probably hinder the application of partial label learning to the study of mental disorders [6, 7].

In this paper, a novel instance-based nearest neighbor propagation partial label learning (INNPL) method is proposed to overcome the disadvantages of previous partial label learning methods, and is then applied to exploring transdiagnostic biotypes using brain functional connectivity (FC) estimated from functional magnetic resonance imaging (fMRI) data. INNPL constructs candidate label sets based on data similarity, builds a special nearest neighbor graph that reinforces label information from reliable subjects, and iteratively propagates information to obtain the class labels of all subjects. We focus on identifying biotypes among bipolar disorder with psychosis (BPP), schizoaffective disorder (SAD), SZ, and healthy controls (HC), considering the large overlap in their clinical symptoms. Whole-brain regions-of-interest (ROI) based FC features are computed to investigate those disorders at a whole-brain level. Through our study, we not only extract biotypes across those disorders, but also verify significant FC differences between the identified biotypes, providing meaningful insights for neuroimaging-based auxiliary diagnosis of psychiatric disorders.

2 Material and Method

2.1 Materials

We use resting-state fMRI data of 113 patients with BPP, 113 patients with SAD, 113 patients with SZ, and 113 HC from the multi-site bipolar and schizophrenia network on intermediate phenotypes study [7]. There are no significant group differences in age and gender. Statistical parametric mapping (SPM) software is used for preprocessing the fMRI data [8]. Based on the fMRI data of each subject, we calculate whole-brain FC among 116 ROIs from the automated anatomical labeling (AAL) atlas as the neuroimaging features.
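To make the feature construction concrete, the sketch below turns one subject's ROI time series into a vector of FC features. It assumes that FC is the Pearson correlation between ROI-averaged signals and that only the unique ROI pairs are kept; the text does not state either choice, and the function name `fc_features` and the random input are purely illustrative.

```python
import numpy as np

def fc_features(roi_ts):
    """roi_ts: (timepoints, n_rois) array of ROI-averaged fMRI signals."""
    corr = np.corrcoef(roi_ts, rowvar=False)      # (n_rois, n_rois) correlation matrix
    iu = np.triu_indices(corr.shape[0], k=1)      # indices of the unique ROI pairs
    return corr[iu]                               # one value per ROI pair

rng = np.random.default_rng(0)
features = fc_features(rng.standard_normal((200, 116)))  # synthetic stand-in data
print(features.shape)  # (6670,)
```

With 116 ROIs there are 116 × 115 / 2 = 6670 unique pairs, so under these assumptions each subject is described by a 6670-dimensional feature vector.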


2.2 Method

We propose a partial label learning method named INNPL for biotype detection across the subjects of BPP, SAD, SZ, and HC using FC. Figure 1 shows the framework of INNPL.

Fig. 1. A framework of INNPL

In Step 1, typical subjects with prominent characteristics and reliable diagnosis labels are first selected for each class (e.g., BPP). In this work, within each class of subjects, we take the subjects in the top 10% of degree centrality as the typical subjects of this class, since they should be more reliable and representative. Then, the initial candidate label set Si is formed for each subject xi (1 ≤ i ≤ n). While the candidate labels of the typical subjects are made unique and the same as their diagnosis labels, the initial candidate labels of the remaining subjects are generated from their similarity to the typical subjects in addition to their initial diagnosis labels. That is, if a subject is close to the center of the typical subjects of one class, the corresponding class label is added to its candidate label set in addition to its original label.

Step 2 iteratively constructs a K + N nearest neighbor node graph and propagates the label information to disambiguate the candidate label sets of the nodes (i.e., subjects). In the K + N nearest neighbor node graph, each node represents one subject, and each edge linking two subjects is weighted by the similarity between them computed from their FC features. Unlike a regular graph, for each subject our K + N nearest neighbor node graph not only considers the similarity with its K nearest neighbor subjects, but also highlights the similarity with its N nearest typical subjects. This emphasizes the important role of typical subjects, whose labels are more reliable and whose features are more representative. As mentioned above, our method obtains the final labels of all subjects through iterative graph construction and label propagation. The following describes the two sub-steps in one iteration of Step 2.
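The selection of typical subjects and the fused K + N graph described above can be sketched as follows. Several details are assumptions made for illustration only: the similarity matrix is taken as given (here a correlation between FC vectors), degree centrality is computed as the weighted degree within each class, and the function names `typical_subjects` and `kn_graph` are hypothetical. The auxiliary graph and the candidate label sets are not covered by this sketch.

```python
import numpy as np

def typical_subjects(sim, labels, frac=0.10):
    """Boolean mask of the top `frac` subjects per class by within-class weighted degree."""
    typical = np.zeros(len(labels), dtype=bool)
    for c in np.unique(labels):
        members = np.flatnonzero(labels == c)
        within = sim[np.ix_(members, members)]
        degree = within.sum(axis=1) - np.diag(within)    # exclude self-similarity
        n_top = max(1, int(round(frac * len(members))))
        typical[members[np.argsort(degree)[::-1][:n_top]]] = True
    return typical

def kn_graph(sim, typical, k, n):
    """Fuse a K-nearest-neighbour graph with an N-nearest-typical-subject graph."""
    m = sim.shape[0]
    w_k = np.zeros((m, m))
    w_n = np.zeros((m, m))
    typical_idx = np.flatnonzero(typical)
    for i in range(m):
        others = np.delete(np.arange(m), i)
        nearest = others[np.argsort(sim[i, others])[::-1][:k]]
        w_k[i, nearest] = sim[i, nearest]                # K most similar subjects
        cand = typical_idx[typical_idx != i]
        nearest_typ = cand[np.argsort(sim[i, cand])[::-1][:n]]
        w_n[i, nearest_typ] = sim[i, nearest_typ]        # N most similar typical subjects
    return w_k + w_n                                     # weight matrix W^K + W^N

rng = np.random.default_rng(0)
sim = np.corrcoef(rng.standard_normal((40, 50)))         # synthetic similarity matrix
labels = np.repeat(np.arange(4), 10)                     # 4 classes, 10 subjects each
typical = typical_subjects(sim, labels)
W = kn_graph(sim, typical, k=3, n=2)
```

A typical subject that is also among a node's K nearest neighbours receives weight from both graphs, which is one way the fused matrix can emphasize typical subjects.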
Step 2(1): Construct a K + N nearest neighbor node graph by means of an auxiliary fully connected graph that changes with iterations. Specifically, the nodes of the auxiliary graph in this iteration include the nodes of the auxiliary graph from the previous iteration and their neighbor nodes. It should be pointed out that the labels of the nodes that are already determined in the previous iteration will be kept unchanged during the subsequent

304

Y. Du et al.

propagation by making their candidate label unique. The edges of the auxiliary graph are formed by calculating the inter-subject similarity using FC features. In particular, the nodes of the auxiliary graph in the first iteration include the typical nodes (i.e., subjects) and their nearest neighbor nodes. Based on the auxiliary graph, we select K nearest neighbor nodes and N nearest neighbor typical nodes for each node to obtain a K nearest neighbor subject graph with the weight matrix $W^K$ and an N nearest neighbor typical subject graph with the weight matrix $W^N$, and fuse them to obtain the K + N nearest neighbor node graph with the weight matrix $W^K + W^N$.

Step 2(2): Propagate label information in the constructed K + N graph to determine the labels of the nodes in this iteration, and then update the typical subjects of each class, the center of typical subjects of each class, and the candidate label set for each subject. Based on the K + N nearest neighbor node graph constructed in Step 2(1), the candidate label sets of the nodes newly added in this iteration are disambiguated to determine their labels. We use a label confidence matrix, whose elements represent the likelihood that each subject belongs to each class, to obtain the final label after label propagation. The detailed update strategy of the label confidence matrix during label propagation is as follows.

First, each element of the initial label confidence matrix $F^{(0)} = [f^{(0)}_{i,c}]_{n \times q}$ is given by

$$\forall\, 1 \le i \le n:\quad f^{(0)}_{i,c} = \begin{cases} \dfrac{1}{|S_i|}, & \text{if } y_c \in S_i \\ 0, & \text{otherwise} \end{cases} \tag{1}$$

Here, $n$ and $q$ denote the number of nodes in this iteration and the number of categories, respectively. $f^{(0)}_{i,c}$ denotes the initial likelihood that the $i$th subject belongs to the $c$th class with the label value $y_c$. $|S_i|$ denotes the number of labels in the candidate label set of the $i$th subject. As such, we guarantee that each subject has the same initial likelihood of belonging to each category whose label is in its candidate label set. For the $t$th propagation, we calculate a temporary label confidence matrix $\tilde{F}^{(t)} = [\tilde{f}^{(t)}_{i,c}]_{n \times q}$ according to Eq. (2).

$$\tilde{F}^{(t)} = \alpha \cdot (W^K + W^N) F^{(t-1)} + (1 - \alpha) \cdot F^{(0)} \tag{2}$$

$\alpha \in (0, 1)$ is a weight that balances the influence of the label confidence matrix $F^{(t-1)}$ from the $(t-1)$th propagation against the original label confidence matrix $F^{(0)}$. We then transform $\tilde{F}^{(t)}$ into $F^{(t)} = [f^{(t)}_{i,c}]_{n \times q}$ using Eq. (3), which ensures that the likelihoods of each subject belonging to the different categories sum to 1. Here, $y_c$ and $y_l$ represent the labels of the $c$th class and the $l$th class, respectively.

$$\forall\, 1 \le i \le n:\quad f^{(t)}_{i,c} = \begin{cases} \dfrac{\tilde{f}^{(t)}_{i,c}}{\sum_{y_l \in S_i} \tilde{f}^{(t)}_{i,l}}, & \text{if } y_c \in S_i \\ 0, & \text{otherwise} \end{cases} \tag{3}$$

When the norm of the difference between the confidence matrices of the tth and the (t − 1)th propagations is small enough, or the number of iterations reaches 100, the label propagation stops.
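The update rule in Eqs. (1) to (3), together with the stopping criterion, can be sketched as below. This is an illustration under stated assumptions rather than the authors' code: the function name `propagate`, the default value of `alpha`, the tolerance, and the use of the Frobenius norm for the stopping test are ours, and the weight matrix is assumed to be non-negative with at least one candidate label per subject.

```python
import numpy as np

def propagate(W, candidates, alpha=0.5, tol=1e-6, max_iter=100):
    """Iterative label propagation restricted to each subject's candidate set.

    W:          (n, n) non-negative weight matrix, standing in for W^K + W^N
    candidates: (n, q) boolean matrix, True where class c is in the candidate set S_i
    """
    cand = candidates.astype(float)
    F0 = cand / cand.sum(axis=1, keepdims=True)              # Eq. (1)
    F = F0.copy()
    for _ in range(max_iter):
        F_tmp = alpha * (W @ F) + (1 - alpha) * F0            # Eq. (2)
        F_tmp = F_tmp * cand                                  # zero outside S_i
        F_new = F_tmp / F_tmp.sum(axis=1, keepdims=True)      # Eq. (3)
        converged = np.linalg.norm(F_new - F) < tol           # assumed Frobenius norm
        F = F_new
        if converged:
            break
    return F

# Toy example: two subjects with a fixed label and one ambiguous subject.
W_toy = np.array([[0.0, 1.0, 1.0], [1.0, 0.0, 1.0], [1.0, 1.0, 0.0]])
cand_toy = np.array([[True, False], [True, False], [True, True]])
F_toy = propagate(W_toy, cand_toy)
pred = F_toy.argmax(axis=1)   # final label = candidate with the highest likelihood
```

In the toy example the ambiguous third subject is pulled toward the class of its two neighbours, while the two subjects with a unique candidate label keep it unchanged.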


Next, based on the final label confidence matrix F (t) after the label propagation, we determine each node’s label in this iteration by selecting the candidate label with the highest likelihood for each node. We then update the typical subjects by selecting the top 15% of nontypical subjects ranked by degree centrality within the same class, calculate the center of typical subjects of each class from the updated typical subjects, and update the candidate labels of the remaining subjects based on the class centers. This information is used for constructing the K + N graph in Step 2(1) of the next iteration. In summary, Step 2 iteratively builds a K + N graph to propagate label information until the labels of all subjects are obtained.

In our study, by using the proposed INNPL algorithm, the subjects of BPP, SAD, SZ, and HC are grouped into different biotypes. It is expected that the subjects grouped into the same biotype have a similar FC pattern, while different biotypes show significant differences in their FC patterns.

2.3 Evaluation of Biotypes

To evaluate the identified biotypes, we first calculate a confusion matrix to reflect the relationship between the original diagnosis categories and the identified biotypes. We use the analysis of variance (ANOVA) to evaluate the differences among the identified biotypes as well as among the original diagnosis categories, to investigate whether the biotypes show more group differences. After performing a Bonferroni correction (p-value < 0.05) on the group difference results, we select the 15 most significant FC features from the original categories and from the identified biotypes, respectively, for comparison. We also present the 3D projection of the subjects under the original categories and the identified biotypes via t-distributed stochastic neighbor embedding (t-SNE) to see whether the subjects from different biotypes are more clearly separated. To clearly reflect group differences in FC strength, we also visualize the mean FC strength for each of the 15 important FC features under the original categories and the identified biotypes in the brain map.
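The feature selection step described above (one-way ANOVA per FC feature, Bonferroni correction, then the 15 most significant features) could look like the following sketch. It assumes SciPy is available and uses `scipy.stats.f_oneway`; the function name `top_features`, the synthetic data, and the choice to multiply each p-value by the number of features are illustrative assumptions.

```python
import numpy as np
from scipy.stats import f_oneway

def top_features(X, groups, n_top=15, alpha=0.05):
    """X: (subjects, features); groups: (subjects,) integer group labels."""
    samples = [X[groups == g] for g in np.unique(groups)]
    _, p = f_oneway(*samples, axis=0)              # one p-value per feature
    p_bonf = np.minimum(p * X.shape[1], 1.0)       # Bonferroni-corrected p-values
    order = np.argsort(p)                          # most significant first
    significant = order[p_bonf[order] < alpha]
    return significant[:n_top], p_bonf

rng = np.random.default_rng(0)
X = rng.standard_normal((80, 30))                  # synthetic data, 30 features
groups = np.repeat(np.arange(4), 20)               # 4 groups of 20 subjects
X[groups == 1, 0] += 5.0                           # plant one strong group difference
selected, p_corrected = top_features(X, groups)
```

Running the same function once with the original diagnosis labels and once with the biotype labels gives the two feature lists that the paper compares.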

3 Results

Figure 2 shows the confusion matrices that reflect the relationship between the original diagnosis categories and the identified biotypes, computed by counting how many subjects in each original category are assigned to each biotype. Figure 2a and b display the results for the subjects excluding the typical subjects and for all subjects, respectively. It can be observed that most HC, SAD, and SZ subjects remained in the corresponding biotypes (i.e., HC*, SAD*, and SZ*), while many BPP patients are grouped into other biotypes, indicating the possible complexity and instability of BPP.


Fig. 2. a Confusion matrix for the subjects except the typical subjects. b Confusion matrix for all subjects. The percentage and number in each cell represent the proportion and the number of subjects that are overlapped between the original categories (i.e., HC, BPP, SAD, and SZ) and the identified biotypes (HC*, BPP*, SAD*, and SZ*).

Figure 3 shows the results of the 3D projections of the original categories and the biotypes via a t-SNE technology based on the PCA-reduced 15 FC features. As can be seen from the figure, the 3D subject projection of the biotypes obtained using INNPL shows a greater disparity between groups and a closer relation within each group, compared to the original categories.

Fig. 3. a The 3D projection of the subjects under the original categories via a t-SNE technology based on the PCA-reduced connectivity features. b The 3D projection of the subjects under the biotypes via a t-SNE technology based on the PCA-reduced connectivity features.

Figure 4a and b show the mean FC strength of the subjects for the 15 FCs that have the greatest group differences for the original category and the biotypes, respectively. Since the original diagnostic categories only had 15 FCs that showed group differences after the Bonferroni correction (p < 0.05), we also selected the top 15 FCs with the smallest p-values in the biotypes for a display. The mean p-value of the 15 FCs was smaller in the biotypes (i.e. 6.46e-23) than that in the diagnosis categories (i.e. 2.11e-06). It is seen that the mean FC strength of the identified biotypes had greater group differences, with the brain FC mainly involving the frontal, occipital and parietal lobes.


Fig. 4. a Mean FC strength of the subjects in each original category for the 15 FCs that show the greatest group differences (corresponding to the first 15 smallest p-values in ANOVA) in the original categories. b Mean FC strength of the subjects in each biotype for the 15 FCs that show the greatest group differences in the identified biotypes. Positive and negative strengths are shown using red and blue colors, respectively.

4 Conclusions

Heterogeneity in mental disorders is a significant issue. While several studies have attempted to identify biotypes using data-driven approaches, the findings have been inconsistent. As a result, developing new data-driven methods for predicting biotype labels remains a significant challenge. In this paper, a novel instance-based nearest neighbor propagation partial label learning method is proposed and successfully applied to the biotype identification of BPP, SAD, SZ, and HC. The biotypes identified by our method exhibited consistent neurobiological distinctiveness and biological importance. This is supported by the greater differences observed among the identified biotypes than among the original categories in the 15 most significant functional connectivity (FC) features. This finding provides evidence for the effectiveness of our INNPL method in studying psychiatric disorders. Much remains to be explored in the detection of transdiagnostic biotypes in psychiatric disorders using data-driven approaches. In the future, we will continue to work on model improvement and validation of partial label learning for cross-diagnostic studies of mental disorders.

Acknowledgment. This work was supported by National Natural Science Foundation of China (Grant No. 62076157 and 61703253, to Yuhui Du). We acknowledge the contribution of the participants in the Bipolar-Schizophrenia Network for Intermediate Phenotypes-1 (BSNIP-1).

References

1. Bzdok, D., Meyer-Lindenberg, A.: Machine learning for precision psychiatry: opportunities and challenges. Biol. Psychiatry: Cogn. Neurosci. Neuroimaging 3(3), 223–230 (2018)
2. Clementz, B.A., Sweeney, J.A., et al.: Identification of distinct psychosis biotypes using brain-based biomarkers. Am. J. Psychiatry 173(4), 373–384 (2016)
3. Ge, R., Sassi, R., et al.: Neuroimaging profiling identifies distinct brain maturational subtypes of youth with mood and anxiety disorders. Mol. Psychiatry 28(3), 1072–1078 (2023)
4. Honnorat, N., Dong, A., et al.: Neuroanatomical heterogeneity of schizophrenia revealed by semi-supervised machine learning methods. Schizophr. Res. 214, 43–50 (2019)
5. Chand, G.B., Dwyer, D.B., et al.: Two distinct neuroanatomical subtypes of schizophrenia revealed using machine learning. Brain 143(3), 1027–1038 (2020)
6. Xu, N., Qiao, C., et al.: Instance-dependent partial label learning. Adv. Neural. Inf. Process. Syst. 34, 27119–27130 (2021)
7. Tamminga, C.A., Ivleva, E.I., et al.: Clinical phenotypes of psychosis in the Bipolar-Schizophrenia Network on Intermediate Phenotypes (B-SNIP). Am. J. Psychiatry 170(11), 1263–1274 (2013)
8. Du, Y., Pearlson, G.D., et al.: Identifying dynamic functional connectivity biomarkers using GIG-ICA: application to schizophrenia, schizoaffective disorder, and psychotic bipolar disorder. Hum. Brain Mapp. 38(5), 2683–2708 (2017)

Predicting Timing of Starting Continuous Renal Replacement Therapy for Critically Ill Patients with Acute Kidney Injury Using LSTM Network Model

Chengyuan Li1, Zunliang Wang1(B), Lu Niu1, and Songqiao Liu2,3(B)

1 State Key Laboratory of Bioelectronics, School of Biological Science and Medical Engineering, Southeast University, Nanjing, China
[email protected]
2 Jiangsu Provincial Key Laboratory of Critical Care Medicine, Department of Critical Care Medicine, School of Medicine, Zhongda Hospital, Southeast University, Nanjing, China
[email protected]
3 Department of Critical Care Medicine, Nanjing Lishui People’s Hospital, Zhongda Hospital Lishui Branch, Nanjing, China

Abstract. Acute kidney injury (AKI) is a common complication after liver transplantation (LT), with a high mortality rate. Continuous renal replacement therapy (CRRT) is an important means of treating AKI, but its optimal starting time is still controversial. In this study, we present a long short-term memory (LSTM) network based predictive model to select the timing of initiating CRRT for AKI patients. We used patients’ daily clinical examination data from MIMIC-IV to build prediction models. The study population was a random sample of patients aged 18 years or older with AKI. Using the model, the timing of starting CRRT for AKI patients can be well predicted 12–18 h in advance. The area under the receiver operating characteristic curve (AUC) for predicting CRRT initiation 12 h in advance was 0.985. For the interpretability of the model, we used Shapley Additive Explanations to analyze the importance of features, and to identify the highly relevant factors in the model prediction process and the positive and negative contributions of the value of each variable to the model. The five most important features for this prediction model are the dose of vasopressin within 6 h, creatinine value, bicarbonate concentration, chloride concentration, and blood urea nitrogen. This study will help clinicians make optimal decisions on CRRT treatment for patients with AKI to achieve better outcomes.

Keywords: CRRT · AKI · Time Series Prediction · Sliding Window · LSTM

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024 G. Wang et al. (Eds.): APCMBE 2023, IFMBE Proceedings 104, pp. 309–316, 2024. https://doi.org/10.1007/978-3-031-51485-2_33


1 Introduction

Acute kidney injury (AKI) is a common but potentially fatal complication in the intensive care unit (ICU) [1]. AKI is associated with high morbidity, high mortality, and high treatment costs. Renal replacement therapy (RRT) is an important supportive treatment for patients with severe AKI [2]. It helps maintain homeostasis, which is extremely important in the treatment of critically ill patients. Several countries have carried out extensive research on the timing of CRRT initiation. Most of the existing studies are prospective randomized controlled trials [3]: patients are randomly assigned, according to certain screening conditions, to a standard group or a delayed starting group, and the change in mortality under renal replacement at different starting times is observed in order to find a more reasonable starting time. However, different randomized controlled trials have reached different conclusions [4–6]. In summary, there is still much room for development in clinical research on the timing of starting renal replacement therapy in critically ill patients, and it is difficult to find a perfect solution in a short time. However, if the timing of starting renal replacement therapy can be predicted in advance based on the experience of practicing doctors, leaving enough time for doctors to perform additional interventions, it will have considerable clinical significance for the recovery of patients.

Today, most reported predictive studies focus on the mortality rate after CRRT for critically ill patients [7, 8], or on selecting feature thresholds related to the starting time based on clinical observation. Intelligent prediction of the CRRT starting time is still lacking. Predicting the starting time of CRRT requires the analysis of continuous clinical data with strong real-time performance. This poses a challenge for traditional machine learning based prediction methods, which involve many complex features. Moreover, the collection of time-series data, the division of time windows, and the definition of patient labels are the difficulties of this prediction task. LSTM networks [9] are well suited to classifying, processing, and predicting time series given time lags of unknown duration. In this study, we employed an LSTM network to build a prediction model that can help clinicians determine the optimal CRRT starting time for AKI patients.

2 Materials And Methods

2.1 Source of Data

The MIMIC [10] database integrates critical clinical data of patients at the Beth Israel Deaconess Medical Center in Boston, Massachusetts. The data set does not contain personal sensitive information and is compliant with HIPAA security laws [11]. The version officially released in 2021 is MIMIC-IV, which consists of 28 tables; the data of its three major modules, Core, Hosp, and ICU, have been put into use.


2.2 Research Population

In MIMIC-IV, there were 33,135 patients with AKI stage > 0 as defined by the KDIGO [12] criteria, 44,486 ICU events, and 1,985 CRRT treatment records. Given the time-serialized data and the label division method used in this study (see the label division in Sect. 2.4), it is not suitable to include all AKI admission events. We first collected the 1,985 cases diagnosed with AKI and treated with CRRT. From the admission events of patients diagnosed with AKI but not treated with CRRT, we then drew a stratified random sample at a ratio of 0.1 across the three AKI stages. The final cohort comprised 4,268 patients who were diagnosed with AKI and did not start CRRT and 1,985 who were diagnosed with AKI and started CRRT.

2.3 Data Extraction and Feature Screening

We divided each patient's series of medical events into time units, taking 6 h as the basic unit. All records that occurred within the same 6-h period were grouped together. We extracted demographic data, admission information, vital signs, laboratory tests, and related drug treatments during admission from the MIMIC-IV database to form a feature set, and preprocessed the raw data, which yielded a denser dataset with 27 features. We used linear interpolation to fill in the remaining missing data. The specific features used in this study are shown in Table 1.

Table 1. Feature data used in the model

Data type | Data description
Patient basic information | Age, height, sex, Charlson comorbidity index
Fluid input and output | Urine output
Vasopressor | Dobutamine, dopamine, epinephrine, norepinephrine, milrinone, phenylephrine, vasopressin
Blood gas | Blood urea nitrogen, bicarbonate, chloride, calcium, glucose, potassium, sodium, anion gap, creatinine
Vital sign | Heart rate, systolic blood pressure, diastolic blood pressure, mean blood pressure, respiratory rate, temperature, blood oxygen saturation
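The 6-h grouping and linear interpolation described in Sect. 2.3 can be sketched as follows. This is a minimal illustration rather than the authors' pipeline: the function names are hypothetical, averaging the records inside one unit is an assumption (the text only says they are grouped together), and copying the nearest value at the edges of a series is likewise an assumption.

```python
from collections import defaultdict

BIN_HOURS = 6  # basic time unit from Sect. 2.3


def bin_records(records, bin_hours=BIN_HOURS):
    """Group (hours_since_admission, value) records into time units.

    Assumption: records falling into the same unit are averaged.
    """
    buckets = defaultdict(list)
    for hours, value in records:
        buckets[int(hours // bin_hours)].append(value)
    return {idx: sum(vals) / len(vals) for idx, vals in buckets.items()}


def to_series(binned, n_bins):
    """Lay binned values out as a fixed-length list; empty units become None."""
    return [binned.get(i) for i in range(n_bins)]


def interpolate_linear(series):
    """Fill None gaps linearly between known neighbours.

    Assumption: gaps at either edge copy the nearest known value.
    """
    known = [i for i, v in enumerate(series) if v is not None]
    if not known:
        return list(series)
    filled = list(series)
    for i in range(len(filled)):
        if filled[i] is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled[i] = series[right]
        elif right is None:
            filled[i] = series[left]
        else:
            weight = (i - left) / (right - left)
            filled[i] = series[left] + weight * (series[right] - series[left])
    return filled


# Hypothetical creatinine readings: (hours since admission, value)
creatinine = [(1.0, 1.0), (4.5, 1.2), (20.0, 2.0)]
series = to_series(bin_records(creatinine), n_bins=4)
print(series)                      # two readings share unit 0; units 1 and 2 are empty
print(interpolate_linear(series))  # the empty units are filled on a straight line
```

In this toy series the two readings in the first 6 h collapse into one value, and the two empty units between the first and last unit are filled by interpolation.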

2.4 Time Series Forecasting Events via Sliding Window

As shown in Fig. 1, all electronic health record (EHR) data available for each patient, covering both inpatient and outpatient events, were structured into a sequential history in six-hour time units, each depicted as a circle. We used the feature data (Table 1) covered by a 24-h sliding feature window for model prediction. The sliding window moves forward one step every 6 h, and each step is treated as a prediction event that determines whether CRRT starts in the 6 h following a 12-h gap (see Fig. 1).


C. Li et al.

Fig. 1. Sliding window for the timing prediction of CRRT initiation

Figure 2 shows the division of positive and negative samples in this study. A red basic unit indicates that the patient has started CRRT at that time; a blue unit indicates the opposite. For each slide, if the CRRT starting time falls after the 6-h prediction unit, we marked the last basic unit of the 24-h feature sliding window with a negative label. Conversely, if the CRRT starting time falls within the 6-h prediction unit, we marked the last basic unit of the 24-h feature sliding window with a positive label. Each slide of the time window is regarded as a prediction event.
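The windowing and labelling rules above can be sketched in a few lines. This is a hedged illustration, not the authors' code: `make_samples` and its arguments are hypothetical names, time is counted in 6-h units, and two behaviours are assumptions the text does not state, namely that windows are only generated while the prediction unit lies inside the recorded stay, and that sliding stops once the prediction unit has passed the CRRT start.

```python
WINDOW_UNITS = 4  # 24-h feature window = four 6-h units
GAP_UNITS = 2     # 12-h gap = two 6-h units
# The prediction unit is the single 6-h unit that follows the gap.


def make_samples(n_units, crrt_start_unit=None):
    """Return (window_start, window_end, label) triples for one ICU stay.

    n_units: number of 6-h units recorded for the stay.
    crrt_start_unit: index of the unit in which CRRT started, or None.
    """
    samples = []
    horizon = WINDOW_UNITS + GAP_UNITS  # offset from window start to prediction unit
    for start in range(n_units - horizon):
        end = start + WINDOW_UNITS - 1  # last unit of the feature window
        pred_unit = start + horizon     # unit whose CRRT status is predicted
        if crrt_start_unit is not None and pred_unit > crrt_start_unit:
            break  # assumption: stop once CRRT has already started
        label = int(crrt_start_unit is not None and pred_unit == crrt_start_unit)
        samples.append((start, end, label))
    return samples


# Hypothetical stay of 10 units (60 h) with CRRT starting in unit 8.
print(make_samples(10, crrt_start_unit=8))
# Hypothetical stay of the same length without CRRT.
print(make_samples(10))
```

For the stay with CRRT, only the last window generated is positive; every window of the stay without CRRT is negative, which shows why positive samples are comparatively rare.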

Fig. 2. Sample labelling criteria

2.5 Model Training and Prediction

We trained our model on the MIMIC-IV dataset using 5-fold cross-validation [13]. The basic workflow of our machine-learning prediction method is shown in Fig. 3. We first extracted the ICU patient data from MIMIC-IV and then performed feature processing. The high-dimensional feature map is processed by an LSTM layer with 64 units and then converted to 2D features by a flattening layer. These tensors are normalized and then processed by the fully connected layer. Finally, the prediction unit outputs the prediction result for the CRRT starting time.


Fig. 3. Schematic of the prediction method based on the LSTM network

3 Results

We trained our model using 5-fold cross-validation to assess its generalization. To evaluate model performance, we compared the LSTM network with Random Forest (RF) [14], XGBoost [15], and OCSVM, an anomaly detection model for unbalanced datasets [16]; the results are given in Table 2. The four models achieved the precision of 59.8–90.4% and areas under the receiver operating characteristic curve (AUC) of 0.644–0.985. The ROC curves of the four models are shown in Fig. 4. The comparison suggests that the LSTM model is the best at predicting the CRRT starting time.

Table 2. Comparison of model prediction performance

Metric | LSTM | XGBoost | RF | OCSVM
Precision | 0.986 | 0.933 | 0.972 | 0.610
Recall | 0.900 | 0.880 | 0.890 | 0.580
Specificity | 0.908 | 0.886 | 0.897 | 0.595
AUC | 0.985 | 0.942 | 0.973 | 0.644
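For reference, the metrics reported in Table 2 follow from the confusion matrix in the usual way. The sketch below uses standard textbook definitions with hypothetical function names and toy labels; it is not the authors' evaluation code, and the AUC is computed by simple pairwise comparison of scores, which is only practical for small samples.

```python
def binary_metrics(y_true, y_pred):
    """Precision, recall (sensitivity) and specificity from 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return {
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
    }


def auc_pairwise(y_true, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes are required to compute AUC")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy labels and scores, for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
print(binary_metrics(y_true, y_pred))
print(auc_pairwise([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))
```

Recall depends only on the positive samples, so it is the metric most affected when positives are scarce.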

We adopted Shapley Additive Explanations (SHAP) [17] to examine feature importance for the LSTM prediction model. The mean absolute SHAP value of each feature was used as its importance indicator, as listed in Table 3. The vasopressin amount clearly has the largest effect on the model prediction. In addition, blood gas data such as creatinine and bicarbonate also contribute substantially. In contrast, some vital signs such as heart rate and temperature contribute less to the prediction performance in this study.
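The importance indicator used here is the mean of the absolute SHAP values of a feature over all samples. A minimal sketch of that aggregation step is shown below; it assumes the per-sample attribution values have already been computed by some SHAP explainer, and the feature names and numbers are hypothetical placeholders rather than values from this study.

```python
def mean_abs_importance(feature_names, attributions):
    """Rank features by mean(|attribution|) over samples.

    attributions: list of rows, one row per sample, one value per feature.
    """
    n_samples = len(attributions)
    scores = {}
    for j, name in enumerate(feature_names):
        scores[name] = sum(abs(row[j]) for row in attributions) / n_samples
    # Sort from most to least important.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical attribution values for three features and two samples.
names = ["vasopressin", "creatinine", "heart_rate"]
values = [
    [0.40, -0.05, 0.02],
    [-0.30, 0.04, -0.03],
]
for name, score in mean_abs_importance(names, values):
    print(f"{name}: {score:.3f}")
```

Taking the absolute value before averaging keeps positive and negative contributions from cancelling, so the score reflects the magnitude of a feature's influence rather than its direction.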


Fig. 4. ROC curves of the XGBoost, RF, OCSVM and LSTM models

4 Discussion

This study clarifies the importance of the related clinical data in predicting the CRRT initiation time. Since all clinical data used in this study came from a single-center dataset (MIMIC-IV), further validation on large amounts of high-quality multicenter data is needed to improve the performance and reliability of our prediction model. Our results show that the recall (sensitivity) of each model is relatively low, indicating that the prediction accuracy on the true positive samples needs further improvement. This is mainly caused by the imbalance of the sample data, in which the proportion of positive samples is relatively low. We will therefore further examine how the distribution of the data sets affects the sensitivity of the prediction model. In addition, a dynamic sliding window method will be investigated to optimize feature selection and improve the prediction performance of our model.

5 Conclusions

In this study, we developed a machine-learning prediction method based on an LSTM network, with which the CRRT starting time of AKI patients can be predicted 12–18 h in advance from their daily clinical examination data. The result of this study is expected to serve as an intelligent CRRT decision-making tool that helps doctors select the optimal timing of CRRT initiation in AKI patients for better outcomes.


Table 3. Feature importance rankings

Feature name | Mean(|SHAP value|)
Vasopressin amount | 0.362
Creatinine | 0.048
Bicarbonate | 0.038
Chloride | 0.035
Blood urea nitrogen | 0.032
Systolic blood pressure | 0.031
Urine amount | 0.029
Anion gap | 0.028
Respiratory rate | 0.026
Glucose | 0.024
Potassium | 0.023
Heart rate | 0.023
Diastolic blood pressure | 0.020
Temperature | 0.019
Sodium | 0.018

Mean(|SHAP value|): average impact on model output magnitude

Acknowledgment. This research was financially supported by the Jiangsu Provincial Special Program of Medical Science, China (BE2020786), and the National Natural Science Foundation of China (81971885).

References

1. De Vlieger, G., Kashani, K., Meyfroidt, G.: Artificial intelligence to guide management of acute kidney injury in the ICU: a narrative review. Curr. Opin. Critical Care 26(6), 563–573 (2020)
2. Cho, K.C., Himmelfarb, J., Paganini, E., et al.: Survival by dialysis modality in critically ill patients with acute kidney injury. J. Am. Soc. Nephrol. 17(11), 3132–3138 (2006)
3. Gaudry, S., Hajage, D., Martin-Lefevre, L., et al.: The artificial kidney initiation in kidney injury 2 (AKIKI2): study protocol for a randomized controlled trial. Trials 20(1), 1–10 (2019)
4. Ren, A., et al.: Optimal timing of initiating CRRT in patients with acute kidney injury after liver transplantation. Ann. Transl. Med. 8, 21 (2020)
5. Zarbock, A., et al.: Effect of early vs delayed initiation of renal replacement therapy on mortality in critically ill patients with acute kidney injury: the ELAIN randomized clinical trial. JAMA 315(20), 2190–2199 (2016)
6. Gaudry, S., et al.: Delayed versus early initiation of renal replacement therapy for severe acute kidney injury: a systematic review and individual patient data meta-analysis of randomised clinical trials. Lancet 395(10235), 1506–1515 (2020)
7. Hung, Pei-Shan, et al.: Explainable machine learning-based risk prediction model for in-hospital mortality after continuous renal replacement therapy initiation. Diagnostics 12(6), 1496 (2022)
8. Kang, M., et al.: Machine learning algorithm to predict mortality in patients undergoing continuous renal replacement therapy. Critical Care 24(1), 1–9 (2020)
9. Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Comput. 9(8), 1735–1780 (1997)
10. Johnson, A.E.W., Pollard, T.J., et al.: MIMIC-III, a freely accessible critical care database. Sci. Data 3 (2016)
11. Gross Marielle, S., Hood Amelia, J., Rubin Joshua, C.: HIPAA and the Leak of "Deidentified" EHR data. N. Engl. J. Med. 385(12), e38–e38 (2021)
12. ad-hoc working group of ERBP et al.: A European Renal Best Practice (ERBP) position statement on the Kidney Disease Improving Global Outcomes (KDIGO) clinical practice guidelines on acute kidney injury: part 1: definitions, conservative management and contrast-induced nephropathy. Nephrol. Dial. Transplant. 27(12), 4263–4272 (2012)
13. Refaeilzadeh, P., Tang, L., Liu, H.: Cross-validation. Encycl. Database Syst. 5, 532–538 (2009)
14. Breiman, L.: Random forests. Mach. Learn. 45(1), 5–32 (2001)
15. Chen, T., Guestrin, C.: Xgboost: A scalable tree boosting system. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (2016)
16. Schölkopf, B., et al.: Support vector method for novelty detection. In: Advances in Neural Information Processing Systems 12 (1999)
17. Sundararajan, M., Najmi, A.: The many Shapley values for model explanation. In: International Conference on Machine Learning. PMLR (2020)

An End-To-End Seizure Prediction Method Using Convolutional Neural Network and Transformer

Yiyuan Wang and Wenshan Zhao(B)

School of Electronic and Information Engineering, Beijing Jiaotong University, Beijing, China
[email protected]

Abstract. With the fast development of intelligent medical technology, epileptic seizure prediction (SP) based on electroencephalography (EEG) has gradually become a frontier research topic in the field of healthcare digitalization due to the advantages of unravelling the mechanism of seizures and avoiding possible injuries. Existing SP methods based on EEG have several shortcomings, such as poor feature representation ability and low prediction performance. In this paper, an end-to-end model for SP based on EEG is proposed, where a convolutional neural network (CNN) is used to extract spatial features and a transformer is employed to analyze long-term temporal information. The proposed model was evaluated on the Children's Hospital Boston-Massachusetts Institute of Technology (CHB-MIT) dataset. Experiment results show that the hybrid model achieves excellent seizure prediction performance, with accuracy of 92.5%, sensitivity of 91.8%, specificity of 93.1%, F1 score of 0.925 and area under the curve of 0.924.

Keywords: Artificial intelligence in medicine · Convolutional neural network · Electroencephalography · Seizure prediction · Transformer

1 Introduction

Epilepsy is one of the most common chronic diseases of the nervous system, with more than 50 million people affected worldwide [1]. Although most patients can control their disease with anti-epileptic drugs (AEDs), AEDs remain ineffective for 30 percent of patients, a condition known as refractory epilepsy. Recurrent and unprovoked seizures pose a constant threat to refractory patients and seriously affect their physical and mental health. By employing a seizure prediction algorithm (SPA), a reliable seizure prediction system can raise a seizure alarm and remind medical workers and patients to intervene in advance. In this way, the psychological pressure on patients can be reduced and medical resources can be optimally allocated, which has great practical significance for healthcare digitalization [2].

As the gold standard in epilepsy diagnosis and treatment, the electroencephalogram (EEG) reflects the electrophysiological activity of brain nerve cells during an epileptic seizure [3]. The EEG pattern of epileptic patients can be divided into the interictal state, preictal state, ictal state and postictal state, as shown in Fig. 1. An SPA aims to distinguish preictal signals from interictal signals, which is a typical binary classification task.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024. G. Wang et al. (Eds.): APCMBE 2023, IFMBE Proceedings 104, pp. 317–324, 2024. https://doi.org/10.1007/978-3-031-51485-2_34

Fig. 1. Schematic diagram of four different periods in epilepsy EEG signals.

Up to now, SPAs can be divided into two categories: conventional machine learning methods and deep learning methods. The former require manually extracted features that are fed into classifiers such as the support vector machine [4] and random forest [5]. Nevertheless, it is difficult to design the most discriminative features manually, since the difference between interictal and preictal EEG is inconspicuous. To extract features automatically, researchers have introduced deep learning into SPA, such as the convolutional neural network (CNN) [6] and long short-term memory (LSTM) [7]. Although a CNN has strong modeling ability for spatial feature extraction, it pays less attention to temporal information. LSTM, in turn, can capture temporal correlation but is insufficient for modeling long-term dependence. The transformer can capture long-term dependence in time series and has been applied in SPA [8]. However, most existing transformer-based SPAs still need manual feature extraction, which cannot guarantee prediction performance across patients with specific seizure patterns [9].

To address the above problems, this paper proposes an end-to-end EEG-based prediction model using a CNN and a transformer. Considering that EEG is a one-dimensional (1D) signal, this paper applies 1D convolution instead of the traditional two-dimensional (2D) convolution, so as to improve the feature representation ability and computational efficiency. In addition, the encoder structure of the transformer is introduced, and the self-attention mechanism is used to handle the long-term dependence problem while improving the parallel computing ability of the model. The proposed SPA combines the complementary advantages of the CNN and the transformer and can automatically extract the best features for each patient. Experimental results show that the proposed method can effectively improve the performance of SPA.

2 Materials and Methods

2.1 Data Description

The widely recognized Children's Hospital Boston-Massachusetts Institute of Technology (CHB-MIT) scalp EEG dataset is used in this paper [10]. The CHB-MIT dataset contains EEG data from 22 patients, sampled at 256 Hz. The preictal data in this paper are defined as the one hour of data before a seizure. The interictal data were constructed by selecting 1 h of data from the interictal periods before and after a seizure, respectively. To avoid interference between different seizures, each patient must have at least two seizure events, and the two events must be separated by more than 4 h. Besides, the number of EEG channels ranges from 18 to 28. To mitigate the effect of data heterogeneity and give a fair comparison, only the most consistent 18 channels are used: FP1-F7, F7-T7, T7-P7, P7-O1, FP1-F3, F3-C3, C3-P3, P3-O1, FP2-F4, F4-C4, C4-P4, P4-O2, FP2-F8, F8-T8, T8-P8, P8-O2, FZ-CZ, and CZ-PZ.

2.2 Preprocessing

In the medical field, imbalanced data is a common problem, and it also arises in the seizure prediction task. To address this problem, an overlapping sliding window is used in this paper. Specifically, EEG signals are segmented with a sliding window of length 1 s. The preictal segments overlap by 75% while the interictal segments overlap by 50%, which brings the two classes to a ratio of approximately 1:1. Z-score standardization was then applied, so that each EEG segment has zero mean and unit variance.

2.3 Convolutional Neural Networks

Traditional CNNs are mostly used in image processing. Their convolution kernels are mostly two-dimensional, which limits them when applied directly to one-dimensional EEG signals. Therefore, this paper uses 1D convolution kernels in place of the traditional 2D kernels, making the network compatible with EEG signals. Assuming that the length of the input x[n] is N, the number of convolution kernels is K, the weight of the neurons is w, the bias is b, and the activation function is the rectified linear unit (ReLU), the output of the improved convolutional layer is calculated as follows:

$$f = \max\left(\sum_{i=1}^{K}\sum_{n=1}^{N} x[n]\,w_i[n] + b_i,\; 0\right) \tag{1}$$
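The preprocessing of Sect. 2.2 and the convolution-plus-ReLU operation behind Eq. (1) can be illustrated with a short, library-free sketch. Several points are assumptions: the function names are hypothetical, each seizure is taken to contribute 1 h of preictal and 2 h of interictal data (one reading of Sect. 2.1), and `conv1d_relu` implements the conventional sliding form of a 1D convolution for a single kernel rather than a literal transcription of Eq. (1).

```python
from statistics import fmean, pstdev

FS = 256  # CHB-MIT sampling rate in Hz


def count_windows(duration_s, window_s=1, overlap=0.5, fs=FS):
    """Number of full windows obtained from a recording of the given length."""
    window = int(window_s * fs)         # window length in samples
    step = int(window * (1 - overlap))  # hop between consecutive windows
    total = int(duration_s * fs)
    return (total - window) // step + 1


def zscore(segment):
    """Standardise one segment to zero mean and unit variance."""
    mean, std = fmean(segment), pstdev(segment)
    if std == 0:
        return [0.0 for _ in segment]
    return [(v - mean) / std for v in segment]


def conv1d_relu(x, kernel, bias=0.0, stride=1):
    """Valid 1D convolution (cross-correlation form) followed by ReLU."""
    width = len(kernel)
    out = []
    for start in range(0, len(x) - width + 1, stride):
        acc = sum(x[start + j] * kernel[j] for j in range(width)) + bias
        out.append(max(acc, 0.0))  # ReLU keeps only non-negative responses
    return out


print(count_windows(3600, overlap=0.75))  # preictal: 1 h with 75% overlap
print(count_windows(7200, overlap=0.50))  # interictal: 2 h with 50% overlap
print(conv1d_relu(zscore([1.0, 2.0, 3.0, 4.0, 5.0]), kernel=[-1.0, 0.0, 1.0]))
```

Under these assumptions the two calls to `count_windows` yield 14,397 and 14,399 segments, which is how the different overlap ratios bring the preictal and interictal classes close to 1:1.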

The schematic diagram of the CNN in this paper is shown in Fig. 2. It contains two one-dimensional convolutions with kernel sizes of 5 and 3, respectively, to extract local features of EEG signals. The output channels of the two convolutional layers are 32 and 64, respectively. Each convolutional layer is followed by a ReLU activation function and a batch normalization (BN) layer to improve the convergence speed. After that, the model applies a 1D max pooling layer with a kernel size of 2 and a step size of 2. Finally, to reduce the data dimension, a convolution layer with a kernel size of 1 is employed to reduce the number of channels to 32, and the multi-scale feature map is extracted. All convolution operations use a step size of 1.

Fig. 2. The schematic diagram of CNN structure proposed in this paper

2.4 Transformer Block

Recurrent neural networks such as LSTM are suitable for processing time series, but they cannot avoid the vanishing gradient problem and have difficulty capturing the long-term dependence features in time series. The core module of the transformer is the self-attention mechanism, which can handle long-term dependence and improve the parallel computing ability of the model [11]. At the same time, the self-attention mechanism can also make up for the weak ability of the CNN to process time series. Assuming that the input EEG feature maps are x_1 and x_2, three output vectors are obtained by matrix operation, namely the query vector Q, the key vector K and the value vector V. The calculation formulas are as follows:

$$q_1 = x_1 W^Q,\quad q_2 = x_2 W^Q \tag{2}$$

$$k_1 = x_1 W^K,\quad k_2 = x_2 W^K \tag{3}$$

$$v_1 = x_1 W^V,\quad v_2 = x_2 W^V \tag{4}$$

The output vectors z_1 and z_2 of the self-attention mechanism are obtained by the following formulas:

$$z_1 = [\theta_{11}, \theta_{12}]\begin{bmatrix} v_1 \\ v_2 \end{bmatrix} = \mathrm{softmax}\left(\frac{q_1 k_1^T}{\sqrt{d_k}}, \frac{q_1 k_2^T}{\sqrt{d_k}}\right)\begin{bmatrix} v_1 \\ v_2 \end{bmatrix} \tag{5}$$

$$z_2 = [\theta_{21}, \theta_{22}]\begin{bmatrix} v_1 \\ v_2 \end{bmatrix} = \mathrm{softmax}\left(\frac{q_2 k_1^T}{\sqrt{d_k}}, \frac{q_2 k_2^T}{\sqrt{d_k}}\right)\begin{bmatrix} v_1 \\ v_2 \end{bmatrix} \tag{6}$$

where d_k is the dimension of vector k. The multi-head attention mechanism introduces multiple Q, K, and V on the basis of the attention mechanism to enrich the temporal information of EEG signals. Considering the trade-off between training time and model performance, the number of heads is set to 8 in this paper. In addition, the residual structure and layer normalization are also applied in the transformer to improve the network training ability and avoid overfitting. The final seizure prediction is produced by two fully connected layers. The complete structure of the transformer in this paper is shown in Fig. 3.

Fig. 3. The schematic diagram of transformer model proposed in this paper
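Equations (2)–(6) can be traced numerically with a small single-head example. The sketch below is a plain-Python illustration under stated assumptions: the helper names are hypothetical, the inputs are treated as row vectors, and the projection matrices are set to the identity purely to keep the numbers easy to follow. It does not reproduce the 8-head transformer block of the paper.

```python
import math


def row_times_matrix(x, W):
    """Row vector x multiplied by matrix W (given as a list of rows)."""
    return [sum(x[i] * W[i][j] for i in range(len(x))) for j in range(len(W[0]))]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(xs, Wq, Wk, Wv):
    """Single-head scaled dot-product self-attention, following Eqs. (2)-(6)."""
    qs = [row_times_matrix(x, Wq) for x in xs]  # Eq. (2)
    ks = [row_times_matrix(x, Wk) for x in xs]  # Eq. (3)
    vs = [row_times_matrix(x, Wv) for x in xs]  # Eq. (4)
    d_k = len(ks[0])
    outputs = []
    for q in qs:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d_k) for k in ks]
        theta = softmax(scores)                 # attention weights theta_i1, theta_i2
        z = [sum(theta[t] * vs[t][j] for t in range(len(vs)))
             for j in range(len(vs[0]))]        # Eqs. (5) and (6)
        outputs.append(z)
    return outputs


identity = [[1.0, 0.0], [0.0, 1.0]]  # placeholder projection weights
x1, x2 = [1.0, 0.0], [0.0, 1.0]      # hypothetical feature vectors
z1, z2 = self_attention([x1, x2], identity, identity, identity)
print(z1, z2)
```

With identity projections each output is a weighted mix of the two value vectors in which the token's own value receives the larger weight, and the weights for each token sum to one.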

3 Results

According to the definition of preictal and interictal EEG in Sect. 2, 12 eligible patients were selected from the CHB-MIT dataset for the experiment, namely chb01, chb04, chb05, chb06, chb07, chb08, chb09, chb10, chb13, chb14, chb15 and chb22. The 10-fold cross-validation technique is employed to verify the robustness of the proposed SPA. Several evaluation metrics are used to assess its performance, including accuracy (ACC), sensitivity (SE), specificity (SP), F1 score and the area under the receiver operating characteristic curve (AUC). Table 1 gives the experiment results of the proposed SPA. The average values of ACC and SP are above 92%, which means that a low false alarm rate can be achieved, improving patients' compliance and reducing their anxiety. The average SE of 91.8% indicates a strong ability to classify preictal EEG samples correctly. As expected, the F1 score and AUC both exceed 0.92, showing the robust seizure prediction capability of the proposed SPA. Furthermore, Table 1 shows that every evaluation metric exceeds 80% (0.80) for each patient, which is meaningful for handling the patient-to-patient variation of seizure patterns in EEG-based seizure prediction.

Table 1. The experiment results of proposed SPA

Subject | ACC | SE | SP | F1 | AUC
chb01 | 97.265% | 96.816% | 97.717% | 0.97268 | 0.97267
chb04 | 92.030% | 93.688% | 90.331% | 0.91799 | 0.92009
chb05 | 83.989% | 85.442% | 82.586% | 0.83991 | 0.84014
chb06 | 89.527% | 89.132% | 89.379% | 0.89338 | 0.89256
chb07 | 97.847% | 96.894% | 98.812% | 0.97854 | 0.97853
chb08 | 96.076% | 95.710% | 96.438% | 0.96106 | 0.96074
chb09 | 97.708% | 96.651% | 98.778% | 0.97719 | 0.97714
chb10 | 95.416% | 94.805% | 96.014% | 0.95489 | 0.95409
chb13 | 94.114% | 92.091% | 96.088% | 0.94292 | 0.94090
chb14 | 84.025% | 81.980% | 86.144% | 0.84122 | 0.84062
chb15 | 84.494% | 81.771% | 87.197% | 0.84949 | 0.84484
chb22 | 97.013% | 96.831% | 97.191% | 0.97058 | 0.97011
Average | 92.459% | 91.818% | 93.056% | 0.92499 | 0.92437

Table 2. The performance comparison of existing SPA

SPA | EEG | ACC | SE | AUC
STFT + CNN [12] | scalp | NA | 81.2% | NA
WT + CNN [13] | scalp | NA | 87.8% | 0.866
LSTM [14] | intracranial | 85.1% | 86.8% | 0.920
CNN + LSTM [15] | scalp | 87.8% | 84.8% | NA
This work | scalp | 92.5% | 91.8% | 0.924

4 Discussion

Table 2 compares this paper with the existing literature, including the evaluation results and the type of EEG signal. The ACC, SE and AUC of the proposed method are all the best among the compared works. References [12] and [13] adopted the short-time Fourier transform (STFT) and the wavelet transform (WT), respectively, as features, which adds unnecessary workload. Reference [15] combines a CNN and LSTM to achieve better performance, but its evaluation results are still lower than those of the model in this paper. This indicates that the combination of CNN and transformer is successful and that the transformer has strong potential for modeling the long-term dependence of EEG signals in seizure prediction tasks. As mentioned in Sect. 2, the preictal EEG in this paper is defined as the one hour of data before a seizure. Compared with a preictal phase of 30 min or shorter, the SPA in this paper can predict seizures further in advance and provide timely protection for patients. It should be noted that this paper uses scalp EEG. Unlike intracranial EEG, as used in [14], which has the disadvantages of invasive acquisition and high surgical risk, scalp EEG is less harmful to patients and more in line with the future development of intelligent medical treatment.

5 Conclusion

A seizure prediction system integrated with an SPA is considered a promising tool to improve the quality of life of patients with intractable epilepsy. To enhance the prediction performance, this paper proposes an end-to-end EEG-based SPA. A CNN is used to automatically extract spatial features, and a transformer is used to capture long-term dependence in EEG signals through the self-attention mechanism. Experimental results show that the proposed SPA achieves ACC of 92.459%, SE of 91.818%, SP of 93.056%, F1 score of 0.92499 and AUC of 0.92437, which illustrates that the proposed method can predict seizures precisely and effectively. The proposed method can provide a feasible solution for prompt warning, which can help avoid sudden death caused by epilepsy and support the health management of patients with intractable epilepsy.

Acknowledgment. This work was supported by the National Natural Science Foundation of China under Grant 61504008.

References

1. Thurman, D.J., et al.: Standards for epidemiologic studies and surveillance of epilepsy. Epilepsia 52, 2–26 (2011)
2. Elger, C.E., Hoppe, C.: Diagnostic challenges in epilepsy: seizure under-reporting and seizure detection. Lancet Neurol. 17(3), 279–288 (2018)
3. Zhang, H., Su, J., et al.: Predicting seizure by modeling synaptic plasticity based on EEG signals-a case study of inherited epilepsy. Commun. Nonlinear SCI 56, 330–343 (2018)
4. Kamel, E.M., Massoud, Y.M., et al.: EEG classification for seizure prediction using SVM vs deep ANN. In: 2021 Tenth International Conference on Intelligent Computing and Information Systems (ICICIS), Cairo, Egypt, 2021, pp. 389–395 (2021)
5. Wang, Y., Cao, J., Lai, X., Hu, D.: Epileptic state classification for seizure prediction with wavelet packet features and random forest. In: 2019 Chinese Control And Decision Conference (CCDC), Nanchang, China, pp. 3983–3987 (2019)
6. Li, C., Lammie, C., et al.: Seizure detection and prediction by parallel memristive convolutional neural networks. IEEE Trans. Biomed. Circuits Syst. 16(4), 609–625 (2022)
7. Daoud, H., Bayoumi, M.A.: Efficient epileptic seizure prediction based on deep learning. IEEE Trans. Biomed. Circuits Syst. 13(5), 804–813 (2019)
8. Bhattacharya, A., Baweja, T., Karri, S.: Epileptic seizure prediction using deep transformer model. Int. J. Neural Syst. 32(2), 2150058 (2021)
9. Li, C., Huang, X., et al.: EEG-based seizure prediction via transformer guided CNN. Measurement 203, 111948 (2022)
10. CHB-MIT Scalp EEG Database at http://www.physionet.org/pn6/chbmit/
11. Vaswani, A., Shazeer, N., et al.: Attention is all you need. Adv. Neural. Inf. Process. Syst. 2017, 5998–6008 (2017)
12. Truong, N.D., Nguyen, A.D., et al.: Convolutional neural networks for seizure prediction using intracranial and scalp electroencephalogram. Neural Netw. 105, 104–111 (2018)
13. Khan, H., Marcuse, L., et al.: Focal onset seizure prediction using convolutional networks. IEEE Trans. Biomed. Eng. 65(9), 2109–2118 (2017)
14. Varnosfaderani, S.M., et al.: A two-layer LSTM deep learning model for epileptic seizure prediction. In: IEEE 3rd International Conference on Artificial Intelligence Circuits and Systems (AICAS), Washington DC, USA, pp. 1–4 (2021)
15. Wang, Z., Zhou, X.: Prediction of epileptic seizures based on CNN-LSTM network. In: 2022 2nd International Conference on Frontiers of Electronics, Information and Computation Technologies (ICFEICT), Wuhan, China, 2022, pp. 131–135 (2022)

Ensemble Feature Selection Method Using Similarity Measurement for EEG-Based Automatic Sleep Staging

Desheng Zhang and Wenshan Zhao(B)

School of Electronic and Information Engineering, Beijing Jiaotong University, Beijing, China
[email protected]

Abstract. Sleep staging is the first step in the diagnosis of sleep disorders. In recent years, automatic sleep staging algorithms based on the single-channel electroencephalogram (EEG) signal have received extensive attention, but they suffer from insufficient information, poor interpretability and low accuracy. To address these problems, this paper proposes an ensemble feature selection method based on similarity measurement. Firstly, statistical transformation and envelope extraction are used to transform the original features obtained by different feature extraction methods in order to obtain candidate features with the desired similarity. Secondly, multiple similarity metrics are used to extract a series of highly interpretable features for sleep staging. Then, a two-stage voting approach is used to select the customized features for each subject and the universal features for different subjects. Experiment results show that the proposed method achieves a classification accuracy of 97.33% when using random forest as the classifier.

Keywords: Sleep stages · Similarity measurement · Feature selection · EEG

1 Introduction

Sleep has attracted a tremendous amount of attention in scientific research due to its importance to health. According to a report of the Chinese Sleep Research Society, more than 300 million people in China have a sleep disorder [1]. Persistent sleep deprivation may lead to several health issues, such as high blood pressure, sleep apnea syndrome and cardiovascular disease [2]. At present, the first step in the diagnosis of sleep disorders is to classify the sleep stages of the subjects, commonly known as sleep staging. Manual sleep staging based on a polysomnography (PSG) device is considered the gold standard for measuring sleep. However, PSG devices that collect electroencephalogram (EEG) signals through multiple channels are often expensive and can be uncomfortable for subjects. Manual sleep staging based on PSG data is also time-consuming and error-prone [3]. In this context, the automatic sleep staging (ASS) algorithm based on the single-channel EEG signal has received extensive attention due to its minimal interference with sleep and its ability to provide acceptable levels of accuracy.

In the existing ASS algorithms, feature extraction plays an important role in enhancing the classification performance. Currently, the methods for feature extraction in ASS can be mainly divided into hand-engineered features and deep-learned features. Most of them, however, do not have good interpretability and cannot balance accuracy and computational complexity. In this paper, an ensemble feature selection method using similarity measurement is proposed for ASS based on single-channel EEG. The main contributions are as follows:

(1) Certain nonlinear features have been found to exhibit patterns of change similar to those observed in sleep stage transitions, as illustrated in Fig. 1. However, this special similarity is not always consistent or universal. Statistical transformation and envelope extraction are introduced to transform the original features obtained by different feature extraction methods, so as to obtain enough candidate features with the desired similarity.

(2) Based on the similarity measurement, an ensemble feature selection method using the voting mechanism of ensemble learning is proposed. The feature selection strategy is realized through similarity evaluation using well-established similarity metrics from various fields. In addition, a two-stage hard voting method is used to select the feature subset.

(3) Compared with the existing ASS algorithms, the classifier employed in this paper has low computational complexity. Random forest is used as the classifier to demonstrate the effectiveness of the selected features, achieving a classification accuracy of 97.33%.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024. G. Wang et al. (Eds.): APCMBE 2023, IFMBE Proceedings 104, pp. 325–332, 2024. https://doi.org/10.1007/978-3-031-51485-2_35

Fig. 1. Pattern of change between sleep stage transition and optimal feature

2 Methods

As the first step of machine learning, feature selection plays an important role in ASS. Given that the ensemble mechanism can better incorporate the merits of various feature selection methods, this paper proposes an ensemble feature selection method based on similarity measurement, as shown in Fig. 2.


Fig. 2. Framework of ensemble feature selection method

2.1 Generation of Original Feature Set

Since EEG is a non-stationary signal, it is reasonable to extract features from different domains in order to obtain the most representative features for the classification of sleep stages. In this paper, the features are extracted in four domains: the time domain, frequency domain, time-frequency domain and nonlinear domain. As shown in Table 1, six commonly used features are used in the proposed ASS algorithm.

Table 1. Summary of extracted features for ASS

Category | Feature name
Time domain | Segmenting method [4]
Frequency domain | Power spectral density [5]
Time-frequency domain | Wavelet transform [6]; Empirical mode decomposition [7]
Nonlinear dynamics | Generalized Hurst exponent [8]; Multi-scale sample entropy [9]

2.2 Generation of Feature Subsets with Desired Similarity The similarity, calculated by comparing the pattern of change of the expert-labelled sleep stage transitions with that of the extracted features, is used to evaluate and then select the optimal features for ASS. However, using only the features in Table 1 may lead to


D. Zhang and W. Zhao

unsatisfactory similarity and, in turn, low classification accuracy. Therefore, feature subsets are constructed from the original features in Table 1 to enhance the similarity. To this end, two methods, statistical transformation and envelope extraction, are introduced to transform each feature in Table 1 into candidate feature subsets. Features that do not meet the desired similarity criteria are eliminated, and the remaining features are retained as a subset of candidate features.

Statistical transformation: For similarity measurement, the feature waveform of each sample decomposed at each level is summarized by four statistics: the maximum, minimum, mean and variance. The statistical values derived from each feature are then concatenated in time to form a waveform, which serves as a candidate feature subset for subsequent selection.

Envelope extraction: The time series of a feature waveform can be regarded as a high-frequency signal whose amplitude change is reflected by the envelope of the curve. Meanwhile, the similarity measures the degree of rhythm synchronization between the feature and the sleep stage transition. Therefore, the envelope of the candidate features is extracted using a low-pass filter for similarity measurement, with the key filter parameters determined experimentally.

2.3 Similarity Measurement Based on Base Feature Selectors The methods for similarity measurement should be sufficiently diverse in order to select the optimal features for ASS.
In this paper, ensemble learning is introduced into feature selection, and ten metrics are employed to measure the similarity of candidate feature subsets: Euclidean distance, Manhattan distance, Chebyshev distance, Bray-Curtis distance, Mahalanobis distance, angle cosine, Pearson correlation coefficient, Hamming distance, structural similarity index measure [10] and Siamese neural network (SNN) [11]. Then, the ten features with the largest similarity are selected as the new feature subsets. Generally, the similarity metrics employed in this paper can be divided into two categories. One is based on traditional distance measurement. Assuming that the input vectors are x and y, the distance d_{12} is calculated as follows:

$$d_{12} = \sqrt{\sum_{k=1}^{n} (x_k - y_k)^2} \qquad (1)$$
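As an illustration of the distance-based category, the following minimal Python sketch implements a few of the listed metrics for two equal-length sequences. The function names and the toy vectors are hypothetical and are not taken from the paper; how the authors encode the expert labels or normalize the features before comparison is not specified here and is left out.

```python
import math

def euclidean(x, y):
    # Eq. (1): square root of the summed squared differences
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def manhattan(x, y):
    # Sum of absolute differences
    return sum(abs(a - b) for a, b in zip(x, y))

def chebyshev(x, y):
    # Largest absolute difference over all positions
    return max(abs(a - b) for a, b in zip(x, y))

def cosine_similarity(x, y):
    # Cosine of the angle between the two vectors
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)

def pearson(x, y):
    # Pearson correlation coefficient
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)

# Hypothetical toy data: integer-coded stage labels and one candidate feature
label = [0, 0, 1, 2, 2, 3, 2, 1]
feature = [0.1, 0.2, 0.9, 2.1, 1.8, 3.2, 2.0, 1.1]
for metric in (euclidean, manhattan, chebyshev, cosine_similarity, pearson):
    print(metric.__name__, round(metric(label, feature), 4))
```

Note that distances decrease and correlations increase as two sequences become more alike, so a ranking that mixes both kinds of metric needs a consistent sort direction per metric.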

The other approach converts the feature curve and the expert label into grayscale images with the same data distribution and then uses a Siamese neural network to measure their similarity. A Siamese neural network employs two identical sub-networks to process two inputs, followed by a module that combines their outputs to produce the final output. Therefore, unlike networks trained with a classification loss, the SNN uses the contrastive loss function, given by [11]:

$$L(x_1, x_2, I) = I D^2 + (1 - I)\max(0,\, m - D)^2 \qquad (2)$$

$$D = \lVert F(x_1) - F(x_2) \rVert_2 \qquad (3)$$


where D is the Euclidean distance between the two feature vectors F(x_1) and F(x_2) learned by the two sub-networks, m is the margin (the distance beyond which a dissimilar pair no longer contributes to the loss), and I is the label indicating whether the two inputs belong to the same category. In this paper, expert label images of healthy and disease-affected individuals are randomly paired to train the Siamese neural network to identify the pattern of sleep stage transition. After training, the candidate features are ranked by their similarity to the expert label image.
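The loss in Eqs. (2) and (3) can be sketched in a few lines of Python. This is only a numerical illustration on already-computed embeddings: the function names are hypothetical, the sub-network F is not modelled, and the default margin of 1.0 is an assumption rather than a value reported in the paper.

```python
import math

def embedding_distance(f1, f2):
    # Eq. (3): Euclidean distance between the two sub-network outputs
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(f1, f2)))

def contrastive_loss(f1, f2, same_category, margin=1.0):
    # Eq. (2): I = 1 for a same-category pair, I = 0 otherwise.
    # The margin default is an assumed placeholder value.
    d = embedding_distance(f1, f2)
    i = 1.0 if same_category else 0.0
    return i * d ** 2 + (1.0 - i) * max(0.0, margin - d) ** 2

# Same-category pairs are penalized by their squared distance
print(contrastive_loss([0.0, 0.0], [3.0, 4.0], same_category=True))
# Different-category pairs are penalized only when closer than the margin
print(contrastive_loss([0.0, 0.0], [0.0, 0.5], same_category=False))
```

The two branches pull similar pairs together and push dissimilar pairs apart until they are at least the margin away from each other.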

3 Two-Stage Majority Voting Method In this paper, a two-stage majority voting method is adopted. Because EEG patterns differ among subjects, this paper seeks a common set of features applicable to different subjects in order to expand the potential application scenarios. In the first stage, majority voting over the 10 similarity metrics is applied for each subject. The top-10, top-5 and top-3 features of the initial similarity ranking are regarded as three different feature subsets, and majority voting is carried out for each of them. In the second stage, the three updated feature subsets of each subject are aggregated over subjects 1–10, and majority voting is conducted again: the votes of all features in the three feature subsets of subjects 1–10 are counted. The optimal feature subset is selected by taking the highly overlapping, top-ranked features from the final ranking results of the three feature subsets.
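The two voting stages can be sketched as follows for a single top-k setting. This is a simplified reading of the description above, not the authors' implementation: the function names, the alphabetical tie-breaking rule, the number of features kept per subject and the toy rankings are all assumptions made for illustration.

```python
from collections import Counter

def stage_one(metric_rankings, top_k, keep):
    """Per-subject vote: each metric votes for its top_k features."""
    votes = Counter()
    for ranking in metric_rankings:
        votes.update(ranking[:top_k])
    # Assumed tie-break: more votes first, then alphabetical order
    ordered = sorted(votes, key=lambda f: (-votes[f], f))
    return ordered[:keep]

def stage_two(subject_subsets):
    """Cross-subject vote: count how many subjects retained each feature."""
    votes = Counter()
    for subset in subject_subsets:
        votes.update(subset)
    return sorted(votes, key=lambda f: (-votes[f], f))

# Hypothetical rankings: one list per similarity metric, best feature first
subject_a = [["f1", "f2", "f3"], ["f2", "f1", "f4"], ["f1", "f3", "f2"]]
subject_b = [["f2", "f1", "f3"], ["f2", "f4", "f1"], ["f4", "f2", "f1"]]

subsets = [stage_one(s, top_k=2, keep=2) for s in (subject_a, subject_b)]
print(subsets)
print(stage_two(subsets))
```

In the paper the same procedure would be repeated for the top-10, top-5 and top-3 settings, and the final subset taken from the features that rank highly in all three results.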

4 Experiments and Results 4.1 Dataset Description and Preprocessing The EEG dataset used in this paper is the Sleep-EDF Database Expanded, which is available on the PhysioNet.org website [12, 13]. Because the desired similarity was found only at the Pz-Oz electrode in the current study, only the data from this single channel are selected. Firstly, this paper uses the data of subjects 1 to 20, in which only the data from the first fully awake day to the first fully asleep day are retained for each subject, and the other recorded information is omitted. That is, the data of about 20 h after the beginning of the experiment are used. Secondly, this paper redefines the epoch length of the sleep EEG data, subdividing the original 30-s time window into 10-s time windows. It is worth emphasizing that the convention of using 30 s of EEG data per epoch has no strict physiological basis [14]. The sleep stages are then reclassified according to the AASM 2007 standard into Wakefulness, N1, N2, slow-wave sleep and rapid eye movement (REM). Finally, this classification of sleep stages is an imbalanced five-class problem. Thus, this paper introduces SMOTE-Tomek Links [15], which makes the number of samples in each class approximately equal. In order to further demonstrate the generalizability of the proposed feature selection method, this paper performs optimal feature selection on subjects 1–10 and evaluates classification performance on subjects 11–20.
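The re-epoching step can be sketched as below. The function name is hypothetical, the sampling rate is passed as a parameter with a default of 100 Hz, and the rule that every 10-s window inherits the label of its parent 30-s epoch is an assumption; the paper does not state how labels are assigned to the shorter windows.

```python
def split_epochs(signal, labels, fs=100, epoch_s=30, sub_s=10):
    """Split consecutive labelled epochs into shorter windows.

    signal: flat sequence of samples covering len(labels) epochs
    labels: one stage label per original epoch
    Each sub-window inherits its parent epoch's label (assumed rule).
    """
    samples_per_epoch = fs * epoch_s
    samples_per_sub = fs * sub_s
    sub_signals, sub_labels = [], []
    for i, label in enumerate(labels):
        epoch = signal[i * samples_per_epoch:(i + 1) * samples_per_epoch]
        for start in range(0, samples_per_epoch, samples_per_sub):
            sub_signals.append(epoch[start:start + samples_per_sub])
            sub_labels.append(label)
    return sub_signals, sub_labels

# Two hypothetical 30-s epochs of dummy samples
windows, window_labels = split_epochs(list(range(6000)), ["W", "N1"])
print(len(windows), window_labels)
```

Shorter windows triple the number of labelled samples, which also changes the class counts that a resampling step such as SMOTE-Tomek Links then has to balance.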


4.2 Results and Analysis The experimental results show that the top five features of the three feature subsets are basically consistent in the final voting results. Therefore, this paper finally selects a five-dimensional feature subset. In addition, random forest is used as the classifier. The confusion matrix in Fig. 3 shows the imbalanced classification results for a single subject. Even the N1 and REM stages, which have the worst results in most studies, still achieve relatively high performance in this paper.

Fig. 3. Confusion matrix classification results for individual subject

Figure 4 shows that the selected features perform stably across different subjects.

Fig. 4. Testing results of ASS for different subjects

Figure 5 visualizes the distribution of the data in the feature space of the 5-dimensional optimal feature subset. Table 2 compares the experimental results of the proposed method with those of existing methods.


Fig. 5. Data distribution of the selected features in feature space

Table 2. The performance comparison of average classification results

Sleep staging   Stage N1 F1   ACC      Weighted F1   Kappa
[3]             0.2500        0.9048   NA            NA
[16]            0.4827        0.9190   NA            0.8730
[17]            0.6309        0.9289   NA            0.8837
This paper      0.8461        0.9733   0.9736        0.9567

5 Conclusions This paper proposes an ensemble feature selection method to find the optimal features for ASS based on single-channel EEG data. Multiple similarity metrics are used to extract highly interpretable features, and a two-stage voting approach is used to select the optimal universal features. The experimental results show that the proposed method performs well, with an accuracy of 97.33%, a weighted F1 of 97.36%, a kappa of 95.67% and a stage N1 F1 of 84.61%.

References
1. Zhou, L., Kong, J., Li, X., et al.: Sex differences in the effects of sleep disorders on cognitive dysfunction. Neurosci. Biobehav. Rev. (2023)
2. Liew, S.C., Aung, T.: Sleep deprivation and its association with diseases-a review. Sleep Med. (2021)
3. Yildirim, O., Baloglu, U.B., Acharya, U.R.: A deep learning model for automated sleep stages classification using PSG signals. Int. J. Environ. Res. Public Health (2019)
4. Keogh, E., Chu, S., Hart, D., et al.: Segmenting time series: a survey and novel approach. In: Data Mining in Time Series Databases (2004)
5. Al Ghayab, H.R., Li, Y., Siuly, S., et al.: Epileptic EEG signal classification using optimum allocation based power spectral density estimation. IET Signal Process. (2018)
6. Sharmila, A., Geethanjali, P.: DWT based detection of epileptic seizure from EEG signals using naive Bayes and k-NN classifiers. IEEE Access (2016)
7. Hassan, A.R., Subasi, A.: Automatic identification of epileptic seizures from EEG signals using linear programming boosting. Comput. Methods Programs Biomed. (2016)
8. Lahmiri, S., Shmuel, A.: Accurate classification of seizure and seizure-free intervals of intracranial EEG signals from epileptic patients. IEEE Trans. Instrum. Meas. (2019)
9. Costa, M., Goldberger, A.L., Peng, C.K.: Multiscale entropy analysis of biological signals. Phys. Rev. E (2005)
10. Wang, Z., Bovik, A.C., Sheikh, H.R., et al.: Image quality assessment: from error visibility to structural similarity. IEEE Trans. Image Process. (2004)
11. Efe, E., Özsen, S.: A new approach for automatic sleep staging: siamese neural networks. Traitement du Signal (2021)
12. Kemp, B., Zwinderman, A.H., Tuk, B., et al.: Analysis of a sleep-dependent neuronal feedback loop: the slow-wave microcontinuity of the EEG. IEEE Trans. Biomed. Eng. (2000)
13. Mourtazaev, M., Kemp, B., Zwinderman, A., et al.: Age and gender affect different characteristics of slow waves in the sleep EEG. Sleep (1995)
14. Chazal, P., Mazzotti, D.R., Cistulli, P.A.: Automated sleep staging algorithms: have we reached the performance limit due to manual scoring? Sleep (2022)
15. Batista, G.E., Bazzan, A.L.C., Monard, M.C.: Balancing training data for automated annotation of keywords: a case study. WOB (2003)
16. Liu, C., Tan, B., Fu, M., et al.: Automatic sleep staging with a single-channel EEG based on ensemble empirical mode decomposition. Phys. A: Stat. Mech. Its Appl. (2021)
17. Xiao, W., Linghu, R., Li, H., et al.: Automatic sleep staging based on single-channel EEG signal using null space pursuit decomposition algorithm. Axioms (2022)

Biomedical Photonics

Rapid Virus Detection Using Recombinase Polymerase Amplification Assisted by Computational Amplicon-Complex Spectrum

F. Yang1, Y. Su2, F. G. Li1, T. Q. Zhou2, X. S. Wang3, H. Li2, S. L. Zhang1, and R. X. Fu2(B)

1 School of Mechatronical Engineering, Beijing Institute of Technology, Beijing, China
2 School of Medical Technology, Beijing Institute of Technology, Beijing, China
3 Beijing Glopro Optoelectronic Technology Co, Ltd, Beijing, China
[email protected]

Abstract. The detection of respiratory viruses is crucial in the setting of the SARS-CoV-2 outbreak. Recombinase polymerase amplification (RPA) speeds up nucleic acid detection compared with the conventional polymerase chain reaction and requires no professional operation. However, integration, cost, and convenience issues with RPA amplicon detection limit its wide application. In this study, we used the computational absorption spectrum of the polar GelRed dye to probe the amplicon. After binding to DNA, GelRed molecules become polar electric dipoles owing to their asymmetric structure. Following centrifugal vibration, electrostatic interaction causes the dipoles to precipitate. Once the precipitate is removed, the absorbance spectrum of the supernatant changes. The content of amplified product can be determined from the GelRed remaining in the supernatant. Based on this principle, we confirmed the viability of the proposed method and optimized the GelRed dye concentration. Finally, we tested the effectiveness of this technique using a synthetic Influenza A template and primer. The lowest detectable template concentration was 10⁰ copies/µL, and the GelRed consumption was linearly correlated with the logarithm of the template concentration. This technique offers a workable, practical, and affordable detection strategy for quick pathogen identification.

Keywords: Recombinase polymerase amplification · Virus detection · Computational spectrum · Nucleic acid detection

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024 G. Wang et al. (Eds.): APCMBE 2023, IFMBE Proceedings 104, pp. 335–342, 2024. https://doi.org/10.1007/978-3-031-51485-2_36

1 Introduction Nucleic acid detection technology is widely used in frontier research in the life sciences and in clinical diagnosis [1]. Taking the novel coronavirus disease 2019 (COVID-19) as an example, nucleic acid testing can accurately detect the virus and is one of the key technologies for accurate medical diagnosis and disease prevention. It is also directly related to the prevention and treatment of major diseases and to national health, and is therefore significant for the stable development of society. Nucleic acid detection seeks more economical, convenient, and rapid technologies that also offer high sensitivity and specificity [2]. In short, innovative methods that enhance the key performance of nucleic acid detection technology and its applications are essential for protecting public health and promoting high-tech precision medicine. Currently, Polymerase Chain Reaction (PCR) is the most widely used nucleic acid amplification technology and the gold standard in clinical medicine and molecular biology. However, PCR-based nucleic acid detection relies on professional, expensive instruments and requires precise temperature control, which increases the cost of detection [3]. In addition, the process is time-consuming and needs specialized personnel to operate the instrument, resulting in poor efficiency. These shortcomings of PCR have promoted the development of various isothermal amplification technologies, such as Recombinase Polymerase Amplification (RPA), Loop-Mediated Isothermal Amplification, Nucleic Acid Sequence-Based Amplification, Strand Displacement Amplification, Rolling Circle Amplification, and Helicase-Dependent Amplification [4]. To some extent, these techniques have solved the problems of traditional PCR, improving detection efficiency and reducing both the costs and the need for specialized operators. Among the various isothermal amplification techniques, RPA is considered a next-generation nucleic acid amplification method that can replace PCR owing to its rapidity, high sensitivity, mild isothermal reaction conditions, and great stability. RPA significantly reduces the time of the whole testing process by simplifying the amplification steps. However, how to detect the amplicons quickly and efficiently is the biggest problem facing current RPA technology.
Gel electrophoresis still requires several complex instruments and sophisticated operations, which are non-automated, time-consuming, and prone to contamination [5]. Lateral flow assays carry a risk of cross-contamination, and it is difficult to completely avoid interfering products such as primer dimers [6]. Real-time fluorescence detection uses an additional probe, introduced in a way similar to TaqMan probes in PCR. However, the synthesis cost of such probes is much higher than that of classic primers, and the specialized fluorescence detector further increases the cost of detection [7]. In conclusion, an efficient and low-cost RPA amplicon detection method is important for promoting RPA technology. In this work, we used the computational absorption spectrum of the polar GelRed dye to detect RPA amplicons. Owing to their asymmetry, GelRed molecules are converted into polar electric dipoles after binding to DNA [8]. After centrifugal vibration, the dipoles precipitate because of electrostatic interaction. After the precipitate is removed, the absorption spectrum of the supernatant changes. The content of amplified product can be determined from the residual GelRed concentration in the supernatant. Using this principle, we verified the feasibility of the proposed method through experiments and optimized the concentration of GelRed dye. Finally, we used a synthesized Influenza A template and primer to verify the performance of this method. The minimum detectable template concentration reached 10⁰ copies/µL, and the GelRed consumption showed a strong linear relationship with the logarithm of the template concentration. This method provides a feasible, convenient and cheap scheme for rapid pathogen detection at constant temperature.


2 Method The schematic diagram of the nucleic acid detection instrument is shown in Fig. 1. Firstly, we measured the absorption spectra of different concentrations of GelRed (1000×, 2000×, 4000×, 6000×, 8000×, and 10000×). At a specific wavelength, the absorption intensity of 1× GelRed was calculated by dividing the spectral intensity of pure water minus the spectral intensity of c× GelRed by the corresponding concentration c. The absorption spectrum of 1× GelRed was obtained after averaging the intensities obtained from these six groups.

Fig. 1. The schematic diagram of the nucleic acid detection instrument.

Once the spectrum of 1× GelRed is determined, the concentration of an arbitrary c× GelRed solution can be calculated. Thus, the concentration of the remaining GelRed can be calculated from the spectrum of the supernatant measured after mixing. At each wavelength, the difference between the spectral intensity of pure water and that of the supernatant was divided by the intensity of 1× GelRed to give a concentration estimate, and the estimates were averaged to obtain the calculated concentration of remaining GelRed. Finally, the percentage of GelRed consumed can be determined. The concentration of GelRed remaining in the supernatant was subtracted from the initial GelRed concentration in the solution to obtain the concentration of GelRed bound to dsDNA in the sediment. Dividing this figure by the initial GelRed concentration gives the percentage of GelRed consumed.
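The calculation described in this section can be sketched in Python as follows. The function names and all numerical values are hypothetical; spectra are represented as plain lists of intensities sampled at the same wavelengths, and the sign convention (pure water minus sample) follows the definition of the 1× spectrum given above.

```python
def unit_spectrum(water, spectra_by_conc):
    """Average (water - sample) / c over all calibration concentrations."""
    n = len(water)
    unit = [0.0] * n
    for conc, spectrum in spectra_by_conc.items():
        for k in range(n):
            unit[k] += (water[k] - spectrum[k]) / conc
    return [u / len(spectra_by_conc) for u in unit]

def remaining_concentration(water, supernatant, unit):
    """Per-wavelength concentration estimate, averaged over wavelengths."""
    estimates = [(w - s) / u for w, s, u in zip(water, supernatant, unit)]
    return sum(estimates) / len(estimates)

def percent_consumed(initial, remaining):
    """Share of the initial GelRed that ended up in the sediment."""
    return 100.0 * (initial - remaining) / initial

# Hypothetical three-wavelength example with an ideal linear response
water = [100.0, 100.0, 100.0]
true_unit = [0.002, 0.004, 0.001]
calibration = {
    c: [w - c * u for w, u in zip(water, true_unit)]
    for c in (1000, 2000, 4000)
}
unit = unit_spectrum(water, calibration)
supernatant = [w - 3000 * u for w, u in zip(water, true_unit)]
remaining = remaining_concentration(water, supernatant, unit)
print(round(remaining, 3), round(percent_consumed(4000, remaining), 3))
```

Because each estimate divides by the 1× intensity, wavelengths where that intensity is close to zero would dominate the noise, so restricting the calculation to a band with clear absorption is advisable.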

3 Results 3.1 Absorption Spectra Calculation of 1× GelRed Solution In order to obtain the spectrum of 1× GelRed, we measured the absorption spectra of different concentrations of GelRed (1000×, 2000×, 4000×, 6000×, 8000×, and 10000×). As shown in Fig. 2a, the 0× curve represents the spectral intensity of pure water, and the noise curve represents the instrument background noise with the light source and ambient light turned off. Figure 2b shows the calculated absorption spectrum of 1× GelRed after averaging the intensities obtained from the six groups of different concentrations, with 450 to 600 nm selected as the effective wavelength range for calculation. The absorption is pronounced in the range of 500–550 nm and close to 0 near 600 nm, which is attributed to the fluorescence emitted by excited GelRed, making the apparent absorption approximately 0 or even negative.

Fig. 2. a Absorption spectra of different concentrations of GelRed. b Calculated absorption spectrum of 1× GelRed.

3.2 Determination of Reasonable GelRed Concentration To improve the accuracy and sensitivity of the detection, it is important to select a reasonable GelRed concentration. Therefore, different concentrations of dsDNA were mixed with different concentrations of GelRed. Figure 3a–f shows the absorption spectral intensities of the supernatants after centrifugation. The dsDNA concentration ranged from 0 to 100 µg/µL in steps of 20 µg/µL. For a given GelRed concentration, the curves shifted upward as the dsDNA concentration increased. Once the spectrum of 1× GelRed is determined, the percentage of remaining GelRed can be calculated from the spectrum of the supernatant measured after mixing and removing the sediment (see the Method section for details). Figure 3g shows the relationship between the percentage of consumed GelRed and the concentration of dsDNA template. It can be concluded that the most reasonable GelRed concentration is 4000× when the concentration of dsDNA is less than 100 µg/µL. This concentration allows GelRed to fully form complexes with dsDNA molecules and produces a clearer concentration gradient, while saving experimental materials and avoiding waste.

3.3 Detection of Influenza A Virus To test the quantitative performance, Influenza A (IA) virus was detected at template concentrations of 10⁰, 10¹, 10², 10³, and 10⁴ copies/µL. Solutions without primer, without template, and with the Influenza A primer and the N gene of SARS-CoV-2 as template were used as the control groups [9]. The microscopic images after mixing with GelRed are shown in Fig. 4. For the negative control in Fig. 4a–c, there are no obvious sediments. For the positive control in Fig. 4d–h, it can be seen that


Fig. 3. Absorption spectra of supernatants after mixing different concentrations of dsDNA with a 1000× b 2000× c 4000× d 6000× e 8000× and f 10000× GelRed. g The proportion of GelRed consumed after mixing different concentrations of dsDNA. All of these experiments were repeated ten times (n = 10).


the area of precipitate gradually increased with increasing dsDNA template concentration. Moreover, the red color of the solution gradually became lighter, which supports the feasibility of the detection principle.

Fig. 4. Microscopic images of the dsDNA-GelRed complexes. GelRed mixed with solutions a without primer, b without template, c with mismatch primer, and with d 10⁰, e 10¹, f 10², g 10³, and h 10⁴ copies/µL of dsDNA.

The mixed solutions were centrifuged, and the absorption spectra of their supernatants were measured. As shown in Fig. 5a, compared with the blank group in Fig. 3c (dsDNA concentration of 0 µg/µL), the curves of the control groups (without primer, without template, with mismatch primer) did not change significantly. For the positive control, the curve gradually shifted upward as the dsDNA concentration increased, the same trend as in Fig. 3. As Fig. 5b shows, the calculated percentages of GelRed depletion were 12.17, 19.55, 24.19, 37.60, and 47.16%. For the negative control, the proportions consumed were 2.15, 2.37, and 2.31%. Taking the logarithm of the template concentration as x and the percentage of GelRed consumption as y, linear fitting gives y = 8.803x + 10.526 (R² = 0.974). Thus, quantifying the dsDNA template by measuring the consumption of GelRed in the solution is feasible.
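The reported fit can be checked with an ordinary least-squares calculation on the five depletion percentages quoted above. The sketch below uses only the Python standard library; the function names are hypothetical, and the inverse calibration helper at the end is an illustrative addition that is not described in the paper.

```python
def linear_fit(x, y):
    """Ordinary least squares for y = slope * x + intercept, with R^2."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    syy = sum((b - mean_y) ** 2 for b in y)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared

def estimate_copies(consumed_percent, slope, intercept):
    """Hypothetical inverse calibration: percent consumed -> copies/µL."""
    return 10 ** ((consumed_percent - intercept) / slope)

log10_copies = [0, 1, 2, 3, 4]                    # 10^0 ... 10^4 copies/µL
consumed = [12.17, 19.55, 24.19, 37.60, 47.16]    # % GelRed consumed

slope, intercept, r_squared = linear_fit(log10_copies, consumed)
print(round(slope, 3), round(intercept, 3), round(r_squared, 3))
```

The values obtained this way agree with the reported slope and R², and with the reported intercept to within rounding in the second decimal place.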

Fig. 5. a Endpoint spectra corresponding to different concentrations of IA primers after 20 min. b The proportion of GelRed consumed after mixing different concentrations of IA primers.


4 Discussion Determining a reasonable GelRed concentration is the most significant part of this study. In Fig. 3g, when the GelRed concentration was 1000×, the remaining GelRed gradually decreased as dsDNA increased and became insufficient to form the dsDNA-GelRed complex. Therefore, the percentage of consumed GelRed tended to level off. In contrast, when the concentration of GelRed was high enough (4000×, 6000×, 8000×, and 10000×), the proportion of consumed GelRed was approximately linearly related to the concentration of dsDNA. Thus, 4000× was chosen as the most reasonable GelRed concentration.

5 Conclusions In the context of the SARS-CoV-2 outbreak, the identification of respiratory viruses is essential. Compared with the traditional PCR procedure, RPA technology accelerates nucleic acid detection and does not require professional operation. However, amplicon detection with RPA technology faces challenges in integration, cost, and convenience. This work is a follow-up to our previous studies on computational spectral biosensors [10–13]. In this investigation, the RPA amplicon was detected using the computational absorption spectrum of the polar GelRed dye. Owing to their asymmetry, GelRed molecules transform into polar electric dipoles after binding to DNA. After centrifugal vibration, electrostatic interaction causes the dipoles to precipitate. When the precipitate is removed, the absorbance spectrum of the supernatant changes. The content of the amplified product can be determined from the amount of GelRed remaining in the supernatant. We conducted experiments to verify the applicability of the proposed method and to optimize the GelRed dye concentration. Finally, we used a synthetic Influenza A template and primer to evaluate the efficacy of this technique. The lowest detectable template concentration is 10⁰ copies/µL, and the GelRed consumption exhibits a strong linear correlation with the logarithm of the template concentration. This method provides an effective, practical, and reasonably priced solution for rapid pathogen identification.

Acknowledgment. This work was supported by National Natural Science Foundation of China (62105177, 62103050, 21904008). Many thanks to Prof. Guoliang Huang’s lab from Tsinghua University for the kind help in optical setup development.

References
1. Broughton, J.P., Deng, X., Yu, G., et al.: CRISPR–Cas12-based detection of SARS-CoV-2. Nat. Biotechnol. 38(7), 870–874 (2020)
2. Huang, G., Huang, Q., Ma, L., et al.: fM to aM nucleic acid amplification for molecular diagnostics in a non-stick-coated metal microfluidic bioreactor. Sci. Rep. 4(1), 1–9 (2014)
3. Tsang, N.Y., So, H.C., Ng, K.Y., et al.: Diagnostic performance of different sampling approaches for SARS-CoV-2 RT-PCR testing: a systematic review and meta-analysis. Lancet Infect. Dis. 21(9), 1233–1245 (2021)
4. Zhao, Y., Chen, F., Li, Q., et al.: Isothermal amplification of nucleic acids. Chem. Rev. 115(22), 12491–12545 (2015)
5. Lobato, I.M., O’Sullivan, C.K.: Recombinase polymerase amplification: basics, applications and recent advances. Trac. Trend. Anal. Chem. 98, 19–35 (2018)
6. Liu, D., Shen, H., Zhang, Y., et al.: A microfluidic-integrated lateral flow recombinase polymerase amplification (MI-IF-RPA) assay for rapid COVID-19 detection. Lab Chip 21(10), 2019–2026 (2021)
7. Euler, M., Wang, Y., Heidenreich, D., et al.: Development of a panel of recombinase polymerase amplification assays for detection of biothreat agents. J. Clin. Microbiol. 51(4), 1110–1117 (2013)
8. Fu, R., Du, W., Jin, X., et al.: Microfluidic biosensor for rapid nucleic acid quantitation based on hyperspectral interferometric amplicon-complex analysis. ACS Sensors 6(11), 4057–4066 (2021)
9. Hu, T., Ke, X., Li, W., et al.: CRISPR/Cas12a-enabled multiplex biosensing strategy via an affordable and visual nylon membrane readout. Adv. Sci. 10(2), 2204689 (2022)
10. Fu, R., Li, Q., Wang, R., et al.: An interferometric imaging biosensor using weighted spectrum analysis to confirm DNA monolayer films with attogram sensitivity. Talanta 181, 224–231 (2018)
11. Li, Q., Fu, R., Zhang, J., et al.: Label-free method using a weighted-phase algorithm to quantitate nanoscale interactions between molecules on DNA microarrays. Anal. Chem. 89(6), 3501–3507 (2017)
12. Fu, R., Su, Y., Wang, R., et al.: Label-free tomography of living cellular nanoarchitecture using hyperspectral self-interference microscopy. Biomed. Opt. Express 10(6), 2757–2767 (2019)
13. Fu, R., Su, Y., Wang, R., et al.: Single cell capture, isolation, and long-term in-situ imaging using quantitative self-interference spectroscopy. Cytom Part A 99(6), 601–609 (2021)

Neuromodulation with Submillimeter Spatial Precision by Optoacoustic Fiber Emitter

Ninghui Shao and Hyeon Jeong Lee(B)

College of Biomedical Engineering and Instrument Science, Key Laboratory for Biomedical Engineering of Ministry of Education, MOE Frontier Science Center for Brain Science and Brain-Machine Integration, Zhejiang University, Hangzhou, China
{22115058,hjlee}@zju.edu.cn

Abstract. New techniques to non-invasively modulate neural activities with high precision are constantly being sought in neuroscience. Ultrasound is a promising neuromodulation modality due to its non-invasive potential. However, conventional piezo-based ultrasound neuromodulation using a transducer usually has a spatial resolution of more than a few millimeters, limiting research on specific brain areas or neural circuits. Here, a spatially confined optoacoustic emitter is developed for high-precision neuromodulation, allowing neurons to be activated at submillimeter spatial precision. By designing a candle soot-PDMS layered optoacoustic fiber tip (CPOF) with a diameter of 350 µm, we demonstrate the generation of local ultrasound waves with a peak frequency of 3.5 MHz. The ultrasound generated by CPOF can directly activate cortical neurons in culture within a radius of 450 µm around the fiber tip, generating intracellular calcium transients. The CPOF promises to be a novel technique for high-precision neural stimulation and for investigating the mechanism of neurostimulation.

Keywords: Optoacoustic neuromodulation · High spatial resolution · Calcium imaging · Cultured cortical neurons · Candle soot

1 Introduction Neuromodulation has wide applications in medical treatment and in the investigation of the functional activity of neurons [1]. To date, there are several well-developed neuromodulation modalities: direct electrical stimulation, magnetic stimulation, optogenetic stimulation, thermal stimulation, acoustic/mechanical stimulation, and chemical stimulation. Among these, ultrasound neuromodulation is an emerging method with potential for non-invasive, deep brain stimulation without the need for genetic manipulation, which is challenging to apply in humans and primates. Effective ultrasound neuromodulation has been demonstrated widely both in vivo and in vitro. Using conventional piezo-based ultrasound transducers, which usually have a spatial resolution of more than a few millimeters, previous studies have shown that mechanosensitive ion channels, such as TRPP2 [2], TRPC1, Piezo1, and K2P [3], are involved in the mechanism. In addition, it was reported that ultrasound could activate astrocytes [4]. Effective parameters for in vivo ultrasound neuromodulation were also investigated [5]. Still, the mechanism remains unclear, which makes high-precision neuromodulation challenging.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024 G. Wang et al. (Eds.): APCMBE 2023, IFMBE Proceedings 104, pp. 343–349, 2024. https://doi.org/10.1007/978-3-031-51485-2_37


As an alternative way to generate ultrasound, the optoacoustic process is induced when certain materials absorb a short laser pulse and convert it into heat, leading to thermal expansion and compression that generate an ultrasound pulse [6]. Recently, optoacoustic ultrasound has been used for neurostimulation [7–14]. With high-precision control of the laser source, the spatial resolution of the generated ultrasound was further improved. Furthermore, more efficient photoacoustic conversion materials have allowed single-cell and sub-cellular level neuromodulation [12], breaking the spatial resolution limit of ultrasound-based neuromodulation. Here, we report the development of an optoacoustic fiber using a composite layer containing candle soot and PDMS and demonstrate direct neuromodulation within a submillimeter radius. Candle soot, with its high light absorption coefficient and low interfacial thermal resistance, can efficiently convert laser energy to heat [15]. PDMS was selected for its excellent thermal expansion property [16]. We demonstrated that the candle soot-PDMS layered optoacoustic fiber (CPOF) is a highly efficient optoacoustic converter that generates local ultrasound with a peak frequency of 3.5 MHz. Notably, the generated ultrasound successfully stimulated cortical neurons within 450 µm of the CPOF tip.

2 Methods

2.1 CPOF Fabrication

CPOF fabrication consists of CS-PDMS mixture preparation and fiber end coating (Fig. 1). To prepare PDMS, the silicone elastomer base and curing agent (Sylgard 184, Dow Corning Corporation) were dispensed in a container and mixed in a ratio of 10:1 by weight. Candle soot nanoparticles (CSNPs) were produced by burning a paraffin wax candle [17]. A glass slide was placed within the flame and moved to allow uniform deposition of CSNPs on the slide. CSNPs were then collected in the container holding the PDMS by gently scraping the candle soot off the slide. This process was repeated until enough CSNPs had been collected. The mixture was stirred to generate an even CS-PDMS mixture with a final PDMS-to-CSNP weight ratio of 10:1. To coat the fiber tip, the end of a multimode bare fiber with a 200 µm core diameter was polished. Then, with the help of a micromanipulator, the fiber end was dipped about 100 µm into the CS-PDMS mixture under observation by a stereoscope. Lastly, the fiber coated with the CS-PDMS mixture was pulled out quickly and cured vertically at 100 °C overnight.

2.2 Optoacoustic Signal Measurement

A passively Q-switched diode-pumped solid-state laser (1030 nm, 5 ns, 100 µJ, RPMC) was used as the laser source. The laser was connected to fiber jumpers and fiber optic attenuators to control the output power. The CPOF was connected to the fiber from the laser with a multimode fiber connector and a bare fiber terminator. An immersion transducer was used to measure the CPOF photoacoustic signal. The transducer and the fiber tip, controlled by a micromanipulator, were placed in a water tank. The signal measured from the transducer was first amplified by a pre-amplifier

Neuromodulation with Submillimeter Spatial Precision


Fig. 1. Fabrication of candle soot-PDMS layered optoacoustic fiber tip (CPOF). a Steps involved in the fabrication of CPOF. b A CPOF tip under a stereoscope.

and then transferred to a digital oscilloscope. The angular intensity distribution of the photoacoustic signal was measured by changing the angle between the transducer and the fiber.

2.3 Cortical Neuron Culture

Cortices were dissected from embryonic day 16 Sprague-Dawley rats and then digested with 0.05% Trypsin-EDTA at 37 °C. After 10 min, the digested cells were incubated in a dish for 30 min, and the neurons in the supernatant were plated on poly-D-lysine-coated well plates. Primary cortical neurons were first plated in DMEM supplemented with 10% FBS, 5% HS, and 2 mM glutamine. Then, 24 h later, the medium was replaced by Neurobasal supplemented with 2% B27, 1% N2, and 2 mM glutamine. After that, half of the medium was replaced every 3 to 4 days. Cells cultured in vitro for 10–13 days were used for the stimulation experiment.

2.4 Calcium Imaging and Neural Stimulation

To measure the calcium response upon neural stimulation by CPOF, we built the stimulation/imaging setup around a fluorescence microscope (Fig. 2). Calcium imaging was performed on a Leica DMi8 wide-field inverted fluorescence microscope with a 10× air objective. The excitation light was filtered from a mercury lamp by a Leica P FITC Ex 480/40 filter (11525307). Fluorescence images were recorded by a CMOS camera at an imaging speed of 10 frames per second. To probe calcium transients, the calcium indicator Oregon Green™ 488 BAPTA-1, AM (OGD-1) (Invitrogen) was used. OGD-1 was dissolved in 20% Pluronic F-127 in dimethyl sulfoxide (DMSO) as a stock solution. To load OGD-1, the medium of cultured cortical neurons was replaced by Neurobasal medium containing 2 µM OGD-1; 30 min later, cells were washed and incubated with Neurobasal medium for another 30 min. For stimulation, 1030 nm, 5 ns, 100 µJ laser pulses were delivered through 0.8 mm and 0.2 mm gap fiber optic attenuators to the CPOF. The frequency of laser pulses


was controlled at 3.6 kHz by a function generator. Each stimulation delivered 720 laser pulses to the CPOF, corresponding to a stimulation duration of 200 ms. The time interval between stimulations was set to 2 s, and three successive stimulations were applied.
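The stimulation timing above follows from simple arithmetic: 720 pulses at 3.6 kHz last 720 / 3600 s = 200 ms. The sketch below is illustrative only and is not the authors' control code. It reproduces that calculation and lists burst onsets under the assumption that the 2 s interval is measured from onset to onset, which the text does not specify. The function names are hypothetical.

```python
# Stimulation parameters taken from the text.
PULSE_RATE_HZ = 3600      # laser repetition rate set by the function generator
PULSES_PER_BURST = 720    # laser pulses delivered per stimulation
N_BURSTS = 3              # successive stimulations
INTERVAL_S = 2.0          # ASSUMPTION: interval measured onset to onset
FRAME_RATE_HZ = 10        # camera imaging speed reported in the Methods


def burst_duration_s(n_pulses, rate_hz):
    """Duration of one burst of n_pulses delivered at rate_hz."""
    return n_pulses / rate_hz


def burst_onsets_s(n_bursts, interval_s):
    """Onset time of each burst, with the first burst starting at t = 0."""
    return [i * interval_s for i in range(n_bursts)]


duration = burst_duration_s(PULSES_PER_BURST, PULSE_RATE_HZ)
onsets = burst_onsets_s(N_BURSTS, INTERVAL_S)
frames_per_burst = duration * FRAME_RATE_HZ

print(f"burst duration: {duration * 1e3:.0f} ms")
print(f"burst onsets (s): {onsets}")
print(f"camera frames spanned by one burst: {frames_per_burst:.0f}")
```

At the reported imaging speed of 10 frames per second, a 200 ms burst spans two camera frames.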

Fig. 2. Schematic diagram of photoacoustic stimulation experimental set-up. DM: dichroic mirror. Labeled components: 1030 nm laser, function generator, CPOF, neurons, objective, DM, filters, mercury lamphouse, CMOS camera.

3 Results

The fabricated CPOF tip has a diameter of about 350 µm (Fig. 1b). To characterize the acoustic generation of the CPOF, 1030 nm, 5 ns, 100 µJ laser pulses were delivered through 0.8 mm and 0.2 mm gap fiber optic attenuators. The photoacoustic signal was measured by the transducer. The waveform of the ultrasound pulse is shown in Fig. 3a. The frequency spectrum, obtained through the FFT, shows that the photoacoustic ultrasound ranges from 1 to 7 MHz and peaks at 3.5 MHz (Fig. 3b). The intensity distribution map shows that the generated ultrasound propagates spherically, with most of the energy concentrated in the forward direction (Fig. 3c). CPOF was applied to stimulate neurons monitored by wide-field fluorescence imaging (Fig. 2). We cultured cortical neurons and labeled them with the calcium indicator OGD-1 (Fig. 4a). Calcium transients were observed after optoacoustic stimulation. The fiber tip was placed at the corner of the field of view (FOV). The spatial distribution of the calcium response was depicted by mapping the change in fluorescence signal (ΔF) (Fig. 4b). Neurons with the most significant fluorescence intensity change were near the fiber, and neurons


Fig. 3. Characterization of the CPOF-generated photoacoustic ultrasound. a Representative photoacoustic ultrasound measured with a transducer (voltage in mV versus time in µs). b Frequency spectrum of the photoacoustic ultrasound in Fig. 3a (magnitude in a.u. versus frequency in MHz). c Intensity distribution of the generated photoacoustic ultrasound at different angles (polar plot, radial scale 0.0 to 1.0).

with a small fluorescence intensity change were far away from the fiber tip. CPOF effectively stimulated cortical neurons around the fiber tip at a distance of about 450 µm. Typical calcium response traces are shown in Fig. 4c. Figure 4d shows the relationship between distance and fluorescence intensity change. The fluorescence intensity was elevated immediately after the stimulation onset. Stable responses to each optoacoustic stimulation were observed, with a maximum ΔF/F of 6%. To check whether the calcium increase resulted from Na+ channel-dependent action potentials, we applied tetrodotoxin (TTX) at a final concentration of 3 µM for 10 min before the stimulation experiment. After the addition of TTX, no activation was observed under the same stimulation (Fig. 4e). To verify that this response was specific to excitable cells, we also applied the same stimulation to SV-HUC-1 cells labeled with OGD-1, and no evident calcium transient was observed.
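The frequency spectrum reported earlier in this section was obtained by Fourier analysis of the measured waveform. As a minimal sketch of that step, the code below evaluates a direct discrete Fourier transform sum (an FFT computes the same quantity faster) and picks the frequency with the largest magnitude. The waveform is a synthetic Gaussian-windowed 3.5 MHz burst standing in for an oscilloscope record; the sampling rate, pulse width, frequency grid, and function names are all assumptions made for illustration.

```python
import cmath
import math


def dft_magnitude(samples, sample_rate_hz, freqs_hz):
    """Return |X(f)| of the sampled signal at each requested frequency (direct DFT sum)."""
    magnitudes = []
    for f in freqs_hz:
        step = -2j * math.pi * f / sample_rate_hz
        total = sum(x * cmath.exp(step * n) for n, x in enumerate(samples))
        magnitudes.append(abs(total))
    return magnitudes


def peak_frequency_hz(samples, sample_rate_hz, freqs_hz):
    """Frequency from freqs_hz at which the spectrum magnitude is largest."""
    magnitudes = dft_magnitude(samples, sample_rate_hz, freqs_hz)
    best = max(range(len(freqs_hz)), key=lambda i: magnitudes[i])
    return freqs_hz[best]


def synthetic_pulse(n_samples, sample_rate_hz, center_hz, width_s):
    """Gaussian-windowed sine burst centered in the record (stand-in for measured data)."""
    t_mid = 0.5 * n_samples / sample_rate_hz
    pulse = []
    for n in range(n_samples):
        t = n / sample_rate_hz - t_mid
        envelope = math.exp(-((t / width_s) ** 2))
        pulse.append(envelope * math.sin(2 * math.pi * center_hz * t))
    return pulse


SAMPLE_RATE_HZ = 100e6                       # hypothetical oscilloscope sampling rate
samples = synthetic_pulse(400, SAMPLE_RATE_HZ, 3.5e6, 0.3e-6)   # 4 µs record
freqs = [k * 1e5 for k in range(1, 141)]     # 0.1 MHz to 14 MHz in 0.1 MHz steps

peak = peak_frequency_hz(samples, SAMPLE_RATE_HZ, freqs)
print(f"peak frequency: {peak / 1e6:.1f} MHz")
```

With real data, `samples` would be the recorded transducer voltage and `SAMPLE_RATE_HZ` the actual sampling rate of the oscilloscope.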

4 Discussion

In this study, we demonstrated a candle soot-PDMS layered optoacoustic fiber tip (CPOF) for neuromodulation with 450 µm spatial resolution. The main material, candle soot, is cost-effective and easy to make. Furthermore, its high light absorption coefficient and low interfacial thermal resistance make candle soot an efficient photoacoustic material. It has been reported that the thickness of the coating layer is related to the photoacoustic signal amplitude [14]. The deposition process for the photoacoustic material needs to be improved to obtain a controllable coating thickness.


Fig. 4. CPOF activates cultured cortical neurons. a Calcium imaging of cortical neurons labeled with OGD-1. Dashed line: location of the fiber tip. b Spatial distribution of calcium response to optoacoustic stimulation. c Calcium trace of neurons stimulated by CPOF. d Heatmap of calcium traces with different distances between neurons and CPOF. e Calcium response before and after TTX application with the same stimulation. Arrows: laser onset.
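Calcium traces such as those in Fig. 4 are usually expressed as a relative fluorescence change, (F - F0) / F0, commonly written ΔF/F, where F0 is a pre-stimulation baseline. The sketch below shows one common way to compute it, assuming the baseline is the mean of the frames recorded before laser onset. The trace values are invented for illustration and are not data from this study; the function name is hypothetical.

```python
def delta_f_over_f(trace, n_baseline):
    """Relative fluorescence change (F - F0) / F0 for every frame.

    F0 is the mean of the first n_baseline frames, assumed to precede stimulation.
    """
    if not 0 < n_baseline <= len(trace):
        raise ValueError("n_baseline must be between 1 and len(trace)")
    f0 = sum(trace[:n_baseline]) / n_baseline
    return [(f - f0) / f0 for f in trace]


FRAME_RATE_HZ = 10   # imaging speed reported in the Methods
N_BASELINE = 20      # ASSUMPTION: 2 s of pre-stimulation frames

# Invented single-neuron trace: flat baseline, then a transient that decays back.
trace = [100.0] * N_BASELINE + [103.0, 106.0, 105.0, 103.0, 101.5, 100.5] + [100.0] * 14

dff = delta_f_over_f(trace, N_BASELINE)
peak_value = max(dff)
peak_time_s = dff.index(peak_value) / FRAME_RATE_HZ

print(f"peak relative change: {peak_value * 100:.1f}% at t = {peak_time_s:.1f} s")
```

Applying the same function to each neuron and sorting the results by distance from the fiber tip would give a distance-ordered set of traces like the one summarized in panel d.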

In the TTX experiment, our result showing complete inhibition of calcium response agreed with the previous photoacoustic neuromodulation study using TFOE [12] but was different from the result using transducer-generated ultrasound, in which the calcium change was not completely inhibited [2]. One possible reason is the frequency spectrum difference between optoacoustic ultrasound and transducer ultrasound. The traditional transducer ultrasound usually has a single frequency, while the frequency band of photoacoustic ultrasound has a wide range. The mechanism of ultrasound neuromodulation remains unknown, and the potential differences in transducer- and photoacoustic-generated ultrasound need to be investigated. Ultrasound neuromodulation has been applied in various areas, including medical treatment, functional studies, connectivity, and neural regeneration. The high spatial resolution provided by photoacoustic neuromodulation is expected to offer new opportunities in these research fields.


5 Conclusions

In conclusion, CPOF generated local optoacoustic ultrasound and activated cortical neurons with submillimeter spatial resolution. The calcium imaging data showed that multiple optoacoustic ultrasound pulses could stably activate neurons. The stimulation is both Na+ channel-dependent and distance-dependent.

References

1. Luan, S., Williams, I., Nikolic, K., Constandinou, T.G.: Neuromodulation: present and emerging methods. Front. Neuroeng. 7 (2014). https://doi.org/10.3389/fneng.2014.00027
2. Yoo, S., Mittelstein, D.R., Hurt, R.C., et al.: Focused ultrasound excites cortical neurons via mechanosensitive calcium accumulation and ion channel amplification. Nat. Commun. 13, 493 (2022). https://doi.org/10.1038/s41467-022-28040-1
3. Kubanek, J., Shi, J., Marsh, J., et al.: Ultrasound modulates ion channel currents. Sci. Rep. 6, 24170 (2016). https://doi.org/10.1038/srep24170
4. Oh, S.-J., Lee, J.M., Kim, H.-B., et al.: Ultrasonic Neuromodulation via Astrocytic TRPA1. Curr. Biol. 29, 3386-3401.e8 (2019). https://doi.org/10.1016/j.cub.2019.08.021
5. King, R.L., Brown, J.R., Newsome, W.T., Pauly, K.B.: Effective Parameters for Ultrasound-Induced In Vivo Neurostimulation. Ultrasound Med. Biol. 39, 312–331 (2013). https://doi.org/10.1016/j.ultrasmedbio.2012.09.009
6. Yao, J., Wang, L.V.: Photoacoustic microscopy. Laser Photonics Rev. 7, 758–778 (2013). https://doi.org/10.1002/lpor.201200060
7. Jiang, Y., Lee, H.J., Lan, L., et al.: Optoacoustic brain stimulation at submillimeter spatial precision. Nat. Commun. 11, 881 (2020). https://doi.org/10.1038/s41467-020-14706-1
8. Li, Y., Jiang, Y., Lan, L., et al.: Optically-generated focused ultrasound for non-invasive brain stimulation with ultrahigh precision. Light Sci. Appl. 11, 321 (2022). https://doi.org/10.1038/s41377-022-01004-2
9. Zheng, N., Jiang, Y., Jiang, S., et al.: Multifunctional Fiber-based Optoacoustic Emitter for Non-genetic Bidirectional Neural Communication (2023). https://doi.org/10.48550/arXiv.2301.03659
10. Jiang, Y., Huang, Y., Luo, X., et al.: Neural stimulation in vitro and in vivo by photoacoustic nanotransducers. Matter 4, 654–674 (2021). https://doi.org/10.1016/j.matt.2020.11.019
11. Zheng, N., Fitzpatrick, V., Cheng, R., et al.: Photoacoustic carbon nanotubes embedded silk scaffolds for neural stimulation and regeneration. ACS Nano 16, 2292–2305 (2022). https://doi.org/10.1021/acsnano.1c08491
12. Non-genetic photoacoustic stimulation of single neurons by a tapered fiber optoacoustic emitter. Light: Science & Applications (2021). https://www.nature.com/articles/s41377-021-00580-z. Accessed 29 Jan 2023
13. Jiang, Y.: High Precision Optoacoustic Neural Modulation (2021)
14. Chen, G., Shi, L., Lan, L., et al.: High-precision neural stimulation by a highly efficient candle soot fiber optoacoustic emitter. Front. Neurosci. 16 (2022)
15. Chang, W.-Y., Zhang, X.A., Kim, J., et al.: Evaluation of photoacoustic transduction efficiency of candle soot nanocomposite transmitters. IEEE Trans. Nanotechnol. 17, 985–993 (2018). https://doi.org/10.1109/TNANO.2018.2845703
16. Wolf, M.P., Salieb-Beugelaar, G.B., Hunziker, P.: PDMS with designer functionalities: properties, modifications strategies, and applications. Prog. Polym. Sci. 83, 97–134 (2018). https://doi.org/10.1016/j.progpolymsci.2018.06.001
17. Faraz, M., Abbasi, M.A., Sang, P., et al.: Stretchable and robust candle-soot nanoparticle-polydimethylsiloxane composite films for laser-ultrasound transmitters. Micromachines 11 (2020). https://doi.org/10.3390/mi11070631

Medical Laboratory Engineering

A Novel Poly(3-hexylthiophene) Microelectrode for Ascorbic Acid Monitoring During Brain Cytotoxic Edema

Zexuan Meng1, Yuchan Zhang1, Lu Yang1, Shuang Zhao2,3, Qiang Zhou1,4, Jiajia Chen1, Jiuxi Sui1, Jian Wang1, Lizhong Guo1, Luyue Chang1, Guixue Wang2,3(B), and Guangchao Zang1,3,4(B)

1 Institute of Life Science, Laboratory of Tissue and Cell Biology Lab Teaching & Management Center, Chongqing Medical University, Chongqing 400016, People’s Republic of China
2 Key Laboratory for Biorheological Science and Technology of Ministry of Education State and Local Joint Engineering Laboratory for Vascular Implants, Bioengineering College of Chongqing University, Chongqing 400030, People’s Republic of China
3 Jinfeng Laboratory, Chongqing 401329, People’s Republic of China
4 Department of Pathophysiology, Chongqing Medical University, Chongqing 400016, People’s Republic of China

Abstract. Neuroelectrochemical sensing technology offers unique benefits for neuroscience research, being able to target measurements of marker changes during physiological or pathological processes. In this paper, we introduce poly(3-hexylthiophene) (P3HT) and nitrogen-doped multiwalled carbon nanotubes (N-MWCNTs) to construct a composite membrane-modified carbon fiber (CF) microelectrode (CFME/P3HT-N-MWCNTs) to detect ascorbic acid (AA). We apply CFME/P3HT-N-MWCNTs to monitor AA in intercellular fluid and thereby observe the cellular mechanical process of brain cytotoxic edema at the electrochemical level. Our study provides a new in vivo electrochemical sensing technology for constructing implantable microelectrodes and monitoring key biomarkers. With this research approach, we will explore the molecular basis of biomechanics-related diseases and discover some underlying mechanisms.

Keywords: Microelectrodes · Neuroscience · Brain Cytotoxic Edema · Ascorbic Acid · Poly 3-hexylthiophene

1 Introduction

Neuroelectrochemical sensing techniques offer direct and quantitative monitoring of the dynamic level of neurochemicals and can help confirm or identify prospective chemical parameters in the nervous system [1, 2]. Herein, we use CFME/P3HT-N-MWCNTs electrodes for electrochemical monitoring of ascorbic acid release during the mechanical process of brain edema. The electrodes are prepared using a one-pot method under ambient temperature and atmospheric pressure for in situ electrochemical monitoring of ascorbic acid (AA) in real biological systems. Then, CFME/P3HT-N-MWCNTs are used to monitor AA at the cellular level and in brain slices. The results indicate that these materials have excellent potential for in situ tissue monitoring. Finally, we use the produced sensors to examine how brain cytotoxic edema causes AA release. We monitor the AA concentration in intercellular fluid to determine the activation and inhibition of the N-methyl-D-aspartic acid receptor (NMDAR) and Cl− channels. We demonstrate that the process of cytotoxic edema brought on by glutamate-induced Na+ and Cl− influx controls AA release. Monitoring the dynamics of AA release in brain slices and neurons under physiological and pathological settings with CFME/P3HT-N-MWCNTs demonstrates their practical monitoring performance and their potential for mechanistic studies of biomechanics.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024 G. Wang et al. (Eds.): APCMBE 2023, IFMBE Proceedings 104, pp. 353–357, 2024. https://doi.org/10.1007/978-3-031-51485-2_38

2 Materials and Methods

2.1 Carbon Fiber Microelectrodes (CFMEs) Preparation

Briefly, a glass capillary was pulled using a micropipette puller to create two tapered glass-coated carbon fiber tip electrodes. The pulled capillary was used as the sheath of the CFMEs. A single carbon fiber on a clean glass plate was attached with silver conducting paste to a copper wire slightly longer than the glass capillary and kept at 80 °C for 1 h. Then, CFMEs were made by carefully inserting the carbon fiber-attached copper wire into the capillary, with the carbon fiber exposed at the fine open end of the capillary and the copper wire exposed at the other end. Next, the end of the glass capillary was immersed vertically in a pool of molten paraffin, taking care that the paraffin did not touch the carbon fiber at the tip, and then left in air at room temperature to solidify the paraffin.

2.2 Fabrication of CFME/P3HT-N-MWCNTs Sensor

Prior to modification of the electrode surface, a mixed solution of 3-hexylthiophene (3-HT) and N-MWCNTs was prepared. N-MWCNTs (2.8 wt%) and 3-HT (28 µM) were added to one volume of chloroform. Excess iron(III) chloride hexahydrate was added separately to two volumes of chloroform. The two solutions were then sonicated separately for 1 h each to facilitate dispersion and dissolution. After ultrasonication, the saturated ferric chloride solution was filtered and added to the N-MWCNTs and 3-HT mixed solution, which was agitated evenly. The CFME was immersed in the mixed solution of 3-HT, N-MWCNTs, and iron(III) chloride hexahydrate for 72 h. Finally, the CFME was washed with ultrapure water (18.2 MΩ) until the washing solution was colorless and then dried at room temperature.

2.3 Brain Slices Experiments

Before decapitation, adult Sprague-Dawley rats (male, 300–350 g) were anesthetized using isoflurane (4% for induction and 2% for maintenance) through a gas pump, and brains


were rapidly extracted (