Neural Information Processing: 30th International Conference, ICONIP 2023, Changsha, China, November 20–23, 2023, Proceedings, Part X (Communications in Computer and Information Science) 9819981409, 9789819981403

Table of contents:
Preface
Organization
Contents – Part X
Human Centred Computing
Paper Recommendation with Multi-view Knowledge-Aware Attentive Network
1 Introduction
2 Problem Formulation
3 The Proposed Framework
3.1 Knowledge Graph Construction
3.2 Multi-view Knowledge-Aware Attentive Layers
3.3 Information Aggregation and High-Order Propagation
3.4 Prediction and Optimization
4 Experiments
4.1 Experiment Settings
4.2 Experiment Results and Analysis
4.3 Comparison of Ablation Experiments
5 Conclusion and Future Work
References
Biological Tissue Sections Instance Segmentation Based on Active Learning
1 Introduction
2 Methodology
2.1 Loss Prediction Module
2.2 Uncertainty of Instance Segmentation Result
2.3 Active Learning Strategy for Instance Segmentation
3 Experimental Results
3.1 Dataset
3.2 Results
4 Conclusion
References
Research on Automatic Segmentation Algorithm of Brain Tumor Image Based on Multi-sequence Self-supervised Fusion in Complex Scenes
1 Introduction
2 Methods
2.1 Motivation for Model Design
2.2 Multi-sequence Self-supervised Fusion Segmentation Network
2.3 Backbone: Bi-ConvLSTM
2.4 Neck: Transformer+MLP
2.5 Head: Self-supervised Multi-sequence Integration
3 Experiment and Results
3.1 Dataset
3.2 Experimental Setup
3.3 Results
3.4 Ablation Experiment
4 Comparison with Existing Methods
4.1 Traditional Image Processing Methods
4.2 Methods Based on Deep Learning
4.3 Our Proposed Method
5 Conclusions
6 Outlook and Prospects
References
An Effective Morphological Analysis Framework of Intracranial Artery in 3D Digital Subtraction Angiography
1 Introduction
2 Related Work
3 Method
3.1 RGFNet for Segmentation of Intracranial Artery in 3D-DSA
3.2 Automatic Morphological Analysis Algorithm of Intracranial Artery
4 Experiments
4.1 Datasets
4.2 Implementation Details
4.3 Segmentation Results
4.4 Ablation Study of RGFNet
4.5 Morphological Analysis Results
5 Conclusion
References
Effective Domain Adaptation for Robust Dysarthric Speech Recognition
1 Introduction
2 Proposed Domain Adaptation Methods
2.1 Data Preprocessing
2.2 Domain-Adapted Transformer
3 Experiments
3.1 Data Description
3.2 Experimental Setup
3.3 Evaluation on Speaker-Dependent DSR
3.4 Evaluation on Speaker-Independent DSR
4 Conclusions
References
TCNet: Texture and Contour-Aware Model for Bone Marrow Smear Region of Interest Selection
1 Introduction
2 Related Work
2.1 Bone Marrow Smear Cell Morphology
2.2 Bone Marrow Smear ROI Selection
3 Method
3.1 Overall Model Structure
3.2 Texture Deep Supervision Module (TDSM)
3.3 Contour Deep Supervision Module (CDSM)
3.4 Loss Function
4 Experiment
4.1 Experimental Setup
4.2 Comparison Experiments
4.3 Ablation Experiments
5 Conclusion
References
Diff-Writer: A Diffusion Model-Based Stylized Online Handwritten Chinese Character Generator
1 Introduction
2 Related Work
2.1 Autoregressive Methods for Online Handwriting Generation
2.2 Online Chinese Character Generation
2.3 Non-autoregressive Diffusion Model
3 Methods
3.1 Denoising Diffusion Probabilistic Models
3.2 Data Representation
3.3 Diff-Writer Model
3.4 Training Process
3.5 Generation Process
4 Experiment and Results
4.1 Dataset
4.2 Implementation Details
4.3 Evaluations
5 Conclusion and Future Work
References
EEG Epileptic Seizure Classification Using Hybrid Time-Frequency Attention Deep Network
1 Introduction
2 The Construction of TFACBNet
2.1 Time-Frequency Attention Module
2.2 Hybrid Deep Network Architecture
3 Experiment and Results
3.1 Database
3.2 Evaluation Metric
3.3 Experimental Results
4 Conclusion
References
A Feature Pyramid Fusion Network Based on Dynamic Perception Transformer for Retinal OCT Biomarker Image Segmentation
1 Introduction
2 Proposed Method
2.1 Feature Pyramid Fusion Module
2.2 Dynamic Scale Transformer Module
2.3 Loss Function
3 Experiment
3.1 Dataset
3.2 Comparison Methods and Evaluation Metrics
3.3 Results
3.4 Ablation Experiment
4 Conclusion
References
LDW-RS Loss: Label Density-Weighted Loss with Ranking Similarity Regularization for Imbalanced Deep Fetal Brain Age Regression
1 Introduction
2 Method
2.1 Label Density-Weighted Loss
2.2 Ranking Similarity Regularization
3 Experiments and Results
3.1 Experimental Settings
3.2 Comparison with Other State-of-the-Art Methods
3.3 Correlation and Agreement Analysis
3.4 Ablation Study
4 Conclusion
References
Segment Anything Model for Semi-supervised Medical Image Segmentation via Selecting Reliable Pseudo-labels
1 Introduction
2 Method
2.1 The Selection of Reliable Pseudo Label (Assisted by SAM)
2.2 Re-training the Model with Reliable Pseudo-labels (Taking SDA-MT as an Example)
3 Experiment and Results
3.1 Dataset
3.2 Implementation Details and Evaluation Metrics
3.3 Results
4 Conclusion
References
Aided Diagnosis of Autism Spectrum Disorder Based on a Mixed Neural Network Model
1 Introduction
2 Preliminaries
2.1 Datasets and Preprocessing
2.2 The Calculation of Functional Connectivity Matrix
2.3 Global Features Selection
2.4 Functional Gradient
3 Methodology and Materials
3.1 The Architecture of the Proposed Model
3.2 The Comparison Models
3.3 Experimental Setting and Evaluation Metrics
4 Experiment Results and Discussion
4.1 Brain Structure Analysis
5 Conclusion
References
A Supervised Spatio-Temporal Contrastive Learning Framework with Optimal Skeleton Subgraph Topology for Human Action Recognition
1 Introduction
2 Related Work
2.1 Skeleton-Based Action Recognition
2.2 Contrastive Learning
3 Method
3.1 Supervised Spatio-Temporal Contrastive Learning Framework
3.2 Supervised Contrastive Loss
3.3 Optimal Skeleton Subgraph Topology Determination
4 Experiments
4.1 Datasets
4.2 Experimental Configuration
4.3 Ablation Study
4.4 Comparison with Other Methods
5 Conclusion
References
Multi-scale Feature Fusion Neural Network for Accurate Prediction of Drug-Target Interactions
1 Introduction
2 Materials and Methods
2.1 Drug Representation Learning Branch
2.2 Protein Representation Learning Branch
2.3 Feature Fusion and Prediction Module
3 Experiments and Results
3.1 Datasets
3.2 Implementation Details
3.3 DTI Prediction
3.4 Binding Affinity Prediction
4 Conclusion
References
GoatPose: A Lightweight and Efficient Network with Attention Mechanism
1 Introduction
2 Related Work
3 Approach
3.1 Parallel Multi-branch Architecture
3.2 Model Lightweighting
3.3 Adaptive Attention Mechanism
3.4 GoatPose
4 Experiments
4.1 Dataset
4.2 Setting
4.3 Results
4.4 Deployment
5 Conclusion
References
Sign Language Recognition for Low Resource Languages Using Few Shot Learning
1 Introduction
2 Related Work
3 SSL50 Dataset
4 Proposed ProtoSign Framework
4.1 Skeleton Locations Extraction
4.2 Sign Video Encoder
5 Experimental Study
5.1 Datasets
5.2 Implementation Details
5.3 Experimental Results
6 Conclusion
References
T Cell Receptor Protein Sequences and Sparse Coding: A Novel Approach to Cancer Classification
1 Introduction
2 Related Work
3 Proposed Approach
3.1 Incorporating Domain Knowledge
3.2 Cancer Types and Immunogenetics
3.3 Algorithmic Pseudocode and Flowchart
4 Experimental Setup
4.1 Dataset Statistics
4.2 Data Visualization
5 Results and Discussion
6 Conclusion
References
Weakly Supervised Temporal Action Localization Through Segment Contrastive Learning
1 Introduction
2 Related Work
3 Method
3.1 Feature Extraction and Embedding
3.2 Action Proposal
3.3 Inter- and Intra-segment Contrastive Learning
3.4 Training Objectives
3.5 Inference
4 Experiments
4.1 Experimental Setups
4.2 Comparison with the State-of-the-Arts
4.3 Ablation Study
4.4 Qualitative Results
5 Conclusion
References
S-CGRU: An Efficient Model for Pedestrian Trajectory Prediction
1 Introduction
2 Related Work
2.1 Complex-Valued Neural Network
2.2 Recurrent Neural Networks
2.3 Human-Human Interactions
3 Method
3.1 Problem Definition
3.2 Architecture Overview
3.3 Complex Gated Recurrent Unit
3.4 Exploring Pedestrians' Field of Vision in Different Environments
3.5 Loss Functions
4 Experiments
4.1 Dataset and Evaluation Metrics
4.2 Baselines
4.3 Evaluations on ADE/FDE Metrics
4.4 Model Parameter Amount and Inference Speed
4.5 The Range of Visual Focus of Pedestrians While Walking in Different Scenarios
5 Conclusions and Future Works
References
Prior-Enhanced Network for Image-Based PM2.5 Estimation from Imbalanced Data Distribution
1 Introduction
2 Related Work
3 Methodology
3.1 Estimation of DC-IS Maps
3.2 PE Network Architecture
3.3 Histogram Smoothing Algorithm
4 Experiments
4.1 Experiment Setup
4.2 Comparison with the SOTA Methods
4.3 Ablation Studies
4.4 Examples on Various Shooting Scenes
5 Conclusion
References
Dynamic Data Augmentation via Monte-Carlo Tree Search for Prostate MRI Segmentation
1 Introduction
2 Methodology
2.1 Search Space
2.2 Tree Construction
2.3 Tree Updating and Pruning
2.4 Tree Sampling
3 Implementation and Experiments
3.1 Datasets and Implementation Details
3.2 Experimental Results
4 Conclusion
References
Language Guided Graph Transformer for Skeleton Action Recognition
1 Introduction
2 Related Works
3 Method
3.1 Skeleton Data Preprocessing
3.2 GCN
3.3 Graph Transformer
3.4 Text Feature Module
4 Experiments
4.1 Dataset
4.2 Implementation Details
4.3 Ablation Study
4.4 Comparison with State-of-the-Art Methods
5 Conclusion
References
A Federated Multi-stage Light-Weight Vision Transformer for Respiratory Disease Detection
1 Introduction
2 Proposed Framework
2.1 Federated Learning
2.2 Light-Weight Vision Transformer
3 Experimental Dataset and Results
3.1 Classification Dataset
3.2 Results
3.3 Results on Centralized Data
3.4 Results on Federated Learning (IID and Non-IID Data Distribution)
3.5 Communication Time and Client Memory Requirement
3.6 Infection Localization and Error Analysis
4 Conclusion and Future Work
References
Curiosity Enhanced Bayesian Personalized Ranking for Recommender Systems
1 Introduction
2 Related Work
3 Notations and Problem Definition
4 Curiosity Enhanced Bayesian Personalized Ranking
4.1 Model Assumption
4.2 Modeling of Curiosity
4.3 Learning the CBPR
4.4 Item Recommendation
5 Experiments and Results Analysis
5.1 Experimental Settings
5.2 Recommendation Performance
5.3 Parameters Analysis
6 Conclusion
References
Modeling Online Adaptive Navigation in Virtual Environments Based on PID Control
1 Introduction
1.1 Cybersickness Evaluation
1.2 Adaptive Navigation
2 Adaptive Navigation Design
2.1 PID Controller
2.2 1D Convolutional Neural Network
2.3 Formulation of Adaptive Navigation
3 Data Collection and Parameters Computation
3.1 Data Acquisition
3.2 Model Architecture
3.3 Computing the Adaptive Coefficients
3.4 Results
4 Discussion
5 Conclusion
References
Lip Reading Using Temporal Adaptive Module
1 Introduction
2 The Proposed Work
2.1 Lip Reading Models Based on TAM-Net Models
2.2 The Overview of TAM-Net
2.3 Partially Dense TCN
3 Experiments
3.1 Datasets and Implementation Details
3.2 Evaluation of TAM
4 Conclusion
References
AudioFormer: Channel Audio Encoder Based on Multi-granularity Features
1 Introduction
2 Related Work
2.1 Feature Engineering
2.2 High-Order Feature Representation
3 Methodology
3.1 Problem Definition
3.2 Multi-granularity Features Extraction
3.3 Channel Audio Encoder Model Structure
3.4 Feature Encoder
4 Experiments and Dataset
4.1 Dataset Introduction
4.2 Evaluation Standard
4.3 Compared Baselines
4.4 Experimental Results
4.5 Real Scene Verification
5 Ablation Experiments
5.1 Channel Audio Encoder Ablation Experiment
5.2 Multi-granularity Features Ablation Experiment
6 Discussion
7 Conclusion
References
A Context Aware Lung Cancer Survival Prediction Network by Using Whole Slide Images
1 Introduction
2 Related Work
3 The Proposed CA-SurvNet
3.1 Pre-training the WSI_Encoder
3.2 The Network Architecture of CA-SurvNet
4 Experiment and Analysis
4.1 Dataset and Experiment Setup
4.2 Comparison with State-of-the-Arts
4.3 Ablation Study
5 Conclusion
References
A Novel Approach for Improved Pedestrian Walking Speed Prediction: Exploiting Proximity Correlation
1 Introduction
2 Related Work
2.1 Pedestrian Walking Speed Analysis and Forecast
2.2 Time Series Forecasting
3 Methodology
3.1 Problem Definition
3.2 Framework Overview
3.3 Time Series Forecast Module
3.4 Weighted Average Forecast Module
3.5 Fusion Prediction Module
4 Experiment
4.1 Dataset
4.2 Correlation Verification
4.3 Hyperparameter Settings
5 Results
5.1 Correlation Verification
5.2 Model Performance
6 Conclusion
References
MView-DTI: A Multi-view Feature Fusion-Based Approach for Drug-Target Protein Interaction Prediction
1 Introduction
2 Methodology
2.1 Drug Feature Extraction
2.2 Protein Feature Extraction
2.3 Mutual Feature Learning
3 Experiment
3.1 Dataset
3.2 Results
3.3 Robustness Experiment
3.4 Ablation Study
4 Conclusions
References
User Multi-preferences Fusion for Conversational Recommender Systems
1 Introduction
2 Related Work
2.1 Conversational Recommender Systems
2.2 Wide&Deep
3 Preliminary
4 Method
4.1 User's Own Preference Learner
4.2 Third-Party Information Learner
4.3 Attentive Wide&Deep Modeling
4.4 Optimization
5 Experiment
5.1 Experimental Setting
5.2 Evaluation on Recommendation Task
5.3 Evaluation on Conversation Task
5.4 Ablation Study
6 Conclusion and Future Work
References
Debiasing Medication Recommendation with Counterfactual Analysis
1 Introduction
2 Problem Formulation
3 The CAMeR
3.1 Historical Score Computation
3.2 Medication Generate Network
3.3 Counterfactual Debiasing Framework
3.4 Training Objective
4 Experiments
4.1 Dataset
4.2 Baselines and Metrics
4.3 Results
4.4 Ablation Study
4.5 Case Study
5 Related Work
6 Conclusion
References
Early Detection of Depression and Alcoholism Disorders by EEG Signal
1 Introduction
2 Materials and Method
2.1 Used Databases
2.2 Empirical Wavelet Transform
2.3 Feature Extraction
2.4 Feature Selection
2.5 Classifiers
3 Results and Discussion
4 Conclusion
References
Unleash the Capabilities of the Vision-Language Pre-training Model in Gaze Object Prediction
1 Introduction
2 Related Work
2.1 Gaze Following
2.2 Adapter for LLM
3 Method
3.1 EdgeCLIP
3.2 Gaze Prediction
3.3 Object Detection
3.4 Regulatory Loss
4 Experiments
4.1 Setups
4.2 Comparison with State-of-the-Arts
4.3 Ablation Studies
4.4 Qualitative Analysis
5 Conclusion
References
A Two-Stage Network for Segmentation of Vertebrae and Intervertebral Discs: Integration of Efficient Local-Global Fusion Using 3D Transformer and 2D CNN
1 Introduction
2 Methods
2.1 Overall Architecture Design
2.2 3D Coarse Segmentation Stage
2.3 2D Refinement Segmentation Stage
2.4 Graph Convolution Module
3 Experiments
3.1 Dataset
3.2 Implementation Details
3.3 Evaluation Metrics
3.4 Experiment Results
3.5 Ablation Study
3.6 Effect of the Two-Stage Framework
3.7 Effect of the GCM
4 Conclusion
References
Integrating Multi-view Feature Extraction and Fuzzy Rank-Based Ensemble for Accurate HIV-1 Protease Cleavage Site Prediction
1 Introduction
2 Proposed Work
2.1 Data Set
2.2 Feature Extraction
2.3 Fuzzy Rank-Based Ensemble
2.4 Steps of Proposed Technique
3 Experimental Results
4 Discussion
5 Conclusion
References
KSHFS: Research on Drug-Drug Interaction Prediction Based on Knowledge Subgraph and High-Order Feature-Aware Structure
1 Introduction
2 Method
2.1 Extracting Knowledge Graph Subgraph
2.2 Feature Information Perception Module
2.3 Binary DDI Prediction
2.4 Multi-relational DDIs Prediction
3 Experimental Setup and Result Analysis
3.1 Experimental Setup
3.2 Results Analysis
3.3 Ablation Study
3.4 Parameter Discussion
4 Conclusion
References
Self-supervised-Enhanced Dual Hierarchical Graph Convolution Network for Social Recommendation
1 Introduction
2 Related Work
3 Preliminaries
4 Methodology
4.1 Link Encoded-Weight Balanced Message Passing Paradigm
4.2 Dual Hierarchical Self-supervised Enhancement Based on Hypergraph
4.3 Model Training
5 Experiments
5.1 Experimental Settings
5.2 Performance Comparison
5.3 Ablation Study of SSHGCN Framework
6 Conclusion
References
Dynamical Graph Echo State Networks with Snapshot Merging for Spreading Process Classification
1 Introduction
2 Preliminaries
3 The Proposed Model
3.1 The Merged Snapshot Converter
3.2 The Multiple-Reservoir Encoder
3.3 The Linear Classifier
4 The Analysis of the Computational Complexity
5 Experiments
5.1 Descriptions of Datasets
5.2 Tested Models and Experimental Settings
5.3 Evaluation Metrics
5.4 Experimental Results
6 Discussion
References
Trajectory Prediction with Contrastive Pre-training and Social Rank Fine-Tuning
1 Introduction
2 Related Work
3 Our Approach
3.1 Contrastive History-Prediction Learning
3.2 Differentiable Social Interaction Ranking
3.3 Bi-Gaussian Regression Loss
3.4 Two-Stage Multi-task Training Objective
4 Evaluation Metrics
4.1 Standard ADE and FDE
4.2 Surrogate Social Distance Accuracy (SDA)
5 Experiments
5.1 Our Approaches and Baselines
5.2 ADE and FDE Results
5.3 SDA Results
5.4 Ablation Study of Model Adaptivity
5.5 Visualizations of Ranking
6 Conclusion
References
Three-Dimensional Medical Image Fusion with Deformable Cross-Attention
1 Introduction
2 Methods
2.1 Overview of Proposed Method
2.2 Deformable Cross Feature Blend (DCFB)
3 Experiments
3.1 Data Preparation and Evaluation Metrics
3.2 Implementation Details
3.3 Results and Analysis
4 Conclusion
References
Handling Class Imbalance in Forecasting Parkinson's Disease Wearing-off with Fitness Tracker Dataset
1 Introduction
2 Methodology
2.1 Data Collection
2.2 Data Processing
2.3 Class Imbalance Techniques, Experiments
3 Results
3.1 How Can Data Collection Be Improved for Nursing Care Staff to Increase the Number of Wearing-off Labels in the Dataset?
3.2 Which Class Imbalance Techniques Performed Well in Forecasting Wearing-Off in the Next Hour?
3.3 How Does Reweighting the Forecast Probabilities Due to Resampling Affect the Forecasting Models?
4 Discussions
5 Conclusion
References
Real-Time Instance Segmentation and Tip Detection for Neuroendoscopic Surgical Instruments
1 Introduction
2 Related Work
3 Approach
3.1 Framework
3.2 Data Augmentation
4 Results and Discussion
4.1 Datasets and Metrics
4.2 Results
5 Conclusion
References
Spatial Gene Expression Prediction Using Hierarchical Sparse Attention
1 Introduction
2 Related Work
3 Method
3.1 Preliminaries
3.2 Hierarchical Sparse Attention Network (HSATNet)
4 Experiments
4.1 Datasets
4.2 Experimental Set-Up
4.3 Experimental Results
4.4 Ablation Study
5 Conclusion
References
Author Index

Biao Luo · Long Cheng · Zheng-Guang Wu · Hongyi Li · Chaojie Li (Eds.)

Communications in Computer and Information Science

1964

Neural Information Processing
30th International Conference, ICONIP 2023
Changsha, China, November 20–23, 2023
Proceedings, Part X


Communications in Computer and Information Science

Editorial Board Members

Joaquim Filipe, Polytechnic Institute of Setúbal, Setúbal, Portugal
Ashish Ghosh, Indian Statistical Institute, Kolkata, India
Raquel Oliveira Prates, Federal University of Minas Gerais (UFMG), Belo Horizonte, Brazil
Lizhu Zhou, Tsinghua University, Beijing, China


Rationale
The CCIS series is devoted to the publication of proceedings of computer science conferences. Its aim is to efficiently disseminate original research results in informatics in printed and electronic form. While the focus is on publication of peer-reviewed full papers presenting mature work, inclusion of reviewed short papers reporting on work in progress is welcome, too. Besides globally relevant meetings with internationally representative program committees guaranteeing a strict peer-reviewing and paper selection process, conferences run by societies or of high regional or national relevance are also considered for publication.

Topics
The topical scope of CCIS spans the entire spectrum of informatics ranging from foundational topics in the theory of computing to information and communications science and technology and a broad variety of interdisciplinary application fields.

Information for Volume Editors and Authors
Publication in CCIS is free of charge. No royalties are paid, however, we offer registered conference participants temporary free access to the online version of the conference proceedings on SpringerLink (http://link.springer.com) by means of an http referrer from the conference website and/or a number of complimentary printed copies, as specified in the official acceptance email of the event. CCIS proceedings can be published in time for distribution at conferences or as postproceedings, and delivered in the form of printed books and/or electronically as USBs and/or e-content licenses for accessing proceedings at SpringerLink. Furthermore, CCIS proceedings are included in the CCIS electronic book series hosted in the SpringerLink digital library at http://link.springer.com/bookseries/7899. Conferences publishing in CCIS are allowed to use Online Conference Service (OCS) for managing the whole proceedings lifecycle (from submission and reviewing to preparing for publication) free of charge.

Publication process
The language of publication is exclusively English. Authors publishing in CCIS have to sign the Springer CCIS copyright transfer form, however, they are free to use their material published in CCIS for substantially changed, more elaborate subsequent publications elsewhere. For the preparation of the camera-ready papers/files, authors have to strictly adhere to the Springer CCIS Authors' Instructions and are strongly encouraged to use the CCIS LaTeX style files or templates.

Abstracting/Indexing
CCIS is abstracted/indexed in DBLP, Google Scholar, EI-Compendex, Mathematical Reviews, SCImago, Scopus. CCIS volumes are also submitted for the inclusion in ISI Proceedings.

How to start
To start the evaluation of your proposal for inclusion in the CCIS series, please send an e-mail to [email protected].

Biao Luo · Long Cheng · Zheng-Guang Wu · Hongyi Li · Chaojie Li Editors

Neural Information Processing
30th International Conference, ICONIP 2023
Changsha, China, November 20–23, 2023
Proceedings, Part X

Editors

Biao Luo, School of Automation, Central South University, Changsha, China

Long Cheng, Institute of Automation, Chinese Academy of Sciences, Beijing, China

Zheng-Guang Wu, Institute of Cyber-Systems and Control, Zhejiang University, Hangzhou, China

Hongyi Li, School of Automation, Guangdong University of Technology, Guangzhou, China

Chaojie Li, School of Electrical Engineering and Telecommunications, UNSW Sydney, Sydney, NSW, Australia

ISSN 1865-0929   ISSN 1865-0937 (electronic)
Communications in Computer and Information Science
ISBN 978-981-99-8140-3   ISBN 978-981-99-8141-0 (eBook)
https://doi.org/10.1007/978-981-99-8141-0

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd.
The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721, Singapore

Paper in this product is recyclable.

Preface

Welcome to the 30th International Conference on Neural Information Processing (ICONIP 2023) of the Asia-Pacific Neural Network Society (APNNS), held in Changsha, China, November 20–23, 2023.

The mission of the Asia-Pacific Neural Network Society is to promote active interactions among researchers, scientists, and industry professionals who are working in neural networks and related fields in the Asia-Pacific region. APNNS has Governing Board Members from 13 countries/regions – Australia, China, Hong Kong, India, Japan, Malaysia, New Zealand, Singapore, South Korea, Qatar, Taiwan, Thailand, and Turkey. The society's flagship annual conference is the International Conference of Neural Information Processing (ICONIP).

The ICONIP conference aims to provide a leading international forum for researchers, scientists, and industry professionals who are working in neuroscience, neural networks, deep learning, and related fields to share their new ideas, progress, and achievements. ICONIP 2023 received 1274 papers, of which 394 papers were accepted for publication in Communications in Computer and Information Science (CCIS), representing an acceptance rate of 30.93% and reflecting the increasingly high quality of research in neural networks and related areas. The conference focused on four main areas, i.e., "Theory and Algorithms", "Cognitive Neurosciences", "Human-Centered Computing", and "Applications". All the submissions were rigorously reviewed by the conference Program Committee (PC), comprising 258 PC members, and they ensured that every paper had at least two high-quality single-blind reviews. In fact, 5270 reviews were provided by 2145 reviewers. On average, each paper received 4.14 reviews.

We would like to take this opportunity to thank all the authors for submitting their papers to our conference, and our great appreciation goes to the Program Committee members and the reviewers who devoted their time and effort to our rigorous peer-review process; their insightful reviews and timely feedback ensured the high quality of the papers accepted for publication. We hope you enjoyed the research program at the conference.

October 2023

Biao Luo Long Cheng Zheng-Guang Wu Hongyi Li Chaojie Li

Organization

Honorary Chair

Weihua Gui, Central South University, China

Advisory Chairs

Jonathan Chan, King Mongkut's University of Technology Thonburi, Thailand
Zeng-Guang Hou, Chinese Academy of Sciences, China
Nikola Kasabov, Auckland University of Technology, New Zealand
Derong Liu, Southern University of Science and Technology, China
Seiichi Ozawa, Kobe University, Japan
Kevin Wong, Murdoch University, Australia

General Chairs

Tingwen Huang, Texas A&M University at Qatar, Qatar
Chunhua Yang, Central South University, China

Program Chairs

Biao Luo, Central South University, China
Long Cheng, Chinese Academy of Sciences, China
Zheng-Guang Wu, Zhejiang University, China
Hongyi Li, Guangdong University of Technology, China
Chaojie Li, University of New South Wales, Australia

Technical Chairs

Xing He, Southwest University, China
Keke Huang, Central South University, China
Huaqing Li, Southwest University, China
Qi Zhou, Guangdong University of Technology, China


Local Arrangement Chairs

Wenfeng Hu, Central South University, China
Bei Sun, Central South University, China

Finance Chairs

Fanbiao Li, Central South University, China
Hayaru Shouno, University of Electro-Communications, Japan
Xiaojun Zhou, Central South University, China

Special Session Chairs

Hongjing Liang, University of Electronic Science and Technology, China
Paul S. Pang, Federation University, Australia
Qiankun Song, Chongqing Jiaotong University, China
Lin Xiao, Hunan Normal University, China

Tutorial Chairs

Min Liu, Hunan University, China
M. Tanveer, Indian Institute of Technology Indore, India
Guanghui Wen, Southeast University, China

Publicity Chairs

Sabri Arik, Istanbul University-Cerrahpaşa, Turkey
Sung-Bae Cho, Yonsei University, South Korea
Maryam Doborjeh, Auckland University of Technology, New Zealand
El-Sayed M. El-Alfy, King Fahd University of Petroleum and Minerals, Saudi Arabia
Ashish Ghosh, Indian Statistical Institute, India
Chuandong Li, Southwest University, China
Weng Kin Lai, Tunku Abdul Rahman University of Management & Technology, Malaysia
Chu Kiong Loo, University of Malaya, Malaysia
Qinmin Yang, Zhejiang University, China
Zhigang Zeng, Huazhong University of Science and Technology, China


Publication Chairs

Zhiwen Chen, Central South University, China
Andrew Chi-Sing Leung, City University of Hong Kong, China
Xin Wang, Southwest University, China
Xiaofeng Yuan, Central South University, China

Secretaries

Yun Feng, Hunan University, China
Bingchuan Wang, Central South University, China

Webmasters

Tianmeng Hu, Central South University, China
Xianzhe Liu, Xiangtan University, China

Program Committee Rohit Agarwal Hasin Ahmed Harith Al-Sahaf Brad Alexander Mashaan Alshammari Sabri Arik Ravneet Singh Arora Zeyar Aung Monowar Bhuyan Jingguo Bi Xu Bin Marcin Blachnik Paul Black Anoop C. S. Ning Cai Siripinyo Chantamunee Hangjun Che

UiT The Arctic University of Norway, Norway Gauhati University, India Victoria University of Wellington, New Zealand University of Adelaide, Australia Independent Researcher, Saudi Arabia Istanbul University, Turkey Block Inc., USA Khalifa University of Science and Technology, UAE Umeå University, Sweden Beijing University of Posts and Telecommunications, China Northwestern Polytechnical University, China Silesian University of Technology, Poland Federation University, Australia Govt. Engineering College, India Beijing University of Posts and Telecommunications, China Walailak University, Thailand City University of Hong Kong, China


Wei-Wei Che Huabin Chen Jinpeng Chen Ke-Jia Chen Lv Chen Qiuyuan Chen Wei-Neng Chen Yufei Chen Long Cheng Yongli Cheng Sung-Bae Cho Ruikai Cui Jianhua Dai Tao Dai Yuxin Ding Bo Dong Shanling Dong Sidong Feng Yuming Feng Yun Feng Junjie Fu Yanggeng Fu Ninnart Fuengfusin Thippa Reddy Gadekallu Ruobin Gao Tom Gedeon Kam Meng Goh Zbigniew Gomolka Shengrong Gong Xiaodong Gu Zhihao Gu Changlu Guo Weixin Han Xing He Akira Hirose Yin Hongwei Md Zakir Hossain Zengguang Hou

Qingdao University, China Nanchang University, China Beijing University of Posts & Telecommunications, China Nanjing University of Posts and Telecommunications, China Shandong Normal University, China Tencent Technology, China South China University of Technology, China Tongji University, China Institute of Automation, China Fuzhou University, China Yonsei University, South Korea Australian National University, Australia Hunan Normal University, China Tsinghua University, China Harbin Institute of Technology, China Xi’an Jiaotong University, China Zhejiang University, China Monash University, Australia Chongqing Three Gorges University, China Hunan University, China Southeast University, China Fuzhou University, China Kyushu Institute of Technology, Japan VIT University, India Nanyang Technological University, Singapore Curtin University, Australia Tunku Abdul Rahman University of Management and Technology, Malaysia University of Rzeszow, Poland Changshu Institute of Technology, China Fudan University, China Shanghai Jiao Tong University, China Budapest University of Technology and Economics, Hungary Northwestern Polytechnical University, China Southwest University, China University of Tokyo, Japan Huzhou Normal University, China Curtin University, Australia Chinese Academy of Sciences, China


Lu Hu Zeke Zexi Hu He Huang Junjian Huang Kaizhu Huang David Iclanzan Radu Tudor Ionescu Asim Iqbal Syed Islam Kazunori Iwata Junkai Ji Yi Ji Canghong Jin Xiaoyang Kang Mutsumi Kimura Masahiro Kohjima Damian Kordos Marek Kraft Lov Kumar Weng Kin Lai Xinyi Le Bin Li Hongfei Li Houcheng Li Huaqing Li Jianfeng Li Jun Li Kan Li Peifeng Li Wenye Li Xiangyu Li Yantao Li Yaoman Li Yinlin Li Yuan Li Yun Li Zhidong Li Zhixin Li Zhongyi Li

Jiangsu University, China University of Sydney, Australia Soochow University, China Chongqing University of Education, China Duke Kunshan University, China Sapientia University, Romania University of Bucharest, Romania Cornell University, USA Edith Cowan University, Australia Hiroshima City University, Japan Shenzhen University, China Soochow University, China Zhejiang University, China Fudan University, China Ryukoku University, Japan NTT, Japan Rzeszow University of Technology, Poland Pozna´n University of Technology, Poland NIT Kurukshetra, India Tunku Abdul Rahman University of Management & Technology, Malaysia Shanghai Jiao Tong University, China University of Science and Technology of China, China Xinjiang University, China Chinese Academy of Sciences, China Southwest University, China Southwest University, China Nanjing Normal University, China Beijing Institute of Technology, China Soochow University, China Chinese University of Hong Kong, China Beijing Jiaotong University, China Chongqing University, China Chinese University of Hong Kong, China Chinese Academy of Sciences, China Academy of Military Science, China Nanjing University of Posts and Telecommunications, China University of Technology Sydney, Australia Guangxi Normal University, China Beihang University, China


Ziqiang Li Xianghong Lin Yang Lin Huawen Liu Jian-Wei Liu Jun Liu Junxiu Liu Tommy Liu Wen Liu Yan Liu Yang Liu Yaozhong Liu Yong Liu Yubao Liu Yunlong Liu Zhe Liu Zhen Liu Zhi-Yong Liu Ma Lizhuang Chu-Kiong Loo Vasco Lopes Hongtao Lu Wenpeng Lu Biao Luo Ye Luo Jiancheng Lv Yuezu Lv Huifang Ma Jinwen Ma Jyoti Maggu Adnan Mahmood Mufti Mahmud Krishanu Maity Srimanta Mandal Wang Manning Piotr Milczarski Malek Mouhoub Nankun Mu Wenlong Ni Anupiya Nugaliyadde

University of Tokyo, Japan Northwest Normal University, China University of Sydney, Australia Zhejiang Normal University, China China University of Petroleum, China Chengdu University of Information Technology, China Guangxi Normal University, China Australian National University, Australia Chinese University of Hong Kong, China Taikang Insurance Group, China Guangdong University of Technology, China Australian National University, Australia Heilongjiang University, China Sun Yat-sen University, China Xiamen University, China Jiangsu University, China Chinese Academy of Sciences, China Chinese Academy of Sciences, China Shanghai Jiao Tong University, China University of Malaya, Malaysia Universidade da Beira Interior, Portugal Shanghai Jiao Tong University, China Qilu University of Technology, China Central South University, China Tongji University, China Sichuan University, China Beijing Institute of Technology, China Northwest Normal University, China Peking University, China Thapar Institute of Engineering and Technology Patiala, India Macquarie University, Australia University of Padova, Italy Indian Institute of Technology Patna, India DA-IICT, India Fudan University, China Lodz University of Technology, Poland University of Regina, Canada Chongqing University, China Jiangxi Normal University, China Murdoch University, Australia


Toshiaki Omori Babatunde Onasanya Manisha Padala Sarbani Palit Paul Pang Rasmita Panigrahi Kitsuchart Pasupa Dipanjyoti Paul Hu Peng Kebin Peng Dawid Połap Zhong Qian Sitian Qin Toshimichi Saito Fumiaki Saitoh Naoyuki Sato Chandni Saxena Jiaxing Shang Lin Shang Jie Shao Yin Sheng Liu Sheng-Lan Hayaru Shouno Gautam Srivastava Jianbo Su Jianhua Su Xiangdong Su Daiki Suehiro Basem Suleiman Ning Sun Shiliang Sun Chunyu Tan Gouhei Tanaka Maolin Tang Shu Tian Shikui Tu Nancy Victor Petra Vidnerová


Kobe University, Japan University of Ibadan, Nigeria Indian Institute of Science, India Indian Statistical Institute, India Federation University, Australia Giet University, India King Mongkut’s Institute of Technology Ladkrabang, Thailand Ohio State University, USA Jiujiang University, China University of Texas at San Antonio, USA Silesian University of Technology, Poland Soochow University, China Harbin Institute of Technology at Weihai, China Hosei University, Japan Chiba Institute of Technology, Japan Future University Hakodate, Japan Chinese University of Hong Kong, China Chongqing University, China Nanjing University, China University of Science and Technology of China, China Huazhong University of Science and Technology, China Dalian University of Technology, China University of Electro-Communications, Japan Brandon University, Canada Shanghai Jiao Tong University, China Institute of Automation, China Inner Mongolia University, China Kyushu University, Japan University of New South Wales, Australia Shandong Normal University, China East China Normal University, China Anhui University, China University of Tokyo, Japan Queensland University of Technology, Australia University of Science and Technology Beijing, China Shanghai Jiao Tong University, China Vellore Institute of Technology, India Institute of Computer Science, Czech Republic


Shanchuan Wan Tao Wan Ying Wan Bangjun Wang Hao Wang Huamin Wang Hui Wang Huiwei Wang Jianzong Wang Lei Wang Lin Wang Shi Lin Wang Wei Wang Weiqun Wang Xiaoyu Wang Xin Wang Xin Wang Yan Wang Yan Wang Yonghua Wang Yongyu Wang Zhenhua Wang Zi-Peng Wang Hongxi Wei Guanghui Wen Guoguang Wen Ka-Chun Wong Anna Wróblewska Fengge Wu Ji Wu Wei Wu Yue Wu Likun Xia Lin Xiao Qiang Xiao Hao Xiong Dongpo Xu Hua Xu Jianhua Xu

University of Tokyo, Japan Beihang University, China Southeast University, China Soochow University, China Shanghai University, China Southwest University, China Nanchang Institute of Technology, China Southwest University, China Ping An Technology, China National University of Defense Technology, China University of Jinan, China Shanghai Jiao Tong University, China Shenzhen MSU-BIT University, China Chinese Academy of Sciences, China Tokyo Institute of Technology, Japan Southwest University, China Southwest University, China Chinese Academy of Sciences, China Sichuan University, China Guangdong University of Technology, China JD Logistics, China Northwest A&F University, China Beijing University of Technology, China Inner Mongolia University, China Southeast University, China Beijing Jiaotong University, China City University of Hong Kong, China Warsaw University of Technology, Poland Institute of Software, Chinese Academy of Sciences, China Tsinghua University, China Inner Mongolia University, China Shanghai Jiao Tong University, China Capital Normal University, China Hunan Normal University, China Huazhong University of Science and Technology, China Macquarie University, Australia Northeast Normal University, China Tsinghua University, China Nanjing Normal University, China


Xinyue Xu Yong Xu Ngo Xuan Bach Hao Xue Yang Xujun Haitian Yang Jie Yang Minghao Yang Peipei Yang Zhiyuan Yang Wangshu Yao Ming Yin Qiang Yu Wenxin Yu Yun-Hao Yuan Xiaodong Yue Paweł Zawistowski Hui Zeng Wang Zengyunwang Daren Zha Zhi-Hui Zhan Baojie Zhang Canlong Zhang Guixuan Zhang Jianming Zhang Li Zhang Wei Zhang Wenbing Zhang Xiang Zhang Xiaofang Zhang Xiaowang Zhang Xinglong Zhang Dongdong Zhao Xiang Zhao Xu Zhao


Hong Kong University of Science and Technology, China Beijing Institute of Technology, China Posts and Telecommunications Institute of Technology, Vietnam University of New South Wales, Australia Chongqing Jiaotong University, China Chinese Academy of Sciences, China Shanghai Jiao Tong University, China Chinese Academy of Sciences, China Chinese Academy of Science, China City University of Hong Kong, China Soochow University, China Guangdong University of Technology, China Tianjin University, China Southwest University of Science and Technology, China Yangzhou University, China Shanghai University, China Warsaw University of Technology, Poland Southwest University of Science and Technology, China Hunan First Normal University, China Institute of Information Engineering, China South China University of Technology, China Chongqing Three Gorges University, China Guangxi Normal University, China Chinese Academy of Science, China Changsha University of Science and Technology, China Soochow University, China Southwest University, China Yangzhou University, China National University of Defense Technology, China Soochow University, China Tianjin University, China National University of Defense Technology, China Wuhan University of Technology, China National University of Defense Technology, China Shanghai Jiao Tong University, China


Liping Zheng Yan Zheng Baojiang Zhong Guoqiang Zhong Jialing Zhou Wenan Zhou Xiao-Hu Zhou Xinyu Zhou Quanxin Zhu Yuanheng Zhu Xiaotian Zhuang Dongsheng Zou

Hefei University of Technology, China Kyushu University, Japan Soochow University, China Ocean University of China, China Nanjing University of Science and Technology, China PCN&CAD Center, China Institute of Automation, China Jiangxi Normal University, China Nanjing Normal University, China Chinese Academy of Sciences, China JD Logistics, China Chongqing University, China

Contents – Part X

Human Centred Computing

Paper Recommendation with Multi-view Knowledge-Aware Attentive Network . . . 3
Yuzhi Chen, Pengjun Zhai, and Yu Fang

Biological Tissue Sections Instance Segmentation Based on Active Learning . . . 16
Yanan lv, Haoze Jia, Haoran Chen, Xi Chen, Guodong Sun, and Hua Han

Research on Automatic Segmentation Algorithm of Brain Tumor Image Based on Multi-sequence Self-supervised Fusion in Complex Scenes . . . 28
Guiqiang Zhang, Jianting Shi, Wenqiang Liu, Guifu Zhang, and Yuanhan He

An Effective Morphological Analysis Framework of Intracranial Artery in 3D Digital Subtraction Angiography . . . 50
Haining Zhao, Tao Wang, Shiqi Liu, Xiaoliang Xie, Xiaohu Zhou, Zengguang Hou, Liqun Jiao, Yan Ma, Ye Li, Jichang Luo, Jia Dong, and Bairu Zhang

Effective Domain Adaptation for Robust Dysarthric Speech Recognition . . . 62
Shanhu Wang, Jing Zhao, and Shiliang Sun

TCNet: Texture and Contour-Aware Model for Bone Marrow Smear Region of Interest Selection . . . 74
Chengliang Wang, Jian Chen, Xing Wu, Zailin Yang, Longrong Ran, and Yao Liu

Diff-Writer: A Diffusion Model-Based Stylized Online Handwritten Chinese Character Generator . . . 86
Min-Si Ren, Yan-Ming Zhang, Qiu-Feng Wang, Fei Yin, and Cheng-Lin Liu

EEG Epileptic Seizure Classification Using Hybrid Time-Frequency Attention Deep Network . . . 101
Yunfei Tian, Chunyu Tan, Qiaoyun Wu, and Yun Zhou


A Feature Pyramid Fusion Network Based on Dynamic Perception Transformer for Retinal OCT Biomarker Image Segmentation . . . 114
Xiaoming Liu and Yuanzhe Ding

LDW-RS Loss: Label Density-Weighted Loss with Ranking Similarity Regularization for Imbalanced Deep Fetal Brain Age Regression . . . 125
Yang Liu, Siru Wang, Wei Xia, Aaron Fenster, Haitao Gan, and Ran Zhou

Segment Anything Model for Semi-supervised Medical Image Segmentation via Selecting Reliable Pseudo-labels . . . 138
Ning Li, Lianjin Xiong, Wei Qiu, Yudong Pan, Yiqian Luo, and Yangsong Zhang

Aided Diagnosis of Autism Spectrum Disorder Based on a Mixed Neural Network Model . . . 150
Yiqian Luo, Ning Li, Yudong Pan, Wei Qiu, Lianjin Xiong, and Yangsong Zhang

A Supervised Spatio-Temporal Contrastive Learning Framework with Optimal Skeleton Subgraph Topology for Human Action Recognition . . . 162
Zelin Deng, Hao Zhou, Wei Ouyang, Pei He, Song Yun, Qiang Tang, and Li Yu

Multi-scale Feature Fusion Neural Network for Accurate Prediction of Drug-Target Interactions . . . 176
Zhibo Yang, Binhao Bai, Jinyu Long, Ping Wei, and Junli Li

GoatPose: A Lightweight and Efficient Network with Attention Mechanism . . . 189
Yaxuan Sun, Annan Wang, and Shengxi Wu

Sign Language Recognition for Low Resource Languages Using Few Shot Learning . . . 203
Kaveesh Charuka, Sandareka Wickramanayake, Thanuja D. Ambegoda, Pasan Madhushan, and Dineth Wijesooriya

T Cell Receptor Protein Sequences and Sparse Coding: A Novel Approach to Cancer Classification . . . 215
Zahra Tayebi, Sarwan Ali, Prakash Chourasia, Taslim Murad, and Murray Patterson

Weakly Supervised Temporal Action Localization Through Segment Contrastive Learning . . . 228
Zihao Jiang and Yidong Li


S-CGRU: An Efficient Model for Pedestrian Trajectory Prediction . . . 244
Zhenwei Xu, Qing Yu, Wushouer Slamu, Yaoyong Zhou, and Zhida Liu

Prior-Enhanced Network for Image-Based PM2.5 Estimation from Imbalanced Data Distribution . . . 260
Xueqing Fang, Zhan Li, Bin Yuan, Xinrui Wang, Zekai Jiang, Jianliang Zeng, and Qingliang Chen

Dynamic Data Augmentation via Monte-Carlo Tree Search for Prostate MRI Segmentation . . . 272
Xinyue Xu, Yuhan Hsi, Haonan Wang, and Xiaomeng Li

Language Guided Graph Transformer for Skeleton Action Recognition . . . 283
Libo Weng, Weidong Lou, and Fei Gao

A Federated Multi-stage Light-Weight Vision Transformer for Respiratory Disease Detection . . . 300
Pranab Sahoo, Saksham Kumar Sharma, Sriparna Saha, and Samrat Mondal

Curiosity Enhanced Bayesian Personalized Ranking for Recommender Systems . . . 312
Yaoming Deng, Qiqi Ding, Xin Wu, and Yi Cai

Modeling Online Adaptive Navigation in Virtual Environments Based on PID Control . . . 325
Yuyang Wang, Jean-Rémy Chardonnet, and Frédéric Merienne

Lip Reading Using Temporal Adaptive Module . . . 347
Jian Huang, Lianwei Teng, Yewei Xiao, Aosu Zhu, and Xuanming Liu

AudioFormer: Channel Audio Encoder Based on Multi-granularity Features . . . 357
Jialin Wang, Yunfeng Xu, Borui Miao, and Shaojie Zhao

A Context Aware Lung Cancer Survival Prediction Network by Using Whole Slide Images . . . 374
Xinyu Liu, Yicheng Wang, and Ye Luo

A Novel Approach for Improved Pedestrian Walking Speed Prediction: Exploiting Proximity Correlation . . . 387
Xiaohe Chen, Zhiyong Tao, Mei Wang, and Yuanzhen Zhou


MView-DTI: A Multi-view Feature Fusion-Based Approach for Drug-Target Protein Interaction Prediction . . . 400
Jiahui Wen, Haitao Gan, Zhi Yang, Ming Shi, and Ji Wang

User Multi-preferences Fusion for Conversational Recommender Systems . . . 412
Yi Zhang, Dongming Zhao, Bo Wang, Kun Huang, Ruifang He, and Yuexian Hou

Debiasing Medication Recommendation with Counterfactual Analysis . . . 426
Pei Tang, Chunping Ouyang, and Yongbin Liu

Early Detection of Depression and Alcoholism Disorders by EEG Signal . . . 439
Hesam Akbari and Wael Korani

Unleash the Capabilities of the Vision-Language Pre-training Model in Gaze Object Prediction . . . 453
Dazhi Chen and Gang Gou

A Two-Stage Network for Segmentation of Vertebrae and Intervertebral Discs: Integration of Efficient Local-Global Fusion Using 3D Transformer and 2D CNN . . . 467
Zhiqiang Li, Xiaogen Zhou, and Tong Tong

Integrating Multi-view Feature Extraction and Fuzzy Rank-Based Ensemble for Accurate HIV-1 Protease Cleavage Site Prediction . . . 480
Susmita Palmal, Sriparna Saha, and Somanath Tripathy

KSHFS: Research on Drug-Drug Interaction Prediction Based on Knowledge Subgraph and High-Order Feature-Aware Structure . . . 493
Nana Wang, Qian Gao, and Jun Fan

Self-supervised-Enhanced Dual Hierarchical Graph Convolution Network for Social Recommendation . . . 507
Yixing Guo, Weimin Li, Jingchao Wang, and Shaohua Li

Dynamical Graph Echo State Networks with Snapshot Merging for Spreading Process Classification . . . 523
Ziqiang Li, Kantaro Fujiwara, and Gouhei Tanaka

Trajectory Prediction with Contrastive Pre-training and Social Rank Fine-Tuning . . . 535
Chenyou Fan, Haiqi Jiang, Aimin Huang, and Junjie Hu


Three-Dimensional Medical Image Fusion with Deformable Cross-Attention . . . 551
Lin Liu, Xinxin Fan, Chulong Zhang, Jingjing Dai, Yaoqin Xie, and Xiaokun Liang

Handling Class Imbalance in Forecasting Parkinson's Disease Wearing-off with Fitness Tracker Dataset . . . 564
John Noel Victorino, Sozo Inoue, and Tomohiro Shibata

Real-Time Instance Segmentation and Tip Detection for Neuroendoscopic Surgical Instruments . . . 579
Rihui Song, Silu Guo, Ni Liu, Yehua Ling, Jin Gong, and Kai Huang

Spatial Gene Expression Prediction Using Hierarchical Sparse Attention . . . 594
Cui Chen, Zuping Zhang, and Panrui Tang

Author Index . . . 607

Human Centred Computing

Paper Recommendation with Multi-view Knowledge-Aware Attentive Network

Yuzhi Chen, Pengjun Zhai(B), and Yu Fang

Tongji University, Shanghai 201804, China
{2132906,1810369,fangyu}@tongji.edu.cn

Abstract. The paper recommendation system aims to recommend potential papers of interest to users from massive data. Many efforts have introduced knowledge graphs to alleviate problems such as the data sparsity faced by traditional recommendation methods, and have used GNN-based techniques to mine the features of users and papers. However, existing work has not emphasized the quality of knowledge graph construction, nor has it tailored the modeling method to the paper recommendation scenario, which leaves room for improvement in the quality of recommendation results. In this paper, we propose a Multi-View Knowledge-aware Attentive Network (MVKAN). Specifically, we first design a knowledge graph construction method based on keynote extraction for better recommendation assistance. We then design mechanisms for the aggregation and propagation of graph attention from three views: the connectivity importance of entities, users' time preferences, and short-cut links to users based on tag similarity. This helps to model the representations of users and papers more effectively. Results from the experiments show that our model outperforms the baselines.

Keywords: Paper recommendation · Knowledge graph · Graph attention network

1 Introduction

With the exponential growth of the literature, traditional paper search faces a severe information overload problem. A paper recommendation system [2] can efficiently find papers that match a researcher's interests in massive data. Traditional content-based and collaborative filtering methods suffer from sparse interaction data, cold start, and poor interpretability. Therefore, many works mitigated these problems by introducing auxiliary information such as knowledge graphs (KGs) [1]. Such approaches typically use graph neural network techniques (GNNs) to encode representations of nodes [3], aggregating and propagating the features of user and paper nodes over the KG structure. Examples include RippleNet [4], which defined interest propagation networks, KGCN [6], which applied graph convolutional networks to KGs, KGAT [5], which combined graph attention mechanisms with KGs, and so on [1, 3].

Under the specific task of KG-enhanced paper recommendation, the first step is to build a domain-specific KG based on the papers [7, 16, 18, 19], which is constructed to explore potential associated paths on the paper side. The high-order connectivity [5] coming from the KG is then captured. In this paper, the following points distinguish our work from previous work:

Quality Knowledge Graph Construction. Many works [7, 13–15] used unsupervised keyword extraction algorithms or topic models to obtain keywords from the abstract of a paper. These keywords are used to query external knowledge bases to build the KG. However, keywords extracted in this way do not properly represent the core information of the paper. This may blindly link to off-topic, worthless triplets, thereby 'diluting' the meaningful high-order connectivity in the constructed KG.

Optimization of Information Aggregation Mechanisms to Suit Paper Recommendation Scenarios

Connectivity Importance of Entities. Previous work [5, 6, 8] did not consider the connectivity of entities when aggregating neighborhood information. For example, KGAT calculated attention weights based only on the distance between entities in the KG embedding (KGE) space [12]. Figure 1 shows an example where P1 will be biased towards the less info-dense region A2 during training. Instead, P1 should prioritize information from A1, as it implies more message-passing paths to user and paper nodes.

Fig. 1. An example showing the impact of different connectivity importance of neighbors
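To make the connectivity argument illustrated in Fig. 1 concrete, the toy sketch below weights each neighbor of a paper node by how many user and paper nodes that neighbor can reach within two hops, so that a bridging author entity such as A1 ends up with far more attention than a sparsely connected A2. This is only an illustration of the idea: the toy graph, the two-hop horizon, and the softmax mixing are assumptions of this sketch, not the attention formulation MVKAN defines in its multi-view attentive layers (Sect. 3.2).

```python
import math
from collections import defaultdict

# Toy undirected CKG around paper P1: author A1 bridges papers/users, A2 does not.
edges = [("P1", "A1"), ("P1", "A2"), ("A1", "P2"), ("A1", "P3"),
         ("P2", "U1"), ("P3", "U2"), ("A2", "X1")]  # X1: hypothetical attribute entity
adj = defaultdict(set)
for a, b in edges:
    adj[a].add(b)
    adj[b].add(a)

def connectivity_score(node, hops=2):
    """Count user/paper nodes reachable from `node` within `hops` steps."""
    frontier, seen = {node}, {node}
    for _ in range(hops):
        frontier = {n for f in frontier for n in adj[f]} - seen
        seen |= frontier
    return sum(1 for n in seen if n != node and n[0] in ("U", "P"))

def neighbor_weights(node):
    """Softmax over the connectivity scores of `node`'s direct neighbors."""
    scores = {n: connectivity_score(n) for n in adj[node]}
    z = sum(math.exp(s) for s in scores.values())
    return {n: round(math.exp(s) / z, 3) for n, s in scores.items()}

print(neighbor_weights("P1"))  # A1 dominates A2, matching the Fig. 1 intuition
```

Running the sketch gives P1 an aggregation weight close to 1 for A1, i.e. the information-dense side that offers more message-passing paths to user and paper nodes.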

Fig. 2. Different time preference distributions exist for different users


Distribution of Users' Time Preference for Papers. We found that most users have a distinct time preference in their paper interactions, as shown in Fig. 2. In this paper, we call this the locality of time preferences: the papers corresponding to a specific interest of a user are concentrated within a certain time window.

Enriching the Semantic Links on the User Side with Additional Information. Restricted by privacy policies, most datasets remove information about users. As a result, most previous work has given up on mining potential linking relationships on the user side and instead focused only on extending the KG on the paper side [2].

Due to the above problems, we propose MVKAN, a paper recommendation framework based on knowledge graph construction and a Multi-View Knowledge-aware Attentive Network. Our main contributions include:

1. We define and extract the keynote pattern of a paper to construct the KG, and design experiments to verify that it can better support downstream recommendation tasks.
2. We redesign the graph attention mechanism to guide the computation of graph attention from multiple views, such as the connectivity importance of entities and the temporal preferences of users.
3. We capture the similarity of users' interests from tags and abstract it as short-cut links on the user side. This allows users to merge information from similar others, thus 'short-circuiting' the capture of high-order connectivity (a toy sketch of this idea follows the list below).
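Contribution 3 can be pictured with a few lines of code. The sketch below derives user-user short-cut links from Jaccard similarity over tag sets; the tag sets, the similarity measure, and the threshold are illustrative assumptions, since the exact construction used by MVKAN is not spelled out in this excerpt.

```python
from itertools import combinations

# Hypothetical tag sets collected from each user's library.
user_tags = {
    "u1": {"knowledge graph", "recommendation", "graph neural network"},
    "u2": {"knowledge graph", "recommendation", "attention"},
    "u3": {"speech recognition", "domain adaptation"},
}

def jaccard(a, b):
    """Jaccard similarity of two tag sets."""
    return len(a & b) / len(a | b) if a | b else 0.0

def shortcut_links(tags, threshold=0.4):
    """Return user-user edges whose tag similarity passes the threshold."""
    links = []
    for u, v in combinations(sorted(tags), 2):
        sim = jaccard(tags[u], tags[v])
        if sim >= threshold:
            links.append((u, v, round(sim, 2)))
    return links

print(shortcut_links(user_tags))  # [('u1', 'u2', 0.5)] -> extra user-side edges in the graph
```

Each returned edge acts as a direct channel between similar users, which is what lets information 'short-circuit' the longer user-paper-entity-paper-user paths.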

2 Problem Formulation
User-Paper Interaction Bipartite Graph. Paper recommendation involves two main sides: the set of users U = {u_1, ..., u_n} and the set of papers P = {p_1, ..., p_n}. We model the user-paper interactions as a bipartite graph G_1 = {(u, y_{up}, p) | u \in U, p \in P}, where y_{up} = 1 indicates an interaction edge between user node u and paper node p, and y_{up} = 0 otherwise.

Keynote Knowledge Graph. We extract the core information of each paper from its abstract, organize it into keynote triplets (defined in detail in Sect. 3.1), and link it to an external knowledge base to build a knowledge graph. We refer to this as the keynote knowledge graph G_2 = {V, R, K}, where V = {v_1, ..., v_n} is the set of entities, R = {r_1, ..., r_n} is the set of relations, and K is the set of knowledge, each piece of knowledge being represented by a symbolic triplet (h, r, t) with h, t \in V and r \in R.

Collaborative Knowledge Graph (CKG). We construct the paper-entity alignment set A in Sect. 3.1 and use it to link G_1 and G_2, yielding a collaborative knowledge graph G = {G_1, A, G_2} that integrates user interactions and knowledge derived from the paper side [5].

Top-K Paper Recommendation Task. Our goal is to generate a Top-K list P_u^k = {p_1, ..., p_k} of paper recommendations for each u \in U. We train the model to learn a functional mapping F(u, p | \Theta, G), u \in U, p \in P, from U to P, where \Theta denotes all trainable parameters. F predicts the probability \hat{y}_{u,p} \in [0, 1] that u will click on p. By sorting papers by \hat{y}_{u,p}, the model computes P_u^k for each user.
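To make the notation concrete, the sketch below shows one possible in-memory representation of G_1, G_2, and the alignment set A; the container and field names are illustrative assumptions, not part of the paper.

```python
from dataclasses import dataclass, field

@dataclass
class CollaborativeKG:
    """Toy containers mirroring G1 (interactions), G2 (keynote KG) and their merge G."""
    interactions: set = field(default_factory=set)   # G1: (user, "interact", paper) edges
    knowledge: set = field(default_factory=set)      # G2: (head, relation, tail) keynote triplets
    alignments: set = field(default_factory=set)     # A:  (paper, "align", entity) links

    def merge(self):
        """G = G1 ∪ A ∪ G2 over the unified node set V' = V ∪ U ∪ P."""
        return self.interactions | self.alignments | self.knowledge

# Illustrative usage with made-up identifiers.
ckg = CollaborativeKG()
ckg.interactions.add(("u1", "interact", "p7"))
ckg.knowledge.add(("p7", "used", "BLEU"))
ckg.alignments.add(("p7", "align", "machine_translation"))
print(len(ckg.merge()))  # 3 triplets in the collaborative knowledge graph
```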




3 The Proposed Framework

Fig. 3. Overview of the proposed framework

Figure 3 shows the framework of our method, which consists of two main components: (1) Knowledge graph construction, which describes how G_2 is built from keynote extraction and how it is merged with G_1 into a CKG; and (2) the Multi-View Knowledge-aware Attentive layer (MVKAN layer), which itself has three parts: a KGE-based encoder, a multi-view graph attention mechanism, and short-cut links for users based on tag similarity. We then describe how the outputs of multiple MVKAN layers are combined to compute the final representations of users and papers for paper recommendation.

3.1 Knowledge Graph Construction
Keynote Extraction. Inspired by [27], we summarize five triplet patterns in Table 1 to model the keynote of a paper, which is more representative of the paper and of the points of interest that users focus on. The keynote involves four types of entities: Task (subject area, task proposition, etc.), Method (the method used or proposed), Measure (metrics, benchmarks), and Material (environment, datasets, equipment). The core of the task is to capture the semantics corresponding to these triplets in the abstract text and extract the tail entities. By reading a large number of papers, we summarized the trigger words in Table 1. With the trigger words and their extended word forms, the scope of keynote extraction can be narrowed to the key sentences, from which entities are extracted to populate the predefined triplets. We cast entity extraction as a NER task [25]. We first apply BIO tagging to annotate part of the abstract data, using the four entity types as labels. We then train a BERT [21] + LSTM + CRF model for automatic annotation and organize the results into the predefined triplets. These are linked to XLORE [20] for further semantic enrichment and finally merged into G_2. For papers with few extracted entities or failed linking,


Table 1. Keynote triplet definitions and partial trigger word prototypes.

Triplets                    Trigger words
(Paper, review, Task)       review, survey, summarize, overview
(Paper, solved, Task)       solve, problem, focus, describe, present
(Paper, design, Method)     evaluate, model, benchmark, dataset
(Paper, used, Measure)      use, introduce, build, propose, design
(Paper, based, Material)    ...

we also extract concept terms from other sentences in the abstract using unsupervised methods as additional linking aids.

Collaborative Knowledge Graph Construction. We define the paper-entity alignment set A = {(p, align, v) | p \in P, v \in G_2}. For the unsupervised method, v is an entity matched from unsupervised keywords and align is the keyword-mapping relation; for keynote extraction, v is an extracted keynote entity and align is one of the relations in the keynote triplets defined above. By linking G_1 and G_2 through A, we obtain the collaborative knowledge graph G = {(h, r, t) | h, t \in V', r \in R'}, where V' = V \cup U \cup P and R' = R \cup {interact} \cup {align}. To ensure the quality of G, we filter out entities and relations with frequencies below 10 and remove the associated triplets. We also compare the impact of the KG constructed by the traditional method (unsupervised extraction only) on recommendation results in Sect. 4.3.
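A minimal sketch of the frequency filter described above, assuming triplets are plain (head, relation, tail) tuples; the threshold of 10 follows the text, everything else is illustrative.

```python
from collections import Counter

def filter_ckg(triplets, min_freq=10):
    """Drop entities/relations occurring fewer than min_freq times, and their triplets."""
    ent_cnt, rel_cnt = Counter(), Counter()
    for h, r, t in triplets:
        ent_cnt.update([h, t])
        rel_cnt.update([r])
    return [
        (h, r, t) for h, r, t in triplets
        if ent_cnt[h] >= min_freq and ent_cnt[t] >= min_freq and rel_cnt[r] >= min_freq
    ]
```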

3.2 Multi-view Knowledge-Aware Attentive Layers

Fig. 4. Structure of MVKAN layers


Next, we present how the constructed G is combined with the MVKAN layers in Fig. 4. This is achieved mainly by the following modules:

KGE-Based Encoder. We use a KGE-based encoder [12] to encode the structural information of G into the representation e_v of entity v. It maps a symbolic triplet (h, r, t) to distributed vectors (e_h, e_r, e_t). Embedding models following the translational-distance idea [12] usually define a scoring function g(h, r, t) for (h, r, t), from which a pairwise loss can be obtained:

L_{KGE} = \sum_{(h,r,t)\in K} \sum_{(h',r,t')\in K'} \varphi\big(\gamma + g(h, r, t) - g(h', r, t')\big)    (1)

where K' is the negative-sample set, \varphi is the transition function, and \gamma is a hyperparameter of the KGE model. We use TransD [22] for encoding:

e_h = M_{rh} e_{hs} = (e_{rp} e_{hp}^{\top} + I)\, e_{hs}, \quad e_t = M_{rt} e_{ts} = (e_{rp} e_{tp}^{\top} + I)\, e_{ts}    (2)

g(h, r, t) = -\lVert e_h + e_r - e_t \rVert_2^2    (3)
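As a reference for Eqs. (2)–(3), the following NumPy sketch shows the TransD projection and the translational score; the vector shapes and variable names are illustrative assumptions rather than the paper's implementation.

```python
import numpy as np

def transd_project(e_s, e_p, r_p):
    """TransD projection (Eq. 2): M_r = r_p e_p^T + I, projected embedding = M_r e_s."""
    M = np.outer(r_p, e_p) + np.eye(len(r_p), len(e_p))
    return M @ e_s

def transd_score(h_s, h_p, t_s, t_p, e_r, r_p):
    """Scoring function g(h, r, t) = -||e_h + e_r - e_t||_2^2 (Eq. 3)."""
    e_h = transd_project(h_s, h_p, r_p)
    e_t = transd_project(t_s, t_p, r_p)
    return -np.sum((e_h + e_r - e_t) ** 2)
```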

This means that each object v in a triplet is represented by a semantic vector e_{vs} and a projection vector e_{vp}; I is the identity matrix. For an entity h with 1-hop neighborhood horizon [6] N_h = {t | (h, r, t) \in K}, the information e_{N_h} passed from it is obtained as:

e_{N_h} = \sum_{(h,r,t)\in N_h} \tilde{\pi}^r_{h,t}\, e_t = \sum_{(h,r,t)\in N_h} \mathrm{softmax}(\pi^r_{h,t})\, e_t    (4)

where \tilde{\pi}^r_{h,t} is obtained by normalizing \pi^r_{h,t} over N_h, and \pi^r_{h,t} is the attention weight defined in Eq. (5) following KGAT [5], which measures the importance of neighbors only by their distance in the KGE space.

As shown in Fig. 4, we redesign the attention mechanism to guide the computation and aggregation of attention through three views:

Connectivity Importance of Neighboring Nodes. From the view of entities, we design an attention factor based on the connectivity importance of neighboring nodes. We use random walks [23] to approximate the k-hop neighborhood horizon of entity h (we set k = 4), denoted N_h^{rw_k}. For a neighbor t_i, its PageRank weight [16] PR(t_i) within N_h^{rw_k} measures the importance of the potential information region it links to for h. We use c_h^{t_i} to denote the connectivity importance of node t_i in N_h^{rw_k}:

c_h^{t_i} = PR_{N_h^{rw_k}}(t_i), \quad t_i \in N_h^{rw_k}    (6)

Thereby, node h aggregates the information e^c_{N_h} from its 1-hop neighbors:

e^c_{N_h} = \sum_{(h,r,t_i)\in N_h} \mathrm{softmax}\big(c_h^{t_i}\, \pi^r_{h,t_i}\big)\, e_{t_i} = \sum_{(h,r,t_i)\in N_h} \tilde{\pi}^{c\,r}_{h,t_i}\, e_{t_i}    (7)
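The connectivity-importance view of Eqs. (6)–(7) can be sketched as below; the use of networkx, the walk count, and the plain softmax over c·π are illustrative choices under stated assumptions, not the paper's implementation.

```python
import random
import numpy as np
import networkx as nx

def rw_neighborhood(G, h, k=4, num_walks=200):
    """Approximate the k-hop neighborhood horizon of h with random walks (k = 4 in the text)."""
    nodes = {h}
    for _ in range(num_walks):
        cur = h
        for _ in range(k):
            nbrs = list(G.neighbors(cur))
            if not nbrs:
                break
            cur = random.choice(nbrs)
            nodes.add(cur)
    return nodes

def connectivity_weights(G, h, k=4):
    """c_h^{t_i}: PageRank of each 1-hop neighbor inside the sampled horizon (Eq. 6)."""
    sub = G.subgraph(rw_neighborhood(G, h, k))
    pr = nx.pagerank(sub)
    return {t: pr.get(t, 0.0) for t in G.neighbors(h)}

def aggregate(neighbor_emb, attn, conn):
    """Eq. (7): softmax(c * pi)-weighted sum of 1-hop neighbor embeddings."""
    ts = list(neighbor_emb)
    logits = np.array([conn[t] * attn[t] for t in ts])
    w = np.exp(logits - logits.max())
    w /= w.sum()
    return sum(wi * neighbor_emb[t] for wi, t in zip(w, ts))
```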


Locality of User's Time Preference. Inspired by [24], we complete the publication year of the papers in the dataset from Google Scholar. For the paper-year series Y = {(p_1, y_1), ..., (p_n, y_n)} in the interactions of user u, we define a factor \tau_u^{p_i} that reflects the time preference of u's reading interest for paper p_i:

\tau_u^{p_i} = 1 + \frac{\sum_{y_m \in Y_h} f_{y_m} + \varepsilon}{\sum_{(p_i, y_n)\in Y} f_{y_n}}    (8)

where Y_h = [\max(\min_y, y_i - \delta), \min(\max_y, y_i + \delta)] is the time-preference locality domain of p_i, f_{y_i} is the frequency of year y_i, \delta is the locality threshold we define, and \varepsilon is a smoothing factor. The factor degenerates to a uniform weight when \delta grows to infinity or the time distribution tends to be uniform. Here we take \delta = 1 and \varepsilon = 1. For user u, \tau_u^{p_i} is involved in computing the information e^{\tau}_{N_u} from N_u:

e^{\tau}_{N_u} = \sum_{(u,r,p_i)\in N_u} \mathrm{softmax}\big(\tau_u^{p_i}\, \pi^r_{u,p_i}\big)\, e_{p_i} = \sum_{(u,r,p_i)\in N_u} \tilde{\pi}^{\tau\,r}_{u,p_i}\, e_{p_i}    (9)
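A small sketch of the time-preference factor in Eq. (8); reading the denominator as the user's total interaction count is an assumption, since the text leaves it implicit.

```python
from collections import Counter

def time_preference(years, y_i, delta=1, eps=1.0):
    """Eq. (8): tau = 1 + (frequency mass inside the locality window + eps) / total frequency."""
    freq = Counter(years)                       # f_y over the user's paper-year series
    lo = max(min(years), y_i - delta)
    hi = min(max(years), y_i + delta)
    in_window = sum(f for y, f in freq.items() if lo <= y <= hi)
    return 1.0 + (in_window + eps) / sum(freq.values())

# e.g. a user whose reading clusters around 2019 weighs a 2019 paper more heavily
print(time_preference([2012, 2018, 2019, 2019, 2020], y_i=2019))
```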

Short-Cut Links for Users Based on Tag Similarity. Unlike the tags of commodity recommendation (e.g., funny, tasty, expensive), the tags users assign to papers (e.g., bayesian-network, python-tools) naturally imply the user's interest or motivation rather than a mere perceptual rating. We can therefore mine users' interests and potential connections from tags. We borrow from text similarity to compute the semantic similarity between users' tag lists. All tags used by user u form the series S_u = {(t_1, f_1), ..., (t_n, f_n)}, where t_n is the n-th tag and f_n is its frequency in u's tag list. Over the whole user set U we obtain the distribution D = {(t_1, d_1), ..., (t_n, d_n)}, where d_n is the number of times t_n appears in the tag series of different users. The weight of t_i for u can then be expressed in TF-IDF form:

w_u^i = \frac{f_i}{\sum_{k=1}^{n} f_k} \times \log\frac{m}{d_i + 1}    (10)

We then obtain the series S_u = {(t_1, w_u^1), ..., (t_n, w_u^n)}, where w_u^n measures the importance of t_n in the interest composition of u. Treating S_u as the "text" of u's reading interest expanded in the tag dimension, we use the pre-trained model [21] to encode a vector v_i for each tag and represent the user vector v_u as the weighted average over S_u. We use the cosine similarity \theta(u_i, u_j) to represent the similarity of interests between u_i and u_j in the tag dimension. When two users have similar interests under \theta, they can provide potential recommendation value to each other. As shown in Fig. 4(c), we create a virtual two-way edge r' between users whose similarity exceeds the threshold \alpha, and define the tag-similarity horizon N^{SC}_{u_i} = {u_j | \theta(u_i, u_j) \ge \alpha} for user u_i. Essentially, this opens a short-cut pathway that quickly spreads the information e^{SC}_{u_i} passed by potential semantic neighbors of u_i:

e^{SC}_{u_i} = \sum_{u_j \in N^{SC}_{u_i}} \tilde{\mu}(u_i, r', u_j)\, e_{u_j} = \sum_{u_j \in N^{SC}_{u_i}} \mathrm{softmax}\big(\theta(u_i, u_j)\big)\, e_{u_j}    (11)
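The tag-based short-cut construction of Eqs. (10)–(11) could look roughly as follows; num_users stands in for m, which the text leaves implicit, and the tag embeddings are assumed to come from the pre-trained encoder.

```python
import numpy as np
from collections import Counter

def tag_weights(user_tags, tag_df, num_users):
    """Eq. (10): TF-IDF weight of each tag for one user (num_users stands in for m)."""
    tf = Counter(user_tags)
    total = sum(tf.values())
    return {t: (f / total) * np.log(num_users / (tag_df[t] + 1)) for t, f in tf.items()}

def user_vector(weights, tag_emb):
    """User interest vector as the weighted average of pre-trained tag embeddings."""
    vecs = np.array([tag_emb[t] for t in weights])
    w = np.array([weights[t] for t in weights])
    return (w[:, None] * vecs).sum(0) / (np.abs(w).sum() + 1e-8)

def short_cut_links(user_vecs, alpha=0.7):
    """Create a virtual two-way edge between users whose cosine similarity exceeds alpha."""
    ids = list(user_vecs)
    links = []
    for i, ui in enumerate(ids):
        for uj in ids[i + 1:]:
            a, b = user_vecs[ui], user_vecs[uj]
            cos = a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8)
            if cos >= alpha:
                links.append((ui, uj))
    return links
```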


3.3 Information Aggregation and High-Order Propagation
For user u, we need to aggregate the structural information e^{\tau}_{N_u} from the paper side (G_2) and the short-cut information e^{SC}_u from the user side during training:

e_{N_u} = f_{agg}\big(e^{\tau}_{N_u}, e^{SC}_u\big) = \mathrm{LeakyReLU}\big(W (e^{\tau}_{N_u} + e^{SC}_u)\big)    (12)

where W \in R^{d' \times d} is a learnable weight matrix that distills the effective information for aggregation and d' is the transformation size. For any entity v \in V' in the CKG, we use the Bi-Interaction aggregator [5] to obtain its 1-layer representation e_v^{(1)}:

e_v^{(1)} = f_{Bi\text{-}Interaction}\big(e_v^{(0)}, e_{N_v}^{(0)}\big) = \mathrm{LeakyReLU}\big(W_a (e_v^{(0)} + e_{N_v}^{(0)})\big) + \mathrm{LeakyReLU}\big(W_m (e_v^{(0)} \odot e_{N_v}^{(0)})\big)    (13)

where e_{N_v}^{(0)} is the neighborhood information obtained by entity v at the initial aggregation layer, W_a, W_m \in R^{d' \times d} are trainable weight matrices, and \odot is the element-wise product. We use a recursive approach [5, 6] to propagate high-order connectivity information and obtain entity representations at different layers by aggregating L times. For an entity v in the graph, its l-layer representation is:

e_v^{(l)} = f_{Bi\text{-}Interaction}\big(e_v^{(l-1)}, e_{N_v}^{(l-1)}\big)    (14)

where e_v^{(l)} at layer l depends on e_v^{(l-1)} and e_{N_v}^{(l-1)} of layer l - 1, which allows the model to gradually aggregate information from remote nodes along multi-hop paths. It is worth noting that the short-cut links we create play an "order-reducing" role in this process: they help the model learn higher-order connectivity information by exploring potential short-cut channels.

3.4 Prediction and Optimization
After aggregating information over L MVKAN layers, the model outputs representations of user u and paper p at each layer. We join them into final representations as inputs for prediction:

e^*_u = f_{join}\big(e_u^{(0)}, e_u^{(1)}, \ldots, e_u^{(L)}\big), \quad e^*_p = f_{join}\big(e_p^{(0)}, e_p^{(1)}, \ldots, e_p^{(L)}\big)    (15)

Here we use the concatenation operation [5] to compute e^*_u and e^*_p. The probability that u is interested in p is then computed with the inner product:

\hat{y}_{u,p} = e^{*\top}_u e^*_p    (16)
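A hedged NumPy sketch of the Bi-Interaction aggregation and layer stacking of Eqs. (13)–(15); for brevity the neighborhood information is a plain mean rather than the attention-weighted sum of Sect. 3.2, and square weight matrices are assumed so layers can be stacked.

```python
import numpy as np

def leaky_relu(x, slope=0.01):
    return np.where(x > 0, x, slope * x)

def bi_interaction(e_v, e_Nv, W_a, W_m):
    """Eq. (13): sum branch plus element-wise-product branch, each through LeakyReLU."""
    return leaky_relu(W_a @ (e_v + e_Nv)) + leaky_relu(W_m @ (e_v * e_Nv))

def propagate(e0, neighbors_of, W_a, W_m, L=3):
    """Eqs. (13)-(14): stack L aggregation layers; W_a, W_m assumed square (d' = d)."""
    layers = [dict(e0)]
    for _ in range(L):
        prev, cur = layers[-1], {}
        for v, e_v in prev.items():
            nbrs = neighbors_of.get(v, [])
            e_Nv = np.mean([prev[u] for u in nbrs], axis=0) if nbrs else np.zeros_like(e_v)
            cur[v] = bi_interaction(e_v, e_Nv, W_a, W_m)
        layers.append(cur)
    # Eq. (15): concatenate the per-layer representations as the final embedding.
    return {v: np.concatenate([layer[v] for layer in layers]) for v in e0}
```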

For the user-paper interactions, we use BPR [9] to build the loss function:

L_{UP} = -\sum_{(u,p,p')\in O} \ln \sigma\big(\hat{y}_{u,p} - \hat{y}_{u,p'}\big)    (17)


where \sigma(\cdot) is the sigmoid function and O is defined as O = {(u, p, p') | (u, p) \in O^+, (u, p') \in O^-}; O^+ is the training set and O^- is the negative-sample set. Negative samples are drawn at random from papers the user has not interacted with. Finally, the following joint loss is used to optimize the entire model:

L_{total} = L_{KGE} + L_{UP} + \lambda \lVert \Theta \rVert_2^2    (18)

where the last term is used to prevent overfitting.
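For completeness, a minimal sketch of the BPR and joint objectives of Eqs. (16)–(18); the regularisation weight is an arbitrary placeholder, not the paper's setting.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def bpr_loss(e_u, e_pos, e_neg):
    """Eq. (17): -ln sigma(y_up - y_up') with inner-product scores (Eq. 16)."""
    return -np.log(sigmoid(e_u @ e_pos - e_u @ e_neg) + 1e-12)

def total_loss(l_kge, l_up, params, lam=1e-5):
    """Eq. (18): joint objective with L2 regularisation on all trainable parameters."""
    return l_kge + l_up + lam * sum(np.sum(p ** 2) for p in params)
```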

4 Experiments
4.1 Experiment Settings
Dataset. Citeulike-a is a real dataset from the paper-sharing site CiteULike [26]. It contains papers, user-paper interactions, and tags collected from CiteULike. We randomly split the dataset into train, validation, and test sets in the ratio 7:1:2. Table 2 shows the statistics of our datasets. We name the CKG constructed by unsupervised keyword extraction citeUnsup and the CKG built by keynote extraction citeKExtra (for fairness, we keep the two KGs comparable in size).

Table 2. The statistics of the datasets.

Dataset       Knowledge Graph                      User-Item Interaction
              entities   relations   triplets      Users   Items   Inters    Tags    Sparsity
citeUnsup     42331      26          303912        5551    16980   204987    17951   99.78%
citeKExtra    43528      29          325703

Metrics and Parameter Setting. We select the common metrics Precision, Recall, and NDCG to evaluate the proposed model (Table 3). In the KGE, the embedding dimension is set to 100 for entities and 64 for relations. We set the batch size to 2048 for the KGE loss and 1024 for the BPR loss. For short-cut links, we take the similarity threshold \alpha = 0.7. All data distributions and statistics used for training are computed from the training set only, to prevent data leakage. We train L_{KGE} and L_{UP} alternately, using the mini-batch Adam optimizer and Xavier initialization.
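The following sketch shows one common way to compute the three Top-K metrics for a single user; the exact NDCG variant used in the paper is not stated, so this is an assumption.

```python
import numpy as np

def precision_recall_ndcg_at_k(ranked, relevant, k=20):
    """Top-K metrics as in Table 3; `ranked` is the model's ordering for one user,
    `relevant` the user's held-out papers."""
    top = ranked[:k]
    hits = [1.0 if p in relevant else 0.0 for p in top]
    precision = sum(hits) / k
    recall = sum(hits) / max(len(relevant), 1)
    dcg = sum(h / np.log2(i + 2) for i, h in enumerate(hits))
    idcg = sum(1.0 / np.log2(i + 2) for i in range(min(len(relevant), k)))
    ndcg = dcg / idcg if idcg > 0 else 0.0
    return precision, recall, ndcg
```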

4.2 Experiment Results and Analysis
Previous recommendation algorithms fall into two categories: those that do not use a knowledge graph and those enhanced by one. We select recent models from both categories as baselines. Table 3 shows the results of the comparison experiments, from which we draw the following conclusions:


Table 3. Performance Comparison.

Models                        Prec@20  Prec@40  Recall@20  Recall@40  NDCG@20  NDCG@40
Non-KG        BPRMF [9]       0.0994   0.0735   0.1889     0.2705     0.1765   0.2050
              NFM [10]        0.1164   0.0837   0.2296     0.3170     0.2157   0.2461
              TGCN [8]        0.1298   0.1064   0.2456     0.3427     0.2291   0.2693
KG-enhanced   CKE [11]        0.1192   0.0863   0.2318     0.3243     0.2184   0.2506
              CFKG [17]       0.1101   0.0818   0.2214     0.3058     0.2103   0.2421
              RippleNet [4]   0.1254   0.0985   0.2373     0.3305     0.2253   0.2622
              KGAT [5]        0.1282   0.1002   0.2397     0.3384     0.2272   0.2669
Proposed      MVKAN           0.1344   0.1127   0.2485     0.3497     0.2356   0.2781

• Our proposed model outperforms the baselines on all metrics. RippleNet and KGAT outperform CKE and CFKG, which shows the importance of mining multiple types of information in the CKG. The results also show that the MVKAN layer models user and paper representations more effectively than RippleNet's interest-propagation model and KGAT's undifferentiated attention mechanism, thus improving the relevance of the recommendation results.
• In most cases KG-enhanced models outperform non-KG models, but not always; this challenges a model's ability to extract and mine information from the KG. TGCN is the state of the art among tag-aware recommendation models and outperforms CKE, CFKG, and RippleNet on all three metrics. However, since TGCN only builds the tag-user-item graph without integrating further knowledge into a knowledge graph, it misses potential high-order connectivity links and therefore performs below MVKAN.

4.3 Comparison of Ablation Experiments
Comparison of the Effect of KG Construction Methods. We run separate experiments to compare the effect of the CKGs constructed by the two approaches on the recommendation results. In addition, we design a high-order connectivity metric HO_u^k, the probability of visiting other user nodes by performing a k-step random walk in N_u^{rw_k} (Sect. 3.2) starting from node u. The average value HO^k over all user nodes reflects the effective high-order connectivity in the CKG. The results are given in Table 4. On the high-order connectivity metric, citeKExtra significantly outperforms citeUnsup, verifying that constructing the KG by keynote extraction yields more effective high-order connectivity paths. This also corroborates the improvement in the recommendation metrics and enhances the interpretability of the model.

Effect of Different Components of MVKAN. To verify the impact of the different components of MVKAN, we set up three variants of MVKAN and run experiments on citeKExtra.


Table 4. Performance comparison between different KG constructions.

Dataset       HO^3    HO^4    Prec@20  Recall@20  NDCG@20
citeUnsup     0.4439  0.4067  0.1292   0.2413     0.2308
citeKExtra    0.4923  0.4652  0.1344   0.2485     0.2356

Here, w/o CI removes the neighbor connectivity-importance attention, w/o TP removes the time-preference attention, and w/o SC removes the short-cut mechanism. The experimental results are shown in Table 5. All three variants perform worse than MVKAN, which verifies the validity of every part of the model. The short-cut mechanism contributes the most, because it helps capture more potential high-order connectivity. The smallest contribution comes from the time-preference attention, possibly because the time-preference distributions of some users do not follow the locality but are instead uniform or dispersed, degrading the factor to undifferentiated weights.

Table 5. Effect of MVKAN components.

Variants   Prec@20  Recall@20  NDCG@20
w/o CI     0.1305   0.2246     0.2318
w/o TP     0.1319   0.2463     0.2331
w/o SC     0.1292   0.2412     0.2292

5 Conclusion and Future Work
In this paper, we construct a collaborative knowledge graph based on the keynotes of papers and design a knowledge-aware attention framework for paper recommendation, MVKAN, which extends the graph attention and information-aggregation mechanisms with the connectivity importance of neighboring nodes, the time-preference locality of users, and short-cut links between users. Experiments on a public dataset show that our model effectively improves recommendation quality. Finally, we analyze the impact of each component of MVKAN through ablation experiments. In future work, we will further refine the accuracy of the graph modeling and explore the impact of hard-negative sample mining algorithms on model performance.

Acknowledgments. This research was funded by the National Key Research and Development Program of China (No. 2019YFB2101600).


References 1. Guo, Q., Zhuang, F., Qin, C., et al.: A survey on knowledge graph-based recommender systems. IEEE Trans. Knowl. Data Eng. 34(8), 3549–3568 (2020) 2. Kreutz, C.K., Schenkel, R.: Scientific paper recommendation systems: a literature review of recent publications. Int. J. Digit. Libr. 23(4), 335–369 (2022) 3. Gao, Y., Li, Y.F., Lin, Y., et al.: Deep learning on knowledge graph for recommender system: a survey. arXiv preprint arXiv:2004.00387 (2020) 4. Wang, H., Zhang, F., Wang, J., et al.: RippleNet: propagating user preferences on the knowledge graph for recommender systems. In: Proceedings of the 27th ACM International Conference on Information and Knowledge Management, pp. 417–426 (2018) 5. Wang, X., He, X., Cao, Y., et al:. KGAT: knowledge graph attention network for recommendation. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 950–958 (2019) 6. Wang, H., Zhao, M., Xie, X., et al.: Knowledge graph convolutional networks for recommender systems. In: The World Wide Web Conference, pp. 3307–3313 (2019) 7. Ali, Z., Qi, G., Muhammad, K., et al.: Paper recommendation based on heterogeneous network embedding. Knowl. Based Syst. 210, 106438 (2020) 8. Chen, B., Guo, W., Tang, R., et al.: TGCN: tag graph convolutional network for tag-aware recommendation. In: Proceedings of the 29th ACM International Conference on Information & Knowledge Management, pp. 155–164 (2020) 9. Rendle, S., Freudenthaler, C., Gantner, Z., et al.: BPR: Bayesian personalized ranking from implicit feedback. arXiv preprint arXiv:1205.2618 (2012) 10. He, X., Chua, T.S.: Neural factorization machines for sparse predictive analytics. In: Proceedings of the 40th International ACM SIGIR conference on Research and Development in Information Retrieval, pp. 355–364 (2017) 11. Zhang, F., Yuan, N.J., Lian, D., et al.: Collaborative knowledge base embedding for recommender systems. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 353–362 (2016) 12. Wang, Q., Mao, Z., Wang, B., et al.: Knowledge graph embedding: a survey of approaches and applications. IEEE Trans. Knowl. Data Eng. 29(12), 2724–2743 (2017) 13. Renuka, S., Raj Kiran, G.S.S., Rohit, P.: An unsupervised content-based article recommendation system using natural language processing. In: Jeena Jacob, I., Kolandapalayam Shanmugam, S., Piramuthu, S., Falkowski-Gilski, P. (eds.) Data Intelligence and Cognitive Informatics. AIS, pp. 165–180. Springer, Singapore (2021). https://doi.org/10.1007/978-981-158530-2_13 14. Wang, X., Xu, H., et al.: Scholarly paper recommendation via related path analysis in knowledge graph. In: International Conference on Service Science, pp. 36–43. IEEE (2020) 15. Tang, H., Liu, B., Qian, J.: Content-based and knowledge graph-based paper recommendation: Exploring user preferences with the knowledge graphs for scientific paper recommendation. Concur. Comput. Pract. Exp. 33(13), e6227 (2021) 16. Manrique, R., Marino, O.: Knowledge graph-based weighting strategies for a scholarly paper recommendation scenario. KaRS@ RecSys 7, 1–4 (2018) 17. Ai, Q., Azizi, V., Chen, X., et al.: Learning heterogeneous knowledge base embeddings for explainable recommendation. Algorithms 11(9), 137 (2018) 18. Wang, B., Weng, Z., Wang, Y.: A novel paper recommendation method empowered by knowledge graph: for research beginners. arXiv preprint arXiv:2103.08819 (2021) 19. 
Li, X., Chen, Y., Pettit, B., et al.: Personalised reranking of paper recommendations using paper content and user behavior. ACM Trans. Inf. Syst. (TOIS) 37(3), 1–23 (2019)


20. Jin, H., Li, C., Zhang, J., et al.: XLORE2: large-scale cross-lingual knowledge graph construction and application. Data Intell. 1(1), 77–98 (2019) 21. Liu, X., Yin, D., Zheng, J., et al.: OAG-BERT: towards a unified backbone language model for academic knowledge services. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3418–3428 (2022) 22. Ji, G., He, S., Xu, L., et al.: Knowledge graph embedding via dynamic mapping matrix. In: Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pp. 687–696 (2015) 23. Perozzi, B., Al-Rfou, R., Skiena, S.: DeepWalk: online learning of social representations. In: Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 701–710 (2014) 24. Hao, L., Liu, S., Pan, L.: Paper recommendation based on author-paper interest and graph structure. In: 2021 IEEE 24th International Conference on Computer Supported Cooperative Work in Design (CSCWD), pp. 256–261. IEEE (2021) 25. Li, J., Sun, A., Han, J., et al.: A survey on deep learning for named entity recognition. IEEE Trans. Knowl. Data Eng. 34(1), 50–70 (2020) 26. Wang, H., Chen, B., Li, W.J.: Collaborative topic regression with social regularization for tag recommendation. In: 23th International Joint Conference on Artificial Intelligence (2013) 27. Wu, J., Zhu, X., Zhang, C., Hu, Z.: Event-centric tourism knowledge graph—a case study of Hainan. In: Li, G., Shen, H., Yuan, Y., Wang, X., Liu, H., Zhao, X. (eds.) KSEM 2020. LNCS, vol. 12274, pp. 3–15. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-551 30-8_1

Biological Tissue Sections Instance Segmentation Based on Active Learning

Yanan lv1,2, Haoze Jia1,2, Haoran Chen2,3, Xi Chen1(B), Guodong Sun1, and Hua Han3,4(B)

1 Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China
[email protected]
2 School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing 100190, China
3 State Key Laboratory of Multimodal Artificial Intelligence Systems, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China
[email protected]
4 School of Future Technology, University of Chinese Academy of Sciences, Beijing 100190, China

Abstract. Precisely identifying the locations of biological tissue sections on a wafer is the basis for microscopy imaging. However, sections made from different biological tissues differ in shape, so an instance segmentation network trained on an existing dataset may not be suitable for detecting new sections, and building a new dataset is costly. This paper therefore proposes an active learning algorithm for biological tissue section instance segmentation. When facing a new tissue-section segmentation task, the algorithm achieves good results with only a few training images. It adds a loss prediction module to the instance segmentation network, weights the uncertainty of the instance segmentation mask by the posterior category probability, and combines the two into a value for each sample. The most valuable samples are then selected for the training set, so only a small number of samples need to be labeled for the network to reach the expected performance. The algorithm is robust to tissue sections of different shapes and can be applied to various complex scenes to segment tissue sections automatically. Furthermore, experiments show that labeling only 30% of the full training set allows the network to achieve the expected performance. Keywords: Instance Segmentation · biological tissue section · Active Learning

1 Introduction
Serial-section electron microscopy [1, 2] is widely used to study biological tissues at the microscopic scale [3–5]. In the imaging process of a serial-sectioning microscope, the wafer with the attached slices is first imaged to obtain a navigation map of the wafer. As shown in Fig. 1a, each wafer can contain hundreds of tissue slices. The location of each slice is then marked on the navigation map, and the mapping between the pixel


coordinates of the wafer navigation map and the physical coordinates of the microscope stage. Through this mapping, the corresponding relationship between the pixels of the navigation map and the physical coordinate system of the stage is established. Finally, the absolute coordinates of each slice on the wafer on the stage are obtained.

Fig. 1. The wafers of different biological tissue sections. (a) Mouse cortical tissue sections, (b) zebrafish brain tissue sections, (c) mouse testicular tissue sections.

With the continuous development of microscopic imaging technology and the deepening of scientific research, researchers increasingly need to observe microscopic and mesoscopic structures, and the volume of observed samples keeps growing [6–8]. Automatic and accurate segmentation of the sections on a wafer can effectively speed up the microscope imaging process. Our work in this paper is an improvement and continuation of the work of Sun et al. [9]. Sun's Frequency-Aware Instance Segmentation framework (FANet) uses a two-dimensional discrete cosine transform and a multi-frequency channel attention mechanism to realize instance segmentation of sections on the wafer. However, the sections on wafers from different biological tissues are quite different (Fig. 1), and deep-learning-based instance segmentation is primarily data-driven, relying on large datasets from the application domain. Instance segmentation networks trained on existing tissue-section datasets may therefore not be suitable for segmenting new biological tissue sections, and the dataset will likely need to be recreated for each new automatic segmentation task.
Active learning [10–12] uses particular strategies to find the most valuable data among the unlabeled samples and then incorporates them into the training set, so as to obtain the expected model performance with less training data. Research on active learning can be divided into uncertainty-based and distribution-based methods. Uncertainty-based methods measure the uncertainty of the model's predictions on unlabeled data points with various metrics, then select the most informative unlabeled data and add them to the training set so that the network quickly learns helpful information. Lewis et al. [13, 14] employed the posterior probability of the predicted category as a criterion to select informative instances: samples with similar posterior probabilities across categories, i.e., with minimal discrimination among them, were added to the training set. Roth et al. [15] leveraged the difference


between the posterior probabilities of the top two predicted categories as a measure: when the two highest posterior probabilities are close, the model is deemed unable to discriminate such samples, which warrants their inclusion in the training set. Joshi et al. [16], Luo et al. [17], and Settles et al. [18] used entropy as the uncertainty measure for unlabeled samples, and Settles et al. [19] used the predicted potential gradient change as the selection criterion. Distribution-based active learning selects data points with uncertain distributional characteristics from the unlabeled dataset, aiming to reduce labeling effort by choosing the most informative samples based on their distributional properties. Nguyen et al. [20] proposed a clustering algorithm to estimate the underlying distribution of unlabeled samples. Meanwhile, Elhamifar et al. [21], Guo et al., and Yang et al. [22] used discrete optimization to select informative samples based on the learned distributional characteristics. Mac Aodha et al. [23] and Hasan et al. [24] took a context-aware approach, selecting samples that represent the global sample distribution via the distance between samples and their neighbors. Sener et al. [25] formulated active learning as a core-set selection problem. Yoo et al. [10] added a loss prediction module that predicts the loss of the main network on unlabeled samples and selected the samples with the largest predicted loss for labeling. At present, active learning research focuses mainly on image classification [26–30] and object detection [31, 32]; it is rarely applied to instance segmentation. The algorithm proposed in this paper introduces a loss prediction module into FANet and, at the same time, computes the instance segmentation mask uncertainty weighted by the posterior category probability. The predicted loss is combined with the weighted mask uncertainty as the final value of each sample, enabling the instance segmentation network to reach the expected performance with only a small number of images.

2 Methodology
To address the difficulty instance segmentation networks face in adapting to novel data, and the substantial manual effort required to rebuild datasets, we propose an active learning algorithm for biological tissue section instance segmentation. The algorithm seeks to select the most effective samples for training, so that only a small number of images are needed to achieve the expected results. It improves on prior work in two respects. First, a loss prediction module is added to the main instance segmentation network to predict its loss. Second, the uncertainty of the instance segmentation mask is weighted by the posterior category probability. Finally, by combining the predicted loss and the uncertainty indicator, samples that the network has not yet learned well are selected from the sample pool, labeled, and added to the training set for further training. The goal is for the instance segmentation network to reach the expected performance with only a small number of training samples.


2.1 Loss Prediction Module
During training, samples that the model has not sufficiently learned yield outputs that deviate substantially from the ground truth, leading to a pronounced increase in the associated loss. The structure of the loss prediction module follows the method of [10]; its architecture is shown in Fig. 2.

Fig. 2. The architecture of the loss prediction module. GAP denotes Global Average Pooling and FC denotes Fully Connected layer.
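A hedged PyTorch-style sketch of a loss-prediction head in the spirit of Fig. 2 and [10], together with the joint objective of Eq. (1) below; the channel sizes and hidden width are illustrative assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn

class LossPredictionHead(nn.Module):
    """GAP + FC on each intermediate feature map, concatenate, then a final FC to a scalar."""
    def __init__(self, in_channels=(256, 512, 1024), mid_dim=128):
        super().__init__()
        self.gap = nn.AdaptiveAvgPool2d(1)
        self.fcs = nn.ModuleList([nn.Linear(c, mid_dim) for c in in_channels])
        self.out = nn.Linear(mid_dim * len(in_channels), 1)

    def forward(self, feature_maps):
        parts = []
        for f, fc in zip(feature_maps, self.fcs):
            parts.append(torch.relu(fc(self.gap(f).flatten(1))))
        return self.out(torch.cat(parts, dim=1)).squeeze(1)  # predicted loss per image

def joint_loss(main_loss, predicted_loss, lam=1.0):
    """Eq. (1): L = l + lambda * MSE(l_hat, l); the target loss is detached from the graph."""
    return main_loss.mean() + lam * nn.functional.mse_loss(predicted_loss, main_loss.detach())
```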

The loss prediction module takes the features from the middle blocks of the main network as input and outputs a predicted loss \hat{l}. It takes the training loss l of the main network as the ground truth and computes its own training loss as L_{loss} = MSE(\hat{l}, l). The training loss of the whole network is then defined as

L = l + \lambda \cdot L_{loss}(\hat{l}, l)    (1)

where \lambda is a positive constant balancing the loss of the main network and that of the loss prediction module.

2.2 Uncertainty of Instance Segmentation Result
The uncertainty of the network output reflects how well the network has learned the input sample: a higher uncertainty indicates that the network has no confidence in the input


and has not fully learned it. In this paper, the output uncertainty is quantified by combining the segmentation mask with the classification result: the instance segmentation mask uncertainty, weighted by the category posterior probability, serves as the metric of sample uncertainty and enables the network to select informative samples effectively. During inference, given an input image x \in R^{H \times W}, the network outputs y = {(y^i_{cls}, y^i_{prob}, y^i_{bbox}, y^i_{mask})}, i \in {1, 2, ..., n}, where n is the number of instances detected in the sample and y_{cls}, y_{prob}, y_{bbox} are the category, the posterior probability, and the bounding box of an instance. y_{mask} is the segmentation mask of the instance, a normalized grayscale image in which each pixel value is the posterior probability that the pixel belongs to the category. We define the uncertainty of the instance segmentation mask as U^i:

U^i = \sum_{u=0}^{H-1} \sum_{v=0}^{W-1} H\big(m^i_{u,v}\big)    (2)

where H is the image height, W is the image width, and m^i_{u,v} is the value of y^i_{mask} at pixel (u, v). The function H is

H(p) = -p \log p - (1 - p) \log(1 - p)    (3)

To reflect the uncertainty of the network output more fully, we combine the mask uncertainty of Eq. (2) with the corresponding category posterior probability of each instance, and define the uncertainty of the network output U^y as

U^y = \sum_{i=0}^{n-1} \frac{1}{p_i} \cdot U^i    (4)
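A NumPy sketch of Eqs. (2)–(4): per-pixel binary entropy summed over each normalised mask and weighted by the inverse posterior probability; the array layouts are assumptions.

```python
import numpy as np

def mask_entropy(mask, eps=1e-12):
    """Eqs. (2)-(3): sum of per-pixel binary entropies of a normalised mask in [0, 1]."""
    p = np.clip(mask, eps, 1.0 - eps)
    return np.sum(-p * np.log(p) - (1.0 - p) * np.log(1.0 - p))

def image_uncertainty(masks, probs):
    """Eq. (4): mask entropies weighted by the inverse of the posterior probability p_i."""
    return sum(mask_entropy(m) / p for m, p in zip(masks, probs))
```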

2.3 Active Learning Strategy for Instance Segmentation
Given the output of the loss prediction module and the uncertainty of the instance segmentation result, the value of an input image x, denoted V_x, is computed as

V_x = \hat{l} + \eta \cdot U^y    (5)

where \hat{l} is the predicted loss of the loss prediction module, U^y is the uncertainty of the network output, and \eta is a positive constant. The active learning strategy for the instance segmentation network proceeds as follows:
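The selection procedure described here and in the experimental setup of Sect. 3.2 can be sketched roughly as follows; label_fn, train_fn, and value_fn are hypothetical hooks standing in for annotation, network training, and computing V_x (Eq. 5), and the 6%/30% fractions follow the experiments.

```python
import random

def active_learning_loop(pool, label_fn, train_fn, value_fn,
                         init_frac=0.06, step_frac=0.06, budget_frac=0.30):
    """Iteratively select the highest-value sub-images from the pool for labeling."""
    pool = list(pool)
    labeled = set(random.sample(pool, max(1, int(init_frac * len(pool)))))
    model = train_fn([label_fn(x) for x in labeled])
    while len(labeled) < budget_frac * len(pool):
        # Rank the remaining pool by V_x and add the top slice to the training set.
        remaining = sorted((x for x in pool if x not in labeled),
                           key=lambda x: value_fn(model, x), reverse=True)
        labeled.update(remaining[:max(1, int(step_frac * len(pool)))])
        model = train_fn([label_fn(x) for x in labeled])
    return model, labeled
```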


3 Experimental Results
3.1 Dataset
To train and evaluate the algorithm, we build a benchmark dataset of biological tissue sections on a wafer. A wafer image of mouse testicular tissue was selected for the experiment, as shown in Fig. 1c. The sections are divided into three categories: incomplete sections (Fig. 3a), broken sections (Fig. 3b), and normal sections (Fig. 3c). The wafer image is cropped into 1024 × 1024 sub-images with 10% overlap between them. After labeling the sub-images with Labelme [33], the data are divided into a training set of 972 sub-images and a validation set of 647 sub-images.

Fig. 3. Different categories of mouse testicular tissue sections in dataset. (a) Incomplete section, (b) broken sections, (c) normal section.


3.2 Results
To evaluate the efficacy of the proposed algorithm, we compare it with a baseline method (random selection), the entropy-based method [17], and the loss-prediction-based method [10]. Experiments are implemented in mmdetection [34]. All models are trained with stochastic gradient descent (SGD) on a single Tesla P40 GPU. In the experiments, the 972 sub-images of the full training set form the sample pool, and 6% of them are randomly selected as the initial training set. After training converges, inference is performed on all sub-images in the pool; the sub-images are then ranked in descending order of value, and the top 6% are added to the training set, which is used to continue training. This loop repeats until the termination condition is reached. The results of the instance segmentation network on the validation set after each training round are shown in Fig. 4. We use average precision (AP) as the evaluation metric.

Fig. 4. The efficiency of four active learning algorithms on the validation set. (a) The instance segmentation network is Mask Scoring R-CNN [35] with FANet as its backbone, (b) The instance segmentation network is Mask R-CNN [36] with FANet as its backbone.

In Fig. 4, since the initial training set is the same for all methods, the average precision (AP) of all compared algorithms on the validation set is almost identical at the start. Taking random sample selection as the baseline, the entropy-based method, the loss-prediction-based method, and the proposed method all outperform the baseline. Notably, the proposed method performs better than the other methods, especially as the training set grows. Experiments on both Mask Scoring R-CNN [35] and Mask R-CNN [36] with FANet as the backbone show that the proposed active learning method is more effective at selecting valuable sub-images for the instance segmentation network.


When active learning is used to train the network, the ideal performance is, in theory, that obtained by training with all data in the sample pool. Table 1 shows this ideal performance together with the results of the instance segmentation network after each training round. The method proposed in this paper uses less than 30.0% of the sub-images of the full training set, yet its average precision (AP) on the validation set is close to the ideal performance. Labeling only 30.0% of the dataset brings the instance segmentation network to more than 91.0% of the ideal performance, which indicates that the proposed active learning algorithm selects sub-images effectively and greatly reduces the human time and effort spent on data labeling.

Table 1. Comparison of average precision (AP) of four active learning methods on different proportions of the full training set.

Method                                 AP (%) on proportion of the full training set
                                       6      12     18     24     30     100
Mask Scoring R-CNN [35] with FANet
  Random                               34.9   49.9   55.1   56.5   55.8   63.5
  Entropy [17]                         34.8   55.8   53.2   58.6   58.1
  Learn Loss [10]                      34.7   55.5   55.7   58.6   58.8
  ours                                 34.8   54.9   55.8   60.6   59.9
Mask R-CNN [36] with FANet
  Random                               27.6   49.0   53.6   53.7   50.9
  Entropy [17]                         27.5   32.1   51.1   55.3   53.4
  Learn Loss [10]                      27.5   41.0   54.4   59.0   57.8
  ours                                 27.5   44.5   54.2   62.8   61.9   67.7
The results of each active learning method applied to the instance segmentation networks on the validation set are shown in Fig. 5. Whether the backbone is Mask Scoring R-CNN or Mask R-CNN with FANet, the network trained with the proposed active learning method performs better on the tissue sections. When sections stick together, it is challenging for the instance segmentation network to distinguish the boundaries between them, causing the segmentation masks to spread across sections; this indicates that the network has not learned enough information during training. The proposed active learning algorithm makes the training set contain more information, so the network performs better.


Fig. 5. The results of each active learning method applied to instance segmentation network on the validation set. (a) Mask Scoring R-CNN with FANet as its backbone, (b) Mask R-CNN with FANet as its backbone.


4 Conclusion
Deep-learning-based neural networks are mainly data-driven, and their excellent performance in application tasks depends on the support of extensive datasets. Because sections made from different biological tissues differ, an existing instance segmentation network may not be suitable for detecting new sections. When tackling automatic segmentation of new biological tissue sections, it may be necessary to rebuild the dataset and retrain or fine-tune the network, yet producing datasets requires a great deal of human time and effort. In this paper, we propose an active learning algorithm for biological tissue section instance segmentation. The algorithm improves its effectiveness in two ways: it adds a loss prediction module to the instance segmentation network to predict a loss for each input sample, thereby identifying samples that are more helpful for training; and it weights the uncertainty of the instance segmentation mask by the posterior category probability to determine the most valuable images for learning. By selecting the most valuable samples for the training set, only a small number of samples need to be labeled for the network to achieve the expected performance. Experimental results show that the proposed algorithm performs significantly better than the compared algorithms on the validation set, especially as the training set grows, and that labeling only 30.0% of the full training set brings the average precision on the validation set close to the ideal performance. In conclusion, the proposed active learning algorithm achieves good results with only a few training images when facing a new biological tissue section segmentation task.

Funding. This work has been supported by the Scientific research instrument and equipment development project of the Chinese Academy of Sciences [YJKYYQ20210022 to H.H.].

References 1. Harris, K.M., Perry, E., Bourne, J., et al.: Uniform serial sectioning for transmission electronmicroscopy. J. Neurosci. 26(47), 12101–12103 (2006) 2. Hayworth, K.J., Morgan, J.L., Schalek, R., et al.: Imaging ATUM ultrathin section libraries with wafer mapper: a multi-scale approach to EM reconstruction of neural circuits. Front. Neural Circuits 8, 68 (2014) 3. Shapson-Coe, A., Januszewski, M., Berger, D.R., et al.: A connectomic study of a petascale fragment of human cerebral cortex. BioRxiv (2021) 4. Li, P.H., Lindsey, L.F., Januszewski, M., et al.: Automated reconstruction of a serial-section EM drosophila brain with flood-filling networks and local realignment. Microsc. Micro Anal. 25(S2), 1364–1365 (2019) 5. Vishwanathan, A., Ramirez, A.D., Wu, J., et al.: Modularity and neural coding from a brainstem synaptic wiring diagram. BioRxiv (2021) 6. Hildebrand, D.G.C., Cicconet, M., Torres, R.M., et al.: Whole-brain serial-section electron microscopy in larval zebrafish. Nature 545(7654), 345–349 (2017)


7. Yin, W., Brittain, D., Borseth, J., et al.: A petascale automated imaging pipeline for mapping neuronal circuits with high-throughput transmission electron microscopy. Nat. Commun. 11(1), 1–12 (2020) 8. Larsen, N.Y., Li, X., Tan, X., et al.: Cellular 3D-reconstruction and analysis in the human cerebral cortex using automatic serial sections. Commun. Biol. 4(1), 1–15 (2021) 9. Sun, G., Wang, Z., Li, G., Han, H.: Robust frequency-aware instance segmentation for serial tissue sections. In: Wallraven, C., Liu, Q., Nagahara, H. (eds.) ACPR 2021. LNCS, vol. 13188, pp. 379–389. Springer, Cham (2022). https://doi.org/10.1007/978-3-031-02375-0_28 10. Yoo, D., Kweon, I.S.: Learning loss for active learning. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 93–102 (2019) 11. Sinha, S., Ebrahimi, S., Darrell, T.: Variational adversarial active learning. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 5972–5981 (2019) 12. Tran, T., Do, T.T., Reid, I., et al.: Bayesian generative active deep learning. In: International Conference on Machine Learning, pp. 6295–6304. PMLR (2019) 13. Lewis, D.D., Catlett, J.: Heterogeneous uncertainty sampling for supervised learning. In: Machine Learning Proceedings 1994, pp. 148–156. Elsevier (1994) 14. Lewis, D.D., Gale, W.A.: A sequential algorithm for training text classifiers. In: Croft, B.W., van Rijsbergen, C.J. (eds.) SIGIR 1994, pp. 3–12. Springer, London (1994). https://doi.org/ 10.1007/978-1-4471-2099-5_1 15. Roth, D., Small, K.: Margin-based active learning for structured output spaces. In: Fürnkranz, J., Scheffer, T., Spiliopoulou, M. (eds.) ECML 2006. LNCS, vol. 4212, pp. 413–424. Springer, Heidelberg (2006). https://doi.org/10.1007/11871842_40 16. Joshi, A.J., Porikli, F., Papanikolopoulos, N.: Multi-class active learning for image classification. In: 2009 IEEE Conference on Computer Vision and Pattern Recognition, pp. 2372–2379. IEEE (2009) 17. Luo, W., Schwing, A., Urtasun, R.: Latent structured active learning. In: Advances in Neural Information Processing Systems, 26 (2013) 18. Settles, B., Craven, M.: An analysis of active learning strategies for sequence labeling tasks. In: Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing, pp. 1070–1079 (2008) 19. Settles, B., Craven, M., Ray, S.: Multiple-instance active learning. In: Advances in Neural Information Processing Systems, 20 (2007) 20. Nguyen, H.T., Smeulders, A.: Active learning using pre-clustering. In: Proceedings of the Twenty-First International Conference on Machine Learning, p. 79 (2004) 21. Elhamifar, E., Sapiro, G., Yang, A., et al.: A convex optimization framework for active learning. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 209–216 (2013) 22. Yang, B., Bender, G., Le, Q.V., et al.: CondConv: conditionally parameterized convolutions for efficient inference. In: Advances in Neural Information Processing Systems, 32 (2019) 23. Mac Aodha, O., Campbell, N.D., Kautz, J., et al.: Hierarchical subquery evaluation for active learning on a graph. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 564–571 (2014) 24. Hasan, M., Roy-Chowdhury, A.K.: Context aware active learning of activity recognition models. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 4543– 4551 (2015) 25. Sener, O., Savarese, S.: Active learning for convolutional neural networks: a core-set approach. 
arXiv preprint arXiv:1708.00489 (2017) 26. Tang, Y.P., Huang, S.J.: Self-paced active learning: query the right thing at the right time. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33, pp. 5117–5124 (2019)


27. Beluch, W.H., Genewein, T., Nürnberger, A., et al.: The power of ensembles for active learning in image classification. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 9368–9377 (2018) 28. Lin, L., Wang, K., Meng, D., et al.: Active self-paced learning for cost-effective and progressive face identification. IEEE Trans. Pattern Anal. Mach. Intell. 40(1), 7–19 (2017) 29. Liu, Z.Y., Huang, S.J.: Active sampling for open-set classification without initial annotation. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33, pp. 4416–4423 (2019) 30. Liu, Z.Y., Li, S.Y., Chen, S., et al.: Uncertainty aware graph Gaussian process for semisupervised learning. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 34, pp. 4957–4964 (2020) 31. Aghdam, H.H., Gonzalez-Garcia, A., van de Weijer, J., et al.: Active learning for deep detection neural networks. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 3672–3680 (2019) 32. Zhang, B., Li, L., Yang, S., et al.: State-relabeling adversarial active learning. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8756–8765 (2020) 33. Wada, K.: Labelme: Image Polygonal Annotation with Python. https://github.com/wkentaro/ labelme. https://doi.org/10.5281/zenodo.5711226 34. Chen, K., Wang, J., Pang, J., et al.: Mmdetection: Open MMLab detection toolbox and benchmark. arXiv preprint arXiv:1906.07155 (2019) 35. Huang, Z., Huang, L., Gong, Y., et al.: Mask scoring R-CNN. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 6409–6418 (2019) 36. He, K., Gkioxari, G., Dollár, P., et al.: Mask R-CNN. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 2961–2969 (2017)

Research on Automatic Segmentation Algorithm of Brain Tumor Image Based on Multi-sequence Self-supervised Fusion in Complex Scenes

Guiqiang Zhang1(B), Jianting Shi2, Wenqiang Liu3(B), Guifu Zhang4, and Yuanhan He4

1 Anhui Institute of Information Engineering, Wuhu 241009, China
[email protected]
2 College of Computer and Software Engineering, Heilongjiang University of Science and Technology, Harbin 150020, China
3 School of Information and Electronic Engineering, Zhejiang Gongshang University, Hangzhou 310018, China
[email protected]
4 State Grid Anhui Ultra High Voltage Company, Hefei 230041, China

Abstract. Brain tumors play a crucial role in medical diagnosis and treatment planning. Extracting tumor information from MRI images is essential but can be challenging due to the limitations and intricacy of manual delineation. This paper presents a brain tumor image segmentation framework that addresses these challenges by leveraging multiple sequence information. The framework consists of encoder, decoder, and data fusion modules. The encoder incorporates Bi-ConvLSTM and Transformer models, enabling comprehensive utilization of both local and global details in each sequence. The decoder module employs a lightweight MLP architecture. Additionally, we propose a data fusion module that integrates self-supervised multi-sequence segmentation results. This module learns the weights of each sequence prediction result in an end-to-end manner, ensuring robust fusion results. Experimental validation on the BRATS 2018 dataset demonstrates the excellent performance of the proposed automatic segmentation framework for brain tumor images. Comparative analysis with other multi-sequence fusion segmentation models reveals that our framework achieves the highest Dice score in each region. To provide a more comprehensive background, it is important to highlight the significance of brain tumors in medical diagnosis and treatment planning. Brain tumors can have serious implications for patients, affecting their overall health and well-being. Accurate


segmentation of brain tumors from MRI images is crucial for assessing tumor size, location, and characteristics, which in turn informs treatment decisions and prognosis. Currently, manual delineation of brain tumors from MRI images is a time-consuming and labor-intensive process prone to inter-observer variability. Automating this segmentation task using advanced image processing techniques can significantly improve efficiency and reliability.

Keywords: Medical Image Segmentation · brain tumor · Multiple sequences · self-supervised fusion · Transformer

This research was funded by Scientific Research Fund of Zhejiang Provincial Education Department, Grant/Award Number: Y202147323, National Natural Science Foundation of China under Grant NSFC-61803148, basic scientific research business cost project of Heilongjiang provincial undergraduate universities in 2020 (YJSCX2020212HKD) and basic scientific research business cost project of Heilongjiang provincial undergraduate universities in 2022 (2022-KYYWF-0565).

1 Introduction

In recent years, the rapid development of various imaging technologies has revolutionized the field of medical imaging, providing a crucial reference for clinical diagnosis and advancing scientific research. The effective utilization of medical images begins with the precise segmentation of tissues. Segmentation involves extracting the region of interest from the entire image, enabling doctors to focus on diagnostically significant areas while disregarding irrelevant regions. Accurate segmentation plays a pivotal role in facilitating doctors' analysis of medical conditions and reducing the risk of misdiagnosis. However, the current practice of manually delineating each slice in an MRI head image heavily relies on the subjective judgment of doctors and is a time-consuming task. Moreover, the repetitive nature of this high-intensity work often leads to fatigue-induced errors. To overcome these challenges, the field of computer vision has witnessed the emergence of automatic segmentation techniques for medical images, attracting significant attention from researchers. Existing segmentation methods encompass both traditional vision algorithms and cutting-edge deep learning-based approaches. Traditional image processing methods for segmentation rely on handcrafted features such as geometry, grayscale, and texture. However, in recent years, deep learning has emerged as a dominant paradigm, surpassing traditional methods in terms of segmentation performance. Notably, deep learning-based brain tumor segmentation methods can be categorized into three main categories:

CNN-based models: Convolutional Neural Networks (CNNs) have become a cornerstone in medical image segmentation, with various architectures designed specifically for this task. These architectures include single-channel CNNs [1], multi-channel CNNs [2], and cascade frameworks [3]. Among them, the UNet architecture [4] has gained wide recognition for its effectiveness in preserving detailed information by incorporating skip connections and shorter paths.

RNN-based models: Recurrent Neural Networks (RNNs), particularly the Long Short-Term Memory (LSTM) network, have been successfully applied to image classification and segmentation by Graves et al. [5]. Building upon this foundation, Le et al. [6] combined CNNs with RNNs to tackle the challenging task of tumor segmentation.

Transformer-based models: The recent integration of Transformers into medical image segmentation has sparked significant advancements in the field. TransUNet [7] stands as the pioneering model that successfully combines the power

30

G. Zhang et al.

of Transformers and CNNs for medical image segmentation. Swin-UNet [8], on the other hand, represents the first pure Transformer architecture for medical image segmentation, leveraging the strengths of Swin Transformers in both the encoder and decoder stages [9]. Within the realm of 3D MR brain tumor image segmentation, TransBTS [10] innovatively embeds Transformers into the U-Net architecture by connecting the Transformer encoder to the bottom, enabling the capture of long-distance feature dependencies within the global receptive field. Furthermore, Jia et al. [11] modified TransBTS, introducing BiTr-UNet that incorporates Transformer encoders at two different scales. Additionally, NVIDIA proposed UNETR [12], a model specifically designed for processing 3D medical images. UNETR utilizes a 12-layer Transformer encoder as a feature extractor, seamlessly integrating it with the decoder strategy of UNet through progressive upsampling and skip connections for effective feature fusion and size recovery. Despite the remarkable achievements of existing methods in brain tumor segmentation, there exists a considerable opportunity for further improvement. Most existing models are built upon a single architecture, which limits their capacity to extract nested structural features of brain tumors. Moreover, current Transformer based methods often overlook the potential benefits of extracting distinct feature information from different sequences commonly present in brain tumor images. Addressing these challenges, this paper proposes a novel multi-sequence feature self-supervised fusion segmentation network (MSSFN) that synergistically integrates the strengths of CNNs, LSTMs, and Transformers. The structure of this paper is as follows: the first section provides an insightful introduction and establishes the research background; the second section presents the novel brain tumor segmentation framework proposed in this paper, showcasing its rational design and innovative features; the third section meticulously details the experimental setup, conducted experiments, and insightful analysis of the obtained results; finally, the fourth section presents a comprehensive summary of the entire paper, highlighting the key findings and potential avenues for future research.

2 Methods

2.1 Motivation for Model Design

The majority of existing medical image segmentation models tackle the multi-task segmentation of multiple tumor tissues by designing deeper network layers. This approach decouples a complex segmentation task into several subtasks, which are then cascaded together. However, a drawback of such cascaded networks is that the training process is not end-to-end, which reduces efficiency in both training and inference. Given that the brain mainly consists of soft tissues such as gray matter, white matter, cerebrospinal fluid, cerebellum, and brainstem, there is significant morphological variation among these tissues. Although decomposing the multi-classification task into subtasks reduces the overall task complexity, achieving a local optimum for each subtask does not guarantee the global optimum.


To address these challenges, we propose a parallel end-to-end model. This model independently performs multi-task learning on different sequences, converting the multi-region tumor sub-segmentation problems into multiple parallel segmentation tasks. For feature extraction from each sequence, we use a feature extractor that combines CNN, RNN, and Transformer components. This helps the model focus on different regions of interest in the various modalities, thereby alleviating the problem of class imbalance. Finally, we incorporate the idea of ensemble learning to fuse the outputs of the different sub-modules, overcoming incomplete outputs caused by noise or other factors during segmentation and enhancing the robustness of the final results.

2.2 Multi-sequence Self-supervised Fusion Segmentation Network

Our proposed multi-sequence feature self-supervised fusion segmentation network leverages the complementary information of multiple MR images of the same patient acquired with different modalities. The network architecture, illustrated in Fig. 1, follows a U-Net-like encoding-decoding structure. The encoder performs feature extraction and incorporates ConvLSTM and Transformer modules: the ConvLSTM module captures spatial dependencies by combining convolutional operations with LSTM, allowing the network to exploit the local detail information of each sequence, while the Transformer module captures long-range dependencies and global information. This combination enables the network to effectively utilize both local and global information from the multi-sequence inputs. For the decoder, we adopt a lightweight MLP module that decodes the extracted features into segmentation results; the lightweight design strikes a balance between model complexity and segmentation performance. Additionally, we introduce an unsupervised network that evaluates the quality of the segmentation results obtained from the different sequences and dynamically learns the weight associated with each sequence's result. This self-supervised learning approach optimizes the fusion process and yields more accurate and robust final segmentation outputs.
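To make the data flow of Fig. 1 concrete, the following PyTorch sketch wires one encoder-decoder branch per MRI sequence to a learnable slice-wise fusion step. The class and argument names (MSSFNSketch, make_branch, depth) are illustrative assumptions, as is the softmax used here to keep the fusion weights non-negative; the paper's exact implementation may differ.

```python
import torch
import torch.nn as nn

class MSSFNSketch(nn.Module):
    def __init__(self, make_branch, n_sequences: int = 4, depth: int = 96):
        super().__init__()
        # One encoder-decoder branch per MRI sequence (T1, T2, FLAIR, T1ce).
        self.branches = nn.ModuleList(make_branch() for _ in range(n_sequences))
        # Learnable per-sequence, per-slice fusion weights (one 1 x 1 x D vector per sequence).
        self.fusion_weights = nn.Parameter(torch.zeros(n_sequences, depth))

    def forward(self, sequences):
        # sequences: list of n_sequences tensors, each of shape (B, 1, D, H, W)
        preds = torch.stack([branch(x) for branch, x in zip(self.branches, sequences)])
        # One possible choice for keeping the weights non-negative and normalized across sequences.
        w = torch.softmax(self.fusion_weights, dim=0)                         # (S, D)
        fused = (w.view(w.shape[0], 1, 1, w.shape[1], 1, 1) * preds).sum(dim=0)
        return preds, fused                                                   # per-sequence and fused maps
```

Here `make_branch` would return an encoder-decoder of the kind described in Sects. 2.3 and 2.4, and the fusion step corresponds to the self-supervised head of Sect. 2.5.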

2.3 Backbone: Bi-ConvLSTM

MRI brain tumor images are three-dimensional. To make full use of the volumetric data and reduce information loss, a suitable backbone is required. In this study, we employ a Bi-ConvLSTM module, which couples convolutional kernels with LSTM, as the backbone. The convolutional kernel exploits the two-dimensional information within each slice, while the LSTM captures information across slices. The standard ConvLSTM considers only forward dependencies, but the reverse dependencies also contain valuable information.


Fig. 1. The structure diagram of Multi-sequence self-supervised fusion segmentation network.

Therefore, in this paper, both the forward and reverse hidden states are taken as inputs, and the output is computed as

    Y_t = \tanh\left( W_{y\vec{h}} * \vec{h}_t + W_{y\overleftarrow{h}} * \overleftarrow{h}_t + b \right)    (1)

where \vec{h}_t and \overleftarrow{h}_t are the hidden-state tensors of the forward and reverse passes, b is the bias, and Y_t is the final output; the hyperbolic tangent combines the forward and reverse outputs in a nonlinear manner. The framework of Bi-ConvLSTM is shown in Fig. 2.

Fig. 2. Bi-ConvLSTM structure diagram.
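As one concrete reading of Eq. (1), the sketch below fuses the forward and reverse hidden states with two convolutions and a tanh. The ConvLSTM layers that would produce these hidden states along the slice axis are not shown, and the 3 × 3 kernel size and channel arguments are assumptions.

```python
import torch
import torch.nn as nn

class BiConvLSTMFusion(nn.Module):
    """Combine forward/reverse ConvLSTM hidden states as in Eq. (1)."""
    def __init__(self, hidden_channels: int, out_channels: int):
        super().__init__()
        # W_{yh} and W_{yh-bar} of Eq. (1), realized as convolutions; the single bias b
        # is attached to the forward branch only.
        self.w_fwd = nn.Conv2d(hidden_channels, out_channels, kernel_size=3, padding=1)
        self.w_bwd = nn.Conv2d(hidden_channels, out_channels, kernel_size=3, padding=1, bias=False)

    def forward(self, h_fwd: torch.Tensor, h_bwd: torch.Tensor) -> torch.Tensor:
        # h_fwd, h_bwd: per-slice hidden states of the forward/reverse passes, (B, C, H, W)
        return torch.tanh(self.w_fwd(h_fwd) + self.w_bwd(h_bwd))
```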

2.4 Neck: Transformer+MLP

The Transformer architecture has shown great success in natural language processing. In this paper, we leverage its attention mechanism for the brain tumor image segmentation task as the neck of our overall model. The main idea is to transform the feature map into a "sequence", similar to text data, by dividing it into smaller blocks. For this purpose, we use a Transformer together with an MLP as the neck component: the Transformer module captures global information and enhances segmentation accuracy, while the MLP module decodes the feature maps into segmentation results.

At the core of the Transformer lies the self-attention mechanism, which calculates the semantic relationships between different blocks of a feature map. We employ a multi-head self-attention module, illustrated in Fig. 3, to capture and integrate both local and global dependencies. By incorporating the Transformer+MLP neck into our segmentation framework, the model can exploit the attention mechanism to capture long-range dependencies and semantic associations within brain tumor images, leading to more accurate and precise segmentation. Figure 3 shows the multi-head self-attention module, which calculates attention weights for each position within the feature map, enabling the model to focus on relevant regions and extract meaningful features.

Fig. 3. Multi-head attention module.
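The following sketch illustrates the block-division step described above, turning a 3D encoder feature map into a token sequence for the Transformer neck. The patch size and the reshape-based implementation are assumptions, not the paper's exact procedure.

```python
import torch

def feature_map_to_tokens(feat: torch.Tensor, patch: int = 2) -> torch.Tensor:
    """feat: (B, C, D, H, W) encoder feature map -> (B, N, C * patch**3) token sequence."""
    b, c, d, h, w = feat.shape
    assert d % patch == 0 and h % patch == 0 and w % patch == 0
    x = feat.reshape(b, c, d // patch, patch, h // patch, patch, w // patch, patch)
    x = x.permute(0, 2, 4, 6, 1, 3, 5, 7)      # bring the patch-grid axes to the front
    return x.reshape(b, -1, c * patch ** 3)     # one token per 3D block
```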

The multi-head attention mechanism learns features in different representation subspaces. Assuming the Transformer module has H attention heads, then for a layer-normalized feature-map sequence X = [x_1, x_2, \dots, x_n], each attention head of the self-attention module computes a Query, Key, and Value for every sequence element x_i.


The component vectors of these three matrices are obtained by multiplying trainable weight matrices with the normalized block-sequence elements, which increases the network's capacity to extract latent features from the feature-map slice sequence:

    q_i^{(h)} = W^{(Q)}_{(h)} x_i, \quad k_i^{(h)} = W^{(K)}_{(h)} x_i, \quad v_i^{(h)} = W^{(V)}_{(h)} x_i    (2)

The query vector q, key vector k, and value vector v of each feature-map block are thus linear maps of the original feature-map slice vectors. After each attention head obtains the q, k, and v vectors of each block, the inner product of the key vector and the query vector gives the corresponding attention coefficient. Similar to the commonly used cosine similarity, the Transformer's attention mechanism can be viewed mathematically as a measure of the similarity between two vectors:

    \alpha_{i,j}^{(h)} = \frac{(k_j^{(h)})^{T} \cdot q_i^{(h)}}{\sqrt{d_k}}    (3)

where d_k is the scale factor (the dimension of q_i and k_j). Next, each row of the attention matrix is exponentially normalized with the softmax function:

    \hat{\alpha}_{i,j}^{(h)} = \frac{\exp\left(\alpha_{i,j}^{(h)}\right)}{\sum_{k=1}^{n} \exp\left(\alpha_{i,k}^{(h)}\right)}    (4)

The normalized attention coefficients are then used to weight and sum the value vectors of each block, giving the attention vector

    b_i^{(h)} = \sum_{j=1}^{n} \hat{\alpha}_{i,j}^{(h)} v_j^{(h)}    (5)

where b_i^{(h)} is the i-th output vector of the h-th attention head of the Transformer. Equations (3) and (5) can also be written in matrix form:

    B^{(h)} = \mathrm{softmax}\!\left( \frac{Q^{(h)} \left(K^{(h)}\right)^{T}}{\sqrt{d_k}} \right) V^{(h)}    (6)

Finally, the output vectors of all H heads are concatenated in order and multiplied by an output matrix W_O to obtain the output vector corresponding to the i-th input block vector x_i:

    b_i = W_O \left[ b_i^{(1)}; \, b_i^{(2)}; \, \cdots; \, b_i^{(H)} \right]    (7)
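A minimal PyTorch implementation of the multi-head self-attention of Eqs. (2)-(7) is sketched below; the head count and dimensions are illustrative, and PyTorch's built-in nn.MultiheadAttention realizes the same computation.

```python
import math
import torch
import torch.nn as nn

class MultiHeadSelfAttention(nn.Module):
    def __init__(self, dim: int, n_heads: int):
        super().__init__()
        assert dim % n_heads == 0
        self.n_heads, self.d_k = n_heads, dim // n_heads
        self.w_q = nn.Linear(dim, dim)   # all heads' W^(Q), Eq. (2)
        self.w_k = nn.Linear(dim, dim)   # all heads' W^(K)
        self.w_v = nn.Linear(dim, dim)   # all heads' W^(V)
        self.w_o = nn.Linear(dim, dim)   # W_O, Eq. (7)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, N, dim) sequence of feature-map blocks
        b, n, _ = x.shape
        split = lambda t: t.view(b, n, self.n_heads, self.d_k).transpose(1, 2)  # (B, H, N, d_k)
        q, k, v = split(self.w_q(x)), split(self.w_k(x)), split(self.w_v(x))
        attn = (q @ k.transpose(-2, -1)) / math.sqrt(self.d_k)   # Eqs. (3)/(6)
        attn = attn.softmax(dim=-1)                              # Eq. (4)
        out = attn @ v                                           # Eq. (5)
        out = out.transpose(1, 2).reshape(b, n, -1)              # concatenate the H heads
        return self.w_o(out)                                     # Eq. (7)
```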


Assuming that the input of the l-th encoder is z_{l-1}, the input-output relationship of the first L-1 encoders can be expressed as:

    z_l' = z_{l-1} + \mathrm{LN}\left[ \mathrm{SA}(z_{l-1}) \right], \qquad z_l = z_l' + \mathrm{LN}\left[ \mathrm{MLP}(z_l') \right]    (8)
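The encoder layer of Eq. (8) can be sketched as follows; the GELU activation and the MLP expansion ratio are assumptions, while the residual-plus-LN placement mirrors the formula above.

```python
import torch.nn as nn

class EncoderLayer(nn.Module):
    def __init__(self, dim: int, n_heads: int, mlp_ratio: int = 4):
        super().__init__()
        self.sa = nn.MultiheadAttention(dim, n_heads, batch_first=True)
        self.ln1 = nn.LayerNorm(dim)
        self.mlp = nn.Sequential(nn.Linear(dim, mlp_ratio * dim), nn.GELU(),
                                 nn.Linear(mlp_ratio * dim, dim))
        self.ln2 = nn.LayerNorm(dim)

    def forward(self, z):
        # z: (B, N, dim) token sequence
        z = z + self.ln1(self.sa(z, z, z, need_weights=False)[0])  # z'_l = z_{l-1} + LN[SA(z_{l-1})]
        z = z + self.ln2(self.mlp(z))                              # z_l  = z'_l + LN[MLP(z'_l)]
        return z
```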

Because the receptive field of the Transformer encoder is large, an image segmentation model built on the Transformer structure does not need a heavyweight decoder; a simple MLP structure suffices to achieve excellent results. In the decoder of this paper, MLP layers form the upsampling module, and the output features of the downsampling module are concatenated with those of the upsampling module through skip connections to recover the information lost during downsampling. Finally, a fully connected layer outputs the pixel-level result.

2.5 Head: Self-supervised Multi-sequence Integration

This section introduces the proposed self-supervised integration of multi-sequence segmentation results, which learns the weight of each sequence's prediction end-to-end so that the final fusion result is optimal. To evaluate the quality of the predicted probabilities output by the segmentation model, we need an evaluation index that effectively represents the gap between a segmentation result and the real label. We choose the Jensen-Shannon (JS) divergence to measure the difference between the prediction and the true label:

    \mathrm{JS}(d \,\|\, g) = \frac{1}{2} \mathrm{KL}\!\left( d \,\Big\|\, \frac{d+g}{2} \right) + \frac{1}{2} \mathrm{KL}\!\left( g \,\Big\|\, \frac{d+g}{2} \right)    (9)

where d is the degraded segmentation result, i.e., the incomplete network prediction relative to the real label, and g is the real label of the segmentation result. To obtain the evaluation of the segmentation network's predicted probabilities, following formula (10), given the input image I the final prediction y(I) is obtained by weighting and summing the prediction results of the segmentation network. The fused result is stacked with the original image data to form [y(I); I], which is fed into the quality evaluation network to compute the predicted score Q:

    Q = h\!\left( [y(I); I] \right) = h\!\left( \left[ \sum_{i=1}^{4} \sum_{j=1}^{D} w_{ij} f_{ij}(I); \, I \right] \right)    (10)
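For reference, Eq. (9) can be computed directly on per-voxel class-probability maps as in the sketch below; the epsilon added for numerical stability is an assumption.

```python
import torch

def js_divergence(d: torch.Tensor, g: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """d, g: (B, C, ...) probability maps that sum to 1 over the class dimension."""
    m = 0.5 * (d + g)
    kl_dm = (d * ((d + eps).log() - (m + eps).log())).sum(dim=1)   # KL(d || (d+g)/2)
    kl_gm = (g * ((g + eps).log() - (m + eps).log())).sum(dim=1)   # KL(g || (d+g)/2)
    return (0.5 * kl_dm + 0.5 * kl_gm).mean()
```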

Because evaluating the quality of segmentation results is a relatively simple task, we chose AlexNet [14], which has fewer parameters, as the quality evaluation network.


The network mainly consists of convolutional layers, max-pooling layers, and ReLU activations. At the end of the network we use a global average pooling (GAP) layer to obtain compressed features whose size equals the number of feature channels. The GAP layer allows the network to accept images of any size, so the output is not constrained by the image size after data preprocessing. After the GAP layer we use a fully connected layer, and finally a Sigmoid function produces the prediction score Q ∈ [0, 1].

Because the different MRI sequences are fed into the segmentation network, four different segmentation results are obtained, which are then uniformly converted to the axial plane. To fuse the segmentation results of the different sequences, we set up four sets of learnable weight parameters w_{T1}, w_{T2}, w_{Flair}, w_{T1ce}, each of size 1 × 1 × D, where D is the number of 2D slices along the depth direction of the 3D image. Since the input of the evaluation network is a 2D image, each slice of the 3D image is multiplied by a corresponding learnable weight, and the weighted outputs of the different sequences are summed to obtain the final fusion result. Our goal is to make the score Q of the final fusion result under the quality evaluation network as low as possible, i.e., to make the JS divergence between the prediction and the real manual label as low as possible, indicating that the fusion result is as close as possible to the real segmentation.

Given the input image I, four prediction probability maps f_1, f_2, f_3, f_4 with f_i(I) ∈ [0, 1] are obtained through the four parallel model branches, so the integrated result is y(I) = \sum_{i=1}^{4} \sum_{j=1}^{D} w_{ij} f_{ij}(I), where the weights satisfy w_{ij} ≥ 0. The purpose of the self-supervised evaluation network is to assess the quality of the segmentation outputs and their fusion without manual segmentation labels and to produce the quality score of formula (10). Therefore, for each final segmentation result, the optimization objective for the weight coefficients can be expressed as:

    w_{ij}^{*} = \arg\min_{w_{ij}} h\!\left( \left[ \sum_{i=1}^{4} \sum_{j=1}^{D} w_{ij} f_{ij}(I); \, I \right] \right)    (11)

During training, the segmentation results of each sequence in the same group of images are passed through the quality evaluation network to obtain the predicted score, which serves as the overall loss for the weight update. The back-propagation gradient update is computed as:

    \nabla_{w_{ij}} \mathrm{loss} = \frac{\partial h([y(I); I])}{\partial w_{ij}}    (12)
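The weight-update step of Eqs. (10)-(12) can be sketched as follows: the fused prediction is concatenated with the image, scored by the quality network h, and only the fusion weights are updated. The optimizer handling, the clamp used to enforce w_ij ≥ 0, and the tensor shapes are assumptions of this sketch.

```python
import torch

def update_fusion_weights(w, preds, image, quality_net, opt):
    # w:      (S, D) learnable per-sequence, per-slice weights (requires_grad=True),
    #         registered as the only parameters in the optimizer `opt`
    # preds:  (S, B, C, D, H, W) detached per-sequence prediction probabilities f_ij(I)
    # image:  (B, 1, D, H, W) original input I
    # quality_net parameters are assumed frozen (requires_grad=False) during this step.
    fused = (w.view(w.shape[0], 1, 1, w.shape[1], 1, 1) * preds).sum(dim=0)   # y(I)
    score = quality_net(torch.cat([fused, image], dim=1))                     # Q = h([y(I); I])
    loss = score.mean()            # drive the predicted quality score down, Eq. (11)
    opt.zero_grad()
    loss.backward()                # gradient of Eq. (12) with respect to w
    opt.step()
    with torch.no_grad():
        w.clamp_(min=0.0)          # enforce the constraint w_ij >= 0
    return fused.detach()
```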

3 Experiment and Results

3.1 Dataset

For experimental evaluation and validation we employ the BraTS2018 dataset, a widely used brain tumor MRI benchmark [15]. The BraTS2018 dataset is a comprehensive collection of multimodal glioma MR data comprising 285 cases: 210 High-Grade Glioma (HGG) cases and 75 Low-Grade Glioma (LGG) cases. Each case includes four sequences: T1-weighted (T1), T2-weighted (T2), Fluid-Attenuated Inversion Recovery (FLAIR), and T1-weighted post-contrast (T1ce). Segmentation is evaluated on three regions, each representing a specific aspect of the tumor: the whole tumor (WT), the enhancing tumor (ET), and the tumor core (TC). The BraTS2018 dataset provides annotations for these three regions, enabling the evaluation and comparison of segmentation performance.

This diverse and well-annotated dataset allows us to assess the effectiveness of the proposed multi-sequence self-supervised fusion segmentation network. Its richness makes it possible to analyze the model's performance under various scenarios and provides insight into its strengths and limitations; by conducting experiments on it, we aim to demonstrate the effectiveness and robustness of the proposed framework in accurately delineating brain tumor regions.

3.2 Experimental Setup

Data Preprocessing. To alleviate the imbalance between positive and negative classes in medical images, we first obtain an image bounding box from the minimum boundaries of the valid pixels in the MR images and then use this bounding box to crop the images of all modalities uniformly. During training, we randomly crop image patches of a fixed size from the cropped 3D image as the input of the segmentation network. We also require the number of voxels of each category in an input patch to exceed a certain threshold, which reduces the impact of excessive background voxels on training and improves training efficiency.
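A minimal sketch of this preprocessing is given below: all modalities are cropped to the bounding box of valid voxels, and random patches are resampled until every foreground class exceeds the voxel threshold. The default patch size and voxel threshold follow the training setup reported later in this section; the remaining details (union-based bounding box, retry limit) are assumptions.

```python
import numpy as np

def nonzero_bbox(volume: np.ndarray):
    """Bounding box (as slices) of all non-zero voxels."""
    idx = np.argwhere(volume > 0)
    lo, hi = idx.min(axis=0), idx.max(axis=0) + 1
    return tuple(slice(a, b) for a, b in zip(lo, hi))

def sample_patch(modalities, label, size=96, min_voxels=2000, max_tries=50, rng=None):
    rng = rng or np.random.default_rng()
    bbox = nonzero_bbox(sum(m != 0 for m in modalities))   # union of valid voxels across modalities
    modalities = [m[bbox] for m in modalities]
    label = label[bbox]
    classes = [c for c in np.unique(label) if c != 0]
    d, h, w = label.shape
    for _ in range(max_tries):
        z, y, x = (rng.integers(0, max(s - size, 0) + 1) for s in (d, h, w))
        sl = (slice(z, z + size), slice(y, y + size), slice(x, x + size))
        if all((label[sl] == c).sum() >= min_voxels for c in classes):
            break                                          # patch has enough voxels of every class
    return [m[sl] for m in modalities], label[sl]
```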


Evaluating Indicator. This paper uses the Dice Similarity Coefficient (DSC) and the Hausdorff Distance (HD) to evaluate segmentation performance. DSC measures the overall overlap between predictions and labels, where recall and false positives are equally important. For each of the three tumor regions, we compare the model output P ∈ {0, 1} with the corresponding label T ∈ {0, 1}; the DSC is defined as:

    \mathrm{Dice}(P, T) = \frac{|P_1 \wedge T_1|}{(|P_1| + |T_1|)/2}    (13)

where ∧ denotes the logical AND operator, and P_1 and T_1 denote the voxels with value 1 in the prediction and in the label, respectively.

The Hausdorff distance measures the agreement between the prediction and the segmentation label along the segmentation boundary. The directed Hausdorff distance is the maximum, over all points on one volume surface, of the minimum distance to the other volume surface, and the bidirectional Hausdorff distance takes the maximum of the two directed distances:

    \mathrm{Haus}(P, T) = \max\!\left\{ \sup_{p \in \partial P_1} \inf_{t \in \partial T_1} d(p, t), \; \sup_{t \in \partial T_1} \inf_{p \in \partial P_1} d(t, p) \right\}    (14)

where P_1 is the part of the output with predicted value 1 and ∂P_1 its segmentation boundary, T_1 is the corresponding label region and ∂T_1 its boundary, p and t are boundary points of the prediction and the label, respectively, and d(p, t) denotes the distance between the boundary points p and t.
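Both metrics can be computed as in the following sketch: the DSC of Eq. (13) on binary masks and the bidirectional Hausdorff distance of Eq. (14) on boundary points; extracting the boundaries via binary erosion and using SciPy's directed_hausdorff are assumptions of this sketch.

```python
import numpy as np
from scipy.ndimage import binary_erosion
from scipy.spatial.distance import directed_hausdorff

def dice(pred: np.ndarray, label: np.ndarray) -> float:
    p, t = pred.astype(bool), label.astype(bool)
    return 2.0 * np.logical_and(p, t).sum() / (p.sum() + t.sum() + 1e-8)   # Eq. (13)

def hausdorff(pred: np.ndarray, label: np.ndarray) -> float:
    surface = lambda m: np.argwhere(m & ~binary_erosion(m))                 # boundary voxels
    p_pts, t_pts = surface(pred.astype(bool)), surface(label.astype(bool))
    d_pt = directed_hausdorff(p_pts, t_pts)[0]                              # sup_p inf_t d(p, t)
    d_tp = directed_hausdorff(t_pts, p_pts)[0]                              # sup_t inf_p d(t, p)
    return max(d_pt, d_tp)                                                  # Eq. (14)
```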

Training Environment. We implemented the proposed MSSFN in the PyTorch framework and trained it on an NVIDIA GTX 1080Ti GPU. Due to GPU memory constraints, the training patches are 96 × 96 × 96 voxels and the batch size is 1. To prevent the number of useful voxels of some category in an input patch from being zero during sampling, and to reduce the impact of class imbalance on the model, we filter the cropped patches during sampling: patches in which the number of useful voxels of any category is below the threshold of 2000 voxels are removed. We trained the model for 200 epochs.

Baselines. We compare against BiTr-UNet [11], Swin-UNet [8], Cascaded network [16], TumorGAN [17], Multiple model ensemble [18], and NestedFormer [19]. Among them, BiTr-UNet, Swin-UNet, and Cascaded network can only process a single sequence, whereas TumorGAN, Multiple model ensemble, and NestedFormer process multiple sequences simultaneously, as MSSFN does.

3.3 Results

The evaluation scores of the segmentation results on the BraTS 2018 validation set are shown in Table 1. The results clearly indicate the superiority of the proposed MSSFN in brain tumor segmentation.

Analyzing Table 1, the average performance of the segmentation results obtained by fusing different sequences is significantly better than that of any single sequence. This highlights the importance of multi-sequence fusion for improving the accuracy and robustness of the segmentation results: by fusing the segmentation results from multiple sequences, MSSFN effectively eliminates false-positive voxels and enhances the overall segmentation accuracy. Furthermore, MSSFN substantially improves the Dice score for the enhancing tumor region compared with the best-performing single-sequence model; the nearly 8% higher Dice score signifies the effectiveness of our approach in accurately delineating tumor boundaries.

Moreover, compared with the other multi-sequence fusion segmentation models, the proposed MSSFN achieves the best Dice score in every region after fusing the segmentation results of the four sequences. This outcome underlines the advantage of the self-supervised multi-sequence ensemble approach introduced in this study: by leveraging the complementary information of multiple sequences, MSSFN captures diverse tumor characteristics and produces more accurate and reliable segmentations.

The performance of MSSFN on the BraTS 2018 validation set demonstrates its efficacy and potential for clinical application. Precise and reliable tumor segmentations are of paramount importance for clinical decision-making and treatment planning; by providing accurate and detailed segmentation results, MSSFN supports well-informed diagnoses and treatment decisions, ultimately improving patient outcomes. These findings validate the effectiveness of fusing multi-sequence segmentation results with the proposed self-supervised learning approach.


Table 1. Comparison of segmentation results of different models on the BraTS 2018 validation set.

                                                   Dice
  Model                                            ET      WT      TC
  Single sequence     BiTr-UNet                    0.6837  0.8476  0.6783
                      Swin-UNet                    0.7194  0.8924  0.7864
                      Cascaded network             0.7395  0.8862  0.8239
  Multiple sequences  TumorGAN                     0.7483  0.8864  0.7938
                      Multiple model ensemble      0.7374  0.9013  0.7674
                      NestedFormer                 0.7822  0.8995  0.8375
                      MSSFN (ours)                 0.7984  0.9002  0.8411

Fig. 4. Visual comparison of brain tumor segmentation results.

Figure 4 shows example segmentation results. From top to bottom, the segmentation results of five patients are compared; the columns from left to right are: a) the original image (for convenience of display, only the FLAIR sequence is shown as the original image), b) the ground-truth manual segmentation, c) the prediction of the Cascaded network, d) the prediction of the NestedFormer model, and e) the prediction of the MSSFN model. The correspondence between colors and labels is: NETC (green), ED (red), ET (yellow), WT (green + red + yellow), TC (green + yellow). WT is the easiest region to distinguish and is segmented best, and the results of the three models are relatively similar for it. TC and ET are harder to distinguish, especially ET, which has low contrast and a scattered distribution. The segmentation maps obtained by the MSSFN model are the closest to the ground truth on WT, TC, and ET.

3.4 Ablation Experiment

The effectiveness of the MSSFN model structure is further evaluated through ablation experiments. MSSFN is characterized by an encoder based on Bi-ConvLSTM and Transformer and a decoder based on MLP, with an overall U-Net-like structure. We therefore evaluate the encoder and decoder designs by replacing them with the corresponding U-Net components. Specifically, the model whose Bi-ConvLSTM and Transformer encoder is replaced with a fully convolutional encoder is denoted MSSFN\E, and the model whose MLP-based decoder is replaced with a fully convolutional decoder is denoted MSSFN\D. The results of the ablation experiments are shown in Table 2.

Table 2. Results of ablation experiments.

              Dice                          Hausdorff Dist.
  Method      ET      WT      TC            ET      WT      TC
  MSSFN\E     0.7253  0.8749  0.7923        5.2948  7.7885  11.9521
  MSSFN\D     0.7920  0.8974  0.8474        4.9193  7.0525  11.2184
  MSSFN       0.7984  0.9002  0.8411        4.8875  7.0428  11.2294

The results in Table 2 show that the segmentation performance of the MSSFN model is better than that of the variant whose encoder is replaced with the U-Net encoder. This indicates that the Bi-ConvLSTM and Transformer encoder proposed in this paper has better feature extraction capabilities and can eliminate some obvious false-positive voxels, making the final segmentation more robust and accurate. For the enhancing tumor (ET) and tumor core (TC) regions, which contain few voxels and have complex boundaries, the Bi-ConvLSTM and Transformer encoder improves the Dice score by 10.1% and 6.2%, respectively. Except for the tumor core (TC), where performance is slightly lower than that of the CNN-based decoder, the overall performance of our model is still better, and it achieves this with a simpler decoder architecture.

4 Comparison with Existing Methods

4.1 Traditional Image Processing Methods

Traditional image processing methods have seen some application in brain tumor segmentation. They mainly rely on hand-designed feature extraction and image processing algorithms, achieving segmentation by computing similarities or differences between features in the image. In the brain tumor segmentation task, commonly used features include geometric, gray-level, and texture features. Geometric features primarily describe the shape and size of objects in the image, which is important for distinguishing tumors from normal tissue. Gray-level features reflect the brightness and contrast of the image, which helps delineate the boundary between the tumor and the surrounding tissue. Texture features describe the detail and texture in the image, which is very helpful for identifying the internal structure and morphology of the tumor.

The advantage of traditional methods is that the algorithms are relatively simple and easy to understand and implement. For example, threshold segmentation distinguishes foreground from background by setting a threshold; it is simple and effective but very sensitive to the choice of threshold. Edge detection methods such as Sobel and Canny are also widely used in image segmentation; they identify objects by detecting edges in the image. These methods usually have fast computing speed and small model size, which suits resource-limited application scenarios.

However, traditional image processing methods have limitations in brain tumor segmentation. First, hand-designed features often fail to capture the complexity and variety of brain tumor images; because brain tumors differ in shape, size, and location, hand-designed features are often difficult to adapt to different conditions, leading to inaccurate segmentation results. Second, traditional methods usually rely on the prior knowledge of domain experts and manual parameter tuning, which limits their adaptability to different datasets and application scenarios. In addition, their performance is usually bounded by the ability of feature selection and representation, so they cannot fully exploit the latent information in brain tumor images. To overcome these limitations, some traditional methods introduce more complex features, such as waveform-based or statistics-based features, to improve segmentation performance.
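For illustration, the classical techniques mentioned above can be applied to a single MRI slice with scikit-image as sketched below; the Otsu threshold and the Canny sigma value are illustrative choices, not recommendations from this paper.

```python
import numpy as np
from skimage.filters import threshold_otsu, sobel
from skimage.feature import canny

def classical_segmentation(slice_2d: np.ndarray):
    mask = slice_2d > threshold_otsu(slice_2d)   # global threshold segmentation
    edges_sobel = sobel(slice_2d)                # gradient-magnitude edge map
    edges_canny = canny(slice_2d, sigma=2.0)     # binary Canny edge map
    return mask, edges_sobel, edges_canny
```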



4.2 Methods Based on Deep Learning

In recent years, deep learning-based brain tumor segmentation methods have made remarkable progress. These methods use deep neural networks for end-to-end learning from images without manual feature design. Among them, the convolutional neural network (CNN) is the most commonly used model, with architectures such as U-Net, SegNet, and FCN. By stacking convolutional and pooling layers, these methods extract semantic information from low level to high level and can segment brain tumor regions accurately. Some methods further improve segmentation performance with attention mechanisms, skip connections, and other techniques. However, the traditional CNN model has limitations in processing multi-modal images and global context information.

With the development of deep learning, new network structures have been proposed and applied to brain tumor segmentation. For example, by introducing deconvolution layers, the fully convolutional network (FCN) can accept an input image of any size and output a segmentation result of the same size, realizing end-to-end pixel-level segmentation. By adding skip connections to a conventional convolutional network, the U-Net model retains image details better and thus improves segmentation accuracy. The SegNet model establishes a correspondence between encoder and decoder so that the decoder can use the encoder's feature maps for more accurate feature recovery, again improving accuracy.

Although deep learning methods have achieved remarkable results in brain tumor segmentation, challenges remain. First, deep learning methods usually require a large amount of annotated data for training, which is often difficult to obtain in practice. Second, training usually requires substantial computing resources and time, which may be unacceptable in resource-limited scenarios. In addition, deep learning methods are less interpretable, which can be a problem in applications that require explanations. How to address these challenges and further improve the performance of deep learning methods in brain tumor segmentation is therefore an important direction for future research.

4.3 Our Proposed Method

Compared with traditional image processing methods and existing deep learning-based methods, our proposed brain tumor image segmentation framework has the following advantages.

First, our framework uses the information of multi-modal brain tumor images for segmentation: it integrates the features of images from different modalities and can therefore capture the characteristics of brain tumors more comprehensively, improving segmentation accuracy. Traditional methods and many deep learning-based methods use only a single modality and cannot fully exploit the latent information of multi-modal images.

Second, we introduce Bi-ConvLSTM and Transformer modules as components of the encoder. Bi-ConvLSTM improves the feature representation by using both local details and global context information, while the Transformer module captures global information and handles long-distance dependencies. This combination extracts better features from brain tumor images, producing a richer feature set and making the segmentation results more robust.

In addition, we propose a self-supervised multi-modal fusion segmentation method that automatically learns the weight of each modality's prediction to optimize the final fusion result. Compared with manually set weights, this approach is more adaptive and flexible, as it adjusts the weights according to the specific characteristics of the data. Our framework is also not limited to a specific type of brain tumor or a specific set of imaging modalities, which makes it a versatile tool for brain tumor segmentation.

Finally, experiments on the BraTS2018 dataset show that our method achieves excellent performance on multiple evaluation indexes and obtains better segmentation results than existing multi-modal fusion segmentation models. Overall, by taking full advantage of multi-modal information and introducing the Bi-ConvLSTM and Transformer modules, our approach segments brain tumor regions more accurately and with better robustness and adaptability. These results validate the effectiveness and innovation of the proposed method and provide new directions for further research and application in the field of brain tumor image processing.

5 Conclusions

In this paper, we have advanced brain tumor segmentation by leveraging the multi-sequence information of brain tumor images and integrating Bi-ConvLSTM and Transformer models. The proposed encoder structure, combining Bi-ConvLSTM and Transformer, effectively captures both local details and global context, providing a more comprehensive and accurate representation of brain tumor images than existing methods, which often fail to fully capture their complex and diverse features. Compared with existing multimodal fusion methods for brain tumor segmentation, our self-supervised multi-sequence integration method demonstrates superior performance: it achieves a balanced prediction across the modalities and attains the highest performance on the BraTS2018 dataset.

The MSSFN method introduced in this study holds substantial potential value for clinical practice. By providing more accurate and detailed tumor segmentation results, MSSFN can assist doctors in making well-informed diagnoses and treatment decisions, improving the quality of patient care while reducing the workload of healthcare professionals involved in brain tumor analysis. Its automation and high precision also make it a valuable tool for further research in brain tumor image processing, opening new avenues for the development of more advanced techniques.

There remain opportunities for future improvements and extensions. Incorporating more advanced deep learning techniques or exploring additional imaging modalities could further enhance the performance and versatility of the proposed segmentation framework, and evaluating the method on larger and more diverse datasets would provide a more comprehensive assessment of its generalizability and robustness. In summary, the advancements achieved in this study underscore the potential of multi-sequence integration and of fusing Bi-ConvLSTM and Transformer models for brain tumor segmentation. Through ongoing research and development, the accuracy, efficiency, and clinical applicability of brain tumor segmentation techniques can continue to improve, ultimately benefiting patients and advancing the field of medical image analysis.

6 Outlook and Prospects

In this paper, we have presented the MSSFN (Multi-Sequence Segmentation Fusion Network) method, which demonstrates significant advancements in medical image segmentation, particularly for brain tumor analysis. By leveraging the multi-sequence information of brain tumor images and integrating Bi-ConvLSTM and Transformer models, we achieve improved segmentation accuracy and robustness compared with existing methods. The proposed MSSFN holds great promise for further development, and we identify several directions for future work.

First, the network architecture and model design of MSSFN can be improved. Exploring more efficient and accurate segmentation algorithms, such as more sophisticated attention mechanisms, multi-scale feature fusion, and adaptive learning methods, can further enhance performance; continuously refining and optimizing the model architecture will enable MSSFN to achieve even higher accuracy and efficiency in brain tumor segmentation.

Moreover, MSSFN can be extended to other medical imaging tasks beyond brain tumor segmentation. Fields such as cardiac image segmentation, lung lesion detection, and breast cancer screening can benefit from its high-precision segmentation capability; adapting and fine-tuning the model for these applications would contribute to improved diagnoses and treatment planning across healthcare domains.

Furthermore, integrating MSSFN with clinical information and decision support systems can enhance its value in clinical practice. By combining patients' clinical data, genomics information, and image segmentation results, MSSFN can assist doctors in making more accurate diagnoses, personalized treatment recommendations, and prognosis assessments, leading to more comprehensive and personalized healthcare and ultimately better patient outcomes.

To further validate the performance and robustness of MSSFN, validation and application studies on large-scale multi-center datasets are needed. Verifying its suitability across different institutions and imaging devices will enhance its reliability and feasibility in real-world clinical practice, and involving multiple centers and diverse patient populations will allow a thorough evaluation of its generalizability and effectiveness.

The fusion of multi-modal image information also holds great potential for advancing medical image segmentation. While MSSFN currently focuses on multi-sequence brain tumor image segmentation, future work can address the segmentation of multi-modal images. Incorporating modalities such as magnetic resonance imaging, positron emission tomography, and computed tomography can provide comprehensive, multi-dimensional disease information, enabling more accurate diagnoses and treatment decisions; leveraging the complementary information of different modalities will further improve the precision and reliability of medical image segmentation.

Open-sourcing MSSFN's code and model parameters and promoting collaboration within the research community will foster innovation and drive advancements in medical image segmentation. By sharing resources, insights, and expertise, researchers can collectively contribute to the improvement and optimization of MSSFN, and open-source initiatives will accelerate the development and adoption of state-of-the-art segmentation techniques, benefiting the wider healthcare community.

In clinical applications, MSSFN holds significant potential value. By providing fast and accurate segmentation results, it can assist doctors in making more precise diagnoses and treatment decisions, reducing human error and subjectivity; it can enhance surgical navigation accuracy, support surgical planning, and facilitate treatment evaluation. It also provides researchers with powerful tools to explore disease pathologies and influencing factors, enabling a deeper understanding of diseases and paving the way for advances in medical research.

References

1. Kamnitsas, K., Ledig, C., Newcombe, V.F., Simpson, J.P., Kane, A.D., Menon, D.K., Rueckert, D., Glocker, B.: Efficient multi-scale 3D CNN with fully connected CRF for accurate brain lesion segmentation. Med. Image Anal. 36, 61–78 (2017)
2. Su, C.-H., Chung, P.-C., Lin, S.-F., Tsai, H.-W., Yang, T.-L., Su, Y.-C.: Multi-scale attention convolutional network for Masson stained bile duct segmentation from liver pathology images. Sensors 22, 2679 (2022)
3. Fu, X., Zeng, D., Huang, Y., Zhang, X.P., Ding, X.: A weighted variational model for simultaneous reflectance and illumination estimation. In: Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016
4. Ronneberger, O., Fischer, P., Brox, T.: U-Net: convolutional networks for biomedical image segmentation. In: Navab, N., Hornegger, J., Wells, W.M., Frangi, A.F. (eds.) MICCAI 2015. LNCS, vol. 9351, pp. 234–241. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-24574-4_28
5. Byeon, W., Breuel, T.M., Raue, F., Liwicki, M.: Scene labeling with LSTM recurrent neural networks. In: Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA, pp. 3547–3555, 8–10 June 2015
6. Le, T.H.N., Gummadi, R., Savvides, M.: Deep recurrent level set for segmenting brain tumors. In: Frangi, A.F., Schnabel, J.A., Davatzikos, C., Alberola-López, C., Fichtinger, G. (eds.) MICCAI 2018. LNCS, vol. 11072, pp. 646–653. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-00931-1_74
7. Chen, J.N., et al.: TransUNet: transformers make strong encoders for medical image segmentation. arXiv:2102.04306, https://arxiv.org/abs/2102.04306 (2021)
8. Cao, H., et al.: Swin-Unet: Unet-like pure transformer for medical image segmentation. arXiv:2105.05537, https://arxiv.org/abs/2105.05537 (2021)
9. Liu, Z., et al.: Swin transformer: hierarchical vision transformer using shifted windows. In: Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, QC, Canada, pp. 9992–10002, 10–17 October 2021
10. Wang, W., Chen, C., Ding, M., Yu, H., Zha, S., Li, J.: TransBTS: multimodal brain tumor segmentation using transformer. In: de Bruijne, M., et al. (eds.) MICCAI 2021. LNCS, vol. 12901, pp. 109–119. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-87193-2_11
11. Jia, Q., Shu, H.: BiTr-Unet: a CNN-transformer combined network for MRI brain tumor segmentation. arXiv:2109.12271 (2021)
12. Hatamizadeh, A., et al.: UNETR: transformers for 3D medical image segmentation. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, Waikoloa, HI, USA, pp. 574–584 (2022)
13. Song, H., Wang, W., Zhao, S., Shen, J., Lam, K.: Pyramid dilated deeper ConvLSTM for video salient object detection. In: Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018
14. Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. In: Proceedings of Neural Information Processing Systems (NIPS), Lake Tahoe, NV, USA, vol. 2, pp. 1097–1105, 3–8 December 2012
15. Menze, B.H., et al.: The multimodal brain tumor image segmentation benchmark (BRATS). IEEE Trans. Med. Imaging 34, 1993–2024 (2015)
16. Wang, G., Li, W., Ourselin, S., Vercauteren, T.: Automatic brain tumor segmentation using cascaded anisotropic convolutional neural networks. In: Proceedings of the International MICCAI Brainlesion Workshop, Quebec City, QC, Canada, 10–14 September 2017
17. Li, Q., Yu, Z., Wang, Y., Zheng, H.: TumorGAN: a multi-modal data augmentation framework for brain tumor segmentation. Sensors 20, 4203 (2020)
18. Kamnitsas, K., et al.: Ensembles of multiple models and architectures for robust brain tumour segmentation. arXiv:1711.01468 (2017)
19. Xing, Z., Yu, L., Wan, L., Han, T., Zhu, L.: NestedFormer: nested modality-aware transformer for brain tumor segmentation. In: Proceedings of the International MICCAI Brainlesion Workshop, Singapore, pp. 273–283 (2022)

An Effective Morphological Analysis Framework of Intracranial Artery in 3D Digital Subtraction Angiography

Haining Zhao1,2, Tao Wang3, Shiqi Liu1(B), Xiaoliang Xie1(B), Xiaohu Zhou1, Zengguang Hou1,2, Liqun Jiao3, Yan Ma3, Ye Li3, Jichang Luo3, Jia Dong3, and Bairu Zhang3

1 The State Key Laboratory of Multimodal Artificial Intelligence Systems, Institute of Automation, Chinese Academy of Sciences, Beijing, China
{liushiqi2016,xiaoliang.xie}@ia.ac.cn
2 School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, China
3 Department of Neurosurgery, Xuanwu Hospital, Capital Medical University, Beijing, China

H. Zhao and T. Wang—Co-first authors.

Abstract. Acquiring accurate anatomical information of the intracranial artery from 3D digital subtraction angiography (3D-DSA) is crucial for intracranial artery intervention surgery. However, this task often comes with the challenges of large image scale and memory constraints. In this paper, an effective two-stage framework is proposed for fully automatic morphological analysis of the intracranial artery. In the first stage, the proposed Region-Global Fusion Network (RGFNet) achieves accurate and continuous segmentation of the intracranial artery. In the second stage, a 3D morphological analysis algorithm obtains the access diameter, the minimum inner diameter, and the minimum radius of curvature of the intracranial artery. RGFNet achieves state-of-the-art performance (93.36% in Dice, 87.83% in mIoU, and 15.64 in HD95) on the 3D-DSA intracranial artery segmentation dataset, and the proposed morphological analysis algorithm is also effective in obtaining accurate anatomical information. The proposed framework is not only helpful for surgeons in planning interventional procedures but also promising for integration into robotic navigation systems, enabling robot-assisted surgery.

Keywords: Intracranial Artery · Segmentation · Morphological Analysis

1 Introduction

Intracranial atherosclerotic stenosis (ICAS) is a leading cause of ischemic stroke worldwide and is associated with a high risk of recurrent stroke [1]. Image-guided endovascular therapy, such as percutaneous transluminal angioplasty (PTA), has been proved a necessary treatment for patients with severe stenosis of more than 70% of the diameter of a major intracranial artery [2]. In the procedures of interventional surgery for ICAS, accurate segmentation and morphological analysis of the intracranial artery in 3D-DSA is of great significance for both preoperative planning and intraoperative navigation, as they provide accurate anatomical information for surgeons. In the preoperative period, the access diameter and minimum diameter of the intracranial artery are measured in order to choose suitable intervention instruments. The surgeons also observe the varied curvature along the artery so that they can plan the procedures of surgery, and the section with maximum curvature is specifically noted as one of the most difficult parts of the surgery. During the intraoperative phase, the 3D segmentation of the artery is used as a supplement to X-ray fluoroscopy to provide 3D spatial information for the navigation of the instruments.

Although 3D-DSA provides detailed anatomical and spatial information, it also brings some challenges. Because a 3D-DSA volume of the intracranial artery is very large, it is hard to feed the entire volume into segmentation models due to memory constraints. To resolve this problem, most researchers apply 3D segmentation models to sub-volumes cropped from the original volume [9–11, 15]. Although this approach encapsulates dependencies in three directions, it uses only local-region information for segmentation and ignores the global semantics of the entire volume. However, the global information of the entire volume is quite important when segmenting the intracranial artery in sub-regions, as the artery spans a wide range of the volume. Besides, the varied diameters along the artery and the existence of bifurcations also bring obstacles to the morphological analysis.

Fig. 1. Overview of the framework

In order to tackle these challenges, we propose a two-stage framework for the morphological analysis of the intracranial artery in 3D-DSA, as shown in Fig. 1. In the first stage, RGFNet is proposed for the segmentation of the intracranial artery. RGFNet achieves the best performance not only in overlap-based metrics (Dice, mIoU) but also in the boundary-based metric (HD95) for intracranial artery segmentation, as it supplements the cropped-volume segmentation with context from other regions of the entire volume. In the second stage, an automatic 3D morphological analysis algorithm is developed to obtain the access diameter, the minimum inner diameter and the minimum radius of curvature of the intracranial artery. The algorithm is robust to largely varying artery structures. Our contributions can be summarized as follows:

• An effective two-stage framework is proposed for the segmentation and morphological analysis of the intracranial artery in 3D-DSA. In stage 1, the proposed model obtains more accurate and continuous segmentation results of the intracranial artery than other methods, achieving state-of-the-art performance (93.36% in Dice, 87.83% in mIoU and 15.64 in HD95) on the 3D-DSA intracranial artery segmentation dataset. In stage 2, the proposed morphological analysis algorithm is effective in robustly extracting accurate anatomical information from arterial structures that vary largely across subjects.
• In the proposed segmentation model, a region-global fuse block is put forward to model the dependencies between the features extracted from the cropped volume and the entire volume, and a feature enhance block is put forward to enhance the features of the cropped volume with global information. The experiments demonstrate that these two modules can effectively address the challenges caused by the sparse structure of the artery in a large-scale volume.
• A 3D-DSA intracranial artery segmentation dataset, named IAS-3D-DSA, is established. It consists of 210 3D-DSA images of intracranial arteries from 205 patients.

2 Related Work

Deep neural networks and encoder-decoder architectures with skip connections have been widely adopted in 3D medical image segmentation since U-Net [3] achieved excellent performance in 2D biomedical image segmentation challenges. Çiçek et al. [4] replaced the 2D convolutions in U-Net with 3D convolutions to deal with 3D segmentation tasks. [6, 7] applied residual connections in the encoder of U-shaped segmentation models. TransUNet [8] and its 3D version TransBTS [9] integrated transformer blocks into the bottleneck of the CNN-based encoder-decoder architecture. UNetr [10] and SwinUNetr [11] used ViT [12] and Swin Transformer [13] as the encoder of the segmentation models while the decoder remains CNN-based. Recently, some researchers have also proposed purely transformer-based segmentation models [14, 15], which demonstrated better robustness. However, these methods either divide 3D volumes into 2D slices or crop them into small sub-volumes as the input of the segmentation models, so the essential global information of the original volumes is lost. Balsiger et al. [5] proposed a hybrid framework to refine the coarse segmentation of cropped volumes and achieved better performance in peripheral nerve segmentation. Nevertheless, the coarse segmentation of this framework still lacks global information, which limits its performance.

3 Method

3.1 RGFNet for Segmentation of Intracranial Artery in 3D-DSA

As shown in Fig. 2, RGFNet can be divided into four parts: (1) the region segmentation branch, including the region encoder and region decoder, (2) the global segmentation branch, including the global encoder and global decoder, (3) the region-global fuse block and (4) the feature enhance block. The original medical image $X \in \mathbb{R}^{1 \times D \times H \times W}$ is divided into small patches $\{X^{1-1}, X^{1-2}, \ldots, X^{1-n}\} \subset \mathbb{R}^{1 \times D_1 \times H_1 \times W_1}$ before being processed by the region segmentation branch, while $X$ is also down-sampled to $X^2 \in \mathbb{R}^{1 \times D_2 \times H_2 \times W_2}$ as the input of the global segmentation branch.


Fig. 2. Overview of RGFNet

Region Segmentation Branch. The region segmentation branch is built on a modified 3D version of U-Net and is the main branch of RGFNet, responsible for the segmentation of the sub-volumes $\{X^{1-1}, X^{1-2}, \ldots, X^{1-n}\}$ cropped from the original volume $X$. For each input sub-volume $X^1$, the region encoder generates hierarchical features $\{F_1^i \in \mathbb{R}^{2^i C \times \frac{D_1}{2^i} \times \frac{H_1}{2^i} \times \frac{W_1}{2^i}},\ i = 0, 1, 2, 3, 4\}$. After being fused with the corresponding global encoder features, these features are fed into the region decoder to produce the segmentation result of each sub-volume. These outputs are then stacked to obtain the final prediction of the original volume, as most 3D segmentation models do.
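A minimal sketch of this crop-and-stitch inference is shown below, assuming non-overlapping sub-volumes, a binary segmentation output and an illustrative `model` callable; it is not the authors' implementation.

```python
# Minimal sketch of non-overlapping crop-and-stitch inference for the region branch.
# `model` and the patch size are placeholders; RGFNet itself also consumes the
# down-sampled global volume, which is omitted here for brevity.
import torch

def stitch_inference(model, volume, patch=128):
    """volume: (1, 1, D, H, W) tensor whose sides are multiples of `patch`."""
    _, _, D, H, W = volume.shape
    pred = torch.zeros_like(volume)
    with torch.no_grad():
        for d in range(0, D, patch):
            for h in range(0, H, patch):
                for w in range(0, W, patch):
                    sub = volume[:, :, d:d+patch, h:h+patch, w:w+patch]
                    logits = model(sub)                       # (1, 1, patch, patch, patch)
                    pred[:, :, d:d+patch, h:h+patch, w:w+patch] = torch.sigmoid(logits)
    return (pred > 0.5).float()
```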

Global Segmentation Branch. The global encoder is used to extract hierarchical features $\{F_2^i \in \mathbb{R}^{2^i C \times \frac{D_2}{2^i} \times \frac{H_2}{2^i} \times \frac{W_2}{2^i}},\ i = 0, 1, 2, 3, 4\}$ from the down-sampled volume $X^2$. The global encoder features $\{F_2^0, F_2^1, F_2^2, F_2^3, F_2^4\}$ are fused into the corresponding region encoder features $\{F_1^0, F_1^1, F_1^2, F_1^3, F_1^4\}$ to get the fused features $\{F_f^0, F_f^1, F_f^2, F_f^3, F_f^4\}$. The output of the global segmentation branch is used as deep supervision to ensure that the global encoder is fully trained; it is not required after training, so the global decoder is removed during inference.

Region-Global Fuse Block. The region-global fuse block is used to capture the relationship between the bottleneck features of the two branches and to fuse them. Inspired by the self-attention mechanism, we propose cross-attention to fuse the bottleneck features from the two branches, as shown in Fig. 3. The bottleneck feature of the region encoder $F_1^4 \in \mathbb{R}^{16C \times \frac{D_1}{16} \times \frac{H_1}{16} \times \frac{W_1}{16}}$ and the bottleneck feature of the global encoder $F_2^4 \in \mathbb{R}^{16C \times \frac{D_2}{16} \times \frac{H_2}{16} \times \frac{W_2}{16}}$ are first reshaped into patches $P_1^4 \in \mathbb{R}^{1 \times (16C \times \frac{D_1}{16} \times \frac{H_1}{16} \times \frac{W_1}{16})}$ and $P_2^4 \in \mathbb{R}^{(N_D \times N_H \times N_W) \times (16C \times \frac{D_2}{16 N_D} \times \frac{H_2}{16 N_H} \times \frac{W_2}{16 N_W})}$, in which $N_D$, $N_H$, $N_W$ are the numbers of sub-volumes in the three directions. Then, $P_1^4$ and $P_2^4$ are linearly projected to embeddings $z_1^4 \in \mathbb{R}^{1 \times C_{embed}}$ and $z_2^4 \in \mathbb{R}^{N \times C_{embed}}$, where $N = N_D \times N_H \times N_W$ and $C_{embed}$ is the number of embedding channels.

Fig. 3. Scheme of region-global fuse block

Subsequently, $z_1^4$ and $z_2^4$ are concatenated and layer-normalized. After that, $z_1^4$ and $z_2^4$ are separately linearly projected to generate $Q \in \mathbb{R}^{1 \times (16C \times \frac{D_1}{16} \times \frac{H_1}{16} \times \frac{W_1}{16})}$ and $\{K, V\} \in \mathbb{R}^{N \times (16C \times \frac{D_1}{16} \times \frac{H_1}{16} \times \frac{W_1}{16})}$. Then, $Q$, $K$, $V$ are processed by the multi-head attention mechanism to get the output $Z \in \mathbb{R}^{1 \times (16C \times \frac{D_1}{16} \times \frac{H_1}{16} \times \frac{W_1}{16})}$. Besides, the attention map $M$ obtained in the multi-head attention block is also output and used later in the feature enhance blocks of the other feature levels, as it models the dependencies between the global encoder features and the region encoder features. $Z$ is then reshaped to $(16C, \frac{D_1}{16}, \frac{H_1}{16}, \frac{W_1}{16})$ and concatenated with the region bottleneck feature. After that, a res-block is used to fuse the concatenated feature and obtain the final output $F_f^4 \in \mathbb{R}^{16C \times \frac{D_1}{16} \times \frac{H_1}{16} \times \frac{W_1}{16}}$.

Feature Enhance Block. The feature enhance block, shown in Fig. 4, is used to fuse the features of the other levels based on the attention map $M$ obtained from the region-global fuse block. Firstly, the global feature $F_2^i \in \mathbb{R}^{2^i C \times \frac{D_2}{2^i} \times \frac{H_2}{2^i} \times \frac{W_2}{2^i}}$ is reshaped and linearly projected to embeddings $z_2^i \in \mathbb{R}^{N \times C_{embed}}$. Then, $z_2^i$ is layer-normalized and linearly projected to generate $V^i \in \mathbb{R}^{N \times (2^i C \times \frac{D_2}{2^i N_D} \times \frac{H_2}{2^i N_H} \times \frac{W_2}{2^i N_W})}$, and the enhanced information $Z^i \in \mathbb{R}^{1 \times (2^i C \times \frac{D_2}{2^i N_D} \times \frac{H_2}{2^i N_H} \times \frac{W_2}{2^i N_W})}$ is obtained by Eq. (1), where $N = N_D \times N_H \times N_W$.

$$Z^i = M V^i \qquad (1)$$

Afterwards, $Z^i$ is reshaped and up-sampled to $2^i C \times \frac{D_1}{2^i} \times \frac{H_1}{2^i} \times \frac{W_1}{2^i}$, so that it can be concatenated with the region feature and subsequently put into the res-block to get the fused feature $F_f^i \in \mathbb{R}^{2^i C \times \frac{D_1}{2^i} \times \frac{H_1}{2^i} \times \frac{W_1}{2^i}}$.

Fig. 4. Scheme of feature enhance block
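The following PyTorch sketch condenses the cross-attention fusion described above into a single-head form. The flatten-and-project step is simplified to a global average pooling, layer normalization and the res-block are reduced to a single convolution, and all module names and sizes are illustrative rather than the authors' exact design; the returned attention map plays the role of $M$ reused by the feature enhance blocks.

```python
# Condensed sketch of the region-global fuse block with single-head cross-attention.
import torch
import torch.nn as nn

class RegionGlobalFuse(nn.Module):
    def __init__(self, ch, embed_dim):
        super().__init__()
        self.q_proj = nn.Linear(ch, embed_dim)
        self.k_proj = nn.Linear(ch, embed_dim)
        self.v_proj = nn.Linear(ch, embed_dim)
        self.out_proj = nn.Linear(embed_dim, ch)
        self.fuse = nn.Conv3d(2 * ch, ch, kernel_size=3, padding=1)  # stands in for the res-block

    def forward(self, f_region, f_global_tokens):
        # f_region: (B, ch, d, h, w) region bottleneck feature
        # f_global_tokens: (B, N, ch) tokens from the down-sampled global bottleneck
        B, C, d, h, w = f_region.shape
        # global average pooling stands in for the flatten-and-project step in the paper
        q = self.q_proj(f_region.mean(dim=(2, 3, 4))).unsqueeze(1)               # (B, 1, E)
        k = self.k_proj(f_global_tokens)                                         # (B, N, E)
        v = self.v_proj(f_global_tokens)
        attn = torch.softmax(q @ k.transpose(1, 2) / k.shape[-1] ** 0.5, dim=-1) # (B, 1, N)
        z = self.out_proj(attn @ v)                                              # (B, 1, ch)
        z = z.reshape(B, C, 1, 1, 1).expand(-1, -1, d, h, w)
        fused = self.fuse(torch.cat([f_region, z], dim=1))
        # `attn` plays the role of the attention map M reused by the feature enhance blocks
        return fused, attn
```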

Loss Function. As shown in Eq. (2), the loss function of RGFNet consists of two parts, $L_{region}$ and $L_{global}$, to ensure that the region segmentation branch and the global segmentation branch are both fully trained.

$$L_{total} = \lambda_1 L_{region} + \lambda_2 L_{global} \qquad (2)$$

The region segmentation branch is the main branch of RGFNet and is used to predict the probability map $\hat{Y}^{1-i}$ of the input sub-volume $X^{1-i}$. Given the cropped ground truth $Y^{1-i}$, a weighted sum of the soft dice loss and the cross-entropy loss is used as the region segmentation loss, where $\lambda_1^r$, $\lambda_2^r$ are adjustable hyperparameters.

$$L_{region} = -\lambda_1^r \, \mathrm{SoftDice}(\hat{Y}^{1-i}, Y^{1-i}) - \lambda_2^r \, \mathrm{CrossEntropy}(\hat{Y}^{1-i}, Y^{1-i}) \qquad (3)$$

The global segmentation branch is used as a deep supervision branch to train the global encoder, and its loss is calculated in the same way as for the region segmentation branch. As shown in Eq. (4), $\hat{Y}^2$ is the output of the global decoder and $Y^2$ is the down-sampled ground truth; $\lambda_1^g$, $\lambda_2^g$ are adjustable hyperparameters.

$$L_{global} = -\lambda_1^g \, \mathrm{SoftDice}(\hat{Y}^2, Y^2) - \lambda_2^g \, \mathrm{CrossEntropy}(\hat{Y}^2, Y^2) \qquad (4)$$
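A minimal sketch of this two-branch objective is given below. The soft dice formulation, the sign convention and the use of binary cross-entropy are assumptions made for illustration, and the reported weights of 0.5 are used as defaults.

```python
# Sketch of the two-branch training objective in Eqs. (2)-(4), under the
# assumptions stated above (common soft dice variant; dice maximized, CE minimized).
import torch
import torch.nn.functional as F

def soft_dice(prob, target, eps=1e-6):
    inter = (prob * target).sum()
    return (2 * inter + eps) / (prob.sum() + target.sum() + eps)

def branch_loss(logits, target, w_dice=0.5, w_ce=0.5):
    prob = torch.sigmoid(logits)
    return -w_dice * soft_dice(prob, target) + \
        w_ce * F.binary_cross_entropy_with_logits(logits, target)

def total_loss(region_logits, region_gt, global_logits, global_gt, l1=0.5, l2=0.5):
    return l1 * branch_loss(region_logits, region_gt) + \
        l2 * branch_loss(global_logits, global_gt)
```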





3.2 Automatic Morphological Analysis Algorithm of Intracranial Artery

Based on the segmented vessel, we develop an automatic method to obtain the access diameter, the minimum inner diameter and the minimum radius of curvature of the intracranial artery. The proposed method is presented in detail in Algorithm 1.


Considering the varied radius along the vessel, a 3D thinning algorithm [16] is first applied to erode the segmentation and obtain a skeleton mask representing the structure of the intracranial artery. The skeleton pointset of the vessel, $P = \{p_i \in \mathbb{R}^3, i = 1, 2, 3, \ldots, M\}$, is then obtained from the skeleton mask. For the access diameter and the minimum inner diameter of the intracranial artery, a distance map is first calculated based on the segmentation results, which reflects the distance from each inner voxel to the boundary of the artery. Then, the access diameter and the minimum inner diameter can be easily obtained, as the value of the distance map at each skeleton point is the approximate inner radius of the artery. More details are shown in lines 4–7 of Algorithm 1.

In order to calculate the radius of curvature of the artery, a parametric curve equation is needed to fit the skeleton points. However, it is hard to obtain a single curve equation that fits the entire skeleton pointset because of the bifurcations of the artery. Therefore, we optimize a parametric curve equation $F(t)$ for each skeleton point $p \in P$ to fit its $k$ nearest points $P_n$ along the vessel. Then, the first and second derivatives $F'(t_c)$ and $F''(t_c)$ at point $p$ can be calculated from the obtained local curve equation $F(t)$, where $t_c$ is the parameter corresponding to point $p$. After that, the approximate radius of curvature $RC$ can be calculated according to Eq. (5). More details are shown in lines 8–19 of Algorithm 1.

$$RC = \frac{\|F'(t_c)\|_2^3}{\|F'(t_c) \times F''(t_c)\|_2} \qquad (5)$$

Algorithm 1: Morphological Analysis
Input: binarized prediction Y
Output: access diameter D_access, the minimum inner diameter D_min and the minimum radius of curvature RC_min
1  initialize RC_min ← 100
2  initialize the number of neighbor points k ← 20
3  extract the skeleton pointset P from Y
4  calculate the distance map DistMap from Y
5  get the start point p_s from P
6  D_access ← 2 × DistMap(p_s)
7  D_min ← 2 × min{DistMap(p) | p ∈ P}
8  for p in P:
9     P_n, t, t_c ← LocalPoints(p, P, k)
10    initialize cubic parametric curve equation F(t)
11    optimize F(t) using least squares method according to P_n, t
12    if MSE(F(t), P_n) < 0.2:
13       calculate the first and second derivative vectors at t_c, F′(t_c) and F″(t_c)
14       calculate the radius of curvature RC according to Eq. (5)
15       if RC < RC_min:
16          RC_min ← RC
17       end if
18    end if
19 end for
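As an illustration of lines 8–19, the following NumPy sketch fits a cubic parametric curve to an already ordered local pointset by least squares and evaluates Eq. (5) at the centre parameter; names are placeholders and the point selection performed by LocalPoints is assumed to have been done already.

```python
# Sketch of the per-point curvature estimate used in Algorithm 1 (lines 8-19).
import numpy as np

def radius_of_curvature(local_points, t, t_c):
    """local_points: (k, 3) ordered skeleton points, t: (k,) parameters, t_c: centre parameter."""
    # cubic fit of x(t), y(t), z(t): design matrix [t^3, t^2, t, 1]
    A = np.stack([t**3, t**2, t, np.ones_like(t)], axis=1)           # (k, 4)
    coef, *_ = np.linalg.lstsq(A, local_points, rcond=None)          # (4, 3) coefficients
    d1 = np.array([3 * t_c**2, 2 * t_c, 1.0, 0.0]) @ coef            # F'(t_c), shape (3,)
    d2 = np.array([6 * t_c, 2.0, 0.0, 0.0]) @ coef                   # F''(t_c)
    cross = np.cross(d1, d2)
    return np.linalg.norm(d1)**3 / (np.linalg.norm(cross) + 1e-12)   # Eq. (5)
```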


The function LocalPoints(·) used in line 9 of Algorithm 1 is applied to get the sorted local points P_n, the parameters t corresponding to them and the parameter t_c of point p. The details are shown in Algorithm 2. The function MSE(·) in line 12 of Algorithm 1 calculates the mean square error between the curve equation F(t) and the local pointset P_n, and it is used to discard curve equations that fit the local pointset badly.

Algorithm 2: LocalPoints
Input: center point p, pointset P, number of neighbor points k
Output: local pointset P_n, corresponding parameters t, parameter of center point t_c
1  initialize pointset P′ ← {p}
2  while |P′| ≤ k:
3     find the two points {p_a, p_b} of P which have the shortest distance to P′
4     P′ ← P′ + {p_a, p_b}
5  end while
6  p_0 ← arg min_{p′ ∈ P′} ‖ · ‖₂²
7  initialize P_n ← {p_0}
8  while |P_n| ≤ k:
9     find the nearest point {p_j} of P′ to P_n
10    P_n ← P_n + {p_j}
11    if p_j == p:
12       c ← |P_n|  (the index of p in P_n, so that t_c = t[c])
13    end if
14    t ← |P_n| points equidistantly sampled in the interval [0, 10]
15 end while

4 Experiments

4.1 Datasets

IAS-3D-DSA. This dataset consists of 210 3D-DSA images of intracranial arteries, collected from 205 patients. 96 images are captured at a size of 512 × 512 × 512, while 113 images are captured at a size of 384 × 384 × 384. The collected images are first labelled by 5 clinicians, and the results are then validated by 3 experienced physicians. The dataset is randomly split into a training set of 190 volumes and a test set of 20 volumes.

4.2 Implementation Details

Hyperparameters of RGFNet. In our implementation of the proposed segmentation model, the size of the cropped sub-volume $D_1 \times H_1 \times W_1$ was set to 128 × 128 × 128, while the down-sampled volume size $D_2 \times H_2 \times W_2$ was 64 × 64 × 64. The number of channels of the first feature level, $C$, is 16. The adjustable coefficients of the loss function are set to $\lambda_1 = \lambda_2 = \lambda_1^r = \lambda_2^r = \lambda_1^g = \lambda_2^g = 0.5$.
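The two network inputs implied by these settings can be prepared as in the following sketch, which crops non-overlapping 128³ sub-volumes for the region branch and trilinearly down-samples the whole volume to 64³ for the global branch; function and variable names are illustrative.

```python
# Illustrative preparation of the two RGFNet inputs with the reported settings.
import torch
import torch.nn.functional as F

def prepare_inputs(volume, patch=128, global_size=64):
    """volume: (1, 1, D, H, W) tensor with sides divisible by `patch`."""
    global_input = F.interpolate(volume, size=(global_size,) * 3,
                                 mode="trilinear", align_corners=False)
    sub_volumes = [
        volume[:, :, d:d+patch, h:h+patch, w:w+patch]
        for d in range(0, volume.shape[2], patch)
        for h in range(0, volume.shape[3], patch)
        for w in range(0, volume.shape[4], patch)
    ]
    return sub_volumes, global_input
```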


Training Strategy of RGFNet. Our model was implemented in PyTorch and trained on four 32 GB NVIDIA Tesla V100 GPUs. We train our model using the AdamW [17] optimizer with a warm-up cosine scheduler, a batch size of 4 per GPU, an initial learning rate of 1 × 10−4, a momentum of 0.9 and a decay of 1 × 10−5 for 1K iterations.

4.3 Segmentation Results

We compare our method with 6 previous state-of-the-art methods on the IAS-3D-DSA dataset. The quantitative results are shown in Table 1. Compared with the other methods, our proposed RGFNet achieves the best performance. Our method obtains the best scores not only in Dice and mIoU but also a dramatic decrease in HD95, which demonstrates that our method obtains accurate segmentation results not only in the overlap area but also at the boundaries of the artery. Besides, as shown in Fig. 5, the results of our method show more continuous vascular structures with far fewer outliers than the other baseline methods.

Table 1. Performance on IAS-3D-DSA test dataset

Method           Dice    mIoU    HD95
TransUNet [8]    79.75   67.76   99.93
TransBTS [9]     90.38   82.91   49.62
UNetr [10]       86.65   77.12   96.07
SwinUNetr [11]   88.66   80.22   76.50
Swin UNet [14]   83.15   71.80   134.22
VT-UNet [15]     90.90   83.77   32.58
RGFNet (ours)    93.36   87.83   15.64

4.4 Ablation Study of RGFNet

In order to verify the effectiveness of the components of our proposed method, we perform an empirical study on RGFNet with different combinations of modules, as shown in Table 2. It demonstrates that using the region-global fuse block alone to fuse the bottleneck features of the region and global segmentation branches shows only slight improvements over the baseline and even slightly worse performance in HD95. However, when the region-global fuse block and the feature enhance blocks are combined, it shows an increase of 0.94% in Dice, 1.39% in mIoU and a significant decrease of 51.38% in HD95. The results demonstrate that dense fusion at every feature level is needed to enhance the features of the region segmentation branch.

4.5 Morphological Analysis Results

The morphological analysis results of two cases are demonstrated in Fig. 6.


Fig. 5. Qualitative visualization of the proposed RGFNet and other baseline methods. Three representative cases are shown.

Table 2. Ablation study on RGFNet

Region Segmentation Branch   Global Segmentation Branch   Region-Global Fuse Block   Feature Enhance Block   Dice    mIoU    HD95
✓                                                                                                            92.52   86.45   32.17
✓                            ✓                            ✓                                                  92.96   87.10   34.35
✓                            ✓                            ✓                          ✓                       93.36   87.83   15.64

The access diameter, the minimum inner diameter and the minimum radius of curvature are shown at the top of each morphological analysis result. It is difficult to obtain the exact values of these morphological parameters, so we also visualize the intermediate results in Fig. 6 to verify the effectiveness of our method. (1) For the access diameter, the purple point in Fig. 6(b) is located at the entrance of the vessel, so its diameter is the access diameter. (2) For the minimum inner diameter, the corresponding black boxes in Fig. 6(a) and Fig. 6(b) show that our method can find the narrowest section of the vessel, which has the minimum inner diameter. (3) For the minimum radius of curvature, Fig. 6(b) shows that our method can find the point (red) with the minimum radius of curvature. As for the accuracy of the calculated radius of curvature, because its error mainly comes from the fit of the parametric curve equation, we provide a close-up of the fitting curve in Fig. 6(c) together with the mean square error between the optimized curve equation and the local points to verify the accuracy of the radius of curvature obtained by our method.


(a) Segmentation Results  (b) Morphological Analysis Results  (c) Local Curve Fitting

Fig. 6. Morphological analysis results of two cases. The blue points are the vessels' skeleton points extracted from the volume data, the purple point is the entrance skeleton point of the intracranial artery, the orange points are the skeleton points with the minimum inner diameter and the red point is the skeleton point with the minimum radius of curvature, while the green points are its neighbors. The purple spline is the fitted parametric curve.

5 Conclusion

In this paper, we propose a two-stage framework for the segmentation and morphological analysis of the intracranial artery. The state-of-the-art performance on the IAS-3D-DSA dataset shows that the proposed model can effectively enrich local-region segmentation with global information. Moreover, the proposed analysis algorithm can obtain morphological information robustly. This framework can be used to help clinicians in preoperative planning and the selection of surgical instruments, reducing operation time and surgery cost.

Acknowledgements. This work was supported in part by the National Key Research and Development Program of China under Grant (2022YFB4700902), the National High Level Hospital Clinical Research Funding (2022-PUMCH-B-125), and the National Natural Science Foundation of China under Grant (62303463, U1913210, 62073325, 62003343, 62222316).

References 1. Nassef, A.M., Awad, E.M., El-bassiouny, A.A., et al.: Endovascular stenting of medically refractory intracranial arterial stenotic (ICAS) disease (clinical and sonographic study). Egypt. J. Neurol. Psychiatry Neurosurg. 56(1), 1–12 (2020)


2. Gao, P., Wang, T., Wang, D., et al.: Effect of stenting plus medical therapy vs medical therapy alone on risk of stroke and death in patients with symptomatic intracranial stenosis: the CASSISS randomized clinical trial. J. Am. Med. Assoc. 328(6), 534–542 (2022) 3. Ronneberger, O., Fischer, P., Brox, T.: U-Net: convolutional networks for biomedical image segmentation. In: Navab, N., Hornegger, J., Wells, W., Frangi, A. (eds.) MICCAI 2015, Part III. LNCS, vol. 9351, pp. 234–241. Springer, Cham (2015). https://doi.org/10.1007/978-3319-24574-4_28 4. Çiçek, Ö., Abdulkadir, A., Lienkamp, S.S., Brox, T., Ronneberger, O.: 3D U-Net: learning dense volumetric segmentation from sparse annotation. In: Ourselin, S., Joskowicz, L., Sabuncu, M., Unal, G., Wells, W. (eds.) MICCAI 2016, Part II. LNCS, vol. 9901, pp. 424–432. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46723-8_49 5. Balsiger, F., Soom, Y., Scheidegger, O., Reyes, M.: Learning shape representation on sparse point clouds for volumetric image segmentation. In: Shen, D., et al. (eds.) MICCAI 2019, Part II. LNCS, vol. 11765, pp. 273–281. Springer, Cham (2019). https://doi.org/10.1007/9783-030-32245-8_31 6. Milletari, F., Navab, N., Ahmadi, S.A..: V-Net: fully convolutional neural networks for volumetric medical image segmentation. In: 2016 Fourth International Conference on 3D Vision (3DV), pp. 565–571 (2016) 7. Chen, H., Dou, Q., Yu, L., et al.: VoxResNet: deep voxelwise residual networks for volumetric brain segmentation. arXiv preprint arXiv:1608.05895 (2016) 8. Chen, J., Lu, Y., Yu, Q., et al.: TransUNet: transformers make strong encoders for medical image segmentation. arXiv preprint arXiv:2102.04306 (2021) 9. Wang, W., Chen, C., Ding, M., Yu, H., Zha, S., Li, J.: TransBTS: multimodal brain tumor segmentation using transformer. In: de Bruijne, M., et al. (eds.) MICCAI 2021, Part I. LNCS vol. 12901, pp. 109–119. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-871932_11 10. Hatamizadeh, A., Tang, Y., Nath, V., et al.: UNETR: transformers for 3D medical image segmentation. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 574–584 (2022) 11. Tang, Y., Yang, D., Li, W., et al.: Self-supervised pre-training of swin transformers for 3D medical image analysis. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 20730–20740 (2022) 12. Dosovitskiy A., Beyer, L., Kolesnikov, A., et al.: An image is worth 16x16 words: transformers for image recognition at scale. arXiv preprint arXiv:2010.11929 (2020) 13. Liu, Z., Lin, Y., Cao, Y., et al.: Swin transformer: hierarchical vision transformer using shifted windows. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pp. 10012–10022 (2021) 14. Cao, H., et al.: Swin-Unet: Unet-like pure transformer for medical image segmentation. In: Karlinsky, L., Michaeli, T., Nishino, K. (eds.) ECCV 2022, Part III. LNCS vol. 13803, pp. 205– 218. Springer, Cham (2023). https://doi.org/10.1007/978-3-031-25066-8_9 15. Peiris, H., Hayat, M., Chen, Z., Egan, G., Harandi, M.: A robust volumetric transformer for accurate 3D tumor segmentation. In: Wang, L., Dou, Q., Fletcher, P.T., Speidel, S., Li, S. (eds.) MICCAI 2022, Part V. LNCS, vol. 13435, pp. 162–172. Springer, Cham (2022). https://doi.org/10.1007/978-3-031-16443-9_16 16. Palágyi, K., et al.: A sequential 3D thinning algorithm and its medical applications. In: Insana, M.F., Leahy, R.M. (eds.) IPMI 2001. LNCS, vol. 2082, pp. 
409–415. Springer, Heidelberg (2001). https://doi.org/10.1007/3-540-45729-1_42 17. Loshchilov, I., Hutter, F.: Decoupled weight decay regularization. arXiv preprint arXiv:1711. 05101 (2017)

Effective Domain Adaptation for Robust Dysarthric Speech Recognition

Shanhu Wang1, Jing Zhao1,2(B), and Shiliang Sun1,2

1 School of Computer Science and Technology, East China Normal University, Shanghai, China
[email protected], [email protected]
2 Key Laboratory of Advanced Theory and Application in Statistics and Data Science, Ministry of Education, Shanghai, China
[email protected]

Abstract. By transferring knowledge from abundant normal speech to limited dysarthric speech, dysarthric speech recognition (DSR) has witnessed significant progress. However, existing adaptation techniques mainly focus on the full leverage of normal speech, discarding the sparse nature of dysarthric speech, which poses a great challenge for DSR training in low-resource scenarios. In this paper, we present an effective domain adaptation framework to build robust DSR systems with scarce target data. A joint data preprocessing strategy is employed to alleviate the sparsity of dysarthric speech and close the gap between source and target domains. To enhance the adaptability to dysarthric speakers across different severity levels, the Domain-adapted Transformer model is devised to learn both domain-invariant and domain-specific features. All experimental results demonstrate that the proposed methods achieve impressive performance on both speaker-dependent and speaker-independent DSR tasks. Particularly, even with half of the target training data, our DSR systems still maintain high accuracy on speakers with severe dysarthria.

Keywords: Domain Adaptation · Dysarthric Speech Recognition · Low-Resource Speech

1 Introduction

Benefiting from large-scale speech datasets and advanced machine learning approaches, Automatic Speech Recognition (ASR) has achieved excellent recognition performance in speech-based interaction. However, there still exist huge challenges in boosting dysarthric speech recognition (DSR) to help individuals with dysarthria, who are generally characterized by effortful, slurred, slow or prosodically abnormal speech that is indiscernible to human listeners [22]. A deep neural network (DNN) based ASR framework well trained on normal speech cannot be suitably applied to the DSR task with a small amount of dysarthric speech, in which the high inter- and intra-speaker variability severely hinders the modeling of dysarthric speech features.


Previous works have made great efforts to tackle these problems, where large amounts of out-of-domain data are leveraged for transfer learning due to the lexical knowledge shared with dysarthric speech. As one prominent form of transfer learning, domain adaptation aims to reduce the mismatch between normal speech (source domain) and dysarthric speech (target domain), which draws the interest of many researchers. Among the existing adaptation methods for DSR, most rely on at least one of the following adaptation mechanisms: simulating dysarthric speech for data augmentation, or modeling the latent features of dysarthric speech in the acoustic space.

For the aspect of data inconsistency, data augmentation has been widely exploited to generate additional intermediate-domain data and formulate an explicit transformation from typical speech to atypical speech. Vachhani et al. [22] modify the tempo and speed of healthy speech to simulate dysarthric speech with different severity levels. Similarly, the non-linear speech tempo transformations in [25] are employed to adjust typical speech towards dysarthric speech for personalized DSR training. Soleymanpour et al. [19] synthesize dysarthric speech through multi-speaker text-to-speech technology to enrich training data. Additionally, the conventional data combination scheme [4] makes full use of out-of-domain knowledge to enhance DSR, but it is not always feasible due to the size and structure differences between source and target data. Therefore, the potentially beneficial source domain data is selected by the entropy of posterior probability for more efficient transfer learning towards the target domain [26].

The methods mentioned above attach great importance to the utilization of source data. Another line is the model-centric approach [15], which focuses on feature extraction, loss functions or model architecture. For example, previous works [3,18] have investigated speaker adaptation technologies like maximum a posteriori [7] or maximum likelihood linear regression [12] to build speaker-adapted systems, which require a baseline speaker-independent (SI) system for fine-tuning. In general, the purpose of these methods is to minimize the mismatch between a generic baseline SI model and the intended target speaker [18], where the performance of the SI system is not as good as that of the speaker-dependent (SD) system. Besides, DNN weight adaptation [8] is proposed to add new task-specific adaptation layers over the internal layers of a DNN trained on a large dataset for transfer learning. Recently, hybrid adaptation methods combining data augmentation with elaborate paradigms have obtained promising DSR performance. The contrastive learning of multiple data augmentations is implemented with an attentive temporal pyramid network to learn powerful speech representations [24]. Multi-task learning [5] based on the DSR task and the reconstruction of simulated dysarthric speech is proposed to explicitly learn robust intermediate representations and better adapt to the target domain. On the contrary, another line of work concentrates on the target-to-source transformation, adopting a denoising autoencoder [1] to enhance dysarthric speech to match healthy control speech, which is empirically more difficult than source-to-target adaptation.

Nowadays, there is a growing need to investigate DSR systems with limited target speech data to reduce specific speaker dependency, which is conducive to generalization over dysarthric speakers with various severity degrees. Although data augmentation and data selection on the source domain can improve DSR performance, they in fact only change the source data distribution, without modifying the sparsity of the target data distribution. Moreover, the scale difference between source and target data is even greater in low-resource scenarios. Therefore, in this paper, we present effective hybrid domain adaptation methods to build robust DSR systems that are more suitable for low-resource scenarios. The data balance strategy is designed to mitigate the data distribution shift between source and target domains, while the imbalance among different speakers in the target domain is also considered. Besides, our proposed data augmentation is integrated with data balance as a joint data preprocessing strategy to overcome the sparsity and feature inconsistency of target data. We further explore multiple encoders with domain feature fusion to perceive domain-specific features, enhancing DSR for all levels of dysarthria. Experimental results demonstrate that our proposed domain adaptation framework is conducive to achieving advanced and robust DSR performance, especially when half of the target training data is exploited to simulate extremely low-resource scenarios.

2 Proposed Domain Adaptation Methods

In this section, our proposed data preprocessing and model architecture are described in turn; they aim to make full use of the limited target data and perform effective domain adaptation for robust DSR systems. Figure 1 depicts the overall structure of the proposed Domain-adapted Transformer (DA Transformer), as well as the initial data preprocessing.

2.1 Data Preprocessing

It is well known that normal speech and dysarthric speech differ significantly in pronunciation characteristics and data size, which leads to an obvious inconsistency between the distributions of source and target data. When incorporating ample normal speech for domain adaptation, previous data augmentation and data selection methods do not change the sparse nature of the target data. As a consequence, DSR models tend to exhibit biases towards the source or intermediate domain data during training. This point is often overlooked and hinders models from perceiving the unique features of dysarthric speech. Therefore, we propose data-centric domain adaptation methods that involve data balance and data augmentation to mitigate the distribution shift between the source and target domains.

Data Balance. To relieve the imbalance between source and target data, as well as the imbalance among dysarthric speakers with different severity levels in the target domain, a weighted random data sampler is designed to sample target utterances with higher weights. The sampling weights of each speaker's utterances from the source and target domains are formulated as follows:

$$p_s^i = \left(\frac{N}{N_s}\right)^{1-\tau}, \qquad p_t^i = \left(\frac{N}{N_t}\right)^{1-\tau} \left(\frac{N_t}{n_t^i}\right)^{\tau}, \qquad (1)$$


where $N_s$ and $N_t$ denote the total utterance numbers of the source and target domains respectively, with $N = N_s + N_t$; $n_t^i$ indicates the utterance number of the $i$-th dysarthric speaker, and $\tau$ is a balance factor to avoid an excessively high weight when fewer target utterances are trained. In this way, the target domain data can be sampled preferentially and repeatedly to overcome its sparsity.

Data Augmentation. Previous studies [22,25] have shown the effectiveness of tempo adjustment for the DSR task, so we are motivated to construct disturbances along the time direction of the frame sequence in the source domain. The devised augmentation method is called variable time masking (VTM), which further reduces the distribution distance between source and target data. Specifically, we randomly mask a variable number of frame blocks with the mean of the whole log mel spectrogram. The starting and ending positions of the $k$-th masked frame block are acquired by a random function (Rand):

$$N_{mask} = L_x / 10 \cdot \rho, \qquad start_k = \mathrm{Rand}(0, L_x - T), \qquad end_k = start_k + \mathrm{Rand}(0, T), \qquad k = 1, \ldots, N_{mask}. \qquad (2)$$

$N_{mask}$ is the variable number of frame blocks to mask, $L_x$ denotes the total length of the source frame sequence $x$, and $\rho$ is the masking rate. $T$ denotes the maximum length of a masked frame block, set to 40. Regarding data preprocessing as a whole, we argue that only by resolving the sparsity of the target-domain data can the source data distribution be matched to the target data distribution as closely as possible with the help of data augmentation. In this way, the DSR system can overcome under-fitting or over-fitting problems [20] and maximize the perception of domain-invariant features, which is an important premise for sensing domain-specific features.
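A sketch of this joint preprocessing is given below: the per-utterance sampling weights follow Eq. (1) and the masking follows Eq. (2). The sampler wiring, the source-first ordering of utterances and all names are illustrative assumptions, not the authors' exact implementation.

```python
# Sketch of the joint data preprocessing: Eq. (1) sampling weights + Eq. (2) VTM.
import random
import torch
from torch.utils.data import WeightedRandomSampler

def sampling_weights(n_source, n_target_per_speaker, tau=0.4):
    # assumes the dataset lists all source utterances first, then target utterances
    N_s, N_t = n_source, sum(n_target_per_speaker)
    N = N_s + N_t
    w_source = [(N / N_s) ** (1 - tau)] * N_s
    w_target = []
    for n_i in n_target_per_speaker:          # one entry per dysarthric speaker
        w_target += [((N / N_t) ** (1 - tau)) * ((N_t / n_i) ** tau)] * n_i
    return torch.tensor(w_source + w_target, dtype=torch.double)

def variable_time_masking(logmel, rho=0.05, T=40):
    """logmel: (L, n_mels) array of a source-domain utterance; returns a masked copy."""
    L = logmel.shape[0]
    n_mask = int(L / 10 * rho)
    fill = logmel.mean()
    out = logmel.copy()
    for _ in range(n_mask):
        start = random.randint(0, max(L - T, 0))
        end = start + random.randint(0, T)
        out[start:end] = fill
    return out

# weights = sampling_weights(10894, per_speaker_counts, tau=0.4)  # hypothetical counts
# sampler = WeightedRandomSampler(weights, num_samples=len(weights), replacement=True)
```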

2.2 Domain-Adapted Transformer

In order to improve the adaptability to dysarthric speakers with different severity levels, we expect DSR systems to learn the unique features of source and target data during training. Motivated by the Mixture-of-Experts [6], we design the multi-encoder DA Transformer to perform a domain feature fusion mechanism, which aims to better model the inter- and intra-speaker variability in dysarthric speech.

As Fig. 1 shows, we adopt the joint CTC-attention-based Transformer framework [9] as the backbone, whose attention mechanism is able to capture long-term dependencies in parallel [14]. After data preprocessing, the acoustic feature x of healthy control speech or dysarthric speech is sent to the Transformer-based shared encoder and two specific encoders to learn different acoustic representations zs and zt. The shared encoder is designed to extract shallow speech features, which correspond to the lower layers of the Transformer encoder.


Fig. 1. The whole pipeline of our proposed domain adaptation framework with the data preprocessing and the Domain-adapted Transformer.

The two specific encoders, which correspond to the upper layers of the Transformer encoder and are referred to as the source encoder and the target encoder, learn domain-specific features under the driving of the fusion weight. Domain feature fusion is implemented by integrating $z_s$ and $z_t$ through a fusion weight learned from the domain classifier, which takes the acoustic representation $z_t$ as input and outputs the probability $p_{dom}$ that determines how much information of the speech representation stems from the target domain. The domain classifier is composed of a Bi-LSTM [2], followed by a linear projection and a sigmoid function applied to the last frame. The domain labels are 0 (source) and 1 (target). We regard the domain probability $p_{dom}$ as a query mapping the output $z_t$ of the target encoder via a dot product. Through a linear projection and a sigmoid function, the fusion weight $w$ is learned. The final acoustic representation $z$ is obtained by fusing the two specific encoder outputs $z_t$ and $z_s$ as follows:

$$w = \mathrm{Sigmoid}[\mathrm{Linear}(z_t \cdot p_{dom})], \qquad z = w \cdot z_t + (1 - w) \cdot z_s. \qquad (3)$$
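A compact PyTorch sketch of this fusion is shown below. The frame-wise fusion weight, the Bi-LSTM size and the way $p_{dom}$ scales $z_t$ are simplifications of the description above, so the code should be read as an illustration rather than the exact model.

```python
# Sketch of the domain feature fusion in Eq. (3); hidden sizes are illustrative.
import torch
import torch.nn as nn

class DomainFeatureFusion(nn.Module):
    def __init__(self, d_model=256, clf_hidden=128):
        super().__init__()
        self.domain_clf = nn.LSTM(d_model, clf_hidden, batch_first=True, bidirectional=True)
        self.clf_head = nn.Linear(2 * clf_hidden, 1)
        self.fusion_head = nn.Linear(d_model, 1)

    def forward(self, z_s, z_t):
        # z_s, z_t: (B, T, d_model) outputs of the source / target encoders
        h, _ = self.domain_clf(z_t)
        p_dom = torch.sigmoid(self.clf_head(h[:, -1]))                  # (B, 1) domain probability
        w = torch.sigmoid(self.fusion_head(z_t * p_dom.unsqueeze(1)))   # (B, T, 1) fusion weight
        z = w * z_t + (1 - w) * z_s                                     # Eq. (3)
        return z, p_dom
```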

It is precisely these different fusion weights that enable the model to adapt to dysarthric speakers with different severity levels. The fused acoustic representation $z$ is then sent to the CTC decoder and the Transformer decoder for the generation of the text token sequence. The CTC decoder performs monotonic alignment between the acoustic representation and the text [23]. The Transformer decoder autoregressively generates the hypothesis $y_{1:t}$ according to the predicted text tokens $y_{1:t-1}$ and the global acoustic representation $z$. The DA Transformer is optimized by the following loss function, which consists of the attention loss, the domain classification loss and the CTC loss, and can be written as


the weighted sum of three negative log-likelihoods:

$$L = (1-\lambda)(L_{att} + L_{dom}) + \lambda L_{ctc} = (1-\lambda)\big(-\log p_{att}(y_{1:t}\,|\,y_{1:t-1}, z) - \log p_{dom}(z\,|\,c)\big) + \lambda\big(-\log p_{ctc}(y_{1:t}\,|\,y_{1:t-1}, z)\big), \qquad (4)$$

where λ is a coefficient and set to 0.3 empirically. The Transformer benchmark for comparison neglects the domain classification loss.
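A minimal sketch of Eq. (4) as a weighted combination of precomputed component losses might look as follows; the way the attention log-probability, domain probability and CTC loss are produced by the decoders is assumed to be handled elsewhere.

```python
# Sketch of the Eq. (4) objective with lambda = 0.3.
import torch.nn.functional as F

def dsr_loss(att_logprob, dom_prob, dom_label, ctc_loss, lam=0.3):
    # att_logprob: log p_att(y | y_prev, z) summed over the target tokens
    # dom_prob:    p_dom from the domain classifier; dom_label: float tensor in {0., 1.}
    # ctc_loss:    precomputed CTC negative log-likelihood (e.g. from nn.CTCLoss)
    l_att = -att_logprob
    l_dom = F.binary_cross_entropy(dom_prob, dom_label)
    return (1 - lam) * (l_att + l_dom) + lam * ctc_loss
```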

3 Experiments

3.1 Data Description

We first exploit the TED-LIUM v2 [16] corpus to pre-train the Transformer benchmark [9] for a typical ASR task; the corpus contains around 200 hours of clear English TED talks. The fine-tuning experiments are conducted on the TORGO [17] corpus, which consists of about 15 h of aligned acoustic and articulatory recordings from 7 healthy control speakers and 8 dysarthric speakers. The healthy control speech is considered the source data and the dysarthric speech the target data, including 5500 dysarthric and 10894 normal speech utterances. The severity degree of the disorder for each of the dysarthric speakers was grouped into four types, severe, severe-moderate, moderate and mild, evaluated by a speech-language pathologist based on overall clinical intelligibility. We fine-tune and evaluate the speaker-dependent and the speaker-independent DSR with two data settings:

– Speaker-dependent (SD): the healthy control speech and the dysarthric speech are both divided into a training set and a test set with a 4:1 split, keeping the proportion of each speaker's utterance number unchanged.
– Speaker-independent (SI): the training data contains the other 14 speakers except for the particular target dysarthric speaker used in the test stage. Hence, the SI system is personalized for each dysarthric speaker.

The 80-dimensional acoustic features are extracted by filterbank (Fbank) technology as input sequences for all the experiments. The widely used character error rate (CER) and word error rate (WER) metrics for each dysarthric speaker are calculated for the performance estimation of DSR systems.
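For reference, WER can be computed with a standard edit-distance routine such as the sketch below (CER is the same computation over characters); this is a generic implementation, not the toolkit the authors used.

```python
# Word error rate via Levenshtein distance between reference and hypothesis.
def wer(reference: str, hypothesis: str) -> float:
    ref, hyp = reference.split(), hypothesis.split()
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = d[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            d[i][j] = min(sub, d[i - 1][j] + 1, d[i][j - 1] + 1)
    return d[len(ref)][len(hyp)] / max(len(ref), 1)
```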

3.2 Experimental Setup

Model Configurations. All experiments are conducted on the Transformer benchmark [9], which includes a 12-layer Transformer encoder and a 6-layer Transformer decoder with a hidden dimension of 256 and 8 self-attention heads following [10]. It is pre-trained on the TED-LIUM v2 corpus for 50 epochs on 2x NVIDIA RTX 2080Ti GPUs with a minibatch size of 32.


For the speaker-dependent DSR task, we first evaluate the effect of the proposed data preprocessing on the SD data setting by fine-tuning the Transformer benchmark with different data preprocessing for 100 epochs with a minibatch size of 16. Then we further compare our DA Transformer with other models to verify the effect of domain feature fusion. For the DA Transformer, we initialize the shared encoder and the two specific encoders using the parameters of the lower 6 layers and upper 6 layers of the pre-trained Transformer encoder respectively, as well as the Transformer decoder and CTC decoder. The domain classifier has a one-layer Bi-LSTM with a hidden dimension of 128, whose parameters are randomly initialized. Additionally, the Transformer benchmark with domain classifier (DC) and the Transformer benchmark with domain adversarial training (DAT) [21] are designed for comparison of the learned features. DAT is implemented by inserting a gradient reverse layer [21] between the Transformer encoder and the following domain classifier to reverse the gradient from the domain classifier. On the basis of data preprocessing, we fine-tune these models for 100 epochs with a minibatch size of 16. Finally, we evaluate the hybrid domain adaptation framework with a combination of the proposed data preprocessing and the DA Transformer on the speaker-independent DSR task. For all experiments, we compare the 50%-shot and 100%-shot settings to further verify the robustness of our DSR systems.

Parameter Settings. The τ is set to 0.8 and 0.4 in the 50%-shot and 100%-shot settings respectively. In the 50%-shot setting, the source data becomes relatively large when the target data is halved, so the sampling weights of the target data are amplified during data balance. To avoid excessive sampling of target data, we reduce the value of 1 − τ by setting τ to 0.8 in the 50%-shot setting. Similarly, the masking rate ρ of VTM is set to 10% in the 50%-shot setting and 5% in the 100%-shot setting, because more disturbances of the healthy control speech are needed to map it to dysarthric speech in the low-resource case.

3.3 Evaluation on Speaker-Dependent DSR

We first evaluate our proposed domain adaptation framework on the SD task. The comparisons of different data preprocessing methods are shown in Table 1, and the comparisons of different models are shown in Table 2.

Effect of Data Preprocessing. Table 1 demonstrates that our proposed data preprocessing mitigates the distribution shift between source and target data. With the data balance strategy, an obvious WER reduction is achieved in the 50%-shot setting. In order to analyze the effect of our VTM augmentation, we also make a comparison with TM [13], which masks a fixed number of frame blocks. The results in the last row show that our variable-number masked frame blocks are more conducive to bridging the huge gap between source and target data. Although the Transformer baseline with TM outperforms that with VTM when only data augmentation is considered, the VTM augmentation on source data turns the tide when combined with the data balance strategy. This phenomenon reveals that after data balance, VTM can easily transfer source-domain features to an intermediate domain closer to the target domain. But before the sparse target data distribution is modified, the model pays more attention to the source domain, resulting in excessive interference from VTM on the source data. This explains why the data balance and the VTM augmentation should be jointly utilized. Our data preprocessing finally achieves a 48% CER reduction and a 51% WER reduction in the 50%-shot setting, which indicates that our method can overcome the difficulty of extremely imbalanced training data.

Table 1. Illustration of all CER and WER results on data preprocessing for the speaker-dependent DSR task, where 'Da_ba' and 'Da_aug' indicate data balance and data augmentation, 'Trans' denotes the Transformer benchmark and 'TM' denotes the time masking augmentation in [13].

Model   Da_ba   Da_aug     100%-shot CER/WER   50%-shot CER/WER
Trans   –       –          3.1 / 3.8           7.9 / 8.8
Trans   –       TM [13]    3.1 / 3.2           4.9 / 5.3
Trans   ✓       TM [13]    2.8 / 3.0           4.2 / 4.6
Trans   ✓       –          3.2 / 3.7           5.6 / 6.0
Trans   –       VTM        3.1 / 3.8           5.1 / 5.4
Trans   ✓       VTM        2.6 / 3.0           4.1 / 4.3

Table 2. Illustration of all CER and WER results on different models for the speaker-dependent DSR task, 'Trans+DAT' denotes Transformer benchmark with domain adversarial training [21] as a comparative model, 'Trans+DC' denotes Transformer benchmark with domain classifier to ablate domain feature fusion mechanism.

Model            Da_ba   Da_aug   100%-shot CER/WER   50%-shot CER/WER
Trans            ✓       VTM      2.6 / 3.0           4.1 / 4.3
Trans+DAT [21]   ✓       VTM      2.9 / 3.1           4.0 / 4.3
Trans+DC         ✓       VTM      2.6 / 3.0           3.6 / 3.9
DA_Trans         ✓       VTM      2.4 / 2.8           3.4 / 3.6

Effect of Domain-Adapted Transformer. Table 2 shows the CER and WER results of the ablation and comparative models. On the basis of data preprocessing, our DA_Trans achieves a remarkable CER reduction to 3.4% and a WER reduction to 3.6% in the 50%-shot setting. To further analyze the role of domain feature fusion in our DA_Trans, we list the per-speaker WER results in the 50%-shot setting in Table 3.


The Trans+DAT model is designed to learn domain-invariant features, and it obviously achieves the lowest WER on the severe-moderate speaker. Generally, mild dysarthric speech is much closer to the source domain, while severe dysarthric speech is far away from it. Therefore, domain-invariant features are limited in modeling the speaker variability of dysarthric speech. The Trans+DC model achieves lower WER on both severe and mild dysarthric speakers, which demonstrates that the domain classifier is able to capture the specific features of the source and target domains simultaneously. More importantly, we can clearly observe that DA_Trans achieves lower WER than Trans+DC on the severe (M01, M02) and moderate (F03) dysarthric speakers, which indicates that the domain feature fusion mechanism can better model both the inter- and intra-speaker variability of dysarthric speech. Overall, our domain adaptation framework concentrates more on learning complementary features of the source and target domains.

Table 3. Illustration of WER for the speaker-dependent DSR varying across dysarthric speakers and different models in the 50%-shot setting. 'Transformer' is abbreviated as 'Trans'.

Severity      Speaker   Trans   Trans+DAT   Trans+DC   DA_Trans
Severe        F01       0.0     1.7         0.0        0.0
              M01       5.4     4.6         3.8        3.5
              M02       2.1     2.8         3.4        1.8
              M04       6.6     6.3         4.6        5.4
Severe-Mod.   M05       11.2    10.2        10.5       10.5
Moderate      F03       4.5     5.6         5.4        4.5
Mild          F04       3.3     2.5         1.9        1.9
              M03       0.5     1.5         0.3        0.3
Average                 4.3     4.3         3.9        3.6

3.4 Evaluation on Speaker-Independent DSR

The effectiveness of the data preprocessing and the DA Transformer model has been evaluated in detail in the speaker-dependent DSR task above. Here, we further apply the hybrid domain adaptation framework to the SI system, whose WER results across dysarthric speakers are shown in Fig. 2. It can be seen that the impact of reducing the training data is loosely correlated with the severity of the speaker's impairment. Ordinarily, data from other dysarthric speakers improve the recognition accuracy for speakers with moderate or severe dysarthria [26].


Fig. 2. Comparison of the speaker-independent DSR system in the 100%-shot and 50%-shot settings and the strong Transformer baseline [24], marked with 'Trans', in terms of WER.

So the recognition performance of speakers with severe and severe-moderate dysarthria is more affected by the reduction of training data than that of mild dysarthric speakers. Nevertheless, there is no notable decline in recognition performance in the 50%-shot setting for our SI system, which even outperforms the strong Transformer baseline [24] trained in the 100%-shot setting on speakers with severe and moderate dysarthria. Therefore, we conclude that our methods can mitigate the negative impact on learning domain-specific features caused by the deficiency of target data. Moreover, our SI system is fine-tuned for only 50 epochs, which consumes less time and fewer resources than the strong baseline fine-tuned with a wider network for 100 epochs.

4 Conclusions

In this paper, we propose a hybrid domain adaptation framework to improve the performance of DSR in the low-resource case. We investigate the data mismatch between source and target domains and provide feasible solutions from different perspectives. We first present the joint strategy of data balance and data augmentation to relieve the performance degradation caused by the deficiency of dysarthric data. Then the DA Transformer with the domain feature fusion mechanism is demonstrated to successfully capture domain-specific features while sensing domain-invariant features, enhancing the generalization of DSR systems over various dysarthric speakers. For future work, we intend to explore continual learning for personalized DSR to better reflect the speech impairment characteristics of individuals, and to investigate unsupervised domain adaptation without target labels [11] for the generalization of DSR.


Acknowledgements. This work was supported by the STCSM Project 22ZR1421700, NSFC Projects 62006078 and 62076096, Shanghai Knowledge Service Platform Project ZF1213, the Open Research Fund of KLATASDS-MOE in ECNU and the Fundamental Research Funds for the Central Universities.

References 1. Bhat, C., Das, B., Vachhani, B., Kopparapu, S.K.: Dysarthric speech recognition using time-delay neural network based denoising autoencoder. In: Proceedings of INTERSPEECH 2018, pp. 451–455, September 2018 2. Chan, W., Jaitly, N., Le, Q.V., Vinyals, O.: Listen, attend and spell: a neural network for large vocabulary conversational speech recognition. In: Proceedings of ICASSP 2016, pp. 4960–4964, March 2016. https://doi.org/10.1109/ICASSP.2016. 7472621 3. Christensen, H., et al.: Combining in-domain and out-of-domain speech data for automatic recognition of disordered speech. In: Proceedings of INTERSPEECH 2013, pp. 3642–3645, August 2013 4. Deng, L., Li, X.: Machine learning paradigms for speech recognition: an overview. IEEE Trans. Speech Audio Process. 21(5), 1060–1089 (2013) 5. Ding, C., Sun, S., Zhao, J.: Multi-task transformer with input feature reconstruction for dysarthric speech recognition. In: Proceedings of ICASSP 2021, pp. 7318– 7322, June 2021 6. Gaur, N., et al.: Mixture of informed experts for multilingual speech recognition. In: Proceedings of ICASSP 2021, pp. 6234–6238, June 2021 7. Gauvain, J., Lee, C.: Maximum a posteriori estimation for multivariate gaussian mixture observations of markov chains. IEEE Trans. Speech Audio Process. 2(2), 291–298 (1994) 8. Ghahremani, P., Manohar, V., Hadian, H., Povey, D., Khudanpur, S.: Investigation of transfer learning for ASR using LF-MMI trained neural networks. In: Proceedings of ASRU 2017, pp. 279–286, December 2017 9. Karita, S., Soplin, N.E.Y., Watanabe, S., Delcroix, M., Ogawa, A., Nakatani, T.: Improving transformer-based end-to-end speech recognition with connectionist temporal classification and language model integration. In: Proceedings of INTERSPEECH 2019, pp. 1408–1412. ISCA, September 2019 10. Karita, S., et al.: A comparative study on transformer vs RNN in speech applications. In: Proceedings of ASRU 2019, pp. 449–456, December 2019. https://doi. org/10.1109/ASRU46091.2019.9003750 11. Kouw, W.M., Loog, M.: A review of domain adaptation without target labels. IEEE Trans. Pattern Anal. Mach. Intell. 43(3), 766–785 (2021) 12. Leggetter, C.J., Woodland, P.C.: Maximum likelihood linear regression for speaker adaptation of continuous density hidden markov models. Comput. Speech Lang. 9(2), 171–185 (1995) 13. Park, D.S., et al.: Specaugment: a simple data augmentation method for automatic speech recognition. In: Proceedings of INTERSPEECH 2019, pp. 2613–2617, September 2019 14. Qin, Y., Ding, J., Sun, Y., Ding, X.: A transformer-based model for low-resource event detection. In: Mantoro, T., Lee, M., Ayu, M.A., Wong, K.W., Hidayanto, A.N. (eds.) ICONIP 2021. LNCS, vol. 13111, pp. 452–463. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-92273-3_37


15. Ramponi, A., Plank, B.: Neural unsupervised domain adaptation in NLP - a survey. In: Proceedings of COLING 2020, pp. 6838–6855. International Committee on Computational Linguistics, December 2020 16. Rousseau, A., Deléglise, P., Estève, Y.: Enhancing the TED-LIUM corpus with selected data for language modeling and more TED talks. In: Proceedings of LREC 2014 - Proceedings of the Ninth International Conference on Language Resources and Evaluation, pp. 3935–3939. European Language Resources Association (ELRA), May 2014 17. Rudzicz, F., Namasivayam, A.K., Wolff, T.: The TORGO database of acoustic and articulatory speech from speakers with dysarthria. Lang. Resour. Eval. 46(4), 523–541 (2012) 18. Sehgal, S., Cunningham, S.P.: Model adaptation and adaptive training for the recognition of dysarthric speech. In: Proceedings of INTERSPEECH 2015, pp. 65–71. Association for Computational Linguistics, September 2015 19. Soleymanpour, M., Johnson, M.T., Soleymanpour, R., Berry, J.: Synthesizing dysarthric speech using multi-speaker tts for dysarthric speech recognition. In: Proceedings of ICASSP 2022, pp. 7382–7386, May 2022. https://doi.org/10.1109/ ICASSP43922.2022.9746585 20. Sun, S., Zhao, J.: Pattern Recognition and Machine Learning. Tsinghua University Press, China (2020) 21. Sun, S., Yeh, C., Hwang, M., Ostendorf, M., Xie, L.: Domain adversarial training for accented speech recognition. In: Proceedings of ICASSP 2018, pp. 4854–4858, April 2018 22. Vachhani, B., Bhat, C., Kopparapu, S.K.: Data augmentation using healthy speech for dysarthric speech recognition. In: Proceedings of INTERSPEECH 2018, pp. 471–475, September 2018 23. Watanabe, S., Hori, T., Kim, S., Hershey, J.R., Hayashi, T.: Hybrid CTC/attention architecture for end-to-end speech recognition. IEEE J. Sel. Top. Sig. Process. 11(8), 1240–1253 (2017) 24. Wu, L., Zong, D., Sun, S., Zhao, J.: A sequential contrastive learning framework for robust dysarthric speech recognition. In: Proceedings of ICASSP 2021, pp. 7303–7307, June 2021 25. Xiong, F., Barker, J., Christensen, H.: Phonetic analysis of dysarthric speech tempo and applications to robust personalised dysarthric speech recognition. In: Proceedings of ICASSP 2019, pp. 5836–5840, May 2019 26. Xiong, F., Barker, J., Yue, Z., Christensen, H.: Source domain data selection for improved transfer learning targeting dysarthric speech recognition. In: Proceedings of ICASSP 2020, pp. 7424–7428, May 2020. https://doi.org/10.1109/ ICASSP40776.2020.9054694

TCNet: Texture and Contour-Aware Model for Bone Marrow Smear Region of Interest Selection Chengliang Wang1(B) , Jian Chen1 , Xing Wu1(B) , Zailin Yang2 , Longrong Ran2 , and Yao Liu2 1 College of Computer Science, Chongqing University, Chongqing 400000, China

{wangcl,wuxing}@cqu.edu.cn

2 Department of Hematology-Oncology, Chongqing Key Laboratory of Translational Research

for Cancer Metastasis and Individualized Treatment, Chongqing University Cancer Hospital, Chongqing 400000, China {zailinyang,liuyao77}@cqu.edu.cn

Abstract. Bone marrow smear cell morphology is the quantitative analysis of bone marrow cell images. Due to the cell overlap and adhesion in bone marrow smears, it is essential to select uniformly distributed and clear sections as regions of interest (ROIs). However, current ROI selection models have not considered the characteristics of bone marrow smears, resulting in poor performance in practical applications. By comparing bone marrow smear ROIs and non-ROIs, we have identified significant differences in fundamental features, such as texture and contour. Therefore, we propose a texture and contour-aware bone marrow smear ROI selection model (TCNet). Inspired by multi-task learning, this model enhances its feature extraction capabilities for texture and contour by constructing different prediction modules to learn feature representations of texture and contour, and applying multi-level deep supervision with pseudo labels. To validate the effectiveness of the proposed method, we evaluate it on a self-built dataset. Experimental results show that the proposed model achieves a 2.22% improvement in classification accuracy compared to the baseline model. In addition, we verify the proposed module’s generalizability by testing it on different backbone networks, and the results demonstrate its strong universality. Keywords: Deep Learning · Bone Marrow Smear · Region of Interest · Cell Morphology · Deep Supervision

1 Introduction

Cell morphology is the detailed observation and analysis of the morphology, size, proportion, and staining characteristics of cells in blood and bone marrow smears to detect and classify blood cells, providing doctors with information on patients' blood function and pathological conditions. Current smear examinations are mainly performed manually through microscopic observation and analysis. With the development of imaging technology, high-resolution digital scans of smears, such as whole slide images (WSIs), can be obtained using automatic scanners, followed by manual observation and analysis of the scanned images. Both methods heavily rely on manual labor, making them time-consuming and labor-intensive. Consequently, there is a significant demand for intelligent image detection approaches.

A large number of studies have been conducted on artificial intelligence-assisted morphological analysis of peripheral blood smears, and satisfactory results have been achieved [1–4]. Compared to peripheral blood smears, bone marrow smears are more complex cytological specimens with more impurities in the background, more diverse cell types and morphologies, higher cell density, and more severe cell adhesion [5]. These issues lead to fewer regions suitable for cytomorphological analysis in bone marrow smears, necessitating the introduction of ROI selection. In fact, only a small portion of the area in whole slide images (WSIs) obtained from electron microscope scans is suitable for cell morphological analysis. In addition, the dimensions of bone marrow smear WSIs are substantial, resulting in a large number of smaller patches after splitting. Currently, the majority of the ROI selection process is carried out manually, resulting in a considerable expenditure of time and effort. Consequently, deep learning-based ROI selection methods have gradually been applied to bone marrow smears to improve the efficiency of ROI selection. However, these methods do not consider the characteristics of bone marrow smears, especially in cases of cell adhesion, resulting in poor practical performance.

Fig. 1. ROI selection process and comparison between ROI and non-ROI: (a) the ROI image, (b) the ROI texture image, (c) the ROI contour image, (d) the non-ROI image, (e) the non-ROI texture image, and (f) the non-ROI contour image.

By comparing the ROI and non-ROI in bone marrow smears, we found significant differences in texture and contour between the two types of images. The ROI (Fig. 1(a)) has a clear background, while the non-ROI (Fig. 1(d)) contains a large amount of impurities in the background. From the corresponding texture images (Fig. 1(b) and Fig. 1(e)), these impurities and cells show significant differences in texture. In addition, the cells in the ROI (Fig. 1(a)) are evenly distributed and have clear contours (Fig. 1(c)), while the cells in the non-ROI (Fig. 1(d)) suffer from severe adhesion and have blurred contours (Fig. 1(f)). Based on the above analysis, texture and contour are the key features for distinguishing between ROI and non-ROI, and the model should strengthen its ability to extract texture and contour features.

In this study, we design a texture and contour-aware ROI classification model for bone marrow smears, named TCNet. This model utilizes texture and contour pseudo labels, conducts texture and contour deep supervision at multiple network levels, and thereby enhances the network's ability to extract texture and contour features from bone marrow smears. To achieve this goal, the following work was carried out:

• We designed a Texture Deep Supervision Module (TDSM) that extracts and predicts texture by calculating feature covariance, using texture pseudo labels for deep supervision to optimize texture extraction.
• We designed a Contour Deep Supervision Module (CDSM) that introduces reverse attention to enhance contour sensitivity and perform contour prediction, using contour pseudo labels for deep supervision to optimize contour extraction.
• We constructed a bone marrow smear ROI selection dataset to validate the effectiveness of the proposed TCNet. In addition, TCNet was built on different backbone networks to verify the universality of the proposed modules.

2 Related Work

2.1 Bone Marrow Smear Cell Morphology

In bone marrow smear cell morphology research, two main approaches are employed: traditional image-based methods and deep learning methods. Among traditional image-based methods, Theera-Umpon et al. [6] and Pergad et al. [7] employed mathematical morphology and a novel neural network classifier, respectively, to classify white blood cells in bone marrow smears. However, these methods rely on manual design and exhibit poor stability. Among deep learning methods, Chandradevan et al. [8] used Faster R-CNN to detect cells in bone marrow smears and classified the detected cells into 12 categories using the VGG network. Matek et al. [9] constructed a large-scale bone marrow smear classification dataset and classified cells into 21 categories using ResNeXt-50. Guo et al. [10] employed a class-balanced approach to address the class imbalance issue in bone marrow cell classification, achieving recognition of 15 bone marrow cell types. These studies focus primarily on the detection and classification of cells in bone marrow smears, with samples obtained from images after manual ROI selection.

2.2 Bone Marrow Smear ROI Selection

Currently, in bone marrow smear morphology analysis, ROI selection is primarily done manually. However, with advancements in technology, ROI selection methods based on Convolutional Neural Networks (CNNs) have gradually been applied to bone marrow smears. Wang et al. [11] used Cascade R-CNN to detect rough regions of interest on the thumbnails of bone marrow smear WSIs, and then mapped them to the original images and cropped them to obtain ROI slices. This method reduces the computation at high magnification by performing preliminary ROI selection, but it generates ROI slices containing blurry backgrounds and overlapping cells. Tayebi et al. [12] divided bone marrow smear WSIs into small patches and then used a DenseNet [13] classification network to classify each patch into ROI or non-ROI. However, these methods employ general-domain models that are not optimized for the complex background, high cell density, and severe cell adhesion of bone marrow smears, resulting in poor performance in practical applications.

3 Method

Based on the above analysis, we thoroughly considered the characteristics of bone marrow smears when designing the model, enhanced texture and contour feature extraction, and designed a texture and contour-aware bone marrow smear ROI selection model (TCNet). In this section, we introduce the overall structure of the model, followed by the Texture Deep Supervision Module (TDSM) and the Contour Deep Supervision Module (CDSM).

3.1 Overall Model Structure

The overall structure of the model is shown in Fig. 2. The proposed TCNet consists of three basic components: (1) a CNN backbone network (ResNet-50 [14]) for extracting image feature representations, (2) the TDSM, and (3) the CDSM. As shown in Fig. 2, the input of TCNet is the RGB color image of a bone marrow smear patch, and the output is the image classification result, i.e., ROI or non-ROI. During training, the texture pseudo labels and contour pseudo labels used for deep supervision are generated with the LBP (Local Binary Pattern) algorithm and the Canny edge detection algorithm (via OpenCV), respectively.

For TCNet, given an input $I \in \mathbb{R}^{3 \times H \times W}$, we use a convolutional neural network (ResNet-50) to extract multi-scale features $\{F_i\}_{i=2}^{5}$. The extracted features $(F_2, F_3, F_4, F_5)$ are the input to the TDSM and the CDSM. For example, the feature $F_2 \in \mathbb{R}^{C_2 \times \frac{H}{4} \times \frac{W}{4}}$ is sent to both the TDSM and the CDSM. The TDSM processes the input feature, predicting the texture map $T_2 \in \mathbb{R}^{1 \times \frac{H}{4} \times \frac{W}{4}}$ and subsequently upsampling $T_2$ to the dimensions of the original image ($H \times W$). The predicted result and the texture pseudo label are used to compute an auxiliary loss, and gradient backpropagation is then performed. Similarly, the CDSM processes the input feature, predicting the contour map $C_2 \in \mathbb{R}^{1 \times \frac{H}{4} \times \frac{W}{4}}$ and upsampling $C_2$ to the same size as the original image ($H \times W$). The predicted result and the contour pseudo label are used to compute an auxiliary loss, followed by gradient backpropagation. The same operations are performed for $F_3$, $F_4$, and $F_5$ to achieve deep supervision of texture and contour at each level. The model finally outputs the classification result.


Fig. 2. Schematic diagram of the overall structure of the model (TCNet).
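The pseudo labels used for the deep supervision shown in Fig. 2 are derived directly from the input patches. The following is a minimal sketch of how such texture and contour pseudo labels could be produced, using OpenCV's Canny detector and scikit-image's LBP implementation; the LBP parameters and Canny thresholds are illustrative assumptions, not values reported in the paper.

```python
import cv2
import numpy as np
from skimage.feature import local_binary_pattern

def make_pseudo_labels(patch_bgr, lbp_radius=1, lbp_points=8,
                       canny_low=50, canny_high=150):
    """Generate texture (LBP) and contour (Canny) pseudo labels for one patch.

    patch_bgr: HxWx3 uint8 image read with cv2.imread.
    Returns two HxW float32 maps in [0, 1].
    """
    gray = cv2.cvtColor(patch_bgr, cv2.COLOR_BGR2GRAY)

    # Texture pseudo label: local binary pattern, rescaled to [0, 1].
    lbp = local_binary_pattern(gray, P=lbp_points, R=lbp_radius, method="uniform")
    texture = (lbp / max(lbp.max(), 1.0)).astype(np.float32)

    # Contour pseudo label: binary Canny edge map.
    edges = cv2.Canny(gray, canny_low, canny_high)
    contour = (edges > 0).astype(np.float32)

    return texture, contour
```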

3.2 Texture Deep Supervision Module (TDSM)

In the process of selecting ROIs in bone marrow smears, texture features are of great significance for distinguishing background impurities from cells. Therefore, we designed a Texture Deep Supervision Module (TDSM) to improve the model's texture extraction ability. The module mainly includes a Texture Prediction Block (TPB) and a Texture Auxiliary Loss Calculation Block (TALCB).

The TPB extracts texture and performs texture prediction by calculating feature covariance [15]. It takes a feature map with a resolution of $C \times H \times W$ as input, and several $1 \times 1$ convolutions produce a feature map with a resolution of $C_1 \times H \times W$. To improve computational efficiency, $C_1$ is much smaller than $C$ (in this paper, $C_1$ is set to 24), and the resulting feature maps are used to learn multiple aspects of texture in subsequent operations. Next, the feature map is fed into the texture extractor, where we calculate the covariance matrix between feature channels at each position to capture the correlation between different responses in the convolutional features. The covariance matrix between features measures the co-occurrence of features, describes their combinations, and is used to represent texture information [16, 17]. As shown in Fig. 3, for each pixel $f_n^k$ in the feature map $f^k \in \mathbb{R}^{C_1 \times H \times W}$, we compute its covariance matrix as the product $f_n^k (f_n^k)^{\mathsf{T}}$. Since the covariance matrix ($C_1 \times C_1$) is symmetric, we use its upper triangular part to represent texture features and reshape the result into a feature vector. Performing the same operation on each pixel in the feature map and concatenating the results yields $t^k \in \mathbb{R}^{\frac{C_1(C_1+1)}{2} \times H \times W}$, which contains the texture information. A convolution with a $1 \times 1$ kernel on the obtained feature map then gives the predicted texture map.


The TALCB upsamples the texture maps predicted by the TPB at each level to the original image size, calculates auxiliary loss using texture pseudo labels, and performs gradient backpropagation to achieve optimized texture extraction.

Fig. 3. Schematic diagram of the Texture Prediction Block (TPB).
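To make the covariance-based texture extraction of the TPB in Fig. 3 concrete, here is a minimal PyTorch sketch of the per-pixel covariance computation and texture prediction described above. The module and variable names are ours, and details such as normalization are simplified assumptions rather than the authors' exact implementation.

```python
import torch
import torch.nn as nn

class TexturePredictionBlock(nn.Module):
    """Sketch of the TPB: per-pixel channel covariance -> 1x1 conv texture map."""

    def __init__(self, in_channels, reduced_channels=24):
        super().__init__()
        self.reduce = nn.Conv2d(in_channels, reduced_channels, kernel_size=1)
        num_cov = reduced_channels * (reduced_channels + 1) // 2  # upper triangle
        self.predict = nn.Conv2d(num_cov, 1, kernel_size=1)
        # Indices of the upper-triangular part of a C1 x C1 matrix.
        self.register_buffer(
            "triu_idx", torch.triu_indices(reduced_channels, reduced_channels)
        )

    def forward(self, x):                      # x: (B, C, H, W)
        f = self.reduce(x)                     # (B, C1, H, W)
        b, c1, h, w = f.shape
        v = f.permute(0, 2, 3, 1).reshape(-1, c1)          # one vector per pixel
        cov = v.unsqueeze(2) * v.unsqueeze(1)              # (B*H*W, C1, C1)
        tri = cov[:, self.triu_idx[0], self.triu_idx[1]]   # keep upper triangle
        t = tri.reshape(b, h, w, -1).permute(0, 3, 1, 2)   # (B, C1(C1+1)/2, H, W)
        return self.predict(t)                             # (B, 1, H, W) texture map
```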

3.3 Contour Deep Supervision Module (CDSM)

To enhance the network's ability to extract contours, we designed a Contour Deep Supervision Module (CDSM). The CDSM includes a Contour Prediction Block (CPB) and a Contour Auxiliary Loss Calculation Block (CALCB). In the CPB, we introduce a reverse attention mechanism [18] to enhance the sensitivity to contour features. As shown in Fig. 4, we first apply average pooling and max pooling to the input features and concatenate the results, then apply a convolution with a $7 \times 7$ kernel followed by a Sigmoid to obtain attention weights. Next, we subtract the attention weights from 1 and multiply them element-wise with the input features to achieve feature enhancement. Through this processing, the contour features are enhanced. Finally, a $3 \times 3$ convolution is used for contour prediction.

The CALCB upsamples the contour maps predicted by the CPB at each level to the original image size, calculates the auxiliary loss with the contour pseudo labels, and performs gradient backpropagation to achieve optimized contour extraction.

3.4 Loss Function

During the training process, the total loss function is:

$$Loss_{total} = \lambda_{cls} Loss_{cls} + \lambda_{texture} Loss_{texture} + \lambda_{contour} Loss_{contour} \tag{1}$$


Here, $Loss_{cls}$ is the classification loss, for which we use the cross-entropy loss, and $\lambda_{cls}$ is its weight, set to 1 by default. $Loss_{texture}$ is the total auxiliary texture loss of the TDSM across the different levels, defined as $Loss_{texture} = \sum_{j=1}^{J} L_{BCE}^{(j)}$, where $J$ is the total number of feature levels used for supervision and $L_{BCE}$ is the binary cross-entropy loss. $Loss_{contour}$ is the total auxiliary contour loss of the CDSM across the different levels, and it is calculated in the same way as $Loss_{texture}$. $\lambda_{texture}$ and $\lambda_{contour}$ are set to 0.4 and 0.4, respectively, to balance the total loss.
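As a reference, a minimal sketch of this combined objective is given below, assuming the per-level texture and contour predictions have already been upsampled to the pseudo-label resolution; the function and argument names are ours.

```python
import torch
import torch.nn.functional as F

def tcnet_loss(cls_logits, labels, texture_preds, texture_gt,
               contour_preds, contour_gt, lam_texture=0.4, lam_contour=0.4):
    """Total loss of Eq. (1): classification CE + multi-level BCE deep supervision."""
    loss_cls = F.cross_entropy(cls_logits, labels)
    loss_texture = sum(F.binary_cross_entropy_with_logits(p, texture_gt)
                       for p in texture_preds)   # one prediction per level
    loss_contour = sum(F.binary_cross_entropy_with_logits(p, contour_gt)
                       for p in contour_preds)
    return loss_cls + lam_texture * loss_texture + lam_contour * loss_contour
```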

Fig. 4. Schematic diagram of the Contour Prediction Block (CPB).
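As a companion to Fig. 4, the following is a minimal PyTorch sketch of the reverse-attention operation described above (channel-wise average/max pooling, a 7 × 7 convolution with Sigmoid, reversal of the attention weights, and a 3 × 3 prediction convolution). Names and minor details are our assumptions, not the authors' exact implementation.

```python
import torch
import torch.nn as nn

class ContourPredictionBlock(nn.Module):
    """Sketch of the CPB: reverse spatial attention followed by contour prediction."""

    def __init__(self, in_channels):
        super().__init__()
        self.spatial = nn.Conv2d(2, 1, kernel_size=7, padding=3)
        self.predict = nn.Conv2d(in_channels, 1, kernel_size=3, padding=1)

    def forward(self, x):                                   # x: (B, C, H, W)
        avg_map = x.mean(dim=1, keepdim=True)               # (B, 1, H, W)
        max_map, _ = x.max(dim=1, keepdim=True)             # (B, 1, H, W)
        attn = torch.sigmoid(self.spatial(torch.cat([avg_map, max_map], dim=1)))
        reverse_attn = 1.0 - attn                           # emphasize less-salient (contour) regions
        enhanced = x * reverse_attn                         # element-wise feature enhancement
        return self.predict(enhanced)                       # (B, 1, H, W) contour map
```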

4 Experiment

4.1 Experimental Setup

Dataset. Since the bone marrow smear ROI dataset constructed in [12] is not publicly available, we built our own bone marrow smear ROI dataset. The data are sourced from different patients and collected from analyzers. We slice the collected bone marrow smear WSIs into patches of size 1600 × 1600 and manually annotate selected slices, marking slices with uniformly distributed cells and clear contours as ROI and slices with blurry backgrounds and cell adhesion as non-ROI. A total of 2030 slices were annotated, including 1051 ROI slices and 979 non-ROI slices, which were divided into training and validation sets at an 8:2 ratio. The statistical information of the dataset is shown in Table 1.

Table 1. Statistical information of the dataset

             ROI     non-ROI   Total
Train        841     784       1625
Validation   210     195       405
Total        1051    979       2030

Evaluation Metrics. To evaluate the performance of the ROI classification model in predicting ROI and non-ROI, we compute commonly used metrics: accuracy, precision, recall, and F1 score. In the following tables, the values of each metric are presented as percentages without the "%" symbol.

Technical Details. Our TCNet is implemented in PyTorch on a single Nvidia RTX 4090 GPU and trained using the Adam optimizer. All models use weights pre-trained on ImageNet. During training, all models are trained for 50 epochs with a batch size of 16 and a learning rate of 1e−4. All images are resized to 224 × 224.

4.2 Comparison Experiments

We trained and evaluated TCNet on our self-built dataset, and since our network is based on ResNet-50, we also trained ResNet-50 on the dataset. To compare with the method used in [12], we trained and evaluated DenseNet121 on our self-built dataset. To evaluate the performance of the proposed network more objectively, we also trained other convolutional neural network-based models on the dataset, including AlexNet [19] and VGG16 [20].

Table 2. Comparison with other CNN-based methods

Method         Accuracy   Precision   Recall   F1 Score
AlexNet        85.43      85.61       85.29    85.36
VGG16          86.91      87.01       86.81    86.87
ResNet-50      85.43      85.56       85.31    85.37
DenseNet121    86.17      86.28       86.06    86.12
TCNet          87.65      87.88       87.51    87.59

The data in Table 2 demonstrate that TCNet outperforms the other comparison models on accuracy, precision, recall, and F1 score. Specifically, compared to the baseline network ResNet-50, TCNet achieves improvements of 2.22, 2.32, 2.20, and 2.22 percentage points in accuracy, precision, recall, and F1 score, respectively. This indicates that, after optimizing the ResNet-50 structure, TCNet successfully achieves better performance. Additionally, TCNet performs well compared to DenseNet121, with improvements of 1.48, 1.60, 1.45, and 1.47 percentage points in accuracy, precision, recall, and F1 score, respectively; despite being built on a simpler backbone with fewer layers, TCNet outperforms DenseNet121. TCNet's improvement over the other models is also considerable, particularly in precision and F1 score, indicating that it can make more accurate category judgments in the bone marrow smear ROI classification problem and achieve a good balance between precision and recall.


To evaluate the adequacy of our self-built dataset, we divided the training set into 10 parts and gradually expanded the training data scale. Specifically, we first used only one part as the training set and then added one part at a time until all 10 parts were used. For each expanded training set, we trained and evaluated the models. Figure 5 shows the trend of model accuracy (for TCNet, DenseNet121, ResNet-50, VGG16, and AlexNet) as the amount of training data increases.

Fig. 5. Line chart of model accuracy changes with increasing training set size.

Figure 5 shows that as the amount of training data increases, the classification accuracy of each model trends upward. However, once the training data reaches 80% of the full training set, the accuracy curves stabilize with only small changes. This indicates that the dataset we built is sufficient to meet the models' training needs.

4.3 Ablation Experiments

To verify the effectiveness of each module, we sequentially added the TDSM (Texture Deep Supervision Module) and the CDSM (Contour Deep Supervision Module) to the baseline model. The experimental results of each model are shown in Table 3.

Table 3. Ablation studies for different components in our TCNet

TDSM   CDSM   Accuracy         Precision        Recall           F1 Score
               85.43            85.56            85.31            85.37
✓              86.42 (0.99 ↑)   87.55 (1.99 ↑)   86.26 (0.95 ↑)   86.34 (0.97 ↑)
       ✓       86.91 (1.48 ↑)   86.94 (1.38 ↑)   86.95 (1.64 ↑)   86.88 (1.51 ↑)
✓      ✓       87.65 (2.22 ↑)   87.88 (2.32 ↑)   87.51 (2.20 ↑)   87.59 (2.22 ↑)

Effectiveness of TDSM. Table 3 shows that after adding the TDSM to the baseline model, accuracy, precision, recall, and F1 score increase by 0.99, 1.99, 0.95, and 0.97 percentage points, respectively. The results show that the TDSM enhances the network's ability to extract texture features, thereby improving ROI classification performance.

Effectiveness of CDSM. Table 3 shows that after adding the CDSM to the baseline model, accuracy, precision, recall, and F1 score increase by 1.48, 1.38, 1.64, and 1.51 percentage points, respectively. The results show that the CDSM enhances the network's ability to extract contour features, thereby improving ROI classification performance.

Different Backbone Networks. The proposed network is based on ResNet-50. To verify the universality of the proposed modules, we also built TCNet on ResNet-18, ResNet-34, and ResNet-101. The experimental results are shown in Table 4.

Table 4. Performance of TCNet with different backbone networks

Method               Accuracy         Precision        Recall           F1 Score
ResNet-18            84.44            84.43            84.41            84.42
TCNet (ResNet-18)    85.43 (0.99 ↑)   85.67 (1.24 ↑)   85.27 (0.86 ↑)   85.35 (0.93 ↑)
ResNet-34            85.43            85.41            85.42            85.41
TCNet (ResNet-34)    85.93 (0.50 ↑)   86.47 (1.06 ↑)   85.70 (0.28 ↑)   85.80 (0.39 ↑)
ResNet-101           85.68            85.71            85.60            85.64
TCNet (ResNet-101)   87.65 (1.97 ↑)   87.95 (2.24 ↑)   87.49 (1.89 ↑)   87.58 (1.94 ↑)

The results show that when TCNet is built on different backbone networks, accuracy, precision, recall, and F1 score all improve, demonstrating that the designed modules generalize well across backbones.

Different Weight Coefficient Combinations. When calculating the loss, the texture loss weight $\lambda_{texture}$ and the contour loss weight $\lambda_{contour}$ need to be set. To achieve better results, we conducted experiments with different weight combinations; results for a selection of combinations are shown in Table 5.

Table 5. Performance of TCNet with different weight coefficient combinations

λtexture   λcontour   Accuracy   Precision   Recall   F1 Score
0.0        0.0        85.43      85.56       85.31    85.37
0.2        0.2        86.17      86.52       85.99    86.08
0.4        0.4        87.65      87.88       87.51    87.59
0.6        0.6        86.17      86.17       86.14    86.15
0.8        0.8        86.67      87.83       86.41    86.53

The experimental results show that when λtexture is set to 0.4 and λcontour is set to 0.4, the model achieves optimal performance.

5 Conclusion

This paper designs a novel texture and contour-aware deep neural network (TCNet) for bone marrow smear ROI selection. Our key idea is to construct texture and contour prediction modules for multi-task learning, so that the network perceives texture and contour features, and to perform multi-level deep supervision through texture and contour pseudo labels. In the model, we designed the Texture Deep Supervision Module (TDSM) and the Contour Deep Supervision Module (CDSM), which supervise each network level through texture and contour pseudo labels, respectively, achieving optimized extraction of texture and contour features. We evaluated our method on our self-built dataset and compared it with the baseline model, showing good improvements in performance. We also verified the effectiveness of the proposed modules on different backbone networks. In the future, we will explore applying the proposed method to the detection and classification of bone marrow smear cells.

Acknowledgement. This work is supported by the Fundamental Research Funds for the Central Universities (No. 2022CDJYGRH-001) and the Chongqing Technology Innovation & Application Development Key Project (cstc2020jscx-dxwtBX0055; cstb2022tiad-kpx0148).

References

1. Matek, C., Schwarz, S., Spiekermann, K., et al.: Human-level recognition of blast cells in acute myeloid leukaemia with convolutional neural networks. Nat. Mach. Intell. 1, 538–544 (2019). https://doi.org/10.1038/s42256-019-0101-9
2. Tiwari, P., et al.: Detection of subtype blood cells using deep learning. Cognit. Syst. Res. 52, 1036–1044 (2018)
3. Hegde, R.B., Prasad, K., Hebbar, H., Singh, B.M.K.: Comparison of traditional image processing and deep learning approaches for classification of white blood cells in peripheral blood smear images. Biocybern. Biomed. Eng. 39(2), 382–392 (2019)
4. Rastogi, P., Khanna, K., Singh, V.: LeuFeatx: deep learning–based feature extractor for the diagnosis of acute leukemia from microscopic images of peripheral blood smear. Comput. Biol. Med. 142, 105236 (2022)


5. Lee, S.H., Erber, W.N., Porwit, A., Tomonaga, M., Peterson, L.C., International Council for Standardization In Hematology: ICSH guidelines for the standardization of bone marrow specimens and reports. Int. J. Lab. Hematol. 30(5), 349–364 (2008)
6. Theera-Umpon, N., Dhompongsa, S.: Morphological granulometric features of nucleus in automatic bone marrow white blood cell classification. IEEE Trans. Inf. Technol. Biomed. 11(3), 353–359 (2007)
7. Pergad, N.D., Hamde, S.T.: Fractional gravitational search-radial basis neural network for bone marrow white blood cell classification. Imaging Sci. J. 66(2), 106–124 (2018)
8. Chandradevan, R., et al.: Machine-based detection and classification for bone marrow aspirate differential counts: initial development focusing on nonneoplastic cells. Lab. Investig. 100(1), 98–109 (2020)
9. Matek, C., Krappe, S., Münzenmayer, C., Haferlach, T., Marr, C.: Highly accurate differentiation of bone marrow cell morphologies using deep neural networks on a large image data set. Blood J. Am. Soc. Hematol. 138(20), 1917–1927 (2021)
10. Guo, L., et al.: A classification method to classify bone marrow cells with class imbalance problem. Biomed. Signal Process. Control 72, 103296 (2022)
11. Wang, C.W., Huang, S.C., Lee, Y.C., Shen, Y.J., Meng, S.I., Gaol, J.L.: Deep learning for bone marrow cell detection and classification on whole-slide images. Med. Image Anal. 75, 102270 (2022)
12. Tayebi, R.M., et al.: Automated bone marrow cytology using deep learning to generate a histogram of cell types. Commun. Med. 2(1), 45 (2022)
13. Huang, G., Liu, Z., Van Der Maaten, L., Weinberger, K.Q.: Densely connected convolutional networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4700–4708 (2017)
14. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016)
15. Ren, J., et al.: Deep texture-aware features for camouflaged object detection. IEEE Trans. Circuits Syst. Video Technol. (2021)
16. Gatys, L.A., Ecker, A.S., Bethge, M.: Image style transfer using convolutional neural networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2414–2423 (2016)
17. Karacan, L., Erdem, E., Erdem, A.: Structure-preserving image smoothing via region covariances. ACM Trans. Graph. (TOG) 32(6), 1–11 (2013)
18. Pei, J., Cheng, T., Fan, D.P., Tang, H., Chen, C., Van Gool, L.: OSFormer: one-stage camouflaged instance segmentation with transformers. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds.) ECCV 2022, Part XVIII. LNCS, vol. 13678, pp. 19–37. Springer, Cham (2022). https://doi.org/10.1007/978-3-031-19797-0_2
19. Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. Commun. ACM 60(6), 84–90 (2017)
20. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)

Diff-Writer: A Diffusion Model-Based Stylized Online Handwritten Chinese Character Generator

Min-Si Ren1, Yan-Ming Zhang1(B), Qiu-Feng Wang2, Fei Yin1, and Cheng-Lin Liu1

1 State Key Laboratory of Multimodal Artificial Intelligence Systems, Institute of Automation, Chinese Academy of Sciences, Beijing, China
[email protected]
2 School of Advanced Technology, Xi'an Jiaotong-Liverpool University, Suzhou, China

Abstract. Online handwritten Chinese character generation is an interesting task which has gained more and more attention in recent years. Most of the previous methods are based on autoregressive models, where the trajectory points of characters are generated sequentially. However, this often makes it difficult to capture the global structure of the handwriting data. In this paper, we propose a novel generative model, named Diff-Writer, which can not only generate the specified Chinese characters in a non-autoregressive manner but also imitate the calligraphy style given a few style reference samples. Specifically, Diff-Writer is based on conditional Denoising Diffusion Probabilistic Models (DDPM) and consists of three modules: character embedding dictionary, style encoder, and an LSTM denoiser. The character embedding dictionary and the style encoder are adopted to model the content information and the style information respectively. The denoiser iteratively generates characters using the content and style codes. Extensive experiments on a popular dataset (CASIA-OLHWDB) show that our model is capable of generating highly realistic and stylized Chinese characters.

Keywords: Generative model · Conditional diffusion model · Online handwriting generation

1 Introduction

Reading and writing are among the most important abilities of human intelligence. In the past decades, great efforts have been devoted to teaching machines how to read in the field of optical character recognition (OCR). Recently, with the rapid development of deep learning-based generation methods such as the Variational Autoencoder (VAE) [1] and the Generative Adversarial Network (GAN) [2], researchers have become more and more interested in teaching machines how to write [3–5]. Handwriting generation is not only an interesting and creative task but also of great value in practice. For example, the related techniques have proved very useful in data augmentation for handwritten text recognition [6,7] and handwritten signature verification [8].

In general, handwriting data has two different representations. The first is based on raster images, which is called offline handwriting data; the other is based on sequences of trajectory points, which is called online handwriting data. The generation of offline data has been widely studied and has made considerable progress, but one major shortcoming of this representation is that, since images are static, it cannot reflect the dynamic process of writing or painting. On the other hand, with the prevalence of touchscreen devices, online handwriting data such as texts and sketches is becoming increasingly common in our daily life [17]. However, compared to the generation of 2D images, the generation of online data is much less explored.

In this work, we consider the task of generating stylized online handwritten Chinese characters. More concretely, given a character and a few style reference samples, we try to generate the point sequence of the specified character with the same calligraphy style as the reference samples. Most previous works [9–13] use autoregressive models to generate the character point by point, which makes it difficult to capture the global structure of the handwriting data [15]. In this paper, we propose a novel method, named Diff-Writer, that adopts the conditional Denoising Diffusion Probabilistic Model (DDPM) [16] as the backbone and works in a non-autoregressive manner. Furthermore, compared with sketches or English letters, the generation of Chinese characters is more challenging because of the huge number of classes and the complex structures. To overcome this problem, inspired by [10], we equip the model with a character embedding dictionary to store the content information of each character. Besides, we adopt a style encoder to extract calligraphy style information from a few reference samples, which enables our model to imitate the specified calligraphy style. Through quantitative and qualitative experiments on the public dataset CASIA-OLHWDB [25], we show that Diff-Writer can generate highly realistic and stylized Chinese characters. Our contributions can be summarized as follows:

• To the best of our knowledge, we are the first to successfully apply conditional diffusion models to stylized online handwritten Chinese character generation.
• We experimentally show that the non-autoregressive diffusion model dramatically outperforms a traditional autoregressive model with the same architecture and parameter quantity.

It should be emphasized that our method is not in conflict with previous work; rather, they complement each other. While previous work mainly focuses on better calligraphy feature extraction and network structure design, our work concentrates on using a non-autoregressive diffusion process to replace traditional autoregressive methods for modeling the generation process.

2 Related Work

Online handwriting generation has been studied by many researchers using various models: autoregressive models, recurrent neural networks (RNNs), GANs, and diffusion models. Some typical works are reviewed in the following.

2.1 Autoregressive Methods for Online Handwriting Generation

Online handwriting data is a kind of sequential data that includes handwritten text, sketches, flow charts, and so on. With the development of natural language processing, autoregressive methods are commonly used to process sequential data. As early as 2013, Graves et al. [18] first proposed using a unidirectional RNN to sequentially generate handwritten English text. More specifically, they use a Gaussian mixture model (GMM) for the pen direction, whose parameters are obtained from each hidden state of the RNN. In the following years, most work using autoregressive methods was based on this approach. SketchRNN (2018) [19] adopts the VAE framework for the generation of sequential data; more specifically, it uses two RNNs as the encoder and decoder of a VAE, respectively, and can perform unconditional sketch generation as well as handwriting generation. A disadvantage is that one trained model can only generate one class of sketches. CoSE (2020) [20] also focuses on the generation of sketches and diagrams. Unlike previous work, it generates sketches stroke by stroke: it treats a single stroke as a curve in differential geometry and uses MLPs to generate it in a non-autoregressive manner. However, to generate the whole sketch, strokes are still generated in an autoregressive way.

2.2 Online Chinese Character Generation

In 2015, [9] improved on [18] and proposed an LSTM-based model that can generate fake, unreadable Chinese characters. Then [10] proposed a conditional RNN-based generative model that can draw readable Chinese characters of specified classes. Since then, subsequent work has considered how to generate characters with a specified calligraphy style rather than only the specified content. Using a transfer learning strategy, FontRNN [11] can imitate calligraphy style to generate characters; however, it can only generate the single style present in its training set. Recently, DeepImitator [12], WriteLikeYou [13], and SDT [14] can generate characters of any style given a few reference samples. DeepImitator uses a CNN as the style encoder to extract style information from handwritten character images, encoding the calligraphy style into one vector, which we call the style code. The advantage of doing so is that it saves computation and makes operations such as interpolation on the style code easy; the disadvantage is that the style code is global and coarse, so much local style information may be ignored. [13] uses extensive attention mechanisms to attend to local content and style information, and [14] further explores extracting both writer-wise styles and more complicated character-wise styles using a combination of CNNs and Transformers.

2.3 Non-autoregressive Diffusion Model

As mentioned in Sect. 1, the diffusion model has been shown to have strong potential for learning data distributions by learning to predict the noise added to disturbed data. It was first proposed in [21] in 2015 and has seen a series of improvements, such as [16,22,23]. These works mainly focus on the generation of image data, while the generation of online handwritten data has been studied much less. DDPM [16] is one of the most representative of these models and has been widely used. Recently, based on DDPM, [15] first proposed using a diffusion model for chirographic data generation and implemented many downstream applications. However, it does not consider in depth how to generate samples of a specified class.

3 Methods

In this section, we first give a brief introduction to DDPM and then describe the Diff-Writer method for online Chinese character generation in detail.

3.1 Denoising Diffusion Probabilistic Models

As shown in [24], DDPM can be regarded as a variant of the Hierarchical Variational AutoEncoder (HVAE). It is composed of a forward diffusion process and a reverse denoising process, both formulated as discrete-time Markov chains of $T$ timesteps. In the following, we denote the forward process by $q$ and the reverse process by $p$. Starting from the original data $x_0$, each forward step can be regarded as adding Gaussian noise of a different degree to the current data:

$$x_t = \sqrt{\alpha_t}\, x_{t-1} + \sqrt{1-\alpha_t}\,\epsilon, \quad \epsilon \sim \mathcal{N}(0, I) \;\;\Rightarrow\;\; q(x_t \mid x_{t-1}) = \mathcal{N}\big(x_t; \sqrt{\alpha_t}\, x_{t-1}, (1-\alpha_t) I\big) \tag{1}$$

where $\{\alpha_t\}_{t=0}^{T}$ are preset hyperparameters. Due to the Markov property of the forward process, the noisy data distribution at timestep $t$ can be calculated directly:

$$x_t = \sqrt{\bar{\alpha}_t}\, x_0 + \sqrt{1-\bar{\alpha}_t}\,\epsilon, \quad \epsilon \sim \mathcal{N}(0, I) \;\;\Rightarrow\;\; q(x_t \mid x_0) = \mathcal{N}\big(x_t; \sqrt{\bar{\alpha}_t}\, x_0, (1-\bar{\alpha}_t) I\big) \tag{2}$$

$$\bar{\alpha}_t = \prod_{i=1}^{t} \alpha_i \tag{3}$$

In general, the total number of timesteps $T$ should be large enough to ensure that $\bar{\alpha}_T$ is almost equal to 0, so that the data distribution at timestep $T$ is almost identical to the standard Gaussian distribution. Furthermore, given $x_t$ and $x_0$, the conditional distribution of $x_{t-1}$ can be calculated in closed form using Bayes' formula:

$$q(x_{t-1} \mid x_t, x_0) = \frac{q(x_t \mid x_{t-1}, x_0)\, q(x_{t-1} \mid x_0)}{q(x_t \mid x_0)} = \mathcal{N}\big(x_{t-1}; \mu_q(x_t, x_0), \sigma_q^2(t) I\big), \tag{4}$$
$$\mu_q(x_t, x_0) = \frac{\sqrt{\alpha_t}(1-\bar{\alpha}_{t-1})\, x_t + \sqrt{\bar{\alpha}_{t-1}}(1-\alpha_t)\, x_0}{1-\bar{\alpha}_t}, \qquad \sigma_q^2(t) = \frac{(1-\alpha_t)(1-\bar{\alpha}_{t-1})}{1-\bar{\alpha}_t}$$

The reverse denoising process of DDPM tries to restore $x_{t-1}$ from $x_t$. Specifically, it learns $p_\theta(x_{t-1} \mid x_t)$ by optimizing the evidence lower bound of $\log p(x_0)$:

$$\mathcal{L}(\theta) = \mathbb{E}_{t \sim U\{2,T\}}\, \mathbb{E}_{q(x_t \mid x_0)}\, D_{KL}\big(q(x_{t-1} \mid x_t, x_0) \,\|\, p_\theta(x_{t-1} \mid x_t)\big) \tag{5}$$

Ho et al. [16] proposed an equivalent but more effective objective function to train the DDPM model:

$$\mathcal{L}_{simple}(\theta) = \mathbb{E}_{t \sim U\{1,T\},\, \epsilon \sim \mathcal{N}(0,I)}\, \big\|\epsilon - \epsilon_\theta\big(x_t(x_0, \epsilon), t\big)\big\|^2 \tag{6}$$

where $x_t(x_0, \epsilon)$ computes $x_t$ via Eq. (2). Intuitively, the goal of Eq. (6) is to learn a denoiser that predicts the added noise given the corrupted data $x_t$ and the timestep $t$. We adopt this loss function in this work.
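As a reference for Eqs. (2) and (6), the following is a minimal PyTorch sketch of one training step of a conditional DDPM denoiser on a batch of character sequences; the denoiser interface and tensor shapes are our assumptions for illustration, not the authors' exact code.

```python
import torch

def ddpm_training_step(denoiser, x0, content, style, alpha_bar):
    """One DDPM step: corrupt x0 with Eq. (2), regress the noise as in Eq. (6).

    x0:        (B, n, 3) clean trajectories
    alpha_bar: (T,) cumulative products of alpha_t
    """
    batch, total_steps = x0.shape[0], alpha_bar.shape[0]
    t = torch.randint(0, total_steps, (batch,), device=x0.device)     # random timestep
    eps = torch.randn_like(x0)                                        # Gaussian noise
    ab = alpha_bar[t].view(-1, 1, 1)                                  # broadcast over (n, 3)
    xt = ab.sqrt() * x0 + (1.0 - ab).sqrt() * eps                     # Eq. (2)
    eps_pred = denoiser(xt, t, content, style)                        # predicted noise
    return ((eps - eps_pred) ** 2).mean()                             # Eq. (6), L2 loss
```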

3.2 Data Representation

An online handwritten Chinese character is composed of a sequence of trajectory points $[p_1, p_2, p_3, ..., p_n]$, where each point is represented by a three-dimensional vector $p_i = (x_i, y_i, s_i)$. Here, $(x_i, y_i)$ is the positional offset from the previous point $p_{i-1}$, and $s_i \in \{-1, 1\}$ represents the state of the pen: (1) $s_i = 1$ denotes the pen-touching state, indicating that $p_i$ is within a stroke; (2) $s_i = -1$ denotes the pen-lifting state, indicating that $p_i$ is the end of a stroke. Note that since the diffusion model adds zero-mean Gaussian noise to the data, $s = -1$ is chosen as the opposite of $s = 1$ instead of $s = 0$, so that after adding Gaussian noise the two states can still be distinguished by simple thresholding at 0. To sum up, each character is represented as a matrix of size $n \times 3$ in which each row denotes a trajectory point. Following the tradition of the DDPM literature, we denote this original data matrix by $x_0$. During the diffusion process, we directly add noise using Eq. (2) to sample the noisy data $x_t$ at timestep $t$.
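A minimal sketch of this encoding, converting a list of strokes given as absolute (x, y) coordinates into the n × 3 offset-and-pen-state matrix described above (the function name and input format are our assumptions):

```python
import numpy as np

def encode_character(strokes):
    """strokes: list of (k_i, 2) arrays of absolute coordinates, one per stroke.
    Returns an (n, 3) float32 matrix of (dx, dy, pen_state) rows, pen_state in {1, -1}.
    """
    rows, prev = [], np.zeros(2, dtype=np.float32)
    for stroke in strokes:
        for j, point in enumerate(stroke):
            dx, dy = point - prev
            pen = -1.0 if j == len(stroke) - 1 else 1.0   # -1 marks the stroke end
            rows.append([dx, dy, pen])
            prev = point
    return np.asarray(rows, dtype=np.float32)
```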

3.3 Diff-Writer Model

The overall framework of our model is shown in Fig. 1. Compared to traditional DDPM, to generate characters of specified content and calligraphy style, we adopt a character dictionary and a style encoder to capture the content and style information. As a result, Diff-Writer consists of a character embedding dictionary, a style encoder, and an LSTM denoiser.


Fig. 1. The framework of our model which consists of three modules: character embedding dictionary, style encoder, and LSTM denoiser.

Character Embedding Dictionary: To encode the content information of characters, inspired by [10], we build a dictionary $E \in \mathbb{R}^{d \times N_z}$, where $d$ is the number of categories and $N_z$ is the code dimension. Specifically, we randomly initialize the matrix $E$ and treat it as a learnable parameter of Diff-Writer. After training, the $c$-th row of the dictionary, $E[c] \in \mathbb{R}^{N_z}$, is the content code of the $c$-th character.

Style Encoder: In order to generate characters with any specified style, we employ a style encoder SENC to extract the style information from a few reference samples written by a given writer:

$$style = \mathrm{SENC}(Ref) = \frac{1}{m} \sum_{i=1}^{m} \mathrm{senc}(r_i) \in \mathbb{R}^{N_s}, \tag{7}$$

where $Ref = \{r_1, r_2, ..., r_m\}$ is a set of $m$ reference samples and $N_s$ is the dimension of the style codes. More specifically, we adopt a bidirectional LSTM equipped with an attention mechanism as the style encoder:

$$\mathrm{senc}(r) = \mathrm{Attention}(h_1, h_2, ..., h_n) = \sum_{i=1}^{n} \alpha_i h_i, \tag{8}$$

where $h_i$ is the hidden state of the LSTM at trajectory point $i$ and $n$ is the sequence length of the reference sample $r$. The attention coefficient $\alpha_i$ is calculated as:

$$\alpha_i = \frac{\exp(\bar{h} V h_i^{T})}{\sum_{j=1}^{n} \exp(\bar{h} V h_j^{T})} \quad \text{with} \quad \bar{h} = \frac{1}{n} \sum_{i=1}^{n} h_i, \tag{9}$$

where $V \in \mathbb{R}^{N_s \times N_s}$ is a trainable parameter. To make the training process more stable, we pre-train SENC using the large-margin Softmax loss function proposed in WriteLikeYou [13].

LSTM Denoiser: The denoising module $\mathrm{Denoiser}_\theta$ is a conditional DDPM model consisting of a bidirectional LSTM and a linear layer. It takes the noisy data $x_t$, the timestep $t$, the content code, and the style code as input, and outputs the predicted noise:

$$\epsilon_\theta = \mathrm{Denoiser}_\theta\big(\mathrm{concat}(x_t, \mathrm{emb}(t), content, style)\big), \tag{10}$$

where $\mathrm{emb}(t) \in \mathbb{R}^{32}$ is the sinusoidal position encoding of $t$. We expand the three vectors $\mathrm{emb}(t)$, $content$, and $style$ along the first dimension of $x_t$ and concatenate them together. Note that the output must have the same dimension as the original data $x_0$, which is guaranteed by the linear layer mentioned above.

The training process of Diff-Writer is very simple and stable. Given a target sample x0 and a few reference samples written by the same writer as x0 , we firstly query the character embedding dictionary according to the class of x0 and get its content code. Then the reference samples are input into the style encoder to calculate the style code according to Eq. (7). Next, we randomly sample time step t and a random noise , and calculate the noisy data xt by Eq. (2). All the information is input into Denoiserθ to obtain the predicted noise θ . Finally, our model is trained by minimizing the L2 distance between  and θ defined as Eq. (6). 3.5

Generation Process

The generation process of Diff-Writer is as follows. Given a specified Chinese character and a set of style reference samples, the corresponding content code and style code are extracted by the dictionary and the style encoder, respectively. We randomly sample standard Gaussian noise as xT ∈ Rn×3 . The character is then generated by performing the reverse denoising process of DDPM. The whole procedure is summarized in Algorithm 1. The visualization of the generation process is shown as Fig. 2.

Diff-Writer

93

Algorithm 1. Generation process of Diff-Writer 1 2 3 4 5 6 7 8 9 10

Input: character c, reference samples Ref = {ri }, T, {αt } content = E[c] style = SENC(Ref ) xT ∼ N (0, I) for t=T,T-1,...,1 do θ = Denoiserθ (concat(xt , emb(t), content, style)) 1−αt √ μθ (xt ) = √1αt xt − √1−α θ t αt (1−α )(1−α

)

t t−1 σt2 = 1−αt xt−1 = μθ (xt ) + σt , end Output x0

 ∼ N (0, I)

Fig. 2. Visualization of the generation process. Different colors denotes different strokes.

4 4.1

Experiment and Results Dataset

We use CASIA-OLHWDB1.0-1.1 [25] to train and test our model, which contains 720 different writers. We take 600 writers as the training set and left 120 writers as the test set. We choose 1,000 classes of commonly used Chinese characters in our experiments, which results in 600,000 training samples and 120,000 testing samples. For data preprocessing, we only do normalization for each online handwritten character to ensure that the coordinates of every key point are between 0 and 1.

94

4.2

M.-S. Ren et al.

Implementation Details

First, we use a 3-layer bidirectional LSTM with the hidden size of 200 as the denoiser model and a 3-layer bidirectional LSTM with the hidden size of 128 as the style encoder. We set Nz = 128 and Ns = 128 as the dimension of content code and style code respectively. For the hyperparameters of DDPM, we adopt the traditional linear noising schedule proposed by (Nichol et al. 2021): {T = 1000; α0 = 1 − 1e−4 ; αT = 1 − 2e−2 }. During the experiment, we empirically find that in the last 20 or so time steps of the denoising process, )(1−αt−1 ) otherwise will make the generation setting σt = 0 while σt = 0.8 (1−αt1−α t process more stable. We implement our model in Pytorch. Adam optimizer with initial learning rate 0.01 is adopted to train our model. We set learning rate decay = 0.9999. We run experiments on NVIDIA TITAN RTX GPUs.

Fig. 3. Visualization of results generated by our diffusion model and autoregressive model.

4.3

Evaluations

In this work, we adopt four evaluation metrics, namely subjective visual Turing test, Dynamic Time Warping(DTW) distance, style score and content score, which will be demonstrated in the following part. Visualization: To better demonstrate the effectiveness of our method, we first do visualization of the generated results, as shown in Fig. 3. It is easy to see that our model can not only perfectly generate complete character structures, but also effectively imitate the specified calligraphy styles. What is more, the generation of the diffusion model is more stable than that of the autoregressive model, especially for characters with complex structures. That is perhaps due to the better ability of the diffusion model to capture the global structure of handwritten data as we mentioned before. Subjective Test: Subjective test is an intuitive and effective evaluation method to test the performance of our model. We invited 20 participants to perform two tests. The first one is to test the verisimilitude and structural correctness of

Diff-Writer

95

generated characters. As shown in Fig. 4, we provide 25 pairs of character samples to each participant and ask them to tell which one is more likely written by real people. The second test is to evaluate the imitating ability of our model. As shown in Fig. 5, we provide 12 sets of character samples to each participant. In each set, the left character is generated by our model while the other 4 characters are written by 4 different people, one of which is the imitating target. We ask the participants to guess which one is the target. The results are reported in Table 1.

Fig. 4. Verisimilitude test. Each grid consists of two characters, where one is generated by Diff-Writer and the other is written by real people. The subjects are asked to tell which one is written by people.

Table 1. The results of subjective tests

Test       Readability test   Imitating test
Accuracy   48.0%              50.8%

As we can see, the accuracy of the first test is close to 50%, which indicates that it is quite difficult for the participants to distinguish the generated characters from the real ones. In the second test, the accuracy is far greater than the 25% expected from random guessing, which further shows that our model can effectively imitate the specified style to some extent.


Fig. 5. Imitating test. In each group of samples, the left character is produced by Diff-Writer. The other 4 characters are written by different people, one of which is the imitating target. The subjects are asked to tell which one is the target.

Dynamic Time Warping Distance: The normalized dynamic time warping (DTW) distance is widely used to measure the distance between two sequences of different lengths. In the heatmaps of Fig. 6, darker colors correspond to smaller DTW values, representing a closer distance between two handwritten characters. The left panel shows the DTW distances between characters written by 60 different writers in the test set; the diagonal values represent the distance between identical characters and are therefore zero. In the right panel, we generate three imitation characters for each target and use their average DTW distance to the target as the value. The colors on the diagonal are usually darker than those in the same row or column; recalling that darker colors represent smaller distances, the imitated characters are usually closer to characters written by the target writer than to the others, which indicates that our model can imitate specific styles.

Fig. 6. The heatmap of the DTW distance matrix.
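For reference, here is a minimal sketch of a length-normalized DTW distance between two trajectory sequences, using the standard dynamic-programming recurrence with Euclidean point distances; the normalization by the summed sequence lengths is our assumption about the "normalized" variant used.

```python
import numpy as np

def normalized_dtw(a, b):
    """a, b: (n, 2) and (m, 2) arrays of trajectory points.
    Returns the DTW alignment cost divided by (n + m)."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(a[i - 1] - b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j],      # insertion
                                 cost[i, j - 1],      # deletion
                                 cost[i - 1, j - 1])  # match
    return cost[n, m] / (n + m)
```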

Comparison with Autoregressive Models: As mentioned before, DDPM is a non-autoregressive model that can capture the global structure of handwritten characters more easily than autoregressive models. The visualization in Fig. 3 shows that the characters generated by our model are structurally consistent and usually free of structural errors. In addition, we conduct quantitative experiments to compare our model with autoregressive models, using the style score and the content score as evaluation metrics.

For the style score, we train a style classifier on the test set, which contains 120 different writers, achieving over 96% accuracy. The style score is the classification accuracy of this classifier on generated samples; a higher style score indicates a better style imitation ability of the model. Since the styles of different handwritten characters, even from the same writer, may differ greatly, the classifier accepts ten characters written by one person, computes the mean of their features, and uses the mean features for classification. Similarly, the content score is obtained using a Chinese character classifier trained on the training set, with an accuracy of 99.2% on the training set and 95.0% on the test set; the accuracy of this classifier on the generated samples is used as the content score.

For a fairer comparison, we compare our model with DeepImitator [12] and WriteLikeU (without ASB) [13], which have similar parameter quantities. Furthermore, we implemented an autoregressive model with the same architecture as our proposed model: it also contains a character embedding dictionary and a style encoder, and the only difference is that our proposed model uses a bidirectional LSTM as the DDPM denoiser while the autoregressive model uses a bidirectional LSTM to directly generate the motion of the pen. As shown in Table 2, our autoregressive baseline achieves performance competitive with previous work, while the diffusion-based model performs dramatically better, especially on the content score. We attribute this to the better ability of diffusion models to capture global features, which tends to produce samples with more accurate and complete structures. Recalling Eq. (7), we randomly select m style reference samples for the style encoder; in a further experiment, we vary m from 5 to 15 and compare the performance of the autoregressive model and the diffusion model. The results are shown in Fig. 7.

Table 2. Comparison with autoregressive models

                DeepImitator   WriteLikeU (without ASB)   Our autoregressive   Diff-Writer
Style score     43.2%          42.2%                      41.3%                46.6%
Content score   83.4%          -                          84.6%                94.8%


Fig. 7. Style score of two models with different m.

Style Code Interpolation Experiments: Recalling Sect. 3.3, we use a style encoder to extract a style code from the given reference samples, which is used to imitate the specified calligraphy style. Suppose we have two style codes from two different writers and we apply simple linear interpolation: $style_{merge} = 0.5 \cdot style_1 + 0.5 \cdot style_2$. We use these style codes to generate characters. Figure 8 shows the results: the first and last columns are real samples from writer 1 and writer 2, the second and fourth columns are samples generated using style code 1 and style code 2, and the third column shows samples generated using the merged style code. The calligraphy style of writer 1 is neat, while that of writer 2 is cursive with very few strokes; the merged style code generates a style in between. This not only demonstrates the ability of our style encoder to extract style but also suggests a way to synthesize calligraphy styles.

Fig. 8. Style code interpolation results.

Training Strategy for Style Encoder: The style encoder is the key module for extracting calligraphy style features from reference samples. In our method, we adopt a pre-training-plus-finetuning strategy for the style encoder. To test its effectiveness, we conduct several ablation studies; the results are reported in Table 3. As expected, the model with both pre-training and finetuning performs better than the one with pre-training only. The model with pre-training only achieves a 37.2% style score, which indicates that it already has some style imitation ability. If we remove the style encoder (i.e., set it to be untrainable from the beginning), we get a 0.8% style score; since there are 120 writers in the test set, this is equivalent to random guessing. Furthermore, we find that pre-training is essential for our model: when we initialize the style encoder randomly and train the model from scratch, the model can still generate high-quality characters, but the style score drops to 0.8%, the same as without a style encoder, meaning the model completely loses the ability to imitate style.

Table 3. Effect of different training strategies for the style encoder

              Without style encoder   Random initialization   Pre-train only   Pre-train + finetuning
Style score   0.8%                    0.8%                    37.2%            46.6%

5 Conclusion and Future Work

In this paper, we proposed Diff-Writer, a diffusion model-based stylized online handwritten Chinese character generator. A character embedding dictionary and a style encoder are integrated to provide content and style information, respectively. However, one limitation is that we only use a single vector (the style code) to represent the complex and varied style information, which may limit the ability to imitate calligraphy styles. This can be partially resolved by integrating our approach with other relevant methods that use more sophisticated calligraphy style extractors and generators. In addition, the diffusion framework enables unique conditional generation techniques, such as classifier guidance and classifier-free guidance. Moreover, using a diffusion-based model to generate strings of text instead of single characters is an exciting and challenging task. We leave these as potential future work.

Acknowledgements. This work is supported by the Major Project for New Generation of AI under Grant No. 2018AAA0100400 and the National Natural Science Foundation of China (NSFC) Grant No. 62276258.


References

1. Kingma, D.P., Welling, M.: Auto-encoding variational Bayes. In: ICLR (2014)
2. Goodfellow, I., et al.: Generative adversarial nets. In: NeurIPS (2014)
3. Xu, P., Hospedales, T.M., Yin, Q., Song, Y.Z., Xiang, T., Wang, L.: Deep learning for free-hand sketch: a survey. TPAMI 45(1), 285–312 (2022)
4. Aksan, E., Pece, F., Hilliges, O.: DeepWriting: making digital ink editable via deep generative modeling. In: SIGCHI (2018)
5. Ribeiro, L.S.F., Bui, T., Collomosse, J., Ponti, M.: Sketchformer: transformer-based representation for sketched structure. In: CVPR (2020)
6. Xie, C., Lai, S., Liao, Q., Jin, L.: High performance offline handwritten Chinese text recognition with a new data preprocessing and augmentation pipeline. In: Bai, X., Karatzas, D., Lopresti, D. (eds.) DAS 2020. LNCS, vol. 12116, pp. 45–59. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-57058-3_4
7. Kang, L., Riba, P., Rusinol, M., Fornes, A., Villegas, M.: Content and style aware generation of text-line images for handwriting recognition. TPAMI (2022)
8. Lai, S., Jin, L., Zhu, Y., Li, Z., Lin, L.: SynSig2Vec: forgery-free learning of dynamic signature representations by Sigma Lognormal-based synthesis and 1D CNN. TPAMI (2021)
9. Ha, D.: Recurrent net dreams up fake Chinese characters in vector format with TensorFlow. http://blog.otoro.net/2015/12/28/recurrent-netdreams-up-fakechinese-characters-in-vectorformat-with-tensorflow/
10. Zhang, X.Y., Yin, F., Zhang, Y.M., Liu, C.L., Bengio, Y.: Drawing and recognizing Chinese characters with recurrent neural network. TPAMI (2018)
11. Tang, S., Xia, Z., Lian, Z., Tang, Y., Xiao, J.: FontRNN: generating large-scale Chinese fonts via recurrent neural network. In: CGF (2019)
12. Zhao, B., Tao, J., Yang, M., Tian, Z., Fan, C., Bai, Y.: Deep imitator: handwriting calligraphy imitation via deep attention networks. In: PR (2020)
13. Tang, S., Lian, Z.: Write like you: synthesizing your cursive online Chinese handwriting via metric-based meta learning. In: CGF (2021)
14. Dai, G., et al.: Disentangling writer and character styles for handwriting generation. In: CVPR (2023)
15. Das, A., Yang, Y., Hospedales, T., Xiang, T., Song, Y.Z.: ChiroDiff: modelling chirographic data with diffusion models. In: ICLR (2023)
16. Ho, J., Jain, A., Abbeel, P.: Denoising diffusion probabilistic models. In: NeurIPS (2020)
17. Yun, X.L., Zhang, Y.M., Yin, F., Liu, C.L.: Instance GNN: a learning framework for joint symbol segmentation and recognition in online handwritten diagrams. In: TMM (2021)
18. Graves, A.: Generating sequences with recurrent neural networks. arXiv (2013)
19. Ha, D., Eck, D.: A neural representation of sketch drawings. In: ICLR (2018)
20. Aksan, E., Deselaers, T., Tagliasacchi, A., Hilliges, O.: CoSE: compositional stroke embeddings. In: NeurIPS (2020)
21. Sohl-Dickstein, J., Weiss, E., Maheswaranathan, N., Ganguli, S.: Deep unsupervised learning using nonequilibrium thermodynamics. In: ICML (2015)
22. Dhariwal, P., Nichol, A.Q.: Diffusion models beat GANs on image synthesis. In: NeurIPS (2021)
23. Song, J., Meng, C., Ermon, S.: Denoising diffusion implicit models. In: ICLR (2021)
24. Luo, C.: Understanding diffusion models: a unified perspective. arXiv (2022)
25. Liu, C.L., Yin, F., Wang, D.H., Wang, Q.F.: CASIA online and offline Chinese handwriting databases. In: ICDAR (2011)

EEG Epileptic Seizure Classification Using Hybrid Time-Frequency Attention Deep Network

Yunfei Tian1,2, Chunyu Tan1,2(B), Qiaoyun Wu1,2, and Yun Zhou1,2

1 Engineering Research Center of Autonomous Unmanned System Technology, Ministry of Education, Anhui University, Hefei, China
{cytan,wuqiaoyu,zhouy}@ahu.edu.cn
2 Anhui Provincial Engineering Research Center for Unmanned System and Intelligent Technology, School of Artificial Intelligence, Anhui University, Hefei, China

Abstract. Epileptic seizure is a complex neurological disorder and is difficult to detect. Observing and analyzing the waveform changes of EEG signals is the main way to monitor epileptic activity. However, due to the complexity and instability of EEG signals, previous methods for identifying epileptic regions from EEG signals have not been very effective. On the one hand, these methods use the raw time series directly, which reflects limited epilepsy-related features; on the other hand, they do not fully consider the spatiotemporal dependencies of EEG signals. This study proposes a novel epileptic seizure classification method for EEG based on a hybrid time-frequency attention deep network, namely a time-frequency attention CNN-BiLSTM network (TFACBNet). TFACBNet first uses a time-frequency representation attention module to decompose the input EEG signals into multi-scale time-frequency features, which provide seizure-relevant information within the EEG signals. Then, a hybrid deep network combining a convolutional neural network (CNN) and a bidirectional LSTM (BiLSTM) architecture extracts the spatiotemporal dependencies of the EEG signals. Experiments on the benchmark Bonn EEG dataset achieve 98.84% accuracy on the three-class classification task and 92.35% accuracy on the five-class classification task. Our experimental results show that the proposed TFACBNet achieves state-of-the-art classification performance on epilepsy EEG signals.

Keywords: Epileptic seizure classification · Electroencephalogram (EEG) · Time-frequency attention · CNN · Bi-LSTM

1 Introduction

The automatic analysis of epileptic seizure classification is a key research tool for brain functional diseases and one of the challenges in the analysis of EEG signals. Many researchers have been identifying epileptic and non-epileptic activity using EEG signals for a long time [1–3]. Most studies detecting epileptic seizure


mainly fall into two methodologies: traditional methods, which combine feature extraction and machine learning, and deep learning methods [4,5].

Traditional methods mainly include time-domain analysis methods, frequency-domain analysis methods, and time-frequency analysis methods. Vijayalakshmi and Appaji [6] proposed a method based on time-domain template matching for EEG peak detection in epilepsy patients. However, due to the complexity of EEG and the diversity of epileptic discharges, it is difficult to directly analyze EEG in the time domain. Frequency-domain analysis methods point out another direction, transforming EEG signals into the frequency domain for processing. In [7], Wulandari et al. proposed an EEG visualization method by combining the genetic algorithm and Fourier spectrum analysis. Still, the instability and nonlinearity of EEG signals result in limited physiological information being extracted by frequency-domain analysis methods. Time-frequency analysis methods can simultaneously extract features related to both the time and frequency domains, making them more suitable for analyzing signals with severe oscillations such as EEG signals. Wavelet transform, with its multi-resolution and multi-scale analysis ability, generates time-scale-frequency descriptions of signals. Tang et al. [8] proposed a seizure detection algorithm by combining wavelet transform and detrended fluctuation analysis. Even though time-frequency analysis methods can effectively characterize EEG signals, it is difficult to further obtain discriminative features related to epilepsy.

Deep learning methods, which use multi-layer nonlinear processing units, can automatically learn and discover nonlinear relationships in data, making them better at extracting relevant features of EEG signals. Deep learning methods are generally divided into CNN-based methods, RNN-based methods and hybrid methods. For CNN-based methods, Elakkiya et al. developed a 1DCNN model for epileptic seizure detection [9] and Antoniades et al. detected epileptic discharges in intracranial EEG by designing a four-layer CNN [10]. Besides, Shoji et al. proposed a multi-channel CNN network to detect abnormal patterns in EEG signals [11]. For RNN-based methods, the common types are their evolutions, the long short-term memory network (LSTM) and bidirectional LSTM (BiLSTM). Tuncer et al. introduced a method for classifying epileptic EEG with BiLSTM [12]. Manohar et al. used the short-time Fourier transform to preprocess EEG signals, which are input to a BiLSTM for epilepsy detection [13]. Recently, hybrid networks have been developing rapidly, as their ability to explore different types of data structures allows for better performance in EEG epileptic seizure detection. A 1DCNN-LSTM network used in [14] has a 5% higher F1-score than CNN-based methods. Xu et al. improved the epilepsy detection performance by combining CNN and RNN instead of using them separately. Qiu et al. proposed a difference attention ResNet-LSTM model (DRARNet) [15], obtaining 98.17% and 90.17% accuracy for three-class and five-class epileptic seizure detection, respectively.

Although hybrid methods can improve the classification of epileptic EEG signals, the input of the hybrid network is the single raw signal, which results in insufficient features. In addition, the multi-scale and multi-structural information between EEG signals is not fully utilized by existing methods, so the


implied epilepsy information is limited. Finally, the irrelevant features generated by the deep-layer structure will distract the network's attention, making it impossible to further improve classification performance. Accordingly, we propose a time-frequency attention CNN-BiLSTM network (TFACBNet) for EEG epileptic seizure classification to make full use of the multi-scale time-frequency features of EEG signals and to extract effective epilepsy-discriminative information. The experimental results show that the proposed TFACBNet yields an impressive classification performance on EEG epileptic seizure classification. The key contributions of this study are summarized as follows:

(1) Combined with a time-frequency attention module, TFACBNet is proposed for EEG epileptic seizure classification. The proposed network structure can also be applied to other non-stationary signal processing problems.

(2) By using the time-frequency representation to describe the time-frequency characteristics of EEG signals and further enhancing epilepsy-related features through attention, the proposed method can extract relevant information that helps distinguish epileptic and non-epileptic activity within EEG signals.

(3) Experimental results based on the Bonn EEG database verify the superiority of the proposed TFACBNet compared to previous epilepsy detection methods.

The rest of this paper is organized as follows: the construction of TFACBNet is described in Sect. 2, Sect. 3 provides experimental results and analysis, and, finally, the conclusion is given in Sect. 4.

2 The Construction of TFACBNet

Aiming to extract the time-frequency features of EEG signals and focus on the critical information related to epileptic EEG signals, a time-frequency attention CNN-BiLSTM network is constructed in this paper and applied to the classification of epileptic EEG signals. The global flowchart of the proposed method is shown in Fig. 1. As can be seen from Fig. 1, after the original EEG signal is input into the network, the time-frequency representation is first obtained through the wavelet transform. Then, the multi-channel features are enhanced by the attention model. Furthermore, a deep hybrid network composed of CNN and BiLSTM is used to extract temporal and spatial epileptic seizure information. Finally, classification results are output through the fully connected layer that performs feature fusion.


Fig. 1. The global framework of TFACBNet.

2.1 Time-Frequency Attention Module

Time-Frequency Representation. EEG signals are chaotic and highly non-stationary. If the original EEG signals are processed directly and input into the constructed deep network, the extracted epilepsy-related information is limited, because significant brain activities often involve joint variations of time and frequency [16]. The time-frequency representation is well suited to describe and characterize non-stationary phenomena of the EEG signal, and can obtain time-frequency structures that are more discriminative between epileptic and non-epileptic activities. Wavelet transform is a continuously developing time-frequency representation method with rigorous and complete theoretical support [17]. It is suited for non-stationary time-series data like EEG signals. The continuous wavelet transform of a signal F(t) is defined as the correlation between the function F(t) and a wavelet ψ_{a,b},

(W_ψ F)(a, b) = |a|^{-1/2} \int_{-\infty}^{\infty} F(t)\, \psi^{*}\!\left(\frac{t-b}{a}\right) dt,   (1)

where a, b ∈ R, a ≠ 0, the asterisk denotes complex conjugation, and ψ_{a,b}(t) = |a|^{-1/2} ψ((t-b)/a). For some specific forms of wavelet functions, setting the discrete values a_k = 2^{-k} and b_{k,l} = 2^{-k} l yields the discrete transformation,

ψ_{k,l}(t) = 2^{k/2} ψ(2^{k} t - l),   (2)

where k, l ∈ Z. In this way, an EEG signal can be expanded in a series of wavelets. This discrete wavelet transform decomposition is also known as multi-resolution analysis according to a hierarchical scheme of nested subspaces. For a decomposition over K resolution levels (k = -K, ..., -1), the wavelet expansion of F(t) is

F(t) = \sum_{k=-K}^{-1} \sum_{l=-\infty}^{\infty} C_k(l)\, \psi_{k,l}(t),   (3)

where C_k(l) = ⟨F, ψ_{k,l}⟩ are the wavelet coefficients. Through the wavelet expansion, the decomposition component of each scale k contains the information of the


signal F(t) corresponding to the frequencies 2^k π ≤ |ω| ≤ 2^{k+1} π. Therefore, the wavelet transform performs a successive decomposition of the signal in different frequency scales. At each step, the corresponding details of the signal are separated, providing useful information for detecting and characterizing the short time-frequency phenomena of the epileptic EEG signal. EEG samples and their corresponding time-frequency representations are shown in Fig. 2. Figure 2(a) shows an EEG time series of a healthy person and its corresponding time-frequency diagram, and Fig. 2(b) is an EEG time series of an epileptic patient and its corresponding time-frequency diagram. It can be seen from Fig. 2 that the time-domain wave shape of EEG signals oscillates violently and changes quickly, making it difficult to visually distinguish between normal EEG signals and epileptic ones. However, from the corresponding time-frequency diagrams, it can be observed that, compared to the normal EEG signal, the time-frequency features of the epileptic EEG signal are more concentrated, making the time-frequency-related features more discriminative.
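To make the multi-scale decomposition above concrete, the short sketch below (not the authors' code) applies a discrete wavelet transform to a single EEG segment with the PyWavelets library; the wavelet family ("db4") and the number of levels are illustrative assumptions, and the coefficients are simply resampled to a common length so they can be stacked as a multi-channel input of shape (C, L) for the attention module.

```python
# Sketch: discrete wavelet decomposition of one 178-point EEG segment (PyWavelets).
# The wavelet ('db4') and level count are assumptions chosen only for illustration.
import numpy as np
import pywt

segment = np.random.randn(178).astype(np.float64)   # stand-in for one EEG segment

# wavedec returns [cA_K, cD_K, ..., cD_1]: one approximation plus K detail scales,
# each detail roughly covering the band 2^k*pi <= |omega| <= 2^(k+1)*pi.
coeffs = pywt.wavedec(segment, wavelet="db4", level=4)

# Resample every scale back to the original length so the scales can be stacked
# into a multi-channel representation of shape (C, L) for the attention module.
grid = np.linspace(0.0, 1.0, len(segment))
channels = [np.interp(grid, np.linspace(0.0, 1.0, len(c)), c) for c in coeffs]
tf_input = np.stack(channels, axis=0)
print(tf_input.shape)   # (5, 178): 1 approximation + 4 detail scales
```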


Fig. 2. EEG samples and their corresponding time-frequency representations. (a) An EEG time series of a healthy person and its corresponding time-frequency diagram. (b) An EEG time series of an epileptic patient and its corresponding time-frequency diagram.

Time-Frequency Enhanced Attention. The time-frequency representation method represents the original EEG signal as multi-channel signals with different time-frequency characteristics, some of which are closely related to the pathological features of epilepsy. However, this inevitably generates some redundant information, especially when affected by noise. So, we embed a time-frequency enhanced attention to adaptively focus on relevant time-frequency features, weakening irrelevant information. The enhanced attention structure is shown in Fig. 3. As seen from Fig. 3, the multi-channel EEG signals F̃ ∈ R^{C×L} (C is the number of channels, and L is the number of signal points) output by the time-frequency representation are converted into the channel attention map M ∈ R^C. Firstly, F̃ is reshaped to


R^C by a global average pooling operation. Next, the channel attention map M ∈ R^C is obtained by the fully connected layers, which is

M_i = f_2\left( w_2\, f_1\!\left( w_1 \frac{\sum_{j=1}^{L} \tilde{F}_{i,j}}{|L|} \right) \right),   (4)

where w_1, w_2 are the learnable weights, and f_1, f_2 denote the ReLU function and Sigmoid function, respectively. Then the output is obtained by

\tilde{F}' = M \bullet \tilde{F}.   (5)

Therefore, through the learned channel attention map, channels containing more epilepsy relevant information will be strengthened, while channels containing redundant information will be weakened.
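A minimal PyTorch sketch of Eqs. (4)–(5) is given below; the two fully connected layers stand for the learnable weights w_1 and w_2, and the channel-reduction ratio is an assumption made only for the example, not a value stated in the paper.

```python
# Sketch of the time-frequency enhanced attention of Eqs. (4)-(5): global average
# pooling along the signal axis, FC -> ReLU -> FC -> Sigmoid, then channel-wise
# reweighting of the input. The reduction ratio is an illustrative assumption.
import torch
import torch.nn as nn

class TimeFreqChannelAttention(nn.Module):
    def __init__(self, channels: int, reduction: int = 2):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),  # w1 followed by f1 = ReLU
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),  # w2 followed by f2 = Sigmoid
            nn.Sigmoid(),
        )

    def forward(self, f: torch.Tensor) -> torch.Tensor:
        # f: (batch, C, L) multi-channel time-frequency representation
        m = self.fc(f.mean(dim=-1))        # Eq. (4): channel attention map M in R^C
        return f * m.unsqueeze(-1)         # Eq. (5): strengthen / weaken channels

attn = TimeFreqChannelAttention(channels=6)
print(attn(torch.randn(8, 6, 178)).shape)  # torch.Size([8, 6, 178])
```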

Fig. 3. The time-frequency enhanced attention model

2.2 Hybrid Deep Network Architecture

Nowadays, a large number of hybrid networks have emerged. In order to effectively extract both temporal and spatial information from multi-channel EEG signals simultaneously, and also to lighten the network structure, we utilize a deep hybrid network combining CNN and BiLSTM. The detailed structure of the deep hybrid network and the output feature size of each layer are summarized in Fig. 1. As shown in Fig. 1, the layers of the CNN typically consist of convolutional layers, batch normalization (BN) layers and rectified linear units (ReLU). Convolutional layers apply a convolution operation to the input, transferring the learned features to the next layer. Due to the sequential characteristic of EEG signals, we chose the one-dimensional convolution (Conv1D) operation. The spatial correlations of EEG signals are obtained by the constructed CNN. LSTM is a variant of RNN, which alleviates the problem of gradient vanishing in RNN and is very suitable for modeling sequential data. An LSTM is composed of LSTM units, each containing an input gate, a forget gate, and an output gate [18]. The input gate is used to retain input information, the forget gate controls the retention of information from the previous moment, and the output gate learns to obtain information for the next unit. By learning the parameters in each gate, LSTM can fully extract the temporal dependency between sequences. BiLSTM is an improved version of LSTM that combines a forward LSTM and a backward LSTM. Through the bidirectional structure, BiLSTM can obtain information transmitted from


previous data and also learn the influence of subsequent sequences. In the proposed deep network, we use a two-layer BiLSTM network to fully capture the implicit temporal information between EEG signals, which can be seen in Fig. 1.
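The following PyTorch sketch illustrates the kind of Conv1D/BN/ReLU blocks followed by a two-layer BiLSTM described above; the channel widths, kernel sizes and pooling are illustrative assumptions rather than the exact configuration summarized in Fig. 1.

```python
# Sketch of the hybrid branch: Conv1D + BN + ReLU blocks for spatial correlations,
# a two-layer BiLSTM for temporal dependencies, and a fully connected classifier.
# Layer sizes are illustrative assumptions, not the paper's exact configuration.
import torch
import torch.nn as nn

class CNNBiLSTM(nn.Module):
    def __init__(self, in_channels: int = 6, num_classes: int = 5):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv1d(in_channels, 32, kernel_size=5, padding=2),
            nn.BatchNorm1d(32), nn.ReLU(inplace=True), nn.MaxPool1d(2),
            nn.Conv1d(32, 64, kernel_size=5, padding=2),
            nn.BatchNorm1d(64), nn.ReLU(inplace=True), nn.MaxPool1d(2),
        )
        self.bilstm = nn.LSTM(input_size=64, hidden_size=64, num_layers=2,
                              batch_first=True, bidirectional=True)
        self.fc = nn.Linear(2 * 64, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, C, L) enhanced time-frequency channels
        feats = self.cnn(x).transpose(1, 2)   # (batch, L', 64) token sequence
        out, _ = self.bilstm(feats)           # (batch, L', 128) bidirectional states
        return self.fc(out[:, -1])            # class logits from the last time step

print(CNNBiLSTM()(torch.randn(8, 6, 178)).shape)  # torch.Size([8, 5])
```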

3 Experiment and Results

In this section, experiments are conducted to evaluate the effectiveness of the proposed TFACBNet. The experimental results are given below.

3.1 Database

We use EEG signals from the Bonn EEG database [19], which is the most widely used benchmark database for epileptic seizure classification. This database consists of five EEG subsets, covering healthy subjects and epilepsy patients, which are labelled A, B, C, D, and E. Each subset includes 100 EEG recordings of 23.6 s duration, sampled at 173.61 Hz with the standard 10–20 electrode system. Subsets A and B are collected from healthy people with open and closed eyes, respectively. Subsets C, D and E are EEG signals recorded from epileptic patients, where subsets C and D are collected during the seizure-free (interictal) interval and subset E is collected during the seizure period. Each EEG recording is divided into 23 signal segments. A total of 11500 signal segments are extracted from the EEG recordings, and each segment is 178 points long in this study. All category segments are randomly divided into a training set and a test set; that is, 90% of the segments of each category are randomly selected for the training set, and the remaining 10% are used for the test set. To keep consistency with other epileptic seizure detection methods and make fair comparisons, experiments are conducted on the two-category classification task (normal vs. epileptic conditions), the three-category classification task (AB of the normal condition, CD of the epileptic condition, and E of the seizure condition), and the five-category classification task (the epileptic state and four non-epileptic states). The proposed method is implemented with the Keras framework on the Windows 10 operating system with an Intel i7-12700 CPU and an Nvidia 4070Ti GPU. A dynamic learning rate is adopted: the initial value is 0.001 and it is multiplied by 0.1 every 25 epochs. The batch size is set to 128, the cross-entropy function is used as the loss function, and the Adam optimizer is used. All experimental tasks are trained for 100 epochs, with the best result taken as the final result.
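The training setup above can be written down directly; the snippet below is only an equivalent PyTorch transcription of the stated configuration (the paper itself uses Keras), with a trivial stand-in model in place of TFACBNet.

```python
# Sketch of the stated training configuration: Adam, initial LR 1e-3 multiplied by
# 0.1 every 25 epochs, batch size 128, cross-entropy loss, 100 epochs. The model
# below is only a stand-in; the paper's implementation is in Keras.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Flatten(), nn.Linear(6 * 178, 5))   # stand-in for TFACBNet
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=25, gamma=0.1)
criterion = nn.CrossEntropyLoss()

train_loader = []  # replace with a DataLoader over the training segments, batch size 128
for epoch in range(100):
    for x, y in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(x), y)
        loss.backward()
        optimizer.step()
    scheduler.step()
```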

3.2 Evaluation Metric

We use four typical classification indicators to evaluate the epileptic seizure detection performance: accuracy, precision, recall, and F1-score. These indicators are defined as follows:

Accuracy = \frac{TP + TN}{TP + TN + FP + FN}, \quad Precision = \frac{TP}{TP + FP},   (6)

Recall = \frac{TP}{TP + FN}, \quad F1\text{-}score = 2 \times \frac{Precision \times Recall}{Precision + Recall},   (7)

where TP, TN represent the number of true positive samples and true negative samples, respectively, and FP, FN denote the number of false positive samples and false negative samples, respectively. A larger value of accuracy, precision, recall, and F1-score means better epileptic seizure classification performance, and the F1-score is a comprehensive metric of the classification performance.
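For completeness, Eqs. (6)–(7) can be evaluated from the confusion-matrix counts as in the short helper below; the counts in the example call are made up for illustration.

```python
# Eqs. (6)-(7) written out from confusion-matrix counts; the example counts are made up.
def classification_metrics(tp: int, tn: int, fp: int, fn: int):
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1

print(classification_metrics(tp=230, tn=900, fp=10, fn=10))
```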

3.3 Experimental Results

Ablation Studies. Ablation experiments are conducted to verify the effect of the time-frequency representation and the time-frequency enhanced attention module. BNet, CBNet, and TFCBNet are constructed as downgraded versions of TFACBNet. Specifically, CBNet removes the time-frequency attention module, TFCBNet removes the time-frequency enhanced attention layer, and BNet only uses the BiLSTM. Experimental results are shown in Fig. 4. As can be seen in Fig. 4(a), for the three-category classification task, the evaluation indicators of the four models are all above 90%, and the results of TFCBNet and TFACBNet reach 98%. Compared with BNet, CBNet and TFCBNet, the accuracy of TFACBNet is improved by 5.2%, 2.34%, and 0.49%, respectively. Figure 4(b) presents the experimental results of the five-category classification task. The accuracy, precision, recall, and F1-score of TFACBNet reach 92.35%, 92.13%, 92.17% and 92.14%, respectively, which are higher than the results of the other three downgraded models. According to Fig. 4, the F1-score of TFACBNet is 0.36% and 0.43% higher than that of TFCBNet in (a) and (b), respectively, while the F1-score of TFCBNet is 5.49% and 8.88% higher than that of CBNet, respectively. This indicates that the time-frequency representation of EEG signals provides extra useful information for the model to obtain more discriminative epileptic features, and the attention further enhances their distinctiveness to some extent. In addition, the F1-score curves of the four models are shown in Fig. 5: Fig. 5(a) shows the F1-score curves of the three-category classification task and Fig. 5(b) shows the F1-score curves of the five-category classification task. Obviously, as shown in Fig. 5, TFACBNet is superior to the others, which once again proves the effectiveness of the time-frequency attention model in the proposed method.

Comparative Experiment. Comparative experiments are conducted to further evaluate the validity of the proposed TFACBNet. We compare TFACBNet with five epileptic seizure classification methods: DARLNet, ResNet, Vgg16, AlexNet, and 1DCNN-LSTM. Experimental results of the two-category and five-category classification tasks are reported in Table 1 and Table 2, respectively. As shown in Table 1 and Table 2, the proposed TFACBNet outperforms the other compared methods, especially for the multi-class classification in Table 2. The F1-score of TFACBNet is 13.07%, 12.11%, 8.1%, 10.58%, and 2.09% higher than that of AlexNet, Vgg16, ResNet, 1DCNN-LSTM [14], and DARLNet [15], which demonstrates the superiority of TFACBNet. Furthermore, for a fairer comparison, the comparison between recently proposed methods using the Bonn EEG


Fig. 4. (a) Ablation experimental results of three-category classification task. (b) Ablation experimental results of five-category classification task.

Fig. 5. F1-score curves of CBNet, TFCBNet, and TFACBNet. (a) Three-category classification: AB-CD-E; (b) Five-category classification: A-B-C-D-E.

Table 1. The performance of AlexNet, Vgg16, ResNet, 1DCNN-LSTM, DARLNet and the proposed TFACBNet model on the two-category classification task.

Models             Accuracy (%)  Precision (%)  Recall (%)  F1-score (%)
AlexNet            96.17         97.39          96.28       96.83
Vgg16              96.61         97.68          96.70       97.19
ResNet             97.04         97.99          97.67       97.53
1DCNN-LSTM [14]    99.39         98.39          98.79       98.59
DARLNet [15]       98.87         98.99          99.13       99.06
TFACBNet           99.78         99.78          99.77       99.78


Table 2. The performance of AlexNet, Vgg16, ResNet, 1DCNN-LSTM, DARLNet and the proposed TFACBNet on the five-category classification task.

Models             Accuracy (%)  Precision (%)  Recall (%)  F1-score (%)
AlexNet            79.48         79.37          79.24       79.12
Vgg16              80.52         80.41          81.15       80.03
ResNet             84.17         84.08          84.07       84.04
1DCNN-LSTM [14]    82.00         81.78          81.70       81.56
DARLNet [15]       90.17         90.00          90.16       90.05
TFACBNet           92.35         92.13          92.17       92.14

database [15,20–25] and the proposed method is presented in Table 3. It can be seen in Table 3 that the proposed method achieves significant improvements in D-E of the two-category classification task, AB-CD-E of the three-category classification task, and A-B-C-D-E of the five-category classification task. Therefore, TFACBNet does have a superior effect on EEG epileptic seizure classification.

4 Conclusion

A new TFACBNet model for EEG epileptic seizure classification is proposed in this paper. The wavelet transform is first applied to represent EEG signals as time-frequency representations, providing multi-scale time-frequency characteristics to obtain more information related to epileptic EEG. Then, a time-frequency attention is introduced to adaptively enhance the epilepsy-relevant features of the multi-channel signals. Finally, a hybrid deep network combining CNN and BiLSTM is constructed to extract the spatio-temporal dependence within EEG signals. The effectiveness of TFACBNet is verified by experiments on the Bonn EEG database. Compared with previous studies, the proposed method exhibits superior performance for classifying EEG epileptic seizures. Furthermore, the proposed network structure can also be applied to other similar signal processing problems for disease detection and analysis (Table 3).


Table 3. Performance comparison of the proposed TFACBNet with other state-of-the-art studies

Classification task  Authors                    Years  Accuracy (%)  Precision (%)  Recall (%)  F1-score (%)
A-E                  The proposed method        –      100.00        100.00         100.00      100.00
                     DARLNet [15]               2023   100.00        100.00         100.00      100.00
                     Tajmirriahi et al. [22]    2021   99.00         \              99.00       \
                     Ramos-Aguilar et al. [23]  2020   100.00        \              \           \
                     Siuly et al. [24]          2019   99.50         \              100.00      \
B-E                  The proposed method        –      99.78         99.78          99.77       99.78
                     DARLNet [15]               2023   100.00        100.00         100.00      100.00
                     Tajmirriahi et al. [22]    2021   99.50         \              100.00      \
                     Ramos-Aguilar et al. [23]  2020   99.80         \              \           \
                     Siuly et al. [24]          2019   99.00         \              99.09       \
                     Deng et al. [25]           2018   97.01         \              \           \
C-E                  The proposed method        –      98.69         98.48          98.49       98.47
                     DARLNet [15]               2023   99.78         100.00         99.56       99.78
                     Tajmirriahi et al. [22]    2021   99.60         \              98.00       \
                     Ramos-Aguilar et al. [23]  2020   99.25         \              \           \
                     Siuly et al. [24]          2019   98.50         \              98.18       \
D-E                  The proposed method        –      100.00        100.00         100.00      100.00
                     DARLNet [15]               2023   99.57         99.12          100.00      99.56
                     Tajmirriahi et al. [22]    2021   99.20         \              98.80       \
                     Ramos-Aguilar et al. [23]  2020   97.75         \              \           \
                     Siuly et al. [24]          2019   97.50         \              98.09       \
AB-CD-E              The proposed method        –      98.84         98.62          98.70       98.66
                     DARLNet [15]               2023   98.17         97.46          97.50       97.46
                     Acharya et al. [20]        2018   88.67         \              95.00       \
A-B-C-D-E            The proposed method        –      92.35         92.13          92.17       92.14
                     DARLNet [15]               2023   90.17         90.00          90.16       90.05
                     Xu et al. [21]             2020   82.00         81.78          81.70       81.56

Acknowledgment. The authors express gratitude to the anonymous referee for his/her helpful suggestions and the partial supports of the National Natural Science Foundation of China (62206005/62236002/62206001).

Declaration of Competing Interest. No author associated with this paper has disclosed any potential or pertinent conflicts which may be perceived to have impending conflict with this work.

References 1. Fisher, R.S., et al.: Epileptic seizures and epilepsy: definitions proposed by the international league against epilepsy (ILAE) and the international bureau for epilepsy (IBE). Epilepsia 46(4), 470–472 (2005) 2. MacAllister, W.S., Schaffer, S.G.: Neuropsychological deficits in childhood epilepsy syndromes. Neuropsychol. Rev. 17, 427–444 (2007)


3. Acharya, U.R., Fujita, H., Sudarshan, V.K., Bhat, S., Koh, J.E.: Application of entropies for automated diagnosis of epilepsy using EEG signals: a review. Knowl. Based Syst. 88, 85–96 (2015) 4. Zhou, M., et al.: Epileptic seizure detection based on EEG signals and CNN. Front. Neuroinform. 12, 95 (2018) 5. Sahu, R., Dash, S.R., Cacha, L.A., Poznanski, R.R., Parida, S.: Epileptic seizure detection: a comparative study between deep and traditional machine learning techniques. J. Integr. Neurosci. 19(1), 1–9 (2020) 6. Vijayalakshmi, K., Abhishek, A.M.: Spike detection in epileptic patients EEG data using template matching technique. Int. J. Comput. Appl. 2(6), 5–8 (2010) 7. Wulandari, D.P., Suprapto, Y.K., Juniani, A.I., Elyantono, T.F., Purnami, S.W., Islamiyah, W.R.: Visualization of epilepsy patient’s brain condition based on spectral analysis of EEG signals using topographic mapping. In: 2018 International Conference on Computer Engineering, Network and Intelligent Multimedia (CENIM), pp. 7–13. IEEE (2018) 8. Tang, L., Zhao, M., Wu, X.: Accurate classification of epilepsy seizure types using wavelet packet decomposition and local detrended fluctuation analysis. Electron. Lett. 56(17), 861–863 (2020) 9. Elakkiya, R.: Machine learning based intelligent automated neonatal epileptic seizure detection. J. Intell. Fuzzy Syst. 40(5), 8847–8855 (2021) 10. Antoniades, A., et al.: Detection of interictal discharges with convolutional neural networks using discrete ordered multichannel intracranial EEG. IEEE Trans. Neural Syst. Rehabil. Eng. 25(12), 2285–2294 (2017) 11. Shoji, T., Yoshida, N., Tanaka, T.: Automated detection of abnormalities from an EEG recording of epilepsy patients with a compact convolutional neural network. Biomed. Sig. Process. Control 70, 103013 (2021) 12. Tuncer, E., Bolat, E.D.: Classification of epileptic seizures from electroencephalogram (EEG) data using bidirectional short-term memory (bi-LSTM) network architecture. Biomed. Sig. Process. Control 73, 103462 (2022) 13. Beeraka, S.M., Kumar, A., Sameer, M., Ghosh, S., Gupta, B.: Accuracy enhancement of epileptic seizure detection: a deep learning approach with hardware realization of STFT. Circ. Syst. Sig. Process. 41, 461–484 (2022) 14. Xu, G., Ren, T., Chen, Y., Che, W.: A one-dimensional CNN-LSTM model for epileptic seizure recognition using EEG signal analysis. Front. Neurosci. 14, 578126 (2020) 15. Qiu, X., Yan, F., Liu, H.: A difference attention ResNet-LSTM network for epileptic seizure detection using EEG signal. Biomed. Sig. Process. Control 83, 104652 (2023) 16. Craley, J., Johnson, E., Venkataraman, A.: A spatio-temporal model of seizure propagation in focal epilepsy. IEEE Trans. Med. Imaging 39(5), 1404–1418 (2019) 17. Daubechies, I.: Orthonormal bases of compactly supported wavelets. Commun. Pure Appl. Math. 41(7), 909–996 (1988) 18. Greff, K., Srivastava, R.K., Koutník, J., Steunebrink, B.R., Schmidhuber, J.: LSTM: a search space odyssey. IEEE Trans. Neural Netw. Learn. Syst. 28(10), 2222–2232 (2016) 19. Andrzejak, R.G., Lehnertz, K., Mormann, F., Rieke, C., David, P., Elger, C.E.: Indications of nonlinear deterministic and finite-dimensional structures in time series of brain electrical activity: dependence on recording region and brain state. Phys. Rev. E 64(6), 061907 (2001)


20. Acharya, U.R., Oh, S.L., Hagiwara, Y., Tan, J.H., Adeli, H.: Deep convolutional neural network for the automated detection and diagnosis of seizure using EEG signals. Comput. Biol. Med. 100, 270–278 (2018) 21. Chen, X., Ji, J., Ji, T., Li, P.: Cost-sensitive deep active learning for epileptic seizure detection. In: Proceedings of the 2018 ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics, pp. 226–235 (2018) 22. Tajmirriahi, M., Amini, Z.: Modeling of seizure and seizure-free EEG signals based on stochastic differential equations. Chaos, Solitons Fractals 150, 111104 (2021) 23. Ramos-Aguilar, R., Olvera-López, J.A., Olmos-Pineda, I., Sánchez-Urrieta, S.: Feature extraction from EEG spectrograms for epileptic seizure detection. Pattern Recogn. Lett. 133, 202–209 (2020) 24. Siuly, S., Alcin, O.F., Bajaj, V., Sengur, A., Zhang, Y.: Exploring Hermite transformation in brain signal analysis for the detection of epileptic seizure. IET Sci. Measur. Technol. 13(1), 35–41 (2019) 25. Deng, Z., Xu, P., Xie, L., Choi, K.S., Wang, S.: Transductive joint-knowledgetransfer TSK FS for recognition of epileptic EEG signals. IEEE Trans. Neural Syst. Rehabil. Eng. 26(8), 1481–1494 (2018)

A Feature Pyramid Fusion Network Based on Dynamic Perception Transformer for Retinal OCT Biomarker Image Segmentation

Xiaoming Liu(B) and Yuanzhe Ding

Wuhan University of Science and Technology, Wuhan, China
[email protected]

Abstract. OCT biomarkers are important for assessing the developmental stages of retinal diseases. However, biomarkers show diverse and irregular features in OCT images and do not have a fixed location. In addition, irregular and unevenly distributed biomarkers can be accompanied by damage to the retinal layers, which can lead to blurred boundaries, making it difficult for physicians to make judgments about biomarkers. Therefore, we propose a dynamic perception Transformer-based feature pyramid fusion network for segmenting retinal OCT biomarker images. Our network consists of two modules: the Feature Pyramid Fusion Module (FPFM) and the Dynamic Scale Transformer Module (DSTM). The FPFM connects features at different scales while incorporating an attention mechanism to emphasize the important features. The DSTM dynamically adjusts the scale of the fused features and captures their long-range dependencies, which enables us to preserve small biomarkers and adapt to complex shapes. In this way, we can cope with the challenge of excessive biomarker scale changes from a variable-scale perspective. Our proposed model demonstrates good performance on a local dataset. Keywords: OCT · Retinal Biomarker Segmentation · Transformer

1 Introduction

Retinal biomarkers can reflect the stages of diseases, predict the progression of the diseases and indicate some prognostic factors, which can help ophthalmologists assess the patients' ophthalmic health more conveniently and provide timely clinical diagnosis and treatment [1, 2]. Treatment success for age-related macular degeneration (AMD) depends greatly on the stage of disease progression [3]. Studies have shown that early and prompt diagnosis and treatment can slow down the progression of AMD and improve the patients' outcomes [4, 5]. However, early biomarkers of AMD do not change significantly, whereas in the mid- to late-stage, the retina is usually severely damaged [6, 7]. Therefore, it is crucial to accurately identify the various biomarkers for early detection of AMD and to initiate appropriate therapy [8]. OCT can reveal pathological changes in different ocular structures [9, 10]. To detect some diseases in retinal OCT, early studies used classical algorithms [11–13]. However, these algorithms require pre-processing of OCT images, such as denoising, enhancement, etc.,


to improve image quality and usability. Figure 1 illustrates the eight retinal biomarkers to be recognized.

Fig. 1. Examples of eight retinal OCT biomarkers. They are Epiretinal Membrane (ERM), Pigment Epithelial Detachment (PED), Geographic Atrophy (GA), Intraretinal Fluid (IRF), Hyperreflective (HF), Choroidal Neovascularization (CNV), Subretinal Fluid (SRF), and DRUSEN.

As deep learning algorithms are more advanced and powerful, they have been significantly surpassing the effectiveness of traditional methods in retinal OCT biomarker segmentation in recent years [14–16]. To address the challenge of uneven distribution of hyperreflective foci (HF) on retinal OCT images, Xie et al. [17] proposed an effective HF segmentation model. Firstly, the image was pre-processed by denoising and enhancement techniques. Subsequently, dilated convolution was applied in the deepest layer of the network to improve perception. Due to the model’s inability to fully consider the small target properties of HF and address the category imbalance problem when segmenting multiple categories, [18] proposed a novel network that can simultaneously segment multiple retinal layers and small foreground macular cystoid edema (MCE). This network effectively utilizes spatial and channel attention to extract features from multiple perspectives. Moreover, the weighted loss approach was used to mitigate category imbalances. Morelle et al. [19] effectively localized drusen by utilizing


prior structural information constraints on retinal layers, with dilation convolution and attention gating mechanisms to handle drusen of different sizes.

Fig. 2. Illustration of the existing problem. Blue and yellow biomarkers are IRF and SRF, green and light cyan are mis-segmented PED and HF. (a) original images, (b) U-Net based segmentation results, (c) the ground-truth annotation of biomarkers.

Figure 2 illustrates the performance of U-Net on biomarker segmentation in OCT images. As can be seen, OCT images contain various types of biomarkers, such as SRF, IRF, etc., due to the complex pathological features of the retina. These biomarkers have diverse and uncertain characteristics in terms of size, shape and location. For instance, the IRF in Fig. 2(a) can range from a few pixels to hundreds of pixels in size, can have circular, elliptical or irregular shapes, and can be located in the center or the periphery. These features vary with the progression of the disease and do not follow a general pattern. Moreover, it is challenging to differentiate between different classes of biomarkers, as illustrated in Fig. 2(b), where the arrows indicate cases of IRFs being misclassified as HFs and SRFs. IRFs are typically small and located in the neurosensory layer or the pigmented epithelium, while SRFs are usually irregular and located near neovascularization or pigmented epithelial detachment. Therefore, biomarker segmentation is a difficult and valuable task. Existing methods for multi-class biomarker segmentation rarely consider the relationship between the biomarker and the image. To explore the information related to the OCT image itself and the biomarkers, we propose a dynamic perception Transformer-based feature pyramid fusion network. Our approach to biomarker segmentation is based on spatial pyramid modeling and an attention mechanism, which takes into account the scale and positional variations of biomarkers in intra-class and inter-class scenarios. Therefore, our network accepts multi-scale OCT image inputs and uses dilated convolutions with different receptive fields to extract multi-scale features for biomarkers of different sizes. Meanwhile, we observe that some small biomarkers have a large-scale distribution, so we introduce the attention mechanism to dynamically enhance the long-range dependencies while maintaining performance. Specifically, we introduce an innovative segmentation framework for biomarker segmentation in retinal OCT images. Firstly, we introduce an innovative Feature Pyramid Fusion Module (FPFM) to extract multi-scale information of


multi-classes of biomarkers from images and enhance the network’s ability to represent the local detailed features of biomarkers with highly variable sizes. Then, we propose a novel Dynamic Scale Transformer Module (DSTM) to address the challenge of low recognition rate of small biomarkers by using variable scale Transformer to complement local and global information while adapting to complex shaped biomarkers.

Fig. 3. The overview of the proposed Method.

2 Proposed Method

We introduce a retinal biomarker segmentation network based on an encoder-decoder framework, as depicted in Fig. 3. The primary aim of the network is to categorize each pixel in a retinal OCT image into one of C categories, where C = 9; 0 denotes the background, and 1–8 denote eight different biomarkers. The network consists of four main components: a U-Net encoder, a Feature Pyramid Fusion Module (FPFM), a Dynamic Scale Transformer Module (DSTM), and a U-Net decoder. Our encoder consists of multiple layers, where each layer accepts OCT images that have been adjusted to fit the current encoder size and iteratively combines them with features derived from the previous layer. This way, our encoder can capture both fine-grained and coarse-grained information from the OCT images at different scales. Next, we process each layer of features other than the first one through the FPFM and DSTM, respectively, to improve the expressiveness and long-range dependency of the features. Finally, we input the processed features into the decoder to recover the spatial resolution of the features and output the final segmentation results.

2.1 Feature Pyramid Fusion Module

Multi-class biomarker segmentation is a challenging task, as biomarkers of the same or different classes may vary greatly in their scale and location in the images. Previous


models often fail to simultaneously capture the fine-grained details and the global context information of the biomarkers. To address this issue, we introduce the Feature Pyramid Fusion Module (FPFM), which coherently integrates the contextual information and the detailed texture features. It employs a feature pyramid network to amalgamate encoder features at varying resolutions, thereby enabling the capture of both coarse-grained and fine-grained features by using different dilation rates. This fusion process enables the model to capture both the intricate details of the biomarkers and the spatial relationships in the images. Moreover, the FPFM employs a spatial attention mechanism that considers the importance of different regions in the images and selects the key features.

Fig. 4. General structure of the proposed FPFM. The FPFM module processes multi-scale OCT feature inputs. It uses dilated convolution to increase the receptive field and extract richer semantic information. At the same time, it uses spatial pyramid attention mechanism to deal with the complex spatial features of multiple biomarkers.

As depicted in Fig. 3, our proposed FPFM is applied at the skip connections of the 2nd, 3rd, 4th and 5th scales. As shown in Fig. 4, from the second scale onwards, we embed dilated convolution into spatial attention. This way, we can use receptive fields of different sizes to fit biomarkers of different sizes, and highlight the important information in the features. Specifically, we first reduce the channel dimension of the fused feature f_i, and use four parallel depth-wise convolutions with different dilation rates (1, 2, 4, 8) to obtain multi-scale global context information. Then we concatenate the pyramidal features along the channel dimension and get the merged multi-scale context feature. Next, we squeeze the channel to 1, and use a sigmoid to get the weight map of the biomarker. Finally, we multiply the weight with the input feature f_i, resulting in the output feature f_i^{FPFM} of the FPFM. The formula for this process is as follows:

f_i^{PF} = \mathrm{Concat}\big(\mathrm{DSConv}_{rate=1,2,4,8}(\mathrm{Conv}_{1\times 1}(f_i))\big), \quad i = 2, 3, 4, 5,   (1)

f_i^{FPFM} = f_i \otimes \sigma\big(\mathrm{Conv}_{1\times 1}(f_i^{PF})\big), \quad i = 2, 3, 4, 5,   (2)

where DSConv_{rate=1,2,4,8}(·) denotes the depth-wise dilated convolution operations with different dilation rates, Concat(·) is the operation of aggregating the parallel dilated convolutions over channels, f_i^{PF} is the feature after fusing the dilated convolutions of different scales, ⊗ is the element-wise feature multiplication operation, and σ(·) is the sigmoid operation.
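A hedged PyTorch sketch of Eqs. (1)–(2) is given below; the channel numbers are assumptions, and the sketch only reproduces the structure (1×1 reduction, four depth-wise dilated branches, concatenation, a one-channel sigmoid weight map, and element-wise reweighting), not the authors' implementation.

```python
# Sketch of the FPFM in Eqs. (1)-(2): parallel depth-wise dilated convolutions
# (rates 1, 2, 4, 8) after a 1x1 channel reduction, concatenation, a 1x1
# convolution squeezed to one channel, a sigmoid weight map, and channel-wise
# multiplication with the input feature. Channel sizes are illustrative assumptions.
import torch
import torch.nn as nn

class FPFM(nn.Module):
    def __init__(self, channels: int, reduced: int = 64):
        super().__init__()
        self.reduce = nn.Conv2d(channels, reduced, kernel_size=1)
        self.branches = nn.ModuleList([
            nn.Conv2d(reduced, reduced, kernel_size=3, padding=r, dilation=r,
                      groups=reduced)                      # depth-wise dilated conv
            for r in (1, 2, 4, 8)
        ])
        self.to_map = nn.Conv2d(4 * reduced, 1, kernel_size=1)

    def forward(self, f: torch.Tensor) -> torch.Tensor:
        r = self.reduce(f)
        pf = torch.cat([b(r) for b in self.branches], dim=1)   # Eq. (1)
        weight = torch.sigmoid(self.to_map(pf))                # spatial weight map
        return f * weight                                      # Eq. (2)

print(FPFM(channels=128)(torch.randn(2, 128, 64, 64)).shape)  # (2, 128, 64, 64)
```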


2.2 Dynamic Scale Transformer Module

OCT images show diverse biomarker distributions, which makes conventional models prone to fitting large targets and failing to detect small targets effectively. Hence, small target detection poses great challenges. We propose a Multi-Head Dynamic Scale Attention (MhDSA) mechanism, which can adaptively adjust the size and location of the attention region based on different levels of the feature map, thereby capturing long-range dependencies in both the horizontal and vertical directions and lowering computational complexity. The brief structure of the DSTM, as illustrated in Fig. 5, primarily includes residual connections, layer normalization, MhDSA and feed-forward networks. For the proposed

Fig. 5. The overall structure of the proposed DSTM. On the left is a brief composition of the DSTM module. Residual connections and layer normalization can improve the model’s stability and convergence rate, and multi-head self-attention and MLP can enhance the model’s expressiveness and nonlinearity. On the right is the proposed novel attention mechanism, Multi-Head Dynamic Scale Attention (MhDSA). MhDSA is composed of Row-wise Self-Attention (RSA) and Columnwise Self-Attention (CSA), which capture global dependencies along the horizontal and vertical axes, respectively.


MhDSA, we apply the self-attention computation separately in the horizontal and vertical directions to capture the long-range dependencies along both axes. To enhance the model's expressiveness without introducing too much complexity, we design a dynamic attention mechanism. We partition the input features into horizontal or vertical strips of width n in an isometric manner, and dynamically adjust the width based on the network's depth, using a smaller n at shallow levels and a larger n at deeper levels. The overall formulation of our DSTM is as follows:

f_i^{DSTM} = \mathrm{MLP}\big(\mathrm{Norm}(f_i^{MhDSA})\big) + f_i^{MhDSA}, \quad i = 2, 3, 4, 5,   (3)

f_i^{MhDSA} = \big(\mathrm{RSA}(\mathrm{Norm}(f_i^{FPFM})) + \mathrm{CSA}(\mathrm{Norm}(f_i^{FPFM}))\big) + f_i^{FPFM}, \quad i = 2, 3, 4, 5,   (4)

where f_i^{FPFM} is the output feature of the FPFM, Norm(·) is the layer normalization operation, RSA(·) and CSA(·) are row-wise self-attention and column-wise self-attention, respectively, f_i^{MhDSA} is the output feature of the multi-head dynamic scale attention, MLP(·) is the multilayer perceptron of the feed-forward network, and f_i^{DSTM} is the output feature of the DSTM.
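The row-wise / column-wise attention of Eq. (4) can be approximated with standard multi-head attention applied along each axis, as in the hedged sketch below; the dynamic width-n strip partitioning of MhDSA is omitted, and the head count is an assumption.

```python
# Simplified sketch of Eq. (4): multi-head self-attention applied to every row (RSA)
# and every column (CSA) of the feature map, plus a residual connection. The dynamic
# strip partitioning of MhDSA is omitted; the head count is an assumption.
import torch
import torch.nn as nn

class AxialSelfAttention(nn.Module):
    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.row_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.col_attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, f: torch.Tensor) -> torch.Tensor:
        b, c, h, w = f.shape
        x = f.permute(0, 2, 3, 1)                            # (B, H, W, C) tokens
        xn = self.norm(x)
        rows = xn.reshape(b * h, w, c)                       # RSA over each row
        rsa, _ = self.row_attn(rows, rows, rows)
        cols = xn.permute(0, 2, 1, 3).reshape(b * w, h, c)   # CSA over each column
        csa, _ = self.col_attn(cols, cols, cols)
        rsa = rsa.reshape(b, h, w, c)
        csa = csa.reshape(b, w, h, c).permute(0, 2, 1, 3)
        out = rsa + csa + x                                  # residual sum as in Eq. (4)
        return out.permute(0, 3, 1, 2)                       # back to (B, C, H, W)

print(AxialSelfAttention(dim=64)(torch.randn(2, 64, 16, 16)).shape)
```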

2.3 Loss Function

Since our segmentation task involves a large number of small biomarkers, the class imbalance problem is often encountered. Therefore, our main network combines the Focal loss and the Dice loss for training. Our main network segmentation loss L_{SEG} is defined as:

L_{SEG} = L_{Dice} + L_{Focal}.   (5)
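A hedged sketch of a combined Dice and Focal loss in the spirit of Eq. (5) follows; the focal exponent, the smoothing constant and the equal weighting of the two terms are assumptions, since the paper does not state them here.

```python
# Hedged sketch of a combined Dice + Focal segmentation loss as in Eq. (5);
# gamma and the smoothing constant are illustrative assumptions.
import torch
import torch.nn.functional as F

def dice_focal_loss(logits, target, num_classes=9, gamma=2.0, eps=1e-6):
    # logits: (B, C, H, W); target: (B, H, W) with integer class labels 0..C-1
    probs = logits.softmax(dim=1)
    onehot = F.one_hot(target, num_classes).permute(0, 3, 1, 2).float()

    inter = (probs * onehot).sum(dim=(0, 2, 3))
    denom = probs.sum(dim=(0, 2, 3)) + onehot.sum(dim=(0, 2, 3))
    dice = 1.0 - ((2 * inter + eps) / (denom + eps)).mean()

    ce = F.cross_entropy(logits, target, reduction="none")
    focal = ((1.0 - torch.exp(-ce)) ** gamma * ce).mean()
    return dice + focal

loss = dice_focal_loss(torch.randn(2, 9, 64, 64), torch.randint(0, 9, (2, 64, 64)))
print(loss.item())
```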

3 Experiment

3.1 Dataset

Local biomarker dataset. The local biomarker dataset, hereafter referred to as the LB dataset, was sourced from Wuhan Aier Eye Hospital. An OCT image contains one or more types of biomarkers, and the specific OCT biomarker categories are shown in Fig. 1. There are eight different retinal biomarkers, namely ERM, PED, SRF, IRF, GA, HF, Drusen and CNV. The LB dataset contains 3000 OCT B-scans, and we set the sample size ratio for training, validation and testing to 6:2:2. The resolution of the LB dataset images is 1024 × 992, and for all images in the LB dataset, we resize them to 512 × 512.

3.2 Comparison Methods and Evaluation Metrics

To examine the segmentation performance of OCT retinal biomarkers, we compared the proposed method with the medical segmentation models U-Net [20] and Att U-Net [21], and the multi-class biomarker segmentation model PASP-Net [22]. In addition, the Dice coefficient was employed as a measure of the segmentation algorithm's accuracy.


Fig. 6. Segmentation results for each biomarker obtained by comparison with our method.

3.3 Results

Figure 6(b) shows the segmentation results of U-Net. In the first row, U-Net misclassifies the large Drusen as PED. In the second row, it labels a part of the PED region as SRF, IRF, and CNV. In the third row, it identifies a part of the CNV region as PED, IRF, and Drusen. In the fourth row, it detects the retinal layer as PED and confuses a part of the SRF region with CNV. These results are not satisfactory, and a possible reason is that U-Net's effective receptive field is too small to segment large-scale targets effectively. Figure 6(c) shows that Att U-Net improves U-Net by introducing the attention gating mechanism, but it still suffers from mis-segmentation. In the first row, Att U-Net also misclassifies Drusen as PED. In the second row, it wrongly labels a part of the PED region as IRF. Figure 6(d) shows that PASP-Net can better process multiple classes of biomarkers with different sizes on a single image, and it further alleviates the problem of mis-segmentation. In the first row, PASP-Net reduces the region mis-segmented as PED by Drusen. However, it still has serious under-segmentation and over-segmentation problems. Compared with our proposed network, PASP-Net shows region discontinuity in the ERM segmentation in the first row, severe GA under-segmentation in the third row, and HF over-segmentation in the fourth row. Unlike the above methods, our network uses multi-scale pyramid modules and dynamically combines convolutions and Transformers, with minimal mis-segmentation and missed detection in the segmentation results, especially for small targets.

Table 1 shows the quantitative results. U-Net has an average Dice score of 72.45%. Att U-Net achieves better results than U-Net except for PED and DRUSEN segmentation, with an average Dice score increase of 2.64% (from 72.45% to 75.09%). PASP-Net,


Table 1. The Dice Values of Comparison Methods (unit: %)

          U-Net  Att U-Net  PASP-Net  Ours
ERM       77.81  80.71      82.86     82.91
PED       69.16  68.31      76.28     77.41
SRF       74.20  78.86      83.66     83.68
IRF       70.27  75.19      75.11     77.95
GA        74.81  79.70      84.79     83.07
HF        63.15  65.80      71.06     72.27
DRUSEN    74.03  72.64      78.81     79.87
CNV       76.20  79.53      85.46     84.69
Overall   72.45  75.09      79.75     80.23

which uses a multi-scale pyramid model, outperforms Att U-Net on all biomarker segmentations except IRF, with an average Dice score increase of 4.66% (from 75.09% to 79.75%). Our method obtains the best results for all biomarkers except GA and CNV. It has an average Dice score increase of 0.48% (from 79.75% to 80.23%) compared to PASP-Net. This validates the effectiveness of our proposed network.

3.4 Ablation Experiment

Ablation studies are conducted to affirm the efficacy of the proposed network, with Table 2 illustrating the impact of the varying components on the network. Baseline is the model that removes FPFM and DSTM from the proposed network, i.e., U-Net with only the encoder and decoder. With the proposed FPFM added to Baseline, all Dice scores are improved, and the average Dice score is increased by 4.10% (from 72.45% to 76.55%). With the proposed DSTM further added, the average Dice score is increased by another 3.68% (from 76.55% to 80.23%). Especially for small biomarkers (e.g., IRF, HF, and DRUSEN), significant improvements are obtained: the Dice scores of IRF, HF, and DRUSEN increase by 4.30%, 4.60%, and 2.53%, respectively.

Table 2. The Dice Values of Baselines (unit: %)

          Baseline  Baseline + FPFM  Baseline + FPFM + DSTM (Ours)
ERM       77.81     81.56            82.91
PED       69.16     72.94            77.41
SRF       74.20     78.52            83.68
IRF       70.27     73.65            77.95
GA        74.81     82.08            83.07
HF        63.15     67.67            72.27
DRUSEN    74.03     77.34            79.87
CNV       76.20     78.62            84.69
Overall   72.45     76.55            80.23


4 Conclusion

In this paper, we propose a dynamic perception Transformer-based feature pyramid fusion network to segment multi-class biomarkers in OCT images, which can overcome the challenges faced by existing methods in handling biomarkers of various scales and shapes. Our network consists of two novel modules: the Feature Pyramid Fusion Module (FPFM) and the Dynamic Scale Transformer Module (DSTM). The FPFM extracts and fuses biomarker features from different levels and scales, and uses spatial attention to emphasize salient features. The DSTM employs a context-based variable-scale Transformer to enhance the long-range dependency of the CNN for fine-grained feature extraction, thus preserving small biomarker features. We performed experiments on the LB dataset and obtained superior results, which show the remarkable effectiveness of our method in segmenting multi-class biomarkers. For future work, we plan to investigate semi-supervised and weakly-supervised methods to address the issue of data scarcity and annotation difficulty.

Acknowledgements. This work was supported in part by the National Natural Science Foundation of China under Grant 62176190.

References 1. Daien, V., et al.: Evolution of treatment paradigms in neovascular age-related macular degeneration: a review of real-world evidence. Br. J. Ophthalmol. 105, 1475–1479 (2021) 2. Branisteanu, D.C., et al.: Influence of unilateral intravitreal bevacizumab injection on the incidence of symptomatic choroidal neovascularization in the fellow eye in patients with neovascular age-related macular degeneration. Exp. Ther. Med. 20, 1 (2020) 3. Guymer, R., Wu, Z.: Age-related macular degeneration (AMD): more than meets the eye. The role of multimodal imaging in today’s management of AMD—a review. Clin. Exp. Ophthalmol. 48, 983–995 (2020) 4. Loewenstein, A.: The significance of early detection of age-related macular degeneration: Richard & Hinda Rosenthal Foundation lecture, the Macula Society 29th annual meeting. Retina 27, 873–878 (2007) 5. Liberski, S., Wichrowska, M., Koci˛ecki, J.: Aflibercept versus faricimab in the treatment of neovascular age-related macular degeneration and diabetic macular edema: a review. Int. J. Mol. Sci. 23, 9424 (2022) 6. Kanagasingam, Y., Bhuiyan, A., Abràmoff, M.D., Smith, R.T., Goldschmidt, L., Wong, T.Y.: Progress on retinal image analysis for age related macular degeneration. Prog. Retin. Eye Res. 38, 20–42 (2014) 7. Weiss, M., et al.: Compliance and adherence of patients with diabetic macular edema to intravitreal anti–vascular endothelial growth factor therapy in daily practice. Retina 38, 2293– 2300 (2018) 8. Schmidt-Erfurth, U., Hasan, T.: Mechanisms of action of photodynamic therapy with verteporfin for the treatment of age-related macular degeneration. Surv. Ophthalmol. 45, 195–214 (2000) 9. Leitgeb, R., et al.: Enhanced medical diagnosis for dOCTors: a perspective of optical coherence tomography. J. Biomed. Opt. 26, 100601 (2021)


10. Bussel, I.I., Wollstein, G., Schuman, J.S.: OCT for glaucoma diagnosis, screening and detection of glaucoma progression. Br. J. Ophthalmol. 98, ii15–ii19 (2014) 11. Wilkins, G.R., Houghton, O.M., Oldenburg, A.L.: Automated segmentation of intraretinal cystoid fluid in optical coherence tomography. IEEE Trans. Biomed. Eng. 59, 1109–1114 (2012) 12. Wang, T., et al.: Label propagation and higher-order constraint-based segmentation of fluidassociated regions in retinal SD-OCT images. Inf. Sci. 358, 92–111 (2016) 13. Montuoro, A., Waldstein, S.M., Gerendas, B.S., Schmidt-Erfurth, U., Bogunovi´c, H.: Joint retinal layer and fluid segmentation in OCT scans of eyes with severe macular edema using unsupervised representation and auto-context. Biomed. Opt. Express 8, 1874–1888 (2017) 14. Liu, X., Wang, S., Zhang, Y., Liu, D., Hu, W.: Automatic fluid segmentation in retinal optical coherence tomography images using attention based deep learning. Neurocomputing 452, 576–591 (2021) 15. Liu, X., Cao, J., Wang, S., Zhang, Y., Wang, M.: Confidence-guided topology-preserving layer segmentation for optical coherence tomography images with focus-column module. IEEE Trans. Instrum. Meas. 70, 1–12 (2020) 16. Liu, X., Zhou, K., Yao, J., Wang, M., Zhang, Y.: Contrastive uncertainty based biomarkers detection in retinal optical coherence tomography images. Phys. Med. Biol. 67, 245012 (2022) 17. Xie, S., Okuwobi, I.P., Li, M., Zhang, Y., Yuan, S., Chen, Q.: Fast and automated hyperreflective foci segmentation based on image enhancement and improved 3D U-Net in SD-OCT volumes with diabetic retinopathy. Transl. Vis. Sci. Technol. 9, 21 (2020) 18. Cazañas-Gordón, A., da Silva Cruz, L.A.: Multiscale attention gated network (MAGNet) for retinal layer and macular cystoid edema segmentation. IEEE Access 10, 85905–85917 (2022) 19. Morelle, O., Wintergerst, M.W., Finger, R.P., Schultz, T.: Accurate drusen segmentation in optical coherence tomography via order-constrained regression of retinal layer heights. Sci. Rep. 13, 8162 (2023) 20. Ronneberger, O., Fischer, P., Brox, T.: U-Net: convolutional networks for biomedical image segmentation. In: Navab, N., Hornegger, J., Wells, W., Frangi, A. (eds.) MICCAI 2015. LNCS, vol. 9351, pp. 234–241. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-245744_28 21. Oktay, O., et al.: Attention U-Net: learning where to look for the pancreas. arXiv preprint arXiv:1804.03999 (2018) 22. Hassan, B., Qin, S., Hassan, T., Ahmed, R., Werghi, N.: Joint segmentation and quantification of chorioretinal biomarkers in optical coherence tomography scans: a deep learning approach. IEEE Trans. Instrum. Meas. 70, 1–17 (2021)

LDW-RS Loss: Label Density-Weighted Loss with Ranking Similarity Regularization for Imbalanced Deep Fetal Brain Age Regression

Yang Liu1, Siru Wang1, Wei Xia2, Aaron Fenster3, Haitao Gan1, and Ran Zhou1(B)

1 School of Computer Science, Hubei University of Technology, Wuhan, China
[email protected]
2 Wuhan Children's Hospital (Wuhan Maternal and Child Healthcare Hospital), Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
[email protected]
3 Imaging Research Laboratories, Robarts Research Institute, Western University, London, ON N6A 5B7, Canada
[email protected]

Abstract. Estimation of fetal brain age based on sulci by magnetic resonance imaging (MRI) is crucial in determining the normal development of the fetal brain. Deep learning provides a possible way for fetal brain age estimation using MRI. However, real-world MRI datasets often present imbalanced label distribution, resulting in the model tending to show undesirable bias towards the majority of labels. Thus, many methods have been designed for imbalanced regression. Nevertheless, most of them on handling imbalanced data focus on targets with discrete categorical indices, without considering the continuous and ordered nature of target values. To fill the research gap of fetal brain age estimation with imbalanced data, we propose a novel label density-weighted loss with a ranking similarity regularization (LDW-RS) for deep imbalanced regression of the fetal brain age. Label density-weighted loss is designed to capture information about the similarity between neighboring samples in the label space. Ranking similarity regularization is developed to establish a global constraint for calibrating the biased feature representations learned by the network. A total of 1327 MRI images from 157 healthy fetuses between 22 and 34 weeks were used in our experiments for the fetal brain age estimation regression task. In the random experiments, our LDW-RS achieved promising results with an average mean absolute error of 0.760 ± 0.066 weeks and an R-square (R2 ) coefficient of 0.914 ± 0.020. Our fetal brain age estimation algorithm might be useful for identifying abnormalities in brain development and reducing the risk of adverse development in clinical practice. Keywords: Fetal Brain Age Estimation · Deep Imbalanced Regression · Label Density-Weighted Loss · Ranking Similarity Regularization

1 Introduction

Fetal brain age estimation based on sulci development has been widely utilized to characterize the normal development process and variation trend of the fetal brain [7].


Age-related changes in the brain have a significant relationship with the pathogenesis of neurological disorders and the prediction of brain age could be a crucial biomarker for assessing brain health [30, 32]. The difference between the predicted fetal brain age and the gestational age serves as an indicator of deviation from the typical developmental path, which could potentially suggest the presence of neurodevelopmental disorders, such as Ventriculomegaly, Callosal hypoplasia and Dandy-Walker [4, 7]. Fetal brain imaging is an essential cornerstone of prenatal screening and early diagnosis of congenital anomalies [6]. Magnetic resonance imaging (MRI) has unrivaled efficacy in examining the developing fetus in utero, which can clearly visualize the shape, depth, and appearance time of the sulci across different gestational ages [1, 2]. Therefore, an accurate estimation of gestational age from MRI is vital for providing appropriate care throughout pregnancy and identifying complications [3, 5]. Manual identification sulci to estimate fetal brain age is time-consuming and sensitive to the observer’s experience. To alleviate this burden, investigators have developed computer-assisted methods for fetal brain age estimation from MRI images. Traditional approaches to fetal brain age estimation involve the use of linear or non-linear regression models that are based on quantitative features such as sulcal depth and curvature [22, 31]. However, these approaches require either manual intervention or complex MRI processing [32]. Recently, several deep learning-based works have shown great potential in predicting fetal brain age. Shen et al. [6] proposed a ResNet [9] architecture with the attention mechanism method to localize the fetal brain region and obtained a mean absolute error (MAE) of 0.971 weeks. Shi et al. [7] employed an attention-based deep residual network, which achieved an overall MAE of 0.767 weeks. Shen et al. [8] designed an attention-guided, multi-view network that analyzed MRI-based features of the normally developing fetal brain to predict gestational age and achieve an MAE of 0.957 weeks. Liao et al. [10] presented label distribution learning to deal with the problem of insufficient training data. They also used a multi-branch CNN with deformable convolutional to aggregate multi-view information and obtained an MAE of 0.751 weeks. Although previous studies have achieved significant progress, all these works rely on balanced datasets for network training. However, in real-world scenarios, the collected fetal datasets are often imbalanced. A highly skewed label distribution would weaken the discriminative power of the regression model. Most previous works [27–29] handling imbalanced regression problems have primarily focused on synthesizing new samples for minority targets to resample the training set based on targets belonging to different classes. Nevertheless, treating different targets as distinct classes is unlikely to attain the best results because it does not leverage the meaning distance between targets. Moreover, regression labels natively possess attributes of continuity and orderliness, and the target values can even be infinite. As a result, many methods designed for imbalanced classification are untenable. Recently, feature distribution smoothing [12] transfers local feature statistics between adjacent continuous target intervals, relieving potentially biased estimation of the feature distribution for targets with limited samples. 
However, it overlooks the significance of global feature information among samples. The effective number of samples [13, 23], serving as a novel cost-sensitive reweighting scheme in regression, reflects the contribution of the effective samples within each interval. Nevertheless, effective samples


still have certain limitations, because regression samples exhibit a continuity and similarity that distinguish them from categorical samples.

To address the above challenges, we introduce a novel framework using a label density-weighted loss, which captures the similarity between nearby samples and quickly attends to hard continuous samples in the label space, and a ranking similarity regularization, which constrains the sorted list of neighbors in the feature space to match the sorted list of neighbors in the label space. To the best of our knowledge, this is the first work that estimates fetal brain age from a deep imbalanced regression perspective. The main contributions are as follows:

• Label density-weighted loss is designed to focus on continuous samples that are hard to distinguish in imbalanced regression; it employs a label distribution smoothing strategy to generate the label density distribution used to weight the Focal-MSE loss function.

• RankSim regularization is developed to impose a forceful global inductive bias on the model, ensuring a complete match between the ranking of neighboring samples in the feature space and the ranking of the target values. By mining the similarity and ordinal relationships among the features, it effectively calibrates the unjustified feature representations learned by the network.

• Experimental results demonstrate that our method achieves superior predictive performance for fetal brain age estimation on an imbalanced dataset, which could contribute to precisely evaluating the degree of fetal brain development and detecting developmental abnormalities in clinical practice.

2 Method

Fetal brain MRI datasets often exhibit a highly imbalanced age label distribution. Specifically, most fetal brain age labels are under-represented due to their low frequency, while a minority of age labels dominate the majority of instances. This imbalance can lead a deep learning model to learn an unjustified bias towards labels with a higher number of instances while performing poorly on labels with low frequency [12]. In this study, we present a novel framework to revisit the fetal brain age estimation task, mainly including two key techniques: label density-weighted loss (LDW) and ranking similarity regularization (RankSim). Figure 1 gives an overview of the proposed imbalanced regression framework. First, LDW captures similar information among adjacent samples with continuous targets by reweighting the Focal-MSE loss function based on the effective label density for each target. Second, RankSim encourages the ordinality and similarity relationships of samples in the label space to be consistent in the feature space. Finally, the LDW loss and RankSim regularization are combined to form the total loss function.


Fig. 1. Overview of the LDW-RS framework

2.1 Label Density-Weighted Loss

Most methods for handling imbalanced data are tailored for classification problems, where hard boundaries exist between different classes. Fetal brain age estimation, however, involves continuous or even infinite target values, and there may be missing data in certain target regions. This renders many imbalanced classification methods ineffective. Specifically, the widely used re-sampling and cost-sensitive approaches [24–26] rely on hard boundaries between classes and on the frequency of each class, both of which are eliminated by continuous age labels. How to estimate the effective frequency or distribution of soft labels is still an open question.

Discretization aims to bridge the gap between continuous age labels and hard boundaries. Specifically, let $D = \{(x_i, y_i)\}_{i=1}^{N}$ be a training set, where $x_i$ denotes the input image, $y_i$ is the continuous age label, and $\hat{y}_i$ is the predicted label. We discretize the label space $\mathcal{Y}$ into $M$ bins with equal intervals, i.e., $[y_0, y_1), [y_1, y_2), \ldots, [y_{M-1}, y_M)$, and use $m \in \mathcal{M}$ to represent the bin index of the target value, where $\mathcal{M} = \{1, \ldots, M\}$ is the index space. In the fetal brain age estimation task, the defined bins reflect the minimum brain age difference of 1, i.e., $\Delta y = y_{m+1} - y_m = 1$. Although class frequencies can be obtained through discretization and applied to resampling or reweighting, recent studies have indicated that this empirical label density is not accurate in imbalanced regression [12, 13]. Due to the interdependence among continuous data samples at nearby labels, the coarse empirical label density distribution does not correctly reflect the real density distribution.


Thus, the label distribution smoothing (LDS) strategy is introduced, which uses kernel density estimation to learn the effective imbalance over continuous targets [12]. LDS convolves a symmetric kernel with the empirical label density distribution to obtain an effective label density distribution, which accurately reflects the similarity between adjacent labels. Specifically, the fetal brain image information represented by close age labels is expected to have a certain degree of overlap: the closer the age labels, the more similar the corresponding fetal brain information. The symmetric kernel used in LDS characterizes the similarity between a target value $y'$ and any $y$; in other words, it measures the contribution of each sample within its neighborhood. Hence, LDS calculates the effective label density distribution as follows:

$$\tilde{p}(y') = \sum_{y \in \mathcal{Y}} k(y, y')\, p(y) \qquad (1)$$

where $p(y)$ indicates the number of appearances of label $y$ in the training data and $\tilde{p}(y')$ denotes the effective density of label $y'$. The symmetric kernel needs to satisfy (1) $k(y, y') = k(y', y)$ and (2) $\nabla_y k(y, y') + \nabla_{y'} k(y', y) = 0$; an example is the Gaussian kernel $k(y, y') = \exp\!\left(-\frac{\|y - y'\|^2}{2\sigma^2}\right)$. Note that the LDS strategy does not explicitly change the fetal brain MRI dataset (e.g., by creating new samples); the estimated effective label density for each target serves as a weighting factor to reweight the loss function.

Furthermore, we use the Focal-MSE loss to replace the MSE loss, primarily because the MSE may carry the label imbalance into the prediction, which leads to inferior performance on rare labels [12, 18, 19]. The main loss function, called the label density-weighted loss ($L_{LDW}$), is constructed by combining Focal-MSE and LDS. This allows it to down-weight the contribution of easy samples and prioritize challenging ones, while also accurately learning the real imbalanced distribution by leveraging the similarity among neighboring samples with continuous target values and potential missing values:

$$L_{LDW} = \frac{1}{n}\sum_{i=1}^{n} \frac{1}{\tilde{p}(y_i)} \cdot F(|\beta e_i|)^{\gamma}\,(y_i - \hat{y}_i)^2 \qquad (2)$$

where $\frac{1}{\tilde{p}(y_i)}$ is the weight given by the inverse of the estimated effective label density for each sample through LDS, $n$ is the batch size, $e_i$ denotes the L1 error for the $i$-th sample, and $\beta$ and $\gamma$ are hyperparameters. The scaling factor $F$ is a continuous function, such as a sigmoid, that maps the absolute error into the range $[0, 1]$. $L_{LDW}$ reduces the weights of easily distinguishable samples, enabling the model to rapidly focus on hard continuous samples with rare labels, without being disturbed by frequent easy samples.
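As a concrete illustration of Eqs. (1)–(2), the sketch below is an illustrative reimplementation rather than the authors' released code: it smooths the empirical per-week label counts with a Gaussian kernel and uses the inverse effective density to weight a Focal-MSE term. The bin range (22–34 weeks), kernel size, sigma, β and γ mirror the implementation details reported later in Sect. 3.1; the focal form using a rescaled sigmoid follows our reading of the Focal-MSE description.

```python
import torch
import torch.nn.functional as F

def effective_label_density(labels, y_min=22, y_max=34, kernel_size=5, sigma=2.0):
    """Effective label density via label distribution smoothing (Eq. 1)."""
    bins = torch.arange(y_min, y_max + 1, dtype=torch.float32)            # 1-week bins
    empirical = torch.tensor([float((labels.round() == b).sum()) for b in bins])
    # Symmetric Gaussian kernel k(y, y') convolved with the empirical counts
    offsets = torch.arange(kernel_size, dtype=torch.float32) - kernel_size // 2
    kernel = torch.exp(-offsets ** 2 / (2 * sigma ** 2))
    kernel = kernel / kernel.sum()
    smoothed = F.conv1d(empirical.view(1, 1, -1), kernel.view(1, 1, -1),
                        padding=kernel_size // 2).view(-1)
    return bins, smoothed

def ldw_loss(pred, target, bins, density, beta=0.2, gamma=1.0):
    """Label density-weighted Focal-MSE loss (Eq. 2)."""
    bin_idx = (target.round() - bins[0]).long().clamp(0, len(bins) - 1)
    weight = 1.0 / density[bin_idx].clamp(min=1e-6)           # inverse effective density
    err = torch.abs(pred - target)                            # L1 error e_i
    focal = (2.0 * torch.sigmoid(beta * err) - 1.0) ** gamma  # F(|beta * e_i|)^gamma in [0, 1)
    return (weight * focal * (pred - target) ** 2).mean()
```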


2.2 Ranking Similarity Regularization

In regression, the fetal brain age follows a natural ordering (e.g., from 22 weeks to 35 weeks). Intuitively, the orderliness and similarity of samples in the label space should also hold in the feature space. The ordinal pattern of age labels allows us to regularize the feature representations learned by the model. Specifically, the network can be guided in a direction where the feature representation of the fetal brain sulci at 22 weeks becomes more similar to the representation at 25 weeks than to that at 30 weeks. The more accurate the sulci feature representation learned by the network, the higher the precision of brain age estimation. The ranking similarity regularization (RankSim) is proposed to deeply mine the inherent traits of age labels by encoding a global inductive bias, ensuring that the sorted list of neighbors in the feature space matches the sorted list of neighbors in the label space [15]. RankSim is capable of capturing both the ordinal and similarity relationships between samples at both close and distant distances. In particular, it encourages fetal brain images with closer age labels to have more similar sulci features, while images with distant age labels tend to have lower feature similarity. For instance, suppose four fetal brain images correspond to ages 23, 24, 29, and 33, with brain sulci features $z_{23}$, $z_{24}$, $z_{29}$, and $z_{33}$, and let $v(\cdot, \cdot)$ denote a similarity function such as cosine similarity. Based on the distances in the label space (age), RankSim encourages $v(z_{23}, z_{24}) > v(z_{23}, z_{29}) > v(z_{23}, z_{33})$, with $z_{23}$ serving as the anchor. This operation can be repeated with other brain images as anchors.

More specifically, let $t \in \mathbb{R}^H$ be a vector of $H$ values and let $\text{Rank}(\cdot)$ denote the rank operator, so that $\text{Rank}(t)$ is the permutation of $\{1, \ldots, H\}$ containing the rank (in sorted order) of each element of $t$. The rank of the $i$-th element is one plus the number of elements exceeding its value:

$$\text{Rank}(t)_i = 1 + |\{k : t_k > t_i\}| \qquad (3)$$
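For example (a minimal illustrative sketch, not the authors' code), the rank operator of Eq. (3) can be computed by counting, for each element, how many entries exceed it; for $t = (0.2, 0.9, 0.5)$ this yields $(3, 1, 2)$:

```python
import torch

def rank(t: torch.Tensor) -> torch.Tensor:
    """Rank(t)_i = 1 + |{k : t_k > t_i}|; tied values receive the same rank."""
    return (t.unsqueeze(0) > t.unsqueeze(1)).sum(dim=1) + 1

print(rank(torch.tensor([0.2, 0.9, 0.5])))  # tensor([3, 1, 2])
```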

Let $B = \{(x_i, y_i)\}_{i=1}^{T}$ be a subset of each batch. In the label space, $M^y \in \mathbb{R}^{T \times T}$ denotes the pairwise similarity matrix computed by a similarity function $v^y$ (negative absolute distance). Therefore, the $(i, k)$-th entry of $M^y$ can be expressed as:

$$M^y_{i,k} = v^y(y_i, y_k) \qquad (4)$$

In the feature space, let $f(\cdot)$ denote the feature extractor and $z_i = f(x_i)$ the extracted features. $M^z \in \mathbb{R}^{T \times T}$ is the pairwise similarity matrix computed by a similarity function $v^z$ (cosine similarity). Hence, the $(i, k)$-th entry of $M^z$ is defined as:

$$M^z_{i,k} = v^z(z_i, z_k) = v^z(f(x_i), f(x_k)) \qquad (5)$$

According to Eq. 3, Eq. 4, and Eq. 5, the RankSim regularization loss is defined as:

$$L_{RankSim} = \sum_{i=1}^{T} \varphi\big(\text{Rank}(M^y_{[i,:]}), \text{Rank}(M^z_{[i,:]})\big) \qquad (6)$$

where the ranking similarity function $\varphi$ (cosine distance) penalizes differences between the ranking of neighbors in the label space and the ranking of neighbors in the feature space, and $[i,:]$ denotes the $i$-th row of the matrix. The advantage of building $L_{RankSim}$ on the rank operator $\text{Rank}(\cdot)$ is that it not only considers the ordinal relationship between data but is also robust to anomalous data. When the difference approaches 0, $L_{RankSim}$ converges to 0, indicating that the sorted list of neighbors in the feature space has been successfully matched with the sorted list of neighbors in the label space for a given input sample. This ensures that the orderliness and similarity of the samples in the label space remain consistent in the corresponding feature space. Note that each batch is sampled so that every age label appears at most once, which reduces ties and promotes the relative representation of infrequent age labels [15].

However, the rank operator $\text{Rank}(\cdot)$ is non-differentiable, so the piecewise-constant $L_{RankSim}$ is challenging to optimize because its gradient is zero almost everywhere. Hence, we recast $\text{Rank}(\cdot)$ as the minimizer of a linear combinatorial objective [16]:

$$\text{Rank}(t) = \arg\min_{\pi \in \Pi_s} t \cdot \pi \qquad (7)$$

where $\pi \in \Pi_s$ and $\Pi_s$ is the set of all permutations of $\{1, 2, \ldots, s\}$. We then utilize blackbox combinatorial solvers [17] as an elegant approach to derive meaningful gradient information from the piecewise-constant loss. This is achieved by constructing a family of continuous functions parametrized by an interpolation hyperparameter $\lambda > 0$ that trades off the informativeness of the gradient against fidelity to the original function. During backpropagation, the returned continuous interpolated gradient is calculated as follows:

$$\frac{\partial L}{\partial t} = -\frac{1}{\lambda}\big(\text{Rank}(t) - \text{Rank}(t_\lambda)\big) \qquad (8)$$

$$t_\lambda = t + \lambda \frac{\partial L}{\partial \text{Rank}} \qquad (9)$$

where $\frac{\partial L}{\partial \text{Rank}}$ denotes the incoming gradient information and $\lambda$ represents the interpolation strength. Now $L_{RankSim}$ is available and can be directly added to the total loss function to calibrate the detrimental, biased feature representations learned by the regression model on imbalanced data. The total loss function is then defined as:

$$L_{total} = L_{LDW} + \eta L_{RankSim} \qquad (10)$$

where $\eta$ is the weight balancing the total loss $L_{total}$.
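The sketch below wires these pieces together: a rank operator with the interpolated blackbox gradient of Eqs. (8)–(9), the pairwise similarity matrices of Eqs. (4)–(5), and the RankSim penalty of Eq. (6) that is added to the LDW loss as in Eq. (10). It is an illustrative reimplementation under our reading of the equations, not the authors' released code; the helper names and the interpolation-strength argument are our own.

```python
import torch
import torch.nn.functional as F

def hard_rank(t: torch.Tensor) -> torch.Tensor:
    """Rank(t)_i = 1 + |{k : t_k > t_i}| (Eq. 3)."""
    return (t.unsqueeze(-2) > t.unsqueeze(-1)).sum(dim=-1).float() + 1.0

class BlackboxRank(torch.autograd.Function):
    """Rank operator with the interpolated blackbox gradient of Eqs. (8)-(9)."""
    @staticmethod
    def forward(ctx, t, lam):
        ctx.lam = lam
        ctx.save_for_backward(t)
        return hard_rank(t)

    @staticmethod
    def backward(ctx, grad_output):
        t, = ctx.saved_tensors
        t_lam = t + ctx.lam * grad_output                          # Eq. (9)
        grad_t = -(hard_rank(t) - hard_rank(t_lam)) / ctx.lam      # Eq. (8)
        return grad_t, None

def ranksim_loss(features, labels, lam=2.0):
    """RankSim penalty (Eq. 6): match neighbor rankings in label and feature space."""
    z = F.normalize(features, dim=1)
    m_z = z @ z.t()                                                # cosine similarity, Eq. (5)
    m_y = -torch.abs(labels.view(-1, 1) - labels.view(1, -1))      # negative |y_i - y_k|, Eq. (4)
    loss = features.new_zeros(())
    for i in range(m_y.size(0)):                                   # each sample in turn is the anchor
        r_y = hard_rank(m_y[i])                                    # label-space ranking (no gradient needed)
        r_z = BlackboxRank.apply(m_z[i], lam)                      # feature-space ranking, blackbox gradient
        loss = loss + (1.0 - F.cosine_similarity(r_y, r_z, dim=0)) # cosine distance phi
    return loss

# Total objective (Eq. 10): loss = ldw_loss(...) + eta * ranksim_loss(features, ages)
```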

3 Experiments and Results

3.1 Experimental Settings

Dataset. As shown in Fig. 2, we retrospectively collected 1327 coronal T2-weighted MRI brain images from 157 singleton-pregnancy fetuses between 22 and 34 weeks of gestational age. The gestational age at the time of the MRI examination was calculated from the menstrual period, confirmed by crown-rump length at the first-trimester ultrasound, and the ultrasound biometry prior to MRI was within the normal range for gestational age. The clearest slices from each MRI stack, examined by professional radiologists, were used as training data for the model. Written informed consent was obtained from all pregnant women before the examination, and this study was approved by the ethics committee of Wuhan Children's Hospital.


Fig. 2. (a) displays the number of fetuses at each gestational age. (b) shows the number of fetal brain images at each gestational age.

Implementation Details. The fetal brain MRI dataset was randomly divided by patient into a training set (85%) and a test set (15%) without overlap. Ten random experiments were performed for evaluation. Each input image was padded and resized to 96 × 96 pixels. The model was trained for 230 epochs using the Adam optimizer with a learning rate of 0.001 and a batch size of 32. LDW was used with β = 0.2, γ = 1, a Gaussian kernel size of 5, and a kernel sigma of 2. RankSim was employed with λ = 2 and η = 100. The reweighting scheme was sqrt-inverse. All experiments were implemented in Python 3.8 with PyTorch 1.12.1 on an NVIDIA GeForce RTX 3090 GPU.

Evaluation Metrics. For the performance evaluation, we followed earlier works [6–8, 10] and employed common metrics for fetal brain age regression, namely the mean absolute error (MAE, lower is better) and the coefficient of determination (R2, higher is better). We further adopted three additional metrics to assess predictive fairness across ages: the root mean square error (RMSE, lower is better), the mean absolute percentage error (MAPE, lower is better), and the error geometric mean (GM, lower is better) [12, 18]. Continuous variables are expressed as mean ± standard deviation.
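For reference, these metrics can be computed as in the NumPy sketch below. This is our own illustrative helper, not the authors' evaluation script; in particular, GM is interpreted here as the geometric mean of the absolute errors.

```python
import numpy as np

def age_metrics(y_true, y_pred, eps=1e-8):
    """MAE, R2, RMSE, MAPE and error geometric mean for predicted gestational ages."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    err = y_pred - y_true
    abs_err = np.abs(err)
    return {
        "MAE":  abs_err.mean(),
        "R2":   1.0 - (err ** 2).sum() / ((y_true - y_true.mean()) ** 2).sum(),
        "RMSE": np.sqrt((err ** 2).mean()),
        "MAPE": 100.0 * (abs_err / y_true).mean(),
        "GM":   np.exp(np.log(abs_err + eps).mean()),
    }
```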

3.2 Comparison with Other State-of-the-Art Methods

Table 1 shows that, compared to popular regression models such as ResNet, DenseNet, SE-Net, ConvNeXt and Swin Transformer, HS-ResNet achieved the smallest prediction error, with an MAE of 0.927 weeks and an R2 of 0.874. Hence, we set HS-ResNet as the backbone network. By combining LDW and RankSim with HS-ResNet, our approach obtained superior performance, with an average MAE of 0.760 ± 0.066 weeks and an R2 of 0.914 ± 0.020, a reduction of 0.167 weeks in MAE. These results indicate that LDW and RankSim enhance the performance of HS-ResNet on an imbalanced training dataset. Additionally, compared with Yang et al. [12], Ding et al. [13] and Gong et al. [15], our imbalanced regression algorithm exhibited excellent performance across all evaluation metrics, with reductions in MAE of 0.424 weeks, 0.396 weeks and 0.102 weeks, respectively. Moreover, compared to the


fetal age estimation method proposed by Liao et al. [10], our proposed LDW-RS algorithm obtained better performance, with improvements across all metrics and in particular a decrease of 0.134 weeks in MAE. These results indicate that our approach can assist doctors in the earlier detection of disease indicators for the fetal brain, thereby effectively reducing the risk of adverse developmental outcomes. Our deep imbalanced regression algorithm required 0.55 h for training and a mean time of 23 ms to test an image.

Table 1. Performance comparison of fetal brain age estimation methods on the imbalanced dataset.

| Method | MAE (week) | R2 | RMSE (week) | MAPE (%) | GM (week) |
|---|---|---|---|---|---|
| ResNet | 0.932 ± 0.075 | 0.870 ± 0.040 | 1.229 ± 0.118 | 3.259 ± 0.214 | 0.583 ± 0.054 |
| DenseNet | 0.938 ± 0.078 | 0.871 ± 0.029 | 1.229 ± 0.117 | 3.269 ± 0.264 | 0.596 ± 0.056 |
| SE-Net | 1.172 ± 0.173 | 0.794 ± 0.067 | 1.542 ± 0.250 | 4.104 ± 0.650 | 0.776 ± 0.120 |
| ConvNeXt | 1.120 ± 0.128 | 0.824 ± 0.046 | 1.436 ± 0.169 | 3.912 ± 0.488 | 0.759 ± 0.099 |
| HS-ResNet | 0.927 ± 0.146 | 0.874 ± 0.040 | 1.209 ± 0.164 | 3.265 ± 0.553 | 0.594 ± 0.121 |
| Swin Transformer | 0.998 ± 0.113 | 0.855 ± 0.035 | 1.306 ± 0.133 | 3.483 ± 0.420 | 0.647 ± 0.097 |
| Yang et al. [12] | 1.184 ± 0.063 | 0.792 ± 0.052 | 1.555 ± 0.106 | 4.147 ± 0.181 | 0.751 ± 0.046 |
| Ding et al. [13] | 1.156 ± 0.079 | 0.797 ± 0.050 | 1.535 ± 0.122 | 4.090 ± 0.300 | 0.730 ± 0.075 |
| Liao et al. [10] | 0.894 ± 0.062 | 0.877 ± 0.024 | 1.452 ± 0.231 | 3.147 ± 0.243 | 0.555 ± 0.056 |
| Gong et al. [15] | 0.862 ± 0.072 | 0.888 ± 0.028 | 1.143 ± 0.103 | 3.024 ± 0.285 | 0.552 ± 0.053 |
| Ours ($L_{LDW}$ + RankSim) | 0.760 ± 0.066 | 0.914 ± 0.020 | 1.006 ± 0.096 | 2.684 ± 0.253 | 0.489 ± 0.050 |

Fig. 3. Correlation of the predicted brain age versus the ground truth from the output fit line.

3.3 Correlation and Agreement Analysis

Figure 3 presents the correlation between the predicted fetal brain age and the ground truth, obtained by running the same random experiment on the test data with different algorithms. Compared to the R2 of 0.872 and MAE


of 0.879 weeks obtained by Liao et al. [10], and the R2 of 0.892 and MAE of 0.874 weeks achieved by Gong et al. [15], our proposed deep imbalanced regression algorithm LDW-RS exhibited a stronger correlation (R2 = 0.915) and a lower mean absolute error (MAE = 0.770 weeks), indicating that LDW-RS enables the regression model to produce more accurate predictions. Figure 4 depicts the Bland-Altman plot with 95% limits of agreement (95% LoA) to visualize the agreement between the predicted fetal brain age and the true age for the different algorithms in the same random experiment on the test data. Compared to the mean difference of 0.36 weeks obtained by Liao et al. [10] and the mean difference of 0.20 weeks obtained by Gong et al. [15], our proposed algorithm LDW-RS achieved a smaller bias (mean difference = 0.09 weeks), close to zero, signifying a high degree of agreement between the predicted brain age and the actual age. Most scatter points fall within the 95% LoA (upper and lower limits), indicating a relatively small difference between the predictions and the ground truth. These results substantiate that the proposed imbalanced regression algorithm LDW-RS helps the regression model generate more stable and reliable predictions.

Fig. 4. Agreement between the predicted brain age and the ground truth with Bland-Altman analysis.

3.4 Ablation Study

To verify the effectiveness of the proposed imbalanced regression algorithm, we conducted the ablation experiments reported in Table 2. Compared to the HS-ResNet baseline, LDW reduced the MAE to 0.876 weeks by using the label distribution smoothing strategy to reweight Focal-MSE based on the estimated effective label density distribution. RankSim, which encourages the sorted list of neighbors in the feature space to match the sorted list of neighbors in the label space, brought a further improvement, decreasing the MAE to 0.843 weeks. By combining the LDW loss and RankSim regularization, the orderliness and similarity relationships between samples were effectively constrained in both the label space and the feature space, which led to a significant reduction in MAE to 0.760 weeks.


Table 2. Ablation study of the proposed imbalanced regression algorithm using HS-ResNet for fetal brain age estimation ('✓' with; '–' without).

| | LDW | RankSim | MAE (week) | R2 | RMSE (week) | MAPE (%) | GM (week) |
|---|---|---|---|---|---|---|---|
| (I) | – | – | 0.927 ± 0.146 | 0.874 ± 0.040 | 1.209 ± 0.164 | 3.265 ± 0.553 | 0.594 ± 0.121 |
| (II) | ✓ | – | 0.876 ± 0.052 | 0.883 ± 0.026 | 1.172 ± 0.075 | 3.080 ± 0.193 | 0.554 ± 0.053 |
| (III) | – | ✓ | 0.843 ± 0.080 | 0.893 ± 0.023 | 1.122 ± 0.121 | 2.975 ± 0.293 | 0.544 ± 0.062 |
| (IV) | ✓ | ✓ | 0.760 ± 0.066 | 0.914 ± 0.020 | 1.006 ± 0.096 | 2.684 ± 0.253 | 0.489 ± 0.050 |

4 Conclusion

In this paper, we propose a novel imbalanced regression loss (LDW-RS) that combines two key techniques: label density-weighted loss (LDW) and ranking similarity regularization (RankSim). LDW uses a label distribution smoothing strategy to reweight Focal-MSE, exploiting similar information between nearby samples and focusing on hard continuous samples. RankSim aligns the ranking of samples in the label space with the ranking of samples in the feature space, calibrating the biased feature representations learned by the model. Experiments on the fetal brain dataset demonstrate that the proposed method effectively addresses imbalanced regression and achieves a lower MAE of 0.760 weeks on the fetal brain age estimation task, and thus has the potential to be translated to prenatal diagnosis in clinical practice. A limitation of our method is that the performance evaluation was conducted solely on a single fetal brain dataset, without comparisons on other datasets; we leave this for future extension. Moreover, we plan to further study the intrinsic ordinal pattern of the fetal brain to address imbalanced regression more effectively.

Acknowledgments. This work was supported by the National Natural Science Foundation of China under grant No. 62201203, the Natural Science Foundation of Hubei Province, China under grant No. 2021CFB282, the High-level Talents Fund of Hubei University of Technology, China under grant No. GCRC2020016, and the Doctoral Scientific Research Foundation of Hubei University of Technology, China under grant No. BSDQ2020064.

References 1. Glenn, O.A., Barkovich, A.J.: Magnetic resonance imaging of the fetal brain and spine: an increasingly important tool in prenatal diagnosis, Part 1. Am. J. Neuroradiol. 27(8), 1604– 1611 (2006) 2. Nie, W., et al.: Deep learning with modified loss function to predict gestational age of the fetal brain. In: International Conference on Signal and Image Processing, pp. 572–575 (2022) 3. Prayer, D., Malinger, G., et al.: ISUOG practice guidelines (updated): performance of fetal magnetic resonance imaging. Int. Soc. Ultrasound Obstet. Gynecol. 61(2), 278–287 (2023) 4. Hu, D., Wu, Z., et al.: Hierarchical rough-to-fine model for infant age prediction based on cortical features. IEEE J. Biomed. Health Inform. 24(1), 214–225 (2020) 5. Lee, C., Willis, A., Chen, C., et al.: Development of a machine learning model for sonographic assessment of gestational age. JAMA Netw. Open 6(1), e2248685 (2023)


6. Shen, L., Shpanskaya, K. S., Lee, E., et al.: Deep learning with attention to predict gestational age of the fetal brain. arXiv preprint arXiv:1812.07102 (2018) 7. Shi, W., Yan, G., Li, Y., et al.: Fetal brain age estimation and anomaly detection using attentionbased deep ensembles with uncertainty. Neuroimage 223, 117316 (2020) 8. Shen, L., Zheng, J., Lee, E.H., et al.: Attention-guided deep learning for gestational age prediction using fetal brain MRI. Sci. Rep. 12(1), 1408 (2022) 9. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016) 10. Liao, L., Zhang, X., Zhao, F., et al.: Multi-branch deformable convolutional neural network with label distribution learning for fetal brain age prediction. In: IEEE 17th International Symposium on Biomedical Imaging, pp. 424–427 (2020) 11. Yuan, P., Lin, S., Cui, C., et al.: HS-ResNet: hierarchical-split block on convolutional neural network. arXiv preprint arXiv:2010.07621 (2020) 12. Yang, Y., Zha, K., Chen, Y., et al.: Delving into deep imbalanced regression. arXiv preprint arXiv:2102.09554 (2021) 13. Ding, Y., et al.: Deep imbalanced regression using cost-sensitive learning and deep feature transfer for bearing remaining useful life estimation. Appl. Soft Comput. 127(5) (2022) 14. Sun, B., Feng, J., Saenko, K.: Return of frustratingly easy domain adaptation. arXiv preprint arXiv:1511.05547 (2016) 15. Gong, Y., Mori, G., Tung, F.: RankSim: ranking similarity regularization for deep imbalanced regression. arXiv preprint arXiv:2205.15236 (2022) 16. Rolinek, M., Musil, V., et al.: Optimizing rank-based metrics with blackbox differentiation. In: Conference on Computer Vision and Pattern Recognition, pp. 7617–7627 (2020) 17. MarinVlastelica, P., Paulus, A., Musil, V., et al.: Differentiation of blackbox combinatorial solvers. arXiv preprint arXiv:1912.02175 (2019) 18. Ren, J., Zhang, M., Yu, C., Liu, Z.: Balanced MSE for imbalanced visual regression. In: Conference on Computer Vision and Pattern Recognition, pp. 7916–7925 (2022) 19. Lin, T., Goyal, P., Girshick, R.B., He, K., Dollár, P.: Focal loss for dense object detection. In: IEEE Transactions on Pattern Analysis and Machine Intelligence, pp. 318–327 (2017) 20. Dosovitskiy, A., Beyer, L., et al.: An image is worth 16x16 words: transformers for image recognition at scale. arXiv preprint arXiv: 2010.11929. (2020) 21. Liu, Z., Lin, Y., et al.: Swin transformer: hierarchical vision transformer using shifted windows. In: IEEE/CVF International Conference on Computer Vision, pp. 9992–10002 (2021) 22. Namburete, A.I., Stebbing, R.V., et al.: Learning-based prediction of gestational age from ultrasound images of the fetal brain. Med. Image Anal. 21, 72–86 (2015) 23. Cui, Y., Jia, M., Lin, T., Song, Y., Belongie, S.J.: Class-balanced loss based on effective number of samples. In: Computer Vision and Pattern Recognition, pp. 9260–9269 (2019) 24. Chawla, N., Bowyer, K., Hall, L.O., Kegelmeyer, W.P.: SMOTE: synthetic minority oversampling technique. arXiv preprint arXiv: 1106.1813 (2002) 25. Wang, Y., Ramanan, D., Hebert, M.: Learning to model the tail. In: Proceedings of the 31st International Conference on Neural Information Processing Systems, pp. 7032–7042 (2017) 26. Mahajan, D., et al.: Exploring the limits of weakly supervised pretraining. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11206, pp. 185–201. Springer, Cham (2018). 
https://doi.org/10.1007/978-3-030-01216-8_12 27. Chawla, N.V., Bowyer, K.W., Hall, L.O., Kegelmeyer, W.P.: SMOTE: synthetic minority over-sampling technique. J. Artif. Intell. Res. 16, 321–357 (2002) 28. Torgo, L., Ribeiro, R.P., Pfahringer, B., Branco, P.: SMOTE for regression. In: Correia, L., Reis, L.P., Cascalho, J. (eds.) EPIA 2013. LNCS, vol. 8154, pp. 378–389. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-40669-0_33


29. Branco, P., et al.: SMOGN: a pre-processing approach for imbalanced regression. In: Proceedings of Machine Learning Research, 74, pp. 36–50 (2017) 30. Tanveer, M., Ganaie, M.A., Beheshti, I., et al.: Deep learning for brain age estimation: a systematic review. arXiv preprint arXiv:2212.03868 (2022) 31. Wright, R., Kyriakopoulou, V., et al.: Automatic quantification of normal cortical folding patterns from fetal brain MRI. Neuroimage 91, 21–32 (2014) 32. Hong, J., Yun, H.J., et al.: Optimal method for fetal brain age prediction using multiplanar slices from structural magnetic resonance imaging. Front. Neurosci. 15 (2021)

Segment Anything Model for Semi-supervised Medical Image Segmentation via Selecting Reliable Pseudo-labels

Ning Li1, Lianjin Xiong1, Wei Qiu1, Yudong Pan1, Yiqian Luo1, and Yangsong Zhang1,2(B)

1 School of Computer Science and Technology, Laboratory for Brain Science and Medical Artificial Intelligence, Southwest University of Science and Technology, Mianyang 621010, Sichuan, China
[email protected]
2 Key Laboratory of Testing Technology for Manufacturing Process, Ministry of Education, Southwest University of Science and Technology, Mianyang 621010, Sichuan, China

Abstract. Semi-supervised learning (SSL) has become a hot topic due to its lower dependence on annotated data compared with fully supervised methods. This advantage is even more evident in medical imaging, where acquiring labeled data is challenging. Generating pseudo-labels for unlabeled images is one of the most classic and intuitive methods in semi-supervised segmentation. However, this method may also produce unreliable pseudo-labels that provide incorrect guidance to the model and impair its performance, and the reliability of pseudo-labels is difficult to evaluate because ground truth labels for unlabeled images are unavailable. In this paper, an SSL framework is presented in which we propose a simple but effective strategy to select reliable pseudo-labels by leveraging the Segment Anything Model (SAM). Concretely, the SSL model trained with domain knowledge provides its generated pseudo-labels as prompts to SAM. Reliable pseudo-labels usually lead SAM to produce predictions consistent with the semi-supervised segmentation model. Based on this observation, the reliable pseudo-labels are selected to further boost existing semi-supervised learning methods. The experimental results show that the proposed strategy effectively improves the performance of different algorithms in semi-supervised scenarios. On the publicly available ACDC dataset, the proposed method achieves 6.84% and 10.76% improvement over two advanced baselines, respectively, with 5% labeled data. The extended experiments on pseudo-labels verified that the quality of the reliable pseudo-labels selected by the proposed strategy is superior to that of the unreliable pseudo-labels. This study may provide a new avenue for SSL medical image segmentation.

Keywords: semi-supervised learning · medical image segmentation · Segment Anything Model · pseudo-labels

1 Introduction

Medical image segmentation is a vital step in medical image diagnosis [16,23]. It enables doctors to locate the region of interest in the image and comprehend the lesion more accurately, as well as to devise a more suitable surgical plan [29]. Various deep learning methods have been developed for this purpose and have shown impressive performance in medical image segmentation [2,8]. A large amount of pixel-level annotation is usually indispensable for deep learning methods. However, pixel-level annotations for medical images are often difficult to obtain, being time-consuming and tedious to produce. To alleviate this heavy annotation burden, semi-supervised learning is an effective technique that leverages a small amount of labeled data and a large amount of unlabeled data to achieve results comparable to fully supervised training [9].

Semi-supervised learning is typically divided into two categories: pseudo-label based methods [3,11] and consistency regularization [13,22]. In the field of natural images, Lee et al. used the category with the highest prediction confidence as the pseudo-label of unlabeled data in the semi-supervised classification task, thereby training the model in a supervised manner [11]. Rather than directly using pseudo-labels to constrain unlabeled images, Tarvainen et al. proposed a mean teacher framework consisting of two branches, a teacher model and a student model, which enforces consistency constraints between the outputs of the two models for unlabeled images [22]. For unlabeled images, the teacher model outputs more reliable predictions, while the student model takes the teacher model as the learning target. Subsequently, Sohn et al. combined these two approaches and designed FixMatch, a hybrid framework that applies data augmentations of varying intensity to unlabeled images [21]. FixMatch imposes consistency constraints on the predictions of unlabeled data under weak and strong data augmentation, and uses the predictions under weak augmentation as pseudo-labels for the predictions under strong augmentation. To prevent incorrect pseudo-labels from propagating through the network, Yang et al. proposed ST++, which saves different checkpoints during the first training stage and selects stable pseudo-labels based on the prediction consistency across checkpoints to retrain a more accurate model [27]. Despite employing different strategies to leverage unlabeled data, these methods share the same core idea of providing reliable constraints and learning objectives for unlabeled data.

In the field of medical imaging, these classical methods have been studied and further extended [4,6,14,26]. For instance, Yu et al. developed an uncertainty-aware teacher model based on the mean teacher to filter out unreliable pixels from the teacher model [28]. Han et al. improved the quality of pseudo-labels by using the features of annotated data and a random patch based on prior locations (RPPL) [5]. Luo et al. enforced the consistency of prediction results at different scales and guided unlabeled data to learn from reliable consensus regions with an uncertainty correction module [15]. Similarly, these methods demonstrate that obtaining reliable and accurate learning targets for unlabeled medical images is crucial.


Recently, large language models such as GPT-4 [18] have demonstrated remarkable capabilities. In the field of image segmentation, the large foundation model Segment Anything Model (SAM) [10] has attracted great attention with its powerful generalization ability. However, SAM, trained on natural images, struggles to achieve satisfactory performance in medical image segmentation, due to the low contrast between target and background in medical images and the large gap between medical and natural images [7,20]. Although fine-tuning SAM on medical image datasets can achieve better performance [17,24], the required amount of labeled data and the high computational cost still need to be considered.

Inspired by ST++ [27], we select reliable pseudo-labels for unlabeled data at the image level in a way that does not require retraining SAM. Unlike Yu et al. [28] and Sohn et al. [21], who set a threshold to select pixels with higher confidence as pseudo-labels, image-level pseudo-labels provide better context for model training [27]. Moreover, we adopt a different approach from ST++ for selecting reliable pseudo-labels. Rather than relying on the consistency of predictions across different stages of a single model, our framework leverages the consistency between the predictions of two models: the foundation model SAM and the final-stage semi-supervised segmentation model. On the one hand, compared with early-stage models, the final-stage model often has better feature recognition ability and higher accuracy. On the other hand, although SAM cannot directly provide the final segmentation result, it always tries to find the most suitable segmentation target for the input prompt. Hence, exploiting this characteristic of SAM, we use the final model's output as SAM's input to assess the reliability of the pseudo-labels. We observe that SAM's predictions are consistent with reliable pseudo-labels when those pseudo-labels are used as prompts; when the pseudo-labels are unreliable, SAM fails to comprehend the relevant semantic information and produces inconsistent predictions.

Overall, accurate learning objectives are crucial for unlabeled images. For pseudo-label based methods, unreliable pseudo-labels may cause the model to iterate in the wrong direction in the initial stage, thus hurting its final performance [5]. To address this problem, we introduce an intuitive and effective framework to obtain reliable sets from the pseudo-labels of unlabeled images. To further improve the performance of existing semi-supervised methods, we retrain them with the reliable sets to obtain a more accurate model.

2 Method

In this paper, the widely followed large segmentation model SAM is introduced into our proposed semi-supervised segmentation framework. This section describes in detail how SAM is employed to generate predictions for unlabeled images and how reliable pseudo-labels are determined. The proposed method is evaluated on two models, the classic mean teacher model [22] and the advanced SS-Net [25]. To obtain a stronger baseline, strong data augmentation is injected into the mean teacher

Selecting Reliable Pseudo-labels with the Assistance of SAM

141

model. We take the mean teacher with strong data augmentation (termed as SDA-MT) as an example to introduce the complete training process. The overall training process is shown in Fig. 1.

Fig. 1. Visualization of the framework of the proposed method. Final Model is the model saved after the semi-supervised learning method completes the entire training process. The main objective of Stage I is to use the final model to generate pseudo-labels and use SAM to select reliable pseudo-labels. The main objective of Stage II is to train a better-performing model with higher-quality pseudo-labels.

2.1 The Selection of Reliable Pseudo-Labels (Assisted by SAM)

SAM, pre-trained on large-scale data, has powerful image encoding and decoding abilities and always tries to find a reasonable segmentation object for any input prompt. However, SAM lacks professional knowledge in the medical imaging domain, and it is difficult to achieve satisfactory performance by relying directly on its predictions as segmentation results. In contrast, semi-supervised learning models trained on specific datasets can learn the key features of pathological images, but their predictions still contain unreliable results. Therefore, the proposed method uses the predictions of the semi-supervised learning model as prompts for SAM. When these predictions correspond to plausible lesions, tissues, or organ regions, SAM gives consistent results. The judgment process of SAM can be clearly observed in Fig. 2. We evaluate the similarity of the predictions of the two models using the Dice index and set a threshold α to determine whether their predictions are consistent. This process can be formulated as:

142

N. Li et al.

Fig. 2. The left and right halves show the cases where the predictions generated by SAM and the semi-supervised model are consistent and inconsistent, respectively. In the consistent case, the pseudo-labels are closer to the ground truth labels.

$$y = \begin{cases} 1, & \text{Dice}(SAM(x), SSL(x)) > \alpha \\ 0, & \text{Dice}(SAM(x), SSL(x)) \le \alpha \end{cases} \qquad (1)$$

where $x$ and $y$ are the unlabeled image and the consistency result, respectively, and $SSL$ represents the semi-supervised learning model. The Dice index of the two predictions is defined as:

$$\text{Dice} = \frac{2 \ast (SAM(x) \cap SSL(x))}{SAM(x) \cup SSL(x)} \qquad (2)$$

SAM's prompt encoder can accept flexible prompts, such as text, points, boxes, and masks. In this experiment, we ultimately adopt points combined with masks as the prompt. Specifically, the segmentation region of the SSL model acts as the mask; in the segmentation probability map, the highest-confidence points within the foreground region serve as positive prompt points, and the highest-confidence points within the background region serve as negative prompt points. To prevent isolated points in the segmentation result from causing SAM to produce erroneous boundary predictions, the maximum connected component is used to eliminate isolated points in the segmentation map.
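A minimal sketch of this selection step is given below; it is our own illustration rather than the authors' code. The `sam_predict` wrapper (assumed to call SAM's predictor with point and mask prompts), the `prob_map` produced by the SSL model, the single positive/negative point per image, and the use of the standard Dice definition are all assumptions made for the example.

```python
import numpy as np
from scipy import ndimage

def dice(a, b, eps=1e-6):
    """Standard Dice overlap between two binary masks."""
    return 2.0 * np.logical_and(a, b).sum() / (a.sum() + b.sum() + eps)

def select_reliable(image, prob_map, sam_predict, alpha=0.9):
    """Keep a pseudo-label only if SAM, prompted with it, agrees (Dice > alpha, Eq. 1)."""
    pseudo = prob_map > 0.5
    # Keep the largest connected component so isolated points do not become prompts
    labeled, n = ndimage.label(pseudo)
    if n > 1:
        sizes = ndimage.sum(pseudo, labeled, range(1, n + 1))
        pseudo = labeled == (np.argmax(sizes) + 1)
    if pseudo.sum() == 0:
        return pseudo, False
    # Positive point: most confident foreground pixel; negative point: most confident background pixel
    fg = np.unravel_index(np.argmax(np.where(pseudo, prob_map, -np.inf)), prob_map.shape)
    bg = np.unravel_index(np.argmax(np.where(~pseudo, 1.0 - prob_map, -np.inf)), prob_map.shape)
    sam_mask = sam_predict(image,
                           points=np.array([fg[::-1], bg[::-1]]),   # (x, y) coordinates
                           point_labels=np.array([1, 0]),           # 1 = foreground, 0 = background
                           mask=pseudo)
    return pseudo, dice(sam_mask, pseudo) > alpha
```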

2.2 Re-training the Model with Reliable Pseudo-Labels (Taking SDA-MT as an Example)

Denote the labeled and unlabeled datasets as $S_l = (X_l, Y_l)$ and $S_u = (X_u)$, where $X_l$ and $Y_l$ are the labeled images and their corresponding labels, and $X_u$ are the unlabeled images. The whole dataset is $S_l \cup S_u$. Strong data augmentation has been shown to help semi-supervised models learn additional information and to prevent overfitting to wrong labels [12,21]. We therefore introduce strong data augmentation into the classic mean teacher model to obtain a strong baseline. Specifically, weak data augmentation, such as random flipping and random cropping, is applied to all data; for strong data augmentation, cutout is further applied to labeled data, and RandAdjustContrast and RandGaussianNoise are applied to unlabeled data.

The mean teacher method consists of two models with 2D U-Net as the backbone, namely the teacher model ($M_t$) and the student model ($M_s$). During training, only the student model participates in gradient backpropagation, while the teacher model updates its parameters as an exponential moving average of the student model parameters. The overall training loss of the first stage of the mean teacher can be described as:

$$L = L_{sup} + \lambda \cdot L_{con} \qquad (3)$$

where $\lambda$ is a function that balances the supervised loss and the consistency loss, defined as:

$$\lambda(t) = w_{max} \cdot e^{-5\left(1 - \frac{t}{t_{max}}\right)^2} \qquad (4)$$

where $w_{max}$ represents the final regularization weight. $L_{sup}$ is the supervised loss, defined as:

$$L_{sup} = 0.5 \cdot L_{ce}(M_s(X_l), Y_l) + 0.5 \cdot L_{dice}(M_s(X_l), Y_l) \qquad (5)$$

where $L_{ce}$ and $L_{dice}$ are the cross-entropy and Dice loss functions, respectively. $L_{con}$ enforces consistency between the teacher and student models:

$$L_{con} = L_{mse}(M_s(X_u), M_t(X_u)) \qquad (6)$$

By evaluating the pseudo-labels generated in the first stage with SAM, the unlabeled images can be divided into a reliable set ($X_u^r$) and an unreliable set ($X_u^{un}$). In the second stage, the model is retrained with the reliable pseudo-labels, and $L_{sup}$ and $L_{con}$ are reformulated as:

$$L_{sup} = 0.5 \cdot L_{ce}(M_s(X_l + X_u^r), Y_l + Y_u^r) + 0.5 \cdot L_{dice}(M_s(X_l + X_u^r), Y_l + Y_u^r) \qquad (7)$$

$$L_{con} = L_{mse}(M_s(X_u^{un}), M_t(X_u^{un})) \qquad (8)$$

where $Y_u^r$ and $X_u^r$ represent the reliable pseudo-labels and the corresponding unlabeled images, and $X_u^{un}$ represents the images corresponding to the unreliable pseudo-labels.
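As an illustration of the first-stage objective, the sketch below is a simplified reading of Eqs. (3)–(6), not the authors' training code: the models are placeholder 2D U-Nets, the optimizer step is omitted, and the class count is an assumption.

```python
import math
import torch
import torch.nn.functional as F

def consistency_weight(step, max_step, w_max=0.1):
    """lambda(t) = w_max * exp(-5 * (1 - t / t_max)^2), Eq. (4)."""
    return w_max * math.exp(-5.0 * (1.0 - step / max_step) ** 2)

def soft_dice_loss(prob, onehot, eps=1e-6):
    inter = (prob * onehot).sum()
    return 1.0 - (2.0 * inter + eps) / (prob.sum() + onehot.sum() + eps)

def mean_teacher_loss(student, teacher, x_l, y_l, x_u, step, max_step, num_classes=4):
    # num_classes: 3 foreground regions + background for ACDC (assumed)
    logits_l = student(x_l)
    onehot = F.one_hot(y_l, num_classes).permute(0, 3, 1, 2).float()
    # Supervised term, Eq. (5): equally weighted cross-entropy and Dice
    l_sup = 0.5 * F.cross_entropy(logits_l, y_l) + \
            0.5 * soft_dice_loss(torch.softmax(logits_l, dim=1), onehot)
    # Consistency term, Eq. (6): student vs. EMA teacher on unlabeled images
    with torch.no_grad():
        teacher_prob = torch.softmax(teacher(x_u), dim=1)
    student_prob = torch.softmax(student(x_u), dim=1)
    l_con = F.mse_loss(student_prob, teacher_prob)
    return l_sup + consistency_weight(step, max_step) * l_con        # Eq. (3)

@torch.no_grad()
def ema_update(teacher, student, decay=0.99):
    """Teacher parameters track an exponential moving average of the student."""
    for p_t, p_s in zip(teacher.parameters(), student.parameters()):
        p_t.mul_(decay).add_(p_s, alpha=1.0 - decay)
```

In the second stage, the reliable images and their pseudo-labels would simply be appended to the labeled batch in the supervised term (Eq. 7), while the consistency term is computed only on the unreliable images (Eq. 8).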

3 Experiment and Results

3.1 Dataset

We evaluated our work with other methods on the publicly available ACDC dataset. The ACDC dataset contains short-axis cardiac MRI scans of 100


patients. Following SS-Net [25], we adopted the same data partitioning and preprocessing methods. Specifically, scans of 70 patients are used for the training set, scans of 10 patients are used for the validation set, and scans of the remaining 20 patients are used for the testing set. To alleviate the large inter-slice spacing in 3D data, we adopted the same method as previous work [1] and use 2D slices as the input to the network. In the inference stage, each predicted 2D slice is restacked into a 3D volume. All methods are compared under three ratios of labeled data, i.e., 5%, 10% and 20%.

Table 1. Quantitative evaluation of all segmentation methods on the ACDC dataset.

| Method | Labeled | Unlabeled | Dice (%)↑ | 95HD (mm)↓ | ASD (mm)↓ |
|---|---|---|---|---|---|
| U-Net [19] | 3 (5%) | 0 | 47.83 | 31.16 | 12.62 |
| U-Net [19] | 7 (10%) | 0 | 79.41 | 9.35 | 2.70 |
| U-Net [19] | 14 (20%) | 0 | 85.35 | 7.75 | 2.22 |
| U-Net [19] | 70 (All) | 0 | 91.44 | 4.30 | 0.99 |
| MT [22] | 3 (5%) | 67 (95%) | 55.19 | 19.92 | 6.32 |
| UAMT [28] | 3 (5%) | 67 (95%) | 46.04 | 20.08 | 7.75 |
| URPC [15] | 3 (5%) | 67 (95%) | 55.87 | 13.60 | 3.74 |
| SS-Net [25] | 3 (5%) | 67 (95%) | 65.82 | 6.67 | 2.28 |
| SDA-MT | 3 (5%) | 67 (95%) | 76.85 | 9.70 | 2.56 |
| SS-Net++ | 3 (5%) | 67 (95%) | 76.59 | 6.07 | 1.68 |
| SDA-MT++ | 3 (5%) | 67 (95%) | 83.69 | 4.25 | 1.17 |
| MT [22] | 7 (10%) | 63 (90%) | 81.70 | 11.40 | 3.05 |
| UAMT [28] | 7 (10%) | 63 (90%) | 81.65 | 6.88 | 2.02 |
| URPC [15] | 7 (10%) | 63 (90%) | 83.10 | 12.81 | 4.01 |
| SS-Net [25] | 7 (10%) | 63 (90%) | 86.78 | 6.07 | 1.40 |
| SDA-MT | 7 (10%) | 63 (90%) | 86.46 | 7.38 | 1.53 |
| SS-Net++ | 7 (10%) | 63 (90%) | 87.93 | 3.51 | 1.03 |
| SDA-MT++ | 7 (10%) | 63 (90%) | 87.63 | 2.47 | 0.79 |
| MT [22] | 14 (20%) | 56 (80%) | 85.56 | 4.10 | 1.32 |
| UAMT [28] | 14 (20%) | 56 (80%) | 87.80 | 5.86 | 1.44 |
| URPC [15] | 14 (20%) | 56 (80%) | 88.09 | 4.05 | 1.19 |
| SS-Net [25] | 14 (20%) | 56 (80%) | 87.73 | 4.36 | 1.24 |
| SDA-MT | 14 (20%) | 56 (80%) | 88.29 | 3.75 | 1.18 |
| SS-Net++ | 14 (20%) | 56 (80%) | 89.08 | 2.32 | 0.57 |
| SDA-MT++ | 14 (20%) | 56 (80%) | 88.81 | 3.06 | 0.91 |

3.2 Implementation Details and Evaluation Metrics

In this work, all tasks are trained with a batch size of 4 for 30,000 iterations, with each batch consisting of half labeled and half unlabeled data. The threshold α controlling the similarity between SAM and the semi-supervised model predictions is set to 0.9. The learning rate starts at 0.01 and is adjusted by a poly learning rate strategy, formulated as $lr \times \left(1.0 - \frac{t}{t_{max}}\right)^{0.9}$, where $t_{max}$ and $t$ represent the total and current number of iterations, respectively. The Stochastic Gradient Descent optimizer with a momentum of 0.9 and a weight decay of 0.0001 is used for all methods. In the training stage, all slices are scaled to 256 × 256 as network input; in the inference stage, all slices are restored to their original size for evaluation. All tasks are implemented using PyTorch 1.11, and training is performed on an NVIDIA 3090Ti GPU with 24 GB of memory. As the three segmentation regions of the ACDC dataset are closely connected, we combine them into a single region to serve as the input of SAM; the final semi-supervised learning model still outputs three segmentation regions. We use three metrics commonly used in segmentation tasks to evaluate all methods: Dice score (Dice), 95% Hausdorff Distance (95HD) and Average Surface Distance (ASD). Lower 95HD and ASD mean better segmentation performance, while a larger Dice indicates better segmentation results.
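For clarity, the poly schedule can be written as a small helper (a sketch of the stated formula, not the released training script):

```python
def poly_lr(base_lr, step, max_step, power=0.9):
    """Poly learning-rate decay: base_lr * (1 - step / max_step) ** power."""
    return base_lr * (1.0 - step / max_step) ** power

# e.g. poly_lr(0.01, 15000, 30000) ≈ 0.00536
```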

Fig. 3. Visualization of the segmentation results of all methods using 5% of labeled data. SS-Net++ and SDA-MT++ represent the results of the proposed method after improving the baseline. Image and GT represent the images and the corresponding ground truth in the test set, respectively.

3.3 Results

Comparison with Baselines and Existing Works: To verify the effectiveness of our proposed strategy, we used the classic mean teacher model (MT) with strong data augmentation (SDA-MT) and the advanced model SS-Net as baselines, and investigated the performance of the improved variants, i.e., SS-Net++ and SDA-MT++. Table 1 shows that the two proposed methods achieve consistent improvements over the baselines on all three evaluation metrics. SDA-MT++ achieves 6.84%, 1.17%, and 0.52% improvement on the Dice metric with 5%, 10%, and 20% labeled data, respectively, and SS-Net++ achieves 10.76%, 1.34%, and 1.15% improvement, respectively. With increasing labeled training data, the pseudo-labels become more accurate, and the improvement brought by SAM may become more limited. Comparing SDA-MT and MT shows that strong data augmentation alone brings a significant improvement to the mean teacher; nevertheless, our strategy still further enhances its segmentation performance, from 76.85% for SDA-MT to 83.69% for SDA-MT++. Representative segmentation results from all methods are visualized in Fig. 3.

Table 2. Comparison of training with reliable and unreliable pseudo-labels in a supervised manner. The pseudo-labels in Stage II are obtained in Stage I using the MT model.

| (Stage I) labeled | (Stage II) pseudo-label | Dice (%)↑ | 95HD (mm)↓ | ASD (mm)↓ |
|---|---|---|---|---|
| 5% | 13% (reliable) | 61.08 | 8.91 | 2.36 |
| 5% | 82% (unreliable) | 59.08 | 18.60 | 6.03 |
| 10% | 37% (reliable) | 84.30 | 4.50 | 1.50 |
| 10% | 53% (unreliable) | 82.39 | 6.68 | 2.00 |
| 20% | 43% (reliable) | 87.24 | 4.95 | 1.49 |
| 20% | 37% (unreliable) | 86.08 | 6.01 | 1.79 |

Ablation Study: Although the comparison between the baselines and our proposed strategy in Table 1 shows improved performance, we further analyzed the reliability of the pseudo-labels. To verify that the selected pseudo-labels are indeed more reliable, we designed the following experiment. In Stage I, the pseudo-labels generated by the MT model are divided into reliable and unreliable sets by the proposed method. In Stage II, we use the reliable pseudo-labels and the unreliable pseudo-labels separately in a supervised manner; to verify their quality directly, no additional techniques (such as strong data augmentation) or real labeled data are added in Stage II. Table 2 shows that, with 5%, 10%, and 20% of the labeled data, the reliable pseudo-labels achieved better results than the unreliable pseudo-labels on all three evaluation metrics. It is worth noting that, with 5% labeled data, the reliable pseudo-labels achieved a higher Dice with a smaller quantity (13%) than the unreliable pseudo-labels with a larger quantity (82%). This implies that, compared to the inaccurate information provided by a large number of unreliable pseudo-labels, the accurate information provided by a small number of reliable pseudo-labels is more conducive to boosting the model's performance.

To further investigate the impact of different thresholds (α) on the pseudo-labels, we compare the performance of SDA-MT++ under various α with 5% labeled data in Table 3. As shown, when α exceeds 0.99, no pseudo-labels are selected for training. When α is less than 0.99, the inclusion of pseudo-labels improves the model's performance. As α decreases below 0.9, the model's performance starts to decline: a lower consistency threshold α means more pseudo-labels are included in training, but also that more unreliable pseudo-labels might be introduced, potentially harming the model's performance. In our experiments, we set α to 0.9.

Table 3. The influence of different prediction consistency judgment criteria (α).

| α | Dice (%)↑ | 95HD (mm)↓ | ASD (mm)↓ |
|---|---|---|---|
| 1.00 | 76.85 | 9.70 | 2.56 |
| 0.99 | 76.85 | 9.70 | 2.56 |
| 0.98 | 81.63 | 6.61 | 2.12 |
| 0.90 | 83.69 | 4.25 | 1.17 |
| 0.80 | 83.33 | 4.33 | 1.23 |
| 0.70 | 81.32 | 4.72 | 1.28 |

4 Conclusion

In this paper, we proposed an effective framework to improve existing semi-supervised learning methods. We selected reliable pseudo-labels at the image level as the supervision signal for unlabeled images to guide the semi-supervised model. SAM, as a large model in the segmentation field, was introduced for the first time to evaluate the reliability of pseudo-labels. The proposed framework effectively enhances the performance of semi-supervised learning methods on cardiac MR images. Furthermore, it can be readily extended to various 2D segmentation tasks and different types of medical images, including CT scans. Since SAM's structure is designed for 2D images, the proposed strategy cannot be directly applied to 3D medical image data. In future work, we will continue to investigate the application of the proposed framework on 3D


medical images, contributing more efforts in the semi-supervised learning field to reduce the cost of medical image annotation. Acknowledgement. This work was supported in part by the National Natural Science Foundation of China under Grant No. 62076209.

References 1. Bai, W., et al.: Semi-supervised learning for network-based cardiac MR image segmentation. In: Descoteaux, M., Maier-Hein, L., Franz, A., Jannin, P., Collins, D.L., Duchesne, S. (eds.) MICCAI 2017, Part II. LNCS, vol. 10434, pp. 253–260. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-66185-8_29 2. Cao, H., et al.: Swin-Unet: Unet-like pure transformer for medical image segmentation. In: Karlinsky, L., Michaeli, T., Nishino, K. (eds.) Computer Vision, ECCV 2022 Workshops, Proceedings, Part III, Tel Aviv, Israel, 23–27 October 2022, vol. 13803, pp. 205–218. Springer, Cham (2023). https://doi.org/10.1007/978-3-03125066-8_9 3. Chen, X., Yuan, Y., Zeng, G., Wang, J.: Semi-supervised semantic segmentation with cross pseudo supervision. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 2613–2622 (2021) 4. Cui, W., et al.: Semi-supervised brain lesion segmentation with an adapted mean teacher model. In: Chung, A.C.S., Gee, J.C., Yushkevich, P.A., Bao, S. (eds.) IPMI 2019. LNCS, vol. 11492, pp. 554–565. Springer, Cham (2019). https://doi.org/10. 1007/978-3-030-20351-1_43 5. Han, K., et al.: An effective semi-supervised approach for liver CT image segmentation. IEEE J. Biomed. Health Inform. 26(8), 3999–4007 (2022) 6. Hang, W., et al.: Local and global structure-aware entropy regularized mean teacher model for 3D left atrium segmentation. In: Martel, A.L., et al. (eds.) MICCAI 2020. LNCS, vol. 12261, pp. 562–571. Springer, Cham (2020). https://doi. org/10.1007/978-3-030-59710-8_55 7. Hu, C., Li, X.: When SAM meets medical images: an investigation of segment anything model (SAM) on multi-phase liver tumor segmentation. arXiv preprint arXiv:2304.08506 (2023) 8. Isensee, F., Jaeger, P.F., Kohl, S.A., Petersen, J., Maier-Hein, K.H.: nnU-Net: a self-configuring method for deep learning-based biomedical image segmentation. Nat. Meth. 18(2), 203–211 (2021) 9. Jiao, R., Zhang, Y., Ding, L., Cai, R., Zhang, J.: Learning with limited annotations: a survey on deep semi-supervised learning for medical image segmentation. arXiv preprint arXiv:2207.14191 (2022) 10. Kirillov, A., et al.: Segment anything. arXiv preprint arXiv:2304.02643 (2023) 11. Lee, D.H., et al.: Pseudo-label: the simple and efficient semi-supervised learning method for deep neural networks. In: Workshop on Challenges in Representation Learning, ICML, vol. 3, p. 896 (2013) 12. Liu, Y., Tian, Y., Chen, Y., Liu, F., Belagiannis, V., Carneiro, G.: Perturbed and strict mean teachers for semi-supervised semantic segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 4258–4267 (2022) 13. Luo, X., Chen, J., Song, T., Wang, G.: Semi-supervised medical image segmentation through dual-task consistency. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 35, pp. 8801–8809 (2021)


14. Luo, X., Hu, M., Song, T., Wang, G., Zhang, S.: Semi-supervised medical image segmentation via cross teaching between CNN and transformer. In: International Conference on Medical Imaging with Deep Learning, pp. 820–833. PMLR (2022) 15. Luo, X., et al.: Efficient semi-supervised gross target volume of nasopharyngeal carcinoma segmentation via uncertainty rectified pyramid consistency. In: de Bruijne, M., et al. (eds.) MICCAI 2021. LNCS, vol. 12902, pp. 318–329. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-87196-3_30 16. Luo, X., et al.: MIDeepSeg: minimally interactive segmentation of unseen objects from medical images using deep learning. Med. Image Anal. 72, 102102 (2021) 17. Ma, J., Wang, B.: Segment anything in medical images. arXiv preprint arXiv:2304.12306 (2023) 18. OpenAI: GPT-4 technical report (2023) 19. Ronneberger, O., Fischer, P., Brox, T.: U-Net: convolutional networks for biomedical image segmentation. In: Navab, N., Hornegger, J., Wells, W.M., Frangi, A.F. (eds.) MICCAI 2015, Part III. LNCS, vol. 9351, pp. 234–241. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-24574-4_28 20. Roy, S., et al.: SAM.MD: zero-shot medical image segmentation capabilities of the segment anything model. arXiv preprint arXiv:2304.05396 (2023) 21. Sohn, K., et al: FixMatch: simplifying semi-supervised learning with consistency and confidence. In: Advances in Neural Information Processing Systems, vol. 33, pp. 596–608 (2020) 22. Tarvainen, A., Valpola, H.: Mean teachers are better role models: weight-averaged consistency targets improve semi-supervised deep learning results. In: Advances in Neural Information Processing Systems, vol. 30 (2017) 23. Wang, G., et al.: DeepIGeoS: a deep interactive geodesic framework for medical image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 41(7), 1559–1572 (2018) 24. Wu, J., et al.: Medical SAM adapter: adapting segment anything model for medical image segmentation. arXiv preprint arXiv:2304.12620 (2023) 25. Wu, Y., Wu, Z., Wu, Q., Ge, Z., Cai, J.: Exploring smoothness and class-separation for semi-supervised medical image segmentation. In: Wang, L., Dou, Q., Fletcher, P.T., Speidel, S., Li, S. (eds.) Proceedings of the 25th International Conference on Medical Image Computing and Computer Assisted Intervention, MICCAI 2022, Part V, Singapore, 18–22 September 2022, pp. 34–43. Springer, Cham (2022). https://doi.org/10.1007/978-3-031-16443-9_4 26. Xu, Z., et al.: All-around real label supervision: cyclic prototype consistency learning for semi-supervised medical image segmentation. IEEE J. Biomed. Health Inf. 26(7), 3174–3184 (2022) 27. Yang, L., Zhuo, W., Qi, L., Shi, Y., Gao, Y.: St++: make self-training work better for semi-supervised semantic segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 4268–4277 (2022) 28. Yu, L., Wang, S., Li, X., Fu, C.-W., Heng, P.-A.: Uncertainty-aware self-ensembling model for semi-supervised 3D left atrium segmentation. In: Shen, D., et al. (eds.) MICCAI 2019. LNCS, vol. 11765, pp. 605–613. Springer, Cham (2019). https:// doi.org/10.1007/978-3-030-32245-8_67 29. Yu, Q., Xie, L., Wang, Y., Zhou, Y., Fishman, E.K., Yuille, A.L.: Recurrent saliency transformation network: incorporating multi-stage visual cues for small organ segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 8280–8289 (2018)

Aided Diagnosis of Autism Spectrum Disorder Based on a Mixed Neural Network Model

Yiqian Luo1, Ning Li1, Yudong Pan1, Wei Qiu1, Lianjin Xiong1, and Yangsong Zhang1,2(B)

1 School of Computer Science and Technology, Laboratory for Brain Science and Medical Artificial Intelligence, Southwest University of Science and Technology, Mianyang 621010, China
[email protected]
2 Key Laboratory of Testing Technology for Manufacturing Process, Ministry of Education, Southwest University of Science and Technology, Mianyang 621010, Sichuan, China

Abstract. The diagnosis of autism spectrum disorder (ASD) is a challenging task, especially for children. The conventional ways to determine whether a person has ASD are questionnaires and behavioral observation, which may be subjective and lead to misdiagnosis. To obtain a more accurate diagnosis, we can explore quantitative imaging biomarkers and leverage machine learning to learn classification models on these biomarkers for auxiliary ASD diagnosis. At present, many machine learning methods rely on resting-state fMRI data for feature extraction and auxiliary diagnosis. However, due to the heterogeneity of the data, there can be many noisy features that are adverse to diagnosis, and a lot of biometric information may not be fully explored. In this study, we designed a mixed neural network model of a convolutional neural network (CNN) and a graph neural network (GNN), termed MCG-Net, to extract discriminative information from brain functional connectivity based on resting-state fMRI data. We used the F-score and KNN algorithms to remove redundant connectivities in the functional connectivity matrix at the global and local levels. Besides, brain gradient features were introduced into the model for the first time. A dataset of 848 subjects from 17 sites of the ABIDE dataset was adopted to evaluate the methods. The proposed method achieved better diagnostic performance than other existing methods, with a 4.56% improvement in accuracy. Keywords: autism spectrum disorder · deep learning · feature fusion · graph convolutional network · functional gradient

1 Introduction

Autism spectrum disorder (ASD) is a generalized developmental disorder with a biological basis [1]. The onset of the disease varies,


with most cases occurring in childhood and continuing throughout life. Its main symptoms are social communication disorder, language communication disorder, emotional deficits, and repetitive and rigid behavior [2,3]. It leaves patients with huge barriers to communication and learning in their daily lives, causing great damage to their physical and mental health. Unfortunately, no specific drug or method that can completely cure ASD has been discovered so far. As we all know, school age is an important stage for the development of social interaction and the corresponding changes in brain function. ASD, as a kind of social dysfunction, seriously affects various social activities of school-age children. At present, the best approach is to prevent autism in advance and correct problem behaviors in children with autism through early intervention training [4]. However, there are no reliable biomarkers that can be used for diagnosis [5], and methods based on behavioral observation are extremely prone to misdiagnosis [6] and are not feasible in the early stages of autism.

Functional magnetic resonance imaging (fMRI), as a neuroimaging technique [7], has become an essential tool for investigating neurophysiology [8] and has been widely used to study functional and structural changes in the brain activity of autistic patients [9]. Many studies have demonstrated the feasibility of using resting-state fMRI (rs-fMRI) to reveal interactions between regions of interest (ROIs) in the brains of people with psychiatric disorders [8] such as autism [10] and attention deficit hyperactivity disorder (ADHD) [11]. Rs-fMRI signals have shown great potential in identifying diagnostic biomarkers in neuropathology [12]. Sharing and integrating fMRI data samples across different studies and datasets can provide a new avenue for seeking solutions for clinical applications [13].

In recent years, the brain functional connectivity network (FCN) obtained from rs-fMRI has shown great application prospects in the diagnosis of brain diseases. An FCN is generally defined as a complex non-Euclidean spatial graph structure [14]. Traditional methods, such as support vector machines and random forests, have not achieved satisfactory results. In recent years, graph network-based methods have emerged rapidly, because they are more in line with the data structure of the brain and can extract more levels of feature information. The graph convolutional network (GCN) is a natural extension of convolutional neural networks to the graph domain [15]. With the continuous development of deep learning, GCNs achieve good results by using the spatially relevant information of FCNs [16–18]. However, due to the heterogeneity of data and the unexplored information in different fMRI datasets, there is still a lot of room for improvement in existing methods.

Functional gradients of the brain represent information about spatial patterns in the embedded space, reflecting the rate of change, or the change in relative similarity within the underlying dimensions, i.e., the similarity of functional connectivity patterns in each dimension [20]. It has been confirmed that the functional gradients of patients with mental illness are different from those of normal people [20,21]. In previous studies, the functional gradient was not adopted for ASD diagnosis with deep learning (DL) models.


Based on resting-state fMRI data, in this study we proposed a mixed neural network model, i.e., MCG-Net, with a convolutional neural network (CNN) and a graph neural network (GNN) to extract discriminative information from brain functional connectivity for ASD diagnosis. The functional connectivity matrix was computed with the Pearson correlation, and then the redundant connectivities were removed by the F-score [24] and KNN at the global and local levels. To the best of our knowledge, this is the first time that functional connectivity features and brain functional gradients are combined for aided diagnosis. The main contributions of this paper can be summarized as follows. Firstly, to remove the global noisy features, the F-score is utilized for feature selection; for local noisy connections, the KNN algorithm is used to pick the preserved edges for each brain region. This operation removes the noisy connections in the adjacency matrix. Secondly, functional gradients were introduced into a deep learning model for assisted ASD diagnosis for the first time. In this study, functional gradients are used as the attention mechanism of the brain network. Thirdly, we proposed a model that combines convolutional and graph neural networks. Compared with existing models, the proposed model achieves better classification accuracy.

2 Preliminaries

2.1 Datasets and Preprocessing

We conduct experiments on fMRI images from the ABIDE I dataset [25], which aggregates data from 17 different collection sites [26] and publicly shares rs-fMRI and phenotypic data from 1112 subjects. In our work, we used the Data Processing Assistant for Resting-State fMRI (DPARSF) [27] to preprocess the rs-fMRI data. The preprocessing pipeline is as follows: (1) discard the first 5 volumes; (2) correct slice-timing delays; (3) correct head movement; (4) normalize to MNI space using the EPI template and resample to a 3 × 3 × 3 mm resolution; (5) spatially smooth with a 4 mm full-width-at-half-maximum Gaussian kernel; (6) apply linear detrending and temporal band-pass filtering of the BOLD signal (0.01−0.10 Hz); (7) regress out nuisance signals, including head movement parameters, white matter, cerebrospinal fluid (CSF), and global signals. A total of 848 subjects, including 396 individuals with ASD and 452 normal controls, were selected for this study.

2.2 The Calculation of Functional Connectivity Matrix

The brain functional connectivity network is an indicator of the connections between brain regions. After the preprocessing operation on the fMRI data, the time series of each brain region can be obtained. To construct the FCN, we used the Pearson correlation coefficient (PCC) to calculate the linear relationship between each pair of brain regions, as shown in Fig. 1. The PCC calculation process is shown in formula (1). Here u_t and v_t represent the value of the time series of brain regions u and v at the t-th time point, ū and v̄ are the averages of the time series of brain regions u and v, respectively, and T represents the total length of the time series. In this way, a functional connectivity matrix of size N × N is obtained, where N is the number of brain regions.

$$\rho_{uv} = \frac{\sum_{t=1}^{T}(u_t - \bar{u})(v_t - \bar{v})}{\sqrt{\sum_{t=1}^{T}(u_t - \bar{u})^2}\sqrt{\sum_{t=1}^{T}(v_t - \bar{v})^2}} \qquad (1)$$

Fig. 1. The computational process of functional connectivity matrix.

Using the AAL [28] template, the registered fMRI volumes were divided into 116 regions of interest (ROIs). We constructed an FC matrix of size 116 × 116 for each subject. Then, the functional connectivity matrices of the 848 subjects were obtained.
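As an illustration of how such FC matrices can be computed, the following minimal NumPy sketch builds the 116 × 116 Pearson-correlation matrix of formula (1) from the regional time series; the array shapes and the function name are illustrative assumptions, not code from the paper.

```python
import numpy as np

def functional_connectivity(ts: np.ndarray) -> np.ndarray:
    """Compute an N x N Pearson-correlation FC matrix.

    ts: array of shape (N, T) holding the mean BOLD time series of the
        N brain regions (N = 116 for the AAL template used here).
    """
    # np.corrcoef implements formula (1) row-wise: it subtracts each region's
    # temporal mean and normalizes by the standard deviations.
    fc = np.corrcoef(ts)
    np.fill_diagonal(fc, 1.0)   # self-connections are defined as 1
    return fc

# toy usage: random time series for 116 regions over 200 time points
rng = np.random.default_rng(0)
fc_matrix = functional_connectivity(rng.standard_normal((116, 200)))
print(fc_matrix.shape)  # (116, 116)
```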

2.3 Global Features Selection

In this work, we use the F-score [24] as a global feature selection method based on a data-driven approach. For each subject, the upper triangular part of its functional connectivity matrix was flattened into a vector. If the number of ASD patients is n_p and the number of normal controls is n_n, then the F-score corresponding to the i-th feature is calculated by the following formula (2):

$$F(i) = \frac{\left(\bar{x}_i^{p} - \bar{x}_i\right)^2 + \left(\bar{x}_i^{n} - \bar{x}_i\right)^2}{\frac{1}{n_p - 1}\sum_{k=1}^{n_p}\left(x_{k,i}^{p} - \bar{x}_i^{p}\right)^2 + \frac{1}{n_n - 1}\sum_{k=1}^{n_n}\left(x_{k,i}^{n} - \bar{x}_i^{n}\right)^2} \qquad (2)$$

where $\bar{x}_i^{p}$, $\bar{x}_i^{n}$ and $\bar{x}_i$ represent the averages of the i-th feature over the ASD subjects, the control subjects and all subjects, respectively, and $x_{k,i}^{p}$ and $x_{k,i}^{n}$ represent the i-th feature value of the k-th subject in the ASD and CN groups. Through the F-score method, we selected half of the key features for retention.
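The following sketch illustrates formula (2) and the retention of the top half of the features; the function names and the array layout are assumptions for illustration only.

```python
import numpy as np

def f_score(x_pos: np.ndarray, x_neg: np.ndarray) -> np.ndarray:
    """Per-feature F-score of formula (2).

    x_pos: (n_p, d) vectorized upper-triangular FC features of ASD subjects.
    x_neg: (n_n, d) features of normal controls.
    """
    x_all = np.vstack([x_pos, x_neg])
    mean_p, mean_n, mean_all = x_pos.mean(0), x_neg.mean(0), x_all.mean(0)
    numer = (mean_p - mean_all) ** 2 + (mean_n - mean_all) ** 2
    # var with ddof=1 equals 1/(n-1) * sum of squared deviations, as in (2)
    denom = x_pos.var(0, ddof=1) + x_neg.var(0, ddof=1)
    return numer / (denom + 1e-12)

def select_top_half(x_pos: np.ndarray, x_neg: np.ndarray) -> np.ndarray:
    """Keep the half of the features with the largest F-scores."""
    scores = f_score(x_pos, x_neg)
    keep = np.argsort(scores)[::-1][: scores.size // 2]
    return np.sort(keep)   # indices of retained connectivities

# toy usage with 30 ASD and 30 control subjects, 6670 vectorized connectivities
rng = np.random.default_rng(0)
idx = select_top_half(rng.standard_normal((30, 6670)), rng.standard_normal((30, 6670)))
print(idx.shape)  # (3335,)
```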

2.4 Functional Gradient

We used the BrainSpace toolbox to calculate the functional gradient [19]. The main calculation process can be divided into two steps: 1) calculate the cosine similarity of each pair of brain regions in the functional connectivity matrix to obtain a similarity matrix; 2) apply a diffusion dimensionality reduction operation to the similarity matrix to obtain the functional gradients of each subject. We selected the component corresponding to the principal gradient because this functional connectome axis has recently been linked to autism as a possible pathogenic mechanism at the system level [22]. This axis, which represents a large-scale cortical hierarchy stream, specifically demonstrated less differentiation of functional connectivity (FC) in both low-level sensory and high-order transmodal systems in ASD, offering a concise explanation for their altered sensory sensitivity and social impairment [23].
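The paper uses the BrainSpace toolbox for this step; purely as an illustration of the two described steps (cosine affinity followed by a diffusion-map embedding), a simplified NumPy stand-in might look as follows. It is a sketch under assumed defaults (e.g. the anisotropic diffusion parameter alpha) and is not the BrainSpace implementation.

```python
import numpy as np

def principal_gradient(fc: np.ndarray, n_components: int = 1, alpha: float = 0.5):
    """Simplified stand-in for the gradient computation described above.

    Step 1: cosine similarity between the FC profiles of every pair of regions.
    Step 2: diffusion-map embedding of the similarity matrix; the first
    non-trivial component plays the role of the principal gradient.
    """
    # step 1: cosine affinity between region-wise connectivity profiles
    norm = np.linalg.norm(fc, axis=1, keepdims=True) + 1e-12
    affinity = (fc / norm) @ (fc / norm).T
    affinity = np.clip(affinity, 0, None)          # keep non-negative weights

    # step 2: diffusion-map embedding (anisotropic normalization with alpha)
    d = affinity.sum(1) + 1e-12
    l_alpha = affinity / np.outer(d ** alpha, d ** alpha)
    d_alpha = l_alpha.sum(1) + 1e-12
    p = l_alpha / d_alpha[:, None]                 # Markov transition matrix
    evals, evecs = np.linalg.eig(p)
    order = np.argsort(-evals.real)
    # skip the trivial constant eigenvector, return the leading gradient(s)
    sel = order[1:n_components + 1]
    return evecs.real[:, sel] * evals.real[sel]    # shape (N, n_components)

# toy usage on a random FC matrix
fc = np.corrcoef(np.random.default_rng(0).standard_normal((116, 200)))
print(principal_gradient(fc).shape)   # (116, 1)
```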

3 Methodology and Materials

3.1 The Architecture of the Proposed Model

The network structure of MCG-Net is shown in Fig. 2. The model is divided into two modules: the first is the functional connectivity convolution part, and the second is the graph convolutional network part. Each part is introduced as follows.

Fig. 2. The diagram of the proposed MCG-Net model. (a) Functional connectivity convolution block. (b) GCN block.


Functional Connectivity Convolution Block. In this block, the functional connectivity matrices filtered by the F-score are used as inputs. Four convolution layers with different kernel sizes are used for feature extraction; the kernel sizes are 1 × 116, 2 × 116, 3 × 116 and 4 × 116, respectively. Each convolution represents a different feature extraction operation: the 1 × 116 kernel extracts the features of each brain region, and the s × 116 (s = 2, 3, 4) kernels extract the higher-order features of the brain region network. In this way, the model can extract more diagnostic features. After convolution, we use a max function to reduce the dimension of the features, splice all the features together to get a feature matrix of 4 × 120, and finally use a linear layer to yield features of size 116 × 1.

GCN Block. It is well known that graph neural networks can obtain more spatial information. However, in the task of autism diagnosis, the presence of spurious edges and noisy features hinders the model performance. For the functional connectivity matrices processed by the F-score [24] method, we further used the KNN method to reduce the connections of each brain region at the local level. We used the Euclidean distance to calculate the distance between brain regions, and then used KNN as a local feature selection method to keep, for each brain region, only the edges to the k brain regions with the closest distances. k was empirically set to 15 after the preliminary experiment. For the construction of the graph, we use G(V, E) to represent a graph, where V represents the feature matrix and E represents the edge matrix; v_i (i = 1, 2, ..., N) represents the embedding of each brain region, N represents the number of brain regions, and E ∈ R^{N×N} is the adjacency matrix. Each layer of a graph convolutional network can be defined as the following formula (3):

$$H^{l+1} = \sigma\left(\tilde{D}^{-\frac{1}{2}} \tilde{E} \tilde{D}^{-\frac{1}{2}} H^{l} W^{l}\right) \qquad (3)$$

where $\tilde{E}$ is the adjacency matrix processed by KNN, σ is the activation function, $W^{l}$ is a layer-specific trainable weight matrix, $H^{l}$ is the feature matrix of the l-th layer, and $\tilde{D}$ is the degree matrix with $\tilde{D}_{ii} = \sum_{j=1}^{N} \tilde{E}_{ij}$. GCN can be regarded as an operator of node features on a Laplacian smoothing graph structure [29]. A two-layer graph convolution network is used to obtain a feature matrix of size 116 × 32. Because functional gradients reflect changes in the relative similarity of functional connectivity patterns in each dimension [20], we use the component corresponding to the principal gradient (of size 116 × 1) as the weight of an attention mechanism to weight the GCN-processed brain networks. Finally, we use a one-dimensional convolution with a kernel size of 1 × 32 to extract the features of each brain region network and obtain a feature vector of size 116 × 1.

Feature Fusion for Classification. Through the convolution block and the graph convolution block, two 116 × 1 feature vectors are obtained. Then, we concatenate the two vectors to get a feature vector of size 1 × 232. Finally, a multi-layer perceptron (MLP) block is used to achieve the classification prediction, in which two fully connected layers are used.
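A minimal PyTorch sketch of the GCN block described above — KNN-sparsified adjacency, the normalized propagation of formula (3), and the principal gradient used as region-wise attention — is given below; the layer sizes, the Tanh activation, and the variable names are illustrative assumptions.

```python
import torch
import torch.nn as nn

def knn_adjacency(features: torch.Tensor, k: int = 15) -> torch.Tensor:
    """Keep, for every brain region, only the edges to its k nearest
    regions (Euclidean distance), as described for the local selection."""
    dist = torch.cdist(features, features)                 # (N, N)
    idx = dist.topk(k + 1, largest=False).indices[:, 1:]   # drop self
    adj = torch.zeros_like(dist)
    adj.scatter_(1, idx, 1.0)
    return ((adj + adj.t()) > 0).float()                   # symmetrize

class GCNLayer(nn.Module):
    """One layer of formula (3): H' = sigma(D^-1/2 E D^-1/2 H W)."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.weight = nn.Linear(in_dim, out_dim, bias=False)

    def forward(self, h, adj):
        deg = adj.sum(1)
        d_inv_sqrt = torch.diag(deg.clamp(min=1e-6).pow(-0.5))
        return torch.tanh(d_inv_sqrt @ adj @ d_inv_sqrt @ self.weight(h))

# hypothetical shapes: 116 regions, F-score-filtered FC rows as node features
N = 116
h = torch.randn(N, 116)
adj = knn_adjacency(h, k=15)
gcn1, gcn2 = GCNLayer(116, 64), GCNLayer(64, 32)
out = gcn2(gcn1(h, adj), adj)             # (116, 32)
gradient = torch.randn(N, 1)              # principal-gradient component of the subject
out = out * gradient                      # gradient used as brain-region attention
```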

3.2 The Comparison Models

In order to validate the performance of our model, six recently proposed methods are used for comparison. To ensure a fair comparison, we used the same processed data for the comparison models, and all the models were tested with 10-fold cross-validation.

ASD-DiagNet [30]: ASD-DiagNet is a collaborative learning strategy that combines single-layer perceptrons (SLPs) with autoencoders, and it leverages data-driven data augmentation to boost the quality of extracted features and the classification performance. For ASD-DiagNet, we take the same approach as in the original paper, selecting only the largest 25% and the smallest 25% of the FCN features as input features.

f-GCN [16]: f-GCN is a graph convolutional neural network with six graph convolution layers for classification. On top of the basic graph convolution, it uses a hyperparameter threshold to remove redundant connections. We set this hyperparameter to 0.45 for the experiment.

BrainGNN [31]: BrainGNN is an end-to-end graph neural network-based approach for fMRI prediction that learns the downstream whole-brain fMRI categorization and ROI clustering simultaneously. We set the batch size to 32, and the other hyperparameters employed in our comparison are consistent with the settings in the original paper.

Hi-GCN [16]: Hi-GCN uses the features of each subject extracted by f-GCN as a node, and uses correlation coefficients to obtain the edges between subjects to construct a population graph. This method uses more of the available information for classification, but it needs to re-train the model when new test data are introduced.

CNN [32]: The CNN method extracts feature information at different levels by using seven convolution layers with different kernel sizes, splices them together, and achieves classification using a linear layer.

MVS-GCN [17]: MVS-GCN is a multi-view graph convolution network model that uses the graph structure and multi-task graph embedding learning for brain disease diagnosis and the identification of key networks.

3.3 Experimental Setting and Evaluation Metrics

The experiments were carried out using 10-fold cross-validation, and the parameter settings of the model are listed in Table 1. The whole experiment was built in the PyTorch environment on a GeForce RTX 3090 GPU. The cross entropy was used as the loss function to drive network learning. Assuming that the number of training subjects is n, y_i is the true label of the i-th subject, and p_i is the predicted label, the cross entropy loss (L_CE) is defined as follows:

$$L_{CE} = \frac{1}{n}\sum_{i}^{n} -\left[y_i \times \log(p_i) + (1 - y_i) \times \log(1 - p_i)\right] \qquad (4)$$


Table 1. The parameter settings of network training of our method.

Parameter name        Parameters
Optimizer             Adam
Learning rate         0.0001
Dropout rate          0.3
Training epochs       100
Training batch size   32
Activation function   Tanh
Loss                  CrossEntropyLoss

Three metrics are used to verify the performance of our model, including accuracy (ACC), sensitivity (SEN) and specificity (SPE), defined as:

$$ACC = \frac{TP + TN}{TP + FN + FP + TN} \qquad (5)$$

$$SEN = \frac{TP}{TP + FN} \qquad (6)$$

$$SPE = \frac{TN}{TN + FP} \qquad (7)$$

where TP, TN, FP and FN denote the numbers of true positives, true negatives, false positives and false negatives, respectively.
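As a small worked example of formulas (5)–(7), a helper computing the three metrics from the confusion-matrix counts could look like this (the argument names are illustrative):

```python
def classification_metrics(tp: int, tn: int, fp: int, fn: int):
    """ACC, SEN and SPE as defined in formulas (5)-(7)."""
    acc = (tp + tn) / (tp + tn + fp + fn)
    sen = tp / (tp + fn)        # sensitivity: recall on the ASD class
    spe = tn / (tn + fp)        # specificity: recall on the control class
    return acc, sen, spe

# toy counts only, not results from the paper
print(classification_metrics(tp=60, tn=62, fp=10, fn=10))
```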

4 Experiment Results and Discussion

The classification results obtained by all the models are shown in Table 2. Compared with the other methods, the proposed model improves accuracy by at least 4.56%. Except for sensitivity, on which Hi-GCN outperformed our model by 3.25% and MVS-GCN by 1.29%, the proposed model achieves better results on all three metrics. Our model achieved the highest specificity of 73.52% and the highest accuracy of 70.48%.

Table 2. Performance comparison of multiple models, where ACC, SEN and SPE represent the average after ten-fold cross-validation. The best results are marked in bold.

Method            ACC(%)  SEN(%)  SPE(%)
f-GCN             60.03   57.71   62.56
MVS-GCN           62.60   68.37   55.83
BrainGNN          62.28   58.23   65.04
CNN               63.93   58.90   68.39
Hi-GCN            65.33   70.33   59.60
ASD-DiagNet       65.92   60.10   70.90
MCG-Net (ours)    70.48   67.08   73.52


We conducted an ablation experiment based on the functional connectivity convolution module, and the experimental results are shown in Table 3. In the table, "sub" is short for subtract. We removed the edge selection performed by KNN and used the F-score-filtered functional connectivity features as edges (that is, the features input to the GCNs are the same as the adjacency matrix). We also removed the functional gradient attention mechanism and used self-learned attention instead. From the table, we can conclude that removing each part decreases the classification accuracy, which also confirms the feasibility of the functional gradient as an attention mechanism over brain regions. In addition, the lowest accuracy of MCG-Net over the 10 folds is 65.48% (Fig. 3).

Table 3. 10-fold cross-validation results of ablation experiments, measured by ACC (%). The best results are bolded.

Method                   Fold 1  2      3      4      5      6      7      8      9      10     Mean
MCG-Net                  71.43   75.00  72.62  69.05  65.48  65.48  71.43  75.00  67.86  71.43  70.48
MCG-Net (sub KNN)        70.24   69.05  67.86  65.48  63.10  70.24  67.86  70.24  76.19  71.43  69.40
MCG-Net (sub gradient)   67.86   75.00  66.67  65.48  64.29  71.43  76.19  65.48  75.00  67.86  69.29

Fig. 3. The comparison between different models.

4.1 Brain Structure Analysis

We used the global average of the component corresponding to the principal gradient over all ASD patients to obtain Gra_ASD, and performed the same operation for the control subjects to obtain Gra_CN. For Gra_ASD, the brain regions with the five largest principal gradient values are the putamen of the left brain, the superior temporal gyrus of the right brain, the insula of the right brain, the rolandic operculum of the left brain, and the rolandic operculum of the right brain. For Gra_CN, the brain regions with the five largest principal gradient values are the superior temporal gyrus of the right brain, the superior temporal gyrus of the left brain, the rolandic operculum of the right brain, the transverse temporal gyrus of the right brain, and the rolandic operculum of the left brain. We used BrainNet Viewer [33] to map the five brain regions with the maximum functional gradients for ASD and CN, as shown in Fig. 4.

Fig. 4. Top five brain regions with maximum gradients: (a) ASD; (b) CN.

5 Conclusion

It is still challenging to detect ASD owing to the heterogeneity of data between subjects and the excessive spurious connections in the functional connectivity matrices. In this study, we proposed a mixed deep learning model with two branches to extract diversified features conducive to diagnosis. We used the F-score and KNN algorithms to remove the redundant connectivities in the functional connectivity matrix at the global and local levels, and the brain gradient features to weight the brain regions. Experiments on the public ABIDE dataset demonstrate the effectiveness of our model. Compared with the existing methods, MCG-Net achieves better performance. The proposed model provides an alternative method for future ASD detection.

References 1. Pandolfi, V., Magyar, C.I., Dill, C.A.: Screening for autism spectrum disorder in children with down syndrome: an evaluation of the pervasive developmental disorder in mental retardation scale. J. Intellect. Dev. Disabil. 43(1), 61–72 (2018) 2. Bhat, S., Acharya, U.R., Adeli, H., Bairy, G.M., Adeli, A.: Autism: cause factors, early diagnosis and therapies. Rev. Neurosci. 25(6), 841–850 (2014) 3. Bhat, S., Acharya, U.R., Adeli, H., Bairy, G.M., Adeli, A.: Automated diagnosis of autism: in search of a mathematical marker. Rev. Neurosci. 25(6), 851–861 (2014) 4. Bradshaw, J., Steiner, A.M., Gengoux, G., Koegel, L.K.: Feasibility and effectiveness of very early intervention for infants at-risk for autism spectrum disorder: a systematic review. J. Autism Dev. Disord. 45, 778–794 (2015) 5. Lord, C., Elsabbagh, M., Baird, G., Veenstra-Vanderweele, J.: Autism spectrum disorder. The Lancet 392(10146), 508–520 (2018) 6. Nickel, R.E., Huang-Storms, L.: Early identification of young children with autism spectrum disorder. Ind. J. Pediat. 84, 53–60 (2017) 7. Huettel, S.A., Song, A.W., McCarthy, G., et al.: Functional Magnetic Resonance Imaging, vol. 1. Sinauer Associates Sunderland (2004) 8. Dvornek, N.C., Ventola, P., Duncan, J.S.: Combining phenotypic and resting-state fMRI data for autism classification with recurrent neural networks. In: 2018 IEEE 15th International Symposium on Biomedical Imaging (ISBI 2018), pp. 725–728. IEEE (2018)


9. Just, M.A., Cherkassky, V.L., Keller, T.A., Kana, R.K., Minshew, N.J.: Functional and anatomical cortical underconnectivity in autism: evidence from an fMRI study of an executive function task and corpus callosum morphometry. Cereb. Cortex 17(4), 951–961 (2007) 10. Cherkassky, V.L., Kana, R.K., Keller, T.A., Just, M.A.: Functional connectivity in a baseline resting-state network in autism. NeuroReport 17(16), 1687–1690 (2006) 11. Yu-Feng, Z., et al.: Altered baseline brain activity in children with ADHD revealed by resting-state functional MRI. Brain Develop. 29(2), 83–91 (2007) 12. Greicius, M.: Resting-state functional connectivity in neuropsychiatric disorders. Curr. Opin. Neurol. 21(4), 424–430 (2008) 13. Castellanos, F.X., Di Martino, A., Craddock, R.C., Mehta, A.D., Milham, M.P.: Clinical applications of the functional connectome. Neuroimage 80, 527–540 (2013) 14. Zhang, D., Huang, J., Jie, B., Du, J., Tu, L., Liu, M.: Ordinal pattern: a new descriptor for brain connectivity networks. IEEE Trans. Med. Imaging 37(7), 1711– 1722 (2018) 15. Niepert, M., Ahmed, M., Kutzkov, K.: Learning convolutional neural networks for graphs. In: International Conference on Machine Learning, pp. 2014–2023. PMLR (2016) 16. Jiang, H., Cao, P., Xu, M., Yang, J., Zaiane, O.: HI-GCN: a hierarchical graph convolution network for graph embedding learning of brain network and brain disorders prediction. Comput. Biol. Med. 127, 104096 (2020) 17. Wen, G., Cao, P., Bao, H., Yang, W., Zheng, T., Zaiane, O.: MVS-GCN: a prior brain structure learning-guided multi-view graph convolution network for autism spectrum disorder diagnosis. Comput. Biol. Med. 142, 105239 (2022) 18. Fang, Y., Wang, M., Potter, G.G., Liu, M.: Unsupervised cross-domain functional MRI adaptation for automated major depressive disorder identification. Med. Image Anal. 84, 102707 (2023) 19. Vos de Wael, R., et al.: Brainspace: a toolbox for the analysis of macroscale gradients in neuroimaging and connectomics datasets. Commun. Biol. 3(1), 103 (2020) 20. Dong, D.: Compression of cerebellar functional gradients in schizophrenia. Schizophr. Bull. 46(5), 1282–1295 (2020) 21. Guo, S., et al.: Functional gradients in prefrontal regions and somatomotor networks reflect the effect of music training experience on cognitive aging. Cerebral Cortex, bhad056 (2023) 22. Hong, S.J., et al.: Atypical functional connectome hierarchy in autism. Nat. Commun. 10(1), 1022 (2019) 23. Margulies, D.S., Ghosh, S.S., Goulas, A., Falkiewicz, M., Huntenburg, J.M., Langs, G., Bezgin, G., Eickhoff, S.B., Castellanos, F.X., Petrides, M., et al.: Situating the default-mode network along a principal gradient of macroscale cortical organization. Proc. Natl. Acad. Sci. 113(44), 12574–12579 (2016) 24. Chen, Y.W., Lin, C.J.: Combining SVMs with various feature selection strategies. In: Feature Extraction: Foundations and Applications, pp. 315–324 (2006) 25. Di Martino, A., et al.: The autism brain imaging data exchange: towards a largescale evaluation of the intrinsic brain architecture in autism. Mol. Psychiat. 19(6), 659–667 (2014) 26. Craddock, C., et al.: The neuro bureau preprocessing initiative: open sharing of preprocessed neuroimaging data and derivatives. Front. Neuroinf. 7, 27 (2013) 27. Yan, C., Zang, Y.: DPARSF: a matlab toolbox for “pipeline” data analysis of resting-state fMRI. Front. Syst. Neurosci. 4, 1377 (2010)


28. Tzourio-Mazoyer, N., et al.: Automated anatomical labeling of activations in SPM using a macroscopic anatomical parcellation of the MNI MRI single-subject brain. Neuroimage 15(1), 273–289 (2002) 29. Li, Q., Han, Z., Wu, X.M.: Deeper insights into graph convolutional networks for semi-supervised learning. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 32 (2018) 30. Eslami, T., Mirjalili, V., Fong, A., Laird, A.R., Saeed, F.: ASD-DIAGNET: a hybrid learning approach for detection of autism spectrum disorder using fMRI data. Front. Neuroinf. 13, 70 (2019) 31. Li, X., et al.: BrainGNN: interpretable brain graph neural network for fMRI analysis. Med. Image Anal. 74, 102233 (2021) 32. Sherkatghanad, Z., et al.: Automated detection of autism spectrum disorder using a convolutional neural network. Front. Neurosci. 13, 1325 (2020) 33. Xia, M., Wang, J., He, Y.: Brainnet viewer: a network visualization tool for human brain connectomics. PLoS ONE 8(7), e68910 (2013)

A Supervised Spatio-Temporal Contrastive Learning Framework with Optimal Skeleton Subgraph Topology for Human Action Recognition

Zelin Deng1, Hao Zhou1(B), Wei Ouyang3, Pei He2, Song Yun1, Qiang Tang1, and Li Yu3

1 Changsha University of Science and Technology, Changsha 410114, China
[email protected]
2 School of Computer Science and Cyber Engineering, Guangzhou University, Guangzhou 510006, China
3 Hunan Hkt Technology Co. Ltd., Changsha 410000, China

Abstract. Human action recognition (HAR) is a hotspot in the field of computer vision, and models based on the Graph Convolutional Network (GCN) show great advantages in skeleton-based HAR. However, most existing GCN-based methods do not consider the diversity of action trajectories and do not highlight the key joints. To address these issues, a supervised spatio-temporal contrastive learning framework with optimal skeleton subgraph topology for HAR (SSTCL-optSST) is proposed. SSTCL-optSST uses the samples with the same label as the target action (anchor) to build a positive sample set, each of which represents a trajectory of the action. This sample set is used to design a loss function that guides the model to recognize different poses of the action. Furthermore, the subgraphs of an original skeleton graph are used to construct a skeleton subgraph topology space; each subgraph in it is evaluated, and the optimal one is selected to highlight the key joints. Extensive experiments have been conducted on the NTU RGB+D 60 and Kinetics datasets, and the results show that our model has competitive performance. Keywords: human action recognition · graph convolutional network · contrastive learning · optimal skeleton subgraph topology

1 Introduction

With the development of computer vision and deep neural networks, human action recognition (HAR) has been more and more widely used, especially in the fields of security surveillance [1], human-computer interaction [2] and virtual reality [3]. Among the input types of HAR tasks [4], skeleton data has a smaller size and lower computational cost than RGB or optical flow data, and has become an effective modality for HAR due to its robustness to body proportion, light changes, dynamic camera views and noisy backgrounds.


The traditional deep learning methods mainly model the skeleton data as a grid of joint coordinate vectors [5] or a pseudo-image [6]. Convolutional neural networks (CNNs) use multi-layer networks to automatically extract feature information, rather than generating hand-crafted features by leveraging the local characteristics of the data. Different from the grid data used in CNNs, the graph topology of the human skeleton constitutes a non-Euclidean space. In order to make full use of the representation ability of graphs, more and more methods use graph neural networks (GNNs) to train models. Among them, the graph convolutional network (GCN) is the most commonly used in skeleton-based HAR, which can capture non-Euclidean spatial information to learn high-level features of the graph structure. Yan et al. [7] improved a GCN model with a spatio-temporal scheme, which is called the spatio-temporal graph convolutional network (ST-GCN). ST-GCN can automatically learn spatial-temporal patterns from data, integrate contextual information, and give the model strong expression and generalization capabilities. Duan et al. [8] proposed a novel framework, PoseConv3D, that serves as a competitive alternative to GCN-based approaches; it relies on heatmaps as the base representation, enjoys the recent advances in convolutional network architectures, and is easier to integrate with other modalities into multi-stream convolutional networks.

Although GCNs perform very well on skeleton data, there are also some structural limitations: (i) in a supervised learning environment, most GCN-based methods do not consider the diversity of action trajectories; (ii) cross-entropy loss, the most widely used loss function in GCN-based methods, lacks robustness to noise [9], resulting in insensitivity to different behaviors of the same action; (iii) the original skeleton graph topology is shared by all GCN layers, which makes it difficult for the model to focus on the key joints of the action.

To solve the above problems, a supervised spatio-temporal contrastive learning framework with optimal skeleton subgraph topology for HAR (SSTCL-optSST) is proposed. SSTCL-optSST uses label information to determine the positive samples in each mini-batch, where each positive sample represents a trajectory of an action. Then a multiple-positives noise contrastive estimation loss for supervised learning (MultiPNCE-Sup) is proposed, which uses multiple samples of an action to guide contrastive learning across views, so as to extract representative features from the multiple trajectories of an action. In addition, an optimal skeleton subgraph topology determination method (optSST) is proposed to obtain the optimal subgraph that highlights the key joints of the action. In summary, our main contributions are as follows: (1) a supervised spatio-temporal contrastive learning framework for HAR is proposed, which enables the model to learn different action representations by using the trajectories of the action; (2) an optimal skeleton subgraph topology determination method is proposed to search for the optimal subgraph of the action, highlighting the key joints to further optimize the model; (3) extensive experiments have been carried out on two large human skeleton action datasets, and the results show that the proposed SSTCL-optSST outperforms existing skeleton-based action recognition methods.

2 Related Work

2.1 Skeleton-Based Action Recognition

Neural networks can automatically extract features by using relative 3D rotations and spatio-temporal position transformations between joints. Related research on HAR includes RNN-, CNN- and GCN-based methods. (i) RNN-based methods. Zhang et al. [10] proposed VA-LSTM, which dynamically takes the trajectories of samples as input and connects the three-dimensional spatial coordinates of the joints in a frame into a time series according to a predefined order. Liu et al. [5] proposed ST-LSTM to take advantage of the contextual dependencies of joints in the spatio-temporal domain and explore the implicit action-related information in the two domains. (ii) CNN-based methods. Ji et al. [11] used hand-crafted transformation rules to map the input skeleton into a pseudo-image, in which skeletal joints and their temporal trajectories are represented by the columns and rows of the grid structure, respectively. However, neither RNNs nor CNNs can represent the structure of human skeleton data well, because skeleton data is a natural graph structure; representing skeleton data in the form of a vector sequence or a two-dimensional grid inevitably loses some graph structure information. (iii) GCN-based methods. Generally, GCN-based methods outperform the other methods due to their great modeling ability for non-Euclidean data. Shi et al. [12] proposed a two-stream adaptive graph convolutional network (2S-AGCN), which explicitly feeds second-order information of the skeleton data into an adaptive graph convolutional network.

Graph Convolutional Network. GCNs have efficient performance for processing non-Euclidean data. The graph embedding is calculated by a GCN layer through different aggregation functions to obtain node information from its neighbor nodes. GCN-based methods involve spectral and spatial approaches. (i) Spectral methods. The spectral method performs the graph Fourier transform and transfers the graph convolution to the frequency domain. Most of the work originates from the layer-wise propagation rule [13], which approximates the effect of graph feature representation through a Chebyshev expansion. (ii) Spatial methods. These methods directly design convolution filters applied to nodes or edges by a weighted summation over vertices in the spatial domain. The semantics-guided neural network (SGN) proposed by Zhang et al. [14] can capture explicit semantic information by explicitly encoding joint types and frame indexes, so as to retain the basic spatio-temporal information of the body structure. Peng et al. [15] introduced a graph pooling method to increase the receptive field and break the structural constraints of high-level features to optimize a graph triplet loss. However, the mentioned networks cannot effectively learn different postures of actions, because they do not pay attention to the diversity of action trajectories.

In the traditional GCN layer, an input skeleton sample is a 3D tensor $f_{in} \in \mathbb{R}^{C_{in} \times N \times T}$, where $C_{in}$ is the number of channels. The spatial sub-module uses a spatial convolution function according to Eq. (1):

$$f_{out} = \sum_{k}^{K_s} W_k \left(f_{in} A_k\right) \qquad (1)$$

where $K_s$ is the kernel size on the spatial dimension, $A_k = D_k^{-\frac{1}{2}}(\hat{A}_k + I)D_k^{-\frac{1}{2}}$ is the normalized adjacency matrix with $D^{ii} = \sum_{k}^{K_s}(\hat{A}_k^{ij} + I^{ij})$, $\hat{A}_k$ is the adjacency matrix of the undirected graph representing intra-body connections, $I$ is the identity matrix, $W_k$ is a trainable weight matrix, and the final output is $f_{out} \in \mathbb{R}^{C_{out} \times N \times T}$.
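For illustration, a PyTorch sketch of the spatial graph convolution of Eq. (1) is given below; the partition strategy, channel sizes and the toy chain skeleton are assumptions, not the configuration used in this paper.

```python
import torch
import torch.nn as nn

class SpatialGraphConv(nn.Module):
    """Spatial graph convolution of Eq. (1): f_out = sum_k W_k (f_in A_k),
    with A_k the symmetrically normalized partition adjacency matrices."""
    def __init__(self, c_in, c_out, adjacency):   # adjacency: (K_s, N, N), un-normalized
        super().__init__()
        k_s, n, _ = adjacency.shape
        eye = torch.eye(n)
        norm_adj = []
        for a_hat in adjacency:
            a = a_hat + eye                              # add self-loops
            d_inv_sqrt = torch.diag(a.sum(1).pow(-0.5))
            norm_adj.append(d_inv_sqrt @ a @ d_inv_sqrt) # D^-1/2 (A_hat + I) D^-1/2
        self.register_buffer("A", torch.stack(norm_adj)) # (K_s, N, N)
        # W_k implemented as one 1x1 convolution producing K_s groups of channels
        self.conv = nn.Conv2d(c_in, c_out * k_s, kernel_size=1)
        self.k_s, self.c_out = k_s, c_out

    def forward(self, x):            # x: (batch, C_in, T, N)
        b, _, t, n = x.shape
        x = self.conv(x).view(b, self.k_s, self.c_out, t, n)
        # multiply each partition's features with its normalized adjacency, sum over k
        return torch.einsum('bkctn,knm->bctm', x, self.A)

# toy usage with a chain skeleton of 25 joints and a single partition
n = 25
a_hat = torch.zeros(1, n, n)
for i in range(n - 1):
    a_hat[0, i, i + 1] = a_hat[0, i + 1, i] = 1
layer = SpatialGraphConv(3, 64, a_hat)
print(layer(torch.randn(2, 3, 100, n)).shape)   # (2, 64, 100, 25)
```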

2.2 Contrastive Learning

Contrastive learning is widely used in self-supervised and unsupervised learning. The contrastive loss is task-related, and the similarity of sample pairs is processed in the representation space. Contrastive learning uses Noise Contrastive Estimation (NCE) [16] and has achieved state-of-the-art results in self-supervised learning for action recognition. In the instance discrimination task [17], the NCE-related contrastive loss can draw samples of the same class closer and push samples of different classes away, which is conducive to improving the recognition accuracy of the model. Khosla et al. [18] proposed a supervised contrastive learning method, which extends self-supervised batch contrastive learning to the supervised setting, so that the network can effectively use label information for image recognition. Chen et al. [19] proposed SimCLR, which uses a larger mini-batch to employ more contrasting negative samples to compute the real-time embedding.

3 Method

3.1 Supervised Spatio-Temporal Contrastive Learning Framework

SSTCL-optSST is a supervised contrastive representation learning method based on a graph convolutional network model. For an input skeleton sample, SSTCL-optSST applies data preprocessing to obtain two copies of the input. Both copies are propagated forward through the spatio-temporal feature extraction network to obtain a normalized embedding. In training, the samples with the same label as the anchor are viewed as positive samples; otherwise, they are negative ones. The main components of the framework are shown in Fig. 1. Generally, the skeletal sample in each frame of a video sequence can be viewed as a skeleton graph topology, which can be represented as G. The skeleton data of a video clip can be represented as a spatio-temporal coordinate vector set x ∈ R^{3×N×T}, where N is the number of joints and T is the number of video frames. In each frame t, x_t denotes the 3D coordinate vectors of the nodes. In data preprocessing, the input sample x is reversed to obtain a new sample x'. The two samples are concatenated through cat(·) and then augmented by aug(·), as shown in Eq. (2):

$$\hat{x} = aug(cat(x, x')) \qquad (2)$$
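A minimal sketch of Eq. (2) follows; it assumes the reversal is along the temporal axis and uses a placeholder noise augmentation, since the exact aug(·) is not specified here.

```python
import torch

def preprocess(x: torch.Tensor) -> torch.Tensor:
    """Sketch of Eq. (2): x_hat = aug(cat(x, x')).

    x is a skeleton clip of shape (3, N, T); x' is assumed to be the
    temporally reversed clip, and aug() is a placeholder augmentation.
    """
    x_rev = torch.flip(x, dims=[-1])                 # reverse along the time axis
    x_cat = torch.cat([x, x_rev], dim=-1)            # concatenate the two views
    return x_cat + 0.01 * torch.randn_like(x_cat)    # placeholder aug(): small jitter

x_hat = preprocess(torch.randn(3, 25, 150))
print(x_hat.shape)   # (3, 25, 300)
```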


Fig. 1. Supervised spatio-temporal contrastive learning framework with optimal skeleton subgraph topology for HAR.

After data preprocessing, the spatio-temporal feature extraction network is used to train the model to extract feature information. There are multiple forward propagation layers, and the preprocessed samples are learned to obtain a normalized embedding. The encoder embeds x̂ into a hidden space according to Eq. (3):

$$f_{in} = ReLU\left(W_2\left(ReLU\left(W_1 \hat{x} + b_1\right) + b_2\right)\right) \qquad (3)$$

Here W_1 and W_2 are the weight matrices of two fully connected layers, b_1 and b_2 are the bias vectors, and ReLU denotes a non-linear activation function. In the feature encoder, the normalized embedding f_in is fed into graph convolutional networks to explore the correlations in the structural skeleton data, and then into other layers to finally obtain the spatio-temporal feature representations. The formulas of spatio-temporal feature extraction are shown in Eq. (4) and (5):

$$f_{out} = ReLU\left(G_s(f_{in}) + G_s(f_{in} \otimes A_k)\right) \qquad (4)$$

$$STF = ReLU\left(C_s\left(ReLU\left(C_s\left(f_{out} \oplus FI\right)\right)\right)\right) \qquad (5)$$

where G_s denotes the graph convolution, C_s denotes the 2D convolution operation, ⊗ denotes matrix multiplication, ⊕ denotes element-wise addition, FI denotes the frame index obtained from a one-hot vector [14], and STF denotes the spatio-temporal feature representations. In the feature representation projection stage, the high-dimensional feature representation obtained by the encoder network is reduced to a low-dimensional feature representation through the projection network, which is a multi-layer perceptron (MLP) with ReLU activation functions. The transformation is done according to Eq. (6):

$$Proj = MLP\left(MaxPool(STF)\right) \qquad (6)$$

where the MLP is initialized with two linear layers and MaxPool denotes temporal max pooling. During the evaluation phase, the shared network uses the same weights as the network trained in the feature encoder, and the feature representation is passed through a fully connected layer with Softmax to obtain the final action class.
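To make the data flow of Eqs. (3)–(6) concrete, the following simplified PyTorch sketch chains a joint embedding, a stand-in graph convolution, two temporal convolutions, temporal max pooling and the projection MLP; the layer widths are illustrative, the frame-index term FI is omitted, and the adjacency is a placeholder.

```python
import torch
import torch.nn as nn

class SSTCLEncoder(nn.Module):
    """Sketch of the forward path of Eqs. (3)-(6); not the exact configuration."""
    def __init__(self, n_joints=25, c_in=3, c_embed=64, c_proj=128):
        super().__init__()
        self.embed = nn.Sequential(                  # Eq. (3): joint embedding
            nn.Linear(c_in, c_embed), nn.ReLU(),
            nn.Linear(c_embed, c_embed), nn.ReLU())
        self.gcn = nn.Linear(c_embed, c_embed)       # stand-in for G_s
        self.register_buffer("A", torch.eye(n_joints))   # adjacency placeholder
        self.cnn = nn.Sequential(                    # Eq. (5): temporal convolutions C_s
            nn.Conv2d(c_embed, c_embed, (3, 1), padding=(1, 0)), nn.ReLU(),
            nn.Conv2d(c_embed, c_embed, (3, 1), padding=(1, 0)), nn.ReLU())
        self.proj = nn.Sequential(                   # Eq. (6): projection MLP
            nn.Linear(c_embed, c_proj), nn.ReLU(), nn.Linear(c_proj, c_proj))

    def forward(self, x):                            # x: (B, T, N, 3)
        f_in = self.embed(x)                         # (B, T, N, C)
        f_graph = torch.einsum('btnc,nm->btmc', f_in, self.A)
        f_out = torch.relu(self.gcn(f_in) + self.gcn(f_graph))   # Eq. (4)
        stf = self.cnn(f_out.permute(0, 3, 1, 2))    # (B, C, T, N)
        stf = stf.mean(-1)                           # pool joints -> (B, C, T)
        pooled = stf.max(dim=-1).values              # temporal max pooling
        return nn.functional.normalize(self.proj(pooled), dim=-1)

z = SSTCLEncoder()(torch.randn(4, 300, 25, 3))
print(z.shape)   # (4, 128)
```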

3.2 Supervised Contrastive Loss

Fig. 2. The difference of contrastive learning between supervised and self-supervised: (a) The traditional self-supervised contrastive learning methods; (b) Our proposed supervised contrastive learning method for skeleton-based HAR.

As shown in Fig. 2(a), the two samples in the orange and red boxes represent the original sample and the reversed sample, respectively. In the self-supervised contrastive learning setting, only the augmented skeleton sample is a positive sample, and the other samples belonging to the waving action are considered negative samples, which means the model ignores the diversity of action trajectories and cannot recognize different poses of an action. The supervised learning setting is shown in Fig. 2(b). In supervised learning, action labels are known at the training stage, so positive samples can be determined according to whether their labels are the same as that of the anchor. In order to train the model to recognize multiple trajectories of an action, a new loss function, MultiPNCE-Sup, is proposed, which uses multiple positive samples of an action to guide the model to learn from different views.

For a group of N samples {(x_i, y_i) | i = 1, 2, ..., N}, N pairs of augmented samples are obtained through data preprocessing, which constitute the mini-batch for training. Let {(x_k, y_k) | k = 1, 2, ..., 2N} be the mini-batch, where x_{2i−1} and x_{2i} are the two augmented samples of x_i and y_{2i−1} = y_{2i} = y_i. At each training step, the necessary embeddings are stored in a memory bank M. Our proposed MultiPNCE-Sup loss is shown in Eq. (7):

$$L_{mp} = -\frac{1}{|P|}\sum_{p \in P} \log \frac{e^{z \cdot k_p^{+}/\tau}}{e^{z \cdot k_p^{+}/\tau} + \sum_{i=1}^{M} e^{z \cdot k_i^{-}/\tau}} \qquad (7)$$

where L_mp denotes the contrastive loss, P ≡ {p ∈ S \ M : ŷ_p = ŷ_z} is the set of indices of all positives in the mini-batch, S ≡ {1, 2, ..., 2N} indicates the set of all augmented samples, and |P| is the size of the set P. τ is a scalar temperature parameter [20], z is called the anchor, k_p^+ is a positive sample, M = {k_i^- | i ∈ 1, 2, ..., 2(N − 1)} is a memory bank that stores negative samples, k_i^- is the i-th negative sample in the memory queue, and the · symbol is the dot product of two vectors.
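A compact sketch of Eq. (7) in PyTorch is shown below; it assumes L2-normalized embeddings and takes the positives and the memory-bank negatives as explicit tensors.

```python
import torch
import torch.nn.functional as F

def multi_pnce_sup(z, pos, neg, tau: float = 0.08):
    """Sketch of the MultiPNCE-Sup loss of Eq. (7).

    z:   (d,)   anchor embedding.
    pos: (P, d) embeddings of the positives (same label as the anchor).
    neg: (M, d) embeddings stored in the memory bank of negatives.
    All embeddings are assumed to be L2-normalized.
    """
    pos_logits = pos @ z / tau                 # z . k_p^+ / tau for each positive
    neg_logits = neg @ z / tau                 # z . k_i^- / tau for each negative
    denom = pos_logits.exp() + neg_logits.exp().sum()   # per-positive denominator
    return -(pos_logits - denom.log()).mean()           # average over |P| positives

# toy usage
d, P, M = 128, 5, 64
anchor = F.normalize(torch.randn(d), dim=0)
positives = F.normalize(torch.randn(P, d), dim=1)
negatives = F.normalize(torch.randn(M, d), dim=1)
print(multi_pnce_sup(anchor, positives, negatives))
```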

3.3 Optimal Skeleton Subgraph Topology Determination

In order to highlight the key joints, an optimal skeleton subgraph topology determination (optSST) method is proposed. Taking the NTU60 dataset as an example, the skeleton consists of 5 different parts: the left arm (9, 10, 11, 12, 24, 25), right arm (5, 6, 7, 8, 22, 23), left leg (17, 18, 19, 20), right leg (13, 14, 15, 16), and head and body (1, 2, 3, 4, 21), as can be seen from the left blue skeleton in Fig. 3. The proposed optSST can be described as a function ψ : N → M. The main components of optSST are as follows.

(i) Subgraph space generation. Given an original skeleton with N joints, it is transformed into a subgraph with M joints by a joint reduction operation. The number of joints is reduced by D_1 = ⌊(N − M)/2⌋ in the first joint reduction operation and by D_2 = N − M − ⌊(N − M)/2⌋ in the second. Take N = 20, M = 12 as an example (see Fig. 3), in which the original skeleton graph is regarded as a tree root. The number of joints to be reduced by each of the two joint reduction operations is D_1 = D_2 = 4, and one joint in each part from Ḡ_1 to Ḡ_4 is deleted by each joint reduction operation. After two joint reduction operations, the action subgraph space is formed.

(ii) Correlation matrix. For a selected subgraph topology, a normalized joint correlation function is applied to calculate the affinity between joints in all parts before the GCN layer. Similar to [21], the joint correlation strength function is described in Eq. (8). By calculating the joint affinity of all joint pairs in the same frame, the affinity matrix g ∈ R^{N×N} can be obtained.

$$g(\bar{x}_{v_i}, \bar{x}_{v_j}) = \frac{e^{\theta(\bar{x}_{v_i})^{T} \phi(\bar{x}_{v_j})}}{\sum_{j=1}^{N} e^{\theta(\bar{x}_{v_i})^{T} \phi(\bar{x}_{v_j})}} \qquad (8)$$

where θ(·) and φ(·) are transformation functions, each implemented by a linear layer. Due to the change in the skeleton graph topology, the hidden layer feature vector f_in ∈ R^{C×N×T} is converted into a new input f_in ∈ R^{C×M×T}, which corresponds to the current subgraph topology after joint downsampling. The new correlation matrix B̃_k is updated from g and the adjacency matrix Ã_k, as shown in Eq. (9):

$$\tilde{B}_k = SoftMax\left(g + \tilde{A}_k\right) \qquad (9)$$

Here, the correlation matrix B̃_k ∈ R^{M×M} and SoftMax(·) is a normalized exponential function. After joint downsampling, the original sequence samples containing N joints are mapped to a new skeleton graph topology containing M joints. The model obtains a new hidden layer feature input and correlation strength matrix before the GCN layer, and then inputs them to the GCN layer to obtain the feature representation.

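The following sketch illustrates Eqs. (8) and (9) for a single frame of a downsampled subgraph; the embedding width and the identity adjacency in the toy usage are assumptions.

```python
import torch
import torch.nn as nn

class JointAffinity(nn.Module):
    """Sketch of Eqs. (8)-(9): softmax-normalized joint affinity g and the
    updated correlation matrix B_k = SoftMax(g + A_k) of a subgraph."""
    def __init__(self, c_in, c_embed=64):
        super().__init__()
        self.theta = nn.Linear(c_in, c_embed)    # theta(.) in Eq. (8)
        self.phi = nn.Linear(c_in, c_embed)      # phi(.) in Eq. (8)

    def forward(self, x, adj):
        # x: (M, C) per-joint features of one frame, adj: (M, M) subgraph adjacency
        logits = self.theta(x) @ self.phi(x).t()         # theta(x_i)^T phi(x_j)
        g = torch.softmax(logits, dim=-1)                # Eq. (8), normalized per row
        return torch.softmax(g + adj, dim=-1)            # Eq. (9)

m = 12                                     # joints kept after downsampling
aff = JointAffinity(c_in=64)
b_k = aff(torch.randn(m, 64), torch.eye(m))
print(b_k.shape)                           # (12, 12)
```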

Fig. 3. Illustration of skeleton subgraph segmentation in optSST.

Fig. 4. Illustration of the optimal skeleton subgraph topology determination.

(iii) Subgraph evaluation and determination. The input data containing the information of a subgraph topology is sequentially processed by the encoder network and the MLP to obtain the network weight parameters. To reduce the computational requirements, the learned model shares the encoder weights with the shared network. Then, the shared network with a classifier is trained on a subgraph topology and evaluated by classifying a small-scale evaluation dataset. Subgraph streams are input to train the model; the parameters of the skeleton subgraph topology with the best performance are retained, and the optimal subgraph of the original skeleton is thereby determined. The optSST procedure is shown in Fig. 4, and a sketch of the selection loop is given below.
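A hedged sketch of this selection loop follows; the helper names (candidates, build_classifier, eval_loader) are hypothetical, and build_classifier is assumed to attach (and, if needed, fine-tune) a classification head on top of the frozen shared encoder for the given subgraph.

```python
import torch

def select_optimal_subgraph(candidates, build_classifier, eval_loader):
    """Score every candidate subgraph topology and keep the best one.

    candidates:       iterable of (joint_indices, adjacency) pairs drawn from
                      the subgraph space of Fig. 3 (hypothetical format).
    build_classifier: assumed to wrap the frozen shared encoder with a small
                      classification head for the given subgraph topology.
    eval_loader:      the small held-out evaluation split (10% of training data).
    """
    best_acc, best_subgraph = -1.0, None
    for joints, adj in candidates:
        model = build_classifier(joints, adj)
        model.eval()
        correct = total = 0
        with torch.no_grad():
            for x, y in eval_loader:
                pred = model(x).argmax(dim=-1)
                correct += (pred == y).sum().item()
                total += y.numel()
        acc = correct / max(total, 1)
        if acc > best_acc:                 # keep the subgraph with the best accuracy
            best_acc, best_subgraph = acc, (joints, adj)
    return best_subgraph, best_acc
```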

4 Experiments

4.1 Datasets

NTU-RGB+D 60. NTU60 [22] is a large human action dataset, which contains a total of 56880 3D skeleton video samples collected from 60 human action classes performed by 40 actors. The maximum number of frames for an action sequence is 300, and each frame has 25 joint coordinates for a human skeleton. The original work provides two benchmarks, Cross-subject (C-Sub) and Cross-view (C-View). In the C-Sub evaluation benchmark, half of the 40 subjects are used for training and the rest for testing.

Kinetics Skeleton. Kinetics [23] is a large-scale noisy dataset containing about 300000 video clips of 400 classes. Kinetics-Motion (KM) [7] is a subset of Kinetics Skeleton and contains 30 action classes strongly related to body motions. Each video clip does not exceed 300 frames in the released data. Action recognition performance is evaluated by reporting the Top-1 and Top-5 recognition accuracies.

Fig. 5. Accuracy curves of four different models on NTU60 (C-View) and the KM dataset.

4.2 Experimental Configuration

To initialize the skeleton data, all datasets were preprocessed in the same way as [24] to speed up the training process. The initial learning rate of the Adam optimizer is set to 0.001. The hyperparameter τ of the MultiPNCE-Sup loss function is set to 0.08. The training of our model is mainly completed by three parts: the data preprocessing module, the encoder network and the projection network. After data preprocessing, the augmented samples are fed into the spatio-temporal feature extraction network. In the encoder network, the kernel size of each GCN layer is 1 × 1. There is a batch normalization (BN) layer and a non-linear activation function ReLU after each GCN layer. The feature output channels of the three GCN layers are 128, 256 and 256, respectively. Similar to the previous network model [7], a residual mechanism is used in each GCN layer. Then the feature embedding is sequentially fed into two CNN layers with a kernel size of 3 × 1 to capture the dynamic information. The numbers of neurons of the two CNN layers are set to 256 and 512, respectively. In the evaluation phase, our model performs classification with the shared network and a fully connected (FC) layer. The parameters of the shared network are the same as those of the feature encoder. The final output channel is 60 (the number of action classes). 10% of the training sequences are randomly selected to evaluate the performance of optSST.


For the NTU60 dataset, the learning rate is set to 0.1 and is decayed at the 60th, 90th and 110th epochs during training; the model stops training at the 120th epoch, and the batch size is set to 128. For the Kinetics Skeleton dataset, we also set the learning rate to 0.1; the learning rate decreases sequentially at the 55th, 80th and 100th epochs, the training process ends at the 110th epoch, and the batch size is set to 64. In addition, we set the number of skeleton parts in optSST to 5 for all datasets. For the NTU60 dataset, we set the reduction ratio of the number of joints to 20% for each joint downsampling layer, which means that the number of joints is reduced from 25 to 20, and then from 20 to 15. For Kinetics Skeleton, the reduction ratio is set to 17%, and the 18 original joints are reduced to 12 by the two joint reduction operations.

4.3 Ablation Study

To validate the effectiveness of the components proposed in this study, we conducted several ablation experiments of SSTCL-optSST on the NTU60 and KM datasets. The participating models include "CE w/o optSST", "CE w optSST", "MS w/o optSST" and "MS w optSST". Here "w" and "w/o" denote with and without, respectively; CE stands for the cross-entropy loss and MS for the MultiPNCE-Sup loss. The accuracy curves of the four models using different modules on NTU60 (C-View) and the KM dataset are shown in Fig. 5. It can be seen that the "MS w optSST" model obtained the best performance compared with the other models. As shown in Table 1, the "CE w optSST" model is superior to "CE w/o optSST" by 0.5% and 1.1% on the NTU60 (C-View) and KM datasets, respectively. In addition, the "MS w/o optSST" model outperforms "CE w/o optSST" by 1.2% and 2.3% on the NTU60 (C-View) and KM datasets, respectively. The results show that our proposed MultiPNCE-Sup loss and optSST can improve the recognition accuracy of the model.

Table 1. Accuracy of different modules

Methods          NTU60(C-View)  KM
CE w/o optSST    93.2%          80.1%
CE w optSST      93.7%          81.2%
MS w/o optSST    94.4%          82.4%
MS w optSST      94.8%          83.1%

4.4 Comparison with Other Methods

To further measure the action recognition performance of our model on skeleton data, we conducted extensive experiments on the NTU60 and Kinetics datasets and compared the results with other state-of-the-art methods. To be fair, we compare our model with the ST-GCN baseline and other models on the NTU60 dataset, taking the Top-1 accuracy on C-Sub and C-View as the metrics. For Kinetics, we use the Top-1 and Top-5 accuracies to evaluate the performance of the model. It can be seen from Table 2 that the methods based on deep learning usually outperform the methods based on hand-crafted features. As far as both metrics are concerned, the performance of SSTCL-optSST is better than most of the aforementioned methods, which shows that SSTCL-optSST can well distinguish the same action classes with different motions. For the different waving actions, the accuracy of our SSTCL-optSST on the C-View benchmark is improved from 94.5% to 94.8%.

Table 2. Accuracy of different methods on NTU60 dataset.

Methods                  C-Sub   C-View
Deep LSTM (2016) [22]    62.9%   70.3%
VA-LSTM (2017) [10]      79.2%   87.7%
ST-GCN (2018) [7]        81.5%   88.3%
ConMLP (2023) [25]       83.9%   93.0%
C-MANs (2021) [26]       83.7%   93.8%
PR-GCN (2021) [27]       85.2%   91.7%
STG-IN (2020) [28]       85.8%   88.7%
AS-GCN (2019) [29]       86.8%   94.2%
GR-GCN (2019) [30]       87.5%   94.3%
KA-AGTN (2022) [31]      88.3%   94.3%
SGN (2020) [14]          89.0%   94.5%
Ours (SSTCL-optSST)      88.2%   94.8%

Table 3. Accuracy comparison of different methods on Kinetics dataset.

Methods                  Top-1   Top-5
Deep LSTM (2016) [22]    16.4%   35.3%
TCN (2017) [6]           20.3%   40.0%
ST-GCN (2018) [7]        30.7%   52.8%
ST-GR (2019) [32]        33.6%   56.1%
PR-GCN (2021) [27]       33.7%   55.8%
PeGCN (2022) [33]        34.8%   57.2%
2S-AGCN (2019) [12]      35.1%   57.1%
Ours (SSTCL-optSST)      34.6%   57.4%

For the Kinetics dataset, as shown in Table 3, we compared our method with seven skeleton-based action recognition methods. On Top-5 accuracy, it can be seen that our model achieved the best performance. As for Top-1 accuracy, our model achieved an accuracy of 34.6% and ranks second among all the compared methods. Table 4 reports the mean class accuracy of skeleton-based and frame-based methods on the Kinetics-Motion dataset, and our model achieved better performance than the previous frame-based methods and the skeleton-based ST-GCN model.

Table 4. Top-1 accuracy of different methods on KM subset of the Kinetics dataset.

Methods                  Top-1   Input
RGB [23]                 70.4%   frame
Optical flow [23]        72.8%   frame
ST-GCN [7]               72.4%   skeleton
Ours (SSTCL-optSST)      83.1%   skeleton

5 Conclusion

In this paper, a supervised spatio-temporal contrastive learning framework with optimal skeleton subgraph topology for HAR is proposed. The MultiPNCE-Sup loss utilizes multiple motion trajectories of skeletal samples to guide contrastive learning across views and makes the model recognize different postures of an action. Meanwhile, to highlight the key joints of an action, we proposed an optimal skeleton subgraph topology determination method to determine the optimal skeleton subgraph topology of an input skeleton graph topology. By using the subgraph that highlights the key joints of actions, the model can further improve the evaluation performance. The model was evaluated on two large-scale action recognition datasets, NTU60 and Kinetics. The results show that our model can achieve competitive performance on NTU60 (C-View) and Kinetics-Motion.

Acknowledgement. This work was supported in part by the National Natural Science Foundation of China under Grant No. 61977018; the Natural Science Foundation of Changsha under Grant No. kq2202215; and the Practical Innovation and Entrepreneurship Enhancement Program for Professional Degree Postgraduates of Changsha University of Science and Technology (CLSJCX22114).

References 1. Gajjar, V., Gurnani, A., Khandhediya, Y.: Human detection and tracking for video surveillance: a cognitive science approach. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 2805–2809 (2017) 2. Saha¨ı, A., Desantis, A., Grynszpan, O., Pacherie, E., Berberian, B.: Action corepresentation and the sense of agency during a joint simon task: comparing human and machine co-agents. Conscious. Cogn. 67, 44–55 (2019)


3. Pilarski, P.M., Butcher, A., Johanson, M., Botvinick, M.M., Bolt, A., Parker, A.S.: Learned human-agent decision-making, communication and joint action in a virtual reality environment. arXiv preprint arXiv:1905.02691 (2019) 4. Wang, H., Schmid, C.: Action recognition with improved trajectories. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 3551–3558 (2013) 5. Liu, J., Shahroudy, A., Xu, D., Wang, G.: Spatio-temporal LSTM with trust gates for 3D human action recognition. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9907, pp. 816–833. Springer, Cham (2016). https:// doi.org/10.1007/978-3-319-46487-9 50 6. Kim, T.S., Reiter, A.: Interpretable 3D human action analysis with temporal convolutional networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pp. 20–28 (2017) 7. Yan, S., Xiong, Y., Lin, D.: Spatial temporal graph convolutional networks for skeleton-based action recognition. In: Thirty-Second AAAI Conference on Artificial Intelligence (2018) 8. Duan, H., Zhao, Y., Chen, K., Lin,D., Dai, B.: Revisiting skeleton-based action recognition. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 2969–2978 (2022) 9. Sukhbaatar, S., Bruna, J., Paluri, M., Bourdev, L., Fergus, R.: Training convolutional networks with noisy labels. arXiv preprint arXiv:1406.2080 (2014) 10. Zhang, P., Lan, C., Xing, J., Zeng, W., Xue, J., Zheng, N.: View adaptive recurrent neural networks for high performance human action recognition from skeleton data. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 2117–2126 (2017) 11. Ji, X., Zhao, Q., Cheng, J., Ma, C.: Exploiting spatio-temporal representation for 3D human action recognition from depth map sequences. Knowl.-Based Syst. 227, 107040 (2021) 12. Shi, L., Zhang, Y., Cheng, J., Lu, H.: Two-stream adaptive graph convolutional networks for skeleton-based action recognition. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 12026–12035 (2019) 13. Kipf, T.N., Welling, M.: Semi-supervised classification with graph convolutional networks. arXiv preprint arXiv:1609.02907 (2016) 14. Zhang, P., Lan, C., Zeng, W., Xing, J., Xue, J., Zheng, N.: Semantics-guided neural networks for efficient skeleton-based human action recognition. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 1112–1121 (2020) 15. Peng, W., Hong, X., Zhao, G.: Tripool: graph triplet pooling for 3d skeleton-based action recognition. Pattern Recogn. 115, 107921 (2021) 16. Gutmann, M., Hyv¨ arinen, A.: Noise-contrastive estimation: a new estimation principle for unnormalized statistical models. In: Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, pp. 297–304. JMLR Workshop and Conference Proceedings (2010) 17. Wu, Z., Xiong, Y., Yu, S.X., Lin, D.: Unsupervised feature learning via nonparametric instance discrimination. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3733–3742 (2018) 18. Khosla, P.: Supervised contrastive learning. Adv. Neural. Inf. Process. Syst. 33, 18661–18673 (2020) 19. Chen, T., Kornblith, S., Norouzi, M., Hinton, G.: A simple framework for contrastive learning of visual representations. In: International Conference on Machine Learning, pp. 1597–1607. PMLR (2020)

A SSTCL-optSST for Human Action Recognition

175

20. Hinton, G., Vinyals, O., Dean, J., et al.: Distilling the knowledge in a neural network, vol. 2, no. 7. arXiv preprint arXiv:1503.02531 (2015) 21. Wang, X., Girshick, R., Gupta, A., He, K.: Non-local neural networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7794–7803 (2018) 22. Shahroudy, A., Liu, J., Ng, T.T., Wang, G.: Ntu rgb+ d: a large scale dataset for 3d human activity analysis. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1010–1019 (2016) 23. Kay, W., et al. The kinetics human action video dataset. arXiv preprint arXiv:1705.06950 (2017) 24. Rao, H., Shihao, X., Xiping, H., Cheng, J., Bin, H.: Augmented skeleton based contrastive action learning with momentum LSTM for unsupervised action recognition. Inf. Sci. 569, 90–109 (2021) 25. Dai, C., Wei, Y., Xu, Z., Chen, M., Liu, Y., Fan, J.: ConMLP: MLP-based selfsupervised contrastive learning for skeleton data analysis and action recognition. Sensors 23(5), 2452 (2023) 26. Li, C., Xie, C., Zhang, B., Han, J., Zhen, X., Chen, J.: Memory attention networks for skeleton-based action recognition. IEEE Trans. Neural Netw. Learn. Syst. 33(9), 4800–4814 (2021) 27. Li, S., Yi, J., Farha, Y.A., Gall, J.: Pose refinement graph convolutional network for skeleton-based action recognition. IEEE Rob. Autom. Lett. 6(2), 1028–1035 (2021) 28. Ding, W., Li, X., Li, G., Wei, Y.: Global relational reasoning with spatial temporal graph interaction networks for skeleton-based action recognition. Signal Process. Image Commun. 83, 115776 (2020) 29. Li, M., Chen, S., Chen, X., Zhang, Y., Wang, Y., Tian, Q.: Actional-structural graph convolutional networks for skeleton-based action recognition. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3595–3603 (2019) 30. Gao, X., Hu, W., Tang, J., Liu,J., Guo, Z.: Optimized skeleton-based action recognition via sparsified graph regression. In: Proceedings of the 27th ACM International Conference on Multimedia, pp. 601–610 (2019) 31. Liu, Y., Zhang, H., Dan, X., He, K.: Graph transformer network with temporal kernel attention for skeleton-based action recognition. Knowl.-Based Syst. 240, 108146 (2022) 32. Li, B., Li, X., Zhang, Z., Fei, W.: Spatio-temporal graph routing for skeletonbased action recognition. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33, pp. 8561–8568 (2019) 33. Yoon, Y., Jongmin, Yu., Jeon, M.: Predictively encoded graph convolutional network for noise-robust skeleton-based action recognition. Appl. Intell. 52(3), 2317– 2331 (2022)

Multi-scale Feature Fusion Neural Network for Accurate Prediction of Drug-Target Interactions

Zhibo Yang1,2, Binhao Bai1,2, Jinyu Long1,2, Ping Wei3, and Junli Li1,2(B)

1 School of Computing Science, Sichuan Normal University, Chengdu 610066, China
[email protected]
2 Visual Computing and Virtual Reality Key Laboratory of Sichuan, Sichuan Normal University, Chengdu 610068, China
3 Sichuan Academy of Traditional Chinese Medicine/Sichuan Center of Translational Medicine/Sichuan Key Laboratory of Translational Medicine of Traditional Chinese Medicine, Chengdu 610041, China

Abstract. Identification of drug-target interactions (DTI) is crucial in drug discovery and repositioning. However, identifying DTI is a costly and time-consuming process that involves conducting biological experiments with a vast array of potential compounds. To accelerate this process, computational methods have been developed, and with the growth of available datasets, deep learning methods have been widely applied in this field. Despite the emergence of numerous sequence-based deep learning models for DTI prediction, several limitations endure. These encompass inadequate feature extraction from protein targets using amino acid sequences, a deficiency in effective fusion mechanisms for drug and target features, and a prevalent inclination among many methods to solely treat DTI as a binary classification problem, thereby overlooking the crucial aspect of predicting binding affinity that signifies the strength of drug-target interactions. To address these concerns, we developed a multi-scale feature fusion neural network (MSF-DTI), which leverages the potential semantic information of amino acid sequences at multiple scales, enriches the feature representation of proteins, and fuses drug and target features using a designed feature fusion module for predicting drug-target interactions. According to experimental results, MSF-DTI outperforms other state-of-the-art methods in both DTI classification and binding affinity prediction tasks.

Keywords: Drug-target interaction · Binding affinity · Deep learning

1 Introduction

Drug-target interaction (DTI) prediction is a crucial step in drug development and largely determines the progress of subsequent research [1]. Although some traditional methods, such as high-throughput screening, are widely used, they still incur high computational costs and time expenditures for large-scale drug-target pair searches [2–4].


Therefore, developing efficient computational methods to explore DTI mechanisms is of great significance.

Understanding the mechanism of interactions between drugs and targets is often a complex process, which poses significant challenges for developing effective computational methods to identify DTI. Currently, there are two main categories of computational methods: structure-based methods and non-structure-based methods. Structure-based methods, such as molecular docking, can predict DTI as well as provide valuable biological insights by predicting the binding site of drugs to proteins [5–8]. Nevertheless, their drawbacks are evident, as they rely heavily on protein 3D structures with limited available data. Because of this situation, some non-structure-based methods have been proposed that rely on a large amount of known biological data [9–12]. These methods analyze and extract features from the data and then use modeling to make the prediction. Based on the modeling approach, they can be classified into three categories: graph-based methods, network-based methods, and machine learning-based methods, among which machine learning-based methods are the most widely used. For instance, Madhukar et al. utilized a Bayesian approach to integrate various types of data in an unbiased manner for predicting DTI [13]. Piazza et al. investigated the potential of detecting kinase or phosphatase inhibitors and membrane protein drugs using a machine learning-based framework, and identified a previously unknown target that could be used for antibacterial purposes [14]. Despite their relative effectiveness, non-structure-based methods still have several limitations. One major drawback of most of these methods is their disregard for the binding affinity values associated with drug-target interactions; instead, they primarily focus on the binary classification problem. Moreover, without utilizing structural information, the interpretability of these methods remains restricted.

In practical applications, it is not only important to determine whether drugs and targets can bind to each other, but also to know the binding affinity between them, which indicates the strength of their interaction [15]. The binding affinity between drugs and targets is commonly quantified by indicators including the dissociation constant (Kd), the inhibition constant (Ki), or the half-maximal inhibitory concentration (IC50), with lower values indicating stronger binding. Over the last few years, some computational methods have been developed to predict the binding affinity between drugs and targets [16–18]. However, due to the inherent complexity of this problem, it remains a challenging task that continues to receive much attention.

Recently, deep learning has shown remarkable performance in various fields such as image classification and speech recognition [19,20]. In the field of bioinformatics, many researchers have started to apply deep learning to predict drug-target interactions and have achieved outstanding results [21–23]. These deep learning methods usually first encode compounds and proteins to make them processable by computers, then extract potential features of compounds and proteins through various network models and their combinations. Finally, the learned compound features and protein features are combined for the final prediction.

In this research, we propose an end-to-end deep learning model called MSF-DTI, which is based on protein multi-scale feature fusion, for DTI and binding affinity prediction. Proteins, as biological macromolecules, contain rich biological information in their primary sequences, which is not only crucial for protein structure and function but also important for forming interactions with other molecules. However, many studies only extract protein sequence features at a single scale over the entire sequence, which often fails to capture important local information and leads to insufficient protein representation, causing bottlenecks for downstream predictions. To address this issue, we use an improved feature pyramid network to learn multi-scale features of proteins and design an attention-based feature fusion module to integrate the features of drug molecules with the multi-scale features of proteins. Finally, the fused features are used for DTI and binding affinity prediction.

We tested our MSF-DTI model on different datasets for DTI prediction and binding affinity prediction. Comparative analysis reveals the superior performance of our proposed method. These findings indicate that our feature pyramid network-based representation module can effectively learn the multi-scale features of protein sequences, greatly enriching the semantic information of proteins. Moreover, the fusion module can effectively combine drug and protein features, which is helpful for modeling interactions. All of these findings indicate that our model is effective in predicting DTI and binding affinity.


Fig. 1. Architecture of MSF-DTI.

2 Materials and Methods

In this part, we elaborate on the details of the model. As shown in Fig. 1, the model consists of four parts: the drug representation learning branch, the protein representation learning branch, the feature fusion module, and the prediction module. The model takes the drug and protein sequences as input, first encoding the drug molecules into character codes and the proteins into fixed-length subsequences. The two inputs are then processed through their respective branches to extract latent features. We integrate the features extracted from both branches and combine them with the drug molecule's fingerprint features to make the final prediction of DTI and binding affinity.

2.1 Drug Representation Learning Branch

We used the Simplified Molecular Input Line Entry System (SMILES) to represent the input of each compound molecule. SMILES text is composed of 64 unique characters, and we transformed each compound SMILES into a corresponding label sequence by encoding each character. For example, the label encoding of "CC1=C" is {42 42 35 40 42}. For the convenience of model processing, we transform drugs of different lengths into label sequences of fixed length 100, padding shorter sequences with 0 and truncating longer ones beyond 100.

We employed a convolution-based network to learn the sequence features of the compounds, which consists of an embedding layer and three stacked one-dimensional convolutional layers. Each convolutional layer has different sizes and numbers of kernels, which directly affect the types of features the model learns from the input. It has been reported that increasing the number of filters can lead to better pattern recognition performance [24]. Each convolutional layer is directly followed by a Rectified Linear Unit (ReLU) [25] as the activation function, which improves the model's nonlinear learning ability.

Furthermore, to enhance the feature representation of compounds, we introduced extended connectivity fingerprints (ECFPs), a commonly used fingerprint type for characterizing molecular substructures [26]. We utilized the RDKit toolkit to obtain the molecular fingerprint of each compound, which was subsequently fed into an artificial neural network to generate its feature representation.

Table 1. Classification of amino acids based on amino acid side chains and dipole volume.

Cluster  Amino acids
a        A, G, V
b        I, L, F, P
c        Y, M, T, S
d        H, N, Q, W
e        R, K
f        D, E
g        C
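Returning to the drug branch, the following is a minimal sketch of the SMILES label encoding described above (character-to-index mapping, zero padding, and truncation to length 100). The CHAR2IDX table is a hypothetical stand-in for the paper's 64-symbol dictionary, so the resulting codes differ from the example {42 42 35 40 42}.

```python
import numpy as np

# Hypothetical character-to-index table; the paper uses a fixed dictionary of
# 64 SMILES symbols whose exact codes are not reproduced here.
CHAR2IDX = {ch: i + 1 for i, ch in enumerate("CNOSPFIBrcl()[]=#@+-.1234567890")}

MAX_DRUG_LEN = 100

def encode_smiles(smiles: str) -> np.ndarray:
    """Map each SMILES character to an integer label, then pad/truncate to 100."""
    # Unknown characters fall back to 0 here, which coincides with the padding code.
    codes = [CHAR2IDX.get(ch, 0) for ch in smiles[:MAX_DRUG_LEN]]
    codes += [0] * (MAX_DRUG_LEN - len(codes))   # zero-pad shorter sequences
    return np.asarray(codes, dtype=np.int64)

print(encode_smiles("C1=CC=C(C=C1)C(=O)Cl")[:12])
```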

2.2 Protein Representation Learning Branch

Protein sequences also need to be encoded into a form that the model can process. There are 20 known standard amino acids, which are divided into 7 categories based on the size of the amino acid side chains and dipole volumes [27], as shown in Table 1. Non-standard amino acids that may appear in the data are assigned to an eighth category. These 8 categories, grouped into sets of 3 category labels, form a dictionary of size 512. We divide the protein sequence into overlapping subsequence fragments of length 3 and map these subsequences to the codes of their corresponding categories, in order to capture potential protein subsequence patterns. For example, for the protein sequence MSHH...SFK, the set of subsequences is {MSH, SHH, . . . , SFK}, the set of corresponding categories is {ccd, cdd, . . . , cbe}, and the corresponding codes are {148, 156, . . . , 141}. Therefore, each protein sequence is encoded based on its subsequences, and an embedding layer generates the protein's initial feature matrix. Since the length of protein sequences varies, we set the maximum length to 1200, pad shorter sequences with 0, and truncate the part beyond the maximum length.
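A sketch of the subsequence encoding just described, assuming the Table 1 grouping and a base-8 index over the three cluster labels; this particular formula happens to reproduce the example codes {148, 156, 141} quoted in the text, but the authors' exact dictionary is not spelled out, so treat it as illustrative.

```python
import numpy as np

CLUSTERS = {  # Table 1 grouping; non-standard residues fall into cluster 'h'
    **{aa: 'a' for aa in "AGV"},  **{aa: 'b' for aa in "ILFP"},
    **{aa: 'c' for aa in "YMTS"}, **{aa: 'd' for aa in "HNQW"},
    **{aa: 'e' for aa in "RK"},   **{aa: 'f' for aa in "DE"},
    'C': 'g',
}
CLUSTER_IDX = {c: i for i, c in enumerate("abcdefgh")}
MAX_PROT_LEN = 1200

def encode_protein(seq: str) -> np.ndarray:
    """Encode overlapping 3-mers of cluster labels into integers in [1, 512]."""
    clusters = [CLUSTERS.get(aa, 'h') for aa in seq]
    codes = []
    for i in range(len(clusters) - 2):            # overlapping 3-mers
        a, b, c = (CLUSTER_IDX[x] for x in clusters[i:i + 3])
        codes.append(a * 64 + b * 8 + c + 1)      # base-8 index; 0 is kept for padding
    codes = codes[:MAX_PROT_LEN]
    codes += [0] * (MAX_PROT_LEN - len(codes))
    return np.asarray(codes, dtype=np.int64)

print(encode_protein("MSHHWGYGKHNGPEHWHKDFPIAKGERQSPVDIDTHTA")[:10])
```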


Fig. 2. Architecture of the FPN Module in the MSF-DTI Model.

Protein sequences contain rich semantic information, and effective information extraction is an important step for downstream prediction tasks. In this experiment, we enriched the feature representation of proteins by learning a multi-scale representation of the amino acid sequence, providing more comprehensive information for modeling drug-target interactions. The feature pyramid network was first used in computer vision for object detection and semantic segmentation, mainly to solve the key problem of recognizing objects at different scales; relevant studies have shown that this network has advanced performance [28]. As the name suggests, the feature pyramid network has a hierarchical, pyramid-like structure, with each level containing different semantic features, and the features of each level are combined through a top-down path and lateral connections. In this study, we borrowed the design of the feature pyramid network to learn the feature representation of proteins at different scales, as shown in Fig. 2.

Since the protein sequence is one-dimensional, we used a bottleneck structure composed of one-dimensional convolutions as the backbone of the module to learn protein features, and set the stride to 2 to decrease the size of the sequence. The whole pyramid is divided into three levels. The initial protein feature matrix is continuously reduced in size through the bottom-up convolutional layers to gradually learn deep semantic features; the feature matrix output at the topmost level is then upsampled layer by layer along the top-down path, and the upsampled features at each layer are combined with the features at the same level of the bottom-up path. Each layer in the top-down path outputs a feature map representing the multi-scale features learned by the pyramid network, and these feature maps all have the same number of channels. The final features output by the protein representation learning module can be represented as {p1, p2, p3}.

Fig. 3. Architecture of the Fusion Module in the MSF-DTI Model.
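A minimal PyTorch sketch of a one-dimensional feature-pyramid encoder along the lines described above: three bottom-up stride-2 convolution stages, a top-down path with upsampling, and lateral 1 × 1 connections producing {p1, p2, p3} with a shared channel count. The layer widths and kernel sizes are illustrative assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ProteinFPN1d(nn.Module):
    def __init__(self, emb_dim=128, channels=(128, 256, 512), out_dim=128):
        super().__init__()
        # Bottom-up path: stride-2 1-D convolutions shrink the sequence length.
        self.c1 = nn.Conv1d(emb_dim, channels[0], 3, stride=2, padding=1)
        self.c2 = nn.Conv1d(channels[0], channels[1], 3, stride=2, padding=1)
        self.c3 = nn.Conv1d(channels[1], channels[2], 3, stride=2, padding=1)
        # Lateral 1x1 convolutions align channel counts for the top-down path.
        self.l1 = nn.Conv1d(channels[0], out_dim, 1)
        self.l2 = nn.Conv1d(channels[1], out_dim, 1)
        self.l3 = nn.Conv1d(channels[2], out_dim, 1)

    def forward(self, x):                          # x: (batch, emb_dim, seq_len)
        c1 = F.relu(self.c1(x))
        c2 = F.relu(self.c2(c1))
        c3 = F.relu(self.c3(c2))
        p3 = self.l3(c3)                                            # top level
        p2 = self.l2(c2) + F.interpolate(p3, size=c2.shape[-1])     # top-down + lateral
        p1 = self.l1(c1) + F.interpolate(p2, size=c1.shape[-1])
        return p1, p2, p3                          # multi-scale protein features

feats = ProteinFPN1d()(torch.randn(2, 128, 1200))
print([f.shape for f in feats])
```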

2.3 Feature Fusion and Prediction Module

Many deep learning models for predicting drug-target interactions or binding affinity directly concatenate the learned drug and protein features and feed them into a fully connected network for classification or regression. However, these methods do not explore the interaction relationships between compound molecules and proteins, which may result in poor prediction performance. Hence, we have devised a feature fusion module based on the self-attention mechanism [29], as illustrated in Fig. 3, aimed at extracting potential interaction relationships between compound molecules and proteins. This module provides essential information for the predictive task, with the ultimate goal of enhancing model performance.

Firstly, we use the feature vector d generated by the compound representation learning module as the query vector, use the multi-scale features {p_i}_{i=1}^{N} generated by the protein representation learning module as the key and value vectors, and transform them to the same dimensional space through single-layer neural networks:

q = LeakyReLU(w_q d)    (1)

k_i = LeakyReLU(w_{k_i} p_i)    (2)

v_i = LeakyReLU(w_{v_i} p_i)    (3)

Here, w_q ∈ R^{d×L_d}, w_{k_i} ∈ R^{d×L_{p_i}}, and w_{v_i} ∈ R^{d×L_{p_i}} are the learnable weight matrices for the query, key, and value vectors in the fusion module, where d is the unified dimension of this module and L_d and L_{p_i} denote the lengths of d and p_i, respectively. N is the number of protein features, which is 3 in this case. Then, we calculate the attention scores {α_i}_{i=1}^{N} from the dot product between the query vector and the key vectors:

α_i = softmax(q · k_i / √d)    (4)

Next, we element-wise multiply the attention scores with the corresponding value vectors and sum the results to obtain the fused feature map. Finally, we apply max pooling to the feature map to preserve the channel dimension and obtain the final fused feature vector I:

I = MaxPool1d(Σ_{i=1}^{N} α_i ⊙ v_i)    (5)
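A sketch of the fusion step in Eqs. (1)–(5), assuming the drug vector d and the multi-scale protein features have already been flattened to vectors; the dimensions and the pooling kernel size are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentiveFusion(nn.Module):
    """Drug vector attends over N multi-scale protein feature vectors (Eqs. 1-5)."""
    def __init__(self, drug_dim, prot_dims, dim=128):
        super().__init__()
        self.dim = dim
        self.wq = nn.Linear(drug_dim, dim)
        self.wk = nn.ModuleList([nn.Linear(p, dim) for p in prot_dims])
        self.wv = nn.ModuleList([nn.Linear(p, dim) for p in prot_dims])

    def forward(self, d, prots):          # d: (B, drug_dim); prots: list of (B, L_pi)
        q = F.leaky_relu(self.wq(d))                                # Eq. (1)
        ks = [F.leaky_relu(w(p)) for w, p in zip(self.wk, prots)]   # Eq. (2)
        vs = [F.leaky_relu(w(p)) for w, p in zip(self.wv, prots)]   # Eq. (3)
        scores = torch.stack([(q * k).sum(-1) for k in ks], dim=-1) / self.dim ** 0.5
        alpha = scores.softmax(dim=-1)                              # Eq. (4): weights over scales
        fused = sum(alpha[..., i:i + 1] * vs[i] for i in range(len(vs)))
        # Eq. (5): 1-D max pooling over the fused map; kernel size 2 is an assumption.
        return F.max_pool1d(fused.unsqueeze(1), kernel_size=2).squeeze(1)

fusion = AttentiveFusion(drug_dim=96, prot_dims=[128, 128, 128])
out = fusion(torch.randn(4, 96), [torch.randn(4, 128) for _ in range(3)])
print(out.shape)   # (4, 64)
```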

During the prediction phase, we concatenate the drug-target interaction features generated by the fusion module with the drug's molecular fingerprint features to form the final representation, which is then fed into a three-layer fully connected network to produce the prediction.

3 Experiments and Results

3.1 Datasets

For this experiment, we used four datasets to evaluate predictive performance on DTI and drug-target binding affinity. In the DTI problem, positive samples representing drug-target binding are labeled as 1, while negative samples are labeled as 0. We used the human and C.elegans datasets [30] for DTI, with positive samples drawn from DrugBank 4.1, Matador, and STITCH 4.0 [31–33]. The human dataset contains 3369 positive samples and the C.elegans dataset contains 4000 positive samples. Negative samples in these two datasets were obtained through a systematic screening framework that integrates various compound and protein resources and are therefore highly reliable; the human dataset contains 384916 negative samples and the C.elegans dataset contains 88261 negative samples.

In the drug-target binding affinity problem, the binding affinity values are used as the sample ground truth; binding affinity can be expressed as Kd, Ki, or IC50. In this experiment, we use two datasets proposed by Karimi et al. [34] to evaluate model performance, which measure binding affinity with Kd and Ki, respectively. As shown in Table 2, the Kd dataset contains 12,589 samples and the Ki dataset contains 144,525 samples.

Table 2. The number of distinct compounds, proteins, train and test samples of the Kd and Ki datasets

Datasets  No. of compounds  No. of proteins  No. of train  No. of test
Kd        5895              812              8778          3811
Ki        93437             1619             101134        43391

3.2 Implementation Details

Our model employs end-to-end learning to represent compound molecules and proteins. We utilize convolution to extract the potential features of compounds. For proteins, we learn multi-scale features based on a feature pyramid network to explore deep semantic information. Additionally, we use an attention-based feature fusion module to model the interaction between compounds and proteins, enriching the downstream feature representation for prediction. Finally, we predict DTI and binding affinity by integrating compound fingerprint features and fusion features. We use cross-entropy loss and mean square error loss as the loss functions for the DTI and binding affinity tasks, respectively. Our training objective is to continuously reduce the gap between predicted values and true values, and we optimize the learnable parameters through backpropagation.
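A brief sketch of the two objectives mentioned above, cross-entropy for the binary DTI task and mean squared error for affinity regression; the tensors below are dummies and the helper function is purely illustrative.

```python
import torch
import torch.nn as nn

def make_criterion(task: str) -> nn.Module:
    # Cross-entropy for binary DTI classification, MSE for affinity regression.
    return nn.CrossEntropyLoss() if task == "dti" else nn.MSELoss()

# Dummy predictions and labels, just to show the two objectives side by side.
cls_loss = make_criterion("dti")(torch.randn(8, 2), torch.randint(0, 2, (8,)))
reg_loss = make_criterion("affinity")(torch.randn(8), torch.randn(8))
print(float(cls_loss), float(reg_loss))
```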

3.3 DTI Prediction

In line with earlier research, we investigated the efficacy of the MSF-DTI approach in predicting DTI by performing five rounds of 5-fold cross validation [35]. During each round, the dataset is evenly split into 5 groups, with one group used as the test set and the other four groups as the training set; additionally, 10% of the training set is reserved for validation.

We conducted experiments on two benchmark datasets, human and C.elegans, and used the area under the receiver operating characteristic curve (AUC) and the area under the precision-recall curve (AUPR) as evaluation metrics. It is worth noting that the ratio of positive to negative samples in the human and C.elegans datasets is 1:5, resulting in a highly imbalanced dataset, which is consistent with real-world scenarios. We compared MSF-DTI with other methods, including CMF [36], BLM-NII [37], NRLMF [38], and the recently developed deep learning method CoaDTI [39], to demonstrate its effectiveness. The results of our experiments are presented in Fig. 4 and Fig. 5; the results of CMF, BLM-NII, and NRLMF are taken from Li et al.'s research [40]. As for CoaDTI, we used CoaDTIstack, which assembles attention units in a stacked manner. We note that both MSF-DTI and CoaDTIstack surpassed the other machine learning models on the two datasets, which underscores the efficacy of end-to-end representation learning in extracting latent features from compound molecules and proteins for DTI prediction. Moreover, previous studies have shown that on tasks with imbalanced data, the AUPR metric reflects model performance more accurately than AUC [41]. Our model also achieved higher AUPR than the other methods on both datasets, which demonstrates the strong competitiveness of MSF-DTI on imbalanced DTI tasks.

Fig. 4. The AUC and AUPR values of different methods on human datasets for DTI prediction.

Fig. 5. The AUC and AUPR values of different methods on C.elegans datasets for DTI prediction.
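For reference, the two metrics can be computed with scikit-learn as sketched below (AUPR approximated by average precision); the label and score arrays here are random placeholders.

```python
import numpy as np
from sklearn.metrics import roc_auc_score, average_precision_score

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=1000)          # 1 = interacting pair, 0 = non-interacting
y_score = rng.random(1000)                      # model-predicted interaction probability

auc = roc_auc_score(y_true, y_score)            # area under the ROC curve
aupr = average_precision_score(y_true, y_score) # area under the precision-recall curve
print(f"AUC={auc:.3f}  AUPR={aupr:.3f}")
```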

3.4 Binding Affinity Prediction

To assess the efficacy of MSF-DTI in predicting drug-target binding affinity, we conducted experiments on two datasets of different scales that use either Kd or Ki as the measure of binding affinity. Each dataset was divided into a training set and a test set, with 10% of the training set reserved for validation. We used the root mean squared error (RMSE) and the Pearson correlation coefficient (PCC) as evaluation metrics, and set up four comparison methods: ridge regression (Ridge), random forest (RF), DeepAffinity, and DeepPurpose [42], of which Ridge and RF are traditional machine learning methods, while DeepAffinity and DeepPurpose are deep learning methods. The results of Ridge and RF were obtained from Karimi et al.'s study [34], and we evaluated DeepAffinity and DeepPurpose with their default options. Tables 3 and 4 show the RMSE and PCC scores of each model on the Kd and Ki datasets.

Table 3. The RMSE scores of different methods on the Kd and Ki datasets

Methods       Kd    Ki
Ridge         1.24  1.27
RF            1.10  0.97
DeepAffinity  1.17  0.96
DeepPurpose   1.16  0.92
MSF-DTI       1.07  0.88

Table 4. The PCC scores of different methods on the Kd and Ki datasets

Methods       Kd    Ki
Ridge         0.60  0.58
RF            0.70  0.78
DeepAffinity  0.68  0.79
DeepPurpose   0.69  0.81
MSF-DTI       0.74  0.83

As is commonly known, RMSE is a measure frequently used to indicate the deviation between predicted and true values, while PCC reflects their similarity; both indicators are often used in drug-target binding affinity prediction. From the results, the deep learning methods perform better than the traditional machine learning methods in terms of RMSE, indicating that deep learning can improve fitting ability by learning the latent features of the data more effectively. Based on these two evaluation metrics, our proposed MSF-DTI model excels compared to the other methods on both the small-scale Kd dataset and the large-scale Ki dataset, underscoring its effectiveness on binding affinity tasks.
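A small sketch of the two evaluation metrics, using NumPy; the affinity values below are made-up placeholders.

```python
import numpy as np

def rmse(y_true, y_pred):
    return float(np.sqrt(np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2)))

def pcc(y_true, y_pred):
    return float(np.corrcoef(y_true, y_pred)[0, 1])   # Pearson correlation coefficient

y_true = np.array([5.1, 6.3, 7.8, 4.9, 6.0])          # placeholder affinity labels
y_pred = np.array([5.4, 6.0, 7.5, 5.2, 6.2])
print(rmse(y_true, y_pred), pcc(y_true, y_pred))
```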

4 Conclusion

The discovery of DTI is crucial in drug development, but the experimental identification of such interactions is a time-consuming and expensive process, despite the availability of various high-throughput systems biology screens. The advancement of machine learning and deep learning has brought forth novel insights for tackling such challenges. With a substantially greater volume of known protein sequence data than of available three-dimensional structures, sequence-based deep learning approaches have gained prominence in recent years. In this paper, we propose an end-to-end deep learning model that only uses the SMILES of compound molecules and the primary sequence of proteins to predict DTI and binding affinity. We use a convolutional neural network with convolutional kernels of different sizes to extract deep features of compounds, a feature pyramid-based network module to extract multi-scale features of proteins, and a designed fusion module to explore the potential interactions between compounds and proteins. Experimental results indicate that our model achieves competitive performance in both DTI classification and binding affinity prediction. It is worth noting that the scope of real-world data is much broader than what is used in experiments; consequently, enhancing the model's ability to generalize and address real-world problems remains a significant challenge. Moreover, determining whether the model can yield insights into the mechanism of action during inference poses another obstacle to overcome. In our future research, we will actively explore methods to enhance the model's generalization capabilities and improve its interpretability.

References 1. Paul, S.M., et al.: How to improve R&D productivity: the pharmaceutical industry’s grand challenge. Nat. Rev. Drug Disc. 9(3), 203–214 (2010) 2. Chu, L.-H., Chen, B.-S.: Construction of cancer-perturbed protein–protein interaction network of apoptosis for drug target discovery. In: Choi, S. (ed.) Systems Biology for Signaling Networks. SB, pp. 589–610. Springer, New York (2010). https:// doi.org/10.1007/978-1-4419-5797-9_24 3. Ricke, D.O., Wang, S., Cai, R., Cohen, D.: Genomic approaches to drug discovery. Curr. Opin. Chem. Biol. 10(4), 303–308 (2006) 4. Bakheet, T.M., Doig, A.J.: Properties and identification of human protein drug targets. Bioinformatics 25(4), 451–457 (2009)


5. Xie, L., Xie, L., Kinnings, S.L., Bourne, P.E.: Novel computational approaches to polypharmacology as a means to define responses to individual drugs. Annu. Rev. Pharmacol. Toxicol. 52, 361–379 (2012) 6. Śledź, P., Caflisch, A.: Protein structure-based drug design: from docking to molecular dynamics. Curr. Opin. Struct. Biol. 48, 93–102 (2018) 7. Gschwend, D.A., Good, A.C., Kuntz, I.D.: Molecular docking towards drug discovery. J. Mol. Recogn. Interdiscipl. J. 9(2), 175–186 (1996) 8. Trott, O., Olson, A.J.: Autodock vina: improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multithreading. J. Comput. Chem. 31(2), 455–461 (2010) 9. Durrant, J.D., McCammon, J.A.: Nnscore 2.0: a neural-network receptor-ligand scoring function. J. Chem. Inf. Model. 51(11), 2897–2903 (2011) 10. Ding, H., Takigawa, I., Mamitsuka, H., Zhu, S.: Similarity-based machine learning methods for predicting drug-target interactions: a brief review. Brief. Bioinform. 15(5), 734–747 (2014) 11. Wan, F., Hong, L., Xiao, A., Jiang, T., Zeng, J.: Neodti: neural integration of neighbor information from a heterogeneous network for discovering new drug-target interactions. Bioinformatics 35(1), 104–111 (2019) 12. Yuvaraj, N., Srihari, K., Chandragandhi, S., Raja, R.A., Dhiman, G., Kaur, A.: Analysis of protein-ligand interactions of sars-cov-2 against selective drug using deep neural networks. Big Data Min. Anal. 4(2), 76–83 (2021) 13. Madhukar, N.S., et al.: A Bayesian machine learning approach for drug target identification using diverse data types. Nat. Commun. 10(1), 5221 (2019) 14. Piazza, I., et al.: A machine learning-based chemoproteomic approach to identify drug targets and binding sites in complex proteomes. Nat. Commun. 11(1), 4200 (2020) 15. Pahikkala, T., et al.: Toward more realistic drug-target interaction predictions. Brief. Bioinform. 16(2), 325–337 (2015) 16. Shar, P.A., et al.: Pred-binding: large-scale protein-ligand binding affinity prediction. J. Enzyme Inhib. Med. Chem. 31(6), 1443–1450 (2016) 17. Gabel, J., Desaphy, J., Rognan, D.: Beware of machine learning-based scoring functions on the danger of developing black boxes. J. Chem. Inf. Model. 54(10), 2807–2815 (2014) 18. He, T., Heidemeyer, M., Ban, F., Cherkasov, A., Ester, M.: Simboost: a readacross approach for predicting drug-target binding affinities using gradient boosting machines. J. Cheminf. 9(1), 1–14 (2017) 19. Nassif, A.B., Shahin, I., Attili, I., Azzeh, M., Shaalan, K.: Speech recognition using deep neural networks: a systematic review. IEEE Access 7, 19143–19165 (2019) 20. Pak, M., Kim, S.: A review of deep learning in image recognition. In: 2017 4th International Conference on Computer Applications and Information Processing Technology (CAIPT), pp. 1–3. IEEE (2017) 21. Öztürk, H., Özgür, A., Ozkirimli, E.: Deepdta: deep drug-target binding affinity prediction. Bioinformatics 34(17), i821–i829 (2018) 22. Wang, J., Wen, N., Wang, C., Zhao, L., Cheng, L.: Electra-dta: a new compoundprotein binding affinity prediction model based on the contextualized sequence encoding. J. Cheminform. 14(1), 1–14 (2022) 23. Li, F., Zhang, Z., Guan, J., Zhou, S.: Effective drug-target interaction prediction with mutual interaction neural network. Bioinformatics 38(14), 3582–3589 (2022) 24. Kang, L., Ye, P., Li, Y., Doermann, D.: Convolutional neural networks for noreference image quality assessment. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1733–1740 (2014)


25. Nair, V., Hinton, G.E.: Rectified linear units improve restricted Boltzmann machines. In: Proceedings of the 27th International Conference on Machine Learning (ICML-10), pp. 807–814 (2010) 26. Rogers, D., Hahn, M.: Extended-connectivity fingerprints. J. Chem. Inf. Model. 50(5), 742–754 (2010) 27. Shen, J., et al.: Predicting protein-protein interactions based only on sequences information. Proc. Natl. Acad. Sci. 104(11), 4337–4341 (2007) 28. Lin, T.-Y., Dollár, P., Girshick, R., He, K., Hariharan, B., Belongie, S.: Feature pyramid networks for object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2117–2125 (2017) 29. Vaswani, A., et al.: Attention is all you need. In: Advances in Neural Information Processing Systems, vol. 30 (2017) 30. Liu, H., Sun, J., Guan, J., Zheng, J., Zhou, S.: Improving compound-protein interaction prediction by building up highly credible negative samples. Bioinformatics 31(12), i221–i229 (2015) 31. Wishart, D.S., et al.: Drugbank: a knowledgebase for drugs, drug actions and drug targets. Nucl. Acids Res. 36(suppl-1), D901–D906 (2008) 32. Günther, S., et al.: Supertarget and matador: resources for exploring drug-target relationships. Nucl. Acids Res. 36(suppl-1), D919–D922 (2007) 33. Kuhn, M., et al.: Stitch 4: integration of protein-chemical interactions with user data. Nucl. Acids Res. 42(D1), D401–D407 (2014) 34. Karimi, M., Di, W., Wang, Z., Shen, Y.: Deepaffinity: interpretable deep learning of compound-protein affinity through unified recurrent and convolutional neural networks. Bioinformatics 35(18), 3329–3338 (2019) 35. Tsubaki, M., Tomii, K., Sese, J.: Compound-protein interaction prediction with end-to-end learning of neural networks for graphs and sequences. Bioinformatics 35(2), 309–318 (2019) 36. Zheng, X., Ding, H., Mamitsuka, H., Zhu, S.: Collaborative matrix factorization with multiple similarities for predicting drug-target interactions. In: Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1025–1033 (2013) 37. Mei, J.-P., Kwoh, C.-K., Yang, P., Li, X.-L., Zheng, J.: Drug-target interaction prediction by learning from local information and neighbors. Bioinformatics 29(2), 238–245 (2013) 38. Liu, Y., Min, W., Miao, C., Zhao, P., Li, X.-L.: Neighborhood regularized logistic matrix factorization for drug-target interaction prediction. PLoS Comput. Biol. 12(2), e1004760 (2016) 39. Huang, L., et al.: Coadti: multi-modal co-attention based framework for drug-target interaction annotation. Brief. Bioinform. 23(6), bbac446 (2022) 40. Li, M., Zhangli, L., Yifan, W., Li, Y.: Bacpi: a bi-directional attention neural network for compound-protein interaction and binding affinity prediction. Bioinformatics 38(7), 1995–2002 (2022) 41. Davis, J., Goadrich, M.: The relationship between precision-recall and ROC curves. In: Proceedings of the 23rd International Conference on Machine Learning, pp. 233–240 (2006) 42. Huang, K., Fu, T., Glass, L.M., Zitnik, M., Xiao, C., Sun, J.: Deeppurpose: a deep learning library for drug-target interaction prediction. Bioinformatics 36(22–23), 5545–5547 (2020)

GoatPose: A Lightweight and Efficient Network with Attention Mechanism

Yaxuan Sun1,2, Annan Wang1, and Shengxi Wu1,2(B)

1 East China University of Science and Technology, Shanghai, China
2 School of Information Science and Engineering, ECUST, Shanghai, China
[email protected]

Abstract. Keypoint detection is an essential part of human pose estimation. However, due to resource constraints, it is still a challenge to deploy complex convolutional networks on edge devices. In this paper, we present GoatPose, a lightweight deep convolutional model for real-time human keypoint detection that incorporates an attention mechanism. Since the high computational cost is associated with the frequently-used convolution block, we substitute it with our newly designed LiteConv block, which conducts cheap linear operations to generate rich feature maps from the intrinsic features at low cost. This method significantly accelerates the model while inevitably losing part of the spatial information. To compensate for the information loss, we introduce the NAM attention mechanism. By applying channel weighting, the model can focus on the important features and enhance the feature representation. Results on the COCO dataset show the superiority of our model: with the complexity reduced by half and the computational speed doubled, its accuracy is essentially the same as that of the backbone model. We further deploy our model on NVIDIA Jetson TX2 to validate its real-time performance, indicating that our model is capable of being deployed and widely adopted in real-world scenarios.

Keywords: Human key point detection · Lightweight network · High-resolution representation · Attention mechanism · Model deployment

1 Introduction

Human pose estimation aims to accurately detect the keypoints of the human body, such as the head, shoulders, elbows, wrists, knees, and ankles. The detection task has broad applications, including motion capture, human-computer interaction, and pedestrian tracking. Deep learning has been successfully applied to the human keypoint detection task; however, it is still a challenge to deploy complex networks on edge devices due to resource constraints. Lightweight models are needed for human keypoint detection in practical applications. One popular approach to designing lightweight networks is to borrow techniques such as depth-wise separable convolution and channel shuffling from classification networks [1–3] to reduce computational redundancy. Model compression is also a common approach to generate lightweight models [4–9].

Fig. 1. AP and Speed on MSCOCO val set

In this paper, we present a lightweight high-resolution network, GoatPose. Using the high-resolution network HRNet [10] as our backbone, GoatPose maintains high-resolution representations throughout the whole process, therefore capturing strong semantic information. Considering that the frequently used convolutions excessively consume computational resources, we redesign the convolution block, incorporating a cheap linear operation into it, which can generate rich feature maps from the intrinsic features at little cost [11,12]. To compensate for the information loss caused by the linear operation, we introduce an attention mechanism. By weighting the channels, the model can adaptively find areas that need attention and highlight important features while suppressing irrelevant ones [13–15]. Our model demonstrates superior results on the COCO keypoint detection dataset [16], as shown in Fig. 1. We further deploy our model on NVIDIA Jetson TX2 to validate its real-time performance. It turns out that our model achieves a high AP of 74.5% with only 3.48 GFLOPs, outperforming prior state-of-the-art efficient pose estimation models. We believe our work will push the frontier of real-time pose estimation on the edge. Our main contributions include:

• We propose a lightweight network GoatPose that generates high-resolution feature maps at low cost. The key of the model lies in the introduction of the cheap operation and the attention mechanism.
• We demonstrate the effectiveness of GoatPose on the COCO dataset. Our model outperforms all other models and achieves excellent results, reaching high average precision while maintaining low computational and memory consumption.
• We deploy our model on NVIDIA Jetson TX2 to validate its real-time performance. The results demonstrate that our model is remarkably efficient in practical applications and can be easily generalized to the human keypoint detection task.

2 Related Work

High-Resolution Representation. Tasks that require position-sensitive information rely on high-resolution representations. There are two mainstream approaches. One is to employ a high-resolution recovery process to enhance the representation resolution, achieved by improving the low-resolution output of a classification network [17–22] through sequentially connected convolutions, typically upsampling. The other is to replace the downsampling and normal convolution layers with dilated convolutions [23–31], which significantly increases the computational complexity and number of parameters. HRNet [10,32] was proposed as an efficient way to maintain high-resolution representations throughout the network. HRNet consists of parallel multi-resolution branches; by repeated multi-scale fusion, it can generate high-resolution representations with rich semantics.

Model Lightweighting. Lightweight models are needed for practical real-time applications. Separable convolutions and group convolutions, derived from classification networks [2,3,31,33–38], are commonly used techniques to reduce computational redundancy. MobileNetV3 [3] is built upon depthwise separable convolutions and introduces the inverted residual structure to construct lightweight networks. ShuffleNetV2, on the other hand, incorporates pointwise group convolutions and channel shuffling to maintain model performance. Although these models achieve relatively robust performance with fewer computations, they do not fully explore the redundancy between feature maps to further compress the model. In contrast, the Ghost module [11], proposed by Kai Han et al., can generate additional feature maps from the intrinsic features through cheap linear operations, significantly conserving computational and storage costs.

Attention Mechanisms. Network lightweighting inevitably leads to spatial information loss. Incorporating an attention mechanism into convolutional networks allows the model to adaptively allocate weights to different regions, therefore compensating for the loss of information to some extent. Prior studies on attention mechanisms focus on enhancing the performance of neural networks by suppressing insignificant weights [13–15,39] and lack consideration of the contribution factor of the weights, which could further suppress insignificant features. In contrast, NAM [40] uses batch normalization scaling factors to represent the importance of weights and fully exploits their contribution factor to enhance attention, which not only lowers network complexity and computational cost, but also leads to improved efficiency and model performance.

3 Approach

3.1 Parallel Multi-branch Architecture

The model we propose in this paper uses HRNet [10] as the backbone. HRNet is a high-resolution convolutional neural network capable of capturing high semantic information while maintaining high-resolution representations. It is characterized by two key features: parallel multi-resolution representations and repeated multi-resolution fusion.

Parallel Multi-resolution Representation. Starting with a high-resolution representation as the first stage, each subsequent stage keeps the resolution representations of the previous stage and expands a lower-resolution representation, as shown in Fig. 2. The resolutions of the four representations are 1/4, 1/8, 1/16, and 1/32, respectively. Representations of different resolutions are connected in parallel to avoid information loss, and feature extraction is performed simultaneously on multiple scales.

Fig. 2. Parallel Multi-resolution Representation

Repeated Multi-resolution Fusion. The purpose of fusion is to exchange information between representations of different resolutions. Take the fusion of two resolution representations as an example: the input consists of two representations {R^i_r, r = 1, 2}, where r is the resolution index, and the associated output representations are {R^o_r, r = 1, 2}. Each output representation is the sum of the transformed representations of the two inputs: R^o_r = f_{1r}(R^i_1) + f_{2r}(R^i_2). From stage 2 to stage 3 there is an additional output, the expanded lower-resolution representation: R^o_3 = f_{13}(R^i_1) + f_{23}(R^i_2). The choice of the transform function f_{xr}(·) depends on the input resolution index x and the output resolution index r. If x = r, f_{xr}(R) = R. If x < r, f_{xr}(R) downsamples the input representation R through (r − x) stride-2 3 × 3 convolutions. If x > r, f_{xr}(R) upsamples the input representation R through bilinear upsampling followed by a 1 × 1 convolution to align the number of channels. The functions are depicted in Fig. 3.

Fig. 3. Repeated Multi-resolution Fusion
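A sketch of the transform f_xr described above, assuming PyTorch modules: downsampling stacks (r − x) stride-2 3 × 3 convolutions and upsampling uses bilinear interpolation followed by a 1 × 1 convolution. The channel convention (C · 2^r channels at resolution index r) is an assumption consistent with HRNet-style streams.

```python
import torch
import torch.nn as nn

def make_transform(x: int, r: int, ch: int = 32) -> nn.Module:
    """Build f_xr mapping a resolution-x stream (ch * 2**x channels) to resolution r."""
    in_ch, out_ch = ch * 2 ** x, ch * 2 ** r
    if x == r:                                    # identity
        return nn.Identity()
    if x < r:                                     # downsample with (r - x) stride-2 3x3 convs
        layers, c = [], in_ch
        for i in range(r - x):
            nxt = out_ch if i == r - x - 1 else c * 2
            layers += [nn.Conv2d(c, nxt, 3, stride=2, padding=1),
                       nn.BatchNorm2d(nxt), nn.ReLU()]
            c = nxt
        return nn.Sequential(*layers)
    # x > r: bilinear upsampling by 2**(x - r), then a 1x1 conv to align channels
    return nn.Sequential(
        nn.Upsample(scale_factor=2 ** (x - r), mode="bilinear", align_corners=False),
        nn.Conv2d(in_ch, out_ch, 1))

f = make_transform(x=2, r=0)                      # e.g. 1/16-resolution stream up to 1/4
print(f(torch.randn(1, 128, 16, 12)).shape)       # -> (1, 32, 64, 48)
```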

3.2 Model Lightweighting

Considering the limited resources of embedded devices, model lightweighting is necessary to deploy convolutional neural networks on the edge. By replacing complex convolution operations with cheap linear operations, the amount of computation can be greatly reduced [11]. Given the input data X ∈ R^{c×h×w}, the operation of an arbitrary convolutional layer producing n feature maps can be formulated as Eq. 1:

Y = X ∗ f + b    (1)

where ∗ is the convolution operation, b is the bias term, Y ∈ R^{h′×w′×n} is the output feature map, f ∈ R^{c×k×k×n} are the convolution filters, and k is the kernel size. Taking into account the similarity between the generated feature maps, these redundant feature maps are called ghost feature maps. Supposing that the ghost feature maps can be easily transformed from a smaller set of intrinsic feature maps, we can modify the above equation as Eq. 2 and Eq. 3:

Y′ = X ∗ f′    (2)

y_{ij} = Φ_{i,j}(y′_i),  ∀i = 1, . . . , m,  j = 1, . . . , s    (3)

where f′ ∈ R^{c×k×k×m} are the utilized filters with m ≤ n (the bias term is omitted for simplicity), y′_i is the i-th intrinsic feature map in Y′, and Φ_{i,j} is the j-th linear operation used to generate the j-th ghost feature map y_{ij}. The last Φ_{i,s} is the identity mapping that preserves the intrinsic feature maps. In total we obtain n = m · s feature maps, as shown in Fig. 4.


Fig. 4. Cheap Operation

3.3 Adaptive Attention Mechanism

In order to improve the accuracy and robustness of the model without increasing the complexity of the network, we introduce the adaptive attention mechanism NAM. For the channel attention sub-module, NAM uses the scaling factor from batch normalization (BN) to reflect the importance of each channel, as shown in Eq. 4:

B_out = BN(B_in) = γ · (B_in − μ_B) / √(σ_B² + ε) + β    (4)

where μ_B and σ_B are the mean and standard deviation of the mini-batch B, respectively, and γ and β are trainable affine transformation parameters for scaling and translation, so that the model can adaptively adjust the activation range and central location. The process of weighting channels is shown in Fig. 5. For each channel i, γ_i is the scaling factor and its weight is W_i = γ_i / Σ_j γ_j.

In addition, NAM introduces L1 regularization as a weight sparsity penalty in the loss function, as shown in Eq. 5, where x is the input, y is the output, W denotes the network weights, l(·) is the loss function, g(·) is the L1-norm penalty function, and p balances the g(γ) and g(λ) penalty terms. By introducing sparsity constraints on the model weights, the model is prompted to focus on more important features while reducing unnecessary computation.

Fig. 5. Channel Attention Submodule

LOSS = Σ_{(x,y)} l(f(x, W), y) + p Σ g(γ) + p Σ g(λ)    (5)
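A sketch of the NAM-style channel attention in Eq. (4) and the weighting W_i = γ_i / Σ_j γ_j, assuming a BatchNorm2d whose affine scale γ supplies the per-channel importance and a sigmoid gate as described for the NAM-C module in Sect. 3.4; this is an illustration, not the authors' exact implementation.

```python
import torch
import torch.nn as nn

class NAMChannel(nn.Module):
    """NAM-style channel attention: BN scaling factors re-weight the channels."""
    def __init__(self, channels: int):
        super().__init__()
        self.bn = nn.BatchNorm2d(channels, affine=True)

    def forward(self, x):
        y = self.bn(x)                                   # Eq. (4): batch-normalized features
        gamma = self.bn.weight.abs()                     # per-channel scaling factors
        w = gamma / gamma.sum()                          # W_i = gamma_i / sum_j gamma_j
        y = y * w.view(1, -1, 1, 1)                      # re-weight the channels
        return torch.sigmoid(y) * x                      # gate the original features

att = NAMChannel(64)
print(att(torch.randn(2, 64, 32, 24)).shape)
```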

3.4 GoatPose

We propose GoatPose as a lightweight high-resolution network. Using HRNet as the backbone, we redesign the convolution unit and introduce the cheap operation to reduce the number of parameters. To compensate for the information loss, we incorporate the NAM attention mechanism into the architecture. "Goat" symbolizes that our model runs as quickly and lightly as a goat; just as a goat's two horns stand out, certain channels receive more attention due to their higher importance for the task.

Design of Convolution Block. We redesign the convolution block by incorporating the cheap linear operation, and name this modified block LiteConv, as shown in Fig. 6. For each LiteConv block, the input X first passes through a normal convolution followed by BatchNorm and ReLU, and the resulting feature maps are denoted as X1. Then, we conduct the cheap linear operation on the intrinsic feature maps X1 to generate the ghost feature maps X2, using a sequence of a depthwise separable convolution, BatchNorm, and ReLU. X2 is concatenated with X1 as the final output.

Fig. 6. LiteConv Block
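A sketch of the LiteConv block as described: a primary convolution produces the intrinsic maps X1, a cheap depthwise convolution produces the ghost maps X2, and the two are concatenated. The half/half split (s = 2) is an assumption consistent with the roughly halved complexity reported later.

```python
import torch
import torch.nn as nn

class LiteConv(nn.Module):
    """Primary conv -> intrinsic maps X1; cheap depthwise conv -> ghost maps X2; concat."""
    def __init__(self, in_ch: int, out_ch: int, kernel: int = 3, stride: int = 1):
        super().__init__()
        init_ch = out_ch // 2                        # intrinsic channels (s = 2 assumed)
        self.primary = nn.Sequential(
            nn.Conv2d(in_ch, init_ch, kernel, stride, kernel // 2, bias=False),
            nn.BatchNorm2d(init_ch), nn.ReLU(inplace=True))
        self.cheap = nn.Sequential(                  # depthwise conv as the linear operation
            nn.Conv2d(init_ch, out_ch - init_ch, 3, 1, 1, groups=init_ch, bias=False),
            nn.BatchNorm2d(out_ch - init_ch), nn.ReLU(inplace=True))

    def forward(self, x):
        x1 = self.primary(x)                         # intrinsic feature maps X1
        x2 = self.cheap(x1)                          # ghost feature maps X2 from X1
        return torch.cat([x1, x2], dim=1)            # final output

block = LiteConv(32, 64)
print(block(torch.randn(1, 32, 64, 48)).shape)       # -> (1, 64, 64, 48)
```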

Compensate for the Information Loss. Considering that replacing convolutions with linear operations for feature map generation inevitably leads to information loss, we introduce the NAM attention mechanism to compensate for it. The NAM module is inserted after the stack of Basic Blocks to augment the feature representation in time.

Lightweight High-Resolution Network. GoatPose consists of four stages. Starting from a high-resolution convolutional stream W × H × C, the first stage includes 4 bottlenecks for extracting image features. A 3 × 3 convolution is then applied to generate an additional half-resolution stream W/2 × H/2 × 2C, and the two convolutional streams are passed to the next stage in parallel.

The second, third, and fourth stages consist of 1, 4, and 3 Stage Modules, respectively. Each Stage Module contains parallel multi-resolution convolutional streams and consists of Basic Blocks, the channel attention module NAM-C, and a fusion unit, as shown in Fig. 7. The convolutional streams of different resolutions first pass through four Basic Blocks and then go through the NAM-C module; finally, they are fused with the other convolutional streams through the fusion unit. The Basic Block contains two LiteConvs followed by BN and ReLU. In the NAM-C module, the weighted feature maps are obtained by element-wise multiplication of the weights with the input and are scaled by a sigmoid function to give the adjusted feature maps. The fusion unit connects the outputs of different resolutions in a fully connected manner, as illustrated in the previous example. Transition modules are used between stages to expand the convolutional streams, adding an additional parallel convolutional stream with halved resolution using a 3 × 3 convolution; this stream is concatenated with the convolutional streams from the previous stage and serves as the input of the next stage.

Fig. 7. GoatPose Architecture

4 Experiments

4.1 Dataset

Microsoft COCO. Our model is evaluated on the popular MS COCO dataset for human pose estimation, which contains an extensive collection of approximately 200,000 images and 250,000 person instances. We use the train2017 set, consisting of 57,000 images, for training, and the val2017 set, containing 5,000 images, for evaluation.

Evaluation Metric. In the MS COCO 2017 dataset, the standard evaluation metric is based on Object Keypoint Similarity (OKS), as shown in Eq. 6:

OKS = Σ_i exp(−d_i² / (2 s² k_i²)) δ(v_i > 0) / Σ_i δ(v_i > 0)    (6)

where d_i is the Euclidean distance between a detected keypoint and its corresponding ground-truth position, s is the object scale, k_i is a per-keypoint constant that controls falloff, and v_i denotes the visibility flag of keypoint i. We report standard average precision and recall scores, as shown in Fig. 8a: AP50 (AP at OKS = 0.50), AP75, AP (the mean of AP scores at OKS = 0.50, 0.55, . . . , 0.90, 0.95), APM for medium objects, APL for large objects, and AR (the mean of recalls at OKS = 0.50, 0.55, . . . , 0.90, 0.95).
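A small sketch of Eq. (6); the per-keypoint constants k_i and the coordinates below are placeholders rather than the official COCO values.

```python
import numpy as np

def oks(pred, gt, vis, k, s):
    """Object Keypoint Similarity (Eq. 6).
    pred, gt: (K, 2) keypoint coordinates; vis: (K,) visibility flags;
    k: (K,) per-keypoint falloff constants; s: object scale."""
    d2 = np.sum((pred - gt) ** 2, axis=1)                 # squared distances d_i^2
    e = np.exp(-d2 / (2 * s ** 2 * k ** 2))
    mask = vis > 0
    return float(e[mask].sum() / max(mask.sum(), 1))

K = 17
rng = np.random.default_rng(0)
gt = rng.random((K, 2)) * 100
pred = gt + rng.normal(scale=3.0, size=(K, 2))
print(oks(pred, gt, vis=np.ones(K), k=np.full(K, 0.05), s=80.0))
```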

4.2 Setting

The network is trained on an NVIDIA GeForce RTX 3080 (12 GB) with CUDA 12.0. We train the model for a total of 280 epochs, using the Adam optimizer with an initial learning rate of 0.001 and a decay factor of 0.4. The base learning rate drops to 0.0004 and 0.00016 at the 170th and 230th epochs, and is decayed again at the 260th epoch, as shown in Fig. 8b.

Fig. 8. Accuracy and Loss
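A sketch of the training schedule described above, assuming a standard Adam optimizer with a MultiStepLR scheduler at the stated milestones; the model here is a placeholder module, not GoatPose itself.

```python
import torch
import torch.nn as nn

model = nn.Conv2d(3, 32, 3)                     # placeholder for GoatPose
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[170, 230, 260], gamma=0.4)   # 0.001 -> 0.0004 -> 0.00016 -> ...

for epoch in range(280):
    # ... the training loop over the COCO train2017 loader would go here ...
    optimizer.step()        # dummy step so the scheduler advances cleanly
    scheduler.step()
print(optimizer.param_groups[0]["lr"])
```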

4.3 Results

Comparisons on the COCO Dataset. By adjusting the resolution of the input image and the number of channels, we construct four versions of GoatPose, each with the same hyperparameter configuration. Here, B means that the model has 32 channels and X means that it has 48 channels; L means the input size is 256 × 192 and H means the input size is 384 × 288. In the following, we mainly analyze the performance improvement of GoatPose-BL.

Table 1 reports the comparison of AP and AR scores between our networks and other state-of-the-art methods. Compared to models based on the HRNet-W32 backbone, GoatPose-BL achieves average precision similar to HRNet-W32L and TokenPose-B, and outperforms TransPose-H-S by 1.5%. Meanwhile, GoatPose-BL shows significant improvements in AR over all the aforementioned models, with increases of 3% and 2.9% compared to HRNet-W32 and TokenPose-B, respectively.

Table 1. Comparison on Accuracy.

Model                 Backbone    Input Size  Feature Size  AP    AP50  AP75  AR
Simple Baseline [21]  ResNet-152  256 × 192   1/32          72    89.3  79.8  77.8
TokenPose-B [41]      HRNet-W32   256 × 192   1/4           74.7  89.8  81.4  80.0
TokenPose-L/D24       HRNet-W48   256 × 192   1/4           75.8  90.3  82.5  80.9
TransPose-H-S [42]    HRNet-W32   256 × 192   1/4           73.4  91.6  81.1  –
TransPose-H-A6        HRNet-W48   256 × 192   1/4           75    92.2  82.3  80.8
HRFormer-BL [43]      HRFormer-B  256 × 192   1/4           75.6  90.8  82.8  80.8
HRFormer-BH           HRFormer-B  384 × 288   1/4           77.2  91    83.6  82
ViTPose-B [44]        ViT-B       256 × 192   1/16          75.8  90.7  –     81.1
ViTPose-L             ViT-L       256 × 192   1/16          78.3  91.4  –     83.5
RTMPose-m [45]        CSPNeXt-m   256 × 192   –             75.7  –     –     –
RTMPose-L             CSPNeXt-L   256 × 192   –             76.6  –     –     –
DWPose-m [46]         CSPNeXt-L   256 × 192   –             76.2  –     –     –
DWPose-L              CSPNeXt-L   256 × 192   –             77.0  –     –     –
HRNet-W32L [10]       HRNet-W32   256 × 192   1/4           74.4  90.5  81.9  79.8
HRNet-W32H            HRNet-W32   384 × 288   1/4           75.8  90.6  82.7  81
HRNet-W48L            HRNet-W48   256 × 192   1/4           75.1  90.6  82.2  80.4
HRNet-W48H            HRNet-W48   384 × 288   1/4           76.3  90.8  82.9  81.2
GoatPose-BL           HRNet-W32   256 × 192   1/4           74.5  92.6  82.5  82.3
GoatPose-BH           HRNet-W32   384 × 288   1/4           75.6  92.7  82.3  83.5
GoatPose-XL           HRNet-W48   256 × 192   1/4           76.1  93.5  83.5  83.8
GoatPose-XH           HRNet-W48   384 × 288   1/4           76.5  93.7  83.6  83.9

Table 2 presents the comparison of complexity between our networks and other state-of-the-art methods. GoatPose-BL has fewer parameters than the majority of models, and its execution speed is significantly higher than that of all other models. GoatPose-BL has only 15.51M parameters, nearly half those of HRNet-W32L, and its computational cost is only 3.48 GFLOPs, roughly half that of HRNet-W32L, making it about twice as fast. Compared to TransPose-H-S, the model with the fewest parameters, GoatPose has slightly more parameters but achieves a four-fold improvement in computation speed.

Table 2. Comparison on Complexity.

Model                 Params(M)  GFLOPs  Speed(fps)  AP    AR
Simple Baseline [21]  60         15.7    123         72    77.8
TokenPose-B [41]      14         5.7     –           74.7  80
TokenPose-L/D24       28         11      89          75.8  80.9
TransPose-H-S [42]    8          10.2    54          73.4  –
TransPose-H-A6        18         21.8    46          75    80.8
HRFormer-BL [43]      43         12.2    23          75.6  80.8
HRFormer-BH           43         26.8    12          77.2  82
ViTPose-B [44]        86         17.1    140         75.8  81.1
ViTPose-L             307        70.0+   61          78.3  83.5
RTMPose-m [45]        –          2.2     –           75.7  –
RTMPose-L             –          4.5     –           76.6  –
DWPose-m [46]         –          2.2     –           76.2  –
DWPose-L              –          4.5     –           77.0  –
HRNet-W32L [10]       29         7.1     136         74.4  79.8
HRNet-W32H            29         16      64          75.8  81
HRNet-W48L            64         14.6    96          75.1  80.4
HRNet-W48H            64         32.9    46          76.3  81.2
GoatPose-BL(ours)     15.51      3.48    277         74.5  82.3
GoatPose-BH(ours)     15.51      7.84    131         75.6  83.5
GoatPose-XL(ours)     34.76      7.81    180         76.1  83.8
GoatPose-XH(ours)     34.76      17.53   86          76.5  83.9

Ablation Study. We empirically study how the cheap operation and the attention mechanism influence performance, as shown in Table 3. We construct NAM-HRNet by adding the NAM attention mechanism to the HRNet architecture. On COCO val, NAM-HRNet shows a significant improvement in AP over HRNet, with an increase of 1.5. This improvement confirms that the weight allocation of the attention mechanism can effectively enhance model performance. Compared to NAM-HRNet, GoatPose-BL achieves an AP of 74.5 while roughly halving both the computational cost and the number of parameters, which validates the efficiency and effectiveness of the cheap operation.

Table 3. Ablation about Cheap Operation and Attention Mechanism.

Model            | Params (M) | GFLOPs | AP
HRNet            | 29         | 7.1    | 74.4
NAM-HRNet (ours) | 35         | 7.8    | 76.1 (up 1.5)
GoatPose (ours)  | 15.51      | 3.48   | 74.5
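To make the "cheap operation" concrete, the sketch below illustrates a Ghost-style block in PyTorch, in the spirit of GhostNet [11]: a small primary convolution produces a few intrinsic feature maps, and inexpensive depthwise convolutions generate the remaining "ghost" maps. This is only a simplified illustration under assumed channel splits and kernel sizes, not the exact module used in GoatPose.

```python
import torch
import torch.nn as nn

class GhostStyleBlock(nn.Module):
    """Cheap-operation block in the spirit of GhostNet [11] (illustrative only)."""
    def __init__(self, in_ch, out_ch, ratio=2):
        super().__init__()
        primary_ch = out_ch // ratio          # intrinsic maps from a normal conv
        cheap_ch = out_ch - primary_ch        # "ghost" maps from cheap depthwise convs
        self.primary = nn.Sequential(
            nn.Conv2d(in_ch, primary_ch, 1, bias=False),
            nn.BatchNorm2d(primary_ch), nn.ReLU(inplace=True))
        self.cheap = nn.Sequential(
            nn.Conv2d(primary_ch, cheap_ch, 3, padding=1, groups=primary_ch, bias=False),
            nn.BatchNorm2d(cheap_ch), nn.ReLU(inplace=True))

    def forward(self, x):
        y = self.primary(x)
        # Concatenate the intrinsic maps and the cheaply generated ghost maps.
        return torch.cat([y, self.cheap(y)], dim=1)
```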


4.4 Deployment

In order to verify the real-time performance of our model, we pick the NVIDIA Jetson TX2 as the testing platform. After preliminary verification, we first install JetPack on the TX2 and configure the necessary environment. We then debug the code on a server, train the model, and conduct initial tests. Afterwards, the trained model in .pth format is converted into a TensorRT engine, and inference is performed with the TensorRT acceleration module; a minimal sketch of this conversion is given after Table 4. The testing results in Table 4 demonstrate that GoatPose achieves excellent performance at high speed even under limited hardware resources.

Table 4. Application on Jetson TX2.

Model              | Params (M) | GFLOPs | Speed (fps) | AP (%)
HRNet-W32L         | 29         | 7.1    | 23          | 74.4
GoatPose-BL (ours) | 15.51      | 3.48   | 51          | 74.5
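The conversion path described above is sketched below under common assumptions: the PyTorch checkpoint is first exported to ONNX and then built into a TensorRT engine with the stock trtexec tool. The model constructor, input shape, and file names are illustrative; this is not the authors' exact deployment script.

```python
import torch

# Assumption: build_goatpose() is a hypothetical constructor for the trained network.
model = build_goatpose()
model.load_state_dict(torch.load("goatpose_bl.pth", map_location="cpu"))
model.eval()

# Step 1: export to ONNX with a fixed 256x192 input, matching the training resolution.
dummy = torch.randn(1, 3, 256, 192)
torch.onnx.export(model, dummy, "goatpose_bl.onnx",
                  input_names=["image"], output_names=["heatmaps"], opset_version=11)

# Step 2 (run on the TX2): build a TensorRT engine from the ONNX file, e.g.
#   trtexec --onnx=goatpose_bl.onnx --saveEngine=goatpose_bl.trt --fp16
# The resulting engine is then loaded with the TensorRT runtime for inference.
```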

5 Conclusion

By incorporating cheap linear operations, we propose GoatPose, a novel lightweight deep convolutional network that halves the amount of computation and the number of parameters while maintaining the same or even slightly higher accuracy than its HRNet backbone. Specifically, the lightweight modules are combined with the NAM attention mechanism, greatly improving the effectiveness of GPU computing. Experiments on the COCO dataset show that our model significantly improves performance. The successful deployment on the NVIDIA Jetson TX2 further demonstrates the superiority and generalizability of our model, paving the way to real-time human pose estimation for edge applications.

References

1. Zhang, X., Zhou, X., Lin, M., Sun, J.: ShuffleNet: an extremely efficient convolutional neural network for mobile devices. In: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 6848–6856 (2017)
2. Howard, A.G., et al.: MobileNets: efficient convolutional neural networks for mobile vision applications (2017). http://arxiv.org/abs/1704.04861
3. Howard, A., et al.: Searching for MobileNetV3. In: ICCV, pp. 1314–1324. IEEE (2019)
4. He, Y., Zhang, X., Sun, J.: Channel pruning for accelerating very deep neural networks (2017)


5. Liu, Z., Li, J., Shen, Z., Huang, G., Yan, S., Zhang, C.: Learning efficient convolutional networks through network slimming (2017)
6. Han, S., Mao, H., Dally, W.J.: Deep compression: compressing deep neural networks with pruning, trained quantization and Huffman coding (2016)
7. Lin, J., Rao, Y., Lu, J., Zhou, J.: Runtime neural pruning. In: Proceedings of the 31st International Conference on Neural Information Processing Systems, NIPS 2017, pp. 2178–2188, Red Hook, NY, USA. Curran Associates Inc. (2017)
8. Wen, W., Wu, C., Wang, Y., Chen, Y., Li, H.: Learning structured sparsity in deep neural networks (2016)
9. Han, S., Pool, J., Tran, J., Dally, W.J.: Learning both weights and connections for efficient neural networks. In: Proceedings of the 28th International Conference on Neural Information Processing Systems - Volume 1, NIPS 2015, Cambridge, MA, USA. MIT Press (2015)
10. Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation (2019)
11. Han, K., Wang, Y., Tian, Q., Guo, J., Xu, C., Xu, C.: GhostNet: more features from cheap operations (2020)
12. Han, K., et al.: GhostNets on heterogeneous devices via cheap operations. Int. J. Comput. Vis. 130(4), 1050–1069 (2022). https://doi.org/10.1007/s11263-022-01575-y
13. Hu, J., Shen, L., Albanie, S., Sun, G., Wu, E.: Squeeze-and-excitation networks (2019)
14. Park, J., Woo, S., Lee, J.Y., Kweon, I.S.: BAM: bottleneck attention module (2018)
15. Woo, S., Park, J., Lee, J.Y., Kweon, I.S.: CBAM: convolutional block attention module (2018)
16. Lin, T.Y., et al.: Microsoft COCO: common objects in context (2015)
17. Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation (2016)
18. Badrinarayanan, V., Handa, A., Cipolla, R.: SegNet: a deep convolutional encoder-decoder architecture for robust semantic pixel-wise labelling (2015)
19. Noh, H., Hong, S., Han, B.: Learning deconvolution network for semantic segmentation (2015)
20. Ronneberger, O., Fischer, P., Brox, T.: U-Net: convolutional networks for biomedical image segmentation (2015)
21. Xiao, B., Wu, H., Wei, Y.: Simple baselines for human pose estimation and tracking (2018)
22. Peng, X., Feris, R.S., Wang, X., Metaxas, D.N.: A recurrent encoder-decoder network for sequential face alignment (2016)
23. Yu, F., Koltun, V.: Multi-scale context aggregation by dilated convolutions (2016)
24. Chen, L.C., Papandreou, G., Kokkinos, I., Murphy, K., Yuille, A.L.: Semantic image segmentation with deep convolutional nets and fully connected CRFs (2016)
25. Chen, L.C., Papandreou, G., Kokkinos, I., Murphy, K., Yuille, A.L.: DeepLab: semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs (2017)
26. Chen, L.C., Papandreou, G., Schroff, F., Adam, H.: Rethinking atrous convolution for semantic image segmentation (2017)
27. Chen, L.C., et al.: Searching for efficient multi-scale architectures for dense image prediction (2018)
28. Liu, C., et al.: Auto-DeepLab: hierarchical neural architecture search for semantic image segmentation (2019)


29. Yang, T.J., et al.: DeeperLab: single-shot image parser (2019)
30. Cheng, B., et al.: Panoptic-DeepLab: a simple, strong, and fast baseline for bottom-up panoptic segmentation (2020)
31. Niu, Y., Wang, A., Wang, X., Wu, S.: ConvPose: a modern pure ConvNet for human pose estimation. Neurocomputing 544, 126301 (2023). https://doi.org/10.1016/j.neucom.2023.126301
32. Wang, J., et al.: Deep high-resolution representation learning for visual recognition (2020)
33. Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.C.: MobileNetV2: inverted residuals and linear bottlenecks (2019)
34. Chollet, F.: Xception: deep learning with depthwise separable convolutions (2017)
35. Howard, A.G., et al.: MobileNets: efficient convolutional neural networks for mobile vision applications (2017)
36. Sun, K., Li, M., Liu, D., Wang, J.: IGCV3: interleaved low-rank group convolutions for efficient deep neural networks (2018)
37. Tan, M., Le, Q.V.: MixConv: mixed depthwise convolutional kernels (2019)
38. Neff, C., Sheth, A., Furgurson, S., Tabkhi, H.: EfficientHRNet: efficient scaling for lightweight high-resolution multi-person pose estimation (2020)
39. Liu, Z., Wang, L., Wu, W., Qian, C., Lu, T.: TAM: temporal adaptive module for video recognition (2021)
40. Liu, Y., Shao, Z., Teng, Y., Hoffmann, N.: NAM: normalization-based attention module (2021)
41. Li, Y., et al.: TokenPose: learning keypoint tokens for human pose estimation (2021)
42. Yi, X., Zhou, Y., Xu, F.: TransPose: real-time 3D human translation and pose estimation with six inertial sensors (2021)
43. Yuan, Y., et al.: HRFormer: high-resolution transformer for dense prediction (2021)
44. Xu, Y., Zhang, J., Zhang, Q., Tao, D.: ViTPose: simple vision transformer baselines for human pose estimation (2022)
45. Jiang, T., et al.: RTMPose: real-time multi-person pose estimation based on MMPose (2023)
46. Yang, Z., Zeng, A., Yuan, C., Li, Y.: Effective whole-body pose estimation with two-stages distillation (2023)

Sign Language Recognition for Low Resource Languages Using Few Shot Learning

Kaveesh Charuka, Sandareka Wickramanayake(B), Thanuja D. Ambegoda, Pasan Madhushan, and Dineth Wijesooriya

Department of Computer Science and Engineering, University of Moratuwa, Moratuwa, Sri Lanka
{kaveesh.18,sandarekaw,thanuja,pasan.18,dineth.18}@cse.mrt.ac.lk

Abstract. Sign Language Recognition (SLR) with machine learning is challenging due to the scarcity of data for most low-resource sign languages. Therefore, it is crucial to leverage a few-shot learning strategy for SLR. This research proposes ProtoSign, a novel skeleton-based sign language recognition method based on the prototypical network [20]. Furthermore, we contribute to the field by introducing the first publicly accessible dynamic word-level Sinhala Sign Language (SSL) video dataset, comprising 1110 videos over 50 classes. To our knowledge, this is the first publicly available SSL dataset. Our method is evaluated using two low-resource language datasets, including our own. The experiments report results at the 95% confidence level for both 5-way and 10-way classification in 1-shot, 2-shot, and 5-shot settings.

Keywords: Sign Language Recognition · Few Shot Learning · Sign Language Recognition (SLR) · Prototypical Network

1 Introduction

Sign languages constitute the principal communication mechanism for approximately 5% of the hearing-impaired global population, as documented by the World Health Organization [1]. Each linguistic community worldwide utilizes a distinct sign language tailored to its cultural and regional context. Recognizing sign languages, especially those considered 'low-resource' like Sinhala Sign Language [26], presents unique challenges. Each sign language varies by region, with distinct gestures and meanings. This diversity, combined with the dynamic nature of sign languages and the limited availability of comprehensive datasets, makes Sign Language Recognition (SLR) a complex task. In the field of SLR, numerous studies have been conducted, employing both vision-based [4,14] and contact-based approaches [2,10]. While these methods have shown promising results, they also face several limitations. Contact-based methods, for instance, require the user to wear specialized gloves equipped


with sensors, which can be intrusive and cost-prohibitive. On the other hand, vision-based methods often rely on extensive and rich datasets for training, which are not always available for low-resource sign languages. Few-shot learning is designed to build machine learning models that can understand new classes or tasks with minimal examples or training data [25]. Given the scarcity of sign language data, particularly for low-resource sign languages, few-shot learning could play a crucial role in SLR, enabling the development of robust models that can learn from fewer instances. In this research, we introduce SSL50, a new dynamic word-level Sinhala Sign Language dataset, and propose ProtoSign, a new framework for low-resource SLR. ProtoSign comprises three core components: skeleton location extraction [4,22], a Transformer Encoder (TE) [23] equipped with a novel composite loss function, and the application of ProtoNet [20] for few-shot classification. The new SSL50 dataset comprises 50 classes with over 1000 sign videos, and it is the first publicly available dynamic Sinhala Sign Language dataset. The first step of the proposed ProtoSign is skeleton location extraction, the basis for data preprocessing. Next, we apply a TE, a deep learning model renowned for capturing intricate patterns and dependencies in data. Further, we introduce a composite loss function that combines triplet and classification losses, improving the overall accuracy of the approach. Lastly, we employ ProtoNet, which has achieved state-of-the-art results, for the few-shot classification task. Experimental results on the newly introduced dataset and two publicly available datasets, LSA64 [15] and GSLL [21], demonstrate the effectiveness of the proposed ProtoSign framework for low-resource SLR.

2 Related Work

Vision-based Sign Language Recognition (SLR) methods use images or videos of hand gestures to recognize the signs. Vision-based SLR has dramatically improved with the advancement of deep learning. For example, Convolutional Neural Networks [9,13], Long Short-Term Memory networks [6], and Transformers [5,17] have been used for input encoding in SLR. However, using deep learning in low-resource SLR is challenging because of the limited available data. A possible solution is to employ a few-shot learning approach. For instance, Santoro et al. [16] and Vinyals et al. [24] attempted to solve few-shot classification with end-to-end deep neural networks. Metric-based models are commonly used in meta-learning, one of the main types of few-shot learning approaches. Metric learning uses non-parametric techniques to model sample distance distributions, ensuring proximity between similar samples and maintaining distance between dissimilar ones. Core models embodying this principle are Matching Networks [24], Prototypical Networks [20], and Relation Networks [19]. In their work on Prototypical Networks [20], Snell et al. expanded the concept from individual samples to a class-based metric. They grouped the descriptors of all samples from a specific class to establish class prototypes, which are then used for inference. Izutov [7] introduced a meta-learning-based network for American SLR, which acquires the ability to evaluate


Fig. 1. Screenshots of sample videos from SSL50, LSA64 and GSLL Sign Language Recognition datasets.

the similarity between pairs of feature vectors. Nevertheless, using metric-based models for low-resource SLR is not well-explored. In this paper, we explore using Prototypical Networks for low-resource SLR.

3 SSL50 Dataset

This paper introduces SSL50, a diverse Sinhala Sign Language (SSL) dataset, to facilitate recognizing dynamic SSL and to address the lack of dynamic SSL datasets. SSL50 comprises over 1,000 videos covering 50 classes of commonly used SSL words. By consulting sign language professionals during dataset creation, we ensured that SSL50 contains videos of the most frequently used words and does not include any closely related signs. Five signers, four female and one male, contributed videos to our dataset. All contributors were right-handed and between the ages of 21–35. Two signers learned SSL from their families, while the other three studied at educational institutions. To ensure a diverse and natural dataset, we conducted orientation sessions with sign language professionals, providing them with an overview of our work. Each participant was requested to produce five sign videos per class using


Fig. 2. Overview of the proposed ProtoSign framework

their mobile phones. We encouraged the signers to vary the backgrounds for each video of the same sign, aiming to capture a broader range of real-world scenarios rather than relying on lab-generated datasets. Figure 1a shows screenshots of sample videos in the SSL50 dataset, characterized by its inclusion of natural backgrounds. Conversely, Fig. 1b and 1c show screenshots of sample videos sourced from the LSA64 [15] and GSLL [21] datasets, respectively, which were created using static backgrounds. Hence, SSL50 better resembles real-world signing practices, incorporating natural variations and promoting inclusivity. After collecting all the sign videos, we renamed each file in the format classId_signerId_variantId (e.g., 001_002_001.mp4) to facilitate dataset annotation. Additionally, we converted all the videos to a uniform frame rate of 30 fps. The accompanying CSV file contains comprehensive details about the dataset, including signer details, gloss details (word, label, word in English), and video details (file name, signer ID, label, duration, fps, video width, video height). The dataset can be downloaded from here.

4 Proposed ProtoSign Framework

The overview of the proposed ProtoSign framework is shown in Fig. 2. ProtoSign consists of three main steps. First, given the sign video, ProtoSign extracts the skeleton locations of the signer using the MediaPipe model [11]. Second, the extracted skeleton locations are sent to the transformer encoder to obtain a vector representing the input sign video. Finally, following ProtoNet [20], ProtoSign compares the obtained vector with the prototypes of different sign classes to determine the class of the given sign video.

4.1 Skeleton Locations Extraction

In the ProtoSign framework, the first step involves identifying the location of the skeleton in each video frame, including the face, body, and hand landmarks. The process of skeleton extraction is illustrated in Fig. 3. To ensure accuracy, we use the YOLOv4 framework [3] to detect the person in the video. This involves


Fig. 3. The process pipeline for extracting and refining skeleton locations.

calculating the coordinates of the bounding box for each frame, which helps us determine the maximum bounding box that includes the person across frames. Once we have the bounding box, we crop each frame with it. This step is essential in addressing the variability in the distance between the camera and the person: in real-world scenarios, we cannot expect the signer to be at a specific distance from the camera. By isolating the person, we eliminate the effects of distance and potential interference from other objects in the video background. Next, we use a standard pose estimation algorithm from MediaPipe [11] to extract skeleton locations. The algorithm utilizes two models, one for the hands and another for the whole body, resulting in a comprehensive extraction of skeleton locations. We then employ a refinement phase to remove any irrelevant locations. This method yields 57 skeleton locations, including 21 per hand, 4 for the body, and 11 for the face. Excluding the face locations, the remaining points represent the body's joints. A minimal sketch of this extraction step is given below.
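The following is a simplified, illustrative sketch of the landmark-extraction step using the MediaPipe Holistic solution. The person-detection and cropping stage with YOLOv4, the face landmarks, and the exact subset of 4 body landmarks kept after refinement are assumptions and are elided or abbreviated here.

```python
import cv2
import mediapipe as mp

mp_holistic = mp.solutions.holistic

def extract_landmarks(video_path):
    """Return per-frame (x, y) landmarks for both hands and a pose subset."""
    frames = []
    cap = cv2.VideoCapture(video_path)
    with mp_holistic.Holistic(static_image_mode=False) as holistic:
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            # MediaPipe expects RGB input; OpenCV reads BGR frames.
            results = holistic.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
            coords = []
            for hand in (results.left_hand_landmarks, results.right_hand_landmarks):
                lms = hand.landmark if hand else []
                # 21 landmarks per hand; pad with zeros when the hand is not detected.
                coords += [(lm.x, lm.y) for lm in lms] or [(0.0, 0.0)] * 21
            if results.pose_landmarks:
                # Keep only a small subset of pose points (illustrative choice).
                coords += [(lm.x, lm.y) for lm in results.pose_landmarks.landmark[:4]]
            frames.append(coords)
    cap.release()
    return frames
```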

4.2 Sign Video Encoder

ProtoSign adapts Prototypical Networks (ProtoNet) [20] to deal with low-resource sign languages. ProtoNet makes predictions based on prototype representations of each class. To create class prototypes, we develop a Sign Video Encoder based on a Transformer Encoder (TE). We use a modified version of the Transformer model [23] with a classification head as the final layer. This TE is first trained using a large sign language dataset; in our implementation, we use the LSA64 dataset [15]. Next, we fine-tune the TE following the method used in [20] to create more discriminative prototypes for each sign class. In this section, we first describe how we pre-train the TE using a large sign language dataset and then detail how we fine-tune it.

Fig. 4. Training phase of the Transformer Encoder used in ProtoSign

Given a sign video v of class y, the input to the transformer is a sequence of normalized skeleton locations x extracted in the previous step, x ∈ R^{n×d}, where n is the sequence length and d is the number of features. x is sent through positional encoding, self-attention, and feed-forward layers, mirroring the process in the original TE. Suppose the output of the TE is z ∈ R^{n×d̄}, where d̄ is the output embedding size of the TE. We take the mean of z along the sequence-length dimension to get the vector embedding v_x of the input:

v_x = \frac{1}{n} \sum_{z_i \in z} z_i    (1)

Finally, the classification head, another linear layer, is applied to v_x to get the predicted class ȳ. We employ a composite objective function of triplet loss and classification loss to train the transformer encoder to generate discriminative vector representations for different sign classes. The classification loss forces the transformer encoder to learn discriminative features for each class. The triplet loss further enhances the discriminativeness of the learned feature vectors by forcing higher intra-class and lower inter-class similarities. The training phase of the transformer encoder used in ProtoSign is shown in Fig. 4. Suppose a positive sample of x, which shares the same label, is denoted by x_p, and a negative sample with a different label is denoted by x_n. Let the output vector embeddings of the transformer encoder for x_p and x_n be v_p and v_n, respectively. Then the triplet loss L_T is defined as

L_T = \|v_x - v_p\|_2^2 - \|v_x - v_n\|_2^2 + \alpha    (2)

Here, α is a margin enforced between positive and negative pairs. Through hyperparameter tuning, we set α to 2.0. We adopted an online triplet selection strategy in [18]. Although computationally intensive, the online strategy enhances robustness, expedites convergence, and performs better.


Fig. 5. The Prototypical Network Architecture.

We supplement the Triplet Loss with Classification Loss to provide comprehensive supervision to the TE during training. The Classification Loss aids the model in making accurate classification decisions for individual examples, thus facilitating the distinction between different classes. This, in turn, eases the task of the Triplet Loss in refining the relative distances between classes, culminating in enhanced performance. The classification objective of the transformer encoder, L_C, is defined using the multi-class Classification Loss:

L_C = -\log P(\bar{y} \mid v_x)    (3)

The final objective function of the transformer encoder, L, is

L = \beta \cdot L_T + (1 - \beta) \cdot L_C    (4)

where β is a hyper-parameter; in our experiments we set β to 0.9 through parameter tuning. Next, following [20], we use episodic training to fine-tune the trained TE. Each episode consists of a support set of N samples S = {(x_1, y_1), ..., (x_N, y_N)}, where x_i is the sample video, y_i ∈ {1, ..., K} is the corresponding label, and K denotes the classes randomly selected from the training set comprising C classes (|C| > |K|). Suppose S_k is the set of samples belonging to class k ∈ K. We calculate the prototype p_k for each class k ∈ K by taking the average of the embeddings produced by the TE for S_k. Given a query point x̄ ∈ Q, we obtain its vector representation from the TE, v_x̄, and calculate the Euclidean distance between v_x̄ and each p_k. To determine the class of x̄, we apply softmax over the calculated distances, producing a probability distribution over the classes. Figure 5 shows the fine-tuning phase of ProtoSign following ProtoNet. The same approach is employed during inference, but the data is randomly selected from the testing set instead.
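A minimal PyTorch sketch of the composite objective in Eqs. (2)-(4) and of the prototype-based classification is given below. It assumes the TE embeddings are already computed; the hinge (clamping the triplet term at zero) is the standard triplet-loss formulation and is assumed here, and α = 2.0 and β = 0.9 follow the values reported above.

```python
import torch
import torch.nn.functional as F

def composite_loss(v_x, v_p, v_n, logits, labels, alpha=2.0, beta=0.9):
    # Triplet term, Eq. (2): squared L2 distances with margin alpha, hinged at zero.
    l_t = (v_x - v_p).pow(2).sum(dim=1) - (v_x - v_n).pow(2).sum(dim=1) + alpha
    l_t = F.relu(l_t).mean()
    # Classification term, Eq. (3): multi-class cross-entropy on the head's logits.
    l_c = F.cross_entropy(logits, labels)
    # Combined objective, Eq. (4).
    return beta * l_t + (1 - beta) * l_c

def protonet_probabilities(query_emb, support_emb, support_labels, n_classes):
    # Class prototypes: mean TE embedding of the support samples of each class.
    prototypes = torch.stack(
        [support_emb[support_labels == k].mean(dim=0) for k in range(n_classes)])
    # Softmax over negative Euclidean distances gives a distribution over classes.
    distances = torch.cdist(query_emb, prototypes)
    return F.softmax(-distances, dim=1)
```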

Table 1. Datasets Summary

Dataset | Num. of Classes | Num. of Signers | Average Num. of Videos per Class | Total Num. of Videos
LSA64   | 64              | 10              | 50                               | 3200
SSL50   | 50              | 5               | 22                               | 1110
GSLL    | 347             | 2               | 10                               | 3464

5 Experimental Study

5.1 Datasets

In addition to the newly introduced SSL50 dataset, our experiments use the LSA64 [15] and GSLL [21] datasets.
– LSA64: The LSA64 dataset [15] is a comprehensive database developed for Argentinian Sign Language (LSA). The dataset comprises 3200 videos featuring 64 unique signs performed by ten non-expert, right-handed subjects five times each. The chosen signs, a mix of common verbs and nouns, were recorded in two separate sessions under distinct lighting conditions - outdoors with natural light and indoors with artificial light, providing variety in illumination across the videos.
– GSLL: The Greek Sign Language (GSLL) dataset [21] comprises 3,464 videos encapsulating a total of 161,050 frames, with each video representing one of 347 distinct sign classes. Two signers perform these signs, repeating each sign 5–17 times to offer variations.
A summary of the datasets used in our experiments is given in Table 1.

5.2 Implementation Details

We implement ProtoSign using the PyTorch framework [12], and it is trained on an NVIDIA Tesla T4 GPU or an NVIDIA GeForce RTX 2040 GPU. We use the ADAM optimizer [8] for training. We first train the TE in ProtoSign using the LSA64 dataset. We performed a grid search for hyperparameter tuning to optimize the overall performance. We set the number of attention heads to 16, the batch size to 32, and the learning rate to 0.002, and trained the model for 70 epochs. Further, we experimented with different values for β and set it to 0.9, which gives the best results. Next, we use ProtoSign for few-shot classification of signs by fine-tuning the TE on each low-resource language dataset, SSL50 and GSLL. Here, for each 1-shot, 2-shot, and 5-shot scenario for a given low-resource dataset, we create train, validation, and test datasets separately. For example, consider the SSL50 dataset under the 1-shot scenario. We select two instances from each of the 50 classes for training and the remaining for testing. One sample is assigned as the


Table 2. Few-shot classification accuracies of ProtoSign on SSL and GSLL datasets

Dataset | 5 way 1 shot | 5 way 2 shot | 5 way 5 shot | 10 way 1 shot | 10 way 2 shot | 10 way 5 shot
SSL50   | 61.2%        | 81.66%       | 93.08%       | 42.75%        | 69.24%        | 87.47%
GSLL    | 73.20%       | 84.38%       | –            | 62.5%         | 79.65%        | –

Table 3. Few-shot learning accuracies of ProtoSign for different scenarios on SSL50 dataset

Model             | 5 way 1 shot | 5 way 2 shot | 5 way 5 shot | 10 way 1 shot | 10 way 2 shot | 10 way 5 shot
ProtoSign - CL    | 53.2%        | 74.3%        | 87.47%       | 37.7%         | 59.8%         | 84.06%
ProtoSign - TL    | 57.80%       | 77.47%       | 92.47%       | 41.45%        | 63.78%        | 86.73%
Matching Networks | 63.4%        | 78.4%        | 88.45%       | 46.87%        | 69.09%        | 83.34%
VE + ProtoNet     | 41%          | 41.22%       | 43.6%        | 21%           | 22.25%        | 22%
ProtoSign         | 61.21%       | 81.66%       | 93.08%       | 42.75%        | 69.24%        | 87.47%

support set of the two chosen for training, whereas the other is assigned to the query set. In each episode, we randomly select 5 (in the 5-way scenario) or 10 (in the 10-way scenario) classes from the training set. Each epoch comprises 1000 such episodes. The code is available at https://github.com/ProtoSign.
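As a rough illustration of the training configuration described above, the sketch below wires a standard PyTorch TransformerEncoder with 16 attention heads to the ADAM optimizer with the reported learning rate. The model dimension, layer count, and the use of the built-in TransformerEncoder are assumptions, not values or choices stated in the paper.

```python
import torch
import torch.nn as nn

d_model, num_layers = 256, 4          # assumed; the paper does not report these
encoder_layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=16, batch_first=True)
sign_encoder = nn.TransformerEncoder(encoder_layer, num_layers=num_layers)
classification_head = nn.Linear(d_model, 64)   # 64 classes when pre-training on LSA64

params = list(sign_encoder.parameters()) + list(classification_head.parameters())
optimizer = torch.optim.Adam(params, lr=0.002)  # batch size 32, 70 epochs as reported
```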

5.3 Experimental Results

Table 2 shows the few-shot classification accuracies of ProtoSign on SSL50 and GSLL datasets under different settings. To show the effectiveness of our approach, we conducted four main ablation studies on the SSL50 and GSLL datasets. The final model we proposed integrates a combination of classification loss and triplet loss, including an extra classification layer nested within the Transformer Encoder. In the ablation studies, we evaluated the performance of our model by training only using Triplet Loss (TL) or classification loss (CL). Further, we considered replacing the ProtoNet with Matching Networks, allowing for a comparative analysis between these approaches. Furthermore, we assess the performance of ProtoSign when using a Video Encoder (VE) directly instead of skeleton location extractions followed by a Transformer Encoder. In our experiments, we use the r3d-18 video encoder for this evaluation. Table 3 shows the performance of variants of ProtoSign for various conditions using the SSL50 dataset. We observe that ProtoSign has achieved the best performance among different variants for all the scenarios except the 1-shot scenario, whereas replacing ProtoNet with Matching Networks has yielded the best results in the 1-shot scenario. Table 4 displays the few-shot learning accuracies of variants of ProtoSign for the GSLL dataset. Here, our ProtoSign model

Table 4. N-way k-shot accuracies for different scenarios on GSLL dataset

Model             | 5 way 1 shot | 5 way 2 shot | 10 way 1 shot | 10 way 2 shot
ProtoSign - CL    | 66%          | 74.3%        | 37.7%         | 59.8%
ProtoSign - TL    | 70%          | 82.02%       | 60.10%        | 74.37%
Matching Networks | 70.19%       | 82.78%       | 61.8%         | 75.61%
VE + ProtoNet     | 40.64%       | 46.45%       | 36.65%        | 43.12%
ProtoSign         | 73.20%       | 84.38%       | 62.5%         | 79.65%

surpasses all other variants in all the scenarios. The results of these ablation studies demonstrate the effectiveness of each component of the proposed ProtoSign framework.

6 Conclusion

This paper presents a new architecture for few-shot learning in low-resource languages called ProtoSign, along with a dynamic dataset of Sinhala Sign Language at the word level called SSL50. The ProtoSign architecture consists of three main steps. Firstly, it extracts the skeleton locations of the signer from the sign video. Secondly, the extracted skeleton locations are sent to the transformer encoder to obtain a vector representing the input sign video. Finally, ProtoSign compares the obtained vector with prototypes of different sign classes to determine the class of the given sign video. The proposed framework’s effectiveness is demonstrated through experimental results on two low-resource sign language datasets, the newly introduced SSL50 and GSLL.

References

1. Deafness and hearing loss (2023). https://www.who.int/news-room/fact-sheets/detail/deafness-and-hearing-loss. Accessed 29 May 2023
2. Amin, M.S., Rizvi, S.T.H., Hossain, M.M.: A comparative review on applications of different sensors for sign language recognition. J. Imaging 8(4), 98 (2022)
3. Bochkovskiy, A., Wang, C.Y., Liao, H.Y.M.: YOLOv4: optimal speed and accuracy of object detection. arXiv preprint arXiv:2004.10934 (2020)
4. Boháček, M., Hrúz, M.: Sign pose-based transformer for word-level sign language recognition. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 182–191 (2022)
5. Camgoz, N.C., Koller, O., Hadfield, S., Bowden, R.: Multi-channel transformers for multi-articulatory sign language translation. In: Bartoli, A., Fusiello, A. (eds.) ECCV 2020. LNCS, vol. 12538, pp. 301–319. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-66823-5_18


6. Cui, R., Liu, H., Zhang, C.: Recurrent convolutional neural networks for continuous sign language recognition by staged optimization. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7361–7369 (2017)
7. Izutov, E.: ASL recognition with metric-learning based lightweight network. arXiv preprint arXiv:2004.05054 (2020)
8. Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. In: Proceedings of the 3rd International Conference on Learning Representations, ICLR (2015)
9. Koller, O., Zargaran, O., Ney, H., Bowden, R.: Deep sign: hybrid CNN-HMM for continuous sign language recognition. In: Proceedings of the British Machine Vision Conference 2016 (2016)
10. Lee, B.G., Lee, S.M.: Smart wearable hand device for sign language interpretation system with sensors fusion. IEEE Sens. J. 18(3), 1224–1232 (2017)
11. Lugaresi, C., et al.: MediaPipe: a framework for building perception pipelines. arXiv preprint arXiv:1906.08172 (2019)
12. Paszke, A., et al.: PyTorch: an imperative style, high-performance deep learning library. In: Wallach, H., Larochelle, H., Beygelzimer, A., d'Alché-Buc, F., Fox, E., Garnett, R. (eds.) Proceedings of the Advances in Neural Information Processing Systems, pp. 8024–8035 (2019)
13. Rao, G.A., Syamala, K., Kishore, P., Sastry, A.: Deep convolutional neural networks for sign language recognition. In: 2018 Conference on Signal Processing and Communication Engineering Systems (SPACES), pp. 194–197. IEEE (2018)
14. Rastgoo, R., Kiani, K., Escalera, S.: Hand sign language recognition using multi-view hand skeleton. Expert Syst. Appl. 150, 113336 (2020)
15. Ronchetti, F., Quiroga, F., Estrebou, C.A., Lanzarini, L.C., Rosete, A.: LSA64: an Argentinian sign language dataset. In: XXII Congreso Argentino de Ciencias de la Computación (CACIC 2016) (2016)
16. Santoro, A., Bartunov, S., Botvinick, M., Wierstra, D., Lillicrap, T.: Meta-learning with memory-augmented neural networks. In: International Conference on Machine Learning, pp. 1842–1850. PMLR (2016)
17. Saunders, B., Camgoz, N.C., Bowden, R.: Continuous 3D multi-channel sign language production via progressive transformers and mixture density networks. Int. J. Comput. Vision 129(7), 2113–2135 (2021)
18. Schroff, F., Kalenichenko, D., Philbin, J.: FaceNet: a unified embedding for face recognition and clustering. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 815–823 (2015)
19. Si, J., et al.: Dual attention matching network for context-aware feature sequence based person re-identification. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 5363–5372 (2018)
20. Snell, J., Swersky, K., Zemel, R.: Prototypical networks for few-shot learning. In: Advances in Neural Information Processing Systems, vol. 30 (2017)
21. Theodorakis, S., Pitsikalis, V., Maragos, P.: Dynamic-static unsupervised sequentiality, statistical subunits and lexicon for sign language recognition. Image Vis. Comput. 32(8), 533–549 (2014)
22. Tunga, A., Nuthalapati, S.V., Wachs, J.: Pose-based sign language recognition using GCN and BERT. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 31–40 (2021)
23. Vaswani, A., et al.: Attention is all you need. In: Advances in Neural Information Processing Systems, vol. 30 (2017)


24. Vinyals, O., Blundell, C., Lillicrap, T., Wierstra, D., et al.: Matching networks for one shot learning. In: Advances in Neural Information Processing Systems, vol. 29 (2016)
25. Wang, Y., Yao, Q., Kwok, J.T., Ni, L.M.: Generalizing from a few examples: a survey on few-shot learning. ACM Comput. Surv. 53(3), 1–34 (2020)
26. Weerasooriya, A.A., Ambegoda, T.D.: Sinhala fingerspelling sign language recognition with computer vision. In: 2022 Moratuwa Engineering Research Conference (MERCon), pp. 1–6. IEEE (2022)

T Cell Receptor Protein Sequences and Sparse Coding: A Novel Approach to Cancer Classification

Zahra Tayebi, Sarwan Ali, Prakash Chourasia, Taslim Murad, and Murray Patterson(B)

Georgia State University, Atlanta, GA 30302, USA
[email protected]
Z. Tayebi and S. Ali: Equal contribution.

Abstract. Cancer is a complex disease marked by uncontrolled cell growth, potentially leading to tumors and metastases. Identifying cancer types is crucial for treatment decisions and patient outcomes. T Cell receptors (TCRs) are vital proteins in adaptive immunity, specifically recognizing antigens and playing a pivotal role in immune responses, including against cancer. TCR diversity makes them promising for targeting cancer cells, aided by advanced sequencing revealing potent anti-cancer TCRs and TCR-based therapies. Effectively analyzing these complex biomolecules necessitates representations that capture their structural and functional essence. We explore sparse coding for the multi-class classification of TCR protein sequences with cancer categories as targets. Sparse coding, a machine learning technique, represents data with informative features, capturing intricate amino acid relationships and subtle sequence patterns. We compute k-mers from the TCR sequences and apply sparse coding to extract key features. Integrating domain knowledge improves the predictive embeddings, incorporating cancer properties such as Human leukocyte antigen (HLA) types, gene mutations, clinical traits, immunological features, and epigenetic changes. Our embedding method, applied to a TCR benchmark dataset, significantly outperforms the baselines, achieving 99.8% accuracy. Our study underscores sparse coding's potential in dissecting TCR protein sequences in cancer research.

Keywords: Cancer classification · TCR sequence · embeddings

1 Introduction

T Cell receptors (TCRs) play a crucial role in the immune response by recognizing and binding to antigens presented by major histocompatibility complexes (MHCs) on the surface of infected or cancerous cells, Fig. 1 [44]. The specificity of TCRs for antigens is determined by the sequence of amino acids that make up the receptor, which is generated through a process of genetic recombination


and somatic mutation. This enables T Cells to produce a diverse repertoire of receptors capable of recognizing a wide range of antigens [12]. MHC molecules bind to peptide fragments that originate from pathogens and present them on the cell surface. This allows for recognition by the corresponding T Cells [24]. TCR sequencing involves analyzing the DNA or RNA sequences that code for the TCR protein on T Cells, and it can be used to identify changes in the TCR repertoire that occur in response to cancer, as well as to identify specific TCR sequences that are associated with particular types of cancer [27].

Fig. 1. T Cells mount a targeted immune response against the invading pathogen or cancerous cells.

Embedding-based methods, exemplified by Yang et al. [51], transform protein sequences into low-dimensional vector representations. They find applications in classification and clustering [11]. These embeddings empower classifiers to predict functional classes for unlabeled sequences [23] and identify related sequences with shared features [51]. Embedding-based methods in protein sequence analysis face challenges related to generalizability and complexity [35]. Deep learning techniques, including Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), and transformer networks, offer promising solutions [37]. These models, trained on extensive and diverse datasets, enhance generalization and enable accurate predictions for new sequences. However, factors like data quality, model architecture, and optimization strategies significantly influence performance [36]. Furthermore, the interpretability of deep learning models remains a challenge, necessitating careful evaluation. While deep learning enhances generalizability, it may not completely resolve the problem. In our study, we propose a multi-class classification approach for predicting cancer categories from TCR protein sequences. We employ sparse coding [38], a machine-learning technique using a dictionary of sparse basis functions. By representing TCR protein sequences as sparse linear combinations of these basis functions, we capture inherent data structure and variation, improving cancer classification accuracy. Preprocessing involves encoding amino acid sequences as numerical vectors and utilizing a k-mer dictionary for sparse coding. We assess our approach using multiple metrics and benchmark it against state-of-the-art methods. Our results clearly show its superior classification performance and highlight its key properties: invariance, noise robustness, interpretability, transferability, and flexibility.


In general, our contributions to this paper are the following:
1. We introduce an efficient method using sparse coding and k-mers to create an alignment-free, fixed-length numerical representation for T Cell sequences. Unlike traditional one-hot encoding, we initially extract k-mers and then apply one-hot encoding, preserving context through a sliding window approach.
2. Leveraging domain knowledge, we enhance supervised analysis by combining it with sparse coding-based representations. This fusion enriches the embeddings with domain-specific information, preserving sequence details, facilitating feature selection, promoting interpretability, and potentially boosting classification accuracy.
3. Our proposed embedding method achieves near-perfect predictive performance, significantly surpassing all baselines in classification accuracy. This highlights the effectiveness of our approach in distinguishing T Cell sequences despite their short length.
Considering this, we will proceed as follows: Section 2 covers related work, Sect. 3 presents our approach, Sect. 4 details the dataset and experimental setup, Sect. 5 reports results, and Sect. 6 concludes the paper.

2 Related Work

TCR (T Cell receptor) sequencing data has emerged as a critical asset in biomedical research, revolutionizing our understanding of the immune system and its applications in fields such as immunotherapy and disease response. This technology has enabled the identification of neoantigens, which are crucial targets for immunotherapy in cancer treatment [5]. Moreover, it has played a pivotal role in decoding the adaptive immune response to viral diseases like the formidable SARS-CoV-2 [17]. TCR sequencing has also contributed significantly to the identification of T Cell clones associated with tumor regression in response to immunotherapy [33]. In the realm of biological data analysis, classification methods have proven invaluable for deciphering protein sequences and their implications in cancer research. Techniques like Support Vector Machines (SVM), Random Forest, and Logistic Regression have been instrumental in grouping protein sequences with shared features, aiding in the discovery of subpopulations related to specific cancer types [21]. For instance, SVM was employed to classify TCR sequences for predicting disease-free and overall survival in breast cancer patients [4]. Kmeans classification has demonstrated promise in distinguishing colorectal cancer patients from healthy individuals based on protein sequence analysis [19]. Deep learning methods have ushered in a new era of protein property prediction, encompassing structure, function, stability, and interactions [49]. Convolutional neural networks (CNNs) have proven effective in forecasting protein structure and function from amino acid sequences [6]. Recurrent neural networks (RNNs) excel at modeling temporal dependencies and long-range interactions within protein sequences [52]. The Universal Representation of Amino Acid Sequences (UniRep) employs an RNN architecture to embed protein sequences,


yielding promising results across various biological data analysis tasks [3]. Innovative methods like ProtVec use word embedding techniques to create distributed representations of protein sequences for function prediction [39]. SeqVec introduces an alternative by embedding protein sequences using a hierarchical representation that captures both local and global features [20]. Lastly, ESM (Evolutionary Scale Modeling) employs a transformer-based architecture to encode protein sequences, forming part of the ESM family of protein models [31]. It is trained on extensive data encompassing protein sequences and structures to predict the 3D structure of a given protein sequence [22]. While deep learning enriches the generalizability of embedding-based techniques for protein sequence analysis, robust evaluation and consideration are essential to ensure the reliability of outcomes [31].

3 Proposed Approach

In this section, we delve into the utilization of cancer-related knowledge, introduce our algorithm, and offer an overview through a flowchart.

3.1 Incorporating Domain Knowledge

This section gives some examples of the additional property values for the four cancers we mentioned. Many factors can increase the risk of developing cancer, including Human leukocyte antigen (HLA) types, gene mutations, clinical characteristics, immunological features, and epigenetic modifications.
HLA Types: HLA genes encode cell surface proteins that present antigens to T Cells. Mutations in these genes due to cancer can hinder antigen presentation, enabling unchecked cancer growth [43].
Gene Mutations: Besides HLA, mutations in genes such as TP53 and BRCA1/2 also raise the risk of cancers such as breast, ovarian, and colorectal cancer [40].
Clinical Characteristics: Age, gender, and family history influence cancer risk [29]. Certain cancers are age-specific, some are gender-biased, and family history also shapes susceptibility.
Immunological Features: The immune system crucially impacts cancer outcomes. Immune cells recognize and eliminate cancer cells [13]. Yet, cancer cells might evade immune surveillance and grow unchecked. Immune cell presence and activity within tumors influence cancer progression and treatment response [18].
Epigenetic Modifications: Epigenetic changes, like DNA methylation, contribute to cancer development and progression. DNA methylation, the addition of a methyl group to DNA's cytosine base (at CpG sites), leads to gene silencing or altered expression [26].

3.2 Cancer Types and Immunogenetics

Cancer types exhibit intricate connections between immunogenetics, clinical attributes, and genetic factors. Breast cancer, a prevalent malignancy, demonstrates HLA associations such as HLA-A2, HLA-B7, and HLA-DRB1*15:01 alleles, alongside gene mutations (BRCA1, BRCA2, TP53, PIK3CA) that elevate


risk [25,30]. Clinical attributes encompass tumor size, grade, hormone receptor presence, and HER2 status. Tumor-infiltrating lymphocytes (TILs) and immune checkpoint molecules (PD-1, PD-L1, CTLA-4) play a substantial role in influencing breast cancer outcomes [32,47]. Similarly, colorectal cancer (CRC), originating in the colon or rectum, shows associations with HLA alleles (HLA-A11, HLA-B44) and gene mutations (APC, KRAS, TP53, BRAF) that impact its development [14,16]. The intricacies of CRC involve tumor attributes, lymph node involvement, TILs, and immune checkpoint molecules, all influencing disease progression [42]. Liver cancer (hepatocellular carcinoma), primarily affecting liver cells, displays connections with HLA alleles (HLA-A2, HLA-B35) and gene mutations (TP53, CTNNB1, AXIN1, ARID1A) [8,34]. Epigenetic DNA methylation abnormalities in CDKN2A, MGMT, and GSTP1 genes contribute to liver cancer's development [53]. Urothelial cancer, affecting bladder, ureters, and renal pelvis cells, is associated with HLA types (HLA-A2, HLA-B7) and gene mutations (FGFR3, TP53, RB1, PIK3CA) [9].

3.3 Algorithmic Pseudocode and Flowchart

The pseudocode to compute the sparse coding + k-mers-based embedding is given in Algorithm 1. The algorithm takes a set of sequences S = {s_1, s_2, ..., s_N} and k, where N is the number of sequences and k is the length of the k-mers. It computes the sparse embedding by iterating over all sequences and computing the set of k-mers for each sequence; see Fig. 2 (1-b). Then it iterates over all the k-mers, computes the one-hot encoding (OHE) based representation for each amino acid within a k-mer (Fig. 2 (1-c)), and concatenates it with the OHE embeddings of the other amino acids within the k-mer (Fig. 2 (1-d)). Finally, the OHE embeddings for all k-mers within a sequence are concatenated to get the final sparse coding-based representation (Fig. 2 (1-e)). To avoid the curse of dimensionality, we use Lasso regression as a dimensionality reduction technique [41] (Fig. 2 (f)). The objective function we use in Lasso regression is

\min(\text{sum of squared residuals} + \alpha \times |\text{slope}|)

where α × |slope| is the penalty term, which shrinks the coefficients of insignificant features to zero. We use k = 4 in the experiments, chosen with the standard validation set approach. The same process is repeated when we consider domain knowledge about cancer properties such as HLA types, gene mutations, clinical characteristics, immunological features, and epigenetic modifications. For each property, we generate a one-hot encoding-based representation, where the length of the vector equals the total number of possible property values; see Fig. 2 (2-b). All OHE representations of the property values are concatenated in the end to get the final representation of all properties we consider from domain knowledge; see Fig. 2 (2-c) to (2-d). The resulting embeddings from sparse coding (Algorithm 1) and domain knowledge are concatenated to obtain the final embedding, which is used to train machine-learning models for supervised analysis (Fig. 2 (g)).


Algorithm 1. Sparse Coding Algorithm

Input: set of sequences S, k-mers length k
Output: SparseEmbedding
 1: totValues ← 21                                    ▷ unique amino acid characters
 2: final_sparse_embedding ← []
 3: for i ← 0 to |S| do
 4:     seq ← S[i]
 5:     kmers ← generateKmers(seq, k)                 ▷ generate set of k-mers
 6:     encoded_kmers ← []
 7:     for kmer in kmers do
 8:         encodedVec ← np.zeros(totValues^k)        ▷ 21^3 = 9261 dimensional vector
 9:         for i, aa in enumerate(kmer) do
10:             pos ← i × totValues^(k−i−1)
11:             encodedVec[pos:pos+totValues] ← OneHotEncoding(aa)
12:         end for
13:         encoded_kmers.append(encodedVec)
14:     end for
15:     final_sparse_embedding.append(np.array(encoded_kmers).flatten())
16: end for
17: SparseEmbedding ← LassoRegression(final_sparse_embedding)   ▷ dim. reduction
18: return SparseEmbedding
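For illustration, a small Python sketch of the k-mer + one-hot step is given below. It simplifies Algorithm 1 by concatenating a 21-dimensional one-hot vector per residue inside each k-mer (k·21 values per k-mer) rather than laying each k-mer out in a 21^k-dimensional vector, and it omits the Lasso-based dimensionality reduction; the 21-symbol alphabet (20 standard amino acids plus a catch-all 'X') is an assumption.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWYX"   # 21 symbols; 'X' is a catch-all (assumption)
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}

def one_hot(aa):
    vec = np.zeros(len(AMINO_ACIDS))
    vec[AA_INDEX.get(aa, AA_INDEX["X"])] = 1.0
    return vec

def kmer_sparse_embedding(seq, k=4):
    """Slide a window of length k over the sequence, one-hot encode every
    amino acid inside each k-mer, and concatenate everything."""
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    return np.concatenate([one_hot(aa) for kmer in kmers for aa in kmer])

# Example: a short TCR CDR3 sequence from Table 1(b).
embedding = kmer_sparse_embedding("CASSRGQYEQYF", k=4)
print(embedding.shape)   # 9 k-mers x 4 residues x 21 symbols = (756,)
```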

Fig. 2. Flowchart of TCR sequence analysis

4 Experimental Setup

In this section, we describe the details related to the dataset, baseline models, evaluation metrics, and data visualization. We employ a variety of machine learning classifiers, encompassing Support Vector Machine (SVM), Naive Bayes (NB), Multi-Layer Perceptron (MLP), K-Nearest Neighbors (KNN) using K = 3, Random Forest (RF), Logistic Regression (LR), and Decision Tree (DT). We used a 70–30% train-test split based on stratified sampling to do classification. From the training set, we use 10% data as a validation set for hyperparameter tuning. Performance evaluation of the classifiers employed various evaluation metrics. The results are computed 5 times, and the average results are reported.
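A brief sketch of this evaluation protocol with scikit-learn is shown below; it is illustrative only, where X denotes the final embedding matrix and y the cancer labels, both assumed to be precomputed.

```python
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

# Stratified 70-30 train-test split, as described above (X, y assumed precomputed).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)
# 10% of the training data is held out as a validation set for hyperparameter tuning.
X_tr, X_val, y_tr, y_val = train_test_split(
    X_train, y_train, test_size=0.1, stratify=y_train, random_state=0)

for clf in (SVC(), RandomForestClassifier()):
    clf.fit(X_tr, y_tr)
    print(type(clf).__name__, accuracy_score(y_test, clf.predict(X_test)))
```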


The experiments were performed on a system with an Intel(R) Core i5 processor @ 2.10 GHz and a Windows 10 64-bit operating system with 32 GB of memory. The model is implemented in Python, and the code is available online at https://github.com/zara77/T-Cell-Receptor-Protein-Sequences-and-SparseCoding.git.

4.1 Dataset Statistics

We source sequence data from TCRdb, a comprehensive T Cell receptor sequence database with robust search functionality [10]. TCRdb consolidates 130+ projects and 8000+ samples, comprising an extensive sequence repository. We focus on four prevalent cancer types selected by incidence rate. We extract 23331 TCR sequences and maintain the original data proportions via stratified shuffle splits. Table 1(a) reports the dataset statistics. Our embedding approach employs TCR sequences, cancer names, and cancer properties. Table 1(b) gives example TCR sequences, cancer names, and gene mutations specific to each cancer type, covering the four cancers with the respective gene mutations that impact risk.

Table 1. (a) Dataset Statistics. (b) Examples of sequences for different cancer types, including Breast, Colorectal, Liver, and Urothelial, along with their respective gene mutations.

(a)
Cancer Name | Number of Sequences | Min. Length | Max. Length | Average Length
Breast      | 4363                | 8           | 20          | 14.2264
Colorectal  | 10947               | 7           | 26          | 14.5573
Liver       | 3520                | 8           | 20          | 14.3005
Urothelial  | 4501                | 7           | 24          | 14.6538
Total       | 23331               | –           | –           | –

(b)
Sequence        | Cancer Name | Gene Mutation
CASSRGQYEQYF    | Breast      | BRCA1, BRCA2, TP53, PIK3CA
CASSLEAGRAYEQYF | Colorectal  | APC, KRAS, TP53, BRAF
CASSLGSGQETQYF  | Liver       | TP53, CTNNB1, AXIN1
CASSGQGSSNSPLHF | Urothelial  | FGFR3, TP53, RB1, PIK3CA



(b)

(a)

To assess the effectiveness of our proposed system, we employed a variety of baseline methods, categorized as follows: (i) Feature Engineering methods: (onehot encoding (OHE) [28], Spike2Vec [1], PWM2Vec [2], and Spaced k-mers [46]), (ii) Kernel method (String kernel [15]), (iii) Neural network (Wasserstein Distance Guided Representation Learning (WDGRL) [45] and AutoEncoder [50]), (iv) Pre-trained Larger Language Model (SeqVec [20]), and (v) Pre-trained Transformer (ProteinBert [7]). 4.2

Data Visualization

To see if there is any natural (hidden) clustering in the data, we use t-distributed stochastic neighbor embedding (t-SNE) [48], which maps input sequences to 2D representation. The t-SNE plots for different embedding methods are shown in Fig. 3 for one-hot encoding (OHE), Spike2Vec, PWM2Vec, Spaced K-mer, 1

https://github.com/zara77/T-Cell-Receptor-Protein-Sequences-and-SparseCoding.git.

222

Z. Tayebi et al.

Autoencoder, and Sparse Coding, respectively. We can generally observe that OHE, Spike2Vec, PWM2Vec, and Autoencoder show smaller groups for different cancer types. However, the Spaced k-mers show a very scattered representation of data. Moreover, the proposed Sparse Coding approach does not show any scattered representation, preserving the overall structure better.

Fig. 3. t-SNE plots for different feature embedding methods. The figure best seen in color. Subfigures (a), (b), (c), (d), (e), and (f) are for OHE, Spike2Vec, PWM2Vec, Spaced kmers, Autoencoder (AE), and Sparse Coding (SC), respectively. (Color figure online)

5

Results and Discussion

In this section, we present the classification results obtained by our proposed system and compare them with various machine learning models. The results for different evaluation metrics are summarized in Table 2. This study introduces two novel approaches. First, we employ sparse coding (utilizing k-mers) for T Cell sequence classification. Second, we integrate domain knowledge into the embeddings. To our knowledge, these two methods have not been explored in the context of T Cells in existing literature. Consequently, we exclusively report results for the Sparse Coding method combined with domain knowledge embeddings. Our proposed technique (Sparse Coding) consistently outperforms all feature-engineering-based baseline models (OHE, Spike2Vec, PWM2Vec, Spaced k-mers) across every evaluation metric, except for training runtime. For example, Sparse Coding achieves 48.5% higher accuracy than OHE, 55.8% more than Spike2Vec, 58.4% more than PWM2Vec, and 48.6% more than Spaced k-mers when employing the LR classifier. This suggests that Sparse Coding-based feature embeddings retain biological sequence information more effectively for classification compared to feature-engineering-based baselines. Similarly, our technique outperforms the neural network-based baselines (WDGRL, Autoencoder). For instance, the WDGRL approach exhibits 53.03% lower accuracy than Sparse Coding, while Autoencoder attains 53.83% lower accuracy than Sparse Coding when using the SVM classifier. These results imply that feature vectors generated by neural network-based techniques are less efficient for classification compared to those produced by Sparse Coding. It’s worth noting that the SVM consistently delivers the best results among all classifiers, including WDGRL

T Cell Receptor Protein Sequences and Sparse Coding

223

Table 2. Classification results (averaged over 5 runs) for different evaluation metrics. The best values are shown in bold. Embeddings OHE

Algo. Acc. ↑ Prec. ↑ Recall ↑ F1 (Weig.) ↑

SVM NB MLP KNN RF LR DT Spike2Vec SVM NB MLP KNN RF LR DT PWM2Vec SVM NB MLP KNN RF LR DT Spaced k-mers SVM NB MLP KNN RF LR DT Autoencoder SVM NB MLP KNN RF LR DT WDGRL SVM NB MLP KNN RF LR DT

0.5101 0.2036 0.4651 0.4464 0.5156 0.5143 0.4199 0.4309 0.2174 0.4191 0.4397 0.5183 0.4404 0.4510 0.4041 0.2287 0.4053 0.4454 0.4994 0.4143 0.4339 0.5109 0.2157 0.4524 0.4527 0.5204 0.5121 0.4006 0.4597 0.2601 0.3996 0.3791 0.4457 0.4520 0.3131 0.4677 0.4469 0.4749 0.3930 0.4666 0.4676 0.3604

0.5224 0.3533 0.4368 0.4044 0.5003 0.5241 0.4129 0.4105 0.3535 0.4081 0.4105 0.5078 0.4189 0.4307 0.3760 0.3758 0.3820 0.3890 0.4745 0.3861 0.4078 0.5221 0.3713 0.4203 0.4078 0.5233 0.5053 0.4009 0.2113 0.3096 0.3132 0.3245 0.3116 0.3170 0.3119 0.2188 0.3231 0.4659 0.3415 0.4138 0.2187 0.3606

0.5101 0.2036 0.4651 0.4464 0.5156 0.5143 0.4199 0.4309 0.2174 0.4191 0.4397 0.5183 0.4404 0.4510 0.4041 0.2287 0.4053 0.4454 0.4994 0.4143 0.4339 0.5109 0.2157 0.4524 0.4527 0.5204 0.5121 0.4006 0.4597 0.2601 0.3996 0.3791 0.4457 0.4520 0.3131 0.4677 0.4469 0.4749 0.3930 0.4666 0.4676 0.3604

0.4073 0.0917 0.4370 0.4100 0.4521 0.4327 0.4160 0.4157 0.1931 0.4128 0.4087 0.4519 0.4251 0.4365 0.3836 0.2109 0.3888 0.3930 0.4370 0.3937 0.4140 0.4095 0.1296 0.4236 0.4132 0.4294 0.4318 0.4006 0.2896 0.2682 0.3226 0.3329 0.3023 0.2982 0.3124 0.2981 0.3397 0.3432 0.3523 0.3668 0.2980 0.3605

F1 (Macro) ↑ ROC AUC ↑

Train Time (sec.) ↓

0.3152 0.1213 0.3714 0.3354 0.3751 0.3492 0.3616 0.3543 0.2081 0.3490 0.3400 0.3738 0.3626 0.3745 0.3138 0.2220 0.3226 0.3069 0.3548 0.3237 0.3496 0.3143 0.1510 0.3550 0.3351 0.3393 0.3441 0.3433 0.1575 0.2317 0.2252 0.2445 0.1792 0.1712 0.2525 0.1593 0.2213 0.2184 0.2626 0.2578 0.1593 0.2921

790.5622 1.0560 221.0638 7.2748 18.7857 61.8512 1.0607 19241.67 6.1309 227.2146 40.4765 138.6850 914.7739 40.3568 10927.61 23.9892 253.6387 10.2005 126.5780 991.8051 34.9344 824.215 0.1883 207.685 3.3905 41.3547 25.7664 14.2816 14325.09 0.3491 110.76 5.9155 76.2106 98.5936 25.1653 15.34 0.0120 15.43 0.698 5.5194 0.0799 0.2610

0.5592 0.5107 0.5764 0.5617 0.5824 0.5701 0.5737 0.5711 0.5221 0.5671 0.5673 0.5836 0.5728 0.5805 0.5452 0.5273 0.5467 0.5475 0.5716 0.5484 0.5636 0.5582 0.5144 0.5663 0.5607 0.5679 0.5674 0.5629 0.5000 0.5005 0.5017 0.5068 0.5003 0.5004 0.5016 0.5000 0.5105 0.5163 0.5156 0.5255 0.4999 0.5304

(continued)

224

Z. Tayebi et al. Table 2. (continued)

Table 2. (continued)

Embeddings | Algo. | Acc. ↑ | Prec. ↑ | Recall ↑ | F1 (Weig.) ↑ | F1 (Macro) ↑ | ROC AUC ↑ | Train Time (sec.) ↓
String Kernel | SVM | 0.4597 | 0.2113 | 0.4597 | 0.2896 | 0.1575 | 0.5000 | 2791.61
String Kernel | NB | 0.3093 | 0.3067 | 0.3093 | 0.3079 | 0.2463 | 0.4980 | 0.2892
String Kernel | MLP | 0.3287 | 0.3045 | 0.3287 | 0.3121 | 0.2402 | 0.4963 | 125.66
String Kernel | KNN | 0.3683 | 0.3106 | 0.3683 | 0.3229 | 0.2330 | 0.5001 | 3.1551
String Kernel | RF | 0.4469 | 0.3251 | 0.4469 | 0.3041 | 0.1813 | 0.5010 | 55.3158
String Kernel | LR | 0.4451 | 0.3080 | 0.4451 | 0.3026 | 0.1787 | 0.5000 | 2.0463
String Kernel | DT | 0.3073 | 0.3082 | 0.3073 | 0.3077 | 0.2502 | 0.4998 | 17.0352
Protein Bert | – | 0.2624 | 0.4223 | 0.2624 | 0.1947 | 0.209 | 0.5434 | 987.354
SeqVec | SVM | 0.432 | 0.422 | 0.432 | 0.415 | 0.377 | 0.530 | 100423.017
SeqVec | NB | 0.192 | 0.530 | 0.192 | 0.086 | 0.117 | 0.505 | 77.271
SeqVec | MLP | 0.428 | 0.427 | 0.428 | 0.427 | 0.371 | 0.580 | 584.382
SeqVec | KNN | 0.432 | 0.385 | 0.432 | 0.391 | 0.306 | 0.544 | 227.877
SeqVec | RF | 0.525 | 0.524 | 0.525 | 0.445 | 0.360 | 0.576 | 350.057
SeqVec | LR | 0.420 | 0.414 | 0.420 | 0.417 | 0.356 | 0.571 | 90626.797
SeqVec | DT | 0.397 | 0.397 | 0.397 | 0.397 | 0.344 | 0.562 | 2626.082
Sparse Coding (ours) | SVM | 0.9980 | 0.9950 | 0.9980 | 0.9960 | 0.9970 | 0.9950 | 16965.48
Sparse Coding (ours) | NB | 0.9970 | 0.9960 | 0.9970 | 0.9970 | 0.9960 | 0.9980 | 404.2999
Sparse Coding (ours) | MLP | 0.9950 | 0.9950 | 0.9950 | 0.9950 | 0.9947 | 0.9956 | 5295.811
Sparse Coding (ours) | KNN | 0.9990 | 0.9990 | 0.9990 | 0.9990 | 0.9989 | 0.9991 | 346.7334
Sparse Coding (ours) | RF | 0.9990 | 0.9970 | 0.9990 | 0.9950 | 0.9950 | 0.9960 | 1805.593
Sparse Coding (ours) | LR | 0.9990 | 0.9940 | 0.9990 | 0.9980 | 0.9980 | 0.9940 | 134.8503
Sparse Coding (ours) | DT | 0.9970 | 0.9980 | 0.9970 | 0.9960 | 0.9980 | 0.9970 | 449.9304

Our method also surpasses the kernel-based baseline (String Kernel), achieving 53.83% higher accuracy with the SVM model. This highlights the efficiency of Sparse Coding in generating numerical vectors for protein sequences. Additionally, we compare Sparse Coding's performance with pre-trained models (SeqVec and Protein Bert); Sparse Coding outperforms them across all evaluation metrics except training runtime. This indicates that the pre-trained models struggle to generalize effectively to our dataset, resulting in lower predictive performance. As one of our key contributions is the incorporation of domain knowledge, we do not report results for domain knowledge combined with the baseline methods in this paper. However, our evaluation demonstrates that the proposed method consistently outperforms the domain knowledge + baseline methods in terms of predictive performance. Moreover, our proposed method without domain knowledge also outperforms the baselines without domain knowledge, thanks to the incorporation of k-mers-based features into the Sparse Coding-based embeddings, which enhances their richness.
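To make the k-mers-based representation concrete, the following minimal sketch builds a sparse count vector over all possible k-mers for a protein sequence. The k value, the 20-letter amino-acid alphabet, the toy sequence, and the helper names are illustrative assumptions; the paper's actual sparse coding and domain-knowledge weighting are not reproduced here.

```python
# Minimal sketch of a k-mer based sparse (count) encoding for protein sequences.
# The k value, alphabet, and helper names are assumptions, not the paper's exact setup.
from itertools import product
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def kmer_vocabulary(k: int) -> dict:
    """Enumerate all possible k-mers over the amino-acid alphabet."""
    return {"".join(p): i for i, p in enumerate(product(AMINO_ACIDS, repeat=k))}

def sparse_encode(sequence: str, k: int = 3) -> np.ndarray:
    """Return a sparse count vector with one dimension per possible k-mer."""
    vocab = kmer_vocabulary(k)
    vec = np.zeros(len(vocab), dtype=np.float32)
    for i in range(len(sequence) - k + 1):
        kmer = sequence[i:i + k]
        if kmer in vocab:            # skip k-mers containing non-standard residues
            vec[vocab[kmer]] += 1.0
    return vec

# Example: encode a toy TCR-like fragment (hypothetical sequence).
x = sparse_encode("CASSLGQAYEQYF", k=3)
print(x.shape, int(x.sum()))         # (8000,) and the number of counted 3-mers
```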

6 Conclusion

Our study introduced a novel approach for cancer classification by leveraging sparse coding and TCR sequences. We transformed TCR sequences into feature


vectors using k-mers and applied sparse coding to generate embeddings, incorporating cancer domain knowledge to enhance performance. Our method achieved a maximum accuracy of 99.9%, along with higher F1 and ROC AUC scores. While our immediate focus is accurate cancer-type classification, our findings have broader implications for personalized treatment and immunotherapy development. Future research can explore sparse coding in other biological data, and optimize it for diverse cancers and TCR sequences.

References 1. Ali, S., Patterson, M.: Spike2Vec: an efficient and scalable embedding approach for covid-19 spike sequences. In: IEEE Big Data, pp. 1533–1540 (2021) 2. Ali, S., Bello, B., et al.: PWM2Vec: an efficient embedding approach for viral host specification from coronavirus spike sequences. MDPI Biol. (2022) 3. Alley, E.C., Khimulya, G., et al.: Unified rational protein engineering with sequence-based deep representation learning. Nat. Methods 16(12), 1315–1322 (2019) 4. Bai, F., et al.: Use of peripheral lymphocytes and support vector machine for survival prediction in breast cancer patients. Transl. Cancer Res. 7(4) (2018) 5. van den Berg, J.H., Heemskerk, B., van Rooij, N., et al.: Tumor infiltrating lymphocytes (TIL) therapy in metastatic melanoma: boosting of neoantigen-specific T cell reactivity and long-term follow-up. J. Immunother. Cancer 8(2) (2020) 6. Bileschi, M.L., et al.: Using deep learning to annotate the protein universe. BioRxiv, p. 626507 (2019) 7. Brandes, N., Ofer, D., et al.: ProteinBERT: a universal deep-learning model of protein sequence and function. Bioinformatics 38(8), 2102–2110 (2022) 8. Bufe, S., et al.: PD-1/CTLA-4 blockade leads to expansion of CD8+ PD-1int TILs and results in tumor remission in experimental liver cancer. Liver Cancer (2022) 9. Carosella, E.D., Ploussard, G., LeMaoult, J., Desgrandchamps, F.: A systematic review of immunotherapy in urologic cancer: evolving roles for targeting of CTLA4, PD-1/PD-L1, and HLA-G. Eur. Urol. 68(2), 267–279 (2015) 10. Chen, S.Y., et al.: TCRdb: a comprehensive database for T-cell receptor sequences with powerful search function. Nucleic Acids Res. 49(D1), D468–D474 (2021) 11. Chourasia, P., Ali, S., Ciccolella, S., Vedova, G.D., Patterson, M.: Reads2Vec: efficient embedding of raw high-throughput sequencing reads data. J. Comput. Biol. 30(4), 469–491 (2023) 12. Courtney, A.H., Lo, W.L., Weiss, A.: TCR signaling: mechanisms of initiation and propagation. Trends Biochem. Sci. 43(2), 108–123 (2018) 13. De Visser, K.E., Eichten, A., Coussens, L.M.: Paradoxical roles of the immune system during cancer development. Nat. Rev. Cancer 6(1), 24–37 (2006) 14. Dunne, M.R., et al.: Characterising the prognostic potential of HLA-DR during colorectal cancer development. Cancer Immunol. Immunother. 69, 1577–1588 (2020) 15. Farhan, M., Tariq, J., Zaman, A., Shabbir, M., Khan, I.U.: Efficient approximation algorithms for strings kernel based sequence classification. In: Advances in Neural Information Processing Systems, vol. 30 (2017) 16. Fodde, R.: The APC gene in colorectal cancer. Eur. J. Cancer 38(7), 867–871 (2002)


17. Gittelman, R.M., Lavezzo, E., Snyder, T.M., Zahid, H.J., Carty, C.L., et al.: Longitudinal analysis of t cell receptor repertoires reveals shared patterns of antigenspecific response to SARS-CoV-2 infection. JCI Insight 7(10) (2022) 18. Gonzalez, H., et al.: Roles of the immune system in cancer: from tumor initiation to metastatic progression. Genes Dev. 32(19–20), 1267–1284 (2018) 19. Hee, B.J., Kim, M., et al.: Feature selection for colon cancer detection using kmeans clustering and modified harmony search algorithm. Mathematics 9(5), 570 (2021) 20. Heinzinger, M., Elnaggar, A., et al.: Modeling aspects of the language of life through transfer-learning protein sequences. BMC Bioinformatics 20(1), 1–17 (2019) 21. Hoadley, K.A., Yau, C., et al.: Cell-of-origin patterns dominate the molecular classification of 10,000 tumors from 33 types of cancer. Cell 173(2), 291–304 (2018) 22. Hu, M., et al.: Exploring evolution-based & -free protein language models as protein function predictors. arXiv preprint arXiv:2206.06583 (2022) 23. Iqbal, M.J., Faye, I., Samir, B.B., Md Said, A.: Efficient feature selection and classification of protein sequence data in bioinformatics. Sci. World J. 2014 (2014) 24. Janeway, C.A. Jr.: The major histocompatibility complex and its functions. In: Immunobiology: The Immune System in Health and Disease. 5th edn. Garland Science (2001) 25. Johnson, N., et al.: Counting potentially functional variants in BRCA1, BRCA2 and ATM predicts breast cancer susceptibility. Hum. Mol. Genet. 16(9), 1051–1057 (2007) 26. Kelly, T.K., De Carvalho, D.D., Jones, P.A.: Epigenetic modifications as therapeutic targets. Nat. Biotechnol. 28(10), 1069–1078 (2010) 27. Kidman, J., et al.: Characteristics of TCR repertoire associated with successful immune checkpoint therapy responses. Frontiers Immunol. 11, 587014 (2020) 28. Kuzmin, K., et al.: Machine learning methods accurately predict host specificity of coronaviruses based on spike sequences alone. Biochem. Biophys. Res. Commun. 533(3), 553–558 (2020) 29. Lee, A., et al.: BOADICEA: a comprehensive breast cancer risk prediction model incorporating genetic and nongenetic risk factors. Genet. Med. 21(8), 1708–1718 (2019) 30. Liang, H., Lu, T., Liu, H., Tan, L.: The relationships between HLA-A and HLA-B genes and the genetic susceptibility to breast cancer in Guangxi. Russ. J. Genet. 57, 1206–1213 (2021) 31. Lin, Z., Akin, H., Rao, R., et al.: Evolutionary-scale prediction of atomic-level protein structure with a language model. Science 379(6637), 1123–1130 (2023) 32. Loibl, S., Gianni, L.: HER2-positive breast cancer. Lancet 389(10087), 2415–2429 (2017) 33. Lu, Y.C., et al.: Single-cell transcriptome analysis reveals gene signatures associated with T-cell persistence following adoptive cell therapygene signatures associated with T-cell persistence. Cancer Immunol. Res. 7(11), 1824–1836 (2019) 34. Makuuchi, M., Kosuge, T., Takayama, T., et al.: Surgery for small liver cancers. In: Seminars in Surgical Oncology, vol. 9, pp. 298–304. Wiley Online Library (1993) 35. Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) 36. Min, S., Park, S., et al.: Pre-training of deep bidirectional protein sequence representations with structural information. IEEE Access 9, 123912–123926 (2021)


37. Nambiar, A., Heflin, M., Liu, S., Maslov, S., Hopkins, M., Ritz, A.: Transforming the language of life: transformer neural networks for protein prediction tasks. In: Proceedings of the 11th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics, pp. 1–8 (2020) 38. Olshausen, B.A., Field, D.J.: Sparse coding of sensory inputs. Curr. Opin. Neurobiol. 14(4), 481–487 (2004) 39. Ostrovsky-Berman, M., et al.: Immune2vec: embedding B/T cell receptor sequences in n using natural language processing. Frontiers Immunol. 12, 680687 (2021) 40. Peshkin, B.N., Alabek, M.L., Isaacs, C.: BRCA1/2 mutations and triple negative breast cancers. Breast Dis. 32(1–2), 25–33 (2011) 41. Ranstam, J., Cook, J.: Lasso regression. J. Br. Surgery 105(10), 1348 (2018) 42. Rotte, A.: Combination of CTLA-4 and PD-1 blockers for treatment of cancer. J. Exp. Clin. Cancer Res. 38, 1–12 (2019) 43. Schaafsma, E., et al.: Pan-cancer association of HLA gene expression with cancer prognosis and immunotherapy efficacy. Br. J. Cancer 125(3), 422–432 (2021) 44. Shah, K., Al-Haidari, A., Sun, J., Kazi, J.U.: T cell receptor (TCR) signaling in health and disease. Signal Transduct. Target. Ther. 6(1), 412 (2021) 45. Shen, J., Qu, Y., Zhang, W., et al.: Wasserstein distance guided representation learning for domain adaptation. In: AAAI Conference on Artificial Intelligence (2018) 46. Singh, R., et al.: GaKCo: a fast gapped k-mer string Kernel using counting. In: Joint European Conference on Machine Learning and Knowledge Discovery in Databases, pp. 356–373 (2017) 47. Stanton, S.E., Disis, M.L.: Clinical significance of tumor-infiltrating lymphocytes in breast cancer. J. Immunother. Cancer 4, 1–7 (2016) 48. Van, L., Hinton, G.: Visualizing data using t-SNE. J. Mach. Learn. Res. (JMLR) 9(11) (2008) 49. Wan, F., et al.: DeepCPI: a deep learning-based framework for large-scale in silico drug screening. Genomics Proteomics Bioinform. 17(5), 478–495 (2019) 50. Xie, J., Girshick, R., Farhadi, A.: Unsupervised deep embedding for clustering analysis. In: International Conference on Machine Learning, pp. 478–487 (2016) 51. Yang, X., Yang, S., Li, Q., Wuchty, S., Zhang, Z.: Prediction of human-virus protein-protein interactions through a sequence embedding-based machine learning method. Comput. Struct. Biotechnol. J. 18, 153–161 (2020) 52. Zhang, J., et al.: Recurrent neural networks with long term temporal dependencies in machine tool wear diagnosis and prognosis. SN Appl. Sci. 3, 1–13 (2021) 53. Zhu, J.D.: The altered DNA methylation pattern and its implications in liver cancer. Cell Res. 15(4), 272–280 (2005)

Weakly Supervised Temporal Action Localization Through Segment Contrastive Learning

Zihao Jiang1,2(B) and Yidong Li1,2(B)

1 Key Laboratory of Big Data and Artificial Intelligence in Transportation, Ministry of Education, Beijing, China
2 School of Computer and Information Technology, Beijing Jiaotong University, Beijing, China
{21120363,ydli}@bjtu.edu.cn

Abstract. Weakly-supervised temporal action localization (WS-TAL) aims to learn to localize actions with only video-level labels. In general, traditional methods process each snippet individually, ignoring both the relationships between different action snippets and the productive temporal contextual relationships that are critical for action localization. In this paper, we propose Bidirectional Exponential Moving Average (BEMA) to exploit contextual information and obtain more stable feature representations. In addition, we introduce an Inter-Segment Loss that refines the snippet representation in feature space to prevent misidentification of similar actions for accurate action classification, and an Intra-Segment Loss that separates action from background in feature space to locate precise temporal boundaries. Substantial analyses show that our strategy is effective and brings comparable performance gains with current state-of-the-art WS-TAL methods on the THUMOS'14 and ActivityNet v1.2 datasets. Our code can be found at https://github.com/JiangZhouHao/SCL.

Keywords: Temporal Action Localization · Weakly Supervised Learning · Contrastive Learning

1 Introduction

Temporal action localization (TAL) is one of the challenging tasks of video understanding, which aims to localize and classify action instances in untrimmed videos. Most methods [16,18,35,42] have achieved remarkable performance in a fully supervised manner, which requires annotated temporal boundaries of actions. In contrast, weakly-supervised TAL learns to localize actions only with video-level annotations. Compared with laborious frame-level annotations, video-level labels are much easier to collect, so weakly-supervised TAL has attracted increasing research interest.


Fig. 1. In the figure, the green boxes represent the action snippets in the segments selected from the individual videos, and the red boxes indicate the background snippets. Comparing actions between segments makes them better classified, while both the action/background clustering and the action-background comparison within segments make actions and background better distinguished. (Color figure online)

Existing approaches [24,30,31,33] for weakly-supervised TAL use video-level annotations and generate a sequence of class-specific scores, called the Temporal Class Activation Sequence (T-CAS). As a general rule, a classification loss is used to distinguish the foreground from the background to obtain the action regions, and this loss can be generated by two main approaches. One kind of approach [6,23,24,27,38] introduces an attention mechanism to separate foreground from background and then utilizes an action classifier to recognize videos. The other approaches [10,13,21,26,39] formulate the problem as a multi-instance learning task and treat the entire video as a bag containing multiple positive and negative instances, i.e., action frames and non-action frames. As mentioned above, both types of methods aim at learning effective classification functions to identify action instances from bags of action instances and non-action frames. However, there is still a huge performance gap between the weakly supervised and fully supervised methods. We argue that the reason may lie in the fact that indistinguishable snippets are easily misclassified and hurt the localization performance. To illustrate this, we take PlayingPiano and Cricket in Fig. 1 as an example: it is challenging to distinguish action instances from their contexts because they have similar features, so the classifier frequently regards these action contexts as action instances. To realize precise action classification and action context suppression with video-level labels, in this paper we propose segment contrastive learning. Specifically, we first generate action proposals using a classification branch; however, as mentioned above, this initial CAS is not capable of utilizing ambiguous context information. To tackle this issue, we apply BEMA to the initial CAS for obtaining


final results by combining the past, the present and the future. Second, inter- and intra-segment contrastive learning are presented to capture the temporal structures across and within segments at the feature modeling stage. Specifically, the intra-segment contrastive learning utilizes k-means clustering within proposals to better discriminate action and background snippets. Finally, by optimizing the proposed losses along with a general video-level action classification loss, our method successfully distinguishes action snippets from both other actions and background snippets. Moreover, our method shows considerable performance on both benchmarks.

The contributions of this paper are three-fold: (1) We introduce BEMA to effectively use context information for video understanding by combining the features of the previous and the following moments. (2) We propose segment contrastive learning, which fully considers the intra-segment and inter-segment relationships with only video-level labels by separating actions from different action classes as well as from the background. (3) We conduct comprehensive experiments on the THUMOS'14 and ActivityNet v1.2 datasets to validate the effectiveness of our proposed method.

Fig. 2. Overview of the proposed method, which comprises four parts. (a) The first step involves Feature Extraction and Embedding to obtain the embedded feature X_n^E; (b) Next, Action Proposal is performed to gather the class-agnostic action likelihood A_n^{final}; (c) Inter- and Intra-Segment Contrastive Learning to obtain more precise action boundaries; (d) Network Training driven by the Inter- and Intra-Segment Losses and the video-level action classification loss.

2 Related Work

Fully-Supervised Action Localization utilizes frame-level labels to classify and find temporal boundaries of action instances from long untrimmed videos. Most existing methods can be categorized into two groups: proposal-based (top-down) and frame-based (bottom-up) methods. Proposal-based methods [16,18,35,42] usually generate action proposals with pre-defined regularly distributed segments, then classify them to regress temporal boundaries. Frame-based methods [2,17] directly predict action category or actionness scores for each snippet and locate action boundaries following some post-processing strategies.

Weakly-Supervised Action Localization only requires video-level labels and has drawn increasing attention. UntrimmedNets [33] is the first method to address this problem by conducting snippet proposal classification to select relevant segments. STPN [24] improves UntrimmedNets by imposing a sparsity constraint for enforcing the sparsity of the selected segments. Hide-and-seek [31] tries to hide patches randomly in a training image. Zhong et al. [43] train a detector by driving a series of classifiers to achieve similar results. W-TALC [26] utilizes pairwise video similarity constraints via an attention-based mechanism to be complementary with Multiple Instance Learning.

Contrastive Learning uses internal patterns of the data to learn an embedding space in which relevant signals are clustered together and irrelevant signals are distinguished by Noise Contrastive Estimation (NCE). A contrastive learning framework proposed by CMC [32] aims to achieve view-invariant representations by maximizing mutual information between different views of the same scene. SimCLR [5] uses augmented views of other items in a minibatch to select negative samples. To eliminate the batch size limitation and ensure consistent use of negative samples, MoCo [7] uses a memory bank of old negative representations with a momentum update. The experimental results demonstrate that our method refines the representation of both action and background snippets, facilitating action localization.

3 Method

In this section, our method follows the feature extraction and feature embedding (Sect. 3.1), action proposal (Sect. 3.2) and Segment Contrastive Learning (Sect. 3.3) pipeline. The optimization loss terms for training and how the inference is performed are detailed in Sect. 3.4 and Sect. 3.5, respectively.

3.1 Feature Extraction and Embedding

Feature Extraction. For a set of N untrimmed videos {V_n}_{n=1}^N, the video-level labels are {y_n}_{n=1}^N, where y_n ∈ R^C is a multi-hot vector and C is the number of action categories. For an input untrimmed video V_n, we first divide it into L_n multi-frame non-overlapping snippets based on the sampling ratio. We keep a fixed number of T snippets {S_{n,t}}_{t=1}^T. Then the RGB features X_n^R = {x_t^R}_{t=1}^T and


the optical flow features X_n^O = {x_t^O}_{t=1}^T are extracted by a pre-trained feature extractor (e.g., I3D [3]), respectively. Here, x_t^R ∈ R^d and x_t^O ∈ R^d, where d is the feature dimension of each snippet. Next, we concatenate all snippet features to form the video feature X_n ∈ R^{T×2d}.

Feature Embedding. In order to project the extracted video features X_n into a task-specific feature space, we apply a function f_embed over X_n to map the original video feature X_n ∈ R^{T×2d} to the task-specific video feature X_n^E ∈ R^{T×2d}. f_embed is implemented with a single 1-D convolution followed by a ReLU activation function.

3.2 Action Proposal

Action Modeling. Following the previous work [26], we introduce a snippet-level concept, the Action Proposal, referring to the likelihood of an action instance for each snippet. After obtaining the embedded features X_n^E, we apply an action classification branch f_cls to generate a snippet-level Temporal Class Activation Sequence (T-CAS). Particularly, the branch contains a single 1-D convolution followed by a ReLU activation and a Dropout function. This can be formulated for a video V_n as follows:

A_n = f_cls(X_n^E; φ_cls)   (1)

where φ_cls represents the trainable parameters. The generated A_n ∈ R^{T×C} represents the action classification results of each snippet. Afterwards, a general way is to conduct binary classification on each snippet for action modeling. Since we can already obtain snippet-level classification from the generated T-CAS A_n ∈ R^{T×C} in Eq. 1, we simply sum the T-CAS along the class dimension followed by the Sigmoid activation function to aggregate class-agnostic scores and use them to represent the Action Proposal A_n^{act} ∈ R^T:

A_n^{act} = Sigmoid(f_sum(A_n))   (2)
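A minimal PyTorch-style sketch of Eqs. (1)-(2) is given below: a single 1-D convolution with ReLU and Dropout produces the T-CAS, which is then summed over classes and passed through a Sigmoid to give the class-agnostic Action Proposal. The kernel size, dropout rate, and feature dimension are assumptions, since the text only specifies the layer types.

```python
# Sketch of Eq. (1)-(2): snippet-level T-CAS and the class-agnostic Action Proposal.
# Kernel size, dropout rate and feature dimension are assumptions.
import torch
import torch.nn as nn

class ActionProposal(nn.Module):
    def __init__(self, feat_dim: int = 2048, num_classes: int = 20):
        super().__init__()
        self.cls_branch = nn.Sequential(
            nn.Conv1d(feat_dim, num_classes, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Dropout(0.5),
        )

    def forward(self, x_embed: torch.Tensor):
        # x_embed: (B, T, feat_dim) task-specific embedded features X_n^E
        cas = self.cls_branch(x_embed.transpose(1, 2)).transpose(1, 2)  # (B, T, C) T-CAS, Eq. (1)
        a_act = torch.sigmoid(cas.sum(dim=-1))                          # (B, T) Action Proposal, Eq. (2)
        return cas, a_act

cas, a_act = ActionProposal()(torch.randn(2, 750, 2048))
print(cas.shape, a_act.shape)   # torch.Size([2, 750, 20]) torch.Size([2, 750])
```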

Bidirectional Exponential Moving Average. The Exponential Moving Average is an averaging method that gives more weight to recent data. The recent data of the snippets in a video, i.e., the context, is of great significance for video understanding, and the importance of context decays over time. Therefore, we propose a simple but effective method to make full use of the context of the Action Proposal, named Bidirectional Exponential Moving Average (BEMA). We obtain historical and future information simply by shifting the proposal tensor, assign different temporal weights to the shifted tensors according to the moving distance, and finally sum them to obtain the final Action Proposal:

A_n^{final} = Σ_{m=0}^{M} t^m · shift(A_n^{act}, ±m)   (3)


where t represents the time decay, A_n^{final} ∈ R^T represents the final Action Proposal, and m represents the moving distance of the tensor. When the value of m is positive, the shift function moves the whole tensor backward; otherwise, the whole tensor is moved forward.
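The sketch below illustrates Eq. (3): the Action Proposal is shifted forward and backward and summed with exponentially decaying weights. The use of torch.roll for the shift and the handling of the m = 0 term are assumptions about details the text leaves open.

```python
# Sketch of BEMA (Eq. 3): shift the Action Proposal and sum with decaying weights.
import torch

def bema(a_act: torch.Tensor, t: float = 0.1, M: int = 2) -> torch.Tensor:
    """a_act: (B, T) class-agnostic Action Proposal; returns the smoothed A^final."""
    out = a_act.clone()                                         # m = 0 term (weight t^0 = 1)
    for m in range(1, M + 1):
        w = t ** m                                              # time decay grows with distance
        out = out + w * torch.roll(a_act, shifts=m, dims=1)     # past context
        out = out + w * torch.roll(a_act, shifts=-m, dims=1)    # future context
    return out

smoothed = bema(torch.rand(2, 750), t=0.1, M=2)
print(smoothed.shape)   # torch.Size([2, 750])
```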

3.3 Inter- and Intra-Segment Contrastive Learning

Pseudo Action/Background Segments Selection. Firstly, we use the same top-k mechanism as Uncertainty Modeling [14] to select easy pseudo action/background segments in the video based on A_n^{final}. Specifically, the top k^{act} segments in terms of the feature magnitude are treated as the easy pseudo action segments {s_{n,j} | j ∈ S^{EA}}, where S^{EA} indicates the set of pseudo action indices. Meanwhile, the bottom k^{bkg} segments are considered the pseudo background segments {s_{n,j} | j ∈ S^{EB}}, where S^{EB} denotes the set of indices for easy pseudo background. k^{act} and k^{bkg} represent the number of segments sampled for action and background, respectively. Then, we select hard pseudo action/background segments S^{HA}/S^{HB} in the video as a previous study [41] did.

Inter-Segment Contrastive Learning. After obtaining the pseudo action segments, we divide them into two parts, representing the top-k action segments X_n^a and the remainders X_n^p:

X_n^a = S_n^{EA}[k:]
X_n^p = S_n^{EA}[:k]   (4)

We can construct the anchor x_i and positive samples x_i^+, and then select the negative samples x_i^- based on the different labels ŷ_n ∈ R^C of these segments in the same batch:

x_i = X_i^a
x_i^+ = ⋃_{n≠i}^{N} { X_n^p | sum(ŷ_n · ŷ_i) = m, m > 0 }
x_i^- = ⋃_{n≠i}^{N} { X_n^p | sum(ŷ_n · ŷ_i) = 0 }   (5)
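A small sketch of the easy pseudo action/background selection described above (top-k and bottom-k snippets of A_n^{final}) follows; the values of k^{act} and k^{bkg} are placeholders, and the hard-segment mining of [41] is not shown.

```python
# Sketch of easy pseudo segment selection: top-k and bottom-k snippets of A^final.
# k_act / k_bkg are placeholder values; hard mining (S^HA / S^HB) from [41] is omitted.
import torch

def select_easy_segments(a_final: torch.Tensor, k_act: int = 40, k_bkg: int = 40):
    """a_final: (T,) final Action Proposal of one video.
    Returns index sets S^EA (easy action) and S^EB (easy background)."""
    s_ea = torch.topk(a_final, k_act, largest=True).indices    # highest action likelihood
    s_eb = torch.topk(a_final, k_bkg, largest=False).indices   # lowest action likelihood
    return s_ea, s_eb

a_final = torch.rand(750)
s_ea, s_eb = select_easy_segments(a_final)
print(s_ea.shape, s_eb.shape)   # torch.Size([40]) torch.Size([40])
```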

Intra-Segment Contrastive Learning. In previous work, CoLA [41] only regards the center of the easy actions as a positive sample during contrastive learning, which is effective for learning discriminative action-background separation. However, it does not make the most of intra-class consistency. Based on the above observations, we argue that making action and background snippets concentrated, respectively, may achieve better results. Inspired by [15], we propose a loss for clustering action and background snippets respectively. The K-means clustering algorithm is applied to S^{EA}, S^{EB}, S^{HA}, S^{HB}, and the clustering distribution is divided into two classes, representing action and background, each followed by its clustering center. The concentration estimation φ


is obtained from the clustering distribution {v_z}_{z=1}^{Z} and the clustering center c:

{v_z}_{z=1}^{Z} = Clustering(clus, X, index)
φ = Σ_{z=1}^{Z} ||v_z − c|| / (Z · log(Z + θ))   (6)
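A rough sketch of how the concentration estimate in Eq. (6) could be computed is given below; scikit-learn's KMeans stands in for the unspecified Clustering(·) routine, and the smoothing constant θ is an assumed value.

```python
# Sketch of Eq. (6): two-way K-means over selected snippet embeddings and the
# per-cluster concentration phi. KMeans stands in for Clustering(.); theta is assumed.
import numpy as np
from sklearn.cluster import KMeans

def concentration(features: np.ndarray, theta: float = 10.0):
    """features: (num_segments, dim) embeddings of pseudo action/background snippets."""
    km = KMeans(n_clusters=2, n_init=10).fit(features)
    phis = []
    for c in range(2):
        v = features[km.labels_ == c]                     # cluster members {v_z}
        center = km.cluster_centers_[c]
        z = len(v)
        dist = np.linalg.norm(v - center, axis=1).sum()   # sum_z ||v_z - c||
        phis.append(dist / (z * np.log(z + theta)))       # Eq. (6)
    return km, phis

km, phis = concentration(np.random.randn(160, 2048).astype(np.float32))
print([round(p, 3) for p in phis])
```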

3.4 Training Objectives

Our model is jointly optimized with three losses: 1) a video-level classification loss L_cls for action classification of videos, 2) the Inter-Segment loss L_Inter for separating the magnitudes of different action feature vectors, and 3) the Intra-Segment loss L_Intra for separating the magnitudes of action and background feature vectors. The overall loss function is as follows:

L_total = L_cls + α·L_Inter + β·L_Intra   (7)

Video-Level Classification Loss. To perform multi-label action classification, we employ the binary cross-entropy loss, utilizing normalized video-level labels [33] in the following manner:

L_cls = −(1/N) Σ_{n=1}^{N} Σ_{c=1}^{C} y_{n;c} · log p_c(v_n)   (8)

In this equation, p_c(v_n) denotes the predicted video-level softmax score for the c-th class of the n-th video, while y_{n;c} signifies the normalized video-level label for the c-th class of the n-th video.

Inter-Segment Loss. While contrastive learning has traditionally been applied at the image or patch level [1,9], our application employs contrastive learning at the segment level, utilizing the embeddings X_n^E. This approach is referred to as the Inter-Segment Loss, and its objective is to refine the segment-level features of different actions, resulting in a more discriminative feature distribution. Formally, queries x ∈ R^{1×2d}, positives x^+ ∈ R^{1×2d}, and S negatives x^- ∈ R^{1×2d} are selected from the segments shown in Eq. 5. As illustrated in Fig. 2, for Inter-Segment Learning, x ∼ x_i, x^+ ∼ x_i^+, x^- ∼ x_i^-. A classification problem with the cross-entropy loss is set up to represent the probability of a positive example being chosen over a negative one. Following [7], we use a temperature scale τ = 0.07 to calculate the distance between the queries and the other examples:

L(x, x^+, x^−) = −log [ exp(x^T·x^+/τ) / ( exp(x^T·x^+/τ) + Σ_{s=1}^{S} exp(x^T·x_s^−/τ) ) ]   (9)

where x^T is the transpose of x, and the proposed Inter-Segment Loss is as follows:

L_Inter = (1/N) Σ_{n=1}^{N} L(x_n, x_n^+, x_n^−)   (10)

where N represents the number of different action categories in the same batch.

Intra-Segment Loss. Clustering is first used to keep the actions and backgrounds close to their clustering centers, and a loss is then used to better separate actions from backgrounds within segments to obtain better localization results. The loss is calculated as follows:

L_Intra = Σ_{i=1}^{n} L(x, x^+, x^−) + L_c
L_c = −(1/M) Σ_{m=1}^{M} log [ exp(x_i^T·c_s^m / φ_s^m) / Σ_{j=0}^{r} exp(x_i^T·c_j^m / φ_j^m) ]   (11)

where n denotes the number of clusters and L_c represents the clustering loss. During implementation, we adopt a strategy similar to Noise Contrastive Estimation (NCE) and sample r negative prototypes to compute the normalization term. Additionally, we perform clustering on the samples M times with varying numbers of clusters K = {k_m}_{m=1}^{M}, which results in a more resilient probability estimation of prototypes that encode the hierarchical structure. To maintain the property of local smoothness and aid in bootstrapping clustering, we add the InfoNCE loss.
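As a concrete reference for the InfoNCE term of Eq. (9) and its average over categories in Eq. (10), a minimal sketch follows. The temperature τ = 0.07 comes from the text, while L2-normalization of the features and the batching scheme are assumptions; the clustering term L_c of Eq. (11) is omitted.

```python
# Sketch of Eq. (9)-(10): segment-level InfoNCE with temperature tau = 0.07.
# L2-normalising the features before the dot products is an assumption.
import torch
import torch.nn.functional as F

def info_nce(x: torch.Tensor, x_pos: torch.Tensor, x_neg: torch.Tensor, tau: float = 0.07):
    """x: (D,) query, x_pos: (D,) positive, x_neg: (S, D) negatives."""
    x, x_pos, x_neg = F.normalize(x, dim=-1), F.normalize(x_pos, dim=-1), F.normalize(x_neg, dim=-1)
    pos = torch.exp(x @ x_pos / tau)
    neg = torch.exp(x_neg @ x / tau).sum()
    return -torch.log(pos / (pos + neg))                 # Eq. (9)

def inter_segment_loss(queries, positives, negatives):
    """Average Eq. (9) over the N action categories in the batch (Eq. 10)."""
    return torch.stack([info_nce(q, p, n) for q, p, n in zip(queries, positives, negatives)]).mean()

q = [torch.randn(2048) for _ in range(4)]
p = [torch.randn(2048) for _ in range(4)]
neg = [torch.randn(8, 2048) for _ in range(4)]
print(inter_segment_loss(q, p, neg).item())
```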

3.5 Inference

At test time, for a given video, we first obtain its class activations to form the T-CAS and aggregate the top-k easy scores as described in Sect. 3.4. Then the categories with scores larger than θ_vid are selected, determining which action classes are to be localized. For each selected category, we threshold its corresponding T-CAS with θ_seg to obtain candidate video snippets. Finally, continuous snippets are grouped into proposals and Non-Maximum Suppression (NMS) is applied to remove duplicated proposals.
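The following sketch illustrates this inference pipeline (thresholding with θ_seg, grouping consecutive snippets, and temporal NMS); scoring each proposal by its mean snippet score is a simplification, not necessarily the paper's exact scoring function.

```python
# Sketch of inference: threshold per-snippet scores, merge consecutive snippets into
# candidate proposals, then apply temporal NMS. Mean-score proposal scoring is assumed.
import numpy as np

def snippets_to_proposals(scores: np.ndarray, theta_seg: float):
    """scores: (T,) per-snippet scores of one selected class -> [(start, end, score), ...]"""
    keep = scores > theta_seg
    proposals, start = [], None
    for t, k in enumerate(keep):
        if k and start is None:
            start = t
        if (not k or t == len(keep) - 1) and start is not None:
            end = t if not k else t + 1
            proposals.append((start, end, float(scores[start:end].mean())))
            start = None
    return proposals

def temporal_nms(proposals, iou_thr: float = 0.6):
    proposals = sorted(proposals, key=lambda p: p[2], reverse=True)
    kept = []
    for p in proposals:
        ok = True
        for q in kept:
            inter = max(0, min(p[1], q[1]) - max(p[0], q[0]))
            union = (p[1] - p[0]) + (q[1] - q[0]) - inter
            if union > 0 and inter / union > iou_thr:
                ok = False
                break
        if ok:
            kept.append(p)
    return kept

scores = np.random.rand(750)
print(len(temporal_nms(snippets_to_proposals(scores, theta_seg=0.5))))
```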

4 Experiments

4.1 Experimental Setups

Datasets. We evaluate our method on two popular action localization benchmark datasets: THUMOS'14 [11] and ActivityNet v1.2 [8]. THUMOS'14 includes untrimmed videos from 20 categories. The video length varies greatly, and multiple action instances may exist in each video. Following convention [13,29,41], we use the 200 videos in the validation set for training and the 213 videos in the testing set for evaluation. ActivityNet v1.2 has 4819 training videos and 2383 validation videos. Following [30,33], we use the training set to train our model and the validation set for evaluation.

Evaluation Metrics. We follow the standard evaluation protocol by reporting mean Average Precision (mAP) values over different intersection over union (IoU) thresholds to evaluate our weakly-supervised temporal action localization performance.


Table 1. Comparison results on THUMOS'14. The mAP values at different IoU thresholds are reported. AVG is the average mAP under the thresholds [0.1:0.7:0.2]. UNT and I3D denote the use of UntrimmedNets and I3D network as the feature extractor, respectively.

Supervision (Feature) | Method | Publication | [email protected] | [email protected] | [email protected] | [email protected] | AVG
Full (-) | R-C3D [35] | ICCV | 54.5 | 44.8 | 28.9 | – | –
Full (-) | SSN [42] | ICCV | 66.0 | 51.9 | 29.8 | – | –
Full (-) | TAL-NET [4] | CVPR | 59.8 | 53.2 | 42.8 | 20.8 | 45.1
Full (-) | P-GCN [40] | ICCV | 69.5 | 63.6 | 49.1 | – | –
Full (-) | G-TAD [36] | CVPR | – | 66.4 | 51.6 | 22.9 | –
Weak (-) | Hide-and-Seek [31] | ICCV | 36.4 | 19.5 | 6.8 | – | –
Weak (-) | UntrimmedNet [33] | CVPR | 44.4 | 28.2 | 13.7 | – | –
Weak (-) | Zhong et al. [43] | ACMMM | 45.8 | 31.1 | 15.9 | – | –
Weak (UNT) | AutoLoc [30] | ECCV | – | 35.8 | 21.2 | 5.8 | –
Weak (UNT) | CleanNet [20] | ICCV | – | 37.0 | 23.9 | 7.1 | –
Weak (UNT) | Bas-Net [13] | AAAI | – | 42.8 | 25.1 | 9.3 | –
Weak (I3D) | STPN [24] | CVPR | 52.0 | 35.5 | 16.9 | 4.3 | 27.0
Weak (I3D) | Liu et al. [19] | CVPR | 57.4 | 41.2 | 23.1 | 7.0 | 32.4
Weak (I3D) | Nguyen et al. [25] | ICCV | 60.4 | 46.6 | 26.8 | 9.0 | 36.3
Weak (I3D) | BaS-Net [13] | AAAI | 58.2 | 44.6 | 27.0 | 10.4 | 35.3
Weak (I3D) | DGAM [29] | CVPR | 60.0 | 46.8 | 28.8 | 11.4 | 37.0
Weak (I3D) | ActionBytes [12] | CVPR | – | 43.0 | 29.0 | 9.5 | –
Weak (I3D) | CoLA [41] | CVPR | 66.2 | 51.5 | 32.2 | 13.1 | 40.9
Weak (I3D) | Ours | – | 66.7 | 52.4 | 34.6 | 13.4 | 42.2

The evaluation utilizes the benchmark code provided by ActivityNet.

Training Details. We employ the I3D [3] network pretrained on Kinetics [3] for feature extraction. We apply the TV-L1 [28] algorithm to extract optical flow from RGB frames in advance. The Adam optimizer is used with a learning rate of 10^-4. The number of sampled snippets T for THUMOS'14 and ActivityNet v1.2 is set to 750 and 250, respectively. All hyper-parameters are determined by grid search. We train for a total of 4.5k epochs with a batch size of 16 for THUMOS'14 and for a total of 10k epochs with a batch size of 16 for ActivityNet v1.2.


Table 2. Comparison with state-of-the-art methods on ActivityNet v1.2. AVG is the average mAP under the thresholds [0.5:0.95:0.05]. UNT and I3D denote the use of UntrimmedNets and I3D network as the feature extractor, respectively.

Sup. | Method | [email protected] | [email protected] | [email protected] | AVG
Full | SSN [42] | 41.3 | 27.0 | 6.1 | 26.6
Weak (UNT) | UntrimmedNet [33] | 7.4 | 3.2 | 0.7 | 3.6
Weak (UNT) | AutoLoc [30] | 27.3 | 15.1 | 3.3 | 16.0
Weak (I3D) | Liu et al. [19] | 36.8 | 22.0 | 5.6 | 22.4
Weak (I3D) | Lee et al. [14] | 41.2 | 25.6 | 6.0 | 25.9
Weak (I3D) | DGAM [29] | 41.0 | 23.5 | 5.3 | 24.4
Weak (I3D) | VQK-Net [34] | 44.5 | 26.4 | 5.1 | 23.6
Weak (I3D) | CoLA [41] | 42.7 | 25.7 | 5.8 | 26.1
Weak (I3D) | WTCL [37] | 40.6 | 24.4 | 6.4 | 24.9
Weak (I3D) | Ours | 47.1 | 29.1 | 4.8 | 28.9

Testing Details. We set θ_vid to 0.2 and 0.1 for THUMOS'14 and ActivityNet v1.2, respectively. For proposal generation, the multiple thresholds θ_seg are set to [0:0.25:0.025] for THUMOS'14 and [0:0.15:0.015] for ActivityNet v1.2, respectively; we then perform non-maximum suppression (NMS) with an IoU threshold of 0.6.

4.2 Comparison with the State-of-the-Arts

We compare our method with state-of-the-art fully-supervised and weakly-supervised methods under several IoU thresholds. To enhance readability, all results are presented as percentages. Table 1 shows the results on THUMOS'14. As shown, our method achieves impressive performance among weakly-supervised methods at IoU 0.5 to 0.7 and 42.2% mAP@AVG. Notably, even when provided with a lower level of supervision, our method outperforms several fully-supervised methods, achieving results that closely follow the latest approaches with a minimal gap. The performance on ActivityNet v1.2 is reported in Table 2. Consistent with the results on THUMOS'14, our method outperforms all weakly-supervised approaches, and even SSN [42], a fully supervised method, by clear margins. We can see that our approach outperforms existing weakly supervised methods, including those that use additional information.


Fig. 3. The UMAP visualizations of the feature embeddings X_n^E. The left and right subfigures correspond to the baseline and our proposed method, respectively. As depicted in (A), the green and gray points correspond to the action and non-action embeddings, respectively. Furthermore, (B) depicts the easy and hard action embeddings in the same segment with dark and light green points, respectively, while the dark and light red points correspond to the easy and hard background embeddings in the same segment. Notably, our method yields a more discriminative feature distribution compared to the baseline. (Color figure online)

Table 3. Ablation on the number of time decay and moving distance. AVG is the average mAP under the thresholds [0.1:0.7:0.1]. "0" indicates the base model without the corresponding module. For., Bac. and Bid. represent using forward, backward and bidirectional, respectively.

t (m=1) | 0 | 0.05 | 0.1 | 0.15
AVG | 40.9 | 41.3 | 41.7 | 41.4

m (t=0.1) | 0 | 1 | 2 | 3
AVG | 40.9 | 41.7 | 41.8 | 41.6

BEMA | 0 | For. | Bac. | Bid.
AVG | 40.9 | 41.7 | 41.6 | 41.8


Table 4. Ablation analysis on each component on THUMOS'14.

Setting | Loss | [email protected]
Ours | Lcls + LInter + LIntra | 34.6%
baseline | Lcls | 24.7% (−9.9%)
BEMA | Lcls + LInter | 31.6% (−3.0%)
BEMA | Lcls + LIntra | 32.4% (−2.2%)
w/o BEMA | Lcls + LInter + LIntra | 33.8% (−0.8%)

4.3 Ablation Study

In this section, we perform several ablation studies to confirm our design intuition. Following common practice [24,29], all ablation experiments are conducted on THUMOS'14.

Effects of Bidirectional Exponential Moving Average. In Table 3, we investigate the contribution of each parameter of BEMA on THUMOS'14. Firstly, we can see from the table that both the forward and backward exponential moving averages bring some improvement in the results. We defined two operation degrees (with m and t) for the moving distance and time decay in Eq. 3. Here, we evaluate the effect of the different parameters: we first vary m from 0 to 3, then we change t from 0.1 to 0.2.

Effects of Loss Components and Score Calculation. To evaluate the effectiveness of our Inter- and Intra-Segment Losses, we perform a comparative analysis by only utilizing the action loss L_a as the supervisory signal, i.e., the baseline in Table 4. For this purpose, we randomly select two videos from the THUMOS'14 testing set and compute the feature embeddings X_n^E for both the baseline and our method. Subsequently, these embeddings are projected to a 2-dimensional space using UMAP [22], and the resulting visualizations are presented in Fig. 3.

4.4 Qualitative Results

To demonstrate the effectiveness of our action context modeling mechanism, we present some qualitative results in Fig. 4, where two examples from THUMOS'14 are illustrated: a sparse and a frequent action case. In both cases, we see that our model detects the action instances more precisely. More specifically, in the red boxes it can be noticed that the baseline splits one action instance into multiple incomplete detection results. On the contrary, our model provides a better separation between action and background via uncertainty modeling instead of using a separate class. Consequently, our model successfully localizes the complete action instances without splitting them.


Fig. 4. Qualitative comparison with baseline on THUMOS’14. We provide two different examples: (1) a sparse action case with JavelinThrow, and (2) a frequent action case with TennisSwing. Each example has three plots with sampled frames. The first plot displays the ground truth action intervals. The second plot shows the final scores and the detection results of the corresponding action class from the baseline, while the third plot represents those from our method. The horizontal axis in each plot represents the timesteps of the video, while the vertical axes represent the score values from 0 to 1. The gray dashed lines indicate the action frames that are classified by the baseline and our method. We can observe that some frames are misclassified by both methods, but our method detects more frames correctly.

5 Conclusion

In this paper, we propose a new framework to solve the single-snippet cheating problem in weakly supervised action localization. The introduced BEMA method integrates contextual information from past and future frames into each temporal point, resulting in a smoothed estimate of the action likelihood. Based on this processed estimate, we construct positive and negative samples between segments using the differences of video-level labels and apply the InfoNCE [7] loss to them; similarly, after constructing the action and background snippets within segments, we apply Intra-Segment contrastive learning to these snippets. The feature representation of the mined snippets is refined with the help of simple snippets located in the most discriminative regions. Experiments conducted on two benchmarks, THUMOS'14 and ActivityNet v1.2, validate the state-of-the-art performance of our method.


Acknowledgements. This work was supported by the National Natural Science Foundation of China under Grant U1934220 and U2268203.

References 1. Bachman, P., Hjelm, R.D., Buchwalter, W.: Learning representations by maximizing mutual information across views. In: Neural Information Processing Systems (2019) 2. Buch, S., Escorcia, V., Ghanem, B., Fei-Fei, L., Niebles, J.C.: End-to-end, singlestream temporal action detection in untrimmed videos. In: British Machine Vision Conference (2017) 3. Carreira, J., Zisserman, A.: Quo vadis, action recognition? A new model and the kinetics dataset. arXiv Computer Vision and Pattern Recognition (2017) 4. Chao, Y.-W., Vijayanarasimhan, S., Seybold, B., Ross, D.A., Deng, J., Sukthankar, R.: Rethinking the faster R-CNN architecture for temporal action localization. In: Computer Vision and Pattern Recognition (2018) 5. Chen, T., Kornblith, S., Norouzi, M., Hinton, G.E.: A simple framework for contrastive learning of visual representations. In: International Conference on Machine Learning (2020) 6. He, B., Yang, X., Kang, L., Cheng, Z., Zhou, X., Shrivastava, A.: ASM-LOC: action-aware segment modeling for weakly-supervised temporal action localization (2022) 7. He, K., Fan, H., Wu, Y., Xie, S., Girshick, R.: Momentum contrast for unsupervised visual representation learning. arXiv Computer Vision and Pattern Recognition (2019) 8. Heilbron, F.C., Escorcia, V., Ghanem, B., Niebles, J.C.: ActivityNet: a large-scale video benchmark for human activity understanding. In: Computer Vision and Pattern Recognition (2015) 9. H´enaff, O.J., Srinivas, A., Fauw, J.D., Razavi, A., Doersch, C., Eslami, S.M.A.: Data-efficient image recognition with contrastive predictive coding. arXiv Computer Vision and Pattern Recognition (2019) 10. Huang, L., Wang, L., Li, H.: Foreground-action consistency network for weakly supervised temporal action localization. arXiv Computer Vision and Pattern Recognition (2021) 11. Idrees, H., et al.: The THUMOS challenge on action recognition for videos “in the wild”. In: Computer Vision and Image Understanding (2017) 12. Jain, M., Ghodrati, A., Snoek, C.G.M.: ActionBytes: learning from trimmed videos to localize actions. In: Computer Vision and Pattern Recognition (2020) 13. Lee, P., Uh, Y., Byun, H.: Background suppression network for weakly-supervised temporal action localization. arXiv Computer Vision and Pattern Recognition (2019) 14. Lee, P., Wang, J., Lu, Y., Byun, H.: Weakly-supervised temporal action localization by uncertainty modeling. arXiv Computer Vision and Pattern Recognition (2020) 15. Li, J., Zhou, P., Xiong, C., Socher, R., Hoi, S.C.H.: Prototypical contrastive learning of unsupervised representations. Learning (2020) 16. Lin, T., Liu, X., Xin, L., Ding, E., Wen, S.: BMN: boundary-matching network for temporal action proposal generation. In: International Conference on Computer Vision (2019) 17. Lin, T., Zhao, X., Shou, Z.: Single shot temporal action detection. In: ACM Multimedia (2017)


18. Lin, T., Zhao, X., Su, H., Wang, C., Yang, M.: BSN: boundary sensitive network for temporal action proposal generation. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11208, pp. 3–21. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01225-0 1 19. Liu, D., Jiang, T., Wang, Y.: Completeness modeling and context separation for weakly supervised temporal action localization (2019) 20. Liu, Z.: Weakly supervised temporal action localization through contrast based evaluation networks. In: International Conference on Computer Vision (2019) 21. Luo, Z.: Weakly-supervised action localization with expectation-maximization multi-instance learning. arXiv Learning (2020) 22. McInnes, L., Healy, J.: UMAP: uniform manifold approximation and projection for dimension reduction, February 2018 23. Min, K., Corso, J.J.: Adversarial background-aware loss for weakly-supervised temporal activity localization. arXiv Computer Vision and Pattern Recognition (2020) 24. Nguyen, P.X., Liu, T., Prasad, G., Han, B.: Weakly supervised action localization by sparse temporal pooling network. arXiv Computer Vision and Pattern Recognition (2017) 25. Nguyen, P.X., Ramanan, D., Fowlkes, C.C.: Weakly-supervised action localization with background modeling. In: International Conference on Computer Vision (2019) 26. Paul, S., Roy, S., Roy-Chowdhury, A.K.: W-TALC: weakly-supervised temporal activity localization and classification (2018) 27. Qu, S., Chen, G., Li, Z., Zhang, L., Lu, F., Knoll, A.: ACM-Net: action context modeling network for weakly-supervised temporal action localization. arXiv Computer Vision and Pattern Recognition (2021) 28. S´ anchez, J., Meinhardt-Llopis, E., Facciolo, G.: TV-L1 optical flow estimation. Image Process. On Line (2013) 29. Shi, B., Dai, Q., Mu, Y., Wang, J.: Weakly-supervised action localization by generative attention modeling. arXiv Computer Vision and Pattern Recognition (2020) 30. Shou, Z., Gao, H., Zhang, L., Miyazawa, K., Chang, S.-F.: AutoLoc: weaklysupervised temporal action localization in untrimmed videos. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11220, pp. 162–179. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01270-0 10 31. Singh, K.K., Lee, Y.J.: Hide-and-seek: forcing a network to be meticulous for weakly-supervised object and action localization. In: International Conference on Computer Vision (2017) 32. Tian, Y., Krishnan, D., Isola, P.: Contrastive multiview coding. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12356, pp. 776–794. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58621-8 45 33. Wang, L., Xiong, Y., Lin, D., Gool, L.V.: UntrimmedNets for weakly supervised action recognition and detection. In: Computer Vision and Pattern Recognition (2017) 34. Wang, X., Katsaggelos, A.: Video-specific query-key attention modeling for weaklysupervised temporal action localization, May 2023 35. Xu, H., Das, A., Saenko, K.: R-C3D: region convolutional 3D network for temporal activity detection. In: International Conference on Computer Vision (2017) 36. Xu, M., Zhao, C., Rojas, D.S., Thabet, A., Ghanem, B.: G-TAD: sub-graph localization for temporal action detection. In: Computer Vision and Pattern Recognition (2019)


37. Yang, C., Zhang, W.: Weakly supervised temporal action localization through contrastive learning. In: 2022 IEEE 5th International Conference on Multimedia Information Processing and Retrieval (MIPR), August 2022 38. Yu, T., Ren, Z., Li, Y., Yan, E., Xu, N., Yuan, J.: Temporal structure mining for weakly supervised action detection. In: International Conference on Computer Vision (2019) 39. Yuan, Y., Lyu, Y., Shen, X., Tsang, I.W., Yeung, D.-Y.: Marginalized average attentional network for weakly-supervised learning. arXiv Computer Vision and Pattern Recognition (2019) 40. Zeng, R., et al.: Graph convolutional networks for temporal action localization. In: International Conference on Computer Vision (2019) 41. Zhang, C., Cao, M., Yang, D., Chen, J., Zou, Y.: CoLA: weakly-supervised temporal action localization with snippet contrastive learning. In: Computer Vision and Pattern Recognition (2021) 42. Zhao, Y., Xiong, Y., Wang, L., Wu, Z., Tang, X., Lin, D.: Temporal action detection with structured segment networks. In: International Conference on Computer Vision (2017) 43. Zhong, J.-X., Li, N., Kong, W., Zhang, T., Li, T.H., Li, G.: Step-by-step erasion, one-by-one collection: a weakly supervised temporal action detector. In: ACM Multimedia (2018)

S-CGRU: An Efficient Model for Pedestrian Trajectory Prediction

Zhenwei Xu1,2, Qing Yu1,3(B), Wushouer Slamu1,3, Yaoyong Zhou1,2, and Zhida Liu1,3

1 Xinjiang University, Xinjiang Uygur Autonomous Region, Urumqi 830000, People's Republic of China
{zhenweixu,107552104337,107552103689}@stu.xju.edu.cn, [email protected]
2 School of Software, Xinjiang University, Urumqi, People's Republic of China
3 School of Information Science and Engineering, Xinjiang University, Urumqi, People's Republic of China

Abstract. In the development of autonomous driving systems, pedestrian trajectory prediction plays a crucial role. Existing models still face challenges in accurately capturing complex pedestrian actions in different environments and in handling large-scale data and real-time prediction efficiently. To address this, we have designed a novel Complex Gated Recurrent Unit (CGRU) model, cleverly combining the spatial expressiveness of complex numbers with the efficiency of Gated Recurrent Unit networks to establish a lightweight model. Moreover, we have incorporated a social force model to further develop a Social Complex Gated Recurrent Unit (S-CGRU) model specifically for predicting pedestrian trajectories. To improve computational efficiency, we conducted an in-depth study of the pedestrian's attention field of view in different environments to optimize the amount of information processed and increase training efficiency. Experimental verification on six public datasets confirms that the S-CGRU model significantly outperforms other baseline models not only in prediction accuracy but also in computational efficiency, validating the practical value of our model in pedestrian trajectory prediction.

Keywords: Pedestrian Trajectory Prediction · Gated Recurrent Unit · Complex Number Neural Network · Autonomous Driving

1 Introduction

Trajectory prediction plays a vital role in many critical applications in today's society, such as autonomous driving [2,5], robotic path planning, behavior recognition, security surveillance, and logistics management. Parsing human motion trajectories is a research focus that involves multiple disciplines such as mathematics, physics, computer science, and social science [4]. Current prediction


methods are typically divided into experience/rule-based methods and deep learning-based methods. Experience or rule-based prediction methods have high interpretability [5–7], but they do not perform well in terms of data fitting ability, often resulting in limited prediction accuracy [11–13]. Deep learning-based methods can provide more accurate prediction results and can automatically adapt to changes in data, demonstrating strong robustness and generalization capabilities [1,8–10]. However, deep learning-based methods also have their limitations. First, such methods typically require a large amount of data to achieve optimal performance. Secondly, the prediction results often lack interpretability. In addition, these methods have high computational resource requirements [14].

In this paper, we propose a brand-new model, the Complex Gated Recurrent Unit (CGRU) model, which aims to combine the advantages of complex numbers and Gated Recurrent Units (GRU) to achieve efficient and accurate trajectory prediction. We leverage the powerful expressive ability of complex neural networks to capture complex patterns in complex space [16–20], thereby enabling the CGRU model to handle intricate trajectory prediction problems while maintaining high computational efficiency. On the other hand, by adopting the efficient simplicity of Gated Recurrent Unit networks [24], our model can effectively handle time-series data and model long-term dependencies. This model also considers the field of view of pedestrians walking in different environments, thus reducing the amount of information that the model needs to process and improving its operational efficiency. In summary, the main contributions of this paper can be encapsulated as follows:

– Innovatively proposed a Complex Gated Recurrent Unit (CGRU) model based on the concept of complex numbers. This deep neural network model combines the advantages of complex numbers in spatial representation and the efficiency of Gated Recurrent Unit networks, thereby constructing a lightweight and efficient Complex Gated Recurrent Unit model (CGRU).
– By integrating the social force model, we designed an S-CGRU model specifically for predicting pedestrian trajectories. This represents a significant extension and enhancement of traditional pedestrian trajectory prediction models.
– We conducted an in-depth study of the field of view of pedestrians walking in different environments, which led to optimization in data processing. This has significantly improved the training efficiency of our model.
– Empirical research results have verified the superiority of our model. Compared to other benchmark models, our model has shown excellent performance on six publicly available datasets, demonstrating its effectiveness and efficiency in pedestrian trajectory prediction.

2 Related Work

2.1 Complex-Valued Neural Network

The use of complex numbers has shown many advantages in various fields, including computation, biology, and signal processing. From a computational


perspective, recent research on recurrent neural networks and early theoretical analyses have revealed that complex numbers can not only enhance the expressive power of neural networks but also optimize the memory extraction mechanism for noise suppression [15]. An increasing number of studies are focusing on complex number-based representations because they can improve optimization efficiency [16], enhance generalization performance [17], accelerate learning rates [18–20], and allow for noise-suppressing memory mechanisms [19]. Studies in [20] and [18] show that the use of complex numbers can significantly improve the representational capabilities of Recurrent Neural Networks (RNNs). From a biological perspective, Reichert and Serre [21] constructed a theoretically sound deep neural network model that achieves richer and more universal representations through the use of complex neuron units, marking the first application of complex representations in deep networks.

Conventionally, if the network data is entirely real-valued, we can only describe the specific numerical value of the intermediate output. However, once the data is complex-valued, in addition to expressing the numerical magnitude of the intermediate output (the modulus of the complex number), we can also introduce the concept of time (the phase of the complex number). In neural network models, especially models like Gated Recurrent Units (GRU), the introduction of complex representations can greatly improve the ability to process periodic changes and frequency information, as these pieces of information can be encoded through the phase of the complex number. Neurons with similar phases work in synchrony, allowing for constructive superposition, while neurons with different phases undergo destructive superposition, causing interference with each other. This helps to differentiate and effectively manage the flow of information at different time steps, thereby enhancing the network's efficiency and accuracy in handling time-series data [40].

2.2 Recurrent Neural Networks

Recurrent Neural Networks (RNN) and their derivatives, such as Long Short-Term Memory (LSTM) networks [1] and Gated Recurrent Units (GRU) [24], are widely used for sequence prediction tasks. These RNN-based models have achieved remarkable results in fields like machine translation [25] and speech recognition [26]. RNNs are capable of capturing observed sequence patterns and generating predictive sequences based on these patterns. The Gated Recurrent Unit (GRU) has unique advantages in dealing with these problems. Compared to LSTM, the structure of GRU is simpler, with fewer parameters and more efficient computation. In addition, the gating mechanism of GRU allows it to excel at capturing long-distance dependencies [25]. Therefore, GRU is widely applied in various sequence prediction tasks, including pedestrian trajectory prediction. However, current GRU models still require further research and improvements to better understand and handle interactions between pedestrians.

2.3 Human-Human Interactions

Since the beginning of this line of research, researchers have recognized that the influence of surrounding neighbors must be fully considered when predicting the future dynamic behavior of agents. Early work introduced the concept of social forces [30], describing social interactions by simulating the repulsion and attraction between pedestrians. Subsequent research [31] introduced personal attributes, calculating the impact of stationary crowds on moving pedestrians by classifying pedestrians. Furthermore, S-LSTM [1] successfully integrated the original LSTM network with a time-step collection mechanism to simulate the social interaction of pedestrians. Since the social influence between pedestrians mainly depends on their distance from neighbors, research [32] proposed constructing a circular occupancy map to capture the influence of other pedestrians. In our research, we adopt the data-driven method proposed by [1] to learn the interactions between people more deeply. We then expand on this basis and further explore the field of view that pedestrians pay attention to during their movement. This approach can more accurately simulate the behavior of pedestrians in real life, because the direction and speed of pedestrian movement are often affected by objects and other pedestrians within their field of view. In addition, by reducing the irrelevant information that the model needs to process, it improves the prediction accuracy and computational efficiency of the model, making it more practical in real applications.

3 Method

3.1 Problem Definition

This paper aligns with the works of [1] and [8], assuming that there are N pedestrians in the scene during the time period $[1, t_{pred}]$. After preprocessing the pedestrian trajectories in the video, the position of each pedestrian $i$ at each time step $t$ is defined as a pair of spatial coordinates $P_t^i = (x_t^i, y_t^i)$, where $t \in \{1, 2, 3, \ldots, t_{pred}\}$ and $i \in \{1, 2, 3, \ldots, N\}$. The coordinates of each pedestrian in the scene are then divided into a past trajectory $X_i$ and a future trajectory $Y_i$, as shown in Eq. 1 and Eq. 2:

$X_i = \{P_t^i \mid t = 1, 2, 3, \ldots, t_{obs}\}$  (1)

$Y_i = \{P_t^i \mid t = t_{obs}+1, t_{obs}+2, t_{obs}+3, \ldots, t_{pred}\}$  (2)

Finally, this paper takes the past trajectories $\{X_i \mid i = 1, 2, 3, \ldots, N\}$ of the pedestrians in the scene as input, with the objective of generating future trajectories $\{\hat{Y}_i \mid i = 1, 2, 3, \ldots, N\}$ that closely resemble the actual future trajectories $\{Y_i \mid i = 1, 2, 3, \ldots, N\}$. The generated future trajectories are defined as in Eq. 3:

$\hat{Y}_i = \{\hat{P}_t^i \mid t = t_{obs}+1, t_{obs}+2, t_{obs}+3, \ldots, t_{pred}\}$  (3)
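To make the split of Eqs. 1–3 concrete, the following minimal Python sketch (not from the paper; the array shapes and the 8/12 frame split are illustrative assumptions) divides one pedestrian's trajectory into its observed and future parts.

```python
import numpy as np

def split_trajectory(traj, t_obs):
    """Split one pedestrian's trajectory into past (Eq. 1) and future (Eq. 2) parts.
    traj: array of shape (t_pred, 2) holding (x, y) per frame; t_obs: observed frames."""
    X_i = traj[:t_obs]        # observed positions P_1 ... P_{t_obs}
    Y_i = traj[t_obs:]        # ground-truth future positions P_{t_obs+1} ... P_{t_pred}
    return X_i, Y_i

# Example: 8 observed + 12 predicted frames, as in the evaluation protocol of Sect. 4.3.
traj = np.random.rand(20, 2)
X, Y = split_trajectory(traj, t_obs=8)
```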

3.2 Architecture Overview

We use an embedding dimension of 64 for the spatial coordinates before using them as input to the LSTM. We set the spatial pooling size N0 to 32 and use an 8 × 8 sum-pooling window without overlaps. We use a fixed hidden state dimension of 128 for all the LSTM models.

Fig. 1. At time step t, we integrate the neighbor information from the previous step and the current position of pedestrian i, then update and obtain the current hidden state of pedestrian i.

Figure 1 is a schematic diagram of the data flow of our model at time t. At time t, the inputs to the S-CGRU include the coordinates of pedestrian $i$, $P_t^i = (x_t^i, y_t^i)$, as well as the hidden state information at time $t-1$ (for example, $h_{t-1}^i$). We use a social pooling layer to acquire the hidden state information of other pedestrians surrounding pedestrian $i$ at time $t$ (for example, $h_{t-1}^a$, $h_{t-1}^b$, $h_{t-1}^c$). The social pooling layer over the hidden states implicitly infers the behavior of the nearby crowd, thereby adjusting the pedestrian's own path prediction. These nearby pedestrians are also influenced by their surrounding environment, and their behavior may change over time [8]. Next, we stack the hidden tensor at time t with the input data and feed them into the CGRU. After the CGRU processing, we obtain the hidden state at time t and the corresponding prediction results. Our model is built on the foundation of S-LSTM [1] but further optimizes the information processing and prediction mechanism. We particularly focus on how to more effectively extract information from the behavior of surrounding pedestrians, and how to integrate this information into our prediction model. By using the CGRU, we can more effectively utilize historical and environmental information, thereby improving the accuracy of pedestrian trajectory prediction.

3.3 Complex Gated Recurrent Unit

The Gated Recurrent Unit (GRU) is a variant of the Recurrent Neural Network (RNN) and was proposed by Kyunghyun Cho et al. [33]. The GRU solves


the problem of gradient vanishing or explosion that traditional RNNs may encounter when dealing with long sequences. The GRU introduces a gating mechanism to regulate the flow of information. Specifically, the GRU has two gates, the update gate and the reset gate:

– The update gate determines to what extent new input information is received. If the value of the update gate is close to 1, then the old memory is mainly retained; if it is close to 0, then the new input is mainly used.
– The reset gate determines how the previous hidden state is used when calculating the new candidate hidden state. If the value of the reset gate is close to 1, then most of the old memory is retained; if it is close to 0, then the old memory is ignored.

Compared with the Long Short-Term Memory network (LSTM), the structure of the GRU is more concise because it only has two gates and does not have a separate cell state [33]. This makes the GRU computationally more efficient and easier to train, while still retaining good performance on many tasks, making it a popular choice in many neural network architectures.

In the Complex Gated Recurrent Unit (CGRU), this paper goes beyond the traditional real-valued structure and extends it to the complex domain. In this way, each element contains not only a real part but also an imaginary part. This extension allows the model to perform calculations in the complex space, thus fully leveraging the unique characteristics of complex operations and demonstrating outstanding performance when processing temporal information, as shown in Fig. 2.

In the CGRU, we pass the input and hidden states separately into the linear transformation layers of the real and imaginary parts, and apply the sigmoid activation function to obtain the real and imaginary parts of the reset gate and update gate. This enables calculations to be performed in the complex space. The characteristics of complex calculations can help the network remember important information over the long term, thereby improving the network's performance on long-sequence problems. In the reset gate, since the real and imaginary parts of the complex hidden state can encode information independently, the complex reset gate can selectively forget information in higher dimensions. This may allow the CGRU to perform better when dealing with complex, high-dimensional sequence data. The calculation formulas for the real and imaginary parts of the reset gate are shown in Eq. 4 and Eq. 5:

$r_{r,t}^i = \sigma(W_{ir} * P_t^i - W_{hr} * h_{t-1}^i)$  (4)

$r_{i,t}^i = \sigma(W_{ii} * P_t^i - W_{hi} * h_{t-1}^i)$  (5)

$W_{ir}$, $W_{hr}$, $W_{ii}$, and $W_{hi}$ are the weights of the model, $P_t^i$ is the input, and $h_{t-1}^i$ is the previous hidden state. $r_{r,t}^i$ and $r_{i,t}^i$ are the real and imaginary parts of the reset gate, respectively.


Fig. 2. Complex Gated Recurrent Unit.

The update gate in the CGRU functions the same as in the GRU, except that it now operates on the complex forms of the hidden state and the input. This may make the CGRU perform better when handling complex, high-dimensional sequence data. The formulas for the update gate are shown in Eq. 6 and Eq. 7:

$u_{r,t}^i = \sigma(W_{uzr} * P_t^i - W_{hr} * h_{t-1}^i)$  (6)

$u_{i,t}^i = \sigma(W_{uzi} * P_t^i - W_{hi} * h_{t-1}^i)$  (7)

$W_{uzr}$, $W_{hr}$, $W_{uzi}$, and $W_{hi}$ are the weights of the model, $P_t^i$ is the input, and $h_{t-1}^i$ is the previous hidden state. $u_{r,t}^i$ and $u_{i,t}^i$ are the real and imaginary parts of the update gate, respectively. Based on the reset gate, we can calculate the real part $\hat{h}_{r,t}^i$ (Eq. 8) and the imaginary part $\hat{h}_{i,t}^i$ (Eq. 9) of the new candidate hidden state. Equation 10 describes how to combine the update gate, the old hidden state, and the new candidate hidden state in the complex space to obtain the new hidden state $h_t^i$:

$\hat{h}_{r,t}^i = \sigma(W_{ir} * P_t^i - r_{r,t}^i * W_{hr} * h_{t-1}^i)$  (8)

$\hat{h}_{i,t}^i = \sigma(W_{ir} * P_t^i - r_{i,t}^i * W_{hr} * h_{t-1}^i)$  (9)

$h_t^i = u_{r,t}^i * h_{t-1}^i - u_{i,t}^i * h_{t-1}^i + (1 - u_{r,t}^i) * \hat{h}_{r,t}^i - (1 - u_{i,t}^i) * \hat{h}_{i,t}^i$  (10)
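As a reading aid, the sketch below shows one way Eqs. 4–10 could be turned into a recurrent cell in PyTorch, keeping the real and imaginary parts of the hidden state as two real-valued tensors. The class name, the absence of biases, and the handling of the combined state from Eq. 10 are assumptions for this sketch, not the authors' implementation.

```python
import torch
import torch.nn as nn

class ComplexGRUCell(nn.Module):
    """Sketch of the CGRU update in Eqs. (4)-(10); real and imaginary parts are
    modelled as two separate real-valued tensors."""
    def __init__(self, input_dim, hidden_dim):
        super().__init__()
        self.W_ir = nn.Linear(input_dim, hidden_dim, bias=False)
        self.W_ii = nn.Linear(input_dim, hidden_dim, bias=False)
        self.W_hr = nn.Linear(hidden_dim, hidden_dim, bias=False)
        self.W_hi = nn.Linear(hidden_dim, hidden_dim, bias=False)
        self.W_uzr = nn.Linear(input_dim, hidden_dim, bias=False)
        self.W_uzi = nn.Linear(input_dim, hidden_dim, bias=False)

    def forward(self, p_t, h_prev):
        # p_t: (B, input_dim); h_prev: tuple (h_real, h_imag), each (B, hidden_dim)
        h_r, h_i = h_prev
        # Reset gate, Eqs. (4)-(5)
        r_r = torch.sigmoid(self.W_ir(p_t) - self.W_hr(h_r))
        r_i = torch.sigmoid(self.W_ii(p_t) - self.W_hi(h_i))
        # Update gate, Eqs. (6)-(7)
        u_r = torch.sigmoid(self.W_uzr(p_t) - self.W_hr(h_r))
        u_i = torch.sigmoid(self.W_uzi(p_t) - self.W_hi(h_i))
        # Candidate hidden state, Eqs. (8)-(9)
        h_cand_r = torch.sigmoid(self.W_ir(p_t) - r_r * self.W_hr(h_r))
        h_cand_i = torch.sigmoid(self.W_ir(p_t) - r_i * self.W_hr(h_i))
        # Combination in the complex space, Eq. (10). The excerpt yields a single
        # combined state; how it is split back into real/imaginary parts for the
        # next step is not specified, so that choice is left to the caller.
        h_new = u_r * h_r - u_i * h_i + (1 - u_r) * h_cand_r - (1 - u_i) * h_cand_i
        return h_new
```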

3.4 Exploring Pedestrians' Field of Vision in Different Environments

Understanding and considering the impact of other pedestrians within the field of vision is vital for predicting pedestrian motion patterns. Studies show that pedestrians typically adjust their trajectories based on the positions and movements of other pedestrians within their field of view to avoid collisions. For pedestrians behind an individual, in most environments their influence on the trajectory is usually considered negligible. In this paper, we focus on exploring to what extent the movement trajectory of a pedestrian is affected by other pedestrians within a certain angular range. Our aim is to clarify and quantify this influence so as to accurately simulate this interpersonal interaction behavior when building predictive models. This approach not only reduces the volume of information the model needs to process but also prevents unnecessary or irrelevant information from adversely impacting the accuracy of the prediction model. We hope that this method will enhance the accuracy of pedestrian motion trajectory predictions and provide a useful theoretical foundation and practical reference for related fields.

Figure 3a illustrates the scope of social pooling in [1, 8]: based on the hidden state of the LSTM encoder, it uses maximum pooling and then merges the relative position coordinates of each neighbor to simulate the interaction between individuals. In the models of [1, 8], taking Fig. 3a as an example, the emphasis is on analyzing the influence of other pedestrians on the trajectory of pedestrian P0 within a circle centered on the coordinates of P0 with a certain distance as the radius. In Fig. 3a, pedestrians P1, P2, P3, and P4 are all within this range, and they will affect the movement trajectory of P0. Conversely, since P5 is outside this range, we consider that P5 does not have a direct impact on the movement trajectory of P0.

Fig. 3. Pedestrian movement pooling ranges. Figure 3a is a demonstration of the social pooling range based on [1, 8], while Fig. 3b is a schematic diagram of the social pooling range proposed in this paper.

In the process of handling interactions, although pooling the information of all neighboring pedestrians seems to be a solution, we do not recommend this approach. Global information covering all pedestrians often contains a large amount of redundant or irrelevant data. For instance, the movement information of pedestrians within a certain angular range behind the target pedestrian may have a negative impact on the accuracy of prediction results [4]. Therefore, we propose a new method that more comprehensively considers the factors that may affect pedestrian interactions. This method pays particular attention to the field of view (FoV) of the pedestrian, only considering those within this FoV as influences on the pedestrian's future movements. Our assumption is based on the fact that people generally pay more attention to events in their forward direction and ignore those behind them. This effectively reduces the volume of unnecessary information and increases the computational efficiency of our model, making it more applicable to real-world scenarios. In our approach, we set a hypothesis: other pedestrians who influence the trajectory of the target pedestrian should be located within a sector region forming a certain angle in front of the target pedestrian, and we strive to find the optimal angle of this sector region. Taking Fig. 3b as an example, we explore how other pedestrians within a certain radius of the sector region (centered on pedestrian P0) influence P0's movement trajectory. Pedestrians in this area, such as P1 and P4, will affect the movement path of P0. However, for pedestrians outside the region, such as P2, P3, and P5, we believe they will not have a direct impact on P0's movement path. We further discuss the method used to determine the optimal angle of the sector region in different scenarios; more detailed information can be found in Sect. 4.5.
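The following sketch (illustrative, not the paper's code; the 120° angle and 4 m radius are placeholder values) shows how neighbours could be filtered to the frontal sector described above before pooling their hidden states.

```python
import numpy as np

def neighbors_in_fov(p0, heading, neighbors, fov_deg=120.0, radius=4.0):
    """Return the neighbours inside the sector of angle fov_deg (degrees) and given
    radius, centred on pedestrian p0 and oriented along its heading vector."""
    heading = heading / (np.linalg.norm(heading) + 1e-8)
    kept = []
    for q in neighbors:
        offset = q - p0
        dist = np.linalg.norm(offset)
        if dist == 0 or dist > radius:
            continue  # ignore the pedestrian itself and anyone beyond the radius
        cos_angle = np.dot(offset / dist, heading)
        if np.degrees(np.arccos(np.clip(cos_angle, -1.0, 1.0))) <= fov_deg / 2:
            kept.append(q)
    return kept
```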

3.5 Loss Functions

Similar to the loss function in [34], the goal of training our model is to minimize the negative log-likelihood loss, as shown in Eq. 11:

$L = -\sum_{t=t_{obs}+1}^{t_{pred}} \log P\left(x_t^i, y_t^i \mid \mu_t^i, \sigma_t^i, \rho_t^i\right)$  (11)
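A hedged PyTorch sketch of the loss in Eq. 11 follows, assuming the decoder outputs the five parameters of a bivariate Gaussian per time step; the log-std and tanh parameterisations used to keep σ positive and ρ in (−1, 1) are assumptions, not stated in the paper.

```python
import math
import torch

def bivariate_nll(x, y, mu_x, mu_y, log_sigma_x, log_sigma_y, rho_raw):
    """Negative log-likelihood of the ground-truth positions (x, y) under the
    predicted bivariate Gaussian (Eq. 11); all arguments have shape (T,)."""
    sigma_x = torch.exp(log_sigma_x)      # assumed: network predicts log-std
    sigma_y = torch.exp(log_sigma_y)
    rho = torch.tanh(rho_raw)             # assumed: squash correlation into (-1, 1)
    zx = (x - mu_x) / sigma_x
    zy = (y - mu_y) / sigma_y
    one_minus_rho2 = 1 - rho ** 2
    z = zx ** 2 + zy ** 2 - 2 * rho * zx * zy
    log_prob = -z / (2 * one_minus_rho2) \
               - torch.log(2 * math.pi * sigma_x * sigma_y * torch.sqrt(one_minus_rho2))
    return -log_prob.sum()                # summed over prediction steps as in Eq. (11)
```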

4 Experiments

4.1 Dataset and Evaluation Metrics

In the task of human trajectory prediction, this paper adopts six widely used datasets: Stanford Drone Dataset (SDD) [34], ETH datasets: Hotel and ETH [36], UCY datasets: UNIV, Zara1, and Zara2 [37]. The Stanford Drone Dataset (SDD) records videos within a university campus, containing six types of agents and rich interaction scenarios. In this dataset, there are approximately 185,000 interactions between agents and around 40,000 interactions between agents and


the environment. The ETH/UCY datasets contain human trajectories from five scenes, all of which record the movement paths of pedestrians in the world coordinate system. These datasets provide a variety of scenarios and rich interactions, offering a broad sample that helps us study, understand, and predict human behavioral trajectories more deeply.

Table 1. Datasets for human trajectory prediction

Category                 Dataset Name
Stanford Drone Dataset   SDD
ETH datasets             Hotel, ETH
UCY datasets             UNIV, Zara1, Zara2

In line with previous studies [1, 8], this paper uses two main metrics to evaluate model performance: 1) Average Displacement Error (ADE) and 2) Final Displacement Error (FDE). ADE evaluates the average L2 distance between the predicted trajectory and the actual trajectory, computed in the real-world coordinate system; it quantifies the average difference between the model's predicted trajectory and the actual trajectory, thereby reflecting the overall prediction accuracy of the model (Eq. 12). FDE, on the other hand, focuses on the endpoint of the predicted trajectory: it calculates the L2 distance between the last point of the predicted trajectory and the last point of the actual trajectory in the real-world coordinate system, and thus measures the model's accuracy in predicting the trajectory endpoint (Eq. 13). Together, these two metrics evaluate the model's overall performance in trajectory prediction and the accuracy of its endpoint prediction.

$ADE = \frac{\sum_{t=t_{obs}+1}^{t_{pred}} \lVert p_t^i - \hat{p}_t^i \rVert}{t_{pred} - t_{obs}}$  (12)

$FDE = \lVert p_{t_{pred}}^i - \hat{p}_{t_{pred}}^i \rVert$  (13)
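The two metrics follow directly from Eqs. 12 and 13; the short NumPy sketch below (array shapes are assumptions) illustrates this for a single pedestrian over the prediction horizon.

```python
import numpy as np

def ade_fde(pred_future, gt_future):
    """ADE/FDE of Eqs. (12)-(13) for one pedestrian. Both arrays have shape
    (t_pred - t_obs, 2): predicted and ground-truth positions over the horizon."""
    dists = np.linalg.norm(pred_future - gt_future, axis=-1)  # L2 error per time step
    return dists.mean(), dists[-1]                            # ADE, FDE
```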

In order to maximize the use of these datasets and optimize our model, we adopt a special form of cross-validation known as the leave-one-out method [38]. Our approach is to train and validate the model on five datasets, then test on the remaining one. This process is iterated on each of the six datasets. Similarly, we also apply this training and testing strategy to the baseline methods used for model comparison.

4.2 Baselines

In order to evaluate the effectiveness of our proposed S-CGRU model, we compare it with several baseline methods for trajectory prediction:

– Linear Model (Lin.): a pre-set Kalman filter that extrapolates trajectories under the assumption of linear acceleration.
– LSTM: an LSTM applied to recurrently predict each pedestrian's future location from historical locations.
– S-LSTM [1]: a trajectory prediction model that combines LSTM with a social pooling layer, capable of integrating the hidden states of neighboring pedestrians.
– S-GRU: a novel GRU-based pedestrian trajectory prediction model that we propose; it serves as a baseline for validating the superior performance of the Complex Gated Recurrent Unit (CGRU) that we designed.

The comparison with the above four methods helps to comprehensively evaluate the performance of our proposed S-CGRU model.

4.3 Evaluations on ADE/FDE Metrics

During the testing phase, we track trajectories for 3.2 s and predict their paths for the next 4.8 s. At a frame interval of 0.4 s, this is equivalent to observing 8 frames and predicting the next 12 frames, which matches the setting in [8]. In Table 2, the performance of our model is compared with the baseline methods.

Table 2. Results on ETH, UCY and SDD based on standard sampling. 20 samples are used in prediction and the minimal error is reported.

Methods   Lin          LSTM         S-LSTM [1]   S-GRU (ours)   S-CGRU (ours)
Metrics   ADE   FDE    ADE   FDE    ADE   FDE    ADE   FDE      ADE   FDE
ETH       1.31  1.35   0.96  1.36   0.82  1.11   0.96  1.24     0.54  0.73
Hotel     1.98  2.20   0.76  1.36   0.56  0.95   0.78  1.10     0.49  0.61
UNIV      0.71  0.62   0.65  0.68   0.34  0.42   0.56  0.83     0.34  0.39
ZARA1     1.20  1.76   1.09  1.84   0.56  0.95   0.76  1.12     0.49  0.59
ZARA2     0.72  0.98   0.82  1.18   0.40  0.54   0.65  0.73     0.39  0.44
SDD       0.83  1.12   0.62  0.93   0.50  0.75   0.60  0.82     0.48  0.63
AVG       1.13  1.34   0.82  1.23   0.53  0.79   0.72  0.97     0.46  0.57

From the results in Table 2, we can clearly see that our proposed S-CGRU model significantly outperforms the baseline model S-LSTM on all metrics. This fully validates the effectiveness of our proposed model in pedestrian trajectory


prediction tasks. Firstly, compared to S-LSTM, our S-CGRU model is significantly improved in terms of prediction accuracy. This suggests that, by utilizing Complex Gated Recurrent Units (CGRU), S-CGRU is able to more accurately capture and predict pedestrian movement trajectories, which can be attributed to the CGRU's strong performance in handling sequential data, especially in capturing long-term dependencies. Secondly, the S-CGRU model offers better generalization performance: regardless of the scenario or the number of pedestrians, S-CGRU consistently outputs high-quality predictions, indicating that our model adapts well to a variety of circumstances and environments. Finally, our S-CGRU model has excellent real-time performance. Compared to S-LSTM, even though the two are close in computational complexity, our optimizations to the model structure allow S-CGRU to generate prediction results faster in practical applications, thereby meeting the needs of real-time applications. In summary, our S-CGRU model outperforms S-LSTM in pedestrian trajectory prediction in terms of prediction accuracy, generalization ability, and real-time performance, overcoming the limitations of traditional GRU models. This validates the superiority of our model design and its application potential in pedestrian trajectory prediction tasks.

4.4 Model Parameter Amount and Inference Speed

To accurately assess inference speed, we conducted an in-depth comparison between our S-CGRU model and other publicly available models that can serve as benchmarks, considering two key indicators: parameter scale and inference speed. The test data are derived from the time-series data by dense sampling with a time step of 1 and a window size of 20 (comprising the observation period $T_{obs} = 8$ and the prediction period $T_{pred} = 12$). When testing inference speed, we calculate the average inference time over all data segments. Through such testing and comparison, we found that the S-CGRU model shows significant advantages. Since the S-CGRU model only contains two gates, reset and update, the number of its parameters is drastically reduced. Fewer parameters mean fewer variables need to be adjusted during training, which can significantly speed up model training. Moreover, fewer parameters make the S-CGRU model more lightweight and thus more suitable for running in resource-constrained environments, improving its practicality. More importantly, the S-CGRU model performs excellently in terms of inference speed. Fast inference not only provides faster feedback in real-time applications but also shows that the model maintains high efficiency when handling large amounts of data, which helps improve overall work efficiency. This is especially important in application scenarios where quick decisions are needed or large amounts of data must be processed. In summary, the performance of the S-CGRU model in terms of parameter count and inference speed gives it wide potential and practical value in various application scenarios. Detailed results are shown in Table 3.
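A minimal sketch of how such per-batch timing could be measured is shown below; the loader contents, device handling, and batch structure are assumptions for illustration, not the authors' benchmarking code.

```python
import time
import torch

@torch.no_grad()
def average_inference_time(model, loader, device="cuda"):
    """Average forward-pass time per batch, synchronising the GPU around each call."""
    model.eval().to(device)
    total, batches = 0.0, 0
    for obs_traj, _ in loader:                 # assumed (observed, future) pairs
        obs_traj = obs_traj.to(device)
        if device == "cuda":
            torch.cuda.synchronize()
        start = time.perf_counter()
        model(obs_traj)
        if device == "cuda":
            torch.cuda.synchronize()
        total += time.perf_counter() - start
        batches += 1
    return total / max(batches, 1)
```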


Table 3. Comparison of parameter amount and inference speed on the ETH, UCY and SDD datasets.

Methods          Parameters (k)   Speed (time/batch)
S-LSTM           264              2.2
S-CGRU (ours)    231              1.6

4.5 The Range of Visual Focus of Pedestrians While Walking in Different Scenarios

We conducted in-depth research on six datasets: ETH, Hotel, UNIV, ZARA1, ZARA2, and SDD, aiming to reveal the range of vision during pedestrian walking in different scenarios. These datasets cover a variety of environments, such as schools, hotels, streets, and intersections, providing a wealth of pedestrian behaviors and movement patterns. In our research, we systematically adjusted the angle of the pedestrian's field of view to simulate the degree to which pedestrians might be influenced by others around them. Then, through training and evaluation, we recorded the resulting pedestrian trajectory prediction performance. This method allows us to infer, from the changes in prediction performance, the main field of view that pedestrians pay attention to while walking. Detailed results are shown in Table 4.

Table 4. Optimal field of view angles for different scenes on the ETH, UCY and SDD datasets.

Scene (Dataset)       Optimal Field of View Angle (Degrees)
Street (ETH)          85
University (UNIV)     135
Hotel (Hotel)         75
Intersection (SDD)    120

We found that in various environments, the range of pedestrian vision is influenced by the surrounding environment. For example, in crowded environments such as hotels and street scenes, the range of pedestrian vision may narrow, mainly focusing on obstacles or other pedestrians directly in front. In contrast, in open environments like intersections, the range of pedestrian vision may expand as they need to consider information from more directions. Additionally, we discovered that the direction of pedestrian movement, speed, and individual characteristics (such as vision conditions) may also affect their field of view. These findings have significant implications for our model design. Firstly, understanding the range of pedestrian vision can help us better understand and predict their behavior. Secondly, based on understanding the field of view, we can design more accurate and practical pedestrian trajectory prediction models.


By considering the range of pedestrian vision, our model can make more precise predictions about pedestrian behavior in different environments, thereby improving the accuracy of prediction results.

5 Conclusions and Future Works

This research primarily explores the application of Complex Gated Recurrent Units (CGRU) in pedestrian trajectory prediction and proposes a new S-CGRU model that effectively integrates interpersonal interaction information and scene information. We first detailed the basic concepts and theories of complex-valued neural networks and Gated Recurrent Units (GRU) and explored the advantages of using complex parameters in neural network models. Subsequently, we investigated the range of vision that pedestrians pay attention to while walking, providing a new theoretical perspective for understanding pedestrian behavior. Our results show that the S-CGRU model has a significant advantage over the baseline model S-LSTM in pedestrian trajectory prediction. In the future, we will focus on person-scene interaction modeling (the interaction between pedestrians and their surroundings, and between pedestrians and vehicles). We hope to improve prediction performance by combining person-scene interaction modeling with our S-CGRU model. Moreover, predicting pedestrian trajectories by incorporating pedestrian movement intentions is also a challenge that we need to address.

Acknowledgements. This work is supported by the National Natural Science Foundation of China (61562082), the Joint Funds of the National Natural Science Foundation of China (U1603262), and the "Intelligent Information R&D Cross-disciplinary Project" (Project Number: 202104140010). We thank all anonymous reviewers for their constructive comments.

References 1. Alahi, A., Goel, K., Ramanathan, V., Robicquet, A., Fei-Fei, L., Savarese, S.: Social LSTM: human trajectory prediction in crowded spaces. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 961–971 (2016) 2. Deo, N., Trivedi, M.M.: Convolutional social pooling for vehicle trajectory prediction. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pp. 1468–1476 (2018) 3. Rudenko, A., Palmieri, L., Herman, M., Kitani, K.M., Gavrila, D.M., Arras, K.O.: Human motion trajectory prediction: a survey. Int. J. Robot. Res. 39(8), 895–935 (2020) 4. Yue, J., Manocha, D., Wang, H.: Human trajectory prediction via neural social physics. arXiv preprint arXiv:2207.10435 (2022) 5. Helbing, D., Molnar, P.: Social force model for pedestrian dynamics. Phys. Rev. E 51(5), 4282 (1995)

258

Z. Xu et al.

6. van den Berg, J., Lin, M., Manocha, D.: Reciprocal velocity obstacles for real-time multi-agent navigation. In: 2008 IEEE International Conference on Robotics and Automation (2008) 7. He, F., Xia, Y., Zhao, X., Wang, H.: Informative scene decomposition for crowd analysis, comparison and simulation guidance. ACM Trans. Graph. (TOG) 39(4) (2020) 8. Gupta, A., Johnson, J., Fei-Fei, L., Savarese, S., Alahi, A.: Social GAN: socially acceptable trajectories with generative adversarial networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2255–2264 (2018) 9. Sadeghian, A., Kosaraju, V., Sadeghian, A., Hirose, N., Rezatofighi, H., Savarese, S.: SoPhie: an attentive GAN for predicting paths compliant to social and physical constraints. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 1349–1358 (2019) 10. Mangalam, K., An, Y., Girase, H., Malik, J.: From goals, waypoints & paths to long term human trajectory forecasting. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 15233–15242 (2021) 11. Van Toll, W., Pettré, J.: Algorithms for microscopic crowd simulation: advancements in the 2010s. Comput. Graph. Forum 40(2), 731–754 (2021) 12. Wolinski, D., Guy, S.J., Olivier, A.H., Lin, M., Manocha, D., Pettré, J.: Parameter estimation and comparative evaluation of crowd simulations. Comput. Graph. Forum 33(2), 303–312 (2014) 13. He, F., Xia, Y., Zhao, X., Wang, H.: Informative scene decomposition for crowd analysis, comparison and simulation guidance. ACM Trans. Graph. (TOG) 39(4), 50:1–50:13 (2020) 14. Korbmacher, R., Tordeux, A.: Review of pedestrian trajectory prediction methods: comparing deep learning and knowledge-based approaches. IEEE Trans. Intell. Transp. Syst. 23(12), 24126–24144 (2022) 15. Bengio, Y., Pal, C.J.: Deep complex networks. In: International Conference on Learning Representations (ICLR) (2018) 16. Nitta, T.: On the critical points of the complex-valued neural network. In: Neural Information Processing (2002) 17. Hirose, A., Yoshida, S.: Generalization characteristics of complex-valued feedforward neural networks in relation to signal coherence. IEEE Trans. Neural Netw. Learn. Syst. 23(4), 541–551 (2012) 18. Arjovsky, M., Shah, A., Bengio, Y.: Unitary evolution recurrent neural networks. arXiv preprint arXiv:1511.06464 (2015) 19. Danihelka, I., Wayne, G., Uria, B., Kalchbrenner, N., Graves, A.: Associative long short-term memory. arXiv preprint arXiv:1602.03032 (2016) 20. Wisdom, S., Powers, T., Hershey, J., Roux, J.L., Atlas, L.: Full-capacity unitary recurrent neural networks. In: Advances in Neural Information Processing Systems, pp. 4880–4888 (2016) 21. Reichert, D.P., Serre, T.: Neuronal synchrony in complex-valued deep networks. arXiv preprint arXiv:1312.6115 (2013) 22. Srivastava, R.K., Greff, K., Schmidhuber, J.: Training very deep networks. In: Advances in Neural Information Processing Systems, pp. 2377–2385 (2015) 23. Cho, K., Van Merriënboer, B., Bahdanau, D., Bengio, Y.: On the properties of neural machine translation: encoder-decoder approaches. arXiv preprint arXiv:1409.1259 (2014) 24. Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Comput. 9(8), 1735–1780 (1997)


25. Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. In: 3rd International Conference on Learning Representations (2015) 26. Kipf, T.N., Welling, M.: Semi-supervised classification with graph convolutional networks. In: International Conference on Learning Representations (2017) 27. Antonini, G., et al.: Discrete choice models of pedestrian walking behavior. Transport. Res. B 40(8), 667–687 (2006) 28. Bahdanau, D., et al.: Neural machine translation by jointly learning to align and translate. In: 3rd International Conference on Learning Representations (2015) 29. Lerner, A., et al.: Crowds by example. Comput. Graphics Forum 26, 655–664 (2007) 30. Helbing, D., Molnár, P.: Social force model for pedestrian dynamics. Phys. Rev. E, Stat. Phys. Plasmas Fluids Relat. Interdiscip. Top. 51(5), 4282 (1995) 31. Yi, S., Li, H., Wang, X.: Understanding pedestrian behaviors from stationary crowd groups. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3488–3496 (2015) 32. Xue, H., Huynh, D.Q., Reynolds, M.: SS-LSTM: a hierarchical LSTM model for pedestrian trajectory prediction. In: Proceedings of IEEE Winter Conference on Applications of Computer Vision (WACV), pp. 1186–1194 (2018) 33. Cho, K., et al.: Learning phrase representations using RNN encoder-decoder for statistical machine translation. In: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP) (2014) 34. Mohamed, A., Qian, K., Elhoseiny, M., Claudel, C.: Social-STGCNN: a social spatio-temporal graph convolutional neural network for human trajectory prediction. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2020) 35. Robicquet, A., Sadeghian, A., Alahi, A., Savarese, S.: Learning social etiquette: human trajectory understanding in crowded scenes. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9912, pp. 549–565. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46484-8_33 36. Pellegrini, S., Ess, A., Van Gool, L.: Improving data association by joint modeling of pedestrian trajectories and groupings. In: Daniilidis, K., Maragos, P., Paragios, N. (eds.) ECCV 2010. LNCS, vol. 6311, pp. 452–465. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-15549-9_33 37. Lerner, A., Chrysanthou, Y., Lischinski, D.: Crowds by example. In: Computer Graphics Forum, vol. 26, pp. 655–664. Wiley Online Library (2007) 38. Tang, H., Wei, P., Li, J., Zheng, N.: EvoSTGAT: evolving spatio-temporal graph attention networks for pedestrian trajectory prediction. Neurocomputing 491, 333–342 (2022) 39. Sadeghian, A., Kosaraju, V., Sadeghian, A., Hirose, N., Rezatofighi, H., Savarese, S.: SoPhie: an attentive GAN for predicting paths compliant to social and physical constraints. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 1349–1358 (2019) 40. Danihelka, I., Wayne, G., Uria, B., Kalchbrenner, N., Graves, A.: Associative long short-term memory. In: Proceedings of The 33rd International Conference on Machine Learning (2016)

Prior-Enhanced Network for Image-Based PM2.5 Estimation from Imbalanced Data Distribution

Xueqing Fang1, Zhan Li1(B), Bin Yuan2,3, Xinrui Wang1, Zekai Jiang1, Jianliang Zeng1, and Qingliang Chen1

1 Department of Computer Science, Jinan University, Guangzhou, China
[email protected]
2 Institute for Environmental and Climate Research, Jinan University, Guangzhou, China
3 Guangdong-Hongkong-Macau Joint Laboratory of Collaborative Innovation for Environmental Quality, Guangzhou, China

X. Fang and Z. Li: Both authors contributed equally to this work.

Abstract. The effective monitoring of PM2.5, a major indicator of air pollution, is crucial to human activities. Compared to traditional physiochemical techniques, image-based methods train PM2.5 estimators using datasets containing pairs of images and PM2.5 levels, which are efficient, economical, and convenient to deploy. However, existing methods either employ handcrafted features, which can be influenced by the image content, or require additional weather data that is often acquired through laborious processes. To estimate the PM2.5 concentration from a single image without requiring extra data, we herein propose a learning-based prior-enhanced (PE) network (comprising a main branch, an auxiliary branch, and a feature fusion attention module) to learn from an input image and its corresponding dark channel (DC) and inverted saturation (IS) maps. In addition, we propose a histogram smoothing (HS) algorithm to solve the problem of imbalanced data distribution, thereby improving the estimation accuracy in cases of heavy air pollution. To the best of our knowledge, this study is the first to address the phenomenon of data imbalance in image-based PM2.5 estimation. Finally, we construct a new dataset containing multi-angle images and more than 30 types of air data. Extensive experiments on image-based PM2.5 monitoring datasets verify the superior performance of our proposed neural networks and the HS strategy.

Keywords: PM2.5 estimation · Prior information · Imbalanced data distribution · Atmospheric pollution

1 Introduction

As a typical index of outdoor air pollutants, the concentration of PM2.5, or fine particulate matter with aerodynamic diameters of 2.5 μm or less [1], has a significant effect on people's health. PM2.5 encompasses pathogenic microorganisms as well as chemical


pollutants that cause respiratory obstruction and inflammation. Recent studies have also observed a positive correlation between high PM2.5 concentrations and the fatality rate associated with COVID-19 [17]. Therefore, effective monitoring of PM2.5, especially in cases of high pollution levels, is crucial for people's health.

Physiochemical techniques monitor the PM2.5 concentration by gauging its weight using dedicated devices, which include β-ray absorption, gravimetric analysis, and tapered element oscillating microbalance (TEOM) [2]. Although these sensor-based PM2.5 measurements can produce highly accurate estimations, they inevitably introduce a high maintenance cost and are unavailable in most regions [6]. Therefore, a more effective, efficient, and convenient solution is required to supplement the traditional methods for PM2.5 monitoring.

The concentration of PM2.5 is closely related to the characteristics of atmospheric scattering, which further affects the performance of outdoor imaging systems, as shown in Fig. 1. Therefore, PM2.5 concentrations can be estimated by image analysis [7, 14] in most cases, except in extreme weather conditions such as heavy rain, sandstorms, and sunset. Compared with their physiochemical sensor-based counterparts, image-based methods are economical, convenient to deploy, and available everywhere owing to the widespread use of mobile phones and portable cameras. However, to train image-based PM2.5 estimators, datasets containing pairs of images and PM2.5 levels are required, which are not easy to collect. Moreover, because severe air pollution is unusual, most images are captured in cases of mild or moderate pollution, which could introduce bias into the trained models [21]. Consequently, the estimation accuracy is significantly decreased under heavily polluted conditions with high PM2.5.

Fig. 1. Examples of images from different scenes under low and high PM2.5 concentrations. Adjacent images are their dark channels, saturation and inverted saturation maps, respectively.

In this paper, we propose a prior-enhanced (PE) network to estimate the PM2.5 concentration from a single image in an end-to-end manner, without handcrafted features or additional weather data. The proposed network comprises two branches: a main branch that aims to extract features from an input natural image, and an auxiliary branch that introduces prior knowledge of the dark channel (DC) and inverted saturation (IS) of the input image. Furthermore, to mitigate the problem of imbalanced data distribution, we propose a histogram smoothing (HS) algorithm, which can improve the estimation accuracy for cases of heavy pollution. This study is the interdisciplinary research of


computer vision, deep learning, and atmospheric science, which can be applied as a supplement to the traditional physiochemical techniques. The main contributions of this study can be summarized as follows:

– We propose end-to-end PE neural networks to estimate the PM2.5 concentration from a single image without requiring data from outside sources. The results of extensive experiments verify the superior performance of the proposed networks.
– We introduce prior information in the DC and IS maps of images by designing a lightweight auxiliary branch. The saturation map is inverted to maintain consistency between the two maps and changes in PM2.5 values, thus considerably improving the estimation accuracy.
– To the best of our knowledge, this study is the first to investigate the imbalanced distribution for the task of image-based PM2.5 estimation. As a solution, we propose an HS algorithm to improve the estimation accuracy for cases of heavy pollution.
– We construct a new dataset comprising multi-angle images and some related weather data, as observed in Heshan, Guangdong, China, which can be used to evaluate the performance of methods for estimating PM2.5 concentration.

2 Related Work

PM2.5 Estimation from Images. Traditional image-based methods focus on mining hand-crafted PM2.5-sensitive features, such as entropy [7, 14] and gradient [27], from the input images. These methods are time and resource efficient. However, because the entropy and gradient of an image are sensitive to its content, the performance of these methods is typically limited in terms of generalizability and robustness. As a preferable solution, learning-based methods [6, 24] extract image features automatically by neural networks, which learn from datasets containing pairs of images and PM2.5 values. However, the imbalanced data distribution could introduce bias into the trained network models and therefore degrade the estimation accuracy of minority samples. In addition, some image-based methods utilize additional information such as weather conditions [5, 14] and field depth [24]. Although such additional information is helpful for estimating PM2.5, it is generally obtained using professional procedures or specialized equipment, which is often inconvenient in a practical setting.

Data-Imbalanced Learning. Imbalanced data are ubiquitous and inherent in the real world. Previous studies on the data imbalance problem mainly focus on classification tasks, and can be divided into data- and model-based solutions. Data-based solutions either oversample the minority classes [3, 13] or undersample the majority classes [19], whereas model-based solutions re-weight or adjust their loss functions [25, 28] to balance the data distribution. In the classification context, the number of categories is finite, whereas category labels are generally discrete. In contrast, regression tasks such as PM2.5 estimation typically involve continuous and infinite target values with ranges of missing data. Consequently, the issue of imbalanced data in regression tasks has not been sufficiently explored [26]. In the context of PM2.5, the problem of imbalanced data has rarely been discussed. Rijal et al. [21] reported that their method failed to


accurately estimate high PM2.5 concentrations due to a severely insufficient number of these samples in the training set. However, they did not obtain an effective solution to this problem.

3 Methodology

Given an input image I, a PM2.5 concentration estimator aims to output a value $\hat{y}$ as an approximation of the actual PM2.5 level $y$. The mapping function can be formulated as follows:

$\hat{y} = G_\theta(I),$  (1)

where $\theta$ represents the learnable parameters of the network $G$. To optimize $G_\theta$, specific loss functions are designed on the training samples. However, due to the diversity of image scenes, we find in our experiments that networks fed only the input image itself are hard to train. We argue that image degradation priors, e.g., the dark channel prior and saturation attenuation, are beneficial for capturing the characteristics of haze in images and therefore ease network training. The prior information $\Psi$ can be conveniently represented by the concatenation of different prior maps, wherein we leverage the dark channel $I^{dc}$ and inverted saturation $I^{inv\text{-}s}$ derived from the input image:

$\Psi = P = (I^{dc}, I^{inv\text{-}s}).$  (2)

Now, we can reformulate Eq. 1 as:

$\hat{y} = G_\theta(I \mid \Psi),$  (3)

where $\Psi$ defines the prior information used as input conditions. In the following sections, we introduce the estimation of the DC-IS maps and the proposed PE network architecture.

Fig. 2. Architecture of the proposed prior-enhanced neural network (PE-ResNet18).

3.1 Estimation of DC-IS Maps

Following He et al. [8], the dark channel of image I is defined as:

$I^{dc}(x_0) = \min_{c \in \{r,g,b\}} \Big( \min_{x \in \Omega(x_0)} I^c(x) \Big),$  (4)

where $x$ and $x_0$ denote pixel positions; $\Omega(x_0)$ is a local patch centered at position $x_0$; $r$, $g$, and $b$ represent red, green, and blue, respectively; and $I^c$ is a color component of the input image I. The saturation map [23] of the image I is defined as:

$I^s(x) = \begin{cases} 0, & \text{if } V(x) = 0 \\ \frac{V(x) - D(x)}{V(x)}, & \text{otherwise} \end{cases}$  (5a)

with

$V(x) = \max_{c \in \{r,g,b\}} I^c(x), \quad D(x) = \min_{c \in \{r,g,b\}} I^c(x).$  (5b)

Note that $D(x)$ can be regarded as a special case of $I^{dc}(x)$ in Eq. (4) with a neighborhood patch $\Omega(x)$ of size 1×1. A higher concentration of PM2.5 increases the risk of the haze phenomenon, resulting in grayish-white images with generally higher intensities. Figure 1 presents several example images under different conditions of PM2.5 concentration, along with their corresponding DC, saturation, and IS maps. As shown in Fig. 1, a high concentration of PM2.5 is associated with high intensities in the DC map (overall brighter) and low intensities in the saturation map (overall darker). To keep the changing trends in both maps consistent, we invert the saturation values to obtain an IS map, which is calculated as follows:

$I^{inv\text{-}s}(x) = 255 - I^s(x),$  (6)

where 255 indicates the maximum pixel intensity in an image. Subsequently, we concatenate the DC and IS maps as inputs to the auxiliary branch of the proposed neural network. In our experiments, we find that this inversion strategy considerably improves the estimation accuracy of the model.
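As an illustration, the DC and IS maps of Eqs. 4–6 can be computed with a few NumPy/SciPy operations; the 15×15 patch size and the scaling of the saturation map to [0, 255] are assumptions made for this sketch, not values taken from the paper.

```python
import numpy as np
from scipy.ndimage import minimum_filter

def dc_is_maps(img, patch=15):
    """Dark channel (Eq. 4) and inverted saturation (Eqs. 5-6) of an RGB image
    with values in [0, 255]; img has shape (H, W, 3)."""
    img = img.astype(np.float32)
    dark = minimum_filter(img.min(axis=2), size=patch)          # channel min, then local patch min
    v = img.max(axis=2)                                         # V(x): per-pixel max over channels
    d = img.min(axis=2)                                         # D(x): per-pixel min over channels
    sat = np.where(v > 0, (v - d) / np.maximum(v, 1e-8), 0.0)   # saturation in [0, 1]
    inv_sat = 255.0 - 255.0 * sat                               # Eq. (6), saturation scaled to [0, 255]
    return dark, inv_sat
```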

3.2 PE Network Architecture

First, considering the effectiveness and efficiency of ResNet18 [9] and Swin-T [15], we utilize these two backbones as feature extractors to construct PE-ResNet18 and PE-SwinT. ResNet18 is a convolutional neural network, while Swin-T is a Transformer-based model. As an example, the main branch of PE-ResNet18 is shown in Fig. 2. The last fully-connected (FC) layer of ResNet18 is removed so that the backbone outputs a 512-dimensional vector after the GAP layer. PE-SwinT is constructed in the same way by simply replacing the main branch with Swin-T. Moreover, we design an efficient and lightweight auxiliary branch that contains five consecutive convolution layers with leaky rectified linear unit (Leaky ReLU) activations [16]. This branch takes the DC-IS maps as inputs, performs downsampling and


non-linear transformations, and outputs a 512-dimensional vector corresponding to that generated by the main branch. The process of feature extraction can be expressed as:

$f_m = G_m(I), \quad f_a = G_a(I^{dc}, I^{inv\text{-}s}),$  (7)

where $G_m$ and $G_a$ denote the main and the auxiliary branch, and $f_m$ and $f_a$ denote the extracted feature vectors. Finally, the feature fusion attention module is designed to process the concatenation of the features output by the main and auxiliary branches, as shown in Fig. 2. In the design of our one-dimensional (1D) squeeze and excitation (SE) block, the GAP layer was removed from the original SE block [11], as that layer is typically used to squeeze multi-channel feature maps into vectors, whereas all features processed in our module are already one-dimensional vectors. The SE block adaptively learns the weight of each position in the input [11], thereby improving the representational power of the module. The feature fusion stage can be formulated as:

$f_u = SE(FC(\text{Concatenate}(f_m, f_a))),$  (8)

where $f_u$ represents the prior-enhanced features. In addition, a ReLU activation layer and a dropout layer are added to introduce nonlinearity and prevent overfitting. At the end of the feature fusion attention module, an FC layer is employed to output the PM2.5 estimation.

3.3 Histogram Smoothing Algorithm

Most datasets used for PM2.5 estimation are highly imbalanced and often exhibit a long-tailed distribution, which could introduce bias into the trained models and therefore degrade the estimation accuracy for few-shot samples. To solve this problem, we propose a histogram smoothing (HS) strategy, specified in Algorithm 1, which is a cost-sensitive and adaptive re-weighting method for training our PE networks. In the HS algorithm, a histogram of the PM2.5 data is created using a predefined bin width and then fitted with a Gaussian kernel to produce a smooth probability density distribution. Because increasing the proportion of the estimation error contributed by few-shot samples guides the network to learn more from them, each interval is assigned a weight proportional to the reciprocal of its sample count. With these adaptive weights, a specific loss function, e.g., the mean squared error (MSE), can be reformulated as follows:

$L_w = \frac{1}{n} \sum_{i=1}^{n} w_i (\hat{y}_i - y_i)^2,$  (9)

where $\hat{y}_i$ denotes the estimated PM2.5 value of the i-th sample, $y_i$ denotes the actual PM2.5 value measured by instruments, and $w_i$ denotes the weight computed by the proposed HS algorithm. In our experiment, the Gaussian kernel size and the standard deviation are empirically set to s = 5 and σ = 2, respectively. The selection of bin width is discussed in Sect. 4.3.
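A direct PyTorch rendering of Eq. 9 might look as follows (the 1-D tensor shapes are an assumption for illustration):

```python
import torch

def weighted_mse(pred, target, weights):
    """Weighted MSE of Eq. (9); pred, target, weights are 1-D tensors of length n,
    with the weights w_i produced by the HS algorithm (Algorithm 1)."""
    return (weights * (pred - target) ** 2).mean()
```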


Algorithm 1. Histogram smoothing algorithm.
Input: Set of PM2.5 indices $\{y_i\}$ from the training data; bin width $bw$; Gaussian kernel size $s$ and standard deviation $\sigma$; predefined range for clipping sample numbers in each bin, with default [5, 500].
Output: Weighted loss value $L_w$.
1: Construct a histogram of the PM2.5 index set $\{y_i\}$ using a bin width of $bw$, i.e., $b_{j+1} - b_j = bw$, where $j = 1, 2, \ldots, m$ and $m = (\max\{y_i\} - \min\{y_i\}) / bw$;
2: Clip the number of samples in each bin to mitigate the degree of imbalance;
3: Fit the histogram with a Gaussian kernel of size $s$ and standard deviation $\sigma$ to smooth the density distribution of the PM2.5 data and obtain the fitted sample number $c_j^b$ in each bin;
4: Calculate the weight factor of each bin as $w_j^b = k \times (1/c_j^b)$, with $k = m / \sum_{j=1}^{m} (1/c_j^b)$;
5: Obtain the weight coefficient $w_i$ for each $y_i$ according to its corresponding bin, i.e., $w_i = w_j^b$ if $y_i \in [b_j, b_{j+1})$;
6: Calculate $L_w$ with Eq. (9) using $w_i$, $y_i$, and $\hat{y}_i$;
7: return $L_w$.
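The sketch below mirrors the steps of Algorithm 1 with NumPy/SciPy; the exact Gaussian-fitting and normalisation details of the authors' implementation are not given in the text, so those parts (the use of gaussian_filter1d and the bin-index lookup) are assumptions.

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d

def hs_weights(pm_values, bin_width=10, clip=(5, 500), kernel_size=5, sigma=2):
    """Per-sample weights following Algorithm 1: histogram, clip, Gaussian-smooth,
    then weight each bin by the reciprocal of its smoothed count."""
    lo, hi = pm_values.min(), pm_values.max()
    edges = np.arange(lo, hi + bin_width, bin_width)            # step 1: bins of width bw
    counts, _ = np.histogram(pm_values, bins=edges)
    counts = np.clip(counts, clip[0], clip[1])                  # step 2: limit imbalance
    smoothed = gaussian_filter1d(counts.astype(np.float32), sigma=sigma,
                                 truncate=(kernel_size // 2) / sigma)  # step 3: smooth counts
    inv = 1.0 / np.maximum(smoothed, 1e-8)
    w_bins = len(inv) * inv / inv.sum()                         # step 4: normalised bin weights
    bin_idx = np.clip(np.digitize(pm_values, edges) - 1, 0, len(w_bins) - 1)
    return w_bins[bin_idx]                                      # step 5: per-sample weights w_i
```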

4 Experiments

4.1 Experiment Setup

Datasets. We use three datasets to train and test the proposed PE networks: the Beijing, the Heshan, and the various shooting scenes (VSS) datasets. The Beijing dataset [5] comprises 5897 single-scene images and is randomly split into a training set and a test set at a ratio of 7:3. To evaluate performance on multi-angle scenes, we construct a new Heshan dataset that contains 306 images captured at a fixed location (112.97°E, 22.77°N) from three different angles, and randomly split it at a ratio of 8:2 for training and testing. The VSS dataset was collected from a tour website1, containing 60 images for the evaluation of performance across various scenes, with corresponding PM2.5 concentrations provided by the Beijing Municipal Ecological and Environmental Monitoring Center. Furthermore, to evaluate the model's performance in heavily polluted conditions (PM2.5 > 115 μg/m³), we select a small subset from the Beijing test set as a few-shot set. All images were captured during the day, excluding those that are heavily affected by clouds and sun glows.

Evaluation Metrics. We evaluate the experimental results using three standard metrics: mean absolute error (MAE) [20], root-mean-square error (RMSE) [7], and linear correlation coefficient (LCC) [4], which are commonly used for regression tasks [5–7]. Among these criteria, MAE and RMSE compute the average difference between the ground-truth and estimated values of PM2.5, while LCC measures the strength of the linear relation between the two vectors. More accurate estimation of PM2.5 is indicated by smaller values of MAE and RMSE or larger values of LCC.

1 https://www.tour-beijing.com/real_time_weather_photo/.


Implementation Details. The networks were implemented using PyTorch 1.7 [18] on a platform with an RTX 3090Ti GPU. The training datasets were augmented with random horizontal flips, assuming that the sky region was at the top of each image. Input images were scaled to a resolution of 256 × 256 for normalization. Our networks were trained for 150 epochs with a batch size of 16, using an Adam [12] optimizer with default parameters β1 = 0.9 and β2 = 0.999. The learning rate was set to an initial value of 1e-4 and was decremented to zero using the cosine annealing strategy [10].

4.2 Comparison with the SOTA Methods

We compare the performance of our method quantitatively with that of several state-of-the-art (SOTA) image-based approaches for PM2.5 estimation, including PPPC [7], PM-MLP [5], IAWD [6], and MIFF [24]. Among these methods, only PM-MLP requires additional data related to weather conditions. For a fair comparison, we retrain these models using the Beijing and Heshan training sets, separately, and evaluate them on four test sets. Table 1 shows a quantitative comparison of these methods.

Table 1. Quantitative comparison on the Beijing, Heshan, Few-shot, and VSS test sets. The 1st and 2nd winners of each metric are displayed in bold and underline, respectively. "↓" indicates the smaller the better, and "↑" indicates the larger the better.

Method        Beijing                     Heshan                      Few-shot                    VSS
              RMSE↓   MAE↓    LCC↑        RMSE↓   MAE↓    LCC↑        RMSE↓   MAE↓    LCC↑        RMSE↓   MAE↓    LCC↑
PPPC [7]      126.70  111.48  31.79%      174.34  173.04  11.33%      61.08   52.30   11.18%      85.41   61.67   39.76%
PM-MLP [5]    32.76   22.24   52.06%      15.54   12.08   68.65%      112.60  106.73  34.26%      –       –       –a
IAWD [6]      33.65   22.68   43.32%      20.70   15.89   33.92%      120.90  114.70  5.81%       27.82   20.47   18.65%
MIFF [24]     36.83   23.51   32.37%      19.62   14.07   40.15%      139.12  133.73  -0.27%      24.65   19.32   35.28%
PE-ResNet18   9.49    5.37    96.70%      15.37   10.99   72.26%      23.08   14.60   84.82%      15.09   12.98   81.56%
PE-SwinT      8.92    5.31    97.10%      17.30   13.17   58.17%      20.65   12.64   88.48%      30.97   24.19   35.72%

a The results of PM-MLP on the VSS dataset are unavailable, because PM-MLP requires external weather data, which this dataset does not provide.

As shown in Table 1, our PE networks exhibit the best or second-best performance for most metrics. In particular, PE-ResNet18 and PE-SwinT achieve a significant performance improvement on the Beijing and Few-shot datasets compared with all previous methods. In addition, our PE-ResNet18 is significantly superior to the others on the VSS dataset, indicating its advantage in generalizability. Further, although it is better than most other models, PE-SwinT does not perform as well as PE-ResNet18 on the Heshan and VSS datasets, because it tends to overfit owing to the model complexity of the Swin transformer [15] relative to the volume of these two datasets. Therefore, we design the auxiliary branch and feature fusion module of our PE networks in a lightweight way, which is compatible with replaceable backbones.

4.3 Ablation Studies

We evaluate the proposed PE networks against their backbones on the Heshan test set. The estimation accuracies measured by RMSE and MAE are listed in Table 2, indicating that


both PE networks outperformed their baselines at the cost of a small amount of extra computational overhead.

Table 2. Comparison across single- and dual-branch networks.

Models         RMSE   MAE    GFLOPs   Parameters (M)
ResNet18       16.57  12.09  2.38     11.18
PE-ResNet18    15.37  10.99  2.74     13.34
SwinT          17.55  13.44  3.89     18.85
PE-SwinT       17.30  13.17  4.24     21.15

To further explore the function of the proposed PE architecture, we compare the training losses of these models on the Beijing dataset, as shown in Fig. 3. As can be seen, the PE networks converge faster than their baseline counterparts, with smoother loss curves. These results verify that the proposed PE framework accelerates optimization by providing effective auxiliary prior information.

Fig. 3. Training error on the Beijing dataset.

Figure 4 shows the effect of the bin width used by the HS algorithm, in which "None" indicates that the HS algorithm was not used. We use the proposed PE-ResNet18 as a baseline model and apply the HS strategy with varying settings of bin width. As indicated by the orange bars (mostly positive RMSE and MAE reductions) and the blue bars (varying by less than 0.5) in Fig. 4, the HS strategy considerably improves the performance on few-shot samples while keeping the performance on the entire set stable. Therefore, the bin width is set to 10 to achieve the best accuracy on the few-shot subset and comparable accuracy on the complete dataset.

4.4 Examples on Various Shooting Scenes

To compare the generalizability of existing models, we directly test them on a new VSS dataset containing various scenes without retraining. Figure 5 shows several example images with PM2.5 values listed on the top. PM-MLP is excluded in this study, because


Fig. 4. Reductions of RMSE (left) and MAE (right) using different bin widths on the Beijing test set (all samples) and its subset of heavily polluted conditions (few-shot samples).

additional weather data beyond the input image are required, which are unavailable in these cases. Our PE network yields results with satisfactory accuracy in most cases and outperforms the other competing models. These examples suggest the potential of applying the proposed method to various scenarios.

Fig. 5. Example images of various shooting scenes. The observed GT, estimated PM2.5 values, and relative errors are shown on the top of each image.

To visualize the region attention of our network, we compare the gradient-weighted class activation mapping (Grad-CAM) diagrams [22] generated by ResNet18 and by our PE-ResNet18 in Fig. 6. As the figure shows, the attention regions (in red) of the ResNet18 model are randomly distributed over the entire input image (both distant and nearby areas). In contrast, the main and auxiliary branches of our network play two complementary roles: the main branch focuses on the sky and distant regions, while the auxiliary branch is more interested in nearby objects. These results demonstrate that our network can distinguish different areas of the input image and thereby has better generalizability.
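For reference, a bare-bones Grad-CAM routine for a scalar-output regressor could be written with forward/backward hooks as below; this is a generic sketch (layer choice, hook API version, and normalisation are assumptions), not the visualisation code used for Fig. 6.

```python
import torch
import torch.nn.functional as F

def grad_cam(model, target_layer, image):
    """Grad-CAM for a scalar-output (PM2.5) regressor.
    image: tensor of shape (1, 3, H, W); target_layer: e.g. the last conv block."""
    feats, grads = [], []
    fh = target_layer.register_forward_hook(lambda m, inp, out: feats.append(out))
    bh = target_layer.register_full_backward_hook(lambda m, gin, gout: grads.append(gout[0]))
    model.eval()
    pred = model(image)                # predicted PM2.5 value
    model.zero_grad()
    pred.sum().backward()              # gradient of the scalar output w.r.t. the feature maps
    fh.remove(); bh.remove()
    fmap, grad = feats[0], grads[0]                    # both (1, C, h, w)
    weights = grad.mean(dim=(2, 3), keepdim=True)      # channel importance (GAP of gradients)
    cam = F.relu((weights * fmap).sum(dim=1, keepdim=True))
    cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)
    return F.interpolate(cam, size=image.shape[-2:], mode="bilinear", align_corners=False)
```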


Fig. 6. Comparison of ResNet18 and our PE-ResNet18 activation diagrams. 'Main' and 'Auxiliary' denote the main and auxiliary branches of our network, respectively.

5 Conclusion

In this paper, we propose end-to-end prior-enhanced (PE) networks for estimating PM2.5 concentrations from a single image, without requiring additional weather data. Furthermore, we propose an adaptive histogram smoothing (HS) algorithm to improve the estimation accuracy for few-shot samples related to heavily polluted cases, which is, to the best of our knowledge, the first solution to the data imbalance problem in image-based PM2.5 estimation. Extensive experiments demonstrate that our method achieves state-of-the-art performance across a variety of scenes with different air pollution levels. Finally, in addition to the DC-IS maps, other information derived from images (e.g., color attenuation) will also be explored in future studies.

Acknowledgment. This work is supported by the National Natural Science Foundation of China (No. 62071201), and the Guangdong Basic and Applied Basic Research Foundation (No. 2022A1515010119).

References

1. Bu, X., et al.: Global PM2.5-attributable health burden from 1990 to 2017: estimates from the global burden of disease study 2017. Environ. Res. 197, 111123 (2021)
2. Charron, A., Harrison, R.M., Moorcroft, S., Booker, J.: Quantitative interpretation of divergence between PM10 and PM2.5 mass measurement by TEOM and gravimetric (partisol) instruments. Atmos. Environ. 38(3), 415–423 (2004)
3. Chawla, N.V., Bowyer, K.W., Hall, L.O., Kegelmeyer, W.P.: SMOTE: synthetic minority over-sampling technique. J. Artif. Intell. Res. 16, 321–357 (2002)
4. Chok, N.S.: Pearson's versus Spearman's and Kendall's correlation coefficients for continuous data. Ph.D. thesis, University of Pittsburgh (2010)
5. Feng, L., Yang, T., Wang, Z.: Performance evaluation of photographic measurement in the machine-learning prediction of ground PM2.5 concentration. Atmos. Environ. 262, 118623 (2021)
6. Gu, K., Liu, H., Xia, Z., Qiao, J., Lin, W., Thalmann, D.: PM2.5 monitoring: use information abundance measurement and wide and deep learning. IEEE Trans. Neural Networks Learn. Syst. 32(10), 4278–4290 (2021)
7. Gu, K., Qiao, J., Li, X.: Highly efficient picture-based prediction of PM2.5 concentration. IEEE Trans. Ind. Electron. 66(4), 3176–3184 (2018)
8. He, K., Sun, J., Tang, X.: Single image haze removal using dark channel prior. IEEE Trans. Pattern Anal. Mach. Intell. 33(12), 2341–2353 (2010)
9. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016)
10. He, T., Zhang, Z., Zhang, H., Zhang, Z., Xie, J., Li, M.: Bag of tricks for image classification with convolutional neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 558–567 (2019)
11. Hu, J., Shen, L., Sun, G.: Squeeze-and-excitation networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 7132–7141 (2018)
12. Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
13. Kumari, R., Singh, J., Gosain, A.: SmS: SMOTE-stacked hybrid model for diagnosis of polycystic ovary syndrome using feature selection method. Expert Syst. Appl. 225, 120102 (2023)
14. Liu, C., Tsow, F., Zou, Y., Tao, N.: Particle pollution estimation based on image analysis. PLoS ONE 11(2), e0145955 (2016)
15. Liu, Z., et al.: Swin transformer: hierarchical vision transformer using shifted windows. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 10012–10022 (2021)
16. Maas, A.L., Hannun, A.Y., Ng, A.Y., et al.: Rectifier nonlinearities improve neural network acoustic models. In: Proceedings of the International Conference on Machine Learning, p. 3. Citeseer (2013)
17. Marquès, M., Domingo, J.L.: Positive association between outdoor air pollution and the incidence and severity of COVID-19. A review of the recent scientific evidences. Environ. Res. 203, 111930 (2022)
18. Paszke, A., et al.: PyTorch: an imperative style, high-performance deep learning library. In: Advances in Neural Information Processing Systems, vol. 32 (2019)
19. Peng, M., et al.: Trainable undersampling for class-imbalance learning. In: Proceedings of the AAAI Conference on Artificial Intelligence, pp. 4707–4714 (2019)
20. Qiao, J., He, Z., Du, S.: Prediction of PM2.5 concentration based on weighted bagging and image contrast-sensitive features. Stochast. Environ. Res. Risk Assess. 34(3), 561–573 (2020)
21. Rijal, N., Gutta, R.T., Cao, T., Lin, J., Bo, Q., Zhang, J.: Ensemble of deep neural networks for estimating particulate matter from images. In: 2018 IEEE 3rd International Conference on Image, Vision and Computing, pp. 733–738. IEEE (2018)
22. Selvaraju, R.R., Cogswell, M., Das, A., Vedantam, R., Parikh, D., Batra, D.: Grad-CAM: visual explanations from deep networks via gradient-based localization. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 618–626 (2017)
23. Shapiro, L.G., Stockman, G.C., et al.: Computer Vision, vol. 3. Prentice Hall, New Jersey (2001)
24. Wang, G., Shi, Q., Wang, H., Sun, K., Lu, Y., Di, K.: Multi-modal image feature fusion-based PM2.5 concentration estimation. Atmos. Pollut. Res. 13(3), 101345 (2022)
25. Wang, T., et al.: C2AM loss: chasing a better decision boundary for long-tail object detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 6980–6989 (2022)
26. Yang, Y., Zha, K., Chen, Y., Wang, H., Katabi, D.: Delving into deep imbalanced regression. In: International Conference on Machine Learning, pp. 11842–11851. PMLR (2021)
27. Yue, G., Gu, K., Qiao, J.: Effective and efficient photo-based PM2.5 concentration estimation. IEEE Trans. Instrum. Measur. 68(10), 3962–3971 (2019)
28. Zhang, Y., Kang, B., Hooi, B., Yan, S., Feng, J.: Deep long-tailed learning: a survey. IEEE Trans. Pattern Anal. Mach. Intell. 45(9), 10795–10816 (2023)

Dynamic Data Augmentation via Monte-Carlo Tree Search for Prostate MRI Segmentation

Xinyue Xu1, Yuhan Hsi2, Haonan Wang3, and Xiaomeng Li1(B)

1 The Hong Kong University of Science and Technology, Hong Kong, China
[email protected], [email protected]
2 The Pennsylvania State University, State College, PA, USA
[email protected]
3 The University of Hong Kong, Hong Kong, China
[email protected]

Abstract. Medical image data are often limited due to the expensive acquisition and annotation process. Hence, training a deep-learning model with only raw data can easily lead to overfitting. One solution to this problem is to augment the raw data with various transformations, improving the model's ability to generalize to new data. However, manually configuring a generic augmentation combination and parameters for different datasets is non-trivial due to inconsistent acquisition approaches and data distributions. Automatic data augmentation has therefore been proposed to learn favorable augmentation strategies for different datasets, but existing approaches incur large GPU overhead. To this end, we present a novel method, called Dynamic Data Augmentation (DDAug), which is efficient and has negligible computation cost. Our DDAug develops a hierarchical tree structure to represent various augmentations and utilizes an efficient Monte-Carlo tree search algorithm to update, prune, and sample the tree. As a result, the augmentation pipeline can be optimized for each dataset automatically. Experiments on multiple Prostate MRI datasets show that our method outperforms the current state-of-the-art data augmentation strategies.

Keywords: Prostate MRI Segmentation · Data Augmentation · AutoML

1 Introduction

The prostate is an important reproductive organ for men. The three most prevalent forms of prostate disease are inflammation, benign prostate enlargement, and prostate cancer, and a person may experience one or more of these conditions. Accurate MRI segmentation is crucial for the pathological diagnosis and prognosis of prostate diseases [17]. Manual prostate segmentation is a time-consuming task that is subject to inter- and intra-observer variability [6]. The development of deep learning has led to significant advancements in many fields, including computer-assisted intervention. With the advancement of technology, clinical applications of deep learning techniques have increased. Multiple deep learning networks [10,18,28] have been designed to enhance the accuracy of automatic prostate segmentation. Different neural network structures, such as VNet [18], UNet [21] and its variant nnUNet [9], can all be utilized for prostate segmentation. These methods all improve segmentation accuracy from the perspective of modifying the model structure. However, in medical image segmentation tasks, a carefully designed network structure is prone to overfitting due to limited data.

To alleviate the data shortage, data augmentation is an effective means of enhancing segmentation performance and model generalizability simultaneously on small datasets. Data augmentation aims to generate more data from the raw samples via pre-defined transformations, which helps to diversify the original dataset [22]. Typical data augmentation techniques include affine transformations (e.g., rotation, flipping, and scaling), pixel-level transformations (e.g., Gaussian noise and contrast adjustment), and elastic distortion. For prostate MRI data, affine transformations or GAN-based methods are frequently used [5]. However, the augmentation selection and combination process that utilizes these transformations is predominantly hand-crafted. It is difficult to identify which operations are actually useful for the prostate segmentation task, thus often resulting in sub-optimal combinations, or even degrading network performance. Automatic data augmentation, with its ability to design different combinations, its flexibility to remove useless augmentation operations, and its utilization of quantifiable metrics, is a crucial technology that can solve this problem.

Approaches to automatic data augmentation need to strike a balance between simplicity, cost, and performance [19]. In the field of natural images, early automatic augmentation techniques [2,8,13,25] were GPU-intensive. Subsequently, RandAugment [3], UniformAugment [14], and TrivialAugment [19] substantially decreased the search cost while maintaining performance. Due to the differences between medical images (spatial context information and the varied morphologies of lesions and tissues) and natural images, directly applying these approaches is either ineffective or unsuitable. The earliest work [27] utilized reinforcement learning (RL) for automatic medical data augmentation, but it required a significant amount of computing resources. The state-of-the-art automatic data augmentation framework ASNG [26] formulated automatic data augmentation as bi-level optimization and applied an approximation algorithm to solve it. Although it is more efficient than reinforcement learning, the time required to find a reasonable strategy can still be highly demanding. Furthermore, using only rudimentary spatial transforms can limit performance, and some state-of-the-art methods involve searching the probability of each operation, which can make the training process inefficient.

To this end, we propose a novel automatic augmentation strategy, Dynamic Data Augmentation (DDAug), for MRI segmentation.


The automatic data augmentation problem is formulated as a Monte-Carlo tree search problem for the first time. The augmentation pipeline is represented as a tree structure, which is iteratively refined through updating, pruning, and sampling. In contrast to previous methods, our approach expands the search space by including more spatial augmentations and allows the tree structure to determine the optimal sequence of augmentations while removing redundant ones. Moreover, our method selects operations without having to search over their probabilities, which significantly enhances search efficiency. Although only a few augmentation operations are used at a time, the effect is similar to that of manually combining multiple operations. Compared to previous approaches, DDAug achieves a favorable balance between simplicity, cost, and performance. Code and documentation are available at https://github.com/xmed-lab/DDAug.

2 Methodology

Our method consists of an automatic augmentation search space and a Monte-Carlo tree search. We meticulously selected a number of dependable operations on medical images to compose the search space from which the tree is constructed. The search space consists of pixel-level and spatial-level transformations, together with the left and right ranges of their respective magnitudes. After the tree is constructed, a path is chosen for each training epoch; the tree is updated by the validation loss, and nodes in the chosen path (together with their children) that degrade model performance are pruned. Finally, the path for the subsequent epoch is chosen by random or Upper Confidence Bounds applied to Trees (UCT) [1] sampling, depending on the layer, and the cycle continues until training is complete. Figure 1 illustrates the four stages of the tree search, and the whole training process is summarized in Algorithm 1. We elaborate on each component below.

Fig. 1. Four Stages of Monte-Carlo Tree Search.

2.1 Search Space

The design of our search space is crucial to the performance of the final network. To compensate for the absence of spatial-level augmentation operations, optical distortion, elastic transformation, and grid distortion are added. Table 1 displays the complete search space. To better assess the efficacy of various operations in relation to their magnitude, we divide the magnitude into a left range and a right range whenever possible. Operations like the brightness transform exhibit a left range (decrease brightness) and a right range (increase brightness). The random crop operation first pads the data by a stochastically selected percentage and then randomly crops the padded data back to its original size; unlike the brightness transform, random crop has only one magnitude range. This division allows for more precise control over the magnitude range without significantly increasing tree size. Operations of the root type pertain to the subsequent tree construction: they are necessary augmentation operations for every path and are not involved in the search. We forego searching for augmentation probabilities for two reasons. First, it would significantly increase the size of the tree, making the search inefficient. Second, if a particular combination of operation and magnitude range increases validation performance, it will be sampled more frequently, and if the combination tends to degrade network performance, it will be swiftly removed.

Table 1. Augmentation Search Space. The root type refers to must-do operations at the beginning of path selection. '–' denotes range not applicable.

Operations              LR          RR        Type
Mirror                  –           –         root
Random Crop             (0%, 33%)   –         root
Contrast Adjustment     (0.5, 1)    (1, 1.5)  pixel-level
Gamma Transform         (0.5, 1)    (1, 1.5)  pixel-level
Brightness Transform    (0.5, 1)    (1, 1.5)  pixel-level
Gaussian Noise          (0, 0.1)    –         pixel-level
Gaussian Blur           (0.5, 1)    (1, 1.5)  pixel-level
Simulate low-res image  (0.5, 1)    –         pixel-level
Scale                   (0.5, 1)    (1, 1.5)  spatial-level
Optical Distortion      (0, 0.05)   –         spatial-level
Elastic Transform       (0, 50)     –         spatial-level
Grid Distortion         (0, 0.3)    –         spatial-level
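To make the structure of Table 1 concrete, the following sketch shows one possible way to encode the search space in Python. The class and variable names are illustrative and are not taken from the official DDAug repository.

```python
# Illustrative encoding of the Table 1 search space; names are hypothetical.
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class AugOp:
    name: str
    left_range: Optional[Tuple[float, float]]   # LR column, None if not applicable
    right_range: Optional[Tuple[float, float]]  # RR column, None if not applicable
    kind: str                                    # "root", "pixel-level" or "spatial-level"

SEARCH_SPACE = [
    AugOp("Mirror", None, None, "root"),
    AugOp("RandomCrop", (0.0, 0.33), None, "root"),
    AugOp("ContrastAdjustment", (0.5, 1.0), (1.0, 1.5), "pixel-level"),
    AugOp("GammaTransform", (0.5, 1.0), (1.0, 1.5), "pixel-level"),
    AugOp("BrightnessTransform", (0.5, 1.0), (1.0, 1.5), "pixel-level"),
    AugOp("GaussianNoise", (0.0, 0.1), None, "pixel-level"),
    AugOp("GaussianBlur", (0.5, 1.0), (1.0, 1.5), "pixel-level"),
    AugOp("SimulateLowRes", (0.5, 1.0), None, "pixel-level"),
    AugOp("Scale", (0.5, 1.0), (1.0, 1.5), "spatial-level"),
    AugOp("OpticalDistortion", (0.0, 0.05), None, "spatial-level"),
    AugOp("ElasticTransform", (0.0, 50.0), None, "spatial-level"),
    AugOp("GridDistortion", (0.0, 0.3), None, "spatial-level"),
]

# Searchable candidates are every non-root (operation, magnitude range) pair;
# root operations are always applied first and are excluded from the search.
CANDIDATES = [
    (op.name, rng)
    for op in SEARCH_SPACE if op.kind != "root"
    for rng in (op.left_range, op.right_range) if rng is not None
]
```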

2.2 Tree Construction

Like tree nodes, different augmentation operations can be connected to create various augmentation combinations, and their order must also be considered when a sequence of augmentations is applied; this closely resembles the structure of a tree. For efficient search, we therefore encode automatic data augmentation as a tree structure. To construct the tree from the specified search space, we begin by generating a root node with the mirror and random crop operations. These operations serve as a set of foundational augmentations and are always applied prior to the augmentation path chosen by the tree search for each epoch. The remaining augmentation operations participate in the search, and we use a three-layer tree structure to hold the search space. Each node in the first layer is an augmentation operation, and its child nodes are the augmentation operations that do not include their parent. There are no duplicate operations on a single path, and no two paths have the same order of operations. The first augmentation path is initialized as the leftmost path of the tree.
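The sketch below illustrates how such a three-layer tree could be built from the searchable (operation, magnitude range) candidates. The `TreeNode` fields anticipate the Q-value updates and visit counts used in Sects. 2.3 and 2.4; all names are illustrative rather than the official implementation.

```python
# Simplified three-layer augmentation tree; names are illustrative.
class TreeNode:
    def __init__(self, op=None, parent=None):
        self.op = op              # (operation name, magnitude range), None for the root
        self.parent = parent
        self.children = []
        self.q_value = 0.0        # updated with Eq. (2)
        self.visits = 0
        self.loss_history = []    # per-epoch validation-loss changes

def build_tree(candidates, depth=3):
    root = TreeNode()             # root implicitly applies Mirror + RandomCrop
    def expand(node, used_ops, level):
        if level == depth:
            return
        for op_name, rng in candidates:
            if op_name in used_ops:          # no duplicate operation on a path
                continue
            child = TreeNode((op_name, rng), parent=node)
            node.children.append(child)
            expand(child, used_ops | {op_name}, level + 1)
    expand(root, set(), 0)
    return root

def leftmost_path(root):
    path, node = [], root
    while node.children:
        node = node.children[0]
        path.append(node)
    return path                    # first augmentation path used in epoch 0
```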

Algorithm 1. Training Process of DDAug
1: Initialize the augmentation tree.
2: Set the leftmost path as the first augment path.
3: for each epoch do
4:   Train the model and calculate the validation loss.
5:   for each node in previously selected path do
6:     Update node Q-value (Eq. 2) using moving average loss.
7:     Record validation loss change L_node.
8:     if sum of past five L_node > 0 then
9:       Delete the current node and the subtree of this node.
10:      Break
11:    end if
12:  end for
13:  while not at leaf node do
14:    if mean visited times < k_uct then
15:      Sample node using random sampling.
16:    else
17:      Sample node using UCT sampling (Eq. 4).
18:    end if
19:  end while
20:  Finish sampling path for next epoch.
21: end for
22: Inference and report testing data performance.

2.3 Tree Updating and Pruning

With the initialized path during tree construction, we train the model for one epoch per path and calculate the validation loss Lval . The validation loss is computed utilizing the original nnUNet CE + DICE loss. The validation loss is then employed to update the tree by calculating the moving average loss Lma using the following formula:

L_{ma} = \beta \cdot L_{val}^{t-1} + (1 - \beta) \cdot L_{val}^{t}    (1)

where \beta \in [0, 1] controls the ratio of the current validation loss, L_{val}^{t-1} is the validation loss of the previous epoch, and L_{val}^{t} represents the validation loss of the current epoch. We then update the Q-value of all nodes in the previously selected path with:

Q = L_{ma} / L_{val}^{t}    (2)

A record of the validation loss change L_{node} = L_{val}^{t} - L_{val}^{t-1} is kept for all nodes to evaluate their overall impact on network performance. As we traverse the path to update the Q-values, if the sum of the previous five L_{node} scores is greater than 0, the node is deemed to have a negative impact on the network's performance and is pruned from the tree.
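A minimal sketch of this update rule is shown below; the value β = 0.9 and the plain-number interface are assumptions for illustration, not the paper's exact settings.

```python
def moving_average_loss(loss_prev, loss_curr, beta=0.9):
    """Eq. (1): moving-average validation loss; beta is a placeholder value."""
    return beta * loss_prev + (1.0 - beta) * loss_curr

def q_value(l_ma, loss_curr):
    """Eq. (2): Q-value assigned to every node on the selected path."""
    return l_ma / loss_curr

def should_prune(loss_changes):
    """Prune a node (and its subtree) once its last five recorded
    validation-loss changes sum to a positive value."""
    return len(loss_changes) >= 5 and sum(loss_changes[-5:]) > 0

# toy usage for one epoch transition
l_ma = moving_average_loss(loss_prev=0.42, loss_curr=0.40)
print(q_value(l_ma, 0.40), should_prune([0.01, -0.02, 0.03, 0.01, 0.02]))
```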

2.4 Tree Sampling

After the pruning process, a new path needs to be chosen for the subsequent epoch. Because the nodes have not yet been visited at the beginning of training, we use random sampling rather than Monte-Carlo UCT sampling in this stage. We compare the k_{uct} threshold to the average number of visits of the current layer to determine when to switch to UCT sampling. The value of k_{uct} is set to 3, 1, and 1 for the first, second, and third layers of the tree, respectively. The number of tree layers is expandable, but increasing the number of layers leads to exponential growth of the search space, which is not conducive to search efficiency. At the same time, if the tree has fewer than three layers, the number of nodes and paths is extremely limited, thus decreasing the diversity introduced via data augmentation. Inspired by [24], we introduce a node communication term S to better evaluate the current node's efficacy, using the Q-values of nodes from the same layer that have the same augmentation operation and magnitude range as the current node:

S(v_i^l) = (1 - \lambda) \cdot Q(v_i^l) + \lambda \cdot \frac{1}{n} \sum_{j=0}^{n} Q(v_j^l)    (3)

where v_i^l is the i-th child node in the l-th layer, v_j^l denotes the other nodes in the l-th layer that have the same operation and magnitude range as v_i, and n denotes the total number of such nodes v_j^l. \lambda controls the effect of the node communication term. When the average number of visits of all children of the current node exceeds k_{uct}, we employ the following equation to calculate the UCT [11,24] score for all children of the current node:

UCT(v_i^l) = \frac{Q(v_i^l)}{n_i^l} + C_1 \sqrt{\frac{\log(n_p^{(l-1)})}{n_i^l}} + C_2 \cdot S(v_i^l)    (4)

where n_i^l is the number of visits of v_i^l, and n_p^{(l-1)} is the number of visits of its parent node in the (l − 1)-th layer. A temperature term \tau [20] is utilized to promote greater discrimination between candidates by amplifying score differences; thus the sampling probability can be calculated as

P(v_i^l) = \frac{\exp(UCT(v_i^l)/\tau)}{\sum_j \exp(UCT(v_j^l)/\tau)}    (5)

where the v_j^l are the children of the current node. We sample a node from the current group of children using the calculated probabilities, then continue the sampling process until a leaf node is reached. Reaching a leaf node signifies the termination of the sampling process for the current epoch, and the selected path will be adopted in the next epoch. This cycle repeats at the end of every epoch until the maximum number of training epochs is reached.
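The following standalone sketch mirrors Eqs. (3)-(5). The constants C1, C2, λ, and τ are placeholders rather than the paper's tuned values, and the plain-number interface is chosen so that the snippet does not depend on a particular tree implementation.

```python
import math
import random

def node_communication(q, peer_qs, lam=0.5):
    # Eq. (3): blend the node's Q with the mean Q of same-layer nodes that
    # share its operation and magnitude range.
    mean_peer = sum(peer_qs) / len(peer_qs) if peer_qs else q
    return (1.0 - lam) * q + lam * mean_peer

def uct_score(q, visits, parent_visits, s, c1=1.0, c2=1.0):
    # Eq. (4): exploitation + exploration + node communication
    n = max(visits, 1)
    return q / n + c1 * math.sqrt(math.log(max(parent_visits, 1)) / n) + c2 * s

def sample_index(uct_scores, tau=1.0):
    # Eq. (5): temperature softmax over the children's UCT scores
    m = max(uct_scores)
    weights = [math.exp((u - m) / tau) for u in uct_scores]
    total = sum(weights)
    r, acc = random.random() * total, 0.0
    for idx, w in enumerate(weights):
        acc += w
        if r <= acc:
            return idx
    return len(weights) - 1

# toy example: choosing among three children of one node
scores = [
    uct_score(q=1.02, visits=4, parent_visits=12, s=node_communication(1.02, [0.98, 1.01])),
    uct_score(q=0.97, visits=3, parent_visits=12, s=node_communication(0.97, [0.99])),
    uct_score(q=1.05, visits=5, parent_visits=12, s=node_communication(1.05, [])),
]
print(sample_index(scores))
```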

3 Implementation and Experiments

3.1 Datasets and Implementation Details

Datasets. We conduct our experiments on several 3D prostate MRI datasets: subsets 1 and 2 are from the NCI-ISBI 2013 challenge [4], subset 3 is from the I2CVB benchmark [12], subsets 4, 5, and 6 are from PROMISE12 [15], and subset 7 is the Task 005 prostate dataset from the Medical Segmentation Decathlon [23]. Subsets 1 through 6 are acquired from, and have been resized by, [16]. All datasets are then resampled and normalized to zero mean and unit variance as described in nnUNet [9].

Implementation Details. For a fair comparison, we base our implementation on the original nnUNet repository. We only inserted additional code for the implementation of DDAug while keeping the model architecture and self-configuration process intact. We conduct 5-fold cross-validation and use stochastic gradient descent with Nesterov momentum and a learning rate of 0.01. Each fold trains for 200 epochs, and each epoch has 250 batches with a batch size of 2. The runtime comparison can be found in Table 2. Reinforcement learning and the ASNG method demand substantial GPU resources, whereas our approach is as efficient as the original nnUNet data augmentation.

Table 2. Comparison of GPU costs with different augmentation methods.

Method    RL [27]   ASNG [7, 26]   DDAug (Ours)   nnUNet [9]
Cost (h)  768       100            40             40


Compared Methods. Since ASNG [26] requires ten days of GPU processing time and our objective is to create an effective and efficient automated search method, we only compare implementations on nnUNet that have the same GPU runtime requirements. Limited by the size of each subset, we conduct all of our experiments using 5-fold cross-validation and report the mean validation Dice score inferred with weights from the last epoch. Our baselines are established by training nnUNet with no augmentation (NoDA) and with the default augmentations (moreDA). The 'moreDA' configuration is a set of sequential augmentations including scaling, rotating, adding Gaussian noise, adding Gaussian blur, multiplicative brightness transform, contrast augmentation, simulating low resolution, gamma transform, and axis mirroring.

Ablation Study. In our ablation study, we start by replacing the sequential augmentation operations in moreDA with the uniform sampling mechanism described in TrivialAugment [19]. This allows us to assess the viability of using the natural-image SOTA approach on medical images. To evaluate the effectiveness of the proposed search space, we extend moreDA's operation search space with additional spatial augmentations (Spatial SS). Finally, we replace moreDA with DDAug to examine the advantage of using the expanded search space in conjunction with Monte-Carlo Tree Search (MCTS).

3.2 Experimental Results

Table 3. Augmentation performance on different prostate datasets in Dice (%). Subsets represent different prostate datasets and the backbone is nnUNet. NoDA: no augmentation; moreDA: sequential augmentation; Spatial SS: our designed search space; TrivialAugment: natural image SOTA method; DDAug: MCTS + our search space (proposed method). Red and blue denote the highest and second highest scores.

Method             Subset 1  Subset 2  Subset 3  Subset 4  Subset 5  Subset 6  Subset 7  Average
NoDA               79.12     80.82     84.57     82.02     78.10     82.77     72.36     79.97
moreDA             79.64     81.66     87.60     81.38     83.74     87.12     71.23     81.77
TrivialAugment     80.39     82.21     88.42     82.60     86.36     86.60     72.58     82.74
Spatial SS (Ours)  79.96     82.18     87.68     83.74     85.69     86.99     72.90     82.73
DDAug (Ours)       80.27     82.72     87.46     88.59     86.40     87.17     73.20     83.69

The five-fold average Dice similarity coefficients of different methods are shown in Table 3. As we can see, in general, adding augmentation to the prostate MRI datasets is better than no augmentation. moreDA demonstrates some improvement over NoDA on most of the datasets, and additional performance increases are observed when expanding the search space with spatial-level augmentations. When comparing the performance of Spatial SS and TrivialAugment, the improvement proves to be inconclusive, as three out of the seven datasets exhibit degradation. This is likely due to the fact that TrivialAugment uses uniform sampling over the search space and, unlike our DDAug, does not consider the efficacy of different operations. We are able to further improve the results by utilizing DDAug's full search space and its tree search method. It is important to note that moreDA contains 9 sequential augmentations while DDAug only uses 5. This indicates that simply piling on more augmentations sequentially is not the optimal solution. Though using only a few operations per epoch, DDAug still achieves the highest average Dice across all 7 subsets with near-zero additional computing consumption. The performance difference also translates into visible differences between the methods' segmentations. When inspecting validation segmentation results, we notice that DDAug is significantly more robust on validation cases, as shown in Fig. 2, and demonstrates enhanced generalizability against its counterparts. Augmenting data sequentially, on the other hand, was not able to handle difficult validation cases.

Fig. 2. Comparison of inference results using different augmentation techniques during training. The top row shows validation images and their corresponding ground truth. The subsequent rows are inference results from models trained with no augmentation, moreDA augmentation, our designed search space, and DDAug, respectively.

4 Conclusion

We propose an efficient, zero-GPU-overhead automatic data augmentation algorithm for prostate MRI segmentation. Compared with previous approaches, we include additional spatial transformations in the search space and adopt a Monte-Carlo tree structure to store the various augmentation operations. An optimal augmentation strategy can be obtained by updating, pruning, and sampling the tree. Our method outperforms the state-of-the-art manual and natural-image automatic augmentation methods on several prostate datasets. We show the feasibility of utilizing automatic data augmentation without increasing GPU consumption. In future work, we will further investigate the generalizability of tree search on other medical segmentation datasets, e.g., liver cancer segmentation, brain tumor segmentation, and abdominal multi-organ segmentation.

Acknowledgement. This work was supported by the Hong Kong Innovation and Technology Fund under Project ITS/030/21, as well as by Foshan HKUST Projects under Grants FSUST21-HKUST10E and FSUST21-HKUST11E.

Author Contributions. Xinyue Xu and Yuhan Hsi contributed equally to this work.

References

1. Auer, P., Cesa-Bianchi, N., Fischer, P.: Finite-time analysis of the multiarmed bandit problem. Mach. Learn. 47, 235–256 (2002)
2. Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: AutoAugment: learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 113–123 (2019)
3. Cubuk, E.D., Zoph, B., Shlens, J., Le, Q.V.: RandAugment: practical automated data augmentation with a reduced search space. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, pp. 702–703 (2020)
4. Farahani, K., Kirby, J., et al., Huisman, H., et al.: NCI-ISBI 2013 challenge – automated segmentation of prostate structures (2015)
5. Garcea, F., Serra, A., Lamberti, F., Morra, L.: Data augmentation for medical imaging: a systematic literature review. Comput. Biol. Med., 106391 (2022)
6. Gardner, S.J., et al.: Contouring variability of human- and deformable-generated contours in radiotherapy for prostate cancer. Phys. Med. Biol. 60(11), 4429 (2015)
7. He, W., Liu, M., Tang, Y., Liu, Q., Wang, Y.: Differentiable automatic data augmentation by proximal update for medical image segmentation. IEEE/CAA J. Autom. Sinica 9(7), 1315–1318 (2022)
8. Ho, D., Liang, E., Chen, X., Stoica, I., Abbeel, P.: Population based augmentation: efficient learning of augmentation policy schedules. In: International Conference on Machine Learning, pp. 2731–2741. PMLR (2019)
9. Isensee, F., Jaeger, P.F., Kohl, S.A., Petersen, J., Maier-Hein, K.H.: nnU-Net: a self-configuring method for deep learning-based biomedical image segmentation. Nat. Methods 18(2), 203–211 (2021)
10. Jia, H., Song, Y., Huang, H., Cai, W., Xia, Y.: HD-Net: hybrid discriminative network for prostate segmentation in MR images. In: Shen, D., et al. (eds.) MICCAI 2019. LNCS, vol. 11765, pp. 110–118. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-32245-8_13
11. Kocsis, L., Szepesvári, C.: Bandit based Monte-Carlo planning. In: Fürnkranz, J., Scheffer, T., Spiliopoulou, M. (eds.) ECML 2006. LNCS (LNAI), vol. 4212, pp. 282–293. Springer, Heidelberg (2006). https://doi.org/10.1007/11871842_29
12. Lemaître, G., Martí, R., Freixenet, J., Vilanova, J.C., Walker, P.M., Meriaudeau, F.: Computer-aided detection and diagnosis for prostate cancer based on mono and multi-parametric MRI: a review. Comput. Biol. Med. 60, 8–31 (2015)
13. Lin, C., et al.: Online hyper-parameter learning for auto-augmentation strategy. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 6579–6588 (2019)
14. LingChen, T.C., Khonsari, A., Lashkari, A., Nazari, M.R., Sambee, J.S., Nascimento, M.A.: UniformAugment: a search-free probabilistic data augmentation approach. arXiv preprint arXiv:2003.14348 (2020)
15. Litjens, G., Toth, R.V.W., Hoeks, C., Ginneken, B.K.S.: MICCAI grand challenge: prostate MR image segmentation 2012 (2012)
16. Liu, Q., Dou, Q., Yu, L., Heng, P.A.: MS-Net: multi-site network for improving prostate segmentation with heterogeneous MRI data. IEEE Trans. Med. Imaging (2020)
17. Mahapatra, D., Buhmann, J.M.: Prostate MRI segmentation using learned semantic knowledge and graph cuts. IEEE Trans. Biomed. Eng. 61(3), 756–764 (2013)
18. Milletari, F., Navab, N., Ahmadi, S.A.: V-Net: fully convolutional neural networks for volumetric medical image segmentation. In: 2016 Fourth International Conference on 3D Vision (3DV), pp. 565–571. IEEE (2016)
19. Müller, S.G., Hutter, F.: TrivialAugment: tuning-free yet state-of-the-art data augmentation. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 774–782 (2021)
20. Neumann, L., Zisserman, A., Vedaldi, A.: Relaxed softmax: efficient confidence auto-calibration for safe pedestrian detection (2018)
21. Ronneberger, O., Fischer, P., Brox, T.: U-Net: convolutional networks for biomedical image segmentation. In: Navab, N., Hornegger, J., Wells, W.M., Frangi, A.F. (eds.) MICCAI 2015. LNCS, vol. 9351, pp. 234–241. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-24574-4_28
22. Shorten, C., Khoshgoftaar, T.M.: A survey on image data augmentation for deep learning. J. Big Data 6(1), 1–48 (2019)
23. Simpson, A.L., et al.: A large annotated medical image dataset for the development and evaluation of segmentation algorithms. arXiv preprint arXiv:1902.09063 (2019)
24. Su, X., et al.: Prioritized architecture sampling with Monto-Carlo tree search. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 10968–10977 (2021)
25. Tian, K., Lin, C., Sun, M., Zhou, L., Yan, J., Ouyang, W.: Improving auto-augment via augmentation-wise weight sharing. Adv. Neural. Inf. Process. Syst. 33, 19088–19098 (2020)
26. Xu, J., Li, M., Zhu, Z.: Automatic data augmentation for 3D medical image segmentation. In: Martel, A.L., et al. (eds.) MICCAI 2020. LNCS, vol. 12261, pp. 378–387. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-59710-8_37
27. Yang, D., Roth, H., Xu, Z., Milletari, F., Zhang, L., Xu, D.: Searching learning strategy with reinforcement learning for 3D medical image segmentation. In: Shen, D., et al. (eds.) MICCAI 2019. LNCS, vol. 11765, pp. 3–11. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-32245-8_1
28. Yu, L., Yang, X., Chen, H., Qin, J., Heng, P.A.: Volumetric convnets with mixed residual connections for automated prostate segmentation from 3D MR images. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 31 (2017)

Language Guided Graph Transformer for Skeleton Action Recognition

Libo Weng, Weidong Lou, and Fei Gao(B)

College of Computer and Science, Zhejiang University of Technology, Hangzhou, China
[email protected]

Abstract. The Transformer is a neural network architecture based on the self-attention mechanism; it was developed primarily for natural language processing and is currently being introduced to the computer vision domain. However, the Transformer has not been widely applied to human action recognition. Action recognition is typically formulated as a single classification task, and existing recognition algorithms do not fully leverage the semantic relationships within actions. In this paper, a new method named Language Guided Graph Transformer (LGGT) for skeleton action recognition is proposed. LGGT combines textual information with a Graph Transformer to incorporate semantic guidance into skeleton-based action recognition. Specifically, it employs the Graph Transformer as the encoder for skeleton data to extract feature representations and effectively capture long-distance dependencies between joints. Additionally, LGGT utilizes a large-scale language model as a knowledge engine to generate textual descriptions specific to different actions; these descriptions capture the semantic relationships between actions and help the model understand, recognize, and classify different actions more accurately. We extensively evaluate the proposed method on the Smoking dataset, the Kinetics-Skeleton dataset, and the NTU RGB+D action dataset. The experimental results demonstrate significant performance improvements of our method on these datasets, and the ablation study shows that the introduction of semantic guidance further enhances the model's performance.

Keywords: Action Recognition · Transformer · CLIP · Skeleton-based data

1 Introduction

Due to the widespread applications of action recognition in areas such as video surveillance, motion analysis, health monitoring, and human-computer interaction, it has become a popular research topic. In recent years, with the emergence of depth sensors such as Kinect [1] and RealSense [2], acquiring human body joint
information has become easier. Skeleton-based action recognition involves classifying actions based on the two-dimensional or three-dimensional coordinates of the acquired joint positions. Compared to RGB images or optical flow features used for action recognition, skeleton data not only preserves the natural topology of the human body but is also less affected by environmental factors such as lighting, viewpoint, or color, providing the model with stronger robustness. For skeleton-based action recognition, early methods employed models based on Recurrent Neural Networks (RNN) or Convolutional Neural Networks (CNN) to analyze skeleton sequences. For example, Du et al. [3] proposed an end-to-end hierarchical RNN that divides the human skeleton into five body parts and feeds them into multiple bidirectional RNNs to capture high-level features of actions. Li et al. [4] designed a simple yet effective CNN architecture for action classification from trimmed skeleton sequences. In recent years, embedding skeleton data into a graph has been shown to effectively extract structural features of the skeleton. Yan et al. [5] introduced the Spatial Temporal Graph Convolutional Network (ST-GCN) for skeleton-based action recognition. ST-GCN constructs a spatial graph by considering the natural connections between human joints and builds temporal edges between corresponding joints across consecutive frames. Recently, most works on action recognition have utilized Graph Convolutional Networks (GCN) to capture the structured features of spatial-temporal graphs. Numerous studies have demonstrated that GCN outperforms other network representation learning methods, such as RNN or CNN, in tasks like action recognition [5–8]. In fact, GCN is a natural extension of convolutional operations in the graph domain. By aggregating features from neighboring regions using the convolutional operator, GCN can effectively learn spatial-temporal features of the graph. However, these GCN-based recognition methods still face two challenging problems: (1) Due to the sparsity of the graph, features of unrelated nodes can interfere with each other during the graph convolution process. Therefore, some existing works also focus on using depth maps, as they not only provide 3D geometric information but also preserve shape information [9]. (2) Previous research [10] has shown that fusing features from different modalities can significantly improve the accuracy of action recognition. For example, textual information contains rich prior knowledge, and its fine-grained semantic relationships are beneficial for skeleton-based action recognition. However, how to fully utilize textual information to better explore action features remains a key issue in action recognition. In this paper, we propose a novel approach that combines textual information with Graph Transformer to introduce semantic guidance in skeleton-based action recognition. This network can handle both skeleton data and textual data simultaneously, extracting their respective features, and improving the accuracy of action recognition by fusing the features of both types of data. Additionally, LGGT improves the feature representation of skeleton data to accommodate sparse skeleton graphs. The main contributions of this paper are as follows: – We design a Graph Transformer module to extract three-dimensional features from skeletons, accurately capturing the spatial-temporal relationships

between joints. This module maps 2D skeleton data into 3D space, capturing three-dimensional motion information. Compared to traditional feature extractors, the Graph Transformer mitigates the over-smoothing problem and captures long-distance dependencies more effectively. – We employ a large-scale language model as a knowledge engine capable of generating meaningful textual descriptions to guide the skeleton action recognition task. We automatically generate detailed descriptions for different actions using textual prompts. This approach enhances the model's understanding of human behavior, enabling more accurate recognition and classification of different actions. – We design an action recognition network model that integrates textual and skeleton features. Experimental results demonstrate that incorporating detailed textual features significantly improves action recognition performance.

2 Related Works

Skeleton-based Action Recognition. Skeleton-based action recognition is a method that utilizes the rich human motion information contained in skeleton data. The skeleton and joint trajectories of the human body are robust to lighting variations and scene changes, and they are easily obtained due to highly accurate depth sensors or pose estimation algorithms [11]. As a result, there is a wide range of skeleton-based action recognition methods available. These methods can be classified into feature-based handcrafted approaches and deep learning methods. Feature-based handcrafted approaches capture the dynamics of joint movements by designing several handcrafted features. These features include the covariance matrix of joint trajectories [12], relative positions of joints [13], or rotations and translations between body parts [14]. More recently, through exploring the inherent connectivity of the human body, GCN, particularly ST-GCN [5], have achieved significant success in this task, demonstrating the effectiveness of skeleton features in action recognition. The ST-GCN model consists of spatial and temporal convolutional modules, where the graph convolution method employed is based on traditional graph filtering. The graph adjacency matrix encodes the connections between skeleton joints and extracts higher-level spatial representations from the skeleton action sequences. In the temporal dimension, one-dimensional convolutions are beneficial for extracting dynamic information of actions. Many subsequent works have improved upon ST-GCN to enhance its recognition performance. Li et al. [15] proposed AS-GCN, which extends the connectivity of human skeletons by capturing the latent dependency relationships encoded in the adjacency matrix. Additionally, it directly captures temporal links from actions, enabling the capture of richer dependency relationships. Lei et al. [16] introduced Directed Graph Neural Networks (DGNNs), which merge joint and bone information into a directed acyclic graph to represent skeleton data. Liu et al. [17] proposed a unified spatial-temporal Graph Convolution module (G3D) to

aggregate information across spatial and temporal dimensions for effective feature learning. Some studies focus on addressing the computational complexity of GCN-based methods. Cheng et al. [18] introduced Shift-GCN, which utilizes shift convolution to reduce computational complexity. Song et al. [19] proposed a multi-branch Residual GCN (ResGCN) that integrates spatial-temporal features from multiple branches and achieves competitive performance with fewer parameters.

Transformers and Self-Attention Mechanism. Vaswani [20] first introduced the Transformer module for machine translation, which has become the state-of-the-art approach in various NLP tasks. For instance, GPT [21] and BERT [22] are currently the most effective language models based on the Transformer. The core component of the Transformer module is the self-attention mechanism, which learns relationships between each element in a sequence. Unlike recurrent networks that process sequences recursively and are limited to short-term context, the Transformer module supports modeling long-term dependencies in a sequential manner. Additionally, the multi-head self-attention operation is easily parallelizable. In recent years, Transformer-based models have gained significant attention in the field of computer vision. Compared to traditional computer vision models based on convolutional operations, the Transformer brings some unique advantages. Convolutional operations capture short-term dependencies within a fixed-size window, which also applies to GCN. However, graph convolution operations cannot capture long-range relationships between joints in the spatial and temporal dimensions. Vision Transformer (ViT) [23] was the first successful application of the Transformer model to image data. Wang et al. [24] proposed the IIP-Transformer for action recognition, utilizing partial skeleton data encoding. Specifically, the IIP-Transformer considers the body joints as five parts and captures dependencies within and between the parts. To predict molecular structures more accurately, Vijay et al. [25] extended the Transformer to ZINC molecular graphs and introduced a Graph Transformer operator. The Graph Transformer operation aggregates features of adjacent vertices and updates the vertices themselves. Therefore, this operator is more suitable for sparse graphs such as our skeleton graph.

CLIP. Multi-modal representation learning methods such as CLIP [26] and ALIGN [27] have demonstrated that visual-language co-training can yield powerful representations for downstream tasks, such as zero-shot learning, image captioning, and text-image retrieval. These methods have shown that natural language can be utilized for reasoning and achieve zero-shot transfer learning for downstream tasks. However, these approaches require large-scale image-text paired datasets for training. ActionCLIP [28] follows the training scheme of CLIP for video action recognition. It utilizes a pre-trained CLIP model and adds transformation layers for temporal modeling of video data, achieving promising results in video action recognition.

3 Method

In this section, we provide a detailed overview of the LGGT framework and its main components. LGGT is a GCN-based skeleton action recognition model whose primary components are the skeleton feature extractor and the text feature extractor. The overview of LGGT is shown in Fig. 1. For a given input skeleton sequence, we start by feeding the skeleton data into the spatial transformer. Subsequently, with the incorporation of temporal positional embeddings, the data are fed into the temporal transformer module, thereby obtaining the spatial-temporal feature representation of the skeleton. Following this, the acquired spatial-temporal features are simultaneously fed into the GCN module and the text feature fusion module. Guided by the supervision of the text features, the final classification outcome is obtained. In the following sections, we describe the Graph Transformer, the text encoder, and the other key components of LGGT in detail.


Fig. 1. Architecture of LGGT.

3.1 Skeleton Data Preprocessing

Skeleton data typically consists of a series of consecutive skeleton frames, where each frame contains a set of joint coordinates. Given the skeleton data, we can

construct a natural human connectivity graph G = (V, E), where V = {v_{ti} | t = 1, 2, \cdots, T, i = 1, 2, \cdots, N} represents all the joint nodes, N denotes the number of joints, and T represents the total number of frames. The edge set E consists of two parts: spatial edges and temporal edges. The spatial edges, denoted as E_s = {⟨v_{ti}, v_{tj}⟩ | (i, j) ∈ H}, capture the spatial connections between joints based on the human body's natural structure, where H represents the set of joint connections. The temporal edges, denoted as E_t = {⟨v_{ti}, v_{(t+1)i}⟩}, connect the same joint in consecutive frames. Additionally, we can incorporate features such as inter-joint distances, inter-frame joint displacements, and joint angles to enrich the graph representation.
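As a concrete illustration, the sketch below builds the adjacency matrix of such a spatial-temporal graph for a toy skeleton; the bone list is a hypothetical subset of the joint-connection set H.

```python
import numpy as np

# Build the adjacency matrix of a spatial-temporal skeleton graph.
# `bones` is an illustrative subset of the natural joint connections H.
def build_skeleton_graph(num_joints, num_frames, bones):
    # node (t, i) is flattened to index t * num_joints + i
    n = num_frames * num_joints
    A = np.zeros((n, n), dtype=np.float32)
    for t in range(num_frames):
        off = t * num_joints
        for i, j in bones:                      # spatial edges E_s within a frame
            A[off + i, off + j] = A[off + j, off + i] = 1.0
        if t + 1 < num_frames:                  # temporal edges E_t between frames
            nxt = (t + 1) * num_joints
            for i in range(num_joints):
                A[off + i, nxt + i] = A[nxt + i, off + i] = 1.0
    return A

bones = [(0, 1), (1, 2), (2, 3)]                # toy chain of 4 joints
print(build_skeleton_graph(4, 3, bones).shape)  # (12, 12)
```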

3.2 GCN

In ST-GCN, the skeletal graph is constructed by treating joints as vertices and bones as edges. For consecutive frames, the corresponding joints are connected with temporal edges. Each vertex (joint) has an attribute represented by its coordinate vector. Based on the constructed skeletal graph, multiple layers of spatial-temporal graph convolution are applied to obtain high-level feature maps. The spatial-temporal graph convolution operation aggregates information from neighboring joints to capture spatial and temporal dependencies. After the multiple layers of spatial-temporal graph convolution, global average pooling is performed to compress the feature maps into a fixed-length vector representation. Finally, a SoftMax classifier is used to predict the action category based on the vector representation. This design allows ST-GCN to learn rich spatial and temporal features from skeletal data and effectively classify and recognize actions. For the spatial dimension, the convolution operation for each graph can be expressed as follows:

f_{out} = \sum_{i}^{K_v} W_i \left( f_{in} \Lambda_i^{-\frac{1}{2}} A_i \Lambda_i^{-\frac{1}{2}} \right) \otimes M_i    (1)

Here, f_{in} is the C_{in} × T × N feature map, where N denotes the number of vertexes, T denotes the temporal length, and C_{in} denotes the number of input channels. A_i is an N × N adjacency matrix and \Lambda_i is a normalized diagonal matrix, with diagonal elements \Lambda_i^{jj} = \sum_k (A_i^{jk}) + \alpha. To ensure that the diagonal elements of \Lambda_i are all non-zero, \alpha is set to 0.001. K_v represents the kernel size in the spatial dimension. According to the aforementioned partitioning strategy, K_v is chosen as 3: A_0 = I represents the self-connections of the vertices, A_1 represents the connections within the centripetal subset, and A_2 represents the connections within the centrifugal subset. W_i is a 1 × 1 convolution operation with a weight of size C_{out} × C_{in} × 1 × 1. M_i is an N × N attention map that represents the importance of each vertex. ⊗ denotes element-wise matrix multiplication, which means it only affects the vertices connected to the current target.
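A compact PyTorch sketch of Eq. (1) is given below. Applying the attention map M element-wise to the normalized adjacency follows the common ST-GCN implementation reading of the ⊗ term; shapes and toy values are illustrative.

```python
import torch

def spatial_graph_conv(f_in, A_norm, W, M):
    """
    Sketch of Eq. (1).
    f_in:   (B, C_in, T, N) input features
    A_norm: (K_v, N, N) normalized adjacency, one matrix per partition subset
    W:      list of K_v 1x1 conv layers (C_in -> C_out)
    M:      (K_v, N, N) learnable attention maps
    """
    out = 0
    for i in range(A_norm.size(0)):
        a = A_norm[i] * M[i]                        # element-wise edge weighting
        x = torch.einsum('bctn,nm->bctm', f_in, a)  # aggregate neighbouring joints
        out = out + W[i](x)                         # per-subset 1x1 convolution
    return out

# example usage with random data
B, C_in, C_out, T, N, K_v = 2, 3, 64, 30, 25, 3
conv = torch.nn.ModuleList(torch.nn.Conv2d(C_in, C_out, 1) for _ in range(K_v))
f = torch.randn(B, C_in, T, N)
A = torch.rand(K_v, N, N)
M = torch.ones(K_v, N, N)
print(spatial_graph_conv(f, A, conv, M).shape)  # torch.Size([2, 64, 30, 25])
```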

3.3 Graph Transformer

Although GCN has a strong capability for capturing spatial features, the dependencies between joints, especially the implicit long-range dependencies, can still easily be underestimated. Additionally, because the graph convolution operator is a form of Laplacian smoothing, it is prone to over-smoothing as layers are stacked, leading to a "deadlock" in feature representation. Therefore, this work designs a skeleton Graph Transformer on top of GCN, specifically tailored for action recognition, to enhance the dependencies between locally adjacent joints using graph self-attention. The effectiveness of this approach is confirmed through ablation studies. This section provides a detailed description of how the Graph Transformer operates on skeleton data.

Graph Self-attention Layer. The attention mechanism used in this paper is a single-layer feed-forward neural network. To incorporate the structural information of the graph, only neighboring nodes are allowed to participate in the attention mechanism of the current node. To ensure that the attention coefficients between different nodes are easily comparable, a softmax function is used to normalize the attention coefficients of all nodes. The fully unfolded formula of the attention mechanism is as follows:

\alpha_{ij} = \frac{\exp\left(\mathrm{LeakyReLU}\left(\vec{a}^{T} [W \vec{h}_i \| W \vec{h}_j]\right)\right)}{\sum_{k \in N_i} \exp\left(\mathrm{LeakyReLU}\left(\vec{a}^{T} [W \vec{h}_i \| W \vec{h}_k]\right)\right)}    (2)

where \alpha_{ij} represents the importance of the features of node j to node i, W \in \mathbb{R}^{F \times F} is a shared linear transformation that converts the node input features into high-dimensional features to achieve sufficient expressive power, and \vec{h}_i is the feature vector of node i. LeakyReLU is a non-linear activation function. \vec{a} \in \mathbb{R}^{2F} is a weight vector that maps the concatenated high-dimensional features of length 2F to a scalar, and \| denotes the concatenation of node features of length F.


Fig. 2. The multi-head graph self-attention mechanism.

To stabilize the learning process of graph self-attention, this paper employs a multi-head attention mechanism, as shown in Fig. 2. It leverages multiple heads to jointly model information from different positions in different representation subspaces. The formula is as follows:

\mathrm{GSA}(\vec{h}_i) = \mathrm{Concat}(\alpha_{i1}, \alpha_{i2}, \cdots, \alpha_{ij}) W \vec{h}_i    (3)
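A minimal single-head PyTorch sketch of the graph self-attention in Eq. (2) is shown below (a multi-head version would concatenate several such heads, as in Eq. (3)); dimensions and the toy adjacency are illustrative.

```python
import torch
import torch.nn.functional as F

class GraphSelfAttention(torch.nn.Module):
    """Single-head graph self-attention following Eq. (2)."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.W = torch.nn.Linear(in_dim, out_dim, bias=False)   # shared projection
        self.a = torch.nn.Linear(2 * out_dim, 1, bias=False)    # attention vector
        self.leaky_relu = torch.nn.LeakyReLU(0.2)

    def forward(self, h, adj):
        # h: (N, F) node features, adj: (N, N) binary adjacency (1 = neighbour)
        Wh = self.W(h)                                           # (N, F')
        N = Wh.size(0)
        # pairwise concatenation [W h_i || W h_j] for all (i, j)
        pair = torch.cat([Wh.unsqueeze(1).expand(N, N, -1),
                          Wh.unsqueeze(0).expand(N, N, -1)], dim=-1)
        e = self.leaky_relu(self.a(pair)).squeeze(-1)            # (N, N) raw scores
        e = e.masked_fill(adj == 0, float('-inf'))               # neighbours only
        alpha = F.softmax(e, dim=-1)                             # Eq. (2)
        return alpha @ Wh                                        # weighted features

# usage: 25 joints, 3-channel coordinates, toy adjacency
gsa = GraphSelfAttention(3, 64)
h = torch.randn(25, 3)
adj = torch.eye(25)
print(gsa(h, adj).shape)   # torch.Size([25, 64])
```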

where Concat(·) represents the concatenation of the features of adjacent nodes.

Spatial Transformer. As shown in Fig. 1, high-dimensional feature embeddings are extracted from a single frame. Given two-dimensional data with N joints, each joint (i.e., a two-dimensional coordinate) is treated as a patch, and feature extraction is performed on all patches using a standard Vision Transformer pipeline. First, trainable linear projections are used to map the coordinates of each joint to a high-dimensional space, producing the spatial patch embeddings. These embeddings are then added to learnable spatial position embeddings E_{SPos} \in \mathbb{R}^{J \times C}. Consequently, the input x_i \in \mathbb{R}^{1 \times (J \cdot 2)} of each frame is transformed into Z_0^i \in \mathbb{R}^{J \times C}, where 2 corresponds to the two-dimensional coordinates per joint and C denotes the spatial embedding dimension [23]. The resulting sequence of joint features is fed into a spatial transformer encoder, which employs self-attention to integrate information from all joints. The structure of the spatial transformer encoder with L layers can be represented as follows:

Z_l = \mathrm{GSA}(\mathrm{LN}(Z_{l-1})), \quad l = 1, 2, \cdots, L    (4)

Z_l = \mathrm{MLP}(\mathrm{LN}(Z_l)), \quad l = 1, 2, \cdots, L    (5)

where Z \in \mathbb{R}^{J \times C} represents the embedding features, LN(·) refers to the normalization layer, and MLP(·) denotes the multi-layer perceptron.

Temporal Transformer. Since the spatial transformer module encodes high-dimensional features for each individual frame, the goal of the temporal transformer module is to model the dependencies across the frame sequence. For the temporal transformer encoder, we adopt the same structure as the spatial transformer encoder, consisting of multi-head self-attention blocks and MLP blocks. It is worth noting that this module replaces graph self-attention with standard self-attention. The output of the temporal transformer module is denoted as Y \in \mathbb{R}^{f \times (J \cdot C)}, where f is the number of input frames. The temporal transformer module captures the temporal dependencies and interactions among the features across the frame sequence, allowing the model to learn the temporal dynamics of the data.

3.4 Text Feature Module

The text feature module consists of three components: Text description template, Text feature extraction, and Text feature fusion.


Prefix template: "Smoking, this is an action".  Suffix template: "Human action of Smoking".

Fig. 3. Examples of the prefix and suffix templates used to generate text descriptions.

Text Description Template. Appropriate action descriptions play a crucial role in obtaining textual features. In this study, we follow the text description generation template proposed in [28] and generate action descriptions directly from label names. The action labels are inserted into text templates with prefixes and suffixes to generate text descriptions. The prefix template is "[label], this is a human action" and the suffix template is "This is a human action of [label]", where [label] represents the human action label, as shown in Fig. 3.

Text Feature Extraction. We utilize a text encoder to extract textual features. Given the recent success of Transformer models in the field of NLP, we adopt the design proposed by Vaswani [20] as our text encoder. Each layer of the text encoder consists of a self-attention layer, a normalization layer, and a multi-layer perceptron. Apart from the self-attention layer, the text encoder structure is similar to the spatial transformer encoder. The formula for self-attention is as follows:

\mathrm{SA}(Q_l, K_l, V_l) = \mathrm{softmax}\left(\frac{Q_l K_l^{T}}{\sqrt{d}}\right) V_l    (6)

where Q_l, K_l, V_l represent the query, key, and value matrices of the word embeddings, respectively. They are computed by linearly mapping the word embeddings Z_{l-1} with the weight matrices W_Q, W_K, and W_V. The scale factor d equals the dimension of the query and key matrices and is used to normalize the dot product between the query and key vectors during the self-attention calculation.

Text Feature Fusion. To obtain the correlation features between the skeleton and the text, we employ a similarity calculation module, referred to as the text encoding module in Fig. 1. We compute the cosine similarity between the text encoding features E_T, obtained by passing the entire label set through the text encoder, and the skeleton encoding features E_S, obtained by the Graph Transformer. The formulas for computing the skeleton-to-text and text-to-skeleton similarities, after applying softmax normalization, are as follows:

p_i^{s2t}(E_S^i) = \frac{\exp(\mathrm{sim}(E_S^i, E_T^i)/\tau)}{\sum_{j=1}^{N} \exp(\mathrm{sim}(E_S^j, E_T^i)/\tau)}    (7)


p_i^{t2s}(E_T^i) = \frac{\exp(\mathrm{sim}(E_T^i, E_S^i)/\tau)}{\sum_{j=1}^{N} \exp(\mathrm{sim}(E_T^i, E_S^j)/\tau)}    (8)

where τ is a learnable temperature parameter, N is the number of training instances, and sim(·) denotes the cosine similarity between encoding vectors. q^{s2t}(E_S) and q^{t2s}(E_T) denote the ground-truth similarity scores, where a score of 0 indicates inconsistency between the label and the skeleton and a score of 1 indicates consistency. We use the Kullback-Leibler (KL) divergence to define the skeleton-text pair loss for optimizing our model:

\mathrm{loss}_{sim} = \frac{1}{2}\,\mathbb{E}\big[\mathrm{KL}(p^{s2t}(E_S), q^{s2t}(E_S)) + \mathrm{KL}(p^{t2s}(E_T), q^{t2s}(E_T))\big]    (9)

Combined with the loss function of action classification, the overall objective is

\mathrm{loss} = \mathrm{loss}_{CEL} + \lambda \cdot \mathrm{loss}_{sim}    (10)

where loss_{CEL} represents the cross-entropy loss function for action classification, and λ is the balancing parameter.
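As a concrete illustration of Eqs. (7)-(9), the following is a minimal sketch of the skeleton-text similarity loss; the function name, the batch layout of E_S and E_T, and the construction of the ground-truth matrix q are illustrative assumptions, and PyTorch's kl_div convention (log-probabilities as input, reference distribution as target) is used for the KL terms.

```python
import torch
import torch.nn.functional as F

def skeleton_text_loss(E_S, E_T, q, tau):
    """E_S, E_T: (N, D) skeleton/text encodings for N training instances;
    q: (N, N) ground-truth similarity scores (1 for matching skeleton-label
    pairs, 0 otherwise, normalized per row); tau: temperature parameter."""
    E_S = F.normalize(E_S, dim=-1)               # cosine similarity via
    E_T = F.normalize(E_T, dim=-1)               # normalized dot products
    sim = E_S @ E_T.t()                          # sim(E_S^i, E_T^j)
    p_s2t = F.log_softmax(sim / tau, dim=1)      # Eq. (7), in log space
    p_t2s = F.log_softmax(sim.t() / tau, dim=1)  # Eq. (8)
    loss_s2t = F.kl_div(p_s2t, q, reduction="batchmean")
    loss_t2s = F.kl_div(p_t2s, q.t(), reduction="batchmean")
    return 0.5 * (loss_s2t + loss_t2s)           # Eq. (9)

# Overall objective, Eq. (10): loss = loss_CEL + lambda_ * loss_sim
```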

4 Experiments

We conduct extensive experiments on our self-collected smoking dataset, as well as the Kinetics-Skeleton [29] and NTU-RGB+D [30] large-scale skeleton action recognition datasets, to validate the effectiveness of the proposed method.

4.1 Dataset

Smoking Dataset: Currently, there are many publicly available datasets in the field of action recognition, such as Kinetics-Skeleton [29] and NTU RGB+D [30]. To validate the effectiveness of our algorithm, we created a custom smoking dataset. This dataset was created by running OpenPose on Kinetics video data and includes two action categories: smoking and non-smoking. The smoking actions include smoking, holding a cigarette, lighting a cigarette, exhaling smoke, and putting down the cigarette after smoking. The non-smoking actions include drinking water, eating food, riding a bicycle, and walking. The dataset consists of 3,025 samples in the training set and 1,190 samples in the testing set.

Kinetics-Skeleton [29]: The Kinetics-Skeleton dataset is a large-scale skeleton-based video action recognition dataset. It is derived from the Kinetics video dataset, which consists of 400 action categories, and uses OpenPose skeleton data. The Kinetics-Skeleton dataset contains skeleton sequences composed of 2D joint coordinates and confidences, where each sequence corresponds to a video action. It includes 18 major body joints, such as the head, neck, shoulders, elbows, wrists, hips, knees, and ankles. The dataset consists of 240,000 samples for training and 20,000 samples for testing, making it widely used for research and evaluation of skeleton-based video action recognition algorithms.


NTU-RGB+D [30]: NTU-RGB+D is a commonly used large-scale dataset for evaluating models on skeleton-based action recognition tasks. It consists of 56,880 samples from 40 different subjects, totaling over 4 million frames. The data was captured using the Microsoft Kinect V2 camera from 80 different viewpoints and includes RGB videos, depth sequences, skeleton data, and infrared data. The skeleton data consists of the 3D coordinates of 25 joints of the human body and the corresponding vectors for the skeleton connections. The NTU-RGB+D dataset provides two different benchmark settings for evaluation: Cross-View and Cross-Subject. In the Cross-View setting, three cameras are positioned at the same height but at different horizontal angles (−45°, 0°, 45°), resulting in different views: two frontal views, one left-side view, one right-side view, one left 45° view, and one right 45° view. Camera 1 always captures the 45° view, while cameras 2 and 3 capture the front and side views. Therefore, data captured by cameras 2 and 3 is used for training, and data captured by camera 1 is used for testing; the training set and testing set consist of 37,920 and 18,960 samples, respectively. In the Cross-Subject setting, there are 60 action classes, including 40 daily actions, 11 mutual actions, and 9 health-related actions; the training set and testing set consist of 40,320 and 16,560 samples, respectively.

4.2 Implementation Details

Because the number of frames may vary across samples within the same dataset, the sequence length is standardized to 300 frames for the NTU dataset and to 150 frames for the Kinetics-Skeleton and Smoking datasets. Samples longer than the standard length are cropped, and samples shorter than the standard length are padded with copies of the last frame. Additionally, to enhance the robustness of the model, random translation and horizontal flipping are applied to 100 randomly selected frames. Our model is implemented using the PyTorch deep learning framework and trained on four NVIDIA 3090 24 GB GPUs. First, the stochastic gradient descent (SGD) optimizer is used to speed up convergence. Second, L2 regularization and dropout (dropout rate of 0.5) are employed to mitigate overfitting, and the weight decay factor is set to 0.0001. The cross-entropy loss function is used. For NTU-RGB+D, the training batch size, testing batch size, and number of training epochs are set to (128, 64, 100); for Kinetics-Skeleton, to (128, 256, 120); and for the Smoking dataset, to (128, 256, 100). Furthermore, the initial learning rate is set to 0.1 and reduced by a factor of 10 at epochs 40 and 60.
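As an illustration of the sequence-length standardization above, the sketch below pads short clips by repeating the last frame and crops long ones; the (T, J, 2) array layout is an assumption made for the example.

```python
import numpy as np

def standardize_length(seq: np.ndarray, target_len: int) -> np.ndarray:
    """seq: (T, J, 2) array of 2D joint coordinates for T frames."""
    t = seq.shape[0]
    if t >= target_len:
        return seq[:target_len]                       # crop to the standard length
    pad = np.repeat(seq[-1:], target_len - t, axis=0)  # copies of the last frame
    return np.concatenate([seq, pad], axis=0)

# e.g. target_len = 300 for NTU-RGB+D, 150 for Kinetics-Skeleton and Smoking
```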

4.3 Ablation Study

In this section, we evaluate the impact of skeleton data preprocessing, the Graph Transformer, and the text encoder through experiments. Table 1 shows the impact of data preprocessing on model performance.


Table 1. Data preprocessing.

Data processing method | Accuracy (%)
None | 83.21
Joint point distance | 83.95
Joint movement distance between frames | 86.75
Joint angle | 84.23
All | 87.94

It can be observed that the different data processing methods all improve the accuracy of action classification, with improvements of 0.74%, 3.54%, and 1.02%, respectively. Among them, the addition of inter-frame distance data shows the most significant improvement in classification accuracy, indicating that effectively capturing temporal features can significantly enhance the performance of the network. Joint distance describes the shape and posture of human movements well and is often used to capture local relationships in actions. Joint angle features describe the motion and variation between joints, making them more suitable for extracting global features. Therefore, both methods can effectively improve recognition accuracy.

Table 2. Ablation study on Graph Transformer and Text Encoder in LGGT.

Graph Transformer | Text Encoder | Accuracy (%)
– | – | 77.96
– | ✓ | 79.39
✓ | – | 86.90
✓ | ✓ | 87.94

Table 2 presents the impact of the Graph Transformer and the Text Encoder on the model’s performance. The results reveal that compared to the model without the Graph Transformer, there is an improvement of 8.55% points in accuracy. Additionally, when the Text Encoder is not used, the accuracy is 86.90%, which increases to 87.94% when the Text Encoder is employed. Furthermore, when neither of them is used, the accuracy is 77.96%. These findings indicate that both the Graph Transformer and the Text Encoder have a positive effect on enhancing the model’s performance. Specifically, the Graph Transformer leverages graph attention mechanisms to capture relationships between different parts of the skeleton, while the Text Encoder assists in extracting valuable semantic information from textual descriptions, resulting in better recognition of different actions.


4.4 Comparison with State-of-the-Art Methods

In this section, we evaluate the performance of our method on multiple action recognition datasets: the Smoking dataset, Kinetics-Skeleton [29], and NTU-RGB+D [30]. We noticed that most existing methods adopt an integration strategy that combines joint features, joint motion features, skeleton features, and skeleton motion features. To ensure a fair comparison, we also employ this integration strategy. As shown in Tables 3, 4 and 5, our proposed method outperforms the majority of previous methods, demonstrating its superiority in action recognition tasks.

Table 3. Comparison with previous work on the Smoking dataset.

Methods | Accuracy (%)
ST-GCN [5] | 75.38
2S-AGCN [6] | 81.41
Shift-GCN [18] | 82.91
2S-AGC-LSTM [31] | 81.98
AS-GCN [15] | 83.45
DC-GCN [32] | 84.10
ST-TR-GCN [33] | 80.90
PR-GCN [34] | 84.92
CTR-GCN [7] | 87.73
Info-GCN [8] | 90.45
Ours | 87.94

Table 4. Comparison with previous work on Kinetics-Skeleton.

Methods | Top-1 Acc (%) | Top-5 Acc (%)
ST-GCN [5] | 30.7 | 52.8
2S-AGCN [6] | 36.1 | 58.7
AS-GCN [15] | 34.8 | 56.5
ST-TR-GCN [33] | 37.4 | 59.7
PR-GCN [34] | 38.1 | 58.2
SKP [35] | 39.2 | 62.8
Ours | 38.6 | 61.7


Table 5. Comparison with previous work on NTU-RGB+D.

Methods | X-Sub Acc (%) | X-View Acc (%)
ST-GCN [5] | 81.5 | 88.3
2S-AGCN [6] | 88.5 | 95.1
Shift-GCN [18] | 90.7 | 96.5
2S-AGC-LSTM [31] | 89.2 | 95.0
AS-GCN [15] | 86.8 | 94.2
SGN [36] | 89.0 | 94.5
ST-TR-GCN [33] | 89.9 | 96.1
DC-GCN [32] | 90.8 | 96.6
PR-GCN [34] | 85.2 | 91.7
MST-GCN [37] | 91.5 | 96.6
CTR-GCN [7] | 92.4 | 96.8
Info-GCN [8] | 92.7 | 96.9
Ours | 91.6 | 96.8

Table 3 presents the classification performance of various methods for action recognition on the Smoking dataset. The best-performing method is Info-GCN, achieving an accuracy of 90.45%. Our proposed method achieved an accuracy of 87.94%, surpassing baseline methods such as ST-GCN, 2S-AGCN, and Shift-GCN. This indicates that our method performs well in smoking recognition. Table 4 displays the classification performance of different methods, including ours, on the Kinetics-Skeleton dataset. Our method achieved Top-1 Acc and Top-5 Acc of 38.6% and 61.7%, respectively, outperforming most methods but still lower than the performance of Structured Keypoint Pooling. Structured Keypoint Pooling introduces a point cloud deep learning paradigm for action recognition, leveraging point cloud features to enhance recognition performance, which explains its superior performance compared to our method. Table 5 lists the classification performance of various methods, including ours, on the NTU-RGB+D dataset. Our method achieved accuracies of 91.6% and 96.8% in X-Sub and X-View, respectively, comparable to MST-GCN, CTR-GCN, and Info-GCN. CTR-GCN performs slightly better due to its utilization of dynamic adjacency matrices, which enhances the model’s capacity. On the other hand, Info-GCN improves generalization performance by introducing an information bottleneck-based MMD loss and comparing action representations through principal component analysis (PCA). Furthermore, Kinetics-Skeleton exhibits inferior classification performance compared to NTU-RGB+D. This is attributed to the fact that, as shown in Fig. 4(a), most instances in the Kinetics-Skeleton dataset lack complete skeletal structures. In contrast, as depicted in Fig. 4(b), NTU-RGB+D captures rich skeletal information for action recognition, ensuring higher accuracy.

Fig. 4. Representative frames from the Kinetics-Skeleton and NTU-RGB+D datasets: (a) ‘Smoking’ in Kinetics-Skeleton; (b) ‘Shake hands’ in NTU-RGB+D.

5 Conclusion

In this paper, we propose a novel method called LGGT for extracting spatialtemporal features in skeleton action recognition tasks. LGGT utilizes a module called Graph Transformer to extract 3D information from 2D skeleton data. Compared to traditional feature extractors, it better captures long-range dependencies and alleviates the over-smooth issue. Additionally, a large-scale language model is applied as a knowledge engine to guide the action recognition task by generating meaningful textual descriptions, thereby enhancing the model’s understanding and accuracy in human behavior. Finally, we propose an action recognition network model that integrates textual features and skeleton features, and experimental results demonstrate that the use of detailed textual features effectively improves action recognition performance. Acknowledgements. This work is being supported by the Zhejiang Provincial Natural Science Foundation of China under Grant No. LQ22F020008, the National Key Research and Development Project of China under Grant No. 2020AAA0104001 and the “Pioneer” and “Leading Goose” R&D Program of Zhejiang under Grant No. 2022C01120.

References 1. Zhang, Z.: Microsoft kinect sensor and its effect. IEEE Multimedia 19(2), 4–10 (2012) 2. Keselman, L., Iselin Woodfill, J., Grunnet-Jepsen, A., Bhowmik, A.: Intel realsense stereoscopic depth cameras. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pp. 1–10 (2017) 3. Du, Y., Wang, W., Wang, L.: Hierarchical recurrent neural network for skeleton based action recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1110–1118 (2015) 4. Li, C., Zhong, Q., Xie, D., Pu, S.: Skeleton-based action recognition with convolutional neural networks. In: 2017 IEEE International Conference on Multimedia & Expo Workshops (ICMEW), pp. 597–600. IEEE (2017)


5. Yan, S., Xiong, Y., Lin, D.: Spatial temporal graph convolutional networks for skeleton-based action recognition. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 32 (2018) 6. Shi, L., Zhang, Y., Cheng, J., Lu, H.: Two-stream adaptive graph convolutional networks for skeleton-based action recognition. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 12026–12035 (2019) 7. Chen, Y., Zhang, Z., Yuan, C., Li, B., Deng, Y., Hu, W.: Channel-wise topology refinement graph convolution for skeleton-based action recognition. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 13359–13368 (2021) 8. Chi, H.g., Ha, M.H., Chi, S., Lee, S.W., Huang, Q., Ramani, K.: Infogcn: representation learning for human skeleton-based action recognition. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 20186–20196 (2022) 9. Sun, Z., Ke, Q., Rahmani, H., Bennamoun, M., Wang, G., Liu, J.: Human action recognition from various data modalities: a review. IEEE Trans. Pattern Anal. Mach. Intell. (2022) 10. De Boissiere, A.M., Noumeir, R.: Infrared and 3d skeleton feature fusion for rgb-d action recognition. IEEE Access 8, 168297–168308 (2020) 11. Cao, Z., Simon, T., Wei, S.E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) 12. Hussein, M.E., Torki, M., Gowayyed, M.A., El-Saban, M.: Human action recognition using a temporal hierarchy of covariance descriptors on 3d joint locations. In: Twenty-Third International Joint Conference on Artificial Intelligence (2013) 13. Wang, J., Liu, Z., Wu, Y., Yuan, J.: Mining actionlet ensemble for action recognition with depth cameras. In: 2012 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1290–1297. IEEE (2012) 14. Vemulapalli, R., Arrate, F., Chellappa, R.: Human action recognition by representing 3d skeletons as points in a lie group. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 588–595 (2014) 15. Li, M., Chen, S., Chen, X., Zhang, Y., Wang, Y., Tian, Q.: Actional-structural graph convolutional networks for skeleton-based action recognition. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3595–3603 (2019) 16. Shi, L., Zhang, Y., Cheng, J., Lu, H.: Skeleton-based action recognition with directed graph neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 7912–7921 (2019) 17. Liu, Z., Zhang, H., Chen, Z., Wang, Z., Ouyang, W.: Disentangling and unifying graph convolutions for skeleton-based action recognition. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 143– 152 (2020) 18. Cheng, K., Zhang, Y., He, X., Chen, W., Cheng, J., Lu, H.: Skeleton-based action recognition with shift graph convolutional network. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 183–192 (2020) 19. Song, Y.F., Zhang, Z., Shan, C., Wang, L.: Stronger, faster and more explainable: a graph convolutional baseline for skeleton-based action recognition. In: proceedings of the 28th ACM International Conference on Multimedia, pp. 1625–1633 (2020) 20. Vaswani, A., et al.: Attention is all you need. In: Advances In Neural Information Processing Systems 30 (2017)


21. Radford, A., Narasimhan, K., Salimans, T., Sutskever, I., et al.: Improving language understanding by generative pre-training (2018) 22. Kenton, J.D.M.W.C., Toutanova, L.K.: Bert: pre-training of deep bidirectional transformers for language understanding. In: Proceedings of NAACL-HLT, pp. 4171–4186 (2019) 23. Dosovitskiy, A., et al.: An image is worth 16x16 words: transformers for image recognition at scale. In: International Conference on Learning Representations (2021) 24. Wang, Q., Peng, J., Shi, S., Liu, T., He, J., Weng, R.: Iip-transformer: intra-inter-part transformer for skeleton-based action recognition. arXiv preprint arXiv:2110.13385 (2021) 25. Dwivedi, V.P., Bresson, X.: A generalization of transformer networks to graphs. arXiv preprint arXiv:2012.09699 (2020) 26. Radford, A., et al.: Learning transferable visual models from natural language supervision. In: International Conference on Machine Learning, pp. 8748–8763. PMLR (2021) 27. Jia, C., et al.: Scaling up visual and vision-language representation learning with noisy text supervision. In: International Conference on Machine Learning, pp. 4904–4916. PMLR (2021) 28. Wang, M., Xing, J., Liu, Y.: Actionclip: a new paradigm for video action recognition. arXiv preprint arXiv:2109.08472 (2021) 29. Carreira, J., Zisserman, A.: Quo vadis, action recognition? a new model and the kinetics dataset. In: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 4724–4733 (2017). https://doi.org/10.1109/CVPR.2017.502 30. Shahroudy, A., Liu, J., Ng, T.T., Wang, G.: Ntu rgb+ d: a large scale dataset for 3d human activity analysis. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1010–1019 (2016) 31. Si, C., Chen, W., Wang, W., Wang, L., Tan, T.: An attention enhanced graph convolutional lstm network for skeleton-based action recognition. In: proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 1227–1236 (2019) 32. Cheng, K., Zhang, Y., Cao, C., Shi, L., Cheng, J., Lu, H.: Decoupling GCN with DropGraph module for skeleton-based action recognition. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12369, pp. 536–553. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58586-0_32 33. Plizzari, C., Cannici, M., Matteucci, M.: Skeleton-based action recognition via spatial and temporal transformer networks. Comput. Vis. Image Underst. 208, 103219 (2021) 34. Li, S., Yi, J., Farha, Y.A., Gall, J.: Pose refinement graph convolutional network for skeleton-based action recognition. IEEE Robot. Autom. Lett. 6(2), 1028–1035 (2021) 35. Hachiuma, R., Sato, F., Sekii, T.: Unified keypoint-based action recognition framework via structured keypoint pooling. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 22962–22971 (2023) 36. Zhang, P., Lan, C., Zeng, W., Xing, J., Xue, J., Zheng, N.: Semantics-guided neural networks for efficient skeleton-based human action recognition. In: proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 1112–1121 (2020) 37. Chen, Z., Li, S., Yang, B., Li, Q., Liu, H.: Multi-scale spatial temporal graph convolutional network for skeleton-based action recognition. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 35, pp. 1113–1122 (2021)

A Federated Multi-stage Light-Weight Vision Transformer for Respiratory Disease Detection

Pranab Sahoo1(B), Saksham Kumar Sharma2, Sriparna Saha1, and Samrat Mondal1

1 Indian Institute of Technology Patna, Patna, India
pranab [email protected]
2 Maharaja Surajmal Institute of Technology, Delhi, India

Abstract. Artificial Intelligence-based computer-aided diagnosis (CAD) has been widely applied to assist medical professionals in several medical applications. Although there are many studies on respiratory disease detection using Deep Learning (DL) approaches from radiographic images, the limited availability of public datasets limits their interpretation and generalization capacity. However, radiography images are available through different organizations in various countries. This condition is suited for Federated Learning (FL) training, which can collaborate with different institutes to use private data and train a global model. In FL, the local model on the client’s end is critical because there must be a balance between the model’s accuracy, communication cost, and client-side memory usage. The current DL or Vision Transformer (ViT)-based models have large parameters, making the client-side memory and communication costs a significant bottleneck when applied to FL training. The existing stateof-the-art (SOTA) FL techniques on respiratory disease detection either use small CNNs with insufficient accuracy or assume clients have sufficient processing capacity to train large models, which remains a significant challenge in practical applications. In this study, we tried to find one question: Is it possible to maintain higher accuracy while lowering the model parameters, leading to lower memory requirements and communication costs? To address this problem, we propose a federated multi-stage, light-weight ViT framework that combines the strengths of CNNs and ViTs to build an efficient FL framework. We conduct extensive experiments and show that the proposed framework outperforms a set of current SOTA models in FL training with higher accuracy while lowering communication costs and memory requirements. We adapted Grad-CAM for the infection localization and compared it with an experienced radiologist’s findings.

Keywords: Respiratory Disease · Federated Learning · Vision Transformer · Chest X-ray


1 Introduction

Respiratory diseases, such as pneumonia [21], tuberculosis (TB) [13], and COVID-19 [23] are the most common and pose a severe threat to human life. Early screening can successfully stop the disease’s progression and improve the chances of effective treatment. Chest X-ray (CXR) plays a crucial role in the preliminary investigation of these diseases, which radiologists interpret to look for infectious regions. However, the overlapping findings between these diseases make it laborious, time-consuming, and error-prone. According to WHO [24], less than one doctor is available for every 1000 people in more than 45% of countries worldwide. Moreover, the COVID-19 pandemic in December 2019 affected millions of individuals and put the patient-doctor ratio out of proportion. Therefore, creating a computer-aided diagnosis (CAD) that can distinguish the overlapping patterns from the CXR images in a specific way is essential and helpful for medical professionals. Over the past several years, the development of Deep Learning (DL)-based models has been crucial to several applications [10–12,14,15]. However, obtaining large data sets in the medical field is challenging. Most of the current research used a small number of radiography datasets for training the DL-based models [16,17,20,23], which limits their applicability in the real applications. Although radiography images are generated regularly, they are still restricted for model training since sharing patient information is challenging due to international laws like the General Data Protection Regulation (GDPR) and the Health Insurance Portability and Accountability Act (HIPAA). To overcome this challenge, some authors have utilized the Federated Learning (FL) approach to collaborate with different hospitals [7]. Recently, the authors in [3] used two pre-trained convolutional neural networks (CNN) as the backbone in client-side and proposed an FL model for lung abnormality detection. In a similar work, the authors [5] used four prominent pre-trained CNNs and performed two types of experiments, with FL and without FL, and attained 96.15% accuracy. In another work, authors [15] proposed an FL-based framework with a Vision Transformer (ViT) as the base model for client-side model training. They have utilized ViTbased models to remove the disadvantages of CNN-based models and reported a performance improvement over CNN-based models. Although, they have not addressed the complexity of training ViT models in resource-limited client settings. Despite its significant success, the current work on respiratory disease detection using FL must address the fundamental challenges. For instance, clients face significant resource limitations regarding memory and communication costs, substantially restricting their capacity to train big models locally. All the above works utilized large models to improve accuracy without addressing the memory and communication costs. Another challenge is that most client-side models are utilized CNN as the base model [3,5]. Although CNNs have demonstrated excellent performance in many vision-related tasks, they are not ideal for complex disease identification, where global characteristics such as multiplicity, distribution, and patterns must be considered. This is due to the convolution operation’s intrinsic locality of pixel dependencies. To overcome the limitations of CNNs in vision-related tasks that require the integration of global relationships between


pixels, the ViT equipped with the Transformer [22] was proposed. ViT can model long-range dependency among pixels through the self-attention mechanism and has shown SOTA performance [2,16]. These performance enhancements, meanwhile, come at the expense of model size (network parameters). The need for more parameters in ViT-based (ViT-B/16 has 86 million parameters) models are likely because they lack image-specific inductive bias, which is inherent in CNNs [25]. Hybrid approaches that combine convolutions and transformers are gaining interest in recent studies [1,8]. To address the above problems, we have adopted MobileViT, a new architecture combining CNN and ViT with fewer parameters without compromising the performance [8]. In summary, the main contributions are: 1. This study proposed an FL-based architecture using a collaborative technique to benefit from the diversified private data while keeping the data private. 2. The proposed model employs a hybrid lightweight architecture by combining Convolution blocks and Transformer. 3. We have performed an in-depth analysis of centralized, independent, and identically distributed (IID) and non-IID data distributions and reported a detailed analysis. 4. We showed that our framework outperformed state-of-the-art CNN-based and Transformer-based models in FL experiments while lowering the memory and communication cost. 5. Finally, we have compared the model’s prediction with the radiologist’s for infection localization.

2 Proposed Framework

In the proposed architecture, we combine the strengths of a light-weight ViT with FL. We first describe the details of the FL architecture, followed by the light-weight ViT architecture for lung CXR image classification on the client side. The goal of the proposed approach is to maximize the overall performance of the architecture while lowering the number of parameters, memory requirements, and communication costs.

2.1 Federated Learning

In this work, we study a client-server FL framework (Fig. 1) that uses FedAvg [7] algorithms to train a light-weight ViT model for respiratory disease classification collaboratively. In this setup, a central server maintains a global model and shares it with clients. Clients use private datasets and coordinate to build a robust model. We have used a light-weight ViT model on the client side to deal with local model training of local CXR images. The training phase comprises several synchronous communication rounds between the clients and the central server. The ViT model is initialized with random weights (W0 ) before the training rounds begin. We consider there are K available clients, with each nk private CXR image stored locally.

Fig. 1. Architecture details of the federated learning framework: each hospital (client) holds private data and a local MobileViT model, sends its local weights to the central server for aggregation into the global MobileViT model, and receives the aggregated weights back to update its local model.

The four steps in each communication round between the client and server are as follows:

1. The server initially maintains a global model (g) with initial pre-trained weights w and keeps track of the willing participants (S_k) for collaborative training.
2. Each client k ∈ S_k requests the global weights w from the central server. Clients load the global weights into their local models and train locally on mini-batches b of their local data. Each client runs for a pre-defined number of epochs E, using mini-batch SGD with a learning rate η_local to minimize the local objective F_k.
3. Once training is done for the pre-defined E epochs, clients send their updated local parameters w_k^t to the server.
4. After receiving updates from all participating clients, the server updates the global model by aggregating all local updates using the FedAvg method:

w^{t} \leftarrow \sum_{k=1}^{K} \frac{n_k}{n} \, w_k^{t}    (1)

Here, w_k^t are the parameters sent by client k at round t, n_k is the size of the private dataset stored on client k, n is the total dataset size in the collaborative training, and w^t is the updated global parameter at round t. One FL round consists of these four phases, and training continues until a predefined number of rounds or the desired accuracy is reached. A client may become inactive while receiving instructions, in which case the server automatically chooses the active clients for the next round of training. Federated networks may include many clients (hospitals), and communication within the network may be much slower than local computation; communication in these networks is therefore more expensive than in traditional data settings. Hence, light-weight models on the client side are most desirable for a communication-efficient technique. In this work, we utilize a light-weight ViT model that efficiently balances training accuracy and communication cost (in terms of time).
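A minimal, server-side sketch of the FedAvg aggregation in Eq. (1) follows; the data structure holding the client updates (a list of (n_k, state_dict) pairs) is an assumption made for illustration.

```python
def fedavg(client_states):
    """client_states: list of (n_k, state_dict) pairs from the K clients."""
    n_total = sum(n_k for n_k, _ in client_states)
    keys = client_states[0][1].keys()
    global_state = {}
    for key in keys:
        # w^t = sum_k (n_k / n) * w_k^t  -- Eq. (1)
        # (.float() is a simplification so integer buffers can be averaged)
        global_state[key] = sum((n_k / n_total) * state[key].float()
                                for n_k, state in client_states)
    return global_state
```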

2.2 Light-Weight Vision Transformer

ViT achieves impressive performance but at a significant expense in terms of the number of model parameters and the time it takes to process data, which becomes impractical for many real-world applications. ViT requires such a large number of parameters because it lacks the built-in understanding of image-specific characteristics that CNNs naturally possess. A typical ViT model flattens the input image into a series of patches, projects them into a fixed-dimensional space, and then uses a stack of transformer blocks to learn inter-patch representations [2]. These models need additional parameters to learn visual representations since they do not account for the spatial inductive bias found in CNNs. We have adapted MobileViT, a light-weight ViT model, in the proposed framework; the architecture details are shown in Fig. 2. The main idea is to learn global representations with light-weight transformers. The MobileViT architecture starts with a 3 × 3 convolution, followed by multiple MV2 blocks and MobileViT blocks. The MV2 blocks are inverted residual blocks from MobileNetV2 [18], and the whole architecture uses the Swish activation function. Typically, the spatial dimensions (height and width) of the feature maps are chosen to be multiples of 2; consequently, a 2 × 2 patch size is employed consistently across all levels. The primary purpose of the MV2 blocks is to downsample the feature maps using strided convolution. The inverted residual block begins with a 1 × 1 point-wise convolution, followed by a depth-wise convolution layer, and concludes with a final 1 × 1 point-wise convolution; this block also incorporates a residual connection, which adds the block's input to its output. Next, the MobileViT block further enriches an input tensor with few parameters by combining local and global information. The following sequence of operations is performed on the input tensor:

– The input tensor undergoes an n × n convolution, succeeded by a 1 × 1 convolution, projecting the features into a high-dimensional space.
– The resulting feature maps are then converted into N non-overlapping flattened patches.
– These flattened patches are fed into transformers to learn global representations, benefiting from spatial inductive bias.
– The output from the transformers is transformed back, followed by a 1 × 1 convolution, and concatenated with the original input tensor.
– Finally, another n × n convolution is applied to merge these concatenated features (a simplified code sketch of the whole block is given after this list).
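The following is a simplified sketch of the MobileViT block operations listed above, not the reference implementation: the kernel size n, patch size, transformer depth, and head count are illustrative defaults, and the patch size is assumed to divide the feature-map height and width.

```python
import torch
import torch.nn as nn

class MobileViTBlock(nn.Module):
    def __init__(self, c, d, n=3, patch=2, depth=2, heads=4):
        super().__init__()
        # n x n convolution followed by a 1 x 1 projection to d channels
        self.local = nn.Sequential(nn.Conv2d(c, c, n, padding=n // 2), nn.SiLU(),
                                   nn.Conv2d(c, d, 1))
        layer = nn.TransformerEncoderLayer(d, heads, dim_feedforward=2 * d,
                                           batch_first=True, norm_first=True)
        self.transformer = nn.TransformerEncoder(layer, depth)
        self.proj = nn.Conv2d(d, c, 1)                 # 1 x 1 conv back to c channels
        self.fuse = nn.Conv2d(2 * c, c, n, padding=n // 2)
        self.patch = patch

    def forward(self, x):                              # x: (B, C, H, W)
        b, _, h, w = x.shape
        p = self.patch
        y = self.local(x)
        # unfold into non-overlapping p x p patches: pixels at the same
        # position inside each patch attend to each other across all patches
        y = y.reshape(b, -1, h // p, p, w // p, p).permute(0, 3, 5, 2, 4, 1)
        y = y.reshape(b * p * p, (h // p) * (w // p), -1)
        y = self.transformer(y)                        # global representations
        y = y.reshape(b, p, p, h // p, w // p, -1).permute(0, 5, 3, 1, 4, 2)
        y = y.reshape(b, -1, h, w)                     # fold back to a feature map
        y = self.proj(y)
        return self.fuse(torch.cat([x, y], dim=1))     # concat + n x n fusion conv
```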

Fig. 2. Architecture details of MobileViT: the overall network (3 × 3 convolution, alternating MV2 and MobileViT blocks, 1 × 1 convolution, and global pooling into the COVID-19/Pneumonia/Normal classifier), the MobileViT block (convolutions, reshape-Transformer-reshape, concatenation, and fusion), and the MV2 inverted residual block (point-wise, depth-wise, and point-wise convolutions with batch normalization, Swish, and a residual add).

3 Experimental Dataset and Results

This section describes the datasets used for classification. In addition, the implementation details, evaluation metrics, and experimental results are discussed.

3.1 Classification Dataset

We created a custom dataset for classification by combining two popular datasets. Pneumonia and normal images were collected from one public repository [9]. COVID-19 CXR images are collected from [26]. We manually curated 4966 COVID-19 images with the help of one experienced radiologist, as some images were not correctly captured. We divided the dataset into a training, validation, and testing set in a 70:10:20 ratio. Image augmentation is a well-known strategy for regularizing the network in supervised learning. We have used data augmentation techniques such as rotation range=40, shear range=0.2, horizontal flip, and zoom range=0.2. The final dataset and source details are shown in Table 1 and the sample images are shown in Fig. 3.
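The augmentation parameter names quoted above (rotation range, shear range, zoom range, horizontal flip) match the Keras ImageDataGenerator API, so one plausible configuration is sketched below; the use of this specific API, the rescaling step, and the directory layout are assumptions.

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

train_augmenter = ImageDataGenerator(
    rescale=1.0 / 255,      # assumption: scale pixel values to [0, 1]
    rotation_range=40,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
)

# Hypothetical usage with an image folder split into class subdirectories:
# train_flow = train_augmenter.flow_from_directory("train/", target_size=(256, 256))
```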


Table 1. Summary of classification dataset.

Datasets | Classes | Initial Images | Final Images
CXR Images [9] | Normal | 1583 | 9498
CXR Images [9] | Pneumonia | 4273 | 8546
COVIDx CXR-3 [26] | COVID-19 | 4966 | 9930

Fig. 3. The sample images from the dataset (a) Normal (b) Pneumonia (c) COVID-19.

3.2 Results

In this section, we try to answer our earlier question: can we achieve performance comparable to deep CNN- or ViT-based models in FL training with a light-weight architecture? For this reason, we experiment with different CNN- and ViT-based models in centralized and federated data settings. We consider three clients (hospitals) for the FL training and experiment with IID and non-IID data distributions. We evaluate the model's performance using Precision (Prec), Sensitivity (Sens), F1-score, Accuracy (Acc), number of parameters (Params), and per-round time [4].

3.3 Results on Centralized Data

We kept all the data on a single client for centralized training. The experimental results are shown in Table 2. We experimented with the original ViT model [2], the Swin-Transformer [6], and three different versions of the MobileViT model [8]. For MobileViT, we used a categorical cross-entropy loss with the Adam optimizer, an initial learning rate of 0.0001, and the Swish activation function. For ViT and Swin-Transformer training, we used a sparse categorical cross-entropy loss with the AdamW optimizer and an initial learning rate of 0.001. MobileViT-XS achieves the highest accuracy among all models while using few parameters. The original ViT and Swin-Transformer also achieve high accuracy, but at the cost of a large number of parameters, which limits their applicability in the FL setting because it increases the communication cost in a resource-limited framework.


Table 2. Results on centralized data settings.

Method | Prec. (%) | Sens. (%) | F1-score (%) | Acc. (%) | # Params (Million)
Original ViT | 95.31 | 95.25 | 95.21 | 95.29 | 85.5 M
Swin-Transformer | 95.17 | 94.46 | 94.90 | 96.54 | 1.2 M
MobileViT-S | 95.28 | 95.32 | 95.30 | 95.38 | 1.4 M
MobileViT-XS | 96.67 | 95.92 | 96.28 | 97.46 | 0.6 M
MobileViT-XXS | 94.56 | 94.70 | 94.60 | 94.72 | 0.3 M

3.4 Results on Federated Learning (IID and Non-IID Data Distribution)

In FL experiments, IID and non-IID data distributions are the most common. For the IID data distribution, we divided the dataset equally among the 3 clients: we randomly selected 1583 images from all classes and split them into 527 COVID-19, 527 normal, and 527 pneumonia images to ensure an even data distribution. For the non-IID training, we employ a skewed class distribution and partition the dataset so that each client receives a different number of samples. We randomly select 50% of the COVID-19 data, 20% of the normal data, and 20% of the pneumonia data for Client-1. For Client-2, we choose 50% of the normal images, with 30% of the COVID-19 and 30% of the pneumonia data. For Client-3, we select 50% of the pneumonia data along with 20% of the COVID-19 and 30% of the normal data. The experimental results of the MobileViT-XS model and the original ViT model in FL settings are presented in Table 3. On the IID dataset, MobileViT reaches 96.32% in 60 communication rounds, whereas the original ViT model achieves 96.08% in 122 rounds, stabilizing after that. For the IID data partition, we can conclude that both models reach nearly the same accuracy, but ViT takes more communication rounds and thus increases the overall communication cost. For the non-IID data partition, both models show slightly lower accuracy. This slight degradation is because each client has a lot of data from one class and little data from the others. Additionally, the test accuracy for the non-IID data stabilizes as the number of rounds increases, while it continues to improve for the IID data.

Table 3. Results on non-IID and IID data partition.

Method | Prec. (%) | Sens. (%) | F1 (%) | Acc. (%) | Model Size | Per round time
Original ViT (non-IID) | 93.71 | 94.18 | 93.94 | 94.17 | 870 MB | 15 mins
MobileViT-XS (non-IID) | 94.49 | 94.41 | 94.44 | 94.42 | 32.1 MB | 4 mins
Original ViT (IID) | 96.33 | 96.00 | 96.16 | 96.08 | 870 MB | 15 mins
MobileViT-XS (IID) | 95.83 | 96.43 | 96.12 | 96.32 | 32.1 MB | 4 mins


3.5 Communication Time and Client Memory Requirement

Most of the previous approaches [3,5,15] assume that client-server communication time and client-side memory requirements are negligible. This may hold in lab settings, but not in a real-world collaborative application. Keeping our system constraints in mind, we used only 3 clients and a small dataset for the experiment. The experimental results are striking: both models achieve almost the same accuracy but differ massively in memory and communication cost (refer to Table 3). The original ViT is 870 MB in size and took 15 min for one round of communication, whereas MobileViT took 4 min per communication round with a model size of 32.1 MB. This difference will have an even bigger impact in settings with many clients and larger datasets.

3.6 Infection Localization and Error Analysis

To visualize the infection regions we have utilized Gradient-weighted Class Activation Mapping (Grad-CAM) [19]. It localizes the infected regions on which the model focuses while making decisions. Figure 4 shows the classification model’s focused areas. Next, with the help of an experienced radiologist, we annotated infected regions in CXR images. According to the localization maps, our network emphasizes the appropriate lung areas while classifying each disease. From Fig. 4, it is evident that the proposed model performs accurately and can localize the infected regions correctly. Our model also sometimes focuses on more portions of the lungs where there are no markings by the radiologist. Although this can not be a disadvantage as a radiologist will evaluate the outcome while making final decisions. Our model sometimes misclassifies some COVID-19 as pneumonia and vice versa. After a critical analysis with our radiologist, we observed that some characteristics of the COVID-19 are comparable with those of pneumonia. With the availability of more datasets in the future, we will train our model to find the nuanced differences between COVID-19 and pneumonia more accurately. The proposed architecture can be helpful in real clinical systems and assist medical professionals since annotating lungs and localizing the infection is challenging and time-consuming.
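For reference, a minimal Grad-CAM sketch using forward/backward hooks is shown below; it is not the authors' implementation. The classifier is assumed to return class logits, and conv_layer is an assumed handle to the convolutional layer whose activations are used for localization.

```python
import torch
import torch.nn.functional as F

def grad_cam(model, image, target_class, conv_layer):
    """image: (C, H, W) tensor; returns a normalized (1, 1, H, W) heatmap.
    Call with gradients enabled and the model in eval mode."""
    feats, grads = {}, {}
    h1 = conv_layer.register_forward_hook(lambda m, i, o: feats.update(a=o))
    h2 = conv_layer.register_full_backward_hook(lambda m, gi, go: grads.update(a=go[0]))
    score = model(image.unsqueeze(0))[0, target_class]   # target class logit
    model.zero_grad()
    score.backward()
    h1.remove(); h2.remove()
    w = grads["a"].mean(dim=(2, 3), keepdim=True)        # channel weights
    cam = F.relu((w * feats["a"]).sum(dim=1))            # weighted feature sum
    cam = F.interpolate(cam.unsqueeze(1), size=image.shape[-2:], mode="bilinear")
    return (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)
```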


Fig. 4. Radiologist annotated images with the corresponding model prediction.

4 Conclusion and Future Work

This study introduces a multi-stage light-weight ViT model with an FL framework for classifying respiratory disease from CXR images without sharing private data. We have adopted a light-weight ViT model at the client side, incorporating convolution and Transformer blocks and lowering the training parameters. A comparative analysis of traditional CNN and ViT-based models on centralized data has been presented. According to the experiments, MobileViT-XS can accurately identify images while utilizing less memory and per-round communication time. Our findings also imply that the FL strategy, which avoids disclosing private and sensitive data, could produce outcomes comparable to centralized training. Despite the non-IID data distribution, the suggested architecture is reliable and functions similarly to the centralized model. FL-based architecture may connect all research institutions and hospitals, authorizing them to collaborate while maintaining privacy. This collaborative approach will improve the robustness and trustworthiness of the trained model. One drawback of our approach is that we have considered only 3 clients and a small dataset in training, as more clients will increase computational complexity with our limited available resources. By increasing the number of clients, we may get better accuracy. In the future, we will look into replacing the central server with blockchain technology to address single-point failure and security attacks. We have utilized only three lung diseases in this experiment, but it can be extended to other respiratory diseases. Acknowledgement. Dr. Sriparna Saha gratefully acknowledges the Young Faculty Research Fellowship (YFRF) Award, supported by Visvesvaraya Ph.D. Scheme for Electronics and IT, Ministry of Electronics and Information Technology (MeitY), Government of India, being implemented by Digital India Corporation (formerly Media Lab Asia) for carrying out this research.


References 1. Dai, Z., Liu, H., Le, Q.V., Tan, M.: Coatnet: marrying convolution and attention for all data sizes. Adv. Neural. Inf. Process. Syst. 34, 3965–3977 (2021) 2. Dosovitskiy, A., et al.: An image is worth 16x16 words: Transformers for image recognition at scale. arXiv preprint arXiv:2010.11929 (2020) 3. Feki, I., Ammar, S., Kessentini, Y., Muhammad, K.: Federated learning for Covid19 screening from chest x-ray images. Appl. Soft Comput. 106, 107330 (2021) 4. Japkowicz, N., Shah, M.: Evaluating learning algorithms: a classification perspective. Cambridge University Press (2011) 5. Liu, B., Yan, B., Zhou, Y., Yang, Y., Zhang, Y.: Experiments of federated learning for covid-19 chest x-ray images. arXiv preprint arXiv:2007.05592 (2020) 6. Liu, Z., et al.: Swin transformer: hierarchical vision transformer using shifted windows. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 10012–10022 (2021) 7. McMahan, B., Moore, E., Ramage, D., Hampson, S., y Arcas, B.A.: Communication-efficient learning of deep networks from decentralized data. In: Artificial intelligence and statistics, pp. 1273–1282. PMLR (2017) 8. Mehta, S., Rastegari, M.: Mobilevit: light-weight, general-purpose, and mobilefriendly vision transformer. arXiv preprint arXiv:2110.02178 (2021) 9. Mooney, P.: Chest x-ray images (pneumonia). kaggle, Marzo (2018) 10. Palmal, S., Arya, N., Saha, S., Tripathy, S.: A multi-modal graph convolutional network for predicting human breast cancer prognosis. In: International Conference on Neural Information Processing, pp. 187–198. Springer (2022). https://doi.org/ 10.1007/978-981-99-1648-1 16 11. Palmal, S., Saha, S., Tripathy, S.: HIV-1 protease cleavage site prediction using stacked autoencoder with ensemble of classifiers. In: 2022 International Joint Conference on Neural Networks (IJCNN), pp. 1–8. IEEE (2022) 12. Palmal, S., Saha, S., Tripathy, S.: Multi-objective optimization with majority voting ensemble of classifiers for prediction of HIV-1 protease cleavage site. Soft Comput. 27, 12211–1221 (2023). https://doi.org/10.1007/s00500-023-08431-2 13. Rahman, T., et al.: Reliable tuberculosis detection using chest x-ray with deep learning, segmentation and visualization. IEEE Access 8, 191586–191601 (2020) 14. Sahoo, P., Saha, S., Mondal, S., Chowdhury, S., Gowda, S.: Computer-aided Covid19 screening from chest CT-scan using a fuzzy ensemble-based technique. In: 2022 International Joint Conference on Neural Networks (IJCNN), pp. 1–8. IEEE (2022) 15. Sahoo, P., Saha, S., Mondal, S., Chowdhury, S., Gowda, S.: Vision transformerbased federated learning for covid-19 detection using chest x-ray. In: Neural Information Processing: 29th International Conference, ICONIP 2022, Virtual Event, November 22–26, 2022, Proceedings, Part VII, pp. 77–88. Springer (2023). https:// doi.org/10.1007/978-981-99-1648-1 7 16. Sahoo, P., Saha, S., Mondal, S., Gowda, S.: Vision transformer based Covid-19 detection using chest CT-scan images. In: 2022 IEEE-EMBS International Conference on Biomedical and Health Informatics (BHI), pp. 01–04. IEEE (2022) 17. Sahoo, P., Saha, S., Mondal, S., Sharma, N.: Covid-19 detection from lung ultrasound images using a fuzzy ensemble-based transfer learning technique. In: 2022 26th International Conference on Pattern Recognition (ICPR), pp. 5170–5176. IEEE (2022) 18. Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.C.: Mobilenetv 2: inverted residuals and linear bottlenecks. 
In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4510–4520 (2018)


19. Selvaraju, R.R., Cogswell, M., Das, A., Vedantam, R., Parikh, D., Batra, D.: Gradcam: visual explanations from deep networks via gradient-based localization. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 618– 626 (2017) 20. Toraman, S., Alakus, T.B., Turkoglu, I.: Convolutional capsnet: a novel artificial neural network approach to detect Covid-19 disease from x-ray images using capsule networks. Chaos, Solitons Fractals 140, 110122 (2020) 21. Varshni, D., Thakral, K., Agarwal, L., Nijhawan, R., Mittal, A.: Pneumonia detection using CNN based feature extraction. In: 2019 IEEE International Conference on Electrical, Computer and Communication Technologies (ICECCT), pp. 1–7. IEEE (2019) 22. Vaswani, A., et al.: Attention is all you need. In: Advances in Neural Information Processing Systems 30 (2017) 23. Wang, L., Lin, Z.Q., Wong, A.: Covid-net: a tailored deep convolutional neural network design for detection of Covid-19 cases from chest x-ray images. Sci. Rep. 10(1), 1–12 (2020) 24. WHO: who coronavirus (Covid-19) dashboard. https://covid19.who.int/ (2022) 25. Xiao, T., Singh, M., Mintun, E., Darrell, T., Doll´ ar, P., Girshick, R.: Early convolutions help transformers see better. Adv. Neural. Inf. Process. Syst. 34, 30392–30400 (2021) 26. Zhao, A., Aboutalebi, H., Wong, A., Gunraj, H., Terhljan, N., et al.: Covidx cxr-2: chest x-ray images for the detection of Covid-19 (2021)

Curiosity Enhanced Bayesian Personalized Ranking for Recommender Systems

Yaoming Deng1,2, Qiqi Ding1,2, Xin Wu1,2, and Yi Cai1,2(B)

1 School of Software Engineering, South China University of Technology, Guangzhou, China
[email protected]
2 The Key Laboratory of Big Data and Intelligent Robot (South China University of Technology), Ministry of Education, Guangzhou, China

Abstract. Curiosity affects users' selection of items, motivating them to explore items regardless of their preferences. However, existing social-based recommendation methods neglect users' curiosity in social networks, which may decrease recommendation accuracy. Moreover, focusing only on simulating users' preferences can lead to information cocoons. To tackle these problems, we propose a Curiosity Enhanced Bayesian Personalized Ranking (CBPR) model for recommender systems. Experimental results on two public datasets demonstrate the advantages of our CBPR model over existing models.

Keywords: Recommender Systems · Social Networks · Curiosity

1 Introduction

Recommendation systems are an information filtering technique to predict the “preference” a user would give to an item [31]. Traditional recommender systems often suffer from the data sparsity problem; therefore, modern recommender systems generally employ side external information in recommendations. To address this issue, social-based recommendation methods, which exploit users' social relationships on social media as supplementary information to enhance the recommender system, have been gaining increasing attention [14,15,18,24]. Existing social-based recommendation methods are often based on the assumption that users tend to share similar preferences with their online friends and then recommend the items that friends preferred to the users [2]. Zhao et al. [33] extend the Bayesian Personalized Ranking (BPR) [20] model based on the observation that users tend to assign higher ranks to items that their friends prefer. However, these methods often ignore the psychological characteristics of individual users. As well established in psychological studies [21], humans' interest is closely associated with curiosity, which is the intrinsic motivation for exploration, learning, and creativity [28]. Users tend to be curious about new
information about how other people behave, feel, and think, especially in online social media scenarios [27]. Items that users are curious about may not be the same items as their preferred items. In psychology, curiosity is often defined as the desire to acquire new information about the actions, thoughts, and feelings of others as a driving force that can trigger and facilitate exploratory behavior [1]. Social curiosity refers to the curiosity of users on social networks [27], who are always curious about their friends’ new behaviors. According to the information gap theory [16], when people are aware of some gap in their knowledge, they become curious and engage in information-seeking behavior to complete their knowledge and resolve the uncertainty [16]. Therefore, without considering the users’ curiosity during recommendation, it may lead to a decrease in the accuracy of the recommender systems. Thus, in social networks, when recommending an item to users, it is necessary to take into account both the users’ preference and their curiosity. Furthermore, Information Diffusion in social networks is one of the important contents of social network-related research. Users in social networks influence each other, and it is the influence among users that promotes the diffusion of information in social networks. In social networks, both the establishment of social relations between network nodes (users) and the diffusion of information in the network is often driven by the interaction between users. Thus, it is necessary to consider the impact of information diffusion in social networks. To solve the above problems, we extend the BPR model and propose a Curiosity Enhanced Bayesian Personalized Ranking (CBPR) model by incorporating the users’ curiosity into the BPR model. Our CBPR model is based on two basic hypotheses: (H1) The curiosity of a given user may influence his selections about which item to choose; (H2) It is possible to use the data available on social networks to measure the users’ curiosity about an item. The key contributions of this work are summarized below: – We adopt a psychological perspective to explore the effect of the users’ curiosity and its influence on the recommendation problems and propose a method to quantify the curiosity of a user for an item by measuring the uncertainty stimulus. We also make the first attempt to simultaneously consider the influence of direct and indirect friends on the curiosity of a user for an item. – We build a curiosity-based model for item ranking, called CBPR, which integrates the above methods to measure curiosity into the existing BPR model to produce final recommendations. As far as we know, this is the first work to consider the users’ curiosity into the pair-wise item ranking methods. – We conduct comprehensive experiments on two benchmark datasets. The experimental results demonstrate significant performance improvements over existing approaches, highlighting the superiority of our framework in recommendation.


2 Related Work

Social-Based Recommendation. Most of the existing social-based recommendation methods are based on the assumption that the users and their friends tend to have similar preferences. Neural-based SSVD++ [15] extends the classical SVD++ model with both social information and the powerful neural network framework. Rafailidis et al. [18] propose a joint collaborative ranking model based on the users’ social relationships. However, the preferences of friends may not exactly match the users’ preferences. To address this, Wang et al. [24] propose a method for an interactive social recommendation, which not only simultaneously explores the users’ preferences, but also adaptively learns different weights for different friends. Lin et al. [14] present a characterized social regularization model by designing a universal regularization term for modeling variable social influence. Rafailidis et al. [19] propose a deep learning approach to learn both about users’ preferences and the social influence of friends when generating recommendations. Ma et al. [17] propose a matrix factorization framework with social regularization in which friends with dissimilar tastes are treated differently to capture the taste diversity of each friend of users. Psychological-Based Recommendation. There only exist several studies from psychologically inspired views to explore the effect of the users’ curiosity and its impacts on the recommendation system. Wu et al. [27] quantify the surprise stimulus by considering user friends’ unexpected behaviors, and then rank items by a weighted sum of preference rating score and curiosity score. In another work [26] they quantify the uncertainty stimulus base on Shannon entropy [5] and Damster-Shafter theory [12], and then rank items by weight borda count [22]. Xu et al. [30] design a novel stimulus-evoked curiosity mechanism SeCM which provides not only a way of measuring novelty stimulus but also a way of measuring conflict stimulus. Then a general recommendation framework CdRF is proposed to rank a list of potential items. Zhao et al. [32] develop a computational model called PCM to model the users’ curiosity based on the Wundt Curve, where the input to the model is the novelty stimulus score of the item.

3 Notations and Problem Definition

We denote U and I as the sets of users and items, respectively, where |U| = m and |I| = n. The symbols (u, v) and (i, j, k) are reserved for individual users and items, respectively. R is the user-item matrix. The social network is G = (U, E), where (u, v) ∈ E indicates that u and v are friends. For each user, the total item set I can be split into three sets: one with positive feedback, one with curiosity feedback, and one with negative feedback, defined as follows:

Positive Feedback. Pu is the set of items on which user u has behaviors. These behaviors could be user u's rating behavior (regardless of the rating value), click behavior, buying behavior, and so on.


Curiosity Feedback. Cu is the set of items on which user u has no behaviors but which may arouse the user's curiosity online; it represents the user's curiosity caused by inconsistent behaviors (e.g., inconsistent ratings) on the same item in user u's social network. These items are thus a subset of the items on which u's friends have behaviors; more details about how to obtain the curiosity feedback are given in Sect. 4.2.

Fig. 1. The architecture of the proposed recommender system framework (CBPR-1).

Negative Feedback. Nu is the set of items on which user u has no behaviors and about which u may not be curious. Here, negative only means that no explicit feedback can be observed from the user; it does not represent the user's dislike of the items. We have Pu ∩ Cu ∩ Nu = ∅ and Pu ∪ Cu ∪ Nu = I. Our proposed recommender framework is shown in Fig. 1.

4 Curiosity Enhanced Bayesian Personalized Ranking Model

4.1 Assumption

If a user interacts with an item, it indicates that the user is interested in the item. Similar to the BPR model [20], this paper assumes that users are more interested in items they have interacted with than in items they have not interacted with. This preference relationship is expressed by the following formula:

$x_{ui} \ge x_{uj}, \quad i \in P_u,\ j \in I \setminus P_u$  (1)

where $x_{ui}$ and $x_{uj}$ represent the interest of user u in item i and item j, respectively. Item i comes from the items the user has interacted with, which are called the positive feedback of user u. Item j comes from the items the user has not interacted with, which include the curiosity feedback and the negative feedback of user u. $x_{ui} \ge x_{uj}$ indicates that user u's interest in item i is higher than in item j.

However, the fact that a user has no behavior towards an item does not mean the user is not interested in it. Due to the users' curiosity, the items a user has not yet interacted with often contain items that may arouse the user's curiosity; these become user u's curiosity feedback items, and the rest are user u's negative feedback items. To simultaneously model the user's relative preference ranking between interacted and non-interacted items and the ranking among non-interacted items, we make the following two assumptions:

Assumption 1. A user would prefer the items from positive feedback to negative feedback, and the items from curiosity feedback to negative feedback. However, the ranking between items from positive feedback and curiosity feedback is not specified, i.e., they are ranked randomly.

$x_{ui} \ge x_{uj},\ x_{uk} \ge x_{uj}, \quad i \in P_u,\ k \in C_u,\ j \in N_u$  (2)

where $x_{ui}$, $x_{uk}$, and $x_{uj}$ represent user u's interest in item i from positive feedback, item k from curiosity feedback, and item j from negative feedback, respectively. The formula $x_{ui} \ge x_{uk}$ indicates that user u is more interested in item i than in item k.

Assumption 2. A user would prefer the items from positive feedback to negative feedback, and the items from positive feedback to curiosity feedback.

$x_{ui} \ge x_{uj},\ x_{ui} \ge x_{uk}, \quad i \in P_u,\ k \in C_u,\ j \in N_u$  (3)

4.2 Modeling of Curiosity

Uncertainty stimulus is one of the key factors that can lead to users' curiosity [1]. Information entropy has been used to quantify the intensity of the uncertain stimulus faced by users, since it can represent the degree of chaos and uncertainty [1]; it involves both the number of types of feedback the users are exposed to and the intensity of competition for each type of feedback. The uncertainty of user u for item i is computed with information entropy [1] as follows:

$s^{un}_{u,i} = -\sum_{r=1}^{R} p^{r}_{u,i} \log p^{r}_{u,i}$  (4)

We provisionally assume that the possible responses for item i to user u are the various ratings of user u's social friends on item i, denoted as r. For a specific item i, $p^{r}_{u,i}$ denotes the probability of each response r for user u, defined as:

$p^{r}_{u,i} = \frac{\exp(w^{r}_{u,i})}{\sum_{r=1}^{R} \exp(w^{r}_{u,i})}$  (5)


where $w^{r}_{u,i}$ refers to user u's weight for each possible response r on item i, formulated as follows:

$w^{r}_{u,i} = \sum_{d=1}^{D} e^{-\mu d} \times \frac{N^{r}_{v,i}}{N_{v,i}}, \quad v \in U^{d}_{u}$  (6)

Considering that information can spread along social networks, we use D to represent the farthest social network layer, and d represents the current social network layer. $U^{d}_{u}$ denotes user u's friends in layer d. $N_{v,i}$ and $N^{r}_{v,i}$ denote the number of user u's friends in layer d who give any response on item i, and the number of user u's friends in layer d who give response r on item i, respectively.

However, as D gets bigger, the size of user u's friend set $U^{d}_{u}$ can grow exponentially and tend towards U. To lower the time complexity, we restrict the maximum size of $U^{d}_{u,i}$ to κ, where κ ≪ m; these users are the top-κ most similar users, computed with the Pearson correlation coefficient (PCC) [30]. We also introduce an information spread coefficient μ. The term $e^{-\mu d}$ captures the fact that the spread of information decays with the social network layer d.

We use the uncertainty stimulus $s^{un}_{u,i}$ as the curiosity score. Moreover, we set a threshold γ to filter out incurious items and only select the items whose uncertainty stimulus is above the threshold into the curiosity feedback Cu. The users' curiosity feedback set is defined as:

$C_u = \{\, i \in (I - P_u) \mid s^{un}_{u,i} \ge \gamma \,\}$  (7)
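For illustration, the sketch below shows one way Eqs. (4)–(7) could be computed for a single user–item pair in Python; the layer-wise friend rating counts, the decay coefficient mu, and the threshold gamma are illustrative placeholders rather than the authors' released implementation.

```python
import math

def uncertainty_stimulus(friend_layers, mu=1.0):
    """Compute the uncertainty stimulus s_un of Eq. (4) for one (user, item) pair.

    friend_layers: list indexed by layer d (1..D); each element maps a response r
    (e.g., a rating value) to the number of friends in that layer who gave
    response r for the item.  All names here are illustrative assumptions.
    """
    # Eq. (6): decayed, normalized weight of each response r
    weights = {}
    for d, counts in enumerate(friend_layers, start=1):
        total = sum(counts.values())          # N_{v,i}: friends with any response
        if total == 0:
            continue
        decay = math.exp(-mu * d)             # e^{-mu d}
        for r, n_r in counts.items():         # N^r_{v,i}: friends giving response r
            weights[r] = weights.get(r, 0.0) + decay * n_r / total

    if not weights:
        return 0.0

    # Eq. (5): softmax over the response weights
    z = sum(math.exp(w) for w in weights.values())
    probs = [math.exp(w) / z for w in weights.values()]

    # Eq. (4): information entropy as the uncertainty stimulus
    return -sum(p * math.log(p) for p in probs)

def curiosity_feedback(positive_items, all_items, stimulus, gamma=0.8):
    """Eq. (7): items not in P_u whose stimulus exceeds the threshold gamma."""
    return {i for i in all_items - positive_items if stimulus(i) >= gamma}
```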

4.3 Learning the CBPR

The purpose of the Area Under Curve (AUC) [33] is to increase the gap between positive and negative samples. However, the AUC only considers two types of samples, which is inconsistent with the three types of samples in our model. Therefore, this paper redefines the AUC as follows, named AUC':

$AUC' = \frac{\sum_{i \in P_u, j \in N_u} \sigma(x_{uij})}{|P_u|\,|N_u|} + \frac{\sum_{k \in C_u, j \in N_u} \sigma(x_{ukj})}{|C_u|\,|N_u|}$  (8)

where σ(·) is the sigmoid function. We take AUC' as our optimization objective and attempt to maximize it. Thus, for each user u, we have two preference orders as follows:

$\prod_{(u,i),(u,j) \in (P_u \cup N_u)} P(x_{ui} \ge x_{uj})^{\delta(u,i,j)} \, [1 - P(x_{ui} \ge x_{uj})]^{1-\delta(u,i,j)}$  (9)

$\prod_{(u,k),(u,j) \in (C_u \cup N_u)} P(x_{uk} \ge x_{uj})^{\epsilon(u,k,j)} \, [1 - P(x_{uk} \ge x_{uj})]^{1-\epsilon(u,k,j)}$  (10)

where $P(x_{ui} \ge x_{uj})$ denotes the probability that user u likes item i more than item j, and $P(x_{uk} \ge x_{uj})$ denotes


the probability that user u likes item k more than item j. δ(·) and ε(·) are indicator functions:

$\delta(u,i,j) = \begin{cases} 1 & \text{if } (u,i) \in P_u \text{ and } (u,j) \in N_u \\ 0 & \text{otherwise} \end{cases}$

$\epsilon(u,k,j) = \begin{cases} 1 & \text{if } (u,k) \in C_u \text{ and } (u,j) \in N_u \\ 0 & \text{otherwise} \end{cases}$

For convenience of calculation, we rewrite the formulas above. The training process minimizes the objective function:

$O = \sum_{u} \left[ \sum_{i \in P_u} \sum_{j \in N_u} -\ln \sigma(x_{uij}) + \sum_{k \in C_u} \sum_{j \in N_u} -\ln \sigma(x_{ukj}) \right] + \lambda_{\Theta} \|\Theta\|^2$  (11)

where $\sigma(x) = \frac{1}{1+e^{-x}}$, $x_{uij} = x_{ui} - x_{uj}$ and $x_{ukj} = x_{uk} - x_{uj}$. The preferences $(x_{ui}, x_{uj}, x_{uk})$ are modeled by matrix factorization. The parameter set is Θ = {W, V, b}, and $\lambda_{\Theta}\|\Theta\|^2$ is the L2-norm regularization.
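As an illustration of how the objective in Eq. (11) could be minimized with matrix factorization and SGD, consider the following sketch; the sampling scheme, latent dimension, learning rate, and regularization values are assumptions, and the sampled pairs follow Assumption 1 (positive vs. negative and curiosity vs. negative).

```python
import numpy as np

def train_cbpr(P, C, N, n_users, n_items, f=10, lr=0.05, reg=0.01, epochs=30, seed=0):
    """Minimal SGD sketch for the objective in Eq. (11).

    P, C, N: dicts mapping each user u to the item sets P_u, C_u, N_u.
    Latent factors W (users), V (items) and item bias b realize x_ul = W_u V_l^T + b_l.
    Hyper-parameter values here are placeholders.
    """
    rng = np.random.default_rng(seed)
    W = 0.1 * rng.standard_normal((n_users, f))
    V = 0.1 * rng.standard_normal((n_items, f))
    b = np.zeros(n_items)

    def sgd_pair(u, pos, neg):
        # One pairwise step on -ln sigma(x_u,pos - x_u,neg) plus L2 regularization.
        x_diff = W[u] @ (V[pos] - V[neg]) + b[pos] - b[neg]
        g = 1.0 / (1.0 + np.exp(x_diff))      # equals 1 - sigma(x_diff)
        wu = W[u].copy()
        W[u]   += lr * (g * (V[pos] - V[neg]) - reg * W[u])
        V[pos] += lr * (g * wu - reg * V[pos])
        V[neg] += lr * (-g * wu - reg * V[neg])
        b[pos] += lr * (g - reg * b[pos])
        b[neg] += lr * (-g - reg * b[neg])

    for _ in range(epochs):
        for u in P:
            if not P[u] or not N[u]:
                continue
            # positive vs. negative pair (first term of Eq. 11)
            sgd_pair(u, rng.choice(list(P[u])), rng.choice(list(N[u])))
            # curiosity vs. negative pair (second term of Eq. 11)
            if C.get(u):
                sgd_pair(u, rng.choice(list(C[u])), rng.choice(list(N[u])))
    return W, V, b
```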

4.4 Item Recommendation

We generate a candidate list of items for the target user u, containing the K items with the highest ranking scores $\hat{x}_{ul}$. For any item l ∈ I, the rating of user u for item l is $\hat{x}_{ul}$, calculated as follows:

$\hat{x}_{ul} = W_u V_l^{T} + b_l$  (12)

where the vector $W_u$ represents user u, the vector $V_l$ represents item l, and $b_l$ is the bias of item l. $W \in \mathbb{R}^{m \times f}$, $V \in \mathbb{R}^{n \times f}$, $b \in \mathbb{R}^{n}$, and f is the number of latent factors.
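Given factors learned as above, Eq. (12) and the top-K selection can be sketched as follows; excluding already-interacted items and the value of K are illustrative choices, not prescriptions from the paper.

```python
import numpy as np

def recommend_top_k(u, W, V, b, interacted, K=10):
    """Score all items with Eq. (12) and return the K best unseen items.

    interacted: set of item indices in P_u, excluded from the candidate list
    (an assumed but common practice)."""
    scores = W[u] @ V.T + b                  # x_hat_ul = W_u V_l^T + b_l for all l
    scores[list(interacted)] = -np.inf       # do not recommend already-seen items
    top = np.argpartition(-scores, K)[:K]    # unsorted top-K candidates
    return top[np.argsort(-scores[top])]     # sorted by descending score
```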

5 Experiments and Results Analysis

5.1 Experimental Settings

Datasets. The social datasets FilmTrust [8] and CiaoDVDs [7] are both used in our experiments; they were crawled from the FilmTrust website in 2011 and the Ciao DVD website in 2013, respectively. As there are no model hyper-parameters, no validation set is required. For both datasets, 80% of the data is randomly kept for training and the remainder for testing.


Baselines. Pop provides a list of non-personalized ranked items based on the frequency with which items are selected by all users. RankSGD [11] is a matrix decomposition method that uses an SGD optimizer to produce personalized ranked items. BPR [20] proposes a pair-wise assumption for item ranking; we apply a uniform sampling strategy for item selection. SBPR [33] uses social networks to estimate item rankings. UC [26] combines the users' uncertainty stimulus with matrix factorization [13] for item ranking; as far as we know, it is the only method that calculates curiosity by quantifying the uncertainty stimulus. UCBPR is our first attempt to integrate curiosity into the BPR model framework, whose purpose is to test the feasibility of integrating curiosity into the Bayesian personalized ranking method. CBPR-1 and CBPR-2 are the variants that correspond to Assumptions 1 and 2, respectively.

Parameter Settings. We use the LibRec library [6] to implement the Pop, RankSGD, BPR, SBPR, and UC methods. For each method, we use the parameter settings that achieve the best performance according to the public LibRec website1. We also fix the number of latent factors to f = 10 for all the matrix factorization methods, as suggested in [33].

Table 1. Comparison results on two datasets.

Model     | FilmTrust                 | CiaoDVDs
          | P@10    R@10    F1@10     | P@10    R@10    F1@10
Pop       | 0.1382  0.3245  0.1938    | 0.0105  0.0375  0.0164
RankSGD   | 0.1654  0.2706  0.2053    | 0.0070  0.0214  0.0105
BPR       | 0.3458  0.6066  0.4405    | 0.0120  0.0426  0.0188
SBPR      | 0.3362  0.5890  0.4281    | 0.0120  0.0402  0.0185
UC        | 0.3279  0.5512  0.4112    | 0.0117  0.0373  0.0178
UCBPR     | 0.3466  0.6098  0.4420    | 0.0144  0.0487  0.0222
CBPR-1    | 0.3506  0.6248  0.4491    | 0.0152  0.0519  0.0235
CBPR-2    | 0.3291  0.5895  0.4224    | 0.0133  0.0486  0.0209

1 https://www.librec.net/.

5.2 Recommendation Performance

The comparison results are listed in Table 1. Comparing the UC and UCBPR results, UCBPR with user curiosity improves F1 from 41.12% to 44.20% on FilmTrust and from 1.78% to 2.22% on CiaoDVDs. This indicates that the ranking list obtained by the point-wise method combined with user curiosity is not as accurate as the one obtained by the pair-wise method combined with user curiosity, demonstrating that integrating the users' curiosity into a pair-wise model framework is a feasible way to improve recommendation performance.

The results show that CBPR-1 achieves a significant improvement over the other methods, including UC, on both datasets, especially on CiaoDVDs. In particular, CBPR-1 outperforms SBPR in most cases. One possible reason is that SBPR only uses the friend relationships in social networks and therefore cannot fully exploit the potential information available there; it ignores the fact that the users' curiosity on social networks can directly affect their interest in items.

From the experimental results, we also find that CBPR-1 performs better than CBPR-2, which indicates that users are more interested in items from the curiosity feedback than in those from the negative feedback. Therefore, Assumption 1 better captures users' behaviors. Furthermore, the results indicate that items from both positive feedback and curiosity feedback can attract the users' interest, which is consistent with the psychology of curiosity.

We also compare CBPR with the other baselines for different values of K from 5 to 30. The results are shown in Fig. 2. The proposed method achieves better performance across the different values of K, which indicates the effectiveness of incorporating curiosity feedback in our model. On FilmTrust, the gap between CBPR and the other baselines gets smaller as K increases: a small K better reflects the effect of the proposed model, since FilmTrust contains fewer items. Conversely, CiaoDVDs has sufficient items, and the gap between CBPR and the other baselines becomes larger as K increases.


Table 2. CBPR-1 method results on the FilmTrust and the CiaoDVDs datasets with Precision, Recall, and F1 metrics by varying γ and D.

(a) P@5 / P@10

     P@5                                               P@10
     FilmTrust                CiaoDVDs                 FilmTrust                CiaoDVDs
γ    D=1     D=2     D=3      D=1     D=2     D=3      D=1     D=2     D=3      D=1     D=2     D=3
0.0  0.4104  0.4120  0.4083   0.0157  0.0173  0.0147   0.3471  0.3474  0.3482   0.0142  0.0137  0.0127
0.4  0.4102  0.4128  0.4090   0.0166  0.0161  0.0167   0.3497  0.3494  0.3474   0.0138  0.0143  0.0140
0.8  0.4134  0.4138  0.4139   0.0157  0.0155  0.0145   0.3475  0.3506  0.3476   0.0141  0.0132  0.0130
1.2  0.4109  0.4141  0.4080   0.0169  0.0146  0.0153   0.3498  0.3474  0.3476   0.0152  0.0128  0.0126
1.6  0.4136  0.4157  0.4130   0.0156  0.0155  0.0144   0.3481  0.3486  0.3485   0.0148  0.0131  0.0118

(b) R@5 / R@10

     R@5                                               R@10
     FilmTrust                CiaoDVDs                 FilmTrust                CiaoDVDs
γ    D=1     D=2     D=3      D=1     D=2     D=3      D=1     D=2     D=3      D=1     D=2     D=3
0.0  0.3878  0.3917  0.3852   0.0269  0.0299  0.0273   0.6165  0.6161  0.6197   0.0493  0.0490  0.0439
0.4  0.3913  0.3944  0.3876   0.0289  0.0290  0.0292   0.6240  0.6233  0.6164   0.0454  0.0495  0.0494
0.8  0.3952  0.3968  0.3938   0.0263  0.0261  0.0230   0.6208  0.6248  0.6172   0.0462  0.0432  0.0442
1.2  0.3864  0.3957  0.3863   0.0289  0.0239  0.0238   0.6217  0.6200  0.6175   0.0519  0.0418  0.0408
1.6  0.3935  0.3977  0.3948   0.0279  0.0258  0.0257   0.6211  0.6186  0.6195   0.0516  0.0448  0.0404

(c) F1@5 / F1@10

     F1@5                                              F1@10
     FilmTrust                CiaoDVDs                 FilmTrust                CiaoDVDs
γ    D=1     D=2     D=3      D=1     D=2     D=3      D=1     D=2     D=3      D=1     D=2     D=3
0.0  0.3988  0.4016  0.3964   0.0199  0.0219  0.0191   0.4441  0.4443  0.4459   0.0221  0.0214  0.0197
0.4  0.4005  0.4034  0.3980   0.0211  0.0207  0.0212   0.4482  0.4478  0.4444   0.0212  0.0222  0.0218
0.8  0.4041  0.4051  0.4036   0.0197  0.0194  0.0178   0.4456  0.4491  0.4447   0.0216  0.0202  0.0201
1.2  0.3983  0.4047  0.3969   0.0213  0.0182  0.0186   0.4477  0.4453  0.4448   0.0235  0.0196  0.0193
1.6  0.4033  0.4065  0.4037   0.0200  0.0193  0.0184   0.4461  0.4459  0.4460   0.0230  0.0202  0.0182

5.3 Parameters Analysis

Table 2 shows the CBPR-1 results on the FilmTrust and CiaoDVDs datasets when varying γ and D. Specifically, we vary D from 1 to 3, since deeper layers of the social network are less effective. For each fixed social network layer D, and fixing the information spread coefficient μ to 1, we change the value of γ. We report P@5/10, R@5/10, and F1@5/10 while tuning γ from 0.0 to 1.6 with a step of 0.4 on the FilmTrust and CiaoDVDs datasets (see Table 2). The results show that the optimal parameter settings for P@5/10, R@5/10, and F1@5/10 are (D = 2, γ = 1.6)/(D = 2, γ = 0.8) for the FilmTrust dataset and (D = 2, γ = 0.0)/(D = 1, γ = 1.2) for the CiaoDVDs dataset. These results indicate that properly combining the social network layer depth D with the threshold parameter γ can improve item recommendation performance.


Fig. 2. Top-K Analysis. FilmTrust: (a) Precision, (b) Recall, (c) F1; CiaoDVDs: (d) Precision, (e) Recall, (f) F1.

6 Conclusion

The users' curiosity plays a crucial role in their item selection. To integrate psychological theories of curiosity into recommendation systems, we propose a Curiosity Enhanced Bayesian Personalized Ranking (CBPR) model. In particular, we provide an approach to measure curiosity by modeling the uncertainty


stimulus and incorporating it into the BPR model. The model adopts two Bayesian pair-wise preference rankings among items from the positive feedback, curiosity feedback, and negative feedback of users. The experimental results on two real-world social datasets show the effectiveness of our model.

Acknowledgement. This work was supported by the National Natural Science Foundation of China (62076100), the Fundamental Research Funds for the Central Universities, SCUT (x2rjD2230080), the Science and Technology Planning Project of Guangdong Province (2020B0101100002), the CAAI-Huawei MindSpore Open Fund, and the CCF-Zhipu AI Large Model Fund.

References

1. Berlyne, D.E.: Conflict, arousal, and curiosity (1960)
2. Butts, C.T.: Social network analysis: a methodological introduction. Asian J. Soc. Psychol. 11, 13–41 (2007)
3. Cai, Y., Leung, H., Li, Q., Tang, J., Li, J.: TyCo: towards typicality-based collaborative filtering recommendation. In: ICTAI 2010, pp. 97–104. IEEE Computer Society (2010)
4. Fan, W., Ma, Y., Yin, D., Wang, J., Tang, J., Li, Q.: Deep social collaborative filtering. In: RecSys 2019, pp. 305–313. ACM (2019)
5. Gray, R.M.: Entropy and information theory (1990)
6. Guo, G., Zhang, J., Sun, Z., Yorke-Smith, N.: LibRec: a Java library for recommender systems. In: UMAP 2015, CEUR Workshop Proceedings, vol. 1388. CEUR-WS.org (2015)
7. Guo, G., Zhang, J., Thalmann, D., Yorke-Smith, N.: ETAF: an extended trust antecedents framework for trust prediction. In: ASONAM 2014, pp. 540–547. IEEE Computer Society (2014)
8. Guo, G., Zhang, J., Yorke-Smith, N.: A novel Bayesian similarity measure for recommender systems. In: IJCAI 2013, pp. 2619–2625. IJCAI/AAAI (2013)
9. Hartung, F.M., Renner, B.: Social curiosity and gossip: related but different drives of social functioning. PLoS One 8(7), e69996 (2013)
10. Huang, Z., Chen, H., Zeng, D.: Applying associative retrieval techniques to alleviate the sparsity problem in collaborative filtering. ACM Trans. Inf. Syst. 22(1), 116–142 (2004)
11. Jahrer, M., Töscher, A.: Collaborative filtering ensemble for ranking. In: KDD Cup 2011 Competition. JMLR Proceedings, vol. 18, pp. 153–167. JMLR.org (2012)
12. Jøsang, A.: A logic for uncertain probabilities. Int. J. Uncertainty Fuzziness Knowl.-Based Syst. 9(3), 279–311 (2008)
13. Koren, Y., Bell, R.M., Volinsky, C.: Matrix factorization techniques for recommender systems. IEEE Comput. 42(8), 30–37 (2009)
14. Lin, T., Gao, C., Li, Y.: Recommender systems with characterized social regularization. In: CIKM 2018, pp. 1767–1770. ACM (2018)
15. Lin, X., Zhang, M., Liu, Y., Ma, S.: A neural network model for social-aware recommendation. In: Sung, W.K., et al. (eds.) Information Retrieval Technology. AIRS 2017. LNCS, vol. 10648. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-70145-5_10
16. Loewenstein, G.: The psychology of curiosity: a review and reinterpretation. Psychol. Bull. 116, 75–98 (1994)
17. Ma, H., Zhou, D., Liu, C., Lyu, M.R., King, I.: Recommender systems with social regularization. In: WSDM 2011, pp. 287–296. ACM (2011)
18. Rafailidis, D., Crestani, F.: Joint collaborative ranking with social relationships in top-n recommendation. In: CIKM 2016, pp. 1393–1402. ACM (2016)
19. Rafailidis, D., Crestani, F.: Recommendation with social relationships via deep learning. In: ICTIR 2017, pp. 151–158. ACM (2017)
20. Rendle, S., Freudenthaler, C., Gantner, Z., Schmidt-Thieme, L.: BPR: Bayesian personalized ranking from implicit feedback. In: UAI 2009, pp. 452–461. AUAI Press (2009)
21. Silvia, P.J.: Interest—the curious emotion. Curr. Dir. Psychol. Sci. 17(1), 57–60 (2008)
22. van Erp, M., Schomaker, L.: Variants of the Borda count method for combining ranked classifier hypotheses. In: 7th International Workshop on Frontiers in Handwriting Recognition (2000)
23. Vega-Oliveros, D.A., Berton, L., Vazquez, F., Rodrigues, F.A.: The impact of social curiosity on information spreading on networks. In: ASONAM 2017, pp. 459–466 (2017)
24. Wang, X., Hoi, S.C.H., Liu, C., Ester, M.: Interactive social recommendation. In: CIKM 2017, pp. 357–366. ACM (2017)
25. Wu, L., Sun, P., Hong, R., Fu, Y., Wang, X., Wang, M.: SocialGCN: an efficient graph convolutional network based model for social recommendation. CoRR abs/1811.02815 (2018)
26. Wu, Q., Liu, S., Miao, C.: Modeling uncertainty driven curiosity for social recommendation. In: WI 2017, pp. 790–798. ACM (2017)
27. Wu, Q., Liu, S., Miao, C., Liu, Y., Leung, C.: A social curiosity inspired recommendation model to improve precision, coverage and diversity. In: WI 2016, pp. 240–247. IEEE Computer Society (2016)
28. Wu, Q., Miao, C.: Curiosity: from psychology to computation. ACM Comput. Surv. 46(2), 1–26 (2013)
29. Wu, W., Yin, B.: Personalized recommendation algorithm based on consumer psychology of local group purchase e-commerce users. J. Intell. Fuzzy Syst. 37(5), 5973–5981 (2019)
30. Xu, K., Mo, J., Cai, Y., Min, H.: Enhancing recommender systems with a stimulus-evoked curiosity mechanism. IEEE Trans. Knowl. Data Eng., p. 1 (2019)
31. Xu, Y., Xu, K., Cai, Y., Min, H.: Leveraging distrust relations to improve Bayesian personalized ranking. Inf. 9(8), 191 (2018)
32. Zhao, P., Lee, D.L.: How much novelty is relevant? It depends on your curiosity. In: SIGIR 2016, pp. 315–324. ACM (2016)
33. Zhao, T., McAuley, J.J., King, I.: Leveraging social connections to improve personalized ranking for collaborative filtering. In: CIKM 2014, pp. 261–270. ACM (2014)

Modeling Online Adaptive Navigation in Virtual Environments Based on PID Control

Yuyang Wang1(B), Jean-Rémy Chardonnet2, and Frédéric Merienne2

1 The Hong Kong University of Science and Technology (Guangzhou), Guangzhou, China
[email protected]
2 Arts et Metiers Institute of Technology, LISPEN, HESAM Université, Paris, France
{jean-remy.chardonnet,frederic.merienne}@ensam.eu

Abstract. It is well known that locomotion-dominated navigation tasks may highly provoke cybersickness effects. Past research has proposed numerous approaches to tackle this issue based on offline considerations. In this work, a novel approach to mitigate cybersickness is presented based on online adaptive navigation. Considering the Proportional-Integral-Derivative (PID) control method, we propose a mathematical model for online adaptive navigation parametrized with several parameters, taking as input the users' electrodermal activity (EDA), an efficient indicator of the cybersickness level, and providing as output adapted navigation accelerations. Minimizing the cybersickness level is therefore regarded as an argument optimization problem: find the PID model parameters which can reduce the severity of cybersickness. User studies were organized to collect non-adapted navigation accelerations and the corresponding EDA signals. A deep neural network was then formulated to learn the correlation between EDA and navigation accelerations. The hyperparameters of the network were obtained through the Optuna open-source framework. To validate the performance of the optimized online adaptive navigation developed through PID control, we performed an analysis in a simulated user study based on the pre-trained deep neural network. Results indicate a significant reduction of cybersickness in terms of EDA signal analysis and motion sickness dose value. This is a pioneering work presenting a systematic strategy for adaptive navigation settings from a theoretical point of view.

Keywords: Virtual reality · Cybersickness · Navigation · Neural computing

1 Introduction

Thanks to increasing computing power and the availability of many affordable head-mounted displays (HMDs) such as the HTC Vive and Oculus Quest, the word “metaverse” has aroused profound discussion on the application of VR technologies among the public, including mass media, industry, and the academic community.


Engineers are able to develop dozens of applications, including training for medical operations, the organization of 3D virtual conferences, playing immersive games, visualizing 3D models, etc. [32]. When exposed to immersive environments, users can easily perform many tasks in a virtual world after wearing VR glasses, including walking/running, shooting, and fighting. The accomplishment of these tasks requires users to navigate through multiple virtual environments, which could be the most fundamental interaction process [42]. However, along with the navigation process, users usually experience cybersickness due to mismatched visual and vestibular information in the brain: users visually perceive objects moving while the body is still in position [36]. This sensory conflict leads to sickness symptoms, such as headache, vomiting, nausea, and sweating. In this paper, we present a novel method to improve the navigation experience in virtual environments. Our approach relies on the online adaptation of navigation parameters based on pre-trained neural networks and laws from system control.

1.1 Cybersickness Evaluation

Due to the individual susceptibility to cybersickness, the severity of the symptoms experienced is distributed differently among users [8]. Many subjective evaluation methods have been proposed in past research, such as the well-known Simulator Sickness Questionnaire (SSQ) [24], which asks users to evaluate sixteen sickness symptoms after an immersive experience and classifies them into three categories: nausea, oculomotor and disorientation. However, researchers often stick to the overall cybersickness level. In this case, it might be more meaningful to repeatedly ask participants a single question about their well-being instead of asking about multiple symptoms. The Misery Scale (MISC) [6,7] and the Fast Motion Sickness Scale (FMS) [25] have been proposed accordingly for such convenience. However, subjective questionnaires are generally administered after users have performed an experiment; they therefore have to shift attention away from the experiment to their personal feelings, which disturbs the resulting data [9].

Alternatively, biosignals are regarded as one of the most objective ways to represent individual differences (e.g., electrodermal activity (EDA), electroencephalography (EEG), heart rate variability (HRV), eye tracking, etc.) [9]. Recent work has demonstrated success in assessing and predicting cybersickness by involving these signals, especially thanks to deep learning approaches [10,18,37]. For example, a pioneering work presented the encoding of EEG signals into a cognitive representation related to cybersickness; by transferring it to VR video-based deep neural networks, the authors can predict cybersickness without any EEG signal [27]. This work subtly integrates individual EEG information into the visual information, making it possible to predict individually different cybersickness. Islam et al. [21,22] put forward a multimodal deep fusion neural network that can take stereoscopic videos, eye-tracking, and head-tracking data as inputs for predicting the severity of cybersickness. In particular, when using a combination of eye-tracking and head-tracking data, the authors found that their network can predict cybersickness with an accuracy of 87.77%, outperforming state-of-the-art research. Their approach is highly feasible for current consumer-level HMDs, which already integrate eye and head tracking sensors.

However, the adoption of measurement devices such as EEG and eye tracking may be hindered by their intrusiveness and the inconvenience of collection and analysis. Typically, the collection of EEG signals requires the experimenter to put several electrodes on the user's head [31]. Besides EEG signals usually being noisy, such a setup for real-time cybersickness evaluation may distract participants from the immersive experience. Recent progress in the development of wearable sensors provides insights for easy integration in immersive applications. One measurement that has particularly attracted interest for several years in the VR community is the EDA (e.g., [38]). Indeed, the EDA signal can be easily recorded through a cheap electrical circuit and is commonly regarded as a reflection of sympathetic arousal [39]. In this work, we therefore opted for the EDA as a reliable cybersickness indicator.

EDA signals can be decomposed into two components: the skin conductance level (SCL) and the skin conductance response (SCR) [2]. The SCL, associated with the tonic level of the EDA signal, changes slowly, with a time scale of tens of seconds to minutes. Because of differences in hydration, skin dryness, or autonomic regulation between respondents, the SCL varies accordingly and can be significantly different among respondents. On the other hand, the SCR, known as the phasic component of the EDA signal, rides on top of the tonic changes and demonstrates much faster variations. Alternations in the SCR component of an EDA signal are observable as bursts or peaks in the signal. The phasic component is associated with specific emotionally arousing stimulus events (event-related SCRs, ER-SCRs). The rise of the phasic component can reach a peak within 1–5 s after the onset of stimuli [39].

The SCL at the forehead and finger area can demonstrate correlation with cybersickness occurrences but not during the recovery stage [16]: the SCL will increase after users are exposed to visual stimulation [30]. According to spectrometer measurements of the water vapor produced by sweating, an increased SCL is linked to increased sweating. After the termination of the visual stimuli, the conductive path through the skin remains open despite reduced or absent sweat gland activity. Additionally, past research has found that the SCR presents a strong correlation with both the onset of and the recovery from cybersickness [16]. SCRs collected from the forehead significantly indicate a sudden and sustained burst of activity preceding an increase in subjective cybersickness ratings. However, SCRs gathered from the finger palmar site may not be related to cybersickness, as the palm is less sensitive than the human forehead at both the phasic and tonic levels [30]. Further studies [15,17] confirmed that phasic changes of the skin conductance on the forehead can be used to measure the level of cybersickness. Last, it has been shown that the width of the SCR collected from a wristband could be an index of cybersickness [34], which further supports our approach of collecting the EDA with a wristband sensor.

1.2 Adaptive Navigation

Adaptive navigation in virtual reality refers to the use of techniques and algorithms that adjust the user's virtual experience based on their behavior and preferences. It is a way to enhance the interaction experience by customizing the virtual environment to their needs and abilities. The severity of cybersickness symptoms increases with time, and after some time spent in VR, the sickness severity either begins to stabilize or decreases; therefore, adaptive navigation in the VR environment appears to be meaningful and deserves investigation [11].

Many strategies have been proposed in past research to mitigate cybersickness by adapting navigation settings. For example, fuzzy logic has been used to integrate three user factors (gaming experience, ethnic origin, age) to derive an individual susceptibility index to cybersickness [45]. This work opens the possibility to adapt navigation settings based on individual characteristics. Fernandes and Feiner [12] explored a way to dynamically change the field of view (FOV) depending on the users' response to visually perceived motion in a virtual environment. As a result, users experience less cybersickness, without decreasing the sense of presence and while minimizing awareness of the intervention. However, Zielasko et al. failed to confirm the correlation between a reduced FOV and the severity of cybersickness, although a reduction of the FOV allows users to travel longer distances [47].

To reduce the risk of cybersickness, Argelaguet and Andujar [3] designed an automatic speed adaptation approach in which the navigation speed is computed on a predefined camera path using optical flow, image saliency and habituation measures. During navigation, users usually manipulate the speed based on the task and personal preferences, but they have to involuntarily adjust the speed frequently and unsmoothly, resulting in severe cybersickness. Hu et al. [19] carried out similar work to reduce cybersickness with perceptual camera control while maintaining the original navigation design. Considering the effect of speed on cybersickness, Wang et al. [44] proposed an online speed protector to minimize the total jerk of the speed profile under both predetermined speed and acceleration constraints, leading users to report less severe cybersickness. Additionally, Freitag et al. [13] developed an automatic speed adjustment method for travel in virtual environments by measuring the informativeness of a viewpoint.

We believe that the evaluation and prediction of cybersickness is the first step, while the final objective is to reduce cybersickness through adaptive navigation. Therefore, through this work, we want to bridge the gap between evaluation and adaptation. A similar idea can be found in previous studies. Plouzeau et al. [38] created an innovative method to adapt the navigation acceleration in real time based on the EDA signal, resulting in a significant decrease of cybersickness levels among users while maintaining the same task performance. Similarly, Islam et al. [20] designed a closed-loop framework to detect the cybersickness severity and adapt the FOV during navigation. The framework collects the user's physiological data (e.g., HR, BR, HRV, EDA), from which cybersickness can be predicted, and based on the sickness severity, the system applies dynamic Gaussian blurring or FOV reduction to the VR viewer.


These studies usually assume that the sickness severity has a linear correlation with the adapted settings, and therefore the navigation settings are adapted with a linear proportional function, an assumption which, to the best of our knowledge, has not been confirmed. Naturally, however, the perception of visual stimuli could be a nonlinear process, and presetting a linear adaptation strategy may prevent finding the optimal adaptive settings. Accordingly, we propose to use a Proportional-Integral-Derivative (PID) control to adapt the navigation settings, without requiring any assumption beforehand.

In this work, we propose to use physiological signals, and particularly the electrodermal activity (EDA) that can be obtained from a wristband sensor in real time (here an Empatica E4 wristband1), to adapt navigation in real time, and therefore mitigate cybersickness. By involving the EDA, we expect to further incorporate individual differences into a customized navigation experience. We first use the phasic component of the EDA signal as a real-time measurement of the cybersickness level. Second, we formulate a mathematical relation based on the PID model, with several unknown parameters, to adapt the navigation acceleration, taking the sickness severity as input. Optimization is then performed to determine the optimal parameters and output a nonlinear adaptation strategy.

Past work proposed a similar idea, in which the navigation acceleration is adapted from the real-time evolution of EDA [38]:

$a(t_i) = a(t_{i-1}) - 0.5 \cdot \frac{dEDA(t_i)}{dt}$  (1)

where $a(t_i)$ and $a(t_{i-1})$ are the accelerations at two successive frames, and $EDA(t_i)$ is the magnitude of the EDA signal at time $t_i$. This formulation is designed for an acceleration-based control scheme in which the joystick directly controls the longitudinal or rotational acceleration until a speed limit is reached. However, the user might fail to control the acceleration when $EDA(t_i)$ keeps increasing or decreasing, which is why this model is not sufficient to mitigate cybersickness efficiently. We therefore propose another model, based on the PID control scheme, adding a second term in Eq. 5 to stabilize the acceleration. Our contributions are summarized as follows:

– We develop a novel mathematical model for adapting the navigation acceleration based on the severity of cybersickness, evaluated by the phasic component of the EDA signal.
– We propose and validate the use of deep neural networks (NNs) in our studies. Deep NNs work as simulated users during the experiments, taking navigation as input and sickness as output. Such an approach opens avenues to the development of intelligence in VR systems.

1 https://www.empatica.com/research/e4/.

The paper is organized as follows. In Sect. 2, we give a brief introduction to PID control and 1D convolutional neural networks, and we present the proposed adaptive navigation model mathematically. In Sect. 3, we demonstrate a feasible approach to find the optimal parameters for the adaptive model. Further, we discuss the results and the limitations of the work in Sect. 4, before concluding.

Adaptive Navigation Design PID Controller

PID is the abbreviation of proportional-integral-derivative and is a control loop method with feedback widely used in industrial control systems and numerous applications needing constantly modulated control [41]. A PID controller continuously computes an error value e(t) representing the difference between the expected setpoint and the measured process value, and applies a correction determined by proportional, integral and derivative terms (denoted P , I, and D respectively). The overall control function is given as,  τ de(t) (2) e(t) + KD ∗ u(t) = KP ∗ e(t) + KI ∗ dt 0 where, u(t) is the output of the PID controller, Kp , Ki and Kd are nonnegative coefficients for the proportional, integral and derivative terms respectively, τ is the time and t is the integration variable. This controller contains three terms with different control purposes: – The proportional controller gives a feedback that is proportional to the error e(t). If the error is large and positive, this term will also return a large and positive output, taking into account the coefficient Kp . However, it can not ensure that the system reaches the expected setpoint and maintains a steadystate error. – The integral controller is involved to remedy the steady-state error. It integrates the historic cumulative value of the error until the error reduces to zero. However, the integral term decreases its output when a negative error appears, which will limit the response speed and influence the stability of the system. – The derivative controller enables the system to estimate the future trend of the error based on its rate of change along time. It improves the stability of the system by compensating for a phase lag resulting from the integral term. 2.2

1D Convolutional Neural Network

There exist multiple models of neural networks. Among them, 1D convolutional neural networks (CNN) can achieve competitive performance compared to for example long short term memory (LSTM) on certain sequence-processing problems, usually at a significantly cheaper computational cost. Recently, 1D CNNs have been applied for audio generation and machine translation, obtaining great success [29]. In this work, 1D CNNs will be used to process sequential data.

Modeling Online Adaptive Navigation in Virtual Environments

2.3

331

Formulation of Adaptive Navigation

The objective of this section is to deduce the mathematical formulations for the adaptive navigation technique based on the PID control system. Let f (t) denote the phasic component of the EDA signal at time t, and the objective is to stabilize f (t), given as, Ef (ti ) = f (ti ) − f (ti−1 )

(3)

where Ef (ti ) is the difference of the phasic component between the current time step ti and the previous time step ti−1 . In idle state, f (ti ) is expected to be 0 which means that there is no visual stimuli. Therefore, Eq. 3 can be simplified as, Ef (ti ) = −f (ti−1 )

(4)

As the EDA signal is decomposed into the phasic (SCR) and the tonic (SCL) component, it implies that the variation of the EDA signal is associated with the phasic and the tonic component. Knowing that the tonic component usually varies slowly and the phasic component varies rapidly overlying the tonic component [5], in practice, we can use the phasic component to approximate the variation of the EDA signal. Mathematically, the variation of the EDA signal by time is noted as a temporal derivative, i.e., f = dEDA dt . As depicted in Fig. 1, there exists a similar variation trend between the derivative of EDA and the phasic component of EDA. As explained above, the SCR component is associated with arousing stimulus events. When a user is exposed to visual stimuli in immersive virtual environments, bursts or peaks appear. The severity of visual stimuli is associated with the salience of the burst or the signal’s peak. In other words, in an idle situation in which the user does not receive any visual stimuli, there should not be any observable bursts or peaks. Furthermore, it means that we should design a navigation technique in which the visual stimuli should not arouse excessive physiological responses. Our goal is then to optimize navigation to stabilize the SCR component of the EDA signal.

Fig. 1. Demonstration of one EDA signal including the phasic component, tonic component, and temporal derivative.

332

Y. Wang et al.

The navigation acceleration (both translational and rotational) a at time ti can be parametrized by the following algebraic expression, a(ti ) = a(ti−1 ) + ψa Ea (ti ) + diag(β)ψf Ef (ti )

(5)

where  ψal (·) 0 0 ψar (·)   τ KP l (·) + KIl 0 (·) + KDl d(·) 0 dt τ = 0 KP r (·) + KIr 0 (·) + KDr d(·) dt 

ψa (·) =

(6)

  ψf (·) 0 0 ψf (·)  (7)  τ KP f (·) + KIf 0 (·) + KDf d(·) 0 dt τ = 0 KP f (·) + KIf 0 (·) + KDf d(·) dt

ψf (·) =



   Eal (ti ) ale (ti ) − al (ti ) = Ea (ti ) = Ear (ti ) are (ti ) − ar (ti )     Ef (ti ) −f (ti−1 ) Ef (ti ) = = Ef (ti ) −f (ti−1 )   β 0 diag(β) = l 0 βr

(8) (9) (10)

In Eq. 5, the second term ψa Ea (ti ) ensures that the acceleration can vary around an expected value. Indeed, if the third term ψf Ef (ti ) keeps varying monotonously, the acceleration would also vary monotonously and reach an extremum. The third term implies then the adaptive quantity due to the visual stimulus or physiological response. diag(β) represents diagonal coefficient matrices used to balance the importance between longitudinal and rotational motion in Eq. 5. ale is the expected longitudinal acceleration; are is the expected rotational acceleration; al is the measured longitudinal acceleration; ar is the measured rotational acceleration. Both expected accelerations are 0. ψa (·) and ψf (·) are the PID operators. ψal (·) and ψar (·) are elements of ψa (·) with the following coefficients: KP l , KIl , KDl , and KP r , KIr , KDr , acting on the longitudinal and rotational accelerations respectively; they are used to ensure the accelerations to be around the expected values. ψf (·) is the element of ψf (·) with the following coefficients: KP f , KIf , KDf , acting on the phasic component of the EDA signal. Ea (ti ) and Ef (ti ) are the errors between the expected steady states and the measured states at time step ti . The errors will be substituted to the PID operators to compute the corrections.

Modeling Online Adaptive Navigation in Virtual Environments

333

To summarize, an adaptive navigation system can be described by Eq. 5 and the supplementary equations: Eq. 6, Eq. 7, Eq. 8, Eq. 9, and Eq. 10. The inputs of the system at time step ti are the measured longitudinal and rotational accelerations, and the phasic component of EDA, and the outputs of the system are the corrected accelerations in both directions. This parametrized model features the following eleven coefficients: KP l , KIl , KDl , KP r , KIr , KDr , KP f , KIf , KDf , βl and βr . As the objective of adaptation is to mitigate cybersickness, the nontrivial question is: can we find the optimal parameters to adapt the acceleration to reduce cybersickness?

Fig. 2. Strategy to find the optimal parameters that can mitigate cybersickness in the adaptive model.

We can see two possibilities to answer this question, depicted in Fig. 2. The first approach is to assign different values to these coefficients and perform user studies to determine the best one that can mitigate cybersickness. As there are eleven coefficients and each of them is independent to the others, the combination of optimal coefficients reaches at least hundreds or thousands of groups. To prove that a specific group of coefficients can reduce cybersickness with a statistical significance, dozens of participants are required to evaluate the adaptation system, with much time needed to complete all the experiments. In general, three steps are required for this approach that we will call classical: 1. A group of users navigates through a virtual environment without adaptive navigation and evaluates the corresponding cybersickness level according to the EDA signal afterwards. 2. The same group of users navigates through the same virtual environment with the PID adapted navigation. In this step, we have to manually find appropriate values for the different parameters. 3. After evaluating the cybersickness level, a comparison between both approaches is performed to determine potential significantly reduced cybersickness levels.

334

Y. Wang et al.

The recent strong development of artificial intelligence in various domains has raised interest in developing intelligent systems that can predict phenomena at a price of less efforts for developers and end-users. Although considering human participants cannot be dismissed, the release of more and more efficient neural network (NN) algorithms represents a formidable opportunity to introduce them into VR applications and pave foundations for the development of more efficient and individualized VR. Particularly, as in the classical approach above, parameter tuning can be highly time-consuming and tiring for participants, using NNs to simulate users may be an interesting alternative to explore. In fact, in the issue considered in this work, the function of users is to map the motion profiles during navigation to the corresponding EDA signal through which we can compute the cybersickness level. Fortunately, neural network (NN) models can work for this purpose as they are widely regarded as nonlinear fitters. Therefore, in this paper, we propose to investigate whether, and if so, how, with NNs, we can determine the performance of the adapted navigation with less user studies; and here comes the second approach: we propose to use NNs to simplify user studies and evaluate the performance of the different coefficients. This approach includes three steps: 1. The first step is similar to that of the classical approach. We collect from past user studies longitudinal and rotational accelerations and EDA signals during navigation in an immersive environment. The collected data are then used to train the NN model. 2. With the trained NN model, any navigation acceleration can be ingested to output the corresponding EDA signal. Therefore, in this step we can replace users by the trained NN model, allowing not to conduct a vast number of user studies. The focus of the work becomes then to search for appropriate parameters that can mitigate cybersickness, which corresponds exactly to a parameter optimization problem. In mathematics, parametric optimization can be solved with different methods that have already been implemented in open-source softwares such as Optuna [1]. Once the objective of the optimization is determined, Optuna can find the optimal parameters of our model through a Bayesian optimization, especially sequential model-based optimization with Tree-Structured Parzen Estimator. 3. Similar to that in the classical approach, the third step investigates whether adapted navigation can reduce cybersickness by comparing the artificially generated ER-SCR feature of the EDA signal to that coming from nonadapted navigation. Compared to the classical approach, the benefits of the second approach are: (1) replacing users by an NN, which alleviates the challenge of recruiting numerous participants; (2) transposing the adaptation problem to an optimization problem that can be solved with existing open-source software. In this case, the objective of the experiment is to collect enough data to train the NN model.

Modeling Online Adaptive Navigation in Virtual Environments

335

Fig. 3. Flowchart for the experiment including user navigation with HTC Vive Pro and data collection with Empatica E4 wristband.

Fig. 4. Virtual scenario in which the participants navigate along the highlighted path.

3 3.1

Data Collection and Parameters Computation Data Acquisition

To avoid performing excessive user tests, we had to train an NN model that can map the navigation behavior (acceleration in this context) to the corresponding EDA signal. However, cutting down the number of user tests does not imply that we can completely get rid of them, we still need to collect enough data to train a high-quality NN model. Hence, we carried out a user experiment to collect the required data. Participants. We invited 53 participants (Mage = 26.3, SDage = 3.3, Females: 26) from the local city to participate in a navigation task in an immersive environment. To obtain much more samples, all participants were asked to participate three times on three different days, hence we collected 159 samples. They were rewarded with different gifts afterwards. Upon arrival, they were asked to fill one pre-exposure questionnaire to investigate on their health conditions and experience in playing games and using VR devices. From this questionnaire, no participants reported any health issues that would affect the experiment results. A consent form was signed by participants.

336

Y. Wang et al.

Task Design. The general experimental procedure is presented in Fig. 3. The whole experiment was carried out using an HTC Vive Pro head-mounted display. 1. Before the test, we gave the participants a brief introduction about how to control navigation with the HTC Vive Pro hand controllers. Due to the occurrence of cybersickness, they were allowed to terminate the experiment whenever they felt sick or severe discomfort. 2. The experimenter put an HTC Vive Pro on the participants’ head and an Empatica E4 wristband on one participants’ arm. The Empatica E4 can sample EDA at a frequency of 4 Hz and the EDA signal is sent during navigation to a processing computer through Bluetooth. 3. The participants were immersed in the virtual environment displayed in Fig. 4 and started to navigate following the trajectory highlighted in brown. The user could control the motion in different directions through the touchpad on the HTC Vive Pro hand controller. Together with the EDA signal, the longitudinal and rotational navigation accelerations were recorded synchronously. 4. The navigation task continued for four minutes, and the participants were removed from the head-mounted display at the end. 3.2

Model Architecture

Fig. 5. Schematic representation of the data collected from one user session including the longitudinal and rotational accelerations, and the phasic component of EDA. Note that both accelerations were computed from the navigation speeds, and the phasic component of EDA was also preprocessed by the Neurokit2 package.

During the experiment, we collected three signals with the same starting and ending times: the longitudinal acceleration, the rotational acceleration, and the EDA signal. As the phasic component is associated with arousing stimulus events, the NN model should link the navigating accelerations to the phasic component of EDA. Therefore, we extracted the phasic component from the original EDA thanks to Neurokit2 [35], a Python toolbox for neurophysiological signal processing. Figure 5 represents a schematic representation of the data recorded from one participant session. Individual differences leading to different

Modeling Online Adaptive Navigation in Virtual Environments

337

Fig. 6. Architecture of the deep neural network. The network is composed of fourteen 1D convolutional layers and one fully connected layer; the kernel size for the convolutional is 3.

magnitudes in the phasic response could make the model difficult to train; thus we normalized all the data to range between zero and one. To allow the model to learn the local relationship between the acceleration and the phasic signal more easily, we used a moving window on the signal and reformulated the data structure. The current time clip of the phasic signal was associated with the accelerations from the previous time clip and the next time clip. For example, with a clip length of 1 s, to predict the phasic signal between 1 s and 2 s, we used the acceleration data from 0 s to 3 s; to predict the phasic signal between 2 s and 3 s, we used the acceleration data from 1 s to 4 s. The duration of the window clip was considered as a hyper-parameter for the NN model. In total, we collected 159 pairs of data2 among which we used 119 pairs to train the model, and the rest of them (40) to test whether the model could reduce the level of cybersickness. With 119 pairs of data, we obtained 90530 clips as the training set, and 22633 as the testing set for the NN model. Table 1. Settings of the hyper-parameters obtained from Optuna. Phasic signal length

2.25 s

Dropout rate

0.00099

Learning rate

0.000288

# Convolutional layers 14 Epoch

1800

Batch size

256

The proposed model was implemented in the Tensorflow framework on an Nvidia GeForce RTX 2070 graphic card. Adam was chosen as the optimizer. Other hyper-parameters in the NN model (dropout rate, learning rate, the number of convolutional layers, the number of epochs, batch size) were optimally determined by Optuna. It took approximately three hours to train the model, 2

One pair of data includes longitudinal and rotational accelerations and the corresponding EDA signal from one user session; one pair can be regarded as one data sample.

338

Y. Wang et al.

and Optuna spent around ten days to find the optimal hyper-parameters inside the searching space. Eventually, Optuna reported the best prediction accuracy in terms of the mean absolute error (MAE) is 0.015 (1D CNN) and 0.058 (LSTM). Therefore, we used 1D CNN in this work considering the lower loss error. The hyper-parameters of the 1D CNN model are reported in Table 1. The model structure is given in Fig. 6. 3.3

Computing the Adaptive Coefficients

With the previously determined hyper-parameters, we obtained the best NN model that can map the accelerations to the phasic component of the EDA signal. At this stage, we can further find the optimal adaptive coefficients using the 40 pairs of data mentioned above. The idea is to employ Optuna again to use the number of ER-SCR as the optimization objective, and search for the best adaptive coefficients that can reduce the number of ER-SCR. For the nonadapted navigation, the phasic component of EDA is obtained from the data collected during the user study, while for the adapted navigation, the phasic component of EDA is predicted from the pre-trained NN model. The detailed steps of the process to find the optimal coefficients of the adapted navigation model (the eleven coefficients introduced in Sect. 2) are,

Fig. 7. Comparison between non-adapted (in red) and adapted (in blue) longitudinal (left) and rotational (right) accelerations. The adapted accelerations are determined by the phasic component of EDA (zoomed part in the left graph). (Color figure online)

1. With the 40 pairs of data, we can compute the number of ER-SCR for the non-adapted navigation of each sample, denoted by a vector Nraw , used as a baseline to be compared with the adapted one. 2. Optuna will randomly choose values for the eleven unknown coefficients. With the model proposed in Eq. 5, we can compute a value for the adapted navigation accelerations based on the non-adapted one.

Modeling Online Adaptive Navigation in Virtual Environments

339

3. After obtaining the adapted accelerations, we can use the pre-trained NN model to compute the phasic component of EDA. As the pre-trained model only reads and returns the clipped data (as shown in Fig. 5), an additional step will reconstruct the phasic component obtained from multiple clipped data to a single sequence, and further compute the number of ER-SCR which is denoted by another vector Nadapted . 4. We compute the difference (Nraw − Nadapted ) to record the reduction of the number of ER-SCR, and the result is denoted by one vector N containing both positive and negative numbers. A positive number means that the person will experience reduced cybersickness, and a negative number denotes increased cybersickness. The goal during this optimization process is to have more positive numbers than negative ones, i.e., maximizing the percentage of positive numbers among the 40 samples. As the Neurokit2 package provides diverse methods for computing the ER-SCR, we employed three methods from Kim [28], Gamboa [14], and Neurokit2 [35] to compute the percentage of positive numbers. Next, another group of values will be randomly chosen for the coefficients and investigation will be performed to further increase the percentage of positive numbers, denoted by Ppn . The following formulation summarizes the optimization process operated by Optuna to find the optimal PID coefficients, max

PID coefficients

Ppn = Percentage of positives

 from Kim [28]

+ Percentage of positives + Percentage of positives



 from Neurokit2 [35]

from Gamboa [14]

(11) 5. Steps 2, 3 and 4 are repeated until the Ppn reaches a stable value. Once convergence is achieved, the computed coefficients are considered as the optimal ones, in our case, the ones reported in Table 2.
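The search described in steps 2–5 maps naturally onto Optuna's study/objective API. The sketch below is a hypothetical illustration of that loop rather than the authors' code: `samples`, `simulate_adapted_navigation`, `predict_phasic_eda`, and `count_er_scr` are placeholder names standing in for the 40 data pairs, the adaptive model of Eq. 5, the pre-trained NN, and the three ER-SCR detectors, and the coefficient search range is an assumption.

```python
import optuna

COEFF_NAMES = ["KPl", "KIl", "KDl", "KPr", "KIr", "KDr",
               "KPf", "KIf", "KDf", "beta_l", "beta_r"]
METHODS = ["kim2004", "gamboa2008", "neurokit2"]


def objective(trial):
    # Step 2: Optuna proposes a value for each of the eleven coefficients.
    coeffs = {name: trial.suggest_float(name, 0.0, 1.0) for name in COEFF_NAMES}
    score = 0.0
    for method in METHODS:
        positives = 0
        for sample in samples:                       # the 40 data pairs (placeholder)
            adapted = simulate_adapted_navigation(sample, coeffs)   # Eq. 5, step 2
            phasic = predict_phasic_eda(adapted)                    # pre-trained NN, step 3
            n_adapted = count_er_scr(phasic, method)
            n_raw = sample.n_raw[method]                            # baseline from step 1
            positives += int(n_raw - n_adapted > 0)                 # step 4
        score += positives / len(samples)            # percentage of positives per method
    return score                                     # P_pn summed over the three methods, Eq. (11)


study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=500)              # step 5: repeat until P_pn stabilizes
```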

Table 2. Optimal coefficients of the adaptive model obtained from Optuna.

K_Pl   | K_Il   | K_Dl   | K_Pr   | K_Ir   | K_Dr   | K_Pf   | K_If   | K_Df   | β_l    | β_r
0.0113 | 0.0065 | 0.0137 | 0.0098 | 0.0012 | 0.0011 | 0.0730 | 0.2283 | 0.3724 | 0.0017 | 0.0012

3.4 Results

Based on the methodology presented above, a theoretical validation test was conducted in which the trained NN replaced users and adapted navigation was simulated numerically.


Figure 7 demonstrates the difference between non-adapted and adapted accelerations in the longitudinal and rotational directions for one test in the scenario presented in Fig. 4. During non-adapted navigation, the acceleration might reach extremely large magnitudes because the user moves rapidly. Adaptive navigation could compensate for such variation.

Table 3. The majority of samples (out of 40 in total) report significantly decreased numbers of ER-SCR and MSDV with our adaptive model.

Methods    | Positive number | Percentage | χ²   | P-value | φ
Kim2004    | 36              | 90%        | 25.6 |         |
Gamboa2008 | 29              | 72.5%      |      |         |
Neurokit2  | 36              | 90%        |      |         |

q_{i,i} > q_{i,j} \wedge q_{i,i} > q_{k,i}, \quad \forall j, k \neq i    (5)

To impose the above constraints (Eq. (5)), we formulate our CHIP learning objective as an auxiliary classification task:

L^{chip}(Q) = -\frac{1}{2N} \sum_{i=1}^{N} \left( \log \frac{e^{q_{ii}}}{\sum_{j=1}^{N} e^{q_{ij}}} + \log \frac{e^{q_{ii}}}{\sum_{k=1}^{N} e^{q_{ki}}} \right),    (6)

in which q_{ii} gets maximized as a logit. As Q depends on feature embeddings from model outputs, we can optimize the model by minimizing L^{chip} with standard SGD. Notably, CHIP learning is unsupervised and can serve as a multi-task objective in model training. We describe the details in Sect. 3.4.
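As a concrete illustration, the symmetric cross-entropy of Eq. (6) can be written in a few lines of PyTorch. This is a minimal sketch rather than the authors' implementation; how the similarity matrix Q is built (e.g., a temperature-scaled product of two sets of per-person embeddings) is an assumption, since its construction precedes this excerpt.

```python
import torch
import torch.nn.functional as F


def chip_loss(q: torch.Tensor) -> torch.Tensor:
    """Eq. (6): symmetric cross-entropy that pushes each diagonal similarity q_ii
    to dominate both its row (index j) and its column (index k)."""
    n = q.size(0)
    targets = torch.arange(n, device=q.device)
    row_term = F.cross_entropy(q, targets)       # -1/N * sum_i log softmax_j(q_{i,.})[i]
    col_term = F.cross_entropy(q.t(), targets)   # -1/N * sum_i log softmax_k(q_{.,i})[i]
    return 0.5 * (row_term + col_term)           # gives the 1/(2N) factor of Eq. (6)


# Example usage (the construction of Q is an assumption):
# h = F.normalize(history_embeddings, dim=1)
# f = F.normalize(future_embeddings, dim=1)
# loss = chip_loss(h @ f.t() / temperature)
```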

3.2 Differentiable Social Interaction Ranking

This study delves into explicitly modeling the properness of social interactions. To achieve this goal, we propose a novel Differentiable Social Interaction Ranking (DSIR) module, which enables the supervised optimization of predicted interactions among all participants to align with their actual positions.

Observations. We highlight two important types of person interactions based on analysis of realistic data. Specifically, Fig. 2a illustrates the in-group interaction characterized by individuals walking in close proximity (i.e., less than 1 m) as the same social group. Conversely, Fig. 2b captures the out-group interaction where two individuals walk independently but take measures to avoid collision by dynamically adjusting their walking paths while maintaining a comfortable social distance. This finding offers insight into social dynamics and highlights the importance of considering various forms of interpersonal interactions in modeling pedestrian behavior.

We first define the interaction intensity of a pair of persons (i, j) with the classical Gaussian potential function [2,6,29], depending on their relative distance d_{ij}, as

\psi(i, j) = \exp\left( -\frac{d_{ij}^2}{2\sigma^2} \right) \in (0, 1], \quad \forall i \neq j,    (7)

in which σ is a constant social distance, d_{ij} is computed based on their positions, and ψ is symmetric (ψ(i, j) = ψ(j, i)). We omit ψ(i, i) as we care about interactions between different persons. Thus, in a scene with N persons, we have M = N(N − 1)/2 unique potentials to consider.
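For reference, a minimal sketch of Eq. (7) for one time step; `torch.cdist` and the default value of σ are implementation choices, not taken from the paper.

```python
import torch


def pairwise_potentials(pos: torch.Tensor, sigma: float = 1.0) -> torch.Tensor:
    """Eq. (7): Gaussian interaction potential for every unordered pair of persons.
    pos: (N, 2) positions at one time step; returns the M = N(N-1)/2 potentials."""
    d = torch.cdist(pos, pos)                              # pairwise distances d_ij
    psi = torch.exp(-d ** 2 / (2 * sigma ** 2))            # symmetric, in (0, 1]
    i, j = torch.triu_indices(pos.size(0), pos.size(0), offset=1)
    return psi[i, j]                                       # unique off-diagonal pairs
```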


Fig. 2. (a) In-group interaction ("walk together") with high potentials. (b) Out-group interaction ("walk and cross") with varied potentials at different steps. (c) Ranking of pairwise potentials; the interactions with the highest potentials over time are highlighted in the dashed red box. (Color figure online)

Social Descriptors. We propose to describe social interactions in a global view by the pairwise potentials of pedestrians, which depend on their relative distances. The potential ranking at each step provides rich information about the temporal dynamics of human interactions. Figure 2c shows a 3-person scene with complex interactions, i.e., P1 and P2 are in-group while P3 is out-group w.r.t. P1 and P2. Moreover, P3 is expected to interact with P1 and P2 by crossing their paths at subsequent steps. Thus, P1 and P2 need to plan their routes to maintain their intimacy while also keeping a polite distance from P3. As reflected in their ranks, ψ(1, 2) remains consistently high across all steps, while ψ(1, 3) and ψ(2, 3) reach their peaks at t_1 and t_2, respectively, before dropping off in later steps. To further examine the complexity and appropriateness of human interactions, we propose to learn the predicted ranking of pairwise potentials using


trajectory predictions and compare them with the actual order based on the ground-truth data.

Task Formulation. Given predicted pairwise potentials, we now formulate the task of ranking them in increasing order as an optimization task with a differentiable solution, so that it can be integrated into our learning procedure. Let ψ = [ψ_1, ..., ψ_M] ∈ R^M be the list of M unique pairwise potentials in column form. Let M = {1, 2, ..., M} and 1_M be an all-ones vector of dimension M. We define an M-element index array y as

y = \left[ y_j = \frac{j}{M} \right]_{j=1}^{M} = \frac{1}{M}\,[\,1, \cdots, j, \cdots, M\,],    (8)

in which y_j ∈ (0, 1] lies in the same range as the potentials ψ, making the following sorting operations numerically stable. Let P = \{p_{ij}\}_{i,j=1}^{M} be an M × M permutation matrix: P is binary, each row and column has exactly one element equal to 1, and a 1-element p_{ij} assigns the j-th element to the i-th rank. A sorting permutation matrix P^* permutes ψ into increasing order as ψ^{P^*}:

\psi^{P^*} = P^* \psi = [\psi_1^{P^*}, ..., \psi_M^{P^*}]^\top, \quad \forall i < j,\ \psi_i^{P^*} < \psi_j^{P^*}.    (9)

Sorting is usually non-differentiable, since it requires comparing and swapping elements (e.g., QuickSort). We propose to formulate our potential ranking task as a differentiable learning objective and optimize it iteratively. We first show that a proper sorting permutation can be obtained by minimizing the following cost function.

Lemma 1. Let y = [i/M]_{i=1}^{M} be the index array. Given a vector ψ of M unique elements, the sorting permutation P^* is the unique solution (out of all permutations) which minimizes the cost

L(\psi^P \mid y) = \sum_{i=1}^{M} (y_i - \psi_i^P)^2.    (10)

The proof is by contradiction: suppose there exists some non-sorting permutation P' (P' ≠ P^*) which also minimizes Eq. (10); swapping any non-sorted pair would then further lower the cost, a contradiction. Lemma 1 shows that sorting can be formulated as finding the optimal permutation that minimizes Eq. (10). Based on this, we construct a cost matrix C_{ψy} as

C_{\psi y} = \left[ c_{ij} = (y_j - \psi_i)^2 \right]_{i,j=1}^{M} \in \mathbb{R}^{M \times M}.    (11)

We formulate the sorting operation as a relaxed integer program:

\hat{P}^* = \arg\min_{P} \ \langle P, C_{\psi y} \rangle - \lambda H(P), \quad \text{s.t.}\ P \geq 0,\ P 1_M = 1_M,\ P^\top 1_M = 1_M,    (12)


in which H(P) = -\sum_{i,j} P_{i,j} \log P_{i,j} is the entropy term. The constraints of Problem (12) only require each row and column of P to sum to one and be non-negative, relaxing the requirement of a binary permutation matrix. This allows a soft assignment of ranks, e.g., P_{ij} is interpreted as the weight of assigning element ψ_j to the i-th rank. The solution \hat{P}^* of Problem (12) can be computed in an iterative and differentiable way [3].

Lemma 2. For an M × M cost matrix C, Problem (12) is strictly convex, so there exists a unique minimizer P^* of the form P^* = X A Y, where A = \exp(-\lambda C) and X, Y \in \mathbb{R}_{+}^{M \times M} are non-negative diagonal matrices that are unique up to a multiplicative factor [3]. It can be efficiently computed with the differentiable Sinkhorn algorithm [3].

Pairwise Ranking Loss. We can estimate the ranks of the pairwise potentials with Problem (12) as \hat{R}(\psi) = \{\hat{r}_i\}_{i=1}^{M} = M \cdot \hat{P}^* y. In the training stage, we can obtain the actual pairwise potentials φ based on the true positions X^f, and sort them to get their actual ranks R(\varphi) = \{r_i\}_{i=1}^{M} as ground truths. By comparing the predictions with the ground truths, we develop the social ranking loss L_{dsir} to penalize inconsistent pairwise orders in a supervised manner:

L_{dsir}(\hat{R} \mid R) = \frac{1}{M^2} \sum_{i=1}^{M} \sum_{j=1}^{M} \max\left(0, -(r_i - r_j)(\hat{r}_i - \hat{r}_j)\right).    (13)
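A minimal sketch of the entropy-relaxed sorting of Problem (12) and the ranking loss of Eq. (13). The cost matrix is oriented with rows indexing the potentials and columns indexing the target positions, so the estimated rank of element i is M times the i-th entry of P̂y under this orientation; λ, the iteration count, and the log-domain Sinkhorn updates are implementation choices rather than values taken from the paper.

```python
import torch


def soft_rank(psi: torch.Tensor, lam: float = 10.0, n_iters: int = 50) -> torch.Tensor:
    """Differentiable ranks of the potentials psi via the relaxed assignment of
    Problem (12), solved with Sinkhorn iterations (Lemma 2)."""
    m = psi.numel()
    y = torch.arange(1, m + 1, dtype=psi.dtype, device=psi.device) / m   # Eq. (8)
    cost = (y.view(1, -1) - psi.view(-1, 1)) ** 2                        # Eq. (11): (y_j - psi_i)^2
    log_p = -lam * cost                                                  # A = exp(-lambda * C), in log space
    for _ in range(n_iters):                                             # alternate row/column normalization
        log_p = log_p - torch.logsumexp(log_p, dim=1, keepdim=True)
        log_p = log_p - torch.logsumexp(log_p, dim=0, keepdim=True)
    p_hat = log_p.exp()                                                  # doubly-stochastic relaxation of P*
    return m * (p_hat @ y)                                               # estimated ranks r_hat


def dsir_loss(pred_ranks: torch.Tensor, true_ranks: torch.Tensor) -> torch.Tensor:
    """Eq. (13): hinge penalty on pairwise orders that disagree with the ground truth."""
    diff_true = true_ranks.view(-1, 1) - true_ranks.view(1, -1)
    diff_pred = pred_ranks.view(-1, 1) - pred_ranks.view(1, -1)
    m = pred_ranks.numel()
    return torch.clamp(-diff_true * diff_pred, min=0).sum() / m ** 2
```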

3.3 Bi-Gaussian Regression Loss

We follow previous studies [1,8,9,23] to optimize the predicted trajectories with the bi-Gaussian distribution loss. Let the true future positions be X^f = (x, y) and the parameterized model predictions be Z = (μ_x, μ_y, σ_x, σ_y, ρ) as in Eq. (2). We omit the subscripts i and t for simplicity. The bi-Gaussian distribution loss follows

L_{biG}(Z \mid X^f) = \frac{1}{2\pi\sigma_x\sigma_y\sqrt{1-\rho^2}} \exp\left( \frac{-1}{2(1-\rho^2)} \left[ \frac{(x-\mu_x)^2}{2\sigma_x^2} + \frac{(y-\mu_y)^2}{2\sigma_y^2} - \frac{2\rho(x-\mu_x)(y-\mu_y)}{\sigma_x\sigma_y} \right] \right),    (14)

which penalizes deviations of the predicted (μ_x, μ_y) from the ground-truth (x, y) as well as large variances.
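In practice such a term is usually implemented as the negative log-likelihood of the bivariate Gaussian density; the sketch below assumes that convention (and the standard normalization of the exponent), so it is an illustration rather than the paper's exact formulation.

```python
import torch


def bi_gaussian_nll(z: torch.Tensor, target: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """Negative log-likelihood of the bivariate Gaussian in Eq. (14).
    z:      (..., 5) predicted (mu_x, mu_y, sigma_x, sigma_y, rho)
    target: (..., 2) ground-truth (x, y)"""
    mu_x, mu_y, sigma_x, sigma_y, rho = z.unbind(dim=-1)
    x, y = target.unbind(dim=-1)
    sigma_x = sigma_x.clamp_min(eps)
    sigma_y = sigma_y.clamp_min(eps)
    rho = rho.clamp(-1 + eps, 1 - eps)
    dx, dy = (x - mu_x) / sigma_x, (y - mu_y) / sigma_y
    one_minus_rho2 = 1 - rho ** 2
    quad = (dx ** 2 + dy ** 2 - 2 * rho * dx * dy) / (2 * one_minus_rho2)
    log_norm = torch.log(2 * torch.pi * sigma_x * sigma_y * torch.sqrt(one_minus_rho2))
    return (quad + log_norm).mean()   # penalizes position error and large variances
```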

3.4 Two-Stage Multi-task Training Objective

In summary, we can optimize the model by jointly minimizing the bi-Gaussian regression loss in Eq. (14), the CHIP learning loss in Eq. (6), and the DSIR loss in Eq. (13) as

L_{final} = L_{biG} + \alpha_1 L_{chip} + \alpha_2 L_{dsir},    (15)


in which α_1, α_2 are scaling factors. Specifically, we propose a two-stage best practice of end-to-end model training with the multi-task objective L_{final} using standard SGD. In Stage 1 (pre-training), we minimize L_{pre} = L_{biG} + α_1 L_{chip} for fast convergence, i.e., omitting the DSIR loss. In Stage 2 (fine-tuning), we use the full L_{final} in Eq. (15) to fine-tune the trajectory predictions with the interaction-aware DSIR loss.

4 Evaluation Metrics

We describe two standard metrics for trajectory prediction, then propose our novel intimacy-politeness score to fully evaluate the properness of social interactions.

4.1 Standard ADE and FDE

ADE and FDE are two standard error metrics which measure the deviations of the predicted positions from the ground truths. Let (x_{i,t}, y_{i,t}) be the real position of person i at step t, and (\hat{x}_{i,t}, \hat{y}_{i,t}) be the predicted position. Their L2-distance is defined as e_i^t = \sqrt{(\hat{x}_i^t - x_i^t)^2 + (\hat{y}_i^t - y_i^t)^2}. The Average Displacement Error (ADE) [18] calculates the L2-distance between the predicted future trajectory and the ground truth, averaged over all future steps and all N persons in the scene, as \frac{1}{N \cdot T_f} \sum_{i=1}^{N} \sum_{t=T_h+1}^{T_h+T_f} e_i^t. The Final Displacement Error (FDE) [1] computes the L2-distance between the predicted position and the actual position at the final step, as \frac{1}{N} \sum_{i=1}^{N} e_i^{T_h+T_f}.
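A minimal sketch of both metrics, assuming predictions and ground truth are stacked as (N, T_f, 2) tensors:

```python
import torch


def ade_fde(pred: torch.Tensor, gt: torch.Tensor):
    """pred, gt: (N, T_f, 2) predicted and ground-truth future positions.
    Returns (ADE, FDE) as defined above."""
    dist = torch.linalg.norm(pred - gt, dim=-1)   # e_i^t, shape (N, T_f)
    ade = dist.mean()                             # average over persons and future steps
    fde = dist[:, -1].mean()                      # final-step error averaged over persons
    return ade.item(), fde.item()
```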

4.2 Surrogate Social Distance Accuracy (SDA)

Public datasets often do not contain labels for social groups, as annotating social interactions can be a time-consuming process. To address this issue, we suggest using a Social Distance Accuracy (SDA) scoring function to evaluate the quality of social interactions in a weakly-supervised manner. The SDA method uses the pedestrians' predicted distances and compares them with weakly annotated group labels that are based on ground-truth distances. Our research demonstrates that SDA serves as a reliable surrogate measure for assessing the accuracy of trajectory predictions. Let σ be the minimal distance of social politeness. Let d_{i,j,t} be the actual relative distance between individuals i and j at a given time t, and let \hat{d}_{i,j,t} be the predicted distance. We construct an adaptive in-group distance upper bound d^+_{i,j,t} and an out-group distance lower bound d^-_{i,j,t} based on d_{i,j,t} and σ. These two bounds define the acceptable social distance range in an unsupervised manner without real group labels.


We choose a social distance threshold σ as the minimal distance of social politeness. Then we can establish the in-group person triplet set D^{In} and the out-group person triplet set D^{Out} in an unsupervised manner as follows:

D^{In} = \{ (i, j, t) \mid \bar{d}_{i,j,t} \leq \sigma \wedge d_{i,j,t} \leq \sigma \}, \qquad D^{Out} = \{ (i, j, t) \mid \bar{d}_{i,j,t} > \sigma \wedge d_{i,j,t} > \sigma \}.    (16)

We construct an adaptive range [d^-_{i,j,t}, d^+_{i,j,t}] based on the actual distance d_{i,j,t} with τ ∈ (0, 1) such that

d^+_{i,j,t} = (1 + \tau) \cdot d_{i,j,t} \quad \text{and} \quad d^-_{i,j,t} = (1 - \tau) \cdot d_{i,j,t}.    (17)

Now we design the hinge score functions s^I and s^P to measure intimacy and politeness separately, as follows:

s^I_{i,j,t} = \mathrm{trunc}\!\left( \frac{d^+_{i,j,t} - \hat{d}_{i,j,t}}{d^+_{i,j,t} - d_{i,j,t}},\, 0,\, 1 \right), \ \forall (i, j, t) \in D^{In}, \qquad s^P_{i,j,t} = \mathrm{trunc}\!\left( \frac{\hat{d}_{i,j,t} - d^-_{i,j,t}}{d_{i,j,t} - d^-_{i,j,t}},\, 0,\, 1 \right), \ \forall (i, j, t) \in D^{Out},    (18)

in which trunc(·, 0, 1) clips the value within [0, 1]. Concretely, s^I rewards a predicted \hat{d}_{i,j,t} that is smaller than d^+_{i,j,t} for in-group triplets in D^{In}, while s^P rewards a \hat{d}_{i,j,t} that is larger than d^-_{i,j,t} for out-group triplets in D^{Out}. In summary, s^I decreases when the in-group distance is over-estimated, while s^P decreases when the out-group distance is under-estimated. We use the combined s^I and s^P as the Social Distance Accuracy (SDA).
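A minimal sketch of how SDA could be computed from Eqs. (16)–(18); treating the bar-distance of Eq. (16) as the ground-truth distance and averaging the combined scores are assumptions, since the excerpt does not spell out how the two scores are combined.

```python
import torch


def sda(pred_dist: torch.Tensor, true_dist: torch.Tensor,
        sigma: float = 1.0, tau: float = 0.5, eps: float = 1e-8) -> torch.Tensor:
    """pred_dist, true_dist: flattened pairwise distances d_hat and d over all (i, j, t)."""
    d_hi = (1 + tau) * true_dist                                        # in-group upper bound, Eq. (17)
    d_lo = (1 - tau) * true_dist                                        # out-group lower bound, Eq. (17)
    in_group = true_dist <= sigma                                        # D^In  (Eq. (16), simplified)
    s_i = ((d_hi - pred_dist) / (d_hi - true_dist + eps)).clamp(0, 1)    # intimacy score, Eq. (18)
    s_p = ((pred_dist - d_lo) / (true_dist - d_lo + eps)).clamp(0, 1)    # politeness score, Eq. (18)
    return torch.where(in_group, s_i, s_p).mean()                        # combined SDA (assumed: mean)
```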

5 Experiments

We first introduce datasets and benchmark models of existing works. Then we report the performance and carry out ablation studies to show the effectiveness of our methods. Datasets. We use two widely compared public pedestrian trajectory datasets, i.e., ETH [18] and UCY [14] to evaluate our methods. In particular, ETH dataset contains the ETH and HOTEL scenes, while the UCY dataset contains the UNIV, ZARA1, and ZARA2 scenes. Each data sequence contains observed trajectories extracted from 8 frames (3.2 s) and future trajectories in the next 12 frames (4.8 s). The train/val/test splits are given. Following a standard testing procedure [1,17], we generate 20 random samples from the predicted distribution for each testing trajectory, then we calculate the minimum ADE and FDE from the predictions to the ground truth. We also calculate the maximum SDA from all samples as our interactive metric value.


Table 1. Min ADE and FDE results on the benchmark ETH and UCY datasets (reported as ADE/FDE).

Model                 | Architecture | ETH       | HOTEL     | UNIV      | ZARA1     | ZARA2     | AVG
Vanilla-LSTM [1]      | LSTM         | 1.09/2.41 | 0.86/1.91 | 0.61/1.31 | 0.41/0.88 | 0.52/1.11 | 0.70/1.52
Social-LSTM [1]       | LSTM-Pool    | 1.09/2.35 | 0.79/1.76 | 0.67/1.40 | 0.47/1.00 | 0.56/1.17 | 0.72/1.54
Social-GAN [9]        | GAN-Pool     | 0.87/1.62 | 0.67/1.37 | 0.76/1.52 | 0.35/0.68 | 0.42/0.84 | 0.61/1.21
Sophie [20]           | GAN-Att      | 0.70/1.43 | 0.76/1.67 | 0.54/1.24 | 0.30/0.63 | 0.38/0.78 | 0.51/1.15
Social-BiGAT [11]     | GAN-Att      | 0.68/1.29 | 0.68/1.40 | 0.57/1.29 | 0.29/0.60 | 0.37/0.75 | 0.52/1.07
STGCNN [17]           | GCN          | 0.64/1.11 | 0.49/0.85 | 0.44/0.79 | 0.34/0.53 | 0.30/0.48 | 0.44/0.75
SGCN [23]             | GCN-Att      | 0.63/1.03 | 0.32/0.55 | 0.37/0.70 | 0.29/0.53 | 0.25/0.45 | 0.37/0.65
SGCN+CHIP             | Ours         | 0.59/0.92 | 0.29/0.51 | 0.38/0.72 | 0.28/0.49 | 0.25/0.45 | 0.36/0.62
SGCN+CHIP+DSIR        | Ours         | 0.55/0.86 | 0.28/0.44 | 0.37/0.69 | 0.27/0.46 | 0.23/0.42 | 0.34/0.58

5.1 Our Approaches and Baselines

We construct our learning models based on the state-of-the-art SGCN [23]. SGCN-CHIP adds our proposed CHIP learning module to SGCN as a multi-task objective for model regularization. SGCN-CHIP-DSIR utilizes both the CHIP and DSIR modules to boost training in a two-stage manner as in Sect. 3.4. We compare our method with various existing methods, including Vanilla-LSTM [10], Social-LSTM [1], Social-GAN [9], Sophie [20], Social-BiGAT [12], STGCNN [17], and SGCN [23].

5.2 ADE and FDE Results

We show results evaluated with ADE and FDE (lower is better) in Table 1 and observe the following trends.

– The best performing SGCN-CHIP-DSIR leads SGCN by 8.1% in ADE (0.34 vs. 0.37) and 10.8% in FDE (0.58 vs. 0.65) relatively, establishing new benchmarks for pedestrian trajectory prediction.
– With only CHIP (no DSIR), our SGCN-CHIP still outperforms SGCN by 2.7% in ADE and 4.6% in FDE, showing the effectiveness of contrastive learning for regularizing the model training.
– By additionally performing DSIR, SGCN-CHIP-DSIR outperforms SGCN-CHIP by 5.6% in ADE (0.34 vs. 0.36) and 6.5% in FDE (0.58 vs. 0.62), relatively. This shows the effectiveness of potential ranking to explore interactive patterns.

Table 2. SDA results. Higher is better.

Model          | ETH   | HOTEL | UNIV  | ZARA1 | ZARA2 | AVG
Van.-LSTM [1]  | 0.456 | 0.505 | 0.536 | 0.460 | 0.454 | 0.482
Soc.-LSTM [1]  | 0.463 | 0.497 | 0.544 | 0.461 | 0.450 | 0.483
Soc.-GAN [9]   | 0.522 | 0.542 | 0.581 | 0.606 | 0.640 | 0.578
Sophie [20]    | 0.692 | 0.683 | 0.740 | 0.735 | 0.825 | 0.735
STGCNN [17]    | 0.732 | 0.851 | 0.679 | 0.875 | 0.789 | 0.785
SGCN [23]      | 0.783 | 0.848 | 0.712 | 0.873 | 0.831 | 0.809
SGCN-CHIP      | 0.816 | 0.852 | 0.723 | 0.886 | 0.854 | 0.826
+DSIR          | 0.832 | 0.854 | 0.721 | 0.891 | 0.870 | 0.834

5.3 SDA Results

We show the results of SDA (higher is better) in Table 2 and observe the following trends. The best performing SGCN-CHIP-DSIR outperforms the original SGCN by 3.1% in SDA (0.834 vs. 0.809). Without DSIR, our SGCN-CHIP still leads the original SGCN by 2.1% (0.826 vs. 0.809). STGCNN and SGCN have higher SDA thanks to their GCN backbones. Social-LSTM/-GAN have low SDA due to their ineffective social pooling. Vanilla-LSTM has the lowest SDA as it ignores interaction learning.

Fig. 3. Ablation studies of model compatibility: (a) ADE (lower is better); (b) SDA (higher is better).

5.4 Ablation Study of Model Adaptivity

We study whether our CHIP and DSIR learning modules can adapt to existing models and boost performance over their original designs. We experiment with several recent studies: Social-GAN [9], Sophie [20], STGCNN [17]


and SGCN [23]. They cover most existing backbones and social learning methods, including LSTM-GAN-Pool, LSTM-GAN-Attention, Transformer, GCN, and GCN-Attention, respectively, as shown in Table 1. For each existing model, we build two variants, {MODEL}-CHIP and {MODEL}-CHIP-DSIR, with one or both of our proposed modules, respectively. We show the performance boost in ADE and SDA in Fig. 3. Figure 3a shows that for each model except Social-GAN, our CHIP module lowers ADE by about 3–4%, and CHIP+DSIR together lower ADE by about 6–9%, relative to the original model. Our modules improve Social-GAN by a substantial 15–18%, indicating the importance of regularizing the path generation process with GANs. Figure 3b shows that for each model except Social-GAN, our CHIP module improves SDA by about 0.8–4.5%, and CHIP+DSIR together improve SDA by about 1.8–7%. The improvement on Social-GAN is as large as 31%. Comparing Fig. 3a and Fig. 3b, we find that SDA and ADE are generally consistent over most methods. This indicates that SDA serves as a trustworthy evaluation metric for unsupervised social interaction. In conclusion, our model-agnostic learning modules can be readily integrated into existing models, and possibly future state-of-the-art models, to achieve performance boosts.

Fig. 4. Visualization of scene-level potential ranking: (a) a 3-person scene (left) and its pairwise ranking (right); (b) a multi-group complex scene (left) and its pairwise ranking (right).

5.5 Visualizations of Ranking

In Fig. 4a, we show a 3-person scene on the left and display their potential ranking over 8 future steps on the right. We have excluded pair P-(0, 2) due to their significant distance. During steps 1 and 2, Rank-(1, 2) is predominant since P-1 (yellow) and P-2 (green) are in closer proximity with higher potential. However, in step 3, P-2 moves away while P-1 continues to follow P-0 (blue) at a closer distance, which leads to a higher rank for P-(0, 1). Figure 4b illustrates a scene comprised of five individuals, as well as the potential ranking of three pairs. On the left-hand side, the actual paths of each person are visually represented, while the right-hand side displays the corresponding potential rankings. The shaded lines in both figures show the actual ranking. As P-0 and P-3 move away from each other, the Rank-(0, 3) drops over time. On the other hand, Rank(2, 5) remains at the top as P-2 and P-5 approach one another. Rank-(3, 6) peaks at step-4, when P-3 and P-6 stand at the closest point before parting ways. Ultimately, our predictions align with the ground truths, demonstrating their accuracy.

6 Conclusion

This study introduces a framework that is model-agnostic and utilizes contrastive learning to ensure motion consistency and potential ranking for tracking interactions. Additionally, a novel metric has been developed to accurately quantify the interactive properness. The performance of our framework surpasses baselines by a significant margin, and can be seamlessly integrated into existing prediction models to enhance their performance. In future work, we aim to extend our methods to dense vehicle traffic scenarios where interactions between cars are constant, but less random due to the stricter constraints of traffic rules. Ultimately, the proposed methods can be integrated into self-driving technology as a crucial module for collision avoidance and path planning. Acknowledgments. This work is supported by the National Natural Science Foundation of China, Project 62106156, and Starting Fund of South China Normal University.

References 1. Alahi, A., Goel, K., Ramanathan, V., Robicquet, A., Fei-Fei, L., Savarese, S.: Social LSTM: human trajectory prediction in crowded spaces. In: CVPR (2016) 2. Boykov, Y., Veksler, O., Zabih, R.: Markov random fields with efficient approximations. In: CVPR (1998) 3. Cuturi, M.: Sinkhorn Distances: lightspeed computation of optimal transport. In: NeurIPS, pp. 2292–2300 (2013) 4. Dzabraev, M., Kalashnikov, M., Komkov, S., Petiushko, A.: MDMMT: multidomain multimodal transformer for video retrieval. In: CVPRW (2021) 5. Fang, L., Jiang, Q., Shi, J., Zhou, B.: TPNet: trajectory proposal network for motion prediction. In: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2020)


6. Fathi, A., Hodgins, J.K., Rehg, J.M.: Social interactions: a first-person perspective. In: CVPR (2012) 7. Gao, J., Sun, C., Zhao, H., Shen, Y., Anguelov, D., Li, C., Schmid, C.: VectorNet: Encoding HD maps and agent dynamics from vectorized representation. In: CVPR (2020) 8. Graves, A.: Generating sequences with recurrent neural networks. ArXiv abs/1308.0850 (2013) 9. Gupta, A., Johnson, J., Fei-Fei, L., Savarese, S., Alahi, A.: Social GAN: socially acceptable trajectories with generative adversarial networks. In: CVPR (2018) 10. Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Comput. 9, 1735–80 (1997) 11. Kosaraju, V., et al.: Social-BiGAT: multimodal trajectory forecasting using bicycle-GAN and graph attention networks. In: NeurIPS (2019) 12. Kosaraju, V., Sadeghian, A., Mart´ın-Mart´ın, R., Reid, I.D., Rezatofighi, H., Savarese, S.: Social-BiGAT: multimodal trajectory forecasting using bicycle-GAN and graph attention networks. In: NIPS (2019) 13. Lee, N., Choi, W., Vernaza, P., Choy, C.B., Torr, P.H.S., Chandraker, M.: Desire: distant future prediction in dynamic scenes with interacting agents. In: CVPR (2017) 14. Lerner, A., Chrysanthou, Y., Lischinski, D.: Crowds by example. Comput. Graph. Forum (2007) 15. Liang, J., Jiang, L., Niebles, J.C., Hauptmann, A.G., Fei-Fei, L.: Peeking into the future: predicting future person activities and locations in videos. In: CVPRW (2019) 16. Manfredi, M., Vezzani, R., Calderara, S., Cucchiara, R.: Detection of static groups and crowds gathered in open spaces by texture classification. Pattern Recogn. Lett. 44, 39–48 (2014) 17. Mohamed, A., Qian, K., Elhoseiny, M., Claudel, C.: Social-STGCNN: a social spatio-temporal graph convolutional neural network for human trajectory prediction. In: CVPR (2020) 18. Pellegrini, S., Ess, A., Schindler, K., van Gool, L.: You’ll never walk alone: modeling social behavior for multi-target tracking. In: ICCV (2009) 19. Radford, A., et al.: Learning transferable visual models from natural language supervision. In: ICML (2021) 20. Sadeghian, A., Kosaraju, V., Sadeghian, A., Hirose, N., Rezatofighi, H., Savarese, S.: Sophie: an attentive GAN for predicting paths compliant to social and physical constraints. In: CVPR (2019) 21. Setti, F., Cristani, M.: Evaluating the group detection performance: The GRODE metrics. IEEE Trans. Pattern Anal. Mach. Intell. 41(3), 566–580 (2019) 22. Shi, H., Hayat, M., Wu, Y., Cai, J.: ProposalCLIP: unsupervised open-category object proposal generation via exploiting clip cues. In: CVPR (2022) 23. Shi, L., et al.: Sparse graph convolution network for pedestrian trajectory prediction. In: CVPR (2021) 24. Solera, F., Calderara, S., Cucchiara, R.: Structured learning for detection of social groups in crowd. In: AVSS (2013) 25. Sun, Q., Huang, X., Gu, J., Williams, B.C., Zhao, H.: M2I: from factored marginal trajectory prediction to interactive prediction. In: CVPR (2022) 26. Vaswani, A., et al.: Attention is all you need. In: NIPS (2017) 27. Vemula, A., Muelling, K., Oh, J.: Social attention: modeling attention in human crowds. In: ICRA (2018)


28. Wang, Z., et al.: CRIS: CLIP-driven referring image segmentation. In: CVPR (2022) 29. Weiss, Y., Freeman, W.: On the optimality of solutions of the max-product beliefpropagation algorithm in arbitrary graphs. IEEE Trans. Inf. Theory 47(2), 736–744 (2001) 30. Yu, C., Ma, X., Ren, J., Zhao, H., Yi, S.: Spatio-temporal graph transformer networks for pedestrian trajectory prediction. In: ECCV (2020) 31. Zhang, P., Ouyang, W., Zhang, P., Xue, J., Zheng, N.: SR-LSTM: state refinement for LSTM towards pedestrian trajectory prediction. In: CVPR (2019) 32. Zhao, H., et al.: TNT: target-driven trajectory prediction. arXiv preprint arXiv:2008.08294 (2020)

Three-Dimensional Medical Image Fusion with Deformable Cross-Attention

Lin Liu1,2, Xinxin Fan1,2, Chulong Zhang1, Jingjing Dai1, Yaoqin Xie1, and Xiaokun Liang1(B)

1 Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
[email protected]
2 University of Chinese Academy of Sciences, Beijing 100049, China

L. Liu and X. Fan contributed equally to this work.

Abstract. Multimodal medical image fusion plays an instrumental role in several areas of medical image processing, particularly in disease recognition and tumor detection. Traditional fusion methods tend to process each modality independently before combining the features and reconstructing the fusion image. However, this approach often neglects the fundamental commonalities and disparities between multimodal information. Furthermore, the prevailing methodologies are largely confined to fusing two-dimensional (2D) medical image slices, leading to a lack of contextual supervision in the fusion images and subsequently, a decreased information yield for physicians relative to three-dimensional (3D) images. In this study, we introduce an innovative unsupervised feature mutual learning fusion network designed to rectify these limitations. Our approach incorporates a Deformable Cross Feature Blend (DCFB) module that facilitates the dual modalities in discerning their respective similarities and differences. We have applied our model to the fusion of 3D MRI and PET images obtained from 660 patients in the Alzheimer’s Disease Neuroimaging Initiative (ADNI) dataset. Through the application of the DCFB module, our network generates high-quality MRI-PET fusion images. Experimental results demonstrate that our method surpasses traditional 2D image fusion methods in performance metrics such as Peak Signal to Noise Ratio (PSNR) and Structural Similarity Index Measure (SSIM). Importantly, the capacity of our method to fuse 3D images enhances the information available to physicians and researchers, thus marking a significant step forward in the field. The code will soon be available online. Keywords: Image Fusion

· Three-Dimensional · Deformable Attention

1 Introduction

Multimodal image fusion represents a crucial task in the field of medical image analysis. By integrating information from diverse imaging modalities,


image fusion leverages the complementary characteristics of each technique. Each modality inherently focuses on distinct physiological or pathological traits. Hence, an effective fusion of these attributes can yield a more holistic and intricate image, thereby easing and enhancing physicians’ decision-making processes during diagnosis and treatment [1]. For instance, the significance of Magnetic Resonance Imaging (MRI) and Positron Emission Tomography (PET) in image fusion is particularly noteworthy. MRI provides excellent soft tissue contrast and high-resolution anatomical structure information, while PET can present images of metabolic activity and biological processes [2]. The resultant image, derived from the fusion of MRI and PET, provides a comprehensive perspective, encompassing detailed anatomical structures alongside metabolic function data. This amalgamation plays a vital role in early disease or tumor detection and localization and significantly influences subsequent treatment strategies [3]. Traditionally, medical image fusion primarily utilizes multi-scale transformations in the transform domain, which typically comprises three steps. First, the images from each modality are subjected to specific transformations, such as wavelet [4–6] or pyramid transformations [7,8], yielding a series of multi-scale images. Then, these multi-scale images at the same scale level are analyzed and selected to retain the most representative information. Finally, through an inverse transformation, these multi-scale images are amalgamated into a novel image. Beyond these methods, sparse representation has also been applied in image fusion [9–11]. Nevertheless, image fusion tasks face significant hurdles, primarily due to the absence of a “gold standard” or “ground truth” that could encapsulate all modality information-ideally, a comprehensive reference image. Traditional fusion techniques often falter when dealing with high-dimensional data, particularly when encountering noise and complex data distributions across modalities. Moreover, the design of fusion rules remains manual, resulting in suboptimal generalization and unresolved semantic conflicts between different modality images [12]. Deep learning, with its inherent capacity for automatic feature learning and multilayer abstraction, is poised to mitigate these challenges, enabling more accurate and interpretable image fusion. Despite the progress, existing medical image fusion techniques primarily focus on two-dimensional (2D) slice fusion, which presents clear limitations. Medical images are predominantly three-dimensional (3D) signals, and 2D fusion approaches often neglect inter-slice context information. This oversight leads to a degree of misinterpretation of spatial relationships crucial for decoding complex anatomical structures. It is in this context that the potential benefits of 3D fusion become salient. By incorporating all 3D of space, 3D fusion can deliver more accurate localization information-a critical advantage in applications requiring precision, such as surgical planning and radiation therapy. Equally important, 3D fusion affords a panoramic view, enabling physicians to inspect and analyze anatomical structures and physiological functions from any perspective, thereby acquiring more comprehensive and nuanced information.


In this paper, we focus on the fusion of PET and MRI medical images, although the proposed methodology is generalizable to other imaging modalities. Our contributions can be delineated as follows:

1) We break new ground by applying a deep learning-based framework for 3D medical image fusion.
2) We introduce a Deformable Cross-Feature Fusion module that adjusts the correspondence information between the two modalities via Positional Relationship Estimation (PRE) and cross-fuses the features of the two modalities, thus facilitating image feature fusion.
3) We evaluate our approach on publicly available datasets, and our method yields state-of-the-art results quantitatively, based on Peak Signal-to-Noise Ratio (PSNR) and Structural Similarity Index Measure (SSIM). Qualitatively, our technique, even within the same 2D slice, surpasses baseline methods focusing on 2D fusion by not only retaining ample PET information but also integrating MRI structural data. The structural information discerned by our approach aligns more closely with the original MRI image.

2 Methods

2.1 Overview of Proposed Method

In this study, we utilized a U-shaped architecture for MRI-PET image fusion, shown in Fig. 1. The network employed a dual-channel input, with MRI and PET 3D image information fed separately. To reduce the parameter count of the 3D network, we applied Patch Embedding using DC2Fusion to both inputs. The resulting patch images were then passed through the Multi-modal Feature Fusion (MMFF) module. MMFF consisted entirely of Cross Fusion Blend (CFB) blocks at different scale levels. Notably, MMFF exhibited a fully symmetrical network structure, allowing the MRI branch to learn PET image information and the PET branch to acquire MRI image information. This symmetrical information interaction was facilitated by the CFB block, which had two input streams: input flow A and input flow B. In the MRI branch, MRI image information served as input flow A, while PET image information served as input flow B. The CFB block enabled the interaction and fusion of features between the two inputs. Consequently, the MRI branch adjusted its own features after perceiving certain characteristics in the PET image and outputted MRI image feature information. When the CFB block operated in the PET branch, input flow A and input flow B reversed their order compared to the MRI branch, enabling the adjustment of PET image features. After passing through the MMFF module, all branches entered a Fusion Layer composed of convolutions to merge the features and reconstruct the fusion image.

2.2 Deformable Cross Feature Blend (DCFB)

Positional Relationship Estimation. The DCFB block takes two inputs, input flow A (I_A) and input flow B (I_B), as shown in Fig. 2(a). At the Deformable Fusion stage, the two inputs obtain the relative


Fig. 1. Overall architecture of the proposed model comprising Patch Embedding and a fully mirrored symmetric Multi-modal Feature Fusion module, followed by a Fusion Layer consisting of fully convolutional operations.

positional deviations (offsets) between I_A and I_B in terms of their corresponding features' absolute positions. To calculate the relative positional deviations between the two images, we introduce depth-wise convolution. Depth-wise convolution partitions the features into groups, prioritizing the calculation of positional deviations within each group and then integrating the positional deviations across multiple groups. This process allows us to obtain the positional deviation of a point in one image with respect to the corresponding point in the other image. The process can be represented as follows:

F_I = \mathrm{Concat}(F_{I_A}, F_{I_B}), \quad F_{I_A} \in \mathbb{R}^{C_1 \times H \times W \times D}, \ F_{I_B} \in \mathbb{R}^{C_2 \times H \times W \times D}    (1)

\mathrm{inner\_offset} = \mathrm{Concat}(\mathrm{Conv}_1(F_I^1), \mathrm{Conv}_2(F_I^2), \ldots, \mathrm{Conv}_{C_1+C_2}(F_I^{C_1+C_2}))    (2)

\mathrm{offset} = \mathrm{Conv}_{1\times1\times1}(\mathrm{inner\_offset})    (3)

Here, F_{I_A} and F_{I_B} are feature maps from I_A and I_B, respectively. By concatenating F_{I_A} and F_{I_B}, we obtain the feature map F_I ∈ R^{(C_1+C_2) × H × W × D}. We employ depth-wise separable convolution with the number of groups set to C_1 + C_2. Based on the calculation of depth-wise separable convolution, the positional relationships between any point and its k-neighborhood (determined by the kernel size of the convolution) can be determined for each feature map. To search for the positional relationships between any two points in a given feature map, we only need to use the inner offset for computation. At this point, the obtained inner offset represents only the relative positional deviations within each feature map. To obtain the relative positional deviations between corresponding points across multiple modalities, we apply a 1 × 1 × 1 convolution along the channel (C) dimension. This allows points Point_1 and Point_2 in the two feature maps to establish positional relationships based on absolute coordinates, enabling Point_1 to locate the position of its corresponding point in the other feature map. It is worth noting that offset is a three-channel feature map, where each channel represents the positional deviation in the x, y, and z directions, respectively. After obtaining the positional deviation offset, we apply it to I_B: I_B is sampled on offset to obtain the position-corrected I_B', expressed as follows:


Fig. 2. Overall architecture and specific implementation details of the Cross Feature Blend module: (a) PRE is employed to determine the deviation between corresponding points of the two modalities, optimizing the receptive field shape of Cross Attention. (b) The Cross Attention module learns the correlation between the features of the two modalities and fuses their respective characteristics. (c) The Cross Feature Blend module primarily consists of PRE and Deformable Cross Attention.

I_B' = \mathrm{Resample}(I_B, \mathrm{offset})    (4)
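A minimal sketch of Eqs. (1)–(4), assuming the offset channels are ordered (x, y, z) and that the feature maps and the volume being resampled share the same spatial size; the module and argument names are hypothetical, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class PositionalRelationshipEstimation(nn.Module):
    """Depth-wise conv (Eq. (2)) -> 1x1x1 conv to a 3-channel offset (Eq. (3)) -> resample (Eq. (4))."""

    def __init__(self, c1: int, c2: int, kernel_size: int = 3):
        super().__init__()
        c = c1 + c2
        # Eq. (2): grouped (depth-wise) 3D convolution, one filter per channel.
        self.depthwise = nn.Conv3d(c, c, kernel_size, padding=kernel_size // 2, groups=c)
        # Eq. (3): 1x1x1 convolution mixing channels into a 3-channel offset field (x, y, z).
        self.pointwise = nn.Conv3d(c, 3, kernel_size=1)

    def forward(self, feat_a, feat_b, vol_b):
        # Eq. (1): concatenate the two modality feature maps along channels.
        feat = torch.cat([feat_a, feat_b], dim=1)               # (N, C1+C2, D, H, W)
        offset = self.pointwise(self.depthwise(feat))           # (N, 3, D, H, W), voxel offsets
        # Eq. (4): resample I_B at the displaced coordinates (identity grid + offset).
        n, _, d, h, w = offset.shape
        zz, yy, xx = torch.meshgrid(
            torch.arange(d, device=offset.device),
            torch.arange(h, device=offset.device),
            torch.arange(w, device=offset.device), indexing="ij")
        coords = torch.stack([xx, yy, zz], dim=-1).float()      # (D, H, W, 3) in voxel units
        coords = coords.unsqueeze(0) + offset.permute(0, 2, 3, 4, 1)
        sizes = torch.tensor([w, h, d], device=offset.device, dtype=coords.dtype)
        grid = 2.0 * coords / (sizes - 1) - 1.0                 # normalize to [-1, 1] for grid_sample
        return F.grid_sample(vol_b, grid, align_corners=True)   # position-corrected I_B'
```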

This means that a point p on I_B, after undergoing the deviation offset_p, obtains a new coordinate p', and p' represents the same anatomical location on both I_A and I_B'. In other words, the anatomical content at the same absolute positions of I_B' and I_A is similar.

Cross Attention. To enable information exchange between I_A and I_B, we introduce the Cross Attention module, as shown in Fig. 2(b). The Cross Attention module is a component commonly used in computer vision to establish connections between different spatial or channel positions. In our implementation,


we adopt the window attention mechanism from the Swin Transformer. However, unlike the Swin Transformer, which takes a single input, Cross Attention takes the features F_{I_A} and F_{I_B} from two images as input. The two feature maps are initially divided into windows, resulting in F_{W_A} and F_{W_B}, respectively. To extract the information in F_{W_B} that is relevant to F_{W_A}, we utilize F_{W_B} as the "Key" and F_{W_A} as the "Query." To reconstruct the fused feature map for I_A, we employ F_{W_A} as the "Value." The following equations describe the process:

Q = F_{W_A} \cdot W_Q, \quad K = F_{W_B} \cdot W_K, \quad V = F_{W_A} \cdot W_V    (5)
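A minimal sketch of this windowed cross-attention, using the projections of Eq. (5) followed by the standard scaled dot-product attention spelled out in Eq. (6) below; window partitioning, shifting, and the relative position bias of Swin-style attention are omitted, and the tensor layout is an assumption rather than the authors' implementation.

```python
import torch.nn as nn


class WindowCrossAttention(nn.Module):
    """Queries/values come from the F_WA windows, keys from the F_WB windows (Eq. (5))."""

    def __init__(self, dim: int, num_heads: int):
        super().__init__()
        assert dim % num_heads == 0
        self.num_heads = num_heads
        self.scale = (dim // num_heads) ** -0.5
        self.q = nn.Linear(dim, dim, bias=False)   # W_Q
        self.k = nn.Linear(dim, dim, bias=False)   # W_K
        self.v = nn.Linear(dim, dim, bias=False)   # W_V
        self.proj = nn.Linear(dim, dim)

    def forward(self, f_wa, f_wb):
        # f_wa, f_wb: (num_windows, tokens_per_window, dim), already window-partitioned.
        b, n, c = f_wa.shape
        h = self.num_heads
        q = self.q(f_wa).reshape(b, n, h, c // h).transpose(1, 2)
        k = self.k(f_wb).reshape(b, n, h, c // h).transpose(1, 2)
        v = self.v(f_wa).reshape(b, n, h, c // h).transpose(1, 2)
        attn = (q @ k.transpose(-2, -1)) * self.scale            # softmax(Q K^T / sqrt(d_k))
        attn = attn.softmax(dim=-1)
        out = (attn @ v).transpose(1, 2).reshape(b, n, c)
        return self.proj(out)                                    # fused feature map for I_A
```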

Here, W_Q, W_K, and W_V represent the transformation matrices for the "Query," "Key," and "Value" features, respectively. Finally, the Cross Attention calculation is performed by combining Q, K, and V as follows:

\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left( \frac{Q \cdot K^{T}}{\sqrt{d_k}} \right) \cdot V    (6)

Here, d_k denotes the dimension of the key vectors, and softmax indicates the softmax activation function applied along the dimension of the query. The output of the Cross Attention operation represents the fusion of information from F_{W_A} and F_{W_B}, enabling the exchange of relevant information between the two modalities.

DCFB. Although we have applied positional deviation estimation to correct I_B, it is important to note that, due to the computational characteristics of convolutions and the accumulation of errors, the most effective region for an individual point is its local neighborhood. Beyond this neighborhood, the accuracy of the estimated offset gradually decreases. Therefore, the fundamental role of the offset is to overcome the limited receptive field of the window attention in the Swin Transformer architecture. Moreover, the expansion of the receptive field achieved by the offset follows an irregular pattern; refer to Fig. 3 for an illustration. Assuming I_A represents the features extracted from PET images and I_B represents the features from MRI images, our goal is to query the corresponding points in I_B and their surrounding regions based on I_A. However, utilizing the Swin Transformer architecture alone may result in the inability to find corresponding points between the two windows, even with the shift-window operation. To address this issue, we propose the Deformable Cross Attention module, which incorporates an offset into the Swin Transformer module. This offset enables the positional correction of I_B beforehand, followed by regular window partitioning on the adjusted I_B'. While a similar approach as shown in Fig. 2(c) is employed for I_A and I_B', the windows on I_B' are not regular windows but are shaped by the features perceived from I_A, due to the positional correction applied to I_B.

Loss Functions. In our image fusion algorithm, we utilize three distinct loss functions: the Structural Similarity Index (SSIM) L_{SSIM}, Normalized Cross-Correlation (NCC) L_{NCC}, and the L1 loss L_1. These loss functions contribute


Fig. 3. Deformable Window Cross Attention achieves equivalent receptive field deformation through PRE, in contrast to Swin Window Cross Attention.

equally to the overall loss with a weight ratio of 1:1:1. Here, we provide a brief explanation of each loss function and their respective contributions to the fusion process. The SSIM loss (L_{SSIM}) is defined as follows:

L_{SSIM}(I, J) = \frac{(2\mu_I\mu_J + C_1)(2\sigma_{IJ} + C_2)}{(\mu_I^2 + \mu_J^2 + C_1)(\sigma_I^2 + \sigma_J^2 + C_2)}    (7)

where I and J represent the input and target images, respectively, μ represents the mean, σ represents the standard deviation, and C_1 and C_2 are small constants for numerical stability. The Normalized Cross-Correlation loss (L_{NCC}) is defined as follows:

L_{NCC}(I, J) = \frac{\sum_{x,y,z} \big( I(x,y,z) - \bar{I} \big) \cdot \big( J(x,y,z) - \bar{J} \big)}{\sqrt{\sum_{x,y,z} \big( I(x,y,z) - \bar{I} \big)^2 \cdot \sum_{x,y,z} \big( J(x,y,z) - \bar{J} \big)^2}}    (8)

where I and J represent the input and target images, respectively, and the summation is performed over corresponding image patches. The L1 loss (L_1) is defined as follows:

L_1(I, J) = \| I - J \|_1    (9)

where I and J represent the input and target images. By combining these three loss functions with equal weights, our image fusion algorithm achieves a balance


between multiple objectives. The SSIM loss function focuses on preserving the image structure, the NCC loss function emphasizes structural consistency, and the L1 loss function aims to maintain detail and color consistency. Through the integration of these diverse loss functions, our algorithm comprehensively optimizes the generated image, thereby enhancing the quality of the fusion results. To avoid biased fusion towards any individual modality solely for the purpose of obtaining favorable loss metrics quickly, we depart from traditional fusion strategies by adopting an end-to-end training approach in our method. To address this concern, we introduce a specific loss function based on SSIM, which enables us to mitigate the risk of overemphasizing a single modality during the image fusion process solely to optimize loss metrics. This ensures a more balanced fusion outcome and enhances the overall performance of our method:

L_{pair} = \big\| \mathrm{SSIM}(Fusion, MRI) - \mathrm{SSIM}(Fusion, PET) \big\|    (10)

The formulation of the total loss is as follows:

L = L_{SSIM}(Fusion, MRI) + L_{SSIM}(Fusion, PET) + L_{NCC}(Fusion, MRI) + L_{NCC}(Fusion, PET) + L_1(Fusion, MRI) + L_1(Fusion, PET) + L_{pair}    (11)
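A minimal sketch of Eqs. (10)–(11); `ssim_loss` and `ncc_loss` are assumed callables that return minimizable versions of Eqs. (7)–(8) (e.g., 1 − SSIM), which leaves the pair term of Eq. (10) unchanged because the constant cancels in the difference.

```python
def total_fusion_loss(fusion, mri, pet, ssim_loss, ncc_loss):
    """Sketch of the total objective in Eq. (11) with the balancing term of Eq. (10).
    Inputs are tensors; ssim_loss / ncc_loss return scalar tensors."""
    l1 = lambda a, b: (a - b).abs().mean()                       # Eq. (9)
    loss = (ssim_loss(fusion, mri) + ssim_loss(fusion, pet)      # SSIM terms, weight 1
            + ncc_loss(fusion, mri) + ncc_loss(fusion, pet)      # NCC terms, weight 1
            + l1(fusion, mri) + l1(fusion, pet))                 # L1 terms, weight 1
    # Eq. (10): penalize favouring one modality over the other.
    l_pair = (ssim_loss(fusion, mri) - ssim_loss(fusion, pet)).abs()
    return loss + l_pair
```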

3 Experiments

3.1 Data Preparation and Evaluation Metrics

In this study, we evaluate the performance of our MRI-PET image fusion method on the ADNI-2 dataset [13]. The ADNI-2 dataset consists of 660 participants, each with both MRI and PET images acquired within a maximum time span of three months. We employed the SyN [14] registration algorithm to align the MRI and PET images to the MNI152 standard space, resulting in images with dimensions 182 × 218 × 182. Subsequently, we extracted regions of interest in the form of MRI-PET pairs and resampled them to a size of 128 × 128 × 128. Among the collected data, 528 pairs were used as the training set, while 66 pairs were allocated for validation, and another 66 pairs were designated for testing purposes. To ensure a fair comparison, we conducted a comparative evaluation between our proposed approach and various 2D image fusion methods. For evaluation, we selected the same slice from the MRI, PET, and the fusion image predicted by our method. Objective metrics, SSIM, PSNR, Feature Mutual Information (FMI), and Normalized Mutual Information (NMI), were used to compare the selected slice with alternative methods. This comprehensive assessment enabled us to evaluate the effectiveness and performance of our proposed method against existing approaches in the field of image fusion.


Table 1. Quantitative results of the MRI-PET fusion task. The proposed method demonstrates exceptional performance in terms of the SSIM metric and PSNR compared with the 2D-based strategies, establishing a new state-of-the-art benchmark.

Method            | 2D-/3D- | PSNR ↑        | SSIM ↑       | NMI ↑        | FMI ↑
SwinFuse [15]     | 2D      | 14.102±1.490  | 0.623±0.020  | 1.275±0.019  | 0.817±0.009
MATR [16]         | 2D      | 15.997±1.190  | 0.658±0.035  | 1.451±0.009  | 0.795±0.018
DILRAN [17]       | 2D      | 19.028±0.821  | 0.690±0.033  | 1.301±0.015  | 0.806±0.019
DC2Fusion (ours)  | 3D      | 20.714±1.377  | 0.718±0.033  | 1.312±0.012  | 0.807±0.020

3.2 Implementation Details

The proposed method was implemented using PyTorch [18] on a PC equipped with an NVIDIA TITAN RTX GPU and an NVIDIA RTX A6000 GPU. All models were trained for fewer than 100 epochs using the Adam optimization algorithm, with a learning rate of 1 × 10−4 and a batch size of 1. During training, the MR and PET datasets were augmented with random rotation. Our DC2Fusion model consists of three downsampling stages. To handle the limitations imposed by the image size, we employed a window partition strategy with a window size of (2, 2, 2) at each level. Additionally, the number of attention heads employed at each level is 3, 6, 12, 24. We conducted a comparative analysis between our method and the 2D image-based fusion methods SwinFuse [15], MATR [16], and DILRAN [17]. Each of these methods was evaluated locally using the loss functions and hyper-parameter settings provided by their respective authors. The MATR [16] and DILRAN [17] methods employ image-pair training, where the input images correspond to slices from MRI and PET, respectively. The SwinFuse method [15] does not provide an image-pair end-to-end training approach. Therefore, we adapted its fusion process, as described in the paper, which originally fuses infrared and natural images. During testing, a dual-path parallel approach was employed, where MRI and PET slices were separately encoded. Subsequently, a specific fusion strategy (such as the L1 normalization recommended by the authors) was applied to fuse the features of both modalities. Finally, the fused features were passed through the Recon module for reconstruction.

3.3 Results and Analysis

Figure 4 illustrates the fusion results of DC2Fusion in comparison with other methods. DILRAN, SwinFuse, and our method all exhibit the ability to preserve the highlighted information from PET while capturing the structural details from MRI. However, the MATR algorithm, which lacks explicit constraints on image fusion during training, demonstrates a complete bias towards learning PET image information and severely lacks the ability to learn from MRI information when applied to the ADNI dataset. Consequently, the model quickly achieves high SSIM (F usion, P ET ) values during training, thereby driving the


overall loss function gradient. In this experiment, our focus is to compare the fusion image quality of DILRAN, SwinFuse, and DC2Fusion. As shown in Fig. 4, DILRAN exhibits less prominent MRI structural information compared to SwinFuse and DC2Fusion. Moreover, in terms of preserving PET information, SwinFuse and DC2Fusion offer better contrast. For further comparison, Fig. 5 provides additional details. PET images primarily consist of high-signal information with minimal structural details. DILRAN, SwinFuse, and DC2Fusion effectively fuse the structural information from MRI. In comparison to DC2Fusion, both DILRAN and SwinFuse display slightly thinner structures in the gyri region, which do not entirely match the size of the gyri displayed in the MRI. However, DC2Fusion achieves better alignment with the gyri region of the MRI image. Nevertheless, the structural clarity of DC2Fusion is inferior to SwinFuse. This discrepancy arises because the SwinFuse algorithm does not utilize patch embedding operations or downsampling layers, enabling it to maintain high clarity throughout all layers of the model. However, this approach is limited by GPU memory constraints and cannot be extended to 3D image fusion. Therefore, in our method, we made trade-offs in terms of clarity by introducing techniques such as patch embedding to reduce memory usage and accomplish 3D image fusion tasks.

Fig. 4. Comparative images of MRI, PET, and other fusion methods on 3 representative PET and MRI image pairs. From left to right: MRI image, PET image, MATR [16], Dilran [17], SwinFuse [15], and DC2Fusion.

Table 1 presents a summary of the performance metrics for different methods, including 2D-/3D-fusion, namely PSNR, SSIM, NMI, and FMI. Among the evaluated methods, SwinFuse, a 2D fusion technique, achieved a remarkable FMI score of 0.817. However, it is important to note that FMI solely focuses on feature-based evaluation and does not provide a comprehensive assessment of image fusion quality. SwinFuse obtained lower scores in other metrics, namely PSNR (14.102), SSIM (0.623), and NMI (1.275), suggesting potential limitations in preserving image details, structure, and information content. In contrast, our


Fig. 5. Qualitative comparison of the proposed DC2Fusion with 3 typical and stateof-the-art methods on a representative PET and MRI image pair: (a) MRI image, (b) PET image, (c) MATR [16], (d) DILRAN [17], (e) SwinFuse [15], (f) DC2Fusion.

method, specifically designed for 3D image fusion, exhibited superior performance. DC2Fusion achieved a notable FMI score of 0.807, indicating consistent and correlated features in the fused images. Moreover, DC2Fusion outperformed other methods in terms of PSNR (20.714), SSIM (0.718), and NMI (1.312), highlighting its effectiveness in preserving image details, structural similarity, and information content. Overall, the results demonstrate the competitive performance of our method in medical image fusion, particularly in 3D fusion tasks. These findings emphasize the potential of DC2Fusion in enhancing image quality and preserving structural information, thereby providing valuable insights for further research in the field of medical image fusion. As shown in Fig. 6, we present fusion metrics for all samples in our test cases. It is evident that DC2Fusion consistently outperforms other methods in terms of PSNR and SSIM for each sample. However, the NMI metric reveals an anomaly with significantly higher results for MATR. This can be attributed to MATR’s fusion results being heavily biased towards the PET modality, resulting in a high consistency with PET images and consequently yielding inflated average values. Nevertheless, these values lack meaningful reference significance. Excluding MATR, our proposed method also achieves better results than other methods in terms of the NMI metric. With regards to the FMI metric, our method does not exhibit a noticeable distinction compared to other methods. Considering Fig. 5, although our results may not possess the same level of clarity as other methods, our approach still obtains relatively high FMI scores in terms of these detailed


features. This observation demonstrates that our method preserves the feature information during image fusion, even at the cost of reduced clarity.

Fig. 6. Illustration of the average fusion metrics (PSNR, SSIM, NMI, and FMI) for each sample in our test cases. These metrics are computed by comparing the fusion images with both the MRI and PET images separately. The reported values represent the average scores across all samples.

4 Conclusion

In this study, we propose a novel architecture for 3D MRI-PET image fusion. Our approach, DCFB, achieves cross-modal PRE and provides a deformable solution for enlarging the receptive field during cross-attention. This facilitates effective information exchange between the two modalities. Experimental results demonstrate the outstanding performance of our proposed 3D image fusion method. Although the image fusion quality of our method is currently limited by GPU constraints, resulting in slightly lower clarity compared to 2D fusion, we have provided a novel solution for 3D medical image fusion tasks. Furthermore, we anticipate applying this framework to various modalities in the future, thereby broadening its applicability across different modal combinations.

References 1. James, A.P., Dasarathy, B.V.: Medical image fusion: a survey of the state of the art. Inf. Fusion 19, 4–19 (2014) 2. Huang, B., Yang, F., Yin, M., Mo, X., Zhong, C.: A review of multimodal medical image fusion techniques. Comput. Math. Methods Med. 2020 (2020)


3. Yin, M., Liu, X., Liu, Y., Chen, X.: Medical image fusion with parameter-adaptive pulse coupled neural network in nonsubsampled shearlet transform domain. IEEE Trans. Instrum. Meas. 68(1), 49–64 (2018) 4. Singh, R., Khare, A.: Multimodal medical image fusion using daubechies complex wavelet transform. In: 2013 IEEE Conference on Information & Communication Technologies, pp. 869–873. IEEE (2013) 5. Chabi, N., Yazdi, M., Entezarmahdi, M.: An efficient image fusion method based on dual tree complex wavelet transform. In: 2013 8th Iranian Conference on Machine Vision and Image Processing (MVIP), pp. 403–407. IEEE (2013) 6. Talbi, H., Kholladi, M.K.: Predator prey optimizer and DTCWT for multimodal medical image fusion. In: 2018 International Symposium on Programming and Systems (ISPS), pp. 1–6. IEEE (2018) 7. Du, J., Li, W., Xiao, B., Nawaz, Q.: Union Laplacian pyramid with multiple features for medical image fusion. Neurocomputing 194, 326–339 (2016) 8. Burt, P.J., Kolczynski, R.J.: Enhanced image capture through fusion. In: 1993 (4th) International Conference on Computer Vision, pp. 173–182. IEEE (1993) 9. Yang, B., Li, S.: Pixel-level image fusion with simultaneous orthogonal matching pursuit. Inf. Fusion 13(1), 10–19 (2012) 10. Liu, Y., Wang, Z.: Simultaneous image fusion and denoising with adaptive sparse representation. IET Image Proc. 9(5), 347–357 (2015) 11. Liu, Y., Chen, X., Ward, R.K., Wang, Z.J.: Image fusion with convolutional sparse representation. IEEE Signal Process. Lett. 23(12), 1882–1886 (2016) 12. Hill, P., Al-Mualla, M.E., Bull, D.: Perceptual image fusion using wavelets. IEEE Trans. Image Process. 26(3), 1076–1088 (2016) 13. Jack, Jr, C.R.E.A.: The Alzheimer’s disease neuroimaging initiative (ADNI): MRI methods. J. magnetic resonance imaging: JMRI 27(4), 685–691 (2008). https:// doi.org/10.1002/jmri.21049. Accessed 11 May 2023 14. Avants, B.B., Epstein, C.L., Grossman, M., Gee, J.C.: Symmetric diffeomorphic image registration with cross-correlation: evaluating automated labeling of elderly and neurodegenerative brain. Med. Image Anal. 12(1), 26–41 (2008) 15. Wang, Z., Chen, Y., Shao, W., Li, H., Zhang, L.: SwinFuse: a residual swin transformer fusion network for infrared and visible images. IEEE Trans. Instrum. Meas. 71, 1–12 (2022) 16. Tang, W., He, F., Liu, Y., Duan, Y.: MATR: multimodal medical image fusion via multiscale adaptive transformer. IEEE Trans. Image Process. 31, 5134–5149 (2022) 17. Zhou, M., Xu, X., Zhang, Y.: An attention-based multi-scale feature learning network for multimodal medical image fusion. ArXiv abs/2212.04661 (2022) 18. Imambi, S., Prakash, K.B., Kanagachidambaresan, G.R.: PyTorch. In: Prakash, K.B., Kanagachidambaresan, G.R. (eds.) Programming with TensorFlow. EICC, pp. 87–104. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-57077-4 10

Handling Class Imbalance in Forecasting Parkinson's Disease Wearing-off with Fitness Tracker Dataset

John Noel Victorino(B), Sozo Inoue, and Tomohiro Shibata

Graduate School of Life Science and Systems Engineering, Kyushu Institute of Technology, Kitakyushu, Fukuoka 808-0135, Japan
[email protected]

Abstract. Parkinson’s disease (PD) patients experience the “wearingoff phenomenon”, where their symptoms resurface before they can take the following medication. As time passes, the duration of the medicine’s efficacy reduces, leading to discomfort among PD patients. Therefore, patients and clinicians must meticulously observe and document symptom changes to administer appropriate treatment. Forecasting the PD wearing-off phenomenon is challenging due to the class imbalance from the difficulty documenting the phenomenon. This paper compares different approaches for handling class imbalance in forecasting the PD wearing-off phenomenon using the fitness tracker and smartwatch dataset (wearing-off dataset): oversampling, undersampling, and combining the two. Previous studies reported the potential use of commercially-worn fitness tracker datasets to predict and forecast wearing-off periods. However, some participants’ high false positives and negatives have been observed with the developed models [16, 17]. This paper compares different approaches to handling class imbalance in the wearing-off dataset. First, changes in the data collection process and tools were made during the data collection phase, as the nursing staff struggled with the data collection tool. Second, different existing oversampling and undersampling techniques were used to improve the ratio of wearing-off labels to non-wearing-off instances. Finally, adjustments to forecast probabilities were applied due to the resampling in the second step. The results showed that an XGBoost model trained with an oversampled dataset using the SMOTE algorithm increased the F1 and AUCROC curve scores by 7.592% and 0.754%, respectively, compared to the based XGBoost model. The models in this paper only used the last hour’s data to forecast the next hour’s wearing-off. The change in the amount of input data also improved the F1 and AUC-ROC curve scores compared with the models that used the current period or the previous day’s data [16, 17]. These improvements in forecasting wearing-off can help PD patients and clinicians monitor, record, and prepare for wearing-off periods, as we aim to lessen the false positives and negatives in the forecasting results. c The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024  B. Luo et al. (Eds.): ICONIP 2023, CCIS 1964, pp. 564–578, 2024. https://doi.org/10.1007/978-981-99-8141-0_42

Keywords: Parkinson's Disease · Wearing-off Phenomenon · Machine Learning · Class Imbalance

1 Introduction

Parkinson’s disease (PD) affects motor and non-motor functions due to the loss of dopamine-producing cells in the brain. Cardinal PD symptoms include tremors, bradykinesia (slow movement), muscle stiffness, and postural instability. Additionally, PD patients suffer non-motor symptoms like mood swings, sleep disturbances, speech problems, and cognitive difficulties. PD, lacking a cure, profoundly impacts daily life and quality of life (QoL) [7]. The “wearing-off phenomenon” is a major source of discomfort for PD patients. Many PD patients rely on Levodopa treatment (L-dopa) to alleviate their symptoms temporarily by increasing dopamine levels in the brain. However, with time, these symptoms resurface earlier than expected, significantly impacting patients. As a result, continuous monitoring and consultation with healthcare providers are essential for adjusting treatment or exploring alternative options [1,4,13,15]. This paper addresses the class imbalance challenge in the wearing-off dataset, which can result from difficulties documenting the phenomenon, its inherent rarity, or unique patterns in some PD patients. We used and compared existing methods, including SMOTE-based and ADASYN oversampling, undersampling, and their combination, to mitigate class imbalance when forecasting the wearing-off phenomenon in PD patients using fitness trackers and smartwatch data. Additionally, we discuss challenges in improving data collection by nursing care staff for PD patients. This paper contributes to answering the following research questions: 1. How can data collection be improved for nursing care staff to increase the number of wearing-off labels in the dataset? 2. Which class imbalance techniques performed well in forecasting wearing-off in the next hour? 3. How does reweighting the forecast probabilities due to resampling affect the forecasting models? Our results showed that an XGBoost model trained with an oversampled dataset using SMOTE and a resampled dataset using SMOTETomek increased the F1 and AUC-ROC curve scores by 7.592% and 0.754%, respectively. Models in this paper used the last hour’s data from the PD patient to forecast the next hour’s wearing-off. This paper’s best models also improved the F1 score and AUC-ROC curve compared with the models that used the current period’s and the previous day’s data [18]. The contributions of this paper can be summarized in these two points. 1. Expanded Data Collection: We collected additional datasets from new PD patients by optimizing the data collection process to accommodate daily

566

J. N. Victorino et al.

visits by nursing staff. This adjustment extended the collection period from seven days to two months, resulting in a valuable dataset. 2. Class Imbalance Handling: We addressed the inherent class imbalance in the wearing-off dataset, improving the model’s forecasting performance for F1 and AUC-ROC scores. Utilizing established oversampling, undersampling techniques, and their combination, our model demonstrated improved scores compared to the base model and prior studies. These improvements in forecasting wearing-off can help PD patients and clinicians monitor, record, and prepare for wearing-off periods. The results of this paper aim to lessen the false positives and false negatives in forecasting wearing-off among PD patients. This paper is organized into different sections. Section 2 narrates improvement in the data collection procedure and how the authors managed to increase the wearing-off labels of the dataset. We also provide descriptions of the included techniques to handle class imbalance as well as the primary metrics of this study. Next, Sect. 3 answers the research questions while Sect. 4 highlights the key results of this study. Finally, Sect. 5 summarizes the contributions of this paper, with future opportunities and challenges in this research area.

2 Methodology

This section describes (1) how the nursing staff collected the new datasets from the PD patients, (2) how the data were processed, and (3) how each class imbalance technique was applied before training a forecasting model for wearing-off. This paper compares different approaches to handling the class imbalance inherent in the wearing-off forecasting task. The following subsections explain the data collection and processing and how the class imbalance techniques were prepared.

2.1 Data Collection

Data collection commenced with a similar approach to previous studies on wearing-off among PD patients [16–18]. Three new PD participants were equipped with Garmin Venu SQ 2 smartwatches through their nursing staff. These smartwatches recorded heart rate, stress score, sleep data, and step count throughout the study. The wearing-off periods, on the other hand, were recorded using a smartphone application. Notable changes were made in this round of data collection. First, the nursing staff, rather than the PD patients, recorded wearing-off periods, entering the data into the smartphone application during their daily visits. Second, the Japanese Wearing-Off Questionnaire (WoQ-9) was replaced with a custom form based on the nursing staff's feedback. This new form allowed nursing staff to input wearing-off periods and drug intake times by the hour, simplifying data collection during their visits. However, the granularity of wearing-off periods based on specific symptoms was temporarily omitted (Fig. 1).


Fig. 1. The previous smartphone application version collects the Japanese Wearing-off Questionnaire (WoQ-9) responses on whether the PD patient experiences any of the nine symptoms. This version collects one wearing-off period at a time [16, 18].

Fig. 2. The adjusted smartphone application collects wearing-off periods by the hour. This version collects multiple wearing-off periods daily, making data collection by nursing staff easier since they only visit the patients once daily.

The datasets obtained from each tool remained the same as in previous studies [16–18], excluding the symptoms during wearing-off periods and the effects of drug intake. The Garmin Health API provides access to the Garmin Venu SQ 2 dataset. Heart rate (HR) in beats per minute (bpm) was recorded every 15 s, while the total step count was aggregated over 15-min intervals. Garmin's estimated stress scores were reported every three minutes, with values ranging from 0 to 100, and with "−1" and "−2" indicating insufficient data and excessive motion, respectively. Finally, sleep duration data for each sleep stage were provided daily [5,12,14].

The nursing staff utilized a smartphone app to request and record essential information from PD patients, including wearing-off periods, drug intake times, and additional patient details. These patient-specific details encompassed age, sex, Hoehn and Yahr (H&Y) scale, the Japan Ministry of Health, Labor, and Welfare's classification of living dysfunction (JCLD) for the PD stage [2,9], and the Parkinson's Disease Questionnaire (PDQ-8) for self-assessed quality of life [8]. These datasets are summarized in Table 2.

Table 1. Garmin Venu SQ 2 smartwatch datasets via Garmin Health API.

Dataset                               | Interval                       | Description
Heart rate                            | 15-s                           | Beats per minute
Steps                                 | 15-min                         | Cumulative step count per interval, with 0 as the lowest value
Stress score                          | 3-min                          | Estimated stress score [6]: 0–25: resting state, 26–50: low stress, 51–75: medium stress, 76–100: high stress, −1: not enough data to detect stress, −2: too much motion
Sleep stages with each sleep duration | Varying interval for each date | Start and end time for each sleep stage: Light sleep, Rapid eye movement (REM) sleep, Deep sleep, Awake [5]

Table 2. The smartphone application collected datasets.

Data Type              | Description
Wearing-off periods    | Hourly check on whether wearing-off occurred or not
Drug intake time       | Hourly check on whether medicine was taken or not
Basic Information      | Age and gender
Participant's PD stage | Hoehn & Yahr Scale (H&Y), Japan Ministry of Health, Labor, & Welfare's classification of living dysfunction (JCLD)
PDQ-8                  | PD-specific QoL measurement: 0–100%, with 100% as the worst QoL


The university conducted an ethical review of this research project. PD patients were eligible for this study if they met the following criteria:

1. The patient experiences wearing-off periods.
2. The patient can use the smartwatch.
3. The patient does not have any serious illnesses or symptoms that could affect them during the data collection.

Upon meeting these conditions, the nursing care facility selected the PD patients. The research objectives were communicated to the nursing staff, who informed the PD patients and obtained their written consent. Subsequently, each nursing staff member received a smartwatch and a smartphone for a two-month data collection period, during which PD patients could withdraw voluntarily. Participants were required to wear the smartwatch continuously, even during bathing and sleep, to record sleep duration data. Apart from the noted limitations and recommendations, there were no specific restrictions during the data collection phase.

2.2 Data Processing

The dataset collected from the PD patients' Garmin Venu SQ 2 smartwatches and the wearing-off labels recorded by their nursing staff were processed to produce the combined wearing-off dataset. This section explains the data processing and the important differences from previous studies.

On the one hand, several cleaning and preprocessing steps were applied to the raw Garmin Venu SQ 2 datasets. Missing values were replaced with "−1," consistent with Garmin's missing value representation [5]. The datasets were then resampled to uniform 15-min intervals to standardize the varying intervals from the Garmin Health API, as depicted in Table 1. Missing values resulting from resampling were filled with the last available data. Each record was matched with its corresponding sleep dataset based on the calendar date. Additional sleep-related features were calculated using the equations below [16].

Total non-REM duration = Deep sleep duration + Light sleep duration   (1)
Total sleep duration = Total non-REM duration + REM sleep duration   (2)
Total non-REM percentage = Total non-REM duration / Total sleep duration   (3)
Sleep efficiency = Total sleep duration / (Total sleep duration + Total awake duration)   (4)
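The sketch below shows how Eqs. (1)–(4) might be computed with pandas; the dataframe layout and column names (daily stage durations in seconds) are illustrative assumptions, not the actual Garmin Health API field names.

```python
import pandas as pd

# Hypothetical daily sleep-stage durations in seconds; the column names are
# placeholders for illustration, not Garmin Health API field names.
sleep = pd.DataFrame(
    {"deep": [5400, 6000], "light": [14400, 13200],
     "rem": [5400, 4800], "awake": [1800, 2400]},
    index=pd.to_datetime(["2022-12-01", "2022-12-02"]),
)

sleep["nonrem_total"] = sleep["deep"] + sleep["light"]               # Eq. (1)
sleep["sleep_total"] = sleep["nonrem_total"] + sleep["rem"]          # Eq. (2)
sleep["nonrem_pct"] = sleep["nonrem_total"] / sleep["sleep_total"]   # Eq. (3)
sleep["efficiency"] = sleep["sleep_total"] / (
    sleep["sleep_total"] + sleep["awake"]
)                                                                    # Eq. (4)
print(sleep)
```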

On the other hand, the raw smartphone application dataset underwent similar preprocessing. Wearing-off periods were resampled to 15-min timestamps, with "1" indicating the presence of wearing-off and "0" denoting its absence. Drug intake periods were processed in the same way to align with the 15-min timestamps.

Finally, the two datasets were merged by matching the 15-min timestamps. The hour of the day and the day of the week were also extracted. For the day of the week, the values "0" to "6" were matched with "Monday" to "Sunday." For the hour of the day, a sine and cosine transformation was applied. The following features were used to develop the wearing-off forecasting model (a sketch of this merging and feature construction follows the list below).

x1: Heart rate (HR)
x2: Step count (Steps)
x3: Stress score (Stress)
x4: Awake duration during the estimated sleep period (Awake)
x5: Deep sleep duration (Deep)
x6: Light sleep duration (Light)
x7: REM sleep duration (REM)
x8: Total non-REM sleep duration (NonREMTotal)
x9: Total sleep duration (Total)
x10: Total non-REM sleep percentage (NonREMPercentage)
x11: Sleep efficiency (SleepEfficiency)
x12: Day of the week (TimestampDayOfWeek)
x13: Sine value of the hour of the day (TimestampHourSin)
x14: Cosine value of the hour of the day (TimestampHourCos)
y: Wearing-off
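A minimal sketch of the merging, resampling, and cyclical hour encoding described above, assuming hypothetical dataframe and column names; the aggregation choices (mean for sensor values, max for labels) are illustrative rather than the authors' exact implementation.

```python
import numpy as np
import pandas as pd

def build_features(garmin: pd.DataFrame, labels: pd.DataFrame) -> pd.DataFrame:
    """Merge sensor data and wearing-off labels on a 15-min grid (sketch).

    Both inputs are assumed to be timestamp-indexed; `labels` is assumed to be
    a one-column DataFrame holding the 0/1 wearing-off flag.
    """
    # Resample to uniform 15-min intervals and carry the last value forward.
    g = garmin.resample("15min").mean().ffill()
    y = labels.resample("15min").max().fillna(0)   # 1 = wearing-off present

    df = g.join(y, how="inner").fillna(-1)         # -1 marks missing values

    # Day of week (0 = Monday ... 6 = Sunday) and cyclical hour-of-day encoding.
    df["TimestampDayOfWeek"] = df.index.dayofweek
    hour = df.index.hour + df.index.minute / 60.0
    df["TimestampHourSin"] = np.sin(2 * np.pi * hour / 24.0)
    df["TimestampHourCos"] = np.cos(2 * np.pi * hour / 24.0)
    return df
```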

2.3 Class Imbalance Techniques and Experiments

In this section, we discuss the wearing-off forecasting model. The data split for the training and test sets, the metrics used for evaluation, and the different class imbalance techniques considered in this paper are also presented. Data processing and model construction mainly used Pandas, Scikit-learn, XGBoost, and imbalanced-learn [3,10].

Individualized wearing-off forecasting models were tailored for each PD participant. This approach prioritizes personalized models over general ones, acknowledging the distinct nature of PD experiences among patients. Each PD participant's dataset was then sequentially partitioned into training and test sets: each participant's final two days of data constituted the test set, while the remaining data formed the training set. The F1 score, the area under the Receiver Operating Characteristic curve (AUC-ROC), and the area under the Precision-Recall curve (AUC-PR) were the main metrics used to evaluate the models, considering the class imbalance challenge addressed by this paper. Accuracy, Precision, and Recall were also reported.

This paper defined the wearing-off forecasting model as follows:

y_{t+1} = f(M(X, y, w)),   M = [X_t y_t; X_{t−1} y_{t−1}; …; X_{t−w} y_{t−w}]   (5)


To forecast the wearing-off in the next hour, y_{t+1}, the model accepts a matrix M of the 14 features and the wearing-off label at previous time steps, from the current time until w time steps before the current time. This paper used the last hour's data (w = 1). The implementation has four records per hour since the data was processed in 15-min intervals; thus, the matrix M has w × 4 records, or M_{t−1} … M_t, to forecast the next hour's wearing-off, y_{t+1}. This paper compares four existing algorithms for handling class imbalance, namely SMOTE-based techniques and the ADASYN algorithm, where k neighbors was set to five for the SMOTE-based techniques. The following experiments were compared (a sketch of how they could be assembled follows the list):

1. Base XGBoost Model
2. Oversampled SMOTE Model
3. Undersampled+Oversampled SMOTE Model
4. SMOTETomek Resampled Model
5. Oversampled ADASYN Model
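A sketch of how these experiments could be set up with imbalanced-learn and XGBoost is shown below. The k_neighbors = 5 setting follows the text; the specific undersampler, the sampling ratios, and the XGBoost hyper-parameters are placeholders, since the paper does not list them here.

```python
from imblearn.combine import SMOTETomek
from imblearn.over_sampling import ADASYN, SMOTE
from imblearn.under_sampling import RandomUnderSampler
from xgboost import XGBClassifier

# One resampler (or None for the base model) per experiment; ratios are illustrative.
EXPERIMENTS = {
    "base": None,
    "smote": SMOTE(k_neighbors=5, random_state=0),
    "under+smote": [RandomUnderSampler(sampling_strategy=0.5, random_state=0),
                    SMOTE(k_neighbors=5, random_state=0)],
    "smotetomek": SMOTETomek(smote=SMOTE(k_neighbors=5, random_state=0), random_state=0),
    "adasyn": ADASYN(n_neighbors=5, random_state=0),
}

def fit_experiment(sampler, X_train, y_train):
    """Resample the training split (if requested) and fit an XGBoost classifier."""
    Xr, yr = X_train, y_train
    if sampler is not None:
        for step in (sampler if isinstance(sampler, list) else [sampler]):
            Xr, yr = step.fit_resample(Xr, yr)
    model = XGBClassifier(eval_metric="logloss")  # cross-entropy objective
    model.fit(Xr, yr)
    return model
```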

After resampling the dataset with the different techniques, we also adjusted the forecasted class probabilities with Eq. (6), which accounts for the real ratio of normal states and wearing-off states [11]:

adjProb = a / (a + b)   (6)

where
a = predProb · (originalProb / resampledProb),
b = (1 − predProb) · (1 − originalProb) / (1 − resampledProb),
adjProb := adjusted wearing-off probability,
predProb := predicted wearing-off probability,
originalProb := N1 / (N1 + N0), the wearing-off ratio in the original dataset,
resampledProb := N′1 / (N′1 + N′0), the wearing-off ratio in the resampled dataset.

Finally, the models were trained on a computer with an Intel i7-6700 CPU running at 3.40 GHz, four cores, and 16 GB of RAM. A cross-entropy loss function was used to train the models.
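A direct transcription of Eq. (6) as a small function; the worked example assumes a SMOTE-balanced training set (resampled ratio 0.5) and an original wearing-off ratio of 8%, which are illustrative numbers only.

```python
import numpy as np

def adjust_probability(pred_prob, original_prob, resampled_prob):
    """Re-weight forecast probabilities after resampling, following Eq. (6)."""
    pred_prob = np.asarray(pred_prob, dtype=float)
    a = pred_prob * (original_prob / resampled_prob)
    b = (1.0 - pred_prob) * (1.0 - original_prob) / (1.0 - resampled_prob)
    return a / (a + b)

# A model trained on a 50/50 SMOTE-balanced set predicts 0.80 for wearing-off,
# while the original data contained about 8% wearing-off instances.
print(adjust_probability(0.80, original_prob=0.08, resampled_prob=0.50))  # ~0.26
```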

3 Results

This section presents the results of the modified data collection process and the improvements in the wearing-off forecasting models' performance. Three additional PD patients participated in the modified collection process. Despite the improvements in data collection tools and protocols, class imbalance persisted, underscoring its inherent nature in this dataset. Data collection spanned two months, from December 2022 to January 2023, with PD patients initiating data collection at varying times. Consequently, the three new participants contributed different amounts of data, resulting in an average collection duration of 14.31 ± 14.59 days across all participants. Among the new participants, Participant 13, the third oldest at age 70 and with the lowest quality-of-life outlook (PDQ-8 = 71.88%), contributed data for 61 days. In contrast, Participant 11 provided only 13 days of data due to a delayed start and an early conclusion of data collection. The average PD stage remained consistent, with values of 2.69 ± 0.75 on the H&Y scale and 1.69 ± 0.48 on the JCLD scale. These scores correspond to symptoms such as limb shaking, muscle stiffness, mild physical disability, and some inconvenience in daily life [2,9] (Table 3).

Table 3. The participants' demographics. The last three participants are the new PD patients enrolled for this paper; the first 10 PD patients were from the previous data collection [18].

Participant | Age   | Gender | H&Y  | JCLD | PDQ-8  | Number of Collection Days
1           | 43    | Female | 2    | 1    | 37.50% | 9
2           | 38    | Female | 3    | 2    | 65.63% | 6
3           | 49    | Female | 3    | 2    | 34.38% | 10
4           | 69    | Female | 3    | 1    | 78.13% | 10
5           | 49    | Female | 2    | 2    | 37.50% | 8
6           | 56    | Female | 2    | 2    | 37.50% | 9
7           | 48    | Male   | 3    | 1    | 15.63% | 6
8           | 77    | Male   | 3    | 2    | 34.38% | 11
9           | 84    | Male   | 4    | 2    | 59.38% | 11
10          | 58    | Male   | 3    | 2    | 25.00% | 10
11          | 53    | Male   | 3    | 2    | 34.38% | 13
12          | 82    | Male   | 1    | 1    | 18.75% | 22
13          | 70    | Female | 3    | 2    | 71.88% | 61
Average     | 59.69 |        | 2.69 | 1.69 | 42.31% | 14.31
Std. Dev.   | 15.16 |        | 0.75 | 0.48 | 20.04% | 14.59

3.1 How Can Data Collection Be Improved for Nursing Care Staff to Increase the Number of Wearing-off Labels in the Dataset?

At the beginning of this study’s data collection, the smartphone application utilized the WoQ-9 form, as depicted in Fig. 1. However, the initial number of reported wearing-off labels was notably low, prompting discussions with the nursing staff. These discussions revealed that nursing staff visited PD patients only once during their scheduled visit days, providing essential care and assistance. Consequently, recording detailed wearing-off periods, including precise

Handling Class Imbalance in Forecasting Parkinson’s Disease Wearing-off

573

start and end times, posed an additional burden for them. Moreover, the process of recording each wearing-off period through the smartphone application was time-consuming, as it involved multiple steps. To address these issues, as illustrated in Fig. 2, nursing staff were tasked with checking wearing-off periods only within the day of their visit. This streamlined the process, reducing the number of steps and simplifying the data collection procedure, resulting in improved wearing-off data collection (Table 4). Table 4. Wearing-off Percentage. The percentage of wearing-off to the normal state was computed after processing the dataset into 15-min intervals.

3.2

Participant Number of Collection Days

Percentage of Normal Status (y = 0)

Percentage of Wearing-off Status (y = 1)

1

9

92.36

7.64

2

6

97.35

2.65

3

10

86.77

13.23

4

10

90.49

9.51

5

8

92.32

7.68

6

9

89.82

10.19

7

6

98.82

1.18

8

11

87.97

12.03

9

11

92.99

7.01

10

10

86.77

13.23

11

13

90.71

9.30

12

22

92.52

7.48

13

61

93.53

6.47

Which Class Imbalance Techniques Performed Well in Forecasting Wearing-Off in the Next Hour?

Among the experiments that handle class imbalance, the Oversampled SMOTE model outperformed the base model and the rest of the models, with an F1 score of 22.31 ± 23.30% and an AUC-ROC of 71.92 ± 14.06%. Notably, the Oversampled SMOTE model's F1 score increased by 7.592% compared to the base model. However, it did not have the best AUC-PRC among the experiments (AUC-PRC = 23.49 ± 23.65%). Interestingly, the SMOTETomek Resampled model performed identically to the SMOTE model with respect to the F1 score, AUC-ROC, and AUC-PRC.


Table 5. Wearing-off forecast model performance comparing different techniques for handling class imbalance. The reported metric scores were averaged across participants.

Experiment                           | F1 Score       | AUC-ROC        | AUC-PRC
Base XGBoost Model                   | 14.72 ± 25.49% | 71.16 ± 16.71% | 24.19 ± 24.83%
Oversampled SMOTE Model              | 22.31 ± 23.30% | 71.92 ± 14.06% | 23.49 ± 23.65%
Undersampled+Oversampled SMOTE Model | 20.09 ± 16.46% | 70.35 ± 17.96% | 26.01 ± 21.25%
SMOTETomek Resampled Model           | 22.31 ± 23.30% | 71.92 ± 14.06% | 23.49 ± 23.65%
ADASYN Oversampled Model             | 16.52 ± 22.31% | 69.85 ± 15.73% | 23.69 ± 24.28%

Experiment                           | Precision      | Recall         | Accuracy
Base XGBoost Model                   | 14.92 ± 24.93% | 16.3 ± 27.63%  | 91.61 ± 3.87%
Oversampled SMOTE Model              | 21.46 ± 23.07% | 26.65 ± 26.87% | 89.75 ± 6.11%
Undersampled+Oversampled SMOTE Model | 12.75 ± 13.59% | 87.16 ± 12.44% | 41.27 ± 25.43%
SMOTETomek Resampled Model           | 21.46 ± 23.07% | 26.65 ± 26.87% | 89.75 ± 6.11%
ADASYN Oversampled Model             | 16.31 ± 20.71% | 20.11 ± 26.61% | 90.03 ± 5.34%

3.3 How Does Reweighting the Forecast Probabilities Due to Resampling Affect the Forecasting Models?

Reweighting the forecasted wearing-off probabilities takes into account the real ratio of wearing-off before each resampling technique was applied. The Adj. Oversampled SMOTE and Adj. SMOTETomek Resampled models still had the highest F1 score and AUC-ROC, at 16.28 ± 24.01% and 71.92 ± 14.06%, respectively. Although these scores dropped from those in Table 5, the reweighted probabilities still outperform the base XGBoost model. In addition, the Adj. Oversampled SMOTE and Adj. SMOTETomek Resampled models had a lower standard deviation across the participants, despite having a lower AUC-PRC than the Adj. ADASYN Oversampled model (Table 6).


Table 6. Model performance after reweighting the forecasted wearing-off probabilities. The reweighting takes into account the real ratio of wearing-off in the original dataset (Eq. 6).

Experiment                                | F1 Score       | AUC-ROC        | AUC-PRC
Adj. Oversampled SMOTE Model              | 16.28 ± 24.01% | 71.92 ± 14.06% | 23.49 ± 23.65%
Adj. Undersampled+Oversampled SMOTE Model | 10.32 ± 22.58% | 71.92 ± 14.06% | 23.49 ± 23.65%
Adj. SMOTETomek Resampled Model           | 16.28 ± 24.01% | 71.92 ± 14.06% | 23.49 ± 23.65%
Adj. ADASYN Oversampled Model             | 14.76 ± 26.07% | 69.85 ± 15.73% | 23.69 ± 24.28%

Experiment                                | Precision      | Recall         | Accuracy
Adj. Oversampled SMOTE Model              | 25.47 ± 34.91% | 15.66 ± 23.17% | 92.26 ± 4.4%
Adj. Undersampled+Oversampled SMOTE Model | 13.28 ± 28.35% | 8.68 ± 18.93%  | 93.26 ± 3.52%
Adj. SMOTETomek Resampled Model           | 25.47 ± 34.91% | 15.66 ± 23.17% | 92.26 ± 4.4%
Adj. ADASYN Oversampled Model             | 17.61 ± 28.42% | 14.1 ± 24.69%  | 92.63 ± 4.03%

4 Discussions

The models of this study outperformed the base XGBoost model in terms of F1 score and AUC-ROC across the 13 participants' forecast models. Both the SMOTE and SMOTETomek techniques yielded improved results compared to the base model. The choice between these two methods hinges on the acceptability criteria when deploying the forecasting models. To elaborate, oversampling with SMOTE generates synthetic wearing-off records to match the number of non-wearing-off records, achieving a balanced ratio. However, it may not accurately reflect the infrequency of wearing-off periods during the day. Even with an extended data collection period and more wearing-off labels, class imbalance still persisted in the dataset, as evidenced by the new PD patients. Conversely, SMOTETomek resampling increases the number of wearing-off records while simultaneously removing non-wearing-off records in close proximity to the wearing-off records. This approach aligns better with the inherent characteristics of the wearing-off phenomenon.


Lastly, the reweighted models offer an alternative for obtaining more realistic forecast probabilities. These models adjust the forecasted wearing-off probabilities based on the class ratios of the original and resampled datasets, enhancing their suitability for real-world applications.

Compared with the previous study, the results presented here exhibit better performance. In that study, an MLP model utilized data from the current time, while a CNN model incorporated one day of data to forecast wearing-off in the next hour. Table 7 shows a notable F1-score increase of 17.208% over the MLP model and 6.728% over the CNN model. Additionally, the AUC-ROC scores improved by 2.78% and 11.40%, respectively. These improvements suggest that this study's models achieved higher F1 and AUC-ROC scores even when trained with only the PD patients' last-hour data.

Table 7. Comparison of wearing-off forecast models with the previous study. This paper's model accepts the last hour of data, while in the previous study the MLP model accepts the current time and the CNN model accepts the last day of data [18].

Model                   | Input          | F1 Score       | AUC-ROC
Oversampled SMOTE Model | M_{t−1} … M_t  | 22.31 ± 23.30% | 71.92 ± 14.06%
MLP Model               | M_t            | 5.10 ± 11.37%  | 69.14 ± 10.60%
CNN Model               | M_{t−24} … M_t | 15.58 ± 21.31% | 60.52 ± 30.26%

Model                   | Precision      | Recall         | Accuracy
Oversampled SMOTE Model | 21.46 ± 23.07% | 26.65 ± 26.87% | 89.75 ± 6.11%
MLP Model               | 6.25 ± 13.50%  | 6.15 ± 15.83%  | 92.49 ± 3.62%
CNN Model               | 18.61 ± 31.98% | 25.06 ± 39.10% | 85.09 ± 12.41%

5 Conclusion

This study addressed class imbalance within the wearing-off dataset among PD patients. Two main strategies were used to handle the class imbalance. First, the data collection was adjusted to improve data quantity. Second, existing class imbalance techniques were compared against the base model and the previous study's results.

Data collection was adjusted to involve the nursing staff in recording the data, which proved to be more practical than relying on the PD patients. Notably, this modification extended the data collection period from seven days in the previous study to up to two months, yielding a more extensive dataset from the new PD patients.

Despite the prolonged data collection period and improved tools, class imbalance in the wearing-off dataset remained apparent, as evident in the percentage of wearing-off instances. Consequently, the second contribution of this paper compared various techniques for managing class imbalance, encompassing the creation of synthetic wearing-off records, the reduction of non-wearing-off records, and a combination of both. The experimental results showed that both an oversampled dataset and a resampled dataset, which involved a combination of oversampling and undersampling, led to a 7.592% increase in F1 score and a higher AUC-ROC compared to the base XGBoost forecasting model. Adjusting the forecast probabilities also improved on the base model's metric scores, albeit with slightly lower scores than the unadjusted resampled models. When comparing these outcomes to the previous study, they also demonstrated enhancements, suggesting that the resampling techniques effectively improved the forecasting outcomes.

These advancements in addressing the class imbalance in the wearing-off dataset can potentially enhance the management of wearing-off and PD among patients. The practical application of the proposed models in real-world settings should be explored to validate their utility. Furthermore, considering alternative decision thresholds below the default 50% might further optimize the F1 scores, providing room for calibration based on data or expert knowledge. Overall, these forecasting models have the potential to significantly improve the management of wearing-off and PD in clinical practice.

Acknowledgements. The authors thank the Kyushu Institute of Technology and the Sasakawa Health Foundation Grant Number 2022A-002 for supporting this study.

References

1. Antonini, A., et al.: Wearing-off scales in Parkinson's disease: critique and recommendations: scales to assess wearing-off in PD. Mov. Disord. 26(12), 2169–2175 (2011). https://doi.org/10.1002/mds.23875
2. Bhidayasiri, R., Tarsy, D.: Parkinson's disease: Hoehn and Yahr scale. In: Bhidayasiri, R., Tarsy, D. (eds.) Movement Disorders: A Video Atlas, pp. 4–5. Current Clinical Neurology, Humana Press (2012). https://doi.org/10.1007/978-1-60327-426-5_2
3. Chen, T., Guestrin, C.: XGBoost: a scalable tree boosting system. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 785–794 (2016). https://doi.org/10.1145/2939672.2939785, https://arxiv.org/abs/1603.02754
4. Colombo, D., et al.: The "gender factor" in wearing-off among patients with Parkinson's disease: a post hoc analysis of DEEP study (2015). https://doi.org/10.1155/2015/787451
5. Garmin: Garmin vivosmart 4. https://buy.garmin.com/en-US/US/p/605739
6. Garmin: Vívosmart 4 - heart rate variability and stress level. https://www8.garmin.com/manuals/webhelp/vivosmart4/EN-US/GUID-9282196F-D969-404D-B678-F48A13D8D0CB.html
7. Heyn, S., Davis, C.P.: Parkinson's disease early and later symptoms, 5 stages, and prognosis. https://www.medicinenet.com/parkinsons_disease/article.htm
8. Jenkinson, C., Fitzpatrick, R., Peto, V., Greenhall, R., Hyman, N.: The PDQ-8: development and validation of a short-form Parkinson's disease questionnaire. Psychol. Health 12(6), 805–814 (1997). https://doi.org/10.1080/08870449708406741
9. Kashiwara, K., Takeda, A., Maeda, T.: Learning Parkinson's disease together with patients: toward a medical practice that works with patients, with Q&A. Nankodo, Tokyo, Japan (2013). https://honto.jp/netstore/pd-book_25644244.html
10. Lemaître, G., Nogueira, F., Aridas, C.K.: Imbalanced-learn: a Python toolbox to tackle the curse of imbalanced datasets in machine learning. J. Mach. Learn. Res. 18(17), 1–5 (2017). https://jmlr.org/papers/v18/16-365.html
11. Matloff, N.: Statistical Regression and Classification: From Linear Models to Machine Learning. Chapman and Hall/CRC, New York (2017). https://doi.org/10.1201/9781315119588
12. Mouritzen, N.J., Larsen, L.H., Lauritzen, M.H., Kjær, T.W.: Assessing the performance of a commercial multisensory sleep tracker. PLoS ONE 15(12), e0243214 (2020). https://doi.org/10.1371/journal.pone.0243214
13. Stacy, M., et al.: End-of-dose wearing off in Parkinson disease: a 9-question survey assessment. Clin. Neuropharmacol. 29(6), 312–321 (2006). https://doi.org/10.1097/01.WNF.0000232277.68501.08
14. Stevens, S., Siengsukon, C.: Commercially-available wearable provides valid estimate of sleep stages (P3.6-042). Neurology 92 (2019). https://n.neurology.org/content/92/15_Supplement/P3.6-042
15. Stocchi, F., et al.: Early DEtection of wEaring off in Parkinson disease: the DEEP study. Parkinsonism Relat. Disord. 20(2), 204–211 (2014). https://doi.org/10.1016/j.parkreldis.2013.10.027
16. Victorino, J.N., Shibata, Y., Inoue, S., Shibata, T.: Predicting wearing-off of Parkinson's disease patients using a wrist-worn fitness tracker and a smartphone: a case study. Appl. Sci. 11(16), 7354 (2021). https://doi.org/10.3390/app11167354
17. Victorino, J.N., Shibata, Y., Inoue, S., Shibata, T.: Understanding wearing-off symptoms in Parkinson's disease patients using wrist-worn fitness tracker and a smartphone. Procedia Comput. Sci. 196, 684–691 (2022). https://doi.org/10.1016/j.procs.2021.12.064
18. Victorino, J.N., Shibata, Y., Inoue, S., Shibata, T.: Forecasting Parkinson's disease patients' wearing-off using wrist-worn fitness tracker and smartphone dataset (2023)

Real-Time Instance Segmentation and Tip Detection for Neuroendoscopic Surgical Instruments

Rihui Song1, Silu Guo1, Ni Liu1, Yehua Ling1, Jin Gong2, and Kai Huang1(B)

1 Sun Yat-sen University, Guangzhou, China
[email protected]
2 Third Affiliated Hospital of Sun Yat-sen University, Guangzhou, China

Abstract. Location information of surgical instruments and their tips can be valuable for computer-assisted surgical systems and robotic endoscope control systems. While real-time methods for instrument segmentation and tip detection have been proposed for minimally invasive abdominal surgeries, the challenges become even greater in minimally invasive neurosurgery due to its narrow operating space and diverse tissue characteristics. In this paper, we introduce a real-time approach for instance segmentation and tip detection of neuroendoscopic surgical instruments. To address the specific requirements of neurosurgery, we design a tailored data augmentation strategy for this field and propose a mask filtering module to eliminate false-positive masks. Our method is evaluated using both a neurosurgical dataset and the EndoVis15' dataset. The experimental results demonstrate that the data augmentation module improves the accuracy of instrument detection and segmentation by up to 12.6%. Moreover, the mask filtering module enhances the precision of instrument tip detection with an improvement of up to 39.51%.

Keywords: Instrument instance segmentation · Neurosurgery · Tip detection · Endoscope image processing

1 Introduction

The locations and tips of surgical instruments are vital for automated endoscope holders [9]. These holders are designed to assist surgeons during surgeries by securely holding the endoscope. In minimally invasive surgery (Mis), surgeons pay close attention to the instruments and their tips. Robotic endoscope control systems aim to automatically adjust the endoscope's position to keep the instruments and their tips within the field of view [15]. Hence, precise knowledge of the locations and tips of surgical instruments is essential for ensuring accurate automatic adjustments.

Existing research on instrument perception based on endoscopic images can be divided into two categories. The first category involves hardware modifications, where customized marks are added to the instruments and detected based on their color or shape [23]. The second category utilizes digital image processing or machine learning techniques to directly perceive the instruments without modifying the hardware. Some studies [3] employ hand-crafted features for instrument detection or segmentation. With the availability of datasets like EndoVis for endoscopic images, recent works [17] make use of deep neural networks to achieve higher accuracy in instrument perception, particularly in minimally invasive abdominal surgery scenarios.

There is a need to adapt the segmentation and tip detection techniques used in minimally invasive abdominal surgery to neurosurgical procedures. Recently, there have been significant advancements in endoscope resolution, accompanied by a reduction in diameter. As a result, the endoscope has become a suitable tool for neurosurgery. Furthermore, compared to traditional microscopes, the endoscope provides superior flexibility. Consequently, endoscopy is increasingly used in minimally invasive neurosurgical procedures. Developing instance segmentation and tip detection methods specifically designed for neurosurgical instruments would greatly facilitate the integration of automatic endoscope holders in neurosurgeries.

Developing a real-time instance segmentation and tip detection method for neurosurgical instruments is a challenging task. In comparison to abdominal surgeries, neurosurgeries have a narrow operating space, leading to short distances between the endoscope, instruments, and tissue. This poses several difficulties. Firstly, reflections of tissue can appear on the instrument's surface, causing the instrument to resemble actual tissue and potentially leading to identification errors. Secondly, the light emitted from the endoscope creates white spots on both the instruments and the tissue; these areas lack valid information, such as color and texture, and present challenges for instrument perception. Lastly, factors like smoke, fog, water, and blood can easily blur the endoscope images. Overcoming these challenges requires modifications to existing real-time instance segmentation and tip detection methods, specifically tailored for neurosurgical instruments.

Aware of the aforementioned challenges, we propose a real-time method for instance segmentation and tip detection of neurosurgical instruments. Considering the need to run our method on resource-constrained devices, we utilize a deep neural network for instance segmentation and a geometry-based tip detection method inspired by the perception approach used in the endoscope robot [9]. To adapt the method for neurosurgery and improve instrument perception accuracy, we made two key modifications. Firstly, we designed a data augmentation strategy specific to neurosurgical scenarios. Secondly, unlike the approach in Gruijthuijsen et al. [9], where tip detection is viewed as a downstream task of segmentation, we leveraged geometric features from the tip detection method to filter out false-positive results from instance segmentation. The performance of our method was evaluated on a neuroendoscopy dataset and the EndoVis15' dataset. The experiments demonstrated that our scheme increased instrument detection and segmentation accuracy by up to 12.6% and improved instrument tip detection precision by up to 39.51%. The main contributions of this work are summarized as follows:

– Design of a specialized data augmentation strategy for neurosurgery.
– Proposal of a module to filter false-positive results in instrument instance segmentation and tip detection.

The paper is organized as follows: Sect. 2 provides a review of related work in the literature. Section 3 presents our approach. Experimental results are provided in Sect. 4. Finally, Sect. 5 concludes the paper.

2 Related Work

Surgical instrument perception is a fundamental task for surgical assistant systems, such as robot-assisted systems and clinical augmented reality applications. Various techniques, including traditional digital image processing methods, machine learning approaches with hand-crafted features, and deep learning neural networks, are employed to identify surgical instruments in images.

The methods using digital image processing techniques can be divided into two categories. The first category simplifies tip localization by manually adding markers to the surgical instruments [23]. The second category utilizes color or geometric features without modifying the surgical instruments [1]. Compared with deep learning methods, these approaches typically require fewer computational resources. Considering that our algorithm operates on a resource-constrained robot platform, our tip detection algorithm falls within the domain of digital image processing methods, utilizing the shape characteristics of surgical instruments.

Machine learning approaches utilizing hand-crafted features have been employed for instrument perception. A two-stage framework has been proposed to accomplish segmentation and detection tasks, employing a linear Support Vector Machine (Svm) to globally learn shape parameters [3]. Traditional machine learning methods have also been utilized for surgical tool identification in laparoscopic images [22]. In contrast, our approach does not directly employ machine learning methods with manually designed features for instrument instance segmentation or tip detection. Instead, we introduce a filtering module that employs an Svm to eliminate false-positive results in the instrument segmentation output.

With the advancement of deep learning techniques, researchers have explored their application in instrument perception as well. One approach combines optical flow tracking with a fully convolutional network (Fcn) to achieve real-time segmentation and tracking of laparoscopic surgical instruments [8]. Kurmann et al. [11] address the tip localization issue by using U-Net for part segmentation. Another study applies a residual neural network to instrument segmentation during laparoscopy and reports a 4% performance improvement on the EndoVis dataset [16]. The utilization of U-Net for instrument segmentation is also demonstrated by constructing a simulation dataset with virtual reality techniques [17]. Overall, deep learning methods have achieved state-of-the-art performance in instrument segmentation tasks. In our own work, we also employed a deep learning method for the segmentation of surgical instrument instances.

Instrument segmentation and tip detection methods have been integrated into endoscope robot systems. In the study by Gruijthuijsen et al. [9], a robot control system based on surgical instrument tracking is introduced. Their system employs Cnn-based real-time tool segmentation networks for instrument segmentation [8], and they propose a tip detection method based on the segmentation results. To mitigate the issue of false-positive tip detection, they apply an instrument graph pruning process that assumes a maximum of two tools with two tips each. Our approach also targets endoscopic robots. We propose a filtering module to eliminate false positives in instrument segmentation. The filtering module not only enhances the accuracy of surgical instrument segmentation but also improves the performance of instrument tip detection.

Image data augmentation is a commonly used technique to improve the performance of deep learning methods. It can be classified into two categories: basic image manipulation and deep learning approaches [21]. Basic image manipulations involve kernel filters, color space transformations, random erasing, geometric transformations, and image mixing. Deep learning approaches include adversarial training, neural style transfer, and Gan-based data augmentation. The data augmentation we propose falls under basic image manipulations. We adjust the basic image operations according to the specific characteristics of neurosurgical procedures.

3 Approach

Our objective is to address the challenges posed by the limited operational space in minimally invasive neurosurgery. To tackle this problem, we focus on two aspects. Firstly, we propose an endoscope image data augmentation strategy for neurosurgical scenarios to improve instrument segmentation performance. Secondly, we introduce a filtering module to eliminate false-positive results in instrument segmentation, enhancing both instrument segmentation accuracy and the performance of the instrument tip detection algorithm.

3.1 Framework

Our framework consists of three stages: (1) Instance Segmentation, (2) Mask Filtering, and (3) Tip Detection. The framework is shown in Fig. 1.

Instance Segmentation. The instance segmentation network adopts a similar structure to Yolact [2], which can be divided into a backbone network, Feature Pyramid Network (Fpn), Protonet, prediction head, Non-Maximum Suppression (Nms) module, assembly, and post-processing modules. The main modifications include adding convolutional block attention modules (Cbam) to the backbone network and customizing the loss functions. The rest of the modules are consistent with Yolact.

Fig. 1. The architecture of the proposed instrument perception framework is composed of three stages. Modules that can be trained are marked in green. (Color figure online)

The network is trained via end-to-end back-propagation guided by the following loss function:

L = λ1 · L_focal + λ2 · L_loc   (1)

L_focal is the binary focal loss, representing the loss for each pixel classification. The binary focal loss relaxes the conditions for training the model to make predictions:

L_focal(p_t) = −α_t (1 − p_t)^γ log(p_t)   (2)

where p_t refers to the probability of a pixel point being classified as foreground:

p_t = P if y = 1, and p_t = 1 − P otherwise   (3)

L_loc denotes the localization loss, which is the smooth L1 loss [4] between the predicted box and the ground truth box.

Instrument Tip Detection. During surgery, it is possible for the tips of instruments to become obscured by tissue, other instruments, or bloodstains, resulting in their invisibility. To effectively control the endoscope robot, the tip of an instrument is redefined as the end of the visible portion. This means that the tip point and the entry point represent the two ends of the instrument within the field of view. Please refer to Fig. 2 for a schematic diagram illustrating this concept.


Fig. 2. The image on the left shows a schematic diagram of Entry Point Searching and Tip Point Searching. The image on the right shows a schematic diagram of instruments and noises.

The instruments in endoscope images are not represented as lines, but rather as areas. As a result, the instrument intersects with the boundary of the endoscope's field of view, creating edges. In this context, all points located on these edges are considered entry points. With this definition, the entry points can be described using Eq. (4):

S_E = S_C ∩ S_B   (4)

S_C, S_B, and S_E are three sets of points. The points of S_C and S_B come from the contour of the instrument and the boundary of the endoscope's field of view, respectively, and their intersection S_E represents the set of entry points. The outline of each segmented instrument area is represented by S_O. However, due to limitations in the segmentation results, the boundaries of the instrument instances may not align perfectly with the actual instrument boundaries, as depicted in Fig. 2. When S_C in Eq. (4) is replaced with S_O for the intersection with S_B, the result is therefore an empty set S_E in most cases. To address this, we approximate the entry point P_E as the outline point nearest to the boundary of the endoscope's field of view, i.e., farthest from the view center O. The approximation of P_E can be described by Eq. (5):

P_E ∈ S_O ∧ ∀P ∈ S_O, ‖O P_E‖ ≥ ‖O P‖   (5)

The instruments used in minimally invasive surgery are typically long and thin, allowing them to navigate through narrow channels in the body. While the shapes of their ends may vary, the tip points consistently tend to be located near the farthest point from the entry point. Hence, the tip point P_T can be calculated using Eq. (6):

P_T ∈ S_O ∧ ∀P ∈ S_O, ‖P_E P_T‖ ≥ ‖P_E P‖   (6)
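A sketch of this geometric search with NumPy and OpenCV is given below. The image centre is taken as O, and the contour extraction step is an assumption of this sketch rather than a detail given in the paper.

```python
import cv2
import numpy as np

def entry_and_tip(mask: np.ndarray):
    """Approximate entry point (Eq. 5) and tip point (Eq. 6) for one instrument mask.

    `mask` is a binary (0/255) uint8 instance mask. The entry point P_E is the
    outline point farthest from the view centre O, and the tip P_T is the
    outline point farthest from P_E.
    """
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
    if not contours:
        return None, None
    outline = max(contours, key=cv2.contourArea).reshape(-1, 2).astype(float)

    h, w = mask.shape[:2]
    center = np.array([w / 2.0, h / 2.0])                                 # O

    entry = outline[np.argmax(np.linalg.norm(outline - center, axis=1))]  # Eq. (5)
    tip = outline[np.argmax(np.linalg.norm(outline - entry, axis=1))]     # Eq. (6)
    return tuple(entry.astype(int)), tuple(tip.astype(int))
```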

Mask Filtering. In minimally invasive neurosurgery, the narrow surgical space frequently gives rise to various interferences in endoscope images, such as white spots, smoke, and bloodstains. These disturbances make instrument instance segmentation algorithms more prone to producing false-positive results, which in turn degrade the accuracy of tip detection algorithms that rely on the segmented masks, ultimately diminishing the precision and stability of robot control. Therefore, we propose a mask filtering module to address this issue.

The areas formed by misclassified pixels are noise instances in the masks; a schematic diagram is shown in Fig. 2. Since the tip detection algorithm is applied to each instance segmentation result, a noise instance generates false-positive tips. Therefore, noise instances should be removed. Based on our observations, instruments in neuroendoscopy images are usually of considerable size to ensure clear visibility to the surgeon. Furthermore, due to the limited field of view under the endoscope, there are intersections between the instruments and the view boundary, which act as entry points for the instruments. To distinguish real instruments from false-positive results, we utilize an Svm (Support Vector Machine) classifier [7]. The area of the instrument region and the distance from the entry point to the center point are utilized as features for this distinction.

These features can be influenced by transformations of the image size. For instance, if the area of an instrument is initially 200, it becomes 50 when the image is scaled down to half its size. To mitigate the impact of image transformation, we redefine the first feature as the ratio of the number of pixels in the instrument mask to the total number of pixels in the image, rather than the raw area of the instrument region:

feature1 = Pixel_tool / (L_w · L_h)   (7)

Pixel_tool is the number of pixels in the instrument instance, and L_w and L_h are the width and height of the image. For the same reason, the second feature is defined as Eq. (8) rather than the raw distance between the entry point and the center point:

feature2 = ‖EO‖ / √(L_w · L_h)   (8)

Here, ‖EO‖ represents the distance between the entry point E and the center point O, where the location of the entry point E is obtained by the entry point searching algorithm.

586

R. Song et al.

To address this issue, we propose a novel data augmentation strategy designed specifically for endoscopic images in the field of neurosurgery. First, we categorized the images in the dataset into two classes: the minority class and the majority class. If an image contained interference, it was assigned to the minority class; otherwise, it was classified as the majority class. The minority class consisted of four situations: Reflection, Overlapping, Multi-tool, and Infrequent Backgrounds. Reflection refers to the phenomenon where a portion of an area in an image reflects the surrounding objects, resulting in a mirror-like effect. Multi-tool indicated images that contained more than two instruments. Overlapping denoted images in which at least one instrument was partially obscured. Infrequent Background encompassed images with rare surgical scenes. For example, if the number of images with the nasal cavity as the background was much smaller than the number of images with the brain as the background in a given dataset, the images with the nasal cavity as the background would be classified as belonging to the Infrequent Background class.

Fig. 3. Overview of data augmentation. Mode A overlays IA and BA , mode B overlays IA , BA and IB , mode C overlays IB and BA , and mode D overlays IA and BB .

The Overview of data augmentation is depicted in Fig. 3. We segment the source images into template images. For instance, Image A and Image B are randomly chosen from the minority class and the majority class, respectively. These images are divided into two parts based on the mask: one contains the background, while the other consists of the instruments. The areas in the instrumentonly images that are empty are filled with zeros, whereas the empty areas in the background images are filled with neighboring pixels. From these two images, four template images are generated. Mode A overlays IA and BA , mode B overlays IA , BA and IB , mode C overlays IB and BA , and mode D overlays IA and BB . Prior to overlaying, images IA and IB are rotated by an angle of θ and multiplied by a transparency coefficient of λ. The value of θ was chosen randomly from the set of [0, 90, 180, 270]. The value of λ was randomly selected from the range of −0.1 to 0.1. Four modes have the ability to generate images that belong to various types of minority classes. If the transparency coefficient, λ, is not zero, the newly synthesized images from all the modes could potentially be classified as images

Instance Segmentation and Tip Detection

587

belonging to the Reflection class, as the background color and texture appear on the instrument’s surface. Mode B allows for the synthesis of new overlapping images. Prior to image stitching, when template images are rotated by an angle, θ, there is a possibility of generating overlapping instruments. Additionally, for the Multi-instrument class, new images can be synthesized using Mode B as well. Following the fusion of IA and IB , the image may contain three or more instruments. In situations where Image A represents a sample from the Infrequent Background class, new samples synthesized using modes A, B, and C would also fall within the Infrequent Background class. Since white spots, which are indicative of Reflection cases, frequently appear in the images, we randomly add one or two white ellipses of different sizes at various positions on the images with a probability of ρ. In our experiment, the value of ρ was set to 0.1.

4

Results and Discussion

Our method was evaluated using a neurosurgical dataset and the EndoVis15 dataset1 . We conducted ablation studies to evaluate the impact of different modules. The qualitative and quantitative results are presented and discussed. The source code for our project is publicly available2 . 4.1

Datasets and Metrics

The neurosurgical dataset utilized in our study consisted of 11 neuroendoscopy videos with a resolution of (resolution: 1080 × 1920, 30 fps) and a frame rate of 30 fps. Videos were sampled at a frequency of 0.2 Hz, resulting in a total of 3,062 images. The images were divided into a ratio of approximately 5:1.25:1, resulting in 2,113 images allocated to the training set, 526 images assigned to the validation set, and 423 images designated for the test set. The images in the training set, validation set, and test set are sourced from different surgical videos. Table 1. Instrument Segmentation Results on EndoVis

1 2

Method

Sensitivity Specificity Balance Accuracy

Fcn-8 s [8]

72.2

95.2

83.7

Fcn-8 s+OF [8] 87.8

88.7

88.3

Drl [16]

85.7

98.8

92.3

Csl [12]

86.2

99.0

92.6

Cfcm34 [14]

88.8

98.8

93.8

Rnmf(Ours)

89.1

98.9

94.0

https://endovissub-instrument.grand-challenge.org/. https://github.com/RH-Song/Real-time-Neuroendoscopy-Multi-task-Framework.

588

R. Song et al.

The dataset for the Svm classifier in the Mask Filtering module was constructed using the following steps. Firstly, we trained an instance segmentation neural network on the training set of the neurosurgical dataset. Next, the trained model was used to infer the images in the training set. The segmentation results were then used to calculate the features described by Eq. (7) and Eq. (8), forming a dataset for the Svm classifier. The labels of the dataset were generated based on the Intersection over Union (IoU) of the masks and segmentation ground truth. If the IoU between a mask and its ground truth was lower than 50%, it was labeled as noise. Otherwise, it was labeled as an instrument. In our experiment, we employed mean Intersection over Union (mIoU) and mean average precision (mAP) as evaluation metrics to assess the accuracy of instrument instance segmentation [3]. Specifically, we reported the results for mAP50 and mAP75 , representing the performance when the mean average precision exceeds 50% and 75%, respectively. For instrument tip detection, precision was used as the performance metric. A prediction was considered correct if the distance between the predicted tip and its corresponding ground truth was within σ pixels. To determine a suitable σ, we randomly selected 100 images and obtained five different annotations for each image. By calculating the center of each tip point, we determined that the maximum offset labeled by the annotators was 13.8, which was set as σ for our analysis. To show our results in tables concisely, we call our framework Rnmf, which is short for Real-time Neuroendoscopy Multi-task Framework. 4.2

4.2 Results

Gruijthuijsen et al. [9] propose a robot control system based on surgical instrument tracking, which utilizes a Cnn-based real-time tool segmentation network [8] for instrument segmentation. The Cnn-based segmentation model [8] is evaluated on the EndoVis15 dataset, so we also evaluate our method on the same dataset for comparison. Table 1 displays the segmentation results of our method and several previous works on the EndoVis15 dataset, including Fcn-8s [8], Fcn-8s with optical flow (Fcn-8s+OF) [8], Drl [16], Csl [12], and Cfcm [14]. Our method achieves the highest sensitivity and balance accuracy, and its specificity is close to the best result. The results of instrument instance segmentation and tip detection on images from the neurosurgical dataset are shown in Fig. 4. Five kinds of images are shown in the figure. Object detection results are displayed with bounding boxes, segmentation of different instances is represented by different colors, and yellow points indicate the results of tip detection. The accuracy of mask edges and tips is high when the image is clear, and the edges of semantic segmentation and the instrument tips remain mostly correct even in mildly blurred images. To evaluate the performance of the proposed framework on the instrument detection task, we compare the results of its instrument detection branch against three two-stage methods, Mask R-Cnn [10] (MR), Faster R-Cnn [19] with Dcn [6] (FR+Dcn), and Mask R-Cnn [10] with Dcn [6] (MR+Dcn), and three one-stage methods, Yolov3 [18], Retina-Net [13], and Yolact [2].
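For reference, the three metrics reported in Table 1 can be computed from a binary predicted mask and its ground truth as in the sketch below; this is a generic illustration rather than the evaluation code of the cited works.

```python
import numpy as np


def seg_metrics(pred: np.ndarray, gt: np.ndarray):
    """Pixel-wise sensitivity, specificity, and balance accuracy
    for boolean instrument masks."""
    tp = np.logical_and(pred, gt).sum()
    tn = np.logical_and(~pred, ~gt).sum()
    fp = np.logical_and(pred, ~gt).sum()
    fn = np.logical_and(~pred, gt).sum()
    sensitivity = tp / (tp + fn + 1e-8)
    specificity = tn / (tn + fp + 1e-8)
    balance_accuracy = 0.5 * (sensitivity + specificity)
    return sensitivity, specificity, balance_accuracy
```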


Fig. 4. Instance segmentation and tip detection results of images from the neurosurgical dataset.

Table 2. Instance Segmentation Results on the Neurosurgical Dataset

Method             Box mAP50    Box mAP75    Mask mAP50   Mask mAP75
MR [10]            63.9/67.5    72.2/77.2    66.2/67.5    74.1/77.2
FR [19]+Dcn [6]    69.6/69.8    78.9/78.3    –            –
MR [10]+Dcn [6]    69.5/71.5    78.4/80.3    68.1/69.7    76.4/78.4
Yolov3 [18]        55.4/60.4    65.0/69.4    –            –
Retina-Net [13]    63.4/65.6    69.2/73.5    –            –
Yolact [2]         60.4/67.9    69.6/80.1    63.1/72.8    67.2/80.7
Rnmf (Ours)        63.7/72.5    70.2/82.8    67.8/75.7    73.7/83.3

mAP1/mAP2: mAP1 is the result with the data augmentation used in Mask R-Cnn [10], and mAP2 is the result with the data augmentation strategy proposed by us.

After comparing the results shown in Table 2, our scheme, Rnmf with our data augmentation strategy, achieves the highest accuracy for instrument detection. Its box mAP is at least 2.9% higher than the two-stage networks and 9.1% higher than the one-stage networks. The mask mAP of our scheme is 12.6% higher than Yolact and 7.6% higher than Mask R-Cnn with Dcn.

Ablation Studies. In the ablation studies, we evaluated the effectiveness of the data augmentation, the neural network modifications, and the mask optimization individually. Four images generated by data augmentation are shown in Fig. 5; the objects indicated by the red arrows are added through data augmentation. Table 2 presents the results of methods trained with different data augmentation strategies: mAP1 denotes the results obtained with the data augmentation used in Mask R-Cnn [10], while mAP2 denotes the results obtained with our proposed data augmentation strategy. Our data augmentation strategy yielded the most significant improvement in instrument detection within our method, a 12.6% gain in Box mAP75. Similarly, for segmentation with Yolact [2], there was a 13.5%


Fig. 5. Data augmentation results. The objects pointed to by the red arrows are added through data augmentation. (Color figure online)

Fig. 6. Instrument segmentation results and tip detection results of three challenging scenarios. The red masks are the result of instrument segmentation. The white solid dots are the result of tip detection. (Color figure online)

improvement in Mask mAP75. All models exhibited better performance after training with our data augmentation strategy. Our framework builds upon Yolact [2]; the key modifications include integrating Cbam into the backbone, refining the loss function, and incorporating a mask filtering module as a post-processing step. Comparing the results of Yolact and Rnmf in Table 2, these network modifications yield a 3.3% improvement in Box mAP75 and a 4.7% improvement in Mask mAP75. To evaluate the performance of the tip detection algorithm, we conducted tip detection using the segmentation results obtained from Fcn [20], DeepLab-v3 [5], and our own method; the qualitative results are shown in Fig. 6. Additionally, to assess the effect of the mask filtering module on the tip detection

Fig. 7. Instrument segmentation (mask mIoU) and tip detection (precision) results of Fcn-8s, DeepLab-v3, and Rnmf, with and without mask optimization.

algorithm, we performed tip detection both on masks processed by the mask filtering module and on unprocessed masks. The results of instrument segmentation and tip detection are shown in Fig. 7. The mask filtering module improves both semantic segmentation and tip detection for all models. The improvement in mIoU for semantic segmentation is smaller than the improvement in tip detection precision. The main reason is that the area of the false positive regions filtered out by the mask filtering module is small compared to the total area of the instrument mask. Although these false positive areas are not large, they are often composed of many small regions, and these regions generate false positive tip detections that significantly reduce the accuracy of tip detection. Therefore, after removing these false positive regions with the mask filtering module, tip detection improves substantially, with the maximum improvement reaching 39.51%.
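The tip-detection precision used throughout this section can be computed as the fraction of predicted tips whose nearest ground-truth tip lies within σ = 13.8 pixels. The sketch below is an illustrative implementation, not the authors' exact evaluation script.

```python
import numpy as np


def tip_precision(pred_tips, gt_tips, sigma: float = 13.8) -> float:
    """Fraction of predicted tip points within sigma pixels of a ground-truth
    tip; pred_tips and gt_tips are arrays of (x, y) coordinates."""
    pred_tips = np.asarray(pred_tips, dtype=float)
    gt_tips = np.asarray(gt_tips, dtype=float)
    if len(pred_tips) == 0 or len(gt_tips) == 0:
        return 0.0
    # pairwise Euclidean distances between predictions and ground truths
    dists = np.linalg.norm(pred_tips[:, None, :] - gt_tips[None, :, :], axis=-1)
    correct = (dists.min(axis=1) <= sigma).sum()
    return correct / len(pred_tips)
```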

5 Conclusion

We have introduced a real-time approach for instance segmentation and tip detection of neuroendoscopic surgical instruments. To address the challenges arising from the restricted operation space in neurosurgery, we have proposed a data augmentation strategy and a mask filtering module. Our method has been evaluated using a neurosurgical dataset as well as the EndoVis15 dataset. Experimental results have demonstrated that both the data augmentation and the mask filtering module contribute to improved performance in instance segmentation and tip detection. In the future, our aim is to develop an end-to-end neural network that can operate in real-time on a resource-constrained robot platform, enabling accurate instrument segmentation and tip detection.


References

1. Agustinos, A., Voros, S.: 2D/3D real-time tracking of surgical instruments based on endoscopic image processing. In: Luo, X., Reichl, T., Reiter, A., Mariottini, G.-L. (eds.) CARE 2015. LNCS, vol. 9515, pp. 90–100. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-29965-5_9
2. Bolya, D., et al.: YOLACT: real-time instance segmentation. In: ICCV, pp. 9157–9166 (2019). https://doi.org/10.1109/ICCV.2019.00925
3. Bouget, D., Benenson, R., Omran, M., et al.: Detecting surgical tools by modelling local appearance and global shape. IEEE TMI 34(12), 2603–2617 (2015)
4. Caruana, R.: Multitask learning. Mach. Learn. 28(1), 41–75 (1997)
5. Chen, L.C., et al.: Rethinking Atrous convolution for semantic image segmentation. arXiv preprint: arXiv:1706.05587 (2017)
6. Dai, J., et al.: Deformable convolutional networks. In: ICCV, pp. 764–773 (2017). https://doi.org/10.1109/ICCV.2017.89
7. Fischetti, M.: Fast training of support vector machines with Gaussian kernel. Discret. Optim. 22, 183–194 (2016)
8. García-Peraza-Herrera, L.C., et al.: Real-time segmentation of non-rigid surgical tools based on deep learning and tracking. In: Peters, T., et al. (eds.) CARE 2016. LNCS, vol. 10170, pp. 84–95. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-54057-3_8
9. Gruijthuijsen, C., Garcia-Peraza-Herrera, L.C., Borghesan, G., et al.: Robotic endoscope control via autonomous instrument tracking. Front. Robot. AI 9, 832208 (2022). https://doi.org/10.3389/frobt.2022.832208
10. He, K., et al.: Mask R-CNN. In: ICCV, pp. 2961–2969 (2017). https://doi.org/10.1109/ICCV.2017.322
11. Kurmann, T., et al.: Simultaneous recognition and pose estimation of instruments in minimally invasive surgery. In: Descoteaux, M., Maier-Hein, L., Franz, A., Jannin, P., Collins, D.L., Duchesne, S. (eds.) MICCAI 2017. LNCS, vol. 10434, pp. 505–513. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-66185-8_57
12. Laina, I., et al.: Concurrent segmentation and localization for tracking of surgical instruments. In: Descoteaux, M., Maier-Hein, L., Franz, A., Jannin, P., Collins, D.L., Duchesne, S. (eds.) MICCAI 2017. LNCS, vol. 10434, pp. 664–672. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-66185-8_75
13. Lin, T.Y., et al.: Focal loss for dense object detection. In: TPAMI, pp. 2980–2988 (2017). https://doi.org/10.1109/TPAMI.2018.2858826
14. Milletari, F., Rieke, N., Baust, M., Esposito, M., Navab, N.: CFCM: segmentation via coarse to fine context memory. In: Frangi, A.F., Schnabel, J.A., Davatzikos, C., Alberola-López, C., Fichtinger, G. (eds.) MICCAI 2018. LNCS, vol. 11073, pp. 667–674. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-00937-3_76
15. Niccolini, M., Castelli, V., Diversi, C., et al.: Development and preliminary assessment of a robotic platform for neuroendoscopy based on a lightweight robot. IJMRCAS 12(1), 4–17 (2016). https://doi.org/10.1002/rcs.1638
16. Pakhomov, D., Premachandran, V., Allan, M., Azizian, M., Navab, N.: Deep residual learning for instrument segmentation in robotic surgery. In: Suk, H.-I., Liu, M., Yan, P., Lian, C. (eds.) MLMI 2019. LNCS, vol. 11861, pp. 566–573. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-32692-0_65
17. Perez, S.A.H., et al.: Segmentation of endonasal robotic instruments in a head phantom using deep learning and virtual-reality simulation. In: ROBOMEC 2020, pp. 2P2-F01. The Japan Society of Mechanical Engineers (2020)


18. Redmon, J., Farhadi, A.: YOLOv3: an incremental improvement. arXiv preprint: arXiv:1804.02767 (2018)
19. Ren, S., He, K., Girshick, R., et al.: Faster R-CNN: towards real-time object detection with region proposal networks. TPAMI 39(6), 1137–1149 (2016). https://doi.org/10.1109/TPAMI.2016.2577031
20. Shelhamer, E., Long, J., Darrell, T.: Fully convolutional networks for semantic segmentation. TPAMI 39(4), 640–651 (2017). https://doi.org/10.1109/TPAMI.2016.2572683
21. Shorten, C., Khoshgoftaar, T.M.: A survey on image data augmentation for deep learning. J. Big Data 6(1), 1–48 (2019)
22. Zappella, L., Béjar, B., Hager, G., et al.: Surgical gesture classification from video and kinematic data. MIA 17(7), 732–745 (2013)
23. Zhao, Z.: Real-time 3D visual tracking of laparoscopic instruments for robotized endoscope holder. Bio-Med. Mater. Eng. 24(6), 2665–2672 (2014)

Spatial Gene Expression Prediction Using Hierarchical Sparse Attention

Cui Chen, Zuping Zhang(B), and Panrui Tang

School of Computer Science and Engineering, Central South University, Changsha 410083, China
{214701013,zpzhang,224711119}@csu.edu.cn

Abstract. Spatial Transcriptomics (ST) quantitatively interprets human diseases by providing the gene expression of each fine-grained spot (i.e., window) in a tissue slide. This paper focuses on predicting gene expression at the windows of a tissue slide image of interest. However, the image features related to gene expression usually exhibit diverse spatial scales. To model these features spatially, we introduce the Hierarchical Sparse Attention Network (HSATNet). The core idea of HSATNet is a two-level sparse attention mechanism, namely coarse (i.e., area) and fine (i.e., window) attention. Each HSAT Block consists of two main modules: i) an adaptive sparse coarse attention module that filters out the most irrelevant areas, yielding adaptive sparse areas; and ii) an adaptive sparse fine attention module that filters out the most irrelevant windows, yielding adaptive sparse windows. The first module identifies similarities among windows within the same area, while the second captures the differences between different windows. After fusing these two modules, without any additional training data or pre-training, experiments conducted on 10X Genomics breast cancer data show that our HSATNet achieves an impressive PCC@S of 7.43 for gene expression prediction, exceeding the current state-of-the-art model. Code is available at https://github.com/biyecc/HSATNet.

Keywords: Spatial Transcriptomics · Gene Expression Prediction · Deep Learning · Hierarchical Sparse Attention · Tissue Slide Image

1 Introduction

Based on findings reported in Nature Methods [15], Spatial Transcriptomics (ST) has emerged as a promising technique for disease research due to its capability to capture gene expression with high-resolution spatial information. However, the low throughput and spatial diversity of ST limit the study of gene expression in candidate windows [17]. To effectively and efficiently predict the gene expression of each window within slide images (Fig. 1), we propose a novel solution called the Hierarchical Sparse Attention Network (HSATNet). Previous studies on gene expression prediction have introduced end-to-end neural networks, namely STNet [9] and NSL [4]. STNet employs transfer learning by fine-tuning a pretrained DenseNet model for gene expression prediction.


Fig. 1. Overview of the task. Each window of a tissue slide image has distinct gene expression. In this example, a tissue slide image contains three windows, and each window corresponds to the expression of four different gene types. Our goal is to predict the gene expression of each window.

NSL, in contrast, only uses convolution operations to map the color intensity of a window to the corresponding gene expression. Although these approaches leverage neural networks and show potential for high-throughput analysis, they face two important limitations: a lack of local feature aggregation, and a fragile assumption about the color intensity of slide image windows. To address these limitations, the Exemplar Guided Network (EGN) [25] was recently proposed, combining exemplar learning with the vanilla Vision Transformer (ViT) [7]. However, the selection of exemplars has two drawbacks: first, it is challenging to choose the exemplars appropriately; second, the relationships between different exemplars are not considered.

In this work, we tackle these limitations by introducing a novel hybrid hierarchical structure, named HSATNet. As illustrated in Fig. 2, it flexibly merges coarse-level (i.e., area) and fine-level (i.e., window) sparse attention, enabling effective and efficient gene expression prediction. HSATNet comprises four distinct stages, in which the input resolution is decreased through strided convolutional layers while the number of feature maps is increased. Our contributions are summarised below:

– We propose HSATNet, a Hierarchical Sparse Attention network, to effectively and efficiently predict gene expression from slide image windows. Our approach outperforms state-of-the-art (SOTA) methods, as demonstrated through experiments on two breast cancer datasets.
– At the coarse level, we propose the Adaptive Sparse Area Attention (ASAA) module, which utilizes a sparse-area strategy based on the similarities among different areas. By filtering out the least relevant areas and guiding window-to-area attention, the module focuses on capturing the shared content of windows within the same area, which enables us to obtain adaptive sparse coarse-level attention and effectively highlights the relevant information for feature fusion.
– At the fine level, we propose the Adaptive Sparse Window Attention (ASWA) module, which utilizes a sparse-window strategy based on the similarities among different windows. By filtering out the least relevant windows and implementing sparse window-to-window attention, the module focuses on capturing the differences between windows. This allows us to obtain adaptive sparse fine-level attention, effectively capturing the distinctive characteristics of the windows.

Overall, HSATNet leverages a hierarchical sparse attention approach, incorporating both coarse and fine levels, to enhance gene expression prediction from slide image windows.

2 Related Work

Gene Expression Prediction. Gene expression within a slide image has important biological effects on the properties of the tissue, and measuring gene expression is a fundamental step in the development of novel disease treatments [1]. Deep learning methods have been introduced to this process, and the image-based approaches can be categorized into two branches. The first branch focuses on bulk RNA-Seq and single-cell RNA sequencing [13], which measure gene expression within a large predefined area (up to 10^5 × 10^5 in the corresponding slide images) and at the cellular level, respectively. However, both of these approaches lose the rich spatial information about gene expression, which is crucial when studying tissue heterogeneity [8]. The second branch is ST, a recently developed technique that utilizes DNA barcodes to distinguish different windows in the tissue and captures spatial information. He et al. designed STNet [9], which is the first work to consider integrating slide images with ST. Dawood et al. [4] propose NSL, which is "color-aware". Recently, Yang et al. proposed an exemplar-guided deep neural network, named EGN [25].

Vision Transformer. With inspiration drawn from neural activity in biological brains, sparsity of hidden representations in deep neural networks has emerged as a tantalizing "free lunch" for both vision and NLP tasks [21,26]. ViT [7] astutely employs the attention mechanism [20], splitting each image into patches and treating them as tokens. It exhibits noteworthy performance in various classification tasks [5] but relies on the full self-attention mechanism, which leads to quadratic growth in computational complexity as the image size increases. To tackle this dilemma, recent studies focus on sparse attention [6,10,11,24]. Ramachandran et al. have proposed the local window self-attention mechanism [17], along with its shifted or haloed variants; these mechanisms facilitate interactions across different windows and alleviate the computational complexity issue. Furthermore, axial self-attention [10] and criss-cross attention [11] have been introduced to enlarge the receptive field. Dong et al. [6] propose window self-attention over both horizontal and vertical stripes. Xia et al. [24] attempt to make the sparsity adaptive to the data. Indeed, it is widely recognized that sparse representation


also plays a critical role in handling low-level vision problems, such as image deraining [23] and super-resolution [16]. In principle, sparse attention can be categorized into data-based (fixed) sparse attention and content-based sparse attention [3,19]. For data-based sparse attention, several local attention operations are introduced into CNN backbones, which mainly restrict attention to a local window.

3 Method

3.1 Preliminaries

Problem Formulation. We have a dataset collection comprising tissue slide images, where each image comprises multiple windows and each window is annotated with gene expression information. For brevity, we write this collection as pairs consisting of a slide image window X^i ∈ R^(3×H_i×W_i) and its gene expression y^i ∈ R^M, i.e., {(X^i, y^i)}_{i=1}^N, where N is the collection size, H_i and W_i are the height and width of X^i, and M is the number of gene types. Our goal is to train an effective and efficient deep neural network model to predict y^i from X^i.

Attention. The attention function takes input queries Q ∈ R^(N_q×C), keys K ∈ R^(N_kv×C), and values V ∈ R^(N_kv×C). It computes a weighted sum of the values for each query, where the weights are the normalized dot products between the query and the corresponding keys. This operation can be expressed in a concise matrix form as

    Attention(Q, K, V) = softmax(QK^T / √C) V,    (1)

where the softmax function is applied to each query's scores over the keys. The scaling factor √C prevents overly concentrated weights and gradient vanishing. The output of this attention function is a matrix of size N_q × C, where each row contains the weighted sum of values for the corresponding query.
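Equation (1) is the standard scaled dot-product attention; a minimal NumPy sketch is given below for reference.

```python
import numpy as np


def attention(Q: np.ndarray, K: np.ndarray, V: np.ndarray) -> np.ndarray:
    """Scaled dot-product attention: softmax(QK^T / sqrt(C)) V.
    Q: (Nq, C), K and V: (Nkv, C); returns (Nq, C)."""
    C = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(C)                    # (Nq, Nkv)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over the keys
    return weights @ V
```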

3.2 Hierarchical Sparse Attention Network (HSATNet)

In this paper, we propose the Hierarchical Sparse Attention Network (HSATNet). Our key idea is to consider coarse and fine information jointly: each HSAT Block contains an Adaptive Sparse Area Attention (ASAA) module and an Adaptive Sparse Window Attention (ASWA) module for coarse- and fine-level feature extraction. The two modules are combined with an LCE term and followed by a simple gene prediction layer.

Adaptive Sparse Area Attention (ASAA). Several works [6,10,11,22,24] have proposed various sparse attention mechanisms to address the time and space complexity issues of the traditional Transformer. These mechanisms attend to only a small number of key-value pairs instead of all pairs. However, existing


approaches either use handcrafted static patterns or share the same subset of key-value pairs among all queries. To tackle the dilemma of handcrafted approaches, we design an ASAA module that enables adaptive sparse area attention through a sparse-area strategy. The key idea of this strategy is to automatically filter out irrelevant areas at the coarse level, resulting in a small subset of routed areas (note that the routed areas are selected according to the similarity of the windows belonging to the query area, and all windows in the query area attend to the same windows belonging to the routed areas). We give a detailed explanation of the sparse-area strategy as follows.

Fig. 2. Network architecture of the proposed HSATNet. Each HSAT Block fuses ASAA and ASWA for coarse- and fine-level feature extraction. By gathering key-value pairs from the top-k related areas, ASAA performs sparse-area attention, which captures the commonality within an area. ASWA performs sparse-window attention, which captures the differences between windows. Finally, after fusing these two attentions and combining them with the LCE block, HSATNet performs the gene expression prediction task.

– Area Partition and Input Projection. Given a 2D input feature map X ∈ R^(C×H×W), we begin by partitioning it into S × S non-overlapping areas, with each area containing HW/S^2 feature vectors. To accomplish this, we reshape X as X^a ∈ R^(S^2 × HW/S^2 × C) and then use linear projections to obtain the query, key, and value tensors Q, K, V ∈ R^(S^2 × HW/S^2 × C):

    Q^i = X^i W^q,  K^i = X^i W^k,  V^i = X^i W^v,    (2)


where W^q, W^k, W^v ∈ R^(C×C) are the projection weights for the query, key, and value, respectively. We then construct a directed relationship between areas. We obtain area-level queries and keys Q^a, K^a ∈ R^(S^2×C) by taking the per-area average of Q^i and K^i, respectively, and derive the adjacency matrix A^a ∈ R^(S^2×S^2) of area-to-area affinity via matrix multiplication between Q^a and the transposed K^a:

    A^a = Q^a (K^a)^T.    (3)

Entries in the adjacency matrix A^a measure how much two areas are semantically related.
– Routing Area. To filter out areas irrelevant to the query area, we exploit the relation between the query area and the remaining areas. The core step is to prune the relations between areas by keeping only the top-k connections. Specifically, we derive a routing index matrix I^a ∈ N^(S^2×k) with the row-wise topk operator:

    I^a = topkIndex(A^a).    (4)

Hence, the i-th row of I^a contains the indices of the k most relevant areas for the i-th area.
– Window-to-area attention. With the area-to-area routing index matrix I^a, we can apply window-to-area attention. Each query window in area j attends to all key-value pairs residing in the union of the k routed areas indexed by I^a_(j,1), I^a_(j,2), ..., I^a_(j,k). To be GPU-friendly, we first gather the keys and values, i.e.,

    K^ga = gather(K^a, I^a),  V^ga = gather(V^a, I^a),    (5)

where K^ga, V^ga ∈ R^(S^2 × kHW/S^2 × C) are the gathered key and value tensors. We then apply attention on the gathered key-value pairs:

    O^a = Attention(Q^i, K^ga, V^ga).    (6)
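The routing in Eqs. (3)–(6) amounts to a top-k selection over the area affinity matrix followed by a gather of the routed key-value pairs. The PyTorch sketch below illustrates this for a single attention head; tensor shapes follow the notation above, and, consistent with the shapes stated after Eq. (5), the window-level keys and values are the ones gathered with the routed area indices.

```python
import torch


def asaa_route(Q, K, V, k: int):
    """Adaptive sparse area attention for a single head (illustrative sketch).
    Q, K, V: (S2, n, C) window-level tensors, where S2 = S * S is the number
    of areas and n = HW / S2 is the number of windows per area; k <= S2."""
    S2, n, C = Q.shape
    # area-level queries and keys by per-area averaging
    Qa, Ka = Q.mean(dim=1), K.mean(dim=1)              # (S2, C)
    Aa = Qa @ Ka.t()                                   # area-to-area affinity, Eq. (3)
    Ia = Aa.topk(k, dim=-1).indices                    # routed area indices, Eq. (4)
    # gather the windows of the k routed areas for every query area, Eq. (5)
    Kg = K[Ia].reshape(S2, k * n, C)                   # (S2, k * n, C)
    Vg = V[Ia].reshape(S2, k * n, C)
    # window-to-area attention over the gathered key-value pairs, Eq. (6)
    attn = torch.softmax(Q @ Kg.transpose(1, 2) / C ** 0.5, dim=-1)
    return attn @ Vg                                   # (S2, n, C)
```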

Adaptive Sparse Window Attention (ASWA). The window partition is the same as in the ASAA module, and we obtain Q^i, K^i, V^i for each window.
– Routing Window. We first construct a directed relationship between windows using the window-level queries and keys Q^i, K^i, and derive the adjacency matrix A^w of window-to-window affinity via matrix multiplication between Q^i and the transposed K^i:

    A^w = Q^i (K^i)^T.    (7)

Entries in the adjacency matrix A^w measure how much two windows are semantically related. To filter out windows irrelevant to the query window, we prune the relations between windows by keeping only the top-k connections. Specifically, we derive a routing index matrix I^w with the row-wise topk operator:

    I^w = topkIndex(A^w).    (8)


Hence, the i-th row of I^w contains the indices of the k most relevant windows for the i-th window.
– Window-to-window attention. With the window-to-window routing index matrix I^w, we can apply sparse window attention. Each query window i attends to all key-value pairs residing in the union of the k routed windows indexed by I^w_(i,1), I^w_(i,2), ..., I^w_(i,k). To be GPU-friendly, we first gather the keys and values, i.e.,

    K^gw = gather(K, I^w),  V^gw = gather(V, I^w),    (9)

where K^gw, V^gw are the gathered key and value tensors. We then apply attention on the gathered key-value pairs:

    O^w = Attention(Q^i, K^gw, V^gw).    (10)

Note that we can parallelize the computation of the attention weights and the attended representations across all tokens in all areas and windows, which makes our approach computationally efficient.
– Fusion. We fuse the window-to-area attention O^a and the window-to-window attention O^w:

    O^i = Fusion(O^a, O^w).    (11)

There are many possible fusion methods; this paper chooses the simplest one and directly adds the two terms. In our extensive experiments, we found that more complex fusion schemes have little influence on the results of gene expression prediction. The block output is

    Y^i = O^i + LCE(V^w).    (12)

Here, LCE(V^w) is a local context enhancement term as in [18]; the function LCE(·) is parametrized with a depth-wise convolution whose kernel size is set to 5.

Prediction Block. We apply the fused feature map for prediction:

    y^i = MLP_f(Y^i),    (13)

where MLP_f is a single-layer perceptron.

Objective. HSATNet is optimized with a mean squared loss L_2 and a batch-wise PCC loss L_PCC. The overall objective is

    L_E = L_2 + L_PCC.    (14)
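To make Eqs. (11)–(14) concrete, the sketch below gives one plausible PyTorch realization of the additive fusion, the LCE term (a depth-wise convolution with kernel size 5), the single-layer prediction head, and the training objective. The pooling step and the exact form of L_PCC are not specified in the text, so both are assumptions made for illustration.

```python
import torch
import torch.nn as nn


class FusePredict(nn.Module):
    """O^i = O^a + O^w,  Y^i = O^i + LCE(V^w),  y^i = MLP_f(Y^i)  (sketch)."""

    def __init__(self, channels: int, num_genes: int):
        super().__init__()
        # depth-wise convolution with kernel size 5 acts as LCE(.)
        self.lce = nn.Conv2d(channels, channels, kernel_size=5,
                             padding=2, groups=channels)
        self.head = nn.Linear(channels, num_genes)  # single-layer perceptron MLP_f

    def forward(self, o_area, o_win, v_win):
        # all inputs: (B, C, H, W) feature maps from the last stage
        y = o_area + o_win + self.lce(v_win)   # Eqs. (11)-(12), additive fusion
        y = y.mean(dim=(2, 3))                 # global average pooling (assumption)
        return self.head(y)                    # Eq. (13)


def objective(pred: torch.Tensor, target: torch.Tensor, eps: float = 1e-8):
    """L_E = L_2 + L_PCC (Eq. (14)); L_PCC is assumed to be 1 minus the
    batch-wise Pearson correlation averaged over gene types."""
    l2 = torch.mean((pred - target) ** 2)
    p = pred - pred.mean(dim=0, keepdim=True)
    t = target - target.mean(dim=0, keepdim=True)
    pcc = (p * t).sum(dim=0) / (p.norm(dim=0) * t.norm(dim=0) + eps)
    return l2 + (1.0 - pcc.mean())
```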

4 Experiments

4.1 Datasets

We conducted experiments utilizing the publicly available STNet dataset [9] and the 10x Genomics dataset¹. The STNet dataset contains approximately 30,612 pairs of slide image windows and gene expression data, derived from 68 slide images from 23 patients. Following [9], our objective is to predict the expression levels of the 250 gene types exhibiting the highest mean values across the dataset. The 10x Genomics dataset comprises 32,032 slide image windows and gene expression data obtained from 6 slide images; we employed the same approach as for the STNet dataset to identify the target gene types. To ensure consistency and comparability, we subjected the target gene expression data to both log transformation and min-max normalization.

¹ https://www.10xgenomics.com/resources/datasets.

4.2 Experimental Set-Up

Evaluation Metrics. Following [25], we use three metrics for comparison: the Pearson correlation coefficient (PCC), the mean squared error (MSE), and the mean absolute error (MAE). All experimental results are reported scaled by 1 × 10^1. To assess performance, we report the PCC at three summaries: PCC@F, PCC@S, and PCC@M denote the first quantile, median, and mean of the per-gene PCC, respectively. PCC@F reflects the PCC of the least well-predicted gene types, while PCC@S and PCC@M measure the median and mean of the correlations over all gene types, given predictions and ground truths (GTs) for all slide image windows. Higher values of PCC@F, PCC@S, and PCC@M indicate better performance. Furthermore, MSE and MAE quantify the deviation between predictions and GTs on a per-sample basis for each gene type within each slide image window; lower MSE and MAE values indicate better performance.

Implementation Details. The settings of HSATNet follow previous studies. During training, HSATNet is trained from scratch for 100 epochs with a batch size of 32. We set the learning rate to 5 × 10^-4 with a cosine annealing scheduler and, to control overfitting, apply a weight decay of 1 × 10^-4. ASAA is a sparse transformer: the patch size is set to 32, the embedding dimension is 1024, the feed-forward dimension is 4096, and the model consists of 16 attention heads with a depth of 8. All experiments are conducted on 2 NVIDIA A10 GPUs, allowing for efficient processing and training of the model.
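The three PCC summaries can be computed by taking a Pearson correlation per gene type over all windows and then summarizing; the NumPy sketch below takes the first quantile as the 25th percentile and applies the 1 × 10^1 scaling used in the reported tables.

```python
import numpy as np


def pcc_summaries(pred: np.ndarray, gt: np.ndarray):
    """pred, gt: (num_windows, num_genes). Returns PCC@F, PCC@S, PCC@M,
    i.e., first quartile, median, and mean of per-gene PCC, scaled by 10."""
    p = pred - pred.mean(axis=0, keepdims=True)
    g = gt - gt.mean(axis=0, keepdims=True)
    denom = np.linalg.norm(p, axis=0) * np.linalg.norm(g, axis=0) + 1e-8
    pcc = (p * g).sum(axis=0) / denom          # one PCC per gene type
    return (10 * np.percentile(pcc, 25),
            10 * np.median(pcc),
            10 * pcc.mean())
```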

4.3 Experimental Results

Table 1 reports quantitative comparisons of gene expression prediction between our HSATNet and SOTA methods on the STNet dataset and the 10x Genomics dataset. All experimental results are reported scaled by 1 × 10^1, and NBE stands for Nature Biomedical Engineering. We bold the best results and use '–' to denote unavailable results. Models are evaluated by four-fold and three-fold cross-validation on the two datasets, respectively. We evaluate the effectiveness of HSATNet against a series of prior SOTA works, including models for gene expression prediction [4,9,25] and models for ImageNet classification, namely ViT [7], MPViT [12], and CycleMLP [2]. Our HSATNet outperforms the SOTA methods on the PCC-related metrics (note that the PCC-related evaluation metrics are the most important ones for our task). Comparing the results on these two datasets in Table 1, we find that:


i) Our HSATNet exhibits a notable superiority over the baseline methods on the PCC-related evaluations. It is crucial to highlight that a significant proportion of the gene types in the first quantile exhibit skewed expression distributions, representing the most difficult part of the prediction task. Our method surpasses the second-best approach by 2.2% in terms of PCC@S, indicating a substantial enhancement in capturing correlations for these particularly challenging gene types. As discussed above, the PCC-related evaluation metrics are the most important ones for our task: our HSATNet, which uses coarse and fine attention, outperforms the baselines by a reasonable margin on the PCC-related metrics, while achieving MSE and MAE similar to EGN, the SOTA model for gene expression prediction from tissue slide images. ii) Compared with CycleMLP and MPViT, the SOTA methods for ImageNet classification, our HSATNet is better on every metric. To summarize, our HSATNet outperforms the SOTA methods on the gene expression prediction task, showcasing a significant improvement in PCC.

Table 1. Comparison to prior works.

                            STNet Dataset                          10x Genomics Dataset
Methods       References    MAE   MSE   PCC@F  PCC@S  PCC@M       MAE   MSE   PCC@F  PCC@S  PCC@M
STNet [9]     NBE2020       –     1.70  0.05   0.92   0.93        1.24  2.64  1.25   2.26   –
NSL [4]       PKDD2021      –     –     –0.71  0.25   0.11        –     –     –3.73  1.84   0.25
ViT [7]       ICLR2021      0.42  1.67  0.97   1.86   1.82        0.75  2.27  4.64   5.11   4.90
CycleMLP [2]  ICLR2022      0.44  1.68  1.11   1.95   1.91        0.47  1.55  5.88   6.60   6.32
MPViT [12]    CVPR2022      0.45  1.70  0.91   1.54   1.69        0.55  1.56  6.40   7.15   6.84
EGN [25]      WACV2023      0.41  1.61  1.51   2.25   2.02        0.54  1.55  6.78   7.21   7.07
Ours          –             0.40  1.59  1.60   2.28   2.38        0.40  1.54  6.93   7.43   7.20

4.4 Ablation Study

We examine the capability of each component by conducting a detailed ablation study on the 10x Genomics dataset.

Effectiveness of the HSATNet Architecture. One of the key components of HSATNet is the fusion block that merges ASAA with ASWA for better feature aggregation. We therefore compare the proposed method with two reduced variants: an 'ASAA only' model and an 'ASWA only' model. The results are reported in Table 2. Our findings are as follows: (i) the 'ASAA only' setting performs worse than the full HSATNet, so adding the fine-level sparse attention benefits the architecture; (ii) the 'ASWA only' setting also performs worse than the full HSATNet, so adding the coarse-level sparse attention benefits the architecture as well; (iii) the model that fuses the two modules achieves the best results in gene expression prediction, which shows that the combination of the two modules is more effective. Note that here we use only the simplest way to


combine them; in our extensive experiments, we found that more complex fusion schemes have little influence on the results of gene expression prediction.

Effectiveness of the ASAA. In ASAA, we also have to choose S and k. Our considerations are as follows: (i) S is chosen similarly to the Swin Transformer [14], which uses a window size of 7 and thereby avoids padding; since 224 = 7 × 32, we use S = 7 so that it is a divisor of the feature-map size in every stage. (ii) k is chosen following KVT [21], which shows that retaining only about 66% of the attention is enough; we gradually increase k to keep a reasonable number of windows to attend to as the area size becomes smaller in later stages, with the goal of reducing the attention in the four stages by roughly 30%–40%. The k is set to [1, 4, 4, –2], and we use 2, 2, 6, 2 blocks for the four stages with non-overlapped patch embedding, an initial patch embedding dimension C = 96, and an MLP expansion ratio e = 4. Here we compare ASAA with several existing SOTA sparse attention mechanisms [6,22,24] from the classification task, keeping the rest of the network the same. Following [6], we align the macro architecture design with CSwin [6] for a fair comparison.

Table 2. Ablation study on HSATNet architectures.

Methods   'ASAA only'   'ASWA only'   HSATNet
ASAA      ✓             –             ✓
ASWA      –             ✓             ✓
MAE       0.58          0.53          0.40
MSE       1.64          1.60          1.54
PCC@F     6.05          6.78          6.93
PCC@S     6.84          7.03          7.43
PCC@M     6.57          7.10          7.20

Table 3. Ablation study on ASAA.

Methods   DAT [24]    CrossFormer [22]   CSwin [6]   Ours
          CVPR2022    ICLR2022           CVPR2022
MAE       0.58        0.60               0.62        0.40
MSE       1.74        1.64               1.89        1.54
PCC@F     5.53        6.21               5.46        6.93
PCC@S     5.86        7.02               6.05        7.43
PCC@M     5.90        6.90               6.30        7.20

Table 4. Ablation study on ASWA.

Methods   DAT [24]    CrossFormer [22]   CSwin [6]   Ours
          CVPR2022    ICLR2022           CVPR2022
MAE       0.55        0.45               0.49        0.40
MSE       1.65        1.83               1.98        1.54
PCC@F     5.76        5.98               6.03        6.93
PCC@S     5.62        7.35               6.58        7.43
PCC@M     6.16        6.42               6.78        7.20

The results are reported in Table 3. In our extensive evaluation, our sparse-area strategy obtains the best results on every metric; compared with all of the existing SOTA sparse attention mechanisms, we achieve the best performance, which shows that our sparse-area strategy is a strong sparse attention mechanism at the coarse level.

Effectiveness of the ASWA. Like ASAA, our ASWA module also filters out the most irrelevant feature windows. To demonstrate the effectiveness of the proposed ASWA, we proceed as for ASAA and only replace the ASWA module with other sparse attention strategies, keeping the rest of the network the same. As shown in Table 4, our sparse-window strategy obtains the best results on every metric; compared with all of the existing SOTA sparse attention mechanisms, we achieve the best performance, which shows that our sparse-window strategy is a strong sparse attention mechanism at the fine level.

5 Conclusion

This paper introduces HSATNet, a novel approach designed to effectively and efficiently predict gene expression from fine-grained spots of tissue slide images. HSATNet incorporates both coarse (i.e., area) and fine (i.e., window) attention through the ASAA (Adaptive Sparse Area Attention) and ASWA (Adaptive Sparse Window Attention) modules, respectively. At the coarse level, HSATNet utilizes a sparse-area strategy to acquire sparse window-to-area attention by filtering out the most irrelevant feature areas; this attention mechanism focuses on identifying the shared attention areas for windows within the same area. At the fine level, HSATNet employs a sparse-window strategy to acquire window-to-window attention by filtering out the most irrelevant feature windows; this attention mechanism captures the local context and the differences between windows. The obtained coarse and fine attention are combined with a local context enhancement term, LCE, for gene expression prediction. Extensive experiments demonstrate the superiority of HSATNet compared to state-of-the-art (SOTA) methods. HSATNet shows great promise in facilitating studies of diseases and enabling accurate gene expression prediction for novel treatments.


References

1. Avsec, Ž., et al.: Effective gene expression prediction from sequence by integrating long-range interactions. Nat. Methods 18(10), 1196–1203 (2021)
2. Chen, S., Xie, E., Ge, C., Chen, R., Liang, D., Luo, P.: CycleMLP: a MLP-like architecture for dense prediction. arXiv preprint arXiv:2107.10224 (2021)
3. Correia, G.M., Niculae, V., Martins, A.F.: Adaptively sparse transformers. arXiv preprint arXiv:1909.00015 (2019)
4. Dawood, M., Branson, K., Rajpoot, N.M., Minhas, F.U.A.A.: All you need is color: image based spatial gene expression prediction using neural stain learning. In: Kamp, M., et al. (eds.) Machine Learning and Principles and Practice of Knowledge Discovery in Databases. ECML PKDD 2021. Communications in Computer and Information Science, Part II, vol. 1525, pp. 437–450. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-93733-1_32
5. Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., Fei-Fei, L.: ImageNet: a large-scale hierarchical image database. In: 2009 IEEE Conference on Computer Vision and Pattern Recognition, pp. 248–255. IEEE (2009)
6. Dong, X., et al.: CSWin transformer: a general vision transformer backbone with cross-shaped windows. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 12124–12134 (2022)
7. Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T.: Transformers for image recognition at scale. arXiv preprint arXiv:2010.11929 (2020)
8. Gerlinger, M., et al.: Intratumor heterogeneity and branched evolution revealed by multiregion sequencing. N. Engl. J. Med. 366, 883–892 (2012)
9. He, B., et al.: Integrating spatial gene expression and breast tumour morphology via deep learning. Nat. Biomed. Eng. 4(8), 827–834 (2020)
10. Ho, J., Kalchbrenner, N., Weissenborn, D., Salimans, T.: Axial attention in multidimensional transformers. arXiv preprint arXiv:1912.12180 (2019)
11. Huang, Z., Wang, X., Huang, L., Huang, C., Wei, Y., Liu, W.: CCNet: criss-cross attention for semantic segmentation. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 603–612 (2019)
12. Lee, Y., Kim, J., Willette, J., Hwang, S.J.: MPViT: multi-path vision transformer for dense prediction. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 7287–7296 (2022)
13. Li, X., Wang, C.Y.: From bulk, single-cell to spatial RNA sequencing. Int. J. Oral Sci. 13(1), 36 (2021)
14. Liu, Z., et al.: Swin transformer: hierarchical vision transformer using shifted windows. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 10012–10022 (2021)
15. Marx, V.: Method of the year: spatially resolved transcriptomics. Nat. Methods 18(1), 9–14 (2021)
16. Mei, Y., Fan, Y., Zhou, Y.: Image super-resolution with non-local sparse attention. In: CVPR, pp. 3517–3526 (2021)
17. Ramachandran, P., Parmar, N., Vaswani, A., Bello, I., Levskaya, A., Shlens, J.: Stand-alone self-attention in vision models. In: Advances in Neural Information Processing Systems, vol. 32 (2019)
18. Ren, S., Zhou, D., He, S., Feng, J., Wang, X.: Shunted self-attention via multi-scale token aggregation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 10853–10862 (2022)


19. Roy, A., Saffar, M., Vaswani, A., Grangier, D.: Efficient content-based sparse attention with routing transformers. Trans. Assoc. Comput. Linguist. 9, 53–68 (2021)
20. Vaswani, A., et al.: Attention is all you need. In: Advances in Neural Information Processing Systems, vol. 30 (2017)
21. Wang, P., et al.: KVT: k-NN attention for boosting vision transformers. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds.) Computer Vision - ECCV 2022. LNCS, Part XXIV, vol. 13684, pp. 285–302. Springer, Cham (2022). https://doi.org/10.1007/978-3-031-20053-3_17
22. Wang, W., et al.: CrossFormer: a versatile vision transformer hinging on cross-scale attention. arXiv preprint arXiv:2108.00154 (2021)
23. Wang, Y., Ma, C., Zeng, B.: Multi-decoding deraining network and quasi-sparsity based training. In: CVPR, pp. 13375–13384 (2021)
24. Xia, Z., Pan, X., Song, S., Li, L.E., Huang, G.: Vision transformer with deformable attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 4794–4803 (2022)
25. Yang, Y., Hossain, M.Z., Stone, E.A., Rahman, S.: Exemplar guided deep neural network for spatial transcriptomics analysis of gene expression prediction. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 5039–5048 (2023)
26. Zhao, G., Lin, J., Zhang, Z., Ren, X., Su, Q., Sun, X.: Explicit sparse transformer: concentrated attention through explicit selection. In: ICLR (2020)

Author Index

A Akbari, Hesam 439 Ali, Sarwan 215 Ambegoda, Thanuja D.

203

B Bai, Binhao 176

G Gan, Haitao 125, 400 Gao, Fei 283 Gao, Qian 493 Gong, Jin 579 Gou, Gang 453 Guo, Silu 579 Guo, Yixing 507

C Cai, Yi 312 Chardonnet, Jean-Rémy 325 Charuka, Kaveesh 203 Chen, Cui 594 Chen, Dazhi 453 Chen, Haoran 16 Chen, Jian 74 Chen, Qingliang 260 Chen, Xi 16 Chen, Xiaohe 387 Chen, Yuzhi 3 Chourasia, Prakash 215

H Han, Hua 16 He, Pei 162 He, Ruifang 412 He, Yuanhan 28 Hou, Yuexian 412 Hou, Zengguang 50 Hsi, Yuhan 272 Hu, Junjie 535 Huang, Aimin 535 Huang, Jian 347 Huang, Kai 579 Huang, Kun 412

D Dai, Jingjing 551 Deng, Yaoming 312 Deng, Zelin 162 Ding, Qiqi 312 Ding, Yuanzhe 114 Dong, Jia 50

I Inoue, Sozo

F Fan, Chenyou 535 Fan, Jun 493 Fan, Xinxin 551 Fang, Xueqing 260 Fang, Yu 3 Fenster, Aaron 125 Fujiwara, Kantaro 523

564

J Jia, Haoze 16 Jiang, Haiqi 535 Jiang, Zekai 260 Jiang, Zihao 228 Jiao, Liqun 50 K Korani, Wael

439

L Li, Junli 176 Li, Ning 138, 150



Li, Shaohua 507 Li, Weimin 507 Li, Xiaomeng 272 Li, Ye 50 Li, Yidong 228 Li, Zhan 260 Li, Zhiqiang 467 Li, Ziqiang 523 Liang, Xiaokun 551 Ling, Yehua 579 Liu, Cheng-Lin 86 Liu, Lin 551 Liu, Ni 579 Liu, Shiqi 50 Liu, Wenqiang 28 Liu, Xiaoming 114 Liu, Xinyu 374 Liu, Xuanming 347 Liu, Yang 125 Liu, Yao 74 Liu, Yongbin 426 Liu, Zhida 244 Long, Jinyu 176 Lou, Weidong 283 Luo, Jichang 50 Luo, Ye 374 Luo, Yiqian 138, 150 lv, Yanan 16 M Ma, Yan 50 Madhushan, Pasan 203 Merienne, Frédéric 325 Miao, Borui 357 Mondal, Samrat 300 Murad, Taslim 215 O Ouyang, Chunping 426 Ouyang, Wei 162 P Palmal, Susmita 480 Pan, Yudong 138, 150 Patterson, Murray 215 Q Qiu, Wei

138, 150

R Ran, Longrong 74 Ren, Min-Si 86 S Saha, Sriparna 300, 480 Sahoo, Pranab 300 Sharma, Saksham Kumar 300 Shi, Jianting 28 Shi, Ming 400 Shibata, Tomohiro 564 Slamu, Wushouer 244 Song, Rihui 579 Sun, Guodong 16 Sun, Shiliang 62 Sun, Yaxuan 189 T Tan, Chunyu 101 Tanaka, Gouhei 523 Tang, Panrui 594 Tang, Pei 426 Tang, Qiang 162 Tao, Zhiyong 387 Tayebi, Zahra 215 Teng, Lianwei 347 Tian, Yunfei 101 Tong, Tong 467 Tripathy, Somanath 480 V Victorino, John Noel

564

W Wang, Annan 189 Wang, Bo 412 Wang, Chengliang 74 Wang, Haonan 272 Wang, Ji 400 Wang, Jialin 357 Wang, Jingchao 507 Wang, Mei 387 Wang, Nana 493 Wang, Qiu-Feng 86 Wang, Shanhu 62 Wang, Siru 125 Wang, Tao 50 Wang, Xinrui 260 Wang, Yicheng 374


Wang, Yuyang 325 Wei, Ping 176 Wen, Jiahui 400 Weng, Libo 283 Wickramanayake, Sandareka Wijesooriya, Dineth 203 Wu, Qiaoyun 101 Wu, Shengxi 189 Wu, Xin 312 Wu, Xing 74 X Xia, Wei 125 Xiao, Yewei 347 Xie, Xiaoliang 50 Xie, Yaoqin 551 Xiong, Lianjin 138, 150 Xu, Xinyue 272 Xu, Yunfeng 357 Xu, Zhenwei 244 Y Yang, Zailin 74 Yang, Zhi 400 Yang, Zhibo 176 Yin, Fei 86 Yu, Li 162


Yu, Qing 244 Yuan, Bin 260 Yun, Song 162 203

Z Zeng, Jianliang 260 Zhai, Pengjun 3 Zhang, Bairu 50 Zhang, Chulong 551 Zhang, Guifu 28 Zhang, Guiqiang 28 Zhang, Yangsong 138, 150 Zhang, Yan-Ming 86 Zhang, Yi 412 Zhang, Zuping 594 Zhao, Dongming 412 Zhao, Haining 50 Zhao, Jing 62 Zhao, Shaojie 357 Zhou, Hao 162 Zhou, Ran 125 Zhou, Xiaogen 467 Zhou, Xiaohu 50 Zhou, Yaoyong 244 Zhou, Yuanzhen 387 Zhou, Yun 101 Zhu, Aosu 347