Pattern Recognition and Computer Vision: 6th Chinese Conference, PRCV 2023, Xiamen, China, October 13–15, 2023, Proceedings, Part III (Lecture Notes in Computer Science) 9819984343, 9789819984343

The 13-volume set LNCS 14425-14437 constitutes the refereed proceedings of the 6th Chinese Conference on Pattern Recognition and Computer Vision, PRCV 2023, held in Xiamen, China, during October 13–15, 2023.


English Pages 536 [532] Year 2024



Table of contents:
Preface
Organization
Contents – Part III
Machine Learning
Loss Filtering Factor for Crowd Counting
1 Introduction
2 Related Work
3 Proposed Method
3.1 Background and Motivation
3.2 Loss Filtering Factor
4 Experiments
4.1 Evaluation Metrics
4.2 Datasets
4.3 Neural Network Model
4.4 Experimental Evaluations
4.5 Key Issues and Discussion
5 Conclusions and Future Work
References
Classifier Decoupled Training for Black-Box Unsupervised Domain Adaptation
1 Introduction
2 Related Work
3 Method
3.1 Problem Definition
3.2 Overall Framework
3.3 Classifier Decoupled Training (CDT)
3.4 ETP-Entropy Sampling
4 Experiment
4.1 Setup
4.2 Performance Comparison
4.3 Analysis
5 Conclusion
References
Unsupervised Concept Drift Detection via Imbalanced Cluster Discriminator Learning
1 Introduction
2 Related Works
2.1 Concept Drift Detection
2.2 Imbalance Data Clustering
3 Proposed Method
3.1 Imbalanced Distribution Learning
3.2 Multi-cluster Descriptor Training
3.3 Concept Drift Detection Based on MCD
4 Experiments
4.1 Experimental Setup
4.2 Comparative Results
4.3 Ablation Study
4.4 Study on Imbalance Rate and Drift Severity
5 Conclusion
References
Unsupervised Domain Adaptation for Optical Flow Estimation
1 Introduction
2 Related Work
3 Method
3.1 Overview
3.2 Domain Adaptive Autoencoder
3.3 Incorporating with RAFT
3.4 Overall Objective
3.5 Network Architecture
4 Experiments
4.1 Experimental Setup
4.2 Experiment Results
4.3 Ablation Study
5 Conclusion
References
Continuous Exploration via Multiple Perspectives in Sparse Reward Environment
1 Introduction
2 Related Works
3 Method
3.1 Continuous Exploration via Multiple Perspectives
3.2 Global Reward Model
3.3 Local Reward Model
4 Experiment
4.1 Comparison Algorithms and Evaluation Metrics
4.2 Network Architectures and Hyperparameters
4.3 Experimental Results
5 Conclusion
References
Network Transplanting for the Functionally Modular Architecture
1 Introduction
2 Related Work
3 Network Transplanting
3.1 Space-Projection Problem of Standard Distillation and Jacobian Distillation
3.2 Solution: Learning with Back-Distillation
4 Experiments
4.1 Implementation Details
4.2 Experimental Results and Analysis
5 Conclusion
References
TiAM-GAN: Titanium Alloy Microstructure Image Generation Network
1 Introduction
2 Related Work
2.1 Image Generation
2.2 Mixture Density Network
3 Method
3.1 Framework
3.2 Feature-Fusion CcGAN
3.3 Feature-Extraction-Mapping GAN
4 Experiments
4.1 Dataset
4.2 Metric
4.3 Comparison Experiment
4.4 Ablation Experiment
5 Conclusion
References
A Robust Detection and Correction Framework for GNN-Based Vertical Federated Learning
1 Introduction
2 Related Works
2.1 Attack and Defense in Graph Neural Networks
2.2 Attack and Defense in Vertical Federated Learning
2.3 Attack and Defense in GNN-Based Vertical Federated Learning
3 Methodology
3.1 GNN-Based Vertical Federated Learning
3.2 Threat Model
3.3 Framework Overview
3.4 Malicious Participant Detection
3.5 Malicious Embedding Correction
4 Experiment
4.1 Experiment Settings
4.2 Detection Performance(RQ1)
4.3 Defense Performance(RQ2-RQ3)
5 Conclusion
References
QEA-Net: Quantum-Effects-based Attention Networks
1 Introduction
2 Related Works
2.1 Revisiting QSM
2.2 Attention Mechanism in CNNs
3 Quantum-Effects-based Attention Networks
3.1 Spatial Attention Module Based on Quantum Effects
4 Experiments
4.1 Implementation Details
4.2 Comparisons Using Different Deep CNNs
4.3 Comparisons Using MLP-Mixer
5 Conclusion
References
Learning Scene Graph for Better Cross-Domain Image Captioning
1 Introduction
2 Related Work
2.1 Scene Graph
2.2 Cross Domain Image Captioning
3 Methods
3.1 The Principle of SGCDIC
3.2 Parameters Updating
3.3 Object Geometry Consistency Losses and Semantic Similarity
4 Experiments and Results Analysis
4.1 Datasets and Implementation Details
4.2 Quantitative Comparison
4.3 Qualitative Comparison
5 Conclusion
References
Enhancing Rule Learning on Knowledge Graphs Through Joint Ontology and Instance Guidance
1 Introduction
2 Related Work
2.1 Reasoning with Embeddings
2.2 Reasoning with Rules
3 Methodology
3.1 Framework Details
4 Experiment
4.1 Experiment Settings
4.2 Results
5 Conclusions
References
Explore Across-Dimensional Feature Correlations for Few-Shot Learning
1 Introduction
2 Related Work
2.1 Few-Shot Learning
2.2 Attention Mechanisms in Few-Shot Learning
3 Methodology
3.1 Preliminary
3.2 Overall Framework
3.3 Three-Dimensional Offset Position Encoding (TOPE)
3.4 Across-Dimensional Attention (ADA)
3.5 Learning Object
4 Experiments
4.1 Experiment Setup
4.2 Comparison with State-of-the-art
4.3 Ablation Studies
4.4 Convergence Analysis
4.5 Visualization
5 Conclusion
References
Pairwise-Emotion Data Distribution Smoothing for Emotion Recognition
1 Introduction
2 Method
2.1 Pairwise Data Distribution Smoothing
2.2 CLTNet
3 Experiment
3.1 Dataset and Evaluation Metrics
3.2 Implementation Details
3.3 Validation Experiment
3.4 Ablation Study
4 Conclusion
References
SIEFusion: Infrared and Visible Image Fusion via Semantic Information Enhancement
1 Introduction
2 Method
2.1 Problem Formulation
2.2 Network Architecture
2.3 Loss Function
3 Experiments
3.1 Experimental Configurations
3.2 Results and Analysis
3.3 Ablation Study
3.4 Segmentation Performance
4 Conclusion
References
DeepChrom: A Diffusion-Based Framework for Long-Tailed Chromatin State Prediction
1 Introduction
2 Related Work
2.1 Chromatin State Prediction
2.2 Long-Tailed Learning
3 Methods
3.1 Methodology Overview
3.2 Pseudo Sequences Generation
3.3 Chromatin State Prediction
3.4 Equalization Loss
4 Experiments
4.1 Experimental Settings
4.2 Effectiveness of Our Proposed Long-Tailed Learning Methods
4.3 Ablation Study
5 Conclusion and Discussion
References
Adaptable Conservative Q-Learning for Offline Reinforcement Learning
1 Introduction
2 Related Work
3 Preliminaries
4 Methodology
4.1 Adaptable Conservative Q-Learning
4.2 Variants and Practical Object
4.3 Implementation Settings
5 Experiments
5.1 Experimental Details
5.2 Q-Value Distribution and Effect of the Percentile
5.3 Deep Offline RL Benchmarks
5.4 Ablation Study
6 Conclusion
References
Boosting Out-of-Distribution Detection with Sample Weighting
1 Introduction
2 Related Work
3 Methodology
3.1 Problem Setting
3.2 Weighted Distance as Score Function
3.3 Contrastive Training for OOD Detection
4 Experiment
4.1 Common Setup
4.2 Main Results
4.3 Ablation Studies
5 Conclusion
References
Causal Discovery via the Subsample Based Reward and Punishment Mechanism
1 Introduction
2 Related Work
3 Introduction to the Algorithmic Framework
3.1 Introduction to SRPM Method
3.2 Correlation Measures and Hypothesis Testing
3.3 Skeleton Discovery Algorithm Based on SRPM Method (SRPM-SK)
4 Experimental Results and Analysis
4.1 Benchmark Network and Data Sets
4.2 High Dimensional Network Analysis
4.3 Real Data Analysis
5 Conclusion and Outlook
References
Local Neighbor Propagation Embedding
1 Introduction
2 Related Work
3 Local Neighbor Propagation Embedding
3.1 Motivation
3.2 Mathematical Background
3.3 Local Neighbor Propagation Framework
3.4 Computational Complexity
4 Experimental Results
4.1 Synthetic Datasets
4.2 Real-World Datasets
5 Conclusion
References
Inter-class Sparsity Based Non-negative Transition Sub-space Learning
1 Introduction
2 Related Work
2.1 Notations
2.2 StLSR
2.3 ICS_DLSR
2.4 SN-TSL
3 The Proposed Method
3.1 Problem Formulation and Learning Model
3.2 Solution to ICSN-TSL
3.3 Classification
3.4 Computational Time Complexity
3.5 Convergence Analysis
4 Experiments and Analysis
4.1 Data Sets
4.2 Experimental Results and Analysis
4.3 Parameter Sensitivity Analysis
4.4 Ablation Study
5 Conclusion
References
Incremental Learning Based on Dual-Branch Network
1 Introduction
2 Related Work
3 Method
3.1 Problem Description
3.2 Baseline Method
3.3 Model Extension
3.4 Two Distillation
3.5 Two Stage Training
3.6 Sample Update Policy
4 Experiments
4.1 Baseline Result
4.2 Result on Imagenet100
5 Conclusion
References
Inter-image Discrepancy Knowledge Distillation for Semantic Segmentation
1 Introduction
2 Method
2.1 Notations
2.2 Overview
2.3 Attention Discrepancy Distillation
2.4 Soft Probability Distillation
2.5 Optimization
3 Experiments
3.1 Datasets and Setups
3.2 Comparisons with Recent Methods
3.3 Ablation Studies
4 Conclusion
References
Vision Problems in Robotics, Autonomous Driving
Cascaded Bilinear Mapping Collaborative Hybrid Attention Modality Fusion Model
1 Introduction
2 Related Work
3 Method
3.1 Overall Architecture
3.2 Proposed Cascade Bilinear Mapping Hybrid Attention Fusion
4 Experiments
4.1 Implementation Details
4.2 State-of-the-Art Comparison
4.3 Performance Analysis
5 Conclusion
References
CasFormer: Cascaded Transformer Based on Dynamic Voxel Pyramid for 3D Object Detection from Point Clouds
1 Introduction
2 Related Work
2.1 CNN-Based 3-D Object Detection
2.2 Transformer-Based 3-D Object Detection
3 Methods
3.1 Overview
3.2 Cascaded Transformer Refinement
3.3 Training Losses
4 Experiments
4.1 Dataset
4.2 Implementation Details
4.3 Detection Results on the KITTI
4.4 Detection Results on the Waymo
4.5 Ablation Studies
5 Conclusion
References
Generalizable and Accurate 6D Object Pose Estimation Network
1 Introduction
2 Related Work
2.1 Direct Methods
2.2 Geometric Methods
3 Method
3.1 Random Offset Distraction
3.2 Full Convolution Asymmetric Feature Extractor
3.3 Pose Estimation
3.4 Loss Functions
4 Experiments
4.1 Datasets
4.2 Metrics
4.3 Results
4.4 Ablation Study
5 Conclusion
References
An Internal-External Constrained Distillation Framework for Continual Semantic Segmentation
1 Introduction
2 Related Work
3 Method
3.1 Problem Definition and Notation
3.2 Internal-External Constrained Distillation Framework
3.3 Multi-information-based Internal Feature Distillation Module
3.4 Attention-Based Extrinsic Feature Distillation Module
4 Experiments
4.1 Experimental Setups
4.2 Experimental Results
4.3 Ablation Study
5 Conclusion
References
MTD: Multi-Timestep Detector for Delayed Streaming Perception
1 Introduction
2 The Methods
2.1 Delay Analysis Module (DAM)
2.2 Timestep Branch Module (TBM)
3 Experiments and Evaluation
3.1 Experimental Settings
3.2 Experimental Results
4 Conclusion
References
Semi-Direct SLAM with Manhattan for Indoor Low-Texture Environment
1 Introduction
2 Related Work
3 System Overview
3.1 Tracking
3.2 Loop Closing
4 Experiments
4.1 Pose Estimation
4.2 Loop Closure
5 Conclusions
References
L2T-BEV: Local Lane Topology Prediction from Onboard Surround-View Cameras in Bird's Eye View Perspective
1 Introduction
2 Related Works
2.1 Road Network Extraction
2.2 BEV Representation
3 Methodology
3.1 Overview
3.2 Lane Topologies Representation
3.3 Image Encoder
3.4 Lane Graph Detection
3.5 Loss Function
4 Experiments
4.1 Experimental Settings
4.2 Comparison with Other Methods
4.3 Ablation
4.4 Limitations and Future Work
5 Conclusion
References
CCLane: Concise Curve Anchor-Based Lane Detection Model with MLP-Mixer
1 Introduction
2 Related Work
2.1 Segmentation-Based Methods
2.2 Polynomial Curve-Based Methods
2.3 Row-Wise-Based Methods
2.4 Line-Anchor-Based Methods
3 Method
3.1 The Lane Representation
3.2 Global Information Extraction Module
3.3 Local Information Extraction Module
3.4 Loss Function
3.5 Training and Inference Details
4 Experiment
4.1 Datasets and Evaluation Metric
4.2 Results
5 Ablation Study
6 Conclusion
References
Uncertainty-Aware Boundary Attention Network for Real-Time Semantic Segmentation
1 Introduction
2 Related Work
3 Method
3.1 Low-Level Guided Attention Feature Fusion Module
3.2 Uncertainty Modeling
3.3 Uncertainty-Aware Boundary Attention Network
4 Experiments
4.1 Dataset
4.2 Implementation Details
4.3 Comparisons
4.4 Ablation Study
5 Conclusion
References
URFormer: Unified Representation LiDAR-Camera 3D Object Detection with Transformer
1 Introduction
2 Methodology
2.1 Depth-Aware Lift Module
2.2 Sparse Transformer
2.3 Unified Representation Fusion
3 Experiments
3.1 Dataset and Evaluation Metrics
3.2 Implementation Detail
3.3 Main Results
3.4 Ablation Studies
4 Conclusion
References
A Single-Stage 3D Object Detection Method Based on Sparse Attention Mechanism
1 Introduction
2 Related Work
3 Method
3.1 The Overall Design of the Framework
3.2 SPConvNet
3.3 DEF
3.4 Loss Function
4 Experiments
4.1 Datasets
4.2 Implementation Details
4.3 Results on Datasets
4.4 Ablation Study
4.5 Visualization
5 Conclusion
References
WaRoNav: Warehouse Robot Navigation Based on Multi-view Visual-Inertial Fusion
1 Introduction
2 System Overview
3 WaRoNav Initialization
3.1 Initialization on {CF} and {CD}
3.2 IMU Initialization
3.3 3D Point Initialization
4 WaRoNav Multi-view Fusion
4.1 WaRoNav-MVF Frontend
4.2 WaRoNav-MVF Backend
5 Experiments
5.1 Platform
5.2 Results and Analysis
6 Conclusion
References
Enhancing Lidar and Radar Fusion for Vehicle Detection in Adverse Weather via Cross-Modality Semantic Consistency
1 Introduction
2 Related Work
3 Proposed Method
3.1 Losses
3.2 Ground Removal
3.3 Laser-ID Based Slicing
4 Experiments
4.1 Datasets Preparation
4.2 Implementation Details
4.3 Quantitative Results
4.4 Qualitative Results
4.5 Ablation Study
5 Conclusion and Future Work
References
Enhancing Active Visual Tracking Under Distractor Environments
1 Introduction
2 Related Work
2.1 Active Visual Tracking
2.2 Curiosity Exploration
3 Method
3.1 AS-RND Intrinsic Reward Mechanism
3.2 Distractor-Occlusion Avoidance Reward
3.3 Reward Structure
3.4 Classification Score Map Prediction Module
4 Experiments and Analysis
4.1 Comparison Experiments with Baseline
4.2 Ablation Study
4.3 Analysis of the Intrinsic Reward Mechanism of AS-RND
4.4 Analysis Experiment of Distractor-Occlusion Avoidance Reward
5 Conclusion
References
Cross-Modal and Cross-Domain Knowledge Transfer for Label-Free 3D Segmentation
1 Introduction
2 Related Work
3 Methodology
3.1 Problem Definition
3.2 Method Overview
3.3 Data Pre-processing
3.4 Instance-Wise Relationship Extraction
3.5 Feature Alignment
3.6 Loss
4 Experiments
4.1 Datasets
4.2 Metrics and Implementation Details
4.3 Comparison
4.4 Ablation Studies
5 Conclusion
References
Cross-Task Physical Adversarial Attack Against Lane Detection System Based on LED Illumination Modulation
1 Introduction
2 Related Work
3 Problem Analysis
4 The Proposed Scheme
4.1 Threat Model
4.2 Physical Adversarial Attack Based on LED Illumination Modulation
4.3 Search Space for Adversarial Perturbation Parameters
5 Experimental Results and Discussion
5.1 Experimental Setup
5.2 Experimental Results of Digital World Attack
5.3 Experimental Results of Physical World Attack
6 Conclusion
References
RECO: Rotation Equivariant COnvolutional Neural Network for Human Trajectory Forecasting
1 Introduction
2 Related Work
2.1 Trajectory Analysis and Forecasting
2.2 Multimodality Forecast
2.3 Group Equivariant Deep Learning
3 Approach
3.1 Method Overview
3.2 Rotation Equivariant Convolutional Network
3.3 Loss Function
4 Experiments
4.1 Dataset and Evaluation Metrics
4.2 Implementation Details
4.3 Results
5 Conclusion
References
FGFusion: Fine-Grained Lidar-Camera Fusion for 3D Object Detection
1 Introduction
2 Related Work
2.1 LiDAR-Only 3D Detection
2.2 Fusion-Based 3D Detection
3 FGFusion
3.1 Motivations and Pipeline
3.2 Camera Stream Architecture
3.3 LiDAR Stream Architecture
3.4 Multi-scale Fusion Module
4 Experiments
4.1 Datasets
4.2 Implementation Details
4.3 Experimental Results and Analysis
4.4 Ablation Study
5 Conclusion
References
Author Index

LNCS 14427

Qingshan Liu · Hanzi Wang · Zhanyu Ma · Weishi Zheng · Hongbin Zha · Xilin Chen · Liang Wang · Rongrong Ji (Eds.)

Pattern Recognition and Computer Vision 6th Chinese Conference, PRCV 2023 Xiamen, China, October 13–15, 2023 Proceedings, Part III

Lecture Notes in Computer Science
Founding Editors: Gerhard Goos, Juris Hartmanis

Editorial Board Members: Elisa Bertino, Purdue University, West Lafayette, IN, USA; Wen Gao, Peking University, Beijing, China; Bernhard Steffen, TU Dortmund University, Dortmund, Germany; Moti Yung, Columbia University, New York, NY, USA


The series Lecture Notes in Computer Science (LNCS), including its subseries Lecture Notes in Artificial Intelligence (LNAI) and Lecture Notes in Bioinformatics (LNBI), has established itself as a medium for the publication of new developments in computer science and information technology research, teaching, and education. LNCS enjoys close cooperation with the computer science R & D community, the series counts many renowned academics among its volume editors and paper authors, and collaborates with prestigious societies. Its mission is to serve this international community by providing an invaluable service, mainly focused on the publication of conference and workshop proceedings and postproceedings. LNCS commenced publication in 1973.

Qingshan Liu · Hanzi Wang · Zhanyu Ma · Weishi Zheng · Hongbin Zha · Xilin Chen · Liang Wang · Rongrong Ji Editors

Pattern Recognition and Computer Vision 6th Chinese Conference, PRCV 2023 Xiamen, China, October 13–15, 2023 Proceedings, Part III

Editors

Qingshan Liu, Nanjing University of Information Science and Technology, Nanjing, China
Hanzi Wang, Xiamen University, Xiamen, China
Zhanyu Ma, Beijing University of Posts and Telecommunications, Beijing, China
Weishi Zheng, Sun Yat-sen University, Guangzhou, China
Hongbin Zha, Peking University, Beijing, China
Xilin Chen, Chinese Academy of Sciences, Beijing, China
Liang Wang, Chinese Academy of Sciences, Beijing, China
Rongrong Ji, Xiamen University, Xiamen, China

ISSN 0302-9743    ISSN 1611-3349 (electronic)
Lecture Notes in Computer Science
ISBN 978-981-99-8434-3    ISBN 978-981-99-8435-0 (eBook)
https://doi.org/10.1007/978-981-99-8435-0

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd. The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721, Singapore

Paper in this product is recyclable.

Preface

Welcome to the proceedings of the Sixth Chinese Conference on Pattern Recognition and Computer Vision (PRCV 2023), held in Xiamen, China. PRCV is formed from the combination of two distinguished conferences: CCPR (Chinese Conference on Pattern Recognition) and CCCV (Chinese Conference on Computer Vision). Both have consistently been the top-tier conferences in the fields of pattern recognition and computer vision within China's academic field. Recognizing the intertwined nature of these disciplines and their overlapping communities, the union into PRCV aims to reinforce the prominence of the Chinese academic sector in these foundational areas of artificial intelligence and enhance academic exchanges. Accordingly, PRCV is jointly sponsored by China's leading academic institutions: the Chinese Association for Artificial Intelligence (CAAI), the China Computer Federation (CCF), the Chinese Association of Automation (CAA), and the China Society of Image and Graphics (CSIG).

PRCV's mission is to serve as a comprehensive platform for dialogues among researchers from both academia and industry. While its primary focus is to encourage academic exchange, it also places emphasis on fostering ties between academia and industry. With the objective of keeping abreast of leading academic innovations and showcasing the most recent research breakthroughs, pioneering thoughts, and advanced techniques in pattern recognition and computer vision, esteemed international and domestic experts have been invited to present keynote speeches, introducing the most recent developments in these fields.

PRCV 2023 was hosted by Xiamen University. From our call for papers, we received 1420 full submissions. Each paper underwent rigorous reviews by at least three experts, either from our dedicated Program Committee or from other qualified researchers in the field. After thorough evaluations, 522 papers were selected for the conference, comprising 32 oral presentations and 490 posters, giving an acceptance rate of 37.46%. The proceedings of PRCV 2023 are proudly published by Springer.

Our heartfelt gratitude goes out to our keynote speakers: Zongben Xu from Xi'an Jiaotong University, Yanning Zhang of Northwestern Polytechnical University, Shutao Li of Hunan University, Shi-Min Hu of Tsinghua University, and Tiejun Huang from Peking University. We give sincere appreciation to all the authors of submitted papers, the members of the Program Committee, the reviewers, and the Organizing Committee. Their combined efforts have been instrumental in the success of this conference. A special acknowledgment goes to our sponsors and the organizers of various special forums; their support made the conference a success. We also express our thanks to Springer for taking on the publication and to the staff of Springer Asia for their meticulous coordination efforts.


We hope these proceedings will be both enlightening and enjoyable for all readers. October 2023

Qingshan Liu Hanzi Wang Zhanyu Ma Weishi Zheng Hongbin Zha Xilin Chen Liang Wang Rongrong Ji

Organization

General Chairs

Hongbin Zha, Peking University, China
Xilin Chen, Institute of Computing Technology, Chinese Academy of Sciences, China
Liang Wang, Institute of Automation, Chinese Academy of Sciences, China
Rongrong Ji, Xiamen University, China

Program Chairs

Qingshan Liu, Nanjing University of Information Science and Technology, China
Hanzi Wang, Xiamen University, China
Zhanyu Ma, Beijing University of Posts and Telecommunications, China
Weishi Zheng, Sun Yat-sen University, China

Organizing Committee Chairs

Mingming Cheng, Nankai University, China
Cheng Wang, Xiamen University, China
Yue Gao, Tsinghua University, China
Mingliang Xu, Zhengzhou University, China
Liujuan Cao, Xiamen University, China

Publicity Chairs

Yanyun Qu, Xiamen University, China
Wei Jia, Hefei University of Technology, China


Local Arrangement Chairs

Xiaoshuai Sun, Xiamen University, China
Yan Yan, Xiamen University, China
Longbiao Chen, Xiamen University, China

International Liaison Chairs

Jingyi Yu, ShanghaiTech University, China
Jiwen Lu, Tsinghua University, China

Tutorial Chairs

Xi Li, Zhejiang University, China
Wangmeng Zuo, Harbin Institute of Technology, China
Jie Chen, Peking University, China

Thematic Forum Chairs

Xiaopeng Hong, Harbin Institute of Technology, China
Zhaoxiang Zhang, Institute of Automation, Chinese Academy of Sciences, China
Xinghao Ding, Xiamen University, China

Doctoral Forum Chairs

Shengping Zhang, Harbin Institute of Technology, China
Zhou Zhao, Zhejiang University, China

Publication Chair

Chenglu Wen, Xiamen University, China

Sponsorship Chair

Yiyi Zhou, Xiamen University, China


Exhibition Chairs

Bineng Zhong, Guangxi Normal University, China
Rushi Lan, Guilin University of Electronic Technology, China
Zhiming Luo, Xiamen University, China

Program Committee

Baiying Lei, Shenzhen University, China
Changxin Gao, Huazhong University of Science and Technology, China
Chen Gong, Nanjing University of Science and Technology, China
Chuanxian Ren, Sun Yat-Sen University, China
Dong Liu, University of Science and Technology of China, China
Dong Wang, Dalian University of Technology, China
Haimiao Hu, Beihang University, China
Hang Su, Tsinghua University, China
Hui Yuan, School of Control Science and Engineering, Shandong University, China
Jie Qin, Nanjing University of Aeronautics and Astronautics, China
Jufeng Yang, Nankai University, China
Lifang Wu, Beijing University of Technology, China
Linlin Shen, Shenzhen University, China
Nannan Wang, Xidian University, China
Qianqian Xu, Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, China
Quan Zhou, Nanjing University of Posts and Telecommunications, China
Si Liu, Beihang University, China
Xi Li, Zhejiang University, China
Xiaojun Wu, Jiangnan University, China
Zhenyu He, Harbin Institute of Technology (Shenzhen), China
Zhonghong Ou, Beijing University of Posts and Telecommunications, China

Contents – Part III

Machine Learning

Loss Filtering Factor for Crowd Counting . . . 3
    Yufeng Chen

Classifier Decoupled Training for Black-Box Unsupervised Domain Adaptation . . . 16
    Xiangchuang Chen, Yunhang Shen, Xuan Luo, Yan Zhang, Ke Li, and Shaohui Lin

Unsupervised Concept Drift Detection via Imbalanced Cluster Discriminator Learning . . . 31
    Mingjie Zhao, Yiqun Zhang, Yuzhu Ji, and Yang Lu

Unsupervised Domain Adaptation for Optical Flow Estimation . . . 44
    Jianpeng Ding, Jinhong Deng, Yanru Zhang, Shaohua Wan, and Lixin Duan

Continuous Exploration via Multiple Perspectives in Sparse Reward Environment . . . 57
    Zhongpeng Chen and Qiang Guan

Network Transplanting for the Functionally Modular Architecture . . . 69
    Quanshi Zhang, Xu Cheng, Xin Wang, Yu Yang, and Yingnian Wu

TiAM-GAN: Titanium Alloy Microstructure Image Generation Network . . . 84
    Zhixuan Zhang, Fusheng Jin, Haichao Gong, and Qunbo Fan

A Robust Detection and Correction Framework for GNN-Based Vertical Federated Learning . . . 97
    Zhicheng Yang, Xiaoliang Fan, Zheng Wang, Zihui Wang, and Cheng Wang

QEA-Net: Quantum-Effects-based Attention Networks . . . 109
    Juntao Zhang, Jun Zhou, Hailong Wang, Yang Lei, Peng Cheng, Zehan Li, Hao Wu, Kun Yu, and Wenbo An

Learning Scene Graph for Better Cross-Domain Image Captioning . . . 121
    Junhua Jia, Xiaowei Xin, Xiaoyan Gao, Xiangqian Ding, and Shunpeng Pang


Enhancing Rule Learning on Knowledge Graphs Through Joint Ontology and Instance Guidance . . . 138
    Xianglong Bao, Zhe Wang, Kewen Wang, Xiaowang Zhang, and Hutong Wu

Explore Across-Dimensional Feature Correlations for Few-Shot Learning . . . 151
    Tianyuan Li and Wenming Cao

Pairwise-Emotion Data Distribution Smoothing for Emotion Recognition . . . 164
    Hexin Jiang, Xuefeng Liang, Wenxin Xu, and Ying Zhou

SIEFusion: Infrared and Visible Image Fusion via Semantic Information Enhancement . . . 176
    Guohua Lv, Wenkuo Song, Zhonghe Wei, Jinyong Cheng, and Aimei Dong

DeepChrom: A Diffusion-Based Framework for Long-Tailed Chromatin State Prediction . . . 188
    Yuhang Liu, Zixuan Wang, Jiaheng Lv, and Yongqing Zhang

Adaptable Conservative Q-Learning for Offline Reinforcement Learning . . . 200
    Lyn Qiu, Xu Li, Lenghan Liang, Mingming Sun, and Junchi Yan

Boosting Out-of-Distribution Detection with Sample Weighting . . . 213
    Ao Ke, Wenlong Chen, Chuanwen Feng, and Xike Xie

Causal Discovery via the Subsample Based Reward and Punishment Mechanism . . . 224
    Jing Yang, Ting Lu, and Fan Kuai

Local Neighbor Propagation Embedding . . . 239
    Wenduo Ma, Hengzhi Yu, Shenglan Liu, Yang Yu, and Yunheng Li

Inter-class Sparsity Based Non-negative Transition Sub-space Learning . . . 250
    Miaojun Li, Shuping Zhao, Jigang Wu, and Siyuan Ma

Incremental Learning Based on Dual-Branch Network . . . 263
    Mingda Dong, Zhizhong Zhang, and Yuan Xie

Inter-image Discrepancy Knowledge Distillation for Semantic Segmentation . . . 273
    Kaijie Chen, Jianping Gou, and Lin Li


Vision Problems in Robotics, Autonomous Driving

Cascaded Bilinear Mapping Collaborative Hybrid Attention Modality Fusion Model . . . 287
    Jiayu Zhao and Kuizhi Mei

CasFormer: Cascaded Transformer Based on Dynamic Voxel Pyramid for 3D Object Detection from Point Clouds . . . 299
    Xinglong Li and Xiaowei Zhang

Generalizable and Accurate 6D Object Pose Estimation Network . . . 312
    Shouxu Fu, Xiaoning Li, Xiangdong Yu, Lu Cao, and Xingxing Li

An Internal-External Constrained Distillation Framework for Continual Semantic Segmentation . . . 325
    Qingsen Yan, Shengqiang Liu, Xing Zhang, Yu Zhu, Jinqiu Sun, and Yanning Zhang

MTD: Multi-Timestep Detector for Delayed Streaming Perception . . . 337
    Yihui Huang and Ningjiang Chen

Semi-Direct SLAM with Manhattan for Indoor Low-Texture Environment . . . 350
    Zhiwen Zheng, Qi Zhang, He Wang, and Ru Li

L2T-BEV: Local Lane Topology Prediction from Onboard Surround-View Cameras in Bird's Eye View Perspective . . . 363
    Shanding Ye, Tao Li, Ruihang Li, and Zhijie Pan

CCLane: Concise Curve Anchor-Based Lane Detection Model with MLP-Mixer . . . 376
    Fan Yang, Yanan Zhao, Li Gao, Huachun Tan, Weijin Liu, Xue-mei Chen, and Shijuan Yang

Uncertainty-Aware Boundary Attention Network for Real-Time Semantic Segmentation . . . 388
    Yuanbing Zhu, Bingke Zhu, Yingying Chen, and Jinqiao Wang

URFormer: Unified Representation LiDAR-Camera 3D Object Detection with Transformer . . . 401
    Guoxin Zhang, Jun Xie, Lin Liu, Zhepeng Wang, Kuihe Yang, and Ziying Song

A Single-Stage 3D Object Detection Method Based on Sparse Attention Mechanism . . . 414
    Songche Jia and Zhenyu Zhang


WaRoNav: Warehouse Robot Navigation Based on Multi-view Visual-Inertial Fusion . . . 426
    Yinlong Zhang, Bo Li, Yuanhao Liu, and Wei Liang

Enhancing Lidar and Radar Fusion for Vehicle Detection in Adverse Weather via Cross-Modality Semantic Consistency . . . 439
    Yu Du, Ting Yang, Qiong Chang, Wei Zhong, and Weimin Wang

Enhancing Active Visual Tracking Under Distractor Environments . . . 452
    Qianying Ouyang, Chenran Zhao, Jing Xie, Zhang Biao, Tongyue Li, Yuxi Zheng, and Dianxi Shi

Cross-Modal and Cross-Domain Knowledge Transfer for Label-Free 3D Segmentation . . . 465
    Jingyu Zhang, Huitong Yang, Dai-Jie Wu, Jacky Keung, Xuesong Li, Xinge Zhu, and Yuexin Ma

Cross-Task Physical Adversarial Attack Against Lane Detection System Based on LED Illumination Modulation . . . 478
    Junbin Fang, Zewei Yang, Siyuan Dai, You Jiang, Canjian Jiang, Zoe L. Jiang, Chuanyi Liu, and Siu-Ming Yiu

RECO: Rotation Equivariant COnvolutional Neural Network for Human Trajectory Forecasting . . . 492
    Jijun Cheng, Hao Wang, Dongheng Shao, Jian Yang, Mingsong Chen, Xian Wei, and Xuan Tang

FGFusion: Fine-Grained Lidar-Camera Fusion for 3D Object Detection . . . 505
    Zixuan Yin, Han Sun, Ningzhong Liu, Huiyu Zhou, and Jiaquan Shen

Author Index . . . 519

Machine Learning

Loss Filtering Factor for Crowd Counting

Yufeng Chen
Uppsala University, Uppsala 75105, Sweden
[email protected]

Abstract. In crowd counting datasets, each person is annotated by a point, typically representing the center of the head. However, due to dense crowds, varied scenarios, significant occlusion and low resolution, label noise in the dataset is inevitable, and such label noise has a negative impact on the performance of the model. To alleviate the negative effects of label noise, in this paper we propose the Loss Filtering Factor, which can filter out the losses assumed to be caused by label noise during the training process. By doing so, the model can prioritize non-noise data and focus on it during training and prediction. Extensive experimental evaluations demonstrate that the proposed Loss Filtering Factor consistently improves the performance of all models across all datasets used in the experiments. On average, it leads to a 5.48% improvement in MAE and 6.43% in MSE. Moreover, the proposed approach is universal and can be easily integrated into any neural network model architecture to improve performance.

Keywords: crowd counting · noisy labels · loss function

1 Introduction

Crowd counting is a fundamental computer vision task that automatically estimates the number of people in an unconstrained scene. In recent years, crowd counting has been widely studied because of its significance in various applications, such as crowd analysis [2,7] and video surveillance [14]. The task is challenging due to overlaps, occlusions and variations in perspective and illumination. In the past decade, many algorithms have been proposed. Early methods estimate crowd counts through individual detection. Currently, methods that cast crowd counting as a density map estimation problem and combine it with convolutional neural networks (CNNs) have made remarkable progress [5,7,10]. In these methods, the values of the crowd density map regressed by the CNN are summed to give the total size of the crowd.


Nowadays, widely used public datasets [2,3,15] only provide point annotations, i.e., only one pixel of each person is labeled (typically the center of the head). The traditional approach is to convert these point annotations into density maps using a Gaussian kernel and to treat the density map as the "ground truth". The model is then trained by regressing the value at each pixel of the density map. Currently, many methods [7,10,15] focus on improving the process of converting point annotations to density maps and have achieved good performance.

However, it is important to note that the quality of datasets and the accuracy of point annotations play a critical role in developing effective crowd count estimators. It should be acknowledged that human annotators manually label the points for all datasets; every person in each image requires manual labeling. Due to factors like overlap, low resolution, and high crowd density near the vanishing point, mislabeled point annotations and spatial errors can occur, and the small size of annotations compared to human heads leads to imperfect correspondence. These "unknown" mislabeled point annotations and spatial errors have a negative impact on crowd count estimator training.

To alleviate the negative influences caused by annotation noise, we propose a Loss Filtering Factor (LFF) based on the hypothesis that, under certain conditions, the trained model predicts more correct signals than human-labeled annotations. During the model training process, the LFF helps filter out pixel-level losses that are likely caused by annotation noise. This allows the model to prioritize losses that are more meaningful and not affected by noise. Through extensive experimental evaluations on popular publicly available datasets including UCF-QNRF [3], ShanghaiTech [15], UCF CC 50 [2] and RGBT-CC [6], we demonstrate that the proposed Loss Filtering Factor improves the performance of various backbone networks used for crowd counting.
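As background for the density-map formulation discussed above, the following is a minimal sketch of the conventional fixed-kernel conversion from point annotations to a density map. It is only an illustration; the function name and the kernel width sigma are our own choices, not taken from this paper.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def points_to_density_map(points, height, width, sigma=4.0):
    """Place a unit impulse at each annotated head point and blur it with a
    fixed Gaussian kernel; the resulting map integrates to roughly the count."""
    density = np.zeros((height, width), dtype=np.float32)
    for x, y in points:
        col, row = int(round(x)), int(round(y))
        if 0 <= row < height and 0 <= col < width:
            density[row, col] += 1.0
    return gaussian_filter(density, sigma=sigma)

# Example: three annotated heads in a 480 x 640 image.
dm = points_to_density_map([(10.3, 20.7), (300, 240), (639, 479)], 480, 640)
print(dm.sum())  # approximately 3
```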

2 Related Work

Crowd Counting. Crowd counting is a key problem in computer vision and has been extensively studied. Methods for addressing this problem can be categorized into three groups: detection, direct count regression, and point supervision. Early methods primarily detect individuals (persons, heads, or upper bodies) in images, but face challenges in detecting crowds with high density; moreover, significant occlusion necessitates bounding box annotation, which is time-consuming and ambiguous. Later methods [3] overcome this issue by regressing to a density map based on point annotations, referred to as the "ground truth". These methods use location information to learn a density map for each training sample. However, they assume crowd uniformity, whereas real-world crowd distribution in images is often uneven due to imaging conditions and camera angles. Recently, many works [7,12] have proposed that instead of generating density maps, point supervision be used directly. Further works focus on optimal transport [8] and weak supervision [11] without depending on the assumption of a Gaussian distribution.


Noisy Labels. The effectiveness of deep neural networks relies on having access to high-quality labeled training data. Label noise in the training data can significantly impair a model's performance on clean test data [13]. Unfortunately, samples with faulty or incorrect labels are virtually always present in large training datasets. Recently, researchers have increasingly focused on this issue: [1] proposed a Co-teaching model for combating noisy labels, and [4] offered a deeper understanding of deep learning on non-synthetic noisy labels. These methods do not transfer well to crowd counting due to limited datasets and high computational expense. On the other hand, deep learning models have shown superior performance to humans in tasks such as image classification [17] and speech recognition [9]. Therefore, in crowd counting problems, especially high-density tasks, we may expect that, under certain conditions, existing deep learning models predict more true signals than human-labeled annotations.

Fig. 1. This figure highlights the label noise issues present in existing dense crowd datasets. In (a), we observe cases where the annotations are located in other parts of the body rather than the center of the head, as well as annotations positioned outside of the body. On the other hand, (b) showcases examples of both duplicate and missing annotations, further emphasizing the challenges posed by label noise.

3 Proposed Method

3.1 Background and Motivation

In the realm of crowd counting tasks, the presence of label noise within datasets is an inescapable reality. This noise stems from the intricacies of dense crowds, a multitude of scenarios, and substantial occlusions. Widely-used benchmarks in the crowd counting domain exhibit notably high crowd densities, thereby introducing complexities in maintaining annotation consistency and accuracy for key points. The comprehensive data statistics for dense crowd counting across various multi-scene datasets are encapsulated in Table 1. Notably, a significant majority of these datasets feature an average of over a hundred individuals per image.

Table 1. Average count and average density of popular crowd datasets

Dataset          Average Count   Average Density
UCF CC 50        1279            2.02 × 10^-4
UCF-QNRF         815             1.12 × 10^-4
ShanghaiTech A   501             9.33 × 10^-4
ShanghaiTech B   123             1.56 × 10^-4
RGBT-CC          68              2.21 × 10^-4

On the other hand, based on observation, the label noise in popular crowd benchmarks can be divided into two main categories: spatial noise and quantity noise. In the dataset, for the same images, the labeled points may vary in their relative locations. Some points represent the pixel at the center of the head, while others are randomly placed within the person, such as on the chest or waist. These inaccuracies in annotation are referred to as spatial noise, as depicted in Fig. 1(a). Moreover, the scarcity of high-resolution crowd images contributes to quantity noise in the dataset, especially in low-resolution images with high crowd density. This noise encompasses missing annotations and duplicated annotations, as illustrated in Fig. 1(b). Both of the aforementioned types of label noise will undoubtedly have an impact on the training and performance of deep learning models to some extent [13].

3.2 Loss Filtering Factor

To alleviate the negative effects of label noise, this paper introduces the Loss Filtering Factor. It is based on the assumption that, under certain conditions, the trained neural network model can generate more accurate predictions than human-labeled annotations. The proposed Loss Filtering Factor serves as a filtering mechanism to identify and discard losses that are likely caused by label noise, such as quantity noise and spatial noise, during the training process.


Fig. 2. Comparison of training with (right) and without (left) the Loss Filtering Factor

Figure 2 shows a simple 2-D example of training with and without the Loss Filtering Factor. During the training process, the predicted values of the model in epoch T will be closer to the label values than in epoch T − 1, i.e., the total loss in epoch T is less than in epoch T − 1. When training without the Loss Filtering Factor, the model treats the losses caused by label noise as regular losses; therefore, even though the total loss is reduced, the non-noise losses are not always minimized. When training with the Loss Filtering Factor, the losses believed to be caused by label noise are filtered out, and as a result both the total loss and the non-noise losses decrease. The proposed method employs the mask $M = [m^j]_{j=1}^{N}$ to selectively supervise the training of the neural network model. It can be defined as:

L_{FF} = \sum_{j=1}^{N} m^{j} \cdot l^{j}    (1)

where $L = [l^j]_{j=1}^{N}$ denotes the deviations between the predicted values and the label values. For example, for methods based on point supervision, $l^j$ is the difference between each predicted point value and the corresponding label point value, while for density-map-based methods, $l^j$ is the difference between the predicted density map and the generated density map. Obviously, the value of N is determined by the loss function: in the MSE loss, N equals the size of the density map, while in the Bayesian Loss, N equals the number of annotated points. Considering the inevitable presence of label noise within datasets, and the occasional instances where the trained model generates more accurate signals than the annotations themselves, the mask M serves as a mechanism to selectively filter losses. This, in turn, prevents these losses from participating in the back-propagation process during model training. We consider the deviation $l^j$ between predictions and labels to be a type of label uncertainty. According to the foregoing argument, if $l^j$ is excessively large, it is most likely caused by label noise. In this case, we dynamically diminish or eliminate its weight in back-propagation.


For efficient computation, we adopt M as a binary vector. After calculating all losses $[l^j]_{j=1}^{N}$, we obtain the sorted list $S = [e^j]_{j=1}^{N}$ by sorting $[l^j]_{j=1}^{N}$ in ascending order. Then $m^j$ is defined as follows:

m^{j} = \begin{cases} 1 & \text{if } l^{j} \in \{S(1), S(2), S(3), \ldots, S([\theta N])\} \\ 0 & \text{otherwise} \end{cases}    (2)

where θ is the parameter of the Loss Filtering Factor and [θN] denotes the largest integer not greater than θN. Let $F = F(l^j)$ denote the model's loss function. Notably, when the loss filtering factor θ = 1.0, it equates to the absence of loss filtering, implying the direct inclusion of all losses $[l^j]$ within the function F. Conversely, a loss filtering factor of θ = 0.85 implies that 15% of label values are treated as potential noise, allowing only the 85% of label values exhibiting the least deviation from the predictions to contribute to the supervision process. Ultimately, the comprehensive loss function incorporating the Loss Filtering Factor can be defined as follows:

F_{LFF} = F(L_{FF}) = F\Big(\sum_{j=1}^{N} m^{j} \cdot l^{j}\Big)    (3)
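As a concrete illustration of Eqs. (1)–(3), here is a minimal PyTorch sketch of the filtered loss. It is not the authors' released code; the tensor shapes and the toy example are our own assumptions.

```python
import torch

def loss_filtering_factor(per_element_losses: torch.Tensor, theta: float = 0.9) -> torch.Tensor:
    """Keep only the [theta * N] smallest per-element losses (sketch of Eqs. (1)-(3)).

    per_element_losses: 1-D tensor of l^j (per-pixel or per-point deviations).
    theta: loss filtering factor; theta = 1.0 reduces to the unfiltered loss.
    """
    n = per_element_losses.numel()
    k = max(1, int(theta * n))                       # [theta * N]
    # Indices of the k smallest losses, i.e. the entries kept by the binary mask m^j.
    _, kept = torch.topk(per_element_losses, k, largest=False)
    mask = torch.zeros_like(per_element_losses)
    mask[kept] = 1.0
    # L_FF = sum_j m^j * l^j; filtered-out (presumably noisy) losses get zero weight.
    return (mask * per_element_losses).sum()

# Toy usage: the one clearly outlying loss is excluded from back-propagation.
losses = torch.tensor([0.1, 0.2, 0.15, 5.0], requires_grad=True)
loss = loss_filtering_factor(losses, theta=0.75)     # keeps the 3 smallest entries
loss.backward()
print(loss.item(), losses.grad)                      # 0.45, zero gradient for the outlier
```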

4 Experiments

4.1 Evaluation Metrics

Mean Absolute Error (MAE) and Mean Squared Error (MSE) are two extensively used metrics for evaluating crowd count estimation methods. They are defined as follows:

MAE = \frac{1}{K} \sum_{k=1}^{K} \left| N_{k}^{GT} - N_{k} \right|    (4)

MSE = \sqrt{\frac{1}{K} \sum_{k=1}^{K} \left| N_{k}^{GT} - N_{k} \right|^{2}}    (5)

where K is the number of test images, and $N_k^{GT}$ and $N_k$ are the ground-truth count and the estimated count for the k-th image, respectively.
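A small NumPy sketch of these two metrics (the example counts are made up):

```python
import numpy as np

def mae_mse(gt_counts, pred_counts):
    """Eqs. (4) and (5) over K test images; note that the 'MSE' defined above is in
    fact a root-mean-squared error, as is conventional in crowd counting papers."""
    gt = np.asarray(gt_counts, dtype=np.float64)
    pred = np.asarray(pred_counts, dtype=np.float64)
    mae = np.abs(gt - pred).mean()
    mse = np.sqrt(((gt - pred) ** 2).mean())
    return mae, mse

print(mae_mse([100, 250, 80], [110, 230, 85]))  # (11.67, 13.23) approximately
```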


Fig. 3. Among the observed 200 papers, each utilized multiple datasets, averaging 2.73 per publication. Notably, the ShanghaiTech benchmark stands out with a usage rate exceeding 90% in pertinent research; the UCF CC 50 benchmark follows closely with over 80% usage, and the UCF-QNRF dataset reaches over 70% adoption within relevant research circles.

4.2 Datasets

Numerous public crowd counting datasets are currently accessible. To ensure representation, we randomly selected 200 recent publications from the past five years that utilized deep learning to address crowd counting. The results are shown in Fig. 3. Based on the popularity of crowd counting datasets, UCF-QNRF [3], ShanghaiTech [15] and UCF CC 50 [2] are used in this paper. Furthermore, we incorporate RGBT-CC [6], the pioneering large-scale RGBT crowd counting benchmark introduced in 2021.

ShanghaiTech [15] consists of Part A and Part B. Part A contains 300 training images and 182 testing images, while Part B includes 400 training images and 316 testing images. Part A has a significantly higher density than Part B.

UCF CC 50 [2] includes 50 grayscale images with varying but high resolutions; each image has an average resolution of 2013 × 2902. The average count per image is 1,279.

UCF-QNRF [3] is one of the largest crowd counting datasets, with 1,535 images and 1.25 million point annotations. The training set contains 1,201 images, while the remaining 334 images are used for testing.

RGBT-CC [6] is the first publicly available RGBT dataset for crowd counting, containing 2,030 pairs of representative RGB-thermal images, 1,013 of which are captured in light and 1,017 in darkness. Each image has the same size (640 × 480); 1,030 pairs are used for training, 200 pairs for validation and 800 pairs for testing. A total of 138,389 persons are marked with point annotations, on average 68 people per image.

4.3 Neural Network Model

In this experiment, we select several influential crowd counting models and add different Loss Filtering Factors to examine the effectiveness of the proposed method. BL [7] is a loss function that builds a density contribution probability model from point annotations; instead of constraining pixel values in the density map, the BL training loss provides more dependable supervision on the expected count at each annotated point, and many recent studies use BL as their loss function. DMCC [10] uses distribution matching for crowd counting: it applies optimal transport to measure the similarity between the normalized predicted density map and the normalized ground-truth density map, and it outperforms previous state-of-the-art results on the ShanghaiTech and UCF CC 50 datasets. IADM [6] is a cross-modal collaborative representation learning framework consisting of multiple modality-specific branches, a modality-shared branch, and an Information Aggregation Distribution Module; it is universal for multimodal crowd counting problems. CSCA [16] consists of modular building blocks that can be easily integrated into any modality-specific architecture; it greatly improves performance across various backbone networks and outperforms previous state-of-the-art results on the RGBT-CC dataset.

Implementation Details. In this experiment, we use the standard image classification network VGG-19 as the backbone and build each model according to the official implementation of the methods described above. The backbone is pre-trained on ImageNet, and the Adam optimizer with a learning rate of 10^-5 is used to update the parameters. For training, images from the various datasets are randomly cropped: the crop size is 256 × 256 for ShanghaiTech Part A, UCF CC 50 and RGBT-CC, where image resolutions are smaller, and 512 × 512 for ShanghaiTech Part B and UCF-QNRF. We perform five-fold cross-validation to obtain the average test result on UCF CC 50, since it is a small-scale dataset with no designated training/testing split.
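As a schematic illustration of how such a setup might be wired together (synthetic data, an untrained backbone and an ad-hoc regression head are used as placeholders; this is not the authors' training code):

```python
import torch
import torch.nn as nn
from torchvision.models import vgg19

# Hypothetical wiring: a VGG-19 feature extractor with a small density-regression
# head, Adam with lr = 1e-5, and per-pixel losses filtered before back-propagation.
backbone = vgg19(weights=None).features            # ImageNet weights omitted to keep the sketch offline
head = nn.Sequential(nn.Conv2d(512, 128, 3, padding=1), nn.ReLU(), nn.Conv2d(128, 1, 1))
model = nn.Sequential(backbone, head)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-5)

theta = 0.9
images = torch.randn(2, 3, 256, 256)               # stand-ins for random 256 x 256 crops
targets = torch.rand(2, 1, 8, 8)                   # stand-in "ground-truth" density at 1/32 resolution

pred = model(images)                               # (2, 1, 8, 8) predicted density maps
per_pixel = (pred - targets).abs().flatten()       # l^j: per-element deviations
k = max(1, int(theta * per_pixel.numel()))
kept, _ = torch.topk(per_pixel, k, largest=False)  # drop the largest (presumably noisy) losses
loss = kept.sum()                                  # L_FF as in Eq. (1)
optimizer.zero_grad()
loss.backward()
optimizer.step()
print(float(loss))
```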

4.4 Experimental Evaluations

We compare the models with and without the proposed Loss Filtering Factor on the benchmark datasets described in Sect. 4.2. We set the model without the Loss Filtering Factor (θ = 1) as the baseline and compare the performance of the models at various loss filtering factor values. To make a fair comparison, for the same model, the same random seeds and other parameters are used for each set of experiments. The experimental results are shown in Table 2. For each model structure, we conduct six experiments on each dataset: a model without the proposed method (θ = 1) and five models using the Loss Filtering Factor with different values of θ.

Quantitative Results. In all experiments, the proposed Loss Filtering Factor consistently improves the performance of all four models on all datasets used in the experiments, by an average of 5.48% for MAE and 6.43% for MSE.


Table 2. Benchmark evaluations on five benchmark crowd counting datasets using the MAE and MSE metrics

Model  θ     Shanghai A       Shanghai B     UCF-QNRF        UCF-CC-50         RGBT-CC
             MAE     MSE      MAE    MSE     MAE    MSE      MAE      MSE      MAE    MSE
BL     1.00  61.16   100.85   8.53   13.29   93.92  166.63   230.27   343.57   18.85  32.05
BL     0.95  58.98   95.33    8.12   12.76   92.70  165.08   226.26   326.64   17.97  30.20
BL     0.90  62.86   99.77    7.47   12.99   89.73  160.10   239.23   348.85   18.41  30.79
BL     0.85  64.85   100.41   8.21   12.52   91.77  164.12   271.49   332.88   19.47  34.34
BL     0.80  66.23   100.82   8.61   13.60   93.71  166.23   229.74   333.83   19.48  33.15
BL     0.75  67.43   102.32   10.04  15.13   95.86  170.92   235.42   353.30   20.90  35.00
DMCC   1.00  62.08   98.59    7.91   14.97   90.88  157.71   229.89   313.79   18.42  31.88
DMCC   0.95  62.58   97.96    7.54   14.65   87.43  150.25   208.66   289.20   17.36  29.99
DMCC   0.90  61.33   96.14    7.01   13.55   86.10  145.40   211.15   290.19   17.55  29.43
DMCC   0.85  65.24   99.60    7.22   13.86   88.85  152.08   225.77   318.22   18.27  31.45
DMCC   0.80  67.88   104.02   7.24   13.75   92.12  156.71   226.90   330.53   19.18  32.81
DMCC   0.75  70.42   109.86   8.33   13.54   93.25  158.02   231.27   335.94   20.55  34.21
IADM   1.00  –       –        –      –       –      –        –        –        15.66  27.98
IADM   0.95  –       –        –      –       –      –        –        –        15.52  26.24
IADM   0.90  –       –        –      –       –      –        –        –        15.20  25.94
IADM   0.85  –       –        –      –       –      –        –        –        15.36  26.38
IADM   0.80  –       –        –      –       –      –        –        –        16.06  26.32
IADM   0.75  –       –        –      –       –      –        –        –        17.18  28.05
CSCA   1.00  –       –        –      –       –      –        –        –        14.81  27.25
CSCA   0.95  –       –        –      –       –      –        –        –        14.35  24.87
CSCA   0.90  –       –        –      –       –      –        –        –        14.57  25.68
CSCA   0.85  –       –        –      –       –      –        –        –        14.75  27.50
CSCA   0.80  –       –        –      –       –      –        –        –        15.51  28.14
CSCA   0.75  –       –        –      –       –      –        –        –        16.40  29.26

For most models and datasets, using the Loss Filtering Factor with a θ of 0.9 or 0.95 results in the best performance increase. Setting θ = 0.9 provides the greatest single improvement, boosting the counting accuracy of DMCC on the ShanghaiTech B dataset by 11.38% and 9.49% for MAE and MSE, respectively. From the perspective of the model, using the appropriate Loss Filtering Factor yields around a 6.81% improvement for DMCC and 5.27% for BL across all five datasets, and around 5.91% for CSCA and 5.11% for IADM on the RGBT-CC dataset. From the perspective of the dataset, using the appropriate Loss Filtering Factor yields average improvements of 5.36% for models trained on UCF-QNRF, 3.18% on ShanghaiTech A, 9.77% on ShanghaiTech B, 5.93% on UCF CC 50 and 5.74% on RGBT-CC, respectively.
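The percentages above are plain relative reductions of the error metrics with respect to the θ = 1 baseline; for instance, reproducing the DMCC ShanghaiTech B figures from Table 2:

```python
def relative_improvement(baseline: float, value: float) -> float:
    """Percentage reduction of an error metric relative to the theta = 1 baseline."""
    return 100.0 * (baseline - value) / baseline

# DMCC on ShanghaiTech B at theta = 0.9 (values taken from Table 2)
print(round(relative_improvement(7.91, 7.01), 2))    # MAE: 11.38
print(round(relative_improvement(14.97, 13.55), 2))  # MSE: 9.49
```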

4.5 Key Issues and Discussion

Effect of θ. To analyze the impact of the choice of θ on model performance, we divide the results of each model with the Loss Filtering Factor (θ ≠ 1) by the corresponding result of the baseline model (θ = 1) and take the average. The result is shown in Fig. 4. Although the best value of θ varies across models and datasets, the proposed Loss Filtering Factor increases model performance when θ is between 0.85 and 1, and for most models, setting θ to 0.95 or 0.9 maximizes the model's performance. However, as θ becomes smaller, the performance of the models decreases significantly. The effect of θ can be explained in the following ways:

– Approximately 10% of the dataset comprises point annotations that exert a detrimental impact on the model's counting performance when utilized in training. Stated differently, the dataset contains roughly 10% label noise, and reducing this noise during training can enhance the model's counting accuracy. This observation may also shed light on the varying optimal θ selections for the same model across different datasets, as the degree of label noise fluctuates among datasets.
– Utilizing the proposed Loss Filtering Factor means the model abstains from using all data points for supervision in each training epoch, akin to a dropout layer in neural network models. Interestingly, even if the data removed from training due to the factor θ is not attributable to label noise, it can still contribute to enhanced generalization and a reduction in the model's tendency to overfit the training data.

The explanations above demonstrate that an appropriate value of θ can significantly enhance model performance by mitigating label noise within the dataset during training. This, in turn, contributes to improved generalization and reduced overfitting of the model. However, when θ becomes excessively small (θ ≤ 0.80), it leads to diminished model accuracy due to inadequate utilization of the ground truth, resulting in insufficient supervision for the model.

Fig. 4. Effect of Loss Filtering Factor θ.
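The filtering behaviour discussed above can be sketched in a few lines. The snippet below is only an illustrative PyTorch rendering, assuming that LFF keeps the fraction θ of samples with the smallest per-sample losses in each batch and discards the rest as suspected label noise; the function name and batch handling are ours, not the authors' implementation.

import torch

def loss_filtering_factor(per_sample_loss: torch.Tensor, theta: float = 0.9) -> torch.Tensor:
    # Keep the theta fraction of samples with the smallest losses and average
    # them; the discarded large-loss samples are treated as label noise.
    n_keep = max(1, int(theta * per_sample_loss.numel()))
    kept, _ = torch.topk(per_sample_loss, n_keep, largest=False)
    return kept.mean()

# Example: per-sample losses from any counting loss computed with reduction="none"
per_sample = torch.rand(64, requires_grad=True)
loss = loss_filtering_factor(per_sample, theta=0.9)
loss.backward()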


Effect on Model Convergence Speed. Through our experiments, we observe that with the Loss Filtering Factor the training time per epoch remains essentially unchanged, whereas the rate of convergence varies considerably among models: the overall training duration is approximately 5% longer with the Loss Filtering Factor than without it.
Ablation Studies. We perform an ablation study on the UCF-QNRF dataset by comparing the proposed LFF with randomly removing the same number of points during training. Table 3 provides the quantitative results.
Table 3. BL and DMCC are the original models (θ = 1). We use θ = 0.9 for BL+LFF and DMCC+LFF, and randomly remove 10% of the data supervision in each epoch for BL+random and DMCC+random when training on UCF-QNRF.

Method         MAE     MSE
BL             93.92   166.63
BL+random      93.25   165.98
BL+LFF         89.73   160.10
DMCC           90.88   157.71
DMCC+random    90.19   156.44
DMCC+LFF       86.10   145.40

When 10% of the data were randomly omitted from supervision during training, only a marginal improvement in overall model performance was observed. This suggests that randomly reducing data supervision in each training epoch has a modest impact on model performance, and it underscores that the gains achieved by the proposed Loss Filtering Factor are not merely the result of random data omission during training. Rather, the LFF's filtering mechanism predominantly removes label noise that would otherwise detrimentally affect the model's counting performance.

5 Conclusions and Future Work

In this paper, we present the Loss Filtering Factor (LFF) method, which can be seamlessly integrated into various neural network model architectures. This method centers around the loss function and effectively filters out losses influenced by label noise during model training. Moreover, experimental findings uncover the presence of label noise in crowd counting datasets, adversely impacting the model’s performance. The proposed approach successfully mitigates this negative influence of label noise, resulting in improved counting performance. Currently, we have developed a straightforward yet effective implementation of the Loss Filtering Factor method and demonstrated its efficacy. In the future,


we aim to explore more sophisticated strategies, such as the dynamic application of the Loss Filtering Factor in each training epoch, to further enhance the capabilities of our proposed method.

References
1. Han, B., et al.: Co-teaching: robust training of deep neural networks with extremely noisy labels. In: Advances in Neural Information Processing Systems 31 (2018)
2. Idrees, H., Saleemi, I., Seibert, C., Shah, M.: Multi-source multi-scale counting in extremely dense crowd images. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2013)
3. Idrees, H., et al.: Composition loss for counting, density map estimation and localization in dense crowds. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 532–546 (2018)
4. Jiang, L., Huang, D., Liu, M., Yang, W.: Beyond synthetic noise: deep learning on controlled noisy labels. In: International Conference on Machine Learning, pp. 4804–4815. PMLR (2020)
5. Lin, H., Ma, Z., Ji, R., Wang, Y., Hong, X.: Boosting crowd counting via multifaceted attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 19628–19637 (2022)
6. Liu, L., Chen, J., Wu, H., Li, G., Li, C., Lin, L.: Cross-modal collaborative representation learning and a large-scale RGBT benchmark for crowd counting. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 4823–4833 (2021)
7. Ma, Z., Wei, X., Hong, X., Gong, Y.: Bayesian loss for crowd count estimation with point supervision. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 6142–6151 (2019)
8. Ma, Z., Wei, X., Hong, X., Lin, H., Qiu, Y., Gong, Y.: Learning to count via unbalanced optimal transport. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 35, pp. 2319–2327 (2021)
9. Pham, N.Q., Nguyen, T.S., Niehues, J., Müller, M., Stüker, S., Waibel, A.: Very deep self-attention networks for end-to-end speech recognition. arXiv preprint arXiv:1904.13377 (2019)
10. Wang, B., Liu, H., Samaras, D., Nguyen, M.H.: Distribution matching for crowd counting. Adv. Neural Inf. Process. Syst. 33, 1595–1607 (2020)
11. Wang, M., Zhou, J., Cai, H., Gong, M.: CrowdMLP: weakly-supervised crowd counting via multi-granularity MLP. arXiv preprint arXiv:2203.08219 (2022)
12. Zand, M., Damirchi, H., Farley, A., Molahasani, M., Greenspan, M., Etemad, A.: Multiscale crowd counting and localization by multitask point supervision. In: ICASSP 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1820–1824. IEEE (2022)
13. Zhang, C., Bengio, S., Hardt, M., Recht, B., Vinyals, O.: Understanding deep learning (still) requires rethinking generalization. Commun. ACM 64(3), 107–115 (2021)
14. Zhang, S., Wu, G., Costeira, J.P., Moura, J.M.F.: Understanding traffic density from large-scale web camera data. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2017)
15. Zhang, Y., Zhou, D., Chen, S., Gao, S., Ma, Y.: Single-image crowd counting via multi-column convolutional neural network. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 589–597 (2016)


16. Zhang, Y., Choi, S., Hong, S.: Spatio-channel attention blocks for cross-modal crowd counting. In: Proceedings of the Asian Conference on Computer Vision, pp. 90–107 (2022)
17. Zoph, B., Vasudevan, V., Shlens, J., Le, Q.V.: Learning transferable architectures for scalable image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 8697–8710 (2018)

Classifier Decoupled Training for Black-Box Unsupervised Domain Adaptation

Xiangchuang Chen1, Yunhang Shen2, Xuan Luo3, Yan Zhang4, Ke Li2, and Shaohui Lin1,5(B)

1 School of Computer Science and Technology, East China Normal University, Shanghai, China
  [email protected]
2 Tencent Youtu Lab, Shanghai, China
3 Nanchang University, Nanchang, China
4 Institute of Artificial Intelligence, Xiamen University, Xiamen, China
5 KLATASDS-MOE, Shanghai, China

Abstract. Black-box unsupervised domain adaptation (B2 UDA) is a challenging task in unsupervised domain adaptation, where the source model is treated as a black box and only its output is accessible. Previous works have treated the source models as a pseudo-labeling tool and formulated B2 UDA as a learning with noisy labels (LNL) problem. However, they have ignored the gap between the "shift noise" caused by the domain shift and the hypothesis noise in LNL. To alleviate the negative impact of shift noise on B2 UDA, we propose a novel framework called Classifier Decoupling Training (CDT), which introduces two additional classifiers to assist model training with a new label-confidence sampling. First, we introduce a self-training classifier to learn robust feature representation from the low-confidence samples, which is discarded during testing, and the final classifier is only trained with a few high-confidence samples. This step decouples the training of high-confidence and low-confidence samples to mitigate the impact of noise labels on the final classifier while avoiding overfitting to the few confident samples. Second, an adversarial classifier optimizes the feature distribution of low-confidence samples to be biased toward high-confidence samples through adversarial training, which greatly reduces intra-class variation. Third, we further propose a novel ETP-entropy Sampling (E2 S) to collect class-balanced high-confidence samples, which leverages the early-time training phenomenon in LNL. Extensive experiments on several benchmarks show that the proposed CDT achieves 88.2%, 71.6%, and 81.3% accuracies on Office-31, Office-Home, and VisDA-17, respectively, which outperforms state-of-the-art methods.

Keywords: Domain adaptation · Adversarial learning · Noisy label

1 Introduction

Deep neural networks (DNNs) have achieved remarkable performance on various visual recognition tasks [11,14] with laborious large-scale labeled data.


Fig. 1. Confusion matrix for three different cases, the value in row r, column c denotes the proportion of samples belonging to category r that are labeled as c: (a) Uniformly distributed noise, widely adopted by Learning with Noisy Label (LNL) works [5, 10]; (b) Realistic noise matrix in VisDA-2017 [25] dataset based on the black-box source model fs . The distribution of noise is imbalanced, and for some categories, the proportion of wrong labels is higher than that of correct labels, which seriously deteriorates the training; (c) Realistic noise matrix in Office-Home [27] (A → C), the distribution is similar to (b) with more classes.

Obviously, it is not only expensive but also impractical to label such massive data in various environments. Therefore, transfer learning [23] has received a great deal of research attention, especially unsupervised domain adaptation (UDA) [4,8,13]. UDA aims to transfer the knowledge learned from a source domain to improve model performance on a target domain without labeled data, and has been widely used in natural image processing [11] and medical image analysis [7] to solve the domain shift problem. Existing UDA methods assume that the source domain data is available during the transfer; however, this is not always practical due to data privacy, security issues, and the difficulty of data transmission. As a solution, source-free unsupervised domain adaptation (SFUDA) [17] exploits the model parameters to obtain source domain knowledge without accessing source domain data during the adaptation. This approach is also known as white-box unsupervised domain adaptation (WBUDA). Although the source domain data is not used, WBUDA still suffers from data leakage because the knowledge inherent in the model parameters can be used to synthesize source domain data via generative methods [29,30]. To enhance data privacy, black-box unsupervised domain adaptation (B2 UDA) further limits the usage of the source domain model, where only the outputs of the source domain model are exposed to learn the target domain model. B2 UDA is more challenging than WBUDA due to the limited availability of source domain knowledge. Several recent attempts [19,24,31] take the black-box model as a pseudo-labeling tool and introduce learning with noisy labels (LNL) [5,10] methods to train the target domain model with the noisy labels. For example, DINE introduces label smoothing and exponential moving average prediction to deal with noisy labels.


Fig. 2. The overall framework of the proposed method. The classifier ht is trained with only the most confident and class-balanced samples, and pseudo classifier hp and adversarial classifier ha are introduced to decouple the training of high-confidence samples and low-confidence samples.

However, these methods ignore the gap between the task setting of LNL and the B2 UDA problem. LNL commonly assumes a simple noise distribution, as shown in Fig. 1(a), where the noise is uniformly distributed across classes with pre-defined noise rates, which may fail to handle realistic noise. Under the B2 UDA setting, pseudo labels generated by the source model are subject to significant and imbalanced noise due to domain shift, as illustrated in Fig. 1(b). We refer to this noise caused by domain shift as shift noise, which is different from the noise distribution in general LNL tasks. Such shift noise seriously deteriorates model performance during the training process, and even dramatically drops the classification accuracy of some categories to zero (see Fig. 4(b) in the experiments). In this paper, we focus on mitigating the negative impact caused by the shift noise to improve overall performance for B2 UDA. To this end, we propose a novel Classifier Decoupled Training (CDT) framework to identify and decouple the training of high-confidence and low-confidence data. Figure 2 presents our CDT framework. Firstly, we restrict the final target classifier used for inference to be optimized only on high-confidence samples and introduce a pseudo classifier hp to learn a robust feature extractor from the remaining low-confidence data. Secondly, to avoid the bias caused by insufficient high-confidence data, an adversarial classifier ha is designed to reduce the prediction error of the target model on low-confidence data via adversarial learning. Thirdly, we further introduce a new ETP-entropy sampling (E2 S) method to construct clean and class-balanced high-confidence training examples, which collects the scarce high-confidence data based on the initialized labels at each training epoch. We find that clean samples at the start of training have lower entropy, which is also observed as the early-time training phenomenon (ETP) [20] in LNL tasks and allows us to use entropy as a metric for selecting samples. Thus we leverage ETP to improve the label quality by taking advantage of the model's tendency to effectively learn from clean samples rather than noisy ones.


The main contributions of our work are: (1) We propose a new Classifier Decoupled Training (CDT) framework for black-box unsupervised domain adaptation (B2 UDA), which separates the classifier training of high-confidence data and low-confidence data to alleviate the influence of noisy samples. (2) Based on the early-time training phenomenon (ETP) [20], we propose an ETP-entropy sampling (E2 S) method to obtain a clean, class-balanced dataset. The size of this dataset is determined by the sample noise estimation, so there is no need to manually tune hyper-parameters for different datasets. (3) Extensive experiments demonstrate that our proposed CDT consistently achieves significant performance gains on several widely-used benchmarks. The ablation results further demonstrate that CDT alleviates shift noise.

2 Related Work

Unsupervised Domain Adaptation (UDA) aims to transfer a model from a labeled source domain to an unlabeled target domain. Current UDA approaches can be divided into three categories: (1) Refine the feature representation of the target domain data to align with that of the source domain data [8,21]. The classical approach DANN [8] introduces an additional classifier to distinguish between the source domain data and the target domain data; the features of the target domain data are then aligned with those of the source domain via adversarial training. (2) Generate target-domain-style data by leveraging labeled source domain data through generative models [4,6]. CrDoCo [4] introduces CycleGAN [33] to translate images between the source and target domains and proposes a cross-domain consistency method to enforce the adapted model to produce consistent predictions. However, generation-based methods usually lack control and change the semantic structure of the image. (3) Iteratively re-train the target domain samples with pseudo labels [13,17]. CVRN [13] exploits the complementarity of pseudo labels generated from different views of the same image to improve the quality of pseudo labels. Although UDA approaches have achieved significant success, they require access to the source domain data, which raises privacy and portability issues and renders them inapplicable in the context of B2 UDA. Black-Box Unsupervised Domain Adaptation (B2 UDA) was first proposed in CRONUS [2]; it is a sub-problem of UDA and a new research area. B2 UDA is more restrictive than UDA in that models trained on source domain data are put into black boxes and only the outputs of these models are accessible. The pioneering work IterLNL [31] treats B2 UDA as an LNL problem, estimating the noise rate and gradually extracting clean labels from noisy labels. DINE [19] introduces knowledge distillation and mixup regularization to improve performance. RAIN [24] introduces frequency decomposition to capture the key objects and reduce background interference in the mixup strategy. These works focus on sample selection and data augmentation to enhance the generalization of the model and have achieved good results. However, they neglect the shift noise problem, which is the focus of our work.

3 Method
3.1 Problem Definition

In the B2 UDA setting, we are given access to a source model f_s trained on an inaccessible labeled source dataset D_s = {x_i^s, y_i^s} (i = 1, ..., N_s), where x_i^s represents an image and y_i^s represents its associated label from the set of source classes Y_s. Additionally, we have an unlabeled target dataset D_t = {x_i^t} (i = 1, ..., N_t) with the same label space as the source domain, i.e., Y_t = Y_s. However, the styles of the data in the target domain and the source domain are different. The objective is to utilize the source model f_s and the target dataset D_t to train a model f_t that provides accurate predictions y_i^t for the target images x_i^t. It is worth noting that this setting is more restrictive than SFUDA [17], as B2 UDA has access neither to the parameters of the source model nor to feature vectors from the source model's intermediate outputs. To confront these constraints, a naive solution is knowledge distillation, which can be formulated as:

$$\mathcal{L}(f_t; D_t, f_s) = \mathbb{E}_{x_t \in D_t}\, \mathcal{L}_{kl}\big(f_s(x_t) \,\|\, f_t(x_t)\big), \tag{1}$$

where L_kl denotes the Kullback-Leibler (KL) divergence loss. However, the predictions of the source model f_s on the target domain data D_t are subject to noise due to domain shift, resulting in an extremely imbalanced cross-class prediction error rate that we refer to as shift noise. Moreover, such shift noise increases during training, thus affecting the overall prediction accuracy.
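The naive distillation objective in Eq. (1) can be written directly in PyTorch. The sketch below is an illustration only, under the assumption that the black-box source model returns probability vectors; the function name is ours, not the paper's.

import torch
import torch.nn.functional as F

def distillation_loss(target_logits: torch.Tensor, source_probs: torch.Tensor) -> torch.Tensor:
    # KL(f_s(x) || f_t(x)) averaged over the batch, as in Eq. (1):
    # source_probs are the (black-box) source predictions, target_logits are
    # the raw outputs of the target model f_t.
    log_target = F.log_softmax(target_logits, dim=1)
    return F.kl_div(log_target, source_probs, reduction="batchmean")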

3.2 Overall Framework

To mitigate shift noise during training, we propose a novel classifier decoupled training and an ETP-entropy sampling to learn the target dataset with noisy pseudo labels. The training pipeline is shown in Fig. 2. We formulate the overall objective in two steps:

$$\mathcal{L} = \underbrace{\mathcal{L}_{warm}}_{\text{step 1}} + \underbrace{(\mathcal{L}_c + \mathcal{L}_u + \mathcal{L}_{adv})}_{\text{step 2}}. \tag{2}$$

In step 1, a warmup scheme is adopted to initialize the target model on the target domain via the L_warm loss. In step 2, we incorporate three losses, namely the classification loss L_c, the self-training loss L_u, and the adversarial loss L_adv, to decouple the training of the target dataset with suspicious or confident labels estimated by ETP-entropy sampling. The optimization process is presented in Algorithm 1. The warmup phase aims to ensure reliable pseudo labels at the start of training and lasts for a few epochs. Specifically, we feed the entire target dataset into the source domain model to get initialized soft labels, following the same approach as DINE [19]. We use these labels to warm up g_t and the pseudo classifier h_p so that the model initially adapts to the target domain and has a good feature space for the next step of training.


Algorithm 1. Pseudocode of CDT for B2 UDA.
1. Warmup Phase:
Require: Target data {x_i^t} (i = 1, ..., N_t), the number of epochs T_w, and iterations n_b in one epoch.
  Initialize the pseudo labels P(x) via Eq. (3)
  for e = 1 to T_w do
    for i = 1 to n_b do
      Train g_t and h_p by minimizing Eq. (3).
    end for
  end for
2. Classifier Decoupled Training:
Require: Target data {x_i^t} (i = 1, ..., N_t), the number of epochs T_m, and iterations n_b in one epoch.
  for e = 1 to T_m do
    Update the pseudo labels P(x), and obtain the high-confidence dataset D_l and the low-confidence dataset D_u via Eq. (11).
    for i = 1 to n_b do
      Sample a batch B_i^1 from the high-confidence dataset D_l.
      Train g_t and h_t by minimizing Eq. (4) with B_i^1.
      Sample a batch B_i^2 from the low-confidence dataset D_u.
      Train g_t and h_p by minimizing Eq. (5) with B_i^2.
      Train g_t, h_t and h_a by minimizing Eq. (6) with B_i^1 and B_i^2.
    end for
  end for

The training objective of the warmup phase is:

$$\mathcal{L}_{warm}(g_t, h_p) = \mathbb{E}_{x_t \in D_t}\, \mathcal{L}_{kl}\big((h_p \circ g_t)(x_t) \,\|\, P(x_t)\big),$$
$$P(x)_k = \begin{cases} f_s(x)_k, & k = \arg\max(f_s(x)) \\ (1 - \max(f_s(x)))/(K - 1), & \text{otherwise} \end{cases} \tag{3}$$

where P(x_t) is the soft label initialized by the source model through adaptive label smoothing, and K is the number of classes.
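A minimal sketch of the adaptive label smoothing in Eq. (3), assuming the source predictions are given as a probability matrix of shape (N, K); the helper name is illustrative.

import torch

def adaptive_label_smoothing(source_probs: torch.Tensor) -> torch.Tensor:
    # Eq. (3): keep the top-1 probability of the source prediction and spread
    # the remaining mass uniformly over the other K - 1 classes.
    K = source_probs.size(1)
    top_prob, top_idx = source_probs.max(dim=1, keepdim=True)
    P = ((1.0 - top_prob) / (K - 1)).expand(-1, K).clone()
    P.scatter_(1, top_idx, top_prob)
    return P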

3.3 Classifier Decoupled Training (CDT)

In addition to the target classifier h_t, the proposed CDT involves training an extra pseudo classifier h_p and an adversarial classifier h_a. The final target domain model f_t is the combination of the feature extractor g_t and the classifier h_t. The adversarial classifier h_a and the pseudo classifier h_p are discarded after training, so they do not increase inference time. Confident Learning. Different from previous works [24,31] that employ the entire unlabeled target dataset to train the final classifier h_t, we segregate a high-confidence subset D_l and a low-confidence subset D_u from the target dataset by ETP-entropy sampling, which is detailed in the following section, and solely utilize the high-confidence data to train h_t.


Fig. 3. The hyperplane is viewed as a representation of the classifier in the feature space. (a) Shift between the hyperplanes (i.e., h_t) learned on limited high-confidence data and the true hyperplanes. (b) The worst hyperplane (i.e., h_p) that correctly distinguishes high-confidence data while making as many mistakes as possible on low-confidence data. (c) Feature representations are optimized to improve the performance of the worst hyperplane.

The training objective on the high-confidence dataset for the target model f_t is as follows:

$$\mathcal{L}_c(g_t, h_t) = \mathbb{E}_{(x_l, y_l) \in D_l}\, \mathcal{L}_{ce}\big((h_t \circ g_t)(x_l), y_l\big). \tag{4}$$

As only high-confidence samples are utilized in this training and the classes are evenly distributed, h_t is less susceptible to shift noise. Here, y_l is the pseudo label of x_l in D_l (Eq. (11)).
Pseudo-Label Learning. Since the size of D_l is relatively small compared to the entire dataset, we incorporate the low-confidence data D_u to prevent the target model from overfitting to the few high-confidence samples. As shown in Fig. 2, we introduce a pseudo classifier h_p on top of the feature extractor g_t and optimize it only with D_u. The training objective on the low-confidence dataset is

$$\mathcal{L}_u(g_t, h_p, h_t) = \mathbb{E}_{x_u \in D_u}\, \mathbb{1}\big(\max(q_u) \ge \tau_u\big)\, \mathcal{L}_{ce}\big((h_p \circ g_t \circ \mathcal{A})(x_u), \arg\max(q_u)\big), \quad q_u = (h_t \circ g_t \circ \alpha)(x_u), \tag{5}$$

where D_u is the low-confidence dataset, τ_u is the threshold for filtering out clean samples, A is the strong data augmentation function, and α is the weak data augmentation function. We use the probability vector q_u, predicted by h_t for x_u, as a pseudo label to train h_p and g_t. Although q_u may contain wrong labels, the parameters of h_p are independent of h_t, which prevents h_t from directly accumulating the bias of noisy labels during training. This training step allows the feature extractor g_t to learn a better feature representation from x_u ∈ D_u.
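A sketch of the self-training loss of Eq. (5) is given below; it assumes a FixMatch-style implementation in which the pseudo label comes from the weakly augmented view through h_t with gradients blocked, while h_p and g_t are updated on the strongly augmented view. Function and variable names are illustrative, not the authors' code.

import torch
import torch.nn.functional as F

def self_training_loss(g_t, h_t, h_p, x_weak, x_strong, tau_u: float = 0.7):
    # q_u = (h_t o g_t o alpha)(x_u): pseudo label from the weak view (no gradient assumed).
    with torch.no_grad():
        q_u = F.softmax(h_t(g_t(x_weak)), dim=1)
        conf, pseudo = q_u.max(dim=1)
        mask = (conf >= tau_u).float()
    # (h_p o g_t o A)(x_u): prediction of the pseudo classifier on the strong view.
    logits = h_p(g_t(x_strong))
    loss = F.cross_entropy(logits, pseudo, reduction="none")
    return (loss * mask).mean()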


Adversarial Learning. Due to the scarcity of high-confidence samples D_l, there may be significant variations in the distance between the supporting data points of each category and the true decision hyperplane, as illustrated in Fig. 3. Therefore, the learned decision hyperplane may have a bias towards certain categories, which could lead to wrong predictions on low-confidence data. To alleviate this bias, we introduce an additional classification head h_a for adversarial learning, as shown in Fig. 2. First, we employ the classifier f_t to make predictions on both the high-confidence and low-confidence datasets. Then, we optimize the classifier h_a to correctly predict all high-confidence samples while incorrectly predicting low-confidence samples, thereby enabling inverse prediction between h_a and h_t. When the classifier h_a makes poor predictions, the adversarial training forces the feature extractor g_t to map the low-confidence samples into a more effective feature space, as shown in Fig. 3. Thus, adversarial learning is formulated as:

$$\mathcal{L}_{adv}(g_t, h_t, h_a) = \mathbb{E}_{x_l \in D_l}\, \mathcal{L}_{ce}\big((h_a \circ g_t \circ \alpha)(x_l), \arg\max f_t(x_l)\big) - \mathbb{E}_{x_u \in D_u}\, \mathcal{L}_{ce}\big((h_a \circ g_t \circ \alpha)(x_u), \arg\max f_t(x_u)\big). \tag{6}$$
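A possible PyTorch rendering of Eq. (6) is sketched below; taking the hard pseudo labels arg max f_t(x) with gradients blocked is our assumption, and the names are illustrative.

import torch
import torch.nn.functional as F

def adversarial_loss(g_t, h_t, h_a, x_l, x_u):
    # h_a should match f_t = h_t o g_t on high-confidence data and mismatch it
    # on low-confidence data; minimizing this w.r.t. g_t then pushes the
    # low-confidence features toward the high-confidence regions (Eq. (6)).
    with torch.no_grad():
        y_l = h_t(g_t(x_l)).argmax(dim=1)
        y_u = h_t(g_t(x_u)).argmax(dim=1)
    loss_l = F.cross_entropy(h_a(g_t(x_l)), y_l)
    loss_u = F.cross_entropy(h_a(g_t(x_u)), y_u)
    return loss_l - loss_u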

3.4 ETP-Entropy Sampling

In this section, we propose to sample high-confidence and low-confidence data from the target dataset. As shown in Fig. 1, the source model exhibits highly imbalanced cross-class prediction results when applied to target domain data due to shift noise. Therefore, conventional threshold-based sampling may result in an imbalanced category distribution and low data-cleaning efficiency. To alleviate the negative impact of shift noise, we propose a balanced sampling method based on the early-time training phenomenon (ETP) [20]. Formally, for the k-th class, we select the N samples that are most likely to belong to the k-th class, regardless of which class each of these samples is most likely to belong to overall, i.e., y = arg max(P(x)). The formulation is constructed as follows:

$$P(x) \leftarrow \lambda P(x) + (1 - \lambda) f_t(x),$$
$$D_l^k = \{(x, y = k) \mid P(x)_k \ge \tau^k\}, \quad |D_l^k| = N, \tag{7}$$

where P(x)_k is the probability that sample x belongs to the k-th class, the pseudo labels P(x) are updated in each iteration, λ is a momentum hyper-parameter, and τ^k denotes the N-th largest probability value for the k-th class over all samples. Inspired by the ETP observed in LNL, namely that classifiers have higher prediction accuracy during the early training stage, we estimate the dataset size N based on the current noise rate ε and training epoch n as follows:

$$N(n) = \Big(1 - \min\big(\tfrac{n}{T}, \varepsilon\big)\Big)\, |D_t|, \tag{8}$$

where T is a hyper-parameter that is set to half of the maximum training epoch in the experiments. To estimate the rate of label noise ε, we calculate the predictive distribution f_t(x) across all samples in the current epoch. Although isolated predictive probabilities may be overconfident and misleading, probability statistics are often sufficient for assessing overall classification accuracy as well as


Table 1. Accuracy (%) on Office-31 for B2 UDA. H.Avg. represents the average accuracy over the challenging transfer tasks, i.e., the lower half of all transfer tasks in terms of Source-only accuracy. "A", "D", and "W" represent the three sub-datasets Amazon, DSLR, and Webcam of the Office-31 dataset. The red numbers represent the best results achieved under the black-box setting, excluding those where the source models are available.

Method            Type   A→D   A→W   D→A   D→W   W→A   W→D    Avg.  H.Avg.
Source only [11]  Pred.  79.9  76.6  56.4  92.8  60.9  98.5   77.5  64.6
LNL-OT [1]        Pred.  88.8  85.5  64.6  95.1  66.7  98.7   83.2  72.3
LNL-KL [31]       Pred.  89.4  86.8  65.1  94.8  67.1  98.7   83.6  73.0
HD-SHOT [16]      Pred.  86.5  83.1  66.1  95.1  68.9  98.1   83.0  72.7
SD-SHOT [16]      Pred.  89.2  83.7  67.9  95.3  71.1  97.1   84.1  74.2
DINE [19]         Pred.  91.6  86.8  72.2  96.2  73.3  98.6   86.4  77.4
CDT (Ours)        Pred.  94.0  88.9  75.1  96.4  75.3  99.6   88.2  79.8
BSP+DANN [3]      Data   93.0  93.3  73.6  98.   72.6  100.0  88.5  79.8
MDD [32]          Data   93.5  94.5  74.6  98.4  72.2  100.0  88.9  80.4
ATDOC [18]        Data   94.4  94.3  75.6  98.9  75.2  99.6   89.7  81.7

noise rates [9,12]. We calculate the proportion ρ of high-confidence samples in the target dataset:

$$\rho = 1 - \mathbb{E}_{x_t \in D_t}\, \mathbb{1}\big[\max(f_t(x_t)) > \gamma\big]. \tag{9}$$

To mitigate overconfident predictions by the model, we rescale ρ to ρ′:

$$\rho' = \begin{cases} -2(\rho - 0.5)^2 + 0.5, & \rho < 0.5 \\ 2(\rho - 0.5)^2 + 0.5, & \text{otherwise} \end{cases} \tag{10}$$

This can be seen as a smoothing operation to avoid over-sampling and under-sampling. Finally, the noise rate is expressed as ε = 1 − ρ′.
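The rescaling of Eq. (10) is a simple piecewise function; a direct rendering (with an illustrative name) is:

def rescale_rho(rho: float) -> float:
    # Eq. (10): pull rho smoothly toward 0.5 to avoid over- and under-sampling.
    if rho < 0.5:
        return -2.0 * (rho - 0.5) ** 2 + 0.5
    return 2.0 * (rho - 0.5) ** 2 + 0.5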

As the above sampling is based only on the estimated class probability, there may still exist noise in the high-confidence data. We calculate the cross entropy for all samples and select the M samples with the minimum cross entropy for each class to construct the high-confidence dataset D_l, while using the remaining samples to form the low-confidence dataset D_u:

$$S^k = \big\{\mathcal{L}_{ce}(x_l, y_l) \mid (x_l, y_l) \in D_l^k\big\}, \quad D_l^k = \{x_l \mid \mathcal{L}_{ce}(x_l, y_l) \in \mathrm{top}M(S^k)\}, \quad D_u^k = \{x \mid x \in D_t,\; x \notin D_l^k\}, \tag{11}$$

where S^k represents the entropies of the samples belonging to the k-th class, topM(S^k) is the subset formed by the M smallest elements of S^k, and its proportion in S^k is a hyper-parameter denoted by μ = M/N.
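Putting Eqs. (7)-(11) together, the per-class selection can be sketched as follows. This is a simplified illustration that measures the cross entropy of each candidate against its class-k pseudo label using P(x) itself and ignores the momentum update; all names are ours, not the authors' code.

import torch

def e2s_select(P: torch.Tensor, n_per_class: int, mu: float = 0.6) -> torch.Tensor:
    # P: (num_samples, K) estimated class probabilities. For each class k, take
    # the N samples with the largest P(x)_k, then keep the mu*N of them with the
    # smallest cross entropy as the high-confidence set D_l (indices returned).
    num_classes = P.size(1)
    selected = []
    for k in range(num_classes):
        top_n = torch.topk(P[:, k], n_per_class).indices
        ce = -torch.log(P[top_n, k].clamp_min(1e-8))   # entropy w.r.t. pseudo label k
        m = max(1, int(mu * n_per_class))
        keep = top_n[torch.topk(ce, m, largest=False).indices]
        selected.append(keep)
    return torch.cat(selected)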


Table 2. Accuracy (%) on Office-Home for B2 UDA (":" denotes "transfer to").

Method            Type   Ar:Cl  Ar:Pr  Ar:Re  Cl:Ar  Cl:Pr  Cl:Re  Pr:Ar  Pr:Cl  Pr:Re  Re:Ar  Re:Cl  Re:Pr  Avg.  H.Avg.
Source only [11]  Pred.  44.1   66.9   74.2   54.5   63.3   66.1   52.8   41.2   73.2   66.1   46.7   77.5   60.6  50.4
LNL-OT [1]        Pred.  49.1   71.7   77.3   60.2   68.7   73.1   57.0   46.5   76.8   67.1   52.3   79.5   64.9  55.6
LNL-KL [31]       Pred.  49.0   71.5   77.1   59.0   68.7   72.9   56.4   46.9   76.6   66.2   52.3   79.1   64.6  55.4
HD-SHOT [16]      Pred.  48.6   72.8   77.0   60.7   70.0   73.2   56.6   47.0   76.7   67.5   52.6   80.2   65.2  55.9
SD-SHOT [16]      Pred.  50.1   75.0   78.8   63.2   72.9   76.4   60.0   48.0   79.4   69.2   54.2   81.6   67.4  58.1
DINE [19]         Pred.  52.2   78.4   81.3   65.3   76.6   78.7   62.7   49.6   82.2   69.8   55.8   84.2   69.7  60.4
CDT (Ours)        Pred.  56.2   78.2   81.6   66.1   77.3   78.2   67.7   55.4   82.1   72.6   58.5   85.5   71.6  63.5
BSP+CDAN [3]      Data   52.0   68.6   76.1   58.0   70.3   70.2   58.6   50.2   77.6   72.2   59.3   81.9   66.3  58.1
MDD [32]          Data   54.9   73.7   77.8   60.0   71.4   71.8   61.2   53.6   78.1   72.5   60.2   82.3   68.1  60.2
CKB [22]          Data   54.7   74.4   77.1   63.7   72.2   71.8   64.1   51.7   78.4   73.1   58.0   82.4   68.5  60.7

Table 3. Accuracy (%) on VisDA-2017 for B2 UDA.

Method            Type   plane  bcycl  bus   car   horse  knife  mcycle  person  plant  sktbrd  train  truck  Avg.  H.Avg.
Source only [11]  Pred.  64.3   24.6   47.9  75.3  69.6   8.5    79.0    31.6    64.4   31.0    81.4   9.2    48.9  25.5
LNL-OT [1]        Pred.  82.6   84.1   76.2  44.8  90.8   39.1   76.7    72.0    82.6   81.2    82.7   50.6   72.0  67.2
LNL-KL [31]       Pred.  82.7   83.4   76.7  44.9  90.9   38.5   78.4    71.6    82.4   80.3    82.9   50.4   71.9  66.8
HD-SHOT [16]      Pred.  75.8   85.8   78.0  43.1  92.0   41.0   79.9    78.1    84.2   86.4    81.0   65.5   74.2  72.5
SD-SHOT [16]      Pred.  79.1   85.8   77.2  43.4  91.6   41.0   80.0    78.3    84.7   86.8    81.1   65.1   74.5  72.4
DINE [19]         Pred.  81.4   86.7   77.9  55.1  92.2   34.6   80.8    79.9    87.3   87.9    84.3   58.7   75.6  71.0
CDT (Ours)        Pred.  80.2   85.4   79.8  70.5  93.4   90.1   86.1    78.6    89.6   90.8    88.0   43.5   81.3  78.0
SAFN [28]         Data   93.6   61.3   84.1  70.6  94.1   79.0   91.8    79.6    89.9   55.6    89.0   24.4   76.1  64.0
CDAN+E [21]       Data   94.3   60.8   79.9  72.7  89.5   86.8   92.4    81.4    88.9   72.9    87.6   32.8   78.3  69.1
DTA [15]          Data   93.7   82.2   85.6  83.8  93.0   81.0   90.7    82.1    95.1   78.1    86.4   32.1   81.5  73.5

4 Experiment
4.1 Setup

Datasets. Office-31 [26] is a popular benchmark for UDA, consisting of three different domains (Amazon, Webcam, DSLR) with 31 categories. Office-Home [27] is a challenging medium-sized benchmark, consisting of four domains (Art, Clipart, Product, Real World) with 65 categories. VisDA-17 [25] is a large-scale benchmark for 12-class synthetic-to-real object recognition; the source domain contains 152 thousand synthetic images while the target domain has 55 thousand images.
Implementation Details. We implement our method in PyTorch and report the average accuracies over three runs. The source model f_s is trained using almost all samples in the source domain, leaving three samples per class for validation. We use ResNet-50 as the backbone for Office-31 and Office-Home, and ResNet-101 as the backbone for VisDA-2017. The backbone parameters are initialized with weights pre-trained on ImageNet. An MLP-based classifier is added on top of the backbone, as in previous works [19,24,31]. The model is optimized by mini-batch SGD with a learning rate of 1e-2 for the new classifier and 1e-3 for the pre-trained backbone. The warm-up epoch is empirically set to 5, and the number of training epochs is 50, except 10 for VisDA-2017. The learning-rate scheduler is kept the same as in DINE [19].
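For the setup described above, the optimizer configuration might look like the sketch below (ResNet-50 for Office-Home with 65 classes shown); the classifier width, momentum, and weight decay are illustrative assumptions rather than the paper's exact values.

import torch
import torch.nn as nn
import torchvision

backbone = torchvision.models.resnet50(weights="IMAGENET1K_V1")
backbone.fc = nn.Identity()                     # use the backbone as the feature extractor g_t
classifier = nn.Sequential(nn.Linear(2048, 256), nn.ReLU(), nn.Linear(256, 65))

# Smaller learning rate for the pre-trained backbone (1e-3) than for the newly
# added classifier (1e-2), as stated in the implementation details above.
optimizer = torch.optim.SGD(
    [{"params": backbone.parameters(), "lr": 1e-3},
     {"params": classifier.parameters(), "lr": 1e-2}],
    momentum=0.9, weight_decay=1e-3,
)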


Baseline. We devise the following baseline methods to compare with our method. LNL-KL [31] and LNL-OT [1] are LNL methods that respectively adopt the diversity-promoting KL divergence and optimal transport to refine the noisy pseudo labels. HD-SHOT and SD-SHOT first warm up the target model by self-training and then exploit SHOT [17] with the cross-entropy loss and the weighted cross-entropy loss, respectively. We also compare with some traditional UDA methods, for example, MDD [32] and SAFN [28].

4.2 Performance Comparison

We show the results of cross-domain object recognition on Office-31, Office-Home, and VisDA-17 in Tables 1, 2, and 3, respectively. The type "Pred." denotes black-box methods and "Data" denotes traditional UDA methods. The proposed CDT achieves the best performance on all benchmarks. On average, our method outperforms the previous state-of-the-art DINE by 1.8%, 1.9%, and 5.7% on Office-31, Office-Home, and VisDA-17, respectively. On the hard transfer tasks, in which the predictions of the source model are extremely noisy, we achieve even more significant results: in terms of H.Avg., the improvements are 2.4%, 3.2%, and 7.0% on Office-31, Office-Home, and VisDA-2017, respectively. This indicates that our method handles transfer tasks with severe shift noise. In addition, we randomly selected 7 categories of the transfer task Pr → Cl (Office-Home), whose source-only accuracy is shown in Fig. 4(a). The performance of DINE [19] on this task is illustrated in Fig. 4(b). Although the overall accuracy of DINE [19] improves over the baseline, the accuracy of some categories decreases, and "soda" and "clipboard" dramatically drop to zero prediction accuracy. Our CDT achieves stable improvements across all categories, providing evidence that our proposed structure effectively alleviates shift noise.

Fig. 4. Accuracy of different B2 UDA methods with 7 randomly selected categories on Office-Home (Pr → Cl). Affected by shift noise, DINE [19] significantly aggravates the prediction error of poorly-behaved classes, whereas our CDT effectively alleviates this phenomenon.


Table 4. Ablation. Results on Office-Home with different variants.

Method                   Ar→Cl  Cl→Ar  Cl→Pr  Pr→Ar  Pr→Cl  Re→Cl  H.Avg.
Source only              44.1   54.5   63.3   52.8   41.2   46.7   50.4
CDT (Single Classifier)  50.1   57.2   66.2   58.3   47.4   49.7   54.8
CDT (w/o Lu)             50.4   60.6   72.2   62.4   51.5   53.3   58.4
CDT (w/o Ladv)           51.3   62.3   74.2   65.1   53.4   55.4   60.3
CDT (w/o etp)            55.5   65.7   75.8   66.7   54.7   57.8   62.7
CDT                      56.2   66.1   77.3   67.7   55.4   58.5   63.5

4.3 Analysis

Ablation Study. We investigate the contribution of the key components of CDT, as shown in Table 4. "Single Classifier" means that we use a naive knowledge distillation method. "w/o Lu" means that after the warmup phase the classifier h_p is discarded and only the high-confidence data is used to train the classifier h_t; the 5.1% drop in accuracy indicates the importance of low-confidence data in training. "w/o Ladv" means that we discard the classifier h_a and the loss function L_adv; adversarial training contributes a 3.2% improvement. "w/o etp" means that the selection of high-confidence data is based solely on confidence; the ETP-based selection contributes a 1.2% improvement over selection based solely on confidence.
Hyper-parameter Sensitivity. We study the hyper-parameters μ and τ_u on Office-Home (Pr → Cl) across three runs. We select values between 0.3 and 0.9 for both μ and τ_u, as a too-small μ results in insufficient data, while a too-small τ_u leads to noise accumulation. As shown in Fig. 5(a), for μ the accuracy of CDT ranges from 46.1% to 55.4%, and the best result is achieved at 0.6. For τ_u, the accuracy of CDT ranges from 49.2% to 55.4%, and the best result is achieved at 0.7.
Pseudo Labels for Poorly-Behaved Classes. To assess the effect of CDT in alleviating shift noise, we calculate the class imbalance ratio I = max_c N(c) / min_c' N(c') at each iteration, where N(c) represents the number of predictions assigned to class c. As shown in Fig. 5(b), with DINE [19] the ratio I rapidly increases to infinity, indicating that the pseudo labels for poorly-behaved classes drop to zero, whereas our CDT avoids such an outcome. We also measure the quality of the pseudo labels for the 10 and 20 worst-behaved classes and express it as average accuracy. As shown in Fig. 5(c), DINE exhibits an average accuracy of 0% and 13.1% on the poorly-behaved 10 and 20 classes. By effectively alleviating shift noise, CDT improves the average accuracy by 16.3% and 12.1%, respectively.


Fig. 5. The quantity and quality of pseudo labels and hyper-parameter sensitivity on Office-Home (Pr→Cl): (a) studies on μ and τu; (b) quantity of bad classes; (c) quality of bad classes.

5 Conclusion

In this work, we propose CDT, a new training paradigm for B2 UDA. The target domain is divided into high-confidence and low-confidence subsets by our E2 S sampling method, and a classifier for unlabeled-data training is introduced to alleviate the impact of shift noise on classifier training. In addition, to prevent the classifier from making poor predictions on low-confidence data due to the lack of labeled data, we introduce another classifier for adversarial training. Extensive experiments on various benchmarks show that CDT effectively suppresses the degradation of prediction accuracy for categories with high initial pseudo-label error rates, and achieves new state-of-the-art performance on several widely-used benchmarks.
Acknowledgments. This work is supported by the National Natural Science Foundation of China (No. 62102151), the Shanghai Sailing Program (21YF1411200), the CCF-Tencent Open Research Fund, the Open Research Fund of the Key Laboratory of Advanced Theory and Application in Statistics and Data Science, Ministry of Education (KLATASDS2305), and the Fundamental Research Funds for the Central Universities.

References
1. Asano, Y.M., Rupprecht, C., Vedaldi, A.: Self-labelling via simultaneous clustering and representation learning. arXiv preprint arXiv:1911.05371 (2019)
2. Chang, H., Shejwalkar, V., Shokri, R., Houmansadr, A.: Cronus: robust and heterogeneous collaborative learning with black-box knowledge transfer. arXiv preprint arXiv:1912.11279 (2019)
3. Chen, X., Wang, S., Long, M., Wang, J.: Transferability vs. discriminability: batch spectral penalization for adversarial domain adaptation. In: ICML (2019)
4. Chen, Y.C., Lin, Y.Y., Yang, M.H., Huang, J.B.: CrDoCo: pixel-level domain transfer with cross-domain consistency. In: CVPR (2019)
5. Cheng, D., et al.: Instance-dependent label-noise learning with manifold-regularized transition matrix estimation. In: CVPR (2022)


6. Cui, K., Huang, J., Luo, Z., Zhang, G., Zhan, F., Lu, S.: GenCo: generative co-training on data-limited image generation. arXiv preprint arXiv:2110.01254 (2021)
7. Dou, Q., Ouyang, C., Chen, C., Chen, H., Heng, P.A.: Unsupervised cross-modality domain adaptation of convnets for biomedical image segmentations with adversarial loss. In: IJCAI (2018)
8. Ganin, Y., et al.: Domain-adversarial training of neural networks. JMLR (2016)
9. Guo, C., Pleiss, G., Sun, Y., Weinberger, K.Q.: On calibration of modern neural networks. In: ICML (2017)
10. Han, B., et al.: Co-teaching: robust training of deep neural networks with extremely noisy labels. In: NeurIPS (2018)
11. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: CVPR (2016)
12. Hendrycks, D., Gimpel, K.: A baseline for detecting misclassified and out-of-distribution examples in neural networks. arXiv preprint arXiv:1610.02136 (2016)
13. Huang, J., Guan, D., Xiao, A., Lu, S.: Cross-view regularization for domain adaptive panoptic segmentation. In: CVPR (2021)
14. Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. In: NeurIPS (2012)
15. Lee, S., Kim, D., Kim, N., Jeong, S.G.: Drop to adapt: learning discriminative features for unsupervised domain adaptation. In: CVPR (2019)
16. Liang, J., Hu, D., Wang, Y., He, R., Feng, J.: Source data-absent unsupervised domain adaptation through hypothesis transfer and labeling transfer. PAMI (2020)
17. Liang, J., Hu, D., Feng, J.: Do we really need to access the source data? Source hypothesis transfer for unsupervised domain adaptation. In: ICML (2020)
18. Liang, J., Hu, D., Feng, J.: Domain adaptation with auxiliary target domain-oriented classifier. In: CVPR (2021)
19. Liang, J., Hu, D., Feng, J., He, R.: DINE: domain adaptation from single and multiple black-box predictors. In: CVPR (2022)
20. Liu, S., Niles-Weed, J., Razavian, N., Fernandez-Granda, C.: Early-learning regularization prevents memorization of noisy labels. In: NeurIPS (2020)
21. Long, M., Cao, Z., Wang, J., Jordan, M.I.: Conditional adversarial domain adaptation. In: NeurIPS (2018)
22. Luo, Y.W., Ren, C.X.: Conditional Bures metric for domain adaptation. In: CVPR (2021)
23. Pan, S.J., Yang, Q.: A survey on transfer learning. TKDE (2010)
24. Peng, Q., Ding, Z., Lyu, L., Sun, L., Chen, C.: Toward better target representation for source-free and black-box domain adaptation. arXiv preprint arXiv:2208.10531 (2022)
25. Peng, X., Usman, B., Kaushik, N., Hoffman, J., Wang, D., Saenko, K.: VisDA: the visual domain adaptation challenge. arXiv preprint arXiv:1710.06924 (2017)
26. Saenko, K., Kulis, B., Fritz, M., Darrell, T.: Adapting visual category models to new domains. In: Daniilidis, K., Maragos, P., Paragios, N. (eds.) ECCV 2010. LNCS, vol. 6314, pp. 213–226. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-15561-1_16
27. Venkateswara, H., Eusebio, J., Chakraborty, S., Panchanathan, S.: Deep hashing network for unsupervised domain adaptation. In: CVPR (2017)
28. Xu, R., Li, G., Yang, J., Lin, L.: Larger norm more transferable: an adaptive feature norm approach for unsupervised domain adaptation. In: ICCV (2019)
29. Yin, H., Mallya, A., Vahdat, A., Alvarez, J.M., Kautz, J., Molchanov, P.: See through gradients: image batch recovery via GradInversion. In: CVPR (2021)


30. Yin, H., et al.: Dreaming to distill: data-free knowledge transfer via DeepInversion. In: CVPR (2020)
31. Zhang, H., Zhang, Y., Jia, K., Zhang, L.: Unsupervised domain adaptation of black-box source models. arXiv preprint arXiv:2101.02839 (2021)
32. Zhang, Y., Liu, T., Long, M., Jordan, M.: Bridging theory and algorithm for domain adaptation. In: ICML (2019)
33. Zhu, J.Y., Park, T., Isola, P., Efros, A.A.: Unpaired image-to-image translation using cycle-consistent adversarial networks. In: ICCV (2017)

Unsupervised Concept Drift Detection via Imbalanced Cluster Discriminator Learning

Mingjie Zhao1, Yiqun Zhang1(B), Yuzhu Ji1, and Yang Lu2

1 School of Computer Science and Technology, Guangdong University of Technology, Guangzhou, China
  [email protected], [email protected]
2 Fujian Key Laboratory of Sensing and Computing for Smart City, School of Informatics, Xiamen University, Xiamen, China
  [email protected]

Abstract. Streaming data processing has attracted much more attention and become a key research area in the fields of machine learning and data mining. Since the distribution of real data may evolve over time (called concept drift) due to many unforeseen factors, and real data usually has imbalanced cluster/class distributions during streaming data processing, drifts occurring in distributions with fewer data objects are easily masked by the larger distributions. This paper, therefore, proposes an unsupervised drift detection approach called Multi-Imbalanced Cluster Discriminator (MICD) to address the more challenging imbalance problem of unlabeled data. It first partitions data into compact clusters, and then learns a discriminator for each cluster to detect drift. It turns out that MICD can detect drift occurrence, locate where the drift occurs, and quantify the extent of the drift. MICD is efficient, interpretable, and has easy-to-set parameters. Extensive experiments on synthetic and real datasets illustrate the superiority of MICD.

Keywords: Streaming data · imbalanced distribution · concept drift · unsupervised drift detection · clustering

1 Introduction

With the popularization of data collection equipment, new data is generated and aggregated all the time to form streaming data. The processing and analysis of streaming data have thus become an important research topic [1–4] in the fields of pattern recognition, machine learning, and data mining. The primary difference between streaming data and static data lies not only in the continuous arrival of data, but also in the change of the data distribution that may occur at any time, which is also called concept drift [5–7]. When concept drift occurs, the previously learned knowledge may be outdated, leading to a decrease in the accuracy of data analysis tasks [8]. Therefore, it is crucial to detect drifts in streaming data [9].


Fig. 1. A toy example of imbalanced clusters (a) before the occurrence of drift, (b) after mean drift, and (c) after covariance drift.

Existing drift detection approaches are usually based on the analysis of a fixed number of data objects collected over continuous time (also called data chunks hereinafter) to judge the occurrence of drifts. The existing approaches can be roughly divided into (1) model performance change-based and (2) distribution change-based, according to how they quantify the occurrence of drifts. Model performance change-based methods are the predominant approach among existing concept drift detection methods. They monitor the statistics of the model output and identify drift when a significant change occurs. Typical approaches include OCDD [5], D3 [6], DDM [10], EDDM [11], and ADWIN [12]. However, they focus on the overall detection performance rather than locating the exact drift and capturing the drift severity. Additionally, most of them require data labels for drift detection. Distribution change-based solutions detect drifts by quantifying the disparity between the historical chunk and the new chunk. Typical approaches include EI-KMeans (EIKM) [13], kdqTree [14], QuantTree (QTree) [15], and Mann-Whitney-Wilcoxon (MWW) [16]. They do not necessarily rely on labels and are thus more practical for real data. Although they have achieved promising drift detection performance, their effectiveness is easily influenced by the common imbalance effect illustrated in Fig. 1, where drifts in minor clusters are easily overlooked because this type of approach monitors the overall change. By implementing EIKM, QTree, OCDD, and MWW on the two drift cases shown in Fig. 1, it can be found that although the two drift cases are intuitively obvious, all four trial approaches fail to detect drift due to the imbalance issue. Furthermore, the distribution imbalance issue is more complex in reality, since there are usually multiple imbalanced clusters with changing distributions, which makes drift detection more challenging. To tackle this, it is imperative to develop a drift detector that can appropriately monitor the concept drift of multiple imbalanced distributions in an unsupervised dynamic environment. This paper, therefore, proposes a new concept drift detection method called Multi-Imbalanced Cluster Discriminator (MICD) to detect whether the distribution of multiple imbalanced clusters has changed in a newly arrived chunk. It first partitions a chunk into possibly imbalanced clusters and then trains a one-class classifier for each of them to achieve a detailed distribution description. It turns out that each classifier in MICD tracks the distribution of a certain cluster,


and thus drifts can be detected accurately and in a timely manner, even if they occur in a minor cluster with a very small number of objects. Moreover, since the one-class classifiers are independent of each other, MICD can finely locate the drifts in different clusters according to their corresponding classifiers. To the best of our knowledge, this is the first attempt to specifically consider the imbalance issue in unsupervised concept drift detection. The main contributions are four-fold:
– A new paradigm for unsupervised drift detection has been proposed. It first partitions objects into possibly imbalanced clusters, and then trains descriptors for each of them to avoid overlooking drifts in minor clusters.
– To avoid the domination of major clusters, a partition-and-merging strategy is adopted, which initializes many extra seed points in a chunk to avoid missing minor clusters, and then merges them to detect major clusters.
– Compared with the existing approaches that report drift occurrence based on changes in global data statistics, our MICD can better understand concept drifts by finely locating the drift position and reflecting its severity.
– Comprehensive experiments have been conducted on 12 benchmark datasets. The experimental results illustrate that MICD achieves superior detection accuracy and is robust to different degrees of imbalance and drift severity.

2 Related Works
This section overviews existing related works, including approaches proposed for concept drift detection and imbalanced data clustering.
2.1 Concept Drift Detection

Dasu et al. [14] utilize kdqTree's spatial partitioning scheme to divide data into multiple uniformly dense subspaces. They construct histograms based on this partitioning and employ statistical tests to detect changes in data distributions. However, their partitioning approach relies heavily on the data itself and can significantly decrease efficiency when dealing with high-dimensional data. Similarly, the QuantTree method [15] employs a recursive binary partitioning scheme for individual covariates to divide the feature space, greatly improving partitioning efficiency for high-dimensional data [17,18]. However, neither of them guarantees that the partitions are consistent with the true underlying distribution. As a result, these methods may introduce randomness and lack interpretability in the generated partitions. Lu et al. [13] proposed EIKM based on K-Means to partition the data, describe the data distribution using histograms, and then perform drift detection using the Chi-square test. However, since it forces each cluster to have a sufficient number of objects to ensure the statistical meaning of the test, the partition result may not be entirely consistent with the underlying distribution.


Fig. 2. Workflow of the proposed MICD.

2.2 Imbalance Data Clustering

In real complex data clustering analysis [19–21], it is common to encounter the imbalance issue or, more generally, the long-tail problem [22–24]. Most existing imbalanced data stream learning methods rely on the availability of labels [25]. Under the unsupervised scenario [26], Multi-Exemplar Merging Clustering (MEMC) utilizes an affinity propagation mechanism to merge data objects into multiple sub-clusters; an approach combining bagging and boosting in an ensemble manner has also been proposed, which employs a new consensus function to generate combined clustering results and determine the cluster number k. On the other hand, Aksoylar et al. [27] utilize the spectral clustering method to reflect different levels of data imbalance, and then obtain the correct number of clusters accordingly. Most recently, SMCL [28] has been proposed, adopting the "partition first, fuse later" strategy for learning imbalanced data. In the partition stage, it utilizes a competitive learning clustering algorithm to produce compact sub-clusters. When the sub-clusters are fine enough, it merges nearby sub-clusters according to their density connection, and finally forms the appropriate number of imbalanced clusters.

3 Proposed Method

We first briefly overview the proposed Multi-Imbalanced Cluster Discriminator (MICD), and then present its details in the following subsections. Suppose data objects are collected in a chunk-wise manner. The first incoming chunk Db = {x1, x2, ..., xw} is the base chunk of length w, while each subsequent chunk Dt = {x1, x2, ..., xl} is tested in terms of whether concept drift occurs. To detect drifts, MICD first partitions Db into k clusters and trains k one-class classifiers accordingly to describe them. Then, MICD assesses whether the distributions in Dt have changed based on this description. The overall MICD framework is shown in Fig. 2.

3.1 Imbalanced Distribution Learning

To describe the distribution of imbalanced data, we adopt an imbalanced data clustering process [28] to first partition Db into z sub-clusters M = {M1 , M2 , ..., Mz } represented by seed points S = {s1 , s2 , ..., sz }, and then fuse


the groups to form imbalanced clusters. During the partition, an object x_t is assigned to its closest seed point s_j, which is selected by

$$j = \arg\min_{1 \le i \le z} d(x_t, s_i), \tag{1}$$

where d(x_t, s_i) represents the Euclidean distance between s_i and x_t. After partitioning all the objects according to Eq. (1), the seed points are updated by

$$s_j^{t+1} = s_j^t + z\varepsilon (x_t - s_j^t) q_{j,t} - z\beta_j \eta \varepsilon (x_t - s_j^t)(1 - q_{j,t}), \tag{2}$$

where q_{j,t} = 1 when x_t is assigned to s_j, and q_{j,t} = 0 otherwise. β_j is a penalization factor

$$\beta_j = \exp\left(-\frac{d(s_j, x_t) - d(s_v, x_t)}{d(s_v, x_j)}\right), \tag{3}$$

where s_v is the winner seed point of x_t, v is determined as j by Eq. (1), and η and ε are two constants that control the strength of penalties in the competition process. The second term in Eq. (2) drags s_j towards the cluster center, while the third term penalizes rival seed points to be far from the current cluster, ensuring a better cluster exploration of the seeds. If no seed points are displaced after convergence, it indicates that the distributions are not adequately represented, and thus extra seed points should be added into the cluster M_j with the highest winning count b_j and density gap σ_j, where

$$j = \arg\max_{r \in \{1, \ldots, z\}} b_r \sigma_r. \tag{4}$$

Since multiple density peaks indicate the presence of multiple clusters within a cluster, the density gap can be assessed by examining whether there are multiple density peaks, which can be computed by analyzing the ϵ-ball graph, constructed by joining together neighbors whose points are less than ϵ away from each other in the cluster [29]. After adding seed points and updating the number of sub-clusters z, the competition process is iteratively repeated until certain seed points are penalized to such an extent that they are expelled from all clusters. Once this happens, it means that each cluster is effectively represented by multiple sub-clusters, and the algorithm terminates. Next, we fuse M and determine the optimal number of clusters. To maximize the fusion of sub-clusters within the same cluster and avoid merging clusters from different groups, the algorithm employs a mixed one-dimensional binary Gaussian distribution to quantify the degree of separation between two sub-clusters M_i and M_j, denoted as h_{ij} = 1/min f(A). The discrete function f(A) is calculated by projecting points from M_i and M_j onto a one-dimensional plane. After obtaining the separation degree h_{ij} between the sub-clusters, it is used to merge M based on the ascending order of separation until only one cluster remains. Then, the optimal cluster number k, the final division M = {M1, M2, ..., Mk} of Db and the corresponding sub-cluster fusion queue Q are obtained by using the internal evaluation metrics of clustering:

$$k = \arg\min_{1 \le j \le z-1} \left( \frac{com_j}{\max_{j'}\{com_{j'}\}} + \frac{sep_j}{\max_{j'}\{sep_{j'}\}} \right), \tag{5}$$


where com_j and sep_j represent the compactness and separation degree of the j-th sub-cluster, respectively. Once the partition of Db is completed, we utilize the partition to train the cluster discriminators, as discussed in the next section.

3.2 Multi-cluster Descriptor Training

After the partition of Db, we obtain the partition results M and the cluster number k. We then use them to train the multi-cluster discriminator C = {c1(x), c2(x), ..., ck(x)} composed of k one-class classifiers. A one-class classifier possesses the capability to learn from the existing data and estimate the decision boundary based on the distribution of the data in the feature space. When dealing with data that exhibits an imbalanced distribution with multiple clusters, using a single one-class classifier to determine the decision boundary may lead to the drift of a specific cluster being obscured by other clusters. To address this issue, we utilize the partition result M to train our multi-cluster discriminator. The one-class classifier used in our multi-cluster discriminator can be any type of classifier designed for one-class classification. For a cluster Mi in M, we consider all the object points within Mi as positive examples and proceed to train the corresponding classifier ci(x) by min(f(x)), where f(x) represents a cost function that quantifies the goodness of fit of the model to the positive examples. After completing the training of the multi-cluster discriminator, the decision boundaries of each cluster distribution in Db can be effectively differentiated. This approach enables us to simultaneously and independently track the distribution changes of each cluster.
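As a concrete illustration, a multi-cluster discriminator built from basic one-class SVMs (the classifier type used later in the experiments) could be trained as sketched below; the hyper-parameters and helper name are assumptions, not the authors' exact configuration.

import numpy as np
from sklearn.svm import OneClassSVM

def train_multi_cluster_discriminator(clusters):
    # clusters: list of arrays, one per cluster M_j of the base chunk D_b.
    # Each one-class classifier c_j is fit only on the objects of its own cluster.
    return [OneClassSVM(kernel="rbf", nu=0.1, gamma="scale").fit(M_j) for M_j in clusters]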

3.3 Concept Drift Detection Based on MCD

To detect drifts in the test chunk D_t, it is first partitioned and then compared with the trained multi-cluster discriminator. To address the potential imbalance issue in D_t, we still adopt the "partition first, fuse later" strategy that was applied to the base chunk D_b. It is worth noting that we use the seed points S obtained from D_b to "finely partition" D_t in order to achieve a higher drift sensitivity. Specifically, the object-cluster affiliation u_{i,j} is computed as

u_{i,j} = \begin{cases} 1, & \text{if } d(x_i, s_j) \le d(x_i, s_t) \ \ \forall t \in \{1, 2, \dots, z\} \\ 0, & \text{otherwise,} \end{cases}    (6)

and u_{i,j} = 1 indicates that the i-th object is assigned to the j-th cluster. After the "fine partition" of D_t, the pre-partitioned clusters Ō = {Ō_1, Ō_2, ..., Ō_z} are obtained. We then use the fusion queue Q to merge them into the final clusters O = {O_1, O_2, ..., O_k}, and the process can be written as

v_{i,j} = \begin{cases} 1, & \text{if } i \in Q[j] \\ 0, & \text{otherwise,} \end{cases}    (7)

where j = {1, 2, ..., k}, and Q[j] represents the j-th element of the fusion queue Q, which contains the indexes of the sub-clusters fused into M_j. v_{i,j} = 1 represents



Algorithm 1. Imbalance Concept Drift Detection
Input: Trained multi-cluster discriminator C, fusion queue Q, seed point list S, final cluster number k, drift threshold ρ, test chunk D_t
Output: Drift detection results O_drift
1: O_drift = ∅, z = |S|
2: Initialize the pre-partition cluster set Ō = {Ō_1, Ō_2, ..., Ō_z}
3: Initialize the final cluster set O = {O_1, O_2, ..., O_k}
4: for i ← 1 to |D_t| do                      ▷ Pre-partition D_t
5:     Compute u_{i,j} according to Eq. (6)
6:     if u_{i,j} = 1 then
7:         Ō_j = Ō_j ∪ {x_i}
8:     end if
9: end for
10: for i ← 1 to z do                         ▷ Fuse D_t
11:     Compute v_{i,j} according to Eq. (7)
12:     if v_{i,j} = 1 then
13:         O_j = O_j ∪ Ō_i
14:     end if
15: end for
16: for j ← 1 to k do                         ▷ Drift detection on D_t
17:     Calculate α_j by Eq. (9) based on O_j and c_j(x)
18:     if α_j > ρ then
19:         O_drift = O_drift ∪ {O_j}
20:     end if
21: end for

that the i-th pre-partitioned sub-cluster Ō_i is fused into the j-th cluster of the final partition O. Each cluster in O is then assigned to the corresponding one-class classifier within the multi-cluster discriminator for subsequent drift detection. For each object x_i ∈ O_j, the one-class classifier labels it by

c_j(x_i) = \begin{cases} 1, & \text{if } x_i \text{ is from a new concept} \\ 0, & \text{otherwise,} \end{cases}    (8)

where c_j(x_i) = 1 means that object x_i is an out-of-distribution object, and c_j(x_i) = 0 means that it conforms to the learned distribution. Accordingly, a threshold ρ is adopted to signal the occurrence of drift in the corresponding cluster O_j. Specifically, if the proportion of out-of-distribution objects

\alpha_j = \frac{1}{|O_j|} \sum_{i=1}^{|O_j|} c_j(x_i)    (9)

exceeds ρ, i.e., α_j > ρ, then O_j is believed to have drifted. It is worth noting that the threshold on α_j is adopted to avoid over-sensitivity of the drift detection. After detection, all the clusters with detected drifts are reported as O_drift = {O_j | α_j > ρ, j = 1, 2, ..., k}. We summarize the process of concept drift detection in Algorithm 1. The whole MICD approach is illustrated in Fig. 3.
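The per-cluster drift test of Eqs. (8)-(9) (lines 16-21 of Algorithm 1) then amounts to counting out-of-distribution predictions. A minimal sketch, continuing the scikit-learn discriminator sketched above; mapping the +1/-1 output of `OneClassSVM.predict` to the paper's 1/0 convention is our assumption:

```python
def detect_drift(C, final_clusters, rho=0.2):
    """Report the clusters of the test chunk whose out-of-distribution ratio exceeds rho."""
    drifted = []
    for j, (c_j, O_j) in enumerate(zip(C, final_clusters)):
        if len(O_j) == 0:
            continue
        preds = c_j.predict(O_j)        # +1: fits the learned distribution, -1: outlier
        alpha_j = (preds == -1).mean()  # Eq. (9): proportion of out-of-distribution objects
        if alpha_j > rho:               # drift alarm for cluster O_j
            drifted.append(j)
    return drifted
```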



Fig. 3. Detailed working procedures of MICD.

4 Experiments

We first provide the experimental setup in Sect. 4.1, and then present the results with observations in Sects. 4.2-4.4.

4.1 Experimental Setup

Experimental Design: We compare MICD with state-of-the-art methods on real and synthetic datasets, and also conduct ablation experiments to assess the importance of its modules. To test the robustness of MICD under different balance rates, we evaluate its performance on ten generated 2D-2G-Mean datasets with varying balance rates. To examine the drift sensitivity of MICD, we also evaluate it on datasets with different degrees of drift severity. Accuracy = (TP + TN)/(TP + TN + FP + FN) and the False Negative Rate (FNR) are used as the evaluation metrics. Note that FNR refers to the rate of wrongly identifying a data chunk with drift as being without drift.

Datasets: We conduct experiments on seven synthetic and five real datasets. As we attempt to simultaneously consider imbalance and drift issues, a new data generation strategy is adopted: we first generate data with imbalanced clusters, and then generate drifts following [13] and [30]. For each dataset, we use a base chunk for learning and generate 250 test chunks with the same distribution as the base chunk and 250 test chunks with a different distribution for testing. For detailed dataset information and drift generation operations, please visit the link: https://drive.google.com/file/d/1o30zT6alN5ym9vMvl4dHiPLUNXeSyat4/view.

Counterparts: We compare our method with the following methods: K-Means plus our multi-cluster discriminator (KMCD), EI-KMeans (EIKM) [13], QuantTree (QTree) [15], OCDD [5], and the Mann-Whitney-Wilcoxon (MWW) test [16]. The parameters of these methods were set according to the code and papers released by their authors. To ensure experimental reliability and minimize the introduction of additional factors, we employ the most basic form of SVM with the



radial basis function kernel as the one-class classifier for our multi-cluster discriminator. In all experiments, we set the hyper-parameter ρ = 0.2.

Table 1. FNR on synthetic datasets. The best result is highlighted in bold.

| Dataset | MICD | KMCD | EIKM | QTree | OCDD | MWW |
|---|---|---|---|---|---|---|
| 2D-2G-Mean | 6.8 | 6.8 | 57.0 | 19.8 | 100.0 | 92.4 |
| 2D-2G-Cov | 3.6 | 3.6 | 16.4 | 32.0 | 100.0 | 98.4 |
| 4D-4G-Mean | 0.4 | 11.6 | 30.4 | 26.9 | 99.6 | 19.2 |
| 4D-4G-Cov | 12.4 | 67.6 | 69.2 | 40.6 | 97.2 | 90.0 |
| 2D-8G-Mean | 2.8 | 4.8 | 40.8 | 6.8 | 92.8 | 100.0 |
| 2D-8G-Cov | 0.4 | 0.4 | 98.4 | 77.7 | 100.0 | 100.0 |
| 2D-Moon-N | 7.6 | 100.0 | 73.6 | 72.1 | 100.0 | 100.0 |
| Average FNR | 4.9 | 27.8 | 55.1 | 39.4 | 98.5 | 85.7 |
| Average Rank | 1.0 | 2.0 | 3.9 | 3.0 | 5.4 | 4.7 |
| Average Accuracy | 0.9 | 0.8 | 0.6 | 0.7 | 0.5 | 0.6 |

Table 2. FNR on real datasets. The best result is highlighted in bold.

| Dataset | MICD | KMCD | EIKM | QTree | OCDD | MWW |
|---|---|---|---|---|---|---|
| MFCCs | 26.8 | 99.2 | 77.6 | 43.1 | 99.6 | 100.0 |
| Avail | 55.6 | 53.6 | 95.6 | 81.2 | 53.2 | 99.6 |
| Posture | 73.6 | 84.8 | 83.2 | 85.6 | 100.0 | 90.8 |
| NOAA | 6.4 | 21.6 | 96.4 | 62.8 | 29.2 | 17.6 |
| Insect | 29.6 | 46.8 | 84.4 | 63.6 | 100.0 | 96.4 |
| Average FNR | 38.4 | 61.2 | 87.4 | 67.3 | 76.4 | 80.9 |
| Average Rank | 1.4 | 2.8 | 4.0 | 3.6 | 4.4 | 4.8 |
| Average Accuracy | 0.7 | 0.7 | 0.6 | 0.6 | 0.6 | 0.6 |

4.2 Comparative Results

In this experiment, we employ multiple metrics to ensure a comprehensive evaluation. Tables 1 and 2 compare the FNR and a condensed summary of the Accuracy results of the different approaches on the synthetic and real datasets, respectively. It can be seen that MICD outperforms its counterparts, as it has the lowest average FNR and the top ranking. According to the results shown in Table 1, MICD always performs the best among all compared approaches on the synthetic datasets. On



the real datasets in Table 2, MICD achieves the best FNR on four out of five datasets (the exception being the Avail dataset). Moreover, KMCD performs worse than MICD because it ignores the imbalance effect when sketching the data distribution. Nevertheless, KMCD still performs better than the other counterparts because it can avoid the domination of the major cluster to a certain extent, provided the minor clusters can be discriminated by the adopted K-Means. Moreover, the condensed summary of the Accuracy results reveals that our method also exhibits strong competitiveness in terms of Accuracy.

Fig. 4. Ablation Study of MICD.

4.3 Ablation Study

The detection performance of MICD and its two ablated versions is shown in Fig. 4. MICD outperforms KMCD, validating the imbalanced partitioning mechanism it employs. This mechanism accurately describes the data distribution, avoiding the misclassification of minor clusters into major ones and forming a foundation for drift detection. Moreover, MICD's superiority over OCDD confirms the effectiveness of assigning a descriptor to each cluster. The use of independent descriptors for each cluster prevents large clusters from dominating the detection results, effectively identifying concept drift in imbalanced data. These results demonstrate the efficacy of MICD's core mechanisms: imbalanced partitioning, and distribution description and detection using multiple one-class classifiers.

4.4 Study on Imbalance Rate and Drift Severity

As can be seen in Fig. 5(a), as the imbalance rate increases, MICD consistently maintains a low FNR. This is because MICD finely describes clusters with different imbalance rates and can thus identify the occurrence of drifts in time. KMCD matches the performance of MICD because this synthetic dataset has two separable clusters that can be easily partitioned by K-Means. By contrast, the performance of the remaining four approaches deteriorates as the imbalance rate increases, since they monitor the overall statistics of the distributions to detect drifts and are thus less sensitive to drifts occurring in very small clusters. In



Fig. 5(b), it can be seen that as the number of outer-cluster objects increases, all approaches perform better because the drifts become more obvious and easier to detect. It is worth noting that MICD always outperforms its counterparts under different severity levels, which illustrates its superiority.

Fig. 5. FNR under different Imbalance Rates (IR) and varying drift-severity.

5 Conclusion

In this paper, we proposed a novel concept drift detection method called Multi-Imbalanced Cluster Discriminator (MICD). Compared with existing solutions, MICD further addresses the impact of imbalanced data distributions on drift detection in an unsupervised environment, enabling MICD to operate in real, complex environments. It trains descriptors for the partitioned clusters of different sizes to determine whether subsequent objects conform to the description of the original distribution. Thanks to the reasonable partitioning of imbalanced clusters and the independent descriptor assigned to each cluster, MICD can accurately perceive the drifts that occur in minor clusters and, at the same time, reveal the drift position and drift severity. Experimental results illustrate the promising performance of MICD, and observations on the experimental results shed light on the impact of imbalanced data on unsupervised concept drift detection.

Acknowledgements. This work was supported in part by the National Natural Science Foundation of China (NSFC) under grants: 62102097, 62002302, and 62302104, the Natural Science Foundation of Guangdong Province under grants: 2023A1515012855, 2023A1515012884, and 2022A1515011592, the Science and Technology Program of Guangzhou under grant 202201010548, and the China Fundamental Research Funds for the Central Universities under grant 20720230038.



References
1. Žliobaitė, I., Pechenizkiy, M., et al.: An overview of concept drift applications. In: Big Data Analysis: New Algorithms for a New Society, pp. 91–114 (2016)
2. Cheung, Y.M., Zhang, Y.: Fast and accurate hierarchical clustering based on growing multilayer topology training. IEEE Trans. Neural Netw. Learn. Syst. 30(3), 876–890 (2018)
3. Zhao, L., Zhang, Y., Ji, Y., et al.: Heterogeneous drift learning: classification of mix-attribute data with concept drifts. In: DSAA, pp. 1–10 (2022)
4. Zhang, Z., Zhang, Y., Zeng, A., et al.: Time-series data imputation via realistic masking-guided tri-attention Bi-GRU. In: ECAI, pp. 1–9 (2023)
5. Gözüaçık, Ö., Can, F.: Concept learning using one-class classifiers for implicit drift detection in evolving data streams. Artif. Intell. Rev. 54, 3725–3747 (2021)
6. Gözüaçık, Ö., Büyükçakır, A., et al.: Unsupervised concept drift detection with a discriminative classifier. In: CIKM, pp. 2365–2368 (2019)
7. Frittoli, L., Carrera, D., et al.: Nonparametric and online change detection in multivariate data streams using QuantTree. IEEE Trans. Knowl. Data Eng. 35(8), 8328–8342 (2023)
8. Gemaque, R.N., Costa, A.F.J., et al.: An overview of unsupervised drift detection methods. Wiley Interdiscip. Rev. Data Min. Knowl. Discov. 10(6), e1381 (2020)
9. Lu, J., Liu, A., et al.: Learning under concept drift: a review. IEEE Trans. Knowl. Data Eng. 31(12), 2346–2363 (2018)
10. Gama, J., Medas, P., et al.: Learning with drift detection. In: Brazilian Symposium on Artificial Intelligence, pp. 286–295 (2004)
11. Baena-García, M., del Campo-Ávila, J., et al.: Early drift detection method. In: Fourth International Workshop Knowledge Discovery Data Streams, vol. 6, pp. 77–86 (2006)
12. Bifet, A., Gavalda, R.: Learning from time-changing data with adaptive windowing. In: SDM, pp. 443–448 (2007)
13. Liu, A., Lu, J., et al.: Concept drift detection via equal intensity k-means space partitioning. IEEE Trans. Cybern. 51(6), 3198–3211 (2021)
14. Dasu, T., Krishnan, S., et al.: An information-theoretic approach to detecting changes in multi-dimensional data streams. In: Proceedings of 28th ISCAS (2006)
15. Boracchi, G., Carrera, D., et al.: QuantTree: histograms for change detection in multivariate data streams. In: ICML, pp. 639–648 (2018)
16. Friedman, J.H., Rafsky, L.C.: Multivariate generalizations of the Wald-Wolfowitz and Smirnov two-sample tests. Ann. Stat. 7(4), 697–717 (1979)
17. Zhang, Y., Cheung, Y.M.: A fast hierarchical clustering approach based on partition and merging scheme. In: ICACI, pp. 846–851 (2018)
18. Zhang, Y., Cheung, Y.M., Liu, Y.: Quality preserved data summarization for fast hierarchical clustering. In: IJCNN, pp. 4139–4146 (2016)
19. Zhang, Y., Cheung, Y.M.: Graph-based dissimilarity measurement for cluster analysis of any-type-attributed data. IEEE Trans. Neural Netw. Learn. Syst. 34, 6530–6544 (2022)
20. Zhang, Y., Cheung, Y.M., Zeng, A.: Het2Hom: representation of heterogeneous attributes into homogeneous concept spaces for categorical-and-numerical-attribute data clustering. In: IJCAI, pp. 3758–3765 (2022)
21. Zhang, Y., Cheung, Y.M.: Exploiting order information embedded in ordered categories for ordinal data clustering. In: ISMIS, pp. 247–257 (2018)



22. Shang, X., Lu, Y., et al.: Federated learning on heterogeneous and long-tailed data via classifier re-training with federated features. In: IJCAI, pp. 2218–2224 (2022)
23. Shang, X., Lu, Y., et al.: FEDIC: federated learning on non-IID and long-tailed data via calibrated distillation. In: ICME, pp. 1–6 (2022)
24. Li, M., Cheung, Y.M., Lu, Y., et al.: Long-tailed visual recognition via Gaussian clouded logit adjustment. In: CVPR, pp. 6929–6938 (2022)
25. Lu, Y., Cheung, Y.M., et al.: Adaptive chunk-based dynamic weighted majority for imbalanced data streams with concept drift. IEEE Trans. Neural Netw. Learn. Syst. 31(8), 2764–2778 (2020)
26. Tekli, J.: An overview of cluster-based image search result organization: background, techniques, and ongoing challenges. Knowl. Inf. Syst. 64(3), 589–642 (2022)
27. Aksoylar, C., Qian, J., et al.: Clustering and community detection with imbalanced clusters. IEEE Trans. Signal Inf. Process. Netw. 3(1), 61–76 (2016)
28. Lu, Y., Cheung, Y.M., et al.: Self-adaptive multiprototype-based competitive learning approach: a k-means-type algorithm for imbalanced data clustering. IEEE Trans. Cybern. 51(3), 1598–1612 (2019)
29. Rodriguez, A., Laio, A.: Clustering by fast search and find of density peaks. Science 344(6191), 1492–1496 (2014)
30. Souza, V.M., dos Reis, D.M., et al.: Challenges in benchmarking stream learning algorithms with real-world data. Data Min. Knowl. Disc. 34, 1805–1858 (2020)

Unsupervised Domain Adaptation for Optical Flow Estimation

Jianpeng Ding, Jinhong Deng, Yanru Zhang, Shaohua Wan, and Lixin Duan(B)

Shenzhen Institute for Advanced Study, University of Electronic Science and Technology of China, Chengdu, China
{yanruzhang,shaohua.wan,lxduan}@uestc.edu.cn

Abstract. In recent years, we have witnessed significant breakthroughs in optical flow estimation with the thriving of deep learning. The performance of unsupervised methods remains unsatisfactory due to the lack of effective supervision, while supervised approaches typically assume that the training and test data are drawn from the same distribution, which does not always hold in practice. Such a domain shift problem commonly exists in optical flow estimation and causes a significant performance drop. In this work, we address these challenging scenarios and aim to improve the generalization ability of cross-domain optical flow estimation models. Thus we propose a novel framework to tackle the domain shift problem in optical flow estimation. Specifically, we first design a domain adaptive autoencoder to transform the source domain and target domain images into a common intermediate domain. We align the distributions of the source and target domains in the latent space with a discriminator, and the optical flow estimation module takes the images in the intermediate domain to predict the optical flow. Our model can be trained in an end-to-end manner and serves as a plug-and-play module for existing optical flow estimation models. We conduct extensive experiments on the domain adaptation scenarios Virtual KITTI to KITTI and FlyingThings3D to MPI-Sintel, and the experimental results show the effectiveness of our proposed method.

Keywords: Domain adaptation · Optical flow · Deep learning

1 Introduction

Optical flow estimation is the task of calculating the motion of each pixel between consecutive frames in a video sequence. It is a core problem in computer vision and has a wide range of application scenarios such as video editing [15], behavior recognition [29] and object tracking [1]. With the thriving of deep convolutional neural networks (DCNNs), we have witnessed significant breakthroughs and impressive performance on challenging benchmarks [18] in recent years. Nevertheless, unsupervised methods still lag behind supervised methods



due to the lack of effective supervision. Meanwhile, supervised methods usually assume that the training data and the target data are drawn from the same distribution, which cannot always be guaranteed in practice. The model will encounter considerable domain shift caused by illumination, background, capture equipment, etc. Unsupervised domain adaptation (UDA) has been proposed to alleviate the domain shift problem by aligning the feature distributions between the source and target domains. It has been successfully used in many computer vision tasks such as classification [34], semantic segmentation [22], and object detection [5]. Most of these methods adopt adversarial learning, where a domain discriminator is employed to distinguish the target samples from the source samples, while the feature extractor tries to fool the discriminator by generating domain-invariant features. Motivated by the UDA methods, we propose to develop an optical flow estimation model that can adapt to a new domain. To address domain shift, we design an extra, simple yet effective domain adaptive autoencoder module to generate intermediate-domain images, which are then fed into the optical flow estimation model. To be specific, the domain adaptive autoencoder has an encoder-decoder architecture, where the encoder maps image pairs into a latent space and the decoder tries to reconstruct them. We expect the autoencoder to learn domain-invariant features in the latent space, so that the model, taking the reconstructed intermediate images, can accurately predict the optical flow for both the source and target domains. Different from other UDA methods, which modify the architecture for specific tasks, our method does not need to remold the complex optical flow estimation model. Instead, our model can be a plug-and-play module for existing optical flow estimation models and is trained in an end-to-end manner.

2 Related Work

Supervised Learning: Top-performing optical flow estimation methods are mainly based on recent advances in deep neural networks. As proposed by Dosovitskiy et al. [6], FlowNet (FlowNetS and FlowNetC) directly uses a CNN (Convolutional Neural Network) model for optical flow estimation. FlowNet2 [13] then stacks several basic FlowNet models into a larger one, further improving the performance. Recently, the Recurrent All-Pairs Field Transforms (RAFT) model [31] proposed to compute the cost volume between all pairs of pixels and to iteratively update a flow field through a recurrent unit that performs lookups on the correlation volumes. Many methods [16,27,36] extend RAFT and further improve the performance. However, these methods cannot generalize well to a new domain when considerable domain shift exists between the training domain and the test domain. Unsupervised Learning: Unsupervised approaches do not need annotations for training; they optimize photometric consistency with some



regularization [28]. This kind of method focuses on occlusion handling, including DDFlow [20], SelFlow [21], and OAFlow [33]. Due to the lack of effective supervision, there is still a big gap in performance between unsupervised and supervised methods. Therefore, in this paper, we attempt to leverage a labeled source domain to improve the performance on an unlabeled target domain. Domain Adaptation: Transferring the learned knowledge from a labeled source domain to an unlabeled target domain has become an appealing alternative in high-level tasks such as image classification and object detection. Domain adaptation (DA) addresses this problem by minimizing the domain discrepancy between the source and target domains [37]. Many methods [4,7,12] employ a domain classifier to learn domain-invariant representations through adversarial training, or metric learning to minimize the domain disparity between the source and target domains. To the best of our knowledge, UDA for optical flow estimation has not been explored widely. Many methods focus on adapting synthetic-to-real images by adversarial learning or label transfer for pixel-level prediction tasks such as semantic segmentation. Similarly, CyCADA [11] uses CycleGAN [38] and transfers the source domain to the target domain with pixel-level alignment. However, these methods do not provide an improvement on the optical flow estimation task. Moreover, deep-learning-based optical flow estimation must compute cost volumes [30], which makes it difficult to achieve feature alignment directly within the optical flow network. Therefore, we use the autoencoder module as an alternative to transform the images into a common intermediate domain, which helps the optical flow estimation, so that our approach can eliminate the domain discrepancy between the source and target domains and generalize well to the target domain.

3 Method

We have a source domain S and a target domain T. Formally, let us denote D_S = \{((I_s^1, I_s^2)_i, y_s)\}_{i=1}^{N_s} as the source domain training samples drawn from S, where each (I_s^1, I_s^2) is a source image pair and y_s is the corresponding optical flow label. Accordingly, we denote D_T = \{(I_t^1, I_t^2)_i\}_{i=1}^{N_t} as the target domain samples drawn from T, where each (I_t^1, I_t^2) is a target image pair. Note that the optical flow label is unavailable for the target domain. The objective of our method is to learn an optical flow estimation model that generalizes well to the target domain T.

3.1 Overview

To achieve this goal, we present an end-to-end trainable network which is composed of two modules: (1) the domain adaptive autoencoder (DAAE) module A



Fig. 1. Illustration of the proposed framework. It consists of two modules: a domain adaptive autoencoder (DAAE) and an optical flow module. The DAAE module consists of an encoder, a decoder and a discriminator, where the encoder takes images and extracts their features into a latent space, and the decoder reconstructs the images from the features in the latent space. The discriminator aligns the distributions of the source and target domains in the latent space through adversarial learning. For source domain samples, we utilize the optical flow module to calculate the flow loss L_flow, while we use the photometric loss L_photo to supervise the target domain samples.

and (2) the optical flow estimation network F. As illustrated in Fig. 1, the DAAE module takes the source domain (target domain) image pairs and generates intermediate image pairs, which are reconstructed from the domain-invariant latent space and fed into the optical flow estimation module to predict the corresponding optical flow. Through collaborative learning, the DAAE and the optical flow estimation module become capable of predicting accurate optical flow for the target domain. After training, we feed image pairs from the target domain to the DAAE module and the optical flow estimation module to produce the final optical flow. Specifically, the DAAE module consists of an encoder, a decoder and a domain discriminator D. The encoder takes the image pairs and extracts their features into a latent space, while the decoder reconstructs the original image pairs from the latent space. To align the distributions of the source domain and target domain, we apply a discriminator in the latent space to learn domain-invariant representations. The encoder tries to extract domain-invariant features to fool the discriminator, while the discriminator strives to figure out whether the inputted features come from the source domain or the target domain. In this way, the DAAE module produces intermediate-domain images that effectively help the learning of the optical flow estimation module. For the optical flow estimation module, we adopt RAFT [31] as the base optical flow estimator, owing to its effectiveness and wide usage. RAFT introduces an all-pairs correlation volume, which explicitly models matching correlations



for all possible displacements, and utilizes a gated recurrent unit (GRU) decoder for iterative residual refinement.

3.2 Domain Adaptive Autoencoder

In this section, we introduce the detailed design of our proposed domain adaptive autoencoder (DAAE). It aims to generate intermediate-domain image pairs for both the source and target domains, so that the optical flow estimation module, taking the intermediate-domain image pairs, can produce more accurate optical flow. To achieve this goal, the DAAE module first needs to preserve the spatial structure information between the inputted image pair and the reconstructed image pair. We then expect the DAAE module to produce domain-invariant image pairs for better training of the optical flow estimation module. To this end, we preserve the spatial structure information between the inputted image pair and the reconstructed image pair by defining a reconstruction loss:

L_{rec} = \alpha \cdot L_{ss}(I, \hat{I}) + (1 - \alpha) \cdot L_1(I, \hat{I}),    (1)

where α = 0.85 is a trade-off parameter that balances the structural similarity loss L_{ss} and the l1-norm loss L_1 = ||I - \hat{I}||_1, and I and \hat{I} represent the inputted image and the image reconstructed by the decoder, for both the source domain and the target domain, respectively. The structural similarity term ensures pixel-level similarity using the Structural Similarity (SSIM) [35] measure:

L_{ss} = \frac{1 - SSIM(I, \hat{I})}{2}.    (2)

We expect the reconstructed image pair to be domain-invariant, so that we can transfer knowledge from the source domain to the target domain and the optical flow estimation module can perform well on the target domain. To address the domain shift, we plug a discriminator into the autoencoder, as shown in Fig. 1. We obtain domain-invariant features in the latent space through adversarial learning in a min-max manner. On the one hand, the discriminator strives to figure out whether the inputted features come from the source or the target domain. On the other hand, the encoder tries to fool the discriminator by extracting domain-invariant features. The adversarial loss can be written as:

L_{adv} = -\sum_{h,w} \log(D(E(I_T))).    (3)

To train the discriminator network, we minimize the spatial cross-entropy loss L_D with respect to the two classes using

L_D = -\sum_{h,w} \left[ (1 - z) \cdot \log(D(E(I_s))) + z \cdot \log(D(E(I_t))) \right],    (4)

where z is the domain label, with z = 0 if the sample is from the source domain and z = 1 for the target domain, and h and w are the height and width of the feature map, respectively.
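For illustration, the DAAE losses of Eqs. (1)-(4) can be computed as in the PyTorch sketch below. The α = 0.85 follows the text; the module interfaces, the `ssim` callable and the binary cross-entropy formulation of the discriminator loss are assumptions of this sketch rather than the authors' exact implementation.

```python
import torch
import torch.nn.functional as F

def daae_losses(encoder, decoder, discriminator, img_src, img_tgt, ssim, alpha=0.85):
    """Reconstruction (Eqs. 1-2), adversarial (Eq. 3) and discriminator (Eq. 4) losses."""
    z_s, z_t = encoder(img_src), encoder(img_tgt)
    rec_s, rec_t = decoder(z_s), decoder(z_t)

    def rec_loss(x, x_hat):
        l_ss = (1.0 - ssim(x, x_hat)) / 2.0                        # Eq. (2)
        return alpha * l_ss + (1.0 - alpha) * F.l1_loss(x, x_hat)  # Eq. (1)

    loss_rec = rec_loss(img_src, rec_s) + rec_loss(img_tgt, rec_t)

    # Eq. (3): the encoder tries to make target features look like source features (label 0)
    d_t = discriminator(z_t)
    loss_adv = F.binary_cross_entropy_with_logits(d_t, torch.zeros_like(d_t))

    # Eq. (4): the discriminator separates source (z = 0) from target (z = 1) features;
    # detached features keep this loss independent of the encoder's computation graph
    d_s_det = discriminator(z_s.detach())
    d_t_det = discriminator(z_t.detach())
    loss_d = (F.binary_cross_entropy_with_logits(d_s_det, torch.zeros_like(d_s_det))
              + F.binary_cross_entropy_with_logits(d_t_det, torch.ones_like(d_t_det)))
    return loss_rec, loss_adv, loss_d
```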

3.3 Incorporating with RAFT

We utilize the DAAE module to generate the intermediate domain for both source domain images and target domain images for optical flow estimation. In this section, we illustrate how to incorporate the DAAE module with the optical flow estimation method RAFT [31]. For the source domain, we have ground truth labels to supervise RAFT. For the target domain, however, we cannot obtain the ground truth flow. Fortunately, many unsupervised approaches [17,20,21] have found that the photometric loss can be used as an alternative supervision signal in place of the ground truth flow. However, the photometric loss does not work in regions that are occluded by other objects, since their corresponding regions do not exist in the other image. Thus, we apply forward-backward checking [25] to estimate the occluded pixels. In this way, we can provide some supervision for the unlabeled target domain, and the loss can be written as:

L_{photo} = (1 - SSIM(\rho(I_t^2, f), I_t^1)) \odot M_t,    (5)

where ρ(·) is the warping operation and f represents the predicted optical flow. M_t is the occlusion mask estimated by forward-backward checking, where 1 represents non-occluded pixels in I_t^2 and 0 those that are occluded. I_t^1 and I_t^2 represent the image pair from the target domain.
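A sketch of the photometric term of Eq. (5). The `warp` (backward warping of I_t^2 by the predicted flow) and `ssim_map` (per-pixel SSIM) callables are assumptions of this sketch, not the authors' exact routines:

```python
import torch

def photometric_loss(img1, img2, flow, occ_mask, ssim_map, warp):
    """Eq. (5): SSIM-based photometric loss restricted to non-occluded target pixels.

    img1, img2: consecutive target-domain frames I_t^1, I_t^2
    flow:       predicted flow from img1 to img2
    occ_mask:   float mask, 1 for non-occluded pixels (forward-backward check), 0 otherwise
    """
    warped = warp(img2, flow)                 # rho(I_t^2, f): bring I_t^2 into I_t^1's frame
    diff = 1.0 - ssim_map(warped, img1)       # per-pixel dissimilarity
    masked = diff * occ_mask                  # discard occluded regions
    return masked.sum() / occ_mask.sum().clamp(min=1.0)
```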

3.4 Overall Objective

Combining the losses of the DAAE and RAFT on the source and target samples, the overall objective function can be written as:

L = L_{flow} + L_{rec} + \lambda_1 L_{adv} + \lambda_2 L_{photo},    (6)

where λ_1 and λ_2 are hyper-parameters used to control the relative importance of the respective loss terms. Based on Eq. (6), we optimize the following min-max criterion:

\max_{D} \min_{E, F} L.    (7)

The ultimate goal is to optimize the optical flow loss for source images and the photometric loss for target images, while maximizing the probability of target features being considered as source features in the latent space. Namely, to train our network using labeled source-domain image pairs and unlabeled target-domain image pairs, we minimize the optical flow loss L_{flow} and the photometric loss L_{photo}, while the adversarial loss L_{adv} is optimized to align the feature distributions of the two domains in the latent space.
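One common way to realise the min-max criterion of Eq. (7) is to alternate a main update (encoder, decoder, flow network) with a discriminator update. The sketch below assumes the loss routines sketched earlier in this section and two separate optimizers; it is an illustration, not the authors' training script (λ1 = 0.01 and λ2 = 0.1 follow Sect. 4.1).

```python
def train_step(batch, opts, losses, lambda1=0.01, lambda2=0.1):
    """One alternating update for the min-max criterion of Eq. (7).

    opts:   dict with 'main' (encoder/decoder/flow) and 'disc' optimizers
    losses: dict of callables returning the losses sketched above:
            'daae' -> (L_rec, L_adv, L_D), 'flow' -> L_flow, 'photo' -> L_photo
    """
    loss_rec, loss_adv, loss_d = losses["daae"](batch)
    loss_total = (losses["flow"](batch)               # supervised flow loss on source pairs
                  + loss_rec
                  + lambda1 * loss_adv
                  + lambda2 * losses["photo"](batch))  # photometric loss on target pairs

    opts["main"].zero_grad()
    loss_total.backward()                             # min over E, F (Eq. 6)
    opts["main"].step()

    opts["disc"].zero_grad()
    loss_d.backward()                                 # "max" side, via the discriminator's own loss
    opts["disc"].step()
    return loss_total.item()
```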

3.5 Network Architecture

Our network architecture consists of two parts: an optical flow network and an autoencoder network. The optical flow network has the same architecture as RAFT [31]. The autoencoder network is composed of an encoder (we also



call it the generator) and a decoder. The encoder contains three convolution layers and 9 residual blocks [10]; each convolution layer is followed by an instance-normalization layer and a ReLU layer. To restore the input size and the overall image structure information, we use convolution layers and pixel-shuffle layers in the decoder. Our discriminator is fully convolutional. Similar to [32], it contains five convolution layers with a kernel size of 4 × 4 and a stride of 2, and each convolution layer is followed by a ReLU layer. We jointly train the encoder-decoder network, the discriminator, and the optical flow estimation network in an end-to-end manner.
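Read literally, the described autoencoder and discriminator could be sketched in PyTorch as follows. Only the layer types and counts come from the text; all channel widths, the encoder/decoder kernel sizes and the latent dimension are illustrative assumptions.

```python
import torch.nn as nn

def conv_in_relu(c_in, c_out, k, s, p):
    """Convolution followed by instance normalization and ReLU, as described in the text."""
    return nn.Sequential(nn.Conv2d(c_in, c_out, k, s, p),
                         nn.InstanceNorm2d(c_out), nn.ReLU(inplace=True))

class ResBlock(nn.Module):
    def __init__(self, c):
        super().__init__()
        self.body = nn.Sequential(conv_in_relu(c, c, 3, 1, 1),
                                  nn.Conv2d(c, c, 3, 1, 1), nn.InstanceNorm2d(c))
    def forward(self, x):
        return x + self.body(x)

# Encoder: three convolution layers followed by 9 residual blocks.
encoder = nn.Sequential(
    conv_in_relu(3, 64, 7, 1, 3),
    conv_in_relu(64, 128, 3, 2, 1),
    conv_in_relu(128, 256, 3, 2, 1),
    *[ResBlock(256) for _ in range(9)],
)

# Decoder: convolution + pixel-shuffle upsampling back to the input resolution.
decoder = nn.Sequential(
    nn.Conv2d(256, 128 * 4, 3, 1, 1), nn.PixelShuffle(2), nn.ReLU(inplace=True),
    nn.Conv2d(128, 64 * 4, 3, 1, 1), nn.PixelShuffle(2), nn.ReLU(inplace=True),
    nn.Conv2d(64, 3, 7, 1, 3),
)

# Discriminator: five 4x4, stride-2 convolutions, each (except the last) followed by ReLU.
discriminator = nn.Sequential(
    nn.Conv2d(256, 64, 4, 2, 1), nn.ReLU(inplace=True),
    nn.Conv2d(64, 128, 4, 2, 1), nn.ReLU(inplace=True),
    nn.Conv2d(128, 256, 4, 2, 1), nn.ReLU(inplace=True),
    nn.Conv2d(256, 512, 4, 2, 1), nn.ReLU(inplace=True),
    nn.Conv2d(512, 1, 4, 2, 1),
)
```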

4 Experiments

In this section, we conduct experiments on the cross-domain scenarios from Virtual KITTI [3] to KITTI [8] and from FlyingThings3D [24] to MPI-Sintel [2] to evaluate the proposed method. We also provide quantitative results to show the effectiveness of our proposed method and study its ablated variants.

4.1 Experimental Setup

We conduct experiments on public optical flow estimation datasets: FlyingThings3D [24], Virtual KITTI [3], MPI-Sintel [2] and KITTI [8]. In the two experiments, MPI-Sintel and KITTI are regarded as the target domains, while FlyingThings3D and Virtual KITTI are regarded as the source domains, respectively. For KITTI, we train on the multi-view extension of the KITTI 2015 [26] dataset, and following [17], we do not train on any data from KITTI 2012 [9] because it does not contain moving objects. We implement our method with PyTorch on two RTX 3090 GPUs. During training, we pretrain on the source domain for 100k iterations without adaptation and use the largest possible mini-batch of 12 source image pairs. Then we train for another 100k iterations of adaptive training on both the labeled source domain and the unlabeled target domain. Each mini-batch consists of 6 source image pairs and 6 target image pairs. We empirically set λ_1 = 0.01 and λ_2 = 0.1, respectively. Following the conventions of the KITTI benchmark, the average end-point error (EPE) and the percentage of erroneous pixels (F1) are used as the evaluation metrics for optical flow estimation. The EPE measures the Euclidean distance between the endpoints of the ground-truth flow and the estimated flow, while F1 measures the percentage of bad pixels averaged over all ground-truth pixels of all images.
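For reference, the two metrics can be computed as below. This is the standard formulation; the 3 px / 5% bad-pixel threshold follows the usual KITTI convention and is an assumption here, since the text does not restate it.

```python
import numpy as np

def epe(flow_pred, flow_gt):
    """Average end-point error: mean Euclidean distance between predicted and GT flow vectors."""
    return np.linalg.norm(flow_pred - flow_gt, axis=-1).mean()

def f1_all(flow_pred, flow_gt, thresh_px=3.0, thresh_rel=0.05):
    """Percentage of 'bad' pixels: error larger than 3 px and larger than 5% of the GT magnitude."""
    err = np.linalg.norm(flow_pred - flow_gt, axis=-1)
    mag = np.linalg.norm(flow_gt, axis=-1)
    bad = (err > thresh_px) & (err > thresh_rel * mag)
    return 100.0 * bad.mean()
```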

4.2 Experiment Results

We compare our method with existing methods on MPI-Sintel and KITTI-2015 optical flow benchmarks. The experimental results are shown in Table 1 and some qualitative comparison results are shown in Fig. 2 and Fig. 3.



Table 1. The performance of different methods on the MPI-Sintel and KITTI 2015 benchmarks. "–" denotes that the result is not reported in the original paper; "()" indicates training on the labeled evaluation set.

| Method | Sintel (Clean) train | Sintel (Clean) test | Sintel (Final) train | Sintel (Final) test | KITTI-15 (train) EPE | KITTI-15 (train) F1-all | KITTI-15 (test) F1-all |
|---|---|---|---|---|---|---|---|
| FlowNetS [6] | 4.05 | 7.42 | 5.45 | 8.43 | – | – | – |
| PWC-Net [30] | 3.33 | – | 4.59 | – | 13.20 | – | – |
| RAFT [31] | 1.43 | – | 2.71 | – | 5.04 | 17.4% | – |
| FlowNetS+ft [6] | (3.66) | 6.96 | (4.44) | 7.76 | – | – | – |
| PWC-Net+ft [30] | (1.70) | 3.86 | (2.21) | 5.13 | (2.16) | – | 9.60% |
| RAFT+ft [31] | (0.77) | 2.08 | (1.20) | 3.41 | (0.64) | (1.5%) | 5.27% |
| ARFlow [19] | (2.79) | 4.78 | (3.87) | 5.89 | 2.85 | – | 11.80% |
| SimFlow [14] | (2.86) | 5.92 | (3.57) | 6.92 | 5.19 | – | 13.38% |
| UFlow [17] | (2.50) | 5.21 | (3.39) | 6.50 | 2.71 | 9.05% | 11.13% |
| UpFlow [23] | (2.33) | 4.68 | (2.67) | 5.32 | 2.45 | 8.90% | 9.38% |
| Ours (Source-Only) | 1.63 | – | 2.70 | – | 2.27 | 8.04% | – |
| Ours | 1.28 | 2.25 | 2.56 | 4.23 | 2.01 | 7.11% | 7.64% |

Fig. 2. Visual comparison of our method with the unsupervised state-of-the-art method UpFlow on the MPI-Sintel benchmark. The images are from the MPI-Sintel benchmark website (the first two rows are Sintel Clean and the last two rows are Sintel Final).

Fig. 3. Visual comparison of our method with UpFlow on the KITTI benchmark. The images are from the KITTI benchmark website.



Table 2. Ablation studies of all the components. To evaluate each component of our method, we conduct experiments on the scenarios from Virtual KITTI to KITTI and from FlyingThings3D to Sintel.

| L_flow | L_rec | L_adv | L_photo | Sintel (train) EPE (Clean) | Sintel (train) EPE (Final) | KITTI-2015 (train) EPE | KITTI-2015 (train) F1-all |
|---|---|---|---|---|---|---|---|
| ✓ | – | – | – | 1.43 | 2.71 | 3.07 | 8.9% |
| ✓ | ✓ | – | – | 1.63 | 2.70 | 2.27 | 8.04% |
| ✓ | – | ✓ | – | 1.38 | 2.65 | 2.34 | 7.88% |
| ✓ | ✓ | ✓ | – | 1.30 | 2.64 | 2.19 | 7.56% |
| ✓ | ✓ | ✓ | ✓ | 1.28 | 2.56 | 2.01 | 7.11% |

Adaptation on FlyingThings3D → MPI-Sintel: To evaluate the effectiveness of our method, we first conduct the experiment on FlyingThings3D → MPI-Sintel, where the two datasets have a large visual mismatch. The experimental results are shown in Table 1. We can observe that the proposed model improves upon the source-only baseline by 0.35 and 0.14 EPE on the training sets of Sintel (Clean) and Sintel (Final), respectively. Besides, our method obtains 1.28 (2.25) and 2.56 (4.23) EPE on the train (test) sets of Sintel (Clean) and Sintel (Final), respectively, which outperforms all the existing unsupervised methods. This indicates that we can improve the performance of the optical flow estimation model on the target domain by introducing a source domain with annotations and transferring the knowledge from the labeled source to the target domain. It is noted that the supervised methods obtain better optical flow, but they require the ground truth optical flow. Adaptation on Synthetic to Real: To further evaluate the effectiveness of our method, we also conduct the experiment on the synthetic-to-real scenario. The results are shown in Table 1. We can observe that our domain adaptive model improves the baseline by 0.26 EPE and 0.93% F1-all score on the training set of KITTI-15. Besides, our method achieves 2.01 EPE and a 7.11% F1-all score on the train set of KITTI-15, and a 7.64% F1-all score on the test set of KITTI-15, which surpasses all the existing unsupervised methods.

4.3 Ablation Study

To study the effectiveness of each component, we conduct ablation experiments by removing individual losses. The experiments are based on the training sets of both the KITTI-2015 and MPI-Sintel datasets. We report the EPE error and the F1-all score to compare the optical flow performance. It is worth noting that we only use frames 1–8 and 13–20 of the multi-view extension of the KITTI-2015 dataset as the target domain during the training phase and evaluate on frames 10/11. The experimental results are shown in Table 2. Taking the results on KITTI-2015 (train) as an example, the model that uses only the source samples to



train the model obtains 3.07 EPE and an 8.9% F1-all score. With the proposed DAAE module, the model achieves 2.19 EPE and a 7.56% F1-all score, which shows the effectiveness of the DAAE module. Besides, equipped with the photometric loss, the performance is further boosted to 2.01 EPE and 7.11%. Similar observations can be made on the Sintel dataset.

5 Conclusion

In this work, we propose a novel approach for cross-domain optical flow estimation, motivated by the high demand in real applications. We design an end-to-end framework based on adversarial learning to address the domain shift problem. Specifically, we design a domain adaptive autoencoder (DAAE) module to transform the source domain and target domain images into a common intermediate domain. We align the distributions of the source and target domains in the latent space with a discriminator, and the optical flow estimation module takes the images in the intermediate domain to predict the optical flow. For source domain samples, we calculate the flow loss using the ground truth optical flow labels, while we use the photometric loss to provide extra supervision for the unlabeled target domain. Extensive experiments prove that our method is effective. Through collaborative learning between the DAAE and the optical flow module, the model can adapt to the unlabeled target domain. Our method outperforms all unsupervised methods and some classic supervised methods. Acknowledgements. This work is supported by the Major Project for New Generation of AI under Grant No. 2018AAA0100400, National Natural Science Foundation of China No. 82121003, and Shenzhen Research Program No. JSGG20210802153537009.

References
1. Behl, A., Hosseini Jafari, O., Karthik Mustikovela, S., Abu Alhaija, H., Rother, C., Geiger, A.: Bounding boxes, segmentations and object coordinates: how important is recognition for 3D scene flow estimation in autonomous driving scenarios? In: Proceedings of the IEEE International Conference on Computer Vision, pp. 2574–2583 (2017)
2. Butler, D.J., Wulff, J., Stanley, G.B., Black, M.J.: A naturalistic open source movie for optical flow evaluation. In: Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C. (eds.) ECCV 2012. LNCS, vol. 7577, pp. 611–625. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-33783-3_44
3. Cabon, Y., Murray, N., Humenberger, M.: Virtual KITTI 2 (2020)
4. Chen, J., Wu, X., Duan, L., Gao, S.: Domain adversarial reinforcement learning for partial domain adaptation. IEEE Trans. Neural Netw. Learn. Syst. (2020)
5. Deng, J., Li, W., Chen, Y., Duan, L.: Unbiased mean teacher for cross-domain object detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 4091–4101 (2021)
6. Dosovitskiy, A., et al.: FlowNet: learning optical flow with convolutional networks. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 2758–2766 (2015)



7. Duan, L., Tsang, I.W., Xu, D.: Domain transfer multiple kernel learning. IEEE Trans. Pattern Anal. Mach. Intell. 34(3), 465–479 (2012)
8. Geiger, A., Lenz, P., Urtasun, R.: Are we ready for autonomous driving? The KITTI vision benchmark suite. IEEE Conf. Comput. Vis. Pattern Recogn. (CVPR) (2011)
9. Geiger, A., Lenz, P., Urtasun, R.: Are we ready for autonomous driving? The KITTI vision benchmark suite. In: 2012 IEEE Conference on Computer Vision and Pattern Recognition, pp. 3354–3361. IEEE (2012)
10. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016)
11. Hoffman, J., et al.: CyCADA: cycle-consistent adversarial domain adaptation. In: International Conference on Machine Learning, pp. 1989–1998. PMLR (2018)
12. Hsu, H.K., et al.: Progressive domain adaptation for object detection. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 749–757 (2020)
13. Ilg, E., Mayer, N., Saikia, T., Keuper, M., Dosovitskiy, A., Brox, T.: FlowNet 2.0: evolution of optical flow estimation with deep networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2462–2470 (2017)
14. Im, W., Kim, T.-K., Yoon, S.-E.: Unsupervised learning of optical flow with deep feature similarity. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12369, pp. 172–188. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58586-0_11
15. Jiang, H., Sun, D., Jampani, V., Yang, M.H., Learned-Miller, E., Kautz, J.: Super SloMo: high quality estimation of multiple intermediate frames for video interpolation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 9000–9008 (2018)
16. Jiang, S., Campbell, D., Lu, Y., Li, H., Hartley, R.: Learning to estimate hidden motions with global motion aggregation. arXiv preprint arXiv:2104.02409 (2021)
17. Jonschkowski, R., Stone, A., Barron, J.T., Gordon, A., Konolige, K., Angelova, A.: What matters in unsupervised optical flow. arXiv preprint arXiv:2006.04902 (2020)
18. Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. Adv. Neural Inf. Process. Syst. 25, 1097–1105 (2012)
19. Liu, L., et al.: Learning by analogy: reliable supervision from transformations for unsupervised optical flow estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 6489–6498 (2020)
20. Liu, P., King, I., Lyu, M.R., Xu, J.: DDFlow: learning optical flow with unlabeled data distillation. In: Proceedings of the AAAI Conference on Artificial Intelligence, pp. 8770–8777 (2019)
21. Liu, P., Lyu, M., King, I., Xu, J.: SelFlow: self-supervised learning of optical flow. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 4571–4580 (2019)
22. Liu, Y., Deng, J., Gao, X., Li, W., Duan, L.: BAPA-Net: boundary adaptation and prototype alignment for cross-domain semantic segmentation. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 8801–8811 (2021)
23. Luo, K., Wang, C., Liu, S., Fan, H., Wang, J., Sun, J.: UPFlow: upsampling pyramid for unsupervised optical flow learning. arXiv preprint arXiv:2012.00212 (2020)



24. Mayer, N., et al.: A large dataset to train convolutional networks for disparity, optical flow, and scene flow estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4040–4048 (2016)
25. Meister, S., Hur, J., Roth, S.: UnFlow: unsupervised learning of optical flow with a bidirectional census loss. In: Proceedings of the AAAI Conference on Artificial Intelligence (2018)
26. Menze, M., Geiger, A.: Object scene flow for autonomous vehicles. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3061–3070 (2015)
27. Poggi, M., Aleotti, F., Mattoccia, S.: Sensor-guided optical flow. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 7908–7918 (2021)
28. Ren, Z., Yan, J., Ni, B., Liu, B., Yang, X., Zha, H.: Unsupervised deep learning for optical flow estimation. In: Proceedings of the AAAI Conference on Artificial Intelligence (2017)
29. Simonyan, K., Zisserman, A.: Two-stream convolutional networks for action recognition in videos. arXiv preprint arXiv:1406.2199 (2014)
30. Sun, D., Yang, X., Liu, M.Y., Kautz, J.: PWC-Net: CNNs for optical flow using pyramid, warping, and cost volume. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 8934–8943 (2018)
31. Teed, Z., Deng, J.: RAFT: recurrent all-pairs field transforms for optical flow. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12347, pp. 402–419. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58536-5_24
32. Tsai, Y.H., Hung, W.C., Schulter, S., Sohn, K., Yang, M.H., Chandraker, M.: Learning to adapt structured output space for semantic segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7472–7481 (2018)
33. Wang, Y., Yang, Y., Yang, Z., Zhao, L., Wang, P., Xu, W.: Occlusion aware unsupervised learning of optical flow. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4884–4893 (2018)
34. Wang, Z., Du, B., Shi, Q., Tu, W.: Domain adaptation with discriminative distribution and manifold embedding for hyperspectral image classification. IEEE Geosci. Remote Sens. Lett. 16(7), 1155–1159 (2019)
35. Wang, Z., Bovik, A.C., Sheikh, H.R., Simoncelli, E.P.: Image quality assessment: from error visibility to structural similarity. IEEE Trans. Image Process. 13(4), 600–612 (2004)
36. Zhang, F., Woodford, O.J., Prisacariu, V.A., Torr, P.H.: Separable flow: learning motion cost volumes for optical flow estimation. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 10807–10817 (2021)



37. Zhao, S., Li, B., Xu, P., Keutzer, K.: Multi-source domain adaptation in the deep learning era: a systematic survey. arXiv preprint arXiv:2002.12169 (2020)
38. Zhu, J.Y., Park, T., Isola, P., Efros, A.A.: Unpaired image-to-image translation using cycle-consistent adversarial networks. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 2223–2232 (2017)

Continuous Exploration via Multiple Perspectives in Sparse Reward Environment

Zhongpeng Chen(1,2) and Qiang Guan(2)(B)

(1) School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing 100049, China
(2) Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China
{chenzhongpeng2021,qiang.guan}@ia.ac.cn

Abstract. Exploration is a major challenge in deep reinforcement learning, especially in cases where the reward is sparse. Simple random exploration strategies, such as ε-greedy, struggle to solve the hard exploration problem in sparse reward environments. A more effective approach is to use an exploration strategy based on intrinsic motivation, where the key is to design a reasonable and effective intrinsic reward to drive the agent to explore. This paper proposes a method called CEMP, which drives the agent to explore more effectively and continuously in sparse reward environments. CEMP contributes a new framework for designing intrinsic reward from multiple perspectives and can be easily integrated into various existing reinforcement learning algorithms. In addition, experimental results in a series of complex and sparse reward environments in MiniGrid demonstrate that our proposed CEMP method achieves better final performance and faster learning efficiency than ICM, RIDE, and TRPO-AE-Hash, which only calculate intrinsic reward from a single perspective.

Keywords: Reinforcement Learning · Exploration Strategy · Sparse Reward · Intrinsic Motivation

1 Introduction

The goal of reinforcement learning is to train an effective policy, which drives the agent to obtain the maximum cumulative reward in the environment. The generation of the agent's policy primarily depends on the reward provided by the environment. Therefore, whether the agent can learn an effective policy is closely related to its interaction with the environment in order to capture rewards. Classical algorithms like DQN [1] and PPO [2] have demonstrated outstanding performance in dense reward environments by using simple exploration strategies such as ε-greedy, entropy regularization, and Boltzmann exploration. However, in sparse reward environments, these simple exploration strategies face difficulties in guiding the agent to reach the target state and obtain effective reward.



The agent will be unable to update its policy due to the long-term inability to receive effective reward. Hence, this paper concentrates on designing effective exploration strategies to help the agent better resolve the hard exploration problem in sparse reward environments. In a sparse reward environment, the agent needs to continuously complete a series of correct decisions in order to obtain a few effective rewards. However, with a simple random exploration strategy it is difficult for the agent to discover a trajectory that completes the task. Exploration strategies based on intrinsic motivation have seen significant development in recent years and can effectively address the exploration challenges faced by the agent in sparse reward environments. Intrinsic motivation comes from concepts in ethology and psychology [3]. During the learning process, higher organisms often spontaneously explore the unfamiliar and unknown environment without extrinsic stimuli to enhance their ability to survive in the environment. Exploration strategies based on intrinsic motivation formalize the concept of intrinsic motivation as an intrinsic reward by measuring the novelty of a state, thereby utilizing the intrinsic reward to drive the agent to spontaneously explore more of the unknown space in the sparse reward environment and increase the likelihood of solving the task. The novelty of a state is related to the number of times the agent has visited the state. Currently, there are two main perspectives for measuring the novelty of a state: the global perspective and the local perspective, as explained below.

Global Perspective: Use all historical samples collected by the agent from the environment to measure the novelty of a state.

Local Perspective: Evaluate the novelty of a state using only the samples collected in the current episode in which the agent is interacting with the environment, without considering historical samples.

The drawback of measuring the novelty of a state from the global perspective is that the novelty of a state and its corresponding intrinsic reward gradually decay during the training process. The decaying intrinsic reward cannot drive the agent to explore continuously in the environment. At the same time, measuring the novelty of a state from the global perspective alone is not conducive to the agent discovering more novel trajectories: when a novel trajectory appears, it will be diluted by the ordinary trajectories in the history. Measuring the novelty of a state from the local perspective enables the agent to discover more novel trajectories and access more unknown states within the current episode. However, in the absence of global information, relying solely on the local perspective to measure the novelty of a state may blindly and optimistically encourage the agent to explore unknown states. Taking into account the characteristics of calculating intrinsic reward from different perspectives, we propose a method called CEMP (Continuous Exploration via Multiple Perspectives) that enables the agent to explore more effectively in the sparse reward environment. CEMP calculates intrinsic reward from both the global and local perspectives and integrates them to better drive the agent's exploration in the sparse reward environment. To summarize, the main contributions of our work are as follows:



1. We propose a method called CEMP that enables the agent to explore better in the sparse reward environment. This method combines intrinsic rewards calculated from different perspectives to obtain a new intrinsic reward, which does not decay gradually during the training process and can continuously drive the agent's exploration in the sparse reward environment. Moreover, our proposed CEMP method can be easily integrated into various existing reinforcement learning algorithms.
2. Most works that calculate intrinsic reward from the local perspective only measure the novelty of a state within the current episode. In contrast, our proposed CEMP method can measure the novelty of a state from the local perspective across multiple episodes and can flexibly control the range of the local perspective through a parameter.
3. Experimental results in a series of complex and sparse reward environments in MiniGrid show that our proposed CEMP method achieves better final performance and faster learning efficiency compared to methods such as ICM, RIDE, and TRPO-AE-Hash that only calculate intrinsic reward from a single perspective.

2 Related Works

There are two main types of methods for calculating intrinsic reward from the global perspective: state-count methods and prediction-error methods. State-count methods extend the UCB exploration strategy to high-dimensional state environments through pseudo-count or indirect count methods. DQN-PixelCNN [4] indirectly derives the pseudo-count of a state from a density probability model estimated on the raw state space, but it is difficult to directly build such a density model in the raw state space. To solve this issue, φ-EB [5] models a density probability model in a low-dimensional feature space of the raw state. TRPO-AE-Hash [6] discretizes the raw state into a low-dimensional hash code using SimHash [7] and then computes the pseudo-count of a state based on its hash code. DORA [8] constructs an indirect count index, the E-value, which can serve as an effective generalized counter of (s_t, s_{t+1}). RND [9] uses the prediction error between a predictor network and a target network as an indirect measure of state counts. Prediction-error methods use the prediction error between the next predicted state ŝ_{t+1} output by a forward model and the ground truth s_{t+1} as the intrinsic reward. The key aspect of prediction-error methods is how to construct the forward model. Dynamic-AE [10] reconstructs the raw state using an autoencoder and constructs a forward model on the low-dimensional feature space extracted by the middle layer of the autoencoder. In fact, using the features extracted by the middle layer of an autoencoder to train a forward model is highly susceptible to environmental noise. ICM [11] learns features of the raw state by utilizing an inverse model, which predicts the action a_t from the states (s_t, s_{t+1}), and extracts the intermediate layer output as low-dimensional features to construct a forward model. The inverse model only extracts features from the raw



state that is related to the agent’s actions, which partially reduces the influence of noise from the raw state. Disagreement [12] trains multiple forward models based on random feature subspace and uses the variance of the predicted values of these multiple forward models as intrinsic reward. The methods introduced above measure the novelty of a state from global perspective, while the methods below measure the novelty of a state from local perspective. DeepCS [13] incentivizes the agent to explore as many unknown states as possible within the same episode by setting intrinsic reward to 1 for unreachable states and 0 for visited states respectively. ECO [14] measures the novelty of a state by the reachability between states in the same episode. RIDE [15] directly uses the difference between two consecutive states as intrinsic reward.

3 Method

3.1 Continuous Exploration via Multiple Perspectives

Fig. 1. Our proposed CEMP method: the design details of the Global Reward Model and the Local Reward Model can be found in Sect. 3.2 and 3.3, respectively.

As illustrated in Fig. 1, our proposed CEMP (Continuous Exploration via Multiple Perspectives) method calculates intrinsic reward from both the global and the local perspective, and combines the rewards from both perspectives by multiplication to obtain the new intrinsic reward $r_t^i$, as shown in Eq. (1). Next, the final reward that guides the update of the agent's policy is calculated by linearly weighting the new intrinsic reward $r_t^i$ and the extrinsic reward $r_t^e$ provided by the environment according to Eq. (2), where β is the weighting coefficient between the two rewards.

$$r_t^i = r_t^{global} \times r_t^{local} \tag{1}$$

$$r_t = r_t^e + \beta r_t^i \tag{2}$$
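As a minimal illustration of Eqs. (1)–(2), the snippet below combines the two intrinsic rewards and the extrinsic reward for one time step; the function and variable names (combined_reward, r_ext, beta) are ours, not from the original implementation.

```python
def combined_reward(r_ext, r_global, r_local, beta):
    """Fuse intrinsic rewards from both perspectives (Eq. 1) and
    mix the result with the extrinsic reward (Eq. 2)."""
    r_int = r_global * r_local      # Eq. (1): multiplicative fusion
    return r_ext + beta * r_int     # Eq. (2): final training reward
```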

We will use the prediction-error-based method to design the global intrinsic reward, and the state-count method based on hash discretization to design the local intrinsic reward. The specific reasons for these choices are as follows:


1. The prediction-error-based method evaluates the novelty of a state from the global perspective using the prediction error of a forward model implemented with a deep neural network, which is well suited to processing large-scale data and has strong discriminative power when facing large-scale state spaces; thus, it is suitable for evaluating the novelty of a state from the global perspective. However, it is not suitable for evaluating the novelty of a state from the local perspective, because a forward model trained on a small dataset may have difficulty learning effective knowledge and distinguishing the novelty of a state due to insufficient data.
2. The state-count method based on hash discretization can quickly and accurately count different states in a small dataset by using the hash code of the raw state. However, in the case of large-scale states, it may not be able to distinguish different states due to the limited number of hash code bits.

Fig. 2. The design details of the Global Reward Model and the Local Reward Model.

3.2 Global Reward Model

We first use the inverse model adopted in the ICM [11] method to extract a low-dimensional feature of the raw state, as shown in the green box in Fig. 2. Extracting a low-dimensional feature through the inverse model can capture the feature information related to the agent's actions as much as possible and reduce the impact of noise in the raw state. As shown in the red box in Fig. 2, we then use all the samples collected by the agent in the environment to train n forward models in the inverse feature space φ. The variance of the predicted values from these n forward models is then used as the intrinsic reward under the global perspective, as shown in Eq. (3). The ensemble of these n forward models can further reduce the influence of state noise.

$$r_t^{global} = \mathrm{variance}\big(\hat{\phi}_1(s_{t+1}), \hat{\phi}_2(s_{t+1}), \ldots, \hat{\phi}_n(s_{t+1})\big) \tag{3}$$
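The following sketch illustrates one way to realize this ensemble-based global reward in PyTorch, assuming the inverse feature of the current state has already been computed. Layer sizes loosely follow Sect. 4.2 and Table 2, while the class name, the default dimensions, and the averaging over feature dimensions are our own assumptions rather than details given in the paper.

```python
import torch
import torch.nn as nn

class GlobalRewardModel(nn.Module):
    """Ensemble of n forward models over the inverse feature space;
    their disagreement (variance) is the global intrinsic reward (Eq. 3)."""

    def __init__(self, feat_dim=128, act_dim=7, n_models=5, hidden=512):
        super().__init__()
        self.forward_models = nn.ModuleList([
            nn.Sequential(nn.Linear(feat_dim + act_dim, hidden), nn.ReLU(),
                          nn.Linear(hidden, feat_dim))
            for _ in range(n_models)
        ])

    def forward(self, phi_t, action_onehot):
        x = torch.cat([phi_t, action_onehot], dim=-1)
        preds = torch.stack([m(x) for m in self.forward_models])  # (n, B, feat_dim)
        # variance across the n predictions, averaged over feature dimensions
        return preds.var(dim=0).mean(dim=-1)                      # (B,)
```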

3.3 Local Reward Model

As shown in the blue box in Fig. 2, we use the feature extraction network of the inverse model to extract the inverse feature $\phi(s_{t+1})$ of the raw state $s_{t+1}$, and then use the SimHash method to discretize $\phi(s_{t+1})$ into a k-dimensional binary hash code $\varphi(s_{t+1})$, as shown in Eq. (4), where $M$ is a $k \times d$ matrix sampled from the standard normal distribution and $d$ is the dimension of $\phi(s_{t+1})$. Next, we count the state based on its hash code $\varphi(s_{t+1})$. In the counting process, we only count the number of visits to each state within $N$ consecutive states, and reset the previous count information when the agent enters the next $N$ consecutive states. Equation (5) provides the relationship between the count of a state and the corresponding intrinsic reward under the local perspective. The local intrinsic reward $r_t^{local}$ incentivizes the agent to frequently explore more novel states within the current $N$ consecutive states, which is conducive to the agent discovering more novel trajectories.

$$\varphi(s_{t+1}) = \mathrm{sign}\big(M\phi(s_{t+1})\big) \in \{-1, 1\}^k \tag{4}$$

$$r_t^{local} = \frac{1}{\sqrt{n\big(\varphi(s_{t+1})\big)}} \tag{5}$$
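A compact sketch of this hash-based local counter is given below. The class name and constructor arguments are illustrative; the defaults for k and N mirror Table 2, and the reset-every-N-steps behavior follows the description above.

```python
import numpy as np
from collections import Counter

class LocalRewardModel:
    """SimHash discretization of the inverse feature (Eq. 4) with
    counts that are reset every N steps; reward follows Eq. (5)."""

    def __init__(self, feat_dim=128, k=16, N=3200, seed=0):
        rng = np.random.default_rng(seed)
        self.M = rng.standard_normal((k, feat_dim))  # fixed projection matrix
        self.N = N
        self.counts = Counter()
        self.steps = 0

    def reward(self, phi_next):
        code = tuple(np.sign(self.M @ phi_next).astype(int))  # k-bit hash code
        self.counts[code] += 1
        self.steps += 1
        r_local = 1.0 / np.sqrt(self.counts[code])            # Eq. (5)
        if self.steps % self.N == 0:                          # reset local window
            self.counts.clear()
        return r_local
```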

Most existing works that calculate intrinsic reward from local perspective only measure the novelty of a state within the same episode, while our proposed CEMP method measures the novelty of a state over N consecutive states, where N is an adjustable parameter. By using a larger N , samples will come from several different episodes. Therefore, the CEMP method provides more flexibility in controlling the range of the local perspective through parameter N . The intrinsic reward calculated from local perspective does not gradually decay during the training process, providing the agent with a more continuous and stable exploration signal. This can compensate for the drawback of intrinsic reward calculated from global perspective that gradually decays over time, while also improving the agent’s ability to discover more novel trajectories. Our proposed CEMP method can be easily integrated into any reinforcement learning algorithm to enhance the agent’s exploration ability in the sparse reward environment.

4 Experiment

4.1 Comparison Algorithms and Evaluation Metrics

We compared our proposed CEMP method with three intrinsic motivation-based exploration strategies: ICM [11], TRPO-AE-Hash [6], and RIDE [15]. ICM and TRPO-AE-Hash calculate intrinsic reward from the global perspective, while RIDE calculates intrinsic reward from the local perspective. Different reinforcement learning algorithms are used as baseline algorithms in the original papers of ICM, TRPO-AE-Hash, and RIDE. In order to ensure fairness in the comparison experiments, we use PPO as the baseline algorithm to reproduce the three exploration strategies and compare their performance with our proposed CEMP method. The CEMP method also uses PPO as the baseline algorithm.

We use the mean and standard deviation of the average reward over five different seeds as metrics to compare the performance of different exploration strategies. The mean of the average reward reflects the best performance that each exploration strategy can achieve, while the standard deviation of the average reward reflects the stability of the exploration strategy.

4.2 Network Architectures and Hyperparameters

PPO. In the PPO algorithm we reproduced, the Critic and Actor share a feature extraction network, which consists of four linear layers with 256, 128, 64, and 64 nodes, respectively. The Critic and Actor each consist of one linear layer with 64 nodes. The Critic and Actor take the output of the feature extraction network as input and output the value estimate and the action probability distribution for the current state $s_t$, respectively. Other hyperparameters of the PPO algorithm we reproduced are shown in Table 1.

Table 1. Other hyperparameters of the PPO algorithm we reproduced.

| Hyperparameter Name | Value |
| learning rate | 0.0003 |
| value loss weight | 0.5 |
| entropy loss weight | 0.001 |
| discount factor | 0.99 |
| λ, generalized advantage estimation | 0.95 |
| activation function | ReLU |
| optimizer | Adam |
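For reference, a minimal sketch of the shared-backbone Actor-Critic described above is shown below. We read "one linear layer with 64 nodes" as a 64-unit layer in each head followed by an output projection; this reading, along with the class and argument names, is our own assumption.

```python
import torch.nn as nn

class ActorCritic(nn.Module):
    """Shared feature extractor (256-128-64-64) with separate Actor and
    Critic heads, roughly following the description in Sect. 4.2."""

    def __init__(self, obs_dim, n_actions):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Linear(obs_dim, 256), nn.ReLU(),
            nn.Linear(256, 128), nn.ReLU(),
            nn.Linear(128, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
        )
        self.actor = nn.Sequential(nn.Linear(64, 64), nn.ReLU(),
                                   nn.Linear(64, n_actions))  # action logits
        self.critic = nn.Sequential(nn.Linear(64, 64), nn.ReLU(),
                                    nn.Linear(64, 1))         # state value

    def forward(self, obs):
        h = self.backbone(obs)
        return self.actor(h), self.critic(h)
```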

CEMP. As shown in Fig. 2, in our proposed CEMP method, the feature extraction network in the inverse model consists of three linear layers with 64, 64, and 128 nodes, respectively. The network used to predict actions in the inverse model consists of one linear layer with 512 nodes. All forward models consist of two linear layers with 512 nodes. Other hyperparameters of the CEMP method are summarized in Table 2.

Table 2. Other hyperparameters of our proposed CEMP method.

| Hyperparameter Name | Value |
| n, number of forward models | 5 |
| step_max, max steps of one episode (determined by the environment) | / |
| β, weighting coefficient of intrinsic and extrinsic reward | 10/step_max |
| k, the dimension of the hash code | 16 |
| N, adjusts the range of the local perspective | 3200 |
| m, batch size | 320 |
| learning rate | 0.0003 |
| T_max, max steps of training | 3 × 10^6 |
| activation function | ReLU |
| optimizer | Adam |

ICM, RIDE, and TRPO-AE-Hash. ICM [11] trains a forward model based on the inverse feature space φ extracted by the inverse model. The difference between ICM and the Global Reward Model in CEMP lies in the number of forward models and the way the intrinsic reward is calculated. ICM trains only one forward model and uses its prediction error as the intrinsic reward, as shown in Eq. (6), while the Global Reward Model in CEMP trains n forward models and uses the variance of the predicted values from the n forward models as the intrinsic reward.

$$r_t^i = \big\|\phi(s_{t+1}) - f\big(\phi(s_t), a_t\big)\big\|_2^2 \tag{6}$$

RIDE [15] directly uses the difference between the inverse features of two consecutive states as the intrinsic reward, as shown in Eq. (7). Here, φ represents the feature extraction network in the inverse model, and $n(\varphi(s_{t+1}))$ is the pseudo-count of $s_{t+1}$ in the current episode. The calculation of $\varphi(s_{t+1})$ is consistent with the Local Reward Model used in CEMP.

$$r_t^i = \frac{\|\phi(s_{t+1}) - \phi(s_t)\|_2}{\sqrt{n\big(\varphi(s_{t+1})\big)}} \tag{7}$$
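For comparison, the two baseline intrinsic rewards in Eqs. (6) and (7) can be written compactly as follows; phi_* denotes inverse features and episodic_count the per-episode SimHash count, both assumed to be computed as in the sketches above.

```python
import torch

def icm_reward(phi_next, phi_next_pred):
    """ICM (Eq. 6): squared prediction error of a single forward model."""
    return ((phi_next - phi_next_pred) ** 2).sum(dim=-1)

def ride_reward(phi_t, phi_next, episodic_count):
    """RIDE (Eq. 7): feature change discounted by the episodic pseudo-count."""
    impact = torch.norm(phi_next - phi_t, dim=-1)
    return impact / episodic_count.clamp(min=1).sqrt()
```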

TRPO-AE-Hash [6] calculates the intrinsic reward in the same way as the Local Reward Model in the CEMP method. The only difference is that TRPO-AE-Hash omits the step of periodically resetting the count of a state, so it calculates the intrinsic reward from the global perspective. To ensure a fair comparison, all network architectures and hyperparameters used in the ICM, RIDE, and TRPO-AE-Hash methods that we reproduced are kept consistent with those used in our proposed CEMP method.

4.3 Experimental Results

Fig. 3. Performance of each method in different sparse reward environments.

Comparison with Other Methods. We select 9 sparse reward environments from MiniGrid for comparative experiments, which cover a total of 5 different tasks. Among the 5 tasks, DoorKey and LavaCrossing each have 3 environments of gradually increasing difficulty (difficulty ranking: DoorKey-5x5 < DoorKey-8x8 < DoorKey-16x16; LavaCrossingS9N1 < LavaCrossingS9N3 < LavaCrossingS11N5). The learning curves of different methods are shown in Fig. 3, from which we can draw the following conclusions:
1. In some environments with lower difficulty levels, such as LavaCrossingS9N1 and DoorKey-5x5, ICM, RIDE, and TRPO-AE-Hash can achieve the same final performance as CEMP. However, our proposed CEMP method has higher learning efficiency and more stable performance.
2. In the DoorKey and LavaCrossing tasks, as the difficulty increases, the performance of ICM, RIDE, and TRPO-AE-Hash gradually deteriorates, but our proposed CEMP method can still maintain good performance.
3. In more challenging environments, such as MultiRoom-N4-S5 and DoorKey-16x16, ICM, RIDE, and TRPO-AE-Hash are unable to complete the task, but our proposed CEMP method can still perform well and obtain a high average reward.

We evaluate the final policies of the agent obtained by the various methods, and the evaluation results are summarized in Table 3. The evaluation metrics in the table are the mean and standard deviation of the rewards obtained by each method over five different seeds. It can be seen from Table 3 that our proposed CEMP method performs significantly better and is more stable than ICM, RIDE, and TRPO-AE-Hash, which calculate intrinsic reward from only a single perspective.

Analysis of Local and Global Intrinsic Reward. From Fig. 4, we can see that as training progresses, the global intrinsic reward gradually decays, while the local intrinsic reward remains in a relatively stable range and does not decay during the training process. Thus, the local intrinsic reward can drive the agent to continuously explore in the sparse reward environment, discover more novel states and trajectories, and compensate for the disadvantage of the global intrinsic reward, which gradually decays and cannot consistently drive the agent to explore.

Table 3. Comprehensive performance of each method in different environments.

| mean ± std | PPO | ICM | RIDE | TRPO-AE-Hash | CEMP (ours) |
| KeyCorridorS3R3 | 0 ± 0 | 0.183 ± 0.365 | 0 ± 0 | 0 ± 0 | 0.733 ± 0.367 |
| UnlockPickup | 0 ± 0 | 0.947 ± 0.006 | 0.754 ± 0.377 | 0 ± 0 | 0.949 ± 0.004 |
| MultiRoom-N4-S5 | 0 ± 0 | 0 ± 0 | 0 ± 0 | 0 ± 0 | 0.594 ± 0.298 |
| DoorKey-5x5 | 0.962 ± 0.005 | 0.964 ± 0.005 | 0.964 ± 0.005 | 0.964 ± 0.005 | 0.967 ± 0.003 |
| DoorKey-8x8 | 0.977 ± 0.002 | 0 ± 0 | 0.390 ± 0.477 | 0.977 ± 0.002 | 0.977 ± 0.002 |

DoorKey-16x16: 0 ± 0, 0 ± 0, 0 ± 0, 0 ± 0
LavaCrossingS9N1: 0.954 ± 0.008, 0.766 ± 0.383, 0.958 ± 0.002, 0.950 ± 0.014
LavaCrossingS9N3: 0.380 ± 0.466, 0.946 ± 0.013, 0.378 ± 0.463, 0.946 ± 0.011, 0.946 ± 0.015, 0.762 ± 0.381, 0.192 ± 0.384, 0.961 ± 0.005
LavaCrossingS11N5: 0 ± 0, 0.193 ± 0.387, 0.395 ± 0.484, 0.951 ± 0.002

Fig. 4. The intrinsic reward calculated from both global and local perspectives in 3 environments: LavaCrossingS11N5, DoorKey-16x16, and MultiRoom-N4-S5.

Fig. 5. Without extrinsic reward, only the intrinsic reward is used.

Learning with only Intrinsic Reward. To test whether the intrinsic rewards designed from the local and global perspectives in our proposed CEMP method are effective, we guide the agent's learning using only the local intrinsic reward, only the global intrinsic reward, or the fusion of both. Based on the results shown in Fig. 5, we can conclude that our proposed CEMP method can achieve a certain level of performance in the sparse reward environment using only self-generated intrinsic reward. Moreover, the fusion of the intrinsic rewards from the local and global perspectives achieves better performance than a single intrinsic reward from either the global or the local perspective.

Fig. 6. The ablation experiment on each component of the intrinsic reward.

Ablation Experiment of the Intrinsic Reward. We conducted an ablation experiment on each component of the intrinsic reward in our proposed CEMP method. From Fig. 6, we can draw the following conclusions:
1. In all environments, the intrinsic reward that integrates the global and local intrinsic rewards performs better than using only an intrinsic reward calculated from a single perspective.
2. Using only the global or only the local intrinsic reward can also achieve good performance in some environments.

5 Conclusions and Future Work

This paper proposed a method called CEMP that enables the agent to explore better in the sparse reward environment. The CEMP method contributes a new framework for designing the intrinsic reward from multiple perspectives and can be easily integrated into various existing reinforcement learning algorithms. The experimental results in a series of complex sparse reward environments in MiniGrid demonstrate that our proposed CEMP method can achieve better final performance and faster learning efficiency than algorithms such as ICM [11], RIDE [15], and TRPO-AE-Hash [6] that calculate the intrinsic reward from only a single perspective.

Acknowledgments. This work was supported by the National Defense Science and Technology Foundation Reinforcement Program and the Strategic Priority Research Program of the Chinese Academy of Sciences, Grant No. XDA27041001.


References
1. Mnih, V., et al.: Human-level control through deep reinforcement learning. Nature 518(7540), 529–533 (2015)
2. Schulman, J., Wolski, F., Dhariwal, P., Radford, A., Klimov, O.: Proximal policy optimization algorithms. arXiv preprint arXiv:1707.06347 (2017)
3. Deci, E.L., Ryan, R.M.: Intrinsic Motivation and Self-determination in Human Behavior. Springer, Cham (2013). https://doi.org/10.1007/978-1-4899-2271-7
4. Ostrovski, G., Bellemare, M.G., Oord, A., Munos, R.: Count-based exploration with neural density models. In: International Conference on Machine Learning, pp. 2721–2730. PMLR (2017)
5. Martin, J., Sasikumar, S.N., Everitt, T., Hutter, M.: Count-based exploration in feature space for reinforcement learning. arXiv preprint arXiv:1706.08090 (2017)
6. Tang, H., et al.: #Exploration: a study of count-based exploration for deep reinforcement learning. In: Advances in Neural Information Processing Systems, vol. 30 (2017)
7. Charikar, M.S.: Similarity estimation techniques from rounding algorithms. In: Proceedings of the Thirty-Fourth Annual ACM Symposium on Theory of Computing, pp. 380–388 (2002)
8. Choshen, L., Fox, L., Loewenstein, Y.: DORA the explorer: directed outreaching reinforcement action-selection. arXiv preprint arXiv:1804.04012 (2018)
9. Burda, Y., Edwards, H., Storkey, A., Klimov, O.: Exploration by random network distillation. arXiv preprint arXiv:1810.12894 (2018)
10. Stadie, B.C., Levine, S., Abbeel, P.: Incentivizing exploration in reinforcement learning with deep predictive models. arXiv preprint arXiv:1507.00814 (2015)
11. Pathak, D., Agrawal, P., Efros, A.A., Darrell, T.: Curiosity-driven exploration by self-supervised prediction. In: International Conference on Machine Learning, pp. 2778–2787. PMLR (2017)
12. Pathak, D., Gandhi, D., Gupta, A.: Self-supervised exploration via disagreement. In: International Conference on Machine Learning, pp. 5062–5071. PMLR (2019)
13. Stanton, C., Clune, J.: Deep curiosity search: intra-life exploration improves performance on challenging deep reinforcement learning problems. CoRR abs/1806.00553 (2018)
14. Savinov, N., et al.: Episodic curiosity through reachability. arXiv preprint arXiv:1810.02274 (2018)
15. Raileanu, R., Rocktäschel, T.: RIDE: rewarding impact-driven exploration for procedurally-generated environments. arXiv preprint arXiv:2002.12292 (2020)

Network Transplanting for the Functionally Modular Architecture

Quanshi Zhang1(B), Xu Cheng1, Xin Wang1, Yu Yang2, and Yingnian Wu2

1 Shanghai Jiao Tong University, Shanghai, China
{zqs1022,xcheng8}@sjtu.edu.cn
2 University of California, Los Angeles, USA

Abstract. This paper focuses on the problem of transplanting categoryand-task-specific neural networks to a generic, modular network without strong supervision. Unlike traditional deep neural networks (DNNs) with black-box representations, we design a functionally modular network architecture, which divides the entire DNN into several functionally meaningful modules. Like building LEGO blocks, we can teach the proposed DNN a new object category by directly transplanting the module corresponding to the object category from another DNN, with a few or even without sample annotations. Our method incrementally adds new categories to the DNN, which do not affect representations of existing categories. Such a strategy of incremental network transplanting can avoid the typical catastrophic-forgetting problem in continual learning. We further develop a back distillation method to overcome challenges of model optimization in network transplanting. In experiments, our method with much fewer training samples outperformed baselines.

Keywords: Deep Learning · Network transplant

1 Introduction

Deep neural networks (DNNs) consisting of functionally meaningful modules have shown promise in various tasks. Such a functionally meaningful network architecture usually exhibits advantages in both learning and application. Different from traditional approaches that learn DNNs from raw data, this paper proposes a new learning strategy, i.e., network transplanting, to learn a DNN with functionally meaningful modules. Specifically, given an off-the-shelf DNN oriented to a specific task or a specific category, we transplant specific modules of this off-the-shelf DNN to a new DNN (namely a transplant net), in order to enable the new DNN to accomplish a new task (e.g. classification or segmentation) or to deal with a new object category.

Q. Zhang and X. Cheng—Contributed equally to this paper.


Fig. 1. A transplant net with a functionally modular architecture. We propose a theoretical solution to incrementally merging category modules from teacher nets into a transplant (student) net with a few or without sample annotations. (left) During the inference, people can manually activate certain linkages (red) and deactivate others to conduct the task Tn on category C1. (middle) We show typical cases of network transplanting. Blue ellipses indicate teacher nets used for the transplant. (right) Flowchart for network transplanting. (Color figure online)

To this end, the transplant net consists of three types of modules, including category modules, task modules, and adapters, as shown in Fig. 1 (left). Specifically, each category module extracts general features for a specific category (e.g., the dog), and each task module is learned for a certain task (e.g., classification or segmentation). That is, the output feature of a specific category module is used for various tasks, while a task module is universal for all categories. The adapter projects the output feature of the category module into the input space of the task module, in order to enable the task module to deal with a new category. For example, for the segmentation of a dog, an adapter is learned to project output features of the dog-category module to the input space of the segmentation-task module, as shown in Fig. 2 (right).

Learning Adapters for Network Transplanting. We first follow the traditional scenario of multi-task and multi-category learning to train an initial transplant net for very few tasks and categories. We further train this initial transplant net for new categories or new tasks by gradually absorbing new modules, as shown in Fig. 1. Specifically, given an initial transplant net, its task module gS is trained to handle a specific task for many categories. When dealing with a new category, we collect another DNN (namely a teacher net) pre-trained for the same task on a new category, as shown in Fig. 2 (left). This teacher net has a category module f, which extracts generic features for the new category. Thus, in order to enable the task module gS to deal with the new category, we transplant the new category module f to the transplant net by learning an adapter to project the output feature of f into the input space of gS. In this way, we may (or may not) have a few training annotations of the new category for the task to guide the transplanting. Besides, learning a small adapter to connect the new category module f to the task module gS without fine-tuning f and gS preserves the generality of f and gS, which helps avoid catastrophic forgetting.


However, learning adapters while fixing category modules and task modules raises specific challenges for the optimization of DNNs. To overcome these challenges, we propose a new algorithm, namely back distillation. In experiments, our back distillation method with much fewer training samples outperformed baseline methods.

Fig. 2. Overview. (left) Given a teacher net and a transplant net (student net), the teacher net has a category module f for a single or multiple tasks, and the student net contains a task module gS for other categories. To enable the task module gS to deal with the new category, we transplant f to the student net by learning an adapter h via distillation. The blue ellipses show that our method distills knowledge from the task module gT of the teacher net to the h ◦ gS modules of the student net. Three red curves show directions of forward propagation, back propagation, and gradient propagation of back distillation. (right) During the transplant, the adapter h learns potential projections between f's output feature space and gS's input feature space.

Distinct Advantages of Network Transplanting. As a new methodology, network transplanting exhibits distinct value in deep learning, because it builds a universal net for various categories and tasks, instead of learning various DNNs for different applications. First, the transplant net has compact representations, because each task/category module can be shared by various categories/tasks. Second, network transplanting reduces the start-up cost of sample collection before the learning process, because transplanting can be conducted in an online manner by incrementally absorbing network modules of new categories. Third, network transplanting can be considered a distributed-learning algorithm, which reduces the computational cost by distributing the heavy computation burden of modeling each category and each task. Fourth, network transplanting delivers trained DNNs to the centralized computation center, which is usually much cheaper than sending raw training data. Fifth, network transplanting can efficiently organize and use the knowledge in DNNs, because each module encodes the knowledge of a certain category or a certain task, which makes the transplant net controllable. Contributions are summarized as follows. (1) We propose network transplanting, which learns with a few or even without additional training annotations. (2) We develop back distillation to overcome specific challenges of network transplanting in model optimization. (3) Preliminary experiments verified the effectiveness of our method.


2 Related Work

Controllability. Learning a CNN with interpretable feature representations is a typical way to improve network interpretability. InfoGAN [7] and β-VAE [10] learned meaningful input codes of generative networks. In the capsule network [20], each capsule encoded different meaningful features. The ProtoPNet [5] was introduced to learn prototypical parts. Interpretable CNNs [22,31] learned disentangled intermediate-layer features. In comparison, our work explores the functionally modular network, which boosts the controllability of the network architecture.

Knowledge transferring has been widely investigated in the fields of meta-learning [2,12,14], transfer learning [15,16,30], and multi-domain learning [17,18,28]. In particular, continual learning [1,4,19,29] has transferred knowledge from previous tasks to guide the new task. However, network transplanting is different from previous studies from three perspectives, as summarized in Table 1. First, different from expanding the network architecture during the learning process [19,29], network transplanting has better modular controllability, because it allows people to semantically control the knowledge to transfer, and it requires much less network revision than that in multi-domain learning [17,18]. Second, network transplanting can physically avoid catastrophic forgetting, because it exclusively learns the adapter without affecting feature representations of existing category/task modules. Third, network transplanting uses the back distillation method to overcome the challenge of transferring upper modules, which is different from transferring low-layer features.

Table 1. Comparison between network transplanting and other studies. Note that this table only summarizes mainstreams in different research directions considering the huge research diversity.

| | Annotation cost | Sample preparation | Interpretability | Catastrophic forgetting | Modular manipulate | Optimization |
| Directly learning a multi-task net | Massive | Simultaneously prepare samples for all tasks and categories | Low | – | Not support | Back prop |
| Transfer- / meta- / continual-learning | Some support weakly-supervised learning | Some learn a category/task after another | Usually low | Most algorithmically alleviate | Not support | Back prop |
| Transplanting | A few or w/o annotations | Learn a category after another | High | Physically avoid | Support | Back-back prop |

3 Network Transplanting

Goal. Given an initial transplant net learned for multiple categories, the goal of network transplanting is to enable this transplant net to deal with a new category. To this end, we first collect another network (namely a teacher net) trained for a single or multiple tasks on the new category. This teacher net contains a category module f with m bottom layers to extract generic features for the new category, and a specific task module $g_T$ made up of the other layers. Thus, in order to enable the task module $g_S$ of the transplant net to deal with the new category, we transplant the category module f to $g_S$ by learning an adapter h with parameters $\theta_h$ to connect f and $g_S$, as shown in Fig. 2 (left). Specifically, given an image I, let $x = f(I)$ denote the output of the category module f, and let $y_T = g_T(x)$ and $y_S = g_S(h(x))$ indicate the outputs of the teacher net and the transplant net, respectively. To realize the goal of network transplanting, we use the cascaded modules h and $g_S$ to mimic the task module $g_T$ in the teacher net, i.e. $\max_{\theta_h} \mathrm{similarity}(y_T, y_S)$. For convenience, the transplant net is termed a student net hereinafter.

Basic Idea of the Gradient-Based Distillation. However, the above learning of the adapter h with parameters $\theta_h$ cannot be implemented as traditional distillation (i.e. directly pushing $y_S$ towards $y_T$) for three reasons. First, $g_S$ has vast forgotten spaces (introduced in Sect. 3.1). Second, $y_T$ may have very few dimensions (e.g. a scalar output), which is usually not informative enough to force $g_S(h(x))$ to mimic $g_T$ precisely without lots of training samples. Third, except for the output $y_T$, features of other layers in $g_T$ are not aligned with those of layers in $g_S$, and thus cannot be used for distillation. As a solution, we propose to force the gradient (Jacobian) of the student net (transplant net) to approximate that of the teacher, which is a necessary condition of $y_T \approx y_S$:

$$y_T \approx y_S \;\Rightarrow\; \forall \Delta x,\ \frac{\partial y_T}{\partial x} \approx \frac{y_T|_{x+\Delta x} - y_T|_x}{\Delta x} \approx \frac{y_S|_{x+\Delta x} - y_S|_x}{\Delta x} \approx \frac{\partial y_S}{\partial x} \;\Rightarrow\; \forall \Delta x, \forall J,\ \frac{\partial J(y_T)}{\partial x} \approx \frac{\partial J(y_S)}{\partial x}. \tag{1}$$

Equation (1) is supposed to be satisfied when $J(\cdot)$ is selected as any arbitrary linear or non-linear function of y that outputs a scalar. Thus, we use the following distillation loss to learn the adapter h with parameters $\theta_h$:

$$\min_{\theta_h} Loss, \quad Loss = L(y_S, y^*) + \lambda \cdot \left\| \frac{\partial J(y_S)}{\partial x} - \frac{\partial J(y_T)}{\partial x} \right\|^2, \tag{2}$$

where $L(y_S, y^*)$ denotes the task loss of the student net (e.g. the cross-entropy loss), and $y^*$ indicates the ground-truth label. We omit $L(y_S, y^*)$ if we learn the adapter without additional training labels. Note that Eq. (2) is similar to the Jacobian distillation in [24].

Avoiding Catastrophic Forgetting. Equation (2) exclusively revises $\theta_h$ without changing f or $g_S$. In this way, the transplant of a new category will not affect the feature representations of all existing category modules f, task modules $g_S$, and all other adapters except for h, thereby avoiding catastrophic forgetting.
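To make Eq. (2) concrete, the sketch below shows one adapter-only optimization step that matches the feature gradients of teacher and student for a randomly projected scalar output. It is a simplified illustration under our own naming (adapter_distill_step, lam, optimizer_h); it omits the revised ReLU/pooling derivatives that back distillation (Sect. 3.2) additionally introduces, and it assumes y_S are classification logits when the optional task loss is used.

```python
import torch
import torch.nn.functional as F

def adapter_distill_step(f, g_T, g_S, h, optimizer_h, images, labels=None, lam=10.0):
    """One gradient-matching step for the adapter h only; f, g_T, g_S stay frozen."""
    x = f(images).detach()
    x.requires_grad_(True)

    y_T = g_T(x)
    y_S = g_S(h(x))

    # Random projection G_y defines the scalar J(y) = <G_y, y> from Eq. (1).
    G_y = torch.rand_like(y_T) * 2 - 1
    D_T = torch.autograd.grad((G_y * y_T).sum(), x, retain_graph=True)[0]
    D_S = torch.autograd.grad((G_y * y_S).sum(), x, create_graph=True)[0]

    loss = lam * F.mse_loss(D_S, D_T.detach())
    if labels is not None:                      # optional task loss L(y_S, y*)
        loss = loss + F.cross_entropy(y_S, labels)

    optimizer_h.zero_grad()
    loss.backward()                             # gradients of gradients w.r.t. theta_h
    optimizer_h.step()
    return loss.item()
```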

3.1 Space-Projection Problem of Standard Distillation and Jacobian Distillation

Problems of Vast Forgotten Spaces. In spite of using the Jacobian distillation in Eq. (2), how to make an adapter h properly project the output feature space of f to the input feature space of $g_S$ still remains a challenge, due to the vast forgotten space in Fig. 3 (left). Specifically, the information-bottleneck theory [21,27] reveals that a DNN forgets most intermediate-layer features and gradually limits its attention to very few discriminative features during the learning process, thereby yielding a vast forgotten space. That is, only very few features in the entire input space can pass through all ReLU layers in the task module $g_S$. The output of f also has vast forgotten spaces, i.e., the output of f is only distributed in a low-dimensional manifold inside the feature space. More crucially, vast forgotten feature spaces block lots of paths for both the forward propagation and the backward propagation [26]. Specifically, in the forward propagation, most features of the adapter fall inside the forgotten space of $g_S$. Thus, as shown in Fig. 4, most feature information cannot pass through the task module $g_S$. Consequently, in the backward propagation, the adapter will not receive informative gradients from enough diverse propagation paths.

Fig. 3. An intermediate-layer feature space. (left) Features of an initialized DNN randomly cover the entire feature space. The learning process pushes the DNN towards typical feature spaces and produces a vast forgotten space. (right) We show toy examples of the space projection of adapters. Problematic space projections decrease both feature diversity and representation capability.

Fig. 4. Comparison of the projected feature spaces learned by our method (blue points) and those learned by the baseline method (red points). Each point represents two principal components of the fc8 feature of an image. Features learned by the direct-learn baseline were more weakly triggered and had lower diversity than those learned by our method, due to problematic projections. (Color figure online)

Problems with the Jacobian Distillation. The above problem of forgotten feature spaces hampers most recent techniques, including the Jacobian distillation, in optimizing network parameters. Specifically, the vast forgotten input space of $g_S$ blocks most propagation paths in the computation of feature gradients in Eq. (2). In other words, given a pre-trained task module $g_S$, ReLU layers in $g_S$ tend to filter out most neural activations if the input feature h(x) has not been co-trained with $g_S$. Hence, without sufficient activations, the gradients in Eq. (2) carry very little information about network representations, which hurts the learning capacity of knowledge distillation.

3.2 Solution: Learning with Back-Distillation

To overcome the above optimization problem, in this section, we propose a back distillation method to reduce the forgotten space, i.e., increasing the number of propagation paths during the computation of feature gradients.

Specifically, let $D_S = \frac{\partial J(y_S)}{\partial x}$ and $D_T = \frac{\partial J(y_T)}{\partial x}$ denote feature gradients, which both can be written in the form of back propagation $D(x, \theta_h) \stackrel{\text{def}}{=} f'_{conv} \circ f'_{relu} \circ f'^{max}_{pool} \circ \cdots \circ f'_{conv}(G_y)$, where $f'_i$ denotes the derivative of the layer-wise function $f_i$; $G_y = \frac{\partial J}{\partial y}$; $f'_i \circ f'_{i+1}(\cdot) = f'_i(f'_{i+1}(\cdot))$. To increase propagation paths, we revise the ReLU layers and pooling layers in $g_T$, $g_S$, and h for the distillation in Eq. (2) by proposing the following two pseudo-gradients $\hat{D}_S$, $\hat{D}_T$ to replace $D_S$, $D_T$ in Eq. (1), respectively.

$$\hat{D}(\theta_h) \stackrel{\text{def}}{=} f'_{conv} \circ f'_{dummy} \circ f'^{avg}_{pool} \circ \cdots \circ f'_{conv}(\hat{G}_y). \tag{3}$$

Equation (3) contains three revisions to construct the pseudo-gradients $\hat{D}_S$ and $\hat{D}_T$. First, we ignore dropout operations and replace derivatives of max-pooling $f'^{max}_{pool}$ with derivatives of average-pooling $f'^{avg}_{pool}$. Second, we revise the derivative of the ReLU to either $f'^{1st}_{dummy}\big(\frac{\partial J}{\partial x^{(k)}}\big) = \frac{\partial J}{\partial x^{(k)}}$ or $f'^{2nd}_{dummy}\big(\frac{\partial J}{\partial x^{(k)}}\big) = \frac{\partial J}{\partial x^{(k)}} \odot \mathbb{1}(x^{rand} > 0)$, where $x^{rand} \in \mathbb{R}^{s_1 \times s_2 \times s_3}$ denotes a random feature map, and $\odot$ indicates the element-wise product. Third, during the computation for both $g_S$ and $g_T$ on each image, we set the same values of $x^{rand}$ and the random gradient $\hat{G}_y$ to make $\hat{D}_S$ and $\hat{D}_T$ comparable with each other.

Thus, the distillation loss based on $\hat{D}_S$, $\hat{D}_T$, namely back distillation, is formulated as $\min_{\theta_h} Loss = L(y_S, y^*) + \lambda \|\hat{D}_S - \hat{D}_T\|^2$. Back distillation is eased because all the above derivative functions $f'^{1st}_{dummy}$, $f'^{2nd}_{dummy}$, $f'^{avg}_{pool}$ do not block many activations and thus reduce the forgotten space. Moreover, back distillation can be optimized by propagating gradients of gradient maps to upper layers, which is considered as back-back-propagation.

76

Q. Zhang et al.

4

Experiments

To simplify the story, in this section, we limited our attention to checking the effectiveness of network transplanting. We did not discuss other operations, e.g. fine-tuning or initializing category/task modules via traditional learning methods (see Fig. 1). 4.1

Implementation Details

We designed five experiments to evaluate the proposed method. Specifically, in Experiment 1, we learned toy transplant nets by inserting adapters between intermediate layers of pre-trained DNNs. Experiments 2 and 3 learned transplant nets with two task modules (i.e. modules for object classification and segmentation) and multiple category modules in real applications. In Experiment 4, we conducted network transplanting to dissimilar categories. In Experiment 5, we transplanted category modules to generic task modules that had been learned for massive other categories. Besides, we learned adapters with limited numbers of samples (i.e., 10, 20, 50, and 100 samples) to check the back distillation strategy could decrease the demand for training samples. We further tested network transplanting without any training samples in Experiment 1, i.e. optimizing the distillation loss without considering the task loss. Baselines. We compared our back distillation method (namely back-distill ) with three baselines. All baselines exclusively learned the adapter without finetuning the task module for fair comparisons. The first baseline only optimized the task loss L(yS , y ∗ ) without distillation, namely direct-learn. The second baseline was the traditional distillation [11], namely distill. Its distillation loss CrossEntropy(yS , yT ) was applied to outputs of task modules gS and gT , because except for outputs, other layers in gS and gT did not produce features on similar manifolds. We tested the distill method in object segmentation, because unlike single-category classification with a single scalar output, segmentation outputs had sufficient correlations between soft output labels to enable the distillation. The third baseline was the Jacobian distillation [24], namely Jacobian-distill, where we replaced back distillation loss with MSE Jacobian-distillation loss. Network Architectures. We transplanted category modules to a classification module in Experiments 1, 2, 4, and 5, and transplanted category modules to a segmentation module in Experiment 3. People usually extended the architecture of the widely used VGG-16 [23] to implement classification [31] and segmentation [13], as standard baselines of these two tasks. Thus, as shown in Fig. 5(a), we constructed a teacher net for both classification and segmentation as a network with a single category module and two task modules. The network branch for classification was exactly a VGG-16, and the network branch for segmentation was the FCN-8 s model [13]. Division of the Category Module and the Task Module. Because the first five layers of the FCN-8 s and those of the VGG-16 shared the same architecture,

Network Transplanting for the Functionally Modular Architecture

77

Fig. 5. (a) Teacher net; (b) transplant net; (c) adapters; (d) two sequences of transplanting.

we considered the first five layers (including two convolutional layers, two ReLU layers, and one pooling layer) as the shared category module, which encoded generic features of a specific category [3]. We regarded upper layers of the VGG16 and the FCN-8 s as two task modules, which encoded features to the task. Both branches were benchmark networks. Specifically, we followed the standard experimental settings in [13] to learn the FCN-8 s for each category based on Pascal VOC 2011 dataset [9] for object segmentation. For object classification, we followed experimental settings in [31] to train DNNs on VOC Part dataset [6], CUB200-2011 dataset [25], and ILSVRC Animal-Part dataset [31] for binary classification of a single category from random images, respectively. We used SGD with learning rate 0.01 to train DNNs for 300 epochs. Adapters. An adapter in Fig. 5(c) contained n convolutional layers, where we set n = 1 or 3 in experiments to analyze the influence of the layer number of adapters. Note that each convolutional layer in the adapter contained M filters. Each filter was a 3 × 3 × M tensor with padding = 1 in Experiment 1 or a 1 × 1 × M tensor without padding in Experiments 2 and 3, to avoid changing the size of feature maps, where M was the channel number of f . Additionally, we inserted a “reorder” layer and a “re-scaling” layer in the beginning of the adapter. The “reorder” layer is used to mimic feature states in real applications for fair evaluation. The “reorder” layer randomly reordered channels of the features x from the category module, which enlarged the dissimilarity between output features of different category modules. The “re-scaling” layer normalized the scale of features x from the category module, i.e. xout = β · x, for robust network transplanting. β = EI∈IS [h(f (I))F ]/EI∈IT [f (I)F ] was a fixed scalar, where IT and IS denoted the image set of the new category and the image set of categories already modeled by the transplant net, respectively. ·F denoted the Frobenius norm. 4.2

Experimental Results and Analysis

• Experiment 1: Adding adapters to pre-trained DNNs. To examine effectiveness of network transplanting, we first conducted a toy test by inserting and learning an adapter between a category module and a task module. Here, we only considered networks with VGG-16 architectures for singlecategory classification. Note that in Experiments 1 and 2, we learned and

78

Q. Zhang et al.

merged teacher networks for five mammal categories of VOC Part dataset, i.e. the cat, cow, dog, horse, and sheep categories. Mammal categories shared similar object contours, which made features of a category transferable to other categories. Teacher networks were strongly supervised based on standard experimental settings in [31] and achieved error rates of 1.6%, 0.6%, 4.1%, 1.6%, and 1.0% for the classification of the cat, cow, dog, horse, and sheep categories, respectively. Then, we learned two types of adapters in Fig. 5(c) to analyze the influence of the layer number. ˆ y = 1 without any gradient randomization, Specifically, we simply set G because the classification output y was a scalar without neighboring outputs to provide correlations. We used the revised dummy ReLU operations in ˆS, D ˆ T for learning. Moreover, we used Eq. (3) to ensure the value diversity of D  2nd fdummy = fdummy to compute expedient derivatives of ReLU operations  in the task  ˆ s −D ˆ t 2  ∂D  1st = fdummy in the adapter. We set λ = 10.0/  module, and used fdummy   ∂θh for object classification in Experiments 1 and 2. Figure 4 compared the space of fc8 features, when we used back distillation and the direct-learn baseline to learn an adapter with three convolutional layers, respectively. We found that the adapter learned by our method passed much informative features to the fc8 layer and yielded more diverse features. Which demonstrates that our method was more immune to problematic projections in Fig. 3 than the direct-learn baseline. Table 2 reports the transplanting performance, when the adapter contained a single convolutional layer and when the adapter contained three convolutional layers. We discovered that a three-layer adapter was usually more difficult to optimize than the single-layer adapter. Table 2(left) shows that compared to the 9.79%–38.76% error rates of the direct-learn baseline, where our back-distill method yielded a significant lower classification error (1.55%–1.75%). Even without any training samples, our method still outperformed the direct-learn method with 100 training samples. Note that given an adapter with three convolutional layers without any training samples, our back-distill method was not powerful enough to learn this adapter. Because deeper adapters with more parameters had more flexibility in representation, which required stronger supervision to avoid over-fitting. For example, in the last row of Table 2, our method successfully optimized an adapter with a single convolutional layer (the error rate was 1.75%), but was hampered when the adapter had three convolutional layers (the error rate was 49.9%, and our method produced a biased short-cut solution).

• Experiment 2: Operation sequences of transplanting to the classification module. As Fig. 5(b, d) shows, we could divide the entire networktransplanting procedure into an operation sequence of transplanting category modules to the classification module and another operation sequence of transplanting category modules to the segmentation module. Here, we evaluated the performance of transplanting category modules to the classification module, and conducted another sequence in Experiment 3.

Network Transplanting for the Functionally Modular Architecture

79

Table 2. Error rates of classification when we inserted adapters with different convolutional layers. The last row corresponded to network transplanting without optimizing L(yS , y ∗ ). Our back-distill method yielded a significant lower classification error than the direct-learn baseline. Insert one conv-layer

# of samples

cat

cow

dog

100 direct-learn back-distill 50 direct-learn back-distill 20 direct-learn back-distill 10 direct-learn back-distill 0 direct-learn back-distill

12.89 1.55 13.92 1.55 16.49 1.55 39.18 1.55 – 1.55

3.09 0.52 15.98 0.52 26.80 0.52 39.18 0.52 – 0.52

12.89 3.61 12.37 3.61 28.35 3.09 35.05 3.61 – 4.12

horse sheep Avg. Insert three conv-layers 9.28 9.79 1.03 1.65 15.46 14.84 1.03 1.65 25.77 25.98 1.03 1.55 38.66 38.76 1.03 1.65 – – 1.03 1.75

10.82 1.55 16.49 1.55 32.47 1.55 41.75 1.55 – 1.55

# of samples

cat

cow

dog

horse

sheep Avg.

100 direct-learn back-distill 50 direct-learn back-distill 20 direct-learn back-distill 10 direct-learn back-distill 0 direct-learn back-distill

9.28 1.03 14.43 3.09 22.16 7.22 36.08 8.25 – 50.00

6.70 2.58 13.92 3.09 25.77 6.70 32.99 15.46 – 50.00

12.37 4.12 15.46 4.12 32.99 7.22 31.96 10.31 – 50.00

11.34 1.55 8.76 2.06 22.68 2.58 34.54 13.92 – 49.48

3.61 2.58 7.22 4.64 22.16 5.15 34.02 10.31 – 50.00

8.66 2.37 11.96 3.40 25.15 5.77 33.92 11.65 – 49.90

Specifically, we considered the classification module of the dog as a generic task modular, because the DNN for the dog was better learned due to the fact that the dog category contained more training samples. We transplanted category modules of other four mammal categories to this task module. We set the  1st = fdummy operaadapter to contain three convolutional layer, and used fdummy tion to compute derivatives of ReLU operations in both the task module and the adapter. The only exception was the lowest ReLU operation in the task module,  2nd = fdummy |xrand . We generated xrand = [x , x , . . . , x ] for which we applied fdummy for each input image by concatenating x along the third dimension, where x ∈ Rs1 ×s2 ×1 contained 20%/80% positive/negative elements. The generation ˆ y is introduced in Sect. 3.2. Here, IS = Idog because the task module in the of G dog network was used as the generic task module gS . Table 3 presents the performance when we transplanted the category module to a task module oriented to categories with similar architectures using a few (10–100) samples. We discovered our back-distill method outperformed baseline methods. • Experiment 3: Operation sequences of transplanting category modules to the segmentation module. Specifically, as introduced in Sect. 4.1, we used all training samples to learn five teacher FCNs for singlecategory segmentation with strong supervision. The pixel-level segmentation accuracies [13] for the cat, cow, dog, horse, and sheep categories were 95.0%, 94.7%, 95.8%, 94.6%, and 95.6%, respectively. We also considered the segmentation module of the dog as a generic task modular. We transplanted category modules of other four mammal categories to this task module. We set  1st = fdummy the adapter to contain one convolutional layer, and used the fdummy ˆT ] operation to compute derivatives of ReLU operations. We set λ = 1.0/EI [D for all categories. Table 3 compares the pixel-level segmentation accuracy between our backdistill method with baselines, where we tested our method with 10–100 training samples. We found our back-distill method exhibited 9.6%–11.5% higher accuracy than all baselines.

80

Q. Zhang et al.

Table 3. Performance of transplanting to classification modules in Exp. 2 and transplanting to segmentation modules in Exp. 3. Our back-distill method outperformed baseline methods. Error rate of single-category classification in Experiment 2 (transplanting to a classification module) # of samples

cat

100 direct-learn 20.10 Jacobian-distill 19.07 back-distill 8.76 50

cow

horse

sheep Avg. # of samples

12.37 8.25 10.31

18.56 9.79 8.25

11.86 11.86 4.12

15.72 12.24 7.86

cat

cow

horse

sheep Avg.

20 direct-learn 31.96 37.11 39.69 35.57 36.08 Jacobian-distill 24.23 31.96 38.14 22.16 29.12 back-distill 17.53 24.74 17.01 16.49 18.94

direct-learn 22.68 19.59 19.07 14.95 19.07 10 direct-learn 41.75 37.63 44.33 33.51 39.31 Jacobian-distill 14.95 19.07 21.13 13.40 17.14 Jacobian-distill 42.78 28.35 41.24 34.02 36.60 back-distill 11.86 13.40 15.98 12.37 13.40 back-distill 34.02 33.51 31.44 27.84 31.70

Pixel accuracy of object segmentation in Experiment 3 (transplanting to a segmentation module) # of samples

cat

100 direct-learn distill back-distill

76.54 74.60 81.00 78.37 77.63 20 direct-learn 74.65 80.18 78.05 80.50 78.35 distill 85.17 90.04 90.13 86.53 87.97 back-distill

71.13 74.82 76.83 77.81 75.15 71.17 74.82 76.05 78.10 75.04 84.03 88.37 89.22 85.01 86.66

50

71.30 74.76 76.83 78.47 75.34 10 direct-learn 68.32 76.50 78.58 80.62 76.01 distill 83.14 90.02 90.46 85.58 87.30 back-distill

70.46 74.74 76.49 78.25 74.99 70.47 74.74 76.83 78.32 75.09 82.32 89.49 85.97 83.50 85.32

direct-learn distill back-distill

cow

horse

sheep Avg. # of samples

cat

cow

horse

sheep Avg.

• Experiment 4: Transplant to dissimilar categories. We quantitatively evaluate the performance of transplanting to dissimilar categories to verify the effectiveness of network transplanting. In fact, identifying categories with similar architectures was still an open problem. People often manually defined sets of similar categories, e.g. learning a task module for mammals and learning another task module for different vehicles. In this way, we considered that the first four categories in VOC Part dataset, i.e. aeroplane, bicycle, bird, and boat, had dissimilar architectures with mammals. Thus, to transplant a mammal category (let us take the cat for example) to a classification/segmentation module, we learned a “leave-one-out” classification/segmentation module to deal with all four dissimilar categories and all mammal categories except the cat. Other experimental settings were the same as settings in Experiments 2 and 3. Table 4 shows the performance of transplanting to a task module trained for both similar and dissimilar categories. We discover our method outperformed the baseline. Note that Table 4 did not report the result of the dog category, because we needed to compare with results in Table 3. Compared to the performance of transplanting to a task module oriented to similar categories in Table 3, transplanting to a more generic task modules for dissimilar categories in Table 4 hurt average segmentation performance but boosted the classification performance. It was because forcing a task module to handle dissimilar categories sometimes made the task module encode more robust representations, while it might also let the task module ignore details of mammal categories. • Experiment 5: Transplanting to generic task modules learned for massive categories. Specifically, we followed experimental settings in [31] to learn DNNs on CUB200-2011 dataset and ILSVRC Animal-Part dataset [31]

Network Transplanting for the Functionally Modular Architecture

81

Table 4. Transplanting to dissimilar categories. (top) Error rate of single-category classification when the classification module was learned for both mammals and dissimilar categories. (bottom) Pixel accuracy of object segmentation, the segmentation module was learned for both mammals and dissimilar categories. Our back-distill method outperformed the baseline method. # of samples Classification

100 direct-learn back-distill 50 direct-learn back-distill

Segmentation 10

cat

cow

horse

sheep Avg. # of samples

cat

cow

horse

sheep Avg.

14.43 5.67 21.13 7.22

20.62 3.61 23.71 9.28

17.01 6.70 15.46 8.76

11.86 2.58 10.31 5.67

25.26 17.01 42.27 42.27

24.23 19.59 36.60 32.99

39.18 23.71 40.72 28.35

23.71 14.95 39.18 30.41

15.98 4.64 17.65 7.73

20 direct-learn back-distill 10 direct-learn back-distill

28.10 18.82 39.69 33.51

direct-learn 64.97 69.65 80.26 69.87 71.19 20 direct-learn 68.69 81.02 71.88 72.65 73.56 back-distill 74.59 83.51 82.08 80.21 80.10 back-distill 73.34 84.78 81.40 81.04 80.14

for binary classification, respectively. Target category modules were selected from DNNs for the last 10 categories of each dataset. These category modules were further transplanted to a generic task module, which was learned for all other categories in the dataset. Other experimental settings were the same as in Experiment 2. Table 5 shows that our back-distill method achieved the superior performance for transplanting to generic task modules.

Table 5. Transplanting to generic task modules for massive categories. We reported average error rates of single-category classification. Our back-distill method achieved the superior performance for transplanting to generic task modules. ILSVRC Animal-Part # of samples

direct-learn

ILSVRC Animal-Part

back-distill # of samples direct-learn

CUB200-2011

back-distill # of samples

direct-learn

back-distill

100

12.73

9.33

20

27.99

15.72

20

14.17

9.56

50

20.67

11.29

10

32.99

22.84

10

19.91

13.19

• Discussion for network transplanting. First, as Table 2 shows, the back distillation method can significantly decrease the demand for training samples. Second, the comparison between Tables 3 and 4 indicates that a successful transplant requires target categories to share similar object shapes. Third, the transplant net can be applied to the multi-category classification by selecting the category module that yields the highest classification confidence as the classification result. Besides, we can learn a large transplant net for different categories and tasks by dividing it into lots of elementary operations of network transplanting in Fig. 1. Nevertheless, people can apply network transplanting to DNNs with different architectures for different tasks.

5 Conclusion

This paper proposes network transplanting to learn DNNs with a few or even without training annotations. We develop the back distillation method to overcome challenges of network transplanting, and experiments verified the effectiveness of our method.

Acknowledgment. Quanshi Zhang and Xu Cheng contributed equally to this paper. Quanshi Zhang is the corresponding author. He is with the Department of Computer Science and Engineering, the John Hopcroft Center, at the Shanghai Jiao Tong University, China. This work is partially supported by the National Nature Science Foundation of China (62276165), National Key R&D Program of China (2021ZD0111602), Shanghai Natural Science Foundation (21JC1403800, 21ZR1434600), and National Nature Science Foundation of China (U19B2043).

References 1. Aljundi, R., Lin, M., Goujaud, B., Bengio, Y.: Gradient based sample selection for online continual learning. In NIPS 2. Andrychowicz, M., Denil, M., Colmenarejo, S.G., Hoffman, M.W., Pfau, D., Schaul, T., Shillingford, B., de Freitas, N.: Learning to learn by gradient descent by gradient descent. In NIPS (2016) 3. Bau, D., Zhou, B., Khosla, A., Oliva, A., Torralba, A.: Network dissection: Quantifying interpretability of deep visual representations. In CVPR (2017) 4. Buzzega, P., Boschini, M., Porrello, A., Abati, D., Calderara, S.: Dark experience for general continual learning: a strong, simple baseline. In NIPS 5. Chen, C., Li, O., Tao, D., Barnett, A., Rudin, C., Su, J.K.: This looks like that: deep learning for interpretable image recognition. In NIPS (2019) 6. Chen, X., Mottaghi, R., Liu, X., Fidler, S., Urtasun, R., Yuille, A.: Detect what you can: Detecting and representing objects using holistic models and body parts. In CVPR (2014) 7. Chen, X., Duan, Y., Houthooft, R., Schulman, J., Sutskever, I., Abbeel, P.: Infogan: Interpretable representation learning by information maximizing generative adversarial nets. In NIPS (2016) 8. Chou, Y.M., Chan, Y.M., Lee, J.H., Chiu, C.Y., Chen, C.S.: Unifying and merging well-trained deep neural networks for inference stage. In arXiv:1805.04980 (2018) 9. Hariharan, B., Arbelaez, P., Bourdev, L., Maji, S., Malik, J.: Semantic contours from inverse detectors. In ICCV (2011) 10. Higgins, I., Matthey, L., Pal, A., Burgess, C., Glorot, X., Botvinick, M., Mohamed, S., Lerchner, A.: β-vae: learning basic visual concepts with a constrained variational framework. In ICLR (2017) 11. Hinton, G., Vinyals, O., Dean, J.: Distilling the knowledge in a neural network. In NIPS Workshop (2014) 12. Hospedales, T., Antoniou, A., Micaelli, P., Storkey, A.: Meta-learning in neural networks: A survey. IEEE transactions on pattern analysis and machine intelligence 44(9), 5149–5169 (2021) 13. Long, J., Shelhamer, E., Darrel, T.: Fully convolutional networks for semantic segmentation. In CVPR (2015) 14. Metz, L., Maheswaranathan, N., Cheung, B., Sohl-Dickstein, J.: Learning unsupervised learning rules. In ICLR (2019) 15. Neyshabur, B., Sedghi, H., Zhang, C.: What is being transferred in transfer learning? In NIPS


16. Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. In JMLR 17. Rebuffi, S.A., Bilen, H., Vedaldi, A.: Learning multiple visual domains with residual adapters. In NIPS (2017) 18. Rebuffi, S.A., Bilen, H., Vedaldi, A.: Efficient parametrization of multi-domain deep neural networks. In CVPR (2018) 19. Rusu, A.A., Rabinowitz, N.C., Desjardins, G., Soyer, H., Kirkpatrick, J., Kavukcuoglu, K., Pascanu, R., Hadsell, R.: Progressive neural networks. In arXiv:1606.04671 (2016) 20. Sabour, S., Frosst, N., Hinton, G.E.: Dynamic routing between capsules. In NIPS (2017) 21. Schwartz-Ziv, R., Tishby, N.: Opening the black box of deep neural networks via information. In arXiv:1703.00810 (2017) 22. Shen, W., Wei, Z., Huang, S., Zhang, B., Fan, J., Zhao, P., Zhang, Q.: Interpretable compositional convolutional neural networks. In IJCAI (2021) 23. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. In ICLR (2015) 24. Srinivas, S., Fleuret, F.: Knowledge transfer with jacobian matching. In ICML (2018) 25. Wah, C., Branson, S., Welinder, P., Perona, P., Belongie, S.: The caltech-ucsd birds-200-2011 dataset. Tech. rep, In California Institute of Technology (2011) 26. Wang, Y., Su, H., Zhang, B., Hu, X.: Interpret neural networks by identifying critical data routing paths. In CVPR (2018) 27. Wolchover, N.: New theory cracks open the black box of deep learning. In Quanta Magazine (2017) 28. Xiao, J., Gu, S., Zhang, L.: Multi-domain learning for accurate and few-shot color constancy (2020) 29. Yoon, J., Yang, E., Lee, J., Hwang, S.J.: Lifelong learning with dynamically expandable networks. In ICLR (2018) 30. Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? In NIPS (2014) 31. Zhang, Q., Wu, Y.N., Zhu, S.C.: Interpretable convolutional neural networks. In CVPR (2018)

TiAM-GAN: Titanium Alloy Microstructure Image Generation Network Zhixuan Zhang, Fusheng Jin(B) , Haichao Gong, and Qunbo Fan Beijing Institute of Technology, Beijing, China [email protected]

Abstract. Generating titanium alloy microstructure images from mechanical properties is of great value to the research and production of titanium alloy materials. The advent of GANs makes such image generation possible. However, no existing work handles multiple continuous labels for microstructure images. This paper presents a multi-label titanium alloy microstructure image generation network (TiAM-GAN). The proposed TiAM-GAN contains two sub-networks. The first is a generation network for simple textures, built on an existing conditional GAN, for which we reconstruct the loss function for multi-label continuous variables and derive an error bound. The second is a generation network for complex textures, which uses a mixture density network to learn the labels-to-noise mapping and a deep convolutional GAN to learn the noise-to-image mapping; the noise output by the mixture density network is fed into the deep convolutional GAN to generate the image. Finally, we compare our method with existing methods qualitatively and quantitatively, showing that it achieves better results.

Keywords: Titanium Alloy Microstructure · Multi-Label Image Generation · Generative Adversarial Network

1 Introduction

Titanium alloy is a material with excellent properties such as low density, high specific strength, high-temperature resistance, and corrosion resistance [7], so it is widely used in aerospace [4], medical [9], chemical [2], and other fields. The mechanical properties of titanium alloys are closely related to their microstructure characteristics. If the microstructure could be quickly derived from the mechanical performance parameters, it would not only avoid the waste of materials caused by a large number of experiments and effectively guide the titanium alloy processing process, but also overcome the limitations imposed by existing equipment and technologies. Therefore, generating titanium alloy microstructure images from mechanical properties is of great significance for both theoretical foundation research and engineering application research.


Most of the existing image generation tasks target macroscopic image datasets. However, the microstructure images of titanium alloys have particular texture features: the features of different images are similar, and the details are closely related to the performance parameters, so they are more difficult for a neural network to understand and learn. In addition, current image generation tasks almost all target discrete labels or a single label [8], whereas the mechanical properties of titanium alloys form multiple continuous labels. The difficulty is that the neural network needs to learn the correlation among all of the labels, and the labels cannot be restricted to a finite set, so they are extremely unbalanced and the learning effect is greatly reduced. To the best of our knowledge, this paper is the first to address the generation of titanium alloy microstructure images with multiple continuous labels. We propose a titanium alloy microstructure image generation network, TiAM-GAN, which is divided into two sub-networks: a continuous conditional generative adversarial network based on feature fusion (Feature-Fusion CcGAN), and a generative adversarial network based on feature extraction and mapping (Feature-Extraction-Mapping GAN). Our contributions are as follows:
– We design a multi-label continuous conditional generative adversarial network based on feature fusion (Feature-Fusion CcGAN). Based on CcGAN (Continuous Conditional Generative Adversarial Network) [3], the network accepts multi-label input through label fusion; we reconstruct the loss functions of the generator and discriminator so that they can deal with multiple labels, and derive the error bound of the discriminator loss, proving that the error is negligible. This network can generate microstructure images with simple textures and obvious grain boundaries.
– We design a generative adversarial network based on feature extraction and mapping (Feature-Extraction-Mapping GAN). The network learns the label-to-noise mapping and then samples from this distribution to control the noise input, generating microstructure images that conform to the selected labels. It solves the problem that Feature-Fusion CcGAN cannot generate complex-texture images.
– We propose the ML-SFID (Multi-Label Sliding Fréchet Inception Distance) metric to quantitatively evaluate our method and existing methods under multiple continuous labels.
– We conduct ablation and comparative experiments on our titanium alloy Ti555 dataset, and show through qualitative and quantitative evaluation that TiAM-GAN achieves the best results.

2 Related Work

2.1 Image Generation

With the progress of deep learning, the field of image generation has developed rapidly. In 2014, Goodfellow et al. proposed a new network structure, GAN [5], based on the idea of two networks competing against each other. Many variants of the original GAN have since appeared. In 2014, Mirza et al. proposed CGAN [8] to address the uncontrollability of GAN, solving the problem that GAN cannot conditionally constrain the generated images. In 2015, Radford et al. proposed a CNN-based network, DCGAN [10], to improve image quality. In 2020, addressing the problem that traditional CGANs cannot handle continuous labels, Ding et al. proposed CcGAN [3], the first generative adversarial network for a single continuous label. In addition, there are many other related works [1,11,13,14,16]. In 2020, the denoising diffusion probabilistic model (DDPM) [6] was proposed, which opened up a new path for image generation.

2.2 Mixture Density Network

A mixture density network (MDN) is a model that uses a neural network to model a data distribution. In theory, arbitrary distributions can be obtained by combining the model's mixture components. Bishop [12] designed such a network: a vector is fed into a designed neural network to obtain the parameters of a Gaussian mixture model, and these parameters define a mixture probability distribution. When a predicted value is needed, it can be sampled directly from this mixture distribution. In 2021, Yang et al. [15] applied MDNs to the generation of material images: the optical absorption characteristics of the material obtained through physical simulation are used as the input of the network, the parameters of the Gaussian mixture model are obtained through a fully connected neural network, and finally the latent variables obtained by sampling are used as the noise input of the generator.

3 Method

In this section, we introduce the framework of TiAM-GAN and its two sub-networks: Feature-Fusion CcGAN and Feature-Extraction-Mapping GAN.

3.1 Framework

Aiming at the generation of multi-label microstructure images, we propose a new network, the multi-label titanium alloy microstructure generative adversarial network TiAM-GAN; its framework is shown in Fig. 1. The network consists of two sub-networks. We divide the microstructure images of titanium alloys into two parts according to texture complexity: microstructure images with simpler textures are trained and generated by Feature-Fusion CcGAN, while microstructure images with more complex textures are trained and generated by Feature-Extraction-Mapping GAN.


Fig. 1. Multi-label titanium alloy microstructure image generation network TiAM-GAN. The top is the first subnetwork (Feature-Fusion CcGAN) and the bottom is the second subnetwork (Feature-Extraction-Mapping GAN).

3.2 Feature-Fusion CcGAN

For the generator of Feature-Fusion CcGAN in Fig. 2, the continuous conditional labels y1, y2, ..., y7 are mapped to a 128-dimensional feature space through a label-fusion mapping network to obtain the corresponding feature vector h. This feature vector is fed into the deconvolution layers together with a linearly mapped random noise z, passes through deconvolution, batch normalization, and activation functions in turn, and finally the image x is output.

Fig. 2. Feature-Fusion CcGAN generator architecture.
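A minimal sketch of this generator pipeline is given below. The layer widths, the 32x32 single-channel output, and the module names are assumptions for illustration only, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class LabelFusionMapping(nn.Module):
    """Fuse the seven continuous labels and map them to a 128-d feature vector h."""
    def __init__(self, num_labels=7, dim=128):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(num_labels, dim), nn.ReLU(), nn.Linear(dim, dim))

    def forward(self, y):                      # y: (B, 7)
        return self.net(y)                     # h: (B, 128)

class FusionGenerator(nn.Module):
    """Deconvolutional generator conditioned on the fused label embedding h."""
    def __init__(self, z_dim=128, h_dim=128, ch=64):
        super().__init__()
        self.fc = nn.Linear(z_dim + h_dim, ch * 4 * 4 * 4)
        self.deconv = nn.Sequential(
            nn.ConvTranspose2d(ch * 4, ch * 2, 4, 2, 1), nn.BatchNorm2d(ch * 2), nn.ReLU(),
            nn.ConvTranspose2d(ch * 2, ch, 4, 2, 1), nn.BatchNorm2d(ch), nn.ReLU(),
            nn.ConvTranspose2d(ch, 1, 4, 2, 1), nn.Tanh(),   # 32x32 grayscale output
        )

    def forward(self, z, h):
        x = self.fc(torch.cat([z, h], dim=1)).view(z.size(0), -1, 4, 4)
        return self.deconv(x)

# Usage sketch:
#   h = LabelFusionMapping()(labels)                  # labels: (B, 7)
#   img = FusionGenerator()(torch.randn(labels.size(0), 128), h)
```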

For the discriminator of Feature-Fusion CcGAN in Fig. 3, the input image x passes through convolution, normalization, and activation functions; the result is then combined, via an inner product and summation, with the feature vector h obtained by mapping the labels into the feature space through the label-fusion mapping network, and the discriminator finally outputs its result.

Reconstruction of Loss Function. CcGAN [3] only targets a single continuous label and does not extend to multiple labels. Therefore, on the basis of CcGAN, we reconstruct the loss functions of its generator and discriminator to solve the multi-label image generation problem.


Fig. 3. Feature-Fusion CcGAN discriminator architecture.

The reconstructed loss function of the generator is

$$L^{g}(G) = -\frac{1}{N^{g}} \sum_{i=1}^{N^{g}} \mathbb{E}_{\varepsilon^{g}\sim\mathcal{N}(0,\sigma^{2})}\Big[\log D\big(G(z_{i}, h_{i}^{g}),\, h_{i}^{g}\big)\Big] \qquad (1)$$

where $h_{i}^{g}$ is the 128-dimensional feature vector obtained from the conditional labels $y_1,\dots,y_7$ after label fusion and feature mapping. The same perturbation $\varepsilon$ is applied to all seven conditional labels $y_1,\dots,y_7$.

The reconstructed loss function of the discriminator, the Multi-Label Soft Vicinal Discriminator Loss (ML-SVDL), is

$$L^{\mathrm{ML\text{-}SVDL}}(D) = -\frac{C_{3}}{N^{r}} \sum_{j=1}^{N^{r}} \mathbb{E}_{\varepsilon^{r}\sim\mathcal{N}(0,\sigma^{2})}\Big[\sum_{i=1}^{N^{r}} W_{1}\,\log D\big(x_{i}^{r}, h_{j}^{r}\big)\Big] - \frac{C_{4}}{N^{g}} \sum_{j=1}^{N^{g}} \mathbb{E}_{\varepsilon^{g}\sim\mathcal{N}(0,\sigma^{2})}\Big[\sum_{i=1}^{N^{g}} W_{2}\,\log\big(1 - D(x_{i}^{g}, h_{j}^{g})\big)\Big] \qquad (2)$$

where $\varepsilon$ denotes the perturbation added to the labels $y$, $C_3$ and $C_4$ are two constants, and

$$W_1 = \frac{\prod_{k=1}^{7} w^r(y_k^r,\, y_k^r+\varepsilon^r)}{\sum_{i=1}^{N^r}\prod_{k=1}^{7} w^r(y_{ki}^r,\, y_{ki}^r+\varepsilon_i^r)}, \qquad W_2 = \frac{\prod_{k=1}^{7} w^g(y_k^g,\, y_k^g+\varepsilon^g)}{\sum_{i=1}^{N^g}\prod_{k=1}^{7} w^g(y_{ki}^g,\, y_{ki}^g+\varepsilon_i^g)}.$$

Here $h_j^r$ denotes the result of adding the perturbation ($y_j^r+\varepsilon^r$) to each real label and mapping it to the feature space, while $h_j^g$ denotes the result of adding the perturbation ($y_j^g+\varepsilon^g$) to each fake label and mapping it to the feature space.

Error Bound Derivation. In this section we follow CcGAN [3] to derive the error bound of the discriminator. First, assume that every label $y_1,\dots,y_7$ ranges over $0\sim 1$, and let $\mathcal{D}$ denote the hypothesis space of the discriminator. Define

$$p_w^r(y_i'\mid y_i)=\frac{w^r(y_i',y_i)\,p^r(y_i')}{W^r(y_i)}, \qquad p_w^g(y_i'\mid y_i)=\frac{w^g(y_i',y_i)\,p^g(y_i')}{W^g(y_i)},$$
$$W^r(y_i)=\int w^r(y_i',y_i)\,p^r(y_i')\,dy_i', \qquad W^g(y_i)=\int w^g(y_i',y_i)\,p^g(y_i')\,dy_i',$$

where $i\in\{1,2,\dots,7\}$ indexes the seven mechanical-property labels. Let $D^*$ denote the optimal discriminator, which minimizes $L^{\mathrm{ML\text{-}SVDL}}$, and let $\tilde{D}=\arg\min_D L(D)$ and $\hat{D}^{\mathrm{ML\text{-}SVDL}}=\arg\min_D \hat{L}^{\mathrm{ML\text{-}SVDL}}(D)$.


Definition 1 (Hölder class of a function):
$$\mathcal{H}(L) = \big\{\, p : \forall t_1,t_2\in y,\ \exists L>0 \ \text{s.t.}\ |p(t_1)-p(t_2)| \le L\,|t_1-t_2| \,\big\}$$

We make the following four assumptions:
(A1) All discriminators in $\mathcal{D}$ are measurable and bounded; let $U=\max\{\sup[-\log D],\ \sup[-\log(1-D)]\}$, with $U<\infty$.
(A2) For any $x$ and $y_i$, there exist $g^r(x)>0$ and $M^r>0$ with $\int g^r(x)\,dx=M^r$ such that $|p^r(x\mid y_i')-p^r(x\mid y_i)| \le g^r(x)\,|y_i'-y_i|$. That is, the perturbation added to the label $y$ satisfies a Lipschitz condition with constant $g^r(x)$, so its effect can be neglected, and $g^r(x)$ is the minimum value of the slope.
(A3) For any $x$ and $y_i$, there exist $g^g(x)>0$ and $M^g>0$ with $\int g^g(x)\,dx=M^g$ such that $|p^g(x\mid y_i')-p^g(x\mid y_i)| \le g^g(x)\,|y_i'-y_i|$.
(A4) $p^r(y_i)\in\mathcal{H}(L^r)$ and $p^g(y_i)\in\mathcal{H}(L^g)$.

We obtain Theorem 1: assuming (A1)–(A4) hold, then for any $\delta\in(0,1)$, with probability at least $1-\delta$,
$$
\begin{aligned}
L(\hat{D}^{\mathrm{ML\text{-}SVDL}}) - L(D^*) \le{}& 2U\Big(\frac{C^{\mathrm{KDE}}_{1,\delta}\log N^r}{N^r\sigma} + L^r\sigma^2\Big) + 2U\Big(\frac{C^{\mathrm{KDE}}_{2,\delta}\log N^g}{N^g\sigma} + L^g\sigma^2\Big) \\
&+ 2U\sqrt{\tfrac{1}{2}\log\tfrac{16}{\delta}}\Big(\frac{1}{\sqrt{N^r}}\sum_{i=1}^{7}\mathbb{E}_{y_i\sim p^{\mathrm{KDE}}_r(y_i)}\sqrt{\tfrac{1}{W^r(y)}} + \frac{1}{\sqrt{N^g}}\sum_{i=1}^{7}\mathbb{E}_{y_i\sim p^{\mathrm{KDE}}_g(y_i)}\sqrt{\tfrac{1}{W^g(y)}}\Big) \\
&+ U\Big(M^r\sum_{i=1}^{7}\mathbb{E}_{y_i\sim p^{\mathrm{KDE}}_r(y_i)}\,\mathbb{E}_{y_i'\sim p^r_w(y_i'\mid y_i)}|y_i'-y_i| + M^g\sum_{i=1}^{7}\mathbb{E}_{y_i\sim p^{\mathrm{KDE}}_g(y_i)}\,\mathbb{E}_{y_i'\sim p^g_w(y_i'\mid y_i)}|y_i'-y_i|\Big) \\
&+ L(\tilde{D}) - L(D^*) \qquad (3)
\end{aligned}
$$
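To make the structure of the ML-SVDL loss in Eq. (2) concrete, the sketch below shows one way its real-sample term could be organized in code. The Gaussian form of the vicinal weight, the label-mapper interface, and the assumption that D returns per-sample probabilities are all illustrative choices, not the paper's exact implementation.

```python
import torch

def vicinal_weights(y_target, y_batch, sigma=0.05):
    """Product over the 7 labels of Gaussian vicinal weights, normalized over the batch."""
    # y_target: (7,) perturbed target labels; y_batch: (B, 7) labels of the real batch
    w = torch.exp(-((y_batch - y_target) ** 2).sum(dim=1) / (2 * sigma ** 2))  # (B,)
    return w / (w.sum() + 1e-12)

def ml_svdl_real_term(D, x_real, y_real, label_mapper, sigma=0.05, noise_std=0.02):
    """Soft-vicinal real term of Eq. (2): each real image is weighted by how close
    its seven labels are to the perturbed target labels."""
    loss = 0.0
    for j in range(y_real.size(0)):
        y_eps = y_real[j] + noise_std * torch.randn_like(y_real[j])   # perturbed target labels
        h_j = label_mapper(y_eps.unsqueeze(0))                        # mapped condition h_j^r
        w = vicinal_weights(y_eps, y_real, sigma)                     # (B,) weights W_1
        d = D(x_real, h_j.expand(x_real.size(0), -1))                 # (B,) scores in (0, 1)
        loss = loss - (w * torch.log(d + 1e-12)).sum()
    return loss / y_real.size(0)
```

The fake-sample term would mirror this with log(1 - D(.)) over generated images.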

3.3 Feature-Extraction-Mapping GAN

The framework of Feature-Extraction-Mapping GAN is shown in Fig. 4.

Fig. 4. Feature-Extraction-Mapping GAN architecture.


In Feature-Extraction-Mapping GAN, random noise z is fed to the trained generator to obtain a microstructure image. The corresponding label features are then obtained through the image label extraction network, and these seven labels are fused and fed into the label-fusion mapping network to obtain a feature vector h embedded in the feature space. Through a mixture density network, we obtain the corresponding mapping distribution. Finally, the corresponding noise z can be sampled from this distribution, so that images satisfying the specified label conditions are generated by controlling the noise.

Feature-Extraction-Mapping GAN mainly consists of three modules: the multi-scale label-fusion mapping network, the image label extraction network, and the MDN. The label-fusion mapping network uses the concat method of early fusion: the seven label vectors are joined along the first dimension, passed through a fully connected layer, and finally mapped into a feature space. This mapped value h is the input to the MDN.

The image label extraction network is used to extract the label features of images. In order to make the entire network learn the features of titanium alloy microstructure images better, we add some prior knowledge to the network, namely the label features extracted from the images. Since this process cannot be realized through physical experiments, we instead extract the features with a trained neural network. The architecture of the image label extraction network is shown in Fig. 5.

Fig. 5. Image label extraction network architecture.

The mixture density network in Feature-Extraction-Mapping GAN is partially based on Yang et al.'s [15] application to material image generation. The 32-dimensional feature vector h obtained through the multi-scale label-fusion mapping network is used as the input of the MDN. Through a fully connected neural network based on a multi-scale mechanism, the parameters of the Gaussian mixture model are output. The corresponding latent variable z is then obtained by sampling from the Gaussian mixture model and is used as the noise input of the generator.
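A minimal sketch of this labels-to-noise-to-image inference path is shown below. The hidden dimensions, the number of mixture components, and the placeholder functions `label_fusion` and `dcgan_generator` are assumptions for illustration, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class MDN(nn.Module):
    """Map the fused 32-d label feature h to the parameters of a Gaussian mixture over z."""
    def __init__(self, h_dim=32, z_dim=100, n_comp=5):
        super().__init__()
        self.n_comp, self.z_dim = n_comp, z_dim
        self.net = nn.Linear(h_dim, n_comp * (1 + 2 * z_dim))

    def forward(self, h):
        p = self.net(h)
        pi = torch.softmax(p[:, :self.n_comp], dim=1)                         # mixture weights
        mu = p[:, self.n_comp:self.n_comp * (1 + self.z_dim)].view(-1, self.n_comp, self.z_dim)
        sigma = torch.exp(p[:, self.n_comp * (1 + self.z_dim):]).view(-1, self.n_comp, self.z_dim)
        return pi, mu, sigma

def sample_noise(pi, mu, sigma):
    """Pick a mixture component per sample and draw z from that Gaussian."""
    comp = torch.multinomial(pi, 1).squeeze(1)                                # (B,)
    idx = comp.view(-1, 1, 1).expand(-1, 1, mu.size(2))
    m = mu.gather(1, idx).squeeze(1)
    s = sigma.gather(1, idx).squeeze(1)
    return m + s * torch.randn_like(m)

# Inference sketch: h = label_fusion(labels); z = sample_noise(*MDN()(h)); image = dcgan_generator(z)
```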

4 Experiments

4.1 Dataset

The experiments are based on real datasets from titanium alloy projects. Our titanium alloy category is Ti555, and the images were taken under optical and electron microscopes, respectively. The microstructure images of titanium alloys can be divided into five classes: equiaxed structure, duplex structure, mixed structure, basket-weave structure, and Widmannstatten structure. Among them, the equiaxed structure can be subdivided into five subclasses, namely equiaxed 1–5.

4.2 Metric

In order to compare our model with existing models, we evaluate both qualitatively and quantitatively. The quantitative metric is a new metric proposed in this paper, Multi-Label SFID (ML-SFID), an extension of SFID [3]; the smaller the ML-SFID value, the better the image generation quality. It is computed as

$$\mathrm{ML\text{-}SFID} = \sqrt[7]{\,\prod_{i=1}^{7} \frac{1}{N_c}\sum_{l=1}^{N_c} \mathrm{FID}(i_l)\,} \qquad (4)$$

where $N_c$ is the number of samples within the interval with centre $c$, $i$ indexes the $i$-th label, and $i_l$ denotes the $l$-th sample for label $i$. Since seven labels are used to generate images, computing ML-SFID requires dividing each label into intervals with a radius; the interval division differs across the seven mechanical-performance labels, but the number of centres for each label is unified to 100. The division of the seven mechanical-performance intervals is shown in Table 1. In addition, since the training labels of ACGAN are discrete, we redefine the intervals for ACGAN in order to compute its ML-SFID value.
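Assuming Eq. (4) denotes the geometric mean over the seven labels of the interval-averaged FID, as reconstructed above, the metric can be sketched as follows. The per-interval FID values are assumed to be computed by an external FID routine that is not shown here.

```python
import numpy as np

def ml_sfid(fids_per_label):
    """ML-SFID sketch: for each of the 7 labels, average the FID values computed over
    that label's sliding intervals, then take the geometric mean across labels.

    fids_per_label: list of 7 arrays; fids_per_label[i][l] is the FID of the l-th
    interval (centre) of label i.
    """
    per_label_means = [float(np.mean(f)) for f in fids_per_label]
    return float(np.prod(per_label_means) ** (1.0 / 7.0))
```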

where, N c represents the number of samples within the interval with center c, i represents the i-th label, and represents the i-th sample. We use seven labels to generate images, so when calculating ML-SFID, it’s necessary to divide the interval and radius of each label, the interval to each type of mechanical performance label is different. However, the number of centers for each type of mechanical performance label is unified to 100. The division of seven types of mechanical performance intervals is shown in Table 1. In addition, since the training labels of ACGAN are discrete labels, we need to redefine the intervals for ACGAN to calculate its ML-SFID value. Table 1. Intervals of seven types of mechanical properties

4.3

Mechanical properties

Range

Interval length Interval size

Tensile Strength Yield Strength Elongation at Break Rockwell Hardness Impact Toughness Dynamic Compressive Strength Critical Fracture Strain

400 ∼ 1400 200 ∼ 1100 0.17 ∼ 0.3 20 ∼ 70 30 ∼ 100 200 ∼ 1600 1500 ∼ 5000

1000 900 0.13 50 70 1400 3500

15 12 0.002 0.6 0.8 20 40

Comparison Experiment

We compare TiAM-GAN with ACGAN. For ACGAN, we divide the seven mechanical properties according to Table 2, so that each label is split into multiple discrete intervals. We also modify the architecture and loss function of ACGAN to handle multiple labels, and finally compute the ML-SFID. To give ACGAN favourable conditions, we chose microstructures with simpler textures, equiaxed 1 under the optical microscope and equiaxed 1 under the electron microscope, which are easier for a neural network to learn.

Table 2. ACGAN discrete labels based on the range of the seven mechanical properties

Mechanical properties          Range         Interval length   Interval size   Number of intervals
Tensile Strength               400 ∼ 1400    1000              100             10
Yield Strength                 200 ∼ 1100    900               100             9
Elongation at Break            0.17 ∼ 0.3    0.13              0.01            13
Rockwell Hardness              20 ∼ 70       50                5               10
Impact Toughness               30 ∼ 100      70                7               10
Dynamic Compressive Strength   200 ∼ 1600    1400              200             7
Critical Fracture Strain       1500 ∼ 5000   3500              500             7

Qualitative. The qualitative results of the experiment are shown in Fig. 6. ACGAN performs very poorly and has almost no effect at all on equiaxed 1 under the electron microscope. This is because the continuous labels are discretized into intervals, which degrades learning, whereas our TiAM-GAN performs very well.

Quantitative. The quantitative results are shown in Table 3. Our TiAM-GAN achieves the best ML-SFID for every structure. In particular, for equiaxed 1 under the electron microscope, the ML-SFID of TiAM-GAN is more than 70% better than that of ACGAN, which is consistent with the qualitative evaluation.

Fig. 6. Comparative experimental results between ACGAN and TiAM-GAN


Table 3. ML-SFID results of the ACGAN and TiAM-GAN comparative experiments

Model       equiaxed-1 with OM   equiaxed-1 with EM
ACGAN       0.7491               1.3516
TiAM-GAN    0.4218               0.3071

4.4 Ablation Experiment

The ablation experiment consists of two parts. The first part is a comparison between CcGAN and the first sub-network, Feature Fusion CcGAN; The second part is a comparison between CcGAN, Feature Fusion CcGAN, and TiAM-GAN. For the first part of the ablation experiment, CcGAN was compared with Feature Fusion CcGAN. We chose three types of structures for the experiment: equiaxed-1 with optical microscopy(OM), duplex with optical microscopy, and equiaxed-1 with electron microscopy(EM). Qualitative. The qualitative results of the experiment are shown in Fig. 7. We can see that CcGAN is not as good as our Feature Fusion CcGAN. For equiaxed1 with electron microscopy, its generated structure is more blurry, and not as clear as Feature Fusion CcGAN.

Fig. 7. Results of CcGAN and Feature Fusion CcGAN ablation experiments

Quantitative. The quantitative results are shown in Table 4. Our Feature-Fusion CcGAN achieves the best evaluation results for every structure. The generation quality for the duplex structure is not as good as for the other two, because the duplex structure involves some complex textures, which is a severe test for the generator.


Table 4. ML-SFID in the CcGAN and Feature-Fusion CcGAN ablation experiments

Model                  equiaxed-1 with OM   duplex with OM   equiaxed-1 with EM
CcGAN                  0.5057               0.9694           0.4041
Feature-Fusion CcGAN   0.4218               0.6019           0.3071

In the second part of the ablation experiment, we compare CcGAN, Feature Fusion CcGAN, and TiAM-GAN. In order to test the effectiveness of the TiAMGAN for the microstructures with more complex textures, we selected three types of structure: duplex with electron microscopy(EM), basket-weave with optical microscopy(OM), and mixed structure with optical microscopy(OM). Qualitative. The qualitative results of the experiment are shown in Fig. 8. We can see that CcGAN has the worst effect, followed by Feature Fusion CcGAN, and TiAM-GAN has the best effect. Especially for basket-weave, other two networks learned nothing, but our TiAM-GAN is similar to original image. It shows our second sub-network Feature Extraction Mapping GAN can process microstructure images of complex textures better.

Fig. 8. Results of CcGAN, Feature Fusion CcGAN, TiAM-GAN ablation experiments

Quantitative. The quantitative results are shown in Table 5. Our TiAM-GAN achieves the best ML-SFID for every structure image. In particular, for the mixed structure, the ML-SFID of TiAM-GAN is over 80% better than that of the other networks.


Table 5. ML-SFID results for the second ablation experiment

Model                  duplex with EM   equiaxed-1 with EM   basket-weave with OM   Mixed with OM
CcGAN                  0.0929           0.33464              0.4487
Feature-Fusion CcGAN   0.0764           0.2826               0.1544
TiAM-GAN               0.0302           0.1017               0.0255

5 Conclusion

We propose a new model, TiAM-GAN, which accomplishes the task of generating multi-label microstructure images of titanium alloys. The network is divided into two sub-networks according to the complexity of the microstructure. We further propose the ML-SFID metric for quantitative evaluation. Finally, we conduct comparative and ablation experiments, both qualitatively and quantitatively, and find that our model achieves the best results in generating microstructure images of different types of titanium alloys.

References 1. Brock, A., Donahue, J., Simonyan, K.: Large scale GAN training for high fidelity natural image synthesis. arXiv preprint arXiv:1809.11096 (2018) 2. Chirico, C., Romero, A.V., Gordo, E., Tsipas, S.: Improvement of wear resistance of low-cost powder metallurgy β-titanium alloys for biomedical applications. Surf. Coat. Technol. 434, 128207 (2022) 3. Ding, X., Wang, Y., Xu, Z., Welch, W.J., Wang, Z.J.: CcGAN: continuous conditional generative adversarial networks for image generation. In: International Conference on Learning Representations (2020) 4. Fu, Y.Y., Song, Y.Q., Hui, S.X.: Research and application of typical aerospace titanium alloys. Chin. J. Met. 30, 850–856 (2006) 5. Goodfellow, I., et al.: Generative adversarial networks. Commun. ACM 63(11), 139–144 (2020) 6. Ho, J., Jain, A., Abbeel, P.: Denoising diffusion probabilistic models. Adv. Neural. Inf. Process. Syst. 33, 6840–6851 (2020) 7. Huang, Y., Wang, D.: Development and applications of titanium and titanium alloys for medical purposes. Rare Met. Mater. Eng. 34, 658 (2005) 8. Mirza, M., Osindero, S.: Conditional generative adversarial nets. Comput. Sci. 2672–2680 (2014) 9. Munawar, T., et al.: Fabrication of dual Z-scheme TiO2-WO3-CeO2 heterostructured nanocomposite with enhanced photocatalysis, antibacterial, and electrochemical performance. J. Alloy. Compd. 898, 162779 (2022) 10. Radford, A., Metz, L., Chintala, S.: Unsupervised representation learning with deep convolutional generative adversarial networks. arXiv preprint arXiv:1511.06434 (2015) 11. Salimans, T., Goodfellow, I., Zaremba, W., Cheung, V., Radford, A., Chen, X.: Improved techniques for training GANs. In: NIPS 2016: Proceedings of the 30th International Conference on Neural Information Processing Systems, pp. 2234– 2242 (2016)


12. Silverman, B.W.: Density Estimation for Statistics and Data Analysis. Chapman & Hall (1986) 13. Wang, H., et al.: RGB-depth fusion GAN for indoor depth completion. In: 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 6199–6208 (2022) 14. Yang, S., Jiang, L., Liu, Z., Loy, C.C.: Unsupervised image-to-image translation with generative prior. In: 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 18311-18320 (2022) 15. Yang, Z., Jha, D., Paul, A., Liao, W.K., Choudhary, A., Agrawal, A.: A general framework combining generative adversarial networks and mixture density networks for inverse modeling in microstructural materials design. arXiv:2101.10553 (2021) 16. Zhu, J.Y., Park, T., Isola, P., Efros, A.A.: Unpaired image-to-image translation using cycle-consistent adversarial networks. In: 2017 IEEE International Conference on Computer Vision (ICCV) (2017)

A Robust Detection and Correction Framework for GNN-Based Vertical Federated Learning Zhicheng Yang1,2 , Xiaoliang Fan1,2(B) , Zheng Wang1,2 , Zihui Wang1,2 , and Cheng Wang1,2 1

Fujian Key Laboratory of Sensing and Computing for Smart Cities, School of Informatics, Xiamen University, Xiamen 361005, People’s Republic of China {zcyang,zwang,wangziwei}@stu.xmu.edu.cn, {fanxiaoliang,cwang}@xmu.edu.cn 2 Key Laboratory of Multimedia Trusted Perception and Efficient Computing, Ministry of Education of China, Xiamen University, Xiamen 361005, People’s Republic of China

Abstract. Graph Neural Network based Vertical Federated Learning (GVFL) facilitates data collaboration while preserving data privacy by learning GNN-based node representations from participants holding different dimensions of node features. Existing works have shown that GVFL is vulnerable to adversarial attacks from malicious participants. However, how to defend against various adversarial attacks has not been investigated under the non-i.i.d. nature of graph data and privacy constraints. In this paper, we propose RDC-GVFL, a novel two-phase robust GVFL framework. In the detection phase, we adapt a Shapley-based method to evaluate the contribution of all participants in order to identify malicious ones. In the correction phase, we leverage historical embeddings to rectify malicious embeddings, thereby obtaining accurate predictions. We conducted extensive experiments on three well-known graph datasets under four adversarial attack settings. Our experimental results demonstrate that RDC-GVFL can effectively detect malicious participants and ensure a robust GVFL model against diverse attacks. Our code and supplemental material are available at https://github.com/zcyang-cs/RDCGVFL.

Keywords: GNN-based Vertical Federated Learning · Adversarial attack · Robustness

1 Introduction

Graph Neural Network (GNN) has gained increasing popularity for its ability to model high-dimensional feature information and high-order adjacent information on graphs [17]. In practice, privacy constraints prevent GNNs from learning better node representations when node features are held by different participants [6]. To enable collaborative improvement of node representation while protecting data privacy, GNN-based Vertical Federated Learning (GVFL) [2,3,12] employs GNNs locally for representation learning and Vertical Federated Learning (VFL) globally for aggregated representation.

Supplementary Information The online version contains supplementary material available at https://doi.org/10.1007/978-981-99-8435-0_8.

Fig. 1. An motivating example: an adversary can conduct various adversarial attacks: (a) manipulate the local connection relationship for fraudulent loan purposes by adding a connection from a low-credit user (user 1) to a high-credit user (user 0); and (b) upload the malicious embedding to deliberately misclassify the bank. To mitigate lending risks, the bank must implement effective defense measures.

In real-world applications, GVFL model is vulnerable to adversarial attacks [3]. For example, in Fig. 1, The adversary could compromise lending platform A and modify the uploaded embedding information to manipulate the bank’s eventual prediction of the user’s credit. Therefore, it is crucial for the bank to incorporate robust defensive measures, as failure to do so could result in loans being granted to users with lower credit. However, neither graph defenses nor federated learning defenses against adversarial attacks can be directly applied in GVFL. On one hand, privacy restrictions prevent graph defenders from accessing training data and models in GVFL, which are essential for effectively implementing defense mechanisms [7,8]. On the other hand, federated learning defenses encounter challenges when dealing with inter-dependencies among node embeddings within a graph structure. Existing approaches, such as Copur’s robust autoencoder [10,11], assume no dependency between embeddings of each sample, but the presence of interdependencies among node embeddings poses a hurdle for defenders to accurately learn the appropriate feature subspace. To address the issues above, we propose a novel Robust Detection and Correction Framework for GNN-based Vertical Federated Learning (RDCGVFL) to enhance the robustness of GVFL. RDC-GVFL has a detection phase and prediction phase. In the detection phase, the server first requires each participant to generate local embeddings to upload. Then, the server computes


each participant’s contribution to the validation set based on the Shapley Value, and identifies the participant with the lowest contribution as the malicious participant. In the correction phase, the server retrieves the relevant embeddings from historical embedding memory to correct malicious embeddings. Extensive experiments on three datasets against four adversarial attack settings confirm the effectiveness of RDC-GVFL. The main contributions of our work can be summarized as follows: – We propose RDC-GVFL, a novel robust framework for GVFL, which can enhance the robustness of GVFL under various attack scenarios. To the best of our knowledge, this is the first work dedicated to defending against adversarial attacks in GVFL. – We present a Shapley-based detection method, enabling the effective detection of malicious participants in GVFL. Additionally, we propose a correction mechanism that utilizes historical embeddings to generate harmless embeddings, thereby obtaining accurate predictions. – We conduct extensive experiments on three well-known graph datasets under four types of adversarial attacks. Experimental results demonstrate the effectiveness of the RDC-GVFL for a robust GVFL system.

2 Related Works

2.1 Attack and Defense in Graph Neural Networks

Extensive studies have demonstrated that GCNs are vulnerable to adversarial attacks [17]. These attacks can be classified into two categories based on the attack stage and goal. Evasion attacks (test-time) aim to deceive the GCN during inference and have been studied in works such as [20]. Poisoning attacks (training-time) occur during the training process and aim to manipulate the GCN's learned representations, as explored in works like [19]. To enhance the robustness of GCNs, many defenses have been proposed; based on their strategy, they can be classified into three categories: improving the graph [8], improving the training [5], and improving the architecture [4]. However, all of these existing defenses require the defender to inspect either the training data or the resulting model, which is impractical in GVFL.

2.2 Attack and Defense in Vertical Federated Learning

Existing attacks on VFL have been shown to undermine model robustness [13]. For instance, the passive participant can launch adversarial attacks with different objectives. In targeted attacks, specific labels are assigned to triggered samples [14]. On the other hand, in non-targeted attacks, noise is added to randomly selected samples, or missing features are introduced to impair the model’s utility [11]. For defense, RVFR [10] and Copur [11] is designed to defend against adversarial attacks. These defense approaches utilize feature purification techniques. However, they can not be directly applied to GVFL since the inter-dependencies among node embeddings in a graph structure makes it challenging to learn an appropriate feature subspace for defense purpose.

2.3 Attack and Defense in GNN-Based Vertical Federated Learning

GVFL has been found to be vulnerable to adversarial attacks, as highlighted in recent studies. One such attack method, called Graph-Fraudster, was proposed to perform evasion attacks in GVFL [3]. This attack assumes that a malicious participant can infer additional embedding information of a normal participant from the server. By inferring a perturbed adjacency matrix from normal embedding, the adversary can generate malicious embeddings. In practice, the adversary can employ various attack vectors to threaten GVFL and the defense against adversarial attacks in GVFL remains an open research issue.

3 Methodology

In this section, we first describe the GVFL system and its threat models. Next, we provide an overview of the proposed RDC-GVFL framework. Finally, we present the detection and correction methods of our framework. For convenience, the definitions of the symbols used in this paper are listed in Section A of the Supplemental Material.

3.1 GNN-Based Vertical Federated Learning

As described in [2], GVFL involves M participants and a server that collaboratively train a model based on a graph dataset G = {G1, ..., GM, Y}. Here, Y = {yj}, j = 1, ..., N, denotes the labels of the N nodes held by the server, and Gi = (V, Ei, Xi) represents the subgraph held by participant i, where Xi = {xij} denotes the features associated with the nodes V and edges Ei. The training process of GVFL is similar to that of VFL. First, each participant i extracts features and obtains the local embedding hij = fi(Ai, Xi; θi) of node j using its local graph neural network fi. Then, the server aggregates the node embeddings from all participants and generates the label prediction ŷj for node j using its global classifier f0.
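A minimal sketch of this forward pass is shown below. The concatenation aggregation, the PyG-style `g.x`/`g.edge_index` interface, and the class names are assumptions for illustration only; the paper does not fix a specific aggregation function here.

```python
import torch
import torch.nn as nn

class GVFLServer(nn.Module):
    """Global classifier f0: aggregate local node embeddings and predict labels."""
    def __init__(self, emb_dim, num_classes, num_participants):
        super().__init__()
        self.clf = nn.Linear(emb_dim * num_participants, num_classes)

    def forward(self, local_embeddings):            # list of (N, emb_dim) tensors
        agg = torch.cat(local_embeddings, dim=1)    # concatenation as one choice of Agg(.)
        return self.clf(agg)

def gvfl_forward(local_gnns, graphs, server):
    """Each participant i runs its local GNN fi on its subgraph; the server aggregates."""
    embeddings = [f(g.x, g.edge_index) for f, g in zip(local_gnns, graphs)]
    return server(embeddings)
```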

3.2 Threat Model

Adversary. There are M participants which possess distinct subsets of features from the same set of N training instances, along with one adversary attempting to compromise the GVFL system. For simplicity, we assume the adversary is one of the participants in GVFL, referred to as the malicious participant. Adversary’s Goal. The malicious participant aims to induce the GVFL model to predict the wrong class. This may serve the malicious participant’s interests, such as earning profits by offering a service that modifies loan application results at a bank or disrupting the normal trading system.


Adversary's Capability. We assume the training process is well-protected (attack-free). However, during the inference phase, the malicious participant can manipulate its local data to send specific poisoned embedded features to the server, or send arbitrary embedded features to the server. As described in [3,11], we characterize the malicious participant's attack vector into the following four types and assume each malicious participant performs the same attack per round; the last three are illustrated in the sketch after this list.

(a) Graph-Fraudster attack [3]: The malicious participant first steals normal embeddings from other participants, then adds noise to the stolen embeddings and computes an adversarial adjacency matrix $\hat{A}$ by minimizing the mean squared error between the embedded features $f(\hat{A}, x)$ and the noise-added embedding $h_m$. Finally, it sends the poisoned embedded features based on the adversarial adjacency matrix:
$$\hat{A} = \arg\min_{\hat{A}} \mathrm{MSE}\big(f(\hat{A}, x),\, h_m\big) \qquad (1)$$
$$h_{gf} = f(\hat{A}, x) \qquad (2)$$

(b) Gaussian feature attack: The malicious participant sends a malicious feature embedding perturbed by Gaussian noise:
$$h_{gauss} = h + \mathcal{N}(\mu, \Sigma) \qquad (3)$$

(c) Missing feature attack [11]: The malicious participant does not send any embedded feature to the server. This can occur when the participant's device is damaged or disconnected:
$$h_{miss} = 0 \qquad (4)$$

(d) Flipping feature attack [11]: The malicious participant sends an adversarial feature embedding $h_{flip}$ to mislead the server, whose magnitude $\lambda$ can be arbitrarily large:
$$h_{flip} = -h \cdot \lambda \qquad (5)$$
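The sketch below illustrates the three closed-form attacks of Eqs. (3)-(5); the Graph-Fraudster attack of Eqs. (1)-(2) requires an optimization loop and is omitted. The noise parameters and the lambda value are arbitrary illustrative defaults.

```python
import torch

def gaussian_attack(h, mu=0.0, std=1.0):
    """Eq. (3): add Gaussian noise to the honest embedding."""
    return h + mu + std * torch.randn_like(h)

def missing_attack(h):
    """Eq. (4): send no information (an all-zero embedding)."""
    return torch.zeros_like(h)

def flipping_attack(h, lam=10.0):
    """Eq. (5): send the sign-flipped embedding scaled by an arbitrarily large lambda."""
    return -h * lam
```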

3.3 Framework Overview

As shown in Fig. 2, we propose a two-phase framework called RDC-GVFL to enhance the robustness of GVFL. The Detection phase takes place between the model training and inference phases. During the Detection phase, the server collects the embeddings of participants from the validation dataset. Subsequently, the server calculates the contribution of each participant and identifies the one with the lowest contribution as the malicious participant. The Correction phase replaces the original model inference phase, and the server maintains an embedding memory that stores the embeddings of all training nodes. In this phase, the server utilizes normal embedding set as a query to retrieve the most similar embedding from historical embedding memory to correct the malicious embedding, thereby obtaining accurate predictions and sending them to all participants. The pseudo-code for RDC-GVFL is given in Algorithm 1.

102

Z. Yang et al.

Fig. 2. Overview of the RDC-GVFL framework including two phases: (a) detection phase: the server using a validation dataset to evaluate contribution and identify the malicious participant; and (b) correction phase: the server leveraging the historical embeddings to retrieve the relevant embedding to correct the malicious embedding for accurate predictions sent to all participants.

3.4 Malicious Participant Detection

In addition to the training and testing processes of GVFL, we introduce a Detection phase to detect the malicious participant. In this phase, we propose a Shapley-based method that consists of three stages: embedding collection, contribution evaluation, and malicious participant detection.

(a) Embedding collection. The server maintains a validation dataset D_val = {V_val, Y_val}, which is used to require each participant to generate a local embedding and send it to the server. A normal participant i sends the embedding h_v^i of node v ∈ V_val to the server, while the malicious participant m, unaware of the server's detection process, uploads a poisoned embedding h_v^m. As a result, the server collects a set of local embeddings aligned with its validation set.

(b) Contribution evaluation. We leverage the Shapley value, a fair contribution valuation metric derived from cooperative game theory, to evaluate contributions in GVFL. Specifically, we define a value function F_v(S) over a set of embeddings S, which indicates whether node v is classified correctly when the prediction is made using only the set S. The server then takes the total marginal value of each embedding h_v^i across all possible sets S as the contribution of participant i. This can be formalized as follows:
$$F_v(S) = \mathbb{I}\big(f_0(\mathrm{Agg}(S), \theta_0) = y_v\big) \qquad (6)$$
$$\phi_v^i(\{h_v^i\}_{i=1}^{M}) = \sum_{S \subseteq \{h_v^1,\dots,h_v^M\}\setminus\{h_v^i\}} \frac{|S|!\,(M-|S|-1)!}{M!}\,\Big(F_v(S \cup \{h_v^i\}) - F_v(S)\Big) \qquad (7)$$


(c) Malicious participant detection. After the embedding-collection and contribution-evaluation stages, the server accumulates the contributions of each participant over the validation set and identifies the participant with the lowest total contribution as the potential malicious participant m̂. Based on this identification, the server may impose penalties on the malicious participant. This can be formalized as follows:
$$\hat{m} = \arg\min_i \sum_{v=1}^{|D_{val}|} \phi_v^i(\{h_v^i\}_{i=1}^{M}) \qquad (8)$$

v=1

Malicious Embedding Correction

In the Correction phase, we leverage historical node embedding for correcting malicious embedding, which relies on the inter-dependency between nodes in the inference phase and nodes in the training phase. Specifically, during this phase, the server is already aware of the identity of the malicious participant, and it maintains a node embedding memory M ∈ RN ×M that stores the embeddings of all training nodes from all participants. When the GVFL model performs inference, i.e. each participant i sends its embedding hij of node j to the server requesting prediction, the server will utilizes normal embedding set {hkj } where k ∈ [M ] \ m as a query to retrieve the most similar embedding hm  from the node embedding memory M. Subsequently, the malicious embedding hm j is corrected . This process can be formulated as follows: or adjusted according to the hm  

hkj · hk

k∈[M ]\m

||hkj || · ||hk ||

 = arg max{ ∈[N ]

m hm j ← h

4

}

(9) (10)

Experiment

In this section, we carefully conduct comprehensive experiments to answer the following three research questions. RQ1 Does the Shapley-based detection method within the RDC-GVFL framework have the capability to detect malicious participants? RQ2 How does the performance of the RDC-GVFL framework compare to that of other defense methods? RQ3 How does discarding the malicious embedding directly without correcting it affect the robustness? 4.1

Experiment Settings

Datasets. We use three benchmark datasets, i.e., Cora, Cora ML [15] and Pubmed [16]. Dataset statistics are summarized in Table 1. The details of partition and evaluation strategies are described in the Section B of Supplemental Material.

104

Z. Yang et al.

Algorithm 1: RDC-GVFL

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16

Input : Number of participants M , Validation Dataset Dval , input embedding matrix Hj of node j Output: Prediction label yˆj of node j // Model Training Train the GVFL model to obtain the server model f0 and local GNN models fi ; Retrain the GVFL model to maintain the embedding memory M; m = ServerDetection(Dval ); // Model Inference yˆj = ServerCorrection(m, M, Hj ); ServerDetection(Dval ): Initialize contribution list C = [0] ∗ M ; for v = 1, ..., |Dval | do Randomly Sample S ⊂ [|Dval |]; for each participant i in parallel do if i belongs to malicious participant then Malicious i computes and sends {hik }k∈S to server accroding to (1)-(5); else participant i computes and sends {hik }k∈S to server; Server computes contribution ci of each participant i according to (6)-(7); for i = 1, ..., M do C[i] ← C[i] + ci ;

17

m ← arg min C;

18

return m;

19

ServerCorrection(m, M, Hj ): M Hj = [h1j ...hm j ...hj ]; Server using normal embedding looks up in the node embedding memory M to retrieve the most relevant embedding hm  according to (9)-(10); m hm ← h ; j  M Server gets the prediction yˆj = f0 (Agg(h1j , ..., hm j , ..., hj )); return yˆj ;

20 21 22 23 24

i

Baseline Methods. We compare with 1) Unsecured: no defense. 2) Krum [1]: As no prior research has focused on robust GVFL, We borrow the idea of Krum which is the most well-known robust algorithm in FL to identify the participant with the highest pair-wise distance as the malicious participant, and discard its embedding for defense. 3) RDC w/o Cor.: We employ the RDCGVFL framework to detect malicious participants, and subsequently, we directly discard the embedding associated with the identified malicious participant as a defense measure. 4) Oracle: where there is no attack. We use GCN [9] and SGC [18] as the local GNN model, the details of these models and parameter settings are shown in the Section B of Supplemental Material.

RDC-GVFL

105

Table 1. The statistics of datasets. Datasets #Nodes #Edges #Features #Classes #Average Degree Cora Cora ML Pubmed

2708 2810 19717

5429 7981 44325

1433 2879 500

7 7 3

2.00 2.84 2.25

Table 2. Detection Rate(%) / Test Accuracy(%) against different attacks. ’V1/V2’ represents scenarios where detection rate achieved 100% while  V1/V2 represents scenarios where detection rate is below 100%. Local Model Dataset

M=2

GCN M=3

M=4

M=2

SGC M=3

M=4

Oracle GF Gaussian Missing Flipping

−−/80.13 100/67.80 100/69.93 100/52.97 100/20.50

−−/79.97 100/70.00 89.0/68.97 100/65.53 100/33.50

−−/78.97 91.7/72.93 83.3/69.00 100/67.30 100/42.80

−−/77.90 100/61.37 100/37.20 100/63.10 100/21.93

−−/77.73 100/65.17 100/52.97 100/70.33 100/41.17

−−/77.53 100/68.57 100/54.97 100/72.03 100/50.70

Oracle GF Cora ML Gaussian Missing Flipping

−−/84.90 100/72.30 100/64.47 100/53.50 100/24.93

−−/84.93 78.0/74.77 78.0/63.50 100/63.80 100/39.33

−−/84.87 100/76.23 100/62.93 100/73.97 100/46.30

−−/83.30 100/65.33 100/35.73 100/72.20 100/24.03

−−/83.30 100/71.57 100/35.57 100/76.00 100/42.1

−−/83.27 100/74.37 100/35.87 100/77.97 100/53.53

Oracle GF Pubmed Gaussian Missing Flipping

−−/77.83 100/44.13 100/59.10 100/60.57 100/35.03

−−/78.33 −−/77.63 −−/75.80 −−/76.17 100/53.17 100/51.80 100/35.30 100/36.97 100/57.60 91.7/56.77 100/43.13 100/43.10 100/62.47 100/67.13 100/64.37 100/68.33 100/42.63 100/51.97 100/34.77 100/51.80

−−/75.63 100/42.47 100/43.30 100/67.40 100/55.67

Cora

4.2

Attack

Detection Performance(RQ1)

To answer Q1, we measured the detection rate of our detection method under the various attack settings, where the detection rate is calculated as the ratio of detected malicious participants to the total number of potential cases of malicious participants. We summarized the results in Table 2. Based on this results, we can draw several conclusions. – RDC-GVFL can effectively detects malicious participants across different attack scenarios: Our detection method exhibits a high detection rate for identifying malicious participants in diverse scenarios. Specifically, our method achieves a 100% recognition rate against Missing attack and Flipping attack, while in other attack scenarios, the recognition rate remains above 78%. – RDC-GVFL is more effective against more significant attacks: The effectiveness of identification increases as the attacks become more significant. For example, GF and Gaussian attacks exhibit higher effectiveness when the local model is SGC, and RDC-GVFL is capable of identifying the malicious participant in all these scenarios.

106

Z. Yang et al.

To better illustrate the effectiveness of the detection method, we visualize the contribution of each participant before and after the attack. These results can be found in Section C of Supplemental Material. Table 3. The accuracy (%) of each defense under five adversarial attack scenarios on multiple datasets with a varying number of participants. Local Model

Dataset

M

2

Cora

3

4

2

GCN

Cora ML 3

4

2

Pubmed

3

4

Attack Oracle GF Gaussian Missing Flipping Oracle GF Gaussian Missing Flipping Oracle GF Gaussian Missing Flipping Oracle GF Gaussian Missing Flipping Oracle GF Gaussian Missing Flipping Oracle GF Gaussian Missing Flipping Oracle GF Gaussian Missing Flipping Oracle GF Gaussian Missing Flipping Oracle GF Gaussian Missing Flipping

Defense Unsecured Krum Ours w/o Cor. Ours 80.10 67.80 69.93 52.97 20.50 80.00 70.00 68.97 65.53 33.50 79.00 72.93 69.00 67.30 42.80 84.90 72.30 64.47 53.50 24.93 84.93 74.77 63.50 63.80 39.33 84.87 76.23 62.93 73.97 46.30 77.83 44.13 59.10 60.57 35.03 78.33 53.17 57.60 62.47 42.63 77.63 51.80 56.77 67.13 51.97

53.79 39.70 47.60 32.50 28.01 64.80 52.97 65.54 49.16 32.47 62.36 56.68 67.32 53.18 40.63 51.55 42.75 44.95 32.17 28.67 54.41 47.27 63.76 32.15 23.64 69.64 60.43 74.07 53.56 28.64 61.00 46.68 53.35 47.26 37.18 56.43 54.30 62.29 54.70 34.02 63.76 56.50 67.66 57.27 41.74

54.10 52.99 52.99 52.99 52.99 64.80 65.54 64.20 65.54 65.54 68.53 67.31 66.29 67.32 67.32 56.59 53.46 53.76 53.46 53.46 71.60 64.45 65.46 63.76 63.76 76.89 74.07 74.07 74.07 74.07 64.20 60.35 60.35 60.35 60.35 70.00 62.29 62.29 62.29 62.29 70.70 67.66 67.47 67.66 67.66

70.93 70.76 70.76 70.76 70.76 74.26 74.23 72.54 74.23 74.23 73.33 73.86 72.61 74.35 74.35 78.09 77.63 71.10 77.63 77.63 82.29 75.19 75.00 79.89 79.89 82.69 82.06 82.06 82.06 82.06 71.56 71.85 71.85 71.85 71.85 74.76 72.78 72.78 72.78 72.78 76.16 73.17 72.15 73.17 73.17

4.3 Defense Performance (RQ2-RQ3)

To answer RQ2 and RQ3, we conducted a series of experiments with different defense methods under various attack scenarios. The results based on GCN are summarized in Table 3, and further results based on SGC can be found in Section D of the Supplemental Material. We draw the following observations from these experiments.
– RDC-GVFL exhibits a higher level of robustness than the other defense methods: RDC-GVFL consistently outperforms Unsecured and Krum under various attack scenarios. For instance, for GVFL with two participants on the Cora dataset, when a malicious participant performs the Flipping attack, our RDC-GVFL framework improves accuracy from 20.50% to 70.76%, while Krum only manages to raise it from 20.50% to 28.01%.
– Discarding malicious embeddings harms GVFL: RDC-GVFL without correction does not generally improve accuracy compared with Unsecured. This observation suggests that malicious embeddings still contain information useful for model prediction, emphasizing the importance of correcting rather than discarding them.
– Negligible loss of accuracy with RDC-GVFL: in the absence of an adversary, using the RDC-GVFL framework results in an accuracy degradation of less than 10%. In the presence of an adversary, the RDC-GVFL framework significantly enhances model accuracy and ensures the robustness of the GVFL system.

5 Conclusion

In this work, we introduced a novel robust framework of GVFL integrated with the Shapley-based detection and the correction mechanism to effectively defend against various attacks during inference in GVFL. In addition, the proposed detection method may have much wider applications due to the post-processing property. Through extensive experiments, we have demonstrated that our framework can achieve high detection rates for malicious participants and enhance the robustness of GVFL. For future work, we plan to extend our method to more scenarios and explore more effective defense methods for robust GVFL. Acknowledgement. This work was supported by Natural Science Foundation of China (62272403, 61872306), and FuXiaQuan National Independent Innovation Demonstration Zone Collaborative Innovation Platform (No.3502ZCQXT2021003).

References 1. Blanchard, P., El Mhamdi, E.M., Guerraoui, R., Stainer, J.: Machine learning with adversaries: byzantine tolerant gradient descent. In: Advances in Neural Information Processing Systems, vol. 30 (2017) 2. Chen, C., et al.: Vertically federated graph neural network for privacy-preserving node classification. arXiv preprint arXiv:2005.11903 (2020)


3. Chen, J., Huang, G., Zheng, H., Yu, S., Jiang, W., Cui, C.: Graph-Fraudster: adversarial attacks on graph neural network-based vertical federated learning. IEEE Trans. Comput. Soc. Syst. 10, 492–506 (2022) 4. Chen, L., Li, J., Peng, Q., Liu, Y., Zheng, Z., Yang, C.: Understanding structural vulnerability in graph convolutional networks. arXiv preprint arXiv:2108.06280 (2021) 5. Feng, W., et al.: Graph random neural networks for semi-supervised learning on graphs. Adv. Neural. Inf. Process. Syst. 33, 22092–22103 (2020) 6. Fu, X., Zhang, B., Dong, Y., Chen, C., Li, J.: Federated graph machine learning: a survey of concepts, techniques, and applications. ACM SIGKDD Explor. Newsl. 24(2), 32–47 (2022) 7. Jin, W., Ma, Y., Liu, X., Tang, X., Wang, S., Tang, J.: Graph structure learning for robust graph neural networks. In: Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 66–74 (2020) 8. Jin, W., Zhao, T., Ding, J., Liu, Y., Tang, J., Shah, N.: Empowering graph representation learning with test-time graph transformation. arXiv preprint arXiv:2210.03561 (2022) 9. Kipf, T.N., Welling, M.: Semi-supervised classification with graph convolutional networks. arXiv preprint arXiv:1609.02907 (2016) 10. Liu, J., Xie, C., Kenthapadi, K., Koyejo, S., Li, B.: RVFR: robust vertical federated learning via feature subspace recovery. In: NeurIPS Workshop New Frontiers in Federated Learning: Privacy, Fairness, Robustness, Personalization and Data Ownership (2021) 11. Liu, J., Xie, C., Koyejo, S., Li, B.: CoPur: certifiably robust collaborative inference via feature purification. Adv. Neural. Inf. Process. Syst. 35, 26645–26657 (2022) 12. Liu, R., Xing, P., Deng, Z., Li, A., Guan, C., Yu, H.: Federated graph neural networks: Overview, techniques and challenges. arXiv preprint arXiv:2202.07256 (2022) 13. Liu, Y., et al.: Vertical federated learning. arXiv:2211.12814 (2022) 14. Liu, Y., Yi, Z., Chen, T.: Backdoor attacks and defenses in feature-partitioned collaborative learning. arXiv preprint arXiv:2007.03608 (2020) 15. McCallum, A.K., Nigam, K., Rennie, J., Seymore, K.: Automating the construction of internet portals with machine learning. Inf. Retrieval 3, 127–163 (2000) 16. Sen, P., Namata, G., Bilgic, M., Getoor, L., Galligher, B., Eliassi-Rad, T.: Collective classification in network data. AI Mag. 29(3), 93–93 (2008) 17. Wu, B., et al.: A survey of trustworthy graph learning: reliability, explainability, and privacy protection. arXiv preprint arXiv:2205.10014 (2022) 18. Wu, F., Souza, A., Zhang, T., Fifty, C., Yu, T., Weinberger, K.: Simplifying graph convolutional networks. In: International conference on machine learning, pp. 6861– 6871. PMLR (2019) 19. Zhang, S., Chen, H., Sun, X., Li, Y., Xu, G.: Unsupervised graph poisoning attack via contrastive loss back-propagation. In: Proceedings of the ACM Web Conference 2022, pp. 1322–1330 (2022) 20. Z¨ ugner, D., Akbarnejad, A., G¨ unnemann, S.: Adversarial attacks on neural networks for graph data. In: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 2847–2856 (2018)

QEA-Net: Quantum-Effects-based Attention Networks Juntao Zhang1 , Jun Zhou1 , Hailong Wang1(B) , Yang Lei1 , Peng Cheng2 , Zehan Li3 , Hao Wu1 , Kun Yu1,3 , and Wenbo An1 1

3

Institute of System Engineering, AMS, Beijing, China [email protected] 2 Coolanyp L.L.C, Wuxi, China University of Electronic Science and Technology of China, Chengdu, China

Abstract. In the past decade, the attention mechanism has played an increasingly important role in computer vision. Such an attention mechanism can be regarded as a dynamic weight adjustment process based on features of the input image. In this paper, we propose Quantum-Effects-based Attention Networks (QEA-Net), simple yet effective attention networks that can be integrated into many network architectures seamlessly. QEA-Net uses quantum effects between two identical particles to enhance the global channel information representation of the attention module. Our method can consistently outperform SENet, with a lower number of parameters and computational cost. We evaluate QEA-Net through experiments on ImageNet-1K and compare it with state-of-the-art counterparts. We also demonstrate the effect of QEA-Net in combination with pre-trained networks on small downstream transfer learning tasks.

Keywords: Quantum mechanics · Attention mechanism · Image classification

1 Introduction

Quantum algorithms have been developed in the areas of chemistry, cryptography, machine learning, and optimization. Quantum computation uses the mathematical rules of quantum physics to redefine how computers create and manipulate data. These properties imply a radically new way of computing, using qubits instead of bits, and open the possibility of quantum algorithms that are substantially faster than classical algorithms. Quantum machine learning algorithms have been proposed that could offer considerable speedups over the corresponding classical algorithms, such as q-means [12], QCNN [13], and QRNN [1]. However, more advanced quantum hardware is needed to translate such theoretical results into substantial advantages for real-world applications, which might still be years away. The near-future era of quantum computing is limited by the small number of qubits, which are also error prone. QUILT [20] is a framework for multi-class classification designed to work effectively on current error-prone quantum computers. Another concept in the literature is quantum logic, which refers to the non-classical logical structure and logical system originating from the mathematical structure of quantum mechanics. For example, Garg et al. [7] proposed a knowledge-base embedding inspired by quantum logic, which allows answering membership-based complex logical reasoning queries. Unlike the above works, Zhang et al. [26] proposed a novel method called Quantum-State-based Mapping (QSM) to improve the feature calibration ability of the attention module in transfer learning tasks. QSM uses the wave function describing the state of microscopic particles to embed the feature vector into a high-dimensional space.

In a vision system, an attention mechanism can be treated as a dynamic selection process incorporating adaptive feature weighting based on the importance of the input. Attention mechanisms benefit many visual tasks, e.g., image classification, object detection, and semantic segmentation. In this paper, we propose a new attention method called Quantum-Effects-based Attention Networks (QEA-Net). We evaluate the proposed method on large-scale image classification using ImageNet and compare it with state-of-the-art counterparts. Convolutional neural networks (CNNs) have been the mainstream architectures in computer vision for a long time, until recently when new challengers such as Vision Transformers (ViT) [5] emerged. We train ResNet-50 with QEA-Net and adopt an advanced training procedure to achieve 79.5% top-1 accuracy on ImageNet-1K at 224×224 resolution in 300 epochs without extra data and distillation. This result is comparable to Vision Transformers under the same conditions. Considering the high complexity of self-attention modules in the vision transformer, simpler architectures (e.g., MLP-Mixer [21], ResMLP [22]) that stack only multi-layer perceptrons (MLPs) have attracted much attention. Compared with CNNs and Transformers, these vision MLP architectures involve less inductive bias and have the potential to be applied to more diverse tasks. There is a long history of transfer learning and domain adaptation methods in computer vision [4,18]. Fine-tuning a network model pre-trained on a dataset such as ImageNet on a new dataset is the most common strategy for knowledge transfer in the context of deep learning. We show the effectiveness of QEA-Net combined with the pre-trained MLP-Mixer in transfer learning tasks.

2 Related Works

2.1 Revisiting QSM

In order to highlight the plug-and-play ability of QSM, [26] use QSM only once at an appropriate layer in the module and do not change the data dimension. Specifically, assuming that the feature vector after pooling is $X \in \mathbb{R}^{1\times d}$, the probability density function of a one-dimensional particle in an infinite square well under the coordinate representation is used as a mapping:

$$|\Psi(X)|^2 = \mathrm{QSM}(X) = \frac{2}{a}\sum_{n=1}^{N} c_n \sin^2\!\left(\frac{n\pi}{a}X\right), \qquad a = \max(|X|) \tag{1}$$

This mapping does not change the dimension of X. As analyzed in [26], training $|\Psi(X)|^2$ allows the neural network to exploit the probability distribution of the global information, which is beneficial for generating more effective attention. Clearly, this method is straightforward but preliminary. Instead of simply using the wave function as the mapping, QEA-Net in this paper uses the quantum effect between two identical particles to represent the spatial context information of the feature map.
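As a concrete illustration, the minimal sketch below implements the mapping of Eq. (1) for a pooled feature tensor. It is only one reading of QSM: the coefficients c_n are not fixed by the description above, so uniform coefficients c_n = 1/N are assumed here, and the function names are ours rather than the authors' code.

import math
import torch

def qsm(x, n_terms=4):
    # x: pooled feature tensor of shape (B, d); returns |Psi(x)|^2 with the same shape (Eq. 1).
    a = x.abs().max()                                               # a = max(|X|)
    n = torch.arange(1, n_terms + 1, dtype=x.dtype).view(-1, 1, 1)  # n = 1..N
    c = torch.full((n_terms, 1, 1), 1.0 / n_terms, dtype=x.dtype)   # assumed coefficients c_n
    s = torch.sin(n * math.pi * x.unsqueeze(0) / a) ** 2            # sin^2(n*pi*X/a) for each term
    return (2.0 / a) * (c * s).sum(dim=0)

feat = torch.randn(8, 256)          # e.g., features after global average pooling
print(qsm(feat).shape)              # torch.Size([8, 256])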

2.2 Attention Mechanism in CNNs

Spatial attention can be seen as an adaptive spatial region selection mechanism. Among them, GENet [10] represents those that use a sub-network implicitly to predict a soft mask to select important regions. SENet [11] proposed a channel attention mechanism in which SE blocks comprise a squeeze module and an excitation module. Global spatial information is collected in the squeeze module by global average pooling. The excitation module captures channel-wise relationships and outputs an attention vector using fully connected (FC) layers and non-linear layers (ReLU and sigmoid). Then, each channel of the input feature is scaled by multiplying the corresponding element in the attention vector. Some later work attempts to improve the outputs of the squeeze module (e.g., GSoPNet [6]), improve both the squeeze module and the excitation module (e.g., SRM [15]), or integrate with other attention mechanisms (e.g., CBAM [24]). Qin et al. [19] rethought global information captured from the compression viewpoint and analyzed global average pooling in the frequency domain. They proved that global average pooling is a special case of the discrete cosine transform (DCT) and used this observation to propose a novel multi-spectral channel attention called FcaNet.

3 Quantum-Effects-based Attention Networks

In this section, we present a specific form of Quantum-Effects-based Attention Networks (QEA-Net). In order to highlight the role of quantum effects, we do not use a complex network structure. As shown in Fig. 1, the network consists of only one spatial attention module. The intermediate feature map is adaptively refined through our network (QEA-Net) at every block of the deep network.


Fig. 1. The overview of the QEA-Net.

Fig. 2. Diagram of spatial attention module.

Given an intermediate feature map $F \in \mathbb{R}^{C\times H\times W}$ as input, QEA-Net infers a 2D spatial attention map $M_s \in \mathbb{R}^{1\times H\times W}$. The attention process can be summarized as:

$$F' = M_s(F) \circ F \tag{2}$$

where $\circ$ denotes element-wise multiplication. During multiplication, the attention values are broadcast (copied) along the channel dimension. $F'$ is the refined output. Figure 2 depicts the computation process of the attention map.

3.1 Spatial Attention Module Based on Quantum Effects

In the attention mechanism, average-pooling is usually used to aggregate information, but [24] argues that max-pooling gathers another important clue about distinctive object features to infer finer attention. However, CBAM [24] concatenates the feature maps after the average-pooling and max-pooling operations, which ignores the intrinsic relationship between the two kinds of pooling. Therefore, we use the mutual constraints on the positions of two identical particles to aggregate information. In addition, the pooling operations ignore the distribution of spatial information in the channel dimension, so we use convolutions with small kernels instead of pooling to preserve this information. Let the feature maps after convolution be $F^{1\times1}_{Conv} \in \mathbb{R}^{1\times H\times W}$ and $F^{3\times3}_{Conv} \in \mathbb{R}^{1\times H\times W}$. Obviously, $F^{3\times3}_{Conv}$ focuses on the expression of the context between the pixels of the feature map, while $F^{1\times1}_{Conv}$ is the weighted information of a single pixel in the channel dimension. We consider $F^{1\times1}_{Conv}$ and $F^{3\times3}_{Conv}$ as position vectors of two identical free particles, both of which are in momentum eigenstates (with eigenvalues $k_\alpha$, $k_\beta$). Without exchange symmetry, the wave function of the two free particles can be expressed as:

$$\psi_{k_\alpha k_\beta}(r_1, r_2) = \frac{1}{(2\pi)^3}\, e^{i(k_\alpha\cdot r_1 + k_\beta\cdot r_2)} \tag{3}$$

where $r_1$ and $r_2$ represent the spatial coordinates of the two particles, respectively, and $\hbar$ is the reduced Planck constant. In order to facilitate the study of the distribution probability of relative positions, let:

$$r = r_1 - r_2, \quad k = (k_\alpha - k_\beta)/2, \quad R = \tfrac{1}{2}(r_1 + r_2), \quad K = k_\alpha + k_\beta \tag{4}$$

so Eq. 3 can be reduced to:

$$\psi_{k_\alpha k_\beta}(r_1, r_2) = \frac{1}{(2\pi)^{3/2}}\, e^{i(K\cdot R)}\, \varphi_k(r) \tag{5}$$

where

$$\varphi_k(r) = \frac{1}{(2\pi)^{3/2}}\, e^{i k\cdot r} \tag{6}$$

We only discuss the probability distribution of relative positions. The probability of finding another particle around a given particle in the spherical shell with radius $(r, r+dr)$ is:

$$4\pi r^2 P(r)\,dr \equiv r^2 dr \int |\varphi_k(r)|^2 \, d\Omega = \frac{4\pi}{(2\pi)^3}\, r^2 dr \tag{7}$$

so the probability density $P(r)$ is constant, independent of $r$. However, when we consider exchange symmetry, the wave function is:

$$\psi^{S}_{k_\alpha k_\beta}(r_1, r_2) = \frac{1}{\sqrt{2}}\,(1 + P_{12})\,\psi_{k_\alpha k_\beta}(r_1, r_2) \tag{8}$$

where $P_{12}$ is the exchange operator. We directly give the probability density of relative positions under the condition that exchange symmetry is satisfied:

$$P^{S}(r) = \frac{1}{(2\pi)^3}\left(1 + \frac{\sin 2kr}{2kr}\right) \tag{9}$$

Letting $x$ represent the relative distance between the two particles and omitting the constant factor $1/(2\pi)^3$, we get:

$$P(x) \propto 1 + \frac{\sin x}{x} \tag{10}$$

A more elaborate derivation of the above equations can be found in most quantum mechanics textbooks such as [9]. According to Eq. 10, the Quantum Effects (QE) operator can be expressed as:

$$QE\!\left(F^{1\times1}_{Conv} - F^{3\times3}_{Conv}\right) = \mathrm{ones}(H, W) + \frac{\sin\!\left(F^{1\times1}_{Conv} - F^{3\times3}_{Conv}\right)}{F^{1\times1}_{Conv} - F^{3\times3}_{Conv}} \tag{11}$$

where $\mathrm{ones}(H, W)$ represents an $H\times W$ matrix whose elements are all 1. As shown in Eq. 11, QE has no parameters to train, and its computation is negligible. As shown in Fig. 2, we aggregate information of a feature map by using two convolution layers, generating two 2D maps $F^{1\times1}_{Conv} \in \mathbb{R}^{1\times H\times W}$ and $F^{3\times3}_{Conv} \in \mathbb{R}^{1\times H\times W}$. The spatial attention is computed as:

$$M_s(F) = \mathrm{Sigmoid}\!\left(f^{7\times7}\!\left(QE\!\left(f^{1\times1}(F) - f^{3\times3}(F)\right)\right)\right) \tag{12}$$

where $f^{1\times1}$, $f^{3\times3}$ and $f^{7\times7}$ represent convolutions with filter sizes of $1\times1$, $3\times3$ and $7\times7$, respectively, with padding used to keep the size of the feature map unchanged.
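A minimal PyTorch sketch of the module defined by Eqs. (11)–(12) is given below. This is our own hedged re-implementation (class and variable names are assumptions, not the authors' released code); torch.sinc is used to evaluate sin(x)/x stably as x approaches zero.

import math
import torch
import torch.nn as nn

class QEASpatialAttention(nn.Module):
    # Sketch of the QEA spatial attention of Eqs. (11)-(12); names and details are assumed.
    def __init__(self, in_channels):
        super().__init__()
        self.conv1x1 = nn.Conv2d(in_channels, 1, kernel_size=1)              # f^{1x1}
        self.conv3x3 = nn.Conv2d(in_channels, 1, kernel_size=3, padding=1)   # f^{3x3}
        self.conv7x7 = nn.Conv2d(1, 1, kernel_size=7, padding=3)             # f^{7x7}

    def forward(self, f):
        d = self.conv1x1(f) - self.conv3x3(f)          # (B, 1, H, W) "relative position" map
        qe = 1.0 + torch.sinc(d / math.pi)             # Eq. (11): 1 + sin(x)/x, sinc handles x -> 0
        attn = torch.sigmoid(self.conv7x7(qe))         # Eq. (12): M_s(F)
        return attn * f                                # Eq. (2): broadcast along the channel dim

x = torch.randn(2, 64, 32, 32)
print(QEASpatialAttention(64)(x).shape)                # torch.Size([2, 64, 32, 32])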

4 Experiments

In this section, we first evaluate QEA-Net on large-scale image classification using ImageNet and compare it with state-of-the-art counterparts. Then, we show the effect of combining QEA-Net with a pre-trained MLP-Mixer [21] network on several downstream transfer learning tasks.

4.1 Implementation Details

To evaluate our QEA-Net on ImageNet-1K classification, we employ three widely used CNNs as backbone models, including ResNet-18, ResNet-34, and ResNet-50. To train ResNets with QEA-Net, we adopt the same data augmentation and hyperparameter settings. Specifically, the input images are randomly cropped to 224 × 224 with random horizontal flipping. The network parameters are optimized by stochastic gradient descent (SGD) with a weight decay of 1e-4, a momentum of 0.9, and a mini-batch size of 256. All models are trained for 100 epochs with an initial learning rate of 0.1, which is decreased by a factor of 10 every 30 epochs. As mentioned earlier, an attention module such as QEA-Net is inserted into the pre-trained network for feature recalibration to improve the model's performance in transfer learning tasks. In order to maintain consistency with the MLP-based architecture, the Spatial Gating Unit (SGU) [16], the Multi-Head Attention (MHA) [5] module, and the SE block [11] are selected as representatives of self-attention and channel attention modules. gMLP [16] is one of the state-of-the-art MLP-based architectures, and its authors attribute its performance to the effectiveness of SGU. The Multi-Head Attention (MHA) [5] module is widely used in transformer-based architectures. These three modules rely on basic matrix multiplication routines, changes to data layout (reshapes and transpositions), and scalar nonlinearities. We fine-tune the model, which is reimplemented in MMClassification [17], using momentum SGD, a learning rate of 0.06, a batch size of 512, gradient clipping at global norm 1, and a cosine learning rate schedule with a linear warmup. No weight decay is used when fine-tuning. These settings are consistent with MLP-Mixer, and the data augmentation is the same as above. All programs run on a server equipped with two RTX A6000 GPUs and two Xeon Gold 6230R CPUs.
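For reference, the sketch below illustrates the ImageNet optimization schedule just described (SGD with momentum 0.9, weight decay 1e-4, and a learning rate of 0.1 divided by 10 every 30 epochs). The tiny model and random tensors are stand-ins for the actual ResNet-with-QEA backbone and ImageNet loader, so this only demonstrates the schedule, not the full training pipeline.

import torch
import torch.nn as nn
from torch.optim import SGD
from torch.optim.lr_scheduler import StepLR

model = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.AdaptiveAvgPool2d(1),
                      nn.Flatten(), nn.Linear(8, 1000))    # stand-in for a ResNet with QEA-Net
criterion = nn.CrossEntropyLoss()
optimizer = SGD(model.parameters(), lr=0.1, momentum=0.9, weight_decay=1e-4)
scheduler = StepLR(optimizer, step_size=30, gamma=0.1)      # lr /= 10 every 30 epochs

for epoch in range(100):
    images = torch.randn(4, 3, 224, 224)                    # dummy mini-batch
    labels = torch.randint(0, 1000, (4,))
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
    scheduler.step()                                        # one scheduler step per epoch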


in transformer-based architectures. These three modules rely on basic matrix multiplication routines, changes to data layout (reshapes and transpositions), and scalar nonlinearities. We fine-tune the model, which is reimplemented by MMClassification [17], using momentum SGD, learning rate of 0.06, batch size of 512, gradient clipping at global norm 1, and a cosine learning rate schedule with a linear warmup. No weight decay is used when fine-tuning. The above settings are consistent with MLP-Mixer, and the method of data augmentation is the same as above. All programs run on a server equipped with two RTX A6000 GPUs and two Xeon Gold 6230R CPUs. Table 1. Comparison of different methods using ResNet-18 (R-18) and ResNet-34 (R34) on ImageNet-1K in terms of network parameters (Param.), floating point operations per second (FLOPs), and Top-1/Top-5 accuracy (in %). Method

4.2

CNNs Years

Param. GFLOPs Top-1 Top-5

ResNet R-18 SENet [11] CBAM [24] QEA-Net(ours)

CVPR 2016 11.15M 1.699 CVPR 2018 11.23M 1.700 ECCV 2018 11.23M 1.700 11.17M 1.700

70.28 70.51 70.60 70.55

89.33 89.75 89.88 89.92

ResNet R-34 SENet [11] CBAM [24] QEA-Net(ours)

CVPR 2018 20.78M 3.427 CVPR 2018 20.94M 3.428 ECCV 2018 20.94M 3.428 20.82M 3.428

73.22 73.69 73.82 73.76

91.40 91.44 91.53 91.52

Comparisons Using Different Deep CNNs

ResNet-18 and ResNet-34. We compare different attention methods using ResNet-18 and ResNet-34 on ImageNet-1K. The evaluation metrics include both efficiency (i.e., network parameters and floating point operations (FLOPs)) and effectiveness (i.e., Top-1/Top-5 accuracy). We reimplement ResNet, SENet, and CBAM, and the results are listed in Table 1. Table 1 shows that QEA-Net improves the original ResNet-18 and ResNet-34 by 0.27% and 0.54% in Top-1 accuracy, respectively. Compared with SENet and CBAM, our QEA-Net achieves competitive performance with the fewest parameters.

ResNet-50. ResNet-50 is one of the most widely adopted backbones, and many attention modules report their performance on ResNet-50. We compare our QEA-Net with several state-of-the-art attention methods using ResNet-50 on ImageNet-1K, including SENet [11], GENet [10], SRM [15], CBAM [24], A2-Nets [3], GCT [25], and FcaNet [19]. For comparison, we report the results of the other compared methods from their original papers, except FcaNet. Compared with state-of-the-art counterparts, QEA-Net obtains better or competitive results and achieves a balance between performance and efficiency (Table 2).

Table 2. Comparison of different methods using ResNet-50 (R-50) on ImageNet-1K in terms of network parameters (Param.), floating point operations (FLOPs), and Top-1/Top-5 accuracy (in %). †: GENet was trained for 300 epochs. ‡: GCT uses a warmup method. *: FcaNet uses more advanced training schedules, such as cosine learning rate decay and label smoothing; we reimplement it with the same settings as the other methods for a fair comparison.

Method | CNNs | Years | Param. | GFLOPs | Top-1 | Top-5
ResNet | R-50 | CVPR 2016 | 24.37M | 3.86 | 75.20 | 92.52
SENet [11] | R-50 | CVPR 2018 | 26.77M | 3.87 | 76.71 | 93.39
GENet† [10] | R-50 | NeurIPS 2018 | 25.97M | 3.87 | 77.30 | 93.73
A2-Nets [3] | R-50 | NeurIPS 2018 | 25.80M | 6.50 | 77.00 | 93.69
CBAM [24] | R-50 | ECCV 2018 | 26.77M | 3.87 | 77.34 | 93.69
SRM [15] | R-50 | ICCV 2019 | 25.62M | 3.88 | 77.13 | 93.51
GCT‡ [25] | R-50 | CVPR 2020 | 24.44M | 3.86 | 77.30 | 93.70
FcaNet-LF* [19] | R-50 | ICCV 2021 | 26.77M | 3.87 | 77.32 | 93.55
QEA-Net (ours) | R-50 | – | 24.52M | 3.86 | 77.19 | 93.53

Table 3. Comparison of different architectures on ImageNet-1K in terms of network parameters (Param.) and Top-1 accuracy (in %). *: The result was reimplemented using the code provided by the open-source toolbox MMClassification [17].

Method | Years | Resolution | Param. | Top-1
ViT-S/16-SAM [2] | ICLR 2022 | 224 × 224 | 22M | 78.1
ViT-B/16-SAM [2] | ICLR 2022 | 224 × 224 | 87M | 79.9
ResNet50 (rsb-A2)* [23] | NeurIPS 2021 Workshop | 224 × 224 | 26M | 79.2
QEA-ResNet50 (ours) | – | 224 × 224 | 26M | 79.5

Recently, most state-of-the-art vision models have adopted the Transformer architecture. To boost performance, these works resort to large-scale pre-training and complex training settings, resulting in excessive demands on data and computing and sophisticated tuning of many hyperparameters. Therefore, it is unfair to judge an architecture by its performance alone. We employ rsb-A2 [23], a strong training procedure, to train QEA-ResNet50 and ResNet50. Due to hardware limitations, we adjusted the mini-batch size to 512 according to the Linear Scaling Rule [8], which inevitably led to a performance reduction. All models were trained for 300 epochs without extra data. As shown in Table 3, the strong training procedure reduces the performance gap between QEA-ResNet50 and ResNet50, but QEA-ResNet50 still outperforms ResNet50. In addition, the performance of QEA-ResNet50 is competitive with ViT-SAM, especially when the number of parameters is close.

Table 4. Comparison of different models based on MLP-Mixer in terms of network parameters (Param.), floating point operations (FLOPs), and Top-1 accuracy.

Mixer-B/16 | base | + SGU | + SE | + MHA | + QEA
C-10 | 93.58% | 94.55% | 94.36% | 93.95% | 94.61%
C-100 | 78.54% | 82.48% | 82.29% | 82.37% | 82.50%
GFLOPs | 12.61 | 12.68 | 12.61 | 12.61 | 12.61
Param.(M) | 59.12 | 59.45 | 59.19 | 61.55 | 59.12

Mixer-L/16 | base | + SGU | + SE | + MHA | + QEA
C-10 | 91.03% | 94.54% | 94.39% | 91.50% | 93.88%
C-100 | 76.54% | 78.60% | 78.74% | 81.91% | 81.87%
GFLOPs | 44.57 | 44.69 | 44.57 | 44.57 | 44.57
Param.(M) | 207.18 | 207.75 | 207.31 | 211.47 | 207.18

Fig. 3. The MLP-Mixer architecture with QEA-Net.

4.3 Comparisons Using MLP-Mixer

We fine-tune the model and validate QEA-Net's performance on small downstream tasks. Specifically, as shown in Fig. 3, we replace the classifier head to fit the different tasks and add a QEA-Net before global average pooling. Kornblith et al. [14] point out that ImageNet pre-training can greatly accelerate convergence, so we train for 10 epochs on CIFAR-10 (C-10) and 15 epochs on CIFAR-100 (C-100). The fewer the epochs, the less the pre-trained network changes, which makes it easier to verify the feature recalibration ability of QEA-Net intuitively. As shown in Table 4, the SGU and SE blocks differ little in effect, but QEA-Net is better, especially in deeper networks and on datasets with more classes. MHA is also suitable for performing high-level feature recalibration in MLP architectures, but it is more complex than the other modules. To sum up, our method has lower model complexity than the state-of-the-art while achieving very competitive performance.
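To make the placement in Fig. 3 concrete, the sketch below shows one way to wrap a pre-trained backbone with a QEA module before global average pooling and a new classifier head. It reuses the QEASpatialAttention sketch from the example in Sect. 3.1, uses a toy stand-in backbone, and assumes the Mixer token sequence has already been reshaped back to a (B, C, H, W) map; it is an illustration, not the authors' implementation.

import torch
import torch.nn as nn

class MixerWithQEA(nn.Module):
    def __init__(self, backbone, channels, num_classes):
        super().__init__()
        self.backbone = backbone                      # pre-trained body returning (B, C, H, W)
        self.qea = QEASpatialAttention(channels)      # recalibration before pooling (Fig. 3)
        self.head = nn.Linear(channels, num_classes)  # replaced classifier head for C-10/C-100

    def forward(self, x):
        f = self.backbone(x)                          # (B, C, H, W) feature map
        f = self.qea(f)                               # spatial recalibration, Eq. (2)
        f = f.mean(dim=(2, 3))                        # global average pooling
        return self.head(f)

toy_backbone = nn.Conv2d(3, 768, kernel_size=16, stride=16)   # patchify-like stand-in
model = MixerWithQEA(toy_backbone, channels=768, num_classes=10)
print(model(torch.randn(2, 3, 224, 224)).shape)               # torch.Size([2, 10])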

5 Conclusion

This paper proposes an efficient attention method called QEA-Net. QEA-Net uses the quantum effect between two identical particles to represent spatial context information. The experimental results show that QEA-Net is a lightweight plug-and-play module that can improve the performance of various backbone networks and is suitable for various tasks. Future work will focus on exploring the application of time-dependent wave functions and quantum effects to attention mechanisms.

References

1. Bausch, J.: Recurrent quantum neural networks. In: Larochelle, H., Ranzato, M., Hadsell, R., Balcan, M., Lin, H. (eds.) Annual Conference on Neural Information Processing Systems 2020, NeurIPS 2020, December 6–12, 2020, virtual, pp. 1368–1379 (2020)
2. Chen, X., Hsieh, C.J., Gong, B.: When vision transformers outperform ResNets without pre-training or strong data augmentations. In: International Conference on Learning Representations (2022)
3. Chen, Y., Kalantidis, Y., Li, J., Yan, S., Feng, J.: A2-Nets: double attention networks. In: Annual Conference on Neural Information Processing Systems 2018, NeurIPS 2018, December 3–8, 2018, Montréal, Canada, pp. 350–359 (2018)
4. Csurka, G.: A comprehensive survey on domain adaptation for visual applications. In: Csurka, G. (ed.) Domain Adaptation in Computer Vision Applications, Advances in Computer Vision and Pattern Recognition, pp. 1–35. Springer (2017)


5. Dosovitskiy, A., et al.: An image is worth 16×16 words: transformers for image recognition at scale. In: International Conference on Learning Representations (2021)
6. Gao, Z., Xie, J., Wang, Q., Li, P.: Global second-order pooling convolutional networks. In: IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2019, Long Beach, CA, USA, June 16–20, 2019, pp. 3024–3033. Computer Vision Foundation / IEEE (2019)
7. Garg, D., Ikbal, S., Srivastava, S.K., Vishwakarma, H., Karanam, H.P., Subramaniam, L.V.: Quantum embedding of knowledge for reasoning. In: Wallach, H.M., Larochelle, H., Beygelzimer, A., d'Alché-Buc, F., Fox, E.B., Garnett, R. (eds.) Annual Conference on Neural Information Processing Systems 2019, NeurIPS 2019, December 8–14, 2019, Vancouver, BC, Canada, pp. 5595–5605 (2019)
8. Goyal, P., et al.: Accurate, large minibatch SGD: training ImageNet in 1 hour. CoRR abs/1706.02677 (2017). http://arxiv.org/abs/1706.02677
9. Griffiths, D.J.: Introduction to quantum mechanics. Am. J. Phys. 63(8) (2005)
10. Hu, J., Shen, L., Albanie, S., Sun, G., Vedaldi, A.: Gather-excite: exploiting feature context in convolutional neural networks. In: Annual Conference on Neural Information Processing Systems 2018, NeurIPS 2018, pp. 9423–9433 (2018)
11. Hu, J., Shen, L., Sun, G.: Squeeze-and-excitation networks. In: 2018 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2018, Salt Lake City, UT, USA, June 18–22, 2018, pp. 7132–7141. Computer Vision Foundation / IEEE Computer Society (2018)
12. Kerenidis, I., Landman, J., Luongo, A., Prakash, A.: q-means: a quantum algorithm for unsupervised machine learning. In: Wallach, H.M., Larochelle, H., Beygelzimer, A., d'Alché-Buc, F., Fox, E.B., Garnett, R. (eds.) Annual Conference on Neural Information Processing Systems 2019, NeurIPS 2019, December 8–14, 2019, Vancouver, BC, Canada, pp. 4136–4146 (2019)
13. Kerenidis, I., Landman, J., Prakash, A.: Quantum algorithms for deep convolutional neural networks. In: 8th International Conference on Learning Representations, ICLR 2020, Addis Ababa, Ethiopia, April 26–30, 2020. OpenReview.net (2020)
14. Kornblith, S., Shlens, J., Le, Q.V.: Do better ImageNet models transfer better? In: IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2019, Long Beach, CA, USA, June 16–20, 2019, pp. 2661–2671. Computer Vision Foundation / IEEE (2019)
15. Lee, H., Kim, H., Nam, H.: SRM: a style-based recalibration module for convolutional neural networks. In: 2019 IEEE/CVF International Conference on Computer Vision, ICCV 2019, Seoul, Korea (South), October 27 – November 2, 2019, pp. 1854–1862. IEEE (2019)
16. Liu, H., Dai, Z., So, D.R., Le, Q.V.: Pay attention to MLPs. In: Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, NeurIPS 2021, December 6–14, 2021, virtual, pp. 9204–9215 (2021)
17. MMClassification Contributors: OpenMMLab's image classification toolbox and benchmark. https://github.com/open-mmlab/mmclassification (2020)
18. Pan, S.J., Yang, Q.: A survey on transfer learning. IEEE Trans. Knowl. Data Eng. 22(10), 1345–1359 (2010)
19. Qin, Z., Zhang, P., Wu, F., Li, X.: FcaNet: frequency channel attention networks. In: 2021 IEEE/CVF International Conference on Computer Vision (ICCV), pp. 763–772 (2021)


20. Silver, D., Patel, T., Tiwari, D.: QUILT: effective multi-class classification on quantum computers using an ensemble of diverse quantum classifiers. In: Thirty-Sixth AAAI Conference on Artificial Intelligence, AAAI 2022, pp. 8324–8332 (2022)
21. Tolstikhin, I.O., et al.: MLP-Mixer: an all-MLP architecture for vision. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 24261–24272. Curran Associates, Inc. (2021)
22. Touvron, H., et al.: ResMLP: feedforward networks for image classification with data-efficient training. CoRR abs/2105.03404 (2021). https://arxiv.org/abs/2105.03404
23. Wightman, R., Touvron, H., Jegou, H.: ResNet strikes back: an improved training procedure in timm. In: NeurIPS 2021 Workshop on ImageNet: Past, Present, and Future (2021)
24. Woo, S., Park, J., Lee, J.-Y., Kweon, I.S.: CBAM: convolutional block attention module. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11211, pp. 3–19. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01234-2_1
25. Yang, Z., Zhu, L., Wu, Y., Yang, Y.: Gated channel transformation for visual recognition. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2020)
26. Zhang, J., et al.: An application of quantum mechanics to attention methods in computer vision. In: ICASSP 2023 – 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (2023)

Learning Scene Graph for Better Cross-Domain Image Captioning

Junhua Jia1, Xiaowei Xin1, Xiaoyan Gao1, Xiangqian Ding1, and Shunpeng Pang2(B)

1 Faculty of Information Science and Engineering, Ocean University of China, Shandong 266000, China
{jiajunhua,xinxiaowei,gxy9924}@stu.ouc.edu.cn, [email protected]
2 School of Computer Engineering, Weifang University, Shandong 261061, China
[email protected]

Abstract. Current image captioning (IC) methods achieve good results within a single domain, primarily because they are trained on a large amount of annotated data. However, the performance of single-domain image captioning methods suffers when they are extended to new domains. To address this, we propose a cross-domain image captioning framework, called SGCDIC, which achieves cross-domain generalization of image captioning models by simultaneously optimizing two coupled tasks, i.e., image captioning and text-to-image synthesis (TIS). Specifically, we propose a scene-graph-based approach, SGAT, for the image captioning task. The image synthesis task employs a GAN variant (DFGAN) to synthesize plausible images based on the text descriptions generated by SGAT. We compare the generated images with the real images to enhance image captioning performance in new domains. We conduct extensive experiments to evaluate the performance of SGCDIC, using MSCOCO as the source-domain data and Flickr30k and Oxford-102 as the new-domain data. Sufficient comparative experiments and ablation studies demonstrate that SGCDIC achieves substantially better performance than strong competitors on the cross-domain image captioning task.

Keywords: Image Captioning · Scene Graph · Text-to-Image Synthesis · Dual Learning

1 Introduction

With the development of deep learning, image captioning (IC) models have achieved increasingly impressive performance. However, it is worth noting that these IC models are trained and tested exclusively on single-domain datasets. The lack of training and testing on datasets from various domains limits their ability to generate novel and contextually relevant information. Especially when dealing with multi-object complex scenes in cross-domain image captioning tasks, the current mainstream image captioning methods perform poorly.


To address the above problems, we propose a scene-graph-based cross-domain image captioning method, dubbed SGCDIC, which fully leverages both the source and target domains. This is because scene graphs can effectively encode different regions, their attributes, and relationships in an image. To be specific, we propose SGAT (Scene Graph-based Approach for Image-to-Text Conversion), a method that excels at transforming images into text across various target scenes. Additionally, we utilize a GAN variant called DFGAN [22] to synthesize realistic images based on the text descriptions generated by SGAT. By comparing the rendered images with real images, we enhance the performance of the IC model SGAT in novel domains. Our contributions are as follows:
(1) We leverage scene graphs to encode the main regions, their attributes, and relationships in an image.
(2) We propose a novel dual-learning mechanism to simultaneously proceed with two coupled tasks, i.e., image captioning and text-to-image synthesis. Cross-domain extension of image captioning is achieved based on the text-to-image synthesis task.
(3) We conduct extensive experiments to evaluate the performance of SGCDIC. We employ MSCOCO as the source domain, and two other datasets, namely Flickr30k and Oxford-102, as the target domains. Through comprehensive evaluation, our proposed method, SGCDIC, demonstrates significant improvements over existing methods in terms of performance and effectiveness.

2 Related Work

2.1 Scene Graph

Scene graphs capture the structured semantic information of images, encompassing the knowledge of objects, their attributes, and pairwise relationships. Consequently, scene graphs serve as a structured representation for various vision tasks, including VQA [6], image generation [10,14], and visual grounding [19]. Yang et al. [24] proposed a scene-graph-based solution for unpaired IC to solve the problem of a lack of paired data by learning a shared dictionary that includes these language inductive biases. Recognizing the potential of leveraging scene graphs in vision tasks, several approaches have been proposed to improve scene graph generation from images. Conversely, some researchers have explored the extraction of scene graphs solely from textual data. In our study, we employ CogTree [27] for parsing scene graphs from images and SPICE [1] for parsing scene graphs from captions. These techniques enable us to effectively utilize scene graphs. By incorporating scene graphs into our framework, we enhance the understanding of the image content and facilitate the generation of more descriptive and contextually rich captions.

2.2 Cross-Domain Image Captioning

Cross-domain IC has received significant attention and has achieved remarkable results, as highlighted by several notable studies [5,7,29,30]. The work in [5] proposes an adversarial training strategy to generate cross-domain captions by leveraging labeled and unlabeled data. Zhao et al. [30] introduce a cross-modal retrieval-based method for cross-domain IC. Their approach utilizes a cross-modal retrieval model to generate pseudo pairs of images and sentences in the target domain, which aids in the adaptation of the captioning model to the target domain. Yuan et al. [29] present a novel approach called Style-based Cross-domain Image Captioner for IC. Their method integrates discriminative style information into the encoder-decoder framework and treats an image as a unique sentence based on external style instructions. Dessi et al. [7] introduce a straightforward fine-tuning approach to enhance the discriminative quality of captions generated by cross-domain IC models. In our work, we propose a scene-graph-based dual-learning mechanism that enables cross-domain IC models to achieve excellent performance.

3 Methods

To address the limitations of image captioning and achieve cross-domain generalization of the proposed IC model, we introduce a novel scene-graph-based learning mechanism, SGCDIC, built on the IC and TIS processes. SGCDIC is a multitask system consisting of two key components. (i) Image Captioning Task: we propose a scene-graph-based image captioning model, SGAT, for the image captioning task, which generates textual descriptions for input images. Specifically, we encode the input raw image into image patches (vector representations) with positional information using ViT [8], capture visual features using ResNet-101 [9], incorporate scene graph information obtained through CogTree [27], and feed them into a GRU. The initial textual description of the image is generated by the GRU decoder. Subsequently, we input the generated text description into SeqGAN [28] with a back-translation module, utilizing external knowledge from a CLIP-based knowledge base and the scene graph information obtained earlier to further refine the text description and align it with human cognition. (ii) Text-to-Image Synthesis Task: we employ DFGAN [22] to generate images based on the corresponding text descriptions. In DFGAN, a generative model G synthesizes realistic images based on the text descriptions, while a discriminative model D attempts to distinguish between real images from the training data and images generated by G. The optimization of G and D through a minimax game ultimately guides G to generate realistic and high-quality images. Finally, the generated images are fine-tuned using CLIP-based knowledge priors to align with human perceptual preferences.

When the image-text conversion task starts with the image captioning task M, the following process is performed: (1) Using M to represent the image captioning task, an intermediate text description t_mid is generated for each image i in the target domain. This is achieved by the LSTM decoder P(t_mid | i; θ_M), where θ_M denotes the parameters of M. (2) The text-to-image synthesis task is then applied to the intermediate text description t_mid to generate the image i' using the text-to-image generator G_{θ_N^g}(z_noise, t_mid), where θ_N^g represents the generator parameters involved in the N process. (3) The losses computed from the two generation processes are used to update the image captioning model M and the text-to-image synthesis model N. The entire process is repeated multiple times until the parameters of both tasks converge to a threshold.

Similarly, when the task starts with text-to-image synthesis N, the process is as follows: (1) The generator G_{θ_N^g}(z_noise, t) in DFGAN is used to generate an intermediate image i_mid based on the text description t. (2) Then, the description t is re-generated from the intermediate image i_mid using P(t | i_mid; θ_M). (3) The losses computed from the above generation process are used to update M and N. The entire process is repeated multiple times until the parameters of both tasks converge to a threshold.

In our study, we employ reinforcement learning techniques, specifically policy gradients, to maximize the long-term rewards of the image captioning (IC) task M and the text-to-image synthesis (TIS) task N. The following components of the reinforcement learning process are introduced: state, policy, action, and reward.

State: The state represents the information describing the system or environment. In our tasks, the distributed representation of an image i ∈ I can be viewed as the state s_i for the IC process M, while the distributed representation of a text description t ∈ T can be viewed as the state s_t for the TIS process N. The selection of states is crucial for the performance and effectiveness of the tasks.

Policy: The policy is the probability distribution of actions taken by the subtask model given a state. We adopt the stochastic policy gradient approach, which approximates the policy through an independent function approximator. Specifically, we use the LSTM decoder P(t | i; θ_M) as the policy for M and the generator G_{θ_N^g}(z_noise, t_mid) in DFGAN as the policy for N. The goal of the policy is to generate the best actions based on the current state: the input to the policy is the state, and the output is the action.

Action: The action is the decision taken by the model given a state. For the image captioning task, the generated description t conditioned on the input image i is considered the action for M. For the TIS task, the synthesized image i' generated based on the input description t is considered the action for N.

Reward: The reward is the feedback signal obtained by the model based on the actions taken and the current state. The reward can be designed based on the specific objectives of the tasks. Policy gradient algorithms are reinforcement learning techniques that optimize the policy via gradients on the expected reward to obtain long-term rewards. We discuss the main factors contributing to the rewards for image captioning M and image synthesis N below. In the image captioning task, the reward can be evaluated based on the similarity between the generated description and the ground-truth description. In the TIS task, the reward can be evaluated based on the similarity between the generated image and the ground-truth image.

3.1 The Principle of SGCDIC

When the image-text transformation task starts with M, the reward r_M for the image captioning task M is calculated as follows.

Reconstruction Reward. We use the reconstruction reward to evaluate the intermediate description t_mid generated by the image captioning model P(t_mid | i; θ_M). Using the generator G_{θ_N^g}(z_noise, t_mid) in DFGAN, we generate an image i' from t_mid (referred to as the reconstructed image). To compute the reconstruction reward, we measure the similarity between the original image i and the reconstructed image i'. In these processes, the scene graph serves as an intermediate bridge: before generating the intermediate description t_mid, the image i is transformed into a scene graph, and before generating the image i' from the intermediate description t_mid, the intermediate description is likewise transformed into a scene graph, as shown in Fig. 1. In this study, we use the negative squared difference to measure the similarity between i and i':

$$r_M^{rec} = -\left\| i - i' \right\|^2 \tag{1}$$

Evaluation Metric Reward. Given an image-text pair (i, t) in the target domain that belongs to the dataset D_wp, the evaluation metric is used as the evaluation metric reward for M, calculated as:

$$r_M^{eval} = \frac{\mathrm{BLEU}(t_{mid}, t) + \mathrm{CIDEr}(t_{mid}, t)}{2} \tag{2}$$

We linearly combine the reconstruction reward r_M^{rec} and the evaluation metric reward r_M^{eval} as the final reward for M in each iteration:

$$r_M = \begin{cases} \lambda_M^1 r_M^{rec} + \lambda_M^2 r_M^{eval}, & \text{if } (x, y) \in D_{wp}, \ \text{where } \lambda_M^1 = \lambda_M^2 = 0.5 \\ \lambda_M^1 r_M^{rec}, & \text{if } (x, y) \in D_{wu}, \ \text{where } \lambda_M^1 = 1.0 \end{cases} \tag{3}$$

When the task is initiated by the subtask N, the reward r_N is calculated as follows.

Reconstruction Reward. Given the intermediate image i_mid generated by the generator G_{θ_N^g}(z_noise, t) of DFGAN (conditioned on the caption t), we calculate the log-probability of reconstructing the caption from i_mid, denoted P(t | i_mid; θ_M), as the reconstruction reward. The reconstruction reward for the image generation process N is defined as:

$$r_N^{rec} = \log P(t \mid i_{mid}; \theta_M) \tag{4}$$

Similarity Reward. Given the text-image pair (i, t) in the target domain, we calculate the similarity between the intermediate image i_mid and the original image i. The similarity is defined as the negative squared difference between the two images:

$$r_N^{sim} = -\left\| i - i_{mid} \right\|^2 \tag{5}$$

Finally, the reward r_N for the TIS task N is calculated as a combination of the reconstruction reward r_N^{rec} and the similarity reward r_N^{sim}:

$$r_N = \begin{cases} \lambda_N^1 r_N^{rec} + \lambda_N^2 r_N^{sim}, & \text{if } (x, y) \in D_{wp}, \ \text{where } \lambda_N^1 = \lambda_N^2 = 0.5 \\ \lambda_N^1 r_N^{rec}, & \text{if } (x, y) \in D_{wu}, \ \text{where } \lambda_N^1 = 1.0 \end{cases} \tag{6}$$
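A minimal sketch of the reward shaping in Eqs. (1)–(3) is given below. The function name is ours, and metric_score stands in for the (BLEU + CIDEr)/2 score of Eq. (2); it illustrates the paired/unpaired weighting only, not the full training pipeline.

from typing import Optional
import torch

def reward_for_captioner(real_img: torch.Tensor, recon_img: torch.Tensor,
                         metric_score: Optional[float] = None,
                         lam1: float = 0.5, lam2: float = 0.5) -> float:
    # Eq. (1): reconstruction reward as the negative squared difference between images
    r_rec = -torch.sum((real_img - recon_img) ** 2).item()
    if metric_score is None:
        # unpaired data (D_wu): only the reconstruction term, with lambda_1 = 1.0 (Eq. 3)
        return r_rec
    # paired data (D_wp): weighted sum of reconstruction and metric rewards (Eq. 3)
    return lam1 * r_rec + lam2 * metric_score

img = torch.rand(3, 64, 64)
recon = torch.rand(3, 64, 64)
print(reward_for_captioner(img, recon, metric_score=0.8))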

3.2 Parameters Updating

M and N can be learned to maximize the expected long-term rewards E[r_M] and E[r_N]. First, when the task starts with M, we update the models by computing the gradients of the expected reward E[r_M] with respect to the parameter sets θ_M and θ_N^g, according to the policy gradient theorem:

$$\nabla_{\theta_M} E[r_M] = E\left[ r_M \nabla_{\theta_M} \log P(t_{mid} \mid i; \theta_M) \right] \tag{7}$$

$$\nabla_{\theta_N^g} E[r_M] = E\left[ 2\lambda_M^1 (i - i')\, \nabla_{\theta_N^g} G_{\theta_N^g}(z_{noise}, t_{mid}) \right] \tag{8}$$

Next, when the task starts with N, we update the models by computing the gradients of the expected reward E[r_N] with respect to the parameter sets θ_N^g and θ_M, again based on the policy gradient theorem:

$$\nabla_{\theta_N^g} E[r_N] = E\left[ r_N \nabla_{\theta_N^g} G_{\theta_N^g}(z_{noise}, i_{mid}) \right] \tag{9}$$

$$\nabla_{\theta_M} E[r_N] = E\left[ \lambda_N^1 \nabla_{\theta_M} \log P(t_{mid} \mid i; \theta_M) \right] \tag{10}$$

Once we have learned a high-quality generator G_{θ_N^g}, the discriminator model D_{θ_N^d} is retrained as follows:

$$\max_{\theta_N^d} J(\theta_N^d) = \max_{\theta_N^d} \Big( E\left[ \log D_{\theta_N^d}(i, t) \right] + E\left[ \log\left(1 - D_{\theta_N^d}(i, t_{mis})\right) \right] + E\left[ \log\left(1 - D_{\theta_N^d}\!\left(G_{\theta_N^g}(z_{noise}, t),\, t\right)\right) \right] \Big) \tag{11}$$

We divide the input to the discriminator D into three categories: synthesized images with random descriptions D_{θ_N^d}(G_{θ_N^g}(z_noise, t), t), real images matching the description D_{θ_N^d}(i, t), and real images with mismatched descriptions D_{θ_N^d}(i, t_mis).

The gradient of the objective function J(θ_N^d) with respect to θ_N^d is computed as follows:

$$\nabla_{\theta_N^d} J(\theta_N^d) = E\left[ \nabla \log D_{\theta_N^d}(i, t) \right] + E\left[ \nabla \log\left(1 - D_{\theta_N^d}(i, t_{mis})\right) \right] + E\left[ \nabla \log\left(1 - D_{\theta_N^d}\!\left(G_{\theta_N^g}(z_{noise}, t),\, t\right)\right) \right] \tag{12}$$

When computing the stochastic gradients of E[r_M] and E[r_N] with respect to the parameters θ_M, θ_N^g, and θ_N^d, we use the Adam optimizer to update these parameters. The two components of SGCDIC are performed iteratively. In each iteration, we select an image and use the image captioning model M to generate its corresponding text description; then we use N to reconstruct the image from the text description. Similarly, we select a text description, use the text-to-image generator N to generate the corresponding image, and then reconstruct the text description using the captioning model M. The parameters of M and N are updated at each iteration to further improve their performance. This process is repeated multiple times until a predetermined stopping condition is met.
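The toy sketch below shows the shape of this alternating loop only. In the paper the captions are discrete and the updates use the policy-gradient rewards of Eqs. (1)–(10), whereas here two differentiable stand-in modules and plain reconstruction losses are used purely to illustrate the two closed loops (image → caption → image and caption → image → caption) being optimized jointly with Adam.

import torch

captioner = torch.nn.Linear(16, 16)   # toy stand-in for M (SGAT): "image" -> "caption" embedding
generator = torch.nn.Linear(16, 16)   # toy stand-in for N (DFGAN): "caption" -> "image" embedding
opt_m = torch.optim.Adam(captioner.parameters(), lr=1e-4)
opt_n = torch.optim.Adam(generator.parameters(), lr=1e-4)

for step in range(100):
    image = torch.randn(4, 16)        # dummy "image" features
    caption = torch.randn(4, 16)      # dummy "caption" features

    # M -> N loop: caption the image, re-render it, penalize the reconstruction error
    t_mid = captioner(image)
    i_recon = generator(t_mid)
    loss_m = ((image - i_recon) ** 2).mean()

    # N -> M loop: render the caption, re-caption it, penalize the reconstruction error
    i_mid = generator(caption)
    t_recon = captioner(i_mid)
    loss_n = ((caption - t_recon) ** 2).mean()

    opt_m.zero_grad()
    opt_n.zero_grad()
    (loss_m + loss_n).backward()
    opt_m.step()
    opt_n.step()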

3.3 Object Geometry Consistency Losses and Semantic Similarity

We can optimize the IC process by calculating object geometry consistency losses between the image I' generated from the text description T and the real image I. This can be represented as I → T_mid → I'. For 2D static images, the object geometry consistency loss is calculated as:

$$L_{seg} = L_{static} + L_{smooth} = \frac{1}{N}\left\| \sum_{k=1}^{K} o_k * (T_k \circ i) - (i + m) \right\|_2 + \frac{1}{N}\frac{1}{H}\sum_{k=1}^{K}\sum_{h=1}^{H} d\!\left(o_n^{i}, o_{hn}^{i}\right) \tag{13}$$

Given an input image set I, the generated image I+M, the object masks O, and the deviation M, we introduce the above objective to achieve geometric consistency between the input image I and the text-based generated image I+M. It is important to note that the initial object mask O is meaningless at the beginning and needs to be optimized appropriately. For the k-th object, we need to estimate its transformation matrix T_k ∈ R^{4×4}.

For the text description T and image I, we can optimize the TIS process by calculating the semantic similarity between the generated text description T' and the real text description T based on cosine similarity. This can be represented as T → I_mid → T'. The semantic similarity of text is computed using cosine similarity as follows:

$$L_{sim} = 1 - L_{cos}, \qquad L_{cos} = \frac{T \cdot T'}{\|T\|_2 \, \|T'\|_2} \tag{14}$$
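For concreteness, a small sketch of the semantic-similarity loss of Eq. (14) on sentence embeddings follows. How T and T' are embedded (e.g., with a CLIP text encoder) is not specified here and is an assumption; the function name is ours.

import torch
import torch.nn.functional as F

def semantic_similarity_loss(t_real: torch.Tensor, t_gen: torch.Tensor) -> torch.Tensor:
    l_cos = F.cosine_similarity(t_real, t_gen, dim=-1)   # T·T' / (||T||_2 ||T'||_2)
    return (1.0 - l_cos).mean()                          # L_sim = 1 - L_cos

emb_real, emb_gen = torch.randn(4, 512), torch.randn(4, 512)
print(semantic_similarity_loss(emb_real, emb_gen))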


Fig. 1. The principle of SGCDIC. SGAT is shown in Fig. 2; DFGAN is described in [22]. Pink represents images, scene graphs, and text descriptions. Orange solid arrows represent the dual learning process starting with the IC process, while blue solid arrows represent the dual learning process starting with the TIS process. Please refer to Sect. 3.1 for the dashed arrows in these two colors. Specifically, the real image I is transformed into a scene graph, passed through SGAT, and a text description T_mid is generated; T_mid is then transformed into a scene graph, passed through DFGAN, and the image I' is generated. The real text description T is transformed into a scene graph, passed through DFGAN, and the image I_mid is generated; I_mid is then transformed into a scene graph, passed through SGAT, and the text description T' is generated. Purple represents the computation of object geometry consistency losses for different images and semantic similarity for different text descriptions, which optimize SGAT and DFGAN, respectively. Please refer to Sect. 3.3 for the detailed computation.

4 Experiments and Results Analysis

4.1 Datasets and Implementation Details

We employ MSCOCO [18] as the source domain, while Flickr30k [26] and Oxford-102 [21] are the target domains. We describe these three datasets below. (1) MSCOCO: it consists of 82,783 training images, 40,504 validation images, and 40,775 image-text pairs for testing. Each image is associated with five captions manually created by Amazon Mechanical Turk workers. To ensure a fair comparison with other methods, we adopt the commonly used data split strategy proposed in [15]: 5,000 images each for validation and testing, with the remaining images used for training. (2) Flickr30k: it contains a total of 31,783 images with five textual descriptions per image. We follow the same data split as [15], using 29,000 images for training and 1,000 images each for validation and testing. (3) Oxford-102: this dataset consists of 8,189 flower images from 102 categories. Following the strategy proposed in [15], we use 1,000 images for validation and testing, respectively, and the remaining images for training.


Fig. 2. Vanilla architecture of the proposed model SGAT. For simplicity, CBAM refers to the convolutional block attention module [23]. Patch embeddings with position embeddings are obtained through the Vision Transformer [8], and ResNet-101 [9] is used to capture visual features; the Global CBAM then yields informative visual features of the main objects in the image. DETR [3] is used to detect objects for scene graph generation, and CogTree [27] is then employed to generate scene graphs. The Local CBAM module captures the relationships among different scene graphs. All the above information is fed into a simplified LSTM (GRU) and concatenated, and the result is passed to the SeqGAN module. Throughout, the scene graphs provide explicit guidance on semantic structure design and refinement for description generation.

During the preprocessing stage, we utilize the publicly available code at https://github.com/karpathy/neuraltalk for text description preprocessing. All descriptions are converted to lowercase, and non-alphanumeric characters are discarded. For these three datasets, we prune the vocabulary by filtering out words with a frequency of less than 4, and include special sentence start (BOS) and end (EOS) tokens. If a word is not present in the vocabulary, it is replaced with the "UNK" token. All experiments are conducted on a server equipped with four NVIDIA Tesla V100 GPUs, the Ubuntu 16.04.6 LTS operating system, 64 GB of memory, Python 3.8.3, CUDA 11.2, and PyTorch 1.7.0. The parameters of the SGAT network are initialized from a Gaussian distribution N(0, 0.01), while other parameters are initialized from a uniform distribution [-0.01, 0.01]. The word embedding size is set to 300. Batch normalization is applied to all convolutional layers with a decay rate of 0.9 and ε = 1 × 10⁻⁵. For the DFGAN network, the Adam optimizer is used with β₁ = 0 and β₂ = 0.9. The learning rate is set to 0.0001 for the generator and 0.0004 for the discriminator, according to the Two Timescale Update Rule [11]. During the large-scale training of SGCDIC, the batch size is set to 100 and the learning rate to 2 × 10⁻⁴. To prevent overfitting, dropout and L2 regularization are employed, with a dropout rate of 0.2 and a weight decay of 0.001. The optimization algorithm used is Adam.
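The preprocessing just described (lowercasing, discarding non-alphanumeric characters, BOS/EOS tokens, and a minimum word frequency of 4 with an "UNK" fallback) can be sketched as follows; the token spellings and function names are assumptions for illustration, not the released code.

import re
from collections import Counter

def tokenize(caption):
    caption = caption.lower()
    caption = re.sub(r"[^a-z0-9 ]", " ", caption)   # discard non-alphanumeric characters
    return ["BOS"] + caption.split() + ["EOS"]      # add sentence start/end tokens

def build_vocab(captions, min_freq=4):
    counts = Counter(tok for c in captions for tok in tokenize(c))
    words = ["UNK"] + [w for w, n in counts.items() if n >= min_freq]
    return {w: i for i, w in enumerate(words)}      # out-of-vocabulary words map to "UNK"

corpus = ["This flower has round, dark magenta petals."] * 4
vocab = build_vocab(corpus)
print(len(vocab), vocab.get("flower", vocab["UNK"]))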


The optimizer parameters for the generator are α = 10⁻⁴, β₁ = 0, β₂ = 0.999, ε = 10⁻⁶, and for the discriminator α = 4 × 10⁻⁴, β₁ = 0, β₂ = 0.999, ε = 10⁻⁶. To assess the quality of descriptions generated by the cross-domain model, we employ rule-based automatic evaluation methods [13], such as BLEU, ROUGE, METEOR, and CIDEr. These metrics provide quantitative measures of the similarity and coherence between the generated descriptions and the ground-truth references.
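As a small illustration of such rule-based metrics, sentence-level BLEU can be computed with NLTK as below; the paper's exact evaluation toolkit is not specified, so this is only an assumed example with toy tokens.

from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

references = [["a", "red", "flower", "with", "yellow", "stamen"]]
candidate = ["a", "red", "flower", "with", "a", "yellow", "center"]
score = sentence_bleu(references, candidate, smoothing_function=SmoothingFunction().method1)
print(f"BLEU: {score:.3f}")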

4.2 Quantitative Comparison

We compare SGCDIC with several strong competitors, including the following methods: (1) GCN-LSTM model [25]: This model utilizes GCN to encode contextual information for visually related regions in structured semantic/spatial graphs. It then feeds the relation-aware region-level features learned from each semantic/spatial graph into an LSTM decoder with attention mechanism for sentence generation. (2) Bottom-Up and Top-Down attention (BUTD) model [2]: This model combines bottom-up and top-down attention mechanisms to compute attention scores at the levels of objects and salient image regions. The bottom-up attention mechanism determines the image regions, while the topdown attention mechanism determines the feature weights. (3) SAT model [5]: SAT utilizes adversarial learning to leverage unpaired image-text data in the target domain. SAT employs a critic network to evaluate whether the generated descriptions can be distinguished from real descriptions, and uses another critic network to evaluate whether the image and its generated description form a valid combination. (4) SGAE model [24]: By leveraging the complementary advantages of symbolic reasoning and end-to-end multimodal feature mapping, the SGAE model introduces inductive bias into the encoder-decoder architecture of language generation. It achieves this by narrowing the gap between visual perception and language composition through scene graphs, thus transferring the inductive bias from the purely linguistic domain to the visual-linguistic domain. (5) SubGC [31]: This model decomposes the scene graph into a series of subgraphs and decodes the target sentence by selecting important subgraphs and using an LSTM with an attention mechanism. (6) SG2Caps [20]: This model leverages the spatial positions of entities and person-object interaction labels to bridge the semantic gap between the image scene graph and the corresponding textual description scene graph. (7) SGCC [17]: This model utilizes the semantic attributes of objects and the 3D information of images (i.e., object depth, relative 2D image plane distance, and relative angles between objects) to construct a scene graph. It employs graph convolutional networks to aggregate nodes and decomposes the constructed scene graph into attribute subgraphs and spatial subgraphs. (8) SGGC [4]: This model introduces a unique approach to the image captioning process by leveraging the semantic associations generated from the scene graph and visual features. It addresses the issue of neglecting semantic relationships among key regions in image description within the encoder-decoder


architecture. (9) ICDF model [7]: this is a simple fine-tuning method that combines cross-domain retrieval with reinforcement learning to make the descriptions generated by the cross-domain IC model more discriminative. In this comparison, it is included only as the best-performing model among the fine-tuned ones, and the fine-tuned results (denoted as ICDF) are compared against the proposed method. (10) GCN-LSTM-dual, SAT-dual, SGAE-dual, SubGC-dual, SG2Caps-dual, SGCC-dual, and SGGC-dual: these methods are embedded into the proposed dual learning framework individually. Specifically, they share a framework with SGCDIC, where the IC module in the framework is replaced by each of them. (11) L-Verse [16] and UMT [12] also focus on intra-domain text-to-image synthesis. In this study, we pretrain UMT on the MSCOCO dataset (source domain) and then test its performance on two target domains, namely Oxford-102 and Flickr30k. Please note that we do not compare SGCDIC with BUTD using the dual learning framework, because SGCDIC is based on scene graphs and extends BUTD by leveraging the dual learning framework.

We employ the fine-tuning method proposed in [7], which involves fine-tuning the models listed in Table 1 using all paired image-text data in the target domain. We consider the best evaluation metric achieved after fine-tuning as a reliable reference for model comparison (denoted as the ICDF result: fine-tuning on the best-performing metric among the compared models, utilizing all paired image-text data). In our experiments, all training samples in the target domain are treated as unpaired image-text data. Specifically, for the models listed in Table 1, we first pretrain the models on the source-domain (MSCOCO) training set. Then, we fine-tune these models using the unpaired image and text data from the target domain (e.g., Oxford-102 and Flickr30k). Finally, we evaluate the performance of all models on the target-domain test set.

From Table 1, it can be observed that SGCDIC consistently outperforms its competitors by a significant margin on the Oxford-102 and Flickr30k datasets. Specifically, on the Oxford-102 dataset, SGCDIC performs better than the other methods on all evaluation metrics (even better than the ICDF result). As expected, models embedded in the dual learning framework (models with "dual") show better performance than the individual IC models (models without "dual") in cross-domain tasks, as they leverage unpaired data from the target domain for training. All baseline methods using the dual learning framework outperform the original models, validating the effectiveness of the dual learning framework. It is worth noting that SGCDIC shows a more significant improvement on the Oxford-102 dataset than on Flickr30k. This could be because there is a larger domain shift between MSCOCO and Oxford-102 (i.e., a larger semantic gap) than between MSCOCO and Flickr30k. The Oxford-102 dataset focuses on images of flowers, while the MSCOCO and Flickr30k datasets contain multi-object images of daily life scenes, which share some similarities; hence, the domain shift between MSCOCO and Flickr30k is smaller (i.e., a smaller semantic gap). Therefore, in cross-domain scenarios, generating semantically meaningful descriptions for complex scenes becomes more challenging, and the differences in generalization capabilities among different methods become more apparent.


Table 1. In the absence of labeled image-text pairs, we perform automatic evaluation of cross-domain image captioning on two target domains, namely Oxford-102 and Flickr30k. All methods are pre-trained on MSCOCO, which serves as the source domain.

Method | Target | BLEU-1 | BLEU-2 | BLEU-3 | BLEU-4 | METEOR | ROUGE | CIDEr
GCN-LSTM | Oxford-102 | 0.574 | 0.335 | 0.261 | 0.219 | 0.229 | 0.447 | 0.253
BUTD | Oxford-102 | 0.541 | 0.253 | 0.159 | 0.104 | 0.161 | 0.364 | 0.105
SAT | Oxford-102 | 0.843 | 0.771 | 0.679 | 0.612 | 0.358 | 0.726 | 0.275
SGAE | Oxford-102 | 0.852 | 0.789 | 0.684 | 0.627 | 0.371 | 0.736 | 0.283
SubGC | Oxford-102 | 0.892 | 0.794 | 0.695 | 0.641 | 0.383 | 0.756 | 0.345
SG2Caps | Oxford-102 | 0.912 | 0.809 | 0.706 | 0.653 | 0.392 | 0.776 | 0.357
SGCC | Oxford-102 | 0.935 | 0.893 | 0.736 | 0.671 | 0.424 | 0.782 | 0.368
SGGC | Oxford-102 | 0.924 | 0.881 | 0.724 | 0.663 | 0.405 | 0.779 | 0.415
GCN-LSTM-dual | Oxford-102 | 0.763 | 0.547 | 0.394 | 0.358 | 0.426 | 0.568 | 0.362
SAT-dual | Oxford-102 | 0.863 | 0.781 | 0.689 | 0.629 | 0.375 | 0.754 | 0.349
SGAE-dual | Oxford-102 | 0.874 | 0.797 | 0.694 | 0.635 | 0.387 | 0.759 | 0.355
SubGC-dual | Oxford-102 | 0.913 | 0.817 | 0.715 | 0.673 | 0.396 | 0.784 | 0.394
SG2Caps-dual | Oxford-102 | 0.925 | 0.823 | 0.728 | 0.682 | 0.405 | 0.793 | 0.385
SGCC-dual | Oxford-102 | 0.946† | 0.914† | 0.745† | 0.683† | 0.437† | 0.794† | 0.379
SGGC-dual | Oxford-102 | 0.935 | 0.906 | 0.741 | 0.676 | 0.428 | 0.789 | 0.439
L-Verse | Oxford-102 | – | – | – | 0.373 | 0.285 | 0.567 | 0.674†
UMT | Oxford-102 | – | – | – | 0.356 | 0.261 | 0.546 | 0.652
ICDF | Oxford-102 | 0.949♦ | 0.921♦ | 0.751♦ | 0.688♦ | 0.442♦ | 0.806♦ | 0.677♦
SGCDIC (Ours) | Oxford-102 | 0.957♥ | 0.928♥ | 0.763♥ | 0.694♥ | 0.459♥ | 0.815♥ | 0.683♥
GCN-LSTM | Flickr30k | 0.631 | 0.356 | 0.274 | 0.225 | 0.259 | 0.463 | 0.271
BUTD | Flickr30k | 0.615 | 0.403 | 0.249 | 0.181 | 0.157 | 0.439 | 0.232
SAT | Flickr30k | 0.676 | 0.421 | 0.294 | 0.283 | 0.365 | 0.478 | 0.331
SGAE | Flickr30k | 0.779 | 0.574 | 0.395 | 0.431 | 0.492 | 0.582 | 0.463
SubGC | Flickr30k | 0.793 | 0.685 | 0.486 | 0.457 | 0.534 | 0.613 | 0.532
SG2Caps | Flickr30k | 0.814 | 0.726 | 0.595 | 0.583 | 0.652 | 0.679 | 0.627
SGCC | Flickr30k | 0.835 | 0.748 | 0.624 | 0.608 | 0.692 | 0.725 | 0.718
SGGC | Flickr30k | 0.826 | 0.735 | 0.617 | 0.596 | 0.674 | 0.709 | 0.735
GCN-LSTM-dual | Flickr30k | 0.794 | 0.546 | 0.457 | 0.468 | 0.427 | 0.552 | 0.483
SAT-dual | Flickr30k | 0.689 | 0.445 | 0.316 | 0.394 | 0.473 | 0.528 | 0.475
SGAE-dual | Flickr30k | 0.804 | 0.706 | 0.514 | 0.623 | 0.648 | 0.704 | 0.603
SubGC-dual | Flickr30k | 0.815 | 0.765 | 0.562 | 0.675 | 0.655 | 0.743 | 0.674
SG2Caps-dual | Flickr30k | 0.833 | 0.783 | 0.569 | 0.692 | 0.667 | 0.758 | 0.658
SGCC-dual | Flickr30k | 0.865† | 0.794† | 0.663† | 0.712† | 0.686† | 0.778† | 0.679
SGGC-dual | Flickr30k | 0.852 | 0.791 | 0.586 | 0.704 | 0.673 | 0.764 | 0.813†
L-Verse | Flickr30k | – | – | – | 0.356 | 0.254 | 0.532 | 0.647
UMT | Flickr30k | – | – | – | 0.334 | 0.237 | 0.522 | 0.628
ICDF | Flickr30k | 0.874♦ | 0.803♦ | 0.674♦ | 0.732♦ | 0.691♦ | 0.782♦ | 0.827♦
SGCDIC (Ours) | Flickr30k | 0.883♥ | 0.827♥ | 0.754♥ | 0.776♥ | 0.723♥ | 0.806♥ | 0.845♥

The results of cross-domain image captioning demonstrate that SGCDIC can adapt well to the target domain without the need for fine-tuning on paired image-text data in the target domain. Notably, the reason why SGCDIC achieves competitive results compared to other models such as SGAE, SubGC, SG2Caps, SGCC, SGGC, and their corresponding “dual” models (i.e., SGAE-dual, SubGC-dual, SG2Caps-dual, SGCC-dual, SGGC-dual) is that scene graphs can encode objects, attributes, and their relationships in the image, while the dual learning mechanism can capture fine-grained visual features. The reason why most metrics of SGCC and SGCC-dual are higher than those of the other baseline models may be that 3D information is more discriminative for identifying multi-object dense scenes in images; 2D information is more prone to occlusion, where one object is obstructed by others and is not easily identifiable. As for L-Verse and UMT, they demonstrate the advantages of the Transformer architecture in capturing fine-grained cross-modal (image-text) features. However, when faced with target domains such as Oxford-102 and Flickr30k, their performance is relatively limited. This is because the Transformer architecture fails to learn the crucial aspect of domain transfer, i.e., the commonality of data across different domains (e.g., shared fine-grained features).

4.3 Qualitative Comparison

In this section, we present some representative descriptions and image samples generated by SGCDIC and other comparative methods.

Table 2. Qualitative examples of different methods on the Oxford-102 dataset. Each example consists of an input image (not reproduced here), the corresponding ground-truth descriptions, and the generated descriptions.

Example 1
Ground truth: 1: this flower has a large number of red petals close together surrounding a group of tall yellow stamen. 2: this flower is pink and yellow in color, with petals that are rounded. 3: this flower has round, dark magenta petals and a big collection of yellow stamens. 4: this dark pink flower has several layers of petals, it also has yellow stamen protruding from the center. 5: this flower has thin yellow stamen and very bright pink petals with rounded tips.
Generated: BUTD: this flower has red petals with yellow stamens. SAT: this flower has petals that are pink and has red dots. SGAE: this flower has petals that are yellow and has green pedicel. SGCDIC: this flower has red petals surrounding yellow stamen, and has green pedicel.

Example 2
Ground truth: 1: this flower is purple in color, with petals that are ruffled and wavy. 2: this flower has a simple row of blue petals with the stamens at the center. 3: this flower has petals that are purple with black stamen. 4: this purple flower is trumpet shaped and has deep purple vertical lines coming out from the inside. 5: this flower has large light purple petals with a wrinkled texture and rounded edges.
Generated: BUTD: this flower has purple wavy petals. SAT: this flower has purple petals. SGAE: a purple flower has wavy petals. SGCDIC: the flower has purple petals with wavy and ruffled shapes.

Example 3
Ground truth: 1: the flower petals are oval shaped and are pink in color with the outer part colored yellow. 2: the flower has pink petals with yellow outer edges. 3: this flower is red and yellow in color, with petals that are multi colored. 4: this flower is pink and yellow in color, and has petals that are yellow on the tips. 5: this flower has many long red with yellow tip petals, and a wide red and yellow pistil.
Generated: BUTD: a flower has red and yellow petals. SAT: this flower has pink petals with yellow tips. SGAE: this flower has red petals with yellow tips. SGCDIC: this flower has wide red petals with yellow outer edges.

Example 4
Ground truth: 1: the flower has long stamen that are pink with pink petals. 2: this flower is red and black in color, with petals that are spotted. 3: this flower has long red stamen surrounded by red petals with maroon spots. 4: this flower has petals that are red with black dots. 5: a large red flower with black dots and a very long stigmas.
Generated: BUTD: the flower has red petals that are spotted. SAT: this flower has petals that are red and has black dots. SGAE: a large red flower with black dots. SGCDIC: the flower has red petals with long stigmas, black dots in petals.

Example 5
Ground truth: 1: pedicel is green in color, petals are white in color with yellow anthers. 2: this flower is white and yellow in color, with petals that are wavy and curled. 3: a flower with white petals accompanied by yellow anther filaments and pistils. 4: the flower shown has several white petals, and a green center. 5: this flower has pure white petals that curve out slightly at the top with a bright orange stamen.
Generated: BUTD: this flower has white petals and yellow stamen. SAT: a flower with white petals and yellow pistil. SGAE: this flower is white and yellow in color, petals are white. SGCDIC: this flower has white petals with yellow stamen and light green pistil.

Example 6
Ground truth: 1: this flower is pink and yellow in color, with petals that are wavy and curled. 2: a large purple pedaled flower with a sunflower looking center. 3: the flower shown has several pink petals, and a yellow center. 4: this pink flower has rounded petals and orange stamen and anthers. 5: this flower has light purple petals that look like scoops and a thick layer of yellow stamen.
Generated: BUTD: a flower has pink petals and a yellow center. SAT: a flower with pink petals and yellow stamen. SGAE: this flower has several pink petals with yellow stamen. SGCDIC: this flower has pink curved petals with yellow stamens.

On the Oxford-102 dataset, representative descriptions generated by SGCDIC and other comparative methods are shown in Table 2. It is easy to observe from Table 2 that all methods are capable of generating reasonable and relevant image descriptions. However, the domain adaptive model (SGCDIC) tends to produce more accurate and detailed descriptions. In most cases, SGCDIC achieves the best performance. For example, in the fourth sample, the description generated by SGCDIC, “the flower has red petals with long stigmas, black dots in petals,” accurately describes the color, shape, and content of the flower (see Table 2).

Table 3. Qualitative examples of different methods on the Flickr30k dataset. Each example consists of an input image (not reproduced here), the corresponding ground-truth descriptions, and the generated descriptions.

Example 1
Ground truth: 1: a boy leaping in the air towards a soccer ball. 2: a boy in midair trying to kick a soccer ball. 3: boy leaning backward to kick a soccer ball in midair. 4: a child falling back to kick a soccer ball. 5: a boy in a stripy shirt is playing with a soccer ball in a grassy park.
Generated: BUTD: a man in a field playing with a frisbee. SAT: a young boy playing with a soccer ball in a field. SGAE: a boy in a stripy shirt is playing with a soccer ball. SGCDIC: a boy in a stripy shirt is playing with a soccer ball in a grassy park.

Example 2
Ground truth: 1: A man in a blue baseball cap and green waders fumbles with a fishing net in a blue boat docked beside a pier. 2: A bright blue fishing boat and fisherman at dock preparing nets. 3: A man in a small boat readies his net for the day ahead. 4: A lone fisherman is on his boat checking his net. 5: Man in blue boat holding a net.
Generated: BUTD: a man in boat is preparing nets. SAT: a man in boat is preparing nets. SGAE: a man in a small boat is checking his net. SGCDIC: A fisherman in a blue boat is holding a net.

Example 3
Ground truth: 1: A brown beast of burden is suspended in midair by the cart it’s pulling which is carrying various white packages and toppled over in the sand. 2: An overloaded wagon full of white boxes tips backwards and pulls the mule attached to the wagon into the air. 3: A cart that is so heavy it tipped over backwards and put the donkey that was pulling it in the air. 4: A donkey suspended in air after a cart it is attached to over turns. 5: A cart has been overloaded lifting the animal attached.
Generated: BUTD: An animal suspended in air after a cart. SAT: A beast of burden is suspended in midair. SGAE: A brown animal of burden is suspended in midair by the art. SGCDIC: A donkey of burden is suspended in midair by the cart which is carrying various white packages.

Example 4
Ground truth: 1: little boy looking through a telescope on playground equipment with the sun glaring off the pole his hand is on. 2: A child is looking through a pretend telescope on playground equipment in front of a blue sky. 3: A little blond boy is looking through a yellow telescope. 4: A child looking through a telescope on a playground. 5: A kid with blond-hair using a telescope.
Generated: BUTD: a boy is looking through a telescope. SAT: a child is looking through a telescope. SGAE: A kid is looking through a yellow telescope. SGCDIC: A little boy with blond hair is looking through a yellow telescope.

Example 5
Ground truth: 1: A horn and percussion band playing their instruments in public. 2: A band of men and women are playing music in the street. 3: A group of people are playing instruments. 4: The band is playing music in this city. 5: A band entertaining the crowd.
Generated: BUTD: many people are playing instruments. SAT: a lot of people are playing instruments. SGAE: a group of people are playing instruments. SGCDIC: A group of men and women are playing musical instruments in the street.

Example 6
Ground truth: 1: Girls in white shirts and blue skirts shoot arrows from bows. 2: 5 children japanese with bows practicing in front of sensei. 3: Five woman all dressed in uniform are in an archery class. 4: A group of five women in blue skirts practicing archery. 5: Five women are practicing bow and arrow skills indoors.
Generated: BUTD: several girls are playing arrows. SAT: many girls are practicing archery. SGAE: five girls are practicing archery. SGCDIC: five girls in white shirts and blue skirts are practicing archery indoors.

Furthermore, we provide several representative image descriptions in Flickr30k, as shown in Table 3. In the first example of Table 3, SGCDIC correctly and specifically describes what the boy is wearing as “a boy wearing a striped shirt,” while other comparative methods simply state “a boy.” The third and fourth examples demonstrate SGCDIC’s ability to capture fine-grained features such as “white packages” and “blond hair.” It is worth noting that in the example of the last row, all methods, including SGCDIC, fail to identify the man standing behind the five girls wearing similar-styled clothing. This means cross-domain IC models show lower accuracy than non-cross-domain models on test images from the target domain, particularly when encountering challenges like shadows, occlusions, and low lighting conditions that impact the IC process.

5 Conclusion

In this paper, we propose a dual-learning framework called SGCDIC, which is a scene graph-based multi-task learning algorithm for cross-domain image captioning. In this regard, we propose a scene-graph image captioning method called SGAT and leverage DFGAN for text-to-image synthesis. SGCDIC employs a dual-learning mechanism to optimize two coupled tasks (i.e. image captioning and text-to-image synthesis) simultaneously to transfer knowledge from the source domain to the target domain. SGCDIC follows a two-stage strategy. Firstly, we learn the shared knowledge of cross-modal generation from the source domain (MSCOCO), such as aligning visual representations with natural language descriptions, by accessing a large-scale dataset of text-image pairs. This enables fine-grained feature alignment between the visual and language modalities. Secondly, we transfer the learned shared knowledge to the target domain (Flickr30k, Oxford-102) and fine-tune the model on limited or unpaired data using the dual-learning mechanism. Extensive experiments validate our proposal and analysis. More remarkably, our method outperforms other methods, achieving the best cross-domain image captioning performance.
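The two coupled tasks are optimized through a closed loop: captions produced for unpaired target-domain images are fed to the text-to-image synthesizer, and reconstruction signals in both directions provide supervision without paired data. The snippet below is a minimal structural sketch of one such dual-learning update, not the authors' implementation; the toy modules, tensor shapes, and simple L1/MSE reconstruction losses are placeholders for the actual scene-graph captioner (SGAT), the DF-GAN synthesizer, and the losses used in the paper.

```python
# Minimal sketch of one dual-learning update on unpaired target-domain data.
import torch
import torch.nn as nn
import torch.nn.functional as F

# Placeholder modules standing in for the captioner and the synthesizer.
class ToyCaptioner(nn.Module):      # image (B,3,32,32) -> caption embedding (B,T,D)
    def __init__(self, T=8, D=64):
        super().__init__()
        self.net = nn.Linear(3 * 32 * 32, T * D)
        self.T, self.D = T, D
    def forward(self, x):
        return self.net(x.flatten(1)).view(-1, self.T, self.D)

class ToySynthesizer(nn.Module):    # caption embedding (B,T,D) -> image (B,3,32,32)
    def __init__(self, T=8, D=64):
        super().__init__()
        self.net = nn.Linear(T * D, 3 * 32 * 32)
    def forward(self, c):
        return self.net(c.flatten(1)).view(-1, 3, 32, 32)

def dual_learning_step(captioner, synthesizer, images, text_emb, optimizer, lam=1.0):
    optimizer.zero_grad()
    # Image -> caption -> image: the synthesized image should reconstruct the input.
    recon_images = synthesizer(captioner(images))
    loss_i2t2i = F.l1_loss(recon_images, images)
    # Text -> image -> text: the re-generated caption should stay close to the input text.
    recon_text = captioner(synthesizer(text_emb))
    loss_t2i2t = F.mse_loss(recon_text, text_emb)
    loss = loss_i2t2i + lam * loss_t2i2t
    loss.backward()
    optimizer.step()
    return loss.item()

if __name__ == "__main__":
    cap, syn = ToyCaptioner(), ToySynthesizer()
    opt = torch.optim.Adam(list(cap.parameters()) + list(syn.parameters()), lr=1e-3)
    imgs = torch.rand(4, 3, 32, 32)   # unpaired target-domain images
    txts = torch.rand(4, 8, 64)       # embeddings of unpaired target-domain captions
    print(dual_learning_step(cap, syn, imgs, txts, opt))
```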

References 1. Anderson, P., Fernando, B., Johnson, M., Gould, S.: SPICE: semantic propositional image caption evaluation. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9909, pp. 382–398. Springer, Cham (2016). https://doi. org/10.1007/978-3-319-46454-1 24 2. Anderson, P., et al.: Bottom-up and top-down attention for image captioning and visual question answering. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 6077–6086 (2018) 3. Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: Endto-end object detection with transformers. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12346, pp. 213–229. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58452-8 13 4. Chen, H., Wang, Y., Yang, X., Li, J.: Captioning transformer with scene graph guiding. In: 2021 IEEE International Conference on Image Processing (ICIP), pp. 2538–2542. IEEE (2021) 5. Chen, T.H., Liao, Y.H., Chuang, C.Y., Hsu, W.T., Fu, J., Sun, M.: Show, adapt and tell: Adversarial training of cross-domain image captioner. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 521–530 (2017) 6. Damodaran, V., et al.: Understanding the role of scene graphs in visual question answering. arXiv preprint arXiv:2101.05479 (2021)


7. Dess`ı, R., Bevilacqua, M., Gualdoni, E., Rakotonirina, N.C., Franzon, F., Baroni, M.: Cross-domain image captioning with discriminative finetuning. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 6935–6944 (2023) 8. Dosovitskiy, A., et al.: An image is worth 16×16 words: transformers for image recognition at scale. arXiv preprint arXiv:2010.11929 (2020) 9. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer vision and Pattern Recognition, pp. 770–778 (2016) 10. Herzig, R., Bar, A., Xu, H., Chechik, G., Darrell, T., Globerson, A.: Learning canonical representations for scene graph to image generation. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12371, pp. 210–227. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58574-7 13 11. Heusel, M., Ramsauer, H., Unterthiner, T., Nessler, B., Hochreiter, S.: GANs trained by a two time-scale update rule converge to a local nash equilibrium. In: Advances in Neural Information Processing Systems, vol. 30 (2017) 12. Huang, Y., Xue, H., Liu, B., Lu, Y.: Unifying multimodal transformer for bidirectional image and text generation. In: Proceedings of the 29th ACM International Conference on Multimedia, pp. 1138–1147 (2021) 13. Jia, J., et al.: Image captioning based on scene graphs: a survey. Expert Syst. Appl. 231, 120698 (2023) 14. Johnson, J., Gupta, A., Fei-Fei, L.: Image generation from scene graphs. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1219–1228 (2018) 15. Karpathy, A., Fei-Fei, L.: Deep visual-semantic alignments for generating image descriptions. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3128–3137 (2015) 16. Kim, T., et al.: L-verse: bidirectional generation between image and text. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 16526–16536 (2022) 17. Liao, Z., Huang, Q., Liang, Y., Fu, M., Cai, Y., Li, Q.: Scene graph with 3D information for change captioning. In: Proceedings of the 29th ACM International Conference on Multimedia, pp. 5074–5082 (2021) 18. Lin, T.Y., et al.: Microsoft coco: common objects in context (2014). arXiv preprint arXiv:1405.0312 (2019) 19. Lyu, F., Feng, W., Wang, S.: vtGraphNet: learning weakly-supervised scene graph for complex visual grounding. Neurocomputing 413, 51–60 (2020) 20. Nguyen, K., Tripathi, S., Du, B., Guha, T., Nguyen, T.Q.: In defense of scene graphs for image captioning. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 1407–1416 (2021) 21. Nilsback, M.E., Zisserman, A.: Automated flower classification over a large number of classes. In: 2008 Sixth Indian Conference on Computer Vision, Graphics & Image Processing, pp. 722–729. IEEE (2008) 22. Tao, M., Tang, H., Wu, F., Jing, X.Y., Bao, B.K., Xu, C.: DF-GAN: a simple and effective baseline for text-to-image synthesis. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 16515–16525 (2022) 23. Woo, S., Park, J., Lee, J.Y., Kweon, I.S.: CBAM: convolutional block attention module. In: Proceedings of the European conference on computer vision (ECCV), pp. 3–19 (2018)


24. Yang, X., Tang, K., Zhang, H., Cai, J.: Auto-encoding scene graphs for image captioning. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 10685–10694 (2019) 25. Yao, T., Pan, Y., Li, Y., Mei, T.: Exploring visual relationship for image captioning. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 684–699 (2018) 26. Young, P., Lai, A., Hodosh, M., Hockenmaier, J.: From image descriptions to visual denotations: new similarity metrics for semantic inference over event descriptions. Trans. Assoc. Comput. Linguist. 2, 67–78 (2014) 27. Yu, J., Chai, Y., Wang, Y., Hu, Y., Wu, Q.: CogTree: cognition tree loss for unbiased scene graph generation. arXiv preprint arXiv:2009.07526 (2020) 28. Yu, L., Zhang, W., Wang, J., Yu, Y.: SeqGAN: sequence generative adversarial nets with policy gradient. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 31 (2017) 29. Yuan, J., et al.: Discriminative style learning for cross-domain image captioning. IEEE Trans. Image Process. 31, 1723–1736 (2022) 30. Zhao, W., Wu, X., Luo, J.: Cross-domain image captioning via cross-modal retrieval and model adaptation. IEEE Trans. Image Process. 30, 1180–1192 (2020) 31. Zhong, Y., Wang, L., Chen, J., Yu, D., Li, Y.: Comprehensive image captioning via scene graph decomposition. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.M. (eds.) ECCV 2020. LNCS, vol. 12359, pp. 211–229. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58568-6 13

Enhancing Rule Learning on Knowledge Graphs Through Joint Ontology and Instance Guidance

Xianglong Bao1, Zhe Wang2, Kewen Wang2(B), Xiaowang Zhang1, and Hutong Wu1

1 College of Intelligence and Computing, Tianjin University, Tianjin, China
{baoxl,xiaowangzhang,wht}@tju.edu.cn
2 School of Information and Communication Technology, Griffith University, Brisbane, Australia
{zhe.wang,k.wang}@griffith.edu.au

Abstract. Rule learning is a machine learning method that extracts implicit rules and patterns from data, enabling symbol-based reasoning in artificial intelligence. Unlike data-driven approaches such as deep learning, using rules for inference allows for interpretability. Many studies have attempted to automatically learn first-order rules from knowledge graphs. However, existing methods lack attention to the hierarchical information between ontology and instance and require a separate rule evaluation stage for rule filtering. To address these issues, this paper proposes an Ontology and Instance Guided Rule Learning (OIRL) approach to enhance rule learning on knowledge graphs. Our method treats rules as sequences composed of relations and utilizes semantic information from both ontology and instances to guide path generation, ensuring that paths contain as much pattern information as possible. We also develop an end-to-end rule generator that directly infers rule heads and incorporates an attention mechanism to output confidence scores, eliminating the need for rule instance-based rule evaluation. We evaluate OIRL using the knowledge graph completion task, and experimental results demonstrate its superiority over existing rule learning methods, confirming the effectiveness of the mined rules.

Keywords: Knowledge graph · Rule learning · First-order Logic

1 Introduction

Knowledge graph (KG) is a structured and semantic knowledge representation model and serves as a graph-based knowledge repository. In a KG, entities are represented as nodes in the graph, and the relationships between entities are expressed as edges between nodes. Each triple (h, r, t) represents the relationship r between entities h and t. For example, (Alice, nationality, China) represents that Alice's nationality is China. Knowledge graphs are capable of capturing and expressing complex relationships and attributes between entities, enabling various applications in artificial intelligence such as question-answering systems [31] and recommendation systems [26]. However, real-world knowledge graphs are often incomplete, with numerous unknown entities and relationships. Knowledge graph completion is an important task that aims to infer missing knowledge based on existing entities and relationships.

Currently, the field of machine learning predominantly focuses on using data-driven approaches to complete missing facts in knowledge graphs. TransE [1] serves as a representative translation model, which models the triple (h, r, t) as h + r ≈ t, where + represents vector addition and ≈ denotes similarity. RotatE [22] defines the relationship between entities as a rotation from the source entity to the target entity in the complex plane. These Knowledge Graph Embedding (KGE) methods aim to explore potential semantic information between different entities by mapping entities and relations to a low-dimensional vector space, thereby facilitating downstream tasks. However, these data-driven methods have limitations in terms of interpretability and inductive reasoning. Unlike KGE methods, rule-based reasoning employs symbolic logic and offers the ability to interpret predictions through rules, thus providing interpretability. Additionally, rule-based reasoning does not target specific entities during inference, enabling predictions to be made for previously unseen entities. This property demonstrates its inductive nature. For instance, consider the rule nationality(x, y) ← bornin(x, z) ∧ isCityof(z, y), which implies that if entity x is born in city z and z is a city in country y, we can infer that the nationality of x is y.

Rule learning is an important research direction in the field of machine learning, aiming to extract useful rules from data. Recently, various rule mining methods have been proposed in the literature. For example, AMIE [8] is a top-down search-based rule learning method that performs first-order rule mining based on an incomplete knowledge base. It starts with empty rule bodies and expands the rule bodies while retaining candidate rules with support greater than a threshold. RLvLR [17] utilizes embeddings in representation learning to address the limitations of search-based rule mining algorithms that are not applicable to large-scale knowledge graphs. It requires counting the number of ground truth instances across the entire knowledge graph to evaluate the quality of rules. NCRL [2] utilizes the compositional property of rule bodies to derive rule heads and is the best-performing rule mining algorithm at present. However, it obtains training paths through simple random walks, which leads to high time complexity. Moreover, it cannot guarantee the quality of the sampled paths.

Note that the hierarchical information between ontology and instance contains different pattern information for each relation. Figure 1 shows the domain and range constraint information among relationships. However, existing rule mining methods lack attention to this hierarchical information.

Fig. 1. The TBox contains ontology level information, while the ABox contains instance level information. Additionally, the TBox specifies the domain and range of different relations.

In this paper, we explore a solution that involves a two-stage rule mining approach. The first module is a path sampling module that utilizes the ontology-level type information from the TBox and the entity information from the instance-level ABox. Due to the focus of first-order rules on the logical relationships between relations rather than the path connections between entities, we not only consider entity information but also pay attention to the rich type information in the graph during the path sampling process. By utilizing the domain and range information of types in the TBox, we incorporate more pattern information into the sampled paths. The second part is the path synthesis module based on the Gated Recurrent Unit (GRU). This module synthesizes paths and generates rules along with their corresponding confidence scores in an end-to-end manner. It is important to note that our model avoids the rule evaluation process used in traditional rule mining algorithms. Instead, it directly outputs confidence scores, thus avoiding the computational cost and time consumption associated with the rule evaluation phase. We have implemented the prototype system OIRL and evaluated its performance on benchmark datasets. The experimental results demonstrate that our model outperforms existing rule mining methods.

Our contributions can be summarized in the following three points:
• We propose a rule learning model that decouples the rule mining process into two stages and simultaneously incorporates ontology-level type information from the TBox and entity-level information from the ABox during the path sampling stage. This enables the sampled paths to capture more pattern information.
• Our model achieves end-to-end output of rule confidence, eliminating the need for a separate rule evaluation stage as found in previous methods.
• We conducted extensive experiments, and the results demonstrate that our model outperforms existing rule mining methods.

This paper is organized as follows. Section 2 introduces related work. Section 3 presents our method, OIRL, and describes the technical details. In Sect. 4, we will present the experimental results of OIRL on benchmark datasets. Finally, we conclude this paper in Sect. 5.

2 Related Work

2.1 Reasoning with Embeddings

In recent years, with the development of deep learning, many methods have represented entities and relationships as embeddings to accomplish reasoning tasks. As a representation learning approach, embedding methods aim to learn latent semantic information by mapping entities and relationships into a low-dimensional vector space. TransE [1] is one of the earliest methods that introduced embeddings into knowledge graphs. The core idea of TransE is to approximate the sum of the embedding of the head entity and the relationship to be close to the embedding of the tail entity. This allows it to capture relationships between entities. However, TransE represents relationships as unique vector offsets, which limits its ability to model many-to-one and many-to-many relationships. TransH [25] introduced a projection matrix to address the limitations of TransE by mapping entities and relationships onto hyperplanes. Subsequent models such as TransR [12], TransD [11], TransG [28], and TransAt [18] are all translation-based models. RESCAL [15] represents relationships as an association matrix between entities. DistMult [29] models relationships using tensor decomposition, where the matrix is constrained to be a diagonal matrix compared to RESCAL. ComplEx [24] extends DistMult by introducing complex embeddings, enhancing the modeling capability for asymmetric relationships. However, it cannot infer compositional patterns. RotatE [22] addresses the challenges of capturing relationship polysemy and symmetry in previous embedding methods by introducing complex vector space rotation operations. However, these embedding methods focus on existing triples and cannot predict unseen entities, lacking generalization. Additionally, embedding methods make predictions based on vectors, lacking symbolic meaning and interpretability. On the other hand, rule-based reasoning methods can provide human-interpretable explanations for predictions based on rules. They can also reason about unseen entities since rules are not specific to particular entities, exhibiting inductive reasoning capabilities.
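To make the translation-based and bilinear scoring ideas above concrete, here is a small illustrative sketch of the TransE and DistMult scoring functions. The embedding dimension and the random vectors are arbitrary toy values, and the snippet is not tied to any particular KGE library.

```python
# Illustrative scoring functions for two KGE models mentioned above.
# TransE scores a triple by how well h + r lands on t; DistMult uses a
# diagonal bilinear product. Vectors here are random toy embeddings.
import numpy as np

rng = np.random.default_rng(0)
dim = 16
h, r, t = (rng.normal(size=dim) for _ in range(3))

def transe_score(h, r, t, norm=1):
    # Higher is better: negative distance between h + r and t (L1 or L2).
    return -np.linalg.norm(h + r - t, ord=norm)

def distmult_score(h, r, t):
    # Bilinear score <h, diag(r), t> = sum_i h_i * r_i * t_i.
    return float(np.sum(h * r * t))

print("TransE score:  ", transe_score(h, r, t))
print("DistMult score:", distmult_score(h, r, t))
```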

2.2 Reasoning with Rules

Rule learning allows for the automatic mining of logical rules from knowledge graphs. These rules can be utilized for knowledge inference and completion. Inductive Logic Programming (ILP) [14] induces logical rules from basic facts. However, ILP’s requirement for negative examples limits its applicability. AMIE [8] and AnyBURL [13] are both symbolic-based methods. AMIE treats rules as a series of atoms and extends them by adding three types of operators: Dangling Atom, Instantiated Atom and Closing Atom. It evaluates the quality of rules using Head Coverage (HC) and Standard Confidence (SC). AMIE+ [7] improves upon the rule expansion process of AMIE by only extending rules that can be closed within a set maximum length and optimizing the search process. However, due to the enormous search space, these ILP-based methods still face


Fig. 2. Framework overview. OIRL consists of two modules: the Path Sampling Module and the Path Composition Module. The Path Sampling Module samples paths by capturing pattern information of different relations from both ontology and instance. The relation r4 serves as the closing relation rc . The closed paths are fed into the Path Composition Module for further processing. Note that the orange arrows represent the training process and do not appear during the inference process. (Color figure online)

challenges when applied to large-scale knowledge graphs. RLvLR [17] introduces a rule learning approach that combines embeddings. Unlike ILP methods that use refinement operators to search the rule space, RLvLR samples relevant subgraphs from the knowledge graph, containing only entities and facts related to the target predicate. It then utilizes the embedding model RESCAL [15] to prune the search space. TyRule [27] mines typed rules through type information. The above rule mining methods require a separate stage where pre-defined metrics are used to score the quality of the rules. There are also methods that employ neural-driven symbolic reasoning, such as KALE [9], RUGE [10], IterE [32], UniKER [4], and RPJE [16]. These methods require the prior mining of high-quality rules to drive the neural approaches, making the upstream rule mining methods crucial. TensorLog [5] and Neural-LP [30] use differentiable processes for reasoning. NCRL [2] utilizes the compositional properties of atoms in rule bodies to infer rule heads, but it does not consider the quality of sampled paths in the path sampling phase, and the sampling process is also time-consuming.

3 Methodology

In this section, we will introduce a novel two-stage rule mining algorithm called OIRL. We first formally define the ABox, the TBox, and Horn rules.

ABox & TBox. The knowledge graph is formally defined as G = (E, R, O), where E, R, and O represent the sets of entities, relationships, and triples, respectively. Our approach considers facts as the ABox and TBox parts, as shown in Fig. 1. In the ABox, we have instance-level facts, such as a triple (Alice, bornin, Shanghai), which represents the connection between the entity Alice and the entity Shanghai through the relationship bornin. The ABox also indicates the type information of entities. For example, (Alice, rdf:type, Person) describes the type of the entity Alice as a person. Unlike the ABox, the TBox describes ontology-level knowledge. For example, the two triples (bornin, rdfs:domain, Person) and (bornin, rdfs:range, City) indicate that the head entity type for the relation bornin should be Person, and the tail entity type should be City. The domain and range of a relation represent the types of the head entity and tail entity, respectively. Note that the domain and range of a relation are not necessarily unique.

Horn Rule. In this paper, we focus on mining Horn rules, which are a special form of first-order rules. Horn rules are composed of a body of conjunctive predicates, i.e., relations in the knowledge graph. The form of the rules mined by our method is as follows:

r_h(x, y) ← r_{b_1}(x, z_1) ∧ ··· ∧ r_{b_n}(z_{n−1}, y)    (1)

In the mentioned rule, r_h(x, y) represents the head of the rule, while r_{b_1}(x, z_1) ∧ ··· ∧ r_{b_n}(z_{n−1}, y) represents the body of the rule. Here, x, y, and z_i are variables that can be replaced with entities. Each individual r(u, v) is referred to as an atom. When the body of a rule is satisfied, it allows for the inference of the facts in the head of the rule. Due to the fact that this form of rules manifests as a closed path in a graph, they are also referred to as closed-path rules.

3.1 Framework Details

The overall framework of our proposed approach is illustrated in Fig. 2. We combine the ontology-level information from the TBox with the instance-level information from the ABox to guide path searching. The generated paths simultaneously consider both type information and entity information. We consider the paths in the graph as a sequence and use a GRU to iteratively combine the paths. Then, we input this combined representation into an attention module to compute the probability of each relation being the head of a rule. At each step, we calculate the entropy loss to optimize the probability distribution. Finally, we select the relation with the highest probability as the head of the rule.

Path Sampling Module. In our approach, we treat rules as sequences composed of relations, and therefore, we extract paths from the graph and use them as sequences for training. Most existing methods randomly select entities as anchors in the graph and start random walks from these anchors to obtain paths. Since Horn rules focus more on the combination of relations than on entity-level connections, in the process of path sampling we sample based on relations rather than entities. In the knowledge graph, relations follow a heavy-tailed distribution, so we independently sample each relation and allocate a varying number of anchors based on the weight of each relation to reflect their importance in capturing rule patterns. We assign different sampling weights to each relation based on two aspects: (1) the weight of the instance-level triples in the ABox, and (2) the weight of the domain and range types at the ontology level in the TBox. The assignment of weights is as follows:

w_r = \lambda \frac{n_r}{\sum_{i=1}^{|R|} n_i} + (1 - \lambda) \frac{m_r^{dom} + m_r^{ran}}{\sum_{i=1}^{|R|} \left( m_i^{dom} + m_i^{ran} \right)}    (2)

where λ is a parameter that adjusts the degree of attention paid to the ABox and the TBox, n_r denotes the number of triples of relation r in the ABox, and m_r^{dom} and m_r^{ran} denote the number of domain and range types of relation r in the TBox. The weight metric w_r is used to assess the importance of relation r at both the instance and ontology levels, guiding the allocation of sampling weights. We can calculate the number of anchors for relation r by a_r = w_r · a, where a is the total number of anchors, which is an adjustable parameter. After obtaining the anchor relation, a path search is performed. Setting the sampling length to n, if the i-th relation in the sampled sequence is denoted as x_i, then the probability distribution of x_i can be defined as follows:

p(x_i = r_i \mid x_{i-1} = r_j) =
\begin{cases}
\frac{1}{|S(r_j)|}, & \text{if } r_j \text{ has connected relations} \\
0, & \text{otherwise}
\end{cases}    (3)

where S(r_j) denotes the set of relations connected to the relation r_j; relations r_i and r_j are connected if there exist triples (e_x, r_j, e_y) ∈ O and (e_y, r_i, e_z) ∈ O. By sampling, we obtain a relation sequence s = [r_1, r_2, ..., r_n]. We can close this path by adding a relation r_c that connects the head entity of r_1 with the tail entity of r_n. Please note that this step can be achieved with a time complexity of O(n), and our method does not require searching the entire graph to obtain paths with rich pattern information.

Path Composition Module. Our method aims to generate the most probable rule heads for rule bodies and provide their confidence scores. We employ a GRU to process the relation sequence s of length n. As shown in Fig. 2, the sliding window size for the GRU is set to 2, and h_i represents the combined representation of the two relations within the sliding window. After the sliding process, we obtain n − 1 hidden states, which are then fed into a fully connected layer (fc). We select the h_i with the highest probability as the query for the attention unit:

Q = \max([fc(h_1), fc(h_2), \ldots, fc(h_{n-1})])    (4)

We concatenate all the relation embeddings together to form a matrix R ∈ R^{|N|×d} as the content. Here, |N| represents the number of relations, and d represents the dimension of the relation embeddings. We perform a dot product between the query and key, divided by the square root of the embedding dimension d, to scale the attention scores. Then, we multiply the result by the value to obtain the attention for each relation. By using the attention as the weight for each relation, we can obtain a new representation through the composition of the two relations:

r_i \wedge r_{i+1} = \mathrm{softmax}\!\left( \frac{QK^{T}}{\sqrt{d}} \right) V    (5)

where r_i ∧ r_{i+1} represents the vector representation of the composition of the two relations r_i and r_{i+1}. We add the representation of r_i ∧ r_{i+1} to the relation sequence s and repeat the aforementioned combination process. Note that in previous methods, standard confidence (SC) and head coverage (HC) are often calculated to measure the quality of rules. We avoid the need for a separate rule quality evaluation process. Instead, we take the attention scores obtained from the final combination as the probabilities of each relation being the rule head. The relation with the highest probability is selected as the rule head, and its probability serves as the confidence score of the rule. This allows us to achieve an end-to-end evaluation of the rule's quality:

P(r_h = r_i \mid r_b = [r_1, r_2, \cdots, r_n]) = \max\!\left( \mathrm{softmax}\!\left( \frac{QK^{T}}{\sqrt{d}} \right) \right)    (6)

Optimization. Our loss function consists of two parts: (1) the cross-entropy loss between the predicted result and the actual closing relation r_c of the rule body path, and (2) the entropy loss of the attention scores at each step:

L = \sum_{i=1}^{|N|} r_c^{i} \log \theta_i + \sum_{j=1}^{n-2} \theta_j \log \theta_j    (7)

where |N | is the number of relations, rc represents the embedding of the relation that closes the path. For example, given a path s = [r1 , r2 , ..., rn ], if there exists rc that connects the head entity and the tail entity of this path, we say that rc closes this path. θ i represents the attention scores from the last round of output, which are the predicted values of the model. The first part of the loss function calculates the difference between the predicted values of the model and the true labels. In the second part, θ j represents the attention scores obtained at each step. By calculating the entropy of θ j , we measure the degree of dispersion in the distribution of attention scores. n represents the length of the rule body sequence [r1 , r2 , · · · , rn ]. Since the relations are combined pairwise, there will be n-1 steps in the intermediate combination process. The attention score from the last step will be directly used as the predicted value. Therefore, n-2 entropy losses will be calculated.
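As a rough illustration of the two modules described in this section, the sketch below implements (i) relation weighting in the spirit of Eq. (2), (ii) relation-level path sampling with closing relations in the spirit of Eq. (3), and (iii) a single GRU-plus-attention step for a length-2 body in the spirit of Eqs. (4)-(6). It is a simplified reading of the text under stated assumptions (toy data structures, one composition step, no entropy regularization), not the authors' implementation.

```python
# Simplified sketch of OIRL's two stages on toy data; not the reference code.
import random
from collections import defaultdict

import torch
import torch.nn as nn

# --- Stage 1: relation weighting and path sampling (cf. Eq. 2 and Eq. 3) ---
def relation_weights(abox_counts, dom_counts, ran_counts, lam=0.5):
    """abox_counts[r]: #triples of r; dom/ran_counts[r]: #domain/#range types of r."""
    n_total = sum(abox_counts.values())
    m_total = sum(dom_counts[r] + ran_counts[r] for r in abox_counts)
    return {r: lam * abox_counts[r] / n_total
               + (1 - lam) * (dom_counts[r] + ran_counts[r]) / m_total
            for r in abox_counts}

def sample_closed_paths(triples, weights, num_anchors=100, length=2):
    """Sample relation sequences and return (body, closing_relation) pairs."""
    out_edges = defaultdict(list)            # head entity -> [(relation, tail)]
    pair_rels = defaultdict(set)             # (head, tail) -> relations linking them
    for h, r, t in triples:
        out_edges[h].append((r, t))
        pair_rels[(h, t)].add(r)

    relations = list(weights)
    probs = [weights[r] for r in relations]
    paths = []
    for _ in range(num_anchors):
        r0 = random.choices(relations, weights=probs, k=1)[0]   # weighted anchor relation
        starts = [(h, r, t) for h, r, t in triples if r == r0]
        if not starts:
            continue
        h0, _, cur = random.choice(starts)
        body, ok = [r0], True
        for _ in range(length - 1):          # extend the relation path by a random walk
            if not out_edges[cur]:
                ok = False
                break
            r_next, cur = random.choice(out_edges[cur])
            body.append(r_next)
        if ok:
            for rc in pair_rels.get((h0, cur), ()):   # closing relation(s)
                paths.append((body, rc))
    return paths

# --- Stage 2: one composition step, attention over all relations (cf. Eq. 4-6) ---
class RuleHeadPredictor(nn.Module):
    def __init__(self, num_relations, dim=32):
        super().__init__()
        self.emb = nn.Embedding(num_relations, dim)
        self.gru = nn.GRU(dim, dim, batch_first=True)
        self.fc = nn.Linear(dim, dim)
        self.dim = dim

    def forward(self, body_ids):              # body_ids: LongTensor (batch, 2)
        x = self.emb(body_ids)                # (batch, 2, dim)
        _, h = self.gru(x)                    # final hidden state: (1, batch, dim)
        q = self.fc(h.squeeze(0))             # query (batch, dim)
        scores = q @ self.emb.weight.t() / self.dim ** 0.5
        att = torch.softmax(scores, dim=-1)   # probability of each relation being the head
        conf, head = att.max(dim=-1)          # confidence and predicted rule head
        return att, head, conf

if __name__ == "__main__":
    # Toy KG: relation ids 0..3, entities as strings.
    triples = [("a", 0, "b"), ("b", 1, "c"), ("a", 2, "c"), ("c", 3, "a")]
    counts = {0: 1, 1: 1, 2: 1, 3: 1}
    w = relation_weights(counts, dom_counts={r: 1 for r in counts},
                         ran_counts={r: 1 for r in counts})
    paths = sample_closed_paths(triples, w, num_anchors=20, length=2)
    print("sampled (body, closing relation) pairs:", paths[:3])

    model = RuleHeadPredictor(num_relations=4)
    if paths:
        bodies = torch.tensor([b for b, _ in paths[:2]])
        att, head, conf = model(bodies)
        print("predicted heads:", head.tolist(), "confidence:", conf.tolist())
```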

4 Experiment

In this section, we conduct experiments on benchmark datasets to demonstrate the effectiveness of our method and explore the impact of using rules of different lengths on the experimental results. We first provide the experimental settings and then present a comparison of the experimental results between OIRL and other baselines.

4.1 Experiment Settings

Datasets. In our experiments, we evaluated our method using two widely used benchmark datasets: FB15K-237 [23] and WN18RR [6]. Both of these datasets are large-scale knowledge graphs. FB15K-237 is a dataset built upon Facebook's Freebase knowledge base. It is an improved version of the original FB15K dataset. The dataset consists of various relationship types, such as birthplace and nationality. WN18RR is a dataset constructed based on WordNet, which includes semantic relationships such as hypernyms and hyponyms. FB15K-237 includes explicit type information, which is extracted from the original dataset and not from external knowledge. WN18RR does not have explicit type information added.

Evaluation Metrics. We use rules to complete the KG completion task, and for the learned rules, we can use forward chaining [21] to deduce missing facts. For each test triple, we construct two queries (h, r, ?) and (t, r^{-1}, ?) to predict the missing entity. We use three evaluation metrics: Mean Reciprocal Rank (MRR), Hit@1, and Hit@10, which are commonly used in most existing studies.

Compared Algorithms. In our experiments, we compared the following algorithms: (1) Embedding-based methods: this category includes TransE [1], DistMult [29], ConvE [6], and ComplEx [24]. (2) Rule learning-based methods: this category includes Neural-LP [30], DRUM [20], RNNLogic [19], RLogic [3], and NCRL [2]. By incorporating embedding methods for comparison, we aim to demonstrate that, despite rule mining methods not being specifically designed for KG completion tasks, our approach can achieve results comparable to embedding methods.
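For reference, the ranking metrics above can be computed as follows once each query has a ranked list of candidate entities; this is a generic sketch of MRR and Hit@k (the ranking itself would come from applying the mined rules via forward chaining), not code from the paper, and the example data is made up.

```python
# Generic MRR / Hit@k computation over ranked candidate lists.
# `ranked` maps each query to an ordered list of predicted entities;
# `gold` maps each query to the correct entity. Data here is illustrative.
def mrr_and_hits(ranked, gold, ks=(1, 10)):
    rr_sum, hits = 0.0, {k: 0 for k in ks}
    for q, candidates in ranked.items():
        try:
            rank = candidates.index(gold[q]) + 1     # 1-based rank of the gold entity
        except ValueError:
            continue                                  # gold entity not predicted at all
        rr_sum += 1.0 / rank
        for k in ks:
            hits[k] += int(rank <= k)
    n = len(ranked)
    return rr_sum / n, {k: hits[k] / n for k in ks}

if __name__ == "__main__":
    ranked = {("h1", "r", "?"): ["e3", "e7", "e1"],
              ("h2", "r", "?"): ["e5", "e2"]}
    gold = {("h1", "r", "?"): "e7", ("h2", "r", "?"): "e5"}
    mrr, hits = mrr_and_hits(ranked, gold)
    print(f"MRR={mrr:.3f}, Hit@1={hits[1]:.2f}, Hit@10={hits[10]:.2f}")
```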

4.2 Results

Comparison Against Existing Methods. Table 1 shows the experimental results of our method on two large-scale datasets, FB15K-237 and WN18RR. Our method achieves state-of-the-art (SOTA) performance when compared to existing rule learning approaches. This demonstrates the effectiveness of the rules mined by OIRL. Existing methods fail to handle relations differently based on the varying amount of information contained in the TBox and ABox, which leads to an inability to fully exploit the pattern information in the entire graph. The experimental results indicate the effectiveness of introducing ontology and instance guidance. Compared to embedding-based reasoning, OIRL exhibits inductive properties. Inductive properties refer to the ability to extract features and patterns from training data and apply the acquired knowledge to unknown data. Logic rules, being independent of entities, naturally possess inherent inductive capabilities and can generalize to unseen entities. Logic rules also allow for tracing the steps of reasoning and the facts upon which the reasoning process relies, providing interpretability.

Table 1. Results of reasoning on FB15k-237 and WN18RR. H@k is in %. The bold numbers represent the best performances among all methods.

Methods        Models     FB15K-237                WN18RR
                          MRR   Hits@1  Hits@10    MRR   Hits@1  Hits@10
KGE            TransE     0.29  18.9    46.5       0.23  2.2     52.4
               DistMult   0.22  13.6    38.8       0.42  38.2    50.7
               ConvE      0.32  21.6    50.1       0.43  40.1    52.5
               ComplEx    0.24  15.8    42.8       0.44  41.0    51.2
Rule Learning  Neural-LP  0.24  17.3    36.2       0.38  36.8    40.8
               DRUM       0.23  17.4    36.4       0.38  36.9    41.0
               RNNLogic   0.29  20.8    44.5       0.46  41.4    53.1
               NCRL       0.30  20.9    47.3       0.67  56.3    85.0
               OIRL       0.32  21.9    49.7       0.68  58.0    85.2

Table 2. Performance w.r.t. Length of the rule. H@k is in %.

Length of the rule         FB15K-237                WN18RR
                           MRR   Hits@1  Hits@10    MRR   Hits@1  Hits@10
OIRL (Length of 2)         0.30  20.7    49.1       0.66  56.3    85.1
OIRL (Length of 3)         0.27  17.3    46.6       0.64  51.1    84.7
OIRL (Lengths of 2 and 3)  0.32  21.9    49.7       0.68  58.0    85.2

Performance W.r.t. Length of the Rule. According to our research findings, we discovered a correlation between rule length and experimental performance. Specifically, we investigated rules of length 2 and 3, comparing their impact on the experimental results. As shown in Table 2, we observed that rules of length 2 made the most significant contribution to performance. This indicates that shorter rules can effectively capture pattern information within the dataset. However, in comparison to exclusively using rules of length 2, we also found that employing rules of multiple lengths further enhances the system's performance.

Case Study of Logic Rules. Table 3 shows the Horn rules mined by OIRL in FB15k-237. It can be observed that logical rules, unlike neural network methods, possess interpretability and provide transparent and understandable audit trails for prediction results. Furthermore, in comparison to KGE methods, logical rules are not specific to particular entities.

Table 3. Case study of the rules generated by OIRL.

Confidence  Horn Rule
0.964       /people/person/nationality ← /base/biblioness/bibs_location/country ∧ /people/person/places_lived./people/place_lived/location
0.942       /people/person/languages ← /film/film/language ∧ /film/actor/film./film/performance/film
0.933       /film/film/language ← /location/country/official_language ∧ /film/film/country
0.901       /location/location/contains ← /location/hud_county_place/place ∧ /location/country/second_level_divisions

5 Conclusions

This paper investigates learning logical rules for knowledge graph reasoning. We propose a method called OIRL, which enhances rule learning on knowledge graphs through Joint Ontology and Instance Guidance. OIRL treats rules as a sequential model and uses information in both the TBox and ABox to guide path sampling. Different weights are assigned to different relations to incorporate more pattern information into the paths. By employing an end-to-end rule generator, we eliminate the need for a separate rule quality evaluation stage. Experimental results demonstrate that OIRL outperforms existing rule learning methods.

References 1. Bordes, A., Usunier, N., Garcia-Duran, A., Weston, J., Yakhnenko, O.: Translating embeddings for modeling multi-relational data. In: Advances in Neural Information Processing Systems, pp. 2787–2795 (2013) 2. Cheng, K., Ahmed, N.K., Sun, Y.: Neural compositional rule learning for knowledge graph reasoning. arXiv preprint arXiv:2303.03581 (2023) 3. Cheng, K., Liu, J., Wang, W., Sun, Y.: RLogic: recursive logical rule learning from knowledge graphs. In: SIGKDD, pp. 179–189 (2022) 4. Cheng, K., Yang, Z., Zhang, M., Sun, Y.: UniKER: a unified framework for combining embedding and definite horn rule reasoning for knowledge graph inference. In: EMNLP, pp. 9753–9771 (2021) 5. Cohen, W.W.: TensorLog: a differentiable deductive database. arXiv preprint arXiv:1605.06523 (2016)


6. Dettmers, T., Minervini, P., Stenetorp, P., Riedel, S.: Convolutional 2D knowledge graph embeddings. In: AAAI, pp. 1811–1818 (2018) 7. Gal´ arraga, L., Teflioudi, C., Hose, K., Suchanek, F.M.: Fast rule mining in ontological knowledge bases with AMIE+. VLDB J. 24(6), 707–730 (2015) 8. Gal´ arraga, L.A., Teflioudi, C., Hose, K., Suchanek, F.: AMIE: association rule mining under incomplete evidence in ontological knowledge bases. In: WWW, pp. 413–422 (2013) 9. Guo, S., Wang, Q., Wang, L., Wang, B., Guo, L.: Jointly embedding knowledge graphs and logical rules. In: EMNLP, pp. 192–202 (2016) 10. Guo, S., Wang, Q., Wang, L., Wang, B., Guo, L.: Knowledge graph embedding with iterative guidance from soft rules. In: AAAI, pp. 4816–4823 (2018) 11. Ji, G., He, S., Xu, L., Liu, K., Zhao, J.: Knowledge graph embedding via dynamic mapping matrix. In: IJCNLP, pp. 687–696 (2015) 12. Lin, Y., Liu, Z., Sun, M., Liu, Y., Zhu, X.: Learning entity and relation embeddings for knowledge graph completion. In: AAAI, pp. 2181–2187 (2015) 13. Meilicke, C., Chekol, M.W., Ruffinelli, D., Stuckenschmidt, H.: Anytime bottom-up rule learning for knowledge graph completion. In: IJCAI, pp. 3137–3143 (2019) 14. Muggleton, S.: Inductive Logic Programming, No. 38: Morgan Kaufmann (1992) 15. Nickel, M., Tresp, V., Kriegel, H.P., et al.: A three-way model for collective learning on multi-relational data. In: ICML, pp. 809–816, No. 10.5555 (2011) 16. Niu, G., Zhang, Y., Li, B., Cui, P., Liu, S., Li, J., Zhang, X.: Rule-guided compositional representation learning on knowledge graphs. In: AAAI, pp. 2950–2958, No. 03 (2020) 17. Omran, P.G., Wang, K., Wang, Z.: Scalable rule learning via learning representation. In: IJCAI, pp. 2149–2155 (2018) 18. Qian, W., Fu, C., Zhu, Y., Cai, D., He, X.: Translating embeddings for knowledge graph completion with relation attention mechanism. In: IJCAI, pp. 4286–4292 (2018) 19. Qu, M., Chen, J., Xhonneux, L.P., Bengio, Y., Tang, J.: RNNLogic: learning logic rules for reasoning on knowledge graphs. arXiv preprint arXiv:2010.04029 (2020) 20. Sadeghian, A., Armandpour, M., Ding, P., Wang, D.Z.: DRUM: end-to-end differentiable rule mining on knowledge graphs. In: NeurIPS, pp. 15347–15357 (2019) 21. Salvat, E., Mugnier, M.-L.: Sound and complete forward and backward chainings of graph rules. In: Eklund, P.W., Ellis, G., Mann, G. (eds.) ICCS-ConceptStruct 1996. LNCS, vol. 1115, pp. 248–262. Springer, Heidelberg (1996). https://doi.org/ 10.1007/3-540-61534-2 16 22. Sun, Z., Deng, Z.H., Nie, J.Y., Tang, J.: RotatE: knowledge graph embedding by relational rotation in complex space. arXiv preprint arXiv:1902.10197 (2019) 23. Toutanova, K., Chen, D.: Observed versus latent features for knowledge base and text inference. In: Proceedings of the 3rd workshop on continuous vector space models and their compositionality, pp. 57–66 (2015) ´ Bouchard, G.: Complex embed24. Trouillon, T., Welbl, J., Riedel, S., Gaussier, E., dings for simple link prediction. In: International Conference on Machine Learning, pp. 2071–2080. PMLR (2016) 25. Wang, Z., Zhang, J., Feng, J., Chen, Z.: Knowledge graph embedding by translating on hyperplanes. In: AAAI, pp. 1112–1119 (2014) 26. Wong, C.M., et al.: Improving conversational recommender system by pretraining billion-scale knowledge graph. In: ICDE, pp. 2607–2612. IEEE (2021) 27. Wu, H., Wang, Z., Wang, K., Shen, Y.D.: Learning typed rules over knowledge graphs. In: KR, pp. 494–503, No. 1 (2022)


28. Xiao, H., Huang, M., Hao, Y., Zhu, X.: TransG: a generative mixture model for knowledge graph embedding. arXiv preprint arXiv:1509.05488 (2015) 29. Yang, B., Yih, W.t., He, X., Gao, J., Deng, L.: Embedding entities and relations for learning and inference in knowledge bases. arXiv preprint arXiv:1412.6575 (2014) 30. Yang, F., Yang, Z., Cohen, W.W.: Differentiable learning of logical rules for knowledge base reasoning. In: Advances in Neural Information Processing Systems, vol. 30 (2017) 31. Yasunaga, M., Ren, H., Bosselut, A., Liang, P., Leskovec, J.: QA-GNN: reasoning with language models and knowledge graphs for question answering. In: NAACL, pp. 535–546 (2021) 32. Zhang, W., et al.: Iteratively learning embeddings and rules for knowledge graph reasoning. In: WWW, pp. 2366–2377 (2019)

Explore Across-Dimensional Feature Correlations for Few-Shot Learning

Tianyuan Li and Wenming Cao(B)

Shenzhen University, College of Electronics and Information Engineering, Nanhai, Shenzhen 518060, Guangdong, China
[email protected], [email protected]

Abstract. Few-shot learning (FSL) aims to learn new concepts with only a few examples, which is a challenging problem in deep learning. Attention-oriented methods have shown great potential in addressing the FSL problem. However, many of these methods tend to separately focus on different dimensions of the samples, like the channel dimension or spatial dimension only, which may lead to limitations in extracting discriminative features. To address this problem, we propose an across-dimensional attention network (ADANet) to explore the feature correlations across the channel and spatial dimensions of input samples. The ADANet can capture across-dimensional feature dependencies and produce reliable representations for the similarity metric. In addition, we also design a three-dimensional offset position encoding (TOPE) method that embeds the 3D position information into the across-dimensional attention, enhancing the robustness and generalization ability of the model. Extensive experiments on standard FSL benchmarks have demonstrated that our method can achieve better performances compared to the state-of-the-art approaches.

Keywords: Few-shot learning · Across-dimension · Position encoding · Attention mechanism

1 Introduction

In recent years, deep learning has achieved remarkable accomplishments in a wide range of tasks, such as image classification [2,16], object detection [12,14] and semantic segmentation [7,18]. Many of these achievements can be attributed to the ability of deep learning models to learn complex patterns and representations from large amounts of training data. However, despite the impressive performance, deep learning models encounter significant limitations when faced with situations where the availability of labeled training data is scarce. And this is where few-shot learning (FSL) comes into play, enabling models to learn new concepts or tasks with only few labeled examples [10]. The goal of FSL is to learn an efficient and flexible model that can adapt to new tasks or concepts quickly, without requiring a large amount of labeled data.


To better solve FSL, attention mechanisms have been widely used to identify the most relevant features of the few labeled samples and then utilize them to classify new samples. For example, CAN [6] produces the cross-attention map for each pair of class feature and query sample feature. SaberNet [11] introduces vision self-attention to generate and purify features of samples. However, many attention-based FSL methods often suffer from the limitation of analyzing single-dimensional attention (like channel or spatial attention) separately or combining them in a simple sequential manner, which can lead to limited capacity to capture complex relationships between spatial and channel dimensions. To address this problem, we propose an across-dimensional attention network (ADANet) that views channel and spatial dimensions as an integral unit and focuses on informative blocks along two dimensions. By jointly considering the features from spatial and channel dimensions, our model can better capture the complicated correlations hidden in the feature maps and generate more reliable representations. Besides, according to the characteristics of across-dimensional attention, we design a three-dimensional offset position encoding (TOPE) approach which allows our model to better utilize the position information of images and improve the generalization ability for FSL tasks. We evaluate our proposed method on standard FSL datasets and achieve high performance. Additionally, we conduct a series of ablation studies and the results demonstrate the effectiveness of our ADANet and TOPE method. The main contributions can be summarized as follows:

– We pay attention to the feature dependencies between spatial and channel dimensions, and propose an across-dimensional attention network to capture the complex feature correlations.
– We design a three-dimensional offset position encoding method for our attention network which can incorporate 3D position information into the across-dimensional attention.
– Experiments on standard FSL benchmarks show our method achieves excellent performance, and ablation studies validate the effectiveness of each component.

2 Related Work

2.1 Few-Shot Learning

Few-shot learning (FSL) aims to learn a model that can recognize new classes with only a few labeled examples [3,22]. Several families of approaches have been proposed. Metric-based methods aim to learn a similarity metric between examples and use nearest neighbor-based classification. For example, Prototypical Network [17] learns a prototype representation for each class by taking the mean of the class examples; during testing, the prototype closest to a query example determines the predicted class. DeepEMD [21] adopts the Earth Mover's Distance (EMD) as the metric to compute distances between image representations. ReNAP [9] adaptively learns class prototypes and produces more accurate representations for the learned distance metric. Optimization-based methods aim to learn a good model initialization that can be quickly adapted to new classes with only a few samples. For example, MAML [4] learns to optimize for fast adaptation during training and is fine-tuned to the new task with a few samples during testing. Reptile [15] iteratively updates the model's parameters towards the average gradient direction of tasks in the support set. Generative-based methods aim to learn a generative model that can produce new samples from a few labeled examples; the Variational Autoencoder (VAE) and the Generative Adversarial Network (GAN) are commonly used generative models.

2.2 Attention Mechanisms in Few-Shot Learning

In recent years, attention mechanisms have emerged as a promising approach for improving the performance of FSL algorithms. Hou et al. [6] develop a cross-attention module to generate cross-attention maps between support and query samples. The attention maps help highlight the important regions in images and make the extracted features more discriminative. SEGA [19] utilizes both visual and semantic prior knowledge to guide attention towards discriminative features of a new category, leading to more discriminative embeddings of the novel class even with few samples. In this work, we explore the feature correlations across different dimensions and propose an across-dimensional attention network (ADANet). Given a feature map, our model extracts local blocks around each pixel across the spatial and channel dimensions, and then calculates the across-dimensional attention exerted on every pixel. By cooperating with the three-dimensional offset position encoding (TOPE) method, our model is also able to leverage the position information of pixels. As a result, the feature correlations hidden across dimensions can be fully explored.

3 Methodology

3.1 Preliminary

For few-shot classification, datasets are typically divided into three parts: a train set D_train from classes C_train, a validation set D_val from classes C_val and a test set D_test from classes C_test, where C_train ∩ C_val ∩ C_test = ∅. All three parts consist of multiple episodes, each of which contains a support set S = {(x_s^l, y_s^l)}_{l=1}^{N×K} with only K image-label pairs for each of N classes, and a query set Q = {(x_q, y_q)}_{q=1}^{N×M} with M instances for each of the N classes. A single episode is called an N-way K-shot task. In the training stage, the FSL model is optimized using randomly sampled tasks from D_train. D_val is then used to select the model with the best performance during training. In the test stage, the selected model utilizes the knowledge learned from D_train and the support set S of an episode sampled from D_test to classify each x_q in the query set Q of the same episode as one of the N classes.
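To make the episodic protocol above concrete, the following is a minimal sketch of how an N-way K-shot episode with M queries per class could be sampled; the dataset layout (a dict mapping class labels to sample lists) is an illustrative assumption, not the authors' actual data pipeline.

```python
import random

def sample_episode(dataset, n_way=5, k_shot=1, m_query=15):
    """dataset: dict mapping class label -> list of samples (e.g. image paths)."""
    classes = random.sample(list(dataset.keys()), n_way)
    support, query = [], []
    for episode_label, c in enumerate(classes):
        chosen = random.sample(dataset[c], k_shot + m_query)
        support += [(x, episode_label) for x in chosen[:k_shot]]   # |S| = N*K
        query += [(x, episode_label) for x in chosen[k_shot:]]     # |Q| = N*M
    return support, query
```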


Fig. 1. The overall framework of ADANet. The feature extractor transforms query and support images to base feature maps, Fs and Fq . The preceding module expands the dimensions of feature maps and produces the intermediate representations, Is and Iq . The across-dimensional attention is computed in intermediate representations and then aggregated with the aggregation module. By combining with base feature maps, the attended representations Ts and Tq are generated for similarity calculation.

3.2 Overall Framework

Figure 1 illustrates the overall architecture of ADANet, which mainly consists of two parts: a feature extractor and an across-dimensional attention (ADA) module. Given a support set and a query image as input, the backbone feature extractor produces feature maps F_s and F_q ∈ R^{C×H×W}, where C denotes the channel size and H×W denotes the spatial size. The ADA module first performs an unfold operation along the spatial and channel dimensions, transforming the feature maps into intermediate representations I_s and I_q ∈ R^{C×H×W×V}, where V = v_c × v_h × v_w denotes the size of the across-dimensional block. Then the ADA module calculates the across-dimensional attention using the intermediate representations and incorporates the attention with the base feature maps to produce attended representations T_s and T_q ∈ R^{C×H×W}. We then use cosine similarity to compute the distances between the query and the support set. The query is classified as the class of its nearest support representation.

3.3 Three-Dimensional Offset Position Encoding (TOPE)

Position encoding allows neural networks to process spatial information in a way that is invariant to translation and rotation. The basic idea behind position encoding is to map the position information to a fixed-dimensional vector that captures the relevant patterns and relationships in the samples. There are two popular methods for position encoding: sinusoidal embeddings based on the absolute position, and relative positional embeddings. Early experiments suggested that the latter usually performs better. Following the idea of relative positional embeddings, we design a three-dimensional offset position encoding (TOPE) method using learned embeddings. Given a pixel p ∈ R^{1×1×1} in position (d, i, j) and a block B_v(d, i, j) centered around p, where d denotes the channel position and (i, j) denotes the spatial position, each position (z, x, y) ∈ B_v(d, i, j) receives three relative distances: a channel offset z−d, a spatial row offset x−i and a spatial column offset y−j. The three offsets are associated with embeddings r_{z−d}, r_{x−i} and r_{y−j}, respectively. By concatenating the three learned embeddings, we get the 3D offset embedding e_{(z,x,y)} at position (z, x, y). Figure 2 shows an example of TOPE with block size V = 3 × 3 × 3.
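The following is a minimal PyTorch sketch of how such learned 3D offset embeddings could be built for a v_c × v_h × v_w block; the per-axis embedding size is an illustrative assumption.

```python
import torch
import torch.nn as nn

class TOPE(nn.Module):
    """Sketch of three-dimensional offset position encoding: one learned
    embedding per channel / row / column offset, concatenated into e_(z,x,y)."""
    def __init__(self, vc=3, vh=3, vw=3, d_axis=16):
        super().__init__()
        self.r_z = nn.Embedding(vc, d_axis)   # channel offsets z - d
        self.r_x = nn.Embedding(vh, d_axis)   # row offsets x - i
        self.r_y = nn.Embedding(vw, d_axis)   # column offsets y - j
        self.vc, self.vh, self.vw = vc, vh, vw

    def forward(self):
        z, x, y = torch.meshgrid(torch.arange(self.vc), torch.arange(self.vh),
                                 torch.arange(self.vw), indexing="ij")
        # concatenate the three per-axis embeddings for every block position
        e = torch.cat([self.r_z(z.reshape(-1)),
                       self.r_x(x.reshape(-1)),
                       self.r_y(y.reshape(-1))], dim=-1)
        return e   # shape (vc*vh*vw, 3*d_axis): one embedding per offset (z, x, y)
```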


Fig. 2. left: Pipeline of across-dimensional attention module; right: an example of three-dimensional offset position encoding with local block size V = 3 × 3 × 3.

3.4 Across-Dimensional Attention (ADA)

The ADA module takes the base feature map F and transforms it to focus on the relevant regions across the channel and spatial dimensions of an image, preparing an informative and reliable attended representation for the metric. Figure 2 illustrates the pipeline of the ADA module, which contains a preceding module, an across-dimensional attention calculation (ADAC) module and an aggregation module. The preceding module expands the dimensions of the input feature map F ∈ R^{C×H×W} and produces the intermediate representation I ∈ R^{C×H×W×V} by extracting the local block B ∈ R^V around every pixel p ∈ R^{1×1×1}, where V = v_c × v_h × v_w, v_c denotes the channel extent and v_h × v_w denotes the spatial extent. Then the positional encoding module inserts the 3D offset embeddings into every block, and the ADAC module computes the across-dimensional attention exerted on every pixel. After that, the aggregation module integrates the feature correlations across different dimensions and obtains the attended representation T ∈ R^{C×H×W} by combining with the base feature map. The detailed calculation process is as follows. Given a pixel f_{(d,i,j)} ∈ R^{1×1×1} from the feature map and its local block of pixels in positions (z, x, y) ∈ B_v(d, i, j), the attention between the central pixel and a pixel from the local block is computed as follows:

q_{(d,i,j)} = W_Q f_{(d,i,j)},   (1)

k_{(z,x,y)} = W_K f_{(z,x,y)},   (2)

v_{(z,x,y)} = W_V f_{(z,x,y)},   (3)

a_{(z,x,y)} = softmax( q_{(d,i,j)}^T (k_{(z,x,y)} + e_{(z,x,y)}) ) v_{(z,x,y)},   (4)

where q_{(d,i,j)} is the query, k_{(z,x,y)} is the key and v_{(z,x,y)} is the value in the attention calculation. W_Q, W_K and W_V are all independent linear transformation layers. e_{(z,x,y)} is the 3D offset embedding in position (z, x, y). Then the across-dimensional attention can be derived by summarizing the pixel attention from the local block:

m_{(d,i,j)} = Σ_{(z,x,y)∈B_v(d,i,j)} a_{(z,x,y)},   (5)

m_{(d,i,j)} is a single pixel carrying across-dimensional attention from its surrounding block. The computation of m_{(d,i,j)} is applied to every pixel along the channel and spatial dimensions.
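The following is a minimal PyTorch sketch of Eqs. (1)-(5) applied to a whole feature map. It makes several simplifying assumptions not stated in the paper: W_Q/W_K/W_V are implemented as 1×1 convolutions, the query-key interaction is a per-channel scalar product, and the offset embedding e_(z,x,y) is reduced to one learned scalar per block position.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ADAC(nn.Module):
    """Sketch of across-dimensional attention over v_c x v_h x v_w local blocks."""
    def __init__(self, channels, block=(3, 3, 3)):
        super().__init__()
        self.block = block
        self.q = nn.Conv2d(channels, channels, 1)      # stand-in for W_Q
        self.k = nn.Conv2d(channels, channels, 1)      # stand-in for W_K
        self.v = nn.Conv2d(channels, channels, 1)      # stand-in for W_V
        vc, vh, vw = block
        # e_(z,x,y): one learned scalar bias per block position (simplified TOPE)
        self.e = nn.Parameter(torch.zeros(vc * vh * vw))

    def _unfold3d(self, x):
        vc, vh, vw = self.block
        pc, ph, pw = vc // 2, vh // 2, vw // 2
        # pad channel, height and width so every pixel gets a full local block
        x = F.pad(x, (pw, pw, ph, ph, pc, pc))
        x = x.unfold(1, vc, 1).unfold(2, vh, 1).unfold(3, vw, 1)
        return x.reshape(*x.shape[:4], -1)             # (B, C, H, W, V)

    def forward(self, feat):
        q = self.q(feat)                               # (B, C, H, W)
        k = self._unfold3d(self.k(feat))               # (B, C, H, W, V)
        v = self._unfold3d(self.v(feat))
        a = torch.softmax(q.unsqueeze(-1) * k + self.e, dim=-1)   # Eq. (4)
        m = (a * v).sum(dim=-1)                        # Eq. (5)
        return m + feat                                # aggregate with the base feature map
```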

3.5 Learning Objective

We jointly train the proposed model by combining two losses: the linear loss L_linear and the task loss L_task. L_linear is calculated with a fully-connected classification layer:

L_linear = −log [ exp(w_l^T GAP(T_q) + b_l) / Σ_{l'=1}^{C_train} exp(w_{l'}^T GAP(T_q) + b_{l'}) ],   (6)

where T_q is the attended query representation, w_1, w_2, ..., w_{C_train} are the weights of the fully-connected layer, [b_1, b_2, ..., b_{C_train}] are its biases, and GAP is the global average pooling operation. L_task is the loss function for the N-way K-shot task and is calculated by the cosine similarity between the support prototypes and the query attended representations. We first average the labeled support attended representations of every category to produce the support prototype T_s:

T_s = (1/K) Σ_{l=1}^{K} T_s^l,   (7)


where K is the number of support samples from every class. Then L_task is computed as:

L_task = −log [ exp(cos(GAP(T_s^{(n)}), GAP(T_q))) / Σ_{n'=1}^{N} exp(cos(GAP(T_s^{(n')}), GAP(T_q))) ],   (8)

where cos is the cosine similarity, N is the number of classes in an N-way K-shot task, and n denotes the n-th class of the N classes. The overall classification loss is defined as:

L = αL_linear + βL_task,   (9)

where the hyper-parameters α and β balance the effect of the different loss terms. The network can be trained by optimizing L with a gradient descent algorithm. The detailed training workflow is shown in Algorithm 1.

Algorithm 1: The workflow of our proposed ADANet.

Input: Train set D_train, initial network parameters Φ, number of epochs m_epoch
Output: The learned parameters
1:  Randomly initialize Φ.
2:  while Φ has not converged do
3:    for i = 1, 2, ..., m_epoch do
4:      Sample task {S, Q} from D_train
5:      for each task t, t = 1, 2, ..., n do
6:        Obtain F_s and F_q from the feature extractor
7:        Get T_s and T_q from the ADA module
8:        Compute the linear loss L_linear with Eq. 6
9:        Compute the task loss L_task with Eq. 8
10:       Calculate the overall classification loss L with Eq. 9
11:       Calculate the gradients ∇_Φ L and adopt the SGD algorithm to update Φ
12:     end for
13:   end for
14: end while
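As a companion to Algorithm 1, here is a minimal PyTorch sketch of one episode's loss computation (Eqs. 6-9). The `backbone`, `ada` and `fc` callables, the loss weights, and the assumption that support/query tensors are grouped by class are illustrative choices, not the authors' exact implementation.

```python
import torch
import torch.nn.functional as F

def episode_loss(backbone, ada, fc, support_x, query_x, query_y,
                 n_way, k_shot, alpha=1.0, beta=1.0):
    """query_y holds global C_train labels for L_linear; tensors are assumed
    to be grouped class-by-class within the episode."""
    t_s = ada(backbone(support_x))                         # (N*K, C, H, W)
    t_q = ada(backbone(query_x))                           # (N*M, C, H, W)
    gap_s = F.adaptive_avg_pool2d(t_s, 1).flatten(1)       # GAP
    gap_q = F.adaptive_avg_pool2d(t_q, 1).flatten(1)

    # L_linear: C_train-way classification of the attended query (Eq. 6)
    loss_linear = F.cross_entropy(fc(gap_q), query_y)

    # L_task: cosine similarity to class prototypes (Eqs. 7-8)
    protos = gap_s.view(n_way, k_shot, -1).mean(dim=1)     # (N, C)
    logits = F.cosine_similarity(gap_q.unsqueeze(1), protos.unsqueeze(0), dim=-1)
    episode_labels = torch.arange(n_way).repeat_interleave(query_x.size(0) // n_way)
    loss_task = F.cross_entropy(logits, episode_labels)

    return alpha * loss_linear + beta * loss_task          # Eq. (9)
```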

4 Experiments

In this section, we evaluate ADANet on standard FSL benchmarks and compare the results with the recent state-of-the-art methods. We also conduct ablation studies to validate the effect of the major components.


4.1 Experiment Setup

We use standard FSL benchmarks, miniImageNet, tieredImageNet, CUB-200-2011 (CUB) and CIFAR-FS, to conduct the extensive experiments. We apply ResNet-12 as our backbone, following recent FSL methods. The spatial size of input images is 84×84. We use stochastic gradient descent (SGD) with a weight decay of 0.0005 and a momentum of 0.9 to optimize our model. We train for 90 epochs with an initial learning rate of 0.1 and a decay factor of 0.05 at the 60th, 70th and 80th epochs. During testing, we randomly sample 10000 test episodes with 15 query images per class in each episode. We calculate and report the average accuracy with 95% confidence intervals over the 10000 episodes. All experiments are conducted with PyTorch on one NVIDIA 3090Ti GPU.

Table 1. Results on miniImageNet and tieredImageNet datasets.

method | backbone | miniImageNet 5-way 1-shot | miniImageNet 5-way 5-shot | tieredImageNet 5-way 1-shot | tieredImageNet 5-way 5-shot
PPA | WRN-28-10 | 59.60 ± 0.41 | 73.74 ± 0.19 | 65.65 ± 0.92 | 83.40 ± 0.65
MTL | ResNet-12 | 61.20 ± 1.80 | 75.50 ± 0.80 | – | –
LEO | WRN-28-10 | 61.76 ± 0.08 | 77.59 ± 0.12 | 66.33 ± 0.05 | 81.44 ± 0.09
ProtoNet [17] | ResNet-12 | 62.39 ± 0.21 | 80.53 ± 0.14 | 68.23 ± 0.23 | 84.03 ± 0.16
MetaOptNet | ResNet-12 | 62.64 ± 0.82 | 78.63 ± 0.46 | 65.99 ± 0.72 | 81.56 ± 0.53
SimpleShot | ResNet-18 | 62.85 ± 0.20 | 80.02 ± 0.14 | – | –
S2M2 [13] | ResNet-34 | 63.74 ± 0.18 | 79.45 ± 0.12 | – | –
CAN [6] | ResNet-12 | 63.85 ± 0.81 | 81.57 ± 0.56 | 69.89 ± 0.51 | 84.23 ± 0.37
CTM | ResNet-18 | 64.12 ± 0.82 | 80.51 ± 0.13 | 68.41 ± 0.39 | 84.28 ± 1.73
DeepEMD [21] | ResNet-12 | 65.91 ± 0.82 | 82.41 ± 0.56 | 71.16 ± 0.87 | 86.03 ± 0.58
RENet [8] | ResNet-12 | 67.60 ± 0.44 | 82.58 ± 0.30 | 71.61 ± 0.51 | 85.28 ± 0.35
TPMM | ResNet-12 | 67.64 ± 0.63 | 83.44 ± 0.43 | 72.24 ± 0.70 | 86.55 ± 0.63
SUN | ViT-S | 67.80 ± 0.45 | 83.25 ± 0.30 | 72.99 ± 0.50 | 86.74 ± 0.33
FewTRUE [5] | ViT-S | 68.02 ± 0.88 | 84.51 ± 0.53 | 72.96 ± 0.92 | 86.43 ± 0.67
SetFeat [1] | ResNet-12 | 68.32 ± 0.62 | 82.71 ± 0.46 | 73.63 ± 0.88 | 87.59 ± 0.57
ADANet (ours) | ResNet-12 | 69.56 ± 0.19 | 85.32 ± 0.13 | 74.15 ± 0.20 | 87.46 ± 0.16

4.2 Comparison with State-of-the-art

To evaluate the effectiveness of our approach, we conduct comparison experiments with state-of-the-art methods on four FSL datasets. The experimental results are shown in Tables 1 and 2. Under the 5-way 1-shot setting, our ADANet achieves better performance than previous methods on all datasets. Under the 5-way 5-shot setting, our model improves accuracy by 2.61% on miniImageNet, 0.79% on CUB-200-2011 and 1.8% on CIFAR-FS, and obtains results competitive with SetFeat on tieredImageNet. Note that we use a smaller backbone than some methods (WRN-28-10, ResNet-34, etc.), yet we obtain an obvious increase in accuracy under both task settings.


Table 2. Results on CUB and CIFAR-FS datasets.

method | backbone | CUB 5-way 1-shot | CUB 5-way 5-shot | CIFAR-FS 5-way 1-shot | CIFAR-FS 5-way 5-shot
ProtoNet [17] | ResNet-12 | 66.09 ± 0.92 | 82.50 ± 0.58 | 72.20 ± 0.70 | 83.50 ± 0.50
RelationNet | ResNet-34 | 66.20 ± 0.99 | 82.30 ± 0.58 | – | –
MAML [4] | ResNet-34 | 67.28 ± 1.08 | 83.47 ± 0.59 | – | –
S2M2 [13] | ResNet-34 | 72.92 ± 0.83 | 86.55 ± 0.51 | 62.77 ± 0.23 | 75.75 ± 0.13
FEAT [20] | ResNet-12 | 73.27 ± 0.22 | 85.77 ± 0.14 | – | –
RENet [8] | ResNet-12 | 79.49 ± 0.44 | 91.11 ± 0.24 | 74.51 ± 0.46 | 86.60 ± 0.32
FewTRUE [5] | ViT-S | – | – | 76.10 ± 0.88 | 86.14 ± 0.64
ADANet (ours) | ResNet-12 | 81.82 ± 0.18 | 91.90 ± 0.10 | 76.54 ± 0.20 | 87.94 ± 0.15

4.3 Ablation Studies

To investigate the effects of the modules in ADANet, we conduct ablation and comparison experiments on the CUB and CIFAR-FS datasets under the 5-way 1-shot setting. We apply the ResNet-12 backbone with a cosine similarity metric as the baseline. Table 3 presents the experimental results. Single-dimensional attention (SE and stand-alone) achieves a slight improvement over the baseline. CBAM tends to perform better than single-dimensional attention, but it merely arranges channel and spatial attention sequentially and cannot capture the complex feature correlations between dimensions. In our work, the feature correlations across different dimensions can be fully explored by adding the ADA module, and the classification accuracy is improved by 2.87% on CUB and by 1.42% on CIFAR-FS. The TOPE module is designed to encode the 3D position information into the across-dimensional attention. By incorporating TOPE with ADA, the accuracy increases by 3.76% on CUB and by 2.57% on CIFAR-FS compared with the baseline.

Table 3. Ablation and comparison experiments on CUB and CIFAR-FS datasets.

method | attention type | position encode | CIFAR-FS | CUB
baseline | none | × | 73.97 | 78.06
(others) SE | channel | × | 73.91 | 78.81
(others) stand-alone | spatial | ✓ | 74.56 | 78.62
(others) CBAM | channel + spatial | × | 74.27 | 79.33
(ours) baseline+ADA | across-dimension | × | 75.39 | 80.93
(ours) baseline+ADA+TOPE | across-dimension | ✓ | 76.54 | 81.82

4.4 Convergence Analysis

We record the loss and accuracy during the training and validation stages. Figure 3(a) shows the loss curves of the models on miniImageNet under the 5-way 1-shot setting; the orange curve denotes the baseline model and the blue one denotes our model. We adopt multiple tasks to validate the learned model, so only L_task is recorded during validation. During training, for both models, the loss drops rapidly in the first ten epochs and then gradually flattens out. At the 60th epoch, with the adjustment of the learning rate, there is a brief, sharp decrease, after which the loss curves remain flat until the end. Note that the loss of ADANet drops and flattens faster than that of the baseline, which indicates that our model has better optimization behavior. During validation, the loss (L_task only) is clearly smaller than the training loss, but the validation loss curves follow trends similar to the training phase. Figure 3(b) shows the accuracy trends of the models on miniImageNet under the 5-way 1-shot setting. It is clear that ADANet outperforms the baseline in both training and validation.

Fig. 3. Loss and accuracy of the baseline model and ADANet on miniImageNet.

4.5 Visualization

To qualitatively analyze the effects of the proposed attention mechanism, we visualize the outputs of the feature extractor and of the across-dimensional attention (ADA) module. We randomly select five images from different categories and another image belonging to one of these categories. These images are all from the miniImageNet dataset and compose a 5-way 1-shot task. We then feed this task into a well-trained ADANet to produce Grad-CAM visualization results. As shown in Fig. 4, the feature maps obtained from the feature extractor may contain non-target objects, which can affect classification accuracy. After adopting the ADA module, the attended representations significantly improve the focus on target objects and alleviate the distraction of non-target objects. The visualization results illustrate that, by capturing the feature correlations across different dimensions, ADANet can generate more discriminative representations for few-shot classification.

Fig. 4. Grad-CAM visualization of a 5-way 1-shot task on the miniImageNet dataset.

5 Conclusion

In this work, we propose an across-dimensional attention network (ADANet) to explore the feature correlations between different dimensions. The ADANet extracts local informative blocks and captures across-dimensional feature dependencies for the base feature maps. Additionally, we design a three-dimensional offset position encoding (TOPE) method to encode the 3D offset embedding into the across-dimensional attention, which ensures that the position information can be fully utilized. Extensive experiments show that our method outperforms most recent FSL approaches and that both modules of ADANet make significant contributions. In future work, we plan to investigate the use of our model in other FSL tasks, such as few-shot object detection and semantic segmentation.

Acknowledgements. This work is supported by the National Natural Science Foundation of China, No. 61771322 and No. 61971290, the Shenzhen Foundation for Basic Research JCYJ20220531100814033, and the Shenzhen Stability Support General Project (Category A) 20200826104014001.

References 1. Afrasiyabi, A., Larochelle, H., Lalonde, J.F., Gagn´e, C.: Matching feature sets for few-shot image classification. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 9014–9024 (2022) 2. Chang, D., et al.: The devil is in the channels: mutual-channel loss for fine-grained image classification. IEEE Trans. Image Process. 29, 4683–4695 (2020) 3. Cheng, Y., et al.: Know you at one glance: a compact vector representation for lowshot learning. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 1924–1932 (2017)


4. Finn, C., Abbeel, P., Levine, S.: Model-agnostic meta-learning for fast adaptation of deep networks. In: International conference on machine learning, pp. 1126–1135. PMLR (2017) 5. Hiller, M., Ma, R., Harandi, M., Drummond, T.: Rethinking generalization in fewshot classification. arXiv preprint arXiv:2206.07267 (2022) 6. Hou, R., Chang, H., Ma, B., Shan, S., Chen, X.: Cross attention network for fewshot classification. In: Advances in Neural Information Processing Systems, vol. 32 (2019) 7. Hu, H., Cui, J., Zha, H.: Boundary-aware graph convolution for semantic segmentation. In: 2020 25th International Conference on Pattern Recognition (ICPR), pp. 1828–1835. IEEE (2021) 8. Kang, D., Kwon, H., Min, J., Cho, M.: Relational embedding for few-shot classification. In: Proceedings of the IEEE/CVF International Conference on Computer Vision. pp. 8822–8833 (2021) 9. Li, X., et al.: ReNAP: relation network with adaptiveprototypical learning for fewshot classification. Neurocomputing 520, 356–364 (2023) 10. Li, X., Yang, X., Ma, Z., Xue, J.H.: Deep metric learning for few-shot image classification: a review of recent developments. Pattern Recogn. 138, 109381 (2023) 11. Li, Z., Hu, Z., Luo, W., Hu, X.: SaberNet: self-attention based effective relation network for few-shot learning. Pattern Recogn. 133, 109024 (2023) 12. Liang, J., Chen, H., Du, K., Yan, Y., Wang, H.: Learning intra-inter semantic aggregation for video object detection. In: Proceedings of the 2nd ACM International Conference on Multimedia in Asia, pp. 1–7 (2021) 13. Mangla, P., Kumari, N., Sinha, A., Singh, M., Krishnamurthy, B., Balasubramanian, V.N.: Charting the right manifold: manifold mixup for few-shot learning. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 2218–2227 (2020) 14. Mi, P., et al.: Active teacher for semi-supervised object detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 14482–14491 (2022) 15. Nichol, A., Schulman, J.: Reptile: a scalable metalearning algorithm. arXiv preprint arXiv:1803.02999 2(3), 4 (2018) 16. Qian, X., Hang, R., Liu, Q.: ReX: an efficient approach to reducing memory cost in image classification. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 2099–2107 (2022) 17. Snell, J., Swersky, K., Zemel, R.: Prototypical networks for few-shot learning. In: Advances in Neural Information Processing Systems, vol. 30 (2017) 18. Xu, M., Zhang, Z., Wei, F., Hu, H., Bai, X.: Side adapter network for openvocabulary semantic segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 2945–2954 (2023) 19. Yang, F., Wang, R., Chen, X.: SEGA: semantic guided attention on visual prototype for few-shot learning. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1056–1066 (2022) 20. Ye, H.J., Hu, H., Zhan, D.C., Sha, F.: Few-shot learning via embedding adaptation with set-to-set functions. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp. 8808–8817 (2020)


21. Zhang, C., Cai, Y., Lin, G., Shen, C.: DeepEMD: few-shot image classification with differentiable earth mover’s distance and structured classifiers. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp. 12203– 12213 (2020) 22. Zhao, F., Zhao, J., Yan, S., Feng, J.: Dynamic conditional networks for fewshot learning. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 19–35 (2018)

Pairwise-Emotion Data Distribution Smoothing for Emotion Recognition

Hexin Jiang1, Xuefeng Liang1,2(B), Wenxin Xu2, and Ying Zhou1

1 School of Artificial Intelligence, Xidian University, Xi'an, China
{hxjiang,yingzhou}@stu.xidian.edu.cn
2 Guangzhou Institute of Technology, Xidian University, Guangzhou, China
[email protected], [email protected]

Abstract. In speech emotion recognition tasks, models learn emotional representations from datasets. We find that the data distribution in the IEMOCAP dataset is very imbalanced, which may prevent models from learning a better representation. To address this issue, we propose a novel Pairwise-emotion Data Distribution Smoothing (PDDS) method. PDDS considers that the distribution of emotional data should be smooth in reality, and therefore applies Gaussian smoothing to emotion-pairs to construct a new training set with a smoother distribution. The required new data are complemented using mixup augmentation. As PDDS is model and modality agnostic, it is evaluated with three state-of-the-art models on two benchmark datasets. The experimental results show that these models are improved by 0.2% ∼ 4.8% and 0.1% ∼ 5.9% in terms of weighted accuracy and unweighted accuracy, respectively. In addition, an ablation study demonstrates that the key advantage of PDDS is the reasonable data distribution rather than a simple data augmentation.

Keywords: Pairwise-emotion Data Distribution Smoothing · Gaussian Smoothing · Mixup Augmentation · Model-modality Agnostic

1 Introduction

Speech emotion recognition (SER) is of great significance to understanding human communication. SER techniques have been applied to many fields, such as video understanding [13], human-computer interaction [8], mobile services [16] and call centers [14]. Up to now, plenty of deep learning based SER methods have been proposed [20,21,27]. Recently, multimodal emotion recognition has attracted more attention [3,9,17,23] because of its richer representation and better performance. Most of these studies assigned one-hot labels to utterances. In fact, experts often have inconsistent cognition of emotional data, and thus the one-hot label is obtained by majority voting. Figure 1(a) shows the emotional data distribution in the IEMOCAP dataset [6]. (This work was supported in part by the Guangdong Provincial Key Research and Development Programme under Grant 2021B0101410002.)


Fig. 1. (a) Data distribution in the IEMOCAP when the label is one-hot (ang, angry; hap, happiness; neu, neutral state; sad, sadness; exc, excited; fru, frustration). (b) The statistics of clear and ambiguous samples in the IEMOCAP. Each vertex represents an emotion category with only clear samples. The bold number is the quantity of clear samples. Each directed edge denotes a set of ambiguous samples, whose tail is the major emotion with most votes and the head is the minor emotion with fewer votes. The quantity of ambiguous samples is on the edge.

Later, some studies argued that the one-hot label might not represent emotions well. They either addressed this problem through multi-task learning [19], soft labels [1,11], multi-labels [2] and polarity labels [7], or enhanced the model's ability to learn ambiguous emotions by dynamic label correction [12], progressive co-teaching [24] and label reconstruction through interactive learning [26]. These works usually define samples with consistent experts' votes as clear samples and those with inconsistent votes as ambiguous samples. The statistics of clear and ambiguous samples in the IEMOCAP dataset are shown in Fig. 1(b). After statistical analysis, we find that the data distribution is imbalanced, especially for ambiguous samples. For example, the quantity of ambiguous samples between anger and frustration is abundant, while that between anger and sadness is very small. Meanwhile, the quantity of clear samples of happiness is much smaller than that of other clear emotion categories. We consider such a distribution unreasonable, as it may prevent the model from learning a better emotional representation. A possible reason is that the votes come from few experts, which is a rather sparse sampling of the human population. We believe that clear and ambiguous emotional data should follow a smooth statistical distribution in the real world. Based on this assumption, we propose Pairwise-emotion Data Distribution Smoothing (PDDS) to address the problem of unreasonable data distribution. PDDS applies Gaussian smoothing to the data distribution between clear emotion-pairs, which augments ambiguous samples up to reasonable quantities and meanwhile balances the quantities of clear samples so that all categories are close to each other. Figure 2 shows the data distributions before and after smoothing.

Fig. 2. The data distributions before and after smoothing. The purple bars represent clear samples and the orange bars represent ambiguous samples. (Color figure online)

To complement the missing data, we use a feature-level mixup between clear samples to augment the data. As PDDS is model and modality agnostic, we evaluate it on three state-of-the-art (SOTA) methods, as well as on unimodal and bimodal data, on two benchmark datasets. The results show that these models are improved by 0.2% ∼ 4.8% and 1.5% ∼ 5.9% in terms of weighted accuracy and unweighted accuracy. Our proposed CLTNet achieves the best performance. The ablation study reveals that the superior performance of PDDS stems from the reasonable data distribution rather than simply increasing the data size. The contributions of this work are threefold:

1. We study the imbalance of the distribution of clear and ambiguous samples in existing speech emotion datasets, which harms models' ability to learn a better representation.
2. We propose Pairwise-emotion Data Distribution Smoothing (PDDS) to construct a training set with a more reasonable distribution by applying Gaussian smoothing to emotion-pairs and complementing the required data using mixup augmentation.
3. Extensive experiments demonstrate that our proposed PDDS is model and modality agnostic and considerably improves three SOTA methods on two benchmark datasets. We also propose a model called CLTNet, which achieves the best performance.

2 Method

Figure 3 shows that our proposed framework includes three modules: (1) Preprocessing module. It extracts audio features using the pre-trained data2vec [4] and text features using the pre-trained Bert [10]. (2) Pairwise-emotion Data Distribution Smoothing module (PDDS). It smooths the unreasonable data distribution, and is a plug-in module that can be applied to other SER methods. (3) CLTNet: a proposed model for utterance-level multimodal emotion recognition.
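For concreteness, here is a minimal sketch of the preprocessing module using the HuggingFace transformers library; the checkpoint names, pooling and padding details are illustrative assumptions rather than the authors' exact configuration.

```python
import torch
from transformers import AutoFeatureExtractor, Data2VecAudioModel, BertTokenizer, BertModel

audio_fe = AutoFeatureExtractor.from_pretrained("facebook/data2vec-audio-base-960h")
audio_model = Data2VecAudioModel.from_pretrained("facebook/data2vec-audio-base-960h")
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
text_model = BertModel.from_pretrained("bert-base-uncased")

@torch.no_grad()
def extract_features(waveform, sample_rate, transcript):
    # frame-level acoustic features from the pre-trained data2vec encoder
    audio_in = audio_fe(waveform, sampling_rate=sample_rate, return_tensors="pt")
    x_a = audio_model(**audio_in).last_hidden_state       # (1, t_A, d_A)
    # word-level text features from the pre-trained BERT encoder
    text_in = tokenizer(transcript, return_tensors="pt")
    x_t = text_model(**text_in).last_hidden_state          # (1, t_T, 768)
    return x_a, x_t
```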


Fig. 3. The framework of our approach. It consists of three modules: (1) Data preprocessing; (2) Pairwise-emotion Data Distribution Smoothing (PDDS); (3) CLTNet.

2.1 Pairwise Data Distribution Smoothing

Quantity Smoothing. Suppose there are c emotions in the dataset. As the label of an ambiguous sample often comes from two distinct emotions, we construct a clear-ambiguous-clear distribution for the population of four types of samples between every two emotions i and j, where i, j ∈ {1, ..., c}, i ≠ j. They are clear samples I containing only emotion i, ambiguous samples I_J containing major emotion i and minor emotion j, ambiguous samples J_I containing major emotion j and minor emotion i, and clear samples J containing only emotion j, as shown in Fig. 4. We think that the quantity distribution of these four types of samples in every emotion-pair should be statistically smooth, so a Gaussian kernel is convolved with the distribution to obtain a smoothed version:

n'_k = [ Σ_{k'∈K} e^{−|k−k'|²/(2σ²)} n_{k'} ] / [ Σ_{k'∈K} e^{−|k−k'|²/(2σ²)} ],   (1)

where K = {k−1, k, k+1} denotes the indexes of {I, I_J, J_I} or {I_J, J_I, J} when k is the index of I_J or J_I, respectively. The indexes of I, I_J, J_I, J are 0, 1, 2, 3. n_k is the quantity of samples of type k before smoothing, and n'_k is the quantity after smoothing. For instance, there are four types of samples for the emotion-pair of happiness and sadness: hap, hap_sad, sad_hap and sad. We first convert the four types into integer indexes 0, 1, 2 and 3, with corresponding sample quantities n_0, n_1, n_2 and n_3. We can then calculate the number of ambiguous samples (hap_sad and sad_hap) after smoothing according to Eq. 1, with K = {0, 1, 2} when k = 1 and K = {1, 2, 3} when k = 2. For clear samples, if their quantity in an emotion category is too small, they are augmented until the quantity reaches that of the other categories. The smoothed quantity distribution of all emotion-pairs is shown in Fig. 2(b).
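The following is a minimal NumPy sketch of the quantity smoothing of one emotion-pair (Eq. 1); the example counts at the bottom are hypothetical, not values from the dataset.

```python
import numpy as np

def smooth_pair_counts(n, sigma=1.0):
    """n = [n_I, n_IJ, n_JI, n_J]; only the ambiguous entries (k = 1, 2)
    are re-estimated with a normalized Gaussian kernel over K = {k-1, k, k+1}."""
    n = np.asarray(n, dtype=float)
    smoothed = n.copy()
    for k in (1, 2):
        K = np.array([k - 1, k, k + 1])
        w = np.exp(-np.abs(k - K) ** 2 / (2.0 * sigma ** 2))
        smoothed[k] = np.round((w * n[K]).sum() / w.sum())
    return smoothed

# hypothetical counts for one emotion-pair: [clear I, I_J, J_I, clear J]
print(smooth_pair_counts([400, 40, 15, 280]))
```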


Fig. 4. An example of smoothing the quantity distribution of an emotion-pair.

Mixup Augmentation. After smoothing the data distribution, the quantities of some ambiguous samples in the original dataset are less than expected. To complete the data, a feature-level data augmentation, Mixup [25], is applied to augment those samples. Mixup can improve the generalization ability of the model by generating new training samples that linearly combine two clear samples and their labels. It follows the rules

x_mix = p x_α + (1 − p) x_β,   (2)

y_mix = p y_α + (1 − p) y_β,  p ∈ [0, 1],   (3)

where x_α and x_β are the features of a clear sample of the major emotion and a clear sample of the minor emotion, respectively, and p is a parameter that controls the proportion of each emotion in the augmented sample. x_mix is the feature of the new sample. y_α and y_β are the one-hot labels of x_α and x_β, and y_mix is the label distribution of the new sample. To avoid undersampling, we use the original data when the original quantity of ambiguous samples already meets or exceeds the smoothed distribution.
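A minimal sketch of Eqs. (2)-(3); the default value of p here is illustrative (the effect of p is studied later in the paper).

```python
import numpy as np

def mixup_ambiguous(x_major, y_major, x_minor, y_minor, p=0.8):
    """Feature-level mixup of two clear samples (major / minor emotion)
    to synthesise an ambiguous sample with a soft label distribution."""
    x_mix = p * np.asarray(x_major) + (1 - p) * np.asarray(x_minor)   # Eq. (2)
    y_mix = p * np.asarray(y_major) + (1 - p) * np.asarray(y_minor)   # Eq. (3)
    return x_mix, y_mix
```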

2.2 CLTNet

To further verify the effectiveness of PDDS, we design a simple but effective utterance-level multimodal fusion network, named CLTNet, which uses a convolutional network (CNN), a long short-term memory network (LSTM) and a Transformer to extract multimodal emotional features, as shown in Fig. 3. First, for the audio modality, the acoustic features are fed into three convolutional blocks with different scales to capture richer local semantic patterns; each block has a 1D convolutional layer and a global pooling layer. To capture the temporal dependencies in the acoustic sequence, an LSTM layer and a pooling layer are employed. The four encoded feature vectors are concatenated and fed into a fully-connected layer to obtain the audio representation as follows:

h_A = Concat(x_convA1, x_convA2, x_convA3, x_lstmA) W_A + b_A,   (4)

x_convAi = ConvBlock(x_A),   (5)

x_lstmA = LSTMBlock(x_A),   (6)

where ConvBlock(·) = MaxPool(ReLU(Conv1D(·))), LSTMBlock(·) = MaxPool(LSTM(·)), x_A ∈ R^{t_A×d_A} and x_convAi, x_lstmA ∈ R^{d_1} are the input and output features of the convolution blocks and the LSTM block, respectively, and W_A ∈ R^{4d_1×d} and b_A ∈ R^d are trainable parameters. For the text modality, the text features are fed into a transformer encoder of N layers to capture the interactions between each pair of textual words. An attention mechanism [18] is applied to the outputs of the last block to focus on informative words and generate the text representation:

h_T = (a_fuse^T z_T) W_T + b_T,   (7)

z_T = TransformerEncoder(x_T),   (8)

a_fuse = softmax(z_T W_z + b_z),   (9)

where x_T ∈ R^{t_T×d_T} and z_T ∈ R^{t_T×d_2} are the input and output features of the transformer encoder, W_z ∈ R^{d_2×1}, b_z ∈ R^1, W_T ∈ R^{d_2×d} and b_T ∈ R^d are trainable parameters, and a_fuse ∈ R^{t_T×1} is the attention weight. Finally, the representations of the two modalities are concatenated and fed into three fully-connected layers (FC) with a residual connection, followed by a softmax layer. Since the emotional data are annotated with label distributions, we choose the KL loss to optimize the model:

Loss_KL = Σ_{i=1}^{C} y_i log(y_i / ŷ_i),   (10)

h = Concat(h_A, h_L) W_u + b_u,   (11)

ŷ = Softmax((h + ReLU(h W_h + b_h)) W_c + b_c),   (12)

where W_u ∈ R^{2d×d}, b_u ∈ R^d, W_h ∈ R^{d×d}, b_h ∈ R^d, W_c ∈ R^{d×c} and b_c ∈ R^c are trainable parameters, y ∈ R^c is the true label distribution, and ŷ ∈ R^c is the predicted label distribution.
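The following PyTorch sketch wires Eqs. (4)-(12) together. The embedding sizes follow the implementation details reported later (d1 = 300, d2 = 100, d = 100); the transformer configuration (projection after encoding, 8 heads) and other details are simplifications, not the authors' exact network.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CLTNet(nn.Module):
    def __init__(self, d_a=512, d_t=768, d1=300, d2=100, d=100, n_cls=6):
        super().__init__()
        self.convs = nn.ModuleList([nn.Conv1d(d_a, d1, k) for k in (3, 4, 5)])
        self.lstm = nn.LSTM(d_a, d1, batch_first=True)
        self.fc_a = nn.Linear(4 * d1, d)                                   # Eq. (4)
        layer = nn.TransformerEncoderLayer(d_model=d_t, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=5)          # Eq. (8)
        self.proj_t = nn.Linear(d_t, d2)
        self.attn = nn.Linear(d2, 1)                                       # Eq. (9)
        self.fc_t = nn.Linear(d2, d)                                       # Eq. (7)
        self.fc_u = nn.Linear(2 * d, d)                                    # Eq. (11)
        self.fc_h = nn.Linear(d, d)
        self.fc_c = nn.Linear(d, n_cls)                                    # Eq. (12)

    def forward(self, x_a, x_t):
        a = x_a.transpose(1, 2)                                            # (B, d_a, t_A)
        feats = [F.relu(conv(a)).max(dim=-1).values for conv in self.convs]
        feats.append(self.lstm(x_a)[0].max(dim=1).values)                  # LSTMBlock
        h_a = self.fc_a(torch.cat(feats, dim=-1))                          # audio branch
        z_t = self.proj_t(self.encoder(x_t))                               # (B, t_T, d2)
        w = torch.softmax(self.attn(z_t), dim=1)                           # word attention
        h_t = self.fc_t((w * z_t).sum(dim=1))                              # text branch
        h = self.fc_u(torch.cat([h_a, h_t], dim=-1))
        return torch.softmax(self.fc_c(h + F.relu(self.fc_h(h))), dim=-1)

def kl_loss(y_true, y_pred, eps=1e-8):
    """Eq. (10): KL divergence between annotated and predicted label distributions."""
    return (y_true * torch.log(y_true.clamp_min(eps) / y_pred.clamp_min(eps))).sum(-1).mean()
```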

3 Experiment

3.1 Dataset and Evaluation Metrics

To evaluate PDDS, we conduct performance tests on two benchmark datasets. 1) Interactive Emotional Dyadic Motion Capture (IEMOCAP): IEMOCAP is a commonly used English corpus for SER [6]. There are five sessions in the dataset, where each sentence is annotated by at least three experts. Following previous work [15], we choose six emotions: anger, happiness, sadness, neutral, excited and frustration. Sessions 1 to 4 are used as the training set and session 5 is used as the testing set, and PDDS is only applied to the training set.


2) FAU-AIBO Emotion Corpus (FAU-AIBO): FAU-AIBO is a corpus of German children communicating with Sony's AIBO pet robot [5], where each word is annotated by five experts and the chunk labels are obtained by a heuristic approach. Following previous work [22], we choose four emotions: Anger, Motherese, Emphatic and Neutral. We find that FAU-AIBO also has an imbalanced data distribution; for example, the quantity of ambiguous samples between anger and emphatic is abundant, while that between anger and neutral is very small. Experiments are performed using 3-fold cross-validation, and PDDS is applied to the training set of each fold. For both datasets, the weighted accuracy (WA, i.e., the overall accuracy) and unweighted accuracy (UA, i.e., the average accuracy over all emotion categories) are adopted as the evaluation metrics.
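As a quick reference, the two metrics can be computed as follows (a minimal sketch; label encoding is assumed to be integer class indices).

```python
import numpy as np

def wa_ua(y_true, y_pred):
    """WA = overall accuracy; UA = mean of per-class recalls."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    wa = (y_true == y_pred).mean()
    ua = np.mean([(y_pred[y_true == c] == c).mean() for c in np.unique(y_true)])
    return wa, ua
```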

3.2 Implementation Details

For audio, 512-dimensional features are extracted from the raw speech signal by the pre-trained data2vec [4]. The frame size and frame shift are set to 25 ms and 20 ms, respectively. For text, the pre-trained Bert [10] models on English and German corpora are used for IEMOCAP and FAU-AIBO, respectively, to embed each word into a 768-dimensional vector. In the original IEMOCAP dataset, the quantity of clear samples belonging to happiness is very small, so we augment the data using mixup in order to balance the quantities of clear samples across all emotion categories. The augmentation settings are p = 0.5, α = β = happiness, and the quantity is increased from 43 to 215. Afterward, the data distribution smoothing is applied to the ambiguous samples of each emotion-pair with a Gaussian kernel of σ = 1. The kernel sizes of the three CNN blocks in CLTNet are 3, 4 and 5, and the number of hidden units in the LSTM is 300. The number of layers in the transformer encoder is 5 and each layer has 5 attention heads. The embedding dimensions d_1, d_2 and d in Eq. (4) and (7) are set to 300, 100 and 100, respectively. CLTNet is trained with the Adam optimizer. The learning rate, weight decay and batch size are set to 0.001, 0.00001 and 64, respectively. Training stops if the loss does not decrease for 10 consecutive epochs, or after at most 30 epochs. All experiments were repeated 3 times with different random seeds, and the average results are reported in the following sections.

Table 1. The comparison of models without and with PDDS on the IEMOCAP dataset.

Model | w/o PDDS WA (%) | w/o PDDS UA (%) | w/ PDDS WA (%) | w/ PDDS UA (%)
SpeechText [3] | 55.7 | 50.6 | 58.8 | 55.1
MAT [9] | 54.5 | 51.0 | 58.0 | 55.5
MCSAN [23] | 60.0 | 56.5 | 60.2 | 58.0
CLTNet (audio only) | 44.9 | 39.0 | 48.4 | 44.9
CLTNet (text only) | 50.9 | 45.8 | 51.6 | 50.1
CLTNet | 55.9 | 52.3 | 60.7 | 58.2

Table 2. The comparison of models without and with PDDS on the FAU-AIBO dataset.

Model | w/o PDDS WA (%) | w/o PDDS UA (%) | w/ PDDS WA (%) | w/ PDDS UA (%)
SpeechText [3] | 66.5 | 55.2 | 68.1 | 57.5
MAT [9] | 67.9 | 58.9 | 68.7 | 59.0
MCSAN [23] | 68.2 | 57.8 | 69.7 | 62.2
CLTNet (audio only) | 63.7 | 48.0 | 66.1 | 51.7
CLTNet (text only) | 64.9 | 54.9 | 65.8 | 59.0
CLTNet | 68.2 | 57.5 | 70.9 | 61.6

3.3 Validation Experiment

172

H. Jiang et al.

Fig. 5. Confusion matrices of CLTNet on the IEMOCAP dataset without and with PDDS.

3.4

Ablation Study

To verify the rationality of PDDS, two additional experiments on the IEMOCAP dataset are designed and tested on the CLTNet model: (a) Only balancing happiness: Mixup augmentation is only applied on clear samples of happiness rather than ambiguous samples. (b) Uniform balancing: the clear samples in each category and the ambiguous samples of each emotion-pair are augmented to 400 if their quantity is less than 400 (Most of them are less than 400 in the IEMOCAP dataset). The results are shown in Table 3. We can observe that all three data augmentations boost the performance of the model. Compared to only balancing the happiness category, augmenting both ambiguous and clear samples can help the model perform better. Although Uniform balancing has the largest training dataset, PDDS performs best on WA and UA with 4.8% and 5.9% improvements. This result reveals the nature of advantage of PDDS is the reasonable data distribution instead of simply increasing the data size. We also investigate different mixup augmentations. Our default setting is mixing data by randomly selecting persons. Considering different persons have different voiceprints, we also test the mixup from the same person, and the results on the IEMOCAP dataset are listed in Table 4. We can observe that the mixup created by the same person performs worse than the one created by random persons. One possible explanation is that mixup is a data augmentation method to improve the model robustness, which has been applied to the speech recognition and sentence classification. Random mixup increases the diversity of the data. When the augmentation is created by the same person, the diversity of the augmented data is reduced. The model’s robustness deteriorates as a result.

Pairwise-Emotion Data Distribution Smoothing for Emotion Recognition

173

Table 3. Evaluation of different data augmentation methods. Data

WA (%) UA (%)

Original training data Only balancing happiness Uniform balancing With PDDS

55.9 57.4 58.6 60.7

52.3 54.1 56.0 58.2

Table 4. Evaluation of mixing up different persons on the IEMOCAP dataset. Model

same person random mixup mixup WA (%) UA (%) WA (%) UA (%)

SpeechText [3] MAT [9] MCSAN [23] CLTNet(audio only) CLTNet(text only) CLTNet

56.7 56.5 59.7 48.3 51.4 59.3

53.2 54.5 58.3 43.6 49.0 56.5

58.8 58.0 60.2 48.4 51.6 60.7 WA (%) UA (%)

65

Accuracy

55.1 55.5 58.0 44.9 50.1 58.2

60 55 50

45 0.6

0.7

0.8

0.9

1

Fig. 6. Effect of different p values of CLTNet on the IEMOCAP dataset.

We further evaluate the effectiveness of p values in the mixup augmentation, and show the result of CLTNet on the IEMOCAP dataset in Fig. 6. One can see the best results are achieved when p = 0.8.

4

Conclusion

In this paper, we address the imbalanced data distribution in the existing emotion datasets by proposing the Pairwise-emotion Data Distribution Smoothing (PDDS) method. PDDS constructs a more reasonable training set with smoother distribution by applying Gaussian smoothing to emotion-pairs, and complements required data using a mixup augmentation. Experimental results show that PDDS considerably improves three SOTA methods. Meanwhile, our proposed CLTNet achieves the best result. More importantly, the ablation study verifies

174

H. Jiang et al.

that the nature of superiority of PDDS is the reasonable data distribution instead of simply increasing the amount of data. In future work, we will explore a more reasonable data distribution for a better emotional representation learning.

References 1. Ando, A., Kobashikawa, S., Kamiyama, H., Masumura, R., Ijima, Y., Aono, Y.: Soft-target training with ambiguous emotional utterances for DNN-based speech emotion classification. In: 2018 IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 4964–4968 (2018) 2. Ando, A., Masumura, R., Kamiyama, H., Kobashikawa, S., Aono, Y.: Speech emotion recognition based on multi-label emotion existence model. In: Proc. Interspeech 2019, pp. 2818–2822 (2019) 3. Atmaja, B.T., Shirai, K., Akagi, M.: Speech emotion recognition using speech feature and word embedding. In: 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), pp. 519–523 (2019) 4. Baevski, A., Hsu, W.N., Xu, Q., Babu, A., Gu, J., Auli, M.: data2vec: A general framework for self-supervised learning in speech, vision and language. In: Proceedings of the 39th International Conference on Machine Learning. vol. 162, pp. 1298–1312 (2022) 5. Batliner, A., Steidl, S., N¨ oth, E.: Releasing a thoroughly annotated and processed spontaneous emotional database: the FAU Aibo emotion corpus. In: Proc. Workshop Lang. Resour. Eval. Conf. vol. 28, pp. 28–31 (2008) 6. Busso, C., et al.: IEMOCAP: Interactive emotional dyadic motion capture database. Lang. Resour. Eval. 42(4), 335–359 (2008) 7. Chou, H.C., Lin, W.C., Lee, C.C., Busso, C.: Exploiting annotators’ typed description of emotion perception to maximize utilization of ratings for speech emotion recognition. In: ICASSP 2022–2022 IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 7717–7721 (2022) 8. Cowie, R., et al.: Emotion recognition in human-computer interaction. IEEE Signal Process. Mag. 18(1), 32–80 (2001) 9. Delbrouck, J.B., Tits, N., Dupont, S.: Modulated fusion using transformer for linguistic-acoustic emotion recognition. In: Proceedings of the First International Workshop on Natural Language Processing Beyond Text, pp. 1–10 (2020) 10. Devlin, J., Chang, M.W., Lee, K., Toutanova, K.: BERT: Pre-training of deep bidirectional transformers for language understanding. In: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. vol. 1, pp. 4171–4186 (2019) 11. Fayek, H.M., Lech, M., Cavedon, L.: Modeling subjectiveness in emotion recognition with deep neural networks: Ensembles vs soft labels. In: 2016 International Joint Conference on Neural Networks (IJCNN), pp. 566–570 (2016) 12. Fujioka, T., Homma, T., Nagamatsu, K.: Meta-learning for speech emotion recognition considering ambiguity of emotion labels. In: Proc. Interspeech 2020, pp. 2332–2336 (2020) 13. Gao, X., Zhao, Y., Zhang, J., Cai, L.: Pairwise emotional relationship recognition in drama videos: Dataset and benchmark. In: Proceedings of the 29th ACM International Conference on Multimedia, pp. 3380–3389 (2021) 14. Gupta, P., Rajput, N.: Two-stream emotion recognition for call center monitoring. In: Proc. Interspeech 2007, pp. 2241–2244 (2007)


15. Hazarika, D., Poria, S., Mihalcea, R., Cambria, E., Zimmermann, R.: ICON: Interactive conversational memory network for multimodal emotion detection. In: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pp. 2594–2604 (2018) 16. Huahu, X., Jue, G., Jian, Y.: Application of speech emotion recognition in intelligent household robot. In: 2010 International Conference on Artificial Intelligence and Computational Intelligence. vol. 1, pp. 537–541 (2010) 17. Lian, Z., Chen, L., Sun, L., Liu, B., Tao, J.: GCNet: Graph completion network for incomplete multimodal learning in conversation. IEEE Trans. Pattern Anal. Mach. Intell. 45(7), 8419–8432 (2023) 18. Lian, Z., Liu, B., Tao, J.: CTNet: Conversational transformer network for emotion recognition. IEEE/ACM Trans. Audio, Speech, Lang. Process. 29, 985–1000 (2021) 19. Lotfian, R., Busso, C.: Predicting categorical emotions by jointly learning primary and secondary emotions through multitask learning. In: Proc. Interspeech 2018, pp. 951–955 (2018) 20. Parry, J., Palaz, D., et al: Analysis of Deep Learning Architectures for CrossCorpus Speech Emotion Recognition. In: Proc. Interspeech 2019, pp. 1656–1660 (2019) 21. Poria, S., Cambria, E., Hazarika, D., Majumder, N., Zadeh, A., Morency, L.P.: Context-dependent sentiment analysis in user-generated videos. In: Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics. vol. 1, pp. 873–883 (2017) 22. Seppi, D., et al: Patterns, prototypes, performance: classifying emotional user states. In: Proc. Interspeech 2008, pp. 601–604 (2008) 23. Sun, L., Liu, B., Tao, J., Lian, Z.: Multimodal cross-and self-attention network for speech emotion recognition. In: ICASSP 2021–2021 IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 4275–4279 (2021) 24. Yin, Y., Gu, Y., Yao, L., Zhou, Y., Liang, X., Zhang, H.: Progressive co-teaching for ambiguous speech emotion recognition. In: ICASSP 2021–2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 6264–6268 (2021) 25. Zhang, H., Cisse, M., Dauphin, Y.N., Lopez-Paz, D.: mixup: Beyond empirical risk minimization. In: International Conference on Learning Representations (2018) 26. Zhou, Y., Liang, X., Gu, Y., Yin, Y., Yao, L.: Multi-classifier interactive learning for ambiguous speech emotion recognition. IEEE/ACM Trans. Audio, Speech, Lang. Process. 30, 695–705 (2022) 27. Zou, H., Si, Y., Chen, C., Rajan, D., Chng, E.S.: Speech emotion recognition with co-attention based multi-level acoustic information. In: ICASSP 2022–2022 IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 7367– 7371 (2022)

SIEFusion: Infrared and Visible Image Fusion via Semantic Information Enhancement

Guohua Lv1,2,3(B), Wenkuo Song1,2,3, Zhonghe Wei1,2,3, Jinyong Cheng1,2,3, and Aimei Dong1,2,3

1 Key Laboratory of Computing Power Network and Information Security, Ministry of Education, Shandong Computer Science Center (National Supercomputer Center in Jinan), Qilu University of Technology (Shandong Academy of Sciences), Jinan, China
[email protected]
2 Shandong Provincial Key Laboratory of Computer Networks, Shandong Fundamental Research Center for Computer Science, Jinan, China
3 Faculty of Computer Science and Technology, Qilu University of Technology (Shandong Academy of Sciences), Jinan 250353, China

Abstract. At present, most existing fusion methods focus on the metrics and visual effects of image fusion but ignore the requirements of high-level tasks after fusion, which leads to unsatisfactory performance in subsequent high-level tasks such as semantic segmentation and object detection. In order to address this problem and obtain images with rich semantic information for subsequent semantic segmentation tasks, we propose a fusion network for infrared and visible images based on semantic information enhancement, named SIEFusion. In our fusion network, we design a cross-modal information sharing module (CISM) and a fine-grained detail feature extraction module (FFEM) to obtain better fused images with more semantic information. Extensive experiments show that our method outperforms the state-of-the-art methods in both qualitative and quantitative comparisons, as well as in the subsequent segmentation tasks.

Keywords: Image fusion · Semantic information · Deep learning · Subsequent vision task

1 Introduction

Due to technical and theoretical limitations, a single-modality imaging device cannot completely describe an imaging scene [20]. For example, infrared sensors can capture the thermal radiation information emitted by objects, which effectively highlights prominent targets such as pedestrians and vehicles, but they lack detailed descriptions of the scenes. On the contrary, visible images usually contain rich scene and texture information, but are easily affected by extreme weather conditions and illumination. With the continuous development of imaging device technology, it has been found that different types of imaging devices contain a large amount of complementary information, which inspires researchers to integrate this complementary information into a single image. This is also the significance of image fusion technology. Among numerous types of fusion tasks, the complementarity of visible and infrared images encourages researchers to incorporate useful information from multi-modal images into a single fused image to facilitate human perception and subsequent vision applications, such as object detection [1], object tracking [10,11], and semantic segmentation [6]. (This work was supported by the Natural Science Foundation of Shandong Province, China (ZR2020MF041 and ZR2022MF237).)

In the past decades, a number of infrared and visible image fusion methods have been proposed. In general, these fusion methods can be divided into traditional methods and deep learning-based methods. On the one hand, traditional methods can be divided into five categories, i.e., multi-scale transform (MST)-based methods [2], sparse representation (SR)-based methods [17], subspace-based methods [25], optimization-based methods [18], and hybrid methods [24]. On the other hand, deep learning-based fusion methods can be classified into four categories, i.e., AE-based [12,13], CNN-based [30–32,35], GAN-based [16,23], and Transformer-based methods [21,28]. Meanwhile, with the development of deep learning, more and more fusion methods adopt deep learning techniques.

However, although existing fusion methods have achieved satisfactory results, there are still some problems to be solved. On the one hand, most existing fusion methods only focus on better visual quality and higher evaluation metrics, and rarely consider the requirements of subsequent high-level tasks such as semantic segmentation. Some studies have shown that considering only visual quality and evaluation metrics is of little help for subsequent high-level tasks [15,26]. Some scholars have used segmentation masks to guide the image fusion process, but the mask only segments some salient objects [22], and its effect on enhancing semantic information is limited, because other non-salient objects also contain semantic information [19]. On the other hand, there is also work that has explored the requirements of subsequent high-level tasks, such as SeAFusion [31]. It combines a semantic loss with the image fusion task and takes into account the semantic requirements of subsequent high-level tasks. However, SeAFusion does not perform particularly well in extreme environments, especially in dim scenes, which affects the subsequent segmentation tasks even more.

In order to address the aforementioned problems, we propose an image fusion network based on semantic information enhancement, called SIEFusion. In our fusion network, we design a cross-modal information sharing module (CISM) and a fine-grained feature extraction module (FFEM). CISM exploits the design concept of spatial attention to make full use of the complementary information between visible and infrared images. FFEM uses dense connections on the backbone to make full use of the extracted feature information; meanwhile, it uses skip connections and a gradient operator on the branch to extract fine-grained information from images. To be concrete, the main contributions of our work are as follows:


(1) We propose an infrared and visible image fusion network that can obtain more semantic information for subsequent semantic segmentation tasks.
(2) We design a cross-modal information sharing module (CISM) to extract more information before feature fusion, so that the final fused images contain richer semantic information.
(3) We design a fine-grained feature extraction module (FFEM) to obtain more abundant fine-grained features for feature fusion, so as to retain more semantic information for subsequent semantic segmentation.

2 Method

2.1 Problem Formulation

Given a pair of registered infrared image I_i \in \mathbb{R}^{H \times W \times 1} and visible image I_v \in \mathbb{R}^{H \times W \times 3}, both are fed into our fusion network. Under the constraint of the loss function, the desired fused image is generated after feature extraction, feature fusion, and image reconstruction. Specifically, we extract features from the input infrared and visible images through the feature extraction module E_f, which can be expressed as

F_i = E_f(I_i), \quad F_v = E_f(I_v),    (1)

where F_i and F_v denote the infrared and visible feature maps, respectively. Then, the extracted features are integrated. The feature fusion process can be expressed as

F_f = C(F_i, F_v),    (2)

where C(\cdot) denotes channel-wise concatenation of feature maps. Finally, the fused image I_f \in \mathbb{R}^{H \times W \times 3} is reconstructed from the fused features F_f via the image reconstructor R_i:

I_f = R_i(F_f).    (3)

2.2 Network Architecture

Figure 1 illustrates the overall framework of our proposed method, which includes the fusion network and the segmentation network. The fusion network contains our proposed cross-modal information sharing modules, fine-grained feature extraction modules, and an image reconstructor.

CISM. The cross-modal information sharing module (CISM) is inspired by the design concepts of the dual-stream correlation attention module and the spatial attention of CBAM, and differs from the channel attention of the CMDAF module in PIAFusion. It is designed to take advantage of shared and complementary features and, at the same time, fully aggregates complementary information through pixel weighting. CISM can be formulated as

\hat{F}_v = F_v \oplus (F_v \otimes \sigma(Conv(C(Avg(F_i - F_v) + Max(F_i - F_v))))),    (4)

\hat{F}_i = F_i \oplus (F_i \otimes \sigma(Conv(C(Avg(F_v - F_i) + Max(F_v - F_i))))),    (5)

where Max(\cdot) and Avg(\cdot) denote max-pooling and average-pooling, Conv(\cdot) is the convolution operation, \hat{F}_v and \hat{F}_i represent the visible and infrared feature maps after feature extraction, \oplus and \otimes stand for element-wise summation and channel-wise multiplication, respectively, and \sigma(\cdot) is the sigmoid function. We pass the obtained single-channel feature map through the sigmoid function to obtain a coefficient map, multiply it with the original feature map, and add the result back to the original feature map. In this way, more complementary information is assembled through CISM.
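To make the formulation concrete, the following PyTorch sketch gives one possible reading of Eqs. (4)-(5) as CBAM-style spatial attention over the cross-modal difference maps: the difference is pooled along the channel dimension, the pooled maps are fused by a convolution and a sigmoid, and the resulting single-channel coefficient re-weights and is added back to the original features. The 7x7 kernel, the shared convolution for both directions, and the tensor shapes are assumptions, not the authors' exact implementation.

```python
import torch
import torch.nn as nn

class CISM(nn.Module):
    """Cross-modal information sharing module (sketch of Eqs. 4-5).

    The difference map between the two modalities is pooled along the channel
    dimension (average and max), the pooled maps are fused by a convolution and
    a sigmoid, and the resulting single-channel weight re-weights and is added
    back to the original features.
    """
    def __init__(self, kernel_size: int = 7):
        super().__init__()
        # 2 pooled maps in, 1 attention map out; kernel size is an assumption
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def _weight(self, diff: torch.Tensor) -> torch.Tensor:
        avg = diff.mean(dim=1, keepdim=True)       # Avg(.) over channels
        mx, _ = diff.max(dim=1, keepdim=True)      # Max(.) over channels
        return torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))

    def forward(self, f_i: torch.Tensor, f_v: torch.Tensor):
        w_v = self._weight(f_i - f_v)              # complementary info for the visible branch
        w_i = self._weight(f_v - f_i)              # complementary info for the infrared branch
        f_v_hat = f_v + f_v * w_v                  # Eq. (4)
        f_i_hat = f_i + f_i * w_i                  # Eq. (5)
        return f_i_hat, f_v_hat
```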

Fig. 1. The overall framework of our proposed method.

FFEM. The fine-grained feature extraction module (FFEM) is inspired by the design concepts of the residual block [8] and the dense block [9]. We place two 1x1 convolutional layers, three 3x3 convolutional layers, and a CBAM module [34] (a mixed channel and spatial attention module designed to capture more distinctive features) in the backbone. Besides, we set two 1x1 convolutional layers and an edge extraction module in the branch, which uses the Sobel gradient operator to extract fine-grained features containing more semantic information.

Image Reconstructor. The image reconstructor consists of three 3x3 convolutional layers and one 1x1 convolutional layer. The 3x3 convolutional layers all use LReLU as the activation function, while the activation function of the 1x1 convolutional layer is Tanh.

Segmentation Network. We introduce a segmentation network to predict the segmentation results of fused images and use them to construct a semantic loss. The semantic loss is then used to guide the training of the fusion network, forcing the final fused images to contain more semantic information. More details about the segmentation network can be found in [27].
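As an illustration of the reconstructor just described, here is a minimal PyTorch sketch; the layer types and activations follow the text, while the channel widths and the single-channel output (the fused Y channel, cf. Sect. 3.1) are assumptions.

```python
import torch.nn as nn

def build_image_reconstructor(in_ch: int = 128, out_ch: int = 1) -> nn.Sequential:
    """Three 3x3 convolutions with LReLU followed by a 1x1 convolution with Tanh.

    Channel widths and the single-channel output are assumptions for illustration.
    """
    widths = [in_ch, 64, 32, 16]
    layers = []
    for c_in, c_out in zip(widths[:-1], widths[1:]):
        layers += [nn.Conv2d(c_in, c_out, 3, padding=1), nn.LeakyReLU(0.2, inplace=True)]
    layers += [nn.Conv2d(widths[-1], out_ch, 1), nn.Tanh()]
    return nn.Sequential(*layers)
```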

2.3 Loss Function

In order to obtain higher-quality fused images, a content loss is introduced into our model, which includes an intensity loss L_{int} and a texture loss L_{tex}. The content loss is expressed as

L_{con} = L_{int} + \eta L_{tex},    (6)

where L_{int} measures the difference between the fused images and the source images at the pixel level, L_{tex} is the texture (gradient) loss used to incorporate more information from the source images into the fused images, and \eta controls the balance between L_{int} and L_{tex}. The intensity loss is expressed as

L_{int} = \frac{1}{HW} \| I_f - \max(I_i, I_v) \|_1,    (7)

where H and W are the height and width of an image, \|\cdot\|_1 denotes the l_1-norm, and \max(\cdot) stands for element-wise maximum selection. To keep the best intensity distribution and preserve the rich texture details of the source images, the texture loss forces the fused images to contain more fine-grained texture information:

L_{tex} = \frac{1}{HW} \| |\nabla I_f| - \max(|\nabla I_i|, |\nabla I_v|) \|_1,    (8)

where \nabla indicates the Sobel gradient operator, which measures the fine-grained texture information of an image, and |\cdot| refers to the absolute value operation.
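The content loss of Eqs. (6)-(8) can be written compactly in PyTorch as below; the specific 3x3 Sobel kernels and the way the horizontal and vertical responses are combined (sum of absolute values) are assumptions, and eta = 10 follows Sect. 3.1.

```python
import torch
import torch.nn.functional as F

_SOBEL_X = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]]).view(1, 1, 3, 3)
_SOBEL_Y = _SOBEL_X.transpose(2, 3)

def sobel_grad(img: torch.Tensor) -> torch.Tensor:
    """|grad(img)| via Sobel filters; img is a single-channel batch (B, 1, H, W)."""
    gx = F.conv2d(img, _SOBEL_X.to(img), padding=1)
    gy = F.conv2d(img, _SOBEL_Y.to(img), padding=1)
    return gx.abs() + gy.abs()

def content_loss(i_f, i_i, i_v, eta: float = 10.0):
    """L_con = L_int + eta * L_tex (Eqs. 6-8)."""
    l_int = F.l1_loss(i_f, torch.max(i_i, i_v))                        # Eq. (7)
    l_tex = F.l1_loss(sobel_grad(i_f),
                      torch.max(sobel_grad(i_i), sobel_grad(i_v)))     # Eq. (8)
    return l_int + eta * l_tex
```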

In order to embed more semantic information in the fused images, we add a semantic loss to the fusion network. The semantic loss can be roughly expressed as the gap between the segmentation result I_s and the label L_s:

L_{sea} = \phi(I_s, L_s),    (9)

where \phi denotes the function that measures the difference between I_s and L_s. To be precise, the segmentation model is used to segment the fused images, yielding the main segmentation result I_s and the auxiliary segmentation result I_{sr}. The corresponding main semantic loss L_{main} and auxiliary semantic loss L_{aux} are

L_{main} = -\frac{1}{HW} \sum_{h=1}^{H} \sum_{w=1}^{W} \sum_{c=1}^{C} L_{so}^{(h,w,c)} \log(I_s^{(h,w,c)}),    (10)

L_{aux} = -\frac{1}{HW} \sum_{h=1}^{H} \sum_{w=1}^{W} \sum_{c=1}^{C} L_{so}^{(h,w,c)} \log(I_{sr}^{(h,w,c)}),    (11)

where L_{so} denotes the one-hot vector transformed from the segmentation label L_s. Ultimately, the semantic loss is expressed as

L_{sea} = L_{main} + \lambda L_{aux},    (12)

where \lambda is a constant balancing L_{main} and L_{aux}, which is set to 0.1 [27]. In addition, the semantic loss is used not only to constrain the fusion model but also to train the segmentation model. Finally, the loss that guides the model training process is expressed as

L_{overall} = L_{con} + \zeta L_{sea},    (13)

where \zeta is the trade-off parameter balancing L_{con} and L_{sea}.
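Assuming the segmentation network returns per-pixel class logits, the semantic and overall losses of Eqs. (10)-(13) reduce to the standard cross-entropy combination sketched below; lambda = 0.1 follows [27] and zeta follows the per-iteration schedule in Sect. 3.1.

```python
import torch.nn.functional as F

def semantic_loss(main_logits, aux_logits, label, lam: float = 0.1):
    """L_sea = L_main + lambda * L_aux (Eqs. 10-12).

    cross_entropy implements the one-hot/log formulation; `label` holds class
    indices of shape (B, H, W), logits have shape (B, C, H, W).
    """
    l_main = F.cross_entropy(main_logits, label)   # Eq. (10)
    l_aux = F.cross_entropy(aux_logits, label)     # Eq. (11)
    return l_main + lam * l_aux                    # Eq. (12)

def overall_loss(l_con, l_sea, zeta: float):
    """L_overall = L_con + zeta * L_sea (Eq. 13); zeta is raised per joint iteration."""
    return l_con + zeta * l_sea
```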

3 Experiments

3.1 Experimental Configurations

To evaluate our method, we conduct qualitative and quantitative experiments on the MSRS [31] and TNO [33] datasets. Our method is compared against seven state-of-the-art algorithms, including one traditional method, i.e., MDLatLRR [14], two AE-based methods, i.e., NestFuse and DenseFuse, one GAN-based method, i.e., FusionGAN, and three CNN-based methods, i.e., PIAFusion, SeAFusion, and U2Fusion. The parameters of the seven methods are set as reported in their original papers, and we select six metrics for quantitative comparison, including standard deviation (SD), average gradient (AG), entropy (EN) [29], visual information fidelity (VIF) [7], Qabf, and spatial frequency (SF) [5].

In the training phase, we train our model on the training set of MSRS, which includes 1083 pairs of images. We iteratively train the fusion and segmentation networks 4 times, with the remaining hyper-parameters set as follows: \eta = 10 and \zeta = 0, 1, 2, 3 in each iteration, respectively [31]. We optimize our fusion model with the Adam optimizer, a batch size of 4, an initial learning rate of 0.001, and a weight decay of 0.0002, under the guidance of L_{overall}. Besides, we utilize mini-batch stochastic gradient descent with an initial learning rate of 0.01 [27] to optimize the segmentation network with a batch size of 8, a momentum of 0.9, and a weight decay of 0.0005. Our method is implemented on the PyTorch platform.

In addition, to facilitate image fusion, we convert the visible RGB images into the YCbCr color space before sending them to the fusion network, and only the Y channel of the visible images and the infrared images are used for fusion. The fused result is then recombined with the Cb and Cr channels in the YCbCr color space, and finally the images are converted back to RGB.
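The color-space handling described above can be sketched as follows; the BT.601 conversion coefficients and the fusion_net(infrared, y) interface are illustrative assumptions rather than the authors' exact code.

```python
import torch

def rgb_to_ycbcr(rgb: torch.Tensor) -> torch.Tensor:
    """RGB (B, 3, H, W) in [0, 1] -> YCbCr using ITU-R BT.601 full-range coefficients."""
    r, g, b = rgb[:, 0:1], rgb[:, 1:2], rgb[:, 2:3]
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 0.5 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 0.5 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return torch.cat([y, cb, cr], dim=1)

def ycbcr_to_rgb(ycbcr: torch.Tensor) -> torch.Tensor:
    y, cb, cr = ycbcr[:, 0:1], ycbcr[:, 1:2] - 0.5, ycbcr[:, 2:3] - 0.5
    r = y + 1.402 * cr
    g = y - 0.344136 * cb - 0.714136 * cr
    b = y + 1.772 * cb
    return torch.cat([r, g, b], dim=1).clamp(0.0, 1.0)

def fuse_rgb_ir(fusion_net, visible_rgb, infrared):
    """Fuse only the Y channel with the infrared image, then recombine with the
    original Cb/Cr channels and convert back to RGB, as described in Sect. 3.1.
    `fusion_net` is a hypothetical interface taking (infrared, Y) and returning the fused Y."""
    ycbcr = rgb_to_ycbcr(visible_rgb)
    fused_y = fusion_net(infrared, ycbcr[:, 0:1])
    return ycbcr_to_rgb(torch.cat([fused_y, ycbcr[:, 1:3]], dim=1))
```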

3.2 Results and Analysis

Qualitative Comparison. As shown in Fig. 2, the fused image generated by FusionGAN is relatively blurred and poorly represented in terms of edge and texture details: only a few salient targets are represented, and scene information is poorly preserved. Relative to FusionGAN, the traditional method MDLatLRR and the other methods perform well in representing scene information, with clear target edges and well-preserved texture information. However, the image generated by U2Fusion shows a certain amount of overexposure because the scene is rendered too bright, and it is seriously affected by noise. Meanwhile, FusionGAN, DenseFuse, MDLatLRR, NestFuse, PIAFusion, and SeAFusion generate images with low brightness. In contrast, our method produces clearer salient targets and object edges, preserves texture information well, and yields luminance more consistent with subjective vision than the other methods. Some details are also better represented, as the boxes show.

Fig. 2. Visual quality comparison of our method with seven state-of-the-art infrared and visible image fusion methods on 00881N image from MSRS dataset.

Table 1. Quantitative results on MSRS dataset. The best and second best are indicated in bold and underline, respectively.

        NestFuse  MDLatLRR  FusionGAN  DenseFuse  PIAFusion  SeAFusion  U2Fusion  Ours
EN      5.0401    6.0138    5.2417     5.6527     6.0380     6.2061     4.0655    6.2239
VIF     0.7456    0.7751    0.7732     0.7746     1.0210     1.0457     0.4290    1.0831
AG      2.0501    2.3241    1.3108     1.8080     2.8460     2.8204     1.4139    2.8329
SD      31.1480   29.2763   17.4636    24.9197    38.6515    38.3165    16.5399   38.6664
Qabf    0.4775    0.5064    0.4976     0.4388     0.6555     0.6537     0.2540    0.6817
SF      7.5791    7.4311    4.1852     5.8038     9.7444     9.4031     5.4620    9.6592


Quantitative Comparison. We randomly select 50 pairs of images for comparison from the testing set of MSRS, which consists of 361 pairs. As shown in Table 1, our method achieves good performance, especially on EN, VIF, SD, and Qabf, which indicates that it retains the source image information and the edge information of the images well. Meanwhile, it maintains richer texture details and better visual effects. These observations are also reflected in Fig. 2: FusionGAN and U2Fusion do not perform well enough in texture details and visual effects, and the remaining methods are not particularly satisfactory in terms of details, losing some scene information; in some places the scene is dark, so some detailed information is not represented.

Generalization Experiments. To show the generalization of our method, we directly test it and the comparison methods on the TNO dataset. In Fig. 3, methods such as FusionGAN, DenseFuse, NestFuse, and U2Fusion do not perform well enough in scene brightness, resulting in the loss of some detail information. As the red boxes show, the target is barely visible in the FusionGAN result and is almost completely lost. Among the remaining methods, PIAFusion, SeAFusion, and ours all perform well in scene brightness and produce a clear target, but the target edge of PIAFusion is not smooth; only SeAFusion and our method achieve good results in both scene brightness and target edges. Additionally, from the green boxes we can see that our method preserves more information from the source images than SeAFusion.

Fig. 3. Visual quality comparison of our method with seven state-of-the-art infrared and visible image fusion methods on “umbrella” from TNO dataset.

As shown in Table 2, our method also achieves good results in the quantitative comparison of the generalization experiments, especially on VIF, SF, and Qabf, which indicates that it preserves richer edge and texture details, more edge information, and more scene information than the other methods. Overall, the results of the generalization experiments effectively demonstrate the generalization ability of our method.


Table 2. Quantitative results on TNO dataset. The best and second best are indicated in bold and underline, respectively.

        NestFuse  MDLatLRR  FusionGAN  DenseFuse  PIAFusion  SeAFusion  U2Fusion  Ours
EN      7.0466    6.0731    6.5580     6.8192     6.8887     7.1335     6.9366    7.1166
VIF     0.7818    0.6263    0.4105     0.6583     0.7438     0.7042     0.6060    0.7843
AG      3.8485    2.6918    2.4210     3.5600     4.4005     4.9802     4.8969    4.9273
SD      41.8759   27.8343   30.6636    34.8250    39.8827    44.2436    36.4150   43.7810
Qabf    0.5270    0.4389    0.2327     0.4463     0.5674     0.4879     0.4267    0.5712
SF      10.0471   7.9978    6.2753     8.9853     11.3358    12.2525    11.6061   12.3076

3.3 Ablation Study

We propose a cross-modal information sharing module (CISM) to take full advantage of the complementary information from images of different modalities, and a fine-grained feature extraction module (FFEM) to extract fine-grained image features. We conduct experiments again after removing each of them. The comparison results are shown in Fig. 4: from the red and green boxes we can see that without the cross-modal information sharing module (CISM), the image brightness is significantly reduced and the target edge details are not clear enough, while without the fine-grained feature extraction module (FFEM), the detail features of the images are not clear enough and the images are relatively blurry, which proves the effectiveness of the designed modules.

Fig. 4. Results of ablation studies

3.4 Segmentation Performance

We segment the images generated by each fusion method using DeepLabv3+ [4], and Fig. 5 shows the final segmentation results. As we can see from the red and green boxes, methods such as FusionGAN, MDLatLRR, and NestFuse cannot segment the semantic targets of the sky and terrain; the remaining methods can segment the sky and terrain but cannot preserve much semantic information of the terrain. In contrast, our method preserves as much semantic information about the sky and terrain as possible, which demonstrates its superiority.

Fig. 5. Segmentation results for infrared, visible and all fused images from the MSRS dataset. The segmentation model is DeepLabv3+, which is pretrained on the Cityscapes dataset [3]. The scene is: 00555D.

4 Conclusion

In this paper, we propose an infrared and visible image fusion method named SIEFusion based on semantic information enhancement. In our method, a cross-modal information sharing module (CISM) is designed to extract the complementary information between images of different modalities, and a fine-grained feature extraction module (FFEM) is designed to extract more fine-grained image features. The final fused images contain as much semantic information as possible to meet the semantic requirements of subsequent semantic segmentation tasks. A series of comparative experiments shows that our method performs well in terms of visual effects and quantitative metrics, and its effectiveness is verified on subsequent segmentation tasks.


References

1. Cao, Y., Guan, D., Huang, W., Yang, J., Cao, Y., Qiao, Y.: Pedestrian detection with unsupervised multispectral feature learning using deep neural networks. Inf. Fusion 46, 206–217 (2019)
2. Chen, J., Li, X., Luo, L., Mei, X., Ma, J.: Infrared and visible image fusion based on target-enhanced multiscale transform decomposition. Inf. Sci. 508, 64–78 (2020)
3. Chen, L.C., Zhu, Y., Papandreou, G., Schroff, F., Adam, H.: Encoder-decoder with atrous separable convolution for semantic image segmentation. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 801–818 (2018)
4. Cordts, M., et al.: The Cityscapes dataset for semantic urban scene understanding. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3213–3223 (2016)
5. Eskicioglu, A.M., Fisher, P.S.: Image quality measures and their performance. IEEE Trans. Commun. 43(12), 2959–2965 (1995)
6. Ha, Q., Watanabe, K., Karasawa, T., Ushiku, Y., Harada, T.: MFNet: towards real-time semantic segmentation for autonomous vehicles with multi-spectral scenes. In: 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 5108–5115. IEEE (2017)
7. Han, Y., Cai, Y., Cao, Y., Xu, X.: A new image fusion performance metric based on visual information fidelity. Inf. Fusion 14(2), 127–135 (2013)
8. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016)
9. Huang, G., Liu, Z., Van Der Maaten, L., Weinberger, K.Q.: Densely connected convolutional networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4700–4708 (2017)
10. Jin, L., et al.: Rethinking the person localization for single-stage multi-person pose estimation. IEEE Trans. Multimedia (2023)
11. Li, C., Zhu, C., Huang, Y., Tang, J., Wang, L.: Cross-modal ranking with soft consistency and noisy labels for robust RGB-T tracking. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 808–823 (2018)
12. Li, H., Wu, X.J.: DenseFuse: a fusion approach to infrared and visible images. IEEE Trans. Image Process. 28(5), 2614–2623 (2018)
13. Li, H., Wu, X.J., Durrani, T.: NestFuse: an infrared and visible image fusion architecture based on nest connection and spatial/channel attention models. IEEE Trans. Instrum. Meas. 69(12), 9645–9656 (2020)
14. Li, H., Wu, X.J., Kittler, J.: MDLatLRR: a novel decomposition method for infrared and visible image fusion. IEEE Trans. Image Process. 29, 4733–4746 (2020)
15. Li, S., et al.: Single image deraining: a comprehensive benchmark analysis. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3838–3847 (2019)
16. Liu, J., et al.: Target-aware dual adversarial learning and a multi-scenario multi-modality benchmark to fuse infrared and visible for object detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5802–5811 (2022)
17. Liu, Y., Chen, X., Ward, R.K., Wang, Z.J.: Image fusion with convolutional sparse representation. IEEE Signal Process. Lett. 23(12), 1882–1886 (2016)


18. Ma, J., Chen, C., Li, C., Huang, J.: Infrared and visible image fusion via gradient transfer and total variation minimization. Inf. Fusion 31, 100–109 (2016)
19. Ma, J., et al.: Infrared and visible image fusion via detail preserving adversarial learning. Inf. Fusion 54, 85–98 (2020)
20. Ma, J., Ma, Y., Li, C.: Infrared and visible image fusion methods and applications: a survey. Inf. Fusion 45, 153–178 (2019)
21. Ma, J., Tang, L., Fan, F., Huang, J., Mei, X., Ma, Y.: SwinFusion: cross-domain long-range learning for general image fusion via swin transformer. IEEE/CAA J. Automatica Sinica 9(7), 1200–1217 (2022)
22. Ma, J., Tang, L., Xu, M., Zhang, H., Xiao, G.: STDFusionNet: an infrared and visible image fusion network based on salient target detection. IEEE Trans. Instrum. Meas. 70, 1–13 (2021)
23. Ma, J., Yu, W., Liang, P., Li, C., Jiang, J.: FusionGAN: a generative adversarial network for infrared and visible image fusion. Inf. Fusion 48, 11–26 (2019)
24. Ma, J., Zhou, Z., Wang, B., Zong, H.: Infrared and visible image fusion based on visual saliency map and weighted least square optimization. Infrared Phys. Technol. 82, 8–17 (2017)
25. Mou, J., Gao, W., Song, Z.: Image fusion based on non-negative matrix factorization and infrared feature extraction. In: 2013 6th International Congress on Image and Signal Processing (CISP), vol. 2, pp. 1046–1050. IEEE (2013)
26. Pei, Y., Huang, Y., Zou, Q., Lu, Y., Wang, S.: Does haze removal help CNN-based image classification? In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 682–697 (2018)
27. Peng, C., Tian, T., Chen, C., Guo, X., Ma, J.: Bilateral attention decoder: a lightweight decoder for real-time semantic segmentation. Neural Netw. 137, 188–199 (2021)
28. Qu, L., Liu, S., Wang, M., Song, Z.: TransMEF: a transformer-based multi-exposure image fusion framework using self-supervised multi-task learning. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 2126–2134 (2022)
29. Roberts, J.W., Van Aardt, J.A., Ahmed, F.B.: Assessment of image fusion procedures using entropy, image quality, and multispectral classification. J. Appl. Remote Sens. 2(1), 023522 (2008)
30. Tang, L., Deng, Y., Ma, Y., Huang, J., Ma, J.: SuperFusion: a versatile image registration and fusion network with semantic awareness. IEEE/CAA J. Automatica Sinica 9(12), 2121–2137 (2022)
31. Tang, L., Yuan, J., Ma, J.: Image fusion in the loop of high-level vision tasks: a semantic-aware real-time infrared and visible image fusion network. Inf. Fusion 82, 28–42 (2022)
32. Tang, L., Yuan, J., Zhang, H., Jiang, X., Ma, J.: PIAFusion: a progressive infrared and visible image fusion network based on illumination aware. Inf. Fusion 83, 79–92 (2022)
33. Toet, A.: The TNO multiband image data collection. Data Brief 15, 249–251 (2017)
34. Woo, S., Park, J., Lee, J.Y., Kweon, I.S.: CBAM: convolutional block attention module. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 3–19 (2018)
35. Xu, H., Ma, J., Jiang, J., Guo, X., Ling, H.: U2Fusion: a unified unsupervised image fusion network. IEEE Trans. Pattern Anal. Mach. Intell. 44(1), 502–518 (2020)

DeepChrom: A Diffusion-Based Framework for Long-Tailed Chromatin State Prediction

Yuhang Liu1, Zixuan Wang2, Jiaheng Lv1, and Yongqing Zhang1(B)

1 School of Computer Science, Chengdu University of Information Technology, Chengdu 610225, China
[email protected]
2 College of Electronics and Information Engineering, Sichuan University, Chengdu 610065, China

Abstract. Chromatin state reflects distinct biological roles of the genome that can systematically characterize regulatory elements and their functional interaction. Despite extensive computational studies, accurate prediction of chromatin state remains a challenge because of the long-tailed class imbalance. Here, we propose a deep-learning framework, DeepChrom, to predict long-tailed chromatin state directly from DNA sequence. The framework includes a diffusion-based model that balances the samples of different classes by generating pseudo-samples and a novel dilated CNN-based model for chromatin state prediction. On top of that, we further develop a novel equalization loss to increase the penalty on generated samples, which alleviates the impact of the bias between ground truth and generated samples. DeepChrom achieves outstanding performance on nine human cell types with our designed paradigm. Specifically, our proposed long-tailed learning strategy surpasses the traditional training method by 0.056 in Acc. To our knowledge, DeepChrom is pioneering in predicting long-tailed chromatin states by the diffusion-based model to achieve sample balance.

Keywords: Chromatin state · Diffusion model · Long-tailed learning · Bioinformatics

1 Introduction

Chromatin state refers to chromatin's different structural and functional states in different cell types [1]. Due to its diverse array of functions, which encompass pivotal roles in gene regulation, cellular differentiation, and disease pathogenesis,

it has gained increasing attention [2–5]. The epigenetic modifications on DNA sequences are the primary factor determining chromatin state. For example, Ernst et al. defined 15 chromatin states with distinct biological roles by mapping nine chromatin marks [6]. Similarly, Wang et al. defined 18 chromatin states from six histone marks by using ChIP-seq data [7]. These studies have revealed that chromatin states exhibit a long-tailed distribution, with certain states being more abundant than others; for example, the number of enhancers is significantly larger than that of insulators [8]. Although genomic assays such as ChIP-seq can reveal the chromatin state, they require expensive and time-consuming experiments. Therefore, computational methods for long-tailed chromatin state prediction are urgently required.

Many efforts have been made to predict chromatin states with deep-learning algorithms. DeepSEA is a pioneering work that constructs a CNN network to predict 919 chromatin features from DNA sequences [9]. Similarly, Chen et al. proposed a dilated CNN network, Sei, to predict chromatin profiles for discovering the regulatory basis of traits [10]. In addition to the above methods, there are many other chromatin feature predictors, such as DanQ, DeepFormer, and DeepATT [11–14]. However, these methods usually ignore the long-tailed problem among chromatin states. Specifically, some approaches achieve sample balancing by shuffling positive samples to generate negative ones, which introduces bias relative to real situations; other methods directly predict the long-tailed chromatin states, causing an imbalance between head and tail classes. Therefore, a computational framework is needed to predict long-tailed chromatin states.

Considering these issues, we propose a deep-learning framework, DeepChrom, to predict long-tailed chromatin states via data augmentation. We leverage the probabilistic diffusion model (DM) [15] to expand long-tailed class samples for data augmentation. Such DM-expanded samples are generated from Gaussian noise through a denoising UNet. On top of that, we further apply a novel equalization loss, which aims to reduce the impact of the bias between ground truth and generated samples [16]. The combination of these two strategies provides a simple and model-agnostic solution for real-world applications. In addition, we propose a novel chromatin state prediction model that uses a CNN to capture sequence motif information and a dilated CNN to learn the high-order syntax, with which the chromatin state can be accurately predicted. To demonstrate the effectiveness of our approach, we evaluate its performance on nine cell types; the results show that our proposed methods significantly outperform the competing ones. In summary, the key contributions of this study are as follows.
– We propose a diffusion-based framework that generates pseudo-samples for solving long-tailed problems in chromatin state prediction.
– We propose an equalization loss that increases the penalty on generated samples to mitigate the impact of the bias between ground truth and generated samples.


– We conduct extensive experiments on nine human cell types and achieve superior performance on long-tailed chromatin state prediction.
The rest of this article is organized as follows: Sect. 2 briefly reviews the related work. Section 3 provides a detailed introduction to our proposed methods. Section 4 presents the experimental results. Finally, we discuss and conclude this study in Sect. 5.

2 Related Work

2.1 Chromatin State Prediction

A chromatin state predictor aims to build a deep-learning model that learns the genome language to accurately identify chromatin states from DNA sequences. Zhou et al. developed DeepSEA, which pioneered the use of neural networks in predicting chromatin states. Following the groundbreaking work of DeepSEA [9], numerous researchers have made valuable explorations and breakthroughs in improving the performance of chromatin state prediction algorithms. The main efforts have focused on model architecture. Quang et al. proposed a simple yet effective method that consists of a single CNN layer, a BiLSTM layer, and a dense layer [11]; specifically, the CNN is used to learn motif information, and the BiLSTM is used to learn regulatory grammar. Yao et al. proposed a hybrid DNN model, DeepFormer, which utilizes CNN and flow attention to achieve accurate chromatin feature prediction with a limited number of parameters [12]. Chen et al. performed better by integrating dilated convolution to expand the receptive field without degrading the spatial resolution [10]. Inspired by these studies, we propose a novel chromatin state predictor that integrates CNN, dilated CNN, and attention for better prediction.

2.2 Long-Tailed Learning

Long-tailed learning aims to train a well-performing model from samples that follow a long-tailed class distribution. In real-world applications, however, trained models are typically biased towards head classes, causing poor performance on tail classes. To tackle this problem, much effort has been devoted in recent years, making tremendous progress. Specifically, existing long-tailed learning studies can be grouped into three main categories [17]: class re-balancing [18], information augmentation [19], and module improvement [20]. Among these methods, resampling for class re-balancing is the most straightforward and practical, but it can lead to drawbacks such as information loss, overfitting, and underrepresentation. Therefore, several generative models have been proposed to address class imbalance by generating pseudo-samples [21]. Recently, probabilistic diffusion models (DMs) [15] have achieved state-of-the-art results in pseudo-sample generation. The generative power of this method stems from a parameterized Markov chain that contains forward and backward processes, corresponding to the processes of adding noise and denoising, respectively. In this study, we apply DM-based methods to generate high-quality tail-class DNA sequence samples and use a novel loss function to reduce the bias between ground truth and generated samples.

3 Methods

3.1 Methodology Overview

This paper proposes an end-to-end chromatin state prediction method based on a diffusion model and a dilated CNN. As shown in Fig. 1, the proposed framework includes a diffusion model for generating pseudo-samples and a chromatin state predictor. First, the one-hot encoding strategy is applied to convert each DNA sequence to an n × 4 matrix, where n is the sequence length. The encoded data is then fed into the diffusion model, which is trained to generate tail-class sequences from Gaussian noise. Specifically, the diffusion model contains forward and backward steps: in the forward step, Gaussian noise is gradually added to the data until it becomes random noise, while the backward step is a denoising process that gradually restores the data via a denoising UNet. Finally, the generated DNA sequences are fed, together with the original sequences, into a dilated CNN for training. During chromatin state predictor training, the generated sequences receive more discouraging gradients to reduce the impact of the bias they introduce.

3.2 Pseudo Sequences Generation

We employ a diffusion model for pseudo-sample generation, which consists of forward and backward processes.

Forward Process. It aims to gradually add Gaussian noise to the original data until it becomes pure noise (Fig. 1A). Let q(x_0) represent the real data distribution of the DNA sequences, and let x_0 be a real sequence sampled from q(x_0). The forward process can be defined as q(x_t | x_{t-1}) and formulated as

q(x_t \mid x_{t-1}) = \mathcal{N}\!\left(x_t; \sqrt{1 - \beta_t}\, x_{t-1}, \beta_t I\right),    (1)

where \mathcal{N} denotes a normal distribution with mean \sqrt{1 - \beta_t}\, x_{t-1} and variance \beta_t I. Specifically, by sampling the standard normal distribution \mathcal{N}(0, I), the feature vector at the t-th step can be formulated as

x_t = \sqrt{1 - \beta_t}\, x_{t-1} + \sqrt{\beta_t}\, \epsilon_{t-1},    (2)

where x_t is the vector of each DNA sequence after the t-th addition of noise; \beta_t is a constant between 0 and 1 that increases linearly with t; and \epsilon_{t-1} is the base noise obtained from the (t − 1)-th sampling. In particular, x_0 is the one-hot encoded DNA sequence.
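A minimal sketch of this forward process for one-hot DNA matrices (A, C, G, and T encoded as unit vectors) is given below; the linear beta schedule and T = 400 follow Sect. 4.1, and the step-by-step loop simply mirrors Eq. (2).

```python
import torch

def make_beta_schedule(t_steps: int = 400, beta_start: float = 2e-4, beta_end: float = 0.04):
    """Linear beta schedule; range and T = 400 follow Sect. 4.1."""
    return torch.linspace(beta_start, beta_end, t_steps)

def forward_diffuse(x0: torch.Tensor, betas: torch.Tensor, t: int) -> torch.Tensor:
    """Apply Eq. (2) step by step to a one-hot DNA batch x0 of shape (B, L, 4).

    In practice the closed-form q(x_t | x_0) with the cumulative product of
    (1 - beta) gives the same marginal in a single step; the loop here just
    mirrors the paper's step-wise formulation.
    """
    x = x0
    for s in range(t):
        eps = torch.randn_like(x)                                       # base noise
        x = torch.sqrt(1.0 - betas[s]) * x + torch.sqrt(betas[s]) * eps
    return x
```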


Fig. 1. Overview framework of our proposed method. (A) The diffusion-based model is used to generate long-tailed class sequences. The DNA sequences of long-tailed classes are first converted to noise via diffusion. Then, the denoising UNet is applied to restore the sequence by fusing the cell type and chromatin state information. (B) The illustration of the chromatin state predictor. It uses the generated balanced sample for training.

Backward Process. It aims to learn a data distribution p(x) by gradual denoising from a Gaussian distribution, which is equivalent to learning the inverse of a Markov chain of length T. We add conditions in the backward process to construct a general model that can generate different chromatin state sequences across different cell types; it can be defined as p(x_{t-1} | x_t, c). However, this process cannot be solved directly, so a deep-learning model is applied to fit this distribution, which can be formulated as

p_{\theta,\phi}(x_{t-1} \mid x_t) = \mathcal{N}\!\left(x_{t-1}; \mu_\theta(x_t, t), \Sigma_\theta(x_t, t)\right) p_\phi(c \mid x_t),    (3)

where \mu_\theta(x_t, t) and \Sigma_\theta(x_t, t) represent the mean and variance of the normal distribution \mathcal{N}, and c represents the condition, namely the cell type and chromatin state. Specifically, we fix the variance \Sigma_\theta(x_t, t) = \sigma_t^2 I, and the network learns only the mean. We follow the most successful diffusion models for the noise predictor and use a UNet-based model to restore the sequences from noise [22]. This model relies on a re-weighted variant of the variational lower bound on p(x), which mirrors denoising score matching, and the condition c is incorporated through cross-attention. Specifically, it can be interpreted as an equally weighted sequence of denoising autoencoders \epsilon_\theta(x_t, t, c). The objective can be described as

L_{DM} = \mathbb{E}_{x, c, \epsilon \sim \mathcal{N}(0,1), t}\left[\left\| \epsilon - \epsilon_\theta(x_t, t, c) \right\|_2^2\right],    (4)

where x_t is a noisy version of the input x, c is the condition, and t is uniformly sampled from {1, . . . , T}.
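A hedged sketch of one training step under the objective of Eq. (4) is shown below; it uses the standard closed-form of the forward process to noise x_0, and the eps_model(x_t, t, cond) interface of the conditional UNet noise predictor is an assumption.

```python
import torch
import torch.nn.functional as F

def dm_training_step(eps_model, x0, cond, betas):
    """One optimization step of Eq. (4): sample t, noise x0 to x_t with the
    closed-form of the forward process, and regress the injected noise.
    `eps_model(x_t, t, cond)` is a hypothetical UNet-style noise predictor that
    takes the cell type / chromatin state condition."""
    alpha_bar = torch.cumprod(1.0 - betas, dim=0)
    b = x0.shape[0]
    t = torch.randint(0, betas.shape[0], (b,), device=x0.device)
    eps = torch.randn_like(x0)
    ab = alpha_bar[t].view(b, *([1] * (x0.dim() - 1)))
    x_t = torch.sqrt(ab) * x0 + torch.sqrt(1.0 - ab) * eps
    return F.mse_loss(eps_model(x_t, t, cond), eps)    # L_DM, Eq. (4)
```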

3.3 Chromatin State Prediction

As shown in Fig. 1B, the chromatin state predictor is a dilated CNN-based network that takes the DNA sequence as input and predicts the corresponding chromatin state. It is composed of (i) a motif-aware CNN block for extracting sequence patterns (motifs), (ii) a dilated CNN block for learning the syntax, (iii) a self-attention block for capturing the internal correlation of the syntax, and (iv) a classification block for modeling the chromatin state of each sequence. Given the input s, the computational process can be described as follows:

x = Conv(s),    (5)
x = Dropout(ReLU(x), 0.2),    (6)

where Conv(\cdot) represents the convolution operation. The output of the motif-aware CNN block is fed to the dilated CNN block:

x = dConv(x, d),    (7)
x = Dropout(ReLU(x), 0.2),    (8)

where dConv(\cdot) represents the dilated convolution operation and d is the dilation size. After that, self-attention is used to learn long- and short-range dependencies:

Attention(Q, K, V) = softmax\!\left(\frac{QK^{T}}{\sqrt{d_k}}\right) V,    (9)

where Q, K, and V represent the query, key, and value vectors, and d_k is the dimension of the input key vector. Multi-head attention is applied to obtain the final values and increase the model capacity. Finally, the classification block makes the prediction:

y = Activation(MLP(x)),    (10)

where y is the predicted chromatin state and Activation(\cdot) is the activation function.
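The following PyTorch sketch assembles the four blocks described by Eqs. (5)-(10); kernel size 9, pooling size 3, dilations (2, 4, 8), and the 2-layer, 4-head attention follow Sect. 4.1, while the channel widths, the six-class output, and the classifier head sizes are assumptions.

```python
import torch
import torch.nn as nn

class ChromatinStatePredictor(nn.Module):
    """Sketch of the predictor in Fig. 1B: motif-aware CNN blocks, dilated CNN
    blocks, self-attention, and an MLP classifier (channel widths assumed)."""
    def __init__(self, n_classes: int = 6, width: int = 128):
        super().__init__()
        chans = [4, width, width, width]
        self.motif_blocks = nn.Sequential(*[
            nn.Sequential(nn.Conv1d(c_in, c_out, 9, padding=4), nn.ReLU(),
                          nn.Dropout(0.2), nn.MaxPool1d(3))
            for c_in, c_out in zip(chans[:-1], chans[1:])])          # Eqs. (5)-(6)
        self.dilated_blocks = nn.Sequential(*[
            nn.Sequential(nn.Conv1d(width, width, 3, padding=d, dilation=d),
                          nn.ReLU(), nn.Dropout(0.2))
            for d in (2, 4, 8)])                                     # Eqs. (7)-(8)
        encoder_layer = nn.TransformerEncoderLayer(d_model=width, nhead=4, batch_first=True)
        self.attention = nn.TransformerEncoder(encoder_layer, num_layers=2)  # Eq. (9)
        self.classifier = nn.Sequential(nn.Flatten(), nn.LazyLinear(256), nn.ReLU(),
                                        nn.Linear(256, n_classes))   # Eq. (10)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: one-hot DNA batch of shape (B, 4, 1000)
        h = self.dilated_blocks(self.motif_blocks(x))
        h = self.attention(h.transpose(1, 2))    # (B, L', width) for multi-head self-attention
        return self.classifier(h)                # logits per chromatin state
```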

3.4 Equalization Loss

Our equalization loss aims to reduce the impact of the bias between ground truth and generated sequences. We achieve this by re-weighting the conventional softmax cross-entropy loss, which can be formulated as

L = -w \sum_{j=1}^{C} y_j \log(p_j),    (11)

where C is the number of categories and p_j is the probability of class j, calculated by the softmax function. y_j is the true class label, which can be formulated as

y_j = \begin{cases} 1 & \text{if } j = c, \\ 0 & \text{otherwise}. \end{cases}    (12)

For the equalization loss, we introduce a weight term w, which can be formulated as

w = \begin{cases} 1 & \text{if } x \text{ is a ground truth sequence}, \\ \mu & \text{otherwise}, \end{cases}    (13)

where \mu is a manually selected value.
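Eqs. (11)-(13) amount to a per-sample weighted cross-entropy, sketched below with mu = 1.2 as in Sect. 4.1; the is_generated mask marking DM-generated sequences is an assumed bookkeeping detail.

```python
import torch
import torch.nn.functional as F

def equalization_loss(logits, labels, is_generated, mu: float = 1.2):
    """Eqs. (11)-(13): weighted softmax cross-entropy. Ground-truth sequences
    get weight 1, DM-generated sequences get weight mu, increasing the penalty
    on generated samples."""
    per_sample = F.cross_entropy(logits, labels, reduction="none")   # -sum_j y_j log p_j
    w = torch.where(is_generated, torch.full_like(per_sample, mu),
                    torch.ones_like(per_sample))
    return (w * per_sample).mean()
```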

4 Experiments

4.1 Experimental Settings

Dataset. We collect the chromatin states of all nine cell types from ChromHMM [8], which defines 15 chromatin states for each cell type. These chromatin states can be divided into six broad classes: enhancer, transcribed, inactive, promoter, insulator, and repressed. For dataset processing, the DNA sequences are derived from '.bed' files, which record the genome coordinates and the corresponding chromatin states [23]. The length of each sequence is expanded to 1000 bp by centering on the coordinate midpoint, because the median length of each chromatin state is about 1000 bp.

Fig. 2. The proportion of each chromatin state. The blue line represents the average proportion, while the light blue area represents the maximum and minimum proportion of the nine cells. (Color figure online)

In the experiment, we divide the dataset into the training set, validation set, and test set, which account for 80%, 10%, and 10%, respectively. Specifically, the


training set contains ground truth and generated sequences, while the validation and test sets only contain ground truth sequences. Since DeepChrom requires binary vectors as input, each DNA sequence is converted to one-hot format, namely A = [1, 0, 0, 0], C = [0, 1, 0, 0], G = [0, 0, 1, 0], and T = [0, 0, 0, 1]. After that, each sequence is encoded as a 1000 × 4 matrix. The proportion of each chromatin state is shown in Fig. 2.

Comparison Methods and Evaluation Metrics. To measure the performance of our proposed framework, several well-known methods are used for comparison, including DeepSEA [9], DanQ [11], and Sei [10]. The performance is evaluated by several metrics, including accuracy (Acc), upper reference accuracy (UA), relative accuracy (RA), and AUROC. Specifically, Acc is the top-1 accuracy; UA is the maximum of the vanilla accuracy A_v and the balanced accuracy A_b on a balanced training set, defined by A_u = max(A_v, A_b); and the relative accuracy is the top-1 accuracy divided by the empirical upper reference accuracy [17], defined by A_r = A_t / A_u.

Implementation Details. After extensive experiments, we finally determined the hyperparameters. For diffusion model training, we train the network for about 15 epochs with a batch size of 64. We adopt AdamW as the optimizer with a learning rate of 5 × 10^{-4} and a weight decay of 1 × 10^{-4}. For the structure of the diffusion model, we apply a linear β schedule ranging from 0.0002 to 0.04, with time step T = 400. For chromatin state predictor training, we train the network for about 15 epochs with a batch size of 128; we also utilize AdamW as the optimizer with the same learning rate and weight decay. The γ is set to 0.9 and μ is set to 1.2. For the model structure, we stack three motif-aware CNN blocks with kernel size 9 and max-pooling size 3. The dilations are (2, 4, 8) for the three dilated CNN blocks, respectively. The number of transformer layers is 2 with 4 attention heads. In the experiments, we adopt PyTorch to implement and train the model on an Nvidia 3080.
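The accuracy-based metrics are direct transcriptions of the definitions above; a minimal sketch:

```python
def upper_reference_accuracy(acc_vanilla: float, acc_balanced: float) -> float:
    """A_u = max(A_v, A_b): the better of vanilla and balanced accuracy on a
    balanced training set."""
    return max(acc_vanilla, acc_balanced)

def relative_accuracy(acc_top1: float, acc_upper: float) -> float:
    """A_r = A_t / A_u: top-1 accuracy relative to the upper reference accuracy."""
    return acc_top1 / acc_upper
```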

4.2 Effectiveness of Our Proposed Long-Tailed Learning Methods

Quality of the Generated Sequences. The quality of the generated sequences largely determines the model's performance. Figure 3 shows the performance of the different models on ground truth and generated sequences. The results show that chromatin prediction methods can effectively discriminate the generated sequences. Our chromatin predictor performs best among the competing methods, achieving an Acc of 0.645 on generated sequences, which demonstrates the validity of the generated sequences. However, we notice that the performance on generated sequences is poorer than on ground truth sequences, with a decrease of 0.031 in Acc. We hypothesize that the generated sequences have a bias with respect to the ground truth sequences, so they do not fully reflect the chromatin state. Overall, these results demonstrate the effectiveness of the generated sequences while revealing a bias relative to the ground truth sequences.

Fig. 3. The performance of the proposed approach. The left figure compares the average accuracy of nine cells when training using generated and ground truth sequences, while the right figure compares the average accuracy when training using ground truth sequences and balanced samples with equalization loss.

Superiority of Our Proposed Framework. We test the performance of data augmentation with equalization loss for long-tailed chromatin state prediction. As shown in Figs. 3 and 4, our proposed long-tailed learning strategy accurately predicts the long-tailed chromatin state, outperforming traditional training methods. We observe that (i) by utilizing a diffusion model for data augmentation and combining it with the equalization loss, we effectively enhance long-tailed chromatin state prediction, improving Acc by 0.048–0.056 and AUROC by 0.040–0.054 for the different models; and (ii) our chromatin state predictor performs better than the competing methods, surpassing the second-best method by 0.013 in both Acc and AUROC. These results indicate that our approach is practical for long-tailed chromatin state prediction.

4.3 Ablation Study

We conduct extensive ablation studies on the nine cell-type datasets to explore the effectiveness of data augmentation and equalization loss in long-tailed chromatin state prediction. To control computational cost, one strategy is varied at a time, with all others set to their default settings. As shown in Fig. 3 and Table 1, using the DM for tail-class sample generation can effectively improve performance, achieving Acc gains of 0.014–0.022 among the different methods. Given the bias between the generated sequences and the ground truth sequences, we then test the effectiveness of our proposed equalization loss. As shown in Table 1, the proposed equalization loss can mitigate the impact of the bias, achieving additional improvements of 0.024–0.026 in Acc and 0.027–0.032 in RA for the different models.

Fig. 4. The comparison of average AUROC performance using ground truth sequences and balanced sequences with equalization loss in six chromatin states.

Table 1. The average results on nine cell types for four methods in terms of accuracy (Acc), upper reference accuracy (UA), and relative accuracy (RA) under equalization loss or without equalization loss.

Method       Without equalization loss      With equalization loss
             Acc     UA      RA             Acc     UA      RA
DeepSEA      0.671   0.746   0.899          0.706   0.758   0.931
DanQ         0.683   0.758   0.901          0.719   0.772   0.931
Sei          0.676   0.755   0.895          0.702   0.761   0.922
DeepChrom    0.691   0.762   0.907          0.732   0.781   0.937

5 Conclusion and Discussion

We have presented DeepChrom, a novel method for predicting long-tailed chromatin states by generating pseudo-samples for tail classes. The proposed equalization loss can effectively mitigate the bias between ground truth and generated samples. Our experimental results show that DeepChrom is an effective and accurate framework for long-tailed chromatin prediction. The proposed long-tailed chromatin state learning strategy is agnostic to the architecture used, as seen in our results with the DeepSEA, DanQ, and Sei models. DeepChrom can inspire more work to further explore the long-tailed chromatin states, which will benefit the understanding of human genome function.

Although DeepChrom performs well in long-tailed chromatin state prediction, it still has some limitations. The first difficulty is the inability to systematically evaluate the generated sequences because of a lack of evaluation metrics; in computer vision, the inception score (IS) and Fréchet inception distance (FID) are commonly used to evaluate generated samples, but they are based on the Inception V3 network [24]. Another challenge is the difficulty in capturing dependencies between nucleotides caused by the one-hot encoding strategy [25]; e.g., for the DNA sequence 'ATGCA', the base 'G' is correlated with its contexts 'AT' and 'CA'. The k-mer splitting strategy can mitigate this issue [26]; however, restoring DNA sequences from DM-generated k-mer samples is a problem. For example, the generated k-mers may be 'ATG', 'AGC', and 'GCA', which cannot be recovered into a consistent sequence such as 'ATGCA'. We leave this aspect of sequence generation as future work.

References

1. Ernst, J., Kellis, M.: Discovery and characterization of chromatin states for systematic annotation of the human genome. Nat. Biotechnol. 28(8), 817–825 (2010)
2. Orouji, E., Raman, A.T.: Computational methods to explore chromatin state dynamics. Briefings in Bioinformatics 23(6), bbac439 (2022)
3. Dupont, S., Wickström, S.A.: Mechanical regulation of chromatin and transcription. Nat. Rev. Genet. 23(10), 624–643 (2022)
4. Preissl, S., Gaulton, K.J., Ren, B.: Characterizing cis-regulatory elements using single-cell epigenomics. Nat. Rev. Genet. 24(1), 21–43 (2023)
5. Chao, W., Quan, Z.: A machine learning method for differentiating and predicting human-infective coronavirus based on physicochemical features and composition of the spike protein. Chin. J. Electron. 30(5), 815–823 (2021)
6. Ernst, J., et al.: Mapping and analysis of chromatin state dynamics in nine human cell types. Nature 473(7345), 43–49 (2011)
7. Wang, Z., et al.: Prediction of histone post-translational modification patterns based on nascent transcription data. Nat. Genet. 54(3), 295–305 (2022)
8. Vu, H., Ernst, J.: Universal annotation of the human genome through integration of over a thousand epigenomic datasets. Genome Biol. 23, 1–37 (2022)
9. Zhou, J., Troyanskaya, O.G.: Predicting effects of noncoding variants with deep learning-based sequence model. Nat. Methods 12(10), 931–934 (2015)


10. Chen, K.M., Wong, A.K., Troyanskaya, O.G., Zhou, J.: A sequence-based global map of regulatory activity for deciphering human genetics. Nat. Genet. 54(7), 940–949 (2022)
11. Quang, D., Xie, X.: DanQ: a hybrid convolutional and recurrent deep neural network for quantifying the function of DNA sequences. Nucleic Acids Res. 44(11), e107 (2016)
12. Yao, Z., Zhang, W., Song, P., Hu, Y., Liu, J.: DeepFormer: a hybrid network based on convolutional neural network and flow-attention mechanism for identifying the function of DNA sequences. Briefings in Bioinformatics 24(2), bbad095 (2023)
13. Li, J., Pu, Y., Tang, J., Zou, Q., Guo, F.: DeepATT: a hybrid category attention neural network for identifying functional effects of DNA sequences. Briefings in Bioinformatics 22(3), bbaa159 (2021)
14. Kelley, D.R., Snoek, J., Rinn, J.L.: Basset: learning the regulatory code of the accessible genome with deep convolutional neural networks. Genome Res. 26(7), 990–999 (2016)
15. Ho, J., Jain, A., Abbeel, P.: Denoising diffusion probabilistic models. Adv. Neural Inf. Process. Syst. 33, 6840–6851 (2020)
16. Tan, J., et al.: Equalization loss for long-tailed object recognition. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 11662–11671 (2020)
17. Zhang, Y., Kang, B., Hooi, B., Yan, S., Feng, J.: Deep long-tailed learning: a survey. IEEE Trans. Pattern Anal. Mach. Intell. (2023)
18. Kang, B., et al.: Decoupling representation and classifier for long-tailed recognition. arXiv preprint arXiv:1910.09217 (2019)
19. Li, S., Gong, K., Liu, C.H., Wang, Y., Qiao, F., Cheng, X.: MetaSAug: meta semantic augmentation for long-tailed visual recognition. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5212–5221 (2021)
20. Zhang, S., Li, Z., Yan, S., He, X., Sun, J.: Distribution alignment: a unified framework for long-tail visual recognition. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 2361–2370 (2021)
21. Kingma, D.P., Dhariwal, P.: Glow: generative flow with invertible 1x1 convolutions. Adv. Neural Inf. Process. Syst. 31 (2018)
22. Dhariwal, P., Nichol, A.: Diffusion models beat GANs on image synthesis. Adv. Neural Inf. Process. Syst. 34, 8780–8794 (2021)
23. Liu, Y., Wang, Z., Yuan, H., Zhu, G., Zhang, Y.: HEAP: a task adaptive-based explainable deep learning framework for enhancer activity prediction. Briefings in Bioinformatics, bbad286 (2023)
24. Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., Wojna, Z.: Rethinking the inception architecture for computer vision. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2818–2826 (2016)
25. Zhang, Q., et al.: Base-resolution prediction of transcription factor binding signals by a deep learning framework. PLoS Comput. Biol. 18(3), e1009941 (2022)
26. Zhang, Y., et al.: Uncovering the relationship between tissue-specific TF-DNA binding and chromatin features through a transformer-based model. Genes 13(11), 1952 (2022)

Adaptable Conservative Q-Learning for Offline Reinforcement Learning

Lyn Qiu1, Xu Li2, Lenghan Liang3, Mingming Sun2, and Junchi Yan1(B)

1 MoE Key Lab of Artificial Intelligence, Shanghai Jiao Tong University, Shanghai, China
{lyn qiu,yanjunchi}@sjtu.edu.cn
2 Cognitive Computing Lab, Baidu Research, Beijing, China
{lixu13,sunmingming01}@baidu.com
3 School of Artificial Intelligence, Peking University, Beijing, China
[email protected]

Abstract. The Out-of-Distribution (OOD) issue presents a considerable obstacle in offline reinforcement learning. Although current approaches strive to conservatively estimate the Q-values of OOD actions, their excessive conservatism under constant constraints may adversely affect model learning throughout the policy learning procedure. Moreover, the diverse task distributions across various environments and behaviors call for tailored solutions. To tackle these challenges, we propose the Adaptable Conservative Q-Learning (ACQ) method, which capitalizes on the Q-value's distribution for each fixed dataset to devise a highly generalizable metric that strikes a balance between the conservative constraint and the training objective. Experimental outcomes reveal that ACQ not only holds its own against a variety of offline RL algorithms but also significantly improves the performance of CQL on most D4RL MuJoCo locomotion tasks in terms of normalized return.

Keywords: Offline Reinforcement Learning · Deep Reinforcement Learning · Out-of-distribution problem · Conservative algorithm

1 Introduction

In recent years, reinforcement learning (RL) has showcased its immense potential across a vast array of domains, spanning from robotic control [14] and strategy games [21,37] to recommendation systems [27,43]. Nevertheless, the substantial costs and safety concerns associated with agent-environment interactions, particularly in real-world scenarios, pose significant challenges [7]. Offline RL (Batch RL) [20,22] offers an enticing alternative, enabling agents to learn from

extensive, pre-collected datasets without necessitating further interactions. However, since the target value in policy evaluation relies on the learned policy for sampling actions, while the Q-value during training employs the behavior policy for action sampling, this inevitably leads to out-of-distribution (OOD) actions that may impede effective training.

To alleviate the distribution shift issue, most existing methods [24,32,35] employ explicit policy constraints, ensuring that the learned policy remains closely aligned with the behavior policy; additionally, they utilize latent-action learning and uncertainty estimation to minimize overestimation [2,26,29]. Among these techniques, maintaining conservatism in Q-value estimation [19] is widely adopted in offline RL, as it penalizes the values of OOD state-action pairs while simultaneously limiting generalization [25]. To strike a balance, a previous method [38] adopts neural networks as flexible weights and incorporates surrogate losses to learn suitable adaptive functions. However, this approach encounters unstable training and exhibits considerable fluctuations across different task settings, necessitating unified and appropriate measures for various environments and behaviors.

In this work, we introduce Adaptable Conservative Q-Learning (ACQ), an innovative algorithm designed to tackle the OOD problem. Our main contributions are as follows:
1) Building upon the foundation of Conservative Q-Learning (CQL), we develop adaptive conservatism by updating the loss weight for constraints according to the moving average difference between the predicted Q-value and a predefined anchor value.
2) Through a quantitative analysis of the datasets' statistical properties and attempts to employ the median percentile as the anchor value, we find that ACQ can keep the Q-value under the learned policy close to the expected Q-value through adaptive conservatism. Experiments on the D4RL-v0 and D4RL-v2 benchmarks demonstrate that our approach surpasses CQL in 10 out of 15 tasks, achieving a higher average score and rendering it competitive with recent state-of-the-art offline RL methods.

2 Related Work

Model-Free Offline RL. Numerous model-free offline RL methods strive to maximize returns while ensuring that the learned policy and the behavior policy remain sufficiently close. This is typically achieved by incorporating explicit policy constraints, such as the KL-divergence [12], the Wasserstein distance [39], or the Euclidean distance [9]. Alternatively, implicit constraints can be enforced by leveraging importance sampling [28], learning latent actions [3], and uncertainty quantification [2,4]. However, these methods may produce over-pessimistic Q-value functions and suffer from unstable Q-value volatility, resulting in fluctuating performance on non-expert datasets.


Model-Based Offline RL. In contrast to model-free approaches, model-based offline RL methods [30,41] learn the dynamics model in a supervised manner and utilize the learned dynamics for policy optimization [25]. Despite their potential, these methods may be hindered by issues such as lack of interpretability, weak generalizability, and high computational costs [42].

Conservative Algorithms. Conservative algorithms [25,26,34] aim to provide a conservative estimate of the Q-function when the dataset is not abundant enough, and they have demonstrated considerable success in offline RL. For instance, the principle has been incorporated into a modified version of classical value iteration called model-based offline value iteration [5], in which a penalty term is subtracted from the estimated Q-values. This approach has been shown to achieve impressive sample efficiency in recent studies [13,33]. It is worth noting that the model-based approach relies on constructing an empirical transition kernel, which necessitates a specific representation of the environment, as discussed in [1,23].

Fig. 1. The cumulative distribution function of D4RL-v0 datasets in all MuJoCo locomotion tasks.

3 Preliminaries

We represent the environment as an infinite-horizon Markov Decision Process (MDP) defined by a tuple $(\mathcal{S}, \mathcal{A}, P, r, \rho_0, \gamma)$, where $\mathcal{S}$ is the state space, $\mathcal{A}$ is the action space, $P(\cdot \mid s, a)$ is the transition distribution, $r$ is the reward function, $\rho_0(s)$ is the initial state distribution, and $\gamma \in (0, 1)$ is the discount factor. The goal of the RL agent is to learn a policy $\pi(a \mid s)$ that maximizes the expected cumulative discounted reward $\sum_{i=t+1}^{\infty} \gamma^{i} r(s_i, a_i, s_{i+1})$. Given a replay buffer $D = \{(s, a, r, s')\}$, policy evaluation and policy improvement can be represented as follows:

$$\hat{Q}^{k+1} \leftarrow \arg\min_{Q} \mathbb{E}_{s,a\sim D}\Big[\big(\hat{\mathcal{B}}^{\pi}\hat{Q}^{k}(s,a) - Q(s,a)\big)^{2}\Big] \quad \text{(policy evaluation)} \tag{1}$$

$$\hat{\pi}^{k+1} \leftarrow \arg\max_{\pi} \mathbb{E}_{s\sim D,\, a\sim \pi^{k}(a\mid s)}\big[\hat{Q}^{k+1}(s,a)\big] \quad \text{(policy improvement)} \tag{2}$$
where $k$ is the iteration number and $\hat{\mathcal{B}}^{\pi}$ denotes the Bellman operator on the fixed dataset following the policy $\hat{\pi}(a \mid s)$, defined as $\hat{\mathcal{B}}^{\pi}\hat{Q}^{k}(s,a) = \mathbb{E}_{s,a,s'\sim D}\big[r(s,a) + \gamma\,\mathbb{E}_{a'\sim\hat{\pi}^{k}(a'\mid s')}[\hat{Q}^{k}(s',a')]\big]$. The main difference between online RL and offline RL is that the dataset $D$ has to be collected by a behavior policy $\pi_{\beta}(a \mid s)$.

Anchor Value. The anchor value is an estimate of the expected cumulative reward that an agent can obtain from a given state or state-action pair when following a specific policy. In our approach, we employ the 50th percentile of the Q-values obtained from the collected datasets. For a relatively stable data distribution (as shown in Fig. 1), the 50th percentile lies in the middle of the distribution and better reflects its central tendency, being less affected by extreme values. Meanwhile, the anchor value is computed based on the expected discounted reward under the 50-percent-best policy present in the dataset.

Algorithm 1. Adaptable Conservative Q-Learning
1: Initialize offline dataset $D$, Q-function $Q_\theta$, policy $\pi_\phi$, and anchor value $Q^{\mathrm{Anchor}}$. Initialize balance weight $\eta^{0} = 0$ and moving average factor $w$.
2: for $k = 1$ to $K$ do
3: Sample batch $B$ from $D$
4: Train the Q-function $Q_\theta^{k}(s,a)$ using $G_Q$ gradient steps on the objective from Eq. (12): $G_\theta = \nabla_\theta\,\mathrm{ACQ}(\theta)$
5: Improve the policy $\pi_\phi$ via $G_\pi$ gradient steps with SAC-style entropy regularization [11]: $G_\phi = \nabla_\phi\,\mathbb{E}_{s\sim B,\, a\sim\pi_\phi^{k}(\cdot\mid s)}\big[Q_\theta^{k}(s,a) - \log\pi_\phi^{k}(a\mid s)\big]$
6: Update the balance weight $\eta^{k}$ via Eq. (5): $\eta^{k+1} \leftarrow \eta^{k} + w\cdot\mathbb{E}_{s,a\sim B}\big[Q_\theta^{k}(s,a) - Q^{\mathrm{Anchor}}\big]$
7: end for
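The only addition to a standard CQL-style actor-critic loop is the balance-weight update in line 6. As a rough illustration (not the authors' released implementation), the loop could be organized as in the sketch below; `dataset.sample`, `acq_critic_loss`, and `sac_actor_loss` are hypothetical helpers standing in for the usual offline-RL plumbing.

```python
import torch

# Illustrative sketch of Algorithm 1; names other than eta, w and the anchor
# value are placeholders, not the authors' code.
def train_acq(q_net, policy, q_optim, pi_optim, dataset, q_anchor,
              w=2e-4, num_iters=1_000_000, batch_size=256):
    eta = 0.0  # balance weight eta^0 = 0
    for k in range(num_iters):
        batch = dataset.sample(batch_size)                    # (s, a, r, s')
        # Critic step: Eq. (12) with the current eta as the conservatism weight.
        q_loss = acq_critic_loss(q_net, policy, batch, eta)   # hypothetical helper
        q_optim.zero_grad(); q_loss.backward(); q_optim.step()
        # Actor step: SAC-style entropy-regularized policy improvement (line 5).
        pi_loss = sac_actor_loss(q_net, policy, batch)        # hypothetical helper
        pi_optim.zero_grad(); pi_loss.backward(); pi_optim.step()
        # Eq. (5): move eta according to the gap between the batch Q and the anchor.
        with torch.no_grad():
            q_batch = q_net(batch.states, batch.actions).mean()
            eta = eta + w * (q_batch.item() - q_anchor)
    return q_net, policy
```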

4 Methodology

In this section, we first present the instantiation of our proposed algorithm, Adaptable Conservative Q-Learning (ACQ). Following this, we provide a theoretical analysis to verify the convergence of ACQ. Lastly, we offer practical implementation details.

4.1 Adaptable Conservative Q-Learning

We use Conservative Q-Learning [19] as our baseline algorithm; its objective is to minimize the expected Q-value under the dataset distribution of $(s, a)$ pairs. CQL uses a conservative estimate of the Q-function to address the challenges of distributional shift and overestimation bias. Specifically, it penalizes the expected Q-value under a distribution $\mu(a \mid s)$ of OOD state-action pairs to obtain a lower bound, and tightens this bound with an additional maximization term under the
learned data distribution $\hat{\pi}_\beta(a \mid s)$. The iterative update can be expressed as:

$$\hat{Q}^{k+1} \leftarrow \arg\min_{Q}\; \alpha\cdot\Big(\mathbb{E}_{s\sim D,\, a\sim\mu(a\mid s)}[Q(s,a)] - \mathbb{E}_{s\sim D,\, a\sim\hat{\pi}_\beta(a\mid s)}[Q(s,a)]\Big) + \frac{1}{2}\,\mathbb{E}_{s,a,s'\sim D}\Big[\big(Q(s,a) - \hat{\mathcal{B}}^{\pi}\hat{Q}^{k}(s,a)\big)^{2}\Big] \tag{3}$$

where the former part indicates the conservatism part, while the latter part corresponds to a standard Bellman error objective.

$$\min_{Q}\; \eta^{k+1}\cdot\Big(\mathbb{E}_{s\sim D,\, a\sim\mu(a\mid s)}[Q(s,a)] - \mathbb{E}_{s\sim D,\, a\sim\hat{\pi}_\beta(a\mid s)}[Q(s,a)]\Big) + \frac{1}{2}\,\mathbb{E}_{s,a\sim D}\Big[\big(Q(s,a) - \hat{\mathcal{B}}^{\pi}\hat{Q}^{k}(s,a)\big)^{2}\Big] \tag{4}$$

$$\eta^{k+1} \leftarrow \eta^{k} + w\cdot\mathbb{E}_{s,a\sim D}\big[Q^{k}(s,a) - Q^{\mathrm{Anchor}}\big] \tag{5}$$

In our algorithm, we set a smoothed moving average factor $w$ [10] for the trade-off factor $\eta^{k+1}$ to facilitate the convergence of the expected Q-value $Q^{k}(s,a)$ towards the designated anchor value $Q^{\mathrm{Anchor}}$. In cases where the estimated Q-value exceeds the anchor value, the adjustable factor associated with the conservative components is increased to prevent sudden escalations in the Q-value. On the other hand, when the estimated Q-value falls below the anchor value, the adjustable factor is decreased to prevent the Q-value estimates from becoming indistinct, which would hinder the learning of distinct action advantages.

Fig. 2. Visualization of percentile of Q-value distribution for all task settings on the D4RL-v0 environment.


For a better understanding of ACQ, setting the derivative of Eq. (4) to zero, we can represent the $Q^{k+1}$ iteration as:

$$\hat{Q}^{k+1}(s,a) = \hat{\mathcal{B}}^{\pi}\hat{Q}^{k}(s,a) - \eta^{k+1}\,\frac{\mu(a\mid s)}{\hat{\pi}_\beta(a\mid s)} \quad \forall s, a \in D,\; \forall k, \tag{6}$$

Proposition 1. Let $\tau = w\cdot\frac{\mu(a\mid s)}{\hat{\pi}_\beta(a\mid s)}$ represent a coupled indicating factor between the adjustable penalty weight and the importance sampling factor $\frac{\mu(a\mid s)}{\hat{\pi}_\beta(a\mid s)}$. If $\tau < (1-\sqrt{\gamma})^{2}$, the sequence of Q-values $Q^{k}(s,a)$ of ACQ under the policy $\mu(a\mid s)$ converges to the given anchor value $Q^{\mathrm{Anchor}}$.

Proof. Using the Bellman iteration with the balance weight, we can express $Q^{k+1}$ as:

$$Q^{k+1}(s,a) = r + \gamma Q^{k}(s,a) - \tau\cdot\sum_{i=0}^{k} Q^{i} \tag{7}$$

For simplicity, let $\bar{Q}^{k+1} = Q^{k+1}(s,a) - Q^{\mathrm{Anchor}}$; then we obtain:

$$\bar{Q}^{k+1} = \gamma\bar{Q}^{k} - \tau\cdot\sum_{i=0}^{k}\bar{Q}^{i} \tag{8}$$

For ease of analysis, let $J_k = \sum_{i=0}^{k}\bar{Q}^{i}$. The recursion can then be written as:

$$\begin{pmatrix} J_{k+1} \\ J_{k} \end{pmatrix} = \begin{pmatrix} 1+\gamma-\tau & -\gamma \\ 1 & 0 \end{pmatrix}\begin{pmatrix} J_{k} \\ J_{k-1} \end{pmatrix} \tag{9}$$

Let $\lambda_i$ denote an eigenvalue of $\begin{pmatrix} 1+\gamma-\tau & -\gamma \\ 1 & 0 \end{pmatrix}$. Then $|\lambda_i| < 1$ for all $\lambda_i \in \{\lambda_1, \ldots, \lambda_n\}$ whenever $0 \le \tau < (1-\sqrt{\gamma})^{2}$, and hence the convergence of the Q-function iteration holds.
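As a small numerical illustration of Proposition 1 (not part of the original analysis), the spectral radius of the iteration matrix in Eq. (9) can be evaluated for a few values of τ inside the stated range; it stays strictly below 1, which is what drives the convergence argument.

```python
import numpy as np

gamma = 0.99
threshold = (1 - np.sqrt(gamma)) ** 2  # (1 - sqrt(gamma))^2 from Proposition 1

def spectral_radius(tau, gamma=gamma):
    # 2x2 matrix from Eq. (9)
    M = np.array([[1 + gamma - tau, -gamma],
                  [1.0,              0.0]])
    return np.max(np.abs(np.linalg.eigvals(M)))

for frac in (0.25, 0.5, 0.9):
    tau = frac * threshold
    # Each printed spectral radius is strictly below 1 for 0 < tau < threshold.
    print(f"tau = {tau:.3e}  rho(M) = {spectral_radius(tau):.6f}")
```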

4.2 Variants and Practical Objective

To reduce the computational cost of Eq. (3), CQL(R) [19] uses $\mu(a \mid s)$ to approximate the policy that maximizes the current Q-value, together with a regularizer $\mathcal{R}(\mu)$:

$$\min_{Q}\max_{\mu}\; \frac{1}{2}\,\mathbb{E}_{s,a,s'\sim D}\Big[\big(Q(s,a) - \hat{\mathcal{B}}^{\pi^{k}}\hat{Q}^{k}(s,a)\big)^{2}\Big] + \alpha\cdot\Big(\mathbb{E}_{s\sim D,\, a\sim\mu(a\mid s)}[Q(s,a)] - \mathbb{E}_{s\sim D,\, a\sim\hat{\pi}_\beta(a\mid s)}[Q(s,a)]\Big) + \mathcal{R}(\mu) \tag{10}$$

In practice, it is observed that setting the regularizer to a KL divergence against a uniform prior distribution, $\mathcal{R}(\mu) = -D_{\mathrm{KL}}(\mu, \rho)$ with $\rho = \mathrm{Unif}(a)$, is more stable in high-dimensional action spaces, yielding CQL(H):

$$\min_{Q}\; \frac{1}{2}\,\mathbb{E}_{s,a,s'\sim D}\Big[\big(Q - \hat{\mathcal{B}}^{\pi^{k}}\hat{Q}^{k}\big)^{2}\Big] + \alpha\cdot\mathbb{E}_{s\sim D}\Big[\log\sum_{a}\exp\big(Q(s,a)\big) - \mathbb{E}_{a\sim\hat{\pi}_\beta(a\mid s)}[Q(s,a)]\Big] \tag{11}$$

The first term in the second part corresponds to a soft maximum of the Q-values for any state $s$. In practice, following the CQL(H) form in Eq. (11), we present a variant of Eq. (4) as follows:

$$\mathrm{ACQ}(\theta) = \frac{1}{2}\,\mathbb{E}_{s,a,s'\sim D}\Big[\big(Q_\theta - \hat{\mathcal{B}}^{\pi^{k}}\hat{Q}^{k}_\theta\big)^{2}\Big] + \eta^{k+1}\cdot\mathbb{E}_{s\sim D}\Big[\log\sum_{a}\exp\big(Q_\theta(s,a)\big) - \mathbb{E}_{a\sim\hat{\pi}_\beta(a\mid s)}[Q_\theta(s,a)]\Big] \tag{12}$$

Among them, the anchor value is linearly annealed from 0 to the chosen percentile of the Q-value distribution during the training process to guarantee the stability of the resilient conservatism. The complete pseudo-code is shown in Algorithm 1.
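For intuition, a hedged sketch of the objective in Eq. (12) for a discrete-action critic is given below; the function and variable names are illustrative, and for the continuous-action MuJoCo tasks the log-sum-exp term is in practice approximated by sampling actions, as in the CQL(H) implementation [18,19], rather than enumerating them.

```python
import torch
import torch.nn.functional as F

def acq_critic_loss(q_net, target_q_net, policy, batch, eta, gamma=0.99):
    """Illustrative discrete-action version of Eq. (12); not the authors' code."""
    s, a, r, s_next = batch                      # tensors; a holds integer action indices
    # Bellman error term (first part of Eq. (12)).
    with torch.no_grad():
        next_q = target_q_net(s_next)                            # [B, |A|]
        next_a = policy(s_next).sample()                         # a' ~ pi^k(.|s')
        target = r + gamma * next_q.gather(1, next_a.unsqueeze(1)).squeeze(1)
    q_all = q_net(s)                                             # [B, |A|]
    q_data = q_all.gather(1, a.unsqueeze(1)).squeeze(1)          # Q_theta(s, a) on dataset actions
    bellman = 0.5 * F.mse_loss(q_data, target)
    # Adaptive conservatism term: logsumexp_a Q(s, a) minus Q on the behavior actions.
    conservatism = (torch.logsumexp(q_all, dim=1) - q_data).mean()
    return bellman + eta * conservatism
```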

4.3 Implementation Settings

Q-Value. We compute the Q-value for each state-action pair $(s,a)$ by employing the Monte Carlo return, i.e., the sum of discounted rewards from that state-action pair until the termination of the episode. Although the Gym MuJoCo environments are intrinsically infinite-horizon, non-episodic continuing tasks, the maximum episode length is pragmatically set to 1000 time steps. As a result, the Q-value might not precisely approximate the infinite-horizon return for state-action pairs occurring near the end of an episode due to this constraint. Nonetheless, for state-action pairs appearing near the beginning of an episode, the Q-value provides a close approximation of the infinite-horizon return.

Lower Bound Guarantee. Meanwhile, since we adapt our ACQ based on the CQL framework, we need to guarantee a lower bound on $\eta$ (which plays the role of $\alpha$ in CQL) [19]:

$$\eta \ge \max_{s,a}\frac{C_{r,T,\delta}\,R_{\max}}{(1-\gamma)\sqrt{|D(s,a)|}}\cdot\max_{s,a}\Big[\frac{\mu(a\mid s)}{\hat{\pi}_\beta(a\mid s)}\Big]^{-1} \tag{13}$$

Here $|D(s,a)|$ denotes the state-action pair count, and $C_{r,T,\delta}$ is a constant that depends on the properties of $r(s,a)$ and $T(s' \mid s,a)$. Thus, to mitigate the overestimation introduced by OOD actions, we need to set $\eta$ properly.
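A minimal sketch of the anchor-value computation described above (Monte Carlo returns over the dataset followed by the 50th percentile) might look as follows; `episodes` is assumed to be an iterable of per-episode reward sequences.

```python
import numpy as np

def anchor_value(episodes, gamma=0.99, percentile=50):
    """Illustrative anchor-value computation (Sect. 3 / Sect. 4.3), not the authors' code."""
    returns = []
    for rewards in episodes:                 # each episode is a list/array of rewards
        g = 0.0
        discounted = []
        for r in reversed(rewards):          # backward pass: G_t = r_t + gamma * G_{t+1}
            g = r + gamma * g
            discounted.append(g)
        returns.extend(discounted)           # one Monte Carlo return per (s, a) pair
    return float(np.percentile(returns, percentile))
```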

5 Experiments

In this section, we showcase the capabilities of ACQ across a diverse range of scenarios. We begin by analyzing the Q-value distribution for each problem setting in D4RL datasets [8]. Following this, we conduct experiments to illustrate


Table 1. Average normalized return of all algorithms in the ’v2’ dataset of D4RL [8] over 4 random seeds. We highlighted the key comparison between ACQ and CQL. We use r as random, m as medium, m-r as medium-replay, m-e as medium-expert, and e as expert. BRAC-v MOReL DT

MCQ UWAC

halfcheetah-r

31.2

25.6

-0

28.5

2.3 ± 0.0

TD3 + BC IQL 11.0 ± 1.1

13.1

17.5 ± 1.5

17.8 ± 1.1

hopper-r

12.2

53.6



31.8

2.7 ± 0.3

8.5 ± 0.6

7.9

7.9 ± 0.4

10.2 ± 0.5

walker2d-r

1.9

37.3



17

2.0 ± 0.4

1.6 ± 1.7

5.4

5.1 ± 1.3

13.7 ± 2.3

halfcheetah-m

46.3

42.1

42.6

64.3

42.2 ± 0.4

48.3 ± 0.3

47.4

47.0 ± 0.5

51.4 ± 0.9

hopper-m

31.1

95.4

67.6

78.4

50.9 ± 4.4

59.3 ± 4.2

66.2

53.0 ± 28.5 76.6 ± 7.6

walker2d-m

73.3 ± 17.7 67.5 ± 0.2

81.1

77.8

74

91

75.4 ± 3.0

83.7 ± 2.1

78.3

halfcheetah-m-r 47.7

40.2

36.6

56.8

35.9 ± 3.7

44.6 ± 0.5

44.2

hopper-m-r

0.6

93.6

82.7

101.6 25.3 ± 1.7

walker2d-m-r

0.9

60.9 ± 18.8 94.7

CQL

45.5 ± 0.7

ACQ(ours)

81.2 ± 0.1

88.7 ± 12.9 48.8 ± 1.5

49.8

66.6

91.3

23.6 ± 6.9

81.8 ± 5.5

73.8

81.8 ± 2.7

halfcheetah-m-e 41.9

53.3

86.8

87.5

42.7 ± 0.3

90.7 ± 4.3

86.7

75.6 ± 25.7 78.1 ± 18.6

97.6 ± 6.5

hopper-m-e

0.8

108.7

107.6 111.2 44.9 ± 8.1

98.0 ± 9.4

91.5

105.6 ± 12.9 94.3 ± 16.1

walker2d-m-e

81.6

95.6

108.1 114.2 96.5 ± 9.1

110.1 ± 0.5 109.6 107.9 ± 1.6 106.8 ± 0.6

Average Above 31.4

64.4



72.8

37 ± 3.19

58.2 ± 4.1

59.9

59.1 ± 8.9

62.0 ± 4.7

halfcheetah-e







96.2

92.9 ± 0.6

96.7 ± 1.1

95

96.3 ± 1.3

90.6 ± 1.9

hopper-e







111.4 110.5 ± 0.5 107.8 ± 7

walker2d-e







107.2 108.4 ± 0.4 110.2 ± 0.3 109.9 108.5 ± 0.5 107.4 ± 0.8

Total Average







79.2

109.4 96.5 ± 28.0 103.2 ± 3.7

50.41 ± 2.7 67.55 ± 3.8 68.9

67.35 ± 9.1 69.68 ± 4.2

ACQ’s performance against various offline RL algorithms, encompassing both model-based and model-free techniques. Moreover, we perform ablation studies to evaluate the stability of the trade-off factor η in comparison to CQL and examine the percentile sensitivity in our method.

Fig. 3. Comparison of learning curves between ACQ and CQL on the v0 version of D4RL datasets. Curves are averaged over the 4 seeds.

Fig. 4. Comparison of learning curves between ACQ and CQL on the v2 version of D4RL datasets. Curves are averaged over the 4 seeds.

5.1 Experimental Details

We conducted all our experiments using a single NVIDIA RTX 3090 GPU, Python 3.7, and the PyTorch 1.8.0 framework [31]. The experiments were performed in the MuJoCo 2.10 environment [36] with mujoco-py 2.1.2.14. We employed the AdamW optimizer [16] with a batch size of 256, a discount factor of 0.99, critic and actor learning rates of $3 \times 10^{-4}$, and a smoothed moving average parameter of $2 \times 10^{-4}$.

5.2 Q-Value Distribution and Effect of the Percentile

Figure 1 displays the Q-value distribution across various problem settings in the D4RL-v2 dataset. Within the same environment, the variance, mean, and shape of the Q-value distribution exhibit significant differences due to distinct behavior policies. As a result, identifying a universally suitable penalty weight for all environments and policies is challenging. We conduct an ablation study on the percentile of the Q-value in different problem settings, as illustrated in Fig. 2, demonstrating the robustness of the Q-value percentile across problem settings. The results underline the need to adjust penalty weights based on the learned Q-value and the Q-value distribution of the environments and behavior policies.

5.3 Deep Offline RL Benchmarks

We conduct experiments on the MuJoCo locomotion tasks of the D4RL dataset, which comprise five types of datasets (random, medium, medium-replay, medium-expert, and expert) and three environments (Walker2d, Hopper, and HalfCheetah), yielding a total of 15 datasets. We use the most recently released "-v2" datasets for the main performance evaluation, and also extend the evaluation to the "-v0" version.

The Instability of CQL. First, to validate the challenge of selecting an appropriate constant η for each problem setting, we perform a parameter study on the more demanding D4RL-v0 dataset. We observe that CQL is sensitive to the trade-off factor η, and its performance on certain problem settings declines as η leans towards pessimistic (η = 14) or optimistic (η = 2) values. Specifically, we employ the default setting with η = 10. Although CQL maintains stable Q-values during the early training stage on Hopper-expert-v0, it rapidly deteriorates when faced with minor errors from OOD actions, leading to the breakdown of the learned policy. Similarly, CQL with η = 4 is overly optimistic on HalfCheetah-expert-v0, resulting in slow training and suboptimal performance within 1M iterations. However, CQL with η = 4 is excessively pessimistic on Walker2d-medium-v0, yielding Q-values significantly lower than the true Q-values and achieving suboptimal performance within 1M iterations.


Fig. 5. Ablation results on the D4RL-v2 datasets: (a) performance of CQL and ACQ under different values of the trade-off factor η; (b) performance of ACQ with anchor-value percentiles ranging from 50% to 90%.

Main Results. We compare ACQ against several strong baseline methods and take their results from reliable official sources. Specifically, we obtain results for MOReL [15], MCQ [25], UWAC [40], CQL [19], TD3-BC [9], and IQL [17] from the authors' implementations, while the numbers for BRAC-v [39] and DT [6] are taken from the D4RL [8] paper. Besides, the results of CQL are obtained by running the official codebase [18]. As depicted in Table 1, our method's performance remarkably surpasses prior work on most problem settings in the D4RL-v2 dataset. We find that CQL and TD3-BC perform the best among all baselines, and ACQ outperforms these two methods in most of the tasks. Notably, most baseline algorithms exhibit inconsistent performance across the problem settings due to the various transition distributions, achieving good performance with high variance. In contrast, our ACQ consistently performs well across almost all tasks, benefiting from the resilient conservatism that depends on the moving average difference between the expectation of the learned Q-value and the anchor value. Figures 3 and 4 further demonstrate that ACQ exhibits enhanced stability compared to CQL, attributable to the flexible conservatism.

5.4 Ablation Study

Robust Performance Across Various η Values. Recall that the performance of CQL is sensitive to the choice of η (Sect. 4.3). In this section, we conduct extensive experiments on the D4RL-v2 dataset to assess this sensitivity. As illustrated in Fig. 5a, CQL's performance exhibits substantial fluctuations under different η values, raising concerns about its reliability in real-world applications. By incorporating ACQ, we empower CQL to adaptively adjust the trade-off between the conservative term and the RL objective, thereby achieving consistent performance improvements and maintaining stability across a range of η values.


Consistent Performance Across Varying Percentiles. We investigate the robustness of our approach to different anchor values by applying percentiles ranging from 50% to 90%. The findings, as illustrated in Fig. 5b, indicate that ACQ maintains stable performance with minimal variance across a broad spectrum of percentiles. As anticipated, altering the percentile within this range has an insignificant impact on the final outcome and stability. This observation reveals that the Q-value remains consistent as the percentile varies between 50% and 90%. For the sake of generalization, we ultimately chose the 50th percentile for our experiments.

6 Conclusion

Advanced offline RL methods predominantly emphasize constraining the learned policy to stay near the behavior policy present in the dataset, aiming to mitigate overestimation due to out-of-distribution actions. However, they often overlook the importance of striking a balance between conservatism and the RL objective, which leads to the adoption of a constant weight for the penalty term or manual hyperparameter selection for different problem settings. Real-world applications, on the other hand, demand robust and stable algorithms that minimize the need for exploration and evaluation within the environment. In this paper, we find that the 50th percentile of the Q-value distribution serves as a qualified anchor that adapts to the dataset's distribution and problem settings. Leveraging this statistic, we propose Adaptable Conservative Q-Learning towards an Anchor Value via an Adjustable Weight (ACQ), an approach that relies on adaptive conservatism towards the median percentile of the Q-value distribution. ACQ seamlessly integrates this adaptive weighting technique with the classical CQL method, consequently enhancing the algorithm's robustness.

References 1. Agarwal, A., Kakade, S., Yang, L.F.: Model-based reinforcement learning with a generative model is minimax optimal. In: Conference on Learning Theory, pp. 67–83. PMLR (2020) 2. Agarwal, R., Schuurmans, D., Norouzi, M.: An optimistic perspective on offline reinforcement learning. In: ICML, pp. 104–114. PMLR (2020) 3. Ajay, A., Kumar, A., Agrawal, P., Levine, S., Nachum, O.: Opal: Offline primitive discovery for accelerating offline reinforcement learning. arXiv preprint arXiv:2010.13611 (2020) 4. An, G., Moon, S., Kim, J.H., Song, H.O.: Uncertainty-based offline reinforcement learning with diversified q-ensemble. Adv. Neural. Inf. Process. Syst. 34, 7436–7447 (2021) 5. Azar, M.G., Osband, I., Munos, R.: Minimax regret bounds for reinforcement learning. In: International Conference on Machine Learning, pp. 263–272. PMLR (2017) 6. Chen, L., et al.: Decision transformer: reinforcement learning via sequence modeling. Adv. Neural. Inf. Process. Syst. 34, 15084–15097 (2021)


7. Dulac-Arnold, G., et al.: Challenges of real-world reinforcement learning: definitions, benchmarks and analysis. Mach. Learn. 110(9), 2419–2468 (2021) 8. Fu, J., Kumar, A., Nachum, O., Tucker, G., Levine, S.: D4rl: datasets for deep data-driven reinforcement learning. arXiv preprint arXiv:2004.07219 (2020) 9. Fujimoto, S., Gu, S.S.: A minimalist approach to offline reinforcement learning. Adv. Neural. Inf. Process. Syst. 34, 20132–20145 (2021) 10. Gui˜ no ´n, J.L., Ortega, E., Garc´ıa-Ant´ on, J., P´erez-Herranz, V.: Moving average and savitzki-golay smoothing filters using mathcad. Papers ICEE 2007, 1–4 (2007) 11. Haarnoja, T., Zhou, A., Abbeel, P., Levine, S.: Soft actor-critic: off-policy maximum entropy deep reinforcement learning with a stochastic actor. In: ICML, pp. 1861–1870. PMLR (2018) 12. Jaques, N., et al.: Way off-policy batch deep reinforcement learning of implicit human preferences in dialog. arXiv preprint arXiv:1907.00456 (2019) 13. Jin, Y., Yang, Z., Wang, Z.: Is pessimism provably efficient for offline rl? In: International Conference on Machine Learning, pp. 5084–5096. PMLR (2021) 14. Kendall, A., et al.: Learning to drive in a day. In: ICRA, pp. 8248–8254. IEEE (2019) 15. Kidambi, R., Rajeswaran, A., Netrapalli, P., Joachims, T.: Morel: Model-based offline reinforcement learning. Adv. Neural. Inf. Process. Syst. 33, 21810–21823 (2020) 16. Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) 17. Kostrikov, I., Nair, A., Levine, S.: Offline reinforcement learning with implicit qlearning. arXiv preprint arXiv:2110.06169 (2021) 18. Kumar, A., Zhou, A., Tucker, G., Levine, S.: Conservative q-learning for offline reinforcement learning https://arxiv.org/abs/2006.04779 19. Kumar, A., Zhou, A., Tucker, G., Levine, S.: Conservative q-learning for offline reinforcement learning. Adv. Neural. Inf. Process. Syst. 33, 1179–1191 (2020) 20. Lange, S., Gabel, T., Riedmiller, M.: Batch reinforcement learning. Reinforcement learning: State-of-the-art, pp. 45–73 (2012) 21. Leibo, J.Z., Zambaldi, V., Lanctot, M., Marecki, J., Graepel, T.: Multi-agent reinforcement learning in sequential social dilemmas. arXiv preprint arXiv:1702.03037 (2017) 22. Levine, S., Kumar, A., Tucker, G., Fu, J.: Offline reinforcement learning: tutorial, review, and perspectives on open problems. arXiv preprint arXiv:2005.01643 (2020) 23. Li, G., Wei, Y., Chi, Y., Gu, Y., Chen, Y.: Breaking the sample size barrier in model-based reinforcement learning with a generative model. Adv. Neural. Inf. Process. Syst. 33, 12861–12872 (2020) 24. Liu, Y., Swaminathan, A., Agarwal, A., Brunskill, E.: Off-policy policy gradient with state distribution correction. arXiv preprint arXiv:1904.08473 (2019) 25. Lyu, J., Ma, X., Li, X., Lu, Z.: Mildly conservative q-learning for offline reinforcement learning. arXiv preprint arXiv:2206.04745 (2022) 26. Ma, Y., Jayaraman, D., Bastani, O.: Conservative offline distributional reinforcement learning. Adv. Neural. Inf. Process. Syst. 34, 19235–19247 (2021) 27. Munemasa, I., Tomomatsu, Y., Hayashi, K., Takagi, T.: Deep reinforcement learning for recommender systems. In: ICOIACT, pp. 226–233. IEEE (2018) 28. Nachum, O., Dai, B., Kostrikov, I., Chow, Y., Li, L., Schuurmans, D.: Algaedice: policy gradient from arbitrary experience. arXiv preprint arXiv:1912.02074 (2019) 29. O’Donoghue, B., Osband, I., Munos, R., Mnih, V.: The uncertainty bellman equation and exploration. In: International Conference on Machine Learning, pp. 3836– 3845 (2018)


30. Ovadia, Y., et al.: Can you trust your model’s uncertainty? evaluating predictive uncertainty under dataset shift. In: Advances in Neural Information Processing Systems 32 (2019) 31. Paszke, Aet al.: Pytorch: an imperative style, high-performance deep learning library. In: Advances in Neural Information Processing Systems 32 (2019) 32. Precup, D., Sutton, R.S., Dasgupta, S.: Off-policy temporal-difference learning with function approximation. In: ICML, pp. 417–424 (2001) 33. Rashidinejad, P., Zhu, B., Ma, C., Jiao, J., Russell, S.: Bridging offline reinforcement learning and imitation learning: a tale of pessimism. Adv. Neural. Inf. Process. Syst. 34, 11702–11716 (2021) 34. Sinha, S., Mandlekar, A., Garg, A.: S4rl: surprisingly simple self-supervision for offline reinforcement learning in robotics. In: Conference on Robot Learning, pp. 907–917. PMLR (2022) 35. Sutton, R.S., Mahmood, A.R., White, M.: An emphatic approach to the problem of off-policy temporal-difference learning. J. Mach. Learn. Res. 17(1), 2603–2631 (2016) 36. Todorov, E., Erez, T., Tassa, Y.: Mujoco: a physics engine for model-based control. In: 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 5026–5033. IEEE (2012) 37. Vinyals, O., et al.: Starcraft ii: A new challenge for reinforcement learning. arXiv preprint arXiv:1708.04782 (2017) 38. Wu, K., et al.: Acql: an adaptive conservative q-learning framework for offline reinforcement learning (2022) 39. Wu, Y., Tucker, G., Nachum, O.: Behavior regularized offline reinforcement learning. arXiv preprint arXiv:1911.11361 (2019) 40. Wu, Y., et al.: Uncertainty weighted actor-critic for offline reinforcement learning. arXiv preprint arXiv:2105.08140 (2021) 41. Yu, T., Kumar, A., Rafailov, R., Rajeswaran, A., Levine, S., Finn, C.: Combo: Conservative offline model-based policy optimization. Adv. Neural. Inf. Process. Syst. 34, 28954–28967 (2021) 42. Yu, T., et al.: Mopo: model-based offline policy optimization. Adv. Neural. Inf. Process. Syst. 33, 14129–14142 (2020) 43. Zou, L., Xia, L., Ding, Z., Song, J., Liu, W., Yin, D.: Reinforcement learning to optimize long-term user engagement in recommender systems. In: SIGKDD, pp. 2810–2818 (2019)

Boosting Out-of-Distribution Detection with Sample Weighting

Ao Ke, Wenlong Chen, Chuanwen Feng, and Xike Xie(B)

Data Darkness Lab, MIRACLE Center, Suzhou Institute for Advanced Research, University of Science and Technology of China, Hefei, China
{sa21225249,wlchency,chuanwen}@mail.ustc.edu.cn, [email protected]

Abstract. To enhance the reliability of machine learning models in the open world, there is considerable interest in detecting out-of-distribution (OOD) inputs that lie far away from the training distribution. Existing distance-based methods have shown promising performance in OOD detection. These methods generally follow a common approach of capturing the distances between training samples and each test sample in the feature space. This can be understood as encoding the distance information to assign sample weights, which are then used to calculate a weighted distance score to determine if the input is OOD. However, these methods often adopt a coarse-grained weighting approach, where only a small fraction of the training samples are considered and given weights. Consequently, they fail to fully leverage the complete distance information of the training data, leading to occasional difficulties in effectively distinguishing OOD samples. In this paper, we propose a novel approach to encode the complete distance information of the training data by assigning a weight to each sample based on its distance from the test sample, with the weights decaying as the distance increases. Furthermore, we introduce a weighted distance-based method for OOD detection. We demonstrate the superiority of our method over most existing supervised OOD detectors. Particularly, on a hard OOD detection task involving CIFAR-100 vs. CIFAR-10, our method achieves a reduction of 1.82% in the average FPR95 compared to the current best method KNN. Keywords: Out-of-distribution detection

· Contrastive training

1 Introduction

The prevailing assumption underlying the availability of machine learning models is that test inputs adhere to a distribution similar to the training data. However, this assumption is easily violated when these models are deployed in the open world [18,22]. The presence of out-of-distribution (OOD) inputs, which deviate from the distribution of the training data, referred to as in-distribution (ID), can lead to incorrect predictions by the trained models [5]. To prevent the model from making unreliable or unsafe predictions, it is crucial to detect OOD inputs and distinguish them from ID inputs.


Fig. 1. Illustration of sample weighting in three methods. The solid lines between training and test samples represent the sample weighting, where darker colors indicate higher weights. In our method, solid lines connect the test sample to all training samples, but most connections are omitted for clarity. The score functions for ID and OOD test samples are similar in the first two methods, whereas in our method, the score functions for ID and OOD test samples show distinct differences.

Among the many proposed OOD detection algorithms [5,12,29], distance-based methods [11,21,23,24] have attracted much research attention due to their promising performance. Under the assumption that OOD samples remain relatively far away from the ID samples in the feature space, these methods leverage distance-based metrics calculated from the extracted feature embeddings for detection. For example, Mahalanobis [11] utilizes the minimum Mahalanobis distance between a test sample and all class centroids as a detection metric, under the additional assumption that the trained features can be fitted well by multivariate Gaussian distributions, which does not always hold in the real world. Another state-of-the-art distance-based method, KNN [23], computes the Euclidean distance between a test sample and its k-th closest training sample. Although no distributional assumptions are required, this method suffers from high sensitivity to the parameter k, as it only considers a single pairwise distance.

A distinctive characteristic of distance-based methods is their utilization of distance information in a dual manner. They encode the distance information to assign sample weights and calculate weighted distances as detection scores. For example, Mahalanobis assigns a weight of one to the class centroid closest to the test sample while assigning weights of zero to the remaining class centroids. Similarly, KNN assigns a weight of one to the k-th nearest neighbor and weights of zero to other training samples. However, these methods, which we call coarse-grained weighting schemes, neglect most of the sample distances. As a result, they occasionally fail to effectively differentiate between ID and OOD samples, as illustrated in Fig. 1, where the score functions for ID and OOD test samples are similar. Particularly in near-OOD settings [27], where the ID and OOD distributions exhibit meaningful similarity, OOD detection becomes a challenging task that requires careful consideration. This observation has motivated us to investigate the potential benefits of different distance information encoding approaches for OOD detection.

2 Related Work

There have been many works for out-of-distribution detection [4–6,11,16,20,32, 34]. A straightforward approach is to utilize the output of the trained neural network as some estimation for test input [4,5,12,26]. For example, [5] proposed a baseline that adopts the predicted softmax probability as an indicator for detecting OOD samples, which is based on the empirical observation that the softmax probability of OOD samples tends to be lower than that of the ID samples [31]. However, the results are inconsistent since the trained neural network can also give a high confidence for some OOD samples. To improve the performance of OOD detection, [12] applies temperature scaling and small perturbation to inputs. Following up this road, [8] proposed to decompose the confidence of predicted class probabilities and modify the input pre-processing. The other line [1,6,13,14,17] is to leverage auxiliary OOD data during the training stage to enhance the discrepancy between OOD and ID samples. For

instance, [1] proposed a theoretically motivated method that mines auxiliary OOD data to improve the performance of OOD detection. To improve the efficiency of using OOD samples, [14] proposed a posterior sampling-based method that selects the most informative OOD samples from a large outlier set, which also potentially refines the decision boundary between ID and OOD samples. However, due to the diversity of the OOD sample space, it is difficult to collect representative OOD auxiliary data in the real world.

More recently, distance-based methods [2,11,23,25] have shown prominent performance in OOD detection. They adopt distance metrics between test inputs and ID data in the feature space to differentiate OOD samples from test inputs, under the assumption that OOD samples lie far from the ID data. For instance, [11,19] employed the Mahalanobis distance metric to distinguish OOD samples from test inputs, accompanied by an implicit distributional assumption. To relax the distributional assumption, [23] proposed calculating the distance to the k-th nearest neighbor as a metric from a non-parametric standpoint. In this case, all other ID training samples receive the same zero weight for a test input.

3 Methodology

In this section, we describe our weighted distance-based method for OOD detection. In Sect. 3.1, we introduce the setup for the OOD detection problem. In Sect. 3.2, we explain how distance information is encoded to derive weights and model the score function for OOD detection. In Sect. 3.3, we explore the use of supervised contrastive training to obtain richer feature representations, aiming for more informative OOD detection capabilities.

3.1 Problem Setting

In the context of multi-class classification, we define the input space as $\mathcal{X}$ and the label space as $\mathcal{Y} = \{1, 2, \ldots, C\}$. The training dataset is denoted as $D_{tr}^{in} = \{(x_i, y_i)\}_{i=1}^{n}$, comprising in-distribution (ID) data drawn from the joint distribution $P_{\mathcal{X}\mathcal{Y}}^{in}$. Additionally, we represent the marginal distribution over $\mathcal{X}$ as $P_{\mathcal{X}}$. The feature extractor $f: \mathcal{X} \to \mathbb{R}^{e}$, typically a parameterized deep neural network and often referred to as the penultimate layer, is employed to map an input $x$ to a high-dimensional feature embedding $f(x)$. Leveraging the data labels available in supervised training, we incorporate a neural network $g$, commonly a linear classifier, and employ $g \circ f: \mathcal{X} \to \mathbb{R}^{|\mathcal{Y}|}$ to generate a logit vector for predicting the label of an input sample.

Out-of-Distribution Detection. In the real world, apart from achieving reliable predictive performance on in-distribution samples, a trustworthy machine learning model should be capable of identifying out-of-distribution instances that originate from a different distribution than the training data. Specifically, given a test set of samples from $P_{\mathcal{X}}^{in} \times P_{\mathcal{X}}^{ood}$, OOD detection aims to perform a binary classification task by determining whether each sample $x \in \mathcal{X}$ belongs to the ID ($P_{\mathcal{X}}^{in}$) or the OOD ($P_{\mathcal{X}}^{ood}$).

3.2 Weighted Distance as Score Function

In this section, we present our method for OOD detection based on weighted distance calculation, which leverages the dual utilization of distance information: encoding the distance information to derive sample weights and further calculating the weighted distances as detection scores. To begin, we obtain the embedding set of the training samples $D_{tr} = \{x_i\}_{i=1}^{n}$, denoted as $Z = \{z_i\}_{i=1}^{n}$, where $z_i = f(x_i)/\|f(x_i)\|_2$ is the normalized penultimate feature. During testing, for a given test sample $\hat{x}$, we derive its normalized penultimate feature $\hat{z}$ and calculate the Euclidean distance $\|\hat{z} - z_i\|_2$ with respect to each training sample $x_i \in D_{tr}$. Next, we encode the distance information using an exponential decay function to generate a weight $w_i$ for each training sample $x_i$:

$$w_i = \frac{\exp(-\|\hat{z} - z_i\|_2)}{\sum_{k=1}^{n}\exp(-\|\hat{z} - z_k\|_2)}. \tag{1}$$

Subsequently, to evaluate whether a given input sample $\hat{x}$ is OOD, we calculate the weighted Euclidean distance to the training samples as the score $S_{\hat{x}}$, given by:

$$S_{\hat{x}} = \sum_{i=1}^{n} w_i\,\|\hat{z} - z_i\|_2. \tag{2}$$

A low score $S_{\hat{x}}$ indicates that the test sample lies close to some training samples in the feature space. Conversely, a high score suggests that the test sample is significantly distant from all the training samples, indicating a higher likelihood of it being an OOD sample.
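A minimal sketch of Eqs. (1)–(2), assuming the penultimate features have already been extracted and ℓ2-normalized, is shown below; the shift by the minimum distance is only a numerical-stability device and leaves the weights unchanged.

```python
import numpy as np

def ood_score(z_test, z_train):
    """Sketch of Eqs. (1)-(2): softmax-weighted Euclidean distance to the training set.
    z_test: (e,) normalized test feature; z_train: (n, e) normalized training features."""
    d = np.linalg.norm(z_train - z_test, axis=1)   # ||z_hat - z_i||_2 for every training sample
    w = np.exp(-(d - d.min()))                     # Eq. (1): weights decay with distance
    w = w / w.sum()                                # (shift by d.min() only avoids underflow)
    return float(np.sum(w * d))                    # Eq. (2): weighted distance score
```

A threshold on this score then yields the binary ID/OOD decision used in the evaluation.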

3.3 Contrastive Training for OOD Detection

Distance-based OOD detection methods rely on the quality of the feature space defined by the feature extractor, and learning a good feature representation is one of the critical steps in OOD detection frameworks. This is because a rich understanding of the key semantics of ID data, which can be obtained through training, allows OOD data lacking related semantics to be far away in the feature space [21]. Even though existing supervised detectors have achieved notable performance [8,11–13], the semantic representations generated by supervised learning only enable distinguishing between classes labeled in the dataset, which is insufficient for OOD detection [27]. Current state-of-the-art OOD detection methods provide additional means to further enrich the feature representation space, such as accessing extra OOD data [6,15] or adding a rotation-based self-supervised loss [7]. However, these approaches are limited by the need for sophisticated hyperparameter tuning, the practical challenges of collecting OOD data, or the problematic assumption that irrelevant auxiliary tasks can produce favorable representations. Exempt from these limitations, contrastive learning techniques can learn richer semantic features from the ID dataset than supervised learning, making them more effective for OOD detection [27].


We utilize supervised contrastive training (SupCon) [9] to learn feature representations, which aims to train a feature extractor $f$ that maps samples belonging to the same class closer together in the feature space and samples from different classes further apart. Given a training set $\{x_i, y_i\}_{i=1}^{N}$, where $x_i$ is a sample and $y_i$ is its corresponding label, we create $2N$ training pairs $\{\tilde{x}_i, \tilde{y}_i\}_{i=1}^{2N}$ by applying two random augmentations, where $\tilde{x}_{2i-1}$ and $\tilde{x}_{2i}$ are two augmentations of $x_i$ and $\tilde{y}_{2i-1} = \tilde{y}_{2i} = y_i$. We treat all other samples of the same class as positives and minimize the following loss:

$$\mathrm{Loss} = \frac{1}{2N}\sum_{i=1}^{2N} -\log\frac{\frac{1}{|P(i)|}\sum_{j\in P(i)} e^{u_i^{T} u_j/\tau}}{\sum_{k=1,\, k\neq i}^{2N} e^{u_i^{T} u_k/\tau}}; \qquad u_i = \frac{h(f(x_i))}{\|h(f(x_i))\|_2}, \tag{3}$$

where $h(\cdot)$ is a projection head, $\tau$ is the temperature, $P(i)$ is the set of indices of all positives distinct from $i$, and $|P(i)|$ is its cardinality.
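A compact sketch of this loss, written in the form of Eq. (3) (with the 1/|P(i)| factor inside the logarithm), could look as follows; it assumes `features` holds the projected embeddings of the two augmented views and `labels` their duplicated class labels.

```python
import torch
import torch.nn.functional as F

def supcon_loss(features, labels, tau=0.1):
    """Illustrative implementation of Eq. (3); not the authors' code.
    features: [2N, d] projection-head outputs; labels: [2N] class labels."""
    u = F.normalize(features, dim=1)                      # u_i = h(f(x_i)) / ||h(f(x_i))||_2
    sim = torch.matmul(u, u.T) / tau                      # u_i^T u_j / tau
    n = u.size(0)
    self_mask = torch.eye(n, dtype=torch.bool, device=u.device)
    pos_mask = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~self_mask   # P(i)
    exp_sim = torch.exp(sim).masked_fill(self_mask, 0.0)  # exclude k = i in the denominator
    denom = exp_sim.sum(dim=1)                            # sum_{k != i} e^{u_i^T u_k / tau}
    pos_cnt = pos_mask.sum(dim=1).clamp(min=1)
    num = (exp_sim * pos_mask.float()).sum(dim=1) / pos_cnt   # mean over positives
    return -(torch.log(num / denom)).mean()               # average over the 2N samples
```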

4 Experiment

4.1 Common Setup

Datasets. Following the common benchmarks used in the literature, we utilize CIFAR-10 and CIFAR-100 [10] as the in-distribution datasets and evaluate our method on iSUN [28], LSUN [30], Textures [3], and Places365 [33] as the OOD datasets in our primary experiments. For the large dataset Places365, we randomly select 10,000 test images. All images have a size of 32 × 32.

Experiment Details. We use ResNet18 as the backbone in our major experiments and train each network for 500 epochs using both the conventional cross-entropy (SupCE) loss and the SupCon loss. The batch size is 512, and we utilize stochastic gradient descent with momentum 0.9 and weight decay $10^{-4}$. Following the settings of KNN+ [23] with the SupCon loss, we initialize the learning rate at 0.5 and employ cosine annealing. The temperature $\tau$ is 0.1. The dimension of the penultimate feature is 512, and the dimension of the projection head is 128.

Baselines. We compare our method with the non-distance-based baselines in CIDER [15] and refer to the experimental results based on CIFAR-10. To demonstrate the effectiveness of our method, we replicate the state-of-the-art supervised distance-based methods, KNN and Mahalanobis.

Evaluation Metrics. We evaluate the performance of our OOD detectors using the following three metrics, which are widely adopted in the field: (1) the area under the receiver operating characteristic curve (AUROC), (2) the area under the precision-recall curve (AUPR), and (3) the false positive rate (FPR95) of OOD samples when the true positive rate of ID samples is equal to 95%. AUROC and AUPR are both threshold-independent metrics.
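The three metrics can be computed from the detection scores of Eq. (2) roughly as follows (an illustrative sketch, not the authors' evaluation code); here ID is treated as the positive class, which is one common convention.

```python
import numpy as np
from sklearn import metrics

def ood_metrics(scores_id, scores_ood):
    """AUROC, AUPR and FPR95 from detection scores (higher score = more likely OOD)."""
    labels = np.concatenate([np.ones(len(scores_id)), np.zeros(len(scores_ood))])
    conf = -np.concatenate([scores_id, scores_ood])     # higher confidence = more ID
    auroc = metrics.roc_auc_score(labels, conf)
    aupr = metrics.average_precision_score(labels, conf)
    thr = np.percentile(scores_id, 95)                  # 95% of ID scores fall below thr (TPR = 95%)
    fpr95 = float((np.asarray(scores_ood) <= thr).mean())   # OOD samples wrongly accepted as ID
    return auroc, aupr, fpr95
```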


Table 1. Comparison with different OOD detection methods on CIFAR-10 with SupCE loss. ↑ indicates larger values are better and vice versa.

| Method | iSUN AUROC↑ | iSUN FPR↓ | LSUN AUROC↑ | LSUN FPR↓ | Textures AUROC↑ | Textures FPR↓ | Places365 AUROC↑ | Places365 FPR↓ | Average AUROC↑ | Average FPR↓ |
|---|---|---|---|---|---|---|---|---|---|---|
| MSP [5] | 90.39 | 59.77 | 85.35 | 69.71 | 78.97 | 69.33 | 64.56 | 66.79 | 79.81 | 66.40 |
| Energy [13] | 83.21 | 73.53 | 91.22 | 56.42 | 86.62 | 64.35 | 87.22 | 65.91 | 87.06 | 65.05 |
| ODIN [12] | 94.09 | 37.35 | 94.93 | 31.45 | 85.52 | 70.51 | 90.38 | 48.49 | 91.23 | 46.95 |
| GODIN [8] | 94.94 | 34.03 | 95.08 | 32.43 | 89.69 | 46.91 | 87.31 | 62.63 | 91.75 | 44.00 |
| Mahalanobis [11] | 80.06 | 71.98 | 78.94 | 77.15 | 93.36 | 28.32 | 65.99 | 89.28 | 79.58 | 66.68 |
| KNN [23] | 93.72 | 39.02 | 94.81 | 35.37 | 93.59 | 41.01 | 90.63 | 48.52 | 93.18 | 40.98 |
| Ours | 93.93 | 37.56 | 95.17 | 32.20 | 93.86 | 39.43 | 90.78 | 47.47 | 93.43 | 39.16 |

Table 2. Evaluation on hard OOD detection tasks. The model is trained using ResNet18 and SupCE loss.

| Task | Method | AUROC↑ | AUPR↑ | FPR↓ |
|---|---|---|---|---|
| CIFAR-100 vs. CIFAR-10 | KNN | 76.94 | 72.98 | 81.78 |
| | Ours | 77.76 | 73.47 | 81.35 |
| CIFAR-10 vs. CIFAR-100 | KNN | 89.99 | 88.32 | 52.40 |
| | Ours | 90.11 | 88.54 | 51.69 |

4.2 Main Results

Evaluations on Common Benchmarks. We present results in Table 1, which includes a comprehensive comparison with state-of-the-art OOD detection methods. Our method demonstrates favorable performance across various evaluation metrics. All models are trained using ResNet18 with the SupCE loss, utilizing CIFAR-10 as the in-distribution dataset. Compared to the current best method, KNN, our method reduces the average FPR95 by 1.82%, which is a relative 4.44% reduction in error. Moreover, unlike Mahalanobis, which relies on the assumption that the embeddings of each class can be fitted well by a multivariate Gaussian distribution in feature space, our method, without making distribution assumptions, exhibits significant improvements of 13.85% in the average AUROC and 27.52% in the average FPR95. These findings further support the effectiveness of our proposed approach.

Evaluations on Hard OOD Tasks. We evaluate our method on hard OOD tasks where the ID and OOD samples are semantically similar, as discussed in the literature [27]. Specifically, we consider the CIFAR-100 vs. CIFAR-10 task and vice versa, which pose significant challenges for OOD detection. As shown in Table 2, our method consistently outperforms KNN across all evaluation metrics. Notably, in the more challenging task where CIFAR-100 is used as the


Fig. 2. Ablation study on weighting scheme. Results are based on CIFAR-10 (ID) in terms of (a) AUROC and (b) FPR95. The model is trained using ResNet18 and SupCon loss.

ID dataset, our method achieves a 0.82% improvement in AUROC, a 0.49% improvement in AUPR, and a 0.43% reduction in FPR95 compared to KNN. These results highlight the effectiveness of our proposed approach in encoding distance information with a decaying strategy, thereby enhancing performance in challenging OOD detection tasks.

Impact of Supervised Contrastive Learning. We investigate the impact of utilizing supervised contrastive loss in our method. As depicted in Table 3, when using CIFAR-10 as the ID dataset, our method trained with SupCon loss outperforms the variant trained with SupCE loss on most test datasets. Specifically, our approach with SupCon loss exhibits a 2.76% increase in AUROC and a significant 13.01% reduction in FPR95 based on the evaluation using the Places365 test set.

Table 3. Evaluation using supervised contrastive learning loss. Results are based on CIFAR-10 trained with ResNet18.

| Training Loss | iSUN AUROC↑ | iSUN FPR↓ | LSUN AUROC↑ | LSUN FPR↓ | Textures AUROC↑ | Textures FPR↓ | Places365 AUROC↑ | Places365 FPR↓ | Average AUROC↑ | Average FPR↓ |
|---|---|---|---|---|---|---|---|---|---|---|
| SupCE | 93.93 | 37.56 | 95.17 | 32.20 | 93.86 | 39.43 | 90.78 | 47.47 | 93.43 | 39.16 |
| SupCon | 91.80 | 56.41 | 95.53 | 29.72 | 95.43 | 27.53 | 93.54 | 34.46 | 94.07 | 37.03 |

4.3 Ablation Studies

In this section, we conduct ablations to explore the influence of different factors on the performance of our method. To ensure consistency, we conduct all ablation studies using CIFAR-10 as the ID dataset and models trained with SupCon loss, as described in Sect. 4.1.

Ablation on the Weighting Scheme. The key idea of our method lies in the encoding of distance information, where closer training samples receive higher weights. In this ablation study, we compare the performance of our method with and without sample weighting. The version without sample weighting assigns the same weight value w = 1/n to all training samples. As illustrated in Fig. 2, employing sample weighting improves the AUROC by 20.93% and reduces the FPR95 by 30.70% on the iSUN dataset, compared to the counterpart without sample weighting.

Table 4. Ablation study on distance metric. Results are presented for hard OOD detection tasks based on ResNet18 and SupCon loss.

| Task | Distance | AUROC↑ | AUPR↑ | FPR↓ |
|---|---|---|---|---|
| CIFAR-100 vs. CIFAR-10 | Mahalanobis | 71.13 | 65.98 | 87.70 |
| | Cosine | 59.82 | 59.94 | 88.11 |
| | Euclidean | 74.21 | 69.36 | 85.22 |
| CIFAR-10 vs. CIFAR-100 | Mahalanobis | 91.72 | 90.51 | 44.29 |
| | Cosine | 89.62 | 88.57 | 48.38 |
| | Euclidean | 92.04 | 90.38 | 44.04 |

Ablation on Distance Metric. In our work, we employ the calculation of Euclidean distance between embeddings from the penultimate layer. Furthermore, we investigate the advantages of using the Euclidean distance by comparing it with other commonly used distance metrics, namely the Mahalanobis distance and cosine distance, as presented in Table 4. The results indicate that employing the Euclidean distance yields superior performance compared to the Mahalanobis distance and cosine distance in the context of hard OOD detection tasks.
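For reference, the three distance metrics compared in Table 4 can be computed as in the sketch below (illustrative only; `cov_inv` denotes the inverse covariance estimated from the training features for the Mahalanobis case). Any of them can be plugged into Eqs. (1)–(2) in place of the Euclidean distance.

```python
import numpy as np

def pairwise_distances(z_test, z_train, metric="euclidean", cov_inv=None):
    """Distances between one test feature (e,) and training features (n, e)."""
    diff = z_train - z_test                               # (n, e)
    if metric == "euclidean":
        return np.linalg.norm(diff, axis=1)
    if metric == "cosine":
        sim = z_train @ z_test / (
            np.linalg.norm(z_train, axis=1) * np.linalg.norm(z_test) + 1e-12)
        return 1.0 - sim                                   # cosine distance
    if metric == "mahalanobis":
        return np.sqrt(np.einsum("ij,jk,ik->i", diff, cov_inv, diff))
    raise ValueError(f"unknown metric: {metric}")
```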

5 Conclusion

In this paper, we introduce a novel method for OOD detection based on sample weighting. Unlike existing distance-based methods that only consider partial sample distances, our proposed method assigns weights to each training sample based on its distance from the test sample, with the weights decaying as the distance increases. By incorporating the complete distance information, our method achieves superior performance on diverse OOD detection benchmarks, including hard OOD tasks. We believe that our work provides valuable insights into the encoding of distance information and can inspire further research exploring alternative sample-weighting strategies for OOD detection.

Acknowledgements. This work is supported by NSFC (No. 61772492, 62072428) and the CAS Pioneer Hundred Talents Program.


References 1. Chen, J., Li, Y., Wu, X., Liang, Y., Jha, S.: ATOM: robustifying out-of-distribution detection using outlier mining. In: Oliver, N., Pérez-Cruz, F., Kramer, S., Read, J., Lozano, J.A. (eds.) ECML PKDD 2021. LNCS (LNAI), vol. 12977, pp. 430–445. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-86523-8_26 2. Chen, X., Lan, X., Sun, F., Zheng, N.: A boundary based out-of-distribution classifier for generalized zero-shot learning. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12369, pp. 572–588. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58586-0_34 3. Cimpoi, M., Maji, S., Kokkinos, I., Mohamed, S., Vedaldi, A.: Describing textures in the wild. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3606–3613 (2014) 4. DeVries, T., Taylor, G.W.: Learning confidence for out-of-distribution detection in neural networks. arXiv preprint arXiv:1802.04865 (2018) 5. Hendrycks, D., Gimpel, K.: A baseline for detecting misclassified and out-ofdistribution examples in neural networks. In: Proceedings of International Conference on Learning Representations (2017) 6. Hendrycks, D., Mazeika, M., Dietterich, T.G.: Deep anomaly detection with outlier exposure. In: 7th International Conference on Learning Representations, ICLR 2019, New Orleans, LA, USA, 6-9 May (2019) 7. Hendrycks, D., Mazeika, M., Kadavath, S., Song, D.X.: Using self-supervised learning can improve model robustness and uncertainty. In: Neural Information Processing Systems (2019) 8. Hsu, Y.C., Shen, Y., Jin, H., Kira, Z.: Generalized odin: detecting out-ofdistribution image without learning from out-of-distribution data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 10951–10960 (2020) 9. Khosla, P., et al.: Supervised contrastive learning. Adv. Neural. Inf. Process. Syst. 33, 18661–18673 (2020) 10. Krizhevsky, A., Hinton, G., et al.: Learning multiple layers of features from tiny images (2009) 11. Lee, K., Lee, K., Lee, H., Shin, J.: A simple unified framework for detecting out-ofdistribution samples and adversarial attacks. In: Advances in Neural Information Processing Systems 31 (2018) 12. Liang, S., Li, Y., Srikant, R.: Enhancing the reliability of out-of-distribution image detection in neural networks. In: International Conference on Learning Representations (2018) 13. Liu, W., Wang, X., Owens, J., Li, Y.: Energy-based out-of-distribution detection. Adv. Neural. Inf. Process. Syst. 33, 21464–21475 (2020) 14. Ming, Y., Fan, Y., Li, Y.: Poem: out-of-distribution detection with posterior sampling. In: International Conference on Machine Learning (2022) 15. Ming, Y., Sun, Y., Dia, O., Li, Y.: How to exploit hyperspherical embeddings for out-of-distribution detection? In: Proceedings of the International Conference on Learning Representations (2023) 16. Osawa, K., et al.: Practical deep learning with bayesian principles. In: Advances in Neural Information Processing Systems 32 (2019) 17. Papadopoulos, A.A., Rajati, M.R., Shaikh, N., Wang, J.: Outlier exposure with confidence control for out-of-distribution detection. Neurocomputing 441, 138–150 (2021)


18. Parmar, J., Chouhan, S., Raychoudhury, V., Rathore, S.: Open-world machine learning: applications, challenges, and opportunities. ACM Comput. Surv. 55(10), 1–37 (2023) 19. Ren, J., Fort, S., Liu, J., Roy, A.G., Padhy, S., Lakshminarayanan, B.: A simple fix to mahalanobis distance for improving near-ood detection. arXiv preprint arXiv:2106.09022 (2021) 20. Ren, J., et al.: Likelihood ratios for out-of-distribution detection. Advances in neural information processing systems 32 (2019) 21. Sehwag, V., Chiang, M., Mittal, P.: Ssd: a unified framework for self-supervised outlier detection. In: International Conference on Learning Representations (2021) 22. Sun, Y., Li, Y.: Opencon: open-world contrastive learning. Trans. Mach. Learn. Res 23. Sun, Y., Ming, Y., Zhu, X., Li, Y.: Out-of-distribution detection with deep nearest neighbors. In: International Conference on Machine Learning (2022) 24. Tack, J., Mo, S., Jeong, J., Shin, J.: Csi: novelty detection via contrastive learning on distributionally shifted instances. Adv. Neural. Inf. Process. Syst. 33, 11839– 11852 (2020) 25. Techapanurak, E., Suganuma, M., Okatani, T.: Hyperparameter-free out-ofdistribution detection using cosine similarity. In: Proceedings of the Asian Conference on Computer Vision (2020) 26. Thulasidasan, S., Chennupati, G., Bilmes, J.A., Bhattacharya, T., Michalak, S.: On mixup training: improved calibration and predictive uncertainty for deep neural networks. In: Advances in Neural Information Processing Systems 32 (2019) 27. Winkens, J., et al.: Contrastive training for improved out-of-distribution detection. arXiv preprint arXiv:2007.05566 (2020) 28. Xu, P., Ehinger, K.A., Zhang, Y., Finkelstein, A., Kulkarni, S.R., Xiao, J.: Turkergaze: crowdsourcing saliency with webcam based eye tracking. arXiv preprint arXiv:1504.06755 (2015) 29. Yang, J., Zhou, K., Li, Y., Liu, Z.: Generalized out-of-distribution detection: a survey. arXiv preprint arXiv:2110.11334 (2021) 30. Yu, F., Seff, A., Zhang, Y., Song, S., Funkhouser, T., Xiao, J.: Lsun: construction of a large-scale image dataset using deep learning with humans in the loop. arXiv preprint arXiv:1506.03365 (2015) 31. Yu, Q., Aizawa, K.: Unsupervised out-of-distribution detection by maximum classifier discrepancy. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 9518–9526 (2019) 32. Zaeemzadeh, A., Bisagno, N., Sambugaro, Z., Conci, N., Rahnavard, N., Shah, M.: Out-of-distribution detection using union of 1-dimensional subspaces. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 9452–9461 (2021) 33. Zhou, B., Lapedriza, A., Khosla, A., Oliva, A., Torralba, A.: Places: a 10 million image database for scene recognition. IEEE Trans. Pattern Anal. Mach. Intell. 40(6), 1452–1464 (2017) 34. Zhou, Y.: Rethinking reconstruction autoencoder-based out-of-distribution detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 7379–7387 (2022)

Causal Discovery via the Subsample Based Reward and Punishment Mechanism

Jing Yang1,2(B), Ting Lu1,2, and Fan Kuai1,2

1 Intelligent Interconnected Systems Laboratory of Anhui Province, Hefei, People's Republic of China
2 School of Computer Science and Information Engineering, Hefei University of Technology, Hefei 230009, Anhui, People's Republic of China
[email protected]

Abstract. The discovery of causal relationships between variables from a large number of nonlinear observations is an important research direction in the field of data mining. Extensive studies have highlighted the challenge of constructing accurate causal networks when existing algorithms are applied to large-scale, nonlinear, and high-dimensional data. To address this challenge, we propose a general method for causal discovery algorithms applicable to large sample data, namely the subsample based reward and punishment mechanism (SRPM) method, which can handle nonlinear large sample data more effectively. We make three main contributions. First, we determine the size of the sample split based on experiments to obtain better learning results. Second, we develop a reward and punishment mechanism in which each subsample dynamically changes its own weight in the process of learning the network skeleton, and propose SRPM, a subsample method based on this reward and punishment mechanism. Finally, we combine SRPM with three different additive noise model structure learning algorithms applicable to non-Gaussian nonlinear data, and demonstrate the effectiveness of the method through experiments on data generated by a variety of nonlinear function dependencies. Compared with existing algorithms, causal network construction algorithms based on the SRPM method achieve considerable improvements in accuracy and time performance, and the effectiveness of the method is also verified on real power plant data.

Keywords: causal discovery · nonlinear · high-dimensional · large sample

This work was supported by the National Natural Science Foundation of China (No. 62176082 and 61902068) and the Guangdong Basic and Applied Basic Research Foundation (No. 2020A1515011499).

1 Introduction

Exploring and discovering causal relationships between variables is a burgeoning field in data mining, with significant applications in statistics and machine learning. In the past few decades, causal discovery based on observational data has made great progress in basic theory [1], algorithm design [2] and practical application [3,4], attracting the attention of scholars in related fields.

The current classical approaches to causal discovery fall into three broad categories. The first is causal discovery based on dependency analysis, which consists of two stages: causal skeleton learning and causal direction determination. The causal skeleton (SK) between variables is first learned with conditional independence tests under the causal Markov assumption, and the causal directions are then determined based on conditional independence and collider identification. Wermuth et al. [5] proposed a theoretical algorithm for building directed graphs based on the idea of dependency analysis, which laid the foundation for research on dependency-analysis-based methods. The second approach is causal discovery based on scoring search, which consists of two parts: a scoring function [6] and a search algorithm. The third is a hybrid approach that combines dependency analysis and scoring search. Such methods first obtain the adjacent nodes of each node through dependency analysis to reduce the search space and its spatial complexity, and then search for the best network with the scoring function; this is currently a mainstream way of exploring the causal discovery problem.

A plethora of causal discovery algorithms have already been proposed, such as the KMI (kernel mutual information) method of Gretton et al. [7] and the classical Granger causality (GC) framework reformulated by Duggento et al. [8] on the basis of the Echo State Network (ESN) model, which can detect nonlinear causality well in multivariate signals generated by arbitrarily complex networks. However, these algorithms typically have only average accuracy and high algorithmic complexity, and they are poorly suited to high-dimensional scenarios. The FI method [9] and the HDDM model [10] break the curse of high-dimensional data and can effectively infer the causal order of nonlinear multi-variable data. Nevertheless, due to limited sample sizes, these methods cannot meet the demands of the current big-data society. Therefore, discovering causal relationships from large-sample, nonlinear, and high-dimensional data remains an active area of research.

We developed a novel subsample-based method with a reward and punishment mechanism to overcome these challenges, drawing inspiration from the Adaboost algorithm [11]. This approach effectively handles large-scale, nonlinear, and high-dimensional data and contributes to the field of causal discovery. The critical contributions of our work are as follows:

(1) The relationship between the correctness of learning causal networks with additive noise models from nonlinear sample data and the number of samples is explored, whereby the large sample data is divided into multiple data streams of appropriate size.
(2) A reward and punishment mechanism is developed to give the skeletons different weights, which are fed back into learning the causal network skeleton. A new method, SRPM, is proposed that can effectively construct causal relationships for large sample data with nonlinearity and high dimensionality.
(3) In separate experiments on simulated and real data, we combine the SRPM method with three different structural learning algorithms for additive noise models applicable to nonlinear non-Gaussian data.

The remainder of this article is organized as follows. In the second part, we review related work in the field of causal discovery. In the third part, we propose improved algorithms for learning nonlinear and non-Gaussian causality using the SRPM method. In the fourth part, we present experimental results and analysis, comparing our approach with existing methods on both synthetic and real-world datasets. Finally, in the last part, we summarize our work.

2 Related Work

Bayesian networks are a classical causal model based on directed acyclic graphs (DAGs). Traditional models require many independence tests when performing dependency analysis. To reduce the computational complexity, Yang proposed the PCS (partial correlation statistics) algorithm [12]. While achieving higher accuracy, the algorithm's complexity is also significantly reduced. However, the partial correlation used in this algorithm cannot handle nonlinear data.

To address the complexity posed by data generated by nonlinear structural equation models, Yang et al. successively proposed three algorithms that can effectively handle nonlinear non-Gaussian data. First, Yang et al. proposed the RCS algorithm based on rank correlation [13]. Rank correlation computes correlations from the relative positions of variable values, regardless of whether the relationship between the variables is linear or nonlinear, so the RCS algorithm can uncover nonlinear relationships hidden in the data. However, rank correlation does not eliminate the influence of other variables, which may introduce bias. Therefore, Yang et al. proposed the PRCS algorithm based on partial rank correlation [14]. Partial rank correlation can eliminate the influence of other variables and reveal the true correlation between variables. Both of these algorithms measure the correlation of variables from the relative positions of variable values and ignore the magnitude information of those values, which may lose important information during network learning. Therefore, Yang et al. simplified the HSIC coefficient formula by using the mapping relation of the function, proposed the SC (Spatial Coordinates) and CSC (Conditional Spatial Coordinates) coefficients, designed a feature selection algorithm based on these two coefficients, and proposed a nonlinear causal structure learning algorithm (SCB), significantly reducing the computational complexity of independence analysis and improving judgment accuracy.

3 Introduction to the Algorithmic Framework

This section introduces the SRPM approach and its use in causal network construction. Like traditional structure learning algorithms, algorithms based on the SRPM method have two stages: skeleton discovery and DAG search. Since the DAG search stage still applies the original MDL scoring function [15] and hill-climbing search algorithm [16], it is not described in detail here. Next, we analyze the process and feasibility of the skeleton discovery stage of the algorithm.

3.1 Introduction to SRPM Method

By studying three additive noise model causal learning algorithms applicable to nonlinear non-Gaussian data, we found an interesting phenomenon: once the number of samples reaches a certain level, the number of structural errors in the learned network increases as the number of samples grows further. Why does this happen? Returning to the RCS, PRCS and SCB algorithms themselves, the model can neither learn all the characteristic information contained in that amount of data nor effectively remove the interference information. The degree of fit of the function curve first rises and then falls: too little data leads to over-fitting, and too much data leads to under-fitting. Outwardly, these algorithms therefore produce a larger number of structural errors on large sample data; inwardly, they produce more redundant edges in the skeleton search phase.

Can we control the input sample size at an appropriate level to optimize the effect of causality learning? Here, we turn to data segmentation. From each subsample obtained after segmentation, the corresponding skeleton structure is learned separately. We consider that as long as two variables are found to be neighboring nodes in half of all skeleton structures, an edge exists between these two variables in the average network skeleton (the skeletons obtained from the subsample data are synthesized in equal proportion into the final skeleton used in the search phase).

To further optimize the accuracy of causal discovery, we developed a reward and punishment mechanism that assigns a different weight to the skeleton of each subsample. It compares the result for any two variables in the skeleton of a subsample with the result in the average network skeleton described above; the count is increased by one if the results are consistent (a consistent result means that a correct edge was found or a non-existent edge was not added in the skeleton search phase), and decreased by one otherwise. The initial probability of each skeleton is obtained according to equation (1):

P = \frac{count}{n(n - 1)}    (1)

where n represents the number of variables in the original network and count is the counter to which the reward and punishment mechanism is applied. In order to eliminate the influence of magnitude between indicators, the data need to be standardized so that the indicators are comparable. The above P-values are further processed according to equation (2):

W_i = \frac{P_i}{\sum_{j=1}^{m} P_j}    (2)

The probability of each skeleton is divided by the sum of all skeleton probabilities, and the resulting W_i is the final weight of the skeleton corresponding to the i-th copy of the data, where m denotes the number of parts into which the large sample is divided.
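To make the weighting concrete, the following Python sketch implements Eqs. (1) and (2) under our reading of this subsection; the boolean adjacency-matrix representation of each subsample skeleton and the function name srpm_weights are our own illustrative choices, not part of the original description.

import numpy as np

def srpm_weights(sub_skeletons):
    # Reward-and-punishment weighting of subsample skeletons (Eqs. 1-2).
    # sub_skeletons: list of m boolean (n x n) adjacency matrices, one per
    # subsample, produced in the skeleton-discovery stage.
    m = len(sub_skeletons)
    n = sub_skeletons[0].shape[0]

    # Average skeleton: an edge is kept when it appears in at least half of
    # the subsample skeletons (equal-proportion synthesis).
    votes = np.mean([sk.astype(float) for sk in sub_skeletons], axis=0)
    avg_skeleton = votes >= 0.5

    weights = np.empty(m)
    for i, sk in enumerate(sub_skeletons):
        agree = (sk == avg_skeleton)
        np.fill_diagonal(agree, False)             # ignore the diagonal
        agree_off = int(agree.sum())               # agreeing variable pairs
        disagree_off = n * (n - 1) - agree_off     # disagreeing variable pairs
        count = agree_off - disagree_off           # +1 / -1 bookkeeping
        weights[i] = count / (n * (n - 1))         # Eq. (1)

    return weights / weights.sum()                 # Eq. (2)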

3.2 Correlation Measures and Hypothesis Testing

The correlation coefficient is a statistical index reflecting the degree of correlation. Using different correlation coefficients has different effects on the skeleton discovery stage of causal construction. Here we take the rank correlation coefficient as an example to illustrate the principle. "Rank" can be understood as a kind of order or sort, so the coefficient is computed from the sorting positions of the original data. Consider the complete data set D = {x_1, x_2, ..., x_m}, where m represents the total number of samples and the corresponding variable set is V. The data set is sorted to obtain a new data set D' = {x'_1, x'_2, ..., x'_m}. For any two variables X_i, X_j ⊂ V, the Spearman correlation coefficient between X_i and X_j is calculated as follows:

r_{ij} = 1 - \frac{6 \sum_{k=1}^{m} d_{kij}^2}{m(m^2 - 1)}    (3)

where d_{ij} denotes the order difference between variables X_i and X_j, and d_{kij} represents the order difference between X_i and X_j in the k-th sample.

Theorem 1 ([13]). For data generated by an additive noise model, the disturbances conform to an arbitrary distribution and are not correlated with each other. Let V be the set of variables and m the number of samples, with m large enough. For any variables X_i and X_j, let r_{ij} be the Spearman correlation coefficient between X_i and X_j; then the test statistic t_0 follows Student's t distribution with m - 2 degrees of freedom.

t_0 = \frac{r_{ij}}{\sqrt{(1 - r_{ij}^2)/(m - 2)}}    (4)

We construct the T-statistic using the statistic t_0 and then introduce the hypothesis test from [17] to define the strong and weak correlations between nodes X_i and X_j in the causal network model.

Definition 1. Strong correlation: X_i and X_j are strongly correlated if and only if p-value(X_i, X_j) ≤ K_α.

Definition 2. Weak correlation: X_i and X_j are weakly correlated if and only if p-value(X_i, X_j) ≥ K_α.
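The strong/weak decision can be illustrated with a short Python sketch based on Eqs. (3) and (4); the use of scipy.stats.spearmanr for the rank correlation, the two-sided p-value, and the function name are our assumptions rather than the authors' implementation.

import numpy as np
from scipy import stats

def strong_correlation(x, y, k_alpha=0.005):
    # Decide strong vs. weak correlation between two variables (Defs. 1-2).
    # x, y: 1-D arrays with m samples of X_i and X_j.
    # k_alpha: hypothesis-testing threshold (0.005 is used in the paper).
    m = len(x)
    r, _ = stats.spearmanr(x, y)               # Spearman rank correlation, Eq. (3)
    t0 = r / np.sqrt((1.0 - r**2) / (m - 2))   # test statistic, Eq. (4)
    # t0 follows a Student's t distribution with m - 2 degrees of freedom
    p_value = 2.0 * stats.t.sf(abs(t0), df=m - 2)
    return p_value <= k_alpha, p_value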

3.3 Skeleton Discovery Algorithm Based on SRPM Method (SRPM-SK)

After data splitting, we find the candidate neighbor nodes for each node, in a manner similar to the constraint phase of the SC algorithm [18], the MMHC algorithm [19] and the L1MB algorithm [20]. This requires a "strong correlation" between two variables for them to be "adjacent" to each other in the network. The correlation between variables is measured by the correlation coefficient. The algorithmic framework of this stage is shown in Fig. 1.

Fig. 1. SRPM-SK algorithm framework.

The main task of the first phase of the SRPM-SK algorithm is to reduce the complexity of the causal network by segmenting the data and distinguishing strong and weak correlations. The input of the algorithm is the data set D = {x_1, x_2, ..., x_n}, the hypothesis-testing threshold K_α and the skeleton discovery threshold rate; the total number of samples is n, and the output is the skeleton SK of the causal network. We fix the hypothesis-testing threshold at 0.005 and the rate value at 0.5. The specific implementation process is as follows:

Step 1. The original data set D is divided into m parts. Each subsample data set D_k contains all variables in D.
Step 2. Initialize all neighbor nodes of X_i as null.
Step 3. Initialize the adjacency matrix SK_k of each subsample skeleton and the adjacency matrix SK of the final combined skeleton. All elements of both matrices are 0.
Step 4. Take the data set D_k as input, obtain the coefficient matrix r by correlation coefficient calculation, perform hypothesis testing, calculate the p-values, distinguish strong and weak correlation by Definition 1 and Definition 2, and update the causal network skeleton and the set of neighbor nodes of the current variable.
Step 5. Calculate the weight of each subsample skeleton SK_k according to the SRPM method introduced in Sect. 3.1.
Step 6. Construct the final skeleton according to the weights (a sketch of this fusion step follows the list). Specifically, determine whether there is an edge between two variables from the value at the corresponding position in the adjacency matrix of each subsample skeleton, compute the probability of having an edge as a flag value calculated from the weights, and compare the flag with the skeleton discovery threshold rate; in general, we consider that as long as the probability of having an edge is more than half, an edge between these two variables is added to the final skeleton.
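The fusion in Step 6 can be sketched as follows in Python; the exact form of the flag value and the comparison against the threshold rate are our interpretation of the description above, and the function name is hypothetical.

import numpy as np

def combine_skeletons(sub_skeletons, weights, rate=0.5):
    # Step 6 of SRPM-SK: weighted fusion of the subsample skeletons.
    # sub_skeletons: list of m boolean (n x n) adjacency matrices SK_k.
    # weights: m weights W_k from the reward-and-punishment mechanism.
    # rate: skeleton-discovery threshold (0.5 in the paper).
    weights = np.asarray(weights, dtype=float)
    # flag(i, j): total weight of the subsamples containing edge (i, j)
    flag = sum(w * sk.astype(float) for w, sk in zip(weights, sub_skeletons))
    # keep an edge when its weighted vote exceeds the threshold
    final_skeleton = flag > rate * weights.sum()
    np.fill_diagonal(final_skeleton, False)
    return final_skeleton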

4 Experimental Results and Analysis

This section uses the SRPM-SK algorithm to discover the causal skeleton and then learn the causal structure. We compare it with the three original algorithms on simulated data from low- and high-dimensional networks and on real data. On low-dimensional networks, the algorithms based on our proposed SRPM method perform better than the original algorithms in terms of both accuracy and time performance; however, due to space limitations, this paper does not show the experimental results on low-dimensional networks.

4.1 Benchmark Network and Data Sets

For the simulation of low-dimensional large-sample data, the benchmark networks we used are sourced from the Bayesian Network Repository (BNR). These networks are obtained from real decision systems in various fields (medical, biological, financial, etc.) and can be used to evaluate the accuracy of structural learning algorithms. For the simulation of high-dimensional large-sample data, the benchmark networks we used are high-dimensional networks synthesized from 10 copies of the same low-dimensional network (excluding the gene network). Table 1 shows the basic information of the selected benchmark networks.

Table 1. Network information.

Network       Number of nodes  Number of edges
alarm10       370              570
Child10       200              257
gene          801              972
hailfinder10  560              1017
Insurance10   270              556

The data sets in the experiments were generated by ANMs (additive noise models). We used four different nonlinear functions to generate the simulated data, as follows:

1. x_i = w_{X_i}^T sin(pa(X_i)) + rand(0, 1)
2. x_i = w_{X_i}^T e^{-pa(X_i)} + rand(0, 1)
3. x_i = w_{X_i}^T cos(pa(X_i)) + rand(0, 1)
4. x_i = w_{X_i}^T (cos(pa(X_i)) + e^{-pa(X_i)} + sin(pa(X_i)) + pa(X_i)^2 + pa(X_i)) + rand(0, 1)

where w_{X_i}^T is randomly generated, i.e. w_{X_i}^T = rand(0, 1), and pa(X_i) denotes the parent nodes of X_i. Each x_i is obtained by superimposing multiple parent nodes through their respective functional relationships. The values of node variables without parents are generated by the distribution function rand(0, 1). The values of the other node variables are generated by the formulas specified by the ANMs, i.e., linear combinations of the values of each parent node variable after the nonlinear transformation, with weights uniformly distributed in the range (0, 1).
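For illustration, the following NumPy sketch generates simulated data from the four additive noise models above; the dictionary-based representation of the benchmark DAG, the assumption that nodes are supplied in topological order, and the function name are ours.

import numpy as np

def simulate_anm(dag_parents, n_samples, model=1, rng=None):
    # Generate large-sample data from one of the four ANMs listed above.
    # dag_parents: dict mapping node index -> list of parent indices, given in
    # topological order of the benchmark network. model: 1-4.
    rng = np.random.default_rng(rng)
    funcs = {
        1: np.sin,
        2: lambda z: np.exp(-z),
        3: np.cos,
        4: lambda z: np.cos(z) + np.exp(-z) + np.sin(z) + z**2 + z,
    }
    f = funcs[model]
    X = np.empty((n_samples, len(dag_parents)))
    for i, parents in dag_parents.items():
        noise = rng.uniform(0.0, 1.0, n_samples)        # rand(0, 1) term
        if not parents:
            X[:, i] = noise                             # root nodes: noise only
        else:
            w = rng.uniform(0.0, 1.0, len(parents))     # weights in (0, 1)
            X[:, i] = f(X[:, parents]) @ w + noise      # w^T f(pa(X_i)) + noise
    return X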

4.2 High Dimensional Network Analysis

The data faced by society today are characterized not only by large numbers of samples but also by high dimensionality. We therefore designed the following experiments: the four structural equation models are used to generate experimental data for five high-dimensional networks, with sample sizes set to 100000, 200000, 500000 and 1000000, respectively. Each algorithm based on the SRPM method is compared with its corresponding original algorithm to evaluate the performance of the SRPM method on high-dimensional large sample data. Due to limited space, only some experimental results are shown in this paper.

Figures 2 and 3 show the number of structural errors produced by the algorithms using the SRPM method and their corresponding original algorithms, relative to the benchmark networks, for ANM2 and ANM4 respectively. All the dotted lines in the figures are above the corresponding solid lines. Visually, there is a large gap between the two compared algorithms; that is, the algorithms based on the SRPM method are much better than their original counterparts in terms of the accuracy of the learned network. This is particularly prominent for the PRCS and SCB algorithms, where the number of structural errors can be reduced to as little as 1/8 of the original.

Fig. 2. For ANM2, the number of structural errors in the high-dimensional network.

Fig. 3. For ANM4, the number of structural errors in the high-dimensional network.

The original algorithms obtain the interrelationships between variables through a large number of calculations. As the number of samples increases, this may create a certain direct or indirect effect between two variables that initially had no connection and lead to redundant edges, which manifests as an increase in structural errors. However, the algorithms based on the SRPM method keep the correlation calculations within an appropriate range through data division, which greatly reduces the complexity of the independence tests and the incidence of false positives. In addition, the reward and punishment mechanism we define assigns different weights to each subsample and decides whether there is a relationship between variables according to the sample weights, which mitigates the phenomenon of excess edges in the learned network. Moreover, the RCS algorithm only considers the simple correlation between two variables, while the PRCS and SCB algorithms consider partial correlation. To some extent there is a correlation between the variables, but this correlation is influenced by other related variables; the simple correlation approach does not account for this influence and therefore cannot show the true correlation between the two variables. The true correlation can only be obtained after removing the effects of other relevant variables, and partial correlation can remove this influence. Therefore, on the large sample data of high-dimensional networks, the accuracy advantage of the SRPM-PRCS and SRPM-SCB algorithms over the original algorithms is even more prominent.

Table 2. Execution times under high-dimensional networks (one decimal place reserved, unit: s).

models  networks      algorithms        datasets size
                                        100000        200000         500000          1000000
ANM     alarm10       RCS/SRPM-RCS      25.8/9.6      61.9/25.1      167.6/58.9      821.6/120.7
                      PRCS/SRPM-PRCS    16.1/29.6     32.2/41.2      88.6/152.3      225.4/302.7
                      SCB/SRPM-SCB      69.1/163.3    472.5/282.0    1377.5/1155.0   3931.6/3990.0
        Child10       RCS/SRPM-RCS      12.1/8.4      30.1/21.4      64.1/49.5       317.5/100.3
                      PRCS/SRPM-PRCS    12.6/12.7     26.8/22.2      66.0/65.0       208.3/151.5
                      SCB/SRPM-SCB      22.9/49.6     152.0/92.4     442.1/369.9     1326.9/1283.6
        gene          RCS/SRPM-RCS      72.8/31.7     170.9/63.1     541.7/152.4     3261.8/337.3
                      PRCS/SRPM-PRCS    30.7/109.9    65.7/122.9     158.8/536.2     460.4/1067.6
                      SCB/SRPM-SCB      309.0/792.4   2146.6/1262.8  6219.6/5193.6   18079.0/18272.0
        hailfinder10  RCS/SRPM-RCS      70.2/24.9     149.7/54.7     569.5/158.9     2334.4/269.4
                      PRCS/SRPM-PRCS    22.0/55.9     52.2/67.0      135.4/290.9     374.5/563.6
                      SCB/SRPM-SCB      152.6/380.5   1067.0/617.4   3055.7/2564.0   8948.9/8761.3
        Insurance10   RCS/SRPM-RCS      26.1/9.4      60.0/23.4      143.6/49.5      710.1/110.9
                      PRCS/SRPM-PRCS    14.6/19.3     28.5/28.1      85.8/104.4      219.6/223.0
                      SCB/SRPM-SCB      39.4/92.7     265.4/149.4    750.2/645.4     2178.5/2236.5
ANM     alarm10       RCS/SRPM-RCS      37.4/13.4     71.4/29.9      275.5/72.3      1137.1/164.0
                      PRCS/SRPM-PRCS    17.1/30.5     32.8/40.7      98.5/159.6      228.2/321.1
                      SCB/SRPM-SCB      68.8/168.1    471.1/275.7    1369.5/1179.8   2661.8/2758.1
        Child10       RCS/SRPM-RCS      12.5/7.9      30.9/20.4      65.5/49.6       387.4/100.9
                      PRCS/SRPM-PRCS    14.8/12.9     36.4/21.9      98.2/67.7       267.7/185.4
                      SCB/SRPM-SCB      24.4/52.4     154.3/90.6     450.9/371.2     880.5/867.9
        gene          RCS/SRPM-RCS      150.6/50.7    265.0/100.1    741.6/224.4     3705.0/544.2
                      PRCS/SRPM-PRCS    38.0/110.0    78.7/125.1     194.5/540.4     590.0/1052.0
                      SCB/SRPM-SCB      308.7/762.0   2119.0/1215.0  5562.2/4648.0   13353.0/11922.0
        hailfinder10  RCS/SRPM-RCS      102.0/35.3    272.3/95.1     1359.6/274.4    3877.3/1008.9
                      PRCS/SRPM-PRCS    24.9/56.3     65.2/67.1      175.0/281.6     1349.8/6476.3
                      SCB/SRPM-SCB      154.7/370.8   1038.1/613.3   2836.5/2293.3   7332.2/10760.0
        Insurance10   RCS/SRPM-RCS      33.5/13.2     103.8/45.6     391.3/96.1      1132.8/173.9
                      PRCS/SRPM-PRCS    15.5/19.5     31.0/29.2      90.5/110.0      226.5/233.1
                      SCB/SRPM-SCB      39.8/90.4     267.0/160.1    673.3/579.7     1816.7/1567.6


Table 2 shows the execution time of each algorithm under each model on the high-dimensional networks; in each original/SRPM pair, the smaller value indicates the faster algorithm. As the table shows, the execution time of the SRPM-RCS algorithm is always better than that of the RCS algorithm. This is determined by the way the original algorithm computes correlation coefficients: the RCS algorithm calls the Spearman correlation coefficient to directly compute and return the correlation between two variable nodes. The time performance of SRPM-RCS is therefore consistent with its performance on low-dimensional networks: because of data segmentation, each part is processed faster, and the larger the number of samples, the greater its advantage over the original algorithm.

The SRPM-SCB algorithm has a shorter execution time than the SCB algorithm for sample sizes of 200000 and 500000, with a maximum difference of nearly 1000 s, but does not perform as well as the SCB algorithm for sample sizes of 100000 and 1000000. This is because the SCB algorithm processes the data by looping over the number of samples and finally relies on matrix multiplication to obtain its spatial coordinate (SC) coefficients. The SRPM-SCB algorithm reduces the number of loop iterations on the one hand and increases the number of high-dimensional matrix multiplications on the other, and its time performance depends on the trade-off between these two counts. According to the experimental results, it is reasonable to believe that at 200000 and 500000 samples the reduced number of loop iterations has a greater impact on the time performance of SRPM-SCB than the increased number of matrix multiplications; at 100000 samples the opposite is true; and at 1000000 samples the two effects are similar, so whether the time performance improves in a specific case depends on the ANM used.

The execution time of the SRPM-PRCS algorithm is almost always inferior to that of the PRCS algorithm. This is because the PRCS algorithm computes the conditional correlation coefficients of the entire matrix; on high-dimensional networks, the algorithm is challenged by the dimensionality of the matrix. Therefore, apart from the Child10 network (the network with the smallest dimension among the five high-dimensional benchmark networks we used; see the network information in Table 1), where the execution time is improved compared with the original algorithm, the SRPM-based counterpart is not outstanding in time performance on the other networks.

In summary, the original algorithm has a certain impact on the execution time of the corresponding SRPM-based algorithm, but the difference in time is generally slight. Compared with the improvement in the number of structural errors, the increase in execution time is entirely acceptable. Through the above experimental verification, we can conclude that the SRPM method suits the complex situation of high-dimensional, large-sample networks with nonlinear data and achieves good results on various indexes.

4.3 Real Data Analysis

The real data in the experiment come from the sensor data recorded by a power plant in 2019. The data record the operation status of 412 measurement points between 2019-1-22 and 2019-3-19; the equipment collected data every 30 s, generating 60,480 records per week on average. Data from such a real environment are high-dimensional and very large. We therefore construct the causal network structure diagram of these data using the SRPM method (as shown in Fig. 4; owing to the large number of measurement points, only part of the causality diagram between measurement points is shown) to find the potential causal relationships between the measurement points, and then train a prediction model for the target measurement points to verify the accuracy and validity of the causal analysis. The experiment uses the long short-term memory network (LSTM) as the prediction model, with the data from January 24 to January 30 as the training data and the data from January 31 as the test data. As shown in Fig. 5, the model prediction trend plots for measurement points 2, 9, 14 and 21 are given; in each case, subfigure (a) shows the prediction trend when all measurement points are used without causality-based feature screening, and subfigure (b) shows the prediction trend after feature selection using the causality determined by the SRPM method.
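The prediction setup can be illustrated with a minimal PyTorch sketch; the paper does not report the LSTM architecture or hyperparameters, so the hidden size, optimizer, window construction and all names below are illustrative assumptions only.

import torch
import torch.nn as nn

class TrendLSTM(nn.Module):
    # Minimal LSTM regressor for one target measurement point.
    def __init__(self, n_features, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):              # x: (batch, time, n_features)
        out, _ = self.lstm(x)
        return self.head(out[:, -1])   # predict the next value of the target

# n_features is either all 412 measurement points (no screening) or only the
# points selected from the SRPM causal graph for the target point.
def train(model, loader, epochs=20, lr=1e-3):
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        for xb, yb in loader:          # sliding windows and next-step targets
            opt.zero_grad()
            loss = loss_fn(model(xb), yb)
            loss.backward()
            opt.step()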

Fig. 4. Causality diagram between some measurement points.

Fig. 5. Experimental results of prediction of some measurement points in power plants.

Figure 5 shows that there is a complex nonlinear functional relationship among the raw data of the power plant measurement points. If all the measurement points are used to predict the trend of the current measurement point without feature selection, the results differ somewhat from the actual data, i.e., they are disturbed by many irrelevant measurement point data. If, instead, the causal network constructed by the SRPM method is taken as the basis for feature selection and predictions are made from the selected relevant measurement points, the predicted trends match the real trend data of the current measurement points. The model prediction results for the above four measurement points (2, 9, 14, 21) show that the SRPM method can effectively remove the influence of irrelevant measurement points on the observed measurement points and has good causality detection performance for realistic nonlinear large sample data.

5 Conclusion and Outlook

Facing the challenges of high-dimensional and nonlinear causal discovery on large sample datasets, we propose the SRPM method for complex nonlinearities by studying causal structure learning on additive noise network models. Extensive experiments and analysis of the results show that the SRPM method proposed in this paper can deal with the problem of learning causal network structures in a variety of nonlinear environments in a general and effective way, can cope with large sample data sets efficiently, and alleviates the difficulties of high time complexity and the curse of dimensionality brought about by high dimensionality. In addition to its good performance on simulated data, the algorithm also performs well on real recorded power plant data, where it is a useful aid for fault analysis and early warning in power plants and has broad application prospects.

References

1. Vowels, M.J., Camgoz, N.C., Bowden, R.: D'ya like DAGs? A survey on structure learning and causal discovery. ACM Comput. Surv. 55(4), 1–36 (2022)
2. Shen, X., Ma, S., Vemuri, P., Simon, G.: Challenges and opportunities with causal discovery algorithms: application to Alzheimer's pathophysiology. Sci. Rep. 10(1), 2975 (2020)
3. Zhu, J.Y., Sun, C., Li, V.O.: An extended spatio-temporal Granger causality model for air quality estimation with heterogeneous urban big data. IEEE Trans. Big Data 3(3), 307–319 (2017)
4. Assaad, C.K., Devijver, E., Gaussier, E.: Survey and evaluation of causal discovery methods for time series. J. Artif. Intell. Res. 73, 767–819 (2022)
5. Wermuth, N., Lauritzen, S.L.: R: Graphical and Recursive Models for Contingency Tables. Eksp. Aalborg Centerboghandel (1982)
6. Cooper, G.F., Herskovits, E.: A Bayesian method for constructing Bayesian belief networks from databases. In: Uncertainty Proceedings 1991, pp. 86–94. Elsevier (1991)
7. Gretton, A., Herbrich, R., Smola, A.J.: The kernel mutual information. In: 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2003), vol. 4, pp. IV-880. IEEE (2003)
8. Duggento, A., Guerrisi, M., Toschi, N.: Recurrent neural networks for reconstructing complex directed brain connectivity. In: 2019 41st Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), pp. 6418–6421. IEEE (2019)
9. Liu, F., Chan, L.-W.: Causal inference on multidimensional data using free probability theory. IEEE Trans. Neural Netw. Learn. Syst. 29(7), 3188–3198 (2017)
10. Zeng, Y., Hao, Z., Cai, R., Xie, F., Huang, L., Shimizu, S.: Nonlinear causal discovery for high-dimensional deterministic data. IEEE Trans. Neural Netw. Learn. Syst. (2021)


11. Freund, Y., Schapire, R.E.: A decision-theoretic generalization of on-line learning and an application to boosting. J. Comput. Syst. Sci. 55(1), 119–139 (1997)
12. Yang, J., An, N., Alterovitz, G.: A partial correlation statistic structure learning algorithm under linear structural equation models. IEEE Trans. Knowl. Data Eng. 28(10), 2552–2565 (2016)
13. Yang, J., Fan, G., Xie, K., Chen, Q., Wang, A.: Additive noise model structure learning based on rank correlation. Inf. Sci. 571, 499–526 (2021)
14. Yang, J., Jiang, L., Xie, K., Chen, Q., Wang, A.: Causal structure learning algorithm based on partial rank correlation under additive noise model. Appl. Artif. Intell. 36(1), 2023390 (2022)
15. Rissanen, J.: Modeling by shortest data description. Automatica 14(5), 465–471 (1978)
16. Teyssier, M., Koller, D.: Ordering-based search: a simple and effective algorithm for learning Bayesian networks. arXiv preprint arXiv:1207.1429 (2012)
17. Zar, J.H.: Significance testing of the Spearman rank correlation coefficient. J. Am. Stat. Assoc. 67(339), 578–580 (1972)
18. Friedman, N., Nachman, I., Pe'er, D.: Learning Bayesian network structure from massive datasets: the "sparse candidate" algorithm. arXiv preprint arXiv:1301.6696 (2013)
19. Tsamardinos, I., Brown, L.E., Aliferis, C.F.: The max-min hill-climbing Bayesian network structure learning algorithm. Mach. Learn. 65, 31–78 (2006)
20. Schmidt, M., Niculescu-Mizil, A., Murphy, K., et al.: Learning graphical model structure using L1-regularization paths. AAAI 7, 1278–1283 (2007)

Local Neighbor Propagation Embedding

Wenduo Ma1, Hengzhi Yu1, Shenglan Liu2(B), Yang Yu3, and Yunheng Li1

1 Faculty of Electronic Information and Electrical Engineering, Dalian University of Technology, Dalian 116024, Liaoning, China
2 School of Innovation and Entrepreneurship, Dalian University of Technology, Dalian 116024, Liaoning, China
[email protected]
3 The Alibaba Group, Hangzhou 310051, Zhejiang, China

Abstract. Manifold learning occupies a vital role in the field of nonlinear dimensionality reduction and serves many related methods. Typical linear embedding methods have achieved remarkable results. However, these methods perform poorly on sparse data. To address this issue, we introduce neighbor propagation into Locally Linear Embedding (LLE) and propose a new method named Local Neighbor Propagation Embedding (LNPE). LNPE enhances the local connections between neighborhoods by extending 1-hop neighbors into n-hop neighbors. The experimental results show that LNPE can obtain faithful embeddings with better topological and geometrical properties.

Keywords: Local Neighbor Propagation · Manifold Learning · Geometrical Properties · Topological

1 Introduction

Over the past few decades, manifold learning has attracted extensive attention and has been applied in video prediction [2], video content identification [14], spectral reconstruction [10], image super-resolution [3], etc. Manifold learning research can be divided into local methods such as Locally Linear Embedding (LLE) [18] and Local Tangent Space Alignment (LTSA) [25], and global methods such as Isometric Mapping (ISOMAP) [20] and Maximum Variance Unfolding (MVU) [22]. Besides, compared with nonlinear manifold learning methods, some linear embedding methods are more effective for practical applications, such as Locality Preserving Projection (LPP) [7], Neighborhood Preserving Embedding (NPE) [6], and Neighborhood Preserving Projection (NPP) [15].

Locally linear methods in manifold learning such as LLE have been widely studied and applied. Based on LLE, many improved methods have been proposed, such as Hessian Locally Linear Embedding (HLLE) [4], Modified Locally Linear Embedding (MLLE) [24] and Improved Locally Linear Embedding (ILLE) [23]. Among the LLE-based methods, HLLE has high computational complexity and MLLE does not work on weakly connected data. In fact, the computational complexity and robustness of algorithms are issues to be considered for many LLE-based improved methods. Inspired by Graph Convolutional Networks (GCNs), we propose a simple and unambiguous LLE-based improved method, named Local Neighbor Propagation Embedding (LNPE). In contrast to previous approaches, LNPE extends the neighborhood size through neighbor propagation layer by layer. This architecture enhances the topological connections within each neighborhood. From the view of global mapping, neighborhood interactions increase with neighbor propagation to improve the global geometrical properties. Experimental results verify the effectiveness and robustness of LNPE.

The remainder of this paper is organized as follows. We first introduce related work in Sect. 2. Section 3 presents the main body of this paper, which includes the motivation of LNPE, the mathematical background, the framework of LNPE and the analysis of computational complexity. The experimental results are presented in Sect. 4 to verify the effectiveness and robustness of LNPE, and Sect. 5 summarizes our work.

2 Related Work

The essence of manifold learning is how to maintain, across two different spaces, the relationships between samples that correspond to the intrinsic structure. Researchers have done a great deal of work to measure this relationship from different aspects. PCA maximizes the global variance to reduce dimensionality, while Multidimensional Scaling (MDS) [9] makes the low-dimensional distances between samples consistent with the high-dimensional data. Based on MDS, ISOMAP utilizes the shortest-path algorithm to realize a global mapping. Besides, MVU realizes manifold unfolding through positive semi-definiteness and kernel techniques, and RML [11] obtains the intrinsic structure of the manifold with Riemannian methods instead of Euclidean distance. Compared with ISOMAP, LLE represents local manifold learning, which obtains neighbor weights by locally linear reconstruction. Furthermore, Locally Linear Coordination (LLC) [17] constructs a local model and makes a global alignment, and contributes to both LLE and LTSA. LTSA describes local curvature through the tangent space of the samples and takes the curvature as the weight of the tangent space to realize global alignment.

Another type of manifold structure preservation is graph-based embedding. Classical methods such as LE [1] and its linear version LPP are still representative. More related methods include NPE, Orthogonal Neighborhood Preserving Projections (ONPP) [8], etc. Graph-based methods have had a far-reaching influence on machine learning and related fields. For instance, the graph is introduced into semi-supervised learning in LE, and LLE is also a kind of neighbor graph. Moreover, l1-graph-based methods such as Sparsity Preserving Projections (SPP) [16] and its supervised extension [5] have been proposed with the wide application of sparse methods.

3 Local Neighbor Propagation Embedding

In this section, the idea of neighbor propagation is introduced step by step. First, we present the motivation for LNPE and describe its origin. Then the basic algorithmic prototype is briefly reviewed. Finally, the LNPE framework is presented.

3.1 Motivation

LLE, a classical method, is representative of methods based on local information in manifold learning. For simple data distributions, LLE tends to produce satisfactory results, but once the data distribution becomes sparse, it is difficult for LLE to maintain topological and geometrical properties. The reasons are as follows:

1. The neighborhood size is hard to determine for sparse data. Inappropriate neighbors will be selected when the neighborhood size is larger.
2. LLE focuses on each single neighborhood but is weak in the interaction between different neighborhoods. Thus, it is difficult to obtain the ideal effect in geometrical structure preservation.

There is a natural contradiction between Item 1 and Item 2. More specifically, the interaction of neighborhood information is inevitably weakened by small neighborhoods. To improve the capability and robustness of LLE, we introduce neighbor propagation into LLE and propose Local Neighbor Propagation Embedding (LNPE), inspired by GCNs. The 1-hop neighborhoods are extended to n-hop neighborhoods through neighbor propagation, which enhances the connections between points in different neighborhoods. In this way, LNPE avoids short circuits (Item 1) by setting a small neighborhood size k. Meanwhile, i-hop neighborhoods expand the neighborhood size to depict local parts adequately by enhancing the topological connections. Furthermore, correlations between different neighborhoods (Item 2) are generated during neighbor propagation to produce more overlapping information, which is conducive to preserving the geometrical structure.

3.2 Mathematical Background

Based on LLE, LNPE introduces neighbor propagation to improve its applicability. Before presenting the LNPE framework, we first review the original LLE. Suppose X = {x_1, ..., x_n} ⊂ R^D is a high-dimensional dataset that lies on a smooth D-dimensional manifold. LLE embeds the intrinsic manifold from the high-dimensional space into a lower-dimensional subspace while preserving geometrical and topological structures. Based on the assumption of local linearity, LLE first reconstructs each high-dimensional data point x_i through a linear combination within its neighborhood N_i, where N_i indicates the k-nearest neighbors of x_i and N_i = {x_{i1}, ..., x_{ik}}. Then the reconstruction weight matrix W in high-dimensional space can be determined by minimizing the total reconstruction error ε_1 of all data points:

\varepsilon_1(W) = \|XW - X\|_F^2    (1)

where W = [\tilde{w}_1, \tilde{w}_2, ..., \tilde{w}_n] ∈ R^{n×n}, the i-th column vector \tilde{w}_i indicates the reconstruction weights of data point x_i, and \|·\|_F is the Frobenius norm. To remove the influence of transformations including translation, scaling, and rotation, a sum-to-one constraint \tilde{w}_i^T 1 = 1 is enforced for each neighborhood.

Let Y = {y_1, ..., y_n} ⊂ R^d be the corresponding dataset in low dimensions. The purpose of LLE is to preserve the same local structures reconstructed in the high-dimensional space. In the low-dimensional space, LLE therefore utilizes the same weights W to reproduce the local properties. The objective is to minimize the total cost function

\varepsilon_2(Y) = \|YW - Y\|_F^2    (2)

under the constraint YY^T = I. Thus, the high-dimensional coordinates are finally mapped into the lower-dimensional observation space.
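For reference, a minimal NumPy sketch of the constrained local least-squares solve behind Eq. (1) is given below (one LLE pass). The row-sample convention (rows of X are samples, unlike the column convention above), the regularization constant reg and the function name are our own choices.

import numpy as np

def lle_weights(X, neighbors, reg=1e-3):
    # Solve the sum-to-one constrained local least squares of Eq. (1).
    # X: (n, D) data matrix; neighbors: (n, k) indices of the k-nearest
    # neighbors of each point. Returns the (n, n) weight matrix W,
    # whose i-th column holds the reconstruction weights of x_i.
    n = X.shape[0]
    k = neighbors.shape[1]
    W = np.zeros((n, n))
    for i in range(n):
        Z = X[neighbors[i]] - X[i]                   # neighbors centred on x_i
        G = Z @ Z.T                                  # local Gram matrix (k x k)
        G += reg * (np.trace(G) + 1e-12) * np.eye(k) # regularisation for stability
        w = np.linalg.solve(G, np.ones(k))
        W[neighbors[i], i] = w / w.sum()             # enforce sum-to-one constraint
    return W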

3.3 Local Neighbor Propagation Framework

A faithful embedding is rooted in sufficient within-neighborhood and between-neighborhood information about the local and global structure. For LLE, the interactive relationship within a single neighborhood is not enough to reproduce the detailed data distribution, especially when the neighborhood size k is not large enough. Neighbor propagation is therefore introduced into LLE to intensify the topological connections within neighborhoods and the interactions between neighborhoods.

High-Dimensional Reconstruction. Based on the single reconstruction in each neighborhood, LNPE propagates neighborhoods and determines the propagating weight matrices. As before, we define the high-dimensional and low-dimensional data as X = {x_1, ..., x_n} ⊂ R^D and Y = {y_1, ..., y_n} ⊂ R^d, respectively. Suppose that we have finished the single reconstruction with LLE (Eq. 1) and obtained the weight matrix W_1; the first reconstructed data X^(1) = XW_1 are obtained from this first reconstruction. LNPE reuses the previously reconstructed data, such as X^(1), to reconstruct the original data points in X again. The first neighbor propagation thus utilizes X^(1) to reconstruct the original data X through

\varepsilon(W_2) = \|XW_1W_2 - X\|_F^2    (3)

where W_2 corresponds to the weight matrix in the first neighbor propagation. Thus, we can simply combine the weight matrices W_1 and W_2 as W = W_1W_2. Compared with W_1 in LLE, W extends the 1-hop neighbors to 2-hop neighbors, which expands the neighborhood size through neighbor propagation. One of the most important advantages is that neighbor propagation can enhance topological connections while avoiding short circuits. Besides, it is worth noting that multi-hop neighbors hold lower weights in the process of propagation. After i - 1 propagations, each data point to be reconstructed establishes relations with its i-hop neighbors. The (i - 1)-th neighbor propagation can be formulated as

\varepsilon(W_i) = \|XW_1 \cdots W_{i-1}W_i - X\|_F^2    (4)

where W_i indicates the weight matrix in the (i - 1)-th neighbor propagation. From the perspective of weight solution and optimization, the i-th neighbor propagation depends on all of the first i - 1 weight matrices, and each weight matrix in the propagation must be determined in turn.

Global Low-Dimensional Mapping. Similar to LLE, the low-dimensional embedding in LNPE reproduces the high-dimensional properties determined in the reconstruction. LNPE aims to preserve all the topological connections from 1-hop neighborhoods to n-hop neighborhoods with n weight matrices. We define the matrix product P_i = W_1W_2 \cdots W_i as the product of the weight matrices in the first i - 1 neighbor propagations. The low-dimensional total optimization function can then be expressed as

\varepsilon(Y) = \sum_{i=1}^{t+1} \|YP_i - Y\|_F^2    (5)

where the parameter t + 1 denotes the total number of hops in the high-dimensional reconstruction with t neighbor propagations. According to the properties of the F-norm, the low-dimensional objective can be written as

\varepsilon(Y) = Y \left( \sum_{i=1}^{t+1} (P_i - I)(P_i - I)^T \right) Y^T    (6)

where I ∈ R^{n×n} indicates the identity matrix. Under the constraint YY^T = I, the low-dimensional coordinates can easily be obtained by decomposing the target matrix. The detailed algorithm is shown as Algorithm 1. Equations 5 and 6 show that LNPE aims to preserve all the learned properties of the high-dimensional reconstruction. Specifically, for the sequence i = 1, 2, ..., t, a smaller i aims at maintaining the topological connections within each neighborhood, while a larger i pays more attention to expanding the neighborhood interactions between different neighborhoods.

3.4 Computational Complexity

The computational complexity of LNPE follows that of LLE. Calculating the k nearest neighbors scales as O(Dn^2); for some special data distributions, this can be reduced to O(n log n) with K-D trees [19]. Computing the weight matrices in the t + 1 reconstructions scales as O((t + 1)nk^3). Besides, computing the matrices P and M scales as O(p_1 n^3) with sparse matrices, and the final calculation of the eigenvectors of a sparse matrix has computational complexity O(p_2 n^2), where p_1 and p_2 are parameters related to the ratio of nonzero elements in the sparse matrix [21].

Algorithm 1. LNPE Algorithm
Require: high-dimensional data X ⊂ R^D; neighborhood size k; target dimensionality d; the number of neighbor propagations t.
Ensure: low-dimensional coordinates Y ⊂ R^d.
1: Find the k-nearest neighbors N_i = {x_{i1}, ..., x_{ik}} for each data point.
2: Compute the weight matrix W_1 with LLE.
3: Initialize a zero matrix M.
4: for e = 1 : t do
5:   Compute the matrix product P_e = W_1W_2 \cdots W_e
6:   Compute M = M + (P_e - I)(P_e - I)^T
7:   Compute the reconstructed data X^(e) = XP_e
8:   for each sample x_i, i = 1, ..., n do
9:     Compute \tilde{w}_{i(e+1)} in W_{e+1} with X and X^(e) by minimizing \|x_i - X^(e)\tilde{w}_{i(e+1)}\|_2^2
10:  end for
11: end for
12: Compute M = M + (P_{t+1} - I)(P_{t+1} - I)^T
13: Solve Y = arg min_Y Tr(YMY^T)
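A compact Python sketch of Algorithm 1 is given below. It follows the row-sample convention, uses the same kind of local weight solve as in the previous sketch, and all names and the regularization constant are our own; it is an illustrative reading of the algorithm, not the authors' implementation.

import numpy as np

def local_weights(basis, target, neighbors, reg=1e-3):
    # Reconstruct each target point from its neighbors taken in `basis`.
    # basis, target: (n, D) arrays; neighbors: (n, k) neighbor indices.
    n, k = neighbors.shape
    W = np.zeros((n, n))
    for i in range(n):
        Z = basis[neighbors[i]] - target[i]           # shift neighbors to target point
        G = Z @ Z.T
        G += reg * (np.trace(G) + 1e-12) * np.eye(k)
        w = np.linalg.solve(G, np.ones(k))
        W[neighbors[i], i] = w / w.sum()              # sum-to-one constraint
    return W

def lnpe(X, neighbors, d, t, reg=1e-3):
    # Sketch of Algorithm 1 (rows of X are samples).
    n = X.shape[0]
    I = np.eye(n)
    M = np.zeros((n, n))
    P = local_weights(X, X, neighbors, reg)           # P_1 = W_1 (plain LLE weights)
    for _ in range(t):
        M += (P - I) @ (P - I).T                      # accumulate (P_e - I)(P_e - I)^T
        X_rec = P.T @ X                               # rows of X^(e) = (X P_e)^T
        W_next = local_weights(X_rec, X, neighbors, reg)
        P = P @ W_next                                # P_{e+1} = P_e W_{e+1}
    M += (P - I) @ (P - I).T                          # last term, i = t + 1
    vals, vecs = np.linalg.eigh(M)                    # smallest eigenvectors of M
    return vecs[:, 1:d + 1]                           # drop the constant eigenvector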

4 Experimental Results

In this section, we verify LNPE on both synthetic and real-world datasets. To conduct a fair comparison, we set the same configuration for the comparison methods and LNPE.

4.1 Synthetic Datasets

On sparse data, curvature-based dimensionality reduction methods such as HLLE [4] and LTSA [25] suffer performance degradation. This is because the instability of the local relationships among sparse data increases, resulting in less smoothness of the local curvature. As shown in Fig. 1, HLLE [4] is unable to accomplish dimensionality reduction on the S-curve and the Swiss, while the dimensionality reduction performance of LTSA [25] is unstable. When the neighborhood size k is increased, the performance of LTSA improves. However, obtaining better dimensionality reduction performance by increasing the neighborhood size k is not a good tendency, because too large a k will lead to short circuits. For LLC [17], each point is represented as a linear combination of other points, where the weights used are controlled by a local constraint matrix. When the data become sparse, the constraint matrix is easily disturbed by noise and outliers, which leads to unstable and inaccurate weights. Unlike these methods, LNPE can better capture the relationships among the data through local neighbor propagation. This is because LNPE can not only better model the linear relationships within each neighborhood, but also strengthen the relationships between the center point and points outside the neighborhood. LNPE can thus learn the topological and geometrical structure of the data and embed it in the low-dimensional space faithfully.

Fig. 1. The S-Curve with 500 samples and its embedding results and the Swiss with 500 samples and its embedding results.

To further verify that LNPE has outstanding dimensionality reduction performance on data with complex topology, we select the Swiss-Hole and the Changing-Swiss for our experiments, as shown in Fig. 2. For complex data, the relationship between local points and global neighbors is not easy to maintain. With LLE, the connections of a single neighborhood are not sufficient to fully describe the original data distribution; in particular, LLE performs poorly when the neighborhood size k is not large enough, and too large a k causes short circuits. HLLE [4] tends to depend on the neighborhood size k on the Swiss Hole and cannot accomplish dimensionality reduction on the Changing-Swiss. Although LTSA [25] performs well on the Swiss Hole, it does not achieve satisfactory results on the Changing-Swiss. MLLE and LLC tend to project the high-dimensional data directly into the low-dimensional space, which means the topological and geometrical structures cannot be restored well. LNPE is able to completely unfold the topological structure of the original data in the low-dimensional space.

4.2 Real-World Datasets

To evaluate LNPE on real-world applications, we conduct experiments on pose estimation tasks including the Statue face dataset and the Teapot dataset. The Statue face dataset, which consists of 698 images with 64 × 64 = 4096 pixels, was first used in ISOMAP [20]. The Teapot dataset is a set of 400 teapot images in total [22]; each image can be seen as a high-dimensional vector consisting of 76 × 101 × 3 RGB pixels. Samples in these two datasets lie on smooth underlying manifolds. All images in the two datasets are spanned into their respective data spaces by pose variations, and all data points follow a manifold distribution in this high-dimensional data space. To test the performance of LNPE on sparse data, we downsample the number of samples in both datasets to half of the original, which is feasible due to the uniform distribution.

Fig. 2. The Swiss Hole with 500 samples and its embedding results and the Changing-Swiss with 1500 samples and its embedding results.

Fig. 3. The Statue Face dataset and its embedding results.

Fig. 4. The Teapot dataset and its embedding results.


Figure 3 shows the low-dimensional embeddings of LNPE and the comparison algorithms on the Statue face dataset, where the experimental results are compared under different choices of the nearest-neighbor range. Specifically, LLC [17], UMAP [12] and DensMAP [13] perform poorly, and the topology and geometric structure of their embedding results are obviously destroyed. LLE and MLLE can maintain good topology, but their preservation of geometric structure remains weak; in contrast, since MLLE employs a multi-weight structure, it has an advantage in geometric structure retention. The LNPE proposed in this paper recovers the geometric structure well at k=7 and maintains it even better at k=8. Meanwhile, the dimensionality reduction results of LNPE do not exhibit significant topological damage, which indicates that LNPE is more robust to the neighborhood range. To further validate the efficacy of LNPE, we also compare LNPE with the other methods on the Teapot dataset, as shown in Fig. 4. The previous dimensionality reduction methods perform poorly when k = 4 and 5; under the same neighborhood settings, LNPE achieves better dimensionality reduction and obtains good geometric and topological properties. Overall, compared with previous dimensionality reduction methods, LNPE achieves competitive dimensionality reduction results with less dependence on the neighborhood size k, because it captures the global and local relationships of the data well and has good robustness to sparse data.

5 Conclusion

LLE-based methods usually fail in addressing sparse data, where the local connections and neighborhood interactions are difficult to obtain. In this paper, we introduce neighbor propagation into LLE and propose Local Neighbor Propagation Embedding. LNPE expands 1-hop neighborhoods to n-hop neighborhoods and can be solved iteratively based on the solution of LLE. Although the computational complexity increases, LNPE enhances topological connections within neighborhoods and interactions between neighborhoods. Experimental results on both synthetic and real-world datasets show that LNPE improves embedding performance and is more robust than previous methods.

Author contributions. Wenduo Ma and Hengzhi Yu: these authors contributed equally to this work.

References

1. Belkin, M., Niyogi, P.: Laplacian eigenmaps for dimensionality reduction and data representation. Neural Comput. 15(6), 1373–1396 (2003)
2. Cai, Y., Mohan, S., Niranjan, A., Jain, N., Cloninger, A., Das, S.: A manifold learning based video prediction approach for deep motion transfer. In: 2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW), pp. 4214–4221 (2021). https://doi.org/10.1109/ICCVW54120.2021.00470


3. Dang, C., Aghagolzadeh, M., Radha, H.: Image super-resolution via local self-learning manifold approximation. IEEE Signal Process. Lett. 21(10), 1245–1249 (2014). https://doi.org/10.1109/LSP.2014.2332118
4. Donoho, D.L., Grimes, C.: Hessian eigenmaps: locally linear embedding techniques for high-dimensional data. Proc. Natl. Acad. Sci. 100(10), 5591–5596 (2003)
5. Gui, J., Sun, Z., Jia, W., Hu, R., Lei, Y., Ji, S.: Discriminant sparse neighborhood preserving embedding for face recognition. Pattern Recogn. 45(8), 2884–2893 (2012)
6. He, X., Cai, D., Yan, S., Zhang, H.J.: Neighborhood preserving embedding. In: Tenth IEEE International Conference on Computer Vision (ICCV 2005), vol. 2, pp. 1208–1213. IEEE (2005)
7. He, X., Niyogi, P.: Locality preserving projections. In: Advances in Neural Information Processing Systems 16 (2003)
8. Kokiopoulou, E., Saad, Y.: Orthogonal neighborhood preserving projections. In: Fifth IEEE International Conference on Data Mining (ICDM 2005), pp. 8–pp. IEEE (2005)
9. Kruskal, J.B., Wish, M.: Multidimensional scaling, vol. 11. Sage (1978)
10. Li, Y., Wang, C., Zhao, J.: Locally linear embedded sparse coding for spectral reconstruction from RGB images. IEEE Signal Process. Lett. 25(3), 363–367 (2018). https://doi.org/10.1109/LSP.2017.2776167
11. Lin, T., Zha, H.: Riemannian manifold learning. IEEE Trans. Pattern Anal. Mach. Intell. 30(5), 796–809 (2008)
12. McInnes, L., Healy, J., Melville, J.: UMAP: uniform manifold approximation and projection for dimension reduction. arXiv preprint arXiv:1802.03426 (2018)
13. Narayan, A., Berger, B., Cho, H.: Assessing single-cell transcriptomic variability through density-preserving data visualization. Nat. Biotechnol. (2021). https://doi.org/10.1038/s41587-020-00801-7
14. Nie, X., Liu, J., Sun, J., Liu, W.: Robust video hashing based on double-layer embedding. IEEE Signal Process. Lett. 18(5), 307–310 (2011). https://doi.org/10.1109/LSP.2011.2126020
15. Pang, Y., Zhang, L., Liu, Z., Yu, N., Li, H.: Neighborhood preserving projections (NPP): a novel linear dimension reduction method. In: Huang, D.-S., Zhang, X.-P., Huang, G.-B. (eds.) ICIC 2005. LNCS, vol. 3644, pp. 117–125. Springer, Heidelberg (2005). https://doi.org/10.1007/11538059_13
16. Qiao, L., Chen, S., Tan, X.: Sparsity preserving projections with applications to face recognition. Pattern Recogn. 43(1), 331–341 (2010)
17. Roweis, S., Saul, L., Hinton, G.E.: Global coordination of local linear models. In: Advances in Neural Information Processing Systems 14 (2001)
18. Roweis, S.T., Saul, L.K.: Nonlinear dimensionality reduction by locally linear embedding. Science 290(5500), 2323–2326 (2000)
19. Saul, L.K., Roweis, S.T.: An introduction to locally linear embedding. Unpublished. http://www.cs.toronto.edu/~roweis/lle/publications.html (2000)
20. Tenenbaum, J.B., Silva, V.d., Langford, J.C.: A global geometric framework for nonlinear dimensionality reduction. Science 290(5500), 2319–2323 (2000)
21. Van Der Maaten, L., Postma, E., Van den Herik, J., et al.: Dimensionality reduction: a comparative. J. Mach. Learn. Res. 10(66–71), 13 (2009)
22. Weinberger, K.Q., Saul, L.K.: An introduction to nonlinear dimensionality reduction by maximum variance unfolding. In: AAAI, vol. 6, pp. 1683–1686 (2006)
23. Xiang, S., Nie, F., Pan, C., Zhang, C.: Regression reformulations of LLE and LTSA with locally linear transformation. IEEE Trans. Syst. Man Cybern. Part B (Cybernetics) 41(5), 1250–1262 (2011)

Local Neighbor Propagation Embedding

249

24. Zhang, Z., Wang, J.: Mlle: Modified locally linear embedding using multiple weights. In: Advances in Neural Information Processing Systems 19 (2006) 25. Zhang, Z., Zha, H.: Nonlinear dimension reduction via local tangent space alignment. In: Liu, J., Cheung, Y., Yin, H. (eds.) IDEAL 2003. LNCS, vol. 2690, pp. 477–481. Springer, Heidelberg (2003). https://doi.org/10.1007/978-3-540-450801 66

Inter-class Sparsity Based Non-negative Transition Sub-space Learning

Miaojun Li, Shuping Zhao, Jigang Wu(B), and Siyuan Ma

School of Computer Science and Technology, Guangdong University of Technology, Guangzhou 510006, Guangdong, China
[email protected]

Abstract. Least squares regression has shown promising performance in supervised classification. However, conventional least squares regression commonly faces two limitations that severely restrict its effectiveness. Firstly, the strict zero-one label matrix utilized in least squares regression provides limited freedom for classification. Secondly, the modeling process does not fully consider the correlations among samples from the same class. To address the above issues, this paper proposes the inter-class sparsity based non-negative transition sub-space learning (ICSN-TSL) method. Our approach exploits a transition sub-space to bridge the raw image space and the label space. By learning two distinct transformation matrices, we obtain a low-dimensional representation of the data while ensuring model flexibility. Additionally, an inter-class sparsity term is introduced to learn a more discriminative projection matrix. Experimental results on image databases demonstrate the superiority of ICSN-TSL over existing methods in terms of recognition rate. The proposed ICSN-TSL achieves a recognition rate of up to 98% in normal cases. Notably, it also achieves a classification accuracy of over 87% even on artificially corrupted images.

Keywords: Least squares regression · inter-class sparsity · transition sub-space learning

1 Introduction

Least squares regression (LSR) is a widely used approach in statistics for data analysis. It aims to find a projection matrix by minimizing the sum of squared errors [11]. Several variations of LSR-based models have been proposed, such as LASSO regression [12], Ridge regression [7], elastic-net regression [17], and partial LSR [14]. These methods flexibly introduce regularization terms into the LSR framework. (This work was supported in part by the Natural Science Foundation of China under Grant No. 62106052 and Grant No. 62072118, in part by the Guangdong Basic and Applied Basic Research Foundation under Grant No. 2021B1515120010, and in part by the Huangpu International Sci&Tech Cooperation Foundation of Guangzhou, China, under Grant No. 2021GH12.)

However, the classification performance of the LSR framework still suffers from some limitations. In the classification task, maximizing the distance between different classes is essential. When the label matrix is strictly binary, the regression target for each class is constrained to be a fixed value, typically zero or one. Consequently, the Euclidean distance between any two samples from different classes becomes invariant, as the regression targets are held constant. To address this limitation, researchers have proposed techniques to relax the strict zero-one label matrix and learn a more discriminative regression target. The most representative method is discriminative LSR (DLSR) [15]. DLSR introduces ε-draggings to drag the regression targets of different classes in opposite directions. However, this technique may lead to an increase in the regression distances between samples from the same class. Fisher DLSR (FDLSR) [2] combines the Fisher criterion and ε-draggings to simultaneously achieve small intra-class distances and large inter-class distances, aiming to improve the discriminative capability of the learned projection matrix. Discriminant and sparsity based least squares regression with l1 regularization (DS_LSR) [19] introduces an orthogonal relaxed term to relax the label matrix and incorporates sparsity-based regularization with the l1-norm to promote a more sparse and discriminative regression target. Inter-class sparsity based discriminative least square regression (ICS_DLSR) [13] relaxes the strict regression target using a sparsity error term with the l2,1-norm. However, one drawback of this method is that the learned projection matrix is sensitive to noise, which can significantly impact the classification performance when dealing with "unclear" data samples. To address this issue, low-rank inter-class sparsity based semi-flexible target least squares regression (LIS_StLSR) [18] introduces a semi-flexible regression matrix and a low-rank inter-class sparsity constraint, which enhances the model's robustness to noise and makes it more reliable in challenging scenarios. By incorporating different regularization terms and relaxation strategies, these methods aim to enhance the discriminative capability and robustness of the learned projection matrix.

The above methods share a common limitation of using a single transformation matrix, which can restrict the margin of the learned model. To address this limitation, double relaxed regression (DRR) [6] proposes to use more transformation matrices to provide more freedom; DRR uses two transformation matrices and introduces a class compactness graph to solve the over-fitting problem. Sparse non-negative transition subspace learning (SN-TSL) [1] extends this idea by incorporating a transition sub-space in the learning process. However, SN-TSL with the l1-norm sparsity constraint may not effectively discover the intrinsic group structure of the data, and it is time-consuming.

In this study, we propose a novel approach called inter-class sparsity based non-negative transition sub-space learning (ICSN-TSL). The flow chart of the proposed model is shown in Fig. 1. The learning process of ICSN-TSL consists of two steps. In the first step, the original samples are transformed into a transition sub-space using the projection matrix W. This step aims to capture the essential information and underlying structure of the data. In the second step,
the transition sub-space is further projected into a binary label space using the projection matrix Q. This transformation facilitates the classification process by mapping the samples into distinct classes. The proposed ICSN-TSL model offers several contributions.
• By incorporating the inter-class sparsity term, our method promotes the learning of a more discriminative projection matrix. This enables better separation and classification of different classes.
• The introduction of a non-negative transition sub-space helps bridge the gap between the original sample space and the label space. It facilitates the extraction of meaningful features and improves the interpretability of the learned projection matrix.

Fig. 1. The flow chart of the proposed method. Xi is the raw data of the i-th class samples, H is the representation in the transition sub-space, and c is the number of classes.

The rest of the paper is organized as follows. Section 2 presents some notations and related works. Section 3 describes the proposed method and its optimal solution, and analyzes the algorithm. Then, we perform some experiments in Sect. 4 to demonstrate the effectiveness of the proposed method. The conclusions are given in Sect. 5.

2 Related Work

2.1 Notations

In this subsection, to ensure the completeness and readability of this work, we first introduce the notations used throughout this paper. Matrices are denoted using capital letters, e.g., X, and the element in the i-th row and j-th column of the matrix X is denoted as X_{i,j}. Vectors are denoted using lowercase letters, e.g., x. For a matrix X, its Frobenius norm (l_F-norm) and l_{2,1}-norm are calculated as

\|X\|_F = \sqrt{\sum_{i=1}^{d}\sum_{j=1}^{n} X_{i,j}^2} = \sqrt{\mathrm{Tr}(X^T X)} \quad (1)

\|X\|_{2,1} = \sum_{i=1}^{d}\|X_{i,:}\|_2 = \sum_{i=1}^{d}\sqrt{\sum_{j=1}^{n} X_{i,j}^2} \quad (2)

where X^T denotes the transpose of X and Tr(X) is the trace operator.
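For concreteness, the two norms in Eqs. (1) and (2) can be evaluated directly. The following NumPy sketch, with an arbitrary toy matrix, simply mirrors the definitions above.

```python
import numpy as np

def frobenius_norm(X):
    # ||X||_F = sqrt(sum_ij X_ij^2) = sqrt(Tr(X^T X)), Eq. (1)
    return np.sqrt(np.sum(X ** 2))

def l21_norm(X):
    # ||X||_{2,1} = sum_i ||X_{i,:}||_2, Eq. (2): l2-norm of each row, then summed
    return np.sum(np.linalg.norm(X, axis=1))

X = np.random.randn(5, 8)
print(frobenius_norm(X), np.linalg.norm(X))   # the two values agree
print(l21_norm(X))
```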

2.2 StLSR

Given a training set X = [x_1, x_2, ..., x_n] \in R^{d \times n} containing n training samples with d dimensions, and a label matrix Y = [y_1, y_2, ..., y_n] \in R^{c \times n}, where each column y_i = [0, 0, ..., 0, 1, 0, ..., 0]^T (only the entry of the correct class is 1) is a zero-one vector indicating the label of sample x_i among the c classes, the standard LSR model is formulated as

\min_{P} \|Y - PX\|_F^2 + \lambda\|P\|_F^2 \quad (3)

where P \in R^{c \times d} is the transformation matrix and \lambda is a positive regularization parameter. This problem has the closed-form solution P = Y X^T (X X^T + \lambda I)^{-1}.
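As an illustration of this closed-form solution, a minimal NumPy sketch is given below; the toy data, dimensions, and the choice λ = 0.1 are illustrative only.

```python
import numpy as np

def standard_lsr(X, Y, lam=0.1):
    """Closed-form solution of Eq. (3): P = Y X^T (X X^T + lam*I)^{-1}.
    X: d x n training matrix, Y: c x n zero-one label matrix."""
    d = X.shape[0]
    A = X @ X.T + lam * np.eye(d)          # d x d
    return Y @ X.T @ np.linalg.inv(A)      # c x d transformation matrix P

# toy usage: 3 classes, 20-dimensional features, 60 samples
rng = np.random.default_rng(0)
X = rng.standard_normal((20, 60))
labels = rng.integers(0, 3, 60)
Y = np.eye(3)[labels].T                    # c x n one-hot label matrix
P = standard_lsr(X, Y)
pred = np.argmax(P @ X, axis=0)            # regression-based class prediction
```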

2.3 ICS_DLSR

The label matrix used by the standard LSR is not well suited to classification. ICS_DLSR uses a label relaxation term with the l_{2,1}-norm to fit the transformed data to the rigid binary label matrix. Moreover, ICS_DLSR introduces an inter-class sparsity constraint to exploit the correlation between samples within the same class. This constraint aims to induce a common sparse structure among samples of the same class after the transformation, leading to a reduction of intra-class boundaries and an expansion of inter-class boundaries. The model of ICS_DLSR is formulated as follows:

\min_{P,E} \frac{1}{2}\|Y + E - PX\|_F^2 + \frac{\lambda_1}{2}\|P\|_F^2 + \lambda_2\|E\|_{2,1} + \lambda_3\sum_{i=1}^{c}\|P X_i\|_{2,1} \quad (4)

where E is a sparsity error term introduced to relax the zero-one label matrix, and \sum_{i=1}^{c}\|P X_i\|_{2,1} is the inter-class sparsity constraint term.

2.4 SN-TSL

In contrast to DLSR and ICS_DLSR, which relax the binary label matrix, SN-TSL utilizes a sparse non-negative transition sub-space instead of the ε-draggings technique and the inter-class sparsity term. The objective function of SN-TSL is formulated as follows:

\min_{W,Q,H} \frac{1}{2}\|WX - H\|_F^2 + \alpha\|H\|_1 + \frac{\beta}{2}\|QH - Y\|_F^2 + \frac{\lambda}{2}\left(\|W\|_F^2 + \|Q\|_F^2\right), \quad \text{s.t. } H \ge 0 \quad (5)
where W \in R^{r \times d} is the projection matrix from the raw space to the transition sub-space, Q \in R^{c \times r} is the projection matrix from the transition sub-space to the prediction space, H \in R^{r \times n} is the transition sub-space representation, and r is its dimensionality; SN-TSL performs best when r is approximately equal to c or 2c. Here, we set r to 2c for all datasets.

3 The Proposed Method

3.1 Problem Formulation and Learning Model

Based on Eq. (3), we introduce a transition sub-space into the transformation process to find two projection matrices W and Q. The original samples are first transformed into the transition sub-space and then mapped to the binary label space. Our objective is to optimize the transition matrix by minimizing the classification error. Based on the previous discussion, this problem can be formulated as follows:

\min_{W,Q,H} \frac{1}{2}\|QH - Y\|_F^2 + \frac{\beta}{2}\|WX - H\|_F^2 + \frac{\lambda_1}{2}\left(\|W\|_F^2 + \|Q\|_F^2\right) \quad (6)

We add sparsity and non-negativity constraints to the transition matrix H to ensure that the final regression target is both sparse and non-negative. To avoid the drawbacks of the l_1-norm, we utilize the l_{2,1}-norm instead; it is effective in reducing the negative impact of noise and outliers while promoting sparsity. Thus, we impose the following constraint on the objective function:

\min_{W,Q,H} \frac{1}{2}\|QH - Y\|_F^2 + \frac{\beta}{2}\|WX - H\|_F^2 + \frac{\lambda_1}{2}\left(\|W\|_F^2 + \|Q\|_F^2\right) + \alpha\|H\|_{2,1}, \quad \text{s.t. } H \ge 0 \quad (7)

Classification aims to assign each sample to its corresponding class, so we want the distance between samples from different classes to be large enough during training, and samples from the same class to be more compact. To effectively capture the relationships between samples, we introduce an inter-class sparsity constraint. Therefore, the objective function of our method is defined as follows:

\min_{W,Q,H} \frac{1}{2}\|QH - Y\|_F^2 + \frac{\beta}{2}\|WX - H\|_F^2 + \frac{\lambda_1}{2}\left(\|W\|_F^2 + \|Q\|_F^2\right) + \alpha\|H\|_{2,1} + \lambda_2\sum_{i=1}^{c}\|W X_i\|_{2,1}, \quad \text{s.t. } H \ge 0 \quad (8)

where λ1, λ2, α, and β are positive regularization parameters. The l_{2,1}-norm-based term \|H\|_{2,1} and the constraint H \ge 0 ensure that the learned H is row-sparse and non-negative, which is robust to noise in the data.
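For reference, the value of the objective in Eq. (8) can be evaluated directly once W, Q, and H are given. The NumPy sketch below assumes the columns of X are grouped by class through an illustrative class_slices index list; it is only a monitoring utility, not part of the optimization itself.

```python
import numpy as np

def l21(M):
    # l_{2,1}-norm: sum of the l2-norms of the rows
    return np.sum(np.linalg.norm(M, axis=1))

def icsn_tsl_objective(W, Q, H, X, Y, class_slices, alpha, beta, lam1, lam2):
    """Value of the ICSN-TSL objective in Eq. (8).
    class_slices[i] indexes the columns of X belonging to the i-th class."""
    loss  = 0.5 * np.linalg.norm(Q @ H - Y, 'fro') ** 2
    loss += 0.5 * beta * np.linalg.norm(W @ X - H, 'fro') ** 2
    loss += 0.5 * lam1 * (np.linalg.norm(W, 'fro') ** 2 + np.linalg.norm(Q, 'fro') ** 2)
    loss += alpha * l21(H)
    loss += lam2 * sum(l21(W @ X[:, idx]) for idx in class_slices)
    return loss
```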

3.2 Solution to ICSN-TSL

The objective function in (8) cannot be optimized directly, so we use the alternating direction method of multipliers (ADMM) to optimize the learning model [2,15]. The auxiliary variables D and F are first introduced so that the objective function (8) can be separated and optimized as follows:

\min_{W,Q,H,D,F} \frac{1}{2}\|QH - Y\|_F^2 + \frac{\beta}{2}\|WX - H\|_F^2 + \frac{\lambda_1}{2}\left(\|W\|_F^2 + \|Q\|_F^2\right) + \alpha\|F\|_{2,1} + \lambda_2\sum_{i=1}^{c}\|D_i\|_{2,1}, \quad \text{s.t. } D = WX,\; F = H,\; H \ge 0 \quad (9)

Then (9) is converted into the following augmented Lagrangian function:

L(W,Q,H,D,F) = \frac{1}{2}\|QH - Y\|_F^2 + \frac{\beta}{2}\|WX - H\|_F^2 + \alpha\|F\|_{2,1} + \frac{\lambda_1}{2}\left(\|W\|_F^2 + \|Q\|_F^2\right) + \lambda_2\sum_{i=1}^{c}\|D_i\|_{2,1} + \frac{\mu}{2}\left(\left\|D - WX + \frac{C_1}{\mu}\right\|_F^2 + \left\|F - H + \frac{C_2}{\mu}\right\|_F^2\right) \quad (10)

where C_1 and C_2 denote the Lagrange multipliers and \mu > 0 is a penalty parameter. The solutions for all variables are obtained by solving problem (10) alternately. The sub-problems are as follows:

L(W) = \min_{W} \frac{\beta}{2}\|WX - H\|_F^2 + \frac{\lambda_1}{2}\|W\|_F^2 + \frac{\mu}{2}\left\|D - WX + \frac{C_1}{\mu}\right\|_F^2 \quad (11)

L(Q) = \min_{Q} \frac{1}{2}\|QH - Y\|_F^2 + \frac{\lambda_1}{2}\|Q\|_F^2 \quad (12)

L(D) = \min_{D} \lambda_2\sum_{i=1}^{c}\|D_i\|_{2,1} + \frac{\mu}{2}\left\|D - WX + \frac{C_1}{\mu}\right\|_F^2 \quad (13)

L(H) = \min_{H} \frac{1}{2}\|QH - Y\|_F^2 + \frac{\beta}{2}\|WX - H\|_F^2 + \frac{\mu}{2}\left\|F - H + \frac{C_2}{\mu}\right\|_F^2, \quad \text{s.t. } H \ge 0 \quad (14)

L(F) = \min_{F} \alpha\|F\|_{2,1} + \frac{\mu}{2}\left\|F - H + \frac{C_2}{\mu}\right\|_F^2 \quad (15)

By setting the derivative of (11) with respect to W to zero, we obtain W = (\beta H + \mu D + C_1)X^T\left((\beta + \mu)XX^T + \lambda_1 I_d\right)^{-1}. The optimal Q is obtained by setting the derivative of (12) to zero, which gives Q = Y H^T (H H^T + \lambda_1 I_r)^{-1}. By setting the derivative of (14) to zero and considering the non-negativity of H, H can be obtained as H = \max(\hat{H}, 0), where \hat{H} = \left((\mu + \beta)I_r + Q^T Q\right)^{-1}\left(\beta W X + Q^T Y + \mu F + C_2\right). To update D, the l_{2,1}-norm proximal operator is applied to (13), and D is obtained row-wise as

[D_i]_{j,:} = \begin{cases} \dfrac{\|[T_i]_{j,:}\|_2 - \lambda_2/\mu}{\|[T_i]_{j,:}\|_2}[T_i]_{j,:}, & \text{if } \|[T_i]_{j,:}\|_2 > \lambda_2/\mu \\ 0, & \text{otherwise} \end{cases}

where T = WX - \frac{C_1}{\mu}, and D_i and T_i are the subsets of D and T corresponding to the samples of the i-th class, respectively. Similarly, the solution of F in (15) is given by

F_{j,:} = \begin{cases} \dfrac{\|U_{j,:}\|_2 - \alpha/\mu}{\|U_{j,:}\|_2}U_{j,:}, & \text{if } \|U_{j,:}\|_2 > \alpha/\mu \\ 0, & \text{otherwise} \end{cases}

where U = H - \frac{C_2}{\mu}, and F_{j,:} and U_{j,:} are the j-th rows of F and U, respectively.

Classification

The detailed procedures of the proposed method are outlined in Algorithm 1, after which the two learned projection matrices W and Q are used to obtain a regression vector QW y of a new test sample y, which is then classified using the nearest-neighbor (NN) classifier. 3.4

Computational Time Complexity

In the ICSN-TSL learning, the variables W , Q, H, D, and F are updated alternatively. The most expensive computation is the matrix inverse, and other computational costs are negligible. We can obtain that the computational cost of updating the variable W , Q, and H are bounded by O(d3 ), O(r3 ), and O(r3 ) respectively. Therefore, the main computational complexity of Algorithm 1 is O(τ (d3 + n3 + 2r3 )), where τ is the number of iterations.

Algorithm 1. Solving ICSN-TSL by ADMM Input: Training samples matrix X∈Rd×n , label matrix Y ∈Rc×n , parameters λ1 , λ2 , α, β. Output: W and Q. Initialization: Random and normalized W and Q; H = 0; D = W X; F = H; C1 =; C2 = 0; μ = 10−5 ; μmax = 108 ; ρ = 1.1. while not converged do 1. Update the variables of W, Q, H, D and F as (11)-(15), respectively. 2. Update C1 , C2 , and μ by⎧using ⎪ ⎨C1 = C1 + μ(D − W X) C2 = C2 + μ(F − H) ⎪ ⎩ μ = min(ρμ, μmax ) where ρ and μmax are positive constants. end while

Fig. 2. The convergence property of the proposed method on (a) AR, (b) COIL-20, and (c) Yale B. [Figure: objective function values and classification accuracy (%) versus the number of iterations on the three datasets.]

3.5 Convergence Analysis

The convergence of the proposed algorithm is demonstrated through experiments conducted on datasets such as AR, COIL-20, and Yale B. In Fig. 2, we observe that the value of the objective function decreases consistently with an increasing number of iterations. Eventually, the algorithm converges to a smaller value after a finite number of iterations. This convergence behavior indicates that the algorithm is effectively optimizing the objective function and improving the performance of the model.

4 Experiments and Analysis

In this section, we evaluate the performance of the proposed method on public databases.

4.1 Data Sets

We validate the classification performance of the ICSN-TSL model on several publicly available datasets, namely AR [8], Extended Yale B [5], and COIL-20 [10]. The detailed information of the datasets is as follows:
1) The AR face dataset contains over 4,000 color face images. We select a subset of 3,120 images of 120 people and crop each image to a size of 50 × 40.
2) The Extended Yale B dataset contains 2,414 images of 38 subjects under different lighting conditions. We resize the images in advance to 32 × 32 pixels.
3) The COIL-20 object dataset contains a total of 1,440 grey-scale images corresponding to 20 classes. In the experiments, each image is resized to 32 × 32.

4.2 Experimental Results and Analysis

In this subsection, we compare the proposed ICSN-TSL model with six classification methods based on the LSR model, including SN-TSL [1], DLSR [15], ICS_DLSR [13], FDLSR [2], ReLSR [16], and LRDLSR [3], and with two models using the original binary labels as regression targets, i.e., CDPL [9] and SALPL [4]. Table 1 presents the mean classification accuracy and standard deviation of the compared models on the AR, Extended Yale B, and COIL-20 datasets. From the table, our ICSN-TSL model is superior to SN-TSL on all databases. Furthermore, in Fig. 3, we visualize the margins of the transformed samples obtained by our method using t-SNE. From this figure, it is clear that, compared to SN-TSL, our method produces greater compactness between samples of the same class as well as greater distances between different classes.

Table 1. Classification accuracies (mean ± std %) of different methods on the three databases.

Database        | No. | LRDLSR       | FDLSR        | SALPL        | CDPL         | DLSR         | ReLSR        | ICS_DLSR     | SN-TSL       | ICSN-TSL
AR              | 4   | 91.58 ± 1.05 | 78.94 ± 1.51 | 73.24 ± 1.61 | 71.14 ± 2.75 | 89.13 ± 0.56 | 84.77 ± 1.25 | 91.64 ± 0.70 | 79.79 ± 0.93 | 91.90 ± 0.56
AR              | 6   | 94.89 ± 0.69 | 88.18 ± 0.95 | 84.89 ± 1.26 | 80.14 ± 3.28 | 91.06 ± 0.97 | 92.25 ± 1.26 | 95.36 ± 0.52 | 90.15 ± 0.78 | 95.90 ± 0.43
AR              | 8   | 96.32 ± 0.35 | 93.35 ± 0.64 | 91.49 ± 1.02 | 87.81 ± 2.19 | 93.95 ± 0.59 | 95.05 ± 0.99 | 97.08 ± 0.69 | 94.50 ± 0.71 | 97.30 ± 0.59
AR              | 10  | 97.75 ± 0.35 | 95.12 ± 0.73 | 94.98 ± 0.84 | 91.60 ± 1.13 | 94.48 ± 0.76 | 96.34 ± 0.56 | 98.07 ± 0.46 | 96.74 ± 0.41 | 98.18 ± 0.43
Extended Yale B | 10  | 83.30 ± 1.45 | 89.78 ± 0.98 | 73.85 ± 1.50 | 74.65 ± 4.11 | 85.99 ± 1.66 | 83.77 ± 2.34 | 87.75 ± 1.12 | 81.19 ± 1.72 | 89.43 ± 1.03
Extended Yale B | 15  | 88.34 ± 1.09 | 92.76 ± 0.97 | 84.86 ± 1.34 | 82.10 ± 3.60 | 89.77 ± 0.65 | 88.74 ± 1.07 | 91.98 ± 1.06 | 88.97 ± 1.00 | 93.31 ± 0.94
Extended Yale B | 20  | 91.77 ± 1.05 | 94.38 ± 0.59 | 89.70 ± 0.95 | 85.78 ± 3.87 | 92.22 ± 0.45 | 91.01 ± 1.21 | 94.82 ± 0.94 | 93.04 ± 0.82 | 95.88 ± 0.94
Extended Yale B | 25  | 93.52 ± 0.70 | 95.74 ± 0.65 | 91.72 ± 1.06 | 87.58 ± 2.86 | 94.58 ± 0.74 | 93.32 ± 1.06 | 96.39 ± 0.48 | 95.12 ± 0.69 | 97.20 ± 0.71
COIL-20         | 10  | 91.50 ± 1.11 | 92.53 ± 1.43 | 91.34 ± 1.20 | 90.42 ± 1.23 | 92.95 ± 0.96 | 89.12 ± 1.42 | 93.30 ± 1.13 | 90.34 ± 1.60 | 93.86 ± 2.06
COIL-20         | 15  | 93.92 ± 1.17 | 95.67 ± 0.73 | 95.82 ± 0.97 | 94.63 ± 0.88 | 94.77 ± 0.85 | 93.25 ± 1.58 | 96.46 ± 0.95 | 92.59 ± 1.64 | 96.74 ± 0.93
COIL-20         | 20  | 96.06 ± 0.85 | 97.14 ± 0.73 | 97.32 ± 1.23 | 95.90 ± 0.94 | 96.53 ± 1.17 | 95.59 ± 1.16 | 97.83 ± 0.72 | 94.51 ± 1.36 | 98.22 ± 0.42
COIL-20         | 25  | 97.69 ± 0.69 | 97.84 ± 0.84 | 98.07 ± 0.76 | 97.06 ± 0.88 | 97.33 ± 0.61 | 96.63 ± 0.64 | 98.48 ± 0.72 | 96.41 ± 0.69 | 99.00 ± 0.50

Parameter Sensitivity Analysis

The proposed ICSN-TSL has four parameters to choose, i.e., α, β, λ1 , and λ2 . We define a candidate set {10−5 , 10−4 , 10−3 , 10−2 , 10−1 , 100 , 101 , 102 }. Generally, the larger the parameter value, the greater the importance or influence of the corresponding item. The ICSN-TSL recognition rate varies with the model parameters on the dataset AR in Fig. 6 and Fig. 5 show their relationship. It can be observed that the model is very robust to the values of λ2 and α over the range of candidate the best Furthermore, performance of ICSN-TSL is

sets. obtained when β ∈ 1, 102 and λ1 ∈ 10−5 , 10−1 .

Inter-class Sparsity Based Non-negative Transition Sub-space Learning

259

Fig. 3. The t-SNE visualization results of the features extracted by different algorithms on the Extended Yale B.

4.4

Ablation Study

To assess the impact of some constraint terms of the proposed method, an ablation study of the model was conducted. Experimental comparisons are conducted for the following four cases. (1) ICSN-TSL(λ1 ): the case of that λ1 = 0 in the ICSN-TSL model, i.e., the model is not prevented from over-fitting. (2) ICSN-TSL(λ2 ): the case of that λ2 = 0 in the ICSN-TSL model, i.e., the model is without inter-class sparsity term. (3) ICSN-TSL(α): the case of that α = 0 in the ICSN-TSL model, i.e., the absence of sparsity constraints on the transition sub-space. (4) ICSN-TSL(β): the case of that β = 0 in the ICSN-TSL model, i.e., the model is without the regression term. The experimental results on three datasets are showed in Fig. 7. It can be seen that the performance of ICSN-TSL(β) is extremely poor. This indicates that the performance of the model benefits greatly from the regression term. Therefore, the regression term can explicitly guide the feature learning process, and it is an

100 95 90 85 80 75 70 65 60 55 50 0.1

Average classification accuracy(%)

M. Li et al.

Average classification accuracy(%)

260

100

LRSN-TSL SN-TSL DLSR ReLSR ICS-DLSR CPDL SALPL FDLSR LRDLSR

0.2

0.3

0.4

Percentage of mixed corruptions

0.5

95 90 85 LRSN-TSL SN-TSL DLSR ReLSR ICS-DLSR CPDL SALPL FDLSR LRDLSR

80 75 70 65 60 0.1

0.2

0.3

0.4

0.5

Percentage of mixed corruptions

(a) AR

(b) COIL-20

Fig. 4. Average classification accuracy (%) of different algorithms versus different levels of data corruption on the (1) AR and (2) COIL-20.

Fig. 5. The classification accuracy (%) of ICSN-TSL versus the variations of the parameters (a) λ1 and λ2 , (b) α and λ1 , (c) α and λ2 , (d) β and λ1 , (e) β and λ2 , (f) β and α on the AR.

40 20 0 1e-5

1e-3

1e-1

(a)λ1

1e1

96 94 92 90 1e-5

1e-3

1e-1

(b)λ2

1e1

100

98 96 94 92 90 1e-5

1e-3

1e-1

(c)α

1e1

Classification accuracy(%)

60

100

98

Classification accuracy(%)

Classification accuracy(%)

80

Classification accuracy(%)

100

100

80 60 40 20 0 1e-5

1e-3

1e-1

1e1

(d)β

Fig. 6. The classification accuracy (%) of ICSN-TSL versus the variations of the parameter on the AR.

Inter-class Sparsity Based Non-negative Transition Sub-space Learning

261

important role in the final classification. Furthermore, the performance of the model decreases significantly when λ1 = 0, i.e., when the over-fitting constraint is missing on the projection matrix. In contrast, the added sparsity constraints and inter-class sparsity can improve the accuracy of the model. 100

Fig. 7. Ablation study of the proposed method on the three databases.

5 Conclusion

This paper proposes an efficient learning method to overcome the limitations of strict label matrices in classification tasks. The approach introduces a transition sub-space to relax the binary matrix constraint and alleviate the restrictions imposed by a single transformation matrix. The proposed model not only enables the learning of more flexible projection and regression targets but also exhibits robustness to outliers and noise commonly encountered in real-world data. Experimental results demonstrate the effectiveness of the proposed inter-class sparsity-based non-negative transition sub-space learning (ICSN-TSL) model. Especially, ICSN-TSL achieves high classification accuracy, even on artificially corrupted images. For future work, the combination of low-rank constraints and sparse constraints is considered to enhance feature selection and extraction.

References 1. Chen, Z., Wu, X.J., Cai, Y.H., Kittler, J.: Sparse non-negative transition subspace learning for image classification. Signal Process. 183, 107988 (2021) 2. Chen, Z., Wu, X.J., Kittler, J.: Fisher discriminative least squares regression for image classification. arXiv preprint arXiv:1903.07833 (2019) 3. Chen, Z., Wu, X.J., Kittler, J.: Low-rank discriminative least squares regression for image classification. Signal Process. 173, 107485 (2020) 4. Fang, X., et al.: Approximate low-rank projection learning for feature extraction. IEEE Trans. Neural Netw. Learn. Syst. 29(11), 5228–5241 (2018) 5. Georghiades, A.S., Belhumeur, P.N.: Illumination cone models for faces recognition under variable lighting. In: Proceedings of CVPR 1998 (1998)


6. Han, N., et al.: Double relaxed regression for image classification. IEEE Trans. Circuits Syst. Video Technol. 30(2), 307–319 (2019) 7. Hoerl, A.E., Kennard, R.W.: Ridge regression: biased estimation for nonorthogonal problems. Technometrics 12(1), 55–67 (1970) 8. Martinez, A., Benavente, R.: The AR face database: CVC technical report, 24 (1998) 9. Meng, M., Lan, M., Yu, J., Wu, J., Tao, D.: Constrained discriminative projection learning for image classification. IEEE Trans. Image Process. 29, 186–198 (2019) 10. Nene, S.A., Nayar, S.K., Murase, H., et al.: Columbia object image library (COIL20) (1996) 11. Peng, Y., Zhang, L., Liu, S., Wang, X., Guo, M.: Kernel negative ε dragging linear regression for pattern classification. Complexity 2017 (2017) 12. Tibshirani, R.: Regression shrinkage and selection via the lasso. J. Roy. Stat. Soc.: Ser. B (Methodol.) 58(1), 267–288 (1996) 13. Wen, J., Xu, Y., Li, Z., Ma, Z., Xu, Y.: Inter-class sparsity based discriminative least square regression. Neural Netw. 102, 36–47 (2018) 14. Wold, S., Ruhe, A., Wold, H., Dunn, Iii, W.: The collinearity problem in linear regression. The partial least squares (PLS) approach to generalized inverses. SIAM J. Sci. Stat. Comput. 5(3), 735–743 (1984) 15. Xiang, S., Nie, F., Meng, G., Pan, C., Zhang, C.: Discriminative least squares regression for multiclass classification and feature selection. IEEE Trans. Neural Netw. Learn. Syst. 23(11), 1738–1754 (2012) 16. Zhang, X.Y., Wang, L., Xiang, S., Liu, C.L.: Retargeted least squares regression algorithm. IEEE Trans Neural Netw. Learn. Syst. 26(9), 2206–2213 (2014) 17. Zhang, Z., Lai, Z., Xu, Y., Shao, L., Wu, J., Xie, G.S.: Discriminative elastic-net regularized linear regression. IEEE Trans. Image Process. 26(3), 1466–1481 (2017) 18. Zhao, S., Wu, J., Zhang, B., Fei, L.: Low-rank inter-class sparsity based semiflexible target least squares regression for feature representation. Pattern Recogn. 123, 108346 (2022) 19. Zhao, S., Zhang, B., Li, S.: Discriminant and sparsity based least squares regression with L1 regularization for feature representation. In: ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1504–1508. IEEE (2020)

Incremental Learning Based on Dual-Branch Network

Mingda Dong1,2, Zhizhong Zhang2, and Yuan Xie2(B)

1 Shanghai Institute of AI for Education, East China Normal University, Shanghai, China
2 School of Computer Science and Technology, East China Normal University, Shanghai, China
{zzzhang,yxie}@cs.ecnu.edu.cn

Abstract. Incremental learning aims to overcome catastrophic forgetting. When a model learns multiple tasks sequentially, the knowledge of old classes stored in the model is destroyed by the large number of new classes, owing to the imbalance between the numbers of new and old classes. It is difficult for existing single-backbone models to avoid catastrophic forgetting. In this paper, we propose a dual-branch network model to learn new tasks and alleviate catastrophic forgetting. Different from previous dual-branch models that learn tasks in parallel, we propose to use the dual-branch network to learn tasks serially. The model creates a new backbone for learning the remaining tasks and freezes the previous backbone. In this way, the model reduces the damage to the previous backbone parameters used to learn the old tasks. The model uses knowledge distillation to preserve the information of old tasks when learning new tasks, and we also analyze different distillation methods for the dual-branch network model. In this paper we mainly focus on the more challenging class-incremental learning, using a common incremental learning setting on the ImageNet-100 dataset. The experimental results show that the accuracy can be improved by using the dual-branch network.

Keywords: incremental learning · knowledge distillation · catastrophic forgetting

1 Introduction

Deep learning has achieved remarkable success. For example, in image classification tasks, the classification accuracy of models has exceeded that of humans. However, humans have the ability to accumulate knowledge when learning multiple tasks. In contrast, catastrophic forgetting occurs when models learn multiple tasks sequentially. For example, after a student learns to distinguish cats from dogs and then learns to distinguish birds from airplanes, he will be able to distinguish the four categories at the same time. After a model has learned these four categories in the same order, it performs poorly on the first task. When learning the first task, the model parameters are adjusted
to learn the first task. When learning the second task, the first task has less data compared to the second task, and training the model with the current data and memory causes the model to be more inclined to learn the second task. Therefore, the performance on the first task drops significantly, and the problem of catastrophic forgetting occurs. The research on incremental learning mainly aims to solve this problem: when a new task comes, only a small portion of the old tasks' data is saved in memory, and the model is expected to obtain high performance on both old and new tasks. This scenario is very practical, for example in face recognition security systems and recommendation systems. AANets [12] uses two types of residual blocks, a stable block and a plastic block, and the two blocks learn each task in parallel. The model alleviates the forgetting of old tasks by slowly changing the stable block for old tasks and learning to mix it with the plastic block for the new task. We propose a dual-branch model that learns tasks in a sequential manner. The model can better preserve old-task information by freezing the old branch. In this way, the model parameters that are important to the old tasks are modified less, so that the old branch can better preserve the information of the old tasks. Since only a small number of old-task samples are saved, these few samples and a large number of current-task samples are used to train the model, which leads to the bias problem of the classifier. [20] observes that the last fully connected layer is biased and uses a validation set to solve the problem. [25] addresses the problem that the gap between the classifier weights of the old tasks and the new task is too large. [12] builds a balanced dataset in which each class has the same number of samples and trains the weight parameters in the upper-level optimization on the balanced dataset. [21,24] propose to train classifiers with a temperature in the softmax function. Following these methods, we construct the baseline method: in the single-branch model, we use a softmax function with temperature when training the classifier in the incremental process. We improve the baseline method by changing the model to a two-branch model. Knowledge distillation [8] is often used in incremental learning. We also use knowledge distillation to preserve the knowledge of old tasks, and we analyze how to perform distillation in the dual-branch model.

2 Related Work

Incremental learning methods can be divided into regularization-based methods, replay-based methods, and architecture-based methods. Among the regularization-based methods, EWC [10] adds constraints to the parameters by calculating the importance of each parameter, thereby reducing the modification of parameters that are important for old tasks. SI [23] finds the parameters whose updates contribute greatly to the reduction of the loss function of the previous tasks, and then adds constraints to reduce the modification of these parameters in future tasks. Adam-NSCL [18] uses the null space to solve task-incremental learning: each gradient update is projected onto the null space of the features of each layer of the old tasks, so as to avoid increasing the loss on the old tasks. It has an elegant mathematical proof, but when actually finding the null space the conditions are too strict, and it is mainly used in task-incremental learning rather than class-incremental learning. [15] proposes to mimic the oracle: they first analyze the oracle model trained with all tasks' examples and find that the feature distribution of each class under the oracle is indeed relatively uniform. By mimicking the feature distribution of the oracle after each incremental step, the model obtains higher accuracy. LwF [11] performs distillation between the old and new models. LwF does not save old samples and only uses the samples of the new task to train the model; it optimizes the model by minimizing the KL divergence between the probability distributions of the old and new models. This method is mainly used in task-incremental learning; since old-class samples are not used, its performance on class-incremental learning is poor. Many subsequent works are inspired by this work. iCaRL [14] can be seen as an improvement of LwF [11]: unlike LwF, iCaRL preserves old samples to train the new model. iCaRL uses old- and new-class samples for distillation and uses the new and old classes to calculate the cross-entropy loss. For the problem of classifier deviation, iCaRL proposes to calculate a prototype for each class and, at test time, to select the class whose prototype is nearest to the sample as the predicted class. iCaRL also proposes a herding strategy to keep the old samples. EEIL [3] uses a variety of data augmentations to increase the number of old samples and then fine-tunes the model to improve performance. PODNet [4] applies distillation to intermediate layers instead of only to the output of the classifier. SDC [22] observes the drift of old tasks' features when learning new tasks; it calculates the feature drift of each old class and proposes to compensate for the drift of the old classes without using the old-class examples. Since only a small number of samples of the old classes can be saved in memory, the number of new-class samples is larger than that of the old classes, which causes the classifier to be biased towards the new classes during training. UCIR [9] found that the weights of the new classes are larger than those of the old classes, so the model tends to predict samples as new classes, and proposes to use a cosine classifier to avoid the bias of the classifier. BiC [20] further analyzes the classifier bias problem and proposes to train the classifier by using a validation set to reduce the bias. WA [25] also studies the bias problem of the classifier: it adjusts the weights of the old and new classes to alleviate the bias, and the scaled weights are more balanced and improve the accuracy. Among the architecture-based methods, DualNet [13] builds two branches to learn features in parallel, namely a fast learner and a slow learner. The slow branch uses a self-supervised method to learn task-agnostic knowledge, while the fast learner learns knowledge for specific tasks, thereby reducing forgetting. DER [21] uses dynamic network expansion to avoid catastrophic forgetting by using a new network for each task. For samples that cannot be saved due to privacy and other issues, some works propose to generate samples or features through a GAN [6]. CLDGR [16]
generates old samples by training a GAN. [17] uses a large amount of unlabeled data as negative samples to fully learn features and uses a GAN to generate features instead of samples. However, these methods require additionally training and storing a GAN. OCM [7] analyzes the incremental learning problem from the perspective of mutual information: it builds a loss that maximizes the mutual information between x and f(x) and between f(x) and y, and is mainly used in the online continual setting. XDER [1] analyzes the blind spot problem that occurs in DER [2] and DER++ [2] and implants part of the current logits into the corresponding logits of the memory. DyTox [5] proposes to use a transformer architecture in incremental learning, adding and learning task tokens for each task. L2P [19] uses a pre-trained model with prompts for incremental learning; the prompts are small learnable parameters maintained in a pool, and L2P learns to select prompts from the pool. However, L2P needs to use a pre-trained model.

3 Method

3.1 Problem Description

Incremental learning can be divided into task-incremental learning, domain-incremental learning, and class-incremental learning. Task- and class-incremental learning learn new classes in each task, and the samples of the new and old tasks do not overlap. Task-incremental learning knows the task id of each test sample, so the classifier of the corresponding task can be selected for classification; this setting is easier to solve and thus yields higher accuracy, but it is relatively rare in practical applications. The more common situation is that the task id cannot be obtained at test time, which is the more challenging class-incremental problem. For domain-incremental learning, the number of categories does not increase with the number of tasks; instead, the domain of the classes in the new task differs from that of the old tasks. In this paper, we mainly focus on class-incremental learning. Firstly, we define class-incremental learning. There are n learning tasks \{Task_i\}_{i=1}^{n}, and each task Task_i contains the corresponding data D_i = \{(x_i, y_{x_i})\}_{i=1}^{n_c}; the categories of different tasks do not overlap, i.e., y_{x_i} \cap y_{x_j} = \emptyset. After finishing the training of a task, we select some samples to be used for training in future tasks. We use a memory of fixed size to store the old-task samples, and the selected samples are updated during the learning process. Each training batch is drawn from D_i and the memory. The memory M stores a subset of samples from previous tasks. During training, the feature extractor is denoted as f_\theta and the classifier as O; the corresponding logits are denoted as logits = O(f_\theta(x)).

3.2 Baseline Method

We first describe the baseline method. When learning multiple tasks sequentially, the memory can only keep a small number of old-class samples, while the number of new-task samples is much larger, so we need to adjust the classifier. After each task finishes training, the model is frozen except for the classifier. We then build a balanced training dataset, in which each class has the same number of samples, to train the classifier. We train the classifier only with the cross-entropy loss function, as shown in Eq. (1), where t_1 is the temperature coefficient [24]. Our experiments show that fine-tuning the single classifier in this way gives higher accuracy than other methods that specifically deal with the classifier bias problem, such as the weight align method. As shown in Fig. 4, the baseline method has higher accuracy than WA. In this paper, we use the model obtained in this way as the baseline method.

L_{CE}(x_i, y_i) = CE\left(softmax\left(\frac{O(f_\theta(x_i))}{t_1}\right),\; y_i\right) \quad (1)
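A minimal PyTorch sketch of the temperature-scaled cross-entropy in Eq. (1) is shown below; the temperature value t1 = 2.0 and the toy tensors are placeholders, since the exact value is not reported here.

```python
import torch
import torch.nn.functional as F

def baseline_classifier_loss(logits, targets, t1=2.0):
    """Eq. (1): cross-entropy on temperature-scaled logits.
    t1 softens the softmax during balanced fine-tuning of the classifier;
    the value 2.0 is only an illustrative choice."""
    return F.cross_entropy(logits / t1, targets)

# toy check: 4 samples, 10 classes
logits = torch.randn(4, 10, requires_grad=True)
targets = torch.tensor([0, 3, 7, 9])
loss = baseline_classifier_loss(logits, targets)
loss.backward()
```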

3.3 Model Extension

We now analyze the baseline method proposed in Sect. 3.2. In the initial stage, the number of saved old-task samples in memory is relatively sufficient, which helps the model to distinguish between old and new classes. However, we find that in the process of learning multiple tasks the accuracy decreases quickly. As shown in Fig. 4, the accuracy drops significantly as new tasks are learned, because the information of the previous tasks stored in the model f_\theta is destroyed by the training of the new tasks, and we find it difficult to effectively preserve this information only by methods such as distillation. DualNet [13] builds two branches to learn tasks in parallel, used for fast and slow learning, respectively. We think that, as the number of tasks increases, the parameters of both branches change, causing catastrophic forgetting. Here we use a two-branch network to learn tasks serially, where each branch is responsible for learning the feature representation of certain tasks. Therefore, we propose to freeze the previous branch of the model, denoted f_{\theta_1}, and then create a new branch f_{\theta_2} to learn the feature representation of the following tasks. The model concatenates the two features obtained by the two branches as in Eq. (2), and the concatenated feature is sent to the classifier. The structure of the whole process is shown in Fig. 1.

feature = concatenate(f_{\theta_1}(x), f_{\theta_2}(x)) \quad (2)
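The dual-branch structure with a frozen first branch and the feature concatenation of Eq. (2) can be sketched in PyTorch as follows; the tiny MLP backbones and dimensions are stand-ins for the real feature extractors.

```python
import torch
import torch.nn as nn

class DualBranchNet(nn.Module):
    """Illustrative dual-branch model for Eq. (2); the backbones here are
    small placeholders for the actual feature extractors."""
    def __init__(self, backbone1, backbone2, feat_dim, num_classes):
        super().__init__()
        self.branch1 = backbone1              # f_theta1: learned on earlier tasks, then frozen
        self.branch2 = backbone2              # f_theta2: learns the remaining tasks
        for p in self.branch1.parameters():
            p.requires_grad = False           # freeze the old branch
        self.classifier = nn.Linear(2 * feat_dim, num_classes)

    def forward(self, x):
        with torch.no_grad():
            f1 = self.branch1(x)              # frozen old-task features
        f2 = self.branch2(x)
        feature = torch.cat([f1, f2], dim=1)  # Eq. (2)
        return self.classifier(feature)

# toy usage with small MLP backbones (placeholders for the real CNNs)
def make_backbone():
    return nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 64), nn.ReLU())

net = DualBranchNet(make_backbone(), make_backbone(), feat_dim=64, num_classes=100)
out = net(torch.randn(2, 3, 32, 32))          # logits over all seen classes
```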

3.4 Two Distillation Methods

In the process of learning new tasks, knowledge distillation is used to retain the information of the old classes. Distillation can be performed in two ways. The first method uses only the part of the logits corresponding to the tasks of the current branch for distillation, and only the corresponding feature is used to obtain those logits, as shown in Fig. 2. The second method uses the logits obtained from both branches to perform distillation, as shown in Fig. 3.


Fig. 1. Dual-branch network model for incremental learning. The model learns the two branches sequentially, and each branch is responsible for the feature representation learning of certain tasks. After learning the tasks corresponding to the first branch, the first branch is frozen and the new branch is used to learn new tasks; for a branch used to learn multiple tasks, distillation is used to retain the old-task knowledge.

When there is only one branch, after finishing the training of the corresponding task the branch is f_{\theta_1,t-1}, and the loss is calculated as shown in Eq. (3), where t is the index of the current task and N_c is the number of classes in each task.

L_1 = D_{KL}\left(O(f_{\theta_1,(t-1)}(x))_{[1,(t-1)*N_c]},\; O(f_{\theta_1,t}(x))_{[1,(t-1)*N_c]}\right) \quad (3)

After adding the second branch f_{\theta_2}, distillation can be performed in two ways: distillation on the entire logits slice logits_{[1,(t-1)*N_c]} obtained from the two branches, as shown in Eq. (4), where the function cat concatenates the features from the two branches; or distillation performed only on the old-class logits from the corresponding branch, as shown in Eq. (5).

L_2 = D_{KL}\left(O(cat(f_{\theta_1}(x), f_{\theta_2,(t-1)}(x)))_{[1,(t-1)*N_c]},\; O(cat(f_{\theta_1}(x), f_{\theta_2,t}(x)))_{[1,(t-1)*N_c]}\right) \quad (4)

L_3 = D_{KL}\left(O(f_{\theta_2,(t-1)}(x))_{[1,(t-1)*N_c]},\; O(f_{\theta_2,t}(x))_{[1,(t-1)*N_c]}\right) \quad (5)
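Both distillation variants reduce to a KL divergence computed on the slice of logits belonging to previously seen classes. The PyTorch sketch below illustrates this; applying a softmax over the selected slice, without an extra temperature, is an assumption in line with standard knowledge distillation, since the losses above are written directly on the classifier outputs.

```python
import torch
import torch.nn.functional as F

def old_class_kd(old_logits, new_logits, num_old):
    """KL divergence between the old-model snapshot (teacher) and the current
    model (student), restricted to the logits of previously seen classes
    (the [1, (t-1)*Nc] slice in Eqs. (3)-(5))."""
    p_old = F.softmax(old_logits[:, :num_old], dim=1)          # teacher distribution
    log_p_new = F.log_softmax(new_logits[:, :num_old], dim=1)  # student log-distribution
    # F.kl_div(input, target) = KL(target || input), i.e. KL(teacher || student)
    return F.kl_div(log_p_new, p_old, reduction='batchmean')

# Eq. (5): "local" distillation uses logits from the current branch only;
# Eq. (4): "global" distillation uses logits from the concatenated features;
# both are the same slice-and-KL operation applied to different logit tensors.
old_logits = torch.randn(8, 50)      # snapshot taken before learning task t
new_logits = torch.randn(8, 60)      # current model, 10 new classes added
loss = old_class_kd(old_logits, new_logits, num_old=50)
```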

3.5 Two-Stage Training

We use a two-stage training approach. In the first stage, the model is trained with the cross-entropy loss and the distillation loss. The features from the two branches are concatenated and sent to the classifier, and the model then uses the cross-entropy loss to train the classifier on the balanced dataset. The distillation loss uses the methods proposed in Sect. 3.4. The total loss is shown in Eq. (6), where L_{dist} denotes the chosen distillation loss, i.e., L_2 in Eq. (4) or L_3 in Eq. (5).

L_{total} = L_{CE} + L_{dist} \quad (6)

Fig. 2. Distillation within the current branch only: when the model is extended to two branches, each branch is responsible for some tasks, and only the logits of the classes corresponding to the current branch are used for distillation.

Fig. 3. Global distillation: when the model is extended with a new branch, all the logits obtained from the concatenation of the features of the two branches are used for distillation.

After the first stage, we obtain the two branches. In the second stage, in order to solve the classifier bias problem, we freeze the two branches and only train the classifier. Specifically, we use the loss in Eq. (1) to retrain the classifier on the balanced dataset.

3.6 Sample Update Policy

The sample update strategy uses the herding algorithm from iCaRL [14]. The model calculates the mean feature of each class to obtain a prototype and keeps in memory M the samples whose features are closest to the prototype. When M is updated, the samples in M that are farthest from the prototype are discarded. Since there are two branches, the prototype computation and sample selection for each task use the branch responsible for that task: when there is only one branch, the features computed by f_{\theta_1} are used; after the new branch is added, the features of each new-task sample are computed only with the new branch f_{\theta_2}. The herding algorithm is then used to update the memory M.
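A NumPy sketch of the per-class exemplar selection described above is given below. It implements the closest-to-prototype rule stated in the text, which is a simplified form of iCaRL's full herding (the latter greedily matches the running mean of the selected set).

```python
import numpy as np

def select_exemplars(features, m):
    """Pick m exemplars for one class: the samples whose features are closest
    to the class prototype (mean feature), as described in the text."""
    prototype = features.mean(axis=0)                     # class prototype
    dist = np.linalg.norm(features - prototype, axis=1)   # distance of each sample to the prototype
    return np.argsort(dist)[:m]                           # indices of the m closest samples

# toy usage: 500 samples with 64-dim features from the branch responsible for this task
feats = np.random.randn(500, 64)
keep = select_exemplars(feats, m=20)
```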

4 Experiments

The baseline method is trained on the ImageNet-100 and CIFAR100 datasets, under the setting where the first task has 10 classes and 10 classes are added for each new task, for 10 steps in total. CIFAR100 consists of 50,000 training images from 100 classes; each class has 500 images for training and 100 images for testing, and the size of each image is 32 × 32. ImageNet-100 is a subset of ImageNet, a large-scale dataset with 1,000 classes; it chooses 100 classes from ImageNet for training and testing.


4.1 Baseline Result

When the baseline method is trained on the ImageNet-100 dataset, the results show that it is about 1.4 percentage points higher in average accuracy than the weight align method, as shown in Table 1. When the baseline method is trained on CIFAR100, the results show that it is about 1.2 percentage points higher in average accuracy than the weight align method, as shown in Table 2.

4.2 Results on ImageNet-100

As Table 1 shows, DualBranch uses two branches on ImageNet-100, and the accuracy on the last task is 63.5%, which is about 7 percentage points higher than the baseline method,

Fig. 4. Test accuracy of the baseline method on the CIFAR100 dataset; the orange line is the weight align method, and the green line is the baseline method. (Color figure online)

Fig. 5. Test accuracy of the baseline method on the ImageNet-100 dataset; the orange line is the weight align method, and the green line is the baseline method. (Color figure online)

Table 1. Accuracy (%) of different methods on ImageNet-100. The first task has 10 classes, and each of the remaining 9 tasks adds 10 classes per task.

Task id | WA    | Baseline | DualBranch
Task 5  | 68.84 | 70.72    | 73.28
Task 6  | 65.40 | 67.80    | 70.83
Task 7  | 62.54 | 64.37    | 69.6
Task 8  | 59.33 | 61.52    | 69.0
Task 9  | 56.36 | 58.29    | 65.02
Task 10 | 52.64 | 55.78    | 63.5
Avg     | 69.90 | 71.4     | 74.83

Table 2. Accuracy (%) of the baseline method on CIFAR100. The first task has 10 classes, and each of the remaining 9 tasks adds 10 classes per task.

Task id | WeightAlign | Baseline
Task 5  | 70.06       | 71.32
Task 6  | 66.70       | 67.93
Task 7  | 64.19       | 66.43
Task 8  | 61.14       | 63.79
Task 9  | 58.41       | 60.70
Task 10 | 54.97       | 57.10
Avg     | 70.35       | 71.53

and the average accuracy is about 3.4 percentage points higher than the baseline method. As can be seen from Table 1, with the help of the second branch, a significant advantage is achieved as the number of tasks increases (Fig. 5).

5 Conclusion

In this paper, we first construct a baseline method that can more effectively solve the problem of classifier bias. We then build a model with two branches: after the second branch is created, the first branch is frozen, thus preserving the class information of the previous tasks, and the new tasks are learned on the new branch. Unlike previous models that use two branches to learn tasks in parallel, this serial learning scheme can better preserve the information of old tasks. Although the accuracy is improved by adding the new branch, the new branch requires extra space, so in the future we will consider reducing the model's parameters while maintaining the accuracy.

References 1. Boschini, M., Bonicelli, L., Buzzega, P., Porrello, A., Calderara, S.: Classincremental continual learning into the extended der-verse. IEEE Trans. Pattern Anal. Mach. Intell. 45, 5497–5512 (2022) 2. Buzzega, P., Boschini, M., Porrello, A., Abati, D., Calderara, S.: Dark experience for general continual learning: a strong, simple baseline. In: Advances in Neural Information Processing Systems, vol. 33, pp. 15920–15930 (2020) 3. Castro, F.M., Mar´ın-Jim´enez, M.J., Guil, N., Schmid, C., Alahari, K.: End-to-end incremental learning. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11216, pp. 241–257. Springer, Cham (2018). https://doi. org/10.1007/978-3-030-01258-8 15 4. Douillard, A., Cord, M., Ollion, C., Robert, T., Valle, E.: PODNet: pooled outputs distillation for small-tasks incremental learning. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12365, pp. 86–102. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58565-5 6 5. Douillard, A., Ram´e, A., Couairon, G., Cord, M.: DyTox: transformers for continual learning with dynamic token expansion. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 9285–9295 (2022) 6. Goodfellow, I., et al.: Generative adversarial networks. Commun. ACM 63(11), 139–144 (2020) 7. Guo, Y., Liu, B., Zhao, D.: Online continual learning through mutual information maximization. In: International Conference on Machine Learning, pp. 8109–8126. PMLR (2022) 8. Hinton, G., Vinyals, O., Dean, J.: Distilling the knowledge in a neural network. arXiv preprint: arXiv:1503.02531 (2015) 9. Hou, S., Pan, X., Loy, C.C., Wang, Z., Lin, D.: Learning a unified classifier incrementally via rebalancing. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 831–839 (2019) 10. Kirkpatrick, J., et al.: Overcoming catastrophic forgetting in neural networks. Proc. Natl. Acad. Sci. 114(13), 3521–3526 (2017)


11. Li, Z., Hoiem, D.: Learning without forgetting. IEEE Trans. Pattern Anal. Mach. Intell. 40(12), 2935–2947 (2017) 12. Liu, Y., Schiele, B., Sun, Q.: Adaptive aggregation networks for class-incremental learning. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 2544–2553 (2021) 13. Pham, Q., Liu, C., Hoi, S.: DualNet: continual learning, fast and slow. In: Advances in Neural Information Processing Systems, vol. 34, 16131–16144 (2021) 14. Rebuffi, S.A., Kolesnikov, A., Sperl, G., Lampert, C.H.: iCaRl: incremental classifier and representation learning. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2001–2010 (2017) 15. Shi, Y., et al.: Mimicking the oracle: an initial phase decorrelation approach for class incremental learning. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 16722–16731 (2022) 16. Shin, H., Lee, J.K., Kim, J., Kim, J.: Continual learning with deep generative replay. In: Advances in Neural Information Processing Systems, vol. 30 (2017) 17. Tang, Y.M., Peng, Y.X., Zheng, W.S.: Learning to imagine: diversify memory for incremental learning using unlabeled data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 9549–9558 (2022) 18. Wang, S., Li, X., Sun, J., Xu, Z.: Training networks in null space of feature covariance for continual learning. In: Proceedings of the IEEE/CVF Conference on Computer Vision And Pattern Recognition, pp. 184–193 (2021) 19. Wang, Z., et al.: Learning to prompt for continual learning. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 139–149 (2022) 20. Wu, Y., et al.: Large scale incremental learning. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 374–382 (2019) 21. Yan, S., Xie, J., He, X.: DER: dynamically expandable representation for class incremental learning. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3014–3023 (2021) 22. Yu, L., et al.: Semantic drift compensation for class-incremental learning. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 6982–6991 (2020) 23. Zenke, F., Poole, B., Ganguli, S.: Continual learning through synaptic intelligence. In: International Conference on Machine Learning, pp. 3987–3995. PMLR (2017) 24. Zhang, X., Yu, F.X., Karaman, S., Zhang, W., Chang, S.F.: Heated-up softmax embedding. arXiv preprint: arXiv:1809.04157 (2018) 25. Zhao, B., Xiao, X., Gan, G., Zhang, B., Xia, S.T.: Maintaining discrimination and fairness in class incremental learning. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13208–13217 (2020)

Inter-image Discrepancy Knowledge Distillation for Semantic Segmentation

Kaijie Chen1, Jianping Gou2(B), and Lin Li1

1 School of Computer Science and Communication Engineering, Jiangsu University, Zhenjiang, China
2 College of Computer and Information Science, College of Software, Southwest University, Chongqing 400715, China
[email protected]

Abstract. As a typical dense prediction task, semantic segmentation remains challenging in industrial automation, since it is non-trivial to achieve a good tradeoff between the performance and the efficiency. Meanwhile, knowledge distillation has been applied to reduce the computational cost in semantic segmentation task. However, existing knowledge distillation methods for semantic segmentation mainly mimic the teachers’ behaviour using the well-designed knowledge variants from a single image, failing to explore discrepancy knowledge between different images. Considering that the large pre-trained teacher network usually tends to form a more robust discrepancy space than the small student, we propose a new inter-image discrepancy knowledge distillation method (IIDKD) for semantic segmentation. Extensive experiments are conducted on two popular semantic segmentation datasets, where the experimental results show the efficiency and effectiveness of distilling inter-image discrepancy knowledge.

Keywords: Knowledge Distillation · Semantic Segmentation · Discrepancy Distillation

1 Introduction

As a fundamental task in computer vision, semantic segmentation aims to assign each pixel of an input scene image a category label, and it has been widely adopted in different tasks such as autonomous driving [13], robot navigation, and anomaly detection [3]. With the great success of deep learning, fully convolutional networks have far surpassed traditional methods in terms of accuracy for semantic segmentation. However, these semantic segmentation algorithms also rely on complicated deep models with enormous parameters and huge computations, making them very difficult to deploy in real-time scenarios or on resource-limited embedded devices. Therefore, it is always of great importance to achieve a good balance between computational cost and accuracy in real-world applications. Recently, a lot of efforts have been made in this area.

For example, several lightweight deep neural network architectures have been devised to achieve faster inference, such as ENet [14] and ESPNet [11]. However, there is still a clear performance gap between the above-mentioned compact networks and those cumbersome deep models [7]. Besides redesigning new deep neural network architectures, another promising way is to utilize general model compression strategies, which can be roughly divided into three categories: quantization, pruning, and knowledge distillation [6].

Fig. 1. An illustration of the idea to guide the small student network through mimicking the inter-image discrepancy knowledge from the large teacher network.

Without requiring modifications to the network architecture, knowledge distillation can indirectly simplify complex models by transferring the knowledge from a large pre-trained teacher network to a compact student network. Recently, conventional knowledge distillation methods [1,5,12,15,17,20] have achieved impressive performance in image classification. Compared with image classification, semantic segmentation involves dense prediction of a structured output, where each pixel in an image is required to be assigned a corresponding label. Previous studies have shown that directly applying KD methods designed for image classification to semantic segmentation is sub-optimal [9], possibly because strictly aligning the logits map or feature maps between the teacher and student on a per-pixel basis may ignore the structured dependencies between pixels, leading to a degradation of performance. In the past few years, several knowledge distillation approaches have been introduced for semantic segmentation. For instance, SKD [10] suggests transferring the pairwise similarity between spatial locations, while IFVD [19] requires the student network to mimic the set of similarities between the feature of each pixel and its corresponding class-wise prototype. CWD [16] aims to distill the significant activation regions in each channel. However, the above-mentioned methods strive to extract various sophisticated knowledge from an individual image but ignore the rich cross-image information between different images. To explore cross-image similarities, CIRKD [21] builds pixel dependencies across global images, but ignores the discrepancy knowledge between diverse images. Given multiple input images, there is valuable discrepancy knowledge in terms of feature or logits maps, which has been ignored by previous works.

To address the above-mentioned issues, we propose a novel Inter-Image Discrepancy Knowledge Distillation (IIDKD) method for semantic segmentation (Fig. 1) to transfer rich discrepancy knowledge from the teacher network to the student network. Different from relational knowledge distillation, which establishes structured relations between pixels and regions across samples, the proposed IIDKD mainly focuses on transferring pixel-wise and category-wise discrepancy knowledge between diverse images. Specifically, we first introduce an Attention Discrepancy Module (ADM) that explores the spatial attention discrepancy maps between any two images from the same mini-batch to identify the pixel-wise discrepancy knowledge. We then propose a Soft Probability Discrepancy Module (SDM) that focuses on the most divergent regions of each channel between diverse samples, which pays more attention to the category-wise discrepancy information. Therefore, in the distillation process, the teacher can transfer more noteworthy and discriminative knowledge derived from the critical regions that represent the discrepancies between images to the student. To the best of our knowledge, this is the first work to propose inter-image discrepancy knowledge distillation for semantic segmentation that explores the difference information among image samples. The main contributions of this paper can be summarized as follows:

– We propose a novel inter-image discrepancy knowledge distillation for semantic segmentation to enable the student model to acquire more noteworthy and discriminative inter-image discrepancy knowledge during the distillation process.
– We devise the attention discrepancy module (ADM) and the soft probability discrepancy module (SDM) to fully extract pixel-wise and category-wise discrepancy knowledge from different layers.
– We conduct extensive experiments on two popular semantic segmentation datasets, where the experimental results demonstrate the effectiveness of distilling inter-image discrepancy knowledge.

2 Method

To deliver effective knowledge distillation by capturing the inter-image discrepancy knowledge, we devise two new modules, the attention discrepancy module and the soft probability discrepancy module, to capture robust and informative discrepancy information between different samples. A combination of the two proposed modules can acquire complementary discrepancy knowledge in terms of both the pixel and the category. We introduce the details of the two discrepancy-capturing modules and the overall training procedure in the following sections.

2.1 Notations

Semantic segmentation usually adopts an FCN-like architecture, including a backbone network and a head classifier. Given a mini-batch of n training samples X = {x_1, x_2, ..., x_n}, the ground-truth labels can be denoted as Y = {y_1, y_2, ..., y_n}. For an input image X, the backbone network first extracts its features and aggregates high-level information to generate a dense feature map F ∈ R^{H×W×C}. The head classifier then decodes the feature map into the categorical logits map Z ∈ R^{H×W×c}, where H, W, C, and c denote the height, width, number of channels, and number of categories, respectively. Given a teacher network T and a student network S, let Z^t and Z^s denote the logits maps of the teacher model and the student model, respectively. Let F^t denote the feature map from the backbone network of the teacher, and F^s denote the corresponding feature map from the student network.

2.2 Overview

Semantic segmentation is a position-related task, where spatial dependencies play a crucial role. Previous KD works for semantic segmentation strive to design various knowledge variants or to explore cross-image relations, but they fail to model the rich discrepancy knowledge hidden between different images and to address the most informative regions. Inspired by the attention mechanism [18] and cross-image contrast [21], we devise a pixel-wise attention discrepancy module to distinguish the important regions of inter-image discrepancy for knowledge distillation.

Fig. 2. The overall framework of the proposed inter-image discrepancy knowledge distillation.

Vanilla knowledge distillation aims to match the logits map pixel-by-pixel between the teacher and the student. The limitation of this approach is that not all pixels in the logits map contribute equally to knowledge distillation. Strictly aligning each pixel in the probability map may lead to sub-optimal performance due to redundant noise. Motivated by channel knowledge [16], soft probability discrepancy distillation is proposed to capture category-wise difference knowledge as a complement to pixel-wise attention discrepancy distillation. The soft probability discrepancy module can emphasize the most different regions between diverse images in the category dimension through channel regularization and distribution subtraction. The main IIDKD framework for semantic segmentation is shown in Fig. 2. In our method, the feature maps generated from the backbone and the logits maps generated from the classifier are fed into the attention discrepancy distillation module (ADM) and the soft probability discrepancy distillation module (SDM), respectively. These two modules calculate the attention discrepancy maps and the soft probability discrepancy maps and then transfer these two types of discrepancy knowledge from the teacher to the student via the designed loss functions. This process is indicated in the diagram by a red dashed line. Besides, we also add the conventional KD loss and the adversarial distillation loss for a more robust optimization.

2.3 Attention Discrepancy Distillation

Given a mini-batch of different images, there exists rich discrepancy information in the focus of their feature representations. Considering that the activation value of each pixel in an attention map indicates the significance of its corresponding spatial location, we select the activation-based attention mechanism to construct inter-image attention discrepancies in this paper. Let F = {F_1, F_2, ..., F_n} represent the n dense feature maps of the n training images from one mini-batch. First, for a particular training image x_i, the activation-based attention map AM_i can be formulated as follows:

AM_i = vec(Σ_C |F_i|^2) / ‖vec(Σ_C |F_i|^2)‖_2,   (1)

where C indicates the number of channels and vec(·) denotes the vectorization operation. For a specific training image pair (x_i, x_j), the inter-image attention discrepancy map can be denoted as:

ADM_(i,j) = |AM_i − AM_j|,   (2)

where ADM_(i,j) denotes the inter-image attention discrepancy map between the i-th and j-th training images. Based on the above equations, we can calculate the inter-image attention discrepancy maps from the teacher and the student, denoted as ADM^t_(i,j) and ADM^s_(i,j). The attention discrepancy distillation loss for transferring the pixel-wise inter-image attention discrepancy knowledge between any two samples (x_i, x_j) ∈ X from the teacher model to the student model can be formulated as:

L_add = Σ_{i=1}^{n} Σ_{j=1}^{n} MSE(ADM^t_(i,j), ADM^s_(i,j)),   (3)

where MSE(·) indicates the mean squared error loss and i ≠ j.
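To make the computation concrete, the following is a minimal PyTorch sketch of Eqs. (1)–(3); it is not the authors' implementation, and the function names, the batch-pair masking, and the mean reduction of the MSE term are illustrative assumptions (it also assumes the teacher and student feature maps share the same spatial size).

```python
import torch
import torch.nn.functional as F

def attention_map(feat: torch.Tensor) -> torch.Tensor:
    """Eq. (1): sum of squared activations over channels, vectorized and L2-normalized.
    feat: (B, C, H, W) dense feature map from the backbone."""
    am = feat.pow(2).sum(dim=1).flatten(1)                  # vec(sum_C |F|^2) -> (B, H*W)
    return am / (am.norm(p=2, dim=1, keepdim=True) + 1e-12)

def attention_discrepancy_loss(feat_t: torch.Tensor, feat_s: torch.Tensor) -> torch.Tensor:
    """Eqs. (2)-(3): pairwise attention discrepancy maps of teacher and student,
    aligned with an MSE loss over all image pairs (i != j) in the mini-batch."""
    am_t, am_s = attention_map(feat_t), attention_map(feat_s)
    adm_t = (am_t.unsqueeze(1) - am_t.unsqueeze(0)).abs()   # (B, B, H*W), |AM_i - AM_j|
    adm_s = (am_s.unsqueeze(1) - am_s.unsqueeze(0)).abs()
    b = am_t.size(0)
    off_diag = ~torch.eye(b, dtype=torch.bool, device=feat_t.device)  # exclude i == j
    return F.mse_loss(adm_s[off_diag], adm_t[off_diag])
```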

2.4 Soft Probability Distillation

As the activation values of each channel tend to encode the saliency of specific categories, we propose the category-wise soft probability discrepancy distillation to highlight the regions with the most significant differences between images in terms of category. Similarly, let Z = {Z_1, Z_2, ..., Z_n} represent the n logits maps of the n training images from a mini-batch. First, given a specific image pair (x_i, x_j), we compute the inter-image logits discrepancy map LDM_(i,j) by a distribution subtraction operation, which can be denoted as:

LDM_(i,j) = |Z_i − Z_j|.   (4)

Then, we apply the softmax operation with a high temperature to each channel of the logits discrepancy maps, which converts the activation values into a soft probability distribution and removes the effect of the magnitude scale. The soft probability discrepancy maps SDM_(i,j) tend to encode the saliency of inter-image semantic discrepancies in the category dimension, and can be computed as:

SDM_(i,j) = exp(LDM_(i,j) / T) / Σ_{H×W} exp(LDM_(i,j) / T),   (5)

where T denotes the temperature hyper-parameter. Through the above operations, the inter-image soft probability discrepancy maps of the teacher and the student can be acquired, denoted as SDM^t_(i,j) and SDM^s_(i,j). We devise the soft probability discrepancy distillation loss to softly align the category-wise discrepancy knowledge between the teacher and student models, which is expressed as:

L_sdd = (T^2 / c) Σ_{i=1}^{n} Σ_{j=1}^{n} Σ_{k=1}^{c} KL(SDM^t_(i,j),k ‖ SDM^s_(i,j),k),   (6)

where KL indicates the Kullback-Leibler divergence.
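As a hedged illustration of Eqs. (4)–(6), a short PyTorch sketch is given below; the names and the batch-mean reduction are assumptions rather than the authors' code.

```python
import torch
import torch.nn.functional as F

def soft_probability_discrepancy(logits: torch.Tensor, T: float = 4.0) -> torch.Tensor:
    """logits: (B, c, H, W) categorical logits maps of a mini-batch.
    Eq. (4) forms |Z_i - Z_j|; Eq. (5) applies a temperature-T softmax over the
    spatial positions of each channel. Returns (B, B, c, H*W) pairwise maps."""
    ldm = (logits.unsqueeze(1) - logits.unsqueeze(0)).abs()   # (B, B, c, H, W)
    return F.softmax(ldm.flatten(3) / T, dim=-1)              # softmax over H*W

def soft_probability_discrepancy_loss(logits_t, logits_s, T: float = 4.0) -> torch.Tensor:
    """Eq. (6): temperature-scaled KL divergence between teacher and student
    discrepancy maps, averaged over channels and image pairs with i != j."""
    sdm_t = soft_probability_discrepancy(logits_t, T)
    sdm_s = soft_probability_discrepancy(logits_s, T)
    b = logits_t.size(0)
    off_diag = ~torch.eye(b, dtype=torch.bool, device=logits_t.device)
    kl = F.kl_div(sdm_s[off_diag].log(), sdm_t[off_diag], reduction='batchmean')
    return (T ** 2) * kl
```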

2.5 Optimization

Besides the proposed discrepancy distillation loss terms, we also align each pixel of the logits maps between the teacher and the student using the conventional knowledge distillation loss for semantic segmentation:

L_kd = (1 / (H×W)) Σ_{n=1}^{H·W} ϕ(Z^t_n) · log( ϕ(Z^t_n) / ϕ(Z^s_n) ),   (7)

where n indicates the n-th pixel in the logits maps and ϕ(·) is the softmax function. The adversarial distillation loss is also applied for a more stable optimization:

L_adv = E_{S∼p(S)}[D(S | I)],   (8)

where E(·) is the expectation operator, D(·) is the discriminator, and I and S indicate the input image and its corresponding segmentation map, respectively. The overall loss of the proposed IIDKD is then formulated as follows:

L_total = L_kd + αL_add + βL_sdd − γL_adv,   (9)

where α, β, and γ are the hyper-parameters.
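For clarity, a sketch of how the terms of Eq. (9) might be combined in a training step is shown below; the per-pixel KD helper, the placeholder adversarial loss l_adv, and the default weights (taken from the settings reported in Sect. 3.1) are assumptions, and the two helper functions refer to the sketches above.

```python
import torch.nn.functional as F

def pixelwise_kd_loss(logits_t, logits_s):
    """Conventional per-pixel KD of Eq. (7): KL between teacher and student class probabilities."""
    return F.kl_div(F.log_softmax(logits_s, dim=1),
                    F.softmax(logits_t, dim=1), reduction='batchmean')

def iidkd_total_loss(feat_t, feat_s, logits_t, logits_s, l_adv,
                     alpha=1.0, beta=1000.0, gamma=0.01, T=4.0):
    """Eq. (9): weighted combination of the four loss terms."""
    l_kd = pixelwise_kd_loss(logits_t, logits_s)
    l_add = attention_discrepancy_loss(feat_t, feat_s)                 # sketch above
    l_sdd = soft_probability_discrepancy_loss(logits_t, logits_s, T)   # sketch above
    return l_kd + alpha * l_add + beta * l_sdd - gamma * l_adv
```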

3 Experiments

3.1 Datasets and Setups

Cityscapes [2] consists of 5000 finely annotated images of urban street scenes, with 2975 training images, 500 validation images, and 1525 testing images. Pascal VOC 2012 [4] contains 20 foreground object categories and one background category. The dataset is split into three subsets: 10582 images for training, 1449 images for validation and 1456 images for testing.

Experimental Details. During the training process, we crop the input images to 512×512 and train the student model for 80000 iterations with a batch size of 8. We set α, β, and γ to 1, 1000, and 0.01, respectively. The temperature hyper-parameter T in soft probability discrepancy distillation is set to 4 to achieve the best segmentation performance. We adopt the polynomial learning rate strategy and set the base learning rate to 0.01; the learning rate is calculated as base_lr · (1 − iter/total_iter)^0.9. For data augmentation, we apply random scaling in the range of [0.5, 2] and random horizontal flipping. For optimization, we use Stochastic Gradient Descent (SGD), where the momentum and the weight decay are set to 0.9 and 5e-4, respectively.
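As an aside, the polynomial schedule above can be written, for instance, with PyTorch's LambdaLR; the snippet below is only a sketch under the stated base learning rate and iteration count, with placeholder parameters.

```python
import torch

base_lr, total_iter = 0.01, 80000
params = [torch.nn.Parameter(torch.zeros(1))]            # placeholder model parameters
optimizer = torch.optim.SGD(params, lr=base_lr, momentum=0.9, weight_decay=5e-4)
# lr(iter) = base_lr * (1 - iter / total_iter) ** 0.9
scheduler = torch.optim.lr_scheduler.LambdaLR(
    optimizer, lr_lambda=lambda it: (1.0 - it / total_iter) ** 0.9)

# training loop sketch:
# for it in range(total_iter):
#     ...forward/backward...
#     optimizer.step(); scheduler.step()
```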

3.2 Comparisons with Recent Methods

We compare our proposed IIDKD method with recent state-of-the-art methods, including SKD [10], IFVD [19], CWD [16], and CIRKD [21]. All teacher-student network pairs are built on the same networks with identical parameters.

Results on Cityscapes. In the experiments, we use PSPNet with ResNet-101 as the teacher network. PSPNet and DeepLabV3 with different backbones are used to construct four teacher-student architectures for the comparative experiments. Table 1 shows the performance of our IIDKD and the four compared methods on the Cityscapes dataset, where the proposed IIDKD achieves the best segmentation performance across various student networks with similar or different architectures: 1) our IIDKD improves PSPNet built on ResNet18 and MobileNetV2 by 5.56% and 4.70% on the validation set, respectively; 2) cross-image distillation methods such as CIRKD perform better than most single-image distillation methods such as SKD and IFVD. Additionally, we can see that the proposed IIDKD outperforms both single-image KD methods and other cross-image KD methods. When replacing the student network with DeepLabV3, the performance gains on ResNet18 and MobileNetV2 reach 2.57% and 4.63%, respectively, demonstrating the effectiveness and robustness of IIDKD for different network architectures.

Table 1. Comparison of mIoU (%) of different KD methods on the Cityscapes dataset.

| Method                   | baseline | SKD [10] | IFVD [19] | CWD [16] | CIRKD [21] | IIDKD (ours) |
| T: PSPNet-ResNet101      | 78.50    | -        | -         | -        | -          | -            |
| S: PSPNet-ResNet18       | 70.48    | 72.45    | 73.80     | 75.23    | 74.83      | 76.04        |
| S: DeepLabV3-ResNet18    | 73.62    | 74.76    | 74.19     | 75.43    | 75.2       | 76.37        |
| S: PSPNet-MobileNetV2    | 65.50    | 67.36    | 67.65     | 69.40    | 68.54      | 70.20        |
| S: DeepLabV3-MobileNetV2 | 65.80    | 68.13    | 68.66     | 69.72    | 69.28      | 70.43        |

Fig. 3. The individual class IoU scores of our proposed IIDKD and the state-of-the-art CWD and CIRKD on the Cityscapes validation dataset.

In Fig. 3, we also show the per-category IoU scores of the student network compared with the state-of-the-art methods. We can observe that our IIDKD achieves better class IoU scores than the most competitive methods, such as CIRKD and CWD. Besides, we further visualize the segmentation results in Fig. 4. It can be seen that our IIDKD generates semantic labels that are more consistent with the ground truth. The possible reason is that our IIDKD can distill comprehensive inter-image discrepancy knowledge in both the pixel and category dimensions, producing more accurate segmentation details and facilitating visual recognition.

Results on Pascal VOC. We use PSPNet with a ResNet101 backbone as the teacher, and PSPNet and DeepLabV3 with ResNet18 as the students.


Table 2. Comparison of mIoU (%) of different KD methods on the VOC 2012 dataset.

| Method                | baseline | SKD [10] | IFVD [19] | CWD [16] | CIRKD [21] | IIDKD (ours) |
| T: PSPNet-ResNet101   | 76.60    | -        | -         | -        | -          | -            |
| S: PSPNet-ResNet18    | 71.30    | 72.51    | 72.68     | 73.90    | 74.02      | 74.65        |
| S: DeepLabV3-ResNet18 | 71.70    | 72.82    | 72.54     | 74.07    | 74.16      | 74.90        |

Table 2 shows the comparison with state-of-the-art methods on the Pascal VOC 2012 validation dataset. It can be clearly seen that KD methods transferring cross-image knowledge, such as CIRKD, perform better than KD methods that only transfer intra-image pixel affinity, such as SKD, IFVD, and CWD. Besides, our IIDKD, which considers inter-image discrepancy knowledge, achieves the best results among all compared methods. For example, when the student is DeepLabV3 built on ResNet18, the improvement of our IIDKD over CIRKD (the second most accurate method) and over the student baseline is 0.74% and 3.20%, respectively. Compared with cross-image relational knowledge distillation, the better performance of our IIDKD implies that the devised attention discrepancy maps and soft probability discrepancy maps, used as the inter-image discrepancy knowledge, are very favorable for improving the performance of knowledge distillation.

Fig. 4. Visualization results on the Cityscapes dataset. Our method generates more accurate and detailed segmentation masks, which are circled by red lines.

3.3 Ablation Studies

To further verify the contribution of each distillation loss term in the overall loss function (i.e., Eq. (9)), we conduct detailed ablation experiments on the Cityscapes validation dataset. We choose PSPNet with ResNet101 as the teacher network and PSPNet with ResNet18 as the student network. For ease of understanding, we organize the experimental results into six cases according to the addition of the distillation losses:

– Case 1: The student network without any knowledge distillation, which only learns from the ground-truth labels.
– Case 2: L_total = L_kd, i.e., the Kullback-Leibler divergence of the class probability maps between the teacher and the student. The student network learns the response-based logits from the teacher network to update its parameters, but ignores more holistic knowledge.
– Case 3: L_total = L_kd − γL_adv, i.e., adding the adversarial distillation loss. In Case 3, the teacher network transfers high-order relations, while the rich discrepancy knowledge between diverse images is still ignored.
– Case 4: L_total = L_kd + αL_add − γL_adv, where L_add is the inter-image attention discrepancy distillation loss, aiming to align the pixel-wise differences between different images. In Case 4, the teacher network further transfers the inter-image discrepancy knowledge from the feature layer.
– Case 5: L_total = L_kd + βL_sdd − γL_adv, where L_sdd is the inter-image soft probability discrepancy distillation loss. In Case 5, the student network learns the category-wise inter-image discrepancy knowledge of the logits layer from the teacher.
– Case 6: L_total = L_kd + αL_add + βL_sdd − γL_adv. In Case 6, we combine the holistic adversarial distillation loss and the two types of inter-image discrepancy distillation loss to transfer comprehensive and complementary knowledge, which helps the student fully learn from the teacher in multiple dimensions.

In Table 3, compared with the raw student network without any distillation (mIoU of 70.48% in Case 1), the conventional response-based knowledge distillation loss L_kd in Case 2 brings a 1.64% improvement. After applying the adversarial distillation loss L_adv in Case 3, the performance of the student model improves by a further 1.14% over the model using only L_kd. We then add the two kinds of discrepancy distillation loss successively to verify their effects. Concretely, on the basis of the student network with L_kd and L_adv, the pixel-wise attention discrepancy distillation loss L_add in Case 4 boosts the mIoU to 74.57%, and the category-wise soft probability distillation loss L_sdd in Case 5 brings it to 75.40%. Finally, after combining all the aforementioned loss terms, the mIoU in Case 6 reaches 76.04%, a gain of 5.56%, reducing the gap between the student and the teacher from 9.46% to 2.49%.


Table 3. Ablation results of distillation losses on Cityscapes.

| Method               | Lkd | Ladv | Ladd | Lsdd | mIoU (%) |
| ResNet101 (T)        |     |      |      |      | 78.56    |
| Case 1: ResNet18 (S) | ✘   | ✘    | ✘    | ✘    | 70.48    |
| Case 2: ResNet18 (S) | ✔   | ✘    | ✘    | ✘    | 72.12    |
| Case 3: ResNet18 (S) | ✔   | ✔    | ✘    | ✘    | 73.26    |
| Case 4: ResNet18 (S) | ✔   | ✔    | ✔    | ✘    | 74.47    |
| Case 5: ResNet18 (S) | ✔   | ✔    | ✘    | ✔    | 75.40    |
| Case 6: ResNet18 (S) | ✔   | ✔    | ✔    | ✔    | 76.04    |


Through the above ablation results, we can conclude that: 1) our proposed IIDKD framework can distill comprehensive and complementary inter-image discrepancy knowledge, which successfully helps the student network attain a significant improvement; 2) both pixel-wise attention discrepancy distillation and category-wise soft probability distillation are effective for improving the performance of the student model; and 3) soft probability distillation is more informative than attention discrepancy distillation, capturing more noteworthy category-wise knowledge.

4 Conclusion

In this paper, we propose a novel knowledge distillation framework for semantic segmentation, namely inter-image discrepancy knowledge distillation (IIDKD). Specifically, we devise an attention discrepancy distillation module and a soft probability discrepancy distillation module, which aim to capture the most important discrepancy regions between diverse images in both the pixel and category dimensions. We conduct extensive experiments on popular segmentation datasets and demonstrate the generality and effectiveness of the proposed method. In the future, as an effective model compression approach with informative inter-image discrepancy knowledge transfer, we will consider applying the proposed method to real-time industrial automation scenarios, such as autonomous driving and robot navigation.

Acknowledgement. This work was supported in part by the National Natural Science Foundation of China (Grant Nos. 61976107 and 61502208).

References
1. Beyer, L., Zhai, X., Royer, A., Markeeva, L., Anil, R., Kolesnikov, A.: Knowledge distillation: a good teacher is patient and consistent. In: CVPR, pp. 10925–10934 (2022)
2. Cordts, M., et al.: The cityscapes dataset for semantic urban scene understanding. In: CVPR, pp. 3213–3223 (2016)
3. Deng, H., Li, X.: Anomaly detection via reverse distillation from one-class embedding. In: CVPR, pp. 9737–9746 (2022)
4. Everingham, M., Van Gool, L., Williams, C.K., Winn, J., Zisserman, A.: The pascal visual object classes (VOC) challenge. Int. J. Comput. Vision 88(2), 303–338 (2010)
5. He, R., Sun, S., Yang, J., Bai, S., Qi, X.: Knowledge distillation as efficient pre-training: faster convergence, higher data-efficiency, and better transferability. In: CVPR, pp. 9161–9171 (2022)
6. Hinton, G., Vinyals, O., Dean, J., et al.: Distilling the knowledge in a neural network. arXiv preprint arXiv:1503.02531, vol. 2, no. 7 (2015)
7. Kirillov, A., et al.: Segment anything (2023)
8. Li, M., et al.: Cross-domain and cross-modal knowledge distillation in domain adaptation for 3D semantic segmentation. In: Proceedings of the 30th ACM International Conference on Multimedia, pp. 3829–3837 (2022)
9. Li, Q., Jin, S., Yan, J.: Mimicking very efficient network for object detection. In: CVPR, pp. 6356–6364 (2017)
10. Liu, Y., Shu, C., Wang, J., Shen, C.: Structured knowledge distillation for dense prediction. TPAMI (2020)
11. Mehta, S., Rastegari, M., Caspi, A., Shapiro, L., Hajishirzi, H.: ESPNet: efficient spatial pyramid of dilated convolutions for semantic segmentation. In: ECCV, pp. 552–568 (2018)
12. Mirzadeh, S.I., Farajtabar, M., Li, A., Levine, N., Matsukawa, A., Ghasemzadeh, H.: Improved knowledge distillation via teacher assistant. In: AAAI, vol. 34, pp. 5191–5198 (2020)
13. Pan, H., Chang, X., Sun, W.: Multitask knowledge distillation guides end-to-end lane detection. IEEE Trans. Ind. Inform. (2023)
14. Paszke, A., Chaurasia, A., Kim, S., Culurciello, E.: ENet: a deep neural network architecture for real-time semantic segmentation. arXiv preprint arXiv:1606.02147 (2016)
15. Peng, B., et al.: Correlation congruence for knowledge distillation. In: ICCV, pp. 5007–5016 (2019)
16. Shu, C., Liu, Y., Gao, J., Yan, Z., Shen, C.: Channel-wise knowledge distillation for dense prediction. In: ICCV, pp. 5311–5320 (2021)
17. Tung, F., Mori, G.: Similarity-preserving knowledge distillation. In: ICCV, pp. 1365–1374 (2019)
18. Vaswani, A., et al.: Attention is all you need. In: NeurIPS, vol. 30 (2017)
19. Wang, Y., Zhou, W., Jiang, T., Bai, X., Xu, Y.: Intra-class feature variation distillation for semantic segmentation. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12352, pp. 346–362. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58571-6_21
20. Xu, C., Gao, W., Li, T., Bai, N., Li, G., Zhang, Y.: Teacher-student collaborative knowledge distillation for image classification. Appl. Intell. 53(2), 1997–2009 (2023)
21. Yang, C., Zhou, H., An, Z., Jiang, X., Xu, Y., Zhang, Q.: Cross-image relational knowledge distillation for semantic segmentation. In: CVPR, pp. 12319–12328 (2022)

Vision Problems in Robotics, Autonomous Driving

Cascaded Bilinear Mapping Collaborative Hybrid Attention Modality Fusion Model

Jiayu Zhao1,2 and Kuizhi Mei1,2(B)

1 College of Artificial Intelligence, Xi'an Jiaotong University, Xi'an, China
[email protected]
2 National Key Laboratory of Human-Machine Hybrid Augmented Intelligence, National Engineering Research Center of Visual Information and Applications, Institute of Artificial Intelligence and Robotics, Xi'an Jiaotong University, Xi'an, China

Abstract. LiDAR and Camera feature fusion detection is widely used in the field of 3D object detection. A fusion method that integrates the two modalities into a unified representation has been proposed and proven effective. However, fusing these two modalities in the same space remains challenging. To address this issue, we propose a fusion module that leverages bilinear mapping and a hybrid attention mechanism. Our method preserves the original independent branches for LiDAR and Camera feature extraction. We then employ a cascaded bilinear mapping hybrid attention fusion module to combine the features of these two modalities in the Bird's Eye View (BEV) space. We utilize a hybrid structure of Convolutional Neural Networks (CNN) and Efficient Attention to process the fused features. Furthermore, a Multilayer Perceptron (MLP) is used to extract information from the features, and an asymmetric deep fusion of the two modalities in BEV space is achieved through the cascaded bilinear mapping method. This approach helps to complement the information provided by both features and achieve improved detection results. On the nuScenes validation set, our model improves the accuracy by 1.2% compared to the current SOTA method of feature fusion in BEV space. It is important to note that this improvement is achieved without any data augmentation or Test-Time Augmentation (TTA) techniques.

Keywords: 3D object Detection · Bilinear Mapping · Attention

Supported by National Natural Science Foundation of China (Grant No. 62076193).

1 Introduction

3D object detection [2] plays an important role in autonomous driving systems; its main functions are to detect whether an object exists in the 3D world coordinate system from the data provided by the sensors and to determine its location and category [9]. In recent years, 3D object detection technology has received extensive attention from academia and industry. At present, mainstream 3D object detection systems mainly include single-modal detectors based on single-sensor information and multi-modal detectors based on multi-sensor data fusion. Usually, the sensors in an autonomous driving system include cameras, LiDAR, and radar. Among them, camera sensors capture information in perspective and can provide semantically related information of objects such as color and texture [9], while LiDAR point clouds can provide depth- and structure-related geometric information [22]. Different sensors show different performance characteristics in different scenarios. Making full use of different sensors helps to reduce uncertainty [4] and make accurate and robust predictions. Currently, there are two primary types of advanced multi-modal detectors: (1) constructing a unified space, such as Bird's Eye View (BEV) or voxel space, to map the two modalities into the same space and fuse their features there [12,14,16]; (2) bottom-up fusion methods based on DETR [21], which use token mechanisms to facilitate the transfer of transformer queries across modalities for feature fusion and subsequent prediction.

In this paper, our objective is to improve the fusion of the two modal features in BEV space by building upon the process proposed by BEVFusion [16]. We aim to achieve deep fusion of LiDAR and Camera features for improved detection accuracy of multi-modal detectors. We propose cascaded bilinear mapping [24,25] to fuse the features of the two modalities in BEV space, and project information onto the fused representation using a cross-attention-based [10] technique driven by LiDAR features. Currently, the fusion of multi-modal features in BEV space is typically achieved through simple concatenation and a CNN module. However, this linear combination of modal features may not sufficiently capture the complex correlation between semantic Camera features and geometric LiDAR features, due to significant differences in their distributions. Compared to unilinear mapping, bilinear mapping [6] has been widely adopted for integrating diverse CNN features and in Visual Question Answering (VQA) [1] due to its ability to capture the characteristics of distinct modal features. However, the high dimensionality of the output features and the large number of model parameters may significantly restrict the applicability of bilinear mapping. This paper employs a bilinear mapping fusion method inspired by Natural Language Processing (NLP), which yields compact output features and powerful expressive capability [24]. Additionally, we devise a CNN-Attention hybrid module to accelerate the computation while achieving comprehensive feature extraction [11], as well as cross-modal interaction between the LiDAR features and the fused features following bilinear fusion. Specifically, a CNN combined with efficient cross-attention [7,20] is utilized to encode the two kinds of features and project the LiDAR features onto the fusion space. Then, the fused features and the extracted features are fused again by bilinear mapping to achieve an asymmetric deep fusion of multi-modal features in BEV space. To validate the efficacy of our design, we conducted a comprehensive evaluation of our model's performance on the nuScenes [3] validation set. Our results demonstrate an improvement in accuracy of approximately 1.2% compared to the baseline BEVFusion, without incorporating additional techniques such as data augmentation and TTA.


Overall, our research provides three main contributions:

• We proposed a cascaded bilinear mapping hybrid attention fusion module. This module deconstructs and reorganizes the bilinear mapping and attention mechanism, achieving asymmetric deep fusion of features from two modalities while effectively retaining their expressive ability and structural characteristics.
• We have developed and implemented a hybrid CNN-Attention module, which incorporates a lightweight multi-head attention mechanism to reduce cache usage and improve computational efficiency.
• Based on BEVFusion, we have enhanced the detection performance of multi-modal detectors in the realm of unified representation technology.

2 Related Work

Traditional Classification divides fusion methods into Point-level, Proposal-level and Object-level. Among them, Point-level [5,18] fusion methods usually draw image semantic features onto foreground LiDAR points and perform LiDAR-based detection on the decorated point cloud input. Proposal-level [2,8] methods first project LiDAR points into the feature space or generate proposals, query the relevant Camera features, and then concatenate them back into the feature space. Object-level methods mean that each sensor performs deep learning model inference separately for the target object, outputting results with their own attributes that are fused at the decision level [26]. The Object-level approach always disregards the intrinsic connection between the two modes of representation. Currently, some methods break the boundaries between levels and attempt to use end-to-end approaches for multi-modal fusion.

Unified Representation. Some methods opt for building a unified representation that fuses the features of both modalities in the same space, which effectively preserves the information from both modalities. [14,16] encode Camera and LiDAR features into the same BEV separately for fusion. [12,13] build a voxel space by sampling features from the image plane according to predicted depth scores and geometric constraints. The unified representation method can well exploit the complementarity of data from different modalities. [4] proposed a method of interactive feature fusion in dual domains, which transforms camera-domain features and voxel-domain features into each other's domains and fuses them simultaneously in each domain. The unified representation method can effectively preserve the characteristics of multi-modal data, while also addressing the challenge of effectively fusing diverse data.

3 Method

3.1 Overall Architecture

Fig. 1. Overall architecture of the model. The input point cloud and image data are uniformly represented in BEV space through two independent feature extraction branches. The LiDAR features and Camera features are then deeply fused asymmetrically through a cascade bilinear mapping hybrid attention (CBMHA) fusion block. Finally, the fused features are transmitted to the decoder for decoding, and subsequently fed into the 3D detection head for object detection.

The overall architecture is illustrated in Fig. 1. Building upon BEVFusion [16], this paper proposes a bilinear mapping hybrid attention-based LiDAR-Camera fusion detector that treats the LiDAR and Camera streams as two independent branches and uniformly projects them onto the BEV space for bilinear fusion. Given a LiDAR point cloud and multi-view Camera images as input, the two branches extract the features of the respective modalities. The image input X_c ∈ R^{W_c×H_c×3} is encoded by the Camera backbone, where H_c and W_c represent the height and width of the image. The LiDAR point cloud data X_L ∈ R^{N×3} is encoded by a voxel-based backbone, where N represents the number of LiDAR points. We propose a cascaded bilinear mapping hybrid attention fusion module to align and fuse the features of the two modalities in BEV space. Firstly, bilinear fusion is executed; then the dense LiDAR point cloud data is used as the Query and the fused data as the Key and Value to perform efficient multi-head interactive cross-attention extraction. Bilinear mapping is then utilized again to perform deep feature fusion between the extracted and the fused features. The resulting features are fed into the network decoder for decoding, and subsequently passed to the 3D detection head to obtain accurate detection results.
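As a schematic of the data flow just described (not the authors' implementation; all module names and shapes are placeholder assumptions), the forward pass can be sketched as follows. The CBMHA block referenced here is detailed in Sect. 3.2.

```python
import torch.nn as nn

class FusionDetector(nn.Module):
    """Two independent branches -> BEV features -> CBMHA fusion -> decoder -> head."""
    def __init__(self, camera_branch, lidar_branch, cbmha_fusion, bev_decoder, det_head):
        super().__init__()
        self.camera_branch = camera_branch    # e.g. Swin-T + view transform -> camera BEV features
        self.lidar_branch = lidar_branch      # e.g. VoxelNet -> LiDAR BEV features
        self.cbmha_fusion = cbmha_fusion      # cascaded bilinear mapping + hybrid attention
        self.bev_decoder = bev_decoder
        self.det_head = det_head              # e.g. a TransFusion-style 3D detection head

    def forward(self, images, points):
        f_camera = self.camera_branch(images)           # (B, C_c, H, W) in BEV
        f_lidar = self.lidar_branch(points)             # (B, C_l, H, W) in BEV
        f_fused = self.cbmha_fusion(f_lidar, f_camera)  # asymmetric deep fusion in BEV
        return self.det_head(self.bev_decoder(f_fused))
```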

3.2 Proposed Cascade Bilinear Mapping Hybrid Attention Fusion

Cascade Bilinear Mapping Hybrid Attention Fusion Module. By designing two modules, a bilinear fusion module and a CNN-Attention hybrid module, and combining them, we construct the structure for fusing the features of the two modalities in BEV. We name this structure CBMHA Fusion. The architecture is illustrated in Fig. 2 and can be expressed as:

F^m = BMF(F^LiDAR, F^Camera),   F^f = BMF(F^m, φ(F^LiDAR, F^m)),   (1)


Fig. 2. The overall structural diagram of the cascaded bilinear mapping hybrid attention fusion module is presented, where BMF denotes the fusion module of bilinear mapping, HCAF denotes the hybrid CNN-Attention fusion module, and CBR refers to the module composed of Convolution, BatchNorm and Relu. The query is obtained through Patch Embedding (PE) block for LiDAR features, which are then passed into E-MHCA to enable interactive attention mechanisms. Subsequently, feature convolution is performed to achieve multi-level concatenation before MLP is employed for feature extraction and output generation.

where F^LiDAR and F^Camera denote the features from the LiDAR stream and the Camera stream, respectively, F^m and F^f denote the BMF outputs, BMF(·) denotes the bilinear mapping fusion block, and φ denotes the hybrid CNN-Attention fusion block.

Bilinear Mapping Fusion. The inputs of the two modalities are uniformly represented in BEV space through the feature extraction modules, yielding the Camera features F^Camera and the LiDAR features F^LiDAR. Since the two modal features carry different information, and considering that the bilinear mapping mechanism can fuse features with different characteristics, an improved bilinear fusion method is used to fuse the two modalities. Given the LiDAR features F^LiDAR ∈ R^{h×w×m} and the Camera features F^Camera ∈ R^{h×w×n} in the input BEV space, a basic bilinear model can be obtained as follows:

z_i = (F^LiDAR)^T W_i F^Camera,   (2)

where W_i ∈ R^{m×n} is the projection matrix and z_i ∈ R is the output of the bilinear mapping. To obtain the output z, we need to learn W = [W_1, ..., W_o] ∈ R^{m×n×o}. However, there is a problem with this method: when m and n are large, the number of parameters increases rapidly, resulting in a large amount of computation and a certain risk of overfitting. In fact, when the desired number of output feature channels is o, it is necessary to learn o projection matrices W_i, which yields o outputs z_i.

J. Zhao and K. Mei

Due to the four-bit arrays and non-uniform channel numbers of Camera and LiDAR, we use matrix factorization tricks [24] for calculation in order to avoid issues caused by excessive parameters, the decomposed formula is described as: T



zi = F LiDAR Ui ViT F Camera = 1T (UiT F LiDAR ViT F Camera )

(3)

zi can be obtained from Ui and Vi with fewer parameters, and is further written in the form of Hadamard product, where 1T ∈ Rk is an all-one variable, and the elements of the Hadamard product result are added up, ◦ denotes the Hadamard product. In general, the bilinear mapping fusion method transforms U and V into a 3D form through reshape operation. The z value is calculated using Hadamard product and sumpooling, where sumpooling performs a pooling of kernel size = stride = k on the result. Finally, we obtain the output by adding dropout to avoid overfitting and normalization to prevent getting stuck in a local minimum. This can be expressed as follows: Zf = BM Ff (F LiDAR , F Camera ) = Dropout(U T F LiDAR V T F Camera ), Z = BM Fsq (F LiDAR , F Camera ) = N ormlization(SumP ooling(Zf ))

(4)

BM Ff (·), BM Fsq (·) denote the expansion and aggregation components of the bilinear fusion module, respectively. We employ a cascade of multiple BMF blocks, followed by concatenation of the resulting features. The cascaded BMF block is able to fuse multi-modal features more effectively can be described as, and its structure is described as: Z = BM F p = [z 1 , z 2 , ..., z n ]

(5)
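To illustrate Eqs. (3)–(5), the following is a minimal PyTorch sketch of one BMF block applied per BEV location; the layer names, the 1×1-convolution projections, the factor k, and the L2 normalization are assumptions for illustration rather than the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BMF(nn.Module):
    """Expansion (BMF_f) and aggregation (BMF_sq) of a factorized bilinear fusion block."""
    def __init__(self, lidar_dim, camera_dim, out_dim, k=4, p_drop=0.1):
        super().__init__()
        self.k = k
        self.proj_lidar = nn.Conv2d(lidar_dim, out_dim * k, kernel_size=1)   # plays the role of U^T
        self.proj_camera = nn.Conv2d(camera_dim, out_dim * k, kernel_size=1) # plays the role of V^T
        self.dropout = nn.Dropout(p_drop)

    def forward(self, f_lidar, f_camera):
        # Expansion: Hadamard product of the two projections with dropout (Eq. (4), first line)
        z_f = self.dropout(self.proj_lidar(f_lidar) * self.proj_camera(f_camera))
        # Aggregation: sum-pooling over every k consecutive channels (Eq. (4), second line)
        b, ck, h, w = z_f.shape
        z = z_f.view(b, ck // self.k, self.k, h, w).sum(dim=2)
        # Normalization: here an L2 normalization over channels at each BEV location (one possible choice)
        z = F.normalize(z.flatten(2), dim=1).view(b, -1, h, w)
        return z
```

A cascade of p such blocks, with their outputs concatenated along the channel dimension, would then realize Eq. (5).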

Hybrid CNN-Attention Fusion Module. To address the issues of alignment and global feature extraction, incorporating an attention mechanism into the model can enhance its ability to attend to relevant image regions, thereby capturing information more efficiently and improving the fusion effect. Inspired by the fusion of image and question in VQA, and considering that LiDAR data and Camera data are not completely symmetric and represent the scene from different viewpoints, we further apply asymmetric deep fusion on top of the bilinear fusion. Specifically, we apply a cooperative multi-frequency transformer attention mechanism to perform cross-attention on the features obtained through bilinear fusion in BEV space.

As shown in Fig. 2, patch embedding is first performed on the LiDAR features to obtain the query. Subsequently, the fused features obtained from the preceding module are further processed to derive the Key and Value components required by the attention mechanism. The improved efficient multi-head cross-attention module is used to project the LiDAR features onto the fused features, taking the fused feature Z_f and the LiDAR feature F^LiDAR as inputs. The efficient multi-head cross-attention operator can be expressed as follows:

E-MHCA(Z, F^LiDAR) = Concat(CA_1(Q_l1^(i,j,k), K_1^(i,j,k), V_1^(i,j,k)), ..., CA_h(Q_lh^(i,j,k), K_h^(i,j,k), V_h^(i,j,k))) W^P,   (6)

where W^P represents the weight matrix, and Q_li^(i,j,k), K_i^(i,j,k), V_i^(i,j,k) denote the query, key, and value of the i-th head after multi-head division in the channel dimension. The query is derived from the LiDAR features, while the key and value are extracted from the fused features Z. These are used as inputs to the cross-attention operator CA(·), which is expressed as follows:

CA(Q, K, V) = Softmax( (Q W^q)(P_s(K W^k))^T / √d ) P_s(V W^V),   (7)

where d denotes the scaling factor, Q, K, and V denote the query, key, and value obtained above, and P_s refers to the average pooling operation with stride s, which downsamples the spatial dimension before the attention operation in order to reduce computational cost.

The attention block may partially degrade high-frequency information, such as local texture information [19]. Therefore, we incorporate a CNN module into the collaborative fusion module to work in conjunction with the E-MHCA module for capturing multi-frequency signals. Subsequently, the output features of the CNN and attention branches are aggregated and further refined by MLP layers to extract more important and distinctive features, before being bilinearly fused with the features obtained by projection. This design alleviates the memory consumption of the transformer to a certain extent, while facilitating a more effective and deeper integration of the two modalities from a global perspective. The CNN-Attention combination module is formulated as follows:

ż^l, Q^LiDAR = Proj(F^m), PE(F^LiDAR),
ẑ^l = Proj(E-MHCA(ż^l, Q^LiDAR) + ż^l),
z̃^l = CBR(ẑ^l) + ẑ^l,
z^l = MLP(Concat(z̃^l, ẑ^l)) + Concat(z̃^l, ẑ^l),   (8)

where Proj(·) represents the channel-projection point-wise convolution layer, ẑ^l, z̃^l and z^l denote the outputs of E-MHCA, CBR, and MLP, respectively, PE(·) denotes the patch embedding block, and Q^LiDAR denotes the query extracted from the LiDAR features. E-MHCA(·) is the efficient multi-head cross-attention module, which downsamples the spatial dimension before performing the attention operation to reduce computational cost. CBR(·) is a convolution module composed of Convolution, BatchNorm, and ReLU.
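The following is a hedged PyTorch sketch of the E-MHCA operator of Eqs. (6)–(7), with average-pooled keys and values; the class name, the pooling stride, and the single shared output projection W^P are assumptions, not the authors' code (it also assumes dim is divisible by the number of heads). Wrapping this operator with the patch embedding, CBR and MLP stages of Eq. (8) would complete the hybrid block.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class EMHCA(nn.Module):
    def __init__(self, dim, num_heads=8, pool_stride=2):
        super().__init__()
        self.num_heads, self.head_dim = num_heads, dim // num_heads
        self.q_proj = nn.Linear(dim, dim)
        self.kv_proj = nn.Linear(dim, 2 * dim)
        self.out_proj = nn.Linear(dim, dim)          # W^P
        self.pool = nn.AvgPool2d(pool_stride)        # P_s: spatial downsampling of K and V

    def forward(self, q_tokens, fused_bev):
        # q_tokens: (B, N_q, dim) queries from the patch-embedded LiDAR features
        # fused_bev: (B, dim, H, W) fused BEV features providing keys and values
        b = q_tokens.size(0)
        kv = self.pool(fused_bev).flatten(2).transpose(1, 2)          # (B, N_kv, dim)
        q = self.q_proj(q_tokens).view(b, -1, self.num_heads, self.head_dim).transpose(1, 2)
        k, v = self.kv_proj(kv).chunk(2, dim=-1)
        k = k.view(b, -1, self.num_heads, self.head_dim).transpose(1, 2)
        v = v.view(b, -1, self.num_heads, self.head_dim).transpose(1, 2)
        attn = F.softmax(q @ k.transpose(-2, -1) / self.head_dim ** 0.5, dim=-1)
        out = (attn @ v).transpose(1, 2).reshape(b, -1, self.num_heads * self.head_dim)
        return self.out_proj(out)                                      # concat heads, apply W^P
```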

4 Experiments

4.1 Implementation Details

Our network is constructed based on BEVFusion. The Camera and LiDAR branches adopt Swin-T, VoxelNet [27], and FPN [15] as the backbones to fuse multi-scale Camera features. The decoder acts as the second stage, and TransFusion is used as the detection head at the end. We downsample the Camera images to 256 × 704 and voxelize the LiDAR point cloud with a resolution of 0.075 m [16]. In addition, we use the trained LiDAR object detection model as the pre-trained model. It is worth mentioning that we do not use any additional tricks such as data augmentation or TTA in our model. The PyTorch deep learning framework is employed and the programming is implemented in Python. To assess the efficacy of the model refinement, in line with BEVFusion, the approach is evaluated on the nuScenes dataset, which is widely used in the field of autonomous driving. It contains 1000 different driving scenes captured with a spinning LiDAR placed on the roof, five long-range radar sensors and six cameras with different viewpoints. The training, validation and testing scenarios account for 75%, 15% and 15%, respectively [3]. We use the nuScenes Detection Score (NDS) and mean Average Precision (mAP) to evaluate the model. Since the BEVFusion baseline model used for comparison differs from our experimental environment, we adjusted the parameters. Based on the above experimental environment, we use the AdamW [17] optimizer with an initial learning rate of 2.0e-5 and a weight decay of 0.01. The model was trained for 12 epochs on a single Tesla A100 GPU with a batch size of 2.

Table 1. Comparison with existing methods on the nuScenes validation set.

| Method               | Modality | LiDAR Backbone | Camera Backbone | mAP  | NDS  |
| MVP [23]             | C+L      | CenterPoint    | ResNet-50       | 66.4 | 70.5 |
| Transfusion [2]      | C+L      | VoxelNet       | ResNet-50       | 67.5 | 71.3 |
| AutoAlignV2 [5]      | C+L      | VoxelNet       | CSPNet          | 67.1 | 71.2 |
| UVTR [12]            | C+L      | VoxelNet       | ResNet-101      | 65.4 | 70.2 |
| FUTR3D [4]           | C+L      | VoxelNet       | ResNet-101      | 64.5 | 68.3 |
| CMT [22]             | C+L      | VoxelNet       | ResNet-50       | 67.9 | 70.8 |
| 3D dual fusion [9]   | C+L      | VoxelNet       | ResNet-50       | 69.2 | 72.2 |
| MSMDFusion [8]       | C+L      | VoxelNet       | ResNet-50       | 69.3 | 72.0 |
| BEVFusion [14]       | C+L      | VoxelNet       | Dual-Swin-T     | 67.9 | 71.0 |
| BEVFusion-bench [16] | C+L      | VoxelNet       | Swin-T          | 68.5 | 71.4 |
| CoBMFusion (ours)    | C+L      | VoxelNet       | Swin-T          | 69.7 | 72.2 |


Fig. 3. The training loss of the CoBMFusion model and the BEVFusion model in the same environment in the first epoch.

4.2 State-of-the-Art Comparison

In Table 1, we present the experimental results obtained on the nuScenes validation set in comparison with existing techniques. Without utilizing any data augmentation, TTA or other tricks, our model achieves strong performance, with mAP and NDS of 69.7% and 72.2%, respectively. In comparison to the baseline BEVFusion model, we obtain an increase of +1.2% in mAP and +0.8% in NDS through a series of modifications. Our results outperform the benchmark in the majority of categories: truck increases by 2%, bus by 2.3%, P.E. by 1.2%, motor by 0.6%, bicycle by 2.5%, and traffic by 1.5%. While our model may not be the overall SOTA model on the nuScenes validation set, it is currently the SOTA LiDAR-Camera fusion model in BEV space. Our model retains a simple and effective framework and weakens the dependence of current methods on LiDAR-Camera fusion [16], ensuring independence between modalities while minimizing data-related issues and improving accuracy with minimal complexity. It can be seen that our model maintains a high degree of competitiveness in the field of fusion in the unified modal representation space.

4.3 Performance Analysis

Training Loss. To more explicitly demonstrate the performance enhancement brought by our improvement, we compare the training loss of the CoBMFusion model with that of the BEVFusion baseline model over the first epoch of 30895 iterations of training on the nuScenes validation dataset. Figure 3 illustrates that our model yields a smaller loss than the benchmark, further validating the effectiveness of our proposed approach.

Table 2. Results using different numbers of BMF blocks on the nuScenes validation set.

| Bilinear Mapping                | mAP  | NDS  |
| BEVFusion baseline              | 68.5 | 71.4 |
| BEVFusion + 1 BMF block         | 68.7 | 71.2 |
| BEVFusion + cascade BMF blocks  | 68.9 | 71.5 |

Ablation Study. To further validate the efficacy of the enhanced model, we conducted ablation experiments on the fusion module. Table 2 presents the outcomes, in which we compare models with a single-layer or cascaded bilinear fusion added to the BEVFusion baseline model. Our results indicate that utilizing BMF for multi-modal feature fusion can enhance the accuracy of 3D detection models. Compared to the common concatenation fusion method, our design utilizing single-layer bilinear mapping improves accuracy by 0.2%, while cascaded bilinear mapping further improves it by 0.4%. These results demonstrate the effectiveness of our approach.

Running Time. We also compare the sizes of the fusion components and evaluate inference speed using Frames Per Second (FPS) as a performance metric on an NVIDIA A100 GPU. As shown in Table 3(a), our approach outperforms the BEVFusion method in terms of detection performance, while exhibiting faster processing than other models utilizing attention mechanisms, validating the excellent trade-off between detection performance and inference speed achieved by our method. Table 3(b) presents the number of parameters and floating-point operations (FLOPs) of the fusion components, where Dynamic fusion and CBR are the simple fusion modules of the two BEVFusion models, sampler & decoder belongs to FUTR3D, and CBMHA is our proposed fusion module.

Table 3. Results of different models on the nuScenes validation set. The inference speed and size are measured on a single NVIDIA A100 GPU.

(a) Inference Speed Comparison
| Model          | Input Resolution | mAP  | NDS  | FPS |
| FUTR3D [4]     | 1600 × 900       | 64.5 | 68.3 | 2.5 |
| BEVFusion [14] | 1600 × 900       | 67.9 | 71.0 | 0.7 |
| BEVFusion [16] | 704 × 256        | 68.5 | 71.5 | 4.2 |
| CoBMFusion     | 704 × 256        | 69.7 | 72.2 | 3.2 |

(b) Fusion Component Size Comparison
| Model          | Fusion Block      | Params | FLOPs |
| BEVFusion [14] | Dynamic fusion    | 0.9M   | 25G   |
| BEVFusion [16] | CBR               | 0.8M   | 25G   |
| FUTR3D [4]     | sampler & decoder | 7.5M   | 2G    |
| CoBMFusion     | CBMHA (Ours)      | 4.7M   | 63G   |

5 Conclusion

We proposed a new structure for fusing LiDAR and Camera features in BEV space. Building upon the BEVFusion baseline, our approach leverages cascaded bilinear mapping and a CNN-Attention hybrid architecture to integrate the two modalities. Furthermore, we introduced an asymmetric deep fusion technique using global projection, leveraging the ability of bilinear mapping to capture inter-modality feature differences and enabling a more effective and comprehensive fusion of the modalities. Our experiments on the nuScenes validation set demonstrate a significant improvement over the original BEVFusion benchmark results, establishing our designed fusion structure as highly competitive among multi-modal detection models. However, due to hardware limitations such as video memory, there is still room for improvement in the number of feature channels allocated to the network fusion module; these parameters are subject to future tuning. In the future, we will optimize the model's operators to achieve a lightweight design, enhance running speed, and reduce its size.

References
1. Antol, S., et al.: VQA: visual question answering. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 2425–2433 (2015)
2. Bai, X., et al.: Transfusion: robust lidar-camera fusion for 3D object detection with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 1090–1099 (2022)
3. Caesar, H., et al.: nuScenes: a multimodal dataset for autonomous driving. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 11621–11631 (2020)
4. Chen, X., Zhang, T., Wang, Y., Wang, Y., Zhao, H.: FUTR3D: a unified sensor fusion framework for 3D detection. arXiv preprint arXiv:2203.10642 (2022)
5. Chen, Z., Li, Z., Zhang, S., Fang, L., Jiang, Q., Zhao, F.: Autoalignv2: deformable feature aggregation for dynamic multi-modal 3D object detection. arXiv preprint arXiv:2207.10316 (2022)
6. Cortes, C., Lawrence, N., Lee, D., Sugiyama, M., Garnett, R.: Advances in neural information processing systems 28. In: Proceedings of the 29th Annual Conference on Neural Information Processing Systems (2015)
7. Guo, J., et al.: CMT: convolutional neural networks meet vision transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 12175–12185 (2022)
8. Jiao, Y., Jie, Z., Chen, S., Chen, J., Ma, L., Jiang, Y.G.: Msmdfusion: fusing lidar and camera at multiple scales with multi-depth seeds for 3D object detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 21643–21652 (2023)
9. Kim, Y., Park, K., Kim, M., Kum, D., Choi, J.W.: 3D dual-fusion: dual-domain dual-query camera-lidar fusion for 3D object detection. arXiv preprint arXiv:2211.13529 (2022)
10. Li, H., et al.: Delving into the devils of bird's-eye-view perception: a review, evaluation and recipe. arXiv preprint arXiv:2209.05324 (2022)
11. Li, J., et al.: Next-ViT: next generation vision transformer for efficient deployment in realistic industrial scenarios. arXiv preprint arXiv:2207.05501 (2022)
12. Li, Y., Chen, Y., Qi, X., Li, Z., Sun, J., Jia, J.: Unifying voxel-based representation with transformer for 3D object detection. arXiv preprint arXiv:2206.00630 (2022)
13. Li, Y., et al.: Voxel field fusion for 3D object detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 1120–1129 (2022)
14. Liang, T., et al.: Bevfusion: a simple and robust lidar-camera fusion framework. arXiv preprint arXiv:2205.13790 (2022)
15. Lin, T.Y., Dollár, P., Girshick, R., He, K., Hariharan, B., Belongie, S.: Feature pyramid networks for object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2117–2125 (2017)
16. Liu, Z., et al.: Bevfusion: multi-task multi-sensor fusion with unified bird's-eye view representation. arXiv preprint arXiv:2205.13542 (2022)
17. Loshchilov, I., Hutter, F.: Decoupled weight decay regularization. arXiv preprint arXiv:1711.05101 (2017)
18. Nabati, R., Qi, H.: Centerfusion: center-based radar and camera fusion for 3D object detection. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1527–1536 (2021)
19. Park, N., Kim, S.: How do vision transformers work? arXiv preprint arXiv:2202.06709 (2022)
20. Vaswani, A., et al.: Attention is all you need. In: Advances in Neural Information Processing Systems, vol. 30 (2017)
21. Wang, Y., Guizilini, V.C., Zhang, T., Wang, Y., Zhao, H., Solomon, J.: DETR3D: 3D object detection from multi-view images via 3D-to-2D queries. In: Conference on Robot Learning, pp. 180–191. PMLR (2022)
22. Yan, J., et al.: Cross modal transformer via coordinates encoding for 3D object detection. arXiv preprint arXiv:2301.01283 (2023)
23. Yin, T., Zhou, X., Krähenbühl, P.: Multimodal virtual point 3D detection (2021)
24. Yu, Z., Yu, J., Fan, J., Tao, D.: Multi-modal factorized bilinear pooling with co-attention learning for visual question answering. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 1821–1830 (2017)
25. Yu, Z., Yu, J., Xiang, C., Fan, J., Tao, D.: Beyond bilinear: generalized multimodal factorized high-order pooling for visual question answering. IEEE Trans. Neural Netw. Learn. Syst. 29(12), 5947–5959 (2018)
26. Zhang, Y., Zhang, Q., Hou, J., Yuan, Y., Xing, G.: Bidirectional propagation for cross-modal 3D object detection. arXiv preprint arXiv:2301.09077 (2023)
27. Zhou, Y., Tuzel, O.: Voxelnet: end-to-end learning for point cloud based 3D object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4490–4499 (2018)

CasFormer: Cascaded Transformer Based on Dynamic Voxel Pyramid for 3D Object Detection from Point Clouds

Xinglong Li and Xiaowei Zhang(B)

School of Computer Science and Technology, Qingdao University, Qingdao, China
[email protected]

Abstract. Recently, Transformers have been widely applied in 3-D object detection to model global contextual relationships in point cloud collections or for proposal refinement. However, the structural information in 3-D point clouds, especially for distant and small objects, is often incomplete, leading to difficulties in accurate detection with these methods. To address this issue, we propose a Cascaded Transformer based on a Dynamic Voxel Pyramid (called CasFormer) for 3-D object detection from LiDAR point clouds. Specifically, we dynamically spread relevant features from the voxel pyramid based on the sparsity of each region of interest (RoI), capturing richer semantic information for structurally incomplete objects. Furthermore, a cross-stage attention mechanism is employed to cascade the refined results of the Transformer stage by stage, as well as to improve the training convergence of the Transformer. Extensive experiments demonstrate that our CasFormer achieves promising performance on the KITTI dataset and the Waymo Open Dataset. Compared to CT3D, our method outperforms it by 1.12% and 1.27% at the moderate and hard levels of car detection, respectively, on the KITTI online 3-D object detection leaderboard.

Keywords: 3-D object detection · Point clouds · Cascaded network

1 Introduction

In the fields of autonomous driving, intelligent transportation, and robot navigation, accurate perception and understanding of the 3-D objects in the surrounding environment are crucial. However, achieving precise, efficient, and robust 3-D object detection remains challenging due to the complexity of three-dimensional space and the diversity of objects. In previous studies, researchers have proposed various 3-D object detection methods based on laser point clouds. These methods can be categorized into single-stage detectors [2,18,21] and two-stage detectors [5–7,9,16] based on the steps of generating bounding boxes. In two-stage detectors, a region proposal network first generates coarse candidate boxes, which are then refined to produce the final, more accurate detection results. Recently, with the success of the Transformer [3] in various domains, many works have incorporated Transformers into 3-D object detection to model global contextual relationships or refine proposals in point clouds.


However, in LiDAR point clouds, distant and small objects are characterized by a very limited number of points, resulting in highly incomplete appearance information. Therefore, performing only single-stage proposal refinement is insufficient to accurately detect them. In 2-D object detection, some networks have addressed these issues by adopting a cascaded structure, which has demonstrated excellent performance. Cascade R-CNN [22] ensures the quality of subsequent proposals by gradually increasing the IoU threshold at each stage. However, in 3-D scenes, the predicted bounding boxes for these objects have very low IoU values. If the cascaded structure with a progressively increasing threshold is directly applied, the detector would gradually neglect these objects.

Based on the above analysis, we propose a Cascaded Transformer based on a Dynamic Voxel Pyramid (CasFormer). CasFormer consists of a Region Proposal Network (RPN) and a Cascaded Transformer Refinement (CTR) network. Specifically, unlike the refinement networks of most existing two-stage detectors, our CTR adopts a cascaded structure. We introduce more connections between stages to progressively improve and complement the proposals, instead of solely using the output of one stage to guide the refinement in the next stage. Additionally, we design a cross-stage attention module to aggregate features from different stages, enhancing the feature representation capability of each stage and enabling comprehensive refinement of region proposals. To validate the performance of our method across different scenarios and object categories, we conducted a series of experiments and evaluations, comparing it with existing state-of-the-art approaches. Our contributions can be summarized as follows:

– We design a dynamic voxel pyramid that dynamically extracts features from multiple 3D convolutional layers based on the sparsity of non-empty voxels within the RoI and extends these RoIs to capture more information.
– We propose a cascaded Transformer refinement network that utilizes cross-stage attention mechanisms to progressively aggregate the refinement results from each cascaded stage, aiming to achieve high-quality predictions. Moreover, it achieves better performance with fewer training iterations.
– On the KITTI online 3-D object detection leaderboard, we achieve high detection performance with an average precision (AP) of 82.89% in the moderate car category.

2   Related Work

2.1   CNN-Based 3-D Object Detection

Many existing 3-D object detection methods based on LiDAR point clouds rely on CNNs to extract features for object detection. These methods can be categorized into single-stage detectors and two-stage detectors based on the process of generating bounding boxes, and two-stage detectors usually have higher accuracy. In these detectors, the first stage utilizes an RPN to extract region proposals of interest from the input data. The second stage performs refined classification and position adjustment on these region proposals. Typically, these modules operate on the feature regions extracted around the region proposals. Point R-CNN [17] employs a bottom-up approach in the first stage to generate coarse region proposals, followed by region pooling to expand the proposal scale and aggregate RoI features. Voxel R-CNN [9] represents 3-D volumes as voxel centers and extracts RoI features from the voxels using voxel RoI pooling. PV-RCNN [16] introduces an encoding scheme from voxels to keypoints and a multi-scale RoI feature abstraction layer, aggregating RoI features within grid points with multi-scale features based on voxels and points. These methods aim to improve the accuracy and robustness of 3-D object detection by incorporating two-stage frameworks and utilizing various techniques such as feature aggregation, region pooling, and transformer modules to refine proposals and enhance the representation capabilities of the models.

2.2   Transformer-Based 3-D Object Detection

Transformers [3] have demonstrated outstanding performance in various computer vision tasks, and significant efforts have been made to adapt them to 3D object detection. Pointformer [8] introduces transformer modules to the point backbone to learn context-aware regional features. VoTr [11] replaces the convolutional voxel backbone with transformer modules, proposing sparse voxel and submanifold voxel attention and applying them to voxels. CT3D [5] proposes a channel-wise attention-based transformer detection head for proposal refinement. BEVFormer [23] generates dense queries from the Bird's Eye View (BEV) grid. SST [12] presents a sparse transformer that groups voxels in local regions into patches and applies sparse region attention to the voxels within each patch. SWFormer [14] improves upon SST by enhancing multi-scale feature fusion and voxel diffusion. These methods leverage transformers to capture contextual information, refine proposals, and enhance feature representation, leading to improved performance in accurately detecting and recognizing 3-D objects in computer vision tasks.

3   Methods

3.1   Overview

Our CasFormer is a two-stage detector, as shown in Fig. 1. We employ a sparse convolution similar to SECOND [21] as the 3D backbone network of our network, generating voxel feature maps with downsampling factors of 2×, 4×, and 8×, represented as f (2) , f (3) , and f (4) , respectively. Following recent research, we compress feature map f (4) along the height dimension to form a BEV map and apply a 2D backbone on it to generate anchors. Then, for each anchor on the feature map, we predict the corresponding class, orientation, and initial 3-D box. Specifically, we propose a Cascaded Transformer Refinement (CTR) network for box refinement in the second stage. Our CTR is a cascaded structure, where each stage first utilizes a Dynamic Voxel Pyramid module (DVP) to dynamically select the required feature levels for each RoI and aggregates the features of


each RoI through a pooling operation. The obtained features are fed into a Transformer for initial refinement, which consists of three encoder layers and one decoder layer. In particular, at each stage, we collect the results of preliminary refinement by Transformer from all previous stages and aggregate them using a Cross-Stage Attention (CSA) to guide the generation of high-quality proposals in the current stage. Finally, we gather the predicted boxes from all stages, perform non-maximum suppression (NMS) to obtain the final prediction results. The specific components will be introduced in Sects. 3.2 and 3.3.

Fig. 1. Our network structure for CasFormer. It generates proposals of 3-D bounding boxes using a voxel-based RPN, which consists of a 3-D backbone and a 2-D detection head. The proposals are progressively refined by a CTR, in which the features from different stages are aggregated by CSAs.

3.2   Cascaded Transformer Refinement

Dynamic Voxel Pyramid. In DVP, we first map each RoI back to the original voxel volume and calculate the non-empty voxel density $\rho_{roi}^{i}$ within each RoI. Then, based on the density, we determine from which voxel feature layers to extract the features for this RoI. The density $\rho_{roi}^{i}$ is calculated using the following formula:

$$\rho_{roi}^{i} = \frac{N_{non}^{i}}{N_{all}^{i}}, \quad i = 1, \ldots, N \tag{1}$$

where $N$ is the total number of RoIs, $N_{non}^{i}$ represents the number of non-empty voxels within the $i$-th RoI, and $N_{all}^{i}$ is the total number of voxels within the $i$-th RoI. Next, we set two thresholds, $\theta_1$ and $\theta_2$. If $\rho_{roi}^{i} \ge \theta_1$, we only extract this RoI feature from $f^{(4)}$. If $\theta_2 \le \rho_{roi}^{i} < \theta_1$, we only extract this RoI feature from $f^{(3)}$ and $f^{(4)}$. If $\rho_{roi}^{i} < \theta_2$, we extract this RoI feature from the last three layers of the 3D backbone. In this paper, we set $\theta_1$ to 0.7 and $\theta_2$ to 0.4.
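To make the level-selection rule concrete, the following is a minimal sketch of how the density in Eq. (1) could drive the choice of voxel feature maps; the function name, the list-of-masks input format, and the string level labels are our own illustration, not the authors' released code.

```python
import torch

def select_feature_levels(roi_voxel_masks, theta1=0.7, theta2=0.4):
    """Sketch of the DVP rule: compute rho_i = N_non / N_all for each RoI
    (Eq. 1) and pick which voxel feature maps to pool from."""
    levels = []
    for mask in roi_voxel_masks:                 # boolean tensor of voxels inside one RoI
        density = mask.float().sum() / mask.numel()
        if density >= theta1:
            levels.append(["f4"])                # dense RoI: last feature map only
        elif density >= theta2:
            levels.append(["f3", "f4"])          # medium density: last two maps
        else:
            levels.append(["f2", "f3", "f4"])    # sparse RoI: last three maps
    return levels
```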


We partition the RoI into a regular grid of size $G \times G \times G$. For a grid point $p_{grid}^{i}$, its coordinates are represented as $\{v_p = (x_p, y_p, z_p)\}$ and its features are represented as $\{\varphi_p\}$. Among them, $v_p$ represents the 3-D coordinates of the voxel where $p_{grid}^{i}$ is located, and $\varphi_p$ represents the features corresponding to that voxel. Specifically, we refer to Pyramid R-CNN [10] and introduce a set of expansion factors $\lambda = [\lambda_1, \lambda_2, \lambda_3]$. This expansion factor is used to enlarge the scope of each RoI, aiding in capturing rich semantic information beyond the RoI. The coordinates of the grid points within the expanded RoI are calculated using the following formula:

$$p_{grid}^{i} = (x_c, y_c, z_c) + ((w, l, h) + 0.5) \cdot \left(\frac{\lambda W}{N_w}, \frac{\lambda L}{N_l}, \frac{H}{N_h}\right) \tag{2}$$

where $(W, L, H)$ represents the width, length, and height of the RoI, respectively. $(x_c, y_c, z_c)$ denotes the coordinates of the bottom-left corner of the RoI. $(w, l, h)$ represents the indices of the grid point along the width, length, and height directions when the bottom-left corner is taken as the origin. $(N_w, N_l, N_h)$ represents the grid size. Finally, we adopt Voxel RoI Pooling, inspired by Voxel R-CNN [9], to aggregate and align RoI features across different feature layers. After this operation, the feature representation is denoted as $f_{roi} = \left\{f_{roi}^{1}, \ldots, f_{roi}^{N}\right\} \in \mathbb{R}^{B \times N \times M \times D}$, where $B$ represents the batch size, $N$ represents the number of RoIs in a batch, $M$ represents the number of grid points for each RoI, and $D$ represents the feature dimension of the grid points. Next, we input $f_{roi}$ as tokens into a Transformer to refine our proposals.
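As a side illustration of Eq. (2), a hedged sketch of the expanded grid-point computation is given below. The RoI representation (bottom-left corner plus size) follows the notation above, while the dictionary layout, the default grid size, and the single scalar expansion factor `lam` are illustrative assumptions rather than details of the authors' implementation.

```python
import torch

def expanded_grid_points(roi, grid_size=(6, 6, 6), lam=1.2):
    """Sketch of Eq. (2): grid-point coordinates inside an RoI whose width and
    length are enlarged by an expansion factor `lam` (value is illustrative)."""
    xc, yc, zc = roi["corner"]                 # bottom-left corner (x_c, y_c, z_c)
    W, L, H = roi["size"]                      # RoI width, length, height
    Nw, Nl, Nh = grid_size
    w = torch.arange(Nw).view(Nw, 1, 1).expand(Nw, Nl, Nh)
    l = torch.arange(Nl).view(1, Nl, 1).expand(Nw, Nl, Nh)
    h = torch.arange(Nh).view(1, 1, Nh).expand(Nw, Nl, Nh)
    x = xc + (w + 0.5) * lam * W / Nw          # expanded along width
    y = yc + (l + 0.5) * lam * L / Nl          # expanded along length
    z = zc + (h + 0.5) * H / Nh                # height kept as in Eq. (2)
    return torch.stack([x, y, z], dim=-1)      # (Nw, Nl, Nh, 3)
```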


Cross-Stage Attention. After passing through the decoder in the Transformer, the output feature is denoted as $\hat{F} \in \mathbb{R}^{B \times N \times D}$, which can be directly connected to two Feed-Forward Networks (FFNs) for classification and bounding box regression. However, the shape of distant small objects is highly incomplete, so it is challenging to achieve accurate detection with just one refinement; the analysis can be found in the ablation results in Table 6. Based on the analysis in the introduction, we design a novel cascaded structure to address this issue. In our structure, the proposal refinement results from each stage are not only used to guide the refinement in the next stage but are also directly involved in the selection of the final high-quality boxes. In addition, we introduce a cross-stage attention module to aggregate features from different stages in order to detect such objects more accurately.

In the $j$-th stage of the cascaded structure, we denote the feature vector obtained after decoding in the Transformer as $\hat{F}^{j} \in \mathbb{R}^{B \times N \times D}$. Specifically, we collect the decoding vectors from the current stage and all previous stages, denoted as $\mathbf{F}^{j} = \{\hat{F}^{1}, \hat{F}^{2}, \ldots, \hat{F}^{j}\} \in \mathbb{R}^{BJ \times N \times D}$, where $J$ represents the current stage number. Then, we project $\mathbf{F}^{j}$ through a linear layer into the shape $B \times N \times D$ and feed it into the subsequent attention module. We then have $Q^{j} = \hat{F}^{j} W_{q}^{j}$, $K^{j} = \mathbf{F}^{j} W_{k}^{j}$, and $V^{j} = \mathbf{F}^{j} W_{v}^{j}$, where $W_{q}^{j}$, $W_{k}^{j}$, and $W_{v}^{j}$ are linear projections. $Q^{j}$, $K^{j}$, and $V^{j}$ represent the query, key, and value embeddings, respectively. In the case of $H$-head attention, the embeddings of the $h$-th head are denoted as $Q_{h}^{j}$, $K_{h}^{j}$, and $V_{h}^{j}$. Specifically, we first compute the matrix multiplication of the query and key embeddings. Then, we calculate the correlations between the channels of the feature dimension, and finally sum them up to obtain our weights. The attention value for a single head is computed according to the following formula:

$$\hat{F}_{h}^{j} = \left[\sigma\!\left(\frac{Q_{h}^{j}\left(K_{h}^{j}\right)^{T}}{\sqrt{D}}\right) \oplus \Lambda\!\left(\sigma\!\left(\frac{K_{h}^{j}\left(K_{h}^{j}\right)^{T}}{\sqrt{D}}\right)\right)\right] \cdot V_{h}^{j}, \quad h = 1, \ldots, H \tag{3}$$

where $\sigma(\cdot)$ is the softmax function, $\Lambda(\cdot)$ is a linear projection that maps weights of dimension $D/H$ to dimension $N$, $h$ represents the attention head index in multi-head attention, and $\oplus$ denotes the summation operation. We concatenate the feature $\hat{F}^{j}$ with the outputs of all attention heads to obtain the feature vector $F^{j} = \mathrm{Concat}\left(\hat{F}^{j}, \hat{F}_{1}^{j}, \hat{F}_{2}^{j}, \ldots, \hat{F}_{H}^{j}\right) \in \mathbb{R}^{B \times N \times 2D}$. Then, we use a linear layer to project the shape of $F^{j}$ into $N \times 2D$, and connect two FFNs separately for box regression and confidence prediction in this stage. Finally, we gather the predicted boxes from all stages and apply NMS to remove redundant boxes, resulting in more reliable detection outputs.
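The sketch below illustrates the cross-stage aggregation interface (current-stage decoder features as queries, features collected from all previous stages as keys/values, and a final N × 2D projection). For brevity it substitutes standard multi-head attention for the channel-correlation weighting of Eq. (3), and the module and argument names are ours, not the authors'.

```python
import torch
import torch.nn as nn

class CrossStageAggregator(nn.Module):
    """Hedged sketch of the cross-stage aggregation step: query = decoder
    feature of stage j, key/value = fused decoder features of stages 1..j."""

    def __init__(self, dim=256, heads=4):
        super().__init__()
        self.proj = nn.Linear(dim, dim)            # fuse stacked stage features
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.out = nn.Linear(2 * dim, 2 * dim)     # final N x 2D projection

    def forward(self, current, history):
        # current: (B, N, D) decoder feature of stage j
        # history: list of (B, N, D) decoder features of stages 1..j
        stacked = torch.stack(history, dim=0).mean(dim=0)   # collapse the stage axis
        kv = self.proj(stacked)
        attended, _ = self.attn(current, kv, kv)             # (B, N, D)
        fused = torch.cat([current, attended], dim=-1)       # (B, N, 2D)
        return self.out(fused)
```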

3.3   Training Losses

We employ an end-to-end training strategy to train our detector, where the overall training loss is the sum of the RPN loss $L_{RPN}$ and the CTR loss $L_{CTR}$. We follow SECOND [21] and design the loss of the RPN as a combination of the classification loss $L_{cls}$ and the bounding box regression loss $L_{reg}$, as shown in the following equation:

$$L_{RPN} = \frac{1}{N_p}\left[\sum_{i} L_{cls}\left(c_i, \hat{c}_i\right) + \sum_{i} \mathbb{I}\left(IoU_i > u\right) L_{reg}\left(\delta_i, \hat{\delta}_i\right)\right] \tag{4}$$

where $N_p$ represents the number of foreground boxes, $c_i$ and $\hat{c}_i$ represent the outputs of the classification branch and the classification labels, respectively. $\delta_i$ and $\hat{\delta}_i$ represent the outputs of the box regression branch and the regression targets, respectively. $\mathbb{I}\left(IoU_i > u\right)$ indicates that only the object proposals with $IoU_i > u$ generate regression loss. $L_{reg}$ and $L_{cls}$ are the smooth L1 loss and the binary cross-entropy loss, respectively.

Our CTR loss $L_{CTR}$ is the sum of the refinement losses over multiple stages. In each refinement stage, we utilize the box regression loss $L_{reg}$ and the classification loss $L_{cls}$, as in [9,16]. For the $i$-th proposal in the $j$-th refinement stage, we denote the outputs of the classification branch, the classification labels, the outputs of the box regression branch, and the regression targets as $c_i^{j}$, $\hat{c}_i^{j}$, $\delta_i^{j}$, and $\hat{\delta}_i^{j}$, respectively. The loss of CTR is defined as:

$$L_{CTR} = \frac{1}{N_{all}}\sum_{j}\left[\sum_{i} L_{cls}\left(c_i^{j}, \hat{c}_i^{j}\right) + \sum_{i} \mathbb{I}\left(IoU_i^{j} > u^{j}\right) L_{reg}\left(\delta_i^{j}, \hat{\delta}_i^{j}\right)\right] \tag{5}$$

where $N_{all}$ represents the sum of the number of foreground boxes in all stages, and $\mathbb{I}\left(IoU_i^{j} > u^{j}\right)$ indicates that only the object proposals with $IoU_i^{j} > u^{j}$ generate regression loss.
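For clarity, a minimal sketch of the IoU-gated loss in Eq. (4) is shown below; the tensor shapes, the foreground IoU threshold value, and the helper name are illustrative assumptions rather than details taken from the paper.

```python
import torch
import torch.nn.functional as F

def rpn_loss(cls_logits, cls_labels, box_preds, box_targets, ious, iou_thresh=0.55):
    """Sketch of Eq. (4): classification on all anchors, smooth-L1 regression
    only on proposals whose IoU exceeds the threshold u, normalized by N_p."""
    # cls_labels is assumed to be a float tensor of 0/1 labels
    cls_loss = F.binary_cross_entropy_with_logits(cls_logits, cls_labels, reduction="sum")
    fg = ious > iou_thresh                                   # indicator I(IoU_i > u)
    reg_loss = F.smooth_l1_loss(box_preds[fg], box_targets[fg], reduction="sum")
    num_fg = fg.sum().clamp(min=1).float()                   # N_p, guarded against zero
    return (cls_loss + reg_loss) / num_fg
```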

4   Experiments

4.1   Dataset

KITTI Dataset. The KITTI dataset [1] consists of 7481 frames in the training set and 7518 frames in the test set. Following previous works, we divide the training data into 3712 frames for training and 3769 frames for validation. We report the results on the validation set using 3D AP at 11 recall positions and at 40 recall positions. The IoU thresholds used for cars, pedestrians, and cyclists are 0.7, 0.5, and 0.5, respectively.

Waymo Open Dataset. The Waymo Open Dataset [13] consists of 1000 sequences, including 798 training sequences and 202 validation sequences. The official evaluation metric for 3D detection is the mean Average Precision (mAP), with an IoU threshold of 0.7 for measuring the detection performance of the vehicle category. The evaluation results are divided based on the distance from the object to the sensor, specifically 0–30 m, 30–50 m, and greater than 50 m.

Table 1. Detection results comparisons with recent methods on the KITTI val set for car detection. AP3D is calculated with 0.7 IoU threshold on 11 recall positions and 40 recall positions. The best result is displayed in red, the second best in blue, and the third best in green.

AP3D (R11)(%) Easy Mod. Hard

AP3D (R40)(%) Easy Mod. Hard

CVPR2020 AAAI2021 ICCV2021 ICCV2021 ICCV2021 AAAI2022 CVPR2022 ECCV2022 TGRS2022 CVPR2023 CVPR2023

89.41 89.54 88.44 89.04 89.52 89.88 89.80

83.90 84.52 86.06 83.14 84.04 86.57 84.93 86.58 86.97

78.93 78.99 78.61 78.68 79.18 79.38 79.28

92.57 92.38 92.85 93.15 93.33 93.21 92.04 -

84.83 85.29 85.82 86.28 86.12 86.37 85.04 -

82.69 82.86 83.46 83.86 83.29 83.93 84.31 -

Ours

89.96

87.04

79.87

93.24

86.94

84.55

Methods

Reference

PV-RCNN [16] Voxel R-CNN [9] CT3D [5] Pyramid-V [10] VoTr-TSD [11] BtcDet [15] Focals Conv [20] Graph-Vo [7] CasA+V [6] LoGoNet [19] OcTr [4] CasFormer

4.2   Implementation Details

Voxelization. After inputting the raw point cloud into our detector, the first step is to divide the point cloud into regular voxels. Since the KITTI dataset only provides object annotations within the FOV, we set the detection range as follows: X-axis [0, 70.4 m], Y-axis [−40 m, 40 m], and Z-axis [−3 m, 1 m]. Each voxel is sized (0.05 m, 0.05 m, 0.1 m). For the Waymo Open Dataset, the detection ranges for the X and Y axes are [−75.2 m, 75.2 m], and the detection range for the Z-axis is [−2 m, 4 m]. Each voxel is sized (0.1 m, 0.1 m, 0.15 m).

Training and Inference Details. Our network is optimized end-to-end using the Adam optimizer. The initial learning rate is set to 0.001, and we train the network for 100 epochs with updates performed using a cosine annealing strategy. We randomly sample 512 RoIs as training samples for the detection head. During the inference stage, we retain the top 100 proposals for cascaded refinement and apply NMS with an IoU threshold of 0.1 to filter the bounding boxes and generate the final detection results.
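These settings can be summarized in a small configuration sketch; the dictionary keys are our own naming and do not come from a released config file.

```python
# Illustrative configuration mirroring the voxelization and training settings above.
KITTI_CFG = dict(
    point_cloud_range=[0.0, -40.0, -3.0, 70.4, 40.0, 1.0],   # x/y/z min then max (m)
    voxel_size=[0.05, 0.05, 0.1],                            # metres per voxel
)
WAYMO_CFG = dict(
    point_cloud_range=[-75.2, -75.2, -2.0, 75.2, 75.2, 4.0],
    voxel_size=[0.1, 0.1, 0.15],
)
TRAIN_CFG = dict(
    optimizer="Adam", lr=1e-3, epochs=100, lr_schedule="cosine_annealing",
    roi_per_image=512,            # sampled RoIs for the detection head
)
TEST_CFG = dict(top_k_proposals=100, nms_iou_thresh=0.1)
```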

4.3   Detection Results on the KITTI

Table 1 and Table 2 present the performance comparison of our CasFormer with state-of-the-art methods on the vehicle detection task using the KITTI [1] validation and test sets. At 40 recall thresholds, CasFormer outperforms CT3D [5] by 1.12% and 1.09% on the moderate and hard levels of the validation set, respectively. Additionally, at 11 recall thresholds on the hard level of the validation set, CasFormer achieves a performance gain of 0.59% and 0.49% over OcTr [4] and CasA+V [6], respectively. Furthermore, despite the incorporation of the cascaded structure and the relatively slower-converging Transformer [3] module in our CasFormer, its inference speed remains faster than or on par with most methods under the same experimental conditions. Figure 2 showcases the recognition results of our model on the KITTI dataset. Compared to PV-RCNN [16], our model exhibits superior recognition capabilities.

Fig. 2. Illustration of the 3-D detection results on the KITTI val set. The ground truth for the vehicle and pedestrian categories is shown in green and blue, respectively, and their predicted bounding boxes are shown in red. (Color figure online)

Table 2. Detection results comparisons with recent methods on the KITTI test set for car detection. AP3D is calculated with 0.7 IoU threshold on 40 recall positions. The rules for color display of the data are the same as in Table 1.

| Methods | Reference | AP3D (R40)(%) Easy | Mod. | Hard | FPS (Hz) |
|---|---|---|---|---|---|
| PV-RCNN [16] | CVPR2020 | 90.25 | 81.43 | 76.82 | 12.5 |
| Voxel R-CNN [9] | AAAI2021 | 90.90 | 81.62 | 77.06 | 25.2 |
| CT3D [5] | ICCV2021 | 87.83 | 81.77 | 77.16 | 14.3 |
| Pyramid-V [10] | ICCV2021 | 87.06 | 81.28 | 76.85 | 9.7 |
| VoTr-TSD [11] | ICCV2021 | 89.90 | 82.09 | 79.14 | 7.2 |
| Focals Conv [20] | CVPR2022 | 90.20 | 82.12 | 77.50 | 8.9 |
| Graph-Vo [7] | ECCV2022 | 91.29 | 82.77 | 77.20 | 25.6 |
| CasA+V [6] | TGRS2022 | 91.58 | 83.06 | 80.08 | 11.6 |
| OcTr [4] | CVPR2023 | 90.88 | 82.64 | 77.77 | 15.6 |
| CasFormer | Ours | 91.65 | 82.89 | 78.43 | 14.9 |

4.4   Detection Results on the Waymo

We also conduct experiments on the Waymo Open Dataset to further validate the effectiveness of our CasFormer. Objects in the Waymo Open Dataset are divided into two levels based on the number of points per individual object. LEVEL 1 objects have more than 5 points, while LEVEL 2 objects have 1∼5 points. As shown in Table 3, CasFormer outperforms CT3D by 0.27% 3D mAP on LEVEL 1 30–50 m. And CasFormer outperforms Voxel R-CNN by 1.84% on LEVEL 2 50m-Inf, which indicates that our method can extract richer semantic features with more local context information for sparse objects at a distance.


Table 3. Detection results comparisons with recent methods on the Waymo Open Dataset with 202 validation sequences (∼40k samples) for the vehicle detection. The best result is displayed in red, the second best in blue.

| Difficulty | Method | 3D mAP(%) Overall | 0–30 m | 30–50 m | 50 m–Inf |
|---|---|---|---|---|---|
| LEVEL 1 | PointPillars [2] | 56.62 | 81.01 | 51.75 | 27.94 |
| LEVEL 1 | PV-RCNN [16] | 70.30 | 91.92 | 69.21 | 42.17 |
| LEVEL 1 | Voxel-RCNN [9] | 75.59 | 92.49 | 74.09 | 53.15 |
| LEVEL 1 | Pyramid-V [10] | 75.83 | 92.63 | 74.46 | 53.40 |
| LEVEL 1 | CT3D [5] | 76.30 | 92.51 | 75.07 | 55.36 |
| LEVEL 1 | CasFormer (Ours) | 76.13 | 92.67 | 75.34 | 55.40 |
| LEVEL 2 | PV-RCNN [16] | 65.36 | 91.58 | 65.13 | 36.46 |
| LEVEL 2 | Voxel-RCNN [9] | 66.59 | 91.74 | 67.89 | 40.80 |
| LEVEL 2 | CT3D [5] | 69.04 | 91.76 | 68.93 | 42.60 |
| LEVEL 2 | CasFormer (Ours) | 69.11 | 91.77 | 68.86 | 42.64 |

4.5   Ablation Studies

In this section, we conduct comprehensive ablation studies on CasFormer to validate the effectiveness of each individual component. All experiments were conducted on the KITTI val set, evaluating the Average Precision (AP) for the car category at 40 recall positions with an IoU threshold of 0.7. In the tables below, the best results are shown in red.

The Effects of Dynamic Voxel Pyramid. In this part, we validate the effectiveness of the DVP module. As shown in Table 4, method (a) serves as our baseline, while methods (b)–(d) add different RoI feature selection schemes on top of it. It can be observed that method (d), with the inclusion of DVP, outperforms the other two feature selection schemes. This is attributed to our DVP, which dynamically selects feature layers based on the sparsity of RoIs, thereby capturing rich contextual information around RoIs with low non-empty voxel density.

Table 4. Ablation studies for different feature selection methods. “NP”, “EP”, “DVP” stand for non-expanding pyramid, expanding pyramid, and our dynamic voxel pyramid, respectively.

| Methods | NP | EP | DVP | AP3D (R40)(%) Easy | Mod. | Hard |
|---|---|---|---|---|---|---|
| (a) | ✗ | ✗ | ✗ | 92.38 | 85.29 | 82.86 |
| (b) | ✓ | ✗ | ✗ | 92.41 | 85.42 | 82.99 |
| (c) | ✗ | ✓ | ✗ | 92.49 | 86.53 | 83.11 |
| (d) | ✗ | ✗ | ✓ | 92.55 | 85.61 | 83.23 |

The Effects of Cascaded Strategy. In this part, our experiments were conducted with three cascaded stages. As shown in Table 5, methods (a)–(c) represent different feature aggregation strategies. It can be observed that method (c) achieves a performance improvement of 0.81% and 0.84% over method (a) in the moderate and hard levels, respectively. This improvement is attributed to the capability of our CSA module to better capture the correlations between features across different stages, enabling better object bounding box regression.

Table 5. Ablation studies for different feature aggregation strategies. “DC” and “SA” represent direct concatenation and standard attention, respectively.

| Methods | DC | SA | CSA | AP3D (R40)(%) Easy | Mod. | Hard |
|---|---|---|---|---|---|---|
| (a) | ✓ | ✗ | ✗ | 92.62 | 86.13 | 83.71 |
| (b) | ✗ | ✓ | ✗ | 92.97 | 86.62 | 84.26 |
| (c) | ✗ | ✗ | ✓ | 93.24 | 86.94 | 84.55 |

The Effects of Cascaded Stages. In this part of the experiments, we test the effect of the number of cascaded stages. We employ CSA as the cascading strategy, and the experimental results are shown in Table 6. From the data in the table, it can be observed that the best performance at the moderate and hard levels is achieved with three cascaded stages, while at the easy level the best performance is obtained with two. Although the performance of these two configurations is very similar, the detector with three cascaded stages achieves excellent performance with fewer epochs. Therefore, our detector utilizes three cascaded stages.

Table 6. Ablation studies for different cascade stages on the KITTI val set.

| Stages | Epoch | AP3D (R40)(%) Easy | Mod. | Hard |
|---|---|---|---|---|
| 1 | 94 | 93.02 | 86.35 | 84.04 |
| 2 | 87 | 93.44 | 86.71 | 84.35 |
| 3 | 83 | 93.24 | 86.94 | 84.55 |
| 4 | 82 | 93.21 | 86.89 | 84.51 |

5   Conclusion

In this paper, we proposed a two-stage detector named CasFormer, which performs 3-D object detection from point clouds. In our CasFormer, we introduced a cascaded transformer refinement network for the refinement stage. Firstly, we employed a dynamic pyramid module to capture more semantic information from additional feature sources for RoIs with low non-empty voxel density. Then, a Transformer module was applied for the initial refinement. Furthermore, an attention module was used to aggregate the refinement results from different stages, generating high-quality proposals for that stage. Extensive experiments demonstrated the effectiveness of our CasFormer. However, our method still has limitations in real-time detection. In future work, we will focus on improving the computational efficiency of the detection framework.

References

1. Geiger, A., Lenz, P., Urtasun, R.: Are we ready for autonomous driving? The KITTI vision benchmark suite. In: 2012 IEEE Conference on Computer Vision and Pattern Recognition, pp. 3354–3361. IEEE (2012)
2. Lang, A.H., Vora, S., Caesar, H., Zhou, L., Yang, J., Beijbom, O.: Pointpillars: fast encoders for object detection from point clouds. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 12697–12705 (2019)
3. Vaswani, A., et al.: Attention is all you need. In: Advances in Neural Information Processing Systems, vol. 30 (2017)
4. Zhou, C., Zhang, Y., Chen, J., Huang, D.: OcTr: octree-based transformer for 3D object detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5166–5175 (2023)
5. Sheng, H., et al.: Improving 3D object detection with channel-wise transformer. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 2743–2752 (2021)
6. Wu, H., Deng, J., Wen, C., Li, X., Wang, C., Li, J.: Casa: a cascade attention network for 3-D object detection from lidar point clouds. IEEE Trans. Geosci. Remote Sens. 60, 1–11 (2022)
7. Yang, H., et al.: Graph R-CNN: towards accurate 3D object detection with semantic-decorated local graph. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds.) ECCV 2022, Part VIII. LNCS, vol. 13668, pp. 662–679. Springer, Cham (2022). https://doi.org/10.1007/978-3-031-20074-8_38
8. Zhao, H., Jiang, L., Jia, J., Torr, P.H., Koltun, V.: Point transformer. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 16259–16268 (2021)
9. Deng, J., Shi, S., Li, P., Zhou, W., Zhang, Y., Li, H.: Voxel R-CNN: towards high performance voxel-based 3D object detection. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 35, no. 2, pp. 1201–1209 (2021)
10. Mao, J., Niu, M., Bai, H., Liang, X., Xu, H., Xu, C.: Pyramid R-CNN: towards better performance and adaptability for 3D object detection. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 2723–2732 (2021)
11. Mao, J., et al.: Voxel transformer for 3D object detection. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 3164–3173 (2021)


12. Fan, L., et al.: Embracing single stride 3D object detector with sparse transformer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8458–8468 (2022)
13. Sun, P., et al.: Scalability in perception for autonomous driving: Waymo Open Dataset. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 2446–2454 (2020)
14. Sun, P., et al.: Swformer: sparse window transformer for 3D object detection in point clouds. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds.) ECCV 2022, Part X. LNCS, vol. 13670, pp. 426–442. Springer, Cham (2022). https://doi.org/10.1007/978-3-031-20080-9_25
15. Xu, Q., Zhong, Y., Neumann, U.: Behind the curtain: learning occluded shapes for 3D object detection. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, no. 3, pp. 2893–2901 (2022)
16. Shi, S., et al.: PV-RCNN: point-voxel feature set abstraction for 3D object detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 10529–10538 (2020)
17. Shi, S., Wang, X., Li, H.: Pointrcnn: 3D object proposal generation and detection from point cloud. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 770–779 (2019)
18. Yin, T., Zhou, X., Krahenbuhl, P.: Center-based 3D object detection and tracking. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 11784–11793 (2021)
19. Li, X., et al.: Logonet: towards accurate 3D object detection with local-to-global cross-modal fusion. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 17524–17534 (2023)
20. Chen, Y., Li, Y., Zhang, X., Sun, J., Jia, J.: Focal sparse convolutional networks for 3D object detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5428–5437 (2022)
21. Yan, Y., Mao, Y., Li, B.: Second: sparsely embedded convolutional detection. Sensors 18(10), 3337 (2018)
22. Cai, Z., Vasconcelos, N.: Cascade R-CNN: delving into high quality object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 6154–6162 (2018)
23. Li, Z., et al.: Bevformer: learning bird's-eye-view representation from multi-camera images via spatiotemporal transformers. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds.) ECCV 2022, Part IX. LNCS, vol. 13669, pp. 1–18. Springer, Cham (2022). https://doi.org/10.1007/978-3-031-20077-9_1

Generalizable and Accurate 6D Object Pose Estimation Network

Shouxu Fu, Xiaoning Li(B), Xiangdong Yu, Lu Cao, and Xingxing Li

College of Computer Science, Sichuan Normal University, Chengdu 610101, China [email protected]

Abstract. 6D object pose estimation is an important task in computer vision, and the task of estimating 6D object pose from a single RGB image is even more challenging. Many methods use deep learning to acquire 2D feature points from images to establish 2D-3D correspondences, and further predict 6D object pose with the Perspective-n-Points (PnP) algorithm. However, most of these methods suffer from inaccurate feature point acquisition, poor network generality, and difficulty in end-to-end training. In this paper, we design an end-to-end differentiable network for 6D object pose estimation. We propose Random Offset Distraction (ROD) and a Full Convolution Asymmetric Feature Extractor (FCAFE) with the Probabilistic Perspective-n-Points (ProPnP) algorithm to improve the accuracy and robustness of 6D object pose estimation. Experiments show that our method achieves a new state-of-the-art result on the LineMOD dataset, with an accuracy of 97.42% in the ADD(-S) metric. Our approach is also very competitive on the Occlusion LineMOD dataset.

Keywords: 6D Object Pose Estimation · Coordinates-based Method · RGB Data

1   Introduction

6D object pose estimation is a hot problem in computer vision, and many application areas need to address this key problem. To improve the safety of autonomous driving and to prevent tailgating, one needs to predict the pose of the vehicle in front. To achieve automatic gripping, a robot arm also needs to estimate the pose of the object to be gripped. 6D object pose estimation, which consists of estimating the translation and rotation of the target, typically contains predictions in 6 degrees of freedom. The input data is commonly sourced from RGB images, RGB-D images and point cloud data. Estimating 6D object pose from a single RGB image is more promising and in demand due to the ease of data acquisition and inexpensive equipment; however, it is also more challenging. Estimating 6D object pose from a single RGB image is generally regarded as a geometric problem. Many methods [10,13,14] usually define some 3D feature points on the object, then use deep learning to predict the corresponding 2D feature points in the image, and finally construct 2D-3D correspondences and use the PnP algorithm to predict the 6D pose.


Some existing methods [4,10,13,14,17] extract 2D key feature points of object bounding boxes, object edges or dense coordinates by using simple convolutional neural networks as encoders; however, it is difficult for simple convolutional neural networks to extract accurate feature points, which affects the 2D-3D correspondences and ultimately the accuracy of the PnP algorithm in estimating the object pose. In addition, the selection of feature points also affects the extraction performance of the convolutional neural network. Many works [13,14] use the RANSAC/PnP algorithm to solve for translation and rotation. However, rotation is non-linear and difficult to train, whereas translation is linear and easy to train. Treating them identically slows the convergence of translation. In addition, using the PnP algorithm to estimate 6D object pose is a two-stage procedure, and the network cannot be trained end-to-end. Some works [9,13,14,16] operate on the complete image, which increases the workload of the network and reduces its generality.

In this work, we select dense coordinates to define feature points and design the Full Convolution Asymmetric Feature Extractor (FCAFE) to extract features of the object. For FCAFE, we design a U-like fully convolutional architecture with an asymmetric encoder-decoder structure. FCAFE is able to obtain accurate coordinate maps and pixel maps, which in turn improves the accuracy of pose estimation. We decouple the pose estimation problem into translation estimation and rotation estimation. The ProPnP algorithm transforms the pose estimation problem into the prediction of a pose probability density; since the probability density function is differentiable, the network can be trained end-to-end. We therefore use the ProPnP algorithm instead of the traditional PnP algorithm for end-to-end differentiable training of the network. Our network operates on cropped images, which both simplifies processing and allows the network to adapt to existing detectors. In order to achieve better generality, we propose ROD, which works on cropped images and is able to simulate different detectors, thus improving the generality of the network and enabling it to work with different detectors.

Our contributions are summarized as follows:
(1) We propose an end-to-end differentiable 6D pose estimation network, and our method achieves new state-of-the-art results on the LineMOD dataset, with an accuracy of 97.42% in the ADD(-S) metric.
(2) We propose Random Offset Distraction (ROD) and the Full Convolution Asymmetric Feature Extractor (FCAFE) to improve the accuracy of 6D object pose estimation.
(3) Our approach is generalizable and can be used with different detectors for different application areas.

2   Related Work

2.1   Direct Methods

The direct method typically classifies 6D object pose estimation as a regression or classification task and uses deep learning to estimate the pose directly. PoseCNN


[18] decomposed the 6D pose into translation and rotation, predicting translation by direct regression of the object center coordinates and rotation by direct regression of quaternions. However, the shortcoming of PoseCNN is its poor accuracy. SSD-6D [9] converted 6D pose estimation into a classification problem and introduced a viewpoint library, estimating the 6D pose by matching against it. GDR-Net [17] established 2D-3D correspondences directly through a neural network to simulate the PnP algorithm and achieved end-to-end 6D object pose estimation.

2.2   Geometric Methods

Keypoint-Based Methods. Such methods construct 2D-3D correspondences by predicting the 2D feature points of object in the image and estimate 6D object pose using the PnP algorithm. BB8 [14] recovered 6D object pose by predicting the 2D projections of 8 border points and center of 3D bounding box, paired with the PnP algorithm, however, this method is a two-stage operation and cannot be trained end-to-end. YOLO-6D [16] detected the 2D projection of 3D bounding box and then used the PnP algorithm to recover 6D pose, which was convenient and fast but less accurate. PVNet [13] used a voting mechanism to obtain 2D keypoints with high accuracy and the ability to estimate 6D pose of obscured or truncated objects. DGECN [3] obtained semantic labels and depth maps to establish 2D-3D correspondences through convolutional neural networks. DGECN was able to train network end-to-end by proposing a learnable DG-PnP algorithm as an alternative to RANSAC/PnP. Dense Coordinate-Based Methods. Such methods recover 6D object pose by regressing object coordinate points and then constructing 2D-3D dense correspondences with the PnP algorithm. DPOD [20] proposed UV maps to establish 2D-3D correspondences in order to recover 6D object pose. CDPN [10] proposed 3D coordinate maps to achieve dense 2D-3D correspondences, and used the PnP algorithm to recover the 6D object pose. EProPnP [4] proposed the ProPnP algorithm, which introduced a probability density function to transform the traditional PnP problem into a probability problem, solving the problem that the traditional PnP algorithm is not differentiable. Similar to our method is CDPN, which also decoupled translation and rotation and used direct regression method to estimate translation. However, we design a better feature extraction network called FCAFE to generate more accurate 3-channel coordinate maps as well as two-channel weight maps. We use the EProPnP algorithm to replace the traditional PnP algorithm to enable end-toend differentiable training of network. In addition, CDPN needs to be trained in several stages, while our method only needs one stage, and the training time of our method is much less than CDPN.

3   Method

The target of our method is to predict the 3-DoF translation and the 3-DoF rotation from a single RGB image. The overview of our method is shown in Fig. 1. ROD is applied to the cropped image, and the image in the green box in the figure is the new image after adding ROD. The new image is fed into FCAFE to extract features, and the translation is predicted by direct regression of the center coordinates of the object. For rotation, we first regress the 3-channel coordinate maps $X^{3D}$ and the 2-channel weight maps $W^{2D}$, then establish 2D-3D dense correspondences, and finally use the ProPnP algorithm to predict the object pose.

Fig. 1. Overview of our method. Given an input image, the target region is first extracted from the RGB image and perturbed by ROD, then the cropped image is fed into FCAFE to extract features. The translation is predicted by direct regression of the coordinate center, and the rotation is predicted using the ProPnP algorithm by creating dense 2D-3D correspondences through the regressed coordinate map and weight map.

3.1   Random Offset Distraction

In most scenes, the target object is far away from the camera and occupies only a small part of the image. It is not easy for the network to detect small objects or to regress coordinate maps and weight maps. In addition, in order for the network to be generalizable, it needs to be paired with different detectors. However, different detectors have different performance, and it is difficult for the network to obtain effective features when the target areas extracted by different detectors differ, which leads to a reduction in the accuracy of the estimated pose. To overcome these limitations and improve the accuracy and generality of the network, we propose ROD. The image in the second dashed box on the left in Fig. 1 illustrates ROD: we randomly modify the center coordinates and size of the detection box, and the green box is the modified detection box. Concretely, given the center coordinates $C_{x,y}$ and the size $S$ of the detection box, we draw a random number from a uniform distribution and set an offset factor to control its magnitude, obtaining a random offset value. The original detection box center coordinates $C_{x,y}$ and size $S$ are added with the random offset values to get the offset $C'_{x,y}$ and $S'$, and then the image is re-cropped from the original image.

On the one hand, ROD can improve the robustness of the network and increase the accuracy of 6D object pose estimation. On the other hand, this operation can compensate for the errors caused by different detector performance, which increases the universality of the network and enables it to be used with different detectors in a variety of scenes.
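A minimal sketch of the ROD perturbation is given below, assuming a square crop described by its centre and side length; the default offset factor and the exact sampling ranges are illustrative, since the paper only specifies a uniform distribution controlled by an offset factor.

```python
import random

def random_offset_distraction(cx, cy, size, offset_factor=0.1):
    """Jitter the detection-box centre and size with uniform noise scaled by an
    offset factor before re-cropping the image (sketch of ROD)."""
    dx = random.uniform(-offset_factor, offset_factor) * size
    dy = random.uniform(-offset_factor, offset_factor) * size
    ds = random.uniform(-offset_factor, offset_factor) * size
    new_cx, new_cy = cx + dx, cy + dy
    new_size = max(size + ds, 1.0)       # keep the crop non-degenerate
    return new_cx, new_cy, new_size      # used to re-crop the original image
```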

Fig. 2. Full Convolution Asymmetric Feature Extractor.

3.2   Full Convolution Asymmetric Feature Extractor

We propose the Full Convolution Asymmetric Feature Extractor for feature extraction, as shown in the dashed FCAFE box in Fig. 1, which consists of three parts: the Backbone module, the T module and the Rot module. Figure 2 shows the details of FCAFE. We design FCAFE as a U-like fully convolutional architecture with an asymmetric encoder-decoder structure. This asymmetric structure facilitates the generation of feature maps on the one hand and reduces the complexity of the network on the other. The Backbone module consists of four groups of ConvNeXt blocks, each outputting feature maps at a different scale. ConvNeXt [11] has better performance than ResNet [5], DenseNet [7] and other convolutional neural networks in many upstream tasks, which makes it suitable for 6D object pose estimation. The Rot module consists of three groups of ConvTrans-BN-GELU, which are used to generate the 3-channel coordinate maps, 2-channel weight maps and scaling factors. In order to enhance the feature extraction capability of FCAFE, we output four sizes of feature maps in the Backbone module and fuse them into the decoder using skip connections, so that the 3-channel coordinate maps $X^{3D}$ and 2-channel weight maps $W^{2D}$ regressed by the Rot module contain richer low-level information. The T module consists of three groups of Conv-BN-GELU for predicting the translation of the target object. The 3-channel coordinate maps, 2-channel weight maps, scaling factors and translation of the object acquired by FCAFE are used in the Pose Estimation module for further processing.
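To make the three-branch layout concrete, the following is a very rough skeleton with the same output interface; plain convolution stages stand in for the ConvNeXt blocks, and the skip connections are omitted, so this is only a structural sketch under our own assumptions, not the authors' network.

```python
import torch
import torch.nn as nn

class FCAFESketch(nn.Module):
    """Rough skeleton of FCAFE: a backbone, a Rot head producing the 3-channel
    coordinate map, 2-channel weight map and scale, and a T head for translation."""

    def __init__(self, in_ch=3, width=64):
        super().__init__()
        self.backbone = nn.Sequential(                         # 4 downsampling stages
            *[nn.Sequential(nn.Conv2d(in_ch if i == 0 else width, width, 3, 2, 1),
                            nn.BatchNorm2d(width), nn.GELU()) for i in range(4)]
        )
        self.rot_head = nn.Sequential(                          # Rot module (decoder)
            *[nn.Sequential(nn.ConvTranspose2d(width, width, 4, 2, 1),
                            nn.BatchNorm2d(width), nn.GELU()) for _ in range(3)],
            nn.Conv2d(width, 3 + 2 + 1, 1)                      # coords, weights, scale
        )
        self.t_head = nn.Sequential(                            # T module
            nn.Conv2d(width, width, 3, 2, 1), nn.BatchNorm2d(width), nn.GELU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(width, 3)
        )

    def forward(self, x):
        feat = self.backbone(x)
        dense = self.rot_head(feat)
        coord_map, weight_map, scale = dense[:, :3], dense[:, 3:5], dense[:, 5:]
        translation = self.t_head(feat)                         # (dx, dy, z)
        # softmax over the weight channels is our assumption of where the
        # normalization mentioned in the text is applied
        return coord_map, weight_map.softmax(dim=1), scale, translation
```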

3.3   Pose Estimation

We decouple 6D pose estimation into translation estimation and rotation estimation. For translation, we predict the object center coordinates based on the cropped image. Concretely, FCAFE takes the detection box center coordinates $C_{x,y}$ and predicts the relative offset $(\Delta x, \Delta y)$ between the object center and the detection box center. The 2-dimensional translation predicted by FCAFE consists of the predicted offset added to the center coordinates of the detection box. For depth, the z-axis coordinate is predicted by the T module.

For rotation, we create 2D-3D dense correspondences and predict it using the ProPnP algorithm by regressing coordinate maps. Concretely, FCAFE in Fig. 1 outputs the 3-channel coordinate maps $X^{3D}$, the 2-channel weight maps $W^{2D}$ and the global weight scaling factors Scale. The 3-channel coordinate maps $X^{3D}$ represent the 3-dimensional coordinates of the object in the world coordinate system, and the 2-channel weight maps $W^{2D}$ represent the 2-dimensional pixel coordinates of the object with weights. The 2-channel weight maps $W^{2D}$ are passed through a Softmax to enhance the stability of training. Scale represents the global scale of the predicted pose distribution, ensuring better convergence of the KL loss. In order for the network to be differentiable and trained end-to-end, our method establishes 2D-3D dense correspondences and predicts the pose with the ProPnP algorithm.

Finally, we fuse the predicted data in the M module and generate the final pose. We compare the translation predicted directly by the T module with the translation generated additionally by the ProPnP algorithm and score them. We select the one with the highest score as the final 3-DoF translation, and then merge the pose matrix to output the 6D object pose. In practice, decoupling the translation estimation works better when the amount of data is small and the number of training epochs is low; in contrast, when the data volume is large, the number of training epochs is high and the ProPnP algorithm is used, it is better not to decouple.
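The translation branch described above can be sketched as follows; the back-projection step with the camera intrinsics is our assumption of the standard recipe for turning a 2-D centre and depth into a 3-D translation, and is not spelled out in the paper.

```python
def recover_translation(box_center, box_scale, pred_offset, pred_z, cam_K):
    """Sketch of the decoupled translation recovery: the 2-D object centre is
    the detection-box centre plus the predicted offset (rescaled to the full
    image), then back-projected with the predicted depth."""
    cx = box_center[0] + pred_offset[0] * box_scale
    cy = box_center[1] + pred_offset[1] * box_scale
    fx, fy = cam_K[0][0], cam_K[1][1]
    px, py = cam_K[0][2], cam_K[1][2]
    x = (cx - px) * pred_z / fx          # pinhole back-projection
    y = (cy - py) * pred_z / fy
    return [x, y, pred_z]
```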

3.4   Loss Functions

We design three loss functions for the translation, the 3-channel coordinate maps $X^{3D}$ and the ProPnP algorithm, respectively. The translation is predicted by regressing coordinates, so we use the $\ell_2$ loss for translation, and the $L_{trans}$ loss function is shown as follows:

$$L_{trans} = \ell_2\left(Pose_{gt}, Pose_{es}\right) \tag{1}$$

where $Pose_{gt}$ represents the ground-truth translation and $Pose_{es}$ represents the predicted translation. We use the Smooth $\ell_1$ loss for the 3-channel coordinate maps $X^{3D}$. In order to reduce the complexity of the network, we calculate the loss only for the effective region of the object and ignore the insignificant background region. The effective region of the object is determined by the mask, and the loss function of the 3-channel coordinate maps $X^{3D}$ is shown as follows:


$$L_{rot} = \mathrm{Smooth}_{\ell_1}\left(M_{msk} \cdot M_{coor_{gt}},\; M_{msk} \cdot M_{coor_{es}}\right) \tag{2}$$

where $M_{msk}$ represents the mask of the object, $M_{coor_{gt}}$ represents the ground-truth 3-channel coordinate maps, $M_{coor_{es}}$ represents the predicted 3-channel coordinate maps, and $\cdot$ represents the Hadamard product.

The KL divergence loss is used as the loss of the ProPnP algorithm, which minimizes the training loss by obtaining the probability density of the pose distribution. The loss function is shown as follows:

$$L_{ProPnP} = P + \log \frac{1}{K} \sum_{j=1}^{K} \frac{\exp\left(-\frac{1}{2}\sum_{i=1}^{N}\left\|f_i\left(y_j\right)\right\|^2\right)}{q\left(y_j\right)} \tag{3}$$

where $P = \frac{1}{2}\sum_{i=1}^{N}\left\|f_i\left(y_{gt}\right)\right\|^2$ represents the difference between the ground-truth values and the predicted values of the 2D feature points projected into the 2D image, and $N$ represents the number of feature points. The latter term is the Monte Carlo loss, which is used to approximate the probability distribution of the predicted pose, where $q\left(y_j\right)$ denotes the probability density function of the integrand $\exp\left(-\frac{1}{2}\sum_{i=1}^{N}\left\|f_i\left(y_j\right)\right\|^2\right)$ and $K$ denotes the number of samples. The total loss function is shown as follows:

$$Loss = \alpha \cdot L_{ProPnP} + \beta \cdot L_{trans} + \gamma \cdot L_{rot} \tag{4}$$

where $\alpha$, $\beta$ and $\gamma$ are hyperparameters, with $\alpha = 1$, $\beta = 1$ and $\gamma = 0.02$. $L_{ProPnP}$ is the loss function of the ProPnP algorithm, $L_{trans}$ is the translation estimation loss function and $L_{rot}$ is the loss function of the 3-channel coordinate maps $X^{3D}$.
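A hedged sketch of the Monte Carlo form of Eq. (3) is given below; the residual/sample tensor layout is our own convention, and the sampling of the poses y_j and the definition of the reprojection residuals f_i follow EPro-PnP [4].

```python
import math
import torch

def monte_carlo_kl_loss(f_gt, f_samples, log_q):
    """Sketch of Eq. (3).

    f_gt:      (N, 2) reprojection residuals f_i(y_gt) at the ground-truth pose
    f_samples: (K, N, 2) residuals f_i(y_j) at K poses drawn from the proposal
    log_q:     (K,) log proposal densities log q(y_j)
    """
    p_term = 0.5 * (f_gt ** 2).sum()                            # 1/2 sum ||f_i(y_gt)||^2
    log_w = -0.5 * (f_samples ** 2).sum(dim=(1, 2)) - log_q     # log importance weights
    mc_term = torch.logsumexp(log_w, dim=0) - math.log(f_samples.shape[0])  # log(1/K sum ...)
    return p_term + mc_term
```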

4   Experiments

4.1   Datasets

We conduct experiments on the LineMOD [6] and Occlusion LineMOD dataset [1]. The LineMOD dataset contains 13 RGB image sequences, and each sequence has about 1.2k images. 15% of the images from each sequence are randomly selected for training and 85% of the images are used for testing. Due to the small amount of real data, we synthesize some images using the training set of the VOC2012 dataset as background. Therefore, we divide the LineMOD dataset into a real dataset and a real+synthetic dataset based on whether synthetic images are added or not. The real dataset has 2.3k images for training and 13k images for testing, and the real + synthetic dataset has 15k images for training and 13k images for testing. The Occlusion LineMOD dataset contains 8 objects, each of which has about 1.2k scenes with cluttered backgrounds and objects that are obscured or truncated. We use the Occlusion LineMOD dataset for testing only.

4.2   Metrics

In this paper, we use three common metrics: n° n cm, ADD(-S) and 2D Projection.

n° n cm. The n° n cm metric evaluates whether the rotation and translation errors between the predicted pose and the ground-truth pose are below the given thresholds (usually 5° and 5 cm). If they are below the thresholds, the predicted pose is considered correct; otherwise it is considered a failure.

ADD(-S). ADD determines whether the average distance between the corresponding feature points on the object surface, transformed by the predicted pose matrix [R'|t'] and the ground-truth pose matrix [R|t], is less than a threshold relative to the object diameter (commonly 10%). If it is less than the threshold, the predicted pose is considered correct; otherwise it is considered a failure. For symmetric objects, the ADD-S metric is used, and the average distance is calculated based on the nearest point distance.

2D Projection. The 2D Projection metric compares the average pixel error of the 3-dimensional model projected onto the 2-dimensional plane. If the error does not exceed the threshold (usually 5 pixels), the predicted pose is considered correct; otherwise it is considered a failure.
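As a concrete reference, the ADD check can be written as below (for symmetric objects, ADD-S replaces the point-wise pairing with a nearest-neighbour search, which is omitted here); the array shapes are the usual (N, 3) model points with a 3 × 3 rotation and a 3-vector translation.

```python
import numpy as np

def add_metric(model_pts, R_gt, t_gt, R_pred, t_pred, diameter, thresh=0.1):
    """ADD: mean distance between model points transformed by the predicted and
    ground-truth poses, accepted if below thresh x object diameter (10% default)."""
    pts_gt = model_pts @ R_gt.T + t_gt
    pts_pred = model_pts @ R_pred.T + t_pred
    mean_dist = np.linalg.norm(pts_gt - pts_pred, axis=1).mean()
    return mean_dist, mean_dist < thresh * diameter
```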

Fig. 3. Visualization results of 6D object pose estimation.

4.3   Results

We conduct experiments on the LineMOD real dataset, the real+synthetic dataset and the Occlusion LineMOD dataset. Table 1 compares our method with other RGB-based methods under different metrics. The first column of Table 1


represents the different methods, and the other three columns represent the accuracy of each method for estimating 6D object pose under the various metrics, with larger values representing higher accuracy and bold meaning that the method is the best under the current metric. Table 1 shows that our method outperforms the other RGB-based methods under different metrics. Figure 3 shows the visualization of the 6D object pose estimation results, where the first column represents the ground-truth pose and the second column represents the predicted pose. Comparing the two columns shows that our method accurately predicts the 6D object pose. $W^{2D}$ represents the effective area of the object and Coordinates3D represents the 3-dimensional coordinates of the object in the world coordinate system. The three columns on the right side of the figure represent the X-axis, Y-axis and Z-axis coordinate values of the object, respectively, and different colors represent different coordinate values.

Table 1. Comparison with other RGB-based methods using commonly used metrics on the real + synthetic dataset. “-” means that this method has not been evaluated on this metric.

| Method | 5°5 cm | ADD(-S)(0.1d) | 2D Projection(5px) |
|---|---|---|---|
| BB8 [14] | 69.00 | 62.70 | 89.30 |
| PVNet [13] | - | 86.27 | 99.00 |
| SSD6D [9] | - | 79.00 | - |
| PoseCNN [18] | 19.40 | 62.70 | 70.20 |
| CDPN [10] | 94.31 | 89.86 | 98.10 |
| Yolo-6D [16] | - | 55.95 | 90.37 |
| DPOD [20] | - | 95.15 | - |
| GDRNet [17] | - | 93.70 | - |
| Pix2Pose [12] | - | 72.40 | - |
| HybridPose [15] | - | 91.30 | - |
| RePose [8] | - | 96.10 | - |
| Efficientpose [2] | - | 97.35 | - |
| RNNPose [19] | - | 97.37 | - |
| EproPnP [4] | 98.54 | 95.80 | 98.78 |
| Ours(160epoch) | 98.37 | 95.30 | 99.24 |
| Ours(450epoch) | 99.38 | 97.42 | 99.43 |

Table 2. Comparison with other methods using commonly used metrics on the Occlusion LineMOD dataset.

| Occlusion LineMOD | YOLO6D [16] | PoseCNN [18] | PVNet [13] | Ours |
|---|---|---|---|---|
| ADD(-S)(0.1) | 6.42 | 24.9 | 40.77 | 37.88 |
| 2D Projection(5px) | 6.16 | 17.2 | 61.06 | 59.01 |

We also conducted experiments on the Occlusion LineMOD dataset using two metrics, and as can be seen in Table 2, our approach is competitive.

Fig. 4. Comparison with and without ROD. The blue and orange lines represent the results of real + synthetic dataset, and the red and green lines represent the results of real dataset. W/ means ROD is added, and w/o means it is not added. (Color figure online)

4.4   Ablation Study

Ablation Study on FCAFE. Adequate experiments are conducted to evaluate the role of the FCAFE proposed in our method. Table 3 shows the results of our experiments using multiple metrics on the real dataset and the real + synthetic dataset, with the control group using a ResNet34 network. It can be seen from Table 3 that, under a variety of metrics on both datasets, FCAFE improves the accuracy of 6D object pose estimation compared to ResNet34 in all cases.

Table 3. Comparison with and without FCAFE on the real dataset and the real+synthetic dataset. W/ represents using FCAFE and w/o represents the control group.

| Metrics | 0.02d | 0.05d | 0.1d | 2px | 5px |
|---|---|---|---|---|---|
| Real+w/o | 16.17 | 52.65 | 82.11 | 66.77 | 95.93 |
| Real+w/ | 21.87 | 60.18 | 85.11 | 74.49 | 97.39 |
| Real+Synthetic+w/o | 33.07 | 73.06 | 92.58 | 86.75 | 98.78 |
| Real+Synthetic+w/ | 43.52 | 80.69 | 95.30 | 91.99 | 99.24 |

S. Fu et al.

Ablation Study on ROD. We conduct adequate experiments to verify the role of ROD. Figure 4 shows the results of the visualisation with and without ROD. We can know that the accuracy of 6D object pose estimation is improved by adding ROD, and the improvement is larger when there is less data. Table 4 shows the effect of ROD using different offset factors on the results during the testing stage. We use Faster RCNN to acquire the target region and apply ROD. From the table, we know that adding ROD in the testing stage can improve the accuracy of 6D object pose by compensating for the detector errors. In addition, the results obtained after the detection box is acted by different offset factors can be considered as the results obtained by different detectors. From Table 4, it can be seen that the accuracy of 6D object pose estimation can reach more than 90%, which proves the outstanding performance of our method using detectors with different performances. Table 4. Simulated results for different detectors on real+synthetic dataset. offset factors 0 5◦ 5cm

5

1%

2%

5%

10%

15%

99.38 99.38 99.46 99.40 98.20 91.65

Conclusion

In this paper, we develop an end-to-end differentiable network for 6D object pose estimation. We propose Random Offset Distraction and Full Convolution Asymmetric Feature Extractor with the ProPnP algorithm to improve the accuracy and generality of 6D object pose estimation. Our approach exceeds other state-of-the-art RGB-based methods on different metrics and can be adapted with different detectors to suit a wide range of scenarios.

References 1. Brachmann, E., Krull, A., Michel, F., Gumhold, S., Shotton, J., Rother, C.: Learning 6D object pose estimation using 3D object coordinates. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8690, pp. 536–551. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-10605-2 35 2. Bukschat, Y., Vetter, M.: Efficientpose: an efficient, accurate and scalable end-toend 6D multi object pose estimation approach. arXiv preprint arXiv:2011.04307 (2020) 3. Cao, T., Luo, F., Fu, Y., Zhang, W., Zheng, S., Xiao, C.: DGECN: a depth-guided edge convolutional network for end-to-end 6d pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3783–3792 (2022)

Generalizable and Accurate 6D Object Pose Estimation Network

323

4. Chen, H., Wang, P., Wang, F., Tian, W., Xiong, L., Li, H.: EPro-PnP: generalized end-to-end probabilistic perspective-n-points for monocular object pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 2781–2790 (2022) 5. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016) 6. Hinterstoisser, S., et al.: Model based training, detection and pose estimation of texture-less 3D objects in heavily cluttered scenes. In: Lee, K.M., Matsushita, Y., Rehg, J.M., Hu, Z. (eds.) ACCV 2012. LNCS, vol. 7724, pp. 548–562. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-37331-2 42 7. Huang, G., Liu, Z., Van Der Maaten, L., Weinberger, K.Q.: Densely connected convolutional networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4700–4708 (2017) 8. Iwase, S., Liu, X., Khirodkar, R., Yokota, R., Kitani, K.M.: Repose: fast 6D object pose refinement via deep texture rendering. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 3303–3312 (2021) 9. Kehl, W., Manhardt, F., Tombari, F., Ilic, S., Navab, N.: SSD-6D: making RGBbased 3D detection and 6D pose estimation great again. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 1521–1529 (2017) 10. Li, Z., Wang, G., Ji, X.: CDPN: coordinates-based disentangled pose network for real-time RGB-based 6-DoF object pose estimation. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 7678–7687 (2019) 11. Liu, Z., Mao, H., Wu, C.Y., Feichtenhofer, C., Darrell, T., Xie, S.: A convnet for the 2020s. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 11976–11986 (2022) 12. Park, K., Patten, T., Vincze, M.: Pix2pose: pixel-wise coordinate regression of objects for 6D pose estimation. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 7668–7677 (2019) 13. Peng, S., Liu, Y., Huang, Q., Zhou, X., Bao, H.: PVNet: pixel-wise voting network for 6dof pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 4561–4570 (2019) 14. Rad, M., Lepetit, V.: BB8: a scalable, accurate, robust to partial occlusion method for predicting the 3D poses of challenging objects without using depth. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 3828–3836 (2017) 15. Song, C., Song, J., Huang, Q.: Hybridpose: 6D object pose estimation under hybrid representations. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 431–440 (2020) 16. Tekin, B., Sinha, S.N., Fua, P.: Real-time seamless single shot 6D object pose prediction. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 292–301 (2018) 17. Wang, G., Manhardt, F., Tombari, F., Ji, X.: GDR-Net: geometry-guided direct regression network for monocular 6D object pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 16611– 16621 (2021) 18. Xiang, Y., Schmidt, T., Narayanan, V., Fox, D.: PoseCNN: a convolutional neural network for 6D object pose estimation in cluttered scenes. arXiv preprint arXiv:1711.00199 (2017)


19. Xu, Y., Lin, K.Y., Zhang, G., Wang, X., Li, H.: Rnnpose: recurrent 6-DoF object pose refinement with robust correspondence field estimation and pose optimization. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 14880–14890 (2022) 20. Zakharov, S., Shugurov, I., Ilic, S.: DPOD: 6D pose object detector and refiner. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 1941–1950 (2019)

An Internal-External Constrained Distillation Framework for Continual Semantic Segmentation

Qingsen Yan1, Shengqiang Liu1, Xing Zhang2, Yu Zhu1, Jinqiu Sun1, and Yanning Zhang1(B)

1 Northwestern Polytechnical University, Xi'an, China
[email protected]
2 Xi'an University of Architecture and Technology, Xi'an, China

Abstract. Deep neural networks have a notorious catastrophic forgetting problem when training serialized tasks on image semantic segmentation. It refers to the phenomenon of forgetting previously learned knowledge due to the plasticity-stability dilemma and background shift in the segmentation task. Continual Semantic Segmentation (CSS) comes into being to handle this challenge. Previous distillation-based methods only consider the knowledge of the features at the same level but neglect the relationship between different levels. To alleviate this problem, in this paper, we propose a mixed distillation framework called Internal-external Constrained Distillation (ICD), which includes multi-information-based internal feature distillation and attention-based external feature distillation. Specifically, we utilize the statistical information of features to perform internal distillation between the old model and the new model, which effectively avoids interference at the same scale. Furthermore, for the external distillation of features at different scales, we employ multi-scale convolutional attention to capture the relationships among features of different scales and ensure their consistency across old and new tasks. We evaluate our method on standard semantic segmentation datasets, such as Pascal-VOC2012 and ADE20K, and demonstrate significant performance improvements in various scenarios.

Keywords: Continual Learning · Semantic Segmentation · Feature Distillation

1 Introduction

Semantic segmentation is a fundamental problem in computer vision, where the objective is to assign a label to every pixel in an image. In recent years, this field has witnessed remarkable progress, driven by the advent of deep neural networks and the availability of extensive human-annotated datasets [8,32]. The deep learning-based models [1,14–16,23] typically require prior knowledge of all classes in the datasets and utilize all available data at once during training. However, such a setup is disconnected from practice: the model should have the ability to continuously learn in realistic scenarios and acquire new

knowledge without having to be retrained from scratch. To address this need, a novel paradigm called Continual Semantic Segmentation (CSS) has emerged, wherein the model can be continuously updated to accommodate the addition of new classes and their corresponding labels. Inevitably, previous methods face the challenge of catastrophic forgetting [9], which refers to the phenomenon of forgetting previously learned knowledge. Moreover, the background represents pixels that have not been assigned to any specific object class in the traditional semantic segmentation task. It shows in CSS that the background may contain both future classes not yet seen by the model, as well as old classes. Therefore, the phenomenon of background shift [2], where future classes or old classes are assigned to background classes, would further exacerbate the problem of catastrophic forgetting. To address the issue of catastrophic forgetting, many methods [3,18–21,24, 29] have been proposed in the field of semantic segmentation. Prototype-based methods [20] in semantic segmentation utilize prototype matching to enforce consistency constraints in the latent space representation, aiming to address the issue of forgetting. Meanwhile, other approaches [24,29] leverage self-training and model adaptation to mitigate this problem. However, there is still room for improvement in the performance of these methods. Replay-based methods [3,18], which require storing image data or object data, offer an effective solution to mitigate catastrophic forgetting. Nevertheless, this approach comes with the additional overhead of storage requirements. Distillation-based methods [19,21] are popular in semantic segmentation, which constrain the model to respond consistently with the old model by constructing some distillation losses, with PLOP [7] being a representative method in this field. Specifically, PLOP proposes a multi-scale pooling distillation scheme that preserves long- and short-range spatial relationships at the feature level. However, it has limitations as it only considers the mean information of features in distillation and also does not take into account the relationships between features of different levels. Aiming to solve the above problems, we propose a mixed distillation framework called Internal-External Constrained Distillation (ICD), which includes multi-information-based internal feature distillation and attention-based extrinsic feature distillation at different scales to alleviate catastrophic forgetting in CSS. First, we utilize the statistical information of features to perform internal distillation at the same scale between the old model and the new model. Second, for the external distillation of features at different scales, we employ multi-scale convolutional attention to capture the relationships among features of different scales and ensure their consistency across old and new tasks. Finally, we evaluate our method on two benchmark datasets of semantic segmentation datasets, such as Pascal-VOC2012 [8] and ADE20K [32], and demonstrate significant performance improvements in various scenarios. Our contributions can be summarized as follows: • We propose a novel mixed distillation framework called Internal-external Constrained Distillation (ICD), which efficiently addresses catastrophic forgetting

through multi-scale internal feature distillation and extrinsic feature distillation at different scales.
• We introduce a method called multi-information-based internal feature distillation, which utilizes the statistical information of features to perform distillation between the old model and the new model.
• We propose a novel attention-based extrinsic feature distillation at different scales, employing multi-scale convolutional attention to capture the relationships among features of different scales and perform distillation, thereby mitigating catastrophic forgetting.

2 Related Work

Continual Learning. Deep neural networks suffer from catastrophic forgetting when trained on new data, leading to substantial performance degradation. To solve this problem, many works have been proposed; according to [6], they can be grouped into three categories: regularization-based methods [4,12,13,30], replay-based methods [22,27], and architecture-based methods [17,26]. Regularization-based methods mitigate catastrophic forgetting by introducing additional regularization terms that constrain the updates of features or parameters when training the current task. Replay-based methods usually need to store samples in their raw format or generate pseudo-samples using a generative model, which are replayed during the learning process of new tasks. Architecture-based methods dynamically grow new branches for new tasks while freezing the parameters learned for previous tasks.

Semantic Segmentation. Image semantic segmentation [5,16] has achieved remarkable improvements with the advance of deep neural networks. The majority of deep learning-based segmentation methods adopt an encoder-decoder architecture [1,23], which is the most common method for retaining spatial information. Fully Convolutional Networks (FCN) [16] is the first work to use solely convolutional layers for semantic segmentation; it can handle input images of arbitrary size and produce segmentation maps of the same size, but it lacks global context information. Recent works have employed various techniques to exploit contextual information, including attention mechanisms [10,25] and fixed-scale aggregation [5,31].

Continual Semantic Segmentation. Despite significant advancements in the two aforementioned areas, continual semantic segmentation remains an urgent problem to solve. Recently, standard datasets [8,32] and general methods [19] for semantic segmentation have been proposed. Among these, distillation-based methods are popular in continual semantic segmentation. Cermelli et al. [2] study the distribution shift of the background class when it includes previous or future classes, indicating that the problem of catastrophic forgetting in segmentation is exacerbated by this background shift. To address it, they propose using an output-level knowledge distillation loss to mitigate forgetting. In [19,21], the authors employ knowledge distillation techniques and simultaneously freeze the weights of the encoder to address the issue of forgetting. PLOP [7] applies a knowledge distillation strategy to intermediate layers. However, it only distills the mean of features, disregarding other information such as the extrema, and does not consider the relationship between features in a model.

3 Method

3.1 Problem Definition and Notation

We denote X as the input space (i.e., the image space) and, without loss of generality, assume that each image x ∈ X is composed of a set of pixels I with a constant cardinality |I| = H × W = N. The output space is defined as Y^N, and the classes of all tasks are collectively defined as C^{0:T}. In the Continual Semantic Segmentation setting, training is realized over multiple learning steps. At each learning step t, the ground truth only includes the labels of the current classes C^t, while all other labels (such as old classes C^{1:t−1} or future classes C^{t+1:T}) are combined into the background class C^0. Ideally, the model at step t should be able to predict all the classes C^{1:t} seen over time. Typically, the semantic segmentation model at step t can be divided into the composition of a feature extractor f^t(·) and a classifier g^t(·). Features can be extracted at any layer l of the feature extractor, denoted f_l^t(·), l ∈ {1, ..., L}. Finally, we denote Y_t = g^t ∘ f^t(I) as the predicted segmentation mask, and Θ^t represents the set of learnable parameters of the current network at step t.
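To make the background shift in this notation concrete, the following is a minimal, assumed sketch (not the authors' code) of how a ground-truth mask is remapped at step t so that every label outside C^t collapses into the background class; class index 0 is taken to be the background.

```python
import numpy as np

def remap_labels_for_step(gt_mask: np.ndarray, current_classes: set) -> np.ndarray:
    """Collapse every label outside the current class set C^t into background (0).

    This mirrors the background shift described in Sect. 3.1: at step t the
    annotation only exposes C^t, while old (C^{1:t-1}) and future (C^{t+1:T})
    classes all appear as background.
    """
    remapped = np.zeros_like(gt_mask)
    for c in current_classes:
        remapped[gt_mask == c] = c
    return remapped

# toy example: at step t only classes {16, 17} are annotated
gt = np.array([[1, 16], [17, 20]])
print(remap_labels_for_step(gt, {16, 17}))  # [[ 0 16] [17  0]]
```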

3.2 Internal-External Constrained Distillation Framework

The overview of the proposed mixed distillation framework ICD is shown in Fig. 1. The model is an encoder-decoder-based semantic segmentation network, including a multi-information-based internal feature distillation module and an attention-based extrinsic feature distillation module. The former is a distillation scheme based on intermediate features, matching statistics at the same feature levels between the old and current models. The latter captures the relationships among features of different scales within a single task using multi-scale convolutional attention and performs distillation of attention maps that contain information about external feature relations. Both modules are illustrated in Fig. 2. Given an image, the semantic segmentation network outputs a prediction map and computes the cross-entropy loss (Lce). The training loss is defined as a combination of the cross-entropy loss and the losses from the proposed internal feature distillation (Lint) and extrinsic feature distillation (Lext) modules. Specifically, the training loss is defined as:

Ltot = Lce + Lint + Lext   (1)
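As a rough illustration of how the three terms in Eq. (1) fit together during training, the sketch below combines them with a frozen copy of the old model. The model interfaces, the helper names `internal_distill`/`external_distill`, and the ignore index are assumptions for illustration, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def training_step(new_model, old_model, images, labels,
                  internal_distill, external_distill):
    """One optimization objective: L_tot = L_ce + L_int + L_ext (Eq. 1)."""
    old_model.eval()
    with torch.no_grad():                                # the old model is frozen
        old_feats, _ = old_model(images)                 # assumed to return (features, logits)
    new_feats, logits = new_model(images)

    l_ce = F.cross_entropy(logits, labels, ignore_index=255)
    l_int = internal_distill(new_feats, old_feats)       # same-level statistics (Sect. 3.3)
    l_ext = external_distill(new_feats, old_feats)       # cross-level attention maps (Sect. 3.4)
    return l_ce + l_int + l_ext
```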

3.3 Multi-information-based Internal Feature Distillation Module

Fig. 1. Overview of the proposed approach, with the old model and the current task model. We perform multi-information-based internal feature distillation and attention-based extrinsic feature distillation between these two models. The old model is frozen and outputs pseudo-labeling, which is combined with ground truth to solve the background shift.

An existing distillation scheme, called PLOP [7], addresses this issue by introducing pooling along the width or height dimensions and matching global statistics at the same feature levels between the old and current models. Additionally, PLOP computes width- and height-pooled slices on multiple regions extracted at different scales to preserve local information. However, PLOP only uses the mean of the feature in distillation and ignores the maximum and minimum values, which may contain important information. Therefore, we propose a multi-information-based internal feature distillation scheme. Let x denote an embedding tensor of size H × W × C. To extract an embedding Φ, it is necessary to concatenate the H × C width-mixed pooling slices and the W × C height-mixed pooling slices of x:

Φ(x) = concat( (1/W) Σ_{w=1}^{W} x[:, w, :],  (1/H) Σ_{h=1}^{H} x[h, :, :],  Ω(x[:, w, :]),  Ω(x[h, :, :]) )   (2)

where concat(·) denotes concatenation over the channel axis. Ω(x) is the operation of finding the extremum values and combining them based on their corresponding dimensions. Specifically, Ω(x[:, w, :]) can be defined as:

Ω(x[:, w, :]) = concat(max x[:, w, :], min x[:, w, :])   (3)
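The sketch below is one plausible reading of Eqs. (2)-(3) (an illustration, not the authors' code): per-row and per-column means are collected together with the corresponding max/min slices; the exact concatenation axes are an assumption.

```python
import torch

def omega(x: torch.Tensor, dim: int) -> torch.Tensor:
    """Ω(.) from Eq. (3): concatenate per-slice max and min along `dim`."""
    return torch.cat([x.max(dim=dim).values, x.min(dim=dim).values], dim=-1)

def phi(x: torch.Tensor) -> torch.Tensor:
    """Eq. (2): width- and height-mixed pooling slices of an H x W x C tensor."""
    width_mean = x.mean(dim=1)            # H x C  (average over W)
    height_mean = x.mean(dim=0)           # W x C  (average over H)
    width_ext = omega(x, dim=1)           # H x 2C (max/min over W)
    height_ext = omega(x, dim=0)          # W x 2C (max/min over H)
    return torch.cat([
        torch.cat([width_mean, width_ext], dim=-1),    # H x 3C
        torch.cat([height_mean, height_ext], dim=-1),  # W x 3C
    ], dim=0)                                          # (H + W) x 3C

x = torch.randn(8, 6, 4)   # H=8, W=6, C=4
print(phi(x).shape)        # torch.Size([14, 12])
```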

Fig. 2. Illustration of multi-information-based internal feature distillation and attention-based extrinsic feature distillation. The former calculates width-mixed and height-mixed pooling to perform feature distillation at the same level. The latter employs multi-scale convolutional attention (MSCA) to capture the relationships among features of different scales and ensure their consistency across old and new tasks. ⊕ is the concat operation and ⊗ is the matrix multiplication.

Its ability to model long-range dependencies across the entire horizontal or vertical axis is beneficial. However, modeling statistics across the entire width or height can blur the local statistics that are important for smaller objects. Hence, to retain both long-range and short-range spatial relationships, we adopt a local feature distillation scheme inspired by PLOP, which involves computing width- and height-mixed pooling slices on multiple regions extracted at different scales {1/2^s}_{s=0...S}. Given an embedding tensor x, the mixed embedding Ψ^s(x) at scale s is computed by concatenating:

Ψ^s(x) = concat(Φ(x^s_{0,0}), ..., Φ(x^s_{s−1,s−1}))   (4)

where x^s_{i,j} = x[iH/s : (i+1)H/s, jW/s : (j+1)W/s, :] is a sub-region of the embedding tensor x of size H/s × W/s, for ∀i = 0...s−1, ∀j = 0...s−1. Finally, we concatenate the mixed embeddings Ψ^s(x) of each scale s along the channel dimension, obtaining the final embedding:

Ψ(x) = concat(Ψ^1(x), ..., Ψ^S(x))   (5)

We compute mixed embeddings for several layers l ∈ {1, ..., L} of both the old and current models and then minimize the L2 distance between the mixed embeddings computed at those layers. The internal feature distillation loss is defined as:

Lint = (1/L) Σ_{l=1}^{L} ||Ψ(f_l^t(I)) − Ψ(f_l^{t−1}(I))||_2   (6)
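Continuing the sketch above (it reuses the illustrative `phi` function), the following assumed code shows how the multi-scale embedding of Eqs. (4)-(5) and the distillation loss of Eq. (6) could be composed; the flattening of sub-region embeddings is a simplification.

```python
import torch

def psi(x: torch.Tensor, scales=(1, 2)) -> torch.Tensor:
    """Eqs. (4)-(5): concatenate Φ embeddings of sub-regions at several scales."""
    H, W, _ = x.shape
    parts = []
    for s in scales:
        for i in range(s):
            for j in range(s):
                sub = x[i * H // s:(i + 1) * H // s, j * W // s:(j + 1) * W // s, :]
                parts.append(phi(sub).flatten())
    return torch.cat(parts)

def l_int(feats_new, feats_old):
    """Eq. (6): mean L2 distance between mixed embeddings of matched layers."""
    return sum(torch.norm(psi(a) - psi(b))
               for a, b in zip(feats_new, feats_old)) / len(feats_new)
```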

3.4 Attention-Based Extrinsic Feature Distillation Module

Although PLOP alleviates forgetting by distilling features between the same layers, cross-layer drift is inevitable, which makes it difficult to distill these features independently. Therefore, to enhance the distillation operation by considering more contextual information, we propose introducing attention mechanisms between layers. Specifically, we begin by concatenating features Φ^c from different layers and adjusting their dimensions to match through convolution operations:

Φ^c_l = Concat(Φ_l, Conv_{1×1}(Φ_{l−1}))   (7)

where l represents the index of the layer, and 1 × 1 represents a convolution kernel size of 1. We employ a novel spatial attention method, distinct from the conventional self-attention approach, which achieves spatially-aware feature extraction through multi-scale convolutional attention (MSCA) that utilizes element-wise multiplication. We refer the reader to the extended survey in SegNeXt [11] for a detailed explanation of spatial attention. In brief, MSCA can be written as:

A_l = Conv_{1×1}( Σ_{i=0}^{3} Scale_i(DW-Conv(Φ^c_l)) )   (8)

where DW-Conv denotes a depth-wise convolution that aggregates local information, and the multi-branch depth-wise strip convolutions Scale_i capture multi-scale contextual information. The attention map A_l is obtained by a 1 × 1 convolution Conv_{1×1}. Finally, the attention map is used as weights and multiplied with the input to extract spatial relationships:

Φ̂^c_l = A_l ⊗ Φ^c_l   (9)

where ⊗ denotes matrix multiplication. Intuitively, this attention map encapsulates rich spatial and contextual information. Therefore, we constrain the L2 distance Lext between the attention maps from the old and new models as:

Lext = (1/L) Σ_{l=1}^{L} ||A_l^t − A_l^{t−1}||_2   (10)
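A hedged sketch of this module is given below: adjacent-level features are fused (Eq. 7), an MSCA-style attention map is built (Eq. 8), and the gap between the attention maps of the old and new models is penalized (Eq. 10). It is a simplification under stated assumptions: only two strip-convolution branches are used instead of the full SegNeXt multi-scale set, and a single shared attention head processes both models' features.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ExternalAttentionDistill(nn.Module):
    """Simplified cross-level attention distillation (Eqs. 7, 8 and 10)."""

    def __init__(self, c_low: int, c_high: int):
        super().__init__()
        c = 2 * c_high
        self.align = nn.Conv2d(c_low, c_high, kernel_size=1)      # match channels of level l-1
        self.dw = nn.Conv2d(c, c, 5, padding=2, groups=c)          # depth-wise local aggregation
        self.strip_h = nn.Conv2d(c, c, (1, 7), padding=(0, 3), groups=c)
        self.strip_v = nn.Conv2d(c, c, (7, 1), padding=(3, 0), groups=c)
        self.proj = nn.Conv2d(c, c, kernel_size=1)                 # the 1x1 projection of Eq. (8)

    def attention(self, f_prev: torch.Tensor, f_curr: torch.Tensor) -> torch.Tensor:
        f_prev = F.interpolate(self.align(f_prev), size=f_curr.shape[-2:])
        fused = torch.cat([f_curr, f_prev], dim=1)                 # Eq. (7)
        local = self.dw(fused)
        return self.proj(local + self.strip_h(local) + self.strip_v(local))  # Eq. (8), reduced

    def forward(self, new_feats, old_feats):
        # new_feats / old_feats: tuples (feature of level l-1, feature of level l)
        a_new = self.attention(*new_feats)
        a_old = self.attention(*old_feats)
        return torch.mean((a_new - a_old) ** 2)                    # Eq. (10), squared-L2 variant
```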

4 Experiments

4.1 Experimental Setups

Datasets: We evaluate our method on two segmentation datasets: Pascal-VOC 2012 [8] and ADE20K [32]. The PASCAL-VOC 2012 dataset contains 10,582 images in the training set and 1,449 in the validation set. The pixels of each image can be assigned to 21 different classes (20 plus the background). ADE20K is a large-scale dataset for semantic segmentation that covers daily life scenes. It comprises 20,210 training images and 2,000 validation images, with 150 classes encompassing both stuff and objects. Baselines: We compare our method against previous continual learning methods originally designed for image classification, including Path Integral (PI) [30], Elastic Weight Consolidation (EWC) [12], Riemannian Walks (RW) [4], Learning without Forgetting (LwF) [13], and its multi-class version (LwF-MC) [22]. We also consider previous CSS approaches for segmentation, such as ILT [19], MiB [2], PLOP [7], and CAF [28]. In addition to the state-of-the-art methods, we report results for fine-tuning (FT) and joint training (Joint), which serve as lower and upper bounds.
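For reference, the grouped mIoU numbers reported in Tables 1 and 2 are typically computed from a confusion matrix as sketched below. This is an assumed illustration, not the authors' evaluation code; in particular, whether the background class is included in the "all" group varies between papers.

```python
import numpy as np

def miou_by_group(conf: np.ndarray, groups: dict) -> dict:
    """conf[i, j] counts pixels of true class i predicted as class j."""
    inter = np.diag(conf).astype(float)
    union = conf.sum(0) + conf.sum(1) - np.diag(conf)
    iou = np.where(union > 0, inter / np.maximum(union, 1), np.nan)
    return {name: float(np.nanmean(iou[list(idx)])) for name, idx in groups.items()}

# e.g. Pascal-VOC 15-1: classes 1-15 are "old", 16-20 are "new"
groups = {"old": range(1, 16), "new": range(16, 21), "all": range(0, 21)}
```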

Table 1. Continual Semantic Segmentation results on Pascal-VOC 2012 in Mean IoU (%). †: results excerpted from [18].

19-1 (2 tasks)   Disjoint                 Overlapped
Method           1-19   20     all        1-19   20     all
FT               35.2   13.2   34.2       34.7   14.9   33.8
PI [30]          5.4    14.1   5.9        7.5    14.0   7.8
EWC [12]         23.2   16.0   22.9       26.9   14.0   26.3
RW [4]           19.4   15.7   19.2       23.3   14.2   22.9
LwF [13]         53.0   9.1    50.8       51.2   8.5    49.1
LwF-MC [22]      63.0   13.2   60.5       64.4   13.3   61.9
ILT [19]         69.1   16.4   66.4       67.1   12.3   64.4
MiB [2]          69.6   25.6   67.4       70.2   22.1   67.8
PLOP [7]         75.3   38.8   73.6       75.3   37.3   73.5
CAF [28]         75.5   30.8   73.3       75.5   34.8   73.4
Ours             76.9   34.8   74.9       76.1   37.1   74.2
Joint            77.4   78.0   77.4       77.4   78.0   77.4

15-5 (2 tasks)   Disjoint                 Overlapped
Method           1-15   16-20  all        1-15   16-20  all
FT               8.4    33.5   14.4       12.5   36.9   18.3
PI [30]          1.3    34.1   9.5        1.6    33.3   9.5
EWC [12]         26.7   37.7   29.4       24.3   35.5   27.1
RW [4]           17.9   36.9   22.7       16.6   34.9   21.2
LwF [13]         58.4   37.4   53.1       58.9   36.6   53.3
LwF-MC [22]      67.2   41.2   60.7       58.1   35.0   52.3
ILT [19]         63.2   39.5   57.3       66.3   40.6   59.9
MiB [2]          71.8   43.3   64.7       75.5   49.4   69.0
PLOP [7]         71.0   42.8   64.2       75.7   51.7   70.0
CAF [28]         72.9   42.1   65.2       77.2   49.9   70.4
Ours             71.0   43.4   64.5       77.6   48.9   70.5
Joint            79.1   72.6   77.4       79.1   72.6   77.4

15-1 (6 tasks)   Disjoint                 Overlapped
Method           1-15   16-20  all        1-15   16-20  all
FT               5.8    4.9    5.6        4.9    3.2    4.5
PI [30]          0.0    1.8    0.4        0.0    1.8    0.5
EWC [12]         0.3    4.3    1.3        0.3    4.3    1.3
RW [4]           0.2    5.4    1.5        0.0    5.2    1.3
LwF [13]         0.8    3.6    1.5        1.0    3.9    1.8
LwF-MC [22]      4.5    7.0    5.2        6.4    8.4    6.9
ILT [19]         3.7    5.7    4.2        4.9    7.8    5.7
MiB [2]          46.2   12.9   37.9       35.1   13.5   29.7
PLOP [7]         57.8   13.6   46.4       65.1   21.1   54.6
CAF [28]         57.2   15.5   46.7       55.7   14.1   45.3
Ours             61.1   18.0   50.8       65.4   26.4   56.1
Joint            79.1   72.6   77.4       79.1   72.6   77.4

Table 2. Continual Semantic Segmentation results on ADE20k in Mean IoU (%).

100-50 (2 tasks)   1-100  101-150  all
FT                 0.0    24.9     8.3
LwF [13]           21.1   25.6     22.6
LwF-MC [22]        34.2   10.5     26.3
ILT [19]           22.9   18.9     21.6
MiB [2]            37.9   27.9     34.6
PLOP [7]           41.9   14.9     32.9
CAF [28]           37.3   31.9     35.5
Ours               42.5   19.8     34.9
Joint              44.3   28.2     38.9

100-10 (6 tasks)   1-100  100-110  110-120  120-130  130-140  140-150  all
FT                 0.0    0.0      0.0      0.0      0.0      16.6     1.1
LwF [13]           0.1    0.0      0.4      2.6      4.6      16.9     1.7
LwF-MC [22]        18.7   2.5      8.7      4.1      6.5      5.1      14.3
ILT [19]           0.3    0.0      1.0      2.1      4.6      10.7     1.4
MiB [2]            31.8   10.4     14.8     12.8     13.6     18.7     25.9
PLOP [7]           40.6   15.2     16.9     18.7     11.9     7.9      31.6
CAF [28]           39.0   14.6     22.0     25.4     12.1     13.1     31.8
Ours               42.3   12.6     24.2     20.5     10.2     8.6      33.3
Joint              44.3   26.1     42.8     26.7     28.1     17.3     38.9

50-50 (3 tasks)    1-50   51-100   101-150  all
FT                 0.0    0.0      22.0     7.3
LwF [13]           5.7    12.9     22.8     13.9
LwF-MC [22]        27.8   7.0      10.4     15.1
ILT [19]           8.4    9.7      14.3     10.8
MiB [2]            35.5   22.2     23.6     27.0
PLOP [7]           48.6   30.0     13.1     30.4
CAF [28]           47.5   30.6     23.0     33.7
Ours               49.4   30.9     21.2     33.8
Joint              51.1   38.3     28.2     38.9

4.2 Experimental Results

We compare our approach with state-of-the-art methods. Table 1 presents quantitative results for the 19-1, 15-5, and 15-1 settings on Pascal-VOC 2012. It is evident that our method outperforms most competing approaches in both the overlapped and disjoint settings, often by a substantial margin. On 19-1, our method retains more old information, resulting in a performance improvement of 1.6 points in the disjoint setting and 0.8 points in the overlapped setting compared to the recent method CAF [28]. On the most challenging 15-1 setting, general continual models (EWC and LwF-MC) and ILT all exhibit very poor performance. However, CAF demonstrates significant improvements. Nevertheless, our method surpasses CAF by a substantial margin on both old and new classes: in the overlapped setting, it achieves relative improvements of +23.8% on all classes, +17.4% on old classes, and +87.2% on new classes. Table 2 presents the experimental results on the ADE20K dataset for the 100-50, 100-10, and 50-50 settings. Despite ADE20K being a challenging semantic segmentation dataset, our method demonstrates a strong ability to alleviate forgetting and achieves a performance close to the joint model baseline mIoU on old classes. In the 100-10 setting, we initially start with 100 classes, and then gradually

Table 3. Ablation study on disjoint VOC 15-1 in Mean IoU (%). ce

avg max min att old

new all



5.8

 

55.5 16.7 46.3

  

 

   



  

 



5.6

41.6 10.9 34.3 57.0 15.5 47.1



55.1 17.5 46.2



56.6 19.8 47.8



 

4.9





59.6 19.4 50.0



58.2 17.5 48.5



61.1 18.0 50.8

add the remaining classes in 5 steps, with 10 classes being added in each step. Our approach retains more information from the first 100 classes while effectively learning new knowledge, resulting in an overall improvement of 1.5 points. Similarly, in the 50-50 setting, we add 50 classes in each step over a total of 3 steps. While methods like LwF, LwF-MC, and ILT yield lower results, our method stands out from the competition and outperforms the PLOP and CAF methods.

4.3 Ablation Study

To assess the effect of each component, we conduct an ablation analysis on the Pascal dataset in the challenging 15-1 scenario, and the results are reported in Table 3. As demonstrated previously, using only the cross-entropy loss during model training leads to catastrophic forgetting. However, incorporating mean-based internal feature distillation or attention-based extrinsic feature distillation yields significant performance improvements. Building upon the addition of mean-based internal feature distillation, we further integrate max-based and min-based internal feature distillation to assess their effectiveness. As anticipated, the experimental results validate their ability to enhance the model's capacity to alleviate forgetting and acquire new knowledge. Specifically, the inclusion of maximum feature information prevents forgetting (+2.7% on old classes), while the inclusion of minimum feature information enhances learning for new tasks (+4.8% on new classes). Additionally, we conducted experiments to evaluate the compatibility of multi-information-based internal distillation and attention-based external distillation. The results unequivocally indicate that combining these two modules significantly improves model performance. We also conducted experiments to observe the evolution of model performance on the Pascal-VOC 2012 dataset in the 15-1 setting after adding various components. As depicted in Fig. 3, each component effectively mitigates the decline in performance.

Fig. 3. Evolution of model performance on the Pascal-VOC 2012 dataset in the 15-1 setting after adding various components. Each component effectively mitigates the decline in performance.

5 Conclusion

In this paper, we propose a novel mixed distillation framework called Internal-External Constrained Distillation (ICD) that effectively addresses catastrophic forgetting through multi-information-based internal feature distillation and extrinsic feature distillation at different scales. Our evaluation demonstrates the effectiveness of the proposed techniques, showcasing their strong long-term learning capacity and significant improvements over other methods. Future research will explore the application of our techniques in different tasks, such as class-incremental open-set domain adaptation. Additionally, the inherent limitations of knowledge distillation impose constraints on the new model's ability to comprehend and effectively express acquired knowledge. We will investigate the combination of our approach with other continual learning methods, as well as explore the integration of output-level techniques to further enhance performance.

Acknowledgement. This work is supported by National Science Foundation of China under Grant No. 62301432, Natural Science Basic Research Program of Shaanxi No. 2023-JC-QN-0685, the Fundamental Research Funds for the Central Universities No. D5000220444, and the National Engineering Laboratory for Integrated Aero-Space-Ground-Ocean Big Data Application Technology.


References 1. Badrinarayanan, V., Kendall, A., Cipolla, R.: Segnet: a deep convolutional encoderdecoder architecture for image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 39(12), 2481–2495 (2017) 2. Cermelli, F., Mancini, M., Bulo, S.R., Ricci, E., Caputo, B.: Modeling the background for incremental learning in semantic segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 9233– 9242 (2020) 3. Cha, S., Yoo, Y., Moon, T., et al.: SSUL: semantic segmentation with unknown label for exemplar-based class-incremental learning. Adv. Neural. Inf. Process. Syst. 34, 10919–10930 (2021) 4. Chaudhry, A., Dokania, P.K., Ajanthan, T., Torr, P.H.: Riemannian walk for incremental learning: understanding forgetting and intransigence. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 532–547 (2018) 5. Chen, L.C., Papandreou, G., Kokkinos, I., Murphy, K., Yuille, A.L.: Deeplab: semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs. IEEE Trans. Pattern Anal. Mach. Intell. 40(4), 834– 848 (2017) 6. De Lange, M., et al.: A continual learning survey: defying forgetting in classification tasks. IEEE Trans. Pattern Anal. Mach. Intell. 44(7), 3366–3385 (2021) 7. Douillard, A., Chen, Y., Dapogny, A., Cord, M.: PLOP: learning without forgetting for continual semantic segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 4040–4050 (2021) 8. Everingham, M., Eslami, S.A., Van Gool, L., Williams, C.K., Winn, J., Zisserman, A.: The pascal visual object classes challenge: a retrospective. Int. J. Comput. Vision 111, 98–136 (2015) 9. French, R.M.: Catastrophic forgetting in connectionist networks. Trends Cogn. Sci. 3(4), 128–135 (1999) 10. Fu, J., et al.: Dual attention network for scene segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3146– 3154 (2019) 11. Guo, M.H., Lu, C.Z., Hou, Q., Liu, Z., Cheng, M.M., Hu, S.M.: Segnext: rethinking convolutional attention design for semantic segmentation. arXiv preprint arXiv:2209.08575 (2022) 12. Kirkpatrick, J., et al.: Overcoming catastrophic forgetting in neural networks. Proc. Natl. Acad. Sci. 114(13), 3521–3526 (2017) 13. Li, Z., Hoiem, D.: Learning without forgetting. IEEE Trans. Pattern Anal. Mach. Intell. 40(12), 2935–2947 (2017) 14. Liang, J., Zhou, T., Liu, D., Wang, W.: Clustseg: clustering for universal segmentation. arXiv preprint arXiv:2305.02187 (2023) 15. Liu, D., Cui, Y., Tan, W., Chen, Y.: SG-Net: spatial granularity network for onestage video instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 9816–9825 (2021) 16. Long, J., Shelhamer, E., Darrell, T.: Fully convolutional networks for semantic segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3431–3440 (2015) 17. Mallya, A., Lazebnik, S.: Packnet: adding multiple tasks to a single network by iterative pruning. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7765–7773 (2018)


18. Maracani, A., Michieli, U., Toldo, M., Zanuttigh, P.: Recall: replay-based continual learning in semantic segmentation. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 7026–7035 (2021) 19. Michieli, U., Zanuttigh, P.: Incremental learning techniques for semantic segmentation. In: Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops (2019) 20. Michieli, U., Zanuttigh, P.: Continual semantic segmentation via repulsionattraction of sparse and disentangled latent representations. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 1114– 1124 (2021) 21. Michieli, U., Zanuttigh, P.: Knowledge distillation for incremental learning in semantic segmentation. Comput. Vis. Image Underst. 205, 103167 (2021) 22. Rebuffi, S.A., Kolesnikov, A., Sperl, G., Lampert, C.H.: ICARL: incremental classifier and representation learning. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2001–2010 (2017) 23. Ronneberger, O., Fischer, P., Brox, T.: U-Net: convolutional networks for biomedical image segmentation. In: Navab, N., Hornegger, J., Wells, W.M., Frangi, A.F. (eds.) MICCAI 2015. LNCS, vol. 9351, pp. 234–241. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-24574-4 28 24. Stan, S., Rostami, M.: Unsupervised model adaptation for continual semantic segmentation. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 35, pp. 2593–2601 (2021) 25. Tao, A., Sapra, K., Catanzaro, B.: Hierarchical multi-scale attention for semantic segmentation. arXiv preprint arXiv:2005.10821 (2020) 26. Wang, Y.X., Ramanan, D., Hebert, M.: Growing a brain: fine-tuning by increasing model capacity. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2471–2480 (2017) 27. Yan, Q., Gong, D., Liu, Y., van den Hengel, A., Shi, J.Q.: Learning Bayesian sparse networks with full experience replay for continual learning. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 109–118 (2022) 28. Yang, G., et al.: Continual attentive fusion for incremental learning in semantic segmentation. IEEE Trans. Multimedia (2022) 29. Yu, L., et al.: Semantic drift compensation for class-incremental learning. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 6982–6991 (2020) 30. Zenke, F., Poole, B., Ganguli, S.: Continual learning through synaptic intelligence. In: International Conference on Machine Learning, pp. 3987–3995. PMLR (2017) 31. Zhang, H., et al.: Context encoding for semantic segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7151–7160 (2018) 32. Zhou, B., Zhao, H., Puig, X., Fidler, S., Barriuso, A., Torralba, A.: Scene parsing through ADE20K dataset. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 633–641 (2017)

MTD: Multi-Timestep Detector for Delayed Streaming Perception

Yihui Huang1 and Ningjiang Chen1,2,3(B)

1 School of Computer and Electronic Information, Guangxi University, Nanning, China
2 Guangxi Intelligent Digital Services Research Center of Engineering Technology, Nanning, China
3 Education Department of Guangxi Zhuang Autonomous Region, Key Laboratory of Parallel, Distributed and Intelligent Computing, Guangxi University, Nanning, China
[email protected]

Abstract. Autonomous driving systems require real-time environmental perception to ensure user safety and experience. Streaming perception is a task of reporting the current state of the world, which is used to evaluate the delay and accuracy of autonomous driving systems. In real-world applications, factors such as hardware limitations and high temperatures inevitably cause delays in autonomous driving systems, resulting in an offset between the model output and the world state. To solve this problem, this paper proposes the Multi-Timestep Detector (MTD), an end-to-end detector which uses dynamic routing for multi-branch future prediction, giving the model the ability to resist delay fluctuations. A Delay Analysis Module (DAM) is proposed to optimize the existing delay sensing method, continuously monitoring the model inference stack and calculating the delay trend. Moreover, a novel Timestep Branch Module (TBM) is constructed, which includes a static flow and an adaptive flow to adaptively predict specific timesteps according to the delay trend. The proposed method has been evaluated on the Argoverse-HD dataset, and the experimental results show that it achieves state-of-the-art performance across various delay settings.

Keywords: Streaming Perception · Object Detection · Dynamic Network

1 Introduction

To ensure user safety and experience, autonomous driving systems need to perceive the surrounding environment not only promptly but also in real time. Traditional object detection [1–3] benchmarks focus on offline evaluation, in which each frame of the video stream is compared with its annotation separately; this requires the system to process each captured frame within the inter-frame interval. Therefore, many studies [4–8] focus on reducing latency so that the model can finish processing a captured frame before the next frame arrives. However, in real-world applications, hardware limitations [9, 10] and high temperatures [11] inevitably cause processing delays, so the real-time environment changes while the model is processing the captured frame. As shown in Fig. 1(a), the output of the model is always outdated.

To solve the above problem, existing works use methods such as Optical Flow [12, 13], Long Short-Term Memory (LSTM) [14, 15] and ConvLSTM [16–18] to extract features and predict future frames. There are also works that combine video object tracking and detection [19, 20] to extract the spatio-temporal information of the detected object, thereby optimizing the detection accuracy. However, these methods are typically designed for offline environments and do not account for latency in streaming perception scenarios. Li et al. [21] point out the difference between standard offline evaluation and real-time applications: the surrounding world may have changed by the time an algorithm finishes processing a frame. To address the problem, 'Streaming Average Precision (sAP)' is introduced, which integrates latency and accuracy to evaluate real-time online perception. This metric evaluates the entire perception stack in real time, forcing the model to predict the current world state, as shown in Fig. 1(b).

Fig. 1. Visualization of theoretical output of offline detectors vs. streaming perception detectors under various latency environments. Offline detectors (a) and streaming perception detectors (b) produce very different predictions when encountering processing delays.
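Streaming evaluation pairs each ground-truth frame with whatever output the detector has already finished at that frame's timestamp. The sketch below illustrates that pairing rule only; it is an assumed simplification, not the official sAP toolkit, and the accuracy part would then be computed with an ordinary AP metric on the matched outputs.

```python
def match_streaming_outputs(frame_times, output_times):
    """For each ground-truth timestamp, return the index of the latest detector
    output completed strictly before it (or None if no output exists yet)."""
    matches, j = [], -1
    for t in frame_times:
        while j + 1 < len(output_times) and output_times[j + 1] < t:
            j += 1
        matches.append(j if j >= 0 else None)
    return matches

# frames arrive every 33 ms; the detector finishes its outputs at these times
print(match_streaming_outputs([0, 33, 66, 99], [20, 58, 130]))  # [None, 0, 1, 1]
```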

StreamYOLO [22] suggests that the performance gap in streaming perception tasks is due to the currently processed frame not matching the next frame, and it therefore simplifies streaming perception to the task of predicting the next frame. The work introduces a Dual-Flow Perception (DFP) module to merge the features of the previous and current frames, and a Trend-Aware Loss (TAL) to adjust adaptive weights for each object based on its motion trend, achieving state-of-the-art sAP performance. However, it can only predict the next frame, and its performance is optimal only when the processing time is less than the inter-frame time.

Based on the work of StreamYOLO, DaDe [23] suggests that an autonomous driving system should consider additional delays. It proposes a Feature Queue Module to store historical features and a Feature Selection Module that monitors the processing delay. Based on the processing delay, the Feature Selection Module adaptively selects different historical features for fusion to create object motion trends, predicting the future at different timesteps. The study improves the baseline to handle unexpected delays, achieving the highest sAP performance with one additional timestep, but performance degradation increases when more timesteps are considered.

Although the above methods improve performance to some extent, the predicted timestep is still inadequate to keep up with the real-time environment when the processing delay is substantial. Dynamic networks [24] present a novel option for solving this problem. Unlike static models with fixed calculation graphs and parameters, dynamic networks can dynamically adjust their structure during inference, providing properties that static models lack. Common dynamic networks include dynamic depth networks that distinguish sample difficulty by early exit [25, 26] or by skipping layers [27, 28], and dynamic width networks [29, 30] that selectively activate neurons and branches according to the input during inference. Both approaches adjust the depth or width of the architecture by activating computing units to reduce model costs. In contrast, dynamic routing networks [31, 32] perform dynamic routing in a supernetwork with multiple possible paths; by assigning features to different paths, they allow the computational graph to adapt to each sample during inference.

In this paper, a Multi-Timestep Detector (MTD) is proposed. The main idea is to transform the detection problem of delayed streaming perception into a future prediction problem with multiple timesteps. A Timestep Branch Module (TBM) is constructed to exploit the generality of backbone features on similar tasks. By redesigning the training pipeline, multiple detection head branches are trained to predict different timesteps in the future. The Delay Analysis Module (DAM) continuously monitors preprocessing and inference delays, calculates the delay trend, and analyzes the target timestep, enabling the model to choose the best branch for detection. Experiments are conducted on the Argoverse-HD [21] dataset, and MTD demonstrates significant performance improvements over various delay settings compared to previous methods. The contributions of the work are as follows:
• By analyzing the processing time fluctuations of the model, the inference delay trend is introduced for more reliable calculation of the delay trend. Based on the idea of dynamic routing, the Timestep Branch Module is proposed, with the delay trend used as the routing basis. Under various delay settings, these methods can effectively improve the accuracy of streaming perception.
• To the best of our knowledge, MTD is the first end-to-end model that uses dynamic routing for delayed streaming perception. Under various delay settings, our method improves the sAP [21] performance by 1.1–2.9 on the Argoverse-HD dataset compared with baselines.

2 The Methods

This section describes how MTD is designed to predict the future at different timesteps. The Delay Analysis Module is constructed to calculate the delay trend. The Timestep Branch Module dynamically routes within the network based on the delay trend and produces outputs that align with the real environment. The training pipeline is designed to avoid adding extra computation during inference and to maintain real-time performance. Figure 2 illustrates the model's inference process.

Fig. 2. MTD extracts features using a shared backbone. Delay Analysis Module (DAM) calculates delay trend Dt according to current preprocessing delay Pt and inference delay trend ITrend . Timestep Branch Module (TBM) selects detection head adaptively to predict future at different timestep based on input Dt . The inference path of the detector is divided into process flow and adaptive flow, in which the process flow is a fixed inference path, and the adaptive flow is an optional path, which is dynamically routed by the TBM according to the delay trend.
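A minimal sketch of the routing idea just described is given below: several detection heads are trained for different future timesteps, and the head index is chosen from the estimated delay trend. The head structure, the frame interval, and the clamping rule are assumptions for illustration, not the paper's exact design.

```python
import torch
import torch.nn as nn

class TimestepBranchDetector(nn.Module):
    """Shared backbone (static flow) plus delay-routed detection heads (adaptive flow)."""

    def __init__(self, backbone: nn.Module, heads: nn.ModuleList, frame_interval_ms: float = 33.3):
        super().__init__()
        self.backbone, self.heads = backbone, heads
        self.frame_interval_ms = frame_interval_ms

    def forward(self, image: torch.Tensor, delay_trend_ms: float):
        feats = self.backbone(image)                      # shared features for all branches
        # route to the branch whose future timestep best matches the expected delay
        k = round(delay_trend_ms / self.frame_interval_ms)
        k = max(1, min(k, len(self.heads)))               # predict at least one step ahead
        return self.heads[k - 1](feats)                   # only the selected head runs
```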

2.1 Delay Analysis Module (DAM)

The Delay Analysis Module calculates the delay trend used by the subsequent modules during inference. In the work of DaDe [23], the delay trend Dt of the current frame t is simply expressed as the sum of the preprocessing delay Pt and the last frame's inference delay It−1. This approach is simple and effective, but it performs best only when the model's processing time trend is flat. Figure 3 visualizes frame processing time under the no-delay setting. It is observed that the model's processing time is mostly smooth, with occasional small peaks. As shown in Fig. 3(a)–(c), delay fluctuations often occur within a single frame, leading the delay analysis method in [23] to incorrectly assess the delay after each anomalous frame. Consequently, subsequent modules struggle to hit the correct timestep.
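The baseline estimate just described (Dt = Pt + It−1) can be written down directly, as in the assumed sketch below. The spike-rejection rule used to smooth the inference term is only a placeholder, since the exact ITrend formula is not fully legible in this copy.

```python
class DelayAnalysis:
    """Track preprocessing and inference delays and report a delay trend.

    D_t = P_t + I_trend, where I_trend falls back to the previous inference
    delay (the DaDe-style estimate) unless the last measurement looks like a
    one-frame spike; the spike-rejection ratio here is an assumption.
    """

    def __init__(self, spike_ratio: float = 1.5):
        self.spike_ratio = spike_ratio
        self.history = []          # past inference delays, most recent last

    def update_inference(self, inference_ms: float) -> None:
        self.history.append(inference_ms)

    def delay_trend(self, preprocess_ms: float) -> float:
        if len(self.history) < 2:
            i_trend = self.history[-1] if self.history else 0.0
        elif self.history[-1] / self.history[-2] > self.spike_ratio:
            i_trend = self.history[-2]     # treat the last frame as an outlier spike
        else:
            i_trend = self.history[-1]
        return preprocess_ms + i_trend
```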

To solve this issue, the inference delay trend ITrend is introduced to reduce the sensitivity of inference delay perception. The formula for calculating ITrend is as follows: It−1, It−1/It−2, ϕ, a ∈ {1, 2, 3}   (4)

Then, we calculate the relative positional deviation between the line features and the MA:

E_a = (1/m) Σ_{(L_t, MA_a) ∈ X} || log( R^{−1}(L_t) · R(MA_a) ) ||,  a ∈ {1, 2, 3}   (5)

where t = 1, 2, ..., m represents the indices of the matching pairs that meet the requirements, and m represents the number of matched pairs in X. X represents the set of matching pairs of 3D feature line segments with the MA whose angle with the MA is greater than the threshold θ:

X = {(L_t, MA_a) | ∃ a ∈ {1, 2, 3} s.t. ∀ L_t ∈ L, |L_t · MA_a| > θ}   (6)

R(·) represents the use of the Rodrigues formula to convert 3D line features and MA axes into rotation matrices, and log(·) represents the logarithmic quaternion error in SO(3). We compare the positional deviation between the current frame and each keyframe to find keyframes with similar positional deviation:

|E_a^i − E_a^j| < δ,  a ∈ {1, 2, 3}   (7)

Finally, we use the point-line feature visual DBoW2 vocabulary to search for loop frames among these keyframes, and calculate the similarity transformation between the current frame and the loop frames.
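A compact sketch of the deviation measure in Eqs. (5)-(7) is given below, using Rodrigues' formula through scipy. Reading a unit line direction as an axis-angle vector and the numeric thresholds are illustrative assumptions; this is not the authors' implementation.

```python
import numpy as np
from scipy.spatial.transform import Rotation as R

def rot_from_direction(d: np.ndarray) -> R:
    """One plausible R(.): interpret the (unit) direction as an axis-angle vector."""
    return R.from_rotvec(d / np.linalg.norm(d))

def deviation(lines: np.ndarray, axis: np.ndarray) -> float:
    """Eq. (5): mean log-map distance between matched line directions and one MA axis.
    `lines` is assumed to already satisfy the angle test of Eq. (6)."""
    errs = []
    for d in lines:
        rel = rot_from_direction(d).inv() * rot_from_direction(axis)
        errs.append(np.linalg.norm(rel.as_rotvec()))        # || log(R^-1(L) R(MA)) ||
    return float(np.mean(errs)) if errs else 0.0

def similar_keyframes(e_current, keyframe_devs, delta=0.05):
    """Eq. (7): keep keyframes whose per-axis deviation is within delta of the current frame."""
    return [k for k, e in enumerate(keyframe_devs)
            if all(abs(e_current[a] - e[a]) < delta for a in range(3))]
```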

4 Experiments

In this section, we evaluate our SLAM system on two publicly available datasets, ICL-NUIM [23] and TUM RGB-D [24]. The ICL-NUIM dataset is a synthetic indoor dataset with rich textures that comprises two main scenes: the living room (lr) and the office (of). It contains a multitude of structured areas with numerous MW, so we can evaluate the tracking performance of our system in MW scenes. The TUM RGB-D dataset provides real indoor sequences depicting scenes with diverse texture and structural information, which enables us to evaluate the system in different scenes, covering both the existence of MW and the texture richness of the sequence. Additionally, we compare our approach with other methods: the feature-based methods ORB-SLAM2 [8] and SP-SLAM [11], the MW-based methods RGBD-SLAM [14], MH-SLAM [15] and S-SLAM [13], and MSC-VO [17], which uses MA to constrain structural features. The effectiveness of our model is evaluated using the root mean square error (RMSE) of the absolute trajectory error (ATE) measured in meters [24]. We utilize the EVO library to compute errors and generate trajectory graphs. Our experiments were conducted using an Intel Core [email protected] GHz/16 GB RAM.

4.1 Pose Estimation

Table 1 shows the comparative results of our system alongside other advanced SLAM systems. We achieve the best performance in ten out of the sixteen sequences.

Table 1. Comparison of RMSE (m) for ICL-NUIM and TUM RGB-D sequences. × indicates that the method fails in the tracking process.

ICL-NUIM        Ours    RGBD-SLAM  S-SLAM  MH-SLAM  MSC-VO  ORB-SLAM2  SP-SLAM
lr-kt0          0.005   0.006      ×       0.007    0.016   0.008      0.019
lr-kt1          0.008   0.015      0.016   0.011    0.011   0.011      0.015
lr-kt2          0.011   0.020      0.045   0.015    0.015   0.016      0.017
lr-kt3          0.034   0.012      0.046   0.011    0.037   0.014      0.022
of-kt0          0.027   0.041      ×       0.025    0.031   0.025      0.031
of-kt1          0.018   0.020      ×       0.013    0.023   0.029      0.018
of-kt2          0.008   0.011      0.031   0.015    0.021   0.012      0.027
of-kt3          0.011   0.014      0.065   0.013    0.018   0.012      0.012

TUM RGB-D       Ours    RGBD-SLAM  S-SLAM  MH-SLAM  MSC-VO  ORB-SLAM2  SP-SLAM
fr1/xyz         0.010   0.022      ×       0.010    0.011   0.010      0.010
fr2/xyz         0.007   0.015      ×       0.008    0.005   0.009      0.009
fr3/s-nt-far    0.021   0.022      0.281   0.040    0.066   ×          0.031
fr3/s-nt-near   0.017   0.025      0.065   0.023    ×       ×          0.024
fr3/s-t-near    0.016   0.055      0.014   0.012    0.011   0.011      0.010
fr3/s-t-far     0.020   0.034      0.014   0.022    0.012   0.011      0.016
fr3/cabinet     0.015   0.035      ×       0.023    ×       ×          ×
fr3/l-cabinet   0.054   0.071      ×       0.083    0.120   ×          0.074

ICL-NUIM Dataset. The ICL-NUIM dataset contains a large number of scenes that satisfy the MW assumption, so our method and other MW-based methods such as RGBD-SLAM and MH-SLAM achieve good results. The presence of rich textures in both environments also enables smooth tracking using feature-based methods, such as ORB-SLAM and SP-SLAM. The performance of our method on the sequences of-kt0 and of-kt1 is slightly inferior to MH-SLAM: the absolute trajectory error increases from 0.025 to 0.027 and from 0.013 to 0.018, respectively, because MH-SLAM estimates translation using feature-based methods, which allows it to perform better in scenes with rich texture. The feature extraction of these two sequences is shown in Fig. 2. The sequence lr-kt3 exhibits substantial wall areas, as depicted in Fig. 2a. The only shortcoming of our method there is that it obtains insufficient photometric error gradients, leading to inferior performance compared to systems that leverage feature-based tracking methods. Nevertheless, our method outperforms previous state-of-the-art methods in most sequences.

Fig. 2. Partial sequences in the ICL dataset. (a) The scene in the lr-kt3 sequence where the camera is close to the wall. (b) and (c) represent the distribution of features in the of sequence.

TUM RGB-D Dataset. In the fr1 and fr2 sequences, with cluttered scenes and few or no MW involved, MW-based systems like RGBD-SLAM and S-SLAM cannot achieve ideal results. On the contrary, our approach can robustly estimate the pose in non-MW scenes and has performance comparable to the feature-based ORB-SLAM and SP-SLAM, as well as to MSC-VO, which uses MA to constrain structured features. For the fr3 sequences, our method outperforms the other methods in four of the six sequences with limited or no texture; low-texture sequence images are shown in Fig. 3. Although MW-based RGBD-SLAM and MH-SLAM are capable of achieving stable tracking in low-texture scenes by utilizing structural information, our system exhibits superior tracking accuracy. Our system still performs well in the s-t-near and s-t-far sequences with rich textures, as demonstrated in Fig. 3a, but it is apparent that feature-based methods have a stronger advantage there.

Fig. 3. Partial sequences in the TUM dataset. (a) The distribution of features in the sequences s-t-near and s-t-far, which are located in the same scene but only differ in the distance between the camera and the wall. (b), (c) and (d) are low-texture sequence scenes in fr3.

4.2 Loop Closure

Moreover, we evaluated the actual effect of our loop detection in the fr3-l-office sequence of the TUM RGB-D dataset. As datasets containing closed loops are relatively rare, we used the fr3-l-office sequence to evaluate the effect of our loop detection algorithm. This sequence contains explicit loop closure conditions, as well as planar information. We use the EVO library to draw the comparison graph between the trajectories obtained by our system and RGBD-SLAM, with and without loop closure detection, and the ground truth trajectory. RGBD-SLAM is the baseline for our system, and it uses the same loop closure detection module as the most popular ORB-SLAM2 system, which is also used by S-SLAM, MH-SLAM and SP-SLAM. Moreover, MSC-VO does not have a loop closure detection module. Therefore, in this section, we compare our results with RGBD-SLAM. The loop closure detection module in RGBD-SLAM is the original module without any modification, and we only control its usage in the experiments.

Fig. 4. Visualization of the camera trajectory. (a) and (b) illustrate the trajectory maps before and after adding the loop closure module to our system, while (c) and (d) show those of RGB-D SLAM. The dashes show the ground truth of the camera trajectories, and the curves indicate the estimated camera trajectories. The trajectory colors transition from red to yellow to green to blue, representing a decreasing deviation of the trajectories. (Color figure online)

Figure 4a and Fig. 4b present the results of our system. It can be seen that, after adding our loop closure detection module, the color of the trajectory curve in the detected loop area changes from red to blue. Meanwhile, RGBD-SLAM exhibits inferior performance upon the integration of the loop closure detection module, as observed in Figs. 4c and 4d. This is caused by inaccurate loop closure keyframe detection. Furthermore, we calculate the Absolute Trajectory Error (ATE) of the motion trajectory, as shown in Table 2. The results show that the accuracy of our system is further improved after adding a loop optimization thread.
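The ATE statistics in Table 2 are the usual summaries of per-pose translation errors; a minimal sketch is shown below, assuming the estimated and ground-truth trajectories are already time-associated and aligned in the same frame (alignment itself is omitted).

```python
import numpy as np

def ate_stats(est: np.ndarray, gt: np.ndarray) -> dict:
    """est, gt: (N, 3) arrays of associated camera positions in the same frame."""
    err = np.linalg.norm(est - gt, axis=1)          # per-pose translation error (m)
    return {"Max": float(err.max()), "Mean": float(err.mean()),
            "Median": float(np.median(err)), "Min": float(err.min()),
            "Rmse": float(np.sqrt((err ** 2).mean()))}
```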

Table 2. Comparison of ATE for the fr3-l-office sequence.

                 Ours                        RGBD-SLAM
fr3/l-office     without loop   with loop    without loop   with loop
Max              0.103          0.099        0.121          0.238
Mean             0.030          0.023        0.037          0.066
Median           0.027          0.022        0.033          0.052
Min              0.001          0.001        0.002          0.011
Rmse             0.033          0.026        0.042          0.079

5 Conclusions

In this paper, we propose an RGB-D SLAM system for indoor scenes based on points, lines, and planes. The scene is divided: the MW-based method is used to estimate the camera pose in MW scenes, while the semi-direct approach based on point and line features is utilized to estimate the translation in MW scenes and the complete camera pose in non-MW scenes. Additionally, we use the MA and line features in the scene for loop closure detection. In the future, we plan to improve the planar detection module and the line feature extraction method of the system to further enhance its tracking robustness.

References 1. Straub, J., Bhandari, N., Leonard, J.J., Fisher, J.W.: Real-time Manhattan world rotation estimation in 3D. In: 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 1913–1920. IEEE (2015) 2. Kim, P., Li, H., Joo, K.: Quasi-globally optimal and real-time visual compass in manhattan structured environments. IEEE Rob. Autom. Lett. 7(2), 2613–2620 (2022) 3. Ge, W., Song, Y., Zhang, B., Dong, Z.: Globally optimal and efficient manhattan frame estimation by delimiting rotation search space. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 15213–15221 (2021) 4. Coughlan, J.M., Yuille, A.L.: Manhattan world: compass direction from a single image by bayesian inference. In: Proceedings of the Seventh IEEE International Conference on Computer Vision, pp. 941–947. IEEE (1999) 5. Li, H., Zhao, J., Bazin, J.C., Liu, Y.H.: Quasi-globally optimal and near/true realtime vanishing point estimation in manhattan world. IEEE Trans. Pattern Anal. Mach. Intell. 44(3), 1503–1518 (2020) 6. Campos, C., Elvira, R., Rodr´ıguez, J.J.G., Montiel, J.M., Tard´ os, J.D.: Orb-slam3: an accurate open-source library for visual, visual-inertial, and multimap slam. IEEE Trans. Rob. 37(6), 1874–1890 (2021) 7. Klein, G., Murray, D.: Parallel tracking and mapping for small AR workspaces. In: 2007 6th IEEE and ACM International Symposium on Mixed and Augmented Reality, pp. 225–234. IEEE (2007) 8. Mur-Artal, R., Tard´ os, J.D.: Orb-slam2: an open-source slam system for monocular, stereo, and rgb-d cameras. IEEE Trans. Rob. 33(5), 1255–1262 (2017)


9. Mur-Artal, R., Montiel, J.M.M., Tardos, J.D.: ORB-SLAM: a versatile and accurate monocular SLAM system. IEEE Trans. Rob. 31(5), 1147–1163 (2015) 10. Gomez-Ojeda, R., Moreno, F.A., Zuniga-No¨el, D., Scaramuzza, D., GonzalezJimenez, J.: PL-SLAM: a stereo SLAM system through the combination of points and line segments. IEEE Trans. Rob. 35(3), 734–746 (2019) 11. Zhang, X., Wang, W., Qi, X., Liao, Z., Wei, R.: Point-plane slam using supposed planes for indoor environments. Sensors 19(17), 3795 (2019) 12. Newcombe, R.A., Lovegrove, S.J., Davison, A.J.: DTAM: dense tracking and mapping in real-time. In: 2011 International Conference on Computer Vision, pp. 2320– 2327. IEEE (2011) 13. Li, Y., Brasch, N., Wang, Y., Navab, N., Tombari, F.: Structure-slam: low-drift monocular slam in indoor environments. IEEE Rob. Autom. Lett. 5(4), 6583–6590 (2020) 14. Li, Y., Yunus, R., Brasch, N., Navab, N., Tombari, F.: RGB-D SLAM with structural regularities. In: 2021 IEEE International Conference on Robotics and Automation (ICRA), pp. 11581–11587. IEEE (2021) 15. Yunus, R., Li, Y., Tombari, F.: Manhattanslam: robust planar tracking and mapping leveraging mixture of manhattan frames. In: 2021 IEEE International Conference on Robotics and Automation (ICRA), pp. 6687–6693. IEEE (2021) 16. Joo, K., Oh, T.H., Rameau, F., Bazin, J.C., Kweon, I.S.: Linear rgb-d slam for atlanta world. In: 2020 IEEE International Conference on Robotics and Automation (ICRA), pp. 1077–1083. IEEE (2020) 17. Company-Corcoles, J.P., Garcia-Fidalgo, E., Ortiz, A.: MSC-VO: exploiting manhattan and structural constraints for visual odometry. IEEE Rob. Autom. Lett. 7(2), 2803–2810 (2022) 18. Yang, S., Scherer, S.: Direct monocular odometry using points and lines. In: 2017 IEEE International Conference on Robotics and Automation (ICRA), pp. 3871– 3877. IEEE (2017) 19. Forster, C., Pizzoli, M., Scaramuzza, D.: SVO: fast semi-direct monocular visual odometry. In: 2014 IEEE International Conference on Robotics and Automation (ICRA), pp. 15–22. IEEE (2014) 20. Peng, X., Liu, Z., Wang, Q., Kim, Y.T., Lee, H.S.: Accurate visual-inertial slam by manhattan frame re-identification. In: 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 5418–5424. IEEE (2021) 21. Gomez-Ojeda, R., Briales, J., Gonzalez-Jimenez, J.: PL-SVO: semi-direct monocular visual odometry by combining points and line segments. In: 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 4211– 4216. IEEE (2016) 22. Zhou, Y., Kneip, L., Rodriguez, C., Li, H.: Divide and conquer: efficient densitybased tracking of 3D sensors in manhattan worlds. In: Lai, S.-H., Lepetit, V., Nishino, K., Sato, Y. (eds.) ACCV 2016. LNCS, vol. 10115, pp. 3–19. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-54193-8 1 23. Handa, A., Whelan, T., McDonald, J., Davison, A.J.: A benchmark for RGBD visual odometry, 3D reconstruction and SLAM. In: 2014 IEEE International Conference on Robotics and Automation (ICRA), pp. 1524–1531. IEEE (2014) 24. Sturm, J., Engelhard, N., Endres, F., Burgard, W., Cremers, D.: A benchmark for the evaluation of RGB-D SLAM systems. In: 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 573–580. IEEE (2012) 25. Rublee, E., Rabaud, V., Konolige, K., Bradski, G.: ORB: an efficient alternative to SIFT or SURF. In: 2011 International Conference on Computer Vision, pp. 2564–2571. IEEE (2011)

362

Z. Zheng et al.

26. Von Gioi, R.G., Jakubowicz, J., Morel, J.M., Randall, G.: LSD: a line segment detector. Image Process. Line 2, 35–55 (2012) 27. Zhang, L., Koch, R.: An efficient and robust line segment matching approach based on LBD descriptor and pairwise geometric consistency. J. Vis. Commun. Image Represent. 24(7), 794–805 (2013) 28. Fischler, M.A., Bolles, R.C.: Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography. Commun. ACM 24(6), 381–395 (1981) 29. Trevor, A.J., Gedikli, S., Rusu, R.B., Christensen, H.I.: Efficient organized point cloud segmentation with connected components. In: Semantic Perception Mapping and Exploration (SPME), vol. 1 (2013) 30. G´ alvez-L´ opez, D., Tardos, J.D.: Bags of binary words for fast place recognition in image sequences. IEEE Trans. Rob. 28(5), 1188–1197 (2012)

L2T-BEV: Local Lane Topology Prediction from Onboard Surround-View Cameras in Bird’s Eye View Perspective Shanding Ye, Tao Li, Ruihang Li, and Zhijie Pan(B) College of Computer Science and Technology, Zhejiang University, Hangzhou, China {ysd,litaocs,12221089,zhijie pan}@zju.eud.cn Abstract. High definition maps (HDMaps) serve as the foundation for autonomous vehicles, encompassing various driving scenario elements, among which lane topology is critically important for vehicle perception and planning. Existing work on lane topology extraction predominantly relies on manual processing, while automated methods are limited to road topology extraction. Recently, road representation learning based on surround-view with bird’s-eye view (BEV) has emerged, which directly predicts localized vectorized maps around the vehicle. However, these maps cannot represent the topological relationships between lanes. As a solution, we propose a novel method, L2T-BEV, which learns local lane topology maps in BEV. This method utilizes the EfficientNet to extract features from surround-view images, followed by transforming these features into the BEV space through the Inverse Perspective Mapping (IPM). Nonetheless, the IPM transformation often suffers from distortion issues. To alleviate this, we add a learnable residual mapping function to the features after the IPM transformation. Finally, we employ a transformer network with learnable positional embedding to process the fused images, generating higher-precision lane topology. We validated our method on the NuScenes dataset, and the experimental results demonstrate the feasibility and excellent performance.

Keywords: Autonomous driving · HDMap · Lane topology map · BEV

1 Introduction

Since their inception, HDMaps have received widespread attention and experienced rapid development in recent years, facilitating vehicles in achieving autonomous driving in specific scenarios [1,2]. As a common basic technology for intelligent connected vehicle transportation, HDMaps cater not only to human drivers but also to autonomous vehicles. For self-driving cars at Level 3 and above, HDMaps are a crucial component [3]. This work was supported by the Key Research and Development Program of Zhejiang Province in China (No. 2023C01237), and the Natural Science Foundation of China(No.U22A202101). c The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024  Q. Liu et al. (Eds.): PRCV 2023, LNCS 14427, pp. 363–375, 2024. https://doi.org/10.1007/978-981-99-8435-0_29


HDMaps typically contain a wealth of driving scene elements. Analyzing element attributes reveals that HDMap elements can be categorized into dynamic and static elements [4]. Dynamic information [5] generally includes traffic light signals, traffic flow, and traffic events, which aid vehicles in avoiding road congestion, accidents, and improving driving efficiency and safety. Static information [6] primarily encompasses positioning data [7], semantic information [8], and lane topology [9]. Lane topology, serving as the interface between perception and planning, effectively assists vehicles in performing planning tasks such as predicting other vehicles’ trajectories, self-tracing driving, lane change planning, and turning, among others. This paper mainly discusses the generation of lane topology. Traditional lane topology is often manually annotated on basemaps constructed offline using SLAM methods [10], which is labor-intensive and timeconsuming. As a result, leveraging machine learning to directly extract road topology from data sources fulfills this demand [11]. Although some existing works can automatically extract road topology from satellite aerial images, such as BoundaryNet [12] and csBoundary [13], ground building or tree cover may impact the final road topology generation. Additionally, some methods generate road topology based on vehicle GPS trajectory information [14]. However, the road topology extracted by these two methods is not lane-level and lacks the precision required for autonomous driving tasks. Currently, only a limited number of studies focus on generating lane-level road topology, also known as lane topology [9,15]. However, these works only consider the forward-looking bird’seye view (BEV), limiting the surround-view perception required by autonomous vehicles. In recent years, road representation learning based on surround-view BEV has emerged as a research hotspot, with numerous studies predicting local vectorized maps surrounding a vehicle using BEV instance segmentation [16–18]. Although these maps encompass road boundary lines, lane demarcation lines, and pedestrian crosswalk lines, they fail to represent the topological relationships between lanes. Furthermore, at intersections, the absence of road boundaries and lane demarcation lines prevents these methods from characterizing local driving scenarios, necessitating alternative approaches for determining drivable areas. In summary, these methods generate maps from the perspective of human-driven vehicles, while more efficient options could exist for autonomous vehicles. We argue that lane topology is the most suitable representation of road information in HDMaps, as it includes lane centerlines and their connecting relationships [9]. Building on this foundation, we propose a local lane topology map learning method from a BEV perspective, called L2T-BEV. This method directly utilizes onboard surround-view cameras to predict lane centerline graphs and their interconnecting relationships. Lane centerlines are represented by Bezier curves, with their start and end points indicating the direction of traffic flow. The connection relationships between lane centerlines are modeled using an assignment matrix [15]. Specifically, we first employ an efficient network to extract features from surround-view images, then combine intrinsic and extrinsic camera parameters to


transform surround-view image features into BEV space using the IPM technique [19]. Due to distortion issues in IPM-transformed images, we employ a learnable Positional Encoding (PE) [20] to encode features in the BEV space, mitigating the impact of distortion on predictions. Finally, a transformer network is utilized to learn from these encoded features, and the Hungarian matching algorithm is applied to the output results, as described in reference [15]. The predicted lane centerlines and their connection relationships define the lane topology within a local HDMap. Our contributions can be summarized as follows: • We introduce L2T-BEV, a learning method for predicting local lane topology directly from onboard surround-view cameras in the BEV perspective. • We evaluate the performance of L2T-BEV on the large-scale public NuScenes dataset [21] to demonstrate the efficacy of our approach.

2 Related Works

2.1 Road Network Extraction

In recent years, road network generation techniques based on machine learning have been developed and widely applied to improve the accuracy of road topology and reduce manual costs. M´ attyus et al. proposed DeepRoadMapper [22] to directly estimate road topology from aerial remote sensing images and extract road networks; however, some inappropriate connections persist. To address this issue, Batra et al. introduced the orientation learning and connectivity refinement approach [23] to resolve discontinuity problems. From an iterative growth perspective, Bastani proposed RoadTracer [24], which employs a search algorithm guided by CNN-based decision functions to automatically extract road networks and address occlusion issues in aerial remote sensing images. However, iterative growth methods require significant time to create road networks vertex by vertex. To further reduce incompleteness and curvature loss, BoundaryNet [12] utilized four learning networks to extract and complete road boundaries from laser scanning point clouds and satellite images [25]. In summary, the road networks extracted from aerial remote sensing or satellite images are road-level, suitable for road path planning. However, considering the precision and local dynamic driving planning tasks required for autonomous driving, such as lane changing, lane-level road topology is more suitable for self-driving vehicles. 2.2

BEV Representation

In recent years, research on road representation learning based on BEV has gained prominence [9,15,16], with a focus on both single-view and surround-view BEV. Can et al. [15] employed Bezier curves to represent lane centerlines and predicted lane topology using a DETR-like network in a single front-view BEV. However, their approach is limited by the single-view perception and cannot handle map elements in other views surrounding the vehicle. As for surround-view BEV, some recent studies have predicted rasterized maps by performing BEV


semantic segmentation [26,27]. However, these rasterized maps lack the vectorized lane structure information required for downstream autonomous driving tasks, such as motion prediction. To address this issue, HDMapNet [16] encodes surround-view images and LiDAR point clouds, predicting local vector map elements in BEV perspective, including road boundary lines, lane dividers, and crosswalks. Nonetheless, the aforementioned surround-view BEV studies do not represent the topology between lanes, which is crucial for many downstream tasks in autonomous driving, such as motion prediction and vehicle navigation. Therefore, in this paper, we focus on directly predicting lane topology from vehicle surround-view images.

3 Methodology

In this section, we present L2T-BEV, a learning method specifically designed for predicting local lane topology from surround-view cameras under the BEV perspective. We begin by providing an overview of the L2T-BEV architecture. Subsequently, we elaborate on the lane topology representation model used in our approach. Following this, we discuss the process of extracting features from six vehicle-mounted cameras and transforming them into BEV. We then explain the rationale behind employing a fusion encoding-based transformer module for lane topology prediction. Lastly, we describe the composition of the loss function used in our method. 3.1

Overview

When planning driving tasks, self-driving vehicles must comprehend local scene information in addition to global road data. A crucial aspect of local scene perception is generating local lane topology from perception data. This paper focuses on employing surround-view images to create lane topology maps from a BEV perspective for local autonomous driving tasks. The surround-view images are captured by six on-board cameras, addressing the issue of insufficient viewing angles from a single camera and providing a more cost-effective solution for largescale deployment on autonomous vehicles than expensive LiDAR sensors. In our approach, we employ the EfficientNet [28] capable of fusing different levels of feature information as the image feature extraction network. The two-dimensional image space exhibits a “disappearance phenomenon,” making it unsuitable for vehicular task planning. To better serve autonomous vehicles, we utilize the camera’s intrinsic and extrinsic parameters and apply the IPM [19] method to transform the image from the image coordinate system to the vehicle coordinate system from a BEV perspective. However, IPM transformations are distorted and lack depth information, preventing the direct generation of lane topology. To address this challenge, we use a fused-encoded Transformer network to process the transformed images, producing higher-precision lane topologies. Finally, a post-processing method combines the predicted lane topologies into a local lane topology network. Figure 1 provides a detailed overview of our method’s architecture.


Fig. 1. The EfficientNet network extracts image features from surround-view images, which are then transformed into the BEV space using the IPM. These features are subsequently combined with LRPE and IPMPE and input into a transformer network for lane topology prediction. After post-processing, a local lane topology map can be obtained.

3.2 Lane Topologies Representation

Lane topology is a vital foundational element in HDMap, facilitating autonomous vehicles in achieving better local driving planning, such as lane changes and behavior prediction. L2T-BEV aims to predict the local lane network map surrounding autonomous vehicles, which is a directed acyclic graph G = (V, E), where vertices V are annotated with spatial coordinates, and edges E represent lane centerline segments, signifying the connection relationship between two vertices. This connection relationship can be represented by an incidence matrix [15] A. Specifically, A[X, Y] = 1 indicates that the endpoint of lane centerline segment X is the same as the starting point of lane centerline segment Y. Note that due to the limited representation range of onboard surround-view images, generally, when A[X, Y] = 1, A[Y, X] will not equal 1. Current lane centerline representation models mainly include polynomial fitting, spline curves, and Bézier curves. In this paper, we focus on using Bézier curves to represent the lane centerline, as Bézier curves possess several significant advantages, such as ease of computation, strong local control, and effortless multi-segment stitching. The Bézier curve model considers the lane centerline as a weighted sum of a set of control points. That is, given a set of control points P = {P_0, P_1, · · ·, P_n}, the lane centerline can be represented as:

C(t) = \sum_{j=0}^{n} \binom{n}{j} (1 − t)^{n−j} t^{j} P_j    (1)

where t ∈ [0, 1] is the control factor of the Bézier curve. Thus, by adjusting the control point positions, a smooth curve that adapts to different lane curvatures can be generated, making Bézier curves an ideal choice for representing lane centerlines. Next, to find the optimal control points, we first employ a deep learning model to predict the control points of the Bézier curve, and then transform it into solving a least squares problem, that is, minimizing the difference between the predicted control points and the actual observation points given a certain


number of observation points [15]. Through this method, we can further adjust the predicted control points, making the Bézier curve closer to the actual lane centerline. The rationale of this method lies in predicting and optimizing control points under the guidance of a deep learning model, contributing to the generation of more accurate lane topology representations.
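To make this representation concrete, the small NumPy sketch below (not the authors' code) evaluates Eq. (1) from a set of control points and recovers control points from sampled observation points by ordinary least squares; the curve degree and sample counts are arbitrary illustrative choices.

```python
import numpy as np
from math import comb

def bernstein_matrix(ts, n):
    """Rows: sample positions t; columns: Bernstein basis values B_{j,n}(t)."""
    return np.array([[comb(n, j) * (1 - t) ** (n - j) * t ** j
                      for j in range(n + 1)] for t in ts])

def bezier_points(control_pts, num_samples=72):
    """Evaluate C(t) = sum_j binom(n, j) (1 - t)^(n - j) t^j P_j on a uniform t grid."""
    n = len(control_pts) - 1
    ts = np.linspace(0.0, 1.0, num_samples)
    return bernstein_matrix(ts, n) @ control_pts          # (num_samples, 2)

def fit_control_points(observed_pts, degree=3):
    """Least-squares control points that best reproduce the observed centerline points."""
    ts = np.linspace(0.0, 1.0, len(observed_pts))
    B = bernstein_matrix(ts, degree)
    ctrl, *_ = np.linalg.lstsq(B, observed_pts, rcond=None)
    return ctrl                                           # (degree + 1, 2)

# toy usage: a gently curving centerline
obs = np.stack([np.linspace(0, 30, 40), 0.02 * np.linspace(0, 30, 40) ** 2], axis=1)
ctrl = fit_control_points(obs)
curve = bezier_points(ctrl)
```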

3.3 Image Encoder

The image encoder consists of two components: feature extraction and Inverse Perspective Mapping (IPM) transformation. Feature extraction is designed to obtain high-dimensional information from images, while the IPM transformation converts the feature map from the image coordinate system to the ego-car coordinate system within the BEV perspective, which is more suitable for vehicle driving task planning. In recent years, a variety of image feature extraction networks have been introduced, such as ResNet, the Inception series, and EfficientNet [28]. This study employs EfficientNet to extract lane centerline feature information due to its ability to maintain high accuracy with low computational complexity and fewer parameters. The original N images with dimensions N × K × H × W are first processed through the EfficientNet and the neural view transformer [16], producing camera coordinate features F_cam ∈ R^{N × K_cam × H_cam × W_cam}. As the internal and external parameters of onboard cameras are usually known, this paper applies the IPM transformation to convert image features from the camera coordinate system to the ego-car coordinate system to obtain the BEV feature F_bev ∈ R^{N × K_bev × H_bev × W_bev}. However, the accuracy of the IPM transformation can be affected by factors such as vehicle vibrations and changes in ambient temperature. To improve post-transformation accuracy, we adopt a learnable residual mapping function [29] for the features following the IPM transformation, as presented in Eq. 2:

F_fuse = F_bev + φ(F_cam)    (2)

where φ models the relation between image features in the camera coordinate system and the BEV space. Finally, the fused BEV features F_fuse are fed into a transformer model to predict the lane graph.
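The following PyTorch sketch illustrates the residual fusion of Eq. (2). The 1×1 convolution standing in for φ and the bilinear resize standing in for the actual IPM warp are assumptions made for illustration, not the paper's implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ResidualBEVFusion(nn.Module):
    """F_fuse = F_bev + phi(F_cam): add a learnable correction on top of the IPM result."""
    def __init__(self, cam_channels, bev_channels, bev_size):
        super().__init__()
        self.bev_size = bev_size                      # (H_bev, W_bev)
        # phi: a learnable mapping from camera-space features to the BEV feature space
        self.phi = nn.Conv2d(cam_channels, bev_channels, kernel_size=1)

    def forward(self, f_cam, f_bev):
        # Bring the camera feature onto the BEV grid; a real system would apply the
        # IPM homography (camera intrinsics/extrinsics) here instead of plain resizing.
        f_cam_on_bev = F.interpolate(f_cam, size=self.bev_size,
                                     mode="bilinear", align_corners=False)
        return f_bev + self.phi(f_cam_on_bev)

# toy usage with made-up shapes
fusion = ResidualBEVFusion(cam_channels=64, bev_channels=64, bev_size=(200, 100))
f_cam = torch.randn(2, 64, 32, 88)
f_bev = torch.randn(2, 64, 200, 100)
f_fuse = fusion(f_cam, f_bev)        # (2, 64, 200, 100)
```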

3.4 Lane Graph Detection

In this study, we employ a transformer-based network to predict lane topologies. The transformer model first receives the fused feature F f u se processed by the image encoder, and subsequently outputs a series of candidate proposal vectors. These vectors are then fed into the lane topology detector for estimating the lane graph. Typically, transformer models require positional embedding (PE) to enhance spatial attention. However, given the potential distortion and susceptibility to external environmental interference of IPM transformation, directly encoding features after IPM transformation may not facilitate the transformer model’s


perception of lane topology. Consequently, we adopt a fusion embedding method based on PE. This approach incorporates both IPM positional embedding (IPMPE) and learnable positional embedding (LRPE). Similar to [15], IPMPE directly encodes IPM features by employing sinusoidal functions to process normalized cumulative locations. As a dynamically tuned positional embedding, LRPE can be automatically optimized based on input data during training. This adaptability enables the learnable embedding to capture more fine-grained position information relevant to the task, enhancing the transformer model's flexibility. Referring to [15], the lane topology detector comprises four components: detection head, control head, association head, and association classifier. To obtain lane topology, the transformer's output vectors are first fed into the detection head and control head. The detection head outputs the probability of a lane centerline's presence corresponding to each query vector, while the control head generates Bézier curve control points representing the lane centerline geometry. After these two steps, we acquire a set of candidate lane centerlines G_candi = (V_candi, E_candi). Subsequently, the association head processes these candidate centerlines into low-dimensional associated feature vectors F_assoc. The association matrix M_assoc, composed of these association feature vectors, is then fed into the association classifier, which calculates the association probabilities between centerline pairs. Based on these probabilities, we can determine the relationships between lane centerlines and establish connections among them, ultimately constructing a comprehensive network of lane centerlines.
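As a rough illustration of this last step (an assumed post-processing, not the published code), the sketch below thresholds the per-query existence scores and the pairwise association probabilities to obtain the incidence matrix A described in Sect. 3.2.

```python
import torch

def build_lane_graph(exist_prob, assoc_prob, exist_thr=0.5, assoc_thr=0.5):
    """
    exist_prob: (Q,) probability that each query is a real centerline.
    assoc_prob: (Q, Q) probability that the end of centerline i meets the start of j.
    Returns a binary incidence matrix A with A[i, j] = 1 for connected pairs.
    """
    keep = exist_prob > exist_thr                       # candidate centerlines
    A = (assoc_prob > assoc_thr).float()
    A = A * keep.view(-1, 1) * keep.view(1, -1)         # drop edges touching rejected queries
    A.fill_diagonal_(0)                                 # no self-connections
    return A

# toy usage with 100 query vectors
exist_prob = torch.rand(100)
assoc_prob = torch.rand(100, 100)
A = build_lane_graph(exist_prob, assoc_prob)
```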

3.5 Loss Function

Based on the lane topology representation model, we divide the network training loss Losstrain into five parts: lane centerline classification loss (Llabel ), lane centerline endpoint loss (Lend ), B´ezier curve control point loss (Lbezer ), lane centerline association loss (Lassoc ), and lane centerline endpoint association loss (Lenda ), as shown in Eq. 3: Ltotal = λ1 Llabel + λ2 Lend + λ3 Lbezer + λ4 Lassoc + λ5 Lenda

(3)

where Llabel and Lassoc are calculated using cross-entropy, while the remaining loss terms employ 1-norm loss computation.

4 Experiments

4.1 Experimental Settings

Dataset. We use the NuScenes dataset [21] to evaluate and verify our proposed L2T-BEV. The NuScenes dataset is a large-scale, diverse, and high-quality dataset designed for autonomous driving research. It contains 1,000 scenes, each 20 s long, with a total of 1.4 million images captured from six cameras mounted on the data collection vehicle. The rich annotations provided in the NuScenes dataset include 3D bounding boxes, semantic maps, and high-definition lanelevel maps.


Metrics. Following [15], we take the precision-recall, detection ratio and connectivity as the evaluating metrics to measure the L2T-BEV performance. Specifically, the precision-recall scores are calculated based on matched centerlines at different distance thresholds, using a matching process that considers the direction of control points and uses minimum L1 loss. And this metric measures how well the estimates fit the matched ground truth centerlines and how accurately the captured subgraph is represented. The detection ratio is the ratio of the number of unique GT centerlines that are matched to at least one estimated centerline to the total number of GT centerlines. It was proposed to address the limitations of the precision-recall metric in dealing with missed centerlines. These two metrics provide a comprehensive evaluation of the performance of centerline detection methods. The connectivity metric is used to evaluate the accuracy of the association between centerlines in a road network. It uses the incidence matrices of the estimated and GT centerlines to compute precision and recall. Implementation Details. In this study, we select 698 scenes from the NuScenes dataset as the training set and 148 scenes as the evaluation set. The BEV area is set as [–15 m, 15 m] along the x-direction and [–30 m, 30 m] along the z-direction, with a precision of 15 cm. We train our model on 4 GTX 1080Ti GPUs with a batch size of 12. The Efficient-b4 [28] model was chosen as the backbone network, pre-trained on the ImageNet. To detect lane centerlines, we use 100 query vectors in the transformer model. We adopt the Adam optimizer for model training, with a learning rate of 1e–5. The weights for the various loss terms in Eq. 3 are set to λ1 = 3, λ2 = 1, λ3 = 3, λ4 = 2, and λ5 = 1.
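A simplified sketch of the detection-ratio and connectivity metrics described above (my paraphrase, not the official evaluation script); it assumes the estimated incidence matrix has already been re-indexed onto the matched ground-truth centerlines.

```python
import numpy as np

def detection_ratio(matches, num_gt):
    """matches: list of (est_idx, gt_idx) pairs produced by the centerline matching step."""
    matched_gt = {g for _, g in matches}
    return len(matched_gt) / max(num_gt, 1)

def connectivity_scores(A_est, A_gt):
    """Precision / recall / IoU over the entries of the binary incidence matrices."""
    A_est, A_gt = A_est.astype(bool), A_gt.astype(bool)
    tp = np.logical_and(A_est, A_gt).sum()
    fp = np.logical_and(A_est, ~A_gt).sum()
    fn = np.logical_and(~A_est, A_gt).sum()
    prec = tp / max(tp + fp, 1)
    rec = tp / max(tp + fn, 1)
    iou = tp / max(tp + fp + fn, 1)     # C-IOU = TP / (TP + FP + FN)
    return prec, rec, iou
```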

Fig. 2. Precision/Recall vs thresholds. Thresholds are uniformly sampled in [0.01, 0.1] (normalized coordinates) with 0.01 increments (0.01 corresponds to 50cm).

4.2 Comparison with Other Methods

In this study, we compare our proposed L2T-BEV method with four other techniques, namely Poly(Est) [15], Poly(GT) [15], STSU [15], and svSTSU. The


first three methods utilize monocular camera images as input, while the latter, svSTSU, employs surround-view camera images. Poly(Est) and Poly(GT) correspond to Poly-RNN [30] using estimated initial points and ground truth (GT) initial points, respectively, for generating centerline curves. For a fair comparison, we only apply the STSU method for lane graph prediction without conducting object detection in this work. In order to adapt STSU for surround-view images within the svSTSU method, we initially process each image separately using a backbone, concatenate the encoding results in the feature channel dimension, and finally feed the resultant data to the transformer.

Table 1. Lane graph prediction results. M-Pre and M-Rec indicate the mean of the sampled points of the precision-threshold and recall-threshold curves, see Fig. 2. C-Pre and C-Rec refer to connectivity precision and recall, while C-IOU is connectivity TP/(TP + FP + FN).

Method      M-Pre  M-Rec  Detect  C-Pre  C-Rec  C-IOU
Poly(GT)    70.0   72.3   76.4    55.2   50.9   36.0
Poly(Est)   54.8   51.2   40.5    60.3   15.9   14.3
STSU        58.3   52.5   60.0    58.2   53.5   38.7
svSTSU      59.2   56.0   55.2    61.7   57.2   42.2
Ours        61.7   57.6   59.1    63.9   69.3   49.9

As demonstrated in Table 1 and Fig. 2, apart from the detection rate, the performance metrics of svSTSU are higher than those of STSU, indicating that the STSU method can be applied to process surround-view camera images. In contrast, our proposed method outperforms svSTSU in all the metrics, with a detection rate very close to that of STSU, which demonstrates the effectiveness of the IPM-based BEV feature fusion and PE fusion approach introduced in this study. The visualization results of lane topology in the BEV space are shown in Fig. 3. Compared to svSTSU, our method produces a more faithful lane graph. 4.3

Ablation

Due to the distortion problem inherent in IPM transformations and their susceptibility to external interference, we propose a fusion of PE and BEV features to mitigate this impact. In this section, we tried using three sets of experiments to validate the effectiveness of LRPE and fused BEV features. The baseline method refers to the BEV features input into the transformer model without considering LRPE and learnable residual mapping function. The experimental results are shown in Table 2. We observed that the LRPE and BEV features have a substantial influence on detection and connectivity scores, with relatively minimal effect on performance of precision-recall.


Fig. 3. Sample centerline estimates using surround-view camera images. Our method produces a more faithful lane graph representation.

Table 2. Ablations are carried out on three models that test the performance contribution of the positional embeddings and fused BEV features.

Method                      M-Pre  M-Rec  Detect  C-Pre  C-Rec  C-IOU
Baseline                    60.5   56.7   55.5    62.5   58.8   43.5
Baseline + LRPE             61.4   57.0   57.3    62.4   63.4   45.9
Baseline + LRPE + FusBEV    61.7   57.6   59.1    63.9   69.3   49.9

4.4 Limitations and Future Work

Our method focuses on predicting endpoints and Bezier curve control points of lane centerlines before establishing lane topology relationships. However, this approach has a limitation: it does not provide adequate topological association indicators during lane centerline association. As a result, minimal circles [9] are proposed to enhance accuracy in lane topology associations, but performance gains are limited. In future studies, we plan to generate lane centerline topology maps using a segmentation-based approach. Specifically, we will first segment the lane centerlines and select key points within these centerlines. These key points can facilitate fitting the lane centerlines and enable the determination of connection relationships between centerlines by analyzing relationships between the key points.

5 Conclusion

We introduced a novel method, L2T-BEV, which predicts local lane topology directly from onboard surround-view cameras in the BEV perspective. For generating a higher-precision lane topology, we proposed to leverage the learnable residual mapping function and the learnable positional embedding to alleviate


the distortion issues in IPM transformations. Our validation of this method on the NuScenes dataset yielded promising results, attesting to its feasibility and excellent performance. Ablation studies also demonstrate the effects of various design choices.

References 1. . Wang, H., Xue, C., Zhou, Y., Wen, F., Zhang, H.: Visual semantic localization based on HD map for autonomous vehicles in urban scenarios. In: 2021 IEEE International Conference on Robotics and Automation (ICRA), Xi’an, China, pp. 11255–11261 (2021). https://doi.org/10.1109/ICRA48506.2021.9561459 2. Chiang, K.-W., Zeng, J.-C., Tsai, M.-L., Darweesh, H., Chen, P.-X., Wang, C.-K.: Bending the curve of HD maps production for autonomous vehicle applications in Taiwan. IEEE J. Sel. Topics Appl. Earth Obs. Remote Sens. 15, 8346–8359 (2022). https://doi.org/10.1109/JSTARS.2022.3204306 3. Chiang, K.W., Wang, C.K., Hong, J.H., et al.: Verification and validation procedure for high-definition maps in Taiwan. Urban Inf. 1, 18 (2022). https://doi.org/10. 1007/s44212-022-00014-0 4. Liu, J.N., Zhan, J., Guo, C., Li, Y., Wu, H.B., Huang, H.: Data logic structure and key technologies on intelligent high-precision map. Acta Geodaetica et Cartographica Sinica 48(8), 939–953 (2019). https://doi.org/10.11947/j.AGCS.2019. 20190125 5. Maiouak, M., Taleb, T.: Dynamic maps for automated driving and UAV geofencing. IEEE Wirel. Commun. 26(4), 54–59 (2019). https://doi.org/10.1109/MWC.2019. 1800544 6. HERE. https://www.here.com/. Accessed 8 Apr 2023 7. Kim, C., Cho, S., Sunwoo, M., Resende, P., Brada¨ı, B., Jo, K.: Updating point cloud layer of high definition (HD) map based on crowd-sourcing of multiple vehicles installed LiDAR. IEEE Access 9, 8028–8046 (2021). https://doi.org/10.1109/ ACCESS.2021.3049482 8. Jang, W., An, J., Lee, S., Cho, M., Sun, M., Kim, E.: Road lane semantic segmentation for high definition map. In: IEEE Intelligent Vehicles Symposium (IV). Changshu, China 2018, pp. 1001–1006 (2018). https://doi.org/10.1109/IVS.2018. 8500661 9. Can, Y.B., Liniger, A., Paudel, D.P., Van Gool, L.: Topology preserving local road network estimation from single onboard camera image. In: 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, pp. 17242–17251 (2022). https://doi.org/10.1109/CVPR52688.2022.01675 10. Kiran, B.R., et al.: Real-time dynamic object detection for autonomous driving using prior 3D-maps. In: Proceedings of the European Conference on Computer Vision (ECCV) Workshops (2018). https://doi.org/10.1007/978-3-030-11021-5 35 11. Bao, Z., Hossain, S., Lang, H., Lin, X.: High-definition map generation technologies for autonomous driving: a review (2022). arXiv preprint arXiv:2206.05400 12. Ma, L., Li, Y., Li, J., Junior, J.M., Gon¸calves, W.N., Chapman, M.A.: BoundaryNet: extraction and completion of road boundaries with deep learning using mobile laser scanning point clouds and satellite imagery. IEEE Trans. Intell. Transp. Syst. 23(6), 5638–5654 (2022). https://doi.org/10.1109/TITS.2021. 3055366


13. Xu, Z., et al.: csBoundary: city-scale road-boundary detection in aerial images for high-definition Maps. IEEE Rob. Autom. Lett. 7(2), 5063–5070 (2022). https:// doi.org/10.1109/LRA.2022.3154052 14. Gao, S., Li, M., Rao, J., Mai, G., Prestby, T., Marks, J., Hu, Y.: Automatic urban road network extraction from massive GPS trajectories of taxis. In: Werner, M., Chiang, Y.-Y. (eds.) Handbook of Big Geospatial Data, pp. 261–283. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-55462-0 11 15. Can, Y.B., Liniger, A., Paudel, D.P., Van Gool, L.: Structured bird’s-eye-view traffic scene understanding from onboard images. In: 2021 IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, QC, Canada, pp. 15641–15650 (2021). https://doi.org/10.1109/ICCV48922.2021.01537 16. Li, Q., Wang, Y., Wang, Y., Zhao, H.: HDMapNet: an online HD map construction and evaluation framework. In: International Conference on Robotics and Automation (ICRA), Philadelphia, PA, USA, pp. 4628–4634 (2022). https://doi.org/10. 1109/ICRA46639.2022.9812383 17. Liu, Y.C., Wang, Y., Wang, Y.L., Zhao, H.: Vectormapnet: end-to-end vectorized hd map learning. arXiv preprint arXiv:2206.08920 (2022) 18. Liao, B.C., et al.: MapTR: structured modeling and learning for online vectorized HD map construction. arXiv preprint arXiv:2208.14437 (2022) 19. Deng, L., Yang, M., Li, H., Li, T., Hu, B., Wang, C.: Restricted deformable convolution-based road scene semantic segmentation using surround view cameras. IEEE Trans. Intell. Transp. Syst. 21(10), 4350–4362 (2020). https://doi.org/ 10.1109/TITS.2019.2939832 20. Raisi, Z., Naiel, M.A., Younes, G., Wardell, S., Zelek, J.: 2LSPE: 2D learnable sinusoidal positional encoding using transformer for scene text recognition. In: 2021 18th Conference on Robots and Vision (CRV), Burnaby, BC, Canada, pp. 119–126 (2021). https://doi.org/10.1109/CRV52889.2021.00024 21. Caesar, H., et al.: nuScenes: a multimodal dataset for autonomous driving. In: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, pp. 11618–11628 (2020). https://doi.org/10.1109/CVPR42600. 2020.01164 22. M´ attyus, G., Luo, W., Urtasun, R.: DeepRoadMapper: extracting road topology from aerial images. In: 2017 IEEE International Conference on Computer Vision (ICCV), Venice, Italy, pp. 3458–3466 (2017). https://doi.org/10.1109/ICCV.2017. 372 23. Batra, A., Singh, S., Pang, G., Basu, S., Jawahar, C.V., Paluri, M.: Improved road connectivity by joint learning of orientation and segmentation. In: 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, pp. 10377–10385 (2019). https://doi.org/10.1109/CVPR.2019.01063 24. Bastani, F., et al.: RoadTracer: automatic extraction of road networks from aerial images. In: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, pp. 4720–4728 (2018). https://doi.org/10.1109/ CVPR.2018.00496 25. Zhang, J., Hu, X., Wei, Y., Zhang, L.: Road topology extraction from satellite imagery by joint learning of nodes and their connectivity. IEEE Trans. Geosci. Remote Sens. 61, 1–13 (2023). https://doi.org/10.1109/TGRS.2023.3241679 26. Zhou, B., Kr¨ ahenb¨ uhl, P.: Cross-view transformers for real-time map-view semantic segmentation. In: 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, pp. 13750–13759 (2022). https:// doi.org/10.1109/CVPR52688.2022.01339


27. Hu, A., et al.: FIERY: future instance prediction in bird’s-eye view from surround monocular cameras. In: 2021 IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, QC, Canada, pp. 15253–15262 (2021). https://doi.org/ 10.1109/ICCV48922.2021.01499 28. Tan, M., Le, Q.V.: EfficientNet: rethinking model scaling for convolutional neural networks (2019). ArXiv preprint arXiv:1905.11946 29. Xu, Z.H., Liu, Y.X., Sun, Y.X., Liu, M., Wang, L.J.: CenterLineDet: CenterLine Graph detection for road lanes with vehicle-mounted sensors by transformer for HD map generation (2023). ArXiv preprint arXiv:2209.07734 30. Acuna, D., Ling, H., Kar, A., Fidler, S.: Efficient interactive annotation of segmentation datasets with Polygon-RNN++. In: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, pp. 859–868 (2018). https://doi.org/10.1109/CVPR.2018.00096

CCLane: Concise Curve Anchor-Based Lane Detection Model with MLP-Mixer Fan Yang1 , Yanan Zhao1(B) , Li Gao1 , Huachun Tan1(B) , Weijin Liu2 , Xue-mei Chen1 , and Shijuan Yang1 1

2

Beijing Institute of Technology, Beijing 100081, China {zyn,tanhc}@bit.edu.cn China Changfeng Science Technology Industry Group Corp, Beijing 100081, China https://www.bit.edu.cn/, http://www.ascf.com.cn/ Abstract. Lane detection needs to meet the real-time requirements and efficiently utilize both local and global information on the feature map. In this paper, we propose a new lane detection model called CCLane, which uses the pre-set curve anchor method to better utilize the prior information of the lane. Based on the Cross Layer Refinement method for extracting local information at different levels, we propose a way to combine MLP-Mixer and spatial convolution to obtain global information and achieve information transmission between lanes, which flexibly and efficiently integrates local and global information. We also extend the DIoU loss function to lane detection and design the LDIoU loss function. The method is evaluated on two widely used lane detection datasets, and the results show that our method performs well.

Keywords: Lane detection · Line anchor · MLP

1 Introduction

Lane detection is an important task in the field of computer vision, and it is also one of the tasks needed to achieve autonomous driving. Lane detection can provide information such as the road condition and the drivable area, and can also help with self-localization. In the early stages, lane detection tasks were achieved by digital image processing methods. Thanks to the development of deep learning, more and more lane detection methods use Convolutional Neural Networks and other methods to extract features. According to the way lane lines are modeled, the methods based on deep learning can be divided into several categories: segmentation-based methods, polynomial curve-based methods, rowwise-based methods, and anchor-based methods. Anchor-based methods usually makes good use of the prior information about the shape of the lane lines and performs better in the absence of visual cues. Most This project is supported by the National Key R&D Program of China under Grant 2018YFB0105205-02 and 2020YFC0833406, Key R&D Program of Shandong Province under Grant 2020CXGC010118, Jinan Space-air Information Industry Special Project under Grant QYYZF1112023105. c The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024  Q. Liu et al. (Eds.): PRCV 2023, LNCS 14427, pp. 376–387, 2024. https://doi.org/10.1007/978-981-99-8435-0_30


anchor-based methods use straight lines at different positions as pre-set anchors. However, lane lines are not always straight, and in scenes such as curves and ramps, lane lines have a certain degree of curvature. Straight line anchors can affect the convergence of training and the final detection performance. In this paper, we extended the work of CLRNet [28] and proposed curve anchors generated by spline curves based on Cross Layer Refinement method of the network. The curve anchors with different radii and shapes are closer to real lane lines, and can better utilize the prior information of lane lines, enabling easier convergence during training and better detection results. Then, we abandoned the attention mechanism adopted by many methods and used MLP-Mixer [19] to obtain global information, which can simplify the parameter of the model, and the model size can be flexibly adjusted by adjusting the depth of MLP-Mixer [19]. In addition, we also adopted the method of spatial convolution to achieve information transmission between different lane lines. The entire model only uses Convolutional Layer and fully connected layers, and is deployment-friendly. We validated the effectiveness of our method on two commonly used lane detection datasets, TuSimple [20] and CULane [14]. The experiments showed that our method achieved state-of-the-art F1 score on the TuSimple [20] dataset, and competitive results on the CULane [14] dataset, with an F1 score only 0.3% lower than the state-of-the-art method. The smallest model of this method can reach 45 frames per second, and the largest model can also reach 26 frames per second, meeting the real-time requirements. The main contributions can be summarized as follows: • We proposed a method for pre-set curve anchors that better utilize the prior information of lane lines and improve the model’s ability to detect lane lines. • We adopted a combination of MLP-Mixer [19] and spatial convolution to obtain global information and achieve information transfer between lane lines, making lane line detection more robust. • We propose a loss function LDIoU for lane lines, which considers the whole of lane lines at different scales. • The model achieved the state-of-the-art performance on the TuSimple [20] dataset with the highest F1 score and achieved competitive results on the CULane [14] dataset.

2 Related Work

2.1 Segmentation-Based Methods

Many early lane detection methods used a similar approach as semantic segmentation, by determining whether each pixel is foreground or background and then decoding each lane line instance in post-processing. SCNN [14] extended traditional deep layer-by-layer convolution to feature maps to achieve message passing between rows and columns of pixels in a single layer, achieving stateof-the-art performance at the time. Compared with SCNN [14], ENet SAD [7] reduces parameters by 20 times while maintaining performance and increases


running speed by 10 times. CurveLane NAS [23] achieved outstanding performance in challenging curved environments by adopting a feature fusion search module. PINet [8] adopted a keypoint-based approach, treating the clustering problem of predicted keypoints as an instance segmentation problem, and had a smaller network than other methods. 2.2

Polynomial Curve-Based Methods

In contrast to segmentation-based methods, curve-based methods usually have high inference speed because they represent lane lines as curves and obtain each lane line by regressing the curve equation. The pioneering work of this method [21] used a differentiable least square fitting module to represent lane lines and finally obtained the best fitting curve parameters. PolyLaneNet [18] used a similar method, outputting the confidence scores of each lane and the areas of these polynomials. LSTR [13] used a network based on Transformer trained to predict parameters, considering non-local interaction of information. B´ezierLaneNet [3] proposed using B´ezier curves to represent lane lines and BSNet [2] utilized B-spline curves to fit lane lines. 2.3

Row-Wise-Based Methods

The lane detection task can be transformed into a row-wise classification task using methods like E2E-LMD [24] and UFLD [15]. These methods significantly reduce the computational cost by using row-wise classification patterns. UFLD [15] uses a large receptive field on global features and can handle challenging scenes. UFLDv2 [16] further develops this method by using sparse coordinate representation for lanes on a series of mixed (row and column) anchors. CondLaneNet [12] introduces a conditional lane detection strategy, which first detects lane instances and then dynamically predicts the geometry of each instance. Additionally, a recursive instance module is designed to overcome the problem of detecting lane lines with complex topological structures. 2.4

Line-Anchor-Based Methods

Line-CNN [9] is a pioneer in line anchor representation, using pixel features at the starting point of a line anchor to predict offsets. LaneATT [17] introduces attention mechanisms to facilitate information exchange between different lane lines and enhance global features, achieving good results with fast speed. CLRNet [28] proposes a Cross Layer Refinement method that first detects lanes with high-level semantic features and then refines them based on low-level features. This approach can leverage more contextual information to detect lanes and also improve the localization accuracy of lane lines using local features. CLRNet [28] greatly reduces the number of line anchors and the computational burden of the network, achieving state-of-the-art detection results. In view of the outstanding performance of methods based on line anchors, this paper develops this method based on CLRNet [28].


There are other models that are different from the typical methods mentioned above. The common challenge faced by these methods is how to effectively utilize the features of lane lines, including prior information, global information, and local information, to maintain robust detection performance in various complex environments, while also meeting real-time requirements and maintaining satisfactory performance.

3 Method

Our network architecture is shown in Fig. 1. Firstly, a series of prior lane lines are defined on the input image. These lane lines are distributed at different positions on the image, with a variety of tilt angles and radians. After the backbone, FPN [10] is connected to further fusion the feature information of different levels, and then output 4 layers of feature maps containing different level feature information. The highest-level feature map connects the MLP-Mixer [19] to extract global information for recalling and roughly locating lane lines. The lowest-level feature map connects the spatial convolution module to achieve information transmission between different lane lines, and to obtain further global information. After traversing the 4 feature maps, get ROI at the corresponding position based on the lane anchor in each feature map. The three different types of information are fused together and fed into several fully connected layers to obtain the final output.

Fig. 1. The structure of CCLane. The features are extracted from the backbone network and fused with the FPN [10]. Use the pre-set curve lane anchor as local information. The MLP-Mixer [19] was adopted to extract global information and carry out convolution operation in different directions of the feature map to transfer information between lane lines.

3.1 The Lane Representation

The pre-set lane anchor expresses the possible existence of lane lines in the image. The closer the prior lane lines are to the actual lanes in terms of position


and shape, the easier the network converges, and the better the final detection performance. In previous methods, lane anchor was usually straight lines distributed at different positions in the image. However, lane lines are not always straight, they also have different radians in curved and ramp scenes. Therefore, as shown in Fig. 2, we use some curves with different radians as lane anchor and mix them with straight lane anchor in a certain proportion to jointly serve as the prior lane lines.

Fig. 2. Lane anchor. The lane anchor for curves can be closer to real lane lines.

Fig. 3. LDIoU. The lane lines can be considered at different scales.

Specifically, similar to CLRNet [28], lane lines are represented by 2D points evenly spaced in the vertical direction, that is:

P = {(x_0, y_0), · · ·, (x_{N−1}, y_{N−1})}    (1)

y_i = i · H / (N − 1),  i ∈ (0, N − 1)    (2)

Where H is the height of the image, and N is the number of points used to represent a lane line. For each point, the y-coordinate is fixed and invariant. For each point of the prior lane anchor, the x-coordinate will be sampled on the image. The predicted lane line will be overlaid with an offset on the x-coordinate of the prior lane line. Each prior lane anchor contains foreground and background probabilities, lane line length, starting point of the lane line, the angle between the lane line and the horizontal direction, and the horizontal offset of each 2D point in the image. For curved lane anchor, the start points, the end point, and the four dividing points near the top of the straight lane anchor will be taken, and different amounts of offsets will be applied to the middle two points. Finally, the minimum least squares polynomial is used to fit the curve.
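The sketch below is a guess at this construction rather than the released code: it generates the fixed y-coordinates of Eq. (2) and bends a straight anchor into a curve anchor by offsetting interior points and re-fitting a least-squares polynomial. The number of key points and the offset sizes are illustrative choices.

```python
import numpy as np

def anchor_ys(H, N=72):
    """Fixed, evenly spaced y-coordinates of the N anchor points (Eq. 2)."""
    return np.array([i * H / (N - 1) for i in range(N)])

def curved_anchor(straight_xs, ys, bend=15.0, degree=3):
    """
    Bend a straight line anchor into a curve anchor: offset interior points and
    re-fit all N x-coordinates with a least-squares polynomial.
    """
    N = len(ys)
    key_idx = np.array([0, N // 3, 2 * N // 3, N - 1])      # start, two middle points, end
    key_xs = straight_xs[key_idx].astype(float)
    key_xs[1] += bend                                        # push the middle points sideways
    key_xs[2] += 0.5 * bend
    coeffs = np.polyfit(ys[key_idx], key_xs, degree)         # least-squares polynomial fit
    return np.polyval(coeffs, ys)                            # x-coordinate for every anchor point

# toy usage: a vertical straight anchor at x = 300 on a 320-pixel-high image
ys = anchor_ys(H=320, N=72)
straight = np.full(72, 300.0)
curve = curved_anchor(straight, ys)
```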

3.2 Global Information Extraction Module

A lane line often spans a large part of the image, and there is an association between lane lines. When lane lines are obscured or worn, the missing parts of the lane lines in the image can be detected by using the overall information of the lane lines or the information of nearby lane lines. Many methods use attention mechanisms to achieve information transmission between different lane lines [17,28], or use complex special convolution methods to achieve message transmission between rows and columns of pixels in a layer [14]. These methods bring huge computational demands. To address the above problem, we use two methods to obtain global information. Inspired by MLP-Mixer [19], we divide the highest level feature map into many small regions, which can extract information within each region and between regions through mixing. This structure is based on a Multilayer Perceptron, and can be easily adjusted in depth to control the network size. MLP-Mixer [19] consists of multiple stacked units, each containing two parts. The first part is the token-mixing MLP, which acts on each small block and extracts information between blocks, i.e., R^S → R^S. The second part is the channel-mixing MLP, which acts on each channel and extracts information between channels, i.e., R^C → R^C. In addition, unlike regular convolutional operations that are performed along the channel direction, we apply two convolutional operations on the lowest level feature map output by the FPN [10]. One is along the direction of the lane line (vertical direction of the image), and the other is along the direction between the lane lines (horizontal direction of the image). This enables the exchange of information between different lane lines.
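A compact PyTorch sketch of the two ingredients described above — one Mixer unit (token-mixing then channel-mixing MLPs) and a pair of 1-D convolutions run along and across the lane direction. The hidden sizes, kernel length, and patch layout are placeholder choices, not the paper's configuration.

```python
import torch
import torch.nn as nn

class MixerBlock(nn.Module):
    """One MLP-Mixer unit: token-mixing across patches, then channel-mixing per patch."""
    def __init__(self, num_tokens, channels, hidden=256):
        super().__init__()
        self.norm1 = nn.LayerNorm(channels)
        self.token_mlp = nn.Sequential(nn.Linear(num_tokens, hidden), nn.GELU(),
                                       nn.Linear(hidden, num_tokens))
        self.norm2 = nn.LayerNorm(channels)
        self.channel_mlp = nn.Sequential(nn.Linear(channels, hidden), nn.GELU(),
                                         nn.Linear(hidden, channels))

    def forward(self, x):                       # x: (B, num_tokens, channels)
        x = x + self.token_mlp(self.norm1(x).transpose(1, 2)).transpose(1, 2)
        x = x + self.channel_mlp(self.norm2(x))
        return x

class DirectionalConv(nn.Module):
    """Message passing along and across lanes with tall and wide 1-D convolutions."""
    def __init__(self, channels, k=9):
        super().__init__()
        self.vertical = nn.Conv2d(channels, channels, (k, 1), padding=(k // 2, 0))
        self.horizontal = nn.Conv2d(channels, channels, (1, k), padding=(0, k // 2))

    def forward(self, f):                       # f: (B, C, H, W)
        return self.horizontal(self.vertical(f))

# toy usage
tokens = torch.randn(2, 10 * 25, 64)            # a feature map split into 10 x 25 patches
mixed = MixerBlock(num_tokens=250, channels=64)(tokens)
conv_out = DirectionalConv(64)(torch.randn(2, 64, 40, 100))
```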

3.3 Local Information Extraction Module

The local features of the lane markings are crucial for precise lane positioning. Following the idea of the line-anchor-based method, we use models like Faster R-CNN [5] to obtain ROIs from the feature map and extract local features corresponding to each lane anchor. Similar to the implementation method of CLRNet [28], we first uniformly sample the lane anchor on the feature map and then perform bilinear interpolation, i.e., χ_p ∈ R^{C × N_p}. The difference is that our lane anchor mixes straight and curved lines. In addition, we use the Cross Layer Refinement method, starting from the high-level features to the last low-level feature, and each time superimpose the offset of the predicted lane x-coordinates to make the predicted lane more accurate. The formula is expressed as:

λ = λ_0 + λ_{l1} + λ_{l2} + λ_{l3} + λ_{l4}    (3)

λ includes the starting coordinate of the lane marking, the horizontal angle θ, and the values of N sampling points. λ_0 represents the initial value in the lane anchor, and λ_{li} represents the offset predicted in the i-th layer.
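A minimal sketch of the cumulative refinement in Eq. (3), with a hypothetical linear head per pyramid level; the parameter layout (N x-offsets plus start point and angle) and the feature shapes are assumptions made for illustration.

```python
import torch
import torch.nn as nn

class RefinementStep(nn.Module):
    """One refinement level: take features pooled along the current anchor and predict an offset."""
    def __init__(self, channels, num_points=72, num_samples=36):
        super().__init__()
        # output: num_points x-offsets plus (start_x, start_y, theta)
        self.head = nn.Linear(channels * num_samples, num_points + 3)

    def forward(self, roi_feat, params):
        # roi_feat: (B, C * num_samples) features bilinearly sampled along the anchor
        # params:   (B, num_points + 3) current anchor parameters (lambda in Eq. 3)
        return params + self.head(roi_feat)          # lambda <- lambda + lambda_li

# toy usage: refine across 4 pyramid levels, high-level features first
params = torch.randn(8, 75)                           # initial anchor parameters lambda_0
steps = nn.ModuleList([RefinementStep(64) for _ in range(4)])
for step in steps:
    roi_feat = torch.randn(8, 64 * 36)                # stand-in for per-level ROI features
    params = step(roi_feat, params)
```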


3.4 Loss Function

As shown in Fig. 3, we use a series of discrete points to represent the lane marking, and the difference between the lane anchor and the ground truth value can be regressed using commonly used distance losses such as smooth-L1. As stated in [26], this overly simplified assumption may lead to inaccurate positioning. We designed a loss function LDIoU that considers the overall lane marking in different scales to address this issue. We extend the concept of DIoU [29] to lane lines. A lane line is represented as a set of N points, M discrete points forming a subset of N, and their enclosing rectangular box as the BBox. In this way, a lane line can be subdivided into many small segments at different scales. When we perform inference on feature maps of different scales, we use different values of M. A larger M is used on higher-level feature maps to recall and roughly locate the lane lines as much as possible. A smaller M is used on lower-level feature maps to more finely regress the lane lines.
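The sketch below is one reading of LDIoU (not the official implementation): split the N lane points into consecutive groups of M, wrap each group in an axis-aligned box, and average the DIoU between corresponding predicted and ground-truth boxes.

```python
import torch

def boxes_from_points(pts, M):
    """pts: (N, 2) lane points; group every M consecutive points into an axis-aligned box."""
    N = pts.shape[0]
    groups = pts[: N - N % M].reshape(-1, M, 2)                  # (N // M, M, 2)
    mins, maxs = groups.min(dim=1).values, groups.max(dim=1).values
    return torch.cat([mins, maxs], dim=1)                        # (N // M, 4) as x1, y1, x2, y2

def diou(a, b, eps=1e-9):
    """Row-wise DIoU between two equally sized sets of boxes."""
    inter_wh = (torch.min(a[:, 2:], b[:, 2:]) - torch.max(a[:, :2], b[:, :2])).clamp(min=0)
    inter = inter_wh[:, 0] * inter_wh[:, 1]
    area_a = (a[:, 2] - a[:, 0]) * (a[:, 3] - a[:, 1])
    area_b = (b[:, 2] - b[:, 0]) * (b[:, 3] - b[:, 1])
    iou = inter / (area_a + area_b - inter + eps)
    center_a, center_b = (a[:, :2] + a[:, 2:]) / 2, (b[:, :2] + b[:, 2:]) / 2
    enc_wh = torch.max(a[:, 2:], b[:, 2:]) - torch.min(a[:, :2], b[:, :2])
    return iou - (center_a - center_b).pow(2).sum(1) / (enc_wh.pow(2).sum(1) + eps)

def ldiou_loss(pred_pts, gt_pts, M=8):
    """Segment-level DIoU loss between a predicted lane and a ground-truth lane."""
    return (1.0 - diou(boxes_from_points(pred_pts, M), boxes_from_points(gt_pts, M))).mean()

# toy usage with 72-point lanes
loss = ldiou_loss(torch.rand(72, 2) * 100, torch.rand(72, 2) * 100)
```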

3.5 Training and Inference Details

Similar to [28], we adopt the idea of [4] to assign positive samples. During training, each ground truth lane line is dynamically assigned one or more predicted lanes as positive samples. The loss function includes three parts: the classification loss for predicting whether the lane line is foreground or background, the regression loss for the starting coordinates, length, and angle to the horizontal direction of the positive sample lane lines, and the regression loss for the horizontal offset of the sampled points on the lane lines. For the classification loss, we use Focal Loss [11]. For the first regression loss, we use smooth-L1 [5]. For the second regression loss, LDIoU is introduced and combined with DIoU [28]. The overall loss function can be represented as:

L = w_cls L_cls + w_stl L_stl + w_LDIoU L_LDIoU + w_LIoU L_LIoU    (4)

In addition, in the middle of the network, a segmentation loss is introduced for each layer of feature maps output by FPN [10] to guide the network to learn lane-related features. Like other line-anchor-based methods, NMS was used in inference to eliminate lane lines with very similar predicted results. We used ResNet [6] and DLA [25] as our pre-trained backbone networks, with FPN [10] as the neck and outputting four feature maps. We represented each lane line with 72 points, and used NP = 36 sampling points for extracting local information. The data augmentation method is similar to [28], and in addition, we also use methods such as adding Gaussian noise and random cropping. During training, we used the AdamW optimizer with an initial learning rate of 6× 10−4 . The loss function weights were set to 2.0, 0.2, 0.5, 1.0. We trained the model for 200 epochs on TuSimple [20] and 25 epochs on CULane [14]. Our experiments were conducted using PyTorch on four NVIDIA TITAN Xp devices. These are the basic experimental parameters, which may vary for different groups.
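A minimal sketch of how the weighted objective in Eq. (4) and the quoted optimizer settings could be wired together; `model` is a placeholder module, and mapping the quoted weights (2.0, 0.2, 0.5, 1.0) onto the four terms in the order of Eq. (4) is an assumption.

```python
import torch

def total_loss(l_cls, l_stl, l_ldiou, l_liou,
               w_cls=2.0, w_stl=0.2, w_ldiou=0.5, w_liou=1.0):
    """Weighted sum of the four terms in Eq. (4)."""
    return w_cls * l_cls + w_stl * l_stl + w_ldiou * l_ldiou + w_liou * l_liou

# AdamW with the stated initial learning rate of 6e-4; `model` stands in for the detector
model = torch.nn.Linear(10, 10)
optimizer = torch.optim.AdamW(model.parameters(), lr=6e-4)

# toy usage with scalar stand-in loss terms
loss = total_loss(torch.tensor(0.7), torch.tensor(0.3),
                  torch.tensor(0.4), torch.tensor(0.5))
```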

4 Experiment

4.1 Datasets and Evaluation Metric

We conducted experiments on the widely used benchmark datasets CULane [14] and TuSimple [20]. CULane [14] is a challenging dataset that contains various scenarios such as normal, crowded, dazzling light, shadows, occlusions, arrows, curves, and night scenes. It consists of 88,880 training images and 34,680 testing images. The images have a resolution of 590×1640. TuSimple [20] is a dataset of highway scenes that includes 3,268 images for training, 358 images for validation, and 2,782 images for testing. The images have a resolution of 720×1280. For CULane [14], we use the F1 score as the evaluation criterion. If the IoU between the predicted and true values is greater than 0.5, these predicted values are considered correct predictions. The formula for calculating the F1 score is as follows:

F1 = (2 × Precision × Recall) / (Precision + Recall)    (5)

where Precision = TP / (TP + FP) and Recall = TP / (TP + FN). TP refers to the number of correct predictions, FP refers to the number of false positive predictions, and FN refers to the number of false negative predictions. For TuSimple [20], the official evaluation criterion is accuracy, which is calculated as follows:

accuracy = Σ_clip C_clip / Σ_clip S_clip    (6)

where C_clip refers to the number of points that were correctly predicted, and S_clip refers to the total number of ground truth points. We also calculate the F1 score on TuSimple [20].
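For clarity, a tiny sketch of Eqs. (5) and (6); the toy numbers are made up.

```python
def f1_score(tp, fp, fn):
    """Eq. 5: harmonic mean of precision and recall over matched lanes (IoU > 0.5)."""
    precision = tp / max(tp + fp, 1)
    recall = tp / max(tp + fn, 1)
    return 2 * precision * recall / max(precision + recall, 1e-9)

def tusimple_accuracy(correct_per_clip, gt_per_clip):
    """Eq. 6: total correctly predicted points over total ground-truth points."""
    return sum(correct_per_clip) / max(sum(gt_per_clip), 1)

# toy usage
print(f1_score(tp=900, fp=60, fn=80))          # approximately 0.928
print(tusimple_accuracy([45, 50], [48, 52]))   # 0.95
```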

4.2 Results

Table 1 displays the performance of our method and other state-of-the-art methods. The performance of different methods on the TuSimple [20] dataset is very similar, with the best performance approaching saturation. Once again, we have set a new F1 score record on this dataset and achieved the state-of-the-art performance. Our model did not adopt attention mechanisms that require high computational burden and high computational cost to obtain useful information, as many other models did. Instead, we used relatively simple MLP structures to achieve high precision and low false positive and false negative rates with a lower computational cost. The performance on the CULane [14] dataset is shown in Table 2. Our method achieved the best F1 score on the smaller backbone network ResNet-18, indicating that our method can achieve excellent performance with a simple network structure and small computational cost. Moreover, in the “Cross” category, the false positive rate of our method is significantly lower than that of other

non-polynomial curve methods, and even lower than that of polynomial curve-based methods on the smaller ResNet-18 backbone network. This indicates that our method has strong recognition capabilities for real lane lines and good robustness against other confusions.

Table 1. Performance of different methods on TuSimple [20]

Method            ResNet-18 (F1% / Acc% / FP% / FN%)   ResNet-34 (F1% / Acc% / FP% / FN%)   ResNet-101 (F1% / Acc% / FP% / FN%)
LaneATT [17]      96.71 / 95.57 / 3.56 / 3.01          96.77 / 95.63 / 3.53 / 2.92          -
CLRNet [28]       97.89 / 96.84 / 2.28 / 1.92          97.82 / 96.87 / 2.27 / 2.08          97.62 / 96.83 / 2.37 / 2.38
UFLDv2 [16]       96.16 / 95.65 / 3.06 / 4.61          96.22 / 95.56 / 3.18 / 4.37          -
CondLaneNet [12]  97.01 / 95.84 / 2.18 / 3.8           96.98 / 95.37 / 2.2 / 3.82           97.24 / 96.54 / 2.01 / 3.5
GANet [22]        97.71 / 95.95 / 1.97 / 2.62          97.68 / 95.87 / 1.99 / 2.64          97.45 / 96.44 / 2.63 / 2.47
PolyLaneNet [18]  90.62 / 93.36 / 9.42 / 9.33          -                                    -
RESA [27]         -                                    96.93 / 96.82 / 3.63 / 2.84          -
BSNet [2]         97.79 / 96.63 / 2.03 / 2.39          97.79 / 96.78 / 2.12 / 2.32          97.84 / 96.73 / 1.98 / 2.35
CCLane            97.88 / 96.51 / 1.65 / 2.6           97.72 / 96.54 / 2.31 / 2.23          97.91 / 96.81 / 2.09 / 2.06

Table 2. Performance of different methods on CULane [14] (F1% per category; "Cross" reports the number of false positives).

Backbone   | Method            | Total | Normal | Crowd | Dazzle | Shadow | No line | Arrow | Curve | Cross | Night
ResNet-18  | LaneATT [17]      | 75.13 | 91.17  | 72.71 | 65.82  | 68.03  | 49.13   | 87.82 | 63.75 | 1020  | 68.58
ResNet-18  | CLRNet [28]       | 79.58 | 93.3   | 78.33 | 73.71  | 79.66  | 53.14   | 90.25 | 71.56 | 1321  | 75.11
ResNet-18  | UFLDv2 [16]       | 74.7  | 91.7   | 73    | 64.6   | 74.7   | 47.2    | 87.6  | 68.7  | 1998  | 70.2
ResNet-18  | CondLaneNet [12]  | 78.14 | 92.87  | 75.79 | 70.72  | 80.01  | 52.39   | 89.37 | 72.4  | 1364  | 73.23
ResNet-18  | GANet [22]        | 78.79 | 93.24  | 77.16 | 71.24  | 77.88  | 53.59   | 89.62 | 75.92 | 1240  | 72.75
ResNet-18  | BézierLaneNet [3] | 73.67 | 90.22  | 71.55 | 62.49  | 70.91  | 45.3    | 84.09 | 58.98 | 996   | 68.7
ResNet-18  | BSNet [2]         | 79.64 | 93.46  | 77.93 | 74.25  | 81.95  | 54.24   | 90.05 | 73.62 | 1400  | 75.11
ResNet-18  | CCLane            | 79.82 | 93.41  | 78.64 | 73.55  | 81.77  | 52.7    | 89.57 | 71.21 | 907   | 74.69
ResNet-34  | LaneATT [17]      | 76.68 | 92.14  | 75.03 | 66.47  | 78.15  | 49.39   | 88.38 | 67.72 | 1330  | 70.72
ResNet-34  | CLRNet [28]       | 79.73 | 93.49  | 78.06 | 74.57  | 79.92  | 54.01   | 90.59 | 72.77 | 1216  | 75.02
ResNet-34  | RESA [27]         | 74.5  | 91.9   | 72.4  | 66.5   | 72     | 46.3    | 88.1  | 68.6  | 1896  | 69.8
ResNet-34  | UFLDv2 [16]       | 75.9  | 92.5   | 74.9  | 65.7   | 75.3   | 49      | 88.5  | 70.2  | 1864  | 70.6
ResNet-34  | CondLaneNet [12]  | 78.74 | 93.38  | 77.14 | 71.17  | 79.93  | 51.85   | 89.89 | 73.88 | 1387  | 73.92
ResNet-34  | GANet [22]        | 79.39 | 93.73  | 77.92 | 71.64  | 79.49  | 52.63   | 90.37 | 76.32 | 1368  | 73.67
ResNet-34  | BézierLaneNet [3] | 75.57 | 91.59  | 73.2  | 69.2   | 76.74  | 48.05   | 87.16 | 62.45 | 888   | 69.9
ResNet-34  | BSNet [2]         | 79.89 | 93.75  | 78.01 | 76.65  | 79.55  | 54.69   | 90.72 | 73.99 | 1445  | 75.28
ResNet-34  | CCLane            | 79.6  | 93.33  | 77.85 | 75.1   | 81.74  | 54.23   | 90.05 | 70.91 | 1235  | 75.34
ResNet-101 | CLRNet [28]       | 80.13 | 93.85  | 78.78 | 72.49  | 82.33  | 54.5    | 89.79 | 75.57 | 1262  | 75.51
ResNet-101 | CondLaneNet [12]  | 79.48 | 93.47  | 77.44 | 70.93  | 80.91  | 54.13   | 90.16 | 75.21 | 1201  | 74.8
ResNet-101 | GANet [22]        | 79.63 | 93.67  | 78.66 | 71.82  | 78.32  | 53.38   | 89.86 | 77.37 | 1352  | 73.85
ResNet-101 | BSNet [2]         | 80    | 93.75  | 78.44 | 74.07  | 81.51  | 54.83   | 90.48 | 74.01 | 1255  | 75.12
ResNet-101 | CCLane            | 79.9  | 93.71  | 78.26 | 72.98  | 80.24  | 54.43   | 90.69 | 74.1  | 1088  | 74.95
DLA-34     | CLRNet [28]       | 80.47 | 93.73  | 79.59 | 75.3   | 82.51  | 54.58   | 90.62 | 74.13 | 1155  | 75.37
DLA-34     | LaneAF [1]        | 77.41 | 91.8   | 75.61 | 71.78  | 79.12  | 51.38   | 86.88 | 72.7  | 1360  | 73.03
DLA-34     | BSNet [2]         | 80.28 | 93.87  | 78.92 | 75.02  | 82.52  | 54.84   | 90.73 | 74.71 | 1485  | 75.59
DLA-34     | CCLane            | 80.22 | 93.72  | 78.99 | 73.87  | 81.32  | 53.4    | 89.99 | 70.18 | 970   | 75.41

non-polynomial curve methods, and even lower than that of polynomial curve-based methods with the smaller ResNet-18 backbone. This indicates that our method has strong recognition capability for real lane lines and good robustness against other distractions.

5 Ablation Study

We conducted extensive experiments on the CULane [14] dataset to validate our method. All ablation studies use the ResNet-18 backbone version trained for 1 epoch. The experiments cover three aspects: lane representation, global information extraction, and the loss function.

Table 3. Ablation experiment on CULane [14] (F1% per category; "Cross" reports the number of false positives).

Method    | Total | Normal | Crowd | Dazzle | Shadow | No line | Arrow | Curve | Cross | Night
L         | 56.78 | 73.44  | 51.79 | 49.24  | 49.84  | 35.43   | 62.99 | 49.62 | 2402  | 53.17
L&Re      | 70.45 | 84.14  | 69.17 | 60.33  | 74.15  | 47.05   | 79.46 | 60.31 | 2100  | 66.29
L&Re&G    | 73.56 | 88.26  | 72.26 | 62.41  | 74.31  | 48.99   | 82.49 | 64.53 | 1761  | 67.81
C         | 57.48 | 71.84  | 53.92 | 42.08  | 54.25  | 35.46   | 61.85 | 45.89 | 2147  | 55.79
C&Re      | 72.33 | 87.6   | 70.65 | 60.48  | 72.04  | 46.9    | 81.7  | 57.29 | 1744  | 66.76
C&Re&G    | 74.12 | 88.90  | 71.72 | 65.89  | 73.43  | 49.24   | 82.49 | 62.32 | 1364  | 69.26
C&Re&G&D  | 72.79 | 87.57  | 70.92 | 64.34  | 73.65  | 48.92   | 81.19 | 61.69 | 2052  | 68.41

In Table 3, L represents lane anchor for straight lines, C represents lane anchor for curves, Re represents using the Cross Layer Refinement method, G represents including global information extraction, and D represents using the LDIoU loss function. From the experimental results, using the Cross Layer Refinement method has the greatest impact on the model’s performance, with the Total F1 score increasing by more than 10%, demonstrating the effectiveness of this method for lane detection. Using the lane anchor for curves as prior information for lane detection can increase the model’s Total F1 score by about 1%, indicating that pre-set anchors closer to the real lane lines can make the model converge more easily and quickly. Adding global information based on local information can increase the model’s Total F1 score by about 2%, improving the model’s performance in various scenarios. The effectiveness of LDIoU is limited, and further research is needed for its use at different granularity levels.

6 Conclusion

In this work, we have proposed a lane detection model based on curve lane anchors, which are closer to real lane lines and serve as prior information for lane detection. We extract global information on the feature map using simple and efficient MLP and convolutional layers that propagate spatial information. Additionally, we propose an optional LDIoU loss function that extends the DIoU [29] loss to lane detection. Our method offers high efficiency and good performance: on the TuSimple [20] dataset it achieves the highest F1 score among the compared advanced models, and on the CULane [14] dataset it also shows competitive performance.


References 1. Abualsaud, H., Liu, S., Lu, D.B., Situ, K., Rangesh, A., Trivedi, M.M.: Laneaf: robust multi-lane detection with affinity fields. IEEE Rob. Autom. Lett. 6(4), 7477–7484 (2021) 2. Chen, H., Wang, M., Liu, Y.: BSNET: lane detection via draw b-spline curves nearby. arXiv preprint arXiv:2301.06910 (2023) 3. Feng, Z., Guo, S., Tan, X., Xu, K., Wang, M., Ma, L.: Rethinking efficient lane detection via curve modeling. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 17062–17070 (2022) 4. Ge, Z., Liu, S., Wang, F., Li, Z., Sun, J.: Yolox: exceeding yolo series in 2021. arXiv preprint arXiv:2107.08430 (2021) 5. Girshick, R.: Fast r-cnn. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 1440–1448 (2015) 6. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016) 7. Hou, Y., Ma, Z., Liu, C., Loy, C.C.: Learning lightweight lane detection cnns by self attention distillation. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 1013–1021 (2019) 8. Ko, Y., Lee, Y., Azam, S., Munir, F., Jeon, M., Pedrycz, W.: Key points estimation and point instance segmentation approach for lane detection. IEEE Trans. Intell. Transp. Syst. 23(7), 8949–8958 (2021) 9. Li, X., Li, J., Hu, X., Yang, J.: Line-CNN: end-to-end traffic line detection with line proposal unit. IEEE Trans. Intell. Transp. Syst. 21(1), 248–258 (2019) 10. Lin, T.Y., Doll´ ar, P., Girshick, R., He, K., Hariharan, B., Belongie, S.: Feature pyramid networks for object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2117–2125 (2017) 11. Lin, T.Y., Goyal, P., Girshick, R., He, K., Doll´ ar, P.: Focal loss for dense object detection. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 2980–2988 (2017) 12. Liu, L., Chen, X., Zhu, S., Tan, P.: Condlanenet: a top-to-down lane detection framework based on conditional convolution. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 3773–3782 (2021) 13. Liu, R., Yuan, Z., Liu, T., Xiong, Z.: End-to-end lane shape prediction with transformers. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 3694–3702 (2021) 14. Pan, X., Shi, J., Luo, P., Wang, X., Tang, X.: Spatial as deep: spatial CNN for traffic scene understanding. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 32 (2018) 15. Qin, Z., Wang, H., Li, X.: Ultra fast structure-aware deep lane detection. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12369, pp. 276–291. Springer, Cham (2020). https://doi.org/10.1007/978-3-03058586-0 17 16. Qin, Z., Zhang, P., Li, X.: Ultra fast deep lane detection with hybrid anchor driven ordinal classification. IEEE Trans. Pattern Anal. Mach. Intell. (2022) 17. Tabelini, L., Berriel, R., Paixao, T.M., Badue, C., De Souza, A.F., Oliveira-Santos, T.: Keep your eyes on the lane: Real-time attention-guided lane detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 294–302 (2021)


18. Tabelini, L., Berriel, R., Paixao, T.M., Badue, C., De Souza, A.F., Oliveira-Santos, T.: Polylanenet: lane estimation via deep polynomial regression. In: 2020 25th International Conference on Pattern Recognition (ICPR), pp. 6150–6156. IEEE (2021) 19. Tolstikhin, I.O., et al.: MLP-mixer: an all-MLP architecture for vision. Adv. Neural. Inf. Process. Syst. 34, 24261–24272 (2021) 20. TuSimple: Tusimple benchmark. https://github.com/TuSimple/tusimplebenchmark/. Accessed 1 May 2023 21. Van Gansbeke, W., De Brabandere, B., Neven, D., Proesmans, M., Van Gool, L.: End-to-end lane detection through differentiable least-squares fitting. In: Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops (2019) 22. Wang, J., et al.: A keypoint-based global association network for lane detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 1392–1401 (2022) 23. Xu, H., Wang, S., Cai, X., Zhang, W., Liang, X., Li, Z.: CurveLane-NAS: unifying lane-sensitive architecture search and adaptive point blending. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12360, pp. 689–704. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58555-6 41 24. Yoo, S., et al.: End-to-end lane marker detection via row-wise classification. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, pp. 1006–1007 (2020) 25. Yu, F., Wang, D., Shelhamer, E., Darrell, T.: Deep layer aggregation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2403–2412 (2018) 26. Yu, J., Jiang, Y., Wang, Z., Cao, Z., Huang, T.: Unitbox: an advanced object detection network. In: Proceedings of the 24th ACM International Conference on Multimedia, pp. 516–520 (2016) 27. Zheng, T., et al.: RESA: recurrent feature-shift aggregator for lane detection. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 35, pp. 3547– 3554 (2021) 28. Zheng, T., et al.: CLRNET: cross layer refinement network for lane detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 898–907 (2022) 29. Zheng, Z., Wang, P., Liu, W., Li, J., Ye, R., Ren, D.: Distance-iou loss: faster and better learning for bounding box regression. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 34, pp. 12993–13000 (2020)

Uncertainty-Aware Boundary Attention Network for Real-Time Semantic Segmentation

Yuanbing Zhu1,2, Bingke Zhu1,4(B), Yingying Chen1,4, and Jinqiao Wang1,2,3,4

1 Foundation Model Research Center, Institute of Automation, Chinese Academy of Sciences, Beijing, China [email protected], {yingying.chen,jqwang}@nlpr.ia.ac.cn
2 School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, China
3 The Peng Cheng Laboratory, Shenzhen, China
4 Wuhan AI Research, Wuhan, China [email protected]

Abstract. Remarkable progress has been made in real-time semantic segmentation by leveraging lightweight backbone networks and auxiliary low-level training tasks. Although several techniques have been proposed to mitigate the accuracy degradation caused by model reduction, challenging regions still exhibit substantial uncertainty in segmentation results. To tackle this issue, we propose an effective structure named the Uncertainty-aware Boundary Attention Network (UBANet). Specifically, we model segmentation uncertainty via prediction variance during training and incorporate it as a regularization term in the optimization objective to improve segmentation performance. Moreover, we employ uncertainty maps to investigate the role of low-level supervision in the segmentation task, and we reveal that directly fusing high- and low-level features causes large-scale low-level features to be overshadowed by the encompassing local contexts, hindering the synergy between the segmentation task and low-level tasks. To address this issue, we design a Low-level Guided Feature Fusion Module that avoids the direct fusion of high- and low-level features and instead employs low-level features as guidance for the fusion of multi-scale contexts. Extensive experiments demonstrate the efficiency and effectiveness of our method, which achieves a state-of-the-art latency-accuracy trade-off on the Cityscapes and CamVid benchmarks.

Keywords: Uncertainty Estimation · Real-time Semantic Segmentation

1 Introduction

With the development of deep neural networks, semantic segmentation has achieved significant improvement over traditional methods [3,5,21]. This work was supported by National Key R&D Program of China under Grant No.2021ZD0114600, National Natural Science Foundation of China (62006230, 62206283, 62076235). c The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024  Q. Liu et al. (Eds.): PRCV 2023, LNCS 14427, pp. 388–400, 2024. https://doi.org/10.1007/978-981-99-8435-0_31


Fig. 1. Visualization of uncertainty maps. Uncertainty values are higher in brighter locations. (e) and (f) are produced by Monte-Carlo dropout [10]. (g) is the uncertainty estimation result, and (h) is produced by the network trained with Uncertainty Regularization (UR).

Nevertheless, the computationally intensive operations restrict their practicality for scenarios with real-time constraints. Previous real-time methods [4,8,14,19,22, 26,27,29] mainly focus on reducing computation cost of deep networks by designing lightweight backbones tailored for segmentation tasks [14,29] and maintaining localization ability of down-sampled image features [19,22,27,29] by employing dual-branch architecture, aiming to find better trade-off between accuracy and latency. And significant improvement has been achieved by these great works so far. More recently, auxiliary low-level supervision [8,15,23] has been developed to enhance the encoding capability of spatial details of segmentation models. Comparing with the dual-branch method, the main distinction of these lowlevel supervisions is to introduce structure information without incurring extra computational cost during inference. These inference-free detailed signals are particularly suitable for real-time task. Thus we consider the collaboration of low-level features and context features should be more investigated in real-time semantic segmentation task. Uncertainty estimation is commonly used as a reliability assessment of a model prediction [10,13], and is typically employed to rectify pseudo-label within self-learning frameworks [31] under semi-supervised setting. In this paper, we thoroughly analyze the incorporation of low-level task and segmentation task from the perspective of uncertainty estimation, and yields two key observations. First, the segmentation results still exhibit high uncertainty on the boundaries of the objects (as in Fig. 1(a)(f)), which is not reduced by introducing low-level tasks. Second, the challenging area remains high uncertainty values, even within object contents (as in Fig. 1(b)(f)). The first observation indicates a defect in directly fusing high-resolution details with low-frequency contexts, as it can lead to the detailed features being easily overshadowed by the encompassing local contexts. To address this issue, we introduce Low-level guided Attention Feature Fusion Module(LAFFM) as shown in Fig. 3, which aims to avoid the direct attendance of low-level supervised feature but employ it as prior guidance for the aggregation of contexts.


Fig. 2. Overall architecture of the proposed UBANet. The branches in the dashed pink box are only hold during the training, and the LLHead and SegHead denote the low-level head and segmentation head built with FCN [21]. The orange line denotes the uncertainty estimation flow.

The second observation indicates that segmentation performance can be improved by reducing predictive uncertainty. In light of this, we propose to model the predictive uncertainty during training via the variance of semantic prediction maps as in Fig. 1(g), and involve it as a regularization term into the optimization objective. Furthermore, we incorporate uncertainty regularization and LAFFM into an encoder-decoder architecture and propose Uncertainty-aware Boundary Attention Network(UBANet). By taking extensive experiments, we find that UBANet can take advantage of low-level features more effectively and achieve better latency-accuracy trade-off than previous works. To summarize, our contributions are as follows: – A novel, low-level guided, uncertainty-based regularization method is proposed to assist the semantic segmentation task. – An effective attention-based feature fusion module is proposed to investigate a novel view of feature incorporation. – With the components above, we compile Uncertainty-aware Boundary Attention Network (UBANet), a network architecture for real-time semantic segmentation. – Extensive experiments demonstrate that UBANet achieves the state-of-thearts latency-accuracy trade-off on real-time semantic segmentation benchmarks, including Cityscapes [7] and CamVid [1].

2 Related Work

Generic Semantic Segmentation. Modern mainstream semantic segmentation networks are predominantly based on the encoder-decoder architecture [3,9,21,28,30]. Since FCN [21] replace the linear layer by a convolutional


layer in classification networks, semantic segmentation network can be trained in an end-to-end way. PSPNet [30] proposed the pyramid pooling module which captures and aggregates context clues by global pooling at varying scales. DeepLab series [2,3] employ dilate convolutions with different rates to increase receptive field. Most recently, Transformer-based networks (such as [5,12]) have achieved superior performance on various semantic segmentation benchmark. Real-time Semantic Segmentation. To fulfill the real-time demand, many high-efficiency networks were proposed [4,8,14,26,27,29]. ICNet [29] proposes a cascade feature fusion unit for secondary refinement of coarse prediction. DFANet [14] proposes a network to aggregate spatial and semantic information on sub-net and sub-stage. BiseNet [26,27] and DDRNet [19] employs a twopathway network to encode semantic and spatial information separately. STDCSeg [8] incorporate edge detection tasks with a specially designed backbone for segmentation. SFNet [16] proposed a feature alignment module to predict the offset between multi-scale feature maps. Also, [4,17] adopt neural architecture search algorithm to design efficient networks for real-time segmentation. Most recently, PIDNet [25] apply the concept of a PID controller to the segmentation task and introduces a three-branch network. Uncertainty Estimation in Semantic Segmentation. In semantic segmentation, the description of epistemic uncertainty usually reliy on samplingbased processes such as Monte Carlo Dropout [10,13]. However, models typically require multiple forward propagations to obtain uncertainty information. To address this issue, several methods employ feature similarity [18], feature variance [31] and category-level entropy [24] to model uncertainty for pseudo label refinement in self-training framework under semi-supervised setting. Most

Fig. 3. Illustration of Low-level guided Attention Feature Fusion Module. “Conv,k” denotes the convolution layer with kernel size k.


recently, [11] combines pretrained deep features with moment propagation to estimate uncertainty.

3 Method

We employ Monte-Carlo dropout [10] to visualize uncertainty maps for both the segmentation task and the edge detection task in [8], as shown in Fig. 1. The visualization yields two observations: (1) segmentation uncertainty is primarily located along object boundaries, which is expected to be reduced by edge supervision, as the edge uncertainty is complementary to the segmentation uncertainty in Fig. 1(e); (2) challenging regions exhibit higher uncertainty values, such as the large white truck in Fig. 1(b)(f). Based on observation (1), we assume that the inconsistent uncertainty behavior results from an ineffective feature fusion module; accordingly, we introduce the Low-level Guided Attention Feature Fusion Module in Sect. 3.1. In response to observation (2), we propose to model the uncertainty during training by utilizing the variance of auxiliary feature maps, as described in Sect. 3.2.

3.1 Low-Level Guided Attention Feature Fusion Module

Low-level supervision is regarded as an effective oracle for preserving spatial information [8,15]. However, as observed in Sect. 3, directly fusing high-resolution details with low-frequency contexts may cause the detailed features to be overshadowed by the encompassing local contexts. Therefore, we propose the Low-level Guided Attention Feature Fusion Module (LAFFM), which avoids the direct attendance of low-level supervised features and instead utilizes them as guidance for feature fusion, as shown in Fig. 3. Formally, the feature maps from the last few stages of the encoder are denoted as {F_i}_{i=1}^{N}. We use two convolutional layers to capture a detail feature map D_i from F_i, which is fed to the subsequent low-level task, where a weighted binary cross-entropy loss L_bce is used for edge detection. A Dice loss L_dice is employed in conjunction to alleviate the class-imbalance issue. The overall low-level loss is given by

L_{low-level} = \mu L_{dice} + L_{bce},    (1)

where μ denotes the coefficient. To guide the feature fusion, D_i is simultaneously fed into a convolution-global-pooling layer to produce the attention weight w. To handle multi-scale feature maps, we first upsample F_{i+1} to the same size as F_i, compress the channels of F_i and F_{i+1} with a single convolutional layer each to obtain F'_i and F'_{i+1}, and then combine them with the attention weights w_i and 1 − w_i as the final fusion result R_i, the output of LAFFM. Formally, with channel index c,

w_i = \max_c D_{i,c},
R_i = (1 − w_i) \cdot F'_i + w_i \cdot F'_{i+1}.
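A minimal PyTorch sketch of this guided fusion is given below. Module and tensor names are ours, normalization layers are omitted, and the sigmoid on the gate is an assumption added to keep the weight in [0, 1] (the text only states w_i = max_c D_{i,c}):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LowLevelGuidedFusion(nn.Module):
    """Sketch of LAFFM-style fusion: a detail map D_i derived from F_i
    gates the combination of F'_i and the upsampled deeper feature F'_{i+1}."""
    def __init__(self, c_i, c_i1, c_out, c_detail=1):
        super().__init__()
        self.detail_head = nn.Sequential(            # produces D_i (also fed to the edge loss)
            nn.Conv2d(c_i, c_out, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(c_out, c_detail, 1))
        self.proj_i = nn.Conv2d(c_i, c_out, 1)       # compress F_i     -> F'_i
        self.proj_i1 = nn.Conv2d(c_i1, c_out, 1)     # compress F_{i+1} -> F'_{i+1}

    def forward(self, f_i, f_i1):
        d_i = self.detail_head(f_i)
        # attention weight w_i = max over channels of D_i (sigmoid is our addition)
        w = torch.amax(d_i, dim=1, keepdim=True).sigmoid()
        f_i1_up = F.interpolate(f_i1, size=f_i.shape[-2:],
                                mode='bilinear', align_corners=False)
        r_i = (1 - w) * self.proj_i(f_i) + w * self.proj_i1(f_i1_up)
        return r_i, d_i   # fused feature and detail map for the low-level loss
```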


Fig. 4. Feature visualization of the LAFFM module. It is clear that the network can encode more spatial information with detail guidance as shown in (b)(c)(basic structure of lane line, object boundaries). And with proposed LAFFM(d), the network can produce more fine-granted features.

3.2 Uncertainty Modeling

Inspired by [10,13,31], we aim to estimate uncertainty during training by calculating the variance of segmentation predictions. Intuitively, in a Bayesian convolutional neural network with parameters θ, the predictive uncertainty for a pixel Y can be approximated through T stochastic inferences [13]:

Var(Y) \approx E[\hat{Y}^2] − (E[\hat{Y}])^2 + E[\hat{\sigma}^2]    (2)
       = \frac{1}{T}\sum_{t=1}^{T}\hat{y}_t^2 − \Big(\frac{1}{T}\sum_{t=1}^{T}\hat{y}_t\Big)^2 + \frac{1}{T}\sum_{t=1}^{T}\hat{\sigma}_t^2,    (3)

where \hat{y} represents the segmentation prediction and σ is predicted by an additional branch with the learning objective [13]:

L_{BNN}(\theta) = \frac{1}{D}\sum_i e^{-s_i}\,\|y_i − \hat{y}_i\|^2 + s_i.    (4)

Here, s_i = \log\hat{\sigma}_i^2 is the predictive variable. However, Monte Carlo dropout [10,13] necessitates multiple forward propagations, leading to an inefficient training process. To address this issue, we employ the variance of the predictions from different N_head auxiliary heads to compute Var(Y) in Eq. (2). In practice, we approximate the predictive variance by computing the Kullback-Leibler divergence (as in [31]) between the segmentation prediction \hat{Y}_{pred} and the auxiliary head output \hat{Y}_{aux} (as depicted in Fig. 1(g)):

Var(Y) \approx D_{KL}(\hat{Y}_{pred} \,\|\, \hat{Y}_{aux})    (5)
       = |\hat{y}_{pred}|\log|\hat{y}_{pred}| − |\hat{y}_{pred}|\log|\hat{y}_{aux}|.    (6)


Similar to Eq. (4), we fix the predictive uncertainty as a regularization term d_i := normalize(D_{KL}(\hat{Y}_{pred} \| \hat{Y}_{aux})) and propose the regularized cross-entropy loss:

L_{rce} = \frac{1}{N}\sum_i e^{d_i/\tau} \sum_{c=1}^{C} −y_{ic}\log\hat{y}_{ic},    (7)

where τ is the temperature. In the proposed regularized cross-entropy loss, a larger uncertainty value (i.e., a larger d_i/τ) indicates, per observation (2) in Sect. 3, that the current training sample is challenging; we therefore want the network to focus more on it, which corresponds to a larger value of L_rce. Note that we abandon the penalty term s_i of Eq. (4), because d_i is computed and fixed within a single forward propagation and does not require any constraint.
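A minimal PyTorch sketch of this regularization follows. Function names and the min-max normalization are our choices, and the KL map is detached to mirror the fact that d_i is treated as a fixed weight:

```python
import torch
import torch.nn.functional as F

def uncertainty_regularized_ce(pred_logits, aux_logits, target, tau=1.0):
    """Regularized cross-entropy in the spirit of Eq. (7): pixels where the main
    and auxiliary predictions disagree (high KL) receive a larger weight e^{d_i/tau}."""
    p = F.softmax(pred_logits, dim=1)
    log_p = F.log_softmax(pred_logits, dim=1)
    log_q = F.log_softmax(aux_logits, dim=1)
    # per-pixel KL(pred || aux), detached so it acts as a fixed regularization weight
    d = (p * (log_p - log_q)).sum(dim=1).detach()
    d = (d - d.min()) / (d.max() - d.min() + 1e-6)   # one possible normalization to [0, 1]
    ce = F.cross_entropy(pred_logits, target, reduction='none')  # -y log(yhat) per pixel
    return (torch.exp(d / tau) * ce).mean()
```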

3.3 Uncertainty-Aware Boundary Attention Network

The Pyramid Pooling Module (PPM) was proposed by [30], and [19] proposes a powerful variant, the Deep Aggregation PPM (DAPPM). We slightly modify the DAPPM of [19] by reducing the input channels, reducing the number of pooling scales, and adding an attention gate, yielding a more efficient context refinement module, namely the Residual Attention feature Refinement Module (RARM), as shown in Fig. 5.

Fig. 5. Illustration of the Residual Attention feature Refinement Module. "Avg(k,s)" denotes average pooling with kernel size k and stride s.

We integrate LAFFM, RARM, and uncertainty regularization into an encoder-decoder architecture, proposing the Uncertainty-aware Boundary Attention Network (UBANet), as shown in Fig. 2. To avoid extra computation cost, we adopt STDCNet [8] as the encoder rather than a two-pathway network. We remove the time-consuming original Attention Refinement Module (ARM) and adopt only the final feature map to construct semantics. We set a semantic head on the immediate outputs of each LAFFM and compute the cross-entropy loss Lce following [30]. During training, the semantic prediction of the auxiliary head is passed to the decode head to generate the uncertainty map (as shown in Fig. 2), which is then used to compute the regularized cross-entropy loss Lrce. Additionally, we feed the detail feature map Di to a low-level head for the edge detection task and compute Llow-level introduced in Sect. 3.1, where we apply the Laplacian kernel to the semantic label map following [8] to generate the ground truth of the low-level task. The loss function of the whole network is given as


Fig. 6. Qualitative results and visual explanations of Uncertainty Regularization(UR). The network struggles to handle object boundaries and challenging areas effectively (b). In contrast, with the implementation of estimated uncertainty map (e), the network can manage object boundaries more competently and become increasingly aware of the shapes (c).

follows:

L = \lambda L_{rce} + N \times (L_{ce} + L_{low-level}),    (8)

where M and N denote the numbers of semantic heads and detail heads, respectively. Generally, the detail outputs of the last two LAFFMs are used for the low-level task (N = 2); in the minuscule version of UBANet, we only use a detail head on the output of the last LAFFM, i.e., N = 1.
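Read literally, Eq. (8) sums the segmentation and low-level terms over the heads; a tiny sketch under that reading (the per-head summation is our interpretation, not stated explicitly above):

```python
def ubanet_total_loss(l_rce, seg_losses, lowlevel_losses, lam=1.0):
    """One possible reading of Eq. (8): lambda * L_rce plus the sum over heads
    of (L_ce + L_low-level); `seg_losses`/`lowlevel_losses` are per-head scalars."""
    return lam * l_rce + sum(ce + ll for ce, ll in zip(seg_losses, lowlevel_losses))
```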

4 Experiments

4.1 Dataset

Cityscapes [7] is a dataset for semantic urban scene understanding that contains 30 categories and more than 50 urban scenes under different weather conditions. The images used for the semantic segmentation task cover 19 categories and comprise 5000 high-resolution (2048×1024) images, officially divided into 2975/500/1525 images for training/validation/test. CamVid [1] is a video collection captured from the perspective of a driving automobile, with pixel-wise semantic annotations at a resolution of 960×720. The 701 annotated frames are empirically split into 367/101/233 for training/validation/test.

4.2 Implementation Details

We use PyTorch as our experiment platform; the runtime environment is PyTorch 1.8.1, CUDA 11.4, and cuDNN 8.0. All networks are trained with the same


Table 1. Comparison with state-of-the-art real-time methods on Cityscapes val. Numbers marked with "*" are #GFLOPS measured with the experiment framework [6]; "none" indicates the method has no separate backbone; "–" marks entries that could not be recovered.

Method             | Backbone   | mIoU% | #Params | FPS   | #GFLOPS
ICNet [29]         | PSPNet50   | 67.7  | –       | 30.3  | 28.3
DF1-Seg [17]       | DF1        | 74.1  | –       | 106.4 | –
DF2-Seg1 [17]      | DF2        | 75.9  | –       | 67.2  | –
DF2-Seg2 [17]      | DF2        | 76.9  | –       | 56.3  | –
BiseNet [27]       | Xception39 | 69.0  | 5.8M    | 105.8 | 55.3
BiseNet [27]       | ResNet18   | 74.8  | 49.0M   | 65.5  | 55.3
BiseNetV2 [26]     | none       | 75.8  | –       | 47.3  | 118.5
BiseNetV2 [26]     | none       | 75.8  | –       | 47.3  | 118.5
FasterSeg [4]      | none       | 73.1  | 4.4M    | 136.2 | –
STDC-Seg75 [8]     | STDC1      | 74.5  | 14.2M   | 74.6  | 38.1*
STDC-Seg75 [8]     | STDC2      | 77.0  | 22.2M   | 73.5  | 53.0*
SFNet-DF1 [16]     | DF1        | 75.0  | –       | 121   | –
DDRNet23-Slim [19] | none       | 76.1  | 5.7M    | 101.0 | 38.3
DDRNet23-Base [19] | none       | 78.9  | 20.1M   | 38.3  | 143.1
RTFormer-slim [22] | none       | 76.3  | 4.8M    | 110.0 | –
RTFormer-Base [22] | none       | 79.3  | 16.8M   | 39.1  | –
UBANet-Slim-75     | STDC1      | 76.7  | 10.8M   | 137.5 | 34.1
UBANet-Base-75     | STDC2      | 77.3  | 15.3M   | 101.7 | 54.1
UBANet-Slim        | STDC1      | 78.1  | 10.8M   | 70.7  | 60.7
UBANet-Base        | STDC2      | 79.3  | 15.3M   | 54.2  | 96.1

setting: specifically, 860 epochs, batch size 16, and an SGD optimizer with learning rate 1e-2, momentum 0.9, and weight decay 5e-4 for Cityscapes [7]; and 200 epochs, batch size 12, and an SGD optimizer with learning rate 5e-3, momentum 0.9, and weight decay 5e-4 for CamVid [1]. We use a poly decay policy to adapt the learning rate after each iteration, and Online Hard Example Mining (OHEM) is also used. All input images are first randomly resized to [0.5, 2.0]× of their original resolution, rotated, and flipped, and then randomly cropped to 1024×512 / 960×720 for the Cityscapes and CamVid datasets, respectively. For fair comparison, TensorRT acceleration is not used.
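The poly schedule mentioned above is the standard polynomial decay used in segmentation; a small sketch (the exponent 0.9 is an assumption, as it is not stated here):

```python
def poly_lr(base_lr: float, cur_iter: int, max_iter: int, power: float = 0.9) -> float:
    """Polynomial ("poly") learning-rate decay applied after each iteration."""
    return base_lr * (1 - cur_iter / max_iter) ** power
```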

4.3 Comparisons

Cityscapes. We follow [8] to measure the inference speed of UBANet with input resolutions 2048×1024 and 1536×769, denoted by UBANet and UBANet-75, respectively. All inference experiments are conducted on a single RTX 3090 GPU. In Table 1, we compare UBANet with previous state-of-the-art models. Note that the minuscule UBANet-Slim-75 achieves a very competitive


Table 2. Comparison with state-of-the-art real-time methods on CamVid test. "none" indicates the method has no separate backbone; "–" marks entries that could not be recovered. Note that our result is reported without Cityscapes pretraining [19].

Method            | Backbone   | mIoU% | FPS
ICNet [29]        | PSPNet50   | 67.1  | 27.8
DFANet [14]       | Xception39 | 64.7  | 120
BiseNet [27]      | none       | 68.7  | 116.3
BiseNetV2 [26]    | none       | 72.4  | 124.5
SwiftNetRN [8]    | ResNet18   | 72.6  | –
SwiftNetRN pyr [8]| ResNet18   | 73.9  | –
STDC-Seg [8]      | STDC1      | 73.0  | –
STDC-Seg [8]      | STDC2      | 73.9  | –
SFNet [16]        | ResNet18   | 73.8  | –
SFNet [16]        | DF2        | 70.4  | –
Faster-Seg [4]    | none       | 71.1  | –
UBANet-Slim       | STDC1      | 74.0  | 170.1
UBANet-Base       | STDC2      | 74.4  | 105.3

result (76.7 mIoU%) at extremely high speed (138 FPS). It is also noteworthy that UBANet-Base achieves a state-of-the-art result (79.3 mIoU%) with faster inference and lower computation cost.

CamVid. We evaluate the inference speed with an input resolution of 720×960. As shown in Table 2, UBANet-Slim achieves higher mIoU than state-of-the-art methods such as [8] while being 70% faster. Moreover, UBANet-Base surpasses other state-of-the-art methods by 0.5% mIoU, attaining 74.4 mIoU% without the need for extra training data (e.g., Cityscapes [19,26]).

Supplement. The supplement provides qualitative analysis and additional experiments on ADE20k and Pascal Context.

4.4 Ablation Study

Effectiveness of Uncertainty Regularization. Firstly, we visualize the estimated uncertainty maps in Fig. 6(e). With the proposed uncertainty regularization (discussed in Sect. 3), the network becomes more sensitive to details and handles challenging areas better, resulting in more accurate predictions (as shown in Fig. 6(c)(d)). We also conduct a quantitative experiment on uncertainty regularization, as presented in Table 3, and a parameter study of τ and λrce in Table 4. Importantly, since uncertainty regularization is applied only during the training phase and does not affect inference speed, we investigate its broader applications in the supplement.

Effectiveness of LAFFM. We employ Grad-CAM [20] to visualize the attention areas in Stage 3 of the network. As illustrated in Fig. 4 (b) and (c), when


Table 3. Ablation experiments on each component of UBANet. "UR" denotes uncertainty regularization; "LAFFM" and "RARM" are described in Sect. 3.

Method                     | UR | LAFFM | RARM | mIoU%
Baseline                   |    |       |      | 71.99
Baseline + UR              | ✓  |       |      | 72.65
Baseline + LAFFM           |    | ✓     |      | 75.6
Baseline + RARM            |    |       | ✓    | 74.48
UBANet (UR + LAFFM + RARM) | ✓  | ✓     | ✓    | 76.72

Table 4. Experimental results of hyperparameter sensitivity in UBANet. Hyperparameters include the uncertainty-regularization temperature τ, the coefficient of the regularized loss λrce, and the coefficient of the dice loss μdice. Default parameters are marked with "*". The evaluation setup is equivalent to the baseline in Table 3.

τ_UR   | mIoU% | λrce  | mIoU% | μdice | mIoU%
1      | 70.2  | 1*    | 72.7  | 1.0*  | 75.6
0.5    | 71.4  | 0.5   | 72.5  | 0.5   | 75.6
0.05*  | 72.7  | 0.1   | 72.1  | 0.1   | 75.1
0.01   | 72.7  | 0.05  | 71.4  | 0.05  | 75.3
0.001  | 71.5  | 0.001 | 71.1  | 0.01  | 75.0

trained with detail guidance [8], the network is able to generate more detailed information. Distinctly, with proposed LAFFM, the network can produce finegrained features, i.e., the attention areas are more accurately located and consist of smaller pieces, as shown in Fig. 4 (d), demonstrating that LAFFM can take advantage of low-level features effectively. Furthermore, we conduct parameter study of μdice to investigate the functionality of dice loss in Table 4. Collaboration of the Network Components. We conduct ablation experiments on the effectiveness of each components of proposed UBANet as in Table 3. For convenience, we adopt STDC1 as the backbone, and directly upsample and add the feature maps as the feature fusion baseline.The result shows that LAFFM brings about a 3.6% mIoU improvement over baseline, indicating its better ability to employ low-level features. Furthermore, RARM is effective in accumulating global context, with over 2.4% mIoU performance gain. Also, uncertainty regularization brings more than 0.6 % mIoU improvement, which partly confirms our assumption in Sect.3.

5 Conclusion

In this paper, we propose a UBANet for real-time semantic segmentation, which incorporates low-level supervision and uncertainty regularization within encoder-


decoder architecture efficiently. Experiments on standard real-time scene-parsing benchmarks demonstrate that UBANet achieves a state-of-the-art trade-off between latency and accuracy. However, our minuscule UBANet still requires more than 30 GFLOPS of computation, which is unacceptable for mobile devices; we leave this problem to future work.

References 1. Brostow, G.J., Shotton, J., Fauqueur, J., Cipolla, R.: Segmentation and recognition using structure from motion point clouds. In: ECCV (1), pp. 44–57 (2008) 2. Chen, L.C., Papandreou, G., Kokkinos, I., Murphy, K., Yuille, A.L.: Deeplab: semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected crfs. arXiv:1606.00915 (2016) 3. Chen, L.C., Zhu, Y., Papandreou, G., Schroff, F., Adam, H.: Encoder-decoder with atrous separable convolution for semantic image segmentation. In: ECCV (2018) 4. Chen, W., Gong, X., Liu, X., Zhang, Q., Li, Y., Wang, Z.: Fasterseg: searching for faster real-time semantic segmentation. In: ICLR (2020) 5. Cheng, B., Schwing, A.G., Kirillov, A.: Per-pixel classification is not all you need for semantic segmentation (2021) 6. Contributors, M.: Mmsegmentation: openmmlab semantic segmentation toolbox and benchmark (2020). https://github.com/open-mmlab/mmsegmentation 7. Cordts, M., et al.: The cityscapes dataset for semantic urban scene understanding. In: CVPR (2016) 8. Fan, M., et al.: Rethinking bisenet for real-time semantic segmentation. In: CVPR, pp. 9716–9725 (June 2021) 9. Fu, J., et al.: Dual attention network for scene segmentation. In: CVPR (2019) 10. Gal, Y., Ghahramani, Z.: Dropout as a bayesian approximation: Representing model uncertainty in deep learning (2016) 11. Goan, E., Fookes, C.: Uncertainty in real-time semantic segmentation on embedded systems. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, pp. 4490–4500 (June 2023) 12. Jin, Y., Han, D., Ko, H.: Trseg: trransformer for semantic segmentation. Pattern Recogn. Lett. 148, 29–35 (2021) 13. Kendall, A., Gal, Y.: What uncertainties do we need in bayesian deep learning for computer vision? (2017) 14. Li, H., Xiong, P., Fan, H., Sun, J.: Dfanet: deep feature aggregation for realtime semantic segmentation (2019). 10.48550/ARXIV.1904.02216, https://arxiv. org/abs/1904.02216 15. Li, X.,et al.: Improving semantic segmentation via decoupled body and edge supervision. In: ECCV (2020) 16. Li, X., et al.: Semantic flow for fast and accurate scene parsing. In: ECCV (2020) 17. Li, X., Zhou, Y., Pan, Z., Feng, J.: Partial order pruning: for best speed/accuracy trade-off in neural architecture search. CoRR abs/1903.03777 (2019). https:// arxiv.org/abs/1903.03777 18. Mukhoti, J., van Amersfoort, J., Torr, P.H., Gal, Y.: Deep deterministic uncertainty for semantic segmentation. arXiv preprint arXiv:2111.00079 (2021) 19. Pan, H., Hong, Y., Sun, W., Jia, Y.: Deep dual-resolution networks for real-time and accurate semantic segmentation of traffic scenes. IEEE Trans. Intell. Transp. Syst. 24(3), 3448–3460 (2022)


20. Selvaraju, R.R., Cogswell, M., Das, A., Vedantam, R., Parikh, D., Batra, D.: GradCAM: visual explanations from deep networks via gradient-based localization. Int. J. Comput. Vision 128(2), 336–359 (oct 2019). https://doi.org/10.1007/s11263019-01228-7, https://doi.org/10.10072Fs11263-019-01228-7 21. Shelhamer, E., Long, J., Darrell, T.: Fully convolutional networks for semantic segmentation. IEEE TPAMI 39(4), 640–651 (2017) 22. Wang, J., et al.: Rtformer: efficient design for real-time semantic segmentation with transformer. In: NeurIPS (2022) 23. Wang, L., Li, D., Zhu, Y., Tian, L., Shan, Y.: Dual super-resolution learning for semantic segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (June 2020) 24. Wang, Y., Peng, J., Zhang, Z.: Uncertainty-aware pseudo label refinery for domain adaptive semantic segmentation. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 9092–9101 (2021) 25. Xu, J., Xiong, Z., Bhattacharyya, S.P.: Pidnet: a real-time semantic segmentation network inspired from pid controller (2022) 26. Yu, C., Gao, C., Wang, J., Yu, G., Shen, C., Sang, N.: Bisenet v2: bilateral network with guided aggregation for real-time semantic segmentation. In: IJCV, pp. 1–18 (2021) 27. Yu, C., Wang, J., Peng, C., Gao, C., Yu, G., Sang, N.: Bisenet: bilateral segmentation network for real-time semantic segmentation. In: ECCV, pp. 325–341 (2018) 28. Yuan, Y., Wang, J.: Ocnet: object context network for scene parsing. CoRR abs/1809.00916 (2018) 29. Zhao, H., Qi, X., Shen, X., Shi, J., Jia, J.: Icnet for real-time semantic segmentation on high-resolution images. In: ECCV, pp. 405–420 (2018) 30. Zhao, H., Shi, J., Qi, X., Wang, X., Jia, J.: Pyramid scene parsing network. In: CVPR (2017) 31. Zheng, Z., Yang, Y.: Rectifying pseudo label learning via uncertainty estimation for domain adaptive semantic segmentation. Int. J. Comput. Vision 129(4), 1106–1120 (2021)

URFormer: Unified Representation LiDAR-Camera 3D Object Detection with Transformer Guoxin Zhang1 , Jun Xie2 , Lin Liu3 , Zhepeng Wang2(B) , Kuihe Yang1 , and Ziying Song3 1

School of Information Science and Engineering, Hebei University of Science and Technology, Shijiazhuang 050018, China 2 Lenovo Research, Beijing 100094, China [email protected] 3 School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China

Abstract. Current LiDAR-camera 3D detectors adopt a 3D-2D design pattern. However, this paradigm ignores the dimensional gap between heterogeneous modalities (e.g., coordinate system, data distribution), leading to difficulties in marrying the geometric and semantic information of two modalities. Moreover, conventional 3D convolution neural networks (3D CNNs) backbone leads to limited receptive fields, which discourages the interaction between multi-modal features, especially in capturing long-range object context information. To this end, we propose a Unified Representation Transformer-based multi-modal 3D detector (URFormer) with better representation scheme and cross-modality interaction, which consists of three crucial components. First, we propose Depth-Aware Lift Module (DALM), which exploits depth information in 2D modality and lifts 2D representation into 3D at the pixel level, and naturally unifies inconsistent multi-modal representation. Second, we design a Sparse Transformer (SPTR) to enlarge effective receptive fields and capture long-range object semantic features for better interaction in multi-modal features. Finally, we design Unified Representation Fusion (URFusion) to integrate cross-modality features in a fine-grain manner. Extensive experiments are conducted to demonstrate the effectiveness of our method on KITTI benchmark and show remarkable performance compared to the state-of-the-art methods.

Keywords: Multi-modal fusion · 3D object detection · Unified representation · Transformer

1 Introduction

LiDAR and camera are essential sensors for autonomous driving perception, owing to that they can conveniently become the eyes of intelligent vehicles. c The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024  Q. Liu et al. (Eds.): PRCV 2023, LNCS 14427, pp. 401–413, 2024. https://doi.org/10.1007/978-981-99-8435-0_32


LiDAR points are too sparse and carry poor semantic information even when collected by a high-quality LiDAR, which results in sub-optimal performance on long-range objects [17,20]. On the other hand, the image loses the depth information that is important for 3D perception while providing dense and rich semantic information, which makes it easier to capture long-range and small objects at high resolution [24]. The natural complementarity of the two sensors motivates researchers to design multi-sensor fusion approaches.


Fig. 1. An illustration of the dimensional gap in representation. (a) Most previous state-of-the-art methods adopt voxel-pixel fusion, which ignores the heterogeneity between the two 3D and 2D features. Redundant noise in the image (indicated in red) is detrimental to performance. (b) We adopt a unified representation using consistent 3D features to fuse the two sensors.

The previous LiDAR-camera methods [21] can be broadly categorized into two paradigms: sequence fusion [11,19] and feature-level fusion [22,26]. The former processes the pipeline of the two modalities separately, and the results are finally grouped together for tuning. This approach decouples the two modalities, fusing them in result level generally. Nevertheless, its performance is extremely dependent on the performance of each stage. The latter [1,22,26] feed both modalities into the model simultaneously for feature extraction and integrate both features in the intermediate stage. Inconsistent Representation. Most methods usually alternately transform the data representation, e.g., multi-view [1,26] and voxel-pixel [2,16] methods. Although these approaches improve performance certainly, information fusion under chaotic representations inevitably brings about viewpoint inconsistencies and dimensional gaps (shown in Fig. 1(a)). These two drawbacks might limit the maximum performance of multi-modal architecture. Insufficient Perception. Most feature-level fusion approaches [22,25] utilize region-based local fusion techniques. In the refinement stage, the region of interest (RoI) feature fails to perceive the large contextual information correctly. This is because most approaches rely on convolution operations for feature extraction, which results in a limited receptive field that hinders to capture of large context features effectively combined with another modality. However, RoIs are highly dependent on context information to correct their locations. This drawback leads to sub-optimal performance on long-range objects.


Fig. 2. An overview of our URFormer. Blue and green paths represent camera flow and LiDAR flow, respectively. (a) DALM converts the image into virtual points using point clouds, images, and calibration, where a depth completion model [6] is required to generate the dense depth map (detail in Sec. 2.1). (b) SPTR block leverages a sparse self-attention to encode voxel features with rich context information. N SPTR blocks are used (we set it to 9) and downsampling is used only three times (8 times downsampling) in order to extract multi-scale features. (c) Thanks to the unified representation framework, URFusion adaptively fuses cross-modal RoI features in a uniform representation.

To address the aforementioned issues, we propose a unified representation transformer-based framework, termed URFormer, which not only unifies chaos LiDAR-camera representation but also enlarges the limited receptive field in the point cloud branch. First, we propose DALM (Depth-Aware Lift Module) that exploits mere depth information in images, adopting a virtual point-generating strategy to make inconsistent representation into uniform representation (as shown in Fig. 1(b)). Second, to overcome the insufficient perception problem by limited perceptive field, we employ self-attention mechanism to extract voxel features instead of traditional sparse convolution. We design a SPTR (Sparse Transformer) that enables the network to acquire a larger effective receptive field. Third, to effectively fuse the contextual information of the two branches, we propose a cross-attention-based fusion module URFusion (Unified Representation Fusion), which combines uniform RoI features from LiDAR and camera branches and adaptively selects informative features. Finally, we conduct extensive experiments to demonstrate the effectiveness of our method. Note that we surpassed the state-of-the-art methods and achieved 83.40% AP on the KITTI car benchmark.


Fig. 3. An illustration of our DALM pipeline. First, LiDAR points project onto the image plane to generate sparse depth map. Second, the sparse depth map is converted to a dense depth map using a depth completion model, which means that each pixel has coarse depth information. Finally, each pixel is converted into a virtual point cloud representation by dense depth information.

2 Methodology

URFormer is a multimodal detector that converts heterogeneous representations into a uniform form. As shown in Fig. 2, it consists of three key parts: (a) a DALM for converting the image into a representation consistent with the point cloud; (b) multiple SPTR blocks for extracting voxel features with sparse self-attention; and (c) a URFusion module for adaptively fusing the 3D RoI features from the two branches. We detail them in the following sections.

2.1 Depth-Aware Lift Module

LiDAR and the camera are two completely different sensors that contain multiple discrepancies and heterogeneities. In terms of coordinate system, LiDAR points are mostly characterized as 3D Cartesian or spherical coordinate system, while the images collected by the camera are usually represented by 2D Cartesian coordinate system. This phenomenon leads to the fact that encoding these two representations requires different architectures (e.g., 3D CNN and 2D CNN), which is non-trivial for sensor fusion. Furthermore, in data distribution, LiDAR points are sparse, whereas pixels are dense. The point clouds need to be encoded with sparse vectors and the image can be processed by a matrix easily. This natural structure inconsistency results in a data structure gap. In summary, the key challenge for LiDAR-camera fusion is alleviating the discrepancy between the point cloud and image representations. To this end, we design the DepthAware Lift Module (DALM), which utilizes the feature that LiDAR points can provide depth information for the camera to lift dimension images from 2D pixels to 3D points (denoted as virtual points in the following). Next, we explain the detail of DALM (shown in Fig. 3). First, the camera and LiDAR scan the current scene to generate a frame of raw data, which are the


Fig. 4. A 2D example of attention range in our sparse self-attention.

images I_2D ∈ R^{C_I×H×W} and point cloud P_3D ∈ R^{N_P×C_P}, respectively. Moreover, calibration information is generally available between vehicle sensors; here we use a transformation matrix T_{3D→2D} ∈ R^{3×4} to represent the correspondence between LiDAR and camera. The relationship between each pixel and point can therefore be expressed as

(u, v) = T_{3D→2D}(P_{3D}), \quad (u, v) ∈ I_{2D},    (1)

where (u, v) are pixel coordinates corresponding to a subset of I_2D. Next, we project the depth information of P_3D onto the image plane via Eq. (1); a subset of pixels thereby obtains depth information, giving a sparse depth map S ∈ R^{H×W}. Second, we leverage a depth completion model M [6] to complete the depth of the remaining pixels, producing a dense depth map D ∈ R^{H×W}. Finally, we use the inverse projection T_{2D→3D} to project D back into 3D space and obtain a frame of virtual points VP ∈ R^{N_vp×C_vp}. Mathematically, the procedure can be represented as

D = M(I_{2D}, T_{3D→2D}(P_{3D})),    (2)
VP = T_{2D→3D}(I_{2D}, D), \quad VP ∈ R^{N_{vp}×C_{vp}},    (3)

where C_vp consists of 8 channels (x, y, z, r, g, b, u, v): (x, y, z) are the virtual point's 3D Cartesian coordinates, (r, g, b) its color, and (u, v) the corresponding pixel coordinates.
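A schematic NumPy sketch of the lift step is shown below. The calibration handling is simplified, `depth_completion` stands in for the model M of [6], and the pseudo-inverse back-projection is a stand-in for T_{2D→3D}; all of these are assumptions, not the released implementation:

```python
import numpy as np

def lift_image_to_virtual_points(image, points_lidar, P_3d_to_2d, depth_completion):
    """DALM sketch: project LiDAR depth onto the image, densify it, and
    back-project every pixel into a virtual point (x, y, z, r, g, b, u, v)."""
    H, W, _ = image.shape
    # 1) sparse depth map from LiDAR points (homogeneous projection with a 3x4 matrix)
    pts_h = np.concatenate([points_lidar[:, :3], np.ones((len(points_lidar), 1))], axis=1)
    uvw = pts_h @ P_3d_to_2d.T                      # (N, 3): [u*w, v*w, w]
    depth = uvw[:, 2]
    u = np.round(uvw[:, 0] / depth).astype(int)
    v = np.round(uvw[:, 1] / depth).astype(int)
    keep = (depth > 0) & (u >= 0) & (u < W) & (v >= 0) & (v < H)
    sparse = np.zeros((H, W), dtype=np.float32)
    sparse[v[keep], u[keep]] = depth[keep]
    # 2) dense depth map via a depth completion model (Eq. (2))
    dense = depth_completion(image, sparse)
    # 3) back-project every pixel with its completed depth (Eq. (3))
    vv, uu = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")
    pix_h = np.stack([uu * dense, vv * dense, dense], axis=-1).reshape(-1, 3)
    xyz = (pix_h @ np.linalg.pinv(P_3d_to_2d).T)[:, :3]   # approximate inverse projection
    rgb = image.reshape(-1, 3)
    uv = np.stack([uu, vv], axis=-1).reshape(-1, 2)
    return np.concatenate([xyz, rgb, uv], axis=1)          # (H*W, 8) virtual points
```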

2.2 Sparse Transformer

To extract 3D geometric features, voxel-based detectors use 3D sparse convolution to encode point cloud features generally. Although the convolution operator is very efficient, it only computes in a local range at each step, which leads to a limited perceptive field. However, we require a greater range of context-aware capabilities in order for multimodal features to interact more effectively. Inspired by [9], we present SPTR by applying sparse voxels to the transformer. Benefiting from the sparse self-attention mechanism, our SPTR can capture large context information and enlarge the effective receptive field. Sparse Self-attention. First, following [3], we voxelize a frame of raw point with voxel size (0.05m, 0.05m, 0.1m). The 3D scene is split in Nvoxel non-empty


voxels, whose features are denoted F ∈ R^{N_voxel×C_P} with corresponding indices N_index ∈ R^{N_voxel×3}. Compared to 2D images, 3D voxel tensors occupy a large amount of GPU memory, so it is too expensive to perform multi-head self-attention over all voxels. Inspired by [9], we design an attention query mechanism Ω(·) for sparse voxels. Ω(·) not only expands the receptive field but also covers neighboring voxels so that fine-grained features are preserved. Specifically, Ω(i) is the query range of the i-th voxel v_i, and v_j ∈ Ω(i) are the attending voxels queried by v_i. As shown in Fig. 4, Ω(i) queries voxels in both local and dilated ranges as attending voxels v_j. We perform multi-head self-attention between v_i and v_j to obtain the attention feature F_att:

F_{att} = \sum_{j \in \Omega(i)} \sigma\!\left(Q_i K_j^T / \sqrt{d}\right) \cdot V_j,    (4)

where σ(·) is the softmax function for normalization, and Q_i, K_j, V_j are the corresponding embeddings for self-attention:

Q_i = W_Q \cdot v_i, \quad K_j = W_K \cdot v_j, \quad V_j = W_V \cdot v_j,    (5)

where W_Q, W_K, W_V are learnable linear projection matrices for query, key, and value, respectively. After this, we use a simple residual structure and a feed-forward layer [18]. However, the transformer architecture usually has the same input and output dimensions, which prevents further broadening of the receptive field. To this end, we add downsampling modules to part of the SPTR blocks to expand the context-aware information. Finally, we obtain multi-scale downsampled features F_att^{1×}, F_att^{2×}, F_att^{4×}, F_att^{8×} for the following detection pipeline.
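A condensed, single-head PyTorch sketch of the sparse self-attention in Eq. (4) is given below; the neighbor-gathering step Ω(·) is abstracted away as a precomputed index tensor, and all names are ours:

```python
import torch
import torch.nn as nn

class SparseVoxelSelfAttention(nn.Module):
    """Sketch of Eq. (4): each non-empty voxel attends only to a small set of
    local and dilated neighbors given by `neighbor_idx`, not to all voxels."""
    def __init__(self, dim):
        super().__init__()
        self.q = nn.Linear(dim, dim)
        self.k = nn.Linear(dim, dim)
        self.v = nn.Linear(dim, dim)
        self.scale = dim ** -0.5

    def forward(self, feats, neighbor_idx):
        # feats: (N_voxel, C); neighbor_idx: (N_voxel, K) indices into feats (Omega(i))
        q = self.q(feats)                           # (N, C)
        k = self.k(feats)[neighbor_idx]             # (N, K, C) gathered attending voxels
        v = self.v(feats)[neighbor_idx]             # (N, K, C)
        attn = torch.softmax((q.unsqueeze(1) * k).sum(-1) * self.scale, dim=-1)  # (N, K)
        return (attn.unsqueeze(-1) * v).sum(dim=1)  # (N, C) attention feature F_att
```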

2.3 Unified Representation Fusion

Benefiting from our uniform representation framework, we can easily extract heterogeneous modal features grid by grid in a unified form. However, the raw point cloud is sparse and the virtual points are dense, which can lead to subtle deviations in the point-to-point correspondence. Furthermore, most previous approaches use element-wise addition or concatenation to integrate multi-modal features; applied here, this may cause the features to be misaligned. To this end, we design URFusion (Unified Representation Fusion) to adaptively align heterogeneous features for unified representation fusion (details in Fig. 2(c)). Specifically, this module utilizes cross-attention to effectively query and fuse the heterogeneous features, facilitating information exchange and integration among the modalities. We place the fusion at the refinement stage, where RoI pooling aggregates the features of the two heterogeneous modalities, giving a pair of RoI features F_roi^p and F_roi^vp. To fuse them, we adopt learnable alignment to query complementary features. First, we employ two linear projection layers to


obtain the corresponding embedding features (E^p, E^vp) and concatenate them to generate the cross-attention weight. Formally,

E^p = MLP(F_{roi}^p), \quad E^{vp} = MLP(F_{roi}^{vp}),    (6)
E^{fuse} = CONCAT(E^p, E^{vp}),    (7)

where MLP(·) denotes a linear projection layer and E^fuse is the combined embedding used to predict the attention weights. Subsequently, we use linear projections to map (F_roi^p, F_roi^vp) to a suitable dimension (T_roi^p, T_roi^vp) for the attention mechanism:

T_{roi}^p = MLP(F_{roi}^p), \quad T_{roi}^{vp} = MLP(F_{roi}^{vp}).    (8)

We obtain the attention scores S via MLP(·), which can be split into (S^p, S^vp). Mathematically, the workflow is expressed as

(S^p, S^{vp}) = \sigma(MLP(E^{fuse})),    (9)
F_{roi}^{fuse} = CONCAT(S^p \cdot MLP(T_{roi}^p),\; S^{vp} \cdot MLP(T_{roi}^{vp})),    (10)

f use where σ(·) represent sigmoid function for normlizing attention weight and Froi denote the final unified fusion feature. In fact, our URFusion adaptively combines the features of both modalities. Important features are given enhanced focus by the attention mechanism, while unimportant ones are given reduced concern. Therefore, it can effectively fuse to compensate for heterogeneity.

3 Experiments

3.1 Dataset and Evaluation Metrics

We evaluate our URFormer on the KITTI dataset [4], one of the most popular self-driving perception datasets, which contains 7481 frames of labeled training set and 7518 frames of unlabeled test set. Following the previous work, we divided the training set into 3712 samples and 3769 samples for the train set and val set, respectively. We employ 3D AP and BEV AP as our main evaluation metrics. Generally, there are 11 recall positions or 40 recall positions for AP measurements. For a fair comparison, we adopt the popular 40 recall positions. KITTI benchmark [4] consists of three categories car, pedestrian, and cyclist, while we only evaluate popular car following our baseline [3]. Based on the truncation level of the objects and the height of the bounding box, KITTI divides the objects into 3 difficulty levels of easy, moderate, and hard.


Table 1. Performance comparison with the state-of-the-art methods on the KITTI test set for car 3D detection (KITTI server). 'L' and 'C' represent LiDAR and Camera, respectively.

Method            | Modality | AP3D Easy | AP3D Mod | AP3D Hard | APBEV Easy | APBEV Mod | APBEV Hard
MV3D [1]          | L&C      | 74.97     | 63.63    | 54.00     | 86.62      | 78.93     | 69.80
SECOND [23]       | L        | 84.65     | 75.96    | 68.71     | 88.07      | 79.37     | 77.50
Part-A2 [15]      | L        | 85.94     | 77.86    | 72.00     | 89.52      | 84.76     | 81.47
MMF [8]           | L&C      | 86.81     | 76.75    | 68.41     | 89.49      | 87.47     | 79.10
EPNet [7]         | L&C      | 89.81     | 79.28    | 74.59     | 94.22      | 88.47     | 83.69
VoxSet [5]        | L        | 88.53     | 82.06    | 77.46     | 92.70†     | 89.07†    | 86.29†
PointPainting [19]| L&C      | 82.11     | 71.70    | 67.08     | 92.45†     | 88.11†    | 83.36†
PV-RCNN [13]      | L        | 90.25     | 81.43    | 76.82     | 94.98      | 90.65     | 86.14
Fast-CLOCs [10]   | L&C      | 89.11     | 80.34    | 76.98     | 93.02      | 89.49     | 86.40
FocalsConv [2]    | L&C      | 90.55     | 82.28    | 77.59     | 92.67†     | 89.00†    | 86.33†
VoxelRCNN [3]     | L        | 90.90     | 81.62    | 77.06     | 94.85†     | 88.83†    | 86.13†
Ours (URFormer)   | L&C      | 89.64     | 83.40    | 78.62     | 94.40      | 91.22     | 86.35

† released by KITTI official.

78.93 79.37 84.76 87.47 88.47 89.07† 88.11† 90.65 89.49 89.00†

90.90 81.62 77.06 94.85† 88.83† 86.13† 89.64 83.40 78.62 94.40 91.22 86.35

Implementation Detail

In our experiments, we adopt offline processing for DALM in order to conduct the experiments more efficiently, and we adopt [6] as the depth completion model of DALM. For our SPTR blocks, we leverage the linear layer to adjust the voxel 1× before entering the SPTR block. Sequentially, the channel to 16 to get Fatt 2× 4× 8× , Fatt , Fatt ), SPTR block is downsampled to obtain the multi-scale features (Fatt whose number of channels are (32,64,64) respectively, and the linear layers are applied for each channel adjustment. For data augmentation, we follow [22,23], simultaneously augmenting data (gt-sampling, global noise, global rotation, and global flip). 3.3

Main Results

Results on KITTI Test Set. We compare our URFormer with other state-of-the-art methods on the KITTI 3D object detection benchmark. As shown in Table 1, we report results on the test set, obtained by submitting our predictions to the official KITTI server. To take full advantage of our SPTR, we train the framework on all training data. Notably, our URFormer outperforms the baseline [3] by 1.78% and 1.56% on the AP3D of car moderate and hard, respectively. Furthermore, our method improves the APBEV by 1.02% and 0.22% on moderate and hard, respectively. Note that the easy detection


Table 2. Performance comparison with state-of-the-art methods on the KITTI val set for the car class. The results are reported by the AP with 0.7 IoU threshold. 'L' and 'C' represent LiDAR and Camera, respectively. Best in bold.

Method            Modality   AP3D Easy / Mod / Hard     APBEV Easy / Mod / Hard
PointRCNN [14]    L          88.88 / 78.63 / 77.38      – / – / –
SECOND [23]       L          87.43 / 76.48 / 69.10      – / – / –
Part-A2 [15]      L          89.47 / 79.47 / 78.54      90.42 / 88.61 / 87.31
MV3D [1]          L&C        71.29 / 62.68 / 56.56      86.55 / 78.10 / 76.67
EPNet [7]         L&C        92.28 / 82.59 / 80.14      95.51 / 91.47 / 91.16
PV-RCNN [13]      L          92.57 / 84.83 / 82.69      95.76 / 91.11 / 88.93
CT3D [12]         L          92.85 / 85.82 / 83.46      96.14 / 91.88 / 89.63
FocalsConv [2]*   L&C        92.44 / 85.73 / 83.49      95.17 / 91.60 / 89.44
VoxelRCNN [3]     L          92.38 / 85.29 / 82.86      95.52 / 91.25 / 88.99
Ours (URFormer)   L&C        95.53 / 88.27 / 85.61      96.40 / 92.18 / 91.55
* reproduced by official code.

performance drops slightly. We believe this is due to noise introduced by inaccurate depth completion. Results on KITTI Validation Set. To further demonstrate the effectiveness of URFormer, we also report results on the KITTI validation set, as shown in Table 2. Note that this model is trained only on the train split of 3712 samples. URFormer nevertheless performs strongly, surpassing most state-of-the-art methods. Notably, our method significantly improves the single-modal baseline by 3.15%, 2.98%, and 2.75% on AP3D. Thanks to our unified multi-modal representation, multi-modal complementarity is fully exploited. URFormer likewise outperforms the baseline on APBEV, albeit by a smaller margin.

3.4 Ablation Studies

To further explore the performance of our model, we conduct extensive experiments to analyze the effectiveness of each part of our approach. Distance Analysis. To understand how URFormer enhances detection, we evaluate performance over different distance ranges. As shown in Table 3, our method improves near-range performance only marginally, but yields a large improvement at long range. On the one hand, this supports our hypothesis that a large receptive field contributes to long-range context capture; on the other hand, virtual points compensate for the sparsity of the raw point cloud at long distances. Impact of Components. We next explore the effect of each component of URFormer. We remove the URFormer components one by one and


Table 3. Performance comparison at different distances on the KITTI val set for the car class. The results are reported by the AP with 0.7 IoU threshold. 'L' and 'C' represent LiDAR and Camera, respectively. Best in bold.

Method         AP3D Mean / 0-20 m / 20-40 m / 40 m-inf    APBEV Mean / 0-20 m / 20-40 m / 40 m-inf
Baseline [3]   71.61 / 95.94 / 79.45 / 39.45              79.81 / 96.09 / 89.34 / 54.00
Ours           74.00 / 96.00 / 80.34 / 45.66              82.23 / 96.34 / 90.12 / 60.23
Improvement    +2.38 / +0.06 / +0.89 / +6.21              +2.42 / +0.25 / +0.78 / +6.23

Table 4. Ablation study to evaluate the effect of each component on the KITTI val set for the car category.

Method         AP3D Easy / Mod. / Hard    APBEV Easy / Mod. / Hard
URFormer       95.53 / 88.27 / 85.61      96.40 / 92.18 / 91.55
w/o DALM       93.55 / 85.79 / 82.86      94.62 / 90.19 / 89.23
w/o SPTR       94.90 / 87.70 / 84.23      95.43 / 91.50 / 88.20
w/o URFusion   94.50 / 87.21 / 84.71      95.57 / 91.06 / 90.12

Fig. 5. Qualitative results for comparison between baseline and our method on KITTI test set. The first row is the camera perspective view and the second row is the BEV map generated by LiDAR. The blue and yellow circles represent missed detection and false detection, respectively. (Color figure online)

report their impact, as shown in Table 4. We find that DALM contributes the most to the overall pipeline by providing the unified representation. SPTR mainly improves the AP on the hard level, which is dominated by highly occluded objects, indicating that it strengthens contextual information capture. URFusion improves multi-modal fusion effectiveness, notably promoting the moderate AP.


Qualitative Results. We also provide qualitative results and analysis. In Fig. 5(a), a street scene, the baseline (red boxes) misses several long-range objects, while our method (green boxes) accurately detects these vehicles. In Fig. 5(b), the baseline incorrectly predicts a mid-range bush as a vehicle due to the lack of long-range contextual information, whereas our method, with a larger range of contextual information, avoids this mistake.

4 Conclusion

In this paper, we propose a unified-representation multi-modal algorithm for 3D object detection, termed URFormer, which converts pixels to virtual points via DALM, unifying the representation of heterogeneous sensors. URFormer leverages SPTR blocks to enlarge the effective receptive field, allowing it to acquire more contextual information when interacting with multi-modal features. Furthermore, we design URFusion to fuse heterogeneous features in a finer-grained manner. Extensive experiments on KITTI demonstrate that our approach effectively improves detection performance. Acknowledgements. This work was supported by the Fundamental Research Funds for the Central Universities (2023YJS019) and the STI 2030-Major Projects under Grant 2021ZD0201404 (Funded by Jun Xie and Zhepeng Wang from Lenovo Research).

References 1. Chen, X., Ma, H., Wan, J., Li, B., Xia, T.: Multi-view 3D object detection network for autonomous driving. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1907–1915 (2017) 2. Chen, Y., Li, Y., Zhang, X., Sun, J., Jia, J.: Focal sparse convolutional networks for 3D object detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5428–5437 (2022) 3. Deng, J., Shi, S., Li, P., Zhou, W., Zhang, Y., Li, H.: Voxel R-CNN: towards high performance voxel-based 3D object detection. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 35, pp. 1201–1209 (2021) 4. Geiger, A., Lenz, P., Urtasun, R.: Are we ready for autonomous driving? The KITTI vision benchmark suite. In: 2012 IEEE Conference on Computer Vision and Pattern Recognition, pp. 3354–3361. IEEE (2012) 5. He, C., Li, R., Li, S., Zhang, L.: Voxel set transformer: a set-to-set approach to 3D object detection from point clouds. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8417–8427 (2022) 6. Hu, M., Wang, S., Li, B., Ning, S., Fan, L., Gong, X.: Penet: towards precise and efficient image guided depth completion. In: 2021 IEEE International Conference on Robotics and Automation (ICRA), pp. 13656–13662. IEEE (2021)


7. Huang, T., Liu, Z., Chen, X., Bai, X.: EPNet: enhancing point features with image semantics for 3D object detection. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12360, pp. 35–52. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58555-6 3 8. Liang, M., Yang, B., Chen, Y., Hu, R., Urtasun, R.: Multi-task multi-sensor fusion for 3D object detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 7345–7353 (2019) 9. Mao, J., et al.: Voxel transformer for 3D object detection. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 3164–3173 (2021) 10. Pang, S., Morris, D., Radha, H.: Fast-clocs: fast camera-lidar object candidates fusion for 3D object detection. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 187–196 (2022) 11. Qi, C.R., Liu, W., Wu, C., Su, H., Guibas, L.J.: Frustum pointnets for 3D object detection from RGB-D data. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 918–927 (2018) 12. Sheng, H., et al.: Improving 3D object detection with channel-wise transformer. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 2743–2752 (2021) 13. Shi, S., et al.: PV-RCNN: point-voxel feature set abstraction for 3D object detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 10529–10538 (2020) 14. Shi, S., Wang, X., Li, H.: PointRCNN: 3D object proposal generation and detection from point cloud. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 770–779 (2019) 15. Shi, S., Wang, Z., Shi, J., Wang, X., Li, H.: From points to parts: 3D object detection from point cloud with part-aware and part-aggregation network. IEEE Trans. Pattern Anal. Mach. Intell. 43(8), 2647–2664 (2020) 16. Song, Z., Jia, C., Yang, L., Wei, H., Liu, L.: Graphalign++: an accurate feature alignment by graph matching for multi-modal 3D object detection. IEEE Trans. Circuits Syst. Video Technol. (2023) 17. Song, Z., Wei, H., Jia, C., Xia, Y., Li, X., Zhang, C.: Vp-net: voxels as points for 3-D object detection. IEEE Trans. Geosci. Remote Sens. 61, 1–12 (2023). https:// doi.org/10.1109/TGRS.2023.3271020 18. Vaswani, A., et al.: Attention is all you need. In: Advances in Neural Information Processing Systems, vol. 30 (2017) 19. Vora, S., Lang, A.H., Helou, B., Beijbom, O.: Pointpainting: sequential fusion for 3D object detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 4604–4612 (2020) 20. Wang, L., et al.: SAT-GCN: self-attention graph convolutional network-based 3D object detection for autonomous driving. Knowl.-Based Syst. 259, 110080 (2023) 21. Wang, L., et al.: Multi-modal 3D object detection in autonomous driving: a survey and taxonomy. IEEE Trans. Intell. Veh. 8(7), 3781–3798 . https://doi.org/10.1109/ TIV.2023.3264658 22. Wu, X., et al.: Sparse fuse dense: towards high quality 3D detection with depth completion. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5418–5427 (2022) 23. Yan, Y., Mao, Y., Li, B.: Second: sparsely embedded convolutional detection. Sensors 18(10), 3337 (2018)


24. Yang, L., et al.: Bevheight: a robust framework for vision-based roadside 3D object detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 21611–21620, June 2023
25. Yin, T., Zhou, X., Krähenbühl, P.: Multimodal virtual point 3D detection. In: Advances in Neural Information Processing Systems, vol. 34, pp. 16494–16507 (2021)
26. Zhang, X., et al.: Ri-fusion: 3D object detection using enhanced point features with range-image fusion for autonomous driving. IEEE Trans. Instrum. Meas. 72, 1–13 (2022)

A Single-Stage 3D Object Detection Method Based on Sparse Attention Mechanism

Songche Jia1(B) and Zhenyu Zhang1,2

1 School of Information Science and Engineering, Xinjiang University, Urumqi 830017, China
[email protected], [email protected]
2 Xinjiang Key Laboratory of Multilingual Information Technology, Urumqi 830017, China

Abstract. The Bird's Eye View (BEV) feature extraction module is an important part of 3D object detection based on point cloud data. However, existing methods ignore the correlation between objects, so a large amount of irrelevant information participates in feature extraction, which lowers detection accuracy. To solve this problem, this paper proposes a BEV feature extraction method named Dynamic Extraction Of Effective Features (DEF) and designs a single-stage 3D object detection model. The feature extraction method first uses convolution operations to extract local features. The weights of the elements in the BEV feature map are then redistributed by spatial attention, highlighting the positions of critical elements in the feature map. Next, a sparse two-level routing attention mechanism is used globally to screen out the top-k routing regions most strongly correlated with the target region, avoiding interference from irrelevant information. Finally, a token-to-token attention operation is applied to the joint top-k routing regions to extract effective features. Results on the benchmark KITTI dataset show that our method effectively improves the detection accuracy of 3D objects.

Keywords: 3D object detection · Feature extraction · Sparse attention mechanism

1 Introduction

Point cloud data has richer spatial information than image data, so it can better reflect the position of objects in space. Point cloud data obtained from lidar scanning has been widely used in areas such as autonomous driving, robot control, and augmented and virtual reality. As an essential perception task in autonomous driving, 3D object detection also makes wide use of point cloud data to locate and classify objects. At present, existing single-stage methods [8,22,24,26] usually use stacked convolutional layers or a global self-attention mechanism to extract Bird's Eye View (BEV) features. However, stacking convolutional layers for downsampling results in poor quality of low-level spatial features, and the 2D convolution kernel cannot take global information into account well, which affects detection accuracy. The global self-attention mechanism emphasizes the interaction of global information between tokens. It does not distinguish relevant from irrelevant information according to object categories, so not all extracted features are effective. As shown in Fig. 1, the car marked with the red bounding box is the detection object. The information contained in objects of other classes, such as the roads, trees, and buildings present in the image and the corresponding point cloud data, is irrelevant to cars. Therefore, during feature extraction, the network model should pay more attention to objects belonging to the car class and as little attention as possible to objects of other classes.

Fig. 1. Example of 3D object detection based on point cloud data.

In order to enable the single-stage model to extract more effective features, this paper proposes a method for extracting BEV features based on Bi-Level Routing Attention (BRA) [27]: Dynamic Extraction Of Effective Features (DEF). DEF first uses a small convolutional neural network to quickly extract local features, which are sent to the spatial attention module to redistribute the element weights of the feature map. The BRA module then dynamically filters out key-value pairs that are irrelevant to the target region globally, reducing the number of regions involved in information interaction. Finally, a fine-grained attention operation is applied to the top-k routing regions. As a result, the top-k highly correlated regions are not disturbed by the filtered-out information during information interaction.

2 Related Work

Currently, there are mainly two types of 3D object detection models based on point cloud data: 1) Two-stage detection models that first generate region proposals in the first stage and then refine regions of interest in the second stage; 2) Single-stage detection models that directly classify objects and predict bounding box locations.


Among the two-stage detection models, PointRCNN [19] uses PointNet++ [14] as the backbone for feature extraction and generates region proposals after segmenting foreground objects. STD [23] uses a spherical anchor mechanism and designs an IoU branch, which reduces the number of positive anchor boxes and effectively reduces the amount of computation. PV-RCNN [17] uses key points to gather the context information of multi-scale voxels, effectively improving detection accuracy. PV-RCNN++ [18] proposes a proposal-centric keypoint sampling strategy and a VectorPool feature aggregation method to address the slow computation of PV-RCNN. Voxel-RCNN [3] proposes a method to gather multi-scale voxel features without requiring raw point cloud features, which improves the detection speed of the model. VoTr [12] uses a voxel Transformer to encode point clouds and improve the region proposal network. CT3D [15] uses a channel-wise Transformer to refine proposals. Among the single-stage detection models, VoxelNet [26] uses a voxel feature encoding layer to encode the sparse point cloud data into 3D voxel features and then performs feature extraction. SECOND [22] is the first network model that uses sparse convolution [10] and submanifold sparse convolution [6] to extract voxel features, greatly improving computing speed. PointPillars [8] divides point clouds into pillars instead of voxels and then localizes objects using a typical 2D convolutional backbone network. Point-GNN [20] proposes a graph neural network to better learn point cloud features. CIA-SSD [24] proposes an IoU-aware confidence rectification to alleviate the misalignment between localization accuracy and classification confidence, together with a lightweight spatial-semantic feature aggregation module for better feature extraction. SE-SSD [25] initializes a teacher SSD and a student SSD with the pre-trained CIA-SSD model and proposes a single-stage self-ensembling detection framework. RDIoU [16] proposes a new rotation-decoupled IoU method. UP3DETR [21] proposes a pretext task named random block detection to pre-train 3D detection models in an unsupervised manner, which effectively improves detection accuracy. In general, two-stage models perform better in detection accuracy, but they suffer from complex model structures, large numbers of parameters, and slow inference. In contrast, single-stage models are lighter, and in recent work their detection accuracy has surpassed some advanced two-stage models. Although single-stage models have developed rapidly and achieved remarkable results, they still fail to distinguish relevant from irrelevant information based on object categories when extracting 2D features.

3 Method

3.1 The Overall Design of the Framework

Figure 2 shows the pipeline of the model in this paper, which mainly consists of three parts: 1) SPConvNet performs voxelization encoding on the original point clouds and extracts 3D features using sparse convolution and submanifold sparse convolution; 2) DEF uses a Conv group and an SA-SSA group to extract BEV features; 3) a multi-task detection head performs category classification and location regression.

Fig. 2. The pipeline of the model in this paper.

3.2 SPConvNet

To encode the raw point cloud data, this paper first voxelizes it and computes the average coordinates and intensities of all points in each voxel as the initial features of each sparse voxel. Sparse convolution (SC) [10] and submanifold sparse convolution (SSC) [6] are then combined to form the sparse convolutional network SPConvNet, as shown in Fig. 2(b). Specifically, SPConvNet consists of four 3D convolutional blocks. The first block has two SSC layers and no SC layer, with a feature dimension of 16. The other three blocks each have two SSC layers and one SC layer; the SC layer is executed before the two SSC layers to downsample the 3D feature map by 2x. Their feature dimensions are (32, 64, 64). Finally, the features along the height dimension (z-axis) are concatenated to generate the BEV feature map, which is fed into DEF (Fig. 2(c)).
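The block structure described above can be sketched as follows. The snippet uses dense 3D convolutions purely as stand-ins for the SC/SSC layers (a real implementation would use a sparse-convolution library); the channel sizes follow the text, while the toy voxel grid, 4 input features per voxel, and ReLU placement are assumptions.

```python
import torch
import torch.nn as nn

def block(c_in, c_out, downsample):
    """One SPConvNet block: an optional stride-2 'SC' layer followed by two 'SSC' layers.
    Dense Conv3d stands in for sparse / submanifold sparse convolution here."""
    layers = []
    if downsample:
        layers.append(nn.Conv3d(c_in, c_out, 3, stride=2, padding=1))
        c_in = c_out
    layers += [nn.Conv3d(c_in, c_out, 3, padding=1), nn.ReLU(),
               nn.Conv3d(c_out, c_out, 3, padding=1), nn.ReLU()]
    return nn.Sequential(*layers)

backbone = nn.Sequential(
    block(4, 16, downsample=False),   # block 1: 2 'SSC' layers, 16 channels
    block(16, 32, downsample=True),   # blocks 2-4: 'SC' (2x downsample) + 2 'SSC' layers
    block(32, 64, downsample=True),
    block(64, 64, downsample=True),
)

# Small toy voxel grid: (batch, features, z, y, x).
vox = torch.randn(1, 4, 16, 160, 160)
feat = backbone(vox)                  # (1, 64, 2, 20, 20)
bev = feat.flatten(1, 2)              # concatenate height into channels -> BEV feature map
print(bev.shape)                      # torch.Size([1, 128, 20, 20])
```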

3.3 DEF

Dynamic Extraction Of Effective Features (DEF), proposed in this paper, is shown in Fig. 2(c). DEF consists of a convolution group and an SA-SSA group. The convolution group rapidly extracts features within a local range of the feature map. First, to avoid the loss of spatial information, the output features of the first four (blue) convolutional layers keep the same dimensionality (number of channels and resolution) as the input features. Then, the 2D feature map is downsampled by 2x through a (green) convolutional layer with a stride of 2, halving the size of the feature map. Finally, to reduce the computation of the subsequent SA-SSA blocks and obtain higher-level semantic information, the new 2D feature map is downsampled by another 2x through the (red) convolution layer. The SA-SSA group consists of N consecutive Spatial Attention-Sparse Self-Attention (SA-SSA) blocks, whose internal structure is shown in Fig. 3. First, so that the network focuses on the positions containing the most critical information in the feature map, spatial attention is used to generate position weights, which are multiplied with the input 2D feature map. Then, the feature map is padded along the last dimension (W) so that the last two dimensions (H and W) of the 2D feature map are equal. The relative position information is then implicitly encoded using a 3×3 depthwise separable convolution, and the two feature maps are added through a residual connection. Next, the 2D feature maps pass through a BRA module to dynamically extract features that are effective for the object categories via a sparse attention mechanism. A 2-layer MLP module with expansion ratio e is then applied for position-wise embedding, with residual connections after both the BRA module and the MLP module. Finally, a 2D DeConv layer upsamples the features by 4x.

Fig. 3. The internal structure of the SA-SSA block.

BRA. Bi-Level Routing Attention (BRA) is a dynamic, query-aware sparse attention mechanism. The main idea is to filter out the most irrelevant key-value pairs at a coarse region level so that only a portion of the routed regions remain; fine-grained token-to-token attention is then applied to the remaining regions [27]. First, the BEV feature map X with resolution H×W is divided into S×S non-overlapping regions (S is a hyperparameter), so that each region has height $H/S$ and width $W/S$ and contains $HW/S^2$ feature vectors. At this point, X is reshaped as $X^r \in \mathbb{R}^{S^2 \times \frac{HW}{S^2} \times C}$. Projection weights $W_q$, $W_k$, $W_v$ are then used to obtain $Q, K, V \in \mathbb{R}^{S^2 \times \frac{HW}{S^2} \times C}$. Next, region averaging is performed on Q and K to derive the region-level query and key $Q^r, K^r \in \mathbb{R}^{S^2 \times C}$, and the adjacency matrix $A^r \in \mathbb{R}^{S^2 \times S^2}$ is obtained by matrix multiplication:

$A^r = Q^r (K^r)^T$  (1)

Each element of the adjacency matrix measures the degree of semantic correlation between two regions. The critical step of the BRA module is to retain the indices of the top-k regions most relevant to each region by pruning the adjacency matrix (k is a hyperparameter). The routing index matrix $I^r \in \mathbb{N}^{S^2 \times k}$ is derived using a row-wise top-k operation:

$I^r = \mathrm{topkIndex}(A^r)$  (2)

Finally, each query token in region i attends to all key-value pairs contained in the k routing regions indexed by $I^r_{(i,1)}, I^r_{(i,2)}, \ldots, I^r_{(i,k)}$ of the corresponding row. The key and value tensors are first gathered to obtain $K^g, V^g \in \mathbb{R}^{S^2 \times \frac{kHW}{S^2} \times C}$:

$K^g = \mathrm{gather}(K, I^r), \quad V^g = \mathrm{gather}(V, I^r)$  (3)

Then the fine-grained attention operation is applied:

$O = \mathrm{Attention}(Q, K^g, V^g) = \mathrm{softmax}\left(\frac{Q (K^g)^T}{\sqrt{C}}\right) V^g$  (4)
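The routing step in Eqs. (1)-(4) fits in a few lines of PyTorch. The sketch below assumes a single head, already region-partitioned tokens in the (S², HW/S², C) layout from the text, and randomly initialized projection weights; it is an illustration under those assumptions, not the BiFormer reference implementation.

```python
import torch
import torch.nn.functional as F

def bi_level_routing_attention(x, wq, wk, wv, top_k):
    """x: (S^2, HW/S^2, C) region-partitioned tokens; wq/wk/wv: (C, C) projections."""
    q, k, v = x @ wq, x @ wk, x @ wv                      # token-level Q, K, V
    qr, kr = q.mean(dim=1), k.mean(dim=1)                 # region-level query/key, (S^2, C)
    ar = qr @ kr.t()                                      # Eq. (1): region adjacency, (S^2, S^2)
    idx = ar.topk(top_k, dim=-1).indices                  # Eq. (2): routing index, (S^2, k)
    kg = k[idx].flatten(1, 2)                             # Eq. (3): gathered keys, (S^2, k*HW/S^2, C)
    vg = v[idx].flatten(1, 2)                             # gathered values
    attn = F.softmax(q @ kg.transpose(1, 2) / q.shape[-1] ** 0.5, dim=-1)
    return attn @ vg                                      # Eq. (4): fine-grained attention output

# Toy example: S=5 (25 regions), 16 tokens per region, C=32, top-k=20.
C = 32
x = torch.randn(25, 16, C)
out = bi_level_routing_attention(x, torch.randn(C, C), torch.randn(C, C),
                                 torch.randn(C, C), top_k=20)
print(out.shape)  # torch.Size([25, 16, 32])
```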

3.4 Loss Function

This paper follows the bounding box encoding and loss function settings of [22,24]. Specifically, focal loss, smooth-L1 loss, and cross-entropy loss are used to supervise 3D bounding box classification ($L_{cls}$), 3D bounding box regression ($L_{box}$) and IoU prediction regression ($L_{iou}$), and direction classification ($L_{dir}$), respectively. The overall training loss is defined as:

$L = \gamma_1 L_{cls} + \gamma_2 L_{box} + \gamma_3 L_{dir} + \gamma_4 L_{iou}$  (5)

The hyperparameters are set to $\gamma_1 = 1.0$, $\gamma_2 = 2.0$, $\gamma_3 = 0.2$, $\gamma_4 = 1.0$.
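For concreteness, the weighted sum in Eq. (5) is typically assembled as below; the individual loss values are stand-in numbers here, not outputs of the paper's actual loss terms.

```python
# Weights from Eq. (5).
GAMMA = {"cls": 1.0, "box": 2.0, "dir": 0.2, "iou": 1.0}

def total_loss(l_cls, l_box, l_dir, l_iou):
    """Combine the per-task losses exactly as in Eq. (5)."""
    return (GAMMA["cls"] * l_cls + GAMMA["box"] * l_box
            + GAMMA["dir"] * l_dir + GAMMA["iou"] * l_iou)

# Example with dummy scalar losses.
print(total_loss(0.8, 0.5, 0.3, 0.2))  # 0.8*1.0 + 0.5*2.0 + 0.3*0.2 + 0.2*1.0 = 2.06
```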

4 Experiments

4.1 Datasets

The KITTI dataset [5] contains 7481 training samples and 7518 testing samples. Following previous models [8,22], this paper further divides the original training samples into a training set (3712 samples) and a validation set (3769 samples). The detection results on the car class are evaluated by the average precision (AP) with an IoU threshold of 0.7 on three difficulty levels (Easy, Moderate, and Hard).

4.2 Implementation Details

This paper crops the original point clouds and sets the voxel size according to the settings in [24]. In the convolution group of DEF, all six convolution layers have a 3×3 kernel and 256 channels; their strides are (1,1,1,1,2,2). In the SA-SSA block, the spatial attention part applies average pooling and max pooling to the BEV feature maps, uses a 3×3 convolutional layer to adjust the channels, and normalizes the spatial weights with the sigmoid function. For the BRA module, the region partition factor is S=5 and top-k=20. The expansion ratio of the MLP module is e=3. The final 2D DeConv layer uses 4×4 kernels and 256 output channels with a stride of 4. The number of consecutive SA-SSA blocks is N=2. The development environment used in the experiments is as follows: 1) Ubuntu-18.04 operating system; 2) an Intel(R) Xeon(R) Platinum 8358P CPU; 3) a single NVIDIA GeForce RTX 3090 GPU.
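The spatial attention described above (average pooling plus max pooling along channels, a 3×3 convolution, and sigmoid normalization) follows a standard CBAM-style pattern; a minimal sketch under that assumption is shown below, with module and parameter names chosen by us.

```python
import torch
import torch.nn as nn

class SpatialAttention(nn.Module):
    """Spatial attention as described: pool along channels, 3x3 conv, sigmoid weights."""
    def __init__(self):
        super().__init__()
        # 2 input channels (avg-pooled + max-pooled maps) -> 1 spatial weight map.
        self.conv = nn.Conv2d(2, 1, kernel_size=3, padding=1)

    def forward(self, x):                       # x: (B, C, H, W) BEV feature map
        avg = x.mean(dim=1, keepdim=True)       # channel-wise average pooling
        mx, _ = x.max(dim=1, keepdim=True)      # channel-wise max pooling
        w = torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))
        return x * w                            # reweight every spatial position

sa = SpatialAttention()
print(sa(torch.randn(1, 256, 50, 50)).shape)    # torch.Size([1, 256, 50, 50])
```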

4.3 Results on Datasets

The detection results of our model on the test server and validation set are shown in Tables 1 and 2, respectively. “L” means Lidar, and “I” means color image. The top-1 of the two-stage and single-stage models are in bold, respectively. Table 1. Comparison with other models on the car class of the KITTI test server by evaluating 40 sampling recall points. Tpye

Method

Modality Car 3D AP(R40) Easy Mod Hard

mAP

2-stage PointRCNN [19] PV-RCNN [17] Voxel-RCNN [3] CLOCs [13] CT3D [15] PV-RCNN++ [18] Pyramid-PV [11]

L L L L+I L L L

86.69 90.25 90.90 88.94 87.83 90.14 88.39

75.64 81.43 81.62 80.67 81.77 81.88 82.08

70.70 76.82 77.06 77.15 77.16 77.15 77.49

77.68 82.83 83.19 82.25 82.25 83.06 82.65

1-stage SECOND [22] PointPillars [8] CIA-SSD [24] SE-SSD [25] SVGA-Net [7] SASA [1] Our model

L L L L L L L

83.34 82.58 89.59 91.49 87.33 88.76 88.49

72.55 74.31 80.28 82.54 80.47 82.16 82.19

65.82 68.99 72.87 77.15 75.91 77.16 77.40

73.90 75.29 80.91 83.73 81.24 82.69 82.69

In Table 1, the model in this paper achieves average precision (AP) of 88.49%, 82.19%, and 77.40%, respectively. The model is 1.91% higher than the baseline CIA-SSD on the most important Moderate level. In addition, among all the single-stage models in Table 1, our model ranks first and second in the AP of the Hard and Moderate levels, respectively, and is comparable to the two-stage models in the table. Although our model is slightly lower than SE-SSD by 0.35% on the Moderate level, SE-SSD is trained on two CIA-SSDs, and a complex self-ensemble strategy is used. The model in this paper does not adopt the self-ensemble strategy, so it is more reasonable to compare with CIA-SSD. Because the proposed model shows better performance than CIA-SSD, it can be inferred that if the proposed model is used to replace CIA-SSD in SE-SSD for self-ensemble training, it should show better performance than the existing SE-SSD. The proposed model has shown comparable performance with SE-SSD, so it can be seen that the proposed model in this paper is effective enough. When evaluating 11 recall positions, compared with single-stage detection models in Table 2, our model ranks first on the most crucial Moderate level,

A 3D Object Detection Method Based on Sparse Attention Mechanism

421

Table 2. Comparison with other models on the car class of the KITTI validation set by evaluating 11 sampling recall points and 40 sampling recall points, respectively. Tpye

2-stage

1-stage

Method

Modality

Car 3D AP(R11)

Car 3D AP(R40)

Easy

Mod

Hard

mAP

Easy

Mod

Hard

mAP

89.35

83.69

78.70

83.91

92.57

84.83

82.69

86.70 86.84

PV-RCNN [17]

L

Voxel-RCNN [3]

L

89.41

84.52

78.93

84.29

92.38

85.29

82.86

CLOCs [13]

L+I









89.49

79.31

77.36

82.05

CT3D [15]

L

89.54

86.06

78.99

84.86

92.85

85.82

83.46

87.38

Pyramid-PV [11]

L

89.37

84.38

78.84

84.20









VFF-PV [9]

L+I

89.45

84.21

79.13

84.26

92.31

85.51

82.92

86.91

SECOND [22]

L

87.43

76.48

69.10

77.67









PointPillars [8]

L



77.98



77.98

86.46

77.28

74.65

79.46

CIA-SSD [24]

L

90.04

79.81

78.80

82.88









FocalsConv [2]

L+I

89.82

85.22

85.19

86.74

92.26

85.32

82.95

86.84

SVGA-Net [7]

L

90.59

80.23

79.15

83.32









Our model

L

89.65

85.27

79.46

84.79

92.75

86.06

82.70

87.17

exceeding the baseline CIA-SSD by 5.46%. Moreover, it surpasses CIA-SSD by 1.91% on mean Average Precision (mAP). Meanwhile, compared with two-stage detection models, our model is only 0.07% lower on mAP than the best-performing model in Table 2, CT3D. When evaluating 40 recall positions, our model performs best on the Easy and Moderate levels among the single-stage models in Table 2. Combined with the results on the test server, this verifies the effectiveness of our method for 3D object detection.

4.4 Ablation Study

In this part, ablation experiments are carried out on the whole DEF, the key sub-modules in DEF, and the hyperparameters top-k and S in the BRA module. We report the 3D average precision at 40 sampling recall points on the car class of the KITTI validation set. The results are shown in Tables 3, 4, and 5, respectively.

Table 3. Ablation study on the whole DEF.

Method   Car 3D AP(R40) Easy / Mod / Hard
RPN      93.08 / 83.56 / 79.66
SSFA     93.38 / 84.08 / 81.19
ViT      91.92 / 93.10 / 79.28
DEF      92.75 / 86.06 / 82.70

Effect of DEF Module. In order to verify the effectiveness of the proposed DEF feature extraction method, this paper uses RPN [22], SSFA [24], and ViT [4] to replace DEF in the pipeline as a whole, and the other parts of the model


remain the same. In Table 3, DEF outperforms the best-performing baseline, SSFA, by 1.98% and 1.51% on the Moderate and Hard levels, verifying that DEF is effective for extracting BEV features.

Table 4. Ablation study on components in DEF: Conv group, spatial attention and BRA module (rows without BRA use multi-head attention instead).

Conv group   Spatial Attention   BRA   Car 3D AP(R40) Easy / Mod / Hard
–            –                   –     92.06 / 83.64 / 79.91
✓            –                   –     92.33 / 84.57 / 80.94
✓            ✓                   –     92.56 / 85.06 / 81.41
✓            ✓                   ✓     92.75 / 86.06 / 82.70

Effect of Key Sub-modules in DEF. The Conv group and Spatial Attention are removed, and the BRA module is replaced with multi-head attention, to form the baseline of this ablation study. The results in Table 4 show that the AP of the model improves significantly after adding the Conv group: the Easy, Moderate, and Hard levels are 0.27%, 0.93%, and 1.03% higher than the baseline, respectively. To verify the effectiveness of the Spatial Attention module, we then add both the Conv group and spatial attention to the baseline; the AP increases by a further 0.23%, 0.49%, and 0.47%, respectively. Finally, replacing multi-head attention with BRA achieves the best AP of our model, verifying the effectiveness of BRA for feature extraction.

Table 5. Ablation study on different top-k and region partition factor S in the BRA module.

S    k    Car 3D AP(R40) Mod
10   35   84.48
10   50   85.10
10   65   85.59
10   80   85.80
10   95   85.72
5    15   85.71
5    20   86.06


Effect of Different Top-K and Region Partition Factor S. As shown in Table 5, when S=10 the BEV feature map (50×50) is divided into 100 regions, each with a resolution of 5×5. As the value of k increases, the AP gradually improves, reaching 85.80% at k=80 (four-fifths of the total number of regions). When S is reduced to 5, the BEV feature map is divided into 25 regions, and the resolution of each region increases to 10×10. At k=20 (again four-fifths of the total number of regions), the AP reaches its maximum value of 86.06%.
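The region bookkeeping in this comparison is simple arithmetic; a tiny worked sketch (a hypothetical helper of our own, with the values taken from the text) makes the two settings explicit.

```python
def region_stats(h, w, s, k):
    """Number of regions, per-region resolution, and the fraction of regions kept by top-k."""
    regions = s * s
    return regions, (h // s, w // s), k / regions

# 50x50 BEV map: S=10 gives 100 regions of 5x5; k=80 keeps 4/5 of them.
print(region_stats(50, 50, 10, 80))   # (100, (5, 5), 0.8)
# S=5 gives 25 regions of 10x10; k=20 again keeps 4/5 of them.
print(region_stats(50, 50, 5, 20))    # (25, (10, 10), 0.8)
```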

4.5 Visualization

Figure 4 shows the visualization of some predicted bounding boxes generated by the model in this paper, and the predicted bounding boxes are shown in green. (a)(b)(e)(f) are the projections on the color images, and (c)(d)(g)(h) are the projections on the lidar point clouds.

Fig. 4. Snapshots of 3D detection results on the KITTI validation set.

As shown in Fig. 4, the model in this paper can effectively detect the remote small object indicated by the red arrow in Fig. 4(a) and the truncated and severely occluded objects indicated by the red arrows in Fig. 4(e) and 4(f), and it excludes the interference of the white truck in Fig. 4(f). Thanks to the redistribution of element weights by spatial attention, the screening of strongly relevant information by the BRA module, and the enlarged receptive field obtained when the BEV feature map is partitioned into regions with larger resolutions, the detection accuracy improves on both the Moderate and Hard levels.

5 Conclusion

This paper proposes a 3D object detection method based on a sparse attention mechanism, focusing on the extraction of effective features of objects. The method adopts the BEV feature extraction idea from local to global, quickly extracts local features through convolution operation and spatial attention, and provides critical location information for sparse attention. The routing index matrix is built globally through the BRA module, and the top-k routing areas with the strongest correlation with the target region are screened out, which solves the problem that a large amount of irrelevant information affects feature extraction. The detection accuracy of the model is improved by using strongly correlated information for fine-grained attention operations. Experimental results on the KITTI dataset show that our method is effective in 3D object detection.

References 1. Chen, C., Chen, Z., Zhang, J., Tao, D.: SASA: semantics-augmented set abstraction for point-based 3D object detection. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 221–229 (2022) 2. Chen, Y., Li, Y., Zhang, X., Sun, J., Jia, J.: Focal sparse convolutional networks for 3D object detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5428–5437 (2022) 3. Deng, J., Shi, S., Li, P., Zhou, W., Zhang, Y., Li, H.: Voxel R-CNN: towards high performance voxel-based 3D object detection. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 35, pp. 1201–1209 (2021) 4. Dosovitskiy, A., et al.: An image is worth 16×16 words: transformers for image recognition at scale. arXiv preprint arXiv:2010.11929 (2020) 5. Geiger, A., Lenz, P., Stiller, C., Urtasun, R.: Vision meets robotics: the KITTI dataset. Int. J. Robot. Res. 32(11), 1231–1237 (2013) 6. Graham, B., Engelcke, M., Van Der Maaten, L.: 3D semantic segmentation with submanifold sparse convolutional networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 9224–9232 (2018) 7. He, Q., Wang, Z., Zeng, H., Zeng, Y., Liu, Y.: SVGA-Net: sparse voxel-graph attention network for 3D object detection from point clouds. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 870–878 (2022) 8. Lang, A.H., Vora, S., Caesar, H., Zhou, L., Yang, J., Beijbom, O.: PointPillars: fast encoders for object detection from point clouds. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 12697–12705 (2019) 9. Li, Y., et al.: Voxel field fusion for 3D object detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 1120– 1129 (2022) 10. Liu, B., Wang, M., Foroosh, H., Tappen, M., Pensky, M.: Sparse convolutional neural networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 806–814 (2015)


11. Mao, J., Niu, M., Bai, H., Liang, X., Xu, H., Xu, C.: Pyramid R-CNN: towards better performance and adaptability for 3D object detection. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 2723–2732 (2021) 12. Mao, J., et al.: Voxel transformer for 3D object detection. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 3164–3173 (2021) 13. Pang, S., Morris, D., Radha, H.: CLOCs: camera-lidar object candidates fusion for 3D object detection. In: 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 10386–10393. IEEE (2020) 14. Qi, C.R., Yi, L., Su, H., Guibas, L.J.: PointNet++: deep hierarchical feature learning on point sets in a metric space. In: Advances in neural information processing systems, vol. 30 (2017) 15. Sheng, H., et al.: Improving 3D object detection with channel-wise transformer. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 2743–2752 (2021) 16. Sheng, H., et al.: Rethinking IoU-based optimization for single-stage 3D object detection. In: Avidan, S., Brostow, G., Ciss´e, M., Farinella, G.M., Hassner, T. (eds.) Computer Vision – ECCV 2022. ECCV 2022. Lecture Notes in Computer Science, vol. 13669, pp. 544–561. Springer, Cham (2022). https://doi.org/10.1007/ 978-3-031-20077-9 32 17. Shi, S., et al.: PV-RCNN: point-voxel feature set abstraction for 3D object detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 10529–10538 (2020) 18. Shi, S., et al.: PV-RCNN++: point-voxel feature set abstraction with local vector representation for 3D object detection. Int. J. Comput. Vision 131(2), 531–551 (2023) 19. Shi, S., Wang, X., Li, H.: PointRCNN: 3D object proposal generation and detection from point cloud. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 770–779 (2019) 20. Shi, W., Rajkumar, R.: Point-GNN: graph neural network for 3D object detection in a point cloud. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 1711–1719 (2020) 21. Sun, M., Huang, X., Sun, Z., Wang, Q., Yao, Y.: Unsupervised pre-training for 3D object detection with transformer. In: Yu, S., et al. Pattern Recognition and Computer Vision. PRCV 2022. Lecture Notes in Computer Science, vol. 13536, pp. 82–95. Springer (2022). https://doi.org/10.1007/978-3-031-18913-5 7 22. Yan, Y., Mao, Y., Li, B.: Second: sparsely embedded convolutional detection. Sensors 18(10), 3337 (2018) 23. Yang, Z., Sun, Y., Liu, S., Shen, X., Jia, J.: STD: sparse-to-dense 3D object detector for point cloud. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 1951–1960 (2019) 24. Zheng, W., Tang, W., Chen, S., Jiang, L., Fu, C.W.: CIA-SSD: confident IoUaware single-stage object detector from point cloud. In: Proceedings of the AAAI conference on artificial intelligence, vol. 35, pp. 3555–3562 (2021) 25. Zheng, W., Tang, W., Jiang, L., Fu, C.W.: SE-SSD: self-ensembling single-stage object detector from point cloud. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 14494–14503 (2021) 26. Zhou, Y., Tuzel, O.: VoxelNet: end-to-end learning for point cloud based 3D object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4490–4499 (2018) 27. Zhu, L., Wang, X., Ke, Z., Zhang, W., Lau, R.: BiFormer: vision transformer with bi-level routing attention. 
arXiv preprint arXiv:2303.08810 (2023)

WaRoNav: Warehouse Robot Navigation Based on Multi-view Visual-Inertial Fusion Yinlong Zhang1,2,3,4 , Bo Li1 , Yuanhao Liu2,3,4 , and Wei Liang2,3,4(B) 1

School of Information Science and Engineering, Shenyang University of Technology, Shenyang 110020, China [email protected] 2 Key Laboratory of Networked Control Systems, Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang 110016, China {liuyuanhao,weiliang}@sia.cn 3 State Key Laboratory of Robotics, Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang 110016, China 4 Institutes for Robotics and Intelligent Manufacturing, Chinese Academy of Sciences, Shenyang 110169, China

Abstract. An accurate and globally consistent navigation system is crucial for the efficient functioning of warehouse robots. Among various robot navigation techniques, tightly-coupled visual-inertial fusion stands out as one of the most promising approaches, owing to its complementary sensing and impressive performance in terms of response time and accuracy. However, the current state-of-the-art visual-inertial fusion methods suffer from limitations such as long-term drifts and loss of absolute reference. To address these issues, this paper proposes a novel globally consistent multi-view visual-inertial fusion framework, called WaRoNav, for warehouse robot navigation. Specifically, the proposed method jointly exploits a downward-looking QR-vision sensor and a forward-looking visual-inertial sensor to estimate the robot poses and velocities in real time. The downward camera provides absolute robot poses with reference to the global workshop frame. Furthermore, the long-term visual-inertial drifts, inertial biases, and velocities are periodically compensated at the spatial intervals of QR codes by minimizing visual-inertial residuals under rigid constraints of absolute poses estimated from the downward visual measurements. The effectiveness of the proposed method is evaluated on a developed warehouse robot navigation platform. The experimental results show competitive accuracy against state-of-the-art approaches, with a maximal position error of 0.05 m and a maximal attitude error of 2°, irrespective of the trajectory lengths. Keywords: Warehouse Robot · Navigation · Multiple View · Visual-Inertial Fusion

1 Introduction

Warehouse robots have become a popular choice for transporting bulky and heavy cargo, such as packages and shelves, along predefined trajectories to specific locations


[1]. With the rise of smart manufacturing, warehouse robots are playing an increasingly crucial role in enabling the flexibility of manufacturing and logistics [2]. An accurate and consistent navigation system is important in guiding the warehouse robot and ensuring the production safety. Basically, the warehouse robot navigation could be categorized into vision based, laser based, magnetism based, wire based and color tape based methods [3]. Among them, the vision-based guidance system has been widely explored in both academic and industrial fields because of its advantages in flexible configurations, abundant visual information, low costs, etc. However, the vision system has the inherent drawbacks of long-term drifts & computational complexity, tracking failures in presence of textureless environments and body fast rotations, scale uncertainties [4]. These issues have hindered its wide application in the general warehouse robot navigation scenarios. Comparably, the inertial sensor could offer the complementary properties by providing the body linear accelerations and angular rates at a quick sampling rate, which could compensate for the inherent visual navigation drawbacks. Henceforth, the combination of visual and inertial measurements, to some extent, could offer a promising warehouse robot navigation solution [5].

Fig. 1. Illustration on WaRoNav. (a) the screenshot of the estimated robot keyframes and the map points; (b) the recovered rectangular trajectory; (c) the forward visual-inertial sensing units and the downward vision sensor that scans the QR codes on the ground.

Forward-looking visual-inertial sensing for warehouse robot navigation still has some intrinsic issues, such as the loss of an absolute reference frame, long-term drifts, and inertial random walks [6]. To overcome these problems, this paper seeks to make full use of downward-looking visual measurements to compensate for the forward visual-inertial drifts, as illustrated in Fig. 1. The downward camera captures the QR


codes pasted on the ground and estimates the absolute positions and attitudes with respect to the global reference frame in workshops [7]. Since the downward-looking visual navigation has the relatively higher accuracy, it could considerably reduce the accumulated forward-looking visual-inertial drifts, and correct the positions of the estimated local map points over the adjacent QR code distances. Despite the merits of downward-looking and forward-looking visual-inertial sensing, the fusion still has several issues that need to be solved for the accurate robot navigation. Firstly, the forward-looking and downward-looking cameras are supposed to be rigidly fixed and precisely calibrated before fusion. Yet the traditional calibrations fail to find the appropriate covisible viewpoints on checkerboard [8]. Besides, the solution on minimizing the downward and forward-looking visual-inertial residuals may get stuck in the local minimum or even cause system dumping due to the inappropriate fusion strategy. This paper proposes a novel framework for warehouse robot navigation that addresses the issues discussed above. The WaRoNav framework, as shown in Fig. 2, fuses information from a globally consistent forward-looking visual-inertial sensor and a downward-looking QR vision sensor. To achieve online calibration of the multiview sensing units, the special Euclidean group model is used to perform calibrations between the downward-looking visual sensor and the forward-looking visual-inertial sensor. The transformation between the multi-view references is obtained by aligning the trajectories in a consistent manner. To reduce accumulated errors, the robot positions, attitudes, and velocities are jointly estimated using the visual-inertial-QR measurements in the front-end. To better improve the navigation performance, the recovered map points and keyframes are jointly optimized in a tightly-coupled graph optimization framework. The contributions of this paper are summarized as follows: (i) We propose a novel forward-looking visual-inertial and downward-looking QR sensor fusion framework. A novel feature tracking mode, i.e., QR motion tracking mode, is added to modify the short-term deviations. It is able to improve the overall AGV navigation performance in a globally consistent manner. (ii) We exploit the SE(3) mathematical model that allows online sensor initializations with fast convergence. It is able to quickly and reliably achieve online initializations on multiple sensors. (iii) To reduce the AGV navigation accumulated drifts, we design a tightly-coupled unified optimization in the local mapping and loop closing parts. The keyframes with QR codes are rigidly inserted to periodically correct the long-term visual-inertial drifts. The loop closing could be ensured without the false matches by using the geometrical constraints.

2 System Overview

The scheme of the proposed approach is illustrated in Fig. 2. WaRoNav first performs initialization to initiate the parameters: the inertial biases {b_α, b_ω}, the absolute scale s, the body initial velocity v_b, the gravity direction R_wg, the frame transformation matrix T_BCF


between the body frame (inertial frame) and forward-looking camera frame (short as CF frame), and TCF CD between the forward-looking camera frame and downward-looking camera frame (short as CD frame), and the initial 3D points {pw1 , pw2 , · · ·, pwM } by virtue of the Maximum-A-Posteriori visual-inertial alignment strategy. After initialization, the robot navigation will be split into the front-end and backend. The front-end yields the initial robot poses and the captured 3D points. The backend updates these estimates in the graph optimization manner. It should be noted that the downward camera captures the QR info and estimates the robot positions more accurately, but the QR codes are merely available intermittently (i.e., there is a certain distance between the adjacent QR codes). Comparably, the forward-looking visual-inertial sensors could constantly collect the forward sequence of images and the inertial measurements while the robot is running, although the presence of long-term drifts and unexpected random walks.

Fig. 2. The workflow of the proposed WaRoNav. The scheme consists of system initialization, tracking and local mapping.

3 WaRoNav Initialization

The robot state estimation depends heavily on the initialization among the multi-view visual-inertial sensors. In this section, we first initialize the transformation T_CDCF (which includes the rotation R_CDCF and translation t_CDCF) between the downward camera frame {C_D} and the forward camera frame {C_F}, as well as the scale s of the forward camera. Afterwards, the inertial biases {b_α, b_ω}, body velocities {v_0:k}, and gravity direction R_wg are optimized in the maximum-a-posteriori formulation.


3.1 Initialization on {C_F} and {C_D}

The downward camera frame and forward camera frame are initialized in a loosely-coupled manner. The forward camera poses are initialized by the pure ORB initialization procedure [5]. Features between successive frames are detected and matched. Afterwards, the fundamental matrix or the homography matrix with RANSAC outlier removal [9] is used to select the correct matches, which are triangulated to obtain the corresponding 3D positions in the forward camera frame. Following this, perspective-n-point (PnP) is performed to estimate the forward camera poses, represented by $\{R^{C_F}_{t_i}, t^{C_F}_{t_i}\}_{0:k}$. Similarly, the downward camera poses $\{R^{C_D}_{t_i}, t^{C_D}_{t_i}\}_{0:k}$ are also computed. The rotation matrix $R_{C_D C_F}$ is first calculated as follows:

$R_{C_D C_F} = \min_{\tilde{R}_{C_D C_F}} \sum_{i=0}^{k} \left\| \log\!\left( \tilde{R}_{C_D C_F} \, R^{C_D}_{t_i} \big(R^{C_F}_{t_i}\big)^{T} \right) \right\|$  (1)

where $\tilde{R}_{C_D C_F}$ denotes the rotation matrix to be optimized, $\{\cdot\}^T$ represents the orthogonal matrix transpose, and $\log(\cdot)$ represents the logarithm map. The translation $t_{C_D C_F}$ and the scale s are calculated as follows:

$\{s, t_{C_D C_F}\} = \min_{\tilde{s},\, \tilde{t}_{C_D C_F}} \sum_{i=0}^{k} \left\| R_{C_D C_F} \cdot t^{C_F}_{t_i} + \tilde{t}_{C_D C_F} - \tilde{s} \cdot t^{C_D}_{t_i} \right\|$  (2)

where $\tilde{s}$ and $\tilde{t}_{C_D C_F}$ are the parameters to be optimized.
where s˜ and ˜tCDCF are the parameters during the optimization. 3.2

IMU Initialization

We initialize the IMU terms in the sense of MAP estimation, from the trajectory calculated from the forward  and downward cameras. In this work, the IMU variables to be initialized are χk = bα , bω , Rwg , v0:k . Rwg ∈ SO(3) denotes the rotation from the gravity direction to the body frame, such that g = Rwg (0, 0, G)T , where G is the magnitude of gravity. The posterior distribution of χk is p (χk |I0:k ). The inertial terms can be obtained by optimizing the following function. χk ∗ = argmaxχk p (χk |I0:k )

k

= argminχk (−log(p(χk ))) − ∑ log p Ii−1,i |χk

(3)

i=1

p(χk ) denotes the prior; p (Ii−1,i |χk ) denotes the corresponding likelihood. log(·) stands for the logarithm operation. Equation 3 is also equivalent to k   2 2 ∗ χk = arg minχ k r p ∑ + ∑ rIi−1,i  (4) p

i=1

∑Ii−1.i

WaRoNav: Warehouse Robot Navigation Based on Multi-view Visual-Inertial Fusion

431

where r p is the image feature reprojection error, which is used as the regularization parameter to impose on inertial residuals being closed to 0; rIi−1,i is the residual of IMU preintegrated measurements between the adjacent keyframes i − 1 and i; ∑ p and ∑Ii−1.i are the corresponding covariance matrix of r p and rIi−1,i respectively. 3.3 3D Point Initialization Apart from computing terms and frame  Wthe inertial  transformations, the initial positions W (2) , · · · , X W (n) with respect to the world frame can = X (1) , X of features XW 3d 3d 3d 3d also be computed. The corresponding features on the first frame {CF1 } and the second  frame {CF2 } of the forward camera are denoted by pC2dF1 (1) , pC2dF1 (2) , · · · , pC2dF1 (n)

 and pC2dF2 (1) , pC2dF2 (2) , · · · , pC2dF2 (n) respectively, which satisfy W (i) pˆC2dF1 (i) = K [RWCF1 , tWCF1 ] Xˆ3d CF2 W (i) pˆ2d (i) = K [RWCF2 , tWCF2 ] Xˆ3d

(5)

The hat on Xˆ and pˆ represent the positions in their homogeneous form. K represents the camera intrinsic matrix. RWCF1 and RWCF2 represent the rotation from the world frame {W } to the camera frame {CF1 ,CF2 }; tWCF1 and tWCF2 represent the translation from the world frame {W } to the forward camera frame {CF1 ,CF2 } respectively, which could be computed by using the poses from the downward camera TWCD and the transformation matrix TCDCF .

4 WaRoNav Multi-view Fusion 4.1 WaRoNav-MVF Frontend In this work, the front end computes the 3D points, body poses, velocities and biases between the current and last frames. Note that there are four tracking modes, i.e., QR motion tracking, constant velocity tracking, keyframe tracking and relocalization. In the presence of a QR code, the system will decode the corresponding label qr and compute its code position {xqr , yqr , zqr } as well as its planar translation & angular displacements {x, y, φ }. Afterwards, the AGV current pose is updated. Then the body velocities and inertial biases are updated with the rigid constraints on the camera pose between the adjacent frames estimated from the QR code. Between the QR spatial intervals, the robot states will be updated by the traditional pose tracking method [10]. When the reprojection error of features between the adjacent frames is within the predefined threshold, constant velocity tracking will be performed. Otherwise, keyframe tracking will be performed. The features between the current frame and the latest keyframe will be searched and matched. If there are adequate numbers of correct feature matches, the corresponding states will be updated. Or else, the relocalization will be implemented between the current frame and the historical keyframes, in the presence of feature tracking failures. The map points will be built using the current frame and the matched keyframe.

432

4.2

Y. Zhang et al.

WaRoNav-MVF Backend

In the WaRoNav backend, the local mapping and loop closing will be performed. They are designed to correct the robot poses and the corresponding map point positions. The descriptions of these procedures are described in the following sections. Local Mapping. In the local mapping, a certain number of keyframes (denoted by {F1 , F2 , · · · , FM }) and the corresponding map points (denoted by {p1 , p2 , · · · , pL }) seen by these keyframes are optimized using the local bundle adjustment model, as depicted in Fig. 3. Unlike [5], the poses of some specific keyframes among {F1 , F2 , · · · , FM } that are also observed by the downward camera stay unchanged. In other words, these poses are fixed in the BA process. The reason is that the poses deduced from QR codes on the ground have much higher accuracy than the poses estimated from the forward visualinertial measurements.

Fig. 3. Illustration on the WaRoNav local bundle adjustment. The map point positions and the camera poses will be optimized in the BA manner. Several poses denoted by orange color {Fˆ1 , Fˆ4 , FˆK } corresponding to {qr1 , qr2 , qr3 } are the fixed poses in the optimization process.

In the local bundle adjustment, we use the graph optimization model to update the keyframe poses and map point positions. In the model, the poses that are deduced from the QR codes have much higher confidence. Thus, we set these keyframe poses fixed. The other poses of keyframes without QR codes and the map points will be optimized in the model. Additionally, note that the inertial terms, i.e., velocity and biases are also optimized, together with the reprojection errors. It should also be noted that for each QR code, we rigidly choose a certain number of keyframes (often 2 or 3 keyframes) in the cost function. These constraints could ensure that the optimal values are not stuck in the local minimum. Loop Closing. In the digital workshop, the robot is programmed to move in regular trajectories. Loop closures could be periodically performed to correct the accumulated drifts. In this work, we perform loop closing upon the keyframes. The bag-ofwords (BoWs) of the current keyframe will be compared with BoWs of the historical


Loop Closing. In the digital workshop, the robot is programmed to move along regular trajectories, so loop closures can be periodically performed to correct the accumulated drifts. In this work, we perform loop closing on the keyframes. The bag-of-words (BoW) representation of the current keyframe is compared with the BoWs of the historical keyframes. To reduce the occurrence of false keyframe matches, we add a rigid constraint on the keyframe positions. Assume the current keyframe and the putative match are F_c and F_p, with corresponding positions P(F_c) and P(F_p). If they satisfy Eq. 6, the keyframe match {P(F_c), P(F_p)} is considered genuine.

\| P(F_c) - P(F_p) \| \le \delta_{lc}    (6)

Afterwards, the rigid transformation between the two keyframes (Sim3 model) is computed. The poses of the current frame and the neighbouring keyframes that observe the common map points of F_c are updated on the covisibility graph.
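As a concrete illustration of this gating step, the minimal Python sketch below combines a BoW similarity score with the position constraint of Eq. 6. The function name, the similarity threshold and the δ_lc value are placeholders, not the system's actual parameters.

```python
import numpy as np

def loop_candidate_ok(P_c, P_p, bow_sim, sim_thresh=0.05, delta_lc=2.0):
    """Accept a putative loop-closure keyframe match only if the BoW similarity is high
    enough AND the keyframe positions satisfy the rigid distance constraint of Eq. (6).
    sim_thresh and delta_lc (metres) are illustrative values."""
    close_enough = np.linalg.norm(np.asarray(P_c) - np.asarray(P_p)) <= delta_lc
    return bow_sim >= sim_thresh and close_enough
```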

5 Experiments

5.1 Platform

In the experiments, the proposed WaRoNav is evaluated on the developed platform. There are 80 QR codes pasted on the floor along the horizontal and vertical directions, and the distance between adjacent QR codes is 1200 mm. The warehouse robot is equipped with the forward sensing units, i.e., a monocular camera and an IMU, which are integrated in the Intel RealSense T265 module. The robot is also equipped with the downward camera, which scans the QR codes on the ground. The robot is programmed to run the trajectories in accordance with the commands wirelessly sent from the server. The resolution of the image from the forward camera is 848 × 800. The IMU is a Bosch BMI055, which collects three-axis accelerations and three-axis angular velocities. The sampling rates of the forward camera and IMU are 20 Hz and 100 Hz respectively. The downward camera collects the ground QR codes with an image resolution of 640 × 480 and a sampling rate of 20 Hz. The distance between the downward camera and the ground is approximately 120 mm.

5.2 Results and Analysis

In the tests, the AGV runs three types of generic trajectories, namely, L-shape, U-shape and rectangular trajectories in the digital workshop. To evaluate the effectiveness of WaRoNav, we use the evaluation tool EVO [11]. SE(3) alignment is used to register the estimated trajectory to the ground truth, and the root mean square errors (RMSE) of translation and rotation are computed and compared with the state-of-the-art, i.e., Vins-Mono [12], ORB-SLAM3 (Monocular-Inertial) [5] and LBVIO [13]. The ground-truth warehouse robot trajectory is obtained from the Leica TS60 tracker, which has millimeter accuracy [14]. The forward camera, IMU and downward camera are initialized within 2 s after the downward camera detects the first QR code, after which the global pose of the AGV is estimated and updated. For each type of trajectory, approximately 100 tests are performed, which are described and analyzed as follows.
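Before turning to the individual trajectories, the following minimal NumPy sketch illustrates the evaluation procedure used throughout: a rigid SE(3) (rotation plus translation) alignment of the estimated trajectory to the ground truth followed by the translational RMSE. It mirrors the EVO-based evaluation only schematically, and the synthetic data are illustrative.

```python
import numpy as np

def align_se3(est, gt):
    """est, gt: (N, 3) matched positions. Returns R, t minimizing ||R @ est_i + t - gt_i||."""
    mu_e, mu_g = est.mean(0), gt.mean(0)
    H = (est - mu_e).T @ (gt - mu_g)                   # cross-covariance (Kabsch)
    U, _, Vt = np.linalg.svd(H)
    S = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])  # avoid reflections
    R = Vt.T @ S @ U.T
    t = mu_g - R @ mu_e
    return R, t

def translation_rmse(est, gt):
    R, t = align_se3(est, gt)
    err = (est @ R.T + t) - gt
    return np.sqrt((err ** 2).sum(axis=1).mean())

# usage with synthetic data
gt = np.cumsum(np.random.randn(100, 3) * 0.1, axis=0)
est = gt + np.random.randn(100, 3) * 0.01
print(f"ATE RMSE: {translation_rmse(est, gt):.4f} m")
```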


L-Shape Trajectory Tests. In the L-shape trajectory tests, the robot is programmed to follow the linear path and then turn left at the QR codes. Figure 4 shows the estimated trajectories from the proposed WaRoNav, Vins-Mono [12], ORB-SLAM3 (Monocular-Inertial) [5], and LBVIO [13]. It can be seen that there is a turning arc (approximately 90°) at the corner of the trajectory, because the forward and downward sensing units are fixed at the front of the AGV. It can also be seen that WaRoNav considerably corrects the drifts of the forward visual-inertial measurements at the QR code spots, so the deviations can be periodically compensated. In contrast, ORB-SLAM3 [5], Vins-Mono [12] and LBVIO [13] yield obvious discrepancies against the ground truth over time. These discrepancies are caused by scale drifts, unexpected robot jitters, inadequate landmarks, etc.

Fig. 4. Comparisons of the estimated AGV L-shape trajectories between ours (WaRoNav) and Vins-Mono [12], ORB-SLAM3 (Monocular-Inertial) [5], and LBVIO [13].

Table 1 presents the orientation and translation errors for the robot L-shape trajectories. The tests for the L-shape trajectory have been repeated 100 times. From Table 1, it can be seen that WaRoNav yields the smallest RMSE on both orientation and translation, which are less than 45 mm and 2°. In comparison, there exist different levels of accumulated errors for Vins-Mono [12] and ORB-SLAM3 [5], especially on the robot turning intervals, which can be attributed to the long-term visual-inertial drifts and the loss of absolute reference corrections. Although LBVIO [13] achieves relatively higher accuracy at the QR code spots, severe drifts remain obvious over the whole navigation process.

Table 1. Comparisons on the RMSE of translations and rotations of L-shape trajectories between WaRoNav (Ours) and Vins-Mono [12], ORB-SLAM3 (Monocular-Inertial) [5] and LBVIO [13].

Trajectory | Length | Metric | Ours | ORB-SLAM3 | Vins-Mono | LBVIO
L-shape | 8400 mm | Trans. err | 38.55 mm | 434.11 mm | 472.55 mm | 172.72 mm
L-shape | 8400 mm | Rot. err | 1.78° | 3.97° | 3.54° | 2.53°
L-shape | 16800 mm | Trans. err | 39.62 mm | 583.21 mm | 654.14 mm | 180.97 mm
L-shape | 16800 mm | Rot. err | 1.81° | 5.23° | 6.85° | 3.76°
L-shape | 25200 mm | Trans. err | 41.97 mm | 782.16 mm | 862.58 mm | 186.43 mm
L-shape | 25200 mm | Rot. err | 1.82° | 8.95° | 8.76° | 4.22°
L-shape | 33600 mm | Trans. err | 43.28 mm | 873.58 mm | 956.82 mm | 238.19 mm
L-shape | 33600 mm | Rot. err | 1.85° | 12.14° | 13.97° | 4.92°

U-Shape Trajectory Tests. In the U-shape trajectory tests, Vins-Mono [12] and ORB-SLAM3 (Monocular-Inertial) [5] yield obviously larger translation and rotation errors, especially at the turning times, and deviate considerably from the ground truth, as illustrated in Fig. 5. LBVIO [13] is able to correct the drifts periodically, but zig-zag patterns during the correction are still obvious because of its inherent local sliding-filter fusion scheme. By comparison, WaRoNav maintains a relatively smaller tracking error, owing to the periodic corrections from the QR codes and the tightly-coupled global optimization framework.

Fig. 5. Comparisons of the estimated AGV U-shape trajectories between ours (WaRoNav) and Vins-Mono [12], ORB-SLAM3 (Monocular-Inertial) [5] and LBVIO [13].

Table 2 shows the corresponding orientation and translation errors for the AGV U-shape trajectories. The tests for the U-shape trajectory have been repeated 100 times. From Table 2, we can see that the RMSEs of WaRoNav on both orientation and translation remain relatively small, below 49 mm and 2°. In contrast, the accumulated errors of Vins-Mono [12] and ORB-SLAM3 [5] are considerably higher, reaching up to 990 mm and 14° on the recovered translation and rotation. It should be noted that severe overexposure occurs when the AGV runs toward the workshop windows. In this scenario, the number of features being tracked in the forward-looking image sequences is relatively small, which degrades the overall AGV tracking. LBVIO [13] exhibits relatively smaller errors on both translation and rotation, which are less than 194 mm and 4° respectively, but it is still inferior to the proposed WaRoNav.

Table 2. Comparisons on the RMSE of translations and rotations of U-shape trajectories between ours (WaRoNav) and Vins-Mono [12], ORB-SLAM3 (Monocular-Inertial) [5] and LBVIO [13].

Trajectory | Length | Metric | Ours | ORB-SLAM3 | Vins-Mono | LBVIO
U-shape | 12000 mm | Trans. err | 41.87 mm | 483.82 mm | 540.12 mm | 171.25 mm
U-shape | 12000 mm | Rot. err | 1.72° | 7.13° | 8.34° | 2.97°
U-shape | 24000 mm | Trans. err | 43.91 mm | 589.75 mm | 684.46 mm | 187.38 mm
U-shape | 24000 mm | Rot. err | 1.89° | 8.87° | 9.96° | 3.11°
U-shape | 36000 mm | Trans. err | 47.62 mm | 825.89 mm | 912.74 mm | 189.62 mm
U-shape | 36000 mm | Rot. err | 1.91° | 9.92° | 11.46° | 3.35°
U-shape | 48000 mm | Trans. err | 48.94 mm | 890.74 mm | 985.16 mm | 193.74 mm
U-shape | 48000 mm | Rot. err | 1.95° | 11.33° | 13.87° | 3.86°

Rectangular Trajectory Tests. In the rectangular trajectory tests, Vins-Mono [12], ORB-SLAM3 (Monocular-Inertial) [5] and LBVIO [13] also yield larger translation and rotation errors over time, and the recovered trajectories deviate considerably from the ground truth, as illustrated in Fig. 6. By comparison, WaRoNav is still able to maintain a relatively smaller tracking error, due to the periodic corrections from the QR codes.

Fig. 6. Comparisons of the estimated AGV rectangular trajectories between ours (WaRoNav) and Vins-Mono [12], ORB-SLAM3 (Monocular-Inertial) [5], LBVIO [13].

Table 3 shows the orientation and translation errors for the AGV rectangular trajectories. The tests for the rectangular trajectory have been repeated 100 times. It can be seen that the RMSEs of WaRoNav on both orientation and translation remain relatively small, below 39 mm and 2°. In comparison, there exist severe accumulated errors for Vins-Mono and ORB-SLAM3, especially on the robot turning intervals, which can be attributed to the long-term visual-inertial drifts and the loss of absolute reference corrections. Note that LBVIO [13] can reduce the errors at the QR spots to some extent, but the errors at the non-QR spots still cannot be overlooked.

Table 3. Comparisons on the RMSE of translations and rotations of rectangular trajectories between ours (WaRoNav) and Vins-Mono [12], ORB-SLAM3 (Monocular-Inertial) [5] and LBVIO [13].

Trajectory | Length | Metric | Ours | ORB-SLAM3 | Vins-Mono | LBVIO
Rectangle | 16800 mm | Trans. err | 37.35 mm | 564.36 mm | 595.73 mm | 151.77 mm
Rectangle | 16800 mm | Rot. err | 1.91° | 6.93° | 7.76° | 3.23°
Rectangle | 33600 mm | Trans. err | 38.41 mm | 575.22 mm | 612.94 mm | 155.49 mm
Rectangle | 33600 mm | Rot. err | 1.93° | 7.15° | 8.67° | 4.27°
Rectangle | 50400 mm | Trans. err | 38.62 mm | 599.89 mm | 613.42 mm | 167.62 mm
Rectangle | 50400 mm | Rot. err | 1.97° | 8.92° | 9.46° | 3.76°
Rectangle | 67200 mm | Trans. err | 38.77 mm | 619.94 mm | 654.53 mm | 179.76 mm
Rectangle | 67200 mm | Rot. err | 1.98° | 9.72° | 10.68° | 4.98°

6 Conclusion

In this paper, we have presented an innovative multi-view visual-inertial fusion framework for warehouse robot navigation. WaRoNav is able to periodically correct the long-term visual-inertial estimation drifts, provide the robot poses and velocities in the absolute reference frame, and preserve global consistency by jointly using the measurements from the forward camera and IMU, as well as the downward camera that captures the QR codes on the floor. WaRoNav has been extensively evaluated on the developed platform. The experimental results have shown competitive accuracy against the state-of-the-art, with position and attitude errors less than 0.05 m and 2° respectively, irrespective of the trajectory lengths.

Acknowledgment. This work was supported by the National Natural Science Foundation of China (62273332) and the Youth Innovation Promotion Association of the Chinese Academy of Sciences (2022201).

References
1. Hu, H., Jia, X., Liu, K., Sun, B.: Self-adaptive traffic control model with behavior trees and reinforcement learning for AGV in industry 4.0. IEEE Trans. Ind. Inf. 17(12), 7968–7979 (2021)
2. Li, X., Wan, J., Dai, H.-N., Imran, M., Xia, M., Celesti, A.: A hybrid computing solution and resource scheduling strategy for edge computing in smart manufacturing. IEEE Trans. Ind. Inf. 15(7), 4225–4234 (2019)
3. De Ryck, M., Versteyhe, M., Debrouwere, F.: Automated guided vehicle systems, state-of-the-art control algorithms and techniques. J. Manuf. Syst. 54, 152–173 (2020)
4. Labbé, M., Michaud, F.: RTAB-Map as an open-source lidar and visual simultaneous localization and mapping library for large-scale and long-term online operation. J. Field Rob. 36(2), 416–446 (2019)
5. Campos, C., Elvira, R., Rodríguez, J.J.G., Montiel, J.M., Tardós, J.D.: ORB-SLAM3: an accurate open-source library for visual, visual-inertial, and multimap SLAM. IEEE Trans. Rob. 37(6), 1874–1890 (2021)
6. Cao, S., Lu, X., Shen, S.: GVINS: tightly coupled GNSS-visual-inertial fusion for smooth and consistent state estimation. IEEE Trans. Rob. 38(4), 2004–2021 (2022)
7. Lv, W., Kang, Y., Zhao, Y.-B., Wu, Y., Zheng, W.X.: A novel inertial-visual heading determination system for wheeled mobile robots. IEEE Trans. Control Syst. Technol. 29(4), 1758–1765 (2020)
8. Eckenhoff, K., Geneva, P., Bloecker, J., Huang, G.: Multi-camera visual-inertial navigation with online intrinsic and extrinsic calibration. In: 2019 International Conference on Robotics and Automation (ICRA), pp. 3158–3164. IEEE (2019)
9. Xia, Y., Ma, J.: Locality-guided global-preserving optimization for robust feature matching. IEEE Trans. Image Process. 31, 5093–5108 (2022)
10. Mur-Artal, R., Tardós, J.D.: Visual-inertial monocular SLAM with map reuse. IEEE Rob. Autom. Lett. 2(2), 796–803 (2017)
11. Chou, C.C., Chou, C.F.: Efficient and accurate tightly-coupled visual-lidar SLAM. IEEE Trans. Intell. Transp. Syst. 23(9), 14509–14523 (2021)
12. Qin, T., Li, P., Shen, S.: VINS-Mono: a robust and versatile monocular visual-inertial state estimator. IEEE Trans. Rob. 34(4), 1004–1020 (2018)
13. Lee, J., Chen, C., Shen, C., Lai, Y.: Landmark-based scale estimation and correction of visual inertial odometry for VTOL UAVs in a GPS-denied environment. Sensors 22(24), 9654 (2022)
14. Karimi, M., Oelsch, M., Stengel, O., Babaians, E., Steinbach, E.: LoLa-SLAM: low-latency lidar SLAM using continuous scan slicing. IEEE Rob. Autom. Lett. 6(2), 2248–2255 (2021)

Enhancing Lidar and Radar Fusion for Vehicle Detection in Adverse Weather via Cross-Modality Semantic Consistency

Yu Du1, Ting Yang1, Qiong Chang2, Wei Zhong1, and Weimin Wang1(B)

1 Dalian University of Technology, Dalian, China
{d01u01,yangting}@mail.dlut.edu.cn, {zhongwei,wangweimin}@dlut.edu.cn
2 Tokyo Institute of Technology, Tokyo, Japan
[email protected]

Abstract. The fusion of multiple sensors such as Lidar and Radar provides richer information and thus improves the perception ability for autonomous driving even in adverse weather. However, Lidar provides a 3D geometric point cloud of the scene while Radar provides 2D scan images of Radar cross-section (RCS) intensity, which brings challenges for efficient multi-modality fusion. To this end, we propose an enhanced Lidar-Radar fusion framework, Cross-Modality Semantic Consistency Networks (CMSCNet), to mitigate the modal gap and improve detection performance. Specifically, we implement the semantic consistency loss by leveraging the concept of Knowledge Distillation for cross-modality feature learning. Moreover, we investigate effective point cloud processing strategies of adaptive ground removal and Laser-ID slicing when transforming the Lidar point cloud into a BEV representation that closely resembles the Radar image. We evaluate the proposed method on the Oxford Radar RobotCar dataset with simulated fog point clouds and the RADIATE dataset of real adverse weather data. Extensive experiments validate the advantages of CMSCNet for Lidar and Radar fusion, especially on real adverse data.

Keywords: Lidar-Radar Fusion · Knowledge Distillation · Object Detection · Adverse Weather

1 Introduction

Object detection is a crucial perception task for autonomous driving. Recent deep neural network models driven by massive amounts of training data have achieved remarkable detection performance with 3D point cloud data by Lidar sensors [4,6,13,14,21]. To further improve the automatic driving level, researchers strive to expand the capability of robust detection under all-weather conditions. However, the robustness and performance of most models trained on Lidar point cloud data drastically degrade as the quality of data worsens due to the poor visibility in adverse weather (snowy, foggy, and rainy).


Different from optical sensors, millimeter-wave Radar uses frequency modulated continuous waves (FMCW) to perceive the surroundings. Owing to its longer wavelength, Radar waves can penetrate or diffract around small particles in adverse weather, enabling the Radar sensor to perceive the environment stably regardless of the weather. However, relying solely on Radar for detection poses significant challenges due to the sparsity, noise, and lack of vertical resolution in Radar scan data. Thus, Lidar with high-quality 3D information can be the complementary modality to Radar even under certain adverse weather. The representative Lidar-Radar fusion network MVDNet [10] shows high accuracy even in foggy conditions, demonstrating the remarkable resilience of the fusion model towards adverse weather. It fuses Lidar and Radar features adaptively using an attention module. However, it does not explicitly consider the modality gap between the two types of data during feature learning.

Radar operates in the range-azimuth plane with a 360° horizontal view but lacks vertical height information. On the other hand, Lidar enables the creation of 3D representations along both horizontal and vertical directions. By transforming the Lidar data into a Bird's Eye View (BEV) format, the two modal features can be aligned in a unified representation space. In MVDNet [10], self- and cross-attention mechanisms are used to establish connections between concatenated Lidar and Radar features. However, notable differences exist between independently extracted features from different modalities, which can potentially hinder the subsequent fusion process. To address this, we propose a Cross-Modality Semantic Consistency loss to enhance Lidar and Radar feature learning for better cross-modal consistency. The proposed loss is implemented by leveraging the concept of Knowledge Distillation to explicitly constrain the feature learning of the two modalities for object detection. Furthermore, as Lidar and Radar data are fused in the BEV perspective, we observe that the ground portion of the Lidar point cloud may interfere with the fusion with Radar data. Thus, we employ adaptive ground removal to aid the detection process. Additionally, we explore the 2D projection of the 3D point cloud and find that Laser-ID based slicing provides a more natural and efficient data format for BEV fusion. The contributions of this work are as follows:
– We propose CMSCNet, a semantic consistency constrained cross-modality fusion network to take complementary advantages of Lidar and Radar in all-weather conditions. Knowledge Distillation is innovatively introduced to multi-modal fusion to mitigate the modality gap between extracted features.
– We introduce two strategies for Lidar BEV generation to enhance the fusion of Lidar point clouds and Radar images. Different from the traditional height-based BEV dividing method, we slice point clouds according to Laser IDs to retain their complete information. In addition, ground points within a certain range are filtered out to decrease noisy data.
– We conduct extensive experiments and ablation studies on the fogged Oxford Radar RobotCar (ORR) and RAdar Dataset In Adverse weaThEr (RADIATE) datasets to validate the superiority of the proposed fusion method for object detection in adverse weather, as well as the effectiveness of the ground-removal and ID-based slicing strategy.



2 Related Work

Radar is usually equipped as an integral part of an Advanced Driver Assistance System (ADAS). Although Radar has been widely deployed in vehicles, there are only a few works on Radar-based perception compared to other sensors. In this section, we revisit works on Radar-only and Radar-fused perception.

Radar Only. Major et al. [8] propose the first Radar-based deep object detection network on a self-developed dataset. Wang et al. [16] propose a deep Radar object detection network (RODNet) that takes range-azimuth frequency heatmaps (RAMaps) as input. Due to the low accuracy, there are few detection networks that rely on Radar alone. More commonly, Radar is used as a complement to other sensors to enhance perception performance.

Radar-Camera. CenterFusion [9] is a middle-fusion approach for 3D object detection. Radar feature maps are generated using Radar predictions and the initial detection results obtained from images, and then Radar features and image features jointly predict bounding boxes. Radar predictions and their corresponding objects are associated through a novel frustum-based Radar association method. The authors of [5] present the Gated Region of Interest Fusion (GRIF) Network to fuse sparse Radar point clouds and monocular camera images. GRIFNet adaptively selects the appropriate sensor data to improve robustness using an explicit gating mechanism. Low-level Radar and image fusion achieves results comparable to Lidar on the nuScenes dataset [3], showing the potential of Radar in autonomous driving.

Radar-Lidar. RadarNet [18] fuses Lidar and Radar data with a two-stage fusion mechanism: a voxel-based early fusion and an attention-based late fusion. It utilizes both geometric and dynamic information from Radar clusters. MVDNet [10] is the first work that performs object detection with the fusion of panoramic Lidar and Radar in simulated foggy weather conditions. Besides feature fusion, it investigates temporal fusion over two data sequences. Based on MVDNet, ST-MVDNet [7] proposes a training framework that addresses the issue of sensor absence in multi-modal vehicle detection by incorporating clear and missing patterns. Its Mean Teacher framework facilitates parallel learning of teacher and student models, resulting in a substantial increase in parameter count.

3 Proposed Method

The architecture of the proposed CMSCNet is illustrated in Fig. 1. CMSCNet takes millimetre-wave Radar scan images and preprocessed multi-dimensional Lidar BEV images as inputs. The VGG16 [15] backbone is utilized for the initial feature extraction.


The CMSC loss constrains the Lidar and Radar feature extractors to learn similar features on focal channels and pixels. The obtained feature maps, with aligned spatial size, are concatenated along the channel dimension and later coupled by self- and cross-attention [10]. The final detection results are predicted from the fused feature through the detection head.

Fig. 1. The architecture of CMSCNet.

3.1 Losses

Cross-Modality Semantic Consistency Loss. To fuse Lidar and Radar at the feature level, MVDNet [10] introduced self- and cross-attention blocks. However, it only adaptively sets weights for fusion and does not constrain the semantic consistency of the features learned for each modality. An external constraint is needed to make the extractors focus on the relevant information. As an efficient approach for cross-modality learning, we introduce the concept of Knowledge Distillation to enhance the correlation of the feature maps extracted from the network. In this work, the CMSC loss L_CMSC is implemented using the focal knowledge distillation loss [20], which is specifically designed for object detection tasks. L_CMSC is defined as:

L_CMSC = L_fea + L_att    (1)

where L_fea is the feature loss and L_att is the attention loss. Note that the loss only serves as a semantic-consistency constraint and no student model is actually learned. In contrast to conventional global feature distillation methods, L_fea is divided into two components: the foreground loss L_fg and the background loss L_bg. By segmenting the image into foreground and background, our approach guides both parts to focus on crucial pixels and channels. The hyper-parameters α and β are used to balance the foreground and background losses. We formulate it as:

L_fea = α L_fg + β L_bg    (2)


Following the approach proposed in [20], the foreground and background feature losses L_fg and L_bg are composed of three components: the Binary Mask M, the Scale Mask S, and the Attention Masks A^S and A^C. Through the Foreground Binary Mask M^F and Background Binary Mask M^B, the feature map is separated into a foreground area and a background area, as illustrated in Fig. 2. To mitigate the influence of varying object pixel areas on the loss, we use the Scale Mask S for scale normalization, dividing by the number of pixels that belong to the target box or the background. Besides, by summing the absolute values over the feature map we obtain a spatial attention map and a channel attention map; applying a softmax with a temperature parameter then yields the Spatial and Channel Attention Masks A^S and A^C, which effectively guide the network to focus on focal pixels and channels. Lidar features F^{Li} and Radar features F^{Ra} are obtained from the respective extractors. The feature losses can be formulated as:

L_fg = \sum_{k=0}^{C} \sum_{i=0}^{H} \sum_{j=0}^{W} M^F_{i,j} S_{i,j} A^S_{i,j} A^C_{k} (F^{Li}_{i,j,k} - F^{Ra}_{i,j,k})^2    (3)

L_bg = \sum_{k=0}^{C} \sum_{i=0}^{H} \sum_{j=0}^{W} M^B_{i,j} S_{i,j} A^S_{i,j} A^C_{k} (F^{Li}_{i,j,k} - F^{Ra}_{i,j,k})^2    (4)

where H, W, and C denote the height, width, and channel of the feature, and i, j, k denote the horizontal coordinate, vertical coordinate and channel index in the feature map. Besides, the attention loss L_att is calculated with the Spatial and Channel Attention Masks A^S and A^C, the same as in [20]:

L_att = \gamma \cdot ( l(A^S_{Li}, A^S_{Ra}) + l(A^C_{Li}, A^C_{Ra}) )    (5)

where l denotes L1 loss and γ is a hyper-parameter.
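A possible PyTorch realization of Eqs. (1)–(5) is sketched below. The construction of the masks and attention maps follows the focal-distillation recipe only schematically; the tensor names and the temperature are assumptions, while the default weights mirror the values reported later in the implementation details.

```python
import torch
import torch.nn.functional as F

def spatial_channel_attention(feat, T=0.5):
    """feat: (B, C, H, W). Returns softmax spatial (B, H*W) and channel (B, C) attention."""
    abs_feat = feat.abs()
    a_s = F.softmax(abs_feat.mean(dim=1).flatten(1) / T, dim=1)   # over pixels
    a_c = F.softmax(abs_feat.mean(dim=(2, 3)) / T, dim=1)         # over channels
    return a_s, a_c

def cmsc_loss(f_li, f_ra, fg_mask, scale, alpha=5e-5, beta=2.5e-5, gamma=5e-5):
    """f_li, f_ra: (B, C, H, W) Lidar/Radar features; fg_mask, scale: (B, H, W) masks.
    Schematic sketch: attention masks are taken from the Lidar branch here (an assumption)."""
    a_s, a_c = spatial_channel_attention(f_li)
    B, C, H, W = f_li.shape
    diff2 = (f_li - f_ra) ** 2
    w_fg = (fg_mask * scale * a_s.view(B, H, W)).unsqueeze(1) * a_c.view(B, C, 1, 1)
    w_bg = ((1 - fg_mask) * scale * a_s.view(B, H, W)).unsqueeze(1) * a_c.view(B, C, 1, 1)
    l_fg = (w_fg * diff2).sum()                                   # Eq. (3)
    l_bg = (w_bg * diff2).sum()                                   # Eq. (4)
    a_s_ra, a_c_ra = spatial_channel_attention(f_ra)
    l_att = F.l1_loss(a_s, a_s_ra) + F.l1_loss(a_c, a_c_ra)       # Eq. (5), gamma folded below
    return alpha * l_fg + beta * l_bg + gamma * l_att             # Eqs. (1)-(2)
```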

Fig. 2. Illustration of the Binary Mask. The pixels inside the outer rectangle (red) of the rotated bounding boxes (yellow) are identified as foreground.


Detection Loss. In training the region proposal network (RPN), we use the Binary Cross-Entropy (BCE) loss for the classification term L^{cls}_{RPN} and the smooth L1 loss for the bounding-box regression term L^{reg}_{RPN}, as in [11]:

L_RPN = L^{cls}_{RPN} + L^{reg}_{RPN}    (6)

In addition to the classification loss L^{cls}_{RCNN} and regression loss L^{reg}_{RCNN}, an orientation loss L^{dir}_{RCNN} is incorporated within the framework. The losses for classification and bounding-box regression are the same as in Eq. 6. The rotated bounding-box orientation is represented by a binary positive/negative direction and a specific angle within 180°, and we use a cross-entropy loss L_dir for the direction classification.

L_RCNN = L^{cls}_{RCNN} + L^{reg}_{RCNN} + L^{dir}_{RCNN}    (7)

Total Loss. We apply a multi-task loss for classification and regression of the detection bounding boxes. Besides, we introduce the Focal and Global Distillation loss [20] to enhance the cross-modal correlation between Lidar and Radar. The total loss is defined as:

L_total = L_CMSC + L_RPN + L_RCNN    (8)

3.2 Ground Removal

Ground points make up the majority of a point cloud but are of little use in most tasks. Moreover, these points may act as noise for car feature extraction, especially in the BEV form. Therefore, we propose to utilize Random Sample Consensus (RANSAC) to filter them out and avoid the interference of background point clouds.

Fig. 3. Visualization of Radar and BEV of Lidar. From left to right are: Radar scan(a), BEV of Lidar point cloud(b), BEV of Lidar point cloud with points of ground marked as red(c) and ground-removed Lidar data overlayed on the Radar scan(d).

As shown in Fig. 3, the BEV of the Lidar data with ground removal looks more similar to the original Radar data. We also find that most ground points lie within a certain distance of the sensor, owing to the geometry of the Lidar scanning rays. Simply removing distant points would degrade detection because distant objects would lose points as well. To avoid this, we set a distance threshold and adaptively filter ground points only within that range.
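A minimal NumPy sketch of this adaptive ground removal is given below, assuming a plain RANSAC plane fit followed by a near-range filter; the thresholds and iteration count are illustrative (the 7 m range mirrors the ORR setting reported in the implementation details).

```python
import numpy as np

def ransac_ground(points, iters=100, dist_thresh=0.15, rng=np.random.default_rng(0)):
    """points: (N, 3). Returns a boolean mask of ground-plane inliers."""
    best_inliers = np.zeros(len(points), dtype=bool)
    for _ in range(iters):
        sample = points[rng.choice(len(points), 3, replace=False)]
        n = np.cross(sample[1] - sample[0], sample[2] - sample[0])
        if np.linalg.norm(n) < 1e-8:
            continue                                    # degenerate sample, skip
        n = n / np.linalg.norm(n)
        d = -n @ sample[0]
        inliers = np.abs(points @ n + d) < dist_thresh
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    return best_inliers

def remove_ground_adaptive(points, ground_range=7.0):
    """Drop ground inliers only within `ground_range` metres of the sensor (x-y plane)."""
    ground = ransac_ground(points)
    near = np.linalg.norm(points[:, :2], axis=1) < ground_range
    return points[~(ground & near)]
```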

3.3 Laser-ID Based Slicing

In point-cloud-related tasks, one common operation is to project the point cloud into a BEV view so that CNNs can efficiently extract features. For example, points are projected into multiple slices along the height direction in PIXOR [19]. Although dividing by height is simple and intuitive, it does not reliably conform to the way a scanning Lidar ray operates. As shown in Fig. 4, a continuous scan (ID j and ID k) on a surface may be truncated into multiple discontinued projections by the height-based method, while it remains a continuous scanline when projected according to the Laser ID. We therefore divide all Lidar points into 32 slices based on the 32-beam Laser IDs and concatenate an intensity channel (see the sketch after Fig. 4).

Fig. 4. The projection results of three continuous Lidar rings (ID i, ID j, ID k) by the height-based and our proposed Laser-ID based slicing methods. From the comparison, the ID-based BEV slice is denser and more efficient than the height-based one.
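The slicing itself can be sketched as follows, assuming 32 laser rings, a 64 m range and a 320 × 320 grid as described for the ORR setup; the binary occupancy encoding and the single max-pooled intensity channel are simplifying assumptions.

```python
import numpy as np

def laser_id_bev(points, laser_ids, intensity, grid=320, extent=64.0, n_lasers=32):
    """points: (N, 3); laser_ids: (N,) integers in [0, n_lasers); intensity: (N,).
    Returns a (n_lasers + 1, grid, grid) BEV tensor: one slice per laser ring + intensity."""
    res = 2 * extent / grid
    cols = ((points[:, 0] + extent) / res).astype(int)
    rows = ((points[:, 1] + extent) / res).astype(int)
    valid = (cols >= 0) & (cols < grid) & (rows >= 0) & (rows < grid)
    bev = np.zeros((n_lasers + 1, grid, grid), dtype=np.float32)
    bev[laser_ids[valid], rows[valid], cols[valid]] = 1.0          # occupancy per laser ring
    np.maximum.at(bev[n_lasers], (rows[valid], cols[valid]), intensity[valid])  # intensity channel
    return bev
```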

4 Experiments

4.1 Datasets Preparation

Oxford Radar RobotCar Dataset. The Oxford Radar RobotCar dataset [1] (ORR) is collected by a scanning system equipped with one panoramic Navtech CTS350-X millimetre-wave FMCW scanning Radar and two 32-beam HDL-32E Lidar sensors. We take the same data generation strategy as MVDNet [10]. The Radar images and the Lidar BEV representation cover a range of 64 m in each direction with a resolution of 320 × 320. The number of Lidar BEV channels, i.e. the slices in the vertical direction, depends on the projection method described in Sect. 3.3. The time synchronization and alignment strategy is applied for the correspondence of Lidar and Radar data. For ease of comparison, we build the same train and test sets with fog simulation.

RADIATE Dataset. The RADIATE dataset [12] is collected in snow, fog, rain and clear (including night) weather conditions and various driving scenarios with the same Radar sensor as used for ORR. The train and test set splits are shown in Fig. 5. We combine four vehicle classes (car, van, bus and truck) into one class as the vehicle detection targets. Due to the difference in sensor scanning frequency, we select the closest Lidar frame to each Radar image from the timestamp list. The Radar coordinate system is also transformed into the Lidar coordinate system with the calibration data. The original Radar image is downsampled to 576 × 576 by max pooling, and the Lidar BEV image is processed to the same resolution.

Fig. 5. Number of frames in official RADIATE Training Set and Test Set.

4.2 Implementation Details

We train the model following the experiment settings of MVDNet [10]. Our approach is implemented based on Detectron2 [17], an open-source codebase widely utilized for object detection tasks. In the CMSC loss, we empirically set α = 5 × 10^-5 and β = 2.5 × 10^-5 for the feature focal loss, and γ = 5 × 10^-5 for the attention focal loss. For ground removal during Laser-ID based slicing, we filter ground points within 7 m and 5 m for the ORR and RADIATE datasets, respectively.


Table 1. Results on the ORR dataset. The network is trained on the clear and foggy sets. AP (average precision) is evaluated with IoU of 0.5 and 0.65, following the common COCO protocol. MVDNet and MVDNet-Fast [10] are retrained in single-frame mode with the public code. We present two results, CMSCNet and a simpler version CMSCNet-Fast corresponding to MVDNet-Fast. The best results are in bold. Train: Clear+Foggy.

Method | Clear 0.5 | Clear 0.65 | Foggy 0.5 | Foggy 0.65 | Average 0.5 | Average 0.65
PIXOR [19] | 72.7 | 68.3 | 62.6 | 58.9 | 67.7 | 63.6
PointPillar [6] | 85.7 | 83.0 | 72.8 | 70.3 | 78.0 | 67.7
DEF [2] | 86.6 | 78.2 | 81.4 | 72.5 | 84.0 | 75.3
MVDNet [10] | 88.9 | 86.8 | 82.7 | 78.9 | 85.8 | 82.9
CMSCNet (Ours) | 90.6 | 87.8 | 84.5 | 81.0 | 87.6 | 84.4
MVDNet-Fast [10] | 88.7 | 85.8 | 80.4 | 77.3 | 84.55 | 81.55
CMSCNet-Fast (Ours) | 89.7 | 86.9 | 83.9 | 80.2 | 86.8 | 83.55

4.3 Quantitative Results

The quantitative results on ORR and RADIATE are reported in Table 1 and Table 2, respectively. We compare our method with the Lidar-only methods PIXOR [19] and PointPillar [6] and the Lidar-Radar fusion methods DEF [2] and MVDNet [10]. MVDNet and MVDNet-Fast [10] are obtained by retraining the provided open-source code with its hyper-parameters and the temporal fusion branch switched off. Similarly, we also experiment with CMSCNet-Fast, which, like [10], uses fewer dimensions in the sensor fusion module. Besides, we compare our fusion network with Radar-only and Lidar-only detection baselines and the state-of-the-art Lidar-Radar model [10] on the RADIATE dataset. The Radar-only and Lidar-only baselines are simplified versions of our network that take single-sensor data as input and discard the fusion part.

From Table 1, we can see that our method generally outperforms the other methods on the ORR dataset. Regarding inference time, our model takes 60 ms per frame pair for detection on an NVIDIA GeForce RTX 3090 GPU, while MVDNet takes 43 ms. Although the performance of the proposed CMSCNet is slightly lower than MVDNet on the RADIATE dataset in some scenes, as shown in Table 2, CMSCNet achieves better performance in the challenging snow and night scenes. Compared to the single-modal approach, fusion exhibits performance degradation in the Motorway and City scenes. The reason mainly relates to the range of the sensors and the distribution of objects in the RADIATE dataset. Different from the ORR dataset, the LiDAR data in some scenes contain more significant corruption and a limited scanning range, while the annotations are labeled based on the Radar images. Thus, fusing the less informative data (LiDAR) works worse than the Radar-only baseline for detecting objects in far places.


Table 2. Detection results on RADIATE dataset. All methods are trained with the official set RADIATE Training Set in Good and Bad Weather and tested with RADIATE Test Set. The best results are in bold. Junction Motorway City Night Rain Fog Lidar only

2.9

0.3

Radar only

57.5

41.5

MVDNet [10]

67.5

30.3

CMSCNet (Ours) 67.3

38.3


4.6

Snow overall

4.6

5

14.6

0.1

0.7

28.8 31.3

31.3

34.8

3.1

3.1

25.3 49.4

32.3

42.3

7.0

27.2 50.6

32.6 32.3 7.6

35.1 36.0

4.4 Qualitative Results

Besides the quantitative evaluation, we visualize some detection results in Fig. 6. From these figures, we can see that MVDNet [10] tends to generate more false positives under the distraction of large particles such as snow than in other scenes. We consider that the design of MVDNet is better suited to foggy weather and may struggle in more severe weather conditions even when the model is trained on the new real dataset, while the proposed CMSCNet performs more consistently.

Fig. 6. Qualitative results on RADIATE dataset. (a) shows the weather conditions at that time. (b) (c) (d) are the overlapped images of Radar and Lidar BEV projection map with GT (yellow) or predicted (blue) car boxes. Due to the adverse weather, MVDNet is prone to false detection compared with our CMSCNet. Our network shows better performance in a wide range of situations.


Table 3. The ablation results on the Oxford Radar RobotCar dataset. It shows that all modules contribute to the detection performance. Train: Clear+Foggy.

Method | Clear 0.5 | Clear 0.65 | Foggy 0.5 | Foggy 0.65
w/o ID slice | 89.8 (–0.8) | 87.6 (–0.2) | 84.3 (–0.2) | 80.4 (–0.6)
w/o De-Ground | 90.1 (–0.5) | 87.2 (–0.6) | 83.5 (–1.0) | 79.7 (–1.3)
w/o CMSC loss | 90.1 (–0.5) | 87.2 (–0.6) | 84.1 (–0.4) | 80.3 (–0.7)
CMSCNet (Default) | 90.6 | 87.8 | 84.5 | 81.0

4.5 Ablation Study

In Table 3, we show ablation study results on the ORR dataset to investigate how each module of CMSCNet contributes to the detection performance. In the first experiment, we apply the traditional height-based slicing method, resulting in a decrease in AP compared to our ID-based method: height-based BEV generation splits point clouds along the vertical direction, which disrupts the continuity of the original Lidar laser points. In the next experiment, we remove only the de-grounding module and retain the points belonging to the ground. According to the results, removing the CMSC loss also causes degradation due to the lack of semantic consistency between independently learned features. Overall, the detection results show the effectiveness of the Laser-ID based slicing, the ground removal and the CMSC loss.

5 Conclusion and Future Work

We propose a cross-modality fusion model named CMSCNet, which takes panoramic Lidar point clouds and Radar range-azimuth images as inputs, to robustly detect vehicles in various weather conditions. To reduce the modal differences between sensors, we introduce a Knowledge Distillation loss to correlate the feature learning and resolve multi-modal discrepancies. Furthermore, to represent point cloud data more naturally and effectively, the Lidar data are processed to remove the ground by RANSAC and then divided into 32-dimensional BEV images according to Laser IDs. We evaluate the model on real-world datasets and validate that our method is effective in both simulated foggy conditions and real adverse weather conditions.


Extensive ablation studies also confirm the practicality of the loss design and the data processing strategy. Although CMSCNet achieves relatively competitive results, the absolute performance is still limited in some weather conditions such as snow and rain. In the future, we plan to improve performance by integrating Lidar denoising methods (e.g., defogging, deraining and desnowing) and other modalities such as RGB, infrared and gated images.

Acknowledgment. This work was supported in part by the National Natural Science Foundation of China under Grant 62306059 and the Fundamental Research Funds for the Central Universities under Grant DUT21RC(3)028.

References
1. Barnes, D., Gadd, M., Murcutt, P., Newman, P., Posner, I.: The oxford radar robotcar dataset: a radar extension to the oxford robotcar dataset. In: IEEE International Conference on Robotics and Automation (ICRA), pp. 6433–6438 (2020)
2. Bijelic, M., et al.: Seeing through fog without seeing fog: deep multimodal sensor fusion in unseen adverse weather. In: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 11679–11689 (2020)
3. Caesar, H., et al.: nuScenes: a multimodal dataset for autonomous driving. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2020
4. Chen, X., Ma, H., Wan, J., Li, B., Xia, T.: Multi-view 3d object detection network for autonomous driving. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), July 2017
5. Kim, Y., Choi, J.W., Kum, D.: GRIF net: gated region of interest fusion network for robust 3d object detection from radar point cloud and monocular image. In: 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 10857–10864 (2020)
6. Lang, A.H., Vora, S., Caesar, H., Zhou, L., Yang, J., Beijbom, O.: PointPillars: fast encoders for object detection from point clouds. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 12697–12705 (2019)
7. Li, Y.J., Park, J., O'Toole, M., Kitani, K.: Modality-agnostic learning for radar-lidar fusion in vehicle detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 918–927, June 2022
8. Major, B., et al.: Vehicle detection with automotive radar using deep learning on range-azimuth-doppler tensors. In: IEEE/CVF International Conference on Computer Vision (ICCV) Workshops, October 2019
9. Nabati, R., Qi, H.: CenterFusion: center-based radar and camera fusion for 3d object detection. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), pp. 1527–1536, January 2021
10. Qian, K., Zhu, S., Zhang, X., Li, L.E.: Robust multimodal vehicle detection in foggy weather using complementary lidar and radar signals. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 444–453 (2021)
11. Ren, S., He, K., Girshick, R., Sun, J.: Faster R-CNN: towards real-time object detection with region proposal networks. In: Advances in Neural Information Processing Systems, vol. 28 (2015)
12. Sheeny, M., De Pellegrin, E., Mukherjee, S., Ahrabian, A., Wang, S., Wallace, A.: RADIATE: a radar dataset for automotive perception in bad weather. In: 2021 IEEE International Conference on Robotics and Automation (ICRA), pp. 1–7 (2021)
13. Shi, S., et al.: PV-RCNN++: point-voxel feature set abstraction with local vector representation for 3d object detection. arXiv preprint arXiv:2102.00463 (2021)
14. Shi, S., Wang, X., Li, H.: PointRCNN: 3d object proposal generation and detection from point cloud. In: The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2019
15. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)
16. Wang, Y., Jiang, Z., Gao, X., Hwang, J.N., Xing, G., Liu, H.: RODNet: radar object detection using cross-modal supervision. In: IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), pp. 504–513, January 2021
17. Wu, Y., Kirillov, A., Massa, F., Lo, W.Y., Girshick, R.: Detectron2. https://github.com/facebookresearch/detectron2 (2019)
18. Yang, B., Guo, R., Liang, M., Casas, S., Urtasun, R.: RadarNet: exploiting radar for robust perception of dynamic objects. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12363, pp. 496–512. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58523-5_29
19. Yang, B., Luo, W., Urtasun, R.: PIXOR: real-time 3d object detection from point clouds. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2018
20. Yang, Z., et al.: Focal and global knowledge distillation for detectors. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 4643–4652 (2022)
21. Zhu, X., et al.: Cylindrical and asymmetrical 3d convolution networks for lidar-based perception. IEEE Trans. Pattern Anal. Mach. Intell. 44(10), 6807–6822 (2021)

Enhancing Active Visual Tracking Under Distractor Environments

Qianying Ouyang1,2, Chenran Zhao2,3, Jing Xie1,2, Zhang Biao2,3, Tongyue Li1,2, Yuxi Zheng1,2, and Dianxi Shi1,2(B)

1 Intelligent Game and Decision Lab (IGDL), Beijing, China
{oyqy,dxshi}@nudt.edu.cn
2 Tianjin Artificial Intelligence Innovation Center (TAIIC), Tianjin, China
3 College of Computer, National University of Defense Technology, Changsha, China

Abstract. Active Visual Tracking (AVT) faces significant challenges in distracting environments characterized by occlusions and confusion. Current methodologies address this challenge through the integration of a mixed multi-agent game and Imitation Learning (IL). However, during the IL phase, if the training data generated by the teacher for the student lacks diversity, the performance of the student visual tracker degrades noticeably. Furthermore, existing works neglect visual occlusion caused by distractors beyond the collision distance. To enhance AVT performance, we introduce a novel method. Firstly, to tackle the limited-diversity issue, we propose an intrinsic reward mechanism known as Asymmetric Random Network Distillation (AS-RND). This mechanism fosters target exploration, augmenting the variety of states among trackers and distractors, thereby enriching the heterogeneity of the visual tracker's training data. Secondly, to address visual occlusion, we present a distractor-occlusion avoidance reward predicated on the positional distribution of the distractors. Lastly, we integrate a classification score map prediction module to bolster the tracker's discriminative abilities. Experiments show that our approach significantly outperforms previous AVT algorithms in a complex distractor environment.

Keywords: Active Tracking · Reinforcement Learning · Imitation Learning · Dynamic Obstacle Avoidance

1 Introduction

Active Visual Tracking (AVT) controls the tracker’s movement to ensure the target is always in the center of the field of view. As an emerging research direction of computer vision, AVT has many application scenarios in autonomous This work was supported by the Science and Technology Innovation 2030 Major Project under Grant No.2020AAA0104802 and the National Natural Science Foundation of China(Grant No.91948303). Supplementary material is checked from URL. Supplementary Information The online version contains supplementary material available at https://doi.org/10.1007/978-981-99-8435-0 36. c The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024  Q. Liu et al. (Eds.): PRCV 2023, LNCS 14427, pp. 452–464, 2024. https://doi.org/10.1007/978-981-99-8435-0_36



Fig. 1. (a) Diversity deficit in agent states. Depiction of target agent positions over ten episodes and agents’ trajectories within a single episode. Predominant confinement of the target to specific areas results in other agents experiencing a lack of diverse states. (b) Visual occlusion from the out-of-range distractor. Distractor was set between the tracker and stationary target, beyond collision range. We visualized teacher tracker trajectories and student tracker views in this scenario. The results reveal that the teacher tracker of TDR-AVT remains nearly stationary throughout the game, neglecting the handling of occlusions.

To tackle the challenges mentioned above, recent work has made remarkable progress [12,14]. Zhong et al. [14] (referred to as TDR-AVT) built a multi-agent game wherein the tracker engages in adversarial reinforcement learning with the distractors and the target. More importantly, TDR-AVT introduced cross-modal teacher-student learning from imitation learning [11], which trains a 2D ground-state tracker as the teacher to guide the learning of the high-dimensional visual tracker. This method greatly improves the robustness of visual trackers. However, there are inherent limitations in imitation learning and adversarial learning [11,14]. Firstly, the performance of the visual tracker is directly contingent on the quality of the samples acquired by the teacher network [10]. As shown in Fig. 1(a), we visualize the position distribution of the target across ten episodes and the trajectory within one episode. The insufficient diversity of target states within the teacher network restricts the variety of adversarial tracker [13] and cooperative distractor behaviors, leading to homogeneous training data for the student tracker. Additionally, as depicted in Fig. 1(b), the 2D teacher tracker addresses only distractors within the collision range [14,15], neglecting occlusion caused by distractors beyond this range in 3D scenarios, thereby leaving the visual tracker without a strategy to tackle such issues.


To address these problems, we propose an approach to enhance the robustness of AVT [14]. Firstly, to tackle the training-data homogeneity caused by the lack of diversity in target states, we propose an RND-based [1] intrinsic reward mechanism, the Asymmetric Random Network Distillation (AS-RND). By computing the state-encoding disparity between a more complex target network and a simpler predictor network, AS-RND uses this disparity as an intrinsic reward that perpetually stimulates the target to explore new states. Secondly, to enhance the tracker's occlusion handling, we design a Distractor-Occlusion Avoidance Reward (DOAR) that considers the distractors' position distribution and the relative angle between the tracker and each distractor. This reward encourages the tracker to adjust its field of view or maneuver around a distractor to prevent visual occlusions or collisions. Moreover, we incorporate a classification score map prediction module to emphasize target positioning while suppressing non-target disturbances. Lastly, we conduct experimental validation in the UnrealCV [9] simulation environment with distractors. Compared to the most representative active tracking methods currently available, our approach achieves superior tracking performance, tracking time, and success rate. The contributions of our work can be summarized as follows:
1. We introduce AS-RND, an intrinsic reward method promoting target exploration, leading to diversified training data and improved generalization of trackers.
2. We design a distractor-occlusion avoidance reward and a classification score map prediction module to reduce occlusions and enhance the state representation of the visual tracker.
3. We experimentally validate the efficacy and generalizability of the method in normal and unseen environments with distractors.

2 Related Work

2.1 Active Visual Tracking

Active Visual Tracking (AVT) methods are primarily categorized into two-stage and end-to-end approaches. Two-stage methods [4] first use a passive tracker, such as DaSiamRPN [16], to detect the target's position, and then apply a PID controller to manage the tracker's actions. In contrast, end-to-end methods employ reinforcement learning to map visual images directly to actions [6]. To track complex targets more effectively, AD-VAT [13] and E-VAT [2] incorporate adversarial reinforcement learning, conceptualizing the interaction between tracker and target as an adversarial relationship to enhance tracking performance: the target tries to learn how to escape the tracker while the tracker tries to keep up with the target. For more complex distracting environments, unlike Xie et al. [12], who emphasize optimizing visual perception, Zhong et al. [14] advocate a greater focus on the tracking strategy. They regard the tracker, target, and distractors as a mixed competitive-cooperative game and employ imitation learning to accelerate the policy learning of the visual tracker.


Despite the mitigations in existing methods, room for improvement remains in handling dynamic target movements, occlusions, and confusion. Therefore, we propose an approach integrating an intrinsic reward for target exploration, a distractor-occlusion reward for avoidance, and a classification map prediction for confusion resistance, aiming to further boost visual tracking performance.

2.2 Curiosity Exploration

Exploration is a crucial reinforcement learning component, encouraging the agent to explore novel states in the environment. ICM [8] introduced the exploration strategy based on prediction error for the first time. Subsequently, RND [1] further evolved this strategy by measuring the novelty of states through the discrepancy between state encoding produced by a prediction network and a fixed target network. In this paper, we apply AS-RND to incentivize target behavior to be more flexible.

3 Method

We propose an enhanced active target tracking method based on cross-modal teacher-student learning [14]. The framework, shown in Fig. 2, comprises a 2D teacher network trained with reinforcement learning and a student visual tracker. The teacher tracker (i = 1) engages in adversarial learning with the cooperative target (i = 2) and distractors (i = 3); it takes the ground relative positions as input and optimizes the network via A3C [7]. The visual tracker, on the other hand, uses the image frame as its observation and imitates the actions of the teacher tracker via experience replay.

Based on this framework, we propose three components. (1) AS-RND: to alleviate the lack of training data diversity for the visual tracker, this mechanism takes the absolute position set of the target as input and uses the encoding differences between distinct predictor and target networks as an intrinsic reward that measures state novelty. (2) Distractor-Occlusion Avoidance Reward: manages occlusions by prompting the tracker to adjust its viewing angle. (3) Classification Score Map Prediction Module: takes the dimension-reduced feature map as the score map and uses a focal loss [5] to fit the ground-truth target heatmap, thereby enhancing the discriminative capability of the visual tracker. These three mechanisms, acting in tandem with cross-modal teacher-student learning, collectively enhance the tracker's generalization performance.

Fig. 2. Overview of the framework.


3.1 AS-RND Intrinsic Reward Mechanism

We introduce AS-RND, an intrinsic reward mechanism aimed at promoting target exploration. This approach enhances the diversity of target states and of tracker and distractor behaviors, thereby enriching the training data for the visual tracker.

We first detail the input of AS-RND. Unlike the traditional RND [1], AS-RND does not process image inputs but rather the continuous position information of the target. Because the differences between consecutive single-step states are small, a single state is a poor measure of novelty; we therefore convert the states of the target agent (i = 2) over the last n steps into an absolute position set S^2_abs = {s^2_{t-n}, s^2_{t-n+1}, ..., s^2_t} and use it as input.

Next, we elaborate on the network structure of AS-RND, which comprises a more complex target network f_T with three fully connected layers and a simpler predictor network f_P with two fully connected layers. Designing networks of different complexity mitigates the premature fitting of the predictor to the target network, preventing late-stage reward decay and thus preserving the agent's exploration. Furthermore, recognizing the importance of temporal information in consecutive states, we place an LSTM before the fully connected layers to process the continuous state data. The target and predictor networks thus map S^2_abs into the respective state encodings f_T(S^2_abs) and f_P(S^2_abs).

During AS-RND training, we initialize the weights of the target and predictor networks within distinct ranges [3]. This disparity in initial weights slows down the fitting process. After initialization, the target network parameters remain fixed, while the predictor network imitates the target by minimizing a mean squared error loss. We then compute the intrinsic reward as the output discrepancy between the target and predictor networks, which measures state novelty. The intrinsic reward for the target, denoted r^in_2, is defined as:

r^{in}_2 = \| f_T(S^2_{abs}) - f_P(S^2_{abs}) \|^2    (1)

Aligned with RND's exploration principle, a small intrinsic reward r^in_2 indicates that the current state set has been explored in the past. Conversely, when confronted with unfamiliar states, the predictor network struggles to match the target network, resulting in substantial prediction errors. These errors correspond to high intrinsic rewards, incentivizing the target to continuously explore new states.
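A hedged PyTorch sketch of AS-RND is given below: a frozen, deeper target encoder and a shallower trainable predictor, both preceded by an LSTM over the last n absolute positions, with the squared encoding difference serving both as the predictor's training loss and (detached) as the intrinsic reward. Layer widths, n and the asymmetric-initialization details are illustrative.

```python
import torch
import torch.nn as nn

class StateEncoder(nn.Module):
    """LSTM over the position set followed by a small MLP; depth differs between networks."""
    def __init__(self, in_dim=2, hidden=64, out_dim=32, n_fc=3):
        super().__init__()
        self.lstm = nn.LSTM(in_dim, hidden, batch_first=True)
        layers = []
        for _ in range(n_fc - 1):
            layers += [nn.Linear(hidden, hidden), nn.ReLU()]
        layers += [nn.Linear(hidden, out_dim)]
        self.fc = nn.Sequential(*layers)

    def forward(self, s):                     # s: (B, n, in_dim) absolute position set
        _, (h, _) = self.lstm(s)
        return self.fc(h[-1])

target_net = StateEncoder(n_fc=3)             # deeper target network, kept frozen
predictor = StateEncoder(n_fc=2)              # shallower predictor, trained
for p in target_net.parameters():
    p.requires_grad_(False)

def intrinsic_reward(s_abs):
    """s_abs: (B, n, 2). Returns per-sample novelty; detach() it before use as a reward."""
    with torch.no_grad():
        z_t = target_net(s_abs)
    z_p = predictor(s_abs)
    return ((z_t - z_p) ** 2).sum(dim=1)      # also the predictor's MSE training objective
```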

3.2 Distractor-Occlusion Avoidance Reward

Fig. 3. It depicts a top-down view centered around tracker 1, with green representing the ideal position of the target within the tracker’s field of view. The color intensity of the distractor signifies its distance and angle d(1, 3j ) from the ideal target position. Our reward steers tracker 1 to a new tracking perspective at 1’ during severe occlusions. (Color figure online)

When distractors appear between the tracker and the target, as shown in Fig. 3, the tracker should actively avoid them to reduce visual obstruction. For simplicity, only the position distribution of the distractors in the current state is considered, ignoring the impact of their future motion. In addition, the occlusion risk of a distractor is only considered when it appears within the tracker's field of view θ_vis = ±45° and its distance from the tracker is smaller than the tracker-target distance. To ensure high-quality tracking while avoiding distractors, we advise the tracker to primarily adjust the relative angle θ_3^j to the distractor, minimizing its presence in the field of view; complex evasive maneuvers should only be employed when the relative angle is small, indicating severe occlusion. When multiple distractors are present, the tracker should consider their position distribution to prioritize which distractor to avoid first. Inspired by the distraction-aware reward [14], the relative distance and angle d(1, 3_j) between a distractor and the expected target position reflects the distractor's impact: the closer the distractor is to the expected position, the more severe the obstruction or confusion. Therefore, combining θ_3^j and d(1, 3_j), the distractor-occlusion reward is computed as:

r_1^{avoid} = \sum_{j=1...n} w_3^j \cdot \frac{|\theta_3^j - \theta_{vis}|}{\theta_{vis}}    (2)

where w_3^j represents the distractor's impact on the tracker and is defined as:

1 (−

1+e

1 ) d(1,3j )

w ∼ [0, 1],

(3)

Utilizing the sigmoid function's properties, $d(1, 3^j)$ is mapped to an impact factor that effectively reflects the distractor's influence. In summary, the error term $\left| \theta_3^j - \theta_{vis} \right| / \theta_{vis}$ in $r_1^{avoid}$ indicates the need for the tracker to change its angle to bypass the distractor, while $w_3^j$ reflects the distractor's significance; together they determine which distractor the tracker avoids.
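As an illustration of Eqs. (2)-(3), a small NumPy sketch of the distractor-occlusion avoidance reward is given below. The field-of-view constant and the distance filtering follow the description above, while the function signature and variable names are ours.

```python
import numpy as np

THETA_VIS = 45.0  # half field of view of the tracker, in degrees

def distractor_occlusion_reward(theta_3, d_expected, d_tracker, d_target):
    """Eqs. (2)-(3): penalty for distractors that may occlude the target.

    theta_3:    relative angles tracker -> distractor (degrees), shape (n,)
    d_expected: distance/angle score d(1, 3^j) between each distractor and
                the expected target position, shape (n,)
    d_tracker:  distances tracker -> distractor, shape (n,)
    d_target:   distance tracker -> target (scalar)
    """
    theta_3 = np.asarray(theta_3, dtype=float)
    d_expected = np.asarray(d_expected, dtype=float)
    d_tracker = np.asarray(d_tracker, dtype=float)

    # Only distractors inside the field of view and closer than the target count.
    mask = (np.abs(theta_3) <= THETA_VIS) & (d_tracker < d_target)

    # Eq. (3): a sigmoid of 1/d maps closeness to the expected position into [0, 1].
    w = 1.0 / (1.0 + np.exp(-1.0 / np.maximum(d_expected, 1e-6)))

    # Eq. (2): angular error term weighted by the impact factor.
    return float(np.sum(mask * w * np.abs(theta_3 - THETA_VIS) / THETA_VIS))
```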

3.3 Reward Structure

We propose a new reward structure based on the mixed competitive-cooperative game reward [14], incorporating an intrinsic exploration reward for the target and a distractor-occlusion reward for the tracker. (1) The tracker reward $r_1$ contains a tracking-quality component that, similar to AD-VAT, measures the distance and angle between the target and the expected position, and it replaces the typical collision penalty $r_{collision} = -1$ with $r_1^{avoid}$ to handle occlusions. (2) The target reward $r_2$ combines an escape reward from the zero-sum game with the tracker and an intrinsic reward encouraging exploration of new states. (3) The distractor reward $r_3^j$, inspired by the distraction-aware reward, encourages distractors to approach the expected position and distract the tracker. Thus, the reward function is defined as follows:

$$\begin{aligned} r_1 &= 1 - d(1, 2) - r_1^{avoid}, \\ r_2 &= -r_1 + r_2^{in} - r_{collision}, \\ r_3^j &= r_2 - d(1, j) - r_{collision}. \end{aligned} \qquad (4)$$
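A small illustrative sketch of how the three rewards of Eq. (4) could be assembled per step is shown below, assuming the distance, intrinsic, and occlusion terms are computed as described above; the function signature is ours, not the authors'.

```python
def compute_rewards(d_expected_target, r1_avoid, r2_in, r_collision, d_expected_distractors):
    """Eq. (4): rewards for tracker (1), target (2) and distractors (3^j)."""
    r1 = 1.0 - d_expected_target - r1_avoid   # tracking quality minus occlusion penalty
    r2 = -r1 + r2_in - r_collision            # adversarial escape + exploration bonus
    r3 = [r2 - d_j - r_collision for d_j in d_expected_distractors]
    return r1, r2, r3
```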

3.4 Classification Score Map Prediction Module

When training the student visual tracker, we not only use the KL loss $L_{KL}$ [14] to mimic the actions of the teacher tracker but also take measures to reduce the visual confusion caused by distractors. Specifically, we employ the weighted focal loss $L_{heat}$ [5] to suppress distractor response points on the classification confidence map, aiming to emphasize the target's center point. We take the ground-truth target center $\hat{p} = [x, y]$ and its corresponding low-resolution center $\tilde{p} = [\tilde{p}_x, \tilde{p}_y]$, and use a Gaussian kernel to generate the ground-truth heatmap $\hat{P}_{xy} = \exp\left(-\left((x - \tilde{p}_x)^2 + (y - \tilde{p}_y)^2\right)/(2\sigma_p^2)\right)$, where $\sigma_p$ is the standard deviation adapted to the target size. We convert the high-dimensional features into a single-channel classification score map $P_{xy}$ using a dimension-reducing network; the classification score map signifies the network's confidence in whether each pixel belongs to the target or to other elements (distractors or background). Gradual dimension reduction avoids the excessive loss of feature information caused by direct pooling. Finally, we compare the error between the classification score map and the ground-truth heatmap on a per-pixel basis, so that $L_{heat}$ can be expressed as:

$$L_{heat} = -\sum_{x,y} \begin{cases} (1 - P_{xy})^{\alpha} \log(P_{xy}), & \text{if } \hat{P}_{xy} = 1 \\ (1 - \hat{P}_{xy})^{\beta} (P_{xy})^{\alpha} \log(1 - P_{xy}), & \text{otherwise} \end{cases} \qquad (5)$$

Finally, the overall loss of the student visual tracker is as follows:

$$L_{all} = \lambda_1 L_{KL} + \lambda_2 L_{heat}, \qquad (6)$$

where $\lambda_1$, $\lambda_2$ are weighting factors.
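A compact PyTorch sketch of the weighted focal loss of Eq. (5) and the combined loss of Eq. (6) is given below; the values of α, β, the λ weights, and the tensor shapes are illustrative assumptions.

```python
import torch

def heatmap_focal_loss(pred, gt, alpha=2.0, beta=4.0, eps=1e-6):
    """Eq. (5): penalty-reduced focal loss between a predicted score map and a
    Gaussian ground-truth heatmap. pred, gt: tensors of shape (H, W) in [0, 1]."""
    pred = pred.clamp(eps, 1.0 - eps)
    pos = (gt == 1.0).float()                      # target center pixels
    neg = 1.0 - pos                                # everything else (distractors, background)
    pos_loss = pos * (1.0 - pred) ** alpha * torch.log(pred)
    neg_loss = neg * (1.0 - gt) ** beta * pred ** alpha * torch.log(1.0 - pred)
    return -(pos_loss + neg_loss).sum()

def total_student_loss(l_kl, pred_map, gt_map, lam1=1.0, lam2=0.1):
    """Eq. (6): overall student loss; the lambda values here are placeholders."""
    return lam1 * l_kl + lam2 * heatmap_focal_loss(pred_map, gt_map)
```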

4 Experiments and Analysis

In this section, we first contrast our approach with the baselines, then perform ablation studies on each module, and examine the influence of the intrinsic reward mechanism on the target's exploration as well as the effectiveness of the distractor-occlusion avoidance reward for the tracker. The environment and evaluation metrics are described in Appendix 2 and Appendix 3.

Table 1. Evaluating the active visual trackers on Simple Room with different numbers of scripted distractors

Methods        | Nav-1 (AR / EL / SR) | Nav-2 (AR / EL / SR) | Nav-3 (AR / EL / SR) | Nav-4 (AR / EL / SR)
DaSiamRPN+PID  | 58 / 343 / 0.45      | 45 / 321 / 0.37      | 15 / 302 / 0.27      | 3 / 259 / 0.23
ATOM+PID       | 210 / 423 / 0.77     | 193 / 403 / 0.63     | 117 / 337 / 0.40     | 103 / 318 / 0.38
DiMP+PID       | 255 / 449 / 0.76     | 204 / 399 / 0.59     | 113 / 339 / 0.38     | 97 / 307 / 0.26
SARL           | 244 / 422 / 0.64     | 163 / 353 / 0.46     | 112 / 325 / 0.46     | 87 / 290 / 0.23
SARL+          | 226 / 407 / 0.60     | 175 / 370 / 0.52     | 126 / 323 / 0.38     | 22 / 263 / 0.15
AD-VAT         | 221 / 401 / 262      | 171 / 376 / 0.50     | 52 / 276 / 0.18      | 16 / 223 / 0.16
AD-VAT+        | 220 / 418 / 0.64     | 128 / 328 / 0.35     | 96 / 304 / 0.34      | 35 / 262 / 0.18
TDR-AVT        | 357 / 471 / 0.88     | 303 / 438 / 0.76     | 276 / 427 / 0.65     | 250 / 401 / 0.54
Ours           | 386 / 485 / 0.92     | 351 / 461 / 0.86     | 314 / 443 / 0.78     | 296 / 430 / 0.66

4.1 Comparison Experiments with Baseline

Basic Environment. In these experiments, we leverage the standard AVT metrics, including Accumulated Reward (AR), Episode Length (EL), and Success Rate (SR), for evaluation. We compare the tracker's performance with two-stage [15,16] and end-to-end [6,13,14] baseline methods, within environments containing various numbers of distractors [9]. Among them, TDR-AVT [14], the current state-of-the-art method, is our most important comparison. Despite modifying the reward structure, we maintained the AR calculation from TDR-AVT, ensuring a fair comparison. Table 1 shows our method notably surpassing both two-stage and end-to-end active tracking approaches, underlining the tracker's robustness.

Unseen Environment. We validated the tracker's transferability in unseen environments distinct from the training environment. As shown in Table 2, our tracker performs exceptionally well in the Urban City environment. Despite the Parking Lot being rife with obstacles and identical-looking distractors, our approach maintains longer tracking times and higher success rates than TDR-AVT. These advantages stem largely from our novel distractor-occlusion reward, which reduces the frequency of distractors in the tracker's field of view, and from the classification score map, which notably improves the differentiation between distractors, background, and target.

4.2 Ablation Study

We discerned each module's effectiveness in ablation studies within a basic environment. In addition, the strategies of the target and the two distractors in Meta-2 are trained by the AS-RND method. As shown in Table 3, the key findings include: (1) Excluding the intrinsic reward for the target diminished the tracker's performance, which highlights the importance of diverse expert data in the imitation learning process. (2) As for the distractor-occlusion avoidance reward (DOAR), when the tracker cannot actively avoid distractors in its visual field and resorts to collision penalties alone for evasion, tracking time is negatively affected, because heavy occlusion can easily lead to loss of the target. (3) The classification score map module proved vital in augmenting the tracker's resistance to confusion, especially in high distractor frequency settings such as Meta-2. To illustrate the importance of this module, as shown in Fig. 4, we conducted a visual analysis of activation maps for scenes in the tracker's field of view that simultaneously contain both distractors and targets. Comparatively speaking, our method effectively suppresses the feature representation of distractors and enhances the tracker's sensitivity to the target location. (4) When we removed all modules (equivalent to using the baseline [14]), we observed that it struggled to handle stronger targets. This emphasizes that a tracker's performance in adversarial learning often depends on the strength of the target, and it indicates that AS-RND not only mitigates the problem of lack of diversity in training data during IL but also guides the tracker to be robust from an adversarial learning perspective. In fact, experiments reveal that without diverse training data, modules like the DOAR reward and the classification score map cannot function properly, which highlights the importance of expert data heterogeneity for IL. If AS-RND serves to increase the diversity of data, then DOAR works to optimize the tracking strategy, and the classification score map plays a more vital role when the tracker cannot promptly avoid movable distractors. Together, these experiments form a progressively layered relationship, each addressing a different aspect of the problem.

Table 2. Result of Tracker's Transferability to Unseen Environments

Methods        | Urban City (AR / EL / SR) | Parking Lot (AR / EL / SR)
DaSiamRPN+PID  | 120 / 301 / 0.24          | 52 / 195 / 0.10
ATOM+PID       | 156 / 336 / 0.32          | 113 / 286 / 0.25
DiMP+PID       | 170 / 348 / 0.38          | 111 / 271 / 0.24
SARL           | 97 / 254 / 0.20           | 79 / 266 / 0.15
SARL+          | 74 / 221 / 0.16           | 53 / 237 / 0.12
AD-VAT         | 32 / 204 / 0.06           | 43 / 232 / 0.13
AD-VAT+        | 89 / 245 / 0.11           | 35 / 166 / 0.08
TDR-AVT        | 227 / 381 / 0.51          | 186 / 331 / 0.39
Ours           | 271 / 420 / 0.66          | 183 / 343 / 0.42

Table 3. Ablation Analysis of Visual Policy

Methods        | Nav-4 (AR / EL / SR) | Meta-2 (AR / EL / SR)
Ours           | 296 / 430 / 0.66     | 221 / 403 / 0.60
w/o AS-RND     | 262 / 407 / 0.56     | 110 / 344 / 0.31
w/o DOAR       | 270 / 408 / 0.59     | 179 / 338 / 0.36
w/o score map  | 283 / 418 / 0.63     | 180 / 388 / 0.57
w/o all        | 250 / 401 / 0.54     | 86 / 310 / 0.24

Fig. 4. Activation map visualization

4.3 Analysis of the Intrinsic Reward Mechanism of AS-RND

Validation of the Target Exploration Ability. The ablation studies show that the target's flexibility crucially impacts tracker performance. Thus, we explored this via comparative experiments, using the coverage rate (CR) and the standard deviation of the residence time distribution (RTD) as metrics. The settings are as follows: (1) Without using the AS-RND mechanism, different action entropy values were set to investigate the impact of the action entropy parameter on exploration capability. (2) A comparison was made between SY-RND, which uses the same network structure for target and predictor, and AS-RND, which uses different structures, to verify the effectiveness of the asymmetric design. The results, shown in Fig. 5, indicate that simply increasing the action entropy is insufficient to enhance target flexibility, while using AS-RND yields a higher exploration rate for the targets.

Reward Signal Decay Issue. To address the problem of rapid decay of intrinsic reward signals for 2D targets, we conducted the following comparative experiments: 1) different fully connected network depths were set to examine the influence of complex (lr = 0.005) and simple (lr = 0.0001) network structures on the experimental results; 2) symmetric (lr = 0.005) and asymmetric (lr = 0.005) network structures were set to verify the necessity of asymmetric networks; 3) similar and different network parameter initialization sampling distributions [3] were set to evaluate the impact of initialization on decay speed. According to the reward curves in Fig. 6, complex networks only guarantee higher intrinsic rewards in early training; asymmetric networks capture state-set differences in later stages, encouraging targets to visit less-visited states; and different weight initializations can modulate the fitting speed and state differences.
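For reference, a simple way to compute these two exploration metrics is sketched below under our own assumptions (CR as the fraction of visited cells in a discretized arena, RTD as the standard deviation of per-cell residence times); the paper does not give exact formulas, so grid size and arena size are placeholders.

```python
import numpy as np

def exploration_metrics(positions, arena_size=10.0, grid=20):
    """positions: array (T, 2) of target positions over an episode.
    Returns (coverage_rate, rtd_std) for a grid x grid discretization."""
    positions = np.asarray(positions, dtype=float)
    # Map continuous positions into grid cells.
    cells = np.clip(((positions + arena_size / 2) / arena_size * grid).astype(int), 0, grid - 1)
    counts = np.zeros((grid, grid), dtype=float)
    for cx, cy in cells:
        counts[cx, cy] += 1
    coverage_rate = (counts > 0).mean()  # CR: higher means more of the arena visited
    rtd_std = counts.std()               # RTD: lower means more uniform residence times
    return coverage_rate, rtd_std
```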


Fig. 5. Metrics over 100 games. RTD represents target residence time deviation (lower is better); CR signifies exploration rate (higher is better).

Fig. 6. The training curve for intrinsic reward, denoted as T (target network), P (predictor network), and n (number of fully connected layers).

4.4 Analysis Experiment of Distractor-Occlusion Avoidance Reward

To validate the effectiveness of the distractor-occlusion avoidance reward, we conducted the following comparative experiments while training only the tracker: 1) setting collision penalties with different collision distances, $d_{collide} = 50$ and $d_{collide} = 250$ (the expected target distance), to assess the impact of the collision distance on avoidance effectiveness and reward; 2) comparing the Distractor-Occlusion Avoidance Reward based solely on the Angle (DOAR-A) with the one that also considers the Position distribution together with the Angle (DOAR-PA), to examine the influence of the position distribution. As shown in Fig. 7, with only a near-distance collision penalty, the distractor frequency remains largely unchanged, whereas increasing the safe collision distance and using the distractor-occlusion reward both help the tracker avoid the distractors. However, we also observe that although increasing the collision penalty distance results in better avoidance performance, it leads to a more severe decline in the tracker's reward. Trackers utilizing DOAR-A and DOAR-PA achieved similar avoidance results, but DOAR-PA maintained higher rewards than DOAR-A. This is due to the influence of the position distribution weight in DOAR-PA, which generally guides the tracker to adjust its orientation on a small scale when stationary to prevent distractors from appearing in the view; only under severe occlusion does it receive a larger penalty, promoting evasive actions. In contrast, DOAR-A tends to circumvent distractors to obtain a better view, leading to unnecessary tracking time.

Fig. 7. Left: The frequency curves of distractors during the training. Right: The accumulative reward and distractor frequency statistics of the teacher tracker.

5 Conclusion

In this paper, we introduce a method to enhance AVT in distractor environments. This method consists of an intrinsic reward mechanism, AS-RND, to encourage target exploration and mitigate data diversity issues in imitation learning; a distractor-occlusion reward for tackling visual occlusion issues; and a classification map prediction module to enhance the tracker’s ability to resist confusion. Experimental results demonstrate the enhanced robustness of our tracker over the baseline in distractor-rich UnrealCV environments. Future work may explore active tracking in more complex scenarios, potentially employing multi-sensor integration or stereo cameras to navigate distractors and obstacles.

References

1. Burda, Y., Edwards, H., Storkey, A.J., Klimov, O.: Exploration by random network distillation. arXiv preprint arXiv:1810.12894 (2018)
2. Dionigi, A., Devo, A., Guiducci, L., Costante, G.: E-VAT: an asymmetric end-to-end approach to visual active exploration and tracking. IEEE Robot. Autom. Lett. 7, 4259–4266 (2022)
3. He, K., Zhang, X., Ren, S., Sun, J.: Delving deep into rectifiers: surpassing human-level performance on ImageNet classification. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 1026–1034 (2015)
4. Kristan, M., et al.: A novel performance evaluation methodology for single-target trackers. IEEE Trans. Pattern Anal. Mach. Intell. 38, 2137–2155 (2015)
5. Lin, T.Y., Goyal, P., Girshick, R.B., He, K., Dollár, P.: Focal loss for dense object detection. In: 2017 IEEE International Conference on Computer Vision (ICCV), pp. 2999–3007 (2017)
6. Luo, W., Sun, P., Zhong, F., Liu, W., Zhang, T., Wang, Y.: End-to-end active object tracking and its real-world deployment via reinforcement learning. IEEE Trans. Pattern Anal. Mach. Intell. 42(6), 1317–1332 (2019)
7. Mnih, V., et al.: Asynchronous methods for deep reinforcement learning. In: International Conference on Machine Learning, pp. 1928–1937. PMLR (2016)
8. Pathak, D., Agrawal, P., Efros, A.A., Darrell, T.: Curiosity-driven exploration by self-supervised prediction. In: 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), pp. 488–489 (2017)
9. Qiu, W., et al.: UnrealCV: virtual worlds for computer vision. In: Proceedings of the 25th ACM International Conference on Multimedia (2017)
10. Tangkaratt, V., Han, B., Khan, M.E., Sugiyama, M.: Variational imitation learning with diverse-quality demonstrations. In: Proceedings of the 37th International Conference on Machine Learning, pp. 9407–9417 (2020)
11. Wilson, M., Hermans, T.: Learning to manipulate object collections using grounded state representations. In: Conference on Robot Learning, pp. 490–502. PMLR (2020)
12. Xi, M., Zhou, Y., Chen, Z., Zhou, W.G., Li, H.: Anti-distractor active object tracking in 3D environments. IEEE Trans. Circ. Syst. Video Technol. 32, 3697–3707 (2022)
13. Zhong, F., Sun, P., Luo, W., Yan, T., Wang, Y.: AD-VAT: an asymmetric dueling mechanism for learning visual active tracking. In: International Conference on Learning Representations (2019)
14. Zhong, F., Sun, P., Luo, W., Yan, T., Wang, Y.: Towards distraction-robust active visual tracking. In: International Conference on Machine Learning, pp. 12782–12792. PMLR (2021)
15. Zhu, W., Hayashibe, M.: Autonomous navigation system in pedestrian scenarios using a dreamer-based motion planner. IEEE Robot. Autom. Lett. 8, 3835–3842 (2023)
16. Zhu, Z., Wang, Q., Li, B., Wu, W., Yan, J., Hu, W.: Distractor-aware siamese networks for visual object tracking. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 101–117 (2018)

Cross-Modal and Cross-Domain Knowledge Transfer for Label-Free 3D Segmentation

Jingyu Zhang1, Huitong Yang2, Dai-Jie Wu2, Jacky Keung1, Xuesong Li4, Xinge Zhu3(B), and Yuexin Ma2(B)

1 City University of Hong Kong, Hong Kong SAR, China
2 School of Information Science and Technology, ShanghaiTech University, Shanghai, China [email protected]
3 Chinese University of Hong Kong, Hong Kong SAR, China [email protected]
4 College of Science, Australian National University, Canberra, Australia

Abstract. Current state-of-the-art point cloud-based perception methods usually rely on large-scale labeled data, which requires expensive manual annotations. A natural option is to explore the unsupervised methodology for 3D perception tasks. However, such methods often face substantial performance-drop difficulties. Fortunately, there exist large amounts of image-based datasets, so an alternative can be proposed, i.e., transferring the knowledge in 2D images to 3D point clouds. Specifically, we propose a novel approach for the challenging cross-modal and cross-domain adaptation task by fully exploring the relationship between images and point clouds and designing effective feature alignment strategies. Without any 3D labels, our method achieves state-of-the-art performance for 3D point cloud semantic segmentation on SemanticKITTI by using the knowledge of KITTI360 and GTA5, compared to existing unsupervised and weakly-supervised baselines.

Keywords: Point Cloud Semantic Segmentation · Unsupervised Domain Adaptation · Cross-modal Transfer Learning

1 Introduction

Accurate depth information of large-scale scenes captured by LiDAR is the key to 3D perception. Consequently, LiDAR point cloud-based research has gained popularity. However, existing learning-based methods [10,11,37] demand extensive training data and significant labor power. Especially for the 3D segmentation task, people need to assign class labels to massive amounts of points. Such high cost really hinders the progress of related research. Therefore, relieving the annotation burden has become increasingly important for 3D perception.

J. Zhang and H. Yang – Equal contribution.


Recently, plenty of label-efficient methods have been proposed for semantic segmentation. They try to simplify the labeling operation by using scene-level or sub-cloud-level weak labels [24], or by utilizing clicks or line segments as object-level or class-level weak labels [19,20]. However, these methods still cannot get rid of the dependence on 3D annotations. In contrast, annotating images is much easier due to the regular 2D representation. We aim to transfer the knowledge learned from 2D data with its annotations to solve 3D tasks. However, most related knowledge-transfer works [12,23,31] focus on domain adaptation under the same input modality, which is difficult to adapt to our task. Although recent works [27,35] begin to consider distilling 2D priors to 3D space, they use synchronized and calibrated images and point clouds and leverage the physical projection between the two data modalities for knowledge transfer, which limits the generalization capability (Fig. 1).

Fig. 1. Our method performs 3D semantic segmentation by transferring knowledge from RGB images with semantic labels from arbitrary traffic datasets.

In this paper, we propose an effective solution for a novel and challenging task, i.e., cross-modal and cross-domain 3D semantic segmentation for LiDAR point cloud based on unpaired 2D RGB images with 2D semantic annotations from other datasets. However, images and point clouds have totally different representations and features, making it not trivial to directly transfer the priors. To deal with the large discrepancy between the two domains, we perform alignment at both the data level and feature level. For data level (i.e., data representation level), we transform LiDAR point clouds to range images by spherical projection to obtain a regular and dense 2D representation as RGB images. For the feature level, considering that the object distributions and relative relationships among objects are similar in traffic scenarios, we extract instance-wise relationships from each modality and align their features from both the global-scene view and local-instance view through GAN. In this way, our method makes 3D point cloud semantic features approach 2D image semantic features and further benefits the 2D-to-3D knowledge transfer. To prove the effectiveness of our method, we conduct extensive experiments on KITTI360-to-SemanticKITTI and GTA5-to-SemanticKITTI. Our method achieves state-of-the-art performance for 3D point cloud semantic segmentation without any 3D labels.

2 Related Work

LiDAR-based Semantic Segmentation: Lidar-based 3D semantic segmentation [5,21] is crucial for autonomous driving. There are several effective representation strategies, such as 2D projection [7,33] and cylindrical voxelization [37]. However, most methods are in the full-supervision manner, which requires accurate manual annotations for every point, leading to high cost and time consumption. Recently, some research has been concentrated on pioneering labeling and training methods associated with weak supervision to mitigate the annotation workload. For instance, LESS [19] developed a heuristic algorithm for point cloud pre-segmentation and proposed a label-efficient annotation policy based on pre-segmentation results. Scribble-annotation [30] and label propagation strategies [28] were also introduced. However, these methods still require expensive human annotation on the point cloud. Several research studies [13,27,35] have explored complementary information to enhance knowledge transfer from images and facilitate 2D label utilization. Our work discards the need for any 3D labels by transferring scene and semantic knowledge from 2D data to 3D tasks, which reduces annotation time and enhances practicality for real-world scenarios. Unsupervised Domain Adaptation (UDA): UDA techniques usually transfer the knowledge from the source domain with annotations to the unlabeled target domain by aligning target features with source features, which is most related to our task. For many existing UDA methods, the source and target inputs belong to the same modality, such as image-to-image and point cloud-topoint cloud domain adaptation. For image-based UDA, there are several popular methodologies for semantic segmentation, including adversarial feature learning [12,31,34], pseudo-label self-training [9,36] and graph reasoning [17]. For point cloud-based UDA, it is more challenging due to the sparse and unordered point representation. Recent works [2,23] have proposed using geometric transformations, feature space alignment, and consistency regularization to align the source and target point clouds. In addition, xMUDA [13] utilizes synchronized 2D images and 3D point clouds with 3D segmentation labels as the source dataset to assist 3D semantic segmentation. However, xMUDA assumes all modalities to be available during both training and testing, where we relax their assumptions by requiring no 3D labels, no synchronized multi-modal datasets, and during inference, no RGB data. Therefore, existing methods are difficult to extend to solve our cross-domain and cross-modality problem. To the best of our knowledge, we are the first to explore the issue and propose an effective solution.

3 Methodology

3.1 Problem Definition

Fig. 2. The pipeline of our method with the feature extraction module, the instance-wise relationship extraction module, and the feature alignment module ($L_{IA}$, $L_{SA}$).

For the 3D semantic segmentation task, our model associates a class label with every point in a 3D point cloud $X \in \mathbb{R}^{N \times 4}$, where $N$ denotes the total number of points. The location in 3D coordinates and the reflectance intensity are provided for each point, denoted by $\{x, y, z, i\}$. Without any 3D annotations, we conduct 3D segmentation by using only 2D semantic information on RGB images from arbitrary datasets of similar scenarios. Unlike existing UDA methods that perform adaptation under the same data modality, our task is more challenging because our source and target data are from different domains and different modalities; we name it Unsupervised Domain-Modality Adaptation (UDMA). UDMA transfers knowledge from the 2D source domain (images $I_S \in \mathbb{R}^{H \times W \times 3}$ with semantic labels $Y_S \in \mathbb{R}^{H \times W \times C}$, where $H$ and $W$ are the height and width of the image, and $C$ is the number of classes) to the 3D target domain (point clouds $X_T$). For UDMA 3D segmentation, we have $\{I_S, Y_S, X_T\}$ as the input data for training. During inference, only 3D point clouds (projected to range images) are input to the model to produce 3D predictions of shape $\mathbb{R}^N$ (i.e., no RGB data are needed).

3.2 Method Overview

As shown in Fig. 2, our model consists of three important modules: Feature Extraction, Instance-wise Relationship Extraction, and Feature Alignment.

Feature Extraction: This module encodes the various input data into a certain feature representation. We apply UNet [26] as the feature extraction backbone, where the size of the output feature map remains the same as the input image size, which allows us to obtain pixel-level dense semantic features.

Instance-Wise Relationship Extraction (IRE): This module learns object-level features that encode local neighborhood information, therefore enhancing object-to-object interaction. The learned features are complementary to the mutually independent pixel-level features learned by the Feature Extraction module. Encoding more spatial information rather than domain- or modality-specific information enhances the efficiency of feature alignment.

Feature Alignment: To solve UDMA tasks (defined in Sect. 3.1), we need to narrow down the 2D-3D domain gap to transfer label information. We achieve this through adversarial learning [8] at both the scene level and the instance level, which learns to project inputs from different modalities into a common feature space and to output domain- and modality-independent features.

3.3 Data Pre-processing

LiDAR Point Cloud Pre-segmentation: Point clouds often contain valuable semantic information that can be used as prior knowledge without the need for any training. We utilize the pre-segmentation method proposed by LESS [19] to group points into components such that each component is of high purity and generally contains only one object. The top image in Fig. 3 displays the pre-segmentation result. The approach uses RANSAC [6] for ground point detection; for non-ground points, an adaptive distance threshold is utilized to identify points belonging to the same component. We then utilize component information, such as the mean or eigenvalues of the point coordinates, to identify component classes. With this heuristic approach, we can successfully identify common classes (e.g., car, building, vegetation) with a high recall rate, and rare classes (e.g., person, bicycle) with a low recall rate. Due to the limited usefulness of the low-recall classes, we finally use three prior categories, car, ground (road, sidewalk, and terrain), and wall (building and vegetation), to assist the instance feature alignment in our method. In future work, we will explore the combination of learning-based and heuristic algorithms to improve the quantity and quality of component class identification, for example, feature matching between unidentified and already identified components.

Fig. 3. Visualization of data pre-processing. Top: point cloud pre-segmentation result. Bottom left: the range image after spherical projection of the point cloud pre-segmentation. Bottom right: semantic labels in the RGB image. (Color figure online)
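The sketch below illustrates this kind of heuristic pre-segmentation (a simple RANSAC-style ground fit followed by distance-based grouping). It is a simplified stand-in for the LESS procedure, not the authors' implementation: the thresholds are arbitrary and DBSCAN is used in place of the adaptive distance threshold.

```python
import numpy as np

def ransac_ground(points, n_iter=100, dist_thresh=0.2, seed=0):
    """Fit a ground plane with a basic RANSAC loop; returns a boolean ground mask.
    points: (N, 3) array of x, y, z coordinates."""
    rng = np.random.default_rng(seed)
    best_mask = np.zeros(len(points), dtype=bool)
    for _ in range(n_iter):
        idx = rng.choice(len(points), 3, replace=False)
        p0, p1, p2 = points[idx]
        normal = np.cross(p1 - p0, p2 - p0)
        norm = np.linalg.norm(normal)
        if norm < 1e-6:
            continue
        normal = normal / norm
        dist = np.abs((points - p0) @ normal)   # point-to-plane distances
        mask = dist < dist_thresh
        if mask.sum() > best_mask.sum():
            best_mask = mask
    return best_mask

def group_components(non_ground_points, eps=0.5, min_samples=10):
    """Group non-ground points into components by spatial proximity.
    DBSCAN is used here as a stand-in for the adaptive distance threshold."""
    from sklearn.cluster import DBSCAN
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(non_ground_points)
    return labels  # -1 marks points not assigned to any component
```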

Range Image Projection: 2D and 3D data have a huge gap in their representations, e.g., structure, dimension, or compactness, which brings difficulties to knowledge transfer. To bridge the gap, we leverage another representation of the 3D point cloud, the range image [15,21,33]. 2D range images follow a data representation similar to RGB (both represented as height × width). As shown in the bottom two plots in Fig. 3, the road (in gray and olive colors) is always located at the bottom of the picture with cars on top, in both range and RGB images. The range image is generated by spherically projecting [33] 3D points onto a 2D plane:

$$u = \frac{1}{2}\left(1 - \arctan(y \cdot x^{-1})/\pi\right) \cdot U, \qquad v = \left(1 - \left(\arcsin(z \cdot r^{-1}) + f_{down}\right) \cdot f^{-1}\right) \cdot V, \qquad (1)$$

where $u \in \{1, \ldots, U\}$ and $v \in \{1, \ldots, V\}$ indicate the 2D coordinates of the projected 3D point $(x, y, z)$, and $r = \sqrt{x^2 + y^2 + z^2}$ is the range of each LiDAR point. We set $U = 2048$ and $V = 64$, which suits the Velodyne HDL-64E laser scans of the SemanticKITTI dataset [1]. 2D range images retain the depth information provided by LiDAR and offer rich image-level scene structure information, which helps scene alignment. During training, we use $[r, r, i]$ as the three input channels of the range images, where $r$ and $i$ denote the range value defined above and the reflectance intensity, respectively.
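A direct NumPy implementation of Eq. (1) might look as follows; the vertical field-of-view values are placeholders for a 64-beam sensor and would need to match the actual LiDAR, and arctan2/arcsin are used for numerical robustness.

```python
import numpy as np

def project_to_range_image(points, U=2048, V=64, fov_up_deg=3.0, fov_down_deg=-25.0):
    """Spherical projection of Eq. (1). points: (N, 4) array of x, y, z, intensity.
    Returns a (V, U, 3) range image with channels [r, r, intensity]."""
    x, y, z, intensity = points[:, 0], points[:, 1], points[:, 2], points[:, 3]
    r = np.sqrt(x ** 2 + y ** 2 + z ** 2) + 1e-8

    fov_up = np.radians(fov_up_deg)
    fov_down = np.radians(abs(fov_down_deg))
    fov = fov_up + fov_down                      # total vertical field of view f

    u = 0.5 * (1.0 - np.arctan2(y, x) / np.pi) * U
    v = (1.0 - (np.arcsin(z / r) + fov_down) / fov) * V

    u = np.clip(np.floor(u), 0, U - 1).astype(int)
    v = np.clip(np.floor(v), 0, V - 1).astype(int)

    image = np.zeros((V, U, 3), dtype=np.float32)
    image[v, u, 0] = r
    image[v, u, 1] = r
    image[v, u, 2] = intensity
    return image
```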

3.4 Instance-Wise Relationship Extraction

Based on the consideration that traffic scenarios share a similar object location distribution and object-object interaction, we extract relationship features by proposing a novel graph architecture connecting instances as graph nodes. Therefore, it enhances the feature information with object-object interaction details and benefits the following feature alignment. Our Instance-wise Relationship Extraction (IRE) consists of two parts: Instance Node Construction and Instance-level Correlation.

Instance Node Construction: With the motivation of learning object-object relationships in an image, we regard every instance as a node in the input image instead of the whole class as a node. For range images, we use pre-segmentation components as nodes, while for RGB images, we use ground truth to create node masks during training. Initial node descriptors $P_i$ are computed as follows:

$$\mathrm{Agg}(F^{pixel} \otimes Mask_i) = P_i, \quad \forall i = 1, \ldots, n \qquad (2)$$

The operation is the same for source and target data. $F^{pixel}$ represents the pixel-level dense feature map extracted by the UNet, with shape $\mathbb{R}^{H \times W \times d}$ for source images ($F_S^{pixel}$) and $\mathbb{R}^{V \times U \times d}$ for target images ($F_T^{pixel}$). The Boolean matrix $Mask_i$ denotes the binary node mask of the $i$-th component, where the value 1 marks the pixels belonging to this component; the mask has shape $H \times W$ in the source domain and $V \times U$ in the target domain. We denote by $n$ the total number of nodes in the current input image. With the element-wise matrix multiplication $\otimes$, we obtain $m_i$ nonzero pixel features for each node, with shape $\mathbb{R}^{m_i \times d}$, $i = 1, \ldots, n$. We apply an aggregation function $\mathrm{Agg}$, composed of a pooling layer followed by a linear layer, to obtain a compact node descriptor $P_i \in \mathbb{R}^d$. Then, the node descriptors are forwarded to EdgeConv to extract instance-level correlation.


Instance-Level Correlation: This submodule aims to densely connect the instance-level graph nodes and enhance recognizable location information for object interaction. After the Instance Node Construction, we embed local structural information into each node feature by learning a dynamic graph $G = (V, E)$ [32] in EdgeConv, where $V = \{1, \ldots, n\}$ and $E \subseteq V \times V$ are the vertices and edges. We build $G$ by connecting an edge between each node $P_i$ and its k-nearest neighbors (KNN) in $\mathbb{R}^d$:

$$P_i' = \mathop{\square}_{\{j \mid (i,j) \in E\}} f_{\Theta}(P_i, P_j) \qquad (3)$$

$f_{\Theta}$ learns edge features between nodes with learnable parameters $\Theta$. All local edge features of $P_i$ are then aggregated through an aggregation operation $\square$ (e.g., max or sum) to produce $P_i' \in \mathbb{R}^d$, an enhanced descriptor for node $i$ that contains rich local information from its neighbors. Instance-level correlation helps the model extract domain- and modality-independent features by focusing more on the spatial-distribution features between instances. We then expand the instance features by replicating each $P_i'$ $m_i$ times to obtain a node-level feature map $F^{node}$ of the same size as $F^{pixel}$ (we omit the domain subscript here since the operation is the same for source and target data). We concatenate the pixel-wise feature $F^{pixel}$ with the instance-wise feature $F^{node}$ to form $F$, which keeps both dense individual information and local-neighborhood information to assist the feature alignment in the next section.
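A minimal PyTorch sketch of this instance-wise relationship extraction (masked pooling for node descriptors as in Eq. (2), a kNN graph, and an EdgeConv-style update with max aggregation as in Eq. (3)) is given below; the feature dimension, the choice of k, and the edge MLP are illustrative assumptions.

```python
import torch
import torch.nn as nn

class InstanceRelation(nn.Module):
    """Sketch of IRE: node construction (Eq. 2) + instance-level correlation (Eq. 3)."""

    def __init__(self, d=64, k=4):
        super().__init__()
        self.k = k
        self.agg_linear = nn.Linear(d, d)           # Agg = pooling followed by a linear layer
        self.edge_mlp = nn.Sequential(              # f_Theta applied to (P_i, P_j - P_i)
            nn.Linear(2 * d, d), nn.ReLU(), nn.Linear(d, d))

    def node_descriptors(self, feat, masks):
        # feat: (H*W, d) flattened pixel features; masks: (n, H*W) boolean node masks.
        nodes = []
        for m in masks:
            pooled = feat[m].mean(dim=0)            # average-pool the node's pixels
            nodes.append(self.agg_linear(pooled))
        return torch.stack(nodes)                   # (n, d) node descriptors P_i

    def correlate(self, P):
        # Build a kNN graph over node descriptors and apply an EdgeConv-style update.
        n = P.size(0)
        dist = torch.cdist(P, P)                                   # (n, n) pairwise distances
        knn = dist.topk(min(self.k + 1, n), largest=False).indices[:, 1:]  # drop self
        center = P.unsqueeze(1).expand(-1, knn.size(1), -1)
        edge_in = torch.cat([center, P[knn] - center], dim=-1)
        edge_feat = self.edge_mlp(edge_in)                         # edge features f_Theta(P_i, P_j)
        return edge_feat.max(dim=1).values                         # max aggregation -> enhanced P_i'
```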

3.5 Feature Alignment

To transfer the knowledge obtained from RGB images to range images, we make the features of range images approach those of RGB images from two aspects: global scene-level alignment and local instance-level alignment.

Scene Feature Alignment (SA): Traffic scenarios in range and RGB images follow a similar spatial distribution and object-object interaction (e.g., cars and buildings are located above the road). Therefore, we align the global scene features of the two modalities. Specifically, we implement the unsupervised alignment through Generative Adversarial Networks (GAN) [8]. Adversarial training of the segmentation network can decrease the feature distribution gap between the source and target domain. Given the ground-truth source labels, we train our model with the objective of producing source-like target features; in this way, we maximize the utilization of low-cost RGB labels by transferring them to the target domain. This is realized by two sub-processes [22]: train the generator (i.e., the segmentation network) with target data to fool the discriminator, and train the discriminator with both target and source data to discriminate features from the two domains. In Scene Feature Alignment, we use one main discriminator to discriminate the global features of the entire scene.

Instance Feature Alignment (IA): Accurate classification of each point is critical for 3D semantic segmentation. We further design fine-grained instance-level feature alignment for the two domains to improve the class recognition performance. To perform the instance-level alignment, we build extra category-wise discriminators to align source and target instance features. For target range images, as no labels are available, we use the prior knowledge described in Sect. 3.3 to filter prior categories, while for RGB we use the ground truth. Then, we compute target features and align them with the corresponding source features of each prior category.
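The two sub-processes can be written as a simple alternating training step, sketched below in PyTorch with a binary cross-entropy discriminator; the optimizers, the discriminator architecture, and the loss weight are assumptions, not the authors' settings.

```python
import torch
import torch.nn.functional as F

def alignment_step(G, D, opt_g, opt_d, range_img, rgb_img, lambda_adv=0.001):
    """One scene-level adversarial alignment step (sketch).
    G: network producing per-image global features; D: discriminator mapping
    features to a domain logit (1 = source/RGB, 0 = target/range)."""
    feat_t = G(range_img)   # target (range) features
    feat_s = G(rgb_img)     # source (RGB) features

    # (1) Generator step: make target features look like source features.
    opt_g.zero_grad()
    logits_t = D(feat_t)
    g_loss = lambda_adv * F.binary_cross_entropy_with_logits(
        logits_t, torch.ones_like(logits_t))
    g_loss.backward()
    opt_g.step()

    # (2) Discriminator step: separate source (label 1) from target (label 0) features.
    opt_d.zero_grad()
    logits_s = D(feat_s.detach())
    logits_t = D(feat_t.detach())
    d_loss = F.binary_cross_entropy_with_logits(logits_s, torch.ones_like(logits_s)) \
           + F.binary_cross_entropy_with_logits(logits_t, torch.zeros_like(logits_t))
    d_loss.backward()
    opt_d.step()
    return g_loss.item(), d_loss.item()
```

The instance-level alignment follows the same pattern with one discriminator per prior category.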

3.6 Loss

Our model is trained with three loss functions: one supervised cross-entropy loss (CE loss) on the source domain and two unsupervised feature alignment losses. Formally, we compute the CE loss for each pixel $(h, w)$ in the image and sum over all pixels:

$$L_{CE}(I_S, Y_S) = -\sum_{h} \sum_{w} \sum_{c} Y_S^{(h,w,c)} \log P_S^{(h,w,c)}, \qquad (4)$$

where $h \in \{1, \ldots, H\}$, $w \in \{1, \ldots, W\}$, and the class $c \in \{1, \ldots, C\}$. $P_S$ is the prediction probability map of the source image $I_S$. For Scene Feature Alignment (SA), the adversarial loss is imposed on the global target and source features $F_T$, $F_S$:

$$L_{SA}(F_S, F_T; G, D) = -\log D(F_T) - \log D(F_S) - \log\left(1 - D(F_T)\right). \qquad (5)$$

The output of the discriminator $D(\cdot)$ is the domain classification, which is 0 for target and 1 for source. In SA, we use one main discriminator to discriminate range and RGB scene global features. For Instance Feature Alignment (IA),

$$L_{IA}(F_S^E, F_T^E; G, D^E) = \sum_{e=1}^{E} \left( -y_T^e \log D^e(F_T^e) - y_S^e \log D^e(F_S^e) - y_T^e \log\left(1 - D^e(F_T^e)\right) \right), \qquad (6)$$

where $F_T^e$ and $F_S^e$ are the target and source features for each prior category $e$. $y_S^e = 1$ if category $e$ occurs in the current source image, otherwise $y_S^e = 0$, and similarly for $y_T^e$; therefore, we only compute losses for the categories that are present in the current image. We use fine-tuning to further improve the performance: we apply the weak label loss [19] (Eq. 7) for the prior categories that contain more than one class, and the CE loss $L_{CE}(X_T^e, Y_T^e)$ for the prior category that has a fine-grained class identification, where $X_T^e \in \mathbb{R}^{1 \times n_e \times 3}$ and $Y_T^e \in \mathbb{R}^{1 \times n_e \times C}$. In our experiments, the weak label loss is applied to ground and wall, and the CE loss is applied to car:

$$L_{weak} = -\frac{1}{n_e} \sum_{i=1}^{n_e} \sum_{k_i^c = 0} \log\left(1 - p_i^c\right), \qquad (7)$$

where $n_e$ indicates the number of pixels in the range image belonging to this prior category, $k_i^c = 0$ if class $c$ is not in this prior category, and $p_i^c$ is the predicted probability of range pixel $i$ for class $c$. By using weak labels, we only punish negative predictions.
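A small PyTorch sketch of the weak-label loss of Eq. (7), which penalizes only the probability mass assigned to classes outside a prior category, could look like this; the class-index bookkeeping is our own convention.

```python
import torch

def weak_label_loss(probs, allowed_classes, eps=1e-6):
    """Eq. (7): punish probability mass on classes outside a prior category.

    probs:           (n_e, C) per-pixel class probabilities for pixels of one prior category
    allowed_classes: list of class indices belonging to this prior category (k_i^c = 1)
    """
    n_e, C = probs.shape
    k = torch.zeros(C, dtype=torch.bool, device=probs.device)
    k[list(allowed_classes)] = True
    negatives = probs[:, ~k].clamp(max=1.0 - eps)   # classes with k_i^c = 0
    return -(torch.log(1.0 - negatives)).sum(dim=1).mean()
```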

4 Experiments

4.1 Datasets

We evaluate our UDMA framework on two set-ups: KITTI360 → SemanticKITTI and GTA5 → SemanticKITTI. KITTI360 [18] contains more than 60k real-world RGB images with pixel-level semantic annotations of 37 classes. GTA5 [25] is a synthetic dataset generated from modern commercial video games, which consists of around 25k images with pixel-wise annotations of 33 classes. SemanticKITTI [1] provides outdoor 3D point cloud data with a 360° horizontal field of view captured by a 64-beam LiDAR, as well as point-wise semantic annotations of 28 classes.

Table 1. The first row presents the results of a modern segmentation backbone [4]. The middle five rows are 2D UDA methods [12,16,29,31,34]. Ours is the result of our trained model with fine-tuning.

(a) KITTI360 → SemanticKITTI

Method     | Car   | Terrain | Vegetation | Building | Sidewalk | Road  | mIoU
DeepLabV2  | 17.96 | 6.79    | 5.34       | 33.24    | 1.07     | 0.71  | 10.85
AdvEnt     | 20.34 | 10.87   | 14.48      | 34.13    | 0.06     | 0.21  | 13.35
CaCo       | 27.43 | 8.73    | 18.05      | 34.65    | 6.15     | 0.43  | 15.91
CCM        | 0.01  | 5.63    | 24.15      | 20.83    | 0        | 7.11  | 9.62
DACS       | 19.35 | 7.44    | 0          | 0.13     | 0        | 6.64  | 5.59
DANNet     | 15.30 | 0       | 13.75      | 31.46    | 0        | 0     | 10.09
Ours       | 45.05 | 21.50   | 2.18       | 45.99    | 4.18     | 28.49 | 24.57

(b) GTA5 → SemanticKITTI

Method     | Car   | Terrain | Vegetation | Building | Sidewalk | Road  | mIoU
DeepLabV2  | 18.87 | 0       | 24.19      | 17.46    | 0        | 0.97  | 10.24
AdvEnt     | 0.65  | 0       | 11.96      | 12.32    | 0        | 1.45  | 4.40
CaCo       | 12.06 | 13.85   | 17.21      | 21.69    | 3.14     | 2.61  | 11.76
CCM        | 0     | 2.18    | 20.47      | 20.14    | 0.27     | 0.83  | 7.32
DACS       | 23.73 | 0       | 17.02      | 0        | 0        | 0     | 6.79
DANNet     | 1.67  | 18.28   | 28.19      | 0        | 12.17    | 0     | 10.05
Ours       | 37.63 | 1.23    | 27.79      | 31.40    | 0.03     | 30.00 | 21.35

4.2 Metrics and Implementation Details

To evaluate the semantic segmentation of point clouds, we apply the commonly adopted metric, mean intersection-over-union (mIoU), over the $C$ classes of interest:

$$\mathrm{mIoU} = \frac{1}{C} \sum_{c=1}^{C} \frac{TP_c}{TP_c + FP_c + FN_c}, \qquad (8)$$

where $TP_c$, $FP_c$, and $FN_c$ represent the number of true positive, false positive, and false negative predictions for class $c$. We report the labeling performance over six classes that frequently occur in driving scenarios on the SemanticKITTI validation set.

Implementation Details: We use the original size of KITTI360 and resize GTA5 images to [720, 1280] for training. Our segmentation network is trained using the SGD [3] optimizer with a learning rate of $2.5 \times 10^{-4}$. We use the Adam [14] optimizer with a learning rate of $10^{-4}$ to train the discriminators, which are fully connected networks with a hidden layer of size 256.
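Eq. (8) translates directly into a few lines of NumPy, shown here for completeness (the ignore-label handling is our own convention):

```python
import numpy as np

def mean_iou(pred, gt, num_classes, ignore_label=255):
    """Eq. (8): mIoU over num_classes. pred, gt: integer arrays of the same shape."""
    pred, gt = pred.ravel(), gt.ravel()
    valid = gt != ignore_label
    pred, gt = pred[valid], gt[valid]
    ious = []
    for c in range(num_classes):
        tp = np.sum((pred == c) & (gt == c))
        fp = np.sum((pred == c) & (gt != c))
        fn = np.sum((pred != c) & (gt == c))
        denom = tp + fp + fn
        ious.append(tp / denom if denom > 0 else 0.0)
    return float(np.mean(ious))
```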

4.3 Comparison

Comparison with UDA Methods: We evaluate the performance of adapting RGB images to range images under five 2D UDA state-of-the-art frameworks and one modern 2D semantic segmentation backbone (DeepLabV2). As shown in Table 1, our method outperforms all existing methods in terms of mIoU under both set-ups. Compared to AdvEnt, CaCo, and DANNet, which perform scene-level alignment through either adversarial or contrastive learning, our method achieves an 8.66–14.48 mIoU gain on KITTI360 and a 9.59–16.95 mIoU gain on GTA5. This is attributed to our instance-level alignment and instance-wise relationship extraction modules. As for CCM and DACS, they show inferior results on both source datasets (even compared to the other 2D UDA methods). DACS mixes images across domains to reduce the domain gap, which is more suitable when both domains follow the same color system; in our case, however, mixing [R, G, B] pixels with [r, r, i] range images is unreasonable and may distort the feature space. Similarly, the horizontal layout matrix used in CCM may fail in our setting, as the horizontal fields of view of range (360°) and RGB images (90°) differ. Both focus more on data-level modification, whereas our method accomplishes both data-level and feature-level alignment. In our case, KITTI360 achieves a higher mIoU, which may be attributed to the fact that KITTI360 uses real-world images, whereas the synthetic data in GTA5 may cause a larger domain gap for learning.

Table 3. Ablation Study: effectiveness of IRE, SA, IA.

Table 2. Comparison with two 3D baselines.

Method     | Road  | Sidewalk | Building | Vegetation | Terrain | Car   | mIoU
Cylinder3D | 7.63  | 5.18     | 9.30     | 19.01      | 26.39   | 31.81 | 16.55
LESS       | 6.63  | 6.88     | 9.15     | 41.15      | 23.18   | 39.42 | 21.07
Ours       | 45.05 | 21.50    | 2.18     | 45.99      | 4.18    | 28.49 | 24.57


Comparison with 3D Segmentation Methods: We evaluate Cylinder3D [37] and LESS [19] with prior labels obtained from pre-segmentation in Table 2. Cylinder3D is trained using the CE loss for car and the weak label loss for ground and wall. For LESS, we employ their proposed losses except $L_{sparse}$. Compared with them, our method delivers at least a 3.50 mIoU increase in performance. The 3D baselines may produce suboptimal outcomes, as they depend on the amount of accurate 3D label annotations, while our approach utilizes rich RGB labels. We note that 3D-based methods provide better results in some classes (e.g., cars), which may be because they leverage 3D geometric features and representations more effectively, while our range image projection causes spatial information loss. Future work will explore the possibility of modifying our framework to function directly on the 3D point cloud without the need for projection.

4.4 Ablation Studies

Thorough experiments on KITTI360 → SemanticKITTI are conducted in Table 3 to verify the effectiveness of the network modules and of fine-tuning, where w/o denotes deleting the corresponding module from our network and ∗ denotes adding fine-tuning. We validate the effectiveness of IRE, SA, and IA based on their 6.95, 3.66, and 2.17 boosts in mIoU, respectively (compared to 22.53). Through fine-tuning, we obtain another +2.04 in mIoU. We also analyze how the number of training data in the source domain affects the result in the target domain in Fig. 4, under the setting KITTI360 → SemanticKITTI. The different numbers of training data are randomly sampled from KITTI360 with the same training protocol. The mIoU is lifted by 6.13 with 55k more source data. More data allow the model to learn more comprehensive scene features capturing object interaction in driving scenarios. This demonstrates the potential efficacy of our proposed method, where further improvement in 3D mIoU is expected with more annotated 2D training data.

Fig. 4. Analysis for different numbers of training data.

5 Conclusion

In this work, we propose and address the cross-modal and cross-domain knowledge transfer task for 3D point cloud semantic segmentation from 2D labels to alleviate the burden of 3D annotations. We tackle the challenges via two complementary modules, Instance-wise Relationship Extraction and Feature Alignment. Extensive experiments illustrate the superiority and potential of our method. Acknowledgments. This work was supported by NSFC (No.62206173), Natural Science Foundation of Shanghai (No.22dz1201900).

References 1. Behley, J., et al.: Semantickitti: a dataset for semantic scene understanding of lidar sequences. In: ICCV, pp. 9297–9307 (2019) 2. Bian, Y., et al.: Unsupervised domain adaptation for point cloud semantic segmentation via graph matching. In: IROS, pp. 9899–9904. IEEE (2022) 3. Bottou, L.: Large-scale machine learning with stochastic gradient descent. In: Lechevallier, Y., Saporta, G. (eds.) Proceedings of COMPSTAT’2010. PhysicaVerlag HD, pp. 177–186. Springer, Cham (2010). https://doi.org/10.1007/978-37908-2604-3 16


4. Chen, L.C., et al.: DeepLab: semantic image segmentation with deep convolutional nets, Atrous convolution, and fully connected CRFs. TPAMI 40(4), 834–848 (2017) 5. Cortinhal, T., et al.: SalsaNext: fast, uncertainty-aware semantic segmentation of lidar point clouds for autonomous driving (2020). arXiv:2003.03653 6. Fischler, M.A., et al.: Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography. CACM 24(6), 381–395 (1981) 7. Gerdzhev, M., et al.: Tornado-net: multiview total variation semantic segmentation with diamond inception module. In: ICRA, pp. 9543–9549. IEEE (2021) 8. Goodfellow, I., et al.: Generative adversarial networks. CACM 63(11), 139–144 (2020) 9. Guo, X., et al.: SimT: handling open-set noise for domain adaptive semantic segmentation. In: CVPR, pp. 7032–7041 (2022) 10. Guo, Y., et al.: Deep learning for 3D point clouds: a survey. TPAMI 43(12), 4338– 4364 (2020) 11. Hou, Y., et al.: Point-to-voxel knowledge distillation for lidar semantic segmentation. In: CVPR, pp. 8479–8488 (2022) 12. Huang, J., et al.: Category contrast for unsupervised domain adaptation in visual tasks. In: CVPR, pp. 1203–1214 (2022) 13. Jaritz, M., et al.: xMUDA: cross-modal unsupervised domain adaptation for 3D semantic segmentation. In: CVPR, pp. 12605–12614 (2020) 14. Kingma, D.P., et al.: Adam: a method for stochastic optimization. arXiv preprint: arXiv:1412.6980 (2014) 15. Langer, F., et al.: Domain transfer for semantic segmentation of LiDAR data using deep neural networks. In: IROS, pp. 8263–8270. IEEE (2020) 16. Li, G., Kang, G., Liu, W., Wei, Y., Yang, Y.: Content-consistent matching for domain adaptive semantic segmentation. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12359, pp. 440–456. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58568-6 26 17. Li, W., et al.: SIGMA: semantic-complete graph matching for domain adaptive object detection. In: CVPR, pp. 5291–5300 (2022) 18. Liao, Y., et al.: KITTI-360: a novel dataset and benchmarks for urban scene understanding in 2D and 3D. TPAMI 45, 3292–3310 (2022) 19. Liu, M., et al.: Less: Label-efficient semantic segmentation for lidar point clouds. In: Avidan, S., Brostow, G., Cisse, M., Farinella, G.M., Hassner, T. (eds.) Computer Vision – ECCV 2022. Lecture Notes in Computer Science, vol. 13699, pp. 70–89. Springer, Cham (2022) 20. Liu, Z., et al.: One thing one click: a self-training approach for weakly supervised 3D semantic segmentation. In: CVPR, pp. 1726–1736 (2021) 21. Milioto, A., et al.: RangeNet++: fast and accurate lidar semantic segmentation. In: IROS, pp. 4213–4220. IEEE (2019) 22. Paul, S., Tsai, Y.-H., Schulter, S., Roy-Chowdhury, A.K., Chandraker, M.: Domain adaptive semantic segmentation using weak labels. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12354, pp. 571–587. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58545-7 33 23. Peng, X., et al.: CL3D: unsupervised domain adaptation for cross-LiDAR 3D detection (2022). arXiv:2212.00244 24. Ren, Z., et al.: 3D spatial recognition without spatially labeled 3D. In: CVPR, pp. 13204–13213 (2021)


25. Richter, S.R., Vineet, V., Roth, S., Koltun, V.: Playing for data: ground truth from computer games. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9906, pp. 102–118. Springer, Cham (2016). https://doi.org/10. 1007/978-3-319-46475-6 7 26. Ronneberger, O., Fischer, P., Brox, T.: U-Net: convolutional networks for biomedical image segmentation. In: Navab, N., Hornegger, J., Wells, W.M., Frangi, A.F. (eds.) MICCAI 2015. LNCS, vol. 9351, pp. 234–241. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-24574-4 28 27. Sautier, C., et al.: Image-to-lidar self-supervised distillation for autonomous driving data. In: CVPR. pp. 9891–9901 (2022) 28. Shi, H., et al.: Weakly supervised segmentation on outdoor 4D point clouds with temporal matching and spatial graph propagation. In: CVPR, pp. 11840–11849 (2022) 29. Tranheden, W., et al.: DACS: domain adaptation via cross-domain mixed sampling. In: WACV, pp. 1379–1389 (2021) 30. Unal, O., et al.: Scribble-supervised lidar semantic segmentation. In: CVPR, pp. 2697–2707 (2022) 31. Vu, T.H., et al.: ADVENT: adversarial entropy minimization for domain adaptation in semantic segmentation. In: CVPR, pp. 2517–2526 (2019) 32. Wang, Y., et al.: Dynamic graph CNN for learning on point clouds. TOG 38(5), 1–12 (2019) 33. Wu, B., et al.: SqueezeSeg: convolutional neural nets with recurrent CRF for realtime road-object segmentation from 3D LiDAR point cloud. In: ICRA, pp. 1887– 1893. IEEE (2018) 34. Wu, X., et al.: DanNet: a one-stage domain adaptation network for unsupervised nighttime semantic segmentation. In: CVPR, pp. 15769–15778 (2021) 35. Yan, X., et al.: 2DPASS: 2D priors assisted semantic segmentation on LiDAR point clouds. In: Avidan, S., Brostow, G., Cisse, M., Farinella, G.M., Hassner, T. (eds.) Computer Vision - ECCV 2022. Lecture Notes in Computer Science, vol. 13688, pp. 677–695. Springer, Cham (2022). https://doi.org/10.1007/978-3-031-19815-1 39 36. Zhang, P., et al.: Prototypical pseudo label denoising and target structure learning for domain adaptive semantic segmentation. In: CVPR, pp. 12414–12424 (2021) 37. Zhu, X., et al.: Cylindrical and asymmetrical 3D convolution networks for LiDAR segmentation. In: CVPR, pp. 9939–9948 (2021)

Cross-Task Physical Adversarial Attack Against Lane Detection System Based on LED Illumination Modulation

Junbin Fang1,2,3, Zewei Yang4, Siyuan Dai2,5, You Jiang1,2,3, Canjian Jiang1,2,3, Zoe L. Jiang6,7(B), Chuanyi Liu6,7, and Siu-Ming Yiu8

1 Guangdong Provincial Key Laboratory of Optical Fiber Sensing and Communications, Guangzhou 510632, China
2 Guangdong Provincial Engineering Technology Research Center on Visible Light Communication, and Guangzhou Municipal Key Laboratory of Engineering Technology on Visible Light Communication, Guangzhou 510632, China
3 Department of Optoelectronic Engineering, Jinan University, Guangzhou 510632, China
4 International Energy College, Jinan University, Zhuhai 519070, China
5 Department of Electrical and Computer Engineering, University of Pittsburgh, Pittsburgh, PA 15213, USA
6 School of Computer Science and Technology, Harbin Institute of Technology (Shenzhen), Shenzhen 518055, China [email protected]
7 Peng Cheng Laboratory, Shenzhen 518000, China
8 Department of Computer Science, The University of Hong Kong, Hong Kong 999077, China

Abstract. Lane detection is one of the fundamental technologies for autonomous driving, but it faces many security threats from adversarial attacks. Existing adversarial attacks against lane detection often simplify it as a certain type of computer vision task and ignore its cross-task characteristic, resulting in weak transferability, poor stealthiness, and low executability. This paper proposes a cross-task physical adversarial attack scheme based on LED illumination modulation (AdvLIM) after analyzing the decision behavior coexisting in lane detection models. We generate imperceptible brightness flicker through fast intensity modulation of LED illumination and utilize the rolling shutter effect of the CMOS image sensor to inject brightness information perturbations into the captured scene image. According to different modulation parameters, this paper proposes two attack strategies: scatter attack and dense attack. Cross-task attack experiments are conducted on the Tusimple dataset on five lane detection models of three different task types, including segmentation-based models (SCNN, RESA), point detection-based model (LaneATT), and curve-based models (LSTR, BézierLaneNet). The experimental results show that AdvLIM causes a detection accuracy drop of 77.93%, 75.86%, 51.51%, 89.41%, and 73.94% for the above models respectively. Experiments are also conducted in the physical world, and the results indicate that AdvLIM can pose a real and significant security threat to diverse lane detection systems in real-world scenarios.


Keywords: Autonomous driving · lane detection · physical adversarial attack · LED illumination modulation · cross-task

1 Introduction

With the empowerment of deep learning, autonomous driving has achieved rapid development and shown great application prospects. The lane detection system is one of the fundamental technologies in autonomous driving intelligence systems [1]; it determines the vehicle's feasible driving area by detecting the lane markings. Lane detection based on deep neural networks (DNNs) has demonstrated superior performance [2], but it also faces many security threats due to the vulnerability of DNNs. As Kurakin et al. [3] revealed, the vulnerability of DNNs can be exploited not only for digital adversarial attacks but also for physical adversarial attacks, thus threatening the security of the physical world. Physical adversarial attacks can cause lane detection systems to misidentify the lane markings, leading to serious driving safety threats such as incorrect merging, premature turning, and even entering the opposite lane.

AI vision systems in autonomous driving are often categorized as specific computer vision tasks; e.g., traffic sign recognition and pedestrian detection are considered object detection tasks [4,5]. Adversarial attacks targeting these systems can easily be converted into attacks against their corresponding computer vision tasks, achieving impressive effectiveness. However, lane detection is modeled as different computer vision tasks, including semantic segmentation, point detection, and curve fitting [6], making it difficult to unify. Existing adversarial attacks against lane detection ignore this cross-task characteristic, and the designed methods are often crude, with poor stealthiness (such as patching or painting graffiti over the entire road [7,8]) and weak transferability across diverse lane detection systems. Therefore, their practicality in the physical world is limited, and they cannot help autonomous driving identify potential vulnerabilities.

To address these issues, we analyze the decision behavior coexisting in lane detection models and point out the universal vulnerability in feature extraction. Based on this vulnerability, we propose a physical adversarial attack scheme based on LED illumination modulation (AdvLIM). We generate imperceptible brightness flicker through fast intensity modulation of LED illumination and utilize the rolling shutter effect of the CMOS image sensor to inject brightness information perturbations into the captured scene image, which has the advantages of strong transferability, good stealthiness, and high executability. According to different modulation parameters, we propose scatter attack and dense attack strategies. Cross-task attack experiments are conducted on the Tusimple dataset on five lane detection models of three different task types, including segmentation-based models (SCNN, RESA), a point detection-based model (LaneATT), and curve-based models (LSTR, BézierLaneNet). The experimental results show that AdvLIM causes a detection accuracy drop of 77.93%, 75.86%, 51.51%, 89.41%, and 73.94% for the above models, respectively. Experiments are also conducted in the physical world, and the results indicate that AdvLIM can pose a real and significant security threat to diverse lane detection systems in real-world scenarios.

The main contributions of this paper are as follows:
1. This paper analyzes the decision behavior coexisting in lane detection models, pointing out the universal vulnerability of models in feature extraction.
2. This paper proposes a physical adversarial attack based on LED illumination modulation (AdvLIM), with the advantages of strong transferability, good stealthiness, and high executability. According to different modulation parameters, this paper proposes scatter attack and dense attack strategies.
3. This paper conducts attack experiments on SCNN, RESA, LaneATT, LSTR, and BézierLaneNet, in both the digital and physical world. The experimental results show that AdvLIM has high attack success rates on lane detection models of different task types, posing a real and significant security threat.
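To illustrate the capture-side mechanism described above — a fast-flickering LED combined with a rolling-shutter CMOS sensor turns temporal brightness modulation into stripe-like perturbations in the captured image — the following is a simplified simulation sketch. The modulation frequency, duty cycle, per-row exposure model, and perturbation depth are our own illustrative assumptions, not the AdvLIM parameters.

```python
import numpy as np

def simulate_rolling_shutter_flicker(image, flicker_hz=2000.0, row_time_s=20e-6,
                                     duty=0.5, depth=0.4):
    """Inject stripe-like brightness perturbations into an image.

    image:      (H, W, 3) float array in [0, 1], the clean scene image
    flicker_hz: LED intensity modulation frequency
    row_time_s: readout time per sensor row (rolling shutter)
    duty:       fraction of each flicker period in the bright state
    depth:      relative brightness drop during the dark state
    """
    H = image.shape[0]
    t = np.arange(H) * row_time_s                        # exposure start time of each row
    phase = (t * flicker_hz) % 1.0
    row_gain = np.where(phase < duty, 1.0, 1.0 - depth)  # alternating bright/dark rows
    perturbed = image * row_gain[:, None, None]
    return np.clip(perturbed, 0.0, 1.0)
```

Feeding such a perturbed frame instead of the clean one to a lane detector is the digital-world analogue of the physical capture process described in this paper.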

2 Related Work

Based on the powerful learning and generalization ability of DNNs, researchers have proposed various lane detection models of different task types. Pan et al. [9] proposed SCNN based on the semantic segmentation task, which utilizes slice-by-slice convolution to pass spatial semantic relationships. Zheng et al. [10] introduced a variable-stride spatial convolution, which enables the model to capture long-range slice information. Unlike the above approaches, which determine the lane area through pixel semantics, Tabelini et al. [11] modeled lane detection as a point detection task, using the angle of the lane and the offsets of the sampling points as anchor boxes for regression, greatly improving detection speed. In addition, Liu et al. [12] achieved lane detection through curve fitting and utilized a transformer to make the model focus more on global context information. Recently, Feng et al. [6] used the Bézier curve to fit the lane and achieved state-of-the-art accuracy on multiple datasets. Regarding adversarial attacks against lane detection systems, Jing et al. [7] proposed a method to deceive lane detection systems by placing adversarial markings, which makes the system detect non-existent lane markings. However, such adversarial markings are unusual in real-world scenarios and can be easily detected by human observers. In addition, Sato et al. [8] proposed Dirty Road Patches: by pasting a large adversarial patch in the driving area, they caused a severe deviation in the detected lane markings. Although the generated dirty road patches resemble common road stains in real-world scenarios, the large adversarial patch is still conspicuous to human observers. Given that the entrance device of a computer vision system (the camera) is essentially an optoelectronic sensor, researchers have also proposed adversarial attacks that utilize illumination devices. Gnanasambandam et al. [13] fooled an image classifier by modulating structured illumination. Duan et al. [14] successfully attacked a traffic sign recognition system by manipulating the parameters of a laser beam. However, these methods require specific illumination equipment, and the perturbation patterns must be computed for each target object, resulting in low attack executability and high computational cost.

3 Problem Analysis

As previously mentioned, lane detection can be modeled as a variety of computer vision tasks. Based on different modeling tasks, different loss functions have been adopted for training lane detection models, resulting in different input-output mappings. Existing adversarial attacks against lane detection rely excessively on the input-output mappings of the models to generate adversarial perturbations, limiting the generalization and transferability of adversarial samples between models of different task types. To design a cross-task physical adversarial attack against lane detection, we analyze the decision behavior shared by these models. As shown in Fig. 1, each convolutional layer of the model can be seen as a mask block extracting semantic features from the input sample. Each mask block filters semantic features of the sample to activate the corresponding feature map while passing it to the next mask block. After the layer-by-layer filtering of mask blocks, the model obtains the high-level semantic features from the final feature map for decision-making. Moreover, it is worth noting that early mask blocks may focus more on low-level features [15], which is a common characteristic among different models.

Fig. 1. The decision-making process of the lane detection model

Therefore, by disrupting semantic features in the input sample, we can interfere with the feature extraction of the mask blocks and the activation of subsequent feature maps, causing lane detection models of any task type to make erroneous decisions. To achieve this, we propose AdvLIM, which generates adversarial perturbations that are both effective and stealthy based on LED illumination modulation to disrupt the semantic features in the sample.

4 The Proposed Scheme

4.1 Threat Model

In our work, we assume that the attacker has zero knowledge of the lane detection system. Under this assumption, the attacker has no access to the structure or parameters


related to the lane detection model and only knows the predicted key points of the lane markings. Furthermore, the attacker cannot manipulate the training process of the model. Instead, the attacker simply manipulates the physical-world setting to deploy an adversarial attack, with the goal of deceiving the target model into producing deviations in the predicted key points.

4.2 Physical Adversarial Attack Based on LED Illumination Modulation

Based on the vulnerability of lane detection models in feature extraction, we propose a physical adversarial attack based on LED illumination modulation (AdvLIM), as shown in Fig. 2. First, we apply high-speed on-off keying (OOK) modulation to the LED lamp, causing the environmental brightness to flicker rapidly beyond the perceptual limit of the human eye. Then, owing to the rolling shutter effect of the CMOS image sensor, the brightness perturbations are injected into the scene image during the imaging process, producing light-dark alternating adversarial stripes and thereby generating the adversarial example. The specific steps are as follows:

Fig. 2. The framework of the proposed AdvLIM on the lane detection system.

The Modulation of LED Illumination. We modulate the LED illumination at a frequency exceeding 150 Hz. Specifically, we modulate a binary pulse signal to control the on-off switching of the LED lamp, generating the brightness flicker. It is worth noting that the perceptual frequency limit of the human eye is approximately 100 Hz. Therefore, the human eye cannot perceive the brightness flicker at this frequency and only sees continuous, uniform brightness.

The Generation of the Adversarial Sample. Based on the rolling shutter effect, the CMOS image sensor exposes the scanned rows in sequence with a constant delay, as shown in Fig. 3. Although the human eye cannot perceive the high-frequency brightness flicker, this flicker will be captured by the CMOS image sensor, whose scanning frequency is higher.


Fig. 3. Based on the rolling shutter effect, (a) and (b) refer to the exposure process under modulated LED illumination and the captured white paper respectively.

As shown in Fig. 3(a), at moment t1 the CMOS image sensor exposes the first row of pixels; the LED lamp is off during this interval, so dark pixels are stored in that row of the image. At moment t2 the sensor exposes the second row of pixels; the LED lamp is on, so light pixels are stored in that row. This process continues, and the final captured scene image carries horizontal light-dark stripes. Similarly, the scene image carries vertical stripes when the CMOS image sensor scans vertically, as shown in Fig. 3(b). Assuming that the pulse repetition period of the LED control signal is T_p, the periodic width of the adversarial stripes in Fig. 3(b) can be expressed as:

\( w_{stripe} = \frac{T_p}{t_d} = \frac{pw}{t_d \cdot D} \)   (1)

where t_d represents the exposure time delay of the CMOS image sensor, while pw and D represent the pulse width and duty cycle of the LED control signal, respectively. We can further divide the adversarial stripes within one period into a light stripe and a dark stripe, whose widths are given by:

\( w_l = \frac{pw}{t_d}, \qquad w_d = \frac{pw \cdot (1 - D)}{t_d \cdot D} \)   (2)

From Eqs. (1) and (2), we can see that the pattern of the adversarial stripes is determined by the modulation parameters. That is, by adjusting the modulation parameters, adversarial stripes with a specific pattern can be generated for the target lane detection model. The destruction of semantic features caused by the adversarial stripes accumulates during the forward propagation through each mask block, leading to deviations in the predicted key points of the lane markings.
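To make the stripe geometry concrete, the following is a minimal sketch, assuming a simple multiplicative brightness model and an arbitrary row exposure delay, of how an OOK-modulated LED interacts with a rolling-shutter sensor. It is our own illustration, not the authors' released code; the function and parameter names (inject_stripes, pw_us, duty, t_d_us, dark_gain) are hypothetical.

```python
import numpy as np

# Sample the on-off LED waveform at the exposure time of each image row; rows
# exposed while the LED is off lose brightness, yielding the light-dark stripes
# whose widths follow Eqs. (1)-(2).

def inject_stripes(image, pw_us, duty, t_d_us, dark_gain=0.45):
    """image: (H, W, 3) float array in [0, 1]; returns the striped image."""
    h = image.shape[0]
    period_us = pw_us / duty                          # T_p = pw / D
    row_times = np.arange(h) * t_d_us                 # exposure time of each row
    led_on = (row_times % period_us) < pw_us          # OOK waveform sampled per row
    gain = np.where(led_on, 1.0, dark_gain)           # dark rows lose brightness
    return np.clip(image * gain[:, None, None], 0.0, 1.0)

# Stripe geometry implied by Eq. (1): rows per light-dark period
pw, duty, t_d = 100.0, 1 / 9, 10.0
print(pw / duty / t_d)        # -> 90.0 rows per period (10 light + 80 dark)
```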

4.3 Search Space for Adversarial Perturbation Parameters

Given that the attacker can only access the predicted key points of lane markings, we define the goal of AdvLIM as maximizing the matching distance of key points


between the adversarial sample and the clean sample, as shown in Eq. (3):

\( \max\, d(F(X_{adv}), F(X)) \quad \text{s.t.} \quad X_{adv} = X \circ R_\Theta, \;\; \Theta = \{\theta \mid \theta = [pw, D], \; \varepsilon_{min} \le \theta \le \varepsilon_{max}\} \)   (3)

where F(·) denotes the predicted key points output by the lane detection model, X_adv and X are the adversarial sample under modulated illumination and the clean sample under normal illumination, respectively, and R_Θ denotes the adversarial stripes with a specific pattern Θ, which depends on the modulation parameters, including the pulse width (pw), the duty cycle (D), and the physical constraints (ε).

Pulse Width (pw). The pulse width is the duration of the high-level voltage within one repetition period. With a fixed duty cycle, a larger pulse width yields a wider periodic width of the adversarial stripes, but with fewer periods.

Duty Cycle (D). The duty cycle is the ratio of the duration of the high-level voltage within one repetition period. By adjusting the duty cycle, we can modify the pattern of the adversarial stripes, effectively improving the attack success rate.

Physical Constraints (ε). The performance of AdvLIM is subject to certain physical constraints. When modifying the duty cycle in the search space {D = 1/(i+1) | 1 ≤ i ≤ 8, step = 1}, the adversarial stripes become blurry for pulse widths below 90 μs, while they become too few in number for pulse widths exceeding 1100 μs, either of which may degrade attack performance. Consequently, we restrict the modulation range of the pulse width to {pw | 90 μs ≤ pw ≤ 1100 μs}.

Fig. 4. The lane scene under normal illumination, scatter attack, and dense attack

To enhance the attack efficiency against lane detection systems, we further define the search space of the pulse width to guide the deployment of adversarial attacks, proposing two attack strategies: scatter attack and dense attack.

Scatter Attack. Under scatter attack, we modulate the LED illumination so that the adversarial stripes are scattered extensively across the entire scene, causing significant damage to spatial semantic features. To achieve this effect, we modulate the pulse width at a lower level, {pw | 90 μs ≤ pw ≤ 110 μs}, generating the adversarial sample shown in Fig. 4 (SA-Adversarial).

Dense Attack. Under dense attack, we modulate the LED illumination to cause gradient shaking and occlusion in the local space of the lane scene, densely damaging local semantic features. To achieve this effect, we modulate the pulse width at a higher level, {pw | 700 μs ≤ pw ≤ 1100 μs}, generating the adversarial sample shown in Fig. 4 (DA-Adversarial).
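Since the attacker only observes the predicted key points, deploying these strategies amounts to a black-box sweep over the feasible (pw, D) pairs of Eq. (3). The sketch below illustrates such a sweep; render_stripes, lane_model, and keypoint_distance are hypothetical placeholders for the stripe-injection step, the target detector, and the key-point matching distance, and the loop is our reading of the search space rather than the authors' procedure.

```python
import itertools

def search_modulation(clean_image, lane_model, render_stripes, keypoint_distance,
                      pw_range_us=(90, 1100), pw_step=10):
    """Enumerate (pulse width, duty cycle) pairs and keep the most damaging one."""
    duty_cycles = [1.0 / (i + 1) for i in range(1, 9)]        # D = 1/(i+1), i = 1..8
    pulse_widths = range(pw_range_us[0], pw_range_us[1] + 1, pw_step)
    ref_points = lane_model(clean_image)                      # key points on the clean sample

    best_theta, best_dist = None, -1.0
    for pw, d in itertools.product(pulse_widths, duty_cycles):
        adv_image = render_stripes(clean_image, pw_us=pw, duty=d)   # X_adv = X ∘ R_Θ
        dist = keypoint_distance(lane_model(adv_image), ref_points)
        if dist > best_dist:
            best_theta, best_dist = (pw, d), dist
    return best_theta, best_dist
```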

5 Experimental Results and Discussion

5.1 Experimental Setup

Dataset for Digital World Attack. For the digital world attack, the widely used lane detection dataset Tusimple [16] is employed. The dataset primarily comprises images collected from highways, with the number of lane markings ranging from two to four, and includes 3626 training images and 2782 testing images.

Experimental Environment for Physical World Attack. The experimental scenes for the physical world attack are shown in Fig. 5(a). We employ simulated resin roads as the highway and draw white solid or dashed lines on them as the lane markings. Meanwhile, a 17.5 cm diameter, 18 W commercial LED lamp is used as the illumination source for the scenes, and a high-frequency modulator is integrated into the LED lamp to control the pulse signal, as shown in Fig. 5(b). Furthermore, we use a Huawei P40 smartphone equipped with a CMOS image sensor as the in-vehicle camera to capture the scene images under LED illumination, thus obtaining the test set for the attack experiments.

Fig. 5. The experimental environment: (a) refers to the three lane scenes in the physical world, and (b) refers to the LED lamp and the high-frequency modulator.

Target Models Under Test. Five of the most representative lane detection models of three different task types are selected as the target models: (1) segmentation-based models: SCNN and RESA; (2) a point detection-based model: LaneATT; (3) curve-based models: LSTR and BézierLaneNet. The detection accuracy is used as the evaluation metric, defined as the ratio of the number of correctly predicted key points to the total number of labeled key points. Besides the Tusimple dataset, we collect six images of the three lane scenes under normal illumination to verify the models' detection performance. Table 1 shows the detection accuracy on the Tusimple dataset and the physical sample set of lane scenes. All of our code is implemented in PyTorch, and the testing is conducted on an NVIDIA GeForce RTX 3090 platform.


Table 1. The detection accuracy on the Tusimple dataset and the physical sample set

Dataset          SCNN     RESA     LaneATT   LSTR     BézierLaneNet
Tusimple         95.66%   95.76%   94.66%    95.03%   96.29%
Physical sample  92.05%   82.27%   88.64%    81.82%   90.66%

5.2 Experimental Results of Digital World Attack

In the digital world attack experiments, the 2782 testing images of the Tusimple dataset are employed. First, we adjust the modulation parameters of the LED illumination under scatter attack and dense attack and use a CMOS image sensor to capture the adversarial stripes. Then, by adding the light and dark stripes as perturbation information to the images, we generate the corresponding adversarial samples with different modulation parameters.

Scatter Attack. To examine the performance of scatter attacks with different pulse widths, we set the duty cycle of the control signal to 1/9 and vary the pulse width in the search space {pw | 90 μs ≤ pw ≤ 110 μs, step = 10} in the first round of scatter attack experiments. In the second round, we focus on the performance of scatter attacks with different duty cycles: we set the pulse width to 100 μs and vary the duty cycle in the search space {D = 1/(i+1) | 1 ≤ i ≤ 8, step = 1}. A subset of the adversarial samples generated under scatter attack and the test results of the two rounds of experiments are shown in Fig. 6.

Fig. 6. The experimental results of scatter attacks in the digital world: (a) shows a subset of the adversarial samples, while (b) and (c) show the detection accuracy under scatter attack with different pulse widths and duty cycles, respectively.


As shown in Fig. 6(a), due to the small pulse width, the narrow and numerous adversarial stripes are scattered extensively across the entire scene under scatter attack. From the results in Fig. 6(b), we observe that scatter attacks with different pulse widths exhibit stable attack performance, achieving detection accuracy drops of 71.93%, 57.29%, 43.97%, 62.34%, and 72.11% for SCNN, RESA, LaneATT, LSTR, and BézierLaneNet, respectively. Moreover, the detection accuracy of scatter attacks with different duty cycles fluctuates somewhat, as shown in Fig. 6(c). Notably, the scatter attack with a duty cycle of 1/3 causes detection accuracy drops of 77.93% and 56.83% for SCNN and RESA, respectively; the scatter attack with a duty cycle of 1/2 causes a drop of 51.51% for LaneATT, while the scatter attack with a duty cycle of 1/5 causes drops of 62.34% and 71.97% for LSTR and BézierLaneNet, respectively.

Dense Attack. To examine the performance of dense attacks with different pulse widths, we again set the duty cycle of the control signal to 1/9 but vary the pulse width in the search space {pw | 700 μs ≤ pw ≤ 1100 μs, step = 200} in the first round of experiments. In the second round, we focus on the performance of dense attacks with different duty cycles: we set the pulse width to 1100 μs and vary the duty cycle in the search space {D = 1/(i+1) | 1 ≤ i ≤ 8, step = 1}. A subset of the adversarial samples generated under dense attack and the test results of the two rounds of experiments are shown in Fig. 7.

Fig. 7. The experimental results of dense attacks in the digital world: (a) shows a subset of the adversarial samples, while (b) and (c) show the detection accuracy under dense attack with different pulse widths and duty cycles, respectively.

As shown in Fig. 7(a), the adversarial stripes under dense attack are wider than those under scatter attack, causing gradient shaking and occlusion in local regions of the scene. The results in Fig. 7(b) and Fig. 7(c) indicate that the adversarial samples generated under dense attack can effectively attack models of different task types. Specifically, for dense attacks with different pulse widths, the


dense attack with a pulse width of 700 μs decreases the detection accuracy of SCNN by 66.42%; the dense attack with a pulse width of 900 μs decreases the detection accuracy of LaneATT and LSTR by 43.12% and 74.70%, respectively, while the dense attack with a pulse width of 1100 μs decreases the detection accuracy of RESA and BézierLaneNet by 73.35% and 50.86%, respectively. For dense attacks with different duty cycles, the dense attack with a duty cycle of 1/5 decreases the detection accuracy of SCNN, RESA, LaneATT, and LSTR by 62.72%, 75.86%, 48.14%, and 89.41%, respectively, while the dense attack with a duty cycle of 1/7 decreases the detection accuracy of BézierLaneNet by 73.94%. From the overall experimental results, we find that AdvLIM demonstrates significant effectiveness and strong transferability on lane detection models, mainly because the adversarial stripes add a large amount of edge information to the scene, which disrupts the continuity of the original semantic features, affects the early mask blocks' extraction of low-level features, and leaves the subsequent mask blocks unable to make accurate decisions. In addition, different attack strategies have distinct impacts on lane detection models. Dense attack exhibits better performance than scatter attack on RESA, LSTR, and BézierLaneNet. However, since SCNN utilizes slice-by-slice convolution to pass spatial semantic relationships between rows and columns, while LaneATT incorporates spatial prior constraints through anchor boxes, the weight of the spatial semantic features that these two models focus on may be greater than that of local semantic features, rendering scatter attack more effective than dense attack on them.

5.3 Experimental Results of Physical World Attack

To verify the executability of AdvLIM in reality, we conduct attack experiments in the physical world. According to the experimental results of the digital world attack, AdvLIM with a duty cycle of 1/5 performs the best on all lane detection models. Therefore, in the physical world attack experiments, we first set the duty cycle to 1/5 and adjust the pulse width under scatter attack and dense attack to deploy AdvLIM on the three lane scenes shown in Fig. 5(a).

Scatter Attack. We collect 10 images for each of the three lane scenes with pulse widths of 90 μs, 100 μs, and 110 μs, and conduct attack experiments on the lane detection models. Table 2 gives the detection accuracy under scatter attacks.

Table 2. The experimental results of scatter attacks in the physical world

Pulse width  SCNN    RESA    LaneATT  LSTR     BézierLaneNet
90 µs        0.68%   9.20%   0.00%    8.52%    0.00%
100 µs       0.00%   1.70%   0.00%    15.80%   0.00%
110 µs       0.45%   2.05%   0.00%    10.80%   0.00%


As shown in Table 2, scatter attacks with different pulse widths are all effective, significantly decreasing the detection accuracy of the lane detection models. The detection accuracy of SCNN, RESA, LaneATT, LSTR, and BézierLaneNet can be decreased from 92.05%, 82.27%, 88.64%, 81.82%, and 90.66% to 0.00%, 1.70%, 0.00%, 8.52%, and 0.00%, respectively.

Dense Attack. We collect 10 images for each of the three lane scenes with pulse widths of 700 μs, 900 μs, and 1100 μs, and conduct attack experiments on the lane detection models. Table 3 gives the detection accuracy under dense attacks.

Table 3. The experimental results of dense attacks in the physical world

Pulse width  SCNN     RESA    LaneATT  LSTR     BézierLaneNet
700 µs       12.39%   3.41%   21.48%   6.93%    12.50%
900 µs       13.52%   7.50%   20.34%   11.70%   13.75%
1100 µs      9.20%    6.82%   24.55%   5.57%    14.20%

The results in Table 3 show that dense attacks with different pulse widths greatly decrease the detection accuracy of the models. The detection accuracy of SCNN, RESA, LaneATT, LSTR, and BézierLaneNet can be decreased to 9.20%, 3.41%, 20.34%, 5.57%, and 12.50%, respectively, indicating that AdvLIM poses a real and significant threat to lane detection models. To better illustrate the performance of AdvLIM, we visualize the predicted key points of each model with yellow pixels, as shown in Fig. 8. The visualized results show that AdvLIM affects the models to varying degrees. In the experiments, the segmentation-based models SCNN and RESA detect only a limited number of lane markings. The point detection-based model LaneATT detects lane markings with severe offsets, while the curve-based models LSTR and BézierLaneNet detect them with different degrees of distortion and deformation, which could cause serious driving safety accidents in real-world scenarios.

Fig. 8. The visualized test results of some experimental samples

6 Conclusion

In this paper, we analyze the decision behavior shared by lane detection models and point out a universal vulnerability of these models in feature extraction. Based on this vulnerability, we propose a cross-task physical adversarial attack scheme based on LED illumination modulation (AdvLIM), with the advantages of strong transferability, good stealthiness, and high executability. The scheme does not require any modification to the detected target or the victim model and can generate physical adversarial perturbations that are completely imperceptible to the human eye. We evaluate the practical effectiveness and performance of AdvLIM in both the digital and physical worlds. The experimental results show that the scheme achieves high attack success rates on lane detection models of different task types, posing a real and significant threat to diverse lane detection systems in real-world scenarios. In future work, we will further develop more effective adversarial attack schemes based on LED illumination modulation, raising industry awareness of this security threat and consequently enhancing the robustness of lane detection technology.

Acknowledgements. This work was partially supported by the National Natural Science Foundation of China (No. 62171202), the Shenzhen-Hong Kong-Macao Science and Technology Plan Project (Category C Project: SGDX20210823103537030), and the Theme-based Research Scheme of RGC, Hong Kong (T35-710/20-R).

References 1. Liang, D., Guo, Y.-C., Zhang, S.-K., Tai-Jiang, M., Huang, X.: Lane detection: a survey with new results. J. Comput. Sci. Technol. 35, 493–505 (2020) 2. Tang, J., Li, S., Liu, P.: A review of lane detection methods based on deep learning. Pattern Recogn. 111, 107623 (2021) 3. Kurakin, A., Goodfellow, I.J., Bengio, S.: Adversarial examples in the physical world. In: Artificial Intelligence Safety and Security, pp. 99–112. Chapman and Hall/CRC (2018) 4. Juyal, A., Sharma, S., Matta, P.: Traffic sign detection using deep learning techniques in autonomous vehicles. In: 2021 International Conference on Innovative Computing, Intelligent Communication and Smart Electrical Systems (ICSES), pp. 1–7. IEEE (2021) 5. Tian, D., Han, Y., Wang, B., Guan, T., Wei, W.: A review of intelligent driving pedestrian detection based on deep learning. Comput. Intell. Neurosci. 2021 (2021) 6. Feng, Z., Guo, S., Tan, X., Xu, K., Wang, M., Ma, L.: Rethinking efficient lane detection via curve modeling. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 17062–17070 (2022) 7. Jing, P., et al.: Too good to be safe: tricking lane detection in autonomous driving with crafted perturbations. In: Proceedings of USENIX Security Symposium (2021) 8. Sato, T., Shen, J., Wang, N., Jia, Y., Lin, X., Chen, Q.A.: Dirty road can attack: security of deep learning based automated lane centering under physical-world attack. In: Proceedings of the 30th USENIX Security Symposium (USENIX Security 21) (2021)


9. Pan, X., Shi, J., Luo, P., Wang, X., Tang, X.: Spatial as deep: Spatial CNN for traffic scene understanding. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 32 (2018) 10. Zheng, T., et al.: RESA: recurrent feature-shift aggregator for lane detection. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 35, pp. 3547– 3554 (2021) 11. Tabelini, L., Berriel, R., Paixao, T.M., Badue, C., De Souza, A.F., Oliveira-Santos, T.: Keep your eyes on the lane: Real-time attention-guided lane detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 294–302 (2021) 12. Liu, R., Yuan, Z., Liu, T., Xiong, Z.: End-to-end lane shape prediction with transformers. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 3694–3702 (2021) 13. Gnanasambandam, A., Sherman, A.M., Chan, S.H.: Optical adversarial attack. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 92–101 (2021) 14. Duan, R., et al.: Adversarial laser beam: Effective physical-world attack to DNNs in a blink. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 16062–16071 (2021) 15. Lee, H., Grosse, R., Ranganath, R., Ng, A.Y.: Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations. In: Proceedings of the 26th Annual International Conference on Machine Learning, pp. 609–616 (2009) 16. Tusimple homepage. https://github.com/TuSimple/tusimple-benchmark. Accessed 17 Oct 2022

RECO: Rotation Equivariant COnvolutional Neural Network for Human Trajectory Forecasting

Jijun Cheng1, Hao Wang1, Dongheng Shao1, Jian Yang2, Mingsong Chen1, Xian Wei1, and Xuan Tang1(B)

1 MoE Engineering Research Center of SW/HW Co-Design Technology and Application, East China Normal University, Shanghai 200062, China
{chengjj,51255902171}@stu.ecnu.edu.cn, [email protected], [email protected], [email protected], [email protected]
2 School of Geospatial Information, Information Engineering University, Zhengzhou 450052, China
[email protected]

Abstract. Pedestrian trajectory prediction is a crucial task for various applications, especially for autonomous vehicles that need to ensure safe navigation. To perform this task, it is essential to understand the dynamics of pedestrian motion and account for the uncertainty and multimodality of human behaviors. However, existing methods often produce inconsistent and unrealistic predictions due to their limited ability to handle different orientations of pedestrians. In this paper, we propose a novel approach that leverages the cyclic rotation group C4 to endow convolutional neural networks with rotation equivariance. This property enables the networks to learn features that transform consistently under rotations, thus maintaining consistent output under various orientations. We present the Rotation Equivariant COnvolutional Neural Network (RECO), a model specifically designed for pedestrian trajectory prediction using rotation equivariant convolutions. We evaluate our model on challenging real-world human trajectory forecasting datasets and show that it achieves competitive performance compared to state-of-the-art methods.

Keywords: Human Trajectory Forecasting · Multimodality · Uncertainty · Rotation Equivariant Convolution

1 Introduction

Historical data often suggest future trends for certain events. Analyzing historical data and predicting future situations is of great practical importance. With the development of deep learning, extracting features from sequence data and using them to predict the future has attracted lots of research efforts in many fields, such as weather prediction [1], traffic forecasting [2], and physical simulation [3]. However, in contrast to inferences with physical laws or specific regulations,


pedestrians are complex agents that engage in various behaviors and actions to achieve their goals [4]. Predicting human behavior, such as motivation, body pose, and speed of motion, still faces significant challenges. Traditional works use deterministic models to predict pedestrian trajectories [5]. However, in the real world, pedestrian trajectories exhibit distinct multimodal properties, which means that even within the same scene, a pedestrian may follow diverse trajectories [6]. Deterministic models cannot capture the uncertainty of trajectories, and this drawback becomes more prominent in long-term trajectory prediction. In recent work, data-driven solutions [7] have provided highly dynamic predictions with impressive results. Nevertheless, these models tend to make inconsistent predictions. A model that predicts pedestrian trajectories on the road needs to learn and understand the physical and psychological factors that influence pedestrian behaviors and ensure the accuracy of the prediction results. In addition, the model should correctly identify the interaction relationships and states among pedestrians, such as following and avoiding, and then provide equivariant trajectory outputs for pedestrians moving in different directions [8]. Data augmentation is useful for rotation equivariance but requires a longer training time. Furthermore, data augmentation requires sampling from an infinite number of possible angles, as rotation is a continuous group. In contrast, with rotation-equivariant methods, neural networks can make consistent predictions without increasing the number of parameters or the training time (Fig. 1).

Fig. 1. Pedestrian tracks in different directions in the same scene. Models with rotation equivariance should output consistent trajectories. The orange line indicates the short-term trajectory and the blue line indicates the long-term trajectory. (Color figure online)

Considering the equivariant nature of pedestrian paths, we propose to encode the uncertainty of human interaction and predict pedestrian trajectories in a multimodal manner. Concretely, considering that pedestrians moving in different directions exhibit similar behavioral interactions, and inspired by group equivariant neural networks [9], we propose to use rotation equivariant convolutions for pedestrian trajectory prediction.


We predict more than one trajectory per pedestrian to describe the uncertainty of the trajectory, which performs better than classical models. Overall, we propose a novel approach for uncertain, multimodal pedestrian trajectory prediction using a Rotation Equivariant COnvolutional neural network (RECO). We benchmark RECO's performance on the Stanford Drone Dataset [10]. Our method achieves on-par or better prediction accuracy, with better sample efficiency and generalization ability, than baseline models. Our contributions are as follows:

• We propose RECO, a multimodal approach to predict short-term and long-term pedestrian trajectories. Our model captures the similar interactions of pedestrians through rotation equivariance and uses them to predict pedestrian trajectories.
• We design RECO considering the invariance of diverse traffic scenes, especially across different directions at crossroads, and construct a rotation equivariant neural network that encodes features in different directions.
• On the benchmark Stanford Drone Dataset, RECO achieves competitive accuracy, and we show that rotation equivariance can improve existing pedestrian trajectory prediction methods.

2 Related Work

2.1 Trajectory Analysis and Forecasting

The application of classical dynamic models to vehicle and human trajectory prediction has a long history. For vehicle trajectories, classic models such as the Intelligent Driver Model [11] have been used for a long time. Classical prediction methods are mainly designed for basic short-term trajectory prediction. Vehicle trajectory prediction models based on deep learning have recently shown significant performance. Owing to their ability to learn from time series, Recurrent Neural Networks (RNNs) have been studied for vehicle trajectory prediction. For human trajectory forecasting, Social LSTM [7] developed social pooling, a classical recurrent-neural-network framework that can jointly infer across multiple individuals to predict human trajectories. However, methods based on neural networks are always data-driven, which may at times generate inconsistent and unrealistic predictions.

2.2 Multimodality Forecast

In the real world, human behavior is inherently multimodal and stochastic. Pedestrians follow different paths in the same scenario. A deterministic model with a single prediction result cannot capture this multimodal property. Previous trajectory prediction methods mainly focused on a single prediction for the future. Recently, generative models have been used to achieve multimodality. Trajectron++ [12] is a graph-structured recurrent model that considers interactions between different agents and forecasts trajectories of agents with dynamics and heterogeneous data. SocialVAE [6] combines a variational autoencoder with


a social attention mechanism and uses stochastic recurrent neural networks to encode the stochasticity and multimodality of motivation forecasts. Another multimodal trajectory forecasting approach is Y-net [13], which proposes to decompose the uncertainty of human motivation into epistemic and aleatoric sources and makes discrete trajectory predictions over an extended time horizon.

2.3 Group Equivariant Deep Learning

Rotation equivariance is a desirable property for neural networks that deal with data that can appear in different orientations. Without rotation equivariance, data augmentation is widely used to enhance the robustness of neural networks. In contrast, rotation equivariant neural networks can encode rotational symmetry with fewer parameters and achieve excellent results. [14,15] applied group equivariance to the 3D object detection task. Cohen et al. [9] proposed G-CNNs, a seminal work in group equivariant deep learning. G-CNNs use G-convolutions and achieve a higher degree of weight sharing than traditional neural networks; for discrete groups such as translations, reflections, and rotations, they incur only modest computational overhead. In subsequent work, equivariant methods have been applied to fluid dynamics prediction, where Wang et al. [16] introduced symmetries into convolutional neural networks to improve accuracy and generalization in fluid flow prediction. We observe that rotational symmetry is often present in road traffic scenarios, especially at intersections, yet most previous related works ignore this property. For this reason, we propose RECO for predicting the future trajectories of pedestrians.

3 Approach

Problem Formulation: The problem of multimodal trajectory prediction is formulated as follows. Let s_t = (x_t, y_t) ∈ R^2 denote the two-dimensional coordinates of one agent at step t. Given a scene clip I containing N agents' observed trajectories {X_i}_{i=1}^{N}, where X_i = (s_1, s_2, ..., s_{t_p})^T represents the i-th agent's observed trajectory over the past t_p = n_p / FPS seconds sampled at frame rate FPS, trajectory prediction aims to forecast the possible future coordinates {Ŷ_i}_{i=1}^{N}, where Ŷ_i = (ŝ_{t_p+1}, ŝ_{t_p+2}, ..., ŝ_{t_p+t_f})^T denotes one predicted trajectory over the corresponding future t_f steps, given the observations and the interactive context.

3.1 Method Overview

Our goal is to predict the future trajectories of pedestrians by incorporating scene information and the historical trajectory information of pedestrians. The overall model is shown in Fig. 2. We model the uncertainty of pedestrians' future goals and trajectories by fully considering the rotational symmetry of pedestrian motion on the road, especially in scenes such as intersections.

3.2 Rotation Equivariant Convolutional Network

Inspired by [13], we decompose the human trajectory prediction problem into two components. The first component is goal prediction: there is uncertainty in the target and waypoints of human walking, so we first predict the possible goals and waypoints of pedestrians in the future. Considering that multiple destinations are plausible, we set a parameter Kg as the number of predicted destinations. The second component is trajectory prediction, which models the uncertainty in the paths that pedestrians take to reach their goals. We assume that the pedestrian's goal is fixed and predict multiple possible trajectories for each goal; we set a parameter Kt as the number of predicted trajectories. Thus, we decouple the human trajectory prediction problem into goal prediction and trajectory prediction. We construct the entire trajectory prediction network using rotation equivariant convolutions to fully account for the rotational symmetry of the traffic scenario. We describe the trajectory prediction process and each sub-network of the model separately in the following.
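As one possible way to realize the goal-prediction component, the snippet below samples Kg candidate goals from a predicted goal heatmap by treating it as a categorical distribution over pixels. This is a plausible sketch under our own assumptions; the paper does not spell out its sampling procedure here, and the function name and temperature parameter are hypothetical.

```python
import torch

def sample_goals(goal_heatmap, k_g=5, temperature=1.0):
    """goal_heatmap: (H, W) unnormalized scores; returns (k_g, 2) pixel coords."""
    h, w = goal_heatmap.shape
    probs = torch.softmax(goal_heatmap.flatten() / temperature, dim=0)
    idx = torch.multinomial(probs, k_g, replacement=False)   # k_g distinct pixels
    return torch.stack([idx // w, idx % w], dim=-1)          # (row, col) per sampled goal
```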

Fig. 2. Model Architecture: The whole pipeline consists of four stages. First, we convert the RGB scene images and observed trajectories into scene segmentation maps and trajectory heat maps, respectively. Second, we concatenate segmentation map and trajectory heatmap and use them as the input of the rotation equivariant network. Third, we apply rotation equivariant convolutions to learn rotation-invariant features from the input. Fourth, we generate a future trajectory heat map from the output of the equivariant network and process it into a future trajectory.

3.2.1 Scene Segmentation and Trajectory Heatmap Generation

In order to make full use of scene information and historical trajectory information for predicting pedestrians' future trajectories, we first pre-process the scene image and the historical trajectory.


Given the RGB image I of the current scene, we first use U-Net [17] to segment I into a scene segmentation map S, which contains the segmentation results of C categories. At the same time, the past trajectory {s_n}_{n=1}^{n_p} of agent p is converted into a trajectory heatmap H with the same spatial dimensions as scene I. The heatmap contains n_p channels, one per time step, and can be expressed as:

\( H(n, i, j) = 2 \times \frac{\|(i, j) - s_n\|}{\max_{(x,y) \in I} \|(x, y) - s_n\|} \)   (1)

where (i, j) and (x, y) are pixel coordinates of the heatmap H and image I, respectively. The generated heatmap is concatenated with the scene segmentation map along the channel dimension to obtain the mixed representation H_S, an H × W × (C + n_p) tensor, which is used as the input to the subsequent network.
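A minimal PyTorch sketch of Eq. (1) is given below; the 2x scaling and the max-distance normalization follow the formula above, while the tensor shapes and the function name are our own assumptions.

```python
import torch

def trajectory_heatmap(traj_px, height, width):
    """traj_px: (n_p, 2) past positions in pixel coordinates (row, col)."""
    ys, xs = torch.meshgrid(torch.arange(height, dtype=torch.float32),
                            torch.arange(width, dtype=torch.float32),
                            indexing="ij")
    grid = torch.stack([ys, xs], dim=-1)                  # (H, W, 2) pixel grid
    heatmaps = []
    for s_n in traj_px:                                   # one channel per time step
        dist = torch.linalg.norm(grid - s_n, dim=-1)      # ||(i, j) - s_n||
        heatmaps.append(2.0 * dist / dist.max())          # Eq. (1)
    return torch.stack(heatmaps, dim=0)                   # (n_p, H, W)

# Example: an 8-step observed trajectory on a 64x64 scene
H = trajectory_heatmap(torch.rand(8, 2) * 64, 64, 64)     # -> torch.Size([8, 64, 64])
```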


It is noteworthy that our encoder and decoder network employs rotation equivariant convolutional layers instead of traditional 2D convolutions, distinguishing it from classical U-net and similar methods. Further details will be provided below. 3.2.3

Rotation Equivariant Convolution

Group Theory: A group (G, ·) is a set G together with a binary operation (the group law) · : G × G → G that satisfies the following axioms [18]:

Associativity: \( \forall a, b, c \in G: \; a \cdot (b \cdot c) = (a \cdot b) \cdot c \)
Identity: \( \exists e \in G : \forall g \in G, \; g \cdot e = e \cdot g = g \)
Inverse: \( \forall g \in G, \; \exists g^{-1} \in G : \; g g^{-1} = g^{-1} g = e \)   (2)

Consider the group G of rotations by multiples of π/2. This group contains four elements and is called the cyclic group of order 4; we usually denote it by G = C4:

\( C_4 = \{R_0, R_{\pi/2}, R_{\pi}, R_{3\pi/2}\} = \{R_{r\pi/2} \mid r = 0, 1, 2, 3\} \)   (3)

Note that we can identify the elements of C4 with Z/4Z, i.e., the integers modulo 4; indeed:

\( R_{r\pi/2} \cdot R_{s\pi/2} = R_{((r+s) \bmod 4)\pi/2} \)   (4)

In road traffic scenes, most orientation changes are multiples of 90°. Consequently, we choose rotation operators belonging to the cyclic group C4.
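The modular-arithmetic group law in Eqs. (3)-(4) can be checked numerically: composing two quarter-turn rotations equals a single rotation by (r + s) mod 4. The snippet below is purely illustrative and uses torch.rot90 as the rotation operator.

```python
import torch

# Verify R_r · R_s = R_{(r+s) mod 4} on an arbitrary 4x4 feature map.
x = torch.arange(16.0).reshape(1, 1, 4, 4)
for r in range(4):
    for s in range(4):
        lhs = torch.rot90(torch.rot90(x, r, dims=(2, 3)), s, dims=(2, 3))
        rhs = torch.rot90(x, (r + s) % 4, dims=(2, 3))
        assert torch.equal(lhs, rhs)
print("C4 group law verified: R_r · R_s = R_{(r+s) mod 4}")
```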

where, the action of an element g on h ∈ G is denoted as gh and g −1 h is the action of the inverse element of g. For example, if the group is a translation of an element x ∈ Z2 , then gx = x + g and g −1 x = x − g and we will have the standard convolution. Since the output function is a function of G, which indexes not only pixel positions but also rotations, it is possible to preserve that information throughout the network and thus preserve the isotropy for such transformations. The location of operations within the rotation equivariant convolutional neural network can be seen in the upper middle part of Fig. 2.

3.3 Loss Function

Our method imposes loss constraints on the predicted distributions of the goal, waypoints, and trajectory to optimize the network. Each predicted probability distribution is subjected to a binary cross-entropy loss against the corresponding ground truth (a Gaussian heatmap P̂):

\( L_{goal} = \mathrm{BCE}\big(P(s_{n_p+n_f}), \hat{P}(s_{n_p+n_f})\big) \)
\( L_{waypoint} = \sum_{i=1}^{N_w} \mathrm{BCE}\big(P(s_{w_i}), \hat{P}(s_{w_i})\big) \)
\( L_{trajectory} = \sum_{i=n_p}^{n_p+n_f} \mathrm{BCE}\big(P(s_i), \hat{P}(s_i)\big) \)   (6)

The total loss of the entire network is:

\( L = L_{goal} + L_{waypoint} + L_{trajectory} \)   (7)

4 Experiments

4.1 Dataset and Evaluation Metrics

Stanford Drone Dataset: We evaluate the performance of our proposed model on the widely used Stanford Drone Dataset [10]. The dataset has received widespread attention from researchers, and increasingly strong models have been proposed on it. It consists of over 11,000 unique individuals captured in 20 top-down scenes recorded on the Stanford University campus by a drone, i.e., in a bird's-eye view.

Evaluation Metrics: In accordance with established practice, the leave-one-out evaluation methodology is employed for fairness. The observation period for each individual consists of 8 time steps (3.2 s), and the trajectory for the next 12 time steps (4.8 s) is predicted from the observation-period data. Consistent with prior study [20], the performance of pedestrian trajectory prediction models is evaluated using two error metrics.

1. Average Displacement Error (ADE): the average Euclidean distance between the ground-truth trajectory and the predicted one,

\( \mathrm{ADE} = \frac{\sum_{i=1}^{N} \sum_{t=t_o+1}^{t_o+t_p} \left\| Y_i^{(t)} - \hat{Y}_i^{(t)} \right\|_2}{N \times t_p} \)   (8)

2. Final Displacement Error (FDE): the Euclidean distance between the ground-truth destination and the predicted one,

\( \mathrm{FDE} = \frac{\sum_{i=1}^{N} \left\| Y_i^{(t_o+t_p)} - \hat{Y}_i^{(t_o+t_p)} \right\|_2}{N} \)   (9)
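A small sketch of how Eqs. (8)-(9) are computed in practice is shown below; we assume predictions and ground truth are (N, T, 2) tensors, and the commented last line shows the usual minimum-over-K evaluation for multimodal predictions.

```python
import torch

def ade_fde(pred, gt):
    """pred, gt: (N, T, 2) predicted and ground-truth coordinates."""
    dist = torch.linalg.norm(pred - gt, dim=-1)   # (N, T) Euclidean errors per step
    ade = dist.mean().item()                      # Eq. (8): average over agents and steps
    fde = dist[:, -1].mean().item()               # Eq. (9): error at the final step only
    return ade, fde

# With K multimodal samples per agent (preds: (N, K, T, 2)), prior works report
# the minimum-error sample:
# ade = torch.linalg.norm(preds - gt.unsqueeze(1), dim=-1).mean(-1).min(1).values.mean()
```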

4.2 Implementation Details

We train the entire network end to end using the Adam optimizer with a learning rate of 1 × 10−4 and a batch size of 8. A pre-trained segmentation model is used and fine-tuned on the specific dataset.

4.3 Results

As per prior works, in cases where multiple future predictions are available, we report the prediction with the minimum error. Our proposed model, RECO, is compared against a comprehensive list of baselines, including Social GAN (S-GAN) [21], Conditional Flow VAE (CF-VAE) [20], P2TIRL [22], SimAug [23], and PECNet [24].

Table 1. Short temporal horizon forecasting results on SDD.

        S-GAN   CF-VAE  P2TIRL  SimAug  PECNet  RECO (Ours)
ADE ↓   27.23   12.60   12.58   10.27   9.96    9.17
FDE ↓   41.44   22.30   22.07   19.71   15.88   14.84

Short Term Forecasting Results: Table 1 shows the performance of our model on the SDD dataset for short-term trajectory prediction. Compared with prior works, which use 20 trajectory samples for evaluation, our model uses Kt = 1 and still achieves competitive performance: it attains an ADE of 9.17 and an FDE of 14.84, outperforming the previous state-of-the-art methods.

Fig. 3. Multimodal Pedestrian Trajectory Forecasting: We present multimodal trajectory prediction results with uncertainty under various scenarios in the SDD dataset. The example illustrates the prediction of 5 potential goal points for each agent in the future, along with 2 potential trajectories to reach the goals. The blue solid dots represent the observed past trajectories, the green solid dots depict the ground truth trajectories, the red solid dots indicate predicted future goal points, and the orange solid dots illustrate possible predicted trajectories for reaching predicted goal points.


Table 2. Long term trajectory forecasting results.

        S-GAN    PECNet   R-PECNet   RECO (Ours)
Kt      1        1        1          1        2        5
ADE ↓   155.32   72.22    261.27     71.61    62.28    53.76
FDE ↓   307.88   118.13   750.42     91.33    91.33    91.33

Long Term Forecasting Results: We propose a long-term trajectory forecasting setting with a prediction horizon up to 10 times longer than prior works (up to a minute). Table 2 shows the results of our method compared with other methods for long-term prediction, where Kt denotes the number of planned paths to the target point once the target point is determined. While the compared methods predict only one path, our method produces multiple path predictions and obtains better performance with lower ADE and FDE. Multimodal long-term trajectory prediction results with uncertainty under different scenarios are shown in Fig. 3.

Ablation Study: We conducted an ablation study to validate the effectiveness of our proposed rotation equivariant network for predicting road pedestrian trajectories. Specifically, we replaced the rotation equivariant convolutional network in our method with a conventional convolutional neural network, while keeping all other experimental settings the same. As can be seen in Fig. 4, compared with conventional 2D convolution, the use of rotation equivariant convolution results in reductions of 5.85% and 5.83% in ADE and FDE respectively, which shows the effectiveness of our proposed rotation equivariant network.

Fig. 4. Benchmarking performance against different convolutions.

5 Conclusion

In summary, we propose RECO, a rotation equivariant convolution embedded neural network, which can take advantage of rotation invariant features in traffic


scenes to better capture interactions between pedestrians from historical trajectories and predict future trajectories. Considering the uncertainty and multimodality of pedestrian trajectories on the roadway, we predict diverse potential pedestrian trajectories in both the short and the long term. Further, we conduct extensive experiments, and the results show that our long-term and short-term predictions outperform other state-of-the-art methods on the SDD dataset.

Acknowledgements. This work was supported by the Shanghai Municipality's "Science and Technology Innovation Action Plan" for the year 2023, Natural Science Fund, General Program (No. 23ZR1419300), the Shanghai Trusted Industry Internet Software Collaborative Innovation Center, the "Digital Silk Road" Shanghai International Joint Lab of Trustworthy Intelligent Software (No. 22510750100), and the National Natural Science Foundation of China (No. 42130112, No. 41901335).

References 1. Weyn, J.A., Durran, D.R., Caruana, R.: Can machines learn to predict weather? Using deep learning to predict gridded 500-hPa geopotential height from historical weather data. J. Adv. Model. Earth Syst. 11(8), 2680–2693 (2019) 2. Hu, S., Chen, L., Wu, P., Li, H., Yan, J., Tao, D.: ST-P3: end-to-end vision-based autonomous driving via spatial-temporal feature learning. In: Avidan, S., Brostow, G., Ciss´e, M., Farinella, G.M., Hassner, T. (eds.) Computer Vision-ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XXXVIII, pp. 533–549. Springer, Cham (2022). https://doi.org/10.1007/9783-031-19839-7 31 3. Sanchez-Gonzalez, A., Godwin, J., Pfaff, T., Ying, R., Leskovec, J., Battaglia, P.: Learning to simulate complex physics with graph networks. In: International Conference on Machine Learning, pp. 8459–8468. PMLR (2020) 4. Tomasello, M., Carpenter, M., Call, J., Behne, T., Moll, H.: Understanding and sharing intentions: the origins of cultural cognition. Behav. Brain Sci. 28(5), 675– 691 (2005) 5. Pellegrini, S., Ess, A., Schindler, K., Van Gool, L.: You’ll never walk alone: modeling social behavior for multi-target tracking. In: 2009 IEEE 12th International Conference on Computer Vision, pp. 261–268. IEEE (2009) 6. Xu, P., Hayet, J.B., Karamouzas, I.: SocialVAE: human trajectory prediction using timewise latents. In: Avidan, S., Brostow, G., Ciss´e, M., Farinella, G.M., Hassner, T. (eds.) Computer Vision-ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part IV. pp. 511–528. Springer, Cham (2022). https://doi.org/10.1007/978-3-031-19772-7 30 7. Alahi, A., Goel, K., Ramanathan, V., Robicquet, A., Fei-Fei, L., Savarese, S.: Social LSTM: human trajectory prediction in crowded spaces. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2016, Las Vegas, NV, USA, June 27–30, 2016, pp. 961–971. IEEE Computer Society (2016) 8. Walters, R., Li, J., Yu, R.: Trajectory prediction using equivariant continuous convolution. In: 9th International Conference on Learning Representations, ICLR 2021, Virtual Event, Austria, May 3–7, 2021. OpenReview.net (2021). https:// openreview.net/forum?id=J8 GttYLFgr


9. Cohen, T., Welling, M.: Group equivariant convolutional networks. In: International Conference on Machine Learning, pp. 2990–2999. PMLR (2016) 10. Robicquet, A., Sadeghian, A., Alahi, A., Savarese, S.: Learning social etiquette: human trajectory understanding in crowded scenes. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9912, pp. 549–565. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46484-8 33 11. Kesting, A., Treiber, M., Helbing, D.: Enhanced intelligent driver model to access the impact of driving strategies on traffic capacity. Philos. Trans. R. Soc. A Math. Phys. Eng. Sci. 368(1928), 4585–4605 (2010) 12. Salzmann, T., Ivanovic, B., Chakravarty, P., Pavone, M.: Trajectron++: dynamically-feasible trajectory forecasting with heterogeneous data. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12363, pp. 683–700. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58523-5 40 13. Mangalam, K., An, Y., Girase, H., Malik, J.: From goals, waypoints & paths to long term human trajectory forecasting. In: 2021 IEEE/CVF International Conference on Computer Vision, ICCV 2021, Montreal, QC, Canada, October 10–17, 2021, pp. 15213–15222. IEEE (2021) 14. Wang, X., Lei, J., Lan, H., Al-Jawari, A., Wei, X.: DuEqNet: dual-equivariance network in outdoor 3D object detection for autonomous driving. In: IEEE International Conference on Robotics and Automation, ICRA 2023, London, UK, May 29 - June 2, 2023, pp. 6951–6957. IEEE (2023) 15. Liu, H., et al.: Group equivariant BEV for 3D object detection. CoRR abs/2304.13390 (2023) 16. Wang, R., Walters, R., Yu, R.: Incorporating symmetry into deep dynamics models for improved generalization. In: 9th International Conference on Learning Representations, ICLR 2021, Virtual Event, Austria, May 3–7, 2021. OpenReview.net (2021). https://openreview.net/forum?id=wta 8Hx2KD 17. Ronneberger, O., Fischer, P., Brox, T.: U-Net: convolutional networks for biomedical image segmentation. In: Navab, N., Hornegger, J., Wells, W.M., Frangi, A.F. (eds.) MICCAI 2015. LNCS, vol. 9351, pp. 234–241. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-24574-4 28 18. Weiler, M., Cesa, G.: General E(2)-equivariant steerable CNNs. In: Wallach, H.M., Larochelle, H., Beygelzimer, A., d’Alch´e-Buc, F., Fox, E.B., Garnett, R. (eds.) Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, NeurIPS 2019, December 8–14, 2019, Vancouver, BC, Canada, pp. 14334–14345 (2019) 19. Cohen, T., Welling, M.: Group equivariant convolutional networks. In: Balcan, M., Weinberger, K.Q. (eds.) Proceedings of the 33nd International Conference on Machine Learning, ICML 2016, New York City, NY, USA, June 19–24, 2016. JMLR Workshop and Conference Proceedings, vol. 48, pp. 2990–2999. JMLR.org (2016) 20. Bhattacharyya, A., Hanselmann, M., Fritz, M., Schiele, B., Straehle, C.N.: Conditional flow variational autoencoders for structured sequence prediction. arXiv preprint arXiv:1908.09008 (2019) 21. Gupta, A., Johnson, J., Fei-Fei, L., Savarese, S., Alahi, A.: Social GAN: socially acceptable trajectories with generative adversarial networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2255–2264 (2018) 22. Deo, N., Trivedi, M.M.: Trajectory forecasts in unknown environments conditioned on grid-based plans. arXiv preprint arXiv:2001.00735 (2020)


23. Liang, J., Jiang, L., Hauptmann, A.: SimAug: learning robust representations from 3D simulation for pedestrian trajectory prediction in unseen cameras. arXiv preprint arXiv:2004.02022 (2020) 24. Mangalam, K., et al.: It is not the journey but the destination: endpoint conditioned trajectory prediction. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12347, pp. 759–776. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58536-5 45

FGFusion: Fine-Grained Lidar-Camera Fusion for 3D Object Detection

Zixuan Yin1, Han Sun1(B), Ningzhong Liu1, Huiyu Zhou2, and Jiaquan Shen3

1 Nanjing University of Aeronautics and Astronautics, Nanjing, China
[email protected]
2 University of Leicester, Leicester, UK
3 Luoyang Normal University, Luoyang, China

Abstract. Lidars and cameras are critical sensors that provide complementary information for 3D detection in autonomous driving. While most prevalent methods progressively downscale the 3D point clouds and camera images and then fuse the high-level features, the downscaled features inevitably lose low-level detailed information. In this paper, we propose Fine-Grained Lidar-Camera Fusion (FGFusion), which makes full use of multi-scale features of images and point clouds and fuses them in a fine-grained way. First, we design a dual pathway hierarchy structure to extract both high-level semantic and low-level detailed features of the image. Second, an auxiliary network is introduced to guide point cloud features to better learn the fine-grained spatial information. Finally, we propose multi-scale fusion (MSF) to fuse the last N feature maps of the image and the point cloud. Extensive experiments on two popular autonomous driving benchmarks, i.e., KITTI and Waymo, demonstrate the effectiveness of our method.

Keywords: Lidar-Camera Fusion · Fine-grained Fusion · Multi-scale Feature · Attention Pyramid

1 Introduction

3D object detection is a crucial task in autonomous driving [1,8]. In recent years, lidar-only methods have made significant progress in this field. However, relying solely on point cloud data is insufficient because lidar only provides low-resolution shape and depth information. Therefore, researchers hope to leverage multiple modalities of data to improve detection accuracy. Among them, vehicle-mounted cameras provide high-resolution shape and texture information, which is complementary to lidar. Therefore, the fusion of point cloud data with RGB images has become a research hotspot.
In the early stages of fusion research, researchers naturally assumed that fusion methods would outperform lidar-only methods, because the essence of fusion is to add RGB information as an auxiliary signal to a lidar-only pipeline. The performance of the model should therefore be at least as good as before, rather than declining [22].


However, this is not always the case. There are two reasons for the performance decline: 1) a suitable method for aligning the data of the two modalities has not yet been found, and 2) the features of the two modalities used in the fusion are too coarse. Regarding the first issue, fusion methods have evolved from the initial post-fusion [3,10] and point-level fusion [22,23] methods to today's more advanced feature-fusion [2,12,24] methods. However, the second problem has not yet been solved. Specifically, lidar-only methods are mainly divided into one-stage methods [14,26,31] and two-stage methods [4,18–20]. Usually, two-stage methods perform better than one-stage methods because the features extracted in the first stage can be refined in the second stage. However, most current fusion methods focus on how to fuse features more effectively and ignore the process of refining the fused features.

To solve the above problems, we utilize fine-grained features to improve model accuracy and propose an efficient multi-modal fusion strategy called FGFusion. Specifically, since both image and point cloud data inevitably lose detailed features and spatial information during the downscaling process, we design different feature refinement schemes for the two modalities. First, for image data, we exploit a dual-path pyramid structure and design a top-down feature path and a bottom-up attention path to better fuse high-level and low-level features. For point cloud data, inspired by SASSD [7], we construct an auxiliary network with point-level supervision to guide the intermediate features from different stages of the 3D backbone to learn the fine-grained spatial structures of point clouds. In the fusion stage, we select an equal number of feature maps from the feature pyramids of images and point clouds and fuse them by cross-attention. The fused feature pyramids can then be passed into modern task prediction head architectures [2,28].

In brief, our contributions can be summarized as follows:
– We design different feature refinement schemes for camera image and point cloud data, in order to fuse high-level abstract semantic information and low-level detailed features.
– We design a multi-level fusion strategy for point clouds and images, which fully utilizes the feature pyramids of the two modalities in the fusion stage to improve model accuracy.
– We verify our method on two mainstream autonomous driving point cloud datasets (KITTI and Waymo), and the experimental results prove the effectiveness of our method.

2 Related Work

2.1 LiDAR-Only 3D Detection

Lidar-only methods are mainly divided into point-based methods and voxel-based methods. Among them, point-based methods such as PointNet [16] and


PointNet++ [17] are the earliest neural networks directly applied to point clouds. They directly process unordered raw point clouds and extract local features through max-pooling. Based on their work, voxel-based and pillar-based methods have been derived. They transform the original point cloud into a Euclidean feature space and then use standard 2D or 3D convolution to compute the features of the BEV plane. Representative methods include VoxelNet [31], SECOND [25], and PointPillars [11].
Lidar-only methods later evolved along two different lines. Like 2D object detection, they are divided into one-stage and two-stage methods. One-stage methods [14,26,31] directly regress category scores and bounding boxes in a single stage; the network is relatively simple and has fast inference speed. Two-stage methods [4,18–20] usually generate region proposals in the first stage and then refine them in the second stage. The accuracy of two-stage methods is usually higher than that of one-stage methods, because the second stage can capture more detailed and distinctive features, but the cost is a more complex network structure and higher computational cost.

2.2 Fusion-Based 3D Detection

Due to the sparsity of point cloud data and the fact that it carries only spatial structural information, researchers have proposed to complement point clouds with RGB images. Early methods [3,15] use result-level or proposal-level post-fusion strategies, but the fusion granularity is too coarse, resulting in performance inferior to that of lidar-only methods. PointPainting [22] is the first to utilize the hard correlation between LiDAR points and image pixels for fusion. It projects the point clouds onto the images through a calibration matrix and enhances each LiDAR point with the semantic segmentation score of the image. PointAugmenting [23] builds on PointPainting and proposes using features extracted from 2D object detection networks instead of semantic segmentation scores to enhance LiDAR points. Feature-level fusion methods point out that the hard association between points and pixels established by the calibration matrix is unreliable. DeepFusion [12] uses cross-attention to fuse point cloud features and image features. TransFusion [2] uses the prediction from point cloud features as a query for image features and then uses a transformer-like architecture to fuse features. It can be seen that, whether using semantic segmentation scores and image features obtained from pre-trained networks or directly querying and fusing at the feature level, these methods essentially fuse high-level features with the richest semantic information, while ignoring low-level detailed information.

3 FGFusion

3.1 Motivations and Pipeline

The previous fusion methods only exploit high-level features, ignoring the important fact that detailed feature representations are lost in the downsampling process.


For example, PointPainting [22] directly makes use of pixel-wise semantic segmentation scores as image features to decorate point cloud data; it only uses the last feature map and ignores multi-scale information. PointAugmenting [23] utilizes the last feature map with the richest semantic information to decorate point cloud data, but discards all the others, which contain low-level detailed information. DeepFusion [12] is a feature-level fusion method that improves accuracy compared to point-level fusion methods such as PointAugmenting, but the essence is the same, as shown in Fig. 1.

Fig. 1. Most point-level fusion methods [22, 23] and feature-level fusion methods [2, 12] only use the last layer of image or point cloud features for fusion, while our FGFusion performs fusion at multiple feature scales, fully utilizing low-level detail information to improve model accuracy.

We note that in some 2D tasks, such as small object detection and fine-grained image recognition, multi-scale techniques are often used to extract fine-grained features. In 3D object detection, point cloud data is well suited to capturing spatial structural features, but small targets and fine details are easily missed due to its sparsity. Therefore, we hope to fuse point cloud and image data in a multi-scale way to make up for this shortcoming of point clouds. To achieve this goal, we fuse the features of the point cloud and the image at multiple levels instead of only using the last feature map generated by the backbone network. In addition, to extract finer features, we design a dual-path pyramid structure in the image branch and add an auxiliary network to guide the convolutional features to perceive object structures in the point cloud branch.
To summarize, our proposed fine-grained fusion pipeline is shown in Fig. 2. For the image branch, we exploit a 2D backbone and a dual-path pyramid structure to obtain the attention pyramid. For the point cloud branch, the raw points are fed into an existing 3D backbone to obtain the lidar features, while an auxiliary network guides the feature learning. Finally, we fuse the image and point cloud features at different levels and attach an identically designed head to each fused layer of features to obtain the final results.


Fig. 2. An overview of FGFusion framework. FGFusion consists of 1) a dual pathway hierarchy structure with a top-down feature pathway and a bottom-up attention pathway, hence learning both high-level semantic and low-level detailed feature representation of the images, 2) an auxiliary network to guide point cloud features to better learn the fine-grained spatial information, and 3) a fusion strategy that can fuse the two modalities in a multi-scale way.

3.2 Camera Stream Architecture

In general, the input image is processed by a convolutional neural network to obtain a feature representation with high-level semantic information. However, many low-level detailed features are lost in the process, which is insufficient for robust fusion. To retain fine-grained features, inspired by the FPN network [13], we design a top-down feature path to extract features at different scales. Let {B_1, B_2, ..., B_l} denote the feature maps obtained after the input image passes through the backbone, where l is the number of convolutional blocks. The common approach is to directly use the output of the last block B_l for fusion, but we want to make full use of each B_i. Since using every block of the network would inevitably bring huge cost overheads, we only select the last N outputs to generate the corresponding feature pyramid, denoted as {F_{l-N+1}, F_{l-N+2}, ..., F_l}.

After obtaining the feature pyramid, we design a bottom-up attention path which includes spatial attention and channel attention. Spatial attention is used to locate the identifiable regions of the input image at different scales:

A^s_i = σ(K ∗ F_i),  (1)

where σ is the sigmoid activation function, ∗ denotes the deconvolution operation, and K is the convolution kernel. Channel attention is used to add associations between channels and to pass low-level detailed information layer by layer to higher levels:

A^c_i = σ(W_b · ReLU(W_a · GAP(F_i))),  (2)

where · denotes element-wise multiplication, W_a and W_b are the weight parameters of two fully connected layers, and GAP(·) denotes global average pooling. To transmit low-level detailed information to high-level features, A^c_i is added to A^c_{i-1} and then downsampled by a factor of two, generating the bottom-up path. After obtaining the attention pyramid, it is combined with the feature pyramid: we first add the spatial attention A^s_i and the channel attention A^c_i, and then take the dot product with F_i in the feature pyramid to obtain F'_i:

F'_i = F_i · (A^s_i + α A^c_i).  (3)

Finally, {F'_{l-N+1}, F'_{l-N+2}, ..., F'_l} is obtained for the subsequent fusion stage.
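
To make the attention path above concrete, the following PyTorch-style sketch applies Eqs. (1)–(3) to the last N pyramid levels. The module name, the equal channel width across levels, the reduction ratio inside the channel-attention MLP, and the scalar α are illustrative assumptions of this sketch, not details taken from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class AttentionPyramid(nn.Module):
    """Bottom-up attention path over the last N pyramid levels (Eqs. 1-3)."""

    def __init__(self, channels: int, num_levels: int = 3, alpha: float = 0.5):
        super().__init__()
        self.alpha = alpha  # weight of the channel attention in Eq. (3), assumed value
        # Eq. (1): one deconvolution (kernel K) followed by a sigmoid per level.
        self.spatial = nn.ModuleList(
            nn.ConvTranspose2d(channels, channels, kernel_size=3, padding=1)
            for _ in range(num_levels)
        )
        # Eq. (2): GAP -> FC (W_a) -> ReLU -> FC (W_b) -> sigmoid per level.
        self.channel = nn.ModuleList(
            nn.Sequential(
                nn.Linear(channels, channels // 4),
                nn.ReLU(inplace=True),
                nn.Linear(channels // 4, channels),
            )
            for _ in range(num_levels)
        )

    def forward(self, feats):
        # feats: list [F_{l-N+1}, ..., F_l] of (B, C, H_i, W_i) maps, low level first.
        refined, prev_channel = [], None
        for f, sp, ch in zip(feats, self.spatial, self.channel):
            a_s = torch.sigmoid(sp(f))                                # Eq. (1)
            gap = F.adaptive_avg_pool2d(f, 1).flatten(1)              # GAP(F_i)
            a_c = torch.sigmoid(ch(gap)).unsqueeze(-1).unsqueeze(-1)  # Eq. (2)
            if prev_channel is not None:
                a_c = a_c + prev_channel  # pass low-level detail to the next level
            prev_channel = a_c
            refined.append(f * (a_s + self.alpha * a_c))              # Eq. (3)
        return refined
```

In this sketch the channel attention is a per-channel (1 × 1) map, so passing it to the next level reduces to a direct addition; the inter-level downsampling described in the text is therefore simplified away.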

3.3 LiDAR Stream Architecture

Our framework can use any network that converts point clouds into multi-scale feature pyramids as the lidar stream. At the same time, inspired by SASSD [7], we design an auxiliary network, which contains a point-wise foreground segmentation head and a center estimation head, to guide the backbone CNN to learn the fine-grained structure of point clouds at different stages of intermediate feature learning. It is worth noting that the auxiliary network can be detached after training, so no additional computation is introduced during inference.
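
As a rough illustration of this auxiliary supervision, the sketch below pairs a point-wise foreground segmentation head with a center estimation head. How intermediate voxel features are gathered back to the raw points is abstracted away, and all layer widths, loss choices, and weightings are assumptions rather than the authors' implementation.

```python
import torch.nn as nn
import torch.nn.functional as F


class AuxiliaryHead(nn.Module):
    """Point-wise foreground segmentation and center estimation heads."""

    def __init__(self, in_channels: int = 64):
        super().__init__()
        self.seg_head = nn.Sequential(
            nn.Linear(in_channels, 64), nn.ReLU(inplace=True), nn.Linear(64, 1)
        )  # foreground logit per point
        self.ctr_head = nn.Sequential(
            nn.Linear(in_channels, 64), nn.ReLU(inplace=True), nn.Linear(64, 3)
        )  # offset from each point to its object center

    def forward(self, point_feats):
        # point_feats: (P, C) backbone features gathered at the raw point locations.
        return self.seg_head(point_feats), self.ctr_head(point_feats)


def auxiliary_loss(seg_logit, ctr_pred, fg_mask, ctr_target):
    # Binary foreground segmentation plus smooth-L1 center regression on foreground points.
    seg_loss = F.binary_cross_entropy_with_logits(seg_logit.squeeze(-1), fg_mask.float())
    if fg_mask.any():
        ctr_loss = F.smooth_l1_loss(ctr_pred[fg_mask], ctr_target[fg_mask])
    else:
        ctr_loss = seg_logit.new_zeros(())
    return seg_loss + ctr_loss
```

Because these heads operate only on training-time point features, removing them at inference leaves the 3D backbone unchanged, which is what allows the auxiliary network to be detached without extra cost.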

Fig. 3. The multi-scale fusion module first compresses the point cloud features into BEV features, and then uses TransFusion [2] to fuse the last N layers of BEV features and image features separately to obtain the prediction results of each layer. Finally, the post-processing is performed to obtain the final results.


3.4 Multi-scale Fusion Module

Now we have obtained the attention pyramid of the image and the feature pyramid of the point cloud. To fully fuse the two modalities, we take the last N layers of features from both for fusion, rather than only the last layer, as shown in Fig. 3. From the point cloud feature pyramid, we obtain a multi-scale point cloud BEV feature map {F^B_{l-N+1}, F^B_{l-N+2}, ..., F^B_l}. Following TransFusion [2], we use two transformer decoding layers to fuse the two modalities: the first decodes object queries into initial bounding box predictions using the LiDAR information, and the second performs LiDAR-camera fusion by attentively fusing the object queries with useful image features. Finally, each fused feature level generates its corresponding predictions, and the final prediction is obtained through post-processing.
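
The sketch below illustrates this per-level fusion: each of the last N BEV maps attends to the image features of the matching level through cross-attention, and every fused level feeds its own prediction head. The use of nn.MultiheadAttention, the flattening of BEV cells into queries, and the toy linear heads are simplifications for illustration; the actual decoder follows TransFusion [2].

```python
import torch.nn as nn


class MultiScaleFusion(nn.Module):
    """Cross-attention fusion of the last N BEV / image feature levels."""

    def __init__(self, dim: int = 128, num_levels: int = 3, num_heads: int = 4):
        super().__init__()
        self.cross_attn = nn.ModuleList(
            nn.MultiheadAttention(dim, num_heads, batch_first=True)
            for _ in range(num_levels)
        )
        # One lightweight prediction head per fused level (placeholder output size).
        self.heads = nn.ModuleList(nn.Linear(dim, 10) for _ in range(num_levels))

    def forward(self, bev_feats, img_feats):
        # bev_feats[i]: (B, C, H_i, W_i) BEV map; img_feats[i]: (B, C, h_i, w_i) image map.
        outputs = []
        for attn, head, bev, img in zip(self.cross_attn, self.heads, bev_feats, img_feats):
            q = bev.flatten(2).transpose(1, 2)   # (B, H_i*W_i, C): BEV cells act as queries
            kv = img.flatten(2).transpose(1, 2)  # (B, h_i*w_i, C): image pixels as keys/values
            fused, _ = attn(q, kv, kv)           # LiDAR-to-camera cross-attention
            outputs.append(head(fused))          # per-level predictions, merged in post-processing
        return outputs
```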

4 Experiments

We evaluate our proposed FGFusion on two datasets, KITTI [6] and Waymo [21], and conduct extensive ablation experiments.

4.1 Datasets

The KITTI dataset contains 7481 training samples and 7518 testing samples of autonomous driving scenes. Following common practice, we divide the training data into a training set containing 3712 samples and a validation set containing 3769 samples. According to the requirements of the KITTI object detection benchmark, we conduct experiments on three categories (cars, pedestrians, and cyclists) and evaluate the results using the average precision (AP) with an IoU threshold of 0.7.
The Waymo Open Dataset contains 798 training sequences, 202 validation sequences, and 150 testing sequences. Each sequence has about 200 frames, which contain lidar points, camera images, and labeled 3D bounding boxes. We use the official metrics, i.e., Average Precision (AP) and Average Precision weighted by Heading (APH), to evaluate the performance of different models, and report results at the LEVEL1 (L1) and LEVEL2 (L2) difficulty levels.
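
For reference, the AP values reported for KITTI in Table 1 follow the convention of averaging precision over 40 recall positions; a generic helper for this metric (not code from the paper) could look as follows.

```python
import numpy as np


def ap_r40(recalls: np.ndarray, precisions: np.ndarray) -> float:
    """Average precision over 40 recall positions for one class and difficulty."""
    ap = 0.0
    for r in np.linspace(1.0 / 40, 1.0, 40):  # 40 evenly spaced recall thresholds
        mask = recalls >= r
        ap += precisions[mask].max() if mask.any() else 0.0
    return ap / 40.0
```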

4.2 Implementation Details

For the KITTI dataset, the voxel size is set to (0.05 m, 0.05 m, 0.1 m). Since KITTI only provides annotations for the front camera's field of view, the detection range of the X, Y, and Z axes is set to [0, 70.4 m], [−40 m, 40 m], and [−3 m, 1 m], respectively. The image size is set to 448 × 800. For the Waymo dataset, the voxel size is set to (0.1 m, 0.1 m, 0.15 m). The detection range of the X and Y axes is [−75.2 m, 75.2 m], and the detection range of the Z axis is [−2 m, 4 m]. We choose TransFusion-L and the DLA34 of the pre-trained CenterNet as the 3D and 2D backbone networks, respectively.


Following TransFusion [2], our training consists of two stages: 1) First, we train the 3D backbone with the first decoder layer and FFN for 20 epochs. This stage only requires point clouds as input, and the last BEV feature map is used to produce initial 3D bounding box predictions. 2) Then we train the LiDAR-camera fusion and the image-guided query initialization module for another 6 epochs. In this stage, the last three feature maps of the 3D and 2D backbones are fused separately. The advantage of this two-step scheme over joint training is that the auxiliary network can be used only in the first stage, as can data augmentation methods designed for pure point cloud methods. For post-processing, we use NMS with a threshold of 0.7 for Waymo and 0.55 for KITTI to remove redundant boxes.
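
For convenience, the settings reported in this subsection can be collected into a single configuration object; the dictionary layout below is only an illustrative convention, not the authors' configuration format.

```python
# All numeric values are taken from the text above; only the structure is assumed.
FGFUSION_CFG = {
    "kitti": {
        "voxel_size_m": (0.05, 0.05, 0.1),
        "range_x_m": (0.0, 70.4),
        "range_y_m": (-40.0, 40.0),
        "range_z_m": (-3.0, 1.0),
        "image_size": (448, 800),
        "nms_iou_threshold": 0.55,
    },
    "waymo": {
        "voxel_size_m": (0.1, 0.1, 0.15),
        "range_x_m": (-75.2, 75.2),
        "range_y_m": (-75.2, 75.2),
        "range_z_m": (-2.0, 4.0),
        "nms_iou_threshold": 0.7,
    },
    "backbones": {"lidar": "TransFusion-L", "camera": "DLA34 (pre-trained CenterNet)"},
    "training": {"stage1_epochs": 20, "stage2_epochs": 6, "fused_feature_levels": 3},
}
```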

4.3 Experimental Results and Analysis

Table 1. Performance comparison on the KITTI val set with AP calculated by 40 recall positions.

Method            | Modality | mAP   | Car (Easy/Mod/Hard)   | Pedestrian (Easy/Mod/Hard) | Cyclist (Easy/Mod/Hard)
SECOND [25]       | L        | 68.06 | 88.61 / 78.62 / 77.22 | 56.55 / 52.98 / 47.73      | 80.58 / 67.15 / 63.10
PointPillars [11] | L        | 66.53 | 86.46 / 77.28 / 74.65 | 57.75 / 52.29 / 47.90      | 80.05 / 62.68 / 59.70
PointRCNN [19]    | L        | 70.67 | 88.72 / 78.61 / 77.82 | 62.72 / 53.85 / 50.25      | 86.84 / 71.62 / 65.59
PV-RCNN [18]      | L        | 73.27 | 92.10 / 84.36 / 82.48 | 64.26 / 56.67 / 51.91      | 88.88 / 71.95 / 66.78
Voxel-RCNN [5]    | L        | –     | 92.38 / 85.29 / 82.86 | – / – / –                  | – / – / –
MV3D [3]          | L+C      | –     | 71.29 / 62.68 / 56.56 | – / – / –                  | – / – / –
AVOD [10]         | L+C      | –     | 84.41 / 74.44 / 68.65 | – / 58.80 / –              | – / 49.70 / –
F-PointNet [15]   | L+C      | 65.58 | 83.76 / 70.92 / 63.65 | 70.00 / 61.32 / 53.59      | 77.15 / 56.49 / 53.37
3D-CVF [29]       | L+C      | –     | 89.67 / 79.88 / 78.47 | – / – / –                  | – / – / –
EPNet [9]         | L+C      | 70.97 | 88.76 / 78.65 / 78.32 | 66.74 / 59.29 / 54.82      | 83.88 / 65.60 / 62.70
CAT-Det [30]      | L+C      | 75.42 | 90.12 / 81.46 / 79.15 | 74.08 / 66.35 / 58.92      | 87.64 / 72.82 / 68.20
FGFusion (Ours)   | L+C      | 77.05 | 92.38 / 84.96 / 83.84 | 72.63 / 65.07 / 59.21      | 90.33 / 74.19 / 70.84

KITTI. To prove the effectiveness of our method, we compare the average precision (AP) of FGFusion with some state-of-the-art methods on the KITTI dataset. As shown in Table 1, the mAP of our proposed FGFusion is the highest among all methods. KITTI divides all objects into three difficulty levels (easy, moderate, and hard) based on object size, occlusion status, and truncation level; the higher the difficulty level, the harder the detection. Our method leads across difficulty levels for multiple categories and has higher accuracy than all other methods at the hard level of all three categories, which shows that our method can effectively fuse fine-grained features. Among lidar-only methods, the accuracy of one-stage methods such as SECOND [25] and PointPillars [11] is lower than that of two-stage methods such as PV-RCNN [18]. At the easy and moderate difficulty levels of the car category, our FGFusion is competitive with Voxel-RCNN [5], the best-performing lidar-only method, and surpasses it by 0.98% AP at the hard level.


Among fusion methods, early works such as MV3D [3] and AVOD [10] perform worse than lidar-only methods. However, the recently proposed CAT-Det [30] achieves higher overall accuracy than lidar-only methods in all three categories, reaching an mAP of 75.42, which is slightly lower than that of our method.

Table 2. Performance comparison on the Waymo val set for 3D vehicle (IoU = 0.7) and pedestrian (IoU = 0.5) detection.

Method               | Modality | Vehicle AP/APH (L1) | Vehicle AP/APH (L2) | Pedestrian AP/APH (L1) | Pedestrian AP/APH (L2)
SECOND [25]          | L        | 72.27/71.69         | 63.85/63.33         | 68.70/58.18            | 60.72/51.31
PointPillars [11]    | L        | 71.60/71.00         | 63.10/62.50         | 70.60/56.70            | 62.90/50.20
PV-RCNN [18]         | L        | 77.51/76.89         | 68.98/68.41         | 75.01/65.65            | 66.04/57.61
CenterPoint [28]     | L        | –                   | –/66.20             | –                      | –/62.60
3D-MAN [27]          | L        | 74.50/74.00         | 67.60/67.10         | 71.70/67.70            | 62.60/59.00
PointAugmenting [23] | L+C      | 67.40/–             | 62.70/–             | 75.04/–                | 70.60/–
DeepFusion [12]      | L+C      | 80.60/80.10         | 72.90/72.40         | 85.80/83.00            | 78.70/76.00
FGFusion (Ours)      | L+C      | 81.92/81.44         | 73.85/73.34         | 85.73/82.85            | 78.81/76.14

Waymo. Compared with KITTI, the Waymo dataset is larger and more diverse, and hence more challenging. To further verify FGFusion, we also conduct experiments on the Waymo dataset and compare it with some state-of-the-art methods. Table 2 shows that our FGFusion outperforms the other methods for both the vehicle and pedestrian categories at the LEVEL2 difficulty, which is the main metric for ranking in the Waymo 3D detection challenge. Compared with PV-RCNN [18], the best lidar-only method, FGFusion improves the APH of vehicle recognition by 4.93% and that of pedestrian recognition by 18.53%, which shows that our fusion method is particularly advantageous for small object detection.

4.4 Ablation Study

We conduct a series of experiments on Waymo to demonstrate the effectiveness of each component in our proposed FGFusion, including the attention pyramid of the image branch (AP), the auxiliary network of the point cloud branch (AN), and the multi-scale fusion module (MSF).
Effect of Each Component. As shown in Table 3, our FGFusion is 2.92% and 3.2% higher than the baseline in APH for the vehicle and pedestrian categories, respectively. Specifically, the multi-scale fusion module brings improvements of 1.74% and 1.93% over the baseline on the two categories, which confirms our proposed fine-grained fusion strategy. The attention pyramid and the auxiliary network further bring improvements of (0.7%, 0.87%) and (0.61%, 0.51%), respectively.


Table 3. Effect of each component in FGFusion on the Waymo val set with APH at L2 difficulty.

MSF | AP | AN | Vehicle | Pedestrian
    |    |    | 70.42   | 72.94
✓   |    |    | 72.16   | 74.87
✓   | ✓  |    | 72.86   | 75.74
✓   |    | ✓  | 72.77   | 75.38
✓   | ✓  | ✓  | 73.34   | 76.14

Table 4. Performance comparison on the Waymo val set with APH at L2 difficulty using different numbers of feature layers for fusion.

Feature Num | Vehicle       | Pedestrian
1           | 70.42         | 72.94
2           | 71.62 (+1.20) | 74.05 (+1.11)
3           | 72.16 (+0.54) | 74.81 (+0.76)
4           | 72.32 (+0.16) | 75.01 (+0.20)

This indicates that the finer the fused features, the higher the accuracy the model can achieve, which is consistent with our expectation.
Number of Feature Layers Selected for Fusion. The number of fused feature layers for point clouds and images is the key hyperparameter of our multi-scale fusion module. To determine the optimal value, we conduct experiments on the Waymo dataset without using the attention pyramid or the auxiliary network. As shown in Table 4, the more feature layers are used, the higher the accuracy the model achieves. This is because high-level features have rich semantic information while low-level features preserve complementary detailed information; the more feature layers are used for fusion, the less information is lost during downsampling. The results show that using two or three layers of features brings significant improvements in model accuracy, while with four fusion layers the additional improvement is greatly reduced. It is worth noting that the more fusion layers are used, the more weights the cross-attention module needs to train during fusion. To balance model accuracy and computational cost, we use three layers of features for fusion in our experiments.

5 Conclusion

In this paper, we propose a novel multimodal network FGFusion for 3D object detection in autonomous driving scenarios. We design fine-grained feature extraction networks for both the point cloud branch and the image branch, and fuse


features from different levels through a pyramid structure to improve detection accuracy. Extensive experiments are conducted on the KITTI and Waymo datasets, and the experimental results show that our method can achieve better performance than some state-of-the-art methods.

References

1. Arnold, E., Al-Jarrah, O.Y., Dianati, M., Fallah, S., Oxtoby, D., Mouzakitis, A.: A survey on 3D object detection methods for autonomous driving applications. IEEE Trans. Intell. Transp. Syst. 20(10), 3782–3795 (2019)
2. Bai, X., et al.: TransFusion: robust lidar-camera fusion for 3D object detection with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 1090–1099 (2022)
3. Chen, X., Ma, H., Wan, J., Li, B., Xia, T.: Multi-view 3D object detection network for autonomous driving. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1907–1915 (2017)
4. Chen, Y., Liu, S., Shen, X., Jia, J.: Fast point R-CNN. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 9775–9784 (2019)
5. Deng, J., Shi, S., Li, P., Zhou, W., Zhang, Y., Li, H.: Voxel R-CNN: towards high performance voxel-based 3D object detection. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 35, pp. 1201–1209 (2021)
6. Geiger, A., Lenz, P., Urtasun, R.: Are we ready for autonomous driving? The KITTI vision benchmark suite. In: 2012 IEEE Conference on Computer Vision and Pattern Recognition, pp. 3354–3361. IEEE (2012)
7. He, C., Zeng, H., Huang, J., Hua, X.S., Zhang, L.: Structure aware single-stage 3D object detection from point cloud. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 11873–11882 (2020)
8. Huang, K., Shi, B., Li, X., Li, X., Huang, S., Li, Y.: Multi-modal sensor fusion for auto driving perception: a survey. arXiv preprint arXiv:2202.02703 (2022)
9. Huang, T., Liu, Z., Chen, X., Bai, X.: EPNet: enhancing point features with image semantics for 3D object detection. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12360, pp. 35–52. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58555-6_3
10. Ku, J., Mozifian, M., Lee, J., Harakeh, A., Waslander, S.L.: Joint 3D proposal generation and object detection from view aggregation. In: 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 1–8. IEEE (2018)
11. Lang, A.H., Vora, S., Caesar, H., Zhou, L., Yang, J., Beijbom, O.: PointPillars: fast encoders for object detection from point clouds. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 12697–12705 (2019)
12. Li, Y., et al.: DeepFusion: lidar-camera deep fusion for multi-modal 3D object detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 17182–17191 (2022)
13. Lin, T.Y., Dollár, P., Girshick, R., He, K., Hariharan, B., Belongie, S.: Feature pyramid networks for object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2117–2125 (2017)
14. Liu, Z., Zhao, X., Huang, T., Hu, R., Zhou, Y., Bai, X.: TANet: robust 3D object detection from point clouds with triple attention. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 34, pp. 11677–11684 (2020)
15. Qi, C.R., Liu, W., Wu, C., Su, H., Guibas, L.J.: Frustum PointNets for 3D object detection from RGB-D data. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 918–927 (2018)
16. Qi, C.R., Su, H., Mo, K., Guibas, L.J.: PointNet: deep learning on point sets for 3D classification and segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 652–660 (2017)
17. Qi, C.R., Yi, L., Su, H., Guibas, L.J.: PointNet++: deep hierarchical feature learning on point sets in a metric space. In: Advances in Neural Information Processing Systems 30 (2017)
18. Shi, S., et al.: PV-RCNN: point-voxel feature set abstraction for 3D object detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 10529–10538 (2020)
19. Shi, S., Wang, X., Li, H.: PointRCNN: 3D object proposal generation and detection from point cloud. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 770–779 (2019)
20. Shi, S., Wang, Z., Shi, J., Wang, X., Li, H.: From points to parts: 3D object detection from point cloud with part-aware and part-aggregation network. IEEE Trans. Pattern Anal. Mach. Intell. 43(8), 2647–2664 (2020)
21. Sun, P., et al.: Scalability in perception for autonomous driving: Waymo open dataset. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 2446–2454 (2020)
22. Vora, S., Lang, A.H., Helou, B., Beijbom, O.: PointPainting: sequential fusion for 3D object detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 4604–4612 (2020)
23. Wang, C., Ma, C., Zhu, M., Yang, X.: PointAugmenting: cross-modal augmentation for 3D object detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 11794–11803 (2021)
24. Wu, X., et al.: Sparse fuse dense: towards high quality 3D detection with depth completion. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5418–5427 (2022)
25. Yan, Y., Mao, Y., Li, B.: SECOND: sparsely embedded convolutional detection. Sensors 18(10), 3337 (2018)
26. Yang, Z., Sun, Y., Liu, S., Jia, J.: 3DSSD: point-based 3D single stage object detector. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 11040–11048 (2020)
27. Yang, Z., Zhou, Y., Chen, Z., Ngiam, J.: 3D-MAN: 3D multi-frame attention network for object detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 1863–1872 (2021)
28. Yin, T., Zhou, X., Krahenbuhl, P.: Center-based 3D object detection and tracking. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 11784–11793 (2021)
29. Yoo, J.H., Kim, Y., Kim, J., Choi, J.W.: 3D-CVF: generating joint camera and LiDAR features using cross-view spatial feature fusion for 3D object detection. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12372, pp. 720–736. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58583-9_43
30. Zhang, Y., Chen, J., Huang, D.: CAT-Det: contrastively augmented transformer for multi-modal 3D object detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 908–917 (2022)
31. Zhou, Y., Tuzel, O.: VoxelNet: end-to-end learning for point cloud based 3D object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4490–4499 (2018)

Author Index

A An, Wenbo 109

B Bao, Xianglong 138 Biao, Zhang 452 C Cao, Lu 312 Cao, Wenming 151 Chang, Qiong 439 Chen, Kaijie 273 Chen, Mingsong 492 Chen, Ningjiang 337 Chen, Wenlong 213 Chen, Xiangchuang 16 Chen, Xue-mei 376 Chen, Yingying 388 Chen, Yufeng 3 Chen, Zhongpeng 57 Cheng, Jijun 492 Cheng, Jinyong 176 Cheng, Peng 109 Cheng, Xu 69 D Dai, Siyuan 478 Deng, Jinhong 44 Ding, Jianpeng 44 Ding, Xiangqian 121 Dong, Aimei 176 Dong, Mingda 263 Du, Yu 439 Duan, Lixin 44 F Fan, Qunbo 84 Fan, Xiaoliang 97 Fang, Junbin 478

Feng, Chuanwen 213 Fu, Shouxu 312 G Gao, Li 376 Gao, Xiaoyan 121 Gong, Haichao 84 Gou, Jianping 273 Guan, Qiang 57 H Huang, Yihui 337

J Ji, Yuzhu 31 Jia, Junhua 121 Jia, Songche 414 Jiang, Canjian 478 Jiang, Hexin 164 Jiang, You 478 Jiang, Zoe L. 478 Jin, Fusheng 84 K Ke, Ao 213 Keung, Jacky 465 Kuai, Fan 224 L Lei, Yang 109 Li, Bo 426 Li, Ke 16 Li, Lin 273 Li, Miaojun 250 Li, Ru 350 Li, Ruihang 363 Li, Tao 363 Li, Tianyuan 151 Li, Tongyue 452


Li, Xiaoning 312 Li, Xinglong 299 Li, Xingxing 312 Li, Xu 200 Li, Xuesong 465 Li, Yunheng 239 Li, Zehan 109 Liang, Lenghan 200 Liang, Wei 426 Liang, Xuefeng 164 Lin, Shaohui 16 Liu, Chuanyi 478 Liu, Lin 401 Liu, Ningzhong 505 Liu, Shenglan 239 Liu, Shengqiang 325 Liu, Weijin 376 Liu, Yuanhao 426 Liu, Yuhang 188 Lu, Ting 224 Lu, Yang 31 Luo, Xuan 16 Lv, Guohua 176 Lv, Jiaheng 188 M Ma, Siyuan 250 Ma, Wenduo 239 Ma, Yuexin 465 Mei, Kuizhi 287 O Ouyang, Qianying 452

P Pan, Zhijie 363 Pang, Shunpeng 121 Q Qiu, Lyn 200

S Shao, Dongheng 492 Shen, Jiaquan 505 Shen, Yunhang 16 Shi, Dianxi 452 Song, Wenkuo 176 Song, Ziying 401

Sun, Han 505 Sun, Jinqiu 325 Sun, Mingming 200 T Tan, Huachun 376 Tang, Xuan 492 W Wan, Shaohua 44 Wang, Cheng 97 Wang, Hailong 109 Wang, Hao 492 Wang, He 350 Wang, Jinqiao 388 Wang, Kewen 138 Wang, Weimin 439 Wang, Xin 69 Wang, Zhe 138 Wang, Zheng 97 Wang, Zhepeng 401 Wang, Zihui 97 Wang, Zixuan 188 Wei, Xian 492 Wei, Zhonghe 176 Wu, Dai-Jie 465 Wu, Hao 109 Wu, Hutong 138 Wu, Jigang 250 Wu, Yingnian 69 X Xie, Jing 452 Xie, Jun 401 Xie, Xike 213 Xie, Yuan 263 Xin, Xiaowei 121 Xu, Wenxin 164 Y Yan, Junchi 200 Yan, Qingsen 325 Yang, Fan 376 Yang, Huitong 465 Yang, Jian 492 Yang, Jing 224 Yang, Kuihe 401 Yang, Shijuan 376 Yang, Ting 439


Yang, Yu 69 Yang, Zewei 478 Yang, Zhicheng 97 Ye, Shanding 363 Yin, Zixuan 505 Yiu, Siu-Ming 478 Yu, Hengzhi 239 Yu, Kun 109 Yu, Xiangdong 312 Yu, Yang 239 Z Zhang, Guoxin 401 Zhang, Jingyu 465 Zhang, Juntao 109 Zhang, Qi 350 Zhang, Quanshi 69 Zhang, Xiaowang 138 Zhang, Xiaowei 299 Zhang, Xing 325 Zhang, Yan 16 Zhang, Yanning 325


Zhang, Yanru 44 Zhang, Yinlong 426 Zhang, Yiqun 31 Zhang, Yongqing 188 Zhang, Zhenyu 414 Zhang, Zhixuan 84 Zhang, Zhizhong 263 Zhao, Chenran 452 Zhao, Jiayu 287 Zhao, Mingjie 31 Zhao, Shuping 250 Zhao, Yanan 376 Zheng, Yuxi 452 Zheng, Zhiwen 350 Zhong, Wei 439 Zhou, Huiyu 505 Zhou, Jun 109 Zhou, Ying 164 Zhu, Bingke 388 Zhu, Xinge 465 Zhu, Yu 325 Zhu, Yuanbing 388