Pattern Recognition: 7th Asian Conference, ACPR 2023, Kitakyushu, Japan, November 5–8, 2023, Proceedings, Part III (Lecture Notes in Computer Science) 3031476646, 9783031476648

This three-volume set LNCS 14406-14408 constitutes the refereed proceedings of the 7th Asian Conference on Pattern Recognition, ACPR 2023, held in Kitakyushu, Japan, in November 2023.


Table of contents :
Preface for ACPR 2023 Proceedings
Organization
Contents – Part III
Anisotropic Operator Based on Adaptable Metric-Convolution Stage-Depth Filtering Applied to Depth Completion
1 Introduction
2 Related Works
3 Proposed Model
3.1 Interpolation Model
3.2 Depth Map Filtering
4 Practical Model Implementation
4.1 Numerical Model for the Infinity Laplacian
4.2 Considered Metric
5 Training the Model
6 Dataset and Experiments
6.1 Dataset
6.2 Experiments
7 Results
7.1 Ablation Test
8 Conclusions
References
Stable Character Recognition Strategy Using Ventilation Manipulation in ERP-Based Brain-Computer Interface
1 Introduction
2 Methods
2.1 Experimental Paradigm
2.2 Data Acquisition
2.3 Data Analysis
3 Results
3.1 Average Character Recognition Accuracy Across All Subjects
3.2 Individual Subject Character Recognition and ERPs Patterns
4 Discussion and Conclusion
References
Exploring CycleGAN for Bias Reduction in Gender Classification: Generative Modelling for Diversifying Data Augmentation
1 Introduction
2 Related Work
2.1 Gender Classification
2.2 CycleGAN
2.3 Bias Reduction
3 Methodology
3.1 Adience Data Set
3.2 CycleGAN Pipeline
3.3 Gender Classification Pipeline
4 Evaluation
5 Results
5.1 CycleGAN
5.2 Gender Classification
6 Conclusion
References
Knee Osteoarthritis Diagnostic System Based on 3D Multi-task Convolutional Neural Network: Data from the Osteoarthritis Initiative
1 Introduction
2 Methodology
2.1 Data Acquisition and Pre-processing
2.2 Multi-task Neural Network Architecture
2.3 Transfer Learning
2.4 Training Specification
2.5 Evaluation
3 Results and Discussion
4 Conclusion
References
Multi-step Air Quality Index Forecasting Based on Parallel Multi-input Transformers
1 Introduction
2 Related Work
3 Proposed Method
3.1 Data Preprocessing
3.2 Variable Selection
3.3 Parallel Multi-input Transformer
4 Experimental Details
4.1 Datasets
4.2 Experimental Settings
5 Results and Discussion
5.1 Evaluation Metrics
5.2 Performance of Single-Input Multi-step Model
5.3 Performance of Multi-input Multi-step Model
5.4 Conclusion
References
One-shot Video-based Person Re-identification based on SAM Attention and Reciprocal Nearest Neighbors Metric
1 Introduction
2 Related Works
3 Method
3.1 Main Framework
3.2 SAM Attention Module
3.3 Reciprocal Nearest Neighbor Metric
4 Experiments and Results
4.1 Datasets and Evaluation Indicators
4.2 Experimental settings
4.3 Experimental Results and Comparison
5 Conclusion
References
Three-Dimensional Object Detection Network Based on Coordinate Attention and Overlapping Region Penalty Mechanisms
1 Introduction
1.1 Research Background
1.2 Related Research
1.3 Proposed Method
2 Network Structure
2.1 Backbone Network Module
2.2 Voting Feature Point Generation Module
2.3 Target Center Prediction Module
2.4 Overlapping Region Penalty Mechanism
2.5 Loss Function
3 Experiment
3.1 Dataset
3.2 Training Parameter Setting
3.3 Evaluation Metrics
3.4 Experimental Data
3.5 Ablation Experiment
3.6 Visualization Results Analysis
4 Conclusion
References
Discriminative Region Enhancing and Suppression Network for Fine-Grained Visual Categorization
1 Introduction
2 Related Work
2.1 Fine-Grained Feature Learning
2.2 Discriminative Region Localization
2.3 Visual Attention Mechanisms
3 Method
3.1 Salient Region Diffusion Module
3.2 Loss Function
4 Experiments
4.1 Dataset and Implementation Details
4.2 Experimental Results
4.3 Visualization Analysis
5 Conclusion
References
Edge Based Architecture for Total Energy Regression Models for Computational Materials Science
1 Introduction
2 Background and Base Model
3 Datasets
4 Edge Based Architecture of GemNet-OC
5 Results
6 Conclusion
References
Multiscale Transformer-Based for Multimodal Affective States Estimation from Physiological Signals
1 Introduction
2 Related Works
2.1 Continuous Emotion Recognition from Multimodal Physiological Signal
2.2 Transformer-Based Method in Multimodal Emotion Recognition from Physiological Signal
3 Proposed Approach
3.1 Problem Definition
3.2 Methodology
4 Experimental and Results
4.1 Dataset
4.2 Experiments Setup
4.3 Results
5 Conclusion
References
Frequency Mixup Manipulation Based Unsupervised Domain Adaptation for Brain Disease Identification
1 Introduction
2 Related Work
2.1 Frequency-Based UDA
2.2 Adversarial Training for Domain-Invariant Features
3 Proposed Method
3.1 Intra-domain Adaptation
3.2 Attention-Based Feature Encoder
3.3 Inter-domain Adaptation
3.4 Objective Function
4 Experiments
4.1 Dataset
4.2 Implementation
4.3 Experiments and Analysis
4.4 Ablation Analysis
5 Conclusion
References
SSC-l0: Sparse Subspace Clustering with the l0 Inequality Constraint
1 Introduction
2 Related Works
2.1 Notations
2.2 Sparse Subspace Clustering
3 Sparse Subspace Clustering with the l0 Inequality Constraint
3.1 Motivation of SSC-l0
3.2 Objective Function of SSC-l0
3.3 Optimization of SSC-l0
4 Experiments
4.1 Datasets and Comparison Methods
4.2 Clustering Analysis
4.3 Robustness Analysis
5 Conclusions
References
Pixel-Wise Detection of Road Obstacles on Drivable Areas by Introducing Maximized Entropy to Synboost Framework
1 Introduction
2 Proposed Framework
2.1 Related Methods
2.2 Our Framework Architecture
2.3 Ensemble
3 Experiments
3.1 Dataset
3.2 Experimental Conditions
3.3 Results
4 Conclusions
References
Gender Classification from Gait Energy and Posture Images Using Multi-stage Network
1 Introduction
2 Related Work
3 Gender Classification Framework
4 Experiments and Results
5 Ablation Study
6 Conclusion
References
Oil Temperature Prediction Method Based on Deep Learning and Digital Twins
1 Introduction
2 Research Methods and Model Establishment
2.1 PSO
2.2 Pyraformer
2.3 Proposed PSO-Pyraformer Prediction Model
3 Digital Twins System
4 Analytical Study
4.1 Experimental Data
4.2 Evaluation Indicators
4.3 Experimental Results
4.4 Evaluation of Experimental Results
5 Conclusion
References
Feature Channel Adaptive Enhancement for Fine-Grained Visual Classification
1 Introduction
2 Related Work
2.1 Local Feature Enhancement
2.2 Global Feature Enhancement
2.3 Attention Mechanism
2.4 Feature Suppression
2.5 Feature Diversification
3 Method
3.1 Feature Channel Adaptive Enhancement Module
3.2 Information Sufficient Mining Module
3.3 Network Design
4 Experiments
4.1 Implementation Details
4.2 Compared with State-of-the-Art Methods
4.3 Ablation Experiment
4.4 Visualization
5 Conclusion
References
Visualizing CNN: An Experimental Comparative Study
1 Introduction
2 Related Works
3 Visualization Algorithms
3.1 Occlusion
3.2 Gradients
3.3 Integrated Gradients
3.4 Deconvolution
3.5 Class Activation Maps
4 Evaluation Metrics
4.1 Causality
4.2 Anti-disturbance Capability
4.3 Usability
4.4 Computational Complexity
5 Experiments
5.1 Causality
5.2 Anti-disturbance Capability
5.3 Usability
5.4 Computational Complexity
6 Discussion
6.1 Causality
6.2 Anti-disturbance Capability
6.3 Usability
6.4 Computational Complexity
7 Conclusion
References
Cross-Domain Few-Shot Sparse-Quantization Aware Learning for Lymphoblast Detection in Blood Smear Images
1 Introduction
2 Related Works
3 Methodology
3.1 Prototypical Networks
3.2 Sparse-Aware Meta-training
3.3 Quantization-Aware Meta-training
4 Experimental Setup
4.1 Datasets
5 Results
6 Conclusion
References
MAREPVGG: Multi-attention RepPVGG to Facefake Detection
1 Introduction
2 Related Work
3 Our Approach
3.1 Backbone
3.2 Multiple Attention Maps Generation
3.3 Textural Feature Enhancement
3.4 Bilinear Attention Pooling
3.5 Loss
4 Experiment
4.1 Dataset and Implementation Setting
4.2 Result
5 Conclusion
References
SGFNeRF: Shape Guided 3D Face Generation in Neural Radiance Fields
1 Introduction
2 Related Work
2.1 Neural Radiance Field Representations
2.2 Generative Face Geometry-Aware Synthesis
3 Methods
4 Experiments
4.1 Datasets
4.2 Baselines
4.3 Quantitative Evaluations
4.4 Qualitative Results
4.5 Ablation Analysis
5 Conclusion
References
Hybrid Spatio-Temporal Network for Face Forgery Detection
1 Introduction
2 Related Work
3 Method
3.1 Overview
3.2 Hybrid Architecture for Face Forgery Detection
3.3 Feature Alignment Block
3.4 Vector Selection Block
4 Experiments
4.1 Experimental Settings
4.2 Comparison with SOTA Methods
4.3 Ablation Study
4.4 Visualization
5 Conclusions
References
Dewarping Document Image in Complex Scene by Geometric Control Points
1 Introduction
2 Related Work
2.1 Hand Crafted Features Based Methods
2.2 Deep Learning Based Methods
3 Approach
3.1 Network Architecture
3.2 Post-processing
3.3 Loss Functions
3.4 Training Details
4 Experiments
4.1 Datasets
4.2 Evaluation Metrics
4.3 Results
4.4 Ablation Study and Limitation Discussion
5 Conclusion
References
Multi-discriminative Parts Mining for Fine-Grained Visual Classification
1 Introduction
2 Related Work
2.1 Part Locating
2.2 Fine-Grained Feature Learning
3 Method
3.1 Attention Map Based Locate-Mine Module (AMLM)
3.2 Multi-layer Feature Fusion Module (MFF)
3.3 Adaptive Label Loss
4 Experiment
4.1 Comparison with the Latest Methods
4.2 Ablation Experiments
4.3 Visualization
5 Summary
References
Geometric Distortion Correction for Extreme Scene Distortion of UAV-Acquired Images
1 Introduction
2 Related Work
2.1 Feature-Based Image Recognition
2.2 Geometric Distortion Correction
3 Proposed Method
3.1 Dataset Description
3.2 Geometric Distortion Detection
3.3 Extreme Scene Distortion Correction Proposed Solution
3.4 Image Quality Metrics
4 Experimental Results
5 Conclusion
References
Q-YOLO: Efficient Inference for Real-Time Object Detection
1 Introduction
2 Related Work
2.1 Quantization
2.2 Real-Time Object Detection
3 Methodology
3.1 Preliminaries
3.2 Quantization Range Setting
3.3 Unilateral Histogram-Based (UH) Activation Quantization
4 Experiments
4.1 Implementation Details
4.2 Main Results
4.3 Ablation Study
4.4 Inference Speed
5 Conclusions
References
VITS, Tacotron or FastSpeech? Challenging Some of the Most Popular Synthesizers
1 Introduction
2 Systems Description
2.1 Tacotron2 + Multi-band MelGAN
2.2 FastSpeech2 + Multi-band MelGAN
2.3 VITS
3 Test Description
3.1 Loanwords and Homographs
3.2 Phonetic vs Orthographic Input
3.3 Overall System Comparison
3.4 Voices Used in the Experiments
4 Results
4.1 Loanwords and Homographs
4.2 Phonetic vs Orthographic Input
4.3 Overall System Comparison
5 Conclusions
References
T5G2P: Multilingual Grapheme-to-Phoneme Conversion with Text-to-Text Transfer Transformer
1 Introduction
2 T5 Models
3 G2P Data and Models
4 T5G2P Results
4.1 Unseen Words in Testing Data
5 Conclusion
References
Transformer-Based Encoder-Encoder Architecture for Spoken Term Detection
1 Introduction
2 Neural Spoken Term Detection Framework
3 Transformer-Based Architecture
3.1 Preliminary Experiments
3.2 Proposed Architecture
3.3 Simplifications over Deep LSTM
3.4 Transformer Block
3.5 Input Transformation
4 Dataset and Model Description
5 Experimental Results
6 Conclusion
References
Will XAI Provide Real Explanation or Just a Plausible Rationalization?
1 Introduction
2 The Relation of the AI Paradigms to the Working of the Human Mind
2.1 Two Major AI Paradigms
2.2 Human Mind - The Dual Process Theory
2.3 The Relation Between the AI and the Mind
3 More Detailed Specification of Explainability
4 The Danger of Rationalization
5 Conclusion
References
Voice-Interactive Learning Dialogue on a Low-Cost Device
1 Introduction
2 Related Work
2.1 Interactive Learning
3 Target Device
4 Datasets
5 Interactive Learning Loop
5.1 FaceID and Learning New Faces
5.2 Network Architecture for Image Classification
5.3 Human Feedback and Network Retraining
6 Experimental Setup and Evaluation
7 Conclusion
References
Composite Restoration of Infrared Image Based on Adaptive Threshold Multi-parameter Wavelet
1 Introduction
2 Mixed Noise Model
3 Compound Restoration Method Based on ATMW
3.1 ATMW for Multiplicative Noise Removal
3.2 Fast Non-local Means (FNLM) Algorithm for Additive Noise Removal
4 Results and Discussion
4.1 Analysis of Experimental Results
5 Conclusions
References
Author Index

LNCS 14408

Huimin Lu · Michael Blumenstein · Sung-Bae Cho · Cheng-Lin Liu · Yasushi Yagi · Tohru Kamiya (Eds.)

Pattern Recognition 7th Asian Conference, ACPR 2023 Kitakyushu, Japan, November 5–8, 2023 Proceedings, Part III

Lecture Notes in Computer Science
Founding Editors: Gerhard Goos, Juris Hartmanis

Editorial Board Members: Elisa Bertino, Purdue University, West Lafayette, IN, USA; Wen Gao, Peking University, Beijing, China; Bernhard Steffen, TU Dortmund University, Dortmund, Germany; Moti Yung, Columbia University, New York, NY, USA

14408

The series Lecture Notes in Computer Science (LNCS), including its subseries Lecture Notes in Artificial Intelligence (LNAI) and Lecture Notes in Bioinformatics (LNBI), has established itself as a medium for the publication of new developments in computer science and information technology research, teaching, and education. LNCS enjoys close cooperation with the computer science R & D community, the series counts many renowned academics among its volume editors and paper authors, and collaborates with prestigious societies. Its mission is to serve this international community by providing an invaluable service, mainly focused on the publication of conference and workshop proceedings and postproceedings. LNCS commenced publication in 1973.

Huimin Lu · Michael Blumenstein · Sung-Bae Cho · Cheng-Lin Liu · Yasushi Yagi · Tohru Kamiya Editors

Pattern Recognition 7th Asian Conference, ACPR 2023 Kitakyushu, Japan, November 5–8, 2023 Proceedings, Part III

Editors Huimin Lu Kyushu Institute of Technology Kitakyushu, Fukuoka, Japan

Michael Blumenstein The University of Sydney Sydney, NSW, Australia

Sung-Bae Cho Yonsei University Seoul, Korea (Republic of)

Cheng-Lin Liu Chinese Academy of Sciences Beijing, China

Yasushi Yagi Osaka University Osaka, Ibaraki, Japan

Tohru Kamiya Kyushu Institute of Technology Kitakyushu, Japan

ISSN 0302-9743 (print)   ISSN 1611-3349 (electronic)
Lecture Notes in Computer Science
ISBN 978-3-031-47664-8   ISBN 978-3-031-47665-5 (eBook)
https://doi.org/10.1007/978-3-031-47665-5

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2023

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG. The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland. Paper in this product is recyclable.

Preface for ACPR 2023 Proceedings

Pattern recognition stands at the core of artificial intelligence and has been evolving significantly in recent years. These proceedings include high-quality original research papers presented at the 7th Asian Conference on Pattern Recognition (ACPR 2023), which was successfully held in Kitakyushu, Japan from November 5th to November 8th, 2023. The conference welcomed participants from all over the world to meet physically in beautiful Kitakyushu to exchange ideas, as we did in our past ACPR series of conferences. The conference was operated in a hybrid format allowing for both on-site and virtual participation. With all your participation and contributions, we believe ACPR 2023 was a special and memorable conference in history!

ACPR 2023 was the 7th conference of its series since it was launched in 2011 in Beijing, followed by ACPR 2013 in Okinawa, Japan, ACPR 2015 in Kuala Lumpur, Malaysia, ACPR 2017 in Nanjing, China, ACPR 2019 in Auckland, New Zealand, and ACPR 2021 in Jeju Island, South Korea. ACPR was initiated to promote pattern recognition theory, technologies and applications in the Asia-Pacific region, and over the years it has welcomed authors from all over the world. ACPR 2023 focused on four important areas of pattern recognition: pattern recognition and machine learning, computer vision and robot vision, signal processing, and media processing and interaction, covering various technical aspects.

ACPR 2023 received 164 submissions from 21 countries. The program chairs invited 141 program committee members and additional reviewers. Each paper was reviewed single-blind by at least two reviewers, and most papers received three reviews each. Finally, 93 papers were accepted for presentation in the program, resulting in an acceptance rate of 56.7%. The technical program of ACPR was scheduled over four days (5–8 November 2023), including two workshops, four keynote speeches, and nine oral sessions. The keynote speeches were presented by internationally renowned researchers. Tatsuya Harada, from the University of Tokyo, Japan, gave a speech titled “Learning to reconstruct deformable 3D objects”. Longin Jan Latecki, from Temple University, USA, gave a speech titled “Image retrieval by training different query views to retrieve the same database images”. Jingyi Yu, from ShanghaiTech University, China, gave a speech titled “Bridging recognition and reconstruction: Generative techniques on digital human, animal, and beyond”. Mark Nixon, from the University of Southampton, UK, gave a speech titled “Gait Biometrics – from then to now and the deep revolution”.

Organizing a large event is a challenging task, requiring intensive teamwork. We would like to thank all members of the organizing committee for their hard work, with guidance from the steering committee. The program chairs, publication chairs, publicity chairs, workshop chairs, tutorial chairs, exhibition/demo chairs, sponsorship chairs, finance chairs, local organizing chairs, and webmaster all led their respective committees and worked together closely to make ACPR 2023 successful. Our special thanks go to the many reviewers, whom we cannot name one by one, for their constructive comments to improve the papers. We thank all the authors who submitted their papers, which is the most important part of a scientific conference. Finally, we would like to acknowledge the student volunteers from our local organizers. We hope these proceedings will be a valuable resource for researchers and practitioners in the field of pattern recognition.

November 2023

Cheng-Lin Liu Yasushi Yagi Tohru Kamiya Michael Blumenstein Huimin Lu Wankou Yang Sung-Bae Cho

Organization

Steering Committee
Seong-Whan Lee, Korea University, South Korea
Cheng-Lin Liu, Chinese Academy of Sciences, China
Umapada Pal, Indian Statistical Institute, India
Tieniu Tan, Nanjing University, China
Yasushi Yagi, Osaka University, Japan

General Chairs
Cheng-Lin Liu, Chinese Academy of Sciences, China
Yasushi Yagi, Osaka University, Japan
Tohru Kamiya, Kyushu Institute of Technology, Japan

Program Chairs
Michael Blumenstein, University of Sydney, Australia
Huimin Lu, Kyushu Institute of Technology, Japan
Wankou Yang, Southeast University, China
Sung-Bae Cho, Yonsei University, South Korea

Publication Chairs
Yujie Li, Kyushu Institute of Technology, Japan
Xizhao Wang, Shenzhen University, China
Manu Malek, Stevens Institute of Technology, USA

Publicity Chairs
Jihua Zhu, Xi’an Jiaotong University, China
Limei Peng, Kyungpook National University, South Korea
Shinya Takahashi, Fukuoka University, Japan

Workshop Chairs
JooKooi Tan, Kyushu Institute of Technology, Japan
Weihua Ou, Guizhou Normal University, China
Jinjia Zhou, Hosei University, Japan

Tutorial Chairs
Shenglin Mu, Ehime University, Japan
Xing Xu, University of Electronic Science and Technology of China, China
Tohlu Matsushima, Kyushu Institute of Technology, Japan

Exhibition/Demo Chairs
Zongyuan Ge, Monash University, Australia
Yuya Nishida, Kyushu Institute of Technology, Japan
Wendy Flores-Fuentes, Universidad Autónoma de Baja California, Mexico

Sponsorship Chairs
Rushi Lan, Guilin University of Electronic Technology, China
Keiichiro Yonezawa, Kyushu Institute of Technology, Japan
Jože Guna, University of Ljubljana, Slovenia

Finance Chairs
Quan Zhou, Nanjing University of Posts and Telecommunications, China
Ainul Akmar Mokhtar, Universiti Teknologi Petronas, Malaysia
Shota Nakashima, Yamaguchi University, Japan

Local Organizing Chairs
Nobuo Sakai, Kyushu Institute of Technology, Japan
Naoyuki Tsuruta, Fukuoka University, Japan
Xiaoqing Wen, Kyushu Institute of Technology, Japan

Webmaster
Jintong Cai, Southeast University, China

Program Committee Members Alireza Alaei Noriko Takemura Yuchao Zheng Michael Cree Jingyi Wang Cairong Zhao Minh Nguyen Huimin Lu Jinshi Cui Renlong Hang Takayoshi Yamashita Hirotake Yamazoe Weiqi Yan Weihua Ou Umapada Pal Wankou Yang Shuo Yang Koichi Ito Qiguang Miao Yirui Wu Jaesik Choi Nobuo Sakai Songcan Chen Sukalpa Chanda Xin Jin Masayuki Tanaka Fumihiko Sakaue Jaehwa Park Hiroaki Kawashima Hiroshi Tanaka Wendy Flores-Fuentes Yasushi Makihara Jože Guna Yanwu Xu Guangwei Gao Rushi Lan Kazuhiro Hotta

Shinya Takahashi Xing Xu Weifeng Liu Kaushik Roy Quan Zhou Daisuke Miyazaki Byoungchul Ko Sung-Bae Cho Yoshito Mekada Kar-Ann Toh Martin Stommel Tohru Kamiya Xiaoqing Wen Xiaoyi Jiang Jihua Zhu Michael Blumenstein Andrew Tzer-Yeu Chen Shohei Nobuhara Yoshihiko Mochizuki Yasutomo Kawanishi Jinjia Zhou Yusuyuki Sugaya Ikuhisa Mitsugami Yubao Sun Dong-Gyu Lee Yuzuko Utsumi Saumik Bhattacharya Masaaki Iiyama Shang-Hong Lai Shivakumara Palaiahnakote Limei Peng Jookooi Tan Shenglin Mu Zongyuan Ge Ainul Mokhtar Shota Nakashima Naoyuki Tsuruta


Chunyan Ma Sung-Ho Bae Gong Cheng Sungjoon Choi Andreas Dengel Junyu Dong Bo Du Jianjiang Feng Fei Gao Hitoshi Habe Tsubasa Hirakawa Maiya Hori Yoshihisa Ijiri Kohei Inoue Yumi Iwashita Taeeui Kam Kunio Kashino Sangpil Kim Jinkyu Kim Hui Kong Seong-Whan Lee Namhoon Lee Xuelong Li Zhu Li Zechao Li Junxia Li Jia Li Qingshan Liu Feng Lu Jiayi Ma Brendan Mccane Tetsuro Morimoto

Hajime Nagahara Masashi Nishiyama Naoko Nitta Kazunori Okada Srikanta Pal Partha Pratim Roy Hong-Bin Shen Atsushi Shimada Xiangbo Shu Heung-Il Suk Jun Sun Minhyuk Sung Tomokazu Takahashi Kenichiro Tanaka Ngo Thanh Trung Christian Wallraven Qi Wang Xiushen Wei Yihong Wu Haiyuan Wu Guiyu Xia Guisong Xia Yong Xu Junchi Yan Keiji Yanai Xucheng Yin Xianghua Ying Kaihua Zhang Shanshan Zhang Hao Zhang Jiang Yu Zheng Wangmeng Zuo

Contents – Part III

Anisotropic Operator Based on Adaptable Metric-Convolution Stage-Depth Filtering Applied to Depth Completion (p. 1)
Vanel Lazcano, Iván Ramírez, and Felipe Calderero

Stable Character Recognition Strategy Using Ventilation Manipulation in ERP-Based Brain-Computer Interface (p. 15)
Dogeun Park, Young-Gi Ju, and Dong-Ok Won

Exploring CycleGAN for Bias Reduction in Gender Classification: Generative Modelling for Diversifying Data Augmentation (p. 26)
Dimitri Hooftman, S. Sahand Mohammadi Ziabari, and Joop Snijder

Knee Osteoarthritis Diagnostic System Based on 3D Multi-task Convolutional Neural Network: Data from the Osteoarthritis Initiative (p. 41)
Khin Wee Lai, Pauline Shan Qing Yeoh, Siew Li Goh, Khairunnisa Hasikin, and Xiang Wu

Multi-step Air Quality Index Forecasting Based on Parallel Multi-input Transformers (p. 52)
Jie Xie, Jun Li, Mingying Zhu, and Qiong Wang

One-shot Video-based Person Re-identification based on SAM Attention and Reciprocal Nearest Neighbors Metric (p. 64)
Ji Zhang, Yuchang Yin, and Hongyuan Wang

Three-Dimensional Object Detection Network Based on Coordinate Attention and Overlapping Region Penalty Mechanisms (p. 79)
Wenxin Li, Shiyu Zhu, Hongzhi Liu, Pinzheng Zhang, and Xiaoqin Zhang

Discriminative Region Enhancing and Suppression Network for Fine-Grained Visual Categorization (p. 93)
Guanhua Wu, Cheng Pang, Rushi Lan, Yilin Zhang, and Pingping Zhou

Edge Based Architecture for Total Energy Regression Models for Computational Materials Science (p. 106)
Kangmo Yeo, Sukmin Jeong, and Soo-Hyung Kim

Multiscale Transformer-Based for Multimodal Affective States Estimation from Physiological Signals (p. 113)
Ngoc Tu Vu, Van Thong Huynh, Hyung-Jeong Yang, and Soo-Hyung Kim

Frequency Mixup Manipulation Based Unsupervised Domain Adaptation for Brain Disease Identification (p. 123)
Yooseung Shin, Junyeong Maeng, Kwanseok Oh, and Heung-Il Suk

SSC-l0: Sparse Subspace Clustering with the l0 Inequality Constraint (p. 136)
Yangbo Wang, Jie Zhou, Qingshui Lin, Jianglin Lu, and Can Gao

Pixel-Wise Detection of Road Obstacles on Drivable Areas by Introducing Maximized Entropy to Synboost Framework (p. 150)
Carlos David Ardon Munoz, Masashi Nishiyama, and Yoshio Iwai

Gender Classification from Gait Energy and Posture Images Using Multi-stage Network (p. 162)
Tak-Man Leung and Kwok-Leung Chan

Oil Temperature Prediction Method Based on Deep Learning and Digital Twins (p. 174)
Zengxu Bian, Zhibo Wan, Feiyu Li, Dejun Liu, and Zhihan Lyu

Feature Channel Adaptive Enhancement for Fine-Grained Visual Classification (p. 185)
Dingzhou Xie, Cheng Pang, Guanhua Wu, and Rushi Lan

Visualizing CNN: An Experimental Comparative Study (p. 199)
Xinyi Xu, Shi Tu, Yunfan Xue, and Lin Chai

Cross-Domain Few-Shot Sparse-Quantization Aware Learning for Lymphoblast Detection in Blood Smear Images (p. 213)
Dina Aboutahoun, Rami Zewail, Keiji Kimura, and Mostafa I. Soliman

MAREPVGG: Multi-attention RepPVGG to Facefake Detection (p. 227)
Zhuochao Huang, Rui Yang, Rushi Lan, Cheng Pang, and Xiaoyan Luo

SGFNeRF: Shape Guided 3D Face Generation in Neural Radiance Fields (p. 238)
Peizhu Zhou, Xuhui Liu, and Baochang Zhang

Hybrid Spatio-Temporal Network for Face Forgery Detection (p. 250)
Xuhui Liu, Sicheng Gao, Peizhu Zhou, Jianzhuang Liu, Xiaoyan Luo, Luping Zhang, and Baochang Zhang

Dewarping Document Image in Complex Scene by Geometric Control Points (p. 265)
Run-Xi Li, Fei Yin, and Lin-Lin Huang

Multi-discriminative Parts Mining for Fine-Grained Visual Classification (p. 279)
Pingping Zhou, Cheng Pang, Rushi Lan, Guanhua Wu, and Yilin Zhang

Geometric Distortion Correction for Extreme Scene Distortion of UAV-Acquired Images (p. 293)
Mark Phil B. Pacot and Nelson Marcos

Q-YOLO: Efficient Inference for Real-Time Object Detection (p. 307)
Mingze Wang, Huixin Sun, Jun Shi, Xuhui Liu, Xianbin Cao, Luping Zhang, and Baochang Zhang

VITS, Tacotron or FastSpeech? Challenging Some of the Most Popular Synthesizers (p. 322)
Jindřich Matoušek, Daniel Tihelka, and Alice Tihelková

T5G2P: Multilingual Grapheme-to-Phoneme Conversion with Text-to-Text Transfer Transformer (p. 336)
Markéta Řezáčková, Adam Frémund, Jan Švec, and Jindřich Matoušek

Transformer-Based Encoder-Encoder Architecture for Spoken Term Detection (p. 346)
Jan Švec, Luboš Šmídl, and Jan Lehečka

Will XAI Provide Real Explanation or Just a Plausible Rationalization? (p. 358)
Pavel Ircing and Jan Švec

Voice-Interactive Learning Dialogue on a Low-Cost Device (p. 369)
Martin Bulín, Martin Adamec, Petr Neduchal, Marek Hrúz, and Jan Švec

Composite Restoration of Infrared Image Based on Adaptive Threshold Multi-parameter Wavelet (p. 383)
Shuai Liu, Peng Chen, Zhengxiang Shen, and Zhanshan Wang

Author Index (p. 397)

Anisotropic Operator Based on Adaptable Metric-Convolution Stage-Depth Filtering Applied to Depth Completion

Vanel Lazcano (1)(B), Iván Ramírez (2), and Felipe Calderero (3)

(1) Facultad de Ciencia, Ingeniería y Tecnología, Núcleo de Matemática, Física y Estadística, Universidad Mayor, Manuel Montt 318, Providencia, Chile. [email protected]
(2) Facultad de Ciencia, Ingeniería y Tecnología, Escuela de Ingeniería, Universidad Mayor, Manuel Montt 318, Providencia, Chile. [email protected]
(3) VP @ Ladorian - Digital School Madrid, Madrid, Spain

Abstract. Nowadays, depth maps are a crucial source of information for many applications based on artificial vision. Applications such as video games, 3D cinema, or unmanned autonomous vehicle control strongly depend on data extracted from depth maps. Depth maps can be estimated by algorithms or acquired by sensors (such as a Kinect sensor, Time-of-Flight camera, or LiDAR sensor). Acquired depth data frequently contains holes or data with low confidence levels. An interpolation model is necessary to solve the problem of missing data or to complete these holes in depth maps. We constructed a manifold given the image domain and a metric whose parameters are learned from the data. The primary approach is to give the parameterized metric enough flexibility to estimate the manifold’s shape correctly. Additionally, the proposal embeds depth estimation in a pipeline that combines convolution stages with the anisotropic metric. We estimated the parameters of the proposal using the PSO algorithm. We assessed our proposal using the publicly available KITTI Depth Completion Suite dataset. The obtained results show that this proposal outperforms our previous implementation and other contemporary models. Additionally, we performed an ablation study of the model, showing that its critical component is the first convolution stage, which enforces the edges of the color image.

Keywords: Depth maps · Infinity Laplacian · Depth Completion · Depth Filtering

1 Introduction

Nowadays, depth maps are essential information for many applications, such as 3D cinema, video games, or unmanned autonomous vehicles. A depth map is a picture of a scene in which every element stores the distance from the corresponding object in the scene to a fixed point (a sensor). A depth map can be estimated by an algorithm (a stereo algorithm) or acquired by a sensor (Kinect, Time-of-Flight camera, or LiDAR). The information given by the sensor contains points with low confidence levels and regions of the depth map with missing data. The high performance standards of these applications make the completion of depth maps in areas without data necessary.

This paper proposes a model to interpolate depth data based on the infinity Laplacian. We numerically solve the degenerate second-order partial differential equation associated with the infinity Laplacian. Given the domain Ω ⊂ R² and a metric dxy, we construct a manifold M = (Ω, dxy) in which the available data is embedded, and we solve the equation in this manifold. The contributions of this paper are: i) the addition of a filter that eliminates outliers in the available depth data; ii) the use of an anisotropic metric; and iii) an ablation study of the model.

In Sect. 2 we present a literature review of works similar to ours. In Sect. 3 we propose a model to complete sparse depth maps, and Sect. 4 describes its practical implementation. Section 5 presents the implementation of the Particle Swarm Optimization algorithm used to estimate the parameters of the proposal. In Sects. 6 and 7 we present the dataset and the performed experiments, and finally we give our conclusions in Sect. 8.

2 Related Works

The objective of depth completion is to interpolate sparse depth maps in order to generate dense depth maps. Some depth interpolation models use only the available depth data, whereas others use additional information (a color reference image of the scene) to perform depth completion guided by the color image [10,15,20,24]. The idea behind guided methods is to steer the completion process by taking into account regional characteristics of the image, such as edges or smoothness. Nowadays, many proposals use CNNs (Convolutional Neural Networks) to interpolate sparse depth maps [4,6,8,15,18,25]. In the work presented in [18], depth completion is performed with a CNN without the reference color image. The authors simultaneously interpolate the sparse depth information and reconstruct a gray-level image of the scene. They evaluated their model on a publicly available dataset and found that it outperforms contemporaneous models that use data extracted from the color reference image [25]. In [6], the authors represented the image using Depth Coefficients, avoiding inter-object mixing. In [4], a model based on a normalized convolutional neural network (NCNN) is presented that completes depth information guided by the confidence of the input data.


The work in [16] uses spatial propagation networks. These affinity-based networks are used to complete depth maps, but they suffer from under-representation due to fixed affinities and from over-smoothing. In general, the independent affinity matrices estimated by conventional approaches are over-parameterized. The authors provided a successful model that uses a dynamic attention-based technique to discover the affinity between neighboring pixels. Diffusion suppression, attention maps, and a non-linear propagation model are all used in the model. The model avoids over-smoothing the final solution and produces better results with fewer iterations than traditional spatial propagation networks.

On the other hand, methods that do not use neural networks are presented in [7,14,23]. The work of Lazcano et al. [14] presents an experimental assessment of a practical implementation of the infinity Laplacian used to interpolate depth maps. The authors compared the performance of their proposal using different metrics (L1, L2, and L1/2) and different color spaces (RGB, XYZ, CIE-Lab, and CMY), and applied their proposal to depth and optical flow completion. The model called PDC [23] presents a non-deep-learning (or classical) model. The authors remove misaligned points from the depth data by filtering it. The reference color image is then divided into superpixels, and a convex hull of the most inlying points is constructed to define a superpixel. This superpixel is used to fit the local depth data if a plane is locally inadequate. The filtered and residual data are then interpolated using a pinhole camera model. According to their KITTI Depth Completion Suite findings, the results obtained with the proposal outperform models based on more traditional concepts, such as variational models or morphological operations. A piecewise depth map completion model is stated in the work of Krauss et al. [7]. The model divides color images into superpixels corresponding to regions with similar depth values. They suppose these superpixels correspond to pixels from the same objects, and the superpixels are obtained using a cost map. The results are comparable to those of state-of-the-art algorithms. In their analysis of the KITTI dataset, the authors show the effects of each processing stage and the overall performance.

The authors in [1] applied the AMLE or infinity Laplacian to elevation models, and the infinity Laplacian has been applied to optical flow completion in [14,21]. The authors of [13] applied the bAMLE to complete depth data but did not experiment with various metrics or color spaces. Furthermore, the infinity Laplacian operator has been used to complete surfaces embedded in R³ given a few iso-level curves [2]. Finally, new databases have been constructed and protocols defined to systematically assess the performance of depth completion methods, such as KITTI [25] or NYU_V2. Recently, the work in [5] shows applications of the infinity Laplacian in segmentation, completion of color images, and clustering.

3 Proposed Model

We used the pipeline presented in [12] to complete depth maps. We can follow two paths in Fig. 1: the first is the green path, and the second is the red-dotted path. We use a binary parameter, “reversible”, to select which path we follow: if the parameter is 0, we follow the green path, and if the parameter is 1, we follow the red-dotted path. First, we explain the green path. The pipeline starts with a color reference image of the scene. To enforce the features of the color reference image, we process the image with a battery of Gabor filters and max pool the output of the filters (stage SC1), and to eliminate outliers, we apply a pre-filter to the acquired depth map. Then, the infinity Laplacian interpolates the available depth data. Finally, the output of the infinity Laplacian is filtered by a battery of convolutional filters, and the output is max pooled (stage SC2). In Fig. 1, we show the pipeline, which we explain in detail in this section.

Fig. 1. Pipeline used to complete depth maps.
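As a rough illustration of a stage like SC1, the sketch below builds a small bank of Gabor filters, keeps the per-pixel maximum response, and max pools the result. It is only a minimal sketch assuming OpenCV; the number of orientations, kernel size, and pooling window are illustrative placeholders, not the configuration used by the authors.

```python
import cv2
import numpy as np

def gabor_stage(gray, n_orientations=4, ksize=15, sigma=3.0, lambd=8.0, gamma=0.5):
    """Filter the image with a small bank of Gabor kernels and keep,
    per pixel, the maximum response over all orientations (SC1-like stage)."""
    responses = []
    for k in range(n_orientations):
        theta = k * np.pi / n_orientations
        kernel = cv2.getGaborKernel((ksize, ksize), sigma, theta, lambd, gamma, psi=0)
        kernel /= np.abs(kernel).sum() + 1e-8     # normalize kernel energy
        responses.append(cv2.filter2D(gray, cv2.CV_32F, kernel))
    return np.max(np.stack(responses, axis=0), axis=0)

def max_pool2d(img, size=2):
    """Spatial max pooling with a square window (stride = window size)."""
    h, w = img.shape[0] // size * size, img.shape[1] // size * size
    img = img[:h, :w]
    return img.reshape(h // size, size, w // size, size).max(axis=(1, 3))

if __name__ == "__main__":
    rgb = cv2.imread("reference.png")             # hypothetical color reference image
    if rgb is None:
        raise SystemExit("provide a color reference image")
    gray = cv2.cvtColor(rgb, cv2.COLOR_BGR2GRAY).astype(np.float32) / 255.0
    features = max_pool2d(gabor_stage(gray))      # edge-enhanced feature map
    print(features.shape)
```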

3.1 Interpolation Model

To interpolate the data, we use the infinity Laplacian. Let us consider u : Ω ⊂ R² → R, the map to interpolate. We solve the infinity Laplacian to interpolate the available data,

$$\Delta_{\infty,g}\, u = 0 \quad \text{in } \Omega, \qquad (1)$$

with the boundary condition u|∂O = θ, where O is the set where the available data is located and θ is the available depth data. Given the domain Ω and the metric g, we construct a manifold M = (Ω, g) on which the infinity Laplacian should be solved. To solve the infinity Laplacian in the manifold, we solve

$$\Delta_{\infty,g}\, u = D^2_M u\!\left(\frac{\nabla u}{|\nabla u|}, \frac{\nabla u}{|\nabla u|}\right) = 0, \qquad (2)$$

which is a degenerate second-order partial differential equation.

3.2 Depth Map Filtering

Pre-filtering of the data eliminates some outliers of the acquired depth data. The pre-filtering is a morphological operation that depends on the depth values of a pixel x and its neighborhood. Figure 2 shows the considered neighborhood and the central point.

Fig. 2. Neighborhood B(x) considered to implement the depth filtering. The neighborhood is of size P around the point x. (a) Neighborhood of size P = 1. (b) Neighborhood of size P = 2.

Figure 2 shows the neighborhood around every point x of the acquired depth map. The acquired data has the following codification:

$$\mathrm{depth}(x) = -1 \ \text{(no data)}, \qquad \mathrm{depth}(x) \neq -1 \ \text{(available data)}. \qquad (3)$$

The main idea is that we start with a point, called the central point x. We check whether this point is valid, meaning that it is not a missing value (depth(x) ≠ −1). If it is valid, we look at all the points around it (y ∈ B(x)). If all of these points y are also valid and their depth is less than the depth of the central point, then we know that the central point is an outlier. In this case, we eliminate the point by setting its depth(x) to −1. In practice, we implemented this filtering by comparing the depth values of the points surrounding x with a threshold (θf) times depth(x), i.e.:

$$\text{if } \left( \sum_{y \in B(x),\, y \neq x} \big[\, \theta_f\, \mathrm{depth}(y) > \mathrm{depth}(x) \,\big] \right) > 1, \quad \text{then set } \mathrm{depth}(x) = -1. \qquad (4)$$

We estimated the θf in the training stage.
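For concreteness, a minimal NumPy sketch of this pre-filter is given below. It implements the comparison exactly as written in Eq. (4); the window half-size P and the threshold θf are placeholders (the paper learns θf during training), and the brute-force loops are for clarity rather than speed.

```python
import numpy as np

def prefilter_depth(depth, theta_f=0.9, P=1):
    """Remove outliers from a sparse depth map coded as in Eq. (3):
    depth == -1 means 'no data', any other value is a valid sample.
    A valid point x is discarded (set to -1) when more than one valid
    neighbor y in its (2P+1)x(2P+1) window satisfies
    theta_f * depth(y) > depth(x), as in Eq. (4)."""
    out = depth.copy()
    h, w = depth.shape
    for i in range(h):
        for j in range(w):
            d = depth[i, j]
            if d == -1:                       # no data at the central point
                continue
            votes = 0
            for di in range(-P, P + 1):
                for dj in range(-P, P + 1):
                    if di == 0 and dj == 0:
                        continue
                    ni, nj = i + di, j + dj
                    if 0 <= ni < h and 0 <= nj < w and depth[ni, nj] != -1:
                        if theta_f * depth[ni, nj] > d:
                            votes += 1
            if votes > 1:                     # Eq. (4): flag x as an outlier
                out[i, j] = -1
    return out
```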


Fig. 3. Examples of depth filtering. (a) Reference color image. (b) Acquired depth data. (c) Filtered depth data.

In Fig. 3, we show an example of the depth data pre-filtering. Figure 3 (a) shows a cropped color reference image of a car. In (b), we show its corresponding depth map acquired by a LiDAR. In (c), we show the output of our filter. We observe in (b) that most of the data on the car's surface (in yellow) presents larger depth values than the surrounding available data, meaning that all those yellow points are outliers (depth values significantly different from their surroundings). Our filter eliminates these outliers, as shown in (c). The resulting depth map contains less available data, but the remaining data has fewer outliers, which ultimately improves the performance of our model and makes the completed depth map more accurate and reliable.

4 Practical Model Implementation

As presented in [19], the infinity Laplacian can be computed as the average of the positive and negative eikonal operators,

$$\frac{\nabla u(x)^+_\xi + \nabla u(x)^-_\xi}{2} = 0. \qquad (5)$$


The positive eikonal operator is a non-local gradient defined as

$$\nabla u(x)^+_\xi = \frac{u(y) - u(x)}{d_{xy}}, \qquad (6)$$

where y is the position in a neighborhood N(x) around x that maximizes the ratio in Eq. (6), and dxy is the distance between x and y. The negative eikonal operator is a non-local gradient defined as

$$\nabla u(x)^-_\xi = \frac{u(z) - u(x)}{d_{xz}}, \qquad (7)$$

where z is the position in a neighborhood N(x) that minimizes the ratio in Eq. (7).

4.1 Numerical Model for the Infinity Laplacian

Given the above definitions, the numerical model for the infinity Laplacian is stated as in [13]. Let y and z be the locations that maximize the positive eikonal operator and minimize the negative eikonal operator, respectively. Taking these definitions into account, it is possible to state the infinity Laplacian as

$$\frac{1}{2}\left(\frac{u(y) - u(x)}{d_{xy}} + \frac{u(z) - u(x)}{d_{xz}}\right) = 0. \qquad (8)$$

The solution of Eq. (8) is given by

$$u(x) = \frac{d_{xz}\, u(y) + d_{xy}\, u(z)}{d_{xz} + d_{xy}}. \qquad (9)$$

The iterated version of the infinity Laplacian is given by

$$u^{k+1}(x) = \frac{d_{xz}\, u^k(y) + d_{xy}\, u^k(z)}{d_{xz} + d_{xy}}, \quad k = 1, 2, 3, \ldots \qquad (10)$$
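The scheme of Eqs. (6)-(10) can be sketched as a Jacobi-style sweep: for every unknown pixel we search a neighborhood for the positions y and z that maximize and minimize the non-local gradient and apply the weighted average of Eq. (9). The code below is a didactic sketch assuming a fixed square search window and a user-supplied distance function for dxy; it is not the authors' optimized implementation.

```python
import numpy as np

def infinity_laplacian_step(u, known, dist, N=2):
    """One Jacobi-style sweep of Eq. (10).

    u     : current depth estimate (2D array), initialized everywhere (fixed on known pixels)
    known : boolean mask of pixels whose value is a boundary condition
    dist  : callable dist((i, j), (k, l)) -> positive distance d_xy on the manifold
    N     : half-size of the square search neighborhood
    """
    h, w = u.shape
    u_new = u.copy()
    for i in range(h):
        for j in range(w):
            if known[i, j]:
                continue
            best_pos, best_neg = None, None
            grad_pos, grad_neg = -np.inf, np.inf
            for di in range(-N, N + 1):
                for dj in range(-N, N + 1):
                    ni, nj = i + di, j + dj
                    if (di == 0 and dj == 0) or not (0 <= ni < h and 0 <= nj < w):
                        continue
                    d = dist((i, j), (ni, nj))
                    g = (u[ni, nj] - u[i, j]) / d          # Eqs. (6)-(7)
                    if g > grad_pos:
                        grad_pos, best_pos = g, (ni, nj, d)
                    if g < grad_neg:
                        grad_neg, best_neg = g, (ni, nj, d)
            if best_pos is None or best_neg is None:
                continue
            yi, yj, dxy = best_pos
            zi, zj, dxz = best_neg
            # Eq. (9)/(10): weighted average of the two extremal neighbors
            u_new[i, j] = (dxz * u[yi, yj] + dxy * u[zi, zj]) / (dxz + dxy)
    return u_new
```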

4.2 Considered Metric

As we saw in the section above, the eikonal operator depends on the distance between two points in the manifold. To estimate the distance, we use a metric composed of a spatial term and a photometric term. We considered a metric different from L2; the considered metric gives more flexibility to the estimation of the manifold's shape. We state the assumed metric:

$$d_{xy} = \left(\kappa_x \|x - y\|^p + \kappa_c \|I(x) - I(y)\|^s\right)^q, \qquad (11)$$

where κx, κc, p, s, and q are parameters of the model that have to be empirically estimated. In this paper, we considered an adaptable metric. This metric has a balance term that balances the spatial and photometric terms. The expression for this metric is given by

$$d_{xy} = \left(\kappa_x \gamma(x)\|x - y\|^p + \kappa_c (1 - \gamma(x))\|I(x) - I(y)\|^s\right)^q, \qquad (12)$$

where γ(x) is given by

$$\gamma(x) = \frac{1}{1 + e^{\beta_\gamma \left(\|x - y\| - \tau_\gamma \|I(x) - I(y)\|\right)}}, \qquad (13)$$

where βγ and τγ are parameters of the balance term that must be empirically estimated. The balance term is defined as γ : Ω ⊂ R² → [0, 1]. The main idea behind the balance term [9] is that if the spatial term is larger than the photometric term (γ(x) ≈ 0), the distance will be measured using mainly ||I(x) − I(y)||. On the other hand, if the photometric term is larger than the spatial term (γ(x) ≈ 1), the distance will be computed mainly using the spatial term. If γ(x) = 0.5, we recover the distance stated in Eq. (11). In Fig. 4 we show an example of the computation of the balance term γ(x) for a color reference image. In Fig. 4 (a), we present a color reference image. In (b), we show the balance term color coded: in yellow we present values of γ(x) > 0.5 and in orange values of γ(x) < 0.5. Comparing the color image in (a) and γ(x) in (b), we observe that larger values of γ are located in regions where shadows are present, where mainly the spatial term is used to compute distances. Both terms are used in other regions, such as the sky or the road.

Fig. 4. Example of the balance term γ(x) computed for an urban color reference image. (a) Color reference image. (b) Color-coded balance map γ(x): larger values of γ are coded in yellow, and orange represents γ = 0.46.
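A small sketch of the adaptable metric of Eqs. (12)-(13) is given below, computing dxy from the pixel coordinates and the color reference image. All parameter values (κx, κc, p, s, q, βγ, τγ) are placeholder defaults to be estimated, for example by PSO, and are not the trained values from the paper.

```python
import numpy as np

def balance_term(x, y, I, beta_g=1.0, tau_g=1.0):
    """gamma(x) in Eq. (13): a sigmoid that weighs the spatial term
    against the photometric term of the metric."""
    spatial = np.linalg.norm(np.asarray(x, float) - np.asarray(y, float))
    photometric = np.linalg.norm(I[x] - I[y])
    return 1.0 / (1.0 + np.exp(beta_g * (spatial - tau_g * photometric)))

def adaptive_metric(x, y, I, kx=1.0, kc=1.0, p=1.0, s=1.0, q=1.0,
                    beta_g=1.0, tau_g=1.0):
    """d_xy in Eq. (12): a balanced combination of a spatial and a
    photometric distance, raised to the power q."""
    g = balance_term(x, y, I, beta_g, tau_g)
    spatial = np.linalg.norm(np.asarray(x, float) - np.asarray(y, float)) ** p
    photometric = np.linalg.norm(I[x] - I[y]) ** s
    return (kx * g * spatial + kc * (1.0 - g) * photometric) ** q

# Usage on a random "color image": I is an H x W x 3 array in [0, 1].
I = np.random.rand(64, 64, 3)
print(adaptive_metric((10, 12), (11, 15), I))
```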

5 Training the Model

We estimated the parameters of our model using the KITTI dataset [17]. We used five color reference images, their corresponding depth maps, and their ground truth. We estimated the best parameters that minimize the MSE (mean squared error) and MAE (mean absolute error), defined by

$$MSE(\mu_j) = \frac{1}{N_P}\sum_{i=1}^{N_P} \left(out(\mu_j, i) - gt(i)\right)^2, \qquad (14)$$

and

$$MAE(\mu_j) = \frac{1}{N_P}\sum_{i=1}^{N_P} \left|out(\mu_j, i) - gt(i)\right|, \qquad (15)$$

where gt is the provided ground truth, out is the completed depth map, μj is the parameter vector used to obtain out(μj), and NP is the number of points where we compare the estimated depth with the ground truth. We minimized the fitness

$$J(\mu_j) = MSE(\mu_j) + MAE(\mu_j). \qquad (16)$$


Fig. 5. The training set is used to estimate the model’s parameters. The training set is extracted from the KITTI depth completion suite. (a), (b) and (c) color reference images. (e), (f), and (g) corresponding depth map.

We use the Particle Swarm Optimization (PSO) algorithm to minimize the fitness. We created 40 random individuals μi and, following the velocity and acceleration dynamics of each individual, evolved them for 30 iterations. In each iteration, we selected the individual that presented the best performance. In Fig. 5 we show the training set extracted from the KITTI dataset.
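A compact sketch of this training loop is shown below: a standard global-best PSO over the parameter vector μ, minimizing the fitness J = MSE + MAE of Eqs. (14)-(16). The `complete_depth` function stands in for the whole pipeline and is hypothetical; the swarm size (40) and iteration count (30) follow the text, while the inertia and acceleration coefficients are common default values, not the authors'.

```python
import numpy as np

def fitness(mu, complete_depth, samples):
    """J(mu) = MSE(mu) + MAE(mu) over all ground-truth pixels (Eqs. 14-16)."""
    errors_sq, errors_abs, n = 0.0, 0.0, 0
    for rgb, sparse, gt in samples:
        out = complete_depth(rgb, sparse, mu)       # hypothetical pipeline call
        valid = gt > 0                              # pixels with ground truth
        diff = out[valid] - gt[valid]
        errors_sq += np.sum(diff ** 2)
        errors_abs += np.sum(np.abs(diff))
        n += valid.sum()
    return errors_sq / n + errors_abs / n

def pso(objective, lower, upper, n_particles=40, n_iters=30,
        w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimal global-best PSO over a box-constrained parameter vector."""
    rng = np.random.default_rng(seed)
    dim = len(lower)
    pos = rng.uniform(lower, upper, size=(n_particles, dim))
    vel = np.zeros_like(pos)
    pbest, pbest_val = pos.copy(), np.array([objective(p) for p in pos])
    gbest = pbest[np.argmin(pbest_val)].copy()
    for _ in range(n_iters):
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = np.clip(pos + vel, lower, upper)
        vals = np.array([objective(p) for p in pos])
        improved = vals < pbest_val
        pbest[improved], pbest_val[improved] = pos[improved], vals[improved]
        gbest = pbest[np.argmin(pbest_val)].copy()
    return gbest, pbest_val.min()
```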


Fig. 6. Performance evolution of 40 individuals in 30 iterations.


In Fig. 6, we show the evolution of the 40 individuals during training, plotting the MSE + MAE of each individual. The final performance reached by the proposal was a fitness of 1.3122 (MSE + MAE).

6 Dataset and Experiments

6.1 Dataset

We use the KITTI depth completion suite [17] to train and to evaluate our model. The KITTI dataset contains 1000 color images, their depth maps, and the corresponding ground truth. We showed an example of the KITTI dataset above in Fig. 5.

6.2 Experiments

We used three images to train our models and 997 images to test our model. We evaluate our model and compare our obtained performance with many models already presented in the literature. We also performed an ablation test to determine the most critical component of our model.

7 Results

We evaluated our model on the 997 test images of the KITTI dataset using two metrics: the mean squared error (MSE) and the mean absolute error (MAE). The results of our evaluation are shown in Table 1. As Table 1 shows, our model performs better than our previous versions of the model and other models developed around the same time, such as [11] and [23]. However, some more complex models that use neural networks to learn morphological operations [3] perform even better than our model.

Table 1. Results obtained by our model and comparison with other models.

Model                     | MSE    | MAE    | MAE + MSE
Our proposal              | 1.1252 | 0.3045 | 1.4297
CANDAR [11]               | 1.1397 | 0.3132 | 1.4529
PDC [23]                  | 1.2866 | 0.2932 | 1.5798
Deep Fusion Networks [22] | 1.2067 | 0.4299 | 1.6366
Morpho Networks [3]       | 1.0455 | 0.3105 | 1.3560


Table 2. Results of the ablation test of our proposal.

Stage                   | MSE    | MAE    | MSE + MAE
Prefiltering            | 1.1260 | 0.3124 | 1.4384
1st Convolutional Stage | 1.6700 | 0.2956 | 1.9656
2nd Convolutional Stage | 1.1256 | 0.3055 | 1.4311
Complete proposal       | 1.1252 | 0.3045 | 1.4297

7.1 Ablation Test

We conducted a series of experiments to see how each stage of our proposal contributes to its overall performance. We removed each stage individually and measured the impact on the performance. The results of these experiments are shown in Table 2. The table shows that the first convolutional stage is the most crucial stage, followed by the prefiltering of the depth, while the second convolutional stage has the most negligible impact on the performance. According to Table 2, the most critical part of our model is the convolutional stage SC1. This convolutional stage selects features that enforce the edges of the color reference image, improving the model's performance.

In Fig. 7, we show examples of completed depth maps for images extracted from the KITTI dataset. In Fig. 7 (a), we show the reference color image, where we observe a person riding a bike. In (b), we show the sparse depth map; the amount of available data is 4.51% of the size of the image. In (c), we show the ground truth acquired by LiDAR. Finally, in (d), we show our completed depth map. We observe that we recovered much information about the scene: the person riding the bike, a traffic sign, a bike parked on the street, and a small car. As a qualitative evaluation, we compare the ground truth depth map in Fig. 7 (c) with the completed depth map in Fig. 7 (d) in the regions where the available data is located. These regions have a similar color code in both maps.


Fig. 7. Examples of completed depth maps. (a) reference color image. (b) Available sparse depth map. (c) ground truth. (d) Completed depth map.


This fact means that the completed depth map is a good representation of the actual depth map in these regions.


Fig. 8. Examples of completed depth maps where our proposal fails. (a) reference color image. (b) Available sparse depth map. (c) ground truth. (d) Completed depth map.

In Fig. 8 we present an example where our proposal fails due to transparent objects such as car windows. We observe in (d) that the edges of the window present errors in the completion.

8 Conclusions

We have proposed a model to complete sparse depth maps based on the infinity Laplacian embedded in a convolutional pipeline. The pipeline considers two convolutional stages and the infinity Laplacian to complete depth maps. As a prefiltering stage, we implemented a morphological operator to eliminate outliers from the available depth data. We embedded the depth data in a manifold given a metric in the image domain. We proposed a variable metric that uses a balance mechanism depending on the spatial term and the photometric term. The inclusion of these two components improves the performance of our model, outperforming our previous version. The proposal also outperforms contemporaneous depth completion models and performs similarly to more complex neural network models. In the performed ablation test, we discovered that the most critical stage is the first convolutional stage, which enforces the edges of the reference color image and helps in the correct diffusion of the available data. In future work, we will use more contemporaneous interpolation models.

References

1. Almansa, A., Cao, F., Gousseau, Y., Rougé, B.: Interpolation of digital elevation models using AMLE and related methods. IEEE Trans. Geosci. Rem. Sens. 40(2), 314–325 (2002)
2. Caselles, V., Igual, L., Sander, O.: An axiomatic approach to scalar data interpolation on surfaces. Numer. Math. 102(3), 383–411 (2006)
3. Dimitrievski, M.D., Veelaert, P., Philips, W.: Learning morphological operators for depth completion. In: Advanced Concepts for Intelligent Vision Systems Conference. IEEE (2018)
4. Eldesokey, A., Felsberg, M., Holmquist, K., Persson, M.: Uncertainty-aware CNNs for depth completion: uncertainty from beginning to end. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, June 2020
5. Ennaji, H., Quéau, Y., Elmoataz, A.: Tug of war games and PDEs on graphs with applications in image and high dimensional data processing. Sci. Rep. 13(1), 2045–2322 (2023)
6. Imran, S., Long, Y., Liu, X., Morris, D.: Depth coefficients for depth completion. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 12438–12447. IEEE (2019)
7. Krauss, B., Schroeder, G., Gustke, M., Hussein, A.: Deterministic guided LiDAR depth map completion. In: IEEE Intelligent Vehicles Symposium (IV), pp. 824–831. IEEE (2021). https://doi.org/10.1109/IV48863.2021.9575867
8. Lai, W.S., Huang, J.B., Ahuja, N., Yang, M.H.: Fast and accurate image super-resolution with deep Laplacian pyramid network. Trans. Pattern Anal. Mach. Intell. 41(11), 2599–2613 (2019)
9. Lazcano, V.: Some problems in depth enhanced video processing. Ph.D. thesis, Universitat Pompeu Fabra, Barcelona, Spain (2016). http://hdl.handle.net/10803/373917
10. Lazcano, V., Arias, P., Facciolo, G., Caselles, V.: A gradient based neighborhood filter for disparity interpolation. In: 2012 19th IEEE International Conference on Image Processing, pp. 873–876. IEEE (2012). https://doi.org/10.1109/ICIP.2012.6466999
11. Lazcano, V., Calderero, F.: Hybrid model convolutional stage-positive definite metric operator-infinity Laplacian applied to depth completion. In: 2022 Tenth International Symposium on Computing and Networking Workshops (CANDARW), pp. 174–179. IEEE (2022). https://doi.org/10.1109/CANDARW57323.2022.00085
12. Lazcano, V., Calderero, F.: Hybrid pipeline infinity Laplacian plus convolutional stage applied to depth completion. In: Smys, S., Tavares, J.M.R.S., Balas, V.E. (eds.) Computational Vision and Bio-Inspired Computing. AISC, vol. 1420, pp. 119–134. Springer, Singapore (2022). https://doi.org/10.1007/978-981-16-9573-5_8
13. Lazcano, V., Calderero, F., Ballester, C.: Depth image completion using anisotropic operators. In: Abraham, A., et al. (eds.) SoCPaR 2020. AISC, vol. 1383, pp. 593–604. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-73689-7_57
14. Lazcano, V., Calderero, F., Ballester, C.: Comparing different metrics on an anisotropic depth completion model. Int. J. Hybrid Intell. Syst. 1, 87–99 (2021)
15. Li, Y., Huang, J.B., Ahuja, N., Yang, M.H.: Joint image filtering with depth convolutional networks. Trans. Pattern Anal. Mach. Intell. 41(8), 1909–1923 (2019). https://doi.org/10.1109/TPAMI.2018.2890623
16. Lin, Y., Cheng, T., Zhong, Q., Zhou, W., Yang, H.: Dynamic spatial propagation network for depth completion. In: AAAI 2022 Conference on Artificial Intelligence. AAAI (2022)
17. Liu, L., Liao, Y., Wang, Y., Geiger, A., Liu, Y.: Learning steering kernels for guided depth completion. IEEE Trans. Image Process. 30, 2850–2861 (2021)
18. Lu, K., Barnes, N., Anwar, S., Zheng, L.: From depth what can you see? Depth completion via auxiliary image reconstruction. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, June 2020
19. Manfredi, J., Oberman, A., Svirodov, A.: Nonlinear elliptic partial differential equations and p-harmonic functions on graphs. Differ. Integ. Eq. 28(12), 79–102 (2012)
20. Park, J., Kim, H., Yu-Wing, T., Brown, M., Kweon, I.: High quality depth map upsampling for 3D-TOF cameras. In: 2011 International Conference on Computer Vision, pp. 1623–1630. IEEE (2011). https://doi.org/10.1109/ICCV.2011.6126423
21. Raad, L., Oliver, M., Ballester, C., Haro, G., Meinhardt, E.: On anisotropic optical flow inpainting algorithms. IPOL Image Process. On Line 10, 78–104 (2020)
22. Shivakumar, S., Nguyen, T., Miller, I., Chen, S., Kumar, V., Taylor, C.: DFuseNet: deep fusion of RGB and sparse depth information for image guided dense depth completion. In: 2019 IEEE Intelligent Transportation Systems Conference (ITSC), pp. 13–20. IEEE (2019). https://doi.org/10.1109/ITSC.2019.8917294
23. Teutscher, D., Mangat, P., Wassermueller, O.: PDC: piecewise depth completion utilizing superpixels. In: IEEE International Intelligent Transportation Systems Conference (ITSC), pp. 2752–2758. IEEE, Indianapolis, USA (2021)
24. Tomasi, C., Manduchi, R.: Bilateral filter for gray and color images. In: Sixth International Conference on Computer Vision, pp. 839–846. IEEE (1998)
25. Uhrig, J., Schneider, N., Schneider, L., Franke, U., Brox, T., Geiger, A.: Sparsity invariant CNNs. In: International Conference on 3D Vision (3DV), pp. 11–20 (2017)

Stable Character Recognition Strategy Using Ventilation Manipulation in ERP-Based Brain-Computer Interface

Dogeun Park1, Young-Gi Ju1, and Dong-Ok Won1,2(B)

1 Dept. of Artificial Intelligence Convergence, Hallym University, Chuncheon, Republic of Korea
{B22008,D20504,dongok.won}@hallym.ac.kr
2 College of Medicine, Hallym University, Chuncheon, Republic of Korea

Abstract. Brain-computer interface (BCI) technology is a system that uses brain signals to assist in controlling devices in the outside world. Among the many methods of implementing a BCI, one of the most representative is the event-related potential (ERP)-based BCI. However, an ERP-based speller averages data over multiple trials to enhance performance and simultaneously requires continuous visual attention from subjects. In particular, long experimental sessions can cause visual and cognitive fatigue in users, which can degrade the performance of the ERP-based BCI. Sighing is a natural phenomenon and one way to stabilize physiological mechanisms within a short period of time. This experiment assumed that a proposed two-minute ventilation pattern using cyclic sighing could change the user's physiological attention state. The second assumption was that the proposed method may improve character recognition accuracy. To assess the feasibility of the proposed method, a total of five subjects participated in the experiment, and each participant completed two sessions. A counter-balancing design was used to prevent carry-over effects. The results indicate that the proposed short-term ventilation pattern outperformed the normal ventilation pattern in terms of character recognition in the ERP-based BCI speller task, supporting our assumptions. In light of this, we suggest that the proposed method utilizing cyclic sighing could be used in long-term BCI protocols with respect to stability and robustness.

Keywords: Brain-computer interface · Event-related potential · Character recognition · Ventilation manipulation · Cyclic sighing · Attention

1 Introduction

Many people around the world are restricted in body movement and communication due to fatal cranial nerve diseases, cerebrovascular diseases, and amyotrophic lateral sclerosis (ALS).


A brain-computer interface (BCI) system uses brain signals to assist in controlling devices in the outside world [1–3]. Among the methods for implementing BCI systems, the most widely used applications are the mu-rhythm, the steady-state visually evoked potential (SSVEP), and the event-related potential (ERP) (e.g., [2,4–6]). The P300 is one of the most studied and symbolic components of ERPs, and it is widely used in BCI research [7]. The representative paradigm was developed by Farwell and Donchin and consists of the 6 × 6 matrix row-column paradigm (RCP) [6]. Moreover, the single character paradigm (SCP) [8], the checkerboard paradigm (CBP) [9], random set presentation (RSP) [10], and famous faces [11] have been proposed to improve the performance of ERP-based BCI speller paradigms.

The biggest advantage of the ERP-based BCI is that it is a representative endogenous attention-based task that does not require intensive user training. In addition, signal averaging over multiple trials allows subjects to achieve very high speller accuracy, and the system can be calibrated in a short time. However, this paradigm has a noticeable disadvantage. The ERP-based speller task presents a repetitive stimulus of about 10 sequences per character to improve brain signal classification performance. Since this requires continuous visual attention from the user, a high level of cognitive and visual fatigue is inevitable [3,7,8]. Collecting multiple trials over a long period to improve performance is contrary to the original purpose of performance enhancement, because it can be burdensome for users and could degrade character recognition rates due to fatigue.

One study addressing this problem showed a significant performance improvement in the P300-based BCI paradigm with a 6-minute meditative mindfulness induction (MMI) [12]. Mindfulness meditation is widely used for the purpose of improving awareness of the current state and involves breathing. However, mindfulness-based meditation requires an extended training process to be effective, so its effect is generally not acute [13,14]. Moreover, one study provides evidence that practicing attention to internal body sensations, one of the main features of meditation, is not associated with enhanced cardiac interoceptive awareness [15].

Humans spontaneously sigh a few times per hour, and rodents sigh dozens of times per hour [16]. Sighing is an evolutionary behavior among mammals that gradually decreases to 20 times per hour within a year of birth [17]. Additionally, the frequency of sighing is sensitive to cognitive, emotional, and behavioral demands. In healthy individuals, stressful situations induce more frequent sighing compared to emotionally neutral conditions [18]. Likewise, sighing is a very natural phenomenon and, at the same time, a very important respiratory pattern. During normal breathing, sighs improve gas exchange and inflate collapsed alveoli [16]. This ventilation pattern is a natural mechanism that reduces physiological arousal and stabilizes the subject's condition. A comparative study of four different breathing patterns, including mindfulness meditation and cyclic sighing, showed that cyclic sighing was more effective than mindfulness meditation at reducing sympathetic activity.


Fig. 1. The components of cyclic sighing (two inhalations and one exhalation) and physiological mechanisms.

In addition, the physiological and psychological effects of cyclic sighing appear to be not only acute but also to persist over time [14,19,20]. Based on the findings of these previous studies, we first assumed that a short-term respiratory pattern using cyclic sighing can change the physiological attention state. Our second assumption was that the proposed ventilation method could enhance character recognition performance. Furthermore, sighing for more than five minutes could make subjects excessively relaxed and drowsy. Therefore, this study investigated the change in character recognition performance after two minutes of cyclic sighing during an ongoing experimental paradigm in a P300-based BCI system. To test the feasibility of the proposed method, we organized two experimental sessions with two different conditions. A total of five subjects participated in this preliminary study. All data were recorded using multi-channel electroencephalogram (EEG) equipment and analyzed with regularized linear discriminant analysis with shrinkage [21].

2 Methods

2.1 Experimental Paradigm

a. Visual Stimuli Setting. In this experiment, the event-related potential (ERP)-based 6 × 6 matrix row-column paradigm (RCP) [6] was set up using MATLAB (MathWorks, Natick, MA, USA) with Psychtoolbox (http://psychtoolbox.org). The stimulus-time interval (STI) was set to 80 ms and the inter-stimulus interval (ISI) to 135 ms.


Fig. 2. Experimental Design (Session A: In order of normal condition and proposed condition, Session B: In order of proposed condition and normal condition).

b. Ventilation Pattern. To verify our hypothesis, we investigated the effect of cyclic sighing on the ERP-based speller paradigm, as shown in Fig. 1. Cyclic sighing consists of two inhalations through the nasal cavity and one exhalation through the oral cavity, and participants were asked to exhale about two times longer than they inhaled. First, each participant was requested to take a double inhalation, with maximal inhalation on the first breath. The second inhalation was performed to maximize the amount of oxygen by fully inflating the alveoli [14]. However, long-term cyclic sighing could reduce physiological arousal too much and cause drowsiness. Therefore, we applied a two-minute ventilation pattern using cyclic sighing for the proposed condition.

c. Experimental Design. As shown in Fig. 2, the experiment consists of Session A and Session B, and each subject participated in both sessions. Session A was carried out in the order of normal condition followed by the proposed condition. After an average of 15 min of rest, Session B was carried out in the opposite order to minimize the carry-over effect. The data set was divided into a training set and a test set; the training set consisted of twenty characters (HALLYMUNIVERSITYAIML) and the test set of twenty-two characters (BRAINCOMPUTERINTERFACE). In the normal condition of each session, subjects were not asked to do anything in particular, in order to induce normal breathing for two minutes. In the proposed condition, each subject was asked to perform cyclic sighing for two minutes.

2.2 Data Acquisition

Five subjects participated in the experimental paradigm. No subject had a history of cardiovascular, cerebrovascular, or respiratory disease.


Fig. 3. 31 channels of EEG data were collected and 22 channels of EEG data were selected in this experiment.

Participants performed the experiment at a distance of about 1.2 m from the monitor and at a room temperature of about 24 °C. An actiCHamp amplifier (Brain Products, Germany) was used. As shown in Fig. 3 (A), 31 channels of data were recorded at 1000 Hz according to the international 10–20 system. This study was approved by the Institutional Review Board of Hallym University [HIRD-2022-056].

2.3 Data Analysis

a. Preprocessing. We used MATLAB with the BBCI toolbox (http://bbci.de/toolbox) for data analysis. All electroencephalogram (EEG) data were down-sampled to 100 Hz for analysis. In this study, we selected only 22 electrode channels (Fz, FC5, FC1, FC2, FC6, T7, C3, Cz, C4, T8, CP5, CP1, CP2, CP6, P7, P3, Pz, P4, P8, O1, Oz, O2), shown in Fig. 3 (B), and band-pass filtered the data at 0.1–30 Hz with a Chebyshev filter for offline analysis. Additionally, we applied the common average reference (CAR) to remove artifacts.

b. Epoching and Classification. After the EEG data were pre-processed, we used Python for the rest of the processing. All training and test data were segmented from -200 to 1000 ms relative to stimulus onset, and the pre-stimulus interval from -200 to 0 ms was used for baseline correction. We applied regularized linear discriminant analysis with shrinkage [21] to the training and test data for the offline analysis.
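To make the offline analysis step concrete, the following is a minimal Python sketch of the epoching, baseline correction, and shrinkage-LDA classification described above. It is an illustration under assumed array shapes and variable names, not the authors' code; the actual pipeline pre-processes the data in MATLAB with the BBCI toolbox first.

```python
# Minimal illustration of the epoching and classification step described above.
# Assumes the EEG has already been down-sampled to 100 Hz, band-pass filtered,
# and re-referenced (CAR); array shapes and names are illustrative only.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

FS = 100                  # sampling rate after down-sampling (Hz)
PRE, POST = 0.2, 1.0      # epoch window: -200 ms to +1000 ms around each flash

def epoch(eeg, onsets):
    """Cut epochs of shape (n_events, n_channels, n_samples) around stimulus
    onsets (given in samples) and subtract the mean of the -200 to 0 ms
    pre-stimulus interval from each epoch (baseline correction)."""
    n_pre, n_post = int(PRE * FS), int(POST * FS)
    epochs = np.stack([eeg[:, o - n_pre:o + n_post] for o in onsets])
    baseline = epochs[:, :, :n_pre].mean(axis=2, keepdims=True)
    return epochs - baseline

def fit_erp_classifier(eeg, onsets, labels):
    """Train a regularized (shrinkage) LDA on flattened channel-by-time features;
    labels are 1 for target flashes and 0 for non-target flashes."""
    X = epoch(eeg, onsets).reshape(len(onsets), -1)
    clf = LinearDiscriminantAnalysis(solver="lsqr", shrinkage="auto")
    clf.fit(X, labels)
    return clf
```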


Fig. 4. Comparison of character recognition accuracy by condition ((A): Mean comparison between two conditions, (B): Mean comparison between two conditions by sessions).

3 Results

3.1 Average Character Recognition Accuracy Across All Subjects

Figure 4 shows the character recognition accuracy over sequences for the five subjects. In particular, Fig. 4 (A) compares the proposed-condition average and the normal-condition average over all sessions. It shows an improvement in performance for all sequences except sequences 7–9. Fig. 4 (B) shows the condition average over all subjects by session. In this figure, the proposed condition in Session A outperforms the normal condition in Session A on sequences 1–4, while it underperforms on sequences 5–10. On the other hand, the proposed condition in Session B outperforms the normal condition in Session B in all sequences.

3.2 Individual Subject Character Recognition and ERPs Patterns

For further analysis, we examined the character recognition rates of individual subjects. Figure 5 shows the comparison of individual subject performance in Session A, and Fig. 6 shows the comparison in Session B. In Fig. 5, the individual subjects' performance in the proposed condition does not differ greatly from the normal condition overall. However, it is important to note that the overall performance is dominated by sequences 1–6. In addition, in Fig. 6, all subjects except subject 2 and subject 4 showed higher character recognition accuracy in all sequences than in the normal condition. In particular, Session B shows a rapid overall increase in accuracy for sequences with fewer trials.


To examine the ERPs further, we compared the P300 patterns of the first condition of Session A (the normal condition) and the first condition of Session B (the proposed condition) for Subject 5. Figure 7 (A) shows the first condition of Session A, and Fig. 7 (B) shows the first condition of Session B. In terms of amplitude and latency, the P300 pattern in (B) is larger than in (A) in Fig. 7, and most subjects showed similar patterns. This suggests that our proposed breathing pattern could change the ERP pattern.

Fig. 5. Comparison of Proposed and Normal Condition by subject in Session A.

4 Discussion and Conclusion

In this study, we investigated the effects of a two-minute ventilation pattern using cyclic sighing on ERP patterns and character recognition performance in an ERP-based BCI speller task. The results in Fig. 4 support our hypothesis. In Fig. 5 and Fig. 6, the analysis for individual subjects suggests that the proposed condition is generally better than the normal condition. However, the difference in performance between the proposed and normal conditions in Session A was not outstanding. This may be due to decreased attention caused by eye and cognitive fatigue [7,8]. Nevertheless, the similar level of performance between the two conditions indicates that the proposed method is robust and stable in the experimental environment. In addition, as shown in Fig. 4 (B), the proposed condition in both sessions rapidly improves performance on a small number of sequences (1–4).

Furthermore, we analyzed the ERP patterns to identify physiological changes.


Fig. 6. Comparison of Proposed and Normal Condition by subject in Session B.

As shown in Fig. 7, Subject 5's ERP in the proposed condition was larger than that in the normal condition in terms of amplitude and latency. Since the amplitude and latency of ERPs are used as features in the classifier, we believe this had a positive impact on classification performance.

Fig. 7. ERP patterns in the Cz and Pz regions of Subject 5 ((A): First Condition in Session A, (B): First Condition in Session B).


Interestingly, although Session B was conducted after Session A, the proposed condition in Session B performed better than both the normal condition and the proposed condition in Session A, and it was meaningfully better over the entire range of sequences 1 to 10. In addition, subject 4 was vulnerable to eye fatigue due to dry eye syndrome after the first task in Session A. Figure 8 shows a further analysis excluding subject 4, which shows stable, high character recognition performance in the proposed condition. One of the biggest challenges in the BCI field is BCI illiteracy [22]. Although researchers in various fields are making efforts to solve it, it remains a difficult problem. While subject selection is a crucial aspect of BCI research, our proposed method shows promise in achieving higher character recognition performance independent of subject selection. Although our work requires additional subjects and statistical testing, our results are very encouraging for future research on improving BCI-based character recognition performance.

Fig. 8. Comparison of character recognition accuracy by condition ((A): Mean comparison excluding subject 4 between two conditions, (B): Mean comparison excluding subject 4 between two conditions by sessions).

Acknowledgements. This work was supported by Institute of Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (the Ministry of Science and ICT (MSIT)) (No. 2017-0-00451, Development of BCI based Brain and Cognitive Computing Technology for Recognizing User's Intentions using Deep Learning) and partly supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. 2022R1A5A8019303) and the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (RS-202300250246).


References

1. Won, D.-O., Hwang, H.-J., Kim, D.-M., Müller, K.-R., Lee, S.-W.: Motion-based rapid serial visual presentation for gaze-independent brain-computer interfaces. IEEE Trans. Neural Syst. Rehabil. Eng. 26(2), 334–343 (2018)
2. Hwang, H.-J., Kwon, K., Im, C.-H.: Neurofeedback-based motor imagery training for brain-computer interface (BCI). J. Neurosci. Methods 179(1), 150–156 (2009)
3. Wolpaw, J.R., Birbaumer, N., McFarland, D.J., Pfurtscheller, G., Vaughan, T.M.: Brain-computer interfaces for communication and control. Clin. Neurophysiol. 113(6), 767–791 (2002)
4. Blankertz, B., Dornhege, G., Krauledat, M., Müller, K.-R., Curio, G.: The non-invasive Berlin Brain-Computer Interface: fast acquisition of effective performance in untrained subjects. Neuroimage 37(2), 539–550 (2007)
5. Müller-Putz, G.R., Scherer, R., Brauneis, C., Pfurtscheller, G.: Steady-state visual evoked potential (SSVEP)-based communication: impact of harmonic frequency components. J. Neural Eng. 2(4), 123 (2005)
6. Farwell, L.A., Donchin, E.: Talking off the top of your head: toward a mental prosthesis utilizing event-related brain potentials. Electroencephalogr. Clin. Neurophysiol. 70(6), 510–523 (1988)
7. Abiri, R., Borhani, S., Sellers, E.W., Jiang, Y., Zhao, X.: A comprehensive review of EEG-based brain-computer interface paradigms. J. Neural Eng. 16(1), 011001 (2019)
8. Fazel-Rezai, R., Allison, B., Guger, C., Sellers, E., Kleih, S., Kübler, A.: P300 brain computer interface: current challenges and emerging trends. Front. Neuroeng. 5, 1–11 (2012)
9. Townsend, G., et al.: A novel P300-based brain-computer interface stimulus presentation paradigm: moving beyond rows and columns. Clin. Neurophysiol. 121(7), 1109–1120 (2010)
10. Yeom, S.-K., Fazli, S., Müller, K.-R., Lee, S.-W.: An efficient ERP-based brain-computer interface using random set presentation and face familiarity. PLoS ONE 9(11), e111157 (2014)
11. Kaufmann, T., Schulz, S.M., Grünzinger, C., Kübler, A.: Flashing characters with famous faces improves ERP-based brain-computer interface performance. J. Neural Eng. 8(5), 056016 (2011)
12. Lakey, C.E., Berry, D.R., Sellers, E.W.: Manipulating attention via mindfulness induction improves P300-based brain-computer interface performance. J. Neural Eng. 8(2), 025019 (2011)
13. Wielgosz, J., Schuyler, B.S., Lutz, A., Davidson, R.J.: Long-term mindfulness training is associated with reliable differences in resting respiration rate. Sci. Rep. 6(1), 27533 (2016)
14. Balban, M.Y., et al.: Brief structured respiration practices enhance mood and reduce physiological arousal. Cell Rep. Med. 4(1), 100895 (2023)
15. Khalsa, S.S., Rudrauf, D., Hassanpour, M.S., Davidson, R.J., Tranel, D.: The practice of meditation is not associated with improved interoceptive awareness of the heartbeat. Psychophysiology 57(2), e13479 (2019)
16. Li, P., et al.: The peptidergic control circuit for sighing. Nature 530(7590), 293–297 (2016)
17. Li, P., Yackle, K.: Sighing. Curr. Biol. 27(3), R88–R89 (2017)
18. Guyon, A.J., et al.: Respiratory variability, sighing, anxiety, and breathing symptoms in low- and high-anxious music students before and after performing. Front. Psychol. 11, 303 (2020)


19. Severs, L., Vlemincx, E., Ramirez, J.-M.: The psychophysiology of the sigh: I: the sigh from the physiological perspective. Biol. Psychol. 170, 108313 (2022)
20. Vlemincx, E., Severs, L., Ramirez, J.-M.: The psychophysiology of the sigh: II: the sigh from the psychological perspective. Biol. Psychol. 6, 108386 (2022)
21. Blankertz, B., Lemm, S., Treder, M., Haufe, S., Müller, K.-R.: Single-trial analysis and classification of ERP components - a tutorial. Neuroimage 56(2), 814–825 (2011)
22. Lee, M.-H., et al.: EEG dataset and OpenBMI toolbox for three BCI paradigms: an investigation into BCI illiteracy. GigaScience 8(5), 01 (2019)

Exploring CycleGAN for Bias Reduction in Gender Classification: Generative Modelling for Diversifying Data Augmentation

Dimitri Hooftman1(B), S. Sahand Mohammadi Ziabari1, and Joop Snijder2

1 Faculty of Science, Mathematics and Computer Science, University of Amsterdam, 1098 XH Amsterdam, The Netherlands
[email protected], [email protected]
2 Info Support, 3905 TG Veenendaal, The Netherlands
[email protected]

Abstract. Deep learning algorithms have become more prevalent in real-world applications. With these developments, bias is observed in the predictions made by these algorithms. One reason for this is that the algorithms capture bias existing in the data set being used. This paper investigates the influence of using generative adversarial networks (GANs) as a gender-to-gender data pre-processing step on the bias and accuracy measured for a VGG-16 gender classification model. A cyclic generative adversarial network (CycleGAN) is trained on the Adience data set to perform the gender-to-gender data augmentation. This architecture allows for an unpaired domain mapping and results in two generators that double the number of training images by generating a male image for every female image and vice versa. The VGG-16 gender classification model is trained on this data, and its accuracy indicates its performance. In addition, the model's fairness is calculated using demographic parity and equalized odds to indicate its bias. The evaluation of the results provided by the proposed methodology shows that the accuracy decreases when CycleGAN pre-processing is applied. The bias also decreases, especially when measured on an imbalanced data set. However, the decrease in bias is not large enough to change our evaluation of the model from unfair to fair, showing the proposed methodology to be effective but insufficient for removing bias from the data set.

Keywords: Generative modelling · Generative adversarial networks · CycleGAN · Data augmentation · Bias · Gender classification

1 Introduction

The predictive power of deep learning algorithms has led them to be widely used in real-world applications [6]. A prominent example of a widely used machine-learning technique is the supervised learning method, which can, for instance, provide a gender classifier of images by learning a mapping between an input image and a corresponding predicted class [12].


An unintended lack of diversity in the input is a problem observed when using images as data to train these algorithms [16]. For instance, this lack of diversity could be introduced by a selection bias in the compilation process of a data set [21]. Using such data sets can introduce over-fitting in the downstream training process and lead to sub-optimal behavior in the real world, where complexity and diversity are common. Examples of this sub-optimal behavior, induced by learned bias existing in the training data, have been observed for different types of training data, such as images and word embeddings [2, 23]. For instance, a model trained on a biased training data set shows that the activity of cooking is over 33 percent more likely to involve females than males [25].

Methods like over-sampling and prevailing data augmentation techniques such as rotations, flips, and rescales have been studied as ways to increase diversity in data sets used for downstream tasks [23]. In addition, GANs have been used as a data augmentation method to improve generalizability in computerized tomography (CT) segmentation tasks [18]. However, the impact of using cyclic GANs to increase data set diversity and thereby reduce bias in downstream tasks needs to be explored more. This thesis project aims to quantify the impact of this approach on reducing bias in the downstream task of gender classification.

In addition to reducing bias, maintaining or increasing the performance of a model trained on a downstream task is also of interest. Numerous studies have shown that including pre-processing steps using adversarial networks increases performance when using deep neural networks downstream [17]. Using a GAN is described as one of the most promising modeling techniques for data augmentation [24]. This paper hypothesizes that augmenting the training data into different domains will reduce bias while retaining potential model performance on a downstream task. The reduced bias is hypothesized to be caused by the balance in training examples of the different domains, for example, males and females. This hypothesis will be tested by performing measurements on both a balanced and an unbalanced training data set. The retained performance is hypothesized to be caused by the increased amount of training data. A downside of the proposed data augmentation could be decreased performance on downstream tasks. To make this transparent, the predictive performance on downstream tasks is also evaluated. As data augmentation is a pre-processing technique used to increase downstream performance, the diversification of the data set should ideally not decrease downstream performance. The difference in performance will be tested against baseline measurements.

Obtaining this quantified knowledge could benefit both research areas mentioned before. For bias reduction, it could open up a new avenue of research into methods like configurable data augmentation using a GAN to generate a diverse data set based on a less diverse one. For the pre-processing field, this research will test whether using a GAN as a data augmentation method increases the performance on downstream tasks [19]. To make the mentioned knowledge objectives more concrete, the following research question is proposed: How does pre-processing using GANs as a gender-to-gender data augmentation step influence the accuracy and bias of a VGG16 Convolutional Neural Network performing a gender classification task?

To answer the research question, it is essential to gather measurements of the impact of the data augmentation method on the performance and bias of the downstream task of gender classification.


These measurements are gathered using a well-known VGG-16 model architecture adjusted to perform binary classification. On top of this, it is important to create a data augmentation method that can perform the gender-to-gender generation. This paper is organized as follows: Sect. 2 covers the literature review and related work. Section 3 explains the methodology that has been used. The results of this paper are presented in Sect. 5, following the evaluation in Sect. 4. Finally, the paper is concluded in Sect. 6.

2 Related Work

The research gap described in the previous section indicates a goal of mitigating bias without reducing downstream performance. To quantitatively assess the bias reduction, the research question mentions both the VGG-16 gender classification model and the image-to-image translating CycleGAN model. These areas of research are explored in this section.

2.1 Gender Classification

Deep neural networks have demonstrated excellent performance in recognizing the gender of human faces [14]. Eidinger et al. used a standard linear SVM trained on Local Binary Pattern (LBP) and Four-Patch LBP (FPLBP) features extracted from the Adience data set. They showed 77 percent accuracy when training on the near-frontal faces [5]. They added, however, that their tests leave room for future work, as a drop in performance is observed when using the Adience data set as a benchmark. As the Adience data set is publicly available, it is considered for this paper. Hassner et al. report 79.3 percent accuracy when using the same features but adjusting the Adience data set by a frontalization process, which detects facial features and rotates them to create a frontal face [9]. This accuracy was improved upon by Levi and Hassner using a deep convolutional neural network for gender classification [15]. They used a network architecture comprising three convolutional layers and two fully connected layers with a small number of neurons. They used all rotations of the original images in the Adience data set to show the performance of the network architecture itself, rather than improved performance from pre-processing. This approach achieved 86.8 percent accuracy. Dehghan et al. improved on this by training on a larger amount of data, though it is unclear how this data set was aggregated [3]. In addition, it is unclear how the images' labels were provided; they state that a team of human annotators was used through a semi-supervised procedure. For pre-processing the images, they used horizontal flips and random crop augmentation. They applied a specific but undisclosed deep network architecture for gender classification on the Adience benchmark and obtained an accuracy of 91 percent. Lapuschkin et al. considered the influence of model initialization with weights pre-trained on a real-world data set. Their results were reported using a VGG-16 model pre-trained on the well-known ImageNet and IMDB-WIKI data sets and fine-tuned on the Adience data set using both face alignment techniques simultaneously [14]. They stated that three major factors contribute to performance improvements on the gender classification task: (1) changes in architecture, (2) prior knowledge via pre-training, and (3) optional data set preparation via alignment pre-processing.


The VGG-16 model consists of 13 convolutional layers with small kernel sizes followed by two fully connected layers. Using the VGG-16 model, an accuracy of 92.6 percent is attained with a weight initialization based on ImageNet. Improved results on the gender classification task have also been published; these, however, use the improved ResNet model architecture to increase accuracy [13]. As the difference in performance and bias is to be measured based on pre-processing techniques, the model architecture is kept constant in both measurements, which is why the performance and bias of models other than VGG-16 on the Adience data set are not considered further in this paper.

2.2 CycleGAN

Image-to-image translation using a cycle-consistent adversarial network was introduced by Zhu et al. [26]. This method makes it possible to perform image translation without paired training data, as obtaining such data can be difficult and expensive [26]. They applied the method to various applications, including collection style transfer, object transfiguration, season transfer, and photo enhancement [26]. The cycle-consistent adversarial network is an architecture built from two GANs. These networks are optimized using a loss function that, first, includes an adversarial loss allowing the model to learn the domain mapping. Second, it includes a cycle consistency loss that ensures the source and generated results are related; this additional loss drives the generators to produce a translation instead of an arbitrary output in the target domain. Almahairi et al. build on this idea by introducing an augmented CycleGAN, which can perform many-to-many mappings [1]. They capture variations in the generated domain by learning stochastic mappings that infer information about the source which is not captured in the generated result. Qualitative results show the effectiveness of the many-to-many mapping approach in generating multiple females for a given male and vice versa, indicating the viability of successfully generating male-to-female and female-to-male images [1].

Using a CycleGAN architecture as a data augmentation method for pre-processing images has already been shown to be adequate for the task of CT segmentation [18]. The authors use the GANs to render a non-contrast version of training images based on the original contrast CT image and observe that segmentation performance significantly improves when additional synthetic images are used for training. Hammami et al. use a CycleGAN as an unsupervised method that generates images of different modalities in a similar domain; these images are used to train a downstream model performing multi-organ detection, which is shown to improve the intended task significantly [8].

2.3 Bias Reduction

Mehrabi et al. describe that, like people, algorithms are vulnerable to biases that render their decisions "unfair" [16]. Fairness is defined as the "absence of prejudice or favoritism towards an individual or group based on their inherent or acquired characteristics" [16]. Research has been carried out into the reduction of bias in data sets. Wu et al. describe that recent studies found substantial disparities in the accuracy of classifying the gender of dark-skinned females [23]. In their research, Wu et al. describe the usage of pre-processing to balance the skin-type composition of a data set.


Using the ImageDataGenerator, familiar data augmentation techniques like horizontal flips and re-scaling can be performed. By using the ImageDataGenerator, they increased the percentage of dark-skinned males from 1.3 percent to 15.21 percent and dark-skinned females from 2.5 percent to 16.03 percent. An example is given of facial recognition software in digital cameras that over-predicts Asians as blinking. These biased predictions are said to stem from hidden or neglected biases in data or algorithms [16].
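For reference, the kind of prevailing augmentation pipeline referred to above can be set up in a few lines with Keras' ImageDataGenerator; the directory layout and parameter values in this sketch are assumptions for illustration, not taken from [23].

```python
# Illustrative sketch of the prevailing augmentation approach (flips, re-scaling)
# mentioned above; directory layout and parameter values are assumptions.
from tensorflow.keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(
    rescale=1.0 / 255,        # re-scale pixel values to [0, 1]
    horizontal_flip=True,     # random horizontal flips
)

train_iter = datagen.flow_from_directory(
    "data/train",             # hypothetical folder with one sub-folder per class
    target_size=(224, 224),
    batch_size=32,
    class_mode="binary",
)
```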

3 Methodology

The methodology provided in this section is followed to answer the research question stated in Sect. 1. Figure 1 shows a broad overview of the applied methodology. Both the CycleGAN pipeline and the gender classification pipeline use the Adience benchmark data set as their input. It is used to both train and evaluate the CycleGAN and VGG-16 CNN architectures.

Fig. 1. Overview of methodology.

To further detail these parts of the methodology, this section first describes the data set being used. After this, the CycleGAN pipeline is explained. Next, the gender classification pipeline is shown, which incorporates this data set and the CycleGAN pipeline results. Finally, the evaluation is described so that it is clear what is measured.

3.1 Adience Data Set

The data set used in this classification pipeline is the Adience data set published in 2014 [5]. It contains photos of faces with binary gender labels and has been used in similar classification pipelines [5, 14]. The data set contains faces at different angles, under different lighting, and of different sharpness. A fundamental design principle of this data set is to be as true as possible to challenging real-world conditions; as such, it presents all the variations in appearance, noise, pose, lighting, and more that can be expected of images taken without careful preparation or posing [5]. The images were collected with the face detector described by Viola and Jones [22] from images gathered from Flickr albums. All images were manually labeled for gender using both the images themselves and any available contextual information [5]. From the Adience data set, 19,370 images from 2,284 unique individuals have been used. Both the coarse-aligned and landmark-aligned versions provided by the data set have been used, bringing the total to 38,740 images, of which 16,240 have a male gender label and 22,500 a female gender label. For further usage in downstream tasks, the data set is split into a training/validation set and a test set based on the task for which the model is trained. The coarse and landmark images were used because Lapuschkin et al. stated that all models benefit most from combining the coarse-aligned and the landmark-aligned data sets for training [14]. Finally, the data in the Adience data set are divided into five folds. These folds were created to distribute individuals evenly across subsets and prevent over-fitting on a fold, since the Adience data set contains multiple images of the same individual (a fold-split sketch is given below).
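The individual-aware fold split can be illustrated with a short sketch. The column names and file name below are hypothetical, and GroupKFold is used only to show the grouping constraint (all images of one individual stay in the same fold), not the authors' exact fold construction.

```python
# Minimal sketch of an individual-aware five-fold split, assuming a table with
# one row per image and the (hypothetical) columns image_path, gender, subject_id.
import pandas as pd
from sklearn.model_selection import GroupKFold

df = pd.read_csv("adience_index.csv")   # illustrative file name
gkf = GroupKFold(n_splits=5)
splits = gkf.split(df["image_path"], df["gender"], groups=df["subject_id"])
for fold, (train_idx, val_idx) in enumerate(splits):
    train_df, val_df = df.iloc[train_idx], df.iloc[val_idx]
    print(f"fold {fold}: {len(train_df)} train images, {len(val_df)} validation images")
```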


3.2 CycleGAN Pipeline

The GAN gender-to-gender data augmentation method uses the CycleGAN architecture described in [26], trained on the Adience data set. This architecture uses a GAN to learn two mappings. The first mapping is a generator function G that takes an image in domain X and generates an image that is indistinguishable from domain Y. This is done by jointly optimizing the generator G and the discriminator Dy, which learns to label images in domain Y as real or fake based on the generated images from G and real images from domain Y. The image resulting from the mapping G: X -> Y is then used as input for the second mapping, a generator function F that takes the image in domain Y and generates an image that is indistinguishable from domain X, again by optimizing a generator F and a discriminator Dx, essentially learning the inverse of G, denoted F: Y -> X. The CycleGAN architecture thus results in two generators and two discriminators. Figure 2 shows the architecture of CycleGAN.

Fig. 2. CycleGAN architecture as shown by [26], illustrating the cyclic relation between the two generators, which translate between domains X and Y.

For this specific application, generator G generates a female image from the image of a male, and generator F generates a male image from the image of a female. Discriminator Dy determines whether the female image generated by generator G is real or fake, and discriminator Dx determines whether the male image generated by generator F is real or fake. These generators and discriminators are trained simultaneously, as in common GAN architectures.

Generator. The generator model used for this CycleGAN architecture is similar to a residual neural network. The down-sampling before the residual blocks is done through a 2D convolution layer, followed by an instance normalization layer and a ReLU activation layer. After the residual blocks, the up-sampling is done through a 2D transposed convolution layer, again followed by an instance normalization and ReLU activation layer.


The residual blocks use a familiar layer configuration starting with reflection padding, followed by a 2D convolution, instance normalization, and a ReLU activation layer. These layers are repeated twice, but instead of a second activation layer, the input of the residual block is added to its output. The complete overview is shown in Table 1 (a); a minimal sketch of such a residual block is given after Table 1.

Discriminator. The discriminator model used for this CycleGAN architecture uses the layer configuration shown in Table 1 (b). The down-sampling is again done through a 2D convolution layer with a vertical and horizontal stride of 2, followed by an instance normalization layer, a Leaky ReLU activation, and a further 2D convolution layer.

Both the architecture of the generator and the architecture of the discriminator could potentially be improved by optimizing the layers described in Table 1 (a, b). Such an optimization would, however, require comparing generated results beyond the current qualitative approach. The objective of this paper is not to find an improved CycleGAN architecture; the CycleGAN is only used to generate domain translations so that the impact of using them in the training data set can be examined. For this reason, an improved CycleGAN architecture is not considered further.

Table 1. (a) Layers of the generator in the CycleGAN architecture. (b) Layers of the discriminator in the CycleGAN architecture.
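To make the layer descriptions above concrete, the following is a minimal PyTorch sketch of one residual block and one discriminator down-sampling block. The framework and the channel counts are assumptions chosen to match typical CycleGAN implementations, not values reported in the paper.

```python
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Residual block: reflection padding, 3x3 convolution, instance norm, ReLU,
    repeated twice, with the block's input added to its output (skip connection)."""
    def __init__(self, channels: int = 256):   # channel count is an assumption
        super().__init__()
        self.body = nn.Sequential(
            nn.ReflectionPad2d(1),
            nn.Conv2d(channels, channels, kernel_size=3),
            nn.InstanceNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.ReflectionPad2d(1),
            nn.Conv2d(channels, channels, kernel_size=3),
            nn.InstanceNorm2d(channels),
        )

    def forward(self, x):
        return x + self.body(x)   # input added instead of a second activation

def discriminator_block(in_ch: int, out_ch: int) -> nn.Sequential:
    """One down-sampling step of the discriminator: strided 2D convolution,
    instance normalization, and Leaky ReLU activation."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=4, stride=2, padding=1),
        nn.InstanceNorm2d(out_ch),
        nn.LeakyReLU(0.2, inplace=True),
    )
```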

Loss Function. The total loss of this CycleGAN architecture, given in Eq. (1), comprises the loss functions of both aforementioned GANs, which use domains X and Y as input; these networks adopt the architecture described by [11]. Optimizing this loss function alone still has the potential to create realistic but unrelated images. This happens because the discriminator loss measures how well a generated image fits the other domain, but it says nothing about how related the image is to the original. This requires a third component in the loss function that penalizes a difference between the input x and the output of F(G(x)), and vice versa.


The loss of a generator is calculated as the mean squared error between the discriminator's continuous evaluation of the generated images and a target vector of the same shape in which all generated images are labeled as real; this essentially measures how real the discriminator perceives the generated images to be and pushes the generator towards a "real" prediction for every input. The loss of a discriminator is calculated by first taking the mean squared error between its evaluation of real images and a target vector in which all real images are labeled as real, and second the mean squared error between its evaluation of fake images and a target vector in which all fake images are labeled as fake. This loss measures how well the discriminator identifies fake images as fake and real images as real, converging towards all correct predictions.

L(G, F, D_X, D_Y) = L_GAN(G, D_Y, X, Y) + L_GAN(F, D_X, Y, X) + λ L_CYC(G, F) + λ L_CYC(F, G) + L_IDEN(G, F)    (1)

Finally, the cycle loss is calculated as the mean squared error between a real image and the corresponding image generated back into the same domain, converging towards related images. Using this loss function, the two generators and two discriminators in the CycleGAN architecture are trained by performing a forward pass and back-propagating the calculated gradients through the trainable variables. Optimization uses an Adam optimizer, and the model weights are initialized from a random normal distribution, following the implementation described by [26]. For both the male and female domains, 90% of the images are used as training data; the remaining 10% remain available for inspection. The model is trained on the training data for 150 epochs. After training this architecture with the three-part loss function in Eq. (1), the generators can be used at inference time to produce a translated image for a corresponding source image.

Pre-processing. Before training the CycleGAN architecture, the input images go through a pre-processing stage. The default implementation of the CycleGAN architecture provided by Zhu et al. performs a random crop and a random flip, followed by normalization of the images. In the default implementation, the random crop uses a nearest-neighbor resize. The resize method was changed to bilinear because nearest-neighbor resizing would over-emphasize the black corners in the coarse images, resulting in increasingly dark images generated by the CycleGAN generators.

Learning Rate. After 100 epochs of training, the learning rate of both discriminators was adjusted from 2e-4 to 5e-4. This adjustment was made because the discriminators seemed to underperform in classifying images as fake in a specific domain. The tolerant discriminators led to generators that would not consistently provide a gender-to-gender mapping but instead learned to map to the same gender, as this obtains the highest cycle consistency. Since such generators do not produce the intended gender mapping, the discriminators' learning rate was increased while optimizing the GANs.
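The loss computation described above can be rendered as the following illustrative PyTorch sketch, based on the description in the text and on Eq. (1). The framework and the cycle-loss weight LAMBDA are assumptions; the paper does not report them.

```python
# Illustrative rendering of the least-squares adversarial, cycle, and identity
# terms described above (Eq. (1)); framework and LAMBDA value are assumptions.
import torch
import torch.nn as nn

mse = nn.MSELoss()
LAMBDA = 10.0   # cycle-consistency weight (assumed value)

def generator_loss(G, F, D_X, D_Y, real_x, real_y):
    fake_y, fake_x = G(real_x), F(real_y)
    pred_fake_y, pred_fake_x = D_Y(fake_y), D_X(fake_x)
    # adversarial terms: the generators try to make the discriminators output "real" (1)
    adv = mse(pred_fake_y, torch.ones_like(pred_fake_y)) + \
          mse(pred_fake_x, torch.ones_like(pred_fake_x))
    # cycle consistency: translating to the other domain and back should recover the input
    cyc = mse(F(fake_y), real_x) + mse(G(fake_x), real_y)
    # identity term: an image already in the target domain should be left (nearly) unchanged
    iden = mse(G(real_y), real_y) + mse(F(real_x), real_x)
    return adv + LAMBDA * cyc + iden

def discriminator_loss(D, real, fake):
    # real images should be scored as real (1), generated images as fake (0)
    pred_real, pred_fake = D(real), D(fake.detach())
    return mse(pred_real, torch.ones_like(pred_real)) + \
           mse(pred_fake, torch.zeros_like(pred_fake))
```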


3.3 Gender Classification Pipeline

To investigate the impact of pre-processing using the CycleGAN model, a gender classification pipeline is created for four configurations. The first serves as a benchmark with pre-processing as in [14]. The second compares the benchmark with CycleGAN pre-processing in the training stage. The last two serve as a comparison when there is an imbalance in the training data; essentially, the same two configurations are run for a different data set. The gender classification of facial images is done using the VGG-16 convolutional neural network (CNN) architecture described by Simonyan and Zisserman [20]. The VGG-16 model is initialized with the ImageNet pre-trained weights. Because the model has to perform binary classification of males and females, the fully connected top layers of the VGG-16 model are removed, and the remaining weights are frozen so they do not change while training the model. On top of this VGG-16 base, additional layers are added to provide the single output in the final layer.

To train the VGG-16 model, the Adience data set is split into training and test data. This split is done for all folds, so the unique individuals are distributed evenly. The training data is pre-processed by performing a random crop, a random flip, and normalization. Depending on the pipeline configuration, a data-augmenting step replaces this pre-processing stage: the generators from the CycleGAN are used to randomly provide a translated image, for which the label is changed accordingly. Multiple epochs are performed while training the model, so this approach provides both the original and the translated version of a source image to the model during training. The training data is split into folds and used for training models validated on a hold-out fold. Performing training iterations for all hold-out folds results in learning curves indicating at what epoch training should stop to prevent over-fitting. The training of the VGG-16 CNN uses the objective of increased performance on the gender classification task with well-known optimization methods described in [7]. After determining the optimal number of epochs through k-fold cross-validation, the final VGG-16 model is trained on all training folds and used to evaluate the test data.
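A minimal sketch of such a VGG-16-based binary classifier is given below, using torchvision for illustration; the paper does not specify the framework or the exact layers added on top of the frozen base, so the head shown here is an assumption.

```python
# Illustrative VGG-16 binary gender classifier: ImageNet-initialized base with
# frozen convolutional weights and a small, assumed binary head on top.
import torch.nn as nn
from torchvision import models

def build_gender_classifier() -> nn.Module:
    model = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1)
    for param in model.features.parameters():
        param.requires_grad = False           # freeze the convolutional base
    # replace the fully connected top layers with a small binary head (sizes assumed)
    model.classifier = nn.Sequential(
        nn.Linear(512 * 7 * 7, 256),
        nn.ReLU(inplace=True),
        nn.Dropout(0.5),
        nn.Linear(256, 1),                    # single logit; train with BCEWithLogitsLoss
    )
    return model
```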

4 Evaluation

After training the VGG-16 model for the different pre-processing configurations using the training and validation data, the test data is classified. These classifications are evaluated to compare the performance and the bias resulting from the different pre-processing techniques. To measure the performance difference, model performance is measured by accuracy, calculated as the fraction of correct decisions. The predictions on the test set are also used to evaluate the bias of the models resulting from the different pre-processing configurations. The bias is evaluated by calculating the demographic parity and the equalized odds. Demographic parity measures the balance of positive (and negative) predictions by evaluating predictive equality and equality of opportunity between groups [4]. Equalized odds measures the balance of classification errors, such as false positive and false negative rates, between groups [4].
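The two fairness measures can be computed directly from the test predictions. The short sketch below follows the definitions used in this paper (largest minus smallest group-level rate); it is an illustration, and libraries such as fairlearn expose equivalent metrics.

```python
import numpy as np

def demographic_parity_difference(y_pred, sensitive):
    """Difference between the largest and smallest group-level selection rate.
    y_pred: binary predictions (0/1); sensitive: group membership per sample."""
    rates = [y_pred[sensitive == g].mean() for g in np.unique(sensitive)]
    return max(rates) - min(rates)

def equalized_odds_difference(y_true, y_pred, sensitive):
    """Largest gap across groups in either the true positive or the false positive rate."""
    tpr, fpr = [], []
    for g in np.unique(sensitive):
        yt, yp = y_true[sensitive == g], y_pred[sensitive == g]
        tpr.append(yp[yt == 1].mean())   # true positive rate within group g
        fpr.append(yp[yt == 0].mean())   # false positive rate within group g
    return max(max(tpr) - min(tpr), max(fpr) - min(fpr))
```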


To summarize the methodology provided in this section, Fig. 3 shows a graphical overview of the data sets, models, decisions, and actions taken to provide the results used to answer the research question posed in Sect. 1.

5 Results

This section investigates the impact of using a CycleGAN architecture as a data-augmenting pre-processing method. First, the training of the CycleGAN and its resulting generators is discussed. After this, the different configurations of the gender classification pipeline are evaluated so the bias and performance can be reported.

Fig. 3. Detailed overview of data sets, models, decisions, and actions.

5.1 CycleGAN

To demonstrate the CycleGAN's effectiveness, several domain translations are shown in Fig. 4. None of the networks' loss functions seems to converge towards zero, indicating that the networks keep learning in each new epoch. Because of this equilibrium between generator and discriminator loss, the generators are expected to keep improving at generating fake domain mappings. In addition, the male-to-female and female-to-male GANs seem to learn at a similar pace.

An observation made by empirically inspecting the results of the CycleGAN generators is that images of infants are not translated as noticeably as images of other age categories. This behavior is shown in Fig. 5 (b). The result is notable because infants are not under-represented in the data: as shown in Fig. 5 (a), the number of labels below ten years old is almost the largest group in the Adience data set.


The creators of the Adience data set report that ages between zero and two are the second largest group in the data set, behind ages between 25 and 32 [5].

Fig. 4. Source and corresponding generated image provided by the CycleGAN generators.

Under-representation in the data is thus not an obvious explanation. An alternative explanation could be that images in the male and female domains are similar for this age range, so the generators do not learn as strong a mapping between the domains because the discriminators do not judge the results as fake as often.

5.2 Gender Classification

This section shows the evaluation of the balanced and imbalanced data configurations. Only the prevailing pre-processing techniques described in Sect. 3 are used for this configuration. The training data was split into a training and a validation set to determine around which epoch to stop the training process and use the model. The average validation accuracy, shown in Fig. 6, increases meaningfully until around epoch 60. The average is calculated from the accuracy measured for each fold; these individual results are shown in the background of Fig. 6.

Common Pre-processing. To evaluate the baseline for both the balanced and unbalanced data set configurations, the models are trained up to epoch 55 for the former configuration and epoch 63 for the latter. As the first configuration uses a balanced data set and the second an unbalanced data set that is not highly unbalanced, accuracy is measured as the fraction of correct decisions. These accuracy results are provided in Table 2 for both configurations. Both results are similar to the reference paper mentioned in Sect. 3; however, it is important to recognize that this benchmark was reported for an unbalanced data set. The demographic parity difference and equalized odds difference are also provided in Table 2. For both the balanced and unbalanced data sets, values above 0.8 are observed, which is quite far from 0. This indicates that demographic parity and equalized odds have not been achieved.

Based on the information in Fig. 6, the decision is made to train the final VGG-16 gender classification models for around 60 epochs. This allows the models to reach the expected accuracy without potentially over-fitting the test data. The results of this training process and the accuracy measured on the test data are shown in Fig. 7.

Exploring CycleGAN for Bias Reduction

(a)

37

(b)

Fig. 5. (a) Overview of the age labels provided in the Adience data set. (b) Example of a female source image of an infant (top row) with the corresponding generated male image below. Upon qualitative inspection, the generator does not seem to apply a gender-altering mapping.

As the model seems to stagnate with only small fluctuations around epoch 60, and does not show clear over-fitting just before or after epoch 60, the model with the highest accuracy between epoch 55 and epoch 65 is chosen. The results are shown in Fig. 7. To evaluate fairness, demographic parity is used to measure whether the probability of a particular prediction depends on sensitive group membership, as reported in Table 2. The results are reported as the difference between the largest and the smallest group-level selection rate across all values of the sensitive feature. A demographic parity difference of 0 means all groups have the same selection rate. In addition, an equalized odds difference of 0 means that all groups have the same true positive, true negative, false positive, and false negative rates.

Fig. 6. Accuracy of VGG-16 model at gender classification on validation set. The background of this plot shows the validation and training accuracy per fold.

CycleGAN Pre-processing. To compare the balanced and unbalanced data set configurations with CycleGAN pre-processing applied, the models are trained until epoch 60 for the balanced configuration and epoch 58 for the unbalanced configuration. The accuracy is again reported as the fraction of correct decisions in Table 2. Compared to the benchmark, the VGG-16 gender classification models do not show improved performance when trained on data augmented with a gender-to-gender transformation; instead, a decrease in performance is observed. The demographic parity and equalized odds differences are again calculated and provided in Table 2. A decrease is observed for the demographic parity difference, especially for the unbalanced data set configuration.


Fig. 7. Training process of the VGG-16 gender classification model evaluated on the test data set. (a) Model accuracy for the balanced and unbalanced training data set configurations. (b) Model accuracy when both the balanced and unbalanced training images were augmented using a CycleGAN; here the accuracy in the initial phase of training appears higher.

For the equalized odds difference, no decrease is observed for the balanced data set, but a decrease is observed for the imbalanced data set. This indicates that pre-processing with a gender-to-gender transformation does positively impact the demographic parity and the equalized odds, mainly when the data is imbalanced. It is, however, important to recognize that the demographic parity difference and equalized odds difference remain far from zero, indicating that demographic parity and equalized odds have again not been achieved.

Table 2. Results of the different configurations of the gender classification pipeline, with the amount of training data used, the pre-processing configuration, accuracy, demographic parity, and equalized odds.

6 Conclusion This paper examines the impact of using generative adversarial networks (GANs) as a gender-to-gender data pre-processing step on bias and accuracy in a VGG-16 gender classification model. A cyclic generative adversarial network (CycleGAN) is trained on the Adience dataset to augment the data by transforming images from one gender to another. The VGG-16 model’s accuracy and fairness are evaluated using demographic parity and equalized odds. The results show that while the bias decreases, the decrease is not significant enough to eliminate bias entirely, indicating that the proposed methodology is effective but insufficient for bias removal.


Knee Osteoarthritis Diagnostic System Based on 3D Multi-task Convolutional Neural Network: Data from the Osteoarthritis Initiative

Khin Wee Lai1(B), Pauline Shan Qing Yeoh1, Siew Li Goh2, Khairunnisa Hasikin1, and Xiang Wu3

1 Department of Biomedical Engineering, University Malaya, Kuala Lumpur, Malaysia

{lai.khinwee,khairunnisa}@um.edu.my

2 Sports Medicine Department, University Malaya, Kuala Lumpur, Malaysia

[email protected]

3 School of Medical Information and Engineering, Xuzhou Medical University, Xuzhou, China

[email protected]

Abstract. Since knee osteoarthritis involves complex three-dimensional (3D) structural changes, one of the most effective tools for diagnosing the disease is Magnetic Resonance Imaging (MRI). Due to the 3D nature of MRI scans, different 3D deep learning models for osteoarthritis diagnosis have been proposed, but they are mostly single-task models. This study aims to offer a computationally efficient approach for 3D medical data by leveraging multi-task learning, transfer learning, and depthwise separable convolutions. We propose a 3D multi-task model for knee osteoarthritis diagnosis that takes 3D MRI as input and provides segmentation masks and OA severity as outputs. We validated our proposed model with transfer learning and compared the results with single-task models. The model's performance was measured with different metrics depending on the task, such as balanced accuracy, Dice Similarity Coefficient, and F1 score. The model achieved balanced accuracies of 0.745 and 0.920 on the 3-class classification and 5-class segmentation tasks, respectively. The proposed multi-task model showed better classification performance than single-task classification models. The proposed model thus offers a computationally efficient approach that encourages the use of 3D medical data in healthcare applications. Keywords: Convolutional Neural Networks · Deep Learning · Knee Osteoarthritis · Multi-Task Learning · Transfer Learning

1 Introduction

Knee Osteoarthritis (OA) is a prevalent degenerative joint condition among the elderly that negatively impacts mobility and quality of life [1]. In current clinical practice, OA diagnosis primarily relies on radiographic confirmation; however, radiographs cannot clearly visualize the early structural changes of OA, which makes Magnetic Resonance Imaging (MRI) the better imaging tool for OA diagnosis [2].

Current clinical practice relies on manual inspection of medical images, which is tedious and prone to inter-rater variability, especially when dealing with a large number of patients. Deep learning, particularly convolutional neural networks (CNNs), has emerged as a powerful artificial intelligence technique for establishing fully automated computer-aided diagnosis that can process complex medical data automatically, removing the need for manual procedures [3]. In the field of OA diagnosis, most deep learning models are designed to perform a single task, mainly falling into two categories: segmentation and classification [4]. Most existing studies utilize 2D CNNs for either task [5, 6]. However, given that OA affects the whole knee joint [7], one major limitation of 2D CNNs is that they cannot take the whole 3D MRI volume into account. Recent studies have therefore proposed 3D CNNs [8], because 3D CNNs can fully utilize the spatial information of volumetric medical image data to extract more distinguishable representations, for segmentation [9, 10] or classification [11–13] of MRI images.

Multi-task learning has been successfully applied in deep learning to optimize multiple tasks simultaneously within a single neural network, and the superior performance of multi-task models over single-task models is well documented in recent studies [14, 15]. To effectively utilize the volumetric information of MRI, which is 3D by nature, we propose a 3D multi-task model for knee OA diagnosis that performs segmentation and classification simultaneously, which has not previously been explored in the literature. 3D CNNs are computationally expensive in terms of data size, model parameters, memory, and the computational resources required. In this work, we aim to offer a computationally efficient approach: we reduce the size and complexity of the model by implementing depthwise separable convolutions, incorporate transfer learning to enhance model performance, and, by using a multi-task model, share parameters in the lower layers so that the model is more parameter-efficient. This not only reduces the amount of computation compared to training the tasks independently; multi-task training and transfer learning also make better use of the limited available medical data. Together, these implementations can improve the efficiency of the 3D model and reduce the overall computational cost of analyzing and processing 3D medical data with a 3D CNN.

Our motivation is to develop a computationally efficient 3D model for knee OA diagnosis by combining three techniques: depthwise separable convolutions, transfer learning, and multi-task learning. The contributions of this work are as follows:

– Development of a single-stage end-to-end multi-task deep learning model for segmenting and classifying OA stages from knee MRI volumes. To the best of our knowledge, this is one of the first works that utilizes a 3D multi-task neural network to perform two different tasks simultaneously in a healthcare application, specifically knee OA detection.
– Validation of the proposed multi-task model against single-task segmentation and classification models.

– Investigation of the performance impact of integrating transfer learning into the deep learning models, specifically the proposed multi-task model.

2 Methodology

2.1 Data Acquisition and Pre-processing

400 3D sagittal Double Echo Steady State (DESS) knee MRI scans were obtained from the Osteoarthritis Initiative (OAI) public dataset, with the respective ground-truth segmentation masks obtained from the Zuse Institute Berlin (ZIB) [10]. All volumes were resized to 160 × 160 × 160 due to graphics processing unit restrictions. The segmentation masks consist of background (BG), femoral bone (FB), femoral cartilage (FC), tibial bone (TB), and tibial cartilage (TC), assigned class labels ranging from 0 to 4. A common OA grading scale, the Kellgren-Lawrence (KL) grade, was used to classify the MRI volumes into No OA, Early OA, and Severe OA classes: KL grade 0 was considered "No OA", KL grades 1 and 2 were categorised as "Early OA", and KL grades 3 and 4 were categorised as "Severe OA". The data were split into train, validation, and test sets with a ratio of 7:2:1. The dataset used in this study is described in Table 1.

Table 1. Description of the dataset used.

KL-Grade Score   Number of MRI volumes   3-Class Classification   Number of MRI volumes
0                87                      No OA                    87
1                48                      Early OA                 143
2                95
3                108                     Severe OA                170
4                62
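As an illustration of the grouping and the 7:2:1 split described above, here is a minimal Python sketch; the volume identifiers and the way KL grades are loaded are assumptions, since the OAI metadata format is not shown here.

```python
import numpy as np

# Hypothetical (volume_id, KL grade) pairs; in practice these come from the OAI metadata.
samples = [("vol_001", 0), ("vol_002", 2), ("vol_003", 4)]  # ...

def kl_to_class(kl):
    # KL 0 -> No OA, KL 1-2 -> Early OA, KL 3-4 -> Severe OA (as in Table 1)
    if kl == 0:
        return "No OA"
    return "Early OA" if kl <= 2 else "Severe OA"

labelled = [(vid, kl_to_class(kl)) for vid, kl in samples]

# 7:2:1 train/validation/test split
rng = np.random.default_rng(seed=0)
idx = rng.permutation(len(labelled))
n_train, n_val = int(0.7 * len(idx)), int(0.2 * len(idx))
train = [labelled[i] for i in idx[:n_train]]
val   = [labelled[i] for i in idx[n_train:n_train + n_val]]
test  = [labelled[i] for i in idx[n_train + n_val:]]
```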

2.2 Multi-task Neural Network Architecture

Figure 1 displays the proposed multi-task model architecture. The proposed multi-task model integrates the classification and segmentation tasks in a single-stage end-to-end network that adopts an encoder-decoder architecture based on U-Net. To integrate transfer learning into our model, we take 3D ResNet-18 [16] as the encoder. The choice of the 3D ResNet-18 feature extractor as the encoder is motivated by its superior ability in knee osteoarthritis diagnosis in prior work [13]. The multi-task model has a shared encoder for the two tasks, where the classification branch is extended from the bottleneck with the classifier.

To overcome the expensive computational cost of 3D CNNs, which involve large input volumes that require substantial computational power and memory, the segmentation branch is implemented as a decoder made up of 3D depthwise separable convolutional blocks [17]. Each depthwise separable convolutional block consists of two sets of convolutions, where each set is a 3 × 3 × 3 depthwise convolution followed by a 1 × 1 × 1 pointwise convolution. After each pointwise convolution there is a batch normalization layer and a ReLU activation layer. The classifier block includes an average pooling layer, three fully connected layers (with output sizes of 128, 32, and 3, respectively), and a SoftMax layer. A ReLU activation layer and a dropout layer with a rate of 0.5 are added after each of the first two fully connected layers.
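To make the block structure concrete, the following is a minimal PyTorch sketch of one 3D depthwise separable convolutional block as described above; the channel sizes and the exact composition used in the authors' decoder are assumptions.

```python
import torch
import torch.nn as nn


class DepthwiseSeparableConv3dBlock(nn.Module):
    """Two (3x3x3 depthwise -> 1x1x1 pointwise -> BatchNorm -> ReLU) sets."""

    def __init__(self, in_channels, out_channels):
        super().__init__()

        def ds_unit(cin, cout):
            return nn.Sequential(
                # depthwise: one 3x3x3 filter per input channel (groups=cin)
                nn.Conv3d(cin, cin, kernel_size=3, padding=1, groups=cin, bias=False),
                # pointwise: 1x1x1 convolution mixes channels
                nn.Conv3d(cin, cout, kernel_size=1, bias=False),
                nn.BatchNorm3d(cout),
                nn.ReLU(inplace=True),
            )

        self.block = nn.Sequential(ds_unit(in_channels, out_channels),
                                   ds_unit(out_channels, out_channels))

    def forward(self, x):
        return self.block(x)


# e.g. map a (1, 64, 20, 20, 20) feature volume to 32 channels:
# y = DepthwiseSeparableConv3dBlock(64, 32)(torch.randn(1, 64, 20, 20, 20))
```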

Fig. 1. Architecture of Proposed Multi-task Model.

2.3 Transfer Learning

Transfer learning has emerged as a promising tool in medical imaging applications, allowing knowledge learned from larger non-medical datasets to be transferred to the target medical application. To investigate the potential of transfer learning for interpreting MRI volumes, we implemented the proposed multi-task model in two different ways: (i) training the network from scratch, and (ii) using ImageNet [18] pretrained weights from the Torchvision package of the PyTorch library [19] as initial weights. Since only the feature extractor of the 3D ResNet-18 is utilized in this work, only the feature extractor layers are retained and the final classification layers are discarded; hence, only the pretrained weights of the retained layers are transferred and fine-tuned. Given that the available pretrained weights are 2D while the models implemented in this work require 3D weights, the 2D weights are replicated along the third dimension to match the dimensions required by the 3D model.

2.4 Training Specification

All models in this work are implemented using PyTorch on a workstation equipped with an Intel Xeon W-2225 central processing unit, an NVIDIA RTX A6000 graphics processing unit (GPU), and 32.0 GB of random access memory (RAM). Training used the Adam optimizer with a learning rate of 1e-4 and a batch size of 2.

The models were allowed to train for a maximum of 100 epochs with early stopping, where training stops when the model does not improve for 10 consecutive epochs. For the loss function, we combined different loss functions for the two tasks into a single multi-task loss so that both tasks are optimized simultaneously. Cross-entropy loss and Dice loss are used for the classification and segmentation tasks, respectively. The segmentation loss and the classification loss can be expressed as

$\mathrm{Segmentation\ Loss}(p, q) = 1 - \dfrac{2\sum_{i=1}^{N} p_i q_i}{\sum_{i=1}^{N} p_i^2 + \sum_{i=1}^{N} q_i^2}$   (1)

$\mathrm{Classification\ Loss}(p, q) = -\sum_{i=1}^{N} p_i \log(q_i)$   (2)

where $p_i$ is the ground-truth value, $q_i$ is the predicted probability for the $i$-th class, and $N$ is the total number of classes. The multi-task loss function used in this work is formulated as

$\mathrm{Multitask\ Loss} = \lambda \cdot \mathrm{Segmentation\ Loss} + (1 - \lambda) \cdot \mathrm{Classification\ Loss}$   (3)

where λ is the weight of the segmentation task in the loss function and provides a trade-off between the segmentation and classification tasks. Multiple experiments were conducted to find the optimal λ, with a search range from 0.1 to 0.9 in steps of 0.2, and λ was finally set to 0.7.

2.5 Evaluation

We evaluated the performance of the model with different metrics for the two tasks. Segmentation performance is evaluated with Balanced Accuracy (BA, defined as the average of per-class recall), the Dice Similarity Coefficient (DSC), and the Jaccard Similarity Coefficient (JSC). For the classification task, the metrics used are Balanced Accuracy, F1-score, and the Area Under the Receiver Operating Characteristic Curve (AUC). To evaluate the multi-class segmentation and classification performance quantitatively, the average of the metric scores was computed to indicate the overall performance of the models on the respective tasks.
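To make the training objective of Eqs. (1)–(3) concrete, here is a minimal PyTorch sketch of the combined loss with λ = 0.7; the tensor shapes, the softmax placement, and the optimizer call are assumptions rather than the authors' exact implementation.

```python
import torch
import torch.nn.functional as F

LAMBDA = 0.7           # weight of the segmentation task, Eq. (3)
NUM_SEG_CLASSES = 5    # BG, FB, FC, TB, TC

def dice_loss(probs, target_onehot, eps=1e-6):
    # probs, target_onehot: (B, C, D, H, W); Dice loss as in Eq. (1), averaged over classes
    dims = (0, 2, 3, 4)
    intersection = (probs * target_onehot).sum(dims)
    denom = (probs ** 2).sum(dims) + (target_onehot ** 2).sum(dims)
    return 1.0 - (2.0 * intersection / (denom + eps)).mean()

def multitask_loss(seg_logits, seg_target, cls_logits, cls_target):
    # seg_logits: (B, 5, D, H, W), seg_target: (B, D, H, W) with labels 0-4
    # cls_logits: (B, 3),          cls_target: (B,) with labels 0-2
    seg_probs = torch.softmax(seg_logits, dim=1)
    onehot = F.one_hot(seg_target, NUM_SEG_CLASSES).permute(0, 4, 1, 2, 3).float()
    seg_loss = dice_loss(seg_probs, onehot)             # Eq. (1)
    cls_loss = F.cross_entropy(cls_logits, cls_target)  # Eq. (2)
    return LAMBDA * seg_loss + (1 - LAMBDA) * cls_loss  # Eq. (3)

# optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)  # as stated in Sect. 2.4
```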

3 Results and Discussion

Table 2 summarizes the performance of the proposed multi-task model and the single-task models, with and without transfer learning. The segmentation baseline model is obtained by extracting the whole encoder-decoder network without the classifier, whereas the classification baseline model is obtained by extracting only the encoder path and the classifier. The proposed multi-task model reported superior classification performance to the classification baseline model while simultaneously performing the segmentation task. When all models are trained from scratch, the multi-task model (BA: 0.716) achieved a 117% (more than 2×) improvement in balanced accuracy over the classification baseline model (BA: 0.330).

Table 2. Comparison of Segmentation and Classification Performance between Proposed Multi-task Model and Single-task Models.

Model                       Classification                       Segmentation
                            Balanced Acc.  F1 score  AUC         Balanced Acc.  DSC    JSC
Segmentation Model (S)      –              –         –           0.939          0.931  0.879
Segmentation Model (TL)     –              –         –           0.939          0.930  0.877
Proposed Model (S)          0.716          0.710     0.825       0.928          0.924  0.866
Proposed Model (TL)         0.745          0.753     0.926       0.920          0.923  0.865
Classification Model (S)    0.330          0.283     0.530       –              –      –
Classification Model (TL)   0.727          0.723     0.767       –              –      –

Note: S = Trained from Scratch, TL = Trained with Transfer Learning

The proposed multi-task model displayed a small decrease of 1% and 2% in segmentation performance compared to the segmentation baseline models when the models are trained from scratch and with transfer learning, respectively. According to the results obtained, the segmentation performance of all models is generally excellent, reaching at least 0.920 in DSC and balanced accuracy. In summary, the multi-task models demonstrated improved classification performance while having minimal impact on segmentation performance. Table 3 presents the per-class classification performance of the proposed model trained from scratch and with transfer learning.

Table 3. Classification Performance per Class of the Proposed Multi-task Model.

            Trained from Scratch                 Trained with Transfer Learning
            No OA   Early OA   Severe OA         No OA   Early OA   Severe OA
Recall      0.800   0.563      0.786             0.700   0.750      0.786
F1 score    0.800   0.621      0.710             0.737   0.706      0.815
AUC         0.830   0.740      0.904             0.930   0.891      0.956

The effectiveness of transfer learning is demonstrated in the proposed multi-task model, which achieved a classification balanced accuracy of 0.745, surpassing the proposed model trained from scratch (BA: 0.716) by 4% and the classification baseline model with transfer learning (BA: 0.727) by 2%.

With transfer learning, the classification performance of the proposed model increases by 4%, with a slight drop of 1% in segmentation performance in terms of balanced accuracy. According to Table 3, transfer learning improves the ability to identify Early OA knees by 33% (Recall: 0.750), with a higher discriminatory ability of 0.891 AUC. The ability to detect Severe OA knees also increases with transfer learning, with the F1 score and AUC improving by 15% and 6%, respectively. However, there are no obvious differences in segmentation performance with and without transfer learning, as presented in Table 2. Transfer learning's ability to enhance classification performance is also reflected in faster convergence: trained from scratch, the proposed model takes 24 h 40 min to complete 78 training epochs, whereas with transfer learning it converges within 42 epochs, taking 13 h 15 min.

Generally, training deep learning models requires plenty of training data, which is challenging and computationally expensive, especially in medical applications. In this work, we leveraged transfer learning and multi-task learning to address this issue. The multi-task model allows more efficient use of the available data, which is helpful because the volume of knee MRI data in our dataset is limited. By sharing the same feature extractor (encoder), the multi-task model can extract more significant features related to OA severity and improve classification performance while training on a limited dataset.

We further compare our proposed model with several commonly used models on both segmentation and classification tasks; the results are presented in Table 4. When trained from scratch, the proposed model outperformed the other three classification models, 3D DenseNet-121 [20], 3D ResNeXt-50 [21], and 3D VGG-16 [22], in terms of balanced accuracy and F1-score, demonstrating the effectiveness of the multi-task learning technique. The superiority of transfer learning is demonstrated by Proposed Model (TL), which achieved the best classification scores among all compared models, indicating that our proposed model leverages information from the segmentation task to improve classification performance. However, in terms of segmentation performance, the proposed model did not surpass 3D U-Net [23] and 3D V-Net [24]. Based on our results, integrating transfer learning and early stopping into the proposed multi-task model achieves better classification results while preventing overfitting. Our multi-task model with transfer learning achieved an AUC of 0.926 and a DSC of 0.923 for the classification and segmentation tasks, respectively.

Using the computational setup described in the Methodology section, the proposed multi-task model has about 34.986 million parameters, requires 139.946 MB of memory, and takes 3.13 s of inference time to provide the segmentation mask and classification output for a knee MRI volume. Furthermore, the multi-task model can complete the two tasks in about half the total time of two single-task models. To address the complexity of 3D CNNs, depthwise separable convolutions are utilized to lower the computational cost by minimizing the number of parameters and computations while still fully utilizing the 3D volume information of the volumetric medical images.
These findings demonstrate a promising research direction for developing efficient 3D deep learning models that handle complex 3D medical data, particularly in real-world conditions where computational resources may be limited.

Table 4. Comparison of Segmentation and Classification Performance with Existing Models.

Model                Classification                       Segmentation
                     Balanced Acc.  F1 score  AUC         Balanced Acc.  DSC    JSC
Proposed Model (S)   0.716          0.710     0.825       0.928          0.924  0.866
Proposed Model (TL)  0.745          0.753     0.926       0.920          0.923  0.865
3D DenseNet-121      0.705          0.707     0.866       –              –      –
3D ResNeXt-50        0.591          0.577     0.735       –              –      –
3D VGG-16            0.310          0.300     0.596       –              –      –
3D U-Net             –              –         –           0.934          0.936  0.887
3D V-Net             –              –         –           0.949          0.921  0.863

Note: S = Trained from Scratch, TL = Trained with Transfer Learning

With this 3D multi-task model, new opportunities for advancing the diagnosis of life-threatening diseases can be further explored, such as automated heart condition detection that simultaneously provides heart segmentations from tagged MRI scans, which is crucial for comprehensive assessment of the heart [25]. In this paper, a single-stage end-to-end 3D multi-task approach that carries out two tasks simultaneously is proposed. The proposed multi-task model is an ongoing effort towards an early knee osteoarthritis detection system, and further research is needed to optimize its performance in detecting early incidence of the disease; early detection of OA is crucial for effective clinical interventions to halt disease progression and mitigate disability in later stages [26]. Moreover, previous studies either utilize the segmentation output to perform or refine the classification prediction, or use information extracted from the segmentation output to improve a target segmentation task on a specific region of interest. To further enhance our model's performance on early OA detection, the model could be extended by using the segmentation mask as both an output and an additional input. Since the encoder of the proposed model adopts 3D ResNet-18, it can easily be exchanged for other CNN backbones to improve the model or adapt it to other applications. Future work should also investigate the model's generalization ability and clinical relevance by testing on independent clinical datasets. Lastly, future comparisons should consider aspects of computational efficiency, such as model parameters and floating-point operations, to enable a comprehensive assessment of the practical applicability of different models.

4 Conclusion

In this study, we proposed a 3D fully convolutional neural network that effectively integrates multi-task learning and transfer learning for knee OA diagnosis. Although the model's segmentation performance does not show significant improvement, leveraging both techniques benefits the classification task, achieving an overall AUC of 0.926 and boosting the AUC score for Early OA detection to 0.891. While the results are promising, future work can explore further enhancing the model's performance across all aspects as well as improving its generalizability. These findings not only open new research directions on efficient 3D CNN computation for healthcare applications involving 3D medical data, but also potentiate the diagnosis of OA at an early stage using 3D MRI. Overall, the proposed multi-task model offers a computationally efficient approach applicable to domains where training data and computational resources are limited.

Acknowledgements. The research was funded by The Unveiling & Leading Project of Xuzhou Medical University under Grant No. JBGS202204 and The National Key Research and Development Program of China under Grant No. 2020YFC2006600. Data used in the preparation of this paper was obtained from the Osteoarthritis Initiative (OAI), whereas the annotated dataset was obtained from the Zuse Institute Berlin (OAI-ZIB).

References 1. Faisal, A., Ng, S.-C., Goh, S.-L., Lai, K.W.: Knee cartilage segmentation and thickness computation from ultrasound images. Med. Biol. Eng. Comput. 56, 657–669 (2018). https://doi. org/10.1007/s11517-017-1710-2 2. Yong, C.W., Lai, K.W., Murphy, B.P., Hum, Y.C.: Comparative study of encoder-decoderbased convolutional neural networks in cartilage delineation from knee magnetic resonance images. Curr. Med. Imaging 17, 981–987 (2021). https://doi.org/10.2174/157340561666620 1214122409 3. Anis, S., et al.: An overview of deep learning approaches in chest radiograph. IEEE Access 8, 182347–182354 (2020). https://doi.org/10.1109/ACCESS.2020.3028390 4. Yeoh, P.S.Q., et al.: Emergence of deep learning in knee osteoarthritis diagnosis. Comput. Intell. Neurosci. 2021 (2021). https://doi.org/10.1155/2021/4931437 5. Prasoon, A., Petersen, K., Igel, C., Lauze, F., Dam, E., Nielsen, M.: Deep feature learning for knee cartilage segmentation using a triplanar convolutional neural network. In: Sakuma, I., Barillot, C., Navab, N. (eds.) Medical Image Computing and Computer-Assisted Intervention– MICCAI 2013, LNCS, vol. 8150, pp. 246–253. Springer, Heidelberg (2013). https://doi.org/ 10.1007/978-3-642-40763-5_31 6. Pedoia, V., Lee, J., Norman, B., Link, T.M., Majumdar, S.: Diagnosing osteoarthritis from T2 maps using deep learning: an analysis of the entire osteoarthritis Initiative baseline cohort. Osteoarthritis Cartilage 27, 1002–1010 (2019). https://doi.org/10.1016/j.joca.2019.02.800 7. Teoh, Y.X., et al.: Discovering knee osteoarthritis imaging features for diagnosis and prognosis: review of manual imaging grading and machine learning approaches. J. Healthc. Eng. 2022 (2022). https://doi.org/10.1155/2022/4138666 8. Zhou, Z., Zhao, G., Kijowski, R., Liu, F.: Deep convolutional neural network for segmentation of knee joint anatomy. Magn. Reson. Med. 80, 2759–2770 (2018). https://doi.org/10.1002/ mrm.27229

9. Tack, A., Zachow, S.: Accurate automated volumetry of cartilage of the knee using convolutional neural networks: data from the osteoarthritis initiative. In: 2019 IEEE 16th International Symposium on Biomedical Imaging (ISBI 2019), pp. 40–43. IEEE, New York (2019). https:// doi.org/10.1109/ISBI.2019.8759201 10. Ambellan, F., Tack, A., Ehlke, M., Zachow, S.: Automated segmentation of knee bone and cartilage combining statistical shape knowledge and convolutional neural networks: data from the Osteoarthritis Initiative. Med. Image Anal. 52, 109–118 (2019). https://doi.org/10.1016/ j.media.2018.11.009 11. Tolpadi, A.A., Lee, J.J., Pedoia, V., Majumdar, S.: Deep learning predicts total knee replacement from magnetic resonance images. Sci. Rep. 10, 1–12 (2020). https://doi.org/10.1038/ s41598-020-63395-9 12. Martinez, A.M., et al.: Discovering knee osteoarthritis bone shape features using deep learning. Osteoarthritis Cartilage 27, S386–S387 (2019). https://doi.org/10.1016/j.joca.2019. 02.386 13. Yeoh, P.S.Q., Lai, K.W., Goh, S.L., Hasikin, K., Wu, X., Li, P.: Transfer learning assisted 3D deep learning models for knee osteoarthritis detection: data from the osteoarthritis initiative. Front. Bioeng. Biotechnol. 11 (2023). https://doi.org/10.3389/fbioe.2023.1164655 14. Amyar, A., Modzelewski, R., Li, H., Ruan, S.: Multi-task deep learning based CT imaging analysis for COVID-19 pneumonia: classification and segmentation. Comput. Biol. Med. 126, 104037 (2020). https://doi.org/10.1016/j.compbiomed.2020.104037 15. Liu, M., et al.: A multi-model deep convolutional neural network for automatic hippocampus segmentation and classification in Alzheimer’s disease. Neuroimage 208, 116459 (2020). https://doi.org/10.1016/j.neuroimage.2019.116459 16. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 770–778. IEEE, New York (2016). https://doi.org/10.1109/CVPR.2016.90 17. Sifre, L.: Rigid-motion scattering for image classification. Ecole Polytechniq. (2014) 18. Deng, J., Dong, W., Socher, R., Li, L.-J., Li, K., Fei-Fei, L.: Imagenet: a large-scale hierarchical image database. In: 2009 IEEE Conference on Computer Vision and Pattern Recognition, pp. 248–255. IEEE, New York (2009). https://doi.org/10.1109/CVPR.2009.5206848 19. Paszke, A., et al.: Pytorch: an imperative style, high-performance deep learning library. In: Advances in Neural Information Processing Systems 32 (NIPS 2019), pp. 8024–8035. NIPS, California (2019) 20. Huang, G., Liu, Z., Van Der Maaten, L., Weinberger, K.Q.: Densely connected convolutional networks. In: 30th IEEE Conference on Computer Vision and Pattern Recognition, pp. 2261– 2269. IEEE, New York (2017). https://doi.org/10.1109/CVPR.2017.243 21. Xie, S., Girshick, R., Dollár, P., Tu, Z., He, K.: Aggregated residual transformations for deep neural networks. In: 30th IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 5987–5995. IEEE, New York (2017). https://doi.org/10.1109/CVPR.2017.634 22. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014). https://doi.org/10.48550/arXiv.1409. 1556 23. Çiçek, Ö., Abdulkadir, A., Lienkamp, S.S., Brox, T., Ronneberger, O.: 3D U-Net: learning dense volumetric segmentation from sparse annotation. In: Ourselin, S., Joskowicz, L., Sabuncu, M., Unal, G., Wells, W. (eds.) 
Medical Image Computing and Computer-Assisted Intervention–MICCAI 2016, LNCS, vol. 9901, pp. 424–432. Springer, Cham (2016). https:// doi.org/10.1007/978-3-319-46723-8_49 24. Milletari, F., Navab, N., Ahmadi, S.-A.: V-net: fully convolutional neural networks for volumetric medical image segmentation. In: 2016 Fourth IEEE International Conference on 3D Vision (3DV), pp. 565–571. IEEE, New York (2016). https://doi.org/10.1109/3DV.2016.79

25. Jahanzad, Z., et al.: Regional assessment of LV wall in infarcted heart using tagged MRI and cardiac modelling. Phys. Med. Biol. 60, 4015 (2015). https://doi.org/10.1088/0031-9155/60/ 10/4015 26. Yong, C.W., et al.: Knee osteoarthritis severity classification with ordinal regression module. Multim. Tools Appl. 81, 41497–41509 (2021). https://doi.org/10.1007/s11042-021-10557-0

Multi-step Air Quality Index Forecasting Based on Parallel Multi-input Transformers

Jie Xie1,2(B), Jun Li1, Mingying Zhu3,4,5, and Qiong Wang1

1 School of Computer and Electronic Information and School of Artificial Intelligence, Nanjing Normal University, Nanjing 210046, People's Republic of China
[email protected]
2 Key Laboratory of Modern Acoustics, MOE, Nanjing University, Nanjing, China
3 School of Economics, Nanjing University, 22 Hankou Road, Nanjing 210093, Jiangsu, China
4 Johns Hopkins University, Hopkins, USA
5 Nanjing Center for Chinese and American Studies, Nanjing University, 162 Shanghai Road, Nanjing, Jiangsu, China

Abstract. Air quality index (AQI) forecasting is a hot research topic that has been widely explored by society. To better understand environmental quality, numerous methods have been proposed for investigating air pollutant data. Previous studies have used deep learning-based methods for AQI forecasting, but few studies investigate parallel multi-input deep models for multi-step hourly AQI forecasting. In this study, a novel parallel multi-input transformer architecture is proposed for multi-step hourly AQI forecasting. To model the air quality data, the transformer is selected and compared with four bidirectional long short-term memory (biLSTM)-based models. Moreover, parallel variable embedding is used to boost forecasting performance. Experimental results using air quality data collected in Shanghai show that our proposed method achieves superior performance against various competing methods.

Keywords: AQI forecasting · Multi-input Transformer · Multi-step forecasting

1 Introduction

With the continuous development of the industrial economy and urbanization, air pollution has become a serious problem that harms residents' physical and mental health [22]. Therefore, it is becoming ever more important to prevent and manage air pollution. As a standard measure of air pollution levels, the Air Quality Index (AQI) can be derived from the concentrations of PM2.5, PM10, O3, NO2, SO2, and CO [12]. To help combat air pollution, accurate AQI prediction is regarded as an achievable way to offer a reliable reference for citizens' outdoor activities and government policies [9].

Current AQI forecasting methods can be grouped into three categories: classical regression-based algorithms, machine learning regression-based algorithms, and deep learning algorithms [11, 24]. Widely used classical regression-based methods include multiple linear regression [4] and the autoregressive integrated moving average model [8]. Among machine learning regression-based algorithms, support vector regression [21], decision trees [5], random forest [6, 15], and K-nearest neighbors regression [15] have been used for AQI prediction. As one of the fastest-growing research topics, deep learning has shown impressive performance in various pattern recognition tasks. For AQI forecasting, different deep learning architectures have been proposed: convolutional neural networks (CNN), recurrent neural networks (RNN), long short-term memory networks (LSTM), gated recurrent units (GRU), encoder-decoder networks, and transformers. Although most recent studies have investigated deep learning algorithms for AQI forecasting, few works apply multiple variables in parallel for multi-step AQI forecasting, which might limit performance. In this study, we propose a parallel multi-input transformer neural network to forecast AQI. Specifically, different single-variable models are first evaluated to select the candidate architecture for the multi-variable models. Then, a correlation-based variable selection method is used to choose the variables used to construct the transformer-based models. Finally, a novel parallel multi-input transformer architecture is constructed to predict AQI. This paper is organized as follows. Section 2 describes related work. In Sect. 3, data preprocessing, variable selection, and the multi-input transformer architecture are described. Section 4 provides the datasets and experimental details. Section 5 analyzes the results. Finally, the conclusion and future work are given in Sect. 5.4.

2 Related Work

AQI forecasting has attracted much attention in recent years. Wu et al. [16] proposed a hybrid AQI forecasting model to enhance forecasting accuracy, combining variational mode decomposition (VMD), sample entropy (SE), and an LSTM neural network. Xu et al. [18] developed a novel hybrid model for multi-step daily AQI forecasting, where VMD and the least absolute shrinkage and selection operator (LASSO) were used to preprocess and reshape the input data; a stacked auto-encoder (SAE) was then used to reduce dimensionality and extract features, and a deep echo state network performed the forecasting. Zhan et al. [19] developed a dynamic decomposition framework by adding a time window based on empirical mode decomposition, ensemble empirical mode decomposition, and complementary ensemble empirical mode decomposition with adaptive noise (CEEMDAN); a decomposition-ensemble broad learning system was then proposed for AQI forecasting based on a broad learning system. Zhao et al. [22] developed a novel statistical learning framework integrating spatial autocorrelation variables, feature selection, and support vector regression (SVR) for AQI prediction, in which correlation analysis and time series analysis were used to extract spatial-temporal features.

Wu et al. [17] developed a new ensemble learning model for AQI forecasting, where CEEMDAN was introduced to decompose the nonlinear and non-stationary AQI history data series; fuzzy entropy (FE) was then selected as the feature indicator to recombine sub-series with similar trends, avoiding over-decomposition and reducing computing time; finally, an ensemble LSTM neural network was established to forecast each reconstructed sub-series. Su et al. [13] proposed a new spatio-temporal (ST) correlation hybrid prediction model named ST-EXMG (ELM-XGBoost-MLP (multilayer perceptron)-GAT (graph attention networks))-AE-XGBoost for AQI forecasting. Most of the aforementioned studies combine data decomposition, feature selection, and deep learning models to build an AQI forecasting system. As for the deep learning models, LSTM variants are the most frequently used. Recently, transformer variants such as Reformer [7], Informer [23], and Autoformer [2] have often been used for time series forecasting owing to their success in pattern recognition tasks. For AQI forecasting, Feng et al. [2] proposed a novel encoder-decoder model named Enhanced Autoformer to improve AQI prediction. However, few studies have explored multi-input parallel deep learning models for AQI forecasting.

3 Proposed Method

In this study, we develop a novel parallel multi-input transformer for AQI forecasting. The method can be divided into three steps: preprocessing, variable selection, and the multi-input transformer itself. The conceptual framework of the proposed method is shown in Fig. 1.

[Fig. 1 block diagram: historical air quality data → preprocessing (missing-data imputation and normalization) → variable selection → parallel Transformer encoders (one per selected variable) → GAP → concatenate → FC layer → output]

Fig. 1. The structure of proposed parallel multi-input transformer. Here, the input to the model consists of multiple variables based on the variable selection result. Here, GAP denotes global average pooling, and FC denotes fully connected.

3.1 Data Preprocessing

Data collected by monitoring stations may suffer from losses in specific periods because of machine failure, regular inspection and maintenance, unstable transmission, and other uncontrollable factors.

However, the loss of data will directly affect the performance of subsequent data analysis, so it is important to fill in the missing values to ensure model performance [10]. Previous studies have proposed several methods for filling missing values: (1) a fixed value such as 0 or 9999, (2) the mean of the previous and next values of the current entry, (3) interpolation, and (4) k-nearest neighbors. In this study, we empirically use the mean of the previous and next values to fill in missing data. Next, we apply Min-Max normalization to the dataset as follows:

$x_j = \dfrac{x_j - \min(x_j)}{\max(x_j) - \min(x_j)}$   (1)

Here, $\max(\cdot)$ and $\min(\cdot)$ denote the maximal and minimal values of $x_j$.
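A minimal pandas sketch of these two preprocessing steps is given below; the file name, the column layout, and the use of pandas itself are assumptions rather than the authors' pipeline.

```python
import pandas as pd

# Hypothetical raw hourly file with columns: time, AQI, PM2.5, PM10, O3, NO2, SO2, CO
df = pd.read_csv("station_1142A.csv", parse_dates=["time"], index_col="time")

# (1) Fill a missing value with the mean of its previous and next observations;
#     for an isolated gap this is exactly linear interpolation between neighbours.
df = df.interpolate(method="linear", limit_direction="both")

# (2) Min-Max normalization per variable, as in Eq. (1)
df_norm = (df - df.min()) / (df.max() - df.min())
```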

3.2 Variable Selection

Since we aim to create a parallel multi-input transformer model for AQI forecasting, it is important to select suitable variables as the input. Here, we use the Spearman Correlation Coefficient to create multiple types of variables as the input, which has been used in previous studies [14,20]. Figure 2 shows the correlation coefficients between seven air quality variables. Based on the grading standards of ρ shown in Table 1, we can observe that different variables have a different correlation degree with AQI. For instance, PM10 and SO2 are very strongly correlated with AQI. However, the correlation between AQI and O3 is very weak. It should be noted that the ρ value in Fig. 2 is the mean value over nine sites. For some sites, O3 is negatively correlated with AQI. Therefore, in this study, we will construct two types of combinations of variables including (1) all seven variables, and (2) those variables that are positively correlated with AQI.

Fig. 2. Mean Spearman Correlation Coefficient (ρ) between seven air quality variables over nine sites.
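As a concrete illustration of this variable-selection step, a minimal pandas sketch follows; the preprocessed file name is a placeholder, and the selection rule simply keeps the positively correlated variables as described above.

```python
import pandas as pd

# Normalized hourly records with columns AQI, PM2.5, PM10, O3, NO2, SO2, CO
# (e.g. the df_norm produced by the preprocessing sketch above, saved to disk).
df_norm = pd.read_csv("station_1142A_preprocessed.csv")

rho = df_norm.corr(method="spearman")["AQI"].drop("AQI")
print(rho.sort_values(ascending=False))

# Two input configurations considered in this paper:
all_variables = ["AQI"] + rho.index.tolist()                 # (1) all seven variables
optimized_variables = ["AQI"] + rho[rho > 0].index.tolist()  # (2) positively correlated only
```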

Table 1. Grading table of Spearman Correlation Coefficient (ρ).

Grading Standards      Correlation Degree
ρ = 0                  no correlation
0 < ρ ≤ 0.19           very weak
0.20 ≤ ρ ≤ 0.39        weak
0.40 ≤ ρ ≤ 0.59        moderate
0.60 ≤ ρ ≤ 0.79        strong
0.80 ≤ ρ < 1.00        very strong
ρ = 1.00               monotonic correlation

3.3 Parallel Multi-input Transformer

The Transformer is built entirely upon attention mechanisms, which makes it possible to access any part of a sequence regardless of its distance from the target [1]. The transformer was initially proposed for machine translation but was rapidly applied in other research areas owing to its success. The standard transformer is organized in an encoder-decoder manner, where identical encoder modules are stacked at the bottom of stacked decoder modules. In this study, we use a transformer encoder to encode the input air quality data into a hidden information matrix. Each encoder is composed of layer normalization, a multi-head attention layer, and a position-wise feed-forward layer. A pictorial representation of this transformer architecture is shown in Fig. 3. The use of position embedding is also investigated in our parallel multi-input transformer architecture.

Fig. 3. The architecture of transformer encoder.
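A minimal PyTorch sketch of one such encoder block is given below for illustration; the layer sizes are drawn from Sect. 4.2, but how the paper's "head size" maps onto the attention dimensions is an assumption, and this is not the authors' implementation.

```python
import torch
import torch.nn as nn


class EncoderBlock(nn.Module):
    """Layer normalization -> multi-head attention -> residual,
    then layer normalization -> Conv1D feed-forward -> residual (cf. Fig. 3)."""

    def __init__(self, d_model=24, num_heads=4, ff_dim=4, dropout=0.25):
        super().__init__()
        self.norm1 = nn.LayerNorm(d_model)
        self.attn = nn.MultiheadAttention(d_model, num_heads,
                                          dropout=dropout, batch_first=True)
        self.norm2 = nn.LayerNorm(d_model)
        self.ff = nn.Sequential(            # position-wise feed-forward via 1x1 convolutions
            nn.Conv1d(d_model, ff_dim, kernel_size=1),
            nn.ReLU(),
            nn.Dropout(dropout),
            nn.Conv1d(ff_dim, d_model, kernel_size=1),
        )

    def forward(self, x):                   # x: (batch, seq_len, d_model)
        xn = self.norm1(x)
        x = x + self.attn(xn, xn, xn, need_weights=False)[0]
        # Conv1d expects (batch, channels, seq_len), hence the transposes
        return x + self.ff(self.norm2(x).transpose(1, 2)).transpose(1, 2)
```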

Multi-head Attention. Given the input x, it is fed to the multi-head attention mechanism, which projects Q, K, and V through h different linear transformations and concatenates the resulting attention outputs:

$\mathrm{MultiHead}(Q, K, V) = \mathrm{Concat}(\mathrm{head}_1, \mathrm{head}_2, \ldots, \mathrm{head}_h)\,W^O$   (2)

$\mathrm{head}_i = \mathrm{Attention}(QW_i^Q, KW_i^K, VW_i^V)$   (3)

where $W_i^Q, W_i^K, W_i^V \in \mathbb{R}^{d_{model} \times d_k}$ and $W^O \in \mathbb{R}^{hd_v \times d_{model}}$.

Parallel Transformer. Similar to [3], parallel series embedding can adequately capture both temporal and variable-wise dependencies and boost the performance of transformer-based models. In this study, we apply a transformer model to each variable in parallel, followed by 1D global average pooling (GAP). The outputs of the GAP layers are then concatenated and processed by a multi-layer perceptron to produce the final forecasting result.
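Building on the EncoderBlock sketched after Fig. 3, a minimal PyTorch sketch of this parallel assembly could look as follows; the per-variable linear embedding, the forecast horizon, and the layer sizes are assumptions, not the authors' exact architecture.

```python
import torch
import torch.nn as nn


class ParallelMultiInputTransformer(nn.Module):
    """One Transformer-encoder branch per input variable; each branch is pooled
    with 1D global average pooling, concatenated, and fed to an MLP head."""

    def __init__(self, n_vars, d_model=24, n_blocks=4, mlp_dim=64, horizon=3):
        super().__init__()
        self.embed = nn.ModuleList(nn.Linear(1, d_model) for _ in range(n_vars))
        self.branches = nn.ModuleList(
            nn.Sequential(*[EncoderBlock(d_model) for _ in range(n_blocks)])
            for _ in range(n_vars)
        )
        self.head = nn.Sequential(
            nn.Linear(n_vars * d_model, mlp_dim), nn.ReLU(), nn.Dropout(0.4),
            nn.Linear(mlp_dim, horizon),          # 3-, 6-, or 9-step forecast
        )

    def forward(self, x):                         # x: (batch, seq_len, n_vars)
        pooled = []
        for i, (emb, branch) in enumerate(zip(self.embed, self.branches)):
            h = branch(emb(x[..., i:i + 1]))      # (batch, seq_len, d_model)
            pooled.append(h.mean(dim=1))          # 1D global average pooling over time
        return self.head(torch.cat(pooled, dim=-1))


# e.g. forecast = ParallelMultiInputTransformer(n_vars=7)(torch.randn(8, 24, 7))
```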

4 Experimental Details

4.1 Datasets

For AQI forecasting, the air quality data from nine air quality monitoring stations in Shanghai City, China, were obtained from the Chinese Ministry of Environmental Protection. Figure 4 shows a location map of Shanghai. The experimental data were collected from 2014-05-13 to 2020-12-31 at an hourly rate and consist of AQI, CO, O3, NO2, SO2, PM10, and PM2.5.
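To illustrate how such hourly records can be turned into multi-step forecasting samples, here is a minimal NumPy sketch; the input window length is an assumption, since it is not stated in this excerpt.

```python
import numpy as np


def make_windows(series, input_len=24, horizon=3):
    """Slice a 1-D hourly series into (input_len,) inputs and (horizon,) targets."""
    X, y = [], []
    for t in range(len(series) - input_len - horizon + 1):
        X.append(series[t:t + input_len])
        y.append(series[t + input_len:t + input_len + horizon])
    return np.asarray(X), np.asarray(y)


# Stand-in for one station's hourly AQI; 3-, 6-, and 9-step targets are evaluated in Sect. 5
aqi = np.random.rand(1000)
X3, y3 = make_windows(aqi, horizon=3)
X9, y9 = make_windows(aqi, horizon=9)
```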

Fig. 4. Distribution of air quality monitoring stations in Shanghai City. Here, there are nine monitoring stations across seven districts.

4.2 Experimental Settings

The parameters of the parallel multi-input transformer were tuned experimentally to achieve the best performance, considering both RMSE and efficiency. For multi-head attention, the head size is 256, the number of heads is 4, and the dropout is 0.25. For the feed-forward part, the filter sizes of the first and second Conv1D layers are set to 4, the input embedding size is set to 24, and the dropout is 0.25. The transformer encoder is repeated 4 times. For the multi-layer perceptron head, the dense size is 64 and the dropout is 0.4. For training, we use the Adam algorithm to optimize the network with a learning rate of 0.0001, a batch size of 64, and 200 epochs.

5 Results and Discussion

In this section, we first discuss the performance of our model during the training and inference period. We train and validate the model using data collected from 2014-05-12 to 2019-12-31; the remaining data is used for testing.

5.1 Evaluation Metrics

To verify the accuracy of the proposed method in predicting AQI, two commonly used performance metrics are used to evaluate the model: root mean square error (RMSE) and R2.

$\mathrm{RMSE} = \sqrt{\dfrac{1}{n}\sum_{i=1}^{n}(y_i - \hat{y}_i)^2}$   (4)

$R^2 = 1 - \dfrac{\sum_{i=1}^{n}(y_i - \hat{y}_i)^2}{\sum_{i=1}^{n}(y_i - \bar{y})^2}$   (5)

where $y_i$ and $\hat{y}_i$ denote the actual and predicted AQI values, respectively, $n$ is the number of data points in the test set, $i$ indexes the test data, and $\bar{y}$ is the average of the $n$ observed samples. For RMSE, a lower value indicates better performance, whereas for R2 a higher value indicates better performance.
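For completeness, a minimal NumPy sketch of Eqs. (4) and (5):

```python
import numpy as np


def rmse(y_true, y_pred):
    return np.sqrt(np.mean((y_true - y_pred) ** 2))       # Eq. (4)


def r2(y_true, y_pred):
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot                          # Eq. (5)
```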

5.2 Performance of Single-Input Multi-step Model

The AQI forecasting performance of the proposed method is compared with biLSTM, stack-biLSTM, CNN-biLSTM-v1, and CNN-biLSTM-v2. Tables 2, 3 and 4 show the 3-step, 6-step, and 9-step forecasting results using the five architectures, where AQI alone is used as the model input. From the tables, we can observe that the transformer-based model achieves the best performance in terms of both RMSE and R2. Performance differs considerably across monitoring stations: 1148A shows the worst results, while 1142A shows the best in terms of RMSE and R2.

Considering different steps in multi-step-ahead AQI forecasting, it is natural that the forecasting performance deteriorates as the prediction horizon increases. Among the four biLSTM-based models, stack-biLSTM achieves the best performance, with averaged RMSE and R2 for 3-step, 6-step, and 9-step forecasting of 7.5101 and 0.8288, 9.7164 and 0.7127, and 11.1694 and 0.6198, respectively. Therefore, both the transformer and stack-biLSTM are selected as the basic blocks for the subsequent multi-input multi-step analysis.

Table 2. Comparison of 3-step forecasting results using different models in terms of RMSE ↓ and R2 ↑, where AQI is used as the input.

Metric  Method          1142A   1143A   1144A   1145A   1146A   1147A   1148A   1149A   1150A   Mean
RMSE    biLSTM          5.7927  9.4096  6.8628  7.5125  7.4767  8.1573  9.6585  6.1925  6.7555  7.5353
        stack-biLSTM    5.7672  9.3872  6.8365  7.4620  7.3632  8.1951  9.6254  6.1985  6.7560  7.5101
        CNN-biLSTM-v1   5.8343  9.5170  7.0031  7.6536  7.4941  8.2194  9.5698  6.3006  6.8592  7.6057
        CNN-biLSTM-v2   5.8685  9.4614  6.9991  7.6445  7.5151  8.1831  9.5682  6.3065  6.8385  7.5983
        Transformer     5.7431  9.3192  6.8678  7.4989  7.3724  8.0932  9.5359  6.1846  6.7551  7.4856
R2      biLSTM          0.8325  0.8158  0.8432  0.8311  0.8304  0.8170  0.8018  0.8467  0.8305  0.8277
        stack-biLSTM    0.8340  0.8166  0.8444  0.8334  0.8355  0.8153  0.8032  0.8464  0.8305  0.8288
        CNN-biLSTM-v1   0.8301  0.8115  0.8367  0.8247  0.8296  0.8142  0.8054  0.8413  0.8252  0.8243
        CNN-biLSTM-v2   0.8281  0.8137  0.8369  0.8252  0.8286  0.8158  0.8055  0.8410  0.8263  0.8246
        Transformer     0.8354  0.8193  0.8430  0.8318  0.8351  0.8198  0.8068  0.8471  0.8305  0.8299

Table 3. Comparison of 6-step forecasting results using different models in terms of RMSE ↓ and R2 ↑, where AQI is used as the input.

Metric  Method          1142A   1143A    1144A    1145A    1146A    1147A    1148A    1149A   1150A   Mean
RMSE    biLSTM          7.5781  11.9582  8.9781   9.9010   9.6351   10.4189  12.0198  8.1957  8.7114  9.7107
        stack-biLSTM    7.6263  12.0462  8.9841   9.8806   9.5206   10.3877  12.0723  8.2235  8.7065  9.7164
        CNN-biLSTM-v1   7.6616  12.0460  9.0787   10.0153  9.6847   10.4811  12.0302  8.2741  8.7893  9.7845
        CNN-biLSTM-v2   7.7354  12.0107  9.0553   9.9808   9.7050   10.5326  11.9847  8.2335  8.8093  9.7830
        Transformer     7.4880  11.7925  8.8869   9.6849   9.5138   10.2649  11.9166  8.0954  8.6357  9.5865
R2      biLSTM          0.7134  0.7025   0.7317   0.7068   0.7183   0.7015   0.6931   0.7315  0.7182  0.7130
        stack-biLSTM    0.7098  0.6981   0.7313   0.7080   0.7250   0.7032   0.6904   0.7297  0.7185  0.7127
        CNN-biLSTM-v1   0.7071  0.6981   0.7256   0.7000   0.7154   0.6979   0.6926   0.7263  0.7131  0.7085
        CNN-biLSTM-v2   0.7014  0.6999   0.7271   0.7020   0.7142   0.6949   0.6949   0.7290  0.7118  0.7084
        Transformer     0.7202  0.7107   0.7371   0.7195   0.7254   0.7102   0.6984   0.7380  0.7230  0.7203

5.3 Performance of Multi-input Multi-step Model

Figures 5 and 6 show the forecasting results using all seven variables and the optimized variables as parallel inputs. Here, the optimized variables are selected based on the ρ values, keeping those variables that are positively correlated with AQI. Compared to the single-input multi-step model, the multi-input multi-step model achieves better forecasting performance, which verifies the effectiveness of using more variables to construct the model.

Table 4. Comparison of 9-step forecasting results using different models in terms of RMSE ↓ and R2 ↑, where AQI is used as the input.

Metric  Method          1142A   1143A    1144A    1145A    1146A    1147A    1148A    1149A   1150A    Mean
RMSE    biLSTM          8.7977  13.8244  10.4122  11.4434  11.1173  11.8833  13.7453  9.5555  10.0543  11.2037
        stack-biLSTM    8.8092  13.7301  10.4684  11.4518  11.0394  11.8265  13.6538  9.5392  10.0064  11.1694
        CNN-biLSTM-v1   8.8565  13.7776  10.5275  11.5903  11.2051  11.9440  13.7415  9.6398  10.1596  11.2713
        CNN-biLSTM-v2   8.8445  13.8409  10.5102  11.5557  11.1326  11.9175  13.7085  9.5565  10.1010  11.2408
        Transformer     8.6346  13.5435  10.2555  11.1797  10.9084  11.7506  13.5681  9.3674  9.9604   11.0187
R2      biLSTM          0.6139  0.6025   0.6392   0.6084   0.6251   0.6117   0.5988   0.6351  0.6246   0.6177
        stack-biLSTM    0.6128  0.6079   0.6353   0.6078   0.6303   0.6154   0.6041   0.6363  0.6282   0.6198
        CNN-biLSTM-v1   0.6087  0.6052   0.6312   0.5983   0.6191   0.6077   0.5990   0.6286  0.6167   0.6127
        CNN-biLSTM-v2   0.6097  0.6016   0.6324   0.6007   0.6241   0.6095   0.6009   0.6350  0.6211   0.6150
        Transformer     0.6280  0.6185   0.6500   0.6263   0.6390   0.6203   0.6090   0.6493  0.6316   0.6302

The multi-input transformer with the positively correlated variables as input achieves an RMSE of 6.488, 8.7978, and 10.4585 for 3-step, 6-step, and 9-step AQI forecasting, respectively, the best results among the four multi-input models. However, the best performance for 3-step and 6-step forecasting is obtained with position encoding, whereas the best performance for 9-step forecasting is obtained without it. Similar to the single-input models, the forecasting performance deteriorates as the prediction horizon increases. For short-horizon (3-step) forecasting, the proposed model reaches an R2 of 0.8715, indicating that it fits the AQI series well. Furthermore, the experimental results demonstrate that the proposed model can forecast future AQI grades accurately; the public can therefore schedule outdoor activities and take protective measures against air pollution in advance according to the predicted AQI grade.

Fig. 5. Comparison of multi-step forecasting results using multi-input stack-biLSTM and multi-input transformer in terms of RMSE ↓. Here, A: multi-stack-biLSTM with all variables, B: multi-stack-biLSTM with optimized variables, C: multi-transformer with all variables, D: multi-transformer with optimized variables, E: multi-transformer with all variables using position encoding, F: multi-transformer with optimized variables using position encoding.

Fig. 6. Comparison of multi-step forecasting results using multi-input stack-biLSTM and multi-input transformer in terms of R2 ↑. Here, A: multi-stack-biLSTM with all variables, B: multi-stack-biLSTM with optimized variables, C: multi-transformer with all variables, D: multi-transformer with optimized variables, E: multi-transformer with all variables using position encoding, F: multi-transformer with optimized variables using position encoding.

5.4 Conclusion

We propose a novel parallel multi-input transformer architecture for AQI forecasting, which is essential for public health protection and air pollution reduction. In the proposed framework, five single-input deep learning models are first employed to forecast AQI, and the best-performing architecture is selected for constructing the multi-input models. Then, the input variables are optimized based on Spearman correlation coefficients. The experimental results demonstrate the effectiveness of the multi-input architecture and variable optimization, and the performance of the proposed method is superior to that of all compared models. Future work will investigate data decomposition and data from nearby stations to further improve performance.

Acknowledgements. This work is supported by the Fundamental Research Funds for the Central Universities (grant No. 020414380195) and by the National Natural Science Foundation of China (Grant Nos. 32371556, 61902154 and 72004092).

References

1. Cai, L., Janowicz, K., Mai, G., Yan, B., Zhu, R.: Traffic transformer: capturing the continuity and periodicity of time series for traffic forecasting. Trans. GIS 24(3), 736–755 (2020)

2. Feng, H., Zhang, X.: A novel encoder-decoder model based on autoformer for air quality index prediction. PLOS ONE 18(4), e0284293 (2023) 3. Feng, X., Lyu, Z.: How features benefit: parallel series embedding for multivariate time series forecasting with transformer. In: 2022 IEEE 34th International Conference on Tools with Artificial Intelligence (ICTAI), pp. 967–975. IEEE (2022) 4. Ganesh, S.S., Modali, S.H., Palreddy, S.R., Arulmozhivarman, P.: Forecasting air quality index using regression models: a case study on Delhi and Houston. In: 2017 International Conference on Trends in Electronics and Informatics (ICEI), pp. 248–254. IEEE (2017) 5. Jamal, A., Nodehi, R.N.: Predicting air quality index based on meteorological data: a comparison of regression analysis, artificial neural networks and decision tree. J. Air Pollut. Health 2(1), 27–38 (2017) 6. Ketu, S.: Spatial air quality index and air pollutant concentration prediction using linear regression based recursive feature elimination with random forest regression (RFERF): a case study in India. Nat. Hazards 114(2), 2109–2138 (2022) 7. Kitaev, N., Kaiser, Ł., Levskaya, A.: Reformer: the efficient transformer. arXiv preprint arXiv:2001.04451 (2020) 8. Kumar, A., Goyal, P.: Forecasting of daily air quality index in Delhi. Sci. Total Environ. 409(24), 5517–5523 (2011) 9. Liu, C.C., Lin, T.C., Yuan, K.Y., Chiueh, P.T.: Spatio-temporal prediction and factor identification of urban air quality using support vector machine. Urban Clim. 41, 101055 (2022) 10. Ma, J., Cheng, J.C.: Estimation of the building energy use intensity in the urban scale by integrating GIS and big data technology. Appl. Energy 183, 182–192 (2016) 11. Méndez, M., Merayo, M.G., Núñez, M.: Machine learning algorithms to forecast air quality: a survey. Artif. Intell. Rev. 56, 10031–10066 (2023). https://doi.org/ 10.1007/s10462-023-10424-4 12. Nigam, S., Rao, B., Kumar, N., Mhaisalkar, V.: Air quality index-a comparative study for assessing the status of air quality. Res. J. Eng. Technol. 6(2), 267–274 (2015) 13. Su, M., Liu, H., Yu, C., Duan, Z.: A novel AQI forecasting method based on fusing temporal correlation forecasting with spatial correlation forecasting. Atmos. Pollut. Res. 14(4), 101717 (2023) 14. Sun, W., Li, Z.: Hourly PM2.5 concentration forecasting based on feature extraction and stacking-driven ensemble model for the winter of the Beijing-Tianjin-Hebei area. Atmos. Pollut. Res. 11(6), 110–121 (2020) 15. Tella, A., Balogun, A.L., Adebisi, N., Abdullah, S.: Spatial assessment of PM10 hotspots using Random Forest, K-nearest neighbour and Naïve Bayes. Atmos. Pollut. Res. 12(10), 101202 (2021) 16. Wu, Q., Lin, H.: Daily urban air quality index forecasting based on variational mode decomposition, sample entropy and LSTM neural network. Sustain. Urban Areas 50, 101657 (2019) 17. Wu, Z., Zhao, W., Lv, Y.: An ensemble LSTM-based AQI forecasting model with decomposition-reconstruction technique via CEEMDAN and fuzzy entropy. Air Qual. Atmos. Health 15(12), 2299–2311 (2022) 18. Xu, Y., Liu, H., Duan, Z.: A novel hybrid model for multi-step daily AQI forecasting driven by air pollution big data. Air Qual. Atmos. Health 13, 197–207 (2020) 19. Zhan, C., Jiang, W., Lin, F., Zhang, S., Li, B.: A decomposition-ensemble broad learning system for AQI forecasting. Neural Comput. Appl. 34(21), 18461–18472 (2022)

Multi-step Air Quality Index Forecasting

63

20. Zhang, L., et al.: Trend analysis and forecast of PM2.5 in Fuzhou, China using the Arima model. Ecol. Indic. 95, 702–710 (2018) 21. Zhao, X., Wu, Z., Qiu, J., Wei, Y.: A novel hybrid algorithm with static and dynamic models for air quality index forecasting. Nonlinear Dyn. 111, 1–13 (2023) 22. Zhao, Z., Wu, J., Cai, F., Zhang, S., Wang, Y.G.: A statistical learning framework for spatial-temporal feature selection and application to air quality index forecasting. Ecol. Ind. 144, 109416 (2022) 23. Zhou, H., et al.: Informer: beyond efficient transformer for long sequence time-series forecasting. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 35, pp. 11106–11115 (2021) 24. Zhu, M., Xie, J.: Investigation of nearby monitoring station for hourly PM2.5 forecasting using parallel multi-input 1D-CNN-biLSTM. Exp. Syst. Appl. 211, 118707 (2023)

One-shot Video-based Person Re-identification based on SAM Attention and Reciprocal Nearest Neighbors Metric Ji Zhang, Yuchang Yin, and Hongyuan Wang(B) Changzhou University, Changzhou Jiangsu 213164, China [email protected]

Abstract. Pedestrian re-identification (Re-ID) addresses the recognition and retrieval of pedestrians across cameras. To alleviate the difficulty of label annotation in Re-ID, a one-shot video-based Re-ID method is adopted. To tackle the large number of neural network parameters and the weak feature extraction ability of the model, the SAM attention module is embedded; the module is plug-and-play, adds no extra parameters, and can capture hidden information in the samples and improve the discriminative ability of the model. To address the low accuracy of label estimation, a reciprocal nearest neighbor metric is designed; it constructs closer nearest-neighbor relationships between samples by combining the Mahalanobis distance and the Jaccard distance, which significantly improves the accuracy of label estimation and the performance of the model. The effectiveness of our method is extensively validated on two video-based Re-ID datasets, MARS and DukeMTMC-VideoReID. Keywords: Video-based Person Re-identification · One-shot Learning · Semi-supervised Learning

1 Introduction
Pedestrian re-identification (Re-ID) is a technique for detecting the presence of a specific pedestrian in a network of multiple cameras [1–5]. It plays an important role in intelligent video surveillance systems and has a wide range of applications in the field of public safety. With the development of smart cities, a large number of cameras have been installed and a huge amount of video data is generated all the time. How to process this video data efficiently is an important problem for security personnel, and Re-ID plays an important role in this regard, as it effectively reduces manpower and economic costs. Most existing video-based pedestrian Re-ID methods are supervised [6–10] and therefore depend heavily on data annotation, but large-scale data annotation is laborious and time-consuming. Compared with supervised methods, semi-supervised methods do not require massive annotation, but they do not fully utilize the hidden information of unlabeled data, which
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 H. Lu et al. (Eds.): ACPR 2023, LNCS 14408, pp. 64–78, 2023. https://doi.org/10.1007/978-3-031-47665-5_6


cause the performances of them are not as good as desired. One-shot Re-ID methods adopt a different training philosophy, which use only one labeled video clip for each identity in the training set, and the rest are unlabeled data. The current mainstream approach uses an incremental learning strategy to first assign pseudo-labels to the unlabeled data, then select some reliable pseudo-labeled data to merge with the original training set, and finally use the merged new data set to train the model again. Obviously, assignment of labels plays an important role in these methods, because it determines the assignment of pseudo-labels directly, and the correct assignment of pseudo-labels has a great impact on the next training. In other words, a correct pseudo-label sample has a positive effect on the training, while a wrong one provides wrong supervision information and prevents the model from learning the hidden information of the sample. In this paper, a novel one-shot video-based pedestrian Re-ID method is designed based on SAM attention module and reciprocal nearest neighbor metric. The whole process can be briefly divided into three steps: (1) Initialize the model using labeled data. (2) Assign pseudo-labels for unlabeled data and select some reliable pseudo-labels to be combined with labeled data as a new training set. (3) Train the model again using the new training set. Due to the few number of initial labeled samples, the trained model is not discriminative enough. To address the problem of few number of labeled samples and low discriminative ability at the begining of training, we embed SAM attention module [11] in our network. In label estimation, in order to improve the accuracy of pseudo-label assignment, we design a reciprocal nearest neighbor metric, with k-reciprocal encoding [12], Mahalanobis metric and Jaccard metric for nearest neighbor samples, which is more robust and tighter and improves the accuracy of pseudo-label assignment. The main contributions of this paper are as follow: (1) The SAM attention module is embedded in our network, which is flexible and effective to enhance the model discrimination without increasing the training burden. (2) A reciprocal nearest neighbor metric is designed for pseudo-label estimation, which can effectively improve the accuracy of pseudo-label prediction. (3) Our method in this paper focuses on the video-based pedestrian Re-ID problem under one-shot, and achieves competitive performance on two large-scale video pedestrian datasets, MARS and DukeMTMC-VideoReID.

2 Related Works In recent years, video-based pedestrian Re-ID with deep learning has developed rapidly and achieved impressive results on major datasets [13–17]. Compared with image-based pedestrian Re-ID, video-based pedestrian Re-ID contains more pedestrian identity information and is accompanied by more noise and challenges. How to obtain sequence-level discriminant features is the core of supervised video-based pedestrian Re-ID. To address this problem, based on the spatio-temporal information of videos, many researchers extract more effective pedestrian features by integrating attention mechanisms. Li et al. propose a spatio-temporal attention module that discovers discriminative parts from pedestrian images and extract valid information without being affected by problems such as occlusion [13]; Hou et al. combine the attention mechanism with adversarial generative networks to design a spatio-temporal completion network, which recover the


occluded parts of pedestrian through the network, instead of discarding them, to effectively deal with the occlusion problem [14]; Li et al. draw on the work in image pedestrian Re-ID to explore multi-scale temporal cues in video sequences [15]; Liu et al. introduce a non-local attention module and reduce the computational complexity of extracting spatio-temporal features [16]; Eom et al. propose a spatio-temporal memory network, where spatial memory stores the features of the extracted spatial interference terms and temporal memory stores the attention used to optimize temporal patterns, this network improves the performance of attention mechanisms in the field of video-based pedestrian Re-ID [17]. Semi-supervised video-based Re-ID has been less studied, and most of the work focuses on semi-supervised video-based Re-ID under one-shot [18–24]. Zhu et al. use a semi-supervised dictionary learning approach to study the cross-view problem in videobased Re-ID by converting videos under different cameras into coding coefficients in the feature space to reduce the variation and differences between different cameras [18]; Liu et al. propose a metric boosting algorithm that iterates between model upgrading and label estimation, mainly for the problems of difficult sample mining and label propagation, and achieves good results on three datasets [19]. Wu et al. improve the framework of iterative training by drawing on the experience in [19], they also optimize the sampling strategy and the algorithm for pseudo-label assignment [20]. Similar to this iterative training approach, Ye et al. design a framework for dynamic graph matching to improve the label estimation process by continuously changing the graph structure, and also design a joint matching strategy to fully extract video information and reduce false matches [21]. Although all of the above methods are semi-supervised methods, the training and data selection strategies are different from each other. In particular, the traditional principle of nearest neighbor assignment is still used in label estimation, and pseudo-label estimation errors often occur. The method in this paper focuses on the problem of one-shot videobased Re-ID, using SAM attention as well as reciprocal nearest neighbor metric, which can effectively utilize unlabeled samples and improve the accuracy of pseudo-label prediction.

3 Method 3.1 Main Framework Figure 1 shows the overall framework of our method proposed in this paper. The whole process consists of model initialization, label assignment, dynamic selection and model update. Firstly, the model is initialized using labeled samples. In previous works, ResNet50 is the common choice. For the dual consideration of feature extraction and computation consumption of our model, the SAM attention module is embedded in ResNet-50. In label assignment, the reciprocal nearest neighbor metric is used, which is better in reflecting the relationships of samples and improving the accuracy of label estimation effectively, compared with the Euclidean metric. After each selection of pseudo-labeled data, the labeled data are merged with the selected pseudo-labeled data as the training set for the next training. The whole process is continuously iterated, and eventually all unlabeled data are selected.
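As a rough illustration, the sketch below outlines this iterative scheme. It is a simplification under stated assumptions: the callables train_fn and estimate_fn are hypothetical placeholders for the model update and the label-estimation step (e.g., the reciprocal nearest neighbor metric of Sect. 3.3), and p is the fraction of pseudo-labeled clips added per round, as discussed in the experiments.

```python
from typing import Callable, Dict, List, Sequence, Tuple

def iterative_one_shot_training(
    train_fn: Callable[[List], None],                      # retrains the model on a list of (clip, label)
    estimate_fn: Callable[[Sequence, Sequence], Dict],     # returns {clip: (pseudo_label, confidence)}
    labeled: List,
    unlabeled: List,
    p: float = 0.10,
) -> None:
    """Sketch of the loop: initialize, pseudo-label, select reliable clips, retrain."""
    train_fn(list(labeled))                                # (1) initialize with the one-shot labeled data
    num_selected = 0
    while num_selected < len(unlabeled):
        # (2) assign pseudo-labels to every unlabeled clip with the current model
        pseudo = estimate_fn(labeled, unlabeled)
        # enlarge the reliable subset by a fraction p of the unlabeled pool each round
        num_selected = min(len(unlabeled),
                           num_selected + max(1, int(p * len(unlabeled))))
        ranked = sorted(pseudo.items(), key=lambda kv: kv[1][1], reverse=True)
        selected = [(clip, label) for clip, (label, _) in ranked[:num_selected]]
        # (3) retrain on the union of labeled data and the selected pseudo-labeled clips
        train_fn(list(labeled) + selected)
```

The loop terminates once all unlabeled clips have been absorbed, matching the description above in which the training set grows until no unlabeled data remain.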



Fig. 1. The main framework

3.2 SAM Attention Module
In one-shot Re-ID, the labeled sample is unique for each identity. These samples are first used to train the model, which is then used to estimate labels for the unlabeled ones. However, owing to the small number of labeled samples, researchers usually avoid overly complex network structures. On the one hand, a complex network structure incurs a huge amount of computation; on the other hand, it does not necessarily enhance the feature extraction and discriminative ability of the model and may even lead to overfitting. To address these problems, the SAM attention module is added to the ResNet-50 network; it is a plug-and-play module that does not require much adjustment of the network structure.

Fig. 2. 3-D attention weights

Existing attention models usually extract features from the channel or spatial dimension only, which limits the flexibility of the attention weights the network can learn. In addition, the structures of these attention models are usually very complex, which brings an enormous amount of computation. In contrast, the SAM module is simple and effective in that it


considers both the spatial and channel dimensions and can obtain 3-dimensional attention weights directly from the current neurons, as shown in Fig. 2, where X ∈ R^(C×H×W) is the input feature map and C is the number of channels of X. The SAM energy function is shown in Eq. (1), which is derived in detail in [11]:

et* = 4(σ² + λ) / [(t − μ)² + 2σ² + 2λ]   (1)

μ = (1/M) Σ_{i=1}^{M} xi   (2)

σ² = (1/M) Σ_{i=1}^{M} (xi − μ)²   (3)

In Eq. (1), t is a target neuron in one of the channels of the input feature X, xi denotes the other neurons in that channel, and i indexes the spatial dimensions. M = H × W denotes the number of neurons in the spatial dimension, and λ is a hyper-parameter. μ and σ² denote the mean and variance of all neurons in the channel, as shown in Eq. (2) and Eq. (3), respectively. The formula indicates that the smaller et* is, the more distinctive the neuron t is from the other neurons in its channel, and therefore the more useful it is for feature learning. The attention weights are integrated with the input features, and the final feature representation is shown in Eq. (4):

X̃ = sigmoid(1/E) ⊙ X   (4)

where X̃ denotes the output feature and E groups all et* across the channel and spatial dimensions; the sigmoid restricts overly large values in E. The SAM attention module is flexible and modular. It is combined with the ResNet-50 network in this paper, and, to adapt to the classification task of pedestrian Re-ID, a temporal average pooling layer is added at the end of the network to extract sequence-level features. As shown in Fig. 3, the SAM module can be inserted into each layer, from Layer1 to Layer4, and the temporal average pooling layer is added after Layer4 and the SAM modules. With these SAM modules, many valuable clues can be captured and the discriminative ability of the model is enhanced, while the computational cost does not increase significantly owing to the lightweight structure of the module.
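For concreteness, a minimal PyTorch sketch of the SAM/SimAM module of Eqs. (1)–(4) is given below. It assumes standard 4-D CNN feature maps of shape (batch, channels, height, width), as produced by each ResNet-50 stage; PyTorch itself and the default value of λ are assumptions, since the paper does not name a framework or report λ.

```python
import torch
import torch.nn as nn

class SimAM(nn.Module):
    """Parameter-free SAM/SimAM attention (Eqs. 1-4); a sketch for 4-D feature maps."""

    def __init__(self, lam: float = 1e-4):
        super().__init__()
        self.lam = lam  # the hyper-parameter lambda of Eq. (1); value is an assumption

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        n = h * w - 1                                          # number of "other" neurons per channel
        d = (x - x.mean(dim=(2, 3), keepdim=True)).pow(2)      # (t - mu)^2 for every neuron
        v = d.sum(dim=(2, 3), keepdim=True) / n                # channel-wise variance sigma^2
        e_inv = d / (4 * (v + self.lam)) + 0.5                 # inverse energy: low e* -> large weight
        return x * torch.sigmoid(e_inv)                        # Eq. (4): reweight and bound the weights
```

Because the block has no learnable parameters, dropping it after each ResNet-50 stage changes neither the parameter count nor the overall structure, which is consistent with the lightweight property emphasized above.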


3.3 Reciprocal Nearest Neighbor Metric
Label estimation is currently the main challenge in semi-supervised Re-ID. How to assign correct pseudo-labels to unlabeled samples plays a crucial role in model training. The nearest neighbor method with a Euclidean distance metric is commonly used to assign pseudo-labels. Equation (5) gives the Euclidean distance d(vi, vj) between a labeled sample vi and an unlabeled sample vj:

d(vi, vj) = ‖vi − vj‖   (5)

According to Eq. (5), unlabeled data is assigned the label of the labeled sample closest to it. However, in the process of label allocation, it is inevitable that some unlabeled data are mislabeled (assigned an erroneous pseudo-label), especially when the discriminative power of the model is not strong enough. With the nearest neighbor method, a large number of mislabeled samples can greatly weaken the performance of the model. To address this problem, k-reciprocal encoding is used in our model for label estimation. In the field of pedestrian Re-ID, k-reciprocal encoding was initially used for re-ranking [12]; in this paper, it is applied as the distance metric for one-shot label estimation of unlabeled samples. Assuming ui is an unlabeled sample and G = {gi | i = 1, 2, …, N} denotes the N labeled samples, the k-nearest neighbors of ui are defined as N(ui, k), as shown in Eq. (6):

N(ui, k) = {g1^0, g2^0, ..., gk^0},  |N(ui, k)| = k   (6)

where |·| denotes the number of candidates in the set. Based on the k-nearest neighbors of ui, the k-reciprocal nearest neighbors of ui are denoted R(ui, k):

R(ui, k) = {gi | (gi ∈ N(ui, k)) ∧ (ui ∈ N(gi, k))}   (7)
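A minimal NumPy sketch of Eqs. (6) and (7) is shown below. It assumes a precomputed symmetric pairwise distance matrix over a pooled set of samples, which is a simplification of the labeled/unlabeled split used in the paper, and it excludes a sample from its own neighbor set.

```python
import numpy as np

def k_nearest(dist: np.ndarray, i: int, k: int) -> set:
    """N(i, k) of Eq. (6): indices of the k samples closest to sample i (itself excluded)."""
    order = [j for j in np.argsort(dist[i]) if j != i]
    return set(order[:k])

def k_reciprocal_neighbors(dist: np.ndarray, i: int, k: int) -> set:
    """R(i, k) of Eq. (7): samples j such that i and j are k-nearest neighbors of each other."""
    return {j for j in k_nearest(dist, i, k) if i in k_nearest(dist, j, k)}
```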

Fig. 3. Network structure diagram with SAM module


Fig. 4. Reciprocal nearest neighbour metric

As shown in Eq. (7), an unlabeled sample and a labeled sample are called k-reciprocal nearest neighbors when they are k-nearest neighbors of each other. Compared with k-nearest neighbors, k-reciprocal nearest neighbors reflect the relationships between samples


better. In the procedure of label assignment, when an unlabeled sample and a labeled sample are k-reciprocal nearest neighbors of each other, the likelihood that they belong to the same category is very high. Based on this assumption, a reciprocal nearest neighbor metric is proposed for label estimation in this paper, as shown in Fig. 4. The reciprocal nearest neighbor metric consists of three components: (1) extract the features xui and xgi of the unlabeled sample ui and the labeled sample gi, respectively; (2) calculate the Mahalanobis distance dM(ui, gi) and the Jaccard distance dJ(ui, gi), as shown in Eq. (8) and Eq. (9):

dM(ui, gi) = √((ui − gi)^T Σ^(−1) (ui − gi))   (8)

dJ(ui, gi) = 1 − |R(ui, k) ∩ R(gi, k)| / |R(ui, k) ∪ R(gi, k)|   (9)

In Eq. (8), Σ is the covariance matrix of ui and gi, and the superscripts T and −1 denote matrix transpose and inverse. In Eq. (9), R(·, k) denotes the k-reciprocal nearest neighbors of ·, and |·| denotes the number of samples in a set. If ui and gi belong to the same category, R(ui, k) and R(gi, k) share many samples; the more samples they share, the more similar ui and gi are, and the closer dJ(ui, gi) is to zero. (3) The reciprocal nearest neighbor metric d*(ui, gi) is then defined from the Mahalanobis distance dM(ui, gi) and the Jaccard distance dJ(ui, gi) as follows:

d*(ui, gi) = (1 − λ) dM(ui, gi) + λ dJ(ui, gi)   (10)

where λ (0 < λ < 1) is a balance factor. With this metric, we can mine the reciprocal nearest neighbor information between unlabeled and labeled data, and both the accuracy of pseudo-label estimation and the robustness of the model are improved.
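The following sketch combines Eqs. (8)–(10). The k-reciprocal sets are assumed to have been obtained as in the earlier sketch, the inverse covariance matrix is assumed to be estimated from the gallery features, and λ = 0.3 follows the setting reported later in the experiments; none of these details is prescribed by the equations themselves.

```python
import numpy as np

def jaccard_distance(r_u: set, r_g: set) -> float:
    """d_J of Eq. (9): one minus the overlap ratio of the two k-reciprocal neighbor sets."""
    union = len(r_u | r_g)
    if union == 0:
        return 1.0
    return 1.0 - len(r_u & r_g) / union

def mahalanobis_distance(u: np.ndarray, g: np.ndarray, cov_inv: np.ndarray) -> float:
    """d_M of Eq. (8), given a precomputed inverse covariance matrix."""
    diff = u - g
    return float(np.sqrt(diff @ cov_inv @ diff))

def rnm_distance(u: np.ndarray, g: np.ndarray, r_u: set, r_g: set,
                 cov_inv: np.ndarray, lam: float = 0.3) -> float:
    """Reciprocal nearest neighbor metric d* of Eq. (10); lam = 0.3 as in the experiments."""
    return (1.0 - lam) * mahalanobis_distance(u, g, cov_inv) + lam * jaccard_distance(r_u, r_g)
```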

4 Experiments and Results
4.1 Datasets and Evaluation Indicators
In our experiments, two mainstream large-scale video pedestrian Re-ID datasets, MARS [25] and DukeMTMC-VideoReID [20], are used. MARS contains 20,478 video clips captured by 6 cameras, of which 17,503 are valid clips and the remaining 3,248 are distractor clips. MARS contains 1,261 pedestrians in total, with 625 in the training set and 636 in the test set. DukeMTMC-VideoReID contains 1,812 pedestrians and 4,832 video clips, with 702 pedestrians and 2,196 clips in the training set, 702 pedestrians and 2,636 clips in the test set, and an additional 408 distractor pedestrians; the dataset is a subset of DukeMTMC [26]. To evaluate the performance of our model, two indicators, the Cumulative Matching Characteristic (CMC) and the mean Average Precision (mAP), are used.

Fig. 5. Comparison of results for different p on DukeMTMC-VideoReID: (a) rank-1 (%) and (b) mAP (%) versus the ratio (%) of selected data, for p = 0.25, 0.20, 0.10 and 0.05.

4.2 Experimental Settings
A 2080 Ti GPU is used, and the experimental settings are the same as in [20]. The stochastic gradient descent (SGD) optimizer is adopted with a momentum of 0.5 and a weight decay of 0.0005. The learning rate is initialized to 0.1 and decays to 0.01 at the 15th epoch. The loss function is the cross-entropy loss.

4.3 Experimental Results and Comparison
Parameter Selection Experiments
We first analyze the parameter p (0 < p < 1), which controls the number of pseudo-label samples selected in each round. Figure 5 shows the results with p = 0.05, 0.10, 0.20 and 0.25 on the DukeMTMC-VideoReID dataset. As shown in Fig. 5, the smaller the selected p, the better the performance. This is because fewer pseudo-label samples are selected when p is small; although some of the selected samples may still be mislabeled, their number is too small to harm the performance of our model. However, p should not be made arbitrarily small, because the fewer samples selected in each round, the higher the training cost. To balance speed and accuracy, we set p = 0.05 and p = 0.10 in the following experiments.

In addition to the parameter p, k and λ also affect the result of the reciprocal nearest neighbor metric (RNM). Table 1 and Table 2 show the RNM results for different k values and different λ values on the DukeMTMC-VideoReID dataset, respectively. As shown in Table 1, we set p = 0.10 and λ = 0.3 and compare k = 5, 10, 15, 20; our RNM model achieves the best rank-1, rank-5 and mAP scores when k = 10, so we set k = 10 in the following experiments. As shown in Table 2, we set p = 0.10 and k = 10 and compare λ = 0.3 and 0.5; our RNM model achieves the best rank-1, rank-5, rank-20 and mAP scores when λ = 0.3, so we set λ = 0.3 in the following experiments.

Effectiveness Experiments of SAM Attention Module
In this section, the effectiveness of the SAM attention module embedded in our model is compared with EUG [20] and PL [27] on the DukeMTMC-VideoReID and MARS datasets.

Table 1. Comparison of RNM (%) for different k on DukeMTMC-VideoReID

k     rank-1   rank-5   rank-20   mAP
20    71.70    85.30    92.30     63.60
15    78.50    91.60    95.70     72.30
10    79.20    92.20    95.40     73.00
5     77.10    89.00    94.60     69.40

Table 2. Comparison of RNM (%) for different λ on DukeMTMC-VideoReID

λ      rank-1   rank-5   rank-20   mAP
0.30   79.20    92.20    95.40     73.00
0.50   75.40    88.00    93.90     67.50

As shown in Table 3, with SAM (p = 0.10), our model achieves a rank-1 of 75.6% and an mAP of 67.6% on the DukeMTMC-VideoReID dataset, which exceeds the performance of EUG (p = 0.05) and PL (p = 0.05). Compared with EUG (p = 0.10), rank-1 and mAP are higher by 4.81% and 5.84%, respectively; compared with PL (p = 0.10), they are higher by 4.6% and 5.7%, respectively. Similar results are obtained when SAM (p = 0.05) is compared with EUG (p = 0.05) and PL (p = 0.05).

As shown in Table 4, with SAM (p = 0.10), our model achieves a rank-1 of 60.7% and an mAP of 41.5% on the MARS dataset, which exceeds the performance of EUG (p = 0.10) and PL (p = 0.10) but is lower than that of EUG (p = 0.05) and PL (p = 0.05). Similarly, with SAM (p = 0.05), our model achieves a rank-1 of 63.7% and an mAP of 43.9%, which exceeds the performance of EUG (p = 0.05) and PL (p = 0.05).

Table 3. Comparison (%) of SAM with EUG and PL on DukeMTMC-VideoReID

Methods               rank-1   rank-5   rank-20   mAP
EUG [20] (p = 0.10)   70.79    83.61    89.60     61.76
EUG [20] (p = 0.05)   72.79    84.18    91.45     63.23
PL [27] (p = 0.10)    71.00    83.80    90.30     61.90
PL [27] (p = 0.05)    72.90    84.30    91.40     63.30
SAM (p = 0.10)        75.60    87.60    92.00     67.60
SAM (p = 0.05)        77.60    89.60    94.40     69.20


Table 4. Comparison (%) of SAM with other methods on MARS

Methods               rank-1   rank-5   rank-20   mAP
EUG [20] (p = 0.10)   57.62    69.64    78.08     34.68
EUG [20] (p = 0.05)   62.67    74.94    82.57     42.45
PL [27] (p = 0.10)    57.90    70.30    79.30     34.90
PL [27] (p = 0.05)    62.80    75.20    83.80     42.60
SAM (p = 0.10)        60.70    74.00    81.90     41.50
SAM (p = 0.05)        63.70    76.00    82.30     43.90

The above experiments show that, thanks to the flexible and lightweight characteristics of the SAM attention module, our model can effectively capture hidden information in video data and extract more discriminative features without increasing the number of network parameters or significantly modifying the overall structure.

Effectiveness Experiments of Reciprocal Nearest Neighbor Metric
In this section, the effectiveness of RNM (reciprocal nearest neighbor metric) used in our model is compared with EUG [20] and PL [27] on the DukeMTMC-VideoReID and MARS datasets.

As shown in Table 5, with RNM (p = 0.10), our model achieves a rank-1 of 79.2% and an mAP of 73.0% on the DukeMTMC-VideoReID dataset, which exceeds the performance of EUG (p = 0.05) and PL (p = 0.05). Compared with EUG (p = 0.10), rank-1 and mAP are higher by 8.41% and 11.24%, respectively; compared with PL (p = 0.10), they are higher by 8.2% and 11.1%, respectively. Similar results are obtained when RNM (p = 0.05) is compared with EUG (p = 0.05) and PL (p = 0.05).

As shown in Table 6, with RNM (p = 0.10), the rank-1 and mAP of our model exceed the performance of EUG (p = 0.05) and PL (p = 0.05) on the MARS dataset. Compared with EUG (p = 0.10), rank-1 and mAP are higher by 7.18% and 8.52%, respectively; compared with PL (p = 0.10), they are higher by 6.9% and 8.3%, respectively. Similar results are obtained when RNM (p = 0.05) is compared with EUG (p = 0.05) and PL (p = 0.05).

In RNM, it is assumed that two samples that are k-nearest neighbors of each other are highly likely to belong to the same category. The experiments show that, compared with the commonly used Euclidean distance, RNM connects labeled and pseudo-labeled data much more closely, which improves the accuracy of pseudo-label estimation.

Ablation Experiments
In the ablation experiments, in order to demonstrate the effectiveness of SAM and RNM, we use EUG as the baseline, and the framework used in our model is the same as that of EUG.

Table 5. Comparison (%) of RNM with other methods on DukeMTMC-VideoReID

Methods               rank-1   rank-5   rank-20   mAP
EUG [20] (p = 0.10)   70.79    83.61    89.60     61.76
EUG [20] (p = 0.05)   72.79    84.18    91.45     63.23
PL [27] (p = 0.10)    71.00    83.80    90.30     61.90
PL [27] (p = 0.05)    72.90    84.30    91.40     63.30
RNM (p = 0.10)        79.20    92.20    95.40     73.00
RNM (p = 0.05)        81.20    92.70    95.60     75.40

Table 6. Comparison (%) of RNM with other methods on MARS

Methods               rank-1   rank-5   rank-20   mAP
EUG [20] (p = 0.10)   57.62    69.64    78.08     34.68
EUG [20] (p = 0.05)   62.67    74.94    82.57     42.45
PL [27] (p = 0.10)    57.90    70.30    79.30     34.90
PL [27] (p = 0.05)    62.80    75.20    83.80     42.60
RNM (p = 0.10)        64.80    79.50    86.90     43.20
RNM (p = 0.05)        66.30    80.40    87.50     44.80

As shown in Table 7, when p = 0.10, the rank-1 of SAM, RNM and SAM + RNM is 75.6%, 79.2% and 80.5%, and their mAP is 67.6%, 73.0% and 74.4%, respectively; all of these scores exceed those of EUG. Similar results are obtained when p = 0.05. It can be seen that performance improvements are achieved on the DukeMTMC-VideoReID dataset whether SAM, RNM or SAM + RNM is used, which demonstrates the effectiveness of our method. As shown in Table 8, when p = 0.10 and p = 0.05, SAM, RNM and SAM + RNM are all better than EUG on MARS, which again demonstrates the effectiveness of our method.

Generally speaking, both SAM and RNM outperform EUG on both datasets, and when SAM and RNM are used simultaneously the model achieves better performance than when either is used alone. Comparing SAM with RNM, the latter contributes more to the network. This is because RNM acts directly on the label estimation process and significantly improves the accuracy of pseudo-label assignment, which reduces the proportion of erroneous samples during training and helps to produce a robust model. SAM, on the other hand, focuses on extracting hidden features from video data and relies on a model with strong discriminative power to play its role.


Table 7. Ablation Experiments on DukeMTMC-VideoReID

Methods                 rank-1   rank-5   rank-20   mAP
EUG [20] (p = 0.10)     70.79    83.61    89.60     61.76
SAM (p = 0.10)          75.60    87.60    92.00     67.60
RNM (p = 0.10)          79.20    92.20    95.40     73.00
SAM + RNM (p = 0.10)    80.50    92.70    96.20     74.40
EUG [20] (p = 0.05)     72.79    84.18    91.45     63.23
SAM (p = 0.05)          77.60    89.60    94.40     69.20
RNM (p = 0.05)          81.20    92.70    95.60     75.40
SAM + RNM (p = 0.05)    82.20    92.50    96.40     75.60

Table 8. Ablation Experiments on MARS

Methods                 rank-1   rank-5   rank-20   mAP
EUG [20] (p = 0.10)     57.62    69.64    78.08     34.68
SAM (p = 0.10)          60.70    74.00    81.90     41.50
RNM (p = 0.10)          64.80    79.50    86.90     43.20
SAM + RNM (p = 0.10)    66.80    80.90    87.70     45.80
EUG [20] (p = 0.05)     62.67    74.94    82.57     42.45
SAM (p = 0.05)          63.70    76.00    82.30     43.90
RNM (p = 0.05)          66.30    80.40    87.50     44.80
SAM + RNM (p = 0.05)    67.60    81.20    88.70     47.90

Comparison with Other Advanced Methods
In order to demonstrate the superiority of our method, extensive experiments are conducted in this section on the two commonly used large-scale datasets to compare it with current advanced one-shot video-based pedestrian Re-ID methods, including Stepwise [19], EUG [20], DGM + IDE [21], SCLU [22], LGF [23], PL [27], and BUC [28]. Among them, Baseline (one-shot) [20] indicates that only the initial labeled samples are used for training, without any progressive learning, and Supervised [20] indicates that all labeled samples are used to train the baseline.

From Table 9, it can be seen that the proposed SAM + RNM method achieves a significant performance improvement over the Baseline (one-shot), which uses only the single labeled samples for training, and over the unsupervised method BUC, and it also surpasses traditional semi-supervised methods such as DGM + IDE and Stepwise. Its performance also exceeds that of the one-shot methods EUG, SCLU, and PL, which demonstrates the effectiveness of the proposed method. In summary, the method proposed in this paper achieves excellent performance on the DukeMTMC-VideoReID and MARS datasets, which indicates that our model, with SAM and RNM, can effectively improve feature extraction, reduce mislabeling in pseudo-label assignment, and enhance robustness and discrimination.

Table 9. Comparison with Some Advanced Methods on Large-Scale Datasets

                            MARS                                DukeMTMC-VideoReID
Methods                     rank-1   rank-5   rank-20   mAP     rank-1   rank-5   rank-20   mAP
Baseline (one-shot) [20]    36.20    50.20    61.90     15.50   39.60    56.80    67.00     33.30
Supervised [20]             80.8     92.1     96.1      63.7    83.6     94.6     97.6      78.3
Stepwise [19]               41.20    55.60    66.80     19.70   56.30    70.40    79.20     46.80
EUG (p = 0.10) [20]         57.62    69.64    78.08     34.68   70.79    83.61    89.60     61.76
EUG (p = 0.05) [20]         62.67    74.94    82.57     42.45   72.79    84.18    91.45     63.23
DGM + IDE [21]              36.80    54.00    68.50     16.90   42.40    57.90    69.30     33.60
SCLU (p = 0.10) [22]        61.97    76.52    84.34     41.47   72.79    84.19    91.03     62.99
SCLU (p = 0.05) [22]        63.74    78.44    85.51     42.74   72.79    85.04    90.31     63.15
LGF [23]                    58.80    69.00    78.50     36.20   86.30    96.00    98.60     82.70
PL (p = 0.10) [27]          57.90    70.30    79.30     34.90   71.00    83.80    90.30     61.90
PL (p = 0.05) [27]          62.80    75.20    83.80     42.60   72.90    84.30    91.40     63.30
BUC [28]                    55.10    68.30    -         29.40   74.80    86.80    -         66.70
SAM + RNM (p = 0.10)        66.80    80.90    87.70     45.80   80.50    92.70    96.20     74.40
SAM + RNM (p = 0.05)        67.60    81.20    88.70     47.90   82.20    92.50    96.40     75.60


5 Conclusion
In this paper, we focus on the problem of one-shot video-based pedestrian Re-ID. To address the issue of insufficient labeled data under the one-shot setting, the SAM attention module is embedded in the network to improve feature extraction and discriminative power. To further improve the accuracy of label prediction, we design RNM, which assigns pseudo-labels to unlabeled samples more accurately. The excellent performance on two large-scale datasets proves the effectiveness of our method. It should be noted, however, that although the proposed method achieves competitive performance, incorrect label assignment and long training time in the later stages remain issues that require further exploration.

References 1. Liu, J., Zha, Z.J., Wu, W., et al.: Spatial-temporal correlation and topology learning for person re-identification in videos. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, pp. 4370–4379 (2021)


2. Zhang, Y.P., Wang, H.Y., Zhang, J., et al.: One-shot Video-based Person Re-identification Based on Neighborhood Center Iteration Strategy. J. Softw. 32(12), 4025–4035 (2021) 3. Dai, C.C., Wang, H.Y., Ni, T.G., et al.: Person Re-Identification Based on Deep Convolutional Generative Adversarial Network and Expanded Neighbor Reranking. Comput. Res. Dev. 56(8), 1632–1641 (2019) 4. Ni, T., Gu, X., Wang, H., et al.: Discriminative deep transfer metric learning for cross-scenario person re-identification. J. Electron. Imaging 27(4), 043026 (2018) 5. Wang, H., Ding, Z., Zhang, J., et al.: Person reidentification by semisupervised dictionary rectification learning with retraining module. J. Electron. Imaging 27(4), 043043 (2018) 6. Sun, X., Zheng, L.: Dissecting person re-identification from the viewpoint of viewpoint. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, pp. 608–617 (2019) 7. Zhang, Z., Lan, C., Zeng, W., et al.: Densely semantically aligned person re-identification. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, pp. 667–676 (2019) 8. Zheng, L., Yang, Y., Hauptmann, A.G.: Person re-identification: past, present and future[EB/OL]. (2016–10–10)[2022–06–11]. https://arxiv.org/pdf/1610.02984.pdf 9. Ye, M., Shen, J., Lin, G., et al.: Deep learning for person re-identification: a survey and outlook. In: IEEE Transactions on Pattern Analysis and Machine Intelligence: 1–1 (2021) 10. Navaneet, K.L., Todi, V., Babu, R.V., et al.: All for one: Frame-wise rank loss for improving video-based person re-identification. In: ICASSP 2019–2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Piscataway: IEEE, pp. 2472–2476 (2019) 11. Yang, L., Zhang, R.Y., Li, L., et al.: Simam: A simple, parameter-free attention module for convolutional neural networks. In:International Conference on Machine Learning. New York: ICML, pp. 11863–11874 (2021) 12. Zhong, Z., Zheng, L., Cao, D., et al.: Re-ranking person re-identification with k-reciprocal encoding. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, pp. 1318–1327 (2017) 13. Li, S., Bak, S., Carr, P., et al.: Diversity regularized spatiotemporal attention for video-based person re-identification. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway, NJ: IEEE, pp. 369–378 (2018) 14. Hou, R., Ma, B., Chang, H., et al.: Vrstc: Occlusion-free video person re-identification. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, pp. 7183–7192 (2019) 15. Li, J., Wang, J., Tian, Q., et al.: Global-local temporal representations for video person re-identification. In: Proceedings of the IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, pp. 3958–3967 (2019) 16. Liu, C.T., Wu, C.W., Wang, Y.C.F., et al.: Spatially and temporally efficient non-local attention network for video-based person re-identification[EB/OL]. (2019–08–05)[2022–06–11]. https:// arxiv.org/ pdf/1908.01683. pdf 17. Eom, C., Lee, G., Lee, J., et al.: Video-based Person Re-identification with Spatial and Temporal Memory Networks. In: Proceedings of the IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, pp. 12036–12045 (2021) 18. Zhu, X., Jing, X.Y., Yang, L., et al.: Semi-supervised cross-view projection-based dictionary learning for video-based person re-identification[J]. 
IEEE Trans. Circuits Syst. Video Technol. 28(10), 2599–2611 (2017) 19. Liu, Z., Wang, D., Lu, H.: Stepwise metric promotion for unsupervised video person reidentification. In: Proceedings of the IEEE International Conference on Computer Vision. Piscataway: IEEE, pp. 2429–2438 (2017)


20. Wu, Y., Lin, Y., Dong, X., et al.: Exploit the unknown gradually: one-shot video-based person re-identification by stepwise learning. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, pp. 5177–5186 (2018) 21. Ye, M., Li, J., Ma, A.J., et al.: Dynamic graph co-matching for unsupervised video-based person re-identification. IEEE Trans. Image Process. 28(6), 2976–2990 (2019) 22. Yin, J., Li, B., Wan, F., et al.: A new data selection strategy for one-shot video-based person re-identification. In: 2019 IEEE International Conference on Image Processing (ICIP). Piscataway: IEEE, pp. 1227–1231 (2019) 23. Zhao, C., Zhang, Z., Yan, J., et al.: Local-global feature for video-based one-shot person reidentification. In: ICASSP 2020–2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Piscataway: IEEE, pp. 3662–3666 (2020) 24. Liu, M., Qu, L., Nie, L., et al.: Iterative local-global collaboration learning towards one-shot video person re-identification. IEEE Trans. Image Process. 29, 9360–9372 (2020) 25. Zheng, L., Shen, L., Tian L, et al. Scalable person re-identification: a benchmark. In: Proceedings of the IEEE international conference on computer vision. Piscataway: IEEE, pp. 1116–1124 (2015) 26. Ristani, E., Solera, F., Zou, R., et al.: Performance measures and a data set for multi-target, multi-camera tracking. In: European conference on computer vision, pp. 17–35. Springer, Berlin (2016) 27. Wu, Y., Lin, Y., Dong, X., et al.: Progressive learning for person re-identification with one example. IEEE Trans. Image Process. 28(6), 2872–2881 (2019) 28. Lin, Y., Dong, X., Zheng, L., et al.: A bottom-up clustering approach to unsupervised person re-identification. In: Proceedings of the AAAI Conference on Artificial Intelligence. Menlo Park, CA: AAAI, 33(01), pp. 8738–8745 (2019)

Three-Dimensional Object Detection Network Based on Coordinate Attention and Overlapping Region Penalty Mechanisms Wenxin Li1 , Shiyu Zhu2 , Hongzhi Liu3 , Pinzheng Zhang3 , and Xiaoqin Zhang1(B) 1 School of Information and Electronic Engineering (Sussex Artificial Intelligence Institute),

Zhejiang Gongshang University, Hangzhou 310018, China [email protected] 2 National Key Laboratory of Transient of Physics, Nanjing University of Science and Technology, Nanjing 210094, China 3 School of Computer Science and Engineering, Southeast University, Nanjing 210096, China

Abstract. Three-dimensional target detection is a key technology in the fields of autonomous driving and robot control for applications such as self-driving cars and unmanned aircraft systems. In order to achieve high detection accuracy, this paper proposes a 3D target detection network with a coordinate attention training mechanism that generates voting feature points for better detection ability and an overlapping region penalty mechanism that reduces false detections. In comparative experiments on public large-scale 3D datasets, including the ScanNet and SUN RGB-D datasets, the proposed method obtains an average detection accuracy (mAP) of 60.1% and 58.0% at an intersection-over-union threshold of 0.25, which demonstrates its superior effectiveness over current main algorithms such as F-PointNet, VoxelNet and MV3D. The improved method is expected to achieve higher accuracy for 3D object detection relying only on point cloud information. Keywords: Three-dimensional object detection · Point cloud · Hough voting · Coordinate attention training · Overlap region · VoteNet

1 Introduction 1.1 Research Background With the development of computers and information technology in recent years, driverless vehicles, drones, virtual reality and other technologies are gradually affecting our daily lives. Google has launched Waymo, a driverless car, Baidu has launched Apollo, a driverless car driving platform, and DJI and other companies have launched drone products that enable autonomous navigation. All of the above products invariably require the use of target detection technology to achieve environment awareness. However, twodimensional target detection methods cannot cope with complex environments, and the accuracy and stability of existing three-dimensional target detection methods still cannot support the safe and stable operation of self-driving cars and drones, etc. Therefore, it is of significance to carry out research on three-dimensional target detection algorithms. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 H. Lu et al. (Eds.): ACPR 2023, LNCS 14408, pp. 79–92, 2023. https://doi.org/10.1007/978-3-031-47665-5_7


1.2 Related Research With the successful application of deep learning techniques to tasks such as object recognition and the semantic segmentation of 2D images, and the undertaking of projects in fields such as face recognition [1] and scene segmentation [2], point cloud target detection methods based on deep learning have gradually become a hot research topic. The existing deep learning-based 3D point cloud target detection algorithms can be divided into three categories: point-based algorithms, voxel-based algorithms, and image and point cloud fusion-based algorithms. Point-based algorithms involve the direct processing of the original point cloud data, and these algorithms preserve the original point cloud information to the maximum extent. Point-based algorithms have the characteristics of high accuracy but large computation, and are often used in scenes with a small scene space and more point cloud data. Shi S et al. [3] proposed the PointRCNN target detection algorithm. The first step of this algorithm, through the backbone network PointNet++ [4], is to semantically segment the original point cloud data, segment the target points and generate bounding boxes on these points, and the second step is to transfer the generated bounding boxes to a unified standard coordinate system by means of coordinate transformation based on many bounding boxes generated in the first step, and then remove the redundant bounding boxes by non-maximal value suppression. This algorithm has a shortage of large computational volume. Qi C R et al. [5] proposed the VoteNet target detection algorithm, which uses PointNet++ as the backbone network, extracts feature points as seed points, votes to derive the centroid of the target and generates candidate regions with reference to the idea of Hough voting, and finally generates the bounding box. The concept of voting in this algorithm fits the disorder of point clouds, but the accuracy is poor in the case of fewer point clouds. The idea of voxel-based algorithms is to convert the disordered point cloud data into a tightly ordered voxel block, after which the point cloud can be learned using a convolutional structure. The voxel-based method has the characteristics of less computation and higher detection speed, but is lower in detection accuracy. Zhou Y [6] et al. proposed a voxel-based 3D target detection algorithm, VoxelNet, which proposes a voxel feature encoding module, which voxelizes the original point cloud and then uses the farthest point sampling within a single voxel block for feature extraction, after which a threedimensional CNN is used to extract features for the entire voxel space. Yan Y et al. [7] proposed the SECOND target detection network in response to the fact that the presence of many empty voxel blocks in the voxelized space can cause a waste of computational resources, and this algorithm uses sparse convolution. Lang A H et al. [8] proposed a voxelization method, which only voxelizes the X and Y axes of the 3D space and the segmented space is shaped like a column, and then uses 2D convolution for sampling and learning, which has the advantages of small computation and fast detection speed, but the detection accuracy is relatively low. Shi S et al. 
[9] proposed the PV-RCNN target detection algorithm, which combines the advantages of voxel-based and pointbased methods; the network first voxelizes the 3D space and then learns the features by fusing sparse convolution with key point sampling, after which the candidate regions are extracted by the RPN network, and then the features of key points are obtained using farthest point sampling and point set abstraction.


The idea of the target detection method based on image and point cloud fusion is to fuse the advantages of image and point cloud for target detection. Chen X et al. [10] proposed the MV3D detection method fusing three feature maps: point cloud front view, point cloud bird’s eye view and RGB images, which are extracted using three network branches, respectively, and finally fused and fitted to a bounding box. Qi C R et al. [11] proposed the Frustum-PointNet target detection algorithm, which extracts candidate regions from RGB images, projects the candidate regions into the aligned 3D space to form the cone of the view point cloud candidate regions, and obtains the final target borders through 3D feature extraction and bounding box estimation. This method based on image and point cloud fusion requires aligned image and point cloud data, and the network structure is mostly complex. 1.3 Proposed Method The above target detection algorithm has achieved certain results, but still cannot meet the needs of practical applications. Therefore, this paper retains the advantages of the voting mechanism to fit the disorderly nature of point clouds on the basis of the VoteNet detection algorithm, and improves its disadvantage of poor performance in the case of sparse point cloud data. This paper introduces a voting feature point generation module to mine the feature information of points to generate voting feature points to improve the accuracy of voting, and proposes an overlapping region penalty mechanism to reduce the number of cases of false detection. The proposed network structure is described in the following sections from backbone network, voting feature point generation, target center prediction, overlapping region penalty mechanism and loss function. In the experimental part, the dataset and parameter setting are explained, and the ablation experiment is carried out to compare and analyze the results.

2 Network Structure The network structure proposed in this paper is shown in Fig. 1. It can be divided into three parts: the first part is the backbone network module, the second part is the voting feature point generation module, and the third part is the target center prediction module.

Fig. 1. Overall network structure.

The network takes N × 3 raw point cloud data as input, where N represents the number of points as N and 3 indicates the dimensions of the data. The three-dimensional


coordinates (X, Y, Z) of the point cloud are inputted into the backbone network to sample and learn the features of the points, extract the features of C channels from them, and output the M × (3 + C) feature set of M points. The feature set of these points generates new voting feature points with dimensions M × (3 + C) through the coordinate attention [12] mechanism. The voting feature points are voted by the Hough Voting module of the deep neural network, which independently generates ballots from each seed through the multilayer perception mechanism (MLP), expressing the centers of the objects in terms of coordinates. These voted centroids are then subjected to the operation of sampling group aggregation to generate K clusters of points representing the center of the target object, and the Cout is the channel numbers of the central point cluster. Then, the point cloud data in each of the previous groups are transformed by coordinates separately, reduced to the coordinate axis with its center as the origin, and then the proposed cluster of the object is derived by a PointNet-like operation. Finally, the low scoring and duplicate objects are filtered by means of Non-Maximum Suppression (NMS) in 3D to obtain the final K’ 3D bounding box of the target detection results. 2.1 Backbone Network Module The backbone network module uses a PointNet++ network structure to learn the features of points and outputs a set of M feature points. The backbone network contains four point set abstraction (SA) layers and two feature up-sampling (FP) layers. The backbone network module consists of four point SA layers as the encoding part, and the SA layers transform the point cloud features using MLP and randomly sample the original point cloud with n points sampled at the farthest point. Each SA layer is composed of (n, r, [c1 , ..., ck ]) set parameters, where n is the number of sampling points, r is the radius of the sampling area, and [c1 , ..., ck ] is the number of MLP channels. The decoding part of the backbone network consists of two FP layers. The FP layer upsamples the point cloud by inserting the features of the input points into the output points, and the features of each output point are the weighted average of the features of its three nearest input points, after which the features are connected by MLP. 2.2 Voting Feature Point Generation Module After the features of the points are generated by the backbone network, the voting feature points are then generated by the voting feature point generation module. This module mainly consists of a coordinate attention mechanism, where the input is the features of the points generated by the backbone network and the output is the voting feature points for the subsequent voting operation, where the attention mechanism is added to help the model to assign a different weight to each of the channels and further aid in feature extraction by assigning different weights. This module can improve the shortcomings of the original VoteNet in the case of low point cloud data and can improve the performance of the network. The structure of the coordinate attention mechanism is shown in Fig. 2. The first layer of this attention mechanism consists of two average pooling modules to encode each channel along the horizontal coordinate and the vertical coordinate, respectively, so that the location information and global features of the points could be captured. And then


Fig. 2. Coordinate attention mechanism network structure.

the location information and global features are concatenated and passed through a convolutional layer, followed by normalization and a nonlinear layer, where the nonlinear layer is the h-swish activation function. The features are then split and passed through a convolutional layer and a sigmoid activation function, respectively, and the input feature is finally reweighted by multiplying it by the two resulting attention weights to output the voting feature points, which are shown as red dots in Fig. 3.

Fig. 3. Voting feature points.
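A minimal PyTorch sketch of the coordinate attention block described above is given below. The 4-D input shape, the channel-reduction ratio and PyTorch itself are assumptions, since the paper describes the block only at the level of Fig. 2; in the detection network it would operate on the feature tensor produced by the backbone.

```python
import torch
import torch.nn as nn

class CoordinateAttention(nn.Module):
    """Sketch of the coordinate attention block of Fig. 2 for a (B, C, H, W) input."""

    def __init__(self, channels: int, reduction: int = 32):
        super().__init__()
        mid = max(8, channels // reduction)             # reduced channel width (assumption)
        self.pool_h = nn.AdaptiveAvgPool2d((None, 1))   # average pooling along the width
        self.pool_w = nn.AdaptiveAvgPool2d((1, None))   # average pooling along the height
        self.conv1 = nn.Conv2d(channels, mid, kernel_size=1)
        self.bn1 = nn.BatchNorm2d(mid)
        self.act = nn.Hardswish()                       # the h-swish nonlinearity
        self.conv_h = nn.Conv2d(mid, channels, kernel_size=1)
        self.conv_w = nn.Conv2d(mid, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        x_h = self.pool_h(x)                            # (b, c, h, 1): positional info along height
        x_w = self.pool_w(x).permute(0, 1, 3, 2)        # (b, c, w, 1): positional info along width
        y = self.act(self.bn1(self.conv1(torch.cat([x_h, x_w], dim=2))))  # stitch, conv, norm, h-swish
        y_h, y_w = torch.split(y, [h, w], dim=2)        # split back into the two directions
        a_h = torch.sigmoid(self.conv_h(y_h))                             # (b, c, h, 1) weight
        a_w = torch.sigmoid(self.conv_w(y_w.permute(0, 1, 3, 2)))         # (b, c, 1, w) weight
        return x * a_h * a_w                            # reweight the input features
```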

2.3 Target Center Prediction Module
In the target center prediction module, voting is performed based on the voting feature points, and the voted target center points are then sampled, grouped, and clustered to obtain a set of points forming the target center point cluster. The clustering is based on spatial similarity with uniform sampling and grouping: K subsets of the voting set are selected using the farthest point sampling method, and the neighboring points are then aggregated by the K-nearest-neighbor algorithm to form the target center point


cluster. As shown in Fig. 4, the green points are the points obtained by voting, and the red points are the target centroid clusters obtained after sampling, grouping and aggregation. Then, we generate the candidate regions and generate the bounding boxes through the optimization classification, and finally, we obtain the final target 3D bounding boxes through the 3D NMS.

Fig. 4. Target center cluster obtained by voting.
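A minimal NumPy sketch of the farthest point sampling step used to pick the K cluster seeds is shown below; the choice of the first seed and the use of squared distances are implementation assumptions.

```python
import numpy as np

def farthest_point_sampling(points: np.ndarray, k: int) -> np.ndarray:
    """Select K well-spread seed indices from the voted centers `points` of shape (N, 3)
    by iteratively picking the point farthest from the already chosen set; a sketch."""
    n = points.shape[0]
    chosen = np.zeros(k, dtype=np.int64)
    min_dist = np.full(n, np.inf)                   # distance to the nearest selected seed
    chosen[0] = 0                                   # start from an arbitrary point (assumption)
    for i in range(1, k):
        diff = points - points[chosen[i - 1]]
        min_dist = np.minimum(min_dist, np.einsum("ij,ij->i", diff, diff))
        chosen[i] = int(np.argmax(min_dist))        # point farthest from the current seed set
    return chosen                                   # indices of the K sampled centers
```

The selected seeds would then be grouped with their nearest voted neighbors to form the target center clusters shown in Fig. 4.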

2.4 Overlapping Region Penalty Mechanism
In the predictions generated for a sample scene, the predicted bounding boxes of different objects may overlap. Some overlap between objects can be considered reasonable because of the characteristics of their appearance; however, overlap between objects of the same class is largely absent both in the dataset and in reality. To exploit this property and address the common phenomenon of overlapping predicted boxes, this paper proposes a penalty mechanism for overlapping regions. The volume of the overlapping region is computed from the estimated corner coordinates of the predicted boxes of same-class objects. As shown in Fig. 5a, for objects that receive the same classification prediction in the same scene, the two vertexes on the body diagonal of each predicted box and the geometric center of the box are sampled. The two objects P1 and P2, drawn in red and blue in Fig. 5a, are the predicted boxes of two samples of the same class; Px_Max and Px_Min are the two points on the body diagonal with the largest and smallest 3D coordinate values, and Px_Cen is the center of the box. The two body-diagonal vertexes contain both the position and the dimensions of the whole rectangular predicted box. From the relative positions of the geometric centers P1_Cen and P2_Cen in Fig. 5b and the 3D sizes of the boxes, it can be judged whether two predicted bounding boxes overlap. If they overlap, as in Fig. 5c and Fig. 5d, the two body-diagonal vertexes of the overlapping region (drawn in green) can be selected from the two original predicted boxes, where Op_Max and Op_Min are the vertexes with the largest and smallest 3D coordinate values and serve as the reference points for calculating the overlapping region of the two original predicted bounding boxes.


Fig. 5. Illustration of penalty mechanism for overlapping regions.

Based on the body diagonal vertexes Op_Max(x1, y1, z1) and Op_Min(x2, y2, z2) of the overlapping region box in Fig. 5e, the area of the overlapping cross section in the top view of Fig. 5f can be calculated as A_cs, which corresponds to the green area in Fig. 5g. Because the bounding boxes in the dataset have only a yaw angle and no pitch or roll data, the predicted overlapping region is an upright quadrilateral or cuboid. The volumes of the overlapping regions of same-class objects in the scene in Fig. 5h are accumulated into the overlapping region penalty loss of formula (1):

$$L_{overlap} = \sum \left| A_{cs} \times (z_1 - z_2) \right| \tag{1}$$
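A minimal sketch of this overlap-volume computation for same-class predictions is given below; representing each box by its two body-diagonal corners follows the description above, while treating the boxes as axis-aligned (ignoring yaw) is a simplification of the sketch.

```python
import torch

def overlap_volume(box_a: torch.Tensor, box_b: torch.Tensor) -> torch.Tensor:
    """Volume of the overlapping region of two boxes given as corner pairs.

    Each box is a (2, 3) tensor holding its minimum and maximum body-diagonal
    vertexes (Px_Min, Px_Max). Boxes are treated as axis-aligned here.
    """
    op_min = torch.maximum(box_a[0], box_b[0])   # lower corner of the overlap box (Op_Min)
    op_max = torch.minimum(box_a[1], box_b[1])   # upper corner of the overlap box (Op_Max)
    edges = (op_max - op_min).clamp(min=0.0)     # zero extent if the boxes do not overlap
    a_cs = edges[0] * edges[1]                   # cross-section area in the top view
    return a_cs * edges[2]                       # A_cs * |z1 - z2|

def overlap_penalty(boxes: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    """Accumulate overlap volumes over all same-class box pairs, as in formula (1)."""
    loss = boxes.new_zeros(())
    for i in range(len(boxes)):
        for j in range(i + 1, len(boxes)):
            if labels[i] == labels[j]:
                loss = loss + overlap_volume(boxes[i], boxes[j])
    return loss
```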

2.5 Loss Function

The loss function of the network is a weighted sum of five components: voting loss, target loss, 3D bounding-box loss, semantic classification loss, and overlapping region loss.

$$L = L_{vote} + \lambda_1 L_{obj\text{-}cls} + \lambda_2 L_{box} + \lambda_3 L_{sem\text{-}cls} + \lambda_4 L_{overlap} \tag{2}$$

The voting loss L_vote measures the error between the voted feature points and the ground truth by computing the distance between the voting feature points and the true target centers. The target loss L_obj-cls measures the error between the target centers derived from voting and the ground truth; it is a cross-entropy loss, and the parameter λ1 is set to 0.5.

$$L_{box} = L_{center\text{-}reg} + 0.1\,L_{angle\text{-}cls} + L_{angle\text{-}reg} + 0.1\,L_{size\text{-}cls} + L_{size\text{-}reg} \tag{3}$$

The bounding-box loss L_box is computed as in formula (3). L_center-reg is the residual between the predicted and true box center points, L_angle-cls and L_angle-reg are the angular errors of the box, and L_size-cls and L_size-reg accumulate the position errors of the eight corners of the box. The residual of the box center is obtained by computing the Euclidean distance between the predicted and true values, the position error of the box is obtained by computing the distance between each corner of the predicted box and the corresponding corner of the true box, and the parameter λ2 is set to 1. The semantic classification loss L_sem-cls is the cross-entropy between the predicted classification and the ground truth, with λ3 set to 0.1. The overlapping region loss L_overlap reduces the probability of box offset by measuring the size of the overlapping region between the predicted box and the true box, with λ4 set to 0.1.
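A minimal sketch of how the weighted sums of formulas (2) and (3) could be assembled is shown below, assuming the individual loss terms have already been computed elsewhere; the dictionary keys are illustrative.

```python
import torch

def box_loss(parts: dict) -> torch.Tensor:
    """Bounding-box loss of formula (3) with the fixed 0.1 weights on the classification terms."""
    return (parts["center_reg"]
            + 0.1 * parts["angle_cls"] + parts["angle_reg"]
            + 0.1 * parts["size_cls"] + parts["size_reg"])

def total_loss(losses: dict, lambdas=(0.5, 1.0, 0.1, 0.1)) -> torch.Tensor:
    """Weighted sum of the five loss terms of formula (2), with the lambda values quoted in the text."""
    l1, l2, l3, l4 = lambdas
    return (losses["vote"]
            + l1 * losses["obj_cls"]
            + l2 * losses["box"]
            + l3 * losses["sem_cls"]
            + l4 * losses["overlap"])
```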

3 Experiment

This section introduces the experimental datasets, training parameter settings, evaluation metrics, experimental data, ablation experiments, and analysis of the visualization results. The point cloud object detection method proposed in this paper is compared with other methods, the impact of the voting feature point generation module and the overlapping region penalty mechanism on the algorithm is demonstrated through ablation experiments, and the performance of the proposed algorithm is analyzed concretely through the visualization results.

3.1 Dataset

We used the publicly available ScanNet [13] and SUN RGB-D [14] datasets for the experiments.

ScanNet. A point cloud dataset of indoor scenes reconstructed from RGB-D video, containing 1513 frames of point cloud data. The indoor scenes of ScanNet are spatially large, and the number of targets per scene is large. In the experiments, the dataset is processed in the same way as in VoteNet, with 1201 frames used as the training set and 312 frames as the test set.

SUN RGB-D. A single-view 3D scene dataset containing more than 10000 RGB-D depth images; the depth maps are converted into point cloud data before training. In the experiments, the training set contains 5282 samples and the test set contains 5050 samples.

3.2 Training Parameter Setting

The hardware and software configuration was: Ubuntu 16.04 operating system, a Tesla V100 GPU, CUDA 10.2 as the GPU acceleration tool, and PyTorch 1.7 as the deep learning framework. The training parameters were set as follows: the optimizer was Adam, the batch size was 4, and the number of training epochs was 200. The initial learning rate was 0.001 and was decayed to 0.1 times its current value after 80, 120, and 160 training epochs; the regularization decay ratio was 0.5, decayed by 0.5 every 20 training epochs.
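A minimal PyTorch sketch of this optimizer and learning-rate schedule is shown below; the placeholder model and the omission of the separate regularization decay are simplifications of the sketch.

```python
import torch

model = torch.nn.Linear(3, 3)          # placeholder for the detection network
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
# learning rate multiplied by 0.1 after epochs 80, 120, and 160
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[80, 120, 160], gamma=0.1)

for epoch in range(200):
    # ... one pass over the training set with batch size 4 would go here ...
    scheduler.step()
```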


3.3 Evaluation Metrics

Compared with semantic classification and recognition, evaluating the accuracy of 3D bounding-box prediction is more challenging because it requires calculating the overlap between the predicted result and the ground-truth label, which is affected by differing poses and sizes. To evaluate detection accuracy precisely, the evaluation metrics used in this paper are the average precision (AP) of the boxes of each target class and the mean average precision (mAP) over all classes. First, the intersection over union (IoU) between the predicted box and the ground-truth box is calculated, and the accuracy of the predicted box is judged according to the IoU value. To reflect the performance of the proposed algorithm accurately, the evaluation settings are kept consistent with those of VoteNet, and results are reported at IoU thresholds of 0.25 and 0.5.

3.4 Experimental Data

Table 1 shows the detection accuracy of the proposed algorithm on the publicly available ScanNet dataset and compares it with other representative, well-performing algorithms. Both DSS (deep sliding shapes) [15] and 3D-SIS [16] fuse features of RGB images and point clouds: DSS fuses RGB image features with point cloud features at the output stage, while 3D-SIS uses a CNN to extract image features and maps them to 3D voxels. F-PointNet [11] is a two-stage detection network that uses a 2D object detector to obtain a 2D bounding box on the image, projects it into the 3D point cloud, and finally obtains a 3D bounding box with a 3D detection network. MRCNN2D-3D [15, 16] uses Mask R-CNN [17] as an image instance segmentation network, projects the instance segmentation results into 3D space, and fuses them with the 3D instance segmentation results to obtain the object boxes. GSPN [18] uses a parsing-synthesis method to generate candidate regions and outperforms traditional candidate-region generation methods.

Table 1. Performance of different methods on the ScanNet dataset.

Method        mAP@0.25  mAP@0.5
DSS           15.2      6.8
MRCNN2D-3D    17.3      10.5
F-PointNet    19.8      10.8
GSPN          30.6      17.7
3D-SIS        40.2      22.5
VoteNet       58.6      33.5
Ours          60.1      36.4

Table 2a. Performance on various types of targets in the ScanNet dataset, mAP@0.25.

         cab   bed   chair  sofa  tabl  door  wind  bkshf  curt  pic
VoteNet  36.3  87.9  88.7   89.6  58.8  47.3  38.1  44.6   47.2  7.8
Ours     38.3  89.5  87.5   90.5  58.8  48.1  42.4  49.6   49.1  6.7

Table 2b. Performance on various types of targets in the ScanNet dataset, mAP@0.25 (continued).

         cntr  desk  fridg  showr  toil  sink  bath  ofurn  mAP
VoteNet  56.1  71.7  45.4   57.1   94.9  54.7  92.1  37.2   58.7
Ours     57.9  72.3  48.7   62.1   96.4  49.5  91.2  43.3   60.1

Table 3a. Performance on various types of targets in the ScanNet dataset, mAP@0.5.

         cab   bed   chair  sofa  tabl  door  wind  bkshf  curt  pic
VoteNet  8.1   76.1  67.2   68.8  42.4  15.3  6.4   28.0   11.6  1.2
Ours     9.6   80.6  77.5   72.2  46.4  16.7  12.3  32.8   12.0  0.8

Table 3b. Performance on various types of targets in the ScanNet dataset, mAP@0.5 (continued).

         cntr  desk  fridg  showr  toil  sink  bath  ofurn  mAP
VoteNet  9.5   37.5  27.8   10.0   86.5  16.8  78.9  11.7   33.5
Ours     18.2  36.8  30.5   16.2   73.8  25.3  82.7  18.6   36.4

The average accuracies on the ScanNet dataset are shown in Table 1. The first column lists the compared algorithms, the second column gives the mAP at an IoU threshold of 0.25, and the third column gives the mAP at an IoU threshold of 0.5. The data show that the proposed method performs best at both thresholds. Compared with VoteNet, the mAP is improved by 1.5% at an IoU threshold of 0.25 and by 2.9% at 0.5, indicating that the proposed algorithm is more accurate than VoteNet on average. Besides the average accuracy over all classes, the paper also examines the per-category performance on the ScanNet dataset. Tables 2a and 2b report the accuracies at an IoU threshold of 0.25, and Tables 3a and 3b report the accuracies at a threshold of 0.5. The proposed method maintains or improves the already high accuracy of VoteNet for categories such as bed, chair, sofa, and bathtub; this is attributed to the coordinate attention and overlapping region penalty mechanisms, which mainly enhance these categories. However, categories with low accuracy, such as cabinet and picture, still perform poorly because of their distant location in the scene, sparse point cloud data, and indistinct features. From the four tables, it is easy to see that the detection accuracy of the proposed method is higher than that of VoteNet for most of the targets.

Table 4. Accuracy performance on the SUN RGB-D dataset, mAP@0.25.

         bathtub  bed   bkshf  chair  desk  dresser  nstnd  sofa  table  toilet  mAP
VoteNet  74.4     83.0  28.8   75.3   22.0  29.8     62.2   64.0  47.3   90.1    57.7
Ours     75.5     83.2  29.3   74.8   22.7  27.4     62.0   66.5  48.0   90.5    58.0

Table 4 shows the detection accuracy of the proposed algorithm on the SUN RGB-D dataset. The proposed algorithm achieves 0.3% higher average accuracy than VoteNet, and its detection accuracy is higher for target objects including bathtubs, beds, bookcases, desks, sofas, tables, and toilets, with the detection accuracy for sofas improved by 2.5%. However, the detection accuracy for chairs, cupboards, and bedside tables is slightly lower than that of VoteNet. Across the two datasets, the recognition accuracy of some object categories decreases or remains low. The reason may be that the network improvement proposed in this paper takes place before the 3D Hough voting: in the stage of generating voting feature points, if part of the target's point cloud is missing due to lighting, occlusion, or other factors, the generated voting feature points will shift, eventually leading to inaccurate voting and prediction results.

3.5 Ablation Experiment

To test the effects of the proposed voting feature point generation module and the overlapping region penalty mechanism on the network, ablation experiments are performed on the ScanNet dataset.

Table 5. Ablation experiment data.

Voting Feature Point     Overlapping Region     mAP@0.25  mAP@0.5
Generation Module        Penalty Mechanism
×                        ×                      58.6      33.5
√                        ×                      59.7      35.4
×                        √                      58.6      36.0
√                        √                      60.1      36.4

The ablation experimental data are shown in Table 5. The voting feature point generation module improves the model performance at an IoU threshold of 0.25, and the overlapping region penalty mechanism improves the model performance at an IoU threshold of 0.5. The proposed method uses both of them, and the network performance is further improved.

3.6 Visualization Results Analysis

Fig. 6. Comparison of visualization results.

To show more intuitively the performance of the proposed network model for semantic classification, center prediction, and bounding-box prediction of objects in scenes, the same scenes are visualized in terms of the dataset labels, the VoteNet prediction results, and the prediction results of the proposed model. As shown in Fig. 6, the first row shows the ground-truth bounding boxes of the dataset, the second row shows the detection results of VoteNet, and the third row shows the detection results of the proposed algorithm; the three columns correspond to three different scenes. In the first scene, VoteNet produces false detections in both the upper and lower right of the scene; in the second scene, VoteNet fails to detect the target on the left side; and in the third scene, both the proposed algorithm and VoteNet produce false detections in the more complicated lower-right region. Overall, the proposed network model performs relatively well in classifying and predicting the size of sofas, tables, chairs, garbage bins, doors, and other objects, and the overlap between the predicted boxes of similar adjacent objects is reduced, which shows that the voting feature point generation module and the overlapping region penalty mechanism play a certain role, providing an improvement in performance compared with VoteNet. However, for tightly arranged targets and for small targets, false and missed detections still occur, and there is room for improvement.

4 Conclusion

This paper proposes a voting feature point generation module that mines the feature information of points to generate voting feature points and improve voting accuracy, and an overlapping region penalty mechanism that reduces the number of false detections. In comparison and ablation experiments, the proposed method showed good performance on both the ScanNet and SUN RGB-D datasets, and the experimental data show that it achieves better performance than VoteNet. However, in some complex cases there are still false and missed detections, leaving room for improvement. The proposed method builds on the well-known VoteNet and improves its accuracy. In future work, these improvements could be applied to more advanced algorithms and higher-performance backbone networks, and comparison experiments with additional methods could be added to provide further reference for improving the method.

Acknowledgments. The research was supported by the Zhejiang Provincial Natural Science Foundation of China (Grant No. LQ21A040007) and the Scientific Research Fund of Zhejiang Provincial Education Department (Grant No. Y201941856).

References

1. Ss, A., Svavp, B.: Techniques and challenges of face recognition: a critical review. Procedia Comput. Sci. 143, 536–543 (2018)
2. Yu, H., Yang, Z., Tan, L., et al.: Methods and datasets on semantic segmentation: a review. Neurocomputing 304, 82–103 (2018)
3. Shi, S., Wang, X., Li, H.: PointRCNN: 3D object proposal generation and detection from point cloud. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 770–779 (2019)
4. Qi, C.R., Yi, L., Su, H., et al.: PointNet++: deep hierarchical feature learning on point sets in a metric space. In: Advances in Neural Information Processing Systems, 30 (2017)
5. Qi, C.R., Litany, O., He, K., et al.: Deep Hough voting for 3D object detection in point clouds. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 9277–9286 (2019)
6. Zhou, Y., Tuzel, O.: VoxelNet: end-to-end learning for point cloud based 3D object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4490–4499 (2018)
7. Yan, Y., Mao, Y., Li, B.: SECOND: sparsely embedded convolutional detection. Sensors 18(10), 3337 (2018)
8. Lang, A.H., Vora, S., Caesar, H., et al.: PointPillars: fast encoders for object detection from point clouds. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 12697–12705 (2019)
9. Shi, S., Guo, C., Jiang, L., et al.: PV-RCNN: point-voxel feature set abstraction for 3D object detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 10529–10538 (2020)
10. Chen, X., Ma, H., Wan, J., et al.: Multi-view 3D object detection network for autonomous driving. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1907–1915 (2017)
11. Qi, C.R., Liu, W., Wu, C., et al.: Frustum PointNets for 3D object detection from RGB-D data. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 918–927 (2018)
12. Hou, Q., Zhou, D., Feng, J.: Coordinate attention for efficient mobile network design. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13713–13722 (2021)
13. Dai, A., Chang, A.X., Savva, M., et al.: ScanNet: richly-annotated 3D reconstructions of indoor scenes. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 5828–5839 (2017)
14. Song, S., Lichtenberg, S.P., Xiao, J.: SUN RGB-D: a RGB-D scene understanding benchmark suite. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 567–576 (2015)
15. Song, S., Xiao, J.: Deep sliding shapes for amodal 3D object detection in RGB-D images. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 808–816 (2016)
16. Hou, J., Dai, A., Nießner, M.: 3D-SIS: 3D semantic instance segmentation of RGB-D scans. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 4421–4430 (2019)
17. He, K., Gkioxari, G., Dollár, P., et al.: Mask R-CNN. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 2961–2969 (2017)
18. Yi, L., Zhao, W., Wang, H., et al.: GSPN: generative shape proposal network for 3D instance segmentation in point cloud. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3947–3956 (2019)

Discriminative Region Enhancing and Suppression Network for Fine-Grained Visual Categorization

Guanhua Wu, Cheng Pang(B), Rushi Lan, Yilin Zhang, and Pingping Zhou

Guilin University of Electronic Technology, Guilin 541004, China
[email protected]

Abstract. Attention mechanisms are intensively devoted to local feature abstraction for fine-grained visual categorization. A limitation of attention-based methods is that they focus on salient region mining and feature extraction while ignoring discriminative and complementary features from other parts of the image. To address this issue, we introduce a novel network known as the Discriminative Region Enhancing and Suppression Network (DRESNet). This network efficiently extracts a wide range of diverse and complementary features, thereby enhancing the final representation. Specifically, a plug-and-play salient region diffusion (SRD) module is proposed to explicitly enhance the salient features extracted by any backbone network. The SRD module can adaptively adjust the weights of regions and redirect attention to other non-discriminative regions to generate different complementary features. The proposed discriminative region enhancing and suppression network is free from bounding boxes or part annotations and can be trained end-to-end. Our proposed method demonstrates competitive performance on three fine-grained classification benchmark datasets, as supported by extensive experimental results. Additionally, it is compatible with widely used frameworks currently in use.

Keywords: Fine-grained Visual Categorization · Attention mechanism · Region enhancing and suppression

1 Introduction

Fine-grained visual categorization (FGVC) has garnered growing research interest [2–4] in recent years, driven by its promising applications in diverse real-world scenarios such as intelligent retail [24], intelligent transportation [11], and conservation [1]. FGVC aims to recognize images belonging to multiple subcategories within the same category, e.g., different species of birds [23], models of cars [13], and aircraft [18]. Objects of different subcategories share a similar physical structure (i.e., all kinds of birds have a head, wings, and a tail) and can only be distinguished by subtle local regions, as shown in Fig. 1.


Fig. 1. Samples belonging to the same sub-category share a similar physical structure (i.e., all kinds of birds have a head, wings, and a tail) and appearance, making it infeasible to visually distinguish between them.

Due to the low inter-class variance, fine-grained image classification is further complicated. To extract discriminative features for classification, a key step is to focus on the feature representations of different parts of the object. Some previous works [14,25,29,31] rely mainly on predefined bounding boxes and part annotations to locate the distinguishable regions and then extract part-specific features for fine-grained classification. However, these hand-crafted annotations are not optimal for FGVC, and their collection can be very costly. In recent years, weakly supervised learning using image-level labels has become the mainstream approach for FGVC. Some recent methods [7,22,26,33] attempt to locate the distinguishable regions and learn effective feature representations using attention mechanisms, channel clustering, and other techniques, without requiring bounding box or part annotations. Although these methods are effective, they still have potential limitations: 1) attention-based networks tend to focus on globally salient features while ignoring other local discriminative features that potentially carry complementary information for the salient ones; 2) part-based sampling methods are easily affected by the number and size of the sampled parts, and these pre-defined parameters greatly limit the effectiveness and flexibility of the model. Therefore, how to effectively extract diverse salient features and how to reasonably integrate these features into the final representation are worth investigating for the fine-grained classification task.


In this paper, we propose a discriminative region enhancing and suppression network (DRESNet) to address the above limitations by generating a set of integral fine-grained features, including the globally salient features and their complementary features. More specifically, DRESNet does not focus on capturing accurate distinguishable parts; instead, it adaptively enhances and suppresses the salient parts in the feature map and then forces the following network to mine other discriminative regions containing potentially complementary features. Our observations show that the most salient region of an image tends to be noticed by attention models first, after which the feature abstraction of this region is enhanced while that of other regions is suppressed. As a result, visual cues from the suppressed features may be absent in the learned features. However, a set of integral features consisting of both globally salient features and other local features is crucial for FGVC tasks [6,9]. The proposed DRESNet consists of a feature extraction backbone network and a salient region diffusion (SRD) module. By inserting the salient region diffusion module into various stages of the backbone network, it efficiently extracts diverse potential features. The SRD module enhances and suppresses features to obtain part-specific representations. Note that, in the suppression operation, the SRD module does not require additional complex hyperparameters to suppress salient region information. Instead, it effectively expands the discriminative region in a learnable way, suppressing excessive feature expression of salient regions while encouraging feature expression from adjacent non-salient regions. We demonstrate that the feature learning of the backbone networks can be substantially improved in this way. Our contributions are summarized as follows: (1) We propose a salient region diffusion (SRD) module, which enhances the prominent features of the network; additionally, through an adaptive learning-based suppression process, SRD can compel the network to learn more complementary features. (2) Our proposed discriminative region enhancing and suppression network (DRESNet) achieves competitive results on three benchmark fine-grained classification datasets.

2 Related Work

Fine-grained visual categorization (FGVC) aims to identify visually similar subcategories within the same basic category, e.g., different species of birds [23], models of cars [13] and aircraft [18]. Currently, research on weakly supervised fine-grained visual categorization methods mainly focuses on three aspects: fine-grained feature learning, discriminative region localization, and visual attention mechanisms.

2.1 Fine-Grained Feature Learning

In order to describe more accurately the subtle differences between fine-grained categories, Lin et al. [15] proposed a bilinear model consisting of two feature extractors. This model adopts a translation-invariant approach to extract features from different parts of the image and fuse them, thereby enhancing the ability to learn fine-grained features. However, the bilinear features generated by the outer product have extremely high dimensionality, which increases the computational complexity. To address this issue, Gao et al. [8] approximated the second-order statistics of the original bilinear pooling operation by applying Tensor Sketch to reduce the dimensionality of the bilinear features. Kong et al. [12] used a low-rank approximation of the covariance matrix and further learned low-rank bilinear classifiers, which significantly reduced the computation time and the effective number of parameters. In addition, Yu et al. [27] proposed a cross-layer bilinear pooling method to integrate multiple cross-layer bilinear features and capture part relationships across layers.

2.2 Discriminative Region Localization

Localization-based methods capture discriminative semantic parts of fine-grained objects and then construct intermediate representations corresponding to these parts for final classification. Fu et al. [7] proposed a recurrent attention convolutional neural network, which recursively learns discriminative regions and region-based feature representations at multiple scales in a mutually reinforced way. Yang et al. [26] utilized a self-supervised attention mechanism to effectively locate regions in images that contain more semantic features. In [34], a part proposal network generates multiple local attention maps, and a part rectification network learns rich part-specific features. Zhang et al. [28] proposed a multi-scale learning network that predicts the positions of objects and local regions through an attention object location module and an attention part proposal module.

2.3 Visual Attention Mechanisms

In fine-grained image classification tasks, introducing visual attention mechanisms helps to capture subtle inter-class differences in the image. Ding et al. [6] proposed a selective sparse sampling learning method that learns a set of sparse attention for obtaining discriminative and complementary regions. Similar to [6], TASN [33] regards the regions with high responses in the attention map as informative parts, and proposes a trilinear attention sampling network to learn fine-grained feature representations. Through attention-based samplers, TASN can re-sample the focused regions in the image to emphasize fine-grained details. Sun et al. [22] proposed a one-squeeze multi-excitation module to extract features from multiple attention regions and proposed a multi-attention multiclass constraint to enhance the correlation of attention features. Zhang et al. [30] introduced a progressively enhancing strategy by highlighting important regions through class activation maps.

3 Method

In visual attention models, we observe that the learning of features in salient regions may hinder further feature abstraction in non-salient regions.


Fig. 2. Overview of the discriminative region enhancing and suppression network (DRESNet). Any backbone network can be coupled with the proposed salient region diffusion (SRD) module to further mine complementary features for FGVC. The Conv Block consists of two convolutional layers followed by a max-pooling layer. ⊕ represents the feature boosting operation and ⊖ represents the feature suppressing operation.

As a result, the complementary information and discriminative cues that are beneficial to FGVC are eliminated. To tackle this issue, we present the discriminative region enhancing and suppression network (DRESNet), which mines diverse, potentially useful features that are masked by other salient features stage by stage, and each stage integrates a different feature embedding into the final discriminative fine-grained representation. In addition, it can be easily implemented on various convolutional neural networks, such as VGG [21] and ResNet [10]. The DRESNet network structure is illustrated in Fig. 2.

3.1 Salient Region Diffusion Module

Attention-based methods tend to highlight the most distinct feature regions. However, such a mechanism hurts the further exploration of the rich information from other regions of an image. Thus, we propose a simple yet effective salient region diffusion (SRD) module to encourage the diffusion of attention from salient regions to more other regions, obtaining discriminative and complementary features. As illustrated in Fig. 3, suppose X ∈ R^{C×W×H} is the feature map of an image extracted by the backbone network, where C, W, and H denote the number of channels, the width, and the height of the feature map, respectively. In the enhancing step, the SRD module splits X evenly into k vertical groups along the width dimension, so that the feature map is composed of k local features, where each group is represented as X_i ∈ R^{C×(W/k)×H}, i = 1, 2, ..., k.


Fig. 3. The proposed Salient Region Diffusion (SRD) module diffuses the visual attention from the salient region to other regions of an image, obtaining discriminative and complementary features. The SRD module explicitly divides the feature map into multiple groups along the width dimension, facilitating the extraction of the corresponding enhancement and suppression factors.

The importance of each group is then calculated using a convolutional layer (Conv) followed by a Rectified Linear Unit (ReLU):

$$e_i = \mathrm{ReLU}(\mathrm{Conv}(X_i)) \tag{1}$$

where e_i ∈ R^{1×(W/k)×H} denotes the spatial attention score of the i-th group of features in X. Conv is a convolutional layer with a 1 × 1 kernel and a single output channel, which allows us to study the importance of the corresponding area, and ReLU removes negative activation values. Afterwards, we apply global average pooling (GAP) to e_i, followed by a softmax function that maps the resulting values to the range [0, 1] for normalization:

$$b_i = \frac{\exp(\mathrm{GAP}(e_i))}{\sum_{j=1}^{k}\exp(\mathrm{GAP}(e_j))} \tag{2}$$

B = [b_1, b_2, ..., b_k] ∈ R^k represents the importance of the regions. For the input feature map X, we obtain the enhanced feature F_e by enhancing its most salient part:

$$F_e = X + B \otimes X \tag{3}$$

where ⊗ denotes element-wise multiplication. F_e represents the enhanced features obtained through the enhancing step of the SRD module at the different network output stages.

$$F_p = \mathrm{Conv\_Block}(F_e) \tag{4}$$

Here Conv_Block represents the combination of two convolutional layers and a global average pooling. F_p is then fed into the classifier for prediction, yielding the class probability of the input image.

The SRD module also suppresses the most salient features to extract complementary attention features. More specifically, in the suppression operation we do not set a fixed hyperparameter as the upper bound of X and treat it as the starting point of the suppression. We assume that the larger the value of the enhancement factor b_i, the more important the corresponding group feature is, and the greater the degree of suppression it receives. Using the learning ability of fully connected layers, the SRD module adaptively suppresses discriminative regions. Formally, the learnable suppression factor is:

$$s_i = \sigma(\mathrm{MLP}(1 - b_i)) \tag{5}$$

S = [s_1, s_2, ..., s_k] represents the suppression factors of the k groups. The SRD module improves the learning ability of the network with the help of the nonlinear mapping and strong fitting ability of the multilayer perceptron (MLP), and σ is the sigmoid function that normalizes the suppression factor. Since the parameters of the MLP are trained with the classification target, the learnable SRD module can adaptively suppress the discriminative region without greatly impairing its feature extraction ability. Finally, we obtain the suppressed feature X_s by taking the minimum of the original feature X and S ⊗ X; X_s is then passed to the following network to discover other salient features.

$$X_s = \mathrm{Min}(X, S \otimes X) \tag{6}$$

Min denotes the element-wise minimum operation applied to X and S ⊗ X to suppress the discriminative region.
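To make the enhancing and suppression steps concrete, a minimal PyTorch sketch of the SRD module is given below. The number of groups k and the MLP hidden size are illustrative assumptions, and broadcasting the scalar factors b_i and s_i over their groups is one possible reading of Eqs. (3) and (6).

```python
import torch
import torch.nn as nn

class SRD(nn.Module):
    """Sketch of the salient region diffusion (SRD) module described above."""
    def __init__(self, channels: int, k: int = 4):
        super().__init__()
        self.k = k
        self.score = nn.Conv2d(channels, 1, kernel_size=1)        # Eq. (1): 1x1 conv, 1 output channel
        self.mlp = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1))  # Eq. (5)

    def forward(self, x: torch.Tensor):
        n, c, h, w = x.shape
        groups = torch.chunk(x, self.k, dim=3)                    # split the width into k groups
        # Eqs. (1)-(2): per-group score maps -> GAP -> softmax importance b_i
        e = [torch.relu(self.score(g)) for g in groups]
        gap = torch.stack([ei.mean(dim=(1, 2, 3)) for ei in e], dim=1)   # (n, k)
        b = torch.softmax(gap, dim=1)                                    # enhancement factors
        # Eq. (3): broadcast each b_i over its group and enhance the input
        b_map = torch.cat([b[:, i].view(n, 1, 1, 1).expand(n, c, h, g.shape[3])
                           for i, g in enumerate(groups)], dim=3)
        f_e = x + b_map * x
        # Eqs. (5)-(6): learnable suppression factors and element-wise minimum
        s = torch.sigmoid(self.mlp((1.0 - b).unsqueeze(-1))).squeeze(-1)  # (n, k)
        s_map = torch.cat([s[:, i].view(n, 1, 1, 1).expand(n, c, h, g.shape[3])
                           for i, g in enumerate(groups)], dim=3)
        x_s = torch.minimum(x, s_map * x)
        return f_e, x_s


if __name__ == "__main__":
    f_e, x_s = SRD(channels=256, k=4)(torch.randn(2, 256, 14, 14))
    print(f_e.shape, x_s.shape)   # both torch.Size([2, 256, 14, 14])
```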

3.2 Loss Function

Our framework is illustrated in Fig. 2. Based on the single-branch DRESNet, we can extract multiple complementary discriminative features from multiple stages of the network. In this paper, we use ResNet50 as the feature extractor, which has L stages (i.e., L = 5). As the number of layers increases, the features extracted by ResNet50 become more abstract and represent higher-level semantic information. In DRESNet, our goal is to optimize the feature representation of each selected stage; to this end, the SRD module is inserted into the last three stages of ResNet50. The part-specific representation F_p corresponding to each of these stages is obtained from Eq. (4). During the training phase, the classification loss for the last three stages (i.e., i = L − 2, L − 1, L) of DRESNet can be formulated as follows:

$$L_i = -y^{T} \log(y_i), \quad y_i = \mathrm{classifier}_i(F_i) \tag{7}$$

where y is the ground-truth label corresponding to the input image and y_i ∈ R^C represents the predicted probability of the i-th stage, with C the number of classes. Therefore, the overall optimization objective of DRESNet during the training phase is:

$$L_{total} = \sum_{i=1}^{T} L_i \tag{8}$$

where T = 3 is the number of salient features enhanced by the SRD module in the last three stages of the network. During inference, we sum the prediction scores of all enhanced part-specific features to obtain the final prediction.
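A minimal sketch of the training objective of Eqs. (7)-(8) and of the score-summing inference rule is shown below, assuming each stage classifier outputs logits of shape (batch, C).

```python
import torch
import torch.nn.functional as F

def multi_stage_loss(stage_logits, target):
    """Sum of cross-entropy losses over the T=3 SRD-enhanced stages (Eqs. (7)-(8))."""
    return sum(F.cross_entropy(logits, target) for logits in stage_logits)

def predict(stage_logits):
    """Inference: sum the per-stage prediction scores and take the arg-max class."""
    scores = torch.stack([torch.softmax(l, dim=1) for l in stage_logits]).sum(dim=0)
    return scores.argmax(dim=1)
```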

4 Experiments

4.1 Dataset and Implementation Details

We conducted experiments on three benchmark datasets for fine-grained classification: CUB-200-2011 [23], Stanford Cars [13], and FGVC-Aircraft [18]. Table 1 provides detailed statistics on the number of categories and the standard train-test split. In this paper, we evaluate categorization performance using Top-1 accuracy as the metric.

Table 1. Statistics of the three widely used fine-grained visual categorization datasets.

Dataset               Category  Train  Test
CUB-200-2011 [23]     200       5994   5974
FGVC-Aircraft [18]    100       6667   3333
Stanford Cars [13]    196       8144   8041

For all experiments, we utilized VGG16 [21] and ResNet50 [10] models pre-trained on ImageNet [5] as backbone networks for extracting features from the input images. The SRD module is then inserted at the last three output features of ResNet50, i.e., res3_4, res4_6, and res5_3 (relu3_3, relu4_3, and relu5_3 for VGG16). To ensure a fair comparison, we maintained the same resolution as other methods. During the training phase, the input image was resized to a fixed size of 550 × 550 and subsequently randomly cropped to 448 × 448, accompanied by random horizontal flipping. During the testing stage, the input image was resized to 550 × 550 and center-cropped to 448 × 448. Following the standards in the literature, we used a stochastic gradient descent (SGD) optimizer with a momentum of 0.9 and a weight decay of 1e−5. The model was trained for 200 epochs, with the learning rate of the backbone initialized to 0.0002 and that of the newly added layers to 0.002. During training, the learning rate was adjusted using the cosine annealing strategy [16]. The batch size was 16. Our model was implemented in PyTorch and trained end-to-end on a single GTX 2080Ti GPU, without any bounding box or part annotation.
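A minimal sketch of this optimization setup follows; the two placeholder modules stand in for the backbone and the newly added layers, and PyTorch's CosineAnnealingLR is used as a stand-in for the cosine annealing strategy of [16].

```python
import torch

backbone = torch.nn.Linear(8, 8)      # placeholder for the pre-trained backbone
new_layers = torch.nn.Linear(8, 8)    # placeholder for the newly added layers

optimizer = torch.optim.SGD(
    [{"params": backbone.parameters(), "lr": 0.0002},   # backbone learning rate
     {"params": new_layers.parameters(), "lr": 0.002}], # new-layer learning rate
    momentum=0.9, weight_decay=1e-5)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=200)

for epoch in range(200):
    # ... one training epoch with batch size 16 and 448x448 random crops ...
    scheduler.step()
```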

4.2 Experimental Results

We compared the proposed DRESNet with state-of-the-art methods on three popular benchmark fine-grained datasets: CUB-200-2011 [23], Stanford Cars [13], and FGVC-Aircraft [18]. The results are shown in Table 2, with the top and second-best values highlighted in bold and underline, respectively. Our method achieved significant performance improvements on these three datasets compared to existing techniques.

Table 2. Comparison results on the CUB-200-2011, Stanford Cars, and FGVC-Aircraft datasets (Top-1 accuracy in %; "–" marks entries not reported in the extracted table).

Method         Backbone  CUB-200-2011  Stanford Cars  FGVC-Aircraft
BCNN [15]      VGG16     84.1          91.3           84.1
CBP [8]        VGG16     84.1          91.3           84.1
LRBP [12]      VGG16     84.2          90.9           87.3
HBP [27]       VGG16     87.1          93.7           90.3
RA-CNN [7]     VGG16     85.3          92.5           88.1
MA-CNN [32]    VGG16     86.5          92.8           89.9
Ours           VGG16     87.0          93.6           91.6
Cross-X [17]   ResNet50  87.7          94.6           92.6
S3N [6]        ResNet50  88.5          94.5           93.0
CIN [9]        ResNet50  87.5          94.1           92.6
MAMC [22]      ResNet50  86.2          92.8           –
NTS [26]       ResNet50  87.5          93.9           91.4
TASN [33]      ResNet50  87.9          93.8           –
MOMN [19]      ResNet50  88.3          93.2           90.3
API [35]       ResNet50  87.7          94.8           93.0
MGE-CNN [30]   ResNet50  88.5          93.9           –
Ours           ResNet50  89.3          94.5           92.7

Results on CUB-200-2011 [23]: CUB-200-2011 [23] is the most challenging benchmark in fine-grained image classification, and our models based on VGG16 and ResNet50 achieve state-of-the-art performance on this dataset. Compared with the fine-grained feature learning methods BCNN [15], CBP [8], and LRBP [12], our method outperforms them by 5.2%, 5.3%, and 5.1%, respectively. MA-CNN [32], TASN [33], and S3N [6] capture subtle inter-class differences in images by introducing visual attention mechanisms; thanks to the introduction of feature enhancement and suppression, our proposed method achieves higher accuracy than each of them to varying degrees.

Results on Stanford Cars [13]: This dataset contains more images than CUB-200-2011, but the car images present less structural variation than the bird images. The proposed method consistently outperforms MOMN [19], MGE-CNN [30], CIN [9], and TASN [33] to varying degrees. Cross-X [17] performs robust multi-scale feature learning by using relationships between different images and between different network layers; its accuracy is 0.1% higher than that of our method. Unlike other methods that use a single image, API [35] learns to recognize each image in a pair by adaptively considering feature priorities and comparing the images step by step; its accuracy is 0.3% higher than that of our method.

Results on FGVC-Aircraft [18]: Our method achieved competitive performance on the FGVC-Aircraft dataset [18]. Compared with RA-CNN [7] and MA-CNN [32], our method outperforms them by 4.6% and 2.8%, respectively. Based on ResNet50, our model's performance is 0.1%, 1.3%, and 2.4% higher than that of CIN [9], NTS [26], and MOMN [19], respectively. S3N [6] uses sparse attention and selective sampling to capture diverse and discriminative part details without losing contextual information; on this dataset, the top-1 accuracy of S3N [6] is slightly higher than ours.

Fig. 4. Visualizations of activation maps in different layers. Feature maps obtained with the proposed SRD have learned more discriminative body parts of the target. These parts need not be the most salient features but help to distinguish confusing sub-categories.

4.3 Visualization Analysis

To better understand the enhancing and suppressing effects of the SRD module on features, we visualized activation maps of ResNet50 [10] with and without the SRD module on the three benchmark fine-grained classification datasets; the results are shown in Fig. 4. The activation maps were obtained by averaging activation values over the channel dimension of the given feature map. Based on ResNet50, we applied the Grad-CAM [20] algorithm to visualize the Conv3_4, Conv4_6, Conv5_3, and Conv_concat feature maps of ResNet50 [10] on the three validation sets.

We concatenate the Conv3_4, Conv4_6, and Conv5_3 feature maps, attention-pooled from the different intermediate feature maps, to obtain a refined Conv_concat feature map. By comparing the heatmaps of the baseline and DRESNet on Conv3 and Conv_concat, we can see that the SRD module plays a crucial role in enhancing the baseline and extracting additional complementary features. Taking the bird in Fig. 4 as an example, without the SRD the baseline network only focuses on the salient head region while disregarding the equally significant wing region. The visualization experiments demonstrate the capability of the SRD to mine multiple different discriminative object parts.
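A minimal sketch of the channel-averaging step used to produce these activation maps is given below; the Grad-CAM gradient weighting itself is omitted, and the 448 × 448 upsampling size simply matches the input crop size used above.

```python
import torch

def activation_heatmap(feature_map: torch.Tensor) -> torch.Tensor:
    """Channel-averaged activation map for a (C, H, W) feature tensor, normalized to [0, 1]."""
    heat = feature_map.mean(dim=0)          # average over the channel dimension -> (H, W)
    heat = heat - heat.min()
    return heat / (heat.max() + 1e-8)

# Upsample to the input resolution before overlaying on the image.
heat = activation_heatmap(torch.randn(256, 14, 14))
heat_up = torch.nn.functional.interpolate(
    heat[None, None], size=(448, 448), mode="bilinear", align_corners=False)
```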

5 Conclusion

In this paper, we propose a new discriminative region enhancing and suppression network for weakly supervised fine-grained image classification, focusing on how to extract the most salient features and complementary attention features. Specifically, we introduce a salient region diffusion (SRD) module, which can be considered a saliency-aware dropout scheme, enabling the network to adaptively mine potentially important information at different levels of importance. Visualization of the feature maps demonstrates that the SRD mines multiple complementary discriminative object parts at different network levels. Our proposed network demonstrates impressive performance on the three most difficult FGVC datasets, outperforming the majority of attention-based methods due to our redesigned attention mechanism. Furthermore, it exhibits exceptional efficacy in handling non-rigid targets featuring multiple body joints.

Acknowledgement. This work is partially supported by the Guangxi Science and Technology Project (2021GXNSFBA220035, AD20159034), the Open Funds from Guilin University of Electronic Technology, Guangxi Key Laboratory of Image and Graphic Intelligent Processing (GIIP2208) and the National Natural Science Foundation of China (61962014).

References

1. Aggrawal, P., Anand, S.: Computer vision applications in wildlife conservation (2020)
2. Angelova, A., Zhu, S.: Efficient object detection and segmentation for fine-grained recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 811–818 (2013)
3. Branson, S., Beijbom, O., Belongie, S.: Efficient large-scale structured learning. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1806–1813 (2013)
4. Branson, S., Van Horn, G., Belongie, S., Perona, P.: Bird species categorization using pose normalized deep convolutional nets. arXiv preprint arXiv:1406.2952 (2014)
5. Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., Fei-Fei, L.: ImageNet: a large-scale hierarchical image database. In: 2009 IEEE Conference on Computer Vision and Pattern Recognition, pp. 248–255. IEEE (2009)
6. Ding, Y., Zhou, Y., Zhu, Y., Ye, Q., Jiao, J.: Selective sparse sampling for fine-grained image recognition. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 6599–6608 (2019)
7. Fu, J., Zheng, H., Mei, T.: Look closer to see better: recurrent attention convolutional neural network for fine-grained image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4438–4446 (2017)
8. Gao, Y., Beijbom, O., Zhang, N., Darrell, T.: Compact bilinear pooling. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 317–326 (2016)
9. Gao, Y., Han, X., Wang, X., Huang, W., Scott, M.: Channel interaction networks for fine-grained image categorization. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 34, pp. 10818–10825 (2020)
10. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016)
11. Khan, S.D., Ullah, H.: A survey of advances in vision-based vehicle re-identification. Comput. Vis. Image Underst. 182, 50–63 (2019)
12. Kong, S., Fowlkes, C.: Low-rank bilinear pooling for fine-grained classification. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 365–374 (2017)
13. Krause, J., Stark, M., Deng, J., Fei-Fei, L.: 3D object representations for fine-grained categorization. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 554–561 (2013)
14. Lin, D., Shen, X., Lu, C., Jia, J.: Deep LAC: deep localization, alignment and classification for fine-grained recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1666–1674 (2015)
15. Lin, T.Y., RoyChowdhury, A., Maji, S.: Bilinear CNN models for fine-grained visual recognition. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 1449–1457 (2015)
16. Loshchilov, I., Hutter, F.: SGDR: stochastic gradient descent with warm restarts. arXiv preprint arXiv:1608.03983 (2016)
17. Luo, W., et al.: Cross-X learning for fine-grained visual categorization. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 8242–8251 (2019)
18. Maji, S., Rahtu, E., Kannala, J., Blaschko, M., Vedaldi, A.: Fine-grained visual classification of aircraft. arXiv preprint arXiv:1306.5151 (2013)
19. Min, S., Yao, H., Xie, H., Zha, Z.J., Zhang, Y.: Multi-objective matrix normalization for fine-grained visual recognition. IEEE Trans. Image Process. 29, 4996–5009 (2020)
20. Selvaraju, R.R., Cogswell, M., Das, A., Vedantam, R., Parikh, D., Batra, D.: Grad-CAM: visual explanations from deep networks via gradient-based localization. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 618–626 (2017)
21. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)
22. Sun, M., Yuan, Y., Zhou, F., Ding, E.: Multi-attention multi-class constraint for fine-grained image recognition. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11220, pp. 834–850. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01270-0_49
23. Wah, C., Branson, S., Welinder, P., Perona, P., Belongie, S.: The Caltech-UCSD Birds-200-2011 dataset (2011)
24. Wei, X.S., Cui, Q., Yang, L., Wang, P., Liu, L.: RPC: a large-scale retail product checkout dataset. arXiv preprint arXiv:1901.07249 (2019)
25. Wei, X.S., Xie, C.W., Wu, J., Shen, C.: Mask-CNN: localizing parts and selecting descriptors for fine-grained bird species categorization. Pattern Recogn. 76, 704–714 (2018)
26. Yang, Z., Luo, T., Wang, D., Hu, Z., Gao, J., Wang, L.: Learning to navigate for fine-grained classification. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) Computer Vision – ECCV 2018. LNCS, vol. 11218, pp. 438–454. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01264-9_26
27. Yu, C., Zhao, X., Zheng, Q., Zhang, P., You, X.: Hierarchical bilinear pooling for fine-grained visual recognition. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11220, pp. 595–610. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01270-0_35
28. Zhang, F., Li, M., Zhai, G., Liu, Y.: Multi-branch and multi-scale attention learning for fine-grained visual categorization. In: Lokoč, J., et al. (eds.) MMM 2021. LNCS, vol. 12572, pp. 136–147. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-67832-6_12
29. Zhang, H., et al.: SPDA-CNN: unifying semantic part detection and abstraction for fine-grained recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1143–1152 (2016)
30. Zhang, L., Huang, S., Liu, W., Tao, D.: Learning a mixture of granularity-specific experts for fine-grained categorization. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 8331–8340 (2019)
31. Zhang, N., Donahue, J., Girshick, R., Darrell, T.: Part-based R-CNNs for fine-grained category detection. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8689, pp. 834–849. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-10590-1_54
32. Zheng, H., Fu, J., Mei, T., Luo, J.: Learning multi-attention convolutional neural network for fine-grained image recognition. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 5209–5217 (2017)
33. Zheng, H., Fu, J., Zha, Z.J., Luo, J.: Looking for the devil in the details: learning trilinear attention sampling network for fine-grained image recognition. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5012–5021 (2019)
34. Zheng, H., Fu, J., Zha, Z.J., Luo, J., Mei, T.: Learning rich part hierarchies with progressive attention networks for fine-grained image recognition. IEEE Trans. Image Process. 29, 476–488 (2019)
35. Zhuang, P., Wang, Y., Qiao, Y.: Learning attentive pairwise interaction for fine-grained classification. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 34, pp. 13130–13137 (2020)

Edge Based Architecture for Total Energy Regression Models for Computational Materials Science

Kangmo Yeo1, Sukmin Jeong2(B), and Soo-Hyung Kim1(B)

1 Department of Artificial Intelligence Convergence, Chonnam National University, Gwangju 61186, Republic of Korea
[email protected]
2 Department of Physics and Research Institute of Physics and Chemistry, Jeonbuk National University, Jeonju 54896, Republic of Korea
[email protected]

Abstract. It is widely acknowledged that artificial intelligence (AI) technology has been extensively applied and has achieved remarkable advancements in various fields. The field of computational materials science has also embraced AI techniques in diverse ways. Today, computational materials science plays a crucial role in the development of cutting-edge materials, including pharmaceuticals, catalysts, semiconductors, and batteries. One significant task in this field is the regression of the total energy of atomic structures that form various materials. In this study, we propose a modified model architecture aimed at improving the performance of existing total energy regression models. Traditional total energy regression models calculate the total energy by summing the energies of individual nodes represented in the atomic structure graph. However, our approach suggests a modified architecture that not only predicts the energy for nodes but also incorporates energy prediction for edges in the graph. This novel architecture achieved a 3.9% reduction in energy error compared to the base model. Moreover, its simplicity provides the advantage of general applicability to other total energy regression models.

Keywords: Computational materials science · Atomic structure

1 Introduction

Artificial intelligence (AI) has already demonstrated its powerful performance and efficiency through numerous examples. In the field of natural science, AI is rapidly improving efficiency, comparable to a fourth wave. One area where AI algorithms have been introduced and are making a significant impact is computational materials science [1–5]. This field, which primarily simulates the properties and phenomena of materials, is gaining attention across various industries, including semiconductors, batteries, light-emitting devices, chemistry, new drug development, catalysts, and solar cells [6–14].


Density Functional Theory (DFT) [15,16], proposed in the 1970s, boasts surprisingly high accuracy and has become a major method in the field of computational science. However, DFT methods pay for this high accuracy with large computational resource requirements [10,17,18]. Recently, AI technology has been rapidly introduced to address this problem and has achieved considerable results, although it still lacks accuracy compared to the DFT method [19]. A fundamental and central task in computational materials science is to calculate the total energy of a given atomic structure. Total energy regression AI models, called machine learning potentials, are based on calculating the total energy by predicting and summing the contributions of each target atom to the total energy [20–24]. This method has a structure similar to the empirical potential, a traditional method of calculating the total energy. We propose a new architecture that considers both atoms and bonds. This approach is more physically intuitive, as the total energy of a material consists of both the energy of the atoms themselves and the energy of the chemical bonds. This architecture is more natural and closer to DFT calculations; if the architecture of an AI model resembles how DFT calculations work, the model can be expected to learn more easily. This effect is expected to be especially pronounced in non-metallic materials with clear chemical bonds rather than in metallic materials with ambiguous bonds. Any AI model that interprets the atomic structure as a graph and calculates the total energy can benefit from this approach and improve its performance.

2 Background and Base Model

Machine Learning Potential. In 2007, Behler and Parrinello first proposed an AI architecture for regressing the total energy based on atomic positions [20]. Density functional theory based first-principles calculation is one of the most widely used methodologies in computational materials science and can compute various properties of materials. The total energy is one of the most basic and essential physical quantities in such property calculations. The total energy is a function that takes the atomic structure as its variable, i.e., the types and arrangement of the atoms that make up an arbitrary material. For example, an atomic structure containing N atoms is expressed by the atomic numbers Z = {Z_1, Z_2, Z_3, ..., Z_N} and the atomic positions R = {r_1, r_2, r_3, ..., r_N}, where r_i is a Cartesian coordinate r_i = {x_i, y_i, z_i}. The traditional computational chemistry methodology of the empirical potential is adopted; according to this methodology, the total energy E_total is

$$E_{total} = \sum_i E_i \tag{1}$$

where E_i is the contribution of the i-th atom to the total energy. The architecture of a machine learning potential is shown in Fig. 1; almost all machine learning potentials published so far have this structure. In Fig. 1, G^ϕ_i is an input vector of size ϕ obtained from the atomic positions through symmetry functions.
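A minimal PyTorch sketch of this per-atom architecture is given below; a single shared atomic network is used for simplicity, whereas the original scheme employs one network per element, and the descriptor and hidden sizes are illustrative assumptions.

```python
import torch
import torch.nn as nn

class MLPotential(nn.Module):
    """Sketch of a Behler-Parrinello-style machine learning potential (Eq. (1)).

    Each atom is described by a precomputed descriptor vector G_i of size `phi`
    (e.g. from symmetry functions); a per-atom network predicts the atomic
    contribution E_i, and the total energy is the sum of these contributions.
    """
    def __init__(self, phi: int, hidden: int = 64):
        super().__init__()
        self.atomic_net = nn.Sequential(
            nn.Linear(phi, hidden), nn.Tanh(),
            nn.Linear(hidden, hidden), nn.Tanh(),
            nn.Linear(hidden, 1))

    def forward(self, descriptors: torch.Tensor) -> torch.Tensor:
        # descriptors: (N_atoms, phi) -> per-atom energies (N_atoms, 1) -> scalar E_total
        return self.atomic_net(descriptors).sum()


e_total = MLPotential(phi=32)(torch.randn(10, 32))   # total energy of a 10-atom structure
print(e_total.shape)                                 # torch.Size([])
```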


Fig. 1. Basic architecture of a machine learning potential [20]. The gray and pink boxes represent the descriptor and main network, respectively. All black and dotted arrows indicate the flow of data. The round circle represents an operation that calculates the sum of input values. The data shape at each step is indicated by blue numbers. (Color figure online)

Base Model. The model proposed in this study is based on GemNet-OC [25] and uses a similar architecture. GemNet-OC is a graph neural network (GNN) that represents an atomic system as a graph G = (V, E), where the set of graph nodes V represents the atoms and the set of edges E is defined as all pairs of atoms within a certain cutoff distance. The first GNN-like model, similar in spirit to state-of-the-art GNNs, was proposed in 1997 but only gained popularity after several works demonstrated its potential for a wide range of graph-related tasks [26]. GemNet-OC evolves the two-level message passing scheme proposed in MXMNet into an interaction layer and utilizes both edge and node embeddings. GemNet-OC builds on Geometric Message Passing Neural Networks (GemNet) [27], using a similar architecture while improving the accuracy of the predicted forces experienced by atoms.
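The cutoff-based graph construction described above can be sketched as follows. This simple version ignores periodic boundary conditions and batching, and the function name is ours, not part of GemNet-OC.

```python
import numpy as np

def build_edges(positions: np.ndarray, cutoff: float):
    """Build the edge set E of an atomic-structure graph G = (V, E):
    every ordered pair of atoms closer than `cutoff` becomes an edge,
    as in GemNet-OC-style graph neural networks (no periodic images)."""
    diff = positions[:, None, :] - positions[None, :, :]   # (n, n, 3) displacement vectors
    dist = np.linalg.norm(diff, axis=-1)                   # (n, n) pairwise distances
    src, dst = np.where((dist < cutoff) & (dist > 0.0))    # keep close pairs, drop self-loops
    return np.stack([src, dst], axis=0), dist[src, dst]

# edges, edge_lengths = build_edges(np.random.rand(8, 3) * 5.0, cutoff=3.0)
```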

3 Datasets

We used the Open Catalyst 2022 (OC22) dataset for model training [19]. The OC22 dataset is designed to enable the development of generalizable machine learning (ML) models for catalysts, particularly for oxygen and hydrogen evolution reactions and oxide electrocatalysis. The dataset consists of oxide surface structures combined with constituent elements and oxide surface structures with adsorbed molecules, as well as defects such as atomic substitutions and vacancies. By providing a diverse and representative training dataset, OC22 aims to support the development of generalized models that can accurately predict catalytic reactions on oxide surfaces. Models trained on this dataset are expected to accelerate the discovery and design of new catalysts for a wide range of applications. The primary task of OC22 is to regress the total energy, obtained through first-principles calculations based on DFT, from the atomic structure.


The dataset is divided into training/validation/test sets, and each set includes both material surface structures and surface structures with adsorbed molecules. The dataset includes 19,142 material surface structures and 43,189 surface structures with adsorbed molecules, with a total of 9,854,504 data points. Diversity in surface structure and adsorption structure was prioritized when constructing the dataset to ensure that a generalized model can be built.

4 Edge Based Architecture of GemNet-OC

GemNet-OC is a graph neural network (GNN) that represents atomic systems as graphs; its architecture, which uses GemNet as a base model, is improved here to map energy from the edge embeddings. In this architecture, nodes and edges are embedded separately, and the edge embedding is used to regress the force acting on each atom. For the total energy, an architecture analogous to the empirical potential was originally proposed, as shown in Eq. (1) and Fig. 1. However, the total energy can also be described in terms of thermostatistics. Specifically, it can be expressed as

$E_{\mathrm{total}} = \Omega + \sum_i \mu_i \qquad (2)$

where Ω is the formation energy representing the chemical interaction between atoms and μ_i is the energy of a single atom in the thermostatistical sense. The formation energy can be further expressed as $\Omega = \sum_m e_m$, the sum of the binding energies e_m of atomic pairs. These binding energies are mapped from the edges of the atomic-structure graph by the main network. This structure has several advantages. Firstly, μ_i is more consistent with respect to the placement of atoms than E_i, making the model easier to train. Secondly, the total energy is naturally regressed through the formation energy, calculated as the sum of the binding energies. To reflect these formulas, the architecture was modified to map energy to the edges as well. The modified architecture is shown in Fig. 2.
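As an illustration of how Eq. (2) changes the readout, the sketch below adds an edge head alongside the node head and sums both contributions. It is a schematic under our own naming and layer sizes, not the authors' GemNet-EB code.

```python
import torch
import torch.nn as nn

class NodeEdgeEnergyReadout(nn.Module):
    """Sketch of the edge-based readout in Eq. (2): the total energy is the
    sum of per-atom terms mu_i (from node embeddings) plus a formation
    energy Omega obtained by summing binding energies e_m mapped from edge
    embeddings."""

    def __init__(self, node_dim: int, edge_dim: int):
        super().__init__()
        self.node_head = nn.Linear(node_dim, 1)   # h_i  -> mu_i
        self.edge_head = nn.Linear(edge_dim, 1)   # m_e  -> e_m

    def forward(self, node_emb: torch.Tensor, edge_emb: torch.Tensor) -> torch.Tensor:
        mu = self.node_head(node_emb).sum()       # sum_i mu_i
        omega = self.edge_head(edge_emb).sum()    # Omega = sum_m e_m
        return omega + mu                         # E_total as in Eq. (2)
```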

5 Results

Due to limited computational resources, only 200,000 training data points (1/40 of the total OC22 dataset) were used for training over 9 epochs. However, since the same dataset was used for all models being compared and 200,000 is still a large number of data points, it is still meaningful to test the performance of the models. The training and validation errors were reduced equally for all models, indicating that there was no underfitting or overfitting (Fig. 3). The inset of Fig. 3-b shows that the validation error of GemNet-EB (shown in red) is about 3.9% smaller than that of the base model GemNet-OC (shown in green).


Fig. 2. Main network of the GemNet-EB architecture. Changes are highlighted in orange.  denotes the layer's input, || concatenation, and σ a non-linearity. The message passing block and embedding block have the same architecture as in GemNet-OC [25]. (Color figure online)

Fig. 3. Energy mean absolute error (MAE) of train (a) and validation (b) respectively. The x-axis represents the epoch and the y-axis represents the MAE on a log scale. The inset in (b) is a zoom-in of the red box zone at the bottom right. (Color figure online)


6 Conclusion

We proposed a new and improved architecture for regressing the total energy from atomic structures. Our newly proposed GemNet-EB model achieved a 3.9% lower validation error than the base model. Further experiments are certainly possible, but they are left for future work because broader testing was not feasible within our limited computational resources. Nevertheless, this study is significant because we applied a new formulation of the total energy prediction model, as shown in Eq. (2), and achieved a low validation error with this method. There is a significant difference in errors between the other models and the GemNet base model, indicating that the edge embedding of the GemNet-OC model also plays an important role in total energy regression. By directly mapping bond energy from the edge embedding, we were able to improve the accuracy of the model. While the edge embedding is indirectly reflected in the node embedding through the interaction block, we improved performance by connecting it directly to total energy regression. However, considering that the accuracy of DFT calculations used for surface structure and surface adsorption energy studies is below 0.01 eV, there is still room for improvement. Since our method is not limited to any specific model or architecture and can be applied to any GNN-based model for atomic structures, the proposed architecture can serve as a foundation for advancing machine learning potentials. Acknowledgement. This research was supported by Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (NRF-2021R1I1A3A04036408) and also supported by Institute of Information & communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (No. 2021-0-02068, Artificial Intelligence Innovation Hub).

References

1. Butler, K.: Machine learning for molecular and materials science. Nature 559, 547–555 (2018)
2. Schmidt, J.: Recent advances and applications of machine learning in solid-state materials science. npj Comput. Mater. 5, 83 (2019)
3. Wei, J.: Machine learning in materials science. InfoMat 1, 338–358 (2019)
4. Morgan, D.: Opportunities and challenges for machine learning in materials science. Ann. Rev. 50, 71–103 (2020)
5. Fiedler, L.: Deep dive into machine learning density functional theory for materials science and chemistry. Phys. Rev. Mater. 6, 040301 (2022)
6. Ladha, D.: A review on density functional theory-based study on two-dimensional materials used in batteries. Mater. Today Chem. 11, 94–111 (2019)
7. Adekoya, D.: DFT-guided design and fabrication of carbon-nitride-based materials for energy storage devices: a review. Nanomicro Lett. 13, 13 (2021)
8. Liang, Q.: Transition metal compounds family for Li-S batteries: the DFT-guide for suppressing polysulfides shuttle. Adv. Funct. Mater. 33, 2300825 (2023)


9. Neugebauer, J.: Density functional theory in materials science. WIREs Comput. Mol. Sci. 3, 438–448 (2013)
10. Mattsson, A.: Designing meaningful density functional theory calculations in materials science - a primer. Model. Simul. Mater. Sci. Eng. 13, R1 (2005)
11. Tandon, H.: A brief review on importance of DFT in drug design. Res. Med. Eng. Sci. 7, 791–795 (2019)
12. Sade, V.: Current trends in computer aided drug design and a highlight of drugs discovered via computational techniques: a review. Eur. J. Med. Chem. 224, 113705 (2021)
13. Weijing, D.: The application of DFT in catalysis and adsorption reaction system. Energy Procedia 152, 997–1002 (2018)
14. Liao, X.: Density functional theory for electrocatalysis. Energy Environ. Mater. 5, 157–185 (2021)
15. Hohenberg, P.: Inhomogeneous electron gas. Phys. Rev. 136, B864 (1964)
16. Kohn, W.: Self-consistent equations including exchange and correlation effects. Phys. Rev. 140, A1133 (1965)
17. Zhang, L.: Deep potential molecular dynamics: a scalable model with the accuracy of quantum mechanics. Phys. Rev. Lett. 120, 143001 (2018)
18. Chandrasekaran, A.: Solving the electronic structure problem with machine learning. npj Comput. Mater. 5, 22 (2019)
19. Tran, R.: The Open Catalyst 2022 (OC22) dataset and challenges for oxide electrocatalysts. ACS Catal. 13(5), 3066–3084 (2023)
20. Behler, J.: Generalized neural-network representation of high-dimensional potential-energy surfaces. Phys. Rev. Lett. 98, 146401 (2007)
21. Bartók, A.: Gaussian approximation potentials: the accuracy of quantum mechanics, without the electrons. Phys. Rev. Lett. 104, 136403 (2010)
22. Artrith, N.: An implementation of artificial neural-network potentials for atomistic materials simulations: performance for TiO2. Comput. Mater. Sci. 114, 135–150 (2016)
23. Schütt, K.: SchNet - a deep learning architecture for molecules and materials. J. Chem. Phys. 148, 24 (2018)
24. Lee, K.: SIMPLE-NN: an efficient package for training and executing neural-network interatomic potentials. Comput. Phys. Commun. 242, 95–103 (2019)
25. Gasteiger, J.: GemNet-OC: developing graph neural networks for large and diverse molecular simulation datasets. arXiv preprint arXiv:2204.02782 (2022)
26. Zelinsky, N.: A neural device for searching direct correlations between structures and properties of chemical compounds. J. Chem. Inf. Comput. Sci. 37, 715–721 (1997)
27. Gasteiger, J., Becker, F., Günnemann, S.: GemNet: universal directional graph neural networks for molecules. In: 35th Conference on Neural Information Processing Systems, vol. 34, pp. 6790–6802 (2021)

Multiscale Transformer-Based for Multimodal Affective States Estimation from Physiological Signals Ngoc Tu Vu, Van Thong Huynh, Hyung-Jeong Yang, and Soo-Hyung Kim(B) Department of AI Convergence, Chonnam National University, Gwangju, South Korea {tu369,vthuynh,hjyang,shkim}@jnu.ac.kr

Abstract. In recent times, the estimation of affective states from physiological data has garnered considerable attention within the research community owing to its wide-ranging applicability in daily life scenarios. The advancement of wearable technology has facilitated the collection of physiological signals, thereby highlighting the necessity for a resilient system capable of effectively discerning and interpreting user states. This work introduces an innovative methodology aimed at addressing Valence-Arousal estimation through the utilization of physiological signals. Our proposed model presents an efficient multi-scale transformer-based architecture for fusing signals from multiple modern sensors to tackle the emotion recognition task. Our approach involves applying a multi-modal technique combined with scaling data to establish the relationship between internal body signals and human emotions. Additionally, we utilize Transformer and Gaussian transformation techniques to improve signal encoding effectiveness and overall performance. Our proposed model demonstrates compelling performance on the CASE dataset, achieving an impressive Root Mean Squared Error (RMSE) of 1.45.

Keywords: Affective states analysis · Physiological signals · Multimodal · Mental health

1 Introduction

Recognizing emotions is a fundamental aspect of human communication, and the ability to accurately detect emotional states has significant impacts on a range of applications, from healthcare to human-computer interaction. Emotions are often reflected in physiological signals [26], facial expressions [8], and speech [24]. Recently, the use of physiological signals for affective computing has gained considerable attention due to its potential to provide objective measures of emotional states in real time [21]. Accordingly, there has been growing interest in developing machine learning algorithms for affective computing using physiological signals [2,4,5,22,28].


These algorithms can be used to classify emotional states, predict changes in emotional states over time, or identify the specific features of physiological signals that are most informative for detecting emotional states. There has also been interest in developing wearable sensors that can capture physiological signals in real-world settings, such as in the workplace or in social situations [20]. The use of end-to-end deep learning architectures for physiological signals has the potential to simplify the development and deployment of an emotion recognition system [21]. By eliminating the need for preprocessing steps, these architectures can reduce the complexity and time required for system development, as well as improve the scalability and accuracy of the system. In addition, end-to-end architectures can enable the development of systems that process multiple physiological signals simultaneously, such as heart rate, respiration, and electrodermal activity, providing more comprehensive and accurate measures of emotional states. Despite the potential benefits of end-to-end deep learning architectures for affective computing, there are still challenges that need to be addressed. One challenge is to develop architectures that can handle noisy and non-stationary physiological signals, which can be affected by movement artifacts, signal drift, and other sources of noise. Another challenge is to ensure that the learned features are interpretable and meaningful, which can help improve the transparency and explainability of the system. In this paper, we propose an end-to-end multi-scale architecture for continuous emotion regression with physiological signals. We evaluate the performance of the proposed architecture using the CASE dataset [25], which contains data collected from experiments carried out in a laboratory setting.

2 Related Works
2.1 Continuous Emotion Recognition from Multimodal Physiological Signals

The utilization of physiological signals has been widely acknowledged as one of the most reliable data forms for affective science and affective computing. Although individuals are capable of manipulating their physical signals, such as facial expressions or speech, consciously controlling their internal state is a daunting task. Therefore, analysis of signals from the human body represents a reliable and robust approach to fully recognizing and comprehending an individual's emotional state [1,26]. This reliability factor is especially crucial in medical applications, such as mental health treatment or mental illness diagnosis. Recognizing affect from physiological data remains a significant challenge, not only during the data acquisition process but also in terms of emotion assessment. Laboratory-based research dominates the field of affective science due to the control it affords over experimental variables. Researchers can carefully select and prepare emotional stimuli, and employ various sensor devices to trace and record a subject's emotional state with minimal unexpected events or interference [21]. However, most of these studies rely on discrete, indirect methods, such as


quizzes, surveys, or discrete emotion categories for emotion assessment, which overlook the time-varying nature of human emotional experience. Sharma et al. [25] introduced the Joystick-based Emotion Reporting Interface (JERI) to overcome this limitation in emotion assessment. JERI enables the simultaneous annotation of valence and arousal, allowing for moment-to-moment emotion assessment. The Continuously Annotated Signals of Emotion (CASE) dataset, acquired using JERI, provides additional information for researchers to identify the timing of emotional triggers. In addition, it is generally difficult for a single physiological signal to precisely reflect human emotional changes. Therefore, much recent research has focused on detecting human emotion through multimodal physiological signals. Many types of physiological signals are used in these studies. While some studies record heart-related signals such as the electrocardiogram (ECG) [7,17,18] or blood volume pulse (BVP) [15,33], others use the electrical activity of the brain (electroencephalogram/EEG) [13,14] or muscle electrical activity (electromyogram/EMG) [19,23]. Furthermore, some even employ skin temperature (SKT) [19], the activity of the skin's sweat glands (EDA) [13,23], or the depth and rate of breathing (respiration/RSP) [23].

2.2 Transformer-Based Methods for Multimodal Emotion Recognition from Physiological Signals

Similar to other emotion recognition problems that involve physical signals, affective computing on physiological data has witnessed extensive adoption of machine learning techniques, particularly deep learning methodologies. Dominguez et al. [4] employed various conventional machine learning techniques, including Gaussian naive Bayes, k-nearest neighbors, and support vector machines, for valence-arousal estimation. However, these approaches depend heavily on the quality of handcrafted feature selection and feature extraction processes. To overcome this challenge, other studies [5,22,28] proposed the use of deep learning techniques for an end-to-end approach, where the model learns to extract features automatically without the need for pre-designed feature descriptors. With the advancement of deep learning, various state-of-the-art techniques have been used to analyze physiological signals. Santamaria et al. [22] used convolutional neural networks (CNNs) with 1D convolution layers for emotion detection, while Harper et al. [5] combined CNNs with frequently used recurrent neural networks (RNNs) for emotion recognition from ECG signals. Since their introduction in 2017, Transformers [27] have emerged as preferred models in the field of deep learning. Their robust performance in natural language processing, a type of data that shares some characteristics with time-series data, has demonstrated the potential of Transformers when applied to time-series signals. As a result, recent research in the time-series domain has utilized Transformers as the core module in their model architectures [9,10,30]. For physiological signals, some studies have proposed using Transformers and their variants to detect emotions [28,29,31,32]. In the works of Vazquez et al. [28,29], they focused on applying pre-trained Transformers for multimodal signal processing. However, this is still


a fairly basic application of Transformer modules. Wu et al. [31] and Yang et al. [32] proposed more advanced Transformer-based techniques, namely self-supervised and convolution-augmented Transformers, for single- and multimodal signal processing. Although these studies have demonstrated the effectiveness of Transformers for physiological signals, they often feed the model with signals of a fixed original size, which may lead to the loss of global feature information. To address this issue, we propose a new multi-scale transformer-based architecture for multimodal emotion recognition.

3 Proposed Approach

Fig. 1. An overview of our proposed architecture.

3.1 Problem Definition

The multimodal physiological emotion recognition problem takes as input 8 physiological signals, namely ECG, BVP, EMG CORU, EMG TRAP, EMG ZYGO, GSR, RSP and SKT, extracted from human subjects during emotion-inducing stimuli. These are denoted as 8 sequences of length L. In the affective computing field, the objective of the emotion recognition problem varies depending on the adopted emotion model. In the scope of this study, following the SAM (Self-Assessment Manikin) [3] model used by the CASE dataset, the objective is to estimate the Valence-Arousal (V-A) values. The V-A score consists of two continuous floating-point numbers ranging from 0.5 to 9.5. A value of 0.5 denotes the most negative valence or the lowest arousal, 5 indicates neutral valence or arousal, and 9.5 indicates the most positive valence or the highest arousal.


3.2 Methodology

We constructed a new multiscale architecture for the estimation of valence-arousal from 8 physiological signals. Our architecture consists of two core modules: a feature encoding module and a multiscale fusion module. Raw physiological data are fed into the feature encoding module, which is designed to extract vital information across varying global and local scales. Subsequently, the multi-scale features are fused and utilized for the estimation of Valence-Arousal scores. The overall architecture is shown in Fig. 1.

Feature Encoding. To enable the feature encoding module to extract global features for the estimator and eliminate noise and interference information from the input, we employ 1-dimensional average pooling to scale the 8 input signals into three different lengths: L, L/2, and L/4. This process helps to improve the model's ability to extract useful information and eliminate unwanted noise and interference. Then, we simultaneously apply two types of feature encoders: the Gaussian transform [16] and the transformer encoder [27]. The transformer encoder block uses multi-headed self-attention as its core mechanism. Given an input sequential signal S ∈ R^{L×C}, where L represents the length of the signal sequence and C = 8 is the number of channels (signal modalities), we apply a positional encoding and embedding layer to convert the raw input into a sequence of tokens. Subsequently, the tokens are fed into transformer layers consisting of multi-headed self-attention (MSA) [27], layer normalization (LN), and multilayer perceptron (MLP) blocks. Each element is formalized in the following equations:

$y^{i} = \mathrm{MSA}(\mathrm{LN}(x^{i})) + x^{i} \qquad (1)$

$x^{l+i} = \mathrm{MLP}(\mathrm{LN}(y^{i})) + y^{i} \qquad (2)$

Here, i represents the index of the token, and xi denotes the generated feature’s token. It is worth noting that since the multi-headed self-attention mechanism allows multiple sequences to be processed in parallel, all 8 signal channels are fed into the Transformer Encoder at once. The Gaussian transform [16] is traditionally employed to kernelize linear models by nonlinearly transforming input features via a single layer and subsequently training a linear model on top of the transformed features. However, in the context of deep learning architectures, random features can also be leveraged, given their ability to perform dimensionality reduction or approximate certain functions via random projections. As a non-parametric technique, this transformation maps input data to a more compressed representation that excludes noise information while still enabling computationally efficient processing. Such a technique may serve as a valuable supplement to Transformer Encoder architectures, compensating for any missing information.
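As a concrete reference for the Gaussian transform, the random Fourier feature map of Rahimi and Recht [16] can be sketched as follows. The hyperparameters are placeholders and this is not the authors' implementation; it only illustrates the kind of non-parametric projection paired with the transformer encoder.

```python
import numpy as np

def gaussian_random_features(x: np.ndarray, out_dim: int, sigma: float = 1.0, seed: int = 0):
    """Random Fourier features approximating a Gaussian (RBF) kernel
    (Rahimi & Recht): z(x) = sqrt(2/D) * cos(x W + b).
    x: (n_samples, in_dim) -> returns (n_samples, out_dim)."""
    rng = np.random.default_rng(seed)
    w = rng.normal(0.0, 1.0 / sigma, size=(x.shape[1], out_dim))  # random projection matrix
    b = rng.uniform(0.0, 2 * np.pi, size=out_dim)                 # random phase offsets
    return np.sqrt(2.0 / out_dim) * np.cos(x @ w + b)
```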


Multi-scale Fusion. From the features extracted from the feature encoder module on different scales, we fuse them using the concatenation operation. The concatenated features are then fed through a series of fully connected layers (FCN) for the estimation of the 2 valence and arousal scores. The Rectified Linear Unit (ReLU) activation function is chosen for its ability to introduce non-linearity into the model, thus contributing to the accuracy of the score estimation. The effectiveness of this approach lies in its ability to efficiently estimate the desired scores while maintaining a simple and straightforward architecture.
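A compact, hypothetical PyTorch sketch of the overall multi-scale pipeline is given below. It follows the description above (pooling to L, L/2 and L/4, a shared transformer encoder, concatenation, and an FC head), but omits positional encoding and the Gaussian branch, and all layer sizes are assumptions rather than the paper's exact settings.

```python
import torch
import torch.nn as nn

class MultiScaleRegressor(nn.Module):
    """Schematic multi-scale transformer regressor for valence/arousal."""

    def __init__(self, channels: int = 8, d_model: int = 64):
        super().__init__()
        self.embed = nn.Linear(channels, d_model)
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Sequential(nn.Linear(3 * d_model, 64), nn.ReLU(), nn.Linear(64, 2))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, L, channels) raw multimodal signal window
        feats = []
        for scale in (1, 2, 4):
            # average-pool the time axis by the given factor (L, L/2, L/4)
            xs = nn.functional.avg_pool1d(x.transpose(1, 2), scale).transpose(1, 2)
            z = self.encoder(self.embed(xs))        # (batch, L/scale, d_model)
            feats.append(z.mean(dim=1))             # pool tokens at this scale
        return self.head(torch.cat(feats, dim=-1))  # (batch, 2): valence, arousal
```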

4 Experiments and Results
4.1 Dataset

The CASE dataset [25] contains data from several physiological sensors and continuous annotations of emotion. These data were acquired from 30 subjects while they watched several video stimuli and simultaneously reported their emotional experience using JERI. The devices used include sensors for electrocardiography (ECG), blood volume pulse (BVP), galvanic skin response (GSR), respiration (RSP), skin temperature (SKT) and electromyography (EMG). These sensors return 8 types of physiological signals: ECG, BVP, EMG CORU, EMG TRAP, EMG ZYGO, GSR, RSP and SKT. The emotional stimuli consisted of 11 videos, ranging in duration from 120 to 197 s. The annotation and physiological data were collected at sampling rates of 20 Hz and 1000 Hz, respectively. The initial range of valence-arousal scores was established at [−26225, 26225]. We evaluate our approach with four different scenarios:
– Across-time scenario: Each sample represents a single person watching a single video, and the training and test sets are divided based on time. Specifically, the earlier parts of the video are used for training, while the later parts are reserved for testing (see the sketch after this list).
– Across-subject scenario: Participants are randomly assigned to groups, and all samples from a given group belong to either the train or test set depending on the fold.
– Across-elicitor scenario: Each subject has two samples (videos) per quadrant in the arousal-valence space. For each fold, both samples related to a given quadrant are excluded, resulting in four folds, with one quadrant excluded in each fold.
– Across-version scenario: Each subject has two samples per quadrant in the arousal-valence space. In this scenario, one sample is used to train the model, and the other sample is used for testing, resulting in two folds.
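A minimal illustration of the across-time split referenced in the list above. The 80/20 ratio is an assumption, since the paper only states that the earlier parts of each video are used for training and the later parts for testing.

```python
import numpy as np

def across_time_split(signal: np.ndarray, labels: np.ndarray, train_frac: float = 0.8):
    """Split one (subject, video) sample in time: earlier portion for
    training, later portion for testing. `train_frac` is illustrative."""
    cut = int(len(signal) * train_frac)
    return (signal[:cut], labels[:cut]), (signal[cut:], labels[cut:])
```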

4.2 Experiments Setup

Our networks were implemented using the TensorFlow framework. We trained our models using the AdamW optimizer [12] with a learning rate of 0.001 and the Cosine annealing warm restarts scheduler [11] over 10 epochs. The MSE loss


function was used to optimize the network, and the evaluation stage is done with RMSE. The sequence length was set to 2048. We utilized 4 transformer layers for the transformer encoder, with each attention module containing 4 heads. The hidden dimension of the transformer was set to 1024. All training and testing processes were conducted on an RTX 3090 GPU.

Table 1. RMSE on the test data with different scenarios.

Approach                                       | Scenario        | Arousal | Valence
Hinduja et al. [6]                             | Across-time     | 1.82    | 1.76
                                               | Across-subject  | 1.33    | 1.31
                                               | Across-elicitor | 1.03    | 1.10
                                               | Across-version  | 0.99    | 1.07
                                               | Avg             | 1.292   | 1.31
Ours - without transformer on original signals | Across-time     | 1.550   | 1.612
                                               | Across-subject  | 1.478   | 1.592
                                               | Across-elicitor | 1.600   | 1.653
                                               | Across-version  | 1.595   | 1.548
                                               | Avg             | 1.556   | 1.601
Ours                                           | Across-time     | 1.503   | 1.639
                                               | Across-subject  | 1.336   | 1.345
                                               | Across-elicitor | 1.509   | 1.514
                                               | Across-version  | 1.369   | 1.352
                                               | Avg             | 1.430   | 1.463

4.3 Results

Table 1 presents the results of our model on the test set under the different evaluation scenarios. Overall, the final RMSE for the valence and arousal estimation task is 1.447. Our model shows promising performance in comparison with the approach presented by Hinduja et al. [6], with a gap of only 0.077 in arousal and 0.153 in valence relative to their results. In detail, our model achieved its best performance in the across-subject scenario, with an arousal score of 1.336 and a valence score of 1.345. These results suggest that our model can effectively generalize to new subjects and accurately capture emotion changes over the entire video-viewing process. Meanwhile, the relatively low performance in the across-elicitor scenario, with scores of 1.509 and 1.514 in arousal and valence, respectively, suggests that our model did not perform well when inferring emotional states that were not seen during training. In the context


of the across-time scenario, our results demonstrate a significantly improved performance compared to that of Hinduja et al. [6]. Specifically, our model achieves noteworthy enhancements in both Arousal and Valence scores, with a margin of 0.317 in Arousal and 0.121 in Valence. This substantial improvement opens up promising avenues for our future research.

5 Conclusion

This paper proposes a new multiscale architecture for multimodal emotional recognition from physiological signals. Our approach involves encoding the signal with the transformer encoder at multiple scales to capture both global and local features and obtain more informative representations. Our method achieved decent results on the CASE dataset. Acknowledgments. This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (RS-2023-00219107). This work was also supported by Institute of Information & communications Technology Planning & Evaluation (IITP) grant funded by the Artificial Intelligence Convergence Innovation Human Resources Development (IITP-2023-RS-2023-00256629) grant funded by the Korea government (MSIT).

References

1. Ahmad, Z., Khan, N.: A survey on physiological signal-based emotion recognition. Bioengineering 9(11), 688 (2022)
2. Algarni, M., Saeed, F., Al-Hadhrami, T., Ghabban, F., Al-Sarem, M.: Deep learning-based approach for emotion recognition using electroencephalography (EEG) signals using bi-directional long short-term memory (Bi-LSTM). Sensors 22(8), 2976 (2022)
3. Bradley, M.M., Lang, P.J.: Measuring emotion: the self-assessment manikin and the semantic differential. J. Behav. Ther. Exp. Psychiatry 25(1), 49–59 (1994)
4. Domínguez-Jiménez, J.A., Campo-Landines, K.C., Martínez-Santos, J.C., Delahoz, E.J., Contreras-Ortiz, S.H.: A machine learning model for emotion recognition from physiological signals. Biomed. Sig. Process. Control 55, 101646 (2020)
5. Harper, R., Southern, J.: A Bayesian deep learning framework for end-to-end prediction of emotion from heartbeat. IEEE Trans. Affect. Comput. 13(2), 985–991 (2020)
6. Hinduja, S., Bilalpur, M., Jivnani, L., Canavan, S.: Multimodal temporal modeling of emotion using physiological signals (2023)
7. Hu, L., Yang, J., Chen, M., Qian, Y., Rodrigues, J.J.: SCAI-SVSC: smart clothing for effective interaction with a sustainable vital sign collection. Fut. Gener. Comput. Syst. 86, 329–338 (2018)
8. Li, S., Deng, W.: Deep facial expression recognition: a survey. IEEE Trans. Affect. Comput. 13(3), 1195–1215 (2020)
9. Li, S., et al.: Enhancing the locality and breaking the memory bottleneck of transformer on time series forecasting. In: Advances in Neural Information Processing Systems, vol. 32 (2019)


10. Li, Y., Peng, X., Zhang, J., Li, Z., Wen, M.: DCT-GAN: dilated convolutional transformer-based GAN for time series anomaly detection. IEEE Trans. Knowl. Data Eng. 35, 3632–3644 (2021)
11. Loshchilov, I., Hutter, F.: SGDR: stochastic gradient descent with warm restarts. arXiv preprint arXiv:1608.03983 (2016)
12. Loshchilov, I., Hutter, F.: Decoupled weight decay regularization. arXiv preprint arXiv:1711.05101 (2017)
13. Martens, T., Niemann, M., Dick, U.: Sensor measures of affective leaning. Front. Psychol. 11, 379 (2020)
14. Nakisa, B., Rastgoo, M.N., Rakotonirainy, A., Maire, F., Chandran, V.: Long short term memory hyperparameter optimization for a neural network based emotion recognition framework. IEEE Access 6, 49325–49338 (2018)
15. Ragot, M., Martin, N., Em, S., Pallamin, N., Diverrez, J.-M.: Emotion recognition using physiological signals: laboratory vs. wearable sensors. In: Ahram, T., Falcão, C. (eds.) AHFE 2017. AISC, vol. 608, pp. 15–22. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-60639-2_2
16. Rahimi, A., Recht, B.: Random features for large-scale kernel machines. In: Advances in Neural Information Processing Systems, vol. 20 (2007)
17. Rattanyu, K., Mizukawa, M.: Emotion recognition using biological signal in intelligent space. In: Jacko, J.A. (ed.) HCI 2011. LNCS, vol. 6763, pp. 586–592. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-21616-9_66
18. Rattanyu, K., Ohkura, M., Mizukawa, M.: Emotion monitoring from physiological signals for service robots in the living space. In: ICCAS 2010, pp. 580–583. IEEE (2010)
19. Romeo, L., Cavallo, A., Pepa, L., Bianchi-Berthouze, N., Pontil, M.: Multiple instance learning for emotion recognition using physiological signals. IEEE Trans. Affect. Comput. 13(1), 389–407 (2019)
20. Saganowski, S., Behnke, M., Komoszyńska, J., Kunc, D., Perz, B., Kazienko, P.: A system for collecting emotionally annotated physiological signals in daily life using wearables. In: 2021 9th International Conference on Affective Computing and Intelligent Interaction Workshops and Demos (ACIIW), pp. 1–3. IEEE (2021)
21. Saganowski, S., Perz, B., Polak, A., Kazienko, P.: Emotion recognition for everyday life using physiological signals from wearables: a systematic literature review. IEEE Trans. Affect. Comput. 14, 1876–1897 (2022)
22. Santamaria-Granados, L., Munoz-Organero, M., Ramirez-Gonzalez, G., Abdulhay, E., Arunkumar, N.: Using deep convolutional neural network for emotion detection on a physiological signals dataset (AMIGOS). IEEE Access 7, 57–67 (2018)
23. Schmidt, P., Dürichen, R., Reiss, A., Van Laerhoven, K., Plötz, T.: Multi-target affect detection in the wild: an exploratory study. In: Proceedings of the 2019 ACM International Symposium on Wearable Computers, pp. 211–219 (2019)
24. Schuller, B.W.: Speech emotion recognition: two decades in a nutshell, benchmarks, and ongoing trends. Commun. ACM 61(5), 90–99 (2018)
25. Sharma, K., Castellini, C., van den Broek, E.L., Albu-Schaeffer, A., Schwenker, F.: A dataset of continuous affect annotations and physiological signals for emotion analysis. Sci. Data 6(1), 196 (2019)
26. Shu, L., et al.: A review of emotion recognition using physiological signals. Sensors 18(7), 2074 (2018)
27. Vaswani, A., et al.: Attention is all you need. In: Advances in Neural Information Processing Systems, vol. 30 (2017)


28. Vazquez-Rodriguez, J., Lefebvre, G., Cumin, J., Crowley, J.L.: Emotion recognition with pre-trained transformers using multimodal signals. In: 2022 10th International Conference on Affective Computing and Intelligent Interaction (ACII), pp. 1–8. IEEE (2022)
29. Vazquez-Rodriguez, J., Lefebvre, G., Cumin, J., Crowley, J.L.: Transformer-based self-supervised learning for emotion recognition. In: 2022 26th International Conference on Pattern Recognition (ICPR), pp. 2605–2612. IEEE (2022)
30. Wu, N., Green, B., Ben, X., O'Banion, S.: Deep transformer models for time series forecasting: the influenza prevalence case. arXiv preprint arXiv:2001.08317 (2020)
31. Wu, Y., Daoudi, M., Amad, A.: Transformer-based self-supervised multimodal representation learning for wearable emotion recognition. IEEE Trans. Affect. Comput. (2023)
32. Yang, K., et al.: Mobile emotion recognition via multiple physiological signals using convolution-augmented transformer. In: Proceedings of the 2022 International Conference on Multimedia Retrieval, pp. 562–570 (2022)
33. Zhao, B., Wang, Z., Yu, Z., Guo, B.: EmotionSense: emotion recognition based on wearable wristband. In: 2018 IEEE SmartWorld/SCALCOM/UIC/ATC/CBDCom/IOP/SCI, pp. 346–355. IEEE (2018)

Frequency Mixup Manipulation Based Unsupervised Domain Adaptation for Brain Disease Identification

Yooseung Shin1, Junyeong Maeng1, Kwanseok Oh1, and Heung-Il Suk1,2(B)

1 Department of Artificial Intelligence, Korea University, Seoul, Republic of Korea {usxxng,mjy8086,ksohh,hisuk}@korea.ac.kr
2 Department of Brain and Cognitive Engineering, Korea University, Seoul, Republic of Korea

Abstract. Unsupervised Domain Adaptation (UDA), which transfers the learned knowledge from a labeled source domain to an unlabeled target domain, has been widely utilized in various medical image analysis approaches. Recent advances in UDA have shown that manipulating the frequency domain between source and target distributions can significantly alleviate the domain shift problem. However, a potential drawback of these methods is the loss of semantic information in the low-frequency spectrum, which can make it difficult to consider semantic information across the entire frequency spectrum. To deal with this problem, we propose a frequency mixup manipulation that utilizes the overall semantic information of the frequency spectrum in brain disease identification. In the first step, we perform self-adversarial disentangling based on frequency manipulation to pretrain the model for intensity-invariant feature extraction. Then, we effectively align the distributions of both the source and target domains by using mixed-frequency domains. In the extensive experiments on ADNI and AIBL datasets, our proposed method achieved outstanding performance over other UDA-based approaches in medical image classification. Code is available at: https://github.com/ku-milab/FMM.

Keywords: Unsupervised domain adaptation · Frequency manipulation · Medical image reconstruction · sMRI

1 Introduction

Accurate diagnosis of brain diseases is critical, as it allows for early intervention and treatment and helps advance neuroscience studies. Recently, machine learning-based approaches [9,26] have made significant strides in identifying brain diseases such as Alzheimer's disease. These existing methods assume that various medical images follow a homogeneous data distribution and utilize an identical model across different domain datasets. In other words, a model trained on a source domain is directly applied to the target domain without


adjusting the domain difference [5]. However, in real-world applications, the presence of inter-domain heterogeneity can challenge the validity of this assumption. Differences in data distribution between domains can arise from variations in scanner protocols, demographic information of cohorts within sites, etc. This distribution discrepancy between training and test data, also known as domain shift [2], can reduce the performance of models across different domains. To alleviate the domain shift, Unsupervised Domain Adaptation (UDA), which transfers well-trained knowledge from sufficiently labeled source data to unlabeled target data [19], has been widely exploited. Recently, various UDA methodologies [3,6,16] for effective domain transfer have been proposed. DeepCORAL [16] minimizes domain shift by aligning the second-order statistics of source and target distributions without requiring any target labels. DANN [3] is a widely used adversarial learning-based domain adaptation method in modern medical imaging tasks. AD2 A [6] proposed an attention-guided deep domain adaptation strategy to identify brain disease in multi-site MRI. However, since these methods utilize the pixel-level distributional characteristics of samples in the spatial domain, they are sensitive to noise or variations in input data and limited in their ability to adapt to significant changes in input data distribution, which are commonly encountered in real-world scenarios. More recently, the outstanding performance of UDA has been achieved through research [15,21,22,24] by aligning the frequency domain between source and target distributions, which effectively relieves the domain shift problem. Those frequency-based methods propose a simple image translation strategy by replacing the low-frequency spectrum between the source and target domains. They achieve remarkable performance by transforming images through (Fast) Fourier Transform (FFT) and inverse FFT (iFFT) [14] for frequency manipulation and simply training on the transformed images. However, they have a limitation in that the optimum portion of the low-frequency has to be selected manually for optimal performance. In addition, since the semantic information of the original image may be lost in the process of replacing the low-frequency spectrum, the overfitting problem can occur with a fixed high-frequency spectrum [18]. To address the limitations mentioned above, we propose a novel adversarial training network based on self-adversarial disentanglement and frequency mixup strategy by exploiting the full scale of the frequency spectrum. In medical imaging, the amplitude refers to the intensity or brightness of a pixel in an image. The phase, on the other hand, represents the local orientation or direction of the intensity changes in the image. In intra-domain adaptation process, we get the intensity-shifted source domain by integrating the amplitude and phase from the intensity-transformed and the original source domain, respectively. Our proposed model learns to extract intensity-invariant representation based on a selfadversarial training approach [27] by leveraging the intensity-shifted and original source domain. The self-adversarial disentangling method can effectively pretrain models that are robust to intensity variations (i.e., domain shift problem). Based on the pretrained model using the source domain only, in the inter-domain adaptation process, we reconstruct a novel amplitude-mixed target domain by


mixing the amplitudes from the source and target domains, respectively, utilizing the mixup technique [25]. Unlike low-frequency domain replacement methods, mixing all-frequency domains can include low-level statistics from the target domain while effectively preserving low-level statistics of the source domain. Through domain transfer using the amplitude-mixed target domain, we solved the domain shift problem and demonstrated it in the brain disease classification task. The main contributions of this work are as follows: (1) We propose a novel image translation-based adversarial training network by frequency mixup manipulation to exploit the semantic information of the source and target domains without loss of information in the frequency domain. (2) We show the generalizability of our proposed method in the pretraining step through self-adversarial disentangling by utilizing the frequency manipulation of the intensity-shifted source domain. (3) Our proposed method outperforms the existing methods for UDA robustly.

2 Related Work
2.1 Frequency-Based UDA

Unsupervised domain adaptation (UDA) has been explored to transfer knowledge from a sufficiently labeled (source) domain to an unlabeled, unseen (target) domain. Recent studies [22,24] reveal that a simple alignment of the frequency domain between the source and target distributions can remarkably improve the performance of UDA. On the one hand, Yang et al. [22] propose Fourier domain adaptation (FDA), which replaces the source frequency with the target frequency at the low-frequency level to resolve the discrepancy between the source and target distributions. To be specific, the frequency replacement results in a reconstructed source image in the target style, which presents a reduced disparity between the different domains. They suggest that a simple Fourier transform operation can achieve state-of-the-art performance on domain adaptation benchmarks without requiring individual training for domain alignment. On the other hand, Zakazov et al. [24] propose a very light and transparent approach to perform test-time domain adaptation. The idea is to substitute the target low-level frequency space components that are deemed to reflect the style of an image. As such, most frequency-based methodologies reconstruct images by replacing low-level frequencies through Fourier transform operations in each domain. This demonstrates improved UDA performance through a simple alignment of low-level statistics between the source and target distributions. However, these methods encounter limitations in accurately discerning between low- and high-frequency regions, which imposes a challenge in manually pinpointing the optimal region for enhancing performance. To alleviate these problems, we introduce a frequency mixup strategy that exploits the full scale of the frequency spectrum.


Fig. 1. Overall framework of our proposed method that consists of intra-domain adaptation and inter-domain adaptation processes.

2.2 Adversarial Training for Domain-Invariant Features

Adversarial training [1,3,7,17] is a practical approach for learning domaininvariant features by leveraging adversarial learning to minimize the domain discrepancy between different datasets. The basic idea of adversarial training is to train models that generate realistic data samples and distinguish between actual and generated samples at the same time. In the context of domain-invariant feature representation, by using a discriminator that attempts to distinguish between source and target domain features, the feature encoder is encouraged to learn domain-invariant representations that are not discriminative with respect to domain labels. This minimizes the domain discrepancy, leading to more robust and transferable features. In recent studies, Levi et al. [11] learns a feature representation that is both robust and domain invariant. By using a variant of DANN on the source domain and its corresponding target domain, the proposed method learns a feature representation constrained not to discriminate between the source and target examples and can achieve a more robust representation. Yang et al. [23] proposes a novel dual-module network architecture to promote learning domain invariant features. Furthermore, they improved performance by using a discrepancy loss to find the discrepancy of the prediction results and the feature distribution between the two modules.

3 Proposed Method

Let X_s ∈ R^{h×w×d×1} denote a three-dimensional structural magnetic resonance imaging (sMRI) volume, and let Y_s refer to the category label in the source domain, i.e., D_s = {(X_s, Y_s)}. In contrast, there is no category label in the target domain, i.e., D_t = {(X_t)}. The goal of our proposed method is to train a classification


model on D_s and D_t that performs well on unseen target domains. As shown in Fig. 1, we describe the crucial components of our proposed framework, which comprises intra-domain adaptation for pretraining using frequency manipulation, an attention-based feature encoder, and inter-domain adaptation for domain transfer.

3.1 Intra-domain Adaptation

In the first step of the intra-domain adaptation process, we apply a random noise transformation to the source domain to create an intensity-transformed source domain X_is. From it we obtain an intensity-shifted source domain, which maintains the semantic characteristics of the source domain while carrying the information of a different intensity distribution. For this purpose, we utilize the FFT algorithm [14] to mix the information of the intensity-transformed and the original source domain: the amplitude of the intensity-transformed source domain and the phase of the original source domain are combined through the iFFT process to synthesize the intensity-shifted source domain. In detail, let A and P be the amplitude and phase components of the FFT F of an image. They are fed to the iFFT to generate the reconstructed intensity-shifted source domain D_IS as follows:

$\mathcal{D}_{IS} = \mathcal{F}^{-1}(\mathcal{A}(\mathcal{F}(X_{is})) \times \mathcal{P}(\mathcal{F}(X_{s}))). \qquad (1)$

We adopt a label classifier C_L for identifying the label of the given images and an intensity discriminator C_I, which plays the role of making the encoder E robust to intensity differences. The cross-entropy losses L_ce for minimizing C_L and maximizing C_I with a gradient reversal layer [3] are as follows:

$\mathcal{L}_{cls} = \mathcal{L}_{ce}(C_{L}(X_{s}, Y_{s})), \quad \mathcal{L}_{int} = \mathcal{L}_{ce}(C_{I}(X, Y_{i})). \qquad (2)$
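For concreteness, Eq. (1) can be sketched with NumPy's FFT routines as follows. The function name and the choice of intensity transformation are illustrative and not taken from the authors' implementation.

```python
import numpy as np

def intensity_shifted_source(x_src: np.ndarray, x_int: np.ndarray) -> np.ndarray:
    """Sketch of Eq. (1): combine the amplitude of the intensity-transformed
    source volume x_int with the phase of the original source volume x_src,
    then invert the FFT to obtain the intensity-shifted source domain D_IS.
    Both inputs are 3D sMRI volumes of the same shape."""
    amp = np.abs(np.fft.fftn(x_int))        # A(F(X_is)) from the transformed image
    phase = np.angle(np.fft.fftn(x_src))    # P(F(X_s)) from the original image
    recon = np.fft.ifftn(amp * np.exp(1j * phase))
    return np.real(recon)

# x_int could be, for example, the source volume with random intensity noise applied.
```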

3.2 Attention-Based Feature Encoder

We design a 3D convolutional neural network to extract features of brain MRIs from the source and target domains. The feature encoder E includes 10 convolutional layers with 3×3×3 kernels, each followed by batch normalization and ReLU. A downsampling operation is applied at the even-numbered convolution layers for hierarchical feature extraction. Previous studies [12,13,20] have demonstrated that brain disorders are highly associated with specific regions of the brain. Based on this proposition, we designed an attention module to automatically identify brain regions closely related to brain diseases in brain MRIs. As shown in Fig. 1, the mixed feature generated by the last layer of the feature encoder is used as the input of the proposed attention module. In the spatial attention module, the outputs of average pooling and max pooling over the mixed feature are concatenated. They then pass through a convolution layer to generate attention maps AM. Finally, the sigmoid function δ is used to calculate the attentive score of AM. Mathematically, the spatial attention map SA is defined as:

$SA = \delta(\mathrm{Conv}^{3\times3\times3}([AM_{max}, AM_{avg}])). \qquad (3)$
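A minimal sketch of such a spatial attention module (Eq. (3)) is shown below. The CBAM-style channel-wise pooling and the module name are our own assumptions rather than the authors' exact design.

```python
import torch
import torch.nn as nn

class SpatialAttention3D(nn.Module):
    """Sketch of Eq. (3): channel-wise max- and average-pooled maps of the
    mixed feature are concatenated, passed through a 3x3x3 convolution and
    a sigmoid to produce the spatial attention map SA."""

    def __init__(self):
        super().__init__()
        self.conv = nn.Conv3d(2, 1, kernel_size=3, padding=1)

    def forward(self, feat: torch.Tensor) -> torch.Tensor:
        # feat: (batch, C, H, W, D) mixed feature from the encoder
        am_max, _ = feat.max(dim=1, keepdim=True)   # AM_max
        am_avg = feat.mean(dim=1, keepdim=True)     # AM_avg
        sa = torch.sigmoid(self.conv(torch.cat([am_max, am_avg], dim=1)))
        return sa                                   # (batch, 1, H, W, D)

# The attention consistency of Eq. (4) can then be computed as,
# for example, ((sa_source - sa_target) ** 2).mean().
```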


To make our model robust to domain differences, we need to maintain attentional consistency between SA_s and SA_t, which represent the attention maps for the source domain and target domain, respectively. We design an attention consistency loss to transfer semantic information from the source domain to the target domain. The attention consistency loss, which calculates the mean squared difference between SA_s and SA_t, is defined as follows:

$\mathcal{L}_{att} = \frac{1}{N \times H \times W \times D} \sum_{i=1}^{N} \left\| SA_{s} - SA_{t} \right\|. \qquad (4)$

3.3 Inter-domain Adaptation

In the inter-domain adaptation process, we extract the frequency spectra of the source and target domains, respectively, using the FFT operation. Inspired by the mixup technique [25], we devise a novel image translation strategy that linearly interpolates between the amplitude spectra of the two domains. The frequency amplitude mixup FAM is defined as:

$\mathcal{FAM} = (1 - \lambda)\,\mathcal{A}(\mathcal{F}(X_{t})) + \lambda\,\mathcal{A}(\mathcal{F}(X_{s})), \qquad (5)$

where λ ∼ U(0, 1) is a random value within a fixed range. The mixed amplitude spectrum is combined with the phase of the target image and fed to the iFFT, generating the reconstructed amplitude-mixed target domain D_MT as follows:

$\mathcal{D}_{MT} = \mathcal{F}^{-1}(\mathcal{FAM} \times \mathcal{P}(\mathcal{F}(X_{t}))). \qquad (6)$
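Eqs. (5) and (6) can be sketched together as follows; again, the function name is hypothetical and only mirrors the formulas above.

```python
import numpy as np

def amplitude_mixup_target(x_src: np.ndarray, x_tgt: np.ndarray, lam: float = None) -> np.ndarray:
    """Sketch of Eqs. (5)-(6): linearly interpolate the amplitude spectra of
    the source and target volumes with a random lambda, keep the target
    phase, and invert the FFT to obtain the amplitude-mixed target D_MT."""
    if lam is None:
        lam = np.random.uniform(0.0, 1.0)   # lambda ~ U(0, 1)
    fam = (1.0 - lam) * np.abs(np.fft.fftn(x_tgt)) + lam * np.abs(np.fft.fftn(x_src))
    recon = np.fft.ifftn(fam * np.exp(1j * np.angle(np.fft.fftn(x_tgt))))
    return np.real(recon)
```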

To reduce the domain gap between the source and target domains, a domain classifier C_D is designed to distinguish MRI features from different domains, in the same way as in the intra-domain adaptation step:

$\mathcal{L}_{dom} = \mathcal{L}_{ce}(C_{D}(X, Y_{d})). \qquad (7)$

3.4 Objective Function

Our objective is formulated as a minimization, even though the domain classification loss, whose maximization is desired, enters with a negative sign. Since the domain classifier already includes a gradient reversal layer, backpropagation multiplies its gradient by a negative constant, which maximizes that loss term. As a result, we jointly minimize the label classification loss L_cls and the attention consistency loss L_att while maximizing the domain classification loss L_dom. The overall objective function of our proposed method is defined as follows:

$\mathcal{L}_{total} = \mathcal{L}_{cls} + \mathcal{L}_{att} - \mathcal{L}_{dom}. \qquad (8)$
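The gradient reversal mechanism referred to above can be sketched as a generic custom autograd function; this is a standard GRL implementation, not the authors' code, and the scaling constant alpha is an assumption.

```python
import torch

class GradReverse(torch.autograd.Function):
    """Gradient reversal layer [3]: identity in the forward pass, negated
    (scaled) gradient in the backward pass, so that minimizing the domain
    classification loss downstream maximizes it w.r.t. the encoder."""

    @staticmethod
    def forward(ctx, x, alpha: float = 1.0):
        ctx.alpha = alpha
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # negate (and scale) the gradient flowing back into the encoder
        return -ctx.alpha * grad_output, None

# Usage sketch: domain_logits = domain_classifier(GradReverse.apply(features, 1.0))
# With the GRL in place, the three loss terms can simply be summed during
# training; the reversal realizes the minus sign of Eq. (8) for the encoder.
```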


4 Experiments
4.1 Dataset

ADNI Dataset. We used the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset, a public dataset widely utilized in brain disease-related research [8]. It consists of ADNI-1, ADNI-2, and ADNI-3, which we treat as different site domains. For the sake of independent evaluation, we excluded from ADNI-1 and ADNI-2 any data that also belong to ADNI-3. After pruning, ADNI-1 contains 431 subjects with T1-weighted sMRIs, and ADNI-2 and ADNI-3 contain 360 and 398 subjects, respectively.

AIBL Dataset. To demonstrate the effectiveness of domain adaptation on other domains, we additionally used the Australian Imaging Biomarkers and Lifestyle Study of Ageing (AIBL) dataset, which seeks to discover which biomarkers, cognitive characteristics, and health and lifestyle factors determine the development of Alzheimer's disease. AIBL contains 577 subjects with T1-weighted sMRIs. We conducted the experiments by dividing the training and test data at a ratio of 8:2 for all subjects in each domain.

4.2 Implementation

In the intra-domain adaptation process, we pretrained the model for 50 epochs to adapt the feature encoder E from the source to the target domain. Subsequently, the trained feature encoder was fine-tuned for 100 epochs in the inter-domain adaptation process. The best model was selected based on the AUC score via simple hold-out validation. As implementation details for both steps (i.e., the intra-/inter-domain adaptation processes), Adam [10] is used as the optimizer with an initial learning rate of 1e-4, and the batch size is set to 4.

4.3 Experiments and Analysis

In the experiments, we compared our proposed network with state-of-the-art UDA methods [4,6,16], which have been widely used in modern medical imaging tasks. We utilized the structure of our backbone feature encoder for the experiments with DANN and Deep-CORAL. To demonstrate the validity of our proposed method across various metrics, we utilized four metrics for performance evaluation: accuracy (ACC), sensitivity (SEN), specificity (SPE), and the area under the ROC curve (AUC), which evaluate classification performance. We conducted an experiment, as shown in Table 1, based on a scenario for domain adaptation from the source domain to the target domain. We can see that the overall performance of our proposed method is better than that of the other UDA approaches. This demonstrates that 1) Fourier frequency manipulation-based self-adversarial disentanglement in intra-domain adaptation and 2) frequency mixup-based domain transfer in inter-domain adaptation in our proposed


Table 1. Performance of our proposed method and baseline methods in AD identification (i.e., AD vs. CN classification) in different domain transfer settings.

Source → Target   | Method          | ACC   | SEN   | SPE   | AUC
ADNI-1 → ADNI-2   | DANN [4]        | 84.77 | 75.00 | 92.50 | 83.75
                  | Deep-CORAL [16] | 84.72 | 75.00 | 92.50 | 83.75
                  | AD2A [6]        | 86.11 | 75.00 | 95.00 | 85.00
                  | Ours            | 90.28 | 81.25 | 97.50 | 89.37
ADNI-1 → ADNI-3   | DANN [4]        | 88.75 | 84.62 | 89.55 | 87.08
                  | Deep-CORAL [16] | 85.00 | 76.92 | 86.57 | 81.74
                  | AD2A [6]        | 88.75 | 92.31 | 88.06 | 90.18
                  | Ours            | 90.00 | 100.0 | 88.06 | 94.03
ADNI-2 → ADNI-1   | DANN [4]        | 77.01 | 69.23 | 83.33 | 76.28
                  | Deep-CORAL [16] | 74.71 | 64.10 | 83.33 | 73.72
                  | AD2A [6]        | 78.16 | 92.31 | 66.67 | 79.49
                  | Ours            | 81.61 | 89.74 | 75.00 | 82.37
ADNI-2 → ADNI-3   | DANN [4]        | 81.25 | 84.62 | 80.59 | 82.61
                  | Deep-CORAL [16] | 83.75 | 69.23 | 86.57 | 77.90
                  | AD2A [6]        | 83.75 | 100.0 | 80.60 | 90.30
                  | Ours            | 86.25 | 100.0 | 83.58 | 91.79
ADNI-1 → AIBL     | DANN [4]        | 74.14 | 69.56 | 75.27 | 72.42
                  | Deep-CORAL [16] | 87.07 | 65.22 | 88.17 | 80.46
                  | AD2A [6]        | 85.34 | 73.91 | 88.17 | 81.04
                  | Ours            | 89.65 | 78.26 | 92.47 | 85.37
ADNI-2 → AIBL     | DANN [4]        | 67.24 | 91.30 | 61.29 | 76.30
                  | Deep-CORAL [16] | 71.55 | 52.17 | 76.34 | 64.26
                  | AD2A [6]        | 71.55 | 82.61 | 68.82 | 75.71
                  | Ours            | 83.62 | 86.96 | 82.79 | 84.88

Table 2. Performance of our proposed method and ablating Fourier frequency manipulation (FFM) in intra-domain adaptation.

Source → Target     Method    ACC    SEN    SPE    AUC
ADNI-1 → ADNI-2     w/o FFM   87.50  71.87  100.0  85.93
                    Ours      90.28  81.25  97.50  89.37
ADNI-1 → ADNI-3     w/o FFM   80.00  92.31  77.62  84.96
                    Ours      90.00  100.0  88.06  94.03
ADNI-2 → ADNI-1     w/o FFM   77.01  76.92  77.08  77.00
                    Ours      81.61  89.74  75.00  82.37
ADNI-2 → ADNI-3     w/o FFM   85.00  100.0  82.09  91.04
                    Ours      86.25  100.0  83.58  91.79
ADNI-1 → AIBL       w/o FFM   85.34  73.91  88.17  81.04
                    Ours      89.65  78.26  92.47  85.37
ADNI-2 → AIBL       w/o FFM   81.90  69.56  84.95  77.25
                    Ours      83.62  86.96  82.79  84.88

We also visualized the domain distribution adapted by our proposed method and the original domain distribution in Fig. 2 to verify the effectiveness of our proposed model in distribution alignment.

Fig. 2. Visualization of (a) the original distribution and (b) the distribution after adaptation by our proposed method for each domain (i.e., ADNI-1, ADNI-2, AIBL).

Table 3. Performance of our proposed method and ablating attention consistency loss (AC).

Source → Target     Method    ACC    SEN    SPE    AUC
ADNI-1 → ADNI-2     w/o AC    87.50  81.25  92.50  86.87
                    Ours      90.28  81.25  97.50  89.37
ADNI-1 → ADNI-3     w/o AC    85.00  92.31  83.58  87.94
                    Ours      90.00  100.0  88.06  94.03
ADNI-2 → ADNI-1     w/o AC    77.01  76.92  77.08  77.00
                    Ours      81.61  89.74  75.00  82.37
ADNI-2 → ADNI-3     w/o AC    82.50  100.0  79.10  89.55
                    Ours      86.25  100.0  83.58  91.79
ADNI-1 → AIBL       w/o AC    88.79  69.56  93.55  81.56
                    Ours      89.65  78.26  92.47  85.37
ADNI-2 → AIBL       w/o AC    77.59  82.61  76.34  79.48
                    Ours      83.62  86.96  82.79  84.88


Table 4. Performance of our proposed method and adopting two image translation strategies in inter-domain adaptation.

Source → Target     Method      ACC    SEN    SPE    AUC
ADNI-1 → ADNI-2     Fda [22]    80.55  87.50  75.00  81.25
                    Mixup [25]  88.89  84.37  92.50  88.44
                    Ours        90.28  81.25  97.50  89.37
ADNI-1 → ADNI-3     Fda [22]    88.75  92.31  88.06  90.18
                    Mixup [25]  87.50  92.31  86.57  89.44
                    Ours        90.00  100.0  88.06  94.03
ADNI-2 → ADNI-1     Fda [22]    78.17  66.67  87.50  77.08
                    Mixup [25]  75.86  61.54  87.50  74.52
                    Ours        81.61  89.74  75.00  82.37
ADNI-2 → ADNI-3     Fda [22]    80.00  100.0  76.12  88.06
                    Mixup [25]  83.75  100.0  80.60  90.30
                    Ours        86.25  100.0  83.58  91.79
ADNI-1 → AIBL       Fda [22]    81.90  78.26  82.79  80.53
                    Mixup [25]  85.34  73.91  88.17  81.04
                    Ours        89.65  78.26  92.47  85.37
ADNI-2 → AIBL       Fda [22]    80.17  78.26  80.64  79.45
                    Mixup [25]  74.14  82.61  72.04  77.32
                    Ours        83.62  86.96  82.79  84.88

4.4 Ablation Analysis

In order to assess the efficacy of self-adversarial disentanglement using Fourier frequency manipulation, we conducted an ablation experiment with and without Fourier frequency manipulation in the intra-domain adaptation process. As seen in Table 2, utilizing Fourier frequency manipulation for the intensity-shifted source domain results in better performance across the overall evaluation metrics. This reveals that manipulating the frequencies of the source domain with the Fourier transform can make the model robust to intensity differences. Our proposed combination with the attention consistency loss empowers domain-invariant semantic representations, thus enhancing diagnosis performance on unseen target domains. Beyond the attention consistency loss, our attention module helps highlight discriminative regions across different domains, whereas the others can only focus on a single domain. To verify these attention mechanisms, we conducted an ablation experiment with and without computing the attention consistency loss in the inter-domain adaptation process. From Table 3, we can conclude that the attention consistency loss is useful in boosting learning performance.


We also compared our approach with the previous low-frequency replacing method (i.e., Fda [22]) and the vanilla mixup method [25] to demonstrate the effectiveness of the proposed frequency mixup strategy, as shown in Table 4. The Fda method, which replaces low-frequency components, causes overfitting in the high-frequency region due to the loss of low-frequency information [18] and showed poor results across the overall evaluation metrics. The results of adopting the vanilla mixup technique are slightly better than those using Fda but worse than our proposed method. Through this ablation study, we demonstrated the effectiveness of our mixup technique at the frequency semantic level over other mixing strategies.
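To make the contrast between the compared strategies concrete, the sketch below mixes the amplitude spectra of a source and a target scan (keeping the source phase) and, for comparison, performs an Fda-style low-frequency replacement. This is a simplified illustration, not the paper's exact frequency mixup formulation, and the mixing coefficient lam and band ratio beta are assumed hyperparameters.

```python
import numpy as np

def frequency_mixup(src, tgt, lam):
    # Mix the amplitude spectra of a source and a target scan while keeping the
    # source phase, then transform back to image space.
    fs, ft = np.fft.fftn(src), np.fft.fftn(tgt)
    amp = lam * np.abs(fs) + (1.0 - lam) * np.abs(ft)   # mixup on the amplitude spectrum
    mixed = amp * np.exp(1j * np.angle(fs))             # keep the source phase (semantics)
    return np.real(np.fft.ifftn(mixed))

def fda_low_freq_swap(src, tgt, beta=0.05):
    # Fda-style low-frequency replacement [22]: copy the central low-frequency
    # amplitudes of the target into the source spectrum.
    fs = np.fft.fftshift(np.fft.fftn(src))
    ft = np.fft.fftshift(np.fft.fftn(tgt))
    amp_s, amp_t = np.abs(fs), np.abs(ft)
    center = tuple(s // 2 for s in src.shape)
    half = tuple(max(1, int(beta * s)) for s in src.shape)
    sl = tuple(slice(c - h, c + h) for c, h in zip(center, half))
    amp_s[sl] = amp_t[sl]
    mixed = amp_s * np.exp(1j * np.angle(fs))
    return np.real(np.fft.ifftn(np.fft.ifftshift(mixed)))
```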

5 Conclusion

In this paper, we proposed a frequency mixup manipulation-based unsupervised domain adaptation model to alleviate domain shifts in brain disease identification. The proposed model comprises two main steps: intra-domain adaptation and inter-domain adaptation. In the intra-domain adaptation step, a pretraining process is conducted to enhance the intensity-invariant feature extraction capability of the model. This is achieved by using self-adversarial disentangling with frequency manipulation-based intensity-shifted domains. In the inter-domain adaptation step, a domain transfer process is performed, where the reconstructed image through frequency mixup is used to train a model that is robust to domain adaptation. Our experimental results demonstrate that the proposed method outperforms state-of-the-art UDA methods in terms of accuracy and effectiveness. Acknowledgements. This work was supported by National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) No. 2022R1A2C2006865 (Development of deep learning techniques for data-driven medical knowledge graph generation and interpretable multi-modal electronic health records analysis) and No. 2019-0-00079 (Department of Artificial Intelligence (Korea University)).

References 1. Bai, T., Luo, J., Zhao, J., Wen, B., Wang, Q.: Recent advances in adversarial training for adversarial robustness. arXiv preprint arXiv:2102.01356 (2021) 2. Ben-David, S., Blitzer, J., Crammer, K., Pereira, F.: Analysis of representations for domain adaptation. In: Advances in Neural Information Processing Systems, vol. 19 (2006) 3. Ganin, Y., Lempitsky, V.: Unsupervised domain adaptation by backpropagation. In: International Conference on Machine Learning, pp. 1180–1189. PMLR (2015) 4. Ganin, Y., et al.: Domain-adversarial training of neural networks. J. Mach. Learn. Res. 17(1), 2096–2030 (2016) 5. Guan, H., Liu, M.: Domain adaptation for medical image analysis: a survey. IEEE Trans. Biomed. Eng. 69(3), 1173–1185 (2021)


6. Guan, H., Liu, Y., Yang, E., Yap, P.T., Shen, D., Liu, M.: Multi-site MRI harmonization via attention-guided deep domain adaptation for brain disorder identification. Med. Image Anal. 71, 102076 (2021) 7. Hoffman, J., et al.: CyCADA: cycle-consistent adversarial domain adaptation. In: International Conference on Machine Learning, pp. 1989–1998. PMLR (2018) 8. Jack, C.R., Jr., et al.: The Alzheimer’s disease neuroimaging initiative (ADNI): MRI methods. J. Magn. Reson. Imaging Official J. Int. Soc. Magn. Reson. Med. 27(4), 685–691 (2008) 9. Khan, P., et al.: Machine learning and deep learning approaches for brain disease diagnosis: principles and recent advances. IEEE Access 9, 37622–37655 (2021) 10. Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) 11. Levi, M., Attias, I., Kontorovich, A.: Domain invariant adversarial learning. arXiv preprint arXiv:2104.00322 (2021) 12. Lian, C., Liu, M., Zhang, J., Shen, D.: Hierarchical fully convolutional network for joint atrophy localization and Alzheimer’s disease diagnosis using structural MRI. IEEE Trans. Pattern Anal. Mach. Intell. 42(4), 880–893 (2018) 13. Mu, Y., Gage, F.H.: Adult hippocampal neurogenesis and its role in Alzheimer’s disease. Mol. Neurodegener. 6(1), 1–9 (2011) 14. Nussbaumer, H.J.: The fast Fourier transform. In: Fast Fourier Transform and Convolution Algorithms. Springer Series in Information Sciences, vol. 2. Springer, Heidelberg (1982). https://doi.org/10.1007/978-3-642-81897-4_4 15. Sharifzadeh, M., Tehrani, A.K., Benali, H., Rivaz, H.: Ultrasound domain adaptation using frequency domain analysis. In: 2021 IEEE International Ultrasonics Symposium, pp. 1–4. IEEE (2021) 16. Sun, B., Feng, J., Saenko, K.: Correlation alignment for unsupervised domain adaptation. In: Csurka, G. (ed.) Domain Adaptation in Computer Vision Applications. ACVPR, pp. 153–171. Springer, Cham (2017). https://doi.org/10.1007/978-3-31958347-1_8 17. Tzeng, E., Hoffman, J., Saenko, K., Darrell, T.: Adversarial discriminative domain adaptation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7167–7176 (2017) 18. Wang, H., Wu, X., Huang, Z., Xing, E.P.: High-frequency component helps explain the generalization of convolutional neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8684– 8694 (2020) 19. Wilson, G., Cook, D.J.: A survey of unsupervised deep domain adaptation. ACM Trans. Intell. Syst. Technol. 11(5), 1–46 (2020) 20. Woo, S., Park, J., Lee, J.-Y., Kweon, I.S.: CBAM: convolutional block attention module. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11211, pp. 3–19. Springer, Cham (2018). https://doi.org/10.1007/9783-030-01234-2_1 21. Yang, C., Guo, X., Chen, Z., Yuan, Y.: Source free domain adaptation for medical image segmentation with Fourier style mining. Med. Image Anal. 79, 102457 (2022) 22. Yang, Y., Soatto, S.: FDA: Fourier domain adaptation for semantic segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 4085–4095 (2020) 23. Yang, Y., Zhang, T., Li, G., Kim, T., Wang, G.: An unsupervised domain adaptation model based on dual-module adversarial training. Neurocomputing 475, 102–111 (2022)


24. Zakazov, I., Shaposhnikov, V., Bespalov, I., Dylov, D.V.: Feather-light Fourier domain adaptation in magnetic resonance imaging. In: Kamnitsas, K., et al. (eds.) Domain Adaptation and Representation Transfer, DART 2022. LNCS, vol. 13542, pp. 88–97. Springer, Cham (2022). https://doi.org/10.1007/978-3-031-16852-9_9 25. Zhang, H., Cisse, M., Dauphin, Y.N., Lopez-Paz, D.: mixup: beyond empirical risk minimization. arXiv preprint arXiv:1710.09412 (2017) 26. Zhang, L., Wang, M., Liu, M., Zhang, D.: A survey on deep learning for neuroimaging-based brain disorder analysis. Front. Neurosci. 14, 779 (2020) 27. Zhou, Q., Gu, Q., Pang, J., Lu, X., Ma, L.: Self-adversarial disentangling for specific domain adaptation. arXiv preprint arXiv:2108.03553 (2021)

SSC-l0: Sparse Subspace Clustering with the l0 Inequality Constraint

Yangbo Wang1,2, Jie Zhou1,2,3(B), Qingshui Lin4, Jianglin Lu5, and Can Gao2,3

1 National Engineering Laboratory for Big Data System Computing Technology, Shenzhen University, Shenzhen 518060, China
[email protected], [email protected]
2 College of Computer Science and Software Engineering, Shenzhen University, Shenzhen 518060, China
3 SZU Branch, Shenzhen Institute of Artificial Intelligence and Robotics for Society, Shenzhen 518060, China
4 Basic Teaching Department, Liaoning Technical University, Huludao 125105, China
5 College of Engineering, Northeastern University, Boston, MA 02115, USA
[email protected]

Abstract. Self-expression learning methods often obtain a coefficient matrix to measure the similarity between pairs of samples. However, directly using all points to represent a fixed sample in a class under the self-expression framework may not be ideal, as points from other classes participate in the representing process. To alleviate this issue, this study attempts to achieve representation learning between points only coming from the same class. In practice, it is easier for data points from the same class to represent each other than that from different classes. So, when reconstructing a point, if the number of non-zero elements in the coefficient vector is limited, a model is more likely to select data points from the class where the reconstructed point lies to complete the reconstruction work. Based on this idea, we propose Sparse Subspace Clustering with the l0 inequality constraint (SSC-l0 ). In SSC-l0 , the l0 inequality constraint determines the maximum number of non-zero elements in the coefficient vector, which helps SSC-l0 to conduct representation learning among the points in the same class. After introducing the simplex constraint to ensure the translation invariance of the model, an optimization method concerning l0 inequality constraint is formed to solve the proposed SSC-l0 , and its convergence is theoretically analyzed. Extensive experiments on well-known datasets demonstrate the superiority of SSC-l0 compared to several state-of-the-art methods. Keywords: Sparse subspace clustering · l0 inequality constraint simplex constraint · self-expression learning · graph learning


This work was supported by the National Natural Science Foundation of China (No. 62076164), Shenzhen Science and Technology Program (No. JCYJ20210324094601005), and Guangdong Basic and Applied Basic Research Foundation (No. 2021A1515011861). c The Author(s), under exclusive license to Springer Nature Switzerland AG 2023  H. Lu et al. (Eds.): ACPR 2023, LNCS 14408, pp. 136–149, 2023. https://doi.org/10.1007/978-3-031-47665-5_12

1 Introduction

As an important branch of the field of pattern recognition, clustering has been extensively developed in the past few decades. Commonly used clustering methods include prototype-based clustering [1], matrix factorization-based clustering [2], and graph-based clustering [3], etc. Among them, graph-based clustering has attracted much attention because it can exploit the geometrical structure information of the data [4,5]. As a typical graph-based clustering method, Spectral Clustering (SC) [6] often achieves superior clustering performance when handling datasets with high dimensions. In the spectral clustering method, a low-dimensional representation of data is first constructed by utilizing the predetermined similarity matrix. Subsequently, SC yields the discrete clustering result by calling the spectral rotation method [7] or using k-Means methods [8] based on the relaxed spectral solutions. So, the similarity matrix plays a critical role in spectral clustering. Similarity matrix construction methods often include two types, point-pairwise distance-based methods and self-expression learning methods-based. The typical point-pairwise distance-based methods include the Gaussian kernel method, adaptive neighbors [9,10], typicality-aware adaptive graph [11], and so on. These methods construct sparse similarity matrices by learning sample neighborhood structure information. The self-expression learning methods, including the Sparse Subspace Clustering (SSC) [12], Low-Rank Representation (LRR) [13], Block-Diagonal Representation (BDR) [14], and their extensions [15,16], have been reported and achieved desired performance. Among them, SSC [12] aims to group data drawn from a union of multiple linear subspaces. When the subspaces from which data are drawn are independent, SSC can obtain desired performance by learning a sparse affinity matrix. LRR [13] minimizes the rank of the coefficient matrix to recover the row space of the data. BDR [14] proposes a novel regularizer for directly pursuing a coefficient matrix with the block diagonal property. Iteratively Reweighted Least Squares (IRLS) [15] solves a joint low-rank and sparse minimization problem. Least Squares Regression (LSR) [16] minimizes the Fnorm of the coefficient matrix to achieve that the correlated samples have similar coefficient vectors. Simplex Sparse Representation (SSR) [17] introduces the simplex constraint to ensure the translation invariance of the model. The above self-expression learning-based methods seek sparse or low-rank representation coefficient matrices to measure the similarity between pairs of samples. However, in these methods, the way that directly uses all sample points to represent a fixed sample may not be ideal since points from other classes also participate in the representing process. Therefore, the generated similarity matrix is not reliable and affects the performance of downstream tasks. To alleviate this issue, we propose Sparse Subspace Clustering with the l0 inequality constraint (SSC-l0 ). SSC-l0 aims to achieve a representation learning mechanism in which only the points coming from the same class are expected to be used for representing each sample in this class. The main contributions of this work are listed as follows:


1. A new self-expression learning method, named Sparse Subspace Clustering with the l0 inequality constraint (SSC-l0), is proposed. In SSC-l0, the l0 inequality constraint limits the number of non-zero elements in each coefficient vector, which helps SSC-l0 conduct representation learning among the points in the same class.
2. By introducing the simplex constraint to ensure the translation invariance of the model, an optimization method concerning the l0 inequality constraint is presented to solve the proposed SSC-l0, and its convergence is theoretically analyzed. Since the l0 inequality constraint problem is difficult to solve, the proposed optimization method has the potential to be widely used in sparse learning models.
3. Extensive experiments on benchmark datasets demonstrate the superiority of SSC-l0.

The rest of this paper is organized as follows. In Sect. 2, we introduce some notations and the related methods. In Sect. 3, we elaborate on the proposed SSC-l0 and its corresponding optimization algorithm. Experimental results are reported in Sect. 4. Finally, Sect. 5 concludes the paper.

2 Related Works

2.1 Notations

In this study, bold uppercase letters and lowercase italic letters are used to stand for matrices and scalars, respectively; for example, X = [x1; x2; · · · ; xn] ∈ R^{n×d} represents a data matrix, where d and n denote the dimension and the number of samples, respectively. Bold lowercase letters stand for vectors, e.g., xi ∈ R^{1×d} is a sample point. Tr(A) and A^T stand for the trace and the transpose of matrix A, respectively. A ≥ 0 means that each element aij ≥ 0 in A.

2.2 Sparse Subspace Clustering

Let {xj ∈ R^{1×d}}_{j=1}^{n} be a set of data points drawn from a union of c independent linear subspaces {Si}_{i=1}^{c}. Let ni be the number of data points drawn from the subspace Si, di the dimension of Si, and Yi the data matrix corresponding to subspace Si. If ni ≥ di, the points drawn from the subspace Si are self-expressive [12]. This means that if x is a new data point in Si, then it can be represented as a linear combination of di points in the same subspace. Denote by zi ∈ R^{1×ni} the fragment of the coefficient vector corresponding to the data matrix Yi. If we let X be a data matrix with proper permutation, i.e., X = [Y1, Y2, · · · , Yc], then for any data point x ∈ Si there exists a vector s = [z1, z2, · · · , zc] ∈ R^{1×n} such that x = sX, where zi ≠ 0 and zj = 0 for all j ≠ i. Such an s can be sought by solving the following problem:

\min_{s} \|s\|_0 \quad \text{s.t.}\; x = sX,    (1)


where ||s||_0 represents the number of non-zero elements in the row vector s. This can be proved by the following Theorem 1.

Theorem 1. Denote by s* the optimal solution to the problem (1). Let X be a data matrix with proper permutation and x ∈ Si; then s* = [z1, z2, · · · , zc] ∈ R^{1×n}, where zi ≠ 0 and zj = 0 for all j ≠ i.

Proof. Let s* = s*↑ + s*↓, where s*↑ = [0, 0, · · · , zi, · · · , 0] ∈ R^{1×n} and s*↓ = [z1, z2, · · · , zi−1, 0, zi+1, · · · , zc] ∈ R^{1×n}. Theorem 1 holds if we show that s*↓ = 0. Since s* = s*↑ + s*↓, we have x = s*X = (s*↑ + s*↓)X. According to the independence assumption [12], x ∈ Si, s*↑X ∈ Si, and s*↓X ∉ Si. Thus, we have s*↓X = 0. This implies that

x = s*X = s*↑X,    (2)

which means that s*↑ is also a solution of Eq. (1). If s*↓ ≠ 0, we would have ||s*↑||_0 < ||s*↑ + s*↓||_0 = ||s*||_0, which conflicts with s* being the optimal solution to Eq. (1). So we have s*↓ = 0, and Theorem 1 holds. □

The matrix format of Eq. (1) is as follows:

\min_{S} \sum_{i=1}^{n} \|s_i\|_0 \quad \text{s.t.}\; X = SX,    (3)

where S = [s1; s2; · · · ; sn] ∈ R^{n×n} is a coefficient matrix, and si ∈ R^{1×n} is the coefficient vector corresponding to the point xi. Theorem 1 demonstrates that when the subspaces are independent, any sample xi can be represented by the samples coming from the same class, and sij = 0 if samples xj and xi come from different classes. According to Theorem 1, when X is properly permuted and the subspaces are independent, the coefficient matrix S has a block diagonal structure, which can be used to improve clustering performance.

3 Sparse Subspace Clustering with the l0 Inequality Constraint

3.1 Motivation of SSC-l0

Most self-expression learning methods seek a sparse or low-rank representation coefficient matrix and minimize the difference between the original samples and their reconstructed estimations. The objective functions of SSC, LRR, and their extensions [14–16] can be generalized as follows:

\min_{S} \|X - SX\|_F^2 + \gamma R(S),    (4)


where S is a reconstruction coefficient matrix, ||X − SX||_F^2 indicates the self-expression reconstruction loss, and R(S) is a regularization term, which guarantees that S is sparse or low-rank, or has a strict block diagonal structure, etc. Numerous studies have shown that self-expression learning methods achieve impressive performance in clustering tasks [13]. However, in Eq. (4), directly using all points to represent a fixed sample in a class may degrade the quality of the learned similarity matrix, as points from other classes also participate in the representing process. To alleviate this issue, we attempt to achieve representation learning between points of the same class. However, this is not easy because the ground truth labels are not provided beforehand. According to Theorem 1, the permuted coefficient matrix has a block diagonal structure when the subspaces from which the data are drawn are independent. Thus, such block diagonal structures can be used to guide the similarity measurements among samples. However, real data often do not satisfy the subspace independence assumption [18]. In practice, it is easier for data points from the same class to represent each other than for points from different classes. Thus, when reconstructing a point, if the number of non-zero elements in the coefficient vector is limited, a model is more likely to select data points from the class where the reconstructed point lies to complete the reconstruction process. This means that introducing the l0 inequality constraint may improve the performance of the self-expression learning model.

3.2 Objective Function of SSC-l0

According to the discussion in Sect. 3.1, introducing the l0 inequality constraint may improve the performance of the self-expression learning model. The proposed objective function is as follows:

\min_{S} \|X - SX\|_F^2 + \gamma\|S\|_F^2 \quad \text{s.t.}\; S \ge 0,\; S\mathbf{1}^T = \mathbf{1}^T,\; \|s_i\|_0 \le k \; (i = 1, 2, \cdots, n),    (5)

where S = [s1 ; s2 ; · · · ; sn ] ∈ Rn×n is a reconstruction coefficient matrix, γ is a non-negative balance parameter, and k is a constant. The constraint ||si ||0 ≤ k means that the max number of non-zero elements in the coefficient vector si is k. By introducing the l0 inequality constraint ||si ||0 ≤ k, in the first term of Eq. (5), each sample is represented as a linear combination of the samples that are more likely from the class that the represented sample belongs to. Since the coefficient matrix is non-negative, the elements in the coefficient matrix can reflect the similarity between the sample pairs. The l2 norm ||si ||22 can be utilized to avoid overfitting on si . The constraint si 1T = 1 can be used to ensure the translation invariance [17].

3.3 Optimization of SSC-l0

First, because of the difficulty of solving model Eq. (5), we turn to optimizing the following model:

\min_{S, M} \|X - MX\|_F^2 + \alpha\|M - S\|_F^2 + \gamma\|S\|_F^2 \quad \text{s.t.}\; S \ge 0,\; S\mathbf{1}^T = \mathbf{1}^T,\; \|s_i\|_0 \le k \; (i = 1, 2, \cdots, n),    (6)

where α is a non-negative parameter. Problem (6) reduces to problem (5) when α is large enough, and problem (6) can be minimized using the following iterative algorithm.

Updating M. When S is fixed, problem (6) becomes:

\min_{M} \|X - MX\|_F^2 + \alpha\|M - S\|_F^2.    (7)

Setting the derivative of Eq. (7) with respect to M to zero gives:

M = \left(XX^T + \alpha S\right)\left(XX^T + \alpha I\right)^{-1}.    (8)
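A minimal NumPy sketch of the closed-form M-update in Eq. (8) is given below; it uses an explicit inverse for clarity, although a linear solve would be preferable numerically.

```python
import numpy as np

def update_M(X, S, alpha):
    # Closed-form solution of Eq. (8): M = (X X^T + alpha S)(X X^T + alpha I)^(-1).
    # X has samples as rows (n x d), so G = X X^T is the n x n Gram matrix.
    G = X @ X.T
    n = G.shape[0]
    return (G + alpha * S) @ np.linalg.inv(G + alpha * np.eye(n))
```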

Updating S. When M is fixed, problem (6) becomes:

\min_{S} \alpha\|M - S\|_F^2 + \gamma\|S\|_F^2 \quad \text{s.t.}\; S \ge 0,\; S\mathbf{1}^T = \mathbf{1}^T,\; \|s_i\|_0 \le k \; (i = 1, 2, \cdots, n).    (9)

Since problem (9) is separable over the rows si, it can be decomposed into the subproblems

\min_{s_i} \alpha\|m_i - s_i\|_2^2 + \gamma\|s_i\|_2^2 \quad \text{s.t.}\; s_i \ge 0,\; s_i\mathbf{1}^T = 1,\; \|s_i\|_0 \le k
\;\Longleftrightarrow\;
\min_{s_i} O(s_i) = \|s_i - u_i\|_2^2 \quad \text{s.t.}\; s_i \ge 0,\; s_i\mathbf{1}^T = 1,\; \|s_i\|_0 \le k,    (10)

where u_i = \frac{\alpha}{\alpha+\gamma} m_i. Obviously, Eq. (10) seems to be an NP-hard problem. Some methods sparsify the vector si obtained from all samples [19]; however, the convergence of such a solving method cannot be guaranteed theoretically. Here, we attempt to identify an equivalent problem to Eq. (10) that is readily solvable. Denote by Γ̌(i) the set of indices corresponding to the first k largest elements in u_i. Then, the optimal solution to the following problem (11) is also an optimal solution to the problem (10). To prove this, we further propose the following definitions, lemmas, and theorems.

O_1 = \min_{s_i \ge 0,\; s_i\mathbf{1}^T = 1,\; s_{ij} = 0 \;\text{if}\; j \notin \check{\Gamma}(i)} \; \sum_{j \in \check{\Gamma}(i)} (s_{ij} - u_{ij})^2.    (11)


Definition 1. Let si satisfy the constraints of Eq. (10). If there is at least one non-zero element s_{ij} (s_{ij} > 0) in si that satisfies j ∉ Γ̌(i), then si is called a ζ̌ solution of Eq. (10).

Definition 2. Π(si, Γ̌(i)) = \sum_{j=1}^{n} A(s_{ij}, j), where A(s_{ij}, j) = 1 if s_{ij} > 0 and j ∈ Γ̌(i); otherwise A(s_{ij}, j) = 0.

Lemma 1. For any ζ̌ solution \ddot{s}_i of Eq. (10), there exists another solution \tilde{s}_i of Eq. (10) that satisfies Π(\tilde{s}_i, Γ̌(i)) = Π(\ddot{s}_i, Γ̌(i)) + 1 such that O(\tilde{s}_i) ≤ O(\ddot{s}_i).

Proof. Let τ ∉ Γ̌(i) and ℓ ∈ Γ̌(i) be indices that satisfy \ddot{s}_i = [· · · , \ddot{s}_{iτ}, · · · , \ddot{s}_{iℓ}, · · · ] and \tilde{s}_i = [· · · , \tilde{s}_{iτ}, · · · , \tilde{s}_{iℓ}, · · · ], where u_{iτ} ≤ u_{iℓ}, \ddot{s}_{iℓ} = 0, \tilde{s}_{iτ} = 0, \tilde{s}_{iℓ} = \ddot{s}_{iτ} > 0, and \tilde{s}_{ij} = \ddot{s}_{ij} for any j ≠ τ and j ≠ ℓ. Thus, one has:

(\ddot{s}_{iτ} − u_{iτ})^2 + (\ddot{s}_{iℓ} − u_{iℓ})^2 ≥ (\tilde{s}_{iτ} − u_{iτ})^2 + (\tilde{s}_{iℓ} − u_{iℓ})^2,    (12)

which means O(\tilde{s}_i) ≤ O(\ddot{s}_i). Thus, Lemma 1 holds. □

Lemma 2. For any ζ̌ solution \ddot{s}_i of Eq. (10), there exists a non-ζ̌ solution \tilde{s}_i of Eq. (10) such that O(\tilde{s}_i) ≤ O(\ddot{s}_i).

Proof. Lemma 2 holds when Lemma 1 holds. □

Lemma 3. Let s*_i be the optimal solution of Eq. (10); then there exists a non-ζ̌ solution \dddot{s}^*_i that satisfies O(\dddot{s}^*_i) = O(s*_i).

Proof. Obviously, Lemma 3 holds when Lemma 2 holds. □

Lemma 3 demonstrates that if s*_i is an optimal solution of Eq. (10) and it is a ζ̌ solution, then there exists a non-ζ̌ solution \dddot{s}^*_i which is also an optimal solution to the problem (10).

Theorem 2. Let \bar{s}^*_i be the optimal solution to the problem (11); then \bar{s}^*_i is also an optimal solution to the problem (10).

Proof. Let s*_i be the optimal solution of Eq. (10). According to Lemma 3, there exists a non-ζ̌ solution \dddot{s}^*_i that satisfies O(\dddot{s}^*_i) = O(s*_i), and it has:

O(\dddot{s}^*_i) = σ̌ + O_1(\dddot{s}^*_i),    (13)

where σ̌ = \sum_{j=1,\, j \notin \check{\Gamma}(i)}^{n} (0 − u_{ij})^2, and \dddot{s}^*_i ≥ 0, \dddot{s}^*_i 1^T = 1, ||\dddot{s}^*_i||_0 ≤ k. Considering that Γ̌(i) is fixed, σ̌ is a constant, and we have O(\bar{s}^*_i) = σ̌ + O_1(\bar{s}^*_i). Since \bar{s}^*_i is the optimal solution to the problem (11), we have O_1(\bar{s}^*_i) ≤ O_1(\dddot{s}^*_i). So, O(\bar{s}^*_i) ≤ O(\dddot{s}^*_i) = O(s*_i). Thus, Theorem 2 holds. □

The problem (11) can be transformed into a vector format, and we can solve it easily by utilizing the KKT conditions and the Newton method [17].
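The row-wise S-update can therefore be sketched as follows: restrict the support to the k largest entries of u_i and project that subvector onto the probability simplex. The sort-based projection used here is a standard substitute for the KKT/Newton solver of [17] and is only an illustrative assumption.

```python
import numpy as np

def project_simplex(v):
    # Euclidean projection of v onto the probability simplex {p >= 0, sum(p) = 1}.
    u = np.sort(v)[::-1]
    css = np.cumsum(u)
    rho = np.nonzero(u + (1.0 - css) / np.arange(1, len(v) + 1) > 0)[0][-1]
    theta = (1.0 - css[rho]) / (rho + 1)
    return np.maximum(v + theta, 0.0)

def update_s_row(m_i, alpha, gamma, k):
    # Row-wise S-update solving problems (10)/(11).
    u = (alpha / (alpha + gamma)) * m_i        # u_i in Eq. (10)
    top_k = np.argpartition(u, -k)[-k:]        # indices of the k largest entries (the set in problem (11))
    s = np.zeros_like(u)
    s[top_k] = project_simplex(u[top_k])       # solve problem (11) on the restricted support
    return s
```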


Algorithm 1. Sparse Subspace Clustering with the l0 inequality constraint (SSC-l0)
Input: Data matrix X ∈ R^{n×d}
Parameters: α, γ, and k
Output: S ∈ R^{n×n}
1: Initialize S.
2: Initialize M with formula (8).
3: while iterations t ≤ T and not converged do
4:   Update Γ̌(i) for every i;
5:   Update S by solving the problem (11);
6:   Update M with formula (8).
7: end while
8: return S.

The algorithm of SSC-l0 is summarized in Algorithm 1. Assume that Algorithm 1 iterates at most T times. The time cost of updating M is O(T(dn^2 + n^3)), where n is the number of samples and d is the number of features. The time cost of solving the problem (11) is O(T(nk log k)) [20], where k is the maximum number of non-zero elements in the coefficient vector. Given that k ≪ n, the overall time complexity of SSC-l0 is O(n^3). After a non-negative coefficient matrix S is learned, spectral clustering can be executed based on the produced S. In Algorithm 1, the optimal solutions for M and S can be obtained by solving problems (8) and (11). Then,

J_{SSC-l0}(M^{(t)}, S^{(t)}) ≤ J_{SSC-l0}(M^{(t−1)}, S^{(t)}) ≤ J_{SSC-l0}(M^{(t−1)}, S^{(t−1)}),    (14)

where t−1 and t represent the (t−1)-th and t-th iterations, respectively. Inequality (14) indicates that the objective function values of SSC-l0 decrease monotonically, i.e., SSC-l0 is convergent.

4 Experiments

In this study, all experiments are conducted on a personal computer with an i5-9500 CPU @ 3.00 GHz and 8 GB RAM. The codes are implemented in MATLAB R2021a (64 bit).

4.1 Datasets and Comparison Methods

Ten image datasets are utilized for experiments, including five face image datasets (Yale, Jaffe [21], ORL1024 [22], PIE [23] and FERET [24]), and four


handwritten digit image datasets (Binary¹, Semeion², Digit³, and Minist⁴), and one palm image dataset (Palm) [9]. The details of these datasets are shown in Table 1.

Table 1. The Benchmark Datasets.

Datasets   Samples  Size     Clusters
Yale       165      32 * 32  15
Jaffe      213      26 * 26  10
ORL1024    400      32 * 32  40
FERET      1400     40 * 40  200
PIE        1632     32 * 32  68
Binary     1404     20 * 16  36
Semeion    1593     16 * 16  10
Digit      1797     8 * 8    10
Minist     4000     28 * 28  10
Palm       2000     16 * 16  100

Seven clustering methods are selected for comparison, including Least Squares Regression (LSR1) [16], Sparse Subspace Clustering (SSC) [12], Low-Rank Representation (LRR) [13], Iteratively Reweighted Least Squares (IRLS) [15], Block-Diagonal Representation (BDR) [14], Clustering with Typicality-aware Adaptive Graph (CTAG) [11], and the Simplex Sparse Representation (SSR) [17]. Please see Sect. 1 for more details on these methods. SSC-l0 and the seven other methods vary their parameters in the range of {10^-3, 10^-2, 10^-1, 10^0, 10^1, 10^2, 10^3}. Spectral clustering is applied to the coefficient matrices, followed by post-processing with k-means. To reduce sensitivity to initialization, k-means is repeated 50 times, and the reported result is the average of the 20 trials with the lowest loss.
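A sketch of this evaluation protocol is given below; the symmetrization of S into an affinity matrix and the Hungarian label matching used for ACC are standard choices assumed here, not details taken from the paper.

```python
import numpy as np
from sklearn.manifold import spectral_embedding
from sklearn.cluster import KMeans
from scipy.optimize import linear_sum_assignment

def clustering_accuracy(y_true, y_pred):
    # Best label matching via the Hungarian algorithm.
    n_cls = max(y_true.max(), y_pred.max()) + 1
    cost = np.zeros((n_cls, n_cls), dtype=int)
    for t, p in zip(y_true, y_pred):
        cost[t, p] += 1
    row, col = linear_sum_assignment(-cost)
    return cost[row, col].sum() / len(y_true)

def evaluate_coefficient_matrix(S, y_true, n_clusters, n_runs=50, n_keep=20):
    # Symmetrize the learned coefficient matrix into an affinity matrix,
    # embed it spectrally, then run k-means 50 times and keep the 20 best runs.
    W = 0.5 * (np.abs(S) + np.abs(S).T)
    emb = spectral_embedding(W, n_components=n_clusters, drop_first=False)
    runs = []
    for seed in range(n_runs):
        km = KMeans(n_clusters=n_clusters, n_init=1, random_state=seed).fit(emb)
        runs.append((km.inertia_, clustering_accuracy(y_true, km.labels_)))
    runs.sort(key=lambda r: r[0])                      # lowest k-means loss first
    return np.mean([acc for _, acc in runs[:n_keep]])  # average ACC of the best trials
```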

4.2 Clustering Analysis

Clustering Performance. Table 2 shows the ACC values obtained by the different methods on the selected benchmark datasets. In Table 2, the best results are exhibited in bold and the second-best results are marked in brackets. The average results of each method over all selected datasets are listed in the last row of Table 2. From Table 2, we have the following observations:

1. LRR and IRLS rank second and third, respectively, in terms of average ACC values among the eight methods. This can be attributed to their ability to restore the row space of the data by learning a coefficient matrix with a low-rank structure. This characteristic aids in enhancing their robustness and improving their clustering ability.
2. SSR exhibits the lowest average ACC value, which can be attributed to the absence of an effective regularization term.

¹ https://cs.nyu.edu/~roweis/data/.
² https://archive.ics.uci.edu/ml/index.php.
³ http://www.escience.cn/people/fpnie/index.html.
⁴ http://www.cad.zju.edu.cn/home/dengcai/Data/MLData.html.


Table 2. ACC Values of the SSC-l0 and the Selected Methods on Benchmark Datasets (%).

Datasets  LSR1   SSC      LRR      IRLS     BDR    SSR      CTAG     SSC-l0
Yale      49.33  (50.61)  47.06    48.55    48.09  49.67    46.79    54.97
Jaffe     96.74  95.02    93.57    100.00   95.96  (97.18)  94.37    100.00
ORL       64.71  (65.53)  65.13    59.66    57.20  62.94    63.35    70.64
FERET     32.75  33.54    34.19    (38.62)  34.83  29.60    38.30    39.50
PIE       70.64  71.24    69.61    56.87    50.89  31.95    48.92    (70.94)
Binary    36.05  35.68    35.08    6.12     5.46   (43.54)  25.77    45.17
Semeion   57.29  58.26    (58.46)  57.61    57.74  54.11    56.57    60.18
Digit     47.74  64.15    59.55    58.13    60.52  61.62    70.60    (69.79)
Minist    37.49  37.04    38.30    37.58    31.01  38.34    (39.41)  44.10
Palm      80.32  (82.03)  80.41    66.98    66.08  82.01    67.35    91.13
Avg.      57.31  (59.31)  58.14    53.01    50.78  55.10    55.14    64.64

3. SSC-l0 achieves the best ACC values across all selected datasets except PIE and Digit. The reason is that SSC-l0 tends to represent each sample as a linear combination of the samples from the same class under the l0 inequality constraint. However, the other seven methods lack this ability. In general, data points coming from the same class are easier to represent each other. When reconstructing a point, if the number of non-zero elements in the coefficient vector is limited, it is more likely to select data points from the same class to complete the corresponding reconstruction work. Therefore, the l0 inequality constraint can improve the performance of the SSC-l0 . Visualization of the Similarity Matrix. Figure 1 visualizes the coefficient matrix obtained by SSC-l0 on datasets including Jaffe, PIE, and Palm. In Fig. 1, the coefficient matrix exhibits a block-diagonal structure, indicating that similar samples in SSC-l0 tend to belong to the same class. This characteristic enhances the clustering performance of SSC-l0 . The reason for showing the block-diagonal structure is that if the number of non-zero elements in the coefficient vector is limited, when reconstructing a point, it is more likely to select a point of the class to which the reconstruction point belongs. Parameter Sensitivity. SSC-l0 has three parameters: α, γ, and k. Figure 2 illustrates the impact of these parameters on SSC-l0 performance across different datasets. The results indicate that SSC-l0 is sensitive to the values of α, γ, and k. To obtain optimal performance, a grid search method is recommended for parameter selection in SSC-l0 .


Fig. 1. The coefficient matrix obtained by SSC-l0 on the different datasets. (a) Jaffe. (b) PIE. (c) Palm.

Fig. 2. ACC values of SSC-l0 on different datasets when varying parameters α, γ, and k. (a) PIE. (b) Binary. (c) Semeion. The color in the figure reflects the ACC values.

Convergence Analysis. Figure 3 shows the convergence curves of SSC-l0 on the datasets Yale, ORL1024, and FERET. From Fig. 3, the objective function values of SSC-l0 decrease monotonically and SSC-l0 converges after 100 iterations.

4.3 Robustness Analysis

To compare the robustness of the eight methods, the following experiments are designed: 1. In the ORL2116 dataset, 20% of the samples are randomly selected and added salt and pepper noise. The noisy densities are 0.1, 0.2, 0.3, 0.4, and 0.5. Figure 4(a) illustrates the original ORL2116 dataset and noisy ORL2116 datasets. Figure 5(a) presents the performances of eight methods on these noisy datasets. 2. In the ORL1024 dataset, 20% of the samples are randomly selected and added random noisy blocks. The sizes of the noisy blocks were 5 × 5, 10 × 10, 15 × 15, 20 × 20, and 25 × 25 pixels. Figure 4(b) displays the original ORL1024 dataset and the corresponding noisy ORL1024 datasets. The performances of different methods on these noisy datasets are presented in Fig. 5(b). 3. In the Jaffe dataset, 20% of the samples are randomly selected and added random noisy blocks. The sizes of the noisy blocks were 4 * 4, 8 * 8, 12 * 12,


Fig. 3. Convergence curves of SSC-l0 on different datasets. (a) Yale.(b) ORL1024. (c) FERET.

Fig. 4. Original dataset and noisy datasets. (a) ORL2116. (b) ORL1024. (c) Jaffe.

16 * 16 and 20 * 20 pixels. Figure 4(c) displays the original Jaffe dataset and the corresponding noisy Jaffe datasets. The performances of different methods on these noisy datasets are presented in Fig. 5(c). From Fig. 5, although the performance of the eight methods degrades as the noise level increases, SSC-l0 achieves almost the best results no matter which noise levels are involved, which further demonstrates that SSC-l0 has high robustness. The reason is as follows. Clean points have better representation ability compared to noisy points. When the number of non-zero elements in the coefficient vector is limited, clean points are more likely to be used for reconstruction. Thus, constraining the number of non-zero elements can improve model robustness. SSC and SSR use l1 regularization and the l1 equality constraint to induce sparsity in the coefficient matrix, but they do not directly constrain the number of non-zero elements in the coefficient vector. In contrast, SSC-l0 can precisely control the number of non-zero elements, leading to better performance.

(Each panel plots the ACC index of the eight methods against the corresponding noise level.)

Fig. 5. Performance of different clustering methods on the noisy datasets with different noise levels. (a) Noisy ORL2116 datasets with different noise densities. (b) Noisy ORL1024 datasets with different sizes noisy blocks. (c) Noisy Jaffe datasets with different sizes noisy blocks.

5 Conclusions

Considering that data points coming from the same class are easier to represent by each other, and that a model is more likely to select data points from the same class to complete the corresponding reconstruction work when the number of non-zero elements in the coefficient vector is limited, we propose Sparse Subspace Clustering with the l0 inequality constraint (SSC-l0) to conduct representation learning among the points in the same class. By introducing the simplex constraint, an optimization method concerning the l0 inequality constraint is proposed, and its convergence is also theoretically analyzed. Since the l0 inequality constraint problem is difficult to solve, the proposed optimization method can be widely used in many sparse learning models. Extensive experiments demonstrate the superiority of SSC-l0. Establishing a one-step clustering method based on SSC-l0 may further improve the performance of the proposed clustering model, which is our next work.

References 1. Zhou, J., Pedrycz, W., Wan, J., et al.: Low-rank linear embedding for robust clustering. IEEE Trans. Knowl. Data Eng. 35(5), 5060–5075 (2022) 2. Li, X., Chen, M., Wang, Q.: Discrimination-aware projected matrix factorization. IEEE Trans. Knowl. Data Eng. 32, 809–814 (2019) 3. Lu, J., Wang, H., Zhou, J., et al.: Low-rank adaptive graph embedding for unsupervised feature extraction. Pattern Recogn. 113, 107758 (2021) 4. Wang, Y., Gao, C., Zhou, J.: Geometrical structure preservation joint with selfexpression maintenance for adaptive graph learning. Neurocomputing 501, 436– 450 (2022) 5. Lu, J., Lai, Z., Wang, H., et al.: Generalized embedding regression: a framework for supervised feature extraction. IEEE Trans. Neural Netw. Learn. Syst. 33, 185–199 (2022)


6. Ng, A., Jordan, M., Weiss, Y.: On spectral clustering: analysis and an algorithm. In: Advances in Neural Information Processing Systems, Vancouver, pp. 849–856. MIT (2002) 7. Stella, X., Shi, J.: Multiclass spectral clustering. In: Proceedings of the 9th IEEE International Conference on Computer Vision, Nice, pp. 313–319. IEEE Computer Society (2003) 8. MacQueen, J.: Some methods for classification and analysis of multivariate observations. In: Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, pp. 281–297 (1967) 9. Nie, F., Wang, X., Huang, H.: Clustering and projected clustering with adaptive neighbors. In: Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, New York, pp. 977–986. ACM (2014) 10. Lu, J., Lin, J., Lai, Z., et al.: Target redirected regression with dynamic neighborhood structure. Inf. Sci. 544, 564–584 (2021) 11. Zhou, J., Gao, C., Wang, X., et al.: Typicality-aware adaptive similarity matrix for unsupervised learning. IEEE Trans. Neural Netw. Learn. Syst. (2023). https:// doi.org/10.1109/TNNLS.2023.3243914 12. Elhamifar, E., Vidal, R.: Sparse subspace clustering: algorithm, theory, and applications. IEEE Trans. Pattern Anal. Mach. Intell. 35, 2765–2781 (2013) 13. Liu, G., Lin, Z., Yan, S., et al.: Robust recovery of subspace structures by low-rank representation. IEEE Trans. Pattern Anal. Mach. Intell. 35, 171–184 (2013) 14. Lu, C., Feng, J., Lin, Z., et al.: Subspace clustering by block diagonal representation. IEEE Trans. Pattern Anal. Mach. Intell. 41, 487–501 (2019) 15. Lu, C., Lin, Z., Yan, S.: Smoothed low rank and sparse matrix recovery by iteratively reweighted least squares minimization. IEEE Trans. Image Process. 24, 646–654 (2015) 16. Lu, C.-Y., Min, H., Zhao, Z.-Q., Zhu, L., Huang, D.-S., Yan, S.: Robust and efficient subspace segmentation via least squares regression. In: Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C. (eds.) ECCV 2012. LNCS, vol. 7578, pp. 347– 360. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-33786-4_26 17. Huang, J., Nie, F., Huang, H.: A new simplex sparse learning model to measure data similarity for clustering. In: Proceedings of the 24th International Conference on Artificial Intelligence, Buenos Aires, pp. 3569–3575. AAAI (2015) 18. Maggu, J., Majumdar, A., Chouzenoux, E.: Transformed subspace clustering. IEEE Trans. Knowl. Data Eng. 33, 1796–1801 (2020) 19. Zelnik-Manor, L., Perona, P.: Self-tuning spectral clustering, in: Advances in Neural Information Processing Systems, Montreal, pp. 1601–1608 (2004) 20. Condat, L.: Fast projection onto the simplex and the l-1 ball. Math. Program. 158, 575–585 (2016) 21. Lyons, M., Budynek, J., Akamatsu, S.: Automatic classification of single facial images. IEEE Trans. Pattern Anal. Mach. Intell. 21, 1357–1362 (1999) 22. Samaria, F., Harter, A.: Parameterisation of a stochastic model for human face identification. In: Proceedings of 1994 IEEE Workshop on Applications of Computer Vision, Sarasota, pp. 138–142. IEEE (1994) 23. Sim, T., Baker, S., Bsat, M.: The CMU Pose, illumination and expression database of human faces. Carnegie Mellon University Technical Report CMU-RI-TR-OI-02 (2001) 24. Phillips, P., Moon, H., Rizvi, S., et al.: The FERET evaluation methodology for face-recognition algorithms. IEEE Trans. Pattern Anal. Mach. Intell. 22, 1090– 1104 (2000)

Pixel-Wise Detection of Road Obstacles on Drivable Areas by Introducing Maximized Entropy to Synboost Framework Carlos David Ardon Munoz(B) , Masashi Nishiyama , and Yoshio Iwai Tottori University, 101 Minami 4-chome, Koyama-cho, Tottori 680-8550, Japan [email protected], {nishiyama,iwai}@tottori-u.ac.jp https://www.tottori-u.ac.jp/

Abstract. Semantic segmentation plays a crucial role in understanding the surroundings of a vehicle in the context of autonomous driving. Nevertheless, segmentation networks are typically trained on a closed-set of inliers, leading to misclassification of anomalies as in-distribution objects. This is especially dangerous for obstacles on roads, such as stones, that usually are small and blend well with the background. Numerous frameworks have been proposed to detect out-of-distribution objects in driving scenes. Some of these frameworks use softmax cross-entropy measurements as an attention mechanism for a dissimilarity network to find anomalies. However, a significant limitation arises from the segmentation network’s tendency toward overconfidence in its predictions, resulting in low cross-entropy in regions where anomalies are present. This suggests that normal cross-entropy is a low-quality prior for anomaly detection. Therefore, for the task of detecting stones on roads, we propose utilizing a fined-tuned segmentation network with a changed target, from semantic segmentation to maximize the cross-entropy in anomalous areas. With this, we feed the dissimilarity network with a better prior image. Furthermore, due to the lack of datasets with enough samples of stones for pixel-wise detection, we synthetically added stones on images of driving scenes to create a dataset for fine-tuning and training. The results of our comparative experiments showed that our model attains the highest average precision while having the lowest false positive rate at 95% true positive rate when evaluating on a real-stone image dataset.

Keywords: Anomaly classification Computer vision

1

· Semantic segmentation ·

Introduction

Understanding surroundings through images is an important task in applications such as autonomous vehicles and robots, where it is vital to identify abnormal objects that may be a danger to vehicles and must be avoided. For this study, c The Author(s), under exclusive license to Springer Nature Switzerland AG 2023  H. Lu et al. (Eds.): ACPR 2023, LNCS 14408, pp. 150–161, 2023. https://doi.org/10.1007/978-3-031-47665-5_13

Synboost Framework with Maximized Entropy

151

we focus on a subclass of anomalies, stones. Because those are a type of obstacle commonly found on roads and dangerous to vehicles. Additionally, keeping roads clear of obstacles, such as stones, is a continuous task done by local governments and municipalities. Who regularly conduct car patrols with experts to manually identify obstacles so they can take appropriate corrective action. However, by requiring the intervention of experts, the efficiency of detection is limited since the frequency of patrols is normally low. This increases the time that obstacles are dangerous to vehicles, raising the probability of accidents and damage. Thereby, developing a support system for detecting this type of obstacle can help to overcome these problems by allowing automatic detection during patrolling. Given the critical need of efficient anomaly detection systems, many techniques have been applied to perform anomaly detection from a single monocular RGB image. For example, uncertainty measurements in semantic segmentation, such as softmax entropy and the difference between the two largest softmax values (softmax distance), have been used to statistically detect areas of low segmentation reliability and classify them as anomalies [1]. Additionally, generating a photo-realistic image (image re-synthesis) from a semantic segmentation map and comparing it with the original image can be used to detect anomalies. As the segmentation network will not be able to understand anomalies, they will not appear in the reconstruction [2]. Thus, the areas with significant differences between the two images are classified as anomalies. Finally, the framework Synboost [3] complements the results of both techniques by using a dissimilarity network to find differences between the input and synthesized images with softmax entropy and softmax distance as attention mechanisms. However, when anomalies blend well with its surroundings or only span a few pixels in the image, which is the case for most stones on the road, the softmax cross-entropy is low and does not contribute much with the framework’s prediction ability. Therefore, in Sect. 2, we propose a method to overcome the limitations of Synboost [3] and improve detection accuracy of anomalies in a constrained setting, namely, small anomalies laying on the drivable area (Fig. 1). We call these kinds of anomalies, road obstacles. In Sect. 3, we describe the process of training the dissimilarity network of Synboost using a dataset with synthetically added obstacles to compensate for the lack of anomalies examples in usual datasets. Furthermore, we introduce to our framework a neural network that strives to maximize cross-entropy in regions where there are anomalies [4]. With this, we increase the cross-entropy even for small anomalies that previously were unnoticeable on softmax entropy and softmax distance images, in consequence, they produce a better contribution to the final prediction, acting as attention mechanisms. Finally, in Sect. 4, we present our conclusions, highlighting how our framework enhances the pixel-wise detection accuracy for stones and significantly reduces false positives compared to previous methods.

152

C. D. Ardon Munoz et al.

Fig. 1. Drivable area. The region-of-interest are roads and sidewalks, which are highlighted in white in this image. We employed semantic segmentation inference to identify the contours of this area. Any anomaly completely enclosure within the drivable area is a road obstacle for this research.

2 2.1

Proposed Framework Related Methods

We built our framework on top of Synboost [3]. This framework combines two techniques for anomaly detection. The first one is using uncertainty measurements, softmax entropy and softmax distance. The second one is image resynthesis, that compares the original input with a generated image to find differences between both. We expect that anomalous regions in the original image look different in the generated one. The outputs of uncertainty measurements and image re-synthesis are used as inputs for a dissimilarity network that carries out a binary classification for each pixel with classes for anomaly and non-anomaly. This framework achieves state-of-the-art performance for anomaly detection with minimal computational cost for training since it is designed to use pre-trained models and only requires training for the dissimilarity network. We chose this model to build upon our framework because it addresses different scenarios when dealing with semantic segmentation anomalies [3]. The first scenario occurs when an anomaly is misclassified as any of the inlier classes, resulting in low entropy due to overconfidence but a significantly different resynthesized image if the object is large. The second scenario involves oversegmentation of the anomaly, with different sections assigned to different classes, leading usually to higher entropy. The final scenario occurs when the anomaly goes undetected and is classified as part of its surroundings. In this case, the anomalous area appears different in the re-synthesized image, particularly if it is sufficiently large. However, Synboost still has a low accuracy for detecting objects that blend with the background (low entropy) and only encompass a few

Synboost Framework with Maximized Entropy

153

Fig. 2. Outline of our framework architecture. In Synboost [3], softmax entropy and softmax difference images are taken from the semantic segmentation network. In our framework, we introduced to the Synboost framework a Maximized entropy network [4] fine-tuned using the loss function showed in Eq. (2). Thus, this module specializes in creating prior images to feed into the Spatial-Aware Dissimilarity Network that produces the anomaly prediction (highlighted with white contours for better visualization).

pixels in the images (similar synthesized image), which is the case for most road obstacles, including stones, as both approaches fail to detect the anomaly. 2.2

Our Framework Architecture

Figure 2 shows an outline of the proposed framework. We use a pre-trained segmentation network with a WideResNet38 backbone, which is trained on Cityscapes dataset [5] according to [6]. For the synthesis network, we take a pre-trained model from the CC-FPSE framework [7], which is a conditional generative adversarial network. In the case of the perceptual difference V , we simply take a pre-trained VGG19 on ImageNet dataset as a feature extractor to find differences between the original and generated images, following the same procedure as Synboost [3]: N   1   (i)  V (x, r) = F (x) − F (i) (r) , M 1 i i=1

(1)

where F (i) is the i-th layer with Mi elements in the VGG network with N layers. This equation computes the perceptual difference as the sum of the absolute differences between the feature representations of the original image x and the generated image r across all layers of the VGG network. Therefore, it captures the perceptual variations and discrepancies between the two images in feature space.
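As an illustration, the following PyTorch sketch accumulates VGG19 feature differences; the chosen layer indices are assumptions, and a per-pixel map (rather than the scalar of Eq. (1)) is produced because the framework feeds a spatial perceptual-difference image V to the dissimilarity network.

```python
import torch
import torchvision

class PerceptualDifference(torch.nn.Module):
    # Sketch of the perceptual difference: mean absolute feature difference between
    # the input and the re-synthesized image, accumulated over a set of VGG19 layers.
    def __init__(self, layer_ids=(3, 8, 17, 26, 35)):   # assumed ReLU outputs of VGG19
        super().__init__()
        vgg = torchvision.models.vgg19(weights="IMAGENET1K_V1").features.eval()
        self.blocks = torch.nn.ModuleList()
        prev = 0
        for idx in layer_ids:
            self.blocks.append(vgg[prev:idx + 1])
            prev = idx + 1
        for p in self.parameters():
            p.requires_grad_(False)

    @torch.no_grad()
    def forward(self, x, r):
        size = x.shape[-2:]
        diff = torch.zeros(x.shape[0], 1, *size, device=x.device)
        fx, fr = x, r
        for block in self.blocks:
            fx, fr = block(fx), block(fr)
            d = (fx - fr).abs().mean(dim=1, keepdim=True)   # channel-wise mean |F_i(x) - F_i(r)|
            diff += torch.nn.functional.interpolate(d, size=size, mode="bilinear",
                                                    align_corners=False)
        return diff   # perceptual-difference image V
```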

154

C. D. Ardon Munoz et al.

Furthermore, because neural networks tend to be overconfident in their predictions, the cross-entropy can be low even in regions where an anomaly exists. In consequence, the uncertainty measures may not contribute much to the final prediction. To deal with this problem, we take a segmentation network pre-trained on the Cityscapes dataset and fine-tune it following the procedure of [4]. With this, we aim to create a better-quality prior image as input for the dissimilarity network that maximizes the cross-entropy on regions where there are anomalies, while minimizing it on regions where there are no anomalies. To accomplish this, during fine-tuning, we modify the loss function to minimize the target:

L = (1 - \lambda)\,\mathbb{E}_{(x,y)\sim D_{in}}[\ell_{in}(f(x), y(x))] + \lambda\,\mathbb{E}_{(x')\sim D_{out}}[\ell_{out}(f(x'))],    (2)

where

\ell_{in}(f(x), y(x)) = -\sum_{j\in C} \mathbb{1}_{j=y(x)} \log(f_j(x)),    (3)

\ell_{out}(f(x')) = -\sum_{j\in C} \frac{1}{q} \log(f_j(x')),    (4)


where (x, y) ∼ D_in stands for an in-distribution example, while (x') ∼ D_out is an out-of-distribution example. f(x) indicates the softmax probabilities for a predicted class and y(x) the corresponding ground truth. λ is a value in the range of [0, 1] that controls the weight between the two single objectives. 1_{j=y(x)} is an indicator function that yields 1 when j = y(x) and zero otherwise. C is the set of q classes. The minimization of the single objective for the out-of-distribution part is equivalent to maximizing entropy in anomalous regions, as stated in [4]. We can interpret this new objective as follows: for in-distribution pixels, minimizing Eq. (3), which represents the standard cross-entropy loss function when using one-hot encoding, is equivalent to maximizing the softmax value for the true class. On the other hand, for out-of-distribution pixels, Eq. (4) yields the lowest loss value when the predictions in the softmax layer are uniformly distributed across all classes. Consequently, for outlier pixels, we encourage the one-hot encoding vector to have the same value in every class. This is the reason why the logarithm in Eq. (4) is divided by the number of classes q. Such a vector will yield a high value when evaluated with Shannon entropy, hence the term "Maximized entropy". Shannon entropy is defined as follows:

E(f(x)) = -\sum_{j\in C} f_j(x)\log(f_j(x)),    (5)
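A compact PyTorch sketch of the entropy image of Eq. (5) and the fine-tuning objective of Eqs. (2)-(4) follows; the out-of-distribution pixel mask, the handling of their labels, and the value of λ are assumptions for illustration.

```python
import torch
import torch.nn.functional as F

def entropy_map(logits):
    # Per-pixel Shannon entropy of the softmax output, Eq. (5). logits: (B, q, H, W)
    p = F.softmax(logits, dim=1)
    return -(p * torch.log(p.clamp_min(1e-12))).sum(dim=1)      # (B, H, W)

def maximized_entropy_loss(logits, labels, ood_mask, lam=0.9):
    # Combined objective of Eqs. (2)-(4): standard cross-entropy on inlier pixels,
    # uniform-target loss (entropy maximization) on out-of-distribution pixels.
    log_p = F.log_softmax(logits, dim=1)
    q = logits.shape[1]
    safe_labels = labels.clamp(0, q - 1)                        # OOD pixels are masked out below
    ce_in = F.nll_loss(log_p, safe_labels, reduction="none")    # (B, H, W)
    ce_out = -log_p.mean(dim=1)                                 # -(1/q) * sum_j log f_j
    in_mask = (~ood_mask).float()
    out_mask = ood_mask.float()
    loss_in = (ce_in * in_mask).sum() / in_mask.sum().clamp_min(1.0)
    loss_out = (ce_out * out_mask).sum() / out_mask.sum().clamp_min(1.0)
    return (1.0 - lam) * loss_in + lam * loss_out
```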

where f (x) indicates the softmax probabilities predicted by the segmentation model. We use this equation to obtain the entropy image E in Fig. 2. In Table 1, we present an overview of the models used in our framework. The segmentation, synthesis, and perceptual difference networks are pre-trained models that can be seamlessly integrated into the framework. Regarding the Maximized entropy network, it starts as a segmentation network pre-trained on Cityscapes that we subsequently fine-tune on our composite training set using

Synboost Framework with Maximized Entropy

155

Table 1. Model and datasets overview. Model

Training set

Training details

Segmentation

Cityscapes

Pre-trained

Synthesis

Cityscapes

Perceptual difference ImageNet

Pre-trained Pre-trained

Maximized entropy

Our composite dataset Fine-tuned

Dissimilarity

Our composite dataset Trained

Eq. (2) as loss function. Lastly, the Dissimilarity network is trained from random initialization, employing our composite training set. 2.3

Ensemble

Because the prior images used as inputs to the dissimilarity network are also anomaly predictions, it is feasible to combine them with the dissimilarity network output generated in the last step of Fig. 2. To accomplish this, we use a weighted sum to generate a more robust prediction (A_e). We refer to this procedure as ensemble, and it can be applied to both Synboost and our model using the following equation in a pixel-wise manner:

A_e = w_1 A + w_2 E + w_3 D + w_4 V,   (6)

where A, E, D, and V represent the output from the dissimilarity network, the softmax entropy, the softmax distance, and the perceptual difference images, respectively. To find the values of the weights w_1, w_2, w_3, and w_4 used to combine these images for the ensemble, we applied a grid search restricted to values that satisfy w_1 + w_2 + w_3 + w_4 = 1. This means that prior images can be used in two ways: first, as inputs for the dissimilarity network, where they serve as attention mechanisms, and second, as part of an ensemble together with the dissimilarity network output.
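As an illustration of Eq. (6) and of the constrained grid search, the following sketch combines the four maps pixel-wise and searches weights on a simplex grid. The step size, the score_fn callback, and the data layout are assumptions for the example only.

```python
import itertools
import numpy as np

def ensemble(A, E, D, V, w):
    """Pixel-wise weighted sum of Eq. (6); A, E, D, V are HxW prior/prediction maps."""
    w1, w2, w3, w4 = w
    return w1 * A + w2 * E + w3 * D + w4 * V

def grid_search_weights(maps_list, labels_list, score_fn, step=0.05):
    """Search weights on a grid constrained to w1+w2+w3+w4 = 1 that maximize
    score_fn (e.g. average precision) over a validation set of (A, E, D, V) tuples."""
    ticks = np.arange(0.0, 1.0 + 1e-9, step)
    best_w, best_score = None, -np.inf
    for w1, w2, w3 in itertools.product(ticks, repeat=3):
        w4 = 1.0 - w1 - w2 - w3
        if w4 < -1e-9:
            continue                         # outside the simplex
        w = (w1, w2, w3, max(w4, 0.0))
        preds = [ensemble(*m, w) for m in maps_list]
        score = score_fn(preds, labels_list)
        if score > best_score:
            best_score, best_w = score, w
    return best_w, best_score
```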

3 Experiments

3.1 Dataset

For the training set, we started from the same methodology used in Synboost training. Consequently, we took the Cityscapes training set (2975 images), using the objects labeled as the void class as anomaly samples and the inferred semantic segmentation as image S in Fig. 2. To train the dissimilarity network to find differences between the original and synthesized images, we added a copy of each image and randomly swapped labels of known objects in the ground-truth semantic segmentation map before image synthesis. These objects look quite different in the synthesized image and are also used as anomaly samples. For


Fig. 3. Stones included as road obstacles.

this part of the dataset, we used the modified ground-truth segmentation as image S in Fig. 2. On top of that, we synthetically added one stone at a random position of the drivable area per image. The stone dataset (Fig. 3) consists of images of four stones (83, 87, 120, and 70 images per stone, respectively) taken from different orientations and under different illumination conditions (left only, right only, and both). We used stones 1 and 2 for training, while stones 3 and 4 were used only for evaluation. In addition, we used six images from different spots on our university campus as backgrounds to synthetically add stones. Applying image augmentation techniques (horizontal flip, cropping, and brightness adjustment), we added 192 images to the training set. Finally, to conduct experiments with real-stone images, we placed stones and took 153 images at seven spots on our university campus. In sum, the training set contains 6142 images when only using composite images and 6295 images when including real-stone images, as shown in Table 2. Lastly, in Fig. 4 we show some examples of composite images. For evaluation, we built a dataset of 147 real-stone images: we placed stones 3 and 4 and took photos at six spots on the university campus that were not used for training. With this, all evaluations are run using places and stones unseen during training. Additionally, using a real-obstacle dataset for evaluation prevents our experiments from yielding unrealistic results caused by the use of synthetic data during training. For instance, a model might unrealistically excel at identifying merely pasted objects rather than genuine obstacles [8].


Table 2. Composition of training set.

Type      | Source             | Number of images
Composite | Cityscapes         | 5950
Composite | Our background set | 192
Total composite images          | 6142
Real      | Our real-stone set | 153
Total composite + real-stone images | 6295

Fig. 4. Composite images. Examples of composite images used for training. The top row shows examples of backgrounds from Cityscapes, while the bottom row shows examples of our backgrounds. The synthetically added stones are highlighted by a solid-line rectangle.

3.2 Experimental Conditions

The region of interest (ROI) for the evaluation of anomaly detection in our experiments comprises all pixels that belong to the road or sidewalk, as well as regions assigned to other classes that are enclosed within the road or sidewalk. The road obstacles that we want to detect are objects on the drivable area that do not belong to any known class from the Cityscapes dataset [5]. We used average precision (AP) as the primary metric, as well as the false positive rate at 95% true positive rate (FPR95); both are suited for highly imbalanced classification problems such as anomaly detection. For the first experiment, we used our dataset with only composite images to train Synboost's and our model's dissimilarity networks, as well as to fine-tune a segmentation network to maximize entropy on anomalous areas. For the second experiment, we added the real-stone images to the training set, which improved performance for all models despite the small sample size.
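Both metrics can be computed from the pixel-wise anomaly scores inside the ROI, for example with the scikit-learn utilities below; the array layout is an assumption of this sketch.

```python
import numpy as np
from sklearn.metrics import average_precision_score, roc_curve

def ap_and_fpr95(anomaly_scores, gt_mask, roi_mask):
    """Evaluate pixel-wise anomaly detection inside the region of interest.
    anomaly_scores, gt_mask (boolean anomaly labels) and roi_mask (boolean ROI)
    are assumed to be arrays of the same shape."""
    y_score = anomaly_scores[roi_mask].ravel()
    y_true = gt_mask[roi_mask].ravel().astype(int)

    ap = average_precision_score(y_true, y_score)

    # FPR95: false positive rate at the first ROC operating point whose TPR >= 0.95.
    fpr, tpr, _ = roc_curve(y_true, y_score)
    fpr95 = fpr[np.searchsorted(tpr, 0.95)]
    return ap, fpr95
```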

Table 3. Hyperparameters used for training and fine-tuning.

Hyperparameter | Dissimilarity network training | Maximized entropy network fine-tuning
Learning rate  | 0.0001                         | 0.00001
Betas (Adam)   | 0.5, 0.999                     | 0.9, 0.999
Epochs         | 30                             | 20
Batch size     | 8                              | 8

Table 4. Best weights for Eq. (6) according to a grid search.

Model    | Training set                  | w1   | w2   | w3 | w4
Synboost | Composite images only         | 1    | 0    | 0  | 0
Synboost | Composite + real-stone images | 1    | 0    | 0  | 0
Ours     | Composite images only         | 0.85 | 0.15 | 0  | 0
Ours     | Composite + real-stone images | 0.55 | 0.45 | 0  | 0

We trained the dissimilarity networks for 30 epochs, five times per model from random initializations. In the case of entropy maximization, we fine-tuned the segmentation network for 20 epochs with λ = 0.9 in Eq. (2). In both cases, we selected the epoch with the lowest loss value on the evaluation set as the best epoch. In Table 3, we show the hyperparameters used for training and fine-tuning. The experiments were conducted on a computer running Ubuntu 20.04.4 LTS, with Python 3.6.9, PyTorch 1.10.1, CUDA 11.4, and an NVIDIA GeForce RTX 3090 GPU with 24 GB of memory.

3.3 Results

The outcome of a grid search for the best ensemble weights for Synboost and our model when using Eq. (6) is shown in Table 4. In the case of Synboost, we obtained w_1 = 1 for both training sets, which means that it only uses the dissimilarity network output; in other words, Synboost does not benefit from the ensemble. On the other hand, for our model the dissimilarity output (w_1) is combined with the softmax entropy (w_2) to create the final prediction. However, w_3 and w_4 are zero for both training sets, which means that the softmax distance and perceptual difference are not used in the ensemble with our model. Using these ensemble weights during testing, the results of the best networks when evaluating on real-stone images are shown in Table 5, where Synboost achieved its best performance without the ensemble, while our model achieved it using the ensemble. All models improve when a small set of real-stone images is added to the training set.


Table 5. Performance comparison between best networks.

Method                | Composite images only (AP ↑ / FPR95 ↓) | Composite + real-stone images (AP ↑ / FPR95 ↓)
Synboost [3]          | 64.2 / 7.07                            | 67.0 / 4.92
Maximized entropy [4] | 76.8 / 2.02                            | 85.5 / 0.30
Ours                  | 86.6 / 0.29                            | 91.0 / 0.13

Fig. 5. Examples of anomaly predictions. The first row depicts the input image; the ground-truth anomaly is highlighted by a solid-line rectangle for visualization purposes only. For the inferences, false positives are highlighted by a dotted-line rectangle, and each model's anomaly prediction is indicated by a contour line. Synboost (second row) detected the obstacle in the first example but failed to detect it in the second, and it produced false positives in both cases. Maximized entropy (third row) correctly spotted the obstacles in both cases; however, it also yielded false positives. In contrast, our model (fourth row) successfully recognized the obstacles in both cases without false positives.


Our experiments demonstrated that our model outperforms previous models, achieving the highest average precision (AP) while simultaneously having the lowest false positive rate at 95% true positive rate (FPR95) when evaluated on a real-stone dataset. In Fig. 5, we present examples of pixel-wise anomaly prediction, where each image contains a single road obstacle. The contour lines indicate the pixels classified as anomalies by each model. Ground-truth anomalies are highlighted with solid-line rectangles, while false positives are marked with dotted-line rectangles.

4 Conclusions

In this paper, we presented a framework to detect small anomalies lying on roads, such as stones. We demonstrated that using maximized entropy as an attention mechanism for a dissimilarity network improves the average precision and the false positive rate at 95% true positive rate in pixel-wise anomaly detection. Furthermore, the difference between the results of the two experiments, due to the inclusion of the small sample of real-stone images in the training set, indicates that there is still room for improvement in the creation of the composite dataset. Regarding the use of the ensemble, the weights found by grid search confirm that standard softmax entropy does not contribute much to the final anomaly prediction of road obstacles: the highest AP values for Synboost were achieved without taking any of the prior images into consideration for the ensemble, while our model with maximized entropy combines the softmax entropy images with the dissimilarity network output in the ensemble to create the final anomaly prediction. In other words, while normal softmax entropy serves as an attention mechanism for the dissimilarity network, it does not improve the final result when used in the ensemble; maximized entropy, on the other hand, is beneficial in both scenarios. In future work, we intend to conduct experiments using a composite dataset with additional types of road obstacles besides stones to create a more general anomaly detection system. For this purpose, we plan to create a composite dataset using objects not present in the in-distribution label set of Cityscapes, treating them as obstacles. We also aim to develop a framework with local image synthesis, constrained to the drivable area, to avoid synthesizing whole images and thus reduce the computational cost of inference.

References 1. Rottmann, M., et al.: Prediction error meta classification in semantic segmentation: detection via aggregated dispersion measures of Softmax probabilities. In: 2020 International Joint Conference on Neural Networks (IJCNN), Glasgow, UK, pp. 1–9 (2020) 2. Lis, K., Nakka, K.K., Fua, P., Salzmann, M.: Detecting the unexpected via image resynthesis. In: 2019 IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, South Korea, pp. 2152–2161 (2019)


3. Di Biase, G., Blum, H., Siegwart, R., Cadena, C.: Pixel-wise anomaly detection in complex driving scenes. In: 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, pp. 16913–16922 (2021)
4. Chan, R., Rottmann, M., Gottschalk, H.: Entropy maximization and meta classification for out-of-distribution detection in semantic segmentation. In: 2021 IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, QC, Canada, pp. 5108–5117 (2021)
5. Cordts, M., et al.: The Cityscapes dataset for semantic urban scene understanding. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, pp. 3213–3223 (2016)
6. Zhu, Y., et al.: Improving semantic segmentation via video propagation and label relaxation. In: 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, pp. 8848–8857 (2019)
7. Liu, X., Yin, G., Shao, J., Wang, X.: Learning to predict layout-to-image conditional convolutions for semantic image synthesis. In: Advances in Neural Information Processing Systems, Red Hook, NY, USA (2019)
8. Bogdoll, D., Uhlemeyer, S., Kowol, K., Zollner, M.: Perception datasets for anomaly detection in autonomous driving: a survey. In: 2023 IEEE Intelligent Vehicles Symposium (IV), Anchorage, AK, USA, pp. 1–8 (2023)

Gender Classification from Gait Energy and Posture Images Using Multi-stage Network Tak-Man Leung and Kwok-Leung Chan(B) Department of Electrical Engineering, City University of Hong Kong, Hong Kong, China [email protected], [email protected]

Abstract. Gait-based gender classification from an image sequence captured at a distance from human subjects can provide valuable information for video surveillance. One common approach is to adopt machine learning for the prediction of the gender class. Algorithms perform gender classification based on spatio-temporal features, e.g., the Gait Energy Image (GEI), extracted from the video. Although GEI can concisely characterize the movements over a gait cycle, it has some limitations. For instance, GEI lacks photometric information and does not exhibit a clear posture of the subject. To improve gender classification, we argue that more features must be utilized. In this paper, we propose a gender classification framework that exploits not only the GEI, but also the characteristic poses of the walking cycle. The proposed framework is a multi-stream and multi-stage network that is capable of gradually learning the gait features from multiple modality inputs acquired in multiple views. The extracted features are fused and input to the classifier, which is trained with ensemble learning. We evaluate and compare the performance of our proposed model with a variety of gait-based gender classification methods on two benchmark datasets. Through thorough experiments, we demonstrate that our proposed model achieves higher gender classification accuracy than methods that utilize only either GEI or posture images. Keywords: gait classification · gait energy image · walking cycle · ensemble learning

1 Introduction
Humans can recognize gender with ease. Replicating this capability in a machine is a challenging problem. Automatic gender classification is a useful function in many systems, e.g., surveillance and micromarketing. Information on the gender of visitors in the crowd flow is of great commercial value for better shop arrangement and allocation, better promotion management, and human flow arrangement. Automatic gender recognition also plays an important role in human–machine interaction and security control. Gait is useful visual information. Images of a walking human may be captured with a camera set at a long distance from the subject. Still, the human gait, showing posture and walking style, is visible in the image. Therefore, gait-based gender recognition from images is a feasible approach. Many studies [1] adopt the gait-based approach, e.g.,


activity recognition, tracking, person identification, gender classification, etc. Gait can be recognized based on structural information such as stride parameters. Liao et al. [2] proposed a gait recognition method based on 3D human pose estimated from images with a convolutional neural network (CNN). This type of method usually demands an initialization phase, such as model construction, and the optimization of a large number of free parameters. Alternatively, methods can rely purely on appearance features extracted from the image, which has the advantage of lower computational cost. For instance, in each image the gait silhouette is segmented, and a feature is then computed from the silhouette sequence, e.g., the Gait Energy Image (GEI) [3]. GEI, computed as an average silhouette image, characterizes the movements of the subject over a gait cycle. However, the classification accuracy may be low on image sequences acquired at some viewing angles due to the highly similar GEIs of both gender classes, and the average silhouette does not display the distinctive posture of the subject clearly. We therefore propose a multi-stage gender classification framework with the inputs of GEI and poses extracted from the walking cycle. We design the training process to allow the feature extractor network to gradually learn from a variety of inputs. The multiple features are fed to the classifier network, which is trained by ensemble learning. We train our proposed model on two benchmark gait datasets and, through thorough experiments, demonstrate that it outperforms other gender classification methods. The contributions of our work are as follows:
• GEI provides a concise representation of movement that can be used for gender classification. However, GEI lacks photometric information and does not clearly display the body shape. We observe that postures, such as the stance and swing images of the walking cycle, exhibit unique features that can provide complementary information for gender classification. In order to improve the gender classification accuracy, we exploit multiple modality inputs of GEI and postures.
• We propose a multi-stream network for feature extraction from the multiple modality inputs. The extracted features are fused and fed to the classifier. We design the training process to allow the feature extractor network to gradually learn from a variety of inputs. The proposed multi-stage framework, through ensemble learning, predicts the gender class irrespective of other factors such as viewing angle and walking status.
• We adopt data augmentation to address the class imbalance problem of the gait dataset. The investigation is performed on two benchmark datasets, and a comparison analysis is carried out with recently proposed methods based on deterministic and deep learning approaches. We demonstrate that our proposed model outperforms these reference methods, which only utilize either GEI or posture images.
The rest of this paper is organized as follows. The related research on gender classification is reviewed in Sect. 2, focusing on gait-based gender recognition techniques and the datasets created for gender classification research. Our proposed gender classification framework is described in Sect. 3. Experiments are performed on two publicly available gait datasets, and quantitative measures are adopted for performance evaluation. In Sect. 4, we compare the performance of our proposed model with a variety of gender classification methods.
Section 5 presents the ablation study performed on our proposed gender classification model. Finally, we draw the conclusion in Sect. 6.


2 Related Work
Gait-based gender classification methods, depending on the way visual information is extracted from the images, can be grouped into two categories: deterministic algorithms and deep learning-based models. Deterministic algorithms perform gender classification via the computation of handcrafted features, followed by gender prediction from the feature vector. Yu et al. [4] utilized GEI as the appearance-based gait feature. In [5], a textural feature, the local binary pattern (LBP), is computed from GEI and then input to the classifier. Hu and Wang [6] proposed the Gait Principal Component Image to represent the movement of the body in one gait period; the features are then input to a K-nearest-neighbor classifier. The support vector machine (SVM) is a popular choice of classifier in many recently proposed methods, e.g., [5], due to its robustness. Saini and Singh [7] proposed a gender recognition system using SVM and multi-linear discriminant analysis as classifiers. Do et al. [8] proposed a view-dependent gender classification system in which the viewing angle (i.e., the walking direction) is first estimated, and the gender of a human captured in an arbitrary view is predicted with multiple view-dependent SVM classifiers. Deep learning has brought forth rapid advancement in computer vision. In contrast to deterministic algorithms, deep learning is machine learning based on learning data representations. With the development of CNNs [9] and the use of graphics processing units, significant advancement has been reported. A CNN model is trained to learn feature extraction from a training dataset. Many studies have found that features extracted by deep learning-based algorithms can vastly outperform hand-crafted features computed by deterministic algorithms. Gait-based gender classification also benefits from the adoption of CNN models. Shiraga et al. [10] developed a gait recognition method from GEI with the use of a CNN. Zhang et al. [11] proposed a complex gait recognition framework which contains two parallel Siamese networks; similar/dissimilar GEIs are used to train the two networks, while only one Siamese network is used for testing. The gait feature extracted by the network is fed to the classifier. These two methods [10, 11] are used for human identification. Takemura et al. [12] studied and compared several CNN gait recognition models. Sakata et al. [13] first proposed a network for classifying gender, age group, and age from GEI; it contains one convolutional block and three parallel fully connected layers. They further proposed a larger network, containing 13 structurally identical convolutional blocks organized in three layers, to address the same classification problem. Xu et al. [14] proposed a CNN framework for real-time gender classification. From a single image, the human silhouette is segmented by graph-cut. Based on the single silhouette, a gait cycle is synthesized by the phase-aware reconstructor [15]. The gait cycle is then input to GaitSet [16] for gait recognition. In training, two images from two viewing angles are selected, and the two synthesized gait cycles are fed into two GaitSet networks for model learning. In the testing phase, only one image frame is used as input and one GaitSet is utilized as the feature extractor. The method has the advantage of quick response, since recognition is performed per image instead of waiting for the acquisition of the whole gait cycle. GEI, containing a mixture of static and dynamic gait features


from the original image sequence, provides distinctive features for gait recognition. Features extracted from a synthesized gait cycle will be less precise. Moreover, the complex framework with two networks demands a high computational load in training. Besides using the single input of GEI, some research utilizes additional input information and adopts a multi-stream framework for gender classification. Bei et al. [17] proposed a two-stream CNN to combine GEI and optical flow information. A single GEI image lacks temporal information of the walking cycle. To address this limitation, they proposed a new average silhouette, the subGEI, which is computed from fewer image frames than the complete gait cycle. Temporal information, characterized by the optical flow map, is then computed from two adjacent subGEIs. Russel and Selvaraj [18] proposed a unified model with the input of GEI to six parallel CNNs. The parallel networks contain varying numbers of convolution layers. Each GEI is transformed into a multi-scale representation for the network to learn the discriminative gait features. In summary, deep learning models can achieve higher gender classification accuracy than deterministic algorithms. However, they demand the optimization of a number of hyperparameters and the computation of a large number of system parameters. Moreover, they also rely heavily on loss functions. In order to facilitate the development of data-driven gait recognition models, various gait datasets have been created. An accurate CNN model demands training on a large dataset. For instance, gait databases should contain videos capturing a large number of human subjects, and each subject should be instructed to walk in different directions and/or be recorded by multiple cameras set in a wide range of viewpoints. Sakata et al. [13] trained their frameworks on the single-view gait dataset OU-ISIR LP [19, 20]. OU-ISIR LP is a large dataset containing 32,753 females and 31,093 males with a wide range of ages. The videos were captured by a single camera, and the dataset provides pre-computed GEIs. Xu et al. [14] trained their model on the multi-view gait dataset OU-ISIR MVLP [12, 21]. OU-ISIR MVLP contains videos captured by 14 cameras set at different viewing angles, and the dataset also provides a total of 267,386 pre-computed GEIs. Zhang and Wang [22] created a small gait dataset, the IRIP Gait Database, which only contains 32 male subjects and 28 female subjects. The Soton database [23] is a relatively old dataset; it contains 400 subjects with unique multi-modal data (e.g., multi-view gait records, face images). CASIA [24] is also a gait database with multi-view images. The CASIA B gait dataset consists of videos captured from 124 subjects, each with 11 viewing angles and 3 walking conditions. Another large-scale dataset, GREW [25], was created for research on gait recognition in the wild. It provides silhouette sequences, GEIs, and optical flow maps, computed from videos captured from 26,345 subjects.

3 Gender Classification Framework
GEI, which is computed from the silhouette images over a gait cycle, concisely represents the movement of the subject. Figure 1(a) and (b) show the GEIs of a male and a female, respectively. To observe the difference between the two GEIs, we perform image subtraction and enhance the difference with gamma correction for better visualization, as shown in Fig. 1(c). The brighter pixels in the subtraction result correspond to larger differences between the two GEIs. Gender recognition based on GEI learns from


the difference of male and female GEIs in gender prediction. Besides the exploitation of movement information, we propose a framework that also learns the difference of male and female postures. Figure 1(d) and (e) show the images of a male and a female at the same posture of the walking cycle, respectively. The subtraction result, shown in Fig. 1(f), illustrates the difference of the postures. It is clear that the difference of postures can provide complementary information to that from the difference of GEIs. A gender recognition model can learn additional features from posture images, which benefits gender classification. In order to improve the gender recognition accuracy, we therefore propose a multi-stream, multi-stage gender classification framework with the inputs of GEI and the stance and swing images extracted from the walking cycle.
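For reference, the GEI used throughout this paper is simply the per-pixel average of the aligned binary silhouettes over one gait cycle [3]. The following is a minimal sketch of that computation (our own illustration; alignment and size normalization are assumed to have been done beforehand).

```python
import numpy as np

def gait_energy_image(silhouettes):
    """Average the aligned, binarized silhouettes of one gait cycle into a GEI.
    silhouettes: iterable of (H, W) arrays with values in {0, 1}."""
    stack = np.stack([s.astype(np.float32) for s in silhouettes], axis=0)  # (T, H, W)
    return stack.mean(axis=0)                                              # (H, W) in [0, 1]
```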


Fig. 1. Difference of male and female GEIs and posture images: (a) male GEI; (b) female GEI; (c) difference between male and female GEIs; (d) male posture image; (e) female posture image; (f) difference between male and female posture images. The difference images are enhanced by gamma correction with the same factor of 0.5 for better visualization.

Figure 2 shows the proposed gender classification framework. Stage 1 contains three parallel streams of CNN, which are trained to extract gait features from the input images. The first input is the GEI image computed from the silhouette sequence. The second and the third inputs are the stance and swing images extracted from the walking cycle. The feature vectors extracted by the CNNs are concatenated. Ensemble learning is adopted to train the Stage 2 CNN for gender prediction. Based on the proposed framework, we design the gender classification model. We design a CNN for gait feature extraction from the input of GEI. As shown in Fig. 3, the GEI CNN model consists of convolutional layers, max-pooling layers, and batch normalization layers. If we use the single-stage network for gender classification,

Gender Classification from Gait Energy and Posture Images

167

a densely-connected layer and an output layer are added. For the multi-stage gender classification model, the feature vector is fused with the other feature vectors and fed into the Stage 2 CNN.

Fig. 2. Proposed gender classification framework.

Fig. 3. GEI CNN model.

The gait of human walking is periodic. Within one walking cycle, one foot stays on the ground (stance phase) for about 60% of the cycle and then lifts off the ground (swing phase) for about 40% of the cycle. Besides GEI, some state-of-the-art methods exploit pose information for gait recognition. Kwon et al. [26] proposed the joint swing energy, which is computed from the skeletons found in three coordinate planes. Zhao et al. [27] utilized a pre-trained convolutional network to generate pose heatmaps from RGB images. Our gender classification framework also extracts posture-related features from the image sequence. The images of the stance and swing phases of the right leg are extracted from the video and fed into the respective CNN models, as shown in Fig. 2. The Stance CNN and Swing CNN models have the same structure as the GEI CNN model (see Fig. 3).


The feature vectors (each with a dimension of 1 × 1,024) extracted by the three CNN models mentioned previously are fused to form a wide feature vector of 1 × 3,072. It is then reshaped to 128 × 24 and input to the Stage 2 CNN. Ensemble learning is adopted to train the CNN for the final gender prediction output. Figure 4 shows the Stage 2 CNN model. There are two convolutional blocks, each containing a convolutional layer, a max-pooling layer, and a batch normalization layer. There is an imbalance problem in the gait dataset. For instance, the CASIA B gait dataset [24] contains 10,187 male GEIs and 3,405 female GEIs. To address this problem, we adopt data augmentation to increase the number of female GEIs to match the male counterpart. First, a new female GEI is generated by blending two original female GEIs, as shown in Fig. 5. A total of 45,559 new female GEIs are generated, from which 6,782 are randomly selected as the augmented female data. We also sample more stance and swing images from videos of female subjects.
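The fusion and reshaping step described above can be sketched as follows; the exact tensor layout fed to the Stage 2 CNN is our assumption of one reasonable implementation, not the authors' code.

```python
import torch

def fuse_features(f_gei, f_stance, f_swing):
    """Concatenate the three (N, 1024) feature vectors into a (N, 3072) vector and
    reshape it to 128 x 24 before feeding it to the Stage 2 CNN."""
    fused = torch.cat([f_gei, f_stance, f_swing], dim=1)    # (N, 3072)
    return fused.view(fused.size(0), 1, 128, 24)            # (N, 1, 128, 24) for a 2-D CNN
```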

Fig. 4. Stage 2 CNN model.

Fig. 5. Generation of augmented female GEI.
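The augmentation of Fig. 5 can be sketched as a simple pixel-wise blending of two original female GEIs. The blending factor and the random pairing below are our own assumptions for illustration.

```python
import numpy as np

def blend_geis(gei_a, gei_b, alpha=0.5):
    """Generate an augmented female GEI by blending two original female GEIs."""
    return alpha * gei_a + (1.0 - alpha) * gei_b

def augment_female_geis(female_geis, n_new, seed=0):
    """Create n_new blended GEIs from randomly chosen pairs; a subset of the
    generated images can then be sampled to balance the two classes."""
    rng = np.random.default_rng(seed)
    new_geis = []
    for _ in range(n_new):
        i, j = rng.choice(len(female_geis), size=2, replace=False)
        new_geis.append(blend_geis(female_geis[i], female_geis[j]))
    return new_geis
```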

The GEI CNN model, Stance CNN model, and Swing CNN model are trained independently. Inputs from two different angles are combined to form one set of samples. As for the CASIA B dataset, a total of 55 sets are generated. We adopt 5-fold cross-validation as the training method and the categorical cross-entropy CE as the loss function:

CE(x) = − Σ_{i=1}^{C} y_i log(f(x_i)),   (1)


where y_i is the ground-truth label for gender class i, x_i is the score for gender class i, f is the sigmoid activation function, and C is the number of gender classes. Each set of samples is divided into an 80% training set and a 20% testing set. From the training set, 20% of the samples are reserved for validation. The model is trained on each set 10 times, and the total number of training epochs is 2750. The extracted features are fused and fed to the Stage 2 CNN classifier. The whole framework is further trained through ensemble learning with the dataset divided into an 80% training set and a 20% validation set.
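A minimal sketch of the loss in Eq. (1) with one-hot targets is given below; this is our illustration, not the authors' training code.

```python
import torch

def gender_ce_loss(scores, targets, eps=1e-8):
    """Categorical cross-entropy of Eq. (1) for the two gender classes.
    scores:  (N, 2) class probabilities after the output activation
    targets: (N, 2) one-hot ground-truth labels
    """
    return -(targets * torch.log(scores + eps)).sum(dim=1).mean()
```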

4 Experiments and Results
We train our proposed model on a computer with an Intel i5 CPU, an Nvidia GeForce RTX 3060 GPU, 32 GB RAM, and 1 TB disk memory. We evaluate and compare the performance of our proposed framework with state-of-the-art gait-based gender classification methods on two benchmark datasets, CASIA B [24] and OU-ISIR MVLP [12, 21]. The reference methods include a texture-based algorithm, a posture-based algorithm, and deep learning-based models. El-Alfy et al. [5] proposed a texture-based gender recognition algorithm based on fuzzy local binary pattern (FLBP*) features extracted from GEI; SVM is adopted for the prediction of the gender of the walking subject. Experimental results on the CASIA B dataset demonstrate that FLBP* outperforms four other LBP-based methods. Isaac et al. [28] proposed a posture-based gender classification method. Instead of demanding a complete gait cycle, their method extracts features from each frame of the image sequence. Gender classification, called pose-based voting (PBV), is achieved based on the most probable predictions. Experiments were performed on the CASIA B dataset with two feature extraction techniques: elliptic Fourier descriptors (PBV-EFD) and a consolidated vector of row-column summation (PBV-RCS). Linear discriminant analysis is adopted for gender classification. The two models achieve high gender classification accuracy, even surpassing the CNN + SVM method. Russel and Selvaraj [18] proposed a complex gender classification framework containing six parallel CNNs with the input of GEI. The parallel networks contain varying numbers of convolution layers. Besides evaluating the performance of gender classification from multiple networks, they also investigate the performance of individual networks. For comparison, we select a single network model, CNN C_customized, which has similar complexity to our proposed model. To evaluate the gender classification accuracy, we calculate the Recall (Re), Precision (Pr), and total accuracy (Acc):

Recall = TP / (TP + FN),   (2)

Precision = TP / (TP + FP),   (3)

Acc = (TP + TN) / (TP + FP + TN + FN),   (4)


where TP is True Positive, TN is True Negative, FP is False Positive, and FN is False Negative.
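Equations (2)–(4) translate directly into code, for example:

```python
def classification_metrics(tp, fp, tn, fn):
    """Recall, precision and total accuracy as defined in Eqs. (2)-(4),
    computed from confusion-matrix counts."""
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return recall, precision, accuracy
```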


Table 1. Comparison of our proposed model with other methods on CASIA B dataset.

Method                | Re    | Pr    | Acc
FLBP* [5]             | 0.964 | 0.965 | 0.964
PBV-EFD [28]          | 0.929 | 0.995 | 0.953
PBV-RCS [28]          | 0.983 | 0.979 | 0.979
CNN C_customized [18] | –     | –     | 0.960
Our proposed model    | –     | –     | 0.981

Table 1 compares the performance of our proposed model and the four state-of-the-art methods mentioned above on the CASIA B dataset in terms of Re, Pr, and Acc. The best result is highlighted in red, and the second-best result in blue. FLBP* extracts features from GEI, while PBV extracts features from posture. Our proposed model outperforms these two methods by using features from both GEI and posture images. CNN C_customized also extracts features from GEI; although it achieves the highest Pr, its Acc is relatively low due to a low value of Re. Our proposed model achieves uniformly high values in Re, Pr, and Acc. Compared with the four recently proposed methods, covering deterministic algorithms and a CNN model, our proposed model achieves the best score in Re and Acc, and the second-best score in Pr.

Table 2. Comparison of our proposed model with other methods on OU-ISIR MVLP dataset.

Method             | Acc
GEINet [10]        | 0.939
GaitSet [16]       | 0.927
Xu [14]            | 0.943
Our proposed model | 0.943

GEINet [10] is an eight-layered network with the input of GEI to two triplets (convolution, pooling, normalization) and two fully connected layers. GaitSet [16] is a flexible multi-stream CNN framework with the input of a gait silhouette sequence. Xu et al. [14] proposed a CNN framework in which two GaitSet networks process two reconstructed gait silhouette sequences during training, while only one GaitSet network is used for testing. These deep learning-based methods exploit inputs of GEI, a single original gait sequence, and multi-view synthesized gait sequences. Table 2 compares the performance of our proposed model and the three methods mentioned above on the OU-ISIR MVLP dataset in terms of Acc. Our proposed model outperforms both GEINet and GaitSet. GaitSet, with the input of a large number of gait silhouettes in parallel, is a complex network. Our proposed model, which is a relatively simple network with the inputs of GEI and posture images, performs as well as the complex framework of Xu et al. [14] with two GaitSet networks.


5 Ablation Study
We evaluate and compare the Stage 1 GEI CNN and our proposed model (Stage 1 + Stage 2). Tables 3 and 4 show the performance of all models on the CASIA B dataset and the OU-ISIR MVLP dataset, respectively.

Table 3. Performance evaluation of Stage 1 GEI CNN and Stage 1 + Stage 2 CNN on CASIA B dataset.

Model                  | Female Recall | Female Precision | Male Recall | Male Precision | Acc
Stage 1 (only GEI CNN) | 0.870         | 0.881            | 0.879       | 0.869          | 0.875
Stage 1 + Stage 2      | 0.992         | 0.964            | 0.973       | 0.994          | 0.981

Table 4. Performance evaluation of Stage 1 GEI CNN and Stage 1 + Stage 2 CNN on OU-ISIR MVLP dataset.

Model                  | Female Recall | Female Precision | Male Recall | Male Precision | Acc
Stage 1 (only GEI CNN) | 0.844         | 0.778            | 0.789       | 0.852          | 0.815
Stage 1 + Stage 2      | 0.942         | 0.947            | 0.945       | 0.940          | 0.943

Our proposed model achieves higher Re, Pr, and Acc than the GEI CNN on both datasets. This demonstrates that the proposed framework, which exploits features from both GEI and postures, achieves higher gender classification accuracy than the model that only utilizes GEI. Table 5 compares the inference time per set of inputs (GEI, stance image, swing image) and the number of parameters of the Stage 1 CNN and our proposed model. Stage 1, a multi-stream CNN, contains most of the parameters of the framework. That guarantees the feature extractor has sufficient analytical power to extract useful features from the multiple modality inputs.

Table 5. Inference time per single set of GEI/Stance/Swing images and number of parameters of our proposed model.

Model             | Inference time (sec) | Number of parameters
Stage 1           | 0.029                | 52 M
Stage 1 + Stage 2 | 0.327                | 53 M


6 Conclusion We propose a multi-stream gender classification framework with heterogeneous inputs of GEI, stance image, and swing image. We adopt ensemble learning to train our proposed model. Experiments are performed to evaluate and compare the performance of our proposed model with other gait-based gender classification methods on benchmark datasets. We demonstrate that our proposed model with features extracted from multiple modality inputs achieves higher gender classification accuracy than a variety of recently proposed methods that only utilize a single type of visual input such as GEI or posture image.

References 1. Harris, E.J., Khoo, I.-H., Demircan, E.: A survey of human gait-based artificial intelligence applications. Front. Robot. AI 8, Article 749274 (2022) 2. Liao, R., Yu, S., An, W., Huang, Y.: A model-based gait recognition method with body pose and human prior knowledge. Pattern Recogn.Recogn. 98, 107069 (2020) 3. Han, J., Bhanu, B.: Individual recognition using gait energy image. IEEE Trans. Pattern Anal. Mach. Intell.Intell. 28(2), 316–322 (2006) 4. Yu, S., Tan, T., Huang, K., Jia, K., Wu, X.: A study on gait-based gender classification. IEEE Trans. Image Process. 18(8), 1905–1910 (2009) 5. El-Alfy, E.-S.M., Binsaadoon, A.G.: Automated gait-based gender identification using fuzzy local binary patterns with tuned parameters. J. Ambient. Intell. Humaniz. Comput.Intell. Humaniz. Comput. 10, 2495–2504 (2019) 6. Hu, M., Wang, Y.: A new approach for gender classification based on gait analysis. In: Proceedings of International Conference on Image and Graphics, pp. 869–874 (2009) 7. Saini, A., Singh, H.: Enhanced human identity and gender recognition from gait sequences using SVM and MDA. Int. J. Comput. Appl. 119(2) (2015) 8. Do, T.D., Nguyen, V.H., Kim, H.: Real-time and robust multiple-view gender classification using gait features in video surveillance. Pattern Anal. Appl. 23, 399–413 (2020) 9. Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. In: Proceedings of the 25th International Conference on Neural Information Processing Systems, vol. 1, pp. 1097–1105 (2012) 10. Shiraga, K., Makihara, Y., Muramatsu, D., Echigo, T., Yagi, Y.: Geinet: view-invariant gait recognition using a convolutional neural network. In: Proceedings of International Conference on Biometrics, pp. 1–8 (2016) 11. Zhang, C., Liu, W., Ma, H., Fu, H.: Siamese neural network based gait recognition for human identification. In: Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 2832–2836 (2016) 12. Takemura, N., Makihara, Y., Muramatsu, D., Echigo, T., Yagi, Y.: Multi-view large population gait dataset and its performance evaluation for cross-view gait recognition. IPSJ Trans. Comput. Vision Appl. 10(4), 1–14 (2018) 13. Sakata, A., Takemura, N., Yagi, Y.: Gait-based age estimation using multi-stage convolutional neural network. IPSJ Trans. Comput. Vision Appl. 11(4), 1–10 (2019) 14. Xu, C., et al.: Real-time gait-based age estimation and gender classification from a single image. In: Proceedings of the IEEE/CVF Conference on Applications of Computer Vision, pp. 3460–3470 (2021)


15. Xu, C., Makihara, Y., Li, X., Yagi, Y., Lu, J.: Gait recognition from a single image using a phase-aware gait cycle reconstruction network. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12364, pp. 386–403. Springer, Cham (2020). https:// doi.org/10.1007/978-3-030-58529-7_23 16. Chao, H., He, Y., Zhang, J., Feng, J.: GaitSet: regarding gait as a set for cross-view gait recognition. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33, no. 1, pp. 8126–8133 (2019) 17. Bei, S., Deng, J., Zhen, Z., Shaojing, S.: Gender recognition via fused silhouette features based on visual sensors. IEEE Sens. J. 19(20), 9496–9503 (2019) 18. Russel, N.S., Selvaraj, A.: Gender discrimination, age group classification and carried object recognition from gait energy image using fusion of parallel convolutional neural network. IET Image Proc. 15, 239–251 (2021) 19. Xu, C., Makihara, Y., Ogi, G., Li, X., Yagi, Y., Lu, J.: The OU-ISIR gait database comprising the large population dataset with age and performance evaluation of age estimation. IPSJ Trans. Comput. Vision Appl. 9(24), 1–14 (2017) 20. OU-ISIR Gait Database, Large Population Dataset with Age. http://www.am.sanken.osakau.ac.jp/BiometricDB/GaitLPAge.html 21. The OU-ISIR Gait Database, Multi-view Large Population Dataset. http://www.am.sanken. osaka-u.ac.jp/BiometricDB/GaitMVLP.html 22. Zhang, D., Wang, Y.: Gender recognition based on fusion of face and gait information. In: Proceeding of International Conference on Machine Learning and Cybernetics, pp. 62–67 (2008) 23. Samangooei, S., Bustard, J.D., Seely, R.D., Nixon, M.S., Carter, J.N.: Acquisition and analysis of a dataset comprising gait, ear, and semantic data. In: Multibiometrics for Human Identification. Cambridge University Press, Chapter 12 (2011) 24. CASIA Gait Database. http://www.sinobiometrics.com 25. Zhu, Z., et al.: Gait recognition in the wild: a benchmark. In: Proceedings of IEEE International Conference on Computer Vision, pp. 14789–14799 (2021) 26. Kwon, B., Lee, S.: Joint swing energy for skeleton-based gender classification. IEEE Access 9, 28334–28348 (2021) 27. Zhao, L., Guo, L., Zhang, R., Xie, X., Ye, X.: MmGaitSet: multimodal based gait recognition for countering carrying and clothing changes. Appl. Intell.Intell. 52, 2023–2036 (2022) 28. Isaac, E.R.H.P., Elias, S., Rajagopalan, S., Easwarakumar, K.S.: Multiview gait-based gender classification through pose-based voting. Pattern Recogn. Lett.Recogn. Lett. 126, 41–50 (2019)

Oil Temperature Prediction Method Based on Deep Learning and Digital Twins Zengxu Bian1 , Zhibo Wan1(B) , Feiyu Li1 , Dejun Liu2 , and Zhihan Lyu3(B) 1 Qingdao University, Qingdao, China

[email protected]

2 China Academy of Railway Sciences, Beijing, China 3 Uppsala University, Uppsala, Sweden

[email protected]

Abstract. Oil temperature is an important indicator of transformer health, and oil temperature prediction is widely used in transformer state detection, fault diagnosis, and other fields. However, many factors affect the transformer oil temperature, such as season, weather, load, and holidays. In order to uncover the relationship between oil temperature and the various influencing factors and to perform more accurate prediction, this paper proposes a multi-step, long-sequence oil temperature prediction model based on particle swarm optimization (PSO) and Pyraformer, and combines it with a digital twin system. Pyraformer can capture a wide range of temporal dependencies, and we use PSO to optimize its hyper-parameters to improve the prediction accuracy. The digital twin system can provide more data support and real-time feedback for the prediction model, further improving its accuracy and reliability, and it can also feed the predictions back into management activities such as monitoring and detection. Experiments on two datasets with different time-unit steps show that PSO-Pyraformer achieves the highest prediction accuracy in the long-term, multi-step, multivariate-to-univariate prediction task. Keywords: Oil temperature prediction · Pyraformer · PSO · Digital twins system · Parameter optimization

1 Introduction
As one of the indispensable components of the power system, the transformer undertakes important tasks such as power transmission and distribution [1]. In particular, the transformer oil temperature is an important index for evaluating the state of the transformer, which usually needs to be monitored and predicted in real time [2]. Traditional prediction methods based on physical models need to consider a large number of physical factors and are easily affected by model deviation and uncertainty. With the development of intelligent technology, the use of multi-source data to establish prediction models has become a hot research field [3]. The current traditional hyper-parameter optimization methods select the optimal combination by traversing the given hyper-parameter combinations. However, for long-term


series forecasting problems, the hyper-parameter space is often very large, and the grid search method is inefficient. Bayesian optimization methods, for example, estimate the performance of different hyper-parameter combinations by building a model and choose the combination with the highest performance. Such methods can find a good solution within a limited number of trials, but for long-term series forecasting problems, time factors and the dynamic changes of the model need to be considered. Liu et al. (2022) proposed Pyraformer, a low-complexity pyramidal attention model for long-range time series modeling and forecasting [4]. Pyraformer can achieve higher prediction accuracy with less time and memory consumption in single-step and long-term multi-step prediction tasks [5]. However, the algorithm also has some challenges, such as choosing an appropriate hierarchical structure and model parameters, as well as computational and memory overhead when processing long sequences; therefore, proper tuning and parameter selection are required in practice. This paper therefore discusses the feasibility and advantages of using a PSO-Pyraformer model to predict transformer oil temperature from multivariate inputs to a univariate target, and proposes a method combined with a digital twin system. The digital twin system can provide real-time feedback and more comprehensive data support for the model, making the prediction more accurate. The main contributions of this study are as follows:
(1) The model better captures the many factors related to oil temperature: the change of oil temperature is affected by many factors, such as ambient temperature, turbine load, and running time [6]. Using multiple variables to predict a single variable more fully accounts for the relationships between these factors, thereby improving the prediction accuracy.
(2) The advantages of the PSO algorithm and the Pyraformer model are fully exploited: the PSO algorithm has a global search ability and can adaptively adjust each parameter to optimize the prediction model, and its combination with the Pyraformer model further improves the generalization and accuracy of the model.
(3) A digital twin system for transformer oil temperature prediction is constructed. A variety of indicators and the oil temperature prediction results are displayed in the system, providing reliable data support for the transformer supervision and prediction system. Deep learning and digital twin technology are integrated to support decision-making for transformer management, and the system makes it convenient for managers to carry out real-time monitoring and management.
The rest of the paper is organized as follows: Part 2 establishes the prediction model; Part 3 describes the digital twin system; Part 4 presents the experiments; Part 5 gives the summary and an outlook on future work.

2 Research Methods and Model Establishment

2.1 PSO
The particle swarm optimization algorithm simulates the collective behaviour of natural groups such as bird flocks and fish schools to find the optimal solution of multivariate nonlinear problems [7]. The basic idea is to find the global optimal solution by repeatedly sampling the search space.


In this algorithm, the i-th particle has a position x_i(t) and a velocity v_i(t) at time t. The update formulas for the particle velocity and position are:

v_i(t + 1) = ρ · v_i(t) + α · rand_1 · (pbest_i − x_i(t)) + β · rand_2 · (gbest − x_i(t)),   (1)

x_i(t + 1) = x_i(t) + v_i(t + 1),   (2)

Here, ρ is the inertia weight, α and β are acceleration factors, and rand_1 and rand_2 are random numbers in the range [0, 1]. We use the PSO algorithm to adjust the hyper-parameters of the Pyraformer model so as to minimize the value of the objective function. The PSO algorithm optimizes parameters by simulating the behavior of a flock of birds and finds the optimal solution through continuous iteration. By fusing the PSO and Pyraformer models, the optimization ability of the PSO algorithm can be combined with the prediction ability of the Pyraformer model, thereby improving the accuracy and reliability of the oil temperature prediction.

2.2 Pyraformer

Compared with a general attention mechanism, the core of Pyraformer is the introduction of the pyramidal attention module (PAM) [8]. Different from the traditional attention mechanism, each node x_l^(s) at scale s attends only to a restricted neighborhood

X_l^(s) = A_l^(s) ∪ C_l^(s) ∪ P_l^(s),

which includes the adjacent nodes A_l^(s) at the same scale, the children nodes C_l^(s), and the parent node P_l^(s), i.e., the adjacent nodes of each node on three different scales. The attention at node x_l^(s) then simplifies to

y_i = Σ_{l ∈ X_l^(s)} exp(q_i k_l^T / √d_K) v_l / Σ_{l ∈ X_l^(s)} exp(q_i k_l^T / √d_K),   (7)

where X_l^(s) represents the set of neighboring nodes at the three scales. By introducing multi-scale information capture and long-distance dependence modeling, the pyramidal attention mechanism can provide more comprehensive and diversified context information [9], enhance model performance and generalization ability [10], and effectively improve prediction accuracy. The advantage of the Pyraformer algorithm is that it can efficiently process long-term series data and capture dependencies at different scales, and it has achieved good performance in a number of time series forecasting tasks.
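A minimal sketch of the neighborhood-restricted attention in Eq. (7) is given below. Building the pyramidal graph itself is omitted, and this mask-based formulation is our own illustration rather than the official Pyraformer implementation.

```python
import torch

def neighborhood_attention(q, k, v, neighbor_mask):
    """Scaled dot-product attention in which node i only attends to the nodes in
    its pyramidal neighborhood X_l (adjacent, children and parent nodes).
    q, k, v:       (L, d_k) tensors
    neighbor_mask: (L, L) boolean matrix, True where node i may attend to node l;
                   each node is assumed to be contained in its own neighborhood.
    """
    d_k = q.size(-1)
    scores = q @ k.transpose(0, 1) / d_k ** 0.5            # q_i k_l^T / sqrt(d_K)
    scores = scores.masked_fill(~neighbor_mask, float("-inf"))
    weights = torch.softmax(scores, dim=-1)                # normalized over X_l only
    return weights @ v                                     # y_i
```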


2.3 Proposed PSO-Pyraformer Prediction Model
The advantages of the PSO algorithm and the Pyraformer model are comprehensively utilized:
(1) The PSO algorithm can search for the optimal solution globally, and the Pyraformer model can accurately predict the oil temperature.
(2) The PSO algorithm can improve the prediction accuracy of the Pyraformer model by iteratively optimizing its parameters.
(3) The fusion of the two algorithms allows them to complement each other's advantages, thereby improving the accuracy and stability of the prediction results.
(4) The PSO algorithm can avoid falling into a local optimum, thereby increasing the robustness and generalization ability of the model (Fig. 1).

Fig. 1. PSO-Pyraformer model.

In the above framework, the PSO algorithm and the Pyraformer model interact through the input data. Our goal is to use the PSO algorithm to update the hyper-parameters of Pyraformer to improve the accuracy of the model's predictions. Specifically, the scheme proceeds in the following steps:
Step 1: The input data is passed to Pyraformer, which processes the data features and makes predictions.

178

Z. Bian et al.

Step 2: The PSO algorithm calculates the fitness of each individual according to the current particle position (the hyper-parameters) to find the best combination.
Step 3: The PSO algorithm updates the particle velocities and positions according to the global and local optimal solutions.
Step 4: The updated hyper-parameters are passed to the Pyraformer model, and Steps 1–3 are repeated until the stopping condition is reached.
Step 5: The optimal hyper-parameters found by the entire system are sent back to the model for prediction.
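Steps 1–5 can be sketched as a standard PSO loop wrapped around a model-training callback. The bounds, swarm size, and the fitness function below are illustrative assumptions, not the exact configuration used in this study; fitness(x) is expected to train/validate the forecasting model with hyper-parameters x and return a validation error to minimize.

```python
import numpy as np

def pso_search(fitness, bounds, n_particles=10, n_iters=20,
               rho=0.7, alpha=1.5, beta=1.5, seed=0):
    """Minimal PSO loop over a continuous hyper-parameter vector (Steps 1-5)."""
    rng = np.random.default_rng(seed)
    low, high = np.array(bounds, dtype=float).T      # bounds: list of (low, high) per dimension
    x = rng.uniform(low, high, size=(n_particles, len(bounds)))
    v = np.zeros_like(x)
    pbest, pbest_f = x.copy(), np.array([fitness(p) for p in x])
    gbest = pbest[pbest_f.argmin()].copy()

    for _ in range(n_iters):
        r1, r2 = rng.random(x.shape), rng.random(x.shape)
        v = rho * v + alpha * r1 * (pbest - x) + beta * r2 * (gbest - x)   # Eq. (1)
        x = np.clip(x + v, low, high)                                      # Eq. (2)
        f = np.array([fitness(p) for p in x])
        improved = f < pbest_f
        pbest[improved], pbest_f[improved] = x[improved], f[improved]
        gbest = pbest[pbest_f.argmin()].copy()
    return gbest
```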

3 Digital Twins System

Fig. 2. Digital twin system framework of power transformer.

The following are the steps to establish a transformer digital twin system [11] (Fig. 2):
Step 1: Sensor data collection: collect the historical data of the transformer, including transformer temperature, humidity, load, and other information. In this study, a TCP/IP protocol simulation is constructed locally [12].
Step 2: Physical model establishment: a mathematical model is constructed to describe the thermodynamic behavior of the transformer, considering the various factors that affect its temperature.

Oil Temperature Prediction Method Based on Deep Learning

179

Step 3: Data-driven model training: historical data is used to train the data-driven model [13]; the data-driven model we use is the PSO-Pyraformer proposed in this study.
Step 4: Prediction application: the real-time environmental parameters are input into the digital twin system [14], and the trained model is used to predict the oil temperature.
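Step 1 mentions a local TCP/IP simulation of the sensor feed. A toy sketch of such a data stream is shown below; the port, field names, and message format are illustrative assumptions and not part of the system described in this paper.

```python
import json
import socket
import time

def stream_simulated_readings(host="127.0.0.1", port=9009, n=5):
    """Push a few JSON-encoded sensor readings (oil temperature, humidity, load)
    to a locally running digital-twin listener over TCP."""
    with socket.create_connection((host, port)) as sock:
        for i in range(n):
            reading = {"t": i, "oil_temp": 30.0 + 0.1 * i, "humidity": 55.0, "load": 0.8}
            sock.sendall((json.dumps(reading) + "\n").encode("utf-8"))
            time.sleep(1.0)   # simulate a periodic sampling interval
```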

Fig. 3. Interface of the Digital Twin System.

Figure 3 shows the interface of the power transformer digital twin system, which includes the oil temperature prediction function, the display of key indicators, and so on. The advantages of the transformer oil temperature prediction digital twin system are as follows:
(1) Real-time monitoring: the digital twin system can obtain the environmental parameters of the transformer in real time and make real-time predictions. This helps to detect potential problems and anomalies in time and to take necessary measures in advance to prevent equipment failure and downtime [15].
(2) Prediction accuracy: by combining mathematical models and historical data for training, the digital twin system can provide relatively accurate oil temperature predictions. It can consider the combined effects of multiple factors, such as load change, ambient temperature, and humidity, thereby increasing the accuracy of the prediction [16].
(3) Cost savings: by predicting the transformer oil temperature, the digital twin system can help optimize equipment maintenance plans and resource allocation. It can guide maintenance personnel to carry out the necessary maintenance or repair when needed, reduce unnecessary maintenance costs and downtime, and reduce operating costs [17].
(4) Risk management: the digital twin system can help identify potential equipment problems in advance and reduce the risk of sudden failures [18]. By providing accurate oil temperature predictions, it can help operators formulate corresponding preventive measures to reduce the impact of equipment failure on production and operation [19, 20].


4 Analytical Study

4.1 Experimental Data
The primary forecasting model in this study builds on the long-term series forecasting model Informer proposed by Zhou et al. (2021) [21].

Fig. 4. Data curve of data set ETTh1.

Fig. 5. Data curve of data set ETTm1.

The experimental data therefore comes from the oil temperature datasets introduced in that paper: ETTh1 and ETTm1, collected from transformers at two sites between July 2016 and July 2018. ETTh1 is recorded in units of hours, with one oil temperature record collected every hour, while ETTm1 collects one record every 15 min [22]. Figures 4 and 5 show an overall view of the two datasets, which exhibit evident seasonal trends.

4.2 Evaluation Indicators
We choose the three most commonly used evaluation indicators, MSE (Mean Square Error), RMSE (Root Mean Square Error), and MAE (Mean Absolute Error), to evaluate


the performance of the prediction model [23]. The smaller these errors are, the better the performance of the method [24]:

MAE = (1/n) Σ_{i=1}^{n} |Ŷ_i − Y_i|,   (8)

MSE = (1/n) Σ_{i=1}^{n} (Ŷ_i − Y_i)²,   (9)

RMSE = √((1/n) Σ_{i=1}^{n} (Ŷ_i − Y_i)²),   (10)

where n is the number of samples, Ŷ = {ŷ_1, ŷ_2, · · · , ŷ_n} is the prediction, Y = {y_1, y_2, · · · , y_n} is the ground truth, and MAE, MSE, RMSE ∈ [0, +∞).
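A small sketch of Eqs. (8)–(10) with NumPy:

```python
import numpy as np

def forecast_errors(y_pred, y_true):
    """MAE, MSE and RMSE of Eqs. (8)-(10) for a batch of forecasts."""
    err = np.asarray(y_pred, dtype=float) - np.asarray(y_true, dtype=float)
    mae = np.abs(err).mean()
    mse = (err ** 2).mean()
    rmse = np.sqrt(mse)
    return mae, mse, rmse
```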











4.3 Experimental Results We evaluated the long-term prediction performance of PSO-Pyraformer on ETTh1 and ETTm1. It is worth noting that these two data sets contain six power load variables and oil temperature, which is a multi-variable to univariate time series prediction problem [25]. We compared it with four common models with better prediction results, including LSTM, Informer, LongTrans and Pyraformer. These algorithms are more representative in the field of long-term sequence prediction. The comparison results can not only prove the prediction advantages of the proposed compared to other algorithms. At the same time, it can be explained that the PSO improves the prediction accuracy of the Pyraformer. As shown in Table 1, the proposed method outperforms other benchmarks. 4.4 Evaluation of Experimental Results In order to better demonstrate the superiority of the proposed method, this paper uses three percentage indicators PMSE (%), PRMSE (%), PMAE (%), corresponding to MSE, RMSE, and MAE. This evaluation indicator reflects the proposed algorithm’s superiority over other models.    MSE BM − MSE PM   × 100%  (11) PMSE % =   MSE BM    RMSE BM − RMSE PM   × 100% (12) PRMSE % =   RMSE BM    MAE BM − MAE PM   × 100% (13) PMAE % =   MAE BM where: MSEBM , RMSEBM , MAEBM is the value of the evaluation index of the benchmark method, MSEPM , RMSEPM , MAEPM is the value of the evaluation index of the proposed method. As shown in the Table 2, compared with LSTM, when l ∈ {168, 336, 720}, the MSE of Pyraformer on ETTh1 is reduced by 36.96%, 37.42% and 54.54%. Compared with the best model Pyraformer, MSE also decreased by 5.44%, 5.71% and 12.81%.


Table 1. Comparison test. The multi-step prediction results on the two datasets are compared, where the prediction lengths l on ETTh1 are 168, 336 and 720, and on ETTm1 are 96, 288 and 672. The input length o = l. The best results are highlighted in bold.

Data        Evaluation Metrics  LSTM   Informer  LongTrans  Pyraformer  Proposed
ETTh1  168  MSE                 1.212  1.075     0.983      0.808       0.764
            RMSE                1.100  1.036     0.991      0.898       0.874
            MAE                 0.867  0.801     0.766      0.683       0.667
       336  MSE                 1.424  1.329     1.100      0.945       0.891
            RMSE                1.193  1.152     1.073      0.972       0.943
            MAE                 0.995  0.911     0.869      0.766       0.735
       720  MSE                 1.960  1.384     1.411      1.022       0.891
            RMSE                1.400  1.176     1.084      1.010       0.943
            MAE                 1.322  0.950     0.991      0.806       0.755
ETTm1  96   MSE                 1.339  0.556     0.554      0.480       0.476
            RMSE                1.157  0.745     0.744      0.692       0.689
            MAE                 0.913  0.537     0.499      0.486       0.478
       288  MSE                 1.740  0.841     0.786      0.754       0.691
            RMSE                1.319  0.917     0.886      0.868       0.831
            MAE                 1.124  0.705     0.676      0.659       0.615
       672  MSE                 2.736  0.921     1.169      0.857       0.751
            RMSE                1.654  0.959     1.081      0.925       0.866
            MAE                 1.555  0.753     0.868      0.707       0.647
Table 2. Percentage improvement of the proposed method compared with the other models.

Data        Metrics    Proposed vs LSTM  Proposed vs Informer  Proposed vs LongTrans  Proposed vs Pyraformer
ETTh1  168  PMSE (%)   36.96             28.93                 22.27                  5.44
            PRMSE (%)  20.54             15.63                 11.80                  2.67
            PMAE (%)   23.06             16.72                 12.92                  2.34
       336  PMSE (%)   37.42             32.95                 19.00                  5.71
            PRMSE (%)  20.95             18.14                 12.11                  2.98
            PMAE (%)   26.13             19.31                 15.42                  4.04
       720  PMSE (%)   54.54             35.62                 36.85                  12.81
            PRMSE (%)  32.64             19.81                 13.00                  6.63
            PMAE (%)   42.88             20.52                 23.81                  6.32
ETTm1  96   PMSE (%)   64.45             14.38                 14.07                  0.83
            PRMSE (%)  40.44             7.51                  7.39                   0.43
            PMAE (%)   47.64             10.98                 4.20                   1.64
       288  PMSE (%)   60.28             17.83                 12.08                  8.35
            PRMSE (%)  36.99             9.37                  6.20                   4.26
            PMAE (%)   45.28             12.76                 9.02                   6.67
       672  PMSE (%)   72.55             18.45                 35.75                  12.36
            PRMSE (%)  47.64             9.69                  19.88                  6.37
            PMAE (%)   58.39             14.07                 25.46                  8.48

5 Conclusion

In their study of Pyraformer, the authors mention as future work that they would like to explore how to learn the hyper-parameters adaptively. Motivated by this, this study proposes a PSO-Pyraformer hyper-parameter optimization prediction model and applies it to the digital twin model of a power transformer, in order to improve the accuracy of long-sequence, multi-step, multivariate-to-univariate prediction and to contribute to the intelligent management of the transformer. In the future, we hope to combine this with our previous research, further study univariate-to-univariate oil temperature prediction with shorter horizons, and integrate more deeply with the digital twin system to enrich its applications.

References 1. Thiviyanathan, V.A., Ker, P.J., Leong, Y.S., et al.: Power transformer insulation system: a review on the reactions, fault detection, challenges and future prospects. Alexandria Eng. J. (2022) 2. Abbasi, A.R.: Fault detection and diagnosis in power transformers: a comprehensive review and classification of publications and methods. Electr. Power Syst. Res. 209, 107990 (2022) 3. Li, K., Ye, H., Li, W., et al.: Transformer fault identification method based on multi-source data. In: 18th International Conference on AC and DC Power Transmission (ACDC 2022). IET 2022, pp. 791–795 (2022) 4. Liu, S., Yu, H., Liao, C., et al.: Pyraformer: low-complexity pyramidal attention for longrange time series modeling and forecasting. In: International Conference on Learning Representations (2021) 5. Zeng, A., Chen, M., Zhang, L., et al.: Are transformers effective for time series forecasting?. arXiv preprint arXiv:2205.13504 (2022)


6. Cilliyuz, Y., Bicen, Y., Aras, F., et al.: Measurements and performance evaluations of natural ester and mineral oil-immersed identical transformers. Int. J. Electr. Power Energy Syst. 125, 106517 (2021) 7. Marini, F., Walczak, B.: Particle swarm optimization (PSO). A tutorial. Chemometr. Intell. Lab. Syst. 149, 153–165 (2015) 8. Li, H., Xiong, P., An, J., et al.: Pyramid attention network for semantic segmentation. arXiv preprint arXiv:1805.10180 (2018) 9. Wang, J., Xu, G., Yan, F., et al.: Defect transformer: an efficient hybrid transformer architecture for surface defect detection. Measurement 211, 112614 (2023) 10. Feng, C., Su, M., Xu, L., et al.: A novel generalization ability-enhanced approach for corrosion fatigue life prediction of marine welded structures. Int. J. Fatigue 166, 107222 (2023) 11. Moutis, P., Alizadeh-Mousavi, O.: Digital twin of distribution power transformer for real-time monitoring of medium voltage from low voltage measurements. IEEE Trans. Power Deliv. 36(4), 1952–1963 (2020) 12. Hassan, M., Jain, R.: High Performance TCP/IP Networking. Prentice Hall, Upper Saddle River (2003) 13. Jin, Y., Yan, D., Zhang, X., et al.: A data-driven model predictive control for lighting system based on historical occupancy in an office building: methodology development. In: Building Simulation, vol. 14, pp. 219–235. Tsinghua University Press (2021) 14. Nasirahmadi, A., Hensel, O.: Toward the next generation of digitalization in agriculture based on digital twin paradigm. Sensors 22(2), 498 (2022) 15. Sleiti, A.K., Kapat, J.S., Vesely, L.: Digital twin in energy industry: proposed robust digital twin for power plant and other complex capital-intensive large engineering systems. Energy Rep. 8, 3704–3726 (2022) 16. Lin, J., Ma, J., Zhu, J., et al.: Short-term load forecasting based on LSTM networks considering attention mechanism. Int. J. Electr. Power Energy Syst. 137, 107818 (2022) 17. Zhong, D., Xia, Z., Zhu, Y., et al.: Overview of predictive maintenance based on digital twin technology. Heliyon (2023) 18. Aivaliotis, P., Georgoulias, K., Chryssolouris, G.: The use of digital twin for predictive maintenance in manufacturing. Int. J. Comput. Integr. Manuf. 32(11), 1067–1080 (2019) 19. Naqvi, S.M.R., Ghufran, M., Meraghni, S., et al.: Human knowledge centered maintenance decision support in digital twin environment. J. Manuf. Syst. 65, 528–537 (2022) 20. Dhiman, H.S., Deb, D., Muyeen, S.M., et al.: Wind turbine gearbox anomaly detection based on adaptive threshold and twin support vector machines. IEEE Trans. Energy Convers. 36(4), 3462–3469 (2021) 21. Zhou, H., Zhang, S., Peng, J., et al.: Informer: beyond efficient transformer for long sequence time-series forecasting. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 35, no. 12, pp. 11106–11115 (2021) 22. Nie, Y., Nguyen, N.H., Sinthong, P., et al.: A time series is worth 64 words: long-term forecasting with transformers. arXiv preprint arXiv:2211.14730 (2022) 23. Willmott, C.J., Matsuura, K.: Advantages of the mean absolute error (MAE) over the root mean square error (RMSE) in assessing average model performance. Climate Res. 30(1), 79–82 (2005) 24. Chicco, D., Warrens, M.J., Jurman, G.: The coefficient of determination R-squared is more informative than SMAPE, MAE, MAPE, MSE and RMSE in regression analysis evaluation. PeerJ Comput. Sci. 7, e623 (2021) 25. Lim, B., Arık, S.Ö., Loeff, N., et al.: Temporal fusion transformers for interpretable multihorizon time series forecasting. Int. J. Forecast. 
37(4), 1748–1764 (2021)

Feature Channel Adaptive Enhancement for Fine-Grained Visual Classification

Dingzhou Xie, Cheng Pang(B), Guanhua Wu, and Rushi Lan
Guilin University of Electronic Technology, Guilin 541004, China
[email protected]

Abstract. Fine-grained classification poses greater challenges compared to basic-level image classification due to the visually similar sub-species. To distinguish between confusing species, we introduce a novel framework based on feature channel adaptive enhancement and attention erasure. On one hand, a lightweight module employing both channel attention and spatial attention is designed, adaptively enhancing the feature expression of important areas and obtaining more discriminative feature vectors. On the other hand, we incorporate attention erasure methods that compel the network to concentrate on less prominent areas, thereby enhancing the network's robustness. Our method can be seamlessly integrated into various backbone networks. Finally, an evaluation of our approach is conducted across diverse public datasets, accompanied by a comprehensive comparative analysis against state-of-the-art methodologies. The experimental findings substantiate the efficacy and viability of our method in real-world scenarios, exemplifying noteworthy breakthroughs in intricate fine-grained classification endeavors.

Keywords: Fine-grained Visual Classification · Feature Channel Enhancement · Attention Erasure

1 Introduction

Fine-grained visual classification represents a substantial and intricate challenge within the realm of computer vision. Unlike traditional object classification tasks, fine-grained classification requires precise differentiation among objects with similar appearances but belonging to different subcategories. This task holds practical value in various domains such as animal recognition [20], plant classification [28], and vehicle recognition [22]. However, as shown in Fig. 1, fine-grained classification presents a delicate balance between similarity and difference. Objects with similar appearances exhibit subtle local differences that often encompass crucial features determining their categories. Conventional classification methods struggle to capture these minute differences, resulting in poor performance. To overcome this challenge, extensive research has introduced innovative methods and techniques for fine-grained classification. Examples include local feature extraction [12,13,30], key part localization [6,24,36,37,40], metric learning [5,7,32,43], and attention-based mechanisms [19,26,41]. These approaches aim to extract discriminative features from local or


key details, thereby enhancing classification performance by learning subtle differences between categories. Despite the progress made in this area, fine-grained classification still faces several challenges. On one hand, accurate classifiers often require large amounts of fine-grained data and annotations [2,16,23,38] due to the nuanced differences among categories. This limits the scope and scalability of fine-grained classification methods. On the other hand, traditional finegrained classification methods are often sensitive to image changes and noise, resulting in a decline in classification performance in complex scenes. Therefore, this paper proposes a comprehensive method named “Channel Feature Adaptive Enhancement”. Our approach will focus on the extraction of local details and the learning of discriminative features, while exploring how to reduce intra-class variability and improve classifier robustness. At the outset, the extraction of features from both shallow and deep layers is facilitated through the utilization of a convolutional neural network (CNN). Our designed feature enhancement mechanism learns to emphasize key detail features, thereby improving classification performance. Additionally, an attention mechanism is introduced to highlight regions with significant fine-grained differences. This combination of feature augmentation and attention enables us to capture subtle differences more accurately and achieves significant performance gains in fine-grained classification tasks. Secondly, to address confusion among objects with similar appearances, we employ attention erasure. This technique erases highly discriminative areas, reducing intra-class differences, weakening features with high similarity to other categories, and encouraging the network to learn from previously unattended areas. This approach enhances classification accuracy and robustness, effectively reducing the risk of misclassification. In summary, the following constitute the principal contributions of this work: 1. We propose a feature enhancement module that accurately enhances discriminative regions in images. 2. We improve the channel attention mechanism by doubling the value of channels greater than the mean value calculated from the feature map after global average pooling. This assigns higher scores to important channels. 3. We introduce an informative mining module that masks a strong feature and facilitates the learning of complementary features.

2 Related Work

In the realm of fine-grained visual classification tasks, researchers have introduced various methods to enhance, suppress, and diversify features, aiming to improve classification performance. We describe several commonly employed techniques:

2.1 Local Feature Enhancement

Local feature enhancement methods aim to extract crucial local information from objects to enhance classification performance. For instance, Spatial Transformer


Fig. 1. In FGVC images, the same subclass exhibits significant appearance variations, whereas different categories demonstrate subtle appearance disparities.

Networks (STN) [18] uses spatial transformer modules to adaptively learn local regions of interest in images, improving the representation ability of key local features. In addition, Part-based Convolutional Neural Networks (PCNN) [29] divides the image into different parts, classifies each part independently, and then combines the classification results of the parts.

2.2 Global Feature Enhancement

Global feature augmentation methods concentrate on capturing the overall feature representation of the entire object. For example, Multi-scale CNNs [35] employ convolution kernels of varying scales to extract feature representations at different levels. Additionally, Spatial Pyramid Pooling (SPP) [9] divides the image into multiple spatial levels and performs pooling operations on the features within each level, allowing the capture of global information at different scales.

2.3 Attention Mechanism

The attention mechanism plays a pivotal role in fine-grained classification by automatically learning to focus on key regions or features. For instance, Squeezeand-Excitation Networks (SENet) [11] employ a gating mechanism to adaptively adjust the importance of feature channels, thereby enhancing key features that contribute to classification. Moreover, the attention mechanism can generate region-specific attention heatmaps, offering interpretable explanations.

2.4 Feature Suppression

Feature suppression methods aim to reduce noise or irrelevant information that may interfere with classification tasks. For instance, DropBlock [8] randomly drops features in specific regions during training to minimize the impact of redundant information on classification. Additionally, Feature Dropout [31] enhances the model's learning by randomly discarding feature channels during training, promoting the exploration of other relevant features.

2.5 Feature Diversification

Feature diversification methods aim to generate multiple features with different transformations to enhance the robustness of classification models. For example, Cutout [3] randomly occludes a portion of the image, forcing the model to learn more resilient feature representations. Furthermore, the utilization of data augmentation techniques, including rotation, scaling, and translation, can facilitate the generation of a diverse array of features. By comprehensively employing these methods, the performance of finegrained visual classification tasks can be further enhanced, resulting in increased accuracy and reliability in classification results. However, several challenges, such as class imbalance, occlusion, and pose variation, remain unresolved and merit further exploration and resolution in future research.

3 Method

In this section, a comprehensive depiction of the proposed method is presented, and Fig. 2 offers an illustrative overview of the framework. Our model consists of two lightweight modules: (1) the Feature Channel Adaptive Augmentation module (FCAE), which focuses on learning multiple discriminative part-specific representations with maximum diversity; (2) the Information Sufficient Mining Module (ISMM), which randomly erases highly discriminative components to guide the network in learning complementary information.

3.1 Feature Channel Adaptive Enhancement Module

The implementation of the feature enhancement (FE) step is shown in Fig. 3. Let the feature map X_i ∈ \mathbb{R}^{C \times W \times H} be derived from the final three layers of the backbone network, with C, W, H, and i denoting the channel count, width, height, and layer index, respectively. Drawing inspiration from [30], we adopt a simple approach of taking the maximum value of each pixel along the channel dimension, resulting in A_i ∈ \mathbb{R}^{1 \times W \times H}:

A_{i} = \max\left(X_{i}\right) \in \mathbb{R}^{1 \times W \times H}    (1)

Subsequently, we calculate a score for each pixel. Multiplying Ai by a hyperparameter λ and adding it to Xi , we obtain a new feature map Fi :


Fig. 2. The general framework of our model is composed of two major modules: (1) The Feature Channel Adaptive Augmentation module (2) The Information Sufficient Mining Module. (FCAE is comprised of three sub-methods, with the intricate process of ISMM illustrated in Fig. 4.)

F_{i} = \mathrm{softmax}\left(A_{i}\right) \ast X_{i} \ast \lambda + X_{i}    (2)

At this stage, we aim to enhance globally relevant information. To emphasize discriminative local regions, we employ an attention mechanism inspired by [34]. The feature map F_i undergoes channel attention enhancement, which determines the importance of each channel. We perform global average pooling on all channels and calculate the mean value across all channels:

\mathrm{mean} = \mathrm{GAP}\left(\mathrm{CA}\left(F_{i}\right)\right)    (3)

If a specific channel's value exceeds the mean, it is doubled; otherwise, it remains unchanged. Upon the application of a sigmoid activation function, the resulting values are multiplied with F_i:

F_{i} = \begin{cases} 2 \ast F_{i}, & \text{if } F_{i} > \mathrm{mean} \\ F_{i}, & \text{otherwise} \end{cases}    (4)

(5)


Fig. 3. The feature map Xi , generated by the backbone network, undergoes a process where the maximum value is extracted from the channel dimension to derive Ai . Subsequently, the score of each pixel value is computed and added to Xi , resulting in the feature map Fi after global feature enhancement.

3.2 Information Sufficient Mining Module

The implementation of the ISMM method is shown in Fig. 4. In order to further mine information and utilize attention resources, we take the feature map of the last backbone layer, passed through FCAE, as the attention activation map to extract the key information in the feature map. To fuse the features, we utilize a 1×1 convolution. Subsequently, the attention weight of each group of feature-map channels is calculated: we employ global average pooling and select the top k maps (where k is equal to the batch size). Randomly choosing one of them, we apply bilinear interpolation [21] to upsample it to the original image's size. Subsequently, the introduction of a random threshold enables the classification of pixel values, whereby values below the threshold are identified as 0 and those surpassing the threshold as 1. Ultimately, an element-wise multiplication is executed between the acquired mask and the original image, resulting in the creation of the erased image. By employing attention erasure, we can filter out important stimuli that have already been learned and focus on less significant stimuli that may not have been well-learned. By adopting this approach, the efficiency of information processing is effectively heightened, while concurrently optimizing the allocation and utilization of attention resources.
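The erasure step could be sketched in PyTorch roughly as follows. This is an assumption-laden illustration: it follows the thresholding rule exactly as stated above (above-threshold pixels become 1 and are multiplied with the image), and the helper name, the per-sample random channel choice, and the min-max normalization are ours.

```python
import torch
import torch.nn.functional as F


def ismm_erase(image, attention_map, conv1x1, k):
    """Illustrative sketch of the attention-erasure step of ISMM.

    image:         (B, 3, H, W) original input batch
    attention_map: (B, C, h, w) FCAE output of the last backbone stage
    conv1x1:       a 1x1 convolution used to fuse the attention features
    k:             number of candidate channels (the paper sets k to the batch size)
    """
    b = image.size(0)
    fused = conv1x1(attention_map)                     # 1x1 convolution for feature fusion
    weights = fused.mean(dim=(2, 3))                   # GAP -> per-channel attention weight, (B, C)
    topk_idx = weights.topk(k, dim=1).indices          # the k strongest channels per sample
    rand_col = torch.randint(0, k, (b,), device=image.device)
    chosen = fused[torch.arange(b), topk_idx[torch.arange(b), rand_col]]  # (B, h, w)
    # Bilinear upsampling back to the input resolution.
    mask = F.interpolate(chosen.unsqueeze(1), size=image.shape[-2:],
                         mode="bilinear", align_corners=False)
    mask = (mask - mask.amin()) / (mask.amax() - mask.amin() + 1e-6)      # normalize to [0, 1]
    # Random threshold: values above it become 1, the rest 0 (as stated in the text).
    binary = (mask > torch.rand(1, device=image.device)).float()
    return image * binary
```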

3.3 Network Design

Various convolutional neural network structures can readily incorporate our method. ResNet [10] serves as an example, with its feature extraction process comprising five stages, where the spatial size of the feature map is halved after each stage. Given the abundance of semantic information within the deep feature maps, we opt to insert the Feature Channel Adaptive Enhancement module (FCAE) at the conclusion of the third, fourth, and fifth stages.


Fig. 4. The attention map is reduced in dimensionality through conv1×1, followed by global average pooling along the channel dimension. A mask is then randomly selected and multiplied with the original image.

Our method is very flexible and can adapt to classification tasks of different granularities by directly adjusting the number of FCAEs, and can be customized according to the needs of specific tasks.
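As an illustration of this plug-and-play use, a hypothetical wrapper that attaches the module after the last three stages of a torchvision ResNet-50 could look like the following; `ResNet50WithFCAE` is our name, and `FCAESketch` refers to the sketch shown earlier.

```python
import torch.nn as nn
from torchvision.models import resnet50


class ResNet50WithFCAE(nn.Module):
    """Hypothetical wrapper: FCAE after stages 3, 4 and 5 of ResNet-50."""

    def __init__(self, num_classes: int):
        super().__init__()
        net = resnet50(weights="IMAGENET1K_V1")
        self.stem = nn.Sequential(net.conv1, net.bn1, net.relu, net.maxpool)
        self.stage2, self.stage3 = net.layer1, net.layer2
        self.stage4, self.stage5 = net.layer3, net.layer4
        # One FCAE block per deep stage (channel counts of ResNet-50 stages).
        self.fcae3 = FCAESketch(512)
        self.fcae4 = FCAESketch(1024)
        self.fcae5 = FCAESketch(2048)
        self.head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                  nn.Linear(2048, num_classes))

    def forward(self, x):
        x = self.stage2(self.stem(x))
        x = self.fcae3(self.stage3(x))   # stage 3 output: 512 channels
        x = self.fcae4(self.stage4(x))   # stage 4 output: 1024 channels
        x = self.fcae5(self.stage5(x))   # stage 5 output: 2048 channels
        return self.head(x)
```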

4 Experiments

In this section, the performance of the proposed method is evaluated across three fine-grained classification datasets: Caltech UCSD-Birds (CUB) [20], Stanford Cars (CAR) [22], and FGVC-Aircraft (AIR) [27], with detailed information provided in Table 1. The implementation details of our method are extensively discussed in Sect. 4.1, while Sect. 4.2 presents a comparative analysis of our method's accuracy performance against state-of-the-art approaches, showcasing its competitive advantage. Sections 4.3 and 4.4 encompass comprehensive ablation experiments and visualization analyses, aiming to highlight the distinctive features and efficacy of our method. By presenting both experimental results and visualizations, a comprehensive understanding of the robustness and efficiency of our method is provided. Through meticulous evaluations and extensive analyses, the superior performance of our method in fine-grained classification tasks is conclusively demonstrated.

Table 1. The statistics information of the three widely used Fine-Grained Visual Categorization datasets.

Dataset             Category  Train  Test
CUB-200-2011 [20]   200       5994   5974
FGVC-Aircraft [27]  100       6667   3333
Stanford Cars [22]  196       8144   8041

4.1 Implementation Details

All experiments were conducted using PyTorch [17], with a version higher than 1.12, on an NVIDIA A100 GPU. The performance evaluation of our method employed the widely adopted ResNet50 as the backbone network, which had been pretrained on the ImageNet dataset. The FCAE module was incorporated at the conclusion of stages 3, 4, and 5, while the ISMM module was integrated at the end of stage 5. During the training process, the input images were resized to dimensions of 550×550 and subsequently randomly cropped to dimensions of 448×448 to augment the training set. Furthermore, random horizontal flipping was implemented as an auxiliary augmentation technique. For testing purposes, the input images were resized to 550×550 and cropped from the center to dimensions of 448×448. In our approach, the hyperparameter was set to λ=0.5. To optimize the model, we utilized the stochastic gradient descent (SGD) optimizer with a momentum of 0.9. The model underwent training for 300 epochs, employing a weight decay of 0.00001 and a batch size of 16. The learning rate of the backbone layers was set to 0.002, while the learning rate of the new layers was set to 0.02. We adjusted the learning rate using a cosine annealing scheduler [25].
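The training setup described above could be configured roughly as follows (a sketch only; the 200-class head and the backbone/new-layer parameter split are simplifications, and data loading is omitted).

```python
import torch.nn as nn
from torch.optim import SGD
from torch.optim.lr_scheduler import CosineAnnealingLR
from torchvision import transforms
from torchvision.models import resnet50

train_tf = transforms.Compose([
    transforms.Resize((550, 550)),      # resize, then random crop to 448x448
    transforms.RandomCrop(448),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
])
test_tf = transforms.Compose([
    transforms.Resize((550, 550)),
    transforms.CenterCrop(448),
    transforms.ToTensor(),
])

model = resnet50(weights="IMAGENET1K_V1")
model.fc = nn.Linear(model.fc.in_features, 200)  # e.g. CUB-200-2011 classes

# Two learning rates: 0.002 for the pretrained backbone, 0.02 for new layers.
new_params = list(model.fc.parameters())
backbone_params = [p for n, p in model.named_parameters() if not n.startswith("fc.")]
optimizer = SGD(
    [{"params": backbone_params, "lr": 0.002},
     {"params": new_params, "lr": 0.02}],
    momentum=0.9, weight_decay=1e-5,
)
scheduler = CosineAnnealingLR(optimizer, T_max=300)  # cosine annealing over 300 epochs
```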

4.2 Compared with State-of-the-Art Methods

Table 2 showcases a comparative analysis between our method and recent finegrained classification approaches across the CUB 2002011, Stanford Cars, and FGVC-Aircraft datasets. DeepLAC [23] and Part-RCNN [38] are both component positioning-based methods. DeepLAC [23] initially employs a convolutional neural network to extract image features, followed by the introduction of a local localization module to identify important regions within object images. Part-RCNN [38] first generates candidate object areas in the image using the candidate frame generation algorithm. Subsequently, for each candidate box, the object is decomposed into parts, and features are extracted for each part. Among the attention-based methods, RA-CNN [6], MA-CNN [40], API-Net [43], PCA [39], AC-Net [19], and AKEN [14] are noteworthy. RA-CNN [6] incorporates a circular attention mechanism, allowing the network to dynamically focus attention on the intricate details crucial for fine-grained classification. MA-CNN [40] integrates multiple attention modules, each responsible for producing an attention map for a distinct region of the image, directing the network’s focus toward regions with higher discriminative properties. API-Net [43] introduces a pairing interaction mechanism, where selected feature regions undergo pairwise interactions with each other. PCA [39] leverages a co-attention mechanism to learn the association and importance between different regions in an image. The co-attention network consists of a local feature extraction module and a progressive attention module. AC-Net [19] employs an attention mechanism to select and weigh key regions in feature representations. By incorporating a convolutional binary neural tree structure, the model sequentially performs feature selection and learning, progressively focusing on meaningful areas from a


global to local perspective. AKEN [14] integrates attention mechanisms and kernel encoding to extract and encode fine-grained features. Attention mechanisms select and weigh important features, aiding in distinguishing between different fine-grained categories. Kernel encoding captures specific feature information by encoding selected features. Guided Zoom [1] and S3N [4] fall under the category of local area amplification methods. Guided Zoom [1] utilizes the interplay of local area amplification and model feedback to acquire insights into model decision-making and confidence. This information serves as guidance for improved decision-making. S3N [4] employs an adaptive sparse sampling strategy to selectively sample local regions within the image. HOI [33] proposes a high-level interaction method, where the high-level interaction module leverages associations between different features in the image to enhance and integrate features. Specific weights and combinations are learned to reinforce and highlight features relevant to fine-grained classification tasks. SPS [15] randomly selects a subset of training samples and exchanges them with other samples, generating new sample pairs. CIN [7] introduces a channel interaction module that facilitates interaction and information transfer between channels. This module utilizes convolution operations and attention mechanisms to enable information exchange and joint feature learning across channels. LIO [42] utilizes unlabeled data for self-supervised learning. By designing self-supervised tasks, the model learns structural information about objects. Each of the above methods possesses its unique advantages and characteristics. However, our proposed method exhibits superior performance across the three datasets, which can be attributed to the exceptional components we have introduced.

Table 2. Comparison results on CUB-200-2011, FGVC-Aircraft and Stanford Cars datasets. "-" means no data.

Method            Backbone   CUB-200-2011  FGVC-Aircraft  Stanford Cars
DeepLAC [23]      VGG        80.3          -              -
Part-RCNN [38]    VGG        81.6          -              -
RA-CNN [6]        VGG        85.3          88.1           92.5
MA-CNN [40]       VGG        86.5          89.9           92.8
SPS [15]          ResNet50   87.3          92.3           94.4
API-Net [43]      ResNet50   87.7          93.0           94.8
HOI [33]          ResNet50   89.5          92.8           95.3
PCA [39]          ResNet50   88.3          92.4           94.3
AC-Net [19]       ResNet50   88.1          92.4           94.6
Guided Zoom [1]   ResNet50   87.7          90.7           93.0
CIN [7]           ResNet50   87.5          92.6           94.1
LIO [42]          ResNet50   88.0          92.7           94.5
S3N [4]           ResNet50   88.5          92.8           94.7
FBSD [30]         ResNet50   89.3          92.7           94.4
AKEN [14]         ResNet50   86.2          93.3           92.6
Ours              ResNet50   89.6          93.3           95.1

4.3 Ablation Experiment

A series of ablation experiments were conducted to evaluate the effectiveness of each module for fine-grained classification. Initially, we conducted classification experiments using a baseline model that solely comprised the backbone network. Subsequently, we gradually introduced the two modules we designed and integrated them with the backbone network individually. Finally, we compared the performance of using a single module versus using both modules simultaneously. The experimental results are presented in Table 3. When employing only the backbone network, the classification accuracy achieved a benchmark level. Upon introducing the FCAE module, the accuracy improved by 2.7% on the bird dataset, 1.8% on the airplane dataset, and 4.4% on the Stanford car dataset. The introduction of the ISMM module resulted in respective accuracy improvements of 3%, 1.3%, and 4.8% on the three datasets. Remarkably, the best classification results were obtained when both modules were introduced simultaneously, resulting in an improvement of 4.1%, 3%, and 5.3% compared to the baseline. Utilizing the combination of both modules yielded the highest accuracy, significantly outperforming the usage of a single module.

Table 3. Ablation Study on Three Benchmark Datasets

Method               CUB-200-2011  FGVC-Aircraft  Stanford Cars
Resnet50             85.5          90.3           89.8
Resnet50+FCAE        88.2          92.1           94.2
Resnet50+ISMM        88.5          91.6           94.6
Resnet50+FCAE+ISMM   89.6          93.3           95.1

4.4 Visualization

To visually showcase the effectiveness of our proposed method, we employ GradCAM. As depicted in Fig. 5, when compared to the baseline, a tendency of the baseline to learn global features is observed. In scenarios where the target and background exhibit similarity, the baseline is prone to capturing background noise. Conversely, our method progressively captures global information during the initial stages, which encompasses the assimilation of learned background noise. However, as the network delves deeper into subsequent stages, the learned features become more localized, resulting in precise and targeted focus on the target. The method exhibits remarkable consistency in addressing different categories, emphasizing its robustness and reliability in making classification decisions. This consistency highlights the network's sensitivity to capturing fine-grained features (Fig. 5).


Fig. 5. Compared with the baseline, our method can better learn the discriminative local area, and can accurately distinguish the target and the background when the target is similar to the background.

5 Conclusion

In this study, we propose an approach for fine-grained classification tasks, aiming to improve classification performance by learning feature channel adaptive enhancement and information sufficient mining. First, the feature channel adaptive augmentation module aims to learn different discriminative part representations. Then use the information sufficient mining module to filter the learned important stimuli and focus on the less important stimuli. Our proposed method based on feature channel adaptive augmentation has achieved remarkable progress in meeting the challenges of fine-grained classification tasks. By enhancing useful global information, mining key information, and utilizing attention resources, our method is able to improve classification performance and adapt to different task demands. Acknowledgements. This work is partially supported by the Guangxi Science and Technology Project (2021GXNSFBA220035, AD20159034), the Open Funds from Guilin University of Electronic Technology, Guangxi Key Laboratory of Image and


Graphic Intelligent Processing (GIIP2208) and the National Natural Science Foundation of China (61962014).

References 1. Bargal, S.A., et al.: Guided zoom: zooming into network evidence to refine finegrained model decisions. IEEE Trans. Pattern Anal. Mach. Intell. 43(11), 4196– 4202 (2021) 2. Branson, S., Van Horn, G., Belongie, S., Perona, P.: Bird species categorization using pose normalized deep convolutional nets. arXiv preprint arXiv:1406.2952 (2014) 3. DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) 4. Ding, Y., Zhou, Y., Zhu, Y., Ye, Q., Jiao, J.: Selective sparse sampling for finegrained image recognition. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 6599–6608 (2019) 5. Dubey, A., Gupta, O., Guo, P., Raskar, R., Farrell, R., Naik, N.: Pairwise confusion for fine-grained visual classification. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 70–86 (2018) 6. Fu, J., Zheng, H., Mei, T.: Look closer to see better: recurrent attention convolutional neural network for fine-grained image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4438–4446 (2017) 7. Gao, Y., Han, X., Wang, X., Huang, W., Scott, M.: Channel interaction networks for fine-grained image categorization. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 34, pp. 10818–10825 (2020) 8. Ghiasi, G., Lin, T.Y., Le, Q.V.: Dropblock: a regularization method for convolutional networks. Adv. Neural Inf. Process. Syst. 31 (2018) 9. He, K., Zhang, X., Ren, S., Sun, J.: Spatial pyramid pooling in deep convolutional networks for visual recognition. IEEE Trans. Pattern Anal. Mach. Intell. 37(9), 1904–1916 (2015) 10. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016) 11. Hu, J., Shen, L., Sun, G.: Squeeze-and-excitation networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7132–7141 (2018) 12. Hu, T., Qi, H., Huang, Q., Lu, Y.: See better before looking closer: weakly supervised data augmentation network for fine-grained visual classification. arXiv preprint arXiv:1901.09891 (2019) 13. Hu, T., Xu, J., Huang, C., Qi, H., Huang, Q., Lu, Y.: Weakly supervised bilinear attention network for fine-grained visual classification. arXiv preprint arXiv:1808.02152 (2018) 14. Hu, Y., Yang, Y., Zhang, J., Cao, X., Zhen, X.: Attentional kernel encoding networks for fine-grained visual categorization. IEEE Trans. Circuits Syst. Video Technol. 31(1), 301–314 (2020) 15. Huang, S., Wang, X., Tao, D.: Stochastic partial swap: enhanced model generalization and interpretability for fine-grained recognition. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 620–629 (2021)


16. Huang, S., Xu, Z., Tao, D., Zhang, Y.: Part-stacked CNN for fine-grained visual categorization. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1173–1182 (2016) 17. Imambi, S., Prakash, K.B., Kanagachidambaresan, G.: Pytorch. Programming with TensorFlow: Solution for Edge Computing Applications, pp. 87–104 (2021) 18. Jaderberg, M., et al.: Spatial transformer networks. Adv. Neural Inf. Process. Syst. 28 (2015) 19. Ji, R., et al.: Attention convolutional binary neural tree for fine-grained visual categorization. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 10468–10477 (2020) 20. Khosla, A., Jayadevaprakash, N., Yao, B., Li, F.F.: Novel dataset for fine-grained image categorization: stanford dogs. In: Proceedings of the CVPR Workshop on Fine-Grained Visual Categorization (FGVC), vol. 2. Citeseer (2011) 21. Kirkland, E.J., Kirkland, E.J.: Bilinear interpolation. In: Advanced Computing in Electron Microscopy, pp. 261–263 (2010) 22. Krause, J., Stark, M., Deng, J., Fei-Fei, L.: 3d object representations for finegrained categorization. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 554–561 (2013) 23. Lin, D., Shen, X., Lu, C., Jia, J.: Deep lac: deep localization, alignment and classification for fine-grained recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1666–1674 (2015) 24. Liu, C., Xie, H., Zha, Z.J., Ma, L., Yu, L., Zhang, Y.: Filtration and distillation: enhancing region attention for fine-grained visual categorization. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 34, pp. 11555–11562 (2020) 25. Loshchilov, I., Hutter, F.: Sgdr: stochastic gradient descent with warm restarts. arXiv preprint arXiv:1608.03983 (2016) 26. Luo, W., et al.: Cross-x learning for fine-grained visual categorization. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 8242– 8251 (2019) 27. Maji, S., Rahtu, E., Kannala, J., Blaschko, M., Vedaldi, A.: Fine-grained visual classification of aircraft. arXiv preprint arXiv:1306.5151 (2013) 28. Nilsback, M.E., Zisserman, A.: Automated flower classification over a large number of classes. In: 2008 Sixth Indian Conference on Computer Vision, Graphics & Image Processing, pp. 722–729. IEEE (2008) 29. Ouyang, W., et al.: Deepid-net: object detection with deformable part based convolutional neural networks. IEEE Trans. Pattern Anal. Mach. Intell. 39(7), 1320–1334 (2016) 30. Song, J., Yang, R.: Feature boosting, suppression, and diversification for finegrained visual classification. In: 2021 International Joint Conference on Neural Networks (IJCNN), pp. 1–8. IEEE (2021) 31. Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., Salakhutdinov, R.: Dropout: a simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15(1), 1929–1958 (2014) 32. Sun, M., Yuan, Y., Zhou, F., Ding, E.: Multi-attention multi-class constraint for fine-grained image recognition. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 805–821 (2018) 33. Wang, J., Li, N., Luo, Z., Zhong, Z., Li, S.: High-order-interaction for weakly supervised fine-grained visual categorization. Neurocomputing 464, 27–36 (2021) 34. Woo, S., Park, J., Lee, J.Y., Kweon, I.S.: Cbam: convolutional block attention module. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 3–19 (2018)


35. Yang, S., Ramanan, D.: Multi-scale recognition with dag-CNNs. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 1215–1223 (2015) 36. Yang, Z., Luo, T., Wang, D., Hu, Z., Gao, J., Wang, L.: Learning to navigate for fine-grained classification. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 420–435 (2018) 37. Zhang, L., Huang, S., Liu, W., Tao, D.: Learning a mixture of granularity-specific experts for fine-grained categorization. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 8331–8340 (2019) 38. Zhang, N., Donahue, J., Girshick, R., Darrell, T.: Part-based R-CNNs for finegrained category detection. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8689, pp. 834–849. Springer, Cham (2014). https:// doi.org/10.1007/978-3-319-10590-1_54 39. Zhang, T., Chang, D., Ma, Z., Guo, J.: Progressive co-attention network for finegrained visual classification. In: 2021 International Conference on Visual Communications and Image Processing (VCIP), pp. 1–5. IEEE (2021) 40. Zheng, H., Fu, J., Mei, T., Luo, J.: Learning multi-attention convolutional neural network for fine-grained image recognition. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 5209–5217 (2017) 41. Zheng, H., Fu, J., Zha, Z.J., Luo, J.: Looking for the devil in the details: learning trilinear attention sampling network for fine-grained image recognition. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5012–5021 (2019) 42. Zhou, M., Bai, Y., Zhang, W., Zhao, T., Mei, T.: Look-into-object: self-supervised structure modeling for object recognition. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 11774–11783 (2020) 43. Zhuang, P., Wang, Y., Qiao, Y.: Learning attentive pairwise interaction for finegrained classification. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 34, pp. 13130–13137 (2020)

Visualizing CNN: An Experimental Comparative Study

Xinyi Xu(B), Shi Tu, Yunfan Xue, and Lin Chai
School of Automation, Southeast University, Nanjing 210096, China
[email protected]

Abstract. To intuitively and accurately understand the decision mechanism of Convolutional Neural Networks(CNN), CNN visualization, as an essential part of explainable deep learning, has gradually become a hot topic in artificial intelligence. There have been many achievements in CNN visualization research, such as Gradients, Deconvolution, Class Activation Maps(CAM), etc. But there has been no systematic comparative study on CNN visualization algorithms. The choice of visualization algorithm is critical for accurately explaining the decision process of CNNs. Therefore, an experimental evaluation research on representative CNN visualization algorithms is conducted in this paper for ResNet50 and VGG16 on Caltech101, ImageNet, and VOC2007. The visualization performance is assessed in four aspects: causality, anti-disturbance capability, usability, and computational complexity, and suggestions for selecting CNN visualization algorithms are proposed based on the experimental results.

Keywords: Explainable deep learning · CNN visualization · Performance metrics · Experimental study

1 Introduction

In recent years, the popularity of artificial intelligence has remained high and has achieved significant achievements in various fields, such as urban security, automatic driving, and intelligent healthcare [1,2]. Image recognition based on deep convolutional neural networks, as the most basic application of artificial intelligence in machine vision, surpasses traditional machine learning methods and provides a robust feature extraction backbone network for downstream tasks such as object detection and segmentation. In image recognition tasks, numerous models with strong generalization abilities and fast computation speed have been developed. The most representative networks are AlexNet, VGG, ResNet, DenseNet, and DarkNet. To improve performance and tackle more complex scenarios, CNNs often have stacking depths of dozens of layers, resulting in highly non-linear and complex models. This leads to difficulties in understanding the decision-making process c The Author(s), under exclusive license to Springer Nature Switzerland AG 2023  H. Lu et al. (Eds.): ACPR 2023, LNCS 14408, pp. 199–212, 2023. https://doi.org/10.1007/978-3-031-47665-5_17

200

X. Xu et al.

and working mechanisms of CNNs, which poses increasing security risks. Therefore, explainable deep learning has gradually become popular in recent years. The most intuitive and effective method for uncovering the secrets of CNN is visualization, which can present CNN’s decisions in the form of saliency maps, such as Local Interpretable Model-agnostic Explanations(LIME), Randomized Input Sampling for Explanation(RISE), CAM, GradCAM, etc. The saliency maps demonstrate the importance of each pixel in the input image. As a visualization algorithm proposed for explaining the trustworthiness of CNN’s decisions, its reliability must also be evaluated. From an experimental perspective, it can be divided into two categories for evaluation: human-oriented evaluations and machine-oriented metrics. Human-oriented evaluations are subjective judgments, such as designing questionnaires and gathering professional test group members for testing. However, this method will consume much workforce, and the interviewees significantly affect the reliability, making it difficult to standardize and conduct large-scale experiments. On the other hand, machineoriented metrics evaluate the visualization algorithm by formulating specific rules and providing a quantitative or qualitative criterion. Although this evaluation method is easy to conduct, different approaches have their focuses and cannot fully cover all the properties that visualization algorithms should possess. There has been considerable researches on algorithms for visualizing CNNs. Still, there is a lack of systematic comparative studies and no clear guidelines for selecting visualization algorithms. Thus, in this study, we selected two representative CNNs, ResNet50 [3] and VGG16 [4], and conducted experiments on 3 mainstream datasets: Caltech101, ImageNet, and VOC2007. We evaluated the performance of various visualization algorithms in terms of causality, antidisturbance capability, usability, and computational complexity. We compared the performances of different visualization algorithms and provided suggestions on selecting a suitable visualization algorithm. The main contributions of this paper are as follows: • We thoroughly summarize seven representative algorithms of CNN visualization; • We trained the ResNet50 and VGG16 on three datasets: Caltech101, ImageNet, and VOC2007. Then, explanations are made on ResNet50 and VGG16 by using visualization algorithms to be evaluated; • Evaluate and compare the performance of different visualization algorithms in terms of causality, anti-disturbance capability, usability, and computational complexity; • Based on the comparative experimental results of visualization algorithms, suggestions are proposed for selecting CNN visualization algorithms.

2

Related Works

There have been some comparative studies on the visualization algorithms of CNN for image recognition.

Visualizing CNN: An Experimental Comparative Study

201

Yang [5] designed a unified evaluation framework to assess the reliability of visualization algorithms from three aspects: generalizability, fidelity, and persuasibility. Yang [6] copied objects in MSCOCO into MiniPlaces to construct the Benchmarking Attribution Methods(BAM) evaluation dataset and proposed three complementary metrics, Model Contrast Score(MCS), Input Dependence Rate(IDR), and Input Independence Rate(IIR). Brunke [7] conducted a comparative study on visual methods for input perturbations, investigating the impact of neutral baseline images on saliency map generation and performance evaluation. Poppi [8] introduced the concept of Coherency. They used it with Complexity and Average Drop to form the Average DCC(ADCC) assessment metric to perform a comparative study of existing CAM-like algorithms. Samuel [9] designed three human subject experiments to evaluate mainstream visualization algorithms based on predictability, reliability, and consistency. Li [10] designed an annotation tool and constructed a rich hierarchy of evaluation datasets through fine manual annotation. They evaluated visualization algorithms based on accuracy, persuasibility, and class discrimination. Brocki [11] compared perturbation-based visualization algorithms and analyzed and estimated the artifacts caused by perturbations, aiming to eliminate the impact of artifacts on perturbation-based visualization algorithms. Wang [12] argued that the existing evaluation metrics lack rationality and fail to accurately reflect the quality of effective visualizations. In response to this problem, they proposed the faithfulness and plausibility criteria to evaluate visualization algorithms better. Although Kadir [13] summarized the existing evaluation methods for visualization algorithms and proposed two categories, their work lacks a systematic comparative experimental study. Li [14] proposed a new metric, Intersection between the salient area and the ground truth mask over the Salient Region(IoSR), combining five existing quantitative metrics to evaluate the quality of saliency maps, which is most relevant to our work. Although their work compared multiple metrics, there are no inductive suggestions provided for researchers when selecting visualization algorithms. Therefore, this article mainly provides selection guidance from causality, antidisturbance capability, usability, and computational complexity through comparative experimental research on visualization algorithms.

3

Visualization Algorithms

Visualization is the most intuitive and effective means of understanding the decision-making process of CNN. Visualization algorithms can be roughly divided into three categories: (1) perturbation-based methods, which perturb the input and observe the changes of the output; (2) backpropagation-based methods, which propagate gradients, activations, or feature maps back to the input layer and highlight key regions; (3) model-based methods, which construct CNNs that are instinctly explainable. Numerous innovative results have been achieved in each category, and in this section, we will introduce the most commonly used visualization algorithms as candidates for comparative experiments. The visualization results is shown in Fig. 1.

202

X. Xu et al.

Fig. 1. The visualization results of Occlusion, Gradients, Integrated Gradients, Deconvolution, Class Activation Map, GradCAM and ScoreCAM.

Note: In the following description, the input image is denoted as I ∈ Rh×w×chn , h, w, chn representing the height, width, and channels, respectively. The CNN to be explained is denoted as f : Rh×w×chn → RN , N representing classes. c means the class to be observed during the decision process of CNN. 3.1

Occlusion

A fixed-size occluding block is used to cover the image, and the saliency values of each pixel are obtained by using a sliding window based on the change in CNN’s prediction [15]. c

SalOcclusion (I; f, r, φ) =

 f c (I) − f c (φ (I; p)) I (p) r2

(1)

p∈P

r means the size of the window, P is the set of images with sliding masked areas, φ is the method of occlusion, I (p) is the indicator function with 1 inside the masked area while 0 outside. 3.2

Gradients

Propagate the gradients of a specific output to the input layer, using them as sensitivity maps to explain the importance of each input pixel [16].   ∂f c Iˆ  c  (2) SalGradients (I; f ) = ∂ Iˆ  ˆ I=I

Visualizing CNN: An Experimental Comparative Study

3.3

203

Integrated Gradients

Directly using the gradients can be greatly influenced by noise. Therefore, it is crucial to consider a smooth transformation from the baseline input to the original input when calculating the gradients. By accumulating the changes in gradients during this transformation process, noise can be avoided [17].

c

SalIG (I; f, Ibase ) = (I − Ibase ) ×



1

α=0

   ∂f c Ibase + α × Iˆ − Ibase   dα ˆ ∂ Iˆ I=I

(3)

Ibase is the predefined baseline image. 3.4

Deconvolution

In forward propagation, the location of the maxpooling is recorded as “switches” [18]. During the backpropagation, the deconvolution result is obtained using the matrix’s transpose when encountering a convolutional layer. When facing a maxpooling layer, the position of the maximum value recorded by “switches” is used to fill in the up-sampling result, and the others are filled with 0.     c SalDeconvolution (I; f ) = unpool ReLU KlT × Al−1 , Sl

(4)

A is the activation of specific layers, S is the recorded location Switches, l is the current layer, K is the convolutional kernel. 3.5

Class Activation Maps

By utilizing information such as activation, gradient, and feature maps during the forward propagation process of the network, saliency maps are obtained by combining these features and mapping back to the input layer. CAM [19] requires the last layer of the CNN to be a global average pooling layer, so structural modification and fine-tuning are required for networks that do not meet this condition.   c αkc Ak (i, j) (5) SalCAM (I; f ) = k

αkc

i,j

is the weight of k-th filter connected to class c, Ak is the feature map generated by kth filter. Due to the inconvenience of modifying network structures in conventional CAM, gradient-based CAMs, represented by GradCAM [20] and GradCAM++ [21], and gradient-free CAMs, represented by ScoreCAM [22] and AblationCAM [23], have emerged.

204

4

X. Xu et al.

Evaluation Metrics

Quantitative analysis is essential to analyze whether the visualization results are reasonable. This chapter will introduce four ways to evaluate visualization algorithms’ performance from causality, anti-disturbance capability, usability, and computational complexity. Note: In the following descriptions, the input image is denoted as I. The CNN to be explained is denoted as f and the visualization algorithm that generates saliency maps according to the class c will be referred to as Salc . 4.1

Causality

Causality mainly measures the correlation between the important area indicated by the saliency map and the network’s decision. There are two ways to measure correlation: one is the intersection area between the salient region and the semantic region, and the other is whether there is a positive correlation between the salient region and the network’s prediction. Energy-Based Pointing Game. To measure the causality between salient regions and semantic targets, Zhang [24] proposed Pointing Game(PG) to determine whether the maximum value in the saliency map falls within the boundaries of an object’s bounding box. Wang [22] extended PG by considering the sum of energy within the object’s bounding box called Energy-based Pointing Game. c (i,j)∈box [Sal (I)] c (6) Proportion = Sal (I) (i, j) ∈ box means the pixels within the bounding box of the target. The Energy-based Pointing Game aims to investigate whether the distribution of saliency values is concentrated on the corresponding semantic targets to verify the causality between the key regions obtained by the visualization algorithm and the semantic targets. Deletion/Insertion Game. Petsiuk [25] proposed two evaluation metrics: Deletion and Insertion. These metrics evaluate the causality between salient regions and network’s predictions. Deletion Game deletes “cause” in order of decreasing importance as indicated in the saliency map while Insertion Game inserts “cause” to a blank image.   DI = DI IsDI ; Salc (I) Is+1 IsDI

(7)

means the s iteration of Deletion/Insertion Game. I0 is the raw input image in deletion game. While in insertion game, I0 stands for the blurred input image. DI (I; Salc (I)) is the method of deleting or inserting pixels according to the saliency map.

Visualizing CNN: An Experimental Comparative Study

205

During the deletion/insertion process, the curve of the changes in the stepprediction is recorded. The deletion/insertion score is defined as the area under the curve (AUC). Score =

k 

f c (Ij )

(8)

j=1

4.2

Anti-disturbance Capability

ADCC. Poppi [8] defined Maximum Coherency, Minimum Complexity, and Minimum Confidence Drop. The Coherency metric examines the ability to maintain consistency in the salient regions when encountering special disturbance. Specifically, it compares the saliency maps of an image before and after the perturbation. The Complexity metric measures the total of the saliency map using the L1-Norm. Confidence Drop measures the drop in CNN’s prediction score when applied to the salient region of an image compared to the original image. Based on these metrics, they proposed Average DCC(ADCC), denoted as:

ADCC = 3

1 1 1 + + Coh (I; c) 1 − Com (I; c) 1 − AD (I; c)

−1 (9)

The Coherency(Coh), Complexity(Com) and Average Confidence Drop(AD) is defined as:

Coh (I; c) =

Cov (Salc (I  Salc (I)) Salc (I)) σSalc (ISalc (I)) σSalc (I)

Com (I; c) = Salc (I)1 f c (I) − f c (I  Salc (I)) AD (I; c) = I∈X n

(10) (11) (12)

Cov indicates the covariance and the σ is the standard deviation.  is the element-wise multiplication. X represents all the input images and n is the number of input images. (In)Fidelity and Sensitivity. Yeh [26] investigated the synchronization ability of visualization algorithms under random perturbations and their sensitivity to noise disturbance. They proposed and analyzed two quantitative metrics, namely the (In)Fidelity and Sensitivity. The definition of (In)Fidelity is the expectation of the difference between the dot product of the input disturbance and the change in prediction scores.     2 ˆT Sal Iˆ − f (I) − f I − Iˆ I (13) InFid = EI∼μ ˆ ˆ I

206

X. Xu et al.

Sensitivity is measured by Monte Carlo multiple sampling to evaluate the maximum change of the visualization algorithm in response to the disturbance added to the input image.       (14) SensMax = max Salc f, Iˆ − Salc (f, I) ˆ ≤r I−I   4.3

4.3 Usability

A good visualization algorithm should be easy to use in addition to performing well on causality and anti-disturbance capability. Algorithms with good usability can be applied directly, without any need to adjust the network or retrain it.

4.4 Computational Complexity

If a visualization algorithm has high computational complexity and requires high-performance hardware, its applicability is limited, so computational complexity is an essential factor when choosing a CNN visualization algorithm. Computational complexity indicates how many computing resources are consumed while generating saliency maps; this metric mainly evaluates the dependence of visualization algorithms on computational resources, especially the time taken to generate a saliency map.

5 Experiments

This experiment mainly evaluates the performance of seven visualization algorithms on ResNet50 and VGG16. First, the networks are trained from scratch on the VOC2007, ImageNet, and Caltech101. The saliency maps obtained from wrong labels are meaningless. Therefore, this experiment uses all correctly classified images as the evaluation dataset. The saliency maps of all images in the evaluation dataset are extracted by the visualization algorithms, and their average reliability on the evaluation dataset is quantitatively evaluated. In the Deletion and Insertion experiments, the number of deleted/inserted points each time is 224 × 4. In Deletion, deleting the element replaces the pixel value with 0, and in Insertion, the blur method used is Gaussian blur. In the (In)Fidelity experiment, we randomly select 1000 points in the image and apply random perturbations of magnitude 0.2, and the results are averaged over 10 trials. In the Sensitivity experiment, the input image is also subjected to random perturbations of 0.2, and the results are averaged over 10 trials. The visualization algorithms used in this experiment include Occlusion(Occ), Deconvolution(DeConv), Gradients(Grad), Integrated Gradients(IG), CAM, GradCAM(GCAM) and ScoreCAM(SCAM). The sliding window step of Occlusion is 9 × 9 and window size is 60 × 60. For CAM, the last layer is replaced with a global average pooling layer, and the model is finetuned before testing.

5.1 Causality

Energy-Based Pointing Game. In the Energy-based Pointing Game, all visualization algorithms achieve a high energy level according to our evaluation criteria, as shown in Table 1. CAM-based algorithms outperform other algorithms. Table 1. Results of Energy-based Pointing Game Networks Occ

DeConv Grad IG

CAM GCAM SCAM

ResNet50 0.711 0.616

0.571 0.607 0.676

VGG16

0.578 0.627 0.736 0.680

0.639 0.503

0.745

0.686 0.692

Deletion/Insertion Game. The Deletion and Insertion evaluation results are shown in Tables 2 and 3. IG achieves the best results on Deletion, while Occlusion, DeConv, Gradients, and the CAM-like algorithms obtain relatively close scores. In the Insertion Game, Occlusion achieves the best results, closely followed by the CAM-like algorithms, while DeConv, Gradients, and IG perform the worst. The evaluation results obtained on different datasets are consistent.

Table 2. Results of Deletion Game

Datasets    Networks  Occ    DeConv  Grad   IG     CAM    GCAM   SCAM
Caltech101  ResNet50  0.289  0.238   0.272  0.144  0.228  0.228  0.251
            VGG16     0.261  0.250   0.226  0.120  0.161  0.265  0.246
ImageNet    ResNet50  0.220  0.155   0.149  0.076  0.162  0.162  0.187
            VGG16     0.147  0.141   0.097  0.045  0.097  0.146  0.130
VOC2007     ResNet50  0.395  0.389   0.397  0.236  0.440  0.438  0.470
            VGG16     0.276  0.320   0.280  0.166  0.341  0.305  0.295

5.2 Anti-disturbance Capability

ADCC. The ADCC evaluation results are presented in Table 4. Among the CAM-like algorithms, their scores are very close to each other across the three datasets. GradCAM scores highest on all three datasets, whereas DeConv, Gradients, and IG algorithms generally score lower.

Table 3. Results of Insertion Game

Datasets    Networks  Occ    DeConv  Grad   IG     CAM    GCAM   SCAM
Caltech101  ResNet50  0.887  0.484   0.571  0.542  0.880  0.880  0.879
            VGG16     0.889  0.473   0.590  0.539  0.757  0.876  0.875
ImageNet    ResNet50  0.764  0.392   0.530  0.505  0.761  0.761  0.759
            VGG16     0.731  0.314   0.446  0.405  0.632  0.717  0.722
VOC2007     ResNet50  0.836  0.556   0.586  0.640  0.823  0.822  0.808
            VGG16     0.829  0.466   0.555  0.576  0.825  0.834  0.803

Table 4. Results of ADCC

Datasets    Networks  Occ    DeConv  Grad   IG     CAM    GCAM   SCAM
Caltech101  ResNet50  0.747  0.052   0.123  0.057  0.798  0.801  0.792
            VGG16     0.771  0.050   0.148  0.061  0.839  0.803  0.837
ImageNet    ResNet50  0.673  0.008   0.067  0.011  0.798  0.799  0.786
            VGG16     0.766  0.008   0.056  0.014  0.809  0.802  0.853
VOC2007     ResNet50  0.798  0.267   0.347  0.164  0.768  0.822  0.793
            VGG16     0.809  0.166   0.350  0.193  0.822  0.859  0.875

(In)Fidelity and Sensitivity. The evaluation results of (In)Fidelity and Sensitivity are presented in Tables 5 and 6, respectively. In the (In)Fidelity evaluation, all algorithms except IG stay at relatively low levels; IG shows a markedly higher level of infidelity (i.e., lower fidelity) than the other algorithms across the three datasets. The CAM-like algorithms still perform well, while Occlusion, DeConv, and Gradients slightly underperform the CAM-like algorithms under this criterion. In the Sensitivity evaluation, all CAM-like visualization algorithms achieve low levels, while the maximum sensitivity of the other methods is higher; Gradients and IG show the worst maximum sensitivity.

Table 5. Results of (In)Fidelity

Datasets    Networks  Occ    DeConv  Grad   IG      CAM    GCAM   SCAM
Caltech101  ResNet50  0.725  1.474   0.544  3.212   0.541  0.554  0.567
            VGG16     1.631  3.614   1.427  10.705  0.417  1.443  1.304
ImageNet    ResNet50  0.836  1.257   0.742  4.821   0.746  0.748  0.731
            VGG16     1.636  2.562   1.633  10.871  0.910  1.542  1.640
VOC2007     ResNet50  0.059  0.070   0.062  0.312   0.059  0.051  0.054
            VGG16     0.168  0.239   0.171  0.927   0.179  0.046  0.163


Table 6. Results of Max Sensitivity

Datasets    Networks  Occ    DeConv  Grad   IG     CAM    GCAM   SCAM
Caltech101  ResNet50  0.926  0.944   1.107  1.447  0.362  0.385  0.401
            VGG16     0.805  1.148   0.938  1.367  0.492  0.602  0.631
ImageNet    ResNet50  0.806  0.918   0.996  1.374  0.300  0.304  0.380
            VGG16     0.614  1.051   0.964  1.356  0.400  0.536  0.438
VOC2007     ResNet50  0.644  0.870   1.022  1.396  0.369  0.469  0.308
            VGG16     0.635  1.041   0.977  1.333  0.393  0.602  0.481

5.3 Usability

Among all algorithms, CAM requires adjusting the network structure and retraining whenever the CNN does not end with a global average pooling layer, so its usability is the worst. Deconvolution only supports CNNs with max-pooling layers, so its usability is also limited. The other algorithms can be applied directly, so their usability is better.

5.4 Computational Complexity

The computational complexity results are presented in Table 7. Table 7. Results of Computational Complexity (ms) Networks Occ


DeConv Grad IG

CAM GCAM SCAM

ResNet50 1767.13 652.73

11.12 362.31 6.38

10.37

2363.45

VGG16

3.82

4.94

908.15

1470.74 12.67

434.86 3.37

6 Discussion

6.1 Causality

The Energy-based Pointing Game focuses on the consistency between salient regions and semantic object regions, i.e., the key factor behind the CNN's decision should coincide with human judgment. In the experimental results, the CAM-like visualization algorithms achieve good results thanks to their weakly-supervised localization ability for semantic targets. The saliency maps generated by Gradients and DeConv are more dispersed and are not concentrated on the semantic target area, so they cannot achieve high scores. The Deletion/Insertion Game instead measures the consistency between salient regions and network outputs, i.e., whether the salient regions are the main factor behind the network's high score for that class. From the results of


Deletion, IG is in the lead, while the other algorithms score at about the same level. Since IG integrates gradients along the path from the baseline image to the input image, it is better at capturing the key regions that affect the network score, so it performs better on this metric. In the Insertion Game, the Gaussian-blurred background still provides contextual information, so when salient regions are inserted, Occlusion and the CAM-like algorithms perform better than the other methods. Therefore, in terms of causality, the CAM-like methods and Occlusion are more capable of capturing the regions that drive the network's decisions.

6.2 Anti-disturbance Capability

ADCC measures the anti-disturbance capability of visualization algorithms by disturbing non-target regions. Algorithms with weakly-supervised localization ability have an advantage on ADCC: the semantic object region is the key factor affecting the network's decision, so after the non-salient regions are smoothed, the semantic target remains intact, which yields high scores on both the Coherency and Average Drop metrics. The other algorithms obtain low Coherency and Average Drop scores because their saliency maps are diffuse and split into scattered image regions. (In)Fidelity focuses on how closely the change in the saliency map tracks the change in network output when input images are perturbed. Occlusion, DeConv, Gradients, and the CAM-like algorithms have good anti-disturbance capability. The core of IG is to integrate gradients between the baseline image and the target image to generate a saliency map that satisfies its invariance properties; its poor (In)Fidelity is mainly due to the smoothing away, during integration, of the uncertainty introduced by noise. On Maximum Sensitivity, all CAM-like visualization methods perform well. DeConv, on the other hand, inverts the network, so perturbations applied to the input cause significant changes in the feature-layer responses, and the layer-by-layer accumulation of these changes leads to a larger maximum sensitivity. Overall, in terms of anti-disturbance capability, a more focused saliency map achieves better results, so the CAM-like algorithms are superior to the other visualization algorithms.

6.3 Usability

Regarding usability, CAM has unique requirements on the network's structure; although CAM scores well on the other metrics, its poor usability cannot be ignored. DeConv needs to record the max-pooling switches, so its usability is also limited. The other algorithms can be applied directly and therefore offer better usability.


6.4 Computational Complexity

From the computational complexity perspective, CAM only needs to take a weighted combination of the last-layer feature maps and upsample the result to the input resolution, so it takes the shortest time. Next are Gradients, GradCAM and DeConv, which require backpropagating gradients or activations. Visualization algorithms that require iterative evaluation take the longest time.

7 Conclusion

Visualization algorithms are the simplest and most intuitive means of uncovering the "black box" of CNNs in image recognition, and many related studies exist; however, selecting a visualization algorithm is still a difficult problem. This article evaluates seven mainstream visualization algorithms on ResNet50 and VGG16 from four aspects: causality, anti-disturbance capability, usability, and computational complexity. The evaluation experiments on three datasets, Caltech101, ImageNet, and VOC2007, indicate that weakly-supervised localization ability is a basic attribute that visualization algorithms need to possess, which is why the visualization algorithms represented by CAM perform well on the causality and anti-disturbance metrics. In addition, usability and computational complexity are also important factors to consider when choosing a visualization algorithm.

References 1. Ras, G., Xie, N., Van Gerven, M., Doran, D.: Explainable deep learning: a field guide for the uninitiated. J. Artif. Intell. Res. 73, 329–397 (2022) 2. Fan, F.L., Xiong, J., Li, M., Wang, G.: On interpretability of artificial neural networks: a survey. IEEE Trans. Radiat. Plasma Med. Sci. 5(6), 741–760 (2021) 3. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016) 4. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) 5. Yang, F., Du, M., Hu, X.: Evaluating explanation without ground truth in interpretable machine learning. arXiv preprint arXiv:1907.06831 (2019) 6. Yang, M., Kim, B.: Benchmarking attribution methods with relative feature importance. arXiv preprint arXiv:1907.09701 (2019) 7. Brunke, L., Agrawal, P., George, N.: Evaluating input perturbation methods for interpreting CNNs and saliency map comparison. In: Bartoli, A., Fusiello, A. (eds.) ECCV 2020. LNCS, vol. 12535, pp. 120–134. Springer, Cham (2020). https://doi. org/10.1007/978-3-030-66415-2 8 8. Poppi, S., Cornia, M., Baraldi, L., Cucchiara, R.: Revisiting the evaluation of class activation mapping for explainability: a novel metric and experimental analysis. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 2299–2304 (2021) 9. Samuel, S.Z.S., Kamakshi, V., Lodhi, N., Krishnan, N.C.: Evaluation of saliencybased explainability method. arXiv preprint arXiv:2106.12773 (2021)


10. Li, J., Lin, D., Wang, Y., Xu, G., Ding, C.: Towards a reliable evaluation of local interpretation methods. Appl. Sci. 11(6), 2732 (2021) 11. Brocki, L., Chung, N.C.: Evaluation of interpretability methods and perturbation artifacts in deep neural networks. arXiv preprint arXiv:2203.02928 (2022) 12. Wang, Y., Wang, X.: A unified study of machine learning explanation evaluation metrics. arXiv preprint arXiv:2203.14265 (2022) 13. Kadir, M.A., Mosavi, A., Sonntag, D.: Assessing xai: unveiling evaluation metrics for local explanation, taxonomies, key concepts, and practical applications 14. Li, X.H., et al.: Quantitative evaluations on saliency methods: an experimental study. arXiv preprint arXiv:2012.15616 (2020) 15. Zhou, B., Khosla, A., Lapedriza, A., Oliva, A., Torralba, A.: Object detectors emerge in deep scene CNNs. arXiv preprint arXiv:1412.6856 (2014) 16. Simonyan, K., Vedaldi, A., Zisserman, A.: Deep inside convolutional networks: visualising image classification models and saliency maps. arXiv preprint arXiv:1312.6034 (2013) 17. Sundararajan, M., Taly, A., Yan, Q.: Axiomatic attribution for deep networks. In: International Conference on Machine Learning, pp. 3319–3328. PMLR (2017) 18. Zeiler, M.D., Taylor, G.W., Fergus, R.: Adaptive deconvolutional networks for mid and high level feature learning. In: 2011 International Conference on Computer Vision, pp. 2018–2025. IEEE (2011) 19. Zhou, B., Khosla, A., Lapedriza, A., Oliva, A., Torralba, A.: Learning deep features for discriminative localization. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2921–2929 (2016) 20. Selvaraju, R.R., Cogswell, M., Das, A., Vedantam, R., Parikh, D., Batra, D.: Gradcam: visual explanations from deep networks via gradient-based localization. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 618– 626 (2017) 21. Chattopadhay, A., Sarkar, A., Howlader, P., Balasubramanian, V.N.: Gradcam++: generalized gradient-based visual explanations for deep convolutional networks. In: 2018 IEEE Winter Conference on Applications of Computer Vision (WACV), pp. 839–847. IEEE (2018) 22. Wang, H., et al.: Score-cam: score-weighted visual explanations for convolutional neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, pp. 24–25 (2020) 23. Ramaswamy, H.G., et al.: Ablation-cam: visual explanations for deep convolutional network via gradient-free localization. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 983–991 (2020) 24. Zhang, J., Bargal, S.A., Lin, Z., Brandt, J., Shen, X., Sclaroff, S.: Top-down neural attention by excitation backprop. Int. J. Comput. Vision 126(10), 1084–1102 (2018) 25. Petsiuk, V., Das, A., Saenko, K.: Rise: randomized input sampling for explanation of black-box models. arXiv preprint arXiv:1806.07421 (2018) 26. Yeh, C.K., Hsieh, C.Y., Suggala, A., Inouye, D.I., Ravikumar, P.K.: On the (in) fidelity and sensitivity of explanations. Adv. Neural Inf. Process. Syst. 32 (2019)

Cross-Domain Few-Shot Sparse-Quantization Aware Learning for Lymphoblast Detection in Blood Smear Images

Dina Aboutahoun1(B), Rami Zewail1, Keiji Kimura2, and Mostafa I. Soliman1

1 Department of Computer Science and Engineering, Egypt-Japan University for Science and Technology, Alexandria, Egypt
{dina.aboutahoun,rami.zewail,mostafa.soliman}@ejust.edu.eg
2 Department of Computer Science and Engineering, Waseda University, Tokyo, Japan
[email protected]

Abstract. Deep learning for medical image classification has enjoyed increased attention. However, a bottleneck that prevents it from widespread adoption is its dependency on very large, annotated datasets, a condition that cannot always be satisfied. Few-shot learning in the medical domain is still in its infancy but has the potential to overcome these challenges. Compression is a way for models to be deployed on resource-constrained machines. In an attempt to tackle some of the challenges imposed by limited data and high computational resources, we present a few-shot sparse-quantization aware meta-training framework (FS-SQAM). The proposed framework aims to exploit the role of sparsity and quantization for improved adaptability in a low-resource cross-domain setting for the classification of acute lymphocytic leukemia (ALL) in blood cell images. Combining these strategies enables us to approach two of the most common problems that encounter deep learning for medical images: the need for extremely large datasets and high computational resources. Extensive experiments have been conducted to evaluate the performance of the proposed framework on the ALL-IDB2 dataset in a crossdomain few-shot setting. Performance gains in terms of accuracy and compression have been demonstrated, thus serving to realize the suitability of meta-learning on resource-constrained devices. Future advancements in the domain of efficient deep learning computer-aided diagnosis systems will facilitate their adoption in clinical medicine. Keywords: Few-Shot Learning · Medical Image Analysis · Compression

1 Introduction

The problem of restricted availability of labeled medical data remains a hindrance for conventional deep learning methods, as they depend on a high volume of data. There are common challenges that plague deep learning for medical applications [1]: one is the dataset size, with medical datasets being on the order of hundreds or thousands of samples [2]; the second is the class imbalance problem of rarer


diseases, where some diseases manifest more rarely in a population which is reflected in classes with much fewer samples [3]. Overcoming the data problem in medical applications allows for better diagnostic accuracy and potentially improves pathologists’ accuracy [4]. Few-shot learning (FSL) as a concept has been developed to overcome these hurdles by imitating how humans can learn from a few examples. Likewise, when applied to scarce medical datasets, few-shot learning seeks to learn from just a few samples. A more general term, meta-learning, is an approach conventionally described as learning to learn, that aims to generalize to new tasks that are different from the originally trained tasks. Learning to learn from a few examples is a powerful technique that is heavily investigated by the research community. Few-shot meta-learning methods are suited for applications where encountering novel classes with limited samples is a common occurrence as with the case in the field of cancer diagnostics. Deep learning has been used extensively in various medical image applications such as the detection of abnormalities [5] and segmentation of areas of interest [6], encouraged by the advent of high-quality images and the increase in computational resources. As a result, numerous computer-aided diagnosis (CAD) systems are dependent on deep learning [7]. Encouraged by the efforts in deep learning and few-shot learning healthcare targeted applications, we investigate the role of sparsity and quantization in meta-training in a limited resource medical cross-domain setting. In this work, we exploit sparsity and quantization for the purpose of detecting the presence of blast cells in blood cell images through training on the task of detecting malignant tumors in breast histopathological images. Our main contributions in this work are as follows: • We present FS-SQAM, a joint sparse-quantization aware meta-training scheme for the purpose of lymphoblast detection in blood smear images in a cross-domain setting with limited data and computational resources. • We conduct experiments to assess the influence of sparsity and quantization and the impact of regularization on the performance and efficiency of lymphoblast detection to assist in leukemia diagnosis in a low-resource setting. We observe that the results demonstrate that FS-SQAM allows for strong generalization on the ALL-IDB2 [8] dataset in addition to the gains on the memory footprint reduction front. • We hope the presented work can encourage more research into approaches that are well-suited for use in challenging clinical environments where both the computational resources and data are limited.

2 Related Works In this section we start by focusing on previous works that target lymphoblast detection, and then we mention examples in the literature that have utilized FSL for medical applications. We end this section by mentioning some of the influential research directions in efficient few-shot learning and relevant works that examine medical cross-domain FSL. Acute lymphoblastic leukemia or ALL is a cancer of blood cells and the most common childhood cancer. Early diagnosis is crucial as the disease is characterized by rapid


progression. One of the ways to diagnose ALL is through blood testing which requires manual examination by a pathologist. In order to expedite the diagnosis process, many works have proposed approaches for the detection of target cells in microscopic blood smear images that could be incorporated into CAD systems. The literature on ALL detection mainly employs deep learning with transfer learning [9–11] being a dominant approach as training from scratch prerequires the availability of a large dataset. Whilst most studies using deep learning achieve a performance of >95% diagnostic accuracy, efficient resource utilization that targets resource-constrained devices is not taken into account by the majority. Few-shot learning aims to learn new classes from a few examples, making it a suitable approach for data limited settings, which is the case in most medical applications, making it a method with immense potential benefits for medical classification problems. Metriclearning is a class of FSL approaches where classification is dependent on a learning similarity metric that is capable of discerning similar instances. Among the approaches in this class are Prototypical networks [12], Matching networks [13], and Relation networks [14]. While still a nascent approach, few-shot learning has been gaining interest in the past few years with works applying it to various medical datasets such as histopathological [15], X-rays [16], and cell [17] image datasets among others. A specific subset in fewshot learning problems is cross-domain few-shot learning [18] where the classes used in training and testing are drawn from different domains. In cross-domain few-shot learning, extreme shifts in the source and target domains are detrimental to performance and provide a challenging problem to overcome. This becomes even more important in medical images, where the data on some diseases are rarer than others, which enables learning for these understudied categories. It has been suggested in [19] that an updatable feature extractor leads to better generalization ability on hard few-shot tasks, as the model can modify its parameters to better suit the task through fine-tuning on the support set. This improves upon using a frozen feature extractor that relies solely on distance calculations to classify. Compression for neural networks has been studied extensively with the goal of achieving models suitable for devices with limited resources, pruning, and quantization being popular methods. Their combination has been experimented with in general computer vision tasks [20]. For example in [21], the authors devise a quantized sparse training regime that leverages a combined pruning-quantization function that determines the optimal pruning and quantization parameters for each layer. Further investigation is required to understand how compression techniques can be effectively integrated with few-shot meta-learning, especially in the context of medical applications where models need to generalize quickly and operate within computational constraints. Meta-learning can be memory intensive as reported in [22] due to the requirement of loading the support images to memory in order to obtain a task adapted model. To counteract this effect, the authors devised a training scheme by restricting backpropagation to only a random subset from the support set, making this approach friendly to computationally limited devices. 
Previous works [23, 24] utilized quantization-aware training (QAT) as a way to quantize the weights of the feature extractor and incorporated both quantization loss and classification loss calculated on the query set. Similarly, incorporating iterative pruning into meta-learning has been attempted in [25] for the


purpose of limiting meta-overfitting through pruning the least significant weights and then fine-tuning via a meta algorithm. Another consequence of sparsifying the model is the reduced number of parameters. Thus, if the model allows updates to the feature extractor, this will lead to a task-adapted model with compressed weights. Since the literature on applying FSL to ALL detection is scarce, we mention a few works that apply cross-domain FSL approaches on histopathological datasets, a related cancer detection task. Examining FSL on histopathological datasets reveals that accuracy ranges from 36.4 ± 0.51% to 70.0 ± 0.49% for out-domain CRC-TP (colorectal tissues) [26] → BreakHis [27] (breast tissues) classification with an average accuracy of 60% as demonstrated by [28]. The authors stipulate that this is a good starting performance for out-domain settings. Another approach proposed in [29] investigates contrastive learning in various cross-domain scenarios. The approach achieves an accuracy of 67.56% in the out-domain setting (NCT-CRC → PAIP) where the source dataset is NCT-CRC-HE100K (colorectal tissues) [30] and the target dataset is PAIP (liver tissues) [31]. It is well established that deep learning, especially transfer learning, performs well on medical datasets including ALL datasets, however, little work has been done on the topic resource efficient few-shot learning for ALL images and how sparsity and quantization impact meta-learning in general.

3 Methodology This section describes the methodology adopted in the proposed FS-SQAM framework. The goals of this approach are two-fold: 1) Produce a lightweight model. 2) Equipping the classifier with better generalization ability in a medical cross-domain setting with reasonable performance on a novel dataset with disjoint classes belonging to the medical domain. We test FS-SQAM on the following cross-domain setting BreakHis → ALLIDB2, which belongs to the domain of histopathological images (breast tissue) and blood cell (blood smear) images respectively. Even though the domain gap between the training dataset and the test dataset adds to the problem’s difficulty, it presents a realistic scenario where uncommon diseases that manifest less frequently in a population may be encountered by medical practitioners. The described setting can be considered an extreme cross-domain scenario (out-domain) given the great domain shift between these two domains. The stages underlying the framework are as follows: 1) the pre-trained model undergoes sparse-aware meta-training and then quantization-aware meta-training on the BreakHis dataset. 2) Afterwards, the model is evaluated on test tasks sampled from the ALL-IDB2 dataset to assess the compressed classifier’s performance as shown in Fig. 1. To achieve the goal of strong generalization, that is generalization on classes foreign to the training dataset, we used a network capable of fine-tuning on the sampled support sets in order to allow for adaptation for out-domain tasks. In brief, our framework utilizes an updatable Prototypical based model [12] that undergoes sparse-aware meta-training, and quantization-aware meta-training for blast cell detection we extend the proposal in [19] to achieve an efficient flexible classifier for this task. Although this approach incurs additional training overhead compared to the simpler methods such as one-shot pruning and post-training quantization, it offers


Fig. 1. Training and testing scheme of FS-SQAM. Support and query sets are sampled from BreakHis in the meta-training phase and from ALL-IDB2 in the meta-testing phase. In the meta-training stage, the model undergoes 1) sparse training (through L1 regularization), 2) pruning (part of sparse-aware meta-training), and 3) quantization-aware meta-training. The support and query samples are projected into the embedding space so that the query samples can be classified.

the significant benefit of preserving accuracy, which is particularly important in medical applications where diagnostic accuracy is crucial.

3.1 Prototypical Networks

Consider a training dataset with a set of classes Ctrain and a set of test classes Ctest, where Ctrain and Ctest are disjoint. The goal is to produce a model fθ capable of adapting to and classifying examples belonging to Ctest from just a few labeled examples [32]. Episodic learning has been proposed as a way to simulate test-time conditions by sampling examples from the larger training dataset. Episodes are constructed by sampling K shots from each of N classes. The K examples per class form a support set S = {(x_1, y_1), ..., (x_K, y_K)}_{i=1}^{N}, alongside a query set Q of different examples drawn from the same N classes. This setup is referred to as a K-shot N-way episode. The model adapts to the support set and its performance is evaluated on the query set with the goal of minimizing the loss. Repeating this process over many episodes gives the model the ability to generalize to the examples in the query set. In the literature, this episodic learning setup is termed meta-learning.


Prototypical networks [12] create a prototype for each class and classify an instance by its nearness to a class's prototype. The loss function of a training episode is the cross-entropy loss applied to the query set examples, where T is the number of examples in the query set, y is the true class label and x is the query sample to be classified to the closest prototype P_c of class c. The prototype of a class, P_c, is the mean of that class's support set embeddings, P_c = \frac{1}{N} \sum_{i=1}^{N} f_\theta(x_i), where f_\theta is the model that extracts the embeddings of S and Q. A query point is classified by comparing its distance to every class's prototype in the embedding space. The loss can be written as:

L_{CE} = -\frac{1}{T} \sum_{i} \log p\left(y_i \mid x_i, \{P_c\}\right)    (1)
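A minimal PyTorch sketch of this episode loss (Eq. (1)); the function signature and the use of squared Euclidean distance are our assumptions, not the authors' code.

```python
import torch
import torch.nn.functional as F

def prototypical_loss(embed, support_x, support_y, query_x, query_y, n_classes):
    """Cross-entropy over query points classified by distance to class prototypes.

    embed: the feature extractor f_theta (e.g. a truncated CNN backbone).
    support_x/query_x: image batches; support_y/query_y: integer labels in [0, n_classes).
    """
    z_support = embed(support_x)                     # (n_support, d)
    z_query = embed(query_x)                         # (n_query, d)
    # Class prototypes: mean embedding of each class's support examples.
    prototypes = torch.stack([
        z_support[support_y == c].mean(dim=0) for c in range(n_classes)
    ])                                               # (n_classes, d)
    # Negative squared Euclidean distance acts as the classification logit.
    logits = -torch.cdist(z_query, prototypes) ** 2  # (n_query, n_classes)
    return F.cross_entropy(logits, query_y)
```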

Cross-Domain Learning. Cross-domain learning violates the assumption that the meta-training and meta-testing tasks T are drawn from the same distribution, i.e., P_train(T) ≠ P_test(T). Hence, to account for the novel domain information through transfer learning, the model f_θ is fine-tuned on the classes of interest; specifically, the last k layers are fine-tuned on the target classes. This is in accordance with the idea proposed in [19] that an updatable metric feature extractor is better suited for adapting to unseen classes at test time.

3.2 Sparse-Aware Meta-training

The proposed FS-SQAM framework employs two compression methods, pruning and quantization, whose combination has been used in various computer vision tasks to produce ultra-resource-efficient models. In this section, we detail the specific pruning and quantization methods that are incorporated into the framework. The first compression technique is iterative pruning, which enforces sparsity in the network during the sparse-aware meta-training phase through unstructured magnitude-based pruning. Pruning is preceded by sparse training, introduced through regularization. Pruning relies on the hypothesis that within overparameterized networks there exists a sparser subnetwork, referred to as a winning ticket [33], that can match the accuracy of the original network with the added benefits of fewer parameters and reduced size. We incorporate L1 regularization into the training to drive the weights w towards zero by penalizing the absolute sum of the weights, as indicated by the following loss function of a model f_θ with parameters θ:

L\left(y, \hat{y}\right) = \frac{1}{N} \sum_{i=1}^{N} L\left(y_i, f_\theta(x_i)\right) = L_{CE} + \lambda \sum_{i} |w_i|    (2)



where y refers to the ground truth and ŷ is the predicted label. The regularization penalty term consists of the λ hyperparameter and the L1-norm. We conduct experiments by applying FS-SQAM to a pre-trained VGG19 network with λ = {0.001, 0.005, 0.01}. The network is pruned iteratively with the goal of progressively revealing a subnetwork that achieves accuracy comparable to the unpruned version. Iterative pruning is the preferred approach when accuracy is prioritized, since training the network and then applying one-shot pruning can be harmful to accuracy.

3.3 Quantization-Aware Meta-training

In FS-SQAM, during the quantization-aware meta-training phase, we opt for QAT [34] to reduce the precision from 32-bit floating point to 8-bit integers. QAT compensates for the quantization error during training. Given an input x, a quantization function q, and the labels y of a model f_θ, the quantized value of x in the forward pass becomes x_q = \min\left(q_{max}, \max\left(q_{min}, \lfloor x/s \rceil + z\right)\right), which maps each input weight to an integer. q_{min} and q_{max} are the minimum and maximum values of the 8-bit range, and s and z are the scaling factor of the quantization range and the zero-point, respectively. Dequantization is then performed through \tilde{x} = (x_q - z)\, s, which maps the quantized value x_q back to floating point. However, the operation x_q → \tilde{x} yields a value \tilde{x} that is not exactly equal to the original input x, thus inducing quantization noise, which is taken into account during training. Re-training is required to recover the accuracy lost to quantization. The previously mentioned phases are detailed in Fig. 1. The applied quantization loss, where q is the quantization operation, can be described as:

L\left(y, f_\theta\left(x, q(\theta)\right)\right)    (3)
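The sketch below illustrates the quantize–dequantize round trip described above as a single "fake quantization" step, the mechanism QAT uses to expose the rounding error to the loss; the min/max calibration shown here is an assumption and not the authors' exact scheme.

```python
import torch

def fake_quantize(x: torch.Tensor, num_bits: int = 8) -> torch.Tensor:
    """Simulated (fake) quantization used during QAT: quantize to int and
    immediately dequantize, so the rounding error is visible in the forward pass."""
    qmin, qmax = 0, 2 ** num_bits - 1
    scale = (x.max() - x.min()).clamp(min=1e-8) / (qmax - qmin)          # s
    zero_point = qmin - torch.round(x.min() / scale)                     # z
    x_q = torch.clamp(torch.round(x / scale) + zero_point, qmin, qmax)   # x_q = min(q_max, max(q_min, round(x/s) + z))
    return (x_q - zero_point) * scale                                    # dequantize: x~ = (x_q - z) * s
```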





Incorporating the quantization loss into the training yields the following objective function to be minimized:

\min_{\theta} \; L\left(y, \hat{y}\right) + L\left(y, f_\theta\left(x, q(\theta)\right)\right)    (4)


The workflow of FS-SQAM is listed in Algorithm 1: it starts with sparsification through L1 regularization, followed by pruning, and then applies quantization through QAT.
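Since Algorithm 1 itself is not reproduced in this excerpt, the sketch below outlines the three phases as we read them from the text. `episodes` and `episode_loss` are placeholder callables (e.g. the prototypical loss above), the hyperparameters are illustrative, and `torch.nn.utils.prune.l1_unstructured` merely stands in for the unstructured magnitude pruning — this is not the authors' implementation.

```python
import torch
import torch.nn.utils.prune as prune

def fs_sqam_meta_training(model, episodes, episode_loss,
                          epochs_per_phase=50, lam=0.01, sparsity=0.7):
    """Schematic three-phase schedule: sparse training -> pruning -> QAT."""
    opt = torch.optim.Adam(model.parameters(), lr=1e-4)

    # Phase 1: sparse training -- episodic loss plus an L1 penalty on the weights (Eq. 2).
    for _ in range(epochs_per_phase):
        for support, query in episodes():
            loss = episode_loss(model, support, query)
            loss = loss + lam * sum(p.abs().sum() for p in model.parameters())
            opt.zero_grad(); loss.backward(); opt.step()

    # Phase 2: unstructured magnitude pruning of the conv weights, then episodic fine-tuning.
    for module in model.modules():
        if isinstance(module, torch.nn.Conv2d):
            prune.l1_unstructured(module, name="weight", amount=sparsity)
    for _ in range(epochs_per_phase):
        for support, query in episodes():
            loss = episode_loss(model, support, query)
            opt.zero_grad(); loss.backward(); opt.step()

    # Phase 3: quantization-aware meta-training -- simulate int8 weights in the
    # forward pass (see the fake_quantize sketch above, or a QAT-prepared model)
    # while continuing to train on episodes with the objective of Eq. (4).
    return model
```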


4 Experimental Setup Our initial model is an ImageNet pre-trained VGG19 network. The model undergoes compression as previously detailed. The first 3 convolutional blocks of the network are frozen during the sparse training and quantization-aware meta-training phases. The meta-train dataset is the 40× magnification segment from the BreakHis [27] dataset which is where the training tasks are sampled from. The few-shot training tasks


are 8-way 10-shot for each episode, and the model is trained for 150 epochs divided equally between the sparse training, sparse-aware, and quantization-aware meta-training phases. The model is evaluated on test episodes sampled from the ALL [8] dataset, specifically the ALL-IDB2 segment.

4.1 Datasets

The first dataset used is BreakHis [27], a dataset of histological images of breast tumors. The dataset has eight classes, and the images of interest are the 40× magnification images, which total 1,995 samples. Table 1 shows the number of images in each of the eight classes.

Table 1. The Eight Classes of the BreakHis Dataset.

Classes                    Magnification Factor            Total
                           40×    100×   200×   400×
Adenosis (A)               114    113    111    106     444
Fibroadenoma (F)           253    260    264    237    1014
Tubular Adenoma (TA)       109    121    108    115     453
Phyllodes Tumor (PT)       149    150    140    130     569
Ductal Carcinoma (DC)      864    903    896    788    3451
Lobular Carcinoma (LC)     156    170    163    137     626
Mucinous Carcinoma (MC)    205    222    196    169     792
Papillary Carcinoma (PC)   145    142    135    138     560
Total                      1995   2081   2013   1820   7909

The second dataset included is the ALL-IDB2 [8] dataset, which is a binary dataset consisting of images of blood smears. The image could either be normal or abnormal, reflecting the presence of normal or abnormal blast cells. The dataset is balanced with an equal number of normal and abnormal images, which sums up to a total of 260 images.

5 Results

We present the results of meta-training a prototypical network with FS-SQAM on the BreakHis dataset and testing it on the ALL-IDB2 dataset using a VGG19 backbone. The reported performance is the mean accuracy over three test runs, where each result is the average performance of the classifier across the test tasks. The number of test tasks is set to 600, with the query set size set to four samples for both training and testing. Experiments were repeated for sparsities of 50%, 70%, and 90%, shown in Table 2. Results are reported for pruning alone or followed by 8-bit quantization-aware training. The resulting model sizes are also reported in Table 3 to give a view of the accuracy-size trade-offs. All reported sizes are of .tflite [35] files in MB.


The baseline result without any compression is 82.22% accuracy. QAT slightly improves the accuracy of the 50% and 70% pruned networks, leading to an accuracy of 77.95% at 70% sparsity with 8-bit quantized weights. Regarding the obtained compression, the network size is 19.60 MB, a 3.90× compression compared to the baseline.

Table 2. Meta-test accuracy for FS-SQAM on the ALL-IDB2 dataset using a VGG19 network for 2-way 10-shot classification (without regularization).

                    Sparsity
                    0%      50%     70%     90%
Pruning (Baseline)  0.8222  0.7813  0.7769  0.7485
With QAT            0.8127  0.7888  0.7795  0.7291

Table 3. Model sizes of VGG19 after FS-SQAM in MB.

                             Sparsity
                             0%     50%    70%    90%
Sparse-Aware Meta-Training   76.41  55.84  33.60  11.34
FS-SQAM                      29.28  32.39  19.60   6.79

Table 4. Meta-test accuracy for FS-SQAM with L1 regularization applied with varying values of λ on the ALL-IDB2 dataset using a VGG19 network for 2-way 10-shot classification.

                                        Sparsity
                                        50%     70%     90%
λ = 0.001  Sparse-Aware Meta-Training   0.7875  0.7743  0.7552
           FS-SQAM                      0.7864  0.7851  0.7251
λ = 0.005  Sparse-Aware Meta-Training   0.7874  0.7867  0.7457
           FS-SQAM                      0.7898  0.7817  0.7335
λ = 0.01   Sparse-Aware Meta-Training   0.7930  0.7826  0.7474
           FS-SQAM                      0.7908  0.7822  0.7338

Table 4 shows the results of using sparse-aware meta-training on its own and FS-SQAM with L1 regularization applied using λ = {0.001, 0.005, 0.01} for the sparsity


ratios 50%, 70%, and 90%. It is worth noting that applying regularization improves the accuracy at some sparsities compared to not applying it. Notably, at 50% sparsity the accuracy for λ = 0.005 and λ = 0.01 is 78.98% and 79.08% respectively, compared to 78.88% with no regularization, at a compression ratio of 2.35×. Additionally, in the FS-SQAM configuration we find that λ = 0.01 yields the highest accuracy for the 50% and 70% sparsity ratios, namely 79.08% and 78.22%. While the role of pruning and quantization remains largely unexplored in the context of few-shot learning, pruning has been used to prevent overfitting in meta-learning [36, 37]. The generalization ability of the classifier can be attributed to overcoming the over-parameterization of the network. Over-parameterization hurts the generalization ability of the network to unseen examples in ordinary supervised learning [38]; extending this concept to few-shot learning, it is possible to boost generalization to unseen classes through sparsification, which also provides the added benefit of compression. To further compress the model, QAT is used, as it incurs minimal accuracy loss. Thus, the FS-SQAM framework provides an adaptable classifier that achieves a fair accuracy-to-size trade-off in a cross-domain setting. Resource-wise, 90% sparsity in FS-SQAM yields an accuracy of 73.38% together with the most compressed model, at a compression rate of 11.25×. To provide context for the obtained results, Table 5 lists results from related works that address few-shot learning for medical datasets with a significant domain shift, to demonstrate the efficacy of FS-SQAM for lymphoblast detection in an out-domain setting.

Table 5. Performance comparison with related works in the out-domain setting.

Reference      Accuracy (%)  Out-Domain Setting
[28]           68.1 ± 0.51   CRC-TP → BreakHis
[29]           67.56         NCT → PAIP
[17]           55.12 ± 0.13  mini-ImageNet → HEp-2
50% FS-SQAM    79.08         BreakHis → ALL-IDB2

6 Conclusion

Rationing resources is important for real-life clinical scenarios where only restricted computational resources and scarce data are available. We empirically test our framework FS-SQAM, which combines sparse-aware training and quantization-aware training in the meta-training process for efficient lymphoblast detection. The obtained results point to the possibility of applying resource-efficient few-shot learning in cross-domain medical settings. The use of compression in the meta-training process minimizes the memory footprint while enabling adaptation to unseen classes, which matches the reality of encountering uncommon classes of diseases with very few


samples in clinical practice. We empirically investigate the effectiveness of FS-SQAM in BreakHis → ALL-IDB2 cross-domain setting, an extreme case of distribution shift, with a focus on resulting accuracy-size trade-offs. The application of sparse-aware training and quantization-aware training enhances performance and reduces the memory footprint. Additionally, the application of L1 regularization as a preprocess before pruning yields notable sparse model performance increases at most λ values, especially at 50% sparsity. To further examine the generalizability of the framework, more evaluations need to be undertaken on other medical datasets and domains, we leave this as future work.

References 1. Ellis, R.J., Sander, R.M., Limon, A.: Twelve key challenges in medical machine learning and solutions. Intell. Based Med. 6, 100068 (2022). https://doi.org/10.1016/j.ibmed.2022.100068 2. Willemink, M.J., et al.: Preparing medical imaging data for machine learning. Radiology 295, 4–15 (2020). https://doi.org/10.1148/radiol.2020192224 3. Weng, W.-H., Deaton, J., Natarajan, V., Elsayed, G.F., Liu, Y.: Addressing the real-world class imbalance problem in dermatology. In: Proceedings of the Machine Learning for Health NeurIPS Workshop, pp. 415–429. PMLR (2020) 4. Litjens, G., et al.: Deep learning as a tool for increased accuracy and efficiency of histopathological diagnosis. Sci. Rep. 6, 26286 (2016). https://doi.org/10.1038/srep26286 5. Rajpurkar, P., et al.: CheXaid: deep learning assistance for physician diagnosis of tuberculosis using chest x-rays in patients with HIV. NPJ Digit. Med. 3, 1–8 (2020). https://doi.org/10. 1038/s41746-020-00322-2 6. Altini, N., et al.: Liver, kidney and spleen segmentation from CT scans and MRI with deep learning: a survey. Neurocomputing 490, 30–53 (2022). https://doi.org/10.1016/j.neucom. 2021.08.157 7. Chan, H.-P., Hadjiiski, L.M., Samala, R.K.: Computer-aided diagnosis in the era of deep learning. Med. Phys. 47, e218–e227 (2020). https://doi.org/10.1002/mp.13764 8. Labati, R.D., Piuri, V., Scotti, F.: All-IDB: The acute lymphoblastic leukemia image database for image processing. In: 2011 18th IEEE International Conference on Image Processing, pp. 2045–2048 (2011). https://doi.org/10.1109/ICIP.2011.6115881 9. Genovese, A.: ALLNet: acute lymphoblastic leukemia detection using lightweight convolutional networks. In: 2022 IEEE 9th International Conference on Computational Intelligence and Virtual Environments for Measurement Systems and Applications (CIVEMSA), pp. 1–6 (2022). https://doi.org/10.1109/CIVEMSA53371.2022.9853691 10. Genovese, A., Hosseini, M.S., Piuri, V., Plataniotis, K.N., Scotti, F.: Histopathological transfer learning for acute lymphoblastic leukemia detection. In: 2021 IEEE International Conference on Computational Intelligence and Virtual Environments for Measurement Systems and Applications (CIVEMSA), pp. 1–6 (2021). https://doi.org/10.1109/CIVEMSA52099.2021. 9493677 11. Maaliw, R.R., et al.: A multistage transfer learning approach for acute lymphoblastic leukemia classification. In: 2022 IEEE 13th Annual Ubiquitous Computing, Electronics & Mobile Communication Conference (UEMCON), pp. 0488–0495 (2022). https://doi.org/10.1109/ UEMCON54665.2022.9965679 12. Snell, J., Swersky, K., Zemel, R.S.: Prototypical Networks for Few-shot Learning. arXiv preprint http://arxiv.org/abs/1703.05175 (2017). https://doi.org/10.48550/arXiv.1703.05175


13. Vinyals, O., Blundell, C., Lillicrap, T., Kavukcuoglu, K., Wierstra, D.: Matching networks for one shot learning. In: Proceedings of the 30th International Conference on Neural Information Processing Systems, pp. 3637–3645. Curran Associates Inc., Red Hook (2016) 14. Sung, F., Yang, Y., Zhang, L., Xiang, T., Torr, P.H.S., Hospedales, T.M.: Learning to Compare: Relation Network for Few-Shot Learning. arXiv preprint http://arxiv.org/abs/1711.06025 (2018). https://doi.org/10.48550/arXiv.1711.06025 15. Chao, S., Belanger, D.: Generalizing few-shot classification of whole-genome doubling across cancer types. Pac. Symp. Biocomput. 27, 144–155 (2022) 16. Paul, A., Shen, T.C., Peng, Y., Lu, Z., Summers, R.M.: Learning few-shot chest X-ray diagnosis using images from the published scientific literature. In: 2021 IEEE 18th International Symposium on Biomedical Imaging (ISBI), pp. 344–348 (2021). https://doi.org/10.1109/ISB I48211.2021.9434059 17. Walsh, R., Abdelpakey, M.H., Shehata, M.S., Mohamed, M.M.: Automated human cell classification in sparse datasets using few-shot learning. Sci. Rep. 12, 2924 (2022). https://doi. org/10.1038/s41598-022-06718-2 18. Guo, Y., et al.: A Broader Study of Cross-Domain Few-Shot Learning. arXiv preprint http:// arxiv.org/abs/1912.07200 (2020) 19. Triantafillou, E., et al.: Meta-Dataset: A Dataset of Datasets for Learning to Learn from Few Examples. arXiv preprint http://arxiv.org/abs/1903.03096 (2020). https://doi.org/10.48550/ arXiv.1903.03096 20. Zhang, X., Colbert, I., Kreutz-Delgado, K., Das, S.: Training Deep Neural Networks with Joint Quantization and Pruning of Weights and Activations. arXiv preprint http://arxiv.org/ abs/2110.08271 (2021). https://doi.org/10.48550/arXiv.2110.08271 21. Park, J.-H., Kim, K.-M., Lee, S.: Quantized sparse training: a unified trainable framework for joint pruning and quantization in DNNs. ACM Trans. Embed. Comput. Syst. 21, 60:1–60:22 (2022). https://doi.org/10.1145/3524066 22. Bronskill, J., Massiceti, D., Patacchiola, M., Hofmann, K., Nowozin, S., Turner, R.E.: Memory Efficient Meta-Learning with Large Images. arXiv preprint http://arxiv.org/abs/2107.01105 (2021). https://doi.org/10.48550/arXiv.2107.01105 23. Youn, J., Song, J., Kim, H.-S., Bahk, S.: Bitwidth-Adaptive Quantization-Aware Neural Network Training: A Meta-Learning Approach. arXiv preprint http://arxiv.org/abs/2207.10188 (2022). https://doi.org/10.48550/arXiv.2207.10188 24. Chauhan, J., Kwon, Y.D., Mascolo, C.: Exploring On-Device Learning Using Few Shots for Audio Classification 5 25. Tian, H., Liu, B., Yuan, X.-T., Liu, Q.: Meta-learning with network pruning. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12364, pp. 675–700. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58529-7_40 26. Javed, S., Mahmood, A., Werghi, N., Benes, K., Rajpoot, N.: Multiplex cellular communities in multi-gigapixel colorectal cancer histology images for tissue phenotyping. IEEE Trans. Image Process. 29, 9204–9219 (2020). https://doi.org/10.1109/TIP.2020.3023795 27. Breast Cancer Histopathological Database (BreakHis). Laboratório Visão Robótica e Imagem. https://web.inf.ufpr.br/vri/databases/breast-cancer-histopathological-database-bre akhis/. Accessed 20 Oct 2022 28. Shakeri, F., et al.: FHIST: A Benchmark for Few-shot Classification of Histological Images. http://arxiv.org/abs/2206.00092 (2022). https://doi.org/10.48550/arXiv.2206.00092 29. 
[2202.09059] Towards better understanding and better generalization of few-shot classification in histology images with contrastive learning. arXiv preprint https://arxiv.org/abs/2202. 09059. Accessed 01 June 2023 30. 100,000 histological images of human colorectal cancer and healthy tissue. Zenodo. https:// zenodo.org/record/1214456. Accessed 02 June 2023


31. Kim, Y.J., et al.: PAIP 2019: liver cancer segmentation challenge. Med. Image Anal. 67, 101854 (2021). https://doi.org/10.1016/j.media.2020.101854 32. Ren, M., et al.: Meta-Learning for Semi-Supervised Few-Shot Classification. arXiv preprint http://arxiv.org/abs/1803.00676 (2018) 33. Frankle, J., Carbin, M.: The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks. arXiv preprint http://arxiv.org/abs/1803.03635 (2019). https://doi.org/10.48550/ arXiv.1803.03635 34. Jacob, B., et al.: Quantization and Training of Neural Networks for Efficient IntegerArithmetic-Only Inference. arXiv preprint http://arxiv.org/abs/1712.05877 (2017). https:// doi.org/10.48550/arXiv.1712.05877 35. TensorFlow Lite. https://www.tensorflow.org/lite/guide. Accessed 26 Jan 2023 36. Chijiwa, D., Yamaguchi, S., Kumagai, A., Ida, Y.: Meta-ticket: finding optimal subnetworks for few-shot learning within randomly initialized neural networks. Presented at the Advances in Neural Information Processing Systems, 31 October (2022) 37. Liu, Z., et al.: MetaPruning: Meta Learning for Automatic Neural Network Channel Pruning. arXiv preprint http://arxiv.org/abs/1903.10258 (2019). https://doi.org/10.48550/arXiv.1903. 10258 38. Hoefler, T., Alistarh, D., Ben-Nun, T., Dryden, N., Peste, A.: Sparsity in Deep Learning: Pruning and Growth for Efficient Inference and Training in Neural Networks. arXiv preprint http://arxiv.org/abs/2102.00554 (2021). https://doi.org/10.48550/arXiv.2102.00554

MAREPVGG: Multi-attention RepPVGG to Facefake Detection Zhuochao Huang, Rui Yang, Rushi Lan(B) , Cheng Pang, and Xiaoyan Luo Guilin University of Electronic Technology, Guilin 541004, China [email protected]

Abstract. The threat posed by increasingly capable and accessible face-forgery tools is growing. Although the detection capability of current models keeps improving, most of them consume large computational resources and have complex architectures. In this paper we therefore propose a new deep learning detection framework, MARepVGG, which uses RepVGG as the backbone and combines a texture-enhancement module and a multi-attention module to strengthen the network's ability to learn face-forgery features, exploiting structural re-parameterization to balance training performance and inference speed. We evaluate our method on the Kaggle real-and-fake face detection dataset, which differs from automatically generated images in that the fake faces are high-quality images produced by Photoshop experts. Our method improves accuracy by 14% on this dataset compared to a baseline that uses RepVGG alone for forgery detection, while the number of parameters is only 8.75 M. Keywords: Facefake Detection · Texture Enhancement

1

· REPVGG · Multi-attention ·

Introduction

With the continuous development of technologies such as computer graphics, deep learning and artificial intelligence, face forgery technology has made great progress. As researchers have developed GAN models that can automatically generate highly realistic face images, face images generated by AI can escape recognition by the naked eye, and it is no longer difficult to accurately forge the victim’s face, which poses a huge security risk to society when people can easily forge faces, and face forgery technology will most likely become an accomplice to illegal crimes. Therefore, the research on face forgery detection technology is becoming more and more important. In recent years, deep learning-based detection technology has reached a high level, and many transformer-based deepfake detection models [1] can achieve an accuracy rate of more than 90%. However, as the accuracy rate increases, the number of parameters of the model tends to increase, because the model needs to learn more complex features or relationships, this detection model becomes very large, and the training c The Author(s), under exclusive license to Springer Nature Switzerland AG 2023  H. Lu et al. (Eds.): ACPR 2023, LNCS 14408, pp. 227–237, 2023. https://doi.org/10.1007/978-3-031-47665-5_19

228

Z. Huang et al.

process consumes a lot of computational resources and time, which increases the cost of using the model and time cost, and limits their promotion and application in practical applications. Moreover, in mobile or embedded devices, large models often cannot meet the requirements of real-time and low-power consumption. Therefore, we consider how to achieve the detection of face forgery within limited resources, reduce the size and computational complexity of the model and build a more lightweight face forgery detection model at the same time, using some effective feature extraction techniques and classifiers to further improve the detection performance. To address our proposed scenario above, we propose a hybrid deep learning model called MARepVGG, which uses repvgg-A0 [2] as the backbone of the model to balance training speed and inference speed through parameter reconfiguration, and its entire network consists of standard convolutional layers, which can obtain more hardware optimization support to further accelerate training and inference. At the same time, the attention to different regions is enhanced using the multi-attention module, which captures heterogeneous features by introducing multiple attention mechanisms, thus enabling the model to better learn information from different regions. The texture enhancement module processes the shallow features extracted by the backbone network to make the texture of the image more obvious, which improves the detail representation and compensates for the information loss caused by deep convolution. The generated multi-attention maps and texture features extracted from shallow layers by the texture enhancement block are fused in the bilinear attention pool into a feature representation of the whole image, further enhancing the model’s ability to learn forged features. We will evaluate the performance of the model and compare it with other lightweight frameworks in Sect. 4. The main contributions of our work are the following: We propose a new deep learning model, MARepVGG, which uses multi-attention mechanism and texture enhancement to detect face forgery. Beginning with a lightweight model, we bring a new perspective to face detection. Gains a significant improvement on the baseline approach.

2 Related Work

Most face forgery detection models extract artifact features from images by stacking various deep convolutions, but with the rapid development of forgery technology, the fidelity of forged faces has greatly improved, so the structure of deepfake detection models [3–5] has become more complex in order to improve the detection of forged faces. We ask whether face forgery detection can be implemented on a simple model structure. From the literature [6,7] we found that RepVGG improves the overall feature representation of the network well, while its inference speed is competitive with other mainstream networks. For example, in the YOLOv6 backbone, Li et al. used RepBlock as a building block for small networks and replaced the CSP module with the Rep module in CSP-PAN, which to a certain extent reduced the latency introduced by CSPNet [8] and improved memory bandwidth utilization.


Meanwhile, mainstream GPUs are highly optimized for 3 × 3 convolution; on ImageNet, RepVGG-A0, for example, is 1.25% more accurate and 33% faster than ResNet-18. The two most common face forgery operations are face swapping and face attribute editing. In face swapping, the original image is swapped with another person's face; representative methods are SimSwap [9] and InfoSwap [10]. Face attribute editing usually modifies a local part of the original face and can be achieved by GAN-based methods such as HFGI [11] and StyleCLIP [12]. Ordinary deep learning models tend to focus on extracting local features through multilayer convolution and lose features from other regions. Facing these problems, we need to improve the complementary fusion of features from different regions. Zheng et al. [13] propose a multi-attention method that learns more regional features by clustering and weighting the spatial channels. A self-supervised mechanism proposed by Yang et al. [14] can effectively localize informative regions. Using the core ideas of these methods, we propose an adaptation of RepVGG that fuses multi-region features to enhance the discrimination of face authenticity.

Fig. 1. Shown is the framework of the MARepVGG model proposed in this paper. RepVGG is used as the backbone network to fuse the following main modules: an attention module to generate multiple attention maps, a texture enhancement block to extract and enhance texture information, and a bilinear attention pool to aggregate texture and semantic features.

3 Our Approach

This section describes our proposed detection model in detail (Fig. 1), including the structure of the backbone network, how the texture enhancement module extracts shallow texture features, how the multi-attention module generates multiple attention maps, and how the bilinear pooling layer works.

3.1 Backbone

We choose RepVGG-A0 as the backbone network, discard its original GAP layer and fully connected layer, and keep its RepBlock part. It consists of 5 stages containing [1, 2, 4, 14] RepBlocks. The input images are simply edited, including random cropping and left-right flipping, before being fed into the model. The feature maps extracted in stage 1 are used as input to the texture enhancement module, the output of stage 2 is passed to the multiple attention module to generate multiple attention maps, and finally the bilinear attention pool fuses the backbone features with local features before binary discrimination through the fully connected layer. A minimal structural sketch of this pipeline is given below.
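The following PyTorch sketch shows how the described components could be wired together. It is an illustration of the data flow only, not the authors' implementation; the submodules (backbone stages, texture block, attention block, BAP) and the classifier dimensions are assumptions.

```python
import torch
import torch.nn as nn

class MARepVGG(nn.Module):
    """Structural sketch of the pipeline described above; module choices are illustrative."""
    def __init__(self, backbone_stages, texture_block, attention_block, bap):
        super().__init__()
        self.stages = nn.ModuleList(backbone_stages)   # RepVGG-A0 stages, GAP/FC removed
        self.texture_block = texture_block             # texture enhancement on stage-1 features
        self.attention_block = attention_block         # produces M attention maps from stage-2 features
        self.bap = bap                                 # bilinear attention pooling
        self.classifier = nn.LazyLinear(2)             # real / fake

    def forward(self, x):                              # x: (B, 3, 224, 224)
        feats = []
        for stage in self.stages:
            x = stage(x)
            feats.append(x)
        texture = self.texture_block(feats[0])         # shallow texture features
        attn_maps = self.attention_block(feats[1])     # M attention maps
        p, g = self.bap(attn_maps, texture, feats[-1]) # texture vector P, global vector G
        return self.classifier(torch.cat([p, g], dim=1))
```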

3.2 Multiple Attention Maps Generation

The multiple attention module is a lightweight module consisting of a 1 × 1 convolutional layer, a BN layer and a nonlinear ReLU activation. The differences between real and fake faces may exist in different regions, and the degree of difference is subtle. The feature maps extracted from stage 2 are passed to the multiple attention module, and the M attention maps it produces represent M different regions; these weights are applied in the next step of the backbone network to help the model better understand the different characteristics of the input data and collect local features more effectively [15].
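A hedged sketch of such a module is shown below. The layer sequence (1 × 1 convolution, BN, ReLU) follows the description above; the channel count and the number of attention maps are assumptions for illustration.

```python
import torch.nn as nn

class MultiAttention(nn.Module):
    """Maps stage-2 features to M attention maps, one per attended region."""
    def __init__(self, in_channels=96, num_maps=4):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(in_channels, num_maps, kernel_size=1, bias=False),
            nn.BatchNorm2d(num_maps),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):          # x: (B, C, H, W) stage-2 feature map
        return self.block(x)       # (B, M, H, W)
```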

3.3 Textural Feature Enhancement

In face forgery detection, local texture features are more important than high-level semantic information [16]; the artifacts useful for identification are usually present in the shallow layers. Therefore, we use the texture feature enhancement module to amplify the artifacts in the shallow features and make them easier for the model to capture subsequently. The output feature map F from stage 1 is downsampled and averaged in the texture enhancement module to obtain the non-texture feature map D. Then the non-texture component D is removed from F to obtain the texture map T. Finally, a densely connected convolution block is used to enhance T [17]; dense connectivity strengthens the propagation of features and encourages feature reuse.
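The snippet below is one possible reading of this step: local average pooling approximates the non-texture component D, the residual keeps the texture, and a small densely connected block (a stand-in for the design of [17]) enhances it. Channel and growth sizes are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TextureEnhance(nn.Module):
    """Sketch of the texture enhancement described above, under stated assumptions."""
    def __init__(self, channels=48, growth=24, layers=3):
        super().__init__()
        self.dense = nn.ModuleList(
            nn.Sequential(
                nn.Conv2d(channels + i * growth, growth, 3, padding=1, bias=False),
                nn.BatchNorm2d(growth),
                nn.ReLU(inplace=True),
            )
            for i in range(layers)
        )

    def forward(self, feat):                       # feat: stage-1 feature map F
        d = F.avg_pool2d(feat, kernel_size=2)      # downsample + average -> non-texture D
        d = F.interpolate(d, size=feat.shape[-2:], mode="bilinear", align_corners=False)
        t = feat - d                               # texture residual T
        outs = [t]
        for layer in self.dense:                   # densely connected enhancement
            outs.append(layer(torch.cat(outs, dim=1)))
        return torch.cat(outs, dim=1)
```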

3.4 Bilinear Attention Pooling

As shown in Fig. 2, when an attention map does not match the scale of the shallow texture features output by the texture enhancement block, it is resized to the same scale as the shallow feature map using linear interpolation. The attention maps and the shallow feature map are then fused by matrix multiplication and sent to a standard average pooling layer to reduce the influence of the attention maps on the pooled feature vector [18], yielding the texture features P. At the same time, we sum the multi-attention maps to obtain a single attention map, and this single attention map, together with the features from the last convolutional layer, is sent to the BAP layer to obtain the global feature G. P and G are sent to the discriminator to obtain the authenticity result.
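The function below is a rough sketch of this fusion as we read it; it is not the authors' code, and the shapes and pooling choices are assumptions.

```python
import torch
import torch.nn.functional as F

def bilinear_attention_pool(attn_maps, texture_feat, deep_feat):
    """attn_maps: (B, M, Ha, Wa); texture_feat: (B, Ct, Ht, Wt); deep_feat: (B, Cd, Hd, Wd).
    Returns texture vector P (B, M*Ct) and global vector G (B, Cd)."""
    # match attention maps to the shallow feature resolution
    a = F.interpolate(attn_maps, size=texture_feat.shape[-2:], mode="bilinear",
                      align_corners=False)
    # part features: matrix multiplication between maps and texture features, then averaging
    p = torch.einsum("bmhw,bchw->bmc", a, texture_feat) / (a.shape[-1] * a.shape[-2])
    p = p.flatten(1)                                   # texture features P
    # single attention map = sum of the M maps, applied to last-layer features
    single = F.interpolate(attn_maps.sum(dim=1, keepdim=True),
                           size=deep_feat.shape[-2:], mode="bilinear", align_corners=False)
    g = (single * deep_feat).mean(dim=(-2, -1))        # global features G
    return p, g
```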

Fig. 2. The process of fusing multi-attention maps generated by the attention module with shallow texture features generated by the texture feature enhancement module.

3.5 Loss

To avoid different attention maps concentrating on the same region of the image and causing network degradation, the region-independent loss L_RIL [18] is used to reduce the overlap between attention feature maps and to reduce the randomness of the information captured by each attention map:

\mathcal{L}_{RIL} = \sum_{i=1}^{B}\sum_{j=1}^{M} \max\left(\left\|V_j^i - c_j\right\|_2^2 - m_{in}(y_i),\, 0\right) + \sum_{i,j\in(M,M),\, i\neq j} \max\left(m_{out} - \left\|c_i - c_j\right\|_2^2,\, 0\right)    (1)

where V is obtained by stacking the multiple normalized attention features obtained by BAP from the non-texture feature map D, B is the batch size, M is the number of attention maps, m_in is the margin between the features and the corresponding feature centers (set to different values when y_i is 0 and 1), m_out is the margin between feature centers, and c is the feature center of V. Combining the region-independent loss L_RIL and the cross-entropy loss L_CE gives the total loss L of the model:

L = \lambda_1 \cdot L_{CE} + \lambda_2 \cdot L_{RIL}    (2)

λ1 and λ2 are the weights of the two losses in the total loss. In our experiments, both are set to λ1 = λ2 = 1.
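An illustrative implementation of Eqs. (1) and (2), as reconstructed above, is given below. The margin handling and the center representation follow our reading of [18]; all tensor shapes and the center-update rule are assumptions.

```python
import torch
import torch.nn as nn

def total_loss(logits, labels, part_feats, centers, m_in, m_out, lam1=1.0, lam2=1.0):
    """logits: (B, 2); labels: (B,) in {0, 1}; part_feats: (B, M, C) attention features V;
    centers: (M, C) feature centers c; m_in: tensor of two margins (for y=0 and y=1)."""
    ce = nn.functional.cross_entropy(logits, labels)
    # intra term: pull each attention feature towards its center within a margin
    dist = (part_feats - centers.unsqueeze(0)).pow(2).sum(-1)          # (B, M)
    margins = torch.where(labels.bool(), m_in[1], m_in[0]).unsqueeze(1)
    intra = torch.clamp(dist - margins, min=0).sum()
    # inter term: push different feature centers apart by at least m_out
    cdist = torch.cdist(centers, centers).pow(2)                        # (M, M)
    mask = ~torch.eye(len(centers), dtype=torch.bool, device=centers.device)
    inter = torch.clamp(m_out - cdist[mask], min=0).sum()
    return lam1 * ce + lam2 * (intra + inter)
```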

4 Experiment

4.1 Dataset and Implementation Setting

Dataset. The dataset we use contains expertly Photoshopped facial images of high quality, subjectively classified as easy, medium and difficult; the numbers of images are shown in Table 1. These images were synthesized from different faces by combining regions such as the eyes, nose, mouth or the whole face. Unlike fake faces produced by generative models, where we hypothesize that a detection model learns some pattern that is easier to recognize in GAN-generated images, leading face forgery detectors to show better accuracy on such datasets, this pattern may be useless against human experts, since the two forgery processes are quite different. Using this dataset therefore lets us compare models in a fairer context. In addition, to better understand the performance of our model, we also use fake faces generated by PGGAN [23] as a dataset. This dataset contains 37,566 GAN-generated pseudo-face images; to meet the input requirements of our model, we resize them to 256 × 256 and then crop them to 224 × 224 on the fly.

Table 1. Numbers of real and fake face images

sample        train set (count)   test set (count)
real images   1081                1081
fake images   960                 960

Implementation Setting. Our proposed model is built with PyTorch, a very popular deep learning framework that allows neural networks to be built and trained quickly. The Adam optimizer and the loss proposed in the previous section are used to train our model. Our training parameters are set as follows: epochs = 100, batch size = 64, learning rate = 1e-4, betas = (0.9, 0.999), weight decay = 1e-6. These parameters can be adjusted to obtain better training results. The model is trained on one NVIDIA Tesla K80 12 GB GPU.
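A minimal training loop matching the settings listed above is sketched below; `model`, `train_loader` and `criterion` (the combined loss of Sect. 3.5) are assumed to be provided by the caller.

```python
import torch

def train(model, train_loader, criterion, device="cuda", epochs=100):
    """Adam with lr 1e-4, betas (0.9, 0.999), weight decay 1e-6, 100 epochs."""
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4,
                                 betas=(0.9, 0.999), weight_decay=1e-6)
    model = model.to(device).train()
    for _ in range(epochs):
        for images, labels in train_loader:          # DataLoader with batch_size=64
            images, labels = images.to(device), labels.to(device)
            loss = criterion(model(images), labels)  # CE + region-independent loss
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return model
```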

4.2 Results

Indicator Evaluation. To evaluate our model, we first compare it on several measures with current mainstream lightweight models, including RepVGG-A0, ResNet-18 [19], EfficientNet-B4 [20] and InceptionV3 [21]. The comparison results are shown in Table 2. The number of parameters of MARepVGG in the training state is 8.75 M, which is 25.08% lower than that of ResNet-18, and the number of floating point operations is 2.3 GFLOPs. Speed is tested on a GTX 1660 Ti with a batch size of 16, in instances/second, with instances taken from the real-and-fake face detection dataset [22]. The inference speed at batch size 16 is good: our model inherits the advantages of RepVGG well and is only slightly slower when processing images.

Table 2. Comparison of the models on the metrics Params, FLOPs and Speed

Model              Params (M)   FLOPs (B)   Speed
RepVGG-A0          8.3          1.4         40.07
ResNet-18          11.7         1.8         39.34
EfficientNet-B4    19           4.2         28.21
InceptionV3        24           5.7         32.25
Ours (MARepVGG)    8.7          2.3         39.70

Comparison with Existing Baseline. We compare the proposed model against RepVGG-A0 as the baseline for face forgery detection accuracy. For the training phase, our training data come from the training-fake and training-real classes of the real-and-fake face detection dataset, which contain 960 forged face images and 1081 real face images. We first preprocess the images with random rotation and flipping to enhance the generalization ability of the detection model, and resize them to 224 × 224. For the test set, we use the detection part of the real-and-fake face detection dataset to evaluate accuracy. We store the best training model weights for detection; the results are shown in Fig. 3. After 100 epochs of training, our proposed model improves accuracy by 14% over the baseline.


Fig. 3. Comparison of our model and baseline accuracy curves over 100 training rounds, based on NVIDIA k80 GPU. The horizontal axis is the training batch and the vertical axis is the accuracy rate.

Comparison with Other Models. As shown in Table 3, our model and the compared models are trained uniformly on an NVIDIA K80 GPU for 100 epochs, using the real-and-fake face dataset described in this paper. Our model achieves higher forged-face detection accuracy than the comparison models with significantly fewer parameters, indicating that it makes better use of limited resources.

Table 3. Accuracy of our model and other lightweight models (ResNet-18, EfficientNet-B4, InceptionV3) on the real-and-fake face dataset

Model              ACC (%)
ResNet-18          60.95
EfficientNet-B4    54.04
InceptionV3        65.87
Ours (MARepVGG)    67.07

Experiments on PGGAN. As shown in Fig. 4, our model achieves 94.53% accuracy after 100 epochs of training on the PGGAN dataset. This is a 4% improvement over FDFTNet [25], which, with training parameters (initial learning rate = 0.3, momentum = 0.9, batch size = 128) and 300 epochs of stochastic gradient descent (SGD), reaches an optimal accuracy of 90.29% on the PGGAN dataset. Compared with DeepFD [24], which uses a contrastive loss to enhance feature retrieval for GAN-synthesized images with a discriminative feature learning network D1 and classifier D2 (learning rate = 1e-3, maximum epoch = 15, batch size = 32) and reaches 92.6% on the PGGAN dataset, our method still yields a 2% improvement.

Fig. 4. The performance of our model on the PGGAN dataset with the training batch on the horizontal axis and the accuracy on the vertical axis. The training parameters are recorded in the Implementation Setting section of the text.

Although our proposed method makes progress relative to the lightweight network models in the comparison experiments, we must admit that its detection accuracy still has shortcomings. The next step of our work is therefore to further improve the accuracy of the model. First, we will expand and improve the dataset to include more real and fake face samples, so that the model can better learn features in different scenarios. Second, we will introduce more sophisticated network structures and optimization algorithms to improve the generalization ability and robustness of the model. Finally, we will combine other techniques, such as multi-task learning, transfer learning and knowledge distillation, to further improve the accuracy and usefulness of the model.

5 Conclusion

In this paper, through the above experimental comparisons, our proposed method makes progress in inference speed, number of parameters and accuracy relative to lightweight network models. Adding a texture enhancement module and a multi-attention module to RepVGG improves the model's ability to distinguish real from fake faces. We hope our model can provide a new idea for future applications of face forgery recognition.

Acknowledgements. This research was supported in part by the National Natural Science Foundation of China (Grant Nos. 62172120 and 62002082), Guangxi Natural Science Foundation (Grant Nos. 2019GXNSFFA245014 and ZY20198016), and Guangxi Key Laboratory of Image and Graphic Intelligent Processing Project (Grant No. GIIP2001).


References 1. Wang, J., et al.: M2tr: multi-modal multi-scale transformers for DeepFake detection. In: Proceedings of the 2022 International Conference on Multimedia Retrieval, pp. 615–623 (2022) 2. Ding, X., Zhang, X., Ma, N., Han, J., Ding, G., Sun, J.: RepVGG: making VGGstyle convnets great again. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13733–13742 (2021) 3. Hsu, C.C., Zhuang, Y.X., Lee, C.Y.: Deep fake image detection based on pairwise learning. Appl. Sci. 10(1), 370 (2020) 4. Khalil, S.S., Youssef, S.M., Saleh, S.N.: iCaps-Dfake: an integrated capsule-based model for deepfake image and video detection. Future Internet 13(4), 93 (2021) 5. Dong, X., et al.: Protecting celebrities from deepfake with identity consistency transformer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 9468–9478 (2022) 6. Li, C., et al.: Yolov6: a single-stage object detection framework for industrial applications. arXiv preprint arXiv:2209.02976 (2022) 7. Weng, K., Chu, X., Xu, X., Huang, J., Wei, X.: EfficientRep: an efficient RepVGGstyle convnets with hardware-aware neural network design. arXiv preprint arXiv:2302.00386 (2023) 8. Wang, C.Y., et al.: Cspnet: a new backbone that can enhance learning capability of cnn. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, pp. 390–391 (2020) 9. Chen, R., Chen, X., Ni, B., Ge, Y.: SimSwap: an efficient framework for high fidelity face swapping. In: Proceedings of the 28th ACM International Conference on Multimedia, pp. 2003–2011 (2020) 10. Gao, G., Huang, H., Fu, C., Li, Z., He, R.: Information bottleneck disentanglement for identity swapping. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3404–3413 (2021) 11. Wang, T., Zhang, Y., Fan, Y., Wang, J., Chen, Q.: High-fidelity GAN inversion for image attribute editing. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 11379–11388 (2022) 12. Patashnik, O., Wu, Z., Shechtman, E., Cohen-Or, D., Lischinski, D.: StyleClip: text-driven manipulation of StyleGan imagery. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 2085–2094 (2021) 13. Zheng, H., Fu, J., Mei, T., Luo, J.: Learning multi-attention convolutional neural network for fine-grained image recognition. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 5209–5217 (2017) 14. Yang, Z., Luo, T., Wang, D., Hu, Z., Gao, J., Wang, L.: Learning to navigate for fine-grained classification. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) Computer Vision – ECCV 2018. LNCS, vol. 11218, pp. 438–454. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01264-9 26 15. Gan, Y., Chen, J., Yang, Z., Xu, L.: Multiple attention network for facial expression recognition. IEEE Access 8, 7383–7393 (2020) 16. Liu, H., et al.: Spatial-phase shallow learning: rethinking face forgery detection in frequency domain. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 772–781 (2021) 17. Huang, G., Liu, Z., Van Der Maaten, L., Weinberger, K.Q.: Densely connected convolutional networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4700–4708 (2017)


18. Zhao, H., Zhou, W., Chen, D., Wei, T., Zhang, W., Yu, N.: Multi-attentional deepfake detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 2185–2194 (2021) 19. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016) 20. Tan, M., Le, Q.: EfficientNet: rethinking model scaling for convolutional neural networks. In: International Conference on Machine Learning, pp. 6105–6114. PMLR (2019) 21. Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., Wojna, Z.: Rethinking the inception architecture for computer vision. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2818–2826 (2016) 22. Real and Fake Face Detection (2019). https://www.kaggle.com/datasets/ciplab/ real-and-fake-face-detection 23. Karras, T., Aila, T., Laine, S., Lehtinen, J.: Progressive growing of GANs for improved quality, stability, and variation. arXiv preprint arXiv:1710.10196 (2017) 24. Hsu, C.C., Lee, C.Y., Zhuang, Y.X.: Learning to detect fake face images in the wild. In: 2018 International Symposium on Computer, Consumer and Control (IS3C), pp. 388–391. IEEE (2018) 25. Jeon, H., Bang, Y., Woo, S.S.: FDFtNet: facing off fake images using fake detection fine-tuning network. In: H¨ olbl, M., Rannenberg, K., Welzer, T. (eds.) SEC 2020. IAICT, vol. 580, pp. 416–430. Springer, Cham (2020). https://doi.org/10.1007/ 978-3-030-58201-2 28

SGFNeRF: Shape Guided 3D Face Generation in Neural Radiance Fields Peizhu Zhou, Xuhui Liu, and Baochang Zhang(B) School of Automation Science and Electrical Engineering, Beihang University, Beijing 100191, China {zpz001,1332671326,bczhang}@buaa.edu.cn

Abstract. NeRF (Neural Radiance Fields), as an implicit 3D representation, has demonstrated the capability to generate highly realistic and dynamically consistent images. However, its hierarchical sampling approach introduces a significant amount of redundant computation, leading to erroneous geometric information, particularly in high-frequency facial details. In this paper, we propose SGFNeRF, a novel 3D face generation model that integrates a 2D CNN-based generator and a face depth priors optimization method in the same framework. We employ a Gaussian distribution for sampling to extract facial surface information. Additionally, we design a feature decoder that incorporates depth uncertainty into our method, enabling it to explore regions further away from face surfaces while preserving its ability to capture fine-grained details. We conduct experiments on the FFHQ dataset to evaluate the performance of our proposed method. The results demonstrate a significant improvement compared to previous approaches in terms of various evaluation metrics.

Keywords: 3D scene representation · Face generation · Neural radiance fields · Generative adversarial network

1 Introduction

Portrait synthesis has a wide range of applications in the field of computer graphics, including but not limited to virtual reality (VR), augmented reality (AR), and avatar-based telecommunication. Neural Radiance Fields (NeRF) [1] have attracted significant attention in the realm of three-dimensional object representation. Diverging from conventional explicit visualization approaches such as meshes and point clouds, NeRF implicitly encodes a model's three-dimensional information within a neural network. One advantage of explicit representations is their ability to model scenes explicitly, thereby synthesizing photorealistic virtual perspectives. However, a disadvantage of these discrete representations is the potential occurrence of artifacts, such as overlap-induced pseudo-shadows, due to their lack of fine-grained precision. Moreover, and most importantly, the memory constraints imposed by these representations limit their applicability to high-resolution scenes.


Many recent works [2–6] have applied NeRF to face generation, capable of producing high-resolution, photorealistic and dynamic face images. Generative Adversarial Networks (GANs) [7] have been integrated with NeRF to produce feature maps, which form the basis for establishing neural radiance fields. Nevertheless, the computationally intensive nature of volumetric rendering, primarily due to the ray-casting process, significantly slows down high-resolution generation and incurs high memory requirements. Volume rendering samples points along individual viewing rays and determines each ray's color from the volume density and radiance of the sampled points; additionally, oversampling is used during training to account for empty space. In this work, we propose SGFNeRF, which optimizes face sampling with depth priors that are easy to obtain from 3DDFA-V2 [8]. Face geometries in NeRF are thereby guided to avoid various artifacts. As shown in Fig. 1, many relevant methods without depth priors are prone to generating false images: facial features and the background are generated with equal importance, which results in facial elements appearing in the background. Moreover, under extensive facial pose control, the background tends to adhere closely to the face, leading to their fusion or a cohesive appearance. In our method, depth Gaussian sampling is applied and dense sampling is performed on the surface of the face. Taking the uncertainty of the prior depths into account, we also develop a feature decoder.

Fig. 1. Comparison of visual effects in portrait generation. The images of EG3D, StyleNeRF and IDE-3D were generated by corresponding official checkpoints. When facial features and the background are given equal significance in generation, it increases the likelihood of facial elements appearing in the background. Furthermore, when extensive control is applied to facial poses, there is a tendency to produce artifacts.

To summarize, our contributions are as follows:
– We propose a face depth prior Gaussian sampling and introduce a new loss function designed with the uncertainty of depth information.
– A feature decoder is designed to generate colors, densities and variances of face depths in neural radiance fields from feature maps.
– Through experiments, our model has demonstrated the ability to generate face geometry-aware and dynamically consistent portrait images. Especially in scenarios involving large facial angles, it can effectively prevent geometric collapse of the face and reduce the interference between the subject and the background.

Fig. 2. Overall pipeline of SGFNeRF. A feature generator and mapping network based on pose-conditioned StyleGAN2 are developed, along with a feature decoder and space feature sampling. The feature decoder is devised to generate attributes such as colors, densities, and variations in face depths. Additionally, a neural volume renderer is utilized to produce both image and depth maps. The resolution upsampler and discriminator are not depicted in the image.

2 Related Work

2.1 Neural Radiance Field Representations

Neural implicit representations (e.g., [1,10,12,13]) have shown great potential in 3D reconstruction and novel view synthesis. The three-dimensional scene is stored within the weights of a multi-layer perceptron, enabling the generation of images from arbitrary viewpoints through volume rendering. Explicit representations, such as discrete voxel grids, are efficient in terms of evaluation speed, but they often require a significant amount of memory, making them challenging to scale to high-resolution or complex scenes. On the other hand, implicit representations have the potential to address these memory overheads by representing scenes as continuous functions. This approach provides efficient memory usage, enables the handling of complex scenes, and ensures the generation of dynamically consistent images.

2.2 Generative Face Geometry-Aware Synthesis

Traditionally, image GAN models have relied on convolutional architectures, which have facilitated efficient training and generation for 2D tasks (e.g., [7,14]). In recent times, there has been growing interest in expanding the capabilities of GANs to enable 3D-aware generation from single-view image datasets. The objective of such advancements is to achieve disentangled control over the content and viewpoint of the generated images. PCGAN [15] struggles to generate 3D content without 3D priors. Deng et al. [16] adopt a methodology that incorporates 3D priors to decompose portrait synthesis into distinct and independent factors. However, it lacks the ability to generate visually dynamic consistent images [9]. Several studies (e.g., [2,3]) have attempted to integrate NeRF with GANs for the generation of three-dimensional models. However, these approaches suffer from significant computational redundancy and errors in the estimation of depth information.

3 Methods

To address the aforementioned issues, we propose a Gaussian sampling method based on facial depth priors. Additionally, we introduce a depth loss function that accounts for depth errors. Moreover, we design a feature decoder that captures the variance of depth information. Figure 2 illustrates the schematic diagram of our proposed method, whose effectiveness has been verified through experimental validation.

Feature Generator. In contrast to NeRF, which directly inputs spatial coordinates into implicit neural fields, our approach, inspired by [17], utilizes a generator to produce space feature maps. The initial step feeds a random latent code and camera parameters into a mapping network, which generates an intermediate latent code. This intermediate latent code is then used to modulate the convolution kernels of a distinct synthesis network. StyleGAN2 [7] is selected for predicting space features due to its well-established and efficient architecture, which consistently achieves cutting-edge outcomes in 2D image synthesis. Semantic vectors are sampled from the space features according to the coordinates of rays.

Shape Guided Gaussian Sampling. The original NeRF employed a uniform hierarchical sampling strategy within the neural field. Our strategy is distinct from evenly dividing the space between the near and far bounds. Points along rays are drawn from normal distributions whose means are set by the depth priors and whose standard deviations are predicted considering the depth uncertainty (see Fig. 3). With this approach, smaller sampling intervals are placed on the pertinent section of the ray to accentuate finer details on the surface, effectively addressing the broader uncertainty inherent in the depth estimation, while still preserving the capability to sample regions that deviate from the average [11]. To obtain the uncertainty values of the face depth information, we devise a feature decoder, denoted as F_Θ, which optimizes the parameters Θ to extract volume density σ, multi-channel color features c, and depth uncertainty υ. The overall function of the feature decoder can be described as:

F_\Theta : (f_{xy}, f_{yz}, f_{zx}) \rightarrow (\sigma, c, \upsilon),    (1)
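The sampling step can be sketched as follows; this is our reading of the scheme described above, with the near/far bounds and sample count chosen for illustration only.

```python
import torch

def gaussian_sample_depths(depth_prior, sigma, n_samples=32, near=0.1, far=4.0):
    """depth_prior: (R,) prior surface depth per ray (e.g., from a face depth map);
    sigma: (R,) predicted depth uncertainty (standard deviation) per ray.
    Returns sorted sample depths of shape (R, n_samples)."""
    r = depth_prior.shape[0]
    eps = torch.randn(r, n_samples, device=depth_prior.device)
    t = depth_prior.unsqueeze(1) + sigma.unsqueeze(1) * eps   # samples from N(prior, sigma^2)
    t = t.clamp(near, far)                                    # keep samples inside the volume
    return torch.sort(t, dim=1).values
```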


Fig. 3. Comparison of two sampling methods. Left: For spatial uniform sampling, a large number of sampling points require unnecessary computation and do not capture useful information. Right: Gaussian sampling primarily targets surface points and has the potential to sample regions that are distant from the mean.

Volume density σ and multi-channel color features c are used for classical volume rendering [18]. Depth uncertainty υ is employed in the calculation of the depth loss function.

Depth Map Optimization. NeRFs have demonstrated remarkable performance in view synthesis tasks. Our method leverages the capabilities of NeRFs to achieve precise depth estimation by directly optimizing implicit volumes. The rendered RGB value can be computed using volume rendering over finite samples:

\hat{Color}(r) = \sum_{i=1}^{N} T_i \left(1 - \exp(-\sigma_i \delta_i)\right) c_i,    (2)

where T_i represents the cumulative transmittance along the ray from t_n to t_i:

T_i = \exp\left(-\sum_{j=1}^{i-1} \sigma_j \delta_j\right),    (3)

Similar to the calculation of RGB values, we compute the depth map values in the neural radiance field; \hat{Depth}(r) can be approximated by evaluating the expectation of the samples along the ray:

\hat{Depth}(r) = \sum_{i=1}^{N} T_i \left(1 - \exp(-\sigma_i \delta_i)\right) t_i,    (4)
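The quadrature of Eqs. (2)-(4) can be implemented in a few lines; the sketch below follows the standard alpha-compositing form and assumes per-ray sorted samples.

```python
import torch

def render_color_and_depth(sigmas, colors, t_vals):
    """sigmas: (R, N) densities; colors: (R, N, C); t_vals: (R, N) sorted sample depths.
    Returns the composited pixel color and the expected depth per ray."""
    deltas = t_vals[:, 1:] - t_vals[:, :-1]
    deltas = torch.cat([deltas, torch.full_like(deltas[:, :1], 1e10)], dim=1)
    alpha = 1.0 - torch.exp(-sigmas * deltas)                 # 1 - exp(-sigma_i * delta_i)
    trans = torch.cumprod(torch.cat([torch.ones_like(alpha[:, :1]),
                                     1.0 - alpha + 1e-10], dim=1), dim=1)[:, :-1]  # T_i
    weights = trans * alpha
    color = (weights.unsqueeze(-1) * colors).sum(dim=1)       # Eq. (2)
    depth = (weights * t_vals).sum(dim=1)                     # Eq. (4)
    return color, depth
```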

The depth values in the depth maps correspond to each pixel in the RGB textures and are utilized for computing the depth loss. Pixels outside the facial region are also generated, without the supervision of depth information. Therefore, we devise a region-based depth loss function L_depth(r):

L_{depth}(r) = \begin{cases} \sum_{i=1}^{L} \frac{1}{2e^{\upsilon_i}} + \lambda \left\| \hat{D}(r) - D(r) \right\|_2^2 \times e^{\upsilon_i} & \text{face region} \\ 0 & \text{otherwise,} \end{cases}    (5)
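A minimal sketch of this region-masked, uncertainty-weighted loss is given below. It follows Eq. (5) as reconstructed above; the exact weighting of the uncertainty term is our reading of the formula and should be treated as an assumption.

```python
import torch

def depth_loss(pred_depth, prior_depth, log_var, face_mask, lam=0.5):
    """pred_depth, prior_depth, log_var, face_mask: per-pixel tensors of equal shape;
    face_mask is 1 inside the facial region and 0 elsewhere."""
    err = (pred_depth - prior_depth) ** 2
    per_pixel = 0.5 * torch.exp(-log_var) + lam * err * torch.exp(log_var)
    return (per_pixel * face_mask).sum()   # zero contribution outside the facial region
```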

The depth loss is not computed in regions without a depth prior, but it is calculated within the facial region. The formula consists of two terms: the first term penalizes the uncertainty values, while the second term computes the depth error. The scaling factor λ is set to 0.5 in this study.

Resolution Upsampler. Following [7], we do not directly produce high-resolution images, which would cost much more computation to render at the exact resolution, but medium-resolution image maps and depth maps. Additionally, NeRF-based models consume substantial memory for caching intermediate results during gradient back-propagation, posing challenges when working with high-resolution images. These factors together restrict the applicability of NeRF-based models to high-quality image synthesis. In the resolution upsampler, we utilize the Second Order Attention Network (SAN) [19], which upsamples and refines the 32-channel feature image into the final RGB image.

Discriminator. Our model is trained using an adversarial objective, where we leverage the discriminator architecture derived from StyleGAN [4]. To prevent degenerate shape solutions, we incorporate the camera parameters of the incoming image as conditioning information for the discriminator, using the class-conditional discriminator modifications proposed in [20]. The loss function of the generative adversarial network is as follows:

L(D, G) = \mathbb{E}_{z\sim Z, c\sim C}\left[f(D(G(z, p)))\right] + \mathbb{E}_{I\sim p_{data}}\left[f(-D(I)) + \beta \|\nabla D(I)\|^2\right],    (6)

where f(u) = −log(1 + exp(−u)), and p_data is the distribution of the data. We set β = 0.5. The final objective function is defined as follows:

L_{final} = L(D, G) + L_{depth},    (7)

4 Experiments

4.1 Datasets

We conducted extensive experimentation on the FFHQ [7] dataset to validate our approach. The FFHQ dataset is highly valuable for conducting research in domains like face generation, image editing, face recognition, and deep learning.


Table 1. Quantitative comparison of our approach with other relevant methods. Our method does not outperform all of the other methods in terms of FID and KID on the FFHQ dataset, but in terms of visual effects our method is superior.

Methods          FID↓   KID (×10³)↓
Giraffe [21]     31.9   32.7
FENeRF [3]       28.2   17.3
StyleSDF [22]    11.5   2.6
NeRF-GAN [2]     8.3    4.3
StyleNeRF [4]    8      3.7
EG3D [17]        4.7    0.132
IDE-3D [13]      4.6    0.130
Ours             7.2    2.5

It is extensively employed for training Generative Adversarial Network (GAN) models, specifically for the purpose of generating authentic and lifelike face images. These models excel at creating novel face images that possess intricate details and exhibit a remarkable level of realism. We label the face depth maps with 3DDFA-V2 [8].

4.2 Baselines

We compare our method on image generation with relevant methods: NeRF-GAN [2], FENeRF [3], StyleNeRF [4], Giraffe [21], StyleSDF [22], EG3D [17] and IDE-3D [13]. Our model was trained with a batch size of 8. The discriminator was trained with a learning rate of 0.002, while the generator was trained with a learning rate of 0.0025. Our total training time on 2 TITAN V GPUs was about 16 days. Many experiments were conducted to quantitatively and qualitatively evaluate our method. In order to validate the effectiveness of our sampling approach, we also conducted geometric comparisons on the generated three-dimensional models. All experiments were performed on the same platform.

4.3 Quantitative Evaluations

We quantitatively assessed the consistency and quality of the images generated by our approach using the FID and KID metrics. The results are shown in Table 1.

Frechet Inception Distance (FID). The FID [23] metric quantitatively measures the similarity between two sets of images: the real images from a dataset and the generated images produced by a GAN or any other generative model.


Fig. 4. Qualitative comparison between StyleNeRF, EG3D and ours in yaw for FFHQ.

Fig. 5. Qualitative comparison between StyleNeRF, EG3D and ours in pitch for FFHQ.


Kernel Inception Distance (KID). KID [23] is based on the Inception Score (IS), which was introduced as a metric for evaluating the quality and diversity of generated images. While IS measures the quality of individual images, KID extends this concept to measure the similarity between the entire distributions of real and generated images.

4.4 Qualitative Results

We have produced multiple images to qualitatively assess our methodology. Our approach demonstrates the ability to generate novel perspectives through direct camera manipulation, effectively generalizing to extreme camera poses that deviate significantly from the distribution of camera poses in the training data. Figures 4 and 5 illustrate the outcomes of our method when applied to challenging camera poses and steep view angles. The results of EG3D and StyleNeRF were produced by the corresponding official checkpoints. Figure 4 shows variations in yaw, and Fig. 5 presents variations in pitch. As depicted in the images, our method is capable of generating clear human portraits without cluttered backgrounds, and the rendered images exhibit remarkable consistency across various camera positions.

Fig. 6. The comparison of our depth prior Gaussian sampling method in terms of geometric generation. Better differentiation between the face and background.

4.5 Ablation Analysis

To better evaluate the effectiveness of geometry consistency, we conducted a comparative analysis between our method and EG3D in terms of generating 3D shapes. Iso-surfaces representing shapes are extracted from the density field using the marching cubes algorithm. We present empirical evidence showcasing the successful performance of our approach in this aspect (see Fig. 6). The model generated with prior information enables more efficient sampling-point placement, thereby avoiding excessive computation and better distinguishing human subjects from backgrounds. EG3D employs the same sampling method for both human subjects and backgrounds, resulting in a higher density of sampling points. However, this approach often leads to undesired artifacts, as the background and human subjects tend to become interconnected and visually entangled (see Fig. 1). More results are shown in Fig. 7. Our experiments demonstrate that the presence of complete and relatively independent geometric information contributes to generating multi-angle portraits.

Fig. 7. Numerous experiments have shown that our method can generate relatively independent and complete facial geometric information, which significantly contributes to generating multi-view portraits.

5 Conclusion

In this paper, we propose SGFNeRF, a depth prior Gaussian sampling approach. Our method utilizes a Gaussian distribution for sampling, focusing on extracting facial surface information and preserving fine-grained details. The incorporation of depth uncertainty in our feature decoder enables exploration of regions further away from face surfaces, expanding the model's capability to capture intricate facial features. We conducted extensive experiments on the FFHQ dataset to evaluate the performance of our proposed method. Our model generates facial images with enhanced visual quality and improved geometric accuracy, with potential applications in virtual reality and computer vision.


References 1. Mildenhall, B., Srinivasan, P.P., Tancik, M., Barron, J.T., Ramamoorthi, R., Ng, R.: NeRF: representing scenes as neural radiance fields for view synthesis. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12346, pp. 405–421. Springer, Cham (2020). https://doi.org/10.1007/978-3-03058452-8 24 2. Shahbazi M, Ntavelis E, Tonioni A, et al.: NeRF-GAN distillation for efficient 3D-aware generation with convolutions. arXiv preprint arXiv:2303.12865 (2023) 3. Sun J, Wang X, Zhang Y, et al.: FENeRF: face editing in neural radiance fields. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 7672–7682 (2022) 4. Gu J, Liu L, Wang P, et al.: StyleNeRF: a style-based 3d-aware generator for high-resolution image synthesis. arXiv preprint arXiv:2110.08985 (2021) 5. Hong Y, Peng B, Xiao H, et al.: HeadNeRF: a real-time nerf-based parametric head model. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 20374–20384(2022) 6. Mihajlovic, M., Bansal, A., Zollhoefer, M., et al.: KeypointNeRF: generalizing image-based volumetric avatars using relative spatial encoding of keypoints. In: Vedaldi, A., Gupta, A., Lempitsky, V., Schiele, B., Fitzgibbon, A. (eds.) ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XV, pp. 179–197. Springer Nature Switzerland, Cham (2022). https:// doi.org/10.1007/978-3-031-19784-0 11 7. Karras, T., Laine, S., Aila, T.: A style-based generator architecture for generative adversarial networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 4401–4410 (2019) 8. Guo, J., Zhu, X., Yang, Y., Yang, F., Lei, Z., Li, S.Z.: Towards fast, accurate and stable 3D dense face alignment. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.M. (eds.) ECCV 2020. LNCS, vol. 12364, pp. 152–168. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58529-7 10 9. Luo, Y., L¨ u, J., Jiang, X., Zhang, B.: Learning from architectural redundancy: enhanced deep supervision in deep multipath encoder-decoder networks. In: IEEE Transactions on Neural Networks and Learning Systems (2021). https://doi.org/ 10.1109/TNNLS.2021.3056384 10. Zhang, K., Riegler, G., Snavely, N., et al.: NeRF++: Analyzing and improving neural radiance fields. arXiv preprint arXiv:2010.07492 (2020) 11. Qu, Q., Liu, K., Wang, W., et al.: Spacecraft proximity maneuvering and rendezvous with collision avoidance based on reinforcement learning. IEEE Trans. Aerosp. Electron. Syst. 58(6), 5823–5834 (2022) 12. Barron, J.T., Mildenhall, B., Tancik, M., et al.: Mip-NeRF: a multiscale representation for anti-aliasing neural radiance fields. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 5855–5864 (2021) 13. Sun, J., Wang, X., Shi, Y., et al.: IDE-3D: interactive disentangled editing for high-resolution 3D-aware portrait synthesis. ACM Trans. Graph. (TOG) 41(6), 1–10 (2022) 14. Sauer, A., Schwarz, K., Geiger, A.: StyleGAN-XL: scaling StyleGAN to large diverse datasets. In: ACM SIGGRAPH 2022 Conference Proceedings, pp. 1–10 (2022) 15. Liang, D., Wang, R., Tian, X., et al.: PCGAN: partition-controlled human image generation. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33, no. 01, pp. 8698–8705 (2019)


16. Deng, Y., Yang, J., Chen, D., et al.: Disentangled and controllable face image generation via 3D imitative-contrastive learning. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5154–5163 (2020) 17. Chan, E.R., Lin, C.Z., Chan, M.A., et al.: Efficient geometry-aware 3D generative adversarial networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 16123–16133 (2022) 18. Kajiya, J.T., Herzen, B.P.V.: Ray tracing volume densities. In: Computer Graphics (SIGGRAPH) (1984) 19. Dai, T., Cai, J., Zhang, Y., et al.: Second-order attention network for single image super-resolution. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 11065–11074 (2019) 20. Karras, T., Aittala, M., Hellsten, J., et al.: Training generative adversarial networks with limited data. In: Advances in Neural Information Processing Systems, vol. 33, pp. 12104–12114 (2020) 21. Niemeyer, M., Geiger, A.: Giraffe: Representing scenes as compositional generative neural feature fields. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 11453–11464 (2021) 22. Or-El, R., Luo, X., Shan, M., et al.: StylesDF: high-resolution 3D-consistent image and geometry generation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13503–13513 (2022) 23. Heusel, M., Ramsauer, H., Unterthiner, T., et al.: GANs trained by a two timescale update rule converge to a local Nash equilibrium. In: Advances in Neural Information Processing Systems, vol. 30 (2017)

Hybrid Spatio-Temporal Network for Face Forgery Detection Xuhui Liu1 , Sicheng Gao1 , Peizhu Zhou1 , Jianzhuang Liu2 , Xiaoyan Luo1 , Luping Zhang3 , and Baochang Zhang1,4(B)

1 Beihang University, Beijing, China
{1332671326,scgao,luoxy,bczhang}@buaa.edu.cn
2 Shenzhen Institute of Advanced Technology, Shenzhen, China
[email protected]
3 National University of Defense Technology, Changsha, China
[email protected]
4 Nanchang Institute of Technology, Nanchang, China

Abstract. Facial manipulation techniques have aroused increasing security concerns, leading to various methods to detect forgery videos. However, existing methods suffer from a significant performance gap compared to image manipulation methods, partially because the spatiotemporal information is not well explored. To address the issue, we introduce a Hybrid Spatio-Temporal Network (HSTNet) to integrate spatial and temporal information in the same framework. Specifically, our HSTNet utilizes a hybrid architecture, which consists of a 3D CNN branch and a transformer branch, to jointly learn short- and long-range relations in the spatio-temporal dimension. Due to the feature misalignment between the two branches, we design a Feature Alignment Block (FAB) to recalibrate and efficiently fuse heterogeneous features. Moreover, HSTNet introduces a Vector Selection Block (VSB) to combine the outputs of the two branches and fire important features for classification. Extensive experiments show that HSTNet obtains the best overall performance over state-of-the-art methods. Keywords: Face Forgery Detection · Hybrid Spatio-Temporal Network · Short- and Long-range Relations · Spatial and Temporal Consistency

1 Introduction

Benefiting from the explosive progress of generative models, especially Generative Adversarial Networks (GANs) [18], current face manipulation techniques [5,28,46,49] are capable of producing ultra-realistic fake videos that are difficult for humans to distinguish. However, it has aroused broad public concern that these techniques could be abused for malicious purposes. Therefore, it is crucial to develop more general and practical methods for face forgery detection.


Fig. 1. Top: Visualization of the temporal coherence comparison between real and fake videos. Bottom: Overview of the proposed HSTNet. We present the vertical motion at a particular horizontal location. Obviously, the motion of a fake face is sharper than the real one. Considering the above observation, our HSTNet employs a hybrid network to explore the spatio-temporal inconsistency and capture short- and long-range information.

Various frame-based methods employing 2D deep Convolutional Neural Networks (CNNs) [1,6,27,29,34,42] have been proposed to tackle the problem. They mainly focus on fine-grained forged spatial details [12,52], or introduce additional data domains, such as frequency statistics [27,42], to detect forgery patterns. However, these methods ignore temporal information and may produce unexpected false predictions. As shown in the upper part of Fig. 1, fake videos often exhibit visually unnatural image transitions or temporal inconsistency, which indicates that temporal inconsistency is a vulnerability that can be exploited for face forgery detection. To address this issue, most recent works [2,17,19,30,31,53] pay more attention to video-based methods which extract both spatial and temporal information. Nonetheless, these methods still have limited ability to comprehensively capture spatial and temporal forgery patterns. Over the last two years, the introduction of the Vision Transformer (ViT) [15] has brought a boom in applying transformers to visual tasks. However, the potential of ViT for fake video detection is not well explored; the reason might lie in its deficiency in local feature extraction. In this paper, we follow the research line of hybrid networks [8,37,41] and introduce a Hybrid Spatio-Temporal Network (HSTNet) to comprehensively explore spatial and temporal information for robust face forgery detection. Figure 1 illustrates that our HSTNet adopts a dual-stream architecture, which includes a 3D CNN branch and a spatio-temporal attention based transformer branch to maximize their advantages in extracting short- and long-range information, respectively.


Therefore, we design a Feature Alignment Block (FAB) to align the two feature maps at the scale and semantic levels. It is inserted into every stage of HSTNet to gradually fuse local and global feature details. Specifically, in order to introduce local features into the transformer branch, we leverage the attention mechanism to adaptively attach importance weights to each temporal dimension and channel and recalibrate the feature map. After the two parallel branches of feature extraction, we introduce the attention mechanism again in the designed Vector Selection Block (VSB) to aggregate local and global features and refine the feature vector for final prediction. In summary, with the dual-stream architecture, the proposed HSTNet fully exploits the advantages of convolution in extracting local details and the advantages of the transformer in capturing long-range dependencies and processing sequence data. The contributions of this paper are summarized as follows:
– We propose a Hybrid Spatio-Temporal Network (HSTNet) to consider both spatial details and temporal consistency for robust face forgery detection. The dual-stream architecture can utilize the full potential of CNN's and transformer's strengths to extract local and global information, respectively.
– We design a Feature Alignment Block (FAB) to perform feature interactions between two heterogeneous features at the scale and semantic levels. We also introduce a Vector Selection Block (VSB) to fuse the output vectors of the two branches, fire important features, and suppress redundant features.
– Extensive experiments, including intra-dataset validation and cross-dataset validation, show our HSTNet achieves state-of-the-art (SOTA) performance on four widely used face forgery detection benchmarks.

2 Related Work

Face Forgery Detection. Mainstream face forgery detection can be divided into image-based and video-based methods. At the image level, a significant number of methods introduce frequency statistics [17,27,42] for capturing forgery artifacts. Nonetheless, these image-based methods may fail to capture the spatio-temporal inconsistency across continuous frames. More recently, researchers tend to explore temporal inconsistency and propose several video-based detectors [2,19,30,31,39,44]. Despite achieving good accuracy on the trained dataset, most of these methods experience a significant performance decline when the manipulation methods are unseen [1,26,43]. To address this issue, many works [16,20,29,32,50,53] are dedicated to improving the generalization capability of detectors. FTCN [53] constrains the extraction of spatial information and concentrates on mining temporal coherence. It achieves impressive generalization ability, but it is susceptible to common perturbations, such as compression.

Transformers with Convolutions. Compared with vision transformers [15,25,35], CNNs have unique advantages in extracting local features and translation invariance.


Fig. 2. The feature extraction block, where “⊕” represents point-wise sum. The upper part is 3D CNN blocks, and the lower is a spatio-temporal transformer block. FAB is employed to align heterogeneous features and achieve interactions between the two branches.

In order to integrate the above merits, many hybrid architectures [37,41,51], combining self-attention mechanisms and convolutions, have been proposed for enhanced visual representation. Conformer [41] adopts a concurrent structure that comprehensively combines a transformer branch with a CNN branch via lateral connections to retain the representation capability of local features and global representations. DS-Net [37] proposes a dual-stream network including convolutions and self-attention layers to extract local and global features simultaneously and fuse them efficiently. Different from existing works, we design a hybrid architecture that consists of a 3D CNN branch and a transformer branch to jointly learn short- and long-range spatio-temporal relations in a unified framework. The details of our method are elaborated below.

3 Method

3.1 Overview

Problem Statement. We formulate the face forgery detection task as a binary classification problem and design a video-based network that exploits both spatial and temporal information for more robust and general performance. Consider a face video clip X ∈ R^{T×C×H×W}, where T, C, H, W represent the input frame number, channel dimension, frame height and width, respectively. The goal of our HSTNet is to generate a video-level prediction to distinguish the authenticity of the face video.

Two Input Styles. Using [21] as inspiration, the 3D CNN branch takes the video clip X as input directly. The transformer branch takes non-overlapping 3D patches as input tokens for capturing long-range dependencies. Similar to TimeSformer [7], we first partition each frame independently into N non-overlapping patches, each of size P × P, resulting in a total of T × N tokens x_{p,t} ∈ R^{CP²} with C feature channels, where N = HW/P², p ∈ [1, ..., N] and t ∈ [1, ..., T].


Then we employ a linear layer for projection to obtain embedding vectors Z ∈ R^{T×(N+1)×D} as follows:

Z = [z_{cls}, Ex_{1,1}, Ex_{1,2}, \ldots, Ex_{1,T}, Ex_{2,1}, Ex_{2,2}, \ldots, Ex_{N,T}],    (1)

where E ∈ R^{D×CP²} denotes the learnable projection matrix. Meanwhile, as in ViT [15], a classification token z_cls ∈ R^D is added to the first position of the sequence, and the classification token at the final transformer layer is used for classification. Considering that the 3D CNN branch (3 × 3 × 3 convolution) retains positional features and extracts local information [24], spatial and temporal positional embeddings are omitted in our transformer branch [3].
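The tokenization in Eq. (1) can be sketched as follows. This is a simplified illustration that flattens the T × N tokens into one sequence with a single class token; the patch size and embedding dimension are assumptions.

```python
import torch
import torch.nn as nn

class VideoPatchEmbed(nn.Module):
    """Splits each frame into non-overlapping P x P patches, projects them with a
    learnable matrix E, and prepends a classification token z_cls."""
    def __init__(self, patch=16, in_ch=3, dim=384):
        super().__init__()
        self.proj = nn.Linear(in_ch * patch * patch, dim)   # learnable projection E
        self.cls = nn.Parameter(torch.zeros(1, 1, dim))     # classification token z_cls
        self.patch = patch

    def forward(self, x):                     # x: (B, T, C, H, W)
        b, t, c, h, w = x.shape
        p = self.patch
        x = x.reshape(b, t, c, h // p, p, w // p, p)
        x = x.permute(0, 1, 3, 5, 2, 4, 6).reshape(b, t, -1, c * p * p)  # (B, T, N, C*P^2)
        tokens = self.proj(x).flatten(1, 2)   # (B, T*N, D)
        cls = self.cls.expand(b, -1, -1)
        return torch.cat([cls, tokens], dim=1)  # (B, T*N + 1, D)
```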

3.2 Hybrid Architecture for Face Forgery Detection

Our HSTNet is based on a hybrid architecture as shown in Fig. 1, which is composed of a 3D CNN branch and a spatio-temporal transformer branch to jointly capture local and global information. Video-based feature extraction further helps to deeply extract the spatial details and temporal coherence, which play a crucial role in face forgery detection. The whole network consists of a stem layer, four stages of dual-stream feature extraction blocks, FABs to align features, and a VSB to conduct adaptive vector fusion. To be specific, the stem layer is instantiated as a 3 × 7 × 7 3D convolution followed by BatchNorm and 3 × 3 × 3 max pooling. After initial feature extraction by the stem layer, the feature maps are sent directly to a 3D CNN block and, after patch embedding, to a transformer block. Then these features and tokens go through the four stages of HSTNet for deep spatio-temporal learning. Within each stage, dual-stream feature extraction blocks are designed to learn short- and long-range relations. The architecture of the feature extraction blocks is depicted in Fig. 2. FABs are adopted to refine the features adaptively and align the two heterogeneous features in resolution and semantics. Note that the 3D CNN branch is responsible for processing high-resolution feature maps to explore local forged patterns, while low-resolution features are fed to the transformer branch for mining global inconsistency in both the spatial and temporal domains. At the end of HSTNet, VSB combines the two output vectors of the 3D CNN branch and the transformer branch for final prediction.

3D CNN Branch. As stated in [21], 3D convolutional operations are effective for learning spatio-temporal representations. Following ResNet [22], the whole CNN branch is divided into four stages. At each stage i ∈ {2, 3, 4, 5}, we sequentially stack N_i 3D convolutional blocks, each comprising two 1 × 1 × 1 dimension-transformation convolutions and a 3 × 3 × 3 spatio-temporal convolution. Additionally, we set the downsampling factors to 4, 8, 16, and 32, respectively. Furthermore, since high-resolution inputs preserve more fine-grained spatial details, which are beneficial for detecting unnatural inconsistency in forged videos, our 3D CNN branch effectively extracts local spatial details and temporal features across adjacent frames, that is, short-range relations. With the aid of FABs, these local feature details are gradually integrated into the spatio-temporal transformer branch.


these local feature details are gradually integrated into the spatio-temporal transformer branch.

Transformer Branch. Although the receptive field grows consecutively with the increase of convolutional layers, a pure CNN has limited ability in global feature extraction. In contrast to the 3D CNN branch, each transformer block encodes all pairwise relationships among all 3D tokens. Therefore, it can always capture global long-range representations throughout the network. This characteristic enables the transformer to better explore long-range inconsistency for forgery detection. Concretely, we compute attention weights for detecting forgery using multi-head self-attention modules. Firstly, the attention operation for each head is defined as:

$$\mathrm{Attention}(Q, K, V) = \mathrm{Softmax}\left(\frac{QK^{\top}}{\sqrt{d_k}}\right)V, \tag{2}$$

where $d_k = D/M$, $M$ denotes the number of attention heads that we set to 6, and the queries $Q = ZW_q$, keys $K = ZW_k$ and values $V = ZW_v$ are all linear projections of the input $Z$, with $Z, Q, K, V \in \mathbb{R}^{T \times (N+1) \times D}$. Inspired by the idea of ViT [15], our transformer branch consists of a sequence of transformer layers. Each transformer layer contains three parts: a multi-head self-attention (MSA) module [48] to conduct spatio-temporal dot-product attention, a layer normalization operation (LN) [4] to regularize features, and an MLP block for non-linear transformation. Residual connections are deployed in both the MSA modules and the MLP blocks. With the GELU activation function [23], the calculation of the features at the $l$-th layer can be defined as:

$$y^{l} = \mathrm{MSA}(\mathrm{LN}(z^{l})) + z^{l}, \tag{3}$$

$$z^{l+1} = \mathrm{MLP}(\mathrm{LN}(y^{l})) + y^{l}, \tag{4}$$

where $z^{l}$ denotes the summation of the output features of the transformer branch and the CNN branch at layer $l$, and $y^{l}$ denotes the output feature of the MSA module. The MLP module has two linear projections separated by the GELU activation function.
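To make the layer-wise computation in Eqs. (2)–(4) concrete, the following is a minimal PyTorch sketch of one such transformer layer. It is not the authors' released code: the embedding dimension (384) and MLP ratio are assumptions, and `nn.MultiheadAttention` (which internally performs the scaled dot-product attention of Eq. (2)) stands in for the paper's MSA module.

```python
import torch
import torch.nn as nn

class TransformerLayer(nn.Module):
    """One spatio-temporal transformer layer following Eqs. (3)-(4):
    pre-norm multi-head self-attention and an MLP, both with residual connections."""
    def __init__(self, dim=384, heads=6, mlp_ratio=4.0):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm2 = nn.LayerNorm(dim)
        self.mlp = nn.Sequential(
            nn.Linear(dim, int(dim * mlp_ratio)),
            nn.GELU(),                                   # GELU between the two linear projections
            nn.Linear(int(dim * mlp_ratio), dim),
        )

    def forward(self, z):
        # z: (B, T*(N+1), D) -- all 3D tokens attend to each other
        h = self.norm1(z)
        y = self.attn(h, h, h)[0] + z                    # Eq. (3)
        return self.mlp(self.norm2(y)) + y               # Eq. (4)
```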

3.3 Feature Alignment Block

The interaction between the 3D CNN branch and the spatio-temporal transformer branch plays a critical role on the final prediction. Nevertheless, features of the two branches are misaligned in resolutions and semantics. Hence, we propose FAB to align the heterogeneous features and re-calibrate them adaptively. As illustrated in Fig. 3, given a 3D CNN feature map x ∈ RT ×H×W ×C , where T, H, W, C denote the temporal dimension, height, width and channel number respectively, and the 3D tokens z ∈ RT ×(N +1)×D , where T, N + 1, D denote the temporal dimension, token number and embedding dimensions,


Fig. 3. Illustration of FAB and VSB, where “⊗” indicates matrix multiplication and “ⓒ” indicates channel-wise concatenation, and T, C, D denote the temporal dimension, the channel number of the 3D CNN vector, and the channel number of the 3D class token of the final transformer layer, respectively. The left upper part realizes the mapping from 3D features of the CNN branch to 3D tokens, while the left lower structure achieves the opposite.

respectively, FAB leverages a 1 × 1 × 1 3D convolution to complete channel dimension alignment and down-/up-sampling operations to unify the spatial resolutions. Meanwhile, layer normalization and batch normalization are introduced to achieve smoother gradients. On the other hand, in order to fill the semantic gap between each local feature and global token pair, instead of simply concatenating the features after spatial alignment, we design a time-channel attention mechanism to refine the features in the time and channel dimensions adaptively before the down-/up-sampling operations. In detail, a global average pooling is first used to flatten the spatial dimension of $x_{in}$ and obtain a global representation $u_{in} \in \mathbb{R}^{T \times C}$. Then we utilize a refinement module comprising two 1 × 1 × 1 3D convolutions to gradually squeeze $u_{in}$ and generate the time-channel attention map $M_{\alpha} \in \mathbb{R}^{T \times C}$. Finally, we attach the attention map $M_{\alpha}$ to $x_{in}$ and obtain the refined feature $x_{re}$. A skip connection is further used to combine $x_{in}$ and $x_{re}$. The whole process is formulated as follows:

$$M_{\alpha} = \sigma\left(W_{conv,1} * \mathrm{ReLU}\left(W_{conv,2} * u_{in}\right)\right), \tag{5}$$

$$x_{out} = x_{in} + M_{\alpha} \cdot x_{in}, \tag{6}$$

where $W_{conv,1}$ and $W_{conv,2}$ represent the weights of the two 3D convolution layers, and $\sigma$ and ReLU indicate the Sigmoid and ReLU activation functions, respectively. The residual structure in Eq. (6) ensures the stability of the training process and accelerates convergence. It is worth noting that, through the continuous information interaction of FAB, the 3D CNN branch introduces sufficient global representations, and the transformer branch also obtains abundant local details. This makes up for the shortcomings of the two branches and achieves a full integration of short- and long-range information.
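For illustration, a minimal sketch of such a time-channel attention block in PyTorch is given below; the channel layout (B, C, T, H, W), the reduction ratio, and the ordering of the two 1 × 1 × 1 convolutions are assumptions rather than the authors' exact implementation.

```python
import torch
import torch.nn as nn

class TimeChannelAttention(nn.Module):
    """Time-channel attention of Eqs. (5)-(6): squeeze the spatial dimensions,
    produce a T x C attention map, and re-weight the input with a residual."""
    def __init__(self, channels, reduction=4):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool3d((None, 1, 1))   # keep T, squeeze H and W
        self.conv1 = nn.Conv3d(channels, channels // reduction, kernel_size=1)
        self.conv2 = nn.Conv3d(channels // reduction, channels, kernel_size=1)

    def forward(self, x):                                # x: (B, C, T, H, W)
        u = self.pool(x)                                 # global representation u_in
        m = torch.sigmoid(self.conv2(torch.relu(self.conv1(u))))  # attention map M_alpha
        return x + m * x                                 # Eq. (6): residual re-weighting
```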

3.4 Vector Selection Block

For the final prediction, it is vital to comprehensively consider the outputs of the two branches and select the essential features. Instead of simply concatenating them, we design a VSB to combine them adaptively, in which the time-channel attention mechanism is again employed to comprehensively compare the local and global information in semantics and attach importance weights in the temporal dimension and channels for vector refinement. As shown in Fig. 3(b), VSB takes the output vector of the 3D CNN branch and the class token of the final transformer layer as input. Different from FAB, we first concatenate the two vectors and then leverage the time-channel attention to emphasize the important features and suppress redundant feature maps. Based on this, VSB is applied to the last block of our HSTNet as a joint connection between the backbone and the classification head to achieve better performance.

Table 1. Comparison on the FF++ dataset under the accuracy rate. The best results are highlighted.

Method        FaceForensics++ c23
              DF      F2F     FS      NT
XN-avg        0.9893  0.9893  0.9964  0.9500
C3D           0.9286  0.8857  0.9179  0.8964
I3D           0.9286  0.9286  0.9643  0.9036
LSTM          0.9964  0.9929  0.9821  0.9393
TEI           0.9786  0.9714  0.9750  0.9429
Comotion-70   0.9910  0.9325  0.9830  0.9045
ADDNet-3D     0.9214  0.8393  0.9250  0.7821
STIL          0.9964  0.9929  1.0000  0.9536
Ours          1.0000  0.9929  0.9929  0.9429

4 Experiments

4.1 Experimental Settings

Datasets. Similar to related works on face forgery detection, our experiments are conducted on four standardized benchmark deepfake datasets: FaceForensics++ (FF++) [43], Celeb-DF [33], FaceShifter [28], and DFDC [14]. Evaluation Metrics. In our experiments, we use the accuracy rate (ACC) and the Area Under the Receiver Operating Characteristic Curve (AUC) as our evaluation metrics. As previous related works use a single frame as input, we select video-level AUC and average the prediction for each video clip of the


whole video, as in [38]. Consequently, all models utilize the same number of frames to perform predictions. Implementation Details. In our experiments, we apply the state-of-the-art (SOTA) face extractor RetinaFace [13] to detect and align the faces with a size of 224 × 224 for both real and fake videos. Each clip used for training and testing comprises 32 frames, and random flipping is employed during training. All the experiments are conducted on 4 Nvidia TITAN Xp 12 GB GPUs and an Intel(R) Core(TM) i7-6850K CPU @ 3.60 GHz. Our HSTNet is implemented in PyTorch v1.7.0, built upon the open-source mmaction2 toolbox [11]. We adopt HSTNet as the backbone of the deepfake detector and use the I3D classification head in the mmaction2 toolbox. All the detectors are trained using AdamW [36] with a weight decay of 0.02, supervised by the binary cross-entropy loss. We apply a warm-up strategy for the training of our model: the learning rate first increases from 1e-5 to 1e-4 during the first 2.5 epochs and then decays to 0 over the remaining 117.5 epochs with a cosine annealing learning rate schedule.
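As a concrete illustration of this warm-up plus cosine-annealing schedule, the sketch below builds it with a standard PyTorch `LambdaLR`; the per-epoch stepping and the placeholder model are simplifications, not the authors' training script.

```python
import math
import torch

def warmup_cosine_lambda(epoch, warmup_epochs=2.5, total_epochs=120.0,
                         base_lr=1e-4, warmup_start_lr=1e-5):
    """Multiplier applied to base_lr: linear warm-up, then cosine decay to 0."""
    if epoch < warmup_epochs:
        start = warmup_start_lr / base_lr
        return start + (1.0 - start) * epoch / warmup_epochs
    progress = (epoch - warmup_epochs) / (total_epochs - warmup_epochs)
    return 0.5 * (1.0 + math.cos(math.pi * progress))

model = torch.nn.Linear(10, 2)   # placeholder for the detector backbone + head
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4, weight_decay=0.02)
scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, warmup_cosine_lambda)

for epoch in range(120):
    # ... one training epoch supervised by binary cross-entropy ...
    scheduler.step()             # stepped once per epoch here for simplicity
```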

Table 2. Comparison on FF++. Results of some other methods are from [19, 52]. The best results are highlighted.

Method             ACC    AUC
XN-avg             95.73  -
MesoNet            83.10  -
Face X-ray         -      87.40
Xception           95.73  96.30
Two Branch         -      98.70
EfficientNet-B4    96.63  99.18
Zhao (Xception)    96.37  98.97
Ours               97.00  99.20

Table 3. Comparison on Celeb-DF. Results of some other methods are from [19]. The best results are highlighted.

Method       ACC
XN-avg       99.44
I3D          99.23
LSTM         95.73
ADDNet-3D    99.12
S-IML-T      98.84
D-FWA        98.58
STIL         99.78
Ours         99.99

4.2 Comparison with SOTA Methods

We conduct a comprehensive comparison with 3D convolution based works I3D [9] and C3D [47]. We mainly compare our method to those with high generalization ability on unknown datasets, such as Xception [10], Face X-ray [29], multitask learning by Two-branch [38], and LipForensics [20] that captures unnatural mouth motions. Results on FF++ and Celeb-DF. Firstly, we evaluate our HSTNet on the high-quality FF++ dataset and Celeb-DF dataset. Table 1 illustrates the comparison between our model and SOTA methods. It is evident that our method


outperforms most of its opponents, especially on the Deepfakes setting with 100% accuracy, and is only slightly lower than the scores of the SOTA method STIL on two other settings. The comprehensive evaluation across the four manipulation types, summarized in Table 2, indicates that the accuracy of our method achieves a significant improvement compared with frame-based methods. The reason for this improvement is mainly the local features extracted by the CNN branch and the global spatio-temporal inconsistency captured by the transformer branch, which provides more coherent and complete information for the whole network. Meanwhile, Fig. 4 also illustrates this point of view. Table 3 shows that our HSTNet outperforms all compared counterparts, e.g., 0.22% higher accuracy than STIL [19] on the Celeb-DF dataset. It indicates that our video-based hybrid model handles spatio-temporal inconsistency better than frame-based methods do, and it strongly demonstrates that the FAB modules between the CNN branch and the transformer branch extract better spatio-temporal features than vanilla two-stream neural networks do. Generalization capability on unseen datasets. In this section, we assess the generalization capability of our HSTNet, which is trained on FF++ (HQ) with multiple manipulation techniques and tested on unseen manipulation datasets such as Celeb-DF and FaceShifter. In Table 4, we compare our method with other SOTA models based on frames or videos. HSTNet achieves the best performance on the Celeb-DF and DFDC datasets, with 4.27% higher AUC than Multi-task [40] and 2.10% higher AUC than CNN-aug [50], respectively. On the FaceShifter dataset, our method ranks second, only 1.11% AUC lower than Face X-ray [29]. These results in Table 4 verify that our HSTNet has better generalization ability on unseen datasets.

Fig. 4. Visualization of feature analysis for successive frames in video clips from four different manipulation datasets. The rows beneath the original video clips show the corresponding heatmap visualization results. Note that warmer colors indicate more strongly detected regions and a more suppressed background.


Table 4. Cross-dataset generalization capability evaluation. Video-level AUC (%) on Celeb-DF, FaceShifter (FShifter) and DFDC after training on FaceForensics++ (FF++). Part of the results are from [19]. The best results are highlighted.

Method        Celeb-DF  FShifter  DFDC
Xception      73.70     72.0      70.90
CNN-aug       75.6      65.70     72.10
Patch-based   69.6      57.80     65.60
Face X-ray    74.76     92.80     65.50
CNN-GRU       69.8      80.80     68.90
Multi-task    75.70     66.00     68.10
D-FWA         64.60     65.50     67.30
STIL          75.58     -         -
Ours          79.97     91.69     74.20

4.3 Ablation Study

We conduct experiments to further demonstrate the effectiveness of the FAB and VSB modules in our proposed HSTNet. On the one hand, we compare the results of FAB with and without time-channel attention. Without time-channel attention, FAB directly performs the down-/up-sampling operations and adds the two features at the pixel level. On the other hand, we explore the effects of different choices for combining the outputs of the two branches, including only the convolution vector, only the transformer vector, a simple concatenated vector, and the output of VSB. The results are shown in Table 5. Evidently, both FAB and VSB are effective in improving the performance. In particular, FAB with time-channel attention outperforms the variant without it by 0.14% in terms of AUC, which shows its superiority in integrating short- and long-range information. In addition, selecting the output of VSB for classification achieves the best result and surpasses the second best by 0.23%, demonstrating that VSB is the best choice for the final prediction.

Table 5. Ablation study results of HSTNet with different model configurations. Video-level AUC (%) on the FF++ dataset is reported. TC means the time-channel operation in the FAB and VSB modules. The best results are highlighted.

Model                           FF++ (AUC)
FAB Structure
  FAB (w/o TC) - VSB (w TC)     99.04
  FAB (w TC) - VSB (w TC)       99.20
VSB Structure
  FAB (w TC) - VSB (conv)       98.82
  FAB (w TC) - VSB (trans)      98.81
  FAB (w TC) - VSB (concat)     98.95
  FAB (w TC) - VSB (w TC)       99.20

4.4 Visualization

By using the Grad-CAM method [45], we visualize the heatmaps of the last feature extraction layer of HSTNet to illustrate the locations of spatio-temporal inconsistencies. As depicted in Fig. 4, our HSTNet can locate the unnatural parts of the fake videos not only from the local details but also from the global consistency. In particular, for frames that look very realistic, such as the frames from NeuralTextures at the bottom right of Fig. 4, our method pays more attention to the positions where there are inconsistencies in the temporal dimension.
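For reference, a minimal hook-based Grad-CAM sketch in PyTorch is shown below; the target layer, input shapes, and class index are placeholders rather than the actual HSTNet interface, and the backward-hook API assumes a recent PyTorch release.

```python
import torch
import torch.nn.functional as F

def grad_cam(model, layer, clip, class_idx=1):
    """Minimal Grad-CAM: weight the target layer's activations by the
    spatially averaged gradients of the chosen class score."""
    feats, grads = [], []
    h1 = layer.register_forward_hook(lambda m, i, o: feats.append(o))
    h2 = layer.register_full_backward_hook(lambda m, gi, go: grads.append(go[0]))
    score = model(clip)[:, class_idx].sum()
    model.zero_grad()
    score.backward()
    h1.remove(); h2.remove()
    a, g = feats[0], grads[0]                    # (B, C, T, H, W) activations / gradients
    weights = g.mean(dim=(2, 3, 4), keepdim=True)
    cam = F.relu((weights * a).sum(dim=1))       # (B, T, H, W) raw heatmaps
    return cam / cam.amax(dim=(1, 2, 3), keepdim=True).clamp(min=1e-6)
```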

5 Conclusions

In this paper, we propose a Hybrid Spatio-Temporal Network (HSTNet) to detect forged videos. Specifically, our HSTNet utilizes a hybrid architecture consisting of a 3D CNN branch and a spatio-temporal transformer branch to jointly learn short- and long-range relations in space and time. Due to the feature mismatch between the two branches, we design a Feature Alignment Block (FAB) to re-calibrate the features and fully integrate them in scales and semantics. We also introduce a Vector Selection Block (VSB) to combine the outputs of the two branches and emphasize essential features for the final classification. Extensive experiments illustrate that HSTNet produces state-of-the-art overall results.

Acknowledgments. This work was supported by “One Thousand Plan” projects in Jiangxi Province Jxsg2023102268 and National Key Laboratory on Automatic Target Recognition 220402.

References 1. Afchar, D., Nozick, V., Yamagishi, J., Echizen, I.: Mesonet: a compact facial video forgery detection network. In: 2018 IEEE International Workshop on Information Forensics and Security (WIFS), pp. 1–7. IEEE (2018) 2. Amerini, I., Galteri, L., Caldelli, R., Del Bimbo, A.: Deepfake video detection through optical flow based CNN. In: Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops (2019) 3. Arnab, A., Dehghani, M., Heigold, G., Sun, C., Lučić, M., Schmid, C.: ViViT: a video vision transformer (2021) 4. Ba, J.L., Kiros, J.R., Hinton, G.E.: Layer normalization (2016) 5. Bao, J., Chen, D., Wen, F., Li, H., Hua, G.: Towards open-set identity preserving face synthesis. In: Proceedings of the IEEE conference on Computer Vision and Pattern Recognition, pp. 6713–6722 (2018) 6. Bayar, B., Stamm, M.C.: A deep learning approach to universal image manipulation detection using a new convolutional layer. In: Proceedings of the 4th ACM Workshop on Information Hiding and Multimedia Security, pp. 5–10 (2016) 7. Bertasius, G., Wang, H., Torresani, L.: Is space-time attention all you need for video understanding? (2021)


8. Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: Endto-end object detection with transformers. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12346, pp. 213–229. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58452-8_13 9. Carreira, J., Zisserman, A.: Quo Vadis, action recognition? A new model and the kinetics dataset. In: proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 6299–6308 (2017) 10. Chollet, F.: Xception: Deep learning with DepthWise separable convolutions. In: Proceedings of the IEEE conference on Computer Vision and Pattern Recognition, pp. 1251–1258 (2017) 11. Contributors, M.: Openmmlab’s next generation video understanding toolbox and benchmark. https://github.com/open-mmlab/mmaction2 (2020) 12. Dang, H., Liu, F., Stehouwer, J., Liu, X., Jain, A.K.: On the detection of digital face manipulation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5781–5790 (2020) 13. Deng, J., Guo, J., Ververas, E., Kotsia, I., Zafeiriou, S.: Retinaface: single-shot multi-level face localisation in the wild. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5203–5212 (2020) 14. Dolhansky, B., et al.: The deepfake detection challenge (DFDC) dataset (2020) 15. Dosovitskiy, A., et al.: An image is worth 16x16 words: Transformers for image recognition at scale (2020) 16. Du, M., Pentyala, S., Li, Y., Hu, X.: Towards generalizable forgery detection with locality-aware autoencoder. pp. arXiv-1909 (2019) 17. Durall, R., Keuper, M., Keuper, J.: Watch your up-convolution: CNN based generative deep neural networks are failing to reproduce spectral distributions. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 7890–7899 (2020) 18. Goodfellow, I., et al.: Generative adversarial nets, vol. 27 (2014) 19. Gu, Z., et al.: Spatiotemporal inconsistency learning for deepfake video detection. In: Proceedings of the 29th ACM International Conference on Multimedia, pp. 3473–3481 (2021) 20. Haliassos, A., Vougioukas, K., Petridis, S., Pantic, M.: Lips don’t lie: a generalisable and robust approach to face forgery detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5039–5049 (2021) 21. Hara, K., Kataoka, H., Satoh, Y.: Can spatiotemporal 3d CNNs retrace the history of 2d CNNs and ImageNet? In: Proceedings of the IEEE conference on Computer Vision and Pattern Recognition, pp. 6546–6555 (2018) 22. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016) 23. Hendrycks, D., Gimpel, K.: Gaussian error linear units (GELUs) (2016) 24. Islam, M.A., Kowal, M., Jia, S., Derpanis, K.G., Bruce, N.D.: Position, padding and predictions: a deeper look at position information in CNNs (2021) 25. Jiang, Z., et al.: Token labeling: Training a 85.5% top-1 accuracy vision transformer with 56m parameters on imagenet (2021) 26. Khodabakhsh, A., Ramachandra, R., Raja, K., Wasnik, P., Busch, C.: Fake face detection methods: can they be generalized? In: 2018 International Conference of the Biometrics Special Interest Group (BIOSIG), pp. 1–6. IEEE (2018) 27. Li, J., Xie, H., Li, J., Wang, Z., Zhang, Y.: Frequency-aware discriminative feature learning supervised by single-center loss for face forgery detection. In: Proceedings

of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 6458–6467 (2021)
28. Li, L., Bao, J., Yang, H., Chen, D., Wen, F.: Advancing high fidelity identity swapping for forgery detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5074–5083 (2020)
29. Li, L., et al.: Face X-ray for more general face forgery detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5001–5010 (2020)
30. Li, X., et al.: Sharp multiple instance learning for deepfake video detection. In: Proceedings of the 28th ACM International Conference on Multimedia, pp. 1864–1872 (2020)
31. Li, Y., Chang, M.C., Lyu, S.: In Ictu Oculi: exposing AI created fake videos by detecting eye blinking. In: 2018 IEEE International Workshop on Information Forensics and Security (WIFS), pp. 1–7. IEEE (2018)
32. Li, Y., Lyu, S.: Exposing deepfake videos by detecting face warping artifacts (2018)
33. Li, Y., Yang, X., Sun, P., Qi, H., Lyu, S.: Celeb-DF: a large-scale challenging dataset for deepfake forensics. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3207–3216 (2020)
34. Liu, H., et al.: Spatial-phase shallow learning: rethinking face forgery detection in frequency domain. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 772–781 (2021)
35. Liu, Z., et al.: Swin transformer: hierarchical vision transformer using shifted windows (2021)
36. Loshchilov, I., Hutter, F.: Decoupled weight decay regularization (2017)
37. Mao, M., et al.: Dual-stream network for visual recognition (2021)
38. Masi, I., Killekar, A., Mascarenhas, R.M., Gurudatt, S.P., AbdAlmageed, W.: Two-branch recurrent network for isolating deepfakes in videos. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12352, pp. 667–684. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58571-6_39
39. Mittal, T., Bhattacharya, U., Chandra, R., Bera, A., Manocha, D.: Emotions don't lie: an audio-visual deepfake detection method using affective cues. In: Proceedings of the 28th ACM International Conference on Multimedia, pp. 2823–2832 (2020)
40. Nguyen, H.H., Fang, F., Yamagishi, J., Echizen, I.: Multi-task learning for detecting and segmenting manipulated facial images and videos (2019)
41. Peng, Z., et al.: Conformer: local features coupling global representations for visual recognition (2021)
42. Qian, Y., Yin, G., Sheng, L., Chen, Z., Shao, J.: Thinking in frequency: face forgery detection by mining frequency-aware clues. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12357, pp. 86–103. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58610-2_6
43. Rossler, A., Cozzolino, D., Verdoliva, L., Riess, C., Thies, J., Nießner, M.: FaceForensics++: learning to detect manipulated facial images. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 1–11 (2019)
44. Sabir, E., Cheng, J., Jaiswal, A., AbdAlmageed, W., Masi, I., Natarajan, P.: Recurrent convolutional strategies for face manipulation detection in videos. Interfaces 3, 80–87 (2019)
45. Selvaraju, R.R., Cogswell, M., Das, A., Vedantam, R., Parikh, D., Batra, D.: Grad-CAM: visual explanations from deep networks via gradient-based localization. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 618–626 (2017)


46. Thies, J., Elgharib, M., Tewari, A., Theobalt, C., Nießner, M.: Neural voice puppetry: audio-driven facial reenactment. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12361, pp. 716–731. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58517-4_42 47. Tran, D., Bourdev, L., Fergus, R., Torresani, L., Paluri, M.: Learning spatiotemporal features with 3d convolutional networks (2015) 48. Vaswani, A., et al.: Attention is all you need. In: Advances in Neural Information Processing Systems, pp. 5998–6008 (2017) 49. Vougioukas, K., Petridis, S., Pantic, M.: End-to-end speech-driven realistic facial animation with temporal GANs. In: IEEE conference on Computer Vision and Pattern Recognition Workshops (CVPRW), pp. 37–40 (2019) 50. Wang, S.Y., Wang, O., Zhang, R., Owens, A., Efros, A.A.: CNN-generated images are surprisingly easy to spot... for now. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8695–8704 (2020) 51. Wang, W., et al.: Pyramid vision transformer: a versatile backbone for dense prediction without convolutions (2021) 52. Zhao, H., Zhou, W., Chen, D., Wei, T., Zhang, W., Yu, N.: Multi-attentional deepfake detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 2185–2194 (2021) 53. Zheng, Y., Bao, J., Chen, D., Zeng, M., Wen, F.: Exploring temporal coherence for more general video face forgery detection. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 15044–15054 (2021)

Dewarping Document Image in Complex Scene by Geometric Control Points

Run-Xi Li1(B), Fei Yin2, and Lin-Lin Huang1

1 Beijing Jiaotong University, Beijing, China
{21120012,huangll}@bjtu.edu.cn
2 National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing, China
[email protected]

Abstract. In the process of document digitization, document images captured by mobile devices suffer from physical distortion, which is detrimental to subsequent document processing. Geometric information of the distorted document images provides global and local constraints that can assist in document dewarping. In this paper, we propose a novel document dewarping method which focuses on utilizing geometric control points such as document boundaries and textlines. Specifically, our method first extracts the boundary source control points and textline source control points and predicts their corresponding forward mapping as target control points. Eventually, the sparse mapping between control points is converted into a dense backward mapping by Thin Plate Spline interpolation. Our method can obtain the backward mapping directly and explicitly by interpolation between control points, without solving a time-consuming optimization problem. Quantitative and qualitative evaluations show that our method can dewarp document images with various distortion types and improve the inference speed by a factor of three over existing geometric-element-based rectification methods.

Keywords: Document Dewarping · Geometric Control Points · Deep Learning

1 Introduction

266

R.-X. Li et al.

using additional equipment [1,2,4] or require multi-view images [5–7], which limits their applicability. Some methods model deformed documents as parametric surfaces and solve for the corresponding parameters by shading [8], text lines [9], and boundaries [10] of the document. However, the parametric model is difficult to handle complex deformation cases.

Fig. 1. Input distorted document images and the boundaries and textlines.

Deep learning-based document dewarping methods exhibit greater robustness and generality. To improve the applicability and performance of document dewarping methods. DocUNet [15] predicts the forward mapping by a stacked UNet. DewarpNet [16] estimates the 3D shape and backward mapping of distorted documents. DocTr [22] is the first method to introduce transformer for document rectification. FDRNet [25] focuses on high-frequency components. Marior [23] follows a progressive strategy to iteratively dewarp the documents. PaperEdge [24] incorporates real-world document images to improve document unwarping. However, most deep learning-based methods directly predict the warping flow, ignoring the geometric information of the documents, such as boundaries and textlines, as shown in Fig. 1). Specifically, document boundaries and textlines should be horizontal or vertical. This provides strong supervision information for document rectification. In order to obtain the warping flow. RDGR [26] solves the optimization problem using the detected document boundaries and text lines as constraints. DocGeoNet [27] learns the geometric elements as auxiliary information and use the feature maps of textline detection branches to predict mapping flow.

Dewarping Document Image in Complex Scene by Geometric Control Points

267

On the one hand, solving the optimization problem to obtain the mapping flow is very time-consuming and can be replaced by training a neural network; on the other hand, using the feature maps of geometric elements implicitly ignores part of the explicit spatial mapping information. Motivated by these two aspects, in this paper, we propose a novel approach to rectify distorted document image explicitly uses the geometric information without time-consuming optimization process. More precisely, we first detect the boundaries points and textlines points of the distorted document image. Then we estimate their corresponding forward mapping points in the rectified image through a neural network. Finally, we consider the geometric element points and the corresponding forward mapping points as source control points and target control points, respectively, and take advantage of Thin plate splines interpolation to convert the sparse mapping between control points to the final dense mapping flow, we use bilinear sampling to generate the rectified images through the dense mapping flow. In summary, the contributions of our paper are as follows: – We propose a novel document rectification method based on document geometric control points, which turns the time-consuming optimization problem required to obtain a mapping flow into the interpolation between pairs of control points. – We conduct quantitative and qualitative experiments on the DocUNet Benchmark [15]. Experiment results show that our method based on geometric element control points can rectify various deformed document images and improve the inference efficiency.

2

Related Work

Document dewarping has been studied for many years. In this section, We categorize document dewarping methods into two groups: hand-crafted feature based methods and deep learning based methods. 2.1

Hand Crafted Features Based Methods

Most traditional hand-crafted methods rectify the document images by reconstructing the 3D shape of the distorted document images. Some of these methods used auxiliary device. Brown and Seales [2] employed a structured light projector system and design a mass-spring particle system to flatten the non-planar surface. Zhang [1] used a dedicated laser range scanner to capture the 3D shape of the warped document and restored the document by physically-based modeling technique. Yamashita [3] used a stereo vision system to correct and merge two images from two cameras whose directions of optic axes are different from each other. Meng [4] utilized two structured beams to recover the document curves and solved a system of ordinary differential equations to flatten the image. These methods rely on special devices and are limited in practical applications. Some methods recover the 3D representation multi-view 3D reconstruction. Koo [5] exploited two view images to reconstruct the surface. Tsoi [6] combined

268

R.-X. Li et al.

multiple images of bound and folded documents and transformed them into a common coordinate frame. You [7] presented a ridge-aware surface reconstruction algorithm and unwrapped the surface by robust conformal mapping. The limitation of this type of method is that is difficult to obtain multi-view images. To avoid using additional devices and multi view images, another type of method assume that the document can be modeled as a parametric model and estimate the corresponding parameters. Cylindrical surface is the most common model. [12] modeled the curved documents as cylindrical surfaces and compute the shape of the document using Shape from shading. Meng [13] estimated the cylindrical surface parameters through weighted majority voting on the vector fields and solved an ordinary differential equation to obtain the spatial directrix of the surface. Non-Uniform Rational B-Splines (NURBS) are another parametric model. Hironori [14] defined the warping model of a document image as a set of cubic splines and fitted each cubic spline to a text line or a space between text lines. Some early work took advantage of layout elements such as textlines and document boundaries. Among them, Brown [10] compute a corrective mapping to undo common geometric distortions via the 2-D boundary of the imaged material. He [11] extracted page boundary and remove the perspective and geometric distortions of a curled page. Cao [9] locate the textlines and get several projections of direcrixes, then get the mapping from the warping 2D image to the rectified 2D image. However, these low dimensional parametric model are difficult to handle complex deformation. 2.2

Deep Learning Based Methods

Traditional hand-crafted methods depend on assumptions on surface geometry or special devices, which limited the applicability. With the development of deep learning research, researchers began to explore deep networks for document rectification. DocUNet [15] is the first deep learning based method to unwarp distorted documents, it proposed a stacked U-Net to predict the forward mapping. Das [16] contribute the Doc3D dataset, which is the first and largest document image dataset with multiple kinds of annotations, and propose DewarpNet to predict 3D coordinates, backward mapping of distorted documents. Liu [19] propose a Adversarial Gated Unwarping Network (AGUN) to predict the multiple resolutions unwarping grid and generate visually pleasing results based on visual cues. Li [20] learn the distortion flow on small image patches rather than the entire image and then stitch the patch results in the gradient domain. Das [21] design a novel fully-differentiable feature-level stitching module to make piece-wise unwarping networks end-to-end trainable. Xie [17] propose a novel framework for both removing background and estimating pixel-wise displacements using a fully convolutional network (FCN). Moreover, [18] they use a simple Encode structure to effectively estimate control points and reference points and obtain backward mapping by interpolation method. Moreover, Feng [22] introduce Transformer to address geometry and illumination distortion of the document images. Then, Feng [27] introduce geometric representation such as 3D shape and textlines and performs representation learning

Dewarping Document Image in Complex Scene by Geometric Control Points

269

to promote the performance of network. Xue [25] handles document restoration by focusing on high-frequency components in the Fourier space and dewarps documents by a flexible Thin-Plate Spline transformation. RDGR [26] learns the boundary points and the pixels in the text lines and obtain the final forward mapping by solving an optimization problem with grid regularization term. Ma [24] simultaneously using strong supervision of synthetic data and weak supervision of real data to train the network where only mask annotation of document regions is required for real data. Although DocGeoNet [27] and RDGR [26] also detect the geometric elements of the distorted documents, DocGeoNet implicitly exploits the features of textlines by means of representation learning to predict the warping flow, RDGR needs to solve a time-consuming optimization problem. Different from the above two methods, we explicitly exploit the geometric elements and their corresponding forward mapping points to interpolate the warping flow.

3

Approach

Similar to many deep learning based methods, the ultimate goal is to obtain a dense mapping flow, such as forward mapping flow F(x, y), which represents a pixel (x, y) in the distorted image should be map to (u, v ) in the rectified image: F (x, y) = (u, v)

(1)

And backward mapping flow B(u, v), values at coordinates (u, v ) represents the coordinates of the pixels in the distorted input image: B(u, v) = (x, y)

(2)

Inspired by Xie [18], instead of directly estimating the dense warping flow, our method predict the sparse control points, i.e. boundaries points and textlines points, and then perform thin plate spline (TPS) interpolation to convert the sparse mapping into a dense backward mapping flow. More specifically, in order to obtain the two groups of control points, i.e. source control points and target control points, needed for the TPS calculation. We first detect the boundary points and the textline points in the distorted document images as the source control points. Then, we predict the corresponding positions of boundary points and textline points on the rectified image as target control points. After obtaining the dense mapping flow, we sample the corresponding pixel value in the input distorted image to yield the rectified results. 3.1

Network Architecture

As shown in Fig. 2, our network structure is divided into two parts: top and bottom. The top part is the boundary branch and the bottom part is the textline branch. Both branches aim to extract source control points of the distorted document images and predict target control points of geometric elements.

270

R.-X. Li et al.

Boundary Source Control Points Extraction. Given an distorted document ˆ ∈ I D ∈ RH×W ×3 . We use a DocUNet [15] to regress the backward mapping B RH×W ×2 of the distorted image as the source control points. According to the ˆ the values (x, y) in first row, first column, definition of backward mapping B, last row, last column corresponding to the coordinates of the top, left, bottom, right boundary. Boundary Target Control Points Prediction. The corresponding target control points are the position of the boundaries in the forward mapping which satisfy the following conditions: The value of v equals 0 in the forward mapping value of the top boundary, the value of v equals 1 in the forward mapping of the bottom boundary. Similarity, the value of u equals 0 in the forward mapping of the left boundary and the value of u equals 1 in the forward mapping of the right boundary. And u or v values are equally spaced. Thus based on these properties we can directly obtain the target control points of the document borders without using additional neural networks to predict the target control points.

Fig. 2. The network architecture of our method. The input is an distorted document image. Top branch detects the boundary control points of the document and directly obtain the target control points of boundary according to the geometric properties. The bottom branch detects the textline control points and predicts the forward mapping as target control points. Finally, the sparse mapping between source control points target control points will be interpolated by Thin Plate Splines as dense backward mapping.

Textline Source Control Points Extraction. Textlines in distorted documents reflect local deformation information and can provide strong constraint for document unwarping. A curved textline should be rectified to a horizontal or

Dewarping Document Image in Complex Scene by Geometric Control Points

271

vertical line. We use a UNet to extract textlines in distorted document images ˆ ∈ in a semantic segmentation way, the network outputs a confidence map T RH×W ×1 in the range of (0, 1) and filtered by a threshold θ. Then morphological operations are performed on the filtered mask to remove burrs from the obtained text line segmentation masks and extract the connected components and sample at equal intervals to get the textline control points. Textline Target Control Points Prediction. After obtaining the boundary control points and textline source control points, the forward mapping is obtained by solving the optimization problem in RDGR [26], which is time consuming, and the backward mapping is generate by the forward mapping through LinearNDinterpolator. We simplify this process into the prediction of the forˆ ∈ RH×W ×2 through a Transformer encoder-decoder [22] to ward mapping F capture the long-range deformation information. The forward mapping values at the textline source control points are the textline target control points. 3.2

Post-processing

As mentioned above, textlines should be corrected to horizontal lines or vertical lines, which means that the forward mapping of the points in the same textline should have the same u or v values, we remove the outliers from the forward mapping values by a simple linear segment fitting method. If the forward mapping value of a point deviates far from the fitted straight line, then this point is removed. The remaining points are fitted to the straight line again, and the value on the straight line is used to replace the original forward mapping value. For example, for horizontal text lines, given the corresponding forward mapping Ff = {(u1 , v1 ), (u2 , v2 ), ..., (uN , vN )}, we fit a horizontal line f (v) = a, a ∈ (0, H) by least squares method. If the distance of the point (ui , vi ) from the line is greater than a threshold τ then remove the point. Finally, we integrate boundary control points and text line control points as source control points, and the corresponding forward mapping as target control points, and employ thin plate splines(TPS) interpolation to interpolate the sparse mapping between control points into a dense backward mapping. 3.3

Loss Functions

We define three loss function to guide the model regress the boundary source control points through backward mapping, extract textline source control points by means of semantic segmentation and predict the corresponding target control points through forward mapping. The first loss Lbm defined as the L1 distance between predicted backward ˆ and the ground truth Bgt : mapping B ˆ 1 Lbm = Bgt − B

(3)

272

R.-X. Li et al.

The textline segmentation loss Ltext we use in this work is the binary crossentropy loss: Ltext = −

N 1  [yi log pˆi + (1 − yi ) log (1 − pˆi )] NT i

(4)

where NT is the number of elements in textlines,yi and pˆi ground-truth and predicted probability. The forward mapping loss Lf defined as the L1 distance between predicted forward mapping Fˆ and the ground truth F : Lf =

Nf 1  Fi − Fˆi 1 Nf i

(5)

where Nf is the number of foreground area. Following [17], we add a local smooth constraint term to expect the predicted forward mapping trend to be as close to the ground-truth flow as possible in a local region: LLsc =

Nf k 1    (Fj − Fˆj ) − k × (Fi − Fˆi )1 Nf i j=1

(6)

Total losses of forward mapping prediction branch are defined as a linear combination: Lf m = Lf + αLLsc (7) 3.4

Training Details

In our work, the input resolution of boundary module is 128×128. The input resolution of boundary module is 448×448 in order to detect the textlines more precisely. The network is trained with Adam optimizer and the learning rate of 1×10−4 . The threshold of confidence map θ is set as 0.3.

4 4.1

Experiments Datasets

We train our network on the Doc3D dataset and evaluate the quantitative results on the DocUNet benchmark. Doc3D. Doc3D dataset consists of 100k images with several kinds of annotations, including 3D coordinate map, albedo map, UV map, backward mapping, etc. It is created by using real document data and rendering software according to following steps: captured 3D meshes from deformed real documents and rendered the images in Blender. We randomly split 90k images for training and 10k for validation. DocUNet Benchmark. This benchmark contains 130 real world warped documents photos captured by mobile cameras with different kinds of distortions and environment. The types of documents are various, including letters, academic papers, magazines, etc.

Dewarping Document Image in Complex Scene by Geometric Control Points

4.2

273

Evaluation Metrics

We use two groups of evaluation metrics:(1) Image Similarity and (2) Optical Character Recognition (OCR) accuracy to quantitatively evaluate the performance of our method. For image similarity, we use Multi-Scale Structural Similarity (MS-SSIM) and Local Distortion (LD). For OCR performance, we use Character Error Rate (CER) and Edit Distance (ED). MS-SSIM. Structural SIMilarity (SSIM) [29] computes the similarity between two images by combining the luminance, contrast and structure comparison measures. Multi-Scale Structural Similarity (MS-SSIM) [28] is the weighted sum of SSIM at different sampling scales, which supplies more flexibility than previous single-scale structural similarity methods in incorporating the variations of viewing conditions. The weights for each scale are set as in the previous work [16]. Local Distortion. Local distortion(LD) is computed by performing dense image registration using SIFTflow [30] between the rectified image and the ground truth image. In the same way as before, all the unwarped results and groundtruth scanned images are resized to a 598400 pixel area. ED and CER. Edit Distance(ED) is a metric to quantify how similar two strings are to one another, computed by counting the minimum number of operations (i.e. deletions(d), insertions(i) and substitutions(s).) required to transform one string into the other. Character Error Rate (CER) indicates the percentage of characters that were incorrectly predicted, it is calculated based on the Edit Distance: d+i+s (8) CER = N where N is the number of characters in the reference string obtained from the groundtruth scanned document images. We use Pytesseract (v0.3.8) as the OCR engine to evaluate the text recognition performance of our method. Following DocTr [22], we selected 60 images with rich text information from DocUNet benchmark [15]. 4.3

Results

We quantitatively and qualitatively compare our method with recent state-ofthe-art deep learning-based methods. For quantitative evaluation, we compare the MS-SSIM, LD, ED, CER with other methods on DocUNet benchmark, we also compare whether the methods use geometric elements. As shown in Table 1. Our method outperforms the DocUNet [15] and DocProj [20] method in all performance aspects, outperforms FCN-based methods in OCR performance. We also perform an qualitative evaluation on real world document images by comparing our rectification results with other methods of providing rectification results by visualization. As shown in Fig. 3, our method allows the boundaries to fit around the image and the textlines to be more horizontal. In addition to image similarity and OCR performance, We also compare the inference speed (expressed in Frames Per Second or FPS) with RDGR [26]. Our method and RDGR both detect document boundary points and text line points,

274

R.-X. Li et al.

Fig. 3. Qualitative comparisons with other deep learning based methods.

Table 1. Comparison of our method with other deep learning-based methods on DocUNet benchmark, “↑” means the higher the better, “↓” means the lower the better. “GeoE” means whether geometric elements are utilized Methods

GeoE MS-SSIM↑ LD↓

ED↓

CER↓

DocUNet [15] DocProj [20] AGUN [19] FCN-based [17] DewarpNet [16] DDCP [18] Marior [23] PaperEdge [24] RDGR [26] DocGeoNet [27]

× × × × × × × ×  

0.4103 0.2946 0.4477 0.4735 0.4729 0.4780 0.4724 0.4968 0.5040

14.19 18.01 7.84 8.39 8.99 7.44 7.99 8.51 7.71

1552.22 1165.93 1031.40 525.45 745.35 593.80 375.60 420.25 379.00

0.5089 0.3818 0.3156 0.2626 0.2102 0.2136 0.1541 0.1559 0.1509

Ours



0.4777

8.68

506.85

0.1968

Dewarping Document Image in Complex Scene by Geometric Control Points

275

the difference is that RDGR solves the optimization problem based on these geometric element points to calculate the forward mapping, while we introduce a network to predict forward mapping and Thin plate splines interpolation. We compared the model size and FPS of our method and RDGR, as shown in Table 2. Due to the introduction of the forward mapping prediction branch, the number of parameters of our model is larger than RDGR. Our method has three times the inference speed of RDGR even though it has 35% more parameters. This is because RDGR requires Alternating Direction Method of Multipliers (ADMM) for Quadratic Programming to solve the optimization problem, but the predicted forward mapping and linear fitting can be computed quickly. Table 2. Comparison of Frames Per Second (FPS) and number of parameters of our method with RDGR, “↑” means the higher the better, “↓” means the lower the better.

Methods

FPS↑ Para/M

RDGR [26] 0.50 1.68 Ours

4.4

49.46 66.90

Ablation Study and Limitation Discussion

Fig. 4. Visualization of the rectification results using different geometric control points. After adding the textline control points, the textlines in the red box is more horizontal (Color figure online)

We conduct ablation study to investigate the influence of geometric elements. As shown in Table 3. The use of geometric elements can provide global and local

276

R.-X. Li et al.

constraint information for document unwarping, allowing the model to achieve better performance. High image similarity can be achieved using only document boundary points due to the fact that the document’s border reflects global deformation information and also serves as a demarcation between foreground and background. And the OCR performance can be further improved by adding textline points, which proves that the addition of textlines makes the rectified document more readable, as can be seen in Fig. 4, the textlines are straighter. Note that we did not do experiments using only textline control points, because all distorted documents have boundaries but not necessarily textlines. Table 3. The result of using different geometric elements. “↑” means the higher the better, “↓” means the lower the better. Methods

MS-SSIM↑ LD↓ ED↓

Boundary 0.4860 Boundary+Textline 0.4777

CER↓

8.62 788.03 0.2800 8.68 506.85 0.1968

However, it is worth noting that there is a slight decrease in image similarity when using both boundaries and textlines. This is a limitation of our approach. The reason is that the forward mapping values of textlines are not predicted accurately enough and there are still isolated deviation points even after postprocessing filtering. Forward mapping values with large errors can cause local areas of the image to be stretched causing distortion. In the future, We will improve the accuracy of forward mapping prediction by introducing constraint relations.

5

Conclusion

This paper we propose a novel framework to dewarp distorted document image. Different from many existing methods to directly predict the dense mapping flow, we focus on document boundary control points and textline control points and the corresponding forward mapping and then take advantage of thin plate splines(TPS) interpolation to convert the sparse mapping between control points to obtain the final dense backward mapping. The geometric control points can provide global and local structure information for dewarping. Extensive experiment results show our method can achieve better performance. In addition, we analyze the limitations of our approach. In the future, we will focus on improving the accuracy of the forward mapping prediction and introduce some constraints.

Acknowledgments. This work has been supported by the National Key Research and Development Program Grant 2020AAA0109702.

Dewarping Document Image in Complex Scene by Geometric Control Points

277

References 1. Zhang, L., Zhang, Y., Tan, C.: An improved physically-based method for geometric restoration of distorted document images. IEEE Trans. Pattern Anal. Mach. Intell. 30(4), 728–34 (2008) 2. Brown, M.S., Seales, W.B.: Document restoration using 3D shape: a general deskewing algorithm for arbitrarily warped documents. In: Proceedings Eighth IEEE International Conference on Computer Vision, ICCV 2001 2001 Jul 7, vol. 2, pp. 367–374. IEEE (2001) 3. Yamashita, A., Kawarago, A., Kaneko, T., Miura, K.T.: Shape reconstruction and image restoration for non-flat surfaces of documents with a stereo vision system. In: Proceedings of the 17th International Conference on Pattern Recognition, 2004. ICPR 2004. 2004 Aug 26, vol. 1, pp. 482–485. IEEE 4. Meng, G., Wang, Y., Qu, S., Xiang, S., Pan, C.: Active flattening of curved document images via two structured beams. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3890–3897 (2014) 5. Koo, H.I., Kim, J., Cho, N.I.: Composition of a dewarped and enhanced document image from two view images. IEEE Trans. Image Process. 18(7), 1551–62 (2009) 6. Tsoi, Y.C., Brown, M.S.: Multi-view document rectification using boundary. In: 2007 IEEE Conference on Computer Vision and Pattern Recognition 2007 Jun 17, pp. 1–8. IEEE (2007) 7. You, S., Matsushita, Y., Sinha, S., Bou, Y., Ikeuchi, K.: Multiview rectification of folded documents. IEEE Trans. Pattern Anal. Mach. Intell. 40(2), 505–11 (2017) 8. Zhang, L., Yip, A.M., Brown, M.S., Tan, C.L.: A unified framework for document restoration using inpainting and shape-from-shading. Pattern Recogn. 42(11), 2961–78 (2009) 9. Cao, H., Ding, X., Liu, C.: Rectifying the bound document image captured by the camera: A model based approach. In: Seventh International Conference on Document Analysis and Recognition, 2003. Proceedings. 2003 Aug 6, pp. 71–75. IEEE 10. Brown, M.S., Tsoi, Y.C.: Geometric and shading correction for images of printed materials using boundary. IEEE Trans. Image Process. 15(6), 1544–54 (2006) 11. He, Y., Pan, P., Xie, S., Sun, J., Naoi, S.: A book dewarping system by boundarybased 3D surface reconstruction. In: 2013 12Th international Conference on Document Analysis and Recognition 2013 Aug 25, pp. 403–407. IEEE (2013) 12. Wada, T., Ukida, H., Matsuyama, T.: Shape from shading with interreflections under a proximal light source: distortion-free copying of an unfolded book. Int. J. Comput. Vision 24, 125–35 (1997) 13. Meng, G., Su, Y., Wu, Y., Xiang, S., Pan, C.: Exploiting vector fields for geometric rectification of distorted document images. In: Proceedings of the European Conference on Computer Vision (ECCV) 2018, pp. 172–187 (2018) 14. Ezaki, H., Uchida, S., Asano, A., Sakoe, H.: Dewarping of document image by global optimization. In: Eighth International Conference on Document Analysis and Recognition (ICDAR’05) 2005 Aug 31, pp. 302–306. IEEE (2005) 15. Ma, K., Shu, Z., Bai, X., Wang, J., Samaras, D.: Docunet: document image unwarping via a stacked u-net. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2018, pp. 4700–4709 (2018) 16. Das, S., Ma, K., Shu, Z., Samaras, D., Shilkrot, R.: Dewarpnet: single-image document unwarping with stacked 3d and 2d regression networks. In: Proceedings of the IEEE/CVF International Conference on Computer Vision 2019, pp. 131–140 (2019)

278

R.-X. Li et al.

17. Xie, G.-W., Yin, F., Zhang, X.-Y., Liu, C.-L.: Dewarping document image by displacement flow estimation with fully convolutional network. In: Bai, X., Karatzas, D., Lopresti, D. (eds.) DAS 2020. LNCS, vol. 12116, pp. 131–144. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-57058-3 10 18. Xie, G.-W., Yin, F., Zhang, X.-Y., Liu, C.-L.: Document dewarping with control points. In: Llad´ os, J., Lopresti, D., Uchida, S. (eds.) ICDAR 2021. LNCS, vol. 12821, pp. 466–480. Springer, Cham (2021). https://doi.org/10.1007/978-3-03086549-8 30 19. Liu, X., Meng, G., Fan, B., Xiang, S., Pan, C.: Geometric rectification of document images using adversarial gated unwarping network. Pattern Recogn. 1(108), 107576 (2020) 20. Li, X., Zhang, B., Liao, J., Sander, P.V.: Document rectification and illumination correction using a patch-based CNN. ACM Trans. Graph. (TOG) 38(6), 1–1 (2019) 21. Das, S., Singh, K.Y., Wu, J., Bas, E., Mahadevan, V., Bhotika, R., Samaras, D.: End-to-end piece-wise unwarping of document images. In: Proceedings of the IEEE/CVF International Conference on Computer Vision 2021, pp. 4268–4277 (2021) 22. Feng, H., Wang, Y., Zhou, W., Deng, J., Li, H.: Doctr: Document image transformer for geometric unwarping and illumination correction. arXiv preprint arXiv:2110.12942. 2021 Oct 25 23. Zhang, J., Luo, C., Jin, L., Guo, F., Ding, K.: Marior: margin removal and iterative content rectification for document dewarping in the wild. arXiv preprint arXiv:2207.11515. 2022 Jul 23 24. Ma, K., Das, S., Shu, Z., Samaras, D.: Learning from documents in the wild to improve document unwarping. In: ACM SIGGRAPH 2022 Conference Proceedings 2022 Jul 27, pp. 1–9 25. Xue, C., Tian, Z., Zhan, F., Lu, S., Bai, S.: Fourier document restoration for robust document dewarping and recognition. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2022, pp. 4573–4582 26. Jiang, X., Long, R., Xue, N., Yang, Z., Yao, C., Xia, G.S.: Revisiting document image dewarping by grid regularization. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2022, pp. 4543–4552 (2022) 27. Feng H, Zhou W, Deng J, Wang Y, Li H. Geometric representation learning for document image rectification. In: European Conference on Computer Vision 2022 Oct 22, pp. 475–492. Springer, Cham (2022). https://doi.org/10.1007/978-3-03119836-6 27 28. Wang, Z., Simoncelli, E.P., Bovik, A.C.: Multiscale structural similarity for image quality assessment. In: The Thrity-Seventh Asilomar Conference on Signals, Systems & Computers, 2003 2003 Nov 9, vol. 2, pp. 1398–1402. IEEE (2003) 29. Wang, Z., Bovik, A.C., Sheikh, H.R., Simoncelli, E.P.: Image quality assessment: from error visibility to structural similarity. IEEE Trans. Image Process. 13(4), 600–12 (2004) 30. Liu, C., Yuen, J., Torralba, A., Sivic, J., Freeman, W.T.: Sift flow: dense correspondence across different scenes. In: Computer Vision-ECCV 2008: 10th European Conference on Computer Vision, Marseille, France, October 12–18, 2008, Proceedings, Part III 10 2008, pp. 28–42. Springer, Heidelberg

Multi-discriminative Parts Mining for Fine-Grained Visual Classification Pingping Zhou, Cheng Pang(B) , Rushi Lan, Guanhua Wu, and Yilin Zhang Guilin University of Electronic Technology, Guilin 541004, China {pangcheng3,rslan}@guet.edu.cn

Abstract. Fine-Grained Visual Classification (FGVC) aims to differentiate visually similar but subtly different subordinate categories of the same basic category. However, current methods primarily exploit deep-layer features to locate the most salient parts of the network. This paper finds that some subtle but discernible parts, as well as the rich details in shallow-layer features, are also valuable for classification. Consequently, this paper proposes a fine-grained visual classification framework that integrates multiple discriminative parts and multi-layer features. Our framework consists of two modules: 1) The attention map based locate-mine module locates the most discriminative part and masks it, thereby encouraging the network to mine other discriminative parts. 2) The multi-layer feature fusion module combines shallow-layer and deep-layer features to enrich local details in discriminative features. We also introduce an adaptive label loss to distinguish categories with high similarity. Experimental results show that our approach achieves excellent performance on three widely used fine-grained benchmark datasets. Keywords: Fine-grained visual classification · Multi-discriminative parts · Multi-layer features · Adaptive label loss

1 Introduction

In recent times, with the rapid advancements in artificial intelligence, fine-grained visual classification has been widely applied in the fields of autonomous driving [1], biological protection [30] and cancer detection [21]. Unlike coarse-grained visual classification, which only needs to identify the basic class of objects, the objective of FGVC tasks lies in recognizing subordinate categories within a given fundamental category, such as bird breeds [29], car types [16] and airplane models [23]. Nevertheless, due to the extremely high visual similarity among subordinate categories and the presence of variations in scale, pose, and illumination, they exhibit small inter-class differences and large intra-class differences, as displayed in Fig. 1. Moreover, a single basic category can encompass hundreds or even thousands of subordinate categories. These factors have brought great challenges for fine-grained visual classification.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023. H. Lu et al. (Eds.): ACPR 2023, LNCS 14408, pp. 279–292, 2023. https://doi.org/10.1007/978-3-031-47665-5_23


Fig. 1. Partial samples of two subordinate categories of gulls in the CUB-200-2011 dataset. Within gulls of the same subordinate category, there are considerable variations in terms of scale, pose, and illumination (first row), but gulls of different subordinate categories are visually very similar (first column), which makes fine-grained visual classification difficult.

It is often difficult to achieve satisfactory results with only the current state-of-the-art coarse-grained convolutional neural networks (CNNs), such as VGG [24], ResNet [12], and Inception [26]. To tackle the above challenges, numerous methods [2,14,17,34] have employed manually annotated bounding boxes and part annotations (e.g., bird head, body) to aid in locating target objects or discriminative parts. However, this reliance on extensive manual intervention introduces drawbacks such as high cost and subjectivity, rendering it less suitable for fine-grained classification tasks. Accordingly, researchers have recently shifted their focus to weakly supervised fine-grained visual classification, where only image labels are used for supervision. These methods can be categorized into two sets. The first set is based on part locating methods [9,11,31,36]. These methods work by identifying highly discriminative parts and then taking advantage of specific feature extraction techniques to learn them. Yet, the disadvantage of these approaches is their tendency to focus only on the most prominent parts within the network, discarding other parts as irrelevant when forming the final feature representation. We contend that once the most salient parts are masked or suppressed, the network is compelled to explore other discriminative parts, thus providing a greater amount of intricate and supplementary information. Motivated by this intuitive and feasible idea, we introduce the attention map based locate-mine module. This module represents the parts or visual cues of objects through the enhanced map generated by the enhancement mechanism. It randomly selects a channel from the enhanced map as the attention map to assist in localizing discriminative parts and, in conjunction with the mining mechanism, further extracts other discriminative parts. The second set of approaches is based on fine-grained feature learning [15,19,32]. These approaches take into consideration that low-dimensional features may not adequately capture the distinctions needed for effective classification.


Consequently, they aim to enhance feature representation by learning higher-dimensional features. Lin et al. [19] introduced a bilinear architecture that leverages the product of feature maps obtained from two separate CNNs to acquire higher-order correlations, thus improving visual recognition performance. Yu et al. [32] further explored and proposed a method of applying bilinear pooling on features at different layers of the network, thereby capturing the interactive relationships between multi-layer features. They demonstrated the performance improvement achieved by incorporating shallow-layer features in fine-grained classification tasks. In fact, deep-layer features contain global contextual information and high-level semantic representations in the network. However, due to their coarse spatial resolution, these features often overlook fine-grained details in local regions. Conversely, shallow-layer features possess more abundant local details and finer spatial information, which are crucial for distinguishing local parts in fine-grained datasets. By exploiting the merits of both deep-layer and shallow-layer features, this paper introduces a fusion approach that combines them to preserve global contextual information in discriminative parts while enhancing the perception of local details. Furthermore, we observe that existing studies are susceptible to misclassification when dealing with some categories with high similarity. This is because these studies usually use the cross-entropy loss function, which assigns equal weight to all categories and ignores the variations in discrimination difficulty and importance between different categories. To address this limitation, we design an adaptive label loss that effectively allocates more attention to challenging categories and thus enhances their discrimination. In summary, the main contributions of this paper are summarized as follows:
• We propose an attention map based locate-mine module, which can precisely locate discriminative parts and mine other discriminable parts;
• We propose a multi-layer feature fusion module, which combines the complementarity of multi-layer features to enrich the detailed information in the discriminative parts;
• We design an adaptive label loss that improves the network's ability to classify similar categories;
• Extensive experiments conducted on three widely used fine-grained classification benchmark datasets (i.e., CUB-200-2011, Stanford-Car and FGVC-Aircraft) demonstrate the effectiveness of our method and achieve excellent performance.

2 Related Work

In recent studies on fine-grained image recognition, there has been a dedicated effort to locate discriminative parts and learn fine-grained features, resulting in a remarkable advancement in recognition performance.


2.1 Part Locating

The detection of subtle variations in key parts is of utmost importance for recognizing easily confusable categories. Early approaches [8,35] advocated detecting parts by strongly supervised learning using bounding boxes and part annotations. However, this labor-intensive approach is not practical for real-world problems. Recent studies have adopted a weakly supervised approach by providing only class labels, thus alleviating the need for expensive part annotations and localized information regions. Fu et al. [9] designed the Recurrent Attention Convolutional Neural Network (RA-CNN) to localize local regions from coarse to fine by iteratively amplifying local discriminative parts, and enhanced multi-scale feature learning with a ranking loss. To simultaneously generate multiple attention locations, Zheng et al. [36] exploited the Multi-Attention Convolutional Neural Network (MA-CNN), which introduced a channel grouping loss to generate multiple regions by clustering. Yang et al. [31] used a navigator module to gradually learn the ability to select distinctive local regions through self-supervised learning. Zhang et al. [33] proposed the MMAL framework, which is composed of multiple branches that can learn feature information at different scales. Zheng et al. [37] adopted a hierarchical approach, using multiple attention modules to progressively focus on information regions at different levels of granularity. Nevertheless, these methods primarily emphasize the highly discriminative parts and neglect some subtle but distinguishable parts. To overcome this limitation, we propose a module that not only localizes discriminative parts but also explores other discerning parts.

2.2 Fine-Grained Feature Learning

In fine-grained visual classification tasks, it is crucial to fully learn the recognized features. Due to the small differences between subordinate categories, extracting deep semantic features using CNNs alone can limit further representation learning. To tackle this issue, Lin et al. [19] devised a bilinear architecture that exploits the outer product of feature maps from two separate CNNs to acquire higher-dimensional feature information for visual recognition tasks. However, the computational capacity is constrained due to the significant computational burden caused by bilinear pooling. They [18] went a step further and improved the computational efficiency with a learnable dimensionality reduction matrix and grouping operations. To reduce computational complexity while maintaining comparable recognition performance, Gao et al. [10] transformed the traditional bilinear pooling into a more compact form using random mapping and the Hadamard product. Kong et al. [15] further reduced the computational complexity by lowering the rank of the bilinear eigenmatrix. In addition, Cui et al. [4] extracted richer feature representations by introducing kernel functions. However, most existing CNN models only use deep-layer semantic features for the final classification, while ignoring the benefit of shallow-layer local features. Therefore, Yu et al. [32] capture the feature interaction information between different levels through


hierarchical bilinear pooling operations. Inspired by their utilization of shallow-layer features, we propose an approach that extracts diverse discriminative and complementary visual cues from multiple layers of features instead of a single layer. Moreover, we introduce an adaptive label loss to incentivize the network to acquire more generalized features.

3 Method

In this section, we provide a detailed description of the proposed framework. The overall architecture is illustrated in Fig. 2. It is formed by two modules: the Attention Map based Locate-Mine module (AMLM) and the Multi-layer Feature Fusion module (MFF). The AMLM module locates the strongly discriminative part and explores other discriminative parts. The MFF module fuses multi-layer features to capture various visual cues that complement the discriminative parts. Furthermore, we design an adaptive label loss to focus more on highly similar categories.

Fig. 2. The overall architecture of our proposed method consists of two modules: the Attention Map-based Locate-Mine module (AMLM) and the Multi-layer Feature Fusion module (MFF). The AMLM module uses the enhanced feature maps generated by the Enhancement Mechanism (EM) to facilitate visual perception of the part. Through the Part Locating component (PL) and the Part Mining component (PM), multiple discriminative parts in the target can be obtained. The MFF module complements the fine-grained details within the discriminative parts by fusing features from multiple layers.

3.1 Attention Map Based Locate-Mine Module (AMLM)

Existing methodologies tend to excessively prioritize parts with elevated response values while disregarding other nuanced but discriminative parts. To mitigate the risk of the network fixating only on the most salient part and to get a larger


number of discriminative parts, we introduce the attention map based locate-mine module, as depicted in the orange dashed box within Fig. 2. This module integrates an enhancement mechanism, a part locating component, and a part mining component. First, the image I^img is input into the pre-trained backbone network ResNet. We use x_n ∈ R^{H×W×C} to denote the feature map output by the n-th block in the network, where n ∈ {1, ..., N}, H denotes the height of the feature map, W denotes the width of the feature map, and C represents the size of the feature dimension.
Enhancement Mechanism (EM). In the feature map x_n, each channel has a peak response region that corresponds to a specific part of the target, such as a bird's head, a car's wheel, or an airplane's wing. It is worth noting that certain channels may yield peak responses that correspond to the same part. When employing all channels for localization, the network may inadvertently concentrate on similar parts, thereby resulting in a lack of diversity in the localization results. To learn more diverse parts, we use a convolutional dimensionality reduction operation, which integrates similar peak responses by reducing the number of channels in the feature map, as represented by Eq. (1):

E = ReLU(BN(ConV(x_n)))   (1)

where x_n denotes the feature map of the last block and ConV is the convolution function. After performing the aforementioned operations, we obtain the enhanced feature map E, which contains local part information. To effectively integrate the transformed features and strengthen the information associated with the parts, we add an attention module that incorporates operations along both the channel and spatial dimensions. Specifically, the attention mechanism applied to the channels performs global average pooling on E and combines it with a one-dimensional convolution to acquire the importance of parts within the channels. The channel-enhanced map E^C is generated as written in Eq. (2):

E^C = σ(ConV1D(GAP(E))) ⊗ E   (2)

where GAP is the global average pooling, ConV1D is the 1D convolution, σ refers to the sigmoid function, and ⊗ denotes element-by-element multiplication. Next, spatial operations are performed on E^C. Specifically, E^C is first subjected to global average pooling and global maximum pooling along the channel dimension to highlight the informative parts. Subsequently, we apply a convolution to generate a spatial attention mask, where each element is normalized to a value from 0 to 1 with the help of the sigmoid, reflecting the importance of the parts in space. Finally, the spatial importance factor is multiplied with the channel-enhanced map E^C to obtain the final spatially enhanced map E^S, as shown in Eq. (3):


E^S = σ(ConV([GAP(E^C); GMP(E^C)])) ⊗ E^C   (3)

where σ is the sigmoid activation function, GAP and GMP refer to global average pooling and global maximum pooling respectively, and ⊗ refers to the Hadamard product. To get a specific part and learn its valuable information, we randomly select one channel from E^S as the attention map to locate the target part; the attention map is denoted as A.
Part Locating (PL). Using the attention map A, the estimated position of the part can be localized and more detailed local features can be extracted. Specifically, we define the locating mask M^locate(u, v) based on a threshold value T^locate. The locating mask M^locate(u, v) can be obtained from Eq. (4):

M^locate(u, v) = 1, if A(u, v) > T^locate; 0, otherwise   (4)

where (u, v) denotes a pixel on the attention map A. We choose the smallest bounding box on M^locate that encompasses the target part and covers the largest connected region on the locating mask. By multiplying this localization result with I^img, we acquire the located image I^locate. We then feed I^locate into the network and obtain the predicted result denoted as p^locate.
Part Mining (PM). The mining operation is achieved by masking the highly responsive part, thereby facilitating the exploration of other viable discriminative parts. Similar to the locating operation, we remove the region whose response values are above a threshold T^mine, thus obtaining the mining mask M^mine(u, v), which can be expressed by Eq. (5):

M^mine(u, v) = 0, if A(u, v) > T^mine; 1, otherwise   (5)

The mining mask M^mine(u, v) applied to the input image I^img yields the masked image I^mine. The prediction result based on I^mine is denoted as p^mine. Since the high-response part of the attention map A is removed from the image, the network is encouraged to attend to other discriminative parts, which means the object can also be seen more completely.
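To make the locate-and-mine procedure concrete, the following is a minimal PyTorch-style sketch of the enhancement mechanism (Eqs. 1-3) and of the mask generation of Eqs. (4)-(5). It is an illustration rather than the authors' implementation: module names, kernel sizes, the reduced channel count, and the thresholds T^locate and T^mine are assumptions chosen only to make the example self-contained.

```python
# Illustrative sketch (not the released code) of the AMLM building blocks.
import torch
import torch.nn as nn
import torch.nn.functional as F

class EnhancementMechanism(nn.Module):
    def __init__(self, in_channels, reduced_channels):
        super().__init__()
        # Eq. (1): channel reduction merges similar peak responses
        self.reduce = nn.Sequential(
            nn.Conv2d(in_channels, reduced_channels, kernel_size=1),
            nn.BatchNorm2d(reduced_channels),
            nn.ReLU(inplace=True),
        )
        self.conv1d = nn.Conv1d(1, 1, kernel_size=3, padding=1)   # channel attention, Eq. (2)
        self.spatial = nn.Conv2d(2, 1, kernel_size=7, padding=3)  # spatial attention, Eq. (3)

    def forward(self, x):
        e = self.reduce(x)                                             # E
        # Eq. (2): channel-enhanced map E^C
        w = F.adaptive_avg_pool2d(e, 1).squeeze(-1).transpose(1, 2)    # B x 1 x C
        w = torch.sigmoid(self.conv1d(w)).transpose(1, 2).unsqueeze(-1)
        e_c = e * w
        # Eq. (3): spatially enhanced map E^S from channel-wise avg/max pooling
        avg = e_c.mean(dim=1, keepdim=True)
        mx, _ = e_c.max(dim=1, keepdim=True)
        s = torch.sigmoid(self.spatial(torch.cat([avg, mx], dim=1)))
        return e_c * s

def locate_and_mine_masks(attention_map, t_locate=0.5, t_mine=0.5):
    """Eqs. (4)-(5): binary masks from one randomly chosen channel A of E^S."""
    locate_mask = (attention_map > t_locate).float()   # keeps the salient part
    mine_mask = (attention_map <= t_mine).float()      # suppresses it, exposing other parts
    return locate_mask, mine_mask
```

In the paper, the locating branch additionally crops the smallest bounding box around the largest connected region of the locate mask before re-feeding I^locate to the network; that step is omitted here for brevity.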

3.2 Multi-layer Feature Fusion Module (MFF)

Previous works mainly rely on the feature map output from the last block of the network as the feature representation and search for significant information in it. Although it contains powerful semantic information, it has insufficient detail information, which is detrimental to the FGVC task. Therefore, we propose the Multi-layer Feature Fusion module(MFF), indicated by the blue dashed box in Fig. 2. It utilizes features from different layers to improve the classification


performance by complementing the detail information within the discriminative features. When fusing features from different layers, because the feature dimensions of these blocks are inconsistent, we use convolution layers with batch normalization to align them to a uniform feature dimension. Then, these features are fed into a pooling layer to obtain a compact image representation. Finally, these representations are sent to the classifier for classification, resulting in the predicted results p_n of the image, as shown in Eq. (6):

p_n^img = GAP(ReLU(ConV1(ReLU(ConV2(x_n)))))   (6)

where ConV1 and ConV2 are the convolution functions, ReLU is the activation function, and GAP is the global average pooling. Finally, an image has a total of N + 1 predictions, which will be aggregated into the final prediction.
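A compact sketch of how such per-block prediction heads might be implemented is given below. It is an assumption-laden illustration, not the released code: channel sizes, the 1×1/3×3 convolution pair standing in for ConV2/ConV1, the use of batch normalization, and the number of classes are placeholders.

```python
# Sketch of block-level prediction heads following Eq. (6).
import torch
import torch.nn as nn

class BlockHead(nn.Module):
    def __init__(self, in_channels, fuse_channels, num_classes):
        super().__init__()
        # ConV2 then ConV1, each followed by BN/ReLU (kernel sizes are assumptions)
        self.align = nn.Sequential(
            nn.Conv2d(in_channels, fuse_channels, 1), nn.BatchNorm2d(fuse_channels), nn.ReLU(inplace=True),
            nn.Conv2d(fuse_channels, fuse_channels, 3, padding=1), nn.BatchNorm2d(fuse_channels), nn.ReLU(inplace=True),
        )
        self.pool = nn.AdaptiveAvgPool2d(1)    # GAP
        self.fc = nn.Linear(fuse_channels, num_classes)

    def forward(self, x_n):
        f = self.pool(self.align(x_n)).flatten(1)
        return self.fc(f)                      # p_n^img

# Example: one head per late ResNet-50 block (channel counts 512/1024/2048);
# together with the fused prediction these give the N + 1 outputs to aggregate.
heads = nn.ModuleList([BlockHead(c, 512, 200) for c in (512, 1024, 2048)])
```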

3.3 Adaptive Label Loss

In the context of fine-grained recognition tasks, the selection of an appropriate loss function is crucial for the model to solve the problem effectively. Cross-entropy loss is the most commonly used loss function, where equal weights are assigned to each class. However, in a practical fine-grained visual classification task, there exist variations in discrimination difficulty and importance among different categories. To emphasize the difference between similar categories and motivate the network to learn common features, we devise an adaptive label loss which transforms the labels of individual categories into probability distributions. This method aims to allocate increased attention to challenging categories that are difficult to discriminate, thus enhancing the network's classification capacity for similar classes. As shown in Eq. (7):

ỹ^l = ε·y^l + (1 − ε)/L   (7)

where ε refers to the proportion coefficient between 0 and 1, which regulates the proportion of the real label y^l in the new label ỹ^l, and l indexes the elements of the label vector y ∈ R^L. Thus, the adaptive label loss can be expressed by Eq. (8):

L_AL(I) = − Σ_{n=1}^{N} Σ_{l=0}^{L−1} ỹ_n^l log(p_n^l)   (8)

Therefore, the total loss is calculated by Eq. (9):

L_total = L_AL(I^img) + L_AL(I^locate) + L_AL(I^mine)   (9)
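The following sketch shows one way to implement the adaptive label loss of Eqs. (7)-(9) as a soft-label cross-entropy. The helper names, the default ε, and the grouping of block-level predictions into lists are illustrative assumptions rather than the authors' exact formulation.

```python
# Minimal sketch of the adaptive label loss (Eqs. 7-9).
import torch
import torch.nn.functional as F

def adaptive_label_loss(logits, targets, epsilon):
    """Soft-label cross-entropy for one prediction head (Eqs. 7-8)."""
    num_classes = logits.size(1)
    smooth = epsilon * F.one_hot(targets, num_classes).float() \
             + (1.0 - epsilon) / num_classes                        # Eq. (7)
    return -(smooth * F.log_softmax(logits, dim=1)).sum(dim=1).mean()

def total_loss(preds_img, preds_locate, preds_mine, targets, eps=0.9):
    # Eq. (9): the three branches I^img, I^locate and I^mine; each preds_*
    # is a list of block-level logits so the inner sum mirrors Eq. (8).
    loss = 0.0
    for preds in (preds_img, preds_locate, preds_mine):
        loss = loss + sum(adaptive_label_loss(p, targets, eps) for p in preds)
    return loss
```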

4 Experiment

We validate the proposed method on three publicly available fine-grained visual classification datasets: CUB-200-2011 (CUB) [29], Stanford-Cars (CAR) [16],


and FGVC-aircraft (AIR) [23]. In order to ensure the fairness of the comparative experiment, we implement our method using ResNet50 [12] pre-trained on the ImageNet classification dataset as the backbone model. In the experiments, we only use image labels without employing any additional auxiliary annotations. We utilize an SGD optimizer with a momentum of 0.9, a weight decay of 2e-4, and a total training period of 200 epochs. To ensure stable training, we adopt a cosine annealing learning rate adaptation strategy. At the beginning of training, the learning rate of the backbone layers is set to 0.0002, and the remaining layers are set to 10 times the learning rate of the backbone. For the proportion coefficients ε, we simply set them to {0.7, 0.8, 0.9, 1.0} in ascending order, since larger values indicate higher confidence in class prediction.
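For reference, the optimizer and scheduler configuration described above could be set up roughly as follows. The plain ResNet-50 classifier and the parameter grouping are only stand-ins for the full framework, and the use of recent torchvision weight names is an assumption.

```python
# Sketch of the training setup: SGD (momentum 0.9, weight decay 2e-4),
# cosine annealing over 200 epochs, backbone LR 2e-4 with 10x LR elsewhere.
import torch
import torchvision

model = torchvision.models.resnet50(weights="IMAGENET1K_V1")
model.fc = torch.nn.Linear(model.fc.in_features, 200)   # e.g. CUB-200-2011 classes

backbone_params = [p for n, p in model.named_parameters() if not n.startswith("fc")]
new_params = list(model.fc.parameters())

optimizer = torch.optim.SGD(
    [{"params": backbone_params, "lr": 2e-4},
     {"params": new_params, "lr": 2e-3}],   # 10x the backbone learning rate
    momentum=0.9, weight_decay=2e-4)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=200)
```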

4.1 Comparison with the Latest Methods

Table 1. Comparison results of our proposed method with state-of-the-art approaches on three publicly available datasets. All the compared methods employ ResNet50 as the backbone network.

Method         | CUB  | CAR  | AIR   (Accuracy, %)
NTS [31]       | 87.5 | 93.3 | 91.4
DCL [3]        | 87.8 | 94.5 | 93.0
PA-CNN [37]    | 87.8 | 93.3 | 91.0
FDL [20]       | 88.6 | 94.3 | 93.4
DF-GMM [28]    | 88.8 | 94.8 | 93.8
PMG [7]        | 89.6 | 95.1 | 93.4
MMAL [33]      | 89.6 | 95.0 | 94.7
AP-CNN [6]     | 88.4 | 95.4 | 94.1
SnapMix [13]   | 87.8 | 94.3 | 92.8
FBSD [25]      | 89.3 | 94.4 | 92.7
FRA-MLFF [27]  | 89.5 | 94.8 | 93.0
CHRF [22]      | 89.4 | 95.2 | 93.6
CMN [5]        | 88.2 | –    | 93.8
Ours           | 89.7 | 95.7 | 95.8

Table 1 presents the evaluation results comparing our proposed method with recent approaches on three publicly available fine-grained visual classification datasets. As shown in the table, utilizing ResNet50 as the backbone network for all comparisons, our method consistently achieves the best performance across all three widely-used datasets.


Specifically, on the CUB dataset, our proposed approach achieves the best performance among the listed methods. Meanwhile, our method gains an accuracy of 95.7% on the CAR dataset, surpassing the previous state-of-the-art accuracy achieved by PMG. On the AIR dataset, our method exhibits a performance improvement of 1.1% over the top accuracy of MMAL, reaching a new state-of-the-art accuracy of 95.8%. Overall, the method proposed in this paper benefits from two advantages: 1) the attention map based locate-mine module learns discriminative features by locating parts and mines other valuable features by removing the strongly discriminative part, which contributes to the performance improvement; 2) by learning multi-layer features on the CNN backbone, our approach effectively captures both high-level semantic information and low-level detailed information. This complementary integration with distinctive features leads to improved classification performance and more accurate localization.

4.2 Ablation Experiments

Table 2. Contribution of the proposed components and their combinations.

Method | EM | PL | PM | MFF | AL-loss | Acc (%)
(a)    |    |    |    |     |         | 85.5
(b)    | ✓  |    |    |     |         | 85.8
(c)    | ✓  | ✓  |    |     |         | 87.5
(d)    | ✓  | ✓  | ✓  |     |         | 88.7
(e)    | ✓  | ✓  | ✓  | ✓   |         | 89.3
(f)    | ✓  | ✓  | ✓  | ✓   | ✓       | 89.7

To assess the impact of each component and the devised adaptive label loss on the performance improvement of the network, we conduct ablation experiments on each module using the CUB-200-2011 dataset. The test results are presented in Table 2. First, by adding the EM component into the ResNet50 backbone network, we achieved an accuracy of 85.8%, indicating that the series of enhancement operations on the feature map contributes to the performance improvement of the network. Subsequently, the addition of the PL component resulted in an accuracy of 87.5%, an improvement of 1.7%, demonstrating that locating and learning object parts effectively improves accuracy. Moreover, the inclusion of the PM component yielded a further improvement of 1.2%, indicating that the network's performance benefits from capturing multiple discriminative parts. Taking it a step further, by incorporating the MFF component, we achieved an accuracy of 89.4%. This result demonstrates the beneficial impact


of leveraging detailed information from the rich discriminative parts within the network for fine-grained classification. Finally, when the adaptive label loss was harnessed for supervised training, accuracy increased further to 89.7%, providing evidence that the AL-loss does indeed improve the network's performance.

4.3 Visualization

Fig. 3. Heatmaps of test samples from ResNet50 and our proposed method. Comparisons within each row reveal that our method exhibits a greater ability to attend to a larger number of discriminative parts compared to ResNet50.

In order to demonstrate the effectiveness of our proposed method, we have additionally provided visualizations of the heatmaps generated by both the ResNet50 network and our network on a subset of test samples. As illustrated in Fig. 3, the results clearly indicate that our method achieves superior accuracy in locating complete objects and their corresponding parts.

5 Summary

In this paper, we present a novel fine-grained classification network. It introduces the Attention Map-based Locate-Mine (AMLM) module to accurately locate highly discriminative parts and explore other available discriminative regions within the network, thereby facilitating the learning of valuable local part information to achieve favorable classification performance. It also introduces the Multi-Layer Feature Fusion (MFF) module, which provides rich and detailed information for the discriminative parts, promoting the network's focus on subtle but crucial local parts. The synergistic collaboration between these two modules enables the network to perceive the holistic view of the target and extract more


critical information. Moreover, the adaptive label loss also contributes to improving the discriminability between similar categories. Experimental results demonstrate the remarkable performance of our approach on three widely adopted benchmark datasets. Acknowledgements. This work is partially supported by the Guangxi Science and Technology Project (AD20159034, ZY20198016), the Open Funds from Guilin University of Electronic Technology, Guangxi Key Laboratory of Image and Graphic Intelligent Processing (GIIP2208) and the National Natural Science Foundation of China (61962014).

References 1. Bojarski, M., et al.: End to end learning for self-driving cars. arXiv preprint arXiv:1604.07316 (2016) 2. Branson, S., Van Horn, G., Belongie, S., Perona, P.: Bird species categorization using pose normalized deep convolutional nets. arXiv preprint arXiv:1406.2952 (2014) 3. Chen, Y., Bai, Y., Zhang, W., Mei, T.: Destruction and construction learning for fine-grained image recognition. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5157–5166 (2019) 4. Cui, Y., Zhou, F., Wang, J., Liu, X., Lin, Y., Belongie, S.: Kernel pooling for convolutional neural networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2921–2930 (2017) 5. Deng, W., Marsh, J., Gould, S., Zheng, L.: Fine-grained classification via categorical memory networks. IEEE Trans. Image Process. 31, 4186–4196 (2022) 6. Ding, Y.: Ap-cnn: weakly supervised attention pyramid convolutional neural network for fine-grained visual classification. IEEE Trans. Image Process. 30, 2826– 2836 (2021) 7. Du, R., Chang, D., Bhunia, A.K., Xie, J., Ma, Z., Song, Y.-Z., Guo, J.: Fine-grained visual classification via progressive multi-granularity training of jigsaw patches. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12365, pp. 153–168. Springer, Cham (2020). https://doi.org/10.1007/978-3-03058565-5_10 8. Fu, J., Wang, J., Rui, Y., Wang, X.J., Mei, T., Lu, H.: Image tag refinement with view-dependent concept representations. IEEE Trans. Circuits Syst. Video Technol. 25(8), 1409–1422 (2014) 9. Fu, J., Zheng, H., Mei, T.: Look closer to see better: recurrent attention convolutional neural network for fine-grained image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4438–4446 (2017) 10. Gao, Y., Beijbom, O., Zhang, N., Darrell, T.: Compact bilinear pooling. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 317–326 (2016) 11. Ge, W., Lin, X., Yu, Y.: Weakly supervised complementary parts models for finegrained image classification from the bottom up. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3034–3043 (2019) 12. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016)


13. Huang, S., Wang, X., Tao, D.: Snapmix: semantically proportional mixing for augmenting fine-grained data. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 35, pp. 1628–1636 (2021) 14. Huang, S., Xu, Z., Tao, D., Zhang, Y.: Part-stacked CNN for fine-grained visual categorization. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1173–1182 (2016) 15. Kong, S., Fowlkes, C.: Low-rank bilinear pooling for fine-grained classification. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 365–374 (2017) 16. Krause, J., Stark, M., Deng, J., Fei-Fei, L.: 3d object representations for finegrained categorization. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 554–561 (2013) 17. Lin, D., Shen, X., Lu, C., Jia, J.: Deep lac: deep localization, alignment and classification for fine-grained recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1666–1674 (2015) 18. Lin, T.Y., Maji, S.: Improved bilinear pooling with cnns. arXiv preprint arXiv:1707.06772 (2017) 19. Lin, T.Y., RoyChowdhury, A., Maji, S.: Bilinear CNN models for fine-grained visual recognition. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 1449–1457 (2015) 20. Liu, C., Xie, H., Zha, Z.J., Ma, L., Yu, L., Zhang, Y.: Filtration and distillation: enhancing region attention for fine-grained visual categorization. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 34, pp. 11555–11562 (2020) 21. Liu, W., Juhas, M., Zhang, Y.: Fine-grained breast cancer classification with bilinear convolutional neural networks (bcnns). Front. Genet. 11, 547327 (2020) 22. Liu, Y., Zhou, L., Zhang, P., Bai, X., Gu, L., Yu, X., Zhou, J., Hancock, E.R.: Where to focus: Investigating hierarchical attention relationship for fine-grained visual classification. In: Computer Vision-ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XXIV, pp. 57–73. Springer, Cham (2022). https://doi.org/10.1007/978-3-031-20053-3_4 23. Maji, S., Rahtu, E., Kannala, J., Blaschko, M., Vedaldi, A.: Fine-grained visual classification of aircraft. arXiv preprint arXiv:1306.5151 (2013) 24. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) 25. Song, J., Yang, R.: Feature boosting, suppression, and diversification for finegrained visual classification. In: 2021 International Joint Conference on Neural Networks (IJCNN), pp. 1–8. IEEE (2021) 26. Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., Wojna, Z.: Rethinking the inception architecture for computer vision. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2818–2826 (2016) 27. Wang, K., Tian, Q., Wang, Y., Liu, B.: Feature re-attention and multi-layer feature fusion for fine-grained visual classification. In: 2022 16th IEEE International Conference on Signal Processing (ICSP), vol. 1, pp. 95–100. IEEE (2022) 28. Wang, Z., Wang, S., Yang, S., Li, H., Li, J., Li, Z.: Weakly supervised fine-grained image classification via Guassian mixture model oriented discriminative learning. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 9749–9758 (2020) 29. Welinder, P., et al.: Caltech-ucsd birds 200 (2010) 30. 
Xu, X., Yang, C.C., Xiao, Y., Kong, J.L.: A fine-grained recognition neural network with high-order feature maps via graph-based embedding for natural bird diversity conservation. Int. J. Environ. Res. Public Health 20(6), 4924 (2023)


31. Yang, Z., Luo, T., Wang, D., Hu, Z., Gao, J., Wang, L.: Learning to navigate for fine-grained classification. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 420–435 (2018) 32. Yu, C., Zhao, X., Zheng, Q., Zhang, P., You, X.: Hierarchical bilinear pooling for fine-grained visual recognition. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 574–589 (2018) 33. Zhang, F., Li, M., Zhai, G., Liu, Y.: Multi-branch and multi-scale attention learning for fine-grained visual categorization. In: MultiMedia Modeling: 27th International Conference, MMM 2021, Prague, Czech Republic, June 22–24, 2021, Proceedings, Part I 27. pp. 136–147. Springer (2021) 34. Zhang, N., Donahue, J., Girshick, R., Darrell, T.: Part-based r-cnns for fine-grained category detection. In: Computer Vision-ECCV 2014: 13th European Conference, Zurich, Switzerland, September 6–12, 2014, Proceedings, Part I 13. pp. 834–849. Springer (2014) 35. Zhao, B., Wu, X., Feng, J., Peng, Q., Yan, S.: Diversified visual attention networks for fine-grained object classification. IEEE Trans. Multimedia 19(6), 1245–1256 (2017) 36. Zheng, H., Fu, J., Mei, T., Luo, J.: Learning multi-attention convolutional neural network for fine-grained image recognition. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 5209–5217 (2017) 37. Zheng, H., Fu, J., Zha, Z.J., Luo, J., Mei, T.: Learning rich part hierarchies with progressive attention networks for fine-grained image recognition. IEEE Trans. Image Process. 29, 476–488 (2019)

Geometric Distortion Correction for Extreme Scene Distortion of UAV-Acquired Images
Mark Phil B. Pacot1,2(B) and Nelson Marcos2

1 Caraga State University, Butuan, Caraga, Philippines

[email protected]

2 De La Salle University, Taft Ave., Manila, Philippines

[email protected]

Abstract. Geometric distortion, especially in extreme cases of scene distortion, is a prevalent issue in images taken by Unmanned Aerial Vehicles (UAVs). This research focuses on tackling this issue using a proposed approach that combines cutting-edge computer vision and photogrammetry techniques. It combines precise feature matching and image-warping algorithms, resulting in a more straightforward approach. By utilizing a Graph Neural Networks-based strategy and finetuning hyperparameters to strengthen the algorithm, the feature detection technique is further improved. To rectify spatial deformation within the scene, the homography estimation, which merges the Direct Linear Transform (DLT) approach with Singular Value Decomposition (SVD), is employed. Additionally, at the image registration phase, outliers are rejected using the Random Sample Consensus (RANSAC) technique. The image quality of the rectified images was evaluated using both objective and subjective image quality parameters. Overall, this research produces visually appealing image output in both extreme scene distortion correction and pair-wise image stitching applications, achieving favorable assessment findings based on SSIM, PSNR, and PIQE image quality metrics. Keywords: Nadir-based Distorted UAV Images · Homography-based Transformation · Graph Neural Networks-based Feature Detection · Image Registration · Rectified Aerial Images

1 Introduction
Unmanned Aerial Vehicles (UAVs) or drones have become increasingly popular in various applications including environmental monitoring, disaster assessment, and agricultural inspections, due to their ability to capture high-resolution images from aerial perspectives [1]. However, the images captured by drones often suffer from distortion, specifically caused by the complex interaction between the camera sensor, lens, and the three-dimensional scene [2]. This interaction is exacerbated by significant spatial transformations and rapid changes in altitude and orientation during image acquisition [3]. Geometric distortion is one of the prevalent issues in images acquired by UAVs [4], particularly in nadir images, which are taken from directly above a location, looking straight down towards the Earth's surface.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023. H. Lu et al. (Eds.): ACPR 2023, LNCS 14408, pp. 293–306, 2023. https://doi.org/10.1007/978-3-031-47665-5_24


This distortion is termed Extreme Scene Distortion (ESD); it results in inaccuracies in spatial measurements, hinders the extraction of reliable features, and impedes the subsequent analysis and interpretation of the UAV-acquired data. Efficient correction of this geometric distortion is crucial, particularly in UAV applications that require precise measurements and detailed scene reconstruction. In recent years, significant progress has been made, with researchers exploring various approaches, including camera calibration, image rectification, and image-warping techniques, to mitigate the effects of this type of distortion. However, the majority of these methods are limited in their applicability to the extreme scene distortions introduced in this work, as they often assume simplified scene models [5, 6]. By leveraging state-of-the-art techniques from computer vision and photogrammetry, the researchers propose a method that combines robust feature matching and image-warping techniques. This method is less complex and more straightforward compared to non-direct solutions for mitigating the aforementioned problem. The outcomes of this research could potentially enhance the quality and reliability of UAV-based applications, enabling more accurate data analysis, decision-making, and interpretation. The main contributions of this research work are (1) a simple approach that explores the combination of state-of-the-art robust feature-based matching and image-warping techniques to handle ESD; and (2) a new benchmark dataset representing ESD, which is valuable for researchers and practitioners with similar interests in developing new algorithms capable of correcting extreme and challenging distortions in UAV images. Section 2 of this paper reviews the literature on scene distortion correction in UAV-acquired images. Section 3 presents the proposed method, including the feature-matching algorithm and the image-warping algorithm for extreme scene distortion correction. In Sect. 4, the method is evaluated using objective metrics, including a no-reference image quality metric. Results are analyzed and compared with existing methods, highlighting the performance of rectifying UAV images. The conclusion summarizes the contributions and implications of the research in addressing extreme scene distortion and suggests future research directions in this field of study.

2 Related Work
2.1 Feature-Based Image Recognition
Feature-based image recognition finds applications in computer vision, robotics, medical imaging, and more. By focusing on meaningful features instead of processing the entire image, it mimics the human visual system. This scientific approach combines image processing techniques, pattern recognition algorithms, and machine learning methods to automate visual analysis and understanding. In the realm of image recognition, two broad categories of algorithms have emerged: hand-crafted feature algorithms and deep learning algorithms. Hand-crafted feature algorithms, including Speeded-Up Robust Features (SURF), Scale-Invariant Feature Transform (SIFT), Oriented FAST and Rotated BRIEF (ORB), and Histogram of Oriented Gradients (HOG), have traditionally relied on human-designed features to offer robustness in image analysis tasks. However, their limitations in generalization capabilities have become apparent.


On the other hand, deep learning algorithms, such as Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), and Generative Adversarial Networks (GANs), have revolutionized the field by automatically learning features from data and showcasing remarkable generalization and adaptability. Nevertheless, they necessitate substantial amounts of labeled training data and computational resources. Their impact is evident in various fields were automated image interpretation and decision-making play crucial roles. For instance, in their pioneering work [7], the researchers significantly advanced by incorporating a hand-crafted feature algorithm into their stitching algorithm for sequentially overlapping UAV images. This innovative approach involved utilizing feature point matching, explicitly leveraging the Oriented FAST and Rotated BRIEF (ORB) feature detection algorithm to characterize the overlapping regions present in the images accurately. Combining the ORB with Grid Motion Statistics achieved remarkable results, effectively clustering UAV images with similar overlaps and enabling realtime image stitching capabilities. In their study, researchers [8] emphasized the significance of analyzing image overlaps for effective image retrieval. They proposed a method that involved down-sampling image resolution and using a simple descriptor or feature-matching technique. The researchers utilized the powerful Scale-Invariant Feature Transform (SIFT) algorithm, which exhibited impressive capabilities in accurately analyzing image overlaps and handling incorrect feature point matches. The inherent characteristics of SIFT, including scale, rotation, and mapping invariance, played a crucial role in the success of overlap analysis. Leveraging the robustness and adaptability of SIFT, the researchers achieved both effectiveness and efficiency in image retrieval, underscoring the algorithm’s crucial role in precise image analysis and retrieval tasks. Additionally, the work of [6] introduces an efficient technique for aerial image stitching, employing a synergistic combination of SURF, PCA, and RANSAC algorithms. Their comparative analysis against SIFT features reveals their proposed algorithm’s superior performance in terms of feature extraction and resulting image quality. The computational time required for stitching using SURF is promising, suggesting potential for real-time applications. Recent trends in deep learning-based algorithms for featurebased image recognition have played a significant role in many applications such as medical image retrieval. They include leveraging deep learning for medical decision support systems, utilizing pre-trained models like ResNet-50 and VGG19 with ImageNet weights to identify similar dermatoscopic images and brain tumor MRI scans, employing the deep Siamese CNN architecture to extract fixed-length feature vectors, developing hash code generation methods, and creating systems for retrieving specific lung diseases in CT scans (deep feature-based image recognition approaches) [9]. These methods aim to enhance the efficiency and accuracy of medical image retrieval, catering to various medical domains. On the other hand, the emergence of 3D Graph Neural Networks (GNNs) has brought remarkable improvements to the field. By leveraging the multi-layer structure of GNNs, it becomes feasible to extract unique features and depth information from RGB images, facilitating robust and accurate segmentation outcomes [10]. 
These deep learning-based algorithms, with their ability to harness multi-layer networks and exploit contextual information, hold great promise for enhancing the task of image segmentation using RGB images [11]. The remarkable progress made in feature-based image recognition,


encompassing both handcrafted and deep learning-based algorithms, has profoundly impacted the applications related to UAV images. By integrating these methods, we can unlock new possibilities in extracting meaningful features and recognizing objects in aerial imagery captured by unmanned aerial vehicles (UAVs). The combination of handcrafted algorithms, carefully designed feature extraction techniques, and the power of deep learning models capable of learning intricate patterns and representations presents a compelling foundation for advancing UAV image analysis.
2.2 Geometric Distortion Correction
Geometric distortions commonly found in aerial imaging, including radial distortion, tangential distortion, perspective distortion, and keystoning, can impact both the accuracy and visual quality of the images. This study will focus on a specific type of distortion known as perspective distortion or geometric distortion, which has received limited attention in the field of aerial photogrammetry. Figure 1 illustrates the distortions caused by camera rotation in UAV photos, which are categorized as geometric distortion in this study.

Fig. 1. Extreme Scene Distortion Representation in UAV-Based Imagery.

In Fig. 1A, the box represents the region captured when the aerial camera rotates at various angles, including the pitch angle (p), roll angle (r), and yaw angle (y). Additionally, Fig. 1B displays the corresponding output, showing distortion occurring in one direction parallel to the scanning direction without displacement directly below the image sensor (nadir), and distortion resulting from the rotation of the scanning optics. It is important to note that as the sensor scans across each line, the distance between the sensor and the ground increases, particularly further away from the center of the scanned aerial image. This phenomenon, also known as tangential scale distortion, occurs when picture image features are compressed at positions away from the ground plane’s center. Camera calibration, a technique widely utilized by researchers, involves determining intrinsic and extrinsic parameters to accurately model the behavior of a camera [12]. This process is essential for correcting geometric distortion. However, it should be noted that camera calibration can be a computationally demanding task, often requiring sophisticated algorithms and extensive computational resources [13]. Additionally, achieving


accurate calibration results typically requires careful attention to detail, as even small errors in parameter estimation can significantly impact subsequent computer vision algorithms and applications [14]. Previous research has investigated alternative approaches, such as image registration or homography-based correction, which are widely recognized and employed techniques in the field of remotely sensed images. This technique encompasses crucial steps, including feature detection, feature matching, and homography estimation. For instance, [15] developed a local-adaptive image alignment strategy using triangular facet approximation. This method surpasses the standard homographic technique in aligning both perspective and non-perspective images. Incorporating Gaussian and Student t-weighting parameters enhances alignment and stitching, especially when dealing with diverse camera viewpoints that cause image parallax or ghosting. Their research represents a notable advancement, providing a robust and accurate solution for image alignment and stitching, pushing the boundaries of image processing techniques. The work of [16] made significant contributions to the field of image alignment and geometric distortion correction. They developed a robust elastic warping-based approach for parallax-tolerant image stitching, resulting in precise alignment and reduced computational overhead. Their work incorporated a Bayesian model for feature refinement, effectively removing inaccurate local matches. Additionally, they introduced a flexible strategy that combined global similarity transformation for the fast reprojection of warped images. These advancements contributed to the correction of geometric distortions, leading to improved alignment accuracy and the creation of seamless panoramic images. In their research, [17] focused on addressing severe perspective distortions present in a large collection of aerial images. To suppress reprojection errors and image parallax, they developed a reliable technique for achieving precise and globally consistent alignment of ground-scene objects. Their work specifically targeted geometric distortion correction, aiming to create natural-looking panoramas. In a related study, [18] also contributed to the field by creating a natural panorama through adaptive and as-naturalas-possible image stitching. Furthermore, the work of [19] focused on addressing the challenges posed by image parallax and the reliance on post-hoc de-ghosting techniques. To overcome these issues, they introduced a novel approach utilizing a moving direct linear transformation (MLDT) and a unique 2D warp function. Their innovative methodology resulted in an accurate image alignment strategy that effectively eliminated image parallax and reduced the need for post-processing de-ghosting. By incorporating these advancements, they achieved improved alignment consistency and enhanced the overall quality of the aligned images. The advent of deep learning-based algorithms holds significant promise for addressing geometric distortion correction challenges. These advanced algorithms leverage the power of deep learning techniques to effectively handle and rectify geometric distortions in various types of imagery, including aerial images. By employing neural networks and extensive training on large datasets, these methods demonstrate the potential to achieve accurate and reliable geometric distortion correction results. 
The ability of deep learning algorithms to learn complex patterns and relationships in data offers new possibilities for enhancing the correction process, potentially leading to improved precision and


efficiency in real-world applications [20]. Hence, this research aims to investigate the homography-based technique for geometric distortion correction while integrating deep learning-based feature image recognition to enhance the current approach. Figure 2 shows the block diagram of the proposed study.

Fig. 2. Block diagram of the Proposed Geometric Distortion Correction Technique.

The block diagram focuses on identifying and correcting geometric distortion in UAV images. First, the researchers employ a human-intervention-free method to recognize acquired UAV images with scene distortion, utilizing PIX4DMapper software. Second, distorted images are manually identified, and they are paired with undistorted images to create distinct image pairs (Image n and Image n + 1) for feature matching. However, if a distorted image lacks an undistorted pair (i.e., a sequence of two distorted images), the proposed solution is to skip it until predetermined criteria are met. Third, a GNN-based feature point extraction technique from [21] was used to extract distinct features, especially in regions where objects are affected by scene distortion. Fourth, a homography-based transformation is integrated, utilizing unique feature points identified by the introduced robust feature matching technique. These points will be fed into the last step. The final step known as image registration, warps the distorted image to align it with its paired undistorted image, thereby correcting the distortion.

3 Proposed Method
Different methods for correcting geometric distortion are available, but they mostly work well with mild distortions and face challenges when dealing with severe distortions. Therefore, there is a pressing need to develop robust and efficient methods specifically tailored to address extreme scene distortion in UAV-acquired images.
3.1 Dataset Description
This research utilizes a carefully selected dataset of images captured by a handheld SONY DSC-RX100M3 camera aboard a UAV. The camera boasts an impressive 20.1-megapixel Exmor R CMOS sensor, ensuring high-quality images with a spatial resolution of about 2 cm. The dataset consists of carefully captured images obtained


during the planned flight, providing valuable visual information for research and analysis purposes. These images are in the standard color format (sRGB) and have been accurately aligned to real-world coordinates using GPS data.
3.2 Geometric Distortion Detection
Pix4Dmapper, a powerful software solution, has demonstrated its potential in this work by automating the identification of images with distortion. By leveraging its advanced features, this software is capable of performing the mentioned task on images acquired by UAVs through its camera calibration functionality. Table 1 presents the detailed specifications of UAV flight instances, including the number of identified images with ESD. It is worth noting that the software serves as a tool used by the researchers to automate the process of identifying distorted images for use in the pairing process, and it does not guarantee superior performance. Instead, it provides valuable insights into the frequency and severity of distortions encountered during image processing.
Table 1. Extreme Scene Distortion Identification Using Pix4Dmapper software.

Flight No     | Flight Location                                     | Images with ESD
Observation 1 | Site 1 – Candaba, Pampanga (Poblacion), Philippines | 24/338 images
Observation 2 | Site 2 – Candaba, Pampanga (Tagalog), Philippines   | 4/214 images
Observation 3 | Site 1 – Candaba, Pampanga (Poblacion), Philippines | 16/339 images
Observation 4 | Site 2 – Candaba, Pampanga (Tagalog), Philippines   | 10/212 images

3.3 Extreme Scene Distortion Correction Proposed Solution
Feature Extraction and Detection. The proposed method utilizes deep learning-based feature extraction, detection, and matching using the algorithm made by [21]. This algorithm employs a Graph Neural Network (GNN), which efficiently detects sparse and unique feature points compared to traditional hand-crafted feature descriptor algorithms. By leveraging deep learning and graph-based computations, the GNN identifies and matches sparse and distinctive keypoints, reducing computational complexity and memory requirements. Hyperparameter tuning is performed by the researchers to optimize the performance of the feature extraction and detection process. The goal is to find the optimal combination of hyperparameter values that yields the best results in terms of extracted features and reliable correspondences between the distorted and undistorted images. Let the sets of feature points be:

F_Δ = {(x_δ, y_δ, f_δ)} and F_Υ = {(x_υ, y_υ, f_υ)}   (1)


where (x_δ, y_δ) and (x_υ, y_υ) are the pixel coordinates, and f_δ and f_υ are the feature descriptors of F_Δ and F_Υ for the distorted and undistorted images, respectively. The algorithm matches feature points between the distorted and undistorted images, resulting in a set of correspondences:

S = {(x_δ, y_δ, x_υ, y_υ, ς)}   (2)

where (xδ, yδ), (xυ, yυ) are the coordinates of matched feature points, ς is the matching score, and S is the resulting set of correspondences. Figure 3 showcases the representation of the feature points extracted and detected from both the distorted image (on the left) and the undistorted image (on the right), acquired through the utilization of Eq. 1 and Eq. 2. These sets of feature points comprise distinct coordinates, playing a crucial role in facilitating the homography-based transformation process, as explained in Sect. 3.3.
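For illustration, the sketch below shows one possible data layout for F_Δ, F_Υ and the correspondence set S of Eqs. (1)-(2). A simple mutual nearest-neighbour test stands in for the GNN-based matcher of [21], so the matching logic itself is only a placeholder under that assumption.

```python
# Placeholder matcher producing S = {(x_δ, y_δ, x_υ, y_υ, ς)} from keypoint arrays.
import numpy as np

def mutual_nn_matches(kp_d, desc_d, kp_u, desc_u):
    """kp_*: (N, 2) pixel coordinates; desc_*: (N, D) L2-normalised descriptors."""
    sim = desc_d @ desc_u.T                      # cosine similarity matrix
    fwd = sim.argmax(axis=1)                     # best undistorted match per distorted point
    bwd = sim.argmax(axis=0)                     # and vice versa
    keep = bwd[fwd] == np.arange(len(kp_d))      # keep mutually consistent pairs only
    scores = sim[np.arange(len(kp_d)), fwd][keep]
    return np.hstack([kp_d[keep], kp_u[fwd[keep]], scores[:, None]])   # (K, 5) array S
```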

Fig. 3. Sample Output of Feature Points Extraction and Detection.
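To make Eqs. 1 and 2 concrete, the following minimal Python sketch builds the two feature sets and the correspondence set S. It uses SIFT and a brute-force ratio-test matcher from OpenCV purely as a stand-in for the GNN-based extractor and matcher of [21]; the file names and the 0.75 ratio threshold are illustrative assumptions and not part of the original pipeline.

import cv2

# Hypothetical input pair; in the actual pipeline these are UAV images from Sect. 3.1.
distorted = cv2.imread("distorted.jpg", cv2.IMREAD_GRAYSCALE)
undistorted = cv2.imread("undistorted.jpg", cv2.IMREAD_GRAYSCALE)

# Feature sets F_delta and F_upsilon of Eq. 1: keypoint coordinates plus descriptors.
sift = cv2.SIFT_create()
kp_d, f_d = sift.detectAndCompute(distorted, None)
kp_u, f_u = sift.detectAndCompute(undistorted, None)

# Correspondence set S of Eq. 2: matched coordinates with a matching score.
matcher = cv2.BFMatcher(cv2.NORM_L2)
S = []
for m, n in matcher.knnMatch(f_d, f_u, k=2):
    if m.distance < 0.75 * n.distance:  # Lowe's ratio test (threshold is an assumption)
        (xd, yd) = kp_d[m.queryIdx].pt
        (xu, yu) = kp_u[m.trainIdx].pt
        S.append((xd, yd, xu, yu, m.distance))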

Homography-Based Transformation. The collected correspondences S are used for homography-based estimation [22]. The homography matrix H is computed using the Direct Linear Transformation (DLT) method and represented as a 3×3 matrix:

    H = ( h_11  h_12  h_13
          h_21  h_22  h_23
          h_31  h_32  h_33 )    (3)

Once the homography matrix H is estimated, it is used to perform a homography-based transformation of the distorted image. The transformation aligns the distorted image with the undistorted image by mapping pixel coordinates from the distorted image to transformed coordinates of the undistorted image, with a scaling factor. In addition to these steps, the RANSAC (Random Sample Consensus) algorithm is applied to improve the robustness of the homography estimation. RANSAC helps identify and eliminate outlier correspondences that would otherwise degrade the accuracy of the estimated homography matrix. By iteratively sampling subsets of correspondences and estimating the homography matrix, RANSAC selects the best-fitting model, i.e., the one that maximizes the number of inliers, as shown in Fig. 4. Figure 4 illustrates the matching of distinct and salient feature points within the observed images using RANSAC. This serves as a robust


Fig. 4. Sample Output of Matched Feature Points for Image Registration.

validation mechanism that filters out erroneous matches among the feature points in an iterative way. This procedure is essential for effective homography-based image registration, as it yields the homography matrix. As shown in Fig. 4B, the computed matrix encapsulates the geometric transformation between the two images, enabling accurate alignment and, where possible, correction of the spatial deformation. In general, this pipeline can be applied to align and register distorted UAV images with undistorted ones, and it constitutes the geometric distortion correction technique proposed in this work.
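Continuing the sketch above, the homography estimation and warping step can be realised with OpenCV's built-in DLT/RANSAC solver; the 3.0-pixel reprojection threshold is an illustrative choice and the variable names carry over from the previous fragment.

import cv2
import numpy as np

# Matched coordinates taken from the correspondence set S of Eq. 2.
pts_distorted = np.float32([(xd, yd) for xd, yd, xu, yu, score in S])
pts_undistorted = np.float32([(xu, yu) for xd, yd, xu, yu, score in S])

# Eq. 3: the 3x3 homography H, estimated by DLT inside a RANSAC loop that
# discards outlier correspondences (cf. Fig. 4).
H, inlier_mask = cv2.findHomography(pts_distorted, pts_undistorted, cv2.RANSAC, 3.0)

# Map the distorted image onto the geometry of the undistorted reference.
h, w = undistorted.shape[:2]
corrected = cv2.warpPerspective(distorted, H, (w, h))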

3.4 Image Quality Metrics

The researchers employed the Structural Similarity Index (SSIM) and Peak Signal-to-Noise Ratio (PSNR) as evaluation metrics, comparing the SSIM and PSNR values of the corrected images against the originally distorted ones to measure the improvement in image quality. Additionally, a Perceptual Image Quality Evaluator (PIQE) was used to evaluate the extended application of the proposed solution to pair-wise stitching. This evaluation covered visual attributes such as sharpness, contrast, color fidelity, and artifacts.
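The two full-reference metrics can be computed, for instance, with scikit-image, as sketched below; this generic snippet is not the authors' evaluation script, the file names are assumptions, and the no-reference PIQE metric is omitted because it has no widely established Python implementation.

import cv2
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

reference = cv2.imread("reference.jpg")   # comparison image (assumed file name)
corrected = cv2.imread("corrected.jpg")   # output of the proposed correction

psnr = peak_signal_noise_ratio(reference, corrected, data_range=255)
# channel_axis requires scikit-image >= 0.19; older versions use multichannel=True.
ssim = structural_similarity(reference, corrected, channel_axis=2, data_range=255)
print(f"PSNR: {psnr:.2f} dB, SSIM: {ssim:.4f}")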

4 Experimental Results

This section offers insights into the capabilities and limitations of the proposed technique, ultimately validating its potential to address geometric distortion challenges in aerial images captured by Unmanned Aerial Vehicles (UAVs). Figure 5 shows observed images that require geometric distortion correction due to minor spatial transformations that result in a noticeably erroneous orientation of the Earth's ground scene. Figure 6 presents the evaluation results of various geometric distortion correction techniques, including the proposed solution, based on the PSNR and SSIM evaluation metrics. The algorithms compared in this study are the methods of [23–32], each combined with homography-based correction, and the proposed solution.

Fig. 5. Sample Input Images with Extreme Scene Distortion (Distorted Image #1 and Distorted Image #2).

Fig. 6. Quantitative analysis of the proposed and existing solutions using PSNR (A) and SSIM (B).

The introduced algorithm clearly demonstrates promising results in both the Peak Signal-to-Noise Ratio (PSNR) and Structural Similarity Index (SSIM) image quality metrics by employing Graph Neural Networks (GNN) for feature matching together with image-warping techniques. These positive outcomes confirm the algorithm's viability as a method for correcting distortion in Unmanned Aerial Vehicle (UAV) imagery. Figure 7 showcases the qualitative evaluation of the proposed solution alongside other existing approaches, (B) [23], (C) [24], (D) [25], (E) [26], (F) [27], (G) [28], (H) [29], (I) [30], (J) [31], and (K) [32], for correcting distorted UAV images.

Fig. 7. Qualitative analysis of the proposed and other existing techniques.

In the evaluation, Image A, located in the first row, represents the initial distorted UAV image that requires correction. Subsequently, images B, C, and G display the


results obtained with individual existing approaches, each of which produces visually appealing images; however, these approaches were unable to rectify the specific type of distortion in question. The remaining results in Fig. 7 are of lower quality: images D to F exhibit degraded outcomes and images I to K produced errors, with one notable exception. The last image, labeled L and situated in the second row, shows the output of the proposed solution. It stands out as it successfully achieves geometric rectification of the UAV image, providing a satisfactory change in scene orientation that is a valid representation of a nadir-based UAV image. Figure 8 presents additional experiments conducted to evaluate the effectiveness of the proposed solution in correcting this specific type of geometric distortion in UAV images.

Fig. 8. Qualitative Analysis on additional distorted images using the proposed technique.

These experiments demonstrate successful rectification of the targeted distortions, yielding good-quality output images, as shown in the second row. The results in this figure highlight the capability of the proposed solution to address and correct the specific geometric distortion, since it minimizes the slight spatial transformation in the ground scene.

Fig. 9. Qualitative Analysis on additional distorted images using the proposed solution.

However, it is important to note that the proposed solution may have a limitation in terms of the resulting visual quality. While the distortion correction is significant, there


may be a slight decrease in overall visual quality compared to the original images. This trade-off between distortion correction and visual quality should be considered when evaluating the effectiveness of this proposed approach. Figure 9 specifically illustrates instances where this work encounters limitations or challenges, emphasizing the need for further improvements to optimize visual quality while effectively correcting the specific geometric distortions. In Fig. 10, the researchers conducted a comprehensive analysis of the proposed solution in the domain of pairwise image stitching.

Fig. 10. Extended evaluation of the proposed solution to pair-wise image stitching.

The evaluation, presented in Table 2 using the PIQE (Perceptual Image Quality Evaluator) metric, shows that the proposed solution achieves a high-quality stitched image, ranking second-best among the compared approaches. Nevertheless, upon subjective assessment through human visual perception, it becomes evident that the proposed solution effectively rectified the minor scene disorientation present in the output image.

Table 2. Quantitative analysis of pair-wise image stitching using PIQE.

Algorithm Used       PIQE Score ↓
AutoStitch [25]      20.0452
PTGui [26]           35.1569
PROPOSED METHOD      25.0506

5 Conclusion

This study addresses the challenge of correcting geometric distortions in images captured by UAVs, particularly in cases of extreme scene distortion. The proposed approach combines computer vision and photogrammetry techniques to estimate and correct distortions caused by irregular terrain and UAV motion. By utilizing feature matching and image-warping algorithms, the method achieves effective distortion correction. Additionally, a benchmark dataset is introduced to evaluate algorithms specifically designed


for extreme distortion in UAV images. This research makes a significant contribution to the field of geometric distortion correction in UAV-acquired images. Furthermore, the proposed method has potential applications beyond distortion correction. It can be used for pair-wise image stitching, enabling the creation of seamless mosaics or panoramas from multiple UAV images. This feature enhances its practicality in various domains such as remote sensing, surveillance, aerial photography, and other fields that rely on accurate visual information. Overall, this study paves the way for future research in UAV imaging systems, with the potential to improve image quality, enhance analysis capabilities, and expand the applications of UAV-acquired imagery.

References 1. Deliry, S.I., Avdan, U.: Accuracy of unmanned aerial systems photogrammetry and structure from motion in surveying and mapping: a review. J. Indian Soc. Remote Sens. 49(8), 1997– 2017 (2021) 2. Li, A., Guo, J., Guo, Y.: Image stitching based on semantic planar region consensus. IEEE Trans. Image Process. 30, 5545–5558 (2021) 3. Kim, J.I., Kim, T., Shin, D., Kim, S.: Fast and robust geometric correction for mosaicking UAV images with narrow overlaps. Int. J. Remote Sens. 38(8–10), 2557–2576 (2017) 4. Wang, Y., Cong, Q., Yao, S., Jia, X., Chen, J., Li, S.: Research on geometric error correction of pushbroom hyperspectral camera carried by UAV. In: Seventh Symposium on Novel Photoelectronic Detection Technology and Applications, vol. 11763, pp. 1214–1220. SPIE (2021) 5. Vinegoni, C., Lee, S., Aguirre, A.D., Weissleder, R.: New techniques for motion-artifact-free in vivo cardiac microscopy. Front. Physiol. 6, 147 (2015) 6. Rathnayake, R.M.N.B., Seneviratne, L.: An efficient approach towards image stitching in aerial images. In: 2018 3rd International Conference on Information Technology Research (ICITR), pp. 1–6. IEEE (2018) 7. Li, C., Guo, B., Guo, X., Zhi, Y.: Real-time UAV imagery stitching based on grid-based motion statistics. J. Phys. Conf. Ser. 1069(1), 012163 (2018). IOP Publishing 8. Xing, C., Wang, J., Xu, Y.: Overlap analysis of the images from unmanned aerial vehicles. In: 2010 International Conference on Electrical and Control Engineering, pp. 1459–1462. IEEE (2010) 9. Kose, U., Deperlioglu, O., Alzubi, J., Patrut, B.: Deep Learning for Medical Decision Support Systems. Springer, Berlin (2021) 10. Qi, X., Liao, R., Jia, J., Fidler, S., Urtasun, R.: 3d graph neural networks for RGBD semantic segmentation. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 5199–5208 (2017) 11. Fooladgar, F., Kasaei, S.: A survey on indoor RGB-D semantic segmentation: from handcrafted features to deep convolutional neural networks. Multim. Tools Appl. 79, 4499–4524 (2020) 12. Zhang, Z.: Flexible camera calibration by viewing a plane from unknown orientations. In: Proceedings of the Seventh IEEE International Conference on Computer Vision, vol. 1, pp. 666–673. IEEE (1999) 13. Kjaer-Nielsen, A., Jensen, L.B.W., Sørensen, A.S., Krüger, N.: A real-time embedded system for stereo vision preprocessing using an FPGA. In: 2008 International Conference on Reconfigurable Computing and FPGAs, pp. 37–42. IEEE (2008)


14. Liao, K., Lin, C., Zhao, Y.: A deep ordinal distortion estimation approach for distortion rectification. IEEE Trans. Image Process. 30, 3362–3375 (2021) 15. Li, J., Deng, B., Tang, R., Wang, Z., Yan, Y.: Local-adaptive image alignment based on triangular facet approximation. IEEE Trans. Image Process. 29, 2356–2369 (2019) 16. Li, J., Wang, Z., Lai, S., Zhai, Y., Zhang, M.: Parallax-tolerant image stitching based on robust elastic warping. IEEE Trans. Multim. 20(7), 1672–1687 (2017) 17. Xia, M., Yao, M., Li, L., Lu, X.: Globally consistent alignment for mosaicking aerial images. In: 2015 IEEE International Conference on Image Processing (ICIP), pp. 3039–3043. IEEE (2015) 18. Lin, C.C., Pankanti, S.U., Natesan Ramamurthy, K., Aravkin, A.Y.: Adaptive as-natural-aspossible image stitching. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1155–1163 (2015) 19. Zaragoza, J., Chin, T.J., Brown, M.S., Suter, D.: As-projective-as-possible image stitching with moving DLT. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2339–2346 (2013) 20. Qi, X., Chen, G., Li, Y., Cheng, X., Li, C.: Applying neural-network-based machine learning to additive manufacturing: current applications, challenges, and future perspectives. Engineering 5(4), 721–729 (2019) 21. Sarlin, P.E., DeTone, D., Malisiewicz, T., Rabinovich, A.: Superglue: learning feature matching with graph neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 4938–4947 (2020) 22. Dubrofsky, E.: Homography Estimation. Diplomová Práce, p. 5. Univerzita Britské Kolumbie, Vancouver (2009) 23. Hartley, R., Zisserman, A.: Multiple View Geometry in Computer Vision. Cambridge University Press (2003) 24. Xu, M.: Comparison and research of fisheye image correction algorithms in coal mine survey. IOP Conf. Ser. Earth Environ. Sci. 300(2), 022075 (2019). IOP Publishing 25. Guan, B., Zhao, J., Li, Z., Sun, F., Fraunhofer, F.: Minimal solutions for relative pose with a single affine correspondence. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 1929–1938 (2020) 26. Lowe, G.: Sift-the scale invariant feature transform. Int. J. 2(91–110), 2 (2004) 27. Rublee, E., Rabaud, V., Konolige, K., Bradski, G.: ORB: an efficient alternative to SIFT or SURF. In: 2011 International Conference on Computer Vision, pp. 2564–2571. IEEE (2011) 28. Bay, H., Ess, A., Tuytelaars, T., Van Gool, L.: Speeded-up robust features (SURF). Comput. Vis. Image Underst. 110(3), 346–359 (2008) 29. Alcantarilla, P.F., Bartoli, A., Davison, A.J.: KAZE features. In: Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C. (eds.) ECCV 2012. LNCS, vol. 7577, pp. 214–227. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-33783-3_16 30. Viswanathan, D.G.: Features from accelerated segment test (FAST). In: Proceedings of the 10th Workshop on Image Analysis for Multimedia Interactive Services, London, pp. 6–8 (2009) 31. Leutenegger, S., Chli, M., Siegwart, R.Y.: BRISK: binary robust invariant scalable keypoints. In: 2011 International Conference on Computer Vision, pp. 2548–2555. IEEE (2011) 32. Derpanis, K.G.: The Harris corner detector. York University 2, 1–2 (2004)

Q-YOLO: Efficient Inference for Real-Time Object Detection

Mingze Wang1, Huixin Sun1, Jun Shi1, Xuhui Liu1, Xianbin Cao1(B), Luping Zhang2, and Baochang Zhang1,3

1 Beihang University, Beijing, China
[email protected]
2 National University of Defense Technology, Changsha, China
3 Nanchang Institute of Technology, Nanchang, China

Abstract. Real-time object detection plays a vital role in various computer vision applications. However, deploying real-time object detectors on resource-constrained platforms poses challenges due to high computational and memory requirements. This paper describes a low-bit quantization method to build a highly efficient one-stage detector, dubbed Q-YOLO, which effectively addresses the performance degradation caused by activation distribution imbalance in traditional quantized YOLO models. Q-YOLO introduces a fully end-to-end Post-Training Quantization (PTQ) pipeline with a well-designed Unilateral Histogram-based (UH) activation quantization scheme, which determines the maximum truncation values through histogram analysis by minimizing the Mean Squared Error (MSE) quantization error. Extensive experiments on the COCO dataset demonstrate the effectiveness of Q-YOLO, which outperforms other PTQ methods while achieving a more favorable balance between accuracy and computational cost. This research contributes to advancing the efficient deployment of object detection models on resource-limited edge devices, enabling real-time detection with reduced computational and memory overhead.

Keywords: Real-time Object Detection · Post-training Quantization

1 Introduction

M. Wang, H. Sun and J. Shi—Equal contribution.
"One Thousand Plan" projects in Jiangxi Province, Jxsg2023102268.

Real-time object detection is a crucial component in various computer vision applications, such as multi-object tracking [42,43], autonomous driving [7,15], and robotics [13,25]. The development of real-time object detectors, particularly YOLO-based detectors, has yielded remarkable performance in terms of accuracy and speed. For example, the YOLOv7-E6 [34] object detector achieves 55.9% mAP on COCO 2017, outperforming both the transformer-based detector SWIN-L Cascade-Mask R-CNN [4,22] and the convolution-based detector ConvNeXt-XL


Cascade-Mask R-CNN [4,36] in both speed and accuracy. Despite their success, the computational cost during inference remains a challenge for real-time object detectors on resource-limited edge devices, such as mobile CPUs or GPUs, limiting their practical usage.

Fig. 1. Activation value distribution histogram (with 2048 bins) of the model.21.conv layer in YOLOv5s. The occurrence of values between 0 and −0.2785 is extremely high, while the frequency of values above zero decreases significantly, revealing an imbalanced pattern. min denotes the fixed minimum truncation value, while max represents the maximum truncation value following the min-max principle. Max Q-YOLO(8) refers to the maximum truncation value when using the Q-YOLO quantization model at 8-bit, and Max Q-YOLO(4) indicates the maximum truncation value when applying the Q-YOLO quantization model at 4-bit. (Color figure online)

Substantial efforts on network compression have been made towards efficient online inference [5,26,31,39]. Methods include enhancing network designs [10,37, 41], conducting network search [46], network pruning [8,9], and network quantization [17]. Quantization, in particular, has gained significant popularity for deployment on AI chips by representing a network using low-bit formats. There are two prevailing quantization methods, Quantization-Aware Training (QAT) [17,38] and Post-Training Quantization (PTQ) [20]. Although QAT generally achieves better results than PTQ, it requires training and optimization of all model parameters during the quantization process. The need for pretraining data and significant GPU resources makes QAT challenging to execute. On the other hand, PTQ is a more efficient approach for quantizing real-time object detectors. To examine low-bit quantization for real-time object detection, we first establish a PTQ baseline using YOLOv5 [33], a state-of-the-art object detector. Through empirical analysis on the COCO 2017 dataset, we observe notable performance degradation after quantization, as indicated in Table 1. For example, a 4-bit quantized YOLOv5s employing Percentile achieves only 7.0% mAP, resulting in a performance gap of 30.4% compared to the original real-valued model. We find the performance drop of quantized YOLOs can be attributed


to the activation distribution imbalance. As shown in Fig. 1, we observe a high concentration of values close to the lower bound and a significant decrease in occurrences above zero. When fixed truncation values such as MinMax are employed, representing activation values with extremely low probabilities consumes a considerable number of bits within the limited integer bit width, resulting in further loss of information. In light of the above issue, we introduce Q-YOLO, a fully end-to-end PTQ quantization architecture for real-time object detection, as depicted in Fig. 2. Q-YOLO quantizes the backbone, neck, and head modules of YOLO models, while employing standard MinMax quantization for weights. To tackle the problem of activation distribution imbalance, we introduce a novel approach called Unilateral Histogram-based (UH) activation quantization. UH iteratively determines the maximum truncation value that minimizes the quantization error through histograms. This technique significantly reduces calibration time and effectively addresses the discrepancy caused by quantization, optimizing the quantization process to maintain stable activation quantization. By mitigating information loss in activation quantization, our method ensures accurate object detection results, thereby enabling precise and reliable low-bit real-time object detection. Our contributions can be summarized as follows:

1. We introduce a fully end-to-end PTQ quantization architecture specifically designed for real-time object detection, dubbed Q-YOLO.
2. A Unilateral Histogram-based (UH) activation quantization method is proposed that leverages histogram analysis to find the maximum truncation values, which effectively minimizes the MSE quantization error.
3. Through extensive experiments on various object detectors, we demonstrate that Q-YOLO outperforms baseline PTQ models by a significant margin. The 8-bit Q-YOLO model applied on YOLOv7 achieves a 3× acceleration while maintaining performance comparable to its full-precision counterpart on COCO, highlighting its potential as a general solution for quantizing real-time object detectors.

2 Related Work

2.1 Quantization

Quantized neural networks rely on low-bit weights and activations to accelerate model inference and save memory. The commonly used model quantization methods are quantization-aware training (QAT) and post-training quantization (PTQ). In QAT, Zhang et al. [40] build a binarized convolutional neural network based on a projection function and a new update rule used during backpropagation. Li et al. [17] proposed an information rectification module and distribution-guided distillation to push down the bit-width in a quantized vision transformer. TTQ [44] uses two real-valued scaling coefficients to quantize the weights to ternary values. Zhuang et al. [45] present a low-bit (2-4 bit) quantization scheme using a two-stage approach to alternately quantize the weights


Fig. 2. Architecture of Q-YOLO.

and activations, providing an optimal trade-off among memory, efficiency, and performance. In [12], the quantization intervals are parameterized, and optimal values are obtained by directly minimizing the task loss of the network. ZeroQ [3] supports uniform and mixed-precision quantization by optimizing for a distilled dataset engineered to match the statistics of the batch normalization across different network layers. [6] enabled accurate approximation for tensor values that have bell-shaped distributions with long tails and found the entire range by minimizing the quantization error. However, QAT often requires high-level expert knowledge and huge GPU resources for training or fine-tuning, especially for large-scale pre-trained models. To reduce these costs, PTQ, which is training-free, has received more widespread attention and many excellent works have emerged. MinMax and EMA [11] are commonly used methods for compressing or reducing the weights of a PTQ model. MinMax normalizes the weights and bias values in the model to a predefined range, such as [−1, 1], to reduce storage and increase inference speed. MSE quantization involves evaluating and adjusting the quantized activation values to minimize the impact of quantization on model performance.

2.2 Real-Time Object Detection

Deep Learning based object detectors can be generally classified into two categories: two-stage and single-stage object detectors. Two-stage detectors, such as Faster R-CNN [30], RPN [18], and Cascade R-CNN [4], first generate region proposals and then refine them in a second stage. On the other hand, single-stage object detectors have gained significant popularity in real-time object detection due to their efficiency and effectiveness. These detectors aim to predict object bounding boxes and class labels in a single pass of the neural network, eliminating


the need for time-consuming region proposal generation. One of the pioneering single-shot detectors is YOLO [27], which divides the input image into a grid and assigns bounding boxes and class probabilities to predefined anchor boxes. The subsequent versions, YOLOv2 [28] and YOLOv3 [29], introduced improvements in terms of network architecture and feature extraction, achieving better accuracy without compromising real-time performance. Another influential single-shot detector is SSD [21], which employs a series of convolutional layers at different scales to detect objects of various sizes. By using feature maps at multiple resolutions, SSD achieves high accuracy while maintaining real-time performance. Variants of SSD, such as MobileNet-SSD [10] and Pelee [35], further optimize the architecture to achieve faster inference on resource-constrained devices. Efficiency is a critical aspect of real-time object detection, especially for deployment on computationally limited platforms. MobileNet [10] and its subsequent variants, such as MobileNetV2 [32] and MobileNetV3 [14], have received significant attention for their lightweight architectures. These networks utilize depth-wise separable convolutions and other techniques to reduce the number of parameters and operations without significant accuracy degradation. ShuffleNet [41] introduces channel shuffling operations to exploit group convolutions, enabling a trade-off between model size and computational cost. ShuffleNetV2 [23] further improves the efficiency by introducing a more efficient block design and exploring different network scales.

3 Methodology

3.1 Preliminaries

Network Quantization Process. We first review the main steps of the Post-Training Quantization (PTQ) process. Firstly, the network is either trained or provided as a pre-trained model using full-precision, floating-point arithmetic for weights and activations. Subsequently, the numerical representations of weights and activations are suitably transformed for quantization. Finally, the fully-quantized network is deployed either on integer-arithmetic hardware or simulated on GPUs, enabling efficient inference with reduced memory storage and computational requirements while maintaining reasonable accuracy.

Uniform Quantization. Assuming the quantization bit-width is b, the quantizer Q(x|b) can be formulated as a function that maps a floating-point number x ∈ R to the nearest quantization bin:

Q(x|b): R → x̂,    (1)

x̂ ∈ {−2^(b−1), ..., 2^(b−1) − 1}  (Signed),
x̂ ∈ {0, ..., 2^b − 1}  (Unsigned).    (2)


There are various choices of quantizer Q(x|b), among which the uniform quantizer [11] is typically used because it is well supported on most hardware platforms. Its unsigned form can be defined as:

Q(x|b) = clip(⌊x / s_x⌉ + zp_x, 0, 2^b − 1),    (3)

where s_x (scale) and zp_x (zero-point) are quantization parameters. In Eq. 4, u (upper) and l (lower) define the quantization grid limits:

s_x = (u − l) / (2^b − 1),   zp_x = clip(⌊−l / s_x⌉, 0, 2^b − 1).    (4)

The dequantization process can be formulated as:

x̃ = (x̂ − zp_x) × s_x.    (5)
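A small numpy sketch of the asymmetric uniform quantizer defined by Eqs. 3–5 is given below; it is written for illustration only, with the clipping range and bit-width passed in explicitly.

import numpy as np

def quantize(x, lower, upper, bits=8):
    # Eq. 4: scale and zero-point derived from the clipping range [lower, upper].
    s = (upper - lower) / (2 ** bits - 1)
    zp = np.clip(np.round(-lower / s), 0, 2 ** bits - 1)
    # Eq. 3: scale, round, shift by the zero-point and clip to the integer grid.
    return np.clip(np.round(x / s) + zp, 0, 2 ** bits - 1), s, zp

def dequantize(x_q, s, zp):
    # Eq. 5: map the integer grid back to (approximate) real values.
    return (x_q - zp) * s

x = np.random.randn(1000).astype(np.float32)
x_q, s, zp = quantize(x, x.min(), x.max(), bits=8)
x_rec = dequantize(x_q, s, zp)   # reconstruction with quantization error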

3.2 Quantization Range Setting

Quantization range setting is the process of establishing the upper and lower clipping thresholds, denoted u and l respectively, of the quantization grid. The crucial trade-off in range setting lies in the balance between two types of errors: clipping error and rounding error. Clipping error arises when data are truncated to fit within the predefined grid limits, as described in Eq. 4; such truncation leads to information loss and a decrease in precision of the resulting quantized representation. Rounding error, on the other hand, is introduced by the rounding operation in Eq. 3; it can accumulate and affects the overall accuracy of the quantized representation. The following methods provide different trade-offs between the two quantities.

MinMax. In the experiments, we use the MinMax method for weight quantization, where the clipping thresholds l_x and u_x are formulated as:

l_x = min(x),   u_x = max(x).    (6)

This leads to no clipping error. However, the approach is sensitive to outliers, as strong outliers may cause excessive rounding errors.

Mean Squared Error (MSE). One way to mitigate the problem of large outliers is to employ MSE-based range setting. In this method, we determine l_x and u_x such that the mean squared error (MSE) between the original and quantized tensor is minimized:

arg min_{l_x, u_x} MSE(x, Q_{l_x, u_x}),    (7)

where x represents the original tensor and Q_{l_x, u_x} denotes the quantized tensor produced with the clipping thresholds l_x and u_x. The optimization problem is commonly solved using grid search, the golden-section method, or analytical approximations with closed-form solutions.
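A possible (and deliberately simple) grid-search realisation of the MSE criterion in Eq. 7 is sketched below, reusing the quantize/dequantize helpers from the previous fragment; the number of grid candidates and the proportional shrinking of both thresholds are illustrative assumptions.

import numpy as np

def mse_range_search(x, bits=8, n_grid=100):
    # Shrink the MinMax range step by step and keep the thresholds (l_x, u_x)
    # that minimise the quantization MSE of Eq. 7.
    best, best_err = (x.min(), x.max()), np.inf
    for k in range(n_grid):
        shrink = 1.0 - k / n_grid
        l, u = x.min() * shrink, x.max() * shrink
        x_q, s, zp = quantize(np.clip(x, l, u), l, u, bits)
        err = np.mean((dequantize(x_q, s, zp) - x) ** 2)
        if err < best_err:
            best, best_err = (l, u), err
    return best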

3.3 Unilateral Histogram-Based (UH) Activation Quantization

To address the issue of activation value imbalance, we propose a new approach called Unilateral Histogram-based (UH) activation quantization. We first provide an empirical study of the activation values after forward propagation through the calibration dataset. As depicted in Fig. 1, we observe a concentrated distribution of values near the lower bound, accompanied by a noticeable decrease in occurrences above zero. Further analysis of the activation values reveals that the empirical value of −0.2785 serves as the lower bound. This phenomenon can be attributed to the frequent use of the Swish (SiLU) activation function in the YOLO series, whose global minimum is approximately −0.2785, so no activation can fall below this value.

Algorithm 1. Unilateral Histogram-based (UH) Activation Quantization
1: Input: FP32 histogram H with 2048 bins
2: for i in range(128, 2048) do
3:   Reference distribution P ← H[0 : i]
4:   Outliers count c ← Σ_{j=i}^{2047} H[j]
5:   P[i − 1] ← P[i − 1] + c
6:   P ← P / Σ_j P[j]
7:   Candidate distribution C ← Quantize H[0 : i] into 128 levels
8:   Expand C to have i bins
9:   Q ← C / Σ_j C[j]
10:  MSE[i] ← Mean Squared Error(P, Q)
11: end for
12: Output: Index m for which MSE[m] is minimal.

Based on this empirical evidence, we introduce an asymmetric quantization approach called Unilateral Histogram-based (UH) activation quantization. In UH, we iteratively determine the maximum truncation value that minimizes the quantization error, while keeping the minimum truncation value fixed at −0.2785:

u_x = arg min_{u_x} MSE(x, Q_{l_x, u_x}),   l_x = −0.2785.    (8)

To evaluate the quantization error during the search for the maximum truncation value, we use the FP32 floating-point numbers derived from the center values of the gathered 2048 bins, as introduced in Algorithm 1. These numbers are successively quantized under each candidate maximum truncation value, and through this iterative process we identify the optimal truncation range. The UH activation quantization method offers two key advantages. Firstly, it significantly reduces calibration time. Secondly, it ensures stable activation quantization by allowing a larger set of integers to represent the frequently occurring activation values between 0 and −0.2785, thereby improving quantization accuracy.
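The calibration step can be sketched in a simplified numpy form as follows, again reusing the quantize/dequantize helpers from Sect. 3.1: the lower truncation value is fixed at −0.2785, candidate upper values are taken from the 2048-bin histogram, and each candidate is scored by the count-weighted quantization error of the bin centers. This is an illustration of the idea under those assumptions, not the authors' implementation of Algorithm 1.

import numpy as np

def uh_calibrate(activations, bits=8, n_bins=2048, lower=-0.2785):
    # Histogram of the calibration activations (Fig. 1 uses 2048 bins).
    hist, edges = np.histogram(activations, bins=n_bins)
    centers = 0.5 * (edges[:-1] + edges[1:])
    best_u, best_err = edges[-1], np.inf
    # Sweep progressively larger truncation points, as in Algorithm 1.
    for i in range(128, n_bins):
        upper = edges[i]
        x_q, s, zp = quantize(np.clip(centers, lower, upper), lower, upper, bits)
        err = np.sum(hist * (dequantize(x_q, s, zp) - centers) ** 2)
        if err < best_err:
            best_u, best_err = upper, err
    return lower, best_u

In a full PTQ pipeline this search would be run once per activation tensor on the calibration set, and the resulting (l_x, u_x) pair plugged into Eq. 3.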


Table 1. A comparison of various quantization methods applied to YOLOv5s [33], YOLOv5m [33], YOLOv7 [34] and YOLOv7x [34], which have an increasing number of parameters, on the COCO val2017 dataset [19]. The term Bits (W-A) represents the bit-width of weights and activations. The best results are displayed in bold.

Models          Method           Bits (W-A)  Size(MB)  OPs(G)  AP    AP50  AP75  APs   APm   APl
YOLOv5s [33]    Real-valued      32-32       57.6      16.5    37.4  57.1  40.1  21.6  42.3  48.9
                MinMax           8-8         14.4      4.23    37.2  56.9  39.8  21.4  42.2  48.5
                Percentile [16]  8-8         14.4      4.23    36.9  56.4  39.6  21.3  42.4  48.1
                Q-YOLO           8-8         14.4      4.23    37.4  56.9  39.8  21.4  42.4  48.8
                Percentile [16]  4-4         7.7       2.16    7.0   14.2  6.3   4.1   10.7  7.9
                Q-YOLO           4-4         7.7       2.16    14.0  26.2  13.5  7.9   17.6  19.0
YOLOv5m [33]    Real-valued      32-32       169.6     49.0    45.1  64.1  49    28.1  50.6  57.8
                MinMax           8-8         42.4      12.4    44.9  64    48.9  27.8  50.5  57.4
                Percentile [16]  8-8         42.4      12.4    44.6  63.5  48.4  28.4  50.4  57.8
                Q-YOLO           8-8         42.4      12.4    45.1  64.1  48.9  28    50.6  57.7
                Percentile [16]  4-4         21.2      6.33    19.4  35.6  19.1  14.6  28.3  17.2
                Q-YOLO           4-4         21.2      6.33    28.8  46    30.5  15.4  33.8  38.7
YOLOv7 [34]     Real-valued      32-32       295.2     104.7   50.8  69.6  54.9  34.9  55.6  66.3
                MinMax           8-8         73.8      27.2    50.6  69.5  54.8  34.1  55.5  65.9
                Percentile [16]  8-8         73.8      27.2    50.5  69.3  54.6  34.5  55.4  66.2
                Q-YOLO           8-8         73.8      27.2    50.7  69.5  54.8  34.8  55.5  66.2
                Percentile [16]  4-4         36.9      14.1    16.7  26.9  17.8  10.3  20.1  20.2
                Q-YOLO           4-4         36.9      14.1    37.3  55.0  40.9  21.5  41.4  53.0
YOLOv7x [34]    Real-valued      32-32       25.5      189.9   52.5  71.0  56.6  36.6  57.3  68.0
                MinMax           8-8         142.6     49.5    52.3  70.9  56.7  36.6  57.1  67.7
                Percentile [16]  8-8         142.6     49.5    52.0  70.5  56.1  36.0  56.8  67.9
                Q-YOLO           8-8         142.6     49.5    52.4  70.9  56.5  36.2  57.2  67.8
                Percentile [16]  4-4         71.3      25.6    36.8  55.3  40.5  21.2  41.7  49.3
                Q-YOLO           4-4         71.3      25.6    37.6  57.8  42.1  23.7  43.8  49.1

In order to assess the performance of the proposed Q-YOLO detectors, we conducted a comprehensive series of experiments on the widely recognized COCO 2017 [19] detection benchmark. As one of the most popular object detection datasets, COCO 2017 [19] has become instrumental in benchmarking state-ofthe-art object detectors, thanks to its rich annotations and challenging scenarios. Throughout our experimental analysis, we employed standard COCO metrics on the bounding box detection task to evaluate the efficacy of our approach. 4.1

Implementation Details

We randomly selected 1500 training images from the COCO train2017 dataset [19] as the calibration data, which served as the foundation for optimizing the


model parameters. Additionally, the performance evaluation took place on the COCO val2017 dataset [19], comprising 5000 images. The image size is set to 640 × 640. In our experiments, unless otherwise noted, we employed symmetric channel-wise quantization for weights and asymmetric layer-wise quantization for activations. To ensure a fair and unbiased comparison, we consistently applied the MinMax approach for quantizing weights. The input and output layers of the model are more sensitive to accuracy loss, so their original precision is usually retained in order to maintain the overall performance of the model; we follow this practice as well.
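As an illustration of the symmetric channel-wise MinMax weight quantization mentioned above, the short numpy sketch below computes one scale per output channel of a convolution weight tensor; the (out_channels, in_channels, kH, kW) layout and the random tensor are assumptions made for the example.

import numpy as np

def quantize_weights_per_channel(w, bits=8):
    # Symmetric MinMax: one scale per output channel, zero-point fixed at 0.
    q_max = 2 ** (bits - 1) - 1
    max_abs = np.max(np.abs(w.reshape(w.shape[0], -1)), axis=1)  # per-channel max |w|
    scale = max_abs / q_max
    w_q = np.clip(np.round(w / scale[:, None, None, None]), -q_max - 1, q_max)
    return w_q.astype(np.int8), scale

w = np.random.randn(64, 32, 3, 3).astype(np.float32)  # hypothetical conv weight tensor
w_q, scale = quantize_weights_per_channel(w)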

4.2 Main Results

We apply the proposed Q-YOLO to quantize YOLOv5s [33], YOLOv5m [33], YOLOv7 [34] and YOLOv7x [34], which have an increasing number of parameters. The results of the full-precision models, as well as of the 8-bit and 4-bit quantized models using the MinMax, Percentile, and Q-YOLO methods, are presented in Table 1, which compares the quantization approaches and detectors in terms of computational complexity and storage cost. Q-YOLO significantly accelerates computation and reduces storage requirements for the various YOLO detectors. In terms of detection accuracy, when Q-YOLO is used to quantize the YOLOv5 series models to 8 bits, there is virtually no decline in the average precision (AP) compared to the full-precision model. As the number of model parameters increases, quantizing the YOLOv7 series models to 8 bits results in only a very slight decrease in accuracy. When quantizing models to 4 bits, the accuracy suffers a significant loss due to the reduced expressiveness of the 4-bit integer representation. In particular, with the MinMax quantization method the model loses all its accuracy, while the Percentile method, which roughly truncates the values beyond the 99.99% percentile, fails to bring a notable improvement. In contrast, Q-YOLO identifies a more appropriate quantization scale, resulting in a considerable enhancement over conventional Post-Training Quantization (PTQ) methods.

4.3 Ablation Study

Symmetry in Activation Quantization. Quantization schemes are often subject to hardware limitations; for instance, NVIDIA [24] only supports symmetric quantization, as it is more inference-speed friendly. Therefore, discussing the symmetry of activation value quantization is meaningful. Table 2 compares the results of Q-YOLO with symmetric and asymmetric quantization, with the latter exhibiting higher accuracy. The range of negative activation values lies between −0.2785 and 0, while the range of positive activation values exceeds that of the negative ones. If we force both the positive and negative sides to use the same number of integer levels, accuracy will naturally


Table 2. Symmetry analysis of activation value quantization. Asymmetric indicates the use of an asymmetric activation value quantization scheme, while Symmetric refers to the symmetric quantization of activation values.

models          Bits         Symmetry     AP    AP50  AP75  APs   APm   APl
YOLOv5s [33]    Real-valued  –            37.4  57.1  40.1  21.6  42.3  48.9
                6-6          Asymmetric   35.9  55.7  38.3  20.4  41.0  47.6
                6-6          Symmetric    34.4  53.9  37.0  19.3  39.8  45.0
                4-4          Asymmetric   14.0  26.2  13.5  7.9   17.6  19.0
                4-4          Symmetric    2.7   5.9   2.2   1.3   4.2   4.6
YOLOv5m [33]    Real-valued  –            45.1  64.1  49.0  28.1  50.6  57.8
                6-6          Asymmetric   44.0  63.1  47.7  28    49.9  56.8
                6-6          Symmetric    42.4  61.1  46.0  25.3  48.3  55.9
                4-4          Asymmetric   28.8  46.0  30.5  15.4  33.8  38.7
                4-4          Symmetric    11.3  24.8  8.6   7.5   15.2  14.5

Table 3. A comparison of quantization type. The term only weights signifies that only the weights are quantized, only activation indicates that only the activation values are quantized, and weights+activation represents the quantization of both activation values and weights.

models          Bits         Quantization type    AP            AP50  AP75  APs   APm   APl
YOLOv5s [33]    Real-valued  –                    37.4          57.1  40.1  21.6  42.3  48.9
                6-32         only weights         36.7 (-0.7)   56.6  39.3  20.9  41.4  48.4
                32-6         only activation      36.6 (-0.8)   56.2  39.3  21.0  42.0  47.9
                6-6          weights+activation   35.9          55.7  38.3  20.4  41.0  47.6
                4-32         only weights         19.6 (-16.3)  35.6  19.3  11.3  22.5  25.7
                32-4         only activation      30.6 (-5.3)   49.1  32.6  17.0  36.7  40.7
                4-4          weights+activation   14.0          26.2  13.5  7.9   17.6  19
YOLOv5m [33]    Real-valued  –                    45.1          64.1  49.0  28.1  50.6  57.8
                6-32         only weights         44.7 (-0.4)   63.9  48.6  28.0  50.3  57.3
                32-6         only activation      44.3 (-0.8)   63.4  48.1  28.4  50.3  57.2
                6-6          weights+activation   44            63.1  47.7  28.0  49.9  56.8
                4-32         only weights         34.6 (-9.4)   54.0  37.3  20.0  39.2  45.3
                32-4         only activation      37.7 (-6.3)   57.3  41.8  23.7  44.1  51.0
                4-4          weights+activation   28.8          46.0  30.5  15.4  33.8  38.7

decrease. Moreover, this decline becomes more pronounced as the quantization bit-width decreases.

Quantization Type. In Table 3, we analyze the impact of different quantization types on the performance of the YOLOv5s and YOLOv5m models, considering three cases: quantizing only the weights (only weights), quantizing only the


activation values (only activation), and quantizing both weights and activation values (weights+activation). The results demonstrate that quantizing the weights consistently induces a larger performance degradation than quantizing the activation values. Additionally, the lower the number of bits, the greater the loss incurred by quantization. In YOLO, the weights learned by the network essentially represent the knowledge it has acquired, making their precision crucial for model performance. In contrast, activation values serve as intermediate representations of the input data propagating through the network and can tolerate some degree of quantization error.
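To relate the symmetric-versus-asymmetric results of Table 2 to the SiLU-bounded activation range discussed in Sect. 3.3, the short sketch below derives both parameter sets for an assumed activation range of [−0.2785, 6.0]; the upper value is purely illustrative.

bits = 8
lower, upper = -0.2785, 6.0   # illustrative activation range after UH truncation

# Asymmetric: the full unsigned grid [0, 2^b - 1] covers exactly [lower, upper].
s_asym = (upper - lower) / (2 ** bits - 1)
zp_asym = round(-lower / s_asym)

# Symmetric: the grid must span [-max(|l|, |u|), +max(|l|, |u|)], so most of the
# negative levels are spent on values that SiLU activations can never take.
s_sym = max(abs(lower), abs(upper)) / (2 ** (bits - 1) - 1)

print(f"asymmetric step: {s_asym:.5f}, symmetric step: {s_sym:.5f}")

The coarser symmetric step size is one way to see why the symmetric variants in Table 2 lose accuracy, particularly at 4 bits.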

4.4 Inference Speed

To verify in practice the acceleration benefits brought by our quantization scheme, we conducted inference speed tests on both GPU and CPU platforms. For the GPU, we selected the commonly used desktop GPU NVIDIA RTX 4090 [24] and the NVIDIA Tesla T4 [24], which is often used in computing centers for inference tasks. Due to our limited CPU resources, we only tested Intel products, the i7-12700H and i9-10900, both of which have the x86 architecture. As deployment tools, we chose TensorRT [1] and OpenVINO [2]. The entire process involved converting the weights from the torch framework into an ONNX model with QDQ nodes and then deploying it onto the specific inference frameworks. The inference mode was set to single-image serial inference with an image size of 640 × 640. As most current inference frameworks only support symmetric 8-bit quantization, we had to choose a symmetric 8-bit quantization scheme, which results in an extremely small decrease in accuracy compared to asymmetric schemes. As shown in Table 4, the acceleration is very significant, especially for the larger YOLOv7 model, where the speedup on a GPU even exceeds 3× compared to the full-precision model. This demonstrates that applying quantization to real-time detectors can bring a remarkable acceleration.

Table 4. Inference speed of the quantized models. The quantization scheme adopts uniform quantization, with single-image inference mode and an image size of 640 × 640. TensorRT [1] is selected as the GPU inference library, while OpenVINO [2] is chosen for the CPU inference library.

models     Bits    AP    GPU speed/ms               Intel CPU speed/ms
                         RTX 4090    Tesla T4       i7-12700H (x86)    i9-10900 (x86)
YOLOv5s    32-32   37.4  4.9         7.1            48.7               38.7
           8-8     37.3  3.0         4.5            33.6               23.4
YOLOv7     32-32   50.8  16.8        22.4           269.8              307.8
           8-8     50.6  5.4         7.8            120.4              145.2
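The measurement protocol described above can be approximated by the rough sketch below: a module is exported to ONNX and then timed with single-image serial inference in onnxruntime. The stand-in Conv2d module, file path and run counts are assumptions; the actual QDQ insertion and the TensorRT/OpenVINO deployments used in the paper are not reproduced here.

import time
import numpy as np
import torch
import onnxruntime as ort

# Stand-in module; in practice the trained detector (with QDQ nodes inserted by
# the quantization toolchain) would be exported instead.
model = torch.nn.Conv2d(3, 16, 3)
torch.onnx.export(model, torch.zeros(1, 3, 640, 640), "model.onnx", opset_version=13)

sess = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])
name = sess.get_inputs()[0].name
img = np.zeros((1, 3, 640, 640), dtype=np.float32)

for _ in range(10):                      # warm-up runs
    sess.run(None, {name: img})
start = time.perf_counter()
for _ in range(100):                     # single-image serial inference, 640x640 input
    sess.run(None, {name: img})
print(f"mean latency: {(time.perf_counter() - start) / 100 * 1000:.2f} ms")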

5 Conclusions

Real-time object detection is crucial in various computer vision applications. However, deploying object detectors on resource-constrained platforms poses challenges due to high computational and memory requirements. This paper introduces Q-YOLO, a highly efficient one-stage detector built using a low-bit quantization method to address the performance degradation caused by activation distribution imbalance in traditional quantized YOLO models. Q-YOLO employs a fully end-to-end Post-Training Quantization (PTQ) pipeline with a well-designed Unilateral Histogram-based (UH) activation quantization scheme. Extensive experiments conducted on the COCO dataset demonstrate the effectiveness of Q-YOLO. It outperforms other PTQ methods while achieving a favorable balance between accuracy and computational cost. This research significantly contributes to advancing the efficient deployment of object detection models on resource-limited edge devices, enabling real-time detection with reduced computational and memory requirements.

Acknowledgement. Supported by the Major Program of the National Nature Science Foundation of China (Grant No. 61827901), "One Thousand Plan" projects in Jiangxi Province (Jxsg2023102268) and National Key Laboratory on Automatic Target Recognition 220402.

References 1. NVIDIA TensorRT. https://developer.nvidia.com/tensorrt. Accessed 03 Sep 2022 2. OpenVINO Toolkit. https://docs.openvinotoolkit.org/latest/index.html. Accessed 03 Sept 2022 3. Cai, Y., Yao, Z., Dong, Z., Gholami, A., Mahoney, M.W., Keutzer, K.: Zeroq: a novel zero shot quantization framework. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13169–13178 (2020) 4. Cai, Z., Vasconcelos, N.: Cascade r-cnn: delving into high quality object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 6154–6162 (2018) 5. Denil, M., Shakibi, B., Dinh, L., Ranzato, M., De Freitas, N.: Predicting parameters in deep learning. In: Advances in Neural Information Processing Systems 26 (2013) 6. Fang, J., Shafiee, A., Abdel-Aziz, H., Thorsley, D., Georgiadis, G., Hassoun, J.H.: Post-training piecewise linear quantization for deep neural networks. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12347, pp. 69–86. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58536-5_5 7. Feng, D., et al.: Deep multi-modal object detection and semantic segmentation for autonomous driving: datasets, methods, and challenges. IEEE Trans. Intell. Transp. Syst. 22(3), 1341–1360 (2020) 8. Guo, Y., Yao, A., Chen, Y.: Dynamic network surgery for efficient dnns. In: Advances in neural information processing systems 29 (2016) 9. Han, S., Mao, H., Dally, W.: Compressing deep neural networks with pruning, trained quantization and huffman coding. arxiv 2015. arXiv preprint arXiv:1510.00149 305 (2015)


10. Howard, A.G., etal.: Mobilenets: efficient convolutional neural networks for mobile vision applications. arXiv preprint arXiv:1704.04861 (2017) 11. Jacob, B., et al.: Quantization and training of neural networks for efficient integerarithmetic-only inference. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2018) 12. Jung, S., et al.: Learning to quantize deep networks by optimizing quantization intervals with task loss. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 4350–4359 (2019) 13. Karaoguz, H., Jensfelt, P.: Object detection approach for robot grasp detection. In: 2019 International Conference on Robotics and Automation (ICRA), pp. 4953– 4959. IEEE (2019) 14. Koonce, B., Koonce, B.: Mobilenetv3. Convolutional Neural Networks with Swift for Tensorflow: Image Recognition and Dataset Categorization, pp. 125–144 (2021) 15. Li, B., Ouyang, W., Sheng, L., Zeng, X., Wang, X.: Gs3d: an efficient 3d object detection framework for autonomous driving. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 1019–1028 (2019) 16. Li, R., Wang, Y., Liang, F., Qin, H., Yan, J., Fan, R.: Fully quantized network for object detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 2810–2819 (2019) 17. Li, Z., Yang, T., Wang, P., Cheng, J.: Q-vit: fully differentiable quantization for vision transformer. arXiv preprint arXiv:2201.07703 (2022) 18. Lin, T.Y., Dollár, P., Girshick, R., He, K., Hariharan, B., Belongie, S.: Feature pyramid networks for object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2117–2125 (2017) 19. Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft COCO: common objects in context. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8693, pp. 740–755. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-10602-1_48 20. Lin, Y., Zhang, T., Sun, P., Li, Z., Zhou, S.: Fq-vit: fully quantized vision transformer without retraining. arXiv preprint arXiv:2111.13824 (2021) 21. Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu, C.-Y., Berg, A.C.: SSD: single shot MultiBox detector. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9905, pp. 21–37. Springer, Cham (2016). https:// doi.org/10.1007/978-3-319-46448-0_2 22. Liu, Z., Lin, Y., Cao, Y., Hu, H., Wei, Y., Zhang, Z., Lin, S., Guo, B.: Swin transformer: hierarchical vision transformer using shifted windows. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 10012–10022 (2021) 23. Ma, N., Zhang, X., Zheng, H.T., Sun, J.: Shufflenet v2: practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) 24. NVIDIA: Nvidia corporation (2022). https://www.nvidia.com/ 25. Paul, S.K., Chowdhury, M.T., Nicolescu, M., Nicolescu, M., Feil-Seifer, D.: Object detection and pose estimation from rgb and depth data for real-time, adaptive robotic grasping. In: Advances in Computer Vision and Computational Biology: Proceedings from IPCV’20, HIMS’20, BIOCOMP’20, and BIOENG’20, pp. 121– 142. Springer (2021) 26. Qin, H., Gong, R., Liu, X., Shen, M., Wei, Z., Yu, F., Song, J.: Forward and backward information retention for accurate binary neural networks. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 2250–2259 (2020)


27. Redmon, J., Divvala, S., Girshick, R., Farhadi, A.: You only look once: unified, real-time object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 779–788 (2016) 28. Redmon, J., Farhadi, A.: Yolo9000: better, faster, stronger. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7263–7271 (2017) 29. Redmon, J., Farhadi, A.: Yolov3: An incremental improvement. arXiv preprint arXiv:1804.02767 (2018) 30. Ren, S., He, K., Girshick, R., Sun, J.: Faster r-cnn: towards real-time object detection with region proposal networks. In: Advances in Neural Information Processing Systems 28 (2015) 31. Romero, A., Ballas, N., Kahou, S.E., Chassang, A., Gatta, C., Bengio, Y.: Fitnets: hints for thin deep nets. arXiv preprint arXiv:1412.6550 (2014) 32. Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.C.: Mobilenetv 2: inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4510–4520 (2018) 33. Ultralytics: YOLOv5: PyTorch implementation of YOLOv5 real-time object detection (2021). https://github.com/ultralytics/yolov5 34. Wang, C.Y., Bochkovskiy, A., Liao, H.Y.M.: Yolov7: trainable bag-of-freebies sets new state-of-the-art for real-time object detectors. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 7464– 7475 (2023) 35. Wang, R.J., Li, X., Ling, C.X.: Pelee: a real-time object detection system on mobile devices. In: Advances in Neural Information Processing Systems 31 (2018) 36. Woo, S., Debnath, S., Hu, R., Chen, X., Liu, Z., Kweon, I.S., Xie, S.: Convnext v2: co-designing and scaling convnets with masked autoencoders. arXiv preprint arXiv:2301.00808 (2023) 37. Wu, B., et al.: Shift: a zero flop, zero parameter alternative to spatial convolutions. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 9127–9135 (2018) 38. Xu, S., et al.: Q-detr: an efficient low-bit quantized detection transformer. arXiv preprint arXiv:2304.00253 (2023) 39. Xu, S., Li, Y., Wang, T., Ma, T., Zhang, B., Gao, P., Qiao, Y., Lü, J., Guo, G.: Recurrent bilinear optimization for binary neural networks. In: Computer VisionECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XXIV. pp. 19–35. Springer (2022) 40. Zhang, B., Wang, R., Wang, X., Han, J., Ji, R.: Modulated convolutional networks. IEEE Trans. Neural Networks Learn. Syst. (2021) 41. Zhang, X., Zhou, X., Lin, M., Sun, J.: Shufflenet: an extremely efficient convolutional neural network for mobile devices. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 6848–6856 (2018) 42. Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., Wang, X.: Bytetrack: Multi-object tracking by associating every detection box. In: Computer Vision-ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XXII. pp. 1–21. Springer (2022) 43. Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: on the fairness of detection and re-identification in multiple object tracking. Int. J. Comput. Vision 129, 3069–3087 (2021) 44. Zhu, C., Han, S., Mao, H., Dally, W.J.: Trained ternary quantization. ICLR (2016)


45. Zhuang, B., Shen, C., Tan, M., Liu, L., Reid, I.: Towards effective low-bitwidth convolutional neural networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7920–7928 (2018) 46. Zoph, B., Vasudevan, V., Shlens, J., Le, Q.V.: Learning transferable architectures for scalable image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 8697–8710 (2018)

VITS, Tacotron or FastSpeech? Challenging Some of the Most Popular Synthesizers

Jindřich Matoušek1,2, Daniel Tihelka1,2(B), and Alice Tihelková1,2

1 New Technologies for the Information Society, Faculty of Applied Sciences, University of West Bohemia, Pilsen, Czechia
[email protected], [email protected]
2 Department of English Language and Literature, Faculty of Arts, University of West Bohemia, Pilsen, Czechia
[email protected]

Abstract. The paper presents a comparative study of three neural speech synthesizers, namely VITS, Tacotron2 and FastSpeech2, which belong among the most popular TTS systems nowadays. Due to their varying nature, they have been tested from several points of view, analysing not only the overall quality of the synthesized speech, but also the capability of processing either orthographic or phonetic inputs. The analysis has been carried out on two English and one Czech voice.

Keywords: text-to-speech synthesis · VITS · FastSpeech2 · Tacotron2

1 Introduction

This research was supported by the Czech Science Foundation (GA CR), project No. GA2227800S, and by the grant of the University of West Bohemia, project No. SGS-2022-017. Computational resources were supplied by the project "e-Infrastruktura CZ" (e-INFRA LM2018140) provided within the program Projects of Large Research, Development and Innovations Infrastructures.

In recent years, a range of neural speech synthesis models has appeared, often accompanied by an open-source implementation. Since the various methods are based on different architectures, it is desirable to have a notion of how the individual methods fare against each other in terms of output speech quality as well as in their capability of transforming an input into speech. As indicated by the title, we have tested three well-known neural speech synthesis models, as all three (or their derivatives) are claimed to be end-to-end systems and are still used in research [3–5, 15, 23, 26, 36, 37], to name a few examples. The choice has also been motivated by the variations in their architectures; thus, some variation in the synthesized speech can be expected, especially for marginal (or at least not-so-frequent) phenomena in the inputs. Last but not least, since the individual models were originally designed to work with varying input types (phonetic vs. orthographic, see further), all the systems were tested on both input types to make the comparison fair. And finally, in addition to English, we have also used the Czech language in order to prove


the language dependency/sensitivity of the systems; this is a significant extension, as comparisons are often carried out only on the LJ Speech dataset. Contrary to [32], in the present paper the input text is still expected to be passed through a text normalization module, which is responsible for the correct expansion of numbers, dates, abbreviations and so on into text. The only exception, in the case of orthographic input, is the testing of loanwords in Czech and homographs in English, as these are supposed to be handled by a G2P module [6, 14].

2 Systems Description

Let us describe the chosen DNN-based synthesizers in detail. The VITS model [17] can be viewed as a fully end-to-end model in the sense that it directly converts input text into a waveform, avoiding the use of a mel-spectrogram or any other human-understandable inner bridge within the model (still, there is some intermediate representation between the encoder and the embedded vocoder). Its implementation has been taken from the coqui-ai/TTS [8] GitHub repository. In the project, either orthographic text can be used at the input of the model, or G2P conversion is carried out internally by the gruut tokenizer and IPA phonemizer [34]. For our experiments, we modified the code to handle both orthographic and phonetic inputs without the use of the embedded processors and cleaners, in the same way for all the tested systems. The other two models, namely FastSpeech2 [24] and Tacotron2 [27], follow the two-stage scheme most widely used today: they employ an acoustic model (also called text-to-spectrogram or text-to-mel) which generates an intermediate representation of acoustic features in the form of mel-spectrograms from an input text, followed by a vocoder generating a waveform from these acoustic features. Both models are trained independently of each other, usually using features computed from the original speaker's recordings. The main difference between them is that Tacotron2 has been designed not to require phoneme-level alignment, which somewhat simplifies the preparation of training data, where only a recording and the corresponding text are required. On the contrary, FastSpeech2 allows the explicit control of all the prosodic characteristics (duration, F0, volume), which is still a fairly unique feature, provided neither by Tacotron2 nor by VITS nowadays. The drawback of this is the need for duration information prior to training (F0 and volume are computed from the data), which in turn requires some sort of phonetic alignment. Both systems were used from the TensorFlowTTS GitHub repository [30], as the FastSpeech2 model was not available in the coqui-ai/TTS project at the time the experiments began. Also, the FastSpeech2 model is claimed to be very fast in inference, which, combined with a very fast Multi-Band MelGAN vocoder, can provide fast responses (low-latency output) in commercial applications, especially when compared to VITS with its embedded HifiGAN vocoder. For an excellent overview of neural speech synthesis models, please see [29]. As already stated, in the present paper we tried all the TTS models on both orthographic and phonetic inputs. The particular implementations, especially the code of the text cleaners, had to be modified in order to handle each of the inputs correctly; e.g., orthographic texts were lowercased while phonetic ones were not (uppercase letters are valid phones). All the numbers, dates, abbreviations, spellings, etc., appearing in the input texts were


normalized to their full written forms. To unify the inputs of the models, pause symbols were added to both the orthographic and phonetic variants of the inputs in places of audible pauses or breathing, as these are required by the FastSpeech2 training data, see Sect. 2.2. Our research in [21] suggests that the use of pause symbols has a positive effect on the synthesized speech.

To make matters harder, the open implementations of the same method often vary in their quality, which is also a factor to be considered. For example, we experienced that the TensorFlowTTS project provides very stable and easy-to-train two-stage models with very simple switching between acoustic models and vocoders. Also, its Multi-Band MelGAN vocoder is very fast at inference and achieves very natural output quality, much better than the HifiGAN implementation in the same project. On the contrary, coqui-ai/TTS provides a promising VITS model (see Sect. 2.3), while the two-stage combination of acoustic models and vocoders is not very robust in that framework, with Multi-Band MelGAN falling behind the HifiGAN and even UnivNet vocoders. Overall, we have not yet been able to achieve output of these two-stage models comparable with TensorFlowTTS. Therefore, we do not compare the same models between the frameworks in this paper, as the models cannot be transferred between them. Since the performance may depend on the particular implementation, we left the inference speed comparison to future work, once the development settles a bit. Moreover, contrary to inference, the training speed is not a significant factor to consider, especially when a slow-training model provides stable output with a high level of naturalness.
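The exact cleaners differ between the two projects; the following minimal sketch only illustrates the kind of input unification described above. The function name and rule set are illustrative assumptions, not the actual code of coqui-ai/TTS or TensorFlowTTS, and the pause symbols themselves come from the annotation, which is not shown.

```python
def clean_text(text: str, input_type: str) -> str:
    """Unify an input line for training/inference (illustrative sketch only)."""
    text = " ".join(text.split())      # collapse whitespace
    if input_type == "ortho":
        return text.lower()            # case carries no phonetic information
    if input_type == "phone":
        return text                    # uppercase letters are valid phones, keep them
    raise ValueError(f"unknown input type: {input_type}")

print(clean_text("Did you record this record?", "ortho"))
```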

2.1 Tacotron2 + Multi-band MelGAN

The Tacotron2 acoustic model [27] is based on an encoder-(attention)-decoder structure, where the input text is passed through embedding, convolutional, and biLSTM networks, representing the encoder part of the network. The location-sensitive attention network summarizes the full encoded sequence as a fixed-length context vector for each decoder output step. The decoder part is then an auto-regressive recurrent neural network, predicting a mel-spectrogram from the encoded input sequence, one spectrogram frame at a time. The vocoder in the original Tacotron2 was realised by the WaveNet [27] DNN structure (conditioned by mel-spectra instead of linguistic features as in [22]), but the TensorFlowTTS project implements a wide range of alternative vocoders, from which Multi-band MelGAN [35] has been chosen for its highest output quality. The vocoder is based on generative adversarial networks (GANs), consisting of an audio data generator and a discriminator judging the authenticity of the generated audio. MelGAN uses multiple discriminators to judge the audio at different scales; thus, at each scale, it can focus on the characteristics of a different frequency range. Multi-band modeling divides the waveform into multiple sub-bands and enables parallel generation and fast inference. Let us note that the Multi-band MelGAN vocoder has been chosen deliberately over the HifiGAN vocoder, which is internally used by the VITS model (see Sect. 2.3). The reason is that it (or, to be more precise, possibly its implementation in the project used) provides a perceptually higher quality of the output speech and is also measurably faster in speech generation.


In our experiments, the Tacotron2 model was always trained using the project default parameters: the AdamW optimizer [20] with β1 = 0.9, β2 = 0.98, and weight decay λ = 0.001; the initial and end learning rates were set to 0.001 and 0.00001, respectively, with 4k warm-up steps and no teacher forcing. The batch size was set to 32, and the models were trained up to 200k steps for English and, due to a much slower training process and time constraints, up to 178k steps for Czech, both on a single NVIDIA A100 GPU. The Multi-band MelGAN vocoder was trained with the Adam optimizer [18] for both the generator and the discriminator with β1 = 0.9, β2 = 0.999, using a piece-wise learning rate decay going from 0.0005 to 0.000001 for the generator and from 0.00025 to 0.000001 for the discriminator, both halving after every 100k steps. The batch size was set to 64, and the models were trained up to 4M steps for English and 2.7M for Czech (also due to time constraints), with the discriminator employed after 200k training steps.

2.2 FastSpeech2 + Multi-band MelGAN

FastSpeech2 [24] is a fast and robust Transformer-based acoustic model proposed to solve issues typical for autoregressive models (such as Tacotron2) by adopting a feed-forward Transformer network to generate mel-spectrograms from an input phone sequence in parallel and by replacing the error-prone attention mechanism with an explicit phone duration predictor to match the length of the mel-spectrograms. Nowadays, out of the popular DNN models, it is the only one accepting explicit requirements for prosody modifications in the form of phone-related values of duration, F0, and volume. On the other hand, the need for phone alignment complicates the use of the FastSpeech model with orthographic input, as there is no straightforward mapping of graphemes to phoneme signals. In the TensorFlowTTS project, the attention mechanism from a trained Tacotron2 model is used to provide this grapheme alignment, but we have found it rather unreliable. Therefore, we used our DNN-based segmentation [10, 11] to obtain the phonetic-level alignment. For the grapheme-level alignment, in the case of orthographic input, we fixed the word boundaries based on the phonetic segmentation, and the boundaries of intra-word graphemes were distributed uniformly through the word.

As for the experiments, FastSpeech2 was also trained using the project default parameters: the AdamW optimizer with β1 = 0.9, β2 = 0.98 and weight decay λ = 0.001, as in the case of Tacotron2. The initial and end learning rates were set to 0.001 and 0.00005, respectively, also with 4k warm-up steps. The batch size was set to 16, and the models were trained up to 200k steps on a single GeForce GTX 1080 Ti GPU. As the vocoder, the same Multi-band MelGAN model described in Sect. 2.1 was used, which allows excluding the vocoder effect from the comparison.

2.3 VITS

VITS is a conditional variational autoencoder with adversarial learning [17]. It combines different deep-learning techniques (adversarial learning, normalizing flows, variational autoencoders, transformers) to achieve high-quality natural-sounding output.


VITS is mainly built on the Glow-TTS model but introduces the following updates. First, it replaces the duration predictor with a stochastic duration predictor, providing better modeling of the variability in speech. Then, it connects a HifiGAN vocoder [19] to the decoder's output and joins the two with a variational autoencoder (VAE). That allows one-stage training of the model in an end-to-end fashion and finds a better intermediate representation than the traditionally used mel-spectrograms. This results in high fidelity and more precise prosody [2, 16]. Unfortunately, we could not replace the HifiGAN vocoder with Multi-band MelGAN, mostly due to the use of a different framework in which the tested VITS is implemented [8].

In our experiments, the VITS models were trained using the AdamW optimizer with β1 = 0.8, β2 = 0.99, and weight decay λ = 0.01. The learning rate decay was scheduled by a factor of 0.999^(1/8) in every epoch, with an initial learning rate of 2 × 10^-4. The batch size was set to 16, and the models were trained up to 1.3M steps using mixed-precision training on a single GeForce GTX 1080 Ti GPU.
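For orientation, the optimizer and learning-rate settings listed in Sects. 2.1–2.3 could be written down in TensorFlow roughly as follows. This is only a sketch based on the values stated above: the warm-up phases are omitted, the step boundaries only approximate the described piece-wise decay, the steps-per-epoch value is an assumption, and none of this is the actual code of the TensorFlowTTS or coqui-ai/TTS projects.

```python
import tensorflow as tf

# Tacotron2 / FastSpeech2 acoustic models: AdamW (warm-up omitted here).
# tf.keras.optimizers.AdamW is available in TF >= 2.11.
acoustic_opt = tf.keras.optimizers.AdamW(
    learning_rate=1e-3, weight_decay=1e-3, beta_1=0.9, beta_2=0.98)

# Multi-band MelGAN generator: piece-wise decay halving roughly every 100k steps
# from 5e-4 towards 1e-6 (boundaries/values are an illustrative approximation).
gen_lr = tf.keras.optimizers.schedules.PiecewiseConstantDecay(
    boundaries=[100_000, 200_000, 300_000],
    values=[5e-4, 2.5e-4, 1.25e-4, 6.25e-5])
gen_opt = tf.keras.optimizers.Adam(learning_rate=gen_lr, beta_1=0.9, beta_2=0.999)

# VITS: AdamW with per-epoch exponential decay by a factor of 0.999^(1/8).
steps_per_epoch = 1000  # assumption; depends on dataset size and batch size
vits_lr = tf.keras.optimizers.schedules.ExponentialDecay(
    initial_learning_rate=2e-4,
    decay_steps=steps_per_epoch,
    decay_rate=0.999 ** (1 / 8),
    staircase=True)
vits_opt = tf.keras.optimizers.AdamW(
    learning_rate=vits_lr, weight_decay=0.01, beta_1=0.8, beta_2=0.99)
```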

3 Test Description

To challenge the capabilities of the given neural speech synthesis models, the test consists of three parts. First, the question was whether the use of orthographic or phonetic inputs has an impact on speech quality. While [7] showed that phonetic input is better suited for deep convolutional neural sequence-to-sequence TTS models, the use of orthographic input is currently typical for the end-to-end approach, and those models are able to capture speech contexts using the attention mechanism [33]. Therefore, the use of phonetic input, matching the target spectra/sounds fairly closely, may not be essential for neural synthesis models, which is the opposite of the situation in G2P research, where the tendency was always to prefer the finer phonetic alphabet. In fact, both VITS and Tacotron2 were designed to be indifferent to the use of orthographic or phonetic inputs. This represents a complication for FastSpeech, as described in Sect. 2.2, although it is claimed in [24] to be an "end-to-end" system as well. The choice of the Czech and English languages has allowed testing this both on a language where graphemes match phonemes rather closely and on the opposite case, given the irregular English pronunciation.

Using the orthographic input brings other problems of its own. Even when expecting normalized text, as stated in Sect. 1, the neural model must still be able to cope with special pronunciation phenomena such as loanwords or homographs. Similar to [32], the handling of such words has been the second test put on the synthesizers. Indeed, given the limited amount of audio data used to train the model, as opposed to purely text corpora, which may be an order of magnitude larger, the ability to learn the right pronunciation from a context depends on the number of examples in the data to learn from. For these cases, the phonetic text input has a significant advantage. For Czech, or languages with regular pronunciation, the situation is somewhat simpler, as such words can be rewritten in a "pseudo" orthographic form, such as designovaný → dyzajnovaný (designed in English). This was not tested in this paper, however.

The last comparison was the quality of the synthetic speech itself. In this research question, we tried to outline which of the models used has the potential to generate


speech with the highest perceived quality. Nevertheless, we did not consider the performance of the generation in any way, which may also be important for practical deployment.

3.1 Loanwords and Homographs

In the Czech pronunciation correctness test, loanwords not following the regular pronunciation rules were used. There were 21 very short sentences with 32 loanwords (similar to those in [32]) synthesized and analysed, for example:

Stadium agrese. (Stage of aggression)
Exhalace působí bronchitidu. (Exhalation causes bronchitis)
Aktivní frekvence. (Active frequency)
Diskuse bez pointy. (Pointless discussion)
Designovaný premiér. (Designated Prime Minister)
Playback na koncertě místo etudy. (Playback at a concert instead of an etude)

On the contrary, for English we have focused on homographs, which were placed within meaningful phrases designed by the authors. We used 17 words: record, read, live, minute, tear, wind, conflict, progress, row, import, insult, advocate, research, bow, lead, sow and wound, and put both their pronunciation variants into 17 phrases, the same as those in [32]:

Did you record /rI"kO:d/ this record /"rek@rd/?
Have you read /red/ that book that all the people read /ri:d/?
...

Both tests, naturally, were carried out for orthographic input only, as in the case of phonetic input these words are expected to be transcribed into the correct form by a G2P or another NLP module. The evaluation was performed by the authors themselves, simply by listening to the output samples and counting correct/wrong renderings of the evaluated phenomena. There is no need for a multi-listener evaluation, since the use of the right or wrong pronunciation variant is clearly recognisable.

3.2 Phonetic vs Orthographic Input

To answer the question of phonetic vs. orthographic text input, we synthesized 20 ordinary phrases without any words containing phenomena possibly influencing the individual input types. For the orthographic input, the models have to "only" deal with inner- and cross-word assimilation [14] in Czech. In the case of English, 20 phrases containing common words were synthesized and evaluated. Let us note that we did not examine the correct rendering of OOV words (for English), i.e., words that do not occur in the training recordings. For now, these are considered a "special" variant of loanwords, and a more detailed survey was left to future work.


The same phrases for both languages were phonetically transcribed and synthesized by the models trained on phonetic inputs, where a simplified phonetic alphabet was used for both Czech and English [21]. The corresponding ortho-phone pairs were then compared in an ABX (CCR) test. Test participants evaluating the Czech voice were native Czech speakers, most of them with a background in speech technology research, while the participants evaluating the English voices were undergraduate students of an applied linguistics program with an appropriate level of English skills but without a background in speech technologies. More details about the listeners are stated in Sect. 4. Let us note that the use of an objective acoustic measure, such as PESQ [25], POLQA [1], STOI [28], (perceptual) SNR, and others, was not considered. The reason is that these kinds of tests are unable to distinguish whether a stimulus sounds correct from the phonetic point of view.

3.3 Overall System Comparison

The last question was the comparison of the synthetic speech quality generated by the challenged TTS models. To compare it, a "simplified" MUSHRA test [9, 13] was chosen, where the simplification means the absence of reference and anchor test samples. The same set of phrases as described in Sect. 4.2 was used, while in each test stimulus, three samples of the same phrase were presented in randomized order, each rendered by one of the evaluated TTS models for the given input type, i.e., orthographic inputs were evaluated independently of the phonetic inputs. The task for the listeners was to assign scores, ranging from 0 to 100, to the individual samples relative to each other. Exactly as for the test in Sect. 3.2, test participants for the particular languages were recruited from the same groups. The same reason as before also applies to excluding an objective evaluation metric.
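The statistics reported later (mean and standard deviation of the 0–100 scores per system and input type) can be obtained directly from the collected ratings; a minimal sketch with made-up numbers, not the actual evaluation script:

```python
import numpy as np

# scores[system] = all 0-100 ratings collected for that system (hypothetical data)
scores = {
    "FastSpeech2": np.array([70, 65, 80, 72]),
    "Tacotron2":   np.array([75, 78, 81, 74]),
    "VITS":        np.array([88, 84, 90, 86]),
}

for system, s in scores.items():
    # mean and spread of the listener ratings, as reported per system
    print(f"{system}: {s.mean():.2f} ± {s.std(ddof=1):.2f}")
```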

3.4 Voices Used in the Experiments

As already stated, Czech and English datasets were used to train the models. For the Czech dataset, we used our own proprietary voice Jan [31], containing 12,241 phrases giving about 14.7 hours of speech, excluding pauses. Regarding English, we used two voices, the first being our own proprietary voice Jeremy [31] with 10,924 phrases and 13.6 hours of speech (without pauses), and the second being the free LJ Speech dataset [12], having 13,100 recordings with a total length of approximately 24 hours. Since Jan and Jeremy were recorded for similar purposes, i.e., the unit selection TTS method, they are similar in properties such as speaking style, unit coverage, and so on. The LJ Speech voice, on the contrary, was obtained from audiobook recordings and thus differs from the former two, especially in its speaking style and the absence of special unit balancing.

4 Results

As already noted, the evaluation was carried out by means of listening tests. While there is a large intersection of listeners evaluating the systems in Sect. 4.2 and Sect. 4.3, there


were some participating in only one of them. All the listeners were instructed to use headphones and to carry out the tests in a quiet environment. None of them reported any hearing issues or other problems during the evaluation.

Table 1. Results of the synthesis correctness: loanwords and homographs.

Model          WER [%]
Czech
FastSpeech2    28.13
Tacotron2      46.88
VITS           28.13
Wrong (all systems): serveru, asijsk, Asie, Palestin, kognitivn, playback, etudy, puzzle, melatonin
English (Jeremy)
FastSpeech2    16.67
VITS           40.00
Wrong (all systems): live, tear, conflict, progress, research, advocate
English (LJ Speech)
FastSpeech2    56.09
Tacotron2      63.91
VITS           63.04
Wrong (all systems): read, live, minute, progress, research

Let us note that the results for Tacotron2 trained on the Jeremy voice are absent. Despite using the same settings as for the other voices, the training of the model started correctly (as checked by analysis of the validation steps), but toward the end of the training, the model was not able to generate intelligible speech. Due to time constraints, we were, unfortunately, unable to analyse the reasons for this behaviour.

4.1 Loanwords and Homographs

The analysis of the synthesized sentences for this part of the evaluation was carried out solely by the authors themselves. Since only a simple binary decision is required, and native speakers or language experts are assumed to be able to recognise correct pronunciation, we do not expect different results in the case of evaluation by means of standard listening tests. In Table 1, the word error rate (WER) percentages are presented, together with the most frequently missed words. The WER value was computed as the ratio of mispronounced words to the total number of 32 examined loanwords and 34 homographs. In the Wrong column, we list the loanwords or homographs for which all the systems rendered at least one variant incorrectly. It can be seen that the Tacotron2 model is rather poor at estimating the correct pronunciation for both Czech and English, while VITS is rather good for Czech but often misses in English. FastSpeech2, on the other hand, behaves quite robustly, except for the LJ Speech corpus. Overall, the results show quite a large variance, making any clear statement fairly unconvincing. On the other hand, we did not expect any better

Table 2. Results of the preference listening test: orthographic vs phonetic inputs.

Model          Preference [%]
               ortho    same    phone
Czech
FastSpeech2    22.50    42.86   34.64
Tacotron2      29.29    28.21   42.50
VITS           29.29    28.21   42.50
English (Jeremy)
FastSpeech2    45.44    15.87   38.70
VITS           51.74     9.78   38.48
English (LJ Speech)
FastSpeech2    56.09    17.39   26.52
Tacotron2      63.91    13.48   22.69
VITS           63.04     9.13   27.86

results, considering that a wider text context, and possibly some level of understanding, is required to handle these phenomena correctly. There is a significant advantage of phonetic input here, where other DNN models can be used to decide, and those can be trained on much larger (text-only) datasets. Let us emphasize that this evaluation took into account only the loanword and homograph words. However, we found that there were cases when both evaluated word variants were pronounced correctly, but there was another word either with a wrong accent or pronounced in an otherwise incorrect way. This was left to further analysis, though, and had no influence on the results presented here.

4.2 Phonetic vs Orthographic Input

In Table 2, the results of the ABX test comparing the speech quality generated by the individual voice models with phonetic vs. orthographic inputs are presented. There were 13 listeners participating in the Czech test, each evaluating 60 test pairs. Similarly, both English voices were evaluated by 23 listeners, each evaluating 100 test pairs in total. The higher number of evaluated pairs for English was caused by joining both English voices into a single test, purely for practical reasons, without combining the tested systems in any way. It can be seen that for English, the orthographic input leads to a higher quality of synthetic speech for all the tested neural models, as clearly preferred by the listeners. For Czech, the opposite is true and phonetic input wins, except for the FastSpeech2 model, where most listeners evaluated both versions as equal (either good or bad), with a slight preference for phonetic input in the cases where they were able to decide. This definitely requires further analysis, since we expected the phonetic input to behave better than the orthographic one for English (as also found in [7], yet for another DNN model), while the results show the clear opposite for all the systems. On the contrary, with orthographic and phonetic inputs not being so widely different in Czech, there is a surprisingly significant inclination toward the phonetic input.


Table 3. Results of the MUSHRA listening test: overall systems comparison.

Model          MUSHRA score
               ortho              phone
Czech
FastSpeech2    72.06 ± 17.49      75.98 ± 15.73
Tacotron2      77.02 ± 14.40      78.80 ± 14.86
VITS           85.10 ± 12.41      86.68 ± 11.79
English (Jeremy)
FastSpeech2    73.91 ± 23.02      71.72 ± 24.56
VITS           81.83 ± 19.51      79.97 ± 18.45
English (LJ Speech)
FastSpeech2    63.74 ± 26.98      57.53 ± 28.58
Tacotron2      80.66 ± 21.54      69.98 ± 26.09
VITS           80.55 ± 20.76      74.01 ± 24.22

4.3 Overall System Comparison

The overall performance of the systems, as evaluated by the "simplified" MUSHRA test, is presented in Table 3; as described in Sect. 3.3, the synthesis models using orthographic inputs were evaluated independently of those using phonetic inputs. The test itself contained 40 stimuli and was completed by 13 listeners for the Czech voice, while 80 stimuli were evaluated for the English voices, all finished by 23 listeners. Similar to Sect. 4.2, both English voices were evaluated in a single test. A more detailed visualization of the results from Table 3 is shown in Fig. 1 and Fig. 2, where it can be seen that VITS obtained consistently higher scores than the other two synthesizers. Also, the comparison of scores for phonetic and orthographic inputs corresponds with the results from Sect. 4.2, although these two inputs were not compared together in any of the stimuli. Analysing the notch intervals, the higher quality of VITS is statistically significant in most cases.
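The notch intervals mentioned above follow the usual box-plot convention (an approximate 95% confidence interval of the median, median ± 1.57·IQR/√n, as used, e.g., by matplotlib); a small sketch with synthetic scores, not the paper's evaluation code:

```python
import numpy as np

def notch_interval(scores):
    """Approximate 95% confidence interval of the median (the notched box-plot convention)."""
    scores = np.asarray(scores)
    q1, med, q3 = np.percentile(scores, [25, 50, 75])
    half_width = 1.57 * (q3 - q1) / np.sqrt(len(scores))
    return med - half_width, med + half_width

# synthetic MUSHRA-like scores for two systems (illustrative only)
vits = np.random.default_rng(0).normal(85, 12, size=200).clip(0, 100)
tacotron = np.random.default_rng(1).normal(77, 14, size=200).clip(0, 100)
# non-overlapping notch intervals roughly indicate a significant difference in medians
print(notch_interval(vits), notch_interval(tacotron))
```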


Fig. 1. Results of the MUSHRA listening test: overall systems comparison, Czech.

Fig. 2. Results of the MUSHRA listening test: overall systems comparison, English.


5 Conclusions

In the present paper, we have challenged three different modern neural speech synthesis models with various tasks on two different types of input texts (used both for training and inference, naturally). We have shown by the experiment with loanwords and homographs that the use of raw orthographic input is not reliably capable of providing correct pronunciation for the tested cases, which was more or less expected based on the observations in [32]. The FastSpeech2 model is the most successful in this, probably due to the embedded transformer model, which is very suitable for text conversion tasks [6]. Still, the use of phonetic input is superior to orthographic input. As a by-product of the evaluation, we have found that even relatively common words may be rendered incorrectly; investigating this is one of our plans for future research in this area.

On the other hand, the use of phonetic input provided, at least for the English voices, speech of a lower quality than speech generated from orthographic inputs. This has been a rather surprising finding, and we do not have a full understanding of these results yet. To give a reasoned explanation of such behaviour, a much deeper analysis of the individual outputs will be required, and we would like to continue investigating the issue.

Regarding the overall quality evaluation of the given neural models, VITS provided synthetic speech ranked with the highest scores more consistently than the other two models. One of the explanations may be the modern structure of its neural model (a conditional variational autoencoder), but there may also be an influence of the number of training steps used when training the individual models (although the loss functions no longer significantly decreased at the given training steps). On the other hand, our latest observations of the VITS model from the coqui-ai/TTS project suggest that it has some problems when synthesizing the Czech phone [r] in some contexts. Also, the VITS model does not allow full and fine user control of all prosodic characteristics, as opposed to FastSpeech2, which is able to use input character-level relative changes for the duration, F0, and even volume prosodic characteristics. From a practical deployment point of view, this is a significant factor to consider.

Perhaps the most important finding was the influence of the implementation. Nowadays, it is easy to find several open implementations of almost any published DNN model, which is generally the right direction for research. However, the particular implementations do not provide similar or even comparable results. For example, HifiGAN from the TensorFlowTTS project provides recognisably worse audio quality than Multi-Band MelGAN from the same project, while it is the opposite in the coqui-ai project. Unfortunately, the trained models cannot be transferred easily across the projects. Therefore, the reader should be aware that there may be better implementations of the tested models, providing slightly different results than those presented here.


References

1. Beerends, J., et al.: Perceptual objective listening quality assessment (POLQA), the third generation ITU-T standard for end-to-end speech quality measurement part I – temporal alignment. AES: J. Audio Eng. Soc. 61, 366–384 (2013)
2. Casanova, E., Weber, J., Shulby, C., Junior, A.C., Gölge, E., Ponti, M.A.: YourTTS: towards zero-shot multi-speaker TTS and zero-shot voice conversion for everyone (2021)
3. Cho, H., Jung, W., Lee, J., Woo, S.H.: SANE-TTS: stable and natural end-to-end multilingual text-to-speech. In: Ko, H., Hansen, J.H.L. (eds.) Interspeech 2022, Incheon, Korea, 18–22 September 2022, pp. 1–5. ISCA (2022). https://doi.org/10.21437/Interspeech.2022-46
4. Delalez, S., Akue, L.: Neural TTS in French: comparing graphemic and phonetic inputs using the SynPaFlex-Corpus and Tacotron2 (2023)
5. Elias, I., et al.: Parallel Tacotron 2: a non-autoregressive neural TTS model with differentiable duration modeling. In: Proceedings of Interspeech 2021, pp. 141–145 (2021). https://doi.org/10.21437/Interspeech.2021-1461
6. Řezáčková, M., Tihelka, D., Švec, J.: T5G2P: using text-to-text transfer transformer for grapheme-to-phoneme conversion. In: Interspeech 2021, Brno, Czech Republic (2021)
7. Fong, J., Taylor, J., Richmond, K., King, S.: A comparison of letters and phones as input to sequence-to-sequence models for speech synthesis. In: Speech Synthesis Workshop, Vienna, Austria, pp. 223–227 (2019). https://doi.org/10.21437/SSW.2019-40
8. Gölge, E.: Coqui TTS (2021). https://doi.org/10.5281/zenodo.6334862
9. Grůber, M., Chýlek, A., Matoušek, J.: Framework for conducting tasks requiring human assessment. In: Proceedings of Interspeech 2019, pp. 4626–4627 (2019)
10. Hanzlíček, Z., Vít, J.: LSTM-based speech segmentation trained on different foreign languages. In: Sojka, P., Kopeček, I., Pala, K., Horák, A. (eds.) TSD 2020. LNCS (LNAI), vol. 12284, pp. 456–464. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58323-1_49
11. Hanzlíček, Z., Vít, J., Tihelka, D.: LSTM-based speech segmentation for TTS synthesis. In: Ekštein, K. (ed.) TSD 2019. LNCS (LNAI), vol. 11697, pp. 361–372. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-27947-9_31
12. Ito, K., Johnson, L.: The LJ speech dataset (2017). https://keithito.com/LJ-Speech-Dataset/
13. ITU Recommendation BS.1534-2: Method for the subjective assessment of intermediate quality level of coding systems. Technical report, International Telecommunication Union (2014)
14. Jůzová, M., Tihelka, D., Vít, J.: Unified language-independent DNN-based G2P converter. In: Kubin, G., Kacic, Z. (eds.) Interspeech 2019, pp. 2085–2089. ISCA (2019). https://doi.org/10.21437/Interspeech.2019-2335
15. Kögel, F., Nguyen, B., Cardinaux, F.: Towards robust FastSpeech 2 by modelling residual multimodality (2023)
16. Kim, J., Kim, S., Kong, J., Yoon, S.: Glow-TTS: a generative flow for text-to-speech via monotonic alignment search. In: Conference on Neural Information Processing Systems, Vancouver, Canada (2020)
17. Kim, J., Kong, J., Son, J.: Conditional variational autoencoder with adversarial learning for end-to-end text-to-speech. In: International Conference on Machine Learning, pp. 5530–5540 (2021)
18. Kingma, D.P., Ba, J.L.: Adam: a method for stochastic optimization. In: International Conference on Learning Representations, San Diego, USA (2015)
19. Kong, J., Kim, J., Bae, J.: HiFi-GAN: generative adversarial networks for efficient and high fidelity speech synthesis. In: Conference on Neural Information Processing Systems (2020)
20. Loshchilov, I., Hutter, F.: Decoupled weight decay regularization. In: International Conference on Learning Representations, New Orleans, USA (2019)
21. Matoušek, J., Tihelka, D.: On comparison of phonetic representations for Czech neural speech synthesis. In: Sojka, P., Horák, A., Kopeček, I., Pala, K. (eds.) TSD 2022. LNCS, vol. 13502, pp. 410–422. Springer, Cham. https://doi.org/10.1007/978-3-031-16270-1_34
22. van den Oord, A., et al.: WaveNet: a generative model for raw audio. CoRR abs/1609.03499 (2016)
23. Prasad, A., Zuluaga-Gomez, J., Motlicek, P., Sarfjoo, S., Nigmatulina, I., Vesely, K.: Speech and natural language processing technologies for pseudo-pilot simulator (2022)
24. Ren, Y., et al.: FastSpeech 2: fast and high-quality end-to-end text to speech (2021)
25. Rix, A., Beerends, J., Hollier, M., Hekstra, A.: Perceptual evaluation of speech quality (PESQ) – a new method for speech quality assessment of telephone networks and codecs. In: Proceedings of ICASSP 2001, vol. 2, pp. 749–752 (2001). https://doi.org/10.1109/ICASSP.2001.941023
26. Shang, Z., Shi, P., Zhang, P., Wang, L., Zhao, G.: HierTTS: expressive end-to-end text-to-waveform using a multi-scale hierarchical variational auto-encoder. Appl. Sci. 13(2) (2023). https://doi.org/10.3390/app13020868
27. Shen, J., et al.: Natural TTS synthesis by conditioning WaveNet on mel spectrogram predictions. In: ICASSP 2018, pp. 4779–4783 (2018). https://doi.org/10.1109/ICASSP.2018.8461368
28. Taal, C.H., Hendriks, R.C., Heusdens, R., Jensen, J.: An algorithm for intelligibility prediction of time-frequency weighted noisy speech. IEEE Trans. Audio Speech Lang. Process. 19(7), 2125–2136 (2011). https://doi.org/10.1109/TASL.2011.2114881
29. Tan, X., Qin, T., Soong, F., Liu, T.Y.: A survey on neural speech synthesis (2021)
30. TensorFlowTTS: Real-time state-of-the-art speech synthesis for TensorFlow 2 (2021). https://github.com/TensorSpeech/TensorFlowTTS
31. Tihelka, D., Hanzlíček, Z., Jůzová, M., Vít, J., Matoušek, J., Grůber, M.: Current state of text-to-speech system ARTIC: a decade of research on the field of speech technologies. In: Sojka, P., Horák, A., Kopeček, I., Pala, K. (eds.) TSD 2018. LNCS (LNAI), vol. 11107, pp. 369–378. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-00794-2_40
32. Tihelka, D., Matoušek, J., Tihelková, A.: How much end-to-end is Tacotron 2 end-to-end TTS system. In: Ekštein, K., Pártl, F., Konopík, M. (eds.) TSD 2021. LNCS (LNAI), vol. 12848, pp. 511–522. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-83527-9_44
33. Vaswani, A., et al.: Attention is all you need (2017)
34. Vervloesem, K., Bachmann, M.: gruut 2.2.0 (2021). https://github.com/rhasspy/gruut
35. Yang, G., Yang, S., Liu, K., Fang, P., Chen, W., Xie, L.: Multi-band MelGAN: faster waveform generation for high-quality text-to-speech. In: 2021 IEEE Spoken Language Technology Workshop (SLT), pp. 492–498 (2021). https://doi.org/10.1109/SLT48900.2021.9383551
36. Zhao, W., Lian, Y., Chai, J., Tu, Z.: Multi-speaker Chinese news broadcasting system based on improved Tacotron2. Multimedia Tools Appl. 4391, 89–100 (2023). https://doi.org/10.1007/s11042-023-15279-z
37. Zhou, Z., Liu, S.: Learning to auto-correct for high-quality spectrograms. In: ICASSP 2023, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094762

T5G2P: Multilingual Grapheme-to-Phoneme Conversion with Text-to-Text Transfer Transformer

Markéta Řezáčková(B), Adam Frémund, Jan Švec, and Jindřich Matoušek

New Technologies for the Information Society (NTIS) and Department of Cybernetics, Faculty of Applied Sciences, University of West Bohemia, Pilsen, Czech Republic
{juzova,afremun,honzas,jmatouse}@kky.zcu.cz

Abstract. In recent years, the Text-to-Text Transfer Transformer (T5) neural network has proved powerful for many text-related tasks, including grapheme-to-phoneme conversion (G2P). The paper describes the training process of T5-base models for several languages. It shows the advantages of training G2P models on a language-specific base over G2P models fine-tuned from the multilingual base model. The paper also explains the reasons for training G2P models on whole sentences (not a dictionary) and evaluates the trained G2P models on unseen sentences and words.

Keywords: T5 · transformers · phonetic transcription · grapheme-to-phoneme · TTS system

1 Introduction

For decades, a grapheme-to-phoneme (G2P) module was a crucial part of all text-to-speech (TTS) systems since correct phonetic transcriptions of input sentences were fundamental for high-quality synthetic speech. Nowadays, end-to-end TTS systems, synthesizing from raw texts, appear increasingly often [4, 8, 11]. Still, the "traditional" composition of TTS systems (i.e., a G2P module and the synthesizer itself) remains topical and is used in many approaches (both research and commercial) [9]. The task of phonetic transcription (G2P conversion) is to transcribe the input text into a sequence of phonemes of the given language. Traditional G2P approaches use large dictionaries and (mostly) hand-crafted phonetic rules [9]. Languages with regular phonetic rules usually rely on the rule-based approach. However, they still need a dictionary of exceptions (e.g., foreign names, loan words, etc.). That is advantageous, especially for inflected languages, since the pronunciation dictionaries would be enormous if they contained all inflected forms. The dictionary-based approaches are often used for languages such as English. However, there is still a need for some post-processing rules, e.g., for out-of-vocabulary words or ambiguities (see below). In both cases, the large dictionaries need to be updated. And despite being supplemented regularly, the


dictionaries never contain all words (or all pronunciation exceptions) of the given language. In recent years, the G2P conversion has relied more often on trained G2P (usually neural network) models [7,10,15,16]. As we know, almost all approaches to G2P work only at the word level – they train models on items of large dictionaries. From our point of view, the main disadvantage of the word-level G2P approach lies in the need to design post-processing rules (which are strongly language-dependent) that would solve the specific phenomena in the particular languages, e.g., cross-word assimilation, homograph disambiguation, etc. For English, those post-processing rules have to distinguish, for example, between two variants in the following phrases1:

the [Di:] apple vs. the [D@] bear
I have not read [rEd] the book yet. vs. I read [ri:d] every evening.

Some languages, such as Czech, use assimilation of voice (sometimes, a voiceless consonant is pronounced voiced, and vice versa, depending on the circumstances):

pod [pot] pokličkou vs. pod [pod] dubem vs. pod [pot] oknem
(under the pot-lid / under the oak / under the window)

Četla jsem text [tekst]. vs. Text [tekzd] byl zajímavý.
(I have read a text. / The text was interesting.)

The above-mentioned phenomena are the primary motivation for this paper. We applied G2P on the sentence level, without using cross-word post-processing rules, to let the model capture the language-specific relations between words within a sentence. Neural network models trained on whole sentences are presented in recent papers [2,17], and the present paper follows the training process described in [17] and applies it to other languages.

2 T5 Models

As the base model in all our experiments, we used the Text-to-Text Transfer Transformer (T5) model [5]. Generally, the T5 model is trained as a complete encoder-decoder transformer. The training task for the T5 model is defined as a self-supervised task: the model tries to recover missing token spans in the output. The general advantage of the T5 model is the ability to perform many text-to-text tasks like text summarization, topic detection, or sentiment analysis. For experiments with English data, we used Google's T5-base English model (version from October 21st, 2019)2 trained from Common Crawl data3. We replicated the same pre-processing procedure to obtain the pre-training data in other

1 All examples of pronunciation are in the SAMPA alphabet [12].
2 https://github.com/google-research/text-to-text-transfer-transformer
3 https://commoncrawl.org/

Original text: Thank you for inviting me to your party last week.
Inputs: Thank you <X> me to your party <Y> week.
Targets: <X> for inviting <Y> last <Z>

Fig. 1. Example of processing the original text and creating input/output text pairs. Figure taken from [5].

languages considered in this work, and we pre-trained our own T5 models for these languages.

The T5 model is a self-supervised trained variant of the generic textual Transformer architecture [5]. The T5 model is able to construct the internal representation of the input on many linguistic layers: starting from the phonetic and syntactic through the semantic to the pragmatic layer. The T5 model is pre-trained on an artificial self-supervised task – a text restoration task from unlabelled training data. An example of the pre-training input/output text pair is shown in Fig. 1. The T5 model tries to recover missing tokens in the input sentence masked with sentinel tokens <X> and <Y>. The masked tokens are used as training targets, and the output sentence is terminated using another sentinel token <Z>. This way, the T5 learns not only the knowledge required to understand the input sentence but also the knowledge necessary to generate meaningful output sentences.

The original Google's T5-base English model (labeled as t5-EN in our paper) was trained from Common Crawl data. We replicated the same pre-processing procedure to obtain the data for other languages tested in the present paper. The pre-processing steps correspond with the steps presented in [6] for building the Colossal Clean Crawled Corpus (C4) on which Google's T5 model was pre-trained. Such rules are generally applied while processing web text (a minimal sketch of these filters follows the list):

– Only lines ending in terminal punctuation are retained. Short pages and lines are discarded.
– Pages with dirty and obscene words are removed.
– Lines with the word "JavaScript" and the curly braces { } are removed (remains of incorrect text extraction from the source code of the webpage).
– The pages in the corpus were de-duplicated. The resulting corpus contains each span of three consecutive sentences just once.
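A rough re-implementation of these heuristics is sketched below; the thresholds, the empty obscene-word list, and the function names are assumptions for illustration, and the real C4 pipeline [6] contains more rules and language-specific details.

```python
BAD_WORDS: set[str] = set()          # placeholder for the obscene-word block list
TERMINAL = (".", "!", "?", '"')

def clean_page(page_text: str, min_words_per_line: int = 5, min_lines: int = 3):
    """Apply the C4-style line/page filters described above (illustrative sketch)."""
    kept = []
    for line in page_text.splitlines():
        line = line.strip()
        if not line.endswith(TERMINAL):
            continue                              # keep only lines with terminal punctuation
        if len(line.split()) < min_words_per_line:
            continue                              # discard short lines
        if "javascript" in line.lower() or "{" in line or "}" in line:
            continue                              # leftovers of bad HTML/JS extraction
        if any(w in line.lower().split() for w in BAD_WORDS):
            return []                             # drop the whole page
        kept.append(line)
    return kept if len(kept) >= min_lines else [] # discard short pages

def dedup_three_sentence_spans(pages):
    """Simplified approximation of the de-duplication: keep a sentence only if its
    forward span of three consecutive sentences has not been seen before."""
    seen, deduped = set(), []
    for sentences in pages:
        kept = []
        for i, s in enumerate(sentences):
            span = " ".join(sentences[i:i + 3])
            if span not in seen:
                seen.add(span)
                kept.append(s)
        deduped.append(kept)
    return deduped
```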



For all languages presented in this paper (except for English, for which we use Google's T5-base English model), we have collected the corresponding CommonCrawl sub-corpus at the end of October 2021 and pre-trained language-specific T5 models (t5-XX) using those data. The textual dataset sizes are presented in Table 1. The T5 training followed the original procedure described in [5].

Table 1. Statistics of textual datasets used for training the language-specific base T5 models.

pre-trained model   dataset size [GB]
t5-CZ               122
t5-DE               649
t5-ES               683
t5-FR               511
t5-IT               274
t5-RU               2.568

We used the t5-base architecture consisting of 220M parameters and 2 × 12 transformer blocks (12 in the encoder and 12 in the decoder). The dimensionality of the hidden layers and embeddings was 768. The attention mechanism uses 12 attention heads with inner dimensionality 64. For comparison with the t5 model, we also used the "universal" multilingual model mt5-base [14] (further denoted as mt5), which was pre-trained on CommonCrawl data from 101 languages. Because the mt5 architecture uses the SentencePiece tokenizer [3] with 250k pieces, the overall number of trainable parameters is 580M. The increased number of parameters stems from the extra embedding vectors required to represent the additional sentence pieces from the tokenizer. The remaining architecture of the encoder and the decoder is the same for both the t5 and mt5 models. Because all 101 languages share the same tokenizer, the input/output texts are tokenized into a higher number of shorter pieces than with the language-specific t5 models. To recap, the language-specific SentencePiece tokenizers consistently use 32k pieces.
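For illustration, the described t5-base geometry can be written down with the Hugging Face Transformers configuration classes. The feed-forward width of 3072 follows the original t5-base and is an assumption here, as is the exact vocabulary size; this is not the configuration file used in the paper.

```python
from transformers import T5Config, T5ForConditionalGeneration

# t5-base-like setup: 12 encoder + 12 decoder blocks, hidden size 768,
# 12 attention heads with inner dimensionality 64, ~32k SentencePiece vocabulary.
cfg = T5Config(vocab_size=32_000, d_model=768, d_kv=64, num_heads=12,
               num_layers=12, num_decoder_layers=12, d_ff=3072)
model = T5ForConditionalGeneration(cfg)
print(f"{model.num_parameters() / 1e6:.0f}M parameters")  # roughly 220M

# The mt5 model keeps the same encoder/decoder geometry but uses a ~250k-piece
# tokenizer; the extra (250k - 32k) x 768 embedding vectors account for a large
# part of its bigger parameter count mentioned in the text.
```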

3 G2P Data and Models

As explained in Sect. 1, this paper focuses on training sentence-level G2P models. The recently published study [17] showed that the T5G2P model (i.e., a T5 trained for the G2P task) was able to outperform the baseline rules+dictionary approaches (or at least come very close to them) for the English and Czech languages. Following the mentioned paper, we trained T5-based G2P models for seven languages. We have collected sentences with their phonetic transcriptions for the Czech (CZ), German (DE), English (EN), Spanish (ES), Italian (IT), French (FR),


and Russian (RU) languages6. The transcriptions were generated automatically and checked by our phonetic experts. The statistics are summarized in Table 2. The sizes of the available data differ (as do the average sentence lengths), and the total numbers of unique phonemes in the particular language data also vary significantly, which probably affects the accuracy values calculated on the generated transcriptions (see Sect. 4). The Spanish language has the smallest phoneme set (according to the data we had at our disposal) since there is no distinction between short and long variants of vowels. On the other hand, the largest set, for the Italian language, is related to the phonetic phenomenon of "gemination", a consonant lengthening (a consonant articulated for a longer period of time compared to a singleton consonant), which is typical (besides other languages) for Italian [1].

For each language, a subset of 10% randomly selected sentences was used for testing. Note that the testing data contain "just" sentences unseen during the training phase. Still, many single words were seen during the training process, which is evident since common words (prepositions, basic verb forms, pronouns, and others) appear very often. That is also a reason why our word accuracy values in Table 3 cannot be compared to the word accuracy values in studies working with dictionary items. Therefore, we also focused on unseen words in the testing sentences (see Sect. 4.1).

Table 2. G2P data statistics.

language  No. of sentences  avg. sentence length (in words)  avg. sentence length (in phones)  No. of phones
CZ        442,027           11.5                             65.9                              53
DE        275,572           16.3                             80.8                              52
EN        160,668           11.8                             45.4                              49
ES        605,470           22.0                             107.5                             34
FR        635,605           17.1                             62.9                              46
IT        450,766           10.7                             54.7                              85
RU        405,844           10.5                             61.4                              54

We used the TensorFlow implementation of the Hugging Face Transformers library [13] together with the t5s library7. We fine-tuned the base models (both language-specific and multilingual, described in Sect. 2) to generate the corresponding phonetic transcription. As the T5 model is an encoder-decoder model that converts all NLP problems into a text-to-text format, we simply used whole sentences of a particular language at the input. The desired output is also defined as a sequence. Examples of input and output sequences (for Czech) are shown below:

6 We do not explicitly work with the word stress position because we believe that this information (if necessary) is hidden implicitly in the phonetic transcription for all languages in use.
7 https://github.com/honzas83/t5s


Input sequence: Dodnes tak ale neučinil.
Output sequence: dodnes tak !ale ne!uCiJil.

Input sequence: Že papež skutečně nosí místní církve ve svém srdci.
Output sequence: Ze papeS skuteCJe nosI mIstJI cIrkve ve svEm sPci.

Input sequence: Nedávno skončily Vánoce a opět se slaví.
Output sequence: nedAvno skonCili vAnoce !a !opjet se slavI.

The training lasted 100 epochs with 1000 update steps per epoch. As the learning rate scheduler, we used the Inverse Square Root Scheduler with a starting learning rate of 1 · 10^-4. All parameters were set as trainable.
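Once fine-tuned, such a model is queried like any other text-to-text model. A minimal inference sketch using the plain Hugging Face API is given below; the checkpoint path is hypothetical (the fine-tuned T5G2P models are not publicly released), and the t5s wrapper actually used in the paper is not shown.

```python
from transformers import AutoTokenizer, TFT5ForConditionalGeneration

MODEL = "path/to/t5g2p-cz"  # hypothetical local checkpoint of the fine-tuned Czech model

tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = TFT5ForConditionalGeneration.from_pretrained(MODEL)

inputs = tokenizer("Dodnes tak ale neučinil.", return_tensors="tf")
output_ids = model.generate(**inputs, max_length=128)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
# expected, following the example pairs above: "dodnes tak !ale ne!uCiJil."
```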

4 T5G2P Results

The output sentences generated by our trained T5G2P models, fine-tuned from the language-specific (t5-XX) and multilingual (mt5) base models, were compared to the reference and evaluated using the following measures: sentence accuracy, word accuracy, and phoneme accuracy. The results are shown in Table 3.

Table 3. Sentence, word, and phoneme accuracies in % for different languages. Comparison of t5-XX and mt5 models.

language  base T5 model  sentence accuracy  word accuracy  phone accuracy
CZ        mt5            98.32              99.85          99.95
          t5-CZ          98.90              99.90          99.97
DE        mt5            97.02              99.81          99.93
          t5-DE          98.02              99.88          99.95
EN        mt5            91.96              99.01          99.55
          t5-EN          93.77              99.04          99.68
ES        mt5            98.70              99.94          99.97
          t5-ES          99.40              99.97          99.98
FR        mt5            96.95              99.80          99.96
          t5-FR          98.18              99.88          99.98
IT        mt5            95.41              99.32          99.87
          t5-IT          95.74              99.38          99.88
RU        mt5            96.29              99.73          99.90
          t5-RU          97.86              99.83          99.92
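The paper does not spell out how the three measures are computed; the sketch below assumes Levenshtein-aligned error counting over whitespace-separated words and over individual phone characters, which reproduces the usual definitions.

```python
def levenshtein(ref, hyp):
    """Edit distance between two sequences of tokens."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (r != h)))
        prev = cur
    return prev[-1]

def accuracies(refs, hyps):
    """Sentence, word, and phone accuracy (%) over paired reference/hypothesis strings."""
    sent_ok = sum(r == h for r, h in zip(refs, hyps))
    word_err = sum(levenshtein(r.split(), h.split()) for r, h in zip(refs, hyps))
    phone_err = sum(levenshtein(list(r.replace(" ", "")), list(h.replace(" ", "")))
                    for r, h in zip(refs, hyps))
    n_words = sum(len(r.split()) for r in refs)
    n_phones = sum(len(r.replace(" ", "")) for r in refs)
    return (100 * sent_ok / len(refs),
            100 * (1 - word_err / n_words),
            100 * (1 - phone_err / n_phones))
```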

The results clearly show the advantage of language-specific base T5 models (compared to the multilingual mt5 model) – the G2P models using the t5-XX base models outperformed the models trained from a multilingual base in all


three observed measures for all tested languages. More specifically, the sentence error rate (i.e., the complement of sentence accuracy to 100%) decreases, for example, from approx. 3% to less than 2% for German and French, and (surprisingly) it also decreases noticeably for English. For greater clarity, the values of sentence accuracy are drawn in Fig. 2. Regarding the dependence of the sentence accuracy values on the size of the dataset available for the particular language, we can confirm that the sentence accuracy tends to increase with increasing dataset size. Nevertheless, there are some "anomalies". For example, the results for Italian are significantly worse compared to languages with a similar number of sentences available (see Table 2), but that "drop" might be explained by the most extensive set of phonemes (and perhaps also by the 2nd smallest dataset used for training the pre-trained t5-IT model, see also Table 1). The worst results regarding the accuracy evaluated on our G2P task were recorded for English. However, let us emphasize that (besides the complex English phonetic transcription) we had the smallest G2P dataset for this language. The two Slavic languages (Czech and Russian) do not differ much in the G2P data size. However, the results are a bit better for Czech; in our opinion, the Czech phonetic transcription is more straightforward than the Russian one, since Czech has a fixed word stress position. The graph in Fig. 2 also shows a significant difference between the results for Spanish and French (despite similar numbers of G2P sentences available), but, again, those languages vary in phoneme set size (French has a third more phonemes than Spanish).

4.1 Unseen Words in Testing Data

In Sect. 3, we mentioned that words in the testing data are not entirely unseen: a certain number of those words is also included in the training data, and the T5, therefore, saw them during the training phase (and probably also learned how to transcribe them correctly). Therefore, the word accuracy values for all languages are very high. Considering English (which is used frequently in many G2P studies, e.g., [10,16]), our word error rate (WER) is approximately 1%, while the state-of-the-art values on publicly available dictionaries are about (or slightly below) 20%. The numbers of unseen words in the subset of sentences used for testing are shown in Table 4: the second column contains the numbers of all unique words in the testing set, and the third column shows how many of those do not occur in the training set (only several percent, or less than 1%, depending on the language). The word error rates for those unseen words (uWER) are shown in the last column. The uWER values mostly range from 14% to 20% (with two exceptions analyzed below), meaning that approximately every sixth word not seen in the training data is transcribed incorrectly by the trained G2P models. The numbers in the first row (CZ) might be surprising. The testing data for Czech consist of many words (compared to the other languages) despite our not having the largest set of sentences at our disposal. However, that only confirms that the Czech language has a rich vocabulary.


Fig. 2. Sentence accuracy and dataset size (the phoneme set sizes in brackets).

Table 4. Word error rates of unseen words (uWER) in the test set for different languages. Evaluated using the t5-XX models.

language  No. of unique words  No. of unseen words  uWER (%)
CZ        202,923              7,807                2.66
DE        111,811              1,496                14.84
EN        30,182               625                  27.76
ES        93,160               412                  15.29
FR        76,597               498                  17.67
IT        71,442               1,039                19.92
RU        68,731               1,229                17.49

But because the Czech phonetic transcription is very regular (excluding foreign and loan words), the uWER value is the smallest one in our study. The highest uWER value, for English (27%), means that approximately one-quarter of the unknown-word transcriptions generated by our trained G2P model contain an error. This is, of course, higher compared to the state-of-the-art WER. Still, those WER values were calculated on a testing subset of all words; in our case, the uWER was evaluated on words that did not appear in any of the more than 140 thousand training sentences, so those words are mostly names (often of foreign origin and, therefore, with a less predictable pronunciation) or other rare English words, which was also confirmed by the analysis of the incorrect transcriptions.
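Restricting the evaluation to unseen words amounts to filtering the per-word correctness judgements by a set lookup. A minimal sketch is given below; how the per-word correctness itself is obtained (i.e., the alignment of reference and generated transcriptions) is outside this sketch and is an assumption.

```python
def unseen_word_error_rate(train_words: set, test_items) -> float:
    """uWER: word error rate restricted to words absent from the training sentences.

    test_items: iterable of (word, correctly_transcribed) pairs gathered from the
    aligned test sentences.
    """
    unseen = [(w, ok) for w, ok in test_items if w not in train_words]
    if not unseen:
        return 0.0
    return 100.0 * sum(not ok for _, ok in unseen) / len(unseen)

# toy usage with made-up data
train_words = {"dodnes", "tak", "ale"}
items = [("dodnes", True), ("neučinil", False), ("neučinil", True)]
print(unseen_word_error_rate(train_words, items))  # 50.0
```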

5 Conclusion

In the present paper, we used the T5 neural network for training grapheme-to-phoneme models for 7 different languages. We followed the process of training T5G2P models described in [17]. First of all, we collected a huge amount of data for the particular languages and pre-trained language-specific base models (in the same way as Google's T5-base model was trained). Afterward, those models were fine-tuned for the G2P task, having whole sentences at the input and their phonetic transcriptions at the output. The results presented in Sect. 4 clearly showed the benefits of language-specific base models compared to the multilingual mt5 model. Another advantage, already described in Sect. 2, lies in the number of trainable parameters, which is significantly higher for the mt5 model. We also focused on words in the testing data that were unknown to the T5G2P models ("unseen words") and compared the word error rate for the different languages.

For future work, we want to analyze the incorrect transcriptions for the particular languages in depth. We also plan to deal with the number of phonemes in each language and the possibility of reducing the phoneme set (especially for some applications of G2P and some languages like Italian). And finally, we aim to apply the same process of pre-training t5-XX models and fine-tuning the T5G2P models to other languages.

Acknowledgement. This research was supported by the Czech Science Foundation (GA CR), project No. GA21-14758S, and by the grant of the University of West Bohemia, project No. SGS-2022-017.

References

1. Di Benedetto, M.G., et al.: Lexical and syntactic gemination in Italian consonants – does a geminate Italian consonant consist of a repeated or a strengthened consonant? J. Acoust. Soc. Am. 149(5), 3375–3386 (2021). https://doi.org/10.1121/10.0004987
2. Jůzová, M., Tihelka, D., Vít, J.: Unified language-independent DNN-based G2P converter. In: Kubin, G., Kacic, Z. (eds.) Interspeech 2019, Graz, Austria, 15–19 September 2019, pp. 2085–2089. ISCA (2019). https://doi.org/10.21437/Interspeech.2019-2335
3. Kudo, T., Richardson, J.: SentencePiece: a simple and language independent subword tokenizer and detokenizer for neural text processing. In: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, Brussels, Belgium, pp. 66–71. Association for Computational Linguistics (2018). https://doi.org/10.18653/v1/D18-2012
4. Matoušek, J., Tihelka, D.: VITS: quality vs. speed analysis. In: Ekštein, K., Pártl, F., Konopík, M. (eds.) Text, Speech, and Dialogue, TSD 2023. LNCS, vol. 14102. Springer, Cham (2023). https://doi.org/10.1007/978-3-031-40498-6_19
5. Raffel, C., et al.: Exploring the limits of transfer learning with a unified text-to-text transformer. J. Mach. Learn. Res. 21(140), 1–67 (2020)
6. Raffel, C., et al.: Exploring the limits of transfer learning with a unified text-to-text transformer. arXiv:1910.10683 (2020)
7. Rao, K., Peng, F., Sak, H., Beaufays, F.: Grapheme-to-phoneme conversion using long short-term memory recurrent neural networks. In: 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 4225–4229 (2015)
8. Sotelo, J., et al.: Char2Wav: end-to-end speech synthesis. In: ICLR (2017)
9. Tihelka, D., Hanzlíček, Z., Jůzová, M., Vít, J., Matoušek, J., Grůber, M.: Current state of text-to-speech system ARTIC: a decade of research on the field of speech technologies. In: Sojka, P., Horák, A., Kopeček, I., Pala, K. (eds.) TSD 2018. LNCS (LNAI), vol. 11107, pp. 369–378. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-00794-2_40
10. Vaswani, A., et al.: Attention is all you need. arXiv:1706.03762 (2017)
11. Wang, Y., et al.: Tacotron: towards end-to-end speech synthesis (2017). https://arxiv.org/abs/1703.10135
12. Wells, J.C.: SAMPA computer readable phonetic alphabet. In: Gibbon, D., Moore, R., Winski, R. (eds.) Handbook of Standards and Resources for Spoken Language Systems. Mouton de Gruyter, Berlin and New York (1997)
13. Wolf, T., et al.: Transformers: state-of-the-art natural language processing. In: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, pp. 38–45. Association for Computational Linguistics (2020)
14. Xue, L., et al.: mT5: a massively multilingual pre-trained text-to-text transformer. CoRR abs/2010.11934 (2020). https://arxiv.org/abs/2010.11934
15. Yao, K., Zweig, G.: Sequence-to-sequence neural net models for grapheme-to-phoneme conversion. CoRR abs/1506.00196 (2015)
16. Yolchuyeva, S., Németh, G., Gyires-Tóth, B.: Transformer based grapheme-to-phoneme conversion. In: Interspeech 2019 (2019). https://doi.org/10.21437/interspeech.2019-1954
17. Řezáčková, M., Švec, J., Tihelka, D.: T5G2P: using text-to-text transfer transformer for grapheme-to-phoneme conversion. In: Hermansky, H., Cernocký, H., Burget, L., Lamel, L., Scharenborg, O., Motlícek, P. (eds.) Interspeech 2021, Brno, Czechia, 30 August–3 September 2021, pp. 6–10. ISCA (2021). https://doi.org/10.21437/Interspeech.2021-546

Transformer-Based Encoder-Encoder Architecture for Spoken Term Detection

Jan Švec(B), Luboš Šmídl, and Jan Lehečka

Department of Cybernetics, University of West Bohemia, Pilsen, Czech Republic
{honzas,smidl,jlehecka}@kky.zcu.cz

Abstract. The paper presents a method for spoken term detection based on the Transformer architecture. We propose the encoder-encoder architecture employing two BERT-like encoders with additional modifications, including attention masking, convolutional and upsampling layers. The encoders project a recognized hypothesis and a searched term into a shared embedding space, where the score of the putative hit is computed using the calibrated dot product. In the experiments, we used the Wav2Vec 2.0 speech recognizer. The proposed system outperformed a baseline method based on deep LSTMs on the English and Czech STD datasets based on USC Shoah Foundation Visual History Archive (MALACH).

Keywords: Neural networks · Transformer architecture · Spoken term detection

1 Introduction

Searching through large amounts of audio data is a common feature of several tasks in speech processing, namely keyword spotting (KWS), wake word detection (WWD), query-by-example (QbE), and spoken term detection (STD). These tasks differ both in the requirements imposed on the form of the query – for example, an audio snippet (QbE) or a sequence of graphemes (STD) – and in the computational resources required: low resource (WWD), real-time processing (KWS), or off-line processing (STD). In recent works, the architecture of such systems is often based on acoustic embeddings extracted using deep neural networks [16]. Such embeddings are further used to classify or detect keywords, terms, or examples. The embeddings can be extracted using convolutional networks, recurrent networks, or – more recently – Transformer networks.

Many recent papers have used the Transformer architecture for the mentioned tasks. The Keyword Transformer [2], designed for the Google Speech Command (GSC) task, directly uses the Transformer architecture to project the input audio data into a single vector used for keyword classification. A similar approach is the architecture called LETR [4]. Unfortunately, the GSC-related works often present a solution for keyword classification rather than for KWS or STD in long, streamed audio data.

The multi-headed self-attention mechanism is also used in the QbE scenario. For example, [21] proposes a combination of recursive neural networks, self-attention layers, and a hashing layer for learning binary embeddings for fast QbE speech search. The works related to WWD also study the use of the Transformer architecture. A streaming variant of the Transformer was proposed in [18]; it was modified for real-time usage and outperformed a system based on a convolutional network. Another architecture, Catt-KWS [20], uses a cascaded Transformer in an encoder-decoder setup for real-time keyword spotting.

In this work, we do not focus on the direct processing of the input speech signal. Instead, we use a speech recognizer to convert the audio signal into a graphemic recognition hypothesis. The representation of speech at the grapheme level allows preprocessing of the input audio into a compact confusion network and further into a sequence of embedding vectors. The Deep LSTM architecture proposed in [24] projects both the input speech and the searched term into a shared embedding space. A hybrid DNN-HMM speech recognizer produces phoneme confusion networks representing the input speech. The DNN-HMM speech recognizer can be replaced with the Wav2Vec 2.0 recognizer [1] with CTC loss – an algorithm for converting the CTC grapheme posteriors into a grapheme confusion network was proposed in [22]. The grapheme confusion networks were subsequently used in the Deep LSTM STD. Moving from the DNN-HMM to the Wav2Vec speech recognizer significantly improves the STD performance.

This work describes a modification of the STD neural network (Sect. 2) that replaces the Deep LSTMs with the Transformer encoder (Sect. 3). The core of the Transformer encoder has the same architecture as BERT-like (Bidirectional Encoder Representations from Transformers) models [3,10] (Sect. 3.4), but a simple drop-in replacement of the LSTMs with a vanilla Transformer encoder brings a significant degradation in STD performance, which contradicts the common understanding of Transformers as the superior class of models. This observation motivated the research presented in this paper. To overcome the LSTM baseline, we propose a set of well-motivated modifications (Sect. 3.2). The proposed method is experimentally evaluated in the domain of oral history archives (Holocaust testimonies from the USC Shoah Foundation Visual History Archive) in two languages (Sect. 4). The experimental results (Sect. 5) show that the proposed Transformer architecture outperforms the baseline Deep LSTM.

This research was supported by the Ministry of the Interior of the Czech Republic, project No. VJ01010108, and by the Czech Science Foundation (GA CR), project No. GA22-27800S.

2 Neural Spoken Term Detection Framework

The design of the encoder-encoder STD based on neural networks consists of two independent processing pipelines: (1) the projection of the recognition output by a hypothesis encoder and (2) the projection of the searched term and the estimation of its minimum length by a query encoder. Suppose the input audio is represented by the recognized hypothesis, which is a sequence of time-aligned segments $c_i$, $i = 1, \ldots, N$. Each segment $c_i$ is projected into a vector $C_i$. The query $g$ is expressed as a sequence of graphemes $g_j$, $j = 1, \ldots, M$, mapped using an input embedding layer to vectors $G_j$.

The hypothesis encoder maps the sequence of vectors $(C_i)_{i=1}^{N}$ to a sequence of embedding vectors $(R_i)_{i=1}^{N}$. The query encoder maps the vectors $(G_j)_{j=1}^{M}$ to a sequence of query embeddings $(Q_k)_{k=1}^{K}$. Here we assume that the sequences $C_i$ and $R_i$ have the same length to keep the time correspondence of $R_i$ to the input audio. The number of query embeddings $K$ is independent of the query length $M$. For example, the query can be represented by a single vector ($K = 1$) or, as in [24], by three vectors ($K = 3$; then $Q_1$ represents the first half of the query, $Q_2$ the middle of the query, and $Q_3$ the second half).

The embedding vectors $R_i$ and $Q_k$ are then used to compute the per-segment probability that segment $c_i$ is part of a putative hit of the query $g$. To compute the calibrated probabilities $r_i$, $i = 1, \ldots, N$, we use the dot product of the embedding vectors:

$$r_i = \sigma\left(\alpha \cdot \max_{k=1}^{K}\left(R_i \cdot Q_k\right) + \beta\right) \qquad (1)$$

where σ(x) denotes the sigmoid function and α and β are trainable calibration parameters. The maximum selects the query embedding Q_k that is most similar (in terms of the dot product) to R_i among all K embeddings (see Fig. 1). To determine the putative hits of the query g, the minimum number of segments L(g) is estimated by the query encoder and used to find all spans (I, J) satisfying r_i > t for all i with I ≤ i ≤ J and J − I + 1 ≥ L(g), where t is the decision threshold. In other words, we search for runs of r_i exceeding the threshold t that span at least L(g) time-aligned segments c_i. The overall score of a putative hit is determined as the average probability:

$$\mathrm{score}(g, I, J) = \frac{1}{J - I + 1} \sum_{i=I}^{J} r_i \qquad (2)$$

If there are multiple overlapping spans for a given query, only the span with the highest score is kept as a putative hit. The NN-based model is trained using a binary cross-entropy loss function. The training data can be generated on the fly by randomly selecting a word from a time-aligned transcript as a query. Because we target oral history archives, we can exploit the huge amount of speech data they contain by blindly recognizing the speech and using the correctly recognized in-vocabulary words (in terms of confidence scores) to generate the queries. Because the main focus of STD in oral history archives is on out-of-vocabulary words, a data augmentation technique can be used: for example, two or more consecutive in-vocabulary words can be randomly merged to simulate OOVs [24].
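As an illustration of Eqs. (1) and (2) and the span search described above, the following NumPy sketch computes the per-segment probabilities and the putative hits. It is our own minimal reading of the procedure (function and variable names are ours), and for simplicity each maximal above-threshold run is treated as a single candidate span:

```python
import numpy as np

def putative_hits(R, Q, alpha, beta, L_g, t=0.5):
    """Score time-aligned segments against a query and return putative hits.

    R   : (N, d) hypothesis embeddings R_i
    Q   : (K, d) query embeddings Q_k
    L_g : minimum number of segments estimated for the query g
    t   : decision threshold on the per-segment probabilities r_i
    """
    # Eq. (1): sigmoid-calibrated maximum dot product over the K query embeddings
    dots = R @ Q.T                                                  # (N, K)
    r = 1.0 / (1.0 + np.exp(-(alpha * dots.max(axis=1) + beta)))    # (N,)

    hits = []
    i, N = 0, len(r)
    while i < N:
        if r[i] > t:
            j = i
            while j + 1 < N and r[j + 1] > t:
                j += 1
            if j - i + 1 >= L_g:
                # Eq. (2): average probability over the span
                hits.append((i, j, float(r[i:j + 1].mean())))
            i = j + 1
        else:
            i += 1
    return hits
```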


Fig. 1. Neural STD framework. Per-segment probabilities are intentionally displayed as a continuous function for clarity, although they are represented as discrete values r_i.

3 Transformer-Based Architecture

In this section, we describe an application of the Transformer neural network architecture in the STD framework described in Sect. 2. The basic idea of our proposed architecture is to use just the encoder part of the Transformer to extract context-dependent vector representations of the input. This approach is similar to using a Transformer in the BERT family of models [3]. The novelty of our approach is to project both the input audio and the searched query into a shared embedding space and then score each segment of the input audio using a simple similarity measure – a sigmoid-calibrated dot product between two vectors in this embedding space. In an analogy to the encoder-decoder approach, we call this architecture the encoder-encoder.

3.1 Preliminary Experiments

Since the Transformer architecture is believed to achieve better performance than LSTMs in many NLP and speech processing tasks [5,8], we first experimented with the Deep LSTM architecture in which the LSTMs were simply replaced with Transformer blocks and positional embeddings were added to the input. Surprisingly, those experiments showed that the STD performance dropped below that of the LSTMs. The results of this vanilla Transformer, evaluated under the experimental setup described in Sects. 4 and 5, are reported in Tables 2 and 3. To improve the performance of the Transformer-based STD model, we propose several extensions of the vanilla BERT-like architecture.

3.2 Proposed Architecture

First, we hypothesize that the worse performance of the vanilla Transformer architecture is caused by the format of the input – a sequence of single graphemes. Therefore, we use a trainable 1D convolutional filter with a stride larger than 1 to reduce the temporal dimensionality of the input and to find latent projections of longer grapheme subsequences corresponding to larger subword units. The Transformer is applied on top of these projections (the output of the convolutional layer). To restore the input-output relation with the time alignment of the segments c_i, we also apply an upsampling layer (also called transposed convolution or deconvolution) to the output of the Transformer (Fig. 2). BERT-like architectures [3] use GELU activations [7], and therefore we use them not only in the encoder but also in the convolutional layer. The upsampling layer has a linear activation. Standard positional embeddings are added after the convolution and before feeding the input into the Transformer.

To estimate the minimum number of segments L(g), we add an extra token called [CLS] to the input of the query encoder. A similar approach is used in BERT pre-training, where the [CLS] token serves the next-sentence-prediction task [3]. The corresponding encoder output is transformed using a linear fully connected layer into a scalar value L(g). The training loss for this network output is the standard MSE (Fig. 4).

While the ability to condition the outputs on very distant parts of the input is one of the strongest features of the Transformer model, in STD the occurrence of a putative hit depends only on local input features, and distant dependencies are very rare. Therefore, we use attention masking to limit the multi-headed attention to attend only to a few neighboring input segments. Although attention masking is a common mechanism for introducing causality into Transformer models [11], we use it to introduce locality. The masking is implemented using a masking matrix containing ones on the diagonal and on a fixed number of super- and sub-diagonals; a small sketch of this mask is given below.
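As an illustration, the banded masking matrix described above can be constructed as follows. This is a minimal sketch; the function name and the conversion to an additive mask are ours, not part of the paper:

```python
import torch

def local_attention_mask(seq_len: int, width: int = 2) -> torch.Tensor:
    """Banded masking matrix: ones on the main diagonal and on `width`
    super- and sub-diagonals, zeros elsewhere (1 = position may be attended)."""
    idx = torch.arange(seq_len)
    band = (idx[None, :] - idx[:, None]).abs() <= width
    return band.float()

# With width=2, each position attends to itself and to the two preceding and
# two following positions (the setting selected by the grid search in Sect. 5).
mask = local_attention_mask(seq_len=8, width=2)
# An additive form (0 on the band, -inf elsewhere), as expected by many
# attention implementations, can be obtained as mask.log().
```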

3.3 Simplifications over Deep LSTM

The proposed Transformer-based encoder-encoder architecture brings some advantages over the baseline Deep LSTM. The Transformer can compensate for inaccurate time alignment of the CTC output, and therefore we did not use the output masking used in [22]. In addition, the estimation of the minimum number of segments L(g) for a given query is performed as part of the query embedding computation, without requiring a separate model.

3.4 Transformer Block

We used a classical Transformer block as presented in [17], with GELU activation functions in the feed-forward layers. The hyperparameter setting was similar to the BERT models, particularly the BERT-Mini architecture [6]: a stack of four Transformer blocks with four self-attention heads, an embedding dimensionality of 256, and a feed-forward dimensionality of 1024. The dropout probability was 0.15.

Fig. 2. Architecture of the hypothesis encoder: the vectors Ci derived from the recognized grapheme confusion network pass through a convolutional layer (GELU activation), positional embeddings are added, a stack of N Transformer blocks is applied, and an upsampling layer (linear activation) produces the grapheme embeddings Ri.

Fig. 3. Mapping of confusion segments ci to vectors Ci (dim 256): duration features (15), probability features (16), and grapheme features (3 × 75) of the top-3 transitions.
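For orientation, the BERT-Mini-like hyperparameters listed in Sect. 3.4 correspond roughly to the following PyTorch stand-in; this is an illustrative configuration, not the authors' implementation:

```python
import torch.nn as nn

# BERT-Mini-like encoder used in both the hypothesis and the query encoder
# (4 blocks, 4 heads, embedding dim 256, feed-forward dim 1024, GELU, dropout 0.15).
encoder_layer = nn.TransformerEncoderLayer(
    d_model=256,           # embedding dimensionality
    nhead=4,               # self-attention heads
    dim_feedforward=1024,  # feed-forward dimensionality
    dropout=0.15,
    activation="gelu",
    batch_first=True,
)
encoder = nn.TransformerEncoder(encoder_layer, num_layers=4)
```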

3.5 Input Transformation

The proposed STD system does not process the input audio directly. Instead, it uses the fine-tuned Wav2Vec 2.0 model to recognize a grapheme-based representation of the input, which is converted into a grapheme confusion network using the deterministic procedure from [22]. For each segment of the confusion network, the top three most probable transitions are taken into account. Each segment is encoded into a 256-dimensional vector consisting of 15 duration features generated from the scalar duration of the segment using two tanh() fully connected layers, 16 transition-probability features generated by another set of fully connected layers, and 3 × 75 grapheme embeddings for the three most probable symbols of the segment (Fig. 3). The query is represented directly as a sequence of query graphemes projected into a 256-dimensional embedding space.

Fig. 4. Architecture of the query encoder: the query graphemes gj are mapped to vectors Gj, pass through a convolutional layer (GELU activation) with positional embeddings added, and a stack of N Transformer blocks; a linear layer produces the query embeddings Qk (k = 1, ..., K), and a time-distributed linear layer applied to the [CLS] embedding outputs the minimum number of segments L(g).
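The segment-to-vector mapping of Sect. 3.5 (and Fig. 3) can be sketched as follows. The hidden-layer sizes and the assumption that the three transition probabilities feed the probability branch are ours; only the output dimensionalities (15 + 16 + 3 × 75 = 256) are given in the text:

```python
import torch
import torch.nn as nn

class SegmentEncoder(nn.Module):
    """Illustrative encoding of one confusion-network segment into a
    256-dimensional vector C_i (15 + 16 + 3*75 = 256)."""

    def __init__(self, n_graphemes: int = 53):  # 53 symbols for English (Sect. 4)
        super().__init__()
        # two tanh fully connected layers producing 15 duration features
        self.dur = nn.Sequential(nn.Linear(1, 15), nn.Tanh(),
                                 nn.Linear(15, 15), nn.Tanh())
        # another set of fully connected layers producing 16 probability features
        self.prob = nn.Sequential(nn.Linear(3, 16), nn.Tanh(),
                                  nn.Linear(16, 16), nn.Tanh())
        # 75-dimensional grapheme embeddings for the top-3 symbols
        self.emb = nn.Embedding(n_graphemes, 75)

    def forward(self, duration, probs, top3_ids):
        # duration: (B, 1) float, probs: (B, 3) float, top3_ids: (B, 3) long
        d = self.dur(duration)             # (B, 15)
        p = self.prob(probs)               # (B, 16)
        g = self.emb(top3_ids).flatten(1)  # (B, 225)
        return torch.cat([d, p, g], dim=-1)  # (B, 256)
```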

4 Dataset and Model Description

The presented method was evaluated on data from the USC-SFI MALACH archive in two languages – English [15] and Czech [14]. The training data for the NN-based STD were extracted by blindly recognizing the archive using the hybrid DNN-HMM ASR with acoustic and language models trained on the manually transcribed part of the archives. Basic statistics are summarized in Table 1 (more details are provided in [23]).

We used the Wav2Vec model fine-tuned on the MALACH data to generate the grapheme confusion networks. For English, we started with the publicly available pre-trained Wav2Vec 2.0 Base model [12]. For Czech, we used the ClTRUS model [9], pre-trained on more than 80 thousand hours of Czech speech data following the same pre-training steps as the base Wav2Vec 2.0 model [1]. The Fairseq tool [13] was used for fine-tuning. We sliced long training audio signals at speech pauses so as not to exceed a length of 30 s. We removed non-speech events and punctuation from the transcripts and mapped all graphemes to lowercase. The pre-trained models were fine-tuned for 80k updates with a peak learning rate of 8 × 10^-5 for English and 2 × 10^-5 for Czech, respectively. The CTC classification layer predicts probabilities of 53 symbols for English and 51 for Czech, and the output frame length is 0.02 s for both models. It is important to note that the CTC loss does not guarantee a precise time alignment of the generated sequence of symbols. However, the timing produced by the fine-tuned models is sufficient to perform STD over the generated hypothesis, as was shown for in-vocabulary terms in [22].

The in-vocabulary (IV) and out-of-vocabulary (OOV) terms were selected automatically from the development and test data based on the DNN-HMM recognition vocabulary. We filtered all candidate terms so that no term is a substring of another term in the dataset or of a word in the vocabulary. The numbers reported in this paper are directly comparable to the results presented in [22–24].
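For reference, the fine-tuning setup described above can be summarized as the following illustrative configuration; it merely restates the values given in the text and is not an actual Fairseq configuration file:

```python
# Wav2Vec 2.0 CTC fine-tuning setup as described in Sect. 4 (restated values only;
# dictionary contents and the remaining Fairseq options are not specified here).
FINETUNE_CFG = {
    "english": {
        "pretrained_model": "Wav2Vec 2.0 Base",  # publicly available model [12]
        "peak_learning_rate": 8e-5,
        "updates": 80_000,
        "ctc_symbols": 53,
        "output_frame_length_s": 0.02,
    },
    "czech": {
        "pretrained_model": "ClTRUS",            # Czech pre-trained model [9]
        "peak_learning_rate": 2e-5,
        "updates": 80_000,
        "ctc_symbols": 51,
        "output_frame_length_s": 0.02,
    },
    "max_training_segment_s": 30,                # audio sliced at speech pauses
}
```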

5 Experimental Results

For training the Transformer-based encoder-encoder architecture, we used the ADAM optimizer with a learning-rate warm-up. The warm-up raised the learning rate from 0 to 10^-4 in the initial 80k training steps, and then the learning rate decayed linearly to 0 over the next 720k training steps. The batch size was 32. We used lower learning rates and longer training than for the Deep LSTM because the training of Transformers tends to collapse if higher rates are used. The models were trained using N = 256 (the number of confusion-network segments in the input) and M = 16 (the maximum length of the query). The thresholding parameter t = 0.5 was used to determine the putative hit spans.

Table 1. Statistics of the development and test sets [23]. ASR means the hybrid DNN-HMM ASR.

                          English             Czech
                          Dev      Test       Dev      Test
ASR vocabulary size           243,699             252,082
#speakers                 10       10         10       10
OOV rate                  0.5%     3.2%       0.3%     2.6%
ASR word error rate       24.10    19.66      23.98    19.11
#IV terms                 597      601        1680     1673
#OOV terms                31       6          1145     948
dataset length [hours]    11.1     11.3       20.4     19.4

In the experiments, we first optimized the MTWV metric [19] on the development dataset (Table 2). Then, for a given architecture, the optimal decision threshold was determined and applied to the test dataset, and the ATWV metric was computed (Table 3). In the experiments, we used both in-vocabulary and out-of-vocabulary terms. The presented method is designed to generalize from the seen IV terms to the detection of OOV terms.

Table 2 follows the changes proposed in Sect. 3. The drop-in replacement of the Deep LSTM with the vanilla Transformer degraded the performance. The addition of the convolutional and upsampling layers led to a minor improvement. We searched for the optimal 1D convolution width and stride with a grid search (widths 1 to 5, strides 2 to 4) and found that width 3 and stride 2 maximized the MTWV, which leads to K = 8 query embeddings per query. In the next step, we added attention masking. Again, we swept across an interval of attention-mask widths – diagonal matrices of ones with 1 to 5 super- and sub-diagonals – and finally used the matrix with two super- and two sub-diagonals. In other words, each Transformer block attends to the current, the two preceding, and the two following time steps. Note that this does not imply the context is limited to five segments: because several Transformer blocks are stacked, the context of the last layer is wider. Overall, the proposed network architecture has a slightly larger number of trainable parameters (7.3M) than the Deep LSTM (6.9M).

Table 2. Results on the development dataset (MTWV↑).

                                English   Czech
Deep LSTM (baseline [22])       0.8308    0.8987
Vanilla Transformer             0.8163    0.8808
Transformer (proposed method)
  + Convolution layers          0.8395    0.8905
  + Attention masking           0.8588    0.9261
In-vocabulary terms             0.8613    0.9355
Out-of-vocabulary terms         0.8156    0.9112
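The warm-up-and-linear-decay learning-rate schedule described at the beginning of this section can be written compactly; the helper below is our own sketch, with the step counts taken from the text:

```python
def learning_rate(step: int, peak: float = 1e-4,
                  warmup_steps: int = 80_000, decay_steps: int = 720_000) -> float:
    """Warm up from 0 to `peak` over `warmup_steps`, then decay linearly to 0
    over the next `decay_steps` training steps."""
    if step < warmup_steps:
        return peak * step / warmup_steps
    step -= warmup_steps
    if step < decay_steps:
        return peak * (1.0 - step / decay_steps)
    return 0.0
```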

As the final step, the presented architectures were evaluated on the test dataset (Table 3). The ATWV on the test data follows the improvement in MTWV on the development dataset. For illustration, the difference between the MTWV evaluated on IV and on OOV terms is shown in the last two lines of Table 2. The optimal ATWV decision threshold for IV and OOV terms was almost the same for English (0.83 vs. 0.81) and exactly the same for Czech (both 0.77). As expected, the MTWV for IV terms is slightly higher than for the OOV terms.


Table 3. Results on the test dataset (ATWV↑).

                                English   Czech
Deep LSTM (baseline [22])       0.7616    0.9100
Vanilla Transformer             0.7319    0.8745
Transformer (proposed method)   0.7919    0.9138

6 Conclusion

We proposed an NN-based STD method employing two BERT-like encoders. We modified the vanilla Transformer by adding convolutional and upsampling layers, and for the hypothesis encoder we also used the attention masking mechanism. The presented modifications of the NN-based STD employing the Transformer encoder-encoder architecture achieved an improvement of more than 0.04 in MTWV/ATWV on the development and test datasets over the baseline vanilla Transformer. The achieved results also outperform the baseline Deep LSTM architecture on all datasets.

The proposed modifications are usable not only in the Transformer-based STD task employing graphemic queries and recognized hypotheses but also in related tasks such as QbE, KWS, or WWD. The architecture of the Transformer encoder-encoder model opens further research questions, such as the possibility of combining STD (graphemic query) and QbE (spoken query) in a single, multi-task trained model.

Acknowledgement. Computational resources were supplied by the project “e-Infrastruktura CZ” (e-INFRA CZ LM2018140) supported by the Ministry of Education, Youth, and Sports of the Czech Republic.

References
1. Baevski, A., Zhou, H., Mohamed, A., Auli, M.: Wav2Vec 2.0: a framework for self-supervised learning of speech representations. In: Proceedings of the 34th International Conference on Neural Information Processing Systems, NIPS 2020. Curran Associates Inc., Red Hook, NY, USA (2020)
2. Berg, A., O'Connor, M., Cruz, M.T.: Keyword transformer: a self-attention model for keyword spotting. In: Proceedings of Interspeech 2021, pp. 4249–4253 (2021). https://doi.org/10.21437/Interspeech.2021-1286
3. Devlin, J., Chang, M.W., Lee, K., Toutanova, K.: BERT: pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018)
4. Ding, K., Zong, M., Li, J., Li, B.: LETR: a lightweight and efficient transformer for keyword spotting. In: ICASSP 2022 – 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 7987–7991 (2022). https://doi.org/10.1109/ICASSP43922.2022.9747295
5. Gillioz, A., Casas, J., Mugellini, E., Khaled, O.A.: Overview of the transformer-based models for NLP tasks. In: 2020 15th Conference on Computer Science and Information Systems (FedCSIS), pp. 179–183 (2020). https://doi.org/10.15439/2020F20
6. Google Research: TensorFlow code and pre-trained models for BERT, March 2020. https://github.com/google-research/bert
7. Hendrycks, D., Gimpel, K.: Gaussian Error Linear Units (GELUs). arXiv preprint arXiv:1606.08415 (2016)
8. Karita, S., et al.: A comparative study on transformer vs RNN in speech applications. In: 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), pp. 449–456 (2019). https://doi.org/10.1109/ASRU46091.2019.9003750
9. Lehečka, J., Švec, J., Pražák, A., Psutka, J.: Exploring capabilities of monolingual audio transformers using large datasets in automatic speech recognition of Czech. In: Proceedings of Interspeech 2022, pp. 1831–1835 (2022). https://doi.org/10.21437/Interspeech.2022-10439
10. Liu, Y., et al.: RoBERTa: a robustly optimized BERT pretraining approach. arXiv preprint arXiv:1907.11692 (2019)
11. Luo, Z., et al.: DecBERT: enhancing the language understanding of BERT with causal attention masks. In: Findings of the Association for Computational Linguistics: NAACL 2022, pp. 1185–1197. Association for Computational Linguistics, Seattle, United States, July 2022. https://doi.org/10.18653/v1/2022.findings-naacl.89, https://aclanthology.org/2022.findings-naacl.89
12. Meta Research: Wav2Vec 2.0, April 2022. https://github.com/facebookresearch/fairseq/blob/main/examples/wav2vec/README.md
13. Ott, M., et al.: fairseq: a fast, extensible toolkit for sequence modeling. In: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics (Demonstrations), pp. 48–53. Association for Computational Linguistics, Minneapolis, Minnesota, June 2019. https://doi.org/10.18653/v1/N19-4009, https://aclanthology.org/N19-4009
14. Psutka, J., Radová, V., Ircing, P., Matoušek, J., Müller, L.: USC-SFI MALACH Interviews and Transcripts Czech LDC2014S04. Linguistic Data Consortium, Philadelphia (2014). https://catalog.ldc.upenn.edu/LDC2014S04
15. Ramabhadran, B., et al.: USC-SFI MALACH Interviews and Transcripts English LDC2012S05. Linguistic Data Consortium, Philadelphia (2012). https://catalog.ldc.upenn.edu/LDC2012S05
16. Settle, S., Livescu, K.: Discriminative acoustic word embeddings: recurrent neural network-based approaches. In: 2016 IEEE Spoken Language Technology Workshop (SLT), pp. 503–510 (2016). https://doi.org/10.1109/SLT.2016.7846310
17. Vaswani, A., et al.: Attention is all you need. In: Advances in Neural Information Processing Systems, pp. 5999–6009 (2017)
18. Wang, Y., Lv, H., Povey, D., Xie, L., Khudanpur, S.: Wake word detection with streaming transformers. In: ICASSP 2021 – 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5864–5868 (2021). https://doi.org/10.1109/ICASSP39728.2021.9414777
19. Wegmann, S., Faria, A., Janin, A., Riedhammer, K., Morgan, N.: The TAO of ATWV: probing the mysteries of keyword search performance. In: 2013 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2013 Proceedings, pp. 192–197 (2013). https://doi.org/10.1109/ASRU.2013.6707728
20. Yang, D., et al.: CaTT-KWS: a multi-stage customized keyword spotting framework based on cascaded transducer-transformer. In: Proceedings of Interspeech 2022, pp. 1681–1685 (2022). https://doi.org/10.21437/Interspeech.2022-10258
21. Yuan, Y., Xie, L., Leung, C.C., Chen, H., Ma, B.: Fast query-by-example speech search using attention-based deep binary embeddings. IEEE/ACM Trans. Audio Speech Lang. Process. 28, 1988–2000 (2020). https://doi.org/10.1109/TASLP.2020.2998277
22. Švec, J., Lehečka, J., Šmídl, L.: Deep LSTM spoken term detection using Wav2Vec 2.0 recognizer. In: Proceedings of Interspeech 2022, pp. 1886–1890 (2022). https://doi.org/10.21437/Interspeech.2022-10409
23. Švec, J., Psutka, J.V., Šmídl, L., Trmal, J.: A relevance score estimation for spoken term detection based on RNN-generated pronunciation embeddings. In: Proceedings of Interspeech 2017, pp. 2934–2938 (2017). https://doi.org/10.21437/Interspeech.2017-1087
24. Švec, J., Šmídl, L., Psutka, J.V., Pražák, A.: Spoken term detection and relevance score estimation using dot-product of pronunciation embeddings. In: Proceedings of Interspeech 2021, pp. 4398–4402 (2021). https://doi.org/10.21437/Interspeech.2021-1704

Will XAI Provide Real Explanation or Just a Plausible Rationalization?

Pavel Ircing and Jan Švec

Department of Cybernetics, Faculty of Applied Sciences, University of West Bohemia, Univerzitní 8, 301 00 Plzeň, Czech Republic
{ircing,honzas}@kky.zcu.cz

Abstract. This paper discusses the analogies between the mainstream theory of the human mind and the two broad paradigms that are employed when building artificial intelligence (AI) systems. It then ponders how those analogies could be utilized in building truly explainable AI applications. The core part is devoted to the problem of unwanted rationalization, which could disguise the true reasons behind the decisions of explainable AI systems.

Keywords: explainable AI · dual process theory · rationalization

1 Introduction

The idea of Explainable Artificial Intelligence (XAI) has become very prominent recently, both in the AI research community [7] and in society in general. This trend is only natural, given the increasing number of areas where important decisions are entrusted to various machine learning algorithms. Such areas include (but are not limited to) the health, employment, financial, and even law enforcement and justice sectors, so it is no surprise that the concerned stakeholders demand an explanation and/or justification of those decisions (more about this terminology is given in Sect. 3). The “right to explanation” is even enshrined in the EU's General Data Protection Regulation (GDPR) (see [13] for a commentary).

This paper, however, is not meant as an overview of the existing XAI approaches (such an overview can be found, for example, in [26]), nor does it discuss any particular method that could be used to explain AI decisions. Instead, it elaborates on one of the pitfalls of AI explanations that stems from the very nature of both the human mind and the AI that was – at least to some extent – designed to imitate it.

The work has been supported by the grant of the University of West Bohemia, project No. SGS-2022-017.

2 The Relation of the AI Paradigms to the Working of the Human Mind

2.1 Two Major AI Paradigms

Let us first recap a couple of well-known facts. In principle, there are two broad approaches to building AI systems. The older one is usually called symbolic AI or, somewhat pejoratively, “Good Old Fashioned AI” (GOFAI) [10]. The systems employing symbolic AI – be it the problem-solving and game-playing algorithms or the more practical expert systems – were based on mostly human-designed rules that were readable by both humans and computers. As such, those systems had a naturally built-in ability to explain their decisions, often simply by presenting the rules that were used during the inference of a particular outcome.

On the other hand, there were three major drawbacks of the symbolic AI systems that prevented them from being used on a larger scale. First, it turned out to be close to impossible to manually design a set of rules that would accurately capture even a relatively small real-world task with all of its possible inconsistencies and/or exceptions. Second, for the few cases where a comprehensive set of rules was successfully designed, the rule system was so complex that the resulting system suffered from a combinatorial explosion that even contemporary computational resources could not deal with. And finally, the GOFAI systems had significant problems with connecting to the physical world – a lot of “cognitive work” had to be done by the users, who needed to convert the “signals” from the physical world into an appropriate symbolic representation.

Due to the reasons given above, essentially all successful AI systems that are used today are based on the other paradigm – the one that is sometimes referred to as subsymbolic AI but most commonly known as machine learning. In recent years, when neural networks are most frequently used as the underlying machine learning model, the term deep learning has gained popularity. Yet another term for employing neural networks in AI – connectionist AI – is in fact rather old [23] but is used much more often by cognitive scientists than by AI researchers themselves. Regardless of the name, the machine learning methods (we will prefer this term throughout the paper) are – in comparison with the symbolic AI systems – much more efficient in terms of performance and usage of computational resources, and they require very little (target) domain expert knowledge and “manual” expert work. They also deal better with the inconsistencies present in the majority of real-life tasks and connect relatively well to the physical world, as they can seamlessly process signals from various sensors. Possibly the biggest advantage over the GOFAI systems is their explicitly declared ability to learn – in other words, to improve their performance over time as more data from the task are observed. The downside of the machine learning approach – again in comparison with symbolic AI – is the inherent “opacity” of the system; it yields a result that is correct most of the time, but it is hard for a human to understand why such an outcome has been produced.

2.2 Human Mind – The Dual Process Theory

The previous section essentially summarizes facts that are taught in almost all introductory AI courses. What is, however, much less frequently discussed is the remarkable similarity of the two aforementioned AI paradigms to the dual process theory of the human mind. This theory – probably formalized for the first time in [27] and widely popularized by Daniel Kahneman a couple of decades later [12] – posits that the human mind works in two rather distinct operational modes. Or, as Kahneman puts it, our mind is composed of two different systems [12]:

– System 1 is fast, unconscious, virtually effortless, automatic, and intuitive. It is based on experience, memories, and emotions. It constantly “monitors” the environment and evaluates the situation. It generates impressions, feelings, and inclinations; these can become beliefs and attitudes when endorsed by System 2.
– System 2 is slow, conscious, and effortful, and it requires full attention (and thus uses a lot of “mental resources”). It is analytical and logical and performs reasoning in its true sense. It is in a “stand-by mode” most of the time and is called into action when needed.

It is important to stress that the concept of these two systems is only a model and therefore partially a metaphor and certainly a simplification. It does not suggest that we can actually locate the two systems in distinct areas of the brain. Also, it is not true that the two systems work hierarchically, as is often implicitly understood – more accurately, they work in a somewhat parallel fashion. And finally, the common simplification that System 1 is the only one prone to errors and cognitive biases whereas System 2 is strictly rational is also a misconception that has been frequently refuted by Kahneman himself.

2.3 The Relation Between the AI and the Mind

However, if we really look at the two systems described in the previous section as separate entities for a moment, there is a striking resemblance to the properties of the AI systems based on the two paradigms described in Sect. 2.1.

First, let us consider a couple of the most widespread AI applications employing the machine learning paradigm: various image processing engines, including those performing face recognition and OCR, but also the visual components of the sensing systems of autonomous vehicles; speech recognition and machine translation applications, etc. Those are clear examples of systems that monitor the environment (almost) constantly and provide quick and, most of the time, rather accurate results. This certainly resembles the workings of System 1 as described in Sect. 2.2. Moreover, one of the rules of thumb shared by AI practitioners says that if one instance of the task at hand can be solved by a human in approximately one second (or faster), then there is a good chance that it will be possible to train a machine learning system for this task (assuming, of course, that we are able to obtain a sufficient amount of training data) [19]. Let us think about it for a moment – how long does it take for a human to recognize a familiar face? Or evaluate the situation at a crossroad? Or even translate a sentence from one language to another, given that he/she is fluent in both languages? At the same time, those humans – who have just used their System 1 to perform the mentioned tasks – are not able to immediately explain how exactly they have reached a given outcome; it was actually found that they need to activate System 2 for this (see, for example, [3]). So even in this sense, System 1 is similar to a standard machine learning AI system.

On the other hand, the tasks solved by the original, symbolic AI systems were actually those requiring long, systematic, and concentrated human mental effort – such as proving mathematical theorems [18] or playing chess [14]. Those are the tasks which humans can solve only using System 2 (however, see the brief note about chess playing below). In fact, this type of cognitive work was sometimes considered to be the only true expression of human intelligence – in all fields ranging from psychology through cognitive science to artificial intelligence. As the roboticist Rodney Brooks states in [5]: Judging by projects chosen in the early days of AI, intelligence was thought to be best characterized as the things that highly educated male scientists found challenging.

As was already mentioned above, both the human System 2 and the symbolic AI systems are (relatively) slow and use up a lot of “computational resources”; but they are also both able to provide at least some explanation of their conclusions.

Let us now make a brief side note. One might argue that chess is actually a counterexample to this AI-mind analogy, since the best chess algorithms are nowadays based on the machine learning paradigm (and thus humans, too, should be playing chess with their System 1). However, we can refer to Kahneman [12] once more to refute this objection: When confronted with a problem – choosing a chess move or deciding whether to invest in a stock – the machinery of intuitive thought (i.e., System 1 – author's remark) does the best it can. If the individual has relevant expertise, she will recognize the situation, and the intuitive solution that comes to her mind is likely to be correct. This is what happens when a chess master looks at a complex position: the few moves that immediately occur to him are all strong. ... Studies of chess masters have shown that at least 10,000 hours of dedicated practice (about 6 years of playing chess 5 h a day) are required to attain the highest levels of performance. During those hours of intense concentration, a serious chess player becomes familiar with thousands of configurations, each consisting of an arrangement of related pieces that can threaten or defend each other.


The quote above – although of course not meant as such – reads like a (popular) textbook description of a machine learning algorithm, so we see a very good mind-AI analogy once again.

Now let us go back to the main topic of this paper. Since we have just argued that there is a strong resemblance between the two systems of the human mind and the two paradigms of AI, we could use the human mind “setup” – where, roughly speaking, System 1 often provides a fast and accurate result and System 2 adds an explanation, if necessary – as an inspiration for explainable AI systems. But before elaborating on such a concept, let us first review what exactly is understood by “explainability” within the context of AI.

3 More Detailed Specification of Explainability

Although “explainable AI” is currently a hot research topic, upon closer inspection it turns out that the term itself is often poorly defined [15] and/or understood differently by different research communities or even individual researchers. A concise and well-thought-out conceptualization of different possible views of the general notion of “explainability” is given in [6]. Let us briefly summarize their view here. The authors of [6] specify four broad classes of AI systems – see below – with regard to the user's chances to understand the inner workings of the system. Unfortunately, they sometimes use terminology that contradicts the one used in other influential XAI papers (e.g. [21]), and thus we need to adjust the terms slightly and put them into the context of other related work. The classes are the following:

Opaque systems are the standard machine learning systems where the user can only see the input and the corresponding output of the system (the proverbial “black-box” situation).

Transparent systems are the ones that allow the user to see the mathematical model used within the AI system [20] (the “glass-box” scenario). (The authors of the original paper [6] actually use the term interpretable systems for this class. However, this term is standardly used to describe a rather different class of AI systems – see [21] and the following sections. The term transparent is also sometimes used in a slightly different meaning within the XAI community, but the deviation from our usage is not so striking, and the term was chosen here to express the opposite of opaque.) So, at least in theory, a user can interpret how the system works inside – of course, only if he/she has a very good understanding of the given model formalism. And even in such a case, the direct interpretability of the state-of-the-art models based on deep neural networks is very limited. That is the reason why some other authors classify both opaque and transparent systems into one class of black-box systems. For example, Rudin distinguishes the black box of the first type – “a function that is too complicated for any human to comprehend” [21], which is basically the case for the vast majority of current machine learning models even in the “glass-box” setting – and the black box of the second type – “a function that is proprietary” [21], i.e. hidden from the user mostly in an attempt to protect a trade secret. While the distinction between the two classes above is interesting and in many contexts important, we can leave it aside for our purposes, as we are interested in the AI systems that actually do provide some kind of explanation. Such systems are divided in [6] into two further classes:

Comprehensible systems are the AI engines that provide not only the standard output (e.g. the classification decision) but also a set of auxiliary symbols that should help the user to understand why the system produced such an output. Those symbols are most often words but can also take other forms, such as visualisations highlighting specific segments of the image, etc. (The authors of [6] give an example where an image recognition system is presented with a photo of a factory and outputs the classification result factory together with the “explanatory” words halogen lights, machines, concrete floor.) It is important to point out that in this case the system still does not really explain its decisions. It provides just cues for humans on where to start their own deliberation about the reasons that led to the specific outcome. It is only natural that different people can arrive at – sometimes substantially – different conclusions. Those conclusions will be affected not only by the person's knowledge of the domain from which the input data are sampled and his/her cultural and educational background but also by his/her beliefs and attitudes (see Sect. 4 for details).

Explainable systems worthy of this name are, according to the authors, only the systems that can explicitly formulate the line of reasoning that explains the system's output, using full sentences in natural language understandable even to non-experts in AI. (In an example analogous to the one given above, such a system would output something like: The image has halogen lights, a concrete floor and many machines. These objects are often present in and related to factory operations. The system thus believes the image is of a factory scene [6].)

Although it is not explicitly stated, the authors of [6] silently assume that all the (successful) machine learning models are inherently black boxes that require a post-hoc explanation. Rudin in [21] offers a contrasting approach, advocating the usage of so-called interpretable models, i.e. models that are designed in a way that naturally allows human interpretation/understanding, without the need for additional “explaining” machinery. She gives an example of such a model – CORELS, a rule-based system where the rules are not handcrafted but learned automatically [2].

ˇ P. Ircing and J. Svec

However, if we look at the state-of-the-art in the AI field, it seems that we will in fact really have to build a system that would be able to explain the decision of some neural network. When doing so, it seems only natural to once again take an inspiration from the human mind as suggested in Sect. 2.3 and construct the hybrid model where the neural networks and symbolic AI approaches would be intertwined and able to provide fast and correct results together with an explicit explanation. However, the current attempts to design such a system are still in the stage of a theoretical concepts [4] or they are designed for studying the cognitive processes themselves, not for solving difficult real-world tasks [24] (i.e., they belong rather to the field of cognitive science, not artificial intelligence). But even if there were such hybrid AI system available – and they probably sooner or later will be – they will still face the possible bias inherent to the human mind and described in the following section.

4

The Danger of Rationalization

The fact that the specifics of human cognitive processes need to be taken into account when designing comprehensible/explainable AI systems has already attracted the attention of many researchers. However, not even the extremely well-structured and unusually thorough paper [17] that presents many ways in which the XAI researchers should explore and exploit the findings of social scientists mentions the danger of rationalization. The concept of rationalization was introduced to psychology more than a century ago [11]. Although the name suggests that it is something directly connected with the rational reasoning/behavior and as such a highly desirable trait, both in humans and machines (modern AI often uses the overarching notion of designing rational agents [22]), the opposite is actually true. The term rationalization is used for in fact irrational or logically faulted explanation of the decisions or actions that a person makes instinctively or on the basis of his/her beliefs, attitudes, desires or social pressure. When a person is rationalizing, he/she is creating a seemingly logical construct that explains the decision/action in a manner that would justify it for themselves or – maybe even more often – for their social group. Using the terminology of the dual process theory from Sect. 2.2, the System 2 is trying to come up with a line of reasoning that would explain the decision of the System 1. In this process, the System 2 is trying to reduce the cognitive dissonance [9] – the perceived inconsistency between the performed decision, action or observed facts and the inner beliefs, attitudes and values of the person. So once again, there is a close analogy with AI systems – we can think of cognitive dissonance as the objective function that we are striving to minimize. A truly rational agent should take the observation (info about the performed decision, action – state of the environment in general) and the inner logic of reasoning for granted and if there is a cognitive dissonance, it is the inner beliefs/attitudes/values that should be adjusted. However, the rationalizing agent would do something rather different. In fact, we find it useful to actually distinguish two types of rationalizing agents

Will XAI Provide Real Explanation or Just a Plausible Rationalization?

365

here. The first type would be rationalizing for themselves, just as the humans do. Such agents would strongly prefers to keep their inner settings intact – instead, they will try to achieve consistency by either ignoring (part of) the observation or bending the logic of reasoning. But designing the agents that would be able to “deceive” themselves in such a way would not make much sense, at least not from the “practical AI” point of view (although it might be interesting e.g. for cognitive scientists). What is more interesting to study for the “engineering” branch of AI are the rationalizing agents of the second type – that ones that would produce the rationalized explanation 4 for the human users. The designers of such systems often – either consciously or just intuitively – exploit the findings from the field of cognitive sciences, especially the one regarding coherence. Thagard in his influential work [25] argues that the coherence is the most important criterion defining a good explanation – that is, not only coherence within the explanation itself but also the coherence with the prior beliefs of the explainee (the person for whom the explanation is intended), with the prevailing societal narratives (e.g. concerning climate changes or human rights) and possibly other contexts. It has been experimentally proven (see e.g. again [25] or [17]) that (at least apparent) coherence of the explanation is valued more than its completeness. So, the designers of the AI systems often use explaining mechanisms that intentionally produce rationalized output – i.e., the output (often in natural language) that is plausible and seemingly logical but cares more about the plausibility than about the true correspondence with the real grounds for the AI system’s decisions. There may be several motivations for using the rationalization as described in the previous paragraph. First of them could be to actually conceal the true inner mechanism of the used algorithms – in such cases, the rationalization is designed to intentionally mislead the “social group” of system’s users.5 Second possible motivation is actually well-intended – the developers just want to produce the explanation that would be perceived positively by human users. A natural way of producing such an explanation is to “mimic” the sentences that the actual humans usually use in similar context. Given the recent progress in NLP, such a thing could be rather easily achieved either by taking the machine translation approach that “translates” the states of the AI agent into natural language [8] or, even more generally, by using the state-of-the-art natural language generation models such as GPT-3 [28]. At the time when the first version of this paper was drafted (June 2021), the idea of producing the “most plausible text output/explanation” was mostly a theoretical concept that 4

5

It is probably clear from the context that the following paragraphs deal with the class of explainable systems as defined in Sect. 3. However, the rationalization – in this case on the human side – plays an important role in the comprehensible systems as well, since it is the human user who constructs the actual full-scale explanation from the “hints” provided by the system. There could be several motivations for creating such a misleading systems – one of the frequent ones would be to make a false impression that the system respect some desirable ethical standards [1].

366

ˇ P. Ircing and J. Svec

was being tested in OpenAI labs [29]. However, the deployment of ChatGPT in late November 2022 has made this approach widely known even for the general public under the term Reinforcement Learning from Human Feedback (RLHF). The “pre-ChatGPT” versions of GPT large language models were in fact known to reinforce rather then mitigate the inherent human biases [16]. The OpenAI developers apparently make a continuous effort to suppress the bias in ChatGPT but it’s most probably performed by (a rather ad-hoc) filtering and post-processing the output of the foundation GPT models which still remain biased as they were trained using human-produced text data that are, of course, human-biased by definition. Possibly a way to construct a good XAI system would be to indeed take the known human biases into account but not to abuse them – either malevolently (see motivation one above) or just to make the developer’s job easier (motivation two). One possible direction could be to provide customized explanation, tailored to the beliefs of a particular user, in order to increase his/her acceptance of a particular AI systems. In such a case, however, the developer must never sacrifice the truthfulness of the explanation to its plausibility. Our experience with ChatGPT so far suggests the opposite - we have a feeling that plausibility is weighted most in the RLHF, despite the claims that people involved in RLHF scoring should reward mainly actualness and truthfulness.

5

Conclusion

There is no doubt that the issues discussed above will gain significantly more attention once the truly explainable systems start to be deployed in real-life applications. The aim of this position paper is to keep those questions in the debate. It is also interesting to note that the conceptual XAI papers published before (roughly) 2021 usually illustrate their ideas using examples from image recognition domains, demanding the XAI systems to be able to explain things like: “How did you arrive at the decision that there is a golden retriever in the picture?” or “Why the autonomous car failed to register the stop sign?”. Nowadays, on the other hand, the focus has - quite naturally - shifted to the generative AI systems like ChatGPT for text or Midjourney for images. In the case of the text chatbots, one should bear in mind a trivial but very important distinction in the meaning of the term “explanation” – if you, for example, ask ChatGPT a question (e.g. “What is the meaning of life?”), it will provide you with a lengthy answer, addressing the question from various viewpoints. Consequently, you can of course ask for an “explanation” (e.g. by typing “Why did you produce exactly this answer?”). What you get is however definitely not the explanation in the sense that was used thorough this paper instead, it is another elaborate list of general principles that guide the ChatGPT functioning.

Will XAI Provide Real Explanation or Just a Plausible Rationalization?

367

References 1. Aivodji, U., Arai, H., Fortineau, O., Gambs, S., Hara, S., Tapp, A.: Fairwashing: the risk of rationalization. In: Chaudhuri, K., Salakhutdinov, R. (eds.) Proceedings of the 36th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 97, pp. 161–170. PMLR, 9–15 June 2019 2. Angelino, E., Larus-Stone, N., Alabi, D., Seltzer, M., Rudin, C.: Learning certifiably optimal rule lists for categorical data. arXiv preprint arXiv:1704.01701 (2017) 3. Bago, B., Neys, W.D.: The smart system 1: evidence for the intuitive nature of correct responding on the bat-and-ball problem. Thinking Reasoning 25(3), 257– 299 (2019). https://doi.org/10.1080/13546783.2018.1507949 4. Besold, T.R., et al.: Neural-symbolic learning and reasoning: a survey and interpretation. arXiv preprint arXiv:1711.03902 (2017) 5. Brooks, R.: Flesh and Machines: How Robots Will Change Us. Pantheon (2002) 6. Doran, D., Schulz, S., Besold, T.R.: What does explainable AI really mean? A new conceptualization of perspectives. arXiv preprint arXiv:1710.00794 (2017) 7. Doshi-Velez, F., Kim, B.: Towards a rigorous science of interpretable machine learning. arXiv preprint arXiv:1702.08608 (2017) 8. Ehsan, U., Harrison, B., Chan, L., Riedl, M.O.: Rationalization: a neural machine translation approach to generating natural language explanations. In: Proceedings of the 2018 AAAI/ACM Conference on AI, Ethics, and Society, pp. 81–87 (2018). https://doi.org/10.1145/3278721.3278736 9. Festinger, L.: A Theory of Cognitive Dissonance. Stanford University Press, Stanford (1957) 10. Haugeland, J.: Artificial Intelligence: The Very Idea. Bradford Books, Bradford (1989) 11. Jones, E.: Rationalization in every-day life. J. Abnorm. Psychol. 3(3), 161–169 (1908). https://doi.org/10.1037/h0070692 12. Kahneman, D.: Thinking, Fast and Slow. Farrar, Straus and Giroux, New York (2011) 13. Kaminski, M.E.: The right to explanation, explained. Berkeley Technol. Law J. 34(1), 189–218 (2019). https://doi.org/10.15779/Z38TD9N83H 14. Kotok, A.: A chess playing program for the IBM 7090 computer. In: Levy, D. (ed.) Computer Chess Compendium, pp. 48–55 (1988). https://doi.org/10.1007/978-14757-1968-0 6 15. Lipton, Z.C.: The mythos of model interpretability. arXiv preprint arXiv:1606.03490 (2016) 16. McGuffie, K., Newhouse, A.: The radicalization risks of GPT-3 and advanced neural language models. arXiv preprint arXiv:2009.06807 (2021) 17. Miller, T.: Explanation in artificial intelligence: insights from the social sciences. Artif. Intell. 267, 1–38 (2019). https://doi.org/10.1016/j.artint.2018.07.007, https://www.sciencedirect.com/science/article/pii/S0004370218305988 18. Newell, A., Simon, H.: The logic theory machine - a complex information processing system. IRE Trans. Inf. Theory 2(3), 61–79 (1956). https://doi.org/10.1109/TIT. 1956.1056797 19. Ng, A.: AI For Everyone (MOOC) (2022). https://www.coursera.org/learn/ai-foreveryone. Accessed 26 May 2018 20. Ribeiro, M.T., Singh, S., Guestrin, C.: Model-agnostic interpretability of machine learning. arXiv preprint arXiv:1606.05386 (2016)

368

ˇ P. Ircing and J. Svec

21. Rudin, C.: Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead. Nat. Mach. Intell. 1(5), 206–215 (2019) 22. Russell, S., Norvig, P.: Artificial Intelligence: A Modern Approach (4th edn.). Pearson, Boston (2020) 23. Smolensky, P.: Connectionist AI, symbolic AI, and the brain. Artif. Intell. Rev. 1(2), 95–109 (1987). https://doi.org/10.1007/BF00130011 24. Sun, R.: The CLARION cognitive architecture: toward a comprehensive theory of the mind. In: Chipman, S.E.F. (ed.) The Oxford Handbook of Cognitive Science, pp. 117–133 (2017) 25. Thagard, P.: Explanatory coherence (plus commentary). Behav. Brain Sci. 12(3), 435–467 (1989). https://doi.org/10.1017/s0140525x00057046 26. Vilone, G., Longo, L.: Explainable artificial intelligence: a systematic review. arXiv preprint arXiv:2006.00093 (2020) 27. Wason, P., Evans, J.: Dual processes in reasoning? Cognition 3(2), 141–154 (1974). https://doi.org/10.1016/0010-0277(74)90017-1 28. Wiegreffe, S., Hessel, J., Swayamdipta, S., Riedl, M., Choi, Y.: Reframing human-AI collaboration for generating free-text explanations. arXiv preprint arXiv:2112.08674 (2021) 29. Ziegler, D.M., et al.: Fine-tuning language models from human preferences. arXiv preprint arXiv:1909.08593 (2019)

Voice-Interactive Learning Dialogue on a Low-Cost Device

Martin Bulín(B), Martin Adamec, Petr Neduchal, Marek Hrůz, and Jan Švec

University of West Bohemia, Univerzitní 8, 301 00 Pilsen, Czech Republic
{bulinm,neduchal,hruz,honzas}@kky.zcu.cz, [email protected]

Abstract. Traditional offline learning approaches are reaching their limits in meeting the dynamic demands of specialized applications, such as real-time human-robot interaction. While high benchmark scores attained through offline fine-tuning of large models on extensive data offer a glimpse of their potential, the true functionality is validated only when these models are deployed on target devices and used in real-life scenarios. This paper presents a method that incorporates humans in an interactive learning loop, using their real-time feedback for online neural network retraining. By leveraging the power of transfer learning, we can proficiently adapt the model to the specific requirements of the target application through natural voice-based dialogue. The approach is evaluated on the image classification task utilizing a unique low-cost device, and a practical example of the real-time dialogue is presented to demonstrate the functionality.

Keywords: Human in the Loop · Interactive Learning · Low-Cost Device Deployment · Audio-Visual Dialogue · Image Classification

1 Introduction

Conventional machine learning methods heavily rely on extensive datasets, and the performance of a model is typically measured under controlled lab conditions using a benchmark evaluation. Although these models are fine-tuned offline and the training processes can span hours or even days, it is rare for them to achieve absolute perfection. The abundance of training samples and model parameters often makes it challenging for developers to pinpoint specific instances where and why the models may fail. This paper addresses two key challenges: (1) deploying a model on a physical entity and ensuring its effective operation in real-life conditions, and (2) enabling interactive adaptation and refinement of the model’s behaviour based on user requirements and feedback. We introduce a newly developed custom 3D-printed robotic entity based on Raspberry Pi 4B and complemented by multiple interfaces that enable audio,

https://www.raspberrypi.com/products/raspberry-pi-4-model-b/.



visual, and even tactile interactions. This device is used to deploy the proposed learning method, which integrates a human into a voice-interactive dialogue loop. By utilising transfer learning and human feedback in the learning loop, we can effectively customize the model to meet the user’s precise demands. The method is experimentally evaluated on the task of image classification. The robot can learn and recognize new objects through natural dialogue by integrating computer vision techniques with advanced speech technologies. Furthermore, the learning loop is augmented with face recognition, enhancing the overall learning process and facilitating user engagement. Section 2 presents a comprehensive review of relevant methodologies. The target physical device is introduced in Sect. 3, while Sect. 4 provides an in-depth description of the utilized datasets. Section 5 provides a detailed exposition of the interactive loop and its constituent elements. The experimental evaluation is documented in Sect. 6, while Sect. 7 summarises this work’s key findings and contributions.

2 Related Work

This section introduces work related to this paper. It is divided into two parts: the first part focuses on interactive learning - i.e., human-in-the-loop - research, and the second part deals with its applications to various modalities, in particular images, audio, video, and multimodal systems.

2.1 Interactive Learning

Interactive learning, or Interactive Machine Learning (IML), is the subfield of human-in-the-loop research that focuses on a learning process in which a human interactively supplies the training target information. In other words, the human - whether an expert or a non-expert - helps build classifiers online by using the system, instead of traditional offline machine learning. This principle is based on the idea presented in [30]. A similar technique was proposed in [7], where the learning process is repeated until the desired results are met. A more current definition of interactive learning is given in [10]: IML algorithms are algorithms that can interact with agents - which can also be humans - and can optimize their learning behaviour through these interactions. IML is redefined in multiple papers such as [6] or [23], but all the definitions share similar properties: the process should be interactive, incremental, and fast enough. Another question is the role of the human in IML. The role, in this case, has two meanings. The first one is the expertise of the human, who can be a machine learning expert, a domain-specific expert, or a non-expert user of the IML system. The second one is the position of the human in the machine learning process. The human can be placed at the beginning of the process to provide annotations or preprocess the data in real time. The


second option is to place the human at the end to check and correct the results of the IML system. A third option belongs to another part of the human-in-the-loop research field [20], namely Active Learning (AL), where the mutual position of machine and human differs from the one in IML. In IML, they are equal, or the human is in the lead, whereas in AL, the machine leads and consults the human only in cases in which it is not sure about the correct result (i.e., the confidence of the result is poor). This paper presents an IML system running on a low-cost device capable of interacting with humans through multiple modalities - in our case, through video and audio channels. Images are frequently used in IML because they are a natural and very understandable modality for humans. One of the first image-based systems was CueFlik [8], a web search system with which users could learn their own classification rules by providing examples and counterexamples of individual classes. Another similar and more recent example is AIDE [15], an open-source image annotation framework for ecological surveys. Other use cases [13] of image-based IML are, for example, interactive information retrieval [3], visual topic analysis [28], or interactive anomaly detection [31]. As an example of video-based IML (i.e., a stream of images), the paper by Kabra et al. [14] can be mentioned: the authors proposed an annotation tool for biologists targeting certain animal behaviours. The audio modality also offers several applications similar to this paper, for example the Spoken Language Understanding work in [2], or the work of Ishibashi et al. [12], where interactive sound recognition is supported by visualization - e.g. stereograms or deep-learning audio-to-image retrieval - so that audio files are visualized to give the user more information for classifying the sound. An example of an audiovisual IML system is the research of Visi and Tanaka [29], who proposed a system for recognizing musical gestures, consisting of motion and gesture tracking of a human followed by synthesis of the appropriate sound. Finally, Qureshi et al. [22] proposed a system for multimodal deep reinforcement learning on a humanoid robot; in this case, multimodal means that the robot used its RGB-D camera and a hand touch sensor for its operations.

3 Target Device

One of the key foundations of this project is to implement the developed method for interactive learning on an actual robotic entity and assess its performance in real-world conditions. For this purpose, we employed a unique physical device known as Robot.v1, which was developed at the Department of Cybernetics of the University of West Bohemia in 2023 [4]. It is a multimodal low-cost 3D-printed entity of a rotund shape (see Fig. 1) equipped with the following interfaces:

– Touch screen (7”) facilitates visual interaction from the robot’s side and offers tactile-based human feedback.


– RPi Camera module V2 enables the robot’s visual perception.
– High-performance microphone array [24] is employed for sound perception and sound source localization, complemented by a speaker for audio output.

Fig. 1. Picture of the target device for deploying the proposed method, known as Robot.v1.

The robot structure comprises a static part, a rotational element, and several components. These include a solar panel for efficient charging, light sensors to determine the optimal orientation towards the incoming sun source, a stepper motor facilitating the rotation of the upper part, and LEDs for arbitrary robot status indication. The computational tasks are executed on a Raspberry Pi 4B, the core processing unit. The software implementation relies on the Robot Operating System [21], serving as the central framework for integrating and coordinating various individual tasks and ensuring stable main processing. The speech-related tasks - Automatic Speech Recognition (ASR) and Text-To-Speech synthesis (TTS) - are addressed by the in-house technology called SpeechCloud [26]. The platform is also connected to a remote server with GPU capabilities through an HTTP API. This connection enables using powerful computational resources for intensive tasks such as neural network retraining. As depicted in Fig. 2, among the extensive selection of pre-programmed ROS nodes and services available on Robot.v1, we utilized the speech package, specifically the ASR and TTS services, and the gui package using the touch screen for capturing human feedback. Additionally, through this project, we enhanced the camera package by incorporating face recognition and object detection capabilities.
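As a rough illustration of how such components are typically wired together in ROS, the minimal sketch below forwards recognized utterances to a dialogue callback and publishes replies on a TTS topic. The topic names, message types, and the overall wiring are hypothetical stand-ins and are not taken from the speech, gui, or camera packages described above.

```python
# Hypothetical minimal ROS node (not the authors' code): it forwards recognized
# utterances to a dialogue callback and publishes replies to a TTS topic.
# Topic names and message types are illustrative stand-ins only.
import rospy
from std_msgs.msg import String

def on_asr_result(msg):
    # In the real system this would drive the dialogue management logic;
    # here we simply echo the recognized utterance back through TTS.
    tts_pub.publish(String(data="You said: " + msg.data))

if __name__ == "__main__":
    rospy.init_node("dialogue_manager_sketch")
    tts_pub = rospy.Publisher("/speech/tts_input", String, queue_size=1)
    rospy.Subscriber("/speech/asr_result", String, on_asr_result)
    rospy.spin()  # keep the node alive so the callbacks get processed
```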

Fig. 2. A component diagram of the target device employed in project [1].

4 Datasets

This section aims to introduce the datasets used either for pre-training neural networks or directly during experiments in the interactive learning loop. The Intel Image Classification dataset is composed of approximately 25 000 natural scene images divided into 6 classes labelled as buildings, forest, glacier, mountain, sea, and street.

Fig. 3. Example images of each class of the Intel Image Classification dataset.

https://www.kaggle.com/datasets/puneet6060/intel-image-classification.


Most of the images have a resolution of 150 by 150 pixels, with some of the images being smaller. The dataset is distributed in three parts: 14 000 images in the train set, 3 000 in the test set, and 7 000 in a prediction set, which was used in the Image Classification Challenge. Originally, the dataset was used for the task of scene understanding, but it can be used for object recognition as well. In this paper, it is used to test image embeddings (see Sect. 5.2) as well as for the final experiment (see Fig. 7). Figure 3 shows an example of 6 images from the dataset, each representing one class, in the same order as mentioned before. The COCO dataset [19] is a popular large-scale dataset composed of 123 287 images of various resolutions. Each image contains one or multiple classes out of the overall number of 80 classes. It is widely used in object detection, object segmentation, and object captioning tasks and is commonly used to benchmark models in computer vision. This dataset was also used to pre-train the Mobilenet V3 Large network architecture [17] used in this research. An example of images from the dataset is shown in Fig. 4.
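For reproducing the setup, a minimal loading sketch is given below. It assumes the Kaggle distribution of the Intel Image Classification dataset with its seg_train/seg_test directory layout (one sub-folder per class); the paths, the resize target, and the batch size are illustrative assumptions.

```python
# Illustrative loading sketch for the Intel Image Classification dataset,
# assuming the Kaggle folder layout (seg_train/seg_test, one sub-folder per
# class). Paths, resize target and batch size are example choices.
import torch
from torchvision import datasets, transforms

preprocess = transforms.Compose([
    transforms.Resize((150, 150)),   # most images are already 150x150 pixels
    transforms.ToTensor(),
])

train_set = datasets.ImageFolder("intel/seg_train", transform=preprocess)
test_set = datasets.ImageFolder("intel/seg_test", transform=preprocess)
train_loader = torch.utils.data.DataLoader(train_set, batch_size=32, shuffle=True)

print(train_set.classes)  # ['buildings', 'forest', 'glacier', 'mountain', 'sea', 'street']
```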

Fig. 4. Examples from the COCO dataset. Left: An image of an airport containing multiple classes. Right: Same image with visible annotations of aeroplane class.

5 Interactive Learning Loop

The central concept presented in this paper is to incorporate a human into a voice-interactive learning loop of an artificial system. In addition to conventional machine learning approaches, this allows for direct interaction with a deployed model, enabling iterative customization based on specific requirements. The method is showcased using a toy example of the image classification task and assessed on a physical robotic platform. Additionally, a face recognition tool is added to simulate a realistic human-robot interaction scenario, acting as

https://datahack.analyticsvidhya.com.


a wake-up gate that initiates the main loop for learning objects when a known face is detected (depicted in Fig. 5).

Fig. 5. A scheme of the dialogue for evaluating the developed voice-interactive learning method. The red icon highlights the specific parts where human interaction is utilized within the loop. (Color figure online)

Several design choices needed to be considered, with two primary constraints to be addressed. Since the robotic entity’s objective is to maintain a dialogue loop with a minimal time delay between user input and the response (ideally in real-time), computational speed emerges as the paramount parameter alongside algorithmic accuracy.

5.1 FaceID and Learning New Faces

The inclusion of the faceID tool in the application, as illustrated in Fig. 5, acts as the entry point to the main loop of interactive learning. The system initially operates in an idle state, and upon detecting a familiar face through the camera, it transitions into the loop for learning objects. In the event of detecting a new and unfamiliar face, the user is prompted to provide their name, and the captured facial sample is subsequently incorporated into the database for future reference. It was decided to utilize the open-source face-recognition Python library, primarily due to its user-friendly API. This library, built on top of the dlib [16] library written in C++, enhances the efficiency of computationally intensive tasks, thereby accelerating their execution. The process of face recognition from the image comprises two parts: face localization and face classification.

https://github.com/ageitgey/face_recognition
https://github.com/davisking/dlib


Face Localization. The objective is to detect the presence of a face in the camera image. To achieve this, two library-provided methods were thoroughly evaluated, with a particular focus on computational efficiency. The first method, based on convolutional neural networks and considered more robust to face angle and rotation changes, was found unsuitable for the target device due to an average inference time of 3.64 s across 100 trials. In contrast, the second method available in the library, known as Histogram of Oriented Gradients (HOG) [5], demonstrated slightly lower accuracy; however, its inference time of 0.58 s was more suitable for our real-time application. Consequently, the HOG method was chosen for the project.

Face Classification. After detecting a face in the image, it is encoded into a vector representing its distinctive features, i.e. its identity. Firstly, facial landmarks representing the corners of the mouth, eyes, nose tip, etc. are detected. These landmarks provide a geometric pose of the face, which is then “frontalized” into a mean camera-facing face shape using a geometric transform. There are two systems for face landmark detection, one that detects 5 landmarks and one that detects 68 landmarks. We found no significant difference in inference time between the two models through experimentation. Consequently, we opted to implement the 68-feature model. The face is then cropped and the image is passed through a pre-trained custom ResNet [9] model to produce a 128-dimensional embedding. In the real-life application, the newly detected face is encoded into its vector representation and compared to the embeddings of already-known faces using the library’s internal metrics and thresholding techniques. This enables real-time face detection in the application while efficiently storing new embeddings. If multiple faces are detected in the image, only the first face on the list is used, as the default application assumes that only one user interacts with the device.
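A minimal sketch of this faceID step with the face_recognition library is shown below: HOG-based localization, 128-dimensional encoding, and a distance-based comparison against already known embeddings. The in-memory database and the placeholder naming are simplified stand-ins for the robot’s actual implementation; the 0.6 tolerance is the library’s default.

```python
# Minimal faceID sketch with the face_recognition library: HOG localization,
# 128-d encoding, and comparison against known embeddings. The in-memory
# "database" and placeholder naming are simplified stand-ins; 0.6 is the
# library's default tolerance.
import face_recognition
import numpy as np

known_encodings = []   # 128-d embeddings of known faces
known_names = []       # their associated names

def identify(image):
    """Return the name of the first recognized face in the image, or None."""
    locations = face_recognition.face_locations(image, model="hog")  # faster than the CNN detector
    if not locations:
        return None
    # Only the first detected face is used, as in the described application.
    encoding = face_recognition.face_encodings(image, known_face_locations=locations[:1])[0]
    if known_encodings:
        distances = face_recognition.face_distance(known_encodings, encoding)
        best = int(np.argmin(distances))
        if distances[best] < 0.6:
            return known_names[best]
    # Unknown face: in the dialogue, the user would be asked for their name here.
    known_encodings.append(encoding)
    known_names.append("user_%d" % len(known_names))  # placeholder name
    return None
```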

5.2 Network Architecture for Image Classification

When the camera detects a familiar face, the system seamlessly transitions into the main loop of learning objects, as depicted in Fig. 5. The model depicted in Fig. 6 is responsible for predicting the object presented to the camera by the user. Its design is focused on two primary aspects that distinguish our task from traditional offline image classification:

1. the presence of timing constraints, where the real-time settings heavily rely on both inference time and network retraining time;
2. the network must demonstrate exceptional and rapid generalization, as we often encounter a limited number of training examples per desired class, aiming to alter the model’s behaviour.

Fig. 6. Network architecture for interactive image classification.

To fulfil these specifications, we employ transfer learning, specifically utilizing a network that has been pre-trained on a large-scale computer vision task. This pre-trained network serves as an embedding model for our task, enabling us to leverage its learned representations for newly added examples. By employing this approach, we are able to train a relatively small number of parameters in a subsequent feed-forward neural network, which consists of two hidden layers and a softmax output layer. Leveraging the current computing power and hardware capabilities, this setup enables real-time processing, making it feasible to integrate even the re-training part seamlessly into the dialogue loop. The selection of the embedding model is a critical aspect of the design choice. We conducted an experiment using the Intel Classification Dataset with nine different methods to address this.

Table 1. Performance comparison of different image embedding extractors.

Embedding model         | Mean embedding time | Train time | Accuracy
Resnet-18 [9]           | 0.0365 ± 0.0033 s   | 0.42 s     | 0.882
Alexnet [18]            | 0.0246 ± 0.0039 s   | 0.37 s     | 0.874
Vgg-11 [25]             | 0.1258 ± 0.0143 s   | 1.09 s     | 0.896
Densenet [11]           | 0.1194 ± 0.0216 s   | 4.98 s     | 0.884
Efficientnet B0 [27]    | 0.0460 ± 0.0090 s   | 3.61 s     | 0.895
Efficientnet B1 [27]    | 0.0591 ± 0.0063 s   | 6.63 s     | 0.889
Efficientnet B2 [27]    | 0.0605 ± 0.0105 s   | 7.84 s     | 0.898
Efficientnet B3 [27]    | 0.0846 ± 0.0112 s   | 9.56 s     | 0.883
Mobilenet V3 Large [17] | 0.0358 ± 0.0092 s   | 0.80 s     | 0.915

In order to maintain fair conditions, we utilized the same subsequent feed-forward network, consisting of a single hidden layer with 50 neurons, for all methods, and during the evaluation, we considered three metrics:

– embedding time of a single image;
– train time - refers to the duration of training the network until the stopping condition was reached;


– maximal classification accuracy reached.

In order to simulate real-world conditions relevant to the target application for which we were selecting the embedding model, we conducted the classification task with six distinct classes and limited the dataset to 50 training samples and 20 validation samples per class. This closely aligns with the desired capabilities of our final model, which should be able to generalize and acquire knowledge effectively even with a minimal number of training examples. Then we used 300 testing samples per class, which served as a representation of the knowledge that the model was expected to acquire. After conducting the experiment summarized in Table 1, we determined that the Mobilenet V3 Large architecture [17] is the optimal choice as the embedder for our pipeline. To extract feature embedding vectors with a dimensionality of 1280 from this model, we utilized the timm library.
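The following sketch illustrates this design: a frozen Mobilenet V3 Large backbone taken from timm as a 1280-dimensional embedder, combined with a small trainable feed-forward head using the hidden sizes given in Sect. 6. The timm checkpoint used here carries the library’s standard pre-trained weights; the code is an approximation, not the authors’ implementation.

```python
# Sketch of the model in Fig. 6: a frozen Mobilenet V3 Large backbone used as a
# 1280-d embedder (here taken from timm with its standard pre-trained weights)
# and a small trainable feed-forward head.
import timm
import torch
import torch.nn as nn

embedder = timm.create_model("mobilenetv3_large_100", pretrained=True, num_classes=0)
embedder.eval()                        # frozen: used only for feature extraction
for p in embedder.parameters():
    p.requires_grad = False

def make_head(num_classes):
    """Trainable feed-forward classifier on top of the 1280-d embeddings."""
    return nn.Sequential(
        nn.Linear(1280, 512), nn.ReLU(),
        nn.Linear(512, 256), nn.ReLU(),
        nn.Linear(256, num_classes),   # softmax is applied inside the loss
    )

head = make_head(num_classes=6)

with torch.no_grad():
    emb = embedder(torch.randn(1, 3, 224, 224))  # dummy pre-processed image -> (1, 1280)
logits = head(emb)
```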

5.3 Human Feedback and Network Retraining

At this stage, we are approaching the crucial component of the interactive dialogue, which involves human feedback on the model’s predictions. Since the dialogue uses audio and visual modality, the feedback can be provided through two channels: automatic speech recognition (ASR) or the touch screen and graphical user interface (GUI). As illustrated in Fig. 5, the feedback can fall into one of three possibilities:

– Stop - The dialogue finishes.
– Ok - The model’s prediction was correct. In this case, the last shown example is added to the training set with the correctly predicted label.
– Correction - The model’s prediction was incorrect. In that case, the feedback is parsed using rule-based operators. For instance, in the case of ASR, if the recognized text is “No, it is not a mug. It is a pen”, the feedback type correction and the correct class pen would be extracted from the parsed information.

The network is re-trained, as described below, with every new training sample. To enhance the fluidity of the dialogue, the system automatically proceeds along the ok path if no additional feedback is provided within 3 s after each prediction.

Network On-the-Fly Retraining. As depicted in Fig. 2, the model undergoes retraining on a remote server, specifically on a machine equipped with an NVIDIA GeForce GTX 1080 GPU. The newly collected samples and trained model parameters are transferred in binary format using the HTTP protocol between the remote server and the target device. By leveraging the frozen embedding model (which is not retrained) and employing a small training set, the retraining process is sufficiently fast (less than a second), even when factoring in the time required for the HTTP transfer.

https://timm.fast.ai.
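To make the rule-based feedback parsing described above concrete, here is a toy sketch. The three feedback types mirror the description, but the actual grammar used on the robot is not specified, so the regular expression below is purely illustrative.

```python
# Toy sketch of the rule-based feedback parsing described above; the regular
# expression is purely illustrative, the actual grammar used on the robot is
# not specified here.
import re

def parse_feedback(text):
    """Map a recognized utterance to (feedback_type, correct_class_or_None)."""
    lowered = text.lower().strip()
    if lowered in ("stop", "ok"):
        return lowered, None
    # e.g. "No, it is not a mug. It is a pen" -> ("correction", "pen")
    match = re.search(r"it is an? (\w+)\s*$", lowered)
    if lowered.startswith("no") and match:
        return "correction", match.group(1)
    return "ok", None   # default path when no usable feedback is given

print(parse_feedback("No, it is not a mug. It is a pen"))  # ('correction', 'pen')
```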


A unique scenario arises when the user attempts to introduce a new object that the network has not encountered before. In such cases, the final network layer must be restructured, and a new output class (and its corresponding neuron) must be added. Through experimentation, we discovered that maintaining the trained weights associated with the existing known classes while initializing only the weights connected to the new class greatly facilitates overall convergence. This approach outperforms the alternative method of completely reinitializing the output layer from scratch.
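A small sketch of this restructuring step is given below: a new nn.Linear output layer is created with one extra neuron, and the rows corresponding to the already known classes are copied over, so that only the new class starts from a fresh initialization. The helper name and shapes are illustrative assumptions, not the authors’ code.

```python
# Sketch of the output-layer restructuring: a new nn.Linear with one extra
# neuron is created and the rows of the known classes are copied over, so only
# the new class starts from a fresh initialization.
import torch
import torch.nn as nn

def add_output_class(old_layer):
    """Return a new output layer with one extra class, keeping existing weights."""
    new_layer = nn.Linear(old_layer.in_features, old_layer.out_features + 1)
    with torch.no_grad():
        new_layer.weight[: old_layer.out_features] = old_layer.weight  # keep known classes
        new_layer.bias[: old_layer.out_features] = old_layer.bias
        # the last row/bias entry keeps its fresh random initialization (new class)
    return new_layer

old = nn.Linear(256, 6)        # e.g. six known classes
new = add_output_class(old)    # now seven output neurons
assert torch.equal(new.weight[:6], old.weight)
```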

6 Experimental Setup and Evaluation

This section presents the experimental evaluation of the interactive learning dialogue framework. Based on the results in Table 1, we employed the Mobilenet V3 Large architecture [17] for our task, which was originally pre-trained on the COCO dataset for object detection and recognition purposes (see Sect. 4). As shown in Fig. 6, this pre-trained model was utilized to obtain image embeddings, which were then combined with a feed-forward neural network consisting of two hidden layers (512 and 256 hidden neurons) and an output softmax layer.

Fig. 7. An exemplary instance of the human-robot learning dialogue. The real-time axis is depicted at the bottom. The training samples were incrementally introduced one by one, and the system was regularly evaluated using 600 testing samples (100 per class) that represented the desired “knowledge”. The vertical red lines indicate the progressive increase in the number of learned classes. (Color figure online)


Regarding retraining, we used the cross-entropy loss function, the ADAM optimizer with a learning rate of 0.001, and a maximum of 50 training epochs. We implemented an early stopping condition, where training would halt if the training loss did not show improvement for ten consecutive epochs.

Interactive Learning Loop Evaluation. The experiment is performed on the six-class Intel Image Classification dataset. The dataset samples were loaded directly into the algorithm without relying on the physical display through the robot’s camera. However, all other settings and conditions of the real-time scenario remained unchanged. The experimental setup for this study differs from conventional supervised learning. Firstly, we establish a concept called knowledge, which refers to the desired behaviour we aim to train our system to achieve. This knowledge is represented by the classical testing data partition, where in this specific case, we utilize 100 samples per class, resulting in a total of 600 samples. Figure 7 illustrates a selected example of the human-robot learning dialogue. At the beginning of the process, there are no training samples available. Instead, they are progressively introduced to the system as the real-time application runs. Each sample is accompanied by a label that is derived from human feedback, and these pairs are added one by one into the training data partition. The experiment yields the following observations: (1) The chosen model effectively meets the timing constraints for both inference and retraining; (2) The utilization of a pre-trained model as the embedder enhances the generalization capabilities, allowing solid accuracy to be achieved on 100 testing samples with only 2–4 training samples per class; (3) The introduction of a new class to the system does not significantly impact the accuracy of previously learned classes.
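A condensed sketch of the retraining settings listed at the start of this section is shown below (cross-entropy loss, Adam with learning rate 0.001, at most 50 epochs, early stopping after ten epochs without improvement of the training loss). The full-batch treatment of the tiny training set and the variable names are assumptions.

```python
# Condensed sketch of the retraining settings: cross-entropy loss, Adam with
# lr 0.001, at most 50 epochs, early stopping after 10 epochs without
# improvement of the training loss.
import torch
import torch.nn as nn

def retrain(head, embeddings, labels, max_epochs=50, patience=10):
    optimizer = torch.optim.Adam(head.parameters(), lr=0.001)
    criterion = nn.CrossEntropyLoss()
    best_loss, stale = float("inf"), 0
    for _ in range(max_epochs):
        optimizer.zero_grad()
        loss = criterion(head(embeddings), labels)  # embeddings come from the frozen backbone
        loss.backward()
        optimizer.step()
        if loss.item() < best_loss:
            best_loss, stale = loss.item(), 0
        else:
            stale += 1
            if stale >= patience:                   # early stopping condition
                break
    return head
```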

7 Conclusion

This work addresses the limitations of traditional offline learning methods and explores their potential for deployment in real-life applications. Initially, we introduced a novel, cost-effective device that features multiple interfaces, making it suitable for testing machine learning systems. Subsequently, we proposed a learning method incorporating a human within the learning loop, leveraging their feedback to adapt and refine the system’s behaviour. The concept was tested on the image classification task and integrated into a working framework. The experimental evaluation of the workflow yielded several promising findings, establishing a robust foundation for future endeavours. Notably, we successfully demonstrated a use case involving sorting images into categories. Building upon this achievement, the next logical progression would be to explore the potential of learning a scene description using natural language. Importantly, the method is versatile and has the potential to be applied across multiple domains.

Acknowledgements. This research was supported by the Czech Science Foundation (GA CR), project No. GA22-27800S, and by the grant of the University of West Bohemia, project No. SGS-2022-017.


References

1. Adamec, M.: Voice-Interactive Computer Vision on Raspberry Pi. Bachelor thesis, University of West Bohemia (2023)
2. Begeja, L., Renger, B., Gibbon, D., Liu, Z., Shahraray, B.: Interactive machine learning techniques for improving SLU models. In: Proceedings of the HLT-NAACL 2004 Workshop on Spoken Language Understanding for Conversational Systems and Higher Level Linguistic Information for Speech Processing, pp. 10–16 (2004)
3. Behrisch, M., et al.: Magnostics: image-based search of interesting matrix views for guided network exploration. IEEE Trans. Visual Comput. Graph. 23(1), 31–40 (2016)
4. Bulín, M.: Multimodal low-cost robotic entity based on Raspberry Pi. In: Rendl, J. (ed.) SVK FAV 2023 - Magisterské a Doktorské Studijní Programy, pp. 38–39. University of West Bohemia, Pilsen (2023)
5. Dalal, N., Triggs, B.: Histograms of oriented gradients for human detection. In: 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05), vol. 1, pp. 886–893 (2005). https://doi.org/10.1109/CVPR.2005.177
6. Dudley, J.J., Kristensson, P.O.: A review of user interface design for interactive machine learning. ACM Trans. Interact. Intell. Syst. (TiiS) 8(2), 1–37 (2018)
7. Fails, J.A., Olsen Jr., D.R.: Interactive machine learning. In: Proceedings of the 8th International Conference on Intelligent User Interfaces, pp. 39–45 (2003)
8. Fogarty, J., Tan, D., Kapoor, A., Winder, S.: CueFlik: interactive concept learning in image search. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pp. 29–38 (2008)
9. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. arXiv preprint arXiv:1512.03385 (2015)
10. Holzinger, A.: Interactive machine learning for health informatics: when do we need the human-in-the-loop? Brain Informatics 3(2), 119–131 (2016)
11. Huang, G., Liu, Z., Van Der Maaten, L., Weinberger, K.Q.: Densely connected convolutional networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4700–4708 (2017)
12. Ishibashi, T., Nakao, Y., Sugano, Y.: Investigating audio data visualization for interactive sound recognition. In: Proceedings of the 25th International Conference on Intelligent User Interfaces, pp. 67–77 (2020)
13. Jiang, L., Liu, S., Chen, C.: Recent research advances on interactive machine learning. J. Visualization 22, 401–417 (2019)
14. Kabra, M., Robie, A.A., Rivera-Alba, M., Branson, S., Branson, K.: JAABA: interactive machine learning for automatic annotation of animal behavior. Nat. Methods 10(1), 64–67 (2013)
15. Kellenberger, B., Tuia, D., Morris, D.: AIDE: accelerating image-based ecological surveys with interactive machine learning. Methods Ecol. Evol. 11(12), 1716–1727 (2020)
16. King, D.E.: Dlib-ml: a machine learning toolkit. J. Mach. Learn. Res. 10, 1755–1758 (2009)
17. Koonce, B.: MobileNetV3. In: Convolutional Neural Networks with Swift for TensorFlow: Image Recognition and Dataset Categorization, pp. 125–144 (2021)
18. Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. Commun. ACM 60(6), 84–90 (2017)
19. Lin, T.-Y., et al.: Microsoft COCO: common objects in context. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8693, pp. 740–755. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-10602-1_48
20. Mosqueira-Rey, E., Hernández-Pereira, E., Alonso-Ríos, D., Bobes-Bascarán, J., Fernández-Leal, Á.: Human-in-the-loop machine learning: a state of the art. Artif. Intell. Rev. 56(4), 3005–3054 (2023)
21. Quigley, M., et al.: ROS: an open-source Robot Operating System. In: Proceedings of the IEEE International Conference on Robotics and Automation (ICRA) Workshop on Open Source Robotics, Kobe (2009)
22. Qureshi, A.H., Nakamura, Y., Yoshikawa, Y., Ishiguro, H.: Robot gains social intelligence through multimodal deep reinforcement learning. In: 2016 IEEE-RAS 16th International Conference on Humanoid Robots (Humanoids), pp. 745–751. IEEE (2016)
23. Ramos, G., Meek, C., Simard, P., Suh, J., Ghorashi, S.: Interactive machine teaching: a human-centered approach to building machine-learned models. Hum. Comput. Interact. 35(5–6), 413–451 (2020)
24. SeeedStudio: ReSpeaker Mic Array v2.0 (2023). https://wiki.seeedstudio.com/ReSpeaker_Mic_Array_v2.0/
25. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. In: International Conference on Learning Representations (2015)
26. Švec, J., Neduchal, P., Hrůz, M.: Multi-modal communication system for mobile robot. IFAC-PapersOnLine 55(4), 133–138 (2022)
27. Tan, M., Le, Q.: EfficientNet: rethinking model scaling for convolutional neural networks. In: International Conference on Machine Learning, pp. 6105–6114. PMLR (2019)
28. Thom, D., et al.: Can Twitter really save your life? A case study of visual social media analytics for situation awareness. In: 2015 IEEE Pacific Visualization Symposium (PacificVis), pp. 183–190. IEEE (2015)
29. Visi, F.G., Tanaka, A.: Interactive machine learning of musical gesture. In: Handbook of Artificial Intelligence for Music: Foundations, Advanced Approaches, and Developments for Creativity, pp. 771–798 (2021)
30. Ware, M., Frank, E., Holmes, G., Hall, M., Witten, I.H.: Interactive machine learning: letting users build classifiers. Int. J. Hum. Comput. Stud. 55(3), 281–292 (2001)
31. Xu, P., Mei, H., Ren, L., Chen, W.: ViDX: visual diagnostics of assembly line performance in smart factories. IEEE Trans. Visual Comput. Graph. 23(1), 291–300 (2016)

Composite Restoration of Infrared Image Based on Adaptive Threshold Multi-parameter Wavelet

Shuai Liu1, Peng Chen1, Zhengxiang Shen1, and Zhanshan Wang1,2(B)

1 School of Physical Science and Engineering, Tongji University, Shanghai 200092, China
[email protected]
2 Institute of Precision Optical Engineering, Tongji University, Shanghai 200092, China

Abstract. The multiplicative speckle noise and additive background noise of an infrared image are significant elements impacting image quality. To address the issue of image degradation caused by noise superposition and enhance infrared image quality in terms of noise suppression, a composite restoration method based on an adaptive threshold multi-parameter wavelet is proposed. First, based on the noise distribution characteristics of the infrared image, the multiplicative noise in the infrared image is transformed into additive noise, and the image is restored using the wavelet transform coefficients of the converted infrared image. Then, the benefits and drawbacks of soft and hard threshold functions are analysed in depth, and an adaptive double threshold function with adjustable parameters is developed. Finally, a fast non-local means method is used to suppress the effect of background noise on image quality. The experimental results show that the proposed method reduces the MSE index by 111.03 dB on average and improves the PSNR index by 6.67 dB and the SNR index by 6.92 dB.

Keywords: infrared image · image denoising · adaptive threshold wavelet

1 Introduction

As a passive detection technology, infrared imaging has been widely used in the military, industry, agriculture, and other fields [1, 2]. Simultaneously, the imaging quality requirements of infrared systems are constantly increasing, and certain technical problems are becoming increasingly prominent. One of the problems that needs to be solved is infrared image restoration [3]. In comparison with visible images, infrared images are easily affected by the additive noise caused by the detector, the detection environment, and the hardware conditions of photoelectric conversion circuits. Additionally, the multiplicative noise generated by the non-ideal channel during acquisition and transmission leads to poor imaging quality of infrared systems [4]. Figure 1 shows the comparison of an infrared image that is affected by additive noise, multiplicative noise, and their combination.

This work was supported by the National Natural Science Foundation of China (No. 12373100), and the Fundamental Research Funds for the Central Universities.


Fig. 1. Composition of noisy images

The discrete wavelet transform (DWT) [5] has many excellent characteristics, such as fast decomposition, decorrelation, and low entropy, so that it can effectively separate the desired signal from noise. Therefore, many scholars have conducted research on this problem in recent years [6]. Donoho et al. [7] proposed the wavelet threshold denoising method, which has a simple principle and a remarkable denoising effect. However, the discontinuity of the hard threshold function at the threshold point results in visual distortion, such as the pseudo-Gibbs phenomenon and ringing in the denoised image [8, 9]. The constant deviation caused by the soft threshold function leads to blurred image edge details after denoising [10–12]. Therefore, many researchers have proposed a variety of adaptive threshold denoising methods. Guo et al. [13] proposed an improved wavelet threshold calculation method and a new threshold function to denoise ultraviolet signals in view of the existing problems of the soft and hard threshold functions. Chen [14] improved the classical wavelet threshold denoising algorithm and innovatively adopted a combination of soft and hard thresholds. Zhang et al. [15] proposed an adaptive threshold function for underwater signal noise under complex ocean conditions. Kumar et al. [16] proposed a signal denoising method based on the stationary wavelet transform. Binbin [17] proposed an infrared image processing method based on adaptive threshold denoising. These methods not only open up broader prospects for exploiting the full advantages of wavelet threshold denoising but also provide a basis for further exploration of adaptive denoising methods. According to the specific imaging mechanism and characteristics of infrared imaging, an infrared imaging system has both multiplicative and additive background noise. The existing single wavelet denoising algorithms for multiplicative speckle and additive noise cannot simultaneously remove two different types of noise; thus, there are problems such as residual noise, blurred details, and limited image quality improvement in denoised images [18]. Therefore, a composite restoration method based on an adaptive threshold multi-parameter wavelet (ATMW) is proposed in this paper, which can remove at once background light, additive detector noise, and multiplicative speckle noise, and effectively improve the image quality of infrared systems. First, a method based on ATMW is proposed to remove multiplicative speckle noise in infrared images. This method converts speckle noise into additive noise through a logarithmic transformation, and then the wavelet coefficients of the transformed infrared images are processed by threshold denoising. Based on the wavelet coefficient threshold processing method for infrared image denoising, the advantages and disadvantages of soft and hard threshold


functions are analysed, an adaptive threshold function with adjustable parameters is constructed, and double threshold mapping is adopted. Then, the additive noise (such as background light) is removed by a fast non-local means algorithm (FNLM), which can effectively remove the noise and retain the image details.

2 Mixed Noise Model

The composition of the infrared imaging system is shown in Fig. 2. According to Fig. 2, the noise sources mainly include the background, amplifier, and detector in the field of view. Background noise includes scene radiation noise and noise caused by atmospheric dithering. Amplifier noise is mainly transistor noise. The detector noise mainly includes 1/f noise, noise caused by the complex fluctuation of the carrier, thermal noise, and so on.

Fig. 2. Composition of infrared imaging system

From the essence of noise analysis, infrared image noise signals can be divided into additive noise and multiplicative noise [19–23]. The inherent noise of circuit components in the shooting equipment and transmission process, as well as the Gaussian, salt-and-pepper, and Poisson noises generated by their mutual influence, are associated with additive noise. Additive noise is present regardless of whether the image signal is present. Multiplicative noise is generally caused by a non-ideal channel and is multiplied by the image signal. It is just the opposite of additive noise and depends on the image signal: when the image signal disappears, the multiplicative noise disappears as well [24–26]. Generally, the noise model of infrared images can be expressed as:

Y_{ij} = X_{ij} \cdot \delta + N   (1)

where X is the original signal of the image, Y is the image containing noise, \delta represents the multiplicative noise, N represents the additive noise, and i and j are the position coordinates of a pixel in a two-dimensional space.
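For illustration, the mixed noise model of formula (1) can be simulated as in the short NumPy sketch below; the noise parameters are arbitrary examples, not the paper's experimental settings (the paper's experiments are run in MATLAB).

```python
# Illustrative simulation of the noise model in formula (1): multiplicative
# speckle times the clean image plus additive Gaussian noise. The parameter
# values are arbitrary examples, not the paper's experimental settings.
import numpy as np

def add_mixed_noise(x, speckle_std=0.1, gauss_std=15.0, seed=0):
    rng = np.random.default_rng(seed)
    delta = 1.0 + speckle_std * rng.standard_normal(x.shape)  # multiplicative term
    n = gauss_std * rng.standard_normal(x.shape)              # additive term
    return np.clip(x * delta + n, 0, 255)                     # Y = X * delta + N

clean = np.full((128, 128), 120.0)   # dummy "infrared" image
noisy = add_mixed_noise(clean)
```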

3 Compound Restoration Method Based on ATMW

The existing infrared image denoising methods are mainly divided into two categories. The first case involves ignoring the additive noise in the image and then converting the multiplicative noise into additive noise through transformation for denoising. The second one involves ignoring the multiplicative noise in the image and only denoising the additive noise. Neither of them can effectively suppress the two different types of noise


in infrared images simultaneously, and the improvement in image quality after denoising is limited. Therefore, aiming at the characteristics of background, other additive, and multiplicative speckle noise in the imaging system, this study proposes an ATMW combined with FNLM to suppress noise. It can not only remove background, detector, and other additive noise but also suppress multiplicative speckle noise. This method can effectively improve the image quality of an infrared image. The process of infrared image restoration is shown in Fig. 3. First, the multiplicative noise component in the image is processed by the wavelet threshold method; then, the additive background noise is denoised in the image.

Fig. 3. Flowchart of infrared image restoration

3.1 ATMW for Multiplicative Noise Removal

The wavelet transform is an ideal mathematical model for human visual information processing. The image with noise is decomposed by the wavelet transform in multiple layers, and the high- and low-frequency wavelet coefficients are obtained. Wavelet threshold image denoising retains the low-frequency coefficients, sets an appropriate threshold to separate the noise signal contained in the high-frequency coefficients (setting it to zero), and processes the high-frequency coefficients with the wavelet threshold function. Finally, the wavelet coefficients are reconstructed to obtain the denoised image. The design of the required threshold value and threshold function greatly affects wavelet denoising. If the threshold is too large, the effective signal will be set to zero, and the image will be damaged and blurred. Meanwhile, if the threshold is too low, a portion of the noise signal is retained, and the image denoising is incomplete, failing to achieve the desired denoising effect. The threshold function determines the approximation between the reconstructed and original wavelet coefficients. If the threshold function is discontinuous at the threshold point, it will cause visual distortion such as ringing and the pseudo-Gibbs phenomenon in the reconstructed denoised image. If there is a


constant deviation between the reconstructed and actual wavelet signals, the accuracy of the reconstruction will be reduced and the edges of the image will be blurred. Therefore, the design of an appropriate threshold value and function is the key to the wavelet image denoising method.

In wavelet threshold technology, the choice of the threshold is the key factor determining the algorithm's performance. On the one hand, it needs to be large enough to remove as much noise as possible. On the other hand, it should be small enough to retain as much signal energy as possible. The VisuShrink threshold [27] is currently a widely used threshold. The same threshold is selected for each wavelet decomposition level, which can be described using formula (2):

\lambda = \sigma \sqrt{2 \ln(M \cdot N)}   (2)

where M and N are the numbers of rows and columns of image pixels, and \sigma is the noise standard deviation. As the wavelet transform is a multi-resolution analysis, the threshold value can be selected at each layer. Therefore, a variable adaptive wavelet threshold is proposed in this paper, which decreases as the number of wavelet decomposition layers increases and thus corresponds to the actual wavelet decomposition. The variable wavelet threshold is presented in formula (3):

\lambda_i = \frac{\sigma \sqrt{2 \ln(M \cdot N)}}{\ln(i^2 + 1)}   (3)

where i is the number of the decomposition layer of the wavelet transform and \sigma is the standard deviation of the noise; \ln(i^2 + 1) is the contraction factor, which automatically adjusts the size of the wavelet threshold according to the number of layers of the wavelet decomposition and therefore has the characteristic of self-adaptation. The proposed variable wavelet threshold can better solve the problem of the VisuShrink threshold and effectively improve the recognition of the noise signal and the accuracy of image reconstruction.

This study presents a multi-parameter double mapping threshold function, which has more advantages and flexibility compared with the hard and soft threshold functions, i.e., it can perform wavelet coefficient processing and complete adaptive denoising of images. Thresholds are introduced to adjust the translation of the function. According to the parameter m, the difference between the estimated and actual wavelet coefficients can be adjusted. The expression is presented in formula (4):

\hat{w}^i_{j,k} =
\begin{cases}
\mathrm{sign}(w^i_{j,k}) \left( |w^i_{j,k}| - \dfrac{m\lambda_i}{1 + e^{-(|w^i_{j,k}| - \lambda_i)}} \right), & |w^i_{j,k}| \ge \lambda_i \\
w^i_{j,k}, & \lambda_i/2 \le |w^i_{j,k}| < \lambda_i \\
0, & |w^i_{j,k}| < \lambda_i/2
\end{cases}   (4)

where \hat{w}^i_{j,k} is the estimated coefficient of the image, w^i_{j,k} is the original coefficient of the image after wavelet decomposition, \lambda_i is the ith wavelet threshold, and m is the adjustment parameter. The value of m affects the trend of the threshold function, thus adjusting its steady-state error and continuity.


As can be seen from the above adaptive double threshold function, when the parameter m = 0, the function becomes a hard threshold function. When m = 1 and |w^i_{j,k}| \to \infty, the function becomes a soft threshold function. When |w^i_{j,k}| \to \infty, the absolute value of the deviation between the estimated wavelet coefficient \hat{w}^i_{j,k} and the wavelet coefficient w^i_{j,k} tends to m\lambda_i, i.e., ||\hat{w}^i_{j,k}| - |w^i_{j,k}|| \to m\lambda_i as |w^i_{j,k}| increases. As m is a parameter, the absolute value of the deviation can be controlled by adjusting its size, which effectively mitigates the constant deviation between the estimated and actual wavelet coefficients generated by the soft threshold function. In comparison with the soft and hard threshold functions, the multi-parameter wavelet threshold function proposed in this paper is a better and more flexible choice. The proposed threshold function can correctly determine the size of m according to different images, realize different processing of the wavelet coefficients, and complete adaptive denoising of the image. The improved wavelet denoising process is shown in Fig. 4.
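Before turning to the overall flow in Fig. 4, the following NumPy sketch implements the layer-dependent threshold of formula (3) and the double-mapping threshold function of formula (4); the paper's experiments use MATLAB, so this Python version is only an illustration.

```python
# NumPy sketch of the layer-dependent threshold of formula (3) and the
# multi-parameter double-mapping threshold function of formula (4).
import numpy as np

def layer_threshold(sigma, M, N, i):
    """lambda_i = sigma * sqrt(2 ln(M*N)) / ln(i^2 + 1), formula (3)."""
    return sigma * np.sqrt(2 * np.log(M * N)) / np.log(i**2 + 1)

def atmw_threshold(w, lam, m):
    """Adaptive double-threshold mapping of the detail coefficients, formula (4)."""
    aw = np.abs(w)
    shrunk = np.sign(w) * (aw - m * lam / (1.0 + np.exp(-(aw - lam))))
    out = np.where(aw >= lam, shrunk, w)      # large coefficients: shrink
    return np.where(aw < lam / 2, 0.0, out)   # small coefficients: zero; middle band: keep

# m = 0 reproduces the hard-threshold behaviour for large coefficients, while for
# m = 1 the deviation from w tends to lambda_i, as with soft thresholding.
```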

Fig. 4. Flowchart of adaptive threshold multi-parameter wavelet denoising

In summary, the algorithm is described in the following steps:

1. The infrared image containing multiplicative noise is expressed by formula (5); taking the logarithm of formula (5) with base 4, the multiplicative noise is changed into additive noise, as expressed by formula (6):

Y_{ij} = X_{ij} \cdot \delta   (5)

\log_4 Y_{ij} = \log_4 X_{ij} + \log_4 \delta   (6)

To avoid taking the logarithm of 0 when implementing the algorithm, we set Y_{ij} = Y_{ij} + \zeta, \zeta \in (0.0000001, 0.0000002).
2. A three-layer wavelet decomposition is performed on the infrared image \log_4 Y_{ij}, and the db4 wavelet base is selected. First, the noisy image is decomposed into a low-frequency part L1 and a high-frequency part H1 in the first layer. Then, the low-frequency part of the first layer L1 is decomposed into the second low-frequency part L2 and the second high-frequency part H2. Finally, the low-frequency part of the second layer L2 is decomposed into the third low-frequency part L3 and the third high-frequency part H3.
3. Threshold quantization is performed on the high-frequency coefficients; w^i_{j,k} denotes a high-frequency coefficient in Hi (i = 1, 2, 3 indicates the layer currently decomposed).
(1) The threshold of each layer is calculated according to formula (3).
(2) If |w^i_{j,k}| < \lambda_i/2, w^i_{j,k} is a noise coefficient and is set to zero. If \lambda_i/2 \le |w^i_{j,k}| \le \lambda_i, \hat{w}^i_{j,k} = w^i_{j,k}. If |w^i_{j,k}| \ge \lambda_i, w^i_{j,k} is a useful signal, and its corresponding new coefficient is calculated using the threshold function proposed in formula (4).
4. The infrared image \log_4 Y_{ij} is reconstructed. The new coefficients \hat{w}^i_{j,k} and the third-layer low-frequency coefficients are used in the reconstruction to obtain the denoised image.
5. The image obtained in the previous step is raised to a power of 4 to acquire the denoised image.
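The steps above can be sketched compactly with PyWavelets; the paper's experiments are implemented in MATLAB, so the Python code below is only an illustrative re-implementation in which sigma (the assumed noise standard deviation) and the adjustment parameter m are supplied by the user.

```python
# Compact sketch of steps 1-5 with PyWavelets (illustrative, not the paper's code).
import numpy as np
import pywt

def _thr(w, lam, m):
    # the double-threshold mapping of formula (4), as sketched earlier
    aw = np.abs(w)
    shrunk = np.sign(w) * (aw - m * lam / (1.0 + np.exp(-(aw - lam))))
    return np.where(aw >= lam, shrunk, np.where(aw < lam / 2, 0.0, w))

def atmw_denoise(y, sigma, m=1.0):
    y = y.astype(np.float64) + 1e-7                  # avoid the logarithm of zero (step 1)
    logy = np.log(y) / np.log(4)                     # multiplicative -> additive noise
    coeffs = pywt.wavedec2(logy, "db4", level=3)     # step 2: three-layer db4 decomposition
    M, N = y.shape
    out = [coeffs[0]]                                # keep the low-frequency part L3
    for k, details in enumerate(coeffs[1:]):         # detail bands ordered from layer 3 down to 1
        i = 3 - k
        lam = sigma * np.sqrt(2 * np.log(M * N)) / np.log(i**2 + 1)   # formula (3)
        out.append(tuple(_thr(d, lam, m) for d in details))           # step 3
    rec = pywt.waverec2(out, "db4")                  # step 4: reconstruction
    return 4.0 ** rec[:M, :N]                        # step 5: invert the log_4 transform
```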

3.2 Fast Non-local Means (FNLM) Algorithm for Additive Noise Removal

FNLM not only effectively removes speckle noise and maintains structural similarity but also minimizes the operation time and greatly improves the denoising performance. Therefore, the FNLM method is used to remove background, detector, and other additive noise from the image signal after wavelet reconstruction. FNLM uses integral image technology to accelerate the non-local means (NLM) algorithm. The image contains N pixels; the size of the search window is D*D (D = 2Ds + 1), and the size of the neighbouring window is d*d (d = 2ds + 1). The integral image of the pixel differences is constructed as shown in formula (7):

S_t(x) = \sum_{z=(z_1, z_2) \in N^2:\ 0 \le z_1 \le x_1,\ 0 \le z_2 \le x_2} S_t(z), \quad x = (x_1, x_2)   (7)

where x is a pixel point in the noisy image and S_t(z) = \|V(z) - V(z + t)\|^2 represents the grey-value distance between pixels z and z + t in the noisy image. Then, the Euclidean distance between V(x) and V(y) (y = x + t) can be calculated as shown in formula (8), where V(x) is the d*d neighbourhood centred on x in the noisy image and V(y) is the d*d neighbourhood centred on y:

\|V(x) - V(y)\|^2 = \frac{1}{d^2} \big( S_t(x_1 + ds, x_2 + ds) + S_t(x_1 - ds - 1, x_2 - ds - 1) - S_t(x_1 + ds, x_2 - ds - 1) - S_t(x_1 - ds - 1, x_2 + ds) \big)   (8)

In summary, when the FNLM algorithm is used to calculate the similarity of two neighbouring windows, only this difference of integral-image values needs to be computed, which greatly reduces the time complexity.
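As a practical stand-in for this stage, OpenCV ships a fast non-local means implementation; the sketch below applies cv2.fastNlMeansDenoising to the wavelet-restored image. It is not the integral-image FNLM derived above, and the filter strength and window sizes are illustrative choices.

```python
# Practical stand-in for the additive-noise stage: OpenCV's fast non-local
# means filter applied to the wavelet-restored image. Parameters are illustrative.
import cv2
import numpy as np

def remove_additive_noise(img):
    img8 = np.clip(img, 0, 255).astype(np.uint8)    # OpenCV expects an 8-bit image
    return cv2.fastNlMeansDenoising(img8, h=10, templateWindowSize=7, searchWindowSize=21)

# Composite restoration: wavelet stage for the multiplicative noise,
# then the fast non-local means stage for the remaining additive noise.
# restored = remove_additive_noise(atmw_denoise(noisy, sigma=10))
```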


4 Results and Discussion

The simulated images comprised 2000 480p infrared images captured by a REPT-O-330 15–330 mm F/4.0 infrared camera from the REPT company. The proposed algorithm was compared to the soft threshold function, the hard threshold function, and the FNLM algorithm. The simulation was performed using Windows 10 and Matlab R2020a. This study uses subjective and objective evaluation criteria to analyse the results. The subjective evaluation is the subjective visual effect; the objective evaluation indices are the mean square error (MSE), peak signal-to-noise ratio (PSNR), and signal-to-noise ratio (SNR).

4.1 Analysis of Experimental Results

Five infrared images were corrupted with multiplicative noise with a mean of zero and a standard deviation of 10 and Gaussian noise with a mean of zero and a standard deviation of 15. Using this algorithm, the wavelet was decomposed three times and db4 was selected as the wavelet base. The results are shown in Fig. 5; Fig. 5a is the original image, Fig. 5b is the noisy image, Fig. 5c is the result of the hard threshold function denoising algorithm, Fig. 5d is the result of the soft threshold function denoising algorithm, Fig. 5e is the result of the FNLM denoising algorithm, and Fig. 5f is the result of the compound restoration algorithm proposed herein. In Fig. 5, we can see that the images denoised by the soft and hard threshold function denoising algorithms still contain a large amount of noise compared with the original infrared image, and the visual effect is not greatly improved. Although the FNLM algorithm has some denoising effect, it still leaves more residual noise. The overall effect of the denoised image obtained by the adaptive threshold multi-parameter wavelet compound restoration algorithm proposed herein is relatively smooth and clear. The image denoised by this algorithm retains its edge features and effectively mitigates the Gibbs phenomenon near the edges. The image details are clearer, and the visual effect is better than the traditional wavelet denoising methods and the FNLM algorithm.

To better compare the performance of the adaptive threshold multi-parameter wavelet compound restoration algorithm and assess its effectiveness, the denoising effects of different methods were compared using objective data of the images. The MSE, PSNR, and SNR were used as measurement indicators, and the comparison results are shown in Tables 1, 2, and 3. It can be seen from Table 1 that the MSE of the proposed ATMW composite algorithm is 172.223 lower than that of the noisy image on average, 164.309 lower than that of the hard threshold denoising algorithm, 54.180 lower than that of the soft threshold denoising algorithm, and 6.213 lower than that of the FNLM algorithm. From Table 2, the average PSNR of the proposed adaptive double threshold composite algorithm is 8.228 higher than that of the noisy image, 8.055 higher than that of the hard threshold denoising algorithm, 4.653 higher than that of the soft threshold denoising algorithm, and 0.820 higher than that of the FNLM algorithm. As shown in Table 3, the mean SNR of the proposed ATMW composite algorithm is 8.551 higher than that of the noisy image, 8.375 higher than that of the hard threshold denoising algorithm, 5.001 higher than that of the soft threshold denoising algorithm, and 0.650 higher than that of the FNLM algorithm.
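The three objective indices can be computed as in the short sketch below; the formulas are the standard definitions for 8-bit images, so the peak value of 255 is an assumption (the paper does not spell the formulas out).

```python
# Sketch of the three objective indices (MSE, PSNR, SNR) using standard
# definitions for 8-bit images; the peak value of 255 is an assumption.
import numpy as np

def mse(ref, img):
    return float(np.mean((ref.astype(np.float64) - img.astype(np.float64)) ** 2))

def psnr(ref, img, peak=255.0):
    return float(10 * np.log10(peak**2 / mse(ref, img)))

def snr(ref, img):
    signal_power = float(np.mean(ref.astype(np.float64) ** 2))
    return float(10 * np.log10(signal_power / mse(ref, img)))
```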


Fig. 5. Comparison of experimental results on infrared images of different algorithms. a. Original images; b. Noisy images; c. Hard threshold denoising; d. Soft threshold denoising; e. FNLM; f. Proposed

Table 1. Comparison of MSE among different algorithms

Image   | Noisy images | Hard threshold denoising | Soft threshold denoising | FNLM     | Proposed
1       | 208.6455     | 200.5881                 | 89.8706                  | 35.8593  | 30.8885
2       | 206.7074     | 198.3135                 | 92.2750                  | 41.2801  | 33.1628
3       | 192.1573     | 183.9526                 | 83.9982                  | 34.1382  | 27.5156
4       | 192.3205     | 184.3500                 | 80.9782                  | 28.0985  | 21.9653
5       | 217.9521     | 211.0079                 | 100.4487                 | 48.3590  | 43.1370
Average | 203.55656    | 195.64242                | 89.51414                 | 37.54702 | 31.33384

The average values of MSE, PSNR, and SNR of the 20 randomly sampled denoised images were used to draw histograms, as shown in Fig. 6. For the MSE, the denoised image of the hard threshold algorithm is 7.68 dB lower than the noisy image; the denoised image of the soft threshold algorithm is 113.10 dB lower than the noisy image, and the FNLM is 165.13 dB lower than the noisy image. However, the proposed algorithm can reduce the MSE by 171.47 dB, which is higher than the other algorithms.

Table 2. Comparison of PSNR among different algorithms

Image     Noisy images    Hard threshold denoising    Soft threshold denoising    FNLM        Proposed
1         24.9367         25.1078                     28.5946                     32.5848     33.2328
2         24.9772         25.1573                     28.4800                     31.9734     32.9243
3         25.2942         25.4837                     28.8881                     32.7984     33.7350
4         25.2905         25.4744                     29.0471                     33.6440     34.7134
5         24.7472         24.8878                     28.1114                     31.2860     31.7823
Average   25.04916        25.2222                     28.62424                    32.45732    33.27756

Table 3. Comparison of SNR among different algorithms

Image     Noisy images    Hard threshold denoising    Soft threshold denoising    FNLM        Proposed
1         9.5272          9.6931                      13.1382                     17.4558     17.9458
2         14.3052         14.4909                     17.7858                     21.8503     22.6102
3         15.1510         15.3467                     18.7180                     23.2975     24.0918
4         13.4047         13.5928                     17.1253                     22.3003     23.1891
5         9.6874          9.8322                      13.0220                     16.6772     16.9919
Average   12.4151         12.59114                    15.95786                    20.31622    20.96576

In terms of PSNR, the hard threshold raises the average PSNR of the noisy images by 0.22 dB, the soft threshold by 3.59 dB, and the FNLM algorithm by 6.52 dB, whereas the proposed algorithm improves it by 8.50 dB, the largest PSNR gain. In terms of SNR, the hard threshold improves the average SNR by 0.23 dB, the soft threshold by 3.56 dB, and the FNLM algorithm by 7.05 dB, whereas the proposed algorithm improves it by 8.80 dB.
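As a sketch of how the averages behind Fig. 6 can be assembled, the loop below accumulates the three indices over a folder of test images and draws a grouped bar chart. The folder name and the denoiseHard/denoiseSoft/denoiseFNLM/denoiseProposed and evalMetrics helpers (the latter returning [MSE PSNR SNR]) are hypothetical placeholders, not functions provided by the paper.

% Hedged batch-evaluation sketch: folder name, denoise* helpers and evalMetrics
% are hypothetical placeholders.
files   = dir('test_images/*.png');                  % hypothetical folder of 20 images
methods = {'Noisy', 'Hard', 'Soft', 'FNLM', 'Proposed'};
acc     = zeros(numel(methods), 3);                  % columns: MSE, PSNR, SNR

for k = 1:numel(files)
    I  = im2double(imread(fullfile(files(k).folder, files(k).name)));
    In = imnoise(imnoise(I, 'speckle', (10/255)^2), 'gaussian', 0, (15/255)^2);
    out = {In, denoiseHard(In), denoiseSoft(In), denoiseFNLM(In), denoiseProposed(In)};
    for m = 1:numel(methods)
        acc(m, :) = acc(m, :) + evalMetrics(I, out{m});
    end
end
acc = acc / numel(files);                            % per-method averages over the set

bar(acc);                                            % one group of three bars per method
set(gca, 'XTickLabel', methods);
legend({'MSE', 'PSNR (dB)', 'SNR (dB)'});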

Fig. 6. Average MSE, PSNR, and SNR comparison of 20 images among different algorithms


In addition, the superiority of the algorithm was further verified by varying the standard deviation of the initial Gaussian noise, using σ = 15, σ = 25, and σ = 35. As shown in Fig. 7, the five rows correspond to the following: the first row shows the noisy images, the second row the hard threshold denoising results, the third row the soft threshold denoising results, the fourth row the FNLM results, and the fifth row the results of the ATMW compound restoration algorithm proposed in this paper. The initial Gaussian noise level varies by column: (a) σ = 15, (b) σ = 25, and (c) σ = 35.
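A brief sketch of this noise-level sweep is given below, under the same assumptions as the earlier snippets (I is a reference image in [0, 1]; the denoise* and evalMetrics helpers are hypothetical placeholders). Each pass of the outer loop produces the entries of one σ block of a Table 4-style comparison.

% Hedged noise-level sweep: denoise* and evalMetrics are hypothetical helpers.
for sigma = [15 25 35]
    In = imnoise(imnoise(I, 'speckle', (10/255)^2), 'gaussian', 0, (sigma/255)^2);
    rows = {In, denoiseHard(In), denoiseSoft(In), denoiseFNLM(In), denoiseProposed(In)};
    for m = 1:numel(rows)
        % evalMetrics returns [MSE PSNR SNR]; fprintf consumes the three values in order
        fprintf('sigma=%d  method %d: MSE %.4f  PSNR %.4f  SNR %.4f\n', ...
                sigma, m, evalMetrics(I, rows{m}));
    end
end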


Fig. 7. Visual comparison of reconstruction quality for different noise levels

Figure 7 presents the visual results of the different algorithms under the various noise levels. The test images in Fig. 7 are relatively complex, with rich texture and geometric structure; even so, the proposed algorithm removes noise effectively at each noise intensity while retaining image details and texture, and the ATMW-based compound restoration method yields the best visual quality among the compared methods. The algorithms are then compared quantitatively to evaluate image quality after denoising; the MSE, PSNR, and SNR values of the different algorithms are presented in Table 4. A lower MSE indicates a smaller difference between the denoised and original images and therefore better image quality, whereas higher PSNR and SNR values indicate better algorithm performance and better image quality after denoising. Table 4 shows that the proposed algorithm yields higher PSNR and SNR values and a lower MSE value than the other methods, confirming the superiority of the compound restoration method in image reconstruction.
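For reference, the three indices are used here in their conventional forms; the paper does not restate the formulas, so the expressions below are the standard definitions for an M x N reference image I, a denoised image \hat{I}, and an 8-bit peak value of 255:

\[
\mathrm{MSE} = \frac{1}{MN}\sum_{i=1}^{M}\sum_{j=1}^{N}\bigl(I(i,j)-\hat{I}(i,j)\bigr)^{2},
\qquad
\mathrm{PSNR} = 10\log_{10}\frac{255^{2}}{\mathrm{MSE}},
\qquad
\mathrm{SNR} = 10\log_{10}\frac{\sum_{i,j} I(i,j)^{2}}{\sum_{i,j}\bigl(I(i,j)-\hat{I}(i,j)\bigr)^{2}}.
\]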

Table 4. Denoising results of different algorithms under various noise levels

Noise level   Metric   Noisy images   Hard threshold denoising   Soft threshold denoising   FNLM       Proposed
σ = 15        MSE      192.1573       183.9526                   83.9982                    34.1382    27.5156
              PSNR     25.2942        25.4837                    28.8881                    32.7984    33.735
              SNR      15.151         15.3467                    18.718                     23.2975    24.0918
σ = 25        MSE      492.7197       480.0999                   252.2641                   147.7367   90.9133
              PSNR     21.2048        21.3175                    24.1122                    26.4359    28.5445
              SNR      11.0359        11.1524                    13.9283                    16.3297    18.6178
σ = 35        MSE      910.673        896.091                    511.4623                   401.3401   261.3104
              PSNR     18.5372        18.6073                    21.0427                    22.0957    23.9592
              SNR      8.3684         8.4429                     10.8631                    12.0227    13.8954

In summary, the experiments indicate that the proposed infrared image denoising method has better performance than the traditional wavelet threshold denoising and FNLM algorithms and can effectively suppress the Gibbs phenomenon.

5 Conclusions

Infrared imaging has many significant potential applications. However, infrared images are degraded not only by the detector and additive background noise but also by multiplicative speckle noise, which leads to blurred image details and information loss; current denoising techniques cannot suppress both kinds of noise simultaneously and effectively. In this paper, we propose a composite restoration approach based on ATMW and FNLM that suppresses the effects of both multiplicative speckle noise and additive background noise on image quality. The experimental findings show that this method is superior to the FNLM algorithm and conventional wavelet threshold infrared image denoising techniques. Experiments on 20 images further show that it better retains image details and target specifics, with average increases in PSNR and SNR of 8.51 dB and 8.90 dB, respectively, and an average decrease in MSE of 171.47.


Author Index

A Aboutahoun, Dina III-213 Adamec, Martin III-369 Ahmad, Zahoor II-170 Aizawa, Hiroaki II-115 Alkhaled, Lama I-148 Aoki, Takafumi II-409 Ardon Munoz, Carlos David III-150 Ayala-Raggi, Salvador Eugenio II-216 B Bae, Dae-Hyun I-374 Bae, Wonho I-1 Banerjee, Nirwan I-317 Barreto-Flores, Aldrin II-216 Bian, Xinjun II-203 Bian, Zengxu III-174 Biswas, Aurnab I-95 Biswas, Kunal II-15 Blumenstein, Michael II-30 Broti, Nawara Mahmood II-143 Bulín, Martin III-369 C Calderero, Felipe III-1 Cao, Jie II-340 Cao, Xianbin III-307 Chai, Lin III-199 Chan, Kwok-Leung III-162 Chen, Haonan I-54 Chen, Hua II-192 Chen, Jiale II-203 Chen, Minghui II-312 Chen, Mingwei II-203 Chen, Peng III-383 Chen, Pingsheng II-396 Chen, Yupeng II-59 Cheng, Jingliang II-312 Cheng, Xiufeng I-137

Choi, Sungjoon I-1, I-15 Chowdhury, Fatiha Ishrar I-95 Chy, Md. Shamsul Rahat I-194 Chy, Md. Shamsul Rayhan I-194 D Deng, Yao II-297 Dong, Yulin II-355 Dong, Yunyun II-381 Du, Yangfan II-326 Duan, Junxian II-340 F Fan, Jiulun I-400 Fei, Hongyan I-344 Feng, Jufu I-344 Feng, Shuyang I-302 Feng, Siyang II-203 Feng, Zhengyun II-203 Frémund, Adam III-336 Fujiyoshi, Hironobu I-68 G Gao, Can III-136 Gao, Changxin I-81, I-165 Gao, Sicheng III-250 Gao, Song II-381 Gaüzère, Benoit II-1 Goh, Siew Li III-41 Gornale, Shivanand S. II-243 Gupta, Deepak Kumar I-317 Gyohten, Keiji II-129 H Halder, Arnab II-30 Hasan Mahin, Mohammad Rakibul I-95 Hasikin, Khairunnisa III-41 He, Ran II-258, II-340 Hirakawa, Tsubasa I-68

Honeine, Paul II-1 Hooftman, Dimitri III-26 Horsch, Alexander I-317 Hossain, Md Sabbir I-194 Hou, Zhiqiang I-400 Hoya, Tetsuya I-288 Hrúz, Marek III-369 Hu, Fang I-137 Hu, Jiayu II-192 Hu, Xiuxiu II-396 Huang, Chuanwei I-344 Huang, Huaibo II-258, II-340 Huang, Lin-Lin III-265 Huang, Shenyuan II-258 Huang, Zhuochao III-227 Huang, Ziyi I-29 Hum, Yan Chai I-148 Huynh, Van Thong III-113 Hwang, Ji-Sang I-415

I Iguchi, Hiroto I-288 Inoue, Michiko II-102 Ircing, Pavel III-358 Ito, Koichi II-409 Iwai, Yoshio II-102, III-150 Iwana, Brian Kenji I-123 Iwasaki, Fuyuko II-102 Iwasaki, Tomoya I-231 Iwata, Daiki I-216

J Jahan, Sifat E. I-95 Javed, Mohammed II-88 Jeon, Yeong-Eun I-374 Jeong, Sukmin III-106 Ji, Guangyan II-326 Jia, Linlin II-1 Jia, Zexi I-344 Jiang, Chi II-59 Jiang, Huiqin II-312 Jiang, Yongquan I-273 Jiang, Zhizhuo II-355 Jiao, Qing I-203 Jin, Mengyuan I-137 Ju, Young-Gi III-15 Jung, Won-Seok I-374


K Kan, Yaozu II-326 Kataoka, Rentaro I-108 Kibria, Fahmid Bin I-95 Kim, Ho-Jung I-374 Kim, Jaeyoung II-170 Kim, Jong-Myon II-170, II-180 Kim, Soo-Hyung III-106, III-113 Kimura, Akisato I-108 Kimura, Keiji III-213 Kittler, Josef II-367 Kurita, Takio II-115 L Lai, Khin Wee III-41 Lan, Rushi III-93, III-185, III-227, III-279 Lazcano, Vanel III-1 Lee, Joon Hyung I-15 Lee, Sang-Hoon I-415 Lee, Seong-Whan I-415 Leheˇcka, Jan III-346 Leung, Tak-Man III-162 Li, Feiyu III-174 Li, Guangxu I-29 Li, Jun II-74, III-52 Li, Ning II-312 Li, Run-Xi III-265 Li, Shengyi II-156 Li, Wenxin III-79 Li, Xuanpeng II-156 Li, Yujie I-41 Lin, Qingshui III-136 Lin, Yuyang I-273 Liu, Cheng-Lin I-179, II-243 Liu, Dejun III-174 Liu, Hongzhi III-79 Liu, Jiacheng I-29 Liu, Jiajing I-29 Liu, Jianzhuang III-250 Liu, Jin II-340 Liu, Ningzhong I-245 Liu, Shuai III-383 Liu, Xuhui III-238, III-250, III-307 Liu, Yu II-355 Lu, Gui-Fu II-326 Lu, Huimin I-41 Lu, Jianglin III-136 Lu, Tong II-30 Luo, Rongsheng I-81 Luo, Xiaoyan III-227, III-250

Lyu, Zhihan III-174

M Ma, Li I-137 Ma, Ling II-312 Ma, Sugang I-400 Maeng, Junyeong III-123 Mahin, Mohammad Rakibul Hasan I-194 Makino, Wataru II-409 Malakar, Samir I-317 Maliuk, Andrei II-180 Marcos, Nelson III-293 Matoušek, Jindˇrich III-322, III-336 Miura, Kanta II-409 Miyazaki, Ryuto I-359 Mokayed, Hamam I-148 Moonwar, Waheed I-95 Morita, Shunpei I-288 Mudhalwadkar, Nikhil Prashant I-148 N Nadia, Yasmin I-95 Nagabhushan, P. II-88 Nahar, Sonam II-230 Nakata, Atsuya I-359 Narsingani, Sagar II-230 Neduchal, Petr III-369 Ni, Wenlong II-192 Nihal, Ragib Amin II-143 Ning, Xiao II-1 Nishiyama, Masashi II-102, III-150 Nitta, Tsunemi I-68 Noh, Junhyug I-1, I-15 O Oh, Kwanseok III-123 Ohki, Hidehiro II-129 Otake, Yurie II-409 P Pacot, Mark Phil B. III-293 Pal, Umapada II-15, II-30, II-243 Pan, Xipeng II-203 Pang, Cheng III-93, III-185, III-227, III-279 Park, Dogeun III-15 Park, Yung-Il I-374 Patel, Yash II-230 Pei, Mingjing I-245 Picazo-Castillo, Angel Ernesto II-216

Portillo-Robledo, José Francisco II-216 Prasad, Dilip K. I-317

Q Qi, Wenxia II-284 Qiu, Yuping I-54, I-302 R Rahman, Mohammad Muhibur I-194 Rajesh, Bulla II-88 Ramírez, Iván III-1 Rasel, Annajiat Alim I-95, I-194 ˇ cková, Markéta III-336 RezᡠRiesen, Kaspar II-1 Roy, Ayush II-243 S Sang, Nong I-81, I-165 Sarkar, Ram II-15 Shen, Zhengxiang III-383 Shi, Jun III-307 Shi, Manhua II-59 Shi, Yu II-381 Shi, Yuzhi I-68 Shin, Seungyoun I-1, I-15 Shin, Yooseung III-123 Shivakumara, Palaiahnakote I-148, II-15, II-30, II-243 Shu, Rui I-302 Siddique, Muhammad Farooq II-170 Šmídl, Luboš III-346 Snijder, Joop III-26 Soliman, Mostafa I. III-213 Song, Pengyun I-273 Song, Shilong II-203 Suk, Heung-Il III-123 Sun, Huan I-203 Sun, Huixin III-307 Sun, Peng II-45 Sun, Tianli I-54 Sun, Ying I-400 Suresh, Karthikeyan I-123 Švec, Jan III-336, III-346, III-358, III-369 T Takami, Toshiya II-129 Tan, Hu II-381 Tanaka, Kanji I-216, I-231, I-331 Tang, Ke II-396


Tay Yu Liang, Jonathan I-331 Tihelka, Daniel III-322 Tihelková, Alice III-322 Tsukahara, Kenta I-231 Tu, Chuyu I-302 Tu, Shi III-199 U Uchida, Seiichi I-108 Ullah, Niamat II-170 Urabe, Jotaro II-409 V Vu, Ngoc Tu III-113 W Wan, Zhibo III-174 Wang, Deming I-165 Wang, Guangtao II-74 Wang, Hongyuan III-64 Wang, Junping I-203 Wang, Mingze III-307 Wang, Qiong III-52 Wang, Renjie II-367 Wang, Ruxin II-381 Wang, Tianhua I-29 Wang, Wei II-284 Wang, Xingfu II-284 Wang, Xuekuan I-302 Wang, Yangbo III-136 Wang, Yin II-115 Wang, Zhanshan III-383 Wang, Zheng I-344 Wang, Zhenyu I-203 Wang, Zi II-258 Wei, Dong II-45 Wei, Rukai I-81 Wei, Zhihui II-273 Won, Dong-Ok I-374, III-15 Wu, Guanhua III-93, III-185, III-279 Wu, Song I-344 Wu, Xiang III-41 Wu, Xiao-Jun II-367 Wu, Zebin II-273 X Xia, Siyu II-396 Xiang, Xiang II-297 Xiao, Qin I-29


Xie, Dingzhou III-185 Xie, Jie II-74, III-52 Xiong, Jiagui II-192 Xu, Jianhua II-74 Xu, Nan II-258 Xu, Tianyang II-367 Xu, Xinyi III-199 Xu, Yang II-273 Xue, Qifan II-156 Xue, Yunfan III-199 Y Yamada, Keiichi I-259 Yamamoto, Ryogo I-216 Yamanaka, Takao I-359 Yamashita, Takayoshi I-68 Yang, Bo II-74 Yang, Chen I-400 Yang, Hyung-Jeong III-113 Yang, Jinfeng II-45 Yang, Lei II-312 Yang, Rui III-227 Yang, Yan I-273 Yeo, Kangmo III-106 Yeoh, Pauline Shan Qing III-41 Yin, Fei I-179, III-265 Yin, Yuchang III-64 Yoshida, Mitsuki I-216 Z Zaman, Sk Mahafuz II-88 Zaman, Wasim II-170 Zewail, Rami III-213 Zhang, Baochang III-238, III-250, III-307 Zhang, Haigang II-45 Zhang, Ji III-64 Zhang, Lijun I-385 Zhang, Luping III-250, III-307 Zhang, Pinzheng III-79 Zhang, Shixiang II-273 Zhang, Xiaobo I-273 Zhang, Xiaoqin III-79 Zhang, Yezhuo II-156 Zhang, Yilin III-93, III-279 Zhang, Yin I-137, II-59 Zhang, Zepeng II-192 Zhang, Zhenyao I-385 Zhang, Zhifei I-302 Zhao, Cairong I-54, I-302


Zhao, Guohua II-312 Zhao, Mengbiao I-179 Zhao, Shaochuan II-367 Zheng, Aihua II-258 Zhou, Jie III-136 Zhou, Peizhu III-238, III-250 Zhou, Pingping III-93, III-279


Zhou, Wei II-381 Zhu, Junhui I-203 Zhu, Mingying III-52 Zhu, Shiyu III-79 Zhu, Xu I-41 Ziabari, S. Sahand Mohammadi III-26