Computer Vision – ACCV 2022 Workshops: 16th Asian Conference on Computer Vision, Macao, China, December 4–8, 2022, Revised Selected Papers

This book constitutes the refereed post-conference proceedings of the workshops held at the 16th Asian Conference on Computer Vision, ACCV 2022, held in Macao, China, in December 2022.


English, 381 pages, 2023


Table of contents:
Preface
Organization
Contents
Learning with Limited Data for Face Analysis
FAPN: Face Alignment Propagation Network for Face Video Super-Resolution
1 Introduction
2 Related Work
2.1 Video Super-Resolution
2.2 Facial Information Extraction and Recognition
3 Network Architecture
3.1 Framework of FAPN
3.2 Neighborhood Information Coupling
3.3 Face Super-Resolution Cell (FSRC)
3.4 Loss Function
4 Experiments
4.1 Dataset
4.2 Implementation Details
4.3 Comparisons to the State-of-the-Art
4.4 Ablation Study
5 Conclusion
References
Micro-expression Recognition Using a Shallow ConvLSTM-Based Network
1 Introduction
2 Literature Review
2.1 Traditional Micro-expression Recognition Techniques
2.2 Micro-expression Recognition Using Neural Networks
3 Proposal
3.1 Pre-processing
3.2 ConvLSTM
3.3 Dense Layers
4 Evaluation Result
4.1 Evaluation Protocol
4.2 Result and Discussion
5 Conclusion
References
Adversarial Machine Learning Towards Advanced Vision Systems
ADVFilter: Adversarial Example Generated by Perturbing Optical Path
1 Introduction
2 Related Work
3 ADVFilter Overview
4 ADVFilter Design
5 Experiments
5.1 Experiment Setup
5.2 Correctness Test
5.3 Comparison with SOTA Universal Adversarial Examples
5.4 The Number of Corresponding Points
6 Conclusion and Future Work
References
Enhancing Federated Learning Robustness Through Clustering Non-IID Features
1 Introduction
2 Related Work
2.1 Poisoning Attacks on Federated Learning
2.2 Byzantine-Robust Aggregation Rules for FL
3 Problem Setup
3.1 Adversary's Objective and Capability
3.2 Defense Objective and Capability
4 Mini-FL Design and Analysis
4.1 Overview of Mini-FL
4.2 Mini-FL Framework
4.3 Security Enhancement Analysis
5 Evaluation
5.1 Experimental Setup
5.2 Experimental Results
6 Discussion and Future Work
7 Conclusion
References
Towards Improving the Anti-attack Capability of the RangeNet++
1 Introduction
2 Related Work
2.1 Local Features of Point Clouds
2.2 Generation of Adversarial Examples
3 Methods
3.1 Mapping of Points to Range Image
3.2 Local Feature Extraction
3.3 The Generation of Adversarial Samples
3.4 Adversarial Attacks Against Models
4 Experiments and Results
4.1 Adversarial Attacks with Randomly Perturbed Parameters
4.2 Adversarial Attack with Specified Perturbation Parameters
5 Conclusions
References
Computer Vision for Medical Computing
Ensemble Model of Visual Transformer and CNN Helps BA Diagnosis for Doctors in Underdeveloped Areas
1 Introduction
2 Related Works
2.1 Conventional BA Diagnosis Methods
2.2 CNN Models
2.3 Visual Transformer Models
3 Methods and Data Materials
3.1 EDLM of CNNs
3.2 ViT-CNN EDLM
3.3 Datasets
3.4 Data Processing
4 Experiments and Results
4.1 Experimental Settings
4.2 Results
5 Discussion
5.1 Comparison Between Methods
5.2 Ensemble Strategy: Simple Majority or Average
References
Understanding Tumor Micro Environment Using Graph Theory
1 Introduction
1.1 Lymphoid Neoplastic Cells
2 Literature Review
3 Research Methodology
3.1 Proposed Framework
3.2 Dataset Description
3.3 Description of Methods Used in This Framework
4 Results and Discussion
4.1 Discussion and Comparison of This Study with Related Existing Hierarchical Graph Modeling Study
5 Conclusion and Future Work
References
Handling Domain Shift for Lesion Detection via Semi-supervised Domain Adaptation
1 Introduction
2 Methodology
2.1 Universal Lesion Detection
2.2 Feature Alignment via Adversarial Learning
2.3 Proposed Joint Few-Shot Learning (FSL)
2.4 Few-Shot Domain Adaptation (FDA)
2.5 Self-supervision
3 Experiments and Results
3.1 Overall Training Scheme of TiLDDA
3.2 Data and Evaluation Metric
3.3 Experimental Setup
3.4 Result and Ablation Study
4 Conclusion and Future Work
References
Photorealistic Facial Wrinkles Removal
1 Introduction
2 Related Work
3 Method
3.1 Wrinkle Segmentation
3.2 Wrinkle Cleaning
3.3 Inference
4 Experimental Results
4.1 Dataset and Metrics
4.2 Implementation Details
4.3 Comparisons to the Baselines
5 Ablations
6 Conclusions
References
Improving Segmentation of Breast Arterial Calcifications from Digital Mammography: Good Annotation is All You Need
1 Introduction
2 Materials
2.1 Dataset
2.2 Annotation
3 Methods
3.1 Image Pre-processing
3.2 Other Training Variables
3.3 Label Correction Algorithm
3.4 Length-Based Dice Score
4 Experiments
4.1 Data Split
4.2 Secondary Annotation on Test Data
4.3 Training Settings
4.4 Results
4.5 Additional Results
5 Discussion
References
Machine Learning and Computing for Visual Semantic Analysis
Towards Scene Understanding for Autonomous Operations on Airport Aprons
1 Introduction
2 Related Work
3 The Apron Dataset
3.1 Data Acquisition
3.2 Sequence and Image Annotation
3.3 Instance Annotation
3.4 Data Aggregation
4 Fine-Grained Classification
4.1 Classification Results
4.2 Robustness Analysis
5 Detection
5.1 Detection Results
5.2 Robustness Analysis
6 Conclusion
References
Lightweight Hyperspectral Image Reconstruction Network with Deep Feature Hallucination
1 Introduction
2 Related Work
2.1 Optimization-Based Methods
2.2 Deep Learning-Based Methods
3 Proposed Lightweight Hyperspectral Reconstruction Network
3.1 CASSI Observation Model
3.2 Overview of the Lightweight Reconstruction Model
3.3 The DFHM Module
4 Experimental Results
5 Conclusions
References
A Transformer-Based Model for Preoperative Early Recurrence Prediction of Hepatocellular Carcinoma with Muti-modality MRI
1 Introduction
2 Proposed Method
2.1 Pre-trained ResNet Feature Extractor
2.2 Transformer Encoder Block
3 Experiments
3.1 Patient Selection
3.2 Dataset Preparation and Metrics
3.3 Ablation Study
3.4 Comparison with Existing Methods
4 Conclusion
References
CaltechFN: Distorted and Partially Occluded Digits
1 Introduction
2 Related Work
3 Caltech Football Numbers (CaltechFN) Dataset
3.1 Dataset Construction
3.2 Properties
4 Model Performance
5 Human Performance Benchmark
6 Applications of CaltechFN
6.1 Player Detection and Tracking in Sports
6.2 Self-driving Cars
7 Discussion and Future Work
7.1 Further Labeling
References
Temporal Extension Topology Learning for Video-Based Person Re-identification
1 Introduction
2 Related Work
2.1 Image-Based Person ReID
2.2 Video-Based Person ReID
2.3 Graph Convolution
3 Method
3.1 Semantic Feature Extraction
3.2 Temporal Extension Adaptive Graph Convolution Layer
3.3 Model Optimizing
4 Experiments
4.1 Dataset and Implementation
4.2 Comparison with State-of-the-Arts
4.3 Model Component Analysis
5 Conclusion
References
Deep RGB-Driven Learning Network for Unsupervised Hyperspectral Image Super-Resolution
1 Introduction
1.1 Traditional Mathematical Model-Based Methods
1.2 Deep Learning-Based Methods
2 Proposed Method
2.1 Problem Formulation
2.2 Proposed Deep RGB-Driven Generative Network
3 Experiment Result
3.1 Experimental Setting
3.2 Comparisons with the State-of-the-Art Methods
3.3 Ablation Study
4 Conclusion
References
Gift from Nature: Potential Energy Minimization for Explainable Dataset Distillation
1 Introduction
2 Related Work
2.1 Dataset Distillation
2.2 Distribution Calibration
2.3 Potential Energy
3 Method
3.1 Problem Definition
3.2 PEM-Based Transformation
3.3 How to Fuse Datasets
3.4 Few-Shot Learning
3.5 Analysis
4 Experiment and Discussion
4.1 Experiment Setup
4.2 Empirical Understanding of PEMDD
4.3 Data Efficiency Application
4.4 Few-Shot Application
4.5 Hyper-parameters
5 Conclusion and Future Work
References
Object Centric Point Sets Feature Learning with Matrix Decomposition
1 Introduction
2 Related Work
2.1 Point Cloud Features
2.2 Deep 3D Representations
3 Object-Centric Representation Learning
3.1 Problem Definition
3.2 Matrix Decomposition for Invariant and Equivariant Feature Learning
3.3 Encoder
3.4 Reconstruction Decoder
3.5 Theoretical Analysis
3.6 Measurement of Learned Features on Avatar Generation
4 Experiment Results
4.1 Reconstruction Result
4.2 Classification Result
4.3 Avatar Generation
5 Conclusion
References
Aerial Image Segmentation via Noise Dispelling and Content Distilling
1 Introduction
2 Related Work
2.1 Traditional Semantic Segmentation
2.2 Aerial Segmentation
2.3 Disentangle Learning
3 Proposed Method
3.1 Framework
3.2 Channel Reconstruction Loss
3.3 Content Consistency Loss
3.4 Style Consistency Loss
3.5 Semantic Segmentation Loss
3.6 Total Batch Loss
4 Experiment Result
4.1 LandCover.ai
4.2 Investigation on RGB Channels
4.3 Aerial Segmentation Using Disentangle Learning on G-Channel and B-Channel
5 Conclusion
References
Vision Transformers Theory and Applications
Temporal Cross-Attention for Action Recognition
1 Introduction
2 Related Work
2.1 2D/3D CNN, and TSM
2.2 ViT-Based Video Models
3 Method
3.1 TokenShift and ViT
3.2 MSCA
4 Experimental Results
4.1 Setup
4.2 The Amount of Shift of MSCA-KV
4.3 Comparison of MSCA Variations Shifting in the Head Direction
4.4 The Number of Encoder Blocks with MSCA
4.5 Comparison of MSCA Variations Shifting in the Patch Direction
5 Conclusions
References
Transformer Based Motion In-Betweening
1 Introduction
2 Related Work
3 Methodology
3.1 Tween Transformers (TWTR)
3.2 Loss Computation
3.3 Training
4 Setup and Experimental Results
4.1 Evaluation Metrics
4.2 Dataset
4.3 Data Preprocessing
4.4 Hyperparameters
4.5 Visualizer and Motion Generator
4.6 Inferences
5 Conclusion
References
Convolutional Point Transformer
1 Introduction
2 Related Literature
3 Convolutional Point Transformer
3.1 Point Embedding Module
3.2 Convolutional Point Transformer Layer
3.3 Adaptive Global Feature
4 Experiments and Results
4.1 Implementation Details
4.2 Classification with ModelNet40
4.3 Segmentation Results with ShapeNet Part
4.4 Normal Estimation
5 Ablation Study
6 Conclusions and Future Work
References
Cross-Attention Transformer for Video Interpolation
1 Introduction
2 Related Work
2.1 Video Interpolation
2.2 Vision Transformers
3 Method
3.1 CAIN
3.2 Cross Similarity (CS) Module
3.3 Image Attention (IA) Module
3.4 TAIN Architecture and Training Details
4 Datasets and Performance Metrics
5 Results
6 Ablation Study
7 Conclusion
References
Deep Learning-Based Small Object Detection from Images and Videos
Evaluating and Bench-Marking Object Detection Models for Traffic Sign and Traffic Light Datasets
1 Introduction
2 Related Work
2.1 Traffic Sign Detection
2.2 Traffic Light Detection
3 Deep Learning for Object Detection
4 Experiments
4.1 TL and TS Datasets
4.2 Training Setup
4.3 Performance Evaluation
5 Results and Discussion
5.1 Traffic Light Results
5.2 Traffic Sign Results
5.3 Object Detection Model's Statistics
6 Conclusion
References
Exploring Spatial-Temporal Instance Relationships in an Intermediate Domain for Image-to-Video Object Detection
1 Introduction
2 Related Work
3 Our Method
3.1 Overview
3.2 Intermediate Domain
3.3 Domain Adaptive Faster R-CNN
3.4 Spatial-Temporal Instance Relationships Construction
4 Experiment
4.1 Datasets
4.2 Experiment Settings
4.3 Results
4.4 Ablation Study
4.5 Parameter Analysis
4.6 Qualitative Analysis
5 Conclusion
References
Author Index

LNCS 13848

Yinqiang Zheng · Hacer Yalim Keleş · Piotr Koniusz (Eds.)

Computer Vision – ACCV 2022 Workshops 16th Asian Conference on Computer Vision Macao, China, December 4–8, 2022 Revised Selected Papers

Lecture Notes in Computer Science Founding Editors Gerhard Goos Juris Hartmanis

Editorial Board Members: Elisa Bertino, Purdue University, West Lafayette, IN, USA; Wen Gao, Peking University, Beijing, China; Bernhard Steffen, TU Dortmund University, Dortmund, Germany; Moti Yung, Columbia University, New York, NY, USA

13848

The series Lecture Notes in Computer Science (LNCS), including its subseries Lecture Notes in Artificial Intelligence (LNAI) and Lecture Notes in Bioinformatics (LNBI), has established itself as a medium for the publication of new developments in computer science and information technology research, teaching, and education. LNCS enjoys close cooperation with the computer science R & D community, the series counts many renowned academics among its volume editors and paper authors, and collaborates with prestigious societies. Its mission is to serve this international community by providing an invaluable service, mainly focused on the publication of conference and workshop proceedings and postproceedings. LNCS commenced publication in 1973.

Yinqiang Zheng · Hacer Yalim Keleş · Piotr Koniusz Editors

Computer Vision – ACCV 2022 Workshops 16th Asian Conference on Computer Vision Macao, China, December 4–8, 2022 Revised Selected Papers

Editors Yinqiang Zheng University of Tokyo Tokyo, Japan

Hacer Yalim Keleş Hacettepe University Ankara, Turkey

Piotr Koniusz Data61/CSIRO Canberra, ACT, Australia

ISSN 0302-9743 ISSN 1611-3349 (electronic) Lecture Notes in Computer Science ISBN 978-3-031-27065-9 ISBN 978-3-031-27066-6 (eBook) https://doi.org/10.1007/978-3-031-27066-6 © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Switzerland AG The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland

Preface

Workshops of the 16th Asian Conference on Computer Vision (ACCV) 2022 were held in a hybrid mode in Macau SAR, China during December 4–8, 2022. We present the workshop proceedings from eight successfully held workshops on a variety of selected topics which enjoyed many attendees. The number of accepted workshops (10 in total) was larger than in the previous conferences, as ACCV 2022 was held in a hybrid mode to mitigate challenges posed by the COVID-19 pandemic by welcoming virtual and in-person attendees. Indeed, this facilitated numerous paper submissions and the discussion of a variety of timely topics. 25 papers were accepted out of 40 submissions in total, following a double-blind review process with 3.3 reviewers (on average) per paper. The accepted workshops were of excellent quality. The list of eight workshops and one grand challenge held in conjunction with ACCV 2022 is as follows:

• Learning with Limited Data for Face Analysis
• Adversarial Machine Learning Towards Advanced Vision Systems
• Multi-view Learning and Its Applications in Computer Vision
• Computer Vision Technology in Electric Power System
• Computer Vision for Medical Computing
• Machine Learning and Computing for Visual Semantic Analysis
• Vision Transformers: Theory and Applications
• Challenges of Fine-Grained Image Analysis
• Deep Learning-Based Small Object Detection from Images and Videos

The workshops primarily covered topics in high-level computer vision, as well as modern applications of computer vision. All workshops and the grand challenge had invited speaker sessions, where leading researchers gave insightful talks. Six workshops accepted papers for these proceedings. Two workshops hosted challenges and presented their results in their sessions, and some even arranged cash awards for the best papers. We thank all the workshop organizers for their efforts to prepare and host these successful events, especially in the challenging environment of hybrid-mode hosting. We hope they continue to hold their workshops in future ACCVs. We thank the reviewers for their valuable feedback to the paper authors. We also thank the publication chairs for collating the papers presented in the workshops to prepare the workshop proceedings.

January 2023

Yinqiang Zheng Hacer Yalim Keleş Piotr Koniusz

Organization

Learning with Limited Data for Face Analysis

Ping Liu (A*STAR, Singapore)
Yuewei Lin (Brookhaven National Laboratory, USA)
Zibo Meng (OPPO US Research Center, USA)
Yawei Luo (Zhejiang University, China)
Shangfei Wang (University of Science and Technology of China, China)
Joey Tianyi Zhou (A*STAR, Singapore)

Adversarial Machine Learning Towards Advanced Vision Systems

Minhui Xue (Data61/CSIRO, Australia)
Huaming Chen (University of Sydney, Australia)

Multi-view Learning and Its Applications in Computer Vision

Xinwang Liu (National University of Defense Technology, China)
Chang Tang (China University of Geosciences, Wuhan, China)
Marius Kloft (TU Kaiserslautern, Germany)
Xiaojun Chang (University of Technology Sydney, Australia)
Tongliang Liu (University of Sydney, Australia)

Computer Vision Technology in Electric Power System

Wenqing Zhao (North China Electric Power University, Baoding, China)
Yongjie Zhai (North China Electric Power University, Baoding, China)
Zhenbing Zhao (North China Electric Power University, Baoding, China)


Computer Vision for Medical Computing

Imran Razzak (University of New South Wales, Australia)
Xuequan Lu (Deakin University, Australia)
Yuejie Zhang (Fudan University, China)

Machine Learning and Computing for Visual Semantic Analysis

Xian-Hua Han (Yamaguchi University, Japan)
YongQing Sun (NTT, Japan)
Rahul Kumar Jain (Ritsumeikan University, Japan)

Vision Transformers: Theory and Applications

Fahad Shahbaz Khan (MBZUAI, United Arab Emirates)
Salman Khan (MBZUAI, United Arab Emirates)
Rao Muhammad Anwer (MBZUAI, United Arab Emirates)
Hisham Cholakkal (MBZUAI, United Arab Emirates)
Muhammad Haris Khan (MBZUAI, United Arab Emirates)

Challenges of Fine-Grained Image Analysis

Xiu-Shen Wei (Nanjing University of Science and Technology, China)
Peng Wang (University of Wollongong, Australia)
Huan Wang (Nanjing University of Science and Technology, China)
Zeren Sun (Nanjing University of Science and Technology, China)
Yang Shen (Nanjing University of Science and Technology, China)

Deep Learning-Based Small Object Detection from Images and Videos

Aref Miri Rekavandi (University of Western Australia, Australia)
Lian Xu (University of Western Australia, Australia)
Farid Boussaid (University of Western Australia, Australia)
Abd-Krim Seghouane (University of Western Australia, Australia)
Mohammed Bennamoun (University of Western Australia, Australia)

Contents

Learning with Limited Data for Face Analysis

FAPN: Face Alignment Propagation Network for Face Video Super-Resolution (p. 3)
Sige Bian, He Li, Feng Yu, Jiyuan Liu, Song Changjun, and Yongming Tang

Micro-expression Recognition Using a Shallow ConvLSTM-Based Network (p. 19)
Saurav Shukla, Prabodh Kant Rai, and Tanmay T. Verlekar

Adversarial Machine Learning Towards Advanced Vision Systems

ADVFilter: Adversarial Example Generated by Perturbing Optical Path (p. 33)
Lili Zhang and Xiaodong Wang

Enhancing Federated Learning Robustness Through Clustering Non-IID Features (p. 45)
Yanli Li, Abubakar Sadiq Sani, Dong Yuan, and Wei Bao

Towards Improving the Anti-attack Capability of the RangeNet++ (p. 60)
Qingguo Zhou, Ming Lei, Peng Zhi, Rui Zhao, Jun Shen, and Binbin Yong

Computer Vision for Medical Computing

Ensemble Model of Visual Transformer and CNN Helps BA Diagnosis for Doctors in Underdeveloped Areas (p. 73)
Zhenghao Wei

Understanding Tumor Micro Environment Using Graph Theory (p. 90)
Kinza Rohail, Saba Bashir, Hazrat Ali, Tanvir Alam, Sheheryar Khan, Jia Wu, Pingjun Chen, and Rizwan Qureshi

Handling Domain Shift for Lesion Detection via Semi-supervised Domain Adaptation (p. 102)
Manu Sheoran, Monika Sharma, Meghal Dani, and Lovekesh Vig

Photorealistic Facial Wrinkles Removal (p. 117)
Marcelo Sanchez, Gil Triginer, Coloma Ballester, Lara Raad, and Eduard Ramon

Improving Segmentation of Breast Arterial Calcifications from Digital Mammography: Good Annotation is All You Need (p. 134)
Kaier Wang, Melissa Hill, Seymour Knowles-Barley, Aristarkh Tikhonov, Lester Litchfield, and James Christopher Bare

Machine Learning and Computing for Visual Semantic Analysis

Towards Scene Understanding for Autonomous Operations on Airport Aprons (p. 153)
Daniel Steininger, Andreas Kriegler, Wolfgang Pointner, Verena Widhalm, Julia Simon, and Oliver Zendel

Lightweight Hyperspectral Image Reconstruction Network with Deep Feature Hallucination (p. 170)
Kazuhiro Yamawaki and Xian-Hua Han

A Transformer-Based Model for Preoperative Early Recurrence Prediction of Hepatocellular Carcinoma with Muti-modality MRI (p. 185)
Gan Zhan, Fang Wang, Weibin Wang, Yinhao Li, Qingqing Chen, Hongjie Hu, and Yen-Wei Chen

CaltechFN: Distorted and Partially Occluded Digits (p. 195)
Patrick Rim, Snigdha Saha, and Marcus Rim

Temporal Extension Topology Learning for Video-Based Person Re-identification (p. 213)
Jiaqi Ning, Fei Li, Rujie Liu, Shun Takeuchi, and Genta Suzuki

Deep RGB-Driven Learning Network for Unsupervised Hyperspectral Image Super-Resolution (p. 226)
Zhe Liu and Xian-Hua Han

Gift from Nature: Potential Energy Minimization for Explainable Dataset Distillation (p. 240)
Zijia Wang, Wenbin Yang, Zhisong Liu, Qiang Chen, Jiacheng Ni, and Zhen Jia

Object Centric Point Sets Feature Learning with Matrix Decomposition (p. 256)
Zijia Wang, Wenbin Yang, Zhisong Liu, Qiang Chen, Jiacheng Ni, and Zhen Jia

Aerial Image Segmentation via Noise Dispelling and Content Distilling (p. 269)
Yongqing Sun, Xiaomeng Wu, Yukihiro Bandoh, and Masaki Kitahara

Vision Transformers Theory and Applications

Temporal Cross-Attention for Action Recognition (p. 283)
Ryota Hashiguchi and Toru Tamaki

Transformer Based Motion In-Betweening (p. 295)
Pavithra Sridhar, V. Aananth, Madhav Aggarwal, and R. Leela Velusamy

Convolutional Point Transformer (p. 308)
Chaitanya Kaul, Joshua Mitton, Hang Dai, and Roderick Murray-Smith

Cross-Attention Transformer for Video Interpolation (p. 325)
Hannah Halin Kim, Shuzhi Yu, Shuai Yuan, and Carlo Tomasi

Deep Learning-Based Small Object Detection from Images and Videos

Evaluating and Bench-Marking Object Detection Models for Traffic Sign and Traffic Light Datasets (p. 345)
Ashutosh Mishra, Aman Kumar, Shubham Mandloi, Khushboo Anand, John Zakkam, Seeram Sowmya, and Avinash Thakur

Exploring Spatial-Temporal Instance Relationships in an Intermediate Domain for Image-to-Video Object Detection (p. 360)
Zihan Wen, Jin Chen, and Xinxiao Wu

Author Index (p. 377)

Learning with Limited Data for Face Analysis

FAPN: Face Alignment Propagation Network for Face Video Super-Resolution

Sige Bian1, He Li1, Feng Yu2, Jiyuan Liu1, Song Changjun1, and Yongming Tang1(B)

1 Joint International Research Laboratory of Information Display and Visualization, Southeast University, Nanjing 210096, China
{220211638,erie,scj,tym}@seu.edu.cn, [email protected]
2 School of Computing, National University of Singapore, Singapore
[email protected]

Abstract. Face video super-resolution (FVSR) aims to use continuous low-resolution (LR) video frames to reconstruct the face and recover facial details while ensuring authenticity. Existing video super-resolution (VSR) technology usually uses inter-frame information to achieve better super-resolution (SR) performance. However, due to the complex temporal dependence between frames, as the number of input frames increases the information cannot be fully utilized, and erroneous information may even be introduced, resulting in poor performance. In this work, we propose an alignment propagation network for accumulating facial prior information (FAPN). We design a neighborhood information coupling (NIC) module based on optical flow estimation and alignment, in which the current frame, the adjacent frames and the SR result of the previous frame are locally fused. The coupled frames are sent to a unidirectional propagation (UP) structure for propagation. Meanwhile, in the UP structure, facial prior information is filtered and accumulated in the face super-resolution cell (FSRC), and a high-dimensional hidden state is introduced to propagate effective temporal information between frames along the unidirectional structure. Extensive evaluations and comparisons validate the strengths of our approach: FAPN accumulates more facial details while ensuring the authenticity of the face. The experimental results demonstrate that the proposed framework achieves better performance on PSNR (up to 0.31 dB), SSIM (up to 0.15 dB) and face recognition accuracy (up to 1.99%) compared with state-of-the-art methods.

Keywords: Face video super-resolution · Alignment propagation network · Face recognition accuracy

1 Introduction

With the development of artificial intelligence technology, face video super-resolution (FVSR) has been widely used in intelligent transportation, personal identification, public security and other fields [1]. In the field of video surveillance, due to the hardware limitations of video capture equipment, it is difficult to obtain clear frames of a target face far away from the equipment [2]. FVSR aims to effectively enhance facial details in video [3], improve the accuracy of face recognition, and provide greater utilization of the raw data (Fig. 1).

(This work is supported by the National Natural Science Foundation of China, grant number 62275046.)

Fig. 1. Visual results of our FAPN on scale factor 4. Six consecutive LR video frames (1st row), HR frames (2nd row, generated by our method) and groundtruth (GT) frames (3rd row) are shown.

Compared with single-image super-resolution (SR), video super-resolution (VSR) can achieve better performance by exploiting inter-frame information. There are currently two main categories of methods for utilizing inter-frame information: alignment-based and non-alignment-based. RBPN [4] integrates the spatial and temporal context of consecutive video frames using a recurrent encoder and decoder module in the multi-projection stage. EDVR [5] uses deformable convolution to complete frame alignment at the feature level in a coarse-to-fine manner. The alignment-based approach is efficient, but it only uses information from adjacent frames and cannot effectively use input information far from the current frame. To exploit more inter-frame information, non-alignment-based approaches [6–8] have also been proposed. Although these recurrent structures can accumulate more information across frames, they inevitably introduce interference or even erroneous information that is useless for super-resolving the current frame.

In face super-resolution (FSR), the strong structural constraints of the human face provide abundant prior information for the SR process. In addition, if a reasonable accumulation mechanism is used to gather correct and useful information for the face region of the video, the details of the face region can be effectively enhanced while ensuring the authenticity of the generated face.

In this work, we propose a face video super-resolution network (FAPN), which combines the advantages of alignment-based and non-alignment-based methods. The current frame, the adjacent frames and the SR result of the previous frame are locally fused, and then sent to a unidirectional propagation (UP) structure


for propagation, which is based on optical flow estimation and alignment. In addition, within the propagation structure we filter and accumulate facial information, and introduce a hidden state to propagate the effective information forward along the UP structure. FAPN outperforms existing state-of-the-art methods in PSNR (up to 0.31 dB), SSIM (up to 0.15 dB) and face recognition accuracy (up to 1.99%). Our main contributions can be summarized as follows:

1) We redesign the inter-frame alignment and propagation structures so that, after the neighborhood information coupling (NIC) module, the coupled inter-frame information can be propagated between frames, improving the accuracy of the model.
2) We propose an effective facial prior information filtering mechanism that retains correct and useful information and accumulates face information to enrich facial details.
3) Exploiting the prior information of the human face, we introduce a pixel loss on the facial organs and a loss on high-level face feature vectors to constrain the network training process.

2 Related Work

2.1 Video Super-Resolution

Alignment-based VSR methods mainly rely on motion compensation and deformable convolution. For motion compensation, VESPCN [9] consists of an alignment network and a fused spatio-temporal sub-pixel network, which can effectively exploit temporal redundancy but yields limited accuracy. EDVR [5] achieves frame alignment at the feature level in a coarse-to-fine fashion using deformable convolutions. Alignment-based methods usually utilize the image information of adjacent frames but cannot effectively use input information far from the current frame.

As for non-alignment-based approaches, DUF [6] generates dynamic upsampling filters and residual images based on the local spatio-temporal neighborhood of each pixel to avoid explicit motion compensation; this method is computationally intensive and introduces a memory burden. RLSP [8] introduces a recurrent structure in which information propagates through hidden states. BasicVSR [10] combines feature-level alignment with a bidirectional propagation mechanism and achieves a clear performance improvement. However, the bidirectional propagation mechanism needs to acquire the entire video sequence before VSR, which is difficult to apply in real-time VSR.

Face VSR aims to improve the resolution of facial regions. Xin et al. [11] propose a motion-adaptive feedback cell (MAFC) that filters out unimportant motions such as background motion, preserves normal rigid facial motion and the non-rigid motion of facial expressions, and feeds them back to the network. Yu et al. [3] optimize the network parameters through three search strategies (TPE, random search and SMAC) and propose a lightweight FVSR network, HO-FVSR.

Different from previous works, we propose an end-to-end FVSR framework (FAPN) that accumulates correct facial information. We demonstrate that this face information accumulation mechanism enables our framework to achieve state-of-the-art performance.

2.2 Facial Information Extraction and Recognition

Unlike general VSR, FVSR is mainly concerned with the authenticity of the generated face, with the goal of improving face recognition accuracy. In the field of face recognition, high-dimensional feature vectors are usually used to represent facial information, and the Euclidean distance between the feature vectors of two faces is computed: the smaller the distance, the higher the facial similarity. Representative methods in this field include OpenFace [12], Face recognition [13] and InsightFace [14]. The accuracies of Face recognition and InsightFace are 99.38% and 99.74% respectively, so they can be used for feature extraction and result evaluation.
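To make the comparison concrete, here is a minimal sketch of this distance-based verification step, assuming 128-dimensional embeddings such as those produced by the libraries above; the threshold value is illustrative and not taken from any particular library.

```python
import numpy as np

def face_distance(emb_a: np.ndarray, emb_b: np.ndarray) -> float:
    """Euclidean distance between two face embeddings; smaller means more similar."""
    return float(np.linalg.norm(emb_a - emb_b))

def is_same_identity(emb_a, emb_b, threshold=0.6):
    """Common convention: accept a match when the distance falls below a tuned threshold."""
    return face_distance(emb_a, emb_b) < threshold

# Toy example with random 128-D vectors standing in for network-produced embeddings.
rng = np.random.default_rng(0)
emb_hr, emb_gt = rng.normal(size=128), rng.normal(size=128)
print(face_distance(emb_hr, emb_gt), is_same_identity(emb_hr, emb_gt))
```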

3 Network Architecture

3.1 Framework of FAPN

VSR aims to map an LR video sequence $I^{LR} \in \mathbb{R}^{H \times W \times C}$ to an HR video sequence $I^{HR} \in \mathbb{R}^{kH \times kW \times C}$, where $k$ is the upsampling factor, $H$ and $W$ are the height and width, and $C$ is the number of channels. We propose a Face Alignment Propagation Network (FAPN) to filter and accumulate facial information. Our network structure is shown in Fig. 2.

Fig. 2. Overview of the proposed framework FAPN.

Our framework takes consecutive LR frames $I_t^{LR} \in \mathbb{R}^{H \times W \times C}$ $(t = 0, 1, 2, \ldots)$ as input. In the first stage, we design a neighborhood information coupling (NIC) module to couple the information of adjacent frames. In addition, in order to utilize previous SR results, feedback is introduced: the previous HR output $I_{t-1}^{HR}$ is also sent to the NIC module at time $t$. In the second stage, the coupled information is sent to a unidirectional propagation (UP) structure. By introducing the hidden state $h_t$ $(t = 0, 1, 2, \ldots)$, the UP structure can propagate information from the first frame to the current frame to supplement it. FSRC denotes the facial-information SR cell, which extracts facial prior information in the network and generates face video streams with rich details and high authenticity. At time $t$, the outputs of FSRC are the hidden state $h_t$ and the filtered facial features $I_t^{FSR}$; $I_t^{FSR}$ is shuffled up to obtain the final output. A sketch of this propagation loop is given below.
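The following PyTorch sketch illustrates the unidirectional propagation pattern described above. `DummyFSRC`, the 64-channel hidden state and the simple frame concatenation are placeholders of our own, not the paper's actual NIC/FSRC modules.

```python
import torch
import torch.nn as nn

class DummyFSRC(nn.Module):
    """Stand-in for the paper's FSRC: maps coupled frames plus the hidden state
    to a new hidden state and an upscaled output."""
    def __init__(self, channels=64, scale=4):
        super().__init__()
        self.fuse = nn.Conv2d(3 * 3 + channels, channels, 3, padding=1)
        self.to_hr = nn.Conv2d(channels, 3 * scale * scale, 3, padding=1)
        self.scale = scale

    def forward(self, coupled, h_prev):
        h = torch.relu(self.fuse(torch.cat([coupled, h_prev], dim=1)))
        sr = nn.functional.pixel_shuffle(self.to_hr(h), self.scale)  # "shuffle up" to HR
        return h, sr

def run_sequence(frames, cell, channels=64):
    """Unidirectional propagation: the hidden state is carried from frame to frame.
    The real NIC step would also warp and fuse the previous HR output; here we simply
    concatenate the previous, current and next LR frames."""
    b, t, c, hgt, wid = frames.shape
    h = frames.new_zeros(b, channels, hgt, wid)          # zero-initialised hidden state
    outputs = []
    for i in range(t):
        coupled = torch.cat([frames[:, max(i - 1, 0)],
                             frames[:, i],
                             frames[:, min(i + 1, t - 1)]], dim=1)
        h, sr = cell(coupled, h)
        outputs.append(sr)
    return torch.stack(outputs, dim=1)

video = torch.rand(1, 10, 3, 40, 40)                     # ten 40x40 LR frames
print(run_sequence(video, DummyFSRC()).shape)            # torch.Size([1, 10, 3, 160, 160])
```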

3.2 Neighborhood Information Coupling

Fig. 3. Architecture of our NIC module.

Our NIC module aims to add details to the current frame: the current frame is first coupled with the information of adjacent frames and the previous SR result, and the coupled information is then sent to the propagation structure. The NIC module is mainly composed of an optical flow estimation (OFE) module, an alignment module and a shuffling module (see Fig. 3). The role of OFE is to predict HR optical flow from LR frames; the predicted HR optical flow can be used to align the frames in the neighborhood with the current frame, which helps to reconstruct more accurate temporal information. In NIC, we adopt OFRnet [15] for optical flow estimation in the OFE module due to its simplicity and effectiveness:

$$OF_{t-1}^{HR} = OFE(I_{t-1}^{LR}, I_t^{LR}) \tag{1}$$

$$OF_{t+1}^{HR} = OFE(I_{t+1}^{LR}, I_t^{LR}) \tag{2}$$

where $OF_{t-1}^{HR}$ and $OF_{t+1}^{HR}$ denote the HR optical flow generated by the OFE module. Shuffling is used for space-to-depth conversion, which uses the scale factor $k$ to map between the LR and HR spaces. The operation is reversible, so it can also realize the inverse mapping from HR to LR. In previous work [8,16,17], shuffling has been used to change the spatial resolution of features.
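As an illustration of the two operations that the following equations rely on, the sketch below implements flow-based warping (the WP(·) operator) and shuffle up/down with standard PyTorch primitives. It is a generic rendering, not the authors' implementation; the flow convention (channel 0 = x-displacement, channel 1 = y-displacement) is an assumption.

```python
import torch
import torch.nn.functional as F

def warp(frame, flow):
    """Backward-warp `frame` (N,C,H,W) with a dense flow field (N,2,H,W) given in pixels."""
    n, _, h, w = frame.shape
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    grid = torch.stack((xs, ys), dim=-1).float().to(frame)       # (H,W,2) base coordinates
    coords = grid.unsqueeze(0) + flow.permute(0, 2, 3, 1)        # add per-pixel displacement
    coords[..., 0] = 2.0 * coords[..., 0] / (w - 1) - 1.0        # normalise to [-1, 1]
    coords[..., 1] = 2.0 * coords[..., 1] / (h - 1) - 1.0
    return F.grid_sample(frame, coords, mode="bilinear", align_corners=True)

frame = torch.rand(1, 3, 40, 40)
flow = torch.zeros(1, 2, 40, 40)          # zero flow: warping returns the input unchanged
assert torch.allclose(warp(frame, flow), frame, atol=1e-5)

# Shuffling trades spatial resolution against channel depth (space-to-depth and back).
hr = torch.rand(1, 3, 160, 160)
lr_cube = F.pixel_unshuffle(hr, 4)        # shuffle down: (1, 3, 160, 160) -> (1, 48, 40, 40)
print(F.pixel_shuffle(lr_cube, 4).shape)  # shuffle up restores (1, 3, 160, 160)
```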

$$s^{LR} \in \mathbb{R}^{H \times W \times C} \;\xrightarrow{\text{shuffle up}}\; s^{HR} \in \mathbb{R}^{kH \times kW \times C/k^2} \tag{3}$$

$$s^{HR} \in \mathbb{R}^{H \times W \times C} \;\xrightarrow{\text{shuffle down}}\; s^{LR} \in \mathbb{R}^{H/k \times W/k \times k^2 C} \tag{4}$$

For $I_{t+1}^{LR}$, we input it together with $I_t^{LR}$ into the OFE module to generate the HR optical flow $OF_{t+1}^{HR}$. After shuffling down, the LR flow cube $OF_{t+1}^{LR}$ is generated.

$$OF_{t+1}^{HR} \in \mathbb{R}^{H \times W \times C} \;\xrightarrow{\text{shuffle down}}\; OF_{t+1}^{LR} \in \mathbb{R}^{H/k \times W/k \times k^2 C} \tag{5}$$

$$\hat{I}_{t+1}^{LR} = WP(I_{t+1}^{LR}, OF_{t+1}^{LR}) \tag{6}$$

where $WP(\cdot)$ denotes the warping operation and $\hat{I}_{t+1}^{LR}$ is the aligned frame at time $t+1$.

For $I_{t-1}^{LR}$, we input it together with $I_t^{LR}$ into OFRnet to generate the HR optical flow $OF_{t-1}^{HR}$. Different from $I_{t+1}^{LR}$, $I_{t-1}^{LR}$ can be fed back to the NIC module for auxiliary information fusion, since the HR result $I_{t-1}^{HR}$ of the previous frame is already available. The output feedback improves the continuity between frames and makes the generated frames more stable; it effectively uses the information of the previous frames, which is equivalent to accumulating picture information over the video. Compared with $I_{t-1}^{LR}$, the HR frame $I_{t-1}^{HR}$ can provide more information to help generate $I_t^{HR}$. So we warp $I_{t-1}^{HR}$ directly using the HR optical flow $OF_{t-1}^{HR}$ to obtain $\hat{I}_{t-1}^{HR}$ and align it to the current frame. Finally, we shuffle down $\hat{I}_{t-1}^{HR}$ to ensure consistency with $\hat{I}_{t+1}^{LR}$ and $\hat{I}_t^{LR}$.

$$\hat{I}_{t-1}^{HR} = WP(I_{t-1}^{HR}, OF_{t-1}^{HR}) \tag{7}$$

$$\hat{I}_{t-1}^{HR} \in \mathbb{R}^{H \times W \times C} \;\xrightarrow{\text{shuffle down}}\; \hat{I}_{t-1}^{LR} \in \mathbb{R}^{H/k \times W/k \times k^2 C} \tag{8}$$

Finally, $\hat{I}_{t+1}^{LR}$, $\hat{I}_t^{LR}$ and $\hat{I}_{t-1}^{LR}$ are concatenated together to obtain the final output $I_t^{NIC}$ of the NIC module.

3.3 Face Super-Resolution Cell (FSRC)

The purpose of FSRC is to extract facial prior information and, together with the preceding NIC module and the UP structure, let facial features flow over time and be selectively accumulated.

FSRNet [18] proposed the method of combining image features and prior information to carry out FSR. Inspired by FSRNet, we designed the following network structure (see Fig. 4) to extract prior information of face. The network consists of FSR encoder, FSR decoder and prior information extraction (PIE) module. The input of the network is the video frames coupled with the neighborhood information and the historical hidden state. After the coupled video frame ItN IC is concatenated with the hidden state ht−1 , it is sent to the FSR encoder to extract image features. In another branch, the coupled video frames ItN IC are sent to the PIE module to extract facial prior information. The extracted prior information is concatenated with the image features generated by the encoder, and then sent to the FSR decoder to generate hidden state ht and the filtered facial features ItF SR .

Fig. 4. Architecture of our FSRC. ‘Cat’ denotes concatenation along the channel dimension, ‘RCAB’ denotes residual channel attention block. ‘HG’ denotes HourGlass structure, which uses a skip connection mechanism between symmetrical layers.

Encoder-Decoder Structure. Inspired by the success of ResNet [19] in SR, we use residual blocks for feature extraction; the network structure is shown in Fig. 4. The FSR encoder consists of convolutional layers, ReLU [20] layers and two residual blocks to extract the features of the coupled video frame $I_t^{NIC}$ and the hidden state. The FSR decoder consists of convolutional layers, ReLU layers and a residual block, and jointly utilizes the features and the prior information for face image restoration.

Prior Information Extraction (PIE). In many CNN-based methods [21,22], information is treated equally across all channels during feature extraction, which deprives the network of discriminative learning ability. RCAN [23] proposes a deep residual channel attention network to obtain better performance. Inspired by RCAN, we add a residual channel attention block (RCAB) to re-weight the distribution of the different channels; a sketch of such a block is given below.
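The paper does not spell out the exact RCAB configuration, so the following sketch simply follows the standard RCAN-style formulation of a residual block with channel attention; the channel and reduction sizes are assumptions.

```python
import torch
import torch.nn as nn

class RCAB(nn.Module):
    """Residual channel attention block in the spirit of RCAN: a small conv block whose
    output is re-weighted per channel before being added back to the input."""
    def __init__(self, channels=64, reduction=16):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
        )
        self.attention = nn.Sequential(               # squeeze-and-excitation style weights
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),
        )

    def forward(self, x):
        res = self.body(x)
        return x + res * self.attention(res)

print(RCAB()(torch.rand(1, 64, 40, 40)).shape)   # shape is preserved: (1, 64, 40, 40)
```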


In addition, inspired by the success of stacked heatmap regression in human pose estimation [24] and face image SR [18], we add an HourGlass (HG) structure [24] after the RCAB to effectively integrate cross-scale features and preserve spatial information at different scales. The network structure of PIE is shown in Fig. 4, consisting of convolutional layers, ReLU layers, an RCAB and an HG structure.

3.4 Loss Function

Adding constraints brings more prior information to the SR process and effectively constrains the distribution of solutions, so that more accurate results can be obtained. In FVSR, we can further use facial features to constrain the spatial distribution of solutions in a more precise way. Due to the particularity of the human face, the effective information is mainly concentrated in the facial organs, so adding corresponding loss terms during training achieves better results. In addition to the mean square error (MSE) loss $L_{MSE}$ between the SR frame and the ground-truth (GT) frame, we add a facial-organ loss $L_{face\_organ}$. For facial frames, MTCNN [25] is used to pre-calibrate the positions of the facial organs, and an additional MSE is computed between the HR frame and the GT frame over the corresponding facial-organ regions, where $\Phi_i$ $(i = 1, 2, 3, 4)$ denote the left eye, right eye, nose and mouth components respectively, $I_t^{SR}$ denotes the HR frame at time $t$ and $I_t^{H}$ denotes the GT frame.

$$L_{MSE} = \left\| I_t^{SR} - I_t^{H} \right\|_2^2 \tag{9}$$

$$L_{face\_organ} = \sum_{i=1}^{4} \left\| \Phi_i(I_t^{SR}) - \Phi_i(I_t^{H}) \right\|_2^2 \tag{10}$$

In addition to pixel-level differences, we can also add losses on high-level information such as image structure, texture and style. Following InsightFace [14], face feature extraction can be performed on a face image to generate a vector containing face identity information. The loss $L_{face\_vector}$ is constructed from the Euclidean distance between the feature vectors of the HR frame and the GT frame, where $\Theta$ denotes feature extraction with InsightFace. Trained in this way, more accurate recognition results can be obtained after FSR.

$$L_{face\_vector} = \left\| \Theta(I_t^{SR}) - \Theta(I_t^{H}) \right\|_2^2 \tag{11}$$

Based on the above analysis, we design three loss terms: the MSE loss between the HR frame and the GT frame $L_{MSE}$, the MSE loss over the facial-organ pixels $L_{face\_organ}$, and the loss on high-level face feature vectors $L_{face\_vector}$.

$$L = L_{MSE} + \lambda_1 L_{face\_organ} + \lambda_2 L_{face\_vector} \tag{12}$$
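A compact sketch of how the combined objective of Eq. (12) could be assembled is given below. It uses mean-squared forms, rectangular organ crops and a stand-in embedding network, so it matches the equations only up to normalization; the weights 0.05 and 0.01 are the values reported later in the implementation details, while the box coordinates and `embed` network are purely illustrative.

```python
import torch
import torch.nn.functional as F

def fapn_loss(sr, gt, organ_boxes, id_extractor, lam1=0.05, lam2=0.01):
    """Combined loss of Eq. (12): pixel MSE, facial-organ MSE and identity-feature distance.
    `organ_boxes` holds four (x1, y1, x2, y2) crops (eyes, nose, mouth) from a face detector,
    and `id_extractor` is any frozen face-embedding network standing in for InsightFace."""
    loss = F.mse_loss(sr, gt)                                    # L_MSE, Eq. (9)
    for x1, y1, x2, y2 in organ_boxes:                           # L_face_organ, Eq. (10)
        loss = loss + lam1 * F.mse_loss(sr[..., y1:y2, x1:x2], gt[..., y1:y2, x1:x2])
    with torch.no_grad():
        gt_id = id_extractor(gt)                                 # GT identity embedding
    loss = loss + lam2 * F.mse_loss(id_extractor(sr), gt_id)     # L_face_vector, Eq. (11)
    return loss

# Toy usage with a stand-in embedding network and made-up organ boxes.
embed = torch.nn.Sequential(torch.nn.AdaptiveAvgPool2d(4), torch.nn.Flatten())
sr, gt = torch.rand(2, 3, 160, 160), torch.rand(2, 3, 160, 160)
boxes = [(40, 50, 70, 80), (90, 50, 120, 80), (60, 80, 100, 110), (55, 110, 105, 140)]
print(fapn_loss(sr, gt, boxes, embed))
```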

4 Experiments

In this section, we first compare our framework with several existing VSR methods and then conduct ablation experiments to evaluate it.

4.1 Dataset

Due to the lack of a widely recognized FVSR dataset, we build a face video dataset from 300 Videos in the Wild (300-VW) and conduct experiments on it. The dataset was first used in an ICCV 2015 challenge and can be downloaded at ibug.doc.ic.ac.uk. Our dataset construction is as follows. First, the original videos are cut into sequences of consecutive frames, 32 frames per sequence, yielding 400 video sequences in total. The MTCNN [25] network is used to select and crop the facial area of each frame, which is then resized to 160×160 to obtain the HR frames. We then perform downsampling to generate the LR frames and obtain the final face video dataset. In this work we only consider the downsampling factor of 4, since it is the most challenging case. Finally, we divide the 400 generated video sequences into training, validation and testing sets (see Table 1).

Table 1. Datasets used in FVSR.

Face video dataset   Sequences   Frames
Training             340         10880
Validation           15          480
Testing              45          1440

4.2 Implementation Details

To train our network, we randomly select 10 consecutive frames from the 32 HR frames of each sequence. Due to the UP structure, the hidden state $h_{t-1}$ and the HR result $I_{t-1}^{HR}$ of the previous frame need to be initialized; both tensors are initialized with zeros. We implement our framework in PyTorch, set the batch size to 4 and the learning rate to $10^{-4}$, and set $\lambda_1$ and $\lambda_2$ to 0.05 and 0.01, respectively. All experiments are conducted on a PC with an Nvidia GTX 1080Ti GPU.

4.3 Comparisons to the State-of-the-Art

Quantitative Comparisons. We compare our proposed FAPN with state-of-the-art VSR methods, including VESPCN [9], FRVSR [16], SOF-VSR [15] and TecoGan [26].


Table 2. Quantitative comparisons. Best results are shown in boldface.

Metric          SOF-VSR [15]   FRVSR [16]   TecoGan [26]   VESPCN [9]   Ours
PSNR (dB)       30.75          25.25        24.85          30.00        31.06
SSIM            0.917          0.823        0.811          0.907        0.924
Face distance   0.201          0.276        0.252          0.228        0.197

Quantitative comparisons with the other state-of-the-art VSR methods are shown in Table 2. We conducted experiments on the 45 test sequences and measured PSNR, SSIM and Face distance, a metric that quantifies the difference between faces. Face recognition [13] is a concise and powerful face recognition library; evaluated on the Labeled Faces in the Wild dataset, it reaches an accuracy of 99.38%. We use it to compute the Face distance metric: it generates a high-dimensional feature vector for each face image, and the Euclidean distance between the corresponding feature vectors of the HR and GT frames quantifies their difference. The smaller the Face distance, the higher the facial similarity.

Fig. 5. Average PSNR values for the first 10 frames on the test sets. The average PSNR value of each frame is calculated as the mean over the corresponding frames of all test sequences.

The experimental results demonstrated that the proposed framework FAPN achieves better performance on PSNR, SSIM and Face distance compared with state-of-the-art methods. Specifically, the PSNR and SSIM values achieved by our framework are better than other methods by over 0.31 dB and 0.15 dB.


This is because we use NIC to supplement and fuse information, and the information provided to the UP structure is more abundant and accurate, therefore, more reliable spatial details and temporal consistency can be recovered well. For Face distance, we achieved an improvement of 1.99%, which indicates that the face generated by our network is closer to real face and has high reliability. It can be seen that the PSNR and SSIM values of the TecoGan [26] and FRVSR [16] networks are only about 25 dB and 0.8 dB, and the Face distance value exceeds 0.25. This is due to the non-natural generation of SR frames, with severe facial deformation or incorrect information generation. The specific test images will be displayed in the Qualitative comparisons. In addition, we plot the average PSNR for the first 10 frames on the test sets. As can be seen from Fig. 5, our network achieves the best PSNR values for each frame. Qualitative Comparisons. A qualitative comparison between our method and other SR methods [9,15,16,26] are shown in Fig. 6. There are three face images, each of which is reconstructed from ten consecutive frames. The results obtained by our network are closer to real frames than other methods, especially in facial features.

Fig. 6. Qualitative comparisons, where all the examples come from our test sets.

It can be seen that the visual effect represented by TecoGan [26] is better, but their PSNR, SSIM and Face distance metrics are poor. Although they generate richer details, they are generated incorrectly, resulting in a certain degree of distortion of human facial features. For example, the wrinkles on the forehead of the old man in the first row are falsely generated. In addition, the mouth of the little girl generated by TecoGan [26] and FRVSR [16] in the second row is deformed. These additionally generated textures and deformations will make the visual effect better to a certain extent, but will reduce the authenticity of the generated face. From the perspective of the actual application scenario of FVSR, it is expected to improve the accuracy of face recognition through superresolution. Therefore, it is necessary to include more details as much as possible


on the premise of ensuring the authenticity of the generated face. So compared with TecoGan [26] and FRVSR [16], our results are closer to the real situation. From the 32 frames of the test sets, we selected 6 consecutive frames for testing, and the results are shown in Fig. 7. Expanding horizontally in chronological order, it can be seen that for TecoGan [26] and FRVSR [16], the faces in columns 4 and 5 are severely distorted. This is because when the inter-frame motion is intense to a certain extent, inaccurate motion estimation and compensation not only cannot effectively utilize the inter-frame information, but also introduce error information and seriously interfere with SR results. Our network utilizes NIC module, which can not only effectively utilize the information of previous frame to supplement the information of the current frame, but also ensure the authenticity of the introduced information to avoid unnatural generation.

Fig. 7. Qualitative comparisons of 6 consecutive test frames.

Due to UP structure, our network relies on historical input frames. Initially, the information content available is minimal, and a certain number of input frames are required to accumulate information. This phenomenon can be seen in Fig. 7. In the first two frames, the texture on the forehead of the old man,


and the eyes show a certain degree of distortion, and the facial structure cannot be completely reconstructed. But as more information accumulates, the textures on the facial features and forehead gradually approach the real face. In addition, compared with SOF-VSR [15] and VESPCN [9], we slightly improve the visual effect and the metrics, introducing richer details while ensuring authenticity.

4.4 Ablation Study

Effects of NIC and FSR. We conduct three experiments to evaluate the effects of NIC and FSRC, respectively. Specifically, we remove the NIC and FSRC from our network; the remaining parts constitute the first network, named 'BasicNet v1'. The second network, named 'BasicNet v2', has the same structure as FAPN except for the NIC module. In this part we study the effects of these different networks. For a fair comparison, we train all models with otherwise identical implementation details.

Fig. 8. Qualitative results of ablation study.

For 'BasicNet v1', as Fig. 8 shows, the generated face is blurry and the left eye is distorted relative to the GT. For 'BasicNet v2', the face is clear, but distortions and false generation exist. This is because 'BasicNet v2' adds FSRC compared to 'BasicNet v1', which can exploit the high-level information of the face, but without NIC it may lead to the accumulation of wrong information. The results of FAPN are closest to the GT, achieving correct and sufficient information extraction. The results of the quantitative comparison are shown in Table 3. The PSNR, SSIM and Face distance of 'BasicNet v2' are the worst, which indicates that we should not only focus on visual effects but should also take authenticity as an important evaluation factor. Compared to 'BasicNet v1', our final network obtains a 0.68 dB improvement in PSNR, a 0.004 improvement in SSIM, and a 0.005 improvement in Face distance.

Effects of Loss Function. In this section, we adopt four training methods and evaluate the results of each for our FAPN network. 'MSE' stands for the MSE loss alone. 'MSE+GAN' refers to adding an adversarial network on the basis of MSE, forming an adversarial loss [27] with the help of the generator and the discriminator.


Table 3. Quantitative results of the ablation study. Best results are shown in boldface.

Metric          BasicNet v1   BasicNet v2   FAPN
PSNR (dB)       29.58         29.37         30.26
SSIM            0.918         0.823         0.922
Face distance   0.197         0.206         0.192

'MSE+VGG' refers to using a VGG network for high-level feature extraction of the face on top of the MSE loss, drawing on the powerful feature-extraction ability of VGGNet [28]; the loss then includes not only the MSE loss but also the mean-squared difference between the feature maps of the VGG extraction module at each tapped layer, in order to measure the similarity between high-level features of the images. 'MSE+FACE' refers to the loss function we constructed in Sect. 3.4. We omit the loss-weight tuning steps of each method; the results of each method under its best parameters are shown in Table 4.

Table 4. Ablation study on the effects of different loss terms. Best results are shown in boldface.

Metric          MSE     MSE+GAN   MSE+VGG   MSE+FACE
PSNR (dB)       30.88   30.92     30.50     31.00
SSIM            0.912   0.913     0.902     0.913
Face distance   0.215   0.206     0.214     0.203
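For reference, the perceptual term of the 'MSE+VGG' baseline can be sketched as follows; the choice of VGG-19 and of the tapped layers is ours, since the paper does not specify them.

```python
import torch
import torch.nn.functional as F
import torchvision

class VGGPerceptual(torch.nn.Module):
    """Feature-space MSE against a frozen VGG-19, as used by 'MSE+VGG'-style baselines.
    The tapped layers (relu1_2, relu2_2, relu3_4) are an assumption, not the paper's choice."""
    def __init__(self, layers=(3, 8, 17)):
        super().__init__()
        self.vgg = torchvision.models.vgg19(weights="IMAGENET1K_V1").features.eval()
        self.layers = set(layers)
        for p in self.vgg.parameters():
            p.requires_grad_(False)

    def forward(self, sr, gt):
        loss, x, y = 0.0, sr, gt
        for i, block in enumerate(self.vgg):
            x, y = block(x), block(y)
            if i in self.layers:
                loss = loss + F.mse_loss(x, y)
            if i >= max(self.layers):
                break
        return loss

criterion = VGGPerceptual()
print(criterion(torch.rand(1, 3, 160, 160), torch.rand(1, 3, 160, 160)))
```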

It can be seen that ‘MSE+FACE’ achieves the best performance on PSNR (up to 0.08 dB), SSIM (same as ‘MSE+GAN’) and face recognition accuracy (up to 1.46%), because it not only emphasizes the facial features, but also uses the Insightface [14] network to extract the high-level features of the face.

5 Conclusion

In this paper, we propose an end-to-end face alignment propagation network (FAPN) for face video super-resolution. Our NIC module first fuses the adjacent frames and the previous HR frame; the UP structure and FSRC then accumulate correct facial prior information. Extensive experiments have demonstrated that FAPN can recover facial details and improve the accuracy of face recognition while ensuring the authenticity of the generated face. Comparison with existing video SR methods shows that our framework achieves state-of-the-art performance on PSNR (up to 0.31 dB), SSIM (up to 0.15 dB) and face recognition accuracy (up to 1.99%).

References


Micro-expression Recognition Using a Shallow ConvLSTM-Based Network

Saurav Shukla, Prabodh Kant Rai, and Tanmay T. Verlekar

Department of CSIS, BITS Pilani, K.K. Birla Goa Campus, Goa 403726, India
{f20180653,f20180748,tanmayv}@goa.bits-pilani.ac.in
APPCAIR, BITS Pilani, K.K. Birla Goa Campus, Goa 403726, India

Abstract. Micro-expressions reflect people's genuine emotions, making their recognition of great interest to the research community. Most state-of-the-art methods focus on the use of spatial features to perform micro-expression recognition. Thus, they fail to capture the spatiotemporal information available in a video sequence. This paper proposes a shallow convolutional long short-term memory (ConvLSTM) based network to perform micro-expression recognition. The convolutional and recurrent structures within the ConvLSTM allow the network to effectively capture the spatiotemporal information available in a video sequence. To highlight its effectiveness, the proposed ConvLSTM-based network is evaluated on the SAMM dataset. It is trained to perform micro-expression recognition across three (positive, negative, and surprise) and five (happiness, other, anger, contempt, and surprise) classes. When compared with the state-of-the-art, the results report a significant improvement in accuracy and the F1 score. The proposal is also robust against the unbalanced class sizes of the SAMM dataset.

Keywords: Micro-expression recognition · Shallow neural networks · Convolutional LSTM

1 Introduction

Micro-expressions are twitches on people's faces that reveal their genuine emotions. They are spontaneous reactions to an external stimulus that people consciously suppress, resulting in voluntary and involuntary emotional responses co-occurring and conflicting with one another [1]. The fleeting and subtle nature of micro-expressions makes their detection and recognition a challenging task. Psychologists believe that 55% of people's emotional states can be conveyed through facial expressions [2]. However, most macro-expressions can be forced, making them unreliable. Micro-expressions, being difficult to suppress, hide, disguise, or conceal, are therefore the preferred choice for understanding people's emotional states. Thus, successful micro-expression recognition has several applications, ranging from criminal investigation to psychological and medical treatment.


Traditionally, professionals performed micro-expression recognition through visual inspection. The technique was largely ineffective, even with sufficient training, as most micro-expressions last less than 0.33 s [3]. With the development of high frame rate cameras, computer vision and machine learning, researchers have found some success in detecting and recognising micro-expressions. This paper proposes a novel strategy for recognising micro-expressions using a ConvLSTM-based network. The convolutional structures in both the input-to-state and state-to-state transitions of the ConvLSTM allow the network to capture complex information, which is not possible with just convolutional or recurrent neural networks (CNN, RNN). This structure, in turn, enables the construction of a shallow network for micro-expression recognition.

2 Literature Review

The biggest challenge faced by researchers in addressing the problem of micro-expression recognition is the availability of data. One of the first publicly available datasets, the Polikovsky dataset, was collected in 2009 [4]. The dataset contains 42 video sequences from 11 people captured at 200 fps. Apart from the limited number of video sequences, another drawback of the dataset is that the participants posed the six different micro-expressions. The simulation of micro-expressions was a common problem with most initial datasets. The spontaneous micro-expression corpus (SMIC) series dataset was the first to capture spontaneous micro-expressions. The dataset contains 77 sequences of just six people recorded using a 100 fps camera [5]. The micro-expressions were induced by making people watch video clips with emotional stimulation. The captured micro-expressions were labelled as positive, negative and surprise. Finer micro-expressions such as happiness, fear, disgust, and sadness, among others, were made available in the Chinese academy of sciences micro-expressions (CASME) series datasets [6]. The most recent dataset, the spontaneous actions and micro-movements (SAMM) series dataset, has significant improvements over the other existing datasets [7]. It involves the participation of 32 people to capture 159 video sequences comprising seven spontaneous micro-expressions using a 200 fps camera. The participants are widely distributed in terms of age, gender and ethnicity, making it the preferred dataset for evaluating micro-expression recognition techniques. Hence, this paper reports its results using the SAMM dataset. The initial work on micro-expressions involved detecting the time stamps indicating their beginning (onset), end (offset), and peak manifestation (apex). These were detected by analysing the difference between consecutive frames using techniques such as local binary patterns (LBP) [8] and histograms of oriented gradients [9]. Recently, local temporal patterns have been used to detect facial movement across frames. The specific signature associated with micro-expressions identified in those facial movements has led to the most promising results [10].

2.1 Traditional Micro-expression Recognition Techniques

Typically, once the onset and offset of a micro-expression are detected, the frames belonging to that time interval are used to recognise the micro-expression. Initial works on micro-expression recognition were evaluated on posed datasets. Because of the small size of the datasets, several hand-crafted techniques were developed. The most successful among them involved the use of LBP across the three orthogonal planes (LBP-TOP) of a stacked video sequence tensor [11]. The recognition was then performed using a support vector machine (SVM). The LBP-TOP was further improved by reducing the redundancy and sparsity in the feature descriptors. Redundancy was reduced by identifying six neighbourhood points associated with the three intersecting lines of the orthogonal planes, where the LBP could be applied [12]. The sparsity problem was addressed using components of each orthogonal plane, such as magnitude, sign and orientation, to generate compact feature descriptors [13]. Apart from LBP, optical flow was used to generate dynamic feature maps that captured the movements observed on the face during the manifestation of a micro-expression [14]. An SVM then used these dynamic feature maps to perform micro-expression recognition. Other operations were also performed over the optical flow to obtain better feature descriptors, such as computing its derivative [15]. In contrast to global operations, local operations, such as assigning weights to specific regions of the dynamic feature maps, contributed to an increase in micro-expression recognition accuracy [16].

2.2 Micro-expression Recognition Using Neural Networks

Recently, CNNs have been extensively used to address image- and video-based problems. With larger datasets being captured in recent years, CNNs have also been explored to address the problem of micro-expression recognition. In works such as [17], cropped apex frames are used as inputs to deep CNNs to classify them across six different micro-expression classes. Deep CNNs have also been used to detect facial landmarks, which in turn are used to generate optical flow-based dynamic feature maps [18]. Popular deep CNNs, such as ResNet trained on ImageNet, have also been repurposed to perform micro-expression recognition through fine-tuning [19]. These deep CNNs are altered by introducing attention units in residual blocks to capture subtle features associated with micro-expressions. Besides deep CNNs, shallow CNNs have also been explored to obtain lightweight networks that can effectively recognise micro-expressions. The shallow CNNs, in some cases, are truncated versions of deep CNNs such as AlexNet, with optical flow-based dynamic feature maps as their input [20,21]. They are used in a dual-stream configuration to capture discriminative features in salient facial areas. In other cases, novel architectures are proposed, such as shallow CNNs with 3D convolutions to capture temporal information across frames [22]. Temporal information can be effectively captured using RNNs. The idea is explored in works such as [23], where spatial features captured using VGG-16 are fed to an LSTM to perform micro-expression recognition.


The review suggests that, within the state-of-the-art, CNNs are the most effective in performing micro-expression recognition. However, most CNNs cannot capture the spatiotemporal information available in video sequences. A second drawback of the state-of-the-art is the use of deep CNNs through fine-tuning. These networks are usually proposed for significantly more complex problems, making them computationally expensive. Also, the features used for micro-expression recognition are usually more subtle than those of the other vision problems these networks address. Shallow CNNs have shown some promise, but their results have scope for improvement, especially on the SAMM dataset, which contains unbalanced micro-expression classes. This paper presents a shallow neural network that uses convolution in tandem with LSTM. The merging of convolution into the LSTM allows the network to capture spatiotemporal information more effectively than other neural networks. Being shallow enables the network to be trained end-to-end on small datasets. Thus, we hypothesise that using ConvLSTM in our proposed micro-expression recognition network allows it to perform better than other shallow CNNs and deep CNN+LSTM networks.

3 Proposal

Inspired by the shallow networks for micro-expression recognition discussed in Sect. 2.2, we propose a ConvLSTM-based network to effectively capture the spatiotemporal information available in video sequences. The entire architecture of the proposal is presented in Fig. 1. Its details are discussed in the following three sub-sections:

– Pre-processing: to prepare the video sequences for processing by the network;
– ConvLSTM: to capture spatiotemporal information available in the video sequence;
– Dense layers: to perform the final classification.

Fig. 1. The architecture of the proposed ConvLSTM-based network for micro-expression recognition.

3.1 Pre-processing

Since each micro-expression lasts for a variable length of time, and the micro-expressions themselves are captured using a high fps camera, we first downsample the video sequence to t frames. These t frames include the onset, offset and apex frames. The remaining frames are selected by uniformly sampling between the onset and offset frames. Thus, given a video sequence, downsampling ensures that the temporal redundancy across consecutive frames is reduced while still maintaining the facial movement information across frames. The frames are then resized to m × n and stacked together to create an m × n × t dimensional input to the ConvLSTM-based network. The values m, n, and t are set according to the resolution and fps of the camera used. In the case of the SAMM dataset, the values m, n, and t are set to 90, 90, and 9, respectively.
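To make this step concrete, a minimal pre-processing sketch is shown below. It assumes the frames of one sequence are already loaded as arrays and uses OpenCV for resizing; the function name and the use of cv2 are illustrative assumptions, not the authors' code.

```python
import numpy as np
import cv2  # assumed available for resizing


def preprocess_sequence(frames, onset, apex, offset, t=9, size=(90, 90)):
    """Downsample a micro-expression clip to t frames (onset, apex, offset
    plus uniformly spaced frames in between) and resize each frame."""
    # Uniformly spaced indices between onset and offset, then force the
    # onset, apex and offset frames to be included.
    idx = np.linspace(onset, offset, num=t).round().astype(int)
    idx[0], idx[-1] = onset, offset
    idx[np.argmin(np.abs(idx - apex))] = apex

    resized = [cv2.resize(frames[i], size) for i in idx]
    clip = np.stack(resized, axis=0).astype(np.float32) / 255.0  # (t, m, n)
    return clip[..., np.newaxis]  # (t, m, n, 1), ready for a ConvLSTM input
```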

3.2 ConvLSTM

The ConvLSTM proposed in [24] improves the capability of the LSTM by capturing spatiotemporal correlations. The LSTM, which forms the basis of the ConvLSTM, is an RNN structure capable of capturing long-term dependencies in sequential data. It achieves this by introducing a memory cell $C_t$ within the RNN. The information within the memory cell is regulated using four special neural network layers called gates. They interact with each other to decide whether the memory cell will be accessed, written or cleared during the processing of an input $X_t$. The memory cell $C_t$ maintains the long-term dependencies observed within the input $X_t$, thus addressing the vanishing gradient problem. A limitation of the LSTM is that it creates too much redundancy when operating on video sequences. The ConvLSTM deals with this issue by arranging the input as 3D tensors and processing it through convolution. The steps involved are as follows.

Given an input $X$ with dimensions $m \times n \times t$, the gate $i_t$ decides whether any new information will be added to the memory cell $C_t$ according to Eq. (1), where $H_t$ is the hidden state, $*$ is the convolution operator and $\circ$ is the Hadamard product.

$$i_t = \sigma\left(W_{xi} * X_t + W_{hi} * H_{t-1} + W_{ci} \circ C_{t-1} + b_i\right) \tag{1}$$

The decision to clear the past content of the memory cell $C_{t-1}$ is made by the gate $f_t$ following Eq. (2), while the actual update to the memory cell $C_t$ is performed according to Eq. (3).

$$f_t = \sigma\left(W_{xf} * X_t + W_{hf} * H_{t-1} + W_{cf} \circ C_{t-1} + b_f\right) \tag{2}$$

$$C_t = f_t \circ C_{t-1} + i_t \circ \tanh\left(W_{xc} * X_t + W_{hc} * H_{t-1} + b_c\right) \tag{3}$$

The gate $o_t$ decides whether the content of $C_t$ is accessible to the hidden state/output $H_t$ according to Eqs. (4) and (5).

$$o_t = \sigma\left(W_{xo} * X_t + W_{ho} * H_{t-1} + W_{co} \circ C_{t-1} + b_o\right) \tag{4}$$

$$H_t = o_t \circ \tanh\left(C_t\right) \tag{5}$$
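For readers who prefer code, Eqs. (1)–(5) translate directly into a single ConvLSTM update step. The sketch below uses TensorFlow ops; the weight containers W and b are hypothetical placeholders whose shapes are assumed to be consistent with the convolutions.

```python
import tensorflow as tf


def convlstm_step(x_t, h_prev, c_prev, W, b):
    """One ConvLSTM update following Eqs. (1)-(5).

    x_t, h_prev, c_prev: tensors of shape (batch, m, n, channels).
    W: dict of convolution kernels and peephole weights, b: dict of biases
    (hypothetical names used only for this sketch)."""
    conv = lambda inp, k: tf.nn.conv2d(inp, k, strides=1, padding="SAME")

    i_t = tf.sigmoid(conv(x_t, W["xi"]) + conv(h_prev, W["hi"]) + W["ci"] * c_prev + b["i"])  # Eq. (1)
    f_t = tf.sigmoid(conv(x_t, W["xf"]) + conv(h_prev, W["hf"]) + W["cf"] * c_prev + b["f"])  # Eq. (2)
    c_t = f_t * c_prev + i_t * tf.tanh(conv(x_t, W["xc"]) + conv(h_prev, W["hc"]) + b["c"])   # Eq. (3)
    o_t = tf.sigmoid(conv(x_t, W["xo"]) + conv(h_prev, W["ho"]) + W["co"] * c_prev + b["o"])  # Eq. (4)
    h_t = o_t * tf.tanh(c_t)                                                                  # Eq. (5)
    return h_t, c_t
```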

Thus, in the context of micro-expression recognition, the ConvLSTM can capture facial movement across frames using its convolution kernels. Since the frames maintain their 2D structure, the network can capture better spatiotemporal correlations. In this paper, to capture the spatiotemporal correlation across frames, we use a single layer of ConvLSTM. The number of kernels is set to 32, with a kernel size of 3 × 3 and strides of 1 × 1. The output from the ConvLSTM is downsampled using max-pooling with a kernel size of 2 × 2 and a stride of 1 × 1.

3.3 Dense Layers

The output of the max-pooling layer is flattened and passed on to the dense layers to perform recognition. In this proposal, we use two dense layers as the final two layers of the network. The first layer consists of 48 neurons with a sigmoid activation function. The number of neurons in the second layer is set according to the number of micro-expressions to be recognised by the network. Being the final layer, it has a softmax activation function. We also introduce a dropout of 30% between the two layers.
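Putting the three components together, the architecture of Fig. 1 can be reproduced with the off-the-shelf Keras ConvLSTM2D layer. The hyperparameters below mirror those stated in Sects. 3.2 and 3.3; the rest of the code is an illustrative reconstruction, not the authors' implementation.

```python
import tensorflow as tf
from tensorflow.keras import layers, models


def build_model(num_classes, t=9, m=90, n=90):
    """Shallow ConvLSTM network: one ConvLSTM layer, max-pooling,
    and two dense layers with 30% dropout in between."""
    model = models.Sequential([
        tf.keras.Input(shape=(t, m, n, 1)),
        layers.ConvLSTM2D(filters=32, kernel_size=(3, 3), strides=(1, 1)),
        layers.MaxPooling2D(pool_size=(2, 2), strides=(1, 1)),
        layers.Flatten(),
        layers.Dense(48, activation="sigmoid"),
        layers.Dropout(0.3),
        layers.Dense(num_classes, activation="softmax"),
    ])
    return model


model = build_model(num_classes=5)
model.summary()
```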

4 Evaluation Result

As discussed in Sect. 2, the state-of-the-art dataset available for evaluating the proposal is the SAMM dataset. The dataset contains a total of 159 video sequences. Each of those sequences belongs to one of the seven micro-expressions or the other class - see Fig. 2. However, the number of sequences belonging to each micro-expression class is not equal. The disparity is quite significant, with the anger class having 57 sequences while the sad class has just 6. To address this issue, the works discussed in Sect. 2 have introduced several evaluation protocols. We adopt a combination of the protocols presented in [20,21] and [22] to perform our evaluations.

4.1 Evaluation Protocol

Following the protocol in [20] and [21], we evaluate our proposal using two subsets of the SAMM dataset containing 3 and 5 classes, respectively. The evaluation is carried out using stratified k-fold cross-validation, where the value of k is set to 5. This protocol is an upgrade over the 80–20 split employed in [22]. We ensure that the five groups created by the k-fold cross-validation are mutually exclusive with respect to the participants. Also, the imbalanced class distribution is maintained in each group. The k-fold cross-validation is performed by iteratively selecting one group for validation while the remaining four groups are used for training. Once the validation is completed, the trained weights are discarded, and the process is repeated using the new groups. The performance is reported in terms of the accuracy and the F1 score. The F1 score is the harmonic mean of precision and recall, which is quite insightful when working with an unbalanced dataset.
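The participant-exclusive, stratified split described above can be approximated with scikit-learn's StratifiedGroupKFold (available from scikit-learn 1.0 onwards); the array names below are illustrative.

```python
import numpy as np
from sklearn.model_selection import StratifiedGroupKFold

# X: (num_sequences, t, m, n, 1) clips, y: class labels,
# subjects: participant ID for each sequence (illustrative file names).
X = np.load("clips.npy")
y = np.load("labels.npy")
subjects = np.load("subjects.npy")

cv = StratifiedGroupKFold(n_splits=5, shuffle=True, random_state=0)
for fold, (train_idx, val_idx) in enumerate(cv.split(X, y, groups=subjects)):
    # Folds are participant-exclusive while roughly preserving the
    # imbalanced class distribution in each group.
    print(f"fold {fold}: {len(train_idx)} train / {len(val_idx)} val sequences")
```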

Fig. 2. Statistics of the SAMM dataset.

The problem of data imbalance is also partly addressed using data augmentation. We introduce a horizontal flip, a change in brightness, and salt-and-pepper noise in classes with fewer sequences, as illustrated in Fig. 3. The choice of which augmentation to introduce is random, which prevents saturating a class with augmented data. We then train the model using the ADAM optimiser with a learning rate of $10^{-4}$. The learning rate is set to reduce on plateau with a factor of 0.5. The loss is set to categorical cross-entropy, the number of epochs is set to 50 with early stopping, and the batch size is set to 16.
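A hedged sketch of this training configuration using standard Keras callbacks follows; the early-stopping patience is an assumption, since the paper only states that early stopping is used.

```python
from tensorflow.keras.callbacks import EarlyStopping, ReduceLROnPlateau
from tensorflow.keras.optimizers import Adam

# X_train, y_train, X_val, y_val: arrays produced by one fold of the split
# sketched in Sect. 4.1 (illustrative names).
model.compile(optimizer=Adam(learning_rate=1e-4),
              loss="categorical_crossentropy",
              metrics=["accuracy"])

callbacks = [
    ReduceLROnPlateau(monitor="val_loss", factor=0.5),                       # reduce LR on plateau
    EarlyStopping(monitor="val_loss", patience=5, restore_best_weights=True),  # patience is assumed
]

history = model.fit(X_train, y_train,
                    validation_data=(X_val, y_val),
                    epochs=50, batch_size=16,
                    callbacks=callbacks)
```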

4.2 Result and Discussion

Table 1 reports the results of the three-class micro-expression recognition. The three classes are positive (happiness), negative (anger + contempt + disgust + fear + sadness), and surprise, containing 26, 92 and 15 sequences, respectively. Table 1 also compares the results of the proposal with the state-of-the-art. Similarly, Table 2 reports the results of micro-expression recognition across five classes. The five classes are happiness, other, anger, contempt, and surprise, containing 26, 26, 57, 12 and 15 sequences, respectively - see Fig. 2. The results in Table 1 report an improvement of 10.15% and 24.69% in accuracy and F1 score, respectively, over the state-of-the-art for the three-class problem. The improvement in the five-class problem is also significant, with a 10.29% and 15.25% increase in the accuracy and F1 score, respectively - see Table 2.


Fig. 3. Example of augmentation introduced in the data.

Table 1. Accuracy and F1 score of three-class micro-expression recognition on the SAMM dataset.

Method                      Accuracy  F1 score
Weighted optical flow [16]  58.30     39.70
Shallow CNN [21]            68.18     54.23
Proposed ConvLSTM           78.33     78.92

Table 2. Accuracy and F1 score of five-class micro-expression recognition on the SAMM dataset.

Method                   Accuracy  F1 score
LBP-TOP [11]             39.68     35.89
Truncated AlexNet [20]   57.35     46.44
CNN+LSTM [23]            50.00     52.44
Proposed ConvLSTM        67.64     67.69

The difference in performance can be attributed to the fact that most state-of-the-art methods emphasise capturing spatial information. The features across frames are aggregated using simple concatenation techniques. The lack of good spatiotemporal features leads to poor micro-expression recognition in [11,16,21] and [20]. Some state-of-the-art methods that consider temporal features, such as [23], employ a two-step approach of using CNNs followed by an LSTM, which improves the F1 score. However, our proposal beats the use of CNN+LSTM, achieving an accuracy of 67.64% and an improvement in F1 score of 15.25%. These results support our hypothesis that capturing spatiotemporal information within a video sequence using ConvLSTM is significantly more effective than using shallow CNNs, deep CNN+LSTM or hand-crafted techniques.

Some concerns about the results can arise due to the data imbalance in the dataset. Hence, to obtain better insights, we report the confusion matrix for the five-class micro-expression recognition results in Table 3. The results suggest that all classes contribute approximately evenly to the proposed network's accuracy. In Table 3, the surprise class has the highest accuracy of 75.93%, while the lowest accuracy is reported for the happiness class at 60.94%. This result is in sharp contrast to the Truncated AlexNet [20], where the anger class has an accuracy of 90%, while the happiness, contempt, surprise and other classes have accuracies of 35%, 42%, 33% and 35%, respectively. Here, the accuracy of the state-of-the-art network largely depends on the correct recognition of the class with the highest number of sequences. The reasonably uniform accuracies reported in Table 3 suggest that the proposal handles the data imbalance significantly better than the state-of-the-art. The uniformity in the accuracy is also present in the three-class micro-expression recognition problem, where the negative class containing 92 sequences and the surprise class containing 15 sequences report accuracies of 79.41% and 78.85%, respectively - see Table 4.

Table 3. Confusion matrix for the five-class micro-expression recognition problem

                           Prediction
Ground truth   Happiness  Contempt  Anger  Other  Surprise
Happiness      60.94      7.81      14.06  4.69   12.50
Contempt       9.21       64.48     13.16  5.26   7.89
Anger          8.70       2.17      65.22  17.39  6.52
Other          3.34       3.34      10.00  71.67  11.67
Surprise       9.26       5.56      9.26   0.00   75.93

Finally, we would like to comment on the other class. While anger and happiness are micro-expressions, the other class contains sequences that do not belong to any of the seven micro-expressions. Thus, a probable cause for its low accuracy in Table 3 can be the absence of similar features among the other class sequences. Therefore, we conducted another experiment using only the seven micro-expressions available in the SAMM dataset.


Table 4. Confusion matrix for the three-class micro-expression recognition problem

                        Prediction
Ground truth   Positive  Negative  Surprise
Positive       76.74     19.77     3.49
Negative       13.24     79.41     7.36
Surprise       2.88      18.27     78.85

Following the protocol discussed in Sect. 4.1, we achieved an accuracy of 72.64%. The increase in accuracy can be seen as support for our claim. However, larger datasets are needed to perform a more in-depth analysis.

5 Conclusion

In this paper, we address the problem of micro-expression recognition using a ConvLSTM-based network. The proposed shallow network can capture spatiotemporal information more effectively than other neural networks, such as CNNs and RNNs. We demonstrate the claim by comparing the proposal with the state-of-the-art, comprising hand-crafted, shallow CNN and fine-tuned deep CNN+LSTM networks. The proposed ConvLSTM-based network beats the state-of-the-art, thus emphasising its significance in capturing spatiotemporal information for micro-expression recognition. The proposed ConvLSTM-based network also reports reasonably uniform accuracies across its classes in the face of the unbalanced SAMM dataset. The state-of-the-art methods perform poorly in such scenarios, as their performance is skewed by the correct recognition of the class with the highest number of sequences. The biggest challenge we still face is the absence of a large, balanced dataset. Capturing a large dataset that allows us to explore deep ConvLSTM-based networks can be a possible future direction. Since the downsampling of the input video sequences is a part of our pre-processing, the proposal performs micro-expression recognition with fewer frames than the state-of-the-art. Thus, its evaluation using low fps videos will also be a part of our future work.

References

1. Svetieva, E., Frank, M.G.: Empathy, emotion dysregulation, and enhanced microexpression recognition ability. Motiv. Emot. 40, 309–320 (2016)
2. Pan, H., Xie, L., Wang, Z., Liu, B., Yang, M., Tao, J.: Review of micro-expression spotting and recognition in video sequences. Virtual Reality Intell. Hardw. 3(1), 1–17 (2021)
3. Yan, W.J., Wu, Q., Liang, J., Chen, Y.H., Fu, X.: How fast are the leaked facial expressions: the duration of micro-expressions. J. Nonverbal Behav. 37, 217–230 (2013)
4. Polikovsky, S., Kameda, Y., Ohta, Y.: Facial micro-expressions recognition using high speed camera and 3D-gradient descriptor. In: International Conference on Imaging for Crime Detection and Prevention. IET (2009)
5. Li, X., Pfister, T., Huang, X., Zhao, G., Pietikäinen, M.: A spontaneous micro-expression database: inducement, collection and baseline. In: 10th IEEE International Conference and Workshops on Automatic Face and Gesture Recognition (2013)
6. Yan, W.J., Wu, Q., Liu, Y.J., Wang, S.J., Fu, X.: CASME database: a dataset of spontaneous micro-expressions collected from neutralized faces. In: 10th IEEE International Conference and Workshops on Automatic Face and Gesture Recognition (FG) (2013)
7. Yap, C.H., Kendrick, C., Yap, M.H.: SAMM long videos: a spontaneous facial micro- and macro-expressions database. In: 15th IEEE International Conference on Automatic Face and Gesture Recognition (2020)
8. Moilanen, A., Zhao, G., Pietikäinen, M.: Spotting rapid facial movements from videos using appearance-based feature difference analysis. In: 22nd International Conference on Pattern Recognition (2014)
9. Davison, A.K., Yap, M.H., Lansley, C.: Micro-facial movement detection using individualised baselines and histogram-based descriptors. In: IEEE International Conference on Systems, Man, and Cybernetics (2015)
10. Li, J., Soladie, C., Seguier, R.: Local temporal pattern and data augmentation for micro-expression spotting. IEEE Trans. Affect. Comput. (2020)
11. Zhao, G., Pietikainen, M.: Dynamic texture recognition using local binary patterns with an application to facial expressions. IEEE Trans. Pattern Anal. Mach. Intell. 29, 915–928 (2007)
12. Wang, Y., See, J., Phan, R.C.W., Oh, Y.H.: LBP with six intersection points: reducing redundant information in LBP-TOP for micro-expression recognition. In: Asian Conference on Computer Vision (2014)
13. Huang, X., Zhao, G., Hong, X., Zheng, W., Pietikäinen, M.: Spontaneous facial micro-expression analysis using spatiotemporal completed local quantized patterns. Neurocomputing 175, 564–578 (2016)
14. Xu, F., Zhang, J., Wang, J.Z.: Microexpression identification and categorization using a facial dynamics map. IEEE Trans. Affect. Comput. 8, 254–267 (2017)
15. Liong, S.T., Phan, R.C.W., See, J., Oh, Y.H., Wong, K.: Optical strain based recognition of subtle emotions. In: International Symposium on Intelligent Signal Processing and Communication Systems (ISPACS) (2014)
16. Liong, S.T., See, J., Wong, K., Phan, R.C.W.: Less is more: micro-expression recognition from video using apex frame. Signal Process. Image Commun. 62, 82–92 (2018)
17. Takalkar, M.A., Xu, M.: Image based facial micro-expression recognition using deep learning on small datasets. In: International Conference on Digital Image Computing: Techniques and Applications (DICTA) (2017)
18. Li, Q., Yu, J., Kurihara, T., Zhan, S.: Micro-expression analysis by fusing deep convolutional neural network and optical flow. In: 5th International Conference on Control, Decision and Information Technologies (CoDIT) (2018)
19. Wang, C., Peng, M., Bi, T., Chen, T.: Micro-attention for micro-expression recognition. Neurocomputing 410, 354–362 (2020)
20. Khor, H.Q., See, J., Liong, S.T., Phan, R.C., Lin, W.: Dual-stream shallow networks for facial micro-expression recognition. In: IEEE International Conference on Image Processing (ICIP) (2019)
21. Gan, Y.S., Liong, S.T., Yau, W.C., Huang, Y.C., Tan, L.K.: OFF-ApexNet on micro-expression recognition system. Signal Process. Image Commun. 74, 129–139 (2019)
22. Reddy, S.P.T., Karri, S.T., Dubey, S.R., Mukherjee, S.: Spontaneous facial micro-expression recognition using 3D spatiotemporal convolutional neural networks. In: International Joint Conference on Neural Networks (IJCNN) (2019)
23. Khor, H.Q., See, J., Phan, R.C.W., Lin, W.: Enriched long-term recurrent convolutional network for facial micro-expression recognition. In: 13th IEEE International Conference on Automatic Face and Gesture Recognition (2018)
24. Shi, X., Chen, Z., Wang, H., Yeung, D.Y., Wong, W.K., Woo, W.C.: Convolutional LSTM network: a machine learning approach for precipitation nowcasting. In: Advances in Neural Information Processing Systems, vol. 28 (2015)

Adversarial Machine Learning Towards Advanced Vision Systems

ADVFilter: Adversarial Example Generated by Perturbing Optical Path

Lili Zhang and Xiaodong Wang

College of Computer, National University of Defense Technology, Changsha, China
{zhanglili18,xdwang}@nudt.edu.cn

Abstract. Deep Neural Networks (DNNs) have achieved great success in many applications, and they are taking over more and more systems in the real world. As a result, the security of DNN systems has attracted great attention from the community. In typical scenes, the input images of a DNN are collected through a camera. In this paper, we propose a new type of security threat, which attacks a DNN classifier by perturbing the optical path of the camera input through a specially designed filter. Generating such a filter involves many challenges. First, the filter should be input-free. Second, the filter should be simple enough for manufacturing. We propose a framework to generate such filters, called ADVFilter. ADVFilter models the optical path perturbation by thin plate spline, and optimizes for the minimal distortion of the input images. ADVFilter can generate an adversarial pattern for a specific class. This adversarial pattern is universal for the class, which means that it can mislead the DNN model on all input images of the class with high probability. We demonstrate our idea on the MNIST dataset, and the results show that ADVFilter can achieve up to a 90% success rate with only 16 corresponding points. To the best of our knowledge, this is the first work to propose such a security threat for DNN models.

Keywords: Adversarial example · Deep neural networks · Security threat · Physical attack

1 Introduction

Deep Neural Networks (DNNs) [1] have achieved tremendous success in many real-world applications. DNN models are taking over the control of more and more systems that were traditionally considered operable only by humans, such as automatic pilot systems. Therefore, the security [2] of DNNs is directly related to the safety of the physical world, including the safety of human life and property. The community has conducted extensive research on DNN security [2]. An important direction of DNN security is the adversarial example attack [3,4]. A deliberate perturbation of the input data can mislead the DNN model to incorrect predictions. For computer vision applications, the DNN model usually uses a camera to obtain the input image.


Surprisingly, most cameras are physically accessible to malicious users, and it is difficult to detect an abnormality even when a filter is installed or refitted on the camera. Traditional security research only focuses on the electronic part of the DNN system, while this paper points out that the optical part can also trigger adversarial attacks. Compared with the traditional adversarial example attack, an adversarial filter must overcome more difficulties. First, a traditional adversarial example usually handles a specific input image, while an adversarial filter is input-independent. Clearly, we cannot preset the input image of the camera, so the adversarial filter should mislead the DNN model on the whole data distribution of a certain class. Second, a traditional adversarial example can apply pixel-level modification, while an adversarial filter should be simple enough for manufacturing. From the adversarial example point of view, this means that the adversarial filter has far fewer dimensions to operate in than a pixel-level adversarial example. Consequently, the difficulty of generating an adversarial filter is much higher.

To overcome these difficulties, we propose a novel framework to generate adversarial filters, called ADVFilter. ADVFilter models the optical path perturbation by thin plate spline (TPS). We use as few corresponding points as possible in the TPS method to control the complexity of the result, and optimize for the minimal distortion of the input images. For a given DNN model, ADVFilter can generate an adversarial pattern for a specific class. This pattern acts as a virtual filter that perturbs all input images of the DNN model. The adversarial pattern is universal for the selected class, which means that it can mislead the DNN model on all input images of the class with high probability. We demonstrate our idea on the MNIST dataset, and the results show that ADVFilter can achieve up to a 90% success rate with only 16 corresponding points. To the best of our knowledge, this is the first work to propose such a security threat for DNN models.

To summarize, we list our contributions as follows:

1. We propose a security threat for DNN models created by perturbing the optical path, and demonstrate that this threat can lead to serious consequences.
2. We propose a novel framework to generate adversarial filters, named ADVFilter. ADVFilter can generate universal adversarial filters, which attack a DNN model in an input-independent way. Moreover, ADVFilter adopts as few corresponding points as possible to simplify the result.
3. We conduct experiments to verify the idea on the MNIST dataset. The results show that ADVFilter can achieve up to a 90% success rate with only 16 corresponding points, which constitutes a space with only 64 dimensions.

The rest of this paper is organized as follows. We introduce the related works in Sect. 2. We overview the design of ADVFilter in Sect. 3. Next, we present the details of ADVFilter in Sect. 4 and show the experiment results in Sect. 5. Finally, we conclude this work in Sect. 6.

2 Related Work

The research on the security of neural networks has attracted significant effort [3–7]. Early work focused on generating adversarial examples in different ways to explore the boundary of DNN security. Another research direction is to attack black-box models in an efficient way. This section mainly discusses the research closely related to our work.

Some papers [8–12] propose to generate universal adversarial examples that are image-agnostic. The adversarial patterns generated by these methods can be applied to different images; these patterns can convert a given image into an adversarial example. These methods only consider the color space of the pixels, so they cannot solve the problem of adversarial filters. Compared with these works, ADVFilter only adopts several corresponding points to generate a universal adversarial pattern. The dimension of the optimization space is much smaller than that of traditional methods, so our problem is correspondingly more difficult.

Some researchers explore image transformation to generate adversarial examples [13–15]. These methods generate adversarial examples by rotations and translations. They only obtain the adversarial ability on a single target image, which cannot be generalized to our scenario. In contrast, ADVFilter generates a perturbation pattern that can be applied to any image of a certain class. The problem of ADVFilter is more general and therefore more difficult to solve.

Some studies break the boundary between the physical and cyber spaces, so as to generate adversarial examples in the physical world. An early attempt simply prints adversarial examples and fools a DNN classifier through camera input [16]. Study [17] also shows that deliberately constructed glasses can mislead face recognition systems. Besides, printable adversarial patches can steer the model prediction to a predefined target class [18]. Moreover, some studies [19–22] explore robust adversarial examples under a predefined distribution. They show the ability to fool a DNN model under continuous viewpoints in the three-dimensional physical world. Adv t-shirt [23] generates a t-shirt with a specially designed pattern that can evade person detectors. This work adopts TPS to model the non-rigid surface of the t-shirt, showing the potential of TPS-based techniques to generate physical adversarial examples. Compared with this study, ADVFilter deals with more general cases, as the filter should fit all potential inputs of the target model.

In a word, perturbing the optical path is a new type of DNN attack, which cannot be realized by simply extending existing techniques.

3 ADVFilter Overview

This section introduces the design of ADVFilter and shows the architecture of the framework. We show the architecture of ADVFilter in Fig. 1. In the figure, the part surrounded by dotted lines is a standard DNN prediction pipeline. The model collects images through an input camera and outputs the prediction results. In order to simplify the algorithm description, we only discuss the case of a white-box attack in this section; we will discuss how to extend ADVFilter to black-box attacks in the following section.

Fig. 1. The architecture of ADVFilter

In order to evaluate the effect of a malicious filter on the camera, we design the virtual filter module, which is the core of the whole system. After adding the virtual filter, all input images of the DNN model will be universally perturbed by the virtual filter. In the figure, the clean images are the images of the physical world, and the perturbed images are the images perturbed by the virtual filter. Under this configuration, the DNN model can only get the perturbed images as its input.

We model the effect of the adversarial filter by TPS. TPS is a physical analogy involving the bending of a thin sheet of metal, which is widely used as the non-rigid transformation model in image alignment and shape matching. Adopting TPS for adversarial filter generation has several advantages. First, TPS produces a smooth surface, which is consistent with the characteristics of an optical path change. Second, TPS uses several discrete corresponding points to control the entire transformation over the whole image, which facilitates the manufacturing of the adversarial filter. Third, TPS has closed-form solutions for both warping and parameter estimation, which can be integrated into current deep learning frameworks. The mathematical details of the virtual filter module are discussed in the next section.

From the view of the learning architecture, the goal of generating an adversarial filter is to obtain an appropriate set of parameters for the virtual filter module, i.e. the parameters of the TPS transformation. The goal of the optimization is twofold. On the one hand, the result should mislead the target DNN model, so the loss function contains the prediction result of the DNN model. On the other hand, the overall distortion of the image should be as low as possible, so we add several regularization terms to the loss function to limit the deformation of the images. Finally, the parameters of TPS are optimized using gradient information. The final loss function of ADVFilter consists of three parts:

$$Loss = \lambda_1 L_{CrossEntropy} + \lambda_2 L_{radius} + \lambda_3 L_{distortion} \tag{1}$$

In the loss function, $L_{CrossEntropy}$ is a standard cross entropy loss to control the output label, expressed as:

$$L_{CrossEntropy} = -\sum_i y_i \log(p_i) \tag{2}$$

where $y_i$ denotes each element of the one-hot vector of the target class, and $p_i$ denotes each element of the predicted probability vector of the target model. The cross entropy of a batch is the mean of the cross entropy losses of the samples in the batch. Besides, $L_{radius}$ and $L_{distortion}$ are two regularization terms to control the visibility of the adversarial filter, and $\lambda_1$, $\lambda_2$ and $\lambda_3$ are hyperparameters to adjust the trade-off between attack ability and distortion. These regularization terms are closely related to the TPS implementation; we leave the detailed design of this part to the following section.
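As a rough illustration of Eqs. (1) and (2), the combined objective could be written as follows; the weighting values and tensor names are placeholders rather than the paper's exact settings, and the two regularizers anticipate Eqs. (5) and (6) of Sect. 4.

```python
import tensorflow as tf


def advfilter_loss(logits, target_onehot, r, x_clean, x_warped,
                   lam1=1.0, lam2=0.1, lam3=0.1):
    """Total objective of Eq. (1): cross entropy (Eq. (2)) plus the radius
    and distortion regularizers; lam1-lam3 are hypothetical weights."""
    probs = tf.nn.softmax(logits)
    ce = -tf.reduce_mean(
        tf.reduce_sum(target_onehot * tf.math.log(probs + 1e-12), axis=-1))
    l_radius = tf.sqrt(tf.reduce_sum(tf.square(r)))                       # Eq. (5), Sect. 4
    l_distortion = tf.sqrt(tf.reduce_sum(tf.square(x_clean - x_warped)))  # Eq. (6), Sect. 4
    return lam1 * ce + lam2 * l_radius + lam3 * l_distortion
```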

4 ADVFilter Design

In this section, we formally define the problem of generating adversarial filters and show the mathematical details of the ADVFilter design. We also adopt the white-box attack model for clarity and discuss the black-box case at the end of this section.

We first introduce the formal definition of adversarial examples. Let the sets X and Y denote the possible inputs and outputs of the target model, respectively. Given a particular input x ∈ X and a target class y ∈ Y, we can obtain the model prediction P(y | x), as well as the gradient ∇x P(y | x). We use a function M to denote the predicted result with maximum probability. For example, M(x) = y means that label y is the most likely result under input x. A traditional adversarial example is an input x′ within the ε-ball of the initial input x, i.e. ‖x′ − x‖ < ε, that is classified as another class yt, i.e. M(x′) = yt and yt ≠ y.

The definition of an adversarial filter is over the whole input set rather than a single image. An adversarial filter is defined as a transformation F with parameter θ on the whole input set X. For any input x ∈ X, the result of the transformation F(x; θ) is misclassified by the target model, i.e. M(F(x; θ)) = yt and yt ≠ y. Hence, generating an adversarial filter is to find an appropriate transformation F(x; θ) and optimize the parameter θ.

ADVFilter adopts TPS as the transformation function. In the two-dimensional case, the TPS fits a mapping function between two corresponding point-sets {pi} and {p′i}, matching them one by one. The result is to minimize the following energy function:

$$E(F) = \sum_{i=1}^{K} \left\| p_i' - F(p_i) \right\|^2 \tag{3}$$

where K is the number of corresponding points. We do not consider the smoothness variant of standard TPS, so that the mapping results coincide exactly with the target corresponding points. Given the corresponding point-sets {pi} and {p′i}, the TPS has a closed-form solution for the optimal mapping.
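The closed-form TPS fit mentioned here is standard; for illustration, a generic NumPy solve on the corresponding points is sketched below (the paper itself relies on an existing TensorFlow TPS implementation, so this is not the authors' code).

```python
import numpy as np


def fit_tps(p_src, p_dst):
    """Closed-form TPS fit mapping source points p_src (K, 2) onto p_dst (K, 2).
    Uses the standard kernel U(r) = r^2 log(r^2) and no smoothness term;
    returns the radial weights w and the affine coefficients a."""
    K = p_src.shape[0]
    d2 = np.sum((p_src[:, None, :] - p_src[None, :, :]) ** 2, axis=-1)
    U = np.where(d2 > 0, d2 * np.log(d2 + 1e-12), 0.0)       # kernel matrix
    P = np.hstack([np.ones((K, 1)), p_src])                   # affine part [1, x, y]
    A = np.block([[U, P], [P.T, np.zeros((3, 3))]])           # (K+3, K+3) system
    b = np.vstack([p_dst, np.zeros((3, 2))])
    coeffs = np.linalg.solve(A, b)                            # (K+3, 2)
    return coeffs[:K], coeffs[K:]                             # w (K, 2), a (3, 2)
```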


Clearly, the parameter space for standard TPS is θ = {pi; p′i}. To simplify the expression in a deep learning framework, we convert the TPS parameters into an equivalent form. We preserve the source corresponding point-set {pi} and rewrite the target corresponding point-set {p′i} in a relative form. That is,

$$p_i' = p_i + r_i R(\alpha_i) \tag{4}$$

where $r_i$ denotes the distance between the two corresponding points and $R(\alpha) = \begin{pmatrix} \cos(\alpha) \\ \sin(\alpha) \end{pmatrix}$ encodes the relative angle between the two corresponding points. The parameter space of the problem is θ = {pi, ri, αi}. After the transformation, the radius regularization term of the loss function is as follows:

$$L_{radius} = \left( \sum_i r_i^2 \right)^{1/2} \tag{5}$$

Moreover, we also set an upper bound value $r_{max}$ for all radius values and clip the maximum absolute value of each $r_i$ to $r_{max}$. The distortion regularization term is defined as the distance between the original image and the transformed image as follows:

$$L_{distortion} = \left( \sum_i \left( x_i - F(x_i; \theta) \right)^2 \right)^{1/2} \tag{6}$$

Note that all TPS transformations share the same parameter θ, which is the most significant difference between this work and existing research. The parameter space of the optimization is θ = {pi, ri, αi}, where pi denotes the locations of the source corresponding points, and ri and αi denote the distances and relative angles between the corresponding points. We adopt the Adam optimizer with a learning rate of 0.1 to search for the optimal solution.
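Equation (4) and the clipping to r_max translate into a few tensor operations; the following sketch assumes the TPS warp itself is provided elsewhere and uses illustrative variable names.

```python
import tensorflow as tf


def target_points(p_src, r, alpha, r_max=0.05):
    """Eq. (4): p_i' = p_i + r_i * [cos(alpha_i), sin(alpha_i)],
    with each r_i clipped to [-r_max, r_max]."""
    r = tf.clip_by_value(r, -r_max, r_max)                     # radius clipping
    offset = tf.stack([r * tf.cos(alpha), r * tf.sin(alpha)], axis=-1)
    return p_src + offset, r


def regularizers(r, x_clean, x_warped):
    """Eq. (5) and Eq. (6): radius and distortion penalties."""
    l_radius = tf.sqrt(tf.reduce_sum(tf.square(r)))
    l_distortion = tf.sqrt(tf.reduce_sum(tf.square(x_clean - x_warped)))
    return l_radius, l_distortion
```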

Fig. 2. Initial distribution of the corresponding points

The initialization of the corresponding points {pi} is in grid form. Let n = k², k ∈ N, denote the total number of corresponding points. We set n to be a square number for simplicity. Suppose the size of the input image is h × w; then the distribution of all k² points is a two-dimensional grid with an interval of h/(k+1) × w/(k+1), as shown in Fig. 2. This design ensures that the TPS transformation can capture the vulnerability of any position of the input image.
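A minimal sketch of this grid initialization in NumPy (purely illustrative; 28 × 28 matches the MNIST images used in the experiments):

```python
import numpy as np


def init_grid_points(h, w, k):
    """Place k*k source corresponding points on a regular grid with an
    interval of h/(k+1) x w/(k+1), skipping the image border."""
    ys = np.arange(1, k + 1) * h / (k + 1)
    xs = np.arange(1, k + 1) * w / (k + 1)
    grid = np.stack(np.meshgrid(ys, xs, indexing="ij"), axis=-1)  # (k, k, 2)
    return grid.reshape(-1, 2)                                    # (k*k, 2)


p_src = init_grid_points(h=28, w=28, k=4)  # 16 points for 28x28 MNIST images
```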


For the black-box attack case, ADVFilter can leverage off-the-shelf gradient estimation schemes, such as zeroth-order optimization [7] or natural evolution strategies [24]. The cost of these algorithms depends heavily on the number of dimensions of the entire search space. Note that the search space of TPS is very low-dimensional. For example, if we adopt 16 corresponding points, the dimension of the whole search space is only 16 * 4 = 64, which is far less than the dimension of the pixel color space. As a result, the overhead of gradient estimation is correspondingly low.
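For the black-box case, a natural-evolution-strategies estimate of the gradient over the 64-dimensional TPS parameter vector might look like the sketch below; loss_fn is a hypothetical black-box query returning the scalar loss of Eq. (1), and the sample count and smoothing parameter are assumptions.

```python
import numpy as np


def nes_gradient(loss_fn, theta, sigma=0.01, num_samples=50):
    """Antithetic NES estimate of d(loss)/d(theta) using only loss queries.
    theta: flat parameter vector, e.g. 64 values for 16 corresponding points."""
    grad = np.zeros_like(theta)
    for _ in range(num_samples):
        u = np.random.randn(*theta.shape)
        grad += (loss_fn(theta + sigma * u) - loss_fn(theta - sigma * u)) * u
    return grad / (2.0 * sigma * num_samples)
```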

5 Experiments

We conduct several experiments to verify the correctness and the efficiency of ADVFilter.

5.1 Experiment Setup

We conduct all the experiments on a laptop computer with an Intel Core i7-8550U and 16 GB of RAM. The deep learning software platform is TensorFlow 1.15.4. The target dataset is the MNIST database, which is a collection of handwritten digits. In the process of generating the adversarial pattern, we record all the local optimal solutions. The criteria include two aspects: the success rate of the attack and the average ℓ2 distance between the clean images and the adversarial images in the tested batch. Based on this vector criterion, we record all results with a lower average ℓ2 distance or a higher success rate. To facilitate comparison, we fix the average ℓ2 distance to a predefined value and compare the success rate under different settings. For all the following tests, the batch size is 100, which means we randomly sample 100 images from the dataset. In each iteration, we clip the distances between all corresponding points to 0.05. That is, we set r_max = 0.05 to control the distortion of the adversarial images.
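The two recording criteria can be computed with a small helper like the following (an illustrative sketch, not the authors' evaluation script):

```python
import numpy as np


def evaluate_batch(model_predict, x_clean, x_adv, true_labels):
    """Untargeted attack success rate and average l2 distortion over a batch
    of images shaped (batch, 28, 28, 1)."""
    preds = np.argmax(model_predict(x_adv), axis=-1)
    success_rate = np.mean(preds != true_labels)
    avg_l2 = np.mean(np.sqrt(np.sum((x_adv - x_clean) ** 2, axis=(1, 2, 3))))
    return success_rate, avg_l2
```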

5.2 Correctness Test

This section demonstrates how an adversarial filter attacks the DNN model. We conduct an untargeted attack with all images labeled "9" in the dataset. The number of corresponding points is 9, arranged initially in a 3 × 3 grid. Figure 3 shows the experiment results. The first row shows the clean images, which are the input of the adversarial filter. The second row shows the adversarial images, which are the output of the adversarial filter. The colored dots are the corresponding points of the TPS conversion; the same color indicates the corresponding relationship of points. After several iterations, all corresponding points have moved far away from their initial grid positions to sensitive positions that change the features of the target images. Note that all the adversarial images follow the same conversion pattern, which is the major difference between our work and existing ones. Comparing the clean images with the adversarial images, we found that most areas of the images had only a small amount of modification, and only the lower right part is deflected sharply to the right. Through visual judgment, it is difficult to distinguish between the clean images and the adversarial images. Therefore, we would still label these adversarial images as "9". In contrast, these adversarial images are all misclassified by the DNN classifier.

Fig. 3. The clean images and the adversarial images in the MNIST dataset

5.3 Comparison with SOTA Universal Adversarial Examples

This section compares our ADVFilter with existing universal adversarial examples [8–11]. We perform the experiment with the same configuration as in Sect. 5.2. That is, we use the same DNN model, the same dataset, and the same source image set. Under this configuration, we generate a pixel-level universal adversarial example (PUAE), as shown in Fig. 4. The perturbation of all adversarial examples follows the same pattern. After adding the perturbation, we clip the color value of each pixel to the legal range.

Fig. 4. The adversarial examples generated by perturbing color space


It is difficult to make a quantitative comparison between PUAE and ADVFilter, because there are inherent differences in their technical roadmaps. We first discuss the similarities between PUAE and ADVFilter. First, both PUAE and ADVFilter generate a universal adversarial pattern, which is independent of the specific input image. Second, they both achieve a relatively high attack success rate, i.e. 90% in our experiment. Finally, the adversarial examples generated by both methods are not easily perceptible to humans, although the invisibility is defined in completely different ways. Next, we discuss the differences between PUAE and ADVFilter. First of all, the number of degrees of freedom for manipulating the color space is much larger than that of TPS. As a result, the problem of generating an adversarial filter is much more difficult. Second, the result of ADVFilter only changes the position distribution of pixels, so the distribution of the color space is similar to that of the clean image. In contrast, PUAE significantly changes the color distribution of the image. Last, in terms of physical manufacturing, ADVFilter does not need pixel-level alignment and can thus tolerate higher manufacturing errors. In contrast, PUAE requires pixel-level alignment accuracy, which is more challenging.

5.4 The Number of Corresponding Points

This section discusses the impact of the number of corresponding points on the success rate. We conduct an untargeted attack with all images labeled "9" in the dataset. We repeat the experiments with different numbers of corresponding points and record the success rate under the different settings, as shown in Fig. 5.

Fig. 5. The success rate under various numbers of corresponding points

In order to evenly distribute all corresponding points in the image, we set the number of corresponding points to be a square number. The test range covers from three squared to ten squared, i.e. from 9 to 100. We conclude from the figure that the success rate increases approximately monotonically with the number of corresponding points in the first half. However, there is an upper limit to this increasing trend. After reaching a specific value, i.e. 36 in this test, the success rate remains stable. This experiment shows that universal adversarial examples can be generated in a very low-dimensional subspace. For example, in this experiment, we use only 16 corresponding points to achieve a success rate of about 90%. Even if we ignore the restrictions between the dimensions of the corresponding points, this means that our estimated dimension is higher than the actual one. The total dimension of this subspace is only 16 * 4 = 64.

6 Conclusion and Future Work

In this paper, we propose a new type of threat for DNN systems, which is to generate adversarial examples by perturbing the optical path. One possible way to apply this idea is to add a specially designed filter to the input camera of the target DNN model. We define the problem of generating such an adversarial filter, and propose a framework, named ADVFilter. ADVFilter models the optical path perturbation by thin plate spline, and optimizes for the minimal distortion of the input images. The generated adversarial filter can mislead the DNN model on all input images of the class with high probability, which shows that the threat can lead to serious consequences. Future work is to physically produce a filter and test the attack effect in the physical world.

Acknowledgements. We are particularly grateful to Inwan Yoo, who implemented TPS in TensorFlow and shared the code at https://github.com/iwyoo/tf_ThinPlateSpline.

References

1. Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. In: Advances in Neural Information Processing Systems 2012, pp. 1097–1105 (2012)
2. Wang, X., Li, J., Kuang, X., Tan, Y., Li, J.: The security of machine learning in an adversarial setting: a survey. J. Parallel Distrib. Comput. 130, 12–23 (2019). https://doi.org/10.1016/j.jpdc.2019.03.003
3. Szegedy, C., et al.: Intriguing properties of neural networks. Presented at the ICLR (2014). http://arxiv.org/abs/1312.6199. Accessed 22 Aug 2019
4. Goodfellow, I.J., Shlens, J., Szegedy, C.: Explaining and harnessing adversarial examples. Presented at the ICLR (2015). http://arxiv.org/abs/1412.6572. Accessed 22 Aug 2019
5. Chen, P.-Y., Zhang, H., Sharma, Y., Yi, J., Hsieh, C.-J.: ZOO: zeroth order optimization based black-box attacks to deep neural networks without training substitute models. In: Proceedings of the 10th ACM Workshop on Artificial Intelligence and Security (AISec), pp. 15–26 (2017). https://doi.org/10.1145/3128572.3140448
6. Ilyas, A., Engstrom, L., Athalye, A., Lin, J.: Black-box adversarial attacks with limited queries and information (2018). http://arxiv.org/abs/1804.08598. Accessed 18 Aug 2019
7. Tu, C.-C., et al.: AutoZOOM: autoencoder-based zeroth order optimization method for attacking black-box neural networks (2019). http://arxiv.org/abs/1805.11770. Accessed 18 Aug 2019
8. Moosavi-Dezfooli, S.-M., Fawzi, A., Fawzi, O., Frossard, P.: Universal adversarial perturbations, p. 9 (2017)
9. Li, Y., Bai, S., Xie, C., Liao, Z., Shen, X., Yuille, A.: Regional homogeneity: towards learning transferable universal adversarial perturbations against defenses. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12356, pp. 795–813. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58621-8_46
10. Zhang, C., Benz, P., Imtiaz, T., Kweon, I.S.: Understanding adversarial examples from the mutual influence of images and perturbations, p. 10 (2020)
11. Baluja, S., Fischer, I.: Learning to attack: adversarial transformation networks. In: The Thirty-Second AAAI Conference on Artificial Intelligence (AAAI) 2018, p. 9 (2018)
12. Li, M., Yang, Y., Wei, K., Yang, X., Huang, H.: Learning universal adversarial perturbation by adversarial example. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 1350–1358 (2022). https://doi.org/10.1609/aaai.v36i2.20023
13. Engstrom, L., Tran, B., Tsipras, D., Schmidt, L., Madry, A.: Exploring the landscape of spatial robustness (2019). http://arxiv.org/abs/1712.02779. Accessed 28 Apr 2022
14. Alaifari, R., Alberti, G.S., Gauksson, T.: ADef: an iterative algorithm to construct adversarial deformations (2019). http://arxiv.org/abs/1804.07729. Accessed 28 Apr 2022
15. Xiao, C., Zhu, J.-Y., Li, B., He, W., Liu, M., Song, D.: Spatially transformed adversarial examples (2018). http://arxiv.org/abs/1801.02612. Accessed 28 Apr 2022
16. Kurakin, A., Goodfellow, I., Bengio, S.: Adversarial examples in the physical world. arXiv:1607.02533 (2016). http://arxiv.org/abs/1607.02533. Accessed 22 Aug 2019
17. Sharif, M., Bhagavatula, S., Bauer, L., Reiter, M.K.: Accessorize to a crime: real and stealthy attacks on state-of-the-art face recognition. In: Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security - CCS 2016, Vienna, Austria, pp. 1528–1540 (2016)
18. Brown, T.B., Mané, D., Roy, A., Abadi, M., Gilmer, J.: Adversarial patch. arXiv:1712.09665 (2017). http://arxiv.org/abs/1712.09665. Accessed 22 Aug 2019
19. Athalye, A., Engstrom, L., Ilyas, A., Kwok, K.: Synthesizing robust adversarial examples (2018). http://arxiv.org/abs/1707.07397. Accessed 28 July 2019
20. Athalye, A., Carlini, N., Wagner, D.: Obfuscated gradients give a false sense of security: circumventing defenses to adversarial examples (2018). http://arxiv.org/abs/1802.00420. Accessed 18 Aug 2019
21. Eykholt, K., et al.: Robust physical-world attacks on deep learning visual classification. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, pp. 1625–1634 (2018). https://doi.org/10.1109/CVPR.2018.00175
22. Wang, D., et al.: FCA: learning a 3D full-coverage vehicle camouflage for multi-view physical adversarial attack. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 2414–2422 (2022). https://doi.org/10.1609/aaai.v36i2.20141
23. Xu, K., et al.: Adversarial T-shirt! Evading person detectors in a physical world. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12350, pp. 665–681. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58558-7_39
24. Wierstra, D., Schaul, T., Peters, J., Schmidhuber, J.: Natural evolution strategies. J. Mach. Learn. Res. 15, 949–980 (2014). https://doi.org/10.1109/CEC.2008.4631255

Enhancing Federated Learning Robustness Through Clustering Non-IID Features

Yanli Li1(B), Abubakar Sadiq Sani2, Dong Yuan1, and Wei Bao3

1 School of Electrical and Information Engineering, The University of Sydney, Sydney, NSW 2006, Australia
{yanli.li,dong.yuan}@sydney.edu.au
2 School of Computing and Mathematical Sciences, Faculty of Engineering and Science, University of Greenwich, London SE10 9LS, UK
[email protected]
3 School of Computer Science, The University of Sydney, Sydney, NSW 2006, Australia
[email protected]

Abstract. Federated learning (FL) enables many clients to train a joint model without sharing the raw data. While many byzantine-robust FL methods have been proposed, FL remains vulnerable to security attacks (such as poisoning attacks and evasion attacks) because of its distributed nature. Additionally, real-world training data used in FL are usually Non-Independent and Identically Distributed (Non-IID), which further weakens the robustness of the existing FL methods (such as Krum, Median, Trimmed-Mean, etc.), thereby making it possible for a global model in FL to be broken in extreme Non-IID scenarios. In this work, we mitigate the vulnerability of existing FL methods in Non-IID scenarios by proposing a new FL framework called Mini-Federated Learning (Mini-FL). Mini-FL follows the general FL approach but considers the Non-IID sources of FL and aggregates the gradients by groups. Specifically, Mini-FL first performs unsupervised learning on the received gradients to define the grouping policy. Then, the server divides the received gradients into different groups according to the defined grouping policy and performs byzantine-robust aggregation within each group. Finally, the server calculates the weighted mean of the gradients from each group to update the global model. Owing to its strong generality, Mini-FL can utilize most existing byzantine-robust methods. We demonstrate that Mini-FL effectively enhances FL robustness and achieves greater global accuracy than existing FL methods under security attacks and in Non-IID settings.

Keywords: Federated Learning (FL) · Byzantine-robust aggregation · Untargeted model attack

1 Introduction

Federated Learning (FL) is an emerging distributed learning paradigm that enables many clients to train a machine learning model collaboratively while


keeping the training data decentralized and users' privacy protected [13]. Generally speaking, FL contains three steps: 1) a server broadcasts the current global model to selected clients; 2) each client locally trains the model (called the local model) and sends back the local model update1; and 3) the server updates the global model by aggregating the received local model updates through a particular aggregation algorithm (AGR). However, the distributed nature of training data makes FL vulnerable to various attacks (such as poisoning attacks) by malicious attackers and untrusted clients. Poisoning attacks, which seek to damage the model and cause misbehaviour, pose the most important threat to FL security. By poisoning different training stages, poisoning attacks can lead the global model to show an indiscriminate accuracy reduction (called an untargeted attack) or attacker-chosen behaviour on a minority of examples (called a targeted attack) [13].

One popular defence against untargeted attacks is introducing a byzantine-robust aggregation rule [3,4,11,20] on the server to update the global model. By comparing the clients' model updates, these aggregation rules can find and discard statistical outliers and prevent suspected model updates from poisoning the global model. However, most of these studies [3,20] are designed and evaluated in an Independent and Identically Distributed (IID) setting, which assumes each client's data follows the same probability distribution. The training data in real-world FL applications are usually Non-IID due to location, time, and user-cluster effects, which makes the existing byzantine-robust FL methods show little effectiveness and even fully break when facing state-of-the-art attacks [9].

The most common sources of Non-IID data are a client corresponding to a particular location, a particular time window, and/or a particular user cluster [13,15]. In terms of location, various location factors have the largest impact on the Non-IID nature of a dataset. For instance, mammal distributions differ across geographic locations [12], customer profiles differ across cities [18], and emoji usage patterns differ across demographic locations [13]. In terms of a time window, people's behaviour and objects' features can be very different at different times. For instance, images of parked cars are sometimes snow-covered due to seasonal effects, and people's shopping patterns change with fashion and design trends. In terms of a particular user, different personal preferences can result in a Non-IID dataset. For instance, [5] shows that students from different disciplines have very different library usage patterns.

In this paper, we design a new FL framework, namely the Mini-FL framework, to close this research gap. Mini-FL considers the main sources of Non-IID data and identifies the Geo-feature, Time-feature, and User-feature as the candidate grouping features. Based on the selected grouping feature, the server defines the grouping principle by performing unsupervised learning. In each iteration, the server first assigns the received gradients to different groups and then performs byzantine-robust aggregation within each group. Finally, the server aggregates the aggregation outcomes (called group gradients) from each group to update the

1 In this work, we use "model update" and "gradient" interchangeably.
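As a concrete illustration of the three FL steps above (not the authors' implementation), the following minimal Python sketch runs one FL round with plain weighted averaging as the aggregation algorithm; the local_train function and the (client_data, num_examples) client tuples are assumptions of this sketch.

```python
import numpy as np

def fl_round(global_model, clients, local_train):
    """One FL round: (1) broadcast the global model, (2) each client trains
    locally, (3) the server aggregates the local updates (plain FedAvg here).

    global_model: 1-D numpy array of parameters.
    clients: list of (client_data, num_examples) pairs (hypothetical).
    local_train: function(model, client_data) -> locally trained model.
    """
    local_models, weights = [], []
    for data, n_examples in clients:
        local_models.append(local_train(global_model.copy(), data))
        weights.append(n_examples)
    weights = np.asarray(weights, dtype=float) / sum(weights)
    return np.average(np.stack(local_models), axis=0, weights=weights)
```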


global model in each iteration. We use Krum [3], Median [20], and Trimmed-mean [20] as the byzantine-robust aggregation rules to evaluate our Mini-FL on datasets with different Non-IID levels. Our results show that Mini-FL effectively enhances the security of existing byzantine-robust aggregation rules and also reaches a high level of accuracy (without attack) in the extreme Non-IID setting. We also provide a case study to further demonstrate the effectiveness of Mini-FL in the real world. Our contributions are summarized as follows:

– We propose the group-based aggregation method and identify three features (i.e., Geo-feature, Time-feature, and User-feature) as the grouping principles.
– We propose the Mini-FL framework to enhance the robustness of existing FL methods. Our results show that these methods can achieve byzantine robustness through the Mini-FL framework even in an extreme Non-IID setting.

2 Related Work

2.1 Poisoning Attacks on Federated Learning

Poisoning attacks generally refer to attacks that craft and inject malicious updates during training time. They include data poisoning attacks [2] and model poisoning attacks [6,8,9,12,18], which are performed by poisoning the training data and the gradients, respectively. Model poisoning attacks directly manipulate the gradients and can therefore have a higher attack impact on FL. Based on the adversary's goal, the attacks can be further classified into untargeted attacks [6,8,9,12,18] (model downgrade attacks) and targeted attacks [10,16] (backdoor attacks). In untargeted attacks, the adversary aims to reduce the global model's accuracy and entirely 'break' the model by participating in the learning task. In contrast, targeted attacks maintain the global model's overall accuracy but insert a 'back door' for a minority of examples. These back doors can trigger a wrong reaction when the attacker-chosen event occurs. For instance, [10] can force GoogLeNet to classify a panda as a gibbon by adding an imperceptibly small vector to the panda image, and the Faster R-CNN cannot detect a 'stop' sign to which small perturbations have been added [16]. As untargeted attacks pose the dominant security threat to FL, we consider the following untargeted model poisoning attacks in this study. The "Reverse attack" [6] and "Random attack" [8] poison the global model by uploading a reversed gradient and a random gradient, respectively. The "Partial drop attack" [8] replaces gradient parameters with 0 with a given probability and subsequently uploads the crafted gradient to poison the global model. The "Little is enough attack" [1] and "Fall of empires attack" [19] leverage the dimensionality curse of machine learning and upload a crafted gradient obtained by adding a perturbation to the mean of the gradients owned (based on the adversary's capability). The "Local model poisoning attack" [9] is a state-of-the-art attack: it infers the convergence direction of the gradients and uploads a scaled, reversed gradient to poison the global model.

2.2 Byzantine-Robust Aggregation Rules for FL

The FL server can effectively average and aggregate the received local models in non-adversarial settings [17]. However, linear combination rules, including averaging, are not byzantine resilient. In particular, a single malicious worker can corrupt the global model and even prevent global model convergence [3]. Therefore, byzantine-robust aggregation rules have been designed to replace averaging aggregation and address byzantine failures. Next, we discuss the popular byzantine-robust aggregation rules.

Krum [3]: Krum discards the gradients that are too far away from the benign gradients. In particular, for each gradient received, Krum calculates the sum of the Euclidean distances to a number of its closest neighbours as a score. The gradient with the lowest score is the aggregation outcome and becomes the new global model in this iteration. As the number of closest neighbours selected influences the score, Krum requires the number of attackers.

Trimmed-Mean and Median [20]: Trimmed-mean is a coordinate-wise aggregation rule which aggregates each model parameter separately. Specifically, for a given parameter, the server first sorts the values of that parameter from all received gradients. Then, the server discards a fraction of the largest and smallest values and finally averages the remaining values as the corresponding parameter of the new global model in this iteration. The Median method is another coordinate-wise aggregation rule: the server sorts the values of each parameter from all received gradients and selects the median as the corresponding parameter of the new global model in this iteration.

Bulyan [11]: Bulyan can be regarded as a combination of Krum and Trimmed-mean. Specifically, Bulyan first selects a number of gradients by performing Krum (each gradient is removed from the candidate pool once selected). Then Bulyan performs Trimmed-mean on the selected gradients to update the global model.

FLTrust [4]: FLTrust considers both the directions and magnitudes of the gradients. The server collects a clean dataset and maintains a corresponding server model; in each iteration, FLTrust first calculates the cosine similarity between each received gradient and the server's gradient. A gradient with higher cosine similarity gains a higher trust score and consequently participates in the weighted average with a higher proportion. Instead of directly participating in the aggregation, each gradient is normalized by the magnitude of the server's gradient before the weighted average.

Table 1 illustrates the robustness of the existing FL methods and the proposed Mini-FL methods against different attacks under the IID/Non-IID settings. Since these attacks (i.e., untargeted attacks) aim to reduce the model's global accuracy indiscriminately, we use the global testing accuracy to evaluate the robustness of the FL methods.
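For illustration, the following Python sketch implements the Krum, Median, and Trimmed-mean rules exactly as described above (it is not the original authors' code); grads is a list of 1-D numpy gradient arrays and f is the assumed number of malicious clients.

```python
import numpy as np

def krum(grads, f):
    """Krum [3]: return the gradient whose summed squared distance to its
    n - f - 2 closest neighbours is smallest."""
    n = len(grads)
    dists = np.array([[np.sum((g - h) ** 2) for h in grads] for g in grads])
    scores = [np.sort(dists[i])[1:n - f - 1].sum() for i in range(n)]
    return grads[int(np.argmin(scores))]

def coordinate_median(grads):
    """Median [20]: coordinate-wise median of the received gradients."""
    return np.median(np.stack(grads), axis=0)

def trimmed_mean(grads, beta):
    """Trimmed-mean [20]: per coordinate, drop the beta largest and beta
    smallest fractions of values and average the rest."""
    stacked = np.sort(np.stack(grads), axis=0)
    k = int(beta * len(grads))
    return stacked[k:len(grads) - k].mean(axis=0)
```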


Table 1. The robustness of the existing FL methods and the proposed Mini-FL methods against poisoning attacks (✓: effective, O: partially effective, ×: ineffective; Non: Non-IID).

| Method       | "Reverse","Random" IID | Non | "Partial" IID | Non | "Little","Fall" IID | Non | "Local" IID | Non |
|--------------|------------------------|-----|---------------|-----|---------------------|-----|-------------|-----|
| Average      | ×                      | ×   | ×             | ×   | ×                   | ×   | ×           | ×   |
| Krum         | ✓                      | ×   | ✓             | ×   | O                   | ×   | ×           | ×   |
| Trimmed-Mean | O                      | ×   | ×             | ×   | O                   | ×   | O           | O   |
| Median       | ✓                      | O   | ✓             | O   | ✓                   | O   | O           | ×   |
| Bulyan       | ✓                      | O   | ✓             | O   | ✓                   | O   | O           | O   |
| FL-Trust     | ✓                      | O   | ✓             | O   | ✓                   | O   | ✓           | O   |
| Mini Krum    | ✓                      | ✓   | ✓             | ✓   | ✓                   | ✓   | ✓           | ✓   |
| Mini T-Mean  | ✓                      | O   | ✓             | ×   | ✓                   | ✓   | ✓           | ×   |
| Mini Median  | ✓                      | ✓   | ✓             | ✓   | ✓                   | ✓   | ✓           | ✓   |

3 Problem Setup

3.1 Adversary's Objective and Capability

The adversary aims to reduce the model's global accuracy or to 'fully break' the global model by uploading malicious gradients; this is also known as an untargeted model poisoning attack or a model downgrade attack [10,13,16]. We consider the adversary's capability and knowledge along three dimensions: the number of adversaries, the malicious clients' distribution, and the knowledge of the aggregation rule. We assume the adversary controls some clients, called malicious clients, and we keep the adversary-number setting of each existing FL method, i.e., Krum: 2f + 2 < n, Trimmed-Mean: 2f < n, and Median: 2f < n, where f is the number of attackers and n denotes the number of all clients. The adversary knows the local training data on the malicious clients and can arbitrarily send crafted local model updates to the server in each iteration. To guarantee generality, we assume the distributions of malicious clients and benign clients are similar. Furthermore, we assume the adversary knows the aggregation rules but does not know the grouping principle.

3.2 Defense Objective and Capability

We aim to develop an FL framework that achieves byzantine robustness against untargeted attacks while embodying the data minimization principle. Specifically, the new framework does not require clients to upload any information beyond their local model updates. The server plays the defender's role and has access only to the information naturally carried with the uploaded gradients (e.g., IP address, timestamp, etc.). We notice that some byzantine-robust aggregation rules need to know an upper bound on the number of malicious clients [3,20]; we follow these settings but do not leak further information about malicious clients. In particular, the defender does not know the distribution of malicious clients.

4 Mini-FL Design and Analysis

4.1 Overview of Mini-FL

In Mini-FL, the server assigns the received model updates to different groups and executes byzantine-robust aggregation accordingly. Specifically, Mini-FL follows the general FL framework but adds a new step (i.e., grouped model aggregation) before the global model update. Furthermore, a preprocessing step, the grouping principle definition, is introduced before the training task starts. Figure 1 illustrates the Mini-FL framework.

Fig. 1. Illustration of the Mini-FL framework.

To craft a malicious gradient and avoid being excluded by byzantine-robust aggregation rules, the adversary commonly performs statistical analysis on the gradients it owns and calculates (or infers) the range of the benign gradients. By keeping the crafted gradient within this range, the attackers can effectively hide their gradients among the benign ones and subsequently attack the global model. However, because most federated learning models are trained on Non-IID data, the uploaded gradients naturally tend to be clustered due to location, time, and user-cluster effects. Thus, Mini-FL first defines the groups and then executes byzantine-robust aggregation within each group. The similar behaviour within each group yields a smaller gradient range and therefore a smaller attack space. Finally, the server aggregates the outcome from each group and updates the global model to finish the current iteration.

Fig. 2. Illustration of the Mini-FL aggregation rule.

4.2 Mini-FL Framework

Our Mini-FL leverages the Non-IID nature of federated learning to define groups and then executes byzantine-robust aggregation accordingly. Figure 2 illustrates the Mini-FL aggregation rule.

Grouping Principle Definition. Before the learning task starts, the server defines the grouping principle (i.e., the preprocessing step), which includes the "grouping feature definition" and the "grouping boundaries definition"; the grouping principle only needs to be defined before the learning task starts or when it requires updating (a minimal sketch of this clustering step is given after the list below).

– Grouping feature definition: Existing research [13] attributes the major sources of Non-IID data to each client corresponding to a particular geographic location, a particular time window, and/or a particular user. For instance, [12] demonstrates a real-world example of skewed label partitions, the geographical distribution of mammal pictures on Flickr, and [13] illustrates that the same label can also look very different at different times (e.g., seasonal effects, fashion trends, etc.). Considering the major sources of Non-IID data and the features naturally carried in server-client communication, we identify the Geo-feature (e.g., IP address), Time-feature (e.g., timestamp), and User-feature (e.g., user ID) of the local model update as the candidate grouping features, which maintains the principle of focused collection and guarantees the effectiveness of clustering. When defining the grouping feature, the server first regroups the gradient collection C by the Geo-feature; the collection C should accumulate the gradients received over a few iterations to maintain generality. Then, we execute the 'elbow method' [14] to determine the number of clusters and obtain the corresponding SSE (i.e., Sum of Squared Errors, which reflects the grouping effectiveness). By repeating the first two steps with the Geo-feature replaced by the Time-feature and the User-feature, we find the feature F with the lowest SSE. Finally, we select that feature F as the grouping feature and the corresponding elbow point as the number of groups.
– Grouping boundaries definition: Once the grouping feature has been defined, we cluster the regrouped collection through unsupervised learning. In this research, we use the K-means algorithm for the unsupervised learning; the 'elbow' point is given to the algorithm as the number of clusters. By analyzing the gradients' feature values in the different clusters, the grouping boundaries can be defined.
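The clustering step referenced above can be sketched as follows; this is only an illustration under simplifying assumptions: each candidate feature (Geo, Time, or User) is encoded as a single number per gradient, scikit-learn's KMeans provides the SSE (inertia), and the elbow point is picked by a crude largest-drop heuristic rather than a formal criterion.

```python
import numpy as np
from sklearn.cluster import KMeans

def sse_curve(values, k_max=8):
    """K-means SSE (inertia) for k = 1..k_max on a 1-D candidate feature."""
    x = np.asarray(values, dtype=float).reshape(-1, 1)
    return [KMeans(n_clusters=k, n_init=10, random_state=0).fit(x).inertia_
            for k in range(1, k_max + 1)]

def pick_elbow(sse):
    """Crude elbow heuristic: the k right after the largest single drop in SSE."""
    return int(np.argmin(np.diff(sse))) + 2

def define_grouping(candidates, k_max=8):
    """candidates: {'geo': [...], 'time': [...], 'user': [...]} scalar features.
    Return the feature with the lowest SSE at its elbow point and a fitted
    K-means model whose clusters act as the grouping principle."""
    scored = {}
    for name, values in candidates.items():
        sse = sse_curve(values, k_max)
        k = pick_elbow(sse)
        scored[name] = (sse[k - 1], k, values)
    best = min(scored, key=lambda n: scored[n][0])
    _, k, values = scored[best]
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(
        np.asarray(values, dtype=float).reshape(-1, 1))
    return best, k, model
```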


Grouped Model Aggregation. According to the grouping principle, the server divides the received gradients into different groups and executes byzantine-robust aggregation within each group. The Mini-FL framework has strong generality and can utilize most existing byzantine-robust aggregation rules; in this research, we use Krum, Trimmed-mean, and Median for aggregation, and the details of the experiments are presented in Sect. 5.

Global Model Update. The server calculates the weighted mean of the group gradients (i.e., the outcomes from each group) and updates the global model to finish this iteration.
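The two steps above can be sketched as follows (illustrative only): group_ids are derived from the carried feature via the grouping principle, robust_agg is any rule from Sect. 2.2 (e.g., the coordinate-wise median), and weighting the group gradients by group size is an assumption of this sketch, since the paper only states that a weighted mean is used.

```python
import numpy as np

def mini_fl_aggregate(grads, group_ids, robust_agg):
    """Grouped model aggregation and global model update: run a byzantine-robust
    rule inside each group, then take the group-size-weighted mean."""
    groups = {}
    for g, gid in zip(grads, group_ids):
        groups.setdefault(gid, []).append(g)
    group_grads, weights = [], []
    for members in groups.values():
        group_grads.append(robust_agg(members))   # per-group robust aggregation
        weights.append(len(members))              # weight by group size (assumed)
    weights = np.asarray(weights, dtype=float) / sum(weights)
    return np.average(np.stack(group_grads), axis=0, weights=weights)
```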

4.3 Security Enhancement Analysis

In this section, we analyze the security enhancement of Mini-FL from the 'information asymmetry' perspective. As discussed in Sect. 2, most existing byzantine-robust aggregation rules can effectively detect and discard a malicious gradient if it is far (in Euclidean distance) from the benign gradients. To guarantee attack effectiveness and avoid being excluded by the byzantine-robust aggregation rules, a common perturbation strategy is to determine the attack direction and then scale the crafted gradient so that it stays close to the benign gradients. Depending on its knowledge, the adversary can precisely or approximately infer the statistics (e.g., max, min, mean, and Std (standard deviation)) of the benign gradients and scale the crafted gradient accordingly; Table 2 illustrates the scaling of the crafted gradient in different attacks.

Table 2. Illustration of the scaling of the crafted gradient in different attacks.

| Attack       | Crafted gradient range |
|--------------|------------------------|
| "Little" [1] | (μ − zσ, μ + zσ); μ: mean, z: scalar (set to 0–1.5 in the original work), σ: Std |
| "Fall" [19]  | (−zμ, −zμ); μ: mean, z: scalar (0–10 in the original work) |
| "Local" [9]  | (μ + 3σ, μ + 4σ) or (μ − 4σ, μ − 3σ), depending on the gradient direction, when the adversary has partial knowledge; (Wmax, z·Wmax) or (z·Wmin, Wmin), depending on the gradient direction, when the adversary has full knowledge; μ: mean, z: scalar (set to 2 in the original work), σ: Std, Wmax/Wmin: the max/min gradient value at that iteration |

However, Mini-FL defines the grouping principles and clusters the received gradients only on the server side. This information asymmetry makes it hard for the adversary to infer the membership of the different groups, let alone calculate the relevant statistics needed to scale the crafted gradients and bypass the defense of Mini-FL.
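As a concrete illustration of the scaling strategies in Table 2 (from the attacker's perspective, and not tied to any particular implementation), the sketch below crafts "Little"- and "Fall"-style gradients from the benign gradients the adversary controls.

```python
import numpy as np

def little_is_enough(benign_grads, z=1.0):
    """Craft a gradient on the edge of the (mu - z*sigma, mu + z*sigma) range
    used by the 'little is enough' attack [1]."""
    stacked = np.stack(benign_grads)
    return stacked.mean(axis=0) - z * stacked.std(axis=0)

def fall_of_empires(benign_grads, z=1.0):
    """Craft the -z * mu gradient used by the 'fall of empires' attack [19]."""
    return -z * np.stack(benign_grads).mean(axis=0)
```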

5 Evaluation

5.1 Experimental Setup

Dataset. We evaluate our Mini-FL framework on MNIST [7]. To simulate real-world dataset patterns, we set different Non-IID degrees when distributing the training data. Suppose we have m groups of clients and l different data labels, with s training examples per label: we assign p · s training examples of label l to that label's designated client group, and then randomly assign the remaining s − (p · s) training examples of that label across the m groups. As the parameter p controls the distribution of training data over clients, we call p the Non-IID degree. To further embody the source of each Non-IID distribution, we assign a feature (i.e., Geo, Time, or User feature) to each local model update. A minimal sketch of this partitioning rule follows the dataset descriptions below.

MNIST-1.0: The MNIST [7] (Modified National Institute of Standards and Technology) database is an extensive database of handwritten digits that includes 60,000 training images and 10,000 testing images. To simulate people's different handwriting habits in different countries [13], we divide the clients into five groups; each group owns one unique IP range (reflecting different countries) and training examples with two different labels (reflecting different handwriting habits). We use MNIST-1.0 (p = 1.0) to simulate the extreme Non-IID situation (Non-IID degree = 1.0). In other words, each group only has two unique labels of training examples in MNIST-1.0.

MNIST-0.75 and MNIST-0.5: We use MNIST-0.75 and MNIST-0.5 to evaluate the effectiveness of Mini-FL at different Non-IID degrees. MNIST-0.75 and MNIST-0.5 have similar settings to MNIST-1.0, but the Non-IID degree p is 0.75 and 0.5, respectively.
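A minimal sketch of the partitioning rule above; the label-to-group assignment (label l goes to group l mod m) and the use of numpy's random generator are assumptions made for illustration.

```python
import numpy as np

def partition_non_iid(labels, m, p, seed=0):
    """Split example indices into m client groups with Non-IID degree p.

    Assumption for this sketch: label l's 'home' group is l % m.  A fraction p
    of each label's examples goes to its home group; the remaining 1 - p is
    spread uniformly at random over all m groups.
    """
    rng = np.random.default_rng(seed)
    groups = [[] for _ in range(m)]
    for lab in np.unique(labels):
        idx = rng.permutation(np.where(labels == lab)[0])
        cut = int(p * len(idx))
        groups[int(lab) % m].extend(idx[:cut])     # home-group share (p * s)
        for i in idx[cut:]:                        # random remainder (s - p*s)
            groups[int(rng.integers(m))].append(i)
    return [np.asarray(g) for g in groups]
```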

Evaluated Poisoning Attacks. Mini-FL provides a new framework to enhance the security of FL, and its strong generality enables Mini-FL to adopt most existing byzantine-robust aggregation rules. We introduce Krum [3], Trimmed-mean [20], and Median [20] in the experiments, and select the following poisoning attacks to evaluate the effectiveness of Mini-FL. We do not introduce FL-Trust into Mini-FL because FL-Trust does not fit extreme Non-IID scenarios: Krum-adapted attacks can achieve a 90% attack success rate when the root dataset's bias probability is over 0.6 [4].

"Reverse Attack" [6]: The "Reverse attack" poisons the global model by uploading the reversed gradient. We follow the setting in [6] and set the attack multiple to 100.

"Random Attack" [8]: The "Random attack" poisons the global model by uploading a random gradient.

"Partial Drop Attack" [8]: The "Partial drop attack" masks each gradient parameter as 0 with probability p. As the parameters naturally contain a few zeros in our training tasks, we strengthen the attack by replacing the mask value 0 with -1 and set p to 0.8 in the experiments.

"Little is Enough Attack" [1]: The "Little is enough attack" leverages the dimensionality curse of ML and uploads the crafted gradient, gradient = μ + z · σ, where μ and σ are the mean and standard deviation of the gradients, respectively, and z is the attack multiple; we set z to 1.035, 1.535, and 2.035.

"Fall of Empires Attack" [19]: The "Fall of empires attack" uploads the crafted gradient, gradient = −z · μ, where μ is the mean of the gradients and z is the attack multiple; we set z to 1 and 10.

"Local Model Poisoning Attack" [9]: The "Local model poisoning attack" is a state-of-the-art attack. It infers the convergence direction of the gradients and uploads a scaled, reversed gradient to poison the global model. We follow the default setting in [9].

Evaluation Metrics. Since these attacks (i.e., untargeted attacks) aim to reduce the model's global accuracy indiscriminately, we use the testing accuracy to evaluate the effectiveness of our Mini-FL. In particular, we use a part of the data as testing examples and test the model's global accuracy in each iteration. The testing accuracy reflects the model's robustness against byzantine attacks; in other words, the model is more robust if it has a higher testing accuracy. We use the existing FL methods with the original framework as the baselines to compare against.

FL System Setting. Unless otherwise specified, we use the following settings.

Global model setting: As this study does not aim to improve model accuracy by crafting the model, we use a general model for training on MNIST. This model consists of a dense layer (28 * 28) and a softmax layer (10).

Learning parameters: We set the learning rate to 0.01, the batch size to 128, and the number of epochs to 50. We set the number of global iterations to 300. As some byzantine-robust methods (Krum in this study) require the parameter M, the upper bound on the number of malicious clients, we follow the setting in [3] in which the server knows the exact number of malicious clients. However, since Mini-FL defines groups and performs aggregation per group, Mini-FL further requires the number of malicious clients m in each group when introducing Krum. To maintain generality, we set m in proportion to the group size:

$$m = \frac{n_{group}}{N_{global}} M$$

Here, $n_{group}$ is the number of clients in the group (i.e., the group size) and $N_{global}$ is the total number of clients. In other words, we do not give any privilege to Mini-FL, and Mini-FL can only use this proportion to infer the number of malicious clients in each group.

Clients & data setting: We assume 20 clients participate in the learning task in each iteration, and 25% of the clients are malicious. In Mini-FL, gradients are assigned to different groups according to the features they carry. To simulate a real-world Non-IID setting, we assign different numbers of clients to different groups; consequently, the larger groups contain more malicious clients. Table 3 illustrates the detailed setting for MNIST (Non-IID degree = 1.0).


Table 3. Illustration of the setting (Client & Data) for the MNIST.

|                 | Group1                | Group2                | Group3           | Group4      | Group5       |
|-----------------|-----------------------|-----------------------|------------------|-------------|--------------|
| Training labels | 1, 2                  | 3, 4                  | 5, 6             | 7, 8        | 9, 0         |
| Client ID       | C1, C6, C11, C16, C19 | C2, C7, C12, C17, C20 | C3, C8, C13, C18 | C4, C9, C14 | C5, C10, C15 |
| Attackers       | C1, C11               | C2, C12               | C3               | None        | None         |

5.2 Experimental Results

The results show that Mini-FL achieves better robustness than the existing FL methods. Figures 3 and 4 illustrate the global accuracy of the existing FL methods and the Mini-FL methods under different Non-IID degrees. When the Non-IID degree increases, most Mini-FL methods maintain a similar global accuracy under the same attack, while the existing FL methods show decreasing global accuracy. For instance, Mini-Median stably maintains around 90% global accuracy across various attacks and Non-IID settings. In contrast, Median achieves around 85% global accuracy against various attacks on MNIST-0.5 but drops to 64.62%, 26.51%, 55.79%, and 53.68% on MNIST-1.0 under the "little attack", "fall attack", "random attack", and "drop attack", respectively.

Mini-FL Achieves the Defense Objectives: Recall that the defense objectives include two parts (see Sect. 3): achieving byzantine robustness against untargeted attacks and maintaining the data minimization principle of FL. The experimental results show that our Mini-FL framework achieves these goals.

Table 4. The global accuracy of different FL/Mini-FL methods under different Non-IID degrees in the non-attack setting.

|                   | MNIST-1.0 | MNIST-0.75 | MNIST-0.5 |
|-------------------|-----------|------------|-----------|
| Avg               | 88.89%    | 90.04%     | 90.38%    |
| Krum              | 88.25%    | 77.27%     | 87.47%    |
| Mini Krum         | 89.51%    | 90.03%     | 90.09%    |
| Median            | 53.60%    | 80.45%     | 86.30%    |
| Mini Median       | 90.34%    | 90.26%     | 90.47%    |
| Trimmed-mean      | 86.88%    | 88.29%     | 89.24%    |
| Mini Trimmed-mean | 90.38%    | 90.47%     | 90.51%    |

First, Mini-FL achieves similar global accuracy to FedAvg (the average aggregation rule) in the non-attack setting, whereas most existing byzantine-robust FL methods show decreased accuracy. For instance, FedAvg and all Mini-FL methods (i.e., Mini-Krum, Mini-Median, Mini-Trimmed-mean) achieve over 90% global accuracy on MNIST-0.75, while Krum, Median, and Trimmed-mean reach 77.27%, 80.45%, and 88.29%, respectively. Table 4 illustrates the global accuracy of the different FL/Mini-FL methods under different Non-IID degrees in the non-attack setting. The results show that the Mini-FL framework increases the accuracy of the existing FL methods in the non-attack scenario. This is because benign gradients can be very different in a Non-IID setting and may therefore be regarded as malicious gradients and discarded by the existing FL methods. As Mini-FL performs the aggregation by group, it comprehensively collects features from the different groups and preserves global accuracy.

Fig. 3. The robustness of existing FL methods under different Non-IID levels.

Fig. 4. The robustness of Mini-FL methods under different Non-IID levels.

Table 5. The global accuracy of FL/Mini-FL methods under different Non-IID degrees and different attacks.

(a) MNIST-1.0

|                | Average | Krum   | Mini Krum | Median | Mini Median | T-Mean | Mini T-Mean |
|----------------|---------|--------|-----------|--------|-------------|--------|-------------|
| Little (2.035) | 74.71%  | 74.92% | 89.62%    | 53.76% | 90.37%      | 52.83% | 89.76%      |
| Little (1.035) | 84.42%  | 57.78% | 89.70%    | 64.62% | 90.34%      | 71.95% | 90.40%      |
| Fall (10)      | 23.73%  | 77.34% | 88.38%    | 54.98% | 90.10%      | 61.54% | 90.18%      |
| Fall (1)       | 78.23%  | 48.97% | 87.61%    | 26.51% | 89.95%      | 85.41% | 89.41%      |
| Random         | 80.19%  | 88.03% | 89.68%    | 55.79% | 90.37%      | 76.80% | 79.04%      |
| Partial Drop   | 61.65%  | 88.05% | 87.33%    | 53.68% | 90.42%      | 61.47% | 62.33%      |
| Local          | n/d     | 2.85%  | 89.77%    | 64.51% | 87.32%      | 78.62% | n/d         |

(b) MNIST-0.75

|                | Average | Krum   | Mini Krum | Median | Mini Median | T-Mean | Mini T-Mean |
|----------------|---------|--------|-----------|--------|-------------|--------|-------------|
| Little (2.035) | 66.89%  | 83.84% | 89.74%    | 81.25% | 90.22%      | 61.05% | 89.94%      |
| Little (1.035) | 89.49%  | 59.55% | 89.81%    | 80.65% | 90.34%      | 87.33% | 90.41%      |
| Fall (10)      | 85.33%  | 77.27% | 89.77%    | 79.15% | 89.98%      | 64.09% | 90.23%      |
| Fall (1)       | 89.63%  | 52.43% | 89.74%    | 77.59% | 90.10%      | 87.51% | 89.95%      |
| Random         | 74.85%  | 88.93% | 89.74%    | 81.28% | 90.24%      | 80.85% | 81.28%      |
| Partial Drop   | 69.99%  | 88.83% | 89.77%    | 81.87% | 90.31%      | 49.36% | 72.53%      |
| Local          | n/d     | 62.31% | 90.00%    | 75.90% | 88.78%      | 85.16% | n/d         |

(c) MNIST-0.5

|                | Average | Krum   | Mini Krum | Median | Mini Median | T-Mean | Mini T-Mean |
|----------------|---------|--------|-----------|--------|-------------|--------|-------------|
| Little (2.035) | 79.15%  | 88.71% | 90.02%    | 86.56% | 90.36%      | 80.83% | 90.34%      |
| Little (1.035) | 89.88%  | 68.06% | 90.01%    | 86.51% | 90.35%      | 88.80% | 90.41%      |
| Fall (10)      | 88.38%  | 89.66% | 90.03%    | 85.88% | 90.38%      | 71.64% | 90.35%      |
| Fall (1)       | 90.11%  | 84.03% | 89.99%    | 85.48% | 90.41%      | 88.77% | 90.30%      |
| Random         | 78.43%  | 89.67% | 90.02%    | 86.60% | 90.36%      | 80.92% | 83.08%      |
| Partial Drop   | 72.06%  | 89.66% | 90.00%    | 86.86% | 90.43%      | 57.11% | 73.48%      |
| Local          | n/d     | 80.68% | 90.28%    | 76.48% | 89.81%      | 86.37% | n/d         |

Second, our Mini-FL shows better robustness and stability than most existing FL methods against different attacks and under different Non-IID settings. Specifically, most Mini-FL methods can maintain their unattacked global accuracy even when facing a state-of-the-art attack under an extreme Non-IID setting; in contrast, existing FL methods lose a large amount of global accuracy and can even be fully broken. For instance, Mini-Median achieves 89.77% global accuracy on MNIST-1.0 under the 'local attack', while Median drops its global accuracy from 53.60% to 2.85%. Table 5 illustrates the global accuracy of the FL/Mini-FL methods under different Non-IID degrees and different attacks. Moreover, the results show that although Mini-Trimmed-mean improves the robustness of the Trimmed-mean method, it achieves lower global accuracy than the other Mini-FL methods. For instance, Mini-Trimmed-mean achieves 62.33% global accuracy under the drop attack on MNIST-1.0 while the other Mini-FL methods reach around 90%. This is because the original FL method, Trimmed-mean (β = 20%), presents a larger attack surface than Krum and Median: Trimmed-mean (β = 20%) accepts and aggregates 80% of the received gradients, while Krum and Median accept only one gradient.

Third, Mini-FL maintains the principles of focused collection and data minimization of FL. All of the information used for grouping (i.e., IP address, response time, and client ID) is naturally carried by the gradients when they are uploaded. Mini-FL neither asks clients to upload further information nor mines their features through reverse engineering, which provides the same privacy protection as the existing FL methods.

6 Discussion and Future Work

Mini-Krum and Bulyan: Mini-Krum and Bulyan [11] are different, although both rely on combining Krum with mean/trimmed-mean methods. Specifically, Mini-Krum performs Krum per group and takes the weighted average of the group outcomes as the global model. In contrast, Bulyan globally performs Krum n times to select n gradients and then performs Trimmed-mean on them to generate the global model. As Bulyan does not consider the Non-IID setting of FL, it suffers a similar performance degradation to the other FL methods in Non-IID scenarios.

Non-IID sources: As the Geo-feature, Time-feature, and User-feature are the most common sources of Non-IID data in the real world, we select these three features as the grouping features in this research, but we note that the Non-IID source could be more complicated and could even be a combination of sources in some cases [13]. We leave exploring further Non-IID sources and improving the Mini-FL method accordingly to future work.

7 Conclusion

We evaluated the robustness of existing FL methods in different Non-IID settings and proposed a new framework called Mini-FL to enhance federated learning robustness. The main difference between Mini-FL and the existing FL methods is that Mini-FL considers FL's Non-IID nature and performs byzantine-tolerant aggregation within different groups. Our evaluation shows that Mini-FL effectively enhances the robustness of existing FL methods and maintains stable performance against untargeted model attacks across different Non-IID settings.

References

1. Baruch, G., Baruch, M., Goldberg, Y.: A little is enough: circumventing defenses for distributed learning. In: Advances in Neural Information Processing Systems, vol. 32 (2019)
2. Biggio, B., Nelson, B., Laskov, P.: Poisoning attacks against support vector machines. arXiv preprint arXiv:1206.6389 (2012)
3. Blanchard, P., El Mhamdi, E.M., Guerraoui, R., Stainer, J.: Machine learning with adversaries: byzantine tolerant gradient descent. In: Advances in Neural Information Processing Systems, vol. 30 (2017)
4. Cao, X., Fang, M., Liu, J., Gong, N.Z.: FLTrust: byzantine-robust federated learning via trust bootstrapping. arXiv preprint arXiv:2012.13995 (2020)
5. Collins, E., Stone, G.: Understanding patterns of library use among undergraduate students from different disciplines. Evid. Based Libr. Inf. Pract. 9(3), 51–67 (2014)
6. Damaskinos, G., El-Mhamdi, E.M., Guerraoui, R., Guirguis, A., Rouault, S.: Aggregathor: byzantine machine learning via robust gradient aggregation. In: Proceedings of Machine Learning and Systems, vol. 1, pp. 81–106 (2019)
7. Deng, L.: The MNIST database of handwritten digit images for machine learning research [best of the web]. IEEE Signal Process. Mag. 29(6), 141–142 (2012)
8. El-Mhamdi, E.M., Guerraoui, R., Guirguis, A., Hoang, L.N., Rouault, S.: Genuinely distributed byzantine machine learning. In: Distributed Computing, pp. 1–27 (2022)
9. Fang, M., Cao, X., Jia, J., Gong, N.: Local model poisoning attacks to Byzantine-robust federated learning. In: 29th USENIX Security Symposium (USENIX Security 2020), pp. 1605–1622 (2020)
10. Goodfellow, I.J., Shlens, J., Szegedy, C.: Explaining and harnessing adversarial examples. arXiv preprint arXiv:1412.6572 (2014)
11. Guerraoui, R., Rouault, S., et al.: The hidden vulnerability of distributed learning in byzantium. In: International Conference on Machine Learning, pp. 3521–3530. PMLR (2018)
12. Hsieh, K., Phanishayee, A., Mutlu, O., Gibbons, P.: The non-IID data quagmire of decentralized machine learning. In: International Conference on Machine Learning, pp. 4387–4398. PMLR (2020)
13. Kairouz, P., et al.: Advances and open problems in federated learning. Found. Trends Mach. Learn. 14(1–2), 1–210 (2021)
14. Kodinariya, T.M., Makwana, P.R.: Review on determining number of cluster in k-means clustering. Int. J. 1(6), 90–95 (2013)
15. Li, Q., Diao, Y., Chen, Q., He, B.: Federated learning on non-IID data silos: an experimental study. In: 2022 IEEE 38th International Conference on Data Engineering (ICDE), pp. 965–978. IEEE (2022)
16. Lu, J., Sibai, H., Fabry, E.: Adversarial examples that fool detectors. arXiv preprint arXiv:1712.02494 (2017)
17. McMahan, B., Moore, E., Ramage, D., Hampson, S., Arcas, B.A.: Communication-efficient learning of deep networks from decentralized data. In: Artificial Intelligence and Statistics, pp. 1273–1282. PMLR (2017)
18. Moreno-Torres, J.G., Raeder, T., Alaiz-Rodríguez, R., Chawla, N.V., Herrera, F.: A unifying view on dataset shift in classification. Pattern Recogn. 45(1), 521–530 (2012)
19. Xie, C., Koyejo, O., Gupta, I.: Fall of empires: breaking byzantine-tolerant SGD by inner product manipulation. In: Uncertainty in Artificial Intelligence, pp. 261–270. PMLR (2020)
20. Yin, D., Chen, Y., Kannan, R., Bartlett, P.: Byzantine-robust distributed learning: towards optimal statistical rates. In: International Conference on Machine Learning, pp. 5650–5659. PMLR (2018)

Towards Improving the Anti-attack Capability of the RangeNet++

Qingguo Zhou1, Ming Lei1, Peng Zhi1, Rui Zhao1, Jun Shen2, and Binbin Yong1(B)

1 School of Information Science and Engineering, Lanzhou University, Tianshui South Road 222, Lanzhou 730000, Gansu, China
[email protected]
2 School of Computing and Information Technology, University of Wollongong, Wollongong, NSW, Australia

Abstract. With the possibility of deceiving deep learning models by appropriately modifying images now verified, much research on adversarial attacks and adversarial defenses has been carried out in academia. However, there has been little research on adversarial attacks and defenses for point cloud semantic segmentation models, especially in the field of autonomous driving. The stability and robustness of point cloud semantic segmentation models are our primary concerns in this paper. Targeting the point cloud segmentation model RangeNet++ in the field of autonomous driving, we propose novel approaches to improve the security and anti-attack capability of the RangeNet++ model. One is to calculate local geometric features that reflect the surface shape of the point cloud based on the range image. The other is to obtain a general adversarial sample, related only to the image itself and closer to the real world, based on the range image, and then add it to the training set for training. The experimental results show that the proposed approaches can effectively improve RangeNet++'s defense against adversarial attacks and meanwhile enhance the RangeNet++ model's robustness.

Keywords: Adversarial attacks · Adversarial defenses · Semantic segmentation · RangeNet++ · Local geometry · Adversarial samples

1 Introduction

As part of 3D point cloud scene understanding, semantic segmentation is a very important task in the field of autonomous driving. Semantic segmentation assigns a class label to each data point in the input modality, where a data point can be a pixel from a camera or a 3D point acquired from a radar. A stable semantic segmentation model can greatly improve the safety of autonomous driving systems. However, most semantic segmentation models perform poorly when facing adversarial attacks and have fairly weak adversarial defenses. In recent years, there has been much research on applying adversarial attacks and adversarial defenses to various practical scenarios. Weng et al. [3] used adversarial attacks to find flaws in image QR codes. Wu et al. [4] summarized methods for image generation and


Fig. 1. Velodyne HDL-64E laser scan from KITTI dataset [1] with semantic information from RangeNet++. Best viewed in color, each color represents a different semantic class [2].

editing based on adversarial networks. Deng et al. [5] proposed a hippocampal segmentation method that combines residual attention mechanism with generative adversarial networks. However, there have been few studies on adversarial attacks and adversarial defenses of point cloud semantic segmentation models. The essence of adversarial attack is to generate adversarial disturbances that can mislead model judgments and add them to samples. Therefore, how to generate aggressive and versatile adversarial samples is the focus of this research, while adversarial defense mainly focuses on improving the robustness of the model so that the model can still work normally in the face of disturbance. Among them, adding adversarial samples to the training set is a commonly used method for adversarial defense. As one of the best point cloud semantic segmentation methods in the field of autonomous driving, the basic concept of RangeNet++ is to convert the point cloud into a range image, and then put it into a 2D convolutional network for learning, so as to obtain the label of the point. An example of RangeNet++ is shown in Fig. 1. However, the semantic information about the point cloud contained in the range image is too simple, which is not enough to support the model’s good performance under adversarial attacks. Therefore, we can enrich the semantic information of the point cloud by obtaining the local geometric features of the points, so that the model has better robustness. In this paper, we improve the security and anti-attack capability of the RangeNet++ model in two ways. One is to add adversarial samples to the training set. We propose a method to obtain perturbation based on the surrounding local information of each point in the range image. Considering the fact that the semantic segmentation model classifies each point based on the feature differences between different objects, our method of generating adversarial samples calculates the average feature difference of each point and all points around it based on the range image. By adding the feature difference between each point and its surrounding points, the feature difference will become blurred, which makes it difficult for the semantic segmentation model to obtain the unique feature values of different objects and ensures the effectiveness of adversarial samples. The other method is designed in such a way that instead of directly extracting local features for each point in the point cloud, we will calculate the difference between the normal vectors of different triangular patches formed by each point in the image and the


surrounding points according to the range image to obtain the local geometric features. Such features reflect the topographic characteristics of the 3D point cloud surface and thereby enrich the semantic information of the point cloud. In summary, the contributions of our work are as follows:

i) A method for generating local features of the point cloud, based on the range image, that reflects the geometric shape of the point cloud.
ii) A method for generating general adversarial samples, based on the range image, that are closer to the real world, and adding them to the training set.
iii) Adversarial attacks against the original RangeNet++ and against the RangeNet++ augmented with our methods.

The rest of this paper is organized as follows. Section 2 presents a brief introduction to related work on local features of point clouds and adversarial examples. In Sect. 3, we describe the methodology of local feature extraction and detail the generation of adversarial samples. In Sect. 4, experimental results and their analysis are presented. Section 5 concludes this paper.

2 Related Work

2.1 Local Features of Point Clouds

Point cloud features are mainly divided into single-point features, local features, and global features. Single-point features only focus on a point's own details and ignore global information, while global features cannot describe the details sufficiently. In contrast, local features strike a balance between these two aspects. Rusu et al. [6, 7] proposed using point feature histograms (PFH) to obtain the shape information of the local surface. Describing point cloud features with this method can extract rich information, but the computational cost is high, especially for feature extraction tasks on large point cloud data. To address this problem, the Fast Point Feature Histograms (FPFH) method [8] reduces the computational complexity of the algorithm, at the expense of some feature information, by simplifying the interconnection characteristics of the PFH center point neighborhood, but it is easily affected by the point cloud sampling density: the feature distribution can be quite different at different sampling scales. The Viewpoint Feature Histogram (VFH) [9] operator uses the angle between the viewpoint direction and the normal at the sampling point as a feature. However, VFH loses the property of rigidly maintaining the relative positional relationship between point clouds and performs poorly in point cloud segmentation and registration. The Splash descriptor [10] describes local features by calculating the angle components between the normal vector of a key point and the normal vectors of all points in its neighborhood. Inspired by these works, we take the points in the range image as the key points, calculate the normal vectors of several triangular patches formed between each point and


other key points in the neighborhood, and take the cosine of each included angle between two normal vectors as a local feature of the point cloud. This feature describes the surface shape of the point cloud in the neighborhood of the key points well, making better use of the morphological features of the point cloud.

2.2 Generation of Adversarial Examples

The key to adversarial sample generation is to find suitable and effective adversarial perturbations. The fast gradient sign method (FGSM) proposed by Goodfellow et al. [11] maximizes the loss function to obtain the corresponding perturbation by back-propagating the gradient. Building on this, the DeepFool method proposed by Moosavi et al. [12] addresses the manually specified perturbation coefficient in FGSM and obtains well-performing adversarial samples by computing the minimum necessary perturbation. Algorithms such as FGSM and DeepFool generate, for each individual sample, an adversarial example with its own perturbation strength. In work based on DeepFool, Moosavi et al. [13] extended it to propose a noise independent of the input image, known as Universal Adversarial Perturbations (UAP), which can corrupt most natural images and acts as a universal adversarial noise that can be trained offline and then used online to interfere with the output of a given model. These methods all rely heavily on the model to be attacked, and adversarial perturbations generated for one model often perform poorly on other models and scale poorly. Xiang et al. [14] first proposed an adversarial sample generation method for 3D point clouds, together with several new algorithms to generate adversarial point clouds for PointNet. However, on the one hand, direct manipulation of the 3D point cloud results in a large amount of computation. On the other hand, it is mainly intended for adversarial sample generation for a single object and does not work well in point cloud segmentation where multiple objects exist. On the basis of the above work, we propose a model-independent adversarial perturbation generation method for the semantic segmentation model. By calculating the difference between each point and its neighboring points in the image, a perturbation is generated that makes the points in the image more similar, so that the boundaries of the objects become blurred during semantic segmentation and an effective adversarial sample is obtained.


Fig. 2. The flow of our model. (A) convert the point cloud into a range image according to the method in RangeNet++, (B) extract local features based on the range image, (C) generate adversarial samples based on the range image, and put the new range image into the network of RangeNet++.

3 Methods

Our work is an improvement of the RangeNet++ model and is carried out on the basis of the range image, so in this section we first review the range image processing component of RangeNet++ and then introduce the local feature extraction method and the adversarial sample generation method. Our model works as follows: (A) convert the point cloud into a range image according to the method in RangeNet++; (B) extract local features based on the range image; (C) generate adversarial samples based on the range image, and then put the new range image into the RangeNet++ network. See Fig. 2 for details.

3.1 Mapping of Points to Range Image

For the points in the point cloud, the information contained in each of them includes the coordinates (x, y, z) and the remission. To obtain accurate semantic segmentation of point clouds, RangeNet++ converts each point cloud into a depth representation. To achieve this, it converts each point $p_i = (x, y, z)$ to spherical coordinates by a mapping $\Pi : \mathbb{R}^3 \rightarrow \mathbb{R}^2$, and finally to image coordinates. We denote the height and width of the range image by $H$ and $W$, and use $(u, v)$ to represent the mapping of each point to the range image; this mapping can then be defined as:

$$\begin{pmatrix} u \\ v \end{pmatrix} = \begin{pmatrix} \frac{1}{2}\left[1 - \arctan(y, x)\,\pi^{-1}\right] W \\ \left[1 - \left(\arcsin(z\,r^{-1}) + f_{down}\right) f^{-1}\right] H \end{pmatrix} \tag{1}$$

where $r = \sqrt{x^2 + y^2 + z^2}$ represents the depth of each point, and $f = f_{up} + f_{down}$ is the vertical field-of-view of the sensor. With this approach, we get a list of $(u, v)$ tuples containing a pair of image coordinates for each $p_i$. For each $p_i$, we extract its distance $r$, its $x$, $y$, and $z$ coordinates, and its remission, and store them in the image, ultimately creating a $[5 \times H \times W]$ tensor.
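The projection in Eq. (1) can be sketched in a few lines of numpy (this is an illustration, not the RangeNet++ implementation); fov_up and fov_down are the vertical field-of-view bounds in radians, treated here as positive magnitudes, and when several points land in the same pixel the closest one is kept, as noted at the start of Sect. 3.2.

```python
import numpy as np

def project_to_range_image(points, remission, H, W, fov_up, fov_down):
    """Project an (N, 3) point cloud to a [5, H, W] range image following
    Eq. (1); channels are (r, x, y, z, remission)."""
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    r = np.sqrt(x ** 2 + y ** 2 + z ** 2)                  # depth of each point
    f = fov_up + fov_down                                  # vertical field of view
    u = 0.5 * (1.0 - np.arctan2(y, x) / np.pi) * W         # horizontal pixel coord
    v = (1.0 - (np.arcsin(z / r) + fov_down) / f) * H      # vertical pixel coord
    u = np.clip(np.floor(u), 0, W - 1).astype(np.int64)
    v = np.clip(np.floor(v), 0, H - 1).astype(np.int64)
    img = np.full((5, H, W), -1.0, dtype=np.float32)
    order = np.argsort(r)[::-1]          # write far points first, so when several
    for c, val in enumerate((r, x, y, z, remission)):      # points share a pixel,
        img[c, v[order], u[order]] = val[order]            # the closest one wins
    return img
```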

3.2 Local Feature Extraction

It should be noted that, after we obtain the mapping of each point onto the range image, multiple points may map to the same pixel; RangeNet++ selects the point with the minimum depth to represent that pixel. To preserve the time complexity advantage of the RangeNet++ model, we do not directly extract local features for each point in the point cloud; instead, we rely on the corresponding points in the range image, whose indices can be used to quickly obtain each point's neighbours, to complete the local feature extraction. Specifically, we first extract the neighbourhood points of each point $p_i$ in the range image in four directions: up, down, left, and right, namely $p_u$, $p_d$, $p_l$, $p_r$. With $p_i$ and the four points around it, we can get four vectors:

$$\begin{cases} n_{ui} = p_u - p_i \\ n_{di} = p_d - p_i \\ n_{li} = p_l - p_i \\ n_{ri} = p_r - p_i \end{cases} \tag{2}$$

Then, we use "$\times$" to denote the vector (cross) product. Meanwhile, we can get four planes, each of which includes $p_i$ and two non-collinear neighbour points: $(p_i, p_l, p_u)$, $(p_i, p_r, p_u)$, $(p_i, p_l, p_d)$, $(p_i, p_r, p_d)$. We use their normal vectors to represent them:

$$\begin{cases} n_1 = n_{li} \times n_{ui} \\ n_2 = n_{ri} \times n_{ui} \\ n_3 = n_{li} \times n_{di} \\ n_4 = n_{ri} \times n_{di} \end{cases} \tag{3}$$

$n_1$, $n_2$, $n_3$, $n_4$ represent the directions of the four planes. Then, we calculate the angles between these normal vectors, using the cosine of each angle to represent it:

$$\begin{cases} \theta_1 = \cos\langle n_1, n_2 \rangle \\ \theta_2 = \cos\langle n_1, n_3 \rangle \\ \theta_3 = \cos\langle n_1, n_4 \rangle \\ \theta_4 = \cos\langle n_2, n_3 \rangle \\ \theta_5 = \cos\langle n_2, n_4 \rangle \\ \theta_6 = \cos\langle n_3, n_4 \rangle \end{cases} \tag{4}$$

Finally, we add these 6 values to the $[5, H, W]$ tensor created in the previous section to get an $[11, H, W]$ tensor and feed it into the RangeNet++ semantic segmentation convolutional network for learning.
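A compact sketch of Eqs. (2)-(4) on a [5, H, W] range tensor (an illustration, not the authors' code); the x, y, z channels are assumed to be channels 1-3, the six angles are taken over all distinct pairs of the four normals, borders wrap around via np.roll, and invalid pixels are not masked.

```python
import numpy as np

def local_geometry_features(img):
    """Six pairwise cosine angles between the four patch normals (Eqs. 2-4),
    computed for every pixel of a [5, H, W] range image; returns [6, H, W]."""
    xyz = img[1:4]                                 # x, y, z channels (assumed)
    up    = np.roll(xyz,  1, axis=1) - xyz         # n_ui = p_u - p_i
    down  = np.roll(xyz, -1, axis=1) - xyz         # n_di = p_d - p_i
    left  = np.roll(xyz,  1, axis=2) - xyz         # n_li = p_l - p_i
    right = np.roll(xyz, -1, axis=2) - xyz         # n_ri = p_r - p_i
    n1 = np.cross(left,  up,   axis=0)             # normals of the four patches
    n2 = np.cross(right, up,   axis=0)
    n3 = np.cross(left,  down, axis=0)
    n4 = np.cross(right, down, axis=0)

    def cos_angle(a, b, eps=1e-8):
        return (a * b).sum(axis=0) / (
            np.linalg.norm(a, axis=0) * np.linalg.norm(b, axis=0) + eps)

    pairs = [(n1, n2), (n1, n3), (n1, n4), (n2, n3), (n2, n4), (n3, n4)]
    return np.stack([cos_angle(a, b) for a, b in pairs])
```

The resulting [6, H, W] array can then be concatenated with the original tensor, e.g. np.concatenate([img, local_geometry_features(img)], axis=0), to form the [11, H, W] input.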

3.3 The Generation of Adversarial Samples

Our work on adversarial sample generation is based on the range image. The basic concept is as follows: first, obtain the difference between each point in the range image and each point in its surrounding neighbourhood, and then multiply it by a perturbation parameter, which determines the strength of the adversarial attack, so as to obtain an adversarial perturbation and generate adversarial samples. The essence of a semantic segmentation model is to find the feature differences between different objects: the smaller these differences are, the more blurred the different objects become, and the harder it is for the semantic segmentation model to play its role. Therefore, in the first step, we calculate the difference between each point $p_i$ in the range image and its surrounding points $p_j$ $(j = 1, 2, \ldots, 8)$:

$$d_j = p_j - p_i \tag{5}$$

After obtaining the difference from each surrounding point, we average these difference values to generate the average difference value $d_i$:

$$d_i = \frac{1}{8} \sum_{j=1}^{8} d_j \tag{6}$$

$d_i$ can be used as an adversarial perturbation for the semantic segmentation model. To ensure that the adversarial samples are effective, we must have some control over the adversarial perturbation so that it cannot be observed. Therefore, we set a perturbation parameter $\alpha$ in the range [0.0, 1.0]. We multiply the difference value obtained in Eq. (6) by $\alpha$ as the final perturbation value:

$$r_i = \alpha \cdot d_i \tag{7}$$

Finally, the perturbation generated by each point is added to the sample, and the corresponding adversarial sample is obtained.
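A minimal sketch of Eqs. (5)-(7) on a range-image tensor (illustrative only; borders wrap around via np.roll, which a real implementation would likely mask).

```python
import numpy as np

def craft_adversarial_range_image(img, alpha):
    """Add alpha times the mean difference to the eight neighbours of every
    pixel (Eqs. 5-7); img is a [C, H, W] tensor, alpha is in [0.0, 1.0]."""
    neighbour_sum = np.zeros_like(img, dtype=np.float64)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dy == 0 and dx == 0:
                continue
            neighbour_sum += np.roll(np.roll(img, dy, axis=1), dx, axis=2)
    d = neighbour_sum / 8.0 - img      # average difference d_i of Eq. (6)
    return img + alpha * d             # perturbation r_i added, Eq. (7)
```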


3.4 Adversarial Attacks Against Models

In the previous subsection, we obtained adversarial examples based on the range image. Using these adversarial examples, we design adversarial attacks against RangeNet++ and against our model. The adversarial attack consists of two parts, which differ in the selection of the perturbation parameter α. In the first part, for each range image, we randomly select the perturbation parameter in the range [0, 1] to achieve a more comprehensive attack. In the second part, we set the same perturbation parameter for every image to conduct a more targeted attack on the model.

Table 1. IoU [%] before and after the adversarial attack (RN++: RangeNet++, LF: Local Feature, AS: Adversarial Samples).

| Class         | RN++ Before | RN++ After | RN++ Diff | RN++ + LF Before | After | Diff | RN++ + LF + AS Before | After | Diff |
|---------------|-------------|------------|-----------|------------------|-------|------|-----------------------|-------|------|
| Car           | 80.2        | 60.1       | 20.1      | 79.1             | 67.1  | 12.0 | 80.2                  | 79.8  | 0.4  |
| Bicycle       | 16.4        | 7.5        | 8.9       | 15.9             | 7.2   | 8.7  | 16.6                  | 12.4  | 4.2  |
| Motorcycle    | 34.5        | 10.6       | 23.9      | 34.2             | 20.3  | 13.9 | 30.6                  | 29.8  | 0.8  |
| Truck         | 3.5         | 1.7        | 1.8       | 2.0              | 0.9   | 1.1  | 4.8                   | 3.5   | 1.3  |
| Other-vehicle | 21.3        | 3.7        | 17.6      | 22.0             | 11.4  | 10.6 | 21.0                  | 20.6  | 0.4  |
| Person        | 15.1        | 10.8       | 4.3       | 12.3             | 8.8   | 3.5  | 13.9                  | 14.1  | −0.2 |
| Bicyclist     | 35.4        | 16.8       | 18.6      | 33.8             | 25.5  | 8.3  | 33.9                  | 33.6  | 0.3  |
| Motorcyclist  | 0.0         | 0.0        | 0.0       | 0.0              | 0.0   | 0.0  | 0.0                   | 0.0   | 0.0  |
| Road          | 89.8        | 73.0       | 16.8      | 91.0             | 77.1  | 13.9 | 90.1                  | 90.3  | −0.2 |
| Parking       | 49.7        | 27.7       | 22.0      | 52.0             | 28.1  | 23.9 | 48.9                  | 48.7  | 0.2  |
| Sidewalk      | 76.2        | 49.6       | 26.6      | 77.9             | 54.7  | 23.2 | 76.6                  | 76.9  | −0.3 |
| Other-ground  | 0.0         | 0.1        | −0.1      | 0.1              | 0.5   | −0.4 | 0.0                   | 0.0   | 0.0  |
| Building      | 76.5        | 60.9       | 15.6      | 77.2             | 73.0  | 4.2  | 77.1                  | 77.0  | 0.1  |
| Fence         | 46.1        | 22.4       | 23.7      | 44.6             | 25.8  | 18.8 | 48.7                  | 48.5  | 0.2  |
| Vegetation    | 75.0        | 50.6       | 24.4      | 76.0             | 63.4  | 12.6 | 75.9                  | 76.0  | −0.1 |
| Trunk         | 37.8        | 34.3       | 3.5       | 39.5             | 37.3  | 2.2  | 32.0                  | 32.6  | −0.6 |
| Terrain       | 61.5        | 38.5       | 23.0      | 64.9             | 45.5  | 19.4 | 64.1                  | 64.1  | 0.0  |
| Pole          | 27.6        | 24.3       | 3.3       | 31.6             | 28.5  | 3.1  | 29.6                  | 29.2  | 0.4  |
| Traffic-sign  | 27.2        | 20.7       | 6.5       | 29.4             | 26.3  | 3.1  | 25.3                  | 25.2  | 0.1  |
| MeanIoU       | 40.7        | 27.0       | 13.7      | 41.2             | 31.7  | 9.5  | 40.5                  | 40.1  | 0.4  |
| MeanAcc       | 83.4        | 67.7       | 16.6      | 85.0             | 74.5  | 10.5 | 84.9                  | 84.9  | 0.0  |


4 Experiments and Results

Dataset: We use SemanticKITTI as the experimental dataset, which is a sub-dataset of KITTI for semantic segmentation. It annotates all sequences in the KITTI Vision Odometry Benchmark and provides dense point-by-point annotations for the full 360° field of view of the automotive LiDAR used in [2]. It includes a total of 22 sequences, of which sequences 00-10 are the training set and sequences 11-21 are the test set.

Parameters: To obtain objective and fair results, we train RangeNet++ and our model with the same parameters. We choose sequences 00-05 as the training set, 07-10 as the test set, and 06 as the validation set. Meanwhile, we train each model for 100 epochs. We employ random perturbation parameters in the first experiment, i.e. α ∈ [0.0, 1.0], and specific perturbation parameters in the second experiment, i.e. α ∈ {0.2, 0.4, 0.6, 0.8, 1.0}. To ensure consistency, we chose the same evaluation metric as in RangeNet++, that is, the mean intersection over union (mIoU) over all classes [6], with the following formula:

\mathrm{mIoU} = \frac{1}{D}\sum_{d=1}^{D}\frac{TP_d}{TP_d + FP_d + FN_d} \qquad (8)

where TP_d, FP_d and FN_d correspond to the number of true positive, false positive, and false negative predictions for class d, and D is the number of classes.

4.1 Adversarial Attacks with Randomly Perturbed Parameters

The first experiment aims to demonstrate the effectiveness of the two methods proposed in this paper, that is, to improve the security and anti-attack capability of RangeNet++ in the face of adversarial attacks. Therefore, we compare the defense ability under adversarial attack of the original RangeNet++, the RangeNet++ with only local features added, and the model containing both local features and adversarial samples. Table 1 shows the changes in IoU of these three models before and after being attacked. As can be seen from Table 1, compared to the original RangeNet++, the RangeNet++ augmented with our methods performs better in the face of adversarial attacks. More specifically, the RangeNet++ that only contains local features improves on the original RangeNet++, and the RangeNet++ that includes both methods shows a larger improvement, which shows that both of our methods are effective in improving the defense capability of RangeNet++.
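As a sanity check, Eq. (8) can be computed directly from per-class true-positive, false-positive and false-negative counts. The sketch below (with hypothetical count arrays) is only meant to make the metric concrete and is not tied to the evaluation code used in the paper.

```python
import numpy as np

def mean_iou(tp: np.ndarray, fp: np.ndarray, fn: np.ndarray) -> float:
    """mIoU over D classes, Eq. (8): average of TP_d / (TP_d + FP_d + FN_d)."""
    denom = tp + fp + fn
    # Guard against classes that never appear (denominator 0); how such classes
    # are handled in the benchmark is an assumption here.
    iou = np.where(denom > 0, tp / np.maximum(denom, 1), 0.0)
    return float(iou.mean())

# Hypothetical counts for a 3-class toy example.
print(mean_iou(np.array([80, 10, 30]), np.array([10, 5, 10]), np.array([10, 5, 20])))
```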


Fig. 3. Experimental results: mIoU with different perturbation parameters.

4.2 Adversarial Attack with Specified Perturbation Parameters

The second experiment demonstrates the effectiveness of adversarial attacks based on the adversarial samples proposed in this paper. Figure 3 shows the variation of mIoU for the three models under different perturbation parameters, which represent adversarial attacks of different strengths. As can be seen from Fig. 3, as the perturbation parameter increases, the mIoU of the three models gradually decreases, which proves the effectiveness of the adversarial samples. At the same time, our method performs better in the face of adversarial attacks of different strengths, which further illustrates that it can improve the defense ability of RangeNet++.

5 Conclusions

In this work, we proposed two methods to improve the security and anti-attack capability of the semantic segmentation model RangeNet++. One was to extract, based on the range image, the local geometric features of each point reflecting the topographic characteristics of the point cloud. The other was to generate adversarial examples against the semantic segmentation model based on the range image and add them to the training set. Through experiments, we demonstrated the effectiveness of these two methods. Moreover, the adversarial attack based on our designed adversarial samples also performed well against the RangeNet++ model.

Acknowledgments. This work was partially supported by the Gansu Provincial Science and Technology Major Special Innovation Consortium Project (No. 21ZD3GA002), the name of the innovation consortium is Gansu Province Green and Smart Highway Transportation Innovation


Consortium, and the project name is Gansu Province Green and Smart Highway Key Technology Research and Demonstration.

References 1. Behley, J., et al.: SemanticKITTI: a dataset for semantic scene understanding of LiDAR sequences. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) (2019) 2. Milioto, A., Vizzo, I., Behley, J., Stachniss, C.: RangeNet++: fast and accurate LiDAR semantic segmentation. In: IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (2019) 3. Weng, H., et al.: Towards understanding the security of modern image captchas and underground captcha-solving services. Big Data Mining Anal. 2(2), 118–144 (2019) 4. Wu, X., Xu, K., Hall, P.: A survey of image synthesis and editing with generative adversarial networks. Tsinghua Sci. Technol. 22(6), 660–674 (2017) 5. Deng, H., Zhang, Y., Li, R., Hu, C., Feng, Z., Li, H.: Combining residual attention mechanisms and generative adversarial networks for hippocampus segmentation. Tsinghua Sci. Technol. 27(1), 68–78 (2022) 6. Rusu, R.B., Marton, Z.C., Blodow, N., et al.: Learning informative point classes for the acquisition of object model maps. In: 2008 10th International Conference on Control, Automation, Robotics and Vision, pp. 643–650. IEEE (2008) 7. Rusu, R.B., Blodow, N., Marton, Z.C., et al.: Aligning point cloud views using persistent feature histograms. In: 2008 IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 3384–3391. IEEE (2008) 8. Rusu, R.B., Blodow, N., Beetz, M.: Fast point feature histograms (FPFH) for 3D registration. In: 2009 IEEE International Conference on Robotics and Automation, pp. 3212–3217. IEEE (2009) 9. Rusu, R.B., Bradski, G., Thibaux, R., et al.: Fast 3D recognition and pose using the viewpoint feature histogram. In: 2010 IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 2155–2162. IEEE (2010) 10. Stein, F., Medioni, G.: Structural indexing: efficient 3-D object recognition. IEEE Trans. Pattern Anal. Mach. Intell. 14(2), 125–145 (1992) 11. Goodfellow, I.J., Shlens, J., Szegedy, C.: Explaining and harnessing adversarial examples. In: International Conference on Learning Representations, California (2015) 12. Moosavi-Dezfooli, S.-M., Fawzi, A., Frossard, P.: DeepFool: a simple and accurate method to fool deep neural networks. In: IEEE Conference on Computer Vision and Pattern Recognition, Nevada, pp. 2574–2582. IEEE (2016) 13. Moosavi-Dezfooli, S.-M., Fawzi, A., Fawzi, O., Frossard, P.: Universal adversarial perturbations. In: IEEE Conference on Computer Vision and Pattern Recognition, Hawaii, pp. 86–94. IEEE (2017) 14. Xiang, C., Qi, C.R., Li, B.: Generating 3D adversarial point clouds. In: 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 9128–9136 (2019). https://doi.org/10.1109/CVPR.2019.00935

Computer Vision for Medical Computing

Ensemble Model of Visual Transformer and CNN Helps BA Diagnosis for Doctors in Underdeveloped Areas

Zhenghao Wei(B)

School of Computer Science and Engineering, Sun Yat-Sen University, Guangzhou 510006, China [email protected]

Abstract. The diagnosis of Biliary Atresia (BA) is still complicated and resource-intensive. Although sonographic gallbladder images can be used as an initial detection tool, the lack of experienced experts prevents BA infants from being treated in time, resulting in liver transplantation or even death. We developed a diagnosis tool based on a ViT-CNN ensemble model to help doctors in underdeveloped areas diagnose BA. It performs better than human experts (88.1% accuracy versus 87.4%, 0.921 AUC versus 0.837) and still has acceptable performance on severely noised images photographed by smartphone, providing doctors in clinical facilities with outdated ultrasound instruments a simple and feasible way to diagnose BA with our online tool.

Keywords: Biliary atresia · Visual transformer · Medical image processing · Ensemble model

1 Introduction

Biliary atresia (BA) is a pediatric disease that affects both the intrahepatic and extrahepatic bile ducts, leading to pathological jaundice and liver failure in early infancy [1,2]. Though BA only has a prevalence of about 1 in 5,000-19,000 infants worldwide [3,4], it is the most common cause of liver transplantation in infants below 1 year old [5]. If not treated in time, the disease rapidly progresses to end-stage liver cirrhosis, and liver transplantation becomes necessary. Receiving Kasai portoenterostomy (KPE) surgery before the age of 2 months can largely extend the infant's native liver survival time [6]; therefore, diagnosis at an early stage is essential. However, it is still hard to distinguish BA from common causes of cholestasis [7], and diagnostic methods for BA, including assay of serum matrix metalloproteinase-7 and screening of the direct bilirubin concentration [7,8], are not feasible for medical facilities with underdeveloped conditions, especially in rural regions. In rural areas of developing Asian countries like China and India, ultrasound examination is the main approach to diagnosing BA in jaundiced infants. Though the method has a specificity over 90% [9], the lack of experienced doctors who are


capable of making a diagnosis and performing the operation in undeveloped regions delays appropriate treatment, and the patient may miss the optimal therapeutic window. Thus, an AI-assisted diagnosis mechanism is vital to enable clinical staff in undeveloped regions to make a timely preliminary diagnosis before the condition worsens. Deep learning methods, specifically convolutional neural networks (CNNs), have been widely used in biomedical image analysis and have been shown to be superior or comparable to human experts in tasks such as diagnosis of lung cancer or Covid-19 [10,11]. In 2021, teams from Sun Yat-Sen University (SYSU) and The First Affiliated Hospital of SYSU developed a deep learning model based on deep residual networks to help diagnose BA, as well as a smartphone application that can help doctors, especially in rural areas, diagnose rare diseases like BA [12]. Though the model outperforms human experts in both accuracy and sensitivity, the worst case is still unconsidered: when the sonographic system is not connected to the Internet and exporting images is impermissible, taking a photo with a smartphone may be the simplest solution. Moiré patterns and other noise can catastrophically reduce the accuracy of the classification model, and CNNs have proved effective in denoising tasks such as moiré photo restoration [13,14]. Visual Transformer (ViT) is another state-of-the-art (SOTA) deep learning model for image processing, which performs better than CNNs on many tasks [15] and has also been utilized in Biomedical Image Processing (BIP) tasks such as MRI segmentation [16] and Covid-19 diagnosis [17]. We combined the two techniques to exploit both the high accuracy of ViT models and the denoising ability of CNNs. In summary, it is of great necessity to develop a BA diagnosis system that assists inexperienced clinical staff in rural areas in making a preliminary diagnosis, and the system should be able to cope with noise such as moiré patterns. In this study, we developed an ensemble deep learning model (EDLM) which contains four ViT models and two CNN models. It outperforms human experts in normal cases and performs better on noised data than ensemble CNN models or ensemble ViT models. Our main contributions can be summarized as follows:

1. We proposed an EDLM of both ViT and CNN models to diagnose BA in infants aged below 60 days. The model is an image classification model that classifies sonographic gallbladder images, and it contains 4 ViT models (ViT-base, Swin Transformer-base, CvT-24 and CSwin Transformer-base) and 2 CNN models (ResNet-152 and SENet). Our model has better accuracy and AUC than human experts (88.1% accuracy versus 87.4%, 0.921 AUC versus 0.837).

2. Due to the imbalanced data (as BA is a rare disease, BA-positive samples are fewer), we compared several strategies to enhance the specificity and sensitivity of the model.

3. We discussed several designs of the EDLM, compared the ViT-CNN ensemble model with an ensemble CNN model, an ensemble ViT model and single models, as well as two kinds of voting strategy, and showed that our mechanism performs better.


4. In the noised case, our ensemble model evidently outperforms the others. We also adopted several few-shot tuning methods to enhance the model's ability to classify noised sonographic images photographed by cellphone.

2 Related Works

2.1 Conventional BA Diagnosis Methods

Because BA is such a rare and complicated disease, and infantile patients cannot undergo surgical exploration, an accurate diagnosis of BA is still a challenge. There are several clinical symptoms of BA, including persistent jaundice, pale stools, and dark urine [3], owing to the conjugated hyperbilirubinemia caused by cholestasis. However, neonatal cholestasis can be caused by many other diseases. Screening the direct bilirubin concentration [9,18] or stool color [4] can yield sensitivities over 97%, and serum gamma-glutamyl transferase (GGT) can be considered to distinguish BA from PFIC and bile acid synthesis or metabolism disorders, but mechanical bile duct obstruction, paucity of interlobular bile ducts or cystic fibrosis are still possible [19]. The specificity of the methods above is not satisfying (meaning other diseases still cannot be excluded). BA-suspected infants need further examinations including ultrasonography (US), hepatobiliary scintigraphy and magnetic resonance cholangiography (MRCP). Hepatobiliary scintigraphy has a specificity as low as 70.4% and is time-consuming and involves radiation [20]. MRCP is limited by the small body size of infants and is also deprecated because it requires sedation [21]. Therefore, US has been recommended as the preferred imaging tool for the initial detection of BA [3]. Several direct features reflecting abnormalities of the biliary system can serve as criteria for BA, including gallbladder abnormalities [20,22], the Triangular Cord sign [23,24], and porta hepatis cyst [25,26], but no single feature can guarantee extremely high specificity and sensitivity at the same time. Some infants may still have equivocal US results, in which case US-guided percutaneous cholecystocholangiography (PCC) may be needed [27]. In conclusion, the diagnosis of BA is a process full of complexity and uncertainty, which is why the human experts mentioned in Zhou, W. et al. [12] have an AUC lower than 0.85 (Fig. 1).

2.2 CNN Models

Convolutional neural networks (CNNs) have been the mainstream of the Computer Vision (CV) domain since the invention of AlexNet [29], though they came up decades earlier [30]. In the last decade, deeper and more effective CNN models have been proposed and have achieved enormous success, such as VGG [31], GoogleNet [32], ResNet [33], DenseNet [34], PNASNet [35], and EfficientNet [36]. ResNet, which introduced the residual mechanism, and its variants are still solid backbone architectures of CV. Se-ResNet, or SENet, is an evolved model with a new mechanism called Squeeze-and-Excitation (SE) [37] that allows the network to perform feature recalibration.


Fig. 1. A sketch map of the conventional BA diagnosis process. A more detailed diagnostic decision flow chart can be found in [28].

That means it can use global information to reemphasize valuable local features while restraining the less important ones. It can be likened to an attention mechanism, which has a global receptive field over its input. Therefore, it performs better on classification tasks, and it was adopted as the main network by the BA-diagnosis application of Zhou, W. et al. [12].

2.3 Visual Transformer Models

Visual Transformer (ViT), first proposed by Google in 2021 [15], is the SOTA technique in image classification tasks. Since 2017, the Transformer has been widely used in natural language processing (NLP) tasks [38], and pretrained models based on the Transformer architecture such as BERT [39], GPT [40] and UniLM [41] are still the SOTA techniques. The Transformer consists of several encoder and decoder blocks based on the self-attention mechanism, which provides the capability to capture non-local features and encode dependencies between distant pixels [42]. In short, CNNs perform better at extracting local features, while the Transformer concentrates on non-local features. ViT imported the Transformer method into CV and proposed an approach to embed image patches into a sequential input similar to the inputs of NLP models. It outperforms SOTA CNN models on popular image classification benchmarks [15], but it does not have equivalent performance on other downstream tasks such as object detection and instance segmentation, due to its lack of inductive bias compared to CNNs.


DeiT is a follow-up model of ViT which did not change the network architecture but imported a Knowledge Distillation method to transfer inductive bias from a CNN teacher model to the student model [43]. Swin Transformer is one of the best-performing CV backbones across multiple tasks [44]. It proposed an improved sliding-window method (shifted windows) to eliminate ViT's shortcoming of a fixed scale. Swin Transformer utilized both the hierarchical design and the shifted-window approach to surpass the former state-of-the-art models in several tasks while reducing the computational complexity. Compared to ViT, it holds fewer parameters and has a better speed-accuracy trade-off. CvT is a combination of ViT and CNN; like Swin Transformer, it contains a hierarchy of Transformers to capture features at different receptive fields, but it utilizes convolutional layers to produce the token embedding [45]. These changes introduce desirable properties of CNNs, such as inductive bias. CSwin Transformer is another evolved ViT model proposed by Microsoft, like Swin Transformer [46]. Instead of the Shifted Window Attention (SWA) mechanism of Swin Transformer, it introduced a Cross-Shaped Window (CSwin) mechanism. SWA allows information exchange between nearby windows, but the receptive field is still enlarged quite slowly, so Swin Transformer needs a dozen Transformer blocks in its pyramid structure. CSwin has a lower computational cost while achieving a global receptive field in fewer steps. CSwin Transformer remains among the highest-performing CV backbone models.

3 Methods and Data Materials

3.1 EDLM of CNNs

According to the Condorcet Jury Theorem [47], an ensemble classification model made up of individual models whose probabilities of being correct are greater than 1/2 has a probability of being correct higher than the individual probabilities [48]. Hybrid or ensemble machine learning models have been popular for decades and can trace their history back to random forests [49] and bootstrap aggregating [50]. Our first attempt is to train k different models on k-fold split training sets. That means we randomly separate the internal dataset into k (e.g., five) complementary subsets; then, in each of the k cases, a different single subset is selected as the internal validation set, and the other k − 1 subsets are combined as the training set. On each training set, a CNN model is trained, and the different models are evaluated on different validation sets (the one remaining subset). This is similar to k-fold cross validation, but every model is kept (in k-fold cross validation, only the best model is selected and the others are abandoned). The ensemble model predicts the label of test data by averaging the outputs (the predicted score of each category) of the k models, and then applying SoftMax to identify the final prediction label (Fig. 2).


Fig. 2. The framework of an ensemble model. The upper part shows that the training set is divided into complementary subsets (i.e., they are disjoint and their union is the whole set), and only one of the subsets is selected as the validation set in the training of each model. Differing from the classic Condorcet Jury setting, neither simple majority rule nor unanimity rule is adopted; instead we calculate the average of the SoftMax outputs of the models (a real value instead of a 0–1 value).
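The training scheme sketched in Fig. 2 — k complementary subsets, each base model validated on a different held-out subset and trained on the remaining k − 1 — can be expressed with scikit-learn's KFold. The snippet below is a schematic outline under these assumptions; the train_one_model helper is hypothetical.

```python
import numpy as np
from sklearn.model_selection import KFold

def build_ensemble(dataset_indices, k=5, seed=0):
    """Train k base models on complementary train/validation splits (cf. Fig. 2)."""
    models = []
    kfold = KFold(n_splits=k, shuffle=True, random_state=seed)
    for fold, (train_idx, val_idx) in enumerate(kfold.split(dataset_indices)):
        # Each base model keeps its own validation subset; none of the k models is discarded.
        model = train_one_model(train_idx, val_idx)   # hypothetical training routine
        models.append(model)
    return models
```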

To construct the ensemble model, several CNN architectures were considered, including ResNet [33], DenseNet [34], SeNet [37] and EfficientNet [36]. Zhou, W. et al. [12] utilized SeNet-154 as the base classifier of their EDLM, but we found it not the best choice, because the sonographic images are de facto greyscale images, while the motivation of SeNet is to enhance ResNet from the aspect of channel relationships. Hence the SE mechanism may not function, and as a result SeNet-154 did not perform better than ResNet-152 (83.57% versus 85.60% accuracy). Consequently, we adopted EfficientNet-B6 as the base CNN model.

3.2 ViT-CNN EDLM

Though an ensemble of models with a single architecture can indeed promote performance, models with the same architecture have extremely high correlation, which conflicts with the independence assumption of the Condorcet Jury Theorem [47]. Positive correlation between base models decreases the accuracy of the ensemble model, whereas diversity of models conversely improves the accuracy [51]. Therefore, choosing models with entirely different architectures may perform better than a single-architecture ensemble model. As concluded in Sect. 2.3, the Visual Transformer is becoming the new backbone architecture of computer vision, and it has a very different mechanism from CNNs. The convolutional mechanism endows CNNs with translation invariance and inductive bias, while the Transformer gives ViT models a stronger ability to extract global features within a patch, and they gain inductive bias by other approaches. Thus, if we combine CNNs with ViT models, they may concentrate on very different patterns and operate in different ways, and the diversity of the ensemble model will be guaranteed. Unlike the EDLM in Sect. 3.1, since the base models vary in architecture, there is no need to ensure that models with different architectures are trained on different datasets to produce diversity. Thus, models of the same architecture share a group of k-fold generated sets, while the training sets of models with different architectures have no relation to each other. The prediction rule is still the averaging strategy of Sect. 3.1, as sketched below. Several Visual Transformer methods were considered in our attempts, including ViT (the original Google version), DeiT, Swin Transformer, CvT, and CSwin Transformer. As is shown below, the single CSwin Transformer performed best among all these models (Fig. 3).
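A minimal PyTorch sketch of this prediction rule — averaging the SoftMax outputs of heterogeneous base models (any mixture of ViT and CNN classifiers) and taking the argmax — is given below; the list of trained base models is assumed to be available and the snippet is illustrative rather than the exact implementation.

```python
import torch

@torch.no_grad()
def ensemble_predict(models, images):
    """Average SoftMax scores of heterogeneous base models, then take the argmax."""
    probs = []
    for model in models:                      # e.g., Transformer and CNN classifier instances
        model.eval()
        logits = model(images)                # [batch, num_classes]
        probs.append(torch.softmax(logits, dim=1))
    avg = torch.stack(probs).mean(dim=0)      # average predicted score per class
    return avg.argmax(dim=1)                  # final BA / non-BA label
```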

3.3 Datasets

It should be mentioned that, besides the lack of experienced doctors, the sonographic machines are usually not connected to the Internet, and exporting images is not feasible. Therefore, taking a photo with a cellphone and uploading it to a remote diagnosis system might be a temporary expedient. Thus, we adopted two datasets: besides the sonographic gallbladder images provided at https://zenodo.org/record/4445734, we generated more noised pictures, photographed by several doctors with different models of smartphones. Differing from the experiments in Zhou, W. et al. [12], we want to explore the generalization ability of the model further; if we only take one type of device into account, there might be some unconsidered bias. The first dataset consists of a 3705-image internal training dataset (the internal validation set is also split from it) and an 842-image external validation dataset; all the images are segmented. The second dataset has 24150 pictures in the internal dataset and 840 in the external validation one. 3659 pictures of the internal set are original sonographic gallbladder images, and the others are reproduced pictures taken by smartphone, based on the aforesaid original images.


Fig. 3. The framework of a ViT-CNN EDLM (i.e., two architectures, namely CSwin Transformer and EfficientNet). The difference is that, for each architecture, several models are trained independently under the k-fold cross-validation rule. The models of one architecture share the whole training set, because there is enough diversity from the different architectures, and models with the same architecture are still expected to learn the data patterns.

Images in the internal set were taken by five doctors, and two newly arrived doctors produced the external data; this is because we expect the model to gain adequate generalization ability and immunity against random noise (Fig. 4).

3.4 Data Processing

What should be considered first is that the positive samples are far fewer than the negative ones, because BA is such a rare disease and, as mentioned, has a low incidence of about 1 in 5,000-19,000 infants [3]. Only about 23% of the samples are BA-positive in both datasets, so some measures should be applied to solve the imbalance problem. We tried several approaches, such as resampling, down-sampling, modifying class weights in the loss function, and k-fold cross validation. As a result, we found that resampling is the most practical strategy. Because we have adopted an EDLM method, k-fold cross validation is redundant; down-sampling unavoidably loses precision; changing class weights might be an efficient strategy, but we found it performs worse than simply resampling. This may be attributed to the ensemble setup: in the training set of a single model some key samples of a pattern may be missing, while with resampling the training set is more likely to contain all key samples.
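One common way to realize such a resampling strategy in PyTorch is a WeightedRandomSampler that draws BA-positive images more often. The sketch below assumes a binary label list and is not the authors' exact pipeline.

```python
import torch
from torch.utils.data import DataLoader, WeightedRandomSampler

def make_balanced_loader(dataset, labels, batch_size=32):
    """Oversample the minority (BA-positive) class so each batch is roughly balanced."""
    labels = torch.as_tensor(labels)
    class_counts = torch.bincount(labels)              # e.g., ~23% positive samples
    sample_weights = 1.0 / class_counts[labels].float()
    sampler = WeightedRandomSampler(sample_weights, num_samples=len(labels), replacement=True)
    return DataLoader(dataset, batch_size=batch_size, sampler=sampler)
```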


Fig. 4. a: an example from the second dataset; the left image is an original sonographic gallbladder image of a non-BA infant, and the right one is the reproduced picture of it. b: a BA patient, arranged the same as a, with the original image on the left and the reproduction on the right. c: a sample from the first dataset; differing from the second one, the pictures are pre-segmented. d: a smartphone-reproduced picture from Zhou, W. et al. [12]. e: a smartphone-reproduced picture from the second dataset. The noise in d is clearly not severe, so the performance on e better illustrates the generalization ability of the model.

Secondly, if we aim to improve the generalization ability of the model, even the 24k training dataset is not sufficient, so adopting data augmentation techniques is necessary. Random rotation and random horizontal flip are adopted, resize and random crop are also applied to the original set, and all images are converted to greyscale, since color has no significance in sonographic images (Fig. 5). Whether the training data need masks is also worth discussing. At first, the hospital provided manually annotated images with a mask showing the focus, but we eventually found the mask to be an obstacle to improving model performance. From our perspective, the mask quality is unsatisfactory, but the doctors who made the annotations are not to blame. As mentioned in Sect. 2.3, multiple features are considered in the US diagnosis of BA; when a doctor labels the images, he may focus on only one of those features and omit others. Thus, ignoring the mask information and simply training models on the original data can perform better.
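The augmentations described above map directly onto torchvision transforms. The concrete angles, sizes and probabilities below are illustrative assumptions, not the values used in the paper.

```python
from torchvision import transforms

# Rotation, horizontal flip, resize + random crop, and greyscale conversion,
# mirroring the augmentation list in the text (parameter values are assumed).
train_transform = transforms.Compose([
    transforms.Grayscale(num_output_channels=3),   # color carries no information in sonograms
    transforms.Resize(256),
    transforms.RandomCrop(224),
    transforms.RandomRotation(degrees=15),
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.ToTensor(),
])
```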

4 Experiments and Results

4.1 Experimental Settings

There are two tasks in the experiment: one is diagnosing BA on original sonographic gallbladder images, and the other is on noised pictures photographed by smartphone. We solved the two problems step by step. Firstly, we trained the classification model for the original images.


Fig. 5. Samples after data augmentation.

Models with different architectures were trained on different 5-fold split datasets; that means for one architecture, five models were trained at once, the training set was divided into five subsets as discussed in Sect. 3.1, and for each model a different subset was used as the validation set while the remaining four were used as the training cohort. For every architecture, the models were trained by transfer learning: the network loaded weights pretrained on the ImageNet-1k classification dataset as initial weights, and we fine-tuned the model on our training cohort. The loss function was Cross Entropy loss; we also tried weighted Cross Entropy as mentioned in Sect. 3.4, but found it no better than resampling. The model was trained for 200 epochs and evaluated on the internal validation set every 5 epochs. If the evaluation loss did not decrease for 40 epochs, training was forced to an early termination, and the model parameters with the lowest loss (i.e., from 40 epochs before) were saved. At the step of comparing different architectures, as in k-fold cross validation, we chose the model with the best performance on the test sets in every k-fold training as the representative. After we chose the base architectures of the ensemble model, the five models were adopted together to construct the ensemble model. To keep the number of models in the different ensemble models consistent, all the ensemble models in the comparison consist of six base models. The ViT-CNN ensemble model has three Swin Transformer-Base and three EfficientNet-B6 models, trained on a 3-fold cross training cohort. For the second problem, we also utilized transfer learning, but here the weights of the previously trained ensemble model were loaded. For each base model, we fine-tuned the model on the noised data, with the augmentation mentioned in Sect. 3.4, for 30 epochs, evaluating every 5 epochs, and the best model in evaluation was selected.
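The training schedule described above (evaluate every 5 epochs, stop if the validation loss has not improved for 40 epochs, keep the best checkpoint) can be outlined as below; train_one_epoch and evaluate are hypothetical helpers and the loop is a sketch, not the authors' code.

```python
import copy

def fine_tune(model, train_loader, val_loader, max_epochs=200, eval_every=5, patience=40):
    """Fine-tune an ImageNet-pretrained model with loss-based early stopping."""
    best_loss, best_state, epochs_since_best = float("inf"), None, 0
    for epoch in range(1, max_epochs + 1):
        train_one_epoch(model, train_loader)             # hypothetical helper
        if epoch % eval_every != 0:
            continue
        val_loss = evaluate(model, val_loader)           # hypothetical helper
        if val_loss < best_loss:
            best_loss, epochs_since_best = val_loss, 0
            best_state = copy.deepcopy(model.state_dict())   # keep the lowest-loss checkpoint
        else:
            epochs_since_best += eval_every
            if epochs_since_best >= patience:            # no improvement for 40 epochs
                break
    model.load_state_dict(best_state)
    return model
```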

4.2 Results

As shown below, among the single models the ViT family outperforms the CNNs, and CSwin Transformer is the best single model. All the ensemble models perform better than the human experts, but the ViT-CNN ensemble model does best. CSwin Transformer has 87.86% accuracy on the original images (precision 88.07%, recall 87.74%, area under the receiver operating characteristic curve 90.78%) and 80.5% accuracy on the smartphone-photographed images (precision 80.49%, recall 81.32%, AUC 80.83%). Among the ensemble models, the ViT-CNN ensemble model has 88.11% accuracy on the original images (precision 88.35%, recall 87.98%, AUC 92.90%) and 81.11% accuracy on the smartphone-photographed images (precision 82.33%, recall 81.71%, AUC 81.04%) (Figs. 6 and 7).


Fig. 6. Comparison between single models.

Fig. 7. The areas under the receiver operating characteristic curve (AUC) of ViT-CNN ensemble model.

Unlike the human experts, the deep learning models have very similar precision and recall, while the human experts have a recall obviously lower than their precision. This phenomenon indicates that our rebalancing strategy works: the deep learning models did not perform worse on positive samples than on negative ones. Human experts, however, are still limited by the rarity of the disease; they cannot recognize the atypical samples and therefore have a lower sensitivity than the machine.

5 Discussion

5.1 Comparison Between Methods

As has been discussed in Sect. 2.3, ViT models have been shown to be better backbone architectures than CNN models like ResNet or EfficientNet.

Fig. 8. Comparison between different ensemble models and the human expert (accuracy / precision / recall / AUC): Human expert 87.40% / 91.00% / 76.40% / 83.70%; ViT-CNN ensemble 88.11% / 88.35% / 87.98% / 92.90%; CNN ensemble 87.89% / 87.92% / 88.10% / 92.11%; CSwinT ensemble 87.93% / 88.04% / 87.86% / 91.75%.

In the test of different architectures, ViT models generally did better, and the order of performance is: CSwin Transformer, Swin Transformer, DeiT, ViT, CNNs. CvT is an exception: we did not find a version of CvT with an amount of parameters comparable to ViT-base (about 80M), so it is predictable that it performs worse than ResNet152 and the other models (Fig. 9).

Fig. 9. Comparison taking the difference in amount of parameters between models into account (accuracy / precision / recall / AUC / parameters in millions): ResNet50 0.8336 / 0.8323 / 0.8357 / 87.85% / 25.61; CvT224 0.8413 / 0.8423 / 0.8405 / 87.73% / 31.6; ResNet152 0.8566 / 0.8574 / 0.856 / 90.17% / 60.36; ViT 0.857 / 0.861 / 0.8548 / 91.77% / 86.42.

However, this does not necessarily mean that an ensemble ViT model will outperform an ensemble CNN model. Though single CNN models perform worse than ViT models, as shown in Fig. 8, the ensemble CNN model has comparable or better performance than the ensemble ViT model. This can be interpreted as CNNs being more sensitive to the training data in classification tasks. In each fold of the training cohort, the performance of the CNNs is limited by the lack of some images, but in the


ensemble case the diversity produced by different data improves the performance of the ensemble model. The ViT-CNN ensemble model has the best performance, as we expected, with the same number of base models as the other ensemble models. The diversity of model architectures may account for the better performance, as it matches the independence assumption of the Condorcet Jury Theorem.

5.2 Ensemble Strategy: Simple Majority or Average

Though the ensemble model has become a popular technique for improving the upper bound of model performance, few investigators take the difference between the classic Condorcet Jury setting and an EDLM into account. Assume an ensemble model is made up of several individual models whose prediction scores for an input are independent and identically distributed (i.e., suppose the score is a random variable with a known prior probability distribution), e.g., a Gaussian distribution N(μ, σ²); then the difference between the two kinds of voting strategy becomes obvious. In the simple majority case, the probability that a single model misclassifies a positive sample (i.e., μ > 0.5 and the prediction score is less than 0.5) is

\Pr(\text{prediction score} < 0.5) = \int_{0}^{0.5} \frac{1}{\sqrt{2\pi}\,\sigma} e^{-\frac{(x-\mu)^2}{2\sigma^2}} \, dx, \quad \mu > 0.5. \qquad (1)

For convenience, we write p_1 = \Pr(\text{prediction score} < 0.5) and denote the prediction score of a single model by S_i. Suppose the ensemble model has k base models; the voting game then turns into a Bernoulli experiment b(k, p_1). The probability that majority voting predicts incorrectly is the probability that fewer than k/2 of the scores S_i (0 < i ≤ k, k odd) exceed 0.5:

\Pr(\text{simple majority voting is wrong}) = p_1^{k} + C_k^{1}\, p_1^{k-1}(1-p_1) + \ldots + C_k^{\frac{k-1}{2}}\, p_1^{\frac{k+1}{2}} (1-p_1)^{\frac{k-1}{2}} \qquad (2)

The average strategy is much simpler. The average output of the ensemble model can be represented as \bar{X} = (X_1 + X_2 + \ldots + X_k)/k with X_i \sim N(\mu, \sigma^2), all X_i independent and identically distributed. Thus \bar{X} \sim N(\mu, \sigma^2/k), which means that the more base models there are, the more accurate the prediction is (Fig. 10).

Let us quantitatively compare these two strategies. Assume the output of a base model follows N(0.6, 0.1²) (i.e., μ = 0.6, σ = 0.1) and there are k = 5 base models; then p_1 = 15.87% can easily be calculated (a 1σ deviation). Then \Pr(\text{simple majority voting is wrong}) = 0.0311, and the probability that the averaged score is wrong is the probability of a \sqrt{5}\,\sigma deviation,

\Pr(\text{average score is wrong}) = \int_{0}^{0.5} \frac{\sqrt{k}}{\sqrt{2\pi}\,\sigma} e^{-\frac{k(x-\mu)^2}{2\sigma^2}} \, dx = 0.0127 \qquad (3)


Fig. 10. PDF of X_i and \bar{X}.

It is obvious that the average strategy is less likely to err than the simple majority strategy.
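The two error probabilities quoted above (0.0311 for simple majority voting and 0.0127 for averaging, with μ = 0.6, σ = 0.1 and k = 5) can be reproduced numerically; the sketch below uses SciPy for the normal CDF and is only a verification aid.

```python
from math import comb, sqrt
from scipy.stats import norm

mu, sigma, k = 0.6, 0.1, 5
p1 = norm.cdf(0.5, loc=mu, scale=sigma)          # single model wrong: about 0.1587

# Eq. (2): the majority of the k models is wrong, i.e. at least (k+1)/2 wrong votes.
p_majority_wrong = sum(comb(k, j) * p1**j * (1 - p1)**(k - j)
                       for j in range((k + 1) // 2, k + 1))

# Eq. (3): the averaged score falls below 0.5, with Xbar ~ N(mu, sigma^2 / k).
p_average_wrong = norm.cdf(0.5, loc=mu, scale=sigma / sqrt(k))

print(round(p_majority_wrong, 4), round(p_average_wrong, 4))   # about 0.031 and 0.0127
```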

References 1. Wang, J., et al.: Liver immune profiling reveals pathogenesis and therapeutics for biliary atresia. Cell. 183(7), pp. 1867–1883.e26 (2020). https://www.sciencedirect. com/science/article/pii/S0092867420314550 2. Asai, A., Miethke, A., Bezerra, J.: Pathogenesis of biliary atresia: defining biology to understand clinical phenotypes. Nat. Rev. Gastroenterol. Hepatol. 12, 05 (2015) 3. Hartley, J.L., Davenport, M., Kelly, D.A.: Biliary atresia. The Lancet 374(9702), 1704–1713 (2009) 4. Hsiao, C.-H., et al.: Universal screening for biliary atresia using an infant stool color card in Taiwan. Hepatology 47(4), 1233–1240 (2008) 5. Ohi, R.: Surgical treatment of biliary atresia in the liver transplantation era, pp. 1229–1232 (1998) 6. Serinet, M.-O., et al.: Impact of age at Kasai operation on its results in late childhood and adolescence: a rational basis for biliary atresia screening. Pediatrics 123(5), 1280–1286 (2009) 7. Lertudomphonwanit, C., et al.: Large-scale proteomics identifies MMP-7 as a sentinel of epithelial injury and of biliary atresia. Sci. Transl. Med. 9(417), eaan8462 (2017) 8. Harpavat, S., et al.: Diagnostic yield of newborn screening for biliary atresia using direct or conjugated bilirubin measurements. Jama 323(12), 1141–1150 (2020) 9. Humphrey, T.M., Stringer, M.D.: Biliary atresia: US diagnosis. Radiology 244(3), 845–851 (2007) 10. Qi, X., et al.: Machine learning-based CT radiomics model for predicting hospital stay in patients with pneumonia associated with SARS-COV-2 infection: a multicenter study. MedRxiv (2020) 11. Ardila, D., et al.: End-to-end lung cancer screening with three-dimensional deep learning on low-dose chest computed tomography. Nat. Med. 25(6), 954–961 (2019)


12. Zhou, W., et al.: Ensembled deep learning model outperforms human experts in diagnosing biliary atresia from sonographic gallbladder images. Nat. Commun. 12(1), 1–14 (2021) 13. Zhang, K., Zuo, W., Chen, Y., Meng, D., Zhang, L.: Beyond a gaussian denoiser: Residual learning of deep CNN for image denoising. IEEE Trans. Image Process. 26(7), 3142–3155 (2017) 14. Sun, Y., Yu, Y., Wang, W.: Moir´e photo restoration using multiresolution convolutional neural networks. IEEE Trans. Image Process. 27(8), 4160–4172 (2018) 15. Dosovitskiy, A., et al.: An image is worth 16x16 words: Transformers for image recognition at scale. arXiv:2010.11929 (2021) 16. Gu, Y., Piao, Z., Yoo, S.J.: STHardNet: Swin transformer with HardNet for MRI segmentation. Appl. Sci. 12(1), 468 (2022) 17. Zhang, L., Wen, Y.: A transformer-based framework for automatic covid19 diagnosis in chest CTS. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 513–518 (2021) 18. Harpavat, S., Garcia-Prats, J.A., Shneider, B.L.: Newborn bilirubin screening for biliary atresia. N. Engl. J. Med. 375(6), 605–606 (2016) 19. Feldman, A.G., Sokol, R.J.: Recent developments in diagnostics and treatment of neonatal cholestasis. In: Seminars in Pediatric Surgery, vol. 29, no. 4, p. 150945. Elsevier (2020) 20. Zhou, L., Shan, Q., Tian, W., Wang, Z., Liang, J., Xie, X.: Ultrasound for the diagnosis of biliary atresia: a meta-analysis. Am. J. Roentgenol. 206(5), W73– W82 (2016) 21. Kim, M.-J., et al.: Biliary atresia in neonates and infants: triangular area of high signal intensity in the porta hepatis at t2-weighted MR cholangiography with us and histopathologic correlation. Radiology 215(2), 395–401 (2000) 22. Farrant, P., Meire, H., Mieli-Vergani, G.: Ultrasound features of the gall bladder in infants presenting with conjugated hyperbilirubinaemia. Br. J. Radiol. 73(875), 1154–1158 (2000) 23. Lee, H.-J., Lee, S.-M., Park, W.-H., Choi, S.-O.: Objective criteria of triangular cord sign in biliary atresia on us scans. Radiology 229(2), 395–400 (2003) 24. Park, W.-H., Choi, S.-O., Lee, H.-J., Kim, S.-P., Zeon, S.-K., Lee, S.-L.: A new diagnostic approach to biliary atresia with emphasis on the ultrasonographic triangular cord sign: comparison of ultrasonography, hepatobiliary scintigraphy, and liver needle biopsy in the evaluation of infantile cholestasis. J. Pediat. Surg. 32(11), 1555–1559 (1997) 25. Koob, M., Pariente, D., Habes, D., Ducot, B., Adamsbaum, C., Franchi-Abella, S.: The porta hepatis microcyst: an additional sonographic sign for the diagnosis of biliary atresia. Eur. Radiol. 27(5), 1812–1821 (2017) 26. Caponcelli, E., Knisely, A.S., Davenport, M.: Cystic biliary atresia: an etiologic and prognostic subgroup. J. Pediat. Surg. 43(9), 1619–1624 (2008) 27. Zhou, L.-Y., et al.: Percutaneous us-guided cholecystocholangiography with microbubbles for assessment of infants with us findings equivocal for biliary atresia and gallbladder longer than 1.5 cm: a pilot study. Radiology 286(3), 1033–1039 (2018) 28. Zhou, W., Zhou, L.: Ultrasound for the diagnosis of biliary atresia: from conventional ultrasound to artificial intelligence. Diagnostics 12(1), 51 (2021) 29. Krizhevsky, A., Sutskever, I., Hinton, G.E.: Imagenet classification with deep convolutional neural networks. In: Advances in Neural Information Processing Systems, vol. 25 (2012)


30. LeCun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proc. IEEE 86(11), 2278–2324 (1998) 31. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. CoRR, abs/1409.1556 (2015) 32. Szegedy, C., et al.: Going deeper with convolutions. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 1–9 (2015) 33. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016) 34. Huang, G., Liu, Z., Van Der Maaten, L., Weinberger, K.Q.: Densely connected convolutional networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4700–4708 (2017) 35. Liu, C., Zoph, B., Neumann, M., Shlens, J., Hua, W., Li, L.-J., Fei-Fei, L., Yuille, A., Huang, J., Murphy, K.: Progressive neural architecture search. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11205, pp. 19–35. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01246-5 2 36. Tan, M., Le, Q.: Efficientnet: rethinking model scaling for convolutional neural networks. In: International Conference on Machine Learning, pp. 6105–6114. PMLR (2019) 37. Hu, J., Shen, L., Sun, G.: Squeeze-and-excitation networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7132–7141 (2018) 38. Vaswani, A., et al.: Attention is all you need. In: Advances in Neural Information Processing Systems, vol. 30 (2017) 39. Kenton, J. D. M.-W. C., Toutanova, L.K.: BERT: pre-training of deep bidirectional transformers for language understanding. In: Proceedings of NAACL-HLT, pp. 4171–4186 (2019) 40. Radford, A., et al.: Language models are unsupervised multitask learners. OpenAI Blog 1(8), 9 (2019) 41. Dong, L., et al.: Unified language model pre-training for natural language understanding and generation. In: Advances in Neural Information Processing Systems, vol. 32 (2019) 42. Wang, X., Girshick, R., Gupta, A., He, K.: Non-local neural networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7794–7803 (2018) 43. Touvron, H., Cord, M., Douze, M., Massa, F., Sablayrolles, A., J´egou, H.: Training data-efficient image transformers & distillation through attention. In: International Conference on Machine Learning, pp. 10:347–10:357. PMLR (2021) 44. Liu, Z., et al.: Swin transformer: hierarchical vision transformer using shifted windows. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 10:012–10:022 (2021) 45. Wu, H., et al.: CVT: introducing convolutions to vision transformers. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 22–31 (2021) 46. Dong, X., et al.: CSwin transformer: a general vision transformer backbone with cross-shaped windows. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 12:124–12:134 (2022) 47. Boland, P.J.: Majority systems and the condorcet jury theorem. J. R. Statist. Soc. Ser. D (The Statist.) 38(3), 181–189 (1989)


48. Ardabili, S., Mosavi, A., V´ arkonyi-K´ oczy, A.R.: Advances in machine learning modeling reviewing hybrid and ensemble methods. In: V´ arkonyi-K´ oczy, A.R. (ed.) INTER-ACADEMIA 2019. LNNS, vol. 101, pp. 215–227. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-36841-8 21 49. Ho, T.K.: Random decision forests. In: Proceedings of 3rd International Conference on Document Analysis and Recognition, vol. 1, pp. 278–282. IEEE (1995) 50. Breiman, L.: Bagging predictors. Mach. Learn. 24(2), 123–140 (1996) 51. Kaniovski, S., Zaigraev, A.: Optimal jury design for homogeneous juries with correlated votes. Theory Decis. 71(4), 439–459 (2011)

Understanding Tumor Micro Environment Using Graph Theory Kinza Rohail1 , Saba Bashir1 , Hazrat Ali2 , Tanvir Alam2 , Sheheryar Khan3 , Jia Wu4 , Pingjun Chen4 , and Rizwan Qureshi1,4(B) 1

National University of Computer and Emerging Sciences, Karachi, Pakistan [email protected] 2 College of Science and Engineering, Hamad Bin Khalifa University, Doha, Qatar 3 School of Professional Education and Executive Department, The Hong Kong Polytechnic University, Hong Kong, China 4 Department of Imaging Physics, MD Anderson Cancer Center, The University of Texas, Houston, TX, USA

Abstract. Based on roughly 50 years of historical statistics from the National Cancer Institute's Surveillance program, the survival rate of patients affected by Chronic Lymphocytic Leukemia (CLL) is about 65%. The neoplastic lymphomas accelerated Chronic Lymphocytic Leukemia (aCLL) and Richter Transformation - Diffuse Large B-cell Lymphoma (RT-DLBL) are aggressive and rare variants of this cancer that are associated with lower survival rates and become worse with patient age. In this study, we developed a framework based on Graph Theory, Gaussian Mixture Modeling and Fuzzy C-means Clustering for learning the cell characteristics of neoplastic lymphomas, together with a quantitative analysis of the observed pathological facts that integrates image-level and nuclei-level analysis. On H&E slides of 60 hematolymphoid neoplasms, we evaluated the proposed algorithm and compared it to four cell-level graph-based algorithms, including the global cell graph, cluster cell graph, hierarchical graph modeling and FLocK. The proposed method achieves better performance than the existing algorithms, with a mean diagnosis accuracy of 0.70833.

Keywords: Hematolymphoid cancer · Graph theory · Digital pathology · Fuzzy clustering

1 Introduction

Hematolymphoid cancer is a type of primary cancer associated with blood, bone marrow and lymphoid organs. This type of cancer is subject to a high mortality rate. It is a very challenging task to distinguish healthy cells from tumor-affected ones, since no standard way has yet been proposed to separate these cells into biological subtypes for diagnosing hematolymphoid tumors at the cell level. The diagnosis of cancer using histopathology depends upon the topological structure and


phenotyping of histology-based entities like nuclei, tissue areas and cells. Characterizing and understanding this cellular morphology through image-level, nuclei-level and entity-based analysis is becoming popular and is still a challenge when dealing with complex structures [1]. Graphical representations using nodes and edges help to analyze the network of cancer cells more deeply and detect tumor areas more efficiently. A nucleus is represented by a node within the original image, and edges represent cellular interactions that define node similarity.

1.1 Lymphoid Neoplastic Cells

The proposed work focuses on three lymphoid neoplasms associated with hematolymphoid cancer, namely Richter transformation - diffuse large B-cell lymphoma (RT-DLBL), chronic lymphocytic leukemia/small lymphocytic lymphoma (CLL/SLL), and accelerated CLL/SLL (aCLL), discussed below [1–3]:

1. Chronic Lymphocytic Leukemia (CLL) is a low-grade B lymphoid neoplasm that originates in lymphocytes (white blood cells) present in the bone marrow and then mixes with the blood. This type of lymphoma grows slowly, and affected patients have mild symptoms in the initial stages.

2. Accelerated Chronic Lymphocytic Leukemia (aCLL) is the aggressive variant of chronic lymphocytic leukemia (CLL). The diagnosis of this type is more challenging when dealing with biopsy specimens.

3. Richter Transformation - Diffuse Large B-cell Lymphoma (RT-DLBL) is the rare variant and grows fast in lymphocytes. It becomes more aggressive with patient age.

In this paper, we present a framework based mainly on graph theory for understanding the Tumor Micro Environment (TME) of hematolymphoid cancer, based on the three neoplastic lymphomas CLL, aCLL and RT. The rest of the paper is organized as follows: Sect. 2 discusses existing graph-based methods, Sect. 3 describes the proposed research methodology, Sect. 4 presents the implementation details, results, comparison and analysis, and Sect. 5 concludes this work with future prospects.

2 Literature Review

Most graph-based approaches have been applied to the diagnosis of breast cancer using digital pathology. Understanding the Tumor Micro Environment based on graph representations and entity-based analysis of cells, nuclei and tissues has not been explored much.


HACT-NET [10] handled the hierarchical structure and shape of tissue in the Tumor Micro Environment on the basis of histological entities like nuclei, tissues and cells. The major limitation of that study is that it is still restricted to breast cancer; it has yet to be extended to other cancer types and other imaging modalities. A graph-based spatial model [11] constructed topological tumor graphs to characterize stromal phenotypes in melanoma; the main limitation was the limited access to clinical cohorts of patients treated after immunotherapy. A spatial-analysis survey [12] reviews various methods for learning the spatial heterogeneity of cellular patterns in the Tumor Micro Environment; it concludes that automated and quantitative spatial analysis is required for dealing with large clinical cancer datasets exhibiting complex spatial patterns. Graph-based analysis [16] learns the complex relationship between cancer cells and components of the Tumor Micro Environment for diagnosing breast tumors using a graph approach and mathematical morphology. The cell spatial graph of [9] decoded cellular and clonal phenotypes from cellular morphology and characterized spatial architectures; it used local and global graphs to understand the orchestration and interaction of cellular components and achieves a better understanding of intratumoral heterogeneity in digital pathology. However, that scheme was unable to handle complex architectures in critical neoplastic cases and fails to integrate image-level and nuclei-level analysis. In this paper, we further improve [9] and integrate nuclei-level information.

3 Research Methodology

In this paper, we present a framework based on graph theory for the diagnosis of hematolymphoid cancer using three neoplastic lymphomas. This section presents the details of the dataset and a brief description of the methods used (Fig. 1).

3.1 Proposed Framework

The proposed framework used in this research for the diagnosis of lymphocytic pathology images can be explained as follows:

1. Cell feature extraction for local graphs on the basis of intensity, morphology and region. In this step, 24 features were extracted for each neoplastic cell. These features exhibit the morphological, regional and intensity-based patterns of the nuclei. The intensity-based features were related to mean, deviation, range and boundary. The morphological features were related


Fig. 1. The illustration of proposed framework used in this research for the diagnosis of lymphocytic pathology images. (a) Cell feature extraction for local graphs on the basis of Intensity, Morphology and Region. (b) Identification of healthy and tumor affected local cells using Fuzzy C Means Clustering. (c) Super cell and local graph construction using super pixel algorithm - Gaussian Mixture Model. (d) Cell graph construction and cancer diagnosis using Graph Neural Networks.

to the shape and structure of the nuclei in pathology slides, such as circularity, elliptical deviation, orientation, axis length, perimeter and equivalent diameter. The regional features were related to boundary saliency, mass displacement, solidity and weighted centroid. The Laplacian score method was used to eliminate redundant features. Using multiple-pass adaptive voting, the nuclei were segmented in each image and then overlaid on the original images. 10 features were selected for local graph construction.

2. Identification of healthy and tumor-affected local cells using Fuzzy C-Means Clustering. In this step, the cells of each neoplastic type were clustered into tumor-affected and healthy cells, and a cell classifier model was built based on the cell types. Results are illustrated in Fig. 2 in Sect. 4 of this paper.


3. Super-cell and local graph construction using a superpixel algorithm - the Gaussian Mixture Model. In this step, superpixel segmentation was done at four scales: 8 × 8, 14 × 14, 20 × 20 and 26 × 26. The highest number of superpixels was generated for CLL compared to the other neoplastic lymphomas. The segmentation results are overlaid on the original images based on the superpixels generated. The model then pools the super-cell features and generates super-cells, on the basis of which local graphs are constructed. Labelled super-cells were generated and overlaid on the original images. Again, fuzzy c-means clustering was performed to distinguish between healthy and tumor-affected global cells, and the number of superpixels generated for each cell type was recorded. Results are illustrated in Fig. 3 in Sect. 4 of this paper.

4. Cell graph construction and cancer diagnosis using Graph Neural Networks. We focused on global graph construction; a Python-based library known as Histocartography is used for analyzing image-level and nuclei-level representations via entity-graph-based analysis. This step is further divided into 3 sub-steps (a simplified k-NN graph-construction sketch is given after this list):
– (i) Constructing the cell graph from pathology slides, in which the identification of nodes, edges and per-node features and the detection of patch-level nuclei is done using a pretrained HoverNet model. The highest number of nodes and edges was generated for CLL. For characterizing the nuclei, global node features were extracted using ResNet with patch size 72 and resize size 224. For analyzing the intra- and inter-tumoral heterogeneity of nuclei with the help of edges, K-Nearest Neighbor graphs were constructed with k=5, and the graph properties were obtained: number of nodes (CLL 5543, aCLL 3579 and RT 2376), edges (CLL 27715, aCLL 17645 and RT 11880) and features extracted per node (514). Results are illustrated in Fig. 4 in Sect. 4 of this paper.
– (ii) Classification and analysis of cell graphs. For classifying the tumor areas in the neoplastic cells RT, aCLL and CLL using cell graphs, a Graph Neural Network (GNN) was trained with node dimension 514 and 3 classes. The model gave the highest relative node importance to CLL compared to aCLL and RT. For the analysis of the graph representation, a modified version of GraphGradCAM was integrated with the Graph Neural Network and used for feature attribution. Node importance was extracted for the neoplastic cells on the basis of the constructed cell graphs. Results are illustrated in Fig. 4 in Sect. 4 of this paper.
– (iii) Analysis of results. For understanding the shape and size of nuclei and tumor cells in the images, quantitative analysis was

Understanding Tumor Micro Environment Using Graph Theory

95

conducted for nuclei analysis along with patch level analysis for each cell defining the importance of values generated by model for area, contrast, crowdness based over nuclei level analysis. Quantitative analysis was conducted in which pathological facts were observed and important scores were evaluated by the model for neoplastic cells. Following pathological facts were observed in this analysis: (i) “aCLL is bigger in size than CLL” - aCLL, an aggressive form of neoplastic lymphoma, is bigger in size than CLL, as analyzed in importance value (0.8033) of Area feature. (ii)“RT is more solid in shape as compared to CLL that has gas bubbles” - as analyzed in importance value (1.3161) of GLCM contrast feature. (iii) “RT is faster and grows rapidly as compared to aCLL” - as analyzed in importance value (1.2123) of crowdness. Image level and nuclei level analysis of neoplastic cells was conducted for understanding patch level nuclei in neoplastic lymphomas in which 20 most important nuclei of cells were visualized and then random nuclei were visualized. Results are illustrated in Fig. 4 in Sect. 4 of this paper. 3.2
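The cell-graph construction described in step 4(i) can be sketched with standard Python tooling. The snippet below is an illustrative approximation, not the authors' Histocartography/HoverNet pipeline: it extracts a handful of the morphology, intensity and region features named in step 1 with scikit-image and builds a k-NN graph (k = 5, as in the text) over nucleus centroids with scikit-learn; all function and variable names are ours.

```python
import numpy as np
from skimage.measure import label, regionprops
from sklearn.neighbors import kneighbors_graph

def nuclei_features_and_knn_graph(nuclei_mask, intensity_image, k=5):
    """Extract per-nucleus features and a k-NN cell graph from a segmented slide."""
    labeled = label(nuclei_mask)                          # one integer label per nucleus
    props = regionprops(labeled, intensity_image=intensity_image)

    centroids, feats = [], []
    for p in props:
        centroids.append(p.centroid)                      # (row, col) of the nucleus
        feats.append([
            p.area, p.perimeter, p.eccentricity,          # morphology / shape features
            p.solidity, p.equivalent_diameter,
            p.orientation, p.mean_intensity,              # region / intensity features
        ])
    centroids = np.asarray(centroids)
    feats = np.asarray(feats)

    # k-NN graph over nucleus centroids (k = 5 as in the text); sparse adjacency matrix.
    adjacency = kneighbors_graph(centroids, n_neighbors=k, mode="connectivity")
    return feats, adjacency
```

The returned node features and adjacency are the kind of inputs consumed by the GNN classification stage of step 4(ii).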

3.2 Dataset Description

A dataset of digital pathology slides from The University of Texas MD Anderson Cancer Center (UTMDACC) was used in this research. We have 20 digital pathology slides for each of the neoplastic cell types CLL, aCLL and RT-DLBL, all affected by hematolymphoid cancer. Each slide is associated with one patient.

3.3 Description of Methods Used in This Framework

Fuzzy C-Means clustering (FCM) is a type of soft clustering in which each data point is assigned a likelihood, or probability, of belonging to each cluster. This algorithm is used for separating healthy cells from tumor-affected cancer cells in CLL, aCLL and RT.
Gaussian Mixture Model (GMM) is a superpixel-based algorithm used for segmenting images to identify cellular regions and structural patterns. Its main properties are that it can handle pixels that are not identically distributed, and that eigendecomposition is used for defining the covariance matrix [5]. This algorithm is used for segmenting the images into superpixels, which are further used to build super-cells on the basis of which local graphs are constructed.
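The following sketch illustrates the two clustering components just described. The FCM update rules follow the standard Bezdek formulation [4] and the GMM uses scikit-learn; the inputs (a cell feature matrix, pixel features) and the number of components are placeholder assumptions, not the authors' exact configuration.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def fuzzy_c_means(X, n_clusters=2, m=2.0, n_iter=100, seed=0):
    """Soft-cluster cell features X (N x D), e.g. into healthy vs. tumor-affected cells."""
    rng = np.random.default_rng(seed)
    U = rng.random((len(X), n_clusters))
    U /= U.sum(axis=1, keepdims=True)                     # membership matrix, rows sum to 1
    for _ in range(n_iter):
        Um = U ** m
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]    # membership-weighted cluster centers
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
        U = 1.0 / dist ** (2.0 / (m - 1.0))
        U /= U.sum(axis=1, keepdims=True)                 # fuzzy memberships per cell
    return U, centers

def gmm_pixel_components(pixel_features, n_components=200, seed=0):
    """Assign pixels to GMM components, a simple stand-in for GMM-based superpixels [5]."""
    gmm = GaussianMixture(n_components=n_components, random_state=seed)
    return gmm.fit_predict(pixel_features)                # one component id per pixel
```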


Graph Neural Network (GNN) is a type of artificial neural network that deals with graph-structured data. This architecture is well suited for analyzing interactions between cells, since cellular data does not lie on a regular grid. Here, nodes can be considered as nuclei and edges as interactions between cells.
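As a rough illustration of how such a network consumes a cell graph, the sketch below implements a two-layer mean-aggregation GNN in plain PyTorch with the node dimension (514) and number of classes (3) mentioned in the text; the architecture itself is an illustrative assumption, not the Histocartography-based model used in the paper.

```python
import torch
import torch.nn as nn

class SimpleCellGraphGNN(nn.Module):
    """Two-layer mean-aggregation GNN that classifies a whole cell graph."""

    def __init__(self, in_dim=514, hidden_dim=128, n_classes=3):
        super().__init__()
        self.lin1 = nn.Linear(in_dim, hidden_dim)
        self.lin2 = nn.Linear(hidden_dim, hidden_dim)
        self.classifier = nn.Linear(hidden_dim, n_classes)

    @staticmethod
    def aggregate(x, adj):
        # mean over neighboring nuclei defined by the (dense 0/1) adjacency matrix
        deg = adj.sum(dim=1, keepdim=True).clamp(min=1.0)
        return adj @ x / deg

    def forward(self, node_feats, adj):
        h = torch.relu(self.lin1(self.aggregate(node_feats, adj)))
        h = torch.relu(self.lin2(self.aggregate(h, adj)))
        graph_embedding = h.mean(dim=0)                   # pool nuclei into a slide-level vector
        return self.classifier(graph_embedding)           # logits for CLL / aCLL / RT
```

Usage sketch: logits = SimpleCellGraphGNN()(node_feats, adj), with node_feats of shape (num_nuclei, 514) and a dense adjacency of shape (num_nuclei, num_nuclei).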

4 Results and Discussion

In this section, we provide the results of the proposed framework for diagnosing cancer using digital pathology and compare the results of this study with the Hierarchical Graph Modelling study related to hematolymphoid cancer. Figure 2 shows the Fuzzy C-Means clustering results. Figure 3 shows the generation of superpixels and the super-cells constructed with local graphs. Figure 4 shows cell graph construction with node importance, and image- and nuclei-level visualization of the 20 most important nuclei.

Fig. 2. FCM Clustering results of healthy (represented by blue color) versus tumor affected local cells (represented by green color) in CLL, aCLL and RT. (Color figure online)

4.1 Discussion and Comparison of This Study with Related Existing Hierarchical Graph Modeling Study

The results of the proposed framework are compared with the results of the Hierarchical Graph Modeling study [22] and three other graph-based methods. The comparative analysis is provided in Table 1. In the Hierarchical Graph Modeling study, a multi-scale framework was proposed for accurately examining the tissue in lymphoid neoplasms. The same lymphoid neoplastic cells were used in that study. Hierarchical phenotyping was integrated with graph modeling for characterizing the spatial architecture in the TME using digital pathology [87]. That approach was able to decode the cellular and clonal hierarchy in the tumor microenvironment (TME). It also extracted both the local and global cellular interactions and their intratumoral heterogeneity. The major limitation was


Fig. 3. (i) Gaussian Mixture Model superpixel segmentation results at four scales (8 × 8, 14 × 14, 20 × 20 and 26 × 26) for CLL, aCLL and RT cells; superpixel labelling is represented by red color. Labelling is complex in CLL with 20255 (8 × 8), 8254 (14 × 14), 3340 (20 × 20) and 1788 (26 × 26) superpixels, moderate in aCLL with 17818 (8 × 8), 6532 (14 × 14), 3361 (20 × 20) and 1907 (26 × 26), and low in RT with 17035 (8 × 8), 6033 (14 × 14), 3187 (20 × 20) and 1796 (26 × 26) superpixels generated at each scale; the highest number of superpixels was generated in CLL compared to the other neoplastic lymphomas. (ii) On the basis of the segmented results, super-cells were constructed with local graph construction (CLL, aCLL, RT): moderate variation in CLL compared to complex variation in aCLL and RT. (iii) Labelled super-cells constructed with local graph construction (CLL, aCLL, RT): better graphs constructed in aCLL and RT compared to CLL. (iv) FCM clustering results of healthy (blue) versus tumor-affected super-cells (green) in CLL, aCLL and RT.

that it was not able to capture neoplastic cell characteristics when dealing with critical cases. Comparing our results with the Hierarchical Graph Modeling ones, the fuzzy algorithm improved the clustering results and allowed the level of fuzziness to be controlled through the fuzziness parameter, helping better classification of cell types at the local and global level in the TME. In the Hierarchical Graph study, spectral clustering was used, which failed to capture cell characteristics at cluster boundaries. The use of the Gaussian Mixture Model allowed better segmentation of tumor-affected areas; superpixels were then generated and local graphs were constructed. For global


Fig. 4. (i) Nuclei detection (CLL, aCLL, RT), represented by purple circles with a black outline boundary. (ii) KNN graph construction (CLL, aCLL, RT): graphs drawn in blue with yellow labelling. (iii) Cell graph construction with node importance, represented by blue color (CLL, aCLL, RT): high variation in CLL, moderate variation in RT and least variation in aCLL. (iv) Quantitative analysis of importance values: pathological facts were observed and importance scores were evaluated by the model for the neoplastic cells. Image-level and nuclei-level analysis (CLL, aCLL, RT): the 20 most important nuclei were visualized and then random nuclei were visualized. (Color figure online)

graph analysis, we initially detected patch-level nuclei at the global level to measure intra- and inter-tumoral heterogeneity and obtained the graph properties. Then, with the help of the cell graphs and the Graph Neural Network, we measured the relative node importance for each cell type. The hierarchical graph modeling approach lacks an integration of image- and nuclei-level analysis, and the model was not able to cover complex spatial patterns when dealing with critical neoplastic cases. In the Hierarchical Graph Modelling study, Delaunay triangulation was used for global graph construction, which only covered edge and node information but did not cover nuclei-level details at the global level. The three-sub-step workflow used in the fourth step of our proposed framework solves the problem of detecting patch-level nuclei at the global level. The quantitative analysis of pathological facts helped to observe the shape and structure of the tumor with the help of the integrated image- and nuclei-level analysis. The results show the effectiveness of the proposed approach for understanding and characterizing the cellular morphology and spatial architecture at both the global and the local level. The integration of image- and nuclei-level analysis has allowed better graph representations of pathological slides for the diagnosis of hematolymphoid cancer.


Table 1. Comparison with other graph based methods.

| Method | Accuracy | AUC (CLL) | AUC (aCLL) | AUC (RT-DLBL) |
| GCG [26] | 0.436 ± 0.037 | 0.421 ± 0.054 | 0.730 ± 0.027 | 0.770 ± 0.023 |
| LCG [27] | 0.471 ± 0.042 | 0.555 ± 0.049 | 0.669 ± 0.050 | 0.763 ± 0.032 |
| Flock [28] | 0.601 ± 0.045 | 0.545 ± 0.054 | 0.816 ± 0.025 | 0.847 ± 0.022 |
| Graph Modeling [9] | 0.703 ± 0.030 | 0.915 ± 0.009 | 0.724 ± 0.033 | 0.866 ± 0.028 |
| Proposed | 0.7088 ± 0.035 | 0.922 ± 0.0187 | 0.7457 ± 0.0396 | 0.913 ± 0.0251 |

5 Conclusion and Future Work

Using the concept of local and global graph theory for cancer diagnosis from neoplastic lymphoid cells proved to be an efficient method for characterizing spatial architecture and for a better understanding of cellular morphology. Integrating this concept with Graph Neural Networks has allowed a more detailed analysis of pathological images at the image and nuclei level. Image-level analysis allows identifying complex microscopic structural patterns around tumor cells, and nuclei-level analysis helps to learn cellular architectures more deeply. Entity-based analysis of these graph representations allows deeper biological insights to be extracted from tissue images. The hierarchical graph approach provides better results for cancer diagnosis based on both pixel and entity-graph based classification of tumors in neoplastic cells. In future work, this approach can be applied to other cancer types, cells and tissues. We will also explore multiple instance learning (MIL) algorithms to investigate the tumor microenvironment [29]. It can also be applied to other cancer-related imaging modalities apart from digital pathology.

Acknowledgement. The work described in this paper was supported by a grant from the College of Professional and Continuing Education, an affiliate of The Hong Kong Polytechnic University under project SEHS-2021-229(I).

References 1. Xie, J., et al.: Successful treatment of “accelerated” chronic lymphocytic leukemia with single agent ibrutinib: a report of two cases. Leukemia Res. Rep. 15, 100247 (2021) 2. Bruzzi, J., et al.: Detection of Richter’s transformation of chronic lymphocytic Leukemia by PET/CT. J. Nucl. Med. 47, 1267–1273 (2006). https://jnm. snmjournals.org/content/47/8/1267 3. Elnair, R., Ellithi, M., Kallam, A., Bleeker, J., Bociek, G.: Survival analysis of CLL/SLL patients with Richter’s transformation to DLBCL: an analysis of the SEER database. J. Clin. Oncol. 38, 1–10 (2020) 4. Bezdek, J., Ehrlich, R., Full, W.: FCM: The fuzzy c-means clustering algorithm. Comput. Geosci. 10, 191–203 (1984) 5. Ban, Z., Liu, J., Cao, L.: Superpixel segmentation using Gaussian mixture model. IEEE Trans. Image Process. 27, 4105–4117 (2018)


6. Pope, P., Kolouri, S., Rostami, M., Martin, C., Hoffmann, H.: Explainability methods for graph convolutional neural networks. In: 2019 IEEE/CVF Conference On Computer Vision And Pattern Recognition (CVPR), pp. 10764–10773 (2019) 7. Dwivedi, V., Joshi, C., Luu, A., Laurent, T., Bengio, Y., Bresson, X.: Benchmarking Graph Neural Networks (2020). arxiv:2003.00982 8. Selvaraju, R., Cogswell, M., Das, A., Vedantam, R., Parikh, D., Batra, D.: GradCAM: visual explanations from deep networks via gradient-based localization. In: 2017 IEEE International Conference On Computer Vision (ICCV), pp. 618–626 (2017) 9. Chen, P., Aminu, M., El Hussein, S., Khoury, J., Wu, J.: Cell SpatialGraph: integrate hierarchical phenotyping and graph modeling to characterize spatial architecture in tumor microenvironment on digital pathology. Softw. Impacts. 10, 100156 (2021) 10. Pati, P., Jaume, G., Foncubierta-Rodr´ıguez, M., et al.: Hierarchical graph representations in digital pathology. Med. Image Anal. 75, 102264 (2022) 11. Failmezger, H., Muralidhar, S., Rullan, A., Andrea, C., Sahai, E., Yuan, Y.: Topological tumor graphs: a graph-based spatial model to infer stromal recruitment for immunosuppression in melanoma histologystromal recruitment and immunosuppression in melanoma. Can. Res. 80, 1199–1209 (2020) 12. Heindl, A., Nawaz, S., Yuan, Y.: Mapping spatial heterogeneity in the tumor microenvironment: a new era for digital pathology. Lab. Investig. 95, 377–384 (2015). https://www.nature.com/articles/labinvest2014155#citeas 13. Ben Cheikh, B., Bor-Angelier, C., Racoceanu, D.: Graph-based approach for spatial heterogeneity analysis in tumor microenvironment. Diagn. Pathol. 1, 1–3 (2016) 14. Xu, H., Cong, F., Hwang, T.: Machine learning and artificial intelligence-driven spatial analysis of the tumor immune microenvironment in pathology slides. Eur. Urol. Focus 7, 706–709 (2021) 15. AbdulJabbar, K., Raza, S.-A., Rosenthal, R., et al.: Geospatial immune variability illuminates differential evolution of lung adenocarcinoma. Nat. Med. 26, 1054–1062 (2020) 16. Heindl, A., et al.: Microenvironmental niche divergence shapes BRCA1dysregulated ovarian cancer morphological plasticity. Nat. Commun. 9, 3917 (2018) 17. Zormpas-Petridis, K., Failmezger, H., Raza, S., Roxanis, I., Jamin, Y., Yuan, Y.: Superpixel-based conditional random fields (SuperCRF): incorporating global and local context for enhanced deep learning in melanoma histopathology. Front. Oncol. 9, 1045 (2019). https://www.frontiersin.org/article/10.3389/fonc.2019.01045 18. Hagos, Y., Narayanan, P., Akarca, A., Marafioti, T., Yuan, Y.: ConCORDe-Net: Cell Count Regularized Convolutional Neural Network for Cell Detection in Multiplex Immunohistochemistry Images. (2019). arxiv.org:1908.00907 19. Raza, S., et al.: Deconvolving convolutional neural network for cell detection. In: 2019 IEEE 16th International Symposium On Biomedical Imaging (ISBI 2019), pp. 891–894 (2019) 20. Narayanan, P., Raza, S., Dodson, A., Gusterson, B., Dowsett, M., Yuan, Y.: DeepSDCS: dissecting cancer proliferation heterogeneity in Ki67 digital whole slide images (2018). arxiv:1806.10850 21. Khan, A., Yuan, Y.: Biopsy variability of lymphocytic infiltration in breast cancer subtypes and the IMMUNOSKEW score. Sci. Rep. 6, 36231 (2016). https://www. nature.com/articles/srep36231#citeas 22. Chen, P., Aminu, M., Hussein, S., Khoury, J., Wu, J.: Hierarchical Phenotyping and Graph Modeling of Spatial Architecture in Lymphoid Neoplasms (2021). arxiv:2106.16174


23. Csik´ asz-Nagy, A., Cavaliere, M., Sedwards, S.: Combining Game theory and graph theory to model interactions between cells in the tumor microenvironment. In: d’Onofrio, A., Cerrai, P., Gandolfi, A. (eds.) New Challenges for Cancer Systems Biomedicine. SIMAI Springer Series, pp. 3–18. Springer, Milano (2012). https:// doi.org/10.1007/978-88-470-2571-4 1 24. Sanegre, S., Lucantoni, F., Burgos-Panadero, R., La Cruz-Merino, L., Noguera, R., ´ Alvaro Naranjo, T.: Integrating the tumor microenvironment into cancer therapy. Cancers. 12, 1677 (2020). https://www.mdpi.com/2072-6694/12/6/1677 25. Yegnanarayanan, V., Krithicaa Narayanaa, Y., Anitha, M., Ciurea, R., Marceanu, L., Balas, V.: Graph theoretical way of understanding protein-protein interaction in ovarian cancer. J. Intell. Fuzzy Syst. 43, 1877–1886 (2022). https://doi.org/10. 3233/JIFS-219289 26. Shin, D., et al.: Quantitative analysis of high-resolution microendoscopic images for diagnosis of esophageal squamous cell carcinoma. Clin. Gastroenterol. Hepatol. 13(2), 272–279 (2015) 27. Lewis, J.S., Jr., et al.: A quantitative histomorphometric classifier (QuHbIC) identifies aggressive versus indolent p16-positive oropharyngeal squamous cell carcinoma. Am. J. Surg. Pathol. 38(1), 128 (2014) 28. Lu, C., et al.: Feature-driven local cell graph (FLocK): new computational pathology-based descriptors for prognosis of lung cancer and HPV status of oropharyngeal cancers. Med. Image Anal. 68, 101903 (2021) 29. Waqas, M., Tahir, M.A., Qureshi, R.: Deep Gaussian mixture model based instance relevance estimation for multiple instance learning applications. Appl. Intell. 17, 1–6 (2022)

Handling Domain Shift for Lesion Detection via Semi-supervised Domain Adaptation

Manu Sheoran, Monika Sharma(B), Meghal Dani, and Lovekesh Vig

TCS Research, New Delhi, India
[email protected]

Abstract. As the community progresses towards automated Universal Lesion Detection (ULD), it is vital that the techniques developed are robust and easily adaptable across a variety of datasets coming from different scanners, hospitals, and acquisition protocols. In practice, this remains a challenge due to the complexities of the different types of domain shifts. In this paper, we address the domain-shift by proposing a novel domain adaptation framework for ULD. The proposed model allows for the transfer of lesion knowledge from a large labeled source domain to detect lesions on a new target domain with minimal labeled samples. The proposed method first aligns the feature distribution of the two domains by training a detector on the source domain using a supervised loss, and a discriminator on both source and unlabeled target domains using an adversarial loss. Subsequently, a few labeled samples from the target domain along with labeled source samples are used to adapt the detector using an over-fitting aware and periodic gradient update based joint few-shot fine-tuning technique. Further, we utilize a self-supervision scheme to obtain pseudo-labels having high confidence on the unlabeled target domain, which are used to further train the detector in a semi-supervised manner and improve the detection sensitivity. We evaluate our proposed approach on domain adaptation for lesion detection from CT-scans wherein a ULD network trained on the DeepLesion dataset is adapted to 3 target domain datasets such as LiTS, KiTS and 3Dircadb. By utilizing adversarial, few-shot and incremental semi-supervised training, our method achieves comparable detection sensitivity to the previous few-shot and semi-supervised methods as well as to the Oracle model trained on the labeled target domain.

1 Introduction

Universal Lesion Detection (ULD) aims to assist radiologists by automatically detecting lesions in CT-scans across different organs [1–4]. Although existing ULD networks perform well over a trained source domain, they are still far from practically deployable for clinical applications due to their limited generalization capabilities across target datasets acquired using different scanners and protocols. This domain shift often degrades the detection performance of ULD by over 30–40% when tested on an unseen but related target domain. A naive approach to circumvent domain-shift is to fine-tune a ULD network, trained on the source domain, over sufficient labeled target domain samples. However, obtaining


Fig. 1. Visualization of knowledge space of the detector for different adaptation methods across source and target domain. UDA stands for unsupervised domain adaptation. Here, we utilize feature alignment property of unsupervised domain adaptation (UDA) along with few-shot labeled samples from target domain to widen the knowledge space of the detector network for precise lesion detection.

the requisite amount of annotations in every new domain is impractical due to the expensive and time-consuming annotation process. Simple fine-tuning may improve sensitivity on the target domain, but it suffers from a performance drop on the source domain, which is not desirable in practical scenarios. For example, when a new CT-machine is added to a facility, the ULD network is expected to maintain its detection sensitivity on the new data as well as on the source domain. Therefore, domain adaptation [5–8] is the most effective and widely used technique to transfer knowledge from the source to new, unseen target domains. Broadly, there are two approaches to reduce the domain gap between the source and target domain: image-to-image translation and feature-space alignment. In image-to-image translation techniques [9, 10], researchers have utilized networks such as StyleGAN [11] and CycleGAN [10, 12, 13] to generate source images in the style of target images and train a network on these target-translated source images. On the other hand, in feature-space alignment techniques [14–18], the feature space of the source and target domains is aligned using either unsupervised adversarial training or prototype alignment. The underlying idea is to generate non-discriminatory features such that the discriminator cannot differentiate between the domains, so that a task network trained on a labeled source domain gives similar performance on the new target domain. While large-scale annotation of medical scans is expensive, it is often feasible to obtain a few labeled target samples for real-world applications. This small amount of annotated data can often provide significant gains for domain adaptation [19–22]. To learn from few examples of rare objects, the two-stage fine-tuning approach (TFA) [23] was proposed, in which the detector network is first trained with abundant base images and subsequently only the last layers of the trained detector are fine-tuned by jointly training on few samples from the base classes and the few samples available for rare/novel classes. However, TFA can only improve performance on rare classes if the data for the rare classes belong to the same distribution as the base/source classes. Similarly, in [24], a two-stage semi-supervised object detection method is proposed in which the detector model is first trained with labeled source data, followed by training on unlabeled target data. It utilizes an approach called Unbiased Teacher (UBT), which jointly trains a student and a gradually progressing teacher model using a pseudo-labeling technique. The teacher model is provided with weakly augmented inputs and


the generated pseudo-labels are used for the supervision of the student model, which is provided with strongly augmented inputs. UBT is utilized to reduce the false positives in the generated pseudo-labels, as these false positives can hinder the training process. We avoid the complex student-teacher training by improving the quality of the generated pseudo-labels, learning better initialization weights via UDA and joint few-shot fine-tuning. In this paper, we propose a semi-supervised domain adaptation [25–27] approach which utilizes a combination of unsupervised feature alignment at image as well as instance level, similar to Every Pixel Matters (EPM) [28], and few-shot labels from the target domain to further expand the representation space of the ULD network for adaptation to the target domain, as visualized in Fig. 1. Subsequently, we utilize self-supervised learning, where we apply the few-shot adapted ULD network on the unlabeled target dataset and obtain pseudo-labels using a high confidence threshold. These pseudo-labels are used to re-train the ULD network in a semi-supervised manner on the unlabeled target domain. As the combined data for joint few-shot training is dominated by the source domain [29], we train the network via a robust over-fitting aware and periodic gradient update based training scheme which iteratively performs gradient updates on source and target domain samples while accounting for the imbalance in the source and target domain data. The proposed approach can be applied to different convolution-based detection backbones, and the performance of feature-space alignment based unsupervised domain adaptation techniques can be enhanced and made comparable to that of the Oracle detection network by incorporating few-shot training over target domain labels and semi-supervision using generated pseudo-labels on an unlabeled target dataset. To the best of our knowledge, there is very limited research on domain adaptation for lesion detection [30], and we perform transfer of knowledge from a ULD model trained on a large multi-organ dataset to organ-specific target datasets with minimal labeled samples. To summarize, our contributions in this paper are as follows:
– We propose a novel semi-supervised domain adaptation network for ULD via adversarial training, which utilizes few-shot learning for a better understanding of the target domain and pseudo-label based self-supervised learning for more accurate lesion detection on the target domain. The network is named TiLDDA: Towards Universal Lesion Detection via Domain Adaptation.
– A simple anchor-free training scheme is used for the lesion detection network, which has fewer design parameters and can handle lesions of multiple sizes from different domains more effectively.
– We evaluate TiLDDA, trained on the source DeepLesion [31] CT dataset, on 3 target datasets, namely KiTS [32], LiTS [33] and 3Dircadb [34]. The results show consistent improvement in detection sensitivity across all target datasets.
– Owing to the non-availability of lesion detection datasets, we generate bounding box (bbox) annotations of lesions from the ground-truth pixel-level segmentation masks of the above 3 target domain datasets and release the bbox annotations for benchmarking and motivating further research in this area.


Fig. 2. Overview of our proposed TiLDDA architecture. The source dataset S and the unlabeled target dataset T_U are used to train the discriminators D_GA and D_CA with the adversarial losses L^adv_GA and L^adv_CA for domain adaptation. Labeled source domain samples S and few labeled target domain samples T_L^train are used to train the ULD detector in a few-shot way using the supervised losses L^sup_S and L^sup_{T_L^train}. Further, the few-shot domain adapted ULD is used to generate pseudo-labels on T_U having confidence above a threshold τ. Finally, the pseudo-labels are used to re-train the detector in a semi-supervised manner using the loss L^semi_{T_P}.

2 Methodology

We are given a labeled dataset S = {(X_s, y_s)} from a source domain D_S, and a dataset T from a different but related target domain D_T, split into an unlabeled set T_U = {X̃_t} and a much smaller labeled set T_L^train = {(X_t, y_t)}, where T = T_U + T_L^train. Both S and T share the same task: given an input CT-image X, find the bounding box (bbox) y of the lesion present. Therefore, the aim of our proposed domain adaptation network is to learn a single set of detector model parameters G_θ such that the model trained on the source domain D_S and a few labeled target domain samples T_L^train works efficiently on an unseen target test-set T^test without degradation in lesion detection performance. The different components of our proposed domain adaptation pipeline (shown in Fig. 2) are as follows:

2.1 Universal Lesion Detection

To cater to the need of detecting multi-sized lesions across different domains, we utilize a robust anchor-free lesion detector G based on a fully convolutional one-stage (FCOS) [3, 35] network, which performs detection in a per-pixel prediction fashion rather than utilizing pre-defined anchor boxes. As shown in Fig. 2, for an input image X, we first extract the feature maps f^i at the i-th feature pyramid network (FPN) level using a convolutional feature extractor F. Next, using a fully convolutional detection head B, each pixel location (x, y) of f^i is classified with probability p_{x,y} as foreground (class label c*_{x,y} = 1) or background (class label c*_{x,y} = 0), and then, for each positive pixel location, a 4D vector u_{x,y} is regressed against the corresponding ground-truth bbox annotation u*_{x,y}. To further decrease the low-quality bbox


detections, a single centerness layer (Ctr) branch is added in parallel with the regression (Bbox) branch. It is used to give more preference to pixel locations that are present near the center and to filter out the pixels that have a skewed feature location inside the ground-truth bbox (y) of the corresponding object. The centerness represents the normalized distance between a particular pixel location and the center of the ground-truth bbox of the corresponding object. The detection loss function, as used in FCOS [35], for the ULD baseline is defined as follows:

L_{det}(p_{x,y}, u_{x,y}) = \frac{1}{N_{pos}} \sum_{x,y} L^{cls}(p_{x,y}, c^{*}_{x,y}) + \frac{\lambda}{N_{pos}} \sum_{x,y} \mathbb{1}_{\{c^{*}_{x,y} > 0\}} L^{reg}(u_{x,y}, u^{*}_{x,y})    (1)

centerness = \sqrt{\frac{\min(l^{*}, r^{*})}{\max(l^{*}, r^{*})} \times \frac{\min(t^{*}, b^{*})}{\max(t^{*}, b^{*})}}    (2)

Here, L^{cls} and L^{reg} are the classification focal loss and the regression IoU loss for location (x, y), N_{pos} is the number of positive samples, λ is the balance weight, \mathbb{1}_{\{c^{*}_{x,y} > 0\}} is an indicator function for every positive location, and c^{*} and u^{*} are the ground-truth labels for classification and regression, respectively. For given regression targets l^{*}, t^{*}, r^{*} and b^{*} of a location, the centerness term (as defined in Eq. 2) is trained with a binary cross-entropy (BCE) loss L_{ctr} and added to the loss function defined in Eq. 1 for refined results. Finally, the ULD network is trained using a supervised loss function L^{sup} as defined in Eq. 3:

L^{sup}(X, y) = L_{det} + L_{ctr}    (3)
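For concreteness, the centerness target of Eq. (2) can be computed from the per-pixel regression targets as in the short sketch below; this mirrors the standard FCOS definition the paper builds on, and the tensor names are illustrative.

```python
import torch

def centerness_target(l, t, r, b):
    """l, t, r, b: same-shape tensors of positive distances to the four box sides."""
    lr = torch.min(l, r) / torch.max(l, r).clamp(min=1e-6)
    tb = torch.min(t, b) / torch.max(t, b).clamp(min=1e-6)
    return torch.sqrt(lr * tb)   # in [0, 1], largest at the center of the ground-truth box
```

The predicted centerness map is trained against this target with BCE (L_ctr) and, at inference, down-weights boxes predicted far from object centers.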

2.2 Feature Alignment via Adversarial Learning

Here, inspired by the EPM network [28], we utilize unsupervised domain adaptation (UDA) to align the feature distributions of the two domains, source D_S and target D_T, which results in an increase in the detection sensitivity on the target domain test dataset T^test in an unsupervised manner. First, the detector network G is trained on S using a supervised loss function L^{sup}_S, as defined in Eq. 3. To train the discriminators, we first extract source (f^i_s) and target (f^i_t) feature maps by applying the feature extractor F on S and T_U samples, and perform global feature alignment via a global discriminator D_GA, which is optimized by minimizing a binary cross-entropy loss L^{adv}_{GA}. This is a domain-prediction loss that aims to identify whether the pixels of the i-th feature map f^i belong to the source or the target domain. For a location (x, y) on f^i, L^{adv}_{GA} is defined as:

L^{adv}_{GA}(X_s, \tilde{X}_t) = -\sum_{x,y} \left[ z \log\left(D_{GA}(f^{i}_{s})^{(x,y)}\right) + (1 - z) \log\left(1 - D_{GA}(f^{i}_{t})^{(x,y)}\right) \right]    (4)
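A minimal sketch of this global alignment step is given below, under the assumption of a small fully convolutional discriminator producing one domain logit per pixel; the channel sizes and layer counts are illustrative and not the authors' exact D_GA architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GlobalDomainDiscriminator(nn.Module):
    """Predicts one domain logit per pixel of an FPN feature map."""

    def __init__(self, in_channels=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_channels, 256, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(256, 128, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(128, 1, 3, padding=1),              # per-pixel domain logit
        )

    def forward(self, feat):
        return self.net(feat)

def domain_prediction_loss(d_source, d_target):
    """Binary cross-entropy over all locations with labels z=1 (source) and z=0 (target)."""
    loss_s = F.binary_cross_entropy_with_logits(d_source, torch.ones_like(d_source))
    loss_t = F.binary_cross_entropy_with_logits(d_target, torch.zeros_like(d_target))
    return loss_s + loss_t
```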

We set the domain labels z of the source and target to 1 and 0, respectively.

Algorithm 1: Proposed Joint Few-shot Learning
  Data: source dataset S, few-shot labeled target dataset T_L^train, detector model G_θ, and hyper-parameters α, β and κ
  n(S), n(T_L^train) ← total number of source and labeled target samples
  for iterations = 1, 2, 3, ... do
      Train-source: gradients ∇θ = G_θ(S; θ)
      Updated parameters: θ' ← θ − α∇θ
      η = n(S) / (n(T_L^train) · κ)
      if (iterations mod η) = 0 then
          Train-target: gradients ∇θ' = G_θ(T_L^train; θ')
          Updated parameters: θ ← θ' − β∇θ'
      else
          θ ← θ'

Next, the detection head B predicts pixel-wise objectness maps M^{obj} and centerness maps M^{cls}, which are combined to generate a centre-aware map M^{CA} [28]. The extracted feature maps f^i, together with M^{CA}, are utilized to train another, center-aware discriminator D_CA with the domain-prediction loss L^{adv}_{CA}, given in Eq. 5, in order to perform center-aware alignment at the pixel level:



L^{adv}_{CA}(X_s, \tilde{X}_t) = -\sum_{x,y} \left[ z \log\left(D_{CA}(M^{CA}_{s} \otimes f^{i}_{s})^{(x,y)}\right) + (1 - z) \log\left(1 - D_{CA}(M^{CA}_{t} \otimes f^{i}_{t})^{(x,y)}\right) \right]    (5)

We apply a gradient reversal layer (GRL) [36] before each discriminator for adversarial learning, which reverses the sign of the gradient while optimizing the detector. The loss for the discriminators is minimized via Eq. 4 and Eq. 5, while the detector is optimized by maximizing these loss functions, in order to deceive the discriminators. Hence, the overall loss function for UDA, using δ and γ as balancing weights, can be expressed as follows:

L_{UDA}(S, T_U) = L^{sup}_{S}(X_s, y_s) + \delta L^{adv}_{GA}(X_s, \tilde{X}_t) + \gamma L^{adv}_{CA}(X_s, \tilde{X}_t)    (6)
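The gradient reversal layer mentioned above is straightforward to implement in PyTorch; the sketch below follows the standard formulation of [36], with the scaling weight (0.01/0.02 in Sect. 3.3) passed explicitly. It is a generic implementation, not the authors' code.

```python
import torch

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; flips and scales the gradient in the backward pass."""

    @staticmethod
    def forward(ctx, x, weight):
        ctx.weight = weight
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.weight * grad_output, None            # reversed gradient; no grad for weight

def grad_reverse(x, weight=0.01):
    return GradReverse.apply(x, weight)

# Usage sketch: d_out = discriminator(grad_reverse(feature_map, 0.01)); minimizing the BCE of
# Eq. (4)/(5) then trains the discriminator while pushing the detector to fool it.
```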

2.3 Proposed Joint Few-Shot Learning (FSL)

Our setting differs from standard few-shot learning, where the tasks of the target domain differ from those of the source domain; there, one either trains the network on the available source samples first and uses the trained weights as initialization, or, in the absence of a source domain, uses weights from already trained models (e.g., ImageNet weights) to fine-tune the task network on the few target samples separately. Here, we are trying to solve the domain-shift problem, where both the source (D_S) and target (D_T) domains share the same task, i.e., lesion detection from CT-scans.


Therefore, we can jointly fine-tune the ULD baseline G on a small labeled target domain dataset T_L^train combined with the larger labeled source dataset S. However, this setting suffers from data imbalance, as the combined data is dominated by the source domain, and hence the training would be biased towards the source domain. To mitigate this issue, we propose a modified version of the few-shot training paradigm, given in Algorithm 1, which aims to regularize the ULD network and enable it to focus more on target domain samples without over-fitting on one particular domain. The idea is to train the detector G on both domains by alternately updating the weights, so as to ensure balanced updates across source and target samples. This is achieved by finding the best possible gradient direction due to the shared parameter optimization of the two losses. The loss on the source train set S is computed using the model parameters θ. The loss on the target train set T_L^train is computed using the shared updated parameters θ' = θ − α∇θ after every η iterations. To avoid over-fitting on the target domain, we compute η such that κ epochs of the target are trained while 1 epoch of the source is trained. We empirically determined the optimal value κ = 3. The supervised loss function for the proposed FSL is defined in Eq. 7, where \mathbb{1}_{\eta} is an indicator function that takes the value 1 after each η iterations:

L^{few}(S, T^{train}_{L}) = L^{sup}_{S}(X_s, y_s, \theta) + \mathbb{1}_{\eta} L^{sup}_{T^{train}_{L}}(X_t, y_t, \theta')    (7)
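A schematic PyTorch-style rendering of this alternating update (Algorithm 1 / Eq. (7)) is given below. The data loaders, loss function and optimizer settings are placeholders; the point is the periodic target-domain step every η iterations, with η derived from the dataset sizes and κ.

```python
import torch

def cycle(loader):
    """Yield batches forever, restarting the loader when it is exhausted."""
    while True:
        for batch in loader:
            yield batch

def joint_few_shot_training(detector, detection_loss, source_loader, target_loader,
                            n_source, n_target, alpha=1e-3, beta=1e-3,
                            kappa=3, total_iterations=65000):
    opt_source = torch.optim.SGD(detector.parameters(), lr=alpha)
    opt_target = torch.optim.SGD(detector.parameters(), lr=beta)
    eta = max(1, n_source // (n_target * kappa))          # target-update period of Algorithm 1
    src_iter, tgt_iter = cycle(source_loader), cycle(target_loader)

    for it in range(1, total_iterations + 1):
        xs, ys = next(src_iter)                           # gradient step on a source batch
        opt_source.zero_grad()
        detection_loss(detector(xs), ys).backward()
        opt_source.step()

        if it % eta == 0:                                 # periodic step on few-shot target data
            xt, yt = next(tgt_iter)
            opt_target.zero_grad()
            detection_loss(detector(xt), yt).backward()
            opt_target.step()
```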

2.4 Few-Shot Domain Adaptation (FDA)

Next, we apply the adversarial learning (L^{adv}_{GA} and L^{adv}_{CA}) over the source and target domains for feature alignment, together with the proposed FSL (L^{few}) on the combined domain. This increases the similarity between the two domains via feature alignment and also widens the knowledge space of the ULD by incorporating information from the target domain in the form of few-shot labeled samples. The loss function for FDA is defined as follows:

L_{FDA}(S, T_U, T^{train}_{L}) = L^{few}(X_s, X_t, y_s, y_t) + \delta L^{adv}_{GA}(X_s, \tilde{X}_t) + \gamma L^{adv}_{CA}(X_s, \tilde{X}_t)    (8)

2.5 Self-supervision

As unlabeled samples T_U of the target domain are available in abundance, we utilize a self-supervised learning mechanism to further improve the ULD performance on T by expanding the few-shot labeled sample space for T. Here, we obtain bbox predictions ỹ_t with confidence score above a detection threshold τ on the unlabeled target samples X̃_t by applying the few-shot adapted UDA network. Hence, we generate pseudo samples T_P = {X̃_t, ỹ_t} to further fine-tune the FDA network in a semi-supervised manner on the target domain, using the loss L^{semi}_{T_P} defined in Eq. 9:

L^{semi}_{T_P} = L^{sup}(\tilde{X}_t, \tilde{y}_t)    (9)
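The pseudo-label generation step can be sketched as follows, assuming the detector returns, per image, a dictionary with 'boxes' and 'scores' (an illustrative output format, not necessarily the actual interface); detections with confidence above τ (0.7 in Sect. 3.3) become pseudo ground truth T_P.

```python
import torch

@torch.no_grad()
def generate_pseudo_labels(detector, unlabeled_loader, tau=0.7):
    """Keep only detections with confidence above tau as pseudo ground truth T_P."""
    detector.eval()
    pseudo_set = []
    for images in unlabeled_loader:
        for image, pred in zip(images, detector(images)):
            keep = pred["scores"] > tau                   # high-confidence detections only
            if keep.any():
                pseudo_set.append((image, pred["boxes"][keep]))
    return pseudo_set                                     # used by the semi-supervised loss of Eq. (9)
```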


Table 1. Data distribution of the target domain datasets T. Here, T_U^train, T_L^train and T^test represent the unlabeled train data, the labeled few-shot train data and the test data.

KiTS (230 patients)
| Data split | No. of patients | No. of images | No. of lesions |
| T_U^train | 180 | 3914 | 4083 |
| T_L^train | 10 | 919 | 1305 |
| T^test | 40 | 923 | 949 |

LiTS (130 patients)
| Data split | No. of patients | No. of images | No. of lesions |
| T_U^train | 80 | 4270 | 11932 |
| T_L^train | 10 | 847 | 2342 |
| T^test | 40 | 2073 | 4571 |

3Dircadb (15 patients)
| Data split | No. of patients | No. of images | No. of lesions |
| T_U^train | 4 | 144 | 430 |
| T_L^train | 3 | 113 | 163 |
| T^test | 8 | 311 | 676 |

3 Experiments and Results

3.1 Overall Training Scheme of TiLDDA

We train the ULD network G on the source samples S and use the source domain weights for initializing our proposed TiLDDA network. For domain adaptation on T, we first train the detector G and the discriminators D_GA and D_CA via the FDA training method using the loss defined in Eq. 8. Subsequently, we apply the adapted detector G to the unlabeled target images X̃_t and generate pseudo-labels T_P = {X̃_t, ỹ_t}. Next, we re-train the ULD network using the semi-supervised loss defined in Eq. 9. Hence, the final objective loss function of our proposed TiLDDA network, using hyper-parameters δ, γ, η and λ, is as follows:

L_{TiLDDA}(S, T_U, T^{train}_{L}, T_P) = L^{sup}_{S}(S, \theta) + \delta L^{adv}_{GA}(X_s, \tilde{X}_t) + \gamma L^{adv}_{CA}(X_s, \tilde{X}_t) + \mathbb{1}_{\eta}\left( L^{sup}_{T^{train}_{L}}(T^{train}_{L}, \theta') + \lambda L^{semi}_{T_P}(T_P, \theta') \right)    (10)
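As a rough illustration, one training iteration could combine the already-computed loss terms of Eq. (10) as in the sketch below; the dictionary keys and the handling of the indicator 1_η are our own simplification, with δ, γ and λ taken from Sect. 3.3.

```python
def tildda_loss(it, eta, losses, delta=0.01, gamma=0.1, lam=0.5):
    """Combine precomputed scalar loss terms for the current iteration as in Eq. (10)."""
    total = (losses["sup_source"]
             + delta * losses["adv_global"]
             + gamma * losses["adv_center_aware"])
    if it % eta == 0:                                     # indicator 1_eta of Eq. (10)
        total = total + losses["sup_target_few_shot"] + lam * losses["semi_pseudo"]
    return total
```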

3.2 Data and Evaluation Metric

We evaluate our TiLDDA network on lesion detection from CT-scans by adapting the ULD model trained on DeepLesion [31] as the source domain dataset S to 3 different target domain datasets T, namely KiTS [32], LiTS [33] and 3Dircadb [34]. We provide details of the source and target domain datasets as follows:
– Source Domain Dataset S: DeepLesion is the largest publicly available multi-organ lesion detection dataset, released by the National Institutes of Health (NIH) Clinical Center. It consists of approximately 32,000 annotated lesions from 10,594

DeepLesion: https://nihcc.app.box.com/v/DeepLesion.


CT-scans of 4,427 unique patients, with 1–3 lesion bounding boxes annotated for each CT-scan by radiologists.
– Target Domain Datasets T: Since we were unable to find relevant CT datasets for lesion detection, we utilized the ground-truth segmentation masks for lesions provided in the following target datasets to generate the bounding-box annotations (a sketch of this conversion is given after this list). To introduce domain shift properly, we have selected target datasets that were collected across different geographical locations. The details of these datasets are as follows:
1. KiTS: This cohort includes 230 CT-scans of patients who underwent partial or radical nephrectomy for suspected renal malignancy between 2010 and 2018 at the University of Minnesota Medical Center, US. The kidney region and the kidney tumors in this dataset are annotated by experts, and the segmentation masks are released publicly.
2. LiTS: This dataset consists of 130 pre- and post-therapy CT-scans released by the Technical University of Munich, Germany. The image data is very diverse with respect to resolution and image quality. Manual segmentations of the tumors present in the liver region are provided with the dataset.
3. 3Dircadb: The 3D Image Reconstruction for Comparison of Algorithm Database (3Dircadb) was released by the Research Institute against Digestive Cancer, Strasbourg Cedex, France. It consists of 15 CT-scans of patients with manual segmentation of liver tumors performed by clinical experts.
Please refer to Table 1 for the data split used for training and testing. For all experiments, we have used labeled data of 10 patients from the LiTS and KiTS datasets, but due to the very small size of 3Dircadb, we utilize labeled data of 3 patients only. The idea behind using the very small 3Dircadb dataset is to evaluate how effectively the proposed TiLDDA network can adapt with minimal target domain samples. As part of pre-processing the CT-images, we include black-border clipping, re-sampling the voxel spacing to 0.8 × 0.8 × 2 mm³ and HU-windowing with a range of [−1024, 3072]. We also perform data augmentations such as horizontal and vertical flipping, resizing and pixel translations along the x- and y-axis. For evaluation, the average of the detection sensitivities over four false-positive rates (FPs = {0.5, 1, 2, 4}) is computed; in the rest of the paper, detection sensitivity refers to this average detection sensitivity.
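The conversion from ground-truth segmentation masks to lesion bounding boxes, together with the HU windowing mentioned above, can be sketched as follows; the connected-component handling is an illustrative assumption rather than the exact procedure used to build the released annotations.

```python
import numpy as np
from scipy import ndimage

def hu_window(ct_slice, low=-1024, high=3072):
    """Clip a CT slice to the HU range used in the pre-processing described above."""
    return np.clip(ct_slice, low, high)

def boxes_from_mask(lesion_mask):
    """Return one (x_min, y_min, x_max, y_max) box per connected lesion component."""
    labeled, _ = ndimage.label(lesion_mask > 0)
    boxes = []
    for obj in ndimage.find_objects(labeled):             # one pair of slices per component
        if obj is None:
            continue
        rows, cols = obj
        boxes.append((cols.start, rows.start, cols.stop - 1, rows.stop - 1))
    return boxes
```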

3.3 Experimental Setup

The feature extractor F is composed of a ResNet-101 backbone with 5 FPN levels, and the fully convolutional block B consists of 3 branches for the classification, regression and centerness computations. For robust DA, feature alignment is done across all FPN levels, and the architectures of the detector G and the discriminators D_GA and D_CA are similar to those used in EPM [28]. We implement TiLDDA in PyTorch-1.4 and train it on a

KiTS: https://kits19.grand-challenge.org/data. LiTS: https://competitions.codalab.org/competitions/17094. 3Dircadb: https://www.ircad.fr/research/3d-ircadb-01.


Table 2. Average sensitivity (%) on the target datasets using different training schemes.

| Training scheme | KiTS [32] | LiTS [33] | 3Dircadb [34] |
| Source only (DeepLesion) (FCOS) [35] | 34.2 | 36.7 | 37.3 |
| Vanilla few-shot | 44.8 | 40.8 | 20.7 |
| UBT (few-shot) [24] | 36.4 | 40.2 | 18.7 |
| UBT (few-shot + semi-supervision) [24] | 44.1 | 47.3 | 27.2 |
| TFA (joint few-shot) [23] | 54.1 | 51.2 | 42.2 |
| Proposed joint few-shot (FSL) | 56.6 | 53.1 | 45.6 |
| UDA (EPM) [28] | 39.4 | 44.6 | 42.1 |
| Few-shot DA (FDA) | 47.1 | 58.6 | 53.8 |
| Few-shot DA + self-supervision (TiLDDA) | 71.6 | 55.2 | 49.5 |
| Oracle (target only) | 77 | 57.6 | 61.1 |

NVIDIA V100 16GB GPU using a batch size of 4. For all our experiments, we set the values of κ, δ, γ, λ and τ to 3, 0.01, 0.1, 0.5 and 0.7, respectively. The weights used in the GRL for adversarial training are set to 0.01 and 0.02 for D_GA and D_CA, respectively. The detector network G for FDA is initialized using weights learned via pre-training on the source S. An SGD optimizer is used to train the FDA network for 65,000 iterations with a learning rate of e−3 and a decay factor of 10 after 32,000 and 52,000 iterations. For the overall training of TiLDDA, the FDA model is further fine-tuned on S and the updated T samples using a learning rate of e−4 for 25,000 iterations.

3.4 Result and Ablation Study

The lesion detection sensitivity on S using the ULD baseline with a ResNet-101 backbone is 80%, and the aim of our proposed TiLDDA network is to perform well on the target domain dataset T while maintaining the performance on the source domain. Table 2 presents the performance of different training schemes on the test split T^test of the target domain. The upper bound of the detection sensitivity on T^test is determined by supervised training of the ULD baseline G directly on the target samples (T_U^train) only; this setting is referred to as the Oracle setting. The lower bound is computed by evaluating the target test-set T^test directly using the ULD model trained on the source S only. It is evident from Table 2 that there is a drop of about 30% to 40% in detection sensitivity compared to the Oracle setting, due to the domain-shift issue. To circumvent this issue, we begin by training the network using different few-shot techniques without any domain adaptation. First, we use vanilla few-shot fine-tuning, where the ULD network is initialized with ImageNet weights and trained only on the few labeled target samples T_L^train. As expected, the ULD network performs better than the source-only training scheme on the test set T^test, except for the 3Dircadb target domain, where the number of training samples is low. Next, we use the semi-supervised method UBT proposed by Liu et al. [24], which utilizes pseudo-labels along with few-shot fine-tuning; there is a further improvement (7% to 8%) in detection sensitivity, but it is still limited, as the lesion detection knowledge from the source domain is not yet utilized by these methods. Hence, we also experimented with the two-step joint few-shot fine-tuning approach (TFA) [23], where we used the entire source domain data in the second step


Table 3. Experiment showing that our proposed training scheme can be used with feature-space alignment based unsupervised domain adaptation methods to further enhance their detection performance. Average sensitivity (%) on the LiTS test dataset using different training schemes applied to the UaDAN [37] and EPM [28] methods.

| Training scheme | UaDAN [37] | EPM [28] |
| Source only (DeepLesion) | 35.6 | 36.7 |
| UDA | 44.6 | 40.1 |
| FDA | 50.7 | 53.8 |
| TiLDDA | 52.8 | 55.2 |
| Oracle | 56.5 | 57.6 |

of fine-tuning, resulting in a steep increase in the detection sensitivity, especially for the 3Dircadb target domain. This confirms that training target samples together with source samples can help to improve the detection sensitivity on the target dataset. However, as mentioned in Sect. 2.3, simple joint few-shot training suffers from a data-imbalance issue; hence we train the network with our proposed joint few-shot training scheme (FSL) described in Algorithm 1 and demonstrate a further increase in sensitivity (2% to 3%). Furthermore, utilizing source domain data alone is not enough to handle the domain-shift issue; hence we apply a UDA method that utilizes adversarial training to align cross-domain features, similar to EPM [28], and observe that, even without using any data from the target domain, there is a small but significant improvement (5% to 7%) in sensitivity compared to the source-only training scheme. Subsequently, we combine UDA and the proposed joint few-shot method into few-shot adaptation (FDA) to train the ULD network and obtain an enhanced performance. Finally, we generate pseudo-labels (T_P) using the FDA model and further fine-tune it via the TiLDDA model. It can be seen clearly that we obtain a remarkable improvement (12% to 35%) in lesion detection compared to source-only training using our proposed TiLDDA network with very few labeled target samples. Next, in order to support our claim that our proposed training scheme can be used with different convolution-based detection backbones and that the performance of feature-space alignment based unsupervised DA methods can be improved, we utilize the UDA method proposed in [37] and apply the proposed incremental training scheme (joint few-shot + pseudo-label based self-supervision) on LiTS as the target dataset. We observe a similar trend of improvement in lesion detection sensitivity as obtained with the EPM baseline used for TiLDDA, as shown in Table 3. Further, we present an ablation study in Table 4 on the number of few-shot labeled samples (T_L^train) for the different target domain datasets and on the hyper-parameter κ used in Algorithm 1. We observe that 10 is the optimal number of few-shot labeled samples to obtain the best performance, but due to the very small size of the 3Dircadb dataset, we utilize labeled data of 3 patients only from 3Dircadb. As the combined data in few-shot learning is dominated by source samples, we train the network on target samples for more epochs than on the source domain, using different values of κ, and


Table 4. Average sensitivity (%) for different numbers of labeled target samples n(T_L^train) and hyper-parameter κ for the different target domains, using the FDA training scheme.

| Target domain | No. of patients | n(T_L^train) | κ | Sensitivity (%) |
| LiTS | 1 | 81 | 1 | 46.3 |
| LiTS | 5 | 428 | 1 | 50.3 |
| LiTS | 10 | 847 | 1 | 51.4 |
| LiTS | 10 | 847 | 3 | 53.8 |
| LiTS | 10 | 847 | 5 | 53.3 |
| KiTS | 10 | 919 | 1 | 56.1 |
| KiTS | 10 | 919 | 3 | 58.6 |
| KiTS | 10 | 919 | 5 | 57.1 |
| 3Dircadb | 3 | 113 | 1 | 45.4 |
| 3Dircadb | 3 | 113 | 3 | 47.1 |
| 3Dircadb | 3 | 113 | 5 | 46.2 |

Fig. 3. (a) The t-SNE visualization of source and target sample distributions before and after using TiLDDA. (b) Effect of domain adaptation on the lesion-detection sensitivity of the test sets of the S and T domain datasets. Here, DL refers to the DeepLesion dataset.

found that a value of 3 is optimal, as it prevents the model from over-fitting on the target domain. Additionally, we present a qualitative comparison using t-SNE [38] plots in Fig. 3(a) to visualize the distribution of the test splits of the source D_S and target D_T domain samples under the source-only and TiLDDA training schemes. We extract embeddings of the test samples using the feature extractor F. The labels 1, 2 and 3 correspond to samples from the source, samples from the target, and samples of the source domain organ in common with the target domain, respectively. The common organ of DeepLesion with LiTS and 3Dircadb is the liver, whereas the common organ of DeepLesion with KiTS is the kidney. We can clearly observe that, after adaptation, the embeddings of the target domain organs are better aligned with the common organ of the source domain, resulting in an enhanced detection sensitivity for the target domain. This validates our claim that the detection knowledge from the source


Fig. 4. Qualitative comparison of Lesion Detection before and after using TiLDDA. Here green, magenta, and red color boxes represent ground-truth, true-positive (TP), and false-positive (FP) lesion detection, respectively. (Color figure online)

domain can be transferred to the target domain to improve the lesion detection performance. Further, we present the comparison of detection sensitivity on the test sets of the S and T datasets before and after applying TiLDDA in Fig. 3(b). It can be seen that the performance on the source domain is maintained during domain adaptation, and our proposed method TiLDDA gives better detection sensitivity on the target domain than the source-only trained model. It is also clearly evident in Fig. 4 that TiLDDA is able to reduce false positives and detect lesions that were missed by the source-only trained lesion detector.
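The qualitative analysis of Fig. 3(a) can be reproduced in spirit with a few lines of scikit-learn, as sketched below; the feature extractor call and the label convention (1 = source, 2 = target, 3 = shared organ) follow the text, while the plotting details are illustrative.

```python
import numpy as np
import torch
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

@torch.no_grad()
def tsne_plot(feature_extractor, samples, labels, title="Source vs. target embeddings"):
    feats = [feature_extractor(x).flatten().cpu().numpy() for x in samples]
    points = TSNE(n_components=2, init="pca", random_state=0).fit_transform(np.stack(feats))
    for lab in np.unique(labels):
        sel = np.asarray(labels) == lab
        plt.scatter(points[sel, 0], points[sel, 1], s=5, label=f"label {lab}")
    plt.legend()
    plt.title(title)
    plt.show()
```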

4 Conclusion and Future Work

In this paper, we present a simple but effective self-supervision based few-shot domain adaptation technique for ULD which can be used to enhance the performance of existing detection methods. We utilize multi-organ lesion detection knowledge from a larger universal lesion detection source domain dataset to efficiently detect lesions on three organ-specific target domains, and achieve comparable performance to the Oracle training scheme by utilizing only a few labeled target samples. We first adversarially align the representation space of the two domains via unsupervised domain adaptation and, with a few labeled target samples, further fine-tune the detector in a semi-supervised way using the self-generated pseudo-labels. We experimentally show the efficacy of our method by reducing the performance drop on unseen target domains compared to an Oracle model trained on a fully labeled target dataset. In the current setup, both source and target domains have the common task of detecting lesions from CT images across a common set of organs. Going forward, we would like to propose a network that can adapt to out-of-distribution organs and work across cross-modality domains.

References 1. Yan, K., et al.: MULAN: multitask universal lesion analysis network for joint lesion detection, tagging, and segmentation. In: Shen, D., et al. (eds.) MICCAI 2019. LNCS, vol. 11769, pp. 194–202. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-32226-7 22


2. Yan, K., et al.: Universal Lesion Detection by learning from multiple heterogeneously labeled datasets. arXiv preprint arXiv:2005.13753 (2020) 3. Sheoran, M., Dani, M., Sharma, M., Vig, L.: An efficient anchor-free universal lesion detection in CT-scans. In: 2022 IEEE 19th International Symposium on Biomedical Imaging (ISBI), pp. 1–4. IEEE (2022) 4. Sheoran, M., Dani, M., Sharma, M., Vig, L.: DKMA-ULD: domain knowledge augmented multi-head attention based robust universal lesion detection. arXiv preprint arXiv:2203.06886 (2022) 5. Gopalan, R., Li, R., Chellappa, R.: Domain adaptation for object recognition: an unsupervised approach. In: 2011 International Conference on Computer Vision, pp. 999–1006. IEEE (2011) 6. Long, M., Zhu, H., Wang, J., Jordan, M.I.: Unsupervised domain adaptation with residual transfer networks. In: Advances in Neural Information Processing Systems, vol. 29 (2016) 7. Pan, S.J., Yang, Q.: A survey on transfer learning. IEEE Trans. Knowl. Data Eng. 22, 1345– 1359 (2010) 8. Saito, K., Ushiku, Y., Harada, T., Saenko, K.: Strong-weak distribution alignment for adaptive object detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 6956–6965 (2019) 9. Saxena, S., Teli, M.N.: Comparison and analysis of image-to-image generative adversarial networks: a survey. CoRR abs/2112.12625 (2021) 10. Zhu, J.Y., Park, T., Isola, P., Efros, A.A.: Unpaired image-to-image translation using cycleconsistent adversarial networks. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 2223–2232 (2017) 11. Rojtberg, P., Pollabauer, T., Kuijper, A.: Style-transfer GANs for bridging the domain gap in synthetic pose estimator training. In: 2020 IEEE International Conference on Artificial Intelligence and Virtual Reality (AIVR), pp. 188–195 (2020) 12. Yang, J., Dvornek, N.C., Zhang, F., Chapiro, J., Lin, M.D., Duncan, J.S.: Unsupervised domain adaptation via disentangled representations: application to cross-modality liver segmentation. In: Shen, D., et al. (eds.) MICCAI 2019. LNCS, vol. 11765, pp. 255–263. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-32245-8 29 13. Dou, Q., et al.: PnP-AdaNet: plug-and-play adversarial domain adaptation network at unpaired cross-modality cardiac segmentation. IEEE Access 7, 99065–99076 (2019) 14. Li, H., Pan, S.J., Wang, S., Kot, A.C.: Domain generalization with adversarial feature learning. In: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5400– 5409 (2018) 15. Lee, S.M., Kim, D., Kim, N., Jeong, S.G.: Drop to adapt: learning discriminative features for unsupervised domain adaptation. In: 2019 IEEE/CVF International Conference on Computer Vision (ICCV), pp. 91–100 (2019) 16. Tanwisuth, K., Fan, X., Zheng, H., Zhang, S., Zhang, H., Chen, B., Zhou, M.: A prototypeoriented framework for unsupervised domain adaptation. CoRR abs/2110.12024 (2021) 17. Kamnitsas, K., et al.: Unsupervised domain adaptation in brain lesion segmentation with adversarial networks. In: Niethammer, M., et al. (eds.) IPMI 2017. LNCS, vol. 10265, pp. 597–609. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-59050-9 47 18. Shin, S.Y., Lee, S., Summers, R.M.: Unsupervised domain adaptation for small bowel segmentation using disentangled representation. In: de Bruijne, M., et al. (eds.) MICCAI 2021. LNCS, vol. 12903, pp. 282–292. Springer, Cham (2021). https://doi.org/10.1007/978-3-03087199-4 27 19. Zhao, A., et al.: Domain-adaptive few-shot learning. 
In: 2021 IEEE Winter Conference on Applications of Computer Vision (WACV), pp. 1389–1398 (2021) 20. Teshima, T., Sato, I., Sugiyama, M.: Few-shot domain adaptation by causal mechanism transfer. CoRR abs/2002.03497 (2020)

116

M. Sheoran et al.

21. Wang, T., Zhang, X., Yuan, L., Feng, J.: Few-shot adaptive faster R-CNN. In: 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 7166– 7175 (2019) 22. Li, S., et al.: Few-shot domain adaptation with polymorphic transformers. In: de Bruijne, M., et al. (eds.) MICCAI 2021. LNCS, vol. 12902, pp. 330–340. Springer, Cham (2021). https:// doi.org/10.1007/978-3-030-87196-3 31 23. Wang, X., Huang, T.E., Darrell, T., Gonzalez, J.E., Yu, F.: Frustratingly simple few-shot object detection. arXiv preprint arXiv:2003.06957 (2020) 24. Liu, Y.C., et al.: Unbiased teacher for semi-supervised object detection. arXiv preprint arXiv:2102.09480 (2021) 25. Pan, F., Shin, I., Rameau, F., Lee, S., Kweon, I.S.: Unsupervised intra-domain adaptation for semantic segmentation through self-supervision. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3764–3773 (2020) 26. Li, J., Li, G., Shi, Y., Yu, Y.: Cross-domain adaptive clustering for semi-supervised domain adaptation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 2505–2514 (2021) 27. RoyChowdhury, A., et al.: Automatic adaptation of object detectors to new domains using self-training. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 780–790 (2019) 28. Hsu, C.C., Tsai, Y.H., Lin, Y.Y., Yang, M.H.: Every pixel matters: center-aware feature alignment for domain adaptive object detector. arXiv abs/2008.08574 (2020) 29. Li, Z., Hoiem, D.: Learning without forgetting. IEEE Trans. Pattern Anal. Mach. Intell. 40, 2935–2947 (2017) 30. Wang, J., He, Y., Fang, W., Chen, Y., Li, W., Shi, G.: Unsupervised domain adaptation model for lesion detection in retinal oct images. Phys. Med. Biol. 66, 215006 (2021) 31. Yan, K., et al.: DeepLesion: automated mining of large-scale lesion annotations and universal lesion detection with deep learning. J. Med. Imaging (2018) 32. Heller, N., et al.: The KiTS19 challenge data: 300 kidney tumor cases with clinical context, CT semantic segmentations, and surgical outcomes. arXiv preprint arXiv:1904.00445 (2019) 33. Bilic, P., et al.: The liver tumor segmentation benchmark (LITS). arXiv preprint arXiv:1901.04056 (2019) 34. Huang, Q., Sun, J., Ding, H., Wang, X., Wang, G.: Robust liver vessel extraction using 3D U-Net with variant dice loss function. Comput. Biol. Med. 101, 153–162 (2018) 35. Tian, Z., et al.: FCOS: fully convolutional one-stage object detection. In: ICCV, pp. 9627– 9636 (2019) 36. Ganin, Y., Lempitsky, V.: Unsupervised domain adaptation by backpropagation. In: International Conference on Machine Learning, pp. 1180–1189. PMLR (2015) 37. Guan, D., Huang, J., Xiao, A., Lu, S., Cao, Y.: Uncertainty-aware unsupervised domain adaptation in object detection. IEEE Trans. Multimedia 24, 2502–2514 (2021) 38. Van der Maaten, L., Hinton, G.: Visualizing data using t-SNE. J. Mach. Learn. Res. 9 (2008)

Photorealistic Facial Wrinkles Removal

Marcelo Sanchez(1,2), Gil Triginer(1), Coloma Ballester(2), Lara Raad(3), and Eduard Ramon(1)

(1) Crisalix S.A., Lausanne, Switzerland, [email protected]
(2) Universitat Pompeu Fabra, Barcelona, Spain
(3) Université Gustave Eiffel, Paris, France

Fig. 1. Results obtained by our proposed wrinkle cleaning pipeline. Our method is able to obtain photorealistic results, learning to synthesize the skin distribution and produce extremely realistic inpainting. Our model solves both wrinkle detection and wrinkle cleaning. For each of the two examples, the original image is shown on the left and the resulting image without wrinkles on the right, including a zoom-in of wrinkle regions.

Abstract. Editing and retouching facial attributes is a complex task that usually requires human artists to obtain photo-realistic results. Its applications are numerous and can be found in several contexts such as cosmetics or digital media retouching, to name a few. Recently, advancements in conditional generative modeling have shown astonishing results at modifying facial attributes in a realistic manner. However, current methods are still prone to artifacts, and focus on modifying global attributes like age and gender, or local mid-sized attributes like glasses or moustaches. In this work, we revisit a two-stage approach for retouching facial wrinkles and obtain results with unprecedented realism. First, a state-of-the-art wrinkle segmentation network is used to detect the wrinkles within the facial region. Then, an inpainting module is used to remove the detected wrinkles, filling them in with a texture that is statistically consistent with the surrounding skin. To achieve this, we introduce a novel loss term that reuses the wrinkle segmentation network to penalize those regions that still contain wrinkles after the inpainting. We evaluate our method qualitatively and quantitatively, showing state-of-the-art results for the task of wrinkle removal. Moreover, we introduce the first high-resolution dataset, named FFHQ-Wrinkles, to evaluate wrinkle detection methods.

E. Ramon: This work was done prior to joining Amazon.


Fig. 2. Proposed pipeline for wrinkle removal. Input image $x$ is first forwarded through $\Phi_S$, obtaining the segmentation map $\hat{m}_w$. This segmentation map $\hat{m}_w$ is masked with $x$ and passed through the inpainting module $\Phi_I$ to obtain the wrinkle-free color image $\hat{x}$.

1 Introduction

Facial image editing is a widely used practice, especially in cinema, television and professional photography. Many techniques require human artists to spend much time on manual editing using complex software solutions [1] in order to obtain realistic modifications of the original image. Removing skin imperfections, and specifically softening or removing wrinkles, is one of the most common tasks. In order to ease this process, there have been general efforts towards its automation [2] and, more concretely, towards the automatic removal of facial wrinkles [3]. In [3], a two-stage approach is proposed in which the wrinkles are first detected using Gabor filters and then removed using a texture synthesis method. While the results from this approach are interpretable, the edited images lack photorealism and hardly maintain the statistics of the skin of the person (Fig. 1).

Generative models based on deep learning, which learn the distribution of a particular category of data, have shown impressive results at generating novel faces and manipulating their content [4–10]. More concretely, Generative Adversarial Networks (GANs) [11] can learn to generate highly realistic faces. As shown in [12], it is possible to inpaint facial regions using GANs by optimizing the latent space to match a masked facial image, which in turn can be used to remove skin imperfections. However, this optimization process is slow. Conditional generative models and, in general, image-to-image translation methods [8, 13–15] provide a more flexible and faster framework than vanilla generative models, and allow conditioning the generation of new samples on input images and optionally on local semantic information. These methods have been successfully applied to the task of automatic professional face retouching [16] in combination with a large-scale dataset with ground truth annotations. Recent work [17] introduces an image-to-image translation model for the task of image inpainting in which fast Fourier convolutions (FFC) [18] are exploited, obtaining photorealistic inpainted images with a model learnt in an unsupervised fashion that excels at modelling repetitive patterns, which is a desirable property for skin synthesis and wrinkle removal.

In this work, we propose a modern take on wrinkle removal by improving the standard pipeline proposed in [3] using data-driven techniques. First, we propose to replace the wrinkle detection stage with a state-of-the-art image segmentation neural network [19] in order to gain robustness and accuracy. In addition, we propose to replace the patch-based wrinkle cleaning block by an image inpainting network based


on fast Fourier convolutions [17] to maintain the overall distribution of the person's skin in the modified area. In summary, our contributions are as follows:

– A modern take on wrinkle removal using state-of-the-art segmentation and inpainting methods in order to obtain gains in accuracy and photorealism.
– A novel loss that leverages a segmentation network to supervise the inpainting training process, leading to improved results.
– The first publicly available dataset for evaluating wrinkle segmentation methods.

Our paper is organized as follows. In Sect. 2, we review the state of the art for facial image editing and wrinkle removal. Next, in Sect. 3 we describe our two-stage solution for automatic wrinkle removal. In Sect. 4, we provide quantitative and qualitative results and show that our method obtains state-of-the-art results. In Sect. 5 we present ablation studies and, finally, in Sect. 6 we present our conclusions.

2 Related Work

Wrinkle Removal. To the best of our knowledge, and as also supported by a recent literature survey [20], few works exist for wrinkle removal. An initial solution was proposed by Bastanfard et al. [21] using face anthropometrics theory and image inpainting to erase the wrinkles. The current state of the art for wrinkle removal is [3], which follows a two-stage approach. The first stage detects wrinkles using an algorithm that combines Gabor features and texture orientation filters. Once the detection is obtained, they propose an exemplar-based texture synthesis to inpaint the wrinkle regions. Despite the novelty of the method, it only works on half-megapixel images due to memory constraints, and the images obtained lack photorealism, introducing artifacts near the inpainted zone.

Image-to-Image Translation. Image-to-image translation appeared with [13] as a solution for mapping an image from one domain to another. Isola et al. used a GAN-based scheme with a conditional setup. Several methods [8, 13–15] are able to translate domains that require modifying mid-size features such as glasses or moustaches, while preserving the global structure of the face. Even though some of these methods need paired images in order to address the image translation problem [13, 16, 22], which is sometimes unfeasible in wrinkle removal, other pipelines can work with unpaired images [14, 23]. However, these methods tend to attend to global regions and are not able to modify small features such as wrinkles or other skin marks.

Image Inpainting. Image inpainting aims to fill in missing or corrupted regions of an image so that the reconstructed image appears natural to the human eye. Accordingly, filled regions should contain meaningful details and preserve the original semantic structure. Image inpainting is closely related to wrinkle cleaning in the sense that it allows modifying the desired (sometimes small-size) regions while preserving the uncorrupted, i.e. wrinkle-free, regions. Initially, deep learning based inpainting methods used vanilla convolutional neural networks (CNNs). More complex methods regularize the structure of the filled regions with edges or segmentation maps [24–27].


Many of these methods are based on CNNs with limited receptive fields, while it is well known that a large receptive field allows the network to better understand the global structure of an image. Recent work [17] uses Fourier convolutions that introduce global context by operating in the frequency domain of the image. This allows the network to incorporate the texture patterns of the entire image and thus obtain a better inpainting result. This non-local receptive field also allows the inpainting model to generalize to high-resolution images.

3 Method

Given a color image of a face $x \in \mathbb{R}^{H \times W \times 3}$, our goal is to remove all the wrinkles present in $x$ while preserving photorealism. Ideally, we should only modify the facial areas with wrinkles and, at the same time, the modified regions should preserve the local statistics of the skin of the person. We propose to solve both wrinkle detection and wrinkle cleaning via a two-stage model based on image segmentation and inpainting techniques. First we estimate a segmentation map of all the wrinkles in the image using a state-of-the-art CNN segmentation model. Once this is obtained, we propose an inpainting model based on LaMa [17] to fill in the wrinkle regions with photorealistic clear skin. We illustrate our pipeline in Fig. 2.

3.1 Wrinkle Segmentation

In this first step, a segmentation model $\Phi_S$ detects the wrinkles in $x$, predicting a segmentation map $\hat{m}_w \in \mathbb{R}^{H \times W}$. The segmentation model $\Phi_S$ is a state-of-the-art fully convolutional neural network with nested skip pathways, namely Unet++ [19]. Wrinkle segmentation has an inherent problem of class imbalance, since the facial wrinkles usually occupy smaller regions of the image compared with the clear skin. We tackle the class imbalance by using a region-based loss instead of a classical distribution-based one such as cross entropy. In particular, we choose to minimize the Dice loss, defined as

$$\mathcal{L}_{Dice}(m, \hat{m}) = \frac{2 \sum_{i=1}^{H \times W} m_i \hat{m}_i}{\sum_{i=1}^{H \times W} m_i^2 + \sum_{i=1}^{H \times W} \hat{m}_i^2} \quad (1)$$

over a set of ground truth wrinkle annotations $m_w$ and predicted segmentation maps $\hat{m}_w = \Phi_S(x)$. For clarity, in (1) we have omitted the subindices $w$, and the subindices $i$ indicate iteration over image pixels.
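For illustration only (this is not the authors' code), the region-based objective of Eq. (1) can be written in a few lines of PyTorch. The smoothing constant `eps` and the use of `1 - Dice` as the quantity actually minimised are our assumptions, added for numerical stability and to turn the similarity into a loss.

```python
import torch

def dice_loss(m_hat: torch.Tensor, m: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """Soft Dice objective of Eq. (1) for predicted maps m_hat and binary
    ground-truth wrinkle masks m, both of shape (B, 1, H, W)."""
    m_hat = m_hat.flatten(start_dim=1)                     # (B, H*W)
    m = m.flatten(start_dim=1)
    intersection = (m_hat * m).sum(dim=1)
    denom = (m_hat ** 2).sum(dim=1) + (m ** 2).sum(dim=1)
    dice = (2.0 * intersection + eps) / (denom + eps)      # Eq. (1), per sample
    return (1.0 - dice).mean()                             # assumed loss form: 1 - Dice
```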

3.2 Wrinkle Cleaning

Once the wrinkles have been detected using the module $\Phi_S$, we aim at cleaning them using an inpainting neural network $\Phi_I$. The resulting inpainted image $\hat{x}$ must be photorealistic and the generated skin must be statistically similar to the skin of the input image subject (Fig. 3). The regions to be inpainted are defined on a binary mask $m_I \in \mathbb{R}^{H \times W}$. We mask the input image $x \in \mathbb{R}^{H \times W \times 3}$ and obtain the masked image $x \odot m_I$. In order to


add redundancy, we follow the approach from [17] and concatenate $m_I$ to the masked image $x \odot m_I$, obtaining a 4-channel input $x' = \text{stack}(x \odot m_I, m_I)$. Therefore, our wrinkle-free image is $\hat{x} = \Phi_I(x')$, where $\hat{x} \in \mathbb{R}^{H \times W \times 3}$ and $\Phi_I$ corresponds to our inpainting module. In the following, we describe the training of $\Phi_I$, which is illustrated in Fig. 3.

Fig. 3. Scheme of the proposed inpainting training pipeline. Image $x$ is forwarded through the inpainting module. The resulting inpainted image is used in the multi-component loss.

Mask Generation During Training. In order to train $\Phi_I$, we generate pairs of images and masks. The inpainting masks $m_I$ are generated as the pixelwise logical union of two binary masks: the first, $m_w$, marking pixels with wrinkles, and the second, $m_g$, marking wrinkle-shaped regions randomly generated using polygon chains [17]. The latter is added in order to provide more diverse training data and wrinkle-free skin regions. The mask generation policy plays a big role in inpainting pipelines [17, 28] and influences to a high extent the final performance of the model. The training mask policy should generate masks as similar as possible to the inpainting regions at test time. A large difference between training and testing masks can lead to poor inpainting generalization.

We leverage the LaMa architecture [17] in our inpainting module. The underlying components of this network are fast Fourier convolutions (FFC) [18]. While CNNs have a limited local receptive field and use kernels that operate at a fixed spatial scale, FFCs capture global information of the input image and reuse the same kernels at different frequencies (scales). These are useful properties in wrinkle inpainting, due to the repetitive and multi-scale nature of skin texture.

Training Loss. The inpainting problem is ill-posed, as multiple image restorations can be appealing to the human eye. Because of this, training inpainting models is normally a challenging problem [29], sometimes needing multiple losses in order to obtain good results. Simple supervised losses such as the squared Euclidean norm $L_2$ force the generator to inpaint the corrupted regions precisely. However, in some cases the visible parts do not provide enough information to recover the corrupted part successfully, leading to blurry results. More complex losses like the perceptual loss [30] evaluate the distance in the feature space of a pre-trained network $\varphi(\cdot)$, encouraging reconstructions that match the ground truth less exactly in favor of global perceptual structure. In order to obtain photorealistic results, we also make use of multiple loss terms, which we define next.


First, we use a perceptual loss based on a high receptive field network $\varphi_{HRF}(\cdot)$ trained on an image segmentation task [17]. Namely, we use:

$$\mathcal{L}_{HRFPL}(x, \hat{x}) = \Upsilon\big([\varphi_{HRF}(x) - \varphi_{HRF}(\hat{x})]^2 \odot (1 - \text{resize}(m_w))\big), \quad (2)$$

where $[\cdot - \cdot]^2$ is the squared element-wise difference operator and $\Upsilon$ is the two-stage mean operation, i.e. the inter-layer mean of intra-layer means. Note that the resize operator is needed in order to match the spatial dimensions of the mask with the deep features from $\varphi_{HRF}$. All channel features are equally weighted to 1. It is important to note that, since we cannot acquire ground truth images with the wrinkles cleaned, the supervised loss $\mathcal{L}_{HRFPL}$ is not computed on the wrinkle pixels defined by the non-zero pixel values in $m_w$, in order to prevent the model from learning to generate wrinkles.

Besides, following the well-known adversarial strategy, we use a discriminator $D$ to ensure coherent and structured images generated by our inpainting module. We use the approach proposed in [28], with a patch-GAN discriminating at a patch level. Accordingly, the corresponding loss terms are defined as:

$$\mathcal{L}_D = -\mathbb{E}_{x \sim \mathbb{P}}[\log D(x)] - \mathbb{E}_{\hat{x} \sim \mathbb{P}_{\Phi_I}}[\log D(\hat{x}) \odot m] - \mathbb{E}_{\hat{x} \sim \mathbb{P}_{\Phi_I}}[\log(1 - D(\hat{x})) \odot (1 - m)],$$
$$\mathcal{L}_{\Phi_I} = -\mathbb{E}_{\hat{x} \sim \mathbb{P}_{\Phi_I}}[\log D(\hat{x})],$$
$$\mathcal{L}_{Adv} = \mathcal{L}_D + \mathcal{L}_{\Phi_I}, \quad (3)$$

where $x$ denotes a sample from the dataset distribution $\mathbb{P}$, and $\hat{x}$ a sample from the inpainted images distribution $\mathbb{P}_{\Phi_I}$.

The losses presented so far are optimized in the spatial domain. In order to enforce similarity in the spectral domain and encourage high frequencies, we add the Focal Frequency Loss (FFL) defined in [31]. This frequency-domain loss enforces $\Phi_I$ to recover the Fourier modulus and phase of the input image. Concurrently, the authors of [32] have used the same strategy. The FFL is computed as:

$$\mathcal{L}_{FFL} = \frac{1}{MN} \sum_{u=0}^{M-1} \sum_{v=0}^{N-1} w(u, v)\,|F_x(u, v) - F_{\hat{x}}(u, v)|^2, \quad (4)$$

where $F_x$ denotes the 2D discrete Fourier transform of the input image $x$ and $F_{\hat{x}}$ denotes the 2D discrete Fourier transform of the inpainted image $\hat{x}$. The weights $w(u, v)$ correspond to the weight of the spatial frequency at coordinates $(u, v)$ and are defined as $w(u, v) = |F_x(u, v) - F_{\hat{x}}(u, v)|$. The gradient through the matrix $w$ is locked, so it only serves as a weight for each frequency.

Up to this point, none of the presented losses are specific to the problem of wrinkle removal. We propose a novel wrinkle loss $\mathcal{L}_S$ for inpainting that creates a connection between the segmentation and the inpainting modules. Inspired by the fact that our final goal is to generate images without wrinkles, we make use of our wrinkle segmentation module $\Phi_S$ to detect wrinkles in the inpainted image $\hat{x}$ and to generate a supervision signal that discourages the presence of wrinkles in the inpainted image. More concretely, we first use the previously trained segmentation module $\Phi_S$ to segment $\hat{x}$ and


obtain a wrinkle map. Then, we enforce that the generated wrinkle map does not contain any wrinkle pixels. The loss term $\mathcal{L}_S$ is defined as follows:

$$\mathcal{L}_S(x) = \Psi\big(\Phi_S(\Phi_I(x))\big), \quad (5)$$

where $\Psi$ is the spatial mean operator.

In addition to the introduced losses, we also make use of the gradient penalty loss [33], defined as $R_1 = \mathbb{E}_{x \sim \mathbb{P}}\,\|\nabla D(x)\|^2$, and the feature matching loss $\mathcal{L}_{DiscPL}$, which is similar to $\mathcal{L}_{HRFPL}$ but with features obtained from the discriminator $D$, and which is known to stabilize training [28]. The final loss function for our wrinkle inpainting model can be written as:

$$\mathcal{L} = \lambda_0 \mathcal{L}_{Adv} + \lambda_1 \mathcal{L}_{HRFPL} + \lambda_2 \mathcal{L}_{DiscPL} + \lambda_3 R_1 + \lambda_4 \mathcal{L}_{FFL} + \lambda_5 \mathcal{L}_S, \quad (6)$$

where $\lambda_i > 0$, $i \in \{0, \ldots, 5\}$, are fixed hyperparameters (detailed in Sect. 4). Even though our pipeline solves both wrinkle segmentation and wrinkle cleaning, the two solutions are not learnt simultaneously. The two problems require different learning strategies and have different convergence times; applying simultaneous learning would damage the global accuracy of the system.

3.3 Inference

The segmentation and inpainting modules are used sequentially to obtain the output image $\hat{x}$. First, we forward the image through the segmentation model $\Phi_S$ to obtain the wrinkle map. With the segmentation map we then mask our input image and generate the inpainted image through $\Phi_I$. A schematic representation of the inference process is displayed in Fig. 2. However, the inpainting results are always limited by the performance of the segmentation module. The inpainting is always conditioned by the segmentation map $\hat{m}_w$; wrinkles that are not detected by $\Phi_S$ will remain in $\hat{x}$. This problem can be observed in Fig. 6.
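As a minimal, hedged sketch of this sequential inference (not the authors' released code), the two modules can be chained as below; `phi_S` and `phi_I` are assumed to be the trained segmentation and inpainting networks, the 0.5 threshold is an assumption, and masking the image with the complement of the wrinkle map follows the LaMa convention rather than an explicit statement in the text.

```python
import torch

@torch.no_grad()
def remove_wrinkles(x: torch.Tensor, phi_S, phi_I, thr: float = 0.5) -> torch.Tensor:
    """x: (B, 3, H, W) face image. Returns the inpainted, wrinkle-free image."""
    m_w = (torch.sigmoid(phi_S(x)) > thr).float()   # wrinkle map; sigmoid assumes logits output
    x_masked = x * (1.0 - m_w)                      # remove wrinkle pixels (assumed convention)
    x_in = torch.cat([x_masked, m_w], dim=1)        # 4-channel input: masked image + mask
    return phi_I(x_in)                              # wrinkle-free output
```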

4 Experimental Results

In this section we provide quantitative and qualitative results showing the effectiveness of our method. Next, we introduce the datasets and the metrics that will be used throughout the section.

4.1 Dataset and Metrics

We introduce a novel facial dataset to evaluate wrinkle segmentation methods. Based on FFHQ [9], our new dataset, which we dub FFHQ-Wrinkles, is composed of 100 images of different subjects paired with corresponding manually annotated wrinkle segmentation masks. In order to decide which samples to annotate, we have used the FFHQ-Aging dataset [34] age metadata to easily identify potential cases. Even though older people are more prone to develop wrinkles, we also added samples of younger people with facial


Fig. 4. Samples from FFHQ-Wrinkles dataset.

expressions like the ones shown in Fig. 4 to obtain a more diverse population. We have also ensured that samples from different ethnicities and skin tones are collected. All the metric scores have been computed on the FFHQ-Wrinkles dataset.

For training the different methods, we use a proprietary dataset containing 5k samples of RGB frontal face pictures with their corresponding wrinkle annotations. This dataset contains images from subjects diverse in age, gender and ethnicity. Pictures were taken under uncontrolled lighting conditions with a native resolution of 1024 × 1024.

In order to assess the different wrinkle removal methods, we adopt the common metrics reported in the inpainting literature. We use the LPIPS (Learned Perceptual Image Patch Similarity) [35] and FID (Fréchet inception distance) [36] metrics. However, both LPIPS and FID require a target distribution image to be computed. Since we do not have ground truth images with wrinkles removed, we propose the following: first, we synthetically create wrinkle-shaped masks that lie on the skin of the subjects but purposefully avoid overlapping with wrinkles. Then, we evaluate our inpainting method using these masks. Since our model should recover the clear skin, the inpainted result can be directly compared with the original image.

4.2 Implementation Details

The entire pipeline is implemented using the PyTorch [22] framework. The segmentation module is based on the implementation from [37] of Unet++ [19] with a ResNeXt-50 [38] backbone pre-trained on the ImageNet classification task. We train $\Phi_S$ for 200 epochs using the Adam optimizer [39] with an initial learning rate of 0.001, scaled at 100 epochs by 0.5. Images are downsampled to 512 × 512; other image transformations such as VerticalFlip, HorizontalFlip and RandomShift from Albumentations [40] are also applied. We threshold the probability segmentation map at 0.5 in order to decide whether a given pixel is part of a wrinkle.

The inpainting module is based on the official LaMa implementation [17]; the generator is composed of 9 blocks. We use the Adam optimizer [39] for both generator and discriminator with a learning rate of 0.0001. We train using 256 × 256 crops with different spatial image transformations such as Rotation and VerticalFlip. We use a batch size of 16 and train for 300 epochs. In all the experiments, the hyperparameter search is conducted on a separate validation dataset. All methods have been trained with our proprietary dataset following the recommendations suggested by each author. Figures and metrics reported are computed with our new FFHQ-Wrinkles dataset.
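A hedged sketch of this segmentation training setup is given below, assuming the segmentation_models_pytorch and Albumentations libraries mentioned above; the exact encoder string, the ShiftScaleRotate stand-in for the shift augmentation, and the MultiStepLR scheduler are our assumptions, not the authors' configuration.

```python
import albumentations as A
import segmentation_models_pytorch as smp
import torch

# Unet++ with a ResNeXt-50 encoder pre-trained on ImageNet; one output channel
# gives the wrinkle probability map.
model = smp.UnetPlusPlus(encoder_name="resnext50_32x4d",
                         encoder_weights="imagenet",
                         in_channels=3, classes=1)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
# Halve the learning rate at epoch 100, as described in the text.
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[100], gamma=0.5)

# Augmentations named in the text; the shift transform is approximated with
# ShiftScaleRotate, which is an assumption on our part.
augment = A.Compose([
    A.Resize(512, 512),
    A.HorizontalFlip(p=0.5),
    A.VerticalFlip(p=0.5),
    A.ShiftScaleRotate(scale_limit=0.0, rotate_limit=0, p=0.5),
])
```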


4.3 Comparisons to the Baselines

We compare the proposed approach on wrinkle cleaning with state-of-the-art methods in Fig. 5. The method of Batool et al. [3], contrary to ours, needs human intervention to obtain a correct wrinkle segmentation. This intervention is done by selecting the potential facial regions containing wrinkles in order to reduce the search space. Without human intervention it tends to segment high-gradient regions such as hair or clothes as wrinkles. In order to provide a fair comparison and favour their results, we compare our method with two variants of [3], which are displayed in Fig. 5. The first variant consists in using the segmentation map obtained directly from their detection module (with no user intervention). The corresponding results with the segmentation map obtained by [3] are shown in the right-most column. For each experiment, the inpainting result is shown in the first row and the segmentation result in the second row. The second variant consists in manually filtering the segmentation mask obtained automatically by [3] by applying a mask marking the region of valid wrinkles (i.e., avoiding hair, glasses, . . . ). As we do not have access to the authors' inpainting module, we used a patch-based inpainting approach [41] similar to the one used in [3], which is based on the texture synthesis method [42]. Our model largely outperforms the method of [3] and is robust to external elements of the face such as hair, clothes or earrings. It is also able to fill the wrinkle regions with a more natural and smooth skin, leading to skin patches indistinguishable to the human eye (Table 1).

Table 1. Quantitative comparison with the previous state-of-the-art wrinkle cleaning method on the FFHQ-Wrinkles dataset.

Configuration     | IoU ↑ | LPIPS ↓ | FID ↓
Ours              | 0.561 | 0.231   | 7.24
Batool et al. [3] | 0.192 | 3.43    | 14.11

5 Ablations

In order to verify the effectiveness and importance of the proposed modules and principal loss terms, we perform a qualitative and quantitative ablation study on the FFHQ-Wrinkles dataset.

Segmentation Module. The performance of the overall system is heavily influenced by the accuracy of the wrinkle segmentation module, i.e. by the quality of the detected masks. As can be observed in Fig. 6, the wrinkles that are not detected are still present in the final image. While this is a weak point of our method, it can be solved either by improving the segmentation model or by correcting the predicted segmentation mask at test time (Table 2).


Fig. 5. Qualitative comparison to the state-of-the-art methods on wrinkle cleaning on the FFHQ-Wrinkles dataset. Our method outperforms current state-of-the-art algorithms and is robust to external elements of the face such as hair, clothes or earrings.


Fig. 6. Qualitative results of our wrinkle cleaning pipeline. These examples display the limitations of our method and the importance of a correct segmentation in the wrinkle cleaning pipeline.

As reported in Table 2, the Unet++ [19] model provides the highest intersection over union (IoU) score. We associate this improvement with the nested pathways of Unet++, which reduce the semantic gap between the encoder and decoder, making it possible to detect fine-grained details such as small wrinkles. This is reflected in Fig. 8, where the row associated with Unet++ is also the best performing one. Qualitatively, vanilla Unet tends to miss subtle wrinkles (Fig. 8). DeepLabV3 and Unet++ are better at detecting small wrinkles compared with Unet. However, the latter is able to segment the entire wrinkle more precisely instead of segmenting small pieces independently along the wrinkle.

Inpainting Module. In terms of inpainting accuracy, we report the LPIPS and FID metrics in Tables 2 and 3. Our method obtains the lowest errors for both metrics, suggesting that the dual loss term forces the network not only to generate a wrinkle-free image, but also to generate a skin texture that is statistically consistent with the surrounding skin. Finally, in Fig. 7, we show qualitative results for different examples taken from the proposed FFHQ-Wrinkles dataset. Our results are compared to those obtained with different loss configurations. In general, our method is more precise at removing the wrinkles on the whole face due to the dual loss term $\mathcal{L}_S$, which pushes the network to generate wrinkle-free images. Moreover, these results show that the generated skin and non-skin textures are more consistent with the surrounding textures. Looking closely, the other loss configurations display the well-known checkerboard effect [17, 32], a pattern that clearly indicates that the model is not able to properly inpaint the corrupted regions. This is consistent with the quantitative results obtained in Table 3.


Fig. 7. Ablation study on different inpainting configurations. It demonstrates that our configuration is able to obtain more photorealistic results, learning to synthesize the skin distribution and obtain extremely realistic inpainting. We are able to generate the high-frequency pattern of human skin. Surprisingly, our method is able to inpaint diverse face features such as facial or head hair.


Table 2. Comparison of different inpainting and segmentation methods on the FFHQ-Wrinkles dataset. The table displays the segmentation and inpainting metrics of the different ablations conducted. All the methods use the same wrinkle masks for inpainting. The arrow next to each metric indicates if larger values are better (↑) or smaller values are better (↓).

Inpainting configuration | LPIPS ↓ | FID ↓
w/o LS and w/o LFFL      | 34.625  | 11.321
w/ LFFL                  | 31.112  | 8.931
Ours                     | 0.231   | 7.24

Segmentation configuration | IoU ↑
Unet++ [19]                | 0.561
DeepLabV3 [43]             | 0.492
Unet [44]                  | 0.401

Fig. 8. Qualitative results of ablation studies for the wrinkle segmentation module. The last 3 columns correspond to the different segmentation networks evaluated for the task. All methods share the same ResNeXt-50 backbone [38].


Table 3. Quantitative evaluation of the inpainting and segmentation modules on the FFHQ-Wrinkles dataset. We report the LPIPS metric (the lower the better). The different segmentation models and inpainting configurations are listed in rows and columns respectively. Our setup consistently outperforms the other configurations.

Segmentation model | w/o LS and w/o LFFL | w/ LFFL | Ours
Unet [44]          | 42.11               | 35.98   | 32.786
Unet++ [19]        | 34.625              | 31.112  | 29.547
DeepLabV3 [43]     | 35.45               | 34.58   | 30.786

Focal Frequency Loss. $\mathcal{L}_{FFL}$ forces the model to learn the whole spectrum of frequencies, which substantially improves the FID scores of the inpainted images (Table 2). As shown qualitatively in Fig. 7, the model is able to properly inpaint with a natural skin texture. Training the model without optimizing $\mathcal{L}_{FFL}$ degrades the quality of the inpainting; this can be seen in mid-to-large inpainting zones, for instance in Fig. 7, where blurry regions do not capture the high frequencies of the skin.

Dual Loss. $\mathcal{L}_S$ adds small but subtle changes in the inpainting results. The intuition behind it is that our inpainting module is able to use the segmentation loss as a wrinkle expert playing the role of a static wrinkle discriminator, which is able to detect inpainted regions that do not have the appearance of clean skin.

Fast Fourier Convolutions. FFCs allow the model to capture global information and incorporate texture patterns from the entire image. The inductive bias of FFCs also allows the model to generalize to higher resolutions. This is a key component because, due to GPU limitations, it is not possible to train at full resolution.

6 Conclusions

We have presented a pipeline for photorealistic wrinkle removal. This pipeline is based on state-of-the-art methods for segmentation and inpainting, achieving unprecedented results. The proposed two-stage pipeline pushes the state of the art in wrinkle cleaning; however, it struggles with non-frontal images. Our experiments on FFHQ-Wrinkles demonstrate the effectiveness of our newly proposed loss and how it helps to fill regions with plausible skin. We also provide the first public dataset for wrinkle segmentation in order to ease future work on this topic and provide a baseline. We believe that solving wrinkle detection and wrinkle cleaning independently is a key feature that adds interpretability to the system. However, exploring ways to fuse this pipeline and train it jointly is an exciting and interesting direction for future work.

References

1. Adobe Inc.: Adobe Photoshop
2. Visage Lab: Visage Lab face retouch


3. Batool, N., Chellappa, R.: Detection and inpainting of facial wrinkles using texture orientation fields and Markov random field modeling. IEEE Trans. Image Process. 23, 3773–3788 (2014) 4. Alaluf, Y., Patashnik, O., Cohen-Or, D.: Only a matter of style: age transformation using a style-based regression model. ACM Trans. Graph. (TOG) 40, 1–12 (2021) 5. Song, X., Shao, M., Zuo, W., Li, C.: Face attribute editing based on generative adversarial networks. SIViP 14(6), 1217–1225 (2020). https://doi.org/10.1007/s11760-020-01660-0 6. Lample, G., Zeghidour, N., Usunier, N., Bordes, A., Denoyer, L., Ranzato, M.: Fader networks: Manipulating images by sliding attributes. In: Advances in Neural Information Processing Systems, vol. 30 (2017) 7. He, Z., Zuo, W., Kan, M., Shan, S., Chen, X.: AttGAN: facial attribute editing by only changing what you want. IEEE Trans. Image Process. 28, 5464–5478 (2019) 8. Ding, C., Kang, W., Zhu, J., Du, S.: InjectionGAN: unified generative adversarial networks for arbitrary image attribute editing. IEEE Access 8, 117726–117735 (2020) 9. Karras, T., Laine, S., Aila, T.: A style-based generator architecture for generative adversarial networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 4401–4410 (2019) 10. Wu, P.W., Lin, Y.J., Chang, C.H., Chang, E.Y., Liao, S.W.: RelGAN: multi-domain imageto-image translation via relative attributes. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 5914–5922 (2019) 11. Goodfellow, I., et al.: Generative adversarial nets. In: Advances In Neural Information Processing Systems, vol. 27 (2014) 12. Abdal, R., Qin, Y., Wonka, P.: Image2styleGAN++: how to edit the embedded images? In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8296–8305 (2020) 13. Isola, P., Zhu, J.Y., Zhou, T., Efros, A.A.: Image-to-image translation with conditional adversarial networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1125–1134 (2017) 14. Zhu, J.Y., Park, T., Isola, P., Efros, A.A.: Unpaired image-to-image translation using cycleconsistent adversarial networks. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 2223–2232 (2017) 15. Liu, M., et al.: STGAN: a unified selective transfer network for arbitrary image attribute editing. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3673–3682 (2019) 16. Shafaei, A., Little, J.J., Schmidt, M.: AutoreTouch: automatic professional face retouching. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 990–998 (2021) 17. Suvorov, R., et al.: Resolution-robust large mask inpainting with Fourier convolutions. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 2149–2159 (2022) 18. Chi, L., Jiang, B., Mu, Y.: Fast Fourier convolution. Adv. Neural. Inf. Process. Syst. 33, 4479–4488 (2020) 19. Zhou, Z., Siddiquee, M.M.R., Tajbakhsh, N., Liang, J.: UNet++: redesigning skip connections to exploit multiscale features in image segmentation. IEEE Trans. Med. Imaging 39, 1856–1867 (2019) 20. Yap, M.H., Batool, N., Ng, C.C., Rogers, M., Walker, K.: A survey on facial wrinkles detection and inpainting: datasets, methods, and challenges. IEEE Trans. Emerg. Top. Comput. Intell. 5, 505–519 (2021) 21. 
Bastanfard, A., Bastanfard, O., Takahashi, H., Nakajima, M.: Toward anthropometrics simulation of face rejuvenation and skin cosmetic. Comput. Anim. Virtual Worlds 15, 347–352 (2004)


22. Paszke, A., et al.: Pytorch: an imperative style, high-performance deep learning library. In: Advances in Neural Information Processing Systems, vol. 32, pp. 8024–8035. Curran Associates, Inc. (2019) 23. Park, T., Efros, A.A., Zhang, R., Zhu, J.-Y.: Contrastive learning for unpaired image-toimage translation. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12354, pp. 319–345. Springer, Cham (2020). https://doi.org/10.1007/978-3-03058545-7 19 24. Nazeri, K., Ng, E., Joseph, T., Qureshi, F., Ebrahimi, M.: EdgeConnect: structure guided image inpainting using edge prediction. In: Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops (2019) 25. Liao, L., Xiao, J., Wang, Z., Lin, C.-W., Satoh, S.: Guidance and evaluation: semantic-aware image inpainting for mixed scenes. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12372, pp. 683–700. Springer, Cham (2020). https://doi.org/10. 1007/978-3-030-58583-9 41 26. Yang, J., Qi, Z., Shi, Y.: Learning to incorporate structure knowledge for image inpainting. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 34, pp. 12605–12612 (2020) 27. Liao, L., Xiao, J., Wang, Z., Lin, C.W., Satoh, S.: Image inpainting guided by coherence priors of semantics and textures. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 6539–6548 (2021) 28. Wang, T.C., Liu, M.Y., Zhu, J.Y., Tao, A., Kautz, J., Catanzaro, B.: High-resolution image synthesis and semantic manipulation with conditional GANs. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 8798–8807 (2018) 29. Suthar, R., Patel, M.K.R.: A survey on various image inpainting techniques to restore image. Int. J. Eng. Res. Appl. 4, 85–88 (2014) 30. Johnson, J., Alahi, A., Fei-Fei, L.: Perceptual losses for real-time style transfer and superresolution. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9906, pp. 694–711. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46475-6 43 31. Jiang, L., Dai, B., Wu, W., Loy, C.C.: Focal frequency loss for image reconstruction and synthesis. In: ICCV (2021) 32. Lu, Z., Jiang, J., Huang, J., Wu, G., Liu, X.: Glama: joint spatial and frequency loss for general image inpainting. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 1301–1310 (2022) 33. Ross, A., Doshi-Velez, F.: Improving the adversarial robustness and interpretability of deep neural networks by regularizing their input gradients. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 32 (2018) 34. Or-El, R., Sengupta, S., Fried, O., Shechtman, E., Kemelmacher-Shlizerman, I.: Lifespan age transformation synthesis. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12351, pp. 739–755. Springer, Cham (2020). https://doi.org/10.1007/9783-030-58539-6 44 35. Zhang, R., Isola, P., Efros, A.A., Shechtman, E., Wang, O.: The unreasonable effectiveness of deep features as a perceptual metric. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 586–595 (2018) 36. Heusel, M., Ramsauer, H., Unterthiner, T., Nessler, B., Hochreiter, S.: GANs trained by a two time-scale update rule converge to a local Nash equilibrium. In: Advances in Neural Information Processing Systems, vol. 30 (2017) 37. Yakubovskiy, P.: Segmentation models PyTorch (2020). 
https://github.com/qubvel/segmentation_models.pytorch
38. Xie, S., Girshick, R., Dollár, P., Tu, Z., He, K.: Aggregated residual transformations for deep neural networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1492–1500 (2017)


39. Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
40. Buslaev, A., Parinov, A., Khvedchenya, E., Iglovikov, V., Kalinin, A.: Albumentations: fast and flexible image augmentations. ArXiv e-prints (2018)
41. Newson, A., Almansa, A., Gousseau, Y., Pérez, P.: Non-local patch-based image inpainting. Image Process. Line 7, 373–385 (2017). https://doi.org/10.5201/ipol.2017.189
42. Efros, A.A., Freeman, W.T.: Image quilting for texture synthesis and transfer. In: Proceedings of the 28th Annual Conference on Computer Graphics and Interactive Techniques, pp. 341–346 (2001)
43. Chen, L.C., Papandreou, G., Schroff, F., Adam, H.: Rethinking atrous convolution for semantic image segmentation. arXiv preprint arXiv:1706.05587 (2017)
44. Ronneberger, O., Fischer, P., Brox, T.: U-Net: convolutional networks for biomedical image segmentation. In: Navab, N., Hornegger, J., Wells, W.M., Frangi, A.F. (eds.) MICCAI 2015. LNCS, vol. 9351, pp. 234–241. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-24574-4_28

Improving Segmentation of Breast Arterial Calcifications from Digital Mammography: Good Annotation is All You Need

Kaier Wang, Melissa Hill, Seymour Knowles-Barley, Aristarkh Tikhonov, Lester Litchfield, and James Christopher Bare

Volpara Health Technologies, Wellington, New Zealand, [email protected]

Abstract. Breast arterial calcifications (BACs) are frequently observed on screening mammography as calcified tracks along the course of an artery. These build-ups of calcium within the arterial wall may be associated with cardiovascular diseases (CVD). Accurate segmentation of BACs is a critical step in their quantification for the risk assessment of CVD, but is challenging due to severely imbalanced positive/negative pixels and annotation quality, which is highly dependent on the annotator's experience. In this study, we collected 6,573 raw tomosynthesis images where 95% had BACs in the initial pixel-wise annotation (performed by a third-party annotation company). The data were split with stratified sampling into 80% train, 10% validation and 10% test. We then evaluated the performance of the deep learning models DeepLabV3+ and Unet in segmenting BACs with varying training strategies such as different loss functions, encoders, image sizes and pre-processing methods. During the evaluation, large numbers of false positive labels were found in the annotations, which significantly hindered the segmentation performance. Manual re-annotation of all images would be impossible owing to the required resources. Thus, we developed an automatic label correction algorithm based on the morphology and physical properties of BACs. The algorithm was applied to the training and validation labels to remove false positives. In comparison, we also manually re-annotated the test labels. The deep learning model re-trained on the algorithm-corrected labels resulted in a 29% improvement in the dice similarity score against the re-annotated test labels, suggesting that our label auto-correction algorithm is effective and that good annotations are important. Finally, we examined the drawbacks of an area-based segmentation metric, and proposed a length-based metric to assess the structural similarity between annotated and predicted BACs for improved clinical relevance.

Keywords: Breast arterial calcification · Deep learning · Segmentation

Supplementary Information: The online version contains supplementary material available at https://doi.org/10.1007/978-3-031-27066-6_10.

1 Introduction

Mammography is a diagnostic imaging technique that is used to detect breast cancer and other early breast abnormalities. During mammography, each patient normally has one mediolateral oblique (MLO) and one craniocaudal (CC) projection for the left and right breasts. Breast arterial calcifications (BACs) noted on mammograms are calcium deposits in the walls of arteries in the breast [5], appearing in various structures and patterns [7]. The presence and progression of BACs have been shown to be associated with coronary artery disease and cardiovascular disease (CVD) in recent clinical studies [16,20]. The prevalence of BACs in screening mammograms has been estimated at 12.7% [12]. Breast radiologists may note BACs as an incidental finding, but doing so is subjective and time-consuming, thus automation may be beneficial. Computer-aided detection is not new, and several authors have reported promising BACs segmentation results using either classical computer vision algorithms [7,8,10,35] or deep learning models [3,11,28].

Deep learning models have gained increasing popularity in various medical domains [22,23] for their outstanding performance in different tasks such as lesion detection [21], tumor segmentation [27] and disease classification [30], among others. The performance of a deep learning model is typically influenced by hyper-parameters chosen during model training. These include parameters related to experimental factors such as epochs, batch size and input image size; parameters related to training strategies such as loss function, learning rate and optimiser; and parameters related to model architecture such as number of layers and choice of encoder. Kaur et al. [17] and Thambawita et al. [31] demonstrated how a model's performance could be improved by properly configuring the hyper-parameters.

Apart from hyper-parameters, the effect of annotation quality on object segmentation has received little attention. In medical image segmentation, and especially in the scope of BACs segmentation, it is hard or impossible to conduct sufficiently sophisticated annotation due to cost and the required domain expertise. Yu et al. [37] indicated that medical imaging data paired with noisy annotation is prevalent. Their experimental results revealed that the model trained with noisy labels performed worse than the model trained using the reference standard. To address the challenge of noisy labels, in this study we propose a label correction algorithm to automatically remove false positives from manual BACs annotations. The effect of the corrected labels is evaluated along with other hyper-parameters such as image size, image normalisation methods and model architectures. Finally, we analyse the drawbacks of an area-based segmentation metric, and propose a length-based metric to evaluate the structural similarity between annotated and predicted BACs for better clinical relevance.

2 Materials

2.1 Dataset

The de-identified image data were collected at a single health institution in the United States. There are 6,573 raw tomosynthesis images acquired from two x-ray systems: 5,931 from the GE Pristina system and 642 from the GE Senographe Essential. GE Pristina images have a resolution of 2,850 × 2,394 and 0.1 mm per pixel; GE Senographe Essential images have a resolution of 3,062 × 2,394 and 0.1 mm per pixel.

Digital breast tomosynthesis is a clinical imaging technology in which an x-ray beam sweeps in an arc across the breast, producing tomographic images for better visibility of malignant structures [2]. In this study, we use the central projection image from those collected in each scan for simplicity, as the source is normal to the detector for this image. To the best of our knowledge, it may be one of the largest BACs datasets, and the first one reported to use tomosynthesis images for training and evaluating deep learning models. In comparison, other reported datasets are summarised in Table 1.

Table 1. Literature reported BACs datasets.

Literature | No. of images | BACs+ % | Modality
[34]       | 661           | 60      | 2D mammography
[11]       | 826           | NA      | 2D mammography
[28]       | 5,972         | 14.93   | 2D mammography
[3]        | 840           | 50      | 2D mammography

2.2 Annotation

The annotation task was performed using Darwin.v7labs (https://www.v7labs.com/) by a third-party annotation company, where the data were first split into batches and then assigned to multiple annotators. Cross-checking or consensus reading was not performed on the annotation results, so each image was only reviewed by a single annotator. All annotators have prior annotation experience in the general domain, but little specific medical knowledge, especially in radiology. An introduction to BACs and annotation guidance were supplied to each annotator. Briefly, the task is to use the brush tool to label the BACs pixels. The first 50 annotated samples were reviewed by an in-house imaging scientist with 3 years of experience in mammographic image analysis. Upon the acceptance of these samples, the annotators completed the remaining images.

3 Methods

3.1 Image Pre-processing

Digital mammography generally creates two types of images: raw and processed. Raw images are the images as acquired, with some technical adjustment such as pixel calibration and inhomogeneity correction; processed images are manipulated


Fig. 1. Seven consecutive years of mammograms of the same patient, shown as processed images from three different mammography systems (from left to right): Hologic, GE, Hologic, Hologic, Siemens, Siemens, Siemens. Presentation of the processed images varies significantly based on the mammography system characteristics and image processing.

Fig. 2. Demonstration of a compressed breast during mammography (left) and its mammographic segmentation in craniocaudal (middle) and mediolateral oblique (right) views. The compressed region (contact area) is labelled in white, pectoral muscle in light gray and periphery in dark gray. The directly exposed area to x-ray beam is labelled in black as background.

from the raw images by a manufacturer-specific algorithm to enhance the image contrast to better fit the human eye response in detecting tissue lesions. Manufacturers' preferences in image processing may result in the same breast imaged on two x-ray systems having distinctive appearances, as shown in Fig. 1. Furthermore, [36] reported that a deep learning model trained on processed images from one manufacturer cannot be transferred to an unseen external dataset, possibly due to image inconsistency. In contrast, raw images record the original information of the detector response, making them ideal for further image processing to achieve a consistent contrast and visual appearance across different mammography systems. Here, we present three normalisation methods of increasing complexity: simple gamma correction, self-adaptive contrast adjustment and the Volpara® density map. All three methods depend on a segmentation map (see Fig. 2) labelling the pectoral muscle, fully compressed and peripheral regions, which were produced by the VolparaDensity™ software. The segmentation accuracy of the software was validated by Woutjan et al. [4].

Simple Gamma Correction. Given a raw image $I^{\text{raw}}$, a logarithm transform is applied to each pixel as in Eq. (1) [24], which brightens the intensities of the breast object:


$$I^{\ln} = \ln(I^{\text{raw}} + 1.0). \quad (1)$$

Then, a gamma correction, Eq. (2), is applied to the log-transformed image:

$$I^{\text{gamma}} = \left[ (I^{\ln} - I^{\ln}_{\min}) / (I^{\ln}_{\max} - I^{\ln}_{\min}) \right]^{1/2}, \quad (2)$$

where $I^{\ln}_{\min}$ and $I^{\ln}_{\max}$ are the minimum and maximum pixel values, respectively, in the breast region of $I^{\ln}$.
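A minimal NumPy sketch of Eqs. (1)–(2) is given below; the boolean `breast_mask` (derived from the segmentation map mentioned above) and the clipping step are our assumptions, not part of the paper's description.

```python
import numpy as np

def simple_gamma_correction(i_raw: np.ndarray, breast_mask: np.ndarray) -> np.ndarray:
    """Log transform (Eq. 1) followed by a square-root gamma correction (Eq. 2),
    normalised by the min/max pixel values inside the breast region."""
    i_ln = np.log(i_raw.astype(np.float64) + 1.0)            # Eq. (1)
    lo, hi = i_ln[breast_mask].min(), i_ln[breast_mask].max()
    i_norm = np.clip((i_ln - lo) / (hi - lo), 0.0, 1.0)      # clipping is an assumption
    return i_norm ** 0.5                                     # Eq. (2), exponent 1/2
```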

Self-adaptive Contrast Adjustment. The algorithm brings a given raw mammogram $Y_0$ to a target mean intensity $\bar{Y}$ in the breast region by iteratively applying gamma correction. At each step, the gamma value is computed as in Eq. (3a), and the image is gamma corrected as in Eq. (3b):

$$\gamma_{i+1} = \gamma_i \times \ln(\bar{Y} - Y_i), \quad (3a)$$
$$Y_{i+1} = Y_i^{\gamma_{i+1}}, \quad (3b)$$

where $\gamma_0 = 1$ and $Y_i$ is the mean pixel intensity in the breast region after the gamma transformation of the previous step. After a set number of iterations, or after $Y_i$ stops converging to the target $\bar{Y}$, the process is terminated. See [19] for the detailed implementation.

Volpara® Density Map. The Volpara® algorithm finds an area of the breast within a region in contact with the compression paddle that corresponds to entirely fatty tissue, referred to as $P^{\text{fat}}$, and uses it as a reference level to compute the thickness of the dense tissue $h_{\text{dt}}$ at each pixel location $(x, y)$ based on Eq. (4) [14,18]:

$$h_{\text{dt}}(x, y) = \frac{\ln(P(x, y)/P^{\text{fat}})}{\mu^{\text{fat}} - \mu^{\text{dt}}}, \quad (4)$$

where the pixel value $P(x, y)$ is linearly related to the energy imparted to the x-ray detector, and $\mu^{\text{fat}}$ and $\mu^{\text{dt}}$ are the effective x-ray attenuation coefficients for fat and dense tissue, respectively, at a particular peak potential (kilovoltage peak, or kVp) applied to the x-ray tube [13]. Equation (4) converts a raw mammographic image to a density map where the pixel value corresponds to the dense tissue thickness. The volumetric breast density is then computed by integrating over the entire breast area in the density map. The VolparaDensity™ algorithm has shown strong correlation with the ground truth reading (magnetic resonance imaging data) [33], and its density measurements are consistent across various mammography systems [9].

Figure 3 shows the normalisation results of the above three methods on raw images acquired from the two x-ray systems. Although the raw images are displayed in the same intensity range, the GE Pristina tomosynthesis is clearly brighter than the GE Senographe Essential image, and the dense tissues are hardly visible in both images. The simple gamma correction and the self-adaptive contrast adjustment stretch the contrast between fat and dense tissue to minor and moderate degrees respectively, while the Volpara® density map is a complete nonlinear transform revealing the volumetric tissue properties.
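As a small illustrative sketch (not the VolparaDensity™ implementation), Eq. (4) maps a raw pixel value to a dense-tissue thickness once the fatty reference level and the attenuation coefficients for the acquisition kVp are known; all argument values here are assumptions supplied by the caller.

```python
import numpy as np

def dense_tissue_thickness(p: np.ndarray, p_fat: float,
                           mu_fat: float, mu_dt: float) -> np.ndarray:
    """Eq. (4): per-pixel dense-tissue thickness from a raw mammogram whose
    pixel values are linearly related to the energy imparted to the detector.
    p_fat is the reference level of an entirely fatty region; mu_fat and mu_dt
    are effective attenuation coefficients at the acquisition kVp."""
    return np.log(p.astype(np.float64) / p_fat) / (mu_fat - mu_dt)
```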


Fig. 3. Examples of normalising raw GE tomosynthesis images using three different methods: simple gamma correction, self-adaptive contrast adjustment and Volpara® density map. The 16-bit raw images (first column) are displayed in the same range of [62299, 65415] for better visibility.

3.2 Other Training Variables

Apart from the different image normalisation methods, we also investigated the following training variables:

– Image size: 1024 × 1024 and 1600 × 1600.
– Loss function: 1/2 × (Dice loss + MSE loss) [29].
– Model architecture: see Table 2.

In this study, we used a single Tesla T4 GPU, which can accommodate a maximum batch size of 3 and a 1600 × 1600 input image size in the training phase. The experiment with a patch-based implementation on full-resolution images is reported in the Supplementary Material; we found that its performance is not comparable with the full-image implementation, and its slow inference is not practical in a busy clinical environment (a GPU is normally not available on a Picture Archiving and Communication System).

Table 2. Deep learning models [15] in this study.

Model name  | Architecture | Encoder               | Parameters, M
UnetR       | Unet         | Resnet34              | 24.43
DeepLabV3+R | DeepLabV3+   | Resnet34              | 22.43
DeepLabV3+M | DeepLabV3+   | Mobilenetv3 large 100 | 4.70
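A hedged PyTorch sketch of the 1/2 × (Dice loss + MSE loss) objective listed above is shown below; the squared-denominator soft-Dice form and the smoothing constant `eps` are our assumptions, since the exact formulation of [29] is not reproduced in the text.

```python
import torch
import torch.nn.functional as F

def combined_loss(pred: torch.Tensor, target: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """1/2 x (Dice loss + MSE loss) for a predicted BAC probability map and a
    binary ground-truth mask, both of shape (B, 1, H, W)."""
    p, t = pred.flatten(start_dim=1), target.flatten(start_dim=1)
    dice = (2.0 * (p * t).sum(dim=1) + eps) / (p.pow(2).sum(dim=1) + t.pow(2).sum(dim=1) + eps)
    dice_loss = (1.0 - dice).mean()
    mse_loss = F.mse_loss(pred, target)
    return 0.5 * (dice_loss + mse_loss)
```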


Fig. 4. Zoomed mammogram image sections for comparison between (b) third-party initial BACs annotations and our (c) in-house secondary manual annotations on the same (a) raw image (contrast enhanced for visualisation). In the top panel, micro calcifications and linear dense tissues are initially labelled as BACs; in the bottom panel, the edge of dense tissues is a false positive annotation.

3.3 Label Correction Algorithm

While evaluating the model performance, we discovered a large number of false positives in the BACs annotations. Most commonly, microcalcifications and dense tissues were mislabelled as BACs, as shown in Fig. 4. Instead of manually re-annotating all images, we developed a correction algorithm that automatically removes these false positive labels based on the morphology of BACs. In [13], Highnam et al. derived Eq. (5) to calculate the calcification thickness (in millimetres) from a Volpara® density map:

$$h_{\text{calc}}(x, y) = \frac{(\mu^{\text{dt}} - \mu^{\text{fat}})\,(h_{\text{dt}}(x, y) - h_{\text{dt\_bkg}}(x, y))}{\mu^{\text{calc}}}, \quad (5)$$

where $\mu^{\text{calc}}$ is the effective x-ray attenuation coefficient for calcification. We use the values of the linear attenuation coefficients at 18 keV: $\mu^{\text{dt}} = 1.028$ and $\mu^{\text{calc}} = 26.1$. $h_{\text{dt\_bkg}}$ is the background tissue thickness, which can be estimated with a morphological opening operation as demonstrated in Fig. 5. Thus, a Volpara® density map describing dense tissue thickness can be converted to a calcification thickness map. Then, overlaying the annotation on the calcification thickness map, labels that are either too short (stand-alone micro calcifications) or have insufficient mean


Fig. 5. Demonstration of calcifications and the corresponding background tissues in Volpara® density map. (a) Zoomed-in view of a Volpara® density map where calcifications region is highlighted. (b) Mesh surface of the tissue density (thickness in millimeters) in the highlighted region of (a). (c) Background tissues resulting from morphological opening operation of (b).

Fig. 6. Demonstration of correcting BACs annotations. First, Volpara® density algorithm converts (a) raw image (zoomed image section) to (b) density map where the pixel value represents dense tissue thickness. Then, Eq. (5) converts (b) density map to (d) calcification thickness map where the pixel value is the potential calcification thickness. The automatic label correction algorithm examines (c) initial BACs annotation. After removing the labels with insufficient length or mean calcification thickness (based on (d) calcification thickness map), the algorithm yields (e) corrected BACs annotation.

calcification thickness (dense tissues) are removed. An example of correcting BACs annotation is shown in Fig. 6.
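The filtering step described above can be sketched as follows (a hypothetical illustration, not the authors' implementation): the density map is converted to a calcification thickness map with Eq. (5), the annotation is split into connected components, and components that are too short or too thin on average are discarded. The structuring-element size, the length proxy (major axis length) and both thresholds are assumptions.

```python
import numpy as np
from skimage.measure import label, regionprops
from skimage.morphology import disk, opening

def correct_bac_labels(density_map: np.ndarray, annotation: np.ndarray,
                       mu_dt: float, mu_fat: float, mu_calc: float,
                       min_length_px: int = 50,
                       min_mean_thickness_mm: float = 0.1) -> np.ndarray:
    """Remove annotated components that are too short or too thin to be BACs."""
    # Background dense-tissue thickness via grey-level morphological opening (cf. Fig. 5).
    h_bkg = opening(density_map, disk(10))                    # footprint size is an assumption
    # Eq. (5): per-pixel calcification thickness in millimetres.
    h_calc = (mu_dt - mu_fat) * (density_map - h_bkg) / mu_calc

    corrected = np.zeros_like(annotation, dtype=bool)
    for region in regionprops(label(annotation > 0)):
        rows, cols = region.coords[:, 0], region.coords[:, 1]
        long_enough = region.major_axis_length >= min_length_px
        thick_enough = h_calc[rows, cols].mean() >= min_mean_thickness_mm
        if long_enough and thick_enough:                      # keep only plausible BAC segments
            corrected[rows, cols] = True
    return corrected
```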

3.4 Length-Based Dice Score

Typical semantic segmentation evaluation metrics, such as recall, precision, accuracy and the F1/dice score, examine the overlapping area between prediction and annotation. However, BACs are small in size, and slight differences in their segmented region may have a strong negative effect on a standard metric like the dice score, despite the segmentation still capturing sufficient clinically relevant calcifications. Furthermore, the quantification of BACs is clinically measured by their length rather than their area [1,32]. Therefore, we introduce a length-based dice score to better focus on the linear trace of the BACs.


Fig. 7. An example showing (b) prediction and annotation on (a) a BACs patch (the prediction and annotation were performed on the full image; we only show a patch here for better visibility). The standard area-based dice similarity score from (b) is 0.7081. In (c), the temporary ground truth is derived from skeletonisation followed by dilation of the annotation in (b). The length-based dice score reads 0.9750, according to Eq. (6).

Following the work of Wang et al. [35], we first derived the skeletons of both the annotated and the predicted BACs labels. Then we dilated the skeletonised annotation labels to a width of 2.1 mm, the typical width of a breast artery [34]. The dilated labels were taken as a temporary ground truth. The length-based dice score is defined as

Dice_L = 2 \times \frac{\text{Length of predicted BACs within temporary ground truth zone}}{\text{Length of predicted BACs} + \text{Length of annotated BACs}} \qquad (6)

The numerator is the length of the skeletonised prediction within the region of the temporary ground truth, as seen in Fig. 7(c). In the denominator, the lengths of the predicted and annotated BACs are calculated from their respective skeletons. Dice_L ranges between 0 and 1, with 1 being a complete match of the length between prediction and ground-truth BACs. From Figs. 7(a) and (b), there is clearly under-segmented annotation at multiple locations along the BACs trace. Wang et al. also reported similar annotation defects in Fig. 4 of [35]. Figure 7(b) shows a strong visual agreement between the prediction and annotation, and the prediction appears to have a better segmentation quality than the annotation. But their subtle mismatch in area only yields a moderate dice similarity of 0.7081. In comparison, our proposed Dice_L focuses on the correlation of the linear trace. As shown in Fig. 7(c), the skeleton of the prediction delineates the BACs signals in high agreement, resulting in a more clinically relevant Dice_L score of 0.9750.
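A minimal sketch of Eq. (6), assuming that counting skeleton pixels is an acceptable proxy for length and that the 2.1 mm artery width has already been converted to a pixel radius (artery_radius_px is a hypothetical parameter that depends on the pixel spacing):

```python
import numpy as np
from skimage.morphology import binary_dilation, disk, skeletonize

def length_dice(pred_mask, annot_mask, artery_radius_px):
    """Length-based Dice of Eq. (6) between a predicted and an annotated BACs mask."""
    pred_skel = skeletonize(pred_mask.astype(bool))
    annot_skel = skeletonize(annot_mask.astype(bool))
    # Temporary ground truth: annotation skeleton dilated to the typical artery width.
    temp_gt = binary_dilation(annot_skel, disk(artery_radius_px))
    overlap_len = np.count_nonzero(pred_skel & temp_gt)             # numerator length
    total_len = np.count_nonzero(pred_skel) + np.count_nonzero(annot_skel)
    return 2.0 * overlap_len / total_len if total_len else 1.0
```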


4 Experiments

4.1 Data Split

The annotation results indicate positive BACs in 95% of the images. The images were split with stratified sampling into 80%/10%/10% for train/validation/test, so that each split has the same positive rate of 95%.
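For illustration, such a stratified split could be produced as sketched below; image_ids and has_bacs are hypothetical parallel arrays holding the image identifiers and their binary BACs status.

```python
import numpy as np
from sklearn.model_selection import train_test_split

def stratified_split(image_ids, has_bacs, seed=0):
    """80%/10%/10% train/validation/test split preserving the BACs-positive rate."""
    idx = np.arange(len(image_ids))
    trainval_idx, test_idx = train_test_split(
        idx, test_size=0.10, stratify=has_bacs, random_state=seed)
    train_idx, val_idx = train_test_split(
        trainval_idx, test_size=1 / 9,   # 10% of the full set out of the remaining 90%
        stratify=np.asarray(has_bacs)[trainval_idx], random_state=seed)
    return train_idx, val_idx, test_idx
```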

4.2 Secondary Annotation on Test Data

The test data were carefully re-annotated by an in-house imaging scientist to remove as many false positive segments from the initial annotation as possible. After the re-annotation, 93 images are unchanged, 8 images have added BACs segments, and 399 images have false positive segments removed; 155 images have both added and removed segments. The re-annotated labels, denoted ExpertGT (expert ground truth), were used to test the deep learning models trained on the original and on the algorithm-corrected train and validation labels.

4.3 Training Settings

The input mammograms are 16-bit, grayscale images. The images went through contrast normalisation via the methods in Sect. 3.1, followed by a standard pixel value redistribution to achieve a mean of 0 and a standard deviation of 1. We also applied common augmentations such as blurring the image using a random-sized kernel, random affine transforms, and random brightness and contrast [6]. The models were trained using the Adam optimiser with an initial learning rate of 1e-4. During training, the loss on the validation dataset was monitored at the end of each epoch. The learning rate was reduced by a factor of 10 once the validation loss plateaued for 5 epochs. After a maximum of 100 epochs, or after 10 epochs without improvement of the validation loss, training was terminated. The model with the best validation loss was saved. The deep learning segmentation model outputs a probability map. A final binary mask is obtained by applying a cut-off threshold to the probability map. In this study, the cut-off threshold was determined by a parameter sweep from 0.05 to 0.95 in steps of 0.05 on the validation dataset to achieve the highest dice similarity score.
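The cut-off selection can be sketched as a simple sweep; prob_maps and gt_masks are hypothetical lists of validation probability maps and their binary labels.

```python
import numpy as np

def dice(pred, target, eps=1e-7):
    """Area-based Dice between two binary masks."""
    inter = np.logical_and(pred, target).sum()
    return (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)

def best_threshold(prob_maps, gt_masks):
    """Sweep cut-offs from 0.05 to 0.95 in steps of 0.05 and keep the one
    maximising the mean Dice score on the validation set."""
    best_t, best_score = 0.5, -1.0
    for t in np.arange(0.05, 0.96, 0.05):
        score = float(np.mean([dice(p >= t, g) for p, g in zip(prob_maps, gt_masks)]))
        if score > best_score:
            best_t, best_score = t, score
    return best_t, best_score
```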

4.4 Results

Table 3 and Fig. 8 present the BACs segmentation results for various combinations of model, image size, normalisation method and annotation labels. For ease of interpretation, we categorised the results into two groups, as illustrated in Table 3. The bottom group comprises five runs from rt0 to rt4 (r stands for run, and t is short for third-party annotated labels), examining the impact of image size (rt3 vs. rt4) and normalisation (rt0, rt1, rt2 and rt4) on the segmentation performance.


Clearly, the model trained on the larger image size yields a better dice score, since higher image resolutions preserve more subtle texture information [31]. Meanwhile, simple gamma correction outperforms auto gamma correction (i.e. self-adaptive contrast adjustment) and the Volpara® density map in both the DiceTPA and DiceExpertGT scores (i.e. the Dice scores computed against the respective third-party annotated, TPA for short, and ExpertGT labels of the test data). Compared to the reference run rt0 (no image normalisation), image normalisation has a positive influence on segmentation performance. The top group (ra0 to ra3, where 'a' stands for algorithm-corrected labels) in Table 3 investigates the impact of annotations. As mentioned in Sect. 3.3, we developed an automatic algorithm to correct false positives in the TPA labels. Here, the models were trained on the corrected labels and then evaluated on the original TPA and our re-annotated test labels (ExpertGT in the table). As the ExpertGT labels were corrected from the TPA labels, the two kinds of labels fall into distinct distributions. As a result, the models trained on TPA labels did not perform well on the ExpertGT labels, and vice versa. In contrast, the models trained on the algorithm-corrected labels show significantly improved DiceExpertGT scores (compare DiceExpertGT before and after rt4 in Fig. 8). For example, ra1's Dice score (0.4485) is 29% higher than rt4's (0.3477) on the ExpertGT labels of the test data, demonstrating the effectiveness of our label correction algorithm. The top group in Table 3 also probes different model structures. The runs ra2 and ra3 correspond to the Unet and DeepLabV3+ architectures respectively; both use the same Resnet34 encoder, which has 5 times more parameters than DeepLabV3+M (see Table 2 for details). Performance-wise, DeepLabV3+R falls slightly behind UnetR, and both surpass DeepLabV3+M (ra1) by relatively large margins of 4.88% and 8.12% in DiceExpertGT, respectively. Shifting attention from Dice to DiceL in the last column of Table 3, the length-based DiceL score gives a better intuition of how the predicted BACs clinically correlate with the ExpertGT annotation. Since DiceL mitigates slight differences in the BACs segmentation region or width, its values are higher than those of the Dice metric. Generally, the DiceL and Dice results are aligned. They both show the advantages of using a larger image size, a larger model and better quality annotation labels. Among these factors, quality annotation labels perhaps play the most important role in improving segmentation performance. Figure 8 shows that models rt0 to rt4 perform similarly as measured by either Dice or DiceL, whereas models ra0 to ra3, trained on the corrected labels, show consistently higher scores. The major difference between ra0 and rt0 – rt4 is whether or not the corrected labels were used during training. Figure 9 shows examples of BACs segmentation results from the top three performing models ra1, ra2 and ra3. The overall BACs segmentation is visually very close to the annotation, and the DiceL score matches our perception better than the Dice score does. In the example in the third row, a surgical scar similar in appearance to BACs can be seen.


Table 3. Comparison of BACs segmentation performance in terms of the area-based Dice and length-based DiceL scores for various training configurations. Note the Dice score was computed against both the third-party annotated (TPA) and the re-annotated (ExpertGT) test labels, while the DiceL score was computed against ExpertGT only. The highest Dice and DiceL scores are highlighted.

Run ID  Model        Normalisation  Image size  Train/Val labels  Dice (TPA)  Dice (ExpertGT)  DiceL (ExpertGT)
ra3     DeepLabV3+R  Simple gamma   1600        Algorithm         0.3061      0.4704           0.6261
ra2     UnetR        Simple gamma   1600        Algorithm         0.3122      0.4849           0.6121
ra1     DeepLabV3+M  Simple gamma   1600        Algorithm         0.3072      0.4485           0.5960
ra0     DeepLabV3+M  Raw image      1600        Algorithm         0.3042      0.3993           0.5456
rt4     DeepLabV3+M  Simple gamma   1600        TPA               0.3771      0.3477           0.5088
rt3     DeepLabV3+M  Simple gamma   1024        TPA               0.3500      0.3287           0.4962
rt2     DeepLabV3+M  Auto gamma     1600        TPA               0.3610      0.3447           0.5053
rt1     DeepLabV3+M  Density map    1600        TPA               0.3680      0.3442           0.5084
rt0     DeepLabV3+M  Raw image      1600        TPA               0.3590      0.3355           0.5066

Fig. 8. Visualisation of the Dice and DiceL scores in Table 3. The vertical line between rt4 and ra0, as the horizontal mid-line in Table 3, divides the data into two groups.

The models ra1 and ra2 incorrectly label the scar as BACs, while ra3 does not produce such a false positive prediction. Notably, ra1 achieves performance comparable to ra2 and ra3 despite having significantly fewer parameters.

4.5 Additional Results

Molloi et al. [25] and Wang et al. [34] reported a symmetrical presence of BACs between the two views (CC vs MLO) and between the two breasts (left vs right). Indeed, as shown in Fig. 10, using the segmentation result as a binary prediction, we find a strong correlation between the left and right breasts, with a weighted F1-score of 0.6711 for the annotation and 0.8811 for our method. Further, a similar correlation is found between the CC and MLO views: 0.8613 for the annotation and 0.9139 for our method.


Fig. 9. Examples of BACs segmentation results for ra1, ra2 and ra3 models (see Table 3 for details) as compared to ExpertGT annotations. The raw images were zoomed and contrast enhanced for better visibility.

5 Discussion

This paper presents a comprehensive analysis of training factors and their impacts on BACs segmentation performance. These training factors include input image size, normalisation method, model architecture and annotation quality. Firstly, we found that segmentation accuracy benefits from using a larger image size, for which more subtle features are preserved relative to downsampled images. In the image normalisation experiments, although the models trained on normalised images outperform the model trained on raw images, their performance does not vary significantly, with the simplest gamma correction showing a modest advantage over other, more complicated methods. Further, we compared the performance of three model structures: DeepLabV3+M, DeepLabV3+R and UnetR. DeepLabV3+M has 5 times fewer parameters than UnetR and DeepLabV3+R, but their performances are very comparable and their segmentation results are visually close. Lastly, we revealed the importance of high-quality annotation. Among the training factors studied, annotation seems to be the most critical for segmentation performance. A good (here, algorithm-corrected) annotation alone shows the highest increase in BACs segmentation performance, relative to the other hyper-parameter settings tested. However, for hand-crafted annotation it is practically difficult to achieve pixel-level perfection or consistency between readers, and delineation of fine BACs structures in the presence of imaging artefacts is particularly challenging. Expert annotation is expensive, and consensus reads from many experts even more so.


Fig. 10. Confusion matrices for BACs presence between left and right breasts, and between CC and MLO views on test data. Out of 655 test images, there are 380 images belonging to 95 patients with all 4 views (LCC, LMLO, RCC and RMLO) available. The binary prediction is achieved by examining if a segmentation mask is empty. The presented results are from ra3.

Pixel-perfect consistency is difficult even for experts. Crowd-sourced non-expert annotation is inexpensive, but imperfect. To bridge this annotation quality gap, we developed a label correction algorithm that automatically removes false positive labels from BACs annotations. The algorithm demonstrated its effectiveness by enabling significantly improved segmentation performance. In addition to the investigation of training factors, we also developed what may be a more clinically relevant metric to improve the evaluation of BACs segmentation. Subtle differences in artery segmentation may have a significant detrimental effect on standard evaluation metrics like the Dice score, yet the segmentation may still capture clinically important calcifications with acceptable results. To quantify such clinical relevance, we shifted the Dice calculation from its focus on area similarity to trace similarity. The new metric, DiceL, has been shown to provide a more intuitive measure of the structural correlation between BACs prediction and annotation. A limitation of this work is the lack of ground truth that is independent of human annotation; this could come from either physical or digital (simulated) phantom images [26]. We are interested in carrying out a validation study where BACs segmentation can be compared to known calcification size and location. Another important limitation is the lack of negative control images to more thoroughly train and test the model and to comprehensively validate the false positive rate. For example, Fig. 10 is of limited value without more cases that have no BACs. In summary, deep learning models have demonstrated promising performance on BACs segmentation. An optimal result can be achieved with higher input image resolution, appropriate image contrast adjustment and a larger deep learning model. Annotation quality is found to be a key factor determining segmentation performance; in general, a model trained with noisy labels is inferior to one trained with good annotation. We recommend that other researchers conduct comprehensive quality control over the annotation process. A thorough review would be required if the annotations were made by non-expert readers.


References

1. Abouzeid, C., Bhatt, D., Amin, N.: The top five women's health issues in preventive cardiology. Curr. Cardiovasc. Risk Rep. 12(2), 1–9 (2018). https://doi.org/10.1007/s12170-018-0568-7
2. Alakhras, M., Bourne, R., Rickard, M., Ng, K., Pietrzyk, M., Brennan, P.: Digital tomosynthesis: a new future for breast imaging? Clin. Radiol. 68(5), e225–e236 (2013). https://doi.org/10.1016/j.crad.2013.01.007
3. AlGhamdi, M., Abdel-Mottaleb, M., Collado-Mesa, F.: DU-Net: convolutional network for the detection of arterial calcifications in mammograms. IEEE Trans. Med. Imaging 39(10), 3240–3249 (2020). https://doi.org/10.1109/TMI.2020.2989737
4. Branderhorst, W., Groot, J.E., Lier, M.G., Highnam, R.P., Heeten, G.J., Grimbergen, C.A.: Technical note: validation of two methods to determine contact area between breast and compression paddle in mammography. Med. Phys. 44(8), 4040–4044 (2017). https://doi.org/10.1002/mp.12392
5. Bui, Q.M., Daniels, L.B.: A review of the role of breast arterial calcification for cardiovascular risk stratification in women. Circulation 139(8), 1094–1101 (2019). https://doi.org/10.1161/CIRCULATIONAHA.118.038092
6. Buslaev, A., Iglovikov, V.I., Khvedchenya, E., Parinov, A., Druzhinin, M., Kalinin, A.A.: Albumentations: fast and flexible image augmentations. Information 11(2), 125 (2020). https://doi.org/10.3390/info11020125
7. Cheng, J.Z., Chen, C.M., Cole, E.B., Pisano, E.D., Shen, D.: Automated delineation of calcified vessels in mammography by tracking with uncertainty and graphical linking techniques. IEEE Trans. Med. Imaging 31(11), 2143–2155 (2012). https://doi.org/10.1109/TMI.2012.2215880
8. Cheng, J.Z., Chen, C.M., Shen, D.: Identification of breast vascular calcium deposition in digital mammography by linear structure analysis. In: 2012 9th IEEE International Symposium on Biomedical Imaging (ISBI), Barcelona, Spain, pp. 126–129. IEEE (2012). https://doi.org/10.1109/ISBI.2012.6235500
9. Damases, C.N., Brennan, P.C., McEntee, M.F.: Mammographic density measurements are not affected by mammography system. J. Med. Imaging 2(1), 015501 (2015). https://doi.org/10.1117/1.JMI.2.1.015501
10. Ge, J., et al.: Automated detection of breast vascular calcification on full-field digital mammograms. In: Giger, M.L., Karssemeijer, N. (eds.) Medical Imaging, San Diego, CA, p. 691517 (2008). https://doi.org/10.1117/12.773096
11. Guo, X., et al.: SCU-Net: a deep learning method for segmentation and quantification of breast arterial calcifications on mammograms. Med. Phys. 48(10), 5851–5861 (2021). https://doi.org/10.1002/mp.15017
12. Hendriks, E.J.E., de Jong, P.A., van der Graaf, Y., Mali, W.P.T.M., van der Schouw, Y.T., Beulens, J.W.J.: Breast arterial calcifications: a systematic review and meta-analysis of their determinants and their association with cardiovascular events. Atherosclerosis 239(1), 11–20 (2015). https://doi.org/10.1016/j.atherosclerosis.2014.12.035
13. Highnam, R., Brady, J.M.: Mammographic Image Analysis. Computational Imaging and Vision, Springer, Dordrecht (1999). https://doi.org/10.1007/978-94-011-4613-5
14. Highnam, R., Brady, S.M., Yaffe, M.J., Karssemeijer, N., Harvey, J.: Robust breast composition measurement - Volpara™. In: Martí, J., Oliver, A., Freixenet, J., Martí, R. (eds.) IWDM 2010. LNCS, vol. 6136, pp. 342–349. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-13666-5_46


15. Iakubovskii, P.: Segmentation Models Pytorch (2020). https://github.com/qubvel/segmentation_models.pytorch
16. Iribarren, C., et al.: Breast arterial calcification: a novel cardiovascular risk enhancer among postmenopausal women. Circ. Cardiovasc. Imaging 15(3), e013526 (2022). https://doi.org/10.1161/CIRCIMAGING.121.013526
17. Kaur, S., Aggarwal, H., Rani, R.: Hyper-parameter optimization of deep learning model for prediction of Parkinson's disease. Mach. Vis. Appl. 31(5), 1–15 (2020). https://doi.org/10.1007/s00138-020-01078-1
18. Khan, N., Wang, K., Chan, A., Highnam, R.: Automatic BI-RADS classification of mammograms. In: Bräunl, T., McCane, B., Rivera, M., Yu, X. (eds.) PSIVT 2015. LNCS, vol. 9431, pp. 475–487. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-29451-3_38
19. Knowles-Barley, S.F., Highnam, R.: Auto Gamma Correction. WIPO Patent WO2022079569 (2022)
20. Lee, S.C., Phillips, M., Bellinge, J., Stone, J., Wylie, E., Schultz, C.: Is breast arterial calcification associated with coronary artery disease? A systematic review and meta-analysis. PLoS ONE 15(7), e0236598 (2020). https://doi.org/10.1371/journal.pone.0236598
21. Li, Y., Gu, H., Wang, H., Qin, P., Wang, J.: BUSnet: a deep learning model of breast tumor lesion detection for ultrasound images. Front. Oncol. 12, 848271 (2022). https://doi.org/10.3389/fonc.2022.848271
22. Liu, X., et al.: Advances in deep learning-based medical image analysis. Health Data Sci. 2021, 1–14 (2021). https://doi.org/10.34133/2021/8786793
23. Lundervold, A.S., Lundervold, A.: An overview of deep learning in medical imaging focusing on MRI. Z. Med. Phys. 29(2), 102–127 (2019). https://doi.org/10.1016/j.zemedi.2018.11.002
24. Marchesi, A., et al.: The effect of mammogram preprocessing on microcalcification detection with convolutional neural networks. In: 2017 IEEE 30th International Symposium on Computer-Based Medical Systems (CBMS), Thessaloniki, pp. 207–212. IEEE (2017). https://doi.org/10.1109/CBMS.2017.29
25. Molloi, S., Mehraien, T., Iribarren, C., Smith, C., Ducote, J.L., Feig, S.A.: Reproducibility of breast arterial calcium mass quantification using digital mammography. Acad. Radiol. 16(3), 275–282 (2009). https://doi.org/10.1016/j.acra.2008.08.011
26. Molloi, S., Xu, T., Ducote, J., Iribarren, C.: Quantification of breast arterial calcification using full field digital mammography. Med. Phys. 35(4), 1428–1439 (2008). https://doi.org/10.1118/1.2868756
27. Ranjbarzadeh, R., Bagherian Kasgari, A., Jafarzadeh Ghoushchi, S., Anari, S., Naseri, M., Bendechache, M.: Brain tumor segmentation based on deep learning and an attention mechanism using MRI multi-modalities brain images. Sci. Rep. 11(1), 10930 (2021). https://doi.org/10.1038/s41598-021-90428-8
28. Riva, F.: Breast arterial calcifications: detection, visualization and quantification through a convolutional neural network. Thesis for Master of Sciences in Biomedical Engineering, Polytechnic University of Milan, Italy (2021)
29. Savioli, N., Montana, G., Lamata, P.: V-FCNN: volumetric fully convolution neural network for automatic atrial segmentation. In: Pop, M., Sermesant, M., Zhao, J., Li, S., McLeod, K., Young, A., Rhode, K., Mansi, T. (eds.) STACOM 2018. LNCS, vol. 11395, pp. 273–281. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-12029-0_30


30. Srinivasu, P.N., SivaSai, J.G., Ijaz, M.F., Bhoi, A.K., Kim, W., Kang, J.J.: Classification of skin disease using deep learning neural networks with MobileNet V2 and LSTM. Sensors 21(8), 2852 (2021). https://doi.org/10.3390/s21082852
31. Thambawita, V., Strümke, I., Hicks, S.A., Halvorsen, P., Parasa, S., Riegler, M.A.: Impact of image resolution on deep learning performance in endoscopy image classification: an experimental study using a large dataset of endoscopic images. Diagnostics 11(12), 2183 (2021). https://doi.org/10.3390/diagnostics11122183
32. Van Berkel, B., Van Ongeval, C., Van Craenenbroeck, A.H., Pottel, H., De Vusser, K., Evenepoel, P.: Prevalence, progression and implications of breast artery calcification in patients with chronic kidney disease. Clin. Kidney J. 15(2), 295–302 (2022). https://doi.org/10.1093/ckj/sfab178
33. Wang, J., Azziz, A., Fan, B., Malkov, S., Klifa, C., Newitt, D., Yitta, S., Hylton, N., Kerlikowske, K., Shepherd, J.A.: Agreement of mammographic measures of volumetric breast density to MRI. PLoS ONE 8(12), e81653 (2013). https://doi.org/10.1371/journal.pone.0081653
34. Wang, J., Ding, H., Bidgoli, F.A., Zhou, B., Iribarren, C., Molloi, S., Baldi, P.: Detecting cardiovascular disease from mammograms with deep learning. IEEE Trans. Med. Imaging 36(5), 1172–1181 (2017). https://doi.org/10.1109/TMI.2017.2655486
35. Wang, K., Khan, N., Highnam, R.: Automated segmentation of breast arterial calcifications from digital mammography. In: 2019 International Conference on Image and Vision Computing New Zealand (IVCNZ), Dunedin, New Zealand, pp. 1–6. IEEE (2019). https://doi.org/10.1109/IVCNZ48456.2019.8960956
36. Wang, X., Liang, G., Zhang, Y., Blanton, H., Bessinger, Z., Jacobs, N.: Inconsistent performance of deep learning models on mammogram classification. J. Am. Coll. Radiol. 17(6), 796–803 (2020). https://doi.org/10.1016/j.jacr.2020.01.006
37. Yu, S., Chen, M., Zhang, E., Wu, J., Yu, H., Yang, Z., Ma, L., Gu, X., Lu, W.: Robustness study of noisy annotation in deep learning based medical image segmentation. Phys. Med. Biol. 65(17), 175007 (2020). https://doi.org/10.1088/1361-6560/ab99e5

Machine Learning and Computing for Visual Semantic Analysis

Towards Scene Understanding for Autonomous Operations on Airport Aprons

Daniel Steininger(B), Andreas Kriegler, Wolfgang Pointner, Verena Widhalm, Julia Simon, and Oliver Zendel

AIT Austrian Institute of Technology, Seibersdorf, Austria
{daniel.steininger,andreas.kriegler,wolfgang.pointner,verena.widhalm,julia.simon,oliver.zendel}@ait.ac.at

Abstract. Enhancing logistics vehicles on airport aprons with assistant and autonomous capabilities offers the potential to significantly increase safety and efficiency of operations. However, this research area is still underrepresented compared to other automotive domains, especially regarding available image data, which is essential for training and benchmarking AI-based approaches. To mitigate this gap, we introduce a novel dataset specialized on static and dynamic objects commonly encountered while navigating apron areas. We propose an efficient approach for image acquisition as well as annotation of object instances and environmental parameters. Furthermore, we derive multiple dataset variants on which we conduct baseline classification and detection experiments. The resulting models are evaluated with respect to their overall performance and robustness against specific environmental conditions. The results are quite promising for future applications and provide essential insights regarding the selection of aggregation strategies as well as current potentials and limitations of similar approaches in this research domain.

Keywords: Dataset design · Scene understanding · Classification · Object detection · Airport apron · Autonomous vehicles

1 Introduction

While many research activities in recent years were focused on increasing the autonomy of road vehicles, assistant and autonomy functions for vehicles in off-road domains such as airport environments are still in their infancy. These tasks pose similar requirements regarding safety and robustness aspects, but must be executed in a significantly different domain, which hinders a straight-forward application of existing approaches and datasets. Especially the transition from classic to learning-based computer vision approaches requires high amounts of domain-specific image data for training and testing purposes.

Supplementary Information: The online version contains supplementary material available at https://doi.org/10.1007/978-3-031-27066-6_11.


Therefore, the aim of this work is to mitigate this data gap by creating a versatile dataset focusing on apron-specific objects and presenting an efficient approach for data acquisition, sampling and aggregation, which may serve as a precursor for automating mobile platforms in other challenging domains. Image data was acquired by mounting cameras on multiple transport vehicles, which were operated in apron and logistics areas throughout multiple seasons. The high number of captured sequences covers a wide range of data variability, including variations in environmental conditions such as time-of-day, seasonal and atmospheric effects, lighting conditions, as well as camera-related degradation effects. Furthermore, an efficient sampling and meta-annotation approach was developed to automatically extract a representative set of samples from the extensive amount of recorded image data while minimizing the manual effort required for annotation. We additionally introduce a novel data aggregation strategy and provide preliminary models for detecting and classifying apron-specific objects. To summarize, we propose the following contributions:

– We introduce a novel dataset (images and annotations are available at https://github.com/apronai/apron-dataset) specialized on objects encountered in apron areas, including efficient approaches for image acquisition, dataset design and annotation.
– We train and evaluate baseline models for classification and detection on multiple dataset variants and thoroughly quantify the models' robustness against environmental influences.

Overall, we believe that our contributions provide vital insights in a novel and highly relevant application domain as well as universal strategies for efficient dataset design and experiment setup.

2 Related Work

While detecting and classifying objects encountered on airport aprons represents a novel application domain, there are several links to prior research, especially regarding learning and dataset-design approaches as well as existing datasets containing relevant objects. Automating apron vehicles requires consistent robustness under a wide range of challenging environmental conditions. While some works focus on specific aspects such as variations in either daytime [4], weather [23] or image degradation [14] for benchmarking models or investigating the variability of existing datasets [1,2], few of them provide a comprehensive analysis regarding the impact of multiple factors on model performance [31,37], and neither includes the classes relevant for the given domain. Capturing the required data for versatile learning experiments can either be accomplished by complex sensor systems providing high-quality multi-modal data, as demonstrated by KITTI [8] and ApolloScape [12], or by more portable equipment usually facilitating a significantly higher number of recording sessions and therefore higher data variability, as shown by Mapillary [25].


Objects typically encountered on airport aprons include certain common classes like aircraft or persons, which are part of established datasets such as MS COCO [19], PASCAL-Context [24], ADE20K [39] and OpenImages [18]. However, their coverage in terms of number and variability in these datasets is rather limited. Additionally, a number of dedicated datasets are available for the category of persons [3,34,38,40]. Similarly, different types of aircraft are provided by specialized datasets offering a more fine-grained categorization [22,31], as well as by top-view or satellite imagery [21,29,33]. The majority of relevant classes for the target application, however, is highly specific to the airport domain and rarely occurs outside it. This includes objects such as specialized airport vehicles, traffic signs or containers, which are rarely captured due to safety-related access restrictions. To the best of our knowledge, the presented dataset is the first to put the focus not only on airplanes, but on the entire environment of airport aprons.

3 The Apron Dataset

Automating transport vehicles on airport aprons requires a reliable perception of this highly specialized environment. Therefore, the dataset's label specification is focused mainly on multiple types of apron vehicles, but also includes other kinds of static and transient obstacles. Ensuring the necessary relevance and efficiency in creating the dataset requires a consistent strategy across all stages of dataset design, as presented in the following subsections.

3.1 Data Acquisition

To match the requirements of the intended application as closely as possible, data acquisition was conducted in a realistic environment from the transport vehicle's expected point of view. Therefore, cooperating with a commercial airport was essential to gain access to transport vehicles in regular operation. However, this critical infrastructure implies that compliance with safety and legal considerations was required, such as preserving the privacy of passengers and airport staff, as well as ensuring that the recording campaign never interferes with airport and logistics operations. All image data was recorded using Nextbase 612GW dashcams with a resolution of 3860 × 2160 pixels, providing sufficient image quality combined with low cost and efficient handling. They were mounted on the inside of the windshields of two container-transport vehicles. To ensure that recordings are paused when a vehicle is inactive, the power supply of each camera was coupled to the respective engine. To increase the flexibility for later applications, one camera was modified to incorporate a lens with a 90° field of view instead of the built-in lens with 150°. Most of the data was captured in time-lapse mode at 5 fps to provide a data variability sufficiently representing the environment. Furthermore, the recordings were complemented by sequences at 30 fps for demonstration purposes as well as future developments such as multi-object tracking. Over a time period of six months we recorded 1715 image sequences, covering the seasons of spring, summer and autumn.


Since transport vehicles typically traverse between multiple locations in logistics and apron areas, these recordings conveniently contain all kinds of objects encountered along their routes. On the other hand, they include irrelevant sections with low scene activity during parking or while moving along monotonous regions, as well as motion blur and noise, which need to be pruned as a first step during annotation.

3.2 Sequence and Image Annotation

While removing highly redundant or irrelevant data, each remaining sequence is additionally assigned a defined set of parameters specifying the environmental factors at recording time (a minimal annotation-schema sketch follows the list):

– Time of Day describes the variance between natural and artificial light sources throughout the day.
– Lighting specifies sunny and diffuse conditions during daytime based on the appearance of shadows and is undefined for night recordings.
– Atmosphere differentiates multiple weather and atmospheric effects.
– Scene Dynamics is a measure for the number and activity of dynamic objects in a sequence, as well as variations due to motion of the capturing vehicle.
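A minimal sketch of how these per-sequence parameters could be stored; the enumeration values are taken from the categories named in the text and the figure captions, while the exact set of Scene Dynamics levels is an assumption.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional

class TimeOfDay(Enum):
    DAY = "Day"
    TWILIGHT = "Twilight"
    NIGHT = "Night"

class Atmosphere(Enum):
    CLEAR = "Clear"
    RAIN = "Rain"
    HEAVY_RAIN = "Heavy Rain"
    FOG = "Fog"

class SceneDynamics(Enum):
    VERY_CALM = "Very Calm"
    CALM = "Calm"
    BUSY = "Busy"
    VERY_BUSY = "Very Busy"

@dataclass
class SequenceMeta:
    """Per-sequence environmental parameters assigned during annotation."""
    sequence_id: str
    time_of_day: TimeOfDay
    atmosphere: Atmosphere
    scene_dynamics: SceneDynamics
    lighting: Optional[str] = None  # "Sunny" or "Diffuse"; undefined for night recordings
```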

Fig. 1. Distribution of Scene Dynamics for total and annotated images

We eventually selected 1209 sequences, representing a pool of more than 3.2 million images in total. They were sampled at varying rates proportional to the Scene Dynamics parameter, aiming to reduce redundancies and extract a representative set of approximately 10k images. As demonstrated in Fig. 1, sequences tagged as Busy and Very Busy are oversampled by 25% and 50% respectively, whereas Calm and Very Calm scenes are undersampled analogously (see the sampling sketch below). Additionally, the few remaining redundant images, recorded when the vehicle was stopped with the engine running, were manually removed. The final set of 10098 images is annotated with additional per-image parameters. Degradation summarizes multiple factors expected to negatively influence model performance, as shown in Fig. 2. While these factors typically appear simultaneously, the parameter is set to High if any of them significantly influences image quality. Emphasis is placed on near- and mid-range objects in a distance of up to 100 m. The distribution of parameters for all annotated frames is displayed in Fig. 3. Time of Day variations are well balanced due to the airport operating hours including dawn, dusk and a significant proportion of the night.
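The proportional sampling could look roughly like the sketch below; the multipliers follow the over-/undersampling percentages stated above, and base_rate is a hypothetical nominal sampling fraction chosen so that roughly 10k images remain overall.

```python
import random

# Oversample busy sequences by 25%/50% and undersample calm ones analogously.
DYNAMICS_FACTOR = {"Very Calm": 0.50, "Calm": 0.75, "Busy": 1.25, "Very Busy": 1.50}

def sample_frames(frames, scene_dynamics, base_rate, seed=0):
    """Draw a subset of frames from one sequence proportional to its Scene Dynamics tag."""
    random.seed(seed)
    n = round(len(frames) * base_rate * DYNAMICS_FACTOR.get(scene_dynamics, 1.0))
    return random.sample(frames, min(n, len(frames)))
```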


Fig. 2. Representative examples for sequence and image parameters. Time of day differentiates between Day (a), Twilight (b) and Night (c), Lighting between Sunny (d) and Diffuse (e) and Atmosphere between Clear (f), Rain (g) and Heavy Rain (h), with rain drops significantly impacting perception. The assigned state of Low or High for Degradation typically depends on multiple factors such as under- (i) and overexposure (j), windshield reflections (k, l), motion blur (m) and wiper occlusions (n)

Fig. 3. Per-instance distribution of annotated meta parameters and object sizes (Small < 15k pixels < Medium < 45k pixels < Large)

Lighting conditions are annotated only for 68% of the dataset, since the parameter does not apply to night recordings and ambiguous sequences containing both Sunny and Diffuse states. However, the remaining images are roughly evenly distributed, therefore representing a solid basis for evaluating their impact on detection and classification performance. Atmospheric effects, on the other hand, show a strong bias towards Clear conditions and thereby represent the environment encountered during the recording period. Nevertheless, the data includes a small number of images with Rain, Heavy Rain and Fog, which are useful for preliminary insights regarding the impact of harsh weather conditions as well as the generalization capability of trained models.

3.3 Instance Annotation

Specifying a set of object labels tailored to the target application requires an extensive analysis of sampled images to identify visually distinct categories for frequently appearing types of vehicles and obstacles. Additionally, safety-relevant classes are considered independently of their occurrence frequency.


We aim at annotating a fine-grained definition of classes, which can be further condensed for training specialized models on more coarse-grained dataset variants. For this purpose, we define 43 categories, which are fully listed in the supplementary material and visualized in Fig. 4.

Fig. 4. Representative examples of different object categories included in the dataset

For each category we define a detailed textual specification along with multiple selected sample patches to minimize ambiguous assignments and resolve potential corner cases at an early stage. Since the focus is placed on near- and mid-range objects, the minimum size of annotated objects is defined as 28 pixels for vehicle classes and 12 pixels for traffic signs and persons along either dimension. Across the entire set of selected images, this results in a total of more than 169k object instances localized as bounding boxes and assigned one of the defined categories. Additionally, objects are tagged as occluded if they are not fully visible due to other objects or are truncated at image borders. It is well known that object-occurrence frequency in real-world images often follows a long-tailed distribution [20,27], which is also visible in this case, as demonstrated in Fig. 5. The overabundance of a few head classes, with the numerous tail classes collectively still making up a significant portion of the data [41], is challenging for learning systems. The environment on airport aprons is generally relatively structured and controlled, but also crowded, with an average of 16.8 objects per image. Compared to the total number of samples, the object occurrence in terms of images containing a certain category is more evenly distributed, indicating that head classes in particular tend to appear in multitudes within a single image. The distribution of object sizes and its relation to the occurrence frequency is of additional interest, especially for the detection task. The mean of about 80k pixels and a surprisingly low median of 9k pixels indicate that a relatively low number of classes with exceedingly large objects stands in contrast to a large number of classes with small to medium objects. While the method of image acquisition and the label definition affect these dataset statistics, the long-tailed distribution can be mitigated by exploiting the closed-off, controlled and repetition-driven nature of airport apron processes. Therefore, it might be easier to create accurate and robust models for this domain than for other automotive applications, and we believe our dataset showcases a promising way to efficiently accumulate data for this purpose.


Fig. 5. Top: numbers of samples annotated for each object category (light color) and images containing them (dark color). Bottom: normalized size distribution: (Large > 45k pixels > Medium > 15k pixels > Small)

3.4 Data Aggregation

By mapping selected classes of the annotated data we define three variants to facilitate a comparative analysis of dataset balancing effects:

– Fine is the baseline variant including all 43 annotated labels.
– Top limits the dataset to the 25 most frequent classes in terms of total object occurrence, which contribute roughly 97% of the samples.
– Coarse uses the full set of instances but remaps them to only 23 superclasses based on semantic similarity.

For all experiments described below, 68% of each dataset variant is used for training and 17% for validation, while the remaining 15% of samples are withheld during the experiments and exclusively reserved for testing purposes. This split is applied on a per-sequence basis to reduce the effects of over-fitting, as sketched below.
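A per-sequence split along these ratios could be sketched with scikit-learn's GroupShuffleSplit, which keeps all images of one sequence in the same subset; note that its test_size refers to the fraction of sequences, which only approximates the image fractions.

```python
import numpy as np
from sklearn.model_selection import GroupShuffleSplit

def per_sequence_split(image_ids, sequence_ids, seed=0):
    """~68/17/15 train/val/test split grouped by recording sequence."""
    idx = np.arange(len(image_ids))
    groups = np.asarray(sequence_ids)
    trainval_idx, test_idx = next(
        GroupShuffleSplit(n_splits=1, test_size=0.15, random_state=seed)
        .split(idx, groups=groups))
    trainval_groups = groups[trainval_idx]
    train_sub, val_sub = next(
        GroupShuffleSplit(n_splits=1, test_size=0.20, random_state=seed)  # 0.17 / 0.85
        .split(trainval_idx, groups=trainval_groups))
    return trainval_idx[train_sub], trainval_idx[val_sub], test_idx
```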

4 Fine-Grained Classification

Fine-grained image classification focuses on correctly identifying differences between hard-to-distinguish object (sub-)classes and predicting the specific variants accordingly. Taking a look at Fig. 4 and the corresponding categories shown in Fig. 5, many visually similar yet distinct classes can be observed in the Apron dataset.


Multiple types of container trolleys, aircraft, traffic signs and specialized cars and trucks are all commonly encountered on aprons. To correctly differentiate such visually similar classes, feature representations need to be rich in detail. On the other hand, the overall object variety requires a well-generalized model, which makes the classification task especially challenging. While classification of object instances also takes place implicitly in detection architectures, a standalone, fine-grained classifier can be tuned and optimized more easily to gain vital insights regarding dataset variability and the final application setup.

Evaluation Metrics. For evaluating classification performance, a multitude of metrics with different advantages and drawbacks has emerged [30], though literature on classification metrics in the context of computer vision is sparse [9]. Top-1 accuracy (α), defined as the number of correct classifications over the number of ground-truth samples, has been reported on CIFAR [17] and ImageNet [5] and is still widely used [6,15,28,35,36]. However, on datasets with significant class imbalance or long-tailed characteristics, α leads to unintuitive results: for example, a model evaluated on a dataset with 90% of samples belonging to class A and only 10% to class B achieves an α value of 90% by simply always predicting 'A'. This score of exclusively predicting the most frequent class is defined as the null accuracy α0, where si is the number of samples of class i and s is the total number of samples:

\alpha_0 = \frac{\max\{s_1, \ldots, s_n\}}{s} \qquad (1)

To evaluate the significance of α, it should be compared to α0 as well as to the random accuracy αr, which represents the score if predictions are equally distributed over the number of classes n:

\alpha_r = \frac{1}{n} \qquad (2)

Moreover, α is prone to be even more biased in the case of top-x accuracy with x > 1, where a sample counts as correct if the true label is within the x most-confident model predictions. Therefore, we use the less biased metric of per-class average recall (r̄) for selecting the best performing models of each experiment and evaluating all fine-grained classification models. This metric represents the average of class-wise top-1 accuracies, as used by e.g. [19], rendering each class equally important independent of the number of samples assigned to it:

\bar{r} = \frac{1}{n} \sum_{i=1}^{n} \frac{TP_i}{TP_i + FN_i} \qquad (3)

Additionally, we employ the metric of per-class average precision (p̄), calculated analogously as the average ratio between true positives and the total number of predictions for each class:


\bar{p} = \frac{1}{n} \sum_{i=1}^{n} \frac{TP_i}{TP_i + FP_i} \qquad (4)

In the formulas above, TP_i and FP_i denote the number of true and false positive classifications for class i, respectively, while FN_i is the corresponding number of false negatives. To improve clarity when comparing results, we furthermore report the f1 score as the harmonic mean of both measures:

f_1 = 2 \times \frac{\bar{p} \times \bar{r}}{\bar{p} + \bar{r}} \qquad (5)
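Under the assumption that predictions are summarized in an n × n confusion matrix (rows = true classes, columns = predictions), the metrics of Eqs. (3)–(5) reduce to a few lines:

```python
import numpy as np

def per_class_metrics(conf_mat):
    """Per-class average recall (Eq. 3), precision (Eq. 4) and their f1 score (Eq. 5)."""
    cm = np.asarray(conf_mat, dtype=float)
    tp = np.diag(cm)
    fn = cm.sum(axis=1) - tp          # missed samples of each class
    fp = cm.sum(axis=0) - tp          # samples wrongly assigned to each class
    r_bar = float(np.mean(tp / np.maximum(tp + fn, 1e-12)))
    p_bar = float(np.mean(tp / np.maximum(tp + fp, 1e-12)))
    f1 = 2 * p_bar * r_bar / (p_bar + r_bar)
    return r_bar, p_bar, f1
```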

Setup and Optimization. We conducted a two-fold validation for comparing multiple ResNet [11] and EfficientNet [32] architectures and tuning their hyperparameters based on r̄. Eventually, we chose EfficientNet-B3, since it outperforms the ResNet variants and can conveniently be adapted to specific data using a single scaling coefficient for modifying width, depth and image scale. Object instances below 30 pixels along both dimensions or with an aspect ratio exceeding 10:1 were excluded, slightly reducing the original dataset to roughly 150k samples. After evaluating multiple image-augmentation techniques, the best results were observed using a random horizontal flip before resizing to the required input size of 300 pixels. An additional application of random crop and Gaussian blur did not improve results on the validation set, indicating that the visual variability in the Apron dataset is already significant. It could be observed, however, that predictions on classes with relatively few samples were more accurate using stronger augmentations, since they benefit more from the additional variability. All models were trained from scratch for 40 epochs using the SGD optimizer with a learning rate of 0.1, which yielded slower but stable convergence, unlike Adam [16], as is often the case in PyTorch [26]. We updated the learning rate every 10 epochs, using a step ratio of 0.1 and a weight decay of 0.0005. The experiments were conducted on an NVIDIA RTX 2080 Ti using a batch size of 128. Furthermore, we used the Swish activation function, a cross-entropy loss with a dropout rate of 0.3 and Kaiming uniform [10] parameter initialization.
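A hedged sketch of this configuration using torchvision (assuming a recent release where efficientnet_b3 is available); the data pipeline and training loop are omitted, and the head replacement is our own way of matching the Apron label set, not necessarily the authors' exact code.

```python
from torch import nn, optim
from torchvision import models

NUM_CLASSES = 43  # Fine variant; 25 for Top, 23 for Coarse

model = models.efficientnet_b3(weights=None)          # trained from scratch
model.classifier = nn.Sequential(                     # dropout rate of 0.3
    nn.Dropout(p=0.3),
    nn.Linear(model.classifier[1].in_features, NUM_CLASSES),
)

criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=0.1, weight_decay=0.0005)
# Decay the learning rate by a factor of 10 every 10 epochs.
scheduler = optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.1)
```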

4.1 Classification Results

Table 1 shows the results obtained on the respective test sets of all three dataset variants. The baseline variant Fine poses a significant challenge, but the corresponding model still obtains an f1 score of 68.2%. As expected, leaving out low-frequency classes (Top) or merging them into superclasses (Coarse) significantly improves performance, by up to 12.6%. Precision p̄ tends to be only slightly higher than recall r̄ despite the long-tailed class distribution of the dataset, indicating consistent classification performance across most classes. As expected, the obtained α scores are far above r̄, giving a skewed and less distinctive impression of the models' performance; they are therefore omitted from more detailed comparisons.


Table 1. Classification results on the Fine (F), Top (T) and Coarse (C) dataset variants as average recall, average precision and f1 score, as well as top-1, null and random accuracy. The last two columns specify the numbers of samples in the test sets and the entire dataset variants, respectively.

Variant  r̄      p̄      f1     α      α0     αr     sT      s
F        0.624  0.752  0.682  0.866  0.184  0.023  22.6k   150.6k
T        0.779  0.810  0.794  0.880  0.193  0.040  21.9k   145.9k
C        0.819  0.798  0.808  0.881  0.213  0.044  22.6k   150.6k
∅        0.741  0.787  0.762  0.876  –      –      –       –

4.2 Robustness Analysis

To gain more detailed insights regarding the impact of environmental effects on model performance, we filter the test sets by each of the parameters defined in Sect. 3.2 as well as by object size and occlusion. The corresponding evaluations on each resulting set are presented in Tables 2 and 3. The distributions of the test sets are similar to those of the overall dataset presented in Fig. 3. Note that the samples do not cover the entire test set for the Lighting and Atmosphere parameters. In the former case this results from the parameter not applying to Night settings, while in the latter case all underrepresented conditions were omitted.

Table 2. Impact of environmental conditions on classification performance as deviation from overall f1 scores (Table 1) for each model on the corresponding test set.

         Time of day                  Lighting            Degradation        Atmosphere
Variant  Day      Twilight  Night     Sunny    Diffuse    Low      High      Clear    Rain
F        −0.020   −0.025    −0.005    0.015    −0.008     −0.005   −0.007    0.005    −0.101
T        0.003    0.006     −0.013    0.006    0.006      0.005    −0.028    0.000    −0.008
C        0.006    −0.012    −0.011    0.018    −0.007     0.008    −0.011    0.002    −0.039
∅        −0.004   −0.011    −0.009    0.013    −0.003     0.003    −0.015    0.003    −0.049

Overall, as visible in Table 2, both the positive and negative deviations are relatively small across all dataset variants and parameters, indicating that most conditions are sufficiently covered in the dataset. Since the recording vehicles accumulated image data over a long period of time, they encountered a wide variety of conditions expected during long-term autonomous operation. The least deviation is reported between the different times of day. Since all three values are extremely small and recall and precision values are averaged across classes, the offsets of underrepresented classes can even distort results sufficiently for all deviations to be negative in this case. The trends are more clearly visible for the lighting and degradation parameters, where sunny conditions and low image degradation appear to be the least challenging for all models. The strongest negative impact is visible for light rain, which reduces the f1 score by more than 5% compared to clear atmosphere. It is possible that the model learns more ambiguous and blurred filters for rainy samples, so that the predictions are spread more equally across all classes. On the other hand, for samples with clear sight, the learned filters might be very class-specific and overfit on the classes with many samples, since this phenomenon is most noticeable for Fine.


Table 3. Impact of object size (Small < 15k pixels < Medium < 45k pixels < Large) and occlusion on classification performance as deviation from overall f1 scores (Table 1) for each model on the corresponding test set.

         Object size                  Occlusion
Variant  Small    Medium   Large      True     False
F        −0.066   0.001    −0.042     0.003    0.004
T        −0.070   0.001    0.013      −0.001   0.033
C        −0.071   0.012    0.033      −0.004   0.025
∅        −0.069   0.005    0.001      −0.001   0.021

As expected, Table 3 shows that all models perform significantly better for medium-sized and large objects than for those smaller than 15k pixels. The models seem to be suitable for fine-grained classification of objects that are relatively close and largely depicted, and therefore most relevant for the intended application scenarios, while tiny and distant objects are more prone to errors. Furthermore, the difference in scores depending on object occlusion is relatively low, since more than 90% of all objects in the entire dataset are occluded, providing a rich set of representative training examples.

5 Detection

Based on the insights and promising results gained during our classification experiments, the next step towards the real-world application of autonomous vehicle operation is to analyze entire scenes by localizing and simultaneously classifying objects using an end-to-end detection approach.

Evaluation Metrics. We evaluate the results based on the established average-precision (AP) metric, defined as the area under the precision-recall curve for each class. But instead of using a single IoU threshold to distinguish between correct and incorrect detections, as traditionally used in detection challenges such as Pascal VOC [7], we average the results over 10 IoU thresholds ranging from 0.5 to 0.95, as suggested by the COCO challenge [19]. The reported overall values are subsequently averaged over all available classes of the respective dataset variant.

Setup and Optimization. We conducted the experiments using an existing implementation of YOLOv5 (release 6.1) [13] and compared the results of the small, medium and large architectures on the defined dataset variants.


All images were scaled to 1280 pixels along each dimension and subjected to standard data augmentation. We trained on a system with two NVIDIA RTX 3090 GPUs, using the SGD optimizer with a linear learning rate of 0.01 and a batch size of 16. All models were initialized with the pre-trained weights provided by the authors of [13]. For each combination of architecture and dataset variant we selected the best-performing model out of 50 training epochs based on the validation results.
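A sketch of how such a run could be launched through the YOLOv5 repository's programmatic train.run entry point (assumed to be available in release 6.x; argument names may differ between releases, and apron.yaml is a hypothetical dataset description file):

```python
# Run from within a clone of the ultralytics/yolov5 repository (release 6.x assumed).
import train

train.run(
    data="apron.yaml",       # hypothetical dataset YAML for the chosen variant
    weights="yolov5m6.pt",   # pre-trained checkpoint provided by the authors of [13]
    imgsz=1280,              # images scaled to 1280 pixels
    epochs=50,
    batch_size=16,
)
```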

5.1 Detection Results

As visible in Table 4, performance increases with model complexity. Analogously to the original experiments on the COCO dataset [13], the gain is more significant between the small and medium than between the medium and large architecture.

Fig. 6. Representative detection results of the Coarse model on the corresponding test set (green: correct detection (TP ), orange: correct localization, but incorrect class, red: incorrect detection (FP ), blue: undetected ground-truth object (FN )) (Color figure online)

For each architecture, the dataset variant with the highest granularity (Fine) serves as a baseline, since it is the most challenging.


Table 4. Detection APs on the Fine (F), Top (T) and Coarse (C) dataset variants by model architecture on the corresponding test sets. For comparing the expected computational complexity, the numbers of parameters and floating point operations (FLOPs) for each model are given as stated in [13].

            YOLOv5s6  YOLOv5m6  YOLOv5l6
Parameters  12.6M     35.7M     76.8M
FLOPs       16.8B     50.0B     111.4B
F           0.364     0.415     0.435
T           0.441     0.480     0.501
C           0.473     0.514     0.526
∅           0.426     0.470     0.487

As expected, limiting the label categories to the 25 most frequent ones (Top), and thereby reducing the total number of samples, improves the score. However, the highest robustness is achieved by the variant combining semantically similar object categories and thereby using all available samples (Coarse), which results in a similar number of classes but higher variability. The evaluation therefore indicates that limiting the number of classes by remapping provides a superior alternative to simply omitting under-represented classes, regarding accuracy as well as flexibility. The high performance indicated by the scores under the challenging COCO metric is also noticeable in the qualitative results presented in Fig. 6. The model is well capable of handling occlusions in crowded scenes as well as varying environmental conditions, including different times of day and moderate image degradation, as visible in the upper two rows. However, effects such as strong motion blur and significant underexposure decrease detection performance, as visible in the lower right visualization. Furthermore, even small objects can be robustly localized in most cases, but they are more prone to being assigned wrong categories, as discussed in Sect. 4.2 and depicted in the lower left image.

5.2 Robustness Analysis

To quantify the influence of environmental effects on detection performance, we evaluate all models based on the filtered test sets analogously to Sect. 4.2, as shown in Table 5. As an indicator for the significance of each parameter, the numbers of corresponding samples in the test set for the Fine and Coarse dataset variants are specified, with the numbers for Top being marginally smaller. As described for the classification results in Sect. 4.2, the total number of samples for each parameter does not necessarily cover the entire dataset. The slight impacts that can be observed are strongest for changes in lighting conditions as well as image degradation. Since operating areas are well lit at night, the results for the Time-of-Day parameter confirm that a single model is suitable for operating 24 hours a day. Furthermore, light rain can be handled well with only a slight decrease in performance, while more challenging weather effects require more training and test data.


Table 5. Average impact of environmental conditions as absolute deviation from overall detection APs (Table 4) on each test set across all three selected model architectures

Time of day Day Twilight Night

Atmosphere Clear Rain

5.8k

3.4k

16.1k

0.6k

0.002 −0.021 −0.030

0.002 0.000 0.003

0.058 −0.019 −0.041

sT 6.9k

4.5k

5.6k

F T C

−0.004 0.031 −0.006

−0.001 0.037 0.005 0.000 −0.003 0.025

0.039 0.003 0.016

∅ 0.019 0.007

6

Lighting Degradation Sunny Diffuse Low High

0.000

5.4k

13.6k

0.027 0.002 0.000 0.002 −0.010 0.009

0.021 0.006

0.004 −0.016 0.002 −0.001

Conclusion

In this work, we demonstrated the process of creating an extensive dataset for the novel application domain of autonomous operation on airport aprons. We introduced efficient concepts for image acquisition and annotation before training and evaluating models for classification and detection based on multiple variants of the dataset. Additionally, we enriched the analysis with annotations of environmental conditions and quantified their impact on model performance. The results show that our models are already capable of robustly detecting and classifying most relevant near and mid-range objects, rendering them a promising foundation for the further development of assisted and autonomous vehicle operation in this application domain. We achieved our aim of training robust models covering variable conditions at the specific airport used for recording the dataset. While we are aware that the resulting models do not seamlessly generalize to different locations and novel object classes, our dataset and the presented insights represent a valuable basis for significantly reducing the effort and required data to specialize on other airport environments. We plan to evaluate the resources required for specializing our models and dataset to novel locations by recording additional training and test data at other airports to gain further insights on the re-usability of our concepts and data and their combination with additional approaches. Especially combining the results with multi-object tracking facilitating the propagation of object instances over time holds the potential to further increase detection robustness and facilitate embedded real-time processing on a mobile vehicle. Acknowledgement. We would like to thank the Federal Ministry for Climate Action, Environment, Energy, Mobility, Innovation and Technology, and the Austrian Research Promotion Agency (FFG) for co-financing the “ICT of the Future” research project AUTILITY (FFG No. 867556). Additionally, we want to thank our project partner Linz Airport, Quantigo AI and our annotation team consisting of Vanessa Klugsberger, Gulnar Bakytzhan and Marlene Glawischnig.



Lightweight Hyperspectral Image Reconstruction Network with Deep Feature Hallucination

Kazuhiro Yamawaki and Xian-Hua Han(B)

Yamaguchi University, Yamaguchi 753-8511, Japan
[email protected]

Abstract. Hyperspectral image reconstruction from a compressive snapshot is an indispensable step in advanced hyperspectral imaging systems for solving the low spatial and/or temporal resolution issue. Most existing methods extensively exploit various hand-crafted priors to regularize the ill-posed hyperspectral reconstruction problem, are incapable of handling the wide spectral variety, and often result in poor reconstruction quality. In recent years, deep convolutional neural networks (CNNs) have become the dominant paradigm for hyperspectral image reconstruction and have demonstrated superior performance with complicated and deep network architectures. However, the current impressive CNNs usually have a large model size and high computational cost, which limits their applicability in real imaging systems. This study proposes a novel lightweight hyperspectral reconstruction network based on effective deep feature hallucination, aiming to construct a practical model with small size and high efficiency for real imaging systems. Specifically, we exploit a deep feature hallucination module (DFHM) that duplicates more features with cheap operations as the main component, and stack multiple of them to compose the lightweight architecture. In detail, the DFHM consists of a spectral hallucination block for synthesizing more channels of features and a spatial context aggregation block for exploiting various scales of contexts, which enhance the spectral and spatial modeling capability with cheaper operations than the vanilla convolution layer. Experimental results on two benchmark hyperspectral datasets demonstrate that our proposed method has great superiority over state-of-the-art CNN models in reconstruction performance as well as model size.

Keywords: Hyperspectral image reconstruction · Lightweight network · Feature hallucination

1 Introduction

Hyperspectral imaging (HSI) systems are able to capture the detailed spectral distribution with tens or hundreds of bands at each spatial location of a scene. The abundant spectral signature in an HSI possesses deterministic attributes of the lighting and the imaged object/material, which greatly benefits the characterization of the captured scene in wide fields, including remote sensing [4, 14], vision inspection [23, 24], medical diagnosis [3, 20] and digital forensics [10]. To capture a full 3D HSI, conventional hyperspectral sensors have to employ multiple exposures to scan the target


scene [5, 6, 9, 26], and the long imaging time makes them fail at capturing dynamic scenes. To enable HS image measurement for moving objects, various snapshot hyperspectral imaging systems [12, 18] have been developed that map different spectral bands (either single narrow bands or multiplexed ones) to different positions and then collect them with one or more detectors, which unavoidably causes low spatial resolution. Motivated by compressive sensing theory, a promising imaging modality, coded aperture snapshot spectral imaging (CASSI) [1, 13, 29], has attracted increasing attention. With an elaborate optical design, CASSI systems encode the 3D HSI into a 2D compressive snapshot measurement and require a reconstruction phase to recover the underlying 3D cubic data. Although extensive studies [2, 17, 25, 27, 32, 39, 43, 44] have been conducted, faithfully reconstructing the desired HSI from its compressive measurement is still the bottleneck of CASSI systems.

Due to the ill-posed nature of the reconstruction problem, traditional model-based methods widely incorporate various hand-crafted priors on the underlying HSI, such as total variation [2, 33, 34, 40], sparsity [2, 11, 30] and low-rankness [19, 42], and demonstrate some improvement in reconstruction performance. However, hand-crafted priors are empirically designed and usually insufficient to model the diverse attributes of real-world spectral data. Recently, deep convolutional neural networks (DCNNs) [8, 22, 31, 35, 36] have been widely investigated for HSI reconstruction, leveraging their powerful modeling capability to automatically learn the inherent priors of the latent HSI from previously collected external datasets. Compared with model-based methods, these deep learning-based paradigms achieve superior performance and provide fast reconstruction at test time. However, current research mainly focuses on designing more complicated and deeper network architectures to pursue performance gains, which leads to large-scale reconstruction models that restrict their applicability in real HSI systems. More recently, incorporating deep learned priors into an iterative optimization procedure has been investigated to increase the flexibility of deep reconstruction models; the resulting deep unrolling-based optimization methods, e.g., LISTA [15], ADMM-Net [21, 37] and ISTA-Net [41], show acceptable performance for the conventional compressive sensing problem but still lack sufficient spectral recovery capability for the HSI reconstruction scenario.

To this end, this study aims to build a practical deep reconstruction model with small size and high efficiency that can be easily embedded in real imaging systems, and proposes a novel lightweight hyperspectral reconstruction network (LWHRN) that hallucinates/duplicates effective deep features from already learned ones. As shown in [16], the feature maps learned by well-trained deep models, such as a ResNet-50 trained on the ImageNet dataset, may contain abundant or even redundant information, which often guarantees a comprehensive understanding of the input data, and some features can be obtained by applying a much cheaper transformation to other feature maps instead of a vanilla convolution operation.
Inspired by this insight, we exploit a deep feature hallucination module (DFHM) that synthesizes more features with cheap operations as the main component of our LWHRN model, and stack multiple DFHMs to gradually reconstruct the residual components left un-recovered in the previous phase.


Fig. 1. The schematic concept of the CASSI system (scene, objective lens, coded aperture, relay lens, dispersive prism, 2D detector, compressive image, reconstruction model, reconstructed HSI).

Concretely, the DFHM consists of a spectral hallucination block (SHB) for synthesizing more channels of features and a spatial context aggregation block (SCAB) for exploiting various scales of contexts, where both the SHB and the SCAB are implemented with depth-wise convolution layers instead of vanilla convolution layers, which is expected to enhance the spectral and spatial modeling capability with much cheaper operations. Experimental results on two benchmark hyperspectral datasets demonstrate that our proposed LWHRN shows great superiority over state-of-the-art CNN models in reconstruction performance as well as model size.

In summary, the main contributions are three-fold:
1. We present a novel lightweight hyperspectral reconstruction network operating on a single snapshot measurement, which employs multiple reconstruction modules to gradually recover the residual HS components by alternately incorporating spectral and spatial context learning.
2. We exploit a deep feature hallucination module (DFHM) as the main component of the multi-stage reconstruction, which consists of a spectral hallucination block (SHB) for synthesizing more channels of features and a spatial context aggregation block (SCAB) for exploiting various scales of contexts using depth-wise convolutions that are much cheaper than vanilla convolutions.
3. We conduct extensive experiments on two benchmark HSI datasets and demonstrate superior results over SoTA reconstruction models in terms of reconstruction performance, model size and computational cost.

2 Related Work

Recently, hyperspectral reconstruction in computational spectral imaging has attracted extensive attention, and different kinds of methods, mainly divided into optimization-based and deep learning-based methods, have been proposed for improving reconstruction performance. In this section, we briefly survey the related work.


2.1 Optimization-Based Methods

HSI reconstruction from a snapshot measurement is inherently an inverse problem and can be intuitively formulated as minimizing the reconstruction error of the observed snapshot. Since the number of unknown variables in the latent HSI is much larger than the number of known variables in the observed snapshot image, this inverse problem is severely ill-posed, and direct optimization would yield a quite unstable solution. Existing methods have striven to incorporate various hand-crafted priors, such as models of the spatial structure and spectral characteristics of the latent HSI, into the inverse problem, formulating them as regularization terms for robust optimization. Taking the high dimensionality of spectral signatures into account, different local image priors characterizing the spectral image structure within a local region have been widely exploited. For example, Wang et al. [33] exploited a Total Variation (TV) regularized model by imposing a first-order gradient smoothness prior for spectral image reconstruction, while Yuan et al. [40] proposed to employ a generalized alternating projection to solve the TV-regularized model (GAP-TV). Further, Kittle et al. [17] explored a two-step iterative shrinkage/thresholding method (TwIST) for optimization. Although incorporating the TV prior potentially benefits both boundary preservation and smooth-region recovery, the reconstructed result may lose some detailed structure. Motivated by its successful application in blind compressed sensing (BCS) [25], sparse representation has been applied to HSI reconstruction from the snapshot image by optimizing the representation coefficients with a sparsity constraint on a dictionary learned for local image patches [18]. Later, Yuan et al. imposed a compressibility constraint instead of a sparsity prior and proposed a global-local shrinkage prior to learn the dictionary and representation coefficients [39]. Moreover, Wang et al. [30] incorporated non-local similarity into a 3D non-local sparse representation model to boost reconstruction performance. However, hand-crafted image priors are not always sufficient to capture the characteristics of various spectral images and thus cause unstable reconstruction performance. Furthermore, suitable priors vary across images, and discovering a proper prior for a specific scene is a hard task in real scenarios.

2.2 Deep Learning-Based Methods

Benefiting from their powerful modeling capability, deep learning-based methods have been widely used for image restoration tasks, including HSI reconstruction. Deep learning-based HSI reconstruction methods can implicitly learn the underlying prior from previously prepared training samples instead of manually designing priors for the spatial and spectral characteristics of the latent HSI, and then construct a mapping model between the compressed snapshot image and the desired HSI. Various deep networks have been proposed for the HSI reconstruction problem. For example, Xiong et al. [36] employed several vanilla convolution layer-based networks (HSCNN) to learn a brute-force mapping between the latent HS image and its spectrally under-sampled projections, and demonstrated the feasibility of HSI reconstruction from a common RGB image or a compressive sensing (CS) measurement.


Wang et al. [35] proposed a joint coded aperture optimization and HSI reconstruction network for simultaneously learning the optimal sensing matrix and the latent HS image in an end-to-end framework, while Miao et al. [22] developed a λ-net integrating both the sensing mask and the compressed snapshot measurement to hierarchically reconstruct the HSI with a dual-stage generative model. Later, Wang et al. [31] conducted multi-stage deep spatial-spectral prior (DSSP) modeling to incorporate both local coherence and dynamic characteristics for boosting HSI reconstruction performance. Although promising performance has been achieved with deep networks, current research mainly focuses on designing more complicated and deeper network architectures to pursue performance gains, and thus yields large-scale reconstruction models that restrict their applicability in real HSI systems. To enhance the flexibility and interpretability of deep reconstruction models, several works recently incorporated deep learned priors into an iterative optimization procedure and proposed deep unrolling-based optimization methods for natural-image compressive sensing, e.g., LISTA [15], ADMM-Net [21, 37] and ISTA-Net [41]. These methods unroll the iterative optimization procedure into a series of learnable sub-problems, and aim at simultaneously learning the network parameters that model the deep priors and the image-updating parameters prescribed by the reconstruction formula. However, they were proposed for the natural-image compressive sensing problem by elaborately modeling the latent spatial structure, and are insufficient for capturing the spectral prior of high-dimensional HSIs. To effectively model the prior in the spectral domain, Choi et al. [8] proposed a convolutional auto-encoder network to learn a spectral prior, and then incorporated the deep image prior learned in a pre-training phase into the optimization procedure as a regularizer. Wang et al. [32] further conducted both spectral and non-local (NLS) prior learning, and combined the model-based optimization method with NLS-based regularization for robust HSI reconstruction. Although these unrolling methods show acceptable performance for the conventional compressive sensing problem, they still have insufficient spectral recovery capability.

3 Proposed Lightweight Hyperspectral Reconstruction Network

In this section, we first present the formulation of the measurement and reconstruction phases in the coded aperture snapshot spectral imaging (CASSI) system, and then introduce our lightweight hyperspectral reconstruction model, including the overall architecture and the proposed residual reconstruction module: the deep feature hallucination module.

3.1 CASSI Observation Model

CASSI [1, 13, 29] encodes the 3D hyperspectral data of a scene into a 2D compressive snapshot image. We denote the intensity of the incident light for a spectral scene as X(h, w, λ), where h and w are the spatial indices (1 ≤ h ≤ H, 1 ≤ w ≤ W) and λ is the spectral index (1 ≤ λ ≤ Λ). The incoming light is collected by the objective lens and then spatially modulated by a coded aperture, which creates a transmission


function T(h, w) as its mathematical description. Next, the modulated scene is spectrally dispersed with a wavelength-dependent dispersion function ψ(λ) by the disperser, and a charge-coupled device (CCD) detects the spatially and spectrally coded scene as a snapshot image. The schematic concept of this measurement phase in CASSI is shown in Fig. 1. Mathematically, the observation procedure for measuring the 2D snapshot image can be formulated as:

$$Y(h, w) = \sum_{\lambda=1}^{\Lambda} T\big(h - \psi(\lambda), w\big)\, X\big(h - \psi(\lambda), w, \lambda\big). \qquad (1)$$

For simplicity, we reformulate the observation model in Eq. (1) in matrix-vector form as:

$$\mathbf{Y} = \Phi \mathbf{X}, \qquad (2)$$

where $\Phi \in \mathbb{R}^{(W+\Lambda-1)H \times WH\Lambda}$ is the measurement matrix of CASSI, i.e., the combined operation jointly determined by T(h, w) and ψ(λ), while $\mathbf{Y} \in \mathbb{R}^{(W+\Lambda-1)H}$ and $\mathbf{X} \in \mathbb{R}^{WH\Lambda}$ represent the vectorized compressive image and the full 3D HSI, respectively. Given the observed compressive snapshot Y, the goal of HSI reconstruction in CASSI is to recover the underlying 3D spectral image X, which is a severely ill-posed inverse problem. Traditional model-based methods usually yield insufficient performance, while existing deep learning-based paradigms, despite their promising performance, usually yield large-scale models that restrict their applicability in real HSI systems. This study aims to exploit a lightweight deep reconstruction model that maintains reconstruction performance while reducing model size and computational cost.

3.2 Overview of the Lightweight Reconstruction Model

The conceptual architecture of the proposed lightweight reconstruction model (LWHRN) is illustrated in Fig. 2(a). It includes an initial reconstruction module and multiple lightweight deep feature hallucination modules (DFHM) for hierarchically reconstructing the un-recovered residual spatial and spectral components with cheaper operations than vanilla convolution layers. The DFHM is composed of a spectral hallucination block (SHB) for duplicating more spectral feature maps using depth-wise convolution and a spatial context aggregation block (SCAB) for exploiting multiple contexts with various receptive fields. To reduce the complexity and the number of model parameters, we elaborately design both the SHB and the SCAB with much cheaper operations while maintaining the amount of learned feature maps to guarantee reconstruction performance, and thus construct a lightweight model for practical application in real HSI systems. Moreover, we employ a residual connection structure to learn the components left un-recovered by the previous module, and gradually estimate the HS image from coarse to fine.

Concretely, given the measured snapshot image Y, the goal is to recover the full spectral image X using the LWHRN model. Firstly, as shown in Fig. 2(a), an initial reconstruction module, which consists of several vanilla convolution layers, transforms


Fig. 2. The conceptual architecture of the proposed lightweight reconstruction model: (a) the overall architecture (initial reconstruction followed by stacked deep feature hallucination modules, DFHM); (b) the deep feature hallucination module (spectral hallucination block, SHB, followed by spatial context aggregation block, SCAB).

the 2D compressed image Y into multi-channel feature maps and then predicts an initial HSI X^(0) with Λ spectral bands. The initial reconstruction can be formulated as:

$$\mathbf{X}^{(0)} = f_{Ini\text{-}rec}(\mathbf{Y}), \qquad (3)$$

where $f_{Ini\text{-}rec}(\cdot)$ represents the overall transformation of the initial reconstruction module. In our experiments, we simply employ three convolution layers with kernel size 3 × 3, each followed by a ReLU activation layer. Then, multiple deep feature hallucination modules (DFHM) with cheap operations and residual connections are stacked to form our backbone architecture, which hierarchically predicts the residual components to reconstruct the latent HSI from coarse to fine. Let $\mathbf{X}^{(k)}$ denote the output of the k-th DFHM module; the (k+1)-th DFHM module with a residual connection aims to learn a more reliable reconstruction of the latent HSI, expressed as:

$$\mathbf{X}^{(k+1)} = \mathbf{X}^{(k)} + f_{DFHM}\big(\mathbf{X}^{(k)}\big), \qquad (4)$$

where $f_{DFHM}(\cdot)$ denotes the transformation of the DFHM module. The DFHM consists of a spectral hallucination block and a spatial context aggregation block, which are implemented to capture a sufficient number of feature-map channels using cheap depth-wise convolution operations instead of vanilla convolutions, and are expected to reconstruct more reliable structures in both the spatial and spectral directions.
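The coarse-to-fine scheme of Eqs. (3)-(4) can be summarized by a minimal PyTorch-style sketch, shown below. It is an illustrative reading of the text, not the authors' released implementation: the intermediate width of 64 channels and the generic `dfhm_factory` placeholder (standing in for the DFHM detailed in Sect. 3.3) are assumptions.

```python
# Minimal sketch of the LWHRN pipeline: initial reconstruction (three 3x3 conv/ReLU
# layers, Eq. (3)) followed by stacked residual modules (Eq. (4)).
import torch
import torch.nn as nn


class InitialReconstruction(nn.Module):
    """Maps the single-channel 2D snapshot to an initial HSI with `num_bands` bands."""
    def __init__(self, num_bands=31, width=64):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(1, width, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(width, width, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(width, num_bands, 3, padding=1), nn.ReLU(inplace=True),
        )

    def forward(self, y):
        return self.body(y)                      # X^(0) = f_Ini-rec(Y)


class LWHRN(nn.Module):
    """Coarse-to-fine reconstruction: X^(k+1) = X^(k) + f_DFHM(X^(k))."""
    def __init__(self, dfhm_factory, num_bands=31, num_modules=9):
        super().__init__()
        self.init_rec = InitialReconstruction(num_bands)
        self.stages = nn.ModuleList([dfhm_factory(num_bands) for _ in range(num_modules)])

    def forward(self, y):
        x = self.init_rec(y)
        for dfhm in self.stages:
            x = x + dfhm(x)                      # residual refinement, Eq. (4)
        return x


# Usage with a trivial stand-in for the DFHM (any band-preserving module works here):
model = LWHRN(lambda bands: nn.Conv2d(bands, bands, 3, padding=1))
recon = model(torch.randn(1, 1, 48, 48))         # (1, 31, 48, 48)
```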


Moreover, we adopt a residual connection in the DFHM to model only the components left un-recovered by the previous module, as shown in Fig. 2(a). Next, we present the detailed structure of our proposed DFHM.

3.3 The DFHM Module

The HSI reconstruction task from a snapshot image requires simultaneously modeling detailed spatial structure and abundant spectral characteristics to reconstruct more plausible HSIs, and it is extremely challenging to reliably reconstruct the high-dimensional signal in both the spectral and spatial dimensions. Existing deep models generally deepen and widen the network architecture to learn a large number of feature maps for boosting recovery performance, which unavoidably causes a large model size and high computational cost. Inspired by the insight that some feature maps in well-trained networks can be obtained by applying simple transformations to already learned features, we deploy a vanilla convolution layer to learn feature maps with a small number of channels (reduced spectra), and then adopt much cheaper depth-wise convolution operations that transform the previously learned maps to obtain more hallucinated spectral information, dubbed the spectral hallucination block (SHB). Moreover, on the concatenated spectrally reduced and hallucinated feature maps, we further conduct depth-wise convolutions with various kernel sizes to capture multi-scale spatial context, and then aggregate them into the final feature map, dubbed the spatial context aggregation block (SCAB). Since the SCAB is mainly composed of depth-wise convolutions, it also greatly decreases the number of parameters compared with vanilla convolutions. Finally, a point-wise convolution layer estimates the residual component left un-recovered by the previous module. To this end, we construct the deep feature hallucination module (DFHM) with an SHB and a SCAB to reciprocally hallucinate more spectral information and spatial structure at various scales of context, followed by a point-wise convolution producing the output. The DFHM structure is shown in Fig. 2(b). Next, we give the detailed description of the SHB and the SCAB.

Spectral Hallucination Block (SHB): Given the reconstructed HSI $\mathbf{X}_k \in \mathbb{R}^{H \times W \times \Lambda}$ at the k-th DFHM module, the DFHM first transforms it to a feature map with C channels, $\mathbf{X}^{(k)} \in \mathbb{R}^{H \times W \times C}$, where a larger number of channels (spectra) yields better representative capability. The SHB aims to further learn deeper representative features with the same channel number. Instead of directly learning the deeper feature with the required number of channels, the SHB first employs a vanilla convolution/ReLU pair to obtain a feature map with the reduced channel number C/S, and then adopts a set of linear operations on the reduced feature to hallucinate more spectral features. Finally, the hallucinated spectral features and the spectrally reduced feature are stacked together as the final learned feature map of the SHB. Specifically, we use depth-wise convolution, which is a much cheaper operation than a vanilla convolution layer, to implement the linear operation. The mathematical formula of the SHB can be expressed as:


$$\mathbf{X}^{(k)}_{RSF} = f_{RSF}\big(\mathbf{X}^{(k)}\big), \qquad (5)$$
$$\mathbf{X}^{(k)}_{SH} = f_{SH}\big(\mathbf{X}^{(k)}_{RSF}\big), \qquad (6)$$
$$\mathbf{X}^{(k)}_{SHB} = \mathrm{Concat}\big(\mathbf{X}^{(k)}_{RSF}, \mathbf{X}^{(k)}_{SH}\big), \qquad (7)$$

where $\mathbf{X}^{(k)}_{RSF} \in \mathbb{R}^{H \times W \times \frac{C}{S}}$, $\mathbf{X}^{(k)}_{SH} \in \mathbb{R}^{H \times W \times \frac{C(S-1)}{S}}$ and $\mathbf{X}^{(k)}_{SHB} \in \mathbb{R}^{H \times W \times C}$ represent the spectrally reduced feature, the hallucinated spectral feature, and the output feature map of the SHB, respectively. $f_{RSF}(\cdot)$ denotes the transformation of a vanilla convolution/ReLU layer with spatial kernel size 3 × 3 that reduces the channel number from C to C/S, while $f_{SH}(\cdot)$ is the transformation of a set of depth-wise convolution layers with spatial kernel size 3 × 3.

It should be noted that if a vanilla convolution with kernel size d × d were employed to transform the feature map $\mathbf{X}^{(k)}$ into a deeper feature with the same number of channels, the number of parameters (ignoring the bias term for simplicity) would be $C \cdot d \cdot d \cdot C$, whereas the parameter count of the proposed SHB, with the spectrally reducing vanilla convolution and the spectrally hallucinating depth-wise convolution, is $C \cdot d \cdot d \cdot \frac{C}{S} + d \cdot d \cdot \big(C - \frac{C}{S}\big)$. Thus, the parameter compression ratio of the SHB can be calculated as

$$r_p = \frac{C \cdot d \cdot d \cdot C}{C \cdot d \cdot d \cdot \frac{C}{S} + d \cdot d \cdot \big(C - \frac{C}{S}\big)} \approx S. \qquad (8)$$
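For concreteness, a minimal PyTorch-style sketch of the SHB described by Eqs. (5)-(7) follows. It is an interpretation under stated assumptions (C = 64, S = 2, depth multiplier 1 for the depth-wise hallucination), not the authors' exact layer configuration.

```python
# Sketch of the spectral hallucination block: a vanilla 3x3 conv/ReLU reduces the
# channels from C to C/S, a cheap 3x3 depth-wise conv hallucinates the remaining
# C - C/S channels, and the two parts are concatenated (Eqs. (5)-(7)).
import torch
import torch.nn as nn


class SHB(nn.Module):
    def __init__(self, channels=64, s=2):
        super().__init__()
        reduced = channels // s
        # f_RSF: vanilla conv/ReLU producing the spectrally reduced feature (C/S channels).
        self.reduce = nn.Sequential(
            nn.Conv2d(channels, reduced, 3, padding=1), nn.ReLU(inplace=True))
        # f_SH: depth-wise conv hallucinating the remaining C - C/S channels
        # (with s=2 this is exactly one hallucinated map per reduced channel).
        self.hallucinate = nn.Conv2d(
            reduced, channels - reduced, 3, padding=1, groups=reduced)

    def forward(self, x):
        x_rsf = self.reduce(x)                         # Eq. (5)
        x_sh = self.hallucinate(x_rsf)                 # Eq. (6)
        return torch.cat([x_rsf, x_sh], dim=1)         # Eq. (7)


out = SHB()(torch.randn(1, 64, 48, 48))                # (1, 64, 48, 48)
```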

Similarly, the theoretical speed-up ratio obtained by replacing the vanilla convolution layer with the SHB is approximately S. Therefore, the proposed SHB not only learns the same amount of feature maps but also greatly reduces the parameter number and speeds up the computation.

Table 1. Performance comparisons on the CAVE and Harvard datasets (3% compressive ratio). The best performance is labeled in bold, and the second best is underlined.

Method             | CAVE: PSNR SSIM SAM   | Harvard: PSNR SSIM SAM | Params (MB) | Flops (G)
TwIST              | (−)   (−)    (−)      | 27.16  0.924  0.119    | (−)         | (−)
3DNSR              | (−)   (−)    (−)      | 28.51  0.940  0.132    | (−)         | (−)
SSLR               | (−)   (−)    (−)      | 29.68  0.952  0.101    | (−)         | (−)
HSCNN [36]         | 24.94 0.736  0.452    | 35.09  0.936  0.145    | 312         | 87
HyperReconNet [35] | 25.18 0.825  0.332    | 35.94  0.938  0.160    | 581         | 152
λ-Net [22]         | 24.77 0.816  0.314    | 36.73  0.947  0.141    | 58247       | 12321
DeepSSPrior [31]   | 25.48 0.825  0.324    | 37.10  0.950  0.137    | 341         | 89
Our                | 27.44 0.830  0.302    | 37.26  0.951  0.133    | 197         | 49

Spatial Context Aggregation Block (SCAB): It is known that the reliable spectral recovery of a specific pixel greatly depends on the surrounding spatial context, and the


required spatial range may vary according to the physical characteristics of the pixel. Ordinary convolution networks usually exploit context over the same spatial range for all pixels regardless of the pixel characteristics, which may yield non-optimal spectral reconstruction. To this end, we attempt to learn the feature by exploiting multiple spatial contexts under various receptive fields with cheap depth-wise convolution operations, and adaptively aggregate them into a compact representation with a point-wise convolution, dubbed the spatial context aggregation block (SCAB). Given the spectrally hallucinated feature $\mathbf{X}^{(k)}_{SHB}$ of the SHB, the SCAB first adopts a mixed depth-wise convolutional layer (dubbed MixConv) [28] to adaptively exploit the spatial dependency over local regions of different sizes while reducing parameters. In the detailed implementation, the spectrally hallucinated feature map $\bar{\mathbf{X}}^{(k)}_{SHB} \in \mathbb{R}^{H \times W \times C}$ is partitioned into M groups, $\bar{\mathbf{X}}^{(k)}_{SHB} = [\mathbf{X}_1, \mathbf{X}_2, \cdots, \mathbf{X}_M]$, by evenly dividing the channel dimension, where $\mathbf{X}_m \in \mathbb{R}^{H \times W \times L_m}$ ($L_m = C/M$) represents the feature maps of the m-th group. The MixConv layer exploits different spatial contexts for different groups using depth-wise convolution layers. Denoting the parameter set of the MixConv layer as $\Theta_{Mix} = \{\theta_1, \theta_2, \cdots, \theta_M\}$ for the M groups of depth-wise convolution layers, where the parameters of different groups have various spatial kernel sizes, $\theta_m \in \mathbb{R}^{s_m \times s_m \times L_m}$, for exploring spatial contexts in different local regions, the MixConv layer is formulated as:

$$\mathbf{X}^{(k)}_{Mix} = \mathrm{Concat}\big(f^{\theta_1}_{dp}(\mathbf{X}_1), f^{\theta_2}_{dp}(\mathbf{X}_2), \cdots, f^{\theta_M}_{dp}(\mathbf{X}_M)\big), \qquad (9)$$

where $f^{\theta_m}_{dp}(\cdot)$ represents the depth-wise convolutional layer with weight parameters $\theta_m$ (for simplicity, we ignore the bias parameters). With different spatial kernel sizes in different groups, spatial correlations over various local regions are integrated simultaneously to extract highly representative features within a single layer. Moreover, we employ depth-wise convolution operations in all groups, which greatly reduces the parameters (by a factor of $1/L_m$) compared with a vanilla convolution layer so that the model can be easily implanted in real imaging systems, and we expect more reliable spatial structure reconstruction by concentrating on spatial context exploration. Finally, a point-wise convolution layer estimates the residual component of the k-th DFHM module, expressed as:

$$\bar{\mathbf{X}}_k = f_{PW}\big(\mathbf{X}^{(k)}_{Mix}\big). \qquad (10)$$
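A minimal PyTorch-style sketch of the SCAB described by Eqs. (9)-(10) is given below. The group count and the specific kernel sizes (3, 5, 7, 9) are assumptions for illustration; the paper does not fix them in this section.

```python
# Sketch of the spatial context aggregation block: split the feature map into M
# channel groups, apply one depth-wise conv per group with its own kernel size
# (MixConv, Eq. (9)), concatenate, and predict the residual HSI component with a
# point-wise convolution (Eq. (10)).
import torch
import torch.nn as nn


class SCAB(nn.Module):
    def __init__(self, channels=64, num_bands=31, kernel_sizes=(3, 5, 7, 9)):
        super().__init__()
        assert channels % len(kernel_sizes) == 0
        group_ch = channels // len(kernel_sizes)
        # One depth-wise convolution per group, each with a different receptive field.
        self.mix = nn.ModuleList([
            nn.Conv2d(group_ch, group_ch, k, padding=k // 2, groups=group_ch)
            for k in kernel_sizes])
        # Point-wise convolution aggregating the multi-scale contexts into the residual.
        self.pointwise = nn.Conv2d(channels, num_bands, 1)

    def forward(self, x):
        chunks = torch.chunk(x, len(self.mix), dim=1)                       # M groups
        mixed = torch.cat([conv(c) for conv, c in zip(self.mix, chunks)], dim=1)
        return self.pointwise(mixed)


residual = SCAB()(torch.randn(1, 64, 48, 48))                               # (1, 31, 48, 48)
```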

4 Experimental Results

To demonstrate the effectiveness of our proposed lightweight reconstruction model, we conduct comprehensive experiments on two hyperspectral datasets: the CAVE dataset [38] and the Harvard dataset [7]. The CAVE dataset consists of 32 images with a spatial resolution of 512 × 512 and 31 spectral channels ranging from 400 nm to 700 nm, while the Harvard dataset is composed of 50 outdoor images captured under daylight conditions with a spatial resolution of 1040 × 1392 and spectral wavelengths ranging from 420 nm to 720 nm. In our experiments, we randomly select 20 images in


Fig. 3. Visualization results for two example images compared with the SoTA deep learning models HSCNN [36], HyperReconNet [35], DeepSSPrior [31], and our proposed lightweight model; the three values under each reconstruction are the PSNR, SSIM and SAM, respectively. (a) HSCNN (21.19, 0.717, 0.27), HyperReconNet (23.28, 0.807, 0.177), DeepSSPrior (22.85, 0.818, 0.202), Our (24.76, 0.801, 0.150); (b) HSCNN (36.48, 0.953, 0.203), HyperReconNet (36.52, 0.954, 0.207), DeepSSPrior (36.95, 0.958, 0.195), Our (37.71, 0.954, 0.195).

the CAVE dataset and 10 images in the Harvard dataset as testing samples and use the rest for training. To simulate the 2D snapshot image, we synthesize the transmission function T(h, w) of the coded aperture in the HS imaging system by randomly generating a binary matrix according to a Bernoulli distribution with p = 0.5, and then create the snapshot measurements by transforming the original HSI with the synthesized transmission function. To prepare training samples, we extract corresponding snapshot/HSI patches with a spatial size of 48 × 48 from the training images. We implement our overall network by stacking 9 DFHM modules, following the same number of stages as the conventional deep models HSCNN [36] and DeepSSPrior [31] for a fair comparison of model parameters and computational cost. Moreover, we quantitatively evaluate the HS image reconstruction performance using three metrics: peak signal-to-noise ratio (PSNR), structural similarity (SSIM) and spectral angle mapper (SAM).
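The snapshot simulation described above can be sketched in a few lines of NumPy. This is a hedged illustration of the general CASSI-style forward model: the Bernoulli(0.5) binary mask follows the text, while the one-pixel-per-band dispersion along the width axis is an assumption, since the exact dispersion step is not specified here.

```python
# Sketch of the snapshot simulation: mask each band with the binary coded aperture
# T(h, w), shift it by a wavelength-dependent dispersion, and sum the shifted bands
# into the 2D compressive measurement.
import numpy as np


def simulate_snapshot(hsi: np.ndarray, step: int = 1, seed: int = 0) -> np.ndarray:
    """hsi: (H, W, L) hyperspectral cube -> (H, W + step*(L-1)) snapshot image."""
    h, w, num_bands = hsi.shape
    rng = np.random.default_rng(seed)
    mask = rng.binomial(1, 0.5, size=(h, w)).astype(hsi.dtype)   # coded aperture T(h, w)
    snapshot = np.zeros((h, w + step * (num_bands - 1)), dtype=hsi.dtype)
    for b in range(num_bands):
        shift = step * b                                          # dispersion psi(lambda)
        snapshot[:, shift:shift + w] += mask * hsi[:, :, b]       # mask, shift and sum
    return snapshot


# Example on a random 48x48 patch with 31 bands, matching the training patch size.
patch = np.random.rand(48, 48, 31).astype(np.float32)
y = simulate_snapshot(patch)
```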


Table 2. Ablation study on the CAVE dataset. The best performance is labeled in bold, and the second best is underlined.

Metrics        | w/o SCAB (S = 2) | w/o SCAB (S = 3) | w/o SCAB (S = 4) | SHB + SCAB (S = 2)
PSNR           | 26.98            | 26.63            | 26.13            | 27.44
SSIM           | 0.813            | 0.805            | 0.813            | 0.830
SAM            | 0.307            | 0.324            | 0.327            | 0.302
Parameter (MB) | 40               | 35               | 32               | 42

Comparison with the SoTA Methods: We compare our proposed method with several state-of-the-art HSI reconstruction methods, including three traditional methods with hand-crafted prior modeling, i.e., TwIST with a TV prior [40] and 3DNSR and SSLR with NLS priors [11, 30], and four deep learning-based methods, i.e., HSCNN [36], HyperReconNet [35], λ-net [22] and Deep Spatial Spectral Prior (DeepSSPrior) [31]. Our lightweight model is implemented by stacking 9 DFHM modules with compression ratio S = 2 (in the SHB). The quantitative comparison in Table 1 verifies that our proposed lightweight model not only achieves promising reconstruction performance but also greatly reduces the parameters as well as the computational cost. Moreover, the visual comparison of our lightweight model with HSCNN [36], HyperReconNet [35] and DeepSSPrior [31] in Fig. 3 also demonstrates the better reconstruction performance of our method.

Ablation Study: As introduced in Sect. 3.3, we compress the spectral channels from C to C/S and then hallucinate more spectral features using cheap depth-wise convolution operations in the SHB, where the hyper-parameter S can be adjusted according to the desired parameter compression ratio of the SHB. Furthermore, the subsequent SCAB is incorporated to exploit multi-scale spatial contexts with cheap operations, and can be plugged into or removed from the DFHM module. To verify the effect of the compression ratio S and the additional SCAB, we carried out experiments on the CAVE dataset varying S from 2 to 4, with and without the SCAB. The ablation results are shown in Table 2: the compression ratio S = 2 achieves the best performance, while increasing the compression ratio yields a small performance drop with a smaller parameter count. Moreover, incorporating the SCAB further boosts the reconstruction performance while adding only a few parameters.

5 Conclusions

This study proposed a novel lightweight model for efficiently and effectively reconstructing a full hyperspectral image from a compressive snapshot measurement. Although existing deep learning-based models have achieved remarkable performance improvements over traditional model-based methods for hyperspectral image reconstruction, it remains difficult to embed such deep models in real HSI


systems due to their large model size. To this end, we exploited an efficient deep feature hallucination module (DFHM) to construct our lightweight model. Specifically, we built the DFHM from a vanilla convolution-based spectral reduction layer and a depth-wise convolution-based spectral hallucination layer to learn sufficient feature maps with cheap operations. Moreover, we further incorporated a spatial context aggregation block to exploit multi-scale contexts over various receptive fields for boosting reconstruction performance. Experiments on two datasets demonstrated that our proposed method achieves superior performance over the SoTA models while greatly reducing the parameters and computational cost.

Acknowledgements. This work was supported in part by MEXT under Grant No. 20K11867, and JSPS KAKENHI Grant Number JP12345678.

References

1. Arce, G., Brady, D., Carin, L., Arguello, H., Kittle, D.: Compressive coded aperture spectral imaging: an introduction. IEEE Signal Process. Mag. 31, 105–115 (2014)
2. Bioucas-Dias, J., Figueiredo, M.A.T.: A new TwIST: two-step iterative shrinkage/thresholding algorithms for image restoration. IEEE Trans. Image Process. 16, 2992–3004 (2007)
3. Bjorgan, A., Randeberg, L.L.: Towards real-time medical diagnostics using hyperspectral imaging technology (2015)
4. Borengasser, M., Hungate, W.S., Watkins, R.: Hyperspectral Remote Sensing: Principles and Applications. CRC Press, Boca Raton (2007)
5. Cao, X., Du, H., Tong, X., Dai, Q., Lin, S.: A prism-mask system for multispectral video acquisition. IEEE Trans. Pattern Anal. Mach. Intell. 33, 2423–2435 (2011)
6. Cao, X., Yue, T., Lin, X., Lin, S., Yuan, X., Dai, Q., Carin, L., Brady, D.: Computational snapshot multispectral cameras: toward dynamic capture of the spectral world. IEEE Signal Process. Mag. 33, 95–108 (2016)
7. Chakrabarti, A., Zickler, T.E.: Statistics of real-world hyperspectral images. In: CVPR, pp. 193–200 (2011)
8. Choi, I., Jeon, D.S., Nam, G., Gutierrez, D., Kim, M.H.: High-quality hyperspectral reconstruction using a spectral prior. ACM Trans. Graphics 36, 1–13 (2017)
9. Cui, Q., Park, J., Smith, R.T., Gao, L.: Snapshot hyperspectral light field imaging using image mapping spectrometry. Opt. Lett. 45(3), 772–775 (2020)
10. Devassy, B.M., George, S.: Forensic analysis of beverage stains using hyperspectral imaging. Sci. Rep. 11, 6512 (2021)
11. Fu, Y., Zheng, Y., Sato, I., Sato, Y.: Exploiting spectral-spatial correlation for coded hyperspectral image restoration. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3727–3736 (2016)
12. Gao, L., Kester, R., Hagen, N., Tkaczyk, T.: Snapshot image mapping spectrometer (IMS) with high sampling density for hyperspectral microscopy. Opt. Express 18, 14330–14344 (2010)
13. Gehm, M., John, R., Brady, D., Willett, R., Schulz, T.: Single-shot compressive spectral imaging with a dual-disperser architecture. Opt. Express 15(21), 14013–14027 (2007)
14. Goetz, A., Vane, G., Solomon, J., Rock, B.: Imaging spectrometry for earth remote sensing. Science 228, 1147–1153 (1985)
15. Gregor, K., LeCun, Y.: Learning fast approximations of sparse coding (2010)
16. Han, K., Wang, Y., Tian, Q., Guo, J., Xu, C., Xu, C.: GhostNet: more features from cheap operations (2019)


17. Kittle, D.S., Choi, K., Wagadarikar, A.A., Brady, D.J.: Multiframe image estimation for coded aperture snapshot spectral imagers. Appl. Opt. 49(36), 6824–6833 (2010)
18. Lin, X., Liu, Y., Wu, J., Dai, Q.: Spatial-spectral encoded compressive hyperspectral imaging. ACM Trans. Graph. 33, 233:1–233:11 (2014)
19. Liu, Y., Yuan, X., Suo, J.L., Brady, D., Dai, Q.: Rank minimization for snapshot compressive imaging. IEEE Trans. Pattern Anal. Mach. Intell. 41, 2990–3006 (2019)
20. Lu, G., Fei, B.: Medical hyperspectral imaging: a review. J. Biomed. Opt. 19, 10901 (2014)
21. Ma, J., Liu, X.Y., Shou, Z., Yuan, X.: Deep tensor ADMM-net for snapshot compressive imaging. In: 2019 IEEE/CVF International Conference on Computer Vision (ICCV), pp. 10222–10231 (2019)
22. Miao, X., Yuan, X., Pu, Y., Athitsos, V.: λ-net: reconstruct hyperspectral images from a snapshot measurement. In: 2019 IEEE/CVF International Conference on Computer Vision (ICCV), pp. 4058–4068 (2019)
23. Nguyen, H.V., Banerjee, A., Chellappa, R.: Tracking via object reflectance using a hyperspectral video camera. In: IEEE Computer Vision and Pattern Recognition Workshops, pp. 44–51 (2010)
24. Pan, Z., Healey, G., Prasad, M., Tromberg, B.: Face recognition in hyperspectral images. IEEE Trans. Pattern Anal. Mach. Intell. 25, 1552–1560 (2003)
25. Rajwade, A., Kittle, D.S., Tsai, T.H., Brady, D.J., Carin, L.: Coded hyperspectral imaging and blind compressive sensing. SIAM J. Imaging Sci. 6, 782–812 (2013)
26. Schechner, Y., Nayar, S.: Generalized mosaicing: wide field of view multispectral imaging. IEEE Trans. Pattern Anal. Mach. Intell. 24, 1334–1348 (2002)
27. Tan, J., Ma, Y., Rueda, H., Baron, D., Arce, G.: Compressive hyperspectral imaging via approximate message passing. IEEE J. Sel. Topics Signal Process. 10, 389–401 (2016)
28. Tan, M., Le, Q.V.: MixConv: mixed depthwise convolutional kernels (2019)
29. Wagadarikar, A.A., John, R., Willett, R.M., Brady, D.J.: Single disperser design for coded aperture snapshot spectral imaging. Appl. Opt. 47(10), B44–B51 (2008)
30. Wang, L., Xiong, Z., Shi, G., Wu, F., Zeng, W.: Adaptive nonlocal sparse representation for dual-camera compressive hyperspectral imaging. IEEE Trans. Pattern Anal. Mach. Intell. 39, 2104–2111 (2017)
31. Wang, L., Sun, C., Fu, Y., Kim, M.H., Huang, H.: Hyperspectral image reconstruction using a deep spatial-spectral prior. In: 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 8024–8033 (2019)
32. Wang, L., Sun, C., Zhang, M., Fu, Y., Huang, H.: DNU: deep non-local unrolling for computational spectral imaging. In: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1658–1668 (2020)
33. Wang, L., Xiong, Z., Gao, D., Shi, G., Wu, F.J.: Dual-camera design for coded aperture snapshot spectral imaging. Appl. Opt. 54(4), 848–858 (2015)
34. Wang, L., Xiong, Z., Gao, D., Shi, G., Zeng, W., Wu, F.: High-speed hyperspectral video acquisition with a dual-camera architecture. In: 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 4942–4950 (2015)
35. Wang, L., Zhang, T., Fu, Y., Huang, H.: HyperReconNet: joint coded aperture optimization and image reconstruction for compressive hyperspectral imaging. IEEE Trans. Image Process. 28, 2257–2270 (2019)
36. Xiong, Z., Shi, Z., Li, H., Wang, L., Liu, D., Wu, F.: HSCNN: CNN-based hyperspectral image recovery from spectrally undersampled projections. In: 2017 IEEE International Conference on Computer Vision Workshops (ICCVW), pp. 518–525 (2017)
37. Yang, Y., Sun, J., Li, H., Xu, Z.: ADMM-CSNet: a deep learning approach for image compressive sensing. IEEE Trans. Pattern Anal. Mach. Intell. 42, 521–538 (2020)


38. Yasuma, F., Mitsunaga, T., Iso, D., Nayar, S.: Generalized assorted pixel camera: postcapture control of resolution, dynamic range, and spectrum. IEEE Trans. Image Process. 19, 2241–2253 (2010)
39. Yuan, X., Tsai, T., Zhu, R., Llull, P., Brady, D., Carin, L.: Compressive hyperspectral imaging with side information. IEEE J. Sel. Topics Signal Process. 9, 964–976 (2015)
40. Yuan, X.: Generalized alternating projection based total variation minimization for compressive sensing. In: 2016 IEEE International Conference on Image Processing (ICIP), pp. 2539–2543 (2016)
41. Zhang, J., Ghanem, B.: ISTA-Net: interpretable optimization-inspired deep network for image compressive sensing. In: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 1828–1837 (2018)
42. Zhang, S., Wang, L., Fu, Y., Zhong, X., Huang, H.: Computational hyperspectral imaging based on dimension-discriminative low-rank tensor recovery. In: 2019 IEEE/CVF International Conference on Computer Vision (ICCV), pp. 10182–10191 (2019)
43. Zhang, X., Lian, Q., Yang, Y.C., Su, Y.: A deep unrolling network inspired by total variation for compressed sensing MRI. Digit. Signal Process. 107, 102856 (2020)
44. Zhou, S., He, Y., Liu, Y., Li, C.: Multi-channel deep networks for block-based image compressive sensing. arXiv preprint arXiv:1908.11221 (2019)

A Transformer-Based Model for Preoperative Early Recurrence Prediction of Hepatocellular Carcinoma with Multi-modality MRI

Gan Zhan1(B), Fang Wang2, Weibin Wang1, Yinhao Li1, Qingqing Chen2, Hongjie Hu2, and Yen-Wei Chen1

1 College of Information Science and Engineering, Ritsumeikan University, Kyoto, Japan
{gr0502vs,gr0342he}@ed.ritsumei.ac.jp, [email protected], [email protected]
2 Department of Radiology, Sir Run Run Shaw Hospital, Zhejiang University School of Medicine, Hangzhou, China
{wangfang11,qingqingchen,hongjiehu}@zju.edu.cn

Abstract. Hepatocellular carcinoma (HCC) is the most common primary liver cancer and accounts for a high mortality rate in clinical practice, and the most effective treatment for HCC is surgical resection. However, patients with HCC still face a huge risk of recurrence after tumor resection. In this light, preoperative early recurrence prediction methods are necessary to guide physicians in developing an individualized preoperative treatment plan and postoperative follow-up, thus prolonging the survival time of patients. Nevertheless, existing methods based on clinical data neglect the information of the image modality; existing methods based on radiomics are limited by the expressive power of their predefined features compared with deep learning methods; and existing methods based on CT scans are constrained by the inability to capture image details compared with MRI. With these observations, we propose a deep learning transformer-based model on multi-modality MRI to tackle the preoperative early recurrence prediction task for HCC. Enlightened by the vigorous context-modeling capacity of the transformer architecture, our proposed model exploits it to dig out the inter-modality correlations, and the performance improves significantly. Our experimental results reveal that our transformer-based model achieves better performance than other state-of-the-art existing methods.

Keywords: HCC early recurrence prediction · Multi-modality MRI · Transformer

G. Zhan and F. Wang—First authors.

1 Introduction

Hepatocellular carcinoma, also known as HCC, is a primary liver cancer that accounts for a high mortality rate in clinical practice. It is the fifth most common malignancy and the second most common cause of cancer-related death around the world [1]; in East Asia and sub-Saharan Africa in particular, HCC accounts for 82% of liver cancer cases [2]. There are many mature treatment choices for patients with HCC, including liver transplantation, targeted therapy, immunotherapy, transarterial chemoembolization, surgical resection, and radiofrequency ablation. Among these, surgical resection is the first-line treatment and is widely recommended by clinical practice guidelines for patients with a well-preserved liver function [3–5]. However, patients with HCC still face a huge risk of recurrence after surgery: the recurrence rate can exceed 10% within 1 year and reach 70–80% within 5 years after tumor resection [6]. The overall survival time also differs between patients with early and late recurrence, with late-recurrence patients tending to live longer [7]. Therefore, it is vital to identify those HCC patients at high risk of early recurrence, which can guide physicians to develop a preoperative individualized treatment plan and postoperative follow-up, thus prolonging the survival time of patients. So far, many methods [8,10,14,25] have been proposed for preoperative HCC early recurrence prediction after tumor resection. Existing methods based on clinical data mainly utilize machine learning algorithms to project quantitative and qualitative variables into the prediction space. For example, Chuanli Liu et al. [8] constructed 5 machine learning algorithms based on 37 patient characteristics, and after comparison selected K-Nearest Neighbor as their optimal algorithm for the HCC early recurrence prediction task. The biggest flaw of clinical-data-based methods is that they provide only limited clinical features and lack the medical image information that could well reflect the heterogeneity of the tumor. Besides, the subjective assessments of the lesion provided by physicians are irreproducible due to human bias, so the prediction performance generalizes poorly. Radiomics was brought up by Gillies et al. [9] as a quantitative tool for feature extraction from medical images; it can save physicians time on repetitive tasks and provides a non-invasive tool for feature extraction. Existing methods based on radiomics provide image features that reveal the heterogeneity of the tumor across HCC patients and can achieve promising results. For example, Ying Zhao et al. [10] extracted 1146 radiomics features from each image phase, and radiomics models of each phase and their combination were constructed by a multivariate logistic regression algorithm to complete the HCC early recurrence prediction task. The drawback of radiomics-based methods is that the radiomics features are predefined and do not directly serve our prediction task; thus, the potential of the image information for our prediction task cannot be fully tapped. Recently, deep learning has yielded brilliant performance in the medical image area compared with conventional approaches [11]. The reason behind these successes is that deep learning can automatically explore robust and generalized


image features directly related to the task without human intervention. This process is mainly conducted by convolutional neural networks [12]: through their hierarchical structure, convolutional neural networks continuously combine low-level pixel features to obtain the high-level semantic features that classification tasks demand. The disadvantage of deep learning methods is that they need a huge amount of data, which is normally not available for medical images. However, with finetuning [13], we can also exploit the performance of neural networks on small-sample medical datasets. Deep learning has been applied to the HCC early recurrence prediction task: Weibin Wang et al. [14] proposed a Phase Attention prediction model based on multiphase CT (computed tomography) scans. But studies [15] have shown that MRI (magnetic resonance imaging) is superior to CT scans for the detection of HCC given its advantage in capturing image details, especially the soft tissue of the tumor. Thus, utilizing MRI has an image-information advantage over CT scans for our prediction task. With these observations, we aim to develop a deep learning method based on MRI to complete our early recurrence prediction task for HCC. Compared with single-modality MRI, multi-modality MRI can observe the tumor and analyze its internal composition in a more comprehensive way, which is more conducive to judging the degree of malignancy [16]; thus we conduct our study on multi-modality MRI. Considering the unique characteristics of each modality for our prediction task, how to combine them well is the biggest challenge. Enlightened by the vigorous context-modeling capacity of the transformer model [17], we formulate each modality image feature as a token of a sequence and then utilize the transformer architecture to dig out the inter-modality correlations. Our contributions are as follows:
(1) We propose a transformer-based model for the preoperative early recurrence prediction of HCC, which can effectively use MRI images for computer-aided diagnosis.
(2) Our model utilizes the transformer architecture to efficiently combine image features from multiple MRI modalities, and detailed experimental results show that it achieves superior performance to other existing methods.
To the best of our knowledge, we are the first to propose a deep learning model based on multi-modality MRI to tackle the early recurrence prediction task for HCC.

2 Proposed Method

We propose an end-to-end deep learning method based on multi-modality MRI to tackle our early recurrence prediction task. The portal venous (PV) phase and arterial (ART) phase help physicians judge whether a tumor is benign or malignant; diffusion-weighted imaging with b = 1000 s/mm2 (DWI1) is useful for judging the water content, hemorrhage, and necrosis of tumors; and the tumor outline is generally clearer on axial T2-weighted imaging with fat suppression (T2) [18,19]. We therefore select these sequences as our research subjects and build our model on these 3 modalities (the PV phase and ART phase belong to the same modality).

Fig. 1. The overall pipeline of our transformer-based model on multi-modality MRI. Image features are obtained through a pre-trained ResNet feature extractor and a transformer encoder block to complete our prediction task.

Figure 1 shows the overall pipeline of our proposed method. We feed the ROI of each modality into a pre-trained ResNet feature extractor and a transformer encoder block to obtain the desired modality image features, and these image features are then combined and projected to the prediction space by a fully connected (FC) layer. In the following sections, we describe our prediction model in detail in terms of the pre-trained ResNet feature extractor and the transformer encoder block.

2.1 Pre-trained ResNet Feature Extractor

ResNet [20], designed to solve the degradation problem of deep neural networks, has achieved remarkable success in computer vision, so we use it to construct our modality image feature extractor. Since the last FC layer in ResNet serves as the classifier, we remove it to obtain our ResNet feature extractor for modality image feature extraction. Considering the small sample set in our study, we specifically select the 18-layer ResNet (ResNet18) pre-trained on ImageNet [21] as our feature extraction backbone: ResNet18 has few parameters, so it better fits our dataset size, and with fine-tuning we can give our modality image feature extractor a very good initialization point. The pre-trained ResNet was originally designed and trained for natural images and accepts input images with a shape of 224 × 224 × 3


(both width and height are 224, and the number of channels is 3). For each modality, we select the slice with the largest tumor area and its two adjacent slices from the MRI volume, and we crop the ROI of the tumor region from these 3 slices, which well represents the HCC characteristics of that input MRI modality volume. After obtaining this 3-channel ROI, we resize its width and height to 224 × 224 to obtain the inputs of the 3 modalities {I_PV, I_ART, I_DWI1, I_T2}, which are then fed into our pre-trained ResNet feature extractor. The output feature of the ART phase is denoted as f_ART, and likewise f_PV for the PV phase, f_DWI1 for the DWI1 modality, and f_T2 for the T2 modality; the shape of each modality image feature obtained before the transformer encoder block is 1 × 1 × 512.
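To make the extractor concrete, the following is a minimal PyTorch sketch of the pre-trained ResNet18 feature extractor described above, assuming torchvision is available; the variable names and the dummy inputs are illustrative and not taken from the authors' code.

```python
import torch
import torch.nn as nn
from torchvision import models

# Load an ImageNet-pretrained ResNet18 and drop its final FC classifier,
# keeping everything up to (and including) global average pooling.
backbone = models.resnet18(pretrained=True)
feature_extractor = nn.Sequential(*list(backbone.children())[:-1])

def extract_modality_features(roi_batch):
    """roi_batch: (N, 3, 224, 224) tensor holding one modality's 3-slice ROI.
    Returns one (N, 512) feature per sample (the 1 x 1 x 512 map, flattened)."""
    feats = feature_extractor(roi_batch)           # (N, 512, 1, 1)
    return feats.flatten(1)                        # (N, 512)

# Example: four modality inputs I_PV, I_ART, I_DWI1, I_T2 (dummy tensors here).
inputs = {name: torch.randn(2, 3, 224, 224) for name in ["PV", "ART", "DWI1", "T2"]}
features = {name: extract_modality_features(x) for name, x in inputs.items()}
print({name: f.shape for name, f in features.items()})  # each torch.Size([2, 512])
```

In the full model these weights would be fine-tuned together with the rest of the network rather than kept frozen.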

2.2 Transformer Encoder Block

The transformer model [17] was first designed for natural language processing tasks: it accepts tokens in a sequence and utilizes the self-attention mechanism (Eq. 1) to mine the inter-token correlations. Transformers have also been widely used in computer vision; for example, in the vision transformer [22] images are divided into a collection of patches, each patch serves as a token in the image sequence, and the semantics of the image can be formulated as the composition of tokens and token grammar. Inspired by these intuitions, we formulate each modality image feature obtained from the pre-trained ResNet feature extractor as a token in the modality sequence, and the inter-modality correlations serve as the modality grammar for our prediction task. We construct our transformer encoder block with 4 transformer encoder layers. We first concatenate the modality image features {f_ART, f_PV, f_DWI1, f_T2} to obtain F_trans; then, through linear projections, the concatenated modality image features are mapped to the key (K), query (Q), and value (V) vectors used in the self-attention described below:

    Attention(Q, K, V) = Softmax(QK^T / √d_k) V        (1)

The inter-modality correlations are then calculated by the dot product between Q and K^T; after the softmax operation scales the attention scores to 0–1, they are multiplied with V, which encodes the modality grammar into the modality tokens. Because large values of d_k make the dot products grow large in magnitude, pushing the softmax function into regions with extremely small gradients, we scale the dot products by 1/√d_k. Hereby we obtain the refined image feature F'_trans, which we split along the token dimension to obtain {f'_ART, f'_PV, f'_DWI1, f'_T2} as the ultimate image features. Finally, we concatenate these image features along the channel dimension and use one FC layer to complete our prediction task.
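As an illustration of the block described above, the sketch below builds a 4-layer transformer encoder over the four 512-dimensional modality tokens and a single FC prediction head. The number of attention heads, the feed-forward width, and all variable names are our assumptions; the authors' implementation details are not given in the text.

```python
import torch
import torch.nn as nn

class ModalityFusionTransformer(nn.Module):
    """Treats each modality feature as one token, applies a 4-layer transformer
    encoder, then concatenates the refined tokens for a single FC head (ER vs. NER)."""
    def __init__(self, dim=512, num_tokens=4, nhead=8, num_layers=4):
        super().__init__()
        layer = nn.TransformerEncoderLayer(
            d_model=dim, nhead=nhead, dim_feedforward=2048, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)
        self.fc = nn.Linear(dim * num_tokens, 1)   # one logit: early recurrence or not

    def forward(self, tokens):             # tokens: (N, 4, 512)
        refined = self.encoder(tokens)     # inter-modality self-attention, (N, 4, 512)
        flat = refined.flatten(1)          # concatenate along the channel dimension
        return self.fc(flat)               # (N, 1) logit

# Usage with the per-patient stack of f_ART, f_PV, f_DWI1, f_T2 (dummy tensors here).
tokens = torch.randn(2, 4, 512)
model = ModalityFusionTransformer()
probs = torch.sigmoid(model(tokens))       # probability of early recurrence
print(probs.shape)                         # torch.Size([2, 1])
```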

Table 1. Dataset arrangement of 5-fold cross-validation.

Fold-1   Fold-2   Fold-3   Fold-4   Fold-5   Total
57       58       58       58       58       289

3 Experiments

3.1 Patient Selection

From 2012 to 2019, 659 HCC patients from Run Run Shaw Hospital Affiliated to Medical College of Zhejiang University, who had undergone liver resection, were pathologically confirmed as having hepatocellular carcinoma, and received an enhanced MRI examination before surgery, were recruited in this retrospective study. After applying the following exclusion criteria: (1) the patient received other anti-tumor treatments before surgery, such as TACE or RFA; (2) the interval between the preoperative MR examination and surgery was more than 30 days; (3) the image quality was poor; and (4) less than 2 years of follow-up after surgery; a total of 289 patients were included in our study. Patients who have a recurrence within 2 years are denoted as early recurrence (ER) [23], and patients who have a recurrence after more than 2 years or no recurrence are denoted as non-early recurrence (NER) in our study.

3.2 Dataset Preparation and Metrics

For the MRI data, considering the effect of different types of artifacts on the MRI, we mainly use the preprocessing pipeline described by Jose Vicente Manjon [24] on our multi-modality MRI, which includes denoising, bias field correction, resampling, and normalization. To fairly measure the performance of our proposed method, we use 5-fold cross-validation on our 289 patients, splitting them into 5 groups along the timeline; the arrangement of the dataset is shown in Table 1. We select AUC (area under the ROC curve), ACC (accuracy), SEN (sensitivity), SPE (specificity), PPV (positive predictive value), and NPV (negative predictive value) to comprehensively evaluate model performance. Among these, AUC, ACC, and SEN are the most important indicators in our task, especially SEN: SEN reflects the model's ability to identify positive patients, and it is of great clinical value in our research if the model can identify more of them. Since we conduct 5-fold cross-validation, we report the average value of each metric. Cross-entropy loss is our model's loss function; for every fold we train 50 epochs and select the checkpoint with the minimal training loss as our trained model.
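For reference, the reported metrics can be computed per fold from binary labels and predicted scores as in the following sketch (scikit-learn is assumed; the 0.5 threshold and the toy inputs are illustrative).

```python
import numpy as np
from sklearn.metrics import roc_auc_score, confusion_matrix

def evaluate_fold(y_true, y_score, threshold=0.5):
    """y_true: ground-truth labels (1 = ER, 0 = NER); y_score: predicted probabilities."""
    y_pred = (np.asarray(y_score) >= threshold).astype(int)
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
    return {
        "AUC": roc_auc_score(y_true, y_score),
        "ACC": (tp + tn) / (tp + tn + fp + fn),
        "SEN": tp / (tp + fn),   # sensitivity: fraction of ER patients identified
        "SPE": tn / (tn + fp),   # specificity
        "PPV": tp / (tp + fp),   # positive predictive value
        "NPV": tn / (tn + fn),   # negative predictive value
    }

# Average each metric over the 5 folds, as done for the reported numbers.
folds = [evaluate_fold([0, 1, 1, 0], [0.2, 0.7, 0.4, 0.1]) for _ in range(5)]
mean_metrics = {k: float(np.mean([f[k] for f in folds])) for k in folds[0]}
print(mean_metrics)
```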

3.3 Ablation Study

Whether early fusion or late fusion should serve as the baseline method for multiple modality image inputs is debatable, so it is necessary to perform this ablation on our task.

Table 2. Ablation on baseline method.

Model          AUC      ACC      SEN      SPE      PPV      NPV
Early fusion   0.6447   0.6469   0.4031   0.8101   0.5777   0.6926
Late fusion    0.6829   0.6365   0.4307   0.7639   0.5237   0.6900

Table 3. Ablation on transformer encoder block.

Model         Transformer   AUC      ACC      SEN      SPE      PPV      NPV
Baseline                    0.6829   0.6365   0.4307   0.7639   0.5237   0.6900
Trans-based   ✓             0.6907   0.6782   0.4360   0.8296   0.5944   0.7117

The comparison result between early fusion and late fusion is shown in Table 2. For early fusion, we simply concatenate the modality ROI images along the channel dimension and feed them into a single pre-trained ResNet18; since the new input has 12 channels, we create a new convolution layer that accepts this input to replace the first convolution of the pre-trained ResNet18. To fit our binary task, we replace the last FC layer, which outputs 1000 neurons, with a new FC layer outputting 1 neuron, and we use 0.5 as the threshold for the binary prediction. For late fusion, we use 4 pre-trained ResNet feature extractors to extract image features from each input, then simply concatenate these modality image features and use 1 FC layer to complete the binary prediction task. As shown in Table 2, the late fusion model is more appropriate for our task: on the indicators we care about most, AUC and SEN, it obtains a large gain over the early fusion model, so we select it as our baseline method. Based on the late fusion model, we add a transformer encoder block to construct our transformer-based model. To measure the effectiveness of the transformer encoder block, we also conduct a detailed ablation study, the results of which are shown in Table 3. When we mine the inter-modality correlations with the transformer encoder block, our transformer-based model (denoted as Trans-based in Table 3) achieves better performance on all metrics, which strongly supports the effectiveness of the transformer encoder block and shows that it is indeed necessary to mine the inter-modality correlations in our study.
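A minimal sketch of the early-fusion baseline described above, assuming torchvision's ResNet18: the first convolution is swapped for a 12-channel version and the 1000-way classifier is replaced by a single-logit head thresholded at 0.5. This is our reconstruction, not the authors' code.

```python
import torch
import torch.nn as nn
from torchvision import models

# Early fusion: stack the four 3-channel ROIs (ART, PV, DWI1, T2) into 12 channels.
model = models.resnet18(pretrained=True)
model.conv1 = nn.Conv2d(12, 64, kernel_size=7, stride=2, padding=3, bias=False)
model.fc = nn.Linear(model.fc.in_features, 1)   # single logit for ER vs. NER

x = torch.cat([torch.randn(2, 3, 224, 224) for _ in range(4)], dim=1)  # (2, 12, 224, 224)
prob = torch.sigmoid(model(x))                  # (2, 1)
pred = (prob >= 0.5).int()                      # 0.5 threshold for the binary decision
print(pred.shape)
```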

3.4 Comparison with Existing Methods

To further demonstrate the effectiveness of our proposed method, we compare it with existing methods on the HCC early recurrence prediction task, including a radiomics-based method and deep learning-based methods. The comparison results are shown in Table 4.

Table 4. Comparison with existing methods.

Model            AUC      ACC      SEN      SPE      PPV      NPV
Radiomics [10]   0.6861   0.6364   0.3337   0.8252   0.5502   0.6721
PhaseNet [25]    0.6450   0.6643   0.3148   0.8770   0.6068   0.6820
DPANet [14]      0.6818   0.6711   0.4308   0.8105   0.5817   0.7075
Trans-based      0.6907   0.6782   0.4360   0.8296   0.5944   0.7117

As Table 4 shows, using only multi-modality MRI, our transformer-based model achieves better performance than the radiomics-based method (denoted as Radiomics in Table 4) and the deep learning methods (PhaseNet [25] and DPANet [14]).

4 Conclusion

In this paper, we construct a transformer-based model on multi-modality MRI to tackle the preoperative early recurrence prediction of HCC patients. Our model formulates each modality image feature as a token in a sequence and obtains the modality correlations relevant to our prediction task with a transformer encoder block. Detailed experiments reveal the effectiveness of our proposed method, which achieves better performance than existing methods, including a state-of-the-art radiomics-based method and deep learning-based methods.

Acknowledgments. This work was supported in part by the Grant in Aid for Scientific Research from the Japanese Ministry for Education, Science, Culture and Sports (MEXT) under the Grant Nos. 20KK0234, 21H03470, and 20K21821, in part by the Natural Science Foundation of Zhejiang Province (LZ22F020012), in part by the Major Scientific Research Project of Zhejiang Lab (2020ND8AD01), and in part by the National Natural Science Foundation of China (82071988), the Key Research and Development Program of Zhejiang Province (2019C03064), the Program Co-sponsored by Province and Ministry (No. WKJ-ZJ-1926), and the Special Fund for Basic Scientific Research Business Expenses of Zhejiang University (No. 2021FZZX003-02-17).

References

1. Elsayes, K.M., Kielar, A.Z., Agrons, M.M.: Liver imaging reporting and data system: an expert consensus statement. J. Hepatocel. Carcinoma 4, 29–39 (2017)
2. Zhu, R.X., Seto, W.K., Lai, C.L.: Epidemiology of hepatocellular carcinoma in the Asia-Pacific region. Gut Liver 10, 332–339 (2016)
3. Thomas, M.B., Zhu, A.X.: Hepatocellular carcinoma: the need for progress. J. Clin. Oncol. 23, 2892–2899 (2005)
4. European Association for the Study of the Liver: EASL clinical practice guidelines: management of hepatocellular carcinoma. J. Hepatol. 69, 182–236 (2018)


5. Marrero, J.A., et al.: Diagnosis, staging and management of hepatocellular carcinoma: 2018 practice guidance by the American Association for the Study of Liver Diseases. Hepatology 68, 723–750 (2018)
6. Bray, F., Ferlay, J., Soerjomataram, I., Siegel, R.L., Torre, L.A., Jemal, A.: Global cancer statistics 2018: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J. Clin. 68, 394–424 (2018)
7. Cheng, Z., Yang, P., Qu, S.: Risk factors and management for early and late intrahepatic recurrence of solitary hepatocellular carcinoma after curative resection. HPB 17, 422–427 (2015)
8. Liu, C., Yang, H., Feng, Y.: A k-nearest neighbor model to predict early recurrence of hepatocellular carcinoma after resection. J. Clin. Translat. Hepatol. 10, 600–607 (2022)
9. Gillies, R., Kinahan, P.E., Hricak, H.: Radiomics: images are more than pictures, they are data. Radiology 278, 563–577 (2015)
10. Zhao, Y., Wu, J., Zhang, Q.: Radiomics analysis based on multiparametric MRI for predicting early recurrence in hepatocellular carcinoma after partial hepatectomy. J. Magn. Reson. Imaging 53, 1066–1079 (2021)
11. Litjens, G., Kooi, T., Bejnordi, B.E.: A survey on deep learning in medical image analysis. Med. Image Anal. 42, 60–88 (2017)
12. Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. Commun. ACM 60, 84–90 (2017)
13. Kolesnikov, A., Beyer, L., Zhai, X., Puigcerver, J., Yung, J., Gelly, S., Houlsby, N.: Big Transfer (BiT): general visual representation learning. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12350, pp. 491–507. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58558-7_29
14. Wang, W., et al.: Phase attention model for prediction of early recurrence of hepatocellular carcinoma with multi-phase CT images and clinical data. Front. Radiol. 8 (2022)
15. Burrel, M., et al.: MRI angiography is superior to helical CT for detection of HCC prior to liver transplantation: an explant correlation. Hepatology 38, 1034–1042 (2003)
16. Armbruster, M., et al.: Measuring HCC tumor size in MRI - the sequence matters! Diagnostics 11 (2002)
17. Vaswani, A., et al.: Attention is all you need. In: 31st Conference on Neural Information Processing Systems (NIPS 2017), Long Beach, CA, USA (2017)
18. Lee, Y., et al.: Benign versus malignant soft-tissue tumors: differentiation with 3T magnetic resonance image textural analysis including diffusion-weighted imaging. Investig. Magn. Resonance Imaging 25, 118–128 (2021)
19. Chartampilas, E., Rafailidis, V., Georgopoulou, V., Kalarakis, G., Hatzidakis, A., Prassopoulos, P.: Current imaging diagnosis of hepatocellular carcinoma. Cancers 14, 3997 (2022)
20. He, K., Zhang, X., Ren, S.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016)
21. Deng, J., Dong, W., Socher, R., Li, L.-J., Li, K., Li, F.-F.: ImageNet: a large-scale hierarchical image database. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 248–255 (2009)
22. Dosovitskiy, A., et al.: An image is worth 16x16 words: transformers for image recognition at scale. arXiv preprint arXiv:2010.11929 (2020)


23. Xing, H., Zhang, W.G., Cescon, M.: Defining and predicting early recurrence after liver resection of hepatocellular carcinoma: a multi-institutional study. HPB 22, 677–689 (2020)
24. Manjon, J.V.: MRI preprocessing. In: Imaging Biomarkers, pp. 53–63 (2017)
25. Wang, W., et al.: Deep learning-based radiomics models for early recurrence prediction of hepatocellular carcinoma with multi-phase CT images and clinical data. In: 2019 41st Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), pp. 4881–4884 (2019)

CaltechFN: Distorted and Partially Occluded Digits

Patrick Rim1(B), Snigdha Saha1, and Marcus Rim2

1 California Institute of Technology, Pasadena, CA 91125, USA
{patrick,snigdha}@caltech.edu
2 Vanderbilt University, Nashville, TN 37235, USA
[email protected]

Abstract. Digit datasets are widely used as compact, generalizable benchmarks for novel computer vision models. However, modern deep learning architectures have surpassed the human performance benchmarks on existing digit datasets, given that these datasets contain digits that have limited variability. In this paper, we introduce Caltech Football Numbers (CaltechFN), an image dataset of highly variable American football digits that aims to serve as a more difficult state-of-the-art benchmark for classification and detection tasks. Currently, CaltechFN contains 61,728 images with 264,572 labeled digits. Given the many different ways that digits on American football jerseys can be distorted and partially occluded in a live-action capture, we find that in comparison to humans, current computer vision models struggle to classify and detect the digits in our dataset. By comparing the performance of the latest task-specific models on CaltechFN and on an existing digit dataset, we show that our dataset indeed presents a far more difficult set of digits and that models trained on it still demonstrate high crossdataset generalization. We also provide human performance benchmarks for our dataset to demonstrate the current gap between the abilities of humans and computers in the tasks of classifying and detecting the digits in our dataset. Finally, we describe two real-world applications that can be advanced using our dataset. CaltechFN is publicly available at https://data.caltech.edu/records/33qmq-a2n15, and all benchmark code is available at https://github.com/patrickqrim/CaltechFN.

1 Introduction

The task of classifying digits was one of the first computer vision tasks successfully “solved” by deep learning architectures. Released in 1998, the MNIST dataset [1] serves as a benchmark for model performance in the task of classifying digits. However, deep learning models have been able to achieve human levels of performance in the task of classifying the digits in the MNIST dataset [2–5]. Due to the standardized nature of the handwritten digits in MNIST, there


is low variability between the digits, which makes it easy for modern computer vision architectures to learn the characteristic features of each digit [6,7]. The Street View House Numbers (SVHN) dataset [8] consists of digits from house numbers obtained from Google Street View images, which pose a more difficult challenge than MNIST. Due to the natural settings and diversity in the designs of the house numbers, there is a far higher variability between the digits in SVHN than in MNIST. Thus, when SVHN was published in 2011, there was initially a large gap between human performance and model performance in the task of classifying its digits [9]. Because of this disparity, SVHN began to serve as a more difficult benchmark for novel image classification and object detection models [10,11]. However, newer models have since been able to achieve a classification accuracy on SVHN exceeding 98% [12–15], which is the published human performance benchmark. Some recent models have even achieved an accuracy exceeding 99% [16,17]. With minimal room left for improvement in performance on SVHN, there is a need for a more difficult digit dataset to benchmark the progress of future classification and detection models. In this paper, we present Caltech Football Numbers (CaltechFN), a new benchmark dataset of digits from American football jerseys. Samples of the digits in our dataset are displayed in Fig. 1. We demonstrate that the latest image classification and object detection models are not able to achieve human performance on our dataset. This performance gap can be explained by the significantly increased variability of CaltechFN compared to current benchmark digit datasets. Due to the nature of American football jerseys, many of the digits in the dataset are wrinkled, stretched, twisted, blurred, unevenly illuminated, or otherwise distorted [18–21]. Sample images containing distorted digits are displayed in Fig. 2(a). These possibilities introduce a substantial number of ways that each digit can differ in appearance from the other digits in its class. We demonstrate that even the latest models struggle to learn the characteristic features of each digit when trained on such highly variable digits. This is likely due to the scarcity of digits that are distorted and occluded in the same way. As improved few-shot learning methods are developed, we expect an improvement in model performance on our dataset. Furthermore, due to the nature of American football games, many images of digits on jerseys in live-action will be partially occluded [18–20]. For example, another player or the ball may be present between the camera and the subject player, or the player may be partially turned away from the camera such that parts of digits are not visible. Sample images containing partially occluded digits are displayed in Fig. 2(b). CaltechFN contains many such images of digits that are partially occluded, yet identifiable by human beings. This can be explained by recent neuroscience studies that have demonstrated the capability of the human brain to “fill in” visual gaps [22–25]. On the other hand, computer vision models struggle to fill in these visual gaps [26,27] since they are often unique and not represented in the training set. In other words, there are a large number of unique ways in which a certain digit may be partially occluded. Compounding this with the number of ways that a digit can be distorted, it is difficult for models trained on our dataset to learn the characteristic features of each digit [28].


Fig. 1. Samples of cropped digits from the CaltechFN dataset that are distorted (e.g. wrinkled and stretched) and partially occluded.


The main contributions of this paper are as follows:

1. We present CaltechFN, a dataset of distorted and partially occluded digits. The CaltechFN dataset poses a difficult challenge for even the latest computer vision models due to its high intra-class variability. For this reason, CaltechFN can serve as a state-of-the-art benchmark for future image classification, object detection, and weakly supervised object detection (WSOD) models.
2. We perform experiments to measure cross-dataset model performance benchmarks on the CaltechFN and SVHN datasets. The results illustrate that CaltechFN is indeed a more difficult benchmark than SVHN and that models trained on CaltechFN demonstrate high cross-dataset generalization.
3. We record human performance benchmarks on the CaltechFN dataset using experiments with human volunteers. The existing gap between the best current model performance and our human performance benchmark will hopefully catalyze innovations in the construction of computer vision models.

This paper is structured as follows: Sect. 2 compares and contrasts CaltechFN with related datasets. The properties and goals of our dataset are introduced in Sect. 3. This section also describes the process undertaken to construct the dataset. Section 4 details and compares the performance of various image classification, object detection, and WSOD models on our dataset. We then provide human performance benchmarks on our dataset in Sect. 5. In Sect. 6, we present examples of real-world tasks that can be better solved by models trained on our dataset. In Sect. 7, we discuss future directions that can be taken to utilize the richness of information in the images in our dataset.

2 Related Work

Digit Datasets. Digit datasets are advantageous in their simplicity and their ease of use, containing a small number of classes and requiring little preprocessing and formatting to begin the training process. While ImageNet [29], MS-COCO [30], and PascalVOC [31] are the most popular datasets for image classification and object detection, they lack the compactness and the simplicity of digit datasets due to their large size and wide variety of classes. There are several popular digit datasets available, but many of the digits in these datasets are handwritten. MNIST [1] was the first prominent digit dataset, but the standardization of the digits limits variability. The ARDIS dataset [32] contains handwritten digits from old Swedish church records, which introduces some variability due to age-induced weathering. However, the variability is still limited by the standardized nature of handwritten digits. There do exist datasets that instead contain digits in natural settings. Roughly 10% of the Chars74k dataset [33] is from real-life, outdoor images. However, Chars74k also contains non-digit characters, which limits the number of digits it contains. SVHN [8] is the primary dataset consisting exclusively of digits in real-world settings. However, the images in SVHN have few distortions besides natural blur, and close to no occlusions since


they are house numbers intended to be visible from the street. Meanwhile, CaltechFN consists of many examples of distorted and partially occluded digits, which constitutes a more difficult set of digits than any existing digit dataset.

Datasets with Distorted Objects. Distortions in many popular image datasets are limited in their complexity. For instance, the SmartDoc-QA dataset [34] contains images of documents distorted by blur, perspective, and illumination effects. All of these distortions fall under the same general domain and do not cover the wide variety of distortions in the real world. The dataset of soccer jersey numbers by Gerke, Müller, and Schäfer [35] consists of images taken from soccer videos with image-level annotations of jersey numbers. The distortions in these images are similar to the ones found in CaltechFN since they were also captured from sports settings. However, unlike CaltechFN, this dataset does not contain bounding box annotations, meaning that the dataset cannot be used to benchmark object detection tasks. Furthermore, there is more physical contact between players in American football than in soccer, meaning that there are more distorted digits in CaltechFN than in the soccer dataset, which can be empirically confirmed when observing the two datasets.

Datasets with Partially Occluded Objects. There are also many existing datasets with partially occluded objects. However, most of these datasets are not focused on digits, but on a larger range of objects. For example, the Occluded REID [36] and the Caltech Occluded Faces in the Wild [37] datasets present the challenge of identifying humans and faces, respectively, when partially occluded by other objects. Similarly, the Pascal3D+ dataset [38] augments images from the PascalVOC and ImageNet datasets with 3D annotations, partially occluding the target objects. All of these datasets lack the simplicity and convenience of digit datasets. Chars74k does contain some digits and characters that are partially occluded. However, unlike CaltechFN, the Chars74k dataset does not provide bounding boxes and thus cannot be used to benchmark object detection tasks.

3 Caltech Football Numbers (CaltechFN) Dataset

3.1 Dataset Construction

Image Collection. The first step of the data construction process was to collect candidate images. In order to construct a representative and unbiased dataset, we chose to sample an equal number of images of each jersey number in American football, which ranges from 1 to 99. Since each digit from 0 to 9 is roughly equally represented in this range, we sampled uniformly across each jersey number. This is to ensure that models trained on our dataset do not overfit to any overrepresented digits [39]. We collected our candidate images by querying the Google Image Search database. Using the query “Team Name” + “Number”, we collected 50 images for each combination of the 32 teams in the National Football League and the


Fig. 2. Samples of full images from the CaltechFN dataset that contain labeled bounding boxes around digits that are (a) distorted (e.g. wrinkled and stretched), and (b) partially occluded.

99 jersey numbers, for a total of 1600 images for each number and a grand total of 158,400 images. We chose this query because it seemed to be neither too general nor too specific: queries that were too general returned too many irrelevant images, while queries that were too specific did not return a sufficient number of relevant images.

Image Filtering. We then completed a filtering process to remove unwanted and duplicate images. First, we made two full passes through the set of candidate images to remove images that did not contain any digits. Almost half of the candidate images were removed in this step. Then, we removed any images where the digits contained were distorted or occluded to the extent that we were not able to identify them. We first made another two full passes through the set of candidate images to identify and mark images that contained digits that were not immediately identifiable. We then carefully sorted through each of these marked images, only keeping those that contained at least one identifiable digit. Finally, we used a deduplication tool [40] to identify and remove duplicate images. The result of this cleaning process is our current set of 61,728 images and 264,572 labeled digits.

Image Annotation. As done by many previous studies [41], we utilized the Amazon Mechanical Turk (AMT) platform [42] to label each individual digit in each of the images. For each digit, including those that were partially occluded, partially cut off, or rotated, AMT workers were asked to draw and label a maximally tight bounding box that contained every visible pixel of the digit. We then worked through the results and fixed any errors; specifically, we labeled identifiable digits that were not already labeled, removed erroneous boxes, and corrected any incorrect labels.
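As a rough illustration of the image-collection step described above, the sketch below enumerates the (team, jersey number) queries and the resulting candidate-image count; the team names are placeholders, since the exact query strings are not part of the released dataset.

```python
from itertools import product

# Hypothetical reconstruction of the image-collection queries: one query per
# (team, jersey number) pair, 50 images requested per query. The team names
# below are stand-ins for the 32 NFL teams.
nfl_teams = ["Team %02d" % i for i in range(1, 33)]
jersey_numbers = range(1, 100)               # jersey numbers 1-99
images_per_query = 50

queries = ['"%s" "%d"' % (team, num) for team, num in product(nfl_teams, jersey_numbers)]
print(len(queries))                          # 32 * 99 = 3168 queries
print(len(queries) * images_per_query)       # 158,400 candidate images, as reported
```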


Fig. 3. Distribution of digits in (a) the train set, (b) the test set, and (c) the entire dataset.

3.2 Properties

Through CaltechFN, we aim to provide an extensive set of digits with high variability to provide a new goal for computer vision models to work towards. To that end, the dataset includes a highly variable set of digits, with many of them being distorted and partially occluded in unique ways. For instance, some digits are partially blocked by a football, while other digits are twisted and wrinkled due to the jersey being pulled on. We also include some easier images, such as stationary images of unobstructed jerseys. We hope that the variability of digits presented in the dataset will challenge researchers to design innovations that will allow models to identify similarities between digits of the same class.

We will now provide more specific details about the CaltechFN dataset. As described in Sect. 3.1.2, the current version of our dataset contains a total of 61,728 images and 264,572 labeled digits. As shown in Fig. 3, our dataset contains a roughly uniform number of images from each of the ten digit classes. As mentioned in Sect. 3.1.1, this is necessary to ensure that models trained on our dataset do not overfit to any over-represented digits. We publish our dataset in the “Full Image” and “Cropped Digits” formats:

– The “Full Image” format includes all images in their original resolutions as obtained from the image collection process. Each image is accompanied by bounding box annotations for each identifiable digit that it contains. The mean and standard deviation of the heights and widths of the full images are 181.571 ± 15.452 pixels and 234.692 ± 54.365 pixels respectively. Samples of full images with the bounding boxes drawn are displayed in Fig. 2.
– The “Cropped Digits” format contains character-level images of each digit. These images were created by cropping and labeling each region of the full images contained by a bounding box. The mean and standard deviation of the heights and widths of the cropped digits are 32.360 ± 18.042 pixels and 21.334 ± 9.375 pixels respectively. Samples of cropped digits are displayed in Fig. 1.

For both formats of our dataset, we provide a train set (“CaltechFN-train”) and a test set (“CaltechFN-test”). This train-test split was created using a random, uniform 80-20 split, as sketched after the list below. As seen in Fig. 3, the distribution of digits in the train set and test set are similar to the overall distribution of digits across the entire dataset. The details of the train-test split for both the “Full Image” and “Cropped Digits” formats are as follows:

– “Full Image”: train set contains 49,383 images (80.0% of total), test set contains 12,345 images (20.0% of total).
– “Cropped Digits”: train set contains 211,611 digits (80.0% of total), test set contains 52,911 digits (20.0% of total).
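The random, uniform 80-20 split can be reproduced in spirit with a sketch like the following; the random seed and the placeholder file names are assumptions, not part of the released dataset.

```python
import random

def split_80_20(sample_ids, seed=0):
    """Random, uniform 80-20 split of sample identifiers into train/test lists."""
    ids = list(sample_ids)
    random.Random(seed).shuffle(ids)
    cut = int(0.8 * len(ids))
    return ids[:cut], ids[cut:]

# Example with the "Full Image" format: 61,728 image identifiers (placeholder names).
image_ids = ["img_%06d.jpg" % i for i in range(61728)]
train_ids, test_ids = split_80_20(image_ids)
print(len(train_ids), len(test_ids))  # ~80% / 20%; the released split is 49,383 / 12,345
```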


Table 1. Model performance when (A) trained on CaltechFN-train, tested on CaltechFN-test; (B) trained on CaltechFN-train, tested on SVHN-test; (C) trained on SVHN-train, tested on CaltechFN-test; (D) trained on SVHN-train, tested on SVHN-test. Image classification models are evaluated using classification accuracy. Object detection and WSOD models are evaluated using mAP.

Image classification
Model                          (A)          (B)          (C)          (D)
MobileNet (CVPR '18) [43]      86.0 ± 0.6   93.1 ± 0.5   76.0 ± 0.6   98.2 ± 0.5
DenseNet121 (CVPR '17) [15]    87.9 ± 0.4   95.0 ± 0.3   77.9 ± 0.3   98.6 ± 0.4
ResNet50 (CVPR '16) [44]       86.9 ± 0.4   94.2 ± 0.6   77.1 ± 0.5   98.3 ± 0.4

Object detection
Model                          (A)          (B)          (C)          (D)
YOLOv5 ('21) [45]              54.4 ± 0.5   61.2 ± 0.5   37.5 ± 0.9   67.9 ± 0.4
RetinaNet (ICCV '17) [46]      52.7 ± 0.8   57.8 ± 0.7   30.0 ± 1.4   65.2 ± 0.7
SSD (ECCV '16) [47]            54.6 ± 0.4   61.1 ± 0.2   38.6 ± 1.0   67.2 ± 0.4
Faster-RCNN (NIPS '15) [48]    57.4 ± 0.3   60.9 ± 0.5   38.8 ± 0.6   68.5 ± 0.3

Weakly supervised object detection (WSOD)
Model                          (A)          (B)          (C)          (D)
Wetectron (CVPR '20) [49]      29.5 ± 0.6   37.5 ± 0.5   20.7 ± 1.8   42.6 ± 0.3
C-MIL (CVPR '19) [50]          26.3 ± 1.4   36.0 ± 1.2   17.2 ± 1.0   39.4 ± 0.8
WSOD2 (ICCV '19) [51]          21.1 ± 0.4   27.0 ± 0.8   14.8 ± 1.8   30.9 ± 0.3
PCL (CVPR '17) [52]            27.1 ± 0.9   34.5 ± 0.6   16.9 ± 1.0   37.3 ± 1.1

4 Model Performance

In the following experiments, we will compare the performance on CaltechFN and SVHN of some of the latest models built for the tasks of image classification, object detection, and weakly supervised object detection. We compare performance on our dataset to performance on SVHN because it is the most similar existing digit dataset, as we explained in Sect. 2. We will show that CaltechFN is a significantly more difficult dataset than SVHN, while also showing that models trained on CaltechFN perform at least as well as models trained on SVHN. For each model, we will present results for the following four experiments, labeled as follows:

(A) Training on CaltechFN-train, testing on CaltechFN-test,
(B) Training on CaltechFN-train, testing on SVHN-test,
(C) Training on SVHN-train, testing on CaltechFN-test,
(D) Training on SVHN-train, testing on SVHN-test.

The experimental results are presented in Table 1. We note that the results labeled (A) serve as benchmark performance scores for CaltechFN in the three


tasks we detailed. The evaluation metric for the image classification results is classification accuracy, while the evaluation metric for the object detection and WSOD results is mAP. All experimental details including the hyperparameter search effort, the compute resources used, and a description of the evaluation metrics are discussed at length in the Supplementary Material. We now explain the relevance of the experimental results:

1. We demonstrate the comparative difficulty of CaltechFN. When trained on CaltechFN, models perform worse on CaltechFN (A) than on SVHN (B). The same models trained on SVHN also perform worse on CaltechFN (C) than on SVHN (D). Regardless of which dataset is used as the training set, models perform worse on CaltechFN than on SVHN.
2. We demonstrate that models trained on CaltechFN demonstrate high cross-dataset generalization. Models perform worse on CaltechFN when training on SVHN (C) than when training on CaltechFN (A) itself, but the performance of the same models on SVHN does not significantly drop when trained on CaltechFN (B) instead of on SVHN (D). This shows that computer vision models are able to learn robust, generalizable features by training on CaltechFN.

5 Human Performance Benchmark

To demonstrate the potential for improvement in current computer vision architectures, we provide human performance benchmarks on our dataset for the same classification and detection tasks performed by the computer vision models in Sect. 4. To measure human performance in the task of classifying the cropped digits in CaltechFN, we asked five human volunteers to label a subset of 15,000 cropped digits (“All Samples”). We calculate mean human performance by computing the accuracies of the volunteer-generated labels. To measure human performance in the task of detecting digits in the full images in CaltechFN, we asked the same five human volunteers to draw bounding boxes on a subset of 5,000 images (“All Samples”). We calculate mean human performance using the same mAP metric used for the object detection models in Sect. 4. Furthermore, we calculate the human performance in both tasks for only the subset of “Difficult Samples” that even the best models in Sect. 4 were unable to classify/detect. The results, which we provide as benchmarks for human performance in the tasks of image classification and object detection on our dataset, are presented in Table 2. We see that humans are able to achieve high levels of performance, even on the samples that the best models were unable to classify/detect. This clear disparity between human performance and the best model performances in both tasks demonstrates that there potentially exist certain techniques not yet learned by models that humans use to identify difficult digits. Evidently, there is still a need for innovations in the construction of computer vision models for computers to be able to achieve human levels of performance in the aforementioned tasks.


Table 2. Human performance on the CaltechFN dataset. Image classification performance is evaluated using classification accuracy, while object detection performance is evaluated using mAP.

Task                   All samples   Difficult samples
Image classification   99.1 ± 0.8    97.8 ± 2.3
Object detection       87.2 ± 4.5    83.0 ± 5.9

However, it is clear that even humans find it difficult to identify every digit in our dataset. Even though the digits that are included in our dataset are the ones that we approved as identifiable, it is not necessarily true that other humans will also be able to identify each of these digits. This may be due to a bias stemming from the fact that we collected the images, or simply due to variability in performance across different humans. Ultimately, this leaves open the possibility that significant advancements in computer vision techniques may result in models being able to achieve even higher performance on classification and detection tasks on our dataset than humans. Figure 4 provides a visual representation of Tables 1 and 2. The comparative difficulty of CaltechFN and the high cross-dataset generalization of models trained on CaltechFN, as well as the gap between human performance and model performance on CaltechFN, are clearly illustrated.

6 Applications of CaltechFN

We will now present two potential real-world applications that involve detecting and classifying digits. Our dataset provides a rich source of difficult digits that are distorted and partially occluded in the same manner that they are in these two real-world settings. We will explain the benefits of using the images in our dataset to train models that can perform these real-world tasks.

6.1 Player Detection and Tracking in Sports

Currently, coaches in sports must watch footage of a past game to chart the personnel (players on the field) over the duration of the game, which is an important task in sports analytics [19]. Computer vision models have thus far struggled to outperform humans at this task due to the distortions and occlusions of jersey numbers, which are the primary identifying features of players [18]. Our dataset can be used to train a model that can identify players in game footage using their jersey numbers even under such conditions, given that it includes many training examples of distorted and partially occluded jersey numbers. While a potentially negative societal impact of such a model would be that this would relieve coaches of this responsibility, coaches could instead focus on analyzing the personnel information.


Fig. 4. Visual representation of Tables 1 and 2 for (a) image classification; (b) object detection; and (c) weakly supervised object detection. (A), (B), (C), and (D) are as defined in Sect. 4. The models for each subplot are, in the color order displayed, as follows: (a) MobileNet, DenseNet, ResNet; (b) YOLOv5, RetinaNet, SSD, FasterRCNN; (c) Wetectron, C-MIL, WSOD2, PCL.

Another useful application of a model trained on our dataset is that high school and college sports coaches may be able to track the movements of their players using game footage. While many professional sports teams use microchips to track the movements of their players, high school and college teams often do not have access to this level of technology. Our model would be able to locate players using their jersey numbers at each frame and be able to chart their movements, which is useful information that enables complex sports analysis.

6.2 Self-driving Cars

In order to stay within speed limits, self-driving cars rely on internal vision systems to detect and read speed limits on the road [53,54], which are most often printed on signs or painted directly onto the road. The consequences of a self-driving car being unable to read the speed limit may be dire, especially in an area where the speed limit warnings are few and far between. Some speed limit signs may be old and worn-out, causing the digits to be faded or otherwise distorted, while speed limits painted on the road may be chipped and eroded. A self-driving car may not detect such speed limit postings, especially since it may be traveling past them at high speeds. By training its vision systems on our dataset, which contains many examples of distorted digits, a self-driving car may be better equipped to read speed limits even under imperfect conditions. In other cases, speed limit signs and speed limits painted on the road may be partially occluded by other cars, pedestrians, or other obstacles. A self-driving car trained


Fig. 5. Sample detections on worn-out and partially occluded speed limit postings using Faster-RCNN trained on CaltechFN.

on our dataset, which contains many examples of partially occluded digits, would be better able to read such speed limit postings. To demonstrate the viability of this application, we applied the Faster-RCNN model mentioned in Sect. 4 to two sample images containing distorted (worn out) and partially occluded speed limit postings. The bounding box predictions are shown in Fig. 5.
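For illustration, applying a CaltechFN-trained detector to such an image might look like the sketch below; the torchvision Faster-RCNN is used as a stand-in, the checkpoint path is hypothetical, and the 11-class configuration (10 digits plus background) is our assumption about how the detector would be set up.

```python
import torch
from PIL import Image
from torchvision.models.detection import fasterrcnn_resnet50_fpn
from torchvision.transforms import functional as F

# 11 classes = background + the ten digit classes; the checkpoint path is hypothetical.
model = fasterrcnn_resnet50_fpn(num_classes=11)
model.load_state_dict(torch.load("caltechfn_fasterrcnn.pth", map_location="cpu"))
model.eval()

image = Image.open("speed_limit_sign.jpg").convert("RGB")   # illustrative input image
tensor = F.to_tensor(image)                                 # (3, H, W), values in [0, 1]

with torch.no_grad():
    output = model([tensor])[0]   # dict with 'boxes', 'labels', 'scores'

# Keep confident detections only; labels 1-10 would map to the ten digit classes.
keep = output["scores"] >= 0.5
for box, label, score in zip(output["boxes"][keep], output["labels"][keep], output["scores"][keep]):
    print(label.item(), round(score.item(), 2), [round(v, 1) for v in box.tolist()])
```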

7 Discussion and Future Work

In this paper, we have introduced CaltechFN, a new dataset containing digits found on American football jerseys. This dataset is novel in its variability: each digit is distorted or partially occluded in a unique way such that current models have difficulty learning the representative features of each digit class. We queried the Google Image Search database to collect our images, deleted images with no identifiable digits, then utilized Amazon Mechanical Turk to create annotations for the digits in each image. Through our experiments with various image classification, object detection, and WSOD models, we have demonstrated that CaltechFN is indeed a more difficult benchmark than SVHN and that models trained on CaltechFN demonstrate high cross-dataset generalization. Furthermore, we recorded human performance benchmarks on the CaltechFN dataset using experiments with human volunteers. In doing so, we illustrated the existing gap between model performance and human performance on our dataset. With this dataset, we aim to introduce a new state-of-the-art benchmark that will be used to foster the development of novel computer vision models. We hope that innovations in computer vision research will allow future models to achieve human levels of performance on our dataset. We believe that models that perform well on our dataset will be better equipped to perform real-world tasks. Two such real-world applications of our dataset were described in Sect. 6. Models trained on our dataset could be used to automate the process of charting the personnel on the field at a given moment in a game. Also, self-driving cars can train their vision systems on our dataset to be better equipped to read speed limit postings under imperfect conditions.

7.1 Further Labeling

The images in the CaltechFN dataset contain rich information yet to be annotated, beyond the existing digit annotations that are the focus of this paper. Thus, we believe that the primary future direction of the CaltechFN dataset is to expand upon the existing annotations for the following applications:

Scene Recognition. Scene recognition refers to the computer vision task of identifying the context of a scene within an image. There is extensive research being conducted on the development of models to improve performance in this task [55–57]. While there do exist several large datasets for scene recognition, the task remains challenging and largely unsolved. These challenges have been attributed to class overlaps and high variance within classes [58]. Class overlaps occur when there are several classes that are not sufficiently distinct from each other, while high variance within classes means that classes have a wide variety of scenes attached to them. We believe that the unexploited qualities of our dataset may be able to address these challenges and improve model performance in the task of scene recognition. Within American football, each scene has a distinctive action being performed: tackling, throwing, catching, running, kicking, and so on. There is very little ambiguity between which of these actions is being performed in which scene. This lack of variance thus addresses the challenge of having class overlap: no two scene classes are the same and models should be better able to distinguish between the scene classes in this dataset.

Instance Segmentation. Another commonly studied computer vision task is that of instance segmentation [59–61]. This task is similar to object detection, with the difference being that the output consists of the set of pixels contained within the object rather than a rectangular box that bounds the object. Each object in an image segmentation dataset is annotated with its exact pixel-level boundary, rather than a simple rectangular bounding box. The images in our dataset contain many distinct objects that can be delineated in this way, including players, jerseys, helmets, and balls. We see that many of these objects are shaped in the form of some part of a human being. Thus, by adding pixel-level boundary annotations for these objects, our dataset could be used as a benchmark for the advancement of human detection and tracking techniques.

Acknowledgments. We would like to thank Elijah Cole, Yisong Yue, and A. E. Hosoi for their valuable comments and suggestions. We would also like to thank the members of our community who participated in our human performance benchmark experiments. Funding for all crowdsource worker tasks was provided by the George W. Housner Fund.


References 1. LeCun, Y., Cortes, C., Burges, C.: The MNIST database of handwritten digits (1999). http://yann.lecun.com/exdb/mnist/. Accessed 15 May 2022 2. Ciresan, D., Meier, U., Schmidhuber, J.: Multi-column deep neural networks for image classification. In: 2012 IEEE Conference on Computer Vision and Pattern Recognition (2012) 3. Kussul, E., Baidyk, T.: Improved method of handwritten digit recognition tested on MNIST database. Image Vis. Comput. 22(12), 971–981 (2004) 4. Hasanpour, S.H., Rouhani, M., Fayyaz, M., Sabokrou, M.: Lets keep it simple, using simple architectures to outperform deeper and more complex architectures. arXiv (2016) 5. Ciresan, D.C., Meier, U., Gambardella, L.M., Schmidhuber, J.: Convolutional neural network committees for handwritten character classification. In: 2011 International Conference on Document Analysis and Recognition (2011) 6. Shorten, C., Khoshgoftaar, T.M.: A survey on image data augmentation for deep learning. J. Big Data 6(1), 1–48 (2019) 7. Zhang, Y., Ling, C.: A strategy to apply machine learning to small datasets in materials science. npj Comput. Mater. 4(1) (2018). Article number: 25 8. Netzer, Y., Wang, T., Coates, A., Bissacco, A., Wu, B., Ng, A.Y.: Reading digits in natural images with unsupervised feature learning. In: NIPS Workshop on Deep Learning and Unsupervised Feature Learning, December 2011 9. Goodfellow, I.J., Bulatov, Y., Ibarz, J., Arnoud, S., Shet, V.: Multi-digit number recognition from street view imagery using deep convolutional neural networks. arXiv, December 2013 10. Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., Salakhutdinov, R.: Dropout: a simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15, 1929–1958 (2014) 11. Radford, A., Metz, L., Chintala, S.: Unsupervised representation learning with deep convolutional generative adversarial networks. In: ICLR (Poster) (2016) 12. Lim, S., Kim, I., Kim, T., Kim, C., Kim, S.: Fast AutoAugment. In: Advances in Neural Information Processing Systems (2019) 13. Gowda, S.N., Yuan, C.: ColorNet: investigating the importance of color spaces for image classification. In: Jawahar, C.V., Li, H., Mori, G., Schindler, K. (eds.) ACCV 2018. LNCS, vol. 11364, pp. 581–596. Springer, Cham (2019). https://doi.org/10. 1007/978-3-030-20870-7 36 14. DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv (2017) 15. Huang, G., Liu, Z., Van Der Maaten, L., Weinberger, K.Q.: Densely connected convolutional networks. In: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2017) 16. Foret, P., Kleiner, A., Mobahi, H., Neyshabur, B.: Sharpness-aware minimization for efficiently improving generalization. In: International Conference on Learning Representations (2021) 17. Phong, N.H., Ribeiro, B.: Rethinking recurrent neural networks and other improvements for image classification. arXiv (2020) 18. Rahimian, P., Toka, L.: Optical tracking in team sports. J. Quant. Anal. Sports 18(1), 35–57 (2022) 19. Moeslund, T.B., Thomas, G., Hilton, A.: Computer Vision in Sports. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-09396-3


20. Bhargavi, D., Coyotl, E.P., Gholami, S.: Knock, knock. Who’s there? - Identifying football player jersey numbers with synthetic data. arXiv (2022) 21. Atmosukarto, I., Ghanem, B., Ahuja, S., Muthuswamy, K., Ahuja, N.: Automatic recognition of offensive team formation in American football plays. In: 2013 IEEE Conference on Computer Vision and Pattern Recognition Workshops (2013) 22. Chong, E., Familiar, A.M., Shim, W.M.: Reconstructing representations of dynamic visual objects in early visual cortex. Proc. Natl. Acad. Sci. 113(5), 1453–1458 (2015) 23. Kok, P., de Lange, F.P.: Shape perception simultaneously up- and downregulates neural activity in the primary visual cortex. Curr. Biol. 24(13), 1531–1535 (2014) 24. Bosco, G., et al.: Filling gaps in visual motion for target capture. Front. Integr. Neurosci. 9, 13 (2015) 25. Revina, Y., Maus, G.W.: Stronger perceptual filling-in of spatiotemporal information in the blind spot compared with artificial gaps. J. Vis. 20(4), 20 (2020) 26. Chandler, B., Mingolla, E.: Mitigation of effects of occlusion on object recognition with deep neural networks through low-level image completion. Comput. Intell. Neurosci. 2016, 1–15 (2016) 27. Ning, C., Menglu, L., Hao, Y., Xueping, S., Yunhong, L.: Survey of pedestrian detection with occlusion. Complex Intell. Syst. 7(1), 577–587 (2020). https://doi. org/10.1007/s40747-020-00206-8 28. Hoiem, D., Chodpathumwan, Y., Dai, Q.: Diagnosing error in object detectors. In: Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C. (eds.) ECCV 2012. LNCS, vol. 7574, pp. 340–353. Springer, Heidelberg (2012). https://doi.org/ 10.1007/978-3-642-33712-3 25 29. Deng, J., Dong, W., Socher, R., Li, L.-J., Li, K., Fei-Fei, L.: ImageNet: a large-scale hierarchical image database. In: 2009 IEEE Conference on Computer Vision and Pattern Recognition (2009) 30. Lin, T.-Y., et al.: Microsoft COCO: common objects in context. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8693, pp. 740–755. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-10602-1 48 31. Everingham, M., Van Gool, L., Williams, C.K.I., Winn, J., Zisserman, A.: The PASCAL visual object classes (VOC) challenge. Int. J. Comput. Vis. 88, 303–308 (2010) 32. Kusetogullari, H., Yavariabdi, A., Cheddad, A., Grahn, H., Hall, J.: ARDIS: a Swedish historical handwritten digit dataset. Neural Comput. Appl. 32(21), 16505– 16518 (2019). https://doi.org/10.1007/s00521-019-04163-3 33. de Campos, T., Babu, B.R., Varma, M.: Character recognition in natural images. In: VISAPP 2009 - Proceedings of the Fourth International Conference on Computer Vision Theory and Applications, vol. 2 (2009) 34. Nayef, N., Luqman, M.M., Prum, S., Eskenazi, S., Chazalon, J., Ogier, J.-M.: SmartDoc-QA: a dataset for quality assessment of smartphone captured document images - single and multiple distortions. In: 2015 13th International Conference on Document Analysis and Recognition (ICDAR) (2015) 35. Gerke, S., M¨ uller, K., Sch¨ afer, R.: Soccer jersey number recognition using convolutional neural networks. In: 2015 IEEE International Conference on Computer Vision Workshop (ICCVW) (2015) 36. He, L., Wang, Y., Liu, W., Zhao, H., Sun, Z., Feng, J.: Foreground-aware pyramid reconstruction for alignment-free occluded person re-identification. In: 2019 IEEE/CVF International Conference on Computer Vision (ICCV) (2019)


37. Burgos-Artizzu, X.P., Perona, P., Dollar, P.: Robust face landmark estimation under occlusion. In: 2013 IEEE International Conference on Computer Vision (2013) 38. Xiang, Y., Mottaghi, R., Savarese, S.: Beyond PASCAL: a benchmark for 3D object detection in the wild. In: IEEE Winter Conference on Applications of Computer Vision (2014) 39. Oksuz, K., Cam, B.C., Kalkan, S., Akbas, E.: Imbalance problems in object detection: a review. IEEE Trans. Pattern Anal. Mach. Intell. 43(10), 3388–3415 (2021) 40. Voxel51: Voxel51: developer tools for ML. https://voxel51.com/. Accessed 08 June 2022 41. Sorokin, A., Forsyth, D.: Utility data annotation with amazon mechanical turk. In: 2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (2008) 42. Amazon Mechanical Turk. https://www.mturk.com/mturk/welcome. Accessed 16 May 2022 43. Howard, A.G., et al.: MobileNets: efficient convolutional neural networks for mobile vision applications. arXiv (2017) 44. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) 45. Bochkovskiy, A., Wang, C.-Y., Liao, H.-Y.M.: YOLOv4: optimal speed and accuracy of object detection. arXiv (2020) 46. Lin, T.-Y., Goyal, P., Girshick, R., He, K., Dollar, P.: Focal loss for dense object detection. In: 2017 IEEE International Conference on Computer Vision (ICCV) (2017) 47. Liu, W., et al.: SSD: single shot multibox detector. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9905, pp. 21–37. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46448-0 2 48. Ren, S., He, K., Girshick, R., Sun, J.: Faster R-CNN: towards real-time object detection with region proposal networks. In: Advances in Neural Information Processing Systems (NIPS 2015), vol. 28 (2015) 49. Ren, Z., et al.: Instance-aware, context-focused, and memory-efficient weakly supervised object detection. In: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2020) 50. Wan, F., Liu, C., Ke, W., Ji, X., Jiao, J., Ye, Q.: C-MIL: continuation multiple instance learning for weakly supervised object detection. In: 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) 51. Zeng, Z., Liu, B., Fu, J., Chao, H., Zhang, L.: WSOD2: learning bottom-up and top-down objectness distillation for weakly-supervised object detection. In: 2019 IEEE/CVF International Conference on Computer Vision (ICCV) (2019) 52. Tang, P., et al.: PCL: proposal cluster learning for weakly supervised object detection. In: 2017 IEEE Conference on Computer Vision and Pattern Recognition (2017) 53. Kanagaraj, N., Hicks, D., Goyal, A., Tiwari, S., Singh, G.: Deep learning using computer vision in self driving cars for lane and traffic sign detection. Int. J. Syst. Assur. Eng. Manag. 12(6), 1011–1025 (2021). https://doi.org/10.1007/s13198-02101127-6 54. Farag, W.: Recognition of traffic signs by convolutional neural nets for self-driving vehicles. Int. J. Knowl. Based Intell. Eng. Syst. 22(3), 205–214 (2018)

212

P. Rim et al.

55. Herranz, L., Jiang, S., Li, X.: Scene recognition with CNNs: objects, scales and dataset bias. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) 56. Chen, G., Song, X., Wang, B., Jiang, S.: See more for scene: pairwise consistency learning for scene classification. In: 35th Conference on Neural Information Processing Systems (NeurIPS 2021) (2021) 57. Zhou, B., Lapedriza, A., Xiao, J., Torralba, A., Oliva, A.: Learning deep features for scene recognition using places database. In: Advances in Neural Information Processing Systems (NIPS 2014), vol. 27 (2014) 58. Matei, A., Glavan, A., Talavera, E.: Deep learning for scene recognition from visual data: a survey. arXiv (2020) 59. Ganea, D.A., Boom, B., Poppe, R.: Incremental few-shot instance segmentation. In: 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2021) 60. Wang, Y., Xu, Z., Shen, H., Cheng, B., Yang, L.: CenterMask: single shot instance segmentation with point representation. In: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2020) 61. Xie, E., et al.: PolarMask: single shot instance segmentation with polar representation. In: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2020)

Temporal Extension Topology Learning for Video-Based Person Re-identification

Jiaqi Ning1(B), Fei Li1, Rujie Liu1, Shun Takeuchi2, and Genta Suzuki2

1 Fujitsu Research and Development Center Co., Ltd., Beijing, China ([email protected])
2 Fujitsu Research, Fujitsu Limited, Kawasaki, Japan

Abstract. Video-based person re-identification aims to match the same identity across video clips captured by multiple non-overlapping cameras. By effectively exploiting both temporal and spatial clues of a video clip, a more comprehensive representation of the identity in the video clip can be obtained. In this manuscript, we propose a novel graph-based framework, referred to as Temporal Extension Adaptive Graph Convolution (TE-AGC), which can effectively mine features in the spatial and temporal dimensions in one graph convolution operation. Specifically, TE-AGC adopts a CNN backbone and a key-point detector to extract global and local features as graph nodes. Moreover, a delicate adaptive graph convolution module is designed, which encourages meaningful information transfer by dynamically learning the reliability of local features from multiple frames. Comprehensive experiments on two video person re-identification benchmark datasets have demonstrated the effectiveness and state-of-the-art performance of the proposed method.

Keywords: Person ReID · Graph Convolution Network

1 Introduction

Person re-identification (ReID) [1,2] is an efficient computer vision technique to retrieve a specific person from multiple non-overlapping cameras. Person ReID has a wide range of applications such as security, video surveillance, etc., and has received extensive attention from researchers. Although many research results have been achieved, this task is still challenging due to background disturbances, occlusions, perspective changes, pose changes and other problems. There are generally two kinds of person ReID processing methods [1]. One is image-based methods [3–9], which exploit temporally incoherent static images to retrieve pedestrians. The other is video-based methods [10–14], where both training and test data consist of temporally continuous image sequences. In recent years, impressive progress has been made in image-based person ReID. Some practical solutions are proposed especially for complex problems, such as occlusion [9,15,16]. However, the information contained in a single image is limited. If the information contained in a short video clip of a pedestrian could be effectively mined, it would significantly benefit the robustness of retrieval


results. Therefore, video-based person ReID methods concurrently utilize the spatial and temporal information of a video clip and have the potential to better solve the difficult problems in person ReID.

There have been several typical video-based person ReID methods that aggregate temporal and spatial cues of video clips to obtain discriminative representations. Several elementary methods extract global features from each frame independently; the features of each frame are then aggregated into the representation of a video clip by a temporal pooling layer or a recurrent neural network (RNN) [17–19]. Due to problems such as occlusion and background noise, these methods usually do not achieve excellent results. Recent works began to focus on the role of local features. Some works divide video frames into rigid horizontal stripes or utilize an attention mechanism to extract local appearance features [20–24]. However, it is hard to align local features learned from videos precisely. Some methods adopt a pose estimation model to detect the key points of an identity in order to obtain well-aligned local features [15,25,26]. Nevertheless, noise will be introduced into the extracted local features due to occlusion and inaccurate key-point detection. Some works use the Graph Convolution Network (GCN) technique to enhance the description of local features by setting local features as the nodes of a GCN [27–29]. In a GCN, information can be transferred between nodes through edges, and the information of nodes can be enhanced or supplemented. However, in occluded regions the features are often unintelligible [9]; if all the local parts are considered to have the same reliability for information transmission, more noise is brought in, which is harmful for extracting discriminative representations. In most graph-based video person ReID methods, the transfer metric among nodes is determined by the affinity between feature pairs, which may ignore the global contextual information from all other nodes and only consider undirected dependency [14,27,28]. Some of them utilize more than one graph to realize temporal and spatial information extraction, which increases the complexity of the method [14,28,29].

A novel Temporal Extension Adaptive Graph Convolution (TE-AGC) framework is proposed for video-based person ReID in this manuscript. TE-AGC extracts global and local semantic features from multiple images as graph nodes. Then, TE-AGC learns the reliability of each local feature extracted from multiple images, encouraging high-reliability nodes to transfer more information to low-reliability nodes, and inhibiting the information passing of low-reliability nodes. Further, TE-AGC considers the dependencies of body parts within a frame or across different frames in both the temporal and spatial dimensions using only one single graph. In this way, it can mine comprehensive and discriminative features from the video clip by performing the designed graph convolution. The validity of TE-AGC is revealed by experiments on two benchmark datasets, MARS and DukeMTMC-VideoReID.

The main contributions of this paper are as follows: (1) A novel Temporal Extension Adaptive Graph Convolution (TE-AGC) framework for video-based person ReID is proposed. (2) We learn the reliability of local features and adaptively pass information from more meaningful nodes to less meaningful nodes. (3) We mine the temporal and spatial information of video clips with a single graph convolution.

2 Related Work

2.1 Image-Based Person ReID

There have been many works on image-based person ReID [3–9]. Benefiting from the continuous advance of deep learning technology, the rank-1 accuracy of most image-based person ReID methods on benchmark datasets is higher than that of human beings [1]. By utilizing local semantic features and attention mechanisms [16,30], the performance of person ReID has been further improved. In recent years, more researchers have paid attention to the occlusion problem of ReID and achieved fruitful results [9,15,16].

2.2 Video-Based Person ReID

Video-based person ReID can extract richer spatial-temporal clues than image-based person ReID and is expected to enable more accurate retrieval [10]. Some works extract features from each image of the video clip and then aggregate them using temporal pooling or an RNN [17–19]. To learn representations that are robust against pose changes and occlusions, local semantic features and attention mechanisms are also used in video-based person ReID to improve performance [15,20–26]. Different from image-based person ReID, the time dimension is added, and both spatial attention and temporal attention are used to mine the information of a video clip.

2.3 Graph Convolution

Graphs are often used to model the relationship between different nodes. The Graph Convolution Network (GCN) [31] was the first to transfer the convolution operation of image processing to graph-structured data. Great success has been achieved in many computer vision tasks, such as skeleton-based action recognition [32], object detection [33] and person ReID [9,27–29]. Several methods have been proposed to use GCNs in person ReID. Some treat whole images as graph nodes, ignoring the relationship between different body parts within or across frames. In addition, some recent works model the temporal and spatial relationships of nodes in two or more graphs. For example, the Spatial-Temporal Graph Convolutional Network (STGCN) [28] constructs two graph convolution branches: the spatial relation of the human body and the temporal relation from adjacent frames are learned in two different graphs.

3 Method

We aim to develop an efficient spatial-temporal representation for video-based person ReID. To this end, the Temporal Extension Adaptive Graph Convolution (TE-AGC) framework is proposed in this manuscript, as shown in Fig. 1. In general, the whole framework contains two parts: one extracts preliminary semantic features, and the other obtains an improved discriminative representation of a video clip.

Fig. 1. The overall architecture of the proposed TE-AGC. It includes a backbone network, a spatial attention layer and a key-point detector to extract the global and local semantic features. Moreover, it also includes a graph construction and graph convolution layer and a feature fusion block to obtain discriminative spatial-temporal representation for each video clip.

3.1 Semantic Feature Extraction

First, we perform preliminary semantic feature extraction. It has been demonstrated that part features are effective for person ReID [1,2]. Inspired by this idea, we aim to extract both global and local semantic features in this module. To better resist viewpoint variation and misalignment, a key-point detector is utilized to locate key points, and we then extract local features around the different key points. It should be noted that, although human key-point detection is a relatively mature technique, key-point position errors and key-point confidence errors still exist in some cases [9]. Thus, the module introduced in Sect. 3.2 is necessary and will enhance the features.

Given a video clip, we randomly sample T frames, denoted as {I^t}_{t=1}^T, where t is the index of the frame. The backbone network is used to generate the initial feature maps for each frame. Then we use a spatial attention layer to enhance the spatial features and suppress interference information. The spatial attention layer is implemented by a convolutional layer followed by a sigmoid activation function, and the initial feature maps are weighted by the attention map. The set of feature maps after the spatial attention operation is denoted as F = {F^t}_{t=1}^T, where F^t ∈ ℝ^{C×H×W} and H, W, C denote the height, width and channel number, respectively.

We use a key-point detector to help extract aligned local features. For each frame, the number of extracted key points is K. The local and global semantic features of each frame are computed as follows. For distinction, we denote the features at this stage as V and the features output by the module introduced in Sect. 3.2 as V′:

$$V_L^t = \{v_k^t\}_{k=1}^{K} = \{g_{GAP}(F^t \odot m_k^t)\}_{k=1}^{K} \tag{1}$$

$$V_G^t = g_{GAP}(F^t) \tag{2}$$

where V_L^t ∈ ℝ^{K×C} denotes the local features of frame t, which include the semantic local features of the K key points, and v_k^t ∈ ℝ^{1×C} is the local feature around the k-th key point of frame t. m_k^t is derived from the heatmap of the k-th key point of frame t by normalizing the original heatmap with a SoftMax function, g_GAP(·) refers to the global average pooling operation, and ⊙ is element-by-element multiplication. V_G^t ∈ ℝ^{1×C} denotes the global feature of frame t. Therefore, the preliminary semantic features of this video clip contain the local features {V_L^t}_{t=1}^T and the global features {V_G^t}_{t=1}^T.
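The following PyTorch-style sketch illustrates one way Eqs. (1)–(2) could be realized; the module and variable names are illustrative assumptions and not the authors' released code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SemanticFeatureExtractor(nn.Module):
    """Illustrative sketch of Eqs. (1)-(2): spatial attention followed by
    heatmap-weighted pooling for local features and GAP for the global one."""

    def __init__(self, channels: int):
        super().__init__()
        # Spatial attention: a 1-channel conv map squashed by a sigmoid.
        self.attention = nn.Sequential(nn.Conv2d(channels, 1, kernel_size=1), nn.Sigmoid())

    def forward(self, feat_map: torch.Tensor, heatmaps: torch.Tensor):
        # feat_map: (B, C, H, W) backbone features of one frame
        # heatmaps: (B, K, H, W) key-point heatmaps from the detector
        feat_map = feat_map * self.attention(feat_map)             # weight by spatial attention
        # Normalize each heatmap into a spatial distribution m_k^t (SoftMax over H*W).
        b, k, h, w = heatmaps.shape
        m = F.softmax(heatmaps.view(b, k, -1), dim=-1).view(b, k, h, w)
        # Eq. (1): v_k^t = GAP(F^t ⊙ m_k^t); realized here as a heatmap-weighted sum
        # (GAP differs only by a constant 1/(H*W) factor).
        local_feats = torch.einsum('bchw,bkhw->bkc', feat_map, m)  # (B, K, C)
        # Eq. (2): V_G^t = GAP(F^t)
        global_feat = feat_map.mean(dim=(2, 3))                    # (B, C)
        return local_feats, global_feat
```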

3.2 Temporal Extension Adaptive Graph Convolution Layer

After extracting each frame's preliminary global and local features, we employ a graph convolution network to mine the spatial-temporal representation from the video frames. A graph convolution can be operated as [31]:

$$O = \hat{A} X W \tag{3}$$

where Â is the normalized version of the adjacency matrix A, X is the feature matrix containing the features of all nodes, W refers to the parameters to be learned, and O is the output of the graph convolution operation. In our method, the preliminary global feature and local features of each frame within one video clip are treated as the graph nodes. For a video clip, the number of nodes is N = T × (K + 1), covering the T frames with K local features and 1 global feature per frame. The feature matrix X of size N × C is constructed by concatenating {V_L^t}_{t=1}^T and {V_G^t}_{t=1}^T in the vertical direction. The adjacency matrix A ∈ ℝ^{N×N} illustrates the topology of the graph: A(i, j) represents the information propagation metric from node j to node i, and when A(i, j) equals zero, no information is transferred from node j to node i. Most GCN-based person ReID methods obtain A(i, j) by calculating the feature affinity between node j and node i. However, this calculation might introduce noise when some nodes are unreliable. Considering that, we propose a Temporal Extension Adaptive Graph Convolution (TE-AGC) layer to calculate the adjacency matrix A.
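A minimal sketch of Eq. (3), assuming L1 row-normalization of the adjacency matrix as used later in this section; class and argument names are illustrative.

```python
import torch
import torch.nn as nn

class GraphConvolution(nn.Module):
    """Minimal sketch of Eq. (3): O = Â X W with a row-normalized adjacency."""

    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.weight = nn.Linear(in_dim, out_dim, bias=False)  # the learnable W

    def forward(self, x: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # x:   (B, N, C) node features, N = T * (K + 1)
        # adj: (B, N, N) adjacency A, adj[i, j] = transfer metric from node j to node i
        adj_hat = adj / adj.sum(dim=-1, keepdim=True).clamp(min=1e-6)  # L1-normalize rows
        return torch.matmul(adj_hat, self.weight(x))                   # Â X W
```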


Our method efficiently utilizes one single graph to transmit information in both the temporal and spatial dimensions. In addition to local features, we introduce the global feature of each frame as a graph node, considering the complex spatial-temporal dependencies of different body parts and the whole body within a frame or across frames. The linkage modes between nodes are shown in Fig. 2. They include local features within a single frame (① in Fig. 2), corresponding local features across frames (② in Fig. 2), non-corresponding local features across frames (③ in Fig. 2), local features and the global feature within a single frame (④ in Fig. 2), and global features across frames (⑤ in Fig. 2). It should be noted that we only illustrate the modes of connection and do not draw all the connections in Fig. 2. The connections between different local features, both within and across frames, are defined by the human skeleton, and information transfer is performed among key points in adjacent positions.

Fig. 2. Linkage modes among nodes.

After determining whether a connection exists between each pair of nodes, our method carefully designs the information propagation metric. Inspired by the assumption proposed in [9] that a meaningful local feature is more similar to the global feature than a meaningless one, we propose a method to learn the reliability of local features and compute the value of A(i, j). Suppose node i is the local feature of the k-th key point of frame t; the reliability of node i, referred to as D_i, is learned as follows:

$$D_i = FC\big(BN\big(\mathrm{abs}(v_k^t - V_G^t)\big)\big) \tag{4}$$

where abs(·) and BN(·) are the absolute value and batch normalization operations, and FC(·) is a fully connected layer mapping a vector of size 1 × C to a real number. The reliabilities of the local features are normalized by a SoftMax operation. If both node j and node i are local features and an information transfer exists between them (types ①, ② and ③ in Fig. 2), A(i, j) is calculated as follows:

$$A(i, j) = \mathrm{ReLU}\big(1 + \alpha(D_j - D_i)\big) \times \alpha(D_i + D_j) \tag{5}$$


where α is a hyperparameter larger than zero that balances the transfer metric between global and local features, and D_i and D_j are the reliabilities of node i and node j calculated by (4). Obviously, if node j is more reliable than node i, node j will transmit more information to node i. If node j is a global feature and node i is a local feature with an existing information transfer (type ④ in Fig. 2), the transfer metric is set as ReLU(1 − β × D_i), where β is another hyperparameter. The pass-through metric between two global features (type ⑤ in Fig. 2) is set as 1/(T − 1). The matrix A is normalized into Â by applying an L1 normalization to each row of A. By (3), the output feature O after the graph convolution can be obtained. To stabilize training, we fuse O and the input features X as in ResNet [34]. The output of the TE-AGC layer is the improved feature V′:

$$V' = FC(X) + \mathrm{ReLU}(O) \tag{6}$$

where ReLU(·) is the activation function that keeps an input unchanged if it is greater than zero and outputs zero otherwise. V′ contains the improved local features {V′_L^t}_{t=1}^T and the improved global features {V′_G^t}_{t=1}^T, where V′_L^t = {v̄_k^t}_{k=1}^K and v̄_k^t is the improved local feature around the k-th key point of frame t. With this adaptive method, the linkage among nodes is decided by the input features: nodes with high reliability transfer more information to nodes with low reliability. Therefore, information can be transmitted more effectively through the graph convolution.
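A sketch of the TE-AGC layer built from Eqs. (4)–(6) is given below. It assumes the link topology (which node pairs are connected, i.e., types ①–⑤ in Fig. 2) is supplied as a precomputed mask, and the normalization scope of the SoftMax over reliabilities is an assumption; all names are illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TEAGCLayer(nn.Module):
    """Sketch of the TE-AGC layer (Eqs. 4-6)."""

    def __init__(self, channels: int, alpha: float = 1.0, beta: float = 1.0):
        super().__init__()
        self.alpha, self.beta = alpha, beta
        # Eq. (4): FC(BN(|v_k^t - V_G^t|)) mapping a C-dim vector to a scalar.
        self.reliability = nn.Sequential(nn.BatchNorm1d(channels), nn.Linear(channels, 1))
        self.fc = nn.Linear(channels, channels)                 # FC(X) in Eq. (6)
        self.gcn_weight = nn.Linear(channels, channels, bias=False)

    def node_reliability(self, local_feats, global_feat):
        # local_feats: (B, K, C); global_feat: (B, C)
        diff = (local_feats - global_feat.unsqueeze(1)).abs()
        b, k, c = diff.shape
        d = self.reliability(diff.reshape(b * k, c)).reshape(b, k)
        return F.softmax(d, dim=-1)                             # normalized reliabilities D

    def local_transfer_metric(self, d):
        # Eq. (5) for local-local links: ReLU(1 + a(D_j - D_i)) * a(D_i + D_j)
        di, dj = d.unsqueeze(-1), d.unsqueeze(-2)               # broadcast to (B, K, K)
        return F.relu(1 + self.alpha * (dj - di)) * self.alpha * (di + dj)

    def forward(self, x, adj, link_mask):
        # x: (B, N, C) node features; adj: (B, N, N) transfer metrics already filled in
        # with Eq. (5), ReLU(1 - beta*D_i) and 1/(T-1) according to the link type.
        adj = adj * link_mask                                   # keep only valid links
        adj_hat = adj / adj.sum(dim=-1, keepdim=True).clamp(min=1e-6)  # L1 row normalization
        out = torch.matmul(adj_hat, self.gcn_weight(x))         # graph convolution, Eq. (3)
        return self.fc(x) + F.relu(out)                         # Eq. (6)
```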

3.3 Model Optimizing

After obtaining the improved features and the preliminary semantic features, we will further incorporate the global and local representations. We employ a temporal average pooling layer g_TAP({·}_{t=1}^T) to generate the time-average (TA) feature vectors:

$$V_G^{TA} = g_{TAP}\big(\{V_G^t\}_{t=1}^T\big) \tag{7}$$

$$V'^{TA}_G = g_{TAP}\big(\{V'^t_G\}_{t=1}^T\big) \tag{8}$$

And we obtain the local and global combined TA feature vectors by

$$V_C^{TA} = g_{TAP}\Big(\Big\{\sum_{k=1}^{K} v_k^t\Big\}_{t=1}^T\Big) + V_G^{TA} \tag{9}$$

$$V'^{TA}_C = g_{TAP}\Big(\Big\{\sum_{k=1}^{K} \bar{v}_k^t\Big\}_{t=1}^T\Big) + V'^{TA}_G \tag{10}$$

The model is optimized by a loss function. We utilize an identification loss and a triplet loss for V_G^{TA}, V_C^{TA}, V′_G^{TA} and V′_C^{TA}, and combine the identification loss and triplet loss into the total loss with a weighting parameter λ that balances the two kinds of losses. We adopt the triplet loss with a hard mining strategy [35] and the identification loss with label smoothing regularization [36] to optimize the model.
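A minimal sketch of Eqs. (7)–(10) and the weighted total loss follows; the specific triplet and identification loss implementations, tensor layouts and function names are assumptions for illustration.

```python
import torch

def temporal_average(features: torch.Tensor) -> torch.Tensor:
    # g_TAP({.}): average over the T frames; features: (B, T, C)
    return features.mean(dim=1)

def combined_ta_features(local_feats: torch.Tensor, global_feats: torch.Tensor):
    # local_feats: (B, T, K, C); global_feats: (B, T, C)
    v_g_ta = temporal_average(global_feats)                     # Eqs. (7)/(8)
    v_c_ta = temporal_average(local_feats.sum(dim=2)) + v_g_ta  # Eqs. (9)/(10)
    return v_g_ta, v_c_ta

def total_loss(id_losses, triplet_losses, lam: float = 1.0) -> torch.Tensor:
    # Weighted combination of the label-smoothed identification losses and the
    # hard-mining triplet losses computed on the four TA feature vectors.
    return sum(id_losses) + lam * sum(triplet_losses)
```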

4 Experiments

4.1 Dataset and Implementation

Datasets. Two benchmark video-based person ReID datasets, MARS and DukeMTMC-VideoReID, are utilized to evaluate TE-AGC. MARS, the largest video-based person ReID dataset, contains 17503 tracklets from 1261 identities and 3248 distractor sequences; 625 identities are contained in the training set and 636 identities in the test set. DukeMTMC-VideoReID is derived from the DukeMTMC dataset, with 4832 tracklets from 1812 identities; there are 408, 702 and 702 identities for distraction, training and testing, respectively.

Evaluation Protocols. We adopt the mean average precision (mAP) and the Cumulative Matching Characteristic (CMC) to evaluate the performance of our method.

Implementation Details. We set T = 3, which means we randomly select three frames as an input sample from a variable-length video clip. Each image is resized to 256 × 128 pixels. Random horizontal flips and random erasing are performed for image augmentation. We employ ResNet-50 [34] pre-trained on ImageNet [37], with the global average pooling and fully connected layers removed, as the backbone network, and HR-Net [38] pre-trained on the COCO dataset [39] as the human key-point detector; 13 body key points are used. During training, the learning rate is initialized as 3.5 × 10−4 and decayed by 5 after every 70 epochs. The optimizer is Adam with weight decay 5 × 10−4, and the model is trained for 500 epochs in total. During inference, the representation of a video clip used to calculate the similarity scores is V′_C^{TA}. It should be noted that, when we set T = 4 or more, the ReID performance is very similar to T = 3; considering the computational cost, T = 3 is suitable for our method, and the following experimental analysis is therefore conducted with T = 3.
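For concreteness, the optimizer setup described above could look like the following sketch; interpreting "decayed by 5" as dividing the learning rate by 5 (gamma = 0.2) every 70 epochs is our assumption, and the function name is illustrative.

```python
from torch.optim import Adam
from torch.optim.lr_scheduler import StepLR

def build_optimizer(model):
    # Adam with the reported learning rate and weight decay.
    optimizer = Adam(model.parameters(), lr=3.5e-4, weight_decay=5e-4)
    # Assumed schedule: divide the learning rate by 5 every 70 epochs.
    scheduler = StepLR(optimizer, step_size=70, gamma=0.2)
    return optimizer, scheduler
```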

4.2 Comparison with State-of-the-Arts

Table 1 makes a comparison between our method and state-of-the-art algorithms on MARS and DukeMTMC-VideoReID datasets. Our method has achieved state-of-the-art performance. Results on MARS. Our method is compared with 12 state-of-the-art methods on MARS dataset. Among these methods, AGRL, STGCN and MGH are three other graph-based methods. Compared with these graph-based methods, our method achieves higher Rank-1, Rank-5 accuracy and mAP. There are two main


reasons for this improvement. One is that our method considers the complex spatial-temporal relations among different body parts and the whole body within a frame or across frames. On the other hand, instead of using pair-wise feature affinity, the information-passing metrics we designed encourage reliable nodes to pass more information to other nodes.

Results on DukeMTMC-VideoReID. Our method is compared with 11 state-of-the-art methods on the DukeMTMC-VideoReID dataset. Our method achieves 97.2% rank-1 accuracy and 96.3% mAP, which exceeds the vast majority of state-of-the-art methods. The comparison verifies the effectiveness of our method.

Table 1. Performance comparison to the state-of-the-art methods on the MARS and DukeMTMC-VideoReID datasets.

| Methods | MARS Rank-1 | MARS Rank-5 | MARS mAP | Duke Rank-1 | Duke Rank-5 | Duke mAP |
|---|---|---|---|---|---|---|
| GLTR [40] | 87.0 | 95.8 | 78.5 | 96.3 | 99.3 | 93.7 |
| COSAM [23] | 84.9 | 95.5 | 79.9 | 95.4 | 99.3 | 94.1 |
| VRSTC [41] | 88.5 | 96.5 | 82.3 | 95.0 | 99.1 | 93.5 |
| RGSAT [42] | 89.4 | 96.9 | 84.0 | 97.2 | 99.4 | 95.8 |
| AGRL [27] | 89.8 | 96.1 | 81.1 | 96.7 | 99.2 | 94.2 |
| TCLNet [24] | 89.8 | – | 85.1 | 96.9 | – | 96.2 |
| STGCN [28] | 90.0 | 96.4 | 83.7 | 97.3 | 99.3 | 95.7 |
| MGH [29] | 90.0 | 96.7 | 85.8 | – | – | – |
| AP3D [43] | 90.1 | – | 85.1 | 96.3 | – | 95.6 |
| AFA [44] | 90.2 | 96.6 | 82.9 | 97.2 | 99.4 | 95.4 |
| BiCnet-TKS [13] | 90.2 | – | 86.0 | 96.3 | – | 96.1 |
| TE-AGC (ours) | 90.7 | 97.5 | 85.8 | 97.2 | 99.4 | 96.3 |

4.3 Model Component Analysis

The architecture of our method has only one branch, which is both concise and effective. Here, the contribution of each part of TE-AGC is evaluated, and results on the MARS dataset are reported. Table 2 reports the experimental results of the ablation studies for TE-AGC. The 1st row can be regarded as the baseline of our method: for each frame, we use the backbone network and key-point detector to obtain global and local features, and then combine all features as in (7) and (9). Compared with the 1st row, the 2nd row shows the result of adding the Spatial Attention (SA) layer. The 3rd row adds the GCN layer we designed but does not use the spatial attention layer, which corresponds to removing only the spatial attention layer from the entire architecture (the 4th row). Comparing the 1st row with the 2nd row, or the 3rd row with the 4th row, directly demonstrates the effectiveness of the spatial attention: though it is relatively simple, it still has a certain inhibitory effect on background noise. The comparison of the 2nd and 4th rows, and of the 1st and 3rd rows, shows the effect of the graph convolution layer we designed: the information of each node is effectively transferred and enhanced during the graph convolution. Finally, a powerful spatio-temporal representation of the video is obtained.

Table 2. Component analysis of the effectiveness of each component of TE-AGC on the MARS dataset.

| Number | Methods | Rank-1 | Rank-5 | mAP |
|---|---|---|---|---|
| 1 | TE-AGC -GCN -SA | 88.7 | 96.1 | 83.8 |
| 2 | TE-AGC -GCN | 89.0 | 96.5 | 84.1 |
| 3 | TE-AGC -SA | 90.3 | 96.9 | 85.3 |
| 4 | TE-AGC | 90.7 | 97.5 | 85.8 |

5 Conclusion

In this paper, a novel Temporal Extension Adaptive Graph Convolution (TE-AGC) framework is proposed for video-based person ReID. TE-AGC can effectively mine features in the spatial and temporal dimensions in one graph convolution operation. It utilizes a CNN backbone, a simple spatial attention layer and a key-point detector to extract global and local features. A delicate adaptive graph convolution module is designed to encourage meaningful information transfer by dynamically learning the reliability of local features from multiple frames. We combine the global feature and local features of each frame with a feature fusion module to obtain discriminative representations of each video clip. The effectiveness of the TE-AGC method is verified by extensive experiments on two video datasets.

References 1. Ye, M., Shen, J., Lin, G., Xiang, T., Hoi, S.: Deep learning for person reidentification: a survey and outlook. IEEE Trans. Pattern Anal. Mach. Intell. 1–1 (2021) 2. Zheng, L., Yang, Y., Hauptmann, A.G.: Person re-identification: past, present and future. Sensors (Basel) 22(24), 9852 (2016) 3. Farenzena, M., Bazzani, L., Perina, A., Murino, V., Cristani, M.: Person reidentification by symmetry-driven accumulation of local features. In: Proeedings of IEEECONFERENCE on Computer Vision & Patternrecognition. pp. 2360–2367 (2010) 4. Liu, C., Gong, S., Loy, C.C., Lin, X.: Person re-identification: what features are important? In: Fusiello, A., Murino, V., Cucchiara, R. (eds.) ECCV 2012. LNCS, vol. 7583, pp. 391–401. Springer, Heidelberg (2012). https://doi.org/10.1007/9783-642-33863-2 39


5. Liao, S., Yang, H., Zhu, X., Li, S.Z.: Person re-identification by local maximal occurrence representation and metric learning. In: 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). (2015) 6. Matsukawa, T., Okabe, T., Suzuki, E., Sato, Y.: Hierarchical gaussian descriptor for person re-identification. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) 7. Xiong, F., Gou, M., Camps, O., Sznaier, M.: Person re-identification using kernelbased metric learning methods. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8695, pp. 1–16. Springer, Cham (2014). https:// doi.org/10.1007/978-3-319-10584-0 1 8. Zheng, W.S., Xiang, L., Tao, X., Liao, S., Lai, J., Gong, S.: Partial person reidentification. In: IEEE International Conference on Computer Vision. (2016) 9. Wang, G., et al.: High-order information matters: Learning relation and topology for occluded person re-identification. In: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)(2020) 10. Chung, D., Tahboub, K., Delp, E.J.: A two stream Siamese convolutional neural network for person re-identification. In: 2017 IEEE International Conference on Computer Vision (ICCV) (2017) 11. Chen, D., Li, H., Tong, X., Shuai, Y., Wang, X.: Video person re-identification with competitive snippet-similarity aggregation and co-attentive snippet embedding. In: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2018) 12. Xu, S., Yu, C., Kang, G., Yang, Y., Pan, Z.: Jointly attentive spatial-temporal pooling networks for video-based person re-identification. In: 2017 IEEE International Conference on Computer Vision (ICCV). (2017) 13. Hou, R., Chang, H., Ma, B., Huang, R., Shan, S.: BiCnet-TKS: learning efficient spatial-temporal representation for video person re-identification (2021) 14. Liu, J., Zha, Z.J., Wu, W., Zheng, K., Sun, Q.: Spatial-temporal correlation and topology learning for person re-identification in videos. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 4370–4379 (2021) 15. Miao, J., Wu, Y., Liu, P., Ding, Y., Yang, Y.: Pose-guided feature alignment for occluded person re-identification. In: 2019 IEEE/CVF International Conference on Computer Vision (ICCV) (2019) 16. Sun, Y., et al.: Perceive where to focus: Learning visibility-aware part-level features for partial person re-identification. In: 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2020) 17. Mclaughlin, N., Rincon, J., Miller, P.: Recurrent convolutional network for videobased person re-identification. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) 18. Gao, J., Nevatia, R.: Revisiting temporal modeling for video-based person reID (2018) 19. Zheng, L., Bie, Z., Sun, Y., Wang, J., Su, C., Wang, S., Tian, Q.: MARS: a video benchmark for large-scale person re-identification. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9910, pp. 868–884. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46466-4 52 20. Fu, Y., Wang, X., Wei, Y., Huang, T.S.: STA: spatial-temporal attention for largescale video-based person re-identification. In: National Conference on Artificial Intelligence (2019)


21. Li, S., Bak, S., Carr, P., Wang, X.: Diversity regularized spatiotemporal attention for video-based person re-identification. In: Proceedings of the IEEE Conference on Computer Vision and Pattern recognition, pp. 369–378 (2018) 22. Ouyang, D., Zhang, Y., Shao, J.: Video-based person re-identification via spatiotemporal attentional and two-stream fusion convolutional networks. Pattern Recog. Lett. 117, 153–160 (2018) 23. Subramaniam, A., Nambiar, A., Mittal, A.: Co-segmentation inspired attention networks for video-based person re-identification. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 562–572 (2019) 24. Hou, R., Chang, H., Ma, B., Shan, S., Chen, X.: Temporal complementary learning for video person re-identification. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.M. (eds.) ECCV 2020. LNCS, vol. 12370, pp. 388–405. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58595-2 24 25. Jones, M.J., Rambhatla, S.: Body part alignment and temporal attention for videobased person re-identification. In: BMVC (2019) 26. Zhao, H., et al.: Spindle net: person re-identification with human body region guided feature decomposition and fusion. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1077–1085 (2017) 27. Wu, Y., Bourahla, O.E.F., Li, X., Wu, F., Tian, Q., Zhou, X.: Adaptive graph representation learning for video person re-identification. IEEE Trans. Image Process. 29, 8821–8830 (2020) 28. Yang, J., Zheng, W.S., Yang, Q., Chen, Y.C., Tian, Q.: Spatial-temporal graph convolutional network for video-based person re-identification. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3289–3299 (2020) 29. Yan, Y., Qin, J., Chen, J., Liu, L., Zhu, F., Tai, Y., Shao, L.: Learning multigranular hypergraphs for video-based person re-identification. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 2899–2908 (2020) 30. Zhang, Z., Lan, C., Zeng, W., Jin, X., Chen, Z.: Relation-aware global attention for person re-identification. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3186–3195 (2020) 31. Kipf, T.N., Welling, M.: Semi-supervised classification with graph convolutional networks. arXiv preprint arXiv:1609.02907 (2016) 32. Obinata, Y., Yamamoto, T.: Temporal extension module for skeleton-based action recognition. In: 2020 25th International Conference on Pattern Recognition (ICPR), IEEE, pp. 534–540 (2021) 33. Shi, W., Rajkumar, R.: Point-GNN: graph neural network for 3D object detection in a point cloud. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 1711–1719 (2020) 34. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2020) 35. Hermans, A., Beyer, L., Leibe, B.: In defense of the triplet loss for person reidentification. arXiv preprint arXiv:1703.07737 (2017) 36. Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., Wojna, Z.: Rethinking the inception architecture for computer vision. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2818–2826 (2016) 37. Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., Fei-Fei, L.: ImageNet: a large-scale hierarchical image database. In: 2009 IEEE Conference on Computer Vision and Pattern Recognition, IEEE, pp. 248–255 (2009)


38. Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) 39. Lin, T.Y., et al.: Microsoft COCO: common objects in context. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8693, pp. 740–755. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-10602-1 48 40. Li, J., Wang, J., Tian, Q., Gao, W., Zhang, S.: Global-local temporal representations for video person re-identification. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 3958–3967 (2019) 41. Hou, R., Ma, B., Chang, H., Gu, X., Shan, S., Chen, X.: VRSTC: occlusion-free video person re-identification. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 7183–7192 (2019) 42. Li, X., Zhou, W., Zhou, Y., Li, H.: Relation-guided spatial attention and temporal refinement for video-based person re-identification. In: Proceedings of the AAAI Conference on Artificial Intelligence. 34, 11434–11441 (2020) 43. Gu, X., Chang, H., Ma, B., Zhang, H., Chen, X.: Appearance-preserving 3D convolution for video-based person re-identification. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12347, pp. 228–243. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58536-5 14 44. Chen, G., Rao, Y., Lu, J., Zhou, J.: Temporal coherence or temporal motion: which is more critical for video-based person re-identification? In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12353, pp. 660–676. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58598-3 39

Deep RGB-Driven Learning Network for Unsupervised Hyperspectral Image Super-Resolution

Zhe Liu and Xian-Hua Han(B)

Graduate School of Sciences and Technology for Innovation, Yamaguchi University, 1677-1 Yoshida, Yamaguchi 753-8511, Japan
{a501wbu,hanxhua}@yamaguchi-u.ac.jp

Abstract. Hyperspectral (HS) images are used in many fields to improve the analysis and understanding of captured scenes, as they contain a wide range of spectral information. However, the spatial resolution of hyperspectral images is usually very low, which limits their wide applicability in real tasks. To address the problem of low spatial resolution, super-resolution (SR) methods for hyperspectral images (HSI) have attracted widespread interest; they aim to mathematically generate a high spatial resolution hyperspectral (HR-HS) image by combining degraded observations: a low spatial resolution hyperspectral (LR-HS) image and a high-resolution multispectral or RGB (HR-MS/RGB) image. Recently, paradigms based on deep learning have been widely explored as an alternative to automatically learn the inherent priors of the latent HR-HS image. These learning-based approaches are usually implemented in a fully supervised manner and require large external datasets including the degraded observations (LR-HS/HR-RGB images) and the corresponding HR-HS data, which are difficult to collect, especially for HSI SR scenarios. Therefore, in this study, a new unsupervised HSI SR method is proposed that uses only the observed LR-HS and HR-RGB images without any other external samples. Specifically, we use an RGB-driven deep generative network to learn the desired HR-HS image with an encoder-decoder-based network architecture. Since the observed HR-RGB image has a more detailed spatial structure and is more suitable for two-dimensional convolution operations, we employ the observed HR-RGB image as the network input and conditional guide, and adopt the observed LR-HS/HR-RGB images to formulate the loss function that guides the network learning. Experimental results on two HS image datasets show that our proposed unsupervised approach provides superior results compared to the SoTA deep learning paradigms.

Keywords: Hyperspectral image · Super-resolution · Unsupervised learning


1 Introduction

With hyperspectral (HS) imaging, detailed traces along the spectral direction and rich spectral features in tens or even hundreds of bands can be obtained at every spatial location in a scene, which can significantly improve the performance of different HS processing systems. Existing HS image sensors typically collect HS data at a low spatial resolution, which severely limits their wide applicability in the real world. Therefore, generating high-resolution hyperspectral (HR-HS) images by combining the degraded observations, i.e., low-resolution hyperspectral (LR-HS) and high-resolution multispectral/RGB (HR-MS/RGB) images, known as HS image super-resolution (HSI SR), has attracted great attention in the fields of computer vision [29,32], medical diagnosis [19,21,33], mineral exploration [24,30] and remote sensing [3,20,23]. According to the reconstruction principles, HSI SR methods are mainly divided into two categories: traditional mathematical model-based methods and deep learning-based methods. In the following, we describe these two types of methods in detail.

1.1 Traditional Mathematical Model-Based Methods

In the past decades, most HSI SR methods have focused on studying various manually designed priors to develop a mathematical model and use optimization techniques to solve the problem. Specifically, such methods develop a mathematical formulation to model the process of degrading the HR-HS image into the LR-HS image and the HR-RGB image. Since the known variables of the observed LR-HS/HR-RGB images are much fewer than the unknown variables of the HR-HS image, this task is extremely challenging, and direct optimization of the formulated mathematical model would lead to a very unstable solution. Therefore, existing efforts usually exploit various priors to regularize the mathematical model, i.e., to impose constraints on the solution space. Depending on the priors investigated, existing studies are generally classified into three different approaches: spectral unmixing-based [16], sparse representation-based [6] and tensor factorization-based methods [4]. Among the spectral unmixing-based methods, Yokoya et al. [31] proposed a coupled non-negative matrix factorization (CNMF) method, which alternately unmixes the LR-HS image and the HR-RGB image to estimate the HR-HS image. Recently, Lanaras et al. [16] proposed a similar framework to jointly unmix the two observed images by decomposing the original optimization problem into two constrained least squares problems. A similar framework was proposed by Dong et al. [6], which employed the alternating direction method of multipliers (ADMM) to solve the spectral hash model for the robust reconstruction of the base image of HR-HS images. In addition, sparse representation is widely used as an alternative mathematical model for HSI SR, where a spectral dictionary is first learned from the observed LR-HS image, and the sparse coefficients of the HR-RGB image are then calculated to reconstruct the HR-HS image. For example, Zhao et al. [34] used the K-SVD method to learn the dictionary, and then adopted sparse matrix decomposition to combine the LR-HS and HR-RGB images to reconstruct the HR-HS image. Inspired by the spectral similarity of neighboring pixels in latent HS images, Akhtar et al. [1] proposed to perform group sparsity and non-negativity representations, while Kawakami et al. [15] used a sparse regularizer to decompose the spectral dictionary. In addition, Han et al. [9] proposed a non-negative sparse coding algorithm that effectively exploits pixel sparsity and non-local spatial similarity in HR-HS images. Furthermore, the tensor factorization-based approach was shown to be feasible for the HSI SR problem. Motivated by the inherent low dimensionality of spectral signatures and the 3D structure of HR-HS images, He et al. [13] employed matrix factorization to decompose the HR-HS image into two low-rank constraint matrices and showed impressive super-resolution results. Despite some improvements achieved using manually designed priors, super-resolution performance tends to be unstable and sensitive to content variations in the investigated images, and may lead to significant spectral distortions due to the insufficient representation ability of empirically designed priors.

1.2 Deep Learning-Based Methods

Deep Supervised Learning-Based Methods. Due to the high success of DCNNs in different vision tasks, DCNN-based approaches have been proposed for the HSI SR task to automatically learn specific priors of the latent HR-HS images. Han et al. [10] first conducted a pioneering work on merging the HS/RGB images to estimate the latent HR-HS image using a deep learning network, which contained three simple convolutional layers but demonstrated very impressive performance, and then employed more complex CNN architectures such as ResNet and DenseNet [8] for performance improvement. Dian et al. [5] proposed an optimization and learning integration strategy for the fusion task by first solving the Sylvester equation and then exploring a DCNN-based approach to improve the initialization results. Han et al. [12] further investigated a multi-level and multi-scale spatial and spectral fusion network to effectively fuse the observed LR-HS and HR-RGB images with large spatial structure differences. Xie et al. [28] studied an MS/HS fusion network using a low-resolution imaging model and the spectral low-rank property of HR-HS images, and solved the proposed MS/HS fusion network using an approximate gradient strategy. In addition, Zhu et al. [35] investigated a lightweight deep neural network, dubbed the progressive zero-centric residual network (PZRes-Net), to achieve efficiency and performance in solving the HS image reconstruction problem. Despite the significant improvement in reconstruction performance, all the above DCNN-based approaches require training with large external datasets, including the degraded LR-HS/HR-RGB images and corresponding HR-HS images, which are difficult to collect, especially for HSI SR tasks.


HR-HS images for a well-generalizing CNN model. Therefore, the quality and quantity of the collected training (external) datasets usually become the bottleneck of DCNN-based approaches. To alleviate the heavy reliance on external datasets, several deep unsupervised learning methods have been investigated to take advantage of the powerful modeling capability of deep networks [17]. Qu et al. [22] attempted to solve the HSI super-resolution problem with an unsupervised learning strategy and developed an encoder-decoder architecture that exploits the approximate structure of the low-rank spectral prior in the latent HR-HS images. Although this approach does not require external training samples to construct a CNN-based end-to-end model for the recovery of the HR-HS image, it requires carefully designing an alternating optimization procedure for two sub-problems, and tends to produce unstable super-resolution results. Liu et al. [18] proposed a deep unsupervised fusion learning (DUFL) method to generate the HR-HS image from a random noisy input using the observed HR-RGB and LR-HS images only. However, DUFL aims to use a generative network to learn the HR-HS image from random noise and therefore does not make full use of the available information in the observations, such as the HR-RGB image with its high spatial resolution structure. Subsequently, Uezato et al. [25] used a deep decoder network to generate the latent HR-HS image from noisy input data together with the observed HR-RGB image as guided information, called a guided deep decoder (GDD) network. In addition, Fu et al. [7] proposed to jointly optimize the optimal CSF and the potential HR-HS image from the observations only. Although these current unsupervised methods illustrate the potential for plausible HR-HS image generation, most of them do not fully exploit the high-resolution spatial structure of the observed HR-RGB image; therefore, there is still room for improvement in terms of performance.

To handle the above-mentioned issues, this study proposes a new deep RGB-driven generative network for unsupervised HSI SR that uses the observed HR-RGB image instead of random noise as the network input. Specifically, by leveraging the observed HR-RGB image with its high-resolution spatial structure as the input, we design an encoder-decoder-based two-dimensional convolutional network to learn the latent HR-HS image, followed by specially designed convolutional layers that implement the spatial and spectral degradation procedures to obtain the approximated LR-HS and HR-RGB images. Thus, the loss functions for training the generative network can be formulated using the observed LR-HS and HR-RGB images only, and no external samples are required in the end-to-end unsupervised learning model. Experimental results on two HS datasets show that our method outperforms the state-of-the-art methods. The main advantages of this study are summarized as follows.

I. We propose an RGB-driven generative network that makes full use of the high-resolution spatial structure in the observed RGB image as the conditional input for robust HR-HS image estimation.
II. We learn the specific CNN model directly from the observations without the need for a labeled training set.


III. We construct simple convolution-based degradation modules using specifically designed depth-wise and point-wise convolution layers to imitate the observation procedure, which are easy to optimize together with the generative network.

2 Proposed Method

In this section, we first introduce the problem formulation of the HSI SR task and then describe our proposed deep RGB-driven generative network.

2.1 Problem Formulation

Problem Formulation

The goal of the HSI SR task is to generate HR-HS images by combining the LR-HS and HR-RGB images: Iy ∈ w×h×L and Ix ∈ W×H×3 , where W (w) and H(h) denote the width and height of the HR-HS (LR-HS) image and L(3) denotes the spectral number. In general, the degradation process of the observations: Ix and Iy from the HR-HS image Iz can be mathematically expressed as follows. Ix = k(Spa) ⊗ I(Spa) ↓ +nx , Iy = Iz ∗ C(Spec) + ny , z

(1)

where ⊗ denotes the convolution operator, k(Spa) is the two-dimensional blur kernel in the spatial domain, and (Spa) ↓ denotes the down-sampling operator in the spatial domain. C(Spec) is the spectral sensitivity function of the RGB camera (three filters in the one-dimensional spectral direction), which converts L spectral bands into RGB bands, and nx , ny are additive white Gaussian noise (AWGN) with a noise level of σ. For simplicity, we rewrite the mathematical degradation model in Eq. 1 as the following matrix form. Ix = DIz + nx , Iy = Iz C + ny ,

(2)

where D is the spatial degradation matrix containing the spatial blurring matrix and the down-sampling matrix, and C is the spectral transformation matrix representing the camera spectral sensitivity function (CSF). By assuming the known spatial and spectral degradations, the HSI SR problem can be solved intuitively through minimizing the following reconstruction errors. Iz ∗ = arg min Ix − DIz 2F + Iy − Iz C2F , Iz

(3)

where |·|F denotes the Frobenius norm. However, Eq. 3 may have several optimal solutions that yield minimal reconstruction errors, and thus direct optimization would lead to a very unstable solution. Most existing methods typically use different hand-crafted priors to model the potential HR-HS to impose the constraints on Eq. 3 for narrowing the solution space, and demonstrate great performance improvement with elaborated priors. The prior-regularized mathematical model approach is expressed as follows. Iz ∗ = arg min Ix − DIz 2F + Iy − Iz C2F + αφ(Iz ), Iz

(4)

Deep RGB-Driven Learning Network

231

Fig. 1. Proposed framework of deep RGB-driven generative network. FT Conv Module denotes feature transfer convolution module. Spatial Degradation module is implemented by a specifically designed depth-wise convolution layer while Spectral Degradation module is realized by a point-wise convolution layer.

where φ(Iz ) is used as the regularization term for modeling the prior in the latent HR-HS image, while α is the hyper-parameter for adjusting the contribution of the regularization term and the reconstruction error. However, the investigated priors in the existing methods are designed empirically and usually face difficulty to sufficiently modeling the complicated spatial and spectral structures. 2.2

Proposed Deep RGB-Driven Generative Network

As shown in many vision tasks, deep convolutional networks have powerful modeling capabilities to capture the inherent prior knowledge of different visual data (images), and in this study, deep learning networks are used to automatically learn the prior knowledge embedded in HR-HS images. Specifically, we use an encoder-decoder-based generative network to automatically reconstruct HR-HS images. In the absence of the ground-truth HR-HS images for training the generative network as in the conventional fully supervised learning networks, we employ the observed HR-RGB and LR-HS images to formulate the loss functions expressed in Eq. 3. Specifically, given the predicted HR-HS image ˆ Iz = gθ (·), where gθ denotes the generative network and θ is its parameter, we specifically designed a depth-wise convolutional layer to implement a spatially degradation model as Iz ) and a point-wise convolutional layer to implement the spectral transFSpa (ˆ form FSpe (ˆ Iz ). By simply fixing the weights of the specially designed convolutional layers in the spatial degradation matrix D and the CSF matrix C, we can

232

Z. Liu and X.-H. Han

transform the output of the generative network into the approximated versions of the HR-RGB image: Ix and the LR-HS image: Iy . According to Eq. 3, we minimize the reconstruction errors of the under-studying LR-HS and HR-RGB images to train the generative network, formulated as the follows: θ∗ = arg min Ix − FSpe (gθ (·))2F + Iy − FSpa (gθ (·))2F , θ

(5)

Since the generative network gθ can potentially learn and model the inherent priors in the latent HR-HS image, it is not necessary to explicitly impose prior modeling constraints as the regularization termin the Eq. 5. The conceptual framework of the proposed deep RGB-driven generativenetwork is shown in Fig. 1. It can be trained using the observed LR-HS and HR-RGB image only without any external data. In the following subsections, we will present the generative network architecture and the network inputs. The Generative Network Architecture: For the generative network gθ , any DCNN can be used to serve as the beseline architecture in our proposed framework. Since the latent HR-HS images often contain various structures, rich textures, and complex spectra, the employed generative network gθ has to possess enough modeling ability to ensure reliable HR-HS image representation. Several generative architectures [2] have been investigated and significant progress has been made in generating high-quality natural images [14], for example in the context of the adversarial learning research. Since our unsupervised framework requires training a specific CNN model for each under-studying observation, shallower networks are preferred to reduce the training time. Moreover, it is known that a deeper network architecture, which can capture feature representation in a large receptive field can improve the super-resolution performance. Therefore, a shallow network with sufficient representation modeling capability in a larger receptive field would be suitable for our network structure. It is well known that encoder-decoder networks have a shallow structure being possible to learn feature representation in large-scale spatial context due to down-sampling operations between adjacent scales, and thus we employ the encoder-decoder structure as our generative network. In detail, the generative network consists of an encoder subnet and a decoder subnet, and both encoder and decoder include multiple blocks with different scales that can capture feature at different receptive fields. The outputs of all blocks in the encoder subnet are transferred to the corresponding blocks in the decoder using a convolution-based feature transfer module (FT Conv module) to reuse the learned detailed features. Each block consists of three convolutional/RELU pairs, where a max pooling layer with 2 × 2 kernels are used to reduce the feature map size between the blocks of the encoder, and an up-sampling layer is used to double the feature map size between the blocks of the decoder. Finally, a vanilla convolution-based output layer is used to estimate the underlying HR-HS image. The RGB-Guided Input: Most generative neural networks are trained to synthesize the target images with the specific defined concept from noisy vectors, which are randomly generated based on a distribution function (e.g., Gaussian

Deep RGB-Driven Learning Network HR

SNNMF

SNNMF_diff.

NSSR

NSSR_diff.

LR

uSDN

DUFL

uSDN_diff.

DUFL_diff.

233

Ours 255

Ours_diff.

0 20

64*64

0

Fig. 2. Visual results of mathematical optimization-based methods: SNNMF [27], NSSR [6] and deep learning-based methods: uSDN [22], DUFL [18] on the CAVE dataset and our method with the up-scale factor 8.

or uniform distribution). As recent studies have confirmed, randomly generated noisy inputs usually produce sufficiently diverse and unique images. Our HSI SR task aims to use the observed LR-HS and HR-RGB images to learn the corresponding HR-HS images. Simply using noise as input does not take full advantage of the existed information in the observations. Therefore, we attempt to employ the available observation as a conditional guide for our generative network. The observed HR-RGB images are known to have a high spatial resolution structure, and it is expected to assist the two-dimension convolution-based generative network learning more effective representation for reliable HR-HS image recovery. The observed LR-HS images can also be used as conditional inputs to the network. However, the low-resolution spatial structure may lead to local minimization of the network training process, which may have a negative impact on the prediction results. More importantly, the magnification factor, e.g., 10 for 31 spectral bands estimation from RGB in the spectral domain,, is usually much smaller than in the spatial domain (64 in total (8 × 8) with an up-sampling factor of 8), so we use the observed HR-RGB image as conditional network input, denoted as Z∗ = gθ (Iy ).
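A compact sketch of the RGB-driven encoder-decoder generator described above follows (blocks of three conv/ReLU pairs, 2×2 max pooling in the encoder, up-sampling in the decoder, and convolution-based feature-transfer skips). The channel widths, number of blocks and class name are illustrative assumptions, not the authors' exact configuration.

```python
import torch
import torch.nn as nn

def conv_block(in_ch, out_ch):
    # Three convolution/ReLU pairs per block, as described above.
    layers = []
    for i in range(3):
        layers += [nn.Conv2d(in_ch if i == 0 else out_ch, out_ch, 3, padding=1),
                   nn.ReLU(inplace=True)]
    return nn.Sequential(*layers)

class RGBGuidedGenerator(nn.Module):
    """Illustrative encoder-decoder g_theta taking the HR-RGB image as input."""

    def __init__(self, out_bands=31, widths=(32, 64, 128)):
        super().__init__()
        self.enc = nn.ModuleList()
        ch = 3
        for w in widths:
            self.enc.append(conv_block(ch, w))
            ch = w
        self.pool = nn.MaxPool2d(2)
        self.dec = nn.ModuleList()
        self.ft = nn.ModuleList()       # convolution-based feature-transfer (skip) modules
        rev = widths[::-1]
        for w_in, w_out in zip(rev, list(rev[1:]) + [widths[0]]):
            self.dec.append(conv_block(w_in, w_out))
            self.ft.append(nn.Conv2d(w_in, w_in, 1))
        self.up = nn.Upsample(scale_factor=2, mode='bilinear', align_corners=False)
        self.out = nn.Conv2d(widths[0], out_bands, 3, padding=1)  # vanilla conv output layer

    def forward(self, hr_rgb):
        skips, x = [], hr_rgb
        for i, block in enumerate(self.enc):
            x = block(x)
            skips.append(x)
            if i < len(self.enc) - 1:
                x = self.pool(x)        # halve the spatial size between encoder blocks
        for i, block in enumerate(self.dec):
            x = block(x + self.ft[i](skips[-1 - i]))   # reuse encoder features via FT conv
            if i < len(self.dec) - 1:
                x = self.up(x)          # double the spatial size between decoder blocks
        return self.out(x)              # estimated HR-HS image
```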

3 Experimental Results

3.1 Experimental Setting

We evaluate our method on two commonly used datasets, the CAVE and Harvard datasets. The CAVE dataset contains 32 HS images of real materials and objects, all with the same spatial resolution of 512 × 512 and 31 adjacent spectral bands ranging from 400 nm to 700 nm. The Harvard dataset contains 50 HS images taken during daylight hours, both outdoors and indoors, all with the same spatial resolution of 1392 × 1040 and 31 spectral bands ranging from 420 nm


Table 1. Compared results with the SoTA methods, including mathematical optimization-based and deep learning-based methods, on both the CAVE and Harvard datasets with the up-scale factors 8 and 16. In each row, the first five columns are for CAVE and the last five for Harvard.

Up-scale factor = 8
Method | RMSE↓ | PSNR↑ | SSIM↑ | SAM↓ | ERGAS↓ | RMSE↓ | PSNR↑ | SSIM↑ | SAM↓ | ERGAS↓
Mathematical optimization:
GOMP | 5.69 | 33.64 | – | 11.86 | 2.99 | 3.79 | 38.89 | – | 4.00 | 1.65
MF | 2.34 | 41.83 | – | 3.88 | 1.26 | 1.83 | 43.74 | – | 2.66 | 0.87
SNNMF | 1.89 | 43.53 | – | 3.42 | 1.03 | 1.79 | 43.86 | – | 2.63 | 0.85
CSU | 2.56 | 40.74 | 0.985 | 5.44 | 1.45 | 1.40 | 46.86 | 0.993 | 1.77 | 0.77
NSSR | 1.45 | 45.72 | 0.992 | 2.98 | 0.80 | 1.56 | 45.03 | 0.993 | 2.48 | 0.84
Deep learning:
SSFNet | 1.89 | 44.41 | 0.991 | 3.31 | 0.89 | 2.18 | 41.93 | 0.991 | 4.38 | 0.98
DHSIS | 1.46 | 45.59 | 0.990 | 3.91 | 0.73 | 1.37 | 46.02 | 0.981 | 3.54 | 1.17
ResNet | 1.47 | 45.90 | 0.993 | 2.82 | 0.79 | 1.65 | 44.71 | 0.984 | 2.21 | 1.09
uSDN | 4.37 | 35.99 | 0.914 | 5.39 | 0.66 | 2.42 | 42.11 | 0.987 | 3.88 | 1.08
DUFL | 2.08 | 42.50 | 0.975 | 5.36 | 1.16 | 2.38 | 42.16 | 0.965 | 2.35 | 1.09
Ours | 1.35 | 46.20 | 0.992 | 3.05 | 0.77 | 1.07 | 49.17 | 0.994 | 1.59 | 0.72

Up-scale factor = 16
Method | RMSE↓ | PSNR↑ | SSIM↑ | SAM↓ | ERGAS↓ | RMSE↓ | PSNR↑ | SSIM↑ | SAM↓ | ERGAS↓
Mathematical optimization:
GOMP | 6.08 | 32.96 | – | 12.60 | 1.43 | 3.83 | 38.56 | – | 4.16 | –
MF | 2.71 | 40.43 | – | 4.82 | 0.73 | 1.94 | 43.30 | – | 2.85 | 0.47
SNNMF | 2.45 | 42.21 | – | 4.61 | 0.66 | 1.93 | 43.31 | – | 2.85 | 0.45
CSU | 2.87 | 39.83 | 0.983 | 5.65 | 0.79 | 1.60 | 45.50 | 0.992 | 1.95 | 0.44
NSSR | 1.78 | 44.01 | 0.990 | 3.59 | 0.49 | 1.65 | 44.51 | 0.993 | 2.48 | 0.41
Deep learning:
SSFNet | 2.18 | 41.93 | 0.991 | 4.38 | 0.98 | 1.94 | 43.56 | 0.980 | 3.14 | 0.98
DHSIS | 2.36 | 41.63 | 0.987 | 4.30 | 0.49 | 1.87 | 43.49 | 0.983 | 2.88 | 0.54
ResNet | 1.93 | 43.57 | 0.991 | 3.58 | 0.51 | 1.83 | 44.05 | 0.984 | 2.37 | 0.59
uSDN | 3.60 | 37.08 | 0.969 | 6.19 | 0.41 | 9.31 | 39.39 | 0.931 | 4.65 | 1.72
DUFL | 2.61 | 40.71 | 0.967 | 6.62 | 0.70 | 2.81 | 40.77 | 0.953 | 3.01 | 0.75
Ours | 1.71 | 44.15 | 0.990 | 3.63 | 0.48 | 1.28 | 47.37 | 0.992 | 1.92 | 0.49

Table 2. Ablation studies of different numbers of employed blocks in the generative network and of the loss terms, on the CAVE dataset with the up-scale factor 8.

Number of employed blocks | Loss | RMSE↓ | PSNR↑ | SSIM↑ | SAM↓ | ERGAS↓
2 | Both | 1.45 | 45.49 | 0.992 | 3.47 | 0.81
3 | Both | 1.42 | 45.69 | 0.992 | 3.28 | 0.81
4 | Both | 1.38 | 46.05 | 0.993 | 3.13 | 0.77
5 | Loss1 | 26.27 | 19.85 | 0.601 | 43.53 | 16.19
5 | Loss2 | 3.30 | 38.57 | 0.972 | 3.68 | 1.88
5 | Both | 1.35 | 46.20 | 0.992 | 3.05 | 0.77


Table 3. Ablation studies of different network inputs on the CAVE dataset with the up-scale factor 8. Block number: 5; Loss: both.

Input | RMSE↓ | PSNR↑ | SSIM↑ | SAM↓ | ERGAS↓
Noise | 2.10 | 42.53 | 0.978 | 5.30 | 1.12
Combined | 1.46 | 45.47 | 0.992 | 3.27 | 0.81
Combined + noise | 1.44 | 45.61 | 0.992 | 3.72 | 0.80
Ours (RGB) | 1.35 | 46.20 | 0.992 | 3.05 | 0.77

Fig. 3. Visual results of mathematical optimization-based methods: SNNMF [27], NSSR [6] and deep learning-based methods: uSDN [22], DUFL [18] on the Harvard dataset and our method with the up-scale factor 8.

Fig. 4. SAM visual results of mathematical optimization-based methods: SNNMF [27], NSSR [6] and deep learning-based methods: uSDN [22], DUFL [18] on the CAVE and Harvard datasets and our method with the up-scale factor 8.

to 720 nm. For both datasets, we transformed the corresponding HS images using the spectral response function of the Nikon D700 camera to obtain the HR-RGB images, while the LR-HS images were obtained by bicubic downsampling of the HS


images. To objectively evaluate the performance of the different HSI SR methods, we adopted five widely used metrics: root mean square error (RMSE), peak signal-to-noise ratio (PSNR), structural similarity index (SSIM), spectral angle mapper (SAM), and the relative dimensionless global error in synthesis (ERGAS). First, we conducted experiments using the generative network with five blocks, both loss terms in Eq. 5, and the RGB input, for comparison with the state-of-the-art methods; we then performed an ablation study by varying the number of blocks, the loss terms, and the inputs to the generative network.
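For reference, the observation simulation just described (spatial degradation by bicubic downsampling, spectral degradation by a camera response function) can be sketched as follows. This is a hedged illustration: `csf` stands for the 3 × 31 Nikon D700 response matrix, whose values are not reproduced here, and SciPy's cubic zoom only approximates MATLAB-style bicubic resizing.

```python
import numpy as np
from scipy.ndimage import zoom

def simulate_observations(hr_hs, csf, factor=8):
    """hr_hs: (31, H, W) ground-truth HS image; csf: (3, 31) camera spectral response.

    Returns the two observations used for training: the LR-HS image (spatial
    degradation) and the HR-RGB image (spectral degradation)."""
    # spatial degradation: downsample each band (cubic spline approximates bicubic)
    lr_hs = zoom(hr_hs, (1, 1.0 / factor, 1.0 / factor), order=3)
    # spectral degradation: project the 31 bands onto 3 RGB channels
    hr_rgb = np.tensordot(csf, hr_hs, axes=(1, 0))
    return lr_hs, hr_rgb
```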

3.2 Comparisons with the State-of-the-Art Methods

We compared our approach with various state-of-the-art methods, including mathematical optimization-based methods (GOMP [26], MF [15], SNNMF [27], CSU [31], NSSR [6]), supervised deep learning-based methods (SSFNet [9], DHSIS [5], ResNet [11]), and unsupervised deep learning-based methods (uSDN [22], DUFL [18]). Table 1 shows the comparative results for the spatial expansion factor 8 and demonstrates that our method significantly improves the performance in terms of all evaluation metrics. In addition, Fig. 2 and Fig. 3 show the visualized difference results of two representative images for the different deep unsupervised learning methods. Figure 4 illustrates the SAM visualization results on both the CAVE and Harvard datasets and shows that our proposed method produces small reconstruction errors at most spatial locations.
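For completeness, the metrics reported in Table 1 can be computed roughly as below. These are common textbook definitions under the assumption of a 0-255 intensity range; the authors' exact evaluation code may differ in normalization details.

```python
import numpy as np

def metrics(ref, est, ratio=8):
    """ref, est: (B, H, W) hyperspectral images on a 0-255 scale."""
    diff = ref.astype(np.float64) - est.astype(np.float64)
    rmse = np.sqrt(np.mean(diff ** 2))
    psnr = 20 * np.log10(255.0 / rmse)
    # spectral angle mapper, averaged over pixels, in degrees
    r = ref.reshape(ref.shape[0], -1)
    e = est.reshape(est.shape[0], -1)
    cos = np.sum(r * e, axis=0) / (np.linalg.norm(r, axis=0) * np.linalg.norm(e, axis=0) + 1e-12)
    sam = np.degrees(np.mean(np.arccos(np.clip(cos, -1.0, 1.0))))
    # relative dimensionless global error in synthesis (one common convention)
    band_rmse = np.sqrt(np.mean(diff ** 2, axis=(1, 2)))
    ergas = 100.0 / ratio * np.sqrt(np.mean((band_rmse / (np.mean(r, axis=1) + 1e-12)) ** 2))
    return rmse, psnr, sam, ergas
```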

3.3 Ablation Study

We validate the effect on performance of varying the block (scale) number of the generative network, the reconstruction error terms used, and the network inputs. As mentioned above, we used an encoder-decoder structure, in which both the encoder and decoder paths contain multiple blocks, as our specific CNN model to extract multi-scale contexts over different receptive fields. To test the efficiency of the multiple scales, we varied the block number from 2 to 5 and performed HR-HS image learning experiments. The comparative results are shown in Table 2: more blocks improve the performance, while the generative network achieves impressive results even with only two blocks. In addition, as described in Eq. 5, we use the reconstruction errors of both the observed HR-RGB and LR-HS images (denoted as the 'both' loss) as the loss function; for comparison, we also take only one term of Eq. 5 as the training loss, denoted as loss 1 and loss 2, respectively. Table 2 reports the comparison results using the different loss functions and indicates that using both loss terms performs much better. Finally, we verified the effect of different inputs to the generative network. As mentioned above, most generative networks use randomly generated noisy inputs to synthesize different natural images. To make full use of the available data, we employed the observed HR-RGB image as the conditional input to guide the training of the proposed generative network. Without loss of


generality, we also combined the HR-RGB image with the up-sampled LR-HS image as the network input (marked as 'combined'), and additionally perturbed the combined input with a small amount of noise at each training step to increase the robustness of model training. The comparison results for the different network inputs are shown in Table 3, and the conditional input using the HR-RGB image yields the best recovery performance.
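The two reconstruction-error terms compared above can be sketched as follows. This is only an illustrative PyTorch interpretation of the convolution-based degradation modules: the kernel sizes, whether the degradations are fixed or learned, and the choice of the L1 penalty are assumptions, not details confirmed by the paper.

```python
import torch.nn as nn
import torch.nn.functional as F

class DegradationLoss(nn.Module):
    """Sketch of Eq. 5: reconstruction errors of both observations from the predicted HR-HS image."""
    def __init__(self, bands=31, factor=8):
        super().__init__()
        # spatial degradation: depthwise blur + stride-`factor` subsampling
        self.down = nn.Conv2d(bands, bands, kernel_size=factor, stride=factor, groups=bands, bias=False)
        # spectral degradation: 1x1 convolution playing the role of the camera response
        self.csf = nn.Conv2d(bands, 3, kernel_size=1, bias=False)

    def forward(self, z, lr_hs, hr_rgb):
        loss1 = F.l1_loss(self.down(z), lr_hs)   # LR-HS reconstruction term ("loss 1")
        loss2 = F.l1_loss(self.csf(z), hr_rgb)   # HR-RGB reconstruction term ("loss 2")
        return loss1 + loss2                     # "both"
```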

4 Conclusion

In this study, we proposed a new deep RGB-driven generative network that learns the latent HR-HS image from its degraded observations without the need for any external data. To build an efficient and effective specific CNN model, we adopted an encoder-decoder-based generative network with a shallow structure that is nevertheless able to perform multi-scale spatial context exploration over large receptive fields, learning a highly representative feature of the latent HR-HS image with the conditioned HR-RGB image as a guide. Moreover, since the scene under study does not have a ground-truth HR-HS image, we specifically designed convolution-based degradation modules to transform the HR-HS image predicted by the generative network, and then used the approximated observations to formulate the loss function for network training. Experimental results showed that our method significantly improves the performance over the SoTA methods. Acknowledgement. This research was supported in part by the Grant-in-Aid for Scientific Research from the Japanese Ministry of Education, Science, Culture and Sports (MEXT) under Grant No. 20K11867, and JSPS KAKENHI Grant Number JP12345678.

References 1. Akhtar, N., Shafait, F., Mian, A.: Sparse spatio-spectral representation for hyperspectral image super-resolution. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8695, pp. 63–78. Springer, Cham (2014). https:// doi.org/10.1007/978-3-319-10584-0 5 2. Bach, S.H., He, B., Ratner, A., R´e, C.: Learning the structure of generative models without labeled data. In: International Conference on Machine Learning, pp. 273– 282. PMLR (2017) 3. Bioucas-Dias, J.M., Plaza, A., Camps-Valls, G., Scheunders, P., Nasrabadi, N., Chanussot, J.: Hyperspectral remote sensing data analysis and future challenges. IEEE Geosci. Remote Sens. Mag. 1(2), 6–36 (2013) 4. Dian, R., Fang, L., Li, S.: Hyperspectral image super-resolution via non-local sparse tensor factorization. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 5344–5353 (2017) 5. Dian, R., Li, S., Guo, A., Fang, L.: Deep hyperspectral image sharpening. IEEE Trans. Neural Netw. Learn. Syst. 99, 1–11 (2018) 6. Dong, W., et al.: Hyperspectral image super-resolution via non-negative structured sparse representation. IEEE Trans. Image Process. 25(5), 2337–2352 (2016)


7. Fu, Y., Zhang, T., Zheng, Y., Zhang, D., Huang, H.: Hyperspectral image superresolution with optimized RGB guidance. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 11661–11670 (2019) 8. Han, X.H., Chen, Y.W.: Deep residual network of spectral and spatial fusion for hyperspectral image super-resolution. In: 2019 IEEE Fifth International Conference on Multimedia Big Data (BigMM), pp. 266–270. IEEE (2019) 9. Han, X.H., Shi, B., Zheng, Y.: Self-similarity constrained sparse representation for hyperspectral image super-resolution. IEEE Trans. Image Process. 27(11), 5625– 5637 (2018) 10. Han, X.H., Shi, B., Zheng, Y.: SSF-CNN: spatial and spectral fusion with CNN for hyperspectral image super-resolution. In: 2018 25th IEEE International Conference on Image Processing (ICIP), pp. 2506–2510. IEEE (2018) 11. Han, X.H., Sun, Y., Chen, Y.W.: Residual component estimating CNN for image super-resolution. In: 2019 IEEE Fifth International Conference on Multimedia Big Data (BigMM), pp. 443–447. IEEE (2019) 12. Han, X.H., Zheng, Y., Chen, Y.W.: Multi-level and multi-scale spatial and spectral fusion CNN for hyperspectral image super-resolution. In: Proceedings of the IEEE International Conference on Computer Vision Workshops (2019) 13. He, W., Zhang, H., Zhang, L., Shen, H.: Total-variation-regularized low-rank matrix factorization for hyperspectral image restoration. IEEE Trans. Geosci. Remote Sens. 54(1), 178–188 (2015) 14. Huang, Q., Li, W., Hu, T., Tao, R.: Hyperspectral image super-resolution using generative adversarial network and residual learning. In: ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 3012–3016. IEEE (2019) 15. Kawakami, R., Matsushita, Y., Wright, J., Ben-Ezra, M., Tai, Y.W., Ikeuchi, K.: High-resolution hyperspectral imaging via matrix factorization. In: CVPR 2011, pp. 2329–2336. IEEE (2011) 16. Lanaras, C., Baltsavias, E., Schindler, K.: Hyperspectral super-resolution by coupled spectral unmixing. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 3586–3594 (2015) 17. Liu, Z., Zheng, Y., Han, X.H.: Unsupervised multispectral and hyperspectral image fusion with deep spatial and spectral priors. In: Proceedings of the Asian Conference on Computer Vision (2020) 18. Liu, Z., Zheng, Y., Han, X.H.: Deep unsupervised fusion learning for hyperspectral image super resolution. Sensors 21(7), 2348 (2021) 19. Lu, G., Fei, B.: Medical hyperspectral imaging: a review. J. Biomed. Opt. 19(1), 010901 (2014) 20. Mertens, S., et al.: Proximal hyperspectral imaging detects diurnal and droughtinduced changes in maize physiology. Front. Plant Sci. 12, 240 (2021) 21. Park, S.M., Kim, Y.L.: Spectral super-resolution spectroscopy for biomedical applications. In: Advanced Chemical Microscopy for Life Science and Translational Medicine 2021, vol. 11656, p. 116560N. International Society for Optics and Photonics (2021) 22. Qu, Y., Qi, H., Kwan, C.: Unsupervised sparse dirichlet-net for hyperspectral image super-resolution. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2511–2520 (2018) 23. Sakthivel, S.P., Sivalingam, J.V., Shanmugam, S., et al.: Super-resolution mapping of hyperspectral images for estimating the water-spread area of peechi reservoir, southern india. J. Appl. Remote Sens. 8(1), 083510 (2014)


24. Saralıo˘ glu, E., G¨ orm¨ u¸s, E.T., G¨ ung¨ or, O.: Mineral exploration with hyperspectral image fusion. In: 2016 24th Signal Processing and Communication Application Conference (SIU), pp. 1281–1284. IEEE (2016) 25. Uezato, T., Hong, D., Yokoya, N., He, W.: Guided deep decoder: unsupervised image pair fusion. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12351, pp. 87–102. Springer, Cham (2020). https://doi.org/10. 1007/978-3-030-58539-6 6 26. Wang, J., Kwon, S., Shim, B.: Generalized orthogonal matching pursuit. IEEE Trans. Signal Process. 60(12), 6202–6216 (2012) 27. Wycoff, E., Chan, T.H., Jia, K., Ma, W.K., Ma, Y.: A non-negative sparse promoting algorithm for high resolution hyperspectral imaging. In: 2013 IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 1409–1413. IEEE (2013) 28. Xie, Q., Zhou, M., Zhao, Q., Meng, D., Zuo, W., Xu, Z.: Multispectral and hyperspectral image fusion by MS/HS fusion net. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 1585–1594 (2019) 29. Xu, J.L., Riccioli, C., Sun, D.W.: Comparison of hyperspectral imaging and computer vision for automatic differentiation of organically and conventionally farmed salmon. J. Food Eng. 196, 170–182 (2017) 30. Yokoya, N., Chan, J.C.W., Segl, K.: Potential of resolution-enhanced hyperspectral data for mineral mapping using simulated EnMAP and sentinel-2 images. Remote Sens. 8(3), 172 (2016) 31. Yokoya, N., Zhu, X.X., Plaza, A.: Multisensor coupled spectral unmixing for timeseries analysis. IEEE Trans. Geosci. Remote Sens. 55(5), 2842–2857 (2017) 32. Yue, L., Shen, H., Li, J., Yuan, Q., Zhang, H., Zhang, L.: Image super-resolution: the techniques, applications, and future. Signal Process. 128, 389–408 (2016) 33. Zhang, S., Liang, G., Pan, S., Zheng, L.: A fast medical image super resolution method based on deep learning network. IEEE Access 7, 12319–12327 (2018) 34. Zhao, Y., Yang, J., Zhang, Q., Song, L., Cheng, Y., Pan, Q.: Hyperspectral imagery super-resolution by sparse representation and spectral regularization. EURASIP J. Adv. Signal Process. 2011(1), 1–10 (2011) 35. Zhu, Z., Hou, J., Chen, J., Zeng, H., Zhou, J.: Residual component estimating CNN for image super-resolution, vol. 30, pp. 1423–1428 (2020)

Gift from Nature: Potential Energy Minimization for Explainable Dataset Distillation Zijia Wang(B), Wenbin Yang, Zhisong Liu, Qiang Chen, Jiacheng Ni, and Zhen Jia Dell Technologies OCTO Research Office, Shanghai, China Zijia [email protected] Abstract. Dataset distillation aims to reduce the dataset size by capturing the important information of the original dataset. It can significantly improve feature extraction effectiveness, storage efficiency, and training robustness. Furthermore, we study the features obtained from dataset distillation and find unique discriminative properties that can be exploited. Therefore, based on Potential Energy Minimization, we propose a generalized and explainable dataset distillation algorithm, called Potential Energy Minimization Dataset Distillation (PEMDD). The motivation is that when the distribution of each class is regular (that is, almost a compact high-dimensional ball in the feature space) and has minimal potential energy at its location, the mixed distribution of all classes should be stable. In this stable state, the Unscented Transform (UT) can be used to distill the data and to reconstruct the stable distribution from these distilled data. Moreover, a simple but efficient framework for using the distilled data to fuse different datasets is proposed, where only lightweight fine-tuning is required. To demonstrate the superior performance over other works, we first visualize the classification results in terms of storage cost and performance. We then report quantitative improvements by comparing our proposed method with other state-of-the-art methods on several datasets. Finally, we conduct experiments on few-shot learning and show the efficiency of our proposed method, with significant improvement in terms of the storage size requirement.
Keywords: Dataset distillation · Potential energy

1 Introduction

Energy consumption in deep learning model training is a big concern in real-world applications [29]; one of the solutions to this problem is knowledge distillation [10]. It transfers the knowledge from a deep, complex model to a simple one, so that running time can be saved. Another direction is dataset distillation [9,30,38], which tries to synthesize a few data samples or features that summarize the information in the original huge dataset and to use these synthesized data to train models more efficiently. However, most dataset distillation algorithms mainly focus on the information in the models and try to reproduce their ability by utilizing small models or


Fig. 1. If the distilled data is not biased (top left), even the overfitting boundary generated by deep learning models, which is common in few-shot learning, can represent the ground-truth distribution better than the biased one (top right). For a properly estimated distribution (yellow circle), the condition is the same. (Color figure online)

small datasets. The properties of the dataset itself are ignored. In the naive dataset distillation [39], the dataset properties are explored through a "black box" meta-training process, which is neither robust nor explainable. It should be noticed that, in the setting of dataset distillation, which is essentially a few-shot learning scenario, models tend to overfit such small datasets [42]. The obtained distilled dataset is only adequate for the networks used in the training iterations of the distillation process. Therefore, to avoid overfitting and the rigid training constraints, we hope to distill the data while taking the inherent statistical properties of the dataset into consideration. Note that the dimension of the feature space is much lower, so it is easier to calibrate [41]. Also, the mean and variance of a Gaussian distribution can be transferred across similar classes [26]; all transformations are performed in the feature space (Fig. 2 left), so we use a Gaussian distribution to calibrate the feature distribution. In the final calibrated space, the features of different classes are expected to be sufficiently far apart, while features of the same class should follow a distribution with low variance. The second requirement can be achieved in the distribution calibration stage; the ideal distance in the first requirement, however, is hard to determine. A large distance could make classification more accurate, but it makes the feature space sparse. On the other hand, a small distance would cause a decrease in accuracy. Therefore, we choose to adopt the idea of a potential energy stable equilibrium: this equilibrium exists when the net force is zero, and any change to the system would increase the potential energy [8]. This stable equilibrium describes the ideal distance between the centers of the classes (Fig. 2 right). In the stable system, the features from different classes can be easily classified with a simple classifier. What is more, the centers and edge points can serve as the distilled data, which is the best subset of the original dataset considering the data distribution; if new models are then trained on these distilled data, the accuracy of models trained on the original dataset can be recovered.


Fig. 2. In the original feature space (left), the distributions are random and hard to classify. After distribution calibration and transformation, a stable state is achieved (right) in which the distributions are tight and all classes are 'stable' according to the potential energy stable equilibrium. Distilled data can be easily chosen based on this stable system; these distilled data, along with the transformation matrix, can be used to fuse datasets and evaluate data quality.

In the experiments, besides the original goal of dataset distillation (accuracy recovery), visualization results show that the chosen images are explainable and have good diversity; this explainability can be further exploited to make more use of the distilled samples. We also use the experimental results to show that a simple classifier with our dataset distillation strategy can handle data fusion and data quality evaluation tasks well. As an extension of current dataset distillation algorithms, our algorithm can also outperform SOTA results in few-shot learning. Overall, our contributions are:
– Potential Energy Minimization based Dataset Distillation (PEMDD). We apply the concept of PEM and distribution calibration to the feature vectors to find the stable state in the feature space. The dataset can therefore be distilled while, to the maximum extent, avoiding harm to the reproduction of the distribution. Only a few parameters are added in our model. Moreover, this strategy is also shown to be useful in few-shot learning.
– Unscented Transformation (UT) for dataset distillation. UT is used to distill the data, which can then be used to reconstruct the stable distribution from the distilled samples. The reconstructed distribution can be used to perform up-sampling and other downstream tasks.
– A framework for applications using dataset distillation and our stable system. Based on the distilled data and the transformation matrix of the stable system, we propose frameworks to fuse datasets in three scenarios: (i) the two datasets share the same classes; (ii) the new dataset contains new classes; (iii) the two datasets share some classes while the new dataset also contains new classes. In addition, a PEM solution for few-shot learning is also tested in this paper with good results.


Fig. 3. Workflow for PEM-based dataset distillation. (a) Training process. This part shows the training process in which the PEM transformation model is trained; the stable distribution is then derived, and the distilled data is selected from the original dataset based on the sampling strategy in the stable distribution system. (b) Testing process. This part illustrates the testing process: the test sample is first transformed into the feature space and then converted into the stable system using the trained PEM transformation model. (c) Distilled data for applications. This part demonstrates the basic strategy of using the distilled data: they can be used to reconstruct the distributions of each class, which are ideally identical to the distributions in the stable system. This strategy can be flexibly adapted to different scenarios.

In the remainder of this paper, preliminaries and related works are reviewed in Sect. 2. The details of the PEM-based dataset distillation algorithm are presented at the beginning of Sect. 3, and the rest of that section describes the solutions for few-shot learning and dataset fusion. Finally, the experiments are described in Sect. 4. Concretely, we use few-shot learning, where our algorithm achieves competitive results compared with SOTA methods, to show the power of the stable state derived by PEMDD; the dataset fusion experiment then shows the advantages of our distilled data by almost recovering the classification accuracy while keeping only a few data samples.

2 Related Work

2.1 Dataset Distillation

As the computational cost of deep learning becomes more and more expensive, model compression has started to attract much attention from researchers [1,11,23]. Dataset distillation [38] is one such direction and was first introduced inspired by


network distillation [10]. Some theoretical works illustrate the intuition behind dataset distillation [31] and extend the initial dataset distillation algorithm [30]. Besides these works, many studies have shown impressive results in generating or selecting a small number of data samples [2,3,7,27,32,33] utilizing active learning, core set selection, etc. Although our idea is also to find a core dataset, we borrow the idea of dataset distillation to deal with data depending on the network information and select the core dataset based on the hyper-ball (calibrated distribution) we generate in the stable feature space. Moreover, the advantages of the stable system make it possible to use the distilled data and the transformation matrix to perform more downstream tasks.

2.2 Distribution Calibration

As shown in Fig. 1, some metrics could be used to estimate the distribution based on the distilled data [13,37], but most of them assume that the type of distribution is known. To better reconstruct the distribution from the distilled data, the data should follow a specific distribution (Gaussian in this paper). Many papers have tried to calibrate the distribution of the data for different purposes [24,28,42]; the main idea is to calibrate the data distribution into a regular and tight distribution. However, the calibrated distributions in these algorithms cannot be directly used in the dataset distillation setting. In dataset distillation, we want the distilled data to maximally contain the information of the original dataset without being affected by other classes. Furthermore, in the application phase, the addition of new data samples would make the system unstable if the inter-class distances are not stable, so we use the concept of potential energy to avoid such a risk to the greatest extent possible.

2.3 Potential Energy

Potential energy is a simple concept in physics. In this theory, there exists an equilibrium distance between two particles (r0). When the distance is larger than r0, the resultant force is attractive, while at smaller distances it becomes repulsive. Only when the resultant force is zero is the potential energy at its minimum, which means that the system is stable [4,18]. In this paper, we adapt this concept to find the ideal distance between two feature vectors. We use molecular potential energy to optimize the positions of the centroids to make them easy to classify while not being too far apart, and atomic potential energy is then used to optimize the positions of features of the same class to make them close enough.
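As a quick check of this equilibrium idea with the specific potential form adopted later in Eq. (3), namely E(r) = 1/r^3 − 1/r^2 (in arbitrary units), the stable distance follows directly:

$$\frac{dE}{dr} = -\frac{3}{r^{4}} + \frac{2}{r^{3}} = \frac{2r-3}{r^{4}} = 0 \;\Longrightarrow\; r_{0} = \frac{3}{2}, \qquad \frac{d^{2}E}{dr^{2}}\Big|_{r_{0}} = \frac{12}{r_{0}^{5}} - \frac{6}{r_{0}^{4}} \approx 0.40 > 0,$$

so r0 = 3/2 is indeed a minimum of this potential, i.e., the stable equilibrium distance referred to above.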

3 Method

In this section, the basic problem of dataset distillation is first defined in Sect. 3.1 and our solution is presented in Sect. 3.2; a concrete scenario is then used to demonstrate the application framework in Sect. 3.3. Finally, our algorithm is applied to few-shot learning problems and a thorough analysis is given.

3.1 Problem Definition

Given a labelled dataset D = {xi, yi}, where xi ∈ R is a raw data sample and yi ∈ C is the corresponding label, with C denoting the set of classes, assume a pretrained model M that extracts features F = {fi} from D, where fi is the feature vector of xi. The goal of dataset distillation is to select a few data samples that capture the most important distribution properties of D. To realize dataset distillation, we propose to learn a transformation T that transfers F into a new embedding space where features from the same class become more compact while features from different classes stay at a 'moderate' distance. Then, based on the transferred distribution, a subset of the original dataset D is collected as the distilled dataset S.

3.2 PEM-Based Transformation

Tukey’s Ladder of Powers Transformation. To make the feature distributions more regular, i.e. be more like Gaussian, the first step in PEM-based transformation is adopting the Tukey’s Ladder of Powers Transformation [34] to reduce the skewness of distributions. The function of Tukey’s Ladder of Powers Transformation can be varied based on the configuration of the power. The formulation of this transformation can be expressed as:  if P = 0 xP (1) x ˆ= log(x) if P = 0 where P is the hyper-parameter which could control the way of distribution regularization. To recover the feature distribution, P should be set to 1. If the P decreases, the distribution becomes less positively skewed and vice versa. Potential Energy Minimization. Considering a linear transformation with weight WT as (2) Fs = WT F where Fs is the desired feature, F is the input feature. We hope to find a suitable distance among classes to ensure the diversity of the latent feature. To achieve this goal, recall the potential energy expression [4], which may have different forms. In this paper we use the following formula:


Fig. 4. Visualization of dataset fusion with PEM on CUB-200. (a) shows the original feature distributions; after the PEM transformation, they are stable in (b). In (c), new samples whose classes are the same as those in (b) are added, and in (d) they are handled well by PEM. In (e), samples of new classes (purple points) are added; the PEM transformation results after fine-tuning are shown in (f), where the system is in the stable equilibrium again. (Color figure online)

E(r) = \frac{1}{r^{3}} - \frac{1}{r^{2}}    (3)

where r is the distance between two particles, and r0 is the optimal distance for minimal potential energy. We adapt this equation and derive our loss function to learn a linear transformation WT that minimizes the 'potential energy' between the extracted features of every pair of data points:

L = \sum_{i=1}^{N-1}\sum_{j=i+1}^{N}\left[\frac{1}{(\gamma_{ij} d_{ij}+b_{0})^{3}}-\frac{1}{(\gamma_{ij} d_{ij}+b_{0})^{2}}\right]    (4)

where dij = dis(WT fi, WT fj), with dis(·, ·) representing the Euclidean distance, fi, fj are the input features, and N is the number of data samples. The hyper-parameter b0 (0 < b0 < r0) is introduced to improve the numerical stability of the PE. γij is a function controlling the properties of the loss, defined as

\gamma_{ij} = \begin{cases} \tau_{0} & \text{if } y_{i} \neq y_{j} \\ \tau_{1} & \text{if } y_{i} = y_{j} \end{cases}    (5)

where τ0, τ1 are the inter-class and inner-class weights, respectively. In this paper, we set 0 < τ1 < τ0.
Process of Dataset Distillation. In Fig. 3 (a), a co-stable system with the desired distribution is derived after the optimized transformation. Some data points can then be sampled based on the stable equilibrium. Considering the inner-class


Gaussian-like properties, we use the data sampling strategy of the UT [36,37] to get the distilled data points S (the sigma points in UT) and their corresponding weights ω for distribution reconstruction. There are two sets of weights, ωm and ωc: ωm is used to recover the means of the original distribution, while ωc is used for the reconstruction of the covariance matrix. For each class, the first sample in the sigma point set is S[0] = μ, with μ representing the mean; the other samples are sampled as follows:

S[i] = \begin{cases} \mu + V_{i} & \text{for } i = 1, \dots, d \\ \mu - V_{i-d} & \text{for } i = d+1, \dots, 2d \end{cases}    (6)

where V_{i} = \sqrt{(d+\lambda)\Sigma}\,(:, i), d stands for the dimension of the features, λ is the scaling parameter, Σ is the covariance matrix (easily estimated from the data samples), and (:, i) denotes the ith column. For this sequence S, the corresponding weights ωm for mean estimation with the sigma set are

\omega_{m}^{[i]} = \begin{cases} \frac{\lambda}{d+\lambda} & \text{if } i = 0 \\ \frac{1}{2(d+\lambda)} & \text{if } i = 1, \dots, 2d \end{cases}    (7)

where ωm^{[i]} is the weight of the ith element of S. The weights ωc for calculating the covariance are

\omega_{c}^{[i]} = \begin{cases} \omega_{m}^{[0]} + H & \text{if } i = 0 \\ \frac{1}{2(d+\lambda)} & \text{if } i = 1, \dots, 2d \end{cases}    (8)

In Eqs. 7 and 8, λ = α²(d + k) − d and H = 1 − α² + β. To control the distance between the sigma points and the mean, we can adjust α ∈ (0, 1] and k ≥ 0. In the literature [36,37], β = 2 is an optimal choice for Gaussian distributions. Note that the sigma points S are sampled in the latent feature space. Finally, for data sample selection, we suggest modeling the dataset distillation as an assignment problem and selecting the data samples according to the distance (such as the Euclidean distance) between real data features and the sigma points. When only one-to-one correspondences are considered, modeled as bipartite graph matching, the Hungarian algorithm [12] can be used to solve the assignment problem in polynomial time. After the bipartite graph matching, the sigma points are assigned to real data samples.
Classification for New Samples. As shown in Fig. 3 (b), when a new test sample comes in, it is first transformed by the pretrained model (feature extractor) and our PEM model; then a simple classifier such as a logistic classifier [21] can be used to classify this sample effectively and robustly.
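A compact NumPy sketch of the sigma-point construction and weights above is given below. It is a hedged illustration: the indexing follows the standard UT formulation of [36,37], a Cholesky factor is used as the matrix square root, and the α, k, β values are placeholders rather than the authors' exact settings.

```python
import numpy as np

def sigma_points(mu, sigma, alpha=0.5, k=0.0, beta=2.0):
    """mu: (d,) class mean in the stable feature space; sigma: (d, d) covariance."""
    d = mu.shape[0]
    lam = alpha ** 2 * (d + k) - d
    L = np.linalg.cholesky((d + lam) * sigma + 1e-6 * np.eye(d))   # columns play the role of V_i
    S = np.vstack([mu] + [mu + L[:, i] for i in range(d)] + [mu - L[:, i] for i in range(d)])
    w_m = np.full(2 * d + 1, 1.0 / (2 * (d + lam)))
    w_m[0] = lam / (d + lam)
    w_c = w_m.copy()
    w_c[0] += 1.0 - alpha ** 2 + beta             # H = 1 - alpha^2 + beta
    return S, w_m, w_c                            # (2d+1, d) sigma points and their weights

# with d = 12 as in the experiments, each class yields 2*12 + 1 = 25 sigma points
```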

while ωm is the weight for the ith element in S, then the equation of the weight ωc for calculating the covariance is:  [0] ωm + H if i = 0 ωc[i] = (8) 1 if i = 1, ..., 2d 2(d+λ) In Eqs. 7 and 8, λ = α2 (d+k)−d and H = 1−α2 +β. To control the distance between the sigma points and the mean, we could adjust α ∈ (0, 1] and k ≥ 0. In some literature [36,37], β = 2 is an optimal choice for Gaussian. It should be noticed that, the sigma points S are just sampled in the latent feature space. Finally, for data sample selection, we suggest modeling the dataset distillation as an assignment problem and select the data samples according to the distance (such as the Euclidean distance) between real data features and the sigma points. When considering only one-to-one correspondences modeled as bipartite graph matching, Hungarian algorithm [12] can be used to solve the assignment problem in polynomial time. After the bipartite graph matching, sigma points will be assigned to real data samples. Classification for New Samples. As shown in Fig. 3 (b), when a new test sample comes in, it is first transformed by the pretrained model (feature extractor) and our PEM model, then a simple classifier like Logistic classifier [21] could be used to classify this sample effectively and robustly.

248

3.3

Z. Wang et al.

How to Fuse Datasets

In this part, we extend the basic problem defined in Sect. 3.1 to demonstrate the solution in some more concrete application scenarios in dataset fusion. Dataset Fusion Problem Definition. Recall the problem defined in Sect. 3.1, a distilled dataset S is derived from the labelled dataset D = {xi , yi }. Now , yinew } appears, where xnew ∈ R is the assuming a new dataset Dnew = {xnew i i new new ∈ C is the data label with Cnew representing the new raw data and yi class categories. The goal in this section is to find a fine-tuned PEM transformation T new to realize a new stable state for fused dataset Df use = S ∪ Dnew . There are mainly 2 settings in this problem. The first setting is that two datasets share exactly same classes, i.e. Cnew = C. The other one setting considers the Cnew = C. Distribution Fusion. In both setting, the fusion process is similar to the process described in Sect. 3.2. At first, to estimate the fused statistics more accurately, we will up-sampling some feature-points. Considering the distribution for each class after PEM based transformation are Gaussian-like, features can be easily generated with re-parameter trick. Then a PEM training process as shown in Fig. 3 (a) is performed on Dnew to get the stable distribution. 3.4

3.4 Few-Shot Learning

In PEM, the inner-class compactness and inter-class diversity of the latent features are the foundation of our success in dataset distillation. We therefore extend the PEM strategy to the few-shot learning setting to show our advantage in exploiting data properties.
Few-Shot Learning Problem Definition. A few-shot learning problem can be seen as a simple extension of the problem defined in the previous section, where the samples in Dnew are very few. Tasks in few-shot learning are called N-way-K-shot [35]: there are N classes in Cnew and K labelled samples for each class.
Few-Shot Learning Solution. The solution for few-shot learning can be modeled as a simplified version of PEMDD. First, the PEM transformation is trained using the very few samples to obtain the stable state; this transformation is then used to map the test features into the latent feature space. Finally, a simple classifier, such as logistic regression, can be utilized for label prediction. We show more details of our implementation of few-shot learning in Sect. 4.
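A minimal sketch of this inference pipeline is shown below, assuming the backbone features have already been extracted. The Tukey power P = 0.5, the use of scikit-learn's LogisticRegression, and the matrix `W` standing for the learned PEM transformation are illustrative assumptions, not the authors' exact configuration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def few_shot_predict(support_feats, support_labels, query_feats, W, power=0.5):
    """support_feats, query_feats: (n, d_in) non-negative backbone features; W: (d_out, d_in)."""
    tukey = lambda x: np.power(np.maximum(x, 0.0), power)   # Tukey's ladder of powers
    zs = tukey(support_feats) @ W.T                         # support set in the stable space
    zq = tukey(query_feats) @ W.T                           # queries mapped by the same transformation
    clf = LogisticRegression(max_iter=1000).fit(zs, support_labels)
    return clf.predict(zq)
```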

3.5 Analysis

Intuitively, the proposed PEM framework shares some ideas with the well-known Fisher Discriminant Analysis (FDA) [19]. Both methods try to increase the diversity between classes and reduce the diversity within classes. However, PEM is constructed based on the steady state of the PE function, which is borrowed from physics and makes the separation between classes more balanced. Meanwhile, our PEM also guarantees the existence of a stable state for all systems. These reasons allow the PEM-based method to maintain good performance on few-sample datasets compared with methods such as FDA. Our method is also different from the dataset distillation in [39]. The dataset distillation of [39] aims to replicate the performance of the entire dataset from synthetic points, whereas our method tries to select informative samples together with their corresponding linear transformation weights. Our method allows a more flexible selection of data samples, while dataset distillation must predefine the number of synthesized samples before training. A Hungarian algorithm was used in Sect. 3.2; one can also increase the number of samples to be selected with an unbalanced Hungarian algorithm. For a classical Hungarian problem for data sample selection, the complexity is only O(d³). Last but not least, our method can also be adopted as a first principle to learn diverse and discriminative features for downstream tasks.
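For reference, the one-to-one assignment step discussed here can be sketched with SciPy's Hungarian solver; the function below is an illustrative reading of the selection step (assuming there are at least as many real samples as sigma points), not the authors' exact implementation.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def select_distilled_samples(sigma_pts, real_feats):
    """Assign each sigma point to a distinct real sample by bipartite matching
    on Euclidean distance; returns indices of the selected (distilled) samples."""
    cost = np.linalg.norm(sigma_pts[:, None, :] - real_feats[None, :, :], axis=-1)
    rows, cols = linear_sum_assignment(cost)   # rows: sigma points, cols: real samples
    return cols
```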

4 Experiment and Discussion

4.1 Experiment Setup

Datasets. In this paper, miniImageNet [22] and CUB-200 [40] are used to evaluate our algorithm. For miniImageNet, in the few-shot learning validation experiment, all classes are split into 64 base classes, 16 validation classes, and 20 novel classes, as in [22]. However, in the data fusion experiments and the visualization part, only 10 classes are selected from miniImageNet for practicality and visualization simplicity; the train-test-split conditions are described in the experiment part (Sect. 4.3). CUB-200 is a fine-grained benchmark for few-shot learning. It contains 11,788 images of size 84 × 84 × 3 for 200 different classes of birds. These classes are split into 100 base classes, 50 validation classes, and 50 novel classes [5].
Evaluation Metric. For different tasks, we adopted different metrics. In the few-shot learning setting, top-1 accuracy is used to evaluate our strategy in the 5way1shot and 5way5shot settings on both datasets [42]. In the data fusion setting, all experiments are variants of the image classification task, so top-1 accuracy is again used to evaluate the performance of the different strategies.


Fig. 5. Visualization of dataset distillation for dog.

Implementation Details. In all experiments, we use a pretrained WideResNet [17] to extract the features; features are extracted from the layer before the fully connected layer with a ReLU activation function, which makes the output non-negative [42]. The parameters τ0, τ1, b0 are set to 1, 0.1, and 0.3, respectively. The embedding dimension is set to 12; therefore, the number of sigma points for each class is 25. We also follow the Tukey transformation settings of [42]. In the few-shot learning part, we follow the solution described in the previous section to deal with the extracted features, and the nearest-center rule is used as the classifier. In the data fusion part, we choose a logistic classifier to represent the current SOTA results and compare our results with them. We use the logistic classifier implementation of scikit-learn [20] with the default settings.

4.2 Empirical Understanding of PEMDD

Table 1a shows the experimental results in a regular setting. In this setting, for each class, 450 samples are used to train the PEM model and 150 samples are used as the test set. As shown in Table 1a, when the training samples are reduced to 25%, no matter what strategy is used to choose the samples, a huge decline in accuracy occurs. In our strategy, we adopt a logistic classifier [20]. With our strategy (PEM with sampling), we sample 1000 points for each class in the test phase; the results are almost recovered, while a fine-tuned PEM further enhances the performance (from 0.943 to 0.988). To further illustrate the advantages of our results, we visualize the 20 "selected" images chosen from the miniImageNet dataset for the "dog" class (Fig. 5). We observe that the selected images are much more diverse and representative than those selected randomly from the dataset (with a random selection program), indicating that such PEM-based distilled images can be used as a good "summary" of the dataset. Finally, we compare with the state-of-the-art dataset distillation method [39]. We test our method on the CIFAR-10 dataset. The model is identical to the one used in [39], which can achieve about 80% test accuracy on CIFAR-10 in a fully supervised setting. The dataset distillation method synthesizes 100 pictures (10

Table 1. Basic experiments of PEMDD.

(a) Experiment results for the basic classification task
Method | CUB-200 | miniImageNet
Full data | 1.000 | 0.938
Class-based random (10%) | 0.378 | 0.258
Most remote (10%) | 0.314 | 0.463
PEM | 0.981 | 0.923
PEM+sampling | 0.943 | 0.911
PEM+sampling+PEM | 0.988 | 0.941

(b) Experiment results compared with the original dataset distillation algorithm
Method | 50 | 100 | 150 | 200
All random | 0.091 | 0.132 | 0.158 | 0.165
Class-based random | 0.092 | 0.143 | 0.174 | 0.220
K-means | 0.105 | 0.184 | 0.223 | 0.347
Dataset distillation (random init) | – | 0.368 | – | –
Dataset distillation (fixed init) | – | 0.540 | – | –
PEMDD | 0.247 | 0.519 | 0.614 | 0.719

pictures for each class). The PEM based method, K-means and Random selection will select 50–200 pictures. The embedding dimension size of PEM is set to 10. All the result is shown in Table 1b. 4.3

Data Efficiency Application

In this section, we perform a series of experiments to test our strategy in different settings. For Table 2a, we split the 10-th class equally: 300 samples are used to train the PEM and the others are used as new samples. The other settings are the same as before. In this setting, the two sets have different distributions for the 10-th class, so conventional methods built on the i.i.d. assumption may suffer a performance decrease. As shown in Table 2a, with a simple fine-tune after adding the new data, the PEM-based method obtains a good test accuracy (0.980). Then, we remove the 8-th class from the training dataset and treat it as a totally new class. As shown in Table 2b, the PEM-based method with fine-tuning also gives a good result (0.981) in this setting.

Table 2. Data efficiency of PEMDD.

(a) Experiment results for adding new samples with the same classes
Method | CUB-200 | miniImageNet
Full data | 1.000 | 0.82
Class-based random (10%) | 0.176 | 0.238
PEM+Sampling | 0.516 | 0.610
PEM+Sampling+PEM | 0.980 | 0.932

(b) Experiment results for adding new classes
Method | CUB-200 | miniImageNet
Full data | 1.000 | 0.938
Class-based random (10%) | 0.168 | 0.185
PEM+Sampling | 0.588 | 0.681
PEM+Sampling+PEM | 0.981 | 0.940

To illustrate the effectiveness of PEM in the above three tasks, we visualize the features on the CUB-200 dataset with t-SNE [16] before and after the PEM optimization. In Fig. 4, (a), (c), and (e) show the distributions before the PEM

Table 3. Experiment results of our few-shot learning solution.

Methods | miniImageNet 5way1shot | miniImageNet 5way5shot | CUB 5way1shot | CUB 5way5shot
LEO [25] | 63.80 | 77.59 | – | –
Negative-Cosine [14] | 62.33 | 80.94 | 72.66 | 89.40
TriNet [6] | 58.12 | 76.92 | 69.61 | 84.10
E3BM [15] | 63.80 | 80.29 | – | –
LR with DC [42] | 68.57 | 82.88 | 79.56 | 90.67
S2M2 R [17] | 64.93 | 83.18 | 80.68 | 90.85
PEM-S | 58.48 | 82.75 | 72.81 | 90.55

transformation. In (a), the distributions are hard to classify, while in (b) the features of different classes are diverse and easy to classify. Similarly, in (c) and (e), when new samples come in, the clusters become unstable, and PEM can stabilize them again, as shown in (d) and (f). Overall, in the most stable state, all classes keep a "safe" distance.

4.4 Few-Shot Application

In this experiment, a PEM transformation is learned from the few-shot training samples to obtain the stable state; this transformation is then used in the testing phase to transform the test-set features into the stable state. Our method, PEM-S, includes the up-sampling process mentioned before, with a sampling number of 500. The experimental results are summarized in Table 3. As shown in Table 3, our results achieve SOTA in 5way5shot learning, though they do not reach the best performance in 5way1shot learning. This is because our strategy partly relies on inner-class compactness, and the 1-shot setting cannot provide such information for our PEM-S method.

4.5 Hyper-parameters

To show the effectiveness and robustness of the proposed method, we run several hyper-parameter studies on the baseline problem with the miniImageNet dataset.
Properties of the PE Function. The properties of the PE function are mainly affected by γij. For simplicity, we fix the value of τ0 to 1 and run a comprehensive study of the selection of τ1. Figure 6a illustrates the average PE value (1/(N(N − 1)) times the PE of the whole stable distribution) on the training dataset together with the test accuracy. It can be seen that appropriate values of τ1 give the system a lower PE and better test accuracy.
Embedding Size. We test our method with different embedding sizes. From Fig. 6b, one can find that the embedding size only slightly affects the PE values when it is larger than 7.

Fig. 6. Hyper-parameter testing. (a) The effect of different values of τ1. (b) The effect of the number of embedding dimensions.

Different Backbone Networks. Table 4 shows consistent performance across different feature extractors, i.e., five convolutional layers (conv4), AlexNet, vgg16, resnet18, and WRN28 (Baseline). It can be concluded that the PEM-based method achieves almost a 10× accuracy improvement for conv4, AlexNet, and vgg16. Moreover, WRN28 achieves the best performance, because WRN28 is trained in a semi-supervised manner that considers the manifold information of the dataset.

Table 4. Experiment results with different backbones.

Backbones | Class-based random | PEM-based
conv4 | 0.089 | 0.858
AlexNet | 0.098 | 0.913
vgg16 | 0.106 | 0.905
resnet18 | 0.151 | 0.921
WRN28 (Baseline) | 0.258 | 0.941

5 Conclusion and Future Work

In this paper, we propose a PEM-based framework for DD and few-shot learning settings. PEM helps the features reach a stable state in the new embedding space, where they exhibit inner-class compactness and inter-class diversity, which is the foundation of the UT-based DD. Experimental results in multiple scenarios reveal the superiority of our PEM strategy. Our PEM-based framework shows that the limited distilled data can recover, or even outperform, the performance obtained with the original data while largely reducing computation and storage costs. Future work will explore more application scenarios for distilled data. Moreover, statistical properties may also be used to generate samples instead of choosing samples from the dataset. Furthermore, PEM can be introduced as a first principle for machine learning problems instead of the conventional 'black box' models.


References 1. Ba, L.J., Caruana, R.: Do deep nets really need to be deep? arXiv preprint arXiv:1312.6184 (2013) 2. Bachem, O., Lucic, M., Krause, A.: Practical coreset constructions for machine learning. arXiv preprint arXiv:1703.06476 (2017) 3. Bezdek, J.C., Kuncheva, L.I.: Nearest prototype classifier designs: an experimental study. Int. J. Intell. Syst. 16(12), 1445–1473 (2001) 4. Jain, J.M.C.: Textbook of Engineering Physics. Prentice-Hall of India Pvt. Limited, Indore (2009). https://books.google.com/books?id=DqZlU3RJTywC 5. Chen, W.Y., Liu, Y.C., Kira, Z., Wang, Y.C.F., Huang, J.B.: A closer look at few-shot classification. arXiv preprint arXiv:1904.04232 (2019) 6. Chen, Z., Fu, Y., Zhang, Y., Jiang, Y.G., Xue, X., Sigal, L.: Multi-level semantic feature augmentation for one-shot learning. IEEE Trans. Image Process. 28(9), 4594–4605 (2019) 7. Cohn, D.A., Ghahramani, Z., Jordan, M.I.: Active learning with statistical models. J. Artif. Intell. Res. 4, 129–145 (1996) 8. Dina, Z.E.A.: Force and potential energy, June 2019. https://chem.libretexts.org/ @go/page/2063 9. Hariharan, B., Girshick, R.B.: Low-shot visual object recognition. CoRR abs/1606.02819 (2016). http://arxiv.org/abs/1606.02819 10. Hinton, G., Vinyals, O., Dean, J.: Distilling the knowledge in a neural network (2015) 11. Howard, A.G., et al.: MobileNets: efficient convolutional neural networks for mobile vision applications. arXiv preprint arXiv:1704.04861 (2017) 12. Kuhn, H.W.: The Hungarian method for the assignment problem. Naval Res. Logistics Q. 2(1–2), 83–97 (1955) 13. Larra˜ naga, P., Lozano, J.A.: Estimation of Distribution Algorithms: A New Tool for Evolutionary Computation, vol. 2. Springer, Cham (2001). https://doi.org/10. 1007/978-1-4615-1539-5 14. Liu, B., et al.: Negative margin matters: understanding margin in few-shot classification. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12349, pp. 438–455. Springer, Cham (2020). https://doi.org/10.1007/ 978-3-030-58548-8 26 15. Liu, Y., Schiele, B., Sun, Q.: An ensemble of epoch-wise empirical Bayes for fewshot learning. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12361, pp. 404–421. Springer, Cham (2020). https://doi.org/10. 1007/978-3-030-58517-4 24 16. Van der Maaten, L., Hinton, G.: Visualizing data using T-SNE. J. Mach. Learn. Res. 9(11), 2579–2605 (2008) 17. Mangla, P., Kumari, N., Sinha, A., Singh, M., Krishnamurthy, B., Balasubramanian, V.N.: Charting the right manifold: manifold mixup for few-shot learning. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 2218–2227 (2020) 18. McCall, R.: Physics of the Human Body. Johns Hopkins University Press, Baltimore (2010). https://books.google.com/books?id=LSyC41h6CG8C 19. Mika, S., Ratsch, G., Weston, J., Scholkopf, B., Mullers, K.R.: Fisher discriminant analysis with kernels. In: Neural Networks for Signal Processing IX: Proceedings of the 1999 IEEE Signal Processing Society Workshop (Cat. no. 98th8468), pp. 41–48. IEEE (1999)


20. Pedregosa, F., et al.: Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011) 21. Peterson, L.E.: K-nearest neighbor. Scholarpedia 4(2), 1883 (2009) 22. Ravi, S., Larochelle, H.: Optimization as a model for few-shot learning (2016) 23. Romero, A., Ballas, N., Kahou, S.E., Chassang, A., Gatta, C., Bengio, Y.: FitNets: hints for thin deep nets. arXiv preprint arXiv:1412.6550 (2014) 24. Rueda, M., Mart´ınez, S., Mart´ınez, H., Arcos, A.: Estimation of the distribution function with calibration methods. J. Stat. Plann. Infer. 137(2), 435–448 (2007) 25. Rusu, A.A., et al.: Meta-learning with latent embedding optimization. arXiv preprint arXiv:1807.05960 (2018) 26. Salakhutdinov, R., Tenenbaum, J., Torralba, A.: One-shot learning with a hierarchical nonparametric Bayesian model. In: Proceedings of ICML Workshop on Unsupervised and Transfer Learning, pp. 195–206. JMLR Workshop and Conference Proceedings (2012) 27. Sener, O., Savarese, S.: Active learning for convolutional neural networks: a core-set approach. arXiv preprint arXiv:1708.00489 (2017) 28. Song, H., Diethe, T., Kull, M., Flach, P.: Distribution calibration for regression. In: International Conference on Machine Learning, pp. 5897–5906. PMLR (2019) 29. Strubell, E., Ganesh, A., McCallum, A.: Energy and policy considerations for deep learning in NLP (2019) 30. Sucholutsky, I., Schonlau, M.: Improving dataset distillation. CoRR abs/1910.02551 (2019). http://arxiv.org/abs/1910.02551 31. Sucholutsky, I., Schonlau, M.: ‘Less than one’-shot learning: learning n classes from m < n samples (2020) 32. Tong, S., Koller, D.: Support vector machine active learning with applications to text classification. J. Mach. Learn. Res. 2(Nov), 45–66 (2001) 33. Tsang, I.W., Kwok, J.T., Cheung, P.M., Cristianini, N.: Core vector machines: fast SVM training on very large data sets. J. Mach. Learn. Res. 6(4), 363–392 (2005) 34. Tukey, J.W., et al.: Exploratory Data Analysis, vol. 2. Addison-Wesley, Reading (1977) 35. Vinyals, O., Blundell, C., Lillicrap, T., Kavukcuoglu, K., Wierstra, D.: Matching networks for one shot learning. arXiv preprint arXiv:1606.04080 (2016) 36. Wan, E.A., Van Der Merwe, R.: The unscented Kalman filter for nonlinear estimation. In: Proceedings of the IEEE 2000 Adaptive Systems for Signal Processing, Communications, and Control Symposium (Cat. No. 00EX373), pp. 153–158. IEEE (2000) 37. Wan, E.A., Van Der Merwe, R., Haykin, S.: The unscented Kalman filter. Kalman filtering and neural networks 5(2007), 221–280 (2001) 38. Wang, T., Zhu, J., Torralba, A., Efros, A.A.: Dataset distillation. CoRR abs/1811.10959 (2018). http://arxiv.org/abs/1811.10959 39. Wang, T., Zhu, J.Y., Torralba, A., Efros, A.A.: Dataset distillation. arXiv preprint arXiv:1811.10959 (2018) 40. Welinder, P., et al.: Caltech-UCSD birds 200 (2010) 41. Xian, Y., Lorenz, T., Schiele, B., Akata, Z.: Feature generating networks for zeroshot learning. CoRR abs/1712.00981 (2017). http://arxiv.org/abs/1712.00981 42. Yang, S., Liu, L., Xu, M.: Free lunch for few-shot learning: distribution calibration (2021)

Object Centric Point Sets Feature Learning with Matrix Decomposition Zijia Wang(B), Wenbin Yang, Zhisong Liu, Qiang Chen, Jiacheng Ni, and Zhen Jia Dell Technologies OCTO Research Office, Shanghai, China Zijia [email protected] Abstract. A representation matching the invariance/equivariance characteristics must be learnt to rebuild a morphable 3D model from a single picture input. However, present approaches for dealing with 3D point clouds depend heavily on a huge quantity of labeled data, while unsupervised methods need a large number of parameters, which is not productive. In the field of 3D morphable model building, the encoding of input photos has received minimal consideration. In this paper, we design a unique framework that strictly adheres to the permutation invariance of the input points. Matrix Decomposition-based Invariant (MDI) learning is a system that offers a unified architecture for unsupervised invariant point set feature learning. The key concept behind our technique is to derive invariance and equivariance qualities for a point set via a simple but effective matrix decomposition. While basic, MDI is highly efficient and effective. Empirically, its performance is comparable to, or even surpasses, the state of the art. In addition, we present a framework for manipulating avatars based on CLIP and TBGAN, and the results indicate that our learnt features help the model achieve better manipulation outcomes.

Keywords: Object centric representation · 3D learning · Diffusion model

1 Introduction

Understanding objects is one of the core problems of computer vision; in the avatar generation process in particular, learning an object-centric representation is important for many downstream tasks [14]. In machine learning, even a small attack [27] or image corruption [6] can produce a large decline in accuracy. This is especially important in 3D settings, because the generated point cloud can vary under different conditions. An object-centric representation is a graceful representation that can handle such distribution shift [4]. In this paper, we investigate deep learning architectures that are able to reason about three-dimensional geometric data, such as point clouds or meshes. In order to accomplish weight sharing and other kernel optimizations, most convolutional architectures require highly regular input data formats, such as image grids or 3D voxel grids. Before feeding point clouds


or meshes to a deep learning architecture, most researchers convert such data to conventional 3D voxel grids or collections of images (e.g., views), since point clouds and meshes do not have a regular format. This modification of the data representation, however, results in data that is needlessly voluminous, and it also introduces quantization artifacts that can conceal the natural invariances of the data. To solve these problems, we propose a nearly non-parametric framework to learn a more useful representation and reconstruct the 3D avatar from portraits. Concretely, the contributions of this paper are:
– A self-supervised learning framework to learn the canonical representation of the input. Two loss functions are introduced to make sure the learned representations are invariant/equivariant.
– Tensor decomposition for representation learning. The learned representation satisfies the invariance/equivariance properties, which can be used in the 3D registration part, the key component of 3D morphable model learning.
– A generalized application framework to deal with 3D images. We use the 3D morphable model reconstruction task as an example here; the results of the generated avatar show that our proposed algorithm handles avatar manipulation well.

2 Related Work

2.1 Point Cloud Features

Current point cloud feature extraction algorithms are mostly handcrafted for one specific task [18]. These features encode certain statistical properties that are invariant to certain transformations. Therefore, they can be categorized into intrinsic (local) features [1,2,23] and extrinsic (global) features [3,13,16,20,21]. However, it is also necessary to combine these properties optimally. Although [18] tried to perform this trade-off to find the best feature combination, it is not trivial to make the whole process explicit and efficient.

2.2 Deep 3D Representations

Currently, there exist many approaches for 3D feature learning, such as volumetric CNNs [17,19,26], which utilize 3D convolutional neural networks to deal with voxelized shapes. However, data sparsity and the computation cost of 3D convolution naturally limit the representations learned by these networks. FPNN [15] and Vote3D [25] then proposed methods to alleviate the problem brought by data sparsity, but when these methods are applied to very large point clouds, their operations based on space volumes constrain them. More recently, powerful methods such as multiview CNNs [19,22] and feature-based DNNs [7,12] have been proposed. However, the representative ability of the extracted features is still one of the key constraints of these methods. Therefore, in this paper, we try to solve this problem.

3 Object-Centric Representation Learning

The key innovation of the proposed method is to utilize simple but powerful matrix decomposition to generate invariance/equivariance properties for point sets. In Sect. 3.2, we first introduce the matrix decomposition module of our method; in Sect. 3.3, we present our unsupervised training pipeline; in Sect. 3.4, we develop the decoder for input point set reconstruction. Finally, we discuss the avatar generation extension of the proposed method in Sect. 3.6.

3.1 Problem Definition

We design a deep learning framework that directly consumes unordered point sets as inputs. A point cloud is represented as a set of 3D points [18]

P = {Pn | n = 1, ..., N},   (1)

where each point Pn is a vector of its (x, y, z) coordinates plus extra feature channels such as color, normal, etc. For simplicity and clarity, unless otherwise noted, we only use the (x, y, z) coordinates as our point's channels, as shown in Fig. 1. The input point set is a subset of points from Euclidean space. It has three main properties [18]:

– Unordered. Unlike pixel arrays in images or voxel arrays in volumetric grids, the point cloud is a set of points.
– Interaction among points. The points come from a space with a distance metric, so they are not isolated, and neighboring points form a meaningful subset. The model therefore needs to capture local structures from nearby points as well as the interactions among local structures.
– Invariance under transformations. As a geometric object, the learned representation of the point set should be invariant to certain transformations. For example, rotating and translating all points together should not modify the global point cloud category or the segmentation of the points.

3.2 Matrix Decomposition for Invariant and Equivariant Feature Learning

An overview of MD in the feature space is depicted in Fig. 3. The point cloud input is first transformed into a feature matrix X ∈ R^{M×d}, which we decompose in the following manner:

X = UV + E,   (2)

where U ∈ R^{M×k} and V ∈ R^{k×d} with k < d are the decomposition factors and E is the residual. U can be considered the activation factor, which should be invariant and capture the most important information of the dataset, while V can be considered the template factor, which is equivariant to each input data sample. Therefore, UV is the low-rank approximation of X [11]. As shown in Fig. 3, MD is a white-box unsupervised decomposition that exploits the low-rank property of the feature matrix, and it can be conducted online [10] (Fig. 2).

Fig. 1. Some examples of 3D point sets. Different colors indicate different semantic parts of an object.

Fig. 2. The decomposition of the feature matrix. The invariant part U records the relative location, and the equivariant part V records the absolute location in world coordinates.
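For illustration only, the following is a minimal NumPy sketch of splitting a feature matrix into a rank-k pair (U, V) plus a residual E. It uses a truncated SVD rather than the online MD solver of [10], so it should be read as an example of the U/V split, not as the authors' implementation.

```python
import numpy as np

def low_rank_factors(X, k):
    """Split a feature matrix X (M x d) into U (M x k) and V (k x d) with X ≈ U V."""
    u, s, vt = np.linalg.svd(X, full_matrices=False)
    U = u[:, :k] * s[:k]      # activation factor (rank-k coefficients)
    V = vt[:k]                # template factor (rank-k basis)
    E = X - U @ V             # residual
    return U, V, E

X = np.random.randn(1024, 128)   # e.g. 1024 point features of dimension d = 128
U, V, E = low_rank_factors(X, k=10)
print(U.shape, V.shape, np.linalg.norm(E))
```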

3.3 Encoder

The encoder training pipeline of the proposed method is illustrated in the Fig. 3. Our network is trained by feeding pairs of randomly rotated copies of the same shape. The input point clouds are randomly generated from two random transformations T1 , T2 ∈ R(3) (rotating and translating). Note that we train such a decomposition in a fully unsupervised fashion and that the network only ever sees randomly rotated point clouds.

260

Z. Wang et al.

Fig. 3. The pipeline of the proposed method. The framework can take 2D or 3D point sets as input.

As the two input point sets come from the same object, the similarity loss can be expressed as

Lsim = ||V1 − V2||_F,   (3)

where ||·||_F is the Frobenius norm of a matrix. Then, to learn the equivariant features, we ask the network to learn a localized representation of the geometry and define the following spatial loss for each input point set:

Lspa(P) = tr(U^T W U),   (4)

where tr(·) is the trace of a matrix and W is the weight matrix of the 3D point set. The entry W(m, n) is the weight between two points and is calculated as

W(m, n) = exp(||Pm − Pn||_2^2 / σ^2),   (5)

where σ is a parameter that controls the distance scale. For the sake of efficiency, one can use only a part of randomly selected points for reconstruction. The spatial loss considers the spatial relationship of the input 3D point set, as shown in Fig. 7. After training, we choose the 3D point set with the minimal l1 norm of U as the reference point set (Fig. 4).
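For concreteness, here is a minimal PyTorch sketch of the two encoder losses, assuming that U has one row per point. The negative sign inside the exponential is our assumption (a Gaussian affinity that keeps the weights bounded); the printed Eq. (5) omits it.

```python
import torch

def similarity_loss(V1, V2):
    # Frobenius norm between the two template factors (Eq. 3)
    return torch.norm(V1 - V2, p="fro")

def spatial_loss(U, points, sigma=1.0):
    # U: (N, k) activation factor, points: (N, 3)
    # Gaussian affinity weights (the printed Eq. 5 has no minus sign)
    W = torch.exp(-torch.cdist(points, points) ** 2 / sigma ** 2)
    return torch.trace(U.T @ W @ U)   # tr(U^T W U), Eq. 4

U = torch.randn(1024, 10, requires_grad=True)
pts = torch.rand(1024, 3)
print(similarity_loss(torch.randn(10, 128), torch.randn(10, 128)), spatial_loss(U, pts))
```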

3.4 Reconstruction Decoder

For downstream tasks:
– Classification: needs the invariant representation.
– Segmentation: needs the equivariant representation.
– Point set reconstruction: needs the reconstructed matrix UV and an additional decoder that transforms the latent feature into 3D point sets.

Fig. 4. The spatial loss.

It is clear that, in applications like avatar reconstruction [8], one should reconstruct the latent features into a 3D point set. In this paper, we design the decoder shown in Fig. 5. The decoder MSE loss can be expressed as

Ldec = (1/N) ||P − P̂||_F,   (6)

where P̂ is the output matrix of the decoder.

Fig. 5. The decoder for point set reconstruction.

The decoder can be trained with or without the encoder framework proposed in Sect. 3.3. When trained jointly with the encoder of Sect. 3.3, the loss is calculated as

L = Lsim + α Σ_{i=1}^{2} L^i_spa + β Σ_{i=1}^{2} L^i_dec,   (7)

where α and β are weights that control the loss values. The joint training algorithm is summarized in Algorithm 1.


Algorithm 1: MDI Training
Input: Dataset {P} ∈ P; weights α and β; training epoch number T
Output: Encoder and Decoder
Initialize: model parameters for Encoder and Decoder
for t ← 1 to T do
    for each mini batch B do
        for each object do
            generate the input point sets from two random transformations
            compute the loss in Eq. 7
        compute the sum over all objects
        update parameters using back propagation
return
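The following is a compact, self-contained sketch of this joint training loop. The stand-in encoder/decoder networks, the random rigid transform, and the loss weights are illustrative assumptions and not the paper's implementation; the loop structure and the loss terms follow Eqs. (3)–(7).

```python
import torch
import torch.nn as nn

class TinyEncoder(nn.Module):
    """Stand-in encoder: maps an (N, 3) point set to factors U (N, k) and V (k, d)."""
    def __init__(self, k=10, d=128):
        super().__init__()
        self.point_mlp = nn.Sequential(nn.Linear(3, d), nn.ReLU(), nn.Linear(d, d))
        self.to_U = nn.Linear(d, k)
    def forward(self, pts):
        feat = self.point_mlp(pts)                 # (N, d) feature matrix X
        U = self.to_U(feat)                        # (N, k) activation / invariant factor
        V = torch.linalg.pinv(U) @ feat            # (k, d) template / equivariant factor
        return U, V

class TinyDecoder(nn.Module):
    """Stand-in decoder: maps the low-rank reconstruction U V back to 3D points."""
    def __init__(self, d=128):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(d, d), nn.ReLU(), nn.Linear(d, 3))
    def forward(self, UV):
        return self.mlp(UV)

def random_rigid(pts):
    # Random rotation (QR of a Gaussian matrix, may include a reflection) plus translation.
    Q, _ = torch.linalg.qr(torch.randn(3, 3))
    return pts @ Q.T + torch.randn(3)

def train(dataset, epochs=1, alpha=1.0, beta=1.0, lr=1e-3):
    enc, dec = TinyEncoder(), TinyDecoder()
    opt = torch.optim.Adam(list(enc.parameters()) + list(dec.parameters()), lr=lr)
    for _ in range(epochs):
        for pts in dataset:                        # each item: (N, 3) point set
            views = [random_rigid(pts), random_rigid(pts)]
            factors = [enc(v) for v in views]
            (U1, V1), (U2, V2) = factors
            loss = torch.norm(V1 - V2, p="fro")    # Lsim (Eq. 3)
            for (U, V), v in zip(factors, views):
                W = torch.exp(-torch.cdist(v, v) ** 2)                 # spatial weights (Eq. 5, minus sign assumed)
                loss = loss + alpha * torch.trace(U.T @ W @ U)         # Lspa (Eq. 4)
                loss = loss + beta * torch.norm(v - dec(U @ V), p="fro") / v.shape[0]  # Ldec (Eq. 6)
            opt.zero_grad(); loss.backward(); opt.step()
    return enc, dec

enc, dec = train([torch.rand(256, 3) for _ in range(4)])
```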

3.5 Theoretical Analysis

In this section, we illustrate why the low-rank assumption is beneficial for modeling the global context of representations by providing an example. The low-rank assumption is advantageous because it encodes the inductive bias that low-level representations contain fewer high-level concepts than the scale of the representations. Consider a picture of a person walking on a road. The road will be described by a large number of hyperpixels retrieved using a CNN backbone. Note that the road can be considered as repetitions of small road patches, so the road can be represented by modeling these repeated patches. Mathematically, this is equal to locating a limited set of bases D corresponding to various road patches and a coefficient matrix C that records the relationship between the elementary road patches and the hyperpixels. This example demonstrates that in an ideal setting, high-level notions, such as the global context, might be low-rank. The hyperpixels describing the road patches have similar semantic properties. Nevertheless, owing to the vanilla CNN's ineffectiveness in modeling long-distance relationships, its learnt representation includes too many local details and inaccurate information and lacks global direction. Imagine the subject in the photograph wearing gloves. When we look at the gloves patch locally, we assume it depicts gloves. When the broader context is considered, it becomes clear that this patch is a portion of a person. The semantic information is hierarchical, depending on the amount of comprehension desired. The objective of this section is to enable networks to comprehend the context globally by means of the low-rank recovery formulation. Incorrect information, notably redundancies and incompletions, is modeled as a noise matrix. To highlight the global context, we split the representations into two parts, a low-rank global information matrix and a local equivariant matrix, using an optimization approach to recover the clean signal subspace, eliminate the noise, and improve the global information through the skip connection. The data might reveal how much global knowledge the networks need for a certain operation.

3.6 Measurement of Learned Features on Avatar Generation

Avatar generation is naturally a good application for testing the invariance and equivariance of the learned features. Figure 6 shows the framework of our avatar generation, which is based on the TBGAN proposed in [9]. Given a pre-trained TBGAN generator G, let z ∈ R^d denote a d-dimensional random input vector sampled from a Gaussian distribution N(0, σ^2), and let e originally denote a one-hot encoded facial expression vector initialized to zero to obtain a neutral expression. In this paper, however, the equivariant feature derived from the CLIP-generated image becomes e. Let c ∈ C denote an intermediate layer vector obtained by partial forward propagation of z and e through the generator G. Our method first generates a textured mesh by using the generated shape, normal, and texture UV maps via cylindrical projection. Then, given a text prompt t such as 'happy human', c is optimized via gradient descent to find a direction Δc such that G(c + Δc) produces a manipulated textured mesh in which the target attribute specified by t is present or enhanced, while other attributes remain largely unaffected. In our work, we optimize the original intermediate latent vector c using gradient descent and work in the 4 × 4 dense layer s of the TBGAN generator.

Fig. 6. The avatar generation framework for testing the learned feature extractor.

The optimized latent vector c + Δc can then be fed into TBGAN to generate shape, normal, and texture UV maps, and finally a manipulated mesh with the target attributes. To perform meaningful manipulation of meshes without creating artifacts or changing irrelevant attributes, we use a combination of an equivariance loss, an identity loss, and an L2 loss as follows:

arg min_{Δc∈C}  Leq + λID LID + λL2 LL2,   (8)

where λID and λL2 are the hyperparameters of LID and LL2, respectively. While the equivariance loss ensures that the user-specified attribute is present or enhanced, the ID loss and L2 loss leave other attributes unchanged, forcing disentangled changes. The identity loss LID minimizes the distance between the identity of the original renders and the manipulated renders:

LID = (Uori − Uedi)^2.   (9)


where Uori is the invariant feature of the original image and Uedi is the invariant feature of the edited image. Similarly, the equivariance loss Leq can be defined as

Leq = (Vori − Vedi)^2,   (10)

where Vori and Vedi are the equivariant features of the original image and the edited image, respectively. Finally, the L2 loss is used to prevent artifact generation and is defined as

LL2 = (c − (c + Δc))^2.   (11)

For TBGAN and renderer, we follow the same settings in [9].
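A minimal sketch of this latent optimization is given below. The `generator` and `feat_extractor` callables are hypothetical stand-ins for TBGAN and the learned MDI feature extractor, and the target equivariant feature is assumed to come from the CLIP-generated reference, which is how we read Sect. 3.6; this is an illustration, not the authors' code.

```python
import torch

def manipulate_latent(c, generator, feat_extractor, target_feat,
                      steps=100, lr=0.01, lam_id=1.0, lam_l2=1.0):
    """Gradient-descent search for a direction Δc (Eq. 8) with stand-in modules."""
    with torch.no_grad():
        U0, _ = feat_extractor(generator(c))      # invariant feature of the original render
    delta = torch.zeros_like(c, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    for _ in range(steps):
        U, V = feat_extractor(generator(c + delta))
        L_eq = ((V - target_feat) ** 2).mean()    # equivariance term, target from CLIP (our reading)
        L_id = ((U - U0) ** 2).mean()             # identity loss (Eq. 9)
        L_l2 = (delta ** 2).mean()                # L2 loss (Eq. 11)
        loss = L_eq + lam_id * L_id + lam_l2 * L_l2
        opt.zero_grad(); loss.backward(); opt.step()
    return (c + delta).detach()

# toy usage with dummy differentiable stand-ins
gen = torch.nn.Linear(8, 16)
feat = lambda x: (x[:4], x[4:])
c_new = manipulate_latent(torch.randn(8), gen, feat, target_feat=torch.randn(12))
```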

4 Experiment Results

This section evaluates the proposed approach and compares it against state-of-the-art methods. To evaluate our method, we rely on the ShapeNet (Core) dataset. We follow the category choices of AtlasNetV2, using the airplane and chair classes for single-category experiments, while for multi-category experiments we use all 13 classes of the ShapeNet (Core) dataset. Unless noted otherwise, we randomly sample 1024 points from the object surface of each shape to create our 3D point clouds. For all our experiments we use the Adam optimizer with an initial learning rate of 0.001 and a decay rate of 0.1. Unless stated otherwise, we use k = 10 and feature dimension d = 128. Our network architecture is as follows:
– Encoder. Our architecture is based on the one suggested in [24]: a PointNet-like architecture with residual connections and attentive context normalization.
– Decoder. Our decoder architecture is similar to AtlasNetV2 [5] (with trainable grids).
The last section of this part shows the avatar reconstruction results based on the framework described in Sect. 3.6.

4.1 Reconstruction Result

We evaluate the performance of our method for reconstruction against two baselines:
– AtlasNetV2 [5], the state-of-the-art auto-encoder, which utilizes a multi-head patch-based decoder;
– 3D-PointCapsNet [28], an auto-encoder for 3D point clouds that utilizes a capsule architecture.
As shown in Table 1, we achieve state-of-the-art performance in both the aligned and unaligned settings. We illustrate the reconstruction of 3D point clouds for all the methods. Our method provides a semantically consistent decomposition; for example, the wings of the airplane have consistent colors.


Table 1. Reconstruction – Performance in terms of Chamfer distance.

                   Aligned                    Unaligned
                   Airplane  Chair   Multi    Airplane  Chair   Multi
3D-PointCapsNet    1.94      3.30    2.49     5.58      7.57    4.66
AtlasNetV2         1.28      2.36    2.14     2.80      3.98    3.08
MDI (Ours)         0.93      2.01    1.66     1.05      3.75    2.20

Fig. 7. Reconstruction results.

4.2 Classification Result

We compute the features from the auto-encoding methods compared in Sect. 4.1 – AtlasNetV2 [5], 3D-PointCapsNet [28], and our learned invariant features – and use them to perform 13-way classification with a Support Vector Machine (SVM) and K-Means clustering. Our results are superior to the other SOTA methods. We argue that the joint invariant and equivariant feature learning with MD is important for unsupervised learning. This is especially obvious in the unaligned setting because of the advantages of our learned invariant and equivariant features. For the aligned setting, we also achieve competitive results (Table 2).

Table 2. Classification – Top-1 accuracy (%)

                   Aligned             Unaligned
                   SVM     K-Means     SVM     K-Means
AtlasNetV2         94.07   61.66       71.13   14.59
3D-PointCapsNet    93.81   65.87       64.85   17.12
MDI (Ours)         93.78   71.42       86.58   49.93


Fig. 8. Results of manipulation on equivariant features.

4.3 Avatar Generation

In this section, to show the power of our learned features, we follow the framework shown in Fig. 6. Our method can thus be used to change facial expressions such as 'smiling', 'angry', and 'surprised'. As can be seen in Fig. 8, our method can successfully manipulate a variety of complex emotions on various input meshes with almost no change to other attributes. Compared with the results of TBGAN [9], the advantages of directly manipulating equivariant features are obvious.

Fig. 9. Results of manipulation on invariant features.


By manipulating the invariant features, we slightly change the framework shown in Fig. 6. Concretely, CLIP is used to generate pictures with the desired global (invariant) features; U is then extracted from the generated image, while V is extracted from the original avatar image. The results also show that our method provides a global semantic understanding of more complex attributes such as 'man', 'woman', 'Asian', and 'Indian'. Figure 9 shows the results of manipulations on various randomly generated outputs, where we can see that our method can perform complex edits such as ethnicity and gender.

5 Conclusion

In this paper, we design a novel Matrix Decomposition-based Invariant (MDI) learning framework, which provides a unified architecture for unsupervised invariant point-set feature learning. Though simple, MDI is highly efficient and effective. Empirically, it shows strong performance on point set reconstruction and unsupervised classification. Moreover, our framework will benefit other downstream tasks such as collaborative computing in avatar generation.

References 1. Aubry, M., Schlickewei, U., Cremers, D.: The wave kernel signature: a quantum mechanical approach to shape analysis. In: 2011 IEEE International Conference on Computer Vision Workshops (ICCV Workshops), pp. 1626–1633. IEEE (2011) 2. Bronstein, M.M., Kokkinos, I.: Scale-invariant heat kernel signatures for non-rigid shape recognition. In: 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp. 1704–1711. IEEE (2010) 3. Chen, D.Y., Tian, X.P., Shen, Y.T., Ouhyoung, M.: On visual similarity based 3D model retrieval. In: Computer Graphics Forum, vol. 22, pp. 223–232. Wiley Online Library (2003) 4. Creager, E., Jacobsen, J., Zemel, R.S.: Exchanging lessons between algorithmic fairness and domain generalization. CoRR abs/2010.07249 (2020). https://arxiv. org/abs/2010.07249 5. Deprelle, T., Groueix, T., Fisher, M., Kim, V.G., Russell, B.C., Aubry, M.: Learning elementary structures for 3D shape generation and matching. arXiv preprint arXiv:1908.04725 (2019) 6. Duchi, J., Glynn, P., Namkoong, H.: Statistics of robust optimization: a generalized empirical likelihood approach. arXiv preprint arXiv:1610.03425 (2016) 7. Fang, Y., et al.: 3D deep shape descriptor. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2319–2328 (2015) 8. Gafni, G., Thies, J., Zollhofer, M., Nießner, M.: Dynamic neural radiance fields for monocular 4D facial avatar reconstruction. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8649–8658 (2021) 9. Gecer, B., et al.: Synthesizing coupled 3D face modalities by trunk-branch generative adversarial networks. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12374, pp. 415–433. Springer, Cham (2020). https://doi. org/10.1007/978-3-030-58526-6 25


10. Geng, Z., Guo, M.H., Chen, H., Li, X., Wei, K., Lin, Z.: Is attention better than matrix decomposition? arXiv preprint arXiv:2109.04553 (2021) 11. Guan, N., Tao, D., Luo, Z., Shawe-Taylor, J.: MahNMF: Manhattan non-negative matrix factorization. arXiv preprint arXiv:1207.3438 (2012) 12. Guo, K., Zou, D., Chen, X.: 3D mesh labeling via deep convolutional neural networks. ACM Trans. Graph. (TOG) 35(1), 1–12 (2015) 13. Johnson, A.E., Hebert, M.: Using spin images for efficient object recognition in cluttered 3D scenes. IEEE Trans. Pattern Anal. Mach. Intell. 21(5), 433–449 (1999) 14. Li, N., Raza, M.A., Hu, W., Sun, Z., Fisher, R.: Object-centric representation learning with generative spatial-temporal factorization. In: Advances in Neural Information Processing Systems, vol. 34 (2021) 15. Li, Y., Pirk, S., Su, H., Qi, C.R., Guibas, L.J.: FPNN: field probing neural networks for 3D data. In: Advances in Neural Information Processing Systems, vol. 29 (2016) 16. Ling, H., Jacobs, D.W.: Shape classification using the inner-distance. IEEE Trans. Pattern Anal. Mach. Intell. 29(2), 286–299 (2007) 17. Maturana, D., Scherer, S.: VoxNet: a 3D convolutional neural network for realtime object recognition. In: 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 922–928. IEEE (2015) 18. Qi, C.R., Su, H., Mo, K., Guibas, L.J.: Pointnet: deep learning on point sets for 3D classification and segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 652–660 (2017) 19. Qi, C.R., Su, H., Nießner, M., Dai, A., Yan, M., Guibas, L.J.: Volumetric and multi-view CNNs for object classification on 3D data. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 5648–5656 (2016) 20. Rusu, R.B., Blodow, N., Beetz, M.: Fast point feature histograms (FPFH) for 3D registration. In: 2009 IEEE International Conference on Robotics and Automation, pp. 3212–3217. IEEE (2009) 21. Rusu, R.B., Blodow, N., Marton, Z.C., Beetz, M.: Aligning point cloud views using persistent feature histograms. In: 2008 IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 3384–3391. IEEE (2008) 22. Su, H., Maji, S., Kalogerakis, E., Learned-Miller, E.: Multi-view convolutional neural networks for 3D shape recognition. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 945–953 (2015) 23. Sun, J., Ovsjanikov, M., Guibas, L.: A concise and provably informative multiscale signature based on heat diffusion. In: Computer Graphics Forum, vol. 28, pp. 1383–1392. Wiley Online Library (2009) 24. Sun, W., Tagliasacchi, A., Deng, B., Sabour, S., Yazdani, S., Hinton, G., Yi, K.M.: Canonical capsules: self-supervised capsules in canonical pose. In: Thirty-Fifth Conference on Neural Information Processing Systems (2021) 25. Wang, D.Z., Posner, I.: Voting for voting in online point cloud object detection. In: Robotics: Science and Systems, Rome, Italy, vol. 1, pp. 10–15 (2015) 26. Wu, Z., et al.: 3D shapenets: a deep representation for volumetric shapes. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1912–1920 (2015) 27. Zhang, B.H., Lemoine, B., Mitchell, M.: Mitigating unwanted biases with adversarial learning. In: Proceedings of the 2018 AAAI/ACM Conference on AI, Ethics, and Society, pp. 335–340 (2018) 28. Zhao, Y., Birdal, T., Deng, H., Tombari, F.: 3D point capsule networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 
1009–1018 (2019)

Aerial Image Segmentation via Noise Dispelling and Content Distilling

Yongqing Sun1(B), Xiaomeng Wu2, Yukihiro Bandoh1, and Masaki Kitahara1

1 NTT Computer and Data Science Laboratories, Tokyo, Japan {yongqing.sun.fb,yukihiro.bandoh.pe,masaki.kitahara.ve}@hco.ntt.co.jp
2 NTT Communication Science Laboratories, Kyoto, Japan [email protected]

Abstract. Aerial image segmentation is an essential problem for land management which can be used for change detection and policy planning. However, traditional semantic segmentation methods focus on single-perspective images in road scenes, while aerial images are top-down views and objects are of a small size. Existing aerial segmentation methods tend to modify the network architectures proposed for traditional semantic segmentation problems, yet to the best of our knowledge, none of them focus on the noisy information present in the aerial images. In this work, we conduct an investigation into the effectiveness of each channel of the aerial image on the segmentation performance. Then, we propose a disentangle learning method to investigate the differences and similarities between channels and images, so that potential noisy information can be removed for higher segmentation accuracy.

Keywords: Aerial image segmentation · Disentangle learning · Semantic segmentation

1 Introduction

In the event of a large-scale disaster such as an earthquake or tsunami, it is necessary to quickly obtain information on a wide area in order to secure safe evacuation and rescue routes and to consider reconstruction measures. Aerial image segmentation is one of the potential solutions. The result of aerial image segmentation can also be used to adjust governmental subsidy policy and to utilize the given resources for land management. In the past, this task was done manually and laboriously, so it could only be done for a small number of photographs, which is not enough to capture the changes across large areas [5]. Moreover, such changes can be dramatic over time. Therefore, it is highly desirable to develop an automatic solution for the task. A similar task known as semantic segmentation inputs an image and outputs a semantic segmentation map which indicates the class of an arbitrary pixel in the input image. For this task, various methods have been proposed using deep


learning. Among them, CNN [6,12,13] has been actively applied in research and has shown high performance. However, they heavily rely on manual annotation because they are fully supervised [1,3], yet remote-sensing data is not rich in training data, and it is not clear whether “deeper” learning is possible with a limited number of training samples.

Fig. 1. U-Net results when trained on different channels show different effectiveness on different classes. The Green channel outperforms in the Woodland class (light gray) while the Blue channel outperforms in the Water class (white). (Color figure online)

Semantic segmentation of low-resolution images in aerial photography is a challenging task because their objects are tiny and observed from a top-down faraway viewpoint, which is entirely different from the traditional semantic segmentation task. Many aerial segmentation methods are based on network architectures proposed for the mainstream semantic segmentation task. Boguszewski et al. [2] employ DeepLabv3+ with a modified Xception71 backbone and Dense Prediction Cell (DPC) for aerial segmentation. Khalel et al. [8] proposed 2-level U-Nets with data augmentation to refine the segmentation results on aerial images. Li et al. [9] proposed to add a group of cascaded convolutions to U-Net to enhance the receptive field. However, aerial images are generally captured from high altitude and available in low resolution, hence their illumination information is sensitive to noise. The situation is worsened by the presence of dense objects. Therefore, in order to achieve accurate segmentation and classification of aerial photographs, it is necessary to extract the essential semantic content of the images. Disentangle learning refers to methods that allow encoding the input into separate features belonging to predefined sub-spaces. UNIT [11] and MUNIT [7] are popular unsupervised disentangle learning methods for the image-to-image translation task, which can be deemed the general task of many computer


vision problems such as segmentation. They encode the image into a content code and a style code. The content code is the common feature that can be shared between 2 domains, e.g. the same pose shared by a cat image and a tiger image. The style code is the encoded distribution which gives variant tastes to the content code and cannot be shared between 2 domains, e.g. the difference in skin color and texture between cat and tiger images. However, UNIT only allows one-to-one translation, while MUNIT allows one-to-many translation with random style codes. Inspired by this technique, we propose the content consistency loss to enforce the content codes of different channels of the same image to be the same, and the style consistency loss to enforce the style codes of the same channel of different images to be the same. In this work, we conducted segmentation on each channel separately and found that each channel has different segmentation effectiveness on objects, as shown in Fig. 1. It is intuitive to think that different channels of the same image share the same content code, which is essential for aerial image segmentation. We also come up with the hypothesis that the same channels of different images share the same style code. The style code, which may reflect noise or channel-specific illumination characteristics, should not contribute to the segmentation result and has an adverse effect on segmentation. Therefore, the content code should contain all the essential information needed for the segmentation task. In our framework, there are 2 encoders and 3 generators. Each encoder and its corresponding generator correspond to an image channel. Taking the green channel as an example, the green-channel encoder encodes the green-channel image into a content code and a style code, and the generator generates the green-channel image back from these two codes. We then define a reconstruction loss between the two green-channel images. The content consistency loss is added to ensure that the content codes from the green-channel encoder and the blue-channel encoder are the same for the green-channel and blue-channel images of the same image. The style consistency loss is also employed to ensure that the style codes of the same channel are the same across different images. Finally, we use the content codes of the two channels as the input to the segmentation generator, followed by a semantic loss. The proposed method is evaluated on the LandCover.ai dataset [2]. To summarize, our contribution is two-fold:
– We show that different channels have different focuses on different segmentation classes.
– We propose a disentangle learning framework that automatically removes the noisy information present in aerial images by mining the differences between channels of the same image and the similarities of the same channel across different images.
The paper is organized as follows: Sect. 3 presents our proposed framework, Sect. 4 shows the experimental results and detailed discussion, and our work is concluded in Sect. 5.

2 Related Work

2.1 Traditional Semantic Segmentation

The semantic segmentation task inputs an image and outputs a semantic segmentation map which indicates the class of an arbitrary pixel in the input image. However, it does not tell whether those pixels belong to different instances. Traditional semantic segmentation focuses on road-scene datasets such as Cityscapes [4], where big objects are generally available. DeepLabv3+ [6] proposes to use atrous separable convolution for the encoder. U-Net [15] proposes skips which connect low-level features and high-level features. On the other hand, objects in aerial images are captured from top-down views and in a very tiny form. Therefore, aerial segmentation presents challenges demanding approaches different from traditional semantic segmentation.

2.2 Aerial Segmentation

Many aerial segmentation methods are based on network architectures proposed for the mainstream semantic segmentation task. Boguszewski et al. [2] employ DeepLabv3+ with a modified Xception71 backbone and Dense Prediction Cell (DPC) for aerial segmentation. Khalel et al. [8] proposed 2-level U-Nets with data augmentation to refine the segmentation results on aerial images. Li et al. [9] proposed to add a group of cascaded convolutions to U-Net to enhance the receptive field. However, none of them really focus on investigating the input aerial images for noise-removal purposes. We opt for U-Net, which has a network architecture similar to FPN [10] for small object detection. The skip connections enhance the semantic level of lower-level features and reduce information distortion at the input of high-level features. We do not employ FPN [10] directly, as the size of the objects in the aerial images is already very small and does not vary much.

2.3 Disentangle Learning

Disentangle learning refers to methods that allow encoding the input into separate features belonging to predefined sub-spaces. UNIT [11] and MUNIT [7] are popular unsupervised disentangle learning methods for the image-to-image translation task, which can be deemed the general task of many computer vision problems such as segmentation. They encode the image into two codes: a content code and a style code. The content code is the common feature that can be shared between 2 domains, e.g. cats and tigers may share the same pose. The style code is the encoded distribution which gives variant tastes to the content code and cannot be shared between 2 domains, e.g. skin colors and patterns between cats and tigers are not the same. However, UNIT only allows one-to-one translation, while MUNIT allows one-to-many translation with random style codes. Yet, due to their unsupervised nature, none of them can utilize the advantage of doing disentangle learning on separate channels of the same image: Given an


RGB image, we know for sure that their content code must be the same. We also take our constraint hypothesis further, the style code of an interested channel, e.g. green channel, should be the same across different images, which is currently not endorsed by any of the existing methods. FCN+MLP [14] up-samples the encoded features of each layers in Base FCN and then combine them to yield the final prediction. Khalel et al. [8] proposed 2-levels U-Nets with data augmentation.

3 Proposed Method

3.1 Framework

Fig. 2. (a) Style consistency loss

Let xG , xB be the G, B channels of an input image x. We exclude the R-channel from our framework as it is discovered to contain a lot of noisy information in Sect. 4.2. We opt for U-net, which has a similar network architecture to FPN [10] for small object detection, as the network architecture of our encoders and generators. The skip connections enhance semantic level of lower-level features and reduce information distortion at the input of high-level features. We do not employ FPN [10] directly as the size of the objects in the aerial images is already very small and that size does not vary much. As shown in Fig. 3, our network consists of two encoders and three generators. The two encoders are G-channel and B-channel encoder; the three generators are G-channel generator, B-channel generator, and segmentation map generator. Let EG be the G-channel encoder, EG (xG ) yields a 256-channel feature map, whose 56 first channels act as the style code sxG and 200 remaining channels act as the


Fig. 3. (b) Content consistency loss and channel. An overview of our proposed framework described with Fig. (a) and (b). We employ U-net network architecture for our encoders E and generators G. Our network also uses a semantic segmentation loss, which is not shown for the sake of brevity.

content code cxG, as shown in Fig. 3. The generator GG(cxG, sxG) reconstructs the G-channel input, while the generator GS(cxG) predicts the semantic segmentation map. Similarly, EB(xB) yields the style code sxB and the content code cxB. We enforce the content codes of the G-channel, cxG, and the B-channel, cxB, of the same image to be the same by the content consistency loss. This constraint is enforced further by training GS(cxG) and GS(cxB) to predict the same semantic segmentation map using the same network parameters GS. We argue that the style code of the same channel should be the same across different images; hence, while the style code is important for the channel reconstruction task, it should not contain any information useful for the segmentation task. Therefore, along with the style consistency loss between different images, the reconstruction generators GG and GB take the style code as input but the semantic segmentation map generator GS does not.
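As a concrete illustration of the 56/200 split of the encoder output into style and content codes, here is a toy PyTorch sketch. The backbone layers and image size are assumptions; only the channel split mirrors the description above.

```python
import torch
import torch.nn as nn

class ChannelEncoder(nn.Module):
    """Toy single-channel encoder emitting a 256-channel map split into style/content codes."""
    def __init__(self, style_ch=56, content_ch=200):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(1, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, style_ch + content_ch, 3, padding=1),
        )
        self.style_ch = style_ch
    def forward(self, x):                      # x: (B, 1, H, W), one image channel
        feat = self.backbone(x)                # (B, 256, H, W)
        style, content = feat[:, :self.style_ch], feat[:, self.style_ch:]
        return content, style

enc_g, enc_b = ChannelEncoder(), ChannelEncoder()
x = torch.rand(2, 3, 64, 64)                   # small RGB batch
c_g, s_g = enc_g(x[:, 1:2])                    # green channel
c_b, s_b = enc_b(x[:, 2:3])                    # blue channel
print(c_g.shape, s_g.shape)                    # content (2, 200, H, W), style (2, 56, H, W)
```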

3.2 Channel Reconstruction Loss

Let Ei and Gi be the i-channel encoder and i-channel reconstruction generator. The channel reconstruction loss enforces Gi to reconstruct the exact channel that is the input of Ei:

Lr(x) = Σ_{i∈{G,B}} ||xi − Gi(Ei(xi))||_1.   (1)

3.3 Content Consistency Loss

Let (cxi, sxi) = Ei(xi) be the content code cxi and style code sxi of the input i-channel. The content consistency loss enforces the content codes of the G-channel and B-channel of the same image to be the same:


Lc(x) = ||cxG − cxB||_1.   (2)

One may be concerned that the content codes will converge to zero under this simple loss. However, the semantic segmentation loss enforces them to be non-zero in order to generate a meaningful semantic segmentation map.

3.4 Style Consistency Loss

The style consistency loss, on the other hand, enforces the style codes of the same channel to be the same across different images. Let x and y be two different images and xi and yi be the i-channels of the two; the style consistency loss is computed as follows:

Ls(x, y) = Σ_{i∈{G,B}} ||sxi − syi||_1.   (3)

The style codes are also constrained to be non-zero by the channel reconstruction loss and the content consistency loss. As the content codes are the same for all channels of the same image and the reconstructed channels should look different, the style codes should be the main contributor of such differences; hence they are enforced to be non-zero. If the batch size is larger than 2, the style consistency loss is computed on the first and second samples, the second and third samples, and so on. The style consistency loss is also applied to the first and last samples, forming a cycle of style consistency, as sketched below.
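The cyclic pairing over the batch can be written as the following minimal sketch (an illustration with mean-normalized L1 distances, not the authors' code):

```python
import torch

def style_cycle_loss(styles):
    """Style consistency over a batch: pairs (1,2), (2,3), ..., plus (first, last) to close the cycle.

    `styles` is a list of per-image style codes of one channel, all of the same shape."""
    loss = 0.0
    for a, b in zip(styles[:-1], styles[1:]):
        loss = loss + (a - b).abs().mean()
    loss = loss + (styles[0] - styles[-1]).abs().mean()   # close the cycle
    return loss

batch_styles = [torch.randn(56, 16, 16) for _ in range(4)]
print(style_cycle_loss(batch_styles))
```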

3.5 Semantic Segmentation Loss

We use the weighted cross-entropy loss for the semantic segmentation task. Let M , N and K be the width, height and number of channels of the semantic segmentation map zi = GS (xi ). Here, K is also the number of classes in the dataset and wk is the loss weight of the class k. The larger wk is, the more attention is spent on reducing the loss values from class k.

Lss(x) = − Σ_{i∈{G,B}} Σ_{m=1}^{M} Σ_{n=1}^{N} Σ_{k=1}^{K} wk log( exp(zi(m,n,k)) / Σ_{j=1}^{K} exp(zi(m,n,j)) ).   (4)–(5)

3.6 Total Batch Loss

We set the task weight of the content consistency loss and the style consistency loss to 10, as they are our fundamental constraints. We also set the task weight of the segmentation loss to 10, as it is our main task. Due to the nature of the style consistency


loss, our total loss is given for the whole batch. Let |B| be the batch size of a batch B and xb be the b-th sample in the batch:

L(B) = (10/|B|) ( Ls(x1, x|B|) + Σ_{b=2}^{|B|} Ls(xb−1, xb) )   (6)
       + Σ_{b=1}^{|B|} ( 10 × Lc(xb) + Lr(xb) + 10 × Lss(xb) ).   (7)
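A rough sketch of how the terms of Eqs. (6)–(7) could be combined is given below. The `model` interface (a callable returning per-sample losses and style codes) and the weights are illustrative assumptions, not the authors' implementation.

```python
import torch

def total_batch_loss(batch, model, w_consistency=10.0, w_seg=10.0):
    """Combine per-sample losses (Eqs. 1, 2, 4) with the batch-level style cycle (Eq. 3)."""
    per_sample, styles_g, styles_b = 0.0, [], []
    for x in batch:
        out = model(x)
        per_sample = per_sample + (w_consistency * out["content_loss"]
                                   + out["recon_loss"] + w_seg * out["seg_loss"])
        styles_g.append(out["style_g"]); styles_b.append(out["style_b"])
    style_loss = 0.0
    for styles in (styles_g, styles_b):
        for a, b in zip(styles, styles[1:] + styles[:1]):   # consecutive pairs + cycle closure
            style_loss = style_loss + (a - b).abs().mean()
    return w_consistency / len(batch) * style_loss + per_sample

# toy usage with a dummy model
dummy = lambda x: {"content_loss": x.mean() ** 2, "recon_loss": x.abs().mean(),
                   "seg_loss": x.var(), "style_g": x[:8], "style_b": x[8:16]}
print(total_batch_loss([torch.randn(32) for _ in range(3)], dummy))
```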

4 Experiment Result

4.1 LandCover.ai

To the best of our knowledge, LandCover.ai [2] is one of the benchmark aerial segmentation datasets that include the buildings, woodlands and water classes needed for our experimental purpose. It also has a background class, which involves objects and regions that do not belong to the other 3 classes. It covers 216 km2 of rural areas in Poland with a resolution of 25/50 cm/px. The training set has 7470 aerial images of size 512 × 512 pixels; the validation set and the test set each have 1620 aerial images of the same size. We follow the previous work and use mIoU, which is the mean IoU of the four classes, as our evaluation metric:

IoU = TP / (TP + FN + FP).   (8)

Here, TP stands for True Positive, FN stands for False Negative, FP stands for False Positive. The training dataset also suffers from serious class imbalance. To address this issue, we employ the weighted cross-entropy loss for the semantic segmentation task. The class distribution and our proposed class weights are shown in Table 1. 4.2
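For reference, a minimal NumPy sketch of per-class IoU (Eq. 8) and mIoU computed from label maps is given below; array shapes and the nan-handling of absent classes are our assumptions.

```python
import numpy as np

def per_class_iou(pred, gt, num_classes):
    """IoU = TP / (TP + FN + FP) for each class (Eq. 8); mIoU is their mean."""
    ious = []
    for c in range(num_classes):
        tp = np.sum((pred == c) & (gt == c))
        fp = np.sum((pred == c) & (gt != c))
        fn = np.sum((pred != c) & (gt == c))
        ious.append(tp / (tp + fn + fp) if tp + fn + fp > 0 else float("nan"))
    return ious

pred = np.random.randint(0, 4, (512, 512))
gt = np.random.randint(0, 4, (512, 512))
ious = per_class_iou(pred, gt, 4)
print(ious, np.nanmean(ious))
```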

Investigation on RGB Channels

We investigate the effectiveness of each channels on the segmentation result by training U-net on each channel separately. The result is shown in Table 2. As expected, Green channel has the best performance in Woodlands as it is the dominant color in the Woodlands class, Blue channel has the best performance in Water as it is the dominant color in the Water class. However, Red channel’s performance is very poor, especially on the Buildings class which it is supposed to prevail as the roof color might be red. We suspect this is due to the lighting condition in this dataset. Therefore, in this work and for this dataset, we will only consider G-channel and B-channel. Our framework can be easily extended to a 3-channel version.

Aerial Image Segmentation via Noise Dispelling and Content Distilling

277

Table 1. Class distribution in the training set of LandCover.ai and our used class weight. Buildings Woodlands Water Background Percentage

0.86%

Class weight 0.8625

33.21%

6.51% 59.42%

0.025

0.1125 0.0125

Table 2. mIoU of U-net trained on separate channels. Channel

4.3

Buildings Woodlands Water

Background Overall

Red

60.6%

78.26%

62.63%

64.75%

66.59%

Green

79.61%

90.06%

90.47%

88.46%

87.15%

Blue

80.6%

89.74%

91.05% 88.2%

87.4%

Green & Blue 79.09%

90.56%

91.33%

87.41%

88.64%

Aerial Segmentation Using Disentangle Learning on G-Channel and B-Channel

We initialize our network parameters with the pre-trained U-nets on G-channel and B-channel in the previous section where they are applicable. We train the network using Adam optimizer with weight decay set to 10−7 . As the optimization space of the network on this problem is very tricky, we first set the learning rate to 10−4 and then decrease in a phased manner to avoid overshooting. As the training progresses, the content consistency loss and style consistency loss easily decrease down approximately to zero while the segmentation performance keeps increasing. This fact indicates that our hypothesis of style codes of the same channel across different images being the same is evidently reasonable. We compared the proposed method with baseline implementation with DeepLabv3 in [2], which is non-augmentation version because we want to make clear performance comparison between two methods. The result is shown in Table 3, which demonstrate that our method can disentangle semantic content successfully across G and B channels and their corresponding style noises, and shows better performance on image segmentation. Table 3. mIoU of the proposed method and baseline implementation with DeepLabV3+OS4 in [2] .OS denotes encoder output stride during training and evaluation. Methods

Buildings Woodlands Water

Background Overall

DeepLabV3

77.53%

Our methods 79.47%

91.05%

93.84%

93.02%

88.86%

91.56%

94.33% 92.64%

89.5%

278

5

Y. Sun et al.

Conclusion

We conduct an investigation on the effectiveness of R, G, B channels on the segmentation performance and find that each channel especially performs well for different classes due to their dominant color present in those classes. We propose a disentangle learning method to remove potential noisy information by setting 2 important constraints: channels of the same image should share the same content code and the same channel in different images should share the same style code. Our method demonstrates the effectiveness on aerial image segmentation. In the future work, We will continue to do investigation on disentanglement of more channels such as hyperspectral images.

References 1. Badrinarayanan, V., Kendall, A., Cipolla, R.: SegNet: a deep convolutional encoder-decoder architecture for image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 39(12), 2481–2495 (2017) 2. Boguszewski, A., et al.: Landcover. AI: dataset for automatic mapping of buildings, woodlands and water from aerial imagery. arXiv preprint arXiv:2005.02264 (2020) 3. Chen, L.-C., Zhu, Y., Papandreou, G., Schroff, F., Adam, H.: Encoder-decoder with atrous separable convolution for semantic image segmentation. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11211, pp. 833–851. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01234-2 49 4. Cordts, M., et al.: The cityscapes dataset for semantic urban scene understanding. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) 5. Gerard, F., et al.: Land cover change in Europe between 1950 and 2000 determined employing aerial photography. Prog. Phys. Geogr. 34(2), 183–205 (2010) 6. He, K., Gkioxari, G., Doll´ ar, P., Girshick, R.: Mask r-CNN. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 2961–2969 (2017) 7. Huang, X., Liu, M.-Y., Belongie, S., Kautz, J.: Multimodal unsupervised image-toimage translation. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11207, pp. 179–196. Springer, Cham (2018). https://doi. org/10.1007/978-3-030-01219-9 11 8. Khalel, A., El-Saban, M.: Automatic pixelwise object labeling for aerial imagery using stacked u-nets. arXiv preprint arXiv:1803.04953 (2018) 9. Li, X., Jiang, Y., Peng, H., Yin, S.: An aerial image segmentation approach based on enhanced multi-scale convolutional neural network. In: 2019 IEEE International Conference on Industrial Cyber Physical Systems (ICPS). pp. 47–52. IEEE (2019) 10. Lin, T.Y., Doll´ ar, P., Girshick, R., He, K., Hariharan, B., Belongie, S.: Feature pyramid networks for object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2117–2125 (2017) 11. Liu, M.Y., Breuel, T., Kautz, J.: Unsupervised image-to-image translation networks. Adv. Neural. Inf. Process. Syst. 30, 700–708 (2017) 12. Liu, S., Qi, L., Qin, H., Shi, J., Jia, J.: Path aggregation network for instance segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 8759–8768 (2018)

Aerial Image Segmentation via Noise Dispelling and Content Distilling

279

13. Long, J., Shelhamer, E., Darrell, T.: Fully convolutional networks for semantic segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3431–3440 (2015) 14. Maggiori, E., Tarabalka, Y., Charpiat, G., Alliez, P.: Can semantic labeling methods generalize to any city? the iNRIA aerial image labeling benchmark. In: 2017 IEEE International Geoscience and Remote Sensing Symposium (IGARSS), pp. 3226–3229. IEEE (2017) 15. Ronneberger, O., Fischer, P., Brox, T.: U-Net: convolutional networks for biomedical image segmentation. In: Navab, N., Hornegger, J., Wells, W.M., Frangi, A.F. (eds.) MICCAI 2015. LNCS, vol. 9351, pp. 234–241. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-24574-4 28

Vision Transformers Theory and Applications

Temporal Cross-Attention for Action Recognition Ryota Hashiguchi and Toru Tamaki(B) Nagoya Institute of Technology, Nagoya, Japan [email protected], [email protected] Abstract. Feature shifts have been shown to be useful for action recognition with CNN-based models since Temporal Shift Module (TSM) was proposed. It is based on frame-wise feature extraction with late fusion, and layer features are shifted along the time direction for the temporal interaction. TokenShift, a recent model based on Vision Transformer (ViT), also uses the temporal feature shift mechanism, which, however, does not fully exploit the structure of Multi-head Self-Attention (MSA) in ViT. In this paper, we propose Multi-head Self/Cross-Attention (MSCA), which fully utilizes the attention structure. TokenShift is based on a frame-wise ViT with features temporally shifted with successive frames (at time t+1 and t−1). In contrast, the proposed MSCA replaces MSA in the frame-wise ViT, and some MSA heads attend to successive frames instead of the current frame. The computation cost is the same as the frame-wise ViT and TokenShift as it simply changes the target to which the attention is taken. There is a choice about which of key, query, and value are taken from the successive frames, then we experimentally compared these variants with Kinetics400. We also investigate other variants in which the proposed MSCA is used along the patch dimension of ViT, instead of the head dimension. Experimental results show that a variant, MSCA-KV, shows the best performance and is better than TokenShift by 0.1% and then ViT by 1.2%.

1

Introduction

Recognizing the actions of people in videos is an important topic in computer vision. After the emergence of Vision Transformer (ViT) [1] which has been shown to be effective for various image recognition tasks [2,3], research extending ViT to video has become active [4–8]. In recognition of video, unlike images, it is necessary to model the temporal information across frames of a given video clip. While most early works of action recognition were frame-wise CNN models followed by temporal aggregation [9, 10], 3D CNN models [11–13] were shown to be effective and well generalized when they were trained on large-scale datasets [14] and transferred to smaller datasets [15]. However, the computational cost of 3D convolution is usually high. To mitigate the issue of the trade-off between the ability of temporal modeling and the high computational cost, Temporal Shift Module (TSM) [16] was proposed by extending spatial feature shifting [17,18] to the temporal dimension. TSM uses c The Author(s), under exclusive license to Springer Nature Switzerland AG 2023  Y. Zheng et al. (Eds.): ACCV 2022, LNCS 13848, pp. 283–294, 2023. https://doi.org/10.1007/978-3-031-27066-6_20

284

R. Hashiguchi and T. Tamaki

late fusion by applying 2D ResNet [19] to each frame, while it exploits the temporal interaction between adjacent frames by shifting the features in each layer of ResNet. The feature shift operation is computationally inexpensive, nevertheless it has been shown to perform as well as 3D CNN. Therefore, many related CNN models [20–22] have been proposed. TokenShift [23] is a TSM-like 2D ViT-based model that shifts a fraction of ViT features, showing that shifting only the class tokens is enough to achieve a good performance. However, TokenShift doesn’t fully exploit the advantage of the architecture of the attention mechanism of ViT, instead it simply shifts features like as in TSM. We propose a method that seamlessly utilizes the ViT structure to interact with features across adjacent frames. The original TSM or TokenShift uses a 2D model applied to each frame, while the output features of layers of the 2D model are then shifted from the current time t to one time step forward t + 1 and backward t − 1. In the encoder architecture of ViT, Multi-head Self-Attention (MSA) computes attentions between patches at a given frame t. The proposed method utilizes and modifies it for the temporal interaction; some patches at frame t attends to not only patches at the current frame t but also patches at the neighbor frames t + 1 and t − 1. We call this mechanisms Multi-head Self/CrossAttention (MSCA). This allows the temporal interaction to be computed within the Transformer block without additional shift modules. Furthermore, the computational cost remains the same because some parts of self-attention computation is simply replaced with the temporal cross-attention. The contributions of this paper are as follows: – We propose a new action recognition model with the proposed MSCA, which seamlessly combines the concepts of ViT and TSM. This avoids computing spatio-temporal attention directly in the 3D video volume. – MSCA uses cross-attention to perform temporal interactions inside the Transformer block, unlike TokenShift that uses additional shifting modules. This allows for temporal interaction without increasing computational complexity nor changing model architecture. – Experiments on Kinetics400 show that the proposed method improves performance compared to ViT and TokenShift.

2 2.1

Related Work 2D/3D CNN, and TSM

Early approaches of deep learning to action recognition are 2D-based methods [9,10] that apply 2D CNN to videos frame by frame, or Two-Stream methods [24] that use RGB frames and optical flow stacks. However, it is difficult to model long-term temporal information beyond several time steps, and this is where 3D-based methods [11–13] came in. These models simultaneously consider spatial and temporal information, but their computational cost is relatively high compared to 2D-based methods. Therefore, mixed models of 2D and 3D convolution [25–27] have been proposed by separating convolution in the spatial and temporal dimensions.

Temporal Cross-Attention for Action Recognition

285

TSM [16] is an attempt to model temporal information for action recognition without increasing parameters and computational complexity while maintaining the architecture of 2D CNN-based methods. TSM shifts intermediate features between neighbor frames, allowing 2D CNNs of each frame to interact with other frames in the temporal direction. Many CNN-based variants have followed, such as Gated-Shift Network [20] that learns temporal features using only 2D CNN with the gate shift module, and RubiksNet [22] which uses three learnable shifts (spatial, vertical, and temporal) for end-to-end learning. 2.2

ViT-Based Video Models

ViT [1] performs very well in image recognition tasks, and its application to action recognition tasks has been actively studied [4–8,28,29]. It is known that the computational cost of self-attention in Transformer is O(N 2 ) where N is the number of tokens, or the number of patches N for ViT. For videos, the number of frames T also contributes to the computational cost of spatio-temporal selfattention. A simple extension of ViT with full spatio-temporal attention gives a computational cost of O(N 2 T 2 ), since the number of tokens increases in proportion to the temporal extent. To alleviate this computational issue, TimeSformer [7] reduces the computational complexity to O(T 2 N + T N 2 ) by applying temporal and spatial attention separately, TokenLeaner [28] to O(S 2 T 2 ) by selecting S(< N ) patches that are important for the representation, and Space-time Mixing Attention [29] to O(T N 2 ) by attending tokens in neighbor frames only. TokenShift [23] is a TSM-like model that shifts a fraction of ViT features in each Transformer block. It computes the attention only in the current frame, hence the complexity is O(T N 2 ) with fewer FLOPs than [29]. TokenShift shifts the class token only based on experimental results, however it doesn’t fully exploit the attention mechanism. In contrast, the proposed method makes use of the attention mechanism for feature shift; we replace some of the self-attention modules with the proposed cross-attention modules that shift key, query, and value to neighbor frames, instead of naively shifting features.

3

Method

Figure 1(a) shows the original ViT encoder block that has the Multi-head SelfAttention (MSA) module. Figure 1(b) shows the encoder block of TokenShift. Two modules for shifting intermediate features are added to the original ViT block. These modules shift a portion of the class tokens in the temporal direction by one time step forward and backward, similar to TSM. Figure 1(c) shows the encoder block of the proposed MSCA. The difference is that MSA is replaced with the Multi-head Self/Cross-Attention (MSCA) module, and no additional modules exist. 3.1

TokenShift and ViT

In this section, we briefly review TokenShift [23] which is based on ViT [1].

286

R. Hashiguchi and T. Tamaki

Fig. 1. Encoder blocks of (a) ViT, (b) TokenShift with two shift modules (in blue), and (c) the proposed method with MSCA (in green). (Color figure online)

Input Patch Embedding. Let an input video be x ∈ RT ×3×H×W , where T is the number of frames in the video clip, and H, W are the height and width of the frame. Each frame is divided into patches of size P × P pixels and transformed T ×N ×d , where xi0 ∈ RT ×d denotes the i-th into a tensor x ˆ = [x10 , . . . , xN 0 ] ∈ R HW patch, N = P 2 is the number of patches of dimension d = 3P 2 . The input patch xi0 is then transformed by the embedding matrix E ∈ Rd×D and the positional encoding Epos as follows: z0 = [c0 , x10 E, x20 E, . . . , xN 0 E] + Epos ,

(1)

where c0 ∈ RT ×D is the class token. This patch embedding z0 ∈ RT ×(N +1)×D is the input to the first encoder block. Encoder Block. Let z be the input to the -th encoder block. The output z of the block can be expressed as follows: z = MSA(LN(z−1 )) + z−1 z = MLP(LN(z )) + z ,

(2) (3)

where LN is the layer normalization, MSA is the multi-head self-attention, and MLP is the multi-layer perceptron. In the following, z,t,n,d denotes the element at (t, n, d) in z . Shift Modules. TokenShift inserts two shift modules in the block as follows;  z−1 = Shift(z−1 )    )) + z−1 z = MSA(LN(z−1 z = Shift(z )

z = MLP(LN(z )) + z .

(4) (5) (6) (7)


The shift modules take the input $z_{in} \in \mathbb{R}^{T \times (N+1) \times D}$ and compute the output $z_{out}$ of the same size by shifting the part of $z_{in}$ corresponding to the class tokens ($z_{in,t,0,d}$, the first elements of the second dimension of $z_{in}$) while leaving the other parts untouched. This is implemented by the following assignments:

$$ z_{out,t,0,d} = \begin{cases} z_{in,t-1,0,d}, & 1 < t \le T,\ 1 \le d < D_b \\ z_{in,t+1,0,d}, & 1 \le t < T,\ D_b \le d < D_b + D_f \\ z_{in,t,0,d}, & \forall t,\ D_b + D_f \le d \le D \end{cases} \qquad (8) $$

$$ z_{out,t,n,d} = z_{in,t,n,d}, \quad \forall t,\ 1 \le n \le N,\ \forall d. \qquad (9) $$

The first equation shifts the class tokens; along the channel dimension $D$, the backward shift to $t-1$ is applied to the first $D_b$ channels, the forward shift to $t+1$ to the next $D_f$ channels, and no shift to the remaining channels. The second equation passes through all features other than the class tokens.
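As a concrete illustration of Eqs. (8) and (9), the following is a minimal PyTorch sketch of the class-token shift, assuming an input tensor of shape (T, N+1, D) with the class token at index 0 of the second dimension; the values of D_b and D_f are hypothetical hyperparameters, not taken from the paper.

```python
# Minimal sketch of the TokenShift class-token shift (Eqs. (8)-(9)).
import torch

def token_shift(z: torch.Tensor, d_b: int, d_f: int) -> torch.Tensor:
    # z: (T, N + 1, D); only the class token (index 0) is shifted in time.
    out = z.clone()                      # patch tokens (n >= 1) pass through untouched
    cls = z[:, 0, :]                     # class tokens, shape (T, D)
    shifted = cls.clone()
    # backward shift: frame t receives channels [0, d_b) from frame t - 1
    shifted[1:, :d_b] = cls[:-1, :d_b]
    # forward shift: frame t receives channels [d_b, d_b + d_f) from frame t + 1
    shifted[:-1, d_b:d_b + d_f] = cls[1:, d_b:d_b + d_f]
    out[:, 0, :] = shifted
    return out
```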

3.2 MSCA

The proposed method replaces MSA with MSCA in the original block:

$$ z'_\ell = \mathrm{MSCA}(\mathrm{LN}(z_{\ell-1})) + z_{\ell-1}, \qquad (10) $$
$$ z_\ell = \mathrm{MLP}(\mathrm{LN}(z'_\ell)) + z'_\ell. \qquad (11) $$

In the following, we first describe MSA, and then define MSCA.

MSA. The original MSA computes the key $K^{(t)}$, query $Q^{(t)}$, and value $V^{(t)}$ for the portion $z^{(t)} \in \mathbb{R}^{(N+1) \times D}$ of the input feature $z \in \mathbb{R}^{T \times (N+1) \times D}$ at time $t$, as follows:

$$ K^{(t)}, Q^{(t)}, V^{(t)} = z^{(t)} [W_k, W_q, W_v], \qquad (12) $$

where $W_k, W_q, W_v \in \mathbb{R}^{D \times D}$ are embedding matrices and $z = [z^{(1)}, \ldots, z^{(T)}]$. These are used to compute the $i$-th attention head at time $t$ for $i = 1, \ldots, h$:

$$ \mathrm{head}_i^{(t)} = a(Q_i^{(t)}, K_i^{(t)}) V_i^{(t)} \in \mathbb{R}^{(N+1) \times D/h}, \qquad (13) $$

where the attention $a$ is

$$ a(Q, K) = \mathrm{softmax}(Q K^T / \sqrt{D}), \qquad (14) $$

and $Q_i^{(t)} \in \mathbb{R}^{(N+1) \times D/h}$ is the part of $Q^{(t)}$ corresponding to the $i$-th head,

$$ Q^{(t)} = [Q_1^{(t)}, \ldots, Q_i^{(t)}, \ldots, Q_h^{(t)}], \qquad (15) $$

and $K_i^{(t)}, V_i^{(t)}$ are defined in the same way. These heads are finally stacked to form

$$ \mathrm{MSA}(z^{(t)}) = [\mathrm{head}_1^{(t)}, \ldots, \mathrm{head}_h^{(t)}]. \qquad (16) $$

In MSA, patches in the $t$-th frame are attended only by other patches of the same frame at time $t$, which means that there are no temporal interactions between frames.


Fig. 2. Shift operation at T = t in the MSCA-KV model

Fig. 3. Shift operations of (a) MSCA-V, (b) MSCA-pKV, and (c) MSCA-pV.

MSCA-KV. The proposed MSCA computes the attention across frames; that is, patches in the $t$-th frame are also attended by patches in the frames at time $t+1$ and $t-1$. This can be done with shift operations: after generating $Q$, $K$, $V$ at each frame, these are exchanged with the neighbor frames. There are choices of which of $Q$, $K$, or $V$ to shift. A possible choice is to shift $K$ and $V$, so that the query in the current frame is attended by key-value pairs in other frames. This is expressed as follows:

$$ \mathrm{head}_i^{(t)} = \begin{cases} a(Q_i^{(t)}, K_i^{(t-1)}) V_i^{(t-1)}, & 1 \le i < h_b \\ a(Q_i^{(t)}, K_i^{(t+1)}) V_i^{(t+1)}, & h_b \le i < h_b + h_f \\ a(Q_i^{(t)}, K_i^{(t)}) V_i^{(t)}, & h_b + h_f \le i \le h. \end{cases} \qquad (17) $$

Here, the first $h_b$ heads use the backward shift, the next $h_f$ heads use the forward shift, and the remaining heads use no shift. We call this MSCA-KV, and Fig. 2 shows the diagram of the shift operation. First, queries, keys, and values are computed at each frame, and then some of them are shifted before computing the attention. The solid arrows indicate the key-value shifts from time $t-1$ and $t+1$. In the same way, the key-value shifts from the current frame $t$ to $t-1$ and $t+1$ are shown as dotted arrows. The same shifts are performed in all other frames at the same time.
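The following is a minimal PyTorch sketch of such a key-value shift, assuming per-frame tensors of shape (T, h, N+1, d_head) that have already been split into heads; boundary frames keep their own keys and values here, and the exact boundary handling and attention scaling in the authors' implementation may differ.

```python
# Minimal sketch of the MSCA-KV head computation (Eq. (17)).
import torch
import torch.nn.functional as F

def msca_kv(q, k, v, h_b: int, h_f: int):
    # q, k, v: (T, h, N + 1, d_head)
    k_s, v_s = k.clone(), v.clone()
    # backward heads: frame t uses K, V of frame t - 1 (frame 0 keeps its own)
    k_s[1:, :h_b], v_s[1:, :h_b] = k[:-1, :h_b], v[:-1, :h_b]
    # forward heads: frame t uses K, V of frame t + 1 (last frame keeps its own)
    k_s[:-1, h_b:h_b + h_f] = k[1:, h_b:h_b + h_f]
    v_s[:-1, h_b:h_b + h_f] = v[1:, h_b:h_b + h_f]
    # scaled dot-product attention per frame and head
    attn = F.softmax(q @ k_s.transpose(-2, -1) / q.shape[-1] ** 0.5, dim=-1)
    return attn @ v_s                    # (T, h, N + 1, d_head)
```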


MSCA-V. Another choice of shift is shown in Fig. 3(a). Here, $Q$ and $K$ are not shifted, but only $V$ is shifted as follows:

$$ \mathrm{head}_i^{(t)} = \begin{cases} a(Q_i^{(t)}, K_i^{(t)}) V_i^{(t-1)}, & 1 \le i < h_b \\ a(Q_i^{(t)}, K_i^{(t)}) V_i^{(t+1)}, & h_b \le i < h_b + h_f \\ a(Q_i^{(t)}, K_i^{(t)}) V_i^{(t)}, & h_b + h_f \le i \le h. \end{cases} \qquad (18) $$

This might not be common because the key and value are now separated and taken from different frames. However, it makes sense for modeling temporal interactions because the values (which are mixed by the attention weights) come from different frames while the attention is computed in the current frame. We call this version MSCA-V. Therefore, in addition to shifting $V$ in this way, there are seven possible combinations: Q, K, V, QK, KV, QV, and QKV. We compare these variants in the experiments. Note that MSCA-QKV is equivalent to a simple feature shift because attended features are computed in each frame and then shifted.

MSCA-pKV. All of the above seven variants perform shift operations along the head (or channel) dimension $D$, but similar variants of shift are also possible along the patch dimension $N + 1$. The shapes of $K^{(t)}, Q^{(t)}, V^{(t)}$ are $\mathbb{R}^{(N+1) \times D}$, and the first dimension is for patches while the second is for heads. In the same way as the shift along the head dimension, we have a variation of shift operations along the patch dimension, as shown in Fig. 3(b). First, $K^{(t)}$ and $V^{(t)}$ are expressed as stacks of keys and values of patches as follows:

$$ K^{(t)} = [K_0^{(t)}, K_1^{(t)}, \ldots, K_N^{(t)}], \qquad (19) $$
$$ V^{(t)} = [V_0^{(t)}, V_1^{(t)}, \ldots, V_N^{(t)}], \qquad (20) $$

where $K_n^{(t)}, V_n^{(t)} \in \mathbb{R}^D$ are the key and value of patch $n$ at time $t$. Then, the keys of some patches in the current frame are shifted to form $K'^{(t)}$:

$$ K_n'^{(t)} = \begin{cases} K_n^{(t-1)}, & 0 \le n < N_b \\ K_n^{(t+1)}, & N_b \le n < N_b + N_f \\ K_n^{(t)}, & N_b + N_f \le n \le N, \end{cases} \qquad (21) $$

and $V'^{(t)}$ is formed in the same way. Finally, the $i$-th head is computed as follows:

$$ \mathrm{head}_i^{(t)} = a(Q_i^{(t)}, K_i'^{(t)}) V_i'^{(t)}. \qquad (22) $$

We refer to this version as MSCA-pKV.

MSCA-pV. As with MSCA-V, a variant of the shift in the patch direction with $V$ only can also be considered, as shown in Fig. 3(c), by the following shift:

$$ \mathrm{head}_i^{(t)} = a(Q_i^{(t)}, K_i^{(t)}) V_i'^{(t)}. \qquad (23) $$

As before, there are seven variants, and we call these MSCA-pV, and so on.

4 Experimental Results

4.1 Setup

Kinetics400 [14] was used to train and evaluate the proposed method. This dataset consists of a training set of 22k videos and a validation set of 18k videos, with 400 categories of human actions. Each video was collected from YouTube, and the portion corresponding to each category was cropped to a length of 10 s.

We used a 2D ViT pre-trained on ImageNet21k [30] with h = 12 heads, 12 encoder blocks, patches of size P = 16, and D = 768 = 3 × 16 × 16 (these parameters are the same for the TokenShift and MSCA models). For action recognition, ViT was applied to each frame and the resulting frame-wise features were aggregated by temporal averaging (this is referred to as ViT in the experiments and in [23]). We compared this ViT, TokenShift, and the proposed method. Note that we report the performance of TokenShift based on our reproduction using the author's code.1

For training, we used the same settings as in [23]. Input clips consisted of 8 frames with a stride of 32 frames (starting frames were randomly chosen). Frames were flipped horizontally with a probability of 50%, and the short side was randomly resized in the range of [244, 330] pixels while maintaining the aspect ratio; then a random 224 × 224 pixel rectangle was cropped (therefore the number of patches is N = 196 = 14 × 14). In addition, brightness change, saturation change, gamma correction, and hue correction were applied to frames, each with a probability of 10%. The number of epochs was set to 12, and the optimizer to SGD with a momentum of 0.9 and no weight decay. The initial learning rate was set to 0.1 and decayed by a factor of 10 at the 10th epoch. The batch size was set to 42, and 21 batches were trained on each of two GPUs. Gradient updates were performed once every 10 iterations, so the effective batch size was 420.

We used the multi-view test [31]. From a validation video, one clip was sampled as in training, and this was repeated 10 times to sample 10 clips. Each clip was resized to 224 pixels on its short side while maintaining the aspect ratio, and cropped to 224 × 224 at the right, center, and left. The results of these 30 clips (views) were averaged to compute a single prediction score.
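The effective batch size of 420 results from accumulating gradients over 10 iterations of batches of 42. The snippet below is a minimal, self-contained sketch of such a gradient-accumulation loop; the toy linear model and random data are placeholders for the actual ViT/MSCA training pipeline, not code from the paper.

```python
# Minimal sketch of gradient accumulation with SGD (lr 0.1, momentum 0.9).
import torch
import torch.nn as nn

model = nn.Linear(768, 400)                      # stand-in for the action classifier
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
criterion = nn.CrossEntropyLoss()
accum_steps, batch_size = 10, 42

optimizer.zero_grad()
for i in range(100):                             # stand-in for the data loader
    x = torch.randn(batch_size, 768)
    y = torch.randint(0, 400, (batch_size,))
    loss = criterion(model(x), y) / accum_steps  # average over the accumulation window
    loss.backward()                              # gradients accumulate across iterations
    if (i + 1) % accum_steps == 0:
        optimizer.step()                         # effective batch size = 42 x 10 = 420
        optimizer.zero_grad()
```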

4.2 The Amount of Shift of MSCA-KV

We first investigate the effect of the number of heads to be shifted. Table 1 shows the performance of MSCA-KV. The best performance was obtained when only two heads were shifted (one each for forward and backward), which corresponds to shifting 2/12 = 16.7% of the channels. As the number of shifted heads increased, the performance decreased, suggesting that shifting a few heads is sufficient while most heads need not be shifted. This observation coincides with the conclusions of TokenShift [23], which shows that shifting the class token only is enough, and also TSM [16], which used a shift of 1/4 = 25% of the channels (1/8 each for forward and backward).

1 https://github.com/VideoNetworks/TokShift-Transformer


Table 1. The performance of MSCA-KV on the validation set of Kinetics400. Note that zero shift means a naive frame-wise ViT. The column "Shift %" gives the percentage of the shifted dimensions relative to the total dimensions D.

Model   | Heads | hb, hf | Shift % | Top-1 | Top-5
ViT     | 0     | 0      | 0 (ViT) | 75.65 | 92.19
MSCA-KV | 2     | 1      | 16.7    | 76.47 | 92.88
MSCA-KV | 4     | 2      | 33.3    | 76.07 | 92.61
MSCA-KV | 6     | 3      | 50.0    | 75.66 | 92.30
MSCA-KV | 8     | 4      | 66.7    | 74.72 | 91.91

Table 2. The performance of MSCA variants shifting in the head direction.

Model      | Top-1 | Top-5
TokenShift | 76.37 | 92.82
ViT        | 75.65 | 92.19
MSCA-Q     | 75.84 | 92.57
MSCA-K     | 75.76 | 92.13
MSCA-V     | 75.58 | 92.39
MSCA-QK    | 75.47 | 92.32
MSCA-KV    | 76.47 | 92.88
MSCA-QV    | 75.57 | 92.43
MSCA-QKV   | 75.78 | 92.37

Table 3. The performance with different numbers of blocks with MSCA modules of KV shift.

Model   | # MSCA | # MSA | Top-1 | Top-5
ViT     | 0      | 12    | 75.65 | 92.19
MSCA-KV | 4      | 8     | 75.67 | 92.19
MSCA-KV | 8      | 4     | 76.40 | 92.77
MSCA-KV | 12     | 0     | 76.47 | 92.88

In the following experiments, we shifted two heads for all MSCA variants.

4.3 Comparison of MSCA Variations Shifting in the Head Direction

Table 2 shows the performance of the MSCA model variants with shift operations in the head direction. Among them, MSCA-KV performed the best, outperforming the others by at least 0.5%. The other variants performed about the same as the baseline ViT, indicating that the shift operation does not work effectively in those variations.

4.4 The Number of Encoder Blocks with MSCA

The proposed method replaces the MSA modules in the 12 encoder blocks of ViT with MSCA modules. However, it is not obvious that the modules of all blocks need to be replaced. Table 3 shows the results when we replaced MSA modules with MSCA in some blocks near the end (or top) of the network. All 12 replacements correspond to MSCA-KV, and 0 (no MSCA) is ViT. Using four MSCA modules showed no improvement, while using 8 MSCA modules performed almost the same as MSCA-KV with all 12 MSCA modules. This is because the input video clip consists of 8 frames, and shifting more than 8 times with MSCA modules ensures that the temporal information from all frames is available to the entire network. Therefore, it is necessary to use at least as many MSCA modules as the number of frames in the input clip.

Table 4. The performance of MSCA-pKV. The column "Shift %" gives the percentage of the shifted patches relative to the total patches N + 1.

Model    | Patches | Nb, Nf | Shift % | Top-1 | Top-5
ViT      | 0       | 0      | 0 (ViT) | 75.65 | 92.19
MSCA-pKV | 8       | 4      | 4.1     | 76.28 | 92.49
MSCA-pKV | 16      | 8      | 8.1     | 76.35 | 92.91
MSCA-pKV | 32      | 16     | 16.2    | 76.07 | 92.77
MSCA-pKV | 48      | 24     | 24.4    | 75.84 | 92.49
MSCA-pKV | 64      | 32     | 32.5    | 75.29 | 92.04

Table 5. The performance of MSCA models shifting in the patch direction.

Model      | Top-1 | Top-5
TokenShift | 76.37 | 92.82
ViT        | 75.65 | 92.19
MSCA-pQ    | 75.75 | 92.10
MSCA-pK    | 75.69 | 92.24
MSCA-pV    | 75.84 | 92.43
MSCA-pQK   | 75.50 | 91.99
MSCA-pKV   | 76.35 | 92.91
MSCA-pQV   | 75.83 | 92.35
MSCA-pQKV  | 75.73 | 92.21

4.5 Comparison of MSCA Variations Shifting in the Patch Direction

Table 4 shows the performance of MSCA-pKV with different amounts of shift. The results indicate that, again, a small amount of shift is enough for better performance; the best result was obtained when 16 patches were shifted. The performance decreases for smaller amounts of shift, approaching the performance of ViT with no shift. In the experiments below, we used a shift of 16 patches. Table 5 shows the performance of MSCA variants with shifting in the patch direction. Just as MSCA-KV was the best for shifting in the head direction, MSCA-pKV has the best performance here, while it is only comparable to TokenShift. An obvious drawback of this approach is the first layer: the patches to be shifted are fixed to the first Nb and Nf patches in the order of the patch index, which is irrelevant to the content of the frame. This might be mitigated by not using MSCA modules in the first several layers.

5 Conclusions

In this paper, we proposed MSCA, a ViT-based action recognition model that replaces the MSA modules in the encoder blocks. The MSCA modules compute the attention by shifting the key, query, and value for temporal interaction between frames. Experimental results on Kinetics400 showed that the proposed method is effective for modeling spatio-temporal features and performs better than a naive ViT and TokenShift. Future work includes evaluations on other datasets and comparisons with similar methods such as Space-time Mixing Attention [29].


Acknowledgements. This work was supported in part by JSPS KAKENHI Grant Number JP22K12090.

References

1. Dosovitskiy, A., et al.: An image is worth 16 × 16 words: transformers for image recognition at scale. In: International Conference on Learning Representations (2021)
2. Radford, A., et al.: Learning transferable visual models from natural language supervision. CoRR abs/2103.00020 (2021)
3. Ramesh, A., et al.: Zero-shot text-to-image generation. In: Meila, M., Zhang, T. (eds.) Proceedings of the 38th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 139, pp. 8821–8831. PMLR (2021)
4. Arnab, A., Dehghani, M., Heigold, G., Sun, C., Lučić, M., Schmid, C.: ViViT: a video vision transformer. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pp. 6836–6846 (2021)
5. Li, X., et al.: VidTr: video transformer without convolutions. CoRR abs/2104.11746 (2021)
6. Girdhar, R., Carreira, J., Doersch, C., Zisserman, A.: Video action transformer network. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019)
7. Bertasius, G., Wang, H., Torresani, L.: Is space-time attention all you need for video understanding? In: Proceedings of the International Conference on Machine Learning (ICML) (2021)
8. Sharir, G., Noy, A., Zelnik-Manor, L.: An image is worth 16 × 16 words, what is a video worth? CoRR abs/2103.13915 (2021)
9. Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2014)
10. Donahue, J., et al.: Long-term recurrent convolutional networks for visual recognition and description. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2015)
11. Tran, D., Bourdev, L., Fergus, R., Torresani, L., Paluri, M.: Learning spatiotemporal features with 3D convolutional networks. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV) (2015)
12. Carreira, J., Zisserman, A.: Quo vadis, action recognition? A new model and the Kinetics dataset. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2017)
13. Hara, K., Kataoka, H., Satoh, Y.: Can spatiotemporal 3D CNNs retrace the history of 2D CNNs and ImageNet? In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 6546–6555 (2018)
14. Kay, W., et al.: The Kinetics human action video dataset. CoRR abs/1705.06950 (2017)
15. Soomro, K., Zamir, A.R., Shah, M.: UCF101: a dataset of 101 human action classes from videos in the wild. CoRR abs/1212.0402 (2012)
16. Lin, J., Gan, C., Han, S.: TSM: temporal shift module for efficient video understanding. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) (2019)


17. Chen, W., Xie, D., Zhang, Y., Pu, S.: All you need is a few shifts: designing efficient convolutional neural networks for image classification. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019)
18. Wu, B., et al.: Shift: a zero FLOP, zero parameter alternative to spatial convolutions. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018)
19. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016)
20. Sudhakaran, S., Escalera, S., Lanz, O.: Gate-shift networks for video action recognition. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2020)
21. Chang, Y.L., Liu, Z.Y., Lee, K.Y., Hsu, W.: Learnable gated temporal shift module for deep video inpainting. In: BMVC (2019)
22. Fan, L., Buch, S., Wang, G., Cao, R., Zhu, Y., Niebles, J.C., Fei-Fei, L.: RubiksNet: learnable 3D-shift for efficient video action recognition. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12364, pp. 505–521. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58529-7_30
23. Zhang, H., Hao, Y., Ngo, C.W.: Token shift transformer for video classification, pp. 917–925. Association for Computing Machinery, New York, NY, USA (2021)
24. Simonyan, K., Zisserman, A.: Two-stream convolutional networks for action recognition in videos. In: Ghahramani, Z., Welling, M., Cortes, C., Lawrence, N., Weinberger, K.Q. (eds.) Advances in Neural Information Processing Systems, vol. 27. Curran Associates, Inc. (2014)
25. Qiu, Z., Yao, T., Mei, T.: Learning spatio-temporal representation with pseudo-3D residual networks. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV) (2017)
26. Zhang, D., Dai, X., Wang, X., Wang, Y.F.: S3D: single shot multi-span detector via fully 3D convolutional network. In: Proceedings of the British Machine Vision Conference (BMVC) (2018)
27. Tran, D., Wang, H., Torresani, L., Ray, J., LeCun, Y., Paluri, M.: A closer look at spatiotemporal convolutions for action recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018)
28. Ryoo, M.S., Piergiovanni, A., Arnab, A., Dehghani, M., Angelova, A.: TokenLearner: adaptive space-time tokenization for videos. In: Advances in Neural Information Processing Systems (NeurIPS) (2021)
29. Bulat, A., Perez-Rua, J.M., Sudhakaran, S., Martinez, B., Tzimiropoulos, G.: Space-time mixing attention for video transformer. In: Beygelzimer, A., Dauphin, Y., Liang, P., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems (2021)
30. Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., Fei-Fei, L.: ImageNet: a large-scale hierarchical image database. In: 2009 IEEE Conference on Computer Vision and Pattern Recognition, pp. 248–255 (2009)
31. Wang, X., Girshick, R., Gupta, A., He, K.: Non-local neural networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018)

Transformer Based Motion In-Betweening

Pavithra Sridhar, V. Aananth, Madhav Aggarwal, and R. Leela Velusamy(B)
National Institute of Technology - Tiruchirappalli, Tiruchirappalli, TN 620015, India
[email protected]

Abstract. In-Betweening is the process of drawing transition frames between temporally-sparse keyframes to create a smooth animation sequence. This work presents a novel transformer based in betweening technique that serves as a tool for 3D animators. We first show that this problem can be represented as a sequence to sequence problem and introduce TweenTransformers - a model that synthesizes high-quality animations using temporally-sparse keyframes as input constraints. We evaluate the model’s performance via two complementary methods - quantitative evaluation and qualitative evaluation. The model is compared quantitatively with the state-of-the-art models using LaFAN1, a high-quality animation dataset. Mean-squared metrics like L2P, L2Q, and NPSS are used for evaluation. Qualitatively, we provide two straightforward methods to assess the model’s output. First, we implement a custom ThreeJs-based motion visualizer to render the ground truth, input, and output sequences side by side for comparison. The visualizer renders custom sequences by specifying skeletal positions at temporally-sparse keyframes in JSON format. Second, we build a motion generator to generate custom motion sequences using the model. Code can be found in https://github.com/Pavi114/motion-completion-using-transformers.

Keywords: Motion in-betweening · Kinematics · Transformer · LAFAN1

1 Introduction

Realistic and accurate animation generation is an important but challenging problem with many applications, including animating 3D characters in films, real-time character motion synthesis in video games, and educational applications. One widely used method to generate animations is via motion in-betweening, commonly known as tweening. It generates intermediate frames called "inbetweens" between two temporally sparse keyframes to deliver an illusion of movement by smoothly transitioning from one position to another. In traditional animation pipelines, animators manually draw motion frames between a set of still keyframes indicative of the most critical positions the body must be at during its motion sequence. Recent improvements include Motion Capture (MOCAP) technologies [9] and query-based methods [15,20]


to generate animations. However, MOCAP technology is expensive, and human-drawn animations are preferred. With the rise of computer-aided animation, deep learning-based algorithms have enabled the smooth generation of keyframes from sparse frames by learning from large-scale motion capture data. Existing models currently use Recurrent Neural Networks (RNNs) [7,10], Long Short Term Memory networks (LSTMs) [8], and BERT-based models [3,4]. The complexity of generating character animations stems from several factors: 1) replicating complex human behavior to create realistic characters; 2) the predominantly used transition generation methods are either expensive or inefficient; 3) RNNs/LSTMs, though they can capture long-term dependencies, cannot be parallelized due to the sequential processing of the input, resulting in longer training times; and 4) RNNs/LSTMs do not support transfer learning, making it hard to use pre-trained models.

Inspired by the concept of self-attention to capture long-term dependencies, this paper proposes a transformer-based model to generate realistic animation sequences. Model generalization constitutes the main effort this framework puts into improving the performance of machine learning predictions. This would be analogous to large text transformer models like GPT-3 [2]. This work not only eases the effort put in by animators but also helps researchers by unblocking transfer learning for the task of in-betweening, thus introducing a level of generalization into the model. Overall, the contributions of this paper can be summarized as follows:

1. Represent motion in-betweening as a sequence to sequence problem where the input sequence consists of keyframes and the output sequence represents the complete and smoothed motion sequence.
2. Set a baseline for the input sequence by filling the frames between the keyframes with interpolated values.
3. Experiment with the efficiency and viability of using transformers to achieve sequence to sequence translation for human motion and compare them with the existing results.
4. Evaluate the model against other state-of-the-art models [4,8,17] for the same task using L2P, L2Q, and NPSS metrics.
5. Build a visualizer and a motion generator that qualitatively evaluate the output of the model in comparison to the ground truth and input sequences.

2 Related Work

The problem is analogous to machine translation, where sequence to sequence (seq2seq) architectures are prevalent [1,19,22]. "Encoder-only" models like BERT [3] are designed to learn the context of a word based on all its surroundings (left and right of the word), making them suitable for feature extraction, sentiment classification, or span prediction tasks, but not for generative tasks like translation or sequence completion. The pre-training objectives used by encoder-decoder transformers like T5 [18] include a fill-in-the-blank task where the model


predicts missing words within a corrupted piece of text, which is analogous to in-betweening when motion sequences replace sentences. Early works in human motion prediction include using Conditional Restricted Boltzmann Machines (RBMs) [21] to encode the sequence information in latent variables and predict using decoders. More recently, many RNN-based approaches like Encoder-Recurrent-Decoder (ERD) networks [5] propose separating spatial encoding and decoding from the temporal dependencies. Other recent approaches investigate new architectures like transformers [13] and loss functions to further improve prediction of human motion [6,12]. Initial approaches to motion in-betweening focused on generating missing frames by integrating keyframe information with spacetime models [24]. Subsequent widely successful methods for in-betweening adopted a probabilistic approach, framing it as a maximum a posteriori (MAP) optimization problem [14], a dynamical Gaussian process model [23], or as Markov models with dynamic autoregressive forests [11]. The latest deep learning approaches include works by Holden et al. [10] and Harvey et al. [7] and helped RNNs dominate this field. The latest work using RNNs focuses on augmenting a Long Short Term Memory (LSTM) based architecture with time-to-arrival embeddings and a scheduled target noise vector, allowing the system to be robust to target distortions [8]. Some recent work includes BERT-based encoder-only models [3,4] that predict the entire sequence in one pass and deep learning approaches for interpolation [16]. However, BERT-based models will be less effective than encoder-decoder models for generative tasks.

3 Methodology

The following sections detail the model architecture, Tween Transformers, to perform motion frame completion similar to sentence completion.

3.1 Tween Transformers (TWTR)

The architecture of Tween Transformers (TWTR) consists of four main components:

1. An input masking module.
2. An input encoding neural network that encodes each motion sequence and converts the input to a set of sequential tokens.
3. A transition generation network that includes a standard transformer comprising encoder and decoder modules with feed-forward and multi-head attention networks.
4. An output decoding neural network that computes a sequence of character motion.

While the transition generation module learns the temporal dependencies, the input and output encoding networks aim to learn spatial dependencies between the different body joints for encoding and decoding motion sequences.


Fig. 1. Model Architecture of TWTR

Finally, the model also uses multiple losses, including a forward kinematics loss, to improve the realism of the generated sequences.

It is assumed that the input has both position (x, y, z) and orientation (q0, q1, q2, q3) variables. Therefore, a single pose can be defined with a root position coordinate $P \in \mathbb{R}^3$ and a quaternion matrix $Q \in \mathbb{R}^{J \times 4}$, where $J$ is the number of joints of the input pose (here, 22). The following sections discuss the architecture of the model in detail, as indicated in Fig. 1.

Input Masking. Multiple keyframe gaps $k$ are specified in the model configuration. The frames belonging to a keyframe gap are filled with interpolated values derived from the frames at the two ends of the gap. Two kinds of interpolation are carried out and compared:

– Positions and rotations are linearly interpolated.
– Positions are linearly interpolated while rotations are spherically interpolated.
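The following is a minimal sketch of the two interpolation baselines, using linear interpolation (LERP) for positions and spherical linear interpolation (SLERP) for quaternions; the function signatures are illustrative and are not taken from the paper's code.

```python
# Minimal LERP/SLERP sketch for filling a keyframe gap.
import numpy as np

def lerp(a: np.ndarray, b: np.ndarray, t: float) -> np.ndarray:
    return (1.0 - t) * a + t * b

def slerp(q0: np.ndarray, q1: np.ndarray, t: float) -> np.ndarray:
    q0, q1 = q0 / np.linalg.norm(q0), q1 / np.linalg.norm(q1)
    dot = np.dot(q0, q1)
    if dot < 0.0:                       # take the shorter arc
        q1, dot = -q1, -dot
    if dot > 0.9995:                    # nearly parallel: fall back to LERP
        q = lerp(q0, q1, t)
        return q / np.linalg.norm(q)
    theta = np.arccos(dot)
    return (np.sin((1 - t) * theta) * q0 + np.sin(t * theta) * q1) / np.sin(theta)

# A gap of n missing frames between keyframes (p0, q0) and (p1, q1) is filled by
# evaluating lerp/slerp at t = i / (n + 1) for i = 1, ..., n.
```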


Input Encoding. As seen in Fig. 1, model encoding has three modules - Input Sequence Encoding, Positional Encoding, and Keyframe Embedding.

1. Input Sequence Encoding: The input sequence encoder network is a set of three Linear encoders, which are fully connected two-layer Feed-Forward Networks (FFN) with ReLU activations. The input sequence encoder takes in the global root position $root\_p$, local quaternions $q$, and global root velocity $root\_v$ and outputs a set of "sequential tokens." The hidden sizes of the FFNs are 16, 8, and 8 for $q$, $root\_p$, and $root\_v$, respectively. The output sizes of the FFNs are defined by the embedding hyperparameter. The outputs from the FFNs are concatenated to form the output of the input sequence encoding network. Equation (1) describes the Linear Encoder and Eq. (2) describes the Input Sequence Encoder:

$$ L(x) = \mathrm{Linear}(\mathrm{ReLU}(\mathrm{Linear}(x))), \qquad (1) $$
$$ I(root\_p, root\_v, q) = L_p(root\_p) \oplus L_v(root\_v) \oplus L_q(q_1) \oplus \ldots \oplus L_q(q_J), \qquad (2) $$

where $root\_p \in \mathbb{R}^3$, $root\_v \in \mathbb{R}^3$, $q_i \in \mathbb{R}^4$, $\oplus$ denotes concatenation, $I$ denotes the Input Sequence Encoder, and $L$ denotes the Linear Encoder.

2. Positional Encoding: Positional encoding, a popular method introduced by Vaswani et al. [22], involves adding a set of predefined sine and cosine signals to introduce temporal knowledge to the transformer model. The positional encoding for the source $Z_s = [z_{tta,2i}]$ and target $Z_t = [z_{tta,2i}]$ is computed using equation (3):

$$ z_{tta,2i} = \sin\left(\frac{tta}{\mathrm{basis}^{2i/d}}\right), \quad z_{tta,2i+1} = \cos\left(\frac{tta}{\mathrm{basis}^{2i/d}}\right), \qquad (3) $$

where $tta$ is the number of timesteps until arrival and the basis component influences the rate of change in frequencies along the embedding dimension $d$. A basis of 10,000 is used.

3. Keyframe Embedding: Following previous works [4], the model incorporates additive keyframe embeddings. The keyframe embeddings $E_{kf}$ classify the frames in the sequence into keyframes, unknown frames, and ignored frames, represented by learnable embedding vectors $\{\hat{e}_0, \hat{e}_1, \hat{e}_2\}$, respectively. The keyframe embeddings are given by equation (4), where $e^t_{kf} \in \{\hat{e}_0, \hat{e}_1, \hat{e}_2\}$ and $T$ is the sequence length. The embeddings are added to the input sequence, similar to positional encodings.

$$ E_{kf} = [e^1_{kf}, e^2_{kf}, \ldots, e^T_{kf}] \qquad (4) $$
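As an illustration, the following is a minimal sketch of the sinusoidal time-to-arrival encoding of Eq. (3) with a basis of 10,000; the tensor layout (one d-dimensional vector per time-to-arrival value, with even/odd channels holding sine/cosine terms) is our assumption.

```python
# Minimal sketch of the time-to-arrival positional encoding (Eq. (3)).
import torch

def tta_encoding(tta: torch.Tensor, d: int, basis: float = 10_000.0) -> torch.Tensor:
    # tta: (T,) integer timesteps-until-arrival; returns a (T, d) encoding
    two_i = torch.arange(0, d, 2, dtype=torch.float32)   # the 2i indices
    denom = basis ** (two_i / d)                          # basis^(2i/d)
    angles = tta.float().unsqueeze(1) / denom             # (T, d/2)
    enc = torch.zeros(len(tta), d)
    enc[:, 0::2] = torch.sin(angles)                      # even channels
    enc[:, 1::2] = torch.cos(angles)                      # odd channels
    return enc
```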

Transformer. A transformer consists of multiple encoder and decoder layers. Each encoder consists of a multi-head self-attention layer (MHSA) and a feed-forward network (FFN), and each decoder consists of a masked multi-head self-attention layer (MMHSA), a multi-head attention layer (MHA), and a feed-forward network. The attention function leveraged in the transformer maps a query and a set of key-value pairs - all vectors - to an output. The processing of a single attention head can be represented as follows:

$$ \mathrm{Attention}(Q, K, V) = \mathrm{Softmax}\left(\frac{QK^T}{\sqrt{d_k}}\right)V, \qquad (5) $$

where $Q = W_q A$ represents a query matrix, $K = W_k A$ represents a key matrix, and $V = W_v A$ represents a value matrix. $W_q$, $W_k$, and $W_v$ are the corresponding weight matrices, and $d_k$ represents the dimension of the key matrix. The Query matrix can be interpreted as the keyframe for which attention is calculated. The Key and Value matrices represent the keyframes that are "attended to," i.e., how relevant each keyframe is to the query keyframe. In MMHSA, the target is masked before applying the attention mechanism. All the attention outputs are concatenated and sent to the FFN.

Output Decoding. The decoder takes in the concatenated "sequential tokens" outputted by the Input Sequence Encoder and outputs the global root position $root\_p$, local quaternions $q$, and global root velocity $root\_v$. To reverse engineer the spatial dependencies, each of the three FFNs, one for each output, comprises two linear layers with ReLU activation. The hidden size of the FFNs is the same as in the Input Sequence Encoder. The output sizes of the FFNs are defined by the original dimensions of the three parameters. Equation (6) describes the Output Decoder:

$$ O(x) = (L_p(x[:d_p]),\ L_v(x[d_p : d_p + d_v]),\ Q), \qquad (6) $$

$$ Q = \begin{bmatrix} L_q(x[d_p + d_v : d_p + d_v + d_q]) \\ L_q(x[d_p + d_v + d_q : d_p + d_v + 2 d_q]) \\ \vdots \\ L_q(x[d_p + d_v + (J-1) d_q : d_p + d_v + J d_q]) \end{bmatrix}, $$

where $d_p$, $d_v$, and $d_q$ are the embedding dimensions for $p$, $v$, and $q$; $x[i:j]$ represents a tensor containing the values in $x$ from the $i$-th index to the $(j-1)$-th index; $J$ denotes the number of joints in the skeleton; $Q \in \mathbb{R}^{J \times 4}$ denotes the tensor of stacked quaternions; $O$ denotes the Output Decoder; and $L$ denotes the Linear Encoder.

3.2 Loss Computation

Given a collection of predicted motion sequences and the ground truth, the in-betweening loss is computed as the scaled sum of two individual losses - the Reconstruction loss and the Forward Kinematics (FK) loss:

$$ L = \alpha_r L_R + \alpha_{FK} L_{FK}, \qquad (7) $$

where $\alpha_r$ and $\alpha_{FK}$ are constants that balance the disparity of the individual losses. For training we use $\alpha_r = 100$ and $\alpha_{FK} = 1$.


Reconstruction Loss $L_R$. The reconstruction loss evaluates the ability of the model to "reconstruct" the target sequence from the input sequence. It accounts for the difference between the output and target quaternion values and is computed using an L1 norm. While Harvey et al. [8] compute and sum reconstruction losses for q, x, and contacts, they acknowledge that the most important component is q. The reconstruction loss is computed using equation (8):

$$ L_R = \frac{1}{NT} \sum_{n=0}^{N-1} \sum_{t=0}^{T-1} \left\| \hat{q}_n^t - q_n^t \right\|_1, \qquad (8) $$

where $\hat{q}_n^t$ is the rotational quaternion of the predicted motion sequence $n$ at time $t$, $q$ refers to the ground truth quaternion, $N$ refers to the number of sequences, and $T$ refers to the length of each motion sequence.

Forward Kinematics Loss $L_{FK}$. The Forward Kinematics loss compares the difference in the global positions of joints between the ground truth and the model's output. It evaluates the ability of the model to "understand" the relationships between relative angles and global positions. Although the offsets of the various joints in the skeleton are not provided to the model, it learns to respect human geometry and maintain correct posture by minimizing the Forward Kinematics loss, which is computed using equation (9):

$$ L_{FK} = \| \hat{p}_{global} - p_{global} \|_1 + \| \hat{q}_{global} - q_{global} \|_1, \qquad (9) $$

where $\hat{p}_{global}$ and $\hat{q}_{global}$ are derived from the local coordinates using Forward Kinematics $FK(\hat{p}_{local}, \hat{q}_{local})$ and, similarly, $p_{global}$ and $q_{global}$ are derived from the local coordinates using $FK(p_{local}, q_{local})$.
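The following is a minimal sketch of the combined loss of Eqs. (7)-(9), assuming a forward-kinematics routine fk that maps local positions and quaternions to their global counterparts; the fk signature is an assumption rather than the paper's code, and the normalization differs from Eq. (8) only by a constant factor.

```python
# Minimal sketch of the in-betweening loss (Eqs. (7)-(9)).
import torch

def inbetween_loss(q_hat, q, p_hat_local, p_local, fk, alpha_r=100.0, alpha_fk=1.0):
    # q_hat, q: (N, T, J, 4) local quaternions; p_hat_local, p_local: (N, T, J, 3)
    l_r = (q_hat - q).abs().mean()                        # Eq. (8): L1 on quaternions
    p_hat_g, q_hat_g = fk(p_hat_local, q_hat)             # predicted global pose (assumed fk)
    p_g, q_g = fk(p_local, q)                             # ground-truth global pose
    l_fk = (p_hat_g - p_g).abs().mean() + (q_hat_g - q_g).abs().mean()   # Eq. (9)
    return alpha_r * l_r + alpha_fk * l_fk                # Eq. (7)
```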

3.3 Training

Following previous works [8,17], the entire dataset was split into windows of maximum length $T_{max} = 65$. To construct each batch, the number of start keyframes is set to 10 and the number of end keyframes to 1. The number of in-between frames is sampled from the range [5, 44] without replacement. The weight associated with the number of in-between frames $n_{in}$ is set to be inversely proportional to it, $w_{n_{in}} = 1/n_{in}$. This prevents overfitting on the windows with a large number of in-between frames. Shorter windows are sampled more often, as they are more abundant and hence harder to overfit: the number of unique non-overlapping sequences of a given total length $10 + 1 + n_{in}$ is approximately inversely proportional to $n_{in}$. Finally, given the total sampled sequence length, the sequence start index is sampled uniformly at random in the range $[0, T_{max} - (1 + 10 + n_{in})]$.
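A minimal sketch of this sampling scheme is given below; the random generator and the exact index conventions are our own assumptions.

```python
# Minimal sketch of the weighted window sampling used during training.
import numpy as np

rng = np.random.default_rng(0)
T_MAX, N_START, N_END = 65, 10, 1

def sample_window():
    n_in_choices = np.arange(5, 45)                    # in-between frame counts [5, 44]
    weights = 1.0 / n_in_choices                       # weight proportional to 1 / n_in
    n_in = rng.choice(n_in_choices, p=weights / weights.sum())
    total = N_START + N_END + n_in                     # full window length
    start = rng.integers(0, T_MAX - total + 1)         # uniform start index
    return start, total
```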


Fig. 2. Stills from the Ground Truth, LERP, Model output, and smoothed output sequences at different timestamps for the action “Aiming2” performed by subject “Subject5”. Considering the frames at t = 20, it is clear that the output produced by our model resembles the ground truth more than the interpolated sequence.

4 Setup and Experimental Results

4.1 Evaluation Metrics

The model is evaluated against the L2P, L2Q, and NPSS metrics used in previous studies on the Subject 5 sequences of the LaFAN1 dataset. L2P is the average L2 distance of the positions between the predicted motion sequence and the ground truth sequence; Equation (10) shows the L2P calculation. Similarly, L2Q is the average L2 distance of the global quaternions; Equation (11) shows the L2Q calculation. A combination of local quaternions, positions, and motion sequence properties is used to compute these metrics.

$$ L2P = \frac{1}{NT} \sum_{n=0}^{N-1} \sum_{t=0}^{T-1} \left\| \hat{p}_n^t - p_n^t \right\|_2, \qquad (10) $$

$$ L2Q = \frac{1}{NT} \sum_{n=0}^{N-1} \sum_{t=0}^{T-1} \left\| \hat{q}_n^t - q_n^t \right\|_2, \qquad (11) $$

where $\hat{q}$ is the rotational quaternion of the predicted motion sequence $n$ at time $t$ and $q$ refers to the ground truth quaternion. Similarly, $\hat{p}$ refers to the position of the predicted motion sequence and $p$ refers to the ground truth position. $N$ refers to the number of sequences, and $T$ refers to the length of each motion sequence.

Normalized Power Spectrum Similarity (NPSS) is an approach that compares angular frequencies with the ground truth. It is an Earth Mover Distance (EMD) based metric over the power spectrum which uses the squared magnitude spectrum values of the Discrete Fourier Transform coefficients. Equation (12) computes the NPSS metric.


Fig. 3. Still from the motion generator

$$ NPSS = \frac{\sum_{i=0}^{N-1} \sum_{j=0}^{T-1} w_{i,j} \cdot emd_{i,j}}{\sum_{i=0}^{N-1} \sum_{j=0}^{T-1} w_{i,j}}, \qquad (12) $$

where $emd_{i,j}$ refers to the EMD distance and $w_{i,j}$ refers to the weights. Harvey et al. [8] state that the L2P metric is a better metric than any angular loss for assessing the visual quality of transitions with global displacements, as it helps weigh the positions of the bones and joints. Hence, they argue that L2P is a much more critical metric than L2Q and NPSS.
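For clarity, the following is a minimal sketch of how the L2P and L2Q metrics of Eqs. (10) and (11) could be computed; the array shapes are assumptions rather than details from the paper's code.

```python
# Minimal sketch of the L2P / L2Q metrics (Eqs. (10)-(11)).
import numpy as np

def l2_metric(pred: np.ndarray, gt: np.ndarray) -> float:
    # pred, gt: (N, T, J, C) with C = 3 for positions (L2P) or 4 for quaternions (L2Q)
    n, t = pred.shape[:2]
    diff = (pred - gt).reshape(n, t, -1)                 # flatten joints per frame
    return float(np.linalg.norm(diff, axis=-1).mean())   # average L2 over N and T
```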

4.2 Dataset

The publicly available Ubisoft La Forge Animation (LaFAN1) Dataset was used for all the experiments. Introduced by Harvey et al. [8] at Ubisoft, LaFAN1 consists of general motion capture clips in high definition. The motion sequences are in BVH format. The LaFAN1 dataset comprises five subjects, 77 sequences, and 496,672 motion frames at 30 fps, for a total of 4.6 h. There are around 15 themes, from everyday actions like walking, sprinting, and falling to uncommon actions like crawling, aiming, and a few sports movements. Similar to other works [4,8,17], all Subject 5 sequences were used for testing and benchmarking, with the remaining used for training.

4.3 Data Preprocessing

First, the local position and orientation values are extracted from the BVH files provided in the LaFAN1 dataset [7]. A skeleton model with 22 joints is considered. Forward kinematics is used to compute the absolute position of each joint from the relative (hip-relative) positions given in the dataset. Positions are modeled as standard matrices, and orientations are modeled using quaternions. Further, the global positions and root velocity are computed from the local positions using forward kinematics.
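The following is a minimal sketch of such a forward-kinematics pass for a joint hierarchy, accumulating global rotations and positions from the root outwards; the (w, x, y, z) quaternion convention and the parents array are assumptions about the skeleton layout rather than details taken from the dataset files.

```python
# Minimal forward-kinematics sketch: local (parent-relative) pose -> global pose.
import numpy as np

def quat_mul(a, b):
    w1, x1, y1, z1 = a; w2, x2, y2, z2 = b
    return np.array([w1*w2 - x1*x2 - y1*y2 - z1*z2,
                     w1*x2 + x1*w2 + y1*z2 - z1*y2,
                     w1*y2 - x1*z2 + y1*w2 + z1*x2,
                     w1*z2 + x1*y2 - y1*x2 + z1*w2])

def quat_rotate(q, v):
    # rotate vector v by unit quaternion q: q * (0, v) * conj(q)
    qv = np.concatenate(([0.0], v))
    w, x, y, z = quat_mul(quat_mul(q, qv), q * np.array([1, -1, -1, -1]))
    return np.array([x, y, z])

def forward_kinematics(local_pos, local_rot, parents):
    # local_pos: (J, 3) offsets from parent; local_rot: (J, 4) local quaternions;
    # parents: parent index per joint (-1 for the root), parents before children.
    J = len(parents)
    g_pos, g_rot = np.zeros((J, 3)), np.zeros((J, 4))
    for j in range(J):
        if parents[j] == -1:                   # root joint
            g_pos[j], g_rot[j] = local_pos[j], local_rot[j]
        else:
            p = parents[j]
            g_rot[j] = quat_mul(g_rot[p], local_rot[j])
            g_pos[j] = g_pos[p] + quat_rotate(g_rot[p], local_pos[j])
    return g_pos, g_rot
```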


Fig. 4. (a) Comparison of model performance at a keyframe gap of 30 with three commonly used metrics - L2P, L2Q, and NPSS; (b) L2P loss vs. keyframe gap; (c) NPSS loss vs. keyframe gap; (d) L2Q loss vs. keyframe gap.

4.4 Hyperparameters

Most hyperparameters from previous baselines are retained in order to show the relative improvement in performance from using Transformers. This study presents a novel hyperparameter comparison using different interpolation techniques, linear and spherical, to compare the performance against several baseline studies. A batch size of 64 was used for 100 epochs. The Adam optimizer with a learning rate of $10^{-4}$ and a constant dropout of 0.2 was utilised. Keyframe gaps of 5, 15, and 30 were tested to compare the performance of the transformer over larger frame gaps.

4.5 Visualizer and Motion Generator

To qualitatively evaluate the model, a visualizer was built using Node and ThreeJs that juxtaposes the ground truth, interpolated sequence, output sequence, and a smoothed output sequence of the transformer model.


Table 1. The Tween Transformer model is compared with baseline motion in-betweening methods using L2Q, L2P, and NPSS metrics for various transition lengths. The interpolation-based methods are included as part of the study. TT (Ours) refers to the Tween Transformer model.

Metric | Length | Zero Velocity | SLERP | TGrec | TGcomplete | SSMCTlocal | SSMCTglobal | Δ-Interpolator | TT (Ours)
L2Q    | 5      | 0.56   | 0.22   | 0.21   | 0.17   | 0.17   | 0.14   | 0.11   | 0.16
L2Q    | 15     | 1.10   | 0.62   | 0.48   | 0.42   | 0.44   | 0.36   | 0.32   | 0.39
L2Q    | 30     | 1.51   | 0.98   | 0.83   | 0.69   | 0.71   | 0.61   | 0.57   | 0.65
L2P    | 5      | 1.52   | 0.37   | 0.32   | 0.23   | 0.23   | 0.22   | 0.13   | 0.21
L2P    | 15     | 3.69   | 1.25   | 0.85   | 0.65   | 0.74   | 0.56   | 0.47   | 0.59
L2P    | 30     | 6.60   | 2.32   | 1.82   | 1.28   | 1.37   | 1.1    | 1.00   | 1.21
NPSS   | 5      | 0.0053 | 0.0023 | 0.0025 | 0.0020 | 0.0019 | 0.0016 | 0.0014 | 0.0019
NPSS   | 15     | 0.0522 | 0.0391 | 0.0304 | 0.0258 | 0.0291 | 0.0234 | 0.0217 | 0.0261
NPSS   | 30     | 0.2318 | 0.2013 | 0.1608 | 0.1328 | 0.143  | 0.1222 | 0.1217 | 0.1358

The model's output is stored in JSON format and rendered using a custom web-based visualizer. The visualizer was built from scratch using TypeScript, NodeJs, Express, and ThreeJs. Figure 2 shows a sample output of the model generated using the visualizer. Further, a motion generator was built using Python, Flask, Node, and ThreeJs with the visualizer module as a base. The motion generator allows a user to modify keyframes in a given motion sequence and generate in-between frames for them. The plugin consists of a backend Flask server that uses an instance of our model to generate the in-between frames. Figure 3 shows a still from the motion generator where the stick model is animating a generated custom motion sequence.

4.6 Inferences

As expected, SLERP performs better than LERP. However, it is observed that the performance at 30fps is almost comparable, as seen in Fig. 4a. This is because the spherical motion becomes almost linear for very short timescales. As seen in Table 1, it is inferred that the Tween Transformer model outperforms the interpolation model and performs closely with the baseline models. From Figs. 4b, 4d, and 4c, it is seen that Tween Transformers follow a similar trend to that of other models. Experiments show that training is crucial; Moving Average Smoothing is observed to have minimal effect on the output sequence as the model trains.

5 Conclusion

This work presents the Tween Transformer, a novel, robust, transformer-based motion in-betweening technique that serves as a tool for 3D animators and overcomes the challenges faced by existing RNN-based models [8,17], including sequential training, capturing long-term dependencies, and transfer learning.


The generic model treats the application of in-betweening as a sequence to sequence problem and solves it using a transformer-based encoder-decoder architecture. It unlocks the potential of robust Transformer-based models for motion in-betweening applications. To conclude, the results encourage the application of low-resource, cost-efficient models and enable further developments with the scope of transfer learning on the generalized implementation.

References

1. Bahdanau, D., Cho, K., Bengio, Y.: Neural machine translation by jointly learning to align and translate. In: Bengio, Y., LeCun, Y. (eds.) 3rd International Conference on Learning Representations, ICLR 2015, San Diego, CA, USA, May 7-9, 2015, Conference Track Proceedings (2015). http://arxiv.org/abs/1409.0473
2. Brown, T., et al.: Language models are few-shot learners. In: Larochelle, H., Ranzato, M., Hadsell, R., Balcan, M.F., Lin, H. (eds.) Advances in Neural Information Processing Systems, vol. 33, pp. 1877–1901. Curran Associates, Inc. (2020). https://proceedings.neurips.cc/paper/2020/file/1457c0d6bfcb4967418bfb8ac142f64a-Paper.pdf
3. Devlin, J., Chang, M.W., Lee, K., Toutanova, K.: BERT: pre-training of deep bidirectional transformers for language understanding. In: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pp. 4171–4186. Association for Computational Linguistics, Minneapolis, Minnesota, June 2019. https://doi.org/10.18653/v1/N19-1423
4. Duan, Y., Shi, T., Zou, Z., Lin, Y., Qian, Z., Zhang, B., Yuan, Y.: Single-shot motion completion with transformer. arXiv preprint arXiv:2103.00776 (2021)
5. Fragkiadaki, K., Levine, S., Malik, J.: Recurrent network models for kinematic tracking. CoRR abs/1508.00271 (2015). http://arxiv.org/abs/1508.00271
6. Gopalakrishnan, A., Mali, A.A., Kifer, D., Giles, C.L., II, A.G.O.: A neural temporal model for human motion prediction. CoRR abs/1809.03036 (2018). http://arxiv.org/abs/1809.03036
7. Harvey, F.G., Pal, C.: Recurrent transition networks for character locomotion. In: SIGGRAPH Asia 2018 Technical Briefs, SA 2018. Association for Computing Machinery, New York, NY, USA (2018). https://doi.org/10.1145/3283254.3283277
8. Harvey, F.G., Yurick, M., Nowrouzezahrai, D., Pal, C.: Robust motion in-betweening. ACM Trans. Graph. 39(4) (2020). https://doi.org/10.1145/3386569.3392480
9. Holden, D.: Robust solving of optical motion capture data by denoising. ACM Trans. Graph. 37(4) (2018). https://doi.org/10.1145/3197517.3201302
10. Holden, D., Saito, J., Komura, T.: A deep learning framework for character motion synthesis and editing. ACM Trans. Graph. 35(4) (2016). https://doi.org/10.1145/2897824.2925975
11. Lehrmann, A.M., Gehler, P.V., Nowozin, S.: Efficient nonlinear Markov models for human motion. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2014


12. Liu, Z., et al.: Towards natural and accurate future motion prediction of humans and animals. In: 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 9996–10004 (2019). https://doi.org/10.1109/CVPR.2019.01024
13. Martínez-González, Á., Villamizar, M., Odobez, J.: Pose transformers (POTR): human motion prediction with non-autoregressive transformers. CoRR abs/2109.07531 (2021). https://arxiv.org/abs/2109.07531
14. Min, J., Chen, Y.L., Chai, J.: Interactive generation of human animation with deformable motion models. ACM Trans. Graph. 29(1) (2009). https://doi.org/10.1145/1640443.1640452
15. Müller, M., Röder, T., Clausen, M.: Efficient content-based retrieval of motion capture data. ACM Trans. Graph. 24(3), 677–685 (2005). https://doi.org/10.1145/1073204.1073247
16. Oreshkin, B.N., Valkanas, A., Harvey, F.G., Ménard, L.S., Bocquelet, F., Coates, M.J.: Motion inbetweening via deep Δ-interpolator. arXiv e-prints, January 2022. arXiv:2201.06701
17. Oreshkin, B.N., Valkanas, A., Harvey, F.G., Ménard, L.S., Bocquelet, F., Coates, M.J.: Motion inbetweening via deep Δ-interpolator (2022). https://doi.org/10.48550/ARXIV.2201.06701
18. Prafulla, D., Sastry, G., McCandlish, S.: EncT5: fine-tuning T5 encoder for discriminative tasks (2021)
19. Ren, M., Kiros, R., Zemel, R.S.: Exploring models and data for image question answering. In: Proceedings of the 28th International Conference on Neural Information Processing Systems, NIPS 2015, vol. 2, pp. 2953–2961. MIT Press, Cambridge, MA, USA (2015)
20. Tanuwijaya, S., Ohno, Y.: TF–DF indexing for mocap data segments in measuring relevance based on textual search queries. Vis. Comput. 26(6–8), 1091–1100 (2010). https://doi.org/10.1007/s00371-010-0463-9
21. Taylor, G.W., Hinton, G.E.: Factored conditional restricted Boltzmann machines for modeling motion style. In: Proceedings of the 26th Annual International Conference on Machine Learning, ICML 2009, pp. 1025–1032. Association for Computing Machinery, New York, NY, USA (2009). https://doi.org/10.1145/1553374.1553505
22. Vaswani, A., et al.: Attention is all you need. In: Proceedings of the 31st International Conference on Neural Information Processing Systems, NIPS 2017, pp. 6000–6010. Curran Associates Inc., Red Hook, NY, USA (2017)
23. Wang, J.M., Fleet, D.J., Hertzmann, A.: Gaussian process dynamical models for human motion. IEEE Trans. Pattern Anal. Mach. Intell. 30(2), 283–298 (2008). https://doi.org/10.1109/TPAMI.2007.1167
24. Witkin, A., Kass, M.: Spacetime constraints. In: Proceedings of the 15th Annual Conference on Computer Graphics and Interactive Techniques, SIGGRAPH 1988, pp. 159–168. Association for Computing Machinery, New York, NY, USA (1988). https://doi.org/10.1145/54852.378507

Convolutional Point Transformer

Chaitanya Kaul1(B), Joshua Mitton1, Hang Dai2, and Roderick Murray-Smith1

1 School of Computing Science, University of Glasgow, Glasgow G12 8RZ, Scotland
[email protected]
2 Mohamed bin Zayed University of Artificial Intelligence, Abu Dhabi, UAE

Abstract. We present CpT: Convolutional point Transformer – a novel neural network layer for dealing with the unstructured nature of 3D point cloud data. CpT is an improvement over existing MLP and convolution layers for point cloud processing, as well as existing 3D point cloud processing transformer layers. It achieves this feat due to its effectiveness in creating a novel and robust attention-based point set embedding through a convolutional projection layer crafted for processing dynamically local point set neighbourhoods. The resultant point set embedding is robust to the permutations of the input points. Our novel layer builds over local neighbourhoods of points obtained via a dynamic graph computation at each layer of the network’s structure. It is fully differentiable and can be stacked just like convolutional layers to learn intrinsic properties of the points. Further, we propose a novel Adaptive Global Feature layer that learns to aggregate features from different representations into a better global representation of the point cloud. We evaluate our models on standard benchmark ModelNet40 classification and ShapeNet part segmentation datasets to show that our layer can serve as an effective addition for various point cloud processing tasks while effortlessly integrating into existing point cloud processing architectures to provide significant performance boosts.

1 Introduction

3D data takes many forms. Meshes, voxels, point clouds, multi-view 2D images, and RGB-D are all forms of 3D data representations. Amongst these, the simplest raw form that 3D data can exist in is a discretized representation of a continuous surface. This can be visualized as a set of points (in R3) sampled from a continuous surface. Adding point connectivity information to such 3D point cloud representations creates representations known as 3D meshes. Deep learning progress on processing 3D data was initially slow, primarily due to the fact that early deep learning required a structured input data representation as a prerequisite. Thus, raw sensor data was first converted into grid-like representations such as voxels or multi-view images, or data from RGB-D sensors was used to interpret geometric information. However, such data is computationally expensive to process, and in many cases, it is not the true representation of the 3D structure that may be required for solving the task. Modern applications of point clouds require a large amount of data processing. This makes searching for salient features in their representations a tedious task.


Fig. 1. Overview of the structure of (a) PointNet, (b) DGCNN, (c) Point Transformer, and, (d) Convolutional point Transformer illustrating the differences between these point based convolution layers. Global Feature denotes max or average pooling operations, MHSA denotes multi-head self attention, IPA denotes InterPoint Attention.

Processing 3D meshes (along with added 3D point cloud information) requires dealing with their own sets of complexities and combinatorial irregularities, but such structures are generally efficient to process due to knowledge of point set connectivity. This motivation has sparked interest in processing 3D points directly at a point level instead of converting the points to intermediate representations. Lack of any general structure in the arrangement of points serves as a challenge to the ability to process them. This is due to the fact that point clouds are essentially set representations of a continuous surface in a 3D space. The seminal works on processing points proposed approaches to deal with this lack of order by constructing set based operations to ingest 3D points directly. They then created a symmetric mapping of such set based representations in a high dimensional space [22,42]. This representation was further processed by symmetry preserving operations to create a feature representation of the point cloud, before passing it through multi layer perceptrons for solving the task. Representations created on a per-point basis are generally not robust, as there is no concept of locality in the data representations used to create them. Various methods have been introduced in the literature to add the notion of locality to point set processing [10,11,16,23,33,38].


Recently proposed transformer architectures have found great success in language tasks due to their ability to handle long-term dependencies well [31]. These models have been successfully applied to various computer vision applications [2,3] and are already replacing the highly successful CNNs as the de facto approach in many applications [19,35]. Even though various domains of vision research have adopted the transformer architecture, their applications to 3D point cloud processing are still very limited [4,44]. This leaves a gap in the current research on processing points with transformer based structures.

The basic intuition of our work lies in three simple points. First, transformers have been shown to be intrinsically invariant to permutations of the input data [41], making them ideal for set processing tasks. Second, 3D points processed by most existing deep learning methods exploit local features to improve performance. However, these techniques treat points at a local scale to keep them invariant to input permutations, leading to neglected geometric relationships among point representations and an inability of such models to capture global concepts well. Third, existing methods rely on one symmetric aggregation function (e.g., max pooling) to aggregate the features that point operators (such as MLPs and 1D convolutions) learn. This function is likely to leave out important feature information about the point set. To address these points, we propose CpT: Convolutional point Transformer. CpT differs from existing point cloud processing methods for the following reasons:

i) CpT uses a K-nearest neighbour graph at every layer of the model. Such a dynamic graph computation [33] requires a method to handle the input data in order to create a data embedding that can be fed into a transformer layer (see the code sketch at the end of this section). Towards this end, we propose a novel Point Embedding Module that first constructs a dynamic point cloud graph at every stage of the network and creates a point embedding to feed into a transformer layer for its processing.

ii) Transformers employ multi-head self attention to create contextual embeddings. Such an attention mechanism works to enhance the learning of features in the data. However, it has been shown that adding sample-wise attention to data can help improve the performance of a transformer model even further [6,8,24,27]. To this end, we propose to add an InterPoint Attention Module (Fig. 2(c)) to the Transformer, which learns to enhance the output by learning to relate each point in the input to every other point. This helps capture better geometric relationships between the points and aids in better learning of the local and global concepts in the data. The resultant transformer block, i.e., the CpT layer (shown in Fig. 2(a)), is a combination of a multi-head self attention operation followed by an InterPoint Attention module. The Q, K, V attention projections in this block are convolutional in nature to facilitate learning spatial context.

iii) We propose a novel Global Feature Aggregation layer (Fig. 2(b)) that learns to aggregate features from multiple symmetric global representations of the points through a spatial attention mechanism. Our main novelty here lies in the adaptive nature of this layer's information aggregation, as it scales the attention output through multiplication with a vector β and learns to weight the attention-aggregated output over the training of the model.

We evaluate CpT on three point cloud processing tasks - classification of ModelNet40 CAD models and estimating their normals, and segmentation on the ShapeNet part dataset. We perform extensive experiments and multiple studies and ablations to show the robustness of our proposed model. Our results show that CpT can serve as an accurate and effective backbone for various point cloud processing tasks.

2 Related Literature

Processing Point Clouds Using Deep Learning. The lack of a grid-like structure in point clouds makes applying convolutions directly to them a tedious task. For this reason, earlier works on processing 3D points with deep learning required the points to be first converted to a structured representation such as voxels, depth maps, or multiple 2D views of an object, and then processed the resultant representation with conventional deep models [21,29,45]. This process has massive computational overheads, which result from first converting the data to the grid-like representation and then training large deep learning models on it. The first methods that proposed to treat points as set embeddings in a 3D space were PointNet [22] and Deep Sets [42]. They created data embeddings that preserve point set symmetry using permutation invariance and permutation equivariance, respectively. This drastically reduced the computational overhead, as the models worked directly on the raw data points and were not very large scale, yet accurate. These models, however, suffered from the drawback of only looking at the points globally. This meant that the higher-dimensional embeddings created by such methods were not robust to occlusions, as each point's representation was created from that input point alone. This drawback was tackled in works where local neighbourhoods of points were taken into consideration to compute per-point representations. PointNet++ [23] used farthest point sampling to estimate the locality of points first, before processing them via weight-shared multilayer perceptrons (MLPs). ECC Nets [26] proposed the EdgeConv operation over local neighbourhoods of points. Dynamic Graph CNNs [33] created local neighbourhoods by computing a graphical representation of the points before every shared-weight MLP layer to create a sense of locality in the points. Parameterized versions of convolution operations for points have also been proposed, such as SpiderCNN [38] and PointCNN [16]. Previous work with kernel density functions to weight point neighbourhoods [36] treated convolutional kernels as non-linear functions of the 3D points containing both weight and density estimations. The notion of looking both locally and globally at points was proposed in SAWNet [10]. Other notable approaches that process 3D points as a graph either densely connect local neighbourhoods [43], operate on a superpoint graph to create contextual relationships [13], or use spectral graph convolutions to process the points [32].


Attention Mechanisms in Vision. Various attention mechanisms exist in the deep learning literature, with one of the first works introducing them for natural language understanding being [20]. The first attention mechanisms applied to vision were based on self attention via the squeeze-and-excite operation of SE Nets [7]. This attention mechanism provided channel-wise context for feature maps and was widely successful in increasing accuracy over non-attention-based methods on ImageNet. It was extended in [1] to non-local networks, and in [25] to fully convolutional networks for medical imaging applications. One of the first works that combined channel-wise self attention with spatial attention for images [9,34] first created attention maps across the entire feature map, followed by a feature map re-calibration step to propagate only the most important features forward in the network. FatNet [11] successfully applied this concept to point clouds by first applying spatial attention over feature groups of local regions of points, followed by feature weighting using a squeeze-and-excite operation.

Transformers in 2D and 3D Vision. A concurrent line of work applies dot product attention to input data for its efficient processing. The Vision Transformer (ViT) [3] was the first real application of the transformer to vision. It employed a transformer encoder to extract features from image patches of size 16 × 16. An embedding layer converted the patches into a transformer-friendly representation and added positional embeddings to them. This general structure formed the basis of initial transformer-based research in vision, but required large amounts of data to train. DETR [2] was the first detection model created by processing the features of a convolutional neural network with a transformer encoder, before adding a bipartite loss at the end to deal with a set-based output. The Convolutional Vision Transformer (CvT) [35] improved over the ViT by combining convolutions with transformers via a novel convolutional token embedding and a convolutional projection layer. The recently proposed Compact Convolutional Transformer (CCT) [5] improves over the ViT and CvT models by showing that, with the right size and tokenization, transformers can perform head-to-head with state-of-the-art CNN models on small-scale datasets. The Swin Transformer [19] proposed a hierarchical transformer network whose representations are computed with shifted windows. The application of the transformer to point cloud processing is limited. Existing works include the Point Transformer [44], which applies a single head of self attention on the points to create permutation-invariant representations of local patches obtained via K-NN and furthest point sampling, and the Point Cloud Transformer [4], which applies a novel offset attention block to 3D point clouds. Our work extends the existing literature by creating a novel transformer block that computes feature-wise as well as point-wise attention over the input for its accurate and effective processing.

3 Convolutional Point Transformer

Notation. We denote a set of N points with a D-dimensional embedding in the l-th CpT layer as X^l = {x^l_1, ..., x^l_N}, where x^l_i ∈ R^D. For the K nearest neighbours of point i (i ∈ {1, ..., N}), we define the set of neighbour offsets of that point as ΔX^l_i = {x^l_{j_1} − x^l_i, ..., x^l_{j_K} − x^l_i}, where j_k ∈ {1, ..., N} and j_k ≠ i. The Convolutional point Transformer (CpT) layer is shown in Fig. 2. Our main contributions are the Point Embedding Module (Sect. 3.1), the InterPoint Attention Module with a convolutional attention projection (Sect. 3.2), and the Adaptive Global Feature module (Sect. 3.3). We use CpT in three different network architectures: PointNet [22], DGCNN [33], and Point Transformer [44]. We do this by replacing the point convolution layers in PointNet, the EdgeConv layer in DGCNN, and the attention layer in Point Transformer with CpT. The information flow in all these architectures is similar. When an input point cloud of size N × 3 is passed through the architecture, a graph of the points is computed by finding the K nearest neighbours of each point based on Euclidean distance. This representation is then passed through a point embedding layer that maps the input data into a representation implicitly inclusive of the nearest neighbours of the points. This is done via a 2D convolution operation whose degree of overlap across points can be controlled through the length of the stride. A multi-head attention operation is then applied to this embedded representation, followed by InterPoint Attention. The multi-head attention can be seen as learning relevant features of a point's embedding as a function of its K nearest neighbours. Such an attention mechanism learns to attend to the features of the points rather than the points themselves (column-wise matrix attention), i.e. for a set of points in a batch, it learns to weight individual feature transformations. The InterPoint Attention, on the other hand, can be interpreted as learning the relationships between the points themselves within a batch (a row-wise matrix attention operating per point embedding, rather than per individual feature of the points). Each attention block is followed by a residual addition and a combination of layer normalization and two 1D convolutions. These operations form one CpT layer. We update the graph following these operations and pass it into the next CpT layer. The output of the final CpT layer in all network architectures is fed through our Adaptive Global Feature module. It progressively learns to attend to features of max-pooled and average-pooled global representations of the features during training, to create a richer final feature representation.

3.1 Point Embedding Module

The point embedding module takes a k-NN graph as an input. The general structure of the input to this module is denoted by (B, f, N, ΔX^l_i) ∈ R^{B×f×N×ΔX^l_i}, where B and f are the batch size and input features respectively. This layer is essentially a mapping function F_θ that maps the input into an embedding E_θ. This operation is denoted by y = F_θ(I(B, f, N, ΔX^l_i)), with y ∈ R^{B×f×E_θ}.


Fig. 2. The CpT transformer block with dot product attention and InterPoint Attention is shown in (a). (c) shows the inner workings of the InterPoint Attention module; the flow of information is sequential from top to bottom. (b) shows the Adaptive Global Feature module, where AP is average pooling and MP is max pooling; the flow of information is from left to right.

There are many choices available for the mapping function F_θ. We use a convolution operation with a fixed padding and stride. This allows us to train this module in an end-to-end setting along with the rest of the network, as it is fully differentiable and can be plugged anywhere in the architecture.
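The sketch below illustrates one way such an embedding can be realised in PyTorch: a k-NN graph is built from Euclidean distances, DGCNN-style edge features are formed, and a convolution maps them to a per-point embedding. The kernel size, aggregation over neighbours, and layer widths are assumptions made for illustration and are not the exact configuration used in the paper.

```python
import torch
import torch.nn as nn

def knn_edge_features(x, k=20):
    """Build edge features (x_j - x_i, x_i) for the k nearest neighbours of
    every point. x: (B, C, N). Returns a (B, 2C, N, k) neighbourhood grid."""
    B, C, N = x.shape
    dist = torch.cdist(x.transpose(1, 2), x.transpose(1, 2))        # (B, N, N) Euclidean distances
    idx = dist.topk(k + 1, largest=False).indices[..., 1:]          # drop self-match, (B, N, k)
    gather_idx = idx.unsqueeze(1).expand(-1, C, -1, -1)             # (B, C, N, k)
    neighbours = x.unsqueeze(-1).expand(-1, -1, -1, k).gather(2, gather_idx)
    centre = x.unsqueeze(-1).expand(-1, -1, -1, k)
    return torch.cat([neighbours - centre, centre], dim=1)          # (B, 2C, N, k)

class PointEmbedding(nn.Module):
    """Hypothetical sketch of the Point Embedding Module: a convolution over
    the (N, k) neighbourhood grid yields a transformer-friendly embedding."""
    def __init__(self, in_dim=6, embed_dim=64, k=20):
        super().__init__()
        self.k = k
        self.conv = nn.Sequential(
            nn.Conv2d(in_dim, embed_dim, kernel_size=1),
            nn.BatchNorm2d(embed_dim), nn.LeakyReLU(0.2),
        )

    def forward(self, x):                    # x: (B, C, N) point features
        e = knn_edge_features(x, self.k)     # (B, 2C, N, k)
        e = self.conv(e)                     # (B, D, N, k)
        return e.max(dim=-1).values          # (B, D, N) per-point embedding
```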

3.2 Convolutional Point Transformer Layer

Our Convolutional point Transformer layer uses a combination of multi-head self attention and InterPoint attention to propagate the most salient features of the input through the network. The CpT layer leverages spatial context and moves beyond a fully connected projection by using a convolution layer to sample the attention matrices. Instead of creating a more complicated network design, we leverage the ability of convolutions to learn relevant feature sets for 3D points. Hence, we replace the original fully connected layers in the attention block with depthwise convolution layers, which form our convolutional projection to obtain the attention matrices. The general structure of the CpT layer is shown in Fig. 2(a). The functionality of the rest of the transformer block is similar to ViT [3], where normalization and feedforward layers are added after every attention block and residual mappings are used to enhance the feature learning. The notable difference from ViT is that the feedforward layers are replaced with 1D convolution operations. Formally, the projection operation of the CpT layer is denoted by

z_i^{q/k/v} = Convolution(z_i, p, s),

where z_i^{q/k/v} is the input for the Q/K/V attention matrices of the i-th layer and z_i is the input to the convolution block. The convolution operation is implemented as a depthwise separable convolution with tuples of kernel size p and stride s. We now formalize the flow of data through the entire CpT block for a batch size B as

out_i^a = γ(MHSA(z_i^{q/k/v})) + z_i,

where out_i^a is the output of the multi-head attention block and MHSA is the multi-head attention given by MHSA(·) = softmax(QK^T / √d) V. The factor √d is used to scale the attention weights to avoid gradient instabilities, γ is the layer normalization operation, and the addition denotes a residual connection. This output is further processed as in a vanilla transformer,

out_i^b = γ(FF(out_i^a)) + out_i^a,

where FF denotes the feedforward 1D convolution layers. The InterPoint attention is then computed on this output as

out_i^c = γ(IPA((out_i^b)_{i=1}^B)) + out_i^b,

which is then processed in a similar manner by layer normalization and feedforward layers,

out_i^d = γ(FF(out_i^c)) + out_i^c.

Following this step, the graph is recomputed and the process restarts for the (l+1)-th layer.
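A compact sketch of this block is shown below. It keeps the structure of the equations above (attention, layer normalization, 1D-convolution feedforward, residual additions) but, for brevity, models both the multi-head feature attention and the InterPoint attention with the same single-head convolutional-projection attention; the depthwise kernel size and the single head are assumptions made only for illustration.

```python
import torch
import torch.nn as nn

class ConvAttention(nn.Module):
    """Dot-product attention whose Q/K/V projections are depthwise 1D
    convolutions over the point dimension (kernel size is an assumption)."""
    def __init__(self, dim, kernel_size=1):
        super().__init__()
        pad = kernel_size // 2
        self.q = nn.Conv1d(dim, dim, kernel_size, padding=pad, groups=dim)
        self.k = nn.Conv1d(dim, dim, kernel_size, padding=pad, groups=dim)
        self.v = nn.Conv1d(dim, dim, kernel_size, padding=pad, groups=dim)

    def forward(self, z):                           # z: (B, D, N)
        q, k, v = self.q(z), self.k(z), self.v(z)
        attn = torch.softmax(q.transpose(1, 2) @ k / z.shape[1] ** 0.5, dim=-1)
        return (attn @ v.transpose(1, 2)).transpose(1, 2)

class CpTLayer(nn.Module):
    """Sketch of one CpT layer: attention -> LayerNorm -> 1D-conv feedforward,
    each with a residual connection, applied twice (feature attention, then a
    stand-in for InterPoint attention)."""
    def __init__(self, dim):
        super().__init__()
        self.attn1 = ConvAttention(dim)   # stands in for multi-head self attention
        self.attn2 = ConvAttention(dim)   # stands in for InterPoint attention
        self.norm = nn.LayerNorm(dim)
        self.ff = nn.Sequential(nn.Conv1d(dim, dim, 1), nn.ReLU(), nn.Conv1d(dim, dim, 1))

    def _ln(self, z):                     # LayerNorm over the feature dimension
        return self.norm(z.transpose(1, 2)).transpose(1, 2)

    def block(self, z, attn):
        a = self._ln(attn(z)) + z         # out^a / out^c
        return self._ln(self.ff(a)) + a   # out^b / out^d

    def forward(self, z):                 # z: (B, D, N) point embeddings
        return self.block(self.block(z, self.attn1), self.attn2)
```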

3.3 Adaptive Global Feature

The Adaptive Global Feature layer (Fig. 2(b)) takes the output of the last CpT layer in a network and learns to create a robust attention-based global representation from it. The output of the final CpT layer is passed through max pooling and average pooling operations. Attention weights σ_1 and σ_2 are then learnt to weight these global representations before combining them via an aggregation function. The output of this aggregation function is further scaled via a vector β. β is initialized as a vector of all zeros with the same shape as the pooled feature outputs, and gradually learns to assign weight to the attended features as training progresses. Let the pooling operations P be defined as the set P = {AP, MP}. The output of the Adaptive Global Feature layer is then

AGF = β(Σ_{i=1}^{2} σ_i · P_i) + AP + MP.
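The sketch below gives one possible reading of this layer. How the attention weights σ are produced is not fully specified above, so the small linear-plus-softmax head used here is an assumption; the zero-initialised β and the residual addition of the raw pooled codes follow the equation.

```python
import torch
import torch.nn as nn

class AdaptiveGlobalFeature(nn.Module):
    """Sketch of the Adaptive Global Feature layer: average- and max-pooled
    global codes are mixed by learned attention weights, scaled by a
    zero-initialised vector beta, and added to the raw pooled codes."""
    def __init__(self, dim):
        super().__init__()
        self.to_sigma = nn.Linear(2 * dim, 2)        # produces sigma_1, sigma_2 (an assumption)
        self.beta = nn.Parameter(torch.zeros(dim))   # learns to scale the attended output

    def forward(self, x):                            # x: (B, D, N) final CpT features
        ap = x.mean(dim=-1)                          # (B, D) average pool
        mp = x.max(dim=-1).values                    # (B, D) max pool
        sigma = torch.softmax(self.to_sigma(torch.cat([ap, mp], dim=-1)), dim=-1)
        attended = sigma[:, :1] * ap + sigma[:, 1:] * mp
        return self.beta * attended + ap + mp        # AGF
```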

4 Experiments and Results

We evaluate our model on two different datasets across point cloud classification, part segmentation, and surface normal estimation tasks. For classification and surface normal estimation, we use the benchmark ModelNet40 dataset [37], and for object part segmentation, we use the ShapeNet Part dataset [40].

4.1 Implementation Details

Unless stated otherwise, all our models are trained in PyTorch with a batch size of 32 for 250 epochs. SGD with an initial learning rate of 0.1 and momentum 0.9 is used, together with a cosine annealing learning rate scheduler. The momentum for batch normalization is 0.9, and batch normalization decay is not used. Dropout, wherever used, has a rate of 0.5. Custom learning rate schedules are used for the segmentation tasks after initial experimentation. For the classification and 3D indoor scene segmentation tasks, we compute dynamic graphs using 20 nearest neighbours, while for the part segmentation task we use 40 nearest neighbours. We use NVIDIA A6000 GPUs for our experiments.
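For reference, a minimal sketch of this optimisation setup is given below; the tiny model and random data are placeholders so the snippet runs on its own, and stand in for the actual CpT classifier and the ModelNet40 loader, which are not shown.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import DataLoader, TensorDataset

# Placeholder model and data; in practice these would be the CpT network and dataset.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 1024, 40))
data = TensorDataset(torch.randn(64, 3, 1024), torch.randint(0, 40, (64,)))
train_loader = DataLoader(data, batch_size=32, shuffle=True)

optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=250)

for epoch in range(250):
    for points, labels in train_loader:
        optimizer.zero_grad()
        loss = F.cross_entropy(model(points), labels)
        loss.backward()
        optimizer.step()
    scheduler.step()  # cosine annealing of the learning rate
```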

4.2 Classification with ModelNet40

The ModelNet40 dataset [37] contains meshes of 3D CAD models. A total of 12,311 models are available, belonging to 40 categories and split into training and test sets of 9,843 and 2,468 models respectively. We use the official splits provided for all our experiments and datasets to keep a fair comparison. In terms of data preprocessing, we follow the same steps as [22]: we uniformly sample 1024 points from the mesh surface and rescale the point cloud to fit a unit sphere. Data augmentation is used during the training process, where we perturb the points with random jitter and scalings. It can be seen from the results in Table 1 that CpT outperforms existing classification methods, including the Point Transformer [44]. CpT, when replacing the point convolution in PointNet and the EdgeConv in DGCNN, can be seen to provide significant performance boosts. In the Static Graph approach, the K-NN graph is only computed once (in the first CpT layer) and the same graph (and its features) is used by the subsequent layers in the network. Here, even when we do not recompute the graph at every layer, CpT outperforms Dynamic Edge Conditioned Filters (ECC) [26] and performs on par with DGCNN. Dynamic graph computation before every layer helps CpT surpass the accuracy of all existing graph- and non-graph-based approaches, including existing transformer-based approaches [4,44].

Table 1. Classification results on the ModelNet40 dataset. CpT (PointNet∗) and CpT (DGCNN∗) denote the backbone architecture the CpT layer was placed in. All trained architectures use our Adaptive Global Feature module.

Method                  Class accuracy (%)  Instance accuracy (%)
3D ShapeNets [40]       77.3                84.7
VoxNet [21]             83.0                85.9
PointNet [22]           86.0                89.2
PointNet++ [23]         –                   90.7
SpiderCNN [38]          –                   90.5
PointWeb [43]           89.4                92.3
PointCNN [16]           88.1                92.2
Point2Sequence [17]     90.4                92.6
ECC [26]                83.2                87.4
DGCNN [33]              90.2                92.2
FatNet [11]             90.6                93.2
KPConv [30]             –                   92.9
SetTransformer [14]     –                   90.4
PCT [4]                 –                   93.2
PointTransformer [44]   90.6                93.7
CpT (PointNet∗)         88.1                90.9
CpT (Static Graph)      90.3                92.1
CpT (DGCNN∗)            90.9                93.9

4.3 Segmentation Results with ShapeNet Part

The ShapeNet Part dataset [40] contains 16,881 3D shapes from 16 object categories, with a total of 50 object parts to segment. We sample 2048 points from each shape and follow the official training-testing splits for our experiments. Our results on the dataset are summarized in Table 2, while the visualizations produced by our model are shown in Fig. 3. We train two CpT models for this task, with a DGCNN backbone and a Point Transformer backbone. We note that CpT (DGCNN∗) outperforms DGCNN by 0.9% on the IoU metric, while CpT (PointTransformer∗) outperforms the Point Transformer by 0.2%.


Fig. 3. Visualizing the segmentation results from the ShapeNet Part dataset. Ground truth images are in the top row and their corresponding segmentation maps in the bottom row.

Table 2. Results on the ShapeNet Part segmentation dataset. CpT (DGCNN∗) and CpT (PointTransformer∗) denote the backbone architecture the CpT layer was placed in.

Method                    IoU
Kd-Net [12]               82.3
SO-Net [15]               84.6
PointNet++ [23]           85.1
SpiderCNN [38]            85.3
SPLATNet [28]             85.4
PointCNN [16]             86.1
PointNet [22]             83.7
DGCNN [33]                85.1
FatNet [11]               85.5
PointASNL [39]            86.1
RSCNN [18]                86.2
KPConv [30]               86.4
PCT [4]                   86.4
PointTransformer [44]     86.6
CpT (DGCNN∗)              85.9
CpT (PointTransformer∗)   86.8

4.4 Normal Estimation

We estimate normals on the ModelNet40 dataset, where each point cloud has a corresponding normal label. Normals are crucial to understanding the shape and underlying geometry of 3D objects. We train two backbone architectures for this task – CpT (PointNet∗) and CpT (DGCNN∗). Our results are summarized in Table 3. It can be seen that, when replacing the point convolution in PointNet and the EdgeConv in DGCNN with our CpT layer, we get significant performance boosts over the baseline models.


Table 3. Results on the normal estimation task, reported for the PointNet and DGCNN backbones. The reported error is the average cosine distance error; lower is better.

Method            Error
PointNet [22]     0.47
CpT (PointNet∗)   0.31
DGCNN [33]        0.29
CpT (DGCNN∗)      0.15

5 Ablation Study

In this section, we detail our extensive studies to shed light on the inner workings of the CpT layer. We conducted a series of ablation studies to highlight how the different building blocks of CpT come together. The DGCNN backbone is used for these experiments.

Static vs. Dynamic Graphs. We trained two CpT models for these experiments. In the Static CpT model, the K-NN graph computation is only done before the first CpT layer. The results for this experiment are shown in Table 1. Computing the graph before each transformer layer boosts CpT's instance accuracy by 1.8%.

Global Representations vs. Graph Representations. We compared dynamic graph computation with feeding point clouds directly into the CpT layer. This equates to removing the K nearest neighbour step from the layer in Fig. 2(a). As the points were directly fed into CpT, we also replaced the Point Embedding Module with a direct parameterized relational embedding. This relation was learnt using a convolution operation. The rest of the layer structure remained unchanged. The resultant model trained on global representations performed well. The results are summarized in Table 4. PointNet [22] is added to the table for reference, as it also takes a global point cloud representation as an input. CpT with dynamic graph computation outperforms both methods, but it is interesting to note that the CpT model that works directly on the entire point cloud manages a higher class accuracy than PointNet.

Table 4. Comparison of different input representations.

Model                 Class accuracy (%)
PointNet [22]         86.2
CpT (No locality)     87.6
CpT (Dynamic graph)   90.6


Number of Nearest Neighbours for Dynamic Graph Computation. We also experiment with the number of nearest neighbours used to construct the graph (Table 5). CpT layers implemented with 20-nearest-neighbour graphs perform the best. This is because, beyond a particular distance threshold, the Euclidean distance starts to fail to approximate the geodesic distance, which leads to capturing points that may not lie in the true neighbourhood of a point while estimating its local representation.

Table 5. CpT with different numbers of nearest neighbours.

k    Class accuracy (%)
10   89.6
20   90.6
30   90.3
40   90.4

CpT Robustness. We test the robustness of CpT to sparse inputs by sampling point clouds at various resolutions from the ModelNet40 dataset and evaluating CpT's performance on them. We sample 1024, 768, 512, 256 and 128 points for each CAD model in the dataset, keeping the input point clouds' resolution small. Figure 4(a) shows the robustness of CpT compared to PointNet and DGCNN. CpT is not tied to observing input data through receptive fields like CNNs in order to construct its feature space. Even at very small point cloud resolutions, the multi-head attention and InterPoint attention learn to model long-range dependencies in the input points well, creating a local and global understanding of the shape. This leads to high performance even when only a small sample of points is available.

Fig. 4. Comparing the robustness of CpT to PointNet [22] and DGCNN [33]


To observe the effect of perturbations on CpT's performance, we perturb each point in the point clouds independently using Gaussian noise. Our results (summarized in Fig. 4(b)) show that our network remains robust to the added noise even under large perturbations. The X-axis of the graph in Fig. 4(b) shows the standard deviation of the Gaussian noise, which is increased linearly to perturb the points by larger amounts.

6 Conclusions and Future Work

In this paper, we proposed CpT: Convolutional point Transformer. We showed how transformers can be effectively used to process 3D points with the help of dynamic graph computations at each intermediate network layer. The main contributions of our work include, first, a Point Embedding Module capable of taking a dynamic graph as an input and transforming it into a transformer-friendly data representation. Second, the InterPoint Attention Module, which uses self attention to facilitate cross-talk between the points in an arbitrary batch. Through this work, we have shown for the first time that different self attention methods can be efficiently used inside a single transformer layer for 3D point cloud processing ([44] only uses one form of self attention, while [4] only uses an offset attention operator). We also proposed an Adaptive Global Feature module that complements CpT by learning to attend to the most important features inside different global representations of point cloud features. CpT outperforms most existing convolutional and transformer-based approaches for point cloud processing on a variety of benchmark tasks. To improve performance further, future directions of this research lie in learning to sample points uniformly along the manifold of the 3D point cloud to preserve its local shape. This can lead to learning better local representations of the data and, in turn, to models with improved accuracy. Our results already show that CpT is capable of taking local context and processing it effectively with the global information present in the points. We have shown how CpT can be easily integrated into various existing architectures that are used for processing 3D points. We believe that CpT can serve as an effective backbone for future point cloud processing tasks and be extended to various applications.

Acknowledgements. Chaitanya Kaul and Roderick Murray-Smith acknowledge funding from the QuantIC project funded by the EPSRC Quantum Technology Programme (grant EP/MO1326X/1) and the iCAIRD project, funded by Innovate UK (project number 104690). Joshua Mitton is supported by a University of Glasgow Lord Kelvin Adam Smith Studentship. Roderick Murray-Smith acknowledges funding support from EPSRC grant EP/R018634/1, Closed-loop Data Science.


References 1. Cao, Y., Xu, J., Lin, S., Wei, F., Hu, H.: GCNet: non-local networks meet squeezeexcitation networks and beyond. In: Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops, pp. 1971–1980 (2019) 2. Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: Endto-end object detection with transformers. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12346, pp. 213–229. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58452-8_13 3. Dosovitskiy, A., et al.: An image is worth 16 × 16 words: transformers for image recognition at scale. In: International Conference on Learning Representations (2020) 4. Guo, M.-H., Cai, J.-X., Liu, Z.-N., Mu, T.-J., Martin, R.R., Hu, S.-M.: PCT: point cloud transformer. Compu. Visual Media 7(2), 187–199 (2021). https://doi.org/ 10.1007/s41095-021-0229-5 5. Hassani, A., Walton, S., Shah, N., Abuduweili, A., Li, J., Shi, H.: escaping the big data paradigm with Compact Transformers. arXiv preprint arXiv:2104.05704 (2021) 6. Ho, J., Kalchbrenner, N., Weissenborn, D., Salimans, T.: Axial attention in multidimensional transformers. arXiv preprint arXiv:1912.12180 (2019) 7. Hu, J., Shen, L., Sun, G.: Squeeze-and-excitation networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7132–7141 (2018) 8. Iida, H., Thai, D., Manjunatha, V., Iyyer, M.: Tabbie: pretrained representations of tabular data. arXiv preprint arXiv:2105.02584 (2021) 9. Kaul, C., Manandhar, S., Pears, N.: FocusNet: an attention-based fully convolutional network for medical image segmentation. In: 2019 IEEE 16th International Symposium on Biomedical Imaging (ISBI 2019), pp. 455–458. IEEE (2019) 10. Kaul, C., Pears, N., Manandhar, S.: SAWNet: a spatially aware deep neural network for 3D point cloud processing. arXiv preprint arXiv:1905.07650 (2019) 11. Kaul, C., Pears, N., Manandhar, S.: FatNet: A feature-attentive network for 3D point cloud processing. In: 2020 25th International Conference on Pattern Recognition (ICPR). pp. 7211–7218. IEEE (2021) 12. Klokov, R., Lempitsky, V.: Escape from cells: Deep KD-networks for the recognition of 3d point cloud models. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 863–872 (2017) 13. Landrieu, L., Simonovsky, M.: Large-scale point cloud semantic segmentation with superpoint graphs. In: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 4558–4567 (2018). https://doi.org/10.1109/CVPR.2018. 00479 14. Lee, J., Lee, Y., Kim, J., Kosiorek, A., Choi, S., Teh, Y.W.: Set transformer: a framework for attention-based permutation-invariant neural networks. In: International Conference on Machine Learning, pp. 3744–3753. PMLR (2019) 15. Li, J., Chen, B.M., Lee, G.H.: SO-Net: self-organizing network for point cloud analysis. In: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 9397–9406. IEEE (2018) 16. Li, Y., Bu, R., Sun, M., Wu, W., Di, X., Chen, B.: PointCNN: convolution on X -transformed points. Adv. Neural. Inf. Process. Syst. 31, 820–830 (2018)


17. Liu, X., Han, Z., Liu, Y.S., Zwicker, M.: Point2sequence: learning the shape representation of 3D point clouds with an attention-based sequence to sequence network. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33, pp. 8778–8785 (2019) 18. Liu, Y., Fan, B., Xiang, S., Pan, C.: Relation-shape convolutional neural network for point cloud analysis. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8895–8904 (2019) 19. Liu, Z., et al.: Swin transformer: hierarchical vision transformer using shifted windows. arXiv preprint arXiv:2103.14030 (2021) 20. Luong, M.T., Pham, H., Manning, C.D.: Effective approaches to attention-based neural machine translation. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 1412–1421 (2015) 21. Maturana, D., Scherer, S.: VoxNet: a 3D convolutional neural network for realtime object recognition. In: 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 922–928 (2015). https://doi.org/10.1109/IROS. 2015.7353481 22. Qi, C.R., Su, H., Mo, K., Guibas, L.J.: PointNet: deep learning on point sets for 3D classification and segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 652–660 (2017) 23. Qi, C.R., Yi, L., Su, H., Guibas, L.J.: Pointnet++: deep hierarchical feature learning on point sets in a metric space. In: 30th Proceedings Conference on Advances in Neural Information Processing Systems (2017) 24. Rao, R.M., et al.: MSA transformer. In: Meila, M., Zhang, T. (eds.) Proceedings of the 38th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 139, pp. 8844–8856. PMLR (18–24 July 2021). https:// proceedings.mlr.press/v139/rao21a.html 25. Roy, A.G., Navab, N., Wachinger, C.: Concurrent spatial and channel squeeze & excitation in fully convolutional networks. In: Frangi, A.F., Schnabel, J.A., Davatzikos, C., Alberola-López, C., Fichtinger, G. (eds.) MICCAI 2018. LNCS, vol. 11070, pp. 421–429. Springer, Cham (2018). https://doi.org/10.1007/978-3030-00928-1_48 26. Simonovsky, M., Komodakis, N.: Dynamic edge-conditioned filters in convolutional neural networks on graphs. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3693–3702 (2017) 27. Somepalli, G., Goldblum, M., Schwarzschild, A., Bruss, C.B., Goldstein, T.: Saint: improved neural networks for tabular data via row attention and contrastive pretraining. arXiv preprint arXiv:2106.01342 (2021) 28. Su, H., et al.: SplatNet: sparse lattice networks for point cloud processing. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2530–2539 (2018) 29. Su, H., Maji, S., Kalogerakis, E., Learned-Miller, E.: Multi-view convolutional neural networks for 3D shape recognition. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 945–953 (2015) 30. Thomas, H., Qi, C.R., Deschaud, J.E., Marcotegui, B., Goulette, F., Guibas, L.J.: KPConv: flexible and deformable convolution for point clouds. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 6411–6420 (2019) 31. Vaswani, A., et al.: Attention is all you need. In: Advances in Neural Information Processing Systems, pp. 5998–6008 (2017)


32. Wang, C., Samari, B., Siddiqi, K.: Local spectral graph convolution for point set feature learning. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11208, pp. 56–71. Springer, Cham (2018). https://doi. org/10.1007/978-3-030-01225-0_4 33. Wang, Y., Sun, Y., Liu, Z., Sarma, S.E., Bronstein, M.M., Solomon, J.M.: Dynamic graph CNN for learning on point clouds. ACM Trans. Graphics 38(5), 1–12 (2019) 34. Woo, S., Park, J., Lee, J.-Y., Kweon, I.S.: CBAM: convolutional block attention module. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11211, pp. 3–19. Springer, Cham (2018). https://doi.org/10.1007/9783-030-01234-2_1 35. Wu, H., et al.: CVT: introducing convolutions to vision transformers. arXiv preprint arXiv:2103.15808 (2021) 36. Wu, W., Qi, Z., Fuxin, L.: PointConv: Deep convolutional networks on 3D point clouds. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 9621–9630 (2019) 37. Wu, Z., et al.: 3D ShapeNets: a deep representation for volumetric shapes. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1912–1920 (2015) 38. Xu, Y., Fan, T., Xu, M., Zeng, L., Qiao, Yu.: SpiderCNN: deep learning on point sets with parameterized convolutional filters. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11212, pp. 90–105. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01237-3_6 39. Yan, X., Zheng, C., Li, Z., Wang, S., Cui, S.: PointASNL: robust point clouds processing using nonlocal neural networks with adaptive sampling. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5589–5598 (2020) 40. Yi, L., et al.: A scalable active framework for region annotation in 3D shape collections. ACM Trans. Graph. 35(6) (2016). https://doi.org/10.1145/2980179.2980238 41. Yun, C., Bhojanapalli, S., Rawat, A.S., Reddi, S., Kumar, S.: Are transformers universal approximators of sequence-to-sequence functions? In: International Conference on Learning Representations (2019) 42. Zaheer, M., Kottur, S., Ravanbakhsh, S., Poczos, B., Salakhutdinov, R.R., Smola, A.J.: Deep sets. In: 30th Proceedings of Conference on Advances in Neural Information Processing Systems (2017) 43. Zhao, H., Jiang, L., Fu, C.W., Jia, J.: PointWeb: enhancing local neighborhood features for point cloud processing. In: 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 5560–5568 (2019). https://doi.org/ 10.1109/CVPR.2019.00571 44. Zhao, H., Jiang, L., Jia, J., Torr, P.H., Koltun, V.: Point transformer. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 16259– 16268 (2021) 45. Zhou, Y., Tuzel, O.: VoxelNet: end-to-end learning for point cloud based 3D object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4490–4499 (2018)

Cross-Attention Transformer for Video Interpolation

Hannah Halin Kim(B), Shuzhi Yu, Shuai Yuan, and Carlo Tomasi

Duke University, Durham, NC 27708, USA
{hannah,shuzhiyu,shuai,tomasi}@cs.duke.edu

Abstract. We propose TAIN (Transformers and Attention for video INterpolation), a residual neural network for video interpolation, which aims to interpolate an intermediate frame given two consecutive image frames around it. We first present a novel vision transformer module, named Cross-Similarity (CS), to globally aggregate input image features with similar appearance as those of the predicted interpolated frame. These CS features are then used to refine the interpolated prediction. To account for occlusions in the CS features, we propose an Image Attention (IA) module to allow the network to focus on CS features from one frame over those of the other. TAIN outperforms existing methods that do not require flow estimation and performs comparably to flow-based methods while being computationally efficient in terms of inference time on Vimeo90k, UCF101, and SNU-FILM benchmarks.

1 Introduction

Video interpolation [1–7] aims to generate new frames between consecutive image frames in a given video. This task has many practical applications, including frame rate up-conversion [8] for human perception [9], video editing for object or color propagation [10], slow motion generation [11], and video compression [12]. Recent work on video interpolation starts by estimating optical flow in both temporal directions between the input frames. Some systems [3–5,13–15] use flow predictions output by an off-the-shelf pre-trained estimator such as FlowNet [16] or PWC-Net [17], while others estimate flow as part of their own pipeline [7,11,18–24]. The resulting bi-directional flow vectors are interpolated to infer the flow between the input frames and the intermediate frame to be generated. These inferred flows are then used to warp the input images towards the new one. While these flow-based methods achieve promising results, they also come with various issues. First, they are computationally expensive and rely heavily on the quality of the flow estimates (see Fig. 1). Specifically, state-of-the-art flow estimators [16,17,25] require computing at least two four-dimensional cost volumes at all pixel positions. Second, they are known to suffer at occlusions, where flow is undefined; near motion boundaries, where flow is discontinuous; and in the presence of large motions [26,27].


          Network        Inference time (ms)   PSNR
No flow   EDSC           19.37 ± 0.06          34.84
          CAIN           25.21 ± 0.45          34.65
          TAIN (ours)    32.59 ± 0.82          35.02
Flow      DAIN           168.56 ± 0.33         34.71
          BMBC           333.24 ± 6.60         35.01
          ABME           116.66 ± 1.07         36.18
          VFIformer      476.06 ± 11.92        36.50

Fig. 1. Inference times (in milliseconds per frame prediction) of existing video interpolation methods and TAIN (ours) on a single P100 GPU. The inference times are computed as average and standard deviation over 300 inferences using two random images of size 256 × 256 as input. As a reference, we also list performance (PSNR) on Vimeo-90k [7]. Flow-based methods (red) have much higher inference times compared to TAIN (blue) and other non-flow-based methods (green). Bold values show the best performance in each panel, and underlined bold values show the best performance across both panels. Best viewed in color. (Color figure online)

For instance, RAFT [25] achieves an End-Point Error (EPE) of 1.4 pixels on Sintel [28], but the EPE rises to 6.5 within 5 pixels from a motion boundary and to 4.7 in occlusion regions. Warping input images or features with these flow estimates will lead to poor predictions in these regions. This is especially damaging for video interpolation, as motion boundaries and occlusions result from motion [26] and are therefore important regions to consider for motion compensation. Further, these flow estimators are by and large trained on synthetic datasets [28,29] to avoid the cost and difficulties of annotating real video [30], and sometimes fail to capture some of the challenges observed in real data, to which video interpolation is typically applied. Apart from flow inputs, many of these methods also utilize additional inputs and networks to improve performance, which adds to their computational complexity. These include depth maps [5], occlusion maps [5,7,11,14,31], multiple input frames [32] for better flow estimation, image classifiers (e.g., VGG [33], ResNet [34]) pretrained on ImageNet [35] for contextual features [1–3,11,31], adversarial networks for realistic estimations [31], and event cameras [36–39] to detect local brightness changes across frames. Instead, we achieve competitive performance with a single network that uses two consecutive input frames captured with commonly available devices. The proposed TAIN system (Transformers and Attention for video INterpolation) is a residual neural network that requires no estimation of optical flow. Inspired by the recent success of vision transformers [40–43], we employ a novel transformer-based module named Cross Similarity (CS) in TAIN.


Video interpolation requires the network to match corresponding points across frames, and our CS transformer achieves this through cross-attention. Vision transformers typically use self-attention to correlate each feature of an image with every other feature in the same image. The resulting similarity maps are then used to collect relevant features from other image locations. Our CS module instead compares features across frames, namely, between an input frame and the current intermediate-frame prediction. High values in this cross-frame similarity map indicate features with similar appearance. These maps are then used to aggregate features from the appropriate input image to yield CS features, and these are used in turn to refine the frame prediction. While there exists recent work on video interpolation that utilizes vision transformers [22,32], our CS is specifically designed for video interpolation, as it computes cross-frame similarity rather than within-frame self-similarity. To account for occlusions or motion boundaries, we use CS scores in an Image Attention (IA) module based on spatial attention [44]. Given a feature in the tentative frame prediction, its maximum similarity score from our CS module indicates whether or not a similar feature exists in either input frame. This maximum score will be high if such a feature exists and low if it does not. If the feature in the current frame prediction is occluded in one of the input images, the corresponding score will likely be low. Features at positions that straddle a motion boundary in the frame prediction often have a low maximum score as well. This is because boundary features contain information about pixels from both sides of the boundary, and the particular mixture of information will change if the two sides move differently. The CS features aggregated from these low maximum similarity scores will not help in refining the current frame prediction, and our IA module learns to suppress them. Thanks to the CS transformer and IA module, TAIN improves on or performs comparably to existing methods on various benchmarks, especially compared to those that do not require flow estimation. Our contributions are as follows:
– A novel Cross Similarity module based on a vision transformer that aggregates features of the input image frames that have a similar appearance to predicted-frame features. These aggregated features help refine the frame prediction.
– A novel Image Attention module that gives the predictor the ability to weigh features in one input frame over those in the other. This information is shown to be especially helpful near occlusions and motion boundaries.
– State-of-the-art performance on Vimeo-90k, UCF101, and SNU-FILM among methods that do not require optical flow estimation.

2 Related Work

2.1 Video Interpolation

Deep learning has rapidly improved the performance of video interpolation in recent work. Long et al. [45] were the first to use a CNN for video interpolation. In their system, an encoder-decoder network predicts the intermediate frame directly from two image frames using inter-frame correspondences. Subsequent work can be categorized largely into kernel-, flow-, and attention-based approaches.


Kernel-Based approaches compute the new frame with convolutions over local patches, and use CNNs to estimate spatially-adaptive convolutional kernels [2,6,46]. These methods use large kernel sizes to accommodate large motion, and large amounts of memory are required as a result when frames have high resolution. Niklaus et al. propose to use separable convolutional kernels to reduce memory requirements and further improve results. Since kernel-based methods cannot handle motion larger than the pre-defined kernel size, EDSC [2] estimates not only adaptive kernels, but also offsets, masks, and biases to retrieve information from non-local neighborhoods. EDSC achieves the current state-of-the-art performance among methods that do not require optical flow.

Flow-Based approaches to video interpolation rely on bi-directional flow estimates from off-the-shelf pre-trained flow estimators (e.g., FlowNetS, PWC-Net) [3–5,13–15], or estimate flow as part of their pipeline [7,11,18–24]. Most of these methods assume linear motion between frames and use the estimated flow to warp input images and their features to the intermediate time step for prediction. In order to avoid making this linear motion assumption, Gui et al. [18] do not predict flow between the two input frames but instead attempt to produce flow directly between the intermediate frame being predicted and the two input frames. Park et al. [23] do compute a tentative intermediate frame from the given flow estimates, but they re-estimate flow between the tentative predicted frame and the input images. These new estimates are then used for a final estimation of the intermediate frame. Multiple optical flow maps have also been used to account for complex motion patterns and mitigate the resulting prediction artifacts [15,21]. While these flow-based methods show promising results, they are computationally expensive. We do not require flow estimates in our approach.

Some approaches [5,14,31] integrate both flow-based and kernel-based methods by combining optical flow warping with learned adaptive local kernels. These methods perform robustly in the presence of large motions and are not limited by the assumption of a fixed motion range. They use small kernels that require less memory but are still expensive. Recently, work by Choi et al. [1] proposes a residual network called CAIN that interpolates video through an attention mechanism [44] without explicit computation of kernels or optical flow, to curb model complexity and computational cost. The main idea behind their design is to distribute the information in a feature map into multiple channels through PixelShuffle, and extract motion information by processing the channels through a channel attention module. In our work, we extend CAIN with a novel vision transformer module and a spatial attention module, still without requiring flow estimates or adaptive kernels.

2.2 Vision Transformers

Transformers [42,43] have shown success in both computer vision [40,41,47] and natural language processing [42,48] thanks to their ability to model long-range dependencies.


Self-attention [42,48] has shown the most success among the various modules in the transformer architecture, and derives query, key, and value vectors from the same image. Due to their content-adaptive nature, transformers have also been applied to video interpolation. Shi et al. [32] consider four input frames and propose a self-attention transformer based on Swin [49] to capture long-range dependencies across both space and time. Lu et al. [22] propose a self-attention transformer along with a flow estimator to model long-range pixel correlation for video interpolation. Their work currently achieves the state-of-the-art performance on various video interpolation benchmarks. Different from existing work, we do not use self-attention but instead use cross-attention. In self-attention, query, key, and value are different projections of the same feature. We use cross-attention, where query and key are the same projection (shared weights) of different features. Specifically, our transformer module selects and refines features based on the similarities between the interpolated frame (query) and the two input frames (keys), while handling the occlusions that occur in the two input frames. In contrast, existing transformer-based methods compare the features of the input frames without any explicit consideration of the interpolated frame or occlusions.

3 Method

The proposed TAIN method for video interpolation aims to predict the frame I_t ∈ R^{h×w×3} at time t = 0.5, given two consecutive images I_0, I_1 ∈ R^{h×w×3} at times 0 and 1. We do not require any computation of flow, adaptive convolution kernel parameters, or warping; instead, we utilize a cross-similarity transformer and a spatial attention mechanism. Specifically, a novel vision transformer module called the Cross Similarity (CS) module globally aggregates features from the input images I_0 and I_1 that are similar in appearance to those in the current prediction Î_t of frame I_t (Sect. 3.2). These aggregated features are then used to refine the prediction Î_t, and the output from each residual group is a new refinement. To account for occlusions of the interpolated features in the aggregated CS features, we propose an Image Attention (IA) module to enable the network to prefer CS features from one frame over those of the other (Sect. 3.3). See Fig. 2 for an overview of the network. Before describing our network, we summarize CAIN [1], on which our work improves.

3.1 CAIN

CAIN [1] is one of the top performers for video interpolation and does not require estimation of flow, adaptive convolution kernels, or warping. Instead, CAIN utilizes PixelShuffle [50] and a channel attention module. PixelShuffle [50] rearranges the layout of an image or a feature map without any loss of information. To down-shuffle, activation values are merely rearranged by reducing each of the two spatial dimensions by a factor of s and increasing the channel dimension by a factor of s^2. Up-shuffling refers to the inverse operation.


Fig. 2. (a) Overview of the proposed TAIN network for video interpolation. The two consecutive frames I0 and I1 are down-shuffled and concatenated along the channel dimension. They are then processed with five residual groups (ResGroups) with a (b) CS transformer and an (c) IA module before being up-shuffled back to the original resolution to yield the final prediction Iˆt . The output from each ResGroup is a refinement of the output from the previous block.

This parameter-free operation allows CAIN to increase the receptive field size of the network's convolutional layers without losing any information. CAIN first down-shuffles (s = 8) the two input images and concatenates them along the channel dimension before feeding them to a network with five ResGroups (groups of residual blocks). With the increased number of channels, each of the blocks includes a channel attention module that learns to pay attention to certain channels to gather motion information. The size of the features remains h/8 × w/8 × 192 throughout CAIN, and the final output map is up-shuffled back to the original resolution of h × w × 3.
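The shuffling operations are available directly in PyTorch; the snippet below is a minimal illustration of the lossless rearrangement described above (the tensor sizes are chosen only for the example).

```python
import torch
import torch.nn as nn

# Down-shuffle trades spatial resolution for channels with no loss of
# information; up-shuffle inverts it exactly.
down = nn.PixelUnshuffle(downscale_factor=8)
up = nn.PixelShuffle(upscale_factor=8)

frames = torch.rand(1, 6, 256, 256)           # I0 and I1 concatenated along channels
x = down(frames)                              # (1, 384, 32, 32): 6 * 8^2 channels
assert torch.equal(up(down(frames)), frames)  # the rearrangement is lossless
```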

3.2 Cross Similarity (CS) Module

All points in the predicted frame I_t appear either in I_0 or I_1 or in both, except in rare cases where a point appears for a very short time between the two consecutive frames. These ephemeral apparitions cannot be inferred from I_0 or I_1 and are ignored here.


Fig. 3. Visualization of intermediate predictions and their cross similarity maps D1 with I1 across all five ResGroups (r = 1 . . . 5) using an example from Middlebury [51] dataset. The first column shows the two input frames, I0 (top) and I1 (bottom). The next four columns show the predicted intermediate frame Iˆt after each of the first four residual blocks with a query point highlighted in blue (top), and its corresponding similarity map D1 from our CS module with the point of highest similarity highlighted in red (bottom). The values in the similarity maps D1 show large scores (closer to white) whenever query and key features are similar in appearance. The last column shows the final output from TAIN (top), which is not used in any CS module, and the ground-truth intermediate frame It (bottom) as a reference. (Color figure online)

For the remaining points, we want to find features of I_0 or I_1 that are similar in appearance to the features in the predicted intermediate frame Î_t, and use them to refine Î_t. We use a transformer to achieve this, where we compare each feature from Î_t with features from I_0 and I_1, and use similar features to refine Î_t. While transformers typically use self-similarity [42,48], wherein query, key, and value are based on similarities within the same feature array, we extend this notion through the concept of cross-image similarity. Specifically, we use features from the input images I_0 or I_1 (first column in Fig. 3) as keys and values, and features from the current frame prediction Î_t (the remaining images in the first row of Fig. 3) as queries. Given a query feature normalized to have unit Euclidean norm (e.g., features for the blue points on the ball in the first row of Fig. 3), our CS module compares it to all the similarly-normalized key features through a dot product; the results are then Softmax-normalized to obtain the corresponding similarity maps D (e.g., images in the second row of Fig. 3). The similarity matrix D is used to find the location of the largest similarity score (e.g., red dots in the second row of Fig. 3), where we retrieve our value features (Eq. (3) below), which are projections of the input image features, to yield aggregated input features S based on similarity. These aggregated features are then used to refine the intermediate prediction Î_t.

Mathematical Formulation. Let X ∈ R^{h/s×w/s×d} denote the (down-shuffled) feature map from one of the input images I_0, I_1 ∈ R^{h×w×3}, and let Y ∈ R^{h/s×w/s×d} denote the feature map of the predicted intermediate frame Î_t ∈ R^{h×w×3}. Let M^i represent the feature at pixel location i in a given feature map M. Our CS module computes a query feature map Q ∈ R^{h/s×w/s×d} from Y. It also computes a key feature map K ∈ R^{h/s×w/s×d} and a value feature map V ∈ R^{h/s×w/s×d} from X as follows, for all pixel locations i, j:

Q^i = W_qk Y^i,   K^j = W_qk X^j,   V^j = W_v X^j.   (1)

The matrices W_qk, W_v ∈ R^{d×d} are learnable. Note that we use cross-attention, where the query and key features are the same projection (shared weights W_qk) of different features. This is different from the self-attention transformers used in existing work [22,32], where the query, key, and value features are different projections (no shared weights) of the same features. Each query feature Q^i is compared with all the key features K^j to compute a similarity map D^i ∈ R^{h/s×w/s} that captures their similarity:

D^i_j = sim(Q^i, K^j) = exp(Q^i K^{jT} / √d) / Σ_j exp(Q^i K^{jT} / √d).   (2)

Using this similarity map D^i, the CS module finally computes the aggregated similarity feature S^i (with S ∈ R^{h/s×w/s×d}) by taking the value feature V^{i_max} at the location i_max of the maximum similarity score for each query feature Q^i and adding the result to Y^i:

S^i = Y^i + α V^{i_max},   (3)

where α is a learnable scalar parameter initialized to zero.
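A minimal sketch of this cross-attention computation is shown below, operating on flattened feature maps; normalization details are omitted, and tensor shapes follow the notation above.

```python
import torch
import torch.nn as nn

class CrossSimilarity(nn.Module):
    """Sketch of the CS module: queries come from the prediction's features Y,
    keys/values from an input frame's features X. Query and key share the
    projection W_qk; the value at the most similar key location is added back,
    scaled by a zero-initialised alpha."""
    def __init__(self, dim):
        super().__init__()
        self.w_qk = nn.Linear(dim, dim, bias=False)   # shared Q/K projection
        self.w_v = nn.Linear(dim, dim, bias=False)
        self.alpha = nn.Parameter(torch.zeros(1))

    def forward(self, y, x):            # y, x: (B, HW, D) flattened feature maps
        q = self.w_qk(y)                # queries from the current prediction
        k = self.w_qk(x)                # keys from the input frame (shared weights)
        v = self.w_v(x)
        d = torch.softmax(q @ k.transpose(1, 2) / q.shape[-1] ** 0.5, dim=-1)  # (B, HW, HW)
        idx = d.argmax(dim=-1)                                    # i_max for every query
        v_max = torch.gather(v, 1, idx.unsqueeze(-1).expand(-1, -1, v.shape[-1]))
        s = y + self.alpha * v_max                                # aggregated CS features S
        return s, d.max(dim=-1).values                            # S and the max-similarity map
```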

3.3 Image Attention (IA) Module

We propose an Image Attention (IA) module based on spatial attention [44] to weigh the two CS feature maps S_0 and S_1 computed from the input frames as shown above. The IA module enables TAIN to prioritize or suppress features from one input image over those from the other at a given spatial location. This is useful especially at occlusions, where a feature from I_t appears in one input image (likely yielding high similarity scores) but not in the other (likely yielding low similarity scores). It also helps on motion boundaries, where features encode information from both sides of the boundary: as the two sides move relative to each other, the specific mixture of features changes. The IA module compares the two CS feature maps S_0 and S_1 to construct two IA weight maps A_0, A_1 ∈ [0, 1]^{h/s×w/s×1} with A_0 + A_1 = 1_{h/s×w/s×1}, as shown below. These weight maps are multiplied with the corresponding CS maps S_0 and S_1 before they are concatenated and fed to the next ResGroup. See Fig. 4.


Fig. 4. Visualization of Image Attention maps (d) A0 and (e) A1 using an example from Middlebury [51] (White is large and black is small). For visualization purposes only, we add a square patch to (a) I0 on the person’s chest, and show that the IA module assigns a higher weight to the CS features from I1 on the person’s chest in (c) Iˆt than to those from I0 , which the patch occludes. (Color figure online)

Mathematical Formulation. Let S′ ∈ R^{h/s×w/s×2(d+1)} be the concatenation of S_0, S_1, and the two maximum-similarity maps D_0 and D_1 along the channel dimension. Our IA module first computes image attention weights A ∈ [0,1]^{h/s×w/s×2} by applying two 1 × 1 convolutional layers with ReLU and Softmax to S′, as shown in Fig. 2c. More formally, we compute A as

$$A = \sigma\big(W_2 * \rho(W_1 * S')\big), \tag{4}$$

where σ(·) denotes the Softmax function, ρ(·) denotes the ReLU function, and W_1 and W_2 are the weights of the two 1 × 1 convolution layers. This image attention weight map A can be seen as a concatenation of maps A_0, A_1 ∈ [0,1]^{h/s×w/s×1} with A_0 + A_1 = 1, which are used to weigh the aggregated similarity features S_0 and S_1 to obtain the weighted features S̃_0 and S̃_1:

$$\tilde{S}_0 = A_0 \odot S_0 \quad \text{and} \quad \tilde{S}_1 = A_1 \odot S_1, \tag{5}$$

where ⊙ is the element-wise product. Note that this is different from the channel attention module of CAIN [1]: the channel attention module computes a (2d)-dimensional weight vector over the channel dimension, while the IA module computes an h/s × w/s weight map over the spatial dimension.
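A corresponding sketch of the IA module in Eqs. (4)–(5) follows, again as an illustration rather than the released code; the hidden width of the first 1 × 1 convolution is an assumption, since the text does not state it.

```python
import torch
import torch.nn as nn

class ImageAttention(nn.Module):
    """Sketch of the IA module (Eqs. 4-5): two 1x1 convolutions followed by a
    softmax over the two input frames, producing spatial weights A_0 + A_1 = 1."""
    def __init__(self, d, hidden=64):  # hidden width is an assumption
        super().__init__()
        self.conv1 = nn.Conv2d(2 * (d + 1), hidden, kernel_size=1)  # acts on S' = [S_0, S_1, D_0, D_1]
        self.conv2 = nn.Conv2d(hidden, 2, kernel_size=1)

    def forward(self, s0, s1, d0, d1):
        # s0, s1: (B, d, H, W) CS features; d0, d1: (B, 1, H, W) max-similarity maps
        s_cat = torch.cat([s0, s1, d0, d1], dim=1)
        a = torch.softmax(self.conv2(torch.relu(self.conv1(s_cat))), dim=1)  # Eq. (4)
        a0, a1 = a[:, 0:1], a[:, 1:2]
        return a0 * s0, a1 * s1                                              # Eq. (5)
```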

3.4 TAIN Architecture and Training Details

Figure 2a shows an overview of the TAIN network for video interpolation. TAIN extends CAIN by applying the proposed CS transformer and IA module after each of the intermediate ResGroups. Each CS transformer obtains query features Q from the features Y of the previous ResGroup, and key and value features from one of the two input image features X_0 or X_1. We remove the residual connections around each ResGroup so that the output of each ResGroup is a new refinement of the output of the previous ResGroup. Figure 3 shows sample predictions after each of the five ResGroups on which the query features are based. With the IA module, the CS features S_0 and S_1 are weighted based on their maximum similarity scores from D_0 and D_1. The two weighted CS features, S̃_0 and S̃_1, are then used to refine the prediction in the next ResGroup.

In order to train the network for cases of occlusions and large motion, we add synthetic moving occlusion patches to the training data. Specifically, we first randomly crop a square patch of size between 21 × 21 and 61 × 61 from a different sample in the training set. This cropped patch is pasted onto the input and label images and translated in a linear motion across the frames. We use this occluder-patch augmentation to first pre-train TAIN and then fine-tune on the original training dataset. Following CAIN [1], we also augment the training data with random flips, crops, and color jitter.
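The following sketch illustrates one way to implement this occluder augmentation under the stated patch-size range; the per-frame motion range and other sampling details beyond what the text specifies are assumptions.

```python
import random

def add_moving_occluder(frames, donor, min_size=21, max_size=61):
    """Paste a square patch cropped from a donor frame onto every frame of a
    triplet (I0, It, I1) and translate it linearly across the frames.
    `frames` is a list of (C, H, W) tensors; `donor` is a (C, H, W) tensor."""
    c, h, w = frames[0].shape
    s = random.randrange(min_size, max_size + 1)
    # crop the occluder from a different training sample
    py, px = random.randrange(0, h - s), random.randrange(0, w - s)
    patch = donor[:, py:py + s, px:px + s]
    # linear motion: random start position and constant per-frame displacement (assumed range)
    y0, x0 = random.randrange(0, h - s), random.randrange(0, w - s)
    dy, dx = random.randint(-10, 10), random.randint(-10, 10)
    out = []
    for k, f in enumerate(frames):
        y = min(max(y0 + k * dy, 0), h - s)
        x = min(max(x0 + k * dx, 0), w - s)
        g = f.clone()
        g[:, y:y + s, x:x + s] = patch
        out.append(g)
    return out
```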

Following the literature [1,3,6,7,11,31], we train our network using an L1 loss on the difference between the predicted and true intermediate frames, ‖Î_t − I_t‖_1. As is common in the optical-flow literature [52,53], we also include an L1 loss on the difference between the gradients of the predicted and true intermediate frames, ‖∇Î_t − ∇I_t‖_1. Our final training loss is the weighted sum of the two, L = ‖Î_t − I_t‖_1 + γ‖∇Î_t − ∇I_t‖_1, with γ = 0.1. Another common loss in the literature [1–3,11,31] is the perceptual loss ‖φ(Î_t) − φ(I_t)‖_2^2, where φ(·) is a feature from an ImageNet-pretrained VGG-19. As this loss depends on another network and adds computational complexity, we do not use it and still show performance improvements. We implement TAIN in PyTorch [54] (code is available at https://github.com/hannahhalin/TAIN) and train the network with a learning rate of 10^{-4} using the Adam [55] optimizer.
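A hedged sketch of this training loss is shown below; the exact gradient operator is not specified in the text, so simple finite differences are assumed.

```python
import torch
import torch.nn.functional as F

def image_gradients(img):
    """Finite-difference gradients along x and y (one common choice of gradient operator)."""
    dx = img[..., :, 1:] - img[..., :, :-1]
    dy = img[..., 1:, :] - img[..., :-1, :]
    return dx, dy

def tain_loss(pred, target, gamma=0.1):
    """L = |pred - target|_1 + gamma * |grad(pred) - grad(target)|_1."""
    l1 = F.l1_loss(pred, target)
    pdx, pdy = image_gradients(pred)
    tdx, tdy = image_gradients(target)
    grad_l1 = F.l1_loss(pdx, tdx) + F.l1_loss(pdy, tdy)
    return l1 + gamma * grad_l1
```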

4 Datasets and Performance Metrics

As customary [1], we train our model on Vimeo90K [7] and evaluate it on four benchmark datasets for video interpolation: Vimeo90K [7], UCF101 [56], SNU-FILM [1], and Middlebury [51]. Vimeo90K [7] consists of 51,312 triplets with a resolution of 256 × 448, partitioned into a training set and a testing set. UCF101 [56] contains human action videos of resolution 256 × 256; for the evaluation of video interpolation, Liu et al. [19] constructed a test set by selecting 379 triplets from UCF101. The SNU Frame Interpolation with Large Motion (SNU-FILM) [1] dataset contains videos with a wide range of motion magnitudes for evaluating video interpolation methods; it is stratified into four settings, Easy, Medium, Hard, and Extreme, based on the temporal gap between the frames. Middlebury [51] includes 12 image sequences for evaluation, a combination of synthetic and real images that are often used to evaluate video interpolation. Following the literature, we use Peak Signal-to-Noise Ratio (PSNR) and the Structural Similarity Index (SSIM) [57] to measure performance. For the Middlebury dataset [51], we use the Interpolation Error (IE).
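For reference, PSNR can be computed as below for images scaled to [0, 1]; SSIM is typically taken from a standard implementation of the original formulation [57].

```python
import torch

def psnr(pred, target, max_val=1.0):
    """Peak Signal-to-Noise Ratio in dB for images scaled to [0, max_val]."""
    mse = torch.mean((pred - target) ** 2)
    return 10.0 * torch.log10(max_val ** 2 / mse)
```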

5 Results

We first compare TAIN with the current state-of-the-art methods for video interpolation with two-frame inputs in Table 1.


Table 1. Comparison to the existing methods across Vimeo90k, UCF101, SNU, and Middlebury (M.B.) datasets. Top panel shows kernel-based methods (K), second panel shows attention-based methods (Att.), third panel shows methods based on both kernels and flow (K+F), and bottom panel shows flow-based methods (Flow). Higher is better for PSNR and SSIM, and lower is better for IE. Bold values show the best performance in each panel, and underlined values show the best performance across all panels. TAIN (ours) outperforms existing methods based on kernel and attention, and performs comparably to those based on flow on Vimeo90k, UCF101, and SNU datasets.

| Panel | Method      | Vimeo90k PSNR/SSIM | UCF101 PSNR/SSIM | SNU-easy PSNR/SSIM | SNU-extreme PSNR/SSIM | M.B. IE |
|-------|-------------|--------------------|------------------|--------------------|-----------------------|---------|
| K     | SepConv     | 33.79 / 97.02      | 34.78 / 96.69    | 39.41 / 99.00      | 24.31 / 84.48         | 2.27    |
| K     | EDSC        | 34.84 / 97.47      | 35.13 / 96.84    | 40.01 / 99.04      | 24.39 / 84.26         | 2.02    |
| Att.  | CAIN        | 34.65 / 97.29      | 34.91 / 96.88    | 39.89 / 99.00      | 24.78 / 85.07         | 2.28    |
| Att.  | TAIN (Ours) | 35.02 / 97.51      | 35.21 / 96.92    | 40.21 / 99.05      | 24.80 / 85.25         | 2.35    |
| K+F   | AdaCoF      | 34.47 / 97.30      | 34.90 / 96.80    | 39.80 / 99.00      | 24.31 / 84.39         | 2.24    |
| K+F   | MEMC        | 34.29 / 97.39      | 34.96 / 96.82    | –                  | –                     | 2.12    |
| K+F   | DAIN        | 34.71 / 97.56      | 34.99 / 96.83    | 39.73 / 99.02      | 25.09 / 85.84         | 2.04    |
| Flow  | TOFlow      | 33.73 / 96.82      | 34.58 / 96.67    | 39.08 / 98.90      | 23.39 / 83.10         | 2.15    |
| Flow  | CyclicGen   | 32.09 / 94.90      | 35.11 / 96.84    | 37.72 / 98.40      | 22.70 / 80.83         | –       |
| Flow  | BMBC        | 35.01 / 97.64      | 35.15 / 96.89    | 39.90 / 99.03      | 23.92 / 84.33         | 2.04    |
| Flow  | ABME        | 36.18 / 98.05      | 35.38 / 96.98    | 39.59 / 99.01      | 25.42 / 86.39         | 2.01    |
| Flow  | VFIformer   | 36.50 / 98.16      | 35.43 / 97.00    | 40.13 / 99.07      | 25.43 / 86.43         | 1.82    |

The top panel of Table 1 lists kernel-based methods, i.e., SepConv [2] and EDSC [2]; the second panel lists attention-based methods, i.e., CAIN [1] and TAIN (ours); the third panel lists methods based on both kernels and flow estimation, i.e., AdaCoF [31], MEMC [14], and DAIN [5]; and the bottom panel lists flow-based methods, i.e., TOFlow [7], CyclicGen [58], BMBC [4], ABME [23], and VFIformer [22]. Bold values show the highest performance in each panel, while underlined values show the highest performance across all panels. TAIN outperforms existing kernel-based methods across all benchmarks. Compared to the flow-based methods, TAIN outperforms all listed methods except ABME and VFIformer. However, as shown in Fig. 1, the inference times of ABME and VFIformer are about 4 and 15 times longer, respectively, than that of TAIN. DAIN and BMBC take around 5 times longer than TAIN to run inference while performing comparably to TAIN. Figure 5 visualizes example predictions from TAIN and the best-performing methods in each panel of Table 1. Compared with kernel- and attention-based methods, TAIN is able to find the correct location of the moving object, e.g., the shovel and the hula hoop in the red boxes, while keeping fine details, e.g., the shaft of the shovel, the letters on the plane, and the hula hoop.


Fig. 5. Visualization of our proposed method and its comparison to the current state-of-the-art methods [1,2,5,22] on examples from Vimeo90k and UCF101; each example shows the inputs I0 and I1, the label It, and the predictions of EDSC, CAIN, DAIN, VFIformer, and ours. (Color figure online)

Inference Time. Figure 1 compares the inference time of TAIN and the other state-of-the-art methods listed in Table 1. To measure time, we create two random images of size 256 × 256, the same size as those of the UCF101 dataset, run each method 300 times on a P100 GPU, and report the average and standard deviation. As mentioned above, TAIN achieves performance competitive with the existing flow-based methods while taking a fraction of the time for inference.
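A sketch of this timing protocol is shown below, assuming a model that takes the two input frames as arguments; the warm-up iterations are our addition and not stated in the text.

```python
import time
import torch

@torch.no_grad()
def time_model(model, size=(1, 3, 256, 256), runs=300, warmup=10, device="cuda"):
    """Mean and standard deviation of per-call latency on random 256x256 inputs."""
    model = model.to(device).eval()
    x0, x1 = torch.rand(size, device=device), torch.rand(size, device=device)
    for _ in range(warmup):                 # warm-up, not counted
        model(x0, x1)
    times = []
    for _ in range(runs):
        torch.cuda.synchronize()
        t0 = time.perf_counter()
        model(x0, x1)
        torch.cuda.synchronize()
        times.append(time.perf_counter() - t0)
    t = torch.tensor(times)
    return t.mean().item(), t.std().item()
```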


Fig. 6. Visualization of patches with the highest similarity scores from D in our proposed CS transformer. Top panel shows an example of (a) Iˆt and (b) I1 from the test set of Vimeo90k [7] dataset. Example query features in Iˆt are shown with ‘+’ mark, and their key feature with the highest similarity score are shown in I1 in corresponding colors. Bottom panel shows 31 × 31 patches extracted from the query points (‘+’ in Iˆt ), their corresponding key patches with the highest similarity score (‘+’ in I1 ), and their corresponding similarity map D1i with the highest score circled. Our CS module successfully aggregates similar appearance features when refining the interpolation prediction Iˆt .

6 Ablation Study

Visualization of the Components of CS Module. We visualize the components of our proposed CS module using an example from the test set of Vimeo90k [7] dataset in Fig. 6. The top panel of this figure shows (a) Iˆt and (b) I1 , where example query features Qi are shown with ‘+’ mark in (a) and their corresponding key features with the highest similarity score are shown with ‘+’ mark in (b) with the corresponding colors. Bottom panel shows 31 × 31 patches extracted from the location ‘+’ of each query Qi and key K1i features from (a) Iˆt and (b) I1 , respectively. We also include visualization of the corresponding similarity maps D1i and highlight their maximum score. As shown, our CS module successfully extracts similar appearance features when refining Iˆt .


Table 2. Performance changes with our proposed modules: IA - Image Attention module; CS - Cross-Similarity transformer; #RG - Number of ResGroups. Using all the components yields the best overall performance. While increasing the number of ResGroups yields consistently higher performance, the performance plateaus after 5 ResGroups, which we use for TAIN (top row). IA CS #RG Vimeo90k UCF101 SNU-easy SNU-extreme M.B PSNR SSIM PSNR SSIM PSNR SSIM PSNR SSIM IE 



5

35.02

96.92 40.21 99.05 24.80

85.25

2.35

  

  

4 6 7

34.82 97.38 35.22 96.92 40.20 99.05 24.76 35.06 97.52 35.22 96.92 40.20 99.05 24.74 35.06 97.53 35.22 96.92 40.21 99.05 24.75

84.97 85.09 85.09

2.39 2.33 2.32



5

34.81

97.40

35.17

96.91

40.14

99.04

24.81

85.21

2.49

5

34.76

97.38

35.05

96.88

40.00

99.02

24.82 85.28 2.66

97.51

35.21

Effect of Each Component. Table 2 shows the performance changes with varying combinations of our proposed components, i.e., Image Attention (IA), Cross-Similarity transformers (CS), and the number of ResGroups (RG). Comparing the first four rows that list performances with the changing number of ResGroups from 4 to 7, we see that the performance increases with the number of ResGroups. However, it plateaus after 5 ResGroups, which we choose to use for TAIN (top row) for computational efficiency. In addition, each of the two main components, IA and CS module, contributes to the success of our method.

7 Conclusion

We propose TAIN, an extension of the CAIN network, for video interpolation. We utilize a novel vision transformer we call Cross Similarity module to aggregate input image features that are similar in appearance to those in the predicted frame to further refine the prediction. To account for occlusions in these aggregated features, we propose a spatial attention module we call Image Attention to suppress any features from occlusions. Combining both these components, TAIN outperforms existing methods that do not require flow estimation on multiple benchmarks. Compared to methods that utilize flow, TAIN performs comparably while taking a fraction of their time for inference. Acknowledgments. This research is based upon work supported in part by the National Science Foundation under Grant No. 1909821 and by an Amazon AWS cloud computing award. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.


References 1. Choi, M., Kim, H., Han, B., Xu, N., Lee, K.M.: Channel attention is all you need for video frame interpolation. In: AAAI (2020) 2. Niklaus, S., Mai, L., Liu, F.: Video frame interpolation via adaptive separable convolution. In: IEEE International Conference on Computer Vision (2017) 3. Niklaus, S., Liu, F.: Softmax splatting for video frame interpolation. In: IEEE Conference on Computer Vision and Pattern Recognition (2020) 4. Park, J., Ko, K., Lee, C., Kim, C.S.: Bmbc: bilateral motion estimation with bilateral cost volume for video interpolation. In: European Conference on Computer Vision (2020) 5. Bao, W., Lai, W.S., Ma, C., Zhang, X., Gao, Z., Yang, M.H.: Depth-aware video frame interpolation. In: IEEE Conference on Computer Vision and Pattern Recognition (2019) 6. Niklaus, S., Mai, L., Wang, O.: Revisiting adaptive convolutions for video frame interpolation. In: IEEE Winter Conference on Applications of Computer Vision (2021) 7. Xue, T., Chen, B., Wu, J., Wei, D., Freeman, W.T.: Video enhancement with task-oriented flow. Int. J. Comput. Vis. (IJCV) 127, 1106–1125 (2019) 8. Bao, W., Zhang, X., Chen, L., Ding, L., Gao, Z.: High order model and dynamic filtering for frame rate up conversion. IEEE Trans. Image Process. 27(8), 3813– 3826 (2018) 9. Kuroki, Y., Nishi, T., Kobayashi, S., Oyaizu, H., Yoshimura, S.: A psychophysical study of improvements in motion-image quality by using high frame rate. J. Soc. Inf. Display 15(1), 1–68 (2007) 10. Meyer, S., Cornill`ere, V., Djelouah, A., Schroers, C., Gross, M.H.: Deep video color propagation. In: BMVC (2018) 11. Jiang, H., Sun, D., Jampani, V., Yang, M.H., Learned-Miller, E., Kautz, J.: Super slomo: high quality estimation of multiple intermediate frames for video interpolation. In: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 9000–9008 (2018) 12. Wu, C., Singhal, N., Kr¨ ahenb¨ uhl, P.: Video compression through image interpolation. In: European Conference on Computer Vision (ECCV) (2018) 13. Niklaus, S., Liu, F.: Context-aware synthesis for video frame interpolation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) 14. Bao, W., Lai, W.S., Zhang, X., Gao, Z., Yang, M.H.: Memc-net: motion estimation and motion compensation driven neural network for video interpolation and enhancement. IEEE Trans. Pattern Anal. Mach. Intell. 43(3), 933–948 (2018) 15. Hu, P., Niklaus, S., Sclaroff, S., Saenko, K.: Many-to-many splatting for efficient video frame interpolation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3553–3562 (2022) 16. Dosovitskiy, A., et al.: Flownet: learning optical flow with convolutional networks. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV) (2015) 17. Sun, D., Yang, X., Liu, M.Y., Kautz, J.: Pwc-net: cnns for optical flow using pyramid, warping, and cost volume. In: Conference on Computer Vision and Pattern Recognition (2018) 18. Gui, S., Wang, C., Chen, Q., Tao, D.: Featureflow: robust video interpolation via structure-to-texture generation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2020)


19. Liu, Z., Yeh, R., Tang, X., Liu, Y., Agarwala, A.: Video frame synthesis using deep voxel flow. In: Proceedings of International Conference on Computer Vision (ICCV) (2017) 20. Xiang, X., Tian, Y., Zhang, Y., Fu, Y., Allebach, J.P., Xu, C.: Zooming slow-mo: fast and accurate one-stage space-time video super-resolution. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2020) 21. Danier, D., Zhang, F., Bull, D.: St-mfnet: a spatio-temporal multi-flow network for frame interpolation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3521–3531 (2022) 22. Lu, L., Wu, R., Lin, H., Lu, J., Jia, J.: Video frame interpolation with transformer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3532–3542 (2022) 23. Park, J., Lee, C., Kim, C.S.: Asymmetric bilateral motion estimation for video frame interpolation. In: International Conference on Computer Vision (2021) 24. Choi, M., Lee, S., Kim, H., Lee, K.M.: Motion-aware dynamic architecture for efficient frame interpolation. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pp. 13839–13848 (2021) 25. Teed, Z., Deng, J.: RAFT: recurrent all-pairs field transforms for optical flow. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12347, pp. 402–419. Springer, Cham (2020). https://doi.org/10.1007/978-3-03058536-5 24 26. Kim, H.H., Yu, S., Tomasi, C.: Joint detection of motion boundaries and occlusions. In: British Machine Vision Conference (BMVC) (2021) 27. Yu, S., Kim, H.H., Yuan, S., Tomasi, C.: Unsupervised flow refinement near motion boundaries. In: British Machine Vision Conference (BMVC) (2022) 28. Butler, D.J., Wulff, J., Stanley, G.B., Black, M.J.: A naturalistic open source movie for optical flow evaluation. In: Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C. (eds.) ECCV 2012. LNCS, vol. 7577, pp. 611–625. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-33783-3 44 29. Mayer, N., et al.: A large dataset to train convolutional networks for disparity, optical flow, and scene flow estimation. In: IEEE International Conference on Computer Vision and Pattern Recognition (2016) arXiv:1512.02134 30. Yuan, S., Sun, X., Kim, H., Yu, S., Tomasi, C.: Optical flow training under limited label budget via active learning. In: European Conference on Computer Vision (ECCV) (2022) 31. Lee, H., Kim, T., Chung, T.Y., Pak, D., Ban, Y., Lee, S.: Adacof: adaptive collaboration of flows for video frame interpolation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2020) 32. Shi, Z., Xu, X., Liu, X., Chen, J., Yang, M.H.: Video frame interpolation transformer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 17482–17491 (2022) 33. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition (2014) 34. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 770–778 (2016) 35. Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., Fei-Fei, L.: ImageNet: a large-scale hierarchical image database. In: 2009 IEEE Conference on Computer Vision and Pattern Recognition, pp. 248–255 (2009)


36. Yang, M., Liu, S.C., Delbruck, T.: A dynamic vision sensor with 1% temporal contrast sensitivity and in-pixel asynchronous delta modulator for event encoding. IEEE J. Solid-State Circuits 50, 2149–2160 (2015) 37. Tulyakov, S., et al.: Time lens: event-based video frame interpolation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 16155–16164 (2021) 38. Zhang, X., Yu, L.: Unifying motion deblurring and frame interpolation with events. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 17765–17774 (2022) 39. Tulyakov, S., Bochicchio, A., Gehrig, D., Georgoulis, S., Li, Y., Scaramuzza, D.: Time lens++: event-based frame interpolation with parametric non-linear flow and multi-scale fusion. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 17755–17764 (2022) 40. Dosovitskiy, A., et al.: An image is worth 16 × 16 words: transformers for image recognition at scale (2021) 41. Huang, Z., et al.: Ccnet: criss-cross attention for semantic segmentation (2020) 42. Vaswani, A., et al.: Attention is all you need (2017) 43. Jiang, S., Campbell, D., Lu, Y., Li, H., Hartley, R.: Learning to estimate hidden motions with global motion aggregation (2021) 44. Zhang, X., Wang, T., Qi, J., Lu, H., Wang, G.: Progressive attention guided recurrent network for salient object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) 45. Long, G., Kneip, L., Alvarez, J.M., Li, H., Zhang, X., Yu, Q.: Learning image matching by simply watching video. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9910, pp. 434–450. Springer, Cham (2016). https:// doi.org/10.1007/978-3-319-46466-4 26 46. Niklaus, S., Mai, L., Liu, F.: Video frame interpolation via adaptive convolution. In: IEEE Conference on Computer Vision and Pattern Recognition (2017) 47. Ramachandran, P., Parmar, N., Vaswani, A., Bello, I., Levskaya, A., Shlens, J.: Stand-alone self-attention in vision models (2019) 48. Galassi, A., Lippi, M., Torroni, P.: Attention in natural language processing. IEEE Trans. Neural Netw. Learn. Syst. 32, 4291–4308 (2021) 49. Liu, Z., et al.: Swin transformer: hierarchical vision transformer using shifted windows. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 10012–10022 (2021) 50. Shi, W., et al.: Real-time single image and video super-resolution using an efficient sub-pixel convolutional neural network (2016) 51. Baker, S., Roth, S., Scharstein, D., Black, M.J., Lewis, J., Szeliski, R.: A database and evaluation methodology for optical flow. In: 2007 IEEE 11th International Conference on Computer Vision, pp. 1–8 (2007) 52. Brox, T., Bregler, C., Malik, J.: Large displacement optical flow. In: 2009 IEEE Conference on Computer Vision and Pattern Recognition, pp. 41–48. IEEE (2009) 53. Janai, J., Guney, F., Ranjan, A., Black, M., Geiger, A.: Unsupervised learning of multi-frame optical flow with occlusions. In: Proceedings of the European Conference on Computer Vision (ECCV) (2018) 54. Paszke, A., et al.: Pytorch: an imperative style, high-performance deep learning library. In: Advances in Neural Information Processing Systems, vol. 32, pp. 8024– 8035. Curran Associates, Inc. (2019) 55. Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. CoRR abs/1412.6980 (2014)


56. Soomro, K., Zamir, A.R., Shah, M.: UCF101: a dataset of 101 human actions classes from videos in the wild. CoRR abs/1212.0402 (2012) 57. Wang, Z., Bovik, A., Sheikh, H., Simoncelli, E.: Image quality assessment: from error visibility to structural similarity. IEEE Trans. Image Process. 13, 600–612 (2004) 58. Liu, Y., Liao, Y., Lin, Y.Y., Chuang, Y.Y.: Deep video frame interpolation using cyclic frame generation. In: AAAI (2019)

Deep Learning-Based Small Object Detection from Images and Videos

Evaluating and Bench-Marking Object Detection Models for Traffic Sign and Traffic Light Datasets Ashutosh Mishra(B) , Aman Kumar, Shubham Mandloi, Khushboo Anand, John Zakkam, Seeram Sowmya, and Avinash Thakur OPPO Research and Development Center, Hyderabad, India {ashutosh.mishra1,aman.kumar1,shubham.mandloi, khushboo.anand,avinash.thakur}@oppo.com

Abstract. Object detection is an important sub-problem for many computer vision applications. There has been substantial research in improving and evaluating object detection models for generic objects but it is still not known how latest deep learning models perform on small road scene objects such as traffic lights and traffic signs. In fact, locating small object of interest such as traffic light and traffic sign is a priority task for an autonomous vehicle to maneuver in complex scenarios. Although some researchers have tried to investigate the performance of deep learning based object detection models on various public datasets, however there exists no comprehensive benchmark. We present a more detailed evaluation by providing in-depth analysis of state-of-the-art deep learning based anchor and anchor-less object detection models such as Faster-RCNN, Single Shot Detector (SSD), Yolov3, RetinaNet, CenterNet and Cascade-RCNN. We compare the performance of these models on popular and publicly available traffic light datasets and traffic sign datasets from varied geographies. For traffic light datasets, we consider LISA Traffic Light (TL), Bosch, WPI and recently introduced S2TLD dataset for traffic light detection. For traffic sign benchmarking, we use LISA Traffic Sign (TS), GTSD, TT100K and recently published Mapillary Traffic Sign Dataset (MTSD). We compare the quantitative and qualitative performance of all the models on the aforementioned datasets and find that CenterNet outperforms all other baselines on almost all the datasets. We also compare inference time on specific CPU and GPU versions, flops and parameters for comparison. Understanding such behavior of the models on these datasets can help in solving a variety of practical difficulties and assists in the development of real-world applications. The source code and the models are available at https://github.com/ OppoResearchIndia/DLSOD-ACCVW. Keywords: Traffic sign detection · Traffic light detection · CascadeRCNN · CenterNet · RetinaNet · Yolov3 · SSD · Autonomous driving A. Kumar, S. Mandloi, K. Anand, J. Zakkam, S. Sowmya—Equal contribution. c The Author(s), under exclusive license to Springer Nature Switzerland AG 2023  Y. Zheng et al. (Eds.): ACCV 2022, LNCS 13848, pp. 345–359, 2023. https://doi.org/10.1007/978-3-031-27066-6_24

1 Introduction

Detecting traffic light/traffic sign is one of the most difficult problem for the perception module of an autonomous agent due to factors such as size, imaging resolution, weather etc. Fundamentally, designing systems that can help in achieving automatic TLD/TLR/TSD/TSR (traffic light detection/traffic light recognition/traffic sign detection/traffic sign recognition) would be extremely helpful in substantially reducing the number of fatalities around the world. However, there are multiple hindrances in performing TLD/TLR or TSD/TSR. Some of these hindrances could be size of traffic lights/traffic sign, view distances, weather conditions and objects of confusion such as street lights, house bulbs or other light sources. Pre-deep learning era for TLD/TLR or TSD/TSR uses a combination of techniques such as color thresholding, template matching, pixel clustering etc [13,14,19,34,39]. But they have proven to be robust enough under certain specific conditions to give successful results. Advancement in deep learning has led to many improvements in the generic object detection performance. The main objective of this paper is to evaluate the performance of recent deep learning based object detection models in detecting small scale objects related to scene understanding task for autonomous driving viz. traffic lights and traffic signs. This evaluation is of utmost importance in order to check the performance metrics that can lead to decrease in false detection rate in real time scenario. For this, we compare all the publicly available datasets on a common benchmark. We club the fine classes to their respective meta class and then compare the metrics of different object detection models. The process of conversion from fine class to meta class is explained in Sect. 3. Therefore, the main contributions of our work are: – Evaluation and analysis of various state-of-the-art deep learning object detection models on traffic light datasets and traffic sign datasets - LISA TL [20], WPI [5], BOSCH [2], S2TLD [37], LISA TS [26], TT100K [41], GTSD [18] and MTSD [9] (See Fig. 1). – We evaluate the performances of these datasets on common meta classes with six object detectors: Faster R-CNN [30], SSD [23], RetinaNet [22], Yolov3 [29], CenterNet [7] and Cascade-RCNN [4]. To our knowledge, this is the first comprehensive work to compare publicly available traffic sign and traffic light datasets from various geographies on the common meta classes. We choose these models for evaluation because of two reasons. First, they are widely accepted and publicly available across all the deep learning frameworks among industry and academia. Second, the inference time of these models make them deployable ready for real time applications. This evaluation necessitates the need for understanding how state-of-the-art object detection algorithms perform on small road scene objects such as traffic sign and traffic light which is critical for developing practical applications such as self driving vehicles.


Fig. 1. Sample images with ground truth from all traffic light and traffic sign datasets used for bench-marking. The first row (from L to R) contains images from all traffic light datasets: LISA-TL, BOSCH, WPI, S2TLD. The second row (from L to R) contains images from all traffic sign datasets: LISA-TS, GTSD, TT100K and MTSD (Best viewed when zoomed).

2 Related Work

We are interested in bench-marking the performance of various object detection models on publicly available traffic sign and traffic light datasets. The performance obtained can be regarded as the state of the art on the meta classes of these datasets.

Table 1. Statistics of the different publicly available traffic light and traffic sign datasets. Original Classes: the classes originally present in the dataset folder. Meta Classes: the original classes converted to base classes. For the LISA TS [26] and S2TLD [37] datasets, the split of training and testing has been created by the authors. The original datasets have more images, but we only consider the ones with proper annotations. The test annotations for MTSD [9] and GTSD [18] are not publicly available, so we consider the val set as the test set for reporting results.

| Dataset      | TS/TL | Geography | Average resolution | Training images | Test images | Original classes | Meta classes |
|--------------|-------|-----------|--------------------|-----------------|-------------|------------------|--------------|
| LISA TL [20] | TL    | US        | 1280 × 960         | 20535           | 22481       | 7                | 3            |
| WPI [5]      | TL    | US        | 1024 × 2048        | 1314            | 2142        | 21               | 2            |
| BOSCH [2]    | TL    | US        | 1280 × 720         | 5093            | 8334        | 15               | 4            |
| S2TLD [37]   | TL    | China     | 1920 × 1080        | 744             | 244         | 5                | 5            |
| LISA TS [26] | TS    | US        | 880 × 504          | 5027            | 1571        | 47               | 15           |
| TT100K [41]  | TS    | China     | 2048 × 2048        | 6107            | 3073        | 128              | 3            |
| GTSD [18]    | TS    | Germany   | 1360 × 800         | 600             | 300         | 43               | 4            |
| MTSD [9]     | TS    | Diverse   | 3407 × 2375        | 36589           | 10544       | 313              | 4            |

2.1 Traffic Sign Detection

The objective of performing traffic sign detection(TSD) is to get the exact locations and sizes of traffic signs. The well-defined colors and shapes are two main cues for traffic sign detection. There have been various works on detecting traffic signs using traditional methods employing histogram of oriented gradients

348

A. Mishra et al.

(HOG) [14,19,39], SIFT (scale-invariant feature transform) [15], and local binary patterns (LBP) [8]. The main working concepts of traditional TSD revolve around sliding-window or region-of-interest (ROI) based approaches. HOG and Viola-Jones-like detectors [25] are examples of sliding-window methods, and Wang et al. [35] use a hierarchical sliding-window method to detect traffic signs. With the advent of deep learning and the rise of convolutional neural networks, there have been many works: Zhu et al. [40] developed a strategy to detect and recognize traffic signs based on proposals guided by a fully convolutional network. R-CNN with a proposal strategy gave good results on a small-scale dataset [21], and R-CNN combined with an object proposal method [42] further improved performance on the same dataset. In 2016, [1] proposed a method that implements the multi-scale sliding-window technique within a CNN using dilated convolutions. In 2019, the multiscale region-based convolutional neural network (MR-CNN) [24] was proposed for small traffic sign recognition, where a multiscale deconvolution operation upsamples the features of deeper convolution layers and concatenates them directly with those of the shallow layers to construct a fused feature map; the fused feature map generates fewer region proposals and achieves a higher recall rate. A multiresolution feature-fusion network exploiting deconvolution layers with skip connections and a vertical spatial sequence attention module was also designed for traffic sign detection.

2.2 Traffic Light Detection

Conventional traffic light detection (TLD) methods select confident proposals from a set of candidate traffic lights generated using color and shape information [13,34]. Once the candidate TLs are identified, [13] and [6] employ the Adaboost algorithm and other morphological operations to segment out the TL regions. The disadvantage of such algorithms based on hand-crafted features is their limited generalization ability; features such as HOG or SIFT tend to lose information that might help the task at hand, which can lead to much lower detection performance. There have also been many attempts to perform traffic light detection using modern convolutional networks. For instance, Weber et al. [36] used a convolutional network for traffic light detection by modifying the AlexNet architecture; the output of the network is a segmented image, which is then given to a bounding-box regressor to detect traffic lights. In [31], the authors use a combination of an SVM and a CNN for the joint task of traffic light detection and recognition. Similarly, Behrendt et al. [3] use a combination of detection, tracking, and classification with a convolutional neural network. Yudin et al. [38] propose another fully convolutional network for traffic light detection: a heat map highlighting areas of interest is obtained, followed by a clustering algorithm to obtain final


traffic light bounding boxes. Besides being a transfer learning approach, it has a very low detection precision compared to SSD-based models [27].

Table 2. Tr: Train, Te: Test. Publicly available traffic light datasets and their meta classes, along with the corresponding number of instances per class. The counts per class for S2TLD [37] are based on the dataset split generated by the authors.

| Dataset      | Meta class | Tr. instance count | Te. instance count |
|--------------|------------|--------------------|--------------------|
| LISA TL [20] | Go         | 24182              | 25222              |
|              | Stop       | 26089              | 30963              |
|              | Warning    | 1555               | 1464               |
| WPI [5]      | Green      | 1323               | 2763               |
|              | Red        | 1878               | 802                |
| BOSCH [2]    | Green      | 5422               | 7569               |
|              | Yellow     | 444                | 154                |
|              | Red        | 4164               | 5321               |
|              | Off        | 726                | 442                |
| S2TLD [37]   | Green      | 478                | 164                |
|              | Yellow     | 43                 | 16                 |
|              | Red        | 761                | 239                |
|              | Off        | 2                  | 1                  |
|              | Wait-on    | 178                | 64                 |

3 Deep Learning for Object Detection

With the success of convolutional neural networks to outperform traditional methods on classification task, there have been similar trends for other computer vision tasks such as object detection. OverFeat [32] is an example of such a network which outputs bounding boxes along with the scores using a deep network in a sliding-window fashion. Later, R-CNN [12] was proposed which helped in increasing the detection accuracy and was faster than the previous counterparts. The main disadvantage of using R-CNN is that it is expensive both in time and memory because it executes a CNN forward-pass for each object proposal without sharing computation. Spatial Pyramid Pooling Network (SPPNet) [16] was proposed to improve R-CNN efficiency by sharing computation. Due to the multi-stage pipeline in SPPNet, the whole process of detection becomes quite slow. Moreover, the parameters below the spatial pyramid pooling layer cannot be updated while training. After SPPNet, Fast R-CNN [11] was introduced, which proposes a new training algorithm that provides solutions to fix the disadvantages of R-CNN and SPPNet by training in a single-stage using a multi-task approach. But the main bottleneck in this approach is the candidate proposal strategy which is still different from the network training process. To overcome such bottleneck, Faster R-CNN [30], replaced the use of Selective Search with a


Region Proposal Network (RPN) that shares convolutional feature maps with the detection network, thus enabling nearly cost-free region proposals. To improve the performance of Faster-RCNN even further, Cascade-RCNN [4] was introduced. It consists of a series of detectors sequentially feeding in to the output of previous detector trained with increasing threshold values making it very selective for false positives. All these approaches discussed so far are multi-stage methods where there are two or multiple pipelines involved. There are other family of networks known as single stage networks including Single Shot MultiBox Detector (SSD) [23], YOLOv3 [29] and RetinaNet [22] which detects objects using a fully convolutional network rather than having separate tracks for detection and classification. This ability leads to a much faster object detection. Duan et al. have also proposed using anchor less technique for object detection which detects each object as a triplet of keypoints [7]. Standard object detector approaches can be broadly classified in two categories: (i) two-stage object detectors, (ii) one-stage object detectors. Two-stage object detectors combine a region-proposal step, region classification and regression step. On the contrary, one-stage detectors output boxes without a region proposal step. Two-stage and one-stage object detectors can also be called as anchor based since these models employ anchors to perform the detection. For bench-marking different traffic light and traffic sign datasets, we select publicly available and widely used Faster RCNN (2-stage), Yolov3 (one-stage), RetinaNet (one-stage) and SSD (one-stage) and Cascade-RCNN (mutli-stage) networks. Apart from the anchor based models, we also use CenterNet, anchor-less approach to bench-mark the datasets and analyse the quantitative results obtained. As mentioned earlier, we select these models for evaluation because of two reasons. Firstly, these models are known for real time performance. Secondly, these models are widely accepted in academia and industry. Almost all these models have been designed in such a way that they infer in near real-time and also validated through our experiments.

4 Experiments

The aim of the experimentation is to bench-mark different open-source traffic light and traffic sign datasets against various state-of-the-art deep learning based object detection models. We consider datasets pertaining to different geographies in order to understand how these deep neural networks perform under different conditions.

4.1 TL and TS Datasets

We consider four publicly available datasets for traffic light performance evaluation namely: LISA Traffic Light Dataset, BOSCH Traffic Light Dataset, WPI Traffic Light Dataset and the recently introduced SJTU Small Traffic Light Dataset (S2TLD). For traffic sign detection performance, we consider the following publicly available datasets: TT100K Dataset, Mapillary Traffic Sign Dataset,


LISA Traffic Sign Dataset and German Traffic Sign Dataset. Figure 1 shows representative images from all the datasets used for the bench-marking in this paper, and the details of the respective datasets can be obtained from Table 1. We have merged all fine classes into their respective meta classes for every dataset we consider. For instance, the LISA Traffic Light dataset contains seven fine classes, including "go", "goForward", "goLeft", "warning", "warningLeft", and "stop", which were converted to "go", "warning", and "stop" for a fair comparison of the common classes across datasets. More details regarding the meta classes for the various datasets are given in Table 2 and Table 3.
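The fine-to-meta conversion can be implemented as a simple lookup, sketched below for LISA TL; the "stopLeft" entry and the function interface are illustrative assumptions, since the text lists only six of the seven fine classes.

```python
# Illustrative fine-to-meta mapping for LISA TL; the remaining datasets are handled analogously.
LISA_TL_META = {
    "go": "go", "goForward": "go", "goLeft": "go",
    "warning": "warning", "warningLeft": "warning",
    "stop": "stop", "stopLeft": "stop",   # "stopLeft" is an assumed seventh fine class
}

def to_meta(annotations, mapping):
    """Relabel a list of (bbox, fine_label) annotations with their meta class."""
    return [(bbox, mapping[label]) for bbox, label in annotations if label in mapping]
```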

4.2 Training Setup

For training these models, we use the Detectron2 [31] and MMDetection [5] frameworks, which are modular and make training, validation, and testing on custom datasets straightforward. These frameworks are well documented and implemented in the PyTorch [28] deep learning framework. The benchmarking was carried out on a Linux machine with 2 Tesla V100 GPUs. Each detector has been trained with the same fixed parameters for fair experimentation and trained until convergence. The batch size is set to 4 with a learning rate of 0.00025, momentum of 0.9, and a weight decay factor of 0.0001. For FasterRCNN and CascadeRCNN, the backbones considered are Resnet101 and Resnet50, respectively. The backbones used for Yolov3 and SSD are Darknet-53 [29] and VGG16 [33], with input resolutions of 608 × 608 and 512 × 512, respectively.
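A minimal Detectron2-style configuration consistent with these hyper-parameters is sketched below. The dataset names are hypothetical placeholders that would have to be registered beforehand, and the exact base configuration files and schedules used by the authors are not stated in the text.

```python
from detectron2 import model_zoo
from detectron2.config import get_cfg
from detectron2.engine import DefaultTrainer

cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file("COCO-Detection/faster_rcnn_R_101_FPN_3x.yaml"))
cfg.DATASETS.TRAIN = ("lisa_tl_train",)      # hypothetical registered dataset names
cfg.DATASETS.TEST = ("lisa_tl_test",)
cfg.SOLVER.IMS_PER_BATCH = 4                 # batch size
cfg.SOLVER.BASE_LR = 0.00025
cfg.SOLVER.MOMENTUM = 0.9
cfg.SOLVER.WEIGHT_DECAY = 0.0001
cfg.SOLVER.MAX_ITER = 90000                  # placeholder; models are trained until convergence
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 3          # number of meta classes, e.g. for LISA TL

trainer = DefaultTrainer(cfg)
trainer.resume_or_load(resume=False)
trainer.train()
```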

4.3 Performance Evaluation

The evaluation of all the object detection models is done in terms of precision, recall, and mean average precision (mAP) computed using intersection-over-union (IoU). The mAP calculation follows the definition used for the Pascal VOC 2007 competition with an IoU threshold of 0.5 [10]. Equation (1) gives the mean average precision at the 0.5 threshold, where AP_i is the average precision of class i and N is the number of classes in the dataset:

$$mAP@0.5 = \frac{1}{N} \sum_{i=1}^{N} AP_i \tag{1}$$
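A simplified sketch of the VOC-2007-style computation (11-point interpolated AP at IoU 0.5, averaged over classes) is shown below; standard evaluation toolkits handle further details (e.g., difficult boxes) that are omitted here.

```python
import numpy as np

def iou(a, b):
    """IoU of two boxes given as [x1, y1, x2, y2]."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter + 1e-9)

def average_precision(preds, gts, iou_thr=0.5):
    """11-point interpolated AP (Pascal VOC 2007) for one class.
    preds: list of (image_id, score, box); gts: dict image_id -> list of boxes."""
    preds = sorted(preds, key=lambda p: -p[1])
    matched = {k: [False] * len(v) for k, v in gts.items()}
    tp, fp = [], []
    for img_id, _, box in preds:
        cand = gts.get(img_id, [])
        best, best_j = 0.0, -1
        for j, g in enumerate(cand):
            o = iou(box, g)
            if o > best:
                best, best_j = o, j
        if best >= iou_thr and not matched[img_id][best_j]:
            matched[img_id][best_j] = True
            tp.append(1); fp.append(0)
        else:
            tp.append(0); fp.append(1)
    n_gt = sum(len(v) for v in gts.values())
    tp, fp = np.cumsum(tp), np.cumsum(fp)
    recall = tp / max(n_gt, 1)
    precision = tp / np.maximum(tp + fp, 1e-9)
    return float(np.mean([precision[recall >= t].max() if np.any(recall >= t) else 0.0
                          for t in np.linspace(0, 1, 11)]))

def map50(per_class_preds, per_class_gts):
    """mAP@0.5: mean of the per-class APs, as in Eq. (1)."""
    aps = [average_precision(per_class_preds[c], per_class_gts[c]) for c in per_class_gts]
    return sum(aps) / len(aps)
```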

5 Results and Discussion

For the bench-marking task, we use YoloV3, Faster-RCNN, RetinaNet, SSD, CenterNet and Cascade-RCNN object detection model to serve as baselines.


Table 3. Tr: Train, Te: Test. Publicly available traffic sign datasets and their meta classes, along with the corresponding number of instances per class. For MTSD [9] and GTSD [18], we consider the validation set as the test set since the test-set annotations are not publicly available.

| Dataset      | Meta class              | Tr. instance count | Te. instance count |
|--------------|-------------------------|--------------------|--------------------|
| LISA TS [26] | Warning                 | 2611               | 651                |
|              | Prohibition             | 1459               | 360                |
|              | Speed-Limit             | 107                | 24                 |
|              | Stop                    | 1493               | 362                |
|              | Yield                   | 187                | 49                 |
|              | School                  | 108                | 26                 |
|              | School-Speed-Limit25    | 81                 | 24                 |
|              | Zone-Ahead              | 55                 | 14                 |
|              | Ramp-speed-advisory     | 41                 | 12                 |
|              | Round-about             | 35                 | 12                 |
|              | Curve-left              | 27                 | 12                 |
|              | No-Left-Turn            | 25                 | 12                 |
|              | Thru-Traffic-Merge-Left | 22                 | 5                  |
|              | Do-Not-Enter            | 19                 | 4                  |
|              | No-Right-Turn           | 14                 | 4                  |
| GTSD [18]    | Prohibitory             | 299                | 97                 |
|              | Mandatory               | 84                 | 30                 |
|              | Danger                  | 116                | 40                 |
|              | Other                   | 143                | 43                 |
| TT100K [41]  | Warning                 | 912                | 456                |
|              | Prohibitory             | 12393              | 6179               |
|              | Mandatory               | 3444               | 1626               |
| MTSD [9]     | Complementary           | 9082               | 1323               |
|              | Information             | 6507               | 948                |
|              | Regulatory              | 31574              | 4593               |
|              | Warning                 | 14328              | 2073               |
|              | Others                  | 118749             | 17209              |

5.1 Traffic Light Results

Table 4 contains quantitative results on the test sets using Faster-RCNN, Yolov3, CenterNet, SSD, RetinaNet, and Cascade-RCNN. We observe that on almost all datasets, CenterNet and CascadeRCNN compete closely for the best scores. The WPI dataset has ground-truth annotations lying mostly in the medium and large box-area range; hence SSD performs comparatively better among all the competitive baselines, because SSD focuses better on large objects.


Table 4. Results of various object detection models on all traffic light datasets using Faster R-CNN (FRCNN), SSD, Yolov3 (Y3), RetinaNet (RN), CenterNet (CN) and Cascade-RCNN (CRNN). Mean Average Precision (mAP) at the 0.5 threshold is indicated for the different classes of the respective datasets. CPU and GPU inference time per image (in seconds) is also indicated in this table. The values in bold font represent the best results in each category across all methods on the respective datasets. Empty ("–") cells indicate that the meta class is absent in the respective dataset.

WPI
| Class             | FRCNN | SSD   | Y3    | RN    | CN    | CRNN  |
|-------------------|-------|-------|-------|-------|-------|-------|
| go                | 68.13 | 91.80 | 84.00 | 43.14 | 87.50 | 77.07 |
| stop              | 49.83 | 85.40 | 75.50 | 13.09 | 81.90 | 81.23 |
| warning           | –     | –     | –     | –     | –     | –     |
| off               | –     | –     | –     | –     | –     | –     |
| wait-on           | –     | –     | –     | –     | –     | –     |
| mAP@0.5           | 58.98 | 88.60 | 79.80 | 28.12 | 84.70 | 79.15 |
| GPU inf. time (s) | 0.70  | 0.03  | 0.13  | 0.08  | 0.25  | 0.08  |
| CPU inf. time (s) | 14.32 | 3.24  | 7.27  | 2.24  | 5.77  | 5.79  |

LISA TL
| Class             | FRCNN | SSD   | Y3    | RN    | CN    | CRNN  |
|-------------------|-------|-------|-------|-------|-------|-------|
| go                | 54.85 | 55.70 | 46.50 | 54.42 | 59.50 | 62.27 |
| stop              | 35.40 | 53.50 | 8.80  | 40.61 | 58.60 | 45.36 |
| warning           | 12.00 | 36.30 | 31.60 | 16.66 | 32.30 | 39.63 |
| off               | –     | –     | –     | –     | –     | –     |
| wait-on           | –     | –     | –     | –     | –     | –     |
| mAP@0.5           | 34.09 | 48.50 | 29.00 | 37.23 | 50.10 | 49.09 |
| GPU inf. time (s) | 0.69  | 0.02  | 0.11  | 0.07  | 0.20  | 0.06  |
| CPU inf. time (s) | 14.14 | 3.00  | 5.15  | 4.05  | 5.84  | 6.65  |

BOSCH
| Class             | FRCNN | SSD   | Y3    | RN    | CN    | CRNN  |
|-------------------|-------|-------|-------|-------|-------|-------|
| go                | 61.09 | 73.20 | 50.10 | 66.76 | 83.50 | 67.91 |
| stop              | 60.15 | 64.40 | 71.70 | 46.39 | 79.00 | 79.87 |
| warning           | 27.66 | 8.50  | 64.50 | 11.98 | 48.30 | 62.51 |
| off               | 0.01  | 0.00  | 0.10  | 0.00  | 0.10  | 0.00  |
| wait-on           | –     | –     | –     | –     | –     | –     |
| mAP@0.5           | 37.23 | 36.50 | 46.60 | 31.28 | 52.70 | 52.57 |
| GPU inf. time (s) | 0.69  | 0.03  | 0.12  | 0.08  | 0.27  | 0.07  |
| CPU inf. time (s) | 14.04 | 3.11  | 8.75  | 4.62  | 5.74  | 5.69  |

S2TLD
| Class             | FRCNN | SSD   | Y3    | RN    | CN    | CRNN  |
|-------------------|-------|-------|-------|-------|-------|-------|
| go                | 85.89 | 87.50 | 91.10 | 89.80 | 91.00 | 94.15 |
| stop              | 85.73 | 87.50 | 92.30 | 90.54 | 91.00 | 86.08 |
| warning           | 58.20 | 18.60 | 81.20 | 53.52 | 65.80 | 92.26 |
| off               | 0.00  | 0.00  | 0.00  | 0.00  | 0.40  | 0.00  |
| wait-on           | 93.76 | 95.90 | 90.40 | 98.20 | 98.40 | 92.56 |
| mAP@0.5           | 64.72 | 57.90 | 71.00 | 66.38 | 69.30 | 73.01 |
| GPU inf. time (s) | 0.60  | 0.03  | 0.14  | 0.13  | 0.25  | 0.08  |
| CPU inf. time (s) | 17.91 | 2.96  | 5.76  | 5.37  | 5.83  | 7.42  |

Table 5. Number of trainable parameters (in millions) and number of floating-point operations for all the object detection models.

| OD model         | Backbone     | Parameters (million) | GFlops |
|------------------|--------------|----------------------|--------|
| FasterRCNN [30]  | ResNet101    | 104.38               | 423.16 |
| SSD [23]         | VGG16        | 36.04                | 355.12 |
| Yolov3 [29]      | Darknet53    | 61.95                | 171.91 |
| RetinaNet [22]   | ResNet50     | 37.91                | 215.49 |
| CenterNet [7]    | ResNet18     | 14.44                | 44.78  |
| CascadeRCNN [4]  | ResNet50-FPN | 69.10                | 214.44 |


SSD's performance is followed by CenterNet, an anchor-less approach that outperforms the other anchor-based approaches. For LISA TL, BOSCH, and S2TLD, the majority of the ground-truth TL instances are located far away from the camera and thus occupy small annotation box areas. In such scenarios, anchors have to adjust considerably to match the location of the object of interest. CenterNet and CascadeRCNN both perform better here: CenterNet is an anchor-less approach, eliminating the anchor dependency, while CascadeRCNN applies repeated RPN blocks for better detection at uniform thresholds. One observation to note is that the class "off" has almost zero mAP even though instances of this class are present in the BOSCH and S2TLD datasets. A plausible reason is that "off" indicates that no traffic light is activated, whereas the network is essentially trained to detect colors; moreover, in both datasets the "off" class has very few instances compared to the other classes. Figure 2 shows qualitative results on selected frames of the respective traffic light datasets. Visually, we infer that all models are able to detect traffic lights at different angles of rotation. On the BOSCH dataset, Faster-RCNN and RetinaNet are able to detect the traffic light present on the left despite occlusion and difficult lighting in the image; Yolov3 and CenterNet are likewise able to detect the traffic light on the left. All models are able to detect the horizontal lights of the S2TLD dataset and the vertical lights of LISA TL and WPI.

5.2 Traffic Sign Results

Table 6 contains quantitative results obtained on the test sets using Faster-RCNN, YoloV3, CenterNet, SSD, RetinaNet, and CascadeRCNN. From the results, it can be inferred that CenterNet outperforms almost all the models on all the datasets. For LISA TS, CascadeRCNN achieved the best mAP because of the presence of evenly distributed traffic signs in medium and small annotation box areas. Figure 2 shows qualitative results on frames of the respective traffic sign datasets. The visual results indicate that the models predict correctly on all the classes they have been trained on. SSD and RetinaNet output some false detections on the LISA TS and GTSD datasets, respectively, but CenterNet performs detection on all the datasets with higher confidence, even in the slightly darker imaging conditions of the MTSD examples.


Table 6. Results of various object detection models on traffic sign datasets using Faster R-CNN (FRCNN), SSD, Yolov3 (Y3), RetinaNet (RN) and CenterNet (CN). Mean Average Precision (mAP) at 0.5 threshold is indicated in the last row for different classes of the respective datasets. CPU and GPU inference time per image (in seconds) is also indicated in this table. The values in bold font represent the best results in each category across all methods on the respective datasets. Empty row values indicate that meta class is absent in the respective dataset. TT100K Class

FRCNN SSD Y3

RN

GTSD CN CRNN FRCNN SSD

Y3

RN

CN

-

-

-

85.19

79.90 92.3 88.7297.90 95.48

-

prohibitory

78.65

79.9092.5092.40 97.20 94.46

66.40

91.60 98.20 81.98 99.20

75.13

mandatory

83.71

82.8088.9087.0497.20 76.82

65.65

80.10 85.60 77.28 97.60

68.42

78.50

96.70100.0096.37100.00 86.71

danger

-

-

-

-

-

-

-

CRNN

warning

-

regulatory

-

-

-

-

-

-

-

-

-

-

-

-

complementary

-

-

-

-

-

-

-

-

-

-

-

-

information

-

-

-

-

-

-

-

-

-

-

-

-

speed-limit

-

-

-

-

-

-

-

-

-

-

-

-

stop

-

-

-

-

-

-

-

-

-

-

-

-

yield

-

-

-

-

-

-

-

-

-

-

-

-

school

-

-

-

-

-

-

-

-

-

-

-

-

school-speed-limit25

-

-

-

-

-

-

-

-

-

-

-

-

zone-ahead

-

-

-

-

-

-

-

-

-

-

-

-

ramp-speed-advisory

-

-

-

-

-

-

-

-

-

-

-

-

round-about

-

-

-

-

-

-

-

-

-

-

-

-

curve-left

-

-

-

-

-

-

-

-

-

-

-

-

no-left-turn

-

-

-

-

-

-

-

-

-

-

-

-

thru-traffic-merge-left

-

-

-

-

-

-

-

-

-

-

-

-

do-not-enter

-

-

-

-

-

-

-

-

-

-

-

-

no-right-turn

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

other [email protected]

82.51

80.9091.6089.3997.40 88.92

60.06

73.70 90.00 86.81 92.20

65.03

67.65

85.50 93.40 85.68 97.20

73.85

GPU Inf Time. (sec)

0.69

0.03 0.10 0.07 0.21

0.07

0.67

CPU Inf Time. (sec)

15.86

2.53 9.23 2.09 5.83

6.69

14.90

0.002 0.13 3.04

MTSD Class warning

FRCNN SSD Y3 66.78

prohibitory

-

mandatory

83.71

danger

-

RN

-

-

-

-

82.8088.9087.0497.20 76.82 -

-

-

0.08

0.21

0.07

2.14

6.13

8.15

LISA TS CN CRNN FRCNN SSD

59.2060.0066.6881.00 75.98 -

9.81

-

-

Y3

RN

CN

84.82

90.20 95.00

88.74

88.90

CRNN 96.38

82.07

90.40 93.90

84.94

92.80

95.77

-

-

-

-

-

-

-

-

-

-

-

-

regulatory

57.70

44.4053.3054.0281.50 73.94

-

-

-

-

-

complementary

50.74

38.9048.5041.49 73.00 74.27

-

-

-

-

-

-

28.70 38.10

35.13

67.50

30.61

information

-

-

-

-

-

-

36.89

speed-limit

-

-

-

-

-

-

25.82

48.90 94.60

96.70

80.90

96.78

stop

-

-

-

-

-

-

74.18

84.30 92.10

84.20

85.10

94.27

yield

-

-

-

-

-

-

30.92

55.30 88.70

44.65

66.80

87.50

school

-

-

-

-

-

-

62.79

91.5 99.30

25.74

93.50

87.56

school-speed-limit25

-

-

-

-

-

-

84.73

74.50 95.40 100.00 91.10

95.60

zone-ahead

-

-

-

-

-

-

46.37

11.20 80.00

88.6

59.80 100.00

ramp-speed-advisory

-

-

-

-

-

-

73.62

74.30100.00 79.86 100.00 89.25

round-about

-

-

-

-

-

-

72.70

51.50 71.40

48.67

89.10

97.69

curve-left

-

-

-

-

-

-

67.06

27.50 88.30

40.77

88.30

94.54

no-left-turn

-

-

-

-

-

-

25.49

46.60100.00 67.85

95.30

80.01

thru-traffic-merge-left

-

-

-

-

-

-

80.19

33.50100.00 60.13 100.00 63.51

do-not-enter

-

-

-

-

-

-

88.61

5.00 93.30

83.31

88.60

no-right-turn

-

-

-

-

-

-

0.00

24.00 20.00

75.24

69.50 100.00

other

44.24

27.6030.3034.0055.50 30.01

-

[email protected]

82.51

80.9 91.6 89.3997.40 88.92

59.96

-

-

53.92 87.50

96.70

-

-

-

71.27

86.0

91.71

GPU Inf Time. (sec)

0.69

0.03 0.10 0.07 0.21

0.07

0.68

0.03

0.11

0.077

0.23

0.06

CPU Inf Time. (sec)

15.86

2.53 9.23 2.09 5.83

6.69

18.03

2.95

8.28

3.72

5.20

2.62

5.3 Object Detection Model's Statistics

Table 5 shows the number of trainable parameters and the number of floating-point operations (GFlops) of the respective object detection models. It is evident that FasterRCNN has the highest number of parameters, since it uses Resnet101 as the backbone architecture, while CascadeRCNN and RetinaNet use Resnet50. This is also reflected in the GPU and CPU inference times in Table 4 and Table 6. SSD operates at a lower input resolution than Yolov3, hence its run time is much lower. CenterNet uses Resnet18 [17] as the backbone and has the lowest number of trainable parameters and floating-point operations, surpassing all anchor-based models in terms of both detection and real-time performance.
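The parameter counts in Table 5 correspond to the standard computation sketched below; the FLOP counter mentioned in the comment is one possible tool and an assumption, since the paper does not state which profiler was used.

```python
import torch

def count_parameters(model):
    """Trainable parameters in millions, as reported in Table 5."""
    return sum(p.numel() for p in model.parameters() if p.requires_grad) / 1e6

# FLOP counting typically relies on a profiling library; e.g., with fvcore (an assumption):
# from fvcore.nn import FlopCountAnalysis
# gflops = FlopCountAnalysis(model, torch.rand(1, 3, 512, 512)).total() / 1e9
```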


Fig. 2. Qualitative results of various object detection models on all traffic sign and traffic light datasets. The first four rows have visual results from traffic light datasets. The next four rows have visual results from traffic sign datasets. The first column is the ground truth image along with annotations in green color (Best viewed when zoomed).

6 Conclusion

In this work, we used various state-of-the-art object detection models and evaluated their performance on the publicly available traffic light datasets LISA TL, WPI, BOSCH, and S2TLD, and the traffic sign datasets LISA TS, GTSD, TT100K, and MTSD. To the best of our knowledge, this work is the first detailed analysis of different object detection models (anchor-based and anchor-less) on common meta classes of various traffic light and traffic sign datasets. From the results, it is evident that anchor-less methods outperform anchor-based methods on almost all traffic light and traffic sign datasets, irrespective of the instance location. For traffic light instances located nearby, as in the WPI dataset, even SSD performs better than any other anchor-based model. For LISA TS specifically, Yolov3 marginally outperforms CenterNet owing to the balance of traffic sign instances in medium and large annotation box areas. In future work, we would like to explore the effect of weather patterns and the role of domain adaptation techniques in increasing the detection accuracy for small objects such as traffic signs and traffic lights.

References 1. Aghdam, H.H., Heravi, E.J., Puig, D.: A practical approach for detection and classification of traffic signs using convolutional neural networks. Robot. Auton. Syst. 84, 97–112 (2016) 2. Behrendt, K., Novak, L.: A deep learning approach to traffic lights: detection, tracking, and classification. In: 2017 IEEE International Conference on Robotics and Automation (ICRA). IEEE (2017) 3. Behrendt, K., Novak, L., Botros, R.: A deep learning approach to traffic lights: detection, tracking, and classification. In: 2017 IEEE International Conference on Robotics and Automation (ICRA), pp. 1370–1377. IEEE (2017) 4. Cai, Z., Vasconcelos, N.: Cascade R-CNN: delving into high quality object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 6154–6162 (2018) 5. Chen, Z., Huang, X.: Accurate and reliable detection of traffic lights using multiclass learning and multiobject tracking. IEEE Intell. Transp. Syst. Mag. 8(4), 28–42 (2016). https://doi.org/10.1109/MITS.2016.2605381 6. De Charette, R., Nashashibi, F.: Traffic light recognition using image processing compared to learning processes. In: 2009 IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 333–338. IEEE (2009) 7. Duan, K., Bai, S., Xie, L., Qi, H., Huang, Q., Tian, Q.: Centernet: keypoint triplets for object detection. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 6569–6578 (2019) 8. Ellahyani, A., Ansari, M., Jaafari, I., Charfi, S.: Traffic sign detection and recognition using features combination and random forests. Int. J. Adv. Comput. Sci. Appl. 7(1), 686–693 (2016) 9. Ertler, C., Mislej, J., Ollmann, T., Porzi, L., Neuhold, G., Kuang, Y.: The mapillary traffic sign dataset for detection and classification on a global scale. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12368, pp. 68–84. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58592-1 5


10. Everingham, M., Van Gool, L., Williams, C.K.I., Winn, J., Zisserman, A.: The PASCAL Visual Object Classes Challenge 2007 (VOC2007) Results. http://www. pascal-network.org/challenges/VOC/voc2007/workshop/index.html 11. Girshick, R.: Fast R-CNN. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 1440–1448 (2015) 12. Girshick, R., Donahue, J., Darrell, T., Malik, J.: Rich feature hierarchies for accurate object detection and semantic segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 580–587 (2014) 13. Gong, J., Jiang, Y., Xiong, G., Guan, C., Tao, G., Chen, H.: The recognition and tracking of traffic lights based on color segmentation and camshift for intelligent vehicles. In: 2010 IEEE Intelligent Vehicles Symposium, pp. 431–435 (2010). https://doi.org/10.1109/IVS.2010.5548083 14. Greenhalgh, J., Mirmehdi, M.: Real-time detection and recognition of road traffic signs. IEEE Trans. Intell. Transp. Syst. 13(4), 1498–1506 (2012). https://doi.org/ 10.1109/TITS.2012.2208909 15. Haloi, M.: A novel PLSA based traffic signs classification system (2015) 16. He, K., Zhang, X., Ren, S., Sun, J.: Spatial pyramid pooling in deep convolutional networks for visual recognition. IEEE Trans. Pattern Anal. Mach. Intell. 37(9), 1904–1916 (2015) 17. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016) 18. Houben, S., Stallkamp, J., Salmen, J., Schlipsing, M., Igel, C.: Detection of traffic signs in real-world images: the German traffic sign detection benchmark. In: International Joint Conference on Neural Networks, no. 1288 (2013) 19. Huang, Z., Yu, Y., Gu, J., Liu, H.: An efficient method for traffic sign recognition based on extreme learning machine. IEEE Trans. Cybern. 47(4), 920–933 (2016) 20. Jensen, M.B., Philipsen, M.P., Møgelmose, A., Moeslund, T.B., Trivedi, M.M.: Vision for looking at traffic lights: issues, survey, and perspectives. IEEE Trans. Intell. Transp. Syst. 17(7), 1800–1815 (2016). https://doi.org/10.1109/TITS.2015. 2509509 21. Larsson, F., Felsberg, M.: Using fourier descriptors and spatial models for traffic sign recognition. In: Heyden, A., Kahl, F. (eds.) SCIA 2011. LNCS, vol. 6688, pp. 238–249. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-212277 23 22. Lin, T.Y., Goyal, P., Girshick, R., He, K., Doll´ ar, P.: Focal loss for dense object detection. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 2980–2988 (2017) 23. Liu, W., et al.: SSD: single shot MultiBox detector. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9905, pp. 21–37. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46448-0 2 24. Liu, Z., Du, J., Tian, F., Wen, J.: MR-CNN: a multi-scale region-based convolutional neural network for small traffic sign recognition. IEEE Access 7, 57120–57128 (2019) 25. Møgelmose, A.: Visual analysis in traffic & re-identification (2015) 26. Mogelmose, A., Trivedi, M.M., Moeslund, T.B.: Vision-based traffic sign detection and analysis for intelligent driver assistance systems: Perspectives and survey. IEEE Trans. Intell. Transp. Syst. 13(4), 1484–1497 (2012). https://doi.org/10. 1109/TITS.2012.2209421

Evaluating and Bench-Marking Object Detection Models

359

27. M¨ uller, J., Dietmayer, K.: Detecting traffic lights by single shot detection. In: 2018 21st International Conference on Intelligent Transportation Systems (ITSC), pp. 266–273 (2018). https://doi.org/10.1109/ITSC.2018.8569683 28. Paszke, A., et al.: Pytorch: an imperative style, high-performance deep learning library. Adv. Neural. Inf. Process. Syst. 32, 8026–8037 (2019) 29. Redmon, J., Farhadi, A.: Yolov3: an incremental improvement. arXiv preprint arXiv:1804.02767 (2018) 30. Ren, S., He, K., Girshick, R., Sun, J.: Faster R-CNN: towards real-time object detection with region proposal networks. Adv. Neural. Inf. Process. Syst. 28, 91– 99 (2015) 31. Saini, S., Nikhil, S., Konda, K.R., Bharadwaj, H.S., Ganeshan, N.: An efficient vision-based traffic light detection and state recognition for autonomous vehicles. In: 2017 IEEE Intelligent Vehicles Symposium (IV), pp. 606–611. IEEE (2017) 32. Sermanet, P., Eigen, D., Zhang, X., Mathieu, M., Fergus, R., LeCun, Y.: Overfeat: integrated recognition, localization and detection using convolutional networks. arXiv preprint arXiv:1312.6229 (2013) 33. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) 34. Siogkas, G., Skodras, E.: Traffic lights detection in adverse conditions using color, symmetry and spatiotemporal information, vol. 1 (2012) 35. Wang, G., Ren, G., Wu, Z., Zhao, Y., Jiang, L.: A robust, coarse-to-fine traffic sign detection method. In: The 2013 International Joint Conference on Neural Networks (IJCNN), pp. 1–5. IEEE (2013) 36. Weber, M., Wolf, P., Z¨ ollner, J.M.: DeepTLR: a single deep convolutional network for detection and classification of traffic lights. In: 2016 IEEE Intelligent Vehicles Symposium (IV), pp. 342–348 (2016). https://doi.org/10.1109/IVS.2016.7535408 37. Yang, X., Yan, J., Yang, X., Tang, J., Liao, W., He, T.: SCRDet++: detecting small, cluttered and rotated objects via instance-level feature denoising and rotation loss smoothing. arXiv preprint arXiv:2004.13316 (2020) 38. Yudin, D., Slavioglo, D.: Usage of fully convolutional network with clustering for traffic light detection. In: 2018 7th Mediterranean Conference on Embedded Computing (MECO), pp. 1–6. IEEE (2018) 39. Zaklouta, F., Stanciulescu, B.: Real-time traffic-sign recognition using tree classifiers. IEEE Trans. Intell. Transp. Syst. 13(4), 1507–1514 (2012) 40. Zhu, Y., Zhang, C., Zhou, D., Wang, X., Bai, X., Liu, W.: Traffic sign detection and recognition using fully convolutional network guided proposals. Neurocomputing 214, 758–766 (2016) 41. Zhu, Z., Liang, D., Zhang, S., Huang, X., Li, B., Hu, S.: Traffic-sign detection and classification in the wild. In: The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) 42. Zitnick, C.L., Doll´ ar, P.: Edge boxes: locating object proposals from edges. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8693, pp. 391–405. Springer, Cham (2014). https://doi.org/10.1007/978-3-31910602-1 26

Exploring Spatial-Temporal Instance Relationships in an Intermediate Domain for Image-to-Video Object Detection

Zihan Wen, Jin Chen, and Xinxiao Wu(B)

Beijing Laboratory of Intelligent Information Technology, School of Computer Science, Beijing Institute of Technology, Beijing, China
{wenzihan,chen jin,wuxinxiao}@bit.edu.cn

Abstract. Image-to-video object detection leverages annotated images to help detect objects in unannotated videos, so as to break the heavy dependency on the expensive annotation of large-scale video frames. This task is extremely challenging due to the serious domain discrepancy between images and video frames caused by appearance variance and motion blur. Previous methods perform both image-level and instance-level alignments to reduce the domain discrepancy, but the existing false instance alignments may limit their performance in real scenarios. We propose a novel spatial-temporal graph to model the contextual relationships between instances to alleviate the false alignments. Through message propagation over the graph, the visual information from the spatial and temporal neighboring object proposals is adaptively aggregated to enhance the current instance representation. Moreover, to adapt the source-biased decision boundary to the target data, we generate an intermediate domain between images and frames. It is worth mentioning that our method can be easily applied as a plug-and-play component to other image-to-video object detection models based on instance alignment. Experiments on several datasets demonstrate the effectiveness of our method. Code will be available at: https://github.com/wenzihan/STMP.

Keywords: Deep learning · Object detection · Domain adaptation

1 Introduction

Tremendous progress has been achieved on object detection in videos [3,9,10,19,23] thanks to the great success of deep neural networks. However, training deep object detectors requires annotating large-scale video frames, which is often time-consuming and labor-intensive. On the other hand, images are much easier and cheaper to annotate, and there are also many existing labeled image datasets that can be readily utilized. Therefore, image-to-video object detection has been proposed to leverage annotated images for detecting objects in unannotated videos, so as to break the heavy dependency on the expensive annotation of frames.


Fig. 1. Illustration of the domain discrepancy between images from the PASCAL-VOC dataset and video frames from the Youtube-Objects dataset. Although both datasets contain samples of “horse” and “car”, the domain discrepancy caused by appearance variance (a) and motion blur (b) still makes it challenging to apply an object detector learned from source images to target video frames. In order to reduce the domain discrepancy between images and video frames, we propose to generate an intermediate domain; some intermediate-domain images are shown in (c).

However, directly applying the detector trained on source images may significantly hurt its performance on detection in target videos, since there exists a serious domain discrepancy between images and frames caused by appearance variance and motion blur, as shown in Fig. 1. To address this problem, several prior works [1,4,17,22] mainly focus on performing image-level and instance-level alignments of source images and target video frames to minimize the domain discrepancy. Although these methods have achieved promising results, bounding box deviation, occlusion and out-of-focus blur may lead to false alignments between cross-domain instances, which degrades their performance and limits their applications in real scenarios.

Taking the spatial-temporal context of instances into account as a potentially valuable information source for alleviating the false instance alignments, in this paper we propose a novel spatial-temporal message propagation method to model the contextual relationships between instances for image-to-video object detection. By propagating messages over the graph, the visual information of neighboring object proposals, both within the same frame and from the adjacent frames, is adaptively aggregated to enhance the current object instance representation. This feature aggregation strategy can mitigate the influence of bounding box deviation, occlusion and out-of-focus blur in images or frames. For example, spatially neighboring proposal features that are aggregated according to the intersection over union are helpful for relieving the deviation of bounding boxes, and temporally adjacent proposal features that are aggregated under the guidance of optical flow are beneficial for alleviating the effects of occlusion and motion blur. To be more specific, we first build an undirected graph where the nodes are represented by the region proposals and the edges are represented by the spatial-temporal instance relationships between the nodes.


Then we introduce a single-layer graph convolution network with the normal propagation rule [12] for message propagation over the graph, and update the node representations by aggregating the propagated features from the spatial-temporal neighboring nodes. Moreover, to adapt the source-biased decision boundary to the target domain, we generate an intermediate domain between the source images and target video frames via generative adversarial learning. It serves as a bridge connecting the source and target domains, and the cross-domain alignments on both the image level and the instance level are performed on this domain, which further boosts the positive transfer of the object detector between different domains.

The main contributions of this work are three-fold:
• We propose a novel spatial-temporal graph to model the contextual relationships between object instances for alleviating the false instance alignments in image-to-video object detection. It can be easily and readily applied as a plug-and-play component to other detection models based on instance alignment.
• We propose an intermediate domain between the source images and the target video frames to reduce the domain discrepancy, thereby facilitating the cross-domain alignment.
• Extensive experiments on several datasets demonstrate that our method achieves better performance than existing methods, validating the effectiveness of modeling the instance relationships via a spatial-temporal graph.

2 Related Work

Video object detection is vulnerable to motion blur, illumination variation, occlusion and scale changes. To address this problem, many methods have been proposed to utilize temporal context for detection, roughly falling into two categories: box-level propagation and feature-level propagation. The box-level propagation methods [9,10] explore bounding box relations and apply temporal post-processing to suppress false positives and recover false negatives. T-CNN [10] incorporates temporal and contextual information from tubelets obtained in videos, which dramatically improves the baseline performance of existing still-image detection frameworks. Seq-NMS [9] uses high-scoring object detections from nearby frames to boost the scores of weaker detections within the same clip. The feature-level propagation methods [3,19,23] use the temporal coherence of features to solve the problem. FGFA [23] improves the per-frame features by aggregating nearby features along the motion paths, and thus improves the video detection accuracy. Based on FGFA, MANet [19] jointly calibrates the features of objects on both the pixel level and the instance level in a unified framework. MEGA [3] augments the proposal features of the key frame by effectively aggregating global and local information. All these video object detection methods heavily depend on manual per-frame annotations of bounding box coordinates and categories, which are usually time-consuming and labor-expensive.


Fig. 2. Overview of our framework. The top of the framework is a basic image-to-video detector based on [4]. The bottom of the framework is our proposed spatial-temporal message propagation (STMP) and intermediate domain, which serve as a plug-and-play component of the basic image-to-video detector for positive alignments on both image and instance levels. The source images linked with a dotted line are utilized to generate the intermediate domain and are not used for the training of the image-to-video detector.

To break the dependency on large-scale annotated video frames, image-to-video object detection is proposed to leverage existing annotated images to help object detection in unannotated videos. Several image-to-image domain adaptive object detection methods have been proposed in recent years. DA-Faster [4] is a prominent and effective approach for domain adaptive object detection, which introduces two domain classifiers on both the image and instance levels to alleviate the performance drop caused by domain shift. SW-Faster [17] is an improvement of DA-Faster, which proposes a novel approach based on strong local alignment and weak global alignment. SCDA [22] improves DA-Faster by replacing the plain image-level alignment model with a region-level alignment model. HTCN [1] extends the ability of previous adversarial-based adaptive detection methods by harmonizing the potential contradiction between transferability and discriminability. These cross-domain object detection methods can also be used to reduce the domain gap between images and video frames in image-to-video object detection. In the aforementioned methods, there often exist false alignments between instances across the domains due to appearance variations and motion blur. In this paper, we attempt to handle the false instance alignments by taking full advantage of spatial and temporal information within and across frames in videos to enhance the primal instance representations, thus boosting the positive instance alignments.


3 Our Method

3.1 Overview

In this paper, we propose a novel spatial-temporal graph to address the false instance alignments in image-to-video object detection, and an intermediate domain to reduce the domain discrepancy. Specifically, we first generate an intermediate domain by learning a transformation between source images and target video frames to reduce the domain discrepancy at the image level. Based on the intermediate domain and the target video domain, we then construct a spatial-temporal graph to model the instance relationships, and incorporate it into the domain adaptive Faster R-CNN [4,17] to relieve the false instance alignments. The overview of our method is illustrated in Fig. 2.

For the image-to-video object detection task, we have an annotated source image domain of $N_s$ images, denoted as $D_s = \{x_i^s, y_i^s\}_{i=1}^{N_s}$, and an unannotated target video domain that consists of $N_t$ video frames, denoted as $D_t = \{x_i^t\}_{i=1}^{N_t}$. The region proposals are generated via a region proposal network. Let $B_i^s = \{b_{i,j}^s\}_{j=1}^{N_s^b}$ represent the set of region proposals in $x_i^s$, where $N_s^b$ is the number of region proposals in $x_i^s$, and let $B_i^t = \{b_{i,j}^t\}_{j=1}^{N_t^b}$ represent the set of region proposals in $x_i^t$, where $N_t^b$ is the number of region proposals in $x_i^t$.

3.2 Intermediate Domain

Due to the large discrepancy between the source and target domains, the performance of the detector degrades substantially. In this paper, we propose to generate an intermediate domain $D_f$ to bridge the source and target domains by learning to translate the source images into the style of the target video frames. A typical and powerful image-to-image translation network, CycleGAN [21], is employed to learn a transformation between the source image domain $D_s$ and the target video domain $D_t$. Since ground-truth labels are only accessible for the source domain, we merely consider the transformation from source images to target frames after training CycleGAN, and then translate the source domain $D_s$ into an intermediate domain $D_f = \{x_i^f, y_i^s\}_{i=1}^{N_s}$. $D_s$ and $D_f$ are similar in image content but diverge in visual appearance, while $D_f$ and $D_t$ differ in image content but have similar distributions at the pixel level. Therefore, our intermediate domain constitutes an intermediate feature space distributed neutrally between the source and target domains. Some visualization results of the generated intermediate-domain images are shown in Fig. 1(c).
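The paper does not include the translation code itself, but the construction of $D_f$ can be sketched as follows; the generator handle `g_src2tgt`, the tensor conventions and the function name are illustrative assumptions rather than part of the original implementation. Since CycleGAN only alters appearance, the source labels $y_i^s$ are carried over unchanged to the translated images.

```python
import torch

@torch.no_grad()
def build_intermediate_domain(source_images, g_src2tgt):
    """Translate annotated source images into intermediate-domain images.

    source_images: (N, 3, H, W) tensor of source images (e.g. scaled to [-1, 1]).
    g_src2tgt:     a trained CycleGAN generator for the source-to-target direction;
                   its training and loading are outside this sketch.
    The source labels y^s are reused unchanged, since only appearance is altered.
    """
    g_src2tgt.eval()
    translated = []
    for img in source_images:
        fake_target = g_src2tgt(img.unsqueeze(0))   # translate one image at a time
        translated.append(fake_target.squeeze(0).cpu())
    return torch.stack(translated)                  # images of the intermediate domain D_f
```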

3.3 Domain Adaptive Faster R-CNN

Based on the generated intermediate domain $D_f$ and the target domain $D_t$, we utilize DA-Faster [4] to build a basic image-to-video object detection model, which consists of an object detector (Faster R-CNN [16]) and an adaptation module.


Faster R-CNN is a two-stage detector consisting of three main components: a deep convolutional neural network (i.e., “Conv Blocks” in Fig. 2) to extract features, a region proposal network (i.e., “RPN” in Fig. 2) to generate region proposals, and a fully connected network (i.e., “FC Layers” in Fig. 2) for bounding box classification and regression. The loss function of Faster R-CNN is summarized as

$$L_{det} = L_{cls}^{RPN} + L_{reg}^{RPN} + L_{cls}^{RCNN} + L_{reg}^{RCNN}. \tag{1}$$

The adaptation module aims at aligning the distributions between the intermediate domain and the target domain at both the image and instance levels, and consists of an image-level domain classifier $D_{img}$ and an instance-level domain classifier $D_{ins}$. Specifically, let $d_i$ denote the domain label of the $i$-th training image, either in the intermediate domain or the target domain, with $d_i = 0$ for the intermediate domain and $d_i = 1$ for the target domain. By denoting the output of $D_{img}$ located at $(u, v)$ as $p_i^{(u,v)}$, the image-level adaptation loss can be written as

$$L_{img} = -\sum_{i,u,v}\left[d_i \log p_i^{(u,v)} + (1 - d_i)\log\left(1 - p_i^{(u,v)}\right)\right]. \tag{2}$$

Let $p_{i,j}$ denote the output of the instance-level domain classifier $D_{ins}$ for the $j$-th region proposal in the $i$-th image. The instance-level adaptation loss can be written as

$$L_{ins} = -\sum_{i,j}\left[d_i \log p_{i,j} + (1 - d_i)\log\left(1 - p_{i,j}\right)\right]. \tag{3}$$

To align the domain distributions, the parameters of the image-level and instance-level domain classifiers should be optimized to minimize the above corresponding domain classification losses, while the base network should be optimized to maximize these losses. In the implementation, the gradient is reversed by a Gradient Reverse Layer (GRL) [7] to conduct the adversarial training between ($D_{img}$, $D_{ins}$) and the base network of Faster R-CNN.
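As a rough illustration of this adversarial scheme, the sketch below combines a gradient reverse layer with an instance-level domain classifier in PyTorch. The layer sizes follow the 4096 → 100 → 100 → 2 setting reported in Sect. 4.2; the class names, the cross-entropy head and the calling convention are our own assumptions, not the authors' code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; multiplies the gradient by -lam in the backward pass."""
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None  # no gradient for lam

def grad_reverse(x, lam=1.0):
    return GradReverse.apply(x, lam)

class InstanceDomainClassifier(nn.Module):
    """Instance-level domain classifier D_ins operating on RoI features."""
    def __init__(self, in_dim=4096):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 100), nn.ReLU(inplace=True),
            nn.Linear(100, 100), nn.ReLU(inplace=True),
            nn.Linear(100, 2),
        )

    def forward(self, roi_feats, domain_label, lam=1.0):
        # GRL is applied to the input, so the classifier itself learns to classify
        # domains while the backbone receives reversed (adversarial) gradients.
        logits = self.net(grad_reverse(roi_feats, lam))               # (num_rois, 2)
        target = torch.full((logits.size(0),), int(domain_label),
                            dtype=torch.long, device=logits.device)   # 0: intermediate, 1: target
        return F.cross_entropy(logits, target)                        # Eq. (3) for one image
```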

3.4 Spatial-Temporal Instance Relationships Construction

To avoid the false instance alignments during the learning of the domain adaptive Faster R-CNN, we propose a spatial-temporal graph to model the contextual relationships between instances on both the spatial and temporal dimensions. As shown in Fig. 2, an intra-image spatial sparse graph and an inter-image temporal sparse graph are successively constructed for spatial-temporal message propagation. After message propagation, we obtain more accurate instance representations, which are fed into the instance-level domain classifier $D_{ins}$ for positive instance alignment. Note that the inter-image temporal sparse graph is an extension of the intra-image spatial sparse graph to the temporal dimension.

Intra-image Spatial Sparse Graph Construction. We construct an intra-image spatial sparse graph for spatial message propagation.


Fig. 3. Different choices of constructing a graph to encode spatial relationships: (a) Fully connected graph: implicitly learning a fully connected graph between region proposals. The learned graph is redundant and ignores spatial information between region proposals. (b) Spatial graph: using Intersection Over Union (IoU) between region proposals to learn a spatial graph. However, there still exist redundant edges, which are useless for obtaining a better instance representation and may even damage the primal features. (c) Spatial sparse graph: a spatial sparse graph is learned by adding constraints on the sparsity of the spatial graph.

Specifically, for an image $x_i^f$ in the intermediate domain $D_f$ or a frame $x_i^t$ in the target domain $D_t$, we structure its region proposals as an undirected graph $G = (V, E)$. $V$ is the set of nodes corresponding to the proposal set $B_i^f$ of $x_i^f$ or $B_i^t$ of $x_i^t$, and each node corresponds to one proposal in the proposal set. $E \subseteq V \times V$ is the set of edges and represents the relationships between proposals within an image or a video frame. Intuitively, two spatially neighboring proposals are likely to represent the same object and should be highly correlated. Intersection Over Union (IoU) is a broadly used metric that measures the spatial correlation of two proposals by their bounding box coordinates. Hence, we employ IoU to construct a spatial-aware adjacency matrix $A$, formulated as

$$A_{j,k} = \mathrm{IoU}(b_{i,j}, b_{i,k}) = \frac{b_{i,j} \cap b_{i,k}}{b_{i,j} \cup b_{i,k}}, \tag{4}$$

where $A_{j,k}$ is the element in the $j$-th row and the $k$-th column of $A$, and $b_{i,j}$ and $b_{i,k}$ denote the $j$-th and $k$-th region proposals in $x_i^f$ or $x_i^t$, respectively. Although some irrelevant edges have been removed by using spatial information, the resulting graph still over-propagates information and leads to confused proposal features. To solve this problem, we impose constraints on graph sparsity by retaining the edge between region proposals $b_{i,j}$ and $b_{i,k}$ only if $\mathrm{IoU}(b_{i,j}, b_{i,k})$ is greater than a threshold $\theta_{spt}$. In other words, for a node $b_{i,j}$, we select some relevant nodes as its neighborhood, formulated as

$$\mathrm{Neighbour}(b_{i,j}) = \{\, b_{i,k} \mid \mathrm{IoU}(b_{i,j}, b_{i,k}) > \theta_{spt} \,\}. \tag{5}$$
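A minimal sketch of Eqs. (4)–(5) is given below, assuming proposals are provided as (x1, y1, x2, y2) boxes; the function name and the default threshold are illustrative only (the thresholds actually used are discussed in Sect. 4.5).

```python
import torch

def spatial_sparse_adjacency(boxes, theta_spt=0.4):
    """Build the IoU-based sparse adjacency of Eqs. (4)-(5).

    boxes: (N, 4) tensor of proposals in (x1, y1, x2, y2) format.
    Edges with IoU <= theta_spt are dropped; 0.4 follows the DA-Faster
    setting reported in Sect. 4.5 and is only a default here.
    """
    x1, y1, x2, y2 = boxes.unbind(dim=1)
    areas = (x2 - x1).clamp(min=0) * (y2 - y1).clamp(min=0)

    # pairwise intersection via broadcasting
    lt = torch.max(boxes[:, None, :2], boxes[None, :, :2])   # (N, N, 2)
    rb = torch.min(boxes[:, None, 2:], boxes[None, :, 2:])   # (N, N, 2)
    wh = (rb - lt).clamp(min=0)
    inter = wh[..., 0] * wh[..., 1]

    union = areas[:, None] + areas[None, :] - inter
    iou = inter / union.clamp(min=1e-6)                      # Eq. (4)

    # Eq. (5): keep only edges whose IoU exceeds the sparsity threshold
    return torch.where(iou > theta_spt, iou, torch.zeros_like(iou))
```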

As shown in Fig. 3, compared to a fully connected graph between region proposals, our spatial sparse graph picks up relevant neighborhoods for each node, which greatly reduces the noise of irrelevant nodes and leads to low computation cost.

Inter-image Temporal Sparse Graph Construction. An inter-image temporal sparse graph is constructed to model the instance relationships on the temporal dimension, where the optical flow between consecutive video frames is utilized to improve the coordinates of region proposals from neighboring frames. Given two adjacent video frames $x_{i-1}^t$ and $x_i^t$ with their region proposals $B_{i-1}^t$ and $B_i^t$, the temporal region proposal propagation is formulated by

$$B_{i-1\to i}^t = \mathrm{flowbox}(B_{i-1}^t, F^{i-1\to i}), \tag{6}$$

$$\bar{B}_i^t = [B_{i-1\to i}^t,\, B_i^t], \tag{7}$$

where $F^{i-1\to i}$ is the optical flow map between $x_{i-1}^t$ and $x_i^t$, and $\mathrm{flowbox}(\cdot)$ is a propagation function that generates pseudo proposals for $x_i^t$ by adding the mean flow vectors to the region proposals in $x_{i-1}^t$. After propagation, we construct an inter-image temporal sparse graph for the current frame $x_i^t$ with the region proposals $\bar{B}_i^t$. Different from the intra-image spatial sparse graph introduced above, the set of nodes $V$ contains not only the proposals from the frame itself but also the pseudo proposals from the adjacent frame. Accordingly, $E$ denotes the set of edges between these proposals. We consistently use the IoU metric to form an adjacency matrix $A$ and impose a sparsity constraint with a threshold $\theta_{tmp}$.
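The following sketch illustrates the $\mathrm{flowbox}(\cdot)$ propagation of Eqs. (6)–(7) under the assumption that the optical flow map has already been computed elsewhere; the per-box mean-flow shift follows the description above, while the exact details (flow estimator, box clipping) are not specified in the paper and are assumptions here.

```python
import torch

def flowbox(prev_boxes, flow):
    """Shift proposals of frame t-1 to frame t by the mean optical flow inside each box.

    prev_boxes: (N, 4) proposals of frame x_{t-1} in (x1, y1, x2, y2) format.
    flow:       (2, H, W) precomputed flow map F^{t-1 -> t} (dx, dy per pixel).
    """
    shifted = prev_boxes.clone()
    _, H, W = flow.shape
    for k, (x1, y1, x2, y2) in enumerate(prev_boxes.round().long().tolist()):
        x1, x2 = max(x1, 0), min(x2, W - 1)
        y1, y2 = max(y1, 0), min(y2, H - 1)
        if x2 <= x1 or y2 <= y1:
            continue  # degenerate box: keep it unchanged
        region = flow[:, y1:y2 + 1, x1:x2 + 1]
        dx, dy = region[0].mean(), region[1].mean()  # mean flow vector inside the box
        shifted[k] += torch.stack([dx, dy, dx, dy])
    return shifted

# Eq. (7): concatenate the pseudo proposals with the current-frame proposals, e.g.
# all_boxes = torch.cat([flowbox(prev_boxes, flow), cur_boxes], dim=0)
```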

Spatial-Temporal Message Propagation. By utilizing the intra-image spatial sparse graph and the inter-image temporal sparse graph constructed above, we conduct spatial-temporal message propagation (STMP) to achieve accurate instance representations. With the assistance of the adjacency matrix $A \in \mathbb{R}^{N^b \times N^b}$ ($N^b = 256$ for the spatial graph and $N^b = 512$ for the temporal graph), the proposal features $F \in \mathbb{R}^{N^b \times d}$ ($d$ is the dimension of a proposal feature), taken either from the intermediate domain or from the target domain, are aggregated by

$$\tilde{F} = D^{-\frac{1}{2}} A D^{-\frac{1}{2}} F, \tag{8}$$

where $D$, with $D_{j,j} = \sum_{k} A_{j,k}$, is the diagonal degree matrix of $A$. After aggregating the intra-image and inter-image adjacent proposals, the proposal features $\tilde{F} \in \mathbb{R}^{N^b \times d}$ are discriminative enough and express more accurate instance-level information, especially for low image quality cases. Compared to the conventional graph convolution [12], we leave the trainable weight matrix $W$ out. After STMP, we use $p_{i,j}$ to denote the output of the instance-level domain classifier $D_{ins}$ for the $j$-th region proposal in the $i$-th image in the intermediate domain or the target domain. The instance-level alignment loss (i.e., Eq. (3)) is rewritten as

$$L_{ins}^{STMP} = -\sum_{i,j}\left[d_i \log p_{i,j} + (1 - d_i)\log\left(1 - p_{i,j}\right)\right]. \tag{9}$$
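Eq. (8) amounts to a single symmetrically normalized aggregation without a trainable weight matrix; a minimal sketch (function name ours) is:

```python
import torch

def propagate(adj, feats):
    """Spatial-temporal message propagation of Eq. (8): F_tilde = D^{-1/2} A D^{-1/2} F.

    adj:   (N, N) sparse-thresholded IoU adjacency matrix A
    feats: (N, d) RoI features F of the proposals
    """
    deg = adj.sum(dim=1)                               # D_jj = sum_k A_jk
    d_inv_sqrt = deg.clamp(min=1e-6).pow(-0.5)
    norm_adj = d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]
    return norm_adj @ feats                            # aggregated features F_tilde
```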

Objective Function. The overall objective function for jointly learning the domain adaptive Faster R-CNN and spatial-temporal message propagation is formulated by

$$L_{DA\text{-}STMP} = L_{det} + \lambda\,(L_{img} + L_{ins}^{STMP}), \tag{10}$$


where λ is a balance hyper-parameter to control the relative importance of detection and adaptation, and is set to 0.1 in our experiments.
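For completeness, the combination in Eq. (10) reduces to a weighted sum of the three loss terms; a minimal illustration (function name ours) is:

```python
def overall_loss(det_loss, img_da_loss, ins_da_loss, lam=0.1):
    """Eq. (10): detection loss plus lambda-weighted image- and instance-level
    adaptation losses; lam = 0.1 follows the setting stated above."""
    return det_loss + lam * (img_da_loss + ins_da_loss)
```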

4 Experiment

4.1 Datasets

To evaluate the effectiveness of our method, we conduct experiments on three public datasets, including PASCAL-VOC (VOC) [6], COCO [14], and Youtube-Objects (YTO) [15]. With these three datasets, we construct two image-to-video transfer tasks: VOC→YTO and COCO→YTO.

VOC→YTO. The VOC dataset has 20 categories and consists of about 5,000 training images with bounding box annotations. The YTO dataset is a sparsely annotated video frame dataset for video object detection, which has about 4,300 frames for training and 1,800 frames for testing with 10 categories. There are 10 common categories between VOC and YTO, including “aeroplane”, “bird”, “boat”, “car”, “cat”, “cow”, “dog”, “horse”, “bike”, and “train”. The images of the 10 common categories in the VOC dataset are used as the source image domain. We use the unannotated sparse training frames, associated with their six adjacent frames, as the target video domain.

COCO→YTO. The COCO dataset is a large-scale real-world image dataset with 80 object categories. We randomly select 4,000 images of the 10 common categories shared between COCO and YTO from the training set as the source domain. For the target domain, we use the same setting as in the VOC→YTO task.

4.2 Experiment Settings

Baselines and Comparison Methods. In our experiments, DA-Faster [4] and SW-Faster [17] are adopted as our baseline detectors. The modified overall objectives of DA-Faster and SW-Faster are both equipped with Eq. (9). We also compare our method with Faster R-CNN [16], which directly adapts the model trained on images to videos, and with several other image-to-video object detection methods [2,13,20].

Implementation Details. The instance-level domain classifier $D_{ins}$ for “SW-Faster” is constructed by three fully connected layers (4096 → 100 → 100 → 2), and the first two layers are activated by the ReLU [8] function. For the domain classifiers in “SW-Faster” and “DA-Faster”, we follow the settings in the original papers [4,17]. The learning ratio of each domain classifier to the backbone network of Faster R-CNN is set to 1:1, i.e., the parameter of the GRL layer is set to 1. We adopt VGG-16 [18] pretrained on ImageNet [5] as the backbone of Faster R-CNN, and finetune the overall network with a learning rate of $1 \times 10^{-3}$ for 50k iterations, which is then reduced to $1 \times 10^{-4}$ for another 30k iterations. Each batch consists of one image from the intermediate domain and one video frame from the target domain. We employ RoIAlign (i.e., “RoIAlign” in Fig. 2) for RoI feature extraction.


As for the training of CycleGAN, we set the batch size to 1 and adopt the Adam optimizer [11] with a momentum of 0.5 and an initial learning rate of 0.0002. For evaluation, both the Average Precision (AP) of each category and the mean Average Precision (mAP) over all categories are computed with an IoU threshold of 0.5 for both transfer tasks.

Table 1. Experimental results (%) on the VOC→YTO task.

Methods                 Aero Bird Boat Car  Cat  Cow  Dog  Horse Bike Train mAP
Faster R-CNN [16]       75.0 90.4 37.3 71.4 58.0 52.8 49.1 42.2  62.8 39.2  57.8
CycleGAN [13]           78.5 97.2 31.5 72.3 66.3 59.7 45.9 43.6  66.3 49.4  61.1
SW-ICR-CCR [20]         79.3 95.2 36.9 75.6 58.7 61.4 38.7 45.4  66.6 42.6  60.0
SIR [2]                 78.9 95.3 31.1 66.1 61.3 56.3 48.1 42.7  64.6 34.4  57.9
DA-Faster [4]           77.1 96.8 31.8 72.5 60.3 59.4 39.6 43.0  63.7 40.3  58.4
DA-Faster-inter-STMP    81.2 95.4 42.6 71.4 66.5 72.0 48.2 49.3  67.1 47.1  64.1
SW-Faster [17]          81.2 95.2 29.4 74.4 56.4 55.3 41.2 44.3  66.5 40.9  58.5
SW-Faster-inter-STMP    84.4 95.8 39.2 75.1 64.9 61.7 53.8 50.6  68.4 52.9  64.7

Table 2. Experimental results (%) on the COCO→YTO task.

Methods                 Aero Bird Boat Car  Cat  Cow  Dog  Horse Bike Train mAP
Faster R-CNN [16]       57.9 91.4 29.6 68.9 51.9 51.4 64.2 55.2  63.2 52.3  58.6
CycleGAN [13]           75.5 87.8 37.6 69.4 52.3 62.8 61.0 57.8  64.4 58.0  62.7
SW-ICR-CCR [20]         69.8 90.6 34.6 72.4 54.9 61.1 58.0 58.3  66.4 47.5  61.4
SIR [2]                 65.6 91.6 25.8 67.6 47.6 60.7 54.8 54.3  60.9 50.1  57.9
DA-Faster [4]           84.9 93.3 32.2 71.6 61.7 66.9 47.8 47.9  64.8 43.6  61.5
DA-Faster-inter-STMP    78.3 92.7 41.8 69.7 60.1 64.5 54.7 55.4  63.8 63.3  64.4
SW-Faster [17]          71.7 90.7 29.9 71.6 53.0 60.8 59.3 58.8  60.7 56.8  61.3
SW-Faster-inter-STMP    81.8 91.3 30.1 70.2 64.5 60.5 58.2 55.6  64.3 67.4  64.4

4.3 Results

VOC→YTO. Table 1 shows the comparison results on YTO. First, our method outperforms all the compared methods on mAP, clearly demonstrating the effectiveness of our proposed method. Second, our proposed intermediate domain and spatial-temporal message propagation consistently boost the performance of the “DA-Faster” and “SW-Faster” detectors, with gains of 5.7% and 6.2% on mAP, respectively. Third, it is noteworthy that for some difficult categories, “DA-Faster” and “SW-Faster” perform worse than “Faster R-CNN”, probably because there exist false alignments across domains that limit the performance of the domain adaptive detectors. As for the false alignment categories in “SW-Faster” such as “boat”, “cat” and “dog”, “SW-Faster-inter-STMP” improves them by 9.8%, 8.5% and 12.6%, respectively. The performance drops a lot on the false alignment categories such as “boat” and “dog” in “DA-Faster”, and “DA-Faster-inter-STMP” can greatly improve these difficult categories by 10.8% and 8.6%, respectively.

Table 3. Ablation studies (%) on the VOC→YTO and COCO→YTO tasks.

Methods                 mAP (VOC→YTO)  mAP (COCO→YTO)
DA-Faster [4]           58.4           61.5
DA-Faster-inter         61.9           61.9
DA-Faster-inter-SMP     63.4           62.8
DA-Faster-inter-STMP    64.1           64.4
SW-Faster [17]          58.5           61.3
SW-Faster-inter         60.6           62.2
SW-Faster-inter-SMP     64.1           63.5
SW-Faster-inter-STMP    64.7           64.4

COCO→YTO. As shown in Table 2, our method outperforms all the compared methods on mAP. Moreover, we observe that our proposed framework improves “DA-Faster” and “SW-Faster” by 2.9% and 3.1%, respectively. Similar to the observation on the VOC→YTO task, we significantly improve the detection results of some false alignment categories, such as “bike” by 3.6% for “SW-Faster”, and “dog” by 6.9%, “horse” by 7.5%, and “train” by 19.7% for “DA-Faster”. In addition to these difficult categories, we further promote positive alignments in other simple categories, which validates the effectiveness of our method.

4.4 Ablation Study

To evaluate the effectiveness of each component, we conduct ablation studies on both the VOC→YTO and COCO→YTO tasks. The results are shown in Table 3, where “inter”, “SMP”, and “STMP” denote the intermediate domain, spatial message propagation, and spatial-temporal message propagation, respectively.

Effect of the Intermediate Domain: To evaluate the effect of the intermediate domain, we compare “DA-Faster-inter” with “DA-Faster” and “SW-Faster-inter” with “SW-Faster”. For the VOC→YTO task, we observe that “DA-Faster-inter” and “SW-Faster-inter” achieve 3.5% and 2.1% improvements over “DA-Faster” and “SW-Faster”, respectively. Similar improvements can be found for the COCO→YTO task, clearly demonstrating the effectiveness of the intermediate domain in promoting positive alignments between the source and target domains.

Effect of the Spatial Message Propagation: To evaluate the effectiveness of the spatial message propagation, we compare “DA-Faster-inter-SMP” with “DA-Faster-inter” and “SW-Faster-inter-SMP” with “SW-Faster-inter”. Their difference is whether the false instance alignments are handled by propagating spatial messages within an image or video frame. From the results, “DA-Faster-inter-SMP” achieves better results than “DA-Faster-inter” through spatial message propagation on both transfer tasks. Similar improvements are achieved with “SW-Faster” as the base detector. These improved results strongly validate the effectiveness of spatial message propagation in relieving the false instance alignments.

Fig. 4. Analysis of the spatial graph sparsity threshold $\theta_{spt}$ and the temporal graph sparsity threshold $\theta_{tmp}$ on the VOC→YTO task: (a) DA-Faster-inter-SMP, (b) SW-Faster-inter-SMP, (c) DA-Faster-inter-STMP, (d) SW-Faster-inter-STMP.

Effect of the Temporal Message Propagation: To evaluate the effectiveness of the temporal message propagation, we compare “DA-Faster-inter-STMP” with “DA-Faster-inter-SMP” and “SW-Faster-inter-STMP” with “SW-Faster-inter-SMP”. From the results shown in Table 3, “SW-Faster-inter-STMP” works better than “SW-Faster-inter-SMP” on both VOC→YTO and COCO→YTO. Also, “DA-Faster-inter-STMP” outperforms “DA-Faster-inter-SMP”. This validates that temporal message propagation contributes to better instance representations for improving the instance-level alignment.

4.5 Parameter Analysis

To analyze the influence of the spatial graph sparsity threshold $\theta_{spt}$ and the temporal graph sparsity threshold $\theta_{tmp}$ on spatial-temporal message propagation, we conduct experiments using “DA-Faster” and “SW-Faster” as the baseline detectors for the VOC→YTO task. We select $\theta_{spt}$ and $\theta_{tmp}$ in the range of {0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9} and show the mAP-threshold curves in Fig. 4, where the horizontal axis represents the value of $\theta_{spt}$ or $\theta_{tmp}$ and the vertical axis represents the mAP.

We conduct experiments on “DA-Faster-inter-SMP” and “SW-Faster-inter-SMP” to analyze the influence of $\theta_{spt}$. From the results in Fig. 4(a) and (b), we find that both small and large spatial graph sparsity thresholds $\theta_{spt}$ lead to a decrease in mAP. This is probably because the smaller the spatial graph sparsity threshold $\theta_{spt}$, the more noise is introduced, and the larger the spatial graph sparsity threshold $\theta_{spt}$, the less message is propagated between instances to form better instance representations. Based on the experimental results, we set $\theta_{spt} = 0.4$ for the base detector “DA-Faster” and $\theta_{spt} = 0.1$ for the base detector “SW-Faster” to balance message propagation and noise filtering. To analyze the influence of $\theta_{tmp}$, we use the $\theta_{spt}$ selected before to conduct experiments on “DA-Faster-inter-STMP” and “SW-Faster-inter-STMP”. From the results in Fig. 4(c) and (d), we select $\theta_{tmp} = 0.8$ and $\theta_{tmp} = 0.4$ as the sweet spots between message propagation and noise filtering for “DA-Faster” and “SW-Faster”, respectively. For the COCO→YTO task, we use the same $\theta_{spt}$ and $\theta_{tmp}$. Table 4 gives a detailed summary of the graph sparsity thresholds for the two base detectors.

Table 4. Setting of the spatial-temporal graph sparsity thresholds for both the VOC→YTO and COCO→YTO tasks.

Base detector   θspt   θtmp
DA-Faster       0.4    0.8
SW-Faster       0.1    0.5

Fig. 5. Detection examples on the COCO→YTO task: (a) missing detections, (b) redundant detections, (c) wrong detections. Bounding boxes in blue, red, and green denote the detection results of “SW-Faster”, “SW-Faster-inter-STMP” and the ground truth, respectively. (Best viewed in color.)

4.6 Qualitative Analysis

Figure 5 shows some detection examples on the COCO→YTO task by “SW-Faster” and “SW-Faster-inter-STMP (Ours)”. There are three examples of five consecutive frames from YTO. As shown in Fig. 5(a), the base object detector fails to detect the blurred “aeroplane”, while our method can partially recover the false negatives. This probably benefits from our proposed intermediate domain module that reduces the domain discrepancy. As shown in Fig. 5(b) and (c), the base object detector misclassifies a tall building in the background as a “boat” and misclassifies the foreground “cat” as a “horse”. In contrast, our method corrects these redundant and wrong detections. This is probably because our proposed spatial-temporal message propagation module can successfully relieve the false instance alignments. In general, our method achieves more accurate detection results under domain shift with poor image quality.

5 Conclusion

In this paper, we have presented a novel spatial-temporal graph to exploit spatial-temporal contextual relationships between object instances for alleviating the false instance alignments in image-to-video object detection. The generated intermediate domain can bridge the source image domain and the target video domain. With this intermediate domain and the target video domain, an intra-image spatial sparse graph and an inter-image temporal sparse graph have been constructed to enable the spatial-temporal message propagation, which can enrich the instance representations under the guidance of the spatial-temporal context. Extensive experiments on several datasets have demonstrated the effectiveness of our method.

Acknowledgments. This work was supported in part by the Natural Science Foundation of China (NSFC) under Grant No. 62072041.


References

1. Chen, C., Zheng, Z., Ding, X., Huang, Y., Dou, Q.: Harmonizing transferability and discriminability for adapting object detectors. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 8869–8878 (2020)
2. Chen, J., Wu, X., Duan, L., Chen, L.: Sequential instance refinement for cross-domain object detection in images. IEEE Trans. Image Process. (TIP) 30, 3970–3984 (2021)
3. Chen, Y., Cao, Y., Hu, H., Wang, L.: Memory enhanced global-local aggregation for video object detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 10337–10346 (2020)
4. Chen, Y., Li, W., Sakaridis, C., Dai, D., Van Gool, L.: Domain adaptive faster R-CNN for object detection in the wild. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3339–3348 (2018)
5. Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., Fei-Fei, L.: ImageNet: a large-scale hierarchical image database. In: 2009 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 248–255. IEEE (2009)
6. Everingham, M., Van Gool, L., Williams, C.K., Winn, J., Zisserman, A.: The pascal visual object classes (VOC) challenge. Int. J. Comput. Vis. (IJCV) 88(2), 303–338 (2010)
7. Ganin, Y., Lempitsky, V.: Unsupervised domain adaptation by backpropagation. In: International Conference on Machine Learning (ICML), pp. 1180–1189. PMLR (2015)
8. Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323. JMLR Workshop and Conference Proceedings (2011)
9. Han, W., et al.: Seq-NMS for video object detection. arXiv preprint arXiv:1602.08465 (2016)
10. Kang, K., et al.: T-CNN: tubelets with convolutional neural networks for object detection from videos. IEEE Trans. Circ. Syst. Video Technol. (TCSVT) 28(10), 2896–2907 (2017)
11. Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
12. Kipf, T.N., Welling, M.: Semi-supervised classification with graph convolutional networks. arXiv preprint arXiv:1609.02907 (2016)
13. Lahiri, A., Ragireddy, S.C., Biswas, P., Mitra, P.: Unsupervised adversarial visual level domain adaptation for learning video object detectors from images. In: 2019 IEEE Winter Conference on Applications of Computer Vision (WACV), pp. 1807–1815. IEEE (2019)
14. Lin, T.-Y., et al.: Microsoft COCO: common objects in context. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8693, pp. 740–755. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-10602-1_48
15. Prest, A., Leistner, C., Civera, J., Schmid, C., Ferrari, V.: Learning object class detectors from weakly annotated video. In: 2012 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3282–3289. IEEE (2012)
16. Ren, S., He, K., Girshick, R., Sun, J.: Faster R-CNN: towards real-time object detection with region proposal networks. In: Advances in Neural Information Processing Systems (NeurIPS), vol. 28, pp. 91–99 (2015)


17. Saito, K., Ushiku, Y., Harada, T., Saenko, K.: Strong-weak distribution alignment for adaptive object detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 6956–6965 (2019)
18. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)
19. Wang, S., Zhou, Y., Yan, J., Deng, Z.: Fully motion-aware network for video object detection. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 542–557 (2018)
20. Xu, C.D., Zhao, X.R., Jin, X., Wei, X.S.: Exploring categorical regularization for domain adaptive object detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 11724–11733 (2020)
21. Zhu, J.Y., Park, T., Isola, P., Efros, A.A.: Unpaired image-to-image translation using cycle-consistent adversarial networks. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV), pp. 2223–2232 (2017)
22. Zhu, X., Pang, J., Yang, C., Shi, J., Lin, D.: Adapting object detectors via selective cross-domain alignment. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 687–696 (2019)
23. Zhu, X., Wang, Y., Dai, J., Yuan, L., Wei, Y.: Flow-guided feature aggregation for video object detection. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV), pp. 408–417 (2017)
