Computer Vision – ECCV 2022 Workshops: Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part V (ISBN 3031250710, 9783031250712)

The 8-volume set, comprising the LNCS books 13801 to 13809, constitutes the refereed proceedings of 38 out of the 60 workshops held at the 17th European Conference on Computer Vision, ECCV 2022.

English, 796 pages, published 2023.


Table of contents:
Foreword
Preface
Organization
Contents – Part V
W18 - Challenge on Mobile Intelligent Photography and Imaging
MIPI 2022 Challenge on RGB+ToF Depth Completion: Dataset and Report
1 Introduction
2 Challenge
2.1 Problem Definition
2.2 Dataset: TetrasRGBD
2.3 Challenge Phases
2.4 Scoring System
2.5 Running Time Evaluation
3 Challenge Results
4 Challenge Methods
4.1 ZoomNeXt
4.2 GAMEON
4.3 Singer
4.4 NPU-CVR
4.5 JingAM
4.6 Anonymous
4.7 MainHouse113
4.8 UCLA Vision Lab
5 Conclusions
A Teams and Affiliations
References
MIPI 2022 Challenge on Quad-Bayer Re-mosaic: Dataset and Report
1 Introduction
2 Challenge
2.1 Problem Definition
2.2 Dataset: Tetras-Quad
2.3 Challenge Phases
2.4 Scoring System
3 Challenge Results
4 Challenge Methods
4.1 MegNR
4.2 IMEC-IPI & NPU-MPI
4.3 HITZST01
4.4 BITSpectral
4.5 JHC-SJTU
4.6 Op-summer-po
5 Conclusions
A Teams and Affiliations
References
MIPI 2022 Challenge on RGBW Sensor Re-mosaic: Dataset and Report
1 Introduction
2 Challenge
2.1 Problem Definition
2.2 Dataset: Tetras-RGBW-RMSC
2.3 Challenge Phases
2.4 Scoring System
3 Challenge Results
4 Challenge Methods
4.1 Op-summer-po
4.2 HIT-IIL
4.3 Eating, Drinking and Playing
5 Conclusions
A Teams and Affiliations
References
MIPI 2022 Challenge on RGBW Sensor Fusion: Dataset and Report
1 Introduction
2 Challenge
2.1 Problem Definition
2.2 Dataset: Tetras-RGBW-Fusion
2.3 Challenge Phases
2.4 Scoring System
3 Challenge Results
4 Challenge Methods
4.1 BITSpectral
4.2 BIVLab
4.3 HIT-IIL
4.4 Jzsherlock
4.5 LLCKP
4.6 MegNR
5 Conclusions
A Teams and Affiliations
References
MIPI 2022 Challenge on Under-Display Camera Image Restoration: Methods and Results
1 Introduction
2 MIPI 2022 Under-Display Camera Image Restoration
2.1 Datasets
2.2 Evaluation
2.3 Challenge Phase
3 Challenge Results
4 Challenge Methods and Teams
5 Conclusions
References
Continuous Spectral Reconstruction from RGB Images via Implicit Neural Representation
1 Introduction
2 Related Work
2.1 Implicit Neural Representation
2.2 Spectral Reconstruction from RGB Images
3 Neural Spectral Reconstruction
3.1 Overview
3.2 Spectral Profile Interpolation
3.3 Neural Attention Mapping
3.4 Loss Function
4 Experiments
4.1 Comparison to State-of-the-Art Methods
4.2 Continuous Spectral Reconstruction
4.3 Extreme Spectral Reconstruction
4.4 Spectral Super Resolution
4.5 Ablation Studies
5 Conclusion
References
Event-Based Image Deblurring with Dynamic Motion Awareness
1 Introduction
2 Related Work
3 Method
3.1 System Overview
3.2 Deblur Module
3.3 Multi-scale Coarse-to-Fine Approach
3.4 Loss Function
4 RGBlur+E Dataset
5 Experiments
5.1 Experimental Settings
5.2 Ablation of the Effectiveness of the Components
5.3 Comparison with State-of-the-Art Models
6 Conclusions
References
UDC-UNet: Under-Display Camera Image Restoration via U-shape Dynamic Network
1 Introduction
2 Related Work
2.1 UDC Restoration
2.2 Image Restoration
3 Methodology
3.1 Problem Formulation
3.2 Network Structure
3.3 Loss Function
4 Experiments
4.1 Experimental Setup
4.2 Ablation Study
4.3 Comparison with State-of-the-Art Methods
5 Results of the MIPI UDC Image Restoration Challenge
6 Conclusions
References
Enhanced Coarse-to-Fine Network for Image Restoration from Under-Display Cameras
1 Introduction
2 Related Work
3 Method
3.1 Enhanced Encoder
3.2 Enhanced Decoder
3.3 Cross-Gating Fusion Module
3.4 Loss Functions
4 Experiments
4.1 Implementation Details
4.2 Dataset
4.3 Evaluation Metrics
4.4 Comparisons
4.5 Ablation Study
5 Conclusions
References
Learning to Joint Remosaic and Denoise in Quad Bayer CFA via Universal Multi-scale Channel Attention Network
1 Introduction
2 Related Work
2.1 Color Filter Arrays
2.2 Denoise Raw Images
3 Proposed Method
3.1 Problem Formulation
3.2 Network Structure
3.3 Overall Framework
3.4 Loss Function
4 Experiment
4.1 Datasets
4.2 Evaluation Metrics
4.3 Implementation Details
4.4 Testing Results of MIPI 2022 Challenge on Quad Joint Remosaic and Denoise
4.5 Ablation Study
4.6 Qualitative Evaluation
5 Conclusion
References
Learning an Efficient Multimodal Depth Completion Model
1 Introduction
2 Related Work
2.1 Unguided Depth Completion
2.2 RGB-Guided Depth Completion
3 Methods
3.1 Global and Local Depth Prediction with Fusion
3.2 Funnel Convolutional Spatial Propagation Network
3.3 Loss Functions
4 Experiments
4.1 Datasets and Metrics
4.2 Implementational Details
4.3 Experimental Results
5 Conclusions
References
Learning Rich Information for Quad Bayer Remosaicing and Denoising
1 Introduction
2 Related Works
2.1 Denoising
2.2 ISP and Demosaicing
3 Proposed Method
3.1 Quad Bayer Pre-processing
3.2 Network for Jointly Remosaicing and Denoising
3.3 Two-Stage Training Strategy
4 Experimental Results
4.1 Dataset
4.2 Implementation Details
4.3 Evaluation Metrics
4.4 Ablation Study
4.5 Model Complexity and Runtime
4.6 Challenge Submission
4.7 Limitations
5 Conclusions
References
Depth Completion Using Laplacian Pyramid-Based Depth Residuals
1 Introduction
2 Related Work
3 Our Approach
3.1 Network Structure
3.2 Global-Local Refinement Network
3.3 Affinity Decay Spatial Propagation Network
3.4 The Training Loss
4 Experiments
4.1 Implementation Details
4.2 Datasets
4.3 Metrics
4.4 Evaluation on KITTI Dataset
4.5 Ablation Study on ToF Synthetic Dataset
5 Conclusion
References
W19 - Challenge on People Analysis: From Face, Body and Fashion to 3D Virtual Avatars
PSUMNet: Unified Modality Part Streams Are All You Need for Efficient Pose-Based Action Recognition
1 Introduction
2 Related Work
3 Methodology
3.1 Part Stream Factorization
3.2 PSUMNet
3.3 Multi Modality Data Generator (MMDG)
3.4 Spatio Temporal Relational Module (STRM)
4 Experiments
4.1 Datasets
4.2 Implementation and Optimization Details
4.3 Results
4.4 Analysis
4.5 Ablations
5 Conclusion
References
YOLO5Face: Why Reinventing a Face Detector
1 Introduction
2 Related Work
2.1 Object Detection
2.2 Face Detection
2.3 YOLO
3 YOLO5Face Face Detector
3.1 Network Architecture
3.2 Landmark Regression
3.3 Stem Block Structure
3.4 SPP with Smaller Kernels
3.5 P6 Output Block
3.6 ShuffleNetV2 as Backbone
4 Experiments
4.1 Dataset
4.2 Implementation Details
4.3 Ablation Study
4.4 YOLO5Face for Face Recognition
4.5 YOLO5Face on WiderFace Dataset
4.6 YOLO5Face on FDDB Dataset
5 Conclusion
References
Counterfactual Fairness for Facial Expression Recognition
1 Introduction
2 Literature Review
2.1 Fairness in Machine Learning
2.2 Facial Affect Fairness
2.3 Counterfactuals and Bias
3 Methodology
3.1 Notation and Problem Definition
3.2 Counterfactual Fairness
3.3 Counterfactual Image Generation
3.4 Baseline Approach
3.5 Pre-processing: Data Augmentation with Counterfactuals
3.6 In-processing: Contrastive Counterfactual Fairness
3.7 Post-processing: Reject Option Classification
4 Experimental Setup
4.1 Dataset
4.2 Implementation and Training Details
4.3 Evaluation Measures
5 Results
5.1 An Analysis of Dataset Bias and Counterfactual Bias
5.2 Bias Mitigation Results with Counterfactual Images
6 Conclusion and Discussion
References
Improved Cross-Dataset Facial Expression Recognition by Handling Data Imbalance and Feature Confusion
1 Introduction
2 Related Work
3 Problem Definition and Notations
4 Proposed Approach
4.1 Handling Data Imbalance in DIFC
4.2 Handling Feature Confusion in DIFC
4.3 Choosing Baseline UDA Approach
4.4 Integrating DIFC with Baseline UDA
5 Experiments
5.1 Datasets Used
5.2 Implementation Details
5.3 Results on Benchmark Datasets
5.4 Additional Analysis
6 Conclusion
References
Video-Based Gait Analysis for Spinal Deformity
1 Introduction
1.1 State-of-the-Art Review
1.2 Motivation and Contribution
2 Proposed System
2.1 Pose Estimation
2.2 BiLSTM-Based Network
3 Spinal Deformity Dataset
4 Results
5 Conclusions
References
TSCom-Net: Coarse-to-Fine 3D Textured Shape Completion Network
1 Introduction
2 Related Works
3 Proposed Approach – TSCom-Net
3.1 Joint Shape and Texture Completion
3.2 Texture Refinement
4 Experimental Results
4.1 Network Training Details
4.2 Results and Evaluation
4.3 Ablation Study
5 Conclusion
References
Deep Learning-Based Assessment of Facial Periodic Affect in Work-Like Settings
1 Introduction
2 A Protocol for Human Facial Behaviour Data Acquisition in Work-Like Settings
3 The WorkingAge Facial Behaviour Dataset
4 Periodical Facial Affect Recognition
5 Experiments
5.1 Experimental Setup
5.2 Baseline Results of Leave-One-Site-Out Cross-Validation
5.3 Ablation Studies
6 Conclusion
References
Supervision by Landmarks: An Enhanced Facial De-occlusion Network for VR-Based Applications
1 Introduction
2 Related Work
2.1 Facial De-occlusion and HMD Removal Methods
2.2 Structure-Guided Image Inpainting
3 Proposed Method
3.1 The Architecture
3.2 Spatial Supervision Using Landmarks
3.3 Loss Functions
4 Experiments and Results
4.1 Dataset and Training Settings
4.2 Results
5 Ablation Studies
6 Conclusion
References
Consistency-Based Self-supervised Learning for Temporal Anomaly Localization
1 Introduction
2 Related Work
3 Proposed Method
3.1 Model
3.2 Training Objective
3.3 Temporal Proposal
4 Experiments
5 Conclusions
References
Perspective Reconstruction of Human Faces by Joint Mesh and Landmark Regression
1 Introduction
2 Our Method
2.1 3D Face Geometric Reconstruction
2.2 6DoF Estimation
3 Experimental Results
3.1 Dataset
3.2 Evaluation Metrics
3.3 Implementation Details
3.4 Ablation Study
3.5 Benchmark Results
3.6 Result Visualization
4 Conclusion
References
Pixel2ISDF: Implicit Signed Distance Fields Based Human Body Model from Multi-view and Multi-pose Images
1 Introduction
2 Related Work
3 Methodology
3.1 Multi-view and Multi-pose Image Feature Encoding
3.2 Structured Latent Codes
3.3 Implicit Neural Shape Field
3.4 Loss Function
4 Experiments
4.1 Datasets
4.2 Image Preprocessing
4.3 SMPLX Estimation and Optimization
4.4 Implementation Details
4.5 Results
5 Conclusion
References
UnconFuse: Avatar Reconstruction from Unconstrained Images
1 Method
1.1 Data Preprocessing
1.2 Model Design and Configuration
2 Results
3 Discussion
References
HiFace: Hybrid Task Learning for Face Reconstruction from Single Image
1 Introduction
2 Related Work
2.1 3DMM Based
2.2 Model-Free
3 Method
3.1 Architecture
3.2 Vertex Refinement Module
3.3 Multi-task Learning Module
4 Experiments
4.1 Dataset
4.2 Implementation Details
4.3 Comparisons and Ablation Studies
4.4 Discussions
4.5 Conclusion
References
Multi-view Canonical Pose 3D Human Body Reconstruction Based on Volumetric TSDF
1 Introduction
2 Dataset
2.1 Data Analysis
2.2 Data Preprocessing
3 Method
3.1 Network Architecture
3.2 TSDF
3.3 Loss Function
3.4 Implementation Details
4 Experiment
5 Conclusion
References
End to End Face Reconstruction via Differentiable PnP
1 Introduction
2 Methodology
2.1 Data Preparation
2.2 Facial Landmark Detection
2.3 3D Face Reconstruction
2.4 PnPLoss
2.5 Inference Phase
3 Experiments
3.1 Experimental Details
3.2 Head Pose Estimation
3.3 Face Reconstruction
3.4 Finetuning with PnP Layer
3.5 Qualitative Results
4 Tricks
4.1 Flip and Merge
5 Conclusions
References
W20 - Safe Artificial Intelligence for Automated Driving
One Ontology to Rule Them All: Corner Case Scenarios for Autonomous Driving
1 Introduction
2 Related Work
3 Method
3.1 Master Ontology
3.2 Scenario Ontology Generation
3.3 Scenario Simulation
4 Evaluation
4.1 Scenario Ontologies
5 Conclusion
References
Parametric and Multivariate Uncertainty Calibration for Regression and Object Detection
1 Introduction
2 Definitions for Regression Calibration and Related Work
3 Joint Parametric Regression Calibration
4 Measuring Miscalibration
5 Experiments
6 Conclusion
References
Reliable Multimodal Trajectory Prediction via Error Aligned Uncertainty Optimization
1 Introduction
2 Related Work
3 Setup for Vehicle Trajectory Prediction
4 Error Aligned Uncertainty Optimization
5 Experiments and Results
5.1 Multimodal Vehicle Trajectory Prediction
5.2 UCI Regression
5.3 Hyperparameter Selection
6 Conclusions
References
PAI3D: Painting Adaptive Instance-Prior for 3D Object Detection
1 Introduction
2 Related Work
2.1 LiDAR Based 3D Object Detection
2.2 Multi-modal Fusion Based 3D Object Detection
3 Painting Adaptive Instance for Multi-modal 3D Object Detection
3.1 Instance Painter
3.2 Adaptive Projection Refiner
3.3 Additional Enhancements for 3D Object Detection
4 Experiments
4.1 The NuScenes Dataset
4.2 Training and Evaluation Settings
4.3 nuScenes Test Set Results
4.4 Ablation Study
5 Conclusions
References
Validation of Pedestrian Detectors by Classification of Visual Detection Impairing Factors
1 Introduction
2 Related Works
3 Methodology
3.1 Synthetic Data Generation
3.2 Visual Detection Impairing Factors
3.3 Classification of Detectable Pedestrians
4 Results and Discussion
4.1 Evaluation of Pedestrian Detection Data Biases
4.2 Detection Impact of Visual Impairment Factors
5 Conclusions
References
Probing Contextual Diversity for Dense Out-of-Distribution Detection
1 Introduction
2 Related Work
3 Multi-head Context Networks
3.1 Contextual Diversity in Semantic Segmentation Networks
3.2 Probing Contextual Diversity
3.3 Out-of-Distribution Detection with MOoSe
4 Experiments
4.1 Datasets and Benchmarks
4.2 Evaluation Metrics
4.3 Experimental Setup
4.4 Comparison with Ensembles
4.5 Comparison with the State of the Art
5 Analysis
5.1 Quantifying Diversity: Variance and Mutual Information
5.2 Context as a Source of Diversity
5.3 Effect of Contextual Diversity on OoD Detection
6 Conclusion
References
Adversarial Vulnerability of Temporal Feature Networks for Object Detection
1 Introduction
2 Related Work
2.1 Adversarial Attacks
2.2 Adversarial Training
3 Adversarial Attacks
3.1 Threat Model
3.2 Adversarial Training
4 Experimental Setup
4.1 Dataset
4.2 Baselines
4.3 Adversarial Noise Attack
4.4 Adversarial Patch Attack
4.5 Adversarial Training
5 Evaluation
5.1 Attacks on 1-Class Vs. on 2-Class Baselines
5.2 Impact of Temporal Horizon
5.3 Robustness of the Adversarially Trained Models
5.4 Robustness of the AT-Trained Models Against Per-instance Attacks
6 Conclusion
References
Towards Improved Intermediate Layer Variational Inference for Uncertainty Estimation
1 Introduction
2 Related Work
2.1 State-of-the-Art Uncertainty Estimation Approaches
2.2 Dirichlet Distributions for Uncertainty Estimation
2.3 Out-of-distribution Detection
3 Methodology
3.1 Dirichlet Models
3.2 Proposed Loss Function
4 Experiments and Results
4.1 OoD Methodology
4.2 Full Training with All Classes
5 Conclusion and Future Work
References
Explainable Sparse Attention for Memory-Based Trajectory Predictors
1 Introduction
2 Related Works
2.1 Trajectory Prediction
2.2 Memory and Attention
3 Memory-Based Trajectory Predictors
4 Method
4.1 Sparsemax vs Softmax
5 Experiments
5.1 Evaluation Metrics and Datasets
5.2 Results
5.3 Explainability
6 Conclusions
References
Cycle-Consistent World Models for Domain Independent Latent Imagination
1 Introduction
2 Preliminaries
3 Cycle-Consistent World Models
4 Related Work
5 Experiments
6 Conclusion
References
W21 - Real-World Surveillance: Applications and Challenges
Strengthening Skeletal Action Recognizers via Leveraging Temporal Patterns
1 Introduction
2 Related Work
2.1 Recognizing Skeleton-Based Actions
2.2 Capturing Chronological Information
2.3 Noise Alleviation
3 Proposed Approach
3.1 Discrete Cosine Encoding
3.2 Chronological Loss Function
3.3 Framework
4 Experiments
4.1 Datasets
4.2 Experimental Setups
4.3 Ablation Studies
4.4 Improvement Analysis
4.5 Noise Alleviation
4.6 Compatibility with Existing Models
4.7 Comparison with SOTA Accuracy
5 Conclusion
References
Which Expert Knows Best? Modulating Soft Learning with Online Batch Confidence for Domain Adaptive Person Re-Identification
1 Introduction
2 Related Work
3 Method
3.1 Supervised Pre-training on Source Domain
3.2 Unsupervised Adaptation on Target Domain
4 Experiments
4.1 Datasets and Evaluation
4.2 Architectures and Setup
4.3 Comparison with State-of-the-Art Methods
4.4 Ablation Study
5 Conclusions
References
Cross-Modality Attention and Multimodal Fusion Transformer for Pedestrian Detection
1 Introduction
2 Related Work
2.1 Multimodal Pedestrian Detection
2.2 Multimodal Transformers
3 Proposed Method
3.1 Cross-Modality Attention Transformer
3.2 Multimodal Fusion
4 Experiments
4.1 Dataset and Implementation Details
4.2 Quantitative Results
4.3 Ablation Study
5 Conclusions
References
See Finer, See More: Implicit Modality Alignment for Text-Based Person Retrieval
1 Introduction
2 Related Work
3 Methodology
3.1 Overview
3.2 Unified Visual-Textual Network
3.3 Implicit Semantic Alignment
3.4 Loss Function
4 Experiment
4.1 Experimental Setup
4.2 Comparison with State-of-the-Art Methods
4.3 Ablation Study
4.4 Qualitative Results
4.5 Computational Efficiency Analysis
5 Discussion
6 Conclusion
7 Broader Impact
References
Look at Adjacent Frames: Video Anomaly Detection Without Offline Training
1 Introduction
2 Related Work
3 Proposed Solution
3.1 Workflow
3.2 Incremental Learner
4 Experiments
4.1 Performance
4.2 Further Studies
5 Conclusion
References
SOMPT22: A Surveillance Oriented Multi-pedestrian Tracking Dataset
1 Introduction
2 Related Work
2.1 MOT Methods
2.2 Datasets
2.3 Person Detection Datasets
3 Problem Description
4 SOMPT22 Dataset
4.1 Dataset Construction
4.2 Dataset Statistics
4.3 Evaluation Metrics
5 Experiments
5.1 Experiment Setup
5.2 Benchmark Results
6 Conclusion and Future Work
References
Detection of Fights in Videos: A Comparison Study of Anomaly Detection and Action Recognition
1 Introduction
2 Related Work
2.1 General Anomaly Detection
2.2 Fight Detection Using Action Recognition
3 Proposed Methods
3.1 Action Recognition of Fights
3.2 Anomaly Detection of Fights
3.3 Iterative Anomaly Detection and Action Recognition
4 Experiments
4.1 Datasets
4.2 Implementation Details
4.3 Action Recognition Results
4.4 Anomaly Detection or Action Recognition?
4.5 Iterative Anomaly Detection and Action Recognition
4.6 Comparison with SOTA Results
5 Conclusion
References
Privacy-Preserving Person Detection Using Low-Resolution Infrared Cameras
1 Introduction
2 Related Work
3 Person Detection with Different Levels of Supervision
3.1 Detection Through Thresholding
3.2 Unsupervised Anomaly-Based Detection Using Auto-encoders
3.3 Weakly-Supervised Detection Using Class Activation Mapping
3.4 Fully-Supervised Detection Using Single Shot Detectors
4 Experimental Methodology
4.1 Datasets
4.2 Implementation Details
4.3 Performance Metrics
5 Results and Discussion
6 Conclusions
References
Gait Recognition from Occluded Sequences in Surveillance Sites
1 Introduction
2 Related Work
3 Proposed Approach
3.1 Occlusion Detection Using VGG-16
3.2 Occlusion Reconstruction Through RGait-Net
4 Experimental Analysis
4.1 Training Details
4.2 Results
4.3 Comparative Analysis
5 Conclusions and Future Work
References
Visible-Infrared Person Re-Identification Using Privileged Intermediate Information
1 Introduction
2 Related Work
3 Proposed Method
3.1 LUPI Framework
3.2 Intermediate Domain Generation
3.3 Loss Functions
4 Results and Discussion
4.1 Experimental Methodology
4.2 Comparison with the State-of-Art
4.3 Ablation Study
5 Conclusions
References
Video in 10 Bits: Few-Bit VideoQA for Efficiency and Privacy
1 Introduction
2 Related Work
3 Few-Bit VideoQA
3.1 Problem Formulation
3.2 Approach: Task-Specific Feature Compression
4 Experiments
4.1 Few-Bit VideoQA Results
4.2 Qualitative Analysis
5 Applications of Few-Bit VideoQA
5.1 Tiny Datasets
5.2 Privacy Advantages from Tiny Features
6 Conclusion
References
ChaLearn LAP Seasons in Drift Challenge: Dataset, Design and Results
1 Introduction
2 Related Work
3 Challenge Design
3.1 The Dataset
3.2 Evaluation Protocol
3.3 The Baseline
4 Challenge Results and Winning Methods
4.1 The Leaderboard
4.2 Top-1: Team GroundTruth
4.3 Top-2: Team Heboyong
4.4 What Challenges the Models the Most?
5 Conclusions
References
Author Index

LNCS 13805

Leonid Karlinsky Tomer Michaeli Ko Nishino (Eds.)

Computer Vision – ECCV 2022 Workshops Tel Aviv, Israel, October 23–27, 2022 Proceedings, Part V

Lecture Notes in Computer Science

Founding Editors
Gerhard Goos, Karlsruhe Institute of Technology, Karlsruhe, Germany
Juris Hartmanis, Cornell University, Ithaca, NY, USA

Editorial Board Members
Elisa Bertino, Purdue University, West Lafayette, IN, USA
Wen Gao, Peking University, Beijing, China
Bernhard Steffen, TU Dortmund University, Dortmund, Germany
Moti Yung, Columbia University, New York, NY, USA

13805

More information about this series at https://link.springer.com/bookseries/558

Leonid Karlinsky · Tomer Michaeli · Ko Nishino (Eds.)

Computer Vision – ECCV 2022 Workshops Tel Aviv, Israel, October 23–27, 2022 Proceedings, Part V

Editors Leonid Karlinsky IBM Research - MIT-IBM Watson AI Lab Massachusetts, USA

Tomer Michaeli Technion – Israel Institute of Technology Haifa, Israel

Ko Nishino Kyoto University Kyoto, Japan

ISSN 0302-9743   ISSN 1611-3349 (electronic)
Lecture Notes in Computer Science
ISBN 978-3-031-25071-2   ISBN 978-3-031-25072-9 (eBook)
https://doi.org/10.1007/978-3-031-25072-9

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2023

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG. The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland.

Foreword

Organizing the European Conference on Computer Vision (ECCV 2022) in Tel-Aviv during a global pandemic was no easy feat. The uncertainty level was extremely high, and decisions had to be postponed to the last minute. Still, we managed to plan things just in time for ECCV 2022 to be held in person. Participation in physical events is crucial to stimulating collaborations and nurturing the culture of the Computer Vision community.

There were many people who worked hard to ensure attendees enjoyed the best science at the 17th edition of ECCV. We are grateful to the Program Chairs Gabriel Brostow and Tal Hassner, who went above and beyond to ensure the ECCV reviewing process ran smoothly. The scientific program included dozens of workshops and tutorials in addition to the main conference and we would like to thank Leonid Karlinsky and Tomer Michaeli for their hard work. Finally, special thanks to the web chairs Lorenzo Baraldi and Kosta Derpanis, who put in extra hours to transfer information fast and efficiently to the ECCV community.

We would like to express gratitude to our generous sponsors and the Industry Chairs Dimosthenis Karatzas and Chen Sagiv, who oversaw industry relations and proposed new ways for academia-industry collaboration and technology transfer. It’s great to see so much industrial interest in what we’re doing!

Authors’ draft versions of the papers appeared online with open access on both the Computer Vision Foundation (CVF) and the European Computer Vision Association (ECVA) websites as with previous ECCVs. Springer, the publisher of the proceedings, has arranged for archival publication. The final version of the papers is hosted by SpringerLink, with active references and supplementary materials. It benefits all potential readers that we offer both a free and citeable version for all researchers, as well as an authoritative, citeable version for SpringerLink readers. Our thanks go to Ronan Nugent from Springer, who helped us negotiate this agreement. Last but not least, we wish to thank Eric Mortensen, our publication chair, whose expertise made the process smooth.

October 2022

Rita Cucchiara, Jiří Matas, Amnon Shashua, Lihi Zelnik-Manor

Preface

Welcome to the workshop proceedings of the 17th European Conference on Computer Vision (ECCV 2022). This year, the main ECCV event was accompanied by 60 workshops, scheduled between October 23–24, 2022.

We received 103 workshop proposals on diverse computer vision topics and unfortunately had to decline many valuable proposals because of space limitations. We strove to achieve a balance between topics, as well as between established and new series. Due to the uncertainty associated with the COVID-19 pandemic around the proposal submission deadline, we allowed two workshop formats: hybrid and purely online. Some proposers switched their preferred format as we drew near the conference dates. The final program included 30 hybrid workshops and 30 purely online workshops.

Not all workshops published their papers in the ECCV workshop proceedings, or had papers at all. These volumes collect the edited papers from 38 out of the 60 workshops. We sincerely thank the ECCV general chairs for trusting us with the responsibility for the workshops, the workshop organizers for their hard work in putting together exciting programs, and the workshop presenters and authors for contributing to ECCV.

October 2022

Tomer Michaeli Leonid Karlinsky Ko Nishino

Organization

General Chairs
Rita Cucchiara, University of Modena and Reggio Emilia, Italy
Jiří Matas, Czech Technical University in Prague, Czech Republic
Amnon Shashua, Hebrew University of Jerusalem, Israel
Lihi Zelnik-Manor, Technion – Israel Institute of Technology, Israel

Program Chairs
Shai Avidan, Tel-Aviv University, Israel
Gabriel Brostow, University College London, UK
Giovanni Maria Farinella, University of Catania, Italy
Tal Hassner, Facebook AI, USA

Program Technical Chair
Pavel Lifshits, Technion – Israel Institute of Technology, Israel

Workshops Chairs
Leonid Karlinsky, IBM Research - MIT-IBM Watson AI Lab, USA
Tomer Michaeli, Technion – Israel Institute of Technology, Israel
Ko Nishino, Kyoto University, Japan

Tutorial Chairs
Thomas Pock, Graz University of Technology, Austria
Natalia Neverova, Facebook AI Research, UK

Demo Chair
Bohyung Han, Seoul National University, South Korea

Social and Student Activities Chairs
Tatiana Tommasi, Italian Institute of Technology, Italy
Sagie Benaim, University of Copenhagen, Denmark

Diversity and Inclusion Chairs
Xi Yin, Facebook AI Research, USA
Bryan Russell, Adobe, USA

Communications Chairs
Lorenzo Baraldi, University of Modena and Reggio Emilia, Italy
Kosta Derpanis, York University and Samsung AI Centre Toronto, Canada

Industrial Liaison Chairs
Dimosthenis Karatzas, Universitat Autònoma de Barcelona, Spain
Chen Sagiv, SagivTech, Israel

Finance Chair
Gerard Medioni, University of Southern California and Amazon, USA

Publication Chair
Eric Mortensen, MiCROTEC, USA

Workshops Organizers W01 - AI for Space Tat-Jun Chin Luca Carlone Djamila Aouada Binfeng Pan Viorela Ila Benjamin Morrell Grzegorz Kakareko

The University of Adelaide, Australia Massachusetts Institute of Technology, USA University of Luxembourg, Luxembourg Northwestern Polytechnical University, China The University of Sydney, Australia NASA Jet Propulsion Lab, USA Spire Global, USA

W02 - Vision for Art Alessio Del Bue Peter Bell Leonardo L. Impett Noa Garcia Stuart James

Istituto Italiano di Tecnologia, Italy Philipps-Universität Marburg, Germany École Polytechnique Fédérale de Lausanne (EPFL), Switzerland Osaka University, Japan Istituto Italiano di Tecnologia, Italy

W03 - Adversarial Robustness in the Real World Angtian Wang Yutong Bai Adam Kortylewski Cihang Xie Alan Yuille Xinyun Chen Judy Hoffman Wieland Brendel Matthias Hein Hang Su Dawn Song Jun Zhu Philippe Burlina Rama Chellappa Yinpeng Dong Yingwei Li Ju He Alexander Robey

Johns Hopkins University, USA Johns Hopkins University, USA Max Planck Institute for Informatics, Germany University of California, Santa Cruz, USA Johns Hopkins University, USA University of California, Berkeley, USA Georgia Institute of Technology, USA University of Tübingen, Germany University of Tübingen, Germany Tsinghua University, China University of California, Berkeley, USA Tsinghua University, China Johns Hopkins University, USA Johns Hopkins University, USA Tsinghua University, China Johns Hopkins University, USA Johns Hopkins University, USA University of Pennsylvania, USA

W04 - Autonomous Vehicle Vision Rui Fan Nemanja Djuric Wenshuo Wang Peter Ondruska Jie Li

Tongji University, China Aurora Innovation, USA McGill University, Canada Toyota Woven Planet, UK Toyota Research Institute, USA

W05 - Learning With Limited and Imperfect Data Noel C. F. Codella Zsolt Kira Shuai Zheng Judy Hoffman Tatiana Tommasi Xiaojuan Qi Sadeep Jayasumana Viraj Prabhu Yunhui Guo Ming-Ming Cheng

Microsoft, USA Georgia Institute of Technology, USA Cruise LLC, USA Georgia Institute of Technology, USA Politecnico di Torino, Italy The University of Hong Kong, China University of Oxford, UK Georgia Institute of Technology, USA University of Texas at Dallas, USA Nankai University, China


W06 - Advances in Image Manipulation Radu Timofte Andrey Ignatov Ren Yang Marcos V. Conde Furkan Kınlı

University of Würzburg, Germany, and ETH Zurich, Switzerland AI Benchmark and ETH Zurich, Switzerland ETH Zurich, Switzerland University of Würzburg, Germany Özye˘gin University, Turkey

W07 - Medical Computer Vision Tal Arbel Ayelet Akselrod-Ballin Vasileios Belagiannis Qi Dou Moti Freiman Nicolas Padoy Tammy Riklin Raviv Mathias Unberath Yuyin Zhou

McGill University, Canada Reichman University, Israel Otto von Guericke University, Germany The Chinese University of Hong Kong, China Technion, Israel University of Strasbourg, France Ben Gurion University, Israel Johns Hopkins University, USA University of California, Santa Cruz, USA

W08 - Computer Vision for Metaverse Bichen Wu Peizhao Zhang Xiaoliang Dai Tao Xu Hang Zhang Péter Vajda Fernando de la Torre Angela Dai Bryan Catanzaro

Meta Reality Labs, USA Facebook, USA Facebook, USA Facebook, USA Meta, USA Facebook, USA Carnegie Mellon University, USA Technical University of Munich, Germany NVIDIA, USA

W09 - Self-Supervised Learning: What Is Next? Yuki M. Asano Christian Rupprecht Diane Larlus Andrew Zisserman

University of Amsterdam, The Netherlands University of Oxford, UK Naver Labs Europe, France University of Oxford, UK

W10 - Self-Supervised Learning for Next-Generation Industry-Level Autonomous Driving Xiaodan Liang Hang Xu

Sun Yat-sen University, China Huawei Noah’s Ark Lab, China

Fisher Yu Wei Zhang Michael C. Kampffmeyer Ping Luo

ETH Zürich, Switzerland Huawei Noah’s Ark Lab, China UiT The Arctic University of Norway, Norway The University of Hong Kong, China

W11 - ISIC Skin Image Analysis M. Emre Celebi Catarina Barata Allan Halpern Philipp Tschandl Marc Combalia Yuan Liu

University of Central Arkansas, USA Instituto Superior Técnico, Portugal Memorial Sloan Kettering Cancer Center, USA Medical University of Vienna, Austria Hospital Clínic of Barcelona, Spain Google Health, USA

W12 - Cross-Modal Human-Robot Interaction Fengda Zhu Yi Zhu Xiaodan Liang Liwei Wang Xiaojun Chang Nicu Sebe

Monash University, Australia Huawei Noah’s Ark Lab, China Sun Yat-sen University, China The Chinese University of Hong Kong, China University of Technology Sydney, Australia University of Trento, Italy

W13 - Text in Everything Ron Litman Aviad Aberdam Shai Mazor Hadar Averbuch-Elor Dimosthenis Karatzas R. Manmatha

Amazon AI Labs, Israel Amazon AI Labs, Israel Amazon AI Labs, Israel Cornell University, USA Universitat Autònoma de Barcelona, Spain Amazon AI Labs, USA

W14 - BioImage Computing Jan Funke Alexander Krull Dagmar Kainmueller Florian Jug Anna Kreshuk Martin Weigert Virginie Uhlmann

HHMI Janelia Research Campus, USA University of Birmingham, UK Max Delbrück Center, Germany Human Technopole, Italy EMBL-European Bioinformatics Institute, Germany École Polytechnique Fédérale de Lausanne (EPFL), Switzerland EMBL-European Bioinformatics Institute, UK


Peter Bajcsy Erik Meijering

National Institute of Standards and Technology, USA University of New South Wales, Australia

W15 - Visual Object-Oriented Learning Meets Interaction: Discovery, Representations, and Applications Kaichun Mo Yanchao Yang Jiayuan Gu Shubham Tulsiani Hongjing Lu Leonidas Guibas

Stanford University, USA Stanford University, USA University of California, San Diego, USA Carnegie Mellon University, USA University of California, Los Angeles, USA Stanford University, USA

W16 - AI for Creative Video Editing and Understanding Fabian Caba Anyi Rao Alejandro Pardo Linning Xu Yu Xiong Victor A. Escorcia Ali Thabet Dong Liu Dahua Lin Bernard Ghanem

Adobe Research, USA The Chinese University of Hong Kong, China King Abdullah University of Science and Technology, Saudi Arabia The Chinese University of Hong Kong, China The Chinese University of Hong Kong, China Samsung AI Center, UK Reality Labs at Meta, USA Netflix Research, USA The Chinese University of Hong Kong, China King Abdullah University of Science and Technology, Saudi Arabia

W17 - Visual Inductive Priors for Data-Efficient Deep Learning Jan C. van Gemert Nergis Tömen Ekin Dogus Cubuk Robert-Jan Bruintjes Attila Lengyel Osman Semih Kayhan Marcos Baptista Ríos Lorenzo Brigato

Delft University of Technology, The Netherlands Delft University of Technology, The Netherlands Google Brain, USA Delft University of Technology, The Netherlands Delft University of Technology, The Netherlands Bosch Security Systems, The Netherlands Alice Biometrics, Spain Sapienza University of Rome, Italy

W18 - Mobile Intelligent Photography and Imaging Chongyi Li Shangchen Zhou

Nanyang Technological University, Singapore Nanyang Technological University, Singapore

Ruicheng Feng Jun Jiang Wenxiu Sun Chen Change Loy Jinwei Gu

Nanyang Technological University, Singapore SenseBrain Research, USA SenseTime Group Limited, China Nanyang Technological University, Singapore SenseBrain Research, USA

W19 - People Analysis: From Face, Body and Fashion to 3D Virtual Avatars Alberto Del Bimbo Mohamed Daoudi Roberto Vezzani Xavier Alameda-Pineda Marcella Cornia Guido Borghi Claudio Ferrari Federico Becattini Andrea Pilzer Zhiwen Chen Xiangyu Zhu Ye Pan Xiaoming Liu

University of Florence, Italy IMT Nord Europe, France University of Modena and Reggio Emilia, Italy Inria Grenoble, France University of Modena and Reggio Emilia, Italy University of Bologna, Italy University of Parma, Italy University of Florence, Italy NVIDIA AI Technology Center, Italy Alibaba Group, China Chinese Academy of Sciences, China Shanghai Jiao Tong University, China Michigan State University, USA

W20 - Safe Artificial Intelligence for Automated Driving Timo Saemann Oliver Wasenmüller Markus Enzweiler Peter Schlicht Joachim Sicking Stefan Milz Fabian Hüger Seyed Ghobadi Ruby Moritz Oliver Grau Frédérik Blank Thomas Stauner

Valeo, Germany Hochschule Mannheim, Germany Esslingen University of Applied Sciences, Germany CARIAD, Germany Fraunhofer IAIS, Germany Spleenlab.ai and Technische Universität Ilmenau, Germany Volkswagen Group Research, Germany University of Applied Sciences Mittelhessen, Germany Volkswagen Group Research, Germany Intel Labs, Germany Bosch, Germany BMW Group, Germany

W21 - Real-World Surveillance: Applications and Challenges Kamal Nasrollahi Sergio Escalera

Aalborg University, Denmark Universitat Autònoma de Barcelona, Spain


Radu Tudor Ionescu Fahad Shahbaz Khan Thomas B. Moeslund Anthony Hoogs Shmuel Peleg Mubarak Shah

University of Bucharest, Romania Mohamed bin Zayed University of Artificial Intelligence, United Arab Emirates Aalborg University, Denmark Kitware, USA The Hebrew University, Israel University of Central Florida, USA

W22 - Affective Behavior Analysis In-the-Wild Dimitrios Kollias Stefanos Zafeiriou Elnar Hajiyev Viktoriia Sharmanska

Queen Mary University of London, UK Imperial College London, UK Realeyes, UK University of Sussex, UK

W23 - Visual Perception for Navigation in Human Environments: The JackRabbot Human Body Pose Dataset and Benchmark Hamid Rezatofighi Edward Vendrow Ian Reid Silvio Savarese

Monash University, Australia Stanford University, USA University of Adelaide, Australia Stanford University, USA

W24 - Distributed Smart Cameras Niki Martinel Ehsan Adeli Rita Pucci Animashree Anandkumar Caifeng Shan Yue Gao Christian Micheloni Hamid Aghajan Li Fei-Fei

University of Udine, Italy Stanford University, USA University of Udine, Italy Caltech and NVIDIA, USA Shandong University of Science and Technology, China Tsinghua University, China University of Udine, Italy Ghent University, Belgium Stanford University, USA

W25 - Causality in Vision Yulei Niu Hanwang Zhang Peng Cui Song-Chun Zhu Qianru Sun Mike Zheng Shou Kaihua Tang

Columbia University, USA Nanyang Technological University, Singapore Tsinghua University, China University of California, Los Angeles, USA Singapore Management University, Singapore National University of Singapore, Singapore Nanyang Technological University, Singapore


W26 - In-Vehicle Sensing and Monitorization Jaime S. Cardoso Pedro M. Carvalho João Ribeiro Pinto Paula Viana Christer Ahlström Carolina Pinto

INESC TEC and Universidade do Porto, Portugal INESC TEC and Polytechnic of Porto, Portugal Bosch Car Multimedia and Universidade do Porto, Portugal INESC TEC and Polytechnic of Porto, Portugal Swedish National Road and Transport Research Institute, Sweden Bosch Car Multimedia, Portugal

W27 - Assistive Computer Vision and Robotics Marco Leo Giovanni Maria Farinella Antonino Furnari Mohan Trivedi Gérard Medioni

National Research Council of Italy, Italy University of Catania, Italy University of Catania, Italy University of California, San Diego, USA Amazon, USA

W28 - Computational Aspects of Deep Learning Iuri Frosio Sophia Shao Lorenzo Baraldi Claudio Baecchi Frederic Pariente Giuseppe Fiameni

NVIDIA, Italy University of California, Berkeley, USA University of Modena and Reggio Emilia, Italy University of Florence, Italy NVIDIA, France NVIDIA, Italy

W29 - Computer Vision for Civil and Infrastructure Engineering Joakim Bruslund Haurum Mingzhu Wang Ajmal Mian Thomas B. Moeslund

Aalborg University, Denmark Loughborough University, UK University of Western Australia, Australia Aalborg University, Denmark

W30 - AI-Enabled Medical Image Analysis: Digital Pathology and Radiology/COVID-19 Jaime S. Cardoso Stefanos Kollias Sara P. Oliveira Mattias Rantalainen Jeroen van der Laak Cameron Po-Hsuan Chen

INESC TEC and Universidade do Porto, Portugal National Technical University of Athens, Greece INESC TEC, Portugal Karolinska Institutet, Sweden Radboud University Medical Center, The Netherlands Google Health, USA


Diana Felizardo Ana Monteiro Isabel M. Pinto Pedro C. Neto Xujiong Ye Luc Bidaut Francesco Rundo Dimitrios Kollias Giuseppe Banna

IMP Diagnostics, Portugal IMP Diagnostics, Portugal IMP Diagnostics, Portugal INESC TEC, Portugal University of Lincoln, UK University of Lincoln, UK STMicroelectronics, Italy Queen Mary University of London, UK Portsmouth Hospitals University, UK

W31 - Compositional and Multimodal Perception Kazuki Kozuka Zelun Luo Ehsan Adeli Ranjay Krishna Juan Carlos Niebles Li Fei-Fei

Panasonic Corporation, Japan Stanford University, USA Stanford University, USA University of Washington, USA Salesforce and Stanford University, USA Stanford University, USA

W32 - Uncertainty Quantification for Computer Vision Andrea Pilzer Martin Trapp Arno Solin Yingzhen Li Neill D. F. Campbell

NVIDIA, Italy Aalto University, Finland Aalto University, Finland Imperial College London, UK University of Bath, UK

W33 - Recovering 6D Object Pose Martin Sundermeyer Tomáš Hodaˇn Yann Labbé Gu Wang Lingni Ma Eric Brachmann Bertram Drost Sindi Shkodrani Rigas Kouskouridas Ales Leonardis Carsten Steger Vincent Lepetit Jiˇrí Matas

DLR German Aerospace Center, Germany Reality Labs at Meta, USA Inria Paris, France Tsinghua University, China Reality Labs at Meta, USA Niantic, Germany MVTec, Germany Reality Labs at Meta, USA Scape Technologies, UK University of Birmingham, UK Technical University of Munich and MVTec, Germany École des Ponts ParisTech, France, and TU Graz, Austria Czech Technical University in Prague, Czech Republic

W34 - Drawings and Abstract Imagery: Representation and Analysis Diane Oyen Kushal Kafle Michal Kucer Pradyumna Reddy Cory Scott

Los Alamos National Laboratory, USA Adobe Research, USA Los Alamos National Laboratory, USA University College London, UK University of California, Irvine, USA

W35 - Sign Language Understanding Liliane Momeni Gül Varol Hannah Bull Prajwal K. R. Neil Fox Ben Saunders Necati Cihan Camgöz Richard Bowden Andrew Zisserman Bencie Woll Sergio Escalera Jose L. Alba-Castro Thomas B. Moeslund Julio C. S. Jacques Junior Manuel Vázquez Enríquez

University of Oxford, UK École des Ponts ParisTech, France University of Paris-Saclay, France University of Oxford, UK University College London, UK University of Surrey, UK Meta Reality Labs, Switzerland University of Surrey, UK University of Oxford, UK University College London, UK Universitat Autònoma de Barcelona, Spain Universidade de Vigo, Spain Aalborg University, Denmark Universitat Autònoma de Barcelona, Spain Universidade de Vigo, Spain

W36 - A Challenge for Out-of-Distribution Generalization in Computer Vision Adam Kortylewski Bingchen Zhao Jiahao Wang Shaozuo Yu Siwei Yang Dan Hendrycks Oliver Zendel Dawn Song Alan Yuille

Max Planck Institute for Informatics, Germany University of Edinburgh, UK Max Planck Institute for Informatics, Germany The Chinese University of Hong Kong, China Hong Kong University of Science and Technology, China University of California, Berkeley, USA Austrian Institute of Technology, Austria University of California, Berkeley, USA Johns Hopkins University, USA

W37 - Vision With Biased or Scarce Data Kuan-Chuan Peng Ziyan Wu

Mitsubishi Electric Research Labs, USA United Imaging Intelligence, USA


W38 - Visual Object Tracking Challenge Matej Kristan Aleš Leonardis Jiří Matas Hyung Jin Chang Joni-Kristian Kämäräinen Roman Pflugfelder

Luka Čehovin Zajc Alan Lukežič Gustavo Fernández Michael Felsberg Martin Danelljan

University of Ljubljana, Slovenia University of Birmingham, UK Czech Technical University in Prague, Czech Republic University of Birmingham, UK Tampere University, Finland Technical University of Munich, Germany, Technion, Israel, and Austrian Institute of Technology, Austria University of Ljubljana, Slovenia University of Ljubljana, Slovenia Austrian Institute of Technology, Austria Linköping University, Sweden ETH Zurich, Switzerland

Contents – Part V

W18 - Challenge on Mobile Intelligent Photography and Imaging

MIPI 2022 Challenge on RGB+ToF Depth Completion: Dataset and Report . . . . 3
Wenxiu Sun, Qingpeng Zhu, Chongyi Li, Ruicheng Feng, Shangchen Zhou, Jun Jiang, Qingyu Yang, Chen Change Loy, Jinwei Gu, Dewang Hou, Kai Zhao, Liying Lu, Yu Li, Huaijia Lin, Ruizheng Wu, Jiangbo Lu, Jiaya Jia, Qiang Liu, Haosong Yue, Danyang Cao, Lehang Yu, Jiaxuan Quan, Jixiang Liang, Yufei Wang, Yuchao Dai, Peng Yang, Hu Yan, Houbiao Liu, Siyuan Su, Xuanhe Li, Rui Ren, Yunlong Liu, Yufan Zhu, Dong Lao, Alex Wong, and Katie Chang

MIPI 2022 Challenge on Quad-Bayer Re-mosaic: Dataset and Report . . . . 21
Qingyu Yang, Guang Yang, Jun Jiang, Chongyi Li, Ruicheng Feng, Shangchen Zhou, Wenxiu Sun, Qingpeng Zhu, Chen Change Loy, Jinwei Gu, Zhen Wang, Daoyu Li, Yuzhe Zhang, Lintao Peng, Xuyang Chang, Yinuo Zhang, Yaqi Wu, Xun Wu, Zhihao Fan, Chengjie Xia, Feng Zhang, Haijin Zeng, Kai Feng, Yongqiang Zhao, Hiep Quang Luong, Jan Aelterman, Anh Minh Truong, Wilfried Philips, Xiaohong Liu, Jun Jia, Hanchi Sun, Guangtao Zhai, Longan Xiao, Qihang Xu, Ting Jiang, Qi Wu, Chengzhi Jiang, Mingyan Han, Xinpeng Li, Wenjie Lin, Youwei Li, Haoqiang Fan, Shuaicheng Liu, Rongyuan Wu, Lingchen Sun, and Qiaosi Yi

MIPI 2022 Challenge on RGBW Sensor Re-mosaic: Dataset and Report . . . . 36
Qingyu Yang, Guang Yang, Jun Jiang, Chongyi Li, Ruicheng Feng, Shangchen Zhou, Wenxiu Sun, Qingpeng Zhu, Chen Change Loy, Jinwei Gu, Lingchen Sun, Rongyuan Wu, Qiaosi Yi, Rongjian Xu, Xiaohui Liu, Zhilu Zhang, Xiaohe Wu, Ruohao Wang, Junyi Li, Wangmeng Zuo, and Faming Fang

MIPI 2022 Challenge on RGBW Sensor Fusion: Dataset and Report . . . . 46
Qingyu Yang, Guang Yang, Jun Jiang, Chongyi Li, Ruicheng Feng, Shangchen Zhou, Wenxiu Sun, Qingpeng Zhu, Chen Change Loy, Jinwei Gu, Zhen Wang, Daoyu Li, Yuzhe Zhang, Lintao Peng, Xuyang Chang, Yinuo Zhang, Liheng Bian, Bing Li, Jie Huang, Mingde Yao, Ruikang Xu, Feng Zhao, Xiaohui Liu, Rongjian Xu, Zhilu Zhang, Xiaohe Wu, Ruohao Wang, Junyi Li, Wangmeng Zuo, Zhuang Jia, DongJae Lee, Ting Jiang, Qi Wu, Chengzhi Jiang, Mingyan Han, Xinpeng Li, Wenjie Lin, Youwei Li, Haoqiang Fan, and Shuaicheng Liu

MIPI 2022 Challenge on Under-Display Camera Image Restoration: Methods and Results . . . . 60
Ruicheng Feng, Chongyi Li, Shangchen Zhou, Wenxiu Sun, Qingpeng Zhu, Jun Jiang, Qingyu Yang, Chen Change Loy, Jinwei Gu, Yurui Zhu, Xi Wang, Xueyang Fu, Xiaowei Hu, Jinfan Hu, Xina Liu, Xiangyu Chen, Chao Dong, Dafeng Zhang, Feiyu Huang, Shizhuo Liu, Xiaobing Wang, Zhezhu Jin, Xuhao Jiang, Guangqi Shao, Xiaotao Wang, Lei Lei, Zhao Zhang, Suiyi Zhao, Huan Zheng, Yangcheng Gao, Yanyan Wei, Jiahuan Ren, Tao Huang, Zhenxuan Fang, Mengluan Huang, Junwei Xu, Yong Zhang, Yuechi Yang, Qidi Shu, Zhiwen Yang, Shaocong Li, Mingde Yao, Ruikang Xu, Yuanshen Guan, Jie Huang, Zhiwei Xiong, Hangyan Zhu, Ming Liu, Shaohui Liu, Wangmeng Zuo, Zhuang Jia, Binbin Song, Ziqi Song, Guiting Mao, Ben Hou, Zhimou Liu, Yi Ke, Dengpei Ouyang, Dekui Han, Jinghao Zhang, Qi Zhu, Naishan Zheng, Feng Zhao, Wu Jin, Marcos Conde, Sabari Nathan, Radu Timofte, Tianyi Xu, Jun Xu, P. S. Hrishikesh, Densen Puthussery, C. V. Jiji, Biao Jiang, Yuhan Ding, WanZhang Li, Xiaoyue Feng, Sijing Chen, Tianheng Zhong, Jiyang Lu, Hongming Chen, Zhentao Fan, and Xiang Chen

Continuous Spectral Reconstruction from RGB Images via Implicit Neural Representation . . . . 78
Ruikang Xu, Mingde Yao, Chang Chen, Lizhi Wang, and Zhiwei Xiong

Event-Based Image Deblurring with Dynamic Motion Awareness . . . . 95
Patricia Vitoria, Stamatios Georgoulis, Stepan Tulyakov, Alfredo Bochicchio, Julius Erbach, and Yuanyou Li

UDC-UNet: Under-Display Camera Image Restoration via U-shape Dynamic Network . . . . 113
Xina Liu, Jinfan Hu, Xiangyu Chen, and Chao Dong

Enhanced Coarse-to-Fine Network for Image Restoration from Under-Display Cameras . . . . 130
Yurui Zhu, Xi Wang, Xueyang Fu, and Xiaowei Hu

Learning to Joint Remosaic and Denoise in Quad Bayer CFA via Universal Multi-scale Channel Attention Network . . . . 147
Xun Wu, Zhihao Fan, Jiesi Zheng, Yaqi Wu, and Feng Zhang

Learning an Efficient Multimodal Depth Completion Model . . . . 161
Dewang Hou, Yuanyuan Du, Kai Zhao, and Yang Zhao

Learning Rich Information for Quad Bayer Remosaicing and Denoising . . . . 175
Jun Jia, Hanchi Sun, Xiaohong Liu, Longan Xiao, Qihang Xu, and Guangtao Zhai

Depth Completion Using Laplacian Pyramid-Based Depth Residuals . . . . 192
Haosong Yue, Qiang Liu, Zhong Liu, Jing Zhang, and Xingming Wu

W19 - Challenge on People Analysis: From Face, Body and Fashion to 3D Virtual Avatars

PSUMNet: Unified Modality Part Streams Are All You Need for Efficient Pose-Based Action Recognition . . . . 211
Neel Trivedi and Ravi Kiran Sarvadevabhatla

YOLO5Face: Why Reinventing a Face Detector . . . . 228
Delong Qi, Weijun Tan, Qi Yao, and Jingfeng Liu

Counterfactual Fairness for Facial Expression Recognition . . . . 245
Jiaee Cheong, Sinan Kalkan, and Hatice Gunes

Improved Cross-Dataset Facial Expression Recognition by Handling Data Imbalance and Feature Confusion . . . . 262
Manogna Sreenivas, Sawa Takamuku, Soma Biswas, Aditya Chepuri, Balasubramanian Vengatesan, and Naotake Natori

Video-Based Gait Analysis for Spinal Deformity . . . . 278
Himanshu Kumar Suman and Tanmay Tulsidas Verlekar

TSCom-Net: Coarse-to-Fine 3D Textured Shape Completion Network . . . . 289
Ahmet Serdar Karadeniz, Sk Aziz Ali, Anis Kacem, Elona Dupont, and Djamila Aouada

Deep Learning-Based Assessment of Facial Periodic Affect in Work-Like Settings . . . . 307
Siyang Song, Yiming Luo, Vincenzo Ronca, Gianluca Borghini, Hesam Sagha, Vera Rick, Alexander Mertens, and Hatice Gunes

Supervision by Landmarks: An Enhanced Facial De-occlusion Network for VR-Based Applications . . . . 323
Surabhi Gupta, Sai Sagar Jinka, Avinash Sharma, and Anoop Namboodiri

Consistency-Based Self-supervised Learning for Temporal Anomaly Localization . . . . 338
Aniello Panariello, Angelo Porrello, Simone Calderara, and Rita Cucchiara

Perspective Reconstruction of Human Faces by Joint Mesh and Landmark Regression . . . . 350
Jia Guo, Jinke Yu, Alexandros Lattas, and Jiankang Deng

Pixel2ISDF: Implicit Signed Distance Fields Based Human Body Model from Multi-view and Multi-pose Images . . . . 366
Jianchuan Chen, Wentao Yi, Tiantian Wang, Xing Li, Liqian Ma, Yangyu Fan, and Huchuan Lu

UnconFuse: Avatar Reconstruction from Unconstrained Images . . . . 376
Han Huang, Liliang Chen, and Xihao Wang

HiFace: Hybrid Task Learning for Face Reconstruction from Single Image . . . . 382
Wei Xu, Zhihong Fu, Zhixing Chen, Qili Deng, Mingtao Fu, Xijin Zhang, Yuan Gao, Daniel K. Du, and Min Zheng

Multi-view Canonical Pose 3D Human Body Reconstruction Based on Volumetric TSDF . . . . 392
Xi Li

End to End Face Reconstruction via Differentiable PnP . . . . 398
Yiren Lu and Huawei Wei

W20 - Safe Artificial Intelligence for Automated Driving

One Ontology to Rule Them All: Corner Case Scenarios for Autonomous Driving . . . . 409
Daniel Bogdoll, Stefani Guneshka, and J. Marius Zöllner

Parametric and Multivariate Uncertainty Calibration for Regression and Object Detection . . . . 426
Fabian Küppers, Jonas Schneider, and Anselm Haselhoff

Reliable Multimodal Trajectory Prediction via Error Aligned Uncertainty Optimization . . . . 443
Neslihan Kose, Ranganath Krishnan, Akash Dhamasia, Omesh Tickoo, and Michael Paulitsch

PAI3D: Painting Adaptive Instance-Prior for 3D Object Detection . . . . 459
Hao Liu, Zhuoran Xu, Dan Wang, Baofeng Zhang, Guan Wang, Bo Dong, Xin Wen, and Xinyu Xu

Validation of Pedestrian Detectors by Classification of Visual Detection Impairing Factors . . . . 476
Korbinian Hagn and Oliver Grau

Probing Contextual Diversity for Dense Out-of-Distribution Detection . . . . 492
Silvio Galesso, Maria Alejandra Bravo, Mehdi Naouar, and Thomas Brox

Adversarial Vulnerability of Temporal Feature Networks for Object Detection . . . . 510
Svetlana Pavlitskaya, Nikolai Polley, Michael Weber, and J. Marius Zöllner

Towards Improved Intermediate Layer Variational Inference for Uncertainty Estimation . . . . 526
Ahmed Hammam, Frank Bonarens, Seyed Eghbal Ghobadi, and Christoph Stiller

Explainable Sparse Attention for Memory-Based Trajectory Predictors . . . . 543
Francesco Marchetti, Federico Becattini, Lorenzo Seidenari, and Alberto Del Bimbo

Cycle-Consistent World Models for Domain Independent Latent Imagination . . . . 561
Sidney Bender, Tim Joseph, and J. Marius Zöllner

W21 - Real-World Surveillance: Applications and Challenges

Strengthening Skeletal Action Recognizers via Leveraging Temporal Patterns . . . . 577
Zhenyue Qin, Pan Ji, Dongwoo Kim, Yang Liu, Saeed Anwar, and Tom Gedeon

Which Expert Knows Best? Modulating Soft Learning with Online Batch Confidence for Domain Adaptive Person Re-Identification . . . . 594
Andrea Zunino, Christopher Murray, Richard Blythman, and Vittorio Murino

Cross-Modality Attention and Multimodal Fusion Transformer for Pedestrian Detection . . . . 608
Wei-Yu Lee, Ljubomir Jovanov, and Wilfried Philips

See Finer, See More: Implicit Modality Alignment for Text-Based Person Retrieval . . . . 624
Xiujun Shu, Wei Wen, Haoqian Wu, Keyu Chen, Yiran Song, Ruizhi Qiao, Bo Ren, and Xiao Wang

Look at Adjacent Frames: Video Anomaly Detection Without Offline Training . . . . 642
Yuqi Ouyang, Guodong Shen, and Victor Sanchez

SOMPT22: A Surveillance Oriented Multi-pedestrian Tracking Dataset . . . . 659
Fatih Emre Simsek, Cevahir Cigla, and Koray Kayabol

Detection of Fights in Videos: A Comparison Study of Anomaly Detection and Action Recognition . . . . 676
Weijun Tan and Jingfeng Liu

Privacy-Preserving Person Detection Using Low-Resolution Infrared Cameras . . . . 689
Thomas Dubail, Fidel Alejandro Guerrero Peña, Heitor Rapela Medeiros, Masih Aminbeidokhti, Eric Granger, and Marco Pedersoli

Gait Recognition from Occluded Sequences in Surveillance Sites . . . . 703
Dhritimaan Das, Ayush Agarwal, and Pratik Chattopadhyay

Visible-Infrared Person Re-Identification Using Privileged Intermediate Information . . . . 720
Mahdi Alehdaghi, Arthur Josi, Rafael M. O. Cruz, and Eric Granger

Video in 10 Bits: Few-Bit VideoQA for Efficiency and Privacy . . . . 738
Shiyuan Huang, Robinson Piramuthu, Shih-Fu Chang, and Gunnar A. Sigurdsson

ChaLearn LAP Seasons in Drift Challenge: Dataset, Design and Results . . . . 755
Anders Skaarup Johansen, Julio C. S. Jacques Junior, Kamal Nasrollahi, Sergio Escalera, and Thomas B. Moeslund

Author Index . . . . 771

W18 - Challenge on Mobile Intelligent Photography and Imaging

This was the first workshop on Mobile Intelligent Photography and Imaging (MIPI). The workshop aimed to emphasize the integration of novel image sensors and imaging algorithms. Together with the workshop, we organized five exciting challenge tracks: RGB+ToF Depth Completion, Quad-Bayer Re-mosaic, RGBW Sensor Re-mosaic, RGBW Sensor Fusion, and Under-display Camera Image Restoration. The challenges attracted hundreds of participants. The workshop also received high-quality papers, and the winning challenge teams and the authors of the best workshop papers were invited to present their work. We also invited renowned keynote speakers from both industry and academia to share their insights and recent work. The workshop provided a fertile ground for researchers, scientists, and engineers from around the world to disseminate their research outcomes and push forward the frontiers of knowledge in areas related to novel image sensors and imaging systems.

October 2022

Chongyi Li Shangchen Zhou Ruicheng Feng Jun Jiang Wenxiu Sun Chen Change Loy Jinwei Gu

MIPI 2022 Challenge on RGB+ToF Depth Completion: Dataset and Report

Wenxiu Sun1,3, Qingpeng Zhu1, Chongyi Li4(B), Ruicheng Feng4, Shangchen Zhou4, Jun Jiang4, Qingyu Yang2, Chen Change Loy4, Jinwei Gu2,3, Dewang Hou5, Kai Zhao6, Liying Lu7, Yu Li8, Huaijia Lin7, Ruizheng Wu9, Jiangbo Lu9, Jiaya Jia7, Qiang Liu10, Haosong Yue10, Danyang Cao10, Lehang Yu10, Jiaxuan Quan10, Jixiang Liang10, Yufei Wang11, Yuchao Dai11, Peng Yang11, Hu Yan12, Houbiao Liu12, Siyuan Su12, Xuanhe Li12, Rui Ren13, Yunlong Liu13, Yufan Zhu13, Dong Lao14, Alex Wong14,15, and Katie Chang15

1 SenseTime Research and Tetras.AI, Beijing, China {sunwx,zhuqingpeng}@tetras.ai
2 SenseBrain, San Jose, USA
3 Shanghai AI Laboratory, Shanghai, China
4 Nanyang Technological University, Singapore, Singapore [email protected]
5 Peking University, Beijing, China [email protected]
6 Tsinghua University, Beijing, China
7 The Chinese University of Hong Kong, Shatin, Hong Kong [email protected]
8 International Digital Economy Academy (IDEA), Shenzhen, China
9 SmartMore, Shatin, Hong Kong
10 BeiHang University, Beijing, China
11 School of Electronics and Information, Northwestern Polytechnical University, Xi'an, China
12 Amlogic, Shanghai, China [email protected]
13 Xidian University, Xi'an, China
14 UCLA, Los Angeles, USA [email protected]
15 Yale University, New Haven, USA

Abstract. Developing and integrating advanced image sensors with novel algorithms in camera systems is prevalent with the increasing demand for computational photography and imaging on mobile platforms. However, the lack of high-quality data for research and the rare opportunity for in-depth exchange of views from industry and academia constrain the development of mobile intelligent photography and imaging (MIPI). To bridge the gap, we introduce the first MIPI challenge, including five tracks focusing on novel image sensors and imaging algorithms. In this paper, RGB+ToF Depth Completion, one of the five tracks, working on the fusion of an RGB sensor and a ToF sensor (with spot illumination), is introduced. The participants were provided with a new dataset called TetrasRGBD, which contains 18k pairs of high-quality synthetic RGB+Depth training data and 2.3k pairs of testing data from mixed sources. All the data are collected in an indoor scenario. We require that the running time of all methods should be real-time on desktop GPUs. The final results are evaluated using objective metrics and the Mean Opinion Score (MOS) subjectively. A detailed description of all models developed in this challenge is provided in this paper. More details of this challenge and the link to the dataset can be found at https://github.com/mipi-challenge/MIPI2022.

MIPI 2022 challenge website: http://mipi-challenge.org/MIPI2022/.

Keywords: RGB+ToF · Depth completion · MIPI challenge

1 Introduction

RGB+ToF Depth Completion uses sparse ToF depth measurements and a pre-aligned RGB image to obtain a complete depth map. There are a few advantages of using sparse ToF depth measurements. First, the hardware power consumption of sparse ToF (with spot illumination) is low compared to full-field ToF (with flood illumination) depth measurements, which is very important for mobile applications to prevent overheating and fast battery drain. Second, the sparse depth has higher precision and longer-range depth measurements due to the focused energy in each laser bin. Third, the multi-path interference (MPI) problem that usually bothers full-field iToF measurement is greatly diminished in sparse ToF depth measurement. However, one obvious disadvantage of sparse ToF depth measurement is the depth density. Take the iPhone 12 Pro [21] for example, which is equipped with an advanced mobile Lidar (dToF): the maximum raw depth density is only 1.2%. If used directly as is, the depth would be too sparse to be applied to typical applications such as image enhancement, 3D reconstruction, and AR/VR.

In this challenge, we intend to fuse the pre-aligned RGB and sparse ToF depth measurements to obtain a complete depth map. Since power consumption and processing time are still important factors to be balanced, the proposed algorithm is required to process the RGBD data and predict the depth in real time, i.e., reach a speed of 30 frames per second, on a reference platform with a GeForce RTX 2080 Ti GPU. The solution is not necessarily a deep learning solution; however, to facilitate deep learning training, we provide a high-quality synthetic depth training dataset containing 20,000 pairs of RGB and ground-truth depth images of 7 indoor scenes. We provide a data loader to read these files and a function to simulate sparse depth maps that are close to the real sensor measurements. The participants are also allowed to use other public-domain datasets, for example, NYU-Depth v2 [28], the KITTI Depth Completion dataset [32], SceneNet [23], ARKitScenes [1], the Waymo Open Dataset [30], etc. A baseline code is available as well to understand the whole pipeline and to wrap up quickly. The testing data comes from mixed sources, including synthetic data, spot-iToF, and samples that are manually subsampled from iPhone 12 Pro processed depth, which uses dToF. The algorithm performance will be ranked by objective metrics: relative mean absolute error (RMAE), edge-weighted mean absolute error (EWMAE), relative depth shift (RDS), and relative temporal standard deviation (RTSD). Details of the metrics are described in the Evaluation section. For the final evaluation, we will also evaluate subjectively in terms of qualities that could not be measured effectively using objective metrics, for example, XY-resolution, edge sharpness, smoothness on flat surfaces, etc.

This challenge is a part of the Mobile Intelligent Photography and Imaging (MIPI) 2022 workshop and challenges, which emphasize the integration of novel image sensors and imaging algorithms and are held in conjunction with ECCV 2022. It consists of five competition tracks:

1. RGB+ToF Depth Completion uses sparse, noisy ToF depth measurements with RGB images to obtain a complete depth map.
2. Quad-Bayer Re-mosaic converts Quad-Bayer RAW data into Bayer format so that it can be processed with standard ISPs.
3. RGBW Sensor Re-mosaic converts RGBW RAW data into Bayer format so that it can be processed with standard ISPs.
4. RGBW Sensor Fusion fuses Bayer data and monochrome channel data into Bayer format to increase SNR and spatial resolution.
5. Under-display Camera Image Restoration improves the visual quality of images captured by a new imaging system equipped with an under-display camera.

2 Challenge

To develop an efficient and high-performance RGB+ToF depth completion solution to be used for mobile applications, we provide the following resources for participants:

– A high-quality and large-scale dataset that can be used to train and test the solution;
– Data processing code with a data loader that can help participants save time in accommodating the provided dataset to the depth completion task;
– A set of evaluation metrics that can measure the performance of a developed solution;
– A suggested requirement on running time on multiple platforms that is necessary for real-time processing.

2.1 Problem Definition

Depth completion [3,6,8,9,11–13,15,16,20,22,24,25] aims to recover dense depth from sparse depth measurements. Earlier methods concentrate on retrieving dense depth maps only from the sparse ones. However, these approaches are limited and not able to recover depth details and semantic information without the availability of multi-modal data. In this challenge, we focus on RGB+ToF sensor fusion, where a pre-aligned RGB image is also available as guidance for depth completion. In our evaluation, the depth resolution and RGB resolution are fixed at 256 × 192, and the input depth map sparsity ranges from 1.0% to 1.8%. As a reference, the KITTI depth completion dataset [32] has a sparsity of around 4%–5%. The target of this challenge is to predict a dense depth map given the sparse depth map and a pre-aligned RGB image under the allowed running time constraint (please refer to Sect. 2.5 for details).

2.2 Dataset: TetrasRGBD

The training data contains 7 image sequences of aligned RGB and ground-truth dense depth from 7 indoor scenes (20,000 pairs of RGB and depth in total). For each scene, the RGB and the ground-truth depth are rendered along a smooth trajectory in our created 3D virtual environment. RGB and dense depth images in the training set have a resolution of 640 × 480 pixels. We also provide a function to simulate sparse depth maps that are close to the real sensor measurements (https://github.com/zhuqingpeng/MIPI2022-RGB-ToF-depth-completion). A visualization of an example frame of RGB, ground-truth depth, and simulated sparse depth is shown in Fig. 1.

Fig. 1. Visualization of an example training data. Left, middle, and right images correspond to RGB, ground-truth depth, and simulated spot depth data, respectively.
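The official sparse-depth simulator is not reproduced here; the following is a minimal sketch, assuming a NumPy dense depth map, of how spot-ToF-like samples at roughly 1.2% density could be drawn. The jittered-grid strategy and the `simulate_spot_depth` name are illustrative assumptions, not the provided function.

```python
import numpy as np

def simulate_spot_depth(gt_depth, density=0.012, jitter=2, seed=0):
    """Sample a sparse spot-ToF-like depth map from a dense ground-truth map.

    Illustrative approximation only: spots are placed on a jittered regular
    grid so that roughly `density` of the pixels carry a valid depth value
    (0 marks missing depth).
    """
    rng = np.random.default_rng(seed)
    h, w = gt_depth.shape
    step = int(round(1.0 / np.sqrt(density)))      # grid pitch for the target density
    sparse = np.zeros_like(gt_depth)
    for y in np.arange(step // 2, h, step):
        for x in np.arange(step // 2, w, step):
            yy = np.clip(y + rng.integers(-jitter, jitter + 1), 0, h - 1)
            xx = np.clip(x + rng.integers(-jitter, jitter + 1), 0, w - 1)
            sparse[yy, xx] = gt_depth[yy, xx]      # keep the GT value at the spot
    return sparse

# Example: a 480x640 synthetic depth map at ~1.2% density (cf. iPhone 12 Pro dToF).
gt = np.random.uniform(0.5, 5.0, size=(480, 640)).astype(np.float32)
spots = simulate_spot_depth(gt)
print("valid ratio: %.3f" % (np.count_nonzero(spots) / spots.size))
```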

The testing data contains: a) Synthetic: a synthetic image sequence (500 pairs of RGB and depth in total) rendered from an indoor virtual environment that differs from the training data; b) iPhone dynamic: 24 image sequences of dynamic scenes collected from an iPhone 12 Pro (600 pairs of RGB and depth in total); c) iPhone static: 24 image sequences of static scenes collected from an iPhone 12 Pro (600 pairs of RGB and depth in total); d) Modified phone static: 24 image sequences of static scenes (600 pairs of RGB and depth in total) collected from a modified phone. Please note that depth noise, missing depth values in low-reflectance regions, and mismatch of the fields of view between the RGB and ToF cameras can be observed in this real data. RGB and dense depth images in the entire testing set have a resolution of 256 × 192 pixels. RGB and spot depth data from the testing set are provided, while the GT depth is not available to participants. The depth data in both training and testing sets are in meters.

2.3 Challenge Phases

The challenge consisted of the following phases:

1. Development: The registered participants get access to the data and baseline code, and are able to train the models and evaluate their running time locally.
2. Validation: The participants can upload their models to the remote server to check the fidelity scores on the validation dataset, and to compare their results on the validation leaderboard.
3. Testing: The participants submit their final results, code, models, and factsheets.

2.4 Scoring System

Objective Evaluation. We define the following metrics to evaluate the performance of depth completion algorithms.

– Relative Mean Absolute Error (RMAE), which measures the relative depth error between the completed depth and the ground truth, i.e.

$$\mathrm{RMAE} = \frac{1}{M \cdot N}\sum_{m=1}^{M}\sum_{n=1}^{N}\left|\frac{\hat{D}(m,n) - D(m,n)}{D(m,n)}\right|, \qquad (1)$$

where $M$ and $N$ denote the height and width of the depth, respectively, and $D$ and $\hat{D}$ represent the ground-truth depth and the predicted depth, respectively.

– Edge Weighted Mean Absolute Error (EWMAE), which is a weighted average of the absolute error. Regions with larger depth discontinuity are assigned higher weights. Similar to the idea of Gradient Conduction Mean Square Error (GCMSE) [19], EWMAE applies a weighting coefficient $G(m,n)$ to the absolute error between pixel $(m,n)$ in the ground-truth depth $D$ and the predicted depth $\hat{D}$, i.e.

$$\mathrm{EWMAE} = \frac{\sum_{m=1}^{M}\sum_{n=1}^{N} G(m,n)\,\bigl|\hat{D}(m,n) - D(m,n)\bigr|}{\sum_{m=1}^{M}\sum_{n=1}^{N} G(m,n)}, \qquad (2)$$

where the weighting coefficient $G$ is computed in the same way as in [19].

– Relative Depth Shift (RDS), which measures the relative depth error between the completed depth and the input sparse depth on the set of pixels that have valid depth values in the input sparse depth, i.e.

$$\mathrm{RDS} = \frac{1}{n(S)}\sum_{s\in S}\left|\frac{\hat{D}(s) - X(s)}{X(s)}\right|, \qquad (3)$$

where $X$ represents the input sparse depth, $S$ denotes the set of coordinates of all spot pixels, i.e., pixels with a valid depth value, and $n(S)$ denotes the cardinality of $S$.

– Relative Temporal Standard Deviation (RTSD), which measures the temporal standard deviation normalized by the depth values for static scenes, i.e.

$$\mathrm{RTSD} = \frac{1}{M \cdot N}\sum_{m=1}^{M}\sum_{n=1}^{N}\frac{\sqrt{\tfrac{1}{F}\sum_{f=1}^{F}\bigl(\hat{D}_f(m,n)-\bar{D}(m,n)\bigr)^2}}{\bar{D}(m,n)}, \quad \bar{D}(m,n)=\frac{1}{F}\sum_{f=1}^{F}\hat{D}_f(m,n), \qquad (4)$$

where $F$ denotes the number of frames and $\hat{D}_f$ is the prediction for frame $f$.

RMAE, EWMAE, and RDS will be measured on the testing data with GT depth. RTSD will be measured on the testing data collected from static scenes. We will rank the proposed algorithms according to the score calculated by the following formula, where the coefficients are designed to balance the values of the different metrics:

$$\text{Objective score} = 1 - 1.8\times\mathrm{RMAE} - 0.6\times\mathrm{EWMAE} - 3\times\mathrm{RDS} - 4.6\times\mathrm{RTSD}. \qquad (5)$$

For each dataset, we report the average results over all the processed images belonging to it.

Subjective Evaluation. For subjective evaluation, we adopt the commonly used Mean Opinion Score (MOS) with blind evaluation. The score is on a scale of 1 (bad) to 5 (excellent). We invited 16 expert observers to watch videos and give their subjective scores independently. The scores of all subjects are averaged as the final MOS.
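A minimal sketch of how the objective metrics and the weighted score of Eq. (5) could be computed with NumPy is shown below. EWMAE is omitted because its weighting coefficient G(m, n) follows the definition in [19]; the official evaluation scripts remain the reference implementation.

```python
import numpy as np

def rmae(pred, gt):
    # Eq. (1): mean of |D_hat - D| / D over all pixels.
    return np.mean(np.abs(pred - gt) / gt)

def rds(pred, sparse):
    # Eq. (3): relative shift w.r.t. the input spot depth, on valid spots only.
    mask = sparse > 0
    return np.mean(np.abs(pred[mask] - sparse[mask]) / sparse[mask])

def rtsd(pred_seq):
    # Eq. (4): per-pixel temporal std over F frames, normalised by the temporal mean.
    # pred_seq has shape (F, H, W) and comes from a static scene.
    return np.mean(pred_seq.std(axis=0) / pred_seq.mean(axis=0))

def objective_score(rmae_v, ewmae_v, rds_v, rtsd_v):
    # Eq. (5): weighted combination used for the objective ranking.
    return 1.0 - 1.8 * rmae_v - 0.6 * ewmae_v - 3.0 * rds_v - 4.6 * rtsd_v
```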

2.5 Running Time Evaluation

The proposed algorithms are required to process the RGB and sparse depth sequences in real time. Participants are required to include, in the submitted readme file, the average running time for one pair of RGB and depth data using their algorithm, together with the information of the device used. Due to the difference in devices used for evaluation, we set different running time requirements for different types of devices according to the AI benchmark data from https://ai-benchmark.com/ranking_deeplearning_detailed.html. Although the running time is still far from real-time and low power consumption on edge computing devices, we believe this could set up a good starting point for researchers to further push the limit in both academia and industry.
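For reference, here is a hedged sketch of how the average per-frame latency could be measured in PyTorch, assuming a model that takes an RGB tensor and a sparse depth tensor; on the reference GeForce RTX 2080 Ti the returned value should stay below roughly 0.033 s for the 30 fps requirement.

```python
import time
import torch

@torch.no_grad()
def average_inference_time(model, rgb, sparse, n_warmup=10, n_runs=100):
    """Average per-pair latency of a depth-completion model on the current GPU.

    Illustrative timing recipe (model, rgb and sparse are placeholders): CUDA is
    asynchronous, so the device must be synchronised around the timed region.
    """
    model.eval()
    for _ in range(n_warmup):
        model(rgb, sparse)
    torch.cuda.synchronize()
    start = time.perf_counter()
    for _ in range(n_runs):
        model(rgb, sparse)
    torch.cuda.synchronize()
    return (time.perf_counter() - start) / n_runs  # seconds per RGB-D pair
```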

3 Challenge Results

From 119 registered participants, 18 teams submitted their results in the validation phase, and 9 teams entered the final phase and submitted valid results, code, executables, and factsheets. Table 1 summarizes the final challenge results. Team 5 (ZoomNeXt) shows the best overall performance, followed by Team 1 (GAMEON) and Team 4 (Singer). The proposed methods are described in Sect. 4 and the team members and affiliations are listed in Appendix A.


Table 1. MIPI 2022 RGB+ToF Depth Completion challenge results and final rankings. Team ZoomNeXt is the challenge winner.

| Team No. | Team name/User name | RMAE | EWMAE | RDS | RTSD | Objective | Subjective | Final |
|---|---|---|---|---|---|---|---|---|
| 5 | ZoomNeXt/Devonn, k-zha14 | 0.02935 | 0.13928 | 0.00004 | 0.00997 | 0.81763 | 3.53125 | 0.76194 |
| 1 | GAMEON/hail hydra | 0.02183 | 0.13256 | 0.00736 | 0.01127 | 0.80723 | 3.55469 | 0.75908 |
| 4 | Singer/Yaxiong Liu | 0.02651 | 0.13843 | 0.00270 | 0.01012 | 0.81457 | 3.32812 | 0.74010 |
| 0 | NPU-CVR/Arya22 | 0.02497 | 0.13278 | 0.00011 | 0.00747 | 0.84071 | 3.14844 | 0.73520 |
| 2 | JingAM/JingAM | 0.03767 | 0.13826 | 0.00459 | 0.00000 | 0.83545 | 2.96094 | 0.71382 |
| 6 | NPU-CVR/jokerWRN | 0.02547 | 0.13418 | 0.00101 | 0.00725 | 0.83729 | 2.75781 | 0.69443 |
| 8 | Anonymous/anonymous | 0.03015 | 0.13627 | 0.00002 | 0.01716 | 0.78497 | 2.76562 | 0.66905 |
| 3 | MainHouse113/renruixdu | 0.03167 | 0.14771 | 0.01103 | 0.01162 | 0.76781 | 2.71875 | 0.65578 |
| 7 | UCLA Vision Lab/laid | 0.03890 | 0.14731 | 0.00014 | 0.00028 | 0.83990 | 2.09375 | 0.62933 |

To analyze the performance on the different testing datasets, we also summarize the objective score (RMAE) and the subjective score (MOS) per dataset in Table 2, namely Synthetic, iPhone dynamic, iPhone static, and Modified phone static. Note that since there is no RTSD score for dynamic datasets, and the RDS in the submitted results is usually very small if participants used hard depth replacement, we only present the RMAE score in this table as an objective indicator. Team ZoomNeXt performs the best in the Modified phone static subset, and moderately in the other subsets. Team GAMEON performs the best in the Synthetic, iPhone dynamic, and iPhone static subsets; however, obvious artifacts could be observed in the Modified phone static subset. Figure 2 shows a single-frame visualization of all the submitted results on the test dataset. The top-left pair is from the Synthetic subset, the top-right pair is from the iPhone dynamic subset, the bottom-left pair is from the iPhone static subset, and the bottom-right pair is from the Modified phone static subset. It can be observed that all the models could reproduce a semantically meaningful dense depth map. Team 1 (GAMEON) shows the best XY-resolution, where tiny structures (chair legs, fingers, etc.) could be correctly estimated. Team 5 (ZoomNeXt) and Team 4 (Singer) show the most stable cross-dataset performance, especially on the modified phone dataset.

Table 2. Evaluations of RMAE and MOS on the testing dataset.

| Team No. | Team Name/User Name | Synthetic RMAE | Synthetic MOS | iPhone dynamic RMAE | iPhone dynamic MOS | iPhone static RMAE | iPhone static MOS | Mod. phone static RMAE | Mod. phone static MOS |
|---|---|---|---|---|---|---|---|---|---|
| 5 | ZoomNeXt/Devonn, k-zha14 | 0.06478 | 3.37500 | 0.01462 | 3.53125 | 0.01455 | 3.40625 | / | 3.81250 |
| 1 | GAMEON/hail hydra | 0.05222 | 4.18750 | 0.00919 | 3.96875 | 0.00915 | 3.84375 | / | 2.21875 |
| 4 | Singer/Yaxiong Liu | 0.06112 | 3.56250 | 0.01264 | 3.34375 | 0.01154 | 3.68750 | / | 2.71875 |
| 0 | NPU-CVR/Arya22 | 0.05940 | 3.50000 | 0.01099 | 3.09375 | 0.01026 | 3.25000 | / | 2.75000 |
| 2 | JingAM/JingAM | 0.0854 | 2.50000 | 0.01685 | 3.50000 | 0.01872 | 3.28125 | / | 2.56250 |
| 6 | NPU-CVR/jokerWRN | 0.06061 | 3.15625 | 0.01116 | 2.62500 | 0.01050 | 2.87500 | / | 2.37500 |
| 8 | Anonymous/anonymous | 0.06458 | 3.87500 | 0.01589 | 2.75000 | 0.01571 | 2.53125 | / | 1.90625 |
| 3 | MainHouse113/renruixdu | 0.06928 | 2.90625 | 0.01718 | 3.06250 | 0.01482 | 2.03125 | / | 2.87500 |
| 7 | UCLA Vision Lab/laid | 0.08677 | 1.71875 | 0.01997 | 2.18750 | 0.01793 | 2.12500 | / | 2.34375 |


Fig. 2. Visualization of results. The set of images shown in the top left is from the Synthetic subset, including the RGB image, sparse depth, ground-truth depth (if available), and the results of the 9 teams. Similarly, the set of images shown in the top right is from the iPhone dynamic subset, the set shown in the bottom left is from the iPhone static subset, and the set shown in the bottom right is from the Modified phone subset.

The running time of submitted methods on their individual platforms is summarized in Table 3. All methods satisfied the real-time requirements when converted to a reference platform with a GeForce RTX 2080 Ti GPU.


Table 3. Running time of submitted methods. All of the methods satisfied the real-time inference requirement.

| Team No. | Team name/User name | Inference time | Testing platform | Upper limit |
|---|---|---|---|---|
| 5 | ZoomNeXt/Devonn, k-zha14 | 18 ms | Tesla V100 GPU | 34 ms |
| 1 | GAMEON/hail hydra | 33 ms | GeForce RTX 2080 Ti | 33 ms |
| 4 | Singer/Yaxiong Liu | 33 ms | GeForce RTX 2060 SUPER | 50 ms |
| 0 | NPU-CVR/Arya22 | 23 ms | GeForce RTX 2080 Ti | 33 ms |
| 2 | JingAM/JingAM | 10 ms | GeForce RTX 3090 | / |
| 6 | NPU-CVR/jokerWRN | 24 ms | GeForce RTX 2080 Ti | 33 ms |
| 8 | Anonymous/anonymous | 7 ms | GeForce RTX 2080 Ti | 33 ms |
| 3 | MainHouse113/renruixdu | 28 ms | GeForce 1080 Ti | 48 ms |
| 7 | UCLA Vision Lab/laid | 81 ms | GeForce 1080 Max-Q | 106 ms |

4 Challenge Methods

In this section, we describe the solutions submitted by all teams participating in the final stage of the MIPI 2022 RGB+ToF Depth Completion Challenge. A brief taxonomy of all the methods is given in Table 4.

Table 4. A taxonomy of all the methods in the final stage.

| Team name | Fusion | Multi-scale | Refinement | Inspired from | Ensemble | Additional data |
|---|---|---|---|---|---|---|
| ZoomNeXt | Late | No | SPN series | FusionNet [33] | No | No |
| GAMEON | Early | Yes | SPN series | NLSPN [24] | No | ARKitScenes [1] |
| Singer | Late | Yes | SPN series | SEHLNet [18] | No | No |
| NPU-CVR | Mid | No | Deformable Conv | GuideNet [31] | No | No |
| JingAM | Late | No | No | FusionNet [33] | No | No |
| Anonymous | Early | No | No | FCN | Yes (Self) | No |
| MainHouse113 | Early | Yes | No | MobileNet [10] | No | No |
| UCLA Vision Lab | Late | No | Bilateral Filter | ScaffNet [34], FusionNet [33] | Yes (Network) | SceneNet [23] |

4.1 ZoomNeXt

Team ZoomNeXt proposes a lightweight and efficient multimodal depth completion (EMDC) model, shown in Fig. 3, designed to better address the following three key problems.

1. How to fuse multi-modality data more effectively? For this problem, the team adopted a Global and Local Depth Prediction (GLDP) framework [11,33]. For the confidence maps used by the fusion module, they adjusted the pathways of the features to obtain the relative certainty of the global and local depth predictions. Furthermore, the losses calculated on the global and local predictions are adaptively weighted to avoid model mismatch.


Fig. 3. Model architecture of ZoomNeXt team.

2. How to reduce the negative effects of missing-value regions in the sparse modality? They first replaced the traditional upsampling in the U-Net of the global network with pixel-shuffle [27], and also removed the batch normalization in the local network, as these are fragile to features with anisotropic distributions and degrade the results.

3. How to better recover scene structures for both objective metrics and subjective quality? They proposed key innovations in terms of SPN [4,5,12,17,24] structure and loss function design. First, they proposed the funnel convolutional spatial propagation network (FCSPN) for depth refinement. FCSPN fuses the point-wise results from large to small dilated convolutions in each stage, and the maximum dilation at each stage is designed to be gradually smaller, thus forming a funnel-like structure stage by stage. Second, they also proposed a corrected gradient loss to handle the extreme depth values (0 or inf) in the ground truth.
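The exact form of ZoomNeXt's corrected gradient loss is not given in the text; the sketch below illustrates the general idea under the assumption that pixels with extreme ground-truth depth (0 or inf) are simply excluded from the finite-difference gradient comparison. The `max_depth` cut-off is an added, hypothetical parameter.

```python
import torch

def corrected_gradient_loss(pred, gt, max_depth=100.0):
    """Gradient-matching loss that ignores extreme ground-truth values (sketch).

    Finite-difference gradients of prediction and GT are compared only where
    both neighbouring GT pixels are valid, i.e. neither 0, inf, nor beyond the
    assumed max_depth cut-off. Tensors are (B, 1, H, W).
    """
    valid = (gt > 0) & torch.isfinite(gt) & (gt < max_depth)
    dx_p, dy_p = pred[..., :, 1:] - pred[..., :, :-1], pred[..., 1:, :] - pred[..., :-1, :]
    dx_g, dy_g = gt[..., :, 1:] - gt[..., :, :-1], gt[..., 1:, :] - gt[..., :-1, :]
    mx = valid[..., :, 1:] & valid[..., :, :-1]   # both pixels of the x-pair are valid
    my = valid[..., 1:, :] & valid[..., :-1, :]   # both pixels of the y-pair are valid
    loss_x = (dx_p - dx_g).abs()[mx].mean() if mx.any() else pred.sum() * 0
    loss_y = (dy_p - dy_g).abs()[my].mean() if my.any() else pred.sum() * 0
    return loss_x + loss_y
```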

4.2 GAMEON

Fig. 4. Model architecture of GAMEON team.

The GAMEON team proposed a multi-scale architecture to complete the sparse depth map with high performance and fast speed. A knowledge distillation method is used to distill information from large models. In addition, they proposed a stability constraint to increase model stability and robustness.


They adopt NLSPN [24] as their baseline and improve it in several aspects. As shown in the left part of Fig. 4, they adopted a multi-scale scheme to produce the final result. At each scale, the proposed EDCNet is used to generate the dense depth map at the current resolution. The detailed architecture of EDCNet is shown in the right part of Fig. 4, where NLSPN is adopted as the base network to produce the dense depth map at 1/2 the resolution of the input, and then two convolution layers are used to refine it and generate the full-resolution result. To enhance performance, they further adopt knowledge distillation to distill information from MiDaS (DPT-Large) [26]. They distill information from the penultimate layer of MiDaS to the penultimate layer of the depth branch of NLSPN. A convolution layer with kernel size 1 is used to bridge the channel dimension gap. Furthermore, they also propose a stability constraint to increase model stability and robustness. In particular, given a training sample, they apply a thin plate splines (TPS) [7] transformation (very small perturbations) to the input and ground truth to generate a transformed sample. After getting the outputs for these two samples, they apply the same transformation to the output of the original sample, which is constrained to be the same as the output of the transformed sample.
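A sketch of such a stability constraint is given below, with a small random affine warp standing in for the TPS transformation that GAMEON actually uses (the function name and warp magnitude are illustrative assumptions): the warped output of the original input is constrained to match the output of the warped input.

```python
import torch
import torch.nn.functional as F

def stability_loss(model, rgb, sparse, max_shift=0.01):
    """Consistency term in the spirit of GAMEON's stability constraint (sketch).

    A small random translation replaces TPS for simplicity; the model output
    for the warped input should match the warped output of the original input.
    """
    b = rgb.shape[0]
    # Identity affine with a small random translation (a crude stand-in for TPS).
    theta = torch.eye(2, 3, device=rgb.device).unsqueeze(0).repeat(b, 1, 1)
    theta[:, :, 2] = (torch.rand(b, 2, device=rgb.device) - 0.5) * 2 * max_shift
    grid = F.affine_grid(theta, rgb.shape, align_corners=False)

    warp = lambda x: F.grid_sample(x, grid, mode="nearest", align_corners=False)
    out_orig = model(rgb, sparse)
    out_warp = model(warp(rgb), warp(sparse))
    return F.l1_loss(warp(out_orig), out_warp)
```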

4.3 Singer

Fig. 5. Model architecture of Singer team.

Team Singer proposed a depth completion approach that separately estimates high- and low-frequency components, to address the problem of how to sufficiently utilize multimodal data. Based on their previous work [18], they proposed a novel Laplacian pyramid-based [2,14,29] depth completion network, which estimates low-frequency components from sparse depth maps by downsampling and contains a Laplacian pyramid decoder that estimates multi-scale residuals to reconstruct complex details of the scene. The overall architecture of the network is shown in Fig. 5.

They use two independent MobileNetV2 backbones to extract features from the RGB image and the sparse ToF depth. The features from the two branches are fused in a ResNet-based encoder. To recover high-frequency scene structures while saving computational cost, the proposed Laplacian pyramid representation progressively adds information in different frequency bands so that the scene structure at different scales can be hierarchically recovered during reconstruction. Instead of simple upsampling and summation of the two parts in the two frequency bands, they proposed a global-local refinement network (GLRN) to make full use of the different levels of features from the encoder to estimate residuals at various scales and refine them. To save the computational cost of using spatial propagation networks, they introduced a dynamic mask for the fixed kernel of CSPN, termed the Affinity Decay Spatial Propagation Network (AD-SPN), which is used to refine the estimated depth maps at various scales through spatial propagation.
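To make the Laplacian-pyramid idea concrete, here is a minimal, generic decomposition/reconstruction sketch in PyTorch; Singer's actual network predicts the band-pass residuals with learned decoders rather than computing them from a given depth map.

```python
import torch
import torch.nn.functional as F

def laplacian_pyramid(depth, levels=3):
    """Decompose a dense depth map (B, 1, H, W) into a Laplacian pyramid.

    Low-frequency content lives in the coarsest level, and band-pass residuals
    add back scene detail per scale.
    """
    current, pyramid = depth, []
    for _ in range(levels):
        down = F.avg_pool2d(current, kernel_size=2)
        up = F.interpolate(down, size=current.shape[-2:], mode="bilinear",
                           align_corners=False)
        pyramid.append(current - up)   # band-pass residual at this scale
        current = down
    pyramid.append(current)            # low-frequency base
    return pyramid

def reconstruct(pyramid):
    """Invert the decomposition by upsampling and adding residuals coarse-to-fine."""
    current = pyramid[-1]
    for residual in reversed(pyramid[:-1]):
        current = F.interpolate(current, size=residual.shape[-2:], mode="bilinear",
                                align_corners=False) + residual
    return current
```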

4.4 NPU-CVR

Fig. 6. Model architecture of NPU-CVR team.

Team NPU-CVR submitted two results that differ in the network structural parameter settings. To efficiently complete the sparse and noisy depth maps captured by the ToF camera into dense depth maps, they propose an RGB-guided depth completion method. The overall framework of their method, shown in Fig. 6, is based on residual learning. Given a sparse depth map, they first fill it into a coarse dense depth map by pre-processing. Then, they obtain a residual from the network based on the proposed guided feature fusion block, and the residual is added to the coarse depth map to obtain the fine depth map. In addition, they also propose a depth map optimization block that employs deformable convolution to further improve the performance of the method.
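The pre-processing that produces the coarse dense map is not specified; one simple possibility, shown below as an assumption rather than NPU-CVR's actual implementation, is nearest-valid-spot filling with SciPy, after which a network only has to predict a residual correction.

```python
import numpy as np
from scipy import ndimage

def coarse_fill(sparse_depth):
    """Fill a sparse spot-depth map into a coarse dense map (illustrative sketch).

    Every invalid pixel (value <= 0) copies the depth of its nearest valid spot.
    """
    invalid = sparse_depth <= 0
    # Indices of the nearest valid pixel for every pixel (shape (2, H, W)).
    _, idx = ndimage.distance_transform_edt(invalid, return_indices=True)
    return sparse_depth[idx[0], idx[1]]

# Residual learning: the final prediction is coarse + network residual, e.g.
#   dense = coarse_fill(sparse); fine = dense + residual_net(rgb, dense)
```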

4.5 JingAM

Team JingAM improves on the FusionNet proposed in [33], which takes an RGB map and a sparse depth map as input. The network is divided into a global network and a local network; global information is obtained through the global network. In addition, a confidence map is used to combine the two predictions according to their uncertainty in a late fusion manner. On this basis, they added a skip-level structure and more bottlenecks, and removed the guidance map, replacing it with the global depth prediction. The loss function weights of the prediction map, global map, and local map are 1, 0.2, and 0.1, respectively. In terms of data augmentation, the brightness, saturation, and contrast were randomly adjusted from 0.1 to 0.8, the sampling step size of the depth map is randomly adjusted between 5 and 12, and the input size is changed from a fixed size to a random size. This series of strategies significantly improved RMSE and MAE compared to [33].

4.6 Anonymous

Fig. 7. Model architecture of Anonymous team.

Team Anonymous proposed a method based on a fully convolutional network (FCN), which consists of 2 downsampling convolutional layers, 12 residual blocks, and 2 upsampling convolutional layers to produce the final result, as shown in Fig. 7. During training, 25 sequential images and their sparse depth maps from one scene of the provided training dataset are selected as inputs, and the FCN predicts 25 dense depth maps. The predicted dense depth maps and their ground truth are used to calculate an element-wise training loss consisting of an L1 loss, an L2 loss, and an RMAE loss. Furthermore, RTSD is calculated on the predicted dense depth maps and used as an additional term of the training loss. During evaluation, the input RGB image and the sparse depth map are also concatenated along the channel dimension and fed to the FCN to predict the final dense depth map.
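A sketch of such a clip-level training loss is given below; the relative weights are placeholders, not the team's values. It combines element-wise L1, L2, and RMAE terms with an RTSD-style temporal-consistency term computed on the predicted sequence.

```python
import torch

def sequence_loss(pred_seq, gt_seq, w_l1=1.0, w_l2=1.0, w_rmae=1.0, w_rtsd=1.0):
    """Training loss over a clip of frames, shape (F, 1, H, W) (illustrative sketch)."""
    l1 = (pred_seq - gt_seq).abs().mean()
    l2 = ((pred_seq - gt_seq) ** 2).mean()
    rmae = ((pred_seq - gt_seq).abs() / gt_seq.clamp(min=1e-6)).mean()
    # Temporal std of the prediction, normalised by its temporal mean (dim 0 = frames).
    rtsd = (pred_seq.std(dim=0) / pred_seq.mean(dim=0).clamp(min=1e-6)).mean()
    return w_l1 * l1 + w_l2 * l2 + w_rmae * rmae + w_rtsd * rtsd
```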

4.7 MainHouse113

As shown in Fig. 8, Team MainHouse113 uses a multi-scale joint prediction network (MSPNet) that takes the RGB image and the sparse depth image as input and simultaneously predicts the dense depth image and an uncertainty map. They introduce an uncertainty-driven loss to guide network training and leverage a coarse-to-fine strategy to train the model. Second, they use an uncertainty attention residual learning network (UARNet), which takes the RGB image, the sparse depth image, and the dense depth image from MSPNet as input and outputs a residual dense depth image. The uncertainty map from MSPNet serves as an attention map when training the UARNet. The final depth prediction is the element-wise sum of the two dense depth images from the two networks above. To meet the speed requirement, they use depthwise separable convolutions instead of ordinary convolutions in the first step, which may slightly reduce accuracy.

Fig. 8. Model architecture of MainHouse113 team.

4.8 UCLA Vision Lab

Fig. 9. Model architecture of UCLA team.

Team UCLA Vision Lab proposed a method that aims to learn a prior on the shapes populating the scene from only the sparse points. This is realized as a lightweight encoder-decoder network (ScaffNet) [34], and because it is conditioned only on the sparse points, it generalizes across domains. A second network, also an encoder-decoder network (FusionNet) [31], takes the putative depth map and the RGB color image as input using separate encoder branches and outputs a residual to refine the prior. Because the two networks are extremely lightweight (13 ms per forward pass), the team is able to instantiate an ensemble of them to yield robust predictions. Finally, they perform an image-guided bilateral filtering step on the output depth map to smooth out any spurious predictions. The overall model architecture at inference time is shown in Fig. 9.

5 Conclusions

In this paper, we summarized the RGB+ToF Depth Completion challenge in the first Mobile Intelligent Photography and Imaging workshop (MIPI 2022) held in conjunction with ECCV 2022. The participants were provided with a high-quality training/testing dataset, which is now available for researchers to download for future research. We are excited to see the new progress contributed by the submitted solutions in such a short time, all of which are described in this paper. The challenge results are reported and analyzed. For future work, there is still plenty of room for improvement, including dealing with depth outliers/noise, precise depth boundaries, high depth resolution, dark scenes, as well as low latency and low power consumption.

Acknowledgements. We thank Shanghai Artificial Intelligence Laboratory, Sony, and Nanyang Technological University for sponsoring this MIPI 2022 challenge. We thank all the organizers and all the participants for their great work.

A Teams and Affiliations

ZoomNeXt Team
Title: Learning An Efficient Multimodal Depth Completion Model
Members: 1 Dewang Hou ([email protected]), 2 Kai Zhao
Affiliations: 1 Peking University, 2 Tsinghua University

GAMEON Team
Title: A multi-scale depth completion network with high stability
Members: 1 Liying Lu ([email protected]), 2 Yu Li, 1 Huaijia Lin, 3 Ruizheng Wu, 3 Jiangbo Lu, 1 Jiaya Jia
Affiliations: 1 The Chinese University of Hong Kong, 2 International Digital Economy Academy (IDEA), 3 SmartMore

Singer Team
Title: Depth Completion Using Laplacian Pyramid-Based Depth Residuals
Members: Qiang Liu ([email protected]), Haosong Yue, Danyang Cao, Lehang Yu, Jiaxuan Quan, Jixiang Liang
Affiliations: BeiHang University

NPU-CVR Team
Title: An efficient residual network for depth completion of sparse Time-of-Flight depth maps
Members: Yufei Wang ([email protected]), Yuchao Dai, Peng Yang
Affiliations: School of Electronics and Information, Northwestern Polytechnical University

JingAM Team
Title: Depth completion with RGB guidance and confidence
Members: Hu Yan ([email protected]), Houbiao Liu, Siyuan Su, Xuanhe Li
Affiliations: Amlogic, Shanghai, China

Anonymous Team
Title: Prediction consistency is learning from yourself

MainHouse113 Team
Title: Uncertainty-based deep learning framework with depthwise separable convolution for depth completion
Members: Rui Ren ([email protected]), Yunlong Liu, Yufan Zhu
Affiliations: Xidian University

UCLA Vision Lab Team
Title: Learning shape priors from synthetic data for depth completion
Members: 1 Dong Lao ([email protected]), Alex Wong, 1 Katie Chang
Affiliations: 1 UCLA, 2 Yale University


References

1. Baruch, G., et al.: ARKitScenes - a diverse real-world dataset for 3D indoor scene understanding using mobile RGB-D data. arXiv preprint arXiv:2111.08897 (2021)
2. Chen, X., Chen, X., Zhang, Y., Fu, X., Zha, Z.J.: Laplacian pyramid neural network for dense continuous-value regression for complex scenes. IEEE Trans. Neural Netw. Learn. Syst. 32(11), 5034–5046 (2020)
3. Chen, Z., Badrinarayanan, V., Drozdov, G., Rabinovich, A.: Estimating depth from RGB and sparse sensing. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 167–182 (2018)
4. Cheng, X., Wang, P., Guan, C., Yang, R.: CSPN++: learning context and resource aware convolutional spatial propagation networks for depth completion. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 34, pp. 10615–10622 (2020)
5. Cheng, X., Wang, P., Yang, R.: Depth estimation via affinity learned with convolutional spatial propagation network. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 103–119 (2018)
6. Cheng, X., Wang, P., Yang, R.: Learning depth with convolutional spatial propagation network. IEEE Trans. Pattern Anal. Mach. Intell. 42(10), 2361–2379 (2019)
7. Duchon, J.: Splines minimizing rotation-invariant semi-norms in Sobolev spaces. In: Schempp, W., Zeller, K. (eds.) Constructive Theory of Functions of Several Variables, pp. 85–100. Springer, Heidelberg (1977). https://doi.org/10.1007/BFb0086566
8. Eldesokey, A., Felsberg, M., Holmquist, K., Persson, M.: Uncertainty-aware CNNs for depth completion: uncertainty from beginning to end. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 12014–12023 (2020)
9. Eldesokey, A., Felsberg, M., Khan, F.S.: Confidence propagation through CNNs for guided sparse depth regression. IEEE Trans. Pattern Anal. Mach. Intell. 42(10), 2423–2436 (2019)
10. Howard, A.G., et al.: MobileNets: efficient convolutional neural networks for mobile vision applications. arXiv preprint arXiv:1704.04861 (2017)
11. Hu, J., et al.: Deep depth completion: a survey. arXiv preprint arXiv:2205.05335 (2022)
12. Hu, M., Wang, S., Li, B., Ning, S., Fan, L., Gong, X.: PENet: towards precise and efficient image guided depth completion. In: 2021 IEEE International Conference on Robotics and Automation (ICRA), pp. 13656–13662. IEEE (2021)
13. Imran, S., Long, Y., Liu, X., Morris, D.: Depth coefficients for depth completion. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 12438–12447. IEEE (2019)
14. Jeon, J., Lee, S.: Reconstruction-based pairwise depth dataset for depth image enhancement using CNN. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 422–438 (2018)
15. Lee, B.U., Lee, K., Kweon, I.S.: Depth completion using plane-residual representation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13916–13925 (2021)
16. Li, A., Yuan, Z., Ling, Y., Chi, W., Zhang, C., et al.: A multi-scale guided cascade hourglass network for depth completion. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 32–40 (2020)
17. Lin, Y., Cheng, T., Zhong, Q., Zhou, W., Yang, H.: Dynamic spatial propagation network for depth completion. arXiv preprint arXiv:2202.09769 (2022)
18. Liu, Q., Yue, H., Lyu, Z., Wang, W., Liu, Z., Chen, W.: SEHLNet: separate estimation of high- and low-frequency components for depth completion. In: 2022 International Conference on Robotics and Automation (ICRA), pp. 668–674. IEEE (2022)
19. López-Randulfe, J., Veiga, C., Rodríguez-Andina, J.J., Farina, J.: A quantitative method for selecting denoising filters, based on a new edge-sensitive metric. In: 2017 IEEE International Conference on Industrial Technology (ICIT), pp. 974–979. IEEE (2017)
20. Lopez-Rodriguez, A., Busam, B., Mikolajczyk, K.: Project to adapt: domain adaptation for depth completion from noisy and sparse sensor data. In: Proceedings of the Asian Conference on Computer Vision (2020)
21. Luetzenburg, G., Kroon, A., Bjørk, A.A.: Evaluation of the Apple iPhone 12 Pro LiDAR for an application in geosciences. Sci. Rep. 11(1), 1–9 (2021)
22. Ma, F., Cavalheiro, G.V., Karaman, S.: Self-supervised sparse-to-dense: self-supervised depth completion from LiDAR and monocular camera. In: 2019 International Conference on Robotics and Automation (ICRA), pp. 3288–3295. IEEE (2019)
23. McCormac, J., Handa, A., Leutenegger, S., Davison, A.J.: SceneNet RGB-D: can 5M synthetic images beat generic ImageNet pre-training on indoor segmentation? In: Proceedings of the IEEE International Conference on Computer Vision, pp. 2678–2687 (2017)
24. Park, J., Joo, K., Hu, Z., Liu, C.-K., So Kweon, I.: Non-local spatial propagation network for depth completion. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.M. (eds.) ECCV 2020. LNCS, vol. 12358, pp. 120–136. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58601-0_8
25. Qu, C., Nguyen, T., Taylor, C.: Depth completion via deep basis fitting. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 71–80 (2020)
26. Ranftl, R., Lasinger, K., Hafner, D., Schindler, K., Koltun, V.: Towards robust monocular depth estimation: mixing datasets for zero-shot cross-dataset transfer. IEEE Trans. Pattern Anal. Mach. Intell. 44(3), 1623–1637 (2020)
27. Shi, W., et al.: Real-time single image and video super-resolution using an efficient sub-pixel convolutional neural network. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1874–1883 (2016)
28. Silberman, N., Hoiem, D., Kohli, P., Fergus, R.: Indoor segmentation and support inference from RGBD images. In: Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C. (eds.) ECCV 2012. LNCS, vol. 7576, pp. 746–760. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-33715-4_54
29. Song, M., Lim, S., Kim, W.: Monocular depth estimation using Laplacian pyramid-based depth residuals. IEEE Trans. Circuits Syst. Video Technol. 31(11), 4381–4393 (2021)
30. Sun, P., et al.: Scalability in perception for autonomous driving: Waymo open dataset. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 2446–2454 (2020)
31. Tang, J., Tian, F.P., Feng, W., Li, J., Tan, P.: Learning guided convolutional network for depth completion. IEEE Trans. Image Process. 30, 1116–1129 (2020)
32. Uhrig, J., Schneider, N., Schneider, L., Franke, U., Brox, T., Geiger, A.: Sparsity invariant CNNs. In: International Conference on 3D Vision (3DV) (2017)
33. Van Gansbeke, W., Neven, D., De Brabandere, B., Van Gool, L.: Sparse and noisy LiDAR completion with RGB guidance and uncertainty. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6. IEEE (2019)
34. Wong, A., Cicek, S., Soatto, S.: Learning topology from synthetic data for unsupervised depth completion. IEEE Robot. Autom. Lett. 6(2), 1495–1502 (2021)

MIPI 2022 Challenge on Quad-Bayer Re-mosaic: Dataset and Report

Qingyu Yang1(B), Guang Yang1, Jun Jiang1, Chongyi Li4, Ruicheng Feng4, Shangchen Zhou4, Wenxiu Sun2,3, Qingpeng Zhu2, Chen Change Loy4, Jinwei Gu1,3, Zhen Wang5, Daoyu Li5, Yuzhe Zhang5, Lintao Peng5, Xuyang Chang5, Yinuo Zhang5, Yaqi Wu6, Xun Wu7, Zhihao Fan8, Chengjie Xia9, Feng Zhang6,7,8,9, Haijin Zeng10, Kai Feng11, Yongqiang Zhao11, Hiep Quang Luong10, Jan Aelterman10, Anh Minh Truong10, Wilfried Philips10, Xiaohong Liu12, Jun Jia12, Hanchi Sun12, Guangtao Zhai12, Longan Xiao13, Qihang Xu13, Ting Jiang14, Qi Wu14, Chengzhi Jiang14, Mingyan Han14, Xinpeng Li14, Wenjie Lin14, Youwei Li14, Haoqiang Fan14, Shuaicheng Liu14, Rongyuan Wu15, Lingchen Sun15, and Qiaosi Yi15,16

1 SenseBrain, San Jose, USA {yangqingyu,jiangjun}@sensebrain.site
2 SenseTime Research and Tetras.AI, Beijing, China
3 Shanghai AI Laboratory, Shanghai, China
4 Nanyang Technological University, Singapore, Singapore
5 Beijing Institute of Technology, Beijing, China [email protected]
6 Harbin Institute of Technology, Harbin 150001, China
7 Tsinghua University, Beijing 100084, China
8 University of Shanghai for Science and Technology, Shanghai 200093, China
9 Zhejiang University, Hangzhou 310027, China
10 IMEC & Ghent University, Ghent, Belgium [email protected]
11 Northwestern Polytechnical University, Xi'an, China
12 Shanghai Jiao Tong University, Shanghai, China
13 Transsion, Shenzhen, China
14 Megvii Technology, Beijing, China [email protected]
15 OPPO Research Institute, Shenzhen, China
16 East China Normal University, Shanghai, China

Abstract. Developing and integrating advanced image sensors with novel algorithms in camera systems is prevalent with the increasing demand for computational photography and imaging on mobile platforms. However, the lack of high-quality data for research and the rare opportunity for in-depth exchange of views from industry and academia constrain the development of mobile intelligent photography and imaging (MIPI). To bridge the gap, we introduce the first MIPI challenge, including five tracks focusing on novel image sensors and imaging algorithms. In this paper, Quad Joint Remosaic and Denoise, one of the five tracks, working on the interpolation of the Quad CFA to a Bayer pattern at full resolution, is introduced. The participants were provided with a new dataset including 70 (training) and 15 (validation) scenes of high-quality Quad and Bayer pairs. In addition, for each scene, Quad data of different noise levels were provided at 0 dB, 24 dB, and 42 dB. All the data were captured using a Quad sensor in both outdoor and indoor conditions. The final results are evaluated using objective metrics including PSNR, SSIM [6], LPIPS [10], and KLD. A detailed description of all models developed in this challenge is provided in this paper. More details of this challenge and the link to the dataset can be found at https://github.com/mipi-challenge/MIPI2022.

Q. Yang, J. Jiang, C. Li, S. Zhou, R. Feng, W. Sun, Q. Zhu, C. C. Loy, and J. Gu are the MIPI 2022 challenge organizers. The other authors participated in the challenge. Please refer to Appendix A for details. MIPI 2022 challenge website: http://mipi-challenge.org/.

Keywords: Quad · Remosaic · Bayer · Denoise · MIPI challenge

1 Introduction

Quad is a popular CFA pattern (Fig. 1) widely used in smartphone cameras. Its binning mode is used under low light to achieve enhanced image quality by averaging four pixels within a 2 × 2 neighborhood. While SNR is improved in the binning mode, the spatial resolution is reduced as a tradeoff. To allow the output Bayer to have the same spatial resolution as the input Quad under normal lighting conditions, we need an interpolation procedure to convert Quad to a Bayer pattern. This interpolation process is usually referred to as remosaic. A good remosaic algorithm should be able to obtain the Bayer output from Quad with the least artifacts, such as moire patterns, false color, and so forth. The remosaic problem becomes more challenging when the input Quad becomes noisy. A joint remosaic and denoise task is thus in demand for real-world applications.

Fig. 1. The Quad remosaic task.

In this challenge, we intend to remosaic the Quad input to obtain a Bayer output at the same spatial resolution. The solution is not necessarily a deep-learning one. However, to facilitate deep learning training, we provide a dataset of high-quality Quad and Bayer pairs (70 scenes for training, 15 for validation, and 15 for testing). We provide a data loader to read these files and a simple ISP, shown in Fig. 2, to visualize the RGB output from the Bayer and to calculate loss functions. The participants are also allowed to use other public-domain datasets. The algorithm performance is evaluated and ranked using objective metrics: Peak Signal-to-Noise Ratio (PSNR), Structural Similarity Index (SSIM) [6], Learned Perceptual Image Patch Similarity (LPIPS) [10], and KL divergence (KLD). The objective metrics of a baseline method are available as well to provide a benchmark.

Fig. 2. An ISP to visualize the output Bayer and to calculate the loss function.

This challenge is a part of the Mobile Intelligent Photography and Imaging (MIPI) 2022 workshop and challenges, which emphasize the integration of novel image sensors and imaging algorithms and are held in conjunction with ECCV 2022. It consists of five competition tracks:

1. RGB+ToF Depth Completion uses sparse, noisy ToF depth measurements with RGB images to obtain a complete depth map.
2. Quad-Bayer Re-mosaic converts Quad-Bayer RAW data into Bayer format so that it can be processed with standard ISPs.
3. RGBW Sensor Re-mosaic converts RGBW RAW data into Bayer format so that it can be processed with standard ISPs.
4. RGBW Sensor Fusion fuses Bayer data and monochrome channel data into Bayer format to increase SNR and spatial resolution.
5. Under-display Camera Image Restoration improves the visual quality of images captured by a new imaging system equipped with an under-display camera.

2 Challenge

To develop a high-quality Quad remosaic solution, we provide the following resources for participants:

– A high-quality Quad and Bayer dataset. As far as we know, this is the first and only dataset consisting of aligned Quad and Bayer pairs, relieving the pain of data collection for developing learning-based remosaic algorithms;
– Data processing code with a data loader to help participants get familiar with the provided dataset;
– A simple ISP including basic ISP blocks to visualize the algorithm output and to calculate the loss function on RGB results;
– A set of objective image quality metrics to measure the performance of a developed solution.

2.1 Problem Definition

Quad remosaic aims to interpolate the input Quad CFA pattern to obtain a Bayer pattern of the same resolution. The remosaic task is needed mainly because current camera ISPs usually cannot process CFAs other than the Bayer pattern. In addition, the remosaic task becomes more challenging as the noise level gets higher, thus requiring more advanced algorithms to avoid image quality artifacts. Beyond the image quality requirement, Quad sensors are widely used on smartphones with limited computational budget and battery life, requiring the remosaic algorithm to be lightweight at the same time. While we do not rank solutions based on running time or memory footprint, computational cost is one of the most important criteria in real applications.

2.2 Dataset: Tetras-Quad

The training data contains 70 scenes of aligned Quad (input) and Bayer (ground truth) pairs. For each scene, noise is synthesized on the 0 dB Quad input to provide noisy Quad inputs at 24 dB and 42 dB, respectively. The synthesized noise consists of read noise and shot noise, and the noise model is measured on a Quad sensor. The data generation steps are shown in Fig. 3. The testing data contains Quad inputs of 15 scenes at 0 dB, 24 dB, and 42 dB, but the GT Bayer results are not available to participants.

Fig. 3. Data generation of the Quad remosaic task. The Quad raw data is captured using a Quad sensor and cropped to be 2400 × 3600. A Qbin Bayer is obtained by averaging a 2 × 2 block of the same color, and we demosaic the Bayer to get an RGB image using the demosaic-net. The RGB image is in turn mosaiced to get the input Quad and the aligned ground truth Bayer.
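The calibrated sensor noise model is not released with exact parameters here; the sketch below uses the generic heteroscedastic read-plus-shot formulation (signal-dependent variance plus a constant floor) with illustrative gain values, not the measured 24 dB/42 dB settings.

```python
import numpy as np

def add_read_shot_noise(clean, shot_gain=0.01, read_sigma=2.0,
                        white_level=1023.0, seed=0):
    """Add heteroscedastic read + shot noise to a clean Quad raw image (sketch).

    Shot noise variance scales with the signal, read noise adds a constant
    floor; `shot_gain` and `read_sigma` are illustrative placeholder values.
    """
    rng = np.random.default_rng(seed)
    variance = shot_gain * clean + read_sigma ** 2
    noisy = clean + rng.normal(0.0, np.sqrt(variance), size=clean.shape)
    return np.clip(noisy, 0.0, white_level)
```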

2.3 Challenge Phases

The challenge consisted of the following phases:

1. Development: The registered participants get access to the data and baseline code, and are able to train the models and evaluate their running time locally.
2. Validation: The participants can upload their models to the remote server to check the fidelity scores on the validation dataset, and to compare their results on the validation leaderboard.
3. Testing: The participants submit their final results, code, models, and factsheets.

2.4 Scoring System

Objective Evaluation. The evaluation consists of (1) the comparison of the remosaic output (Bayer) with the reference ground-truth Bayer, and (2) the comparison of the RGB images obtained from the predicted and ground-truth Bayer using a simple ISP (the code of the simple ISP is provided). We use

1. Peak Signal-to-Noise Ratio (PSNR)
2. Structural Similarity Index Measure (SSIM) [6]
3. KL divergence (KLD)
4. Learned Perceptual Image Patch Similarity (LPIPS) [10]

to evaluate the remosaic performance. PSNR, SSIM, and LPIPS are applied to the RGB obtained from the Bayer using the provided simple ISP code, while the KL divergence is evaluated on the predicted Bayer directly. A metric weighting PSNR, SSIM, KL divergence, and LPIPS is used to give the final ranking of each method, and we report each metric separately as well. The code to calculate the metrics is provided. The weighted metric is shown below; the M4 score is between 0 and 100, and the higher the score, the better the overall image quality:

$$M4 = \mathrm{PSNR}\cdot\mathrm{SSIM}\cdot 2^{\,1-\mathrm{LPIPS}-\mathrm{KLD}}. \qquad (1)$$

For each dataset, we report the average results over all the processed images belonging to it.
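The weighted metric of Eq. (1) is straightforward to compute once the four component metrics are available; a minimal helper is shown below with hypothetical input values.

```python
def m4_score(psnr, ssim, lpips, kld):
    """Weighted metric of Eq. (1): higher is better, roughly in [0, 100]."""
    return psnr * ssim * 2 ** (1.0 - lpips - kld)

# Example usage with hypothetical metric values:
print(m4_score(psnr=37.0, ssim=0.96, lpips=0.10, kld=0.02))
```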

3 Challenge Results

Six teams submitted their results in the final phase, and their results have been verified using their submitted code as well. Table 1 summarizes the final challenge results. op-summer-po, JHC-SJTU, and IMEC-IPI & NPU-MPI are the top three teams ranked by M4 as defined in Eq. 1, and op-summer-po shows the best overall performance. The proposed methods are described in Sect. 4 and the team members and affiliations are listed in Appendix A.


Table 1. MIPI 2022 Joint Quad Remosaic and Denoise challenge results and final rankings. PSNR, SSIM, LPIPS, and KLD are calculated between the submitted results from each team and the ground-truth data. The weighted metric M4 of Eq. 1 is used to rank algorithm performance, and the team with the highest M4 is the winner. The M4 scores of the top 3 teams are highlighted.

| Team name | PSNR | SSIM | LPIPS | KLD | M4 |
|---|---|---|---|---|---|
| BITSpectral | 37.2 | 0.96 | 0.11 | 0.03 | 66 |
| HITZST01 | 37.2 | 0.96 | 0.11 | 0.06 | 64.82 |
| IMEC-IPI & NPU-MPI | 37.76 | 0.96 | 0.1 | 0.014 | 67.95 |
| JHC-SJTU | 37.64 | 0.96 | 0.1 | 0.007 | 67.99 |
| MegNR | 36.08 | 0.95 | 0.095 | 0.023 | 64.1 |
| op-summer-po | 37.93 | 0.965 | 0.104 | 0.019 | 68.03 |

To learn more about the algorithm performance, we also evaluated qualitative image quality in addition to the objective IQ metrics, as shown in Figs. 4 and 5. While all teams in Table 1 achieved high PSNR and SSIM, detail and texture loss can be found on the Teddy bear in Fig. 4 and on the trees in Fig. 5 when the input has a high amount of noise. Oversmoothing tends to yield higher PSNR at the cost of perceptual detail loss.

Fig. 4. Qualitative image quality (IQ) comparison. The results of one of the test scenes (42 dB) are shown. While the top three remosaic methods achieve high objective IQ metrics in Table 1, details and texture loss are noticeable on the Teddy bear.


Fig. 5. Qualitative image quality (IQ) comparison. The results of one of the test scenes (42 dB) are shown. Oversmoothing in the top three methods in Table 1 can be found when compared with the ground truth. The trees in the image can barely be identified in (b),(c) and (d).

In addition to benchmarking the image quality of remosaic algorithms, computational efficiency is evaluated because of the wide adoption of Quad sensors on smartphones. We measured the running time of the remosaic solutions of the top three teams (based on M4 by Eq. 1) in Table 2. While running time is not employed in the challenge to rank remosaic algorithms, the computational cost is of critical importance when developing algorithms for smartphones. JHC-SJTU achieved the shortest running time among the top three solutions on a workstation GPU (NVIDIA Tesla V100-SXM2-32GB). With the sensor resolution of mainstream smartphones reaching 64M or even higher, power-efficient remosaic algorithms are highly desirable.

Table 2. Running time of the top three solutions ranked by Eq. 1 in the 2022 Joint Quad Remosaic and Denoise challenge. The running time on a 1200 × 1800 input was measured, while the running time on a 64M input Quad is an estimate. The measurements were taken on an NVIDIA Tesla V100-SXM2-32GB GPU.

| Team name | 1200 × 1800 (measured) | 64M (estimated) |
|---|---|---|
| IMEC-IPI & NPU-MPI | 6.1 s | 180 s |
| JHC-SJTU | 1 s | 29.6 s |
| op-summer-po | 4.4 s | 130 s |

4 Challenge Methods

In this section, we describe the solutions submitted by all teams participating in the final stage of the MIPI 2022 Joint Quad Remosaic and Denoise Challenge.

4.1 MegNR

The overall pipeline is shown in Fig. 6. The Quad input is first split into independent channels by pixel-unshuffle (PU) [4], which serves as a pre-processing module, and the channels are then fed into the Quad remosaic and reconstruction network, named HAUformer, which is similar to Uformer [5]. The difference between them is that two modules, the Hybrid Attention Local-Enhanced Block (HALEB) and the Overlapping Cross-Attention Block (OCAB), are used to build groups replacing the LeWin blocks [5], in order to capture more long-range dependencies and useful local context. Finally, pixel-shuffle (PS) [3] serves as a post-processing module: the independent channels output by the network are restored to the standard Bayer output. The main contributions can be summarized as follows:

– MegNR introduces channel attention and an overlapping cross-attention module to the Transformer to better aggregate input and cross-window information. This simple design significantly improves the remosaic and restoration quality.
– A new loss function (Eq. 2) is proposed to constrain the network in multiple dimensions and significantly improve the network performance. M denotes the loss function, O_R is the RGB output, and GT is the ground truth. PSNR, SSIM, and LPIPS denote the visual perception metrics.

$$M = -\mathrm{PSNR}(O_R, GT)\cdot \mathrm{SSIM}(O_R, GT)\cdot 2^{\,1-\mathrm{LPIPS}(O_R, GT)} \qquad (2)$$

Fig. 6. Model architecture of MegNR.
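The pre/post-processing that MegNR describes maps directly onto PyTorch's pixel_unshuffle/pixel_shuffle operators; the sketch below wraps a placeholder network (the HAUformer itself is not reproduced) to show the tensor shapes involved.

```python
import torch
import torch.nn.functional as F

def remosaic_with_pixel_shuffle(quad_raw, network, factor=2):
    """Pre/post-processing around the restoration network, as MegNR describes.

    `network` is a placeholder for the HAUformer; pixel-unshuffle packs each
    2x2 Quad block into channels, and pixel-shuffle restores the Bayer-sized
    output at the end.
    """
    packed = F.pixel_unshuffle(quad_raw, factor)   # (B, 1, H, W) -> (B, 4, H/2, W/2)
    restored = network(packed)                     # channel-wise processing
    return F.pixel_shuffle(restored, factor)       # back to (B, 1, H, W)

# Minimal usage with an identity "network" on a dummy 1-channel raw tensor:
dummy = torch.zeros(1, 1, 8, 8)
out = remosaic_with_pixel_shuffle(dummy, network=lambda x: x)
print(out.shape)  # torch.Size([1, 1, 8, 8])
```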

4.2 IMEC-IPI & NPU-MPI

Fig. 7. Model architecture of IMEC-IPI & NPU-MPI.

IMEC-IPI & NPU-MPI proposed a joint denoise and remosaic model consisting of two transformation-invariance-guided parallel pipelines with an independent Swin-Conv-UNet, a wavelet-based denoising module, and a mosaic-attention-based remosaicking module. The overall architecture is shown in Fig. 7. First, to avoid aliasing and artifacts in remosaicking, they utilize the joint spatial-channel correlation in a raw Quad Bayer image by using a mosaic convolution module (MCM) and a mosaic attention module (MAM). Second, for hard cases where the naive model trained on the existing training dataset does not give satisfactory results, they proposed to identify difficult patches and create a better re-training set, leading to enhanced performance on the hard cases. Third, for the denoising module, they employed SCUNet [8] and the DWT [1] to address the realistic noise in this task. SCUNet is used to denoise the Quad Bayer image; they do not use the pre-trained model and only train it on the provided images. The DWT is used to denoise the remosaiced Bayer image. Finally, to answer the question of how to combine denoising and remosaicking, i.e., denoising first or remosaicking first, they proposed a new parallel strategy and found the best way to reconstruct Bayer images from a noisy Quad Bayer mosaic, as shown in Fig. 7. In addition, to further improve the model robustness and generality, they consider transformation invariance: a rigid transformation of the input should lead to an output with the same transformation. Here, this invariance is enforced through the network design instead of enlarging the training dataset by directly rotating the training images.

4.3 HITZST01

Fig. 8. Model architecture of HITZST: a denoising stage followed by a remosaicing stage, each built from a Source Encoder Module, a Feature Refinement Module, and a Final Prediction Module.

HITZST proposed a two-stage paradigm that reformulates this task as denoising first and then remosaicing. By introducing two supervisory signals, the clean QBC Q_clean for the denoising stage and RGB images I_gt generated from the clean Bayer B_gt for the remosaicing stage, the full framework is trained in a two-stage supervised manner, as shown in Fig. 8. In general, the proposed network consists of a denoising stage and a remosaicing stage, and each stage uses the same backbone (JQNet). Specifically, as shown in Fig. 8, each stage consists of three sub-modules: a source encoder module, a feature refinement module and a final prediction module. The Source Encoder Module consists of two convolution layers; its role is to reconstruct image texture information from the mosaic image as the input I_raw of the feature refinement module. The Feature Refinement Module takes I_raw as input to fully select, refine and enhance useful information in the mosaic domain; it is formed by a sequential connection of MCCA [2,11] blocks of different scales. The Final Prediction Module then takes the refined representation I_refine of the mosaic image as input and performs demosaicking to reconstruct the full-channel image.

4.4 BITSpectral

Fig. 9. Model architecture of BITSpectral.

BITSpectral proposed a transformer-based network called the Cross-Patch Attention Network (CPAN), shown in Fig. 9. CPAN follows a residual learning structure. The input convolutional layer extracts the embeddings of the input images, and the embeddings are then delivered into the Cross-Patch Attention Modules (CPAM); the reported network includes 5 CPAMs in total. Different from conventional transformer modules, CPAM directly exploits the global attention between feature patches. CPAM is a U-shaped sub-network, which reduces the computational complexity. We utilize the Swin Transformer Layer (STL) to extract the attention within feature patches in each stage, and the Cross-Patch Attention Block (CPAB) to directly obtain the global attention between patches for the innermost stage. CPAB differs from STL in that it enlarges the global field of perception by cross-patch attention. Current vision transformers, such as Swin Transformer and ViT, perform self-attention within patches; they do not sufficiently consider the connections between patches, and thus their global feature extraction capability is limited. CPAB consists of a Cross-Patch Attention Multi-Head Self-Attention (CPA-MSA) and an MLP layer. CPA-MSA is an improvement on traditional MSA, focusing on the relationship between different patches. The input of CPA-MSA is X ∈ R^(B×N×P×C), where B denotes the training batch size, N is the number of patches cut from each feature map, P is the number of pixels per patch, and C is the number of channels. Firstly, X is reshaped to B × N × PC. Then X is linearly transformed to obtain Q, K, V, whose dimensions are R^(B×h×N×PC/h) for h heads. We compute QK^T ∈ R^(B×h×N×N) and Attention(Q, K, V) ∈ R^(B×h×N×PC/h), and the result is finally reshaped back to the output X_out ∈ R^(B×N×P×C). In the implementation, we divide the dataset into a training set (65 images) and a validation set (5 images). The images in the training set are cropped into patches of 128 × 128 pixels; to improve inference efficiency, the images in the testing and validation sets are not cropped. We utilized the Adam optimizer with β1 = 0.9, β2 = 0.999, ε = 10^−6 for training. To accelerate convergence, we applied a cosine annealing learning rate schedule with a starting learning rate of 4 × 10^−4. On a single 3090 GPU, the training procedure took around 20 h to complete 100 epochs. CPAN needs 0.08 s to reconstruct an image of 1200 × 1800 pixels in both the validation and testing datasets.
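The tensor manipulations described above map directly onto a standard multi-head attention in which each patch is flattened into a single token. The following PyTorch sketch is an illustrative reconstruction from the text (with the stated shapes: B batch, N patches, P pixels per patch, C channels, h heads), not the team's released code.

```python
import torch
import torch.nn as nn

class CrossPatchAttention(nn.Module):
    def __init__(self, patch_pixels: int, channels: int, num_heads: int):
        super().__init__()
        dim = patch_pixels * channels          # each patch becomes one PC-dim token
        assert dim % num_heads == 0
        self.h = num_heads
        self.qkv = nn.Linear(dim, dim * 3)
        self.proj = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        B, N, P, C = x.shape
        tokens = x.reshape(B, N, P * C)                    # (B, N, PC)
        q, k, v = self.qkv(tokens).chunk(3, dim=-1)        # each (B, N, PC)

        def heads(t):
            # split into h heads: (B, h, N, PC/h)
            return t.reshape(B, N, self.h, -1).transpose(1, 2)

        q, k, v = heads(q), heads(k), heads(v)
        attn = (q @ k.transpose(-2, -1)) / (q.shape[-1] ** 0.5)   # (B, h, N, N)
        out = attn.softmax(dim=-1) @ v                            # (B, h, N, PC/h)
        out = out.transpose(1, 2).reshape(B, N, P * C)
        return self.proj(out).reshape(B, N, P, C)                 # back to (B, N, P, C)
```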

4.5 JHC-SJTU

JHC-SJTU proposed a joint remosaicing and denoising pipeline for raw data in the quad-Bayer pattern. The pipeline consists of two stages, as shown in Fig. 10. In the first stage, a multi-resolution feature extraction and fusion network jointly remosaics the quad-Bayer input to a standard Bayer (RGGB) and denoises it; the ground-truth raw data in the RGGB pattern is used for supervision. In the second stage, the simple ISP provided by the organizers is concatenated to the output of the network, and the first-stage network is finetuned using the RGB images generated by the ISP without updating the ISP's parameters (i.e., the demosaicing network is frozen). The joint remosaicing and denoising network is MIRNet [7].

Fig. 10. Model architecture of JHC-SJTU.

The main contributions of the proposed solution include:
– We propose a preprocessing method: for quad-Bayer raw data, we first swap the second and third columns in each 4 × 4 quad-Bayer unit, and then swap the second and third rows of each unit. After that, the quad-Bayer raw data is converted to an RGGB-like Bayer. We then decompose the RGGB map into four channel maps, namely the R, G1, G2, and B channels. With this preprocessing, we can extract the channel information collected from the different sensor pixels (see the sketch after this list).
– After the training in the first stage, we concatenate the trained ISP to the joint remosaicing and denoising network. The output of our network and the ground-truth RGGB raw data are both processed to generate RGB images, and the generated ground-truth images are used as a color supervision to finetune our network. In this stage, we freeze the parameters of the ISP. Compared to raw data, RGB images contain more information, which can further improve the quality of the generated raw data; our experiments demonstrate an improvement in PSNR of about 0.2 dB.
– We also use data augmentation to expand the training samples. In training, cropping, flipping, and transposition are randomly applied to the raw data. In addition, we design a cropping-based method to unify the Bayer pattern after these augmentations. The experimental results show these augmentation methods can increase the PSNR value by about 0.4 dB.
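The column/row swap in the first bullet can be written in a few lines of array slicing. The following NumPy sketch is an illustrative reconstruction of that preprocessing from the text, not the team's released code.

```python
import numpy as np

def quad_to_rggb_like(quad: np.ndarray) -> np.ndarray:
    """quad: (H, W) raw quad-Bayer mosaic with H and W divisible by 4."""
    out = quad.copy()
    # swap the 2nd and 3rd columns of every 4-column unit
    out[:, 1::4], out[:, 2::4] = quad[:, 2::4], quad[:, 1::4]
    tmp = out.copy()
    # swap the 2nd and 3rd rows of every 4-row unit
    out[1::4, :], out[2::4, :] = tmp[2::4, :], tmp[1::4, :]
    return out  # now an RGGB-like Bayer

def unpack_rggb(bayer: np.ndarray) -> np.ndarray:
    """Split an RGGB Bayer into 4 half-resolution channel maps (R, G1, G2, B)."""
    return np.stack([bayer[0::2, 0::2],   # R
                     bayer[0::2, 1::2],   # G1
                     bayer[1::2, 0::2],   # G2
                     bayer[1::2, 1::2]])  # B
```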

Fig. 11. Model architecture of the op-summer-po team.

4.6 Op-summer-po

Op-summer-po proposed to perform joint Quad remosaic and denoising using DRUNet [9]. The overall model architecture is shown in Fig. 11. The DRUNet has four scales with 64, 128, 256, and 512 channels, respectively. Each scale has an identity skip connection between the strided convolution (stride = 2) and the transposed convolution; this connection concatenates encoder and decoder features. Each encoder or decoder includes four residual blocks. To avoid detail loss during denoising, LPIPS [10] losses are applied to both the raw and RGB outputs during training. No pre-trained models or external datasets were used for the task. During training, the resolution of the HR patches is 128 × 128. We implemented the experiments on one V100 GPU with PyTorch, with a batch size of 16 per GPU. The optimizer is Adam. The learning rate is initially set to 0.0001 and is decayed by 0.5 every 100 epochs.

5 Conclusions

In this paper, we summarized the Joint Quad Remosaic and Denoise challenge in the first Mobile Intelligent Photography and Imaging workshop (MIPI 2022) held in conjunction with ECCV 2022. The participants were provided with a high-quality training/testing dataset, which is now available for researchers to download for future research. We are excited to see so many submissions within such a short period, and we look forward to more research in this area.

Acknowledgements. We thank Shanghai Artificial Intelligence Laboratory, Sony, and Nanyang Technological University for sponsoring this MIPI 2022 challenge. We thank all the organizers and participants for their great work.

A Teams and Affiliations

BITSpectral
Title: Cross-Patch Attention Network for Quad Joint Remosaic and Denoise
Members: Zhen Wang ([email protected]), Daoyu Li, Yuzhe Zhang, Lintao Peng, Xuyang Chang, Yinuo Zhang
Affiliations: Beijing Institute of Technology

HITZST01
Title: Multi-scale convolution network (JDRMCANet) with joint denoise and remosaic
Members: 1 Yaqi Wu ([email protected]), 2 Xun Wu, 3 Zhihao Fan, 4 Chengjie Xia, Feng Zhang
Affiliations: 1 Harbin Institute of Technology, Harbin, 150001, China; 2 Tsinghua University, Beijing, 100084, China; 3 University of Shanghai for Science and Technology, Shanghai, 200093, China; 4 Zhejiang University, Hangzhou, 310027, China

IMEC-IPI & NPU-MPI
Title: Hard Datasets and Invariance guided Parallel Swin-Conv-Attention Network for Quad Joint Remosaic and Denoise
Members: 1 Haijin Zeng ([email protected]), 2 Kai Feng, 2 Yongqiang Zhao, 1 Hiep Quang Luong, 1 Jan Aelterman, 1 Anh Minh Truong, 1 Wilfried Philips
Affiliations: 1 IMEC & Ghent University; 2 Northwestern Polytechnical University

JHC-SJTU
Title: Jointly Remosaicing and Denoising for Quad Bayer with Multi-Resolution Feature Extraction and Fusion
Members: 1 Xiaohong Liu ([email protected]), 1 Jun Jia, 1 Hanchi Sun, 1 Guangtao Zhai, 2 Anlong Xiao, 2 Qihang Xu
Affiliations: 1 Shanghai Jiao Tong University; 2 Transsion

MegNR
Title: HAUformer: Hybrid Attention-guided U-shaped Transformer for Quad Remosaic Image Restoration
Members: Ting Jiang ([email protected]), Qi Wu, Chengzhi Jiang, Mingyan Han, Xinpeng Li, Wenjie Lin, Youwei Li, Haoqiang Fan, Shuaicheng Liu
Affiliations: Megvii Technology

op-summer-po
Title: Two LPIPS Functions in Raw and RGB domains for Quad-Bayer Joint Remosaic and Denoise
Members: 1 Rongyuan Wu ([email protected]), 1 Lingchen Sun, 1,2 Qiaosi Yi
Affiliations: 1 OPPO Research Institute; 2 East China Normal University

References
1. Abdelhamed, A., Afifi, M., Timofte, R., Brown, M.S.: NTIRE 2020 challenge on real image denoising: dataset, methods and results. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops (2020)
2. Kim, B.-H., Song, J., Ye, J.C., Baek, J.H.: PyNET-CA: enhanced PyNET with channel attention for end-to-end mobile image signal processing. In: Bartoli, A., Fusiello, A. (eds.) ECCV 2020. LNCS, vol. 12537, pp. 202–212. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-67070-2_12
3. Shi, W., et al.: Real-time single image and video super-resolution using an efficient sub-pixel convolutional neural network. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1874–1883 (2016)
4. Sun, B., Zhang, Y., Jiang, S., Fu, Y.: Hybrid pixel-unshuffled network for lightweight image super-resolution. arXiv preprint arXiv:2203.08921 (2022)
5. Wang, Z., Cun, X., Bao, J., Zhou, W., Liu, J., Li, H.: Uformer: a general U-shaped transformer for image restoration. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 17683–17693 (2022)
6. Wang, Z., Bovik, A.C., Sheikh, H.R., Simoncelli, E.P.: Image quality assessment: from error visibility to structural similarity. IEEE Trans. Image Process. 13(4), 600–612 (2004)
7. Zamir, S.W., et al.: Learning enriched features for real image restoration and enhancement. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12370, pp. 492–511. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58595-2_30
8. Zhang, K., et al.: Practical blind denoising via Swin-Conv-UNet and data synthesis. arXiv preprint (2022)
9. Zhang, K., Li, Y., Zuo, W., Zhang, L., Van Gool, L., Timofte, R.: Plug-and-play image restoration with deep denoiser prior. IEEE Trans. Pattern Anal. Mach. Intell. (2021)
10. Zhang, R., Isola, P., Efros, A.A., Shechtman, E., Wang, O.: The unreasonable effectiveness of deep features as a perceptual metric. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018)
11. Zhang, Y., Li, K., Li, K., Wang, L., Zhong, B., Fu, Y.: Image super-resolution using very deep residual channel attention networks. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11211, pp. 294–310. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01234-2_18

MIPI 2022 Challenge on RGBW Sensor Re-mosaic: Dataset and Report

Qingyu Yang1(B), Guang Yang1,2,3,4,5,6,7, Jun Jiang1, Chongyi Li4, Ruicheng Feng4, Shangchen Zhou4, Wenxiu Sun2,3, Qingpeng Zhu2, Chen Change Loy4, Jinwei Gu1,3, Lingchen Sun5, Rongyuan Wu5, Qiaosi Yi5,6, Rongjian Xu7, Xiaohui Liu7, Zhilu Zhang7, Xiaohe Wu7, Ruohao Wang7, Junyi Li7, Wangmeng Zuo7, and Faming Fang6

1 SenseBrain, San Jose, USA {yangqingyu,jiangjun}@sensebrain.site
2 SenseTime Research and Tetras.AI, Beijing, China
3 Shanghai AI Laboratory, Shanghai, China
4 Nanyang Technological University, Singapore, Singapore
5 OPPO Research Institute, Shenzhen, China [email protected]
6 East China Normal University, Shanghai, China
7 Harbin Institute of Technology, Harbin, China

Abstract. Developing and integrating advanced image sensors with novel algorithms in camera systems is prevalent with the increasing demand for computational photography and imaging on mobile platforms. However, the lack of high-quality data for research and the rare opportunity for in-depth exchange of views from industry and academia constrain the development of mobile intelligent photography and imaging (MIPI). To bridge the gap, we introduce the first MIPI challenge, including five tracks focusing on novel image sensors and imaging algorithms. In this paper, RGBW Joint Remosaic and Denoise, one of the five tracks, working on the interpolation of the RGBW CFA to a Bayer at full resolution, is introduced. The participants were provided with a new dataset including 70 (training) and 15 (validation) scenes of high-quality RGBW and Bayer pairs. In addition, for each scene, RGBW inputs of different noise levels were provided at 0 dB, 24 dB and 42 dB. All the data were captured using an RGBW sensor in both outdoor and indoor conditions. The final results are evaluated using objective metrics including PSNR, SSIM [5], LPIPS [7] and KLD. A detailed description of all models developed in this challenge is provided in this paper. More details of this challenge and the link to the dataset can be found at https://github.com/mipi-challenge/MIPI2022.

Keywords: RGBW · Remosaic · Bayer · Denoise · MIPI challenge

Q. Yang, J. Jiang, C. Li, S. Zhou, R. Feng, W. Sun, Q. Zhu, C. C. Loy, and J. Gu are the MIPI 2022 challenge organizers. The other authors participated in the challenge. Please refer to Appendix A for details. MIPI 2022 challenge website: http://mipi-challenge.org/.

1 Introduction

RGBW is a new type of CFA pattern (Fig. 1) designed for image quality enhancement under low-light conditions. Thanks to the higher optical transmittance of white pixels over conventional red, green and blue pixels, the signal-to-noise ratio (SNR) of the sensor output is significantly improved, thus boosting the image quality especially under low-light conditions. Recently, several phone OEMs, including Transsion, Vivo and Oppo, have adopted RGBW sensors in their flagship smartphones to improve the camera image quality [1–3]. On the other hand, conventional camera ISPs can only work with Bayer patterns, thereby requiring an interpolation procedure to convert RGBW to a Bayer pattern. This interpolation process is usually referred to as remosaic, and a good remosaic algorithm should be able (1) to get a Bayer output from RGBW with the least artifacts, and (2) to fully take advantage of the SNR and resolution benefit of white pixels. The remosaic problem becomes more challenging when the input RGBW becomes noisy, especially under low-light conditions. A joint remosaic and denoise task is thus in demand for real-world applications.

Fig. 1. The RGBW remosaic task.

In this challenge, we intend to remosaic the RGBW input to obtain a Bayer at the same spatial resolution. The solution does not have to be deep-learning-based. However, to facilitate deep learning training, we provide a dataset of high-quality RGBW and Bayer pairs (70 scenes for training, 15 for validation, and 15 for testing). We provide a data loader to read these files and a simple ISP, shown in Fig. 2, to visualize the RGB output from the Bayer and to calculate loss functions. The participants are also allowed to use other public-domain datasets. The algorithm performance is evaluated and ranked using objective metrics: Peak Signal-to-Noise Ratio (PSNR), Structural Similarity Index (SSIM) [5], Learned Perceptual Image Patch Similarity (LPIPS) [7] and KL divergence (KLD). The objective metrics of a baseline method are available as well to provide a benchmark.

Fig. 2. An ISP to visualize the output Bayer and to calculate the loss function.

This challenge is a part of the Mobile Intelligent Photography and Imaging (MIPI) 2022 workshop and challenges, held in conjunction with ECCV 2022, which emphasizes the integration of novel image sensors and imaging algorithms. It consists of five competition tracks:

1. RGB+ToF Depth Completion uses sparse, noisy ToF depth measurements with RGB images to obtain a complete depth map.
2. Quad-Bayer Re-mosaic converts Quad-Bayer RAW data into Bayer format so that it can be processed with standard ISPs.
3. RGBW Sensor Re-mosaic converts RGBW RAW data into Bayer format so that it can be processed with standard ISPs.
4. RGBW Sensor Fusion fuses Bayer data and monochrome channel data into Bayer format to increase SNR and spatial resolution.
5. Under-Display Camera Image Restoration improves the visual quality of images captured by a new imaging system equipped with an under-display camera.

2 Challenge

To develop a high-quality RGBW remosaic solution, we provide the following resources for participants:
– A high-quality RGBW and Bayer dataset. As far as we know, this is the first and only dataset consisting of aligned RGBW and Bayer pairs, relieving the pain of data collection for developing learning-based remosaic algorithms;
– A data processing code with a dataloader to help participants get familiar with the provided dataset;
– A simple ISP including basic ISP blocks to visualize the algorithm output and to calculate the loss function on RGB results;
– A set of objective image quality metrics to measure the performance of a developed solution.

2.1 Problem Definition

RGBW remosaic aims to interpolate the input RGBW CFA pattern to obtain a Bayer of the same resolution. The remosaic task is needed mainly because current camera ISPs usually cannot process CFAs other than the Bayer pattern. In addition, the remosaic task becomes more challenging when the noise level gets higher, thus requiring more advanced algorithms to avoid image quality artifacts. In addition to the image quality requirement, RGBW sensors are widely used on smartphones with limited computational budget and battery life, thus requiring the remosaic algorithm to be lightweight at the same time. While we do not rank solutions based on running time or memory footprint, computational cost is one of the most important criteria in real applications.

2.2 Dataset: Tetras-RGBW-RMSC

The training data contains 70 scenes of aligned RGBW (input) and Bayer (ground-truth) pairs. For each scene, noise is synthesized on the 0 dB RGBW input to provide the noisy RGBW inputs at 24 dB and 42 dB, respectively. The synthesized noise consists of read noise and shot noise, and the noise model is measured on an RGBW sensor. The data generation steps are shown in Fig. 3. The testing data contains the RGBW input of 15 scenes at 0 dB, 24 dB and 42 dB, but the GT Bayer results are not available to participants.

Fig. 3. Data generation of the RGBW remosaic task. The RGBW raw data is captured using a RGBW sensor and cropped to be 2400 × 3600. A Bayer (DbinB) and white (DbinC) image are obtained by averaging the same color in the diagonal direction within a 2×2 block. We demosaic the Bayer (DbinB) to get a RGB using the demosaicnet. The white (DbinC) is concatenated to the RGB image to have RGBW for each pixel, which is in turn mosaiced to get the input RGBW and aligned ground truth Bayer.
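The read-plus-shot noise synthesis mentioned above can be approximated with the standard heteroscedastic Gaussian model sketched below. The gain and read-noise parameters here are illustrative placeholders; the challenge uses values measured on a real RGBW sensor, which are not given in the text.

```python
import numpy as np

def add_read_shot_noise(clean: np.ndarray, shot_gain: float, read_sigma: float,
                        rng=None) -> np.ndarray:
    """clean: linear raw values, e.g. normalized to [0, 1]."""
    rng = rng or np.random.default_rng()
    # shot-noise variance grows with the signal; read noise has constant variance
    variance = shot_gain * np.clip(clean, 0.0, None) + read_sigma ** 2
    return clean + rng.normal(size=clean.shape) * np.sqrt(variance)
```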

2.3 Challenge Phases

The challenge consisted of the following phases:
1. Development: the registered participants get access to the data and baseline code, and are able to train their models and evaluate their run-time locally.
2. Validation: the participants can upload their models to the remote server to check the fidelity scores on the validation dataset, and to compare their results on the validation leaderboard.
3. Testing: the participants submit their final results, codes, models, and factsheets.

2.4 Scoring System

Objective Evaluation. The evaluation consists of (1) the comparison of the remosaic output (Bayer) with the reference ground truth Bayer, and (2) the comparison of RGB rendered from the predicted and ground truth Bayer using a simple ISP (the code of the simple ISP is provided). We use

1. Peak Signal-to-Noise Ratio (PSNR)
2. Structural Similarity Index Measure (SSIM) [5]
3. KL divergence (KLD)
4. Learned Perceptual Image Patch Similarity (LPIPS) [7]

to evaluate the remosaic performance. The PSNR, SSIM and LPIPS are applied to the RGB rendered from the Bayer using the provided simple ISP code, while KL divergence is evaluated on the predicted Bayer directly. A metric weighting PSNR, SSIM, KL divergence, and LPIPS is used to give the final ranking of each method, and we report each metric separately as well. The code to calculate the metrics is provided. The weighted metric is shown below; the M4 score is between 0 and 100, and the higher the score, the better the overall image quality.

M4 = PSNR · SSIM · 2^(1 − LPIPS − KLD)    (1)

For each dataset we report the average results over all the processed images belonging to it.
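For reference, Eq. 1 can be computed per image as in the short sketch below; per the sentence above, the reported score is the average over all processed images of a dataset, and the official scoring code provided by the organizers is what determines rankings.

```python
def m4_score(psnr: float, ssim: float, lpips: float, kld: float) -> float:
    """Weighted metric of Eq. 1 for a single image (PSNR in dB, SSIM in [0, 1],
    LPIPS and KLD as non-negative distances)."""
    return psnr * ssim * 2.0 ** (1.0 - lpips - kld)

# A perfect result (SSIM = 1, LPIPS = KLD = 0) with a PSNR of 50 dB reaches the
# top of the 0-100 range: 50 * 1 * 2 = 100.
print(m4_score(50.0, 1.0, 0.0, 0.0))  # 100.0
```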

3 Challenge Results

The results of the top three teams in the final test phase are shown in Table 1, after we verified their submissions using their code. op-summer-po, HIT-IIL, and Eating, Drinking, and Playing are the top three teams ranked by M4 (Eq. 1), and op-summer-po shows the best overall performance. The proposed methods are described in Sect. 4 and the team members and affiliations are listed in Appendix A.

Table 1. MIPI 2022 Joint RGBW Remosaic and Denoise challenge results and final rankings. PSNR, SSIM, LPIPS and KLD are calculated between the submitted results from each team and the ground truth data. The weighted metric M4 (Eq. 1) is used to rank the algorithm performance, and the top three teams with the highest M4 are included in the table.

Team name                     | PSNR  | SSIM  | LPIPS | KLD   | M4
op-summer-po                  | 36.83 | 0.957 | 0.115 | 0.018 | 64.89
HIT-IIL                       | 36.34 | 0.95  | 0.129 | 0.02  | 63.12
Eating, Drinking, and Playing | 36.77 | 0.957 | 0.132 | 0.019 | 63.98

To learn more about the algorithm performance, we evaluated the qualitative image quality in addition to the objective IQ metrics, as shown in Figs. 4 and 5. While all teams in Table 1 achieved high PSNR and SSIM, detail and texture loss can be found on the book cover in Fig. 4 and on the mesh in Fig. 5 when the input has a large amount of noise. Oversmoothing tends to yield higher PSNR at the cost of perceptual detail loss.


Fig. 4. Qualitative image quality (IQ) comparison. The results of one of the test scenes (42 dB) are shown. While the top three remosaic methods achieve high objective IQ metrics in Table 1, details and texture loss are noticeable on the book cover. The texts on the book are barely interpretable in (b), (c) and (d).

Fig. 5. Qualitative image quality (IQ) comparison. The results of one of the test scenes (42 dB) are shown. Oversmoothing in the top three methods in Table 1 can be found when compared with the ground truth. The text under the mesh can barely be recognized, and the mesh texture becomes distorted in (b), (c) and (d).

In addition to benchmarking the image quality of remosaic algorithms, computational efficiency is evaluated because of the wide adoption of RGBW sensors on smartphones. We measured the running time of the remosaic solutions of the top three teams (ranked by M4 in Eq. 1) in Table 2. While running time is not employed in the challenge to rank remosaic algorithms, the computational cost is of critical importance when developing algorithms for smartphones. HIT-IIL achieved the shortest running time among the top three solutions on a workstation GPU (NVIDIA Tesla V100-SXM2-32GB). With the sensor resolution of mainstream smartphones reaching 64M or even higher, power-efficient remosaic algorithms are highly desirable.

Table 2. Running time of the top three solutions ranked by Eq. 1 in the 2022 Joint RGBW Remosaic and Denoise challenge. The running time of a 1200 × 1800 input was measured, while the running time of a 64M RGBW input was estimated. The measurement was taken on an NVIDIA Tesla V100-SXM2-32GB GPU.

Team name                    | 1200 × 1800 (measured) | 64M (estimated)
op-summer-po                 | 6.2 s                  | 184 s
HIT-IIL                      | 4.1 s                  | 121.5 s
Eating, Drinking and Playing | 10.4 s                 | 308 s

4 Challenge Methods

In this section, we describe the solutions submitted by all teams participating in the final stage of the MIPI 2022 Joint RGBW Remosaic and Denoise Challenge.

4.1 Op-summer-po

Fig. 6. Model architecture of op-summer-po.

op-summer-po proposed a framework based on DRUNet [6], as shown in Fig. 6, which has four scales with 64, 128, 256, and 512 channels, respectively. Each scale has an identity skip connection between the strided convolution (stride = 2) and the transposed convolution; this connection concatenates encoder and decoder features. Each encoder or decoder includes four residual blocks. The input of the framework is the raw RGBW image, and the output is the estimated raw Bayer image. The estimated raw Bayer image is then sent to DemosaicNet and a gamma transform to get the full-resolution RGB image. Moreover, to obtain better perceptual quality, two LPIPS [7] loss functions, in the raw and RGB domains, are used.

4.2 HIT-IIL

HIT-IIL proposed an end-to-end method to learn RGBW re-mosaicing and denoising jointly. We split RGBW images into 4 channels while maintaining the image size, as shown in Fig. 7. Specifically, we first repeat the RGBW image as 4 channels and then make each channel represent one color type (i.e., one of white, green, blue and red). When the pixel color type is different from the color represented by the channel, its value is set to 0. This input mode provides the positional information of the colors to the network and significantly improves performance in our experiments. As for the end-to-end network, we adopt NAFNet [4] to re-mosaic and denoise RGBW images. NAFNet contains a 4-level encoder-decoder and a bottleneck. For the encoder part, the numbers of NAFNet's blocks for each level are 2, 4, 8, and 24. For the decoder part, the numbers of NAFNet's blocks for the 4 levels are all 2. The number of NAFNet's blocks for the bottleneck is 12.

Fig. 7. Model architecture of HIT-IIL.
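A minimal sketch of the 4-channel input packing described above is given below: the single-channel RGBW mosaic is repeated into four channels and, in each channel, pixels whose CFA color differs from that channel's color are zeroed. The construction of the per-color masks depends on the exact RGBW CFA layout, so they are left as an input here; this is an illustration, not the team's released code.

```python
import torch

def pack_rgbw(mosaic: torch.Tensor, cfa_masks: torch.Tensor) -> torch.Tensor:
    """
    mosaic:    (H, W) raw RGBW mosaic.
    cfa_masks: (4, H, W) boolean masks, one per color type (e.g. W, G, B, R),
               true where the CFA pixel has that color.
    returns:   (4, H, W) tensor keeping the full image size, zero elsewhere.
    """
    repeated = mosaic.unsqueeze(0).expand(4, -1, -1)   # repeat the mosaic as 4 channels
    return repeated * cfa_masks.to(mosaic.dtype)       # zero out mismatched colors
```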

4.3 Eating, Drinking and Playing

Fig. 8. Model architecture of Eating, Drinking and Playing.

Eating, Drinking and Playing proposed a UNet-Transformer for RGBW Joint Remosaic and Denoise; its overall architecture is shown in Fig. 8. To be more specific, we aim to estimate a raw Bayer image O from the corresponding raw RGBW image I. To achieve this, we decrease the distance between the output O and the ground-truth raw Bayer image GT by minimizing an L1 loss. Similar to DRUNet [6], the UNet-Transformer adopts an encoder-decoder structure and has four levels with 64, 128, 256, and 512 channels, respectively. The main difference between the UNet-Transformer and DRUNet is that the UNet-Transformer adopts Multi-ResTransformer (MResT) blocks rather than residual convolution blocks at each level of the encoder and decoder. As the main component of the UNet-Transformer, the MResT cascades multiple Res-Transformer (ResT) blocks, which can learn local and global information simultaneously. The structure of the ResT block is illustrated in Fig. 9. In the ResT, two convolution layers are first adopted to capture local information, and then self-attention is employed to learn global information.

Fig. 9. The structure of the Res-Transformer (ResT) block. R is the reshape operation and the blue block is the convolution layer. (Color figure online)
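As a rough illustration of the ResT structure just described (two convolution layers for local information, followed by self-attention for global information, with a residual connection), a PyTorch sketch is given below. Kernel sizes, activation, normalization and the exact attention variant are assumptions, since they are not specified in the text.

```python
import torch
import torch.nn as nn

class ResTBlock(nn.Module):
    def __init__(self, channels: int, num_heads: int = 4):
        super().__init__()
        self.local = nn.Sequential(                 # local feature extraction
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.LeakyReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
        )
        self.attn = nn.MultiheadAttention(channels, num_heads, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        y = self.local(x)
        b, c, h, w = y.shape
        tokens = y.flatten(2).transpose(1, 2)       # (B, H*W, C)
        g, _ = self.attn(tokens, tokens, tokens)    # global self-attention
        g = g.transpose(1, 2).reshape(b, c, h, w)
        return x + g                                # residual connection
```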

5 Conclusions

In this paper, we summarized the Joint RGBW Remosaic and Denoise challenge in the first Mobile Intelligent Photography and Imaging workshop (MIPI 2022) held in conjunction with ECCV 2022. The participants were provided with a high-quality training/testing dataset, which is now available for researchers to download for future research. We are excited to see so many submissions within such a short period, and we look forward to more research in this area.

Acknowledgements. We thank Shanghai Artificial Intelligence Laboratory, Sony, and Nanyang Technological University for sponsoring this MIPI 2022 challenge. We thank all the organizers and participants for their great work.

A Teams and Affiliations

op-summer-po
Title: Two LPIPS Functions in Raw and RGB domains for RGBW Joint Remosaic and Denoise
Members: 1 Lingchen Sun ([email protected]), 1 Rongyuan Wu, 1,2 Qiaosi Yi
Affiliations: 1 OPPO Research Institute; 2 East China Normal University

HIT-IIL
Title: Data Input and Augmentation Strategies for RGBW Image Re-Mosaicing
Members: Rongjian Xu ([email protected]), Xiaohui Liu, Zhilu Zhang, Xiaohe Wu, Ruohao Wang, Junyi Li, Wangmeng Zuo
Affiliations: Harbin Institute of Technology

Eating, Drinking, and Playing
Title: UNet-Transformer for RGBW Joint Remosaic and Denoise
Members: 1,2 Qiaosi Yi ([email protected]), 1 Rongyuan Wu, 1 Lingchen Sun, 2 Faming Fang
Affiliations: 1 OPPO Research Institute; 2 East China Normal University

References
1. Camon 19 Pro. https://www.tecno-mobile.com/phones/product-detail/product/camon-19-pro-5g
2. OPPO unveils multiple innovative imaging technologies. https://www.oppo.com/en/newsroom/press/oppo-future-imaging-technology-launch/
3. Vivo X80 is the only vivo smartphone with a Sony IMX866 sensor: the world's first RGBW bottom sensors. https://www.vivoglobal.ph/vivo-X80-is-the-only-vivo-smartphone-with-a-Sony-IMX866-Sensor-The-Worlds-First-RGBW-Bottom-Sensors
4. Chen, L., Chu, X., Zhang, X., Sun, J.: Simple baselines for image restoration. arXiv preprint arXiv:2204.04676 (2022)
5. Wang, Z., Bovik, A.C., Sheikh, H.R., Simoncelli, E.P.: Image quality assessment: from error visibility to structural similarity. IEEE Trans. Image Process. 13(4), 600–612 (2004)
6. Zhang, K., Li, Y., Zuo, W., Zhang, L., Van Gool, L., Timofte, R.: Plug-and-play image restoration with deep denoiser prior. IEEE Trans. Pattern Anal. Mach. Intell. (2021)
7. Zhang, R., Isola, P., Efros, A.A., Shechtman, E., Wang, O.: The unreasonable effectiveness of deep features as a perceptual metric. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018)

MIPI 2022 Challenge on RGBW Sensor Fusion: Dataset and Report

Qingyu Yang1(B), Guang Yang1,2,3,4, Jun Jiang1, Chongyi Li4, Ruicheng Feng4, Shangchen Zhou4, Wenxiu Sun2,3, Qingpeng Zhu2, Chen Change Loy4, Jinwei Gu1,3, Zhen Wang5, Daoyu Li5, Yuzhe Zhang5, Lintao Peng5, Xuyang Chang5, Yinuo Zhang5, Liheng Bian5, Bing Li6, Jie Huang6, Mingde Yao6, Ruikang Xu6, Feng Zhao6, Xiaohui Liu7, Rongjian Xu7, Zhilu Zhang7, Xiaohe Wu7, Ruohao Wang7, Junyi Li7, Wangmeng Zuo7, Zhuang Jia8, DongJae Lee9, Ting Jiang10, Qi Wu10, Chengzhi Jiang10, Mingyan Han10, Xinpeng Li10, Wenjie Lin10, Youwei Li10, Haoqiang Fan10, and Shuaicheng Liu10

1 SenseBrain, San Jose, USA {yangqingyu,jiangjun}@sensebrain.site
2 SenseTime Research and Tetras.AI, Beijing, China
3 Shanghai AI Laboratory, Shanghai, China
4 Nanyang Technological University, Singapore, Singapore
5 Beijing Institute of Technology, Beijing, China
6 University of Science and Technology of China, Hefei, China [email protected]
7 Harbin Institute of Technology, Harbin, China
8 Xiaomi, Beijing, China [email protected]
9 KAIST, Daejeon, South Korea [email protected]
10 Megvii Technology, Beijing, China [email protected]

Abstract. Developing and integrating advanced image sensors with novel algorithms in camera systems is prevalent with the increasing demand for computational photography and imaging on mobile platforms. However, the lack of high-quality data for research and the rare opportunity for in-depth exchange of views from industry and academia constrain the development of mobile intelligent photography and imaging (MIPI). To bridge the gap, we introduce the first MIPI challenge, including five tracks focusing on novel image sensors and imaging algorithms. In this paper, RGBW Joint Fusion and Denoise, one of the five tracks, working on the fusion of binning-mode RGBW to Bayer at half resolution, is introduced. The participants were provided with a new dataset including 70 (training) and 15 (validation) scenes of high-quality RGBW and Bayer pairs. In addition, for each scene, RGBW inputs at 24 dB and 42 dB are provided. All the data were captured using an RGBW sensor in both outdoor and indoor conditions. The final results are evaluated using objective metrics including PSNR, SSIM [11], LPIPS [15] and KLD. A detailed description of all models developed in this challenge is provided in this paper. More details of this challenge and the link to the dataset can be found at https://github.com/mipi-challenge/MIPI2022.

Q. Yang, J. Jiang, C. Li, S. Zhou, R. Feng, W. Sun, Q. Zhu, C. C. Loy, and J. Gu are the MIPI 2022 challenge organizers. The other authors participated in the challenge. Please refer to Appendix A for details. MIPI 2022 challenge website: http://mipi-challenge.org/.

Keywords: RGBW · Fusion · Bayer · Denoise · MIPI challenge

1 Introduction

RGBW is a new type of CFA pattern (Fig. 1(a)) designed for image quality enhancement under low-light conditions. Thanks to the higher optical transmittance of white pixels over conventional red, green and blue pixels, the signal-to-noise ratio (SNR) of the sensor output is significantly improved, thus boosting the image quality especially under low-light conditions. Recently, several phone OEMs, including Transsion, Vivo and Oppo, have adopted RGBW sensors in their flagship smartphones to improve the camera image quality [1–3]. The binning mode of RGBW is mainly used in the camera preview and video modes, in which pixels of the same color are averaged in the diagonal direction within a 2 × 2 window of the RGBW pattern to further improve the image quality and reduce noise. A fusion algorithm is thereby needed that takes a diagonal-binning-Bayer (DBinB) and a diagonal-binning-white (DBinC) as input to obtain a Bayer of better SNR, as shown in Fig. 1(b). A good fusion algorithm should be able (1) to get a Bayer output from RGBW with the least artifacts, and (2) to fully take advantage of the SNR and resolution benefit of white pixels. The RGBW fusion problem becomes more challenging when the input DBinB and DBinC become noisy, especially under low-light conditions. A joint fusion and denoise task is thus in demand for real-world applications.

Fig. 1. The RGBW Fusion task: (a) the RGBW CFA. (b) In the binning mode, DbinB and DbinC are obtained by diagonal averaging of pixels of the same color within a 2×2 window. The joint fusion and denoise algorithm takes DbinB and DbinC as input to get a high-quality Bayer.


In this challenge, we intend to fuse the RGBW inputs (DBinB and DBinC in Fig. 1(b)) to denoise and improve the Bayer at half resolution. The solution does not have to be deep-learning-based. However, to facilitate deep learning training, we provide a dataset of high-quality binning-mode RGBW (DBinB and DBinC) and output Bayer pairs (70 scenes for training, 15 for validation, and 15 for testing). We provide a data loader to read these files and a simple ISP, shown in Fig. 2, to visualize the RGB output from the Bayer and to calculate loss functions. The participants are also allowed to use other public-domain datasets. The algorithm performance is evaluated and ranked using objective metrics: Peak Signal-to-Noise Ratio (PSNR), Structural Similarity Index (SSIM) [11], Learned Perceptual Image Patch Similarity (LPIPS) [15] and KL divergence (KLD). The objective metrics of a baseline method are available as well to provide a benchmark.

Fig. 2. An ISP to visualize the output Bayer and to calculate the loss function.

This challenge is a part of the Mobile Intelligent Photography and Imaging (MIPI) 2022 workshop and challenges, held in conjunction with ECCV 2022, which emphasizes the integration of novel image sensors and imaging algorithms. It consists of five competition tracks:

1. RGB+ToF Depth Completion uses sparse, noisy ToF depth measurements with RGB images to obtain a complete depth map.
2. Quad-Bayer Re-mosaic converts Quad-Bayer RAW data into Bayer format so that it can be processed with standard ISPs.
3. RGBW Sensor Re-mosaic converts RGBW RAW data into Bayer format so that it can be processed with standard ISPs.
4. RGBW Sensor Fusion fuses Bayer data and monochrome channel data into Bayer format to increase SNR and spatial resolution.
5. Under-Display Camera Image Restoration improves the visual quality of images captured by a new imaging system equipped with an under-display camera.

2 Challenge

To develop a high-quality RGBW fusion solution, we provide the following resources for participants:
– A high-quality RGBW (DbinB and DbinC) and Bayer dataset. As far as we know, this is the first and only dataset consisting of aligned RGBW and Bayer pairs, relieving the pain of data collection for developing learning-based fusion algorithms;
– A data processing code with a dataloader to help participants get familiar with the provided dataset;
– A simple ISP including basic ISP blocks to visualize the algorithm output and to calculate the loss function on RGB results;
– A set of objective image quality metrics to measure the performance of a developed solution.

2.1 Problem Definition

The RGBW fusion task aims to fuse the DBinB and DBinC of RGBW (Fig. 1(b)) to improve the image quality of the Bayer output at half resolution. By incorporating the white pixels (DBinC) of higher spatial resolution and higher SNR, the output Bayer can potentially have better image quality. In addition, the RGBW binning mode is mainly used for the preview and video modes on smartphones, thus requiring the fusion algorithm to be lightweight and power-efficient. While we do not rank solutions based on running time or memory footprint, the computational cost is one of the most important criteria in real applications.

2.2 Dataset: Tetras-RGBW-Fusion

The training data contains 70 scenes of aligned RGBW (DBinB and DBinC input) and Bayer (ground-truth) pairs. For each scene, DBinB at 0 dB is used as the ground truth. Noise is synthesized on the 0 dB half-resolution DBinB and DBinC data to provide the noisy inputs at 24 dB and 42 dB, respectively. The synthesized noise consists of read noise and shot noise, and the noise model is measured on an RGBW sensor. The data generation steps are shown in Fig. 3. The testing data contains the DBinB and DBinC inputs of 15 scenes at 24 dB and 42 dB, but the GT Bayer results are not available to participants.

Fig. 3. Data generation of the RGBW fusion task. The RGBW raw data is captured using a RGBW sensor and cropped to be 2400 × 3600. A Bayer (DbinB) and white (DbinC) image are obtained by averaging the same color in the diagonal direction within a 2 × 2 block.
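The diagonal 2 × 2 binning in the caption above can be sketched as follows, assuming an RGBW layout in which each 2 × 2 cell holds two white pixels on one diagonal and two pixels of a single color on the other (see Fig. 1(a)); which diagonal carries white depends on the actual sensor and is treated as a parameter here.

```python
import numpy as np

def diagonal_bin(rgbw: np.ndarray, white_on_main_diagonal: bool = True):
    """rgbw: (H, W) raw mosaic, H and W divisible by 2.
    Returns (dbinb, dbinc): half-resolution color (Bayer) and white images."""
    tl, tr = rgbw[0::2, 0::2], rgbw[0::2, 1::2]   # top-left, top-right of each 2x2 cell
    bl, br = rgbw[1::2, 0::2], rgbw[1::2, 1::2]   # bottom-left, bottom-right
    main_diag = (tl + br) / 2.0
    anti_diag = (tr + bl) / 2.0
    if white_on_main_diagonal:
        return anti_diag, main_diag   # DbinB (color), DbinC (white)
    return main_diag, anti_diag
```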

2.3 Challenge Phases

The challenge consisted of the following phases:
1. Development: the registered participants get access to the data and baseline code, and are able to train their models and evaluate their run-time locally.
2. Validation: the participants can upload their models to the remote server to check the fidelity scores on the validation dataset, and to compare their results on the validation leaderboard.
3. Testing: the participants submit their final results, codes, models, and factsheets.

2.4 Scoring System

Objective Evaluation. The evaluation consists of (1) the comparison of the fused output (Bayer) with the reference ground truth Bayer, and (2) the comparison of RGB rendered from the predicted and ground truth Bayer using a simple ISP (the code of the simple ISP is provided). We use

1. Peak Signal-to-Noise Ratio (PSNR)
2. Structural Similarity Index Measure (SSIM) [11]
3. KL divergence (KLD)
4. Learned Perceptual Image Patch Similarity (LPIPS) [15]

to evaluate the fusion performance. The PSNR, SSIM and LPIPS are applied to the RGB rendered from the Bayer using the provided simple ISP code, while KL divergence is evaluated on the predicted Bayer directly. A metric weighting PSNR, SSIM, KL divergence, and LPIPS is used to give the final ranking of each method, and we report each metric separately as well. The code to calculate the metrics is provided. The weighted metric is shown below; the M4 score is between 0 and 100, and the higher the score, the better the overall image quality.

M4 = PSNR · SSIM · 2^(1 − LPIPS − KLD)    (1)

For each dataset we report the average results over all the processed images belonging to it.

3 Challenge Results

Six teams submitted their results in the final phase. Table 1 summarizes the results in the final test phase after we verified their submissions using their code. LLCKP, MegNR and jzsherlock are the top three teams ranked by M4 (Eq. 1), and LLCKP shows the best overall performance. The proposed methods are described in Sect. 4 and the team members and affiliations are listed in Appendix A.


To learn more about the algorithm performance, we evaluated the qualitative image quality in addition to the objective IQ metrics, as shown in Figs. 4 and 5. While all teams in Table 1 achieved high PSNR and SSIM, detail and texture loss can be found on the yellow box in Fig. 4 and on the test chart in Fig. 5 when the input has a large amount of noise and the scene is captured under low light. Oversmoothing tends to yield higher PSNR at the cost of perceptual detail loss.

Table 1. MIPI 2022 Joint RGBW Fusion and Denoise challenge results and final rankings. PSNR, SSIM, LPIPS and KLD are calculated between the submitted results from each team and the ground truth data. The weighted metric M4 (Eq. 1) is used to rank the algorithm performance, and the top three teams with the highest M4 are highlighted.

Team name   | PSNR  | SSIM  | LPIPS | KLD     | M4
BITSpectral | 36.53 | 0.958 | 0.126 | 0.027   | 63.27
BIVLab      | 35.09 | 0.94  | 0.174 | 0.0255  | 57.98
HIT-IIL     | 36.66 | 0.958 | 0.128 | 0.02196 | 63.62
jzsherlock  | 37.05 | 0.958 | 0.132 | 0.29    | 63.84
LLCKP       | 36.89 | 0.952 | 0.054 | 0.017   | 67.07
MegNR       | 36.98 | 0.96  | 0.098 | 0.0156  | 65.55

Fig. 4. Qualitative image quality (IQ) comparison. The results of one of the test scenes (42 dB) are shown. While the top three fusion methods achieve high objective IQ metrics in Table 1, details and texture loss are noticeable on the yellow box. The texts on the box are barely interpretable in (b), (c) and (d) (Color figure online).


Fig. 5. Qualitative image quality (IQ) comparison. The results of one of the test scenes (42 dB) are shown. Oversmoothing in the top three methods in Table 1 can be found when compared with the ground truth. The test chart becomes distorted in (b), (c), and (d).

In addition to benchmarking the image quality of fusion algorithms, computational efficiency is evaluated because of the wide adoption of RGBW sensors on smartphones. We measured the running time of the RGBW fusion solutions of the top three teams in Table 2. While running time is not employed in the challenge to rank fusion algorithms, the computational cost is of critical importance when developing algorithms for smartphones. jzsherlock achieved the shortest running time among the top three solutions on a workstation GPU (NVIDIA Tesla V100-SXM2-32GB). With the sensor resolution of mainstream smartphones reaching 64M or even higher, power-efficient fusion algorithms are highly desirable.

Table 2. Running time of the top three solutions ranked by Eq. 1 in the 2022 Joint RGBW Fusion and Denoise challenge. The running time of a 1200 × 1800 input was measured, while the running time for a 64M RGBW sensor was estimated (the binning-mode resolution of a 64M RGBW sensor is 16M). The measurement was taken on an NVIDIA Tesla V100-SXM2-32GB GPU.

Team name  | 1200 × 1800 (measured) | 16M (estimated)
jzsherlock | 3.7 s                  | 27.4 s
LLCKP      | 7.1 s                  | 52.6 s
MegNR      | 12.4 s                 | 91.9 s

4 Challenge Methods

In this section, we describe the solutions submitted by all teams participating in the final stage of MIPI 2022 RGBW Joint Fusion and Denoise Challenge.

4.1 BITSpectral

Fig. 6. Model architecture of BITSpectral: (a) the Fusion Cross-Patch Attention Network (FCPAN), (b) the Deep Feature Fusion Module (DFFM), (c) the Cross-Patch Attention Module (CPAM), and (d) the Cross-Patch Attention Multi-Head Self-Attention (CPA-MSA).

BITSpectral developed a transformer-based network, the Fusion Cross-Patch Attention Network (FCPAN), for this joint fusion and denoising task. The FCPAN, presented in Fig. 6(a), consists of a Deep Feature Fusion Module (DFFM) and several Cross-Patch Attention Modules (CPAM). The input of the DFFM contains an RGGB Bayer pattern and a W channel; its output is the fused RGBW features, which are fed to the CPAMs for deep feature extraction. CPAM is a U-shaped network with spatial downsampling to reduce the computational complexity, and we propose to use 4 CPAMs in our network. Figure 6 also includes details of the Swin Transformer Layer [6] (STL), the Cross-Patch Attention Block (CPAB), and the Cross-Patch Attention Multi-Head Self-Attention (CPA-MSA). We use STL to extract the attention within feature patches in each stage, and CPAB to directly obtain the global attention among patches for the innermost stage. Compared with STL, CPAB has an extended range of perception due to the cross-patch attention.

4.2 BIVLab

Fig. 7. Model architecture of BIVLab.

BIVLab proposed a Self-Guided Spatial-Frequency Complement Network (SGSFCN) for the RGBW joint fusion and denoise task. As shown in Fig. 7, the Swin Transformer Layer (STL) [6] is adopted to extract rich features from the DbinB and DbinC inputs separately. SpaFre blocks (SFB) [12] then fuse the DbinB and DbinC features in complementary spatial and frequency domains. In order to handle different noise levels, the features extracted by the STL, which contain the noise-level information, are applied to each SFB as guidance. Finally, the denoised Bayer is obtained by adding the predicted Bayer residual to the original DbinB Bayer. During training, all the images are cropped into patches of size 720 × 720 in order to retain essential global information.

HIT-IIL

HIT-IIL proposed a NAFNet [4] based model for RGBW Joint Fusion and Denoise task. As shown in Fig. 8, the entire framework consists of 4-level encoderdecoder and bottleneck module. For the encoder part, the numbers of NAFNet’s

Fig. 8. Model architecture of HIT-IIL.

MIPI 2022 Challenge on RGBW Sensor Fusion

55

blocks for each level are 2, 2, 4, and 8. For the decoder part, the numbers of NAFNet’s blocks are set to 2 for all 4 levels. In addition, the bottleneck module contains 24 NAFNet’s blocks. Unlike the original NAFNet design, the skip connection between the input and the output is removed in our method. During the training, we also use two data augmentation strategies. The first one is mixup, which generate synthesised images as: ˆ = a ∗ x24 + (1 − a) ∗ x42 , x

(2)

In which x24 and x42 denote images of the same scene with noise levels of 24 dB and 42 dB. A random variable a is selected between 0 and 1 to generate ˆ . Our second augmentation strategy is the image flip the synthesised image x proposed by [5]. 4.4

Jzsherlock

Fig. 9. Model architecture of jzsherlock.

Jzsherlock proposed a dual-branch network for the RGBW joint fusion and denoise task. The entire architecture, consisting of bayer branch and white branch, is shown in Fig. 9. The bayer branch’s input is a normalized noisy bayer image and output the denoised result. After pixel unshuffle operation with scale = 2, the bayer image is converted to GBRG channels. We use stacked ResBlocks without BatchNorm (BN) layers to extract the feature maps of noisy bayer image. On the other hand, the white branch extracts the features from corresponding white image using stacked ResBlocks as well. An average pooling layer rescale the white image features to the same size as bayer branch for feature fusion. Several Residual-in-Residual Dense Blocks (RRDB) [9] are

56

Q. Yang et al.

applied on fused feature maps for the restoration. After the RRDB blocks, a Conv+LeakyReLU+Conv structure is applied to enlarge the feature map channels by scale of 4. Then pixel shuffle with scale = 2 is applied to upscale the feature maps to the input size. A Conv layer is used to convert the output to the GBRG 4 channels. Finally, a skip connection is applied to add the input bayer to form the final denoised result. The network is trained by L1 loss in the normalized domain. The final selected input normalize method is min-max normalize with min = 64 and max = 1023, with values out of the range clipped. 4.5

LLCKP

Fig. 10. Model architecture of LLCKP.

LLCKP proposed a denoising method that based on existing image restoration model [4,14]. As shown in Fig. 10, we synthesize RGBW images from GT GBRG images and with additional synthetic noise and real-noise pair (noisy images provided by challenge). We also used 20,000 pairs of RAW Image from SIDD with normal exposure and synthesize RGBW image as extra data. During the training, the Restormer model’s [14] weights are pre-trained on SIDD RGB images. Data augmentation [5] and cutmix [13] are applied during the training phase.

MIPI 2022 Challenge on RGBW Sensor Fusion

4.6

57

MegNR

Fig. 11. Model architecture of MegNR.

MegNR proposed a pipeline for the RGBW Joint Fusion and Denoise task. The overall diagram is shown in Fig. 11. The pixel-unshuffle(PU) [8] is firstly applied to RGBW images to split it into independent channels. Inspired by Uformer [10], we develop our RGBW fusion and reconstruction network, HAUformer. We replace the LeWin Blocks [10] in Uformer’s original design and includes two modules Hybrid Attention Local-Enhanced Block(HALEB) and Overlapping CrossAttention Block(OCAB) to capture more long-range dependencies information and useful local context. Finally, the pixel-shuffle(PS) [7] module restores output to the standard bayer format.

5

Conclusions

In this paper, we summarized the Joint RGBW Fusion and Denoise challenge in the first Mobile Intelligent Photography and Imaging workshop (MIPI 2022) held in conjunction with ECCV 2022. The participants were provided with a high-quality training/testing dataset, which is now available for researchers to download for future researches. We are excited to see so many submissions within such a short period, and we look forward for more researches in this area.

58

Q. Yang et al.

Acknowledgements. We thank Shanghai Artificial Intelligence Laboratory, Sony, and Nanyang Technological University to sponsor this MIPI 2022 challenge. We thank all the organizers and participants for their great work.

A

Teams and Affiliations

BITSpectral Title: Fusion Cross-Patch Attention Network for RGBW Joint Fusion and Denoise Members: Zhen Wang ([email protected]), Daoyu Li, Yuzhe Zhang, Lintao Peng, Xuyang Chang, Yinuo Zhang, Liheng Bian Affiliations: Beijing Institute of Technology BIVLab Title: Self-Guided Spatial-Frequency Complement Network for RGBW Joint Fusion and Denoise Members: Bing Li ([email protected]), Jie Huang, Mingde Yao, Ruikang Xu, Feng Zhao Affiliations: University of Science and Technology of China HIT-IIL Title: NAFNet for RGBW Image Fusion Members: Xiaohui Liu ([email protected]), Xiaohui Liu, Rongjian Xu, Zhilu Zhang, Xiaohe Wu, Ruohao Wang, Junyi Li, Wangmeng Zuo Affiliations: Harbin Institute of Technology jzsherlock Title: Dual Branch Network for Bayer Image Denoising Using White Pixel Guidance Members: Zhuang Jia ([email protected]) Affiliations: Xiaomi LLCKP Title: Synthetic RGBW image and noise Members: DongJae Lee ([email protected]) Affiliations: KAIST MegNR Title: HAUformer: Hybrid Attention-guided U-shaped Transformer for RGBW Fusion Image Restoration Members: Ting Jiang ([email protected]), Qi Wu, Chengzhi Jiang, Mingyan Han, Xinpeng Li, Wenjie Lin, Youwei Li, Haoqiang Fan, Shuaicheng Liu Affiliations: Megvii Technology

References 1. Camon 19 pro. https://www.tecno-mobile.com/phones/product-detail/product/ camon-19-pro-5g 2. Oppo unveils multiple innovative imaging technologies. https://www.oppo.com/ en/newsroom/press/oppo-future-imaging-technology-launch/ 3. vivo x80 is the only vivo smartphone with a sony imx866 sensor: the world’s first RGBW bottom sensors. https://www.vivoglobal.ph/vivo-X80-is-the-only-vivosmartphone-with-a-Sony-IMX866-Sensor-The-Worlds-First-RGBW-Bottom-Sen sors/ 4. Chen, L., Chu, X., Zhang, X., Sun, J.: Simple baselines for image restoration. arXiv preprint arXiv:2204.04676 (2022) 5. Liu, J., et al.: Learning raw image denoising with bayer pattern unification and bayer preserving augmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (2019) 6. Liu, Z., et al.: Swin transformer: hierarchical vision transformer using shifted windows. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 10012–10022 (2021) 7. Shi, W., et al.: Real-time single image and video super-resolution using an efficient sub-pixel convolutional neural network. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1874–1883 (2016)

MIPI 2022 Challenge on RGBW Sensor Fusion

59

8. Sun, B., Zhang, Y., Jiang, S., Fu, Y.: Hybrid pixel-unshuffled network for lightweight image super-resolution. arXiv preprint arXiv:2203.08921 (2022) 9. Wang, X., et al.: Esrgan: enhanced super-resolution generative adversarial networks. In: Proceedings of the European Conference on Computer Vision (ECCV) Workshops (2018) 10. Wang, Z., Cun, X., Bao, J., Zhou, W., Liu, J., Li, H.: Uformer: a general u-shaped transformer for image restoration. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 17683–17693 (2022) 11. Wang, Z., Bovik, A.C., Sheikh, H.R., Simoncelli, E.P.: Image quality assessment: from error visibility to structural similarity. IEEE Trans. Image Process. 13(4), 600–612 (2004) 12. Xu, S., Zhang, J., Zhao, Z., Sun, K., Liu, J., Zhang, C.: Deep gradient projection networks for pan-sharpening. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 1366–1375 (2021) 13. Yun, S., Han, D., Oh, S.J., Chun, S., Choe, J., Yoo, Y.: Cutmix: Regularization strategy to train strong classifiers with localizable features. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 6023–6032 (2019) 14. Zamir, S.W., Arora, A., Khan, S., Hayat, M., Khan, F.S., Yang, M.H.: Restormer: efficient transformer for high-resolution image restoration. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5728– 5739 (2022) 15. Zhang, R., Isola, P., Efros, A.A., Shechtman, E., Wang, O.: The unreasonable effectiveness of deep features as a perceptual metric. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018)

MIPI 2022 Challenge on Under-Display Camera Image Restoration: Methods and Results

Ruicheng Feng1(B), Chongyi Li1, Shangchen Zhou1, Wenxiu Sun3,4, Qingpeng Zhu4, Jun Jiang2, Qingyu Yang2, Chen Change Loy1, Jinwei Gu2,3, Yurui Zhu5, Xi Wang5, Xueyang Fu5, Xiaowei Hu3, Jinfan Hu6, Xina Liu6, Xiangyu Chen3,6,14, Chao Dong3,6, Dafeng Zhang7, Feiyu Huang7, Shizhuo Liu7, Xiaobing Wang7, Zhezhu Jin7, Xuhao Jiang8, Guangqi Shao8, Xiaotao Wang8, Lei Lei8, Zhao Zhang9, Suiyi Zhao9, Huan Zheng9, Yangcheng Gao9, Yanyan Wei9, Jiahuan Ren9, Tao Huang10, Zhenxuan Fang10, Mengluan Huang10, Junwei Xu10, Yong Zhang11, Yuechi Yang11, Qidi Shu12, Zhiwen Yang11, Shaocong Li11, Mingde Yao5, Ruikang Xu5, Yuanshen Guan5, Jie Huang5, Zhiwei Xiong5, Hangyan Zhu13, Ming Liu13, Shaohui Liu13, Wangmeng Zuo13, Zhuang Jia8, Binbin Song14, Ziqi Song15, Guiting Mao15, Ben Hou15, Zhimou Liu15, Yi Ke15, Dengpei Ouyang15, Dekui Han15, Jinghao Zhang5, Qi Zhu5, Naishan Zheng5, Feng Zhao5, Wu Jin16, Marcos Conde17, Sabari Nathan18, Radu Timofte17, Tianyi Xu19, Jun Xu19, P. S. Hrishikesh20, Densen Puthussery20, C. V. Jiji21, Biao Jiang22, Yuhan Ding22, WanZhang Li22, Xiaoyue Feng22, Sijing Chen22, Tianheng Zhong22, Jiyang Lu23, Hongming Chen23, Zhentao Fan23, and Xiang Chen24

1 Nanyang Technological University, Singapore, Singapore ({ruicheng002,chongyi.li}@ntu.edu.sg)
2 SenseBrain, San Jose, USA
3 Shanghai AI Laboratory, Shanghai, China
4 SenseTime Research and Tetras.AI, Beijing, China
5 University of Science and Technology of China, Hefei, China
6 Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
7 Samsung Research China, Beijing, China
8 Xiaomi, Beijing, China
9 Hefei University of Technology, Hefei, China
10 School of Artificial Intelligence, Xidian University, Xi'an, China
11 School of Remote Sensing and Information Engineering, Wuhan University, Wuhan, China
12 State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, Wuhan, China
13 Harbin Institute of Technology, Harbin, China
14 University of Macau, Macau, China
15 Changsha Research Institute of Mining and Metallurgy, Changsha, China
16 Tianjin University, Tianjin, China
17 Computer Vision Lab, University of Würzburg, Würzburg, Germany
18 Couger Inc., Tokyo, Japan
19 School of Statistics and Data Science, Nankai University, Tianjin, China
20 Founding Minds Software, Thiruvananthapuram, India
21 Department of Electronics and Communication, SRM University AP, Amaravati, India
22 Fudan University, Shanghai, China
23 Shenyang Aerospace University, Shenyang, China
24 Nanjing University of Science and Technology, Nanjing, China

R. Feng, C. Li, S. Zhou, W. Sun, Q. Zhu, J. Jiang, Q. Yang, C. C. Loy, and J. Gu are the MIPI 2022 challenge organizers; the other authors participated in the challenge. Please refer to the Appendix for details. MIPI 2022 challenge website: http://mipi-challenge.org.
Supplementary Information: The online version contains supplementary material available at https://doi.org/10.1007/978-3-031-25072-9_5.

Abstract. Developing and integrating advanced image sensors with novel algorithms in camera systems is prevalent with the increasing demand for computational photography and imaging on mobile platforms. However, the lack of high-quality data for research and the rare opportunity for in-depth exchange of views from industry and academia constrain the development of mobile intelligent photography and imaging (MIPI). To bridge the gap, we introduce the first MIPI challenge including five tracks focusing on novel image sensors and imaging algorithms. In this paper, we summarize and review the Under-Display Camera (UDC) Image Restoration track on MIPI 2022. In total, 167 participants were successfully registered, and 19 teams submitted results in the final testing phase. The developed solutions in this challenge achieved state-of-the-art performance on Under-Display Camera Image Restoration. A detailed description of all models developed in this challenge is provided in this paper. More details of this challenge and the link to the dataset can be found at https://github.com/mipi-challenge/MIPI2022.

Keywords: Under-Display Camera · Image restoration · MIPI challenge

1 Introduction

The demand for smartphones with full-screen displays has drawn manufacturers' interest toward a newly-defined imaging system, the Under-Display Camera (UDC). UDC also has practical value in other scenarios, e.g., videoconferencing with more natural gaze focus, since the camera can be placed at the center of the display. An Under-Display Camera is an imaging system whose camera is placed underneath a display. However, widespread commercial production of UDC is hindered by the poor imaging quality caused by diffraction artifacts. Such artifacts are unique to UDC: the gaps between display pixels act as an aperture and induce diffraction in the captured image. Typical diffraction artifacts include flare, saturated blobs, blur, haze, and noise. Therefore, while bringing a better user experience, UDC may sacrifice image quality and affect downstream vision tasks. The complex and diverse distortions make the reconstruction problem extremely challenging.

Zhou et al. [41,42] pioneered UDC image restoration and proposed a Monitor Camera Imaging System (MCIS) to capture paired data. However, their work only simulates an incomplete set of degradations. To alleviate this problem, Feng et al. [9] reformulate the image formation model and synthesize UDC images by considering the diffraction flare of saturated regions in high-dynamic-range (HDR) images. This challenge is based on the dataset proposed in [9] and aims to restore UDC images with complicated degradations. More details are discussed in the following sections.

We hold this image restoration challenge in conjunction with the MIPI Challenge, held together with ECCV 2022. We are seeking efficient and high-performance image restoration algorithms for recovering under-display camera images. MIPI 2022 consists of five competition tracks:

– RGB+ToF Depth Completion uses sparse, noisy ToF depth measurements with RGB images to obtain a complete depth map.
– Quad-Bayer Re-mosaic converts Quad-Bayer RAW data into Bayer format so that it can be processed with standard ISPs.
– RGBW Sensor Re-mosaic converts RGBW RAW data into Bayer format so that it can be processed with standard ISPs.
– RGBW Sensor Fusion fuses Bayer data and monochrome channel data into Bayer format to increase SNR and spatial resolution.
– Under-Display Camera Image Restoration improves the visual quality of images captured by a new imaging system equipped with an under-display camera.

2 MIPI 2022 Under-Display Camera Image Restoration

To facilitate the development of efficient and high-performance UDC image restoration solutions, we provide a high-quality dataset for training and testing, together with a set of evaluation metrics that measure the performance of the developed solutions. This challenge aims to advance research on UDC image restoration.

2.1 Datasets

The dataset is collected and synthesized using a model-based simulation pipeline as introduced in [9]. The training split contains 2016 pairs of 800 × 800 × 3 images. Image values range over [0, 500] and are stored in '.npy' format. The validation set is a subset of the testing set in [9] and contains 40 pairs of images. The testing set consists of another 40 pairs of images. The input images of the validation and testing sets are provided, while the ground truth (GT) is not available to participants. Both input and GT data are high dynamic range. For evaluation, all measurements are computed on tone-mapped images (Modified Reinhard). The tone-mapping operation can be expressed as $f(x) = x / (x + 0.25)$.
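For concreteness, a minimal NumPy sketch of this tone-mapping step and of PSNR computed on tone-mapped images is given below. The function names, the peak value of 1.0 after tone mapping, and the toy data are illustrative assumptions, not the official evaluation code.

```python
import numpy as np

def tonemap(x: np.ndarray) -> np.ndarray:
    """Modified Reinhard tone mapping: f(x) = x / (x + 0.25)."""
    return x / (x + 0.25)

def psnr_tonemapped(pred: np.ndarray, gt: np.ndarray) -> float:
    """PSNR on tone-mapped HDR images (assumes a peak of 1.0 after tone mapping)."""
    p, g = tonemap(pred), tonemap(gt)
    mse = np.mean((p - g) ** 2)
    return float(10.0 * np.log10(1.0 / mse))

if __name__ == "__main__":
    # Toy example with random HDR-like data in [0, 500], shaped 800 x 800 x 3.
    gt = np.random.uniform(0.0, 500.0, size=(800, 800, 3)).astype(np.float32)
    pred = gt + np.random.normal(0.0, 1.0, size=gt.shape).astype(np.float32)
    print("tone-mapped PSNR:", psnr_tonemapped(pred, gt))
```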

2.2 Evaluation

The evaluation measures the objective fidelity and the perceptual quality of the restored UDC images with respect to the reference ground truth images. We use the standard Peak Signal-to-Noise Ratio (PSNR) and the Structural Similarity (SSIM) index, as often employed in the literature. In addition, Learned Perceptual Image Patch Similarity (LPIPS) [37] is used as a complement. All measurements are computed on tone-mapped images (Modified Reinhard [23]). For the final ranking, we choose PSNR as the main measure; however, the top-ranked solutions are also expected to achieve above-average performance on SSIM and LPIPS. For the dataset, we report the average results over all processed images.

2.3 Challenge Phase

The challenge consisted of the following phases:
1. Development: the registered participants get access to the data and baseline code, and are able to train their models and evaluate the runtime locally.
2. Validation: the participants can upload their models to the remote server to check the fidelity scores on the validation dataset and compare their results on the validation leaderboard.
3. Testing: the participants submit their final results, code, models, and factsheets.

3 Challenge Results

Among the 167 registered participants, 19 teams successfully submitted their results, code, and factsheets in the final test phase. Table 1 reports the final test results and rankings of the teams. The methods evaluated in Table 1 are briefly described in Sect. 4, and the team members are listed in the Appendix. We make the following observations. First, the USTC WXYZ team is the first-place winner of this challenge, while XPixel Group and the SRC-B team win the second and third places, respectively. Second, most methods achieve high PSNR performance (over 40 dB), which indicates that most degradations, e.g., glare and haze, are relatively easy to restore. Only three teams train their models with extra data, and several top-ranked teams apply ensemble strategies (self-ensemble [26], model ensemble, or both).

Table 1. Results of the MIPI 2022 challenge on UDC image restoration. 'Runtime' per image is tested and averaged across the validation dataset, and the image size is 800 × 800. 'Params' denotes the total number of learnable parameters. Numbers in parentheses denote the per-metric rank.

Team name      | User name      | PSNR      | SSIM       | LPIPS      | Params (M) | Runtime (s) | Platform      | Extra data | Ensemble
USTC WXYZ      | YuruiZhu       | 48.48(1)  | 0.9934(1)  | 0.0093(1)  | 16.85      | 0.27        | Nvidia A100   | Yes        | model ensemble
XPixel Group   | JFHu           | 47.78(2)  | 0.9913(5)  | 0.0122(6)  | 14.06      | 0.16        | Nvidia A6000  | –          | –
SRC-B          | xiaozhazha     | 46.92(3)  | 0.9929(2)  | 0.0098(2)  | 23.56      | 0.13        | RTX 3090      | –          | self-ensemble + model ensemble
MIALGO         | Xhjiang        | 46.12(4)  | 0.9892(10) | 0.0159(13) | 7.93       | 9           | Tesla V100    | –          | –
LVGroup HFUT   | HuanZheng      | 45.87(5)  | 0.9920(3)  | 0.0109(4)  | 6.47       | 0.92        | GTX 2080Ti    | –          | –
GSM            | Zhenxuan Fang  | 45.82(6)  | 0.9917(4)  | 0.0106(3)  | 11.85      | 1.88        | RTX 3090      | –          | –
Y2C            | y2c            | 45.56(7)  | 0.9912(6)  | 0.0129(8)  | /          | /           | GTX 3090      | –          | self-ensemble
VIDAR          | Null           | 44.05(8)  | 0.9908(7)  | 0.0120(5)  | 33.8       | 0.47        | RTX 3090Ti    | –          | model ensemble
IILLab         | zhuhy          | 43.45(9)  | 0.9897(9)  | 0.0125(7)  | 82.2       | 2.07        | RTX A6000     | –          | self-ensemble
jzsherlock     | jzsherlock     | 43.44(10) | 0.9899(8)  | 0.0133(9)  | 14.82      | 1.03        | Tesla A100    | –          | self-ensemble
Namecantbenull | Namecantbenull | 43.13(11) | 0.9872(13) | 0.0144(10) | /          | /           | Tesla A100    | –          | –
MeVision       | Ziqi Song      | 42.95(12) | 0.9892(11) | 0.0150(11) | /          | 0.08        | NVIDIA 3080ti | Yes        | –
BIVLab         | Hao-mk         | 42.04(13) | 0.9873(12) | 0.0155(12) | 2.98       | 2.52        | Tesla V100    | –          | self-ensemble
RushRushRush   | stillwaters    | 39.52(14) | 0.9820(14) | 0.0216(14) | 5.9        | 0.63        | RTX 2080Ti    | –          | –
JMU-CVLab      | nanashi        | 37.46(15) | 0.9773(16) | 0.0370(16) | 2          | 0.48        | Tesla P100    | –          | –
eye3           | SummerinSummer | 36.70(16) | 0.9783(15) | 0.0326(15) | 26.13      | 1.07        | Tesla V100    | –          | –
FMS Lab        | hrishikeshps94 | 35.77(17) | 0.9719(17) | 0.0458(18) | 4.40       | 0.04        | Tesla V100    | –          | –
EDLC2004       | jiangbiao      | 35.50(18) | 0.9616(18) | 0.0453(17) | 31         | 0.99        | RTX 2080Ti    | Yes        | –
SAU LCFC       | chm            | 32.75(19) | 0.9591(19) | 0.0566(19) | 0.98       | 0.92        | Tesla V100    | –          | –

4 Challenge Methods and Teams

USTC WXYZ Team. Inspired by [6,10,34], this team designs an enhanced multi-input multi-output network, which mainly consists of dense residual blocks and cross-scale gating fusion modules. The overall architecture of the network is depicted in Fig. 1. The training phase is divided into three stages: i) Adam optimizer with a batch size of 3 and a patch size of 256 × 256; the initial learning rate is 2 × 10^-4 and is adjusted with a cosine annealing scheme, for 1000 epochs in total. ii) Adam optimizer with a batch size of 1 and a patch size of 512 × 512; the initial learning rate is 2 × 10^-5, again with cosine annealing, for 300 epochs. iii) Adam optimizer with a batch size of 2 and a patch size of 800 × 800; the initial learning rate is 8 × 10^-6, with cosine annealing, for 150 epochs. During inference, the team adopts a model ensemble strategy that averages the parameters of multiple models trained with different hyperparameters, which brings around a 0.09 dB increase in PSNR.

Fig. 1. The network architecture proposed by USTC WXYZ Team.

XPixel Group. This team designs a UNet-like structure (see Fig. 2), making full use of hierarchical multi-scale information from low-level to high-level features. To ease training and facilitate information flow, several residual blocks are utilized in the base network, and dynamic kernels in the skip connections bring better flexibility. Since UDC images differ in holistic brightness and contrast, the team applies a condition network with spatial feature transform (SFT), as in HDRUNet [4], to process input images with location-specific and image-specific operations, which provides spatially variant manipulations. In addition, they incorporate the point spread function (PSF) information provided in DISCNet [9], whose effectiveness has been demonstrated through extensive experiments on both synthetic and real UDC data. For training, the authors randomly crop 256 × 256 patches from the training images as inputs. The mini-batch size is set to 32, and the whole network is trained for 6 × 10^5 iterations. The learning rate is initialized to 2 × 10^-4, decayed with a cosine annealing schedule, and restarted at [50K, 150K, 300K, 450K] iterations.

Fig. 2. The network architecture proposed by XPixel Group.

SRC-B Team. This team proposes a Multi-Refinement Network (MRNet) for image restoration on under-display cameras, as shown in Fig. 3.


Fig. 3. The network architecture proposed by SRC-B Team.

In this challenge, the authors modify and improve MRNet [1], which is mainly composed of three modules: a shallow feature extraction module, a reconstruction module, and an output module. The shallow feature extraction and output modules each use only one convolution layer. Multi-refinement is the main idea of the reconstruction module, which includes N multi-scale residual group modules (MSRGM) that progressively refine the features. The MSRGM also fuses information from three scales to improve the representation ability and robustness of the model, where each scale is composed of several residual group modules (RGM), and each RGM contains G residual block modules (RBM). In addition, the authors remove the channel attention (CA) module in the RBM [38], since it brings limited improvement but increases the inference time.

MIALGO Team. As shown in Fig. 4, this team addresses the UDC image restoration problem with a Residual Dense Network (RDN) [39] as the backbone. The authors also adopt the multi-resolution feature fusion strategy of HRNet [25] to improve performance. To facilitate the recovery of high-dynamic-range images, the authors generate six non-HDR images with different exposure ranges from each given HDR image, where the value ranges of the splits are [0, 0.5], [0, 2], [0, 8], [0, 32], [0, 128], and [0, 500]; these are normalized and then stacked into 18 channels of data for training. According to the dynamic range of the data, 2000 images of the training set (the remaining 16 are used as a validation set) are divided into two datasets: i) an easy-sample dataset of 1318 images whose intensities lie in [0, 32]; ii) a hard-sample dataset consisting of the easy samples plus the remaining hard samples counted twice. They use the two datasets to train two models, each of which is first trained for 500k iterations; the easy-sample and hard-sample models are then fine-tuned for another 200k iterations. In the validation/test phase, the 18-channel data is merged into a 3-channel HDR image and fed to the network.

LVGroup HFUT Team. Considering the deployment scenario of the UDC image restoration task (mobile devices), this team designs a lightweight and real-time model using a simple UNet architecture with targeted modifications, as shown in Fig. 5. Specifically, they combine the full-resolution network FRC-Net [40] and the classical UNet [24] to construct the model.


Fig. 4. The network architecture proposed by MIALGO Team.

Fig. 5. The network architecture proposed by LVGroup HFUT Team.

This design has two advantages: 1) residual blocks stacked at the original resolution ensure that the model learns sufficient spatial structure information, and 2) residual blocks stacked at the downsampled resolution ensure that the model learns sufficient semantic-level information. During training, they first perform tone mapping and then apply a sequence of data augmentations: 1) random crop to 384 × 384; 2) vertical flip with probability 0.5; 3) horizontal flip with probability 0.5. They train the model for 3000 epochs on the provided training dataset with an initial learning rate of 1e-4 and a batch size of 4.

GSM Team. This team reformulates UDC image restoration as a maximum a posteriori estimation problem with learned Gaussian Scale Mixture (GSM) models. Specifically, x can be solved by

$x^{(t+1)} = x^{(t)} - 2\delta\{A^{T}(Ax^{(t)} - y) + w^{(t)}(x^{(t)} - u^{(t)})\}$,    (1)


Fig. 6. The network architecture proposed by GSM Team.

In Eq. (1), $A^T$ denotes the transpose of $A$, $\delta$ denotes the step size, $t$ indexes the iterations for optimizing $x$, $w^{(t)}$ denotes the regularization parameters, and $u^{(t)}$ denotes the mean of the GSM model. In [11], a UNet was used to estimate $w^{(t)}$ and $u^{(t)}$, and two sub-networks with 4 ResBlocks and 2 Conv layers were used to learn $A$ and $A^T$, respectively. For UDC image restoration, they develop a network based on the Swin Transformer to learn the GSM prior (i.e., $w^{(t)}$ and $u^{(t)}$) and use 4 Conv layers for $A$ and $A^T$. As shown in Fig. 6, the team constructs an end-to-end network for UDC image restoration. The transformer-based GSM prior network contains an embedding layer, 4 Residual Swin Transformer Blocks (RSTB) [16], two downsampling layers, two upsampling layers, a w-generator, and a u-generator. The embedding layer, the w-generator, and the u-generator are each a 3 × 3 Conv layer. Each RSTB [16] contains 6 Swin Transformer layers and a Conv layer plus a skip connection. The features of the first two RSTBs are reused via two skip connections. To reduce the computational complexity, they run the unified framework of Eq. (1) for only one iteration and use the input $y$ as the initial value $x^{(0)}$. The loss function $L$ is defined as

$L_1 = \sqrt{\|x - \hat{x}\|^2 + \epsilon}$,  $L_2 = \sqrt{\|\Delta(x) - \Delta(\hat{x})\|^2 + \epsilon}$,  $L_3 = \|\mathrm{FFT}(x) - \mathrm{FFT}(\hat{x})\|_1$,  $L = L_1 + 0.05 \times L_2 + 0.01 \times L_3$,    (2)

where $x$ and $\hat{x}$ denote the label and the output of the network, $\Delta$ denotes the Laplacian operator, and FFT denotes the fast Fourier transform.

Y2C Team. This team takes the models of [6,35] as the backbone network and introduces a multi-scale design. Besides, they reconstruct each color channel with a complete branch instead of processing the channels together. The overall architecture of the Multi-Scale and Separable Channels reconstruction Network (MSSCN) for under-display camera image restoration is shown in Fig. 7. The network takes multi-scale degraded images (1x, 0.5x, 0.25x) as inputs, and the initial feature maps of each color channel at each scale are extracted separately. Before extracting feature maps, they use the discrete wavelet transform (DWT) [17] to reduce the resolution at each scale and improve the efficiency of reconstruction.
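A single-level Haar DWT of the kind used here for resolution reduction can be sketched as follows. This is a generic illustration of the transform, not the team's implementation; the sign convention of the sub-bands varies across libraries, and plain PyTorch slicing is used instead of a wavelet package.

```python
import torch

def haar_dwt(x: torch.Tensor):
    """Single-level 2D Haar DWT: splits an N x C x H x W tensor (H, W even) into
    four half-resolution sub-bands (LL, LH, HL, HH)."""
    a = x[..., 0::2, 0::2]   # even rows, even cols
    b = x[..., 0::2, 1::2]   # even rows, odd cols
    c = x[..., 1::2, 0::2]   # odd rows, even cols
    d = x[..., 1::2, 1::2]   # odd rows, odd cols
    ll = (a + b + c + d) / 2          # low-frequency approximation
    lh = (-a - b + c + d) / 2         # vertical detail
    hl = (-a + b - c + d) / 2         # horizontal detail
    hh = (a - b - c + d) / 2          # diagonal detail
    return ll, lh, hl, hh

if __name__ == "__main__":
    img = torch.rand(1, 3, 64, 64)
    ll, lh, hl, hh = haar_dwt(img)
    print(ll.shape)  # torch.Size([1, 3, 32, 32])
```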


Fig. 7. The network architecture proposed by Y2C Team.

Fig. 8. The network architecture proposed by VIDAR Team.

In the fusion module, the team fuses feature maps of different scales at the same spatial resolution, concatenates the feature maps from each color channel and each scale, and processes them with a depth-wise convolution. In the feature extraction module, any efficient feature extraction block can be used; here, a residual group (RG) with multiple residual channel attention blocks (RCAB) [38] is adopted. To improve the capacity for recovering images, the feature fusion and feature extraction modules are repeated k times. The reconstructed images are then generated by the inverse discrete wavelet transform (IWT). During training, random rotation and flipping are used for data augmentation, and all models are trained with a multi-scale L1 loss and a frequency reconstruction loss. During testing, they adopt a self-ensemble strategy, bringing around a 0.9 dB increase in PSNR.

VIDAR Team. As shown in Fig. 8, this team designs a dense U-Net for under-display camera image restoration. The architecture aggregates and fuses feature information with dense RCABs [38]. During training, the batch size is set to 4 with a patch size of 384 × 384, and the optimizer is Adam [14] with β1 = 0.9 and β2 = 0.999. The initial learning rate is 1 × 10^-4. They use the Charbonnier loss to train the models and stop training when no notable decay of the training loss is observed.

IILLab Team. This team presents a Wiener-Guided Coupled Dynamic Filter Network, which takes advantage of both DWDN [8] and BPN [31]. As shown in Fig. 9, on a U-Net-based backbone, they first apply feature-level Wiener deconvolution for global restoration with an estimated degradation kernel, and then use a coupled dynamic filter network for local restoration.


Fig. 9. The network architecture proposed by IILLab Team.

By combining the global information from the feature-level Wiener deconvolution with the local information from the coupled dynamic filter network, the UDC degradation is modeled with higher accuracy, yielding a better restoration result. The network is jointly trained under the supervision of an L1 loss and a (VGG-based) perceptual loss [13].

jzsherlock Team. This team proposes U-RRDBNet for UDC image restoration. Since UDC images suffer from strong glare caused by the optical structure of the OLED display, which usually occurs when a light source (e.g., a street lamp or the sun) is in the view and degrades a large area around it, a larger receptive field is required for this task. The proposed architecture and processing pipeline are illustrated in Fig. 10. Although the original U-Net and its variants achieve promising results on segmentation tasks, their representation learning capability is still limited for dense prediction in low-level tasks. The authors therefore apply Residual-in-Residual Dense Blocks (RRDB) [28] to substitute the simple Conv layers in the U-Net after each down/upsampling operation, increasing the network capacity. They first perform the modified Reinhard tone mapping, then feed the tone-mapped image into the end-to-end U-RRDBNet for restoration, which produces an artifact-free image in the tone-mapped domain. The output of the network is then transformed back to the HDR domain using the inverse Reinhard tone mapping. The model is first trained using an L1 loss and a perceptual loss [13] in the tone-mapped domain, and then fine-tuned with an MSE loss for higher PSNR.

Namecantbenull Team. This team designs a deep learning model for UDC image restoration based on a U-shaped network. They incorporate the convolution operation and an attention mechanism into the transformer block architecture, which has been proven effective for image restoration in [2]. Since there is no need to calculate self-attention coefficient matrices, the memory cost and computational complexity are reduced significantly. The overall network structure is shown in Fig. 11. Specifically, they substitute the simplified channel attention module with a spatial and channel attention module [3] to additionally consider the correlation of spatial pixels. They train the model with an L1 loss and a perceptual loss in the tone-mapped domain.

MeVision Team. This team follows [15] and presents a two-branch network to restore UDC images. As shown in Fig. 12, the input images are processed and fed to the restoration network.


Fig. 10. The network architecture proposed by jzsherlock Team.

Fig. 11. The network architecture proposed by Namecantbenull Team.

The original-resolution image is processed in one branch, while the blurred, noisy image is processed in the other; the two branches are connected in series by an affine transformation connection. They then follow [20] and modify the high-frequency reconstruction branch with an encoder-decoder structure to reduce the number of parameters, and convert the smoothed dilated residual block into an IMD block [12] so that signals can propagate directly from skip connections to the bottom layers.

BIVLab Team. This team develops a Self-Guided Coarse-to-Fine Network (SG-CFN) to progressively reconstruct degraded under-display camera images. As shown in Fig. 13, the proposed SG-CFN consists of two branches: a restoration branch and a condition branch. The restoration branch is built from Feature Extraction Modules (FEM) based on improved RSTBs [16], which combine parallel central difference convolution (CDC) with Swin Transformer layers (STL) to extract rich features at multiple scales. The condition branch is built from fast Fourier convolution (FFC) blocks [5], endowed with a global receptive field to model the distribution of the large-scale point spread function (PSF), which is the main culprit of the degradation. Furthermore, the multi-scale representations extracted from the restoration branch are processed by the Degradation Correction Module (DCM) to restore clean features, guided by the corresponding condition information.


Fig. 12. The network architecture proposed by MeVision Team.

Fig. 13. The network architecture proposed by BIVlab Team.

To fully exploit the restored multi-scale clean features, they enable asymmetric feature fusion inspired by [6] to facilitate flexible information flow. Images captured through an under-display camera typically suffer from diffraction degradation caused by the large-scale PSF, resulting in blurred images with attenuated details. Thereby, (1) incorporating CDC into the restoration branch greatly helps to avoid over-smooth feature extraction and enriches the representation capability, (2) constructing the condition branch with FFC blocks endows it with a global receptive field to capture the degradation information, and (3) correcting the degraded features with both local adaptation and global modulation makes the restoration more effective. During the testing phase, they adopt a self-ensemble strategy, which brings a 0.63 dB performance gain in PSNR.
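The self-ensemble strategy [26] reported by several teams (e.g., Y2C and BIVLab above) is test-time augmentation over the eight flip/rotation variants of the input, with the inverse transform applied before averaging. A minimal sketch, assuming a generic pretrained restoration `model`, is given below; it is an illustration, not any team's code.

```python
import torch

def self_ensemble(model, x: torch.Tensor) -> torch.Tensor:
    """Average predictions over the 8 flip/rotation variants of an N x C x H x W input;
    each prediction is mapped back to the original orientation before averaging."""
    outputs = []
    for rot in range(4):                      # 0, 90, 180, 270 degree rotations
        for flip in (False, True):            # optional horizontal flip
            aug = torch.rot90(x, rot, dims=(-2, -1))
            if flip:
                aug = torch.flip(aug, dims=(-1,))
            with torch.no_grad():
                y = model(aug)
            if flip:                          # undo the augmentation on the output
                y = torch.flip(y, dims=(-1,))
            y = torch.rot90(y, -rot, dims=(-2, -1))
            outputs.append(y)
    return torch.stack(outputs, dim=0).mean(dim=0)

if __name__ == "__main__":
    model = torch.nn.Identity()               # stand-in for a trained restoration network
    img = torch.rand(1, 3, 64, 64)
    print(self_ensemble(model, img).shape)    # torch.Size([1, 3, 64, 64])
```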


Fig. 14. The network architecture proposed by JMU-CVLab Team.

RushRushRush Team. This team reproduced MIRNetV2 [36] on the UDC dataset. The network utilizes both spatially precise high-resolution representations and contextual information from low-resolution representations. The multi-scale residual block contains: (i) parallel multi-resolution convolution streams to extract multi-scale features, (ii) information exchange across streams, (iii) a non-local attention mechanism to capture contextual information, and (iv) attention-based multi-scale feature aggregation. The approach learns an enriched set of features that combines contextual information from multiple scales while preserving high-resolution spatial details.

JMU-CVLab Team. This team proposes a lightweight dual-branch neural network. Inspired by recent camera modelling approaches [7], they design the two branches around the trade-off between resources and performance. As shown in Fig. 14, the method combines ideas from deblurring [21] and HDR [19] networks with attention methods [18,29], and does not rely on metadata or extra information such as the PSF. The main image restoration branch uses a dense residual UNet architecture [24,39] with an initial CoordConv [18] layer to encode positional information. It has three encoder blocks, each formed by 2 dense residual layers (DRL) [39] followed by a downsampling pooling operation. The decoder blocks D1 and D2 each consist of a bilinear upsampling layer and 2 DRLs; the decoder block D3 has no upsampling layer and consists of 2 DRLs and a series of convolutions with kernel sizes 7, 5, and 1, where the final convolution produces the residual through a tanh activation. Similar to [19], an additional attention branch generates a per-channel attention map to control hallucination in overexposed areas. The attention map is generated by applying a CBAM block [29] to the features of the original image, with the final convolution activated by a sigmoid function.
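The CoordConv layer [18] mentioned above simply appends normalized coordinate channels to its input before a convolution, so the filters have access to absolute position. The following is a generic sketch (channel sizes are placeholders), not the JMU-CVLab implementation.

```python
import torch
import torch.nn as nn

class CoordConv2d(nn.Module):
    """Conv2d that concatenates normalized x/y coordinate channels to its input."""
    def __init__(self, in_ch: int, out_ch: int, kernel_size: int = 3, padding: int = 1):
        super().__init__()
        self.conv = nn.Conv2d(in_ch + 2, out_ch, kernel_size, padding=padding)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        n, _, h, w = x.shape
        # Per-pixel row/column coordinates normalized to [-1, 1].
        ys = torch.linspace(-1, 1, h, device=x.device).view(1, 1, h, 1).expand(n, 1, h, w)
        xs = torch.linspace(-1, 1, w, device=x.device).view(1, 1, 1, w).expand(n, 1, h, w)
        return self.conv(torch.cat([x, ys, xs], dim=1))

if __name__ == "__main__":
    layer = CoordConv2d(3, 16)
    print(layer(torch.rand(2, 3, 64, 64)).shape)  # torch.Size([2, 16, 64, 64])
```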


Fig. 15. The network architecture proposed by SAU-LCFC Team.

eye3 Team. This team adopts the Restormer [33] architecture. Instead of exploring a new architecture for under-display camera restoration, they attempt to mine the potential of an existing method. Following the scheme of [33], they train a Restormer in a progressive fashion with an L1 loss for 300k iterations, using 92k, 64k, 48k, 36k, 36k, and 24k iterations for patch sizes 128, 160, 192, 256, 320, and 384, respectively. They then fix the patch size to 384 × 384 and use a mask loss strategy [30] to train the model for another 36k iterations, where saturated pixels are masked out (a sketch of such a masked loss is given at the end of this section). After that, inspired by self-training [32], they add Gaussian noise to the well-trained model and repeat the training process.

FMS Lab Team. This team employs the Dual Branch Wavelet Net (DBWN) [17] to restore images degraded by an under-display camera imaging system. The overall network structure is similar to [22], where the network branches restore the high- and low-spatial-frequency components of the image separately.

EDLC2004 Team. This team adopts the TransWeather [27] model for UDC image restoration. Specifically, they combine different loss functions, including an L1 loss and an L2 loss. They train the model from scratch, which took approximately 17 hours on two 2080Ti GPUs.

SAU LCFC Team. This team proposes an Hourglass-Structured Fusion Network (HSF-Net), as shown in Fig. 15. They start from a coarse-scale stream and then gradually repeat top-down and bottom-up processing to form coarse-to-fine scale streams one by one. In each stream, residual dense attention modules (RDAMs) are introduced as the basic component to deeply learn the features of under-display camera images. Each RDAM contains 3 residual dense blocks (RDBs) in series with channel-wise attention (CA). In each RDB, the first 4 convolutional layers elevate the number of feature maps, while the last convolutional layer aggregates the feature maps. The growth rate of the RDB is set to 16.
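The mask-loss idea referenced for the eye3 team (ignoring saturated pixels, following [30]) can be sketched as below. The saturation threshold and the masking rule are illustrative assumptions, not the team's exact settings.

```python
import torch

def masked_l1_loss(pred: torch.Tensor, gt: torch.Tensor, sat_thresh: float = 0.98) -> torch.Tensor:
    """L1 loss that ignores pixels whose ground-truth value is near saturation,
    so the network is not penalized inside clipped flare regions."""
    mask = (gt < sat_thresh).float()          # 1 for valid pixels, 0 for saturated ones
    diff = torch.abs(pred - gt) * mask
    return diff.sum() / mask.sum().clamp(min=1.0)

if __name__ == "__main__":
    gt = torch.rand(1, 3, 32, 32)
    pred = gt + 0.05 * torch.randn_like(gt)
    print(float(masked_l1_loss(pred, gt)))
```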

5 Conclusions

In this report, we review and summarize the methods and results of the MIPI 2022 challenge on Under-Display Camera Image Restoration. All the proposed methods are based on deep CNNs, and most of them share a similar U-shaped backbone to boost performance.

Acknowledgements. We thank Shanghai Artificial Intelligence Laboratory, Sony, and Nanyang Technological University for sponsoring this MIPI 2022 challenge. We thank all the organizers and all the participants for their great work.

References 1. Abuolaim, A., Timofte, R., Brown, M.S.: NTIRE 2021 challenge for defocus deblurring using dual-pixel images: methods and results. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 578– 587 (2021) 2. Chen, L., Chu, X., Zhang, X., Sun, J.: Simple baselines for image restoration. arXiv preprint arXiv:2204.04676 (2022) 3. Chen, L., et al.: SCA-CNN: spatial and channel-wise attention in convolutional networks for image captioning. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 5659–5667 (2017) 4. Chen, X., Liu, Y., Zhang, Z., Qiao, Y., Dong, C.: HDRUNet: single image HDR reconstruction with denoising and dequantization. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 354– 363 (2021) 5. Chi, L., Jiang, B., Mu, Y.: Fast Fourier convolution. Adv. Neural. Inf. Process. Syst. 33, 4479–4488 (2020) 6. Cho, S.J., Ji, S.W., Hong, J.P., Jung, S.W., Ko, S.J.: Rethinking coarse-to-fine approach in single image deblurring. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 4641–4650 (2021) 7. Conde, M.V., McDonagh, S., Maggioni, M., Leonardis, A., P´erez-Pellitero, E.: Model-based image signal processors via learnable dictionaries. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 481–489 (2022) 8. Dong, J., Roth, S., Schiele, B.: Deep wiener deconvolution: Wiener meets deep learning for image deblurring. Adv. Neural. Inf. Process. Syst. 33, 1048–1059 (2020) 9. Feng, R., Li, C., Chen, H., Li, S., Loy, C.C., Gu, J.: Removing diffraction image artifacts in under-display camera via dynamic skip connection networks. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2021) 10. Huang, G., Liu, Z., Van Der Maaten, L., Weinberger, K.Q.: Densely connected convolutional networks. In: CVPR, vol. 1, p. 3 (2017) 11. Huang, T., Dong, W., Yuan, X., Wu, J., Shi, G.: Deep gaussian scale mixture prior for spectral compressive imaging. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 16216–16225 (2021) 12. Hui, Z., Gao, X., Yang, Y., Wang, X.: Lightweight image super-resolution with information multi-distillation network. In: Proceedings of the 27th ACM International Conference on Multimedia, pp. 2024–2032 (2019)


13. Johnson, J., Alahi, A., Fei-Fei, L.: Perceptual losses for real-time style transfer and super-resolution. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9906, pp. 694–711. Springer, Cham (2016). https://doi.org/10. 1007/978-3-319-46475-6 43 14. Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. CoRR abs/1412.6980 (2015) 15. Koh, J., Lee, J., Yoon, S.: BNUDC: a two-branched deep neural network for restoring images from under-display cameras. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 1950–1959 (2022) 16. Liang, J., Cao, J., Sun, G., Zhang, K., Van Gool, L., Timofte, R.: SwinIR: image restoration using Swin transformer. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 1833–1844 (2021) 17. Liu, P., Zhang, H., Zhang, K., Lin, L., Zuo, W.: Multi-level wavelet-CNN for image restoration. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pp. 773–782 (2018) 18. Liu, R., et al.: An intriguing failing of convolutional neural networks and the CoordConv solution. In: Advances in Neural Information Processing Systems, vol. 31 (2018) 19. Liu, Y.L., et al.: Single-image HDR reconstruction by learning to reverse the camera pipeline. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 1651–1660 (2020) 20. Mao, X., Shen, C., Yang, Y.B.: Image restoration using very deep convolutional encoder-decoder networks with symmetric skip connections. In: Advances in Neural Information Processing Systems, vol. 29 (2016) 21. Nah, S., Son, S., Lee, S., Timofte, R., Lee, K.M.: NTIRE 2021 challenge on image deblurring. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 149–165 (2021) 22. Panikkasseril Sethumadhavan, H., Puthussery, D., Kuriakose, M., Charangatt Victor, J.: Transform domain pyramidal dilated convolution networks for restoration of under display camera images. In: Bartoli, A., Fusiello, A. (eds.) ECCV 2020. LNCS, vol. 12539, pp. 364–378. Springer, Cham (2020). https://doi.org/10.1007/ 978-3-030-68238-5 28 23. Reinhard, E., Stark, M., Shirley, P., Ferwerda, J.: Photographic tone reproduction for digital images. In: Proceedings of the 29th Annual Conference on Computer Graphics and Interactive Techniques, pp. 267–276 (2002) 24. Ronneberger, O., Fischer, P., Brox, T.: U-Net: convolutional networks for biomedical image segmentation. In: Navab, N., Hornegger, J., Wells, W.M., Frangi, A.F. (eds.) MICCAI 2015. LNCS, vol. 9351, pp. 234–241. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-24574-4 28 25. Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) 26. Timofte, R., Rothe, R., Van Gool, L.: Seven ways to improve example-based single image super resolution. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1865–1873 (2016) 27. Valanarasu, J.M.J., Yasarla, R., Patel, V.M.: TransWeather: transformer-based restoration of images degraded by adverse weather conditions. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 2353–2363 (2022)


28. Wang, X., et al.: ESRGAN: enhanced super-resolution generative adversarial networks. In: Leal-Taix´e, L., Roth, S. (eds.) ECCV 2018. LNCS, vol. 11133, pp. 63–79. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-11021-5 5 29. Woo, S., Park, J., Lee, J.-Y., Kweon, I.S.: CBAM: convolutional block attention module. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11211, pp. 3–19. Springer, Cham (2018). https://doi.org/10.1007/9783-030-01234-2 1 30. Wu, Y., et al.: How to train neural networks for flare removal. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 2239–2247 (2021) 31. Xia, Z., Perazzi, F., Gharbi, M., Sunkavalli, K., Chakrabarti, A.: Basis prediction networks for effective burst denoising with large kernels. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 11844– 11853 (2020) 32. Xie, Q., Luong, M.T., Hovy, E., Le, Q.V.: Self-training with noisy student improves ImageNet classification. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 10687–10698 (2020) 33. Zamir, S.W., Arora, A., Khan, S., Hayat, M., Khan, F.S., Yang, M.H.: Restormer: efficient transformer for high-resolution image restoration. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5728– 5739 (2022) 34. Zamir, S.W., et al.: Multi-stage progressive image restoration. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 14821–14831 (2021) 35. Zamir, S.W., et al.: Learning enriched features for fast image restoration and enhancement. arXiv preprint arXiv:2205.01649 (2022) 36. Zamir, S.W., et al.: Learning enriched features for fast image restoration and enhancement. IEEE Trans. Pattern Anal. Mach. Intell. (TPAMI) (2022) 37. Zhang, R., Isola, P., Efros, A.A., Shechtman, E., Wang, O.: The unreasonable effectiveness of deep features as a perceptual metric. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 586–595 (2018) 38. Zhang, Y., Li, K., Li, K., Wang, L., Zhong, B., Fu, Y.: Image super-resolution using very deep residual channel attention networks. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11211, pp. 294–310. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01234-2 18 39. Zhang, Y., Tian, Y., Kong, Y., Zhong, B., Fu, Y.: Residual dense network for image super-resolution. In: The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) 40. Zhang, Z., Zheng, H., Hong, R., Fan, J., Yang, Y., Yan, S.: FRC-Net: a simple yet effective architecture for low-light image enhancement (2022) 41. Zhou, Y., et al.: UDC 2020 challenge on image restoration of under-display camera: methods and results. arXiv preprint arXiv:2008.07742 (2020) 42. Zhou, Y., Ren, D., Emerton, N., Lim, S., Large, T.: Image restoration for underdisplay camera. arXiv preprint arXiv:2003.04857 (2020)

Continuous Spectral Reconstruction from RGB Images via Implicit Neural Representation

Ruikang Xu1, Mingde Yao1, Chang Chen2, Lizhi Wang3, and Zhiwei Xiong1(B)

1 University of Science and Technology of China, Hefei, China
  {xurk,mdyao}@mail.ustc.edu.cn, [email protected]
2 Huawei Noah's Ark Lab, Beijing, China
  [email protected]
3 Beijing Institute of Technology, Beijing, China
  [email protected]

Abstract. Existing spectral reconstruction methods learn discrete mappings from spectrally downsampled measurements (e.g., RGB images) to a specific number of spectral bands. However, they generally neglect the continuous nature of the spectral signature and only reconstruct specific spectral bands due to the intrinsic limitation of discrete mappings. In this paper, we propose a novel continuous spectral reconstruction network with implicit neural representation, which enables spectral reconstruction of arbitrary band numbers for the first time. Specifically, our method takes an RGB image and a set of wavelengths as inputs to reconstruct the spectral image with arbitrary bands, where the RGB image provides the context of the scene and the wavelengths provide the target spectral coordinates. To exploit the spectral-spatial correlation in implicit neural representation, we devise a spectral profile interpolation module and a neural attention mapping module, which exploit and aggregate the spatial-spectral correlation of the spectral image in multiple dimensions. Extensive experiments demonstrate that our method not only outperforms existing discrete spectral reconstruction methods but also enables spectral reconstruction of arbitrary and even extreme band numbers beyond the training samples. Keywords: Computational photography · Hyperspectral image reconstruction · Implicit neural representation

1 Introduction

Spectral images record richer spectrum information than traditional RGB images and have proven useful in various vision-based applications, such as anomaly detection [21], object tracking [50], and segmentation [16]. To acquire spectral images, conventional spectral imaging technology relies on either spatial or spectral scanning to capture the spectral signature over a number of bands, which is of high complexity and time-consuming [10,17,43,45,53].


Fig. 1. (a) The main difference in spectral reconstruction between previous methods and NeSR. Previous methods learn a discrete mapping to reconstruct the spectral image, illustrated by the "blue dashes". We learn a continuous spectral representation and sample it at the target number of spectral bands, illustrated by the "red curve". (b) Continuous spectral reconstruction. (Color figure online)

As an alternative, spectral reconstruction from RGB images is regarded as an attractive solution owing to the easy acquisition of RGB images [3,30,38,60]. Existing methods [23,31,38,56,58] for spectral reconstruction aim to learn a discrete mapping from an RGB image to a spectral image, which directly generates a specific number of spectral bands from three bands, as illustrated by the "blue dashes" in Fig. 1(a). However, this kind of representation ignores the continuous nature of the spectral signature. In the physical world, the spectral signature has a continuous form in which the high-dimensional correlation is naturally hidden [9,22]. To approximate the natural representation of the spectral signature, we reformulate the spectral reconstruction process so that a number of spectral bands are resampled from a continuous spectral curve, as illustrated by the "red curve" in Fig. 1(a).

Recent works adopt the concept of implicit neural representation for super-resolution [14] and 3D reconstruction [26,35], obtaining continuous representations of signals with high fidelity. Inspired by this line of work, we aim to learn a continuous representation of the spectral signature for spectral reconstruction by leveraging a continuous and differentiable function [24,27]. However, it is non-trivial to apply existing continuous representation methods to spectral images directly, since they only focus on the spatial correlation of RGB images [14,34]. The high-dimensional spatial-spectral correlation, which plays a vital role in high-fidelity spectral reconstruction, remains unexplored [42,46].

To fill this gap, we propose Neural Spectral Reconstruction (NeSR) to continuously represent the spectral signature by exploiting the spatial-spectral correlation. Specifically, we first adopt a feature encoder to extract deep features from the input RGB image to represent the context information of the scene. We then take the target wavelengths as coordinate information to learn the projection from the deep features to the spectral intensities of the corresponding wavelengths. To exploit the spatial-spectral correlation for continuous spectral reconstruction, we devise a Spectral Profile Interpolation (SPI) module and a Neural Attention Mapping (NAM) module.


The former encodes the spatial-spectral correlation into the deep features by leveraging vertical and horizontal spectral profile interpolation, while the latter further enriches the spatial-spectral information using an elaborate spatial-spectral-wise attention mechanism. With these two modules, the spatial-spectral correlation of the deep features is exploited to learn a continuous spectral representation. Benefiting from the continuous spectral representation as well as the exploitation of the spatial-spectral correlation, the advantage of NeSR is twofold. First, NeSR enables spectral reconstruction of arbitrary and even extreme band numbers beyond the training samples for the first time, as illustrated in Fig. 1(b). Second, NeSR notably improves performance over state-of-the-art methods for spectral image reconstruction, bringing a 12% accuracy improvement with only a 2% parameter increase over the strongest baseline on the NTIRE2020 challenge dataset.

The contributions of this paper are summarized as follows: (1) For the first time, we propose NeSR to reconstruct spectral images with an arbitrary number of spectral bands while keeping high accuracy via implicit neural representation. (2) We propose the SPI and NAM modules to exploit the spatial-spectral correlation of deep features extracted from input RGB images for continuous spectral representation. (3) Extensive experiments demonstrate that NeSR outperforms state-of-the-art methods in reconstructing spectral images with a specific number of bands.

2 Related Work

2.1 Implicit Neural Representation

Implicit neural representation aims to model an object as a continuous and differentiable function, parameterized by a deep neural network, that maps coordinates and deep features to the corresponding signal. Recent works demonstrate its potential for modeling surfaces [11,20,35], shapes [6,25], and the appearance of 3D objects [28,29]. Mildenhall et al. first introduce implicit neural representation for synthesizing novel views of complex scenes from a sparse set of input views, named Neural Radiance Field (NeRF) [26]. Compared with explicit 3D representations, implicit neural representation can capture subtle details. With the great success of NeRF [26], a large number of works extend implicit neural representation to other applications, such as 3D-aware generalization [12,39], pose estimation [37,55], and relighting [7,36]. To obtain a more general representation, recent works estimate latent codes and share an implicit space across different objects or scenes [14,15,34,52], instead of learning an independent implicit neural representation for each object. The implicit space can be generated with an auto-encoder architecture. For example, Sitzmann et al. [33] propose a meta-learning-based method for sharing the implicit space. Mescheder et al. [24] propose to generate a global latent space from given images and use occupancy function conditioning to perform 3D reconstruction.


Sitzmann et al. [34] replace ReLU with periodic activation functions and demonstrate that they can model signals with higher quality. Chen et al. [14] utilize a local implicit image function for representing natural and complex images. Yang et al. [52] propose a Transformer-based implicit neural representation for screen image super-resolution. Different from these works, we propose NeSR to address continuous spectral reconstruction from RGB images by fully exploring the spatial-spectral correlation.

2.2 Spectral Reconstruction from RGB Images

Most existing spectral imaging systems rely on either spatial or spectral scanning [17,43,45]. They need to capture the spectral information of a single point or a single band separately, and then scan the whole scene to get a full spectral image. Due to the difficulty of measuring information from scenes with moving content, they are unsuitable for real-time operation. Moreover, these spectral imaging systems remain prohibitively expensive for consumer-grade usage. As an alternative, reconstructing spectral images from RGB images is a relatively low-cost and convenient approach to acquire spectral images. However, it is a severely ill-posed problem, since much information is lost after integrating the spectral radiance into RGB values. Many methods have been proposed for spectral reconstruction from RGB images [32,44]. Early works leverage sparse coding to recover the lost spectral information from RGB images. Arad et al. [3] first exploit the spectral prior from a vast amount of data to create a sparse dictionary, which facilitates the spectral reconstruction. Later, Aeschbacher et al. [1] further improve the performance of Arad’s method leveraging the A+ framework [40] from super-resolution. Alternatively, Akhtar et al. [2] utilize Gaussian processes and clustering instead of dictionary learning. With the great success of convolution neural networks in low-level computer vision, this task has received increasing attention from the deep learning direction [38,56,60]. Xiong et al. [51] propose a unified deep learning framework HSCNN for spectral reconstruction from both RGB and compressive measurements. Shi et al. [31] propose two improvements to boost the performance of HSCNN. Li et al. [23] propose an adaptive weighted attention network (AWAN) to explore the camera spectral sensitivity prior for improving the reconstruction accuracy. Zhao et al. [58] propose a hierarchical regression network (HRNet) for reconstructing spectral images from RGB images. Cai et al. [8] propose a Transformer-based model for spectral reconstruction, which utilize the mask-guided attention mechanism to capture the global correlation for reconstruction. However, these methods reconstruct the spectral images by the discrete mapping, which fixes the number of output spectral bands and ignores the continuous nature of the spectral signature. In this work, we propose NeSR to learn the continuous representation for spectral reconstruction from RGB images, which improves accuracy for reconstructing spectral images with a specific number of bands and enables spectral reconstruction with an arbitrary and extreme number of bands.


Fig. 2. The proposed Neural Spectral Reconstruction.

3 Neural Spectral Reconstruction

3.1 Overview

The overview of NeSR is shown in Fig. 2(a). Given an input RGB image, we cascade a feature encoder to extract the deep feature $M_{in} \in \mathbb{R}^{\lambda_{in} \times H \times W}$ as one input. For learning the continuous spectral representation, we take the 3D coordinate $X \in \mathbb{R}^{\lambda_{out} \times H \times W}$ as the other input. Specifically, for the spectral dimension, we use the normalized spectral wavelengths to reconstruct the spectral intensities of the corresponding wavelengths. Then, NeSR takes the deep feature $M_{in}$ and the coordinate $X$ to reconstruct the spectral image with the target number of bands as the output, denoted as

$Y = F(M_{in}, X)$,    (1)

where F stands for the overall processing of NeSR. Specifically, given the deep feature Min and the coordinate X, NeSR upsamples the deep feature to the target number of spectral bands for subsequent coordinate information fusion at first. To this end, the SPI module is designed to upsample the deep feature and encode the spatial-spectral correlation leveraging the vertical and horizontal spectral profile interpolation. Then, the NAM module is designed to further exploit the spatial-spectral correlation of the fused feature by global attention in spatial and spectral dimensions, which computes the similarity of different channels according to the spatial-spectral-wise attention mechanism. After exploiting the spatial-spectral correlation and fusing the coordinate information, a new deep feature v ∗ is generated. Finally, each latent code of this feature is fed to a multi-layer perception (MLP) to predict the corresponding spectral signature. Iterating over all locations of the target spectral image, NeSR can reconstruct the spectral image with the desired number of bands while keeping high reconstruction fidelity.
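To make the querying interface concrete, the sketch below builds normalized wavelength coordinates and maps each latent code, concatenated with a spectral coordinate, to an intensity with an MLP. The network sizes, the wavelength range, and the normalization convention are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

def wavelength_coords(wavelengths, lo=400.0, hi=700.0):
    """Normalize target wavelengths (nm) to [-1, 1] to serve as spectral coordinates."""
    w = torch.as_tensor(wavelengths, dtype=torch.float32)
    return 2.0 * (w - lo) / (hi - lo) - 1.0

class SpectralQueryMLP(nn.Module):
    """Maps a latent code concatenated with a spectral coordinate to one intensity value."""
    def __init__(self, latent_dim: int = 64, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(latent_dim + 1, hidden), nn.ReLU(),
                                 nn.Linear(hidden, hidden), nn.ReLU(),
                                 nn.Linear(hidden, 1))

    def forward(self, latent: torch.Tensor, coord: torch.Tensor) -> torch.Tensor:
        # latent: (N, latent_dim), coord: (N, 1) -> intensity: (N, 1)
        return self.net(torch.cat([latent, coord], dim=-1))

if __name__ == "__main__":
    coords = wavelength_coords([450.0, 550.0, 650.0])          # 3 target bands
    latent = torch.randn(10, 64)                                # 10 spatial locations
    mlp = SpectralQueryMLP()
    # Query every (location, wavelength) pair: result is 10 x 1 x 3 intensities.
    out = torch.stack([mlp(latent, c.repeat(10, 1)) for c in coords], dim=-1)
    print(out.shape)  # torch.Size([10, 1, 3])
```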

3.2 Spectral Profile Interpolation

The SPI module upsamples the deep feature to the target number of spectral bands and encodes the spatial-spectral correlation of the spectral image. The spatial-spectral correlation is an essential characteristic of the spectral representation [42,46]. To upsample the feature while exploiting this correlation, we propose the concept of the Spectral Profile (SP), inspired by other high-dimensional image reconstruction tasks [48,49,61]. We first define the SP as follows. Consider a spectral image as a 3D volume $I(h, w, \lambda)$, where $h$ and $w$ stand for the spatial dimensions and $\lambda$ stands for the spectral dimension. The vertical SP $P_{w^*}(h, \lambda)$ and the horizontal SP $P_{h^*}(w, \lambda)$ are the slices generated when $w = w^*$ and $h = h^*$, respectively. Since much information is lost after integrating the spectral radiance into RGB values, we exploit the spatial-spectral correlation of the spectral image in the feature domain.

The flow diagram of the SPI module is shown in Fig. 2(b). Given the deep feature $M_{in} \in \mathbb{R}^{\lambda_{in} \times H \times W}$, we first convert it into the vertical and horizontal SPs to exploit the correlations in the spatial and spectral dimensions, denoted as $P^{in}_{w^*}(h, \lambda) \in \mathbb{R}^{H \times \lambda_{in}}$ and $P^{in}_{h^*}(w, \lambda) \in \mathbb{R}^{W \times \lambda_{in}}$. Then, the upsampled vertical SP $P^{sr}_{w^*}(h, \lambda) \in \mathbb{R}^{H \times \lambda_{out}}$ and the upsampled horizontal SP $P^{sr}_{h^*}(w, \lambda) \in \mathbb{R}^{W \times \lambda_{out}}$ are generated as

$P^{sr}_{w^*}(h, \lambda) = Up(P^{in}_{w^*}(h, \lambda))$,  $P^{sr}_{h^*}(w, \lambda) = Up(P^{in}_{h^*}(w, \lambda))$,    (2)

where $Up(\cdot)$ stands for the upsampling operation. After converting the upsampled SPs back to feature maps, we obtain the intermediate results $M^{h}_{out}, M^{w}_{out} \in \mathbb{R}^{\lambda_{out} \times H \times W}$. To generate the upsampled deep feature for the coordinate information fusion, we fuse the intermediate results by concatenation followed by 3D convolution layers with LeakyReLU, which can be denoted as

$M_{out} = Conv3D([M^{h}_{out}, M^{w}_{out}])$,    (3)

where $M_{out} \in \mathbb{R}^{C \times \lambda_{out} \times H \times W}$ stands for the output feature of the SPI module, which can be viewed as $\lambda_{out} \times H \times W$ latent codes corresponding to the coordinates of the spectral image. $Conv3D(\cdot)$ and $[\cdot, \cdot]$ denote the 3D convolution layers with LeakyReLU and the concatenation operation, respectively.
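The profile-style spectral upsampling of Eqs. (2)–(3) can be imitated with standard interpolation along the spectral axis of the two profile views, followed by 3D convolutional fusion. The sketch below is a simplified stand-in for the SPI module; the interpolation mode and channel sizes are assumptions, not the paper's exact design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SimpleSPI(nn.Module):
    """Upsamples a feature volume from lambda_in to lambda_out bands.
    Interpolating over (H, lambda) and (W, lambda) slices mimics the vertical and
    horizontal spectral-profile views; the results are fused with 3D convolutions."""
    def __init__(self, channels: int = 16):
        super().__init__()
        self.fuse = nn.Sequential(nn.Conv3d(2, channels, 3, padding=1), nn.LeakyReLU(0.1),
                                  nn.Conv3d(channels, channels, 3, padding=1))

    def forward(self, feat: torch.Tensor, lambda_out: int) -> torch.Tensor:
        # feat: (N, lambda_in, H, W)
        n, l_in, h, w = feat.shape
        # Vertical profiles: interpolate each (H, lambda) slice along the spectral axis.
        v = F.interpolate(feat.permute(0, 3, 2, 1), size=(h, lambda_out),
                          mode="bilinear", align_corners=False).permute(0, 3, 2, 1)
        # Horizontal profiles: interpolate each (W, lambda) slice along the spectral axis.
        hz = F.interpolate(feat.permute(0, 2, 3, 1), size=(w, lambda_out),
                           mode="bilinear", align_corners=False).permute(0, 3, 1, 2)
        fused = torch.stack([v, hz], dim=1)            # (N, 2, lambda_out, H, W)
        return self.fuse(fused)                        # (N, C, lambda_out, H, W)

if __name__ == "__main__":
    feat = torch.randn(1, 8, 32, 32)                   # 8 input "bands" of deep features
    out = SimpleSPI()(feat, lambda_out=31)
    print(out.shape)                                   # torch.Size([1, 16, 31, 32, 32])
```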

3.3 Neural Attention Mapping

The NAM module further exploits the spatial-spectral correlation for continuous spectral representation by a new attention mechanism. Inspired by the success of self-attention-based architectures, such as Transformer, in natural language processing [41] and computer vision [13,47,59], we propose the spatial-spectral-wise attention mechanism, which captures the interactions of different dimensions to exploit the spectral-spatial correlation. The flow diagram of the NAM module is shown in Fig. 2(c).


Different from the standard Transformer block that receives a 1D token embedding [18] as input, the NAM module takes a 3D tensor as input. To handle the 3D tensor, we first reshape the 3D feature $M_{out}$ and the 3D coordinate $X$ into a 1D token embedding, and then map the embedding with an MLP, denoted as

$$v = MLP\left(Reshape\left([M_{out}, X]\right)\right), \tag{4}$$

where $MLP(\cdot)$ and $Reshape(\cdot)$ stand for the MLP and the reshape operation, respectively. $v \in \mathbb{R}^{(\lambda_{out} \cdot H \cdot W) \times C}$ denotes the fused token embedding, which fuses the information of the coordinate and the deep feature. Then, $v$ is fed to the spatial-spectral attention block to exploit the spatial-spectral correlation of the fused token embedding. In computing attention, we first map $v$ into key and memory embeddings for modeling the deep correspondences. Unlike the self-attention mechanism that only focuses on a single dimension, we compute the attention maps of different dimensions to capture the interactions among dimensions for exploring the correlation of the fused token embedding. Benefiting from the fact that $v$ compresses the spatial-spectral information into one dimension, the attention maps fully exploit the spatial-spectral correlation to compute the similarity among channels, denoted as

$$Q, K, V = vW_q,\; vW_k,\; vW_v, \qquad v^{*} = MLP\left(V \otimes \mathrm{SoftMax}\left(Q^{T} \otimes K\right)\right), \tag{5}$$

where $Q, K, V \in \mathbb{R}^{(\lambda_{out} \cdot H \cdot W) \times C}$ stand for the query, key and value used to generate the attention, and $W_q, W_k, W_v$ stand for the corresponding weights, respectively. $\otimes$ and $T$ denote the batch-wise matrix multiplication and the batch-wise transposition, respectively. Finally, we generate the output token embedding of the NAM module, $v^{*} \in \mathbb{R}^{(\lambda_{out} \cdot H \cdot W) \times C}$, by an MLP.
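A minimal sketch of Eq. (5) is given below, assuming the fused tokens of Eq. (4) are built by concatenating the SPI feature with the coordinate grid; attention is computed between channels (a small $C \times C$ map) rather than between positions, so memory stays manageable even for $\lambda_{out} \cdot H \cdot W$ tokens. Layer sizes and the class name are assumptions, not the authors' implementation, and the scaling factor of standard attention is omitted.

```python
import torch
import torch.nn as nn

class SpatialSpectralAttention(nn.Module):
    """Sketch of the spatial-spectral-wise attention of the NAM module."""

    def __init__(self, in_channels, embed_dim=64):
        super().__init__()
        self.embed = nn.Sequential(nn.Linear(in_channels, embed_dim), nn.ReLU())
        self.to_q = nn.Linear(embed_dim, embed_dim, bias=False)
        self.to_k = nn.Linear(embed_dim, embed_dim, bias=False)
        self.to_v = nn.Linear(embed_dim, embed_dim, bias=False)
        self.out = nn.Sequential(nn.Linear(embed_dim, embed_dim), nn.ReLU(),
                                 nn.Linear(embed_dim, embed_dim))

    def forward(self, m_out, coords):
        # m_out:  (B, C, L, H, W) feature from the SPI module
        # coords: (B, 1, L, H, W) normalized spectral coordinates broadcast over space
        tokens = torch.cat([m_out, coords], dim=1)       # (B, C+1, L, H, W)
        tokens = tokens.flatten(2).transpose(1, 2)       # (B, L*H*W, C+1)
        v_fused = self.embed(tokens)                     # Eq. (4): fused token embedding
        q, k, v = self.to_q(v_fused), self.to_k(v_fused), self.to_v(v_fused)
        # Q^T K is a channel-by-channel similarity map, as in Eq. (5).
        attn = torch.softmax(q.transpose(1, 2) @ k, dim=-1)   # (B, E, E)
        return self.out(v @ attn)                              # (B, L*H*W, E)

# Example usage: 16 feature channels plus 1 coordinate channel.
nam = SpatialSpectralAttention(in_channels=17)
v_star = nam(torch.randn(1, 16, 31, 8, 8), torch.randn(1, 1, 31, 8, 8))
```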

3.4 Loss Function

We adopt the Mean Relative Absolute Error (MRAE) as the loss function, which is defined as

$$\mathcal{L} = \frac{1}{N}\sum_{i=1}^{N} \left| Y^{(i)} - I_{GT}^{(i)} \right| \big/ \left( I_{GT}^{(i)} + \varepsilon \right), \tag{6}$$

where $Y^{(i)}$ and $I_{GT}^{(i)}$ stand for the $i$-th ($i = 1, \ldots, N$) pixel of the reconstructed and ground-truth spectral images, respectively. We set $\varepsilon = 1 \times 10^{-3}$ to account for zero values in the ground-truth spectral image.
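A minimal sketch of Eq. (6) as it is commonly implemented (not the authors' code); the epsilon value follows the one stated above.

```python
import torch

def mrae_loss(pred, target, eps=1e-3):
    """Mean Relative Absolute Error between a reconstructed and a ground-truth
    spectral image; `eps` guards against zero-valued ground-truth pixels."""
    return torch.mean(torch.abs(pred - target) / (target + eps))

# Example: a 31-band prediction and ground truth of spatial size 64x64.
pred = torch.rand(1, 31, 64, 64)
target = torch.rand(1, 31, 64, 64)
loss = mrae_loss(pred, target)
```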

4 Experiments

4.1 Comparison to State-of-the-Art Methods

To quantitatively evaluate the effectiveness of the proposed method, we first compare NeSR with state-of-the-art methods in reconstructing spectral images with a specific number of spectral bands.


Table 1. Quantitative comparison on the NTIRE2020, CAVE and NTIRE2018 datasets for 31-band spectral reconstruction from RGB images. Red and blue indicate the best and the second best performance, respectively.

Method          | NTIRE2020 clean MRAE/RMSE | NTIRE2020 real MRAE/RMSE | CAVE MRAE/RMSE    | NTIRE2018 clean MRAE/RMSE | NTIRE2018 real MRAE/RMSE | Param (M)
BI              | 0.16566 / 0.04551         | 0.17451 / 0.04307        | 5.74425 / 0.16886 | 0.12524 / 0.01941         | 0.15862 / 0.02375        | -
HSCNN-R         | 0.04128 / 0.01515         | 0.07278 / 0.01892        | 0.19613 / 0.03535 | 0.01943 / 0.00389         | 0.03437 / 0.00621        | 1.20
HSCNN-R + NeSR  | 0.03832 / 0.01323         | 0.06978 / 0.01814        | 0.15353 / 0.03304 | 0.01611 / 0.00383         | 0.03171 / 0.00584        | 1.34
HRNet           | 0.04007 / 0.01401         | 0.06756 / 0.01821        | 0.17219 / 0.02987 | 0.01521 / 0.00368         | 0.02985 / 0.00571        | 31.70
HRNet + NeSR    | 0.03701 / 0.01284         | 0.06691 / 0.01797        | 0.16431 / 0.02977 | 0.01466 / 0.00351         | 0.02895 / 0.00557        | 31.88
AWAN            | 0.03441 / 0.01215         | 0.06883 / 0.01711        | 0.19156 / 0.03752 | 0.01226 / 0.00255         | 0.03121 / 0.00577        | 28.59
AWAN + NeSR     | 0.02996 / 0.00989         | 0.06661 / 0.01632        | 0.13226 / 0.02789 | 0.01159 / 0.00229         | 0.03003 / 0.00552        | 29.29

[Fig. 3 panels: RGB, HSCNN-R, HRNet, AWAN, BI, HSCNN-R+NeSR, HRNet+NeSR, AWAN+NeSR]

Fig. 3. Visualization of the error maps of different methods of spectral reconstruction from RGB images on “ARAD HS 0465” of the NTIRE2020 “Clean” dataset. We show the spectral signatures of selected pixels and mark them by red and yellow rectangles in the RGB image. Please zoom in to see the difference of spectral intensity.

Datasets. In this work, we select three datasets as the benchmark for training and evaluation, including NTIRE2020 [5], CAVE [54] and NTIRE2018 [4]. NTIRE2020 and NTIRE2018 are the benchmarks for the hyperspectral reconstruction challenges in NTIRE2020 and NTIRE2018, respectively. Both datasets consist of two tracks: “Clean” and “Real World”, in which each spectral image consists of 31 successive spectral bands ranging from 400 nm to 700 nm with a 10 nm increment. CAVE is a 16-bit spectral image dataset containing 32 scenes with 31 successive spectral bands ranging from 400 nm to 700 nm with a 10 nm increment. For training, the image pairs of the three datasets are randomly cropped to 64 × 64 and normalized to range [0, 1]. The training sets of the NTIRE2020 and NTIRE2018 datasets and the randomly picked 28 scenes from the CAVE dataset are used for training. We use the validation sets of the NTIRE2020 and NTIRE2018 datasets and the remaining 4 scenes from the CAVE dataset as the test datasets. We utilize the MRAE and Root Mean Square Error (RMSE) as metrics to quantitatively evaluate the reconstructed spectral images.


Implementation Details. Our proposed method is trained on spectral-RGB image pairs from the three datasets separately. We take $\lambda_{min} = 400$ nm and $\lambda_{max} = 700$ nm as the minimum and maximum spectral coordinates, since the input RGB images are synthesized in this visible-light range. We then normalize the target spectral coordinate $\lambda_i$ to $[-1, 1]$ as $2 \times (\lambda_i - \lambda_{min}) / (\lambda_{max} - \lambda_{min}) - 1$. We employ a 4-layer MLP with ReLU activation to reconstruct spectral images at the end of NeSR, with hidden dimensions of (128, 128, 256, 256). We take bilinear interpolation as the upsampling operation in the SPI module. The Adam optimizer is used with parameters $\beta_1 = 0.9$ and $\beta_2 = 0.999$. The learning rate is initially set to $1 \times 10^{-4}$ and is downscaled by a factor of 0.5 every $2 \times 10^{4}$ iterations until $3 \times 10^{5}$ iterations. All the experiments in this paper are conducted in PyTorch 1.6.

Comparison Methods. To verify the superiority of our method, we compare NeSR with different baselines, including one classical method (bilinear interpolation, BI) and three deep-learning-based methods (HSCNN-R [31], HRNet [58] and AWAN [23]). We retrain the baselines under the same configuration as our method for a fair comparison. In NeSR, we set the feature encoder to one of the aforementioned deep-learning-based methods with its output layer removed.

Quantitative Evaluation. Table 1 shows the results of different methods on all benchmarks. It can be seen that, on the one hand, NeSR boosts the performance of the corresponding baseline with a slight parameter increase. On the other hand, NeSR is architecture-agnostic and can be plugged into various backbones, being effective for both simple (e.g., HSCNN-R) and complex (e.g., AWAN) architectures. Specifically, HRNet+NeSR achieves a 6.9% decrease in MRAE compared with HRNet with only a 0.5% increase in parameters on the validation set of NTIRE2020 "Clean", and it shows the best performance on the validation set of NTIRE2018 "Real World". AWAN+NeSR achieves the best performance on the validation sets of both tracks of NTIRE2020, bringing a 12.9% decrease in MRAE compared with AWAN with a 2.4% increase in parameters.

Qualitative Evaluation. To evaluate the perceptual quality of spectral reconstruction, we show the error maps of different methods on the NTIRE2020 "Clean" validation set. As visualized in Fig. 3, we can observe a significant error decrease brought by NeSR. Besides, the spectral signatures reconstructed by NeSR for the selected pixels are closer to the ground truth than those of the corresponding baselines (the yellow and red boxes). Therefore, NeSR boosts the corresponding baselines in terms of both spatial and spectral reconstruction accuracy.
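A minimal sketch of the spectral coordinate construction described above; the function name and the tensor layout are assumptions made for illustration.

```python
import torch

def spectral_coords(num_bands, height, width, lam_min=400.0, lam_max=700.0):
    """Build the coordinate tensor X of shape (num_bands, H, W): each target
    wavelength is normalized to [-1, 1] and broadcast over the spatial grid."""
    lam = torch.linspace(lam_min, lam_max, num_bands)            # wavelengths in nm
    lam_norm = 2.0 * (lam - lam_min) / (lam_max - lam_min) - 1.0
    return lam_norm.view(num_bands, 1, 1).expand(num_bands, height, width)

# Example: coordinates for reconstructing 61 bands at 64x64 resolution.
x = spectral_coords(61, 64, 64)   # shape (61, 64, 64), values in [-1, 1]
```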

4.2 Continuous Spectral Reconstruction

Previous methods of spectral reconstruction fix the number of output bands. Different from existing methods, by leveraging the continuous spectral representation, NeSR can reconstruct spectral images with an arbitrary number of spectral bands while maintaining high fidelity using a single model.

Dataset and Implementation. We build a spectral dataset based on the ICVL dataset [3] for continuous spectral reconstruction.


Table 2. Quantitative comparison on the ICVL dataset for spectral reconstruction from RGB images with arbitrary bands. Red and blue indicate the best and the second best performance, respectively.

Method        | 31 bands MRAE/RMSE | 16 bands MRAE/RMSE | 11 bands MRAE/RMSE | 7 bands MRAE/RMSE
BI            | 0.15573 / 0.02941  | 0.16611 / 0.03052  | 0.19012 / 0.03293  | 0.20214 / 0.03276
Sparse coding | 0.04843 / 0.01116  | 0.05722 / 0.01287  | 0.07681 / 0.01451  | 0.09841 / 0.02079
AWAN (-D)     | 0.02298 / 0.00534  | 0.02313 / 0.00544  | 0.02412 / 0.00565  | 0.02535 / 0.00578
AWAN (-S)     | 0.02212 / 0.00530  | 0.02241 / 0.00543  | 0.02347 / 0.00567  | 0.02484 / 0.00571
AWAN + NeSR   | 0.02001 / 0.00498  | 0.02024 / 0.00505  | 0.02129 / 0.00522  | 0.02119 / 0.00541

[Fig. 4 panels: RGB, BI, Sparse Coding, AWAN(-D), AWAN(-S), AWAN+NeSR; spectral curves for 31, 16, 11 and 7 bands]

Fig. 4. Top: Visualization of the error maps of different methods of spectral reconstruction from RGB images on “BGU HS 00044” of the ICVL dataset with 31 bands. Bottom: Spectral curves of reconstructed images of spectral reconstruction from RGB images with different band numbers on “BGU HS 00045” of ICVL.

We remove duplicate scenes from the ICVL dataset and randomly choose 80 scenes for training and 10 scenes for testing. We process the raw data of the ICVL dataset to generate spectral images from 400 nm to 700 nm with different band numbers. Specifically, 31/16/11/7 spectral bands are used to validate the effectiveness of NeSR, and other band numbers could be used as well. The implementation details are the same as in Sect. 4.1.

Comparison Methods. Since there are few available methods that can reconstruct spectral images with different band numbers using a single model, we set up three baselines in this section (see Table 2). BI: We interpolate RGB images following prior work [56,60], which applies the interpolation function along the spectral dimension, to obtain spectral images with the desired band numbers. Sparse coding: The sparse coding method [3] reconstructs spectral images leveraging a sparse dictionary of spectral signatures and their corresponding RGB projections, and can therefore produce spectral images with the desired output band numbers. Deep-learning-based: We select AWAN [23] as a representative deep-learning-based baseline for comprehensive comparisons. Because current deep models cannot reconstruct spectral images with an arbitrary number of bands, we separately train different models for different band numbers (7/11/16/31 bands), denoted as AWAN(-S). We also design a two-step strategy, denoted as AWAN(-D), which first reconstructs spectral images with a large number of spectral bands (e.g., 61 bands) and then downsamples them to the target number of spectral bands.

Quantitative Evaluation. We compare the quantitative results between NeSR and the baselines mentioned above. As shown in Table 2, reconstructing the spectral image with an arbitrary number of bands by BI shows poor performance, because the RGB image is integrated over the spectral dimension and direct interpolation between RGB channels does not correspond to any spectral wavelength. The sparse coding method [3] also has limited performance, since it relies on a sparse dictionary and lacks the powerful representation of deep neural networks. Our continuous spectral representation (AWAN+NeSR) can reconstruct spectral images with arbitrary band numbers using a single model, surpassing the aforementioned methods. Compared with AWAN(-S), NeSR achieves a 9.5%, 9.6%, 9.2%, and 14.5% decrease in MRAE for reconstructing 31/16/11/7 bands, respectively. These results demonstrate that NeSR can effectively reconstruct spectral images with an arbitrary number of bands by leveraging the continuous spectral representation.

Qualitative Evaluation. To give a visual comparison of different methods, we display the error maps of one representative scene from the ICVL dataset with 31 spectral bands in Fig. 4. From the visualization, we can see that NeSR provides higher reconstruction fidelity than the other methods. We also show the spectral curves of different methods for different numbers of reconstructed spectral bands, which shows that our method has a lower spectral error than the other methods. The qualitative comparison demonstrates that NeSR can reconstruct spectral images with an arbitrary number of spectral bands while keeping high fidelity.

4.3 Extreme Spectral Reconstruction

Our NeSR can also generate spectral images with an extreme band number beyond that of the training samples. We conduct experiments on the ICVL dataset to validate this point. During the training stage, NeSR takes an RGB image as input, and the number of bands of the target spectral image is randomly sampled in the range of 7 to 31. In the inference stage, NeSR is able to reconstruct spectral images with an extreme band number (e.g., 61 bands), which have a larger spectral resolution than any training sample. We show the quantitative and qualitative comparison results in Table 3 and Fig. 5. It can be seen that NeSR achieves superior performance on spectral image reconstruction with extreme band numbers and significantly outperforms the baselines. It is worth noting that the reconstructed spectral images (41 to 61 bands) are outside the distribution of the training samples and were never seen by NeSR during training, which indicates that NeSR provides a practical continuous representation for spectral reconstruction.


Table 3. Left: Quantitative results of reconstructing spectral images from RGB images on the extreme spectral reconstruction. Right: Quantitative results of reconstructing high spectral-resolution images from low spectral-resolution images.

Left (extreme spectral reconstruction):
Method      | 61 bands MRAE/RMSE | 51 bands MRAE/RMSE | 41 bands MRAE/RMSE
BI          | 0.14844 / 0.02868  | 0.14962 / 0.02978  | 0.14914 / 0.03169
AWAN + BI   | 0.03593 / 0.00848  | 0.03375 / 0.00791  | 0.03332 / 0.00771
AWAN + NeSR | 0.02689 / 0.00728  | 0.02655 / 0.00722  | 0.02677 / 0.00704

Right (spectral super-resolution):
Method      | 16-31 bands MRAE/RMSE | 16-61 bands MRAE/RMSE
BI          | 0.01521 / 0.00578     | 0.01837 / 0.00576
AWAN        | 0.01285 / 0.00271     | 0.01346 / 0.00296
AWAN + NeSR | 0.01232 / 0.00245     | 0.01312 / 0.00278

[Fig. 5 panels: RGB, BI, AWAN+BI, AWAN+NeSR]

Fig. 5. Error maps of reconstructed spectral images from RGB images on extreme spectral reconstruction. Top: Reconstructed results with 61 bands on “BGU HS 00177”. Bottom: Reconstructed results with 51 bands on “BGU HS 00062”.

[Fig. 6 panels: RGB, BI, AWAN, AWAN+NeSR]

Fig. 6. Error maps of reconstructed spectral images on ICVL dataset. Top: Reconstructed results from 16 bands to 31 bands on “BGU HS 00061”. Bottom: Reconstructed results from 16 bands to 61 bands on “BGU HS 00075”.


Table 4. Left: Ablation on SPI and NAM modules. Right: Ablation on the attention mechanism. Both experiments are performed on the NTIRE2020 dataset.

Left (module ablation):
Method             | MRAE    | RMSE
AWAN               | 0.03441 | 0.01215
AWAN+MLP           | 0.03387 | 0.01211
AWAN+MLP+SPI       | 0.03139 | 0.01102
AWAN+MLP+SPI+NAM   | 0.02996 | 0.00989

Right (attention ablation):
Method                | MRAE    | RMSE
Spectral-Wise         | 0.03214 | 0.01001
Spatial-Wise          | 0.03222 | 0.01032
Spatial-Spectral-Wise | 0.02996 | 0.00989

[Fig. 7 panels: RGB, +MLP, +MLP+SPI, +MLP+SPI+NAM]

Fig. 7. Visualization results of ablation on the proposed modules on “ARAD HS 0464” of the NTIRE2020 dataset.

4.4 Spectral Super Resolution

Since NeSR aims to learn a continuous representation for reconstructing spectral images, it can also be applied to reconstructing high spectral-resolution images from low spectral-resolution images. Specifically, we reconstruct a spectral image with 16 bands to 31 and 61 bands with a single model. Experiments are conducted on the ICVL dataset, and BI and AWAN are selected as baseline methods; AWAN needs to be trained twice for the two settings. The other implementation details are the same as in Sect. 4.1. As shown in Table 3, our method (+NeSR) outperforms the baseline methods in both settings. To give a visual comparison, we also show the error maps of both settings in Fig. 6. As can be seen, our method recovers the spectral information with a lower error. The quantitative and qualitative comparison results verify that our method can be used for spectral super-resolution by leveraging the continuous representation of spectral images.

4.5 Ablation Studies

Impact of SPI and NAM Modules. We validate the effectiveness of the SPI module and the NAM module by adding them to the basic network step by step. We use AWAN cascaded with an MLP as the basic network to ensure that it has a parameter count similar to our method, and conduct the experiment on the NTIRE2020 "Clean" dataset. The results of the ablation experiment are shown in Table 4. Since the deep feature lacks the spatial-spectral correlation, the baseline shows limited reconstruction fidelity without the SPI and NAM modules. When inserting the SPI module into the baseline, the MRAE decreases from 0.03387 to 0.03139, and the NAM module contributes a further 0.00123 MRAE decrease. Moreover, from the error maps and the spectral curves in Fig. 7, we can see that the reconstruction error decreases when inserting the SPI and NAM modules into the basic network. The quantitative and qualitative comparison results demonstrate that our proposed modules are effective in enriching the spectral information and exploring the spatial-spectral correlation for learning the continuous spectral representation.

Impact of the Attention Mechanism. To validate the effectiveness of our spatial-spectral-wise attention, we compare it with spectral-wise and spatial-wise attention [19,57]. We use AWAN as the feature encoder, keep the SPI module, and conduct the experiment on the NTIRE2020 "Clean" dataset. For each model, we only replace the attention mechanism of the NAM module. The results of the ablation experiment are shown in Table 4. Since spatial-wise and spectral-wise attention cannot fully exploit the spatial-spectral correlation, they show limited reconstruction accuracy. Compared with spatial-wise/spectral-wise attention, our spatial-spectral-wise attention decreases the MRAE from 0.03222/0.03214 to 0.02996. The experimental results demonstrate that our new attention mechanism is effective in capturing the interactions of different channels and boosting performance.

5 Conclusion

In this paper, we propose NeSR to enable, for the first time, spectral reconstruction with arbitrary band numbers by learning the continuous spectral representation. NeSR takes an RGB image and a set of wavelengths as input, serving as the context of the scene and the target spectral coordinates, to control the number of output bands of the spectral image. For high-fidelity reconstruction, we devise the SPI and NAM modules to exploit the spatial-spectral correlation in the implicit neural representation. Extensive experiments demonstrate that NeSR significantly improves the reconstruction accuracy over the corresponding baselines with little parameter increase. Moreover, NeSR can effectively reconstruct spectral images with an arbitrary and even extreme number of bands by leveraging the continuous spectral representation.

Acknowledgments. This work was supported in part by the National Natural Science Foundation of China under Grants 62131003 and 62021001.

References

1. Aeschbacher, J., Wu, J., Timofte, R.: In defense of shallow learned spectral reconstruction from RGB images. In: ICCVW (2017) 2. Akhtar, N., Mian, A.: Hyperspectral recovery from RGB images using gaussian processes. IEEE Trans. Pattern Anal. Mach. Intell. 42(1), 100–113 (2018) 3. Arad, B., Ben-Shahar, O.: Sparse recovery of hyperspectral signal from natural RGB images. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9911, pp. 19–34. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46478-7_2


4. Arad, B., Ben-Shahar, O., Timofte, R.: NTIRE 2018 challenge on spectral reconstruction from RGB images. In: CVPRW (2018) 5. Arad, B., Timofte, R., Ben-Shahar, O., Lin, Y.T., Finlayson, G.D.: NTIRE 2020 challenge on spectral reconstruction from an RGB image. In: CVPRW (2020) 6. Atzmon, M., Lipman, Y.: SAL: sign agnostic learning of shapes from raw data. In: CVPR (2020) 7. Boss, M., Braun, R., Jampani, V., Barron, J.T., Liu, C., Lensch, H.P.: NeRD: neural reflectance decomposition from image collections. In: ICCV (2021) 8. Cai, Y., et al.: Mask-guided spectral-wise transformer for efficient hyperspectral image reconstruction. In: CVPR (2022) 9. Cao, G., Bachega, L.R., Bouman, C.A.: The sparse matrix transform for covariance estimation and analysis of high dimensional signals. IEEE Trans. Image Process. 20(3), 625–640 (2010) 10. Cao, X., Du, H., Tong, X., Dai, Q., Lin, S.: A prism-mask system for multispectral video acquisition. IEEE Trans. Pattern Anal. Mach. Intell. 33(12), 2423–2435 (2011) 11. Chabra, R., et al.: Deep local shapes: learning local SDF priors for detailed 3D reconstruction. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12374, pp. 608–625. Springer, Cham (2020). https://doi.org/10. 1007/978-3-030-58526-6 36 12. Chan, E.R., Monteiro, M., Kellnhofer, P., Wu, J., Wetzstein, G.: Pi-GAN: periodic implicit generative adversarial networks for 3D-aware image synthesis. In: CVPR (2021) 13. Chen, H., et al.: Pre-trained image processing transformer. In: CVPR (2021) 14. Chen, Y., Liu, S., Wang, X.: Learning continuous image representation with local implicit image function. In: CVPR (2021) 15. Chen, Z., Zhang, H.: Learning implicit fields for generative shape modeling. In: CVPR (2019) 16. Dao, P.D., Mantripragada, K., He, Y., Qureshi, F.Z.: Improving hyperspectral image segmentation by applying inverse noise weighting and outlier removal for optimal scale selection. ISPRS J. Photogramm. Remote. Sens. 171, 348–366 (2021) 17. Descour, M., Dereniak, E.: Computed-tomography imaging spectrometer: experimental calibration and reconstruction results. Appl. Opt. 34(22), 4817–4826 (1995) 18. Dosovitskiy, A., et al.: An image is worth 16x16 words: transformers for image recognition at scale. In: ICLR (2020) 19. Fu, J., et al.: Dual attention network for scene segmentation. In: CVPR (2019) 20. Jiang, C., Sud, A., Makadia, A., Huang, J., Nießner, M., Funkhouser, T., et al.: Local implicit grid representations for 3D scenes. In: CVPR (2020) 21. Jiang, K., Xie, W., Lei, J., Jiang, T., Li, Y.: LREN: low-rank embedded network for sample-free hyperspectral anomaly detection. In: AAAI (2021) 22. Kuybeda, O., Malah, D., Barzohar, M.: Rank estimation and redundancy reduction of high-dimensional noisy signals with preservation of rare vectors. IEEE Trans. Signal Process. 55(12), 5579–5592 (2007) 23. Li, J., Wu, C., Song, R., Li, Y., Liu, F.: Adaptive weighted attention network with camera spectral sensitivity prior for spectral reconstruction from RGB images. In: CVPRW (2020) 24. Mescheder, L., Oechsle, M., Niemeyer, M., Nowozin, S., Geiger, A.: Occupancy networks: learning 3D reconstruction in function space. In: CVPR (2019) 25. Michalkiewicz, M., Pontes, J.K., Jack, D., Baktashmotlagh, M., Eriksson, A.: Implicit surface representations as layers in neural networks. In: CVPR (2019)


26. Mildenhall, B., Srinivasan, P.P., Tancik, M., Barron, J.T., Ramamoorthi, R., Ng, R.: NeRF: representing scenes as neural radiance fields for view synthesis. In: ECCV (2020) 27. Niemeyer, M., Geiger, A.: Giraffe: representing scenes as compositional generative neural feature fields. In: CVPR (2021) 28. Niemeyer, M., Mescheder, L., Oechsle, M., Geiger, A.: Differentiable volumetric rendering: learning implicit 3D representations without 3D supervision. In: CVPR (2020) 29. Oechsle, M., Mescheder, L., Niemeyer, M., Strauss, T., Geiger, A.: Texture fields: learning texture representations in function space. In: CVPR (2019) 30. Robles-Kelly, A.: Single image spectral reconstruction for multimedia applications. In: ACM MM (2015) 31. Shi, Z., Chen, C., Xiong, Z., Liu, D., Wu, F.: HSCNN+: advanced CNN-based hyperspectral recovery from RGB images. In: CVPRW (2018) 32. Shi, Z., Chen, C., Xiong, Z., Liu, D., Zha, Z.J., Wu, F.: Deep residual attention network for spectral image super-resolution. In: ECCVW (2018) 33. Sitzmann, V., Chan, E.R., Tucker, R., Snavely, N., Wetzstein, G.: MetaSDF: metalearning signed distance functions. In: NIPS (2020) 34. Sitzmann, V., Martel, J., Bergman, A., Lindell, D., Wetzstein, G.: Implicit neural representations with periodic activation functions. In: NIPS (2020) 35. Sitzmann, V., Zollh¨ ofer, M., Wetzstein, G.: Scene representation networks: continuous 3D-structure-aware neural scene representations. In: NIPS (2019) 36. Srinivasan, P.P., Deng, B., Zhang, X., Tancik, M., Mildenhall, B., Barron, J.T.: NeRV: neural reflectance and visibility fields for relighting and view synthesis. In: CVPR (2021) 37. Su, S.Y., Yu, F., Zollhoefer, M., Rhodin, H.: A-NeRF: surface-free human 3D pose refinement via neural rendering. arXiv:2102.06199 (2021) 38. Sun, B., Yan, J., Zhou, X., Zheng, Y.: Tuning IR-cut filter for illumination-aware spectral reconstruction from RGB. In: CVPR (2021) 39. Tancik, M., et al.: Learned initializations for optimizing coordinate-based neural representations. In: CVPR (2021) 40. Timofte, R., De Smet, V., Van Gool, L.: A+: adjusted anchored neighborhood regression for fast super-resolution. In: ACCV (2014) 41. Vaswani, A., et al.: Attention is all you need. In: NIPS (2017) 42. Wang, L., Sun, C., Fu, Y., Kim, M.H., Huang, H.: Hyperspectral image reconstruction using a deep spatial-spectral prior. In: CVPR (2019) 43. Wang, L., Xiong, Z., Gao, D., Shi, G., Zeng, W., Wu, F.: High-speed hyperspectral video acquisition with a dual-camera architecture. In: CVPR (2015) 44. Wang, L., Xiong, Z., Huang, H., Shi, G., Wu, F., Zeng, W.: High-speed hyperspectral video acquisition by combining nyquist and compressive sampling. IEEE Trans. Pattern Anal. Mach. Intell. 41(4), 857–870 (2019) 45. Wang, L., Xiong, Z., Shi, G., Wu, F., Zeng, W.: Adaptive nonlocal sparse representation for dual-camera compressive hyperspectral imaging. IEEE Trans. Pattern Anal. Mach. Intell. 39(10), 2104–2111 (2017) 46. Wang, P., Wang, L., Leung, H., Zhang, G.: Super-resolution mapping based on spatial-spectral correlation for spectral imagery. IEEE Trans. Geosci. Remote Sens. 59(3), 2256–2268 (2020) 47. Weng, W., Zhang, Y., Xiong, Z.: Event-based video reconstruction using transformer. In: ICCV (2021) 48. Wu, G., et al.: Light field image processing: an overview. IEEE J. Sel. Top. Signal Process. 11(7), 926–954 (2017)


49. Xiao, Z., Xiong, Z., Fu, X., Liu, D., Zha, Z.J.: Space-time video super-resolution using temporal profiles. In: ACM MM (2020) 50. Xiong, F., Zhou, J., Qian, Y.: Material based object tracking in hyperspectral videos. IEEE Trans. Image Process. 29, 3719–3733 (2020) 51. Xiong, Z., Shi, Z., Li, H., Wang, L., Liu, D., Wu, F.: HSCNN: CNN-based hyperspectral image recovery from spectrally undersampled projections. In: ICCVW (2017) 52. Yang, J., Shen, S., Yue, H., Li, K.: Implicit transformer network for screen content image continuous super-resolution. In: NIPS (2021) 53. Yao, M., Xiong, Z., Wang, L., Liu, D., Chen, X.: Spectral-depth imaging with deep learning based reconstruction. Opt. Express 27(26), 38312–38325 (2019) 54. Yasuma, F., Mitsunaga, T., Iso, D., Nayar, S.K.: Generalized assorted pixel camera: postcapture control of resolution, dynamic range, and spectrum. IEEE Trans. Image Process. 19(9), 2241–2253 (2010) 55. Yen-Chen, L., Florence, P., Barron, J.T., Rodriguez, A., Isola, P., Lin, T.Y.: iNeRF: inverting neural radiance fields for pose estimation. arXiv:2012.05877 (2020) 56. Zhang, L., et al.: Pixel-aware deep function-mixture network for spectral superresolution. In: AAAI (2020) 57. Zhang, Y., Li, K., Li, K., Wang, L., Zhong, B., Fu, Y.: Image super-resolution using very deep residual channel attention networks. In: ECCV (2018) 58. Zhao, Y., Po, L.M., Yan, Q., Liu, W., Lin, T.: Hierarchical regression network for spectral reconstruction from RGB images. In: CVPRW (2020) 59. Zhu, X., Su, W., Lu, L., Li, B., Wang, X., Dai, J.: Deformable DETR: deformable transformers for end-to-end object detection. In: ICLR (2020) 60. Zhu, Z., Liu, H., Hou, J., Zeng, H., Zhang, Q.: Semantic-embedded unsupervised spectral reconstruction from single RGB images in the wild. In: ICCV (2021) 61. Zuckerman, L.P., Naor, E., Pisha, G., Bagon, S., Irani, M.: Across scales and across dimensions: temporal super-resolution using deep internal learning. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12352, pp. 52– 68. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58571-6 4

Event-Based Image Deblurring with Dynamic Motion Awareness

Patricia Vitoria, Stamatios Georgoulis, Stepan Tulyakov(B), Alfredo Bochicchio, Julius Erbach, and Yuanyou Li

Huawei Technologies, Zurich Research Center, Zürich, Switzerland
{patricia.vitoria.carrera,stamatios.georgoulis,stepan.tulyakov,alfredo.bochicchio,julius.erbach,liyuanyou1}@huawei.com

Abstract. Non-uniform image deblurring is a challenging task due to the lack of temporal and textural information in the blurry image itself. Complementary information from auxiliary sensors such as event sensors is being explored to address these limitations. The latter can record changes in log intensity asynchronously, called events, with high temporal resolution and high dynamic range. Current event-based deblurring methods combine the blurry image with events to jointly estimate per-pixel motion and the deblur operator. In this paper, we argue that a divide-and-conquer approach is more suitable for this task. To this end, we propose to use modulated deformable convolutions, whose kernel offsets and modulation masks are dynamically estimated from events to encode the motion in the scene, while the deblur operator is learned from the combination of blurry image and corresponding events. Furthermore, we employ a coarse-to-fine multi-scale reconstruction approach to cope with the inherent sparsity of events in low-contrast regions. Importantly, we introduce the first dataset containing pairs of real RGB blur images and related events during the exposure time. Our results show better overall robustness when using events, with improvements in PSNR by up to 1.57 dB on synthetic data and 1.08 dB on real event data.

Keywords: Image deblurring · Dataset · Event-based vision · Deformable convolution

1 Introduction

Conventional cameras operate by accumulating light over the exposure time and integrating this information to produce an image. During this time, camera or scene motion may lead to a mix of information across a neighborhood of pixels, resulting in blurry images. To account for this effect, some cameras incorporate an optical image stabilizer (OIS), which tries to actively counteract the camera shake. However, OIS can only deal with small motions and still fails in the case of scene motion. To reduce the impact of motion blur, one can alternatively adapt the exposure time according to the motion observed in the scene [8], albeit at the potential loss of structural information.

Supplementary Information: The online version contains supplementary material available at https://doi.org/10.1007/978-3-031-25072-9_7.

In the inevitable case where an image is already blurred, image deblurring methods try to recover the sharp image behind the motion-blurred scene. However, restoring a photorealistic sharp image from a single blurry image is an ill-posed problem due to the loss of temporal and textural information. In the last years, image-based deblurring algorithms have shown impressive advances [3,4,11,14,17,20,21,25,33,38,42,51,52,54,55], but they still struggle to reconstruct high-frequency details. Recently, a new breed of bio-inspired sensors, called event cameras, has been employed in image deblurring together with the intensity camera. Event cameras can asynchronously detect local log-intensity changes, called events, at each pixel independently during the exposure time of a conventional camera, with very low latency (1 µs) and high dynamic range (≈140 dB). Understandably, due to their capacity to capture high temporal resolution and fine-grained motion information, event cameras pose the deblurring problem in a more tractable way.

Despite their great potential, current event-based image deblurring methods come with their own limitations. (1) They are restricted to the low-resolution Active Pixel Sensor (APS) intensity frames provided by the event camera [13,16,31,32,40], or (2) they work only with grayscale images or use color events by treating each channel separately, failing to exploit the intra-correlations between image channels [13,16,31,32,40]. (3) They are either based on physical models [16,31,32] that are sensitive to the lack of events in low-contrast regions as well as to their noisy nature, or they use the event data just as an additional input to the image without explicitly exploiting the information provided by the events [13,56]. (4) They do not generalize well to real data [13,56].

To overcome the above-mentioned issues, in this work we propose a divide-and-conquer deblurring approach that decouples the estimation of the per-pixel motion from the deblur operator. Due to their inherent temporal information, we use the input events to dynamically estimate per-pixel trajectories in terms of kernel offsets (and modulation masks) that correspond to the per-pixel motion. On the other hand, the deblur operator is learned from a set of blurry images and corresponding event features and stays fixed after training. This decoupling of tasks is achieved via the use of modulated deformable convolutions [58]; see Fig. 2 for an illustration. Moreover, we frame our approach as a multi-scale, coarse-to-fine learning task to deal with the inherent sparsity of events. In particular, coarser scales can take broader contextual information into account, while finer scales are encouraged to preserve the high-frequency details of the image, thereby alleviating the noisy and sparse nature of events. Notably, our algorithm works directly on high-resolution (HR) RGB images using only a monochrome event camera, thus exploiting the intra-correlation between channels and avoiding the intrinsic limitations of color event cameras. An overview of our approach can be seen in Fig. 1.


Fig. 1. Proposed Architecture. (a) Two encoders (blue and red) are used to compute blurry-image-plus-events features $\{F^{Im+Ev}_{t_0 \to t_1}\}_{l=0}^{L}$ and events features $\{F^{Ev}_{t_0 \to t_1}\}_{l=0}^{L}$, which are used to compute the deblur operator and the motion-related kernel offsets together with modulation masks, respectively. Then the (b) deblur module utilizes a set of modulated deformable convolutions to estimate the deblurred features $\{F^{Im+Ev,deblur}_{t_0 \to t_1}\}_{l=0}^{L}$ from the features $\{F^{Im+Ev}_{t_0 \to t_1}\}_{l=0}^{L}$ at different scales, using offsets and masks computed from $\{F^{Ev}_{t_0 \to t_1}\}_{l=0}^{L}$. The sharp image is decoded from the deblurred features by progressively reconstructing the residual image from coarser to finer scales. (Color figure online)

Due to the lack of real-world datasets containing HR RGB images with their corresponding events during the exposure time, and to demonstrate the ability of our method to generalize to real data, we collected a new dataset, called RGBlur+E, composed of real HR blurry images and events.

The contributions of this work are as follows:

1. We propose the first divide-and-conquer deblurring approach that decouples the estimation of per-pixel motion from the deblur operator. To explicitly exploit the dense motion information provided by the events, we employ modulated deformable convolutions. The latter allow us to estimate the integration locations (i.e. kernel offsets and modulation masks), which are solely related to motion, directly from events.
2. To deal with the inherent sparsity of events as well as their lack in low-contrast regions, we propose to employ a multi-scale, coarse-to-fine approach that exploits the high-frequency local details of events as well as broad motion information.
3. To evaluate and facilitate future research, we collect the first real-world dataset consisting of HR RGB Blur Images plus Events (RGBlur+E). To allow both quantitative and qualitative comparisons, RGBlur+E consists of two subsets: the first contains sharp images, corresponding events, and synthetically generated blurry images to allow quantitative evaluations, and the second consists of real blurry images and corresponding events.

2 Related Work

In this section, we discuss prior work on image deblurring, i.e. given a blurry image as input (and optionally events during the exposure time), predict the sharp latent image at a certain timestamp of the exposure time. This is not to be mistaken with video deblurring, i.e. producing a sharp latent video from a blurry video (and optionally events), which is a different task (typically solved in conjunction with other tasks) not studied in this paper.

Image-Based Approaches. A generic image-based blur formation model can be defined as

$$B(x, y) = \frac{1}{t_1 - t_0} \sum_{t=t_0}^{t_1} I_t(x, y) = \langle I_{nn}(x, y),\, k(x, y) \rangle, \tag{1}$$

where $B \in \mathbb{R}^{H \times W \times C}$ is the blurry image over the exposure time $[t_0, t_1]$, $I_t \in \mathbb{R}^{H \times W \times C}$ is the sharp image at time $t \in [t_0, t_1]$, $I_{nn}(x, y)$ is a window of size $K \times K$ around pixel $(x, y)$, and $k$ is a per-pixel blur kernel. Blind deblurring formulates the problem as one of blind deconvolution, where the goal is to recover the sharp image $I_t$ without knowledge of the blur kernel $k$ [19]. Most traditional methods apply a two-step approach, namely blur kernel estimation followed by non-blind deconvolution [5,10,23,29]. Additionally, assumptions about the scene prior are made to solve the problem. Examples of such priors are Total Variation (TV) [2], sparse image priors [23,49], color channel statistics [30,50], patch recurrence [24], and outlier image signals [9]. Nonetheless, most of these methods are limited to uniform blur. To handle spatially-varying blur, more flexible blur models have been proposed, based on a projective motion path model [41,45,53] or a segmentation of the image into areas with different types of blur [7,15,22,28]. However, these methods are computationally heavy (e.g. computation times of 20 min [39] or an hour [15] per image), and successful results depend on the accuracy of the blur models.

Image deblurring has also benefited from the advances of deep convolutional neural networks (CNNs), which directly estimate the mapping from blurry images to sharp ones. Several network designs have been proposed that increase the receptive field [55], use multi-scale strategies [3,4,11,25,27,38,42,54], learn a latent representation [26,43], use deformable convolutions [14,51], progressively refine the result [33,46,52], or use perceptual metrics [20,21]. Some works also combine model-based and learning-based strategies by first estimating the motion kernel field using a CNN and then applying a non-blind deconvolution [1,12,39].
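The discrete temporal average on the left-hand side of Eq. (1) is also how blurry images are commonly synthesized from high-frame-rate video (as done later for the GoPro and RGBlur+E-HS data). A minimal sketch follows; it is an illustration, not the authors' data pipeline, and it ignores details such as averaging in linear intensity.

```python
import torch

def synthesize_blur(sharp_frames):
    """Approximate Eq. (1): average consecutive sharp frames captured during
    the exposure time to synthesize a motion-blurred image.

    sharp_frames: tensor of shape (T, C, H, W) with values in [0, 1].
    """
    return sharp_frames.mean(dim=0)

# Example: average 11 consecutive frames, the setting used for GoPro testing.
frames = torch.rand(11, 3, 720, 1280)
blurry = synthesize_blur(frames)    # (3, 720, 1280)
```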


However, despite advances in the field, image-based approaches struggle to reconstruct structure or to perform accurate deblurring in the presence of large motions and in real-case scenarios, and model-based approaches are bound by the employed model of the blur kernel.

Event-Based Approaches. An event camera outputs a sequence of events, denoted by $(x, y, t, \sigma)$, which record log-intensity changes. $(x, y)$ are image coordinates, $t$ is the time the event occurred, and $\sigma = \pm 1$, called the polarity, denotes the sign of the intensity change. Due to their high temporal resolution, and hence reliably encoded motion information, event sensors have been used for image deblurring. Several event-based methods perform image deblurring on grayscale images, or alternatively on color images by processing color events channel-wise¹. In [32], the authors model the relationship between the blurry image, events, and latent frames through an Event-based Double Integral (EDI) model defined as

$$B = \frac{1}{t_1 - t_0} \sum_{t=t_0}^{t_1} I_t = \frac{I_{t_0}}{t_1 - t_0} \sum_{t=t_0}^{t_1} \exp\left(c\,E_{t_0 \to t}\right), \tag{2}$$

where $E_{t_0 \to t} \in \mathbb{R}^{H \times W}$ is the sum of events between $t_0$ and $t$, and $c$ is a constant. $E_{t_0 \to t}$ can be computed as the integral of events between $t_0$ and $t$. However, the noisy and lossy nature of event data often results in strongly accumulated noise, and in a loss of details and contrast in the scene. In [31], the authors extend the EDI model to the multiple EDI (mEDI) model to obtain smoother results based on multiple images and their events. [16] combines the optimization approach of the EDI model with an end-to-end learning strategy. In particular, results obtained by [32] are refined by exploiting computed optical flow information and boundary guidance maps. Similarly, [44] proposes to incorporate an iterative optimization method into a deep neural network that performs denoising, deblurring, and super-resolution simultaneously. Direct one-to-one mapping from a blurry image plus events has been performed by [13,40,48,56]. [13] utilizes two consecutive U-Nets to perform image deblurring followed by high-frame-rate video generation. In [40], a two-encoder U-Net is used, where image and event features are combined using cross-modal attention and the skip connections across stages are guided by an event mask. Additionally, they propose an alternative representation to the voxel grid that takes the image blur formation process into account. A drawback of [13,40,48] is that these methods either work only on grayscale images or require color event cameras. In [56], the authors use the information of a monochrome event camera to compute an event representation through E2VID [35]. The event representation is then used as an additional input to an RGB image-based network [54]. However, direct one-to-one mapping approaches often fail to generalize to real data. To overcome this issue, [48] uses a self-supervised approach based on optical flow estimation, where real-world events are exploited to alleviate the performance degradation caused by data inconsistency.

¹ Note that the use of color events comes with its own drawbacks: a small effective resolution and reduced light per pixel, the need for demosaicking at the event level, and the lack of good commercial color event cameras, to mention a few.
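To make the EDI model of Eq. (2) concrete, it can be rearranged to recover the latent image at the start of the exposure from the blurry image and the accumulated events. The sketch below assumes the per-timestep event sums $E_{t_0 \to t}$ are already available as a tensor and uses an illustrative contrast threshold; it is not the implementation of [32].

```python
import torch

def edi_latent_image(blurry, event_sums, c=0.2):
    """Invert Eq. (2): I_{t0} = B * (t1 - t0) / sum_t exp(c * E_{t0->t}).

    blurry:     (H, W) blurry image B.
    event_sums: (T, H, W) accumulated event sums E_{t0->t} for T timestamps,
                so (t1 - t0) corresponds to T discrete steps.
    c:          event contrast threshold (0.2 is an illustrative value).
    """
    denom = torch.exp(c * event_sums).sum(dim=0)
    return blurry * event_sums.shape[0] / denom
```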


In this work, instead of learning the entire deblurring process at once, we propose to decouple the learning of per-pixel motion from the deblur operators and to retain only the essential information for each sub-task. One benefit of the proposed method is that, while the implicit deblur operator is learned and then fixed, the per-pixel motion is estimated dynamically for each example, adapting the kernel sampling in the deblur module to the motion of each specific blurry image. Moreover, to address common artifacts of event-based methods, we use a multi-scale coarse-to-fine approach that handles the lack of events in low-contrast regions and their noisy nature. Additionally, the proposed approach works directly with a combination of HR RGB images and a monochrome event camera, and we propose the first dataset that combines HR RGB images with real events.

Deformable Convolutions. A major limitation of current (image- or event-based) deblurring methods is that, by using standard convolutions, they end up applying the same deblur operators, learned during training, with a fixed receptive field across the entire image, regardless of the motion pattern that caused the blur. In contrast, deformable convolutions [6] learn kernel offsets per pixel, and thus can effectively change the sampling locations of the deblur operators. This allows for more 'motion-aware' deblur operators with the ability to adapt their receptive field according to the motion pattern. In [58], the authors propose to also learn a modulation mask in conjunction with the kernel offsets, i.e. modulated deformable convolutions, to attenuate the contribution of each sampling location. See Fig. 2 for an illustration of the sampling process in a modulated deformable convolution. In image deblurring, modulated deformable convolutions have been used to adapt the receptive field to the blur present in the scene [14,51]. In this paper, we propose to use them differently. Since events contain fine-grained motion information per pixel, we can estimate 'motion-aware' deblur operators by conditioning the learning of kernel offsets and modulation masks solely on encoded event features. This allows the receptive field of the deblur operators to dynamically adapt to the underlying motion pattern.

Fig. 2. Standard vs Deformable Convolution. Illustration of kernel sampling locations in a 3 × 3 standard and modulated deformable convolution [58]. (a) Regular sampling in a standard convolution. (b) In a modulated deformable convolution, the sampling deforms in the direction of the learnable offsets (red lines) and is weighted by the learnable modulation mask. (Color figure online)

3 Method

Problem Formulation. Given a blurry RGB HR image $B_{t_0 \to t_1}$ and the corresponding event stream $E_{t_0 \to t_1}$ containing all asynchronous events triggered during the exposure time $[t_0, t_1]$, the proposed method reconstructs a sharp latent image $I_t$ corresponding to the mid-exposure timestamp $t = (t_1 + t_0)/2$.

3.1 System Overview

Figure 1 illustrates the proposed event-based image deblurring architecture, which consists of: an RGB blurry image and events encoder, a separate recurrent events encoder, a deblur module, and a multi-scale residual decoder.

The first encoder computes a multi-scale feature representation $\{F^{Im+Ev}_{t_0 \to t_1}\}_{l=0}^{L}$ from the blurry image $B_{t_0 \to t_1}$ and the stream of events $E_{t_0 \to t_1}$ during the exposure time, represented as a voxel grid with five equally-sized temporal bins. Similarly, the second encoder computes a multi-scale feature representation $\{F^{Ev}_{t_0 \to t_1}\}_{l=0}^{L}$, but solely from the events. Differently, the stream of events $E_{t_0 \to t_1}$ is split into fixed-duration chunks $\{V_{t_{(n-1)/N} \to t_{n/N}}\}_{n=1}^{N}$, with each chunk containing all events within a time window of $1/N$, represented as a voxel grid with five equally-sized temporal bins. The event chunks are integrated over time using a ConvLSTM module.

The deblur module exploits the features from the second encoder $\{F^{Ev}_{t_0 \to t_1}\}_{l=0}^{L}$ to dynamically estimate per-pixel kernel offsets and modulation masks that adapt the sampling locations of the modulated deformable convolutions to the per-pixel motion pattern of the input image. Sequentially, the modulated deformable convolutions are applied to the first encoder's features $\{F^{Im+Ev}_{t_0 \to t_1}\}_{l=0}^{L}$ at each scale, essentially performing the deblur operation at each feature level.

Finally, the deblurred features $\{F^{Im+Ev,deblur}_{t_0 \to t_1}\}_{l=0}^{L}$ are passed through a multi-scale decoder that progressively reconstructs the residual image in a coarse-to-fine manner. By doing so, our method exploits both image- and event-based information. Coarser scales encode broad contextual information that helps deblurring when events are missing, while scales at the original resolution estimate finer details by exploiting image and event structural information. Sections 3.2 and 3.3 detail the deblur module and the coarse-to-fine reconstruction, respectively.
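A common way to build the event voxel grid mentioned above is to spread each event's polarity over its two nearest temporal bins. The sketch below uses five bins and assumes events are given as arrays of (x, y, t, polarity); it is an illustration of the representation, not the authors' exact preprocessing.

```python
import torch

def events_to_voxel_grid(x, y, t, p, height, width, num_bins=5):
    """Accumulate events (x, y, t, polarity) into a (num_bins, H, W) voxel grid,
    bilinearly splitting each event between its two nearest temporal bins."""
    voxel = torch.zeros(num_bins * height * width)
    # Normalize timestamps to [0, num_bins - 1].
    t_norm = (t - t.min()) / max(float(t.max() - t.min()), 1e-9) * (num_bins - 1)
    t_lo = t_norm.floor()
    frac = t_norm - t_lo
    pix = y.long() * width + x.long()
    for bin_idx, weight in ((t_lo, 1.0 - frac),
                            (torch.clamp(t_lo + 1, max=num_bins - 1), frac)):
        flat_idx = bin_idx.long() * height * width + pix
        voxel.index_add_(0, flat_idx, p.float() * weight)
    return voxel.view(num_bins, height, width)

# Example: 1000 synthetic events on a 260x346 sensor.
n = 1000
x = torch.randint(0, 346, (n,)); y = torch.randint(0, 260, (n,))
t = torch.sort(torch.rand(n)).values; p = torch.randint(0, 2, (n,)) * 2 - 1
grid = events_to_voxel_grid(x, y, t, p, height=260, width=346)
```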

3.2 Deblur Module

A standard 2D convolution consists of two steps: 1) sampling on a regular grid $R$ over the input feature map; and 2) summation of the sampled values weighted by $w$. The grid $R$ defines the receptive field and dilation. As illustrated in Fig. 2, for deformable convolutions the regular grid $R$ is augmented with offsets $\{\Delta x_k \,|\, k = 1, \ldots, K\}$, and in the case of modulated deformable convolutions each position is additionally weighted by a modulation scalar $\Delta m_k$.

Blurry images by themselves do not contain explicit temporal information or motion cues from the image acquisition, and are therefore poor candidates for estimating kernel offsets and modulation masks. In contrast, due to their high temporal resolution, events contain all the fine-grained motion cues during the exposure time that led to the blurry image, allowing for one-to-one mapping solutions. Motivated by this observation, in this work we exploit the inherent motion information in the event chunks to learn the offsets $\Delta x_k$ and modulation scalars $\Delta m_k$ that 1) adapt the sampling grid of the modulated deformable convolution to the per-pixel motion of the scene, and 2) weight every sampling position according to its actual contribution. The estimated offsets and masks are applied directly to $F^{Im+Ev}$ to perform deblurring at the feature level. In particular, the modulated deformable convolution can be expressed as

$$F^{Im+Ev,deblur}(x) = \sum_{k=1}^{K} w_k \cdot F^{Im+Ev}(x + x_k + \Delta x_k) \cdot \Delta m_k, \tag{3}$$

with $w_k$ being the convolutional kernel weights. To address the inherent limitations of event data, i.e. noisy events or a lack of events in low-contrast regions, we perform the feature deblurring in a multi-scale fashion (see Sect. 3.3). Specifically, as shown in Fig. 1(b), to generate deblurred features at the $l$-th scale we re-use the features computed at the $(l-1)$-th scale. Note that, in order to compute the offsets and masks, we first integrate the chunks of events using a ConvLSTM module [47], followed by two Conv+ReLU blocks that return the offsets and mask at each scale. The ConvLSTM layer encodes the overall temporal information and enables the network to learn the integration of events, as done analytically in the EDI formulation [32] or, more recently and in a similar spirit, by the SCER preprocessing module in [40].
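A minimal single-scale sketch of the event-conditioned modulated deformable convolution of Eq. (3) is given below, using torchvision's deform_conv2d (the mask argument requires a sufficiently recent torchvision). The channel sizes and the offset/mask prediction heads are illustrative assumptions, and the ConvLSTM integration of event chunks is abstracted away; this is not the authors' exact architecture.

```python
import torch
import torch.nn as nn
from torchvision.ops import deform_conv2d

class EventGuidedDeformBlock(nn.Module):
    """One-scale sketch: offsets and modulation masks are predicted from event
    features and applied to the image+event features, as in Eq. (3)."""

    def __init__(self, channels, kernel_size=3):
        super().__init__()
        k = kernel_size
        self.k = k
        self.weight = nn.Parameter(torch.randn(channels, channels, k, k) * 0.01)
        # Two Conv+ReLU blocks on the (already integrated) event features.
        self.event_head = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True))
        self.to_offset = nn.Conv2d(channels, 2 * k * k, 3, padding=1)
        self.to_mask = nn.Conv2d(channels, k * k, 3, padding=1)

    def forward(self, feat_im_ev, feat_ev):
        ev = self.event_head(feat_ev)
        offset = self.to_offset(ev)                     # (B, 2*k*k, H, W)
        mask = torch.sigmoid(self.to_mask(ev))          # (B, k*k, H, W) in [0, 1]
        return deform_conv2d(feat_im_ev, offset, self.weight,
                             padding=self.k // 2, mask=mask)

# Example usage on toy features.
block = EventGuidedDeformBlock(channels=32)
out = block(torch.randn(1, 32, 64, 64), torch.randn(1, 32, 64, 64))
```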

3.3 Multi-scale Coarse-to-Fine Approach

By nature, event data is sparse and noisy, and events are missing in low-contrast regions. To address this problem, we propose to, first, perform feature deblurring in a multi-scale fashion and, second, progressively estimate the sharp image by computing residuals at different scales from the multi-scale deblurred features $\{F^{Im+Ev,deblur}_{t_0 \to t_1}\}_{l=0}^{L}$. Coarse-to-fine reconstruction has been widely used in image-based restoration methods [33,46,52]. While the motivation of those methods is to recover the clean image progressively by dividing the problem into small sub-tasks, in this work the goal of this approach is to deal with the sparsity of events and their absence in some regions. Coarser scales learn broader contextual information, where motion can be estimated even in locations without events, and are therefore more robust to noisy or missing events in low-contrast areas. On the other hand, scales closer to the original resolution exploit the fine details contained in the event data, which translates into detail preservation in the final output image.

Specifically, as shown in Fig. 1(b-c), deblurred features are estimated at different scales $\{F^{Im+Ev,deblur}_{t_0 \to t_1}\}_{l=0}^{L}$ as explained in Sect. 3.2, and are used to perform a coarse-to-fine residual reconstruction using an individual decoder block at each scale. In particular, we employ a decoder at each scale composed of ten residual blocks, each block consisting of Conv+ReLU+Conv, an attention mechanism (except at the last scale) that combines features from the current scale $l$ and the previous scale $l-1$, and a convolutional layer. The attention mechanism uses gating to select the most informative features from each scale. Note that the residual estimated at scale $l$ is added to the estimate from the previous scale $l-1$.
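The coarse-to-fine residual decoding described above can be summarized by the following hedged sketch, where the decoder internals are abstracted away and the function name and loop structure are assumptions:

```python
import torch
import torch.nn.functional as F

def coarse_to_fine_reconstruction(blurry, deblur_feats, decoders):
    """At every scale the estimate is the downsampled blurry image plus an
    accumulated residual; the residual from the previous (coarser) scale is
    upsampled and added in.

    blurry:       (B, 3, H, W) input blurry image.
    deblur_feats: list of deblurred feature maps, coarsest first.
    decoders:     one decoder per scale, mapping features to a 3-channel residual.
    """
    num_scales = len(deblur_feats)
    residual = None
    estimates = []
    for l, (feat, decoder) in enumerate(zip(deblur_feats, decoders)):
        scale = 2 ** (num_scales - 1 - l)
        blur_l = blurry if scale == 1 else F.interpolate(
            blurry, scale_factor=1.0 / scale, mode="bilinear", align_corners=False)
        r_l = decoder(feat)
        if residual is not None:
            r_l = r_l + F.interpolate(residual, scale_factor=2, mode="bilinear",
                                      align_corners=False)
        residual = r_l
        estimates.append(blur_l + residual)   # supervised at every scale (Sect. 3.4)
    return estimates
```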

3.4 Loss Function

To train the network, we use a sum over all scales $l$ of a combination of an $L_1$ image reconstruction loss $\mathcal{L}^{l}_{rec}$ and a perceptual loss [57] $\mathcal{L}^{l}_{percep}$.

Reconstruction Loss. The $L_1$ reconstruction loss at level $l$ is defined as

$$\mathcal{L}^{l}_{rec} = \left\| I\!\downarrow_{2^l} - \tilde{I}_l \right\|_1, \tag{4}$$

where $I\!\downarrow_{2^l}$ denotes the sharp image downsampled by a factor of $2^l$, and $\tilde{I}_l$ the estimated sharp image, which equals the sum of the blurry image $B_{t_0 \to t_1}\!\downarrow_{2^l}$ and the learned residual $R_l$.

Perceptual Loss. The perceptual loss computes a feature representation of the reconstructed image and the target image using a VGG network [37] trained on ImageNet [36], and averages the distances between VGG features across multiple layers as follows:

$$\mathcal{L}^{l}_{percep} = \sum_{n} \frac{1}{H_n W_n} \sum_{h,w} \omega_n \left\| \Phi_n(I\!\downarrow_{2^l})_{h,w} - \Phi_n(\tilde{I}_l)_{h,w} \right\|_2^2, \tag{5}$$

where $H_n$ and $W_n$ are the height and width of the feature map $\Phi_n$ at layer $n$, and $\omega_n$ are per-layer weights. Note that the features are unit-normalized in the channel dimension. By using the perceptual loss (see Eq. 5), our network effectively learns to endow the reconstructed images with natural statistics (i.e. with features close to those of natural images).

where Hl and Wl are the height and width of feature map Φn at layer n and ωn are weights for each features. Note that, features are unit-normalized in the channel dimension. By using the perceptual loss (see Eq. 5), our network effectively learns to endow the reconstructed images with natural statistics (i.e. with features close to those of natural images).

4

RGBlur+E Dataset

Due to a lack of datasets for evaluating event-based deblurring algorithms on RGB HR images, we capture the first dataset containing real RGB blur images together with the corresponding events during exposure time, called RGBlur+E. As shown in Table 3, we built a hybrid setup that uses a high-speed FLIR BlackFly S RGB GS camera and a Prophesee Gen4 monochrome event camera mounted on a rigid case with a 50/50 one-way mirror to share incoming light. The final resolution of both image and event data is 970 × 625. Both have been hardware synchronized and built into a beamsplitter setup. We have recorded a wide range of scenarios (cars, text, crossroads, people, close objects, buildings) and covered different motion patterns (static, pan, rotation, walk, hand-held camera, 3D motion). More details are provided in Table 3.

104

P. Vitoria et al.

The dataset is composed by two subsets: RGBlur+E-HS and RGBlur+ELS. In RGBlur+E-HS, we use high-speed video sequences at 180 fps to generate synthetic blurry images by averaging a fixed number of frames with ground truth sharp images and events during the exposure time. We collect a total of 32 video sequences. We split the dataset into 18 sequences for training and 14 for testing. After synthesizing the blur images, the dataset contains 7600 pairs or frames for training and 3586 for testing. In RGBlur+E-LS, we recorded 29 sequences at 28 fps with a total of 6590 frames with a longer exposure time containing a combination of blurry images and events during the exposure time. However, there is no ground truth data available.

Fig. 3. RGBlur+E setup and specifications. Image acquisition diagram and specifications summary of the proposed RGBlur+E dataset.

5 5.1

Experiments Experimental Settings

Data Preparation GoPro. The GoPro dataset is widely adopted for image- and event-based deblurring. To synthesize blurry images we average between 7 and 13 frames for training. For testing, we fix the number of frames to 11. Events are synthesized using the ESIM simulator [34]. RGBlur+E Dataset. To close the gap between simulated and real events, we finetune the pretrained model on the GoPro on the training set of the training subset of the proposed RGBlur+E-HS. Training Setting. All experiments in this work are done using the PyTorch framework. For training, we use the Adam optimizer [18] with standard settings, batches of size 6 and learning rate 10−4 decreased by a factor 2 every 10 epoch. We train the full pipeline for 50 epochs. The total number of parameters to optimize is 5.7M with an inference time of 1.8 s. Notice the reduced number of parameters compared to other state of the art methods.

Event-Based Image Deblurring with Dynamic Motion Awareness

5.2

105

Ablation of the Effectiveness of the Components

Table 1 shows an ablation study that analyzes the contribution of each component on the final performance of the algorithm. We see, (1) that event data eases the problem of image deblurring by boosting the performance by +3.18 dB, (2) that adapting the receptive field of the convolutions by using a second encoder to compute offsets and masks in the deblur module helps in the performance of the algorithm (+1.18 dB) (3) that pre-processing the event by using a ConvLSTM is more informative for image deblurring than just using the voxel grid representation (+1.69 dB) (4) the benefits coarse-to-fine approach rather than computing the residual image directly at the finer scale (+0.29 dB). Table 1. Components analysis on the GoPro dataset [25]. Im. stands for image input, Events for their use as an additional input, DM for Deblur Module computed from events and C2F for coarse-to-fine reconstruction. Im C2F Events DM LSTM PSNR ↑ SSIM ↑ LPIPS ↓ 

Table 1. Components analysis on the GoPro dataset [25]. Im. stands for image input, Events for their use as an additional input, DM for the Deblur Module computed from events, and C2F for coarse-to-fine reconstruction.

Im   C2F   Events   DM   LSTM   PSNR ↑   SSIM ↑   LPIPS ↓
✓    ✗     ✗        ✗    ✗      28.28    0.86     0.19
✓    ✓     ✗        ✗    ✗      28.57    0.87     0.18
✓    ✓     ✓        ✗    ✗      31.46    0.92     0.15
✓    ✓     ✓        ✓    ✗      32.64    0.93     0.13
✓    ✓     ✓        ✓    ✓      34.33    0.94     0.11
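The deblur module evaluated above relies on offsets and masks predicted from event features to drive modulated deformable convolutions. The sketch below illustrates this idea with torchvision.ops.DeformConv2d; the layer sizes and the offset/mask prediction heads are illustrative assumptions rather than the authors' exact design.

```python
import torch
import torch.nn as nn
from torchvision.ops import DeformConv2d

class EventGuidedDeformBlock(nn.Module):
    """Deform image features with offsets/masks predicted from event features."""
    def __init__(self, channels, kernel_size=3):
        super().__init__()
        k2 = kernel_size * kernel_size
        # Offset head: 2 coordinates per sampling location; mask head: 1 weight each.
        self.offset_head = nn.Conv2d(channels, 2 * k2, 3, padding=1)
        self.mask_head = nn.Conv2d(channels, k2, 3, padding=1)
        self.deform_conv = DeformConv2d(channels, channels, kernel_size,
                                        padding=kernel_size // 2)

    def forward(self, img_feat, event_feat):
        offset = self.offset_head(event_feat)              # where to sample
        mask = torch.sigmoid(self.mask_head(event_feat))   # modulation weights in [0, 1]
        return self.deform_conv(img_feat, offset, mask)

# out = EventGuidedDeformBlock(64)(torch.randn(1, 64, 64, 64), torch.randn(1, 64, 64, 64))
```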

5.3 Comparison with State-of-the-Art Models

We perform an extensive comparison with state-of-the-art image deblurring methods, re-running experiments for the best-performing methods when code is available. Our comparison includes model-based [12,15,45,49], learning-based [3,4,11,14,17,20,21,25,33,38,42,51,52,54,55], and event-based methods [13,16,31,32,56]. The comparison is twofold: first, as widely adopted in the literature, we compare quantitatively and qualitatively on the synthetic GoPro dataset; second, we compare quantitatively on the first subset of the RGBlur+E dataset and qualitatively on the second subset.

Comparison on the GoPro Dataset. Table 2 reports the results on the synthetic GoPro dataset. Gray values are directly extracted from the papers. Our approach achieves the best performance in terms of both PSNR and SSIM, improving the PSNR by +5.94, +1.57, and +2.08 dB compared with model-, learning-, and event-based methods, respectively. We show in Fig. 4 a visual comparison between our method and several state-of-the-art methods. The proposed method recovers the sharpest details in various challenging situations, e.g., details on the


Table 2. Quantitative comparisons on the synthetic GoPro dataset [25]. Chen et al. (¹) computes the values by averaging 7 frames instead of 11. Gray values are extracted directly from the paper.

Category         Method                 Events   PSNR ↑    SSIM ↑
Model-based      Xu et al. [49]         ✗        21.00     0.741
                 Hyun et al. [15]       ✗        23.64     0.824
                 Whyte et al. [45]      ✗        24.60     0.846
                 Gong et al. [12]       ✗        26.40     0.863
                 Carbajal et al. [1]    ✗        28.39     0.82
Learning-based   Jin et al. [17]        ✗        26.98     0.892
                 Kupyn et al. [20]      ✗        28.70     0.858
                 Nah et al. [25]        ✗        29.08     0.913
                 Zhang et al. [55]      ✗        29.18     0.931
                 Kupyn et al. v2 [21]   ✗        29.55     0.934
                 Yuan et al. [51]       ✗        29.81     0.937
                 Zhang et al. [54]      ✗        30.23     0.900
                 Tao et al. [42]        ✗        30.27     0.902
                 Gao et al. [11]        ✗        31.17     0.916
                 Park et al. [33]       ✗        31.15     0.916
                 Suin et al. [38]       ✗        31.85     0.948
                 Huo et al. [14]        ✗        32.09     0.931
                 Cho et al. [4]         ✗        32.45     0.934
                 Zamir et al. [52]      ✗        32.65     0.937
                 Chen et al. [3]        ✗        32.76     0.937
Event-based      Pan et al. [32]        Color    29.06     0.943
                 Pan et al. [31]        Color    30.29     0.919
                 Jiang et al. [16]      Color    31.79     0.949
                 Chen et al. [13]¹      Color    32.99¹    0.935¹
                 Zhang et al. [56]      Gray     32.25     0.929
                 Ours                   Gray     34.33     0.944

Table 3. RGBlur+E-HS performance.

Method   Zamir et al. [52]   Chen et al. [3]   Pan et al. [32]   Ours    Ours finetuned
PSNR ↑   33.16               32.67             24.20             31.95   34.24
SSIM ↑   0.86                0.86              0.76              0.85    0.87


Fig. 4. Deblurred examples on the synthetic GoPro dataset [25]. First row: blurry input examples. Second to fifth row: results obtained by the image-based methods [3, 4, 14, 52], where [14] uses deformable convolutions and [3, 4, 52] use a multi-scale approach, followed by the event-based method [32] and our method.

faces, like eyes, nose, and eyebrows, or the car plate numbers. Our method can also handle low-contrast situations where no events are present, e.g., the fingers of the moving hand in the first example, as well as severe blur.

Comparison on the Proposed RGBlur+E Dataset. One of the main drawbacks of state-of-the-art methods is the lack of generalization to real blur and events. In this section, we evaluate the performance in real scenarios. We use RGBlur+E-HS to quantitatively evaluate the performance of event-based methods on real event data. Note that this set contains real events but synthetically generated blurry images obtained by averaging 11 frames. To compare fairly with image-based methods, whose entire input is synthetic, we finetune our model on the training subset of RGBlur+E-HS. We report quantitative results in Table 3. Our model outperforms all other methods when processing real events. Examples of this set can be found in the last row of Fig. 5 and in the Supp. Mat.


Fig. 5. Representative examples of state-of-the-art methods on the proposed RGBlur+E-LS (first six examples) and RGBlur+E-HS (last example) datasets. First row: blurry input examples. Second to eighth rows, from left to right: results of Zamir et al. [52], Chen et al. [3], Pan et al. [32], and our result.

6 Conclusions

In this work, we propose a novel deep learning architecture that decouples the estimation of the per-pixel motion from the deblur operator. First, we dynamically learn offsets and masks containing motion information from events. Those are used by modulated deformable convolutions that deblur image and event features at different scales. Moreover, we reconstruct the sharp image in a coarse-to-fine multi-scale approach to overcome the inherent sparsity and noisy nature of events. Additionally, we introduce the RGBlur+E dataset, the first dataset containing real HR RGB images plus events. Extensive evaluations show that the proposed approach achieves superior performance compared to state-of-the-art image- and event-based methods on both synthetic and real-world datasets.

References 1. Carbajal, G., Vitoria, P., Delbracio, M., Mus´e, P., Lezama, J.: Non-uniform blur kernel estimation via adaptive basis decomposition. arXiv preprint arXiv:2102.01026 (2021) 2. Chan, T.F., Wong, C.K.: Total variation blind deconvolution. IEEE Trans. Image Process. 7(3), 370–375 (1998) 3. Chen, L., Lu, X., Zhang, J., Chu, X., Chen, C.: HINet: half instance normalization network for image restoration. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 182–192 (2021) 4. Cho, S.J., Ji, S.W., Hong, J.P., Jung, S.W., Ko, S.J.: Rethinking coarse-to-fine approach in single image deblurring. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 4641–4650 (2021) 5. Cho, S., Lee, S.: Fast motion deblurring. In: ACM SIGGRAPH Asia 2009 papers, pp. 1–8 (2009) 6. Dai, J., et al.: Deformable convolutional networks. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 764–773 (2017) 7. Dai, S., Wu, Y.: Removing partial blur in a single image. In: 2009 IEEE Conference on Computer Vision and Pattern Recognition, pp. 2544–2551. IEEE (2009) 8. Delbracio, M., Kelly, D., Brown, M.S., Milanfar, P.: Mobile computational photography: a tour. Annu. Rev. Vis. Sci. 7, 571–604 (2021). Annual Reviews 9. Dong, J., Pan, J., Su, Z., Yang, M.H.: Blind image deblurring with outlier handling. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 2478–2486 (2017) 10. Fergus, R., Singh, B., Hertzmann, A., Roweis, S.T., Freeman, W.T.: Removing camera shake from a single photograph. In: ACM SIGGRAPH 2006 Papers, pp. 787–794 (2006) 11. Gao, H., Tao, X., Shen, X., Jia, J.: Dynamic scene deblurring with parameter selective sharing and nested skip connections. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3848–3856 (2019) 12. Gong, D., et al.: From motion blur to motion flow: a deep learning solution for removing heterogeneous motion blur. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2319–2328 (2017) 13. Haoyu, C., Minggui, T., Boxin, S., YIzhou, W., Tiejun, H.: Learning to deblur and generate high frame rate video with an event camera. arXiv preprint arXiv:2003.00847 (2020) 14. Huo, D., Masoumzadeh, A., Yang, Y.H.: Blind non-uniform motion deblurring using atrous spatial pyramid deformable convolution and deblurring-reblurring consistency. arXiv preprint arXiv:2106.14336 (2021) 15. Hyun Kim, T., Ahn, B., Mu Lee, K.: Dynamic scene deblurring. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 3160–3167 (2013)


16. Jiang, Z., Zhang, Y., Zou, D., Ren, J., Lv, J., Liu, Y.: Learning event-based motion deblurring. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3320–3329 (2020) 17. Jin, M., Meishvili, G., Favaro, P.: Learning to extract a video sequence from a single motion-blurred image. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 6334–6342 (2018) 18. Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) 19. Kundur, D., Hatzinakos, D.: Blind image deconvolution. IEEE Signal Process. Mag. 13(3), 43–64 (1996) 20. Kupyn, O., Budzan, V., Mykhailych, M., Mishkin, D., Matas, J.: Deblurgan: blind motion deblurring using conditional adversarial networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 8183–8192 (2018) 21. Kupyn, O., Martyniuk, T., Wu, J., Wang, Z.: Deblurgan-v2: deblurring (ordersof-magnitude) faster and better. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 8878–8887 (2019) 22. Levin, A.: Blind motion deblurring using image statistics. Adv. Neural. Inf. Process. Syst. 19, 841–848 (2006) 23. Levin, A., Weiss, Y., Durand, F., Freeman, W.T.: Understanding and evaluating blind deconvolution algorithms. In: 2009 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1964–1971. IEEE (2009) 24. Michaeli, T., Irani, M.: Blind deblurring using internal patch recurrence. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8691, pp. 783–798. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-10578-9 51 25. Nah, S., Hyun Kim, T., Mu Lee, K.: Deep multi-scale convolutional neural network for dynamic scene deblurring. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3883–3891 (2017) 26. Nimisha, T.M., Kumar Singh, A., Rajagopalan, A.N.: Blur-invariant deep learning for blind-deblurring. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 4752–4760 (2017) 27. Noroozi, M., Chandramouli, P., Favaro, P.: Motion deblurring in the wild. In: Roth, V., Vetter, T. (eds.) GCPR 2017. LNCS, vol. 10496, pp. 65–77. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-66709-6 6 28. Pan, J., Hu, Z., Su, Z., Lee, H.Y., Yang, M.H.: Soft-segmentation guided object motion deblurring. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 459–468 (2016) 29. Pan, J., Hu, Z., Su, Z., Yang, M.H.: Deblurring text images via l0-regularized intensity and gradient prior. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2901–2908 (2014) 30. Pan, J., Sun, D., Pfister, H., Yang, M.H.: Blind image deblurring using dark channel prior. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1628–1636 (2016) 31. Pan, L., Hartley, R., Scheerlinck, C., Liu, M., Yu, X., Dai, Y.: High frame rate video reconstruction based on an event camera. IEEE Trans. Pattern Anal. Mach. Intell. (2020) 32. Pan, L., Scheerlinck, C., Yu, X., Hartley, R., Liu, M., Dai, Y.: Bringing a blurry frame alive at high frame-rate with an event camera. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 6820– 6829 (2019)


33. Park, D., Kang, D.U., Kim, J., Chun, S.Y.: Multi-temporal recurrent neural networks for progressive non-uniform single image deblurring with incremental temporal training. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12351, pp. 327–343. Springer, Cham (2020). https://doi.org/10. 1007/978-3-030-58539-6 20 34. Rebecq, H., Gehrig, D., Scaramuzza, D.: ESIM: an open event camera simulator. In: Conference on Robot Learning, pp. 969–982. PMLR (2018) 35. Rebecq, H., Ranftl, R., Koltun, V., Scaramuzza, D.: Events-to-video: bringing modern computer vision to event cameras. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3857–3866 (2019) 36. Russakovsky, O., et al.: Imagenet large scale visual recognition challenge. Int. J. Comput. Vision 115(3), 211–252 (2015) 37. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) 38. Suin, M., Purohit, K., Rajagopalan, A.: Spatially-attentive patch-hierarchical network for adaptive motion deblurring. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3606–3615 (2020) 39. Sun, J., Cao, W., Xu, Z., Ponce, J.: Learning a convolutional neural network for non-uniform motion blur removal. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 769–777 (2015) 40. Sun, L., et al.: Mefnet: multi-scale event fusion network for motion deblurring. arXiv preprint arXiv:2112.00167 (2021) 41. Tai, Y.W., Tan, P., Brown, M.S.: Richardson-lucy deblurring for scenes under a projective motion path. IEEE Trans. Pattern Anal. Mach. Intell. 33(8), 1603–1618 (2010) 42. Tao, X., Gao, H., Shen, X., Wang, J., Jia, J.: Scale-recurrent network for deep image deblurring. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 8174–8182 (2018) 43. Tran, P., Tran, A.T., Phung, Q., Hoai, M.: Explore image deblurring via encoded blur kernel space. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 11956–11965 (2021) 44. Wang, B., He, J., Yu, L., Xia, G.-S., Yang, W.: Event enhanced high-quality image recovery. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12358, pp. 155–171. Springer, Cham (2020). https://doi.org/10.1007/ 978-3-030-58601-0 10 45. Whyte, O., Sivic, J., Zisserman, A., Ponce, J.: Non-uniform deblurring for shaken images. Int. J. Comput. Vision 98(2), 168–186 (2012) 46. Wieschollek, P., Hirsch, M., Scholkopf, B., Lensch, H.: Learning blind motion deblurring. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 231–240 (2017) 47. Xingjian, S., Chen, Z., Wang, H., Yeung, D.Y., Wong, W.K., Woo, W.C.: Convolutional LSTM network: a machine learning approach for precipitation nowcasting. In: Advances in Neural Information Processing Systems, pp. 802–810 (2015) 48. Xu, F., et al.: Motion deblurring with real events. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 2583–2592 (2021) 49. Xu, L., Zheng, S., Jia, J.: Unnatural l0 sparse representation for natural image deblurring. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1107–1114 (2013) 50. Yan, Y., Ren, W., Guo, Y., Wang, R., Cao, X.: Image deblurring via extreme channels prior. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4003–4011 (2017)


51. Yuan, Y., Su, W., Ma, D.: Efficient dynamic scene deblurring using spatially variant deconvolution network with optical flow guided training. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3555– 3564 (2020) 52. Zamir, S.W., et al.: Multi-stage progressive image restoration. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 14821–14831 (2021) 53. Zhang, H., Wipf, D.: Non-uniform camera shake removal using a spatially-adaptive sparse penalty. In: Advances in Neural Information Processing Systems, pp. 1556– 1564. Citeseer (2013) 54. Zhang, H., Dai, Y., Li, H., Koniusz, P.: Deep stacked hierarchical multi-patch network for image deblurring. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5978–5986 (2019) 55. Zhang, J., et al.: Dynamic scene deblurring using spatially variant recurrent neural networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2521–2529 (2018) 56. Zhang, L., Zhang, H., Chen, J., Wang, L.: Hybrid deblur net: deep non-uniform deblurring with event camera. IEEE Access 8, 148075–148083 (2020) 57. Zhang, R., Isola, P., Efros, A.A., Shechtman, E., Wang, O.: The unreasonable effectiveness of deep features as a perceptual metric. In: CVPR (2018) 58. Zhu, X., Hu, H., Lin, S., Dai, J.: Deformable convnets V2: more deformable, better results. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 9308–9316 (2019)

UDC-UNet: Under-Display Camera Image Restoration via U-shape Dynamic Network

Xina Liu1,2, Jinfan Hu1,2, Xiangyu Chen1,3,4, and Chao Dong1,4(B)

1 Shenzhen Key Lab of Computer Vision and Pattern Recognition, SIAT-SenseTime Joint Lab, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Beijing, China
{xn.liu,jf.hu1,chao.dong}@siat.ac.cn, [email protected]
2 University of Chinese Academy of Sciences, Beijing, China
3 University of Macau, Zhuhai, China
4 Shanghai AI Laboratory, Shanghai, China
X. Liu and J. Hu—Equal contributions.

Abstract. Under-Display Camera (UDC) has been widely exploited to help smartphones realize full-screen displays. However, as the screen inevitably affects the light propagation process, the images captured by the UDC system usually contain flare, haze, blur, and noise. In particular, flare and blur in UDC images can severely deteriorate the user experience in high dynamic range (HDR) scenes. In this paper, we propose a new deep model, namely UDC-UNet, to address the UDC image restoration problem with an estimated PSF in HDR scenes. Our network consists of three parts: a U-shape base network to utilize multi-scale information, a condition branch to perform spatially variant modulation, and a kernel branch to leverage the prior knowledge of the PSF. According to the characteristics of HDR data, we additionally design a tone mapping loss to stabilize network optimization and achieve better visual quality. Experimental results show that the proposed UDC-UNet outperforms the state-of-the-art methods in quantitative and qualitative comparisons. Our approach won second place in the UDC image restoration track of the MIPI challenge. Codes and models are available at https://github.com/J-FHu/UDCUNet.

Keywords: Under display camera · Image restoration

1 Introduction

As a new design for smartphones, Under-Display Camera (UDC) systems can provide a bezel-less and notch-free viewing experience and attract much attention from the industry. However, it is difficult to preserve the full functionality of the imaging sensor under a display. As the propagation of light in the imaging process is inevitably affected, various forms of optical diffraction and interference are generated. The image captured by UDC often suffers from noise, flare,


Fig. 1. Visual results of under-display camera image restoration. Affected by the UDC imaging mechanism, artifacts including flare, blur, and haze will appear in UDC images. Our method generates visually pleasing results by alleviating the artifacts.

haze, and blurry artifacts. Besides, UDC images are often captured under a high dynamic range in real scenes. Therefore, severe over-saturation can occur in highlight regions, as shown in Fig. 1.

Image restoration tasks aim to restore the clean image from its degraded version, such as denoising [16,51,52], deraining [29,30,56], deblurring [23,27], super-resolution [4,7,54], and HDR reconstruction [3,10]. Similar to these tasks, UDC image restoration aims at reconstructing the degraded image generated by the UDC system. To model the complicated degradation process of the UDC imaging system, existing works [11,55] propose to utilize a particular diffraction blur kernel, i.e., the Point Spread Function (PSF). UDC image restoration can then be regarded as an inversion problem for a measured PSF. Though existing methods have made significant progress on this task, their performance is still limited.

In this work, we propose UDC-UNet to restore images captured by UDC in HDR scenes. The framework consists of a base network, a condition branch, and a kernel branch. First, we exploit the commonly used U-shape structure to build the base network to utilize multi-scale information hierarchically. Then, to achieve spatially variant modulation for regions with different exposures, we adopt several spatial feature transform (SFT) layers [46] to build the condition branch, following the design in [3]. Additionally, the prior knowledge of an accurately measured PSF has been proven to be useful for the restoration process [11]. Hence, we also add a kernel branch that uses an estimated PSF to refine the intermediate features. Besides, considering the data characteristics of HDR images, we design a new tone mapping loss that normalizes image values into [0,1]. It can not only balance the effects of pixels with different intensities but also stabilize the training process. Experimental results show that our method surpasses the state-of-the-art methods in both quantitative performance and visual quality. Our approach can generate visually pleasing results without artifacts even in over-saturated regions, as shown in Fig. 1. The proposed UDC-UNet won second place in the MIPI UDC image restoration challenge.


In summary, our contributions are three-fold: 1) We propose a new deep network, UDC-UNet, to address the UDC image restoration problem. 2) We design a tone mapping loss to address the UDC image restoration problem in HDR scenes. 3) Experiments demonstrate the superiority of our method in quantitative performance and visual quality.

2 Related Work

2.1 UDC Restoration

Under-display camera is a new imaging system that mounts a display screen on top of a traditional digital camera lens. While providing a better user experience for viewing and interaction, the UDC system may sacrifice photographic quality. Due to the low light transmission rate and diffraction effects in the UDC system, significant quality degradations may appear in UDC images, including flare, haze, blur, and noise. Some works [39,45] have analyzed the imaging principle of the UDC system and attempted to restore UDC images with deconvolution-based methods such as the Wiener filter [14]. Qin et al. [38] studied the see-through image blurring of transparent organic light-emitting diode displays. Kwon et al. [24] focused on the analysis of transparent displays. Zhou et al. [55] recovered the original signal from the point-spread-function-convolved image. Feng et al. [11] treated UDC image restoration as a non-blind inverse problem for an accurately measured PSF.

Until now, there have been only a few works that directly address UDC image restoration. MCIS [55] modeled the image formation pipeline of UDC and proposed to restore UDC images using a Deconvolution-based Pipeline (DeP) and data-driven learning-based methods. Although this variant of UNet solved many problems of UDC restoration, it lacks consideration of HDR in data generation and PSF measurement. A newly proposed structure named DISCNet [11] considered high dynamic range (HDR) imaging and measured the real-world PSF of the UDC system. Besides, DISCNet regarded the PSF as useful domain knowledge and added a separate branch to fuse its information. Experiments have shown the effectiveness of doing so in removing diffraction image artifacts. However, in HDR scenes, existing methods still cannot handle both the blurring of unsaturated regions and the flare in over-saturated areas. UDC images in this scenario have different exposures and brightness, whereas previous networks apply the same filter weights across all regions, which is no longer powerful enough for this task.

The spatial feature transform (SFT) layer is an effective design introduced in SFTGAN [46] that applies an affine transformation to the intermediate features and performs spatially variant modulation. Many works [15,25,42] have proven the effectiveness of the SFT layer, and in particular, HDRUNet [3] demonstrated the effectiveness of this module for HDR reconstruction. Thus, we exploit SFT in our network to process the various patterns across the image. Additionally, since Feng et al. [11] showed that dynamic convolutions generated by the measured PSF can also benefit the UDC image restoration task in HDR scenes, we utilize this mechanism in our design to further refine the intermediate features.

2.2 Image Restoration

Image restoration aims to reconstruct a high-quality image from its degraded version. Prior to the deep-learning era, many traditional image processing methods [1,17,19,36,43,44] were proposed for image restoration. Since the first deep learning method was successfully applied to low-level vision tasks [8], a flurry of approaches has been developed for various image restoration problems, including super-resolution [4,9,22,54], denoising [16,51,52], deraining [29,30,56], deblurring [23,27], and HDR reconstruction [3,10]. As one of the most important tasks in the field, image super-resolution attracts the attention of numerous researchers, and a series of methods [4,9,31,32,47,54] has been proposed and made great progress. Denoising is a fundamental task for image restoration, and effective methods have been designed to deal with this problem [16,26,33,51,52]. Deblurring, deraining, and HDR reconstruction are also practical problems in the image restoration field. Deblurring methods aim to address the image blur caused by camera shake, object motion, and out-of-focus capture [6,13,23,27,37,53]. Deraining aims at removing rain streaks from degraded images and is also a widely studied problem [12,29,30,40,56]. Reconstructing LDR images to their HDR version can greatly improve the visual quality of images and the viewing experience for users, and many works focus on this task [3,5,10,20,34].

Since UNet [41] first proposed the U-shape network for biomedical image segmentation, this design has been widely used in image restoration tasks. For example, Cho et al. [6] introduced a U-shape network for image deblurring, and Chen et al. [3,10] proposed UNet-style networks for single-image HDR reconstruction. As the U-shape network can better utilize multi-scale information and save computation by deploying multiple down-sampling and up-sampling operations, we also exploit this structure for the UDC image restoration task. With the development of large models and the pre-training technique, many works propose to address image restoration problems using Transformers [32,48,49] and try to deal with multiple image restoration tasks simultaneously [2,28]. However, there are still few methods for UDC image restoration.

3 Methodology

3.1 Problem Formulation

As mentioned in [11], a real-world UDC image formation model with multiple types of degradation can be formulated as

ŷ = φ[C(x ∗ k + n)],   (1)

where x represents the real HDR scene irradiance, k is the known convolution kernel, commonly referred to as the PSF for this task, “∗” denotes the convolution operation, and n represents the camera noise. C(·) denotes the clipping operation of the digital sensor with limited dynamic range, and φ(·) represents the non-linear tone-mapping function.
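For illustration, Eq. (1) can be simulated as in the sketch below; the Gaussian noise model, the clipping range, and the use of the challenge tone mapping for φ are assumptions, since the actual data generation relies on the measured PSF and calibrated sensor characteristics.

```python
import numpy as np
from scipy.signal import fftconvolve

def simulate_udc(x, psf, noise_std=0.01, clip_max=500.0):
    """Simulate Eq. (1): y = phi[C(x * k + n)] for one HDR image, channel by channel.

    x:   HxWx3 HDR scene irradiance; psf: hxw kernel (normalized to sum to 1).
    The Gaussian noise model, clipping range, and tone mapping are illustrative assumptions.
    """
    blurred = np.stack([fftconvolve(x[..., c], psf, mode="same") for c in range(3)], axis=-1)
    noisy = blurred + np.random.normal(0.0, noise_std, blurred.shape)
    clipped = np.clip(noisy, 0.0, clip_max)   # C(.): limited sensor dynamic range
    return clipped / (clipped + 0.25)         # phi(.): simple tone mapping used for evaluation
```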


Fig. 2. The overall architecture of UDC-UNet. It consists of a U-shape base network, a condition branch, and a kernel branch.

Following [11], we treat this task as a non-blind image restoration problem, where the degraded image ŷ and the estimated PSF k are given to restore the clean image. Hence, our proposed network f can be defined as

x = f(ŷ, k; θ),   (2)

where x denotes the predicted image and θ represents the network parameters.

3.2 Network Structure

The overall architecture of the proposed network consists of three components: a base network, a condition branch, and a kernel branch, as shown in Fig. 2.

Base Network. A U-shape structure is adopted to build the base network. Its effectiveness has been proven in many image restoration methods [3,10,48,50]. This kind of structure helps the network make full use of hierarchical multi-scale information, from low-level features to high-level features. The shallow layers of the network gradually extract features with a growing receptive field and map the input image to a high-dimensional representation, while the deep layers reconstruct features step by step from the encoded representation. Benefiting from the skip connections, shallow features and deep features can be combined organically. For the basic blocks in the base network, we simply exploit the common residual block with two 3 × 3 convolutions.

Condition Branch. The key to UDC image restoration in HDR scenes is to deal with the blurring of unsaturated regions and the flare in over-saturated areas. Traditional convolution filters apply the same filter weights


across the whole image, which is inefficient for performing such a spatially variant mapping. HDRUNet [3] proposes to leverage the SFT layer [46] to achieve spatially variant manipulation for HDR reconstruction with denoising and dequantization. In our approach, we also adopt this module to build the condition branch for adaptively modulating the intermediate features of the base network. As shown in Fig. 2, we use the input image to generate conditional maps with different resolutions. Then, these conditions are applied to the features in the base network through the SFT layers. The operation of the SFT layer can be written as

SFT(x) = α ⊙ x + β,   (3)

where “⊙” denotes element-wise multiplication, x ∈ R^{C×H×W} is the intermediate feature of the base network to be modulated, and α ∈ R^{C×H×W} and β ∈ R^{C×H×W} are two modulation coefficient maps generated by the condition branch. By using such a modulation mechanism, the proposed network can easily perform spatially variant mapping for different regions. Experimental results in Sect. 4.2 show the effectiveness of this branch for the UDC image restoration task.

Kernel Branch. Feng et al. [11] have demonstrated that the PSF can provide useful prior knowledge for UDC image restoration. Thus, we exploit a similar approach to leverage the PSF to further refine the intermediate features of our base network. Following [11,15], we first project the PSF onto a b-dimensional vector as the kernel code. Then, we stretch the kernel code into H × W × b features and concatenate it with the degraded image of size H × W × C to obtain the conditional input of size H × W × (b + C). Through the kernel branch, the conditional inputs are mapped into scale-specific kernel features. For modulated features of size h × w × c, the generated kernel feature is of size h × w × ck². The dynamic convolutions are then performed on each pixel as

F_{i,j,c} = K_{i,j,c} · M_{i,j,c},   (4)

where K_{i,j,c} represents the k × k filter reshaped from the kernel feature at position (i, j, c) with size 1 × 1 × k², M_{i,j,c} denotes the k × k patch of the modulated features centered at (i, j, c), F_{i,j,c} denotes the element at (i, j, c) of the output feature, and “·” is the inner product. Note that we directly use the measured PSF provided by [11] in this paper.
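A minimal PyTorch sketch of the two modulation mechanisms above (the SFT layer of Eq. (3) and the per-pixel dynamic convolution of Eq. (4)) is given below; the small convolutional heads that predict α and β and the channel layout of the kernel features are illustrative assumptions, not the exact UDC-UNet implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SFTLayer(nn.Module):
    """Spatial feature transform (Eq. 3): SFT(x) = alpha * x + beta."""
    def __init__(self, cond_channels, feat_channels):
        super().__init__()
        self.alpha_conv = nn.Sequential(
            nn.Conv2d(cond_channels, feat_channels, 1), nn.LeakyReLU(0.1, inplace=True),
            nn.Conv2d(feat_channels, feat_channels, 1))
        self.beta_conv = nn.Sequential(
            nn.Conv2d(cond_channels, feat_channels, 1), nn.LeakyReLU(0.1, inplace=True),
            nn.Conv2d(feat_channels, feat_channels, 1))

    def forward(self, x, cond):
        # alpha/beta are per-pixel, per-channel modulation maps derived from the condition.
        return self.alpha_conv(cond) * x + self.beta_conv(cond)


def dynamic_pixel_conv(feat, kernel_feat, k=5):
    """Per-pixel dynamic filtering (Eq. 4).

    feat:        (B, C, H, W) modulated features.
    kernel_feat: (B, C*k*k, H, W) per-pixel filters from the kernel branch
                 (channel-major layout is assumed here).
    """
    b, c, h, w = feat.shape
    # Extract the k x k neighbourhood of every pixel: (B, C*k*k, H*W).
    patches = F.unfold(feat, kernel_size=k, padding=k // 2)
    patches = patches.view(b, c, k * k, h, w)
    kernels = kernel_feat.view(b, c, k * k, h, w)
    # Inner product between each patch and its per-pixel filter.
    return (patches * kernels).sum(dim=2)
```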

3.3 Loss Function

Traditional loss functions for image restoration tasks, such as the ℓ1 and ℓ2 losses, assign the same weight to all pixels regardless of intensity. For images captured in HDR scenes, such loss functions make the network focus on pixels with large intensity, resulting in an unstable training process and poor performance. A common practice is to design a specific loss function according to the tone mapping function [3,35]. Inspired by these works, we propose a Mapping-ℓ1 loss to optimize the network, which is formulated as

Mapping-ℓ1(Y, X) = |Mapping(Y) − Mapping(X)|,   (5)

where Y represents the predicted result and X is the corresponding ground truth. The tone mapping function is Mapping(I) = I/(I + 0.25).
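A direct PyTorch sketch of the Mapping-ℓ1 loss in Eq. (5) (averaged over all pixels, which is an assumption about the reduction) is:

```python
import torch
import torch.nn as nn

def tone_map(x):
    """Tone mapping used throughout the paper: Mapping(I) = I / (I + 0.25)."""
    return x / (x + 0.25)

class MappingL1Loss(nn.Module):
    """Mapping-L1 loss of Eq. (5): L1 distance between tone-mapped prediction and target."""
    def forward(self, pred, target):
        return torch.abs(tone_map(pred) - tone_map(target)).mean()

# loss = MappingL1Loss()(pred, gt)   # pred/gt are HDR tensors whose values may exceed 1
```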

4 Experiments

4.1 Experimental Setup

Dataset. UDC images and their corresponding ground-truth images in real scenes are currently difficult to obtain. Therefore, Feng et al. [11] propose to use the PSF to simulate degraded UDC images from clean images. In the MIPI UDC image restoration challenge (https://codalab.lisn.upsaclay.fr/competitions/4874#participate), 132 HDR panorama images with a spatial size of 8196 × 4096 from the HDRI Haven dataset (https://hdrihaven.com/hdris/) are used to generate the synthetic data. This dataset contains images captured in various indoor and outdoor, day and night, nature and urban scenes. The images are first projected to the regular perspective and then cropped into 800 × 800 patches. Each patch is degraded using Eq. (1) to obtain the simulated UDC images. As a result, 2016 image pairs are available for training, and 40 image pairs are reserved as the validation and testing sets.

Implementation Details. During the training phase, 256 × 256 patches randomly cropped from the original images are fed into the network. The mini-batch size is set to 32 and the whole network is trained for 6 × 10^5 iterations. The numbers of residual blocks in the condition branch and the kernel branch are both set to 2, while the numbers of residual SFT blocks in the base network are set to [2,8], respectively. We use the exponential moving average strategy during training. The number of channels C is set to 32 in our UDC-UNet. We also provide a small version of our method, UDC-UNetS, by reducing the channel number C to 20. The initialization strategy [18] and the Adam optimizer [21] are adopted for training. The learning rate is initialized as 2 × 10^{-4} and decayed with a cosine annealing schedule, where ηmin = 1 × 10^{-7} and ηmax = 2 × 10^{-4}. Furthermore, the learning rate is restarted at [5 × 10^4, 1.5 × 10^5, 3 × 10^5, 4.5 × 10^5] iterations. For the evaluation metrics, PSNR, SSIM, and LPIPS are calculated on the tone-mapped images obtained through the tone mapping function (Mapping(I) = I/(I + 0.25)) provided by the challenge.
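For reference, a minimal sketch of the tone-mapped PSNR used for evaluation is given below, assuming a peak value of 1 after tone mapping; SSIM and LPIPS would be computed on the same tone-mapped images with standard implementations.

```python
import numpy as np

def tone_map(x):
    # Challenge tone mapping: Mapping(I) = I / (I + 0.25)
    return x / (x + 0.25)

def tonemapped_psnr(pred, gt):
    """PSNR computed on tone-mapped images, whose values lie in [0, 1)."""
    mse = np.mean((tone_map(pred) - tone_map(gt)) ** 2)
    return 10.0 * np.log10(1.0 / mse)   # peak value assumed to be 1 after tone mapping
```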

4.2 Ablation Study

In this part, we first conduct experiments to explore the effectiveness of the key modules. Then we investigate the influence of different conditional inputs on the kernel branch. Finally, we study the effects of different loss functions. Note that, for fast validation, we use UDC-UNet and set the batch size to 6 for all experiments in this section.


Table 1. Ablation study of the key modules in UDC-UNet.

Models             (a)      (b)      (c)      (d)      (e) (Ours)
Skip connections   ✘        ✓        ✓        ✓        ✓
Condition branch   ✘        ✘        ✘        ✓        ✓
Kernel branch      ✘        ✘        ✓        ✘        ✓
PSNR (dB)          42.19    44.50    44.58    45.23    45.37
SSIM               0.9884   0.9897   0.9893   0.9897   0.9898
LPIPS              0.0164   0.0155   0.0157   0.0166   0.0162

Effectiveness of Key Modules. As our approach consists of three components (the base network, the condition branch, and the kernel branch), we conduct ablation experiments to demonstrate their effectiveness. In addition, we emphasize that the skip connections (all the dotted lines shown in Fig. 2) in the base network have a significant positive effect on performance. As shown in Table 1, our base network without any skip connections or extra branches already obtains more than 42 dB, demonstrating the effectiveness of the U-shape structure for this task. By comparing models (a) and (b), we observe that the skip connections bring more than a 2 dB performance gain, showing their significance in the base network. The models with the condition branch (d, e) outperform the models without this branch (b, c) by more than 0.7 dB, illustrating the necessity of the condition branch. The kernel branch also brings a considerable performance gain, as seen by comparing models (c, e) with models (b, d). Combining these key modules, our method achieves the highest PSNR and SSIM.

Table 2. Quantitative results of different conditional inputs for the kernel branch.

Conditional Inputs   PSNR (dB)   SSIM     LPIPS
(a) None             45.23       0.9897   0.0166
(b) Image            45.17       0.9896   0.0162
(c) PSF              45.26       0.9895   0.0166
(d) Image+PSF        45.37       0.9898   0.0162

Different Conditional Inputs of the Kernel Branch. An intuitive way to introduce the PSF prior is to directly use the estimated PSF to generate the dynamic filters in the kernel branch. Nonetheless, experiments in [11] show that utilizing the combination of the PSF and the degraded image to generate the filters brings a larger performance improvement. Thus, we also investigate different conditional inputs for the kernel branch. As shown in Table 2, using only the degraded image or only the PSF does not bring an obvious performance gain and can even cause a slight performance drop. However, when using the image and the PSF simultaneously as the conditional input, the model obtains a non-negligible performance improvement.

Table 3. Quantitative results of our method using different loss functions.

Losses       PSNR (dB)   SSIM     LPIPS
ℓ1           41.30       0.9812   0.0301
Mapping-ℓ2   40.19       0.9838   0.0238
Mapping-ℓ1   45.37       0.9898   0.0162

Effects of Different Loss Functions. We also conduct experiments to show the effects of different loss functions, including the traditional ℓ1 loss, the proposed Mapping-ℓ1 loss, and an ℓ2 loss calculated after the same tone mapping function, denoted as the Mapping-ℓ2 loss. As depicted in Table 3, our Mapping-ℓ1 loss brings a large quantitative performance improvement compared to the traditional ℓ1 loss and also outperforms the Mapping-ℓ2 loss by a large margin. The results show that the loss function significantly affects the optimization of the network for this task. We also provide visual comparisons of models trained with different loss functions in Fig. 3. We observe that the ℓ1 loss and the Mapping-ℓ2 loss make the network focus more on the over-saturated regions, resulting in severe artifacts in these areas. We visualize the tone mapping function in Fig. 4 to further illustrate the effect of the proposed loss. Since the pixel intensities of the used HDR data lie in the range [0,500], the large values can significantly affect network optimization and easily make the network focus on the over-saturated regions. After the tone mapping function, the pixel values are normalized to the range [0,1); an intuitive manifestation is that values greater than 1 are compressed into (0.8,1]. Hence, the proposed Mapping-ℓ1 loss balances the influence of regions with different exposures during network optimization.

4.3 Comparison with State-of-the-Art Methods

To evaluate the performance of our UDC-UNet, we conduct experiments on the simulated data and compare with several state-of-the-art methods. Since UDC image restoration is a newly proposed task, not many related works are available for comparison. Therefore, we select three representative methods: DISCNet [11] for UDC image restoration, Uformer [48] for general image restoration, and HDRUNet [3] for HDR image reconstruction. Specifically, we retrain the compared methods using their official codes on the same data with the same Mapping-ℓ1 loss. Note that we use the small version of Uformer with channel number 32 to save computation. To further demonstrate the effectiveness of our method, we also provide the results of the small version of our method, UDC-UNetS, obtained by reducing the channel number C to 20.

Quantitative Results. The quantitative results are provided in Table 4. UDC-UNet achieves the best performance and significantly outperforms the compared state-of-the-art methods by more than 7 dB. Considering the computations, we also provide the results of the smaller variant UDC-UNetS, which also obtains considerable performance with relatively few parameters and the least computation. The GMACs of all the methods are calculated on an input size of 800 × 800. Note that, due to more downsampling operations, our methods have much fewer GMACs than the compared approaches.

Fig. 4. Visualization of the tone mapping function, i.e., Mapping(I) = I/(I + 0.25).

that we use the small version of Uformer with channel number 32 to save computations. To further demonstrate the effectiveness of our method, we also provide the results of the small version of our method, UDC-UNetS , by reducing the channel number C to 20. Quantitative Results. The quantitative results are provided in Table 4. We can observe that UDC-UNet achieves the best performance and significantly outperforms the compared state-of-the-art methods by more than 7 dB. Considering the computations, we also provide the results of a smaller variant UDCUNetS . We can observe that UDC-NetS also obtains considerable performance

UDC-UNet

123

Table 4. Quantitative comparisons of our method with state-of-the-art approaches. The best values are in bold, and the second best values are underlined. Methods

PSNR (dB) SSIM

LPIPS

Params GMACs

Uformer [48]

37.97

0.9784

0.0285

20.0M

490

HDRUNet [3]

40.23

0.9832

0.0240

1.7M

229

DISCNet [11]

39.89

0.9864

0.0152

3.8M

367

UDC-UNetS (ours) 45.98

0.9913

0.0128

5.7M

UDC-UNet (ours)

0.9927 0.0100 14.0M

47.18

169 402

Visual Comparison. We provide the visual results of our method and the state-of-the-art approaches in Fig. 5. Compared to the other methods, UDC-UNet obtains visually pleasing results with clean textures and few artifacts, especially in the over-saturated regions. Besides, the results generated by UDC-UNetS also contain relatively fewer artifacts than those of the other methods. To further demonstrate the superiority of UDC-UNet, we also provide visual comparisons with current models on the testing data in Fig. 6, where our method again obtains better visual quality than the compared approaches.

5 Results of the MIPI UDC Image Restoration Challenge

We participated in the Mobile Intelligent Photography and Imaging (MIPI) challenge (http://mipi-challenge.org/) and won second place in the UDC image restoration track. The top 5 results are reported in Table 5.

Table 5. Top 5 results of the MIPI UDC image restoration challenge.

Teams                 PSNR (dB)   SSIM     LPIPS
USTC WXYZ             48.4776     0.9934   0.0093
XPixel Group (ours)   47.7821     0.9913   0.0122
SRC-B                 46.9227     0.9929   0.0098
MIALGO                46.1177     0.9892   0.0159
LVGroup HFUT          45.8722     0.9920   0.0109


Fig. 5. Visual comparisons of our method with state-of-the-art approaches on the validation set.


Fig. 6. Visual comparisons of our method with state-of-the-art approaches on the testing set.

6 Conclusions

In this paper, we propose UDC-UNet to address the UDC image restoration task in HDR scenes. The designed network consists of a U-shape base network to utilize multi-scale information, a condition branch to perform spatially variant modulation, and a kernel branch to leverage the prior knowledge of the measured PSF. In addition, we design a tone mapping loss to balance the effects of pixels with different intensities. Experiments show that our method obtains the best quantitative performance and visual quality compared to state-of-the-art approaches. We participated in the MIPI challenge and won second place in the UDC image restoration track.

Acknowledgements. This work is partially supported by the National Natural Science Foundation of China (61906184, U1913210), and the Shanghai Committee of Science and Technology, China (Grant No. 21DZ1100100).

References 1. Calvetti, D., Reichel, L., Zhang, Q.: Iterative solution methods for large linear discrete ill-posed problems. In: Datta, B.N. (eds.) Applied and Computational Control, Signals, and Circuits, pp. 313–367. Springer, Boston (1999). https://doi. org/10.1007/978-1-4612-0571-5 7 2. Chen, H., et al.: Pre-trained image processing transformer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 12299– 12310 (2021) 3. Chen, X., Liu, Y., Zhang, Z., Qiao, Y., Dong, C.: HdruNet: single image HDR reconstruction with denoising and dequantization. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 354– 363 (2021) 4. Chen, X., Wang, X., Zhou, J., Dong, C.: Activating more pixels in image superresolution transformer. arXiv preprint arXiv:2205.04437 (2022) 5. Chen, X., Zhang, Z., Ren, J.S., Tian, L., Qiao, Y., Dong, C.: A new journey from SDRTV to HDRTV. In: Proceedings of the IEEE/CVF International Conference on Computer Vision., pp. 4500–4509 (2021) 6. Cho, S.J., Ji, S.W., Hong, J.P., Jung, S.W., Ko, S.J.: Rethinking coarse-to-fine approach in single image deblurring. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 4641–4650 (2021) 7. Dong, C., Loy, C.C., He, K., Tang, X.: Learning a deep convolutional network for image super-resolution. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8692, pp. 184–199. Springer, Cham (2014). https://doi. org/10.1007/978-3-319-10593-2 13 8. Dong, C., Loy, C.C., He, K., Tang, X.: Learning a deep convolutional network for image super-resolution. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8692, pp. 184–199. Springer, Cham (2014). https://doi. org/10.1007/978-3-319-10593-2 13 9. Dong, C., Loy, C.C., Tang, X.: Accelerating the super-resolution convolutional neural network. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9906, pp. 391–407. Springer, Cham (2016). https://doi.org/10.1007/ 978-3-319-46475-6 25


10. Eilertsen, G., Kronander, J., Denes, G., Mantiuk, R.K., Unger, J.: HDR image reconstruction from a single exposure using deep CNNs. ACM Trans. Graphics 36(6), 1–15 (2017) 11. Feng, R., Li, C., Chen, H., Li, S., Loy, C.C., Gu, J.: Removing diffraction image artifacts in under-display camera via dynamic skip connection network. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 662–671 (2021) 12. Fu, X., Huang, J., Zeng, D., Huang, Y., Ding, X., Paisley, J.: Removing rain from single images via a deep detail network. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3855–3863 (2017) 13. Gao, H., Tao, X., Shen, X., Jia, J.: Dynamic scene deblurring with parameter selective sharing and nested skip connections. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3848–3856 (2019) 14. Goldstein, J.S., Reed, I.S., Scharf, L.L.: A multistage representation of the wiener filter based on orthogonal projections. IEEE Trans. Inf. Theory 44(7), 2943–2959 (1998) 15. Gu, J., Lu, H., Zuo, W., Dong, C.: Blind super-resolution with iterative kernel correction. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 1604–1613 (2019) 16. Guo, S., Yan, Z., Zhang, K., Zuo, W., Zhang, L.: Toward convolutional blind denoising of real photographs. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 1712–1722 (2019) 17. Hansen, P.C.: Regularization tools: a Matlab package for analysis and solution of discrete ill-posed problems. Num. Algorithms 6(1), 1–35 (1994) 18. He, K., Zhang, X., Ren, S., Sun, J.: Delving deep into rectifiers: surpassing humanlevel performance on imagenet classification. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 1026–1034 (2015) 19. Jauch, W.: The maximum-entropy method in charge-density studies. II. general aspects of reliability. Acta Crystallographica Sect. A Found. Crystallogr. 50(5), 650–652 (1994) 20. Kim, S.Y., Oh, J., Kim, M.: Deep SR-ITM: joint learning of super-resolution and inverse tone-mapping for 4k UHD HDR applications. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 3116–3125 (2019) 21. Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) 22. Kong, X., Liu, X., Gu, J., Qiao, Y., Dong, C.: Reflash dropout in image superresolution. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 6002–6012 (2022) 23. Kupyn, O., Martyniuk, T., Wu, J., Wang, Z.: Deblurgan-v2: Deblurring (ordersof-magnitude) faster and better. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 8878–8887 (2019) 24. Kwon, H.J., Yang, C.M., Kim, M.C., Kim, C.W., Ahn, J.Y., Kim, P.R.: Modeling of luminance transition curve of transparent plastics on transparent OLED displays. Electr. Imaging 2016(20), 1–4 (2016) 25. Lee, C.H., Liu, Z., Wu, L., Luo, P.: Maskgan: towards diverse and interactive facial image manipulation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5549–5558 (2020) 26. Lehtinen, J., Munkberg, J., Hasselgren, J., Laine, S., Karras, T., Aittala, M., Aila, T.: Noise2noise: learning image restoration without clean data. arXiv preprint arXiv:1803.04189 (2018)


27. Li, L., Pan, J., Lai, W.S., Gao, C., Sang, N., Yang, M.H.: Blind image deblurring via deep discriminative priors. Int. J. Comput. Vision 127(8), 1025–1043 (2019) 28. Li, W., Lu, X., Lu, J., Zhang, X., Jia, J.: On efficient transformer and image pre-training for low-level vision. arXiv preprint arXiv:2112.10175 (2021) 29. Li, X., Wu, J., Lin, Z., Liu, H., Zha, H.: Recurrent squeeze-and-excitation context aggregation net for single image deraining. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11211, pp. 262–277. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01234-2 16 30. Li, Y., Tan, R.T., Guo, X., Lu, J., Brown, M.S.: Rain streak removal using layer priors. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2736–2744 (2016) 31. Li, Z., et al.: Blueprint separable residual network for efficient image superresolution. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 833–843 (2022) 32. Liang, J., Cao, J., Sun, G., Zhang, K., Van Gool, L., Timofte, R.: Swinir: image restoration using swin transformer. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 1833–1844 (2021) 33. Liu, P., Zhang, H., Zhang, K., Lin, L., Zuo, W.: Multi-level wavelet-CNN for image restoration. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pp. 773–782 (2018) 34. Liu, Y.L., et al.: Single-image HDR reconstruction by learning to reverse the camera pipeline. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 1651–1660 (2020) 35. Liu, Z., et al.: AdNet: attention-guided deformable convolutional network for high dynamic range imaging. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 463–470 (2021) 36. O’Sullivan, J.A., Blahut, R.E., Snyder, D.L.: Information-theoretic image formation. IEEE Trans. Inf. Theory 44(6), 2094–2123 (1998) 37. Park, D., Kang, D.U., Kim, J., Chun, S.Y.: Multi-temporal recurrent neural networks for progressive non-uniform single image deblurring with incremental temporal training. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12351, pp. 327–343. Springer, Cham (2020). https://doi.org/10. 1007/978-3-030-58539-6 20 38. Qin, Z., Tsai, Y.H., Yeh, Y.W., Huang, Y.P., Shieh, H.P.D.: See-through image blurring of transparent organic light-emitting diodes display: calculation method based on diffraction and analysis of pixel structures. J. Display Technol. 12(11), 1242–1249 (2016) 39. Qin, Z., Xie, J., Lin, F.C., Huang, Y.P., Shieh, H.P.D.: Evaluation of a transparent display’s pixel structure regarding subjective quality of diffracted see-through images. IEEE Photon. J. 9(4), 1–14 (2017) 40. Ren, D., Zuo, W., Hu, Q., Zhu, P., Meng, D.: Progressive image deraining networks: a better and simpler baseline. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3937–3946 (2019) 41. Ronneberger, O., Fischer, P., Brox, T.: U-Net: convolutional networks for biomedical image segmentation. In: Navab, N., Hornegger, J., Wells, W.M., Frangi, A.F. (eds.) MICCAI 2015. LNCS, vol. 9351, pp. 234–241. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-24574-4 28 42. Shao, Y., Li, L., Ren, W., Gao, C., Sang, N.: Domain adaptation for image dehazing. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 2808–2817 (2020)


43. Starck, J.L., Murtagh, F.: Astronomical Image and Data Analysis. Springer Berlin (2007). https://doi.org/10.1007/978-3-662-04906-8 44. Starck, J.L., Pantin, E., Murtagh, F.: Deconvolution in astronomy: a review. Publ. Astron. Soc. Pac. 114(800), 1051 (2002) 45. Tang, Q., Jiang, H., Mei, X., Hou, S., Liu, G., Li, Z.: 28–2: study of the image blur through ffs LCD panel caused by diffraction for camera under panel. In: SID Symposium Digest of Technical Papers, vol. 51, pp. 406–409. Wiley Online Library (2020) 46. Wang, X., Yu, K., Dong, C., Loy, C.C.: Recovering realistic texture in image superresolution by deep spatial feature transform. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 606–615 (2018) 47. Wang, X., et al.: ESRGAN: enhanced super-resolution generative adversarial networks. In: Leal-Taix´e, L., Roth, S. (eds.) ECCV 2018. LNCS, vol. 11133, pp. 63–79. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-11021-5 5 48. Wang, Z., Cun, X., Bao, J., Zhou, W., Liu, J., Li, H.: UforMer: a general u-shaped transformer for image restoration. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 17683–17693 (2022) 49. Zamir, S.W., Arora, A., Khan, S., Hayat, M., Khan, F.S., Yang, M.H.: Restormer: efficient transformer for high-resolution image restoration. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5728– 5739 (2022) 50. Zamir, S.W., Arora, A., Khan, S., Hayat, M., Khan, F.S., Yang, M.H.: RestorMer: efficient transformer for high-resolution image restoration. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5728– 5739 (2022) 51. Zhang, K., Zuo, W., Chen, Y., Meng, D., Zhang, L.: Beyond a gaussian denoiser: Residual learning of deep CNN for image denoising. IEEE Trans. Image Process. 26(7), 3142–3155 (2017) 52. Zhang, K., Zuo, W., Zhang, L.: FFDNet: toward a fast and flexible solution for CNN-based image denoising. IEEE Trans. Image Process. 27(9), 4608–4622 (2018) 53. Zhang, K., et al.: Deblurring by realistic blurring. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 2737– 2746 (2020) 54. Zhang, Y., Li, K., Li, K., Wang, L., Zhong, B., Fu, Y.: Image super-resolution using very deep residual channel attention networks. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11211, pp. 294–310. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01234-2 18 55. Zhou, Y., Ren, D., Emerton, N., Lim, S., Large, T.: Image restoration for underdisplay camera. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 9179–9188 (2021) 56. Zhu, L., Fu, C.W., Lischinski, D., Heng, P.A.: Joint bi-layer optimization for singleimage rain streak removal. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 2526–2534 (2017)

Enhanced Coarse-to-Fine Network for Image Restoration from Under-Display Cameras

Yurui Zhu1,2, Xi Wang1,2, Xueyang Fu1,2(B), and Xiaowei Hu1,2(B)

1 University of Science and Technology of China, Hefei, China
{zyr,wangxxi}@mail.ustc.edu.cn, [email protected], [email protected]
2 Shanghai AI Laboratory, Shanghai, China
Y. Zhu—This work was done during his internship at Shanghai AI Laboratory.

Abstract. New sensors and imaging systems are the indispensable foundation for mobile intelligent photography and imaging. Nowadays, under-display cameras (UDCs) demonstrate practical applicability in smartphones, laptops, tablets, and other scenarios. However, the images captured by UDCs suffer from complex image degradation issues, such as flare, haze, blur, and noise. To solve the above issues, we present an Enhanced Coarse-to-Fine Network (ECFNet) to effectively restore the UDC images, which takes multi-scale images as input and gradually generates multi-scale results from coarse to fine. We design two enhanced core components, i.e., the Enhanced Residual Dense Block (ERDB) and the multi-scale Cross-Gating Fusion Module (CGFM), in the ECFNet, and we further introduce progressive training and model ensemble strategies to enhance the results. Experimental results show superior performance against the existing state-of-the-art methods both quantitatively and visually, and our ECFNet achieves the best performance in terms of all the evaluation metrics in the MIPI-challenge 2022 Under-display Camera Image Restoration track. Our source codes are available at this repository.

Keywords: Under-display cameras · Deep neural networks · Image restoration

1 Introduction

Under-Display Cameras (UDCs) are an emerging imaging system technology, which has been applied to various scenarios, such as mobile phones, laptops, and video-conferencing with UDC TVs. This novel imaging system can meet the demands of full-screen devices with larger screen-to-body ratios, which is favored by many consumers and manufacturers. However, the display panel in front of the camera inevitably brings incoming light loss and diffraction, which in turn causes unwanted noise, blurring, fogging, and flare effects. The UDC degradation process in high dynamic range scenes [7] is formulated as

$\hat{y} = \varphi\left[\mathcal{C}(x \ast k + n)\right]$,   (1)

Y. Zhu—This work was done during his internship at Shanghai AI Laboratory.


where x and ŷ are the original image and the UDC-degraded image, respectively; ϕ(·) represents the non-linear tone mapping operation used to reproduce the appearance of high dynamic range images on a display with a more constricted dynamic range; C(·) is the truncation function, which limits the range of values for imaging; k refers to the Point Spread Function (PSF), which mainly brings diffraction artifacts; ∗ indicates the 2D convolution operation; and n denotes the noise from the camera device. To remove undesired image artifacts, many works based on deep neural networks (DNNs) have been proposed for UDC image restoration. For example, MSUNET [45] attempts to analyze the UDC degradation processes based on two common displays, i.e., the Transparent OLED (T-OLED) and the phone Pentile OLED (P-OLED). However, it only employs a U-Net-based structure to explore the mapping relationship between UDC-distorted images and clean images. Feng et al. [7] integrate the PSF prior and degradation knowledge to construct a dynamic network [14] to recover the desired high-quality images. Based on the optical properties of OLEDs, BNUDC [17] develops a two-branch network to reconstruct high- and low-frequency components independently. Although the existing methods show remarkable performance in terms of multiple evaluation metrics, restoring local high-frequency artifacts (e.g., flare caused by strong light) remains challenging and largely degrades the visual results. As mentioned before, images captured by UDCs usually contain many types of degradation (e.g., blur and diffraction). There is already a large number of techniques for image restoration tasks. For example, attention and adaptive gating mechanisms [2,3,6,8,19,23,27,35,40,43] have been widely used in low-level vision tasks, which helps the restoration network pay extra attention to local regions. NAFNet [4,20] explores the applications and variants of non-linear activation functions in their networks. Although these methods are not specifically designed for UDC restoration, they also provide potential inspiration for designing more effective UDC restoration frameworks. In this paper, inspired by [6], we design an effective Enhanced Coarse-to-Fine Network (ECFNet) to restore images from UDCs. To be specific, we maintain the multi-input encoder and multi-output decoder UNet-based framework, which gradually restores latent clean results at different scales in a coarse-to-fine manner. Moreover, we further explore several enhancement strategies in our ECFNet to better restore the images captured by UDCs. Firstly, considering the large resolution of UDC images, we adopt images with multiple resolutions as the inputs of the network and output restored images at different resolutions. Secondly, we present an Enhanced Residual Dense Block (ERDB) as the basic block of our backbone and explore the effect of different non-linear activation functions. Moreover, the multi-scale Cross-Gating Fusion Module (CGFM) is devised to merge cross-scale features and transfer the information from the multi-input encoder into the multi-output decoder with a cross-gating mechanism. Additionally, we adopt progressive training and model ensemble strategies to further improve the restoration performance.


Finally, we conduct various experiments to show the effectiveness of our method. The main contributions of our paper are summarized as follows:

– We present an effective Enhanced Coarse-to-Fine Network (ECFNet) to restore images captured by Under-Display Cameras (UDCs) with various types of degradation.
– We devise two enhanced core modules, i.e., the Enhanced Residual Dense Block (ERDB) and the multi-scale Cross-Gating Fusion Module (CGFM), to enhance the network by fully exploring the effects of non-linear activation functions and the feature interaction strategies among different scales.
– Experimental results demonstrate that our method significantly outperforms other UDC image restoration solutions. In the MIPI-challenge 2022 Under-Display Camera Image Restoration track, our ECFNet achieves first place in terms of all the evaluation scores (PSNR, SSIM, and LPIPS) and outperforms the others by a large margin.
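To make the degradation model of Eq. (1) concrete, the following is a minimal NumPy simulation sketch; the PSF, the Gaussian noise model, and the tone curve here are illustrative placeholders, not the challenge's actual imaging pipeline.

import numpy as np

def udc_degrade(x, psf, sigma=0.01, clip_max=1.0):
    # Illustrative simulation of Eq. (1): y = phi[C(x * k + n)].
    # x: clean image (H, W, C); psf: point spread function (h, w).
    H, W, C = x.shape
    psf = psf / psf.sum()
    # embed the PSF into an image-sized kernel and convolve in the frequency domain
    psf_pad = np.zeros((H, W))
    ph, pw = psf.shape
    psf_pad[:ph, :pw] = psf
    psf_pad = np.roll(psf_pad, (-(ph // 2), -(pw // 2)), axis=(0, 1))
    otf = np.fft.rfft2(psf_pad)
    blurred = np.stack([np.fft.irfft2(np.fft.rfft2(x[..., c]) * otf, s=(H, W))
                        for c in range(C)], axis=-1)
    noisy = blurred + np.random.normal(0.0, sigma, blurred.shape)  # n: sensor noise (Gaussian here)
    clipped = np.clip(noisy, 0.0, clip_max)                        # C(.): truncation to the sensor range
    return clipped / (clipped + 0.25)                              # phi(.): a simple placeholder tone curve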

2 Related Work

UDCs Restoration. For the development of full-screen smartphones, under-display cameras (UDCs) are a crucial technology. Due to the light loss and diffraction introduced by the front display panel, the image captured by a UDC usually suffers some degree of degradation. The UDC system has been analyzed in several previous works. An Edge Diffusion Function model for transparent OLEDs is developed by Kwon et al. [18]. Based on diffraction theory, Qin et al. [25] propose an accurate method for calculating the perspective image and demonstrate that image blur can be suppressed by modifying the pixel structure of transparent OLED displays. Qin et al. [26] and Tang et al. [29] describe and analyze the diffraction effects of the UDC system. However, these methods do not recover already degraded UDC images. [44,45] are the first works to perform UDC image restoration on publicly released datasets. By modeling the degraded images captured by the UDC, Zhou et al. [45] design an MCIS (Monitor-Camera Imaging System) to capture paired images and propose a data synthesis pipeline for generating UDC images from natural images. Based on these synthetic data, they use a variation of U-Net to achieve UDC image restoration. However, they only address a single type of degradation and are difficult to apply to UDC images with real degradation. DAGF [28] proposes a Deep Atrous Guided Filter for image restoration in UDC systems. Panikkasseril et al. [24] propose two different networks to recover blurred images captured using two types of UDC techniques. Yang et al. [33] use residual dense networks to achieve UDC image restoration. Kwon et al. [18] propose a novel controllable image restoration framework for UDCs. Feng et al. [7] achieve restoration under multiple PSFs by conditioning on the PSF of the UDC image. BNUDC [17] proposes a two-branch DNN architecture for recovering the high-frequency and low-frequency components of an image, respectively.


Fig. 1. The architecture of our proposed Enhanced Coarse-to-Fine Network (ECFNet) to restore the images captured by under-display cameras.

Coarse-to-Fine Networks. In many image restoration tasks [6,9,21,30,38], the coarse-to-fine network design has been shown to be effective. For example, MIMO-UNet [6] rethinks the coarse-to-fine strategy and presents a UNet-like network to progressively recover clean images at different scales. Our network is built on MIMO-UNet. However, we make further adjustments to the basic network module, the hierarchical feature interaction manner, and the number of network levels to accommodate the UDC restoration task.

3 Method

Our proposed Enhanced Coarse-to-Fine Network (ECFNet) is built with a multi-input encoder and multi-output decoder UNet-based framework, which gradually generates the desired high-quality results at multiple scales in a coarse-to-fine manner. The specific architecture of our ECFNet is presented in Fig. 1, which is based on MIMO-UNet [6]. Different from MIMO-UNet, we expand the scales of the network and devise an enhanced basic block to improve the network capability. Furthermore, we build the Cross-Gating Fusion Module (CGFM) to transfer the information flow among the encoders and decoders of different scales. The specific components of our ECFNet are thoroughly described in the following subsections.

3.1 Enhanced Encoder

The Enhanced Encoder consists of four sub-networks, which take images at different resolutions as input. Different from MIMO-UNet [6], we further strengthen the basic blocks to enhance the capacity of our model. Figure 2 depicts our proposed Enhanced Residual Dense Block, which comprises three convolution layers with a kernel size of 3 × 3. Additionally, there are several skip


Fig. 2. The structures of sub-modules in the main architecture.

connections across the various levels to construct a dense network structure. Such a residual dense connection manner has been proven effective in many computer vision tasks [13,41,42], as it extracts richer contextual information and transfers the information flow from shallow to deep layers. To build a more effective residual dense block, we further exploit the effect of non-linear activation functions. Inspired by [4], GELU [12] appears to be more effective than other activation functions, e.g., ReLU [1] and PReLU [10]. Referring to [12], GELU can be approximately written as

$\mathrm{GELU}(x) \approx 0.5x\left(1 + \tanh\left(\sqrt{2/\pi}\,(x + 0.044715x^{3})\right)\right).$   (2)

Specifically, we replace the plain ReLU with GELU in our Enhanced Residual Dense Block (ERDB), which brings an obvious performance gain. Hence, we utilize the ERDB as the basic block of our enhanced encoder. At each scale of the enhanced encoder, there are six ERDBs in total.
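A minimal PyTorch sketch of one possible ERDB realization is given below: three 3 × 3 convolutions, dense concatenation skips, GELU activations, and a local residual connection. The channel and growth sizes are illustrative assumptions, not the authors' exact configuration.

import torch
import torch.nn as nn

class ERDB(nn.Module):
    # Sketch of an Enhanced Residual Dense Block with GELU activations.
    def __init__(self, channels=32, growth=32):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, growth, 3, padding=1)
        self.conv2 = nn.Conv2d(channels + growth, growth, 3, padding=1)
        self.conv3 = nn.Conv2d(channels + 2 * growth, channels, 3, padding=1)
        self.act = nn.GELU()

    def forward(self, x):
        f1 = self.act(self.conv1(x))
        f2 = self.act(self.conv2(torch.cat([x, f1], dim=1)))
        f3 = self.conv3(torch.cat([x, f1, f2], dim=1))
        return x + f3  # local residual connection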

3.2 Enhanced Decoder

Similar to the Enhanced Encoder, the Enhanced Decoder (ED) also consists of sub-networks and takes the ERDB as its basic block. Moreover, the ED generates four high-quality results of different sizes, and we impose the corresponding intermediate constraints. Besides, we further introduce the Supervised Attention Module (SAM) [36] between every two adjacent scales of the ED, which enables a considerable performance improvement. The specific structure of the SAM is shown in Fig. 2(b).

3.3 Cross-Gating Fusion Module

Many UNet-based image restoration methods [33,34] propagate the feature information flow at each scale of the encoder to the


Fig. 3. Analysis of frequency domain. (a) Amplitude difference acquisition pipeline of one pair image. (b) Average amplitude difference map on the evaluation dataset.

decoder of the corresponding size to increase the diversity of features and further improve the performance of the model. Furthermore, some methods [6,15,37] cascade features of different scales to allow the information flow between the encoder and the decoder in a top-to-bottom and bottom-to-top manner. In this paper, we not only adopt dense concatenation across features of different scales, but also introduce gating mechanisms to further enhance the capacity of our model. As shown in Fig. 2, we propose the multi-scale Cross-Gating Fusion Module (CGFM) to enhance the information flow between the encoder and the decoder. CGFM first collects the features at four scales through a concatenation operation, then adjusts the number of feature channels through a 1 × 1 convolutional layer. The features are then evenly divided into two parts, and a gating operation [34] is performed on the two parts sequentially. Finally, the enhanced feature is obtained through a concatenation operation. Therefore, the CGFM at the smallest scale can be expressed as

$F = \mathrm{Conv}_{1\times1}(\mathrm{Concat}([(EEB^{out}_{1})_{\downarrow}, (EEB^{out}_{2})_{\downarrow}, (EEB^{out}_{3})_{\downarrow}, EEB^{out}_{4}]))$,
$F_{1}, F_{2} = \mathrm{Split}(F)$,
$\hat{F}_{1} = F_{2} \otimes F_{1}, \quad \hat{F}_{2} = F_{1} \otimes F_{2}$,
$F_{e} = \mathrm{Concat}([\hat{F}_{1}, \hat{F}_{2}])$,   (3)

where $EEB^{out}_{i}$, i = 1, 2, 3, 4, denotes the output of the i-th scale enhanced encoder blocks; Concat(·) denotes the concatenation operation along the channel dimension; Conv1×1(·) denotes the convolution operation with a kernel size of 1 × 1; ↓ denotes the down-sampling operation; Split(·) denotes the split operation along the channel dimension; and ⊗ denotes element-wise multiplication.
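As a rough illustration of Eq. (3), the sketch below gives one possible PyTorch realization of the cross-gating fusion at the smallest scale. The channel numbers, the use of bilinear interpolation for the down-sampling, and the sequential-update reading of the two gating steps are assumptions, not details taken from the paper.

import torch
import torch.nn as nn
import torch.nn.functional as F

class CGFM(nn.Module):
    # Sketch of the multi-scale Cross-Gating Fusion Module at the smallest scale.
    def __init__(self, enc_channels=(32, 64, 128, 256), out_channels=256):
        super().__init__()
        self.fuse = nn.Conv2d(sum(enc_channels), out_channels, kernel_size=1)

    def forward(self, e1, e2, e3, e4):
        size = e4.shape[-2:]
        # down-sample the three larger-scale encoder features to the smallest scale
        feats = [F.interpolate(e, size=size, mode='bilinear', align_corners=False)
                 for e in (e1, e2, e3)] + [e4]
        f = self.fuse(torch.cat(feats, dim=1))     # Concat + Conv1x1
        f1, f2 = torch.chunk(f, 2, dim=1)          # Split(F)
        f1 = f2 * f1                               # first gating: F1 gated by F2
        f2 = f1 * f2                               # second gating (sequential reading): F2 gated by the updated F1
        return torch.cat([f1, f2], dim=1)          # Fe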


Table 1. Quantitative comparisons of methods on the official testing datasets of the MIPI-challenge 2022 Under-display Camera Image Restoration track. The best and the second results are boldfaced and underlined, respectively. Note that ECFNet-s3 denotes that the model is obtained with the multiple training strategies.

MIPI 2022 UDC Challenge Track   PSNR ↑    SSIM ↑   LPIPS ↓
ECFNet-s3 (Ours)                48.4776   0.9934   0.0093
2nd                             47.7821   0.9913   0.0122
3rd                             46.9227   0.9929   0.0098
4th                             46.1177   0.9892   0.0159
5th                             45.8722   0.9920   0.0109
6th                             45.8227   0.9917   0.0106
7th                             45.5613   0.9912   0.0129
8th                             44.0504   0.9908   0.0120
9th                             43.4514   0.9897   0.0125
10th                            43.4360   0.9899   0.0133
11th                            43.1300   0.9872   0.0144
12th                            42.9475   0.9892   0.0150
13th                            42.0392   0.9873   0.0155
14th                            39.5205   0.9820   0.0216
15th                            37.4569   0.9973   0.0370
16th                            36.6961   0.9783   0.0326
17th                            35.7726   0.9719   0.0458
18th                            35.5031   0.9616   0.0453
19th                            32.7496   0.9591   0.0566

3.4 Loss Functions

Because ECFNet has multi-scale outputs, we naturally combine the Charbonnier loss [31] and the multi-scale content losses [22] to optimize our network, which is written as follows:

$\mathcal{L}_{content} = \sum_{k=1}^{K} \sqrt{\| I^{k}_{pre} - I^{k}_{gt} \|^{2} + \varepsilon^{2}}$,   (4)

where K is the number of scales, which equals 4 by default; $I^{k}_{pre}$ is the predicted result at scale k; $I^{k}_{gt}$ is the ground truth at scale k; and ε is set to 0.0001 by default. Besides the content loss, we also exploit frequency-domain information to provide an auxiliary loss for our network. As shown in Fig. 3, we first provide the visualization of the amplitude difference map in the frequency domain, which indicates that the various degradations brought by UDCs mainly affect the information in the mid- and high-frequency regions.


Table 2. Quantitative comparisons of methods on the official evaluation and testing datasets of TOLED [5]. Note that we reduce the size of the original model similar to BNUDC [17] for fair comparisons. The average inference time (IT) is obtained with the TOLED dataset on the NVIDIA 1080Ti GPU device.

Method (TOLED)        Params (M)  IT (s)  Test PSNR↑  Test SSIM↑  Test LPIPS↓  Val. PSNR↑  Val. SSIM↑  Val. LPIPS↓
MSUNET [45]           8.9         0.33    37.40       0.976       0.109        38.25       0.977       0.117
IPIUer [44]           24.7        0.43    38.18       0.979       0.113        39.03       0.981       0.121
BAIDU [44]            20.0        0.18    38.23       0.980       0.102        39.06       0.981       0.111
DAGF [28]             1.1         1.30    37.27       0.973       0.141        37.46       0.943       0.090
BNUDC [17]            4.6         0.36    38.22       0.979       0.099        39.09       0.981       0.107
ECFNet-small (ours)   4.1         0.04    37.79       0.953       0.061        38.49       0.955       0.060

Table 3. Ablation study of basic blocks and modules.

Models    Experiment Setting                           #Param (M)  PSNR ↑  SSIM ↑
Model-1   Basic Block: Residual Block w/ LeakyReLU     16.82       34.92   0.976
Model-2   Basic Block: Residual Block w/ GELU          16.82       35.09   0.981
Model-3   Dense Residual Block w/ ReLU                 16.85       43.33   0.992
Model-4   Dense Residual Block w/ LeakyReLU            16.85       44.33   0.993
Model-5   Dense Residual Block w/ PReLU                16.85       42.98   0.992
Default   Dense Residual Block w/ GELU                 16.85       45.08   0.994
Model-6   Modules: w/o SAM                             16.65       44.83   0.993
Default   Modules: w/ SAM                              16.85       45.08   0.994

Such information loss patterns are not only obvious on a single pair of images, but also on the entire evaluation dataset. Hence, we further apply a multi-scale frequency loss to optimize our network, which is defined as follows:

$\mathcal{L}_{frequency} = \sum_{k=1}^{K} \| \mathcal{F}(I^{k}_{pre}) - \mathcal{F}(I^{k}_{gt}) \|_{1}$,   (5)

where $\mathcal{F}(\cdot)$ indicates the Fast Fourier Transform (FFT). Finally, the total loss can be defined as

$\mathcal{L}_{total} = \mathcal{L}_{content} + \lambda \mathcal{L}_{frequency}$,   (6)

where λ denotes the balancing weight, and we empirically set λ to 0.5 by default.
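A minimal PyTorch sketch of the combined objective in Eqs. (4)-(6) is given below. It uses the common per-pixel form of the Charbonnier term and treats preds/gts as lists of multi-scale outputs and targets; these interface choices are assumptions rather than the authors' exact implementation.

import torch

def charbonnier_fft_loss(preds, gts, eps=1e-4, lam=0.5):
    # preds, gts: lists of tensors, one per scale (K = 4 in the paper).
    content, freq = 0.0, 0.0
    for p, g in zip(preds, gts):
        content = content + torch.sqrt((p - g) ** 2 + eps ** 2).mean()   # Eq. (4), per-pixel Charbonnier
        fp, fg = torch.fft.fft2(p), torch.fft.fft2(g)
        freq = freq + (fp - fg).abs().mean()                              # Eq. (5), L1 on FFT spectra
    return content + lam * freq                                           # Eq. (6), lambda = 0.5 by default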

4 Experiments

4.1 Implementation Details

We implement our proposed UDC image restoration network on the PyTorch 1.8 platform. The Adam optimizer [16] with parameters β1 = 0.9 and β2 = 0.99 is adopted to optimize our network.

Table 4. Ablation study of the losses.

Losses                                   PSNR ↑   SSIM ↑
L1 Loss                                  42.26    0.989
Charbonnier L1 Loss                      42.50    0.990
L1 Loss + Frequency Loss                 44.83    0.993
Charbonnier L1 Loss + Frequency Loss     45.08    0.994

Table 5. Ablation study of the connection manner between Enhanced Encoder (EE) and Enhanced Decoder (ED).

Connection Manner Between EE and ED             PSNR ↑   SSIM ↑
Simple Concatenation                            44.00    0.991
Single Gating Fusion                            44.87    0.993
Cross-Gating Fusion Module (CGFM) (default)     45.08    0.994

Additionally, motivated by [34], we introduce a progressive training strategy, and the training of our network is divided into three stages: (1) We adopt the Adam optimizer with a batch size of three and a patch size of 256 × 256. The initial learning rate is 2 × 10−4 and is adjusted with the Cosine Annealing scheme, with 1000 epochs in total. The first stage is performed on an NVIDIA 1080Ti GPU. We take the best model at this stage as the initialization of the second stage. (2) We adopt the Adam optimizer with a batch size of one and a patch size of 512 × 512. The initial learning rate is 1 × 10−5 and is adjusted with the Cosine Annealing scheme, with 300 epochs in total. The second stage is performed on an NVIDIA 1080Ti GPU. We take the best model at this stage as the initialization of the next stage. (3) We adopt the Adam optimizer with a batch size of two and a patch size of 800 × 800. The initial learning rate is 8 × 10−6 and is adjusted with the Cosine Annealing scheme, with 150 epochs in total. The third stage is performed on an NVIDIA A100 GPU. In this stage, we additionally expand the training dataset with the testing set of the SYNTH dataset [7]. Finally, we adopt the model ensemble strategy, which averages the parameters of multiple models trained with different numbers of training iterations. To better distinguish the results of the models, we denote the results of the three stages as ECFNet-s1, ECFNet-s2 and ECFNet-s3. In the testing phase, we adopt the fine-tuned model to achieve the best performance. Moreover, we utilize the model-ensemble strategy to obtain better results.
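The model ensemble described above is plain parameter averaging over checkpoints; a minimal sketch could look like the following, where the checkpoint paths are hypothetical placeholders.

import torch

def average_checkpoints(state_dicts):
    # Average the parameters of several checkpoints saved at different training iterations.
    # Note: integer buffers (e.g., BatchNorm counters) may need special handling in practice.
    return {k: sum(sd[k].float() for sd in state_dicts) / len(state_dicts)
            for k in state_dicts[0]}

# usage sketch:
# ckpts = [torch.load(p, map_location='cpu') for p in ('ecfnet_a.pth', 'ecfnet_b.pth')]
# model.load_state_dict(average_checkpoints(ckpts))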


Table 6. Ablation study of the numbers of network inputs and outputs.

Numbers of Network Inputs and Outputs    PSNR ↑   SSIM ↑
Single Input, Single Output              44.49    0.992
Three Inputs, Three Outputs              44.68    0.993
Four Inputs, Four Outputs (default)      45.08    0.994

Table 7. Ablation study of training strategies.

Results with different strategies                            PSNR ↑   SSIM ↑
ECFNet-s1                                                    45.08    0.994
ECFNet-s2 + w/o External Data                                46.23    0.995
ECFNet-s2 + w/ External Data                                 47.80    0.996
ECFNet-s3 + w/ External Data                                 49.04    0.995
ECFNet-s3 + w/ External Data + Model Ensemble (default)      49.13    0.996

11 GB of GPU memory is enough for inference of our model, and we use one NVIDIA 1080Ti GPU with 11 GB of memory for testing.

4.2 Dataset

We conduct the experiments strictly following the setting of the MIPI-challenge 2022 Under-display Camera Image Restoration track. There are 2016 pairs of 800 × 800 × 3 images in the training split. Images are provided in ".npy" format, with values in the range [0, 500]. There are 40 image pairs in the validation set and 40 image pairs in the testing set, respectively. Note that the corresponding ground truths of the testing set are not publicly available. Besides the above dataset, at the training phase we also exploit an additional 320 pairs from the testing set of the SYNTH dataset [7] to augment the training data. However, our network never requires the Point Spread Function (PSF) as an external prior to predict the final results, which makes it more flexible than DISCNet [7]. Only for the challenge dataset do we adopt the additional data to improve the restoration performance of our model. Moreover, in order to validate the effectiveness of our method, we also employ the TOLED dataset [5] to train our network. The TOLED dataset consists of 300 image pairs in total, divided into 240 image pairs for training, 30 image pairs for evaluation, and 30 image pairs for testing. Note that the models trained on these two datasets are different and are applied to the corresponding test datasets for testing.
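A minimal sketch of loading one training pair as described above is given below; the file naming is hypothetical, and normalizing the [0, 500] range to [0, 1] is one reasonable choice rather than the authors' documented preprocessing.

import numpy as np

def load_pair(lq_path, gt_path):
    # .npy arrays with values in [0, 500], normalized to [0, 1] here.
    lq = np.load(lq_path).astype(np.float32) / 500.0
    gt = np.load(gt_path).astype(np.float32) / 500.0
    return lq, gt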

4.3 Evaluation Metrics

Similar to previous methods [7,17], we employ three reference-based metrics to verify the effectiveness of our method: Peak Signal-to-Noise Ratio (PSNR), the structural similarity (SSIM) [32], and Learned Perceptual Image Patch Similarity (LPIPS) [39]. For the PSNR and SSIM metrics, higher is better. For the LPIPS metric, lower means better.
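As a small illustration of these reference-based metrics, the sketch below computes PSNR directly; SSIM and LPIPS are usually computed with existing implementations (e.g., skimage.metrics.structural_similarity and the lpips package), which are named here only as typical choices.

import numpy as np

def psnr(pred, gt, data_range=1.0):
    # Peak Signal-to-Noise Ratio; higher is better.
    mse = np.mean((pred.astype(np.float64) - gt.astype(np.float64)) ** 2)
    return float('inf') if mse == 0 else 10.0 * np.log10(data_range ** 2 / mse)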

Fig. 4. Visual comparison results of UDC Image Restoration on the evaluation dataset of MIPI-challenge 2022 track. Note that brighter means bigger error.

4.4 Comparisons

In Table 1, we report the comparison results among different solutions on the challenge track. Our method achieves the best performance in terms of all the evaluation metrics, 0.7 dB higher than the second-place method. Besides, in Table 2 we achieve comparable performance on the TOLED evaluation and testing datasets. For a fair comparison, note that we reduce the size of the original model similarly to BNUDC [17]. Moreover, our method is seventeen times faster than BNUDC on the TOLED dataset. In addition, in


Fig. 5. Visual comparison results of UDC Image Restoration on the testing dataset of MIPI-challenge 2022 track.

Figs. 4, 5 and 6, to demonstrate that the images generated by our network have better visual quality, we show the residual maps of the other methods and ours on the UDC datasets to increase the visual differentiation. Note that the residual map is the difference between the estimated result and the ground truth. It is clear that our method achieves better visual results, with better recovery of edge flare and less difference between the generated images and the ground truth.

4.5 Ablation Study

We conduct extensive experiments to verify the effect of each component of our method, e.g., modules, strategies, and losses. Note that in the ablation study we only use the first-stage training manner (referred to as ECFNet-s1) for convenience.

Effects of Basic Blocks and Modules. As reported in Table 3, we compare the performance of models with different non-linear activation functions, including ReLU, LeakyReLU, PReLU, and GELU. Using GELU performs better than the other activation functions when embedded in both the standard residual block [11] and the standard dense residual block [13]. For example, compared to using LeakyReLU, using GELU brings a 0.75 dB performance gain. In Table 3, we further conduct experiments to verify the effectiveness of the SAM. Using the SAM [36] aims at bridging the relation between different scales, which facilitates a 0.25 dB performance gain.


Effects of the Coarse-to-Fine Strategy. Our restoration framework is built on the coarse-to-fine strategy to gradually recover latent clean results at different scales. We further report the results of the model with different numbers of inputs and outputs in Table 6, which validates the effectiveness of the coarse-to-fine strategy. Using four inputs and four outputs achieves the best results. Compared with the baseline, using multiple outputs to optimize our model brings about a 0.59 dB performance improvement.

Fig. 6. Visual comparison results of UDC Image Restoration on the TOLED dataset.

Effects of the Loss Functions. In addition to the content loss, we introduce $\mathcal{L}_{frequency}$ to optimize the network in the frequency domain. The network performance with different losses is reported in Table 4. Furthermore, we provide visual comparisons of models optimized by different loss combinations in Fig. 7. Combining the Charbonnier L1 loss and the frequency loss brings the best model performance.

Effects of the Training Strategies. Following Restormer [34], we additionally adopt the progressive training strategy to enhance model performance. As shown in Table 7, the models trained at different stages are marked as ECFNet-s1, ECFNet-s2, and ECFNet-s3. The experiments show that training with progressively larger patches often leads to better generalization. The model ensemble strategy, whose results are obtained by linearly combining the parameters of several models from different training iterations, brings a further increase of around 0.09 dB (PSNR) on the evaluation dataset.

Inference Time. The average inference time per image of our standard ECFNet is 0.2665 s, obtained at an image resolution of 800 × 800 on an NVIDIA 1080Ti GPU.


Fig. 7. Model results with different loss combinations. (a) w. L1 Loss; (b) w. Charbonnier L1 Loss; (c) w. L1 Loss and FFT loss; (d) w. Charbonnier L1 Loss and FFT loss (default).

5 Conclusions

In this paper, we design an effective network for restoring images from under-display cameras, named Enhanced Coarse-to-Fine Network (ECFNet). It inherits the classic coarse-to-fine network framework and further formulates two enhanced core modules, i.e., the Enhanced Residual Dense Block (ERDB) and the multi-scale Cross-Gating Fusion Module (CGFM). Besides, we introduce progressive training and model ensemble strategies to further improve model performance. Finally, ECFNet achieves the best performance in terms of all the evaluation metrics in the MIPI-challenge 2022 UDC Image Restoration track.

Acknowledgement. This work was supported by the National Natural Science Foundation of China (NSFC) under Grant 61901433 and in part by the USTC Research Funds of the Double First-Class Initiative under Grant YD2100002003. This work is partially supported by the Shanghai Committee of Science and Technology (Grant No. 21DZ1100100).

References 1. Agarap, A.F.: Deep learning using rectified linear units (ReLU). arXiv preprint arXiv:1803.08375 (2018) 2. Anwar, S., Barnes, N.: Real image denoising with feature attention. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 3155–3164 (2019) 3. Chen, D., et al.: Gated context aggregation network for image dehazing and deraining. In: 2019 IEEE Winter Conference on Applications of Computer Vision (WACV), pp. 1375–1383. IEEE (2019) 4. Chen, L., Chu, X., Zhang, X., Sun, J.: Simple baselines for image restoration. In: Avidan, S., Brostow, G., Cisse, M., Farinella, G.M., Hassner, T. (eds.) Computer Vision – ECCV 2022. LNCS, vol. 13667. Springer, Cham. https://doi.org/10.1007/ 978-3-031-20071-7 2


5. Cheng, C.J., et al.: P-79: evaluation of diffraction induced background image quality degradation through transparent OLED display. In: SID Symposium Digest of Technical Papers, vol. 50, pp. 1533–1536. Wiley Online Library (2019) 6. Cho, S.J., Ji, S.W., Hong, J.P., Jung, S.W., Ko, S.J.: Rethinking coarse-to-fine approach in single image deblurring. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 4641–4650 (2021) 7. Feng, R., Li, C., Chen, H., Li, S., Loy, C.C., Gu, J.: Removing diffraction image artifacts in under-display camera via dynamic skip connection network. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 662–671 (2021) 8. Fu, X., Qi, Q., Zha, Z.J., Zhu, Y., Ding, X.: Rain streak removal via dual graph convolutional network. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 35, pp. 1352–1360 (2021) 9. Gao, H., Tao, X., Shen, X., Jia, J.: Dynamic scene deblurring with parameter selective sharing and nested skip connections. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3848–3856 (2019) 10. He, K., Zhang, X., Ren, S., Sun, J.: Delving deep into rectifiers: surpassing humanlevel performance on ImageNet classification. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 1026–1034 (2015) 11. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016) 12. Hendrycks, D., Gimpel, K.: Gaussian error linear units (GELUs). arXiv preprint arXiv:1606.08415 (2016) 13. Huang, G., Liu, Z., Van Der Maaten, L., Weinberger, K.Q.: Densely connected convolutional networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4700–4708 (2017) 14. Jia, X., De Brabandere, B., Tuytelaars, T., Gool, L.V.: Dynamic filter networks. In: 29th Proceedings of Conference on Advances in Neural Information Processing System (2016) 15. Kim, S.-W., Kook, H.-K., Sun, J.-Y., Kang, M.-C., Ko, S.-J.: Parallel feature pyramid network for object detection. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11209, pp. 239–256. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01228-1 15 16. Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) 17. Koh, J., Lee, J., Yoon, S.: BNUDC: a two-branched deep neural network for restoring images from under-display cameras. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 1950–1959 (2022) 18. Kwon, H.J., Yang, C.M., Kim, M.C., Kim, C.W., Ahn, J.Y., Kim, P.R.: Modeling of luminance transition curve of transparent plastics on transparent OLED displays. Electr. Imaging 2016(20), 1–4 (2016) 19. Liu, D., Wen, B., Fan, Y., Loy, C.C., Huang, T.S.: Non-local recurrent network for image restoration. In: 31st Proceedings of Conference on Advances in Neural Information Processing Systems (2018) 20. Liu, Z., Mao, H., Wu, C.Y., Feichtenhofer, C., Darrell, T., Xie, S.: A convnet for the 2020s. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 11976–11986 (2022) 21. Ma, Y., Liu, X., Bai, S., Wang, L., He, D., Liu, A.: Coarse-to-fine image inpainting via region-wise convolutions and non-local correlation. In: IJCAI, pp. 3123–3129 (2019)


22. Nah, S., Hyun Kim, T., Mu Lee, K.: Deep multi-scale convolutional neural network for dynamic scene deblurring. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3883–3891 (2017) 23. Pan, X., Zhan, X., Dai, B., Lin, D., Loy, C.C., Luo, P.: Exploiting deep generative prior for versatile image restoration and manipulation. IEEE Trans. Pattern Anal. Mach. Intell. 44, 7474–7489 (2021) 24. Panikkasseril Sethumadhavan, H., Puthussery, D., Kuriakose, M., Charangatt Victor, J.: Transform domain pyramidal dilated convolution networks for restoration of under display camera images. In: Bartoli, A., Fusiello, A. (eds.) ECCV 2020. LNCS, vol. 12539, pp. 364–378. Springer, Cham (2020). https://doi.org/10.1007/ 978-3-030-68238-5 28 25. Qin, Z., Tsai, Y.H., Yeh, Y.W., Huang, Y.P., Shieh, H.P.D.: See-through image blurring of transparent organic light-emitting diodes display: calculation method based on diffraction and analysis of pixel structures. J. Display Technol. 12(11), 1242–1249 (2016) 26. Qin, Z., Xie, J., Lin, F.C., Huang, Y.P., Shieh, H.P.D.: Evaluation of a transparent display’s pixel structure regarding subjective quality of diffracted see-through images. IEEE Photonics J. 9(4), 1–14 (2017) 27. Ren, W., et al.: Gated fusion network for single image dehazing. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3253–3261 (2018) 28. Sundar, V., Hegde, S., Kothandaraman, D., Mitra, K.: Deep Atrous guided filter for image restoration in under display cameras. In: Bartoli, A., Fusiello, A. (eds.) ECCV 2020. LNCS, vol. 12539, pp. 379–397. Springer, Cham (2020). https://doi. org/10.1007/978-3-030-68238-5 29 29. Tang, Q., Jiang, H., Mei, X., Hou, S., Liu, G., Li, Z.: 28–2: study of the image blur through FFS LCD panel caused by diffraction for camera under panel. In: SID Symposium Digest of Technical Papers, vol. 51, pp. 406–409. Wiley Online Library (2020) 30. Wang, L., Li, Y., Wang, S.: Deepdeblur: fast one-step blurry face images restoration. arXiv preprint arXiv:1711.09515 (2017) 31. Wang, X., Chan, K.C., Yu, K., Dong, C., Change Loy, C.: EDVR: video restoration with enhanced deformable convolutional networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (2019) 32. Wang, Z., Bovik, A.C., Sheikh, H.R., Simoncelli, E.P.: Image quality assessment: from error visibility to structural similarity. IEEE Trans. Image Process. 13(4), 600–612 (2004) 33. Yang, Q., Liu, Y., Tang, J., Ku, T.: Residual and dense UNet for under-display camera restoration. In: Bartoli, A., Fusiello, A. (eds.) ECCV 2020. LNCS, vol. 12539, pp. 398–408. Springer, Cham (2020). https://doi.org/10.1007/978-3-03068238-5 30 34. Zamir, S.W., Arora, A., Khan, S., Hayat, M., Khan, F.S., Yang, M.H.: Restormer: efficient transformer for high-resolution image restoration. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5728– 5739 (2022) 35. Zamir, S.W., et al.: Multi-stage progressive image restoration. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Rrecognition, pp. 14821–14831 (2021) 36. Zamir, S.W., et al.: Multi-stage progressive image restoration. In: CVPR (2021)


37. Zhang, H., Dai, Y., Li, H., Koniusz, P.: Deep stacked hierarchical multi-patch network for image deblurring. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5978–5986 (2019) 38. Zhang, K., Tao, D., Gao, X., Li, X., Li, J.: Coarse-to-fine learning for single-image super-resolution. IEEE Trans. Neural Netw. Learn. Syst. 28(5), 1109–1122 (2016) 39. Zhang, R., Isola, P., Efros, A.A., Shechtman, E., Wang, O.: The unreasonable effectiveness of deep features as a perceptual metric. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 586–595 (2018) 40. Zhang, Y., Li, K., Li, K., Wang, L., Zhong, B., Fu, Y.: Image super-resolution using very deep residual channel attention networks. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11211, pp. 294–310. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01234-2 18 41. Zhang, Y., Tian, Y., Kong, Y., Zhong, B., Fu, Y.: Residual dense network for image super-resolution. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2472–2481 (2018) 42. Zhang, Y., Tian, Y., Kong, Y., Zhong, B., Fu, Y.: Residual dense network for image restoration. IEEE Trans. Pattern Anal. Mach. Intell. 43(7), 2480–2495 (2020) 43. Zhao, H., Kong, X., He, J., Qiao, Yu., Dong, C.: Efficient image super-resolution using pixel attention. In: Bartoli, A., Fusiello, A. (eds.) ECCV 2020. LNCS, vol. 12537, pp. 56–72. Springer, Cham (2020). https://doi.org/10.1007/978-3-03067070-2 3 44. Zhou, Y., et al.: UDC 2020 challenge on image restoration of under-display camera: methods and results. In: Bartoli, A., Fusiello, A. (eds.) ECCV 2020. LNCS, vol. 12539, pp. 337–351. Springer, Cham (2020). https://doi.org/10.1007/978-3-03068238-5 26 45. Zhou, Y., Ren, D., Emerton, N., Lim, S., Large, T.: Image restoration for underdisplay camera. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 9179–9188 (2021)

Learning to Joint Remosaic and Denoise in Quad Bayer CFA via Universal Multi-scale Channel Attention Network

Xun Wu1, Zhihao Fan2, Jiesi Zheng3, Yaqi Wu4(B), and Feng Zhang5

1 Tsinghua University, Beijing 100084, China
[email protected]
2 University of Shanghai for Science and Technology, Shanghai 200093, China
[email protected]
3 Zhejiang University, Hangzhou 310027, China
[email protected]
4 Harbin Institute of Technology, Harbin 150001, China
[email protected]
5 Shanghai AI Laboratory, Shanghai, China
[email protected]

Abstract. The color filter arrays widely used in smartphones are mainly Quad Bayer and Bayer. The Quad Bayer color filter array (QBC) is a filter shared by four pixels, which can improve image quality under low-light conditions by averaging the four pixels in a 2 × 2 neighborhood. Converting from low-resolution Bayer to full-resolution Bayer has become a very challenging problem, especially in the presence of noise. Considering both denoising and remosaicing, we propose a general two-stage framework, JRD-QBC (Joint Remosaic and Denoise in Quad Bayer CFA). For the denoising stage, in order to preserve the differences among the color channels during recovery, we first convert the input into a hollow QBC and then feed it into our backbone network, which includes a source encoder module, a feature refinement module, and a final prediction module. After that, we obtain a clean QBC and use the same network structure for remosaicing to generate the Bayer output. Extensive experiments demonstrate that the proposed two-stage method performs well both in quantitative indicators and in subjective visual quality.

Keywords: Quad Bayer CFA · Remosaic · Denoise · Raw · Channel attention

1 Introduction

As higher resolution and better nighttime imaging capability of smart cameras have been increasingly desired over the past few years, many smartphones such as the Galaxy S20 FE and Xiaomi Redmi Note8 Pro [1] have started to use the Quad Bayer CFA (QBC)

X. Wu and Z. Fan—Both authors contributed equally to this research.


recently. Different from Standard Bayer CFA, four pixels of the same color are grouped in each 2 × 2 Quad Bayer sensor cells. Sensors equipped with QBC use pixel binning under low light to achieve enhanced image quality by averaging four pixels within a 2 × 2 neighborhood. While Signal Noise Ratio (SNR) is improved in the binning mode, the spatial resolution is reduced unfortunately as a tradeoff. When it comes to full image resolution imaging, the performance degrades due to the non-uniformly type of sampling. To allow the output Bayer to have the same spatial resolution as the input QBC under normal lighting conditions while maintaining the nighttime imaging capability, the interpolation process (usually referred to as Remosaic, as shown in Fig. 1) is used to convert QBC to a Bayer pattern.

Fig. 1. Structures of Quad Bayer CFA and Standard Bayer CFA and Remosaic process illustration.

The remosaic problem becomes more challenging when the input QBC becomes noisy. A joint remosaic and denoise task is thus in demand for real world applications. Although QBC has been put to good use, related exploration is very rare, thus designing a method to solve Quad Joint Remosaic and Denoise problem is essential. In this work, we aim at breaking the predicament of lacking related research and improving the model performance for joint remosaic and denoise task. By learning from Joint Demosaic and Denoise methods on Standard Bayer CFA, we propose a general two-stage framework solution for Quad Joint Remosaic and Denoise task named JQNet, which yielded remarkable result in Quad joint remosaic and denoise challenge held on ECCV’22. We especially design a strong backbone deep-learning networks for Quad Joint Remosaic and Denoise, which universally acts as a basic module in both remosaic and denoise stage. With channel attention block which has shown its efficiency in image enhancement tasks [25], we replace commonly used residual modules with MultiConv Channel Attention (MCCA) block [15]. Our architecture performs well in joint remosaic and denoise task by applying multi-scale feature extraction and selection on noisy QBC to produce the final clean Quad pattern image, and then generates the Bayer pattern through the remosaic module. Our backbone network makes the process of denoise and remosaic transparent and explicable. Our JQNet achieves great performance that makes less false color, noise and better color boundary. In summary, our main contribution are listed below.


– We firstly propose a basic universal module which can be easily appended to other Quad Joint Remosaic and Denoise method as a plug-in backbone to handle both remosaic and denoise task. – We design a two stage deep learning solution JQNet targeted for joint remosaic and denoise task in Quad Bayer CFA. – Remarkable results on a comprehensive set of quantitative and qualitative experiments verified the effectiveness of our proposed network structure with parameters reduce sharply.

2 Related Work

2.1 Color Filter Arrays

The design and performance characteristics of CFA mainly depend on the type of color system and the arrangement of color filters in CFA [18,23]. These two basic CFA characteristics specify the requirements for constructing a demosaic solution, which affects its efficiency and cost. The more commonly used CFA in recent years are Bayer and QBC. Kodak [17] has developed the Bayer pattern which is a technique whereby instead of requiring each pixel to have its own individual three color planes (requiring 24 bits in total), an array of color filters is placed over the camera sensor and this limits the requirement for data bits per pixel to a single 8-bit byte. The Bayer pattern is repeated on the image in a fixed configuration to standardize this process. For Bayer pattern CFA arrays, because human vision is more sensitive than green, green channels are sampled more densely than red and blue channels. A common method is to reconstruct green channels first, and then use the gradient of green channels when rebuilding red and blue channels. Quad Bayer color filter array (QBC) is four pixels sharing a filter, with an ability to treat a group of four pixels sharing a color filter block as a sensor or a separate sensor. If one of the four pixels on the Quad Bayer sensor captures the noise, only 25% of the information is lost, and the noise does not reduce the sharpness of the image. 2.2

Denoise Raw Images

Image denoising is a basic task in image processing and computer vision. Classical methods usually rely on prior information of images, such as non-local mean [5], sparse coding [3,10,19], BM3D [9], etc. With the development of convolutional neural networks, training end-to-end denoising networks has received considerable attention. Some recent advanced network architectures have produced a large number of cnn-based denoising methods that apply in RGB domain [7,14,16]. Since the shape and distribution of noise are quite different between RAW domain and RGB domain, performance degrades when directly apply these methods to RAW images. Among the recently proposed datasets for image denoising in the public raw domain [2,4,6], there are also some convolutional neural networks [6,13,14] that


have achieved promising results in these datasets. However, it is a tedious work to obtain the noise image and the real image pair. Therefore, synthesizing more realistic raw domain noise data has become a hot topic, including gauss-poisson distribution noise [11,22], gaussian mixture model [26], in-camera process simulation and so on. Foi et al. [22] proposed a method to construct Gauss-Poisson noise model and a network to achieve denoising.

3 Proposed Method

3.1 Problem Formulation

The Quad Joint Remosaic and Denoise problem is performed on a single QBC $Q_{noise}$ at different noise levels to recover the corresponding noise-free Bayer $B$. Given a noisy QBC $Q_{noise}$, the denoise framework recovers a clean QBC $Q_{clean}$. After that, a remosaic framework is applied to generate the final Bayer output $B$. The problem can be formulated as in Eq. 1:

$B = \Omega(f(Q_{noise}|\theta_{f})\,|\,\theta_{\Omega})$,   (1)

where $f(\cdot|\cdot)$ denotes the denoise framework and $\Omega(\cdot|\cdot)$ the remosaic framework; $\theta_{f}$ and $\theta_{\Omega}$ denote the learnable parameters of $f(\cdot|\cdot)$ and $\Omega(\cdot|\cdot)$, respectively.

3.2 Network Structure

In this section, we introduce the details of our proposed JQNet (Joint Remosaic and Denoise in Quad Bayer CFA Network). The full backbone network is composed of three sub-modules: source encoder module, feature refinement module and final prediction module. Source Encoder Module consists of two convolution layers with a relu in the middle, aims to reconstruct rough texture information Irough from input IQBC ∈ RH×W ×C as the input of feature refinement module: Irough = Conv(Relu(Conv(IQBC ))),

(2)

where Conv denotes convolution and Relu indicates relu activation function. Feature Refinement Module takes Irough as input to fully select, refine and enhance useful information Iref ine . It is formed by sequential connection of MCCA blocks of different scales [15], which consist by multi-DoubleConv and channel attention operation(as shown in Fig. 2). MCCA can extract feature at different scales and then use channel attention to perform feature selection for the next stage. In details, we specify four types of MCCA (MCCA3, MCCA5, MCCA7, MCCA9) [15] with the reduction ratio set to 1, 2, 3, 4 respectively and apply numbers of MCCA modules with residual-connect technique to make up


the feature refinement network in a particular sequence. The sequence of the operation is clearly marked in Fig. 2: Iref ine = Fref ine ([Irough ]),

(3)

where Fref ine indicates feature refinement module. Final Prediction Module after getting a great representation Iref ine of the raw image from the feature refinement network, final prediction module Fprediction takes the Iref ine as input through three convolutional layers and nested two relu to reconstruct the output Io : Io = Conv3 (Relu(Conv2 (Relu(Conv1 (Iref ine ))),

(4)

where the number of convolutional layers in Conv1 , Conv2 and Conv3 is progressively decreasing to the channel of the final output Io .

Fig. 2. The architecture of proposed JQNet for Joint Remosaic and Denoise in QBC problem. Our full algorithm is composed of two stage: Denoise stage and Remosaic stage. In the two-phase process, the same backbone network is used. ×3 denotes repeating the MCCA9 unit 3 times. Red arrows ↑ indicates calculated MAE loss, the final loss expression is: L = LD + λLR , more details are shown in 5.

3.3 Overall Framework

As discussed in Sect. 3.1, we propose a two-stage paradigm that reformulate this task into denoise first and then remosaic. By introducing two supervisory signals: clean QBC Qclean for denoise stage and RGB images Igt generated from clean Bayer Bgt for remosaic stage, our full framework is trained in a two-stage supervised manner.


In general, our proposed network consists of two stages: a denoise stage and a remosaic stage. For each stage, we use the same backbone (JQNet) mentioned above. To be specific, the hollow operation (short for "H") denotes separating the color channels (more details are discussed in Sect. 4.3). For the first stage, we apply the hollow operation to the noisy QBC $Q_{noise}$ to obtain $Q^{h}_{noise}$ as the input of the denoise stage, aiming at obtaining a clean QBC $Q_{clean}$. For the second stage, by applying the hollow operation to $Q_{clean}$ as input, we obtain the recovered Bayer $B_{o}$ from the remosaic framework.

3.4 Loss Function

In many image reconstruction and enhancement tasks, the mean absolute error (MAE) loss is widely used. In this work, our final loss function is defined by combining the MAE losses of the denoise stage and the remosaic stage:

$L = L_{D} + \lambda L_{R} = \|Q_{clean} - Q_{0dB}\|_{1} + \lambda \|Res_{o} - Res_{gt}\|_{1}$   (5)

Here Qclean and Q0dB denotes the clean output of denoise framework and 0 dB QBC. Reso and Resgt represent the result of the final network passing through the ISP and ground truth result respectively. λ is a hyper-parameter tuning LR .

4 Experiment

4.1 Datasets

In this work, we evaluate our algorithm on the MIPI Quad Joint Remosaic and Denoise challenge dataset. The full dataset contains 100 scenes, constituted by pairs of input QBC and output Bayer raw data, all of the same spatial resolution of 1200 × 1800. For a fair comparison, we use a public split of 70, 15, and 15 scene pairs for training, validation, and testing, respectively. For each scene in the training, validation and testing splits, there are three noise levels: 0 dB, 24 dB and 42 dB. Some examples from this dataset are shown in Fig. 3.

4.2 Evaluation Metrics

The evaluation consists of two parts: the comparison of the recovered bayer with the ground truth bayer and comparison of the RGB results generated from bayer results mentioned above by a simple ISP. As for the former, we use KL divergence (KLD) as the evaluation metric, and for the latter, we adopt PSNR, SSIM and Learned perceptual image patch similarity (LIPIS) [24] for evaluation. In order to measure the effectiveness of the algorithm and the overall image


quality intuitively, we follow the setting of the MIPI challenge by using the M4 score as the overall evaluation metric:

$M4 = PSNR \times SSIM \times 2^{1 - LPIPS - KLD}$   (6)

The M4 score is between 0 and 100, and the higher the score, the better the overall image quality.
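A one-line implementation of Eq. (6) is sketched below for reference.

def m4_score(psnr, ssim, lpips, kld):
    # M4 = PSNR x SSIM x 2^(1 - LPIPS - KLD)
    return psnr * ssim * 2.0 ** (1.0 - lpips - kld)

# e.g. m4_score(37.93, 0.965, 0.104, 0.019) gives roughly 67.2; small differences from
# the official score in Table 1 are expected since the table lists rounded metrics.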

Fig. 3. Visualization of MIPI 2022 Challenge on Quad Joint Remosaic and Denoise dataset. This dataset contains multiple scenes (i.e., natural scene, dark scene).

Fig. 4. Details of data augmentation and preprocessing. At first flipping and rotation is apply to the input original QBC, then clipping is adopted to keep the augmented data in the same spatial form (RGGB pattern). At last, hollow operation is applied to split different color channels.

4.3 Implementation Details

Data Augmentation. Because of the one-to-one correspondence between the spatial positions of QBC and Bayer, some data augmentation methods such as flipping and rotation cannot be applied directly. However, as shown in Eq. 5, the supervision signals for optimizing the two-stage models are computed in the QBC-to-QBC and QBC-to-RGB domains, respectively, so these augmentation methods can be introduced into our data processing pipeline. As shown in Fig. 4, we adopt flipping and rotation directly in the QBC-to-RGB domain.

Data Preprocessing. First, to provide pixel location information, enhance generalization, and make network learning more targeted for color-channel recovery, we convert the input QBC into a hollow QBC by placing the pixels of each color into a separate channel. After that, the original QBC $Q \in \mathbb{R}^{H \times W}$ is processed into a hollow QBC $Q_{h} \in \mathbb{R}^{H \times W \times 3}$ (shown in Fig. 4). Then we sample the full-resolution hollow QBC $Q_{h}$ (1200 × 1800) into 64 × 64 patches as the final input.

Training Details. For all experiments, our model is implemented in PyTorch and runs on 4 NVIDIA Titan A100 graphics processing units (GPUs). The model is optimized with an Adam optimizer with β1 = 0.9, β2 = 0.99 and a learning rate of 1e−4, using a batch size of 56. We set the number of channels in the MCCA blocks to 64. To keep the two sub-losses in Eq. 5 balanced, we set λ = 1.0.
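The sketch below is one reading of the hollow operation described above: the single-channel Quad Bayer mosaic (H, W) is expanded into a 3-channel array (H, W, 3), where each channel keeps only the pixels of one color and is zero elsewhere. The assumed R/G/G/B quad layout is an illustration, not taken from the paper.

import numpy as np

def to_hollow_qbc(qbc):
    # qbc: (H, W) Quad Bayer mosaic with 2x2 same-color quads arranged as R G / G B.
    h, w = qbc.shape
    hollow = np.zeros((h, w, 3), dtype=qbc.dtype)
    yy, xx = np.meshgrid(np.arange(h), np.arange(w), indexing='ij')
    quad_y, quad_x = (yy // 2) % 2, (xx // 2) % 2
    r_mask = (quad_y == 0) & (quad_x == 0)
    b_mask = (quad_y == 1) & (quad_x == 1)
    g_mask = ~(r_mask | b_mask)
    hollow[..., 0][r_mask] = qbc[r_mask]
    hollow[..., 1][g_mask] = qbc[g_mask]
    hollow[..., 2][b_mask] = qbc[b_mask]
    return hollow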

4.4 Testing Results of MIPI 2022 Challenge on Quad Joint Remosaic and Denoise

The proposed algorithm ranked 5th in the MIPI 2022 Challenge on Quad Joint Remosaic and Denoise. The final comparison results on the testing set are summarized in Table 1. We can observe that our method achieves great performance.

Table 1. The final testing results of MIPI 2022 Challenge on Quad Joint Remosaic and Denoise. The best value is in bold and our results are highlighted in gray.

Rank  Team           PSNR↑   SSIM↑   LPIPS↓   KLD↓     M4↑
1     op-summer-po   37.93   0.965   0.104    0.019    68.03
2     JHC-SJTU       37.64   0.96    0.1      0.0068   67.99
3     IMEC-IPI       37.76   0.96    0.1      0.014    67.95
4     BITSpectral    37.20   0.96    0.11     0.03     66.00
5     HITZST01       37.20   0.96    0.11     0.06     64.82
6     MegNR          36.08   0.95    0.095    0.023    64.10

4.5 Ablation Study

In this section, we first perform ablation studies to verify the effectiveness of our proposed two-stage solution, and we then perform ablation studies by replacing our designed JQNet with other backbones to verify the generality and effectiveness of the proposed JQNet on the two tasks: the denoise task and the remosaic task. Because the ground truth of the test set is not available, all results reported in this section are computed on the validation set.

Effectiveness of Proposed Two-Stage Solution. In this paper, we introduce a two-stage solution for the Quad Joint Remosaic and Denoise problem, which consists of a denoising stage and a remosaic stage. We empirically validate the effectiveness of the two-stage solution by comparing it with a one-stage solution for this problem, that is, taking the noisy QBC $Q^{h}_{noise}$ as input and directly outputting the clean Bayer $B_{o}$. The loss for the one-stage solution is:

$L = L_{R} = \|I_{o} - I_{gt}\|_{1}$   (7)

Table 2. Ablation study of proposed two-stage solution. σ denotes the noise level. The best value is in bold. Method

σ

PSNR↑ SSIM↑

LPIPS↓

KLD↓

M4↑

Residual1 [12] 24 KPN1 [20] Unet1 [21] JQNet1 JQNet2

34.65 34.92 34.93 35.06 35.71

0.9033 0.9132 0.9125 0.9153 0.9174

0.22044 0.21034 0.21213 0.21258 0.20772

0.00049 0.00051 0.00053 0.00043 0.00047

53.72 55.11 55.02 55.38 56.71

Residual1 [12] 42 KPN1 [20] Unet1 [21] JQNet1 JQNet2

29.29 30.03 30.23 30.07 31.10

0.8021 0.8460 0.8450 0.8624 0.8597

0.38641 0.33287 0.33605 0.33598 0.32312

0.00175 0.00164 0.00160 0.00147 0.00144

35.90 40.29 40.43 41.05 42.70

Residual1 [12] Average 31.97 32.48 KPN1 [20] 32.58 Unet1 [21] 1 32.57 JQNet 33.41 JQNet2

0.8528 0.8796 0.8788 0.8889 0.8885

0.30343 0.27161 0.27409 0.27430 0.26542

0.00112 0.00108 0.00107 0.00095 0.00096

44.15 47.29 47.32 47.85 49.35

To simplify the presentation, we use Residual1 , KPN1 , Unet1 , JQNet1 represent one-stage solution mentioned above with different backbone (i.e., ResNet [12], KPN [20], Unet [21]), and JQNet2 denotes our proposed two-stage


solution. We only report results of 24dB and 42dB because there is no noise in 0dB. As shown in Table 2, our two-stage solution brings a large improvement to the one-stage solution. The improvement in all indicators validate that our two-stage solution can break the gap between different domains by introducing more monitoring information. Effectiveness and Generalization of proposed JQNet 1) Denoise. we empirically validate the effectiveness for denoise task by replacing JQNet in denoise stage with traditional denoise algorithm (i.e., BM3D [8]) and some existing strong vision backbones (i.e., ResNet [12], Unet [21], KPN [20]). BM3D-JQNet, ResNet-JQNet, Unet-JQNet and KPN-JQNet are short for denoising backbone replacement modes of mentioned above, respectively. JQNet2 denotes our proposed framework in this paper, which consist of two adjacent JQNet. The results shown in Table 3 verifies the effectiveness of our JQNet on denoise task. Table 3. Ablation study to validate the effectiveness of JQNet for denoise task. σ denotes the noise level. The best value is in bold. Method

σ

PSNR↑ SSIM↑

LPIPS↓

KLD↓

M4↑

BM3D-JQNet 24 ResNet-JQNet Unet-JQNet KPN-JQNet JQNet2

32.08 35.38 34.78 35.23 35.71

0.8703 0.9131 0.9075 0.9032 0.9174

0.28730 0.21059 0.20985 0.21007 0.20772

0.00109 0.00048 0.00051 0.00053 0.00047

45.73 55.82 54.56 55.00 56.71

BM3D-JQNet 42 ResNet-JQNet Unet-JQNet KPN-JQNet JQNet2

28.45 31.02 30.28 30.97 31.10

0.7955 0.8575 0.8490 0.8473 0.8597

0.42848 0.32213 0.32792 0.32305 0.32312

0.00198 0.00142 0.00152 0.00156 0.00144

33.59 42.51 40.91 41.90 42.70

BM3D-JQNet Average 30.27 33.20 ResNet-JQNet 32.53 Unet-JQNet 33.10 KPN-JQNet 33.41 JQNet2

0.8329 0.8853 0.8782 0.8753 0.8885

0.35790 0.26636 0.26889 0.26656 0.26542

0.00154 0.00095 0.00102 0.00105 0.00096

39.30 48.84 47.39 48.13 49.35


2) Remosaic. To validate the effectiveness on the remosaic task, we replace the JQNet in the remosaic stage with several strong vision backbones (i.e., ResNet [12], Unet [21], KPN [20]). JQNet-ResNet, JQNet-Unet, and JQNet-KPN are short for the corresponding remosaic-backbone replacements. As shown in Table 4, JQNet outperforms most of the compared configurations. The extensive ablation results above validate that the proposed JQNet can be applied to both tasks without any modification and greatly improves the performance compared with the baseline backbones.

4.6 Qualitative Evaluation

In this section, we qualitatively illustrate the effectiveness of our proposed two-stage solution and the specifically designed JQNet. To simplify the comparison, we use the best models besides our JQNet2 mentioned in Sect. 4.5, that is, JQNet-ResNet, ResNet-JQNet, and JQNet1. As shown in Fig. 5, our JQNet2 produces cleaner results with fewer color-boundary artifacts than these baseline methods.

Table 4. Ablation study to validate the effectiveness of JQNet for the remosaic task. σ denotes the noise level. The best value is in bold.

Method | σ | PSNR↑ | SSIM↑ | LPIPS↓ | KLD↓ | M4↑
ResNet | 0 | 40.07 | 0.9652 | 0.05094 | 0.00013 | 74.67
Unet | 0 | 40.11 | 0.9640 | 0.05913 | 0.00014 | 74.19
KPN | 0 | 40.10 | 0.9641 | 0.05122 | 0.00013 | 74.62
JQNet | 0 | 40.13 | 0.9682 | 0.05011 | 0.00014 | 75.06
JQNet-ResNet | 24 | 35.57 | 0.9155 | 0.21004 | 0.00044 | 56.28
JQNet-Unet | 24 | 35.07 | 0.9127 | 0.21065 | 0.00049 | 55.30
JQNet-KPN | 24 | 34.84 | 0.9105 | 0.20993 | 0.00052 | 54.83
JQNet2 | 24 | 35.71 | 0.9174 | 0.20772 | 0.00047 | 56.71
JQNet-ResNet | 42 | 30.88 | 0.8558 | 0.32226 | 0.00143 | 42.23
JQNet-Unet | 42 | 29.62 | 0.8422 | 0.33741 | 0.00177 | 39.44
JQNet-KPN | 42 | 31.04 | 0.8588 | 0.31934 | 0.00139 | 42.68
JQNet2 | 42 | 31.10 | 0.8597 | 0.32312 | 0.00144 | 42.70
JQNet-ResNet | Average | 35.51 | 0.9122 | 0.19441 | 0.00067 | 56.68
JQNet-Unet | Average | 34.93 | 0.9063 | 0.20240 | 0.00080 | 54.99
JQNet-KPN | Average | 35.32 | 0.9111 | 0.19350 | 0.00068 | 56.26
JQNet2 | Average | 35.65 | 0.9151 | 0.19366 | 0.00068 | 57.02


Fig. 5. Qualitative comparison of JDR-QBC results for noise level σ = 42. Our result shows better perceptual quality with less noise and fewer false colors.

5 Conclusion

In this paper, we propose a JRD-QBC network architecture for remosaicing and denoising based on the Quad Bayer color filter array. We split the original problem into two subtasks, remosaicing and denoising. In both subtasks, we use the same network backbone and experiment with different backbone structures, and the results demonstrate the versatility and effectiveness of the chosen backbone.

References 1. A Sharif, S., Naqvi, R.A., Biswas, M.: Beyond joint demosaicking and denoising: an image processing pipeline for a pixel-bin image sensor. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 233–242 (2021) 2. Abdelhamed, A., Lin, S., Brown, M.S.: A high-quality denoising dataset for smartphone cameras. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1692–1700 (2018) 3. Aharon, M., Elad, M., Bruckstein, A.: K-SVD: an algorithm for designing overcomplete dictionaries for sparse representation. IEEE Trans. Signal Process. 54(11), 4311–4322 (2006) 4. Anaya, J., Barbu, A.: Renoir-a dataset for real low-light image noise reduction. J. Vis. Commun. Image Represent. 51, 144–154 (2018) 5. Buades, A., Coll, B., Morel, J.M.: A non-local algorithm for image denoising. In: 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2005), vol. 2, pp. 60–65. IEEE (2005) 6. Chen, C., Chen, Q., Xu, J., Koltun, V.: Learning to see in the dark. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3291– 3300 (2018) 7. Chen, Y., Pock, T.: Trainable nonlinear reaction diffusion: a flexible framework for fast and effective image restoration. IEEE Trans. Pattern Anal. Mach. Intell. 39(6), 1256–1272 (2016) 8. Dabov, K., Foi, A., Katkovnik, V., Egiazarian, K.: Image denoising by sparse 3D transform-domain collaborative filtering. IEEE Trans. Image Process. 16(8), 2080–2095 (2007) 9. Dabov, K., Foi, A., Katkovnik, V., Egiazarian, K.: Image restoration by sparse 3D transform-domain collaborative filtering. In: Image Processing: Algorithms and Systems VI, vol. 6812, pp. 62–73. SPIE (2008) 10. Elad, M., Aharon, M.: Image denoising via sparse and redundant representations over learned dictionaries. IEEE Trans. Image Process. 15(12), 3736–3745 (2006) 11. Foi, A., Trimeche, M., Katkovnik, V., Egiazarian, K.: Practical poissonian-gaussian noise modeling and fitting for single-image raw-data. IEEE Trans. Image Process. 17(10), 1737–1754 (2008) 12. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016) 13. Hirakawa, K., Parks, T.W.: Joint demosaicing and denoising. IEEE Trans. Image Process. 15(8), 2146–2157 (2006)


14. Jain, V., Seung, S.: Natural image denoising with convolutional networks. In: Advances in Neural Information Processing Systems, vol. 21 (2008) 15. Kim, B.-H., Song, J., Ye, J.C., Baek, J.H.: PyNET-CA: enhanced PyNET with channel attention for end-to-end mobile image signal processing. In: Bartoli, A., Fusiello, A. (eds.) ECCV 2020. LNCS, vol. 12537, pp. 202–212. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-67070-2 12 16. Lehtinen, J., et al.: Noise2noise: learning image restoration without clean data. arXiv preprint arXiv:1803.04189 (2018) 17. Lukac, R., Plataniotis, K.N.: Color filter arrays: design and performance analysis. IEEE Trans. Consum. Electron. 51(4), 1260–1267 (2005) 18. Lukac, R., Plataniotis, K.N.: Universal demosaicking for imaging pipelines with an RGB color filter array. Pattern Recogn. 38(11), 2208–2212 (2005) 19. Mairal, J., Bach, F., Ponce, J., Sapiro, G., Zisserman, A.: Non-local sparse models for image restoration. In: 2009 IEEE 12th International Conference on Computer Vision, pp. 2272–2279. IEEE (2009) 20. Mildenhall, B., Barron, J.T., Chen, J., Sharlet, D., Ng, R., Carroll, R.: Burst denoising with kernel prediction networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2502–2510 (2018) 21. Ronneberger, O., Fischer, P., Brox, T.: U-Net: convolutional networks for biomedical image segmentation. In: Navab, N., Hornegger, J., Wells, W.M., Frangi, A.F. (eds.) MICCAI 2015. LNCS, vol. 9351, pp. 234–241. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-24574-4 28 22. Wang, Y., Huang, H., Xu, Q., Liu, J., Liu, Y., Wang, J.: Practical deep raw image denoising on mobile devices. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12351, pp. 1–16. Springer, Cham (2020). https:// doi.org/10.1007/978-3-030-58539-6 1 23. Wilson, P.: Bayer pattern. https://www.sciencedirect.com/topics/engineering/ bayer-pattern 24. Zhang, R., Isola, P., Efros, A.A., Shechtman, E., Wang, O.: The unreasonable effectiveness of deep features as a perceptual metric. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 586–595 (2018) 25. Zhang, Y., Li, K., Li, K., Wang, L., Zhong, B., Fu, Y.: Image super-resolution using very deep residual channel attention networks. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 286–301 (2018) 26. Zhu, F., Chen, G., Heng, P.A.: From noise modeling to blind image denoising. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 420–429 (2016)

Learning an Efficient Multimodal Depth Completion Model

Dewang Hou1, Yuanyuan Du2, Kai Zhao3, and Yang Zhao4(B)

1 Peking University, Beijing, China, [email protected]
2 Chongqing University, Chongqing, China
3 Tsinghua University, Beijing, China, [email protected]
4 Hefei University of Technology, Hefei, China, [email protected]

Abstract. With the wide application of sparse ToF sensors in mobile devices, RGB image-guided sparse depth completion has attracted extensive attention recently, but still faces some problems. First, the fusion of multimodal information requires more network modules to process different modalities. But the application scenarios of sparse ToF measurements usually demand lightweight structure and low computational cost. Second, fusing sparse and noisy depth data with dense pixel-wise RGB data may introduce artifacts. In this paper, a light but efficient depth completion network is proposed, which consists of a two-branch global and local depth prediction module and a funnel convolutional spatial propagation network. The two-branch structure extracts and fuses cross-modal features with lightweight backbones. The improved spatial propagation module can refine the completed depth map gradually. Furthermore, corrected gradient loss is presented for the depth completion problem. Experimental results demonstrate the proposed method can outperform some state-of-the-art methods with a lightweight architecture. The proposed method also wins the championship in the MIPI2022 RGB+TOF depth completion challenge.

Keywords: Depth completion · Sparse ToF · RGB guidance

1 Introduction

Depth information plays an important role in various vision applications, such as autonomous driving, robotics, augmented reality, and 3D mapping. In past decades, many depth sensors have been developed and applied to obtain depth information, such as time-of-flight (ToF) and light detection and ranging (LiDAR) sensors. With the rapid development of smart mobile devices, sparse ToF depth measurements have attracted extensive attention recently due to their unique advantages. For example, sparse ToF sensors can effectively avoid the multi-path interference problem that usually appears in full-field ToF depth measurements. In addition, the power consumption of sparse ToF depth measurement is


much lower than that of full-field ToF sensors, which is very suitable for mobile devices and edge devices to reduce battery consumption and heating. However, there is a proverb that says you can’t have your cake and eat it too. The sparse ToF depth measurement also has significant disadvantages. First, sparse ToF cannot provide a dense pixel-wise depth map due to hardware limitations such as the small amount of laser pulse. Second, sparse ToF depth data easily suffers from noise. To overcome these limitations, depth completion (DC), which converts a sparse depth measurement to a dense pixel-wise depth map, becomes very useful due to industrial demands. Single modal sparse depth completion is a quite difficult and ill-posed problem because dense pixel-wise depth values are severely missing. Fortunately, aligned RGB images usually can be obtained in these practical scenarios. Therefore, recent depth completion methods often utilize RGB guidance to boost depth completion and show significantly better performance than the unguided way. The straightforward reason is that natural images provide plentiful semantic cues and pixel-wise details, which are critical for filling missing values in sparse depth modality. For traditional full-field ToF depth maps, many RGB image guided depth super-resolution or enhancement methods have been proposed, e.g., traditional learning-based joint enhancement models [17,40], and recent deep neural network (DNN)-based joint depth image restoration methods [4,25,27,33,41]. For the sparse ToF depth measurements, Ma et al. [25] proposed a single depth regression network to estimate dense scene depth by means of RGB guidance. Tang et al. [33] presented a learnable guided convolution module and applied it to fuse cross-modality information between sparse depth and corresponding image contents. These RGB-guided depth completion methods can produce better depth maps by comparing them with single modal depth completion. However, current depth completion methods with RGB guidance still face the following two challenges. First, multi-modality usually leads to more computational cost, because different network modules are required for these modalities with different distributions. Note that sparse depth sensors are usually applied in mobile devices or other edge devices with finite computing power and real-time requirement. As a result, the depth completion models not only need to extract, map and fuse multimodal information effectively but also need to meet lightweight architecture and real-time demand. Second, different to full-field depth maps with pixel-wise details, the sparse depth data may introduce extra artifacts during the fusion process. Moreover, the noise contained in sparse depth data may further intensify the negative effects. In this paper, we proposed a light but efficient multimodal depth completion network based on the following three aspects: fusing multi-modality data more effectively; reducing the negative effects of missing values regions in sparse modality; recovering finer geometric structures for both objective metrics and subjective quality. Overall, the contributions are summarized as follows: 1) We present a lightweight baseline, which can achieve state-of-the-art results while


being computationally efficient. More specifically, the two-branch framework is first used to estimate global and local predictions. Features are exchanged between these two branches to fully make use of cross-modal information. A fusion module with relative confidence maps is presented after two branches, which can further optimize the fused results and accelerate convergence. 2) we propose a funnel convolutional spatial propagation network, which can gradually improve the depth features with reweighting strategy. 3) In addition, corrected gradient loss is designed for depth completion problems correspondingly. Experimental results show that the proposed method can achieve superior performance, but also run very fast in the inference stage. The remainder of this paper is organized as follows. Related works are briefly reviewed in Sect. 2. Section 3 introduces the proposed method in detail. Experimental results are analyzed in Sect. 4, and Sect. 5 concludes the paper.

2 Related Work

2.1 Unguided Depth Completion

Unguided DC methods tend to estimate a dense depth map from a sparse depth map directly. Uhrig et al. [34] first applied a sparsity-invariant convolutional neural network (CNN) to the DC task. Thereafter, many DC networks have been proposed by exploiting the strong learning capability of CNNs [7,8]. Moreover, some unguided DC methods [23,24] use RGB images for auxiliary training, but still perform unguided DC in the inference stage. Unguided methods do not rely on corresponding RGB images, but their performance is naturally worse than that of RGB-guided DC methods [13], because RGB guidance provides more semantic information and pixel-wise local details. This paper focuses on sparse ToF depth data in real-world applications and thus mainly discusses the related RGB-guided DC methods.

2.2 RGB-Guided Depth Completion

Currently, different types of guided DC methods have been proposed to employ corresponding RGB images to improve completed depth maps. These RGBguided DC methods can be roughly divided into four categories, i.e., early fusion model, late fusion model, explicit 3D representation model, and spatial propagation model. Early fusion methods [6,12,15,25,31,37] directly concatenate depth map and RGB image or their shallow features before feeding them into the main body of networks. These methods usually adopt encoder-decoder structure [25,31] or two-stage coarse to refinement prediction [6,12]. Note that the coarse-torefinement structure is also frequently used in subsequent residual depth map prediction models [11,19]. Compared to early fusion, late fusion model uses two sub-network to extract features from RGB image and input depth data, and then fuses cross-modal information in the immediate or late stage. Since two


modalities are used, this type of method usually adopts dual-branch architecture, such as dual-encoder with one decoder [9,16,43], dual-encoder-decoder [30,33, 39], and global and local depth prediction [18,35]. In addition, there are some 3D representation models learn 3D representations to guide depth completion, such as 3D-aware convolution [2,42] and surface normal representation method [28]. Spatial propagation network (SPN) [21] is also a typical model widely used in DC tasks by treating DC as a depth regression problem. Cheng et al. [5] first applied SPN to the DC task by proposing a convolutional SPN (CSPN). Then, they presented CSPN++ [3] by means of context-aware and resource-aware CSPN. To take advantage of both SPN and late fusion models, Hu et al. [14] further proposed a precise and efficient DC network (PENet). Subsequently, deformable SPN (DSPN) [38] and attention-based dynamic SPN (DySPN) [20] were also applied for RGB-guided DC problem, and achieved impressive performance on KITTI depth completion benchmark [34].

3 Methods

In this paper, we thoroughly study the key components of depth completion methods to derive an Efficient Multimodal Depth Completion network (EMDC), shown in Fig. 1, which tends to solve the above mentioned problems. In the following, we introduce each module of EMDC in detail.

Fig. 1. Network architecture of the proposed EMDC. We introduce a pipeline that consists of two-branch global and local network, fusion module and FCSPN. As an illustrative example, we show the second stage in which FCSPN generates two convolution kernels.

3.1 Global and Local Depth Prediction with Fusion

Motivated by [13,18,35], a global and local depth prediction (GLDP) framework is adopted to fuse the information from two modalities efficiently. As illustrated in Fig. 1, different architectures are designed for the global and local


depth estimation sub-networks, respectively. For the global branch, which needs to estimate depth with the help of the RGB image, a typical U-Net structure with large receptive fields is used, and a pretrained MobileNetV2 [29] is adopted as its encoder to obtain strong learning capability with a lightweight structure. For the local branch, which handles the simpler prediction from the homogeneous sparse depth, we design only a lightweight CNN module to reproduce the depth map. Finally, the output depth maps of the two branches are fused via a relative fusion module.

The confidence maps used by the fusion module should reflect the relative certainty of the global and local depth predictions, which we call relative confidence maps. As shown in Fig. 2, we adjust the pathways of the features to embody this idea directly. Furthermore, we zero-initialize the relative confidence map of the local depth prediction following the ReZero initialization [1], which achieves faster convergence and better test performance in our experiments; the corresponding ablation results are reported in Table 2.

To reduce the negative effects introduced by the distribution characteristics of the sparse modality, we use pixel-shuffle (PS) [32] to replace the traditional upsampling layers in the global branch, since the PS operation does not magnify the non-ideal missing regions. In addition, all batch normalization layers in the local branch are removed, as they are fragile to features with an anisotropic spatial distribution and slightly degrade the results.
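As an illustration of how such a relative-confidence fusion can be realized, the following PyTorch sketch (ours, not the released EMDC code) predicts one confidence logit per branch, normalizes the two logits against each other with a softmax, and zero-initializes the local-branch head; the layer sizes and the exact target of the zero initialization are assumptions.

```python
import torch
import torch.nn as nn

class RelativeConfidenceFusion(nn.Module):
    """Sketch of confidence-based fusion of global and local depth predictions.
    The two confidence maps are relative: they are softmax-normalized against
    each other, and the local-branch confidence head starts from zero."""

    def __init__(self, feat_ch=32):
        super().__init__()
        self.conf_global = nn.Conv2d(feat_ch, 1, kernel_size=3, padding=1)
        self.conf_local = nn.Conv2d(feat_ch, 1, kernel_size=3, padding=1)
        # Zero-init (ReZero-style) so the local confidence starts neutral.
        nn.init.zeros_(self.conf_local.weight)
        nn.init.zeros_(self.conf_local.bias)

    def forward(self, depth_g, feat_g, depth_l, feat_l):
        # Relative confidence: normalize the two logits against each other.
        logits = torch.cat([self.conf_global(feat_g), self.conf_local(feat_l)], dim=1)
        w = torch.softmax(logits, dim=1)
        return w[:, 0:1] * depth_g + w[:, 1:2] * depth_l
```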

Fig. 2. Left: fusion module proposed in [35], Right: our proposed fusion module.

3.2 Funnel Convolutional Spatial Propagation Network

The most important target of the DC task is to recover better scene structures for both objective metrics and subjective quality. Therefore, to alleviate the problem of recovering scene structures, we propose a Funnel Convolutional Spatial Propagation Network (FCSPN) for depth refinement. As shown in Fig. 1, the FCSPN inherits the basic structure of the spatial propagation network (SPN) [3,4,14,20, 27], which is a popular refinement technique in the DC field. FCSPN can fuse the


point-wise results from large to small dilated convolutions in each stage. The maximum dilation of the convolution at each stage is reduced gradually, thus forming a funnel-like structure stage by stage. Compared with CSPN++ [3], the FCSPN generates a new set of kernels, termed reweighting, at each stage to increase the representation capability of SPNs. The FCSPN embraces dynamic filters with kernel reweighting which has a more substantial adaptive capability than adjusting them via attention mechanism [20]. Moreover, the kernel reweighting mechanism merely relies on the kernels in the previous stage and the current depth map in the refinement process, thus reducing the computation complexity considerably. Experimental results show that FCSPN not only has superior performance, but also runs fast in inference, as shown in Table 3.
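The sketch below illustrates only the propagation mechanics of such a funnel-shaped, dilated spatial propagation: each stage applies a few 3 × 3 dilated propagation steps, and the maximum dilation shrinks from stage to stage. The dilation schedule, the number of steps per stage, and the kernel-reweighting head of the actual FCSPN are not specified here and are treated as assumptions.

```python
import torch
import torch.nn.functional as F

def propagate_step(depth, affinity, dilation):
    """One CSPN-style propagation step over a 3x3 dilated neighbourhood.
    `affinity` has 9 channels (one weight per neighbour, centre included) and
    is assumed to be normalized so the weights sum to one per pixel."""
    b, _, h, w = depth.shape
    # Gather the 3x3 dilated neighbourhood of every pixel: (B, 9, H*W).
    neigh = F.unfold(depth, kernel_size=3, dilation=dilation, padding=dilation)
    neigh = neigh.view(b, 9, h, w)
    return (affinity * neigh).sum(dim=1, keepdim=True)

def funnel_refine(depth, affinities, schedule=((4, 2, 1), (2, 1), (1,))):
    """Funnel-like refinement sketch: each stage runs propagation steps whose
    maximum dilation shrinks stage by stage. `affinities[s]` would come from a
    reweighting head conditioned on the previous kernels and the current depth;
    here it is simply indexed per stage as a placeholder."""
    for s, dilations in enumerate(schedule):
        for d in dilations:
            depth = propagate_step(depth, affinities[s], d)
    return depth
```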

Fig. 3. Gradient maps obtained by filtering a depth map. For simplicity, we only show the normalized y-gradient from the Sobel-Feldman operator in the vertical direction. (b) is a naive gradient map and (c) is produced by our method. (Color figure online)

3.3 Loss Functions

Sophisticated depth completion algorithms often focus on minimizing pixel-wise errors, which typically leads to over-smoothed results [36]. Degradation of structural information would impede their practical applications, for example raising safety issues in autonomous driving. To handle this problem, we propose a corrected gradient loss for the depth completion task. Gradient loss [26] was originally


proposed to preserve the structure of natural images. Intuitively, applying it to depth map regression problems may also help to refine the blurry predictions obtained with MAE or MSE losses. However, it is difficult to obtain a completely and precisely annotated depth dataset. One consequence is that the gradients, such as those extracted with Sobel filtering, between depth points that are unreachable by the depth measurement and other depth points are inaccurate. We elaborate on this phenomenon in Fig. 3. The dark red regions in the distance in Fig. 3 are actually unreachable by the depth measurement. It can be clearly seen that the "annotated" depth of these regions is set to zero, so the gradient between other areas and these regions is also unreliable, and even reversed. To handle this problem, we propose the corrected gradient loss, formulated as

$$\mathcal{L}_{cgdl}(\hat{Y}, Y) = \left\| \big(F(\hat{Y}) - F(Y)\big) \ast E\big(\mathrm{sgn}(Y \in \nu)\big) \right\|_{p}, \qquad (1)$$

where F(·) denotes a specific gradient operator or a set of gradient operators, ν denotes the valid value range of the depth measurement, sgn(·) denotes the signum function that extracts the sign of its input, and E(·) denotes the erosion operation, a morphology-based image processing method. E(sgn(Y ∈ ν)) can also be understood as playing the role of online data cleaning. The loss L_cgdl can easily be introduced into existing DC models, and the final loss function in our work is

$$L(\hat{Y}, Y) = L_{l1}(\hat{Y}, Y) + \lambda_1 L_{l1}(\hat{Y}_{global}, Y) + \lambda_2 L_{l1}(\hat{Y}_{local}, Y) + \lambda_3 L_{cgdl}(\hat{Y}, Y), \qquad (2)-(3)$$

where

$$\lambda_1 = L_{l1}(\hat{Y}, Y) / L_{l1}(\hat{Y}_{global}, Y), \qquad (4)$$
$$\lambda_2 = L_{l1}(\hat{Y}, Y) / L_{l1}(\hat{Y}_{local}, Y), \qquad (5)$$
$$\lambda_3 = 0.7 \cdot L_{l1}(\hat{Y}, Y) / L_{cgdl}(\hat{Y}, Y). \qquad (6)$$

We also utilize an adaptive loss weight adjustment strategy to search for the optimum fusion during training; its implementation can be seen from Eqs. (4) and (5). The losses calculated on the results of the two-branch network are strictly constructed to have the same value, because these losses should be optimized synchronously rather than alternately, so as to match the fusion strategy in the GLDP framework and avoid the mode collapse caused by model mismatch [10].
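A hedged PyTorch sketch of Eq. (1) is given below; the single Sobel-Feldman kernel, the chosen valid depth range ν, the 3 × 3 erosion, and the mean reduction are illustrative choices rather than the authors' exact settings.

```python
import torch
import torch.nn.functional as F

# Vertical Sobel-Feldman kernel (y-gradient), shaped for a 1-channel depth map.
SOBEL_Y = torch.tensor([[-1., -2., -1.], [0., 0., 0.], [1., 2., 1.]]).view(1, 1, 3, 3)

def corrected_gradient_loss(pred, gt, valid_range=(1e-3, 100.0), p=1):
    """Sketch of the corrected gradient loss of Eq. (1) for (B, 1, H, W) maps."""
    kernel = SOBEL_Y.to(pred)
    grad_diff = F.conv2d(pred, kernel, padding=1) - F.conv2d(gt, kernel, padding=1)
    # sgn(Y in nu): 1 where the ground truth lies inside the measurable range.
    valid = ((gt > valid_range[0]) & (gt < valid_range[1])).float()
    # Erosion of a binary mask = 1 - dilation (max-pool) of its complement.
    eroded = 1.0 - F.max_pool2d(1.0 - valid, kernel_size=3, stride=1, padding=1)
    # p-norm of the masked gradient difference, reduced by a mean over pixels.
    return (grad_diff * eroded).abs().pow(p).mean()
```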

4 Experiments

4.1 Datasets and Metrics

Before the Mobile Intelligent Photography and Imaging (MIPI) challenge, there was no high-quality sparse depth completion dataset, which hindered related research in both industry and academia. To this end, the RGBD challenge of MIPI


provided a high-quality synthetic DC dataset (MIPI dataset), which contains 20,000 pairs of pre-aligned RGB and ground-truth depth images of 7 indoor scenes. For each scene, the RGB and the ground-truth depth are rendered along a smooth trajectory in a created 3D virtual environment. For testing, the testing data in the MIPI dataset comes from mixed sources, including synthetic data, spot-iToF, and some samples subsampled from the dToF in iPhone. These mixed testing data can verify the robustness and generalization ability of different DC methods in real-world scenarios. Four metrics are employed to evaluate the performance of depth completion algorithms, i.e., Relative Mean Absolute Error (RMAE), Edge Weighted Mean Absolute Error (EWMAE), Relative Depth Shift (RDS), and Relative Temporal Standard Deviation (RTSD). Details of these metrics can be found in the technical reports of the MIPI RGBD challenge. The final score combines these four metrics as follows:

score = 1 − 1.8 × RMAE − 0.6 × EWMAE − 3 × RDS − 4.6 × RTSD.   (7)
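For reference, the helper below is a direct transcription of Eq. (7) into Python; the function name is ours, and the inputs are the four metric values.

```python
def mipi_rgbd_score(rmae: float, ewmae: float, rds: float, rtsd: float) -> float:
    """Overall score of Eq. (7): a weighted penalty of the four error metrics."""
    return 1.0 - 1.8 * rmae - 0.6 * ewmae - 3.0 * rds - 4.6 * rtsd
```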

4.2 Implementational Details

In the training stage, we use the AdamW optimizer [22] with β1 = 0.9, β2 = 0.999. We set the initial learning rate to 0.001 and use a cosine-annealing learning-rate scheduler with a warm start: after a linear increase during the first 10 epochs, the learning rate gradually decreases along the cosine annealing curve after each mini-batch within one period. The proposed EMDC is trained with a mini-batch size of 10 for a total of 150 epochs. The training samples are further augmented via random flips and color jitter. In addition, we use the exponential moving average (EMA) technique to obtain a more stable model. Our algorithm is implemented in PyTorch, and a full training of our network takes about 12 h on an NVIDIA Tesla A6000 GPU. The source code and pretrained models are available to facilitate reproduction (https://github.com/dwHou/EMDC-PyTorch).
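The described training setup can be sketched as follows; this is our illustration, not the released code. The scheduler is stepped per epoch for brevity (the paper steps it per mini-batch), and the EMA decay of 0.999 is an assumption.

```python
import math
import torch

def build_optimizer(model, total_epochs=150, warmup_epochs=10, base_lr=1e-3):
    """AdamW with a linear warm-up followed by cosine annealing (sketch)."""
    opt = torch.optim.AdamW(model.parameters(), lr=base_lr, betas=(0.9, 0.999))

    def lr_lambda(epoch):
        if epoch < warmup_epochs:                          # linear warm-up
            return (epoch + 1) / warmup_epochs
        t = (epoch - warmup_epochs) / max(1, total_epochs - warmup_epochs)
        return 0.5 * (1.0 + math.cos(math.pi * t))         # cosine decay

    sched = torch.optim.lr_scheduler.LambdaLR(opt, lr_lambda)
    return opt, sched

class EMA:
    """Simple exponential moving average of model parameters and buffers."""
    def __init__(self, model, decay=0.999):
        self.decay = decay
        self.shadow = {k: v.detach().clone() for k, v in model.state_dict().items()}

    @torch.no_grad()
    def update(self, model):
        for k, v in model.state_dict().items():
            if v.dtype.is_floating_point:
                self.shadow[k].mul_(self.decay).add_(v.detach(), alpha=1 - self.decay)
            else:
                self.shadow[k].copy_(v)
```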

4.3 Experimental Results

Table 1 lists the experimental results of different DC methods, i.e., CSPN [5], PENet [14], FusionNet [35], and the proposed EMDC. From Table 1, we can find that the proposed EMDC achieves better performance than the efficient CSPN, PENet, and FusionNet. In addition, Table 1 also lists the performance of the state-of-the-art methods of the other top teams. Owing to the well-designed architecture, the proposed EMDC produces better results on EWMAE, RTSD, and the overall score. Figure 4 and Fig. 5 illustrate some subjective results of different DC methods. Overall, the proposed EMDC can reproduce sharper edges, smoother flat surfaces, and clearer details. As shown in Fig. 4, the EMDC reproduces a more complete flat surface, whereas the other methods cannot handle different colors on the same surface well. Comparing the tiny lines in Fig. 5, the proposed method reproduces better details than the other methods.

Fig. 4. Depth completion results of different methods.

Table 1. Objective results of different DC methods on MIPI dataset. Red/Blue text: best/second-best.

Method | Overall score↑ | RMAE↓ | EWMAE↓ | RDS↓ | RTSD↓
FusionNet [35] | 0.795 | 0.019 | 0.094 | 0.009 | 0.019
CSPN [5] | 0.811 | 0.015 | 0.090 | 0.007 | 0.019
PENet [14] | 0.840 | 0.014 | 0.087 | 0.003 | 0.016
2nd in MIPI | 0.846 | 0.009 | 0.085 | 0.002 | 0.017
3rd in MIPI | 0.842 | 0.012 | 0.087 | 0.000 | 0.018
EMDC (ours) | 0.855 | 0.012 | 0.084 | 0.002 | 0.015


Fig. 5. Different depth completion results of another indoor scene.

Ablation Studies. For the ablation testing, we gradually change various settings of the proposed EMDC. All ablation tests are conducted under the same experimental setup, and the results are reported in Table 2. The ablation results in Table 2 demonstrate the effectiveness of these modules and settings. The relative-certainty fusion module and the corrected gradient loss significantly improve the performance. Using the pixel-shuffle layer, removing the BN layers, and applying the ReZero initialization are also helpful for sparse DC tasks. In addition, FCSPN with 9 stages outperforms that with 6 stages, but more iterations also lead to higher time cost.

Inference Time. As sparse ToF sensors are mainly applied in power-limited mobile devices nowadays, power consumption and processing time are important factors in DC tasks. Hence, we compare the running time of CSPN and FCSPN on the same NVIDIA Tesla V100 GPU. In Table 3, EMDC (FCSPNs=6) consists of 6 stages with a total of 15 iterations, and EMDC (FCSPNs=9) consists


Table 2. Ablation study on the proposed methods. The results are evaluated by the scoring method mentioned in Sect. 4.3. Configurations (a)-(h) toggle combinations of the following components: FCSPNs=9, FCSPNs=6, Pixel-Shuffle, Remove BN, global network, local network, Relative Certainty, ReZero Init, and Corrected GDL. [Placeholder: the check marks indicating which components each configuration enables could not be recovered.]

MIPI overall score↑: (a) 0.798 | (b) 0.819 | (c) 0.826 | (d) 0.834 | (e) 0.838 | (f) 0.849 | (g) 0.847 | (h) 0.855

Table 3. Runtime of CSPN and the proposed EMDC on 256 × 192 resolution.

Method | SPN part | Iterations | Propagation time
UNet+CSPN [4] | CSPN | 21 | 25.3 ms
EMDC | FCSPNs=6 | 15 | 15.7 ms
EMDC | FCSPNs=9 | 21 | 18.2 ms

of 9 stages with 21 iterations. The improved FCSPN structure runs faster than the efficient CSPN. Note that, although EMDC (FCSPNs=6) uses fewer propagation iterations than CSPN, the comparison of overall scores shows that it still performs better than CSPN on the MIPI dataset.

5 Conclusions

This paper proposed a lightweight and efficient multimodal depth completion (EMDC) model for sparse depth completion tasks with RGB image guidance. The EMDC consists of a two-branch global and local depth prediction (GLDP) module and a funnel convolutional spatial propagation network (FCSPN). The global branch and local branch in GLDP extract depth features from pixel-wise RGB image and sparse depth data according to the characteristics of different modalities. An improved fusion module with relative confidence maps is then applied to fuse multimodal information. The FCSPN module can refine both the objective and subjective quality of depth maps gradually. A corrected gradient loss is also presented for the depth completion problem. Experimental results show the effectiveness of the proposed method. In addition, the proposed method wins the first place in the RGB+ToF Depth Completion track in Mobile Intelligent Photography and Imaging (MIPI) challenge.


References 1. Bachlechner, T., Majumder, B.P., Mao, H., Cottrell, G., McAuley, J.: Rezero is all you need: fast convergence at large depth. In: Uncertainty in Artificial Intelligence, pp. 1352–1361. PMLR (2021) 2. Chen, Y., Yang, B., Liang, M., Urtasun, R.: Learning joint 2D–3D representations for depth completion. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pp. 10023–10032 (2019) 3. Cheng, X., Wang, P., Guan, C., Yang, R.: CSPN++: learning context and resource aware convolutional spatial propagation networks for depth completion. In: Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), vol. 34, pp. 10615–10622 (2020) 4. Cheng, X., Wang, P., Yang, R.: Depth estimation via affinity learned with convolutional spatial propagation network. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11220, pp. 108–125. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01270-0 7 5. Cheng, X., Wang, P., Yang, R.: Learning depth with convolutional spatial propagation network. IEEE Trans. Pattern Anal. Mach. Intell. 42(10), 2361–2379 (2019) 6. Dimitrievski, M., Veelaert, P., Philips, W.: Learning morphological operators for depth completion. In: Blanc-Talon, J., Helbert, D., Philips, W., Popescu, D., Scheunders, P. (eds.) ACIVS 2018. LNCS, vol. 11182, pp. 450–461. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01449-0 38 7. Eldesokey, A., Felsberg, M., Holmquist, K., Persson, M.: Uncertainty-aware CNNs for depth completion: uncertainty from beginning to end. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 12014–12023 (2020) 8. Eldesokey, A., Felsberg, M., Khan, F.S.: Propagating confidences through CNNs for sparse data regression. arXiv preprint arXiv:1805.11913 (2018) 9. Fu, C., Dong, C., Mertz, C., Dolan, J.M.: Depth completion via inductive fusion of planar LiDAR and monocular camera. In: IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 10843–10848 (2020) 10. Goodfellow, I., et al.: Generative adversarial nets. In: Proceedings on the International Conference on Neural Information Processing Systems (NIPS), vol. 27 (2014) 11. Gu, J., Xiang, Z., Ye, Y., Wang, L.: DenseLiDAR: a real-time pseudo dense depth guided depth completion network. IEEE Robot. Autom. Lett. 6(2), 1808–1815 (2021) 12. Hambarde, P., Murala, S.: S2DNet: depth estimation from single image and sparse samples. IEEE Trans. Comput. Imaging 6, 806–817 (2020) 13. Hu, J., et al.: Deep depth completion: a survey. arXiv preprint arXiv:2205.05335 (2022) 14. Hu, M., Wang, S., Li, B., Ning, S., Fan, L., Gong, X.: PENet: towards precise and efficient image guided depth completion. In: IEEE International Conference on Robotics and Automation (ICRA), pp. 13656–13662 (2021) 15. Imran, S., Long, Y., Liu, X., Morris, D.: Depth coefficients for depth completion. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 12438–12447 (2019) 16. Jaritz, M., De Charette, R., Wirbel, E., Perrotton, X., Nashashibi, F.: Sparse and dense data with CNNs: depth completion and semantic segmentation. In: IEEE International Conference on 3D Vision (3DV), pp. 52–60 (2018)


17. Kwon, H., Tai, Y.W., Lin, S.: Data-driven depth map refinement via multi-scale sparse representation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 159–167 (2015) 18. Lee, S., Lee, J., Kim, D., Kim, J.: Deep architecture with cross guidance between single image and sparse LiDAR data for depth completion. IEEE Access 8, 79801– 79810 (2020) 19. Liao, Y., Huang, L., Wang, Y., Kodagoda, S., Yu, Y., Liu, Y.: Parse geometry from a line: monocular depth estimation with partial laser observation. In: IEEE International Conference on Robotics and Automation (ICRA), pp. 5059–5066 (2017) 20. Lin, Y., Cheng, T., Zhong, Q., Zhou, W., Yang, H.: Dynamic spatial propagation network for depth completion. arXiv preprint arXiv:2202.09769 (2022) 21. Liu, S., De Mello, S., Gu, J., Zhong, G., Yang, M.H., Kautz, J.: Learning affinity via spatial propagation networks. In: Proceedings on the International Conference on Neural Information Processing Systems (NIPS), vol. 30 (2017) 22. Loshchilov, I., Hutter, F.: Decoupled weight decay regularization. arXiv preprint arXiv:1711.05101 (2017) 23. Lu, K., Barnes, N., Anwar, S., Zheng, L.: From depth what can you see? Depth completion via auxiliary image reconstruction. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 11306– 11315 (2020) 24. Lu, K., Barnes, N., Anwar, S., Zheng, L.: Depth completion auto-encoder. In: IEEE/CVF Winter Conference on Applications of Computer Vision Workshops (WACVW), pp. 63–73 (2022) 25. Ma, F., Karaman, S.: Sparse-to-Dense: depth prediction from sparse depth samples and a single image. In: IEEE International Conference on Robotics and Automation (ICRA), pp. 4796–4803 (2018) 26. Mathieu, M., Couprie, C., LeCun, Y.: Deep multi-scale video prediction beyond mean square error. arXiv preprint arXiv:1511.05440 (2015) 27. Park, J., Joo, K., Hu, Z., Liu, C.-K., So Kweon, I.: Non-local spatial propagation network for depth completion. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.M. (eds.) ECCV 2020. LNCS, vol. 12358, pp. 120–136. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58601-0 8 28. Qiu, J., et al.: DeepLiDAR: deep surface normal guided depth prediction for outdoor scene from sparse LiDAR data and single color image. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3313–3322 (2019) 29. Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.C.: MobileNetV2: inverted residuals and linear bottlenecks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 4510–4520 (2018) 30. Schuster, R., Wasenmuller, O., Unger, C., Stricker, D.: SSGP: sparse spatial guided propagation for robust and generic interpolation. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), pp. 197–206 (2021) 31. Senushkin, D., Romanov, M., Belikov, I., Patakin, N., Konushin, A.: Decoder modulation for indoor depth completion. In: IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 2181–2188 (2021) 32. Shi, W., et al.: Real-time single image and video super-resolution using an efficient sub-pixel convolutional neural network. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1874–1883 (2016)


33. Tang, J., Tian, F.P., Feng, W., Li, J., Tan, P.: Learning guided convolutional network for depth completion. IEEE Trans. Image Process. 30, 1116–1129 (2020) 34. Uhrig, J., Schneider, N., Schneider, L., Franke, U., Brox, T., Geiger, A.: Sparsity invariant CNNs. In: IEEE International Conference on 3D Vision (3DV), pp. 11–20 (2017) 35. Van Gansbeke, W., Neven, D., De Brabandere, B., Van Gool, L.: Sparse and noisy LiDAR completion with RGB guidance and uncertainty. In: IEEE International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019) 36. Wang, Z., Bovik, A.C.: Mean squared error: love it or leave it? A new look at signal fidelity measures. IEEE Sig. Process. Mag. 26(1), 98–117 (2009) 37. Xu, Y., Zhu, X., Shi, J., Zhang, G., Bao, H., Li, H.: Depth completion from sparse LiDAR data with depth-normal constraints. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pp. 2811–2820 (2019) 38. Xu, Z., Yin, H., Yao, J.: Deformable spatial propagation networks for depth completion. In: IEEE International Conference on Image Processing (ICIP), pp. 913–917 (2020) 39. Yan, Z., et al.: RigNet: repetitive image guided network for depth completion. arXiv preprint arXiv:2107.13802 (2021) 40. Yang, J., Ye, X., Li, K., Hou, C., Wang, Y.: Color-guided depth recovery from RGB-D data using an adaptive autoregressive model. IEEE Trans. Image Process. 23(8), 3443–3458 (2014) 41. Yang, Y., Wong, A., Soatto, S.: Dense depth posterior (DDP) from single image and sparse range. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3353–3362 (2019) 42. Zhao, S., Gong, M., Fu, H., Tao, D.: Adaptive context-aware multi-modal network for depth completion. IEEE Trans. Image Process. 30, 5264–5276 (2021) 43. Zhong, Y., Wu, C.Y., You, S., Neumann, U.: Deep RGB-D canonical correlation analysis for sparse depth completion. In: Proceedings on the International Conference on Neural Information Processing Systems (NIPS), vol. 32 (2019)

Learning Rich Information for Quad Bayer Remosaicing and Denoising

Jun Jia1, Hanchi Sun1, Xiaohong Liu2(B), Longan Xiao3, Qihang Xu3, and Guangtao Zhai1(B)

1 Institute of Image Communication and Network Engineering, Shanghai Jiao Tong University, Shanghai, China, {jiajun0302,shc15522,zhaiguangtao}@sjtu.edu.cn
2 John Hopcroft Center for Computer Science, Shanghai Jiao Tong University, Shanghai, China, [email protected]
3 Shanghai Transsion Information Technology, Shanghai, China, {longan.xiao1,qihang.xu}@transsion.com

Abstract. In this paper, we propose a DNNs-based solution to jointly remosaic and denoise the camera raw data in Quad Bayer pattern. The traditional remosaic problem can be viewed as an interpolation process that converts the Quad Bayer pattern to a normal CFA pattern, such as the RGGB one. However, this process becomes more challenging when the input Quad Bayer data is noisy. In addition, the limited amount of data available for this task is not sufficient to train neural networks. To address these issues, we view the remosaic problem as a bayer reconstruction problem and use an image restoration model to remove noises while remosaicing the Quad Bayer data implicitly. To make full use of the color information, we propose a two-stage training strategy. The first stage uses the ground-truth RGGB Bayer map to supervise the reconstruction process, and the second stage leverages the provided Image Signal Processor (ISP) to generate the RGB images from our reconstructed bayers. With the use of color information in the second stage, the quality of reconstructed bayers is further improved. Moreover, we propose a data pre-processing method including data augmentation and bayer rearrangement. The experimental results show it can significantly benefit the network training. Our solution achieves the best KLD score with one order of magnitude lead, and overall ranks the second in Quad Joint Remosaic and Denoise @ MIPI-challenge.

Keywords: Quad Bayer · Remosaicing · Denoising · Data augmentation

1 Introduction

In recent years, the increasing demand for smartphone camera performance has driven the pursuit of higher imaging quality in smartphone image sensors.


Fig. 1. The overall pipeline of this paper. The input of our model is a Quad Bayer map and the output is a RGGB Bayer map.

One of these trends is the multi-pixel design, which improves image resolution by reducing the size of each pixel and packing more pixels. However, there is a trade-off between pixel miniaturization and a decrease in sensitivity. For example, when capturing photos in low-illuminance environments, the weak sensitivity of the sensor may degrade the imaging quality. Under this background, a new Color Filter Array (CFA) pattern called Quad Bayer was invented to achieve a good trade-off between the pixel size and the sensitivity of the sensor. The Quad Bayer can minimize the decrease in sensitivity even when the pixel size is small, which improves the imaging quality under low light. Compared to normal CFA patterns, such as the RGGB Bayer pattern, the four adjacent pixels of a Quad Bayer are clustered with the same color filters. As shown in Fig. 1, the Quad Bayer data has three kinds of color filters, and the pixels within a 2 × 2 neighborhood share the same color filter.

The Quad Bayer has two modes, for low and normal light. When capturing photos under low light, the binning mode enhances the sensitivity of the sensor by averaging the four pixels within a 2 × 2 neighborhood, which improves the imaging quality; as a trade-off, the spatial resolution is halved. When capturing photos under normal light, the output Bayer map is supposed to have the same spatial resolution as the input Quad Bayer data. Thus, the original Quad Bayer data needs to be converted to a normal CFA pattern and then fed to the Image Signal Processor (ISP). This conversion is an interpolation process called remosaicing. Traditional remosaic algorithms are implemented in hardware; for example, Sony Semiconductor Solutions Corporation (SSS) handles remosaicing by installing an array conversion circuit on the image sensor chip (see https://www.sony-semicon.com/en/technology/mobile/quad-bayer-coding.html). Compared to hardware-based algorithms, software-based remosaic algorithms can be applied more flexibly to different devices.

A good remosaic algorithm should be able to obtain a normal Bayer output from the Quad Bayer data with the least artifacts, such as moire patterns, false colors, and so forth. However, there are two challenges when designing a remosaic algorithm. The first challenge is that the remosaic problem becomes difficult when the input Quad Bayer data is noisy. Thus, a solution that jointly remosaics and denoises is in demand for real-world applications. However, denoising and remosaicing are two separate tasks, which makes it difficult to combine them into one algorithm. To address this challenge, we view this process as a reconstruction problem from the


noisy Quad Bayer map to the clean RGGB Bayer map. Inspired by the success of deep neural networks (DNNs) in image reconstruction tasks, we propose to use a DNNs-based model to remove noise while implicitly rearranging the Quad Bayer map. We present the overall pipeline for real applications in Fig. 1. In addition, we propose a two-stage training strategy to make full use of the color information in both the Bayer domain and the RGB domain. As shown in Fig. 2, the first stage uses the ground-truth RGGB Bayer maps to supervise the training of the reconstruction process. After that, the second stage applies the ISP provided by the organizers to generate RGB images from the reconstructed Bayer maps. This fine-tuning stage can further improve the quality of our reconstruction.

To train a robust DNNs-based model, we need sufficient training data. However, no public dataset currently contains paired noisy Quad Bayer data and clean RGGB Bayer data. Thus, the limited amount of data available for this task is the second challenge. Although the organizers of Quad Joint Remosaic and Denoise @ MIPI-challenge provide a training set that includes 210 paired noisy Quad Bayer and clean RGGB Bayer samples, they are not sufficient for training. To address this challenge, we propose a data pre-processing method that employs data augmentation and Bayer rearrangement to expand and unify the training samples. The experimental results show that this pre-processing can significantly benefit the network training. After developing and validating our solution, we submitted the trained model to Quad Joint Remosaic and Denoise @ MIPI-challenge. Our solution is ranked second in the final test phase and achieves the best KLD score. To summarize, our contributions include:

– We propose a DNNs-based model with a two-stage training strategy to jointly remosaic and denoise Quad Bayer data. By leveraging the two-stage training strategy, the model can make full use of the color information in both the Bayer domain and the RGB domain, and the reconstruction quality can be further improved.
– We propose a data pre-processing method including data augmentation and Bayer rearrangement. The experimental results show that this pre-processing can significantly benefit the network training.
– We submit our solution to Quad Joint Remosaic and Denoise @ MIPI-challenge. Our solution is ranked second in the final test phase and achieves the best KLD score.

Codes: https://github.com/jj199603/MIPI2022QuadBayer

2 Related Works

2.1 Denoising

Image denoising has received extensive attention in the past few decades. Current image denoising methods can be classified into two categories, i.e., traditional methods and deep learning based methods.


In traditional image denoising methods, image analysis and processing are usually based on transcendental images. Common methods are 3D transformdomain filtering (BM3D) [10], non-local means (NLM) [4], sparse coding [2], etc. The non-local similarity approach [4] used a non-local algorithm with shared similarity patterns, and the same strategy is applied to [15]. The application of image denoising was implemented using weighted nuclear norm minimization in [18], statistical properties of images were used to remove noise in [43], and scale mixtures of Gaussians were used in the wavelet domain for denoising in [41]. Dictionary learning methods [13] relied on sparse learning [33] from images to obtain a complete dictionary. Traditional denoising methods have certain denoising effectiveness based on reasonable use of image information but the traditional image denoising method is limited because it cannot be extended to all real scenes. With the development and application of data-driven deep learning, image denoising is given a new processing method. A growing number of researchers are designing novel network architectures based on CNN and transformers, improving the accuracy and versatility of image denoising. [5] first introduced the multilayer perceptron (MLP) in image denoising and achieves the comparable performance to BM3D. To mimic the real images, many synthetic noise methods are proposed, such as Poissonian-Gaussian noise model [16], Gaussian mixture model [54], camera process simulation [46], and genetic algorithm [8]. Since then, several large physical noise datasets have been generated, such as DND [40] and SIDD [1]. In addition, the real image denoising method is also considered. Researchers first tried the methods previously applied to synthetic noisy datasets on real datasets with model adaptation and tuning [53]. Among them, AINDNet [24] adopted the transfer learning from comprehensive denoising to real denoising, and achieved satisfactory results. The VDN [48] network architecture based on U-Net [42] was proposed, which assumed that the noise follows an inverse gamma distribution. However, the distribution of noise in the real world is often more complex, so this hypothesis did not apply to many application scenarios. Subsequently, DANet [50] abandoned the hypothesis of noise distribution and used the GAN framework to train the model. Two parallel branches were also used in the structure: one for noise removal and the other for noise generation. A potential limitation was that the training of the GAN-based model was unstable and thus took longer to converge [3]. DANet also used the U-Net architecture in the parallel branch. Zhang et al. [53] recently proposed FFDNet, a denoising network using supervised learning, which connected noise levels as a mapping of noisy images and demonstrated the spatially invariant denoising of real noise with oversmoothed details. MIRNet [51] proposed a general network architecture for image enhancement, such as denoising and super-resolution, with many new building blocks that extracted, exchanged, and exploited multi-scale feature information. InvDN [31] transformed noisy inputs into low-resolution clean images and implicit representations containing noise. To remove noise and restore a clear image,


InvDN replaced the noise-related implicit representation with another implicit representation sampled from a prior distribution during noise reduction.

2.2 ISP and Demosaicing

A typical camera ISP pipeline uses a large number of image processing blocks to reconstruct sRGB images from raw sensor data. In [27], a two-stage deep network was proposed to replace the camera ISP. In [21], it was proposed to replace the entire ISP of the Huawei P20 smartphone with a combination of deep models and extensive global feature operations. In [44], the authors proposed a CNN model to suppress image noise and correct the exposure of images captured by smartphone cameras.

Image demosaicing is considered a low-level ISP task aimed at reconstructing RGB images from the CFA pattern. However, in practical applications, image sensors can be affected by noise, which can also corrupt the final reconstruction results during demosaicing [29]. Recent work has therefore focused on combining demosaicing and denoising rather than performing them as traditional sequential operations. In the last four decades, signal processing methods have been widely used for the demosaicing problem [37,47] or resort to frequency information to reduce the zipper effect [11,34]. Early methods used frequency-domain analysis to design aliasing-free filters [12]. To improve performance near edges, a median filter on chromatic differences [20] was applied, and gradient-based approaches [39] were widely used. While many methods use chromatic differences, Monno recommends using color residuals, starting with a bilinear interpolation of the G channel and then refining the red and blue residuals [38]. However, traditional image processing methods cannot produce good image quality and easily produce visual artifacts. More and more demosaicing methods exploit machine learning techniques; see the energy-based minimization in [25] or Heide's complete ISP modeling [19]. In the last five years, deep learning has become more and more important in low-level vision tasks [23,30,45]. The work related to image demosaicing is summarized below.

Garbi et al. presented the first end-to-end solution that combined noise reduction and demosaicing for Bayer images [17]. After receiving Bayer images, they extracted the four RGGB channels and concatenated them with an estimated noise channel. The five channels were used as the low-resolution input of a CNN. They then used a simple network structure similar to VDSR with stacked convolutions and a global residual path. Before the final convolution, they also concatenated the original mosaicked Bayer plane to the preceding feature maps. Their main contributions are a data-driven approach to demosaicing and the publication of a new training dataset built by hard patch mining, using HDR-VDP2 and a moire detection metric to detect artifact-prone patches [35]. Liu et al. [29] proposed a deep-learning-based approach, supplemented by density maps and green-channel guidance. In [26], a majorization-minimization method was merged into a residual denoising network. A deep network was


trained using thousands of images to achieve state-of-the-art results [17]. In addition to these supervised learning methods, [14] attempted to address joint demosaicing and denoising through unsupervised learning on large numbers of images. A planar encoder-decoder structure with symmetric skip connections (RED-Net) was proposed by Mao et al. in [36]. RED-Net uses skip connections to link the encoder and decoder components, but the network is simple and not multi-resolution, and the authors tried different network depths, including deeper ones. Long et al. [32] proposed fully convolutional networks for image segmentation, and the improved multi-scale version, U-Net [42], captures context at different resolutions. It has been demonstrated that the U-Net architecture performs better than the DnCNN [22] network.

3 Proposed Method

In this section, we first describe the details of the Quad Bayer pre-processing method, including data augmentation and Bayer rearrangement, in Sect. 3.1. Then, the architecture of the model is presented in Sect. 3.2. Finally, we describe the proposed effective two-stage training strategy in Sect. 3.3.

3.1 Quad Bayer Pre-processing

Bayer Rearrangement. As shown in Fig. 2, a raw image in the Quad Bayer pattern consists of multiple 4 × 4 Quad Bayer units, and each unit consists of 4 red, 8 green, and 4 blue pixels. Inspired by the success of raw image processing methods [7,28], a Quad Bayer map should be decomposed into four channels: the R, G1, G2, and B channels. For convenient channel decomposition, we first swap the second and third columns in each 4 × 4 Quad Bayer unit and then swap the second and third rows of each unit. After that, the Quad Bayer map is converted to an RGGB-alike Bayer map, which we decompose into four channel maps: the R, G1, G2, and B channels, respectively. This process is presented in the first row of Fig. 2. The swapping only permutes pixels spatially within each Quad Bayer unit, so the converted RGGB-alike maps still contain noise. Since the provided official ISP does not support a raw image in the Quad Bayer pattern as input, we also apply this swapping to the original noisy Quad Bayer data for visualization in the remaining sections of this paper.
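A minimal NumPy sketch of the described swapping and channel decomposition is given below; the function name is ours, and the input is assumed to be a single-channel array whose height and width are multiples of four.

```python
import numpy as np

def quad_to_rggb_alike(quad):
    """Swap columns 2/3 and rows 2/3 inside every 4x4 Quad Bayer unit so the
    map becomes RGGB-alike, then split it into R, G1, G2, B channel maps."""
    out = quad.copy()
    # Swap the 2nd and 3rd columns of each 4x4 unit (0-based indices 1 and 2).
    out[:, 1::4], out[:, 2::4] = out[:, 2::4].copy(), out[:, 1::4].copy()
    # Swap the 2nd and 3rd rows of each 4x4 unit.
    out[1::4, :], out[2::4, :] = out[2::4, :].copy(), out[1::4, :].copy()
    # Decompose the RGGB-alike map into four half-resolution channel maps.
    r, g1 = out[0::2, 0::2], out[0::2, 1::2]
    g2, b = out[1::2, 0::2], out[1::2, 1::2]
    return np.stack([r, g1, g2, b], axis=0)  # shape (4, H/2, W/2)
```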

Joint Remosaicing and Denoising Network

181

Fig. 2. The solution pipeline of this paper. In the data pre-processing stage, we use data augmentation to expand training samples and convert a Quad Bayer map to four channel RGGB maps. In the first stage, the ground-truth RGGB Bayer maps are used for supervision. In the second stage, the ground-truth RGB image generated by the ISP system is used for supervision. (Color figure online)

To maintain the RGGB pattern after applying augmentations, a cropping-based post-processing method inspired by [28] is used to unify the CFA pattern after these augmentations.
– Vertical flip: after vertical flipping, an RGGB-alike map is converted to a GBRG-alike map. If we directly decompose the GBRG-alike map into four channels, the channel order will be changed. Thus, we remove the first and last rows of the flipped GBRG-alike map to convert it to a new RGGB-alike map. The details of this process are presented in Fig. 3(a).



– Horizontal flip: after horizontal flipping, an RGGB-alike map is converted to a GRBG-alike map. To unify the pattern for channel decomposition, we remove the first and last columns of the flipped GRBG-alike map to convert it to a new RGGB-alike map. The details of this process are presented in Fig. 3(b).
– Transposition: the rows and the columns of the original RGGB-alike map are permuted after transposition, but the pattern is still RGGB. Traditional data augmentation also includes rotations of 90°, 180°, and 270°. However, these rotations can be obtained by a combination of flip and transposition; for instance, a clockwise rotation of 90° is equivalent to transposing and then flipping horizontally. Thus, transposition can be viewed as the elemental operation of rotation. The details of transposition are shown in Fig. 3(c).
In addition, to improve training efficiency, we randomly crop the augmented RGGB-alike map to h × w. The starting coordinates of the crop must be even numbers to maintain the RGGB pattern.

Fig. 3. The details of the data augmentations used in this paper.
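For illustration, the following minimal sketch (our own rendering of Sect. 3.1, not the authors' released code) shows the column/row swap that converts a Quad Bayer tile into an RGGB-alike map, the decomposition into four channels, and the vertical-flip augmentation with the row cropping that preserves the RGGB pattern; the horizontal flip and transposition are handled analogously.

```python
import numpy as np

def quad_to_rggb_like(quad):
    """Swap the 2nd/3rd columns and 2nd/3rd rows of every 4x4 Quad Bayer unit
    so the map becomes an RGGB-alike Bayer pattern (spatial swap only)."""
    out = quad.copy()
    # swap columns 1 and 2 within each 4-column block
    out[:, 1::4], out[:, 2::4] = quad[:, 2::4], quad[:, 1::4]
    tmp = out.copy()
    # swap rows 1 and 2 within each 4-row block
    out[1::4, :], out[2::4, :] = tmp[2::4, :], tmp[1::4, :]
    return out

def decompose_rggb(bayer):
    """Split an (h, w) RGGB Bayer map into an (h/2, w/2, 4) array: R, G1, G2, B."""
    return np.stack([bayer[0::2, 0::2],   # R
                     bayer[0::2, 1::2],   # G1
                     bayer[1::2, 0::2],   # G2
                     bayer[1::2, 1::2]],  # B
                    axis=-1)

def vertical_flip_keep_rggb(bayer):
    """Vertical flip turns RGGB into GBRG; dropping the first and last rows
    restores an RGGB-alike map, as in Fig. 3(a)."""
    flipped = bayer[::-1, :]
    return flipped[1:-1, :]

quad = np.random.randint(0, 1024, (8, 8)).astype(np.float32)   # toy Quad Bayer tile
rggb_like = quad_to_rggb_like(quad)
channels = decompose_rggb(rggb_like)            # shape (4, 4, 4)
augmented = vertical_flip_keep_rggb(rggb_like)  # shape (6, 8), still RGGB-alike
```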


3.2 Network for Jointly Remosaicing and Denoising

In this paper, we view the remosaicing problem as a reconstruction process from a Quad Bayer map to an RGGB Bayer map. Inspired by the success of image restoration methods based on deep neural networks [9,49,51,52], we use a DNN-based network to remove noise while implicitly remosaicing the Quad pattern to the RGGB pattern. We select MIRNet [51] to remove noise and reconstruct RGGB Bayer maps. MIRNet [51] consists of multi-scale feature extraction paths, and feature fusion is carried out among feature maps of different scales to fully learn the multi-scale characteristics of the image. Compared to models based on auto-encoders [9], MIRNet contains a high-resolution feature path in which the feature maps are not downsampled, so more detailed information is preserved. Through the feature fusion among paths of different scales, the high-level semantic features extracted at small resolutions and the low-level features extracted at large resolutions are fully fused and learned. MIRNet exploits multiple attention mechanisms, such as spatial attention and channel attention, to fuse the multi-scale features. We present the overall architecture of this model in Fig. 2, omitting details such as the attention modules. The input of MIRNet is an h/2 × w/2 × 4 map generated by the data augmentation and Bayer rearrangement, and its output is also an h/2 × w/2 × 4 map. We apply a PixelShuffle operation to the output to generate an h × w × 1 RGGB map that represents the reconstructed RGGB Bayer data.
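A minimal sketch of the final re-packing step (our reading of the description above; the channel ordering R, G1, G2, B is an assumption):

```python
import torch
import torch.nn as nn

# The network predicts a (N, 4, h/2, w/2) tensor holding the R, G1, G2, B planes.
h, w = 8, 8
out = torch.rand(1, 4, h // 2, w // 2)

# PixelShuffle with upscale factor 2 interleaves the 4 channels into a (N, 1, h, w)
# map: channel 0 -> even rows/even cols, channel 1 -> even rows/odd cols,
# channel 2 -> odd rows/even cols, channel 3 -> odd rows/odd cols, which
# reproduces the RGGB layout if the channels are ordered R, G1, G2, B.
shuffle = nn.PixelShuffle(upscale_factor=2)
rggb = shuffle(out)
assert rggb.shape == (1, 1, h, w)
```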

3.3 Two-Stage Training Strategy

We propose an effective two-stage training strategy to improve the reconstruction quality. As shown in Fig. 2, in the first stage, the model described in Sect. 3.2 jointly remosaics the Quad Bayer data to RGGB Bayer data and denoises it, and the ground-truth RGGB Bayer map is used for supervision. After the first training stage, we concatenate the ISP provided by the challenge organizers to the joint remosaicing and denoising network. The provided ISP applies normalization, demosaicing, white balance correction, color correction, and gamma correction to a raw image. The output of our network and the ground-truth RGGB Bayer map are both processed by the ISP to generate the corresponding RGB images. Then, the generated ground-truth RGB image is used as a color supervision to fine-tune our network; in this stage, we freeze the parameters of the ISP. Compared to the Bayer data, the RGB image contains more color information, which can further improve the quality of the reconstructed raw data.

Loss Functions. In the first training stage, we use the Charbonnier loss [6] to optimize the parameters of our model, which is defined as

$$\mathcal{L}_{Charbon} = \sqrt{\|R_{rec} - R_{gt}\|^{2} + \epsilon^{2}} \quad (1)$$



where $R_{rec}$ represents the reconstructed raw data in the RGGB Bayer pattern, $R_{gt}$ represents the ground-truth RGGB Bayer map, and $\epsilon$ is a hyper-parameter set to $10^{-3}$. In the second training stage, we compute the L1 loss between the RGB images generated from the reconstructed raw data and from the ground-truth raw data.
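For reference, a minimal implementation of the Charbonnier loss in Eq. (1) with ε = 10⁻³ (a sketch consistent with the definition above, not the authors' released code):

```python
import torch

def charbonnier_loss(pred, target, eps=1e-3):
    """Charbonnier loss: sqrt((pred - target)^2 + eps^2), averaged over all pixels."""
    return torch.sqrt((pred - target) ** 2 + eps ** 2).mean()

rec = torch.rand(1, 1, 64, 64)   # reconstructed RGGB Bayer map (toy shape)
gt = torch.rand(1, 1, 64, 64)    # ground-truth RGGB Bayer map
loss = charbonnier_loss(rec, gt)
```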

4 Experimental Results

4.1 Dataset

The dataset used in our experiments is provided by the organizers of the MIPI 2022 challenge. The dataset in the development phase is divided into a training set and a validation set. Both include samples of three noise levels: 0 dB, 24 dB, and 42 dB. The training set of each noise level includes 70 Quad Bayer files and the corresponding ground-truth files in the RGGB pattern, while the validation set of each noise level includes 15 Quad Bayer files without ground truth. Thus, the training set includes 270 Quad Bayer samples and the validation set includes 45 Quad Bayer samples. The resolution of these samples is 1200 × 1800. The RGB thumbnails of these 85 raw images are also provided for the convenience of visualization. In our experiments, we use all 270 training samples to train our model and evaluate it on the 45 validation samples. Because the distribution of the noise model is unknown, no additional datasets are used for training, to prevent overfitting.

4.2 Implementation Details

We train three models for the three noise levels (0 dB, 24 dB, and 42 dB), respectively. The training samples are cropped raw data with a size of 256 × 256. The hyper-parameter $\epsilon$ of the Charbonnier loss is set to $10^{-3}$. We use Adam to optimize the network. The initial learning rate is 2 × 10⁻⁴, decaying by a factor of 10 every 235,200 iterations. The training batch size is 2, and the average training time of one model is about 36 h on one NVIDIA 2080 Ti. During validation and testing, the input is the original Quad Bayer data without any cropping, and the output of the model is the corresponding RGGB Bayer map with the same resolution as the input.
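A sketch of the optimizer configuration described above (the `model` here is only a placeholder; the step size of 235,200 iterations and factor of 0.1 follow the text):

```python
import torch

model = torch.nn.Conv2d(4, 4, 3, padding=1)  # placeholder for the remosaicing network
optimizer = torch.optim.Adam(model.parameters(), lr=2e-4)
# decay the learning rate by a factor of 10 every 235,200 iterations
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=235200, gamma=0.1)
# inside the training loop, call optimizer.step() and scheduler.step() once per iteration
```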

4.3 Evaluation Metrics

The official metrics used by the Quad Joint Remosaic and Denoise @MIPI-challenge are Peak Signal-to-Noise Ratio (PSNR), Structural Similarity Index Measure (SSIM), Learned Perceptual Image Patch Similarity (LPIPS), and KL divergence (KLD). The final results and rankings are evaluated by the M4 score:

$$M4 = PSNR \times SSIM \times 2^{1 - LPIPS - KLD} \quad (2)$$
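A small helper computing the M4 score from the four metric values (a sketch based on Eq. (2); the metric values themselves would come from standard PSNR/SSIM/LPIPS/KLD implementations):

```python
def m4_score(psnr, ssim, lpips, kld):
    """M4 = PSNR * SSIM * 2^(1 - LPIPS - KLD), as used for the challenge ranking."""
    return psnr * ssim * 2 ** (1.0 - lpips - kld)

# e.g. with the "With aug" metrics of Table 1:
m4 = m4_score(36.65, 0.956, 0.121, 0.00628)
```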

In the development and validation phase, we evaluate our model on the validation set. Since the ground-truth RGGB Bayer maps of the validation set are



not provided, we can only compute the accurate M4 scores through the official website. However, since the number of submissions is limited, we cannot compute the accurate M4 score for all experimental results. For the convenience of development and validation, we use PSNR to evaluate some of the experiments.

4.4 Ablation Study

In this section, we analyze the effects of the data augmentation methods and the proposed two-stage training strategy.

Ablation Study of Data Augmentations. To analyze the importance of the data augmentation methods, we first train three models without data augmentation for the three noise levels and then train three models with the data augmentations described in Sect. 3.1. During training, we randomly select one augmentation from vertical flip, horizontal flip, and transposition with a 25% probability each, and apply no augmentation with a 25% probability. The ablation results on the validation set are presented in Table 1, which shows that using data augmentation improves the PSNR, SSIM, and LPIPS scores.

Table 1. The ablation results of data augmentations

Model        PSNR   SSIM   LPIPS  KLD      M4
Without aug  36.24  0.954  0.127  0.00496  64.18
With aug     36.65  0.956  0.121  0.00628  65.15

Ablation Study of Two-Stage Training Strategy. To analyze the importance of the proposed two-stage training strategy, we first train three models for the three noise levels without the second fine-tuning stage. When the PSNR values converge on the validation set, we fine-tune these three models on the basis of the first stage and use PSNR to evaluate the improvement brought by the second stage. For 0 dB, the PSNR increases from 40.91 to 41.17; for 24 dB, from 36.22 to 36.43; and for 42 dB, from 32.27 to 32.36. These results show that fine-tuning the model with RGB images can further improve the visual quality of the reconstructed images. Compared to the raw data, RGB images contain richer color information since the channel number is three times that of raw images.

4.5 Model Complexity and Runtime

The total number of trainable parameters is 31,788,571. During validation and testing, the resolution of the input is 1200 × 1800 and the average test time on one NVIDIA 2080 Ti is about 0.7 s. Although we use a single model for joint remosaicing and denoising, our model is not lightweight and would need further optimization for real-time applications.



Table 2. The results and rankings of Quad Joint Remosaic and Denoise @MIPI-challenge

Rank  Team             PSNR   SSIM   LPIPS  KLD     M4
1     op-summer-po     37.93  0.965  0.104  0.019   68.03
2     JHC-SJTU (ours)  37.64  0.96   0.1    0.0068  67.99
3     IMEC-IPI & NPU   37.76  0.96   0.1    0.014   67.95
4     BITSpectral      37.2   0.96   0.11   0.03    66
5     HITZST01         37.2   0.96   0.11   0.06    64.82
6     MegNR            36.08  0.95   0.095  0.023   64.1

Fig. 4. The qualitative results of normal brightness images on Quad Joint Remosaic and Denoise @MIPI-challenge validation set.

Fig. 5. The qualitative results of low brightness images on Quad Joint Remosaic and Denoise @MIPI-challenge validation set.

4.6 Challenge Submission

After validating the proposed solution, we submit the trained model to the Quad Joint Remosaic and Denoise @MIPI-challenge. The dataset in the final test phase includes 15 Quad Bayer files, and the resolution of each test sample is 1200 × 1800. As shown in Table 2, our solution ranks second. Table 2 also shows that



Fig. 6. The qualitative results of text region reconstruction on Quad Joint Remosaic and Denoise @MIPI-challenge validation set.

Fig. 7. The qualitative results of text region reconstruction on Quad Joint Remosaic and Denoise @MIPI-challenge test set.

our solution achieves the best KLD score. We present some qualitative results in Figs. 4, 5, 6, and 7. Figure 4 shows representative reconstruction results of normal-brightness images on the Quad Joint Remosaic and Denoise @MIPI-challenge validation set, Fig. 5 shows representative results of low-brightness images, Fig. 6 shows representative results of text regions, and Fig. 7 shows results on the test set. The noise level of these examples is 42 dB.

4.7 Limitations

In experiments, we find that our model has limitations in two scenes. The first scene is the low brightness image as shown in Fig. 8(a)–(c). The second scene is



the texture region with a high noise level, as shown in Fig. 8(d), which suffers from color errors in the reconstruction result.

Fig. 8. The failure examples on Quad Joint Remosaic and Denoise @MIPI-challenge validation set.

5 Conclusions

This paper proposes a novel solution for jointly remosaicing and denoising camera raw data in the Quad Bayer pattern. We use a DNN-based multi-scale model to remove noise while remosaicing the Quad Bayer map to the RGGB Bayer map. An effective data pre-processing method is proposed to augment and rearrange the original Quad Bayer data. To make full use of the color information, we propose a two-stage training strategy that fine-tunes the model with the corresponding RGB images. The experimental results show that the data pre-processing method and the two-stage training strategy significantly benefit network training. We submitted our solution to the Quad Joint Remosaic and Denoise @MIPI-challenge and achieved the second rank and the best KLD score on the final test set. Our solution still has three limitations at this stage: (1) the reconstruction quality of low-brightness images needs to be improved, (2) the reconstruction quality is not good enough at high noise levels, and (3) our model has a relatively large number of parameters, which needs further optimization for real-time applications. We will address these problems in future work. Acknowledgements. This work was supported by the National Key R&D Program of China 2021YFE0206700 and NSFC 61831015.



References 1. Abdelhamed, A., Lin, S., Brown, M.S.: A high-quality denoising dataset for smartphone cameras. In: IEEE/CVF Conference on Computer Vision & Pattern Recognition (2018) 2. Aharon, M., Elad, M., Bruckstein, A.: K-SVD: an algorithm for designing overcomplete dictionaries for sparse representation. IEEE Trans. Sig. Process. 54(11), 4311–4322 (2006). https://doi.org/10.1109/TSP.2006.881199 3. Arjovsky, M., Chintala, S., Bottou, L.: Wasserstein generative adversarial networks. In: International Conference on Machine Learning (2017) 4. Buades, A., Coll, B., Morel, J.M.: A non-local algorithm for image denoising. In: 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR 2005 (2005) 5. Burger, H.C., Schuler, C.J., Harmeling, S.: Image denoising: can plain neural networks compete with BM3D? In: 2012 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2012) 6. Charbonnier, P., Blanc-Feraud, L., Aubert, G., Barlaud, M.: Two deterministic half-quadratic regularization algorithms for computed imaging. In: Proceedings of 1st International Conference on Image Processing, vol. 2, pp. 168–172 (1994). https://doi.org/10.1109/ICIP.1994.413553 7. Chen, C., Chen, Q., Xu, J., Koltun, V.: Learning to see in the dark. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3291– 3300 (2018) 8. Chen, J., Chen, J., Chao, H., Ming, Y.: Image blind denoising with generative adversarial network based noise modeling. In: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2018) 9. Cheng, S., Wang, Y., Huang, H., Liu, D., Fan, H., Liu, S.: NBNet: noise basis learning for image denoising with subspace projection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 4896– 4906 (2021) 10. Dabov, K., Foi, A., Katkovnik, V., Egiazarian, K.: Image denoising by sparse 3D transform-domain collaborative filtering. IEEE Trans. Image Process. 16(8), 2080–2095 (2007) 11. Dai, L., Liu, X., Li, C., Chen, J.: AWNet: attentive wavelet network for image ISP. In: Bartoli, A., Fusiello, A. (eds.) ECCV 2020. LNCS, vol. 12537, pp. 185–201. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-67070-2 11 12. Alleysson, D., S¨ usstrunk, S., H´erault, J.: Linear demosaicing inspired by the human visual system. IEEE Trans. Image Process. 14(4), 439–449 (2005) 13. Dong, W., Xin, L., Lei, Z., Shi, G.: Sparsity-based image denoising via dictionary learning and structural clustering. In: 2011 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2011) 14. Ehret, T., Davy, A., Arias, P., Facciolo, G.: Joint demosaicking and denoising by fine-tuning of bursts of raw images. In: 2019 IEEE/CVF International Conference on Computer Vision (ICCV) (2019) 15. Foi, A., Katkovnik, V., Egiazarian, K.: Pointwise shape-adaptive DCT for highquality denoising and deblocking of grayscale and color images. IEEE Trans. Image Process. 16, 1395–1411 (2007) 16. Foi, A., Trimeche, M., Katkovnik, V., Egiazarian, K.: Practical PoissonianGaussian noise modeling and fitting for single-image raw-data. IEEE Trans. Image Process. 17(10), 1737–1754 (2008)



17. Gharbi, M., Chaurasia, G., Paris, S., Durand, F.: Deep joint demosaicking and denoising. ACM Trans. Graph. 35(6), 191 (2016) 18. Gu, S., Lei, Z., Zuo, W., Feng, X.: Weighted nuclear norm minimization with application to image denoising. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2014) 19. Heide, F., et al.: FlexISP: a flexible camera image processing framework. In: International Conference on Computer Graphics and Interactive Techniques (2014) 20. Hirakawa, K., Parks, T.W.: Adaptive homogeneity-directed demosaicing algorithm. IEEE Trans. Image Process. 14(3), 360 (2005) 21. Ignatov, A., Gool, L.V., Timofte, R.: Replacing mobile camera ISP with a single deep learning model. In: Computer Vision and Pattern Recognition (2020) 22. Kai, Z., Zuo, W., Chen, Y., Meng, D., Lei, Z.: Beyond a Gaussian denoiser: residual learning of deep CNN for image denoising. IEEE Trans. Image Process. 26(7), 3142–3155 (2016) 23. Kim, I., Song, S., Chang, S., Lim, S., Guo, K.: Deep image demosaicing for submicron image sensors. J. Imaging Sci. Technol. 63(6), 060410-1–060410-12 (2019) 24. Kim, Y., Soh, J.W., Gu, Y.P., Cho, N.I.: Transfer learning from synthetic to realnoise denoising with adaptive instance normalization. In: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2020) 25. Klatzer, T., Hammernik, K., Knobelreiter, P., Pock, T.: Learning joint demosaicing and denoising based on sequential energy minimization. In: 2016 IEEE International Conference on Computational Photography (ICCP) (2016) 26. Kokkinos, F., Lefkimmiatis, S.: Deep image demosaicking using a cascade of convolutional residual denoising networks. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) Computer Vision – ECCV 2018. LNCS, vol. 11218, pp. 317–333. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01264-9 19 27. Liang, Z., Cai, J., Cao, Z., Zhang, L.: CameraNet: a two-stage framework for effective camera ISP learning. IEEE Trans. Image Process. 30, 2248–2262 (2019) 28. Liu, J., et al.: Learning raw image denoising with bayer pattern unification and bayer preserving augmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (2019) 29. Liu, L., Jia, X., Liu, J., Tian, Q.: Joint demosaicing and denoising with self guidance. In: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2020) 30. Liu, X., Shi, K., Wang, Z., Chen, J.: Exploit camera raw data for video superresolution via hidden Markov model inference. IEEE Trans. Image Process. 30, 2127–2140 (2021) 31. Liu, Y., et al.: Invertible denoising network: a light solution for real noise removal (2021) 32. Long, J., Shelhamer, E., Darrell, T.: Fully convolutional networks for semantic segmentation. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 3431–3440 (2015) 33. Mairal, J., Bach, F., Ponce, J., Sapiro, G., Zisserman, A.: Non-local sparse models for image restoration. In: 2009 IEEE 12th International Conference on Computer Vision (ICCV) (2010) 34. Malvar, H.S., He, L.W., Cutler, R.: High-quality linear interpolation for demosaicing of bayer-patterned color images. In: 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing (2004) 35. Mantiuk, R., Kim, K.J., Rempel, A.G., Heidrich, W.: HDR-VDP-2: a calibrated visual metric for visibility and quality predictions in all luminance conditions. ACM Trans. Graph. 30(4), 1–14 (2011)



36. Mao, X.J., Shen, C., Yang, Y.B.: Image restoration using very deep convolutional encoder-decoder networks with symmetric skip connections (2016) 37. Menon, D., Calvagno, G.: Color image demosaicking. Sig. Process. Image Commun. 26(8–9), 518–533 (2011) 38. Monno, Y., Kiku, D., Tanaka, M., Okutomi, M.: Adaptive residual interpolation for color image demosaicking. In: IEEE International Conference on Image Processing (2015) 39. Pekkucuksen, I., Altunbasak, Y.: Gradient based threshold free color filter array interpolation. In: 2010 17th IEEE International Conference on Image Processing (ICIP) (2010) 40. Plotz, T., Roth, S.: Benchmarking denoising algorithms with real photographs, pp. 2750–2759 (2017) 41. Portilla, J., Strela, V., Wainwright, M.J., Simoncelli, E.P.: Image denoising using scale mixtures of Gaussians in the wavelet domain. IEEE Trans. Image Process. 12(11), 1338–1351 (2003) 42. Ronneberger, O., Fischer, P., Brox, T.: U-Net: convolutional networks for biomedical image segmentation. In: Navab, N., Hornegger, J., Wells, W.M., Frangi, A.F. (eds.) MICCAI 2015. LNCS, vol. 9351, pp. 234–241. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-24574-4 28 43. Rudin, L.I., Osher, S., Fatemi, E.: Nonlinear total variation based noise removal algorithms. Physica D 60(1–4), 259–268 (1992) 44. Schwartz, E., Giryes, R., Bronstein, A.M.: DeepISP: towards learning an end-toend image processing pipeline. IEEE Trans. Image Process. 28(2), 912–923 (2018) 45. Sharif, S., Naqvi, R.A., Biswas, M.: Beyond joint demosaicking and denoising: an image processing pipeline for a pixel-bin image sensor (2021) 46. Shi, G., Yan, Z., Kai, Z., Zuo, W., Lei, Z.: Toward convolutional blind denoising of real photographs (2018) 47. Xin, L., Gunturk, B., Lei, Z.: Image demosaicing: a systematic survey. In: Proceedings of SPIE - The International Society for Optical Engineering, vol. 6822 (2008) 48. Yue, Z., Yong, H., Zhao, Q., Zhang, L., Meng, D.: Variational denoising network: toward blind noise modeling and removal (2019) 49. Yue, Z., Yong, H., Zhao, Q., Meng, D., Zhang, L.: Variational denoising network: toward blind noise modeling and removal. In: Advances in Neural Information Processing Systems, vol. 32 (2019) 50. Yue, Z., Zhao, Q., Zhang, L., Meng, D.: Dual adversarial network: toward realworld noise removal and noise generation. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12355, pp. 41–58. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58607-2 3 51. Zamir, S.W., et al.: Learning enriched features for real image restoration and enhancement. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12370, pp. 492–511. Springer, Cham (2020). https://doi.org/10. 1007/978-3-030-58595-2 30 52. Zamir, S.W., et al.: Learning enriched features for fast image restoration and enhancement (2022) 53. Zhang, K., Zuo, W., Zhang, L.: FFDNet: toward a fast and flexible solution for CNN based image denoising. IEEE Trans. Image Process. 27(9), 4608–4622 (2017) 54. Zhu, F., Chen, G., Heng, P.A.: From noise modeling to blind image denoising. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016)

Depth Completion Using Laplacian Pyramid-Based Depth Residuals

Haosong Yue, Qiang Liu(B), Zhong Liu, Jing Zhang, and Xingming Wu

School of Automation Science and Electrical Engineering, Beihang University, Beijing 100191, China
[email protected]

Abstract. In this paper, we propose a robust and efficient depth completion network based on residuals. Unlike previous methods that directly predict a depth residual, we reconstruct high-frequency information in complex scenes by exploiting the efficiency of the Laplacian pyramid in representing multi-scale content. Specifically, the framework can be divided into two stages: sparse-to-coarse and coarse-to-fine. In the sparse-to-coarse stage, we recover depth only from the sparse depth map without using any additional color image, and downsample the result to filter out unreliable high-frequency information from the sparse depth measurement. In the coarse-to-fine stage, we use features extracted from both data modalities to model high-frequency components as a series of multi-scale depth residuals via a Laplacian pyramid. Considering the wide distribution of high-frequency information in the frequency domain, we propose a Global-Local Refinement Network (GLRN) to estimate depth residuals separately at each scale. Furthermore, to compensate for the structural information lost by coarse depth map downsampling and to further optimize the results with the color image, we propose a novel and efficient affinity decay spatial propagation network (AD-SPN), which is used to refine the depth estimation results at each scale. Extensive experiments on indoor and outdoor datasets demonstrate that our approach achieves state-of-the-art performance.

Keywords: Depth completion · Laplacian pyramid · Spatial propagation network

1 Introduction

With the rapid development of computer vision, depth perception has become a fundamental problem. It plays an essential role in the research and application of robotics [15], autonomous driving [14], and virtual reality [16]. Many sensors are used to obtain reliable depth information of a scene, such as RGB cameras, LiDAR, or ToF cameras. However, none of them is perfect. Color images obtained from RGB cameras are inexpensive and informative, but they provide no depth values. Depth sensors produce accurate depth measurements but are too sparse and costly. To overcome these challenges, there



have been extensive works on depth completion, a technique for obtaining dense depth measurements at a low cost, which aims to infer dense depth maps from sparse depth maps and the corresponding high-resolution color images. Depth completion is a regression task that uses regularization to determine a rule for extrapolating from the observed depth to nearby unobserved depths. The development of neural networks provides a practical tool for data-driven regularization. Early depth completion methods [17,18] mainly take sparse depth maps as input without additional data. However, the performance of these single-modal approaches is unsatisfactory due to aliasing in the frequency domain caused by sparse sampling and the lack of other information as a reference. Multi-modal methods [19,20], commonly guided by color images, have therefore attracted more attention and achieve better results than single-modal ones. Recent image-guided approaches are mainly based on deep convolutional neural networks, and the network structure for depth completion has developed from single-modal, single-model designs to multi-modal, multi-model ones. In general, depth completion methods follow two major strategies: ensemble and refinement. For example, FusionNet [26] proposes a dual-branch network to estimate local and global results separately. DeepLidar [21] introduces a surface normal input and additionally builds a normal path that focuses on surface normal information. PENet [2] proposes a depth branch and takes the depth map estimated by the color branch as its input. SemAttNet [29] introduces semantic information and builds an additional semantic branch. These methods achieve accurate depth completion results by constructing different branches that let the model focus on various scene components and by combining the advantages of these branches. In contrast to ensemble methods, CSPN [9,10], NLSPN [12], and CSPN++ [11] use affinity matrices to refine coarse depth completion results in a spatially propagated manner, and all of them achieve outstanding performance. However, the predicted dense depth naturally favors the sparse input and has blurry details; under-utilized color information leads to unreliable details and sub-optimal solutions for depth completion. To address these problems, SEHLNet [1] proposes a depth completion method that separately estimates high- and low-frequency components, but it is too complex and has low real-time performance, which limits its application. In this paper, we propose a novel Laplacian pyramid-based depth completion network, which estimates low-frequency components from sparse depth maps by downsampling and contains a Laplacian pyramid decoder that estimates multi-scale residuals to reconstruct the complex details of the scene. Since the high-frequency information in the scene is widely distributed in the frequency domain, and this distribution has both global generality and local specificity, we propose a global-local residual estimation strategy to estimate the residuals at each scale. Furthermore, the downsampling operation obtains low-frequency components at a small cost but loses some important geometric information contained in the sparse depth map. We therefore also propose a novel affinity decay spatial propagation network, which is used to refine the estimated depth maps at various scales through spatial propagation. AD-SPN not only compensates for the loss



of information caused by downsampling but also further optimizes the results by utilizing the detailed information from color images. The main contributions of our approach can be summarized as follows:
1. We propose adopting the Laplacian pyramid to solve the depth completion problem. The proposed method successfully restores local details and the global structure by recovering depth residuals at different levels of the Laplacian pyramid and summing up the predicted results progressively.
2. In view of the wide distribution of high-frequency residual information in the frequency domain, we propose a global-local refinement network to estimate the global residual and the local residual separately at each scale, which is more efficient than directly regressing the residual through a single network. Furthermore, we propose a novel affinity decay spatial propagation network for refining the estimated depth at each scale.
3. We present extensive experimental results on benchmark datasets built in complex indoor and outdoor environments, and show the efficiency and robustness of the proposed method compared to state-of-the-art methods.

2 Related Work

Image-Guided Depth Completion. Early depth completion methods [17,18] only take sparse depth maps as input and do not require additional data. Such single-modal approaches perform poorly because the depth sensor loses a lot of scene information during sparse sampling. Image-guided depth completion uses additional information, such as RGB images, during depth completion. It has attracted much attention and outperforms previous methods because the extra information can compensate for the loss caused by sparse sampling and contains a lot of surface, edge, or semantic information. We roughly divide image-guided depth completion methods into two strategies. One strategy is ensemble, which builds different multi-branch models through various means, including changing branch structures [26], input modalities [21,29], multi-scale feature extraction and fusion mechanisms [22,24,28], and result ensemble strategies [2]. Different branches focus on different depth components, and their results are finally fused adaptively, yielding better performance. These methods achieve good results, but the multi-branch structure often leads to complex models and large memory consumption; how to construct suitable branches and integrate their results therefore remains an open question. The other strategy is refinement, which optimizes coarse depth estimates. The most commonly used tools are residual networks and spatial propagation networks. For example, FCFR-Net [23] designs an energy-based fusion operation to fuse high-frequency information from different modalities and constructs a residual network to learn the residuals of coarse depth maps directly. Although this method is simple and efficient, due to the wide distribution of high-frequency information in the frequency domain, it is difficult for a single residual network



Fig. 1. Network architecture. Our network consists of two MobileNetV2 encoders, a feature fusion encoder, and a Laplacian pyramid decoder. The results from the sparse depth network are downsampled as the input to the Laplacian pyramid, and the corresponding residuals are regressed at each scale to refine the depth map. At the same time, the feature maps from the feature fusion encoder are scaled to the same scale, and they guide the global-local refinement network to estimate global and local residuals.

(often approximating a band-pass filter) to learn all the high-frequency information. The spatial propagation network differs from the methods mentioned above: a deep CNN learns an affinity matrix that represents the spatial variation of depth values and iteratively refines the coarse depth map. SPN builds a row/column linear propagation model that can easily be applied to many advanced vision tasks. CSPN [9] further improves the linear propagation model and adopts a recursive convolution operation to improve the efficiency and accuracy of the model. CSPN++ [11] improves its accuracy and efficiency by learning adaptive convolution kernel sizes and the number of propagation iterations. NLSPN [12] effectively excludes irrelevant neighbors in spatial propagation and focuses on the confidence of different regions. DSPN [13,25] uses deformable convolutions to build adaptive receptive fields. In practice, simple spatial propagation models are efficient but less accurate, while complex spatial propagation models are accurate but require more resources for training and inference.

Deep Neural Networks in a Pyramid Framework. The Laplacian pyramid was first proposed by Burt et al. [7]. Its purpose is to decompose the source image into different spatial frequency bands. The fusion process is performed separately for each spatial frequency layer to extract the features and details of each frequency band of the decomposition, and different fusion operators are used to highlight the characteristics and details of specific frequency bands. It is widely used in numerous computer vision tasks, including texture synthesis, image compression, noise removal, and image blending. Lai et al. [8] proposed a deep Laplacian pyramid network for fast and accurate image super-resolution. For depth estimation, [3] designs a Laplacian pyramid neural network, which includes a Laplacian pyramid decoder for signal reconstruction and an adaptive dense feature fusion module for



Fig. 2. Global-local refinement network. The feature maps from the feature fusion encoder are divided into high-level and low-level features according to their original scale. High-level features are obtained by upsampling lower-resolution feature maps, which contain more global structural information, and are first used to regress the global residuals. Then, the local residuals are regressed with low-level features and features from the decoder. The final depth is refined through the spatial propagation network; the first three layers are AD-SPN and the last three are CSPN.

fusing features from input images. [4] incorporates the Laplacian pyramid into the decoder architecture to reconstruct the final depth map from coarse to fine. This paper introduces the Laplacian pyramid into the depth completion task. We can think of Laplacian pyramids as a series of bandpass filters that are ideal for accurately estimating depth boundaries and global layout.

3 Our Approach

In this section, we introduce the proposed Laplacian pyramid-based depth completion network. We first briefly describe the network structure of the model. Then we introduce the proposed global-local refinement network, which improves the residual regression capability at each pyramid level. Finally, we introduce the proposed affinity decay spatial propagation network. Overall, our model is a fusion of residual and refinement networks: the Laplacian pyramid extends the regression range of the residual network for high-frequency information, and the spatial propagation network introduces an information aggregation method that standard CNN convolution does not have.

3.1 Network Structure

The overall architecture of the network is shown in Fig. 1. Due to the pyramid structure of the encoder, the abstraction level of the feature map is negatively



related to its resolution. We do not directly perform early or late fusion of the features of the two modalities; instead, we use two independent MobileNetV2 networks to extract features from each of them, and the features from the two decoders are then fused in a ResNet-based encoder. Since understanding the rich texture, brightness, and semantic information of color images is a complex matter, estimating coarse depth maps directly from sparse depth maps and color images would generate a lot of noise. We therefore only estimate the coarse depth from the features of the sparse depth map and filter out some unreliable details by downsampling. In this way, the network only needs to fit the depth distribution of the sparse depth map, which significantly reduces the demands on the network, so a lightweight model suffices. For high-frequency information, although the previous method [1] based on spatially variant convolution achieves remarkable performance on average pixel metrics, it is computationally expensive, and the scene structure of the target image cannot be well recovered. The reason is that it uses a complex network architecture to directly regress the residuals while ignoring that high-frequency information is widely distributed in the frequency domain and is challenging to map through one single network. In comparison, the Laplacian pyramid representation progressively adds information from different frequency bands so that the scene structure at different scales can be hierarchically recovered during reconstruction. This enables the predicted final residuals to have good average pixel-wise accuracy and to preserve scene structures at various scales. Specifically, we first obtain a coarse low-frequency depth map $D_{coarse}^{s=0.125}$ by downsampling the output of the sparse depth decoder to a 1/8 scale. Then, we progressively estimate the residuals $R^{s}$ at different scales, where $s = 1, 0.5, 0.25, 0.125$ denotes the reduction factor relative to the original scale, and add them to the map upsampled from the lower resolution, $up(D^{s/2})$. In each layer of the pyramid, the signal $D^{s}$ is constructed as $R^{s} + up(D^{s/2})$ and refined by a spatial propagation network. Finally, the output of the last layer of the pyramid is the depth completion result.
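A schematic sketch of the coarse-to-fine reconstruction described above (our own rendering; the residual heads and guiding features are stand-ins for the modules in Fig. 1):

```python
import torch
import torch.nn.functional as F

def reconstruct_pyramid(d_coarse, residual_heads, features):
    """d_coarse:       (N, 1, H/8, W/8) coarse low-frequency depth
    residual_heads: callables predicting R^s at scales 0.25, 0.5, 1.0 (assumed)
    features:       per-scale guiding features for each head (assumed given)
    Returns the full-resolution depth via D^s = R^s + up(D^{s/2})."""
    d = d_coarse
    for head, feat in zip(residual_heads, features):
        d_up = F.interpolate(d, scale_factor=2, mode='bilinear', align_corners=False)
        residual = head(feat, d_up)   # R^s, estimated from encoder/decoder features
        d = d_up + residual           # D^s = R^s + up(D^{s/2})
    return d

# toy usage with placeholder heads (a real head would also use `feat`)
class DummyHead(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = torch.nn.Conv2d(1, 1, 3, padding=1)
    def forward(self, feat, d_up):
        return self.conv(d_up)

heads = [DummyHead() for _ in range(3)]
feats = [None] * 3
d0 = torch.rand(1, 1, 30, 40)
print(reconstruct_pyramid(d0, heads, feats).shape)  # torch.Size([1, 1, 240, 320])
```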

3.2 Global-Local Refinement Network

Instead of simply upsampling and summing the two parts in two frequency bands, we propose a global-local refinement network that makes full use of different levels of features from the encoder to estimate residuals at various scales and refine them. GLRN is shown in Fig. 2. In the Laplacian pyramid at a specific scale $s$, we scale the feature maps of different scales extracted by the encoder to the same scale $s$. The feature maps whose original scale is smaller than $s$ are regarded as high-level feature maps, which provide the global structure information of the scene, while the feature maps whose original scale is greater than or equal to $s$, together with the feature maps from the decoder, are regarded as low-level feature maps, which provide local detail information. We first estimate the global residual $R_{g}^{s}$ and then predict the local residual $R_{l}^{s}$ on this basis. Both residuals are obtained through three convolution layers. The final depth is refined through the spatial



Fig. 3. Comparison of the performance of CSPN and AD-SPN. (c), (d), (h) show the depth map before refinement, the depth map refined by AD-SPN, and the depth map refined by CSPN. (f) shows the process of affinity decay; the affinities with smaller decay factors tend to zero after three iterations, as in (g). (j) shows the fixed propagation neighborhoods in CSPN.

propagation network; the first three layers use our proposed AD-SPN, which will be introduced later, and the last three use CSPN.
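A rough sketch of the global-local split at one scale (module names and channel widths are our assumptions, not the authors' implementation): high-level features regress a global residual first, and the low-level plus decoder features then regress a local correction conditioned on it.

```python
import torch
import torch.nn as nn

class GlobalLocalRefinement(nn.Module):
    """Toy version of GLRN at one pyramid scale: R^s = R_g^s + R_l^s."""
    def __init__(self, c_high, c_low, c_dec):
        super().__init__()
        def head(c_in):
            # "three-layer convolution" residual head, as described in the text
            return nn.Sequential(nn.Conv2d(c_in, 32, 3, padding=1), nn.ReLU(inplace=True),
                                 nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(inplace=True),
                                 nn.Conv2d(32, 1, 3, padding=1))
        self.global_head = head(c_high)             # global residual from high-level features
        self.local_head = head(c_low + c_dec + 1)   # local residual, conditioned on R_g^s

    def forward(self, f_high, f_low, f_dec):
        r_global = self.global_head(f_high)
        r_local = self.local_head(torch.cat([f_low, f_dec, r_global], dim=1))
        return r_global + r_local

glrn = GlobalLocalRefinement(c_high=64, c_low=32, c_dec=16)
r = glrn(torch.rand(1, 64, 60, 80), torch.rand(1, 32, 60, 80), torch.rand(1, 16, 60, 80))
```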

3.3 Affinity Decay Spatial Propagation Network

Currently, spatial propagation networks (SPNs) [10] are the most popular affinity-based methods for depth refinement. The convolutional spatial propagation network (CSPN) [9] is simple and effective, but the propagation of each pixel occurs in a fixed receptive field, which may not be optimal for refinement since different pixels need different local context. CSPN++ [11] further improves its effectiveness and efficiency by learning adaptive convolutional kernel sizes and the number of propagation iterations, but its real-time performance is poor. The deformable spatial propagation network (DSPN) [13] replaces traditional convolution with deformable convolution to achieve a variable receptive field, and the effect is outstanding; however, the learning process of deformable convolution is complicated. In this work, we need to refine the depth map at each layer of the Laplacian pyramid, so a simple, fast, and efficient spatial propagation network is the primary requirement to guarantee real-time performance, and none of the above methods meets it. Therefore, we propose a novel affinity decay spatial propagation network, as shown in Fig. 3. Assume $d_{0} \in \mathbb{R}^{m \times n}$ is the initial depth map and $d_{N} \in \mathbb{R}^{m \times n}$ is the refined depth map after $N$ iterations. We can write the spatial propagation as a diffusion process by reshaping $d_{t} \in \mathbb{R}^{m \times n}$ into a column-first one-dimensional vector $H^{t} \in \mathbb{R}^{mn}$, where $t \in \{0, 1, 2, \ldots, N\}$:




$$H^{t+1} = \begin{bmatrix} 1-\lambda_{0,0} & \kappa_{0,0}(1,0) & \cdots & \kappa_{0,0}(m,n) \\ \kappa_{1,0}(0,0) & 1-\lambda_{1,0} & \cdots & \kappa_{1,0}(m,n) \\ \vdots & \vdots & \ddots & \vdots \\ \kappa_{m,n}(0,0) & \kappa_{m,n}(1,0) & \cdots & 1-\lambda_{m,n} \end{bmatrix} H^{t} = G H^{t} \quad (1)$$

where $G \in \mathbb{R}^{mn \times mn}$ is a transformation matrix, $\kappa_{i,j}(a,b)$ describes the affinity weight of pixel $(i,j)$ with its neighbor $(a,b)$, and $\lambda_{i,j} = \sum_{a \neq i,\, b \neq j} \kappa_{i,j}(a,b)$, so that the sum of each row of $G$ is 1. The diffusion process, expressed with a partial differential equation (PDE), can be derived as follows:

$$H^{t+1} = G H^{t} = (I - D + A) H^{t}, \qquad H^{t+1} - H^{t} = -(D - A) H^{t}, \qquad \partial_{t} H^{t+1} = -L H^{t} \quad (2)$$

where $L$ is the Laplacian matrix, $D$ is the diagonal matrix containing all the $\lambda_{i,j}$, and $A$ is the affinity matrix containing all the $\kappa_{i,j}(a,b)$. Considering the computing power of the hardware, $A$ is usually a sparse matrix, which can be written as $A = BW$, where $W$ holds the affinity weights and $B$ is a binary matrix whose non-zero elements are distributed according to the existence of affinity. For CSPN, the fixed propagation neighborhood results in $B$ being a fixed matrix with $m \times n \times k \times k$ non-zero elements. The learning of deformable convolution kernels in DSPN is essentially the learning of a binary matrix $B$ with a dynamic distribution of non-zero elements. In most cases, we want the spatial propagation to occur in local regions to refine tiny geometric structures, so inferring $B$ globally is unnecessary. We introduce a dynamic mask $M$ for the fixed matrix $B$ of CSPN, $A = MBW$, so that the fixed convolution kernel gains spatial heterogeneity. In the implementation, we first estimate an affinity decay matrix $A_{d}$ (with values in $0 \sim 1$) of the same shape as the affinity matrix, and at each iteration we multiply the affinity matrix by $A_{d}$ and renormalize it. After several iterations, the affinities with relatively small factors in $A_{d}$ tend to 0 while the others are retained, which is close to applying the dynamic mask $M$. Finally, we found that too many decay iterations cause the model to degenerate, because all the affinities except the one with the largest factor in $A_{d}$ are decayed away, so we only use $A_{d}$ in the first three layers of refinement.
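A minimal sketch of the affinity-decay idea in convolutional form (our interpretation; the kernel size, iteration count, and the way $A_d$ is predicted are assumptions): at every propagation step the learned affinities are multiplied by the decay factors and renormalized, so weakly weighted neighbors fade toward zero over the iterations.

```python
import torch
import torch.nn.functional as F

def ad_spn_refine(depth, affinity, decay, iters=3, k=3):
    """depth:    (N, 1, H, W) coarse depth to refine
    affinity: (N, k*k, H, W) affinities to the k x k neighborhood (center included)
    decay:    (N, k*k, H, W) decay factors in (0, 1) predicted by the network."""
    for _ in range(iters):
        # decay and renormalize so each pixel's affinities sum to 1
        affinity = affinity * decay
        affinity = affinity / (affinity.sum(dim=1, keepdim=True) + 1e-8)
        # gather the k x k neighborhood of every pixel and propagate
        neigh = F.unfold(depth, kernel_size=k, padding=k // 2)        # (N, k*k, H*W)
        neigh = neigh.view(depth.size(0), k * k, *depth.shape[-2:])   # (N, k*k, H, W)
        depth = (affinity * neigh).sum(dim=1, keepdim=True)
    return depth

d = torch.rand(1, 1, 32, 32)
a = torch.softmax(torch.rand(1, 9, 32, 32), dim=1)   # learned affinities (toy values)
m = torch.sigmoid(torch.rand(1, 9, 32, 32))          # affinity decay matrix A_d
refined = ad_spn_refine(d, a, m)
```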

3.4 The Training Loss

We train our network with $\ell_{1}$ and $\ell_{2}$ losses at each scale as follows:

$$\mathcal{L}^{s}(D) = \sum_{\rho=1,2} \left( \frac{1}{n} \sum_{p \in Q} \left| D_{gt}^{p} - D^{p} \right|^{\rho} \right) \quad (3)$$

200

H. Yue et al.

Dgt , and n denotes the number of points in Q. The whole loss function can be formulated as: Loss = λ1 L0.125 (D) + λ2 L0.25 (D) + λ3 L0.5 (D) + λ4 L1.0 (D),

(4)

where λ1 , λ2 , λ3 and λ4 are hyper-parameters.

4

Experiments

In this section, we first explain the implementation details and the datasets used in our experiments. Following this, we show qualitative and quantitative comparisons of the proposed method with existing algorithms on the outdoor KITTI dataset. Finally, we perform ablation experiments on the indoor ToF synthetic dataset provided by the MIPI Challenge to verify the effectiveness of each component of the proposed algorithm. 4.1

Implementation Details

Our method was implemented using the PyTorch [27] framework and trained with 4 TITAN RTX GPUs. We use the ADAM optimizer with β1 = 0.9, β2 = 0.99, and the weight decay is 10−6 . We adopt a hierarchical training strategy to train the network sequentially from low resolution to high resolution. We first train the depth estimation results at scale 0.125, where λ1 = 1, λ2 , λ3 , λ4 = 0. After the model converges, we start training at scale 0.25, where λ1 , λ2 = 1, λ3 , λ4 = 0. And so on until the end of all scale training. Every scale is trained for 50 epochs, and the learning rate starts from 0.001 at each stage, delay by 0.5 every 10 epochs. A total of 200 epochs were trained. The batch size for the indoor ToF synthetic dataset is set to 20. For the KITTI dataset, the batch size is 8. Besides, data augmentation techniques are used, including horizontal random flip and color jitter. 4.2

Datasets

KITTI Dataset is a large outdoor autonomous driving data set, which also is the main benchmark for depth completion. It consists of over 85k color images with corresponding sparse depth maps and semi-dense ground truth maps for training, 6k for validation, and 1k for testing. We use officially selected 1k validation images to evaluate and crop all images to 256 × 1216 during training. ToF Synthetic dataset is officially provided by MIPI Challenge. The data contains seven image sequences of aligned RGB and ground-truth dense depth from 7 indoor scenes. For each scene, the RGB and the ground-truth depth are rendered along a smooth trajectory in the created 3D virtual environment. RGB and dense depth images in the dataset have a resolution of 640 × 480 pixels. During the training process, we use the officially provided sampling method to generate a pseudo-TOF sparse depth map, and conduct experiments according to the officially divided training set and validation set.

Depth Completion Using Laplacian Pyramid-Based Depth Residuals

4.3 Metrics

For the KITTI DC dataset, we adopt the same error metrics as the dataset benchmark, including RMSE, mean absolute error (MAE), inverse RMSE (iRMSE), and inverse MAE (iMAE). For the ToF synthetic dataset, the mean absolute relative error (REL) and the percentage of pixels satisfying $\delta_{\tau}$ are additionally selected as evaluation metrics. All the evaluation metrics are defined as follows, where $\hat{h}_{x}$ and $h_{x}$ denote the predicted and ground-truth depth at pixel $x$ and $v$ is the number of valid pixels:

RMSE (mm): $\sqrt{\frac{1}{v}\sum_{x}(\hat{h}_{x}-h_{x})^{2}}$
MAE (mm): $\frac{1}{v}\sum_{x}\left|\hat{h}_{x}-h_{x}\right|$
iRMSE (1/km): $\sqrt{\frac{1}{v}\sum_{x}\left(\frac{1}{\hat{h}_{x}}-\frac{1}{h_{x}}\right)^{2}}$
iMAE (1/km): $\frac{1}{v}\sum_{x}\left|\frac{1}{\hat{h}_{x}}-\frac{1}{h_{x}}\right|$
REL: $\frac{1}{v}\sum_{x}\frac{\left|\hat{h}_{x}-h_{x}\right|}{h_{x}}$
$\delta_{\tau}$: percentage of pixels with $\max\left(\frac{\hat{h}_{x}}{h_{x}},\frac{h_{x}}{\hat{h}_{x}}\right) < \tau$, $\tau \in \{1.25, 1.25^{2}, 1.25^{3}\}$
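These metrics can be computed as follows (a compact sketch over valid pixels; unit conversions such as mm for RMSE/MAE and 1/km for the inverse metrics are omitted, and positive predicted depths are assumed).

```python
import numpy as np

def completion_metrics(pred, gt, taus=(1.25, 1.25 ** 2, 1.25 ** 3)):
    """pred, gt: arrays of predicted / ground-truth depth; only gt > 0 pixels count.
    Assumes pred > 0 on valid pixels; unit conversions are left to the caller."""
    m = gt > 0
    p, g = pred[m], gt[m]
    rmse = np.sqrt(np.mean((p - g) ** 2))
    mae = np.mean(np.abs(p - g))
    irmse = np.sqrt(np.mean((1.0 / p - 1.0 / g) ** 2))
    imae = np.mean(np.abs(1.0 / p - 1.0 / g))
    rel = np.mean(np.abs(p - g) / g)
    ratio = np.maximum(p / g, g / p)
    deltas = [np.mean(ratio < t) for t in taus]
    return rmse, mae, irmse, imae, rel, deltas
```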

Table 1. Quantitative comparison with state-of-the-art methods on the KITTI Depth Completion testing set.

Method        RMSE (mm)  MAE (mm)  iRMSE (1/km)  iMAE (1/km)
Fusion        772.87     215.02    2.19          0.93
DSPN          766.74     220.36    2.47          1.03
DeepLiDAR     758.38     226.50    2.56          1.15
FuseNet       752.88     221.19    2.34          1.14
ACMNet        744.91     206.09    2.08          0.90
CSPN++        743.69     209.28    2.07          0.90
NLSPN         741.68     199.59    1.99          0.84
GuideNet      736.24     218.83    2.25          0.99
FCFRNet       735.81     217.15    2.20          0.98
PENet         730.08     210.55    2.17          0.94
SEHLNet       714.71     207.51    2.07          0.91
RigNet        712.66     203.25    2.08          0.90
SemAttNet     709.41     205.49    2.03          0.90
DySPN         709.12     192.71    1.88          0.82
LpNet (ours)  710.74     212.51    2.18          0.96

4.4 Evaluation on KITTI Dataset

Table 1 shows the quantitative performance of our proposed method, and compares it with other state-of-the-art methods on the KITTI depth completion benchmark. The results show that our method outperforms most of the previous methods in metrics. Note that though we train our network from scratch



Fig. 4. Qualitative comparisons with state-of-the-art methods on the KITTI depth completion test set. Left to right: (a) color image, results of (b) LpNet (ours), (c) NLSPN, (d) SemAttNet, (e) RigNet, respectively. We show dense depth maps and error maps from the KITTI leaderboard for better comparison. (Color figure online)

without any additional data, it still outperforms methods that employ additional data: DeepLiDAR synthesizes surface normal data to train its surface normal prediction network, FusionNet uses a network pre-trained on CityScapes, and SemAttNet uses additional semantic information from the KITTI dataset. Figure 4 shows a qualitative comparison with several state-of-the-art works. The depth maps obtained by our proposed method perform better on object boundaries and detailed regions. This is because the Laplacian pyramid fits the detailed information over a wider frequency range by learning multi-scale residuals, and the spatial propagation network further refines the results through spatially varying propagation, an ability that standard convolution does not have. Our proposed AD-SPN is simple and efficient compared to other spatial propagation networks.

4.5 Ablation Study on ToF Synthetic Dataset

We conduct ablation experiments on the ToF synthetic dataset to verify the effectiveness of the components of our proposed network, including the Laplacian pyramid, the global-local residual estimation strategy, and the affinity decay spatial propagation network. Furthermore, we compare the staged and direct training strategies, and the results show that our proposed training strategy is reliable.

Effectiveness of the Laplacian Pyramid. We first demonstrate the effectiveness of the proposed Laplacian pyramid on the depth completion task by



Table 2. Ablation experiments on the ToF Synthetic dataset

Method        RMSE    MAE     iRMSE   iMAE    REL     δ1.25   δ1.25²  δ1.25³
Baseline      0.5087  0.2073  0.1780  0.0375  0.0820  0.9045  0.9582  0.9800
Residual Net  0.4678  0.1857  0.1550  0.0340  0.0808  0.9161  0.9646  0.9832
Laplace Net   0.4238  0.1612  0.1517  0.0291  0.0732  0.9339  0.9730  0.9879
AD-SPN Net    0.4005  0.1425  0.1726  0.0254  0.0643  0.9378  0.9756  0.9887
CSPN Net      0.4196  0.1578  0.1471  0.0284  0.0684  0.9304  0.9735  0.9882
GLR Net       0.3780  0.1291  0.1410  0.0229  0.0573  0.9452  0.9782  0.9899

comparing the performance in three cases:
(1) Baseline: we use only one MobileNetV2 to build a depth completion network, where the sparse depth map is concatenated with the color image and fed directly into it.
(2) Residual net: we first extract the features of the sparse depth map and the color image through two MobileNetV2 networks, respectively, then use the features from the sparse depth map to estimate the coarse depth map, and use the features from the two modalities to estimate the residual. The final result is the sum of the coarse depth map and the residual. The network structure is similar to Fig. 1 but without the Laplacian pyramid.
(3) Laplace net: it introduces the Laplacian pyramid into the residual network. At each layer, we first estimate the residual with the feature map from the decoder and then sum the residual with the upsampled depth map from the previous layer.
As shown in Table 2, Residual net outperforms Baseline, which demonstrates the effectiveness of residual learning: separately estimating the coarse depth map and its residuals is more efficient than directly estimating depth, which is equivalent to separately estimating the low-frequency and high-frequency components of the depth map in the frequency domain. Laplace net outperforms Baseline and Residual net, demonstrating the Laplacian pyramid's effectiveness. Compared with the single residual network in Residual net, the pyramid builds a multi-scale residual model, which improves the ability of the network to fit high-frequency components.

Effectiveness of the Affinity Decay Spatial Propagation Network. The purpose of building an affinity decay spatial propagation network is to improve the performance of the simplest spatial propagation model at a minimal cost. Compared with other complex spatial propagation networks, AD-SPN dramatically enhances the performance of CSPN by adding only one affinity decay matrix. As shown in Table 2, AD-SPN net and CSPN net refine the output of each layer of the Laplacian pyramid with AD-SPN and CSPN, respectively, and AD-SPN net outperforms CSPN net.



Table 3. Comparison of different training strategies. L1 means that we only supervise the final output depth map of the Laplacian pyramid. L2 indicates that we supervise the output of all pyramid layers but only train 50 epochs. L3 is our proposed staged training strategy, training 200 epochs in 4 stages.

Method          RMSE    MAE     iRMSE   iMAE    REL     δ1.25   δ1.25²  δ1.25³
Residual Net    0.4678  0.1857  0.1550  0.0340  0.0808  0.9161  0.9646  0.9832
Laplace Net+L1  0.4654  0.1816  0.1555  0.0331  0.0801  0.9172  0.9651  0.9835
Laplace Net+L2  0.4512  0.1775  0.1544  0.0326  0.0792  0.9210  0.9664  0.9840
Laplace Net+L3  0.4238  0.1612  0.1517  0.0291  0.0732  0.9339  0.9730  0.9879

Fig. 5. Qualitative comparison of ablation experiments.

Effectiveness of the Global-Local Residual Estimation Strategy. As shown in Table 2, we finally demonstrate the effectiveness of the global-local residual estimation strategy. GLR net is our final model, which adds the global-local residual estimation network to the AD-SPN net and performs best. Instead of directly estimating the residuals at each level of the pyramid with features at the corresponding scale from the feature decoder, GLR net comprehensively uses features from the encoder at all scales: we estimate global residuals from high-level, low-resolution features and local residuals from low-level, high-resolution features. This further improves the model's ability to estimate high-frequency components at each scale. The results of the qualitative comparison are shown in Fig. 5. It can be seen that the results of GLR net show clear advantages in terms of overall structure and geometric details, which is in line with the quantitative experimental results. AD-SPN net can significantly improve the model's ability to estimate the scene's geometric contours, demonstrating the effectiveness of our proposed affinity decay. Since AD-SPN refines the output of the Laplacian pyramid, its accuracy is affected by the residuals. GLR net can regress high-frequency information over a wider spectrum by separating



global and local residuals, so the results are better. To give a more intuitive understanding of our method, we present the outputs of each stage in depth completion in Fig. 6. We did not estimate the global-local residuals on the 1/8 scale because we believe that low frequencies dominate the residuals at this scale, and it is not meaningful to estimate the global and local residuals separately.

Fig. 6. Outputs of each stage in our model.

Effectiveness of Staged Training Strategies. Finally, we verify the effectiveness of the proposed staged training strategy. As shown in Table 3, directly supervising the final results of the network or the results of each scale output does not significantly improve the performance of the model. We speculate that this is because the convergence rates of residual networks at different pyramid scales are quite different. Direct training will cause the Laplacian pyramid to estimate the residuals when the previous scale does not converge, eventually resulting in a significant degree of overlap between the residuals at different scales, failing to exploit the potential of the Laplacian pyramid fully.

5 Conclusion

We have proposed a robust and efficient residual-based depth completion network. The proposed method reconstructs high-frequency information in complex scenes by exploiting the efficiency of the Laplacian pyramid in representing multi-scale content. Considering the wide distribution of high-frequency information

206

H. Yue et al.

in the frequency domain, we propose a global local refinement network to estimate depth residuals at each scale separately. Furthermore, to compensate for the structural information lost by coarse depth map downsampling and further optimize the results with color image information, we propose a novel and efficient affinity decaying spatial propagation network to refine the depth estimation results at each scale. Our experimental results demonstrated the superiority of the proposed method. Acknowledgments. This research was funded by National Natural Science Foundation of China (No. 61603020) and the Fundamental Research Funds for the Central Universities (No. YWF-22-L-923).


W19 - Challenge on People Analysis: From Face, Body and Fashion to 3D Virtual Avatars

In the workshop and challenge on people analysis we address human-centered data analysis. These data are extremely widespread and have been intensely investigated by researchers belonging to very different fields, including Computer Vision, Machine Learning, and Artificial Intelligence. These research efforts are motivated by the several highly-informative aspects of humans that can be investigated, ranging from corporal elements (e.g. bodies, faces, hands, anthropometric measurements) to emotions and outward appearance (e.g. human garments and accessories). The huge amount and the extreme variety of this kind of data make the analysis and the use of learning approaches extremely challenging. The workshop provided a forum for novel research in the area of human understanding and the winners of the 3D human body and 3D face reconstruction challenge were announced. It also featured two invited talks by experts in the field. October 2022

Alberto Bimbo Mohamed Daoudi Roberto Vezzani Xavier Alameda-Pineda Marcella Cornia Guido Borghi Claudio Ferrari Federico Becattini Andrea Pilzer Zhiwen Chen Xiangyu Zhu Ye Pan Xiaoming Liu

PSUMNet: Unified Modality Part Streams Are All You Need for Efficient Pose-Based Action Recognition

Neel Trivedi(B) and Ravi Kiran Sarvadevabhatla

Centre for Visual Information Technology, IIIT Hyderabad, Hyderabad 500032, India
[email protected], [email protected]

Fig. 1. The plot on the left shows accuracy against # parameters for our proposed architecture PSUMNet and existing approaches for the large-scale NTURGB+D 120 human actions dataset (cross subject). PSUMNet achieves state of the art performance while competing recent methods use 100%–400% more parameters. The diagram on the right illustrates that PSUMNet scales to sparse pose (SHREC [6]) and dense pose (NTU-X [26]) configurations in addition to the popular NTURGB+D [15] configuration.

Abstract. Pose-based action recognition is predominantly tackled by approaches which treat the input skeleton in a monolithic fashion, i.e. joints in the pose tree are processed as a whole. However, such approaches ignore the fact that action categories are often characterized by localized action dynamics involving only small subsets of part joint groups involving hands (e.g. 'Thumbs up') or legs (e.g. 'Kicking'). Although part-grouping based approaches exist, each part group is not considered within the global pose frame, causing such methods to fall short. Further, conventional approaches employ independent modality streams (e.g. joint, bone, joint velocity, bone velocity) and train their network multiple times on these streams, which massively increases the number of training parameters. To address these issues, we introduce PSUMNet, a novel approach for scalable and efficient pose-based action recognition. At the representation level, we propose a global frame based part stream approach as opposed to conventional modality based streams. Within each part stream, the associated data from multiple modalities is unified and consumed by the processing pipeline. Experimentally, PSUMNet achieves state of the art performance on the widely used NTURGB+D 60/120 dataset and the dense joint skeleton dataset NTU 60-X/120-X. PSUMNet is highly efficient and outperforms competing methods which use 100%–400% more parameters. PSUMNet also generalizes to the SHREC hand gesture dataset with competitive performance. Overall, PSUMNet's scalability, performance and efficiency make it an attractive choice for action recognition and for deployment on compute-restricted embedded and edge devices. Code and pretrained models can be accessed at https://github.com/skelemoa/psumnet.

Keywords: Human action recognition · Skeleton · Dataset · Human activity recognition · Part

Supplementary Information: The online version contains supplementary material available at https://doi.org/10.1007/978-3-031-25072-9_14.

1 Introduction

Fig. 2. Comparison between the conventional training procedure used in most previous approaches (left) and our approach (right). Conventional methods [2,16] use dedicated independent streams and train separate instances of the same network for each of the four modalities, i.e. joint, bone, joint velocity and bone velocity. This method increases the total number of parameters by a huge margin and involves a monolithic representation. Our method processes the modalities in a unified manner and creates part-group based independent streams, with superior performance compared to existing methods which use 100%–400% more parameters; see Fig. 3 for architectural details of PSUMNet.

Skeleton based human action recognition at scale has gained a lot of focus recently, especially with the release of large scale skeleton action datasets such as NTURGB+D [19] and NTURGB+D 120 [15]. A plethora of RNN [8,9], CNN [10,30] and GCN [16,28] based approaches have been proposed to tackle this important problem. The success of approaches such as ST-GCN [28], which modeled spatio-temporal joint dynamics using GCN, has given much prominence to GCN-based approaches. Furthermore, approaches such as RA-GCN [23] and 2s-AGCN [21] built upon this success and demonstrated additional gains by introducing multi modal (bone and velocity) streams; see Fig. 2 (left). This multi stream approach has been adopted as a convention by state of the art approaches.

However, the conventional setup has three major drawbacks. First, each modality stream is trained independently and the results are combined using late (decision) fusion. This deprives the processing pipeline of taking advantage of correlations across modalities. Second, with the addition of each new modality, the number of parameters increases by a significant margin since a separate network with the same model architecture is trained for each modality. Third, the skeleton is considered in a monolithic fashion. In other words, the entire input pose tree at each time step is treated as a whole and at once. This is counter-intuitive to the fact that a lot of action categories often involve only a subset of the available joints. For example, action categories such as "Cutting paper" or "Writing" can be easily identified using only hand joints, whereas action categories such as "Walking" or "Kicking" can be easily identified using only leg joints. Additionally, monolithic processing increases compute requirements when the number of joints in the pose representation increases [26]. Non-monolithic approaches which decompose the pose tree into disjoint part groups do exist [24,25]. However, each part group is not considered within the global pose frame, causing such methods to fall short.

Our proposed approach tackles all of the aforementioned drawbacks - see Fig. 2 (right). Our contributions are the following:

– We propose a unified modality processing approach as opposed to conventional independent modality approaches. This enables a significant reduction in the number of parameters (Sect. 3.2).
– We propose a part based stream processing approach which enables richer and dedicated representations for actions involving a subset of joints (Sect. 3.1). The part stream approach also enables efficient generalization to dense joint (NTU-X [26]) and small joint (SHREC [6]) datasets.
– Our architecture, dubbed Part Stream Unified Modality Network (PSUMNet), achieves SOTA performance on NTU 60-X/120-X and NTURGB+D 60/120 datasets compared to existing competing methods which use 100%–400% more parameters. PSUMNet also generalizes to the SHREC hand gestures dataset with competitive performance (Sect. 4.3).
– We perform extensive experiments and ablations to analyze and demonstrate the superiority of PSUMNet (Sect. 4).

The high accuracy provided by PSUMNet, coupled with its efficiency in terms of compute (number of parameters and floating-point operations), makes our approach an attractive choice for real world deployment on compute restricted embedded and edge devices - see Fig. 1. Code and pretrained models can be accessed at https://github.com/skelemoa/psumnet.

2 Related Work

Skeleton Action Recognition: Since the release of large scale skeleton based datasets [15,19], various CNN [10,14,30], RNN [8,9,30,31] and, recently, GCN based methods have been proposed for skeleton action recognition. ST-GCN [28] was the first successful approach to model the spatio-temporal relationships for skeleton actions at scale. Many state of the art approaches [2,5,16,22] have adopted and modified this approach to achieve superior results. However, these approaches predominantly process the skeleton joints in a monolithic manner, i.e. these approaches process the entire input skeleton at once, which can create a bottleneck when the input skeleton becomes denser, e.g. NTU-X [26].

Part Based Approaches: The idea of grouping skeleton joints into different groups has a few precedents. Du et al. [8] propose an RNN-based hierarchical grouping of part group representations. Thakkar et al. [25] propose a GCN based approach which applies modified graph convolutions to different part groups. Huang et al. [12] propose a GCN-based approach in which they utilize the higher order part level graph for better pooling and aligning of nodes in the main skeleton graph. More recently, Song et al. [24] propose a part-aware GCN method which utilizes part factorization to aid an attention mechanism to find the most informative part. Some previous part based approaches segment the limbs based on left and right orientation as well (left/right arm, left/right leg, etc.) [24,25]. Such segmentation leads to disjoint part groups which contain a very small number of joints and are unable to convey useful information. In contrast, our part stream approach creates overlapping part groups with a sufficient number of joints to model useful relationships. Also, each individual part group in our setup is registered to the global frame, unlike the per-group coordinate system setup in existing approaches. In addition, we employ a combination of part groups and a coarse version of the full skeleton instead of the part-group only approach seen in existing approaches. Our part stream approach allows each part based sub-skeleton to contribute towards the final prediction via decision fusion. To the best of our knowledge, such a globally registered independent part stream approach has never been used before.

Multi Stream Training: Earlier approaches [5,16,21] and more recent approaches [2,27] create multiple modalities termed joint, bone and velocity from the raw input skeleton data. The conventional method is to train the given architecture multiple times using different modality data, followed by decision fusion. However, this conventional approach with multiple versions of the base architecture greatly increases the total number of parameters. Song et al. [24] attempt a unified modality pipeline wherein early fusion of different modality streams is used to achieve a unified modality representation. However, before the fusion, each modality is processed via multiple independent networks, which again increases the count of trainable parameters.


Fig. 3. (a) Overall Architecture of one stream of the proposed architecture. The input skeleton is passed through Multi modality data generator (MMDG), which generates joint, bone, joint velocity and bone velocity data from input and concatenates each modality data into channel dimension as shown in (b). This multi-modal data is processed via Spatio Temporal Relational Module (STRM) followed by global average pooling and FC. (c) Spatio Temporal Relational Block (STRB), where input data is passed through Spatial Attention Map Generator (SAMG) for spatial relation modeling, followed by Temporal Relational Module. As shown in (a) multiple STRB stacked together make the STRM. (d) Spatial Attention Map Generator (SAMG), dynamically models adjacency matrix (Ahyb ) to model spatial relations between joints. Predefined adjacency matrix (A) is used for regularization. (e) Temporal Relational Module (TRM) consists of multiple temporal convolution blocks in parallel. Output of each temporal convolution block is concatenated to generate final features.

3 Methodology

We first describe our approach for factorizing the input skeleton into part groups and a coarser version of the skeleton (Sect. 3.1). Subsequently, we provide the architectural details of our deep network PSUMNet which processes these part streams (Sect. 3.2).

3.1 Part Stream Factorization

Let X ∈ R^(3×T×N) represent the T-frame, N-joint skeleton configuration of a 3D skeleton pose sequence for an action. We factorize X into the following three part groups - see Fig. 2 (right):

1. Coarse body (X_b): This comprises all joints in the original skeleton for the NTURGB+D skeleton topology, 25 joints in total. For the 67-joint dense skeleton topology of NTU-X [26], this stream comprises all the body joints but without the intermediate joints of each finger for each hand. Specifically, only 6 joints out of 21 finger joints are considered per hand, resulting in a total of 37 joints for NTU-X.
2. Hands (X_h): This contains all the finger joints in each hand and the arm joints. Note that the arms are rooted at the throat joint. For the NTURGB+D dataset, the number of joints for this stream is 13 and for NTU-X, the total number of joints is 48.
3. Legs (X_l): This includes all the joints in each of the legs. The leg joints are rooted at the hip joint. For the NTURGB+D dataset, the number of joints for this stream is 9 and for NTU-X, the total number of joints is 13.

As shown in Fig. 2 (right), the part group sub-skeletons are used to train three corresponding independent streams of our proposed PSUMNet (Sect. 3.2). As explained previously, our hypothesis is that many of the action categories are dominated by certain part groups and hence can be classified using only a subset of the entire input skeleton. To leverage this, we perform late decision fusion by performing a weighted average of the prediction scores from each of the part streams to obtain the final classification. Crucially, we change the number of layers in each of the streams in proportion to the number of input joints. We use 10, 6 and 4 layers respectively for the body, hands and legs streams. This helps restrict the total number of parameters used for the entire protocol.

In contrast with other part based approaches, the part groups in our setting are not completely disjoint. More crucially, the part groups are defined with respect to a shared global coordinate space. Though seemingly redundant due to multiple common joints across part groups, this design choice actually enables global motion information to propagate to the corresponding groups. Another significant advantage of such a part stream approach is the better scalability to much denser skeleton datasets such as NTU-X [26] and to sparser datasets such as SHREC [6].
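A minimal sketch of this factorization and of the weighted late fusion is given below for the 25-joint NTURGB+D skeleton. The joint indices and the fusion weights are placeholders chosen only for illustration; they are not the exact groups or weights used by PSUMNet.

    import torch

    # Hypothetical joint-index groups for a 25-joint skeleton (illustrative only).
    # All groups live in the same global coordinate frame and deliberately overlap.
    BODY_IDX  = list(range(25))                                  # coarse body: every joint
    HANDS_IDX = [2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 21, 22, 23]     # arms and hands, rooted at the throat
    LEGS_IDX  = [0, 12, 13, 14, 15, 16, 17, 18, 19]              # legs, rooted at the hip

    def split_part_streams(x):
        # x: (B, C, T, N) skeleton sequence -> one sub-skeleton tensor per part stream
        return {"body": x[..., BODY_IDX], "hands": x[..., HANDS_IDX], "legs": x[..., LEGS_IDX]}

    def late_fusion(scores, weights):
        # weighted average of per-stream class scores,
        # e.g. weights = {"body": 1.0, "hands": 0.7, "legs": 0.3} (placeholder values)
        total = sum(weights.values())
        return sum(weights[k] * scores[k] for k in scores) / total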

3.2 PSUMNet

In what follows, we explain the architecture of a single part stream of PSUMNet (e.g. X = X_b), since the architecture is shared across the part streams. An overview of PSUMNet's single stream architecture can be seen in Fig. 3(a). First, the input skeleton X is passed through the Multi Modality Data Generator (MMDG) to create a rich modality aware representation. This feature representation is processed by the Spatio-Temporal Relation Module (STRM). Global average pooling (GAP) of the processed result is transformed via fully connected layers (FC) to obtain the prediction for the single part stream. Next, we provide details for the various modules included in our architecture.

3.3 Multi Modality Data Generator (MMDG)

As shown in Fig. 3(b), this module processes the raw skeleton data and generates the corresponding multi modality data, i.e. joint, bone, joint-velocity and bone-velocity. The joint modality is the raw skeleton data represented by X = {x ∈ R^(C×T×N)}, where C, T and N are channels, time steps and joints. The bone modality data is obtained using the following equation:

    X_bone = {x[:, :, i] − x[:, :, i_nei] | i = 1, 2, ..., N}    (1)

where i_nei denotes the neighboring joint of i based on the predefined adjacency matrix. Next, we create joint-velocity and bone-velocity modality data using the following equations:

    X_joint-vel = {x[:, t + 1, :] − x[:, t, :] | t = 1, 2, ..., T, x ∈ X_joint}    (2)

    X_bone-vel = {x[:, t + 1, :] − x[:, t, :] | t = 1, 2, ..., T, x ∈ X_bone}    (3)

Finally, we concatenate all four modality data along the channel dimension to generate X ∈ R^(4C×T×N), which is fed as input to the network. Concatenating the modality data helps model the inter-modality relations in a more direct manner.
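Equations (1)–(3) and the channel-wise concatenation translate directly into a few tensor operations. In the sketch below, `neighbor` is a length-N list giving the predefined parent joint of each joint; the list itself is a placeholder, not the exact topology used above.

    import torch

    def mmdg(x, neighbor):
        # x: (B, C, T, N) joint coordinates; neighbor[i] is the predefined parent joint of joint i
        joint = x
        bone = x - x[..., neighbor]                                        # Eq. (1)
        joint_vel = torch.zeros_like(x)
        joint_vel[:, :, 1:, :] = joint[:, :, 1:, :] - joint[:, :, :-1, :]  # Eq. (2)
        bone_vel = torch.zeros_like(x)
        bone_vel[:, :, 1:, :] = bone[:, :, 1:, :] - bone[:, :, :-1, :]     # Eq. (3)
        return torch.cat([joint, bone, joint_vel, bone_vel], dim=1)        # (B, 4C, T, N)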

3.4 Spatio Temporal Relational Module (STRM)

The modality aware representation obtained from MMDG is processed by the Spatio Temporal Relational Module (STRM) as shown in Fig. 3(a). STRM consists of multiple Spatio Temporal Relational Blocks (STRB) stacked one after another. The architecture of a single STRB is shown in Fig. 3(c). Each STRB block contains a Spatial Attention Map Generator (SAMG) to dynamically model different spatial relations between joints, followed by a Temporal Relational Module (TRM) to model temporal relations between joints.

Spatial Attention Map Generator (SAMG): We dynamically model a spatial attention map for the spatial graph convolutions [2,20]. As shown in Fig. 3(d), we pass the input skeleton through two parallel branches, each consisting of a 1 × 1 convolution and a temporal pooling block. We pair-wise subtract the outputs of the parallel branches to model the attention map. We add the predefined adjacency matrix A as a regularization to the attention map to generate the final hybrid adjacency matrix A_hyb, i.e.

    A_hyb = α M(X_in) + A    (4)

where α is a learnable parameter and A is the predefined adjacency matrix. M is defined as:

    M(X_in) = σ(TP(φ(X_in)) − TP(ψ(X_in)))    (5)

where σ, φ and ψ are 1 × 1 convolutions and TP is temporal pooling. Once we obtain the adjacency matrix A_hyb, we pass the original input through a 1 × 1 convolution and multiply the result with the dynamic adjacency matrix to characterize the spatial relations between the joints as follows:

    X_out = A_hyb ⊗ θ(X_in)    (6)

where θ is a 1 × 1 convolution block and ⊗ is the matrix multiplication operation.
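The following PyTorch sketch mirrors Eqs. (4)–(6). The channel sizes, the use of temporal mean pooling for TP, and other implementation details are assumptions made for illustration rather than the exact SAMG implementation.

    import torch
    import torch.nn as nn

    class SAMG(nn.Module):
        # Spatial Attention Map Generator: A_hyb = alpha * M(X) + A, then X_out = A_hyb x theta(X)
        def __init__(self, in_ch, out_ch, A):
            super().__init__()
            self.phi = nn.Conv2d(in_ch, out_ch, 1)
            self.psi = nn.Conv2d(in_ch, out_ch, 1)
            self.sigma = nn.Conv2d(out_ch, out_ch, 1)
            self.theta = nn.Conv2d(in_ch, out_ch, 1)
            self.alpha = nn.Parameter(torch.zeros(1))     # learnable mixing weight
            self.register_buffer("A", A)                  # predefined (N, N) adjacency used as regularizer

        def forward(self, x):                             # x: (B, C, T, N)
            a = self.phi(x).mean(dim=2)                   # temporal pooling -> (B, C', N)
            b = self.psi(x).mean(dim=2)
            M = self.sigma(a.unsqueeze(-1) - b.unsqueeze(-2))   # Eq. (5): pairwise differences -> (B, C', N, N)
            A_hyb = self.alpha * M + self.A               # Eq. (4)
            feat = self.theta(x)                          # (B, C', T, N)
            return torch.einsum("bctn,bcnm->bctm", feat, A_hyb)  # Eq. (6): spatial aggregation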

Temporal Relation Module (TRM): We use multiple parallel convolution blocks to model the temporal relations between the joints of the input skeleton, as shown in Fig. 3(e). Each temporal convolution block is a standard 2D convolution with varying kernel sizes in the temporal dimension and with dilation. This helps model temporal relations at multiple scales. The outputs from each of the temporal convolution blocks are concatenated. The result is processed by GAP and FC layers and mapped to a prediction (softmax) layer as mentioned previously.

Since each part group (body, hands, legs) contains a significantly different number of joints, we adjust the number of STRBs and the depth of the network for each stream accordingly, as shown in Fig. 2 (right). This design choice provides two advantages. First, it reduces the total number of parameters by 50%–80%. Second, adjusting the depth of the network in proportion to the joint count enables richer dedicated representations for actions whose dynamics are confined to the corresponding part groups, resulting in better performance overall.
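To complement the SAMG sketch above, a minimal TRM sketch with parallel dilated temporal convolutions is given below; the specific kernel sizes and dilations are placeholders, not PSUMNet's exact multi-scale configuration.

    import torch
    import torch.nn as nn

    class TRM(nn.Module):
        # Temporal Relation Module: parallel temporal convolutions at several scales, concatenated
        def __init__(self, channels, kernel_sizes=(3, 5), dilations=(1, 2)):
            super().__init__()
            n_branch = len(kernel_sizes) * len(dilations)
            self.branches = nn.ModuleList([
                nn.Conv2d(channels, channels // n_branch, kernel_size=(k, 1),
                          padding=((k - 1) * d // 2, 0), dilation=(d, 1))
                for k in kernel_sizes for d in dilations])

        def forward(self, x):                             # x: (B, C, T, N)
            return torch.cat([branch(x) for branch in self.branches], dim=1)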

4 Experiments

4.1 Datasets

NTURGB+D [19] is a large scale skeleton action recognition dataset with 60 different actions performed by 40 different subjects. The dataset contains 25-joint human skeletons captured using Microsoft Kinect V2 cameras. There are a total of 56,880 action sequences. There are two evaluation protocols for this dataset: first, the Cross Subject (XSub) split, where actions performed by 20 subjects fall into the training set and the rest into the test set; second, the Cross View (XView) protocol, where actions captured via camera IDs 2 and 3 are used as the training set and actions captured via camera ID 1 are used as the test set.

NTURGB+D 120 [15] is an extension of the NTURGB+D dataset with 60 additional action categories and a total of 113,945 action samples. The actions are performed by a total of 106 subjects. There are two evaluation protocols for this dataset: first, the Cross Subject (XSub) split, where actions performed by 53 subjects fall into the training set and the rest into the test set; second, the Cross Setup (XSet) protocol, where actions with even setup IDs are used as the training set and the rest as the test set.

NTU60-X [26] is an RGB-derived skeleton dataset for the sequences of the original NTURGB+D dataset. The skeleton provided in this dataset is much denser and contains 67 joints. There are a total of 56,148 action samples, and the evaluation protocol is the same as for the NTURGB+D dataset.

NTU120-X [26] is the extension of the NTU60-X dataset and corresponds to the action sequences provided by the NTURGB+D 120 dataset. There are a total of 113,821 samples in this dataset, and the evaluation protocol is the same as for the NTURGB+D 120 dataset. Following [26], we evaluate our model only on the Cross Subject protocol of the NTU60-X and NTU120-X datasets.


SHREC [6] is a 3D skeleton based hand gesture recognition dataset. There are a total of 2800 samples, with 1960 samples in the train set and 840 samples in the test set. Each sample has 20–50 frames, and the gestures are performed by 28 participants, once using only one finger and once using the whole hand. There are 14-gesture and 28-gesture splits provided by the creators, and we report results on both of these splits.

4.2 Implementation and Optimization Details

As shown in Fig. 2 (right), the input skeleton to each of the part streams contains a different number of joints. For the NTURGB+D dataset, the body stream has an input skeleton with a total of 25 joints, the hands stream has an input skeleton with a total of 13 joints, and the legs stream a total of 9 joints. Within the PSUMNet architecture, we use 10 STRBs for the body stream, 6 STRBs for the hands stream and 4 STRBs for the legs stream. We implement PSUMNet using the PyTorch deep learning framework. We use the SGD optimizer with a base learning rate of 0.1 and a weight decay of 0.0005. All the models are trained on systems with 4 1080Ti 12 GB GPUs. For training on the 25-joint datasets (NTU60 and NTU120), we use a batch size of 200. For the 67-joint datasets (NTU60-X and NTU120-X), a smaller batch size of 65 is used due to the much denser skeleton.
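The optimizer settings above map onto a standard PyTorch training setup, sketched below. The momentum value and the learning-rate schedule are assumptions added for completeness, since only the base learning rate, weight decay and batch sizes are stated here.

    import torch

    def build_training(model, train_set, dense_skeleton=False):
        # SGD with base lr 0.1 and weight decay 5e-4 as stated above;
        # momentum 0.9 and the step schedule are illustrative assumptions.
        optimizer = torch.optim.SGD(model.parameters(), lr=0.1,
                                    momentum=0.9, weight_decay=0.0005)
        scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[60, 80], gamma=0.1)
        # batch size 200 for the 25-joint skeletons, 65 for the denser 67-joint NTU-X skeletons
        loader = torch.utils.data.DataLoader(train_set,
                                             batch_size=65 if dense_skeleton else 200,
                                             shuffle=True)
        return optimizer, scheduler, loader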

4.3 Results

Table 1 compares the performance of the proposed PSUMNet with other approaches on the Cross Subject (XSub) and Cross View (XView) splits of the NTURGB+D dataset [19] and the Cross Subject (XSub) and Cross Setup (XSet) splits of the NTURGB+D 120 dataset [15]. As can be seen from the Params. column in Table 1, PSUMNet uses the smallest number of parameters compared to other methods and achieves better or very comparable results across the different splits of the datasets. For the harder Cross Subject split of both NTURGB+D and NTURGB+D 120, PSUMNet achieves state of the art performance compared to other approaches which use 100%–400% more parameters. This shows the superiority of PSUMNet both in terms of performance and efficiency; also see Fig. 1.

We also compare the performance of only the body stream of PSUMNet with the single stream (i.e. only joint, only bone) performance of other approaches in Table 2 for the XSub split of the NTURGB+D and NTURGB+D 120 datasets. As can be seen, PSUMNet outperforms other approaches by a margin of 1–2% for NTURGB+D and by 2–3% for NTURGB+D 120 using almost the same or a smaller number of parameters. This also supports our hypothesis that the part stream based unified modality approach is much more efficient compared to the conventional independent modality stream approach.

Trivedi et al. [26] introduced NTU60-X and NTU120-X, extensions of the existing NTURGB+D and NTURGB+D 120 datasets with 67-joint dense skeletons containing fine-grained finger joints within the full body pose tree.


Table 1. Comparison with state of the art approaches for the NTURGB+D and NTURGB+D 120 datasets. Model parameters are in millions (×10^6) and FLOPs are in billions (×10^9). ∗: These numbers are cumulative over all the streams used by the respective models as per their training protocol.

Type             Model              Params.∗ (M)  FLOPs (G)  NTU60 XSub  NTU60 XView  NTU120 XSub  NTU120 XSet
CNN based        VA-Fusion [30]     24.6          -          89.4        95.0         -            -
                 TaCNN+ [27]        4.4           1.0        90.7        95.1         86.7         87.3
GCN based        ST-GCN [28]        3.1           16.3       81.5        88.3         70.7         73.2
                 RA-GCN [23]        6.2           32.8       87.3        93.6         81.1         82.7
                 2s-AGCN [21]       6.9           37.3       88.5        95.1         82.9         84.9
                 PA-ResGCN [24]     3.6           18.5       90.9        96.0         87.3         88.3
                 DDGCN [13]         -             -          91.1        97.1         -            -
                 DGNN [20]          26.2          -          89.9        95.0         -            -
                 MS-G3D [16]        6.4           48.8       91.5        96.2         86.9         88.4
                 4s-ShiftGCN [5]    2.8           10.0       90.7        96.5         85.9         87.6
                 DC-GCN+ADG [4]     4.9           25.7       90.8        96.6         86.5         88.1
                 DualHead-Net [1]   12.0          -          92.0        96.6         88.2         89.3
                 CTR-GCN [2]        5.6           7.6        92.4        96.8         88.9         90.6
Attention based  DSTA-Net [22]      14.0          64.7       91.5        96.4         86.6         89.0
                 ST-TR [18]         12.1          259.4      89.9        96.1         82.7         84.7
                 4s-MST-GCN [3]     12.0          -          91.5        96.6         87.5         88.8
                 PSUMNet (ours)     2.8           2.7        92.9        96.7         89.4         90.6

Table 2. Comparison of only the body stream of PSUMNet with the best performing modality (i.e. only joint, only bone) of state of the art approaches for the NTURGB+D 60 and 120 datasets on the Cross Subject protocol.

Model                     Params. (M)  NTU60  NTU120
PA-ResGCN [24]            3.6          90.9   87.3
MS-G3D (joint) [16]       3.2          89.4   84.5
1s-ShiftGCN (joint) [5]   0.8          87.8   80.9
DSTA-Net (bone) [22]      3.5          88.4   84.4
DualHead-Net (bone) [1]   3.0          90.7   86.7
CTR-GCN (bone) [2]        1.4          90.6   85.7
TaCNN+ (joint) [27]       1.1          89.6   82.6
MST-GCN (bone) [3]        3.0          89.5   84.8
PSUMNet (ours) (body)     1.4          91.9   88.1

Handling such a large number of joints while keeping the total parameters of the model in bounds is a difficult task. However, as shown in Table 3, PSUMNet achieves state of the art performance for both the NTU60-X and NTU120-X datasets. The total parameters increase by a small margin for PSUMNet to handle the additional joints, yet it is worth noting that other competing approaches use 100%–400% more parameters as compared to PSUMNet.


Table 3. Comparison with state of the art approaches for the dense skeleton datasets NTU60-X and NTU120-X.

Model            Params. (M)  NTU60-X  NTU120-X
PA-ResGCN [24]   3.6          91.6     86.4
MS-G3D [16]      6.4          91.8     87.1
4s-ShiftGCN [5]  2.8          91.8     86.2
DSTA-Net [22]    14.0         93.5     87.8
CTR-GCN [2]      5.6          93.9     88.3
PSUMNet (ours)   3.2          94.7     89.1

Table 4. Comparison with state of the art approaches for the SHREC skeleton hand gesture recognition dataset.

Model               Params. (M)  14 gestures  28 gestures
Key-Frame CNN [6]   7.9          82.9         71.9
CNN+LSTM [17]       8.0          89.8         86.3
Parallel CNN [7]    13.8         91.3         84.4
STA-Res TCN [11]    6.0          93.6         90.7
DDNet [29]          1.8          94.6         91.9
DSTA-Net [22]       14.0         97.0         93.9
PSUMNet (ours)      0.9          95.5         93.1

This shows the benefit of using the part based stream approach for dense skeleton representation as well.

To further explore the generalization capability of our proposed method, we evaluate the performance of PSUMNet on the skeleton based hand gesture recognition dataset SHREC [6]. Taking advantage of the part based stream approach, we train only the hands stream of PSUMNet. As shown in Table 4, PSUMNet achieves comparable results to the existing state of the art method (DSTA-Net [22]), which uses 1400% more parameters. PSUMNet outperforms the second best approach (DDNet [29]), which uses 100% more parameters.

Overall, Tables 1, 3 and 4 comprehensively show that the proposed PSUMNet achieves state of the art performance, generalizes easily across a range of action datasets and uses a significantly smaller number of parameters compared to other methods.

4.4 Analysis

As explained in Sect. 3.1, we train PSUMNet using three part streams, namely the body, hands and legs streams, and report the ensembled results from all three streams. To understand the advantage of the proposed part stream approach, we compare stream wise per class accuracy of PSUMNet for NTU120-X and NTU60-X.


Fig. 4. Comparing per class accuracy after training PSUMNet using only the hands stream versus only the body stream for the NTU120-X dataset (left), and only the legs stream versus only the body stream for the NTU60-X dataset (right). On observing the class labels, we can see that all the actions in the left plot are dominated by hand joint movements and all the actions in the right plot are dominated by leg joint movements; hence the streams corresponding to these parts are able to classify these classes better, which is in line with our hypothesis.

Fig. 5. Comparing PSUMNet with current state of the art method, CTR-GCN on partially observed sequences for NTURGB+D 120 (XSub) dataset. Annotated numbers for each line plot denote accuracy of both models on partial sequences.

Figure 4 (left) depicts the per class accuracy comparison between the 'only hands stream' and 'only body stream' settings of PSUMNet for the NTU120-X dataset. The classes shown correspond to those with the largest (positive) gain in per class accuracy while using only the hands stream. Upon observing the action labels of these classes ("Cutting Paper", "Writing", "Folding Paper"), it is evident that these classes are dominated by hand joint movements and hence are better classified using only a subset of the input skeleton which has dedicated representations for hand joints, as opposed to using the entire skeleton in a monolithic fashion. Similarly, we also compare the per class accuracy while using only the legs stream against only the body stream of PSUMNet for the NTU60-X dataset, as shown in Fig. 4 (right). In this case too, the class labels with the highest positive gain while using only the legs stream correspond to expected classes such as "Walking" and "Kicking".

The above results can also be appreciated better by studying the number of parameters in each of the part based streams. The body stream in PSUMNet has 1.4M parameters, the hands stream has 0.9M and the legs stream has 0.5M parameters. Hence, the hands stream, while using only 65% of the total parameters of the body stream, and the legs stream, while using only 35% of the body stream parameters, can better identify those classes which are dominated by joints corresponding to each part stream.

Early Action Recognition: In the experiments so far, evaluation was conducted on the full action sequence. In other words, the predicted label is known only after all the frames are provided to the network. However, there can be anticipatory scenarios where we may wish to know the predicted action label without waiting for the entire action to finish. To examine the performance in such scenarios, we create test sequences whose length is determined in terms of a fraction of the total sequence length. We study the trends in accuracy as the % of total sequence length is steadily increased. For comparison, we study PSUMNet against the state of the art network CTR-GCN [2]. As can be seen in Fig. 5, PSUMNet consistently outperforms CTR-GCN for partially observed sequences, indicating its suitability for early action recognition.
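One straightforward way to construct such partially observed test sequences is to keep only the leading fraction of frames and zero out the rest, as in the sketch below; this is an illustrative implementation, not necessarily the exact evaluation protocol used here.

    import torch

    def truncate_sequence(x, fraction):
        # x: (B, C, T, N); keep the first `fraction` of frames, zero the remainder
        T = x.shape[2]
        keep = max(1, int(T * fraction))
        out = torch.zeros_like(x)
        out[:, :, :keep, :] = x[:, :, :keep, :]
        return out

    # accuracy-vs-observation curve, e.g. over fractions 0.2, 0.4, ..., 1.0:
    # for f in (0.2, 0.4, 0.6, 0.8, 1.0):
    #     evaluate(model, lambda batch: truncate_sequence(batch, f))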

4.5 Ablations

To understand the contribution of each part stream in PSUMNet, we provide the individual stream wise performance of PSUMNet on the NTU60 and NTU120 Cross Subject splits as ablations in Table 5. At the single stream level, the body stream achieves higher accuracy compared to the hands and legs streams. This is expected since the body stream includes a coarse version of all the joints. However, as mentioned previously (Sect. 4.4), the hands and legs streams classify actions dominated by their respective joints better. Therefore, the accuracies of the Body+Hands (row 4 in Table 5) and Body+Legs (row 5) variants are higher than the body stream alone. The legs stream achieves lower accuracy compared to the body and hands streams because only a small subset of action categories is dominated by leg joint movements. However, as with the hands stream, the legs stream benefits classes which involve significant leg joint movements.

Our proposed part group factorization registers each group's sub-skeleton in a global frame of reference (see Fig. 2). Further, the part groups are not disjoint and have overlapping joints to better propagate global motion information through the network (Sect. 3.1). To justify our choice of globally registered part groups, we perform an ablation with a different part grouping strategy, with each part group being disjoint and in a local frame of reference. Specifically, the ablation setup for the body stream includes only 9 torso joints (including the shoulder and hip joints), the hands stream includes only 12 joints and the legs stream includes only 8 joints. It is important to notice here that, unlike our original strategy, the legs and hands in the corresponding part streams are not connected.


Table 5. Individual stream performance on the NTURGB+D and NTURGB+D 120 Cross Subject datasets.

Type                    Stream                Params. (M)  NTU60  NTU120
Part streams            Body                  1.4          91.9   88.1
                        Hands                 0.9          90.3   85.8
                        Legs                  0.5          60.4   50.6
                        Body + Hands          2.3          92.4   89.0
                        Body + Legs           1.9          92.1   87.9
                        Hands + Legs          1.4          90.9   86.5
Disjoint parts          Body + Hands + Legs   2.8          89.6   86.1
Modalities in PSUMNet   Joint                 2.8          90.3   86.1
                        Bone                  2.8          90.1   87.6
                        Joint-Vel             2.8          88.5   82.7
                        Bone-Vel              2.8          87.6   83.2
                        Joint + Bone          2.8          91.4   88.6
                        PSUMNet               2.8          92.9   89.4

As expected, such a strategy fails to capture global motion information, unlike our proposed method (c.f. 'Disjoint parts' row and last row in Table 5). To further investigate the contribution of each data modality in our proposed unified modality method, we provide ablation studies with PSUMNet trained on single and two modalities instead of four (c.f. 'Modalities in PSUMNet' rows and last row in Table 5). We notice that PSUMNet benefits most from the joint and bone modalities compared to the velocity modalities. However, the best performance is obtained by utilizing all the modalities.

5 Conclusion

In this work, we present the Part Stream Unified Modality Network (PSUMNet) to efficiently tackle the challenging task of scalable pose-based action recognition. PSUMNet uses part based streams and avoids treating the input skeleton in a monolithic fashion as done by contemporary approaches. This choice enables richer and dedicated representations, especially for actions dominated by a small subset of localized joints (hands, legs). The unified modality approach introduced in this work enables efficient utilization of the inter-modality correlations. Overall, the design choices provide two key benefits: (1) they help attain state of the art performance using a significantly smaller number of parameters compared to existing methods; (2) they allow PSUMNet to easily scale to both sparse and dense skeleton action datasets in distinct domains (full body, hands) while maintaining high performance. PSUMNet is an attractive choice for pose-based action recognition, especially in real world deployment scenarios involving compute restricted embedded and edge devices.

References

1. Chen, T., et al.: Learning multi-granular spatio-temporal graph network for skeleton-based action recognition. In: Proceedings of the 29th ACM International Conference on Multimedia, pp. 4334–4342 (2021)
2. Chen, Y., Zhang, Z., Yuan, C., Li, B., Deng, Y., Hu, W.: Channel-wise topology refinement graph convolution for skeleton-based action recognition. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 13359–13368 (2021)
3. Chen, Z., Li, S., Yang, B., Li, Q., Liu, H.: Multi-scale spatial temporal graph convolutional network for skeleton-based action recognition. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 35, pp. 1113–1122 (2021)
4. Cheng, K., Zhang, Y., Cao, C., Shi, L., Cheng, J., Lu, H.: Decoupling GCN with DropGraph module for skeleton-based action recognition. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12369, pp. 536–553. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58586-0_32
5. Cheng, K., Zhang, Y., He, X., Chen, W., Cheng, J., Lu, H.: Skeleton-based action recognition with shift graph convolutional network. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2020)
6. De Smedt, Q., Wannous, H., Vandeborre, J.P., Guerry, J., Saux, B.L., Filliat, D.: 3D hand gesture recognition using a depth and skeletal dataset: SHREC'17 track. In: Proceedings of the Workshop on 3D Object Retrieval, pp. 33–38 (2017)
7. Devineau, G., Xi, W., Moutarde, F., Yang, J.: Convolutional neural networks for multivariate time series classification using both inter- and intra-channel parallel convolutions. In: Reconnaissance des Formes, Image, Apprentissage et Perception (RFIAP 2018) (2018)
8. Du, Y., Wang, W., Wang, L.: Hierarchical recurrent neural network for skeleton based action recognition. In: 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1110–1118 (2015). https://doi.org/10.1109/CVPR.2015.7298714
9. Han, F., Reily, B., Hoff, W., Zhang, H.: Space-time representation of people based on 3D skeletal data: a review. Comput. Vis. Image Underst. 158, 85–105 (2017)
10. Hernandez Ruiz, A., Porzi, L., Rota Bulò, S., Moreno-Noguer, F.: 3D CNNs on distance matrices for human action recognition. In: Proceedings of the 2017 ACM on Multimedia Conference, MM 2017, pp. 1087–1095. ACM, New York (2017). https://doi.org/10.1145/3123266.3123299
11. Hou, J., Wang, G., Chen, X., Xue, J.-H., Zhu, R., Yang, H.: Spatial-temporal attention res-TCN for skeleton-based dynamic hand gesture recognition. In: Leal-Taixé, L., Roth, S. (eds.) ECCV 2018. LNCS, vol. 11134, pp. 273–286. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-11024-6_18
12. Huang, L., Huang, Y., Ouyang, W., Wang, L.: Part-level graph convolutional network for skeleton-based action recognition. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 34, no. 07, pp. 11045–11052, April 2020
13. Korban, M., Li, X.: DDGCN: a dynamic directed graph convolutional network for action recognition. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12365, pp. 761–776. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58565-5_45


14. Li, C., Zhong, Q., Xie, D., Pu, S.: Co-occurrence feature learning from skeleton data for action recognition and detection with hierarchical aggregation. In: Proceedings of the 27th International Joint Conference on Artificial Intelligence, IJCAI 2018, pp. 786–792. AAAI Press (2018)
15. Liu, J., Shahroudy, A., Perez, M., Wang, G., Duan, L.Y., Kot, A.C.: NTU RGB+D 120: a large-scale benchmark for 3D human activity understanding. IEEE Trans. Pattern Anal. Mach. Intell. (2019). https://doi.org/10.1109/TPAMI.2019.2916873
16. Liu, Z., Zhang, H., Chen, Z., Wang, Z., Ouyang, W.: Disentangling and unifying graph convolutions for skeleton-based action recognition. In: The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2020
17. Nunez, J.C., Cabido, R., Pantrigo, J.J., Montemayor, A.S., Velez, J.F.: Convolutional neural networks and long short-term memory for skeleton-based human activity and hand gesture recognition. Pattern Recogn. 76, 80–94 (2018)
18. Plizzari, C., Cannici, M., Matteucci, M.: Skeleton-based action recognition via spatial and temporal transformer networks. Comput. Vis. Image Underst. 208, 103219 (2021)
19. Shahroudy, A., Liu, J., Ng, T.T., Wang, G.: NTU RGB+D: a large scale dataset for 3D human activity analysis. In: The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2016
20. Shi, L., Zhang, Y., Cheng, J., Lu, H.: Skeleton-based action recognition with directed graph neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 7912–7921 (2019)
21. Shi, L., Zhang, Y., Cheng, J., Lu, H.: Two-stream adaptive graph convolutional networks for skeleton-based action recognition. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 12026–12035 (2019)
22. Shi, L., Zhang, Y., Cheng, J., Lu, H.: Decoupled spatial-temporal attention network for skeleton-based action-gesture recognition. In: ACCV (2020)
23. Song, Y.F., Zhang, Z., Shan, C., Wang, L.: Richly activated graph convolutional network for robust skeleton-based action recognition. IEEE Trans. Circ. Syst. Video Technol. 31(5), 1915–1925 (2020)
24. Song, Y.F., Zhang, Z., Shan, C., Wang, L.: Stronger, faster and more explainable: a graph convolutional baseline for skeleton-based action recognition. In: Proceedings of the 28th ACM International Conference on Multimedia (ACMMM), pp. 1625–1633. Association for Computing Machinery, New York (2020). https://doi.org/10.1145/3394171.3413802
25. Thakkar, K.C., Narayanan, P.J.: Part-based graph convolutional network for action recognition. In: British Machine Vision Conference 2018, BMVC 2018, Newcastle, UK, 3–6 September 2018, p. 270. BMVA Press (2018). bmvc2018.org/contents/papers/1003.pdf
26. Trivedi, N., Thatipelli, A., Sarvadevabhatla, R.K.: NTU-X: an enhanced large-scale dataset for improving pose-based recognition of subtle human actions. In: Proceedings of the Twelfth Indian Conference on Computer Vision, Graphics and Image Processing, pp. 1–9 (2021)
27. Xu, K., Ye, F., Zhong, Q., Xie, D.: Topology-aware convolutional neural network for efficient skeleton-based action recognition. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36 (2022)
28. Yan, S., Xiong, Y., Lin, D.: Spatial temporal graph convolutional networks for skeleton-based action recognition. In: AAAI (2018)
29. Yang, F., Wu, Y., Sakti, S., Nakamura, S.: Make skeleton-based action recognition model smaller, faster and better. In: Proceedings of the ACM Multimedia Asia, pp. 1–6 (2019)


30. Zhang, P., Lan, C., Xing, J., Zeng, W., Xue, J., Zheng, N.: View adaptive neural networks for high performance skeleton-based human action recognition. IEEE Trans. Pattern Anal. Mach. Intell. 41(8), 1963–1978 (2019)
31. Zhao, R., Wang, K., Su, H., Ji, Q.: Bayesian graph convolution LSTM for skeleton based action recognition. In: The IEEE International Conference on Computer Vision (ICCV), October 2019

YOLO5Face: Why Reinventing a Face Detector

Delong Qi(1), Weijun Tan(1,2,3)(B), Qi Yao(2), and Jingfeng Liu(2)

1 Shenzhen Deepcam, Shenzhen, China
{delong.qi,weijun.tan}@deepcam.com
2 Jovision-Deepcam Research Institute, Shenzhen, China
{sz.twj,sz.yaoqi,sz.ljf}@jovision.com, {qi.yao,jingfeng.liu}@deepcam.com
3 LinkSprite Technology, Longmont, CO 80503, USA
[email protected]

Abstract. Tremendous progress has been made on face detection in recent years using convolutional neural networks. While many face detectors use designs designated for the detection of faces, we treat face detection as a general object detection task. We implement a face detector based on the YOLOv5 object detector and call it YOLO5Face. We add a five-point landmark regression head into it and use the Wing loss function. We design detectors with different model sizes, from a large model to achieve the best performance to a super small model for real-time detection on an embedded or mobile device. Experiment results on the WiderFace dataset show that our face detectors can achieve state-of-the-art performance on almost all the Easy, Medium, and Hard subsets, exceeding the more complex designated face detectors. The code is available at https://github.com/deepcam-cn/yolov5-face.

Keywords: Face detection · Convolutional neural network · YOLO · Real-time · Embedded device · Object detection

1 Introduction

Face detection is a very important computer vision task. Tremendous progress has been made since deep learning, particularly the convolutional neural network (CNN), has been used in this task. As the first step of many tasks, including face recognition, verification, tracking, alignment, and expression analysis, face detection attracts a lot of research and development in academia and industry, and the performance of face detection has improved significantly over the years. For a survey of face detection, please refer to the benchmark results [39,40]. There are many methods in this field from different perspectives. Research directions include the design of CNN networks, loss functions, data augmentations, and training strategies. For example, in the YOLOv4 paper, the authors explore all these research directions and propose the YOLOv4 object detector based on optimizations of network architecture, selection of bags of freebies, and selection of bags of specials [1].


In our approach, we treat face detection as a general object detection task. We have the same intuition as TinaFace [55]. Intuitively, a face is an object. As discussed in TinaFace [55], from the perspective of data, the properties that a face has, like pose, scale, occlusion, illumination, blur, etc., also exist in other objects. The unique properties of faces, like expression and makeup, can also correspond to distortion and color in objects. Landmarks are special to faces, but they are not unique either. They are just key points of an object. For example, in license plate detection, landmarks are also used. And adding landmark regression to the object prediction head is straightforward. From the perspective of challenges encountered by face detection, like multi-scale, small faces, and dense scenes, they all exist in generic object detection. Thus, face detection is just a sub-task of general object detection.

In this paper, we follow this intuition and design a face detector based on the YOLOv5 object detector [42]. We modify the design for face detection considering large faces, small faces, and landmark supervision for different complexities and applications. Our goal is to provide a portfolio of models for different applications, from very complex ones to get the best performance to very simple ones to get the best trade-off of performance and speed on embedded or mobile devices. Our main contributions are summarized as follows:

– We redesign the YOLOv5 object detector [42] as a face detector and call it YOLO5Face. We implement key modifications to the network to improve the performance in terms of mean average precision (mAP) and speed. The details of these modifications will be presented in Sect. 3.
– We design a series of models of different model sizes, from large models to medium models to super small models, for the needs of different applications. In addition to the backbone used in YOLOv5 [42], we implement a backbone based on ShuffleNetV2 [26], which gives state-of-the-art (SOTA) performance and fast speed on a mobile device.
– We evaluate our models on the WiderFace [40] dataset. On VGA resolution images, almost all our models achieve SOTA performance and fast speed. This supports our claim: as the title of this paper says, we do not need to reinvent a face detector since YOLO5Face can accomplish it.

Please note that our work contributes not to the novelty of new ideas but to the engineering community of face detection. Since being open-sourced, our code has been used widely in many projects.

2 Related Work

2.1 Object Detection

General object detection aims at locating and classifying the pre-defined objects in a given image. Before deep CNNs were used, traditional face detection used hand-crafted features, like HAAR, HOG, LBP, SIFT, DPM, ACF, etc. The seminal work by Viola and Jones [51] introduces the integral image to compute HAAR-like features. For a survey of face detection using hand-crafted features, please refer to [38,52]. Since deep CNNs have shown their power in many machine learning tasks, face detection is dominated by deep CNN methods. There are two-stage and one-stage object detectors. Typical two-stage methods are the RCNN family, including RCNN [12], fast-RCNN [11], faster-RCNN [31], mask-RCNN [15], and Cascade-RCNN [2]. The two-stage object detectors have very good performance but suffer from long latency and slow speed. To overcome this problem, one-stage object detectors are studied. Typical one-stage networks include SSD [23] and YOLO [1,28–30,42]. Other object detection networks include FPN [21], MMDetection [4], EfficientDet [33], the transformer-based DETR [3], CenterNet [8,54], and so on.

2.2 Face Detection

The research on face detection follows general object detection. After the most popular and challenging face detection benchmark, the WiderFace dataset [40], was released, face detection developed rapidly, focusing on the extreme and real variation problems including scale, pose, occlusion, expression, makeup, illumination, blur, etc. A lot of methods have been proposed to deal with these problems, particularly the scale, context, and anchors, in order to detect small faces. These methods include MTCNN [44], FaceBox [48], S3FD [47], DSFD [19], RetinaFace [7], RefineFace [45], and the most recent ASFD [43], MaskFace [41], TinaFace [55], MogFace [24], and SCRFD [13]. For a list of popular face detectors, the readers are referred to the WiderFace website [39]. It is worth noting that some of these face detectors explore unique characteristics of the human face; the others are just general object detectors adopted and modified for face detection. Take RetinaFace [7] as an example: it uses landmark (2D and 3D) regression to help the supervision of face detection, while TinaFace [55] is simply a general object detector.

2.3 YOLO

YOLO first appeared in 2015 [29] as a different approach than the popular two-stage approaches. It treats object detection as a regression problem rather than a classification problem. It performs all the essential stages to detect an object using a single neural network. As a result, it achieves not only very good detection performance but also real-time speed. Furthermore, it has excellent generalization capability and can be easily trained to detect different objects.


Over the next five years, the YOLO algorithm was upgraded to five versions with many innovative ideas from the object detection community. The first three versions - YOLOv1 [29], YOLOv2 [30], and YOLOv3 [28] - were developed by the author of the original YOLO algorithm. Of these three versions, YOLOv3 [28] is a milestone, with big improvements in performance and speed from introducing multi-scale features (FPN) [21], a better backbone network (Darknet53), and replacing the Softmax classification loss with the binary cross-entropy loss. In early 2020, after the original YOLO authors withdrew from the research field, YOLOv4 [1] was released by a different research team. The team explores a lot of options in almost all aspects of the YOLOv3 [28] algorithm, including the backbone and what they call bags of freebies and bags of specials. It achieves 43.5% AP (65.7% AP50) on the MS COCO dataset at a real-time speed of 65 FPS on a Tesla V100. One month later, YOLOv5 [42] was released by another research team. From the algorithm perspective, YOLOv5 [42] does not have many innovations, and the team did not publish a paper. These facts bring quite some controversy about whether it should be called YOLOv5. However, due to its significantly reduced model size, faster speed, similar performance to YOLOv4 [1], and full implementation in Python (PyTorch), it has been welcomed by the object detection community.

3 YOLO5Face Face Detector

Summary of Key Modifications. In this section, we present the key modifications we make to YOLOv5 to turn it into a face detector, YOLO5Face:
1) We add a landmark regression head to the YOLOv5 network.
2) We replace the Focus layer of YOLOv5 [42] with a Stem block structure [37].
3) We change the SPP block [16] to use smaller kernels.
4) We add a P6 output block with a stride of 64.
5) We optimize the data augmentation methods for face detection.
6) We design two super lightweight models based on ShuffleNetV2 [26].

3.1 Network Architecture

Fig. 1. The proposed YOLO5Face network architecture.

We use the YOLOv5 object detector [42] as our baseline and optimize it for face detection. The network architecture of our YOLO5Face face detector is depicted in Fig. 1. It consists of a backbone, a neck, and a head. In YOLOv5, a newly designed backbone called CSPNet [42] is used. In the neck, an SPP [16] and a PAN [22] are used to aggregate the features. In the head, both regression and classification are used. In Fig. 1(a), the overall network architecture is depicted, where the newly added P6 block is highlighted with a box. In Fig. 1(b), a key block called CBS is defined, which consists of a Conv layer, a BN layer, and a SiLU [9] activation function. This CBS block is used in many other blocks. In Fig. 1(c), the output label of the head is shown, which includes the bounding box (bbox), confidence (conf), classification (cls), and five-point landmarks. The landmarks are our addition to YOLOv5 to make it a face detector with landmark output. Without the landmarks, the last dimension would be 6 instead of 16. Please note that the output dimensions 80 × 80 × 16 in P3, 40 × 40 × 16 in P4, 20 × 20 × 16 in P5, and 10 × 10 × 16 in the optional P6 are per anchor; the real dimension is obtained by multiplying by the number of anchors.
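As an illustration of this per-anchor layout, the minimal sketch below splits a 16-channel prediction into its parts. The channel ordering is an assumption made for illustration only and is not taken from the released code.

```python
import torch

def split_head_output(pred):
    """Split one anchor's 16-dim prediction into its components.

    Assumed layout (illustrative, not the authors' exact ordering):
    4 box values, 1 objectness/confidence, 10 landmark coordinates
    (5 points x 2), and 1 class score.
    """
    box = pred[..., 0:4]         # cx, cy, w, h
    conf = pred[..., 4:5]        # objectness / confidence
    landmarks = pred[..., 5:15]  # x1, y1, ..., x5, y5
    cls = pred[..., 15:16]       # face class score
    return box, conf, landmarks, cls

# Example: a P3 output grid of 80 x 80 cells with 16 channels per anchor.
p3 = torch.randn(80, 80, 16)
box, conf, lmk, cls = split_head_output(p3)
```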


In Fig. 1(d), the Stem structure [37] that replaces the original Focus layer of YOLOv5 is shown. The introduction of the Stem block into YOLOv5 for face detection is one of our innovations. In Fig. 1(e), a CSP block (C3) is shown. This block is inspired by DenseNet [18]. However, instead of adding the full input to the output after some CNN layers, the input is split into two halves: one half is passed through a CBS block and a number of Bottleneck blocks, shown in Fig. 1(f), while the other half is sent to a Conv layer. The two outputs are then concatenated and followed by another CBS block. In Fig. 1(g), an SPP block [16] is shown. In this block, the three kernel sizes 13 × 13, 9 × 9, and 5 × 5 of YOLOv5 are revised to 7 × 7, 5 × 5, and 3 × 3 in our face detector. This is one of the innovations that improve face detection performance. Note that we only consider VGA-resolution input images: the longer edge of the input image is scaled to 640, and the shorter edge is scaled accordingly and then adjusted to be a multiple of the largest output stride. For example, when P6 is not used, the shorter edge needs to be a multiple of 32; when P6 is used, it needs to be a multiple of 64.
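The input-size rule can be made concrete with the small sketch below. The function name and the simple rescaling (rather than the repository's exact letterboxing logic) are illustrative assumptions.

```python
def scaled_input_size(height, width, long_side=640, stride=32):
    """Scale the longer edge to `long_side` and round the shorter edge up
    to a multiple of `stride` (32 without P6, 64 when P6 is used)."""
    scale = long_side / max(height, width)
    short = round(min(height, width) * scale)
    short = ((short + stride - 1) // stride) * stride  # round up to a multiple
    return (long_side, short) if height >= width else (short, long_side)

# A 480 x 640 VGA image with P6 enabled (stride 64): shorter edge 480 -> 512.
print(scaled_input_size(480, 640, stride=64))  # (512, 640)
```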

3.2 Landmark Regression

Landmarks are important characteristics of a human face. They can be used for face alignment, face recognition, facial expression analysis, age analysis, etc. Traditional landmark sets consist of 68 points; they were simplified to 5 points in MTCNN [44]. Since then, five-point landmarks have been used widely in face recognition, and their quality affects the quality of face alignment and face recognition. A general object detector does not include landmarks, but it is straightforward to add them as a regression head, so we add one to our YOLO5Face. The landmark outputs are used to align face images before they are sent to the face recognition network. General loss functions for landmark regression are L2, L1, or smooth-L1; MTCNN [44] uses the L2 loss function. However, it has been found that these loss functions are not sensitive to small errors. To overcome this problem, the Wing loss was proposed [10]:

$$\mathrm{wing}(x) = \begin{cases} w \cdot \ln(1 + |x|/e), & \text{if } |x| < w \\ |x| - C, & \text{otherwise} \end{cases} \quad (1)$$

where $x$ is the error signal. The non-negative $w$ sets the range of the nonlinear part to $(-w, w)$, $e$ limits the curvature of the nonlinear region, and $C = w - w\ln(1 + w/e)$ is a constant that smoothly links the piecewise-defined linear and nonlinear parts. Plotted in Fig. 2 is this Wing loss function with different parameters $w$ and $e$. It can be seen that the response in the small-error region near zero is boosted compared to the L2, L1, or smooth-L1 functions.


The loss function for the landmark point vector $s = \{s_i\}$ and its ground truth $s' = \{s'_i\}$, where $i = 1, 2, \ldots, 10$, is defined as

$$\mathrm{loss}_L(s, s') = \sum_i \mathrm{wing}(s_i - s'_i) \quad (2)$$

Let the general object detection loss function of YOLOv5 be $\mathrm{loss}_O$ (covering bounding box, class, and objectness probability); the new total loss function is then

$$\mathrm{loss} = \mathrm{loss}_O + \lambda_L \cdot \mathrm{loss}_L \quad (3)$$

where $\lambda_L$ is a weighting factor for the landmark regression loss.
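A minimal PyTorch sketch of Eqs. (1)–(3) is given below. The parameter values w and e are placeholders, since the paper does not state the values used; λ_L = 0.5 follows Sect. 4.2, and loss_obj stands in for the standard YOLOv5 detection loss.

```python
import math
import torch

def wing_loss(pred, target, w=10.0, e=2.0):
    """Wing loss of Eq. (1), applied element-wise to the landmark offsets.

    w and e are placeholder values; the paper does not report the ones used.
    """
    c = w - w * math.log(1.0 + w / e)          # smoothly links the two pieces
    diff = (pred - target).abs()
    small = w * torch.log(1.0 + diff / e)      # non-linear part for |x| < w
    large = diff - c                           # linear part otherwise
    return torch.where(diff < w, small, large).sum(dim=-1)  # Eq. (2): sum over 10 coords

def total_loss(loss_obj, pred_lmk, gt_lmk, lambda_l=0.5):
    """Eq. (3): detection loss plus the weighted landmark loss."""
    return loss_obj + lambda_l * wing_loss(pred_lmk, gt_lmk).mean()
```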

3.3 Stem Block Structure

We use a stem block similar to [37], shown in Fig. 1(d). With this stem block, we implement a stride of 2 in the first spatial down-sampling of the input image and increase the number of channels. The stem block increases the computational complexity only marginally while ensuring a strong representation capability.
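A plausible PyTorch sketch of such a stem block, in the spirit of the PeleeNet-style design of [37], is shown below; the exact layer configuration of Fig. 1(d) may differ, and the channel sizes are assumptions.

```python
import torch
import torch.nn as nn

class CBS(nn.Module):
    """Conv + BatchNorm + SiLU, mirroring the CBS block of Fig. 1(b)."""
    def __init__(self, c_in, c_out, k=1, s=1):
        super().__init__()
        self.conv = nn.Conv2d(c_in, c_out, k, s, k // 2, bias=False)
        self.bn = nn.BatchNorm2d(c_out)
        self.act = nn.SiLU()

    def forward(self, x):
        return self.act(self.bn(self.conv(x)))

class StemBlock(nn.Module):
    """A PeleeNet-style stem in the spirit of [37]; layer sizes are illustrative."""
    def __init__(self, c_in=3, c_out=32):
        super().__init__()
        self.stem = CBS(c_in, c_out, k=3, s=2)       # first stride-2 down-sampling
        self.branch_conv = nn.Sequential(
            CBS(c_out, c_out // 2, k=1, s=1),
            CBS(c_out // 2, c_out, k=3, s=2),
        )
        self.branch_pool = nn.MaxPool2d(kernel_size=2, stride=2)
        self.fuse = CBS(2 * c_out, c_out, k=1, s=1)  # merge the two branches

    def forward(self, x):
        x = self.stem(x)
        return self.fuse(torch.cat([self.branch_conv(x), self.branch_pool(x)], dim=1))
```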

3.4 SPP with Smaller Kernels

Before being forwarded to the feature aggregation blocks in the neck, the output feature maps of the YOLOv5 backbone are sent to an additional SPP block [16] to increase the receptive field and separate out the most important features. Unlike many CNN models whose fully connected layers only accept input images of specific dimensions, SPP was proposed to generate a fixed-size output irrespective of the input size. In addition, SPP helps to extract important features by pooling the feature map at multiple scales. In YOLOv5, three kernel sizes, 13 × 13, 9 × 9, and 5 × 5, are used [42]. We revise them to the smaller kernels 7 × 7, 5 × 5, and 3 × 3. These smaller kernels help to detect small faces more effectively and increase the overall face detection performance.
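A minimal sketch of an SPP module with the smaller kernels is shown below; the channel sizes and the surrounding 1 × 1 convolutions are illustrative assumptions rather than the exact YOLOv5 implementation.

```python
import torch
import torch.nn as nn

class SPP(nn.Module):
    """Spatial pyramid pooling [16] with the smaller kernels (3, 5, 7) used here.

    The original SPP block in YOLOv5 uses kernels (5, 9, 13).
    """
    def __init__(self, c_in, c_out, kernels=(3, 5, 7)):
        super().__init__()
        c_mid = c_in // 2
        self.reduce = nn.Conv2d(c_in, c_mid, 1, 1)
        self.pools = nn.ModuleList(
            nn.MaxPool2d(kernel_size=k, stride=1, padding=k // 2) for k in kernels
        )
        self.fuse = nn.Conv2d(c_mid * (len(kernels) + 1), c_out, 1, 1)

    def forward(self, x):
        x = self.reduce(x)
        # concatenate the un-pooled features with the multi-scale pooled versions
        return self.fuse(torch.cat([x] + [p(x) for p in self.pools], dim=1))
```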

3.5 P6 Output Block

The backbone of the YOLO object detector has many layers. As the features become more and more abstract in deeper layers, the spatial resolution of the feature maps decreases due to downsampling, which leads to a loss of spatial information as well as fine-grained features. To preserve these fine-grained features, the FPN [21] was introduced in YOLOv3 [28]. In FPN [21], the fine-grained features take a long path traveling from low-level to high-level layers. To overcome this problem, PAN adds a bottom-up augmentation path alongside the top-down path used in FPN. In addition, in the connection of the feature maps to the lateral architecture, the element-wise addition operation is replaced with concatenation. In FPN, object predictions are made independently on different scale levels, which does not utilize information from the other feature maps and may produce duplicated predictions. In PAN [22], the output feature maps of the bottom-up augmentation pyramid are fused using ROI (Region of Interest) align and fully connected layers with an element-wise max operation. In YOLOv5, there are three output blocks on the PAN output feature maps, called P3, P4, and P5, corresponding to 80 × 80 × 16, 40 × 40 × 16, and 20 × 20 × 16, with strides 8, 16, and 32, respectively. In our YOLO5Face, we add an extra P6 output block, whose feature map is 10 × 10 × 16 with stride 64. This modification particularly helps the detection of large faces. While almost all face detectors focus on improving the detection of small faces, the detection of large faces can easily be overlooked. We fill this hole by adding the P6 output block.

3.6 ShuffleNetV2 as Backbone

ShuffleNet [49] is an extremely efficient CNN for mobile devices. Its key block, the ShuffleNet block, utilizes two new operations, pointwise group convolution and channel shuffle, to greatly reduce the computation cost while maintaining accuracy. ShuffleNetV2 [26] is an improved version of ShuffleNet. It borrows a shortcut architecture similar to DenseNet [18], with element-wise addition changed to concatenation, similar to the change made in PAN [22] in YOLOv5 [42]. Unlike DenseNet, however, ShuffleNetV2 does not concatenate densely, and after the concatenation, channel shuffling is used to mix the features. This makes ShuffleNetV2 a super-fast network. We use ShuffleNetV2 as the backbone in YOLOv5 and implement the super small face detectors YOLOv5n-Face and YOLOv5n0.5-Face.
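For illustration, a minimal sketch of the channel-shuffle operation that mixes features across the two branches after concatenation is given below; the tensor shapes in the usage example are arbitrary.

```python
import torch

def channel_shuffle(x, groups):
    """Channel shuffle used in ShuffleNetV2 blocks: interleave channels across
    groups after a concatenation so information flows between branches."""
    n, c, h, w = x.shape
    x = x.view(n, groups, c // groups, h, w)   # split channels into groups
    x = x.transpose(1, 2).contiguous()         # interleave the groups
    return x.view(n, c, h, w)

# Example: shuffle the channels of a concatenated two-branch feature map.
feat = torch.randn(1, 64, 40, 40)
shuffled = channel_shuffle(feat, groups=2)
```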

4 Experiments

4.1 Dataset

The WiderFace dataset [40] is the largest face detection dataset, containing 32,203 images and 393,703 faces. With its large variety of scale, pose, occlusion, expression, illumination, and event, it is close to reality and very challenging. The whole dataset is divided into train/validation/test sets with a 50%/10%/40% ratio within each event class. Furthermore, each subset is divided into three levels of difficulty: Easy, Medium, and Hard. As its name indicates, the Hard subset is the most challenging, so performance on the Hard subset best reflects the effectiveness of a face detector. Unless specified otherwise, the WiderFace dataset [40] is used in this work. For face recognition with YOLO5Face landmarks and alignment, the WebFace dataset [56] is used. The FDDB dataset [36] is used in testing to demonstrate our model's performance on cross-domain datasets.


Table 1. Details of the implemented YOLO5Face models, where (D, W) are the depth and width multiples of the YOLOv5 CSPNet [42]. The numbers of parameters and FLOPs are listed in Table 4.

| Model | Backbone | (D, W) | With P6? |
| --- | --- | --- | --- |
| YOLOv5s | YOLOv5-CSPNet [42] | (0.33, 0.50) | No |
| YOLOv5s6 | YOLOv5-CSPNet | (0.33, 0.50) | Yes |
| YOLOv5m | YOLOv5-CSPNet | (0.50, 0.75) | No |
| YOLOv5m6 | YOLOv5-CSPNet | (0.50, 0.75) | Yes |
| YOLOv5l | YOLOv5-CSPNet | (1.0, 1.0) | No |
| YOLOv5l6 | YOLOv5-CSPNet | (1.0, 1.0) | Yes |
| YOLOv5n | ShuffleNetv2 [26] | – | No |
| YOLOv5n-0.5 | ShuffleNetv2-0.5 [26] | – | No |

4.2 Implementation Details

We use the YOLOv5-4.0 codebase [42] as our starting point and implement all the modifications described earlier in PyTorch. The SGD optimizer is used. The initial learning rate is 1E−2, the final learning rate is 1E−5, and the weight decay is 5E−3. A momentum of 0.8 is used in the first three warm-up epochs; after that, the momentum is changed to 0.937. The training runs for 250 epochs with a batch size of 64. The weighting factor $\lambda_L = 0.5$ is optimized by exhaustive search.

Implemented Models. We implement a series of face detector models, as listed in Table 1. We implement eight relatively large models, including extra-large models (YOLOv5x, YOLOv5x6), large models (YOLOv5l, YOLOv5l6), medium models (YOLOv5m, YOLOv5m6), and small models (YOLOv5s, YOLOv5s6). In a model name, the suffix 6 means that the model includes the P6 output block. These models all use the YOLOv5 CSPNet as the backbone with different depth and width multiples, denoted as D and W in Table 1. Furthermore, we implement two super small models, YOLOv5n and YOLOv5n0.5, which use ShuffleNetv2 and ShuffleNetv2-0.5 [26] as the backbone. Except for the backbone, all other main blocks, including the stem block, SPP, and PAN, are the same as in the larger models. The numbers of parameters and FLOPs of all these models are listed in Table 4 for comparison with existing methods.
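A rough PyTorch sketch of this training schedule is given below; the linear learning-rate annealing and the stand-in model are assumptions, as the exact schedule is handled by the YOLOv5 codebase.

```python
import torch

model = torch.nn.Linear(10, 2)  # stand-in for the YOLO5Face model
optimizer = torch.optim.SGD(model.parameters(),
                            lr=1e-2,          # initial learning rate
                            momentum=0.8,     # warm-up momentum, first 3 epochs
                            weight_decay=5e-3)

total_epochs, warmup_epochs, final_lr = 250, 3, 1e-5

for epoch in range(total_epochs):
    if epoch == warmup_epochs:               # switch to nominal momentum after warm-up
        for g in optimizer.param_groups:
            g["momentum"] = 0.937
    # anneal the learning rate from 1e-2 towards 1e-5 over training (assumed linear)
    lr = 1e-2 + (final_lr - 1e-2) * epoch / (total_epochs - 1)
    for g in optimizer.param_groups:
        g["lr"] = lr
    # ... one epoch of training with batch size 64 goes here ...
```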

4.3 Ablation Study

In this subsection, we present the effects of the modifications we make in YOLO5Face. In this study, we use the YOLOv5s model, the WiderFace [40] validation dataset, and the mAP as the performance metric. The results are presented in Table 2. Please note that we run the experiments incrementally, adding one modification at a time.


Table 2. Ablation study results on the WiderFace validation dataset.

| Modification | Easy | Medium | Hard |
| --- | --- | --- | --- |
| Baseline (Focus block) | 94.70 | 92.90 | 83.27 |
| + Stem block | 94.46 | 92.72 | 83.57 |
| + Landmark | 94.47 | 92.79 | 83.21 |
| + SPP (3, 5, 7) | 94.66 | 92.83 | 83.33 |
| + Ignore small faces | 94.71 | 93.01 | 83.33 |
| + P6 block | 95.29 | 93.61 | 83.27 |

Stem Block. We use the standard YOLOv5 [42], which has a Focus layer, as the baseline. By replacing the Focus layer with our stem block, the mAP on the Hard subset increases by 0.30%, which is non-trivial. The mAPs on the Easy and Medium subsets degrade slightly.

Landmark. Landmarks are used in many applications, including face recognition, where they are used to align the detected face image. Using the stem block as the baseline, adding the landmarks achieves about the same performance on the Easy and Medium subsets while causing a small performance loss on the Hard subset. We show the benefit of our landmarks for face recognition in the next subsection.

SPP with Smaller Kernels. The stem block and landmarks are used in this test. The SPP kernel sizes (13 × 13, 9 × 9, 5 × 5) are changed to (7 × 7, 5 × 5, 3 × 3). The mAPs improve on all of the Easy, Medium, and Hard subsets, by 0.1–0.2%.

Data Augmentation. A few data augmentation methods are studied. We find that ignoring small faces, combined with random crop and Mosaic [1], helps the mAPs: it improves the mAPs on the Easy and Medium subsets by 0.1–0.2%.

P6 Output Block. The P6 block brings significant mAP improvements of about 0.5–0.6% on the Easy and Medium subsets. Perhaps motivated by our findings, the P6 block was added to YOLOv5 [42] after our work. Please note that, since we do not use exactly the same parameters in this ablation study as in the final benchmark tests, the results in Table 2 may differ slightly from those in Table 4.

4.4 YOLO5Face for Face Recognition

Landmarks are critical for face recognition accuracy. In RetinaFace [7], the accuracy of the landmarks is evaluated with the MSE between the estimated landmark coordinates and their ground truth and with the face recognition accuracy; the results show that RetinaFace has better landmarks than the older MTCNN [44]. In this work, we use the face recognition framework in [27] to evaluate the accuracy of the landmarks of YOLO5Face.

Table 3. Evaluation of YOLO5Face landmarks on the WebFace test dataset [56]. Two YOLO5Face models, the small model YOLOv5s and the medium model YOLOv5m, are used.

| Backbone | FaceDetect | Training dataset | FNMR |
| --- | --- | --- | --- |
| R124 | RetinaFace [7] | WiderFace [40] | 0.1065 |
| R124 | YOLOv5s | WiderFace | 0.1060 |
| R124 | YOLOv5s | +Multi-task facial [53] | 0.1058 |
| R124 | YOLOv5m | WiderFace | 0.1056 |
| R124 | YOLOv5m | +Multi-task facial | 0.1051 |
| R124R152 | YOLOv5m | All above | 0.1021 |
| R152R200 | YOLOv5m | All above | 0.0923 |

We use the WebFace test dataset, which is the largest face dataset, with a noisy set of 4M identities/260M faces and a cleaned set of 2M identities/42M faces [56]. This dataset is used in the ICCV 2021 Masked Face Recognition (MFR) challenge [57]. In this challenge, both masked face images and standard face images are included, and the metric False Non-Match Rate (FNMR) at False Match Rate (FMR) = 1E−5 is used. FNMR*0.25 for MFR plus FNMR*0.75 for standard face recognition is used as the final combined metric. By default, RetinaFace [7] is used as the face detector on this dataset. We compare our YOLO5Face with RetinaFace on this dataset. We use the ArcFace [6] framework with ResNet124 or larger backbones [14]. We also explore concatenating features from two parallel models to obtain better performance. We replace RetinaFace with our YOLO5Face and test two models, the small model YOLOv5s and the medium model YOLOv5m. More details can be found in [27]. The results are listed in Table 3. From the results, we see that both our small and medium models outperform RetinaFace [7]. In addition, we notice that there are very few large face images in the WiderFace dataset, so we add some large face images from the Multi-task-facial dataset [53] into the YOLO5Face training dataset. We find that this technique improves face recognition performance. Shown in Fig. 3 are some detected WebFace [56] faces and landmarks using RetinaFace [7] and our YOLOv5m. On faces with large pose, we can visually observe that our landmarks are more accurate, which is supported by our face recognition results in Table 3.

4.5 YOLO5Face on WiderFace Dataset

We compare our YOLO5Face with many existing face detectors on the WiderFace dataset. The results are listed in Table 4, where the previous SOTA results and our best results are both highlighted. We first look at the performance of relatively large models, whose number of parameters is larger than 3M and number of FLOPs is larger than 5G. The existing methods achieve mAPs of 94.27–96.06% on the Easy subset, 91.9–94.92% on the Medium subset, and 71.39–85.29% on the Hard subset. The most recently released SCRFD [13] achieves the best performance on all subsets. Our YOLO5Face (YOLOv5x6) achieves 96.67%, 95.08%, and 86.55% on the three subsets, respectively. We achieve the SOTA performance on all of the Easy, Medium, and Hard subsets.

Fig. 2. The precision-recall (PR) curves of face detectors, (a) validation-Easy, (b) validation-Medium, (c) validation-Hard, (d) test-Easy, (e) test-Medium, (f) test-Hard.


Table 4. Comparison of our YOLO5Face and existing face detectors on the WiderFace validation dataset [40].

| Detector | Backbone | Easy | Medium | Hard | Params (M) | Flops (G) |
| --- | --- | --- | --- | --- | --- | --- |
| DSFD [19] | ResNet152 [14] | 94.29 | 91.47 | 71.39 | 120.06 | 259.55 |
| RetinaFace [7] | ResNet50 [14] | 94.92 | 91.90 | 64.17 | 29.50 | 37.59 |
| HAMBox [25] | ResNet50 [14] | 95.27 | 93.76 | 76.75 | 30.24 | 43.28 |
| TinaFace [55] | ResNet50 [14] | 95.61 | 94.25 | 81.43 | 37.98 | 172.95 |
| SCRFD-34GF [13] | Bottleneck ResNet | 96.06 | 94.92 | 85.29 | 9.80 | 34.13 |
| SCRFD-10GF [13] | Basic ResNet [14] | 95.16 | 93.87 | 83.05 | 3.86 | 9.98 |
| Our YOLOv5s | YOLOv5-CSPNet [42] | 94.33 | 92.61 | 83.15 | 7.075 | 5.751 |
| Our YOLOv5s6 | YOLOv5-CSPNet | 95.48 | 93.66 | 82.8 | 12.386 | 6.280 |
| Our YOLOv5m | YOLOv5-CSPNet | 95.30 | 93.76 | 85.28 | 21.063 | 18.146 |
| Our YOLOv5m6 | YOLOv5-CSPNet | 95.66 | 94.1 | 85.2 | 35.485 | 19.773 |
| Our YOLOv5l | YOLOv5-CSPNet | 95.9 | 94.4 | 84.5 | 46.627 | 41.607 |
| Our YOLOv5l6 | YOLOv5-CSPNet | 96.38 | 94.90 | 85.88 | 76.674 | 45.279 |
| Our YOLOv5x6 | YOLOv5-CSPNet | 96.67 | 95.08 | 86.55 | 141.158 | 88.665 |
| SCRFD-2.5GF [13] | Basic ResNet | 93.78 | 92.16 | 77.87 | 0.67 | 2.53 |
| SCRFD-0.5GF [13] | Depth-wise Conv | 90.57 | 88.12 | 68.51 | 0.57 | 0.508 |
| RetinaFace [7] | MobileNet0.25 [32] | 87.78 | 81.16 | 47.32 | 0.44 | 0.802 |
| FaceBoxes [48] | – | 76.17 | 57.17 | 24.18 | 1.01 | 0.275 |
| Our YOLOv5n | ShuffleNetv2 [26] | 93.61 | 91.54 | 80.53 | 1.726 | 2.111 |
| Our YOLOv5n0.5 | ShuffleNetv2-0.5 [26] | 90.76 | 88.12 | 73.82 | 0.447 | 0.571 |

Fig. 3. Some examples of detected faces and landmarks, where the first row is from RetinaFace [7] and the second row is from our YOLOv5m.

Next, we look at the performance of super small models, whose number of parameters is less than 2M and number of FLOPs is less than 3G. The existing methods achieve mAPs of 76.17–93.78% on the Easy subset, 57.17–92.16% on the Medium subset, and 24.18–77.87% on the Hard subset. Again, SCRFD [13] achieves the best performance on all subsets. Our YOLO5Face (YOLOv5n) achieves 93.61%, 91.54%, and 80.53% on the three subsets, respectively. Our face detector performs slightly worse than SCRFD [13] on the Easy and Medium subsets; however, on the Hard subset, our face detector leads by 2.66%. Furthermore, our smallest model, YOLOv5n0.5, still has good performance even though its model size is much smaller.


Table 5. Evaluation of YOLO5Face on the FDDB dataset [36].

| Method | mAP |
| --- | --- |
| ASFD [43] | 0.9911 |
| RefineFace [45] | 0.9911 |
| PyramidBox [34] | 0.9869 |
| FaceBoxes [48] | 0.9598 |
| Our YOLOv5s | 0.9843 |
| Our YOLOv5m | 0.9849 |
| Our YOLOv5l | 0.9867 |
| Our YOLOv5l6 | 0.9880 |

The precision-recall (PR) curves of our YOLO5Face face detector, along with those of its competitors, are shown in Fig. 2. The leading competitors include DFS [35], ISRN [46], VIM-FD [50], DSFD [19], PyramidBox++ [20], SRN [5], PyramidBox [34], and more. For a full list of the competitors and their results on the WiderFace [40] validation and test datasets, please refer to [39]. On the validation dataset, our YOLOv5x6-Face detector achieves 96.9%, 96.0%, and 91.6% mAP on the Easy, Medium, and Hard subsets, respectively, exceeding the previous SOTA by 0.0%, 0.1%, and 0.4%. On the test dataset, our YOLOv5x6-Face detector achieves 95.8%, 94.9%, and 90.5% mAP on the Easy, Medium, and Hard subsets, respectively, with a 1.1%, 1.0%, and 0.7% gap to the previous SOTA. Please note that, in these evaluations, we only use multiple scales and left-right flipping, without other test-time augmentation (TTA) methods. Our focus is more on VGA input images, where we achieve the SOTA in almost all conditions.

4.6 YOLO5Face on FDDB Dataset

The FDDB dataset [36] is a small dataset with 5,171 faces annotated in 2,845 images. To demonstrate YOLO5Face's performance on a cross-domain dataset, we test it on FDDB without retraining on it. The true positive rates (TPR) at 1,000 false positives are listed in Table 5. Please note that RefineFace [45] points out that the FDDB annotation misses many faces; to achieve its performance of 0.9911, RefineFace modifies the FDDB annotation. In our evaluation, we use the original FDDB annotation without modifications. RetinaFace [7] has not been evaluated on the FDDB dataset.

5 Conclusion

In this paper, we present YOLO5Face, a face detector based on the YOLOv5 object detector [42]. We implement eight models. Both the largest model, YOLOv5l6, and the super small model, YOLOv5n, achieve performance close to or exceeding the SOTA on the WiderFace [40] validation Easy, Medium, and Hard subsets. This demonstrates the effectiveness of YOLO5Face in not only achieving the best performance but also running fast. Since we open-sourced the code, many applications and mobile apps have been developed based on our design and have achieved impressive performance.

References

1. Bochkovskiy, A., Wang, C.Y., Liao, H.Y.M.: YOLOv4: optimal speed and accuracy of object detection. arXiv preprint arXiv:2004.10934 (2020)
2. Cai, Z., Vasconcelos, N.: Cascade R-CNN: delving into high quality object detection. In: CVPR (2018)
3. Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12346, pp. 213–229. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58452-8_13
4. Chen, K., et al.: MMDetection: open MMLab detection toolbox and benchmark. In: ECCV (2020)
5. Chi, C., Zhang, S., Xing, J., Lei, Z., Li, S.Z.: SRN - selective refinement network for high performance face detection. arXiv preprint arXiv:1809.02693 (2018)
6. Deng, J., Guo, J., Xue, N., Zafeiriou, S.: ArcFace: additive angular margin loss for deep face recognition. In: CVPR, June 2019
7. Deng, J., Guo, J., Zhou, Y., Yu, J., Kotsia, I., Zafeiriou, S.: RetinaFace: single-stage dense face localisation in the wild. In: CVPR (2020)
8. Duan, K., Bai, S., Xie, L., Qi, H., Huang, Q., Tian, Q.: CenterNet: keypoint triplets for object detection. In: ICCV (2019)
9. Elfwing, S., Uchibe, E., Doya, K.: Sigmoid-weighted linear units for neural network function approximation in reinforcement learning. arXiv preprint arXiv:1702.03118 (2017)
10. Feng, Z., Kittler, J., Awais, M., Huber, P., Wu, X.: Wing loss for robust facial landmark localisation with convolutional neural networks. In: CVPR (2018)
11. Girshick, R.: Fast R-CNN. In: Proceedings of the International Conference on Computer Vision (ICCV) (2015)
12. Girshick, R., Donahue, J., Darrell, T., Malik, J.: Rich feature hierarchies for accurate object detection and semantic segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2014)
13. Guo, J., Deng, J., Lattas, A., Zafeiriou, S.: Sample and computation redistribution for efficient face detection. arXiv preprint arXiv:2105.04714 (2021)
14. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: CVPR (2016)
15. He, K., Gkioxari, G., Dollár, P., Girshick, R.: Mask R-CNN. In: Proceedings of the International Conference on Computer Vision (ICCV) (2017)
16. He, K., Zhang, X., Ren, S., Sun, J.: Spatial pyramid pooling in deep convolutional networks for visual recognition. TPAMI (2015)
17. He, T., Zhang, Z., Zhang, H., Zhang, Z., Xie, J., Li, M.: Bag of tricks for image classification with convolutional neural networks. In: CVPR (2019)
18. Huang, G., Liu, Z., Maaten, L., Weinberger, K.: Densely connected convolutional networks. In: CVPR (2017)


19. Li, J., et al.: DSFD: dual shot face detector. arXiv preprint arXiv:1810.10220 (2018)
20. Li, Z., Tang, X., Han, J., Liu, J., He, Z.: PyramidBox++: high performance detector for finding tiny face. arXiv preprint arXiv:1904.00386 (2019)
21. Lin, T., Dollár, P., Girshick, R., He, K., Hariharan, B., Belongie, S.: Feature pyramid networks for object detection. In: CVPR (2017)
22. Liu, S., Qi, L., Qin, H., Shi, J., Jia, J.: Path aggregation network for instance segmentation. arXiv preprint arXiv:1803.01534 (2018)
23. Liu, W., et al.: SSD: single shot multibox detector. In: ECCV (2016)
24. Liu, Y., Wang, F., Sun, B., Li, H.: MogFace: rethinking scale augmentation on the face detector. arXiv preprint arXiv:2103.11139 (2021)
25. Liu, Y., Tang, X., Wu, X., Han, J., Liu, J., Ding, E.: HAMBox: delving into online high-quality anchors mining for detecting outer faces. In: CVPR (2020)
26. Ma, N., Zhang, X., Zheng, H., Sun, J.: ShuffleNet V2: practical guidelines for efficient CNN architecture design. arXiv preprint arXiv:1807.11164 (2018)
27. Qi, D., Hu, K., Tan, W., Yao, Q., Liu, J.: Balanced masked and standard face recognition. In: ICCV Workshops (2021)
28. Redmon, J., Farhadi, A.: YOLOv3: an incremental improvement. arXiv preprint arXiv:1804.02767 (2018)
29. Redmon, J., Divvala, S., Girshick, R., Farhadi, A.: You only look once: unified, real-time object detection. In: CVPR (2016)
30. Redmon, J., Farhadi, A.: YOLO9000: better, faster, stronger. In: CVPR (2017)
31. Ren, S., He, K., Girshick, R., Sun, J.: Faster R-CNN: towards real-time object detection with region proposal networks. IEEE Trans. Pattern Anal. Mach. Intell. (2016)
32. Sandler, M., Howard, A., Zhu, W., Zhmoginov, A., Chen, L.: MobileNetV2: inverted residuals and linear bottlenecks. In: CVPR (2018)
33. Tan, M., Pang, R., Le, Q.: EfficientDet: scalable and efficient object detection. In: CVPR (2020)
34. Tang, X., Du, D.K., He, Z., Liu, J.: PyramidBox: a context-assisted single shot face detector. arXiv preprint arXiv:1803.07737 (2018)
35. Tian, W., Wang, Z., Shen, H., Deng, W., Chen, B., Zhang, X.: Learning better features for face detection with feature fusion and segmentation supervision. arXiv preprint arXiv:1811.08557 (2018)
36. Jain, V., Learned-Miller, E.: FDDB: a benchmark for face detection in unconstrained settings. University of Massachusetts Report (UM-CS-2010-009) (2010)
37. Wang, R.J., Li, X., Ling, C.X.: Pelee: a real-time object detection system on mobile devices. In: NeurIPS (2018)
38. Yang, M., Kriegman, D., Ahuja, N.: Detecting faces in images: a survey. TPAMI 24(1), 34–58 (2002)
39. Yang, S., Luo, P., Loy, C.C., Tang, X.: Wider face: a face detection benchmark. shuoyang1213.me/WIDERFACE/index.html
40. Yang, S., Luo, P., Loy, C.C., Tang, X.: Wider face: a face detection benchmark. In: CVPR (2016)
41. Yashunin, D., Baydasov, T., Vlasov, R.: MaskFace: multi-task face and landmark detector. arXiv preprint arXiv:2005.09412 (2020)
42. YOLOv5. github.com/ultralytics/yolov5
43. Zhang, B., et al.: ASFD: automatic and scalable face detector. arXiv preprint arXiv:2003.11228 (2020)
44. Zhang, K., Zhang, Z., Li, Z., Qiao, Y.: Joint face detection and alignment using multitask cascaded convolutional networks. IEEE Sig. Process. Lett. 23(10), 1499–1503 (2016)


45. Zhang, S., Chi, C., Lei, Z., Li, S.: RefineFace: refinement neural network for high performance face detection. arXiv preprint arXiv:1909.04376 (2019)
46. Zhang, S., et al.: ISRN - improved selective refinement network for face detection. arXiv preprint arXiv:1901.06651 (2019)
47. Zhang, S., Zhu, X., Lei, Z., Shi, H., Wang, X., Li, S.Z.: S3FD: single shot scale-invariant face detector. In: ICCV (2017)
48. Zhang, S., Zhu, X., Lei, Z., Shi, H., Wang, X., Li, S.Z.: FaceBoxes: a CPU real-time face detector with high accuracy. In: IJCB (2017)
49. Zhang, X., Zhou, X., Lin, M., Sun, J.: ShuffleNet: an extremely efficient convolutional neural network for mobile devices. arXiv preprint arXiv:1707.01083 (2017)
50. Zhang, Y., Xu, X., Liu, X.: Robust and high performance face detector. arXiv preprint arXiv:1901.02350 (2019)
51. Viola, P., Jones, M.J.: Robust real-time face detection. IJCV 57, 137–154 (2004). https://doi.org/10.1023/B:VISI.0000013087.49260.fb
52. Zhang, C., Zhang, Z.: A survey of recent advances in face detection. Technical report, Microsoft Research (2010)
53. Zhao, R., Liu, T., Xiao, J., Lun, D.P.K., Lam, K.M.: Deep multi-task learning for facial expression recognition and synthesis based on selective feature sharing. In: ICPR (2020)
54. Zhou, X., Wang, D., Krähenbühl, P.: Objects as points. arXiv preprint arXiv:1904.07850 (2019)
55. Zhu, Y., Cai, H., Zhang, S., Wang, C., Xiong, W.: TinaFace: strong but simple baseline for face detection. arXiv preprint arXiv:2011.13183 (2020)
56. Zhu, Z., et al.: WebFace260M: a benchmark unveiling the power of million-scale deep face recognition. In: CVPR (2021)
57. Zhu, Z., et al.: Masked face recognition challenge: the WebFace260M track report. In: ICCV Workshops (2021)

Counterfactual Fairness for Facial Expression Recognition

Jiaee Cheong1,2(B), Sinan Kalkan3, and Hatice Gunes1

1 University of Cambridge, Cambridge, UK
[email protected]
2 Alan Turing Institute, London, UK
[email protected]
3 Middle East Technical University, Ankara, Turkey
[email protected]

Abstract. Given the increasing prevalence of facial analysis technology, the problem of bias in these tools is becoming an even greater source of concern. Causality has been proposed as a method to address the problem of bias, giving rise to the popularity of using counterfactuals as a bias mitigation tool. In this paper, we undertake a systematic investigation of the usage of counterfactuals to achieve both statistical and causal-based fairness in facial expression recognition. We explore bias mitigation strategies with counterfactual data augmentation at the pre-processing, in-processing, and post-processing stages, as well as a stacked approach that combines all three methods. At the in-processing stage, we propose using Siamese Networks to suppress the differences between the predictions on the original and the counterfactual images. Our experimental results on RAF-DB with counterfactuals added show that: (1) the in-processing method outperforms the pre-processing and post-processing methods in terms of accuracy, F1 score, statistical fairness and counterfactual fairness, and (2) stacking the pre-processing, in-processing and post-processing stages provides the best performance.

Keywords: Bias mitigation · Counterfactual fairness · Facial expression recognition

1 Introduction

Given the increasing prevalence of, and stakes involved in, machine learning applications, the problem of bias in such applications is now becoming an even greater source of concern. The same is true for facial affect analysis technologies [9]. Several studies have highlighted the pervasiveness of such discrimination [4,19,24], and a number of works have sought to address the problem by proposing solutions for mitigation [5,8,49]. In order to assess whether bias has been mitigated, a reliable measure of bias and fairness is sorely needed. A significant number of fairness definitions have been proposed [1,18,37,46]. The more prevalent and long-standing definitions are often based upon statistical measures. However, statistical measures of fairness are mired with gaps: the definitions can often end up being mutually exclusive [7] and fail to distinguish spurious correlations between a sensitive attribute and the predicted outcome [15,29,39]. Causal reasoning has been proposed as a potential instrument to address such gaps [29,34,39], and counterfactuals, one of the key ideas within causal reasoning, are increasingly used as a tool to achieve such goals. One such use case is to rely on counterfactuals as a data augmentation strategy. Such an approach has proved promising for several use cases within the field of natural language processing [11,12,14,35,36], recommendation systems [47], as well as visual question answering systems [41]. However, this approach has yet to be explored within the field of facial expression analysis. Our first contribution thus involves exploring the use of counterfactuals as a data augmentation strategy for the task of facial expression recognition. Second, as bias mitigation can be performed at the pre-processing, in-processing or post-processing stage [5], we make use of counterfactuals at all three stages to mitigate bias, as illustrated in Fig. 1. To the best of our knowledge, no existing works describe a comprehensive system for mitigating bias at all three stages. Our key contributions are summarised as follows:

1. We make use of counterfactuals at the pre-processing, in-processing and post-processing stages for the first time in the literature in order to mitigate bias in facial expression recognition.
2. We conduct an in-depth analysis of bias at the pre-processing, in-processing and post-processing stages using both statistical and causal measures of fairness. We show that different forms of bias can exist at different levels and are captured by the different measures used.

Fig. 1. Three methods for bias mitigation with counterfactual images.

2 Literature Review

2.1 Fairness in Machine Learning

Fairness is now recognised as a significantly important component of Machine Learning (ML), given how the problem of bias can have a significant impact on human lives. General ML approaches typically tackle bias at the pre-processing, in-processing and post-processing stages [5,17,44]. Pre-processing methods address the problem of bias at the data level [44]; this typically involves some form of data augmentation or modification of the input data. In-processing methods involve modifying the learning algorithm itself [8]. Post-processing methods operate at the end of the learning and prediction process [28], usually by altering the output to achieve fairer predictions. The pre- and post-processing methods are usually model-agnostic and can be applied without having to re-train the model. In contrast, an in-processing method involves making changes to the model or the training method.

Different fairness definitions result in very different quantitative measures, which can result in very different algorithmic outcomes [1,18]. To exacerbate matters, the different definitions can sometimes be at odds with each other, and improving the score on one fairness metric may very well involve a trade-off on another [2]. Hence, selecting the right definition and metric is a highly challenging and important task. A more thorough examination of these issues can be found in the following papers [1,2,18,37]. There are two main groups of fairness measures. The statistical notion of fairness is based on statistical measures of the variables of interest, e.g. accuracy or true positive rate. As statistical measures are only able to capture correlation and not causation, this form of fairness is sometimes also referred to as associational fairness. Some well-known examples include demographic parity [50] and equality of opportunity [22]. As highlighted in several recent works, there are many gaps stemming from statistical fairness. As a result, the causal notion of fairness has been proposed to address these gaps [29,31,44]. Causal fairness assumes the existence of an underlying causal model; examples include counterfactual fairness [31] and proxy fairness [29]. In this research, we build upon the definitions of equality of opportunity and counterfactual fairness to conduct a comparison between both types of fairness.

2.2 Facial Affect Fairness

Facial affect recognition involves automatically analysing and predicting facial affect [45]. The most common method is the discrete category method, which assumes six basic emotion categories of facial expressions recognized universally (i.e., happiness, surprise, fear, disgust, anger and sadness) [16]. Another method is to rely on the Facial Action Coding System (FACS), a taxonomy of facial expressions in the form of Action Units (AUs), where emotions can then be defined according to the combination of AUs activated [16]. The six basic categories have been criticized for being unrealistic and limited at representing a full spectrum of emotions using only a handful of categories [20]. Another alternative is to use a dimensional description of facial affect, which views any affective state as a bipolar entity existing on a continuum [43]. Depending on the method of distinguishing facial affect, this can be achieved by either training an algorithm to classify the facial expressions of emotion [49], predicting the valence and arousal values of the displayed facial expression, or detecting the activated facial action units [8]. To date, investigating bias and fairness in facial affect recognition is still very much an understudied problem [5,49]. Only a handful of studies have highlighted the bias and proposed fairer solutions for facial affect analysis [8,25,40,49]. In this paper, we focus on the task of classifying facial expressions and investigate a solution which addresses bias at the pre-, in- and post-processing stages with the use of counterfactuals.

2.3 Counterfactuals and Bias

A counterfactual is the result of an intervention that modifies the state of a set of variables $X$ to some value $X = x$ and observes the effect on some output $Y$. Using Pearl's notation [42], this intervention is captured by the $do(\cdot)$ operator, and the resulting computation then becomes $P(Y \mid do(X = x))$. Several existing frameworks offer methods for countering bias with the use of counterfactuals. Existing methods typically rely on using counterfactuals as a data augmentation strategy at the pre-processing stage to mitigate bias. This method has proven successful within the field of natural language processing [11,12,14,35,36]; its use cases include hate speech detection [11,12], machine translation [35,48] and dialogue generation [14]. Experiments on recommendation systems [47] and, more recently, on Visual Question Answering (VQA) systems [41] indicate that such an approach is promising. Such an approach has yet to be explored for facial expression analysis.

In the domain of facial analysis, counterfactuals have been used to identify [13,27] and mitigate [10] bias. Our research resembles that of [13] and [27] in that we use a generative adversarial network (GAN), STGAN [33], to generate adversarial counterfactual facial images to assess counterfactual fairness. Though alike in spirit, our paper differs as follows. First, the above studies focused on investigating different methods for counterfactual generation [10,13,27]. In our case, we do not propose an alternative method to generate adversarial or counterfactual images; instead, we use a pre-trained GAN [33] to generate images which are then used to augment the original dataset. Second, [13] and [27] focused on using the generated counterfactual images to measure the bias present in either a publicly available dataset [13] or existing black-box image analysis APIs [27], but did not propose any method to mitigate bias. Though we do conduct a bias analysis of the model's performance on counterfactuals, our focus is chiefly on deploying counterfactuals explicitly to mitigate bias, which was also attempted by [10]. However, the goal in [10] is "attractiveness" prediction rather than facial expression recognition. In addition, we focus on investigating methods to mitigate bias at all three stages, which is distinctly different from all the previous research mentioned above.

3 Methodology

As highlighted in [5], we can intervene to mitigate bias at either the pre-processing, in-processing or post-processing stage. To investigate our bias mitigation proposal for the task of facial expression recognition, we conduct a comparative study using counterfactuals at the three different stages. The first method is the pre-processing method, which involves augmenting the training set with counterfactual images. Subsequently, we implement an in-processing approach using a Siamese Network [3] to investigate further downstream methods for mitigating bias with the use of counterfactuals. Finally, we explore the use of a post-processing method, the Reject Option Classification proposed by Kamiran et al. [28].

3.1 Notation and Problem Definition

We adopt a machine learning approach where we have a dataset $D = \{(x_i, y_i)\}_i$ such that $x_i \in X$ is a tensor representing information (e.g., facial image, health record, legal history) about an individual $I$ and $y_i \in Y$ is an outcome (e.g., identity, age, emotion, facial action unit labels) that we wish to predict. In other words, we assume that we have a classification problem and we are interested in finding a parametric predictor/mapping $f$ with $f: X \to Y$. We use $\hat{y}_i$ to denote the predicted outcome for input $x_i$, and $p(y_i|x_i)$ is the predicted probability of $x_i$ being assigned to the correct class $y_i$. Each input $x_i$ is associated (through an individual $I$) with a set of sensitive attributes $\{s_{j \in a}\}_a \subset S$, where $a$ is, e.g., race and $j \in \{$Caucasian, African-American, Asian$\}$. The minority groups are those with sensitive attributes that are fewer in number (e.g. African-American) compared to the main group (e.g. Caucasian). Note that there are other attributes $\{z_{j \in a}\}_a \subset Z$ that are not sensitive. In bias mitigation, we are interested in diminishing the discrepancy between $p(\hat{y}_i = c|x_i)$ and $p(\hat{y}_j = c|x_j)$ if $x_i$ and $x_j$ are facial images of different individuals and their sensitive attributes $s_{x_i}$ and $s_{x_j}$ are different.

3.2 Counterfactual Fairness

Causality-based fairness reasoning [29,31,44] assumes that there exists a cause-and-effect relationship between the attribute variables and the outcome. We follow the counterfactual fairness notation used by Kusner et al. [31]. Given a predictive problem with fairness considerations, where $S$, $X$ and $Y$ represent the sensitive attributes, the input, and the output of interest respectively, let us assume that we are given a causal model $(U, V, F)$, where $U$ is a set of latent background variables (factors not caused by any variable in the set $V$ of observable variables) and $F$ is a set of functions over $V_i \in V$. In our problem setup, $V \equiv S \cup X$. Kusner et al. [31] postulate the following criterion for predictors of $Y$: a predictor $f(\cdot\,; \theta)$ is counterfactually fair if, under any context $X = x$ and $S = s$, the following holds:

$$p(f(x; \theta, U) = y \mid x, s) = p(f(x_{S \leftarrow s'}; \theta, U) = y \mid x, s) \quad (1)$$

where $x_{S \leftarrow s'}$ denotes the counterfactual input (image) obtained by changing the sensitive attribute to $s'$. To simplify notation, we will use $x'$ to denote the counterfactual image $x_{S \leftarrow s'}$ in the rest of the paper. With reference to Eq. 1, we would like to highlight that this is an individual-level fairness definition. For the experiments in this paper, we aggregate the counterfactual counts dissected according to sensitive attributes and class in order to facilitate measurement and comparison.

3.3 Counterfactual Image Generation

We use a state-of-the-art GAN, a pre-trained STGAN model [33] trained on the CelebA dataset, to generate a set of counterfactual images for RAF-DB. In our experiments, the attribute that we have chosen to manipulate is skin tone – see Fig. 2 for samples. This is because manipulating skin tone produces more consistent results than manipulating other sensitive attributes such as age or gender: adversarial counterfactual images modified across the other sensitive attributes are less stable and more likely to be corrupted. Our counterfactuals involve lightening but not darkening skin tone, as GANs are currently still incapable of effectively doing the latter [26]. We are not conflating skin tone with race; we recognise that they are separate entities with some overlaps. Our analysis focuses on mitigating bias stemming from differences in skin tone, which aligns with the approach taken in other bias investigation research [4,13]. Moreover, though the generated images may not be completely satisfactory, this is due to the limitations of current GANs; the solutions proposed here can still be deployed once GANs have further improved.

Fig. 2. Sample counterfactual images obtained by changing the skin tone of the original images (without changing facial expression) using a pre-trained STGAN model [33].

3.4 Baseline Approach

We take the prior work of Tian et al. [49] as the baseline. For this, we use an 18-layer Residual Network (ResNet) [23] architecture and train it from scratch with a Cross Entropy loss to predict a single expression label $y_i$ for each $x_i$:

$$L_{CE}(x_i, y_i) = -\sum_{y \in Y} \mathbb{1}[y_i = \hat{y}_i] \log p(y|x_i) \quad (2)$$

where $p(y|x_i)$ denotes the predicted probability of $x_i$ being assigned to class $y_i \in Y$ and $\mathbb{1}[\cdot]$ denotes the indicator function. The baseline model is trained on the original images.

3.5 Pre-processing: Data Augmentation with Counterfactuals

Similar to the work done in the field of natural language processing (discussed in Sect. 2.3), we make use of counterfactuals as a data augmentation strategy. In this approach, we generate a counterfactual for each image and feed both as input to train a new network (Fig. 1). Hence, instead of having N image samples in D, we now have 2N training samples. The network is trained with these 2N images using the Cross Entropy loss defined in Eq. 2.
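A minimal sketch of this pre-processing strategy is shown below, assuming the original and counterfactual images are stored in two hypothetical image folders; directory names are placeholders.

```python
from torch.utils.data import ConcatDataset, DataLoader
from torchvision.datasets import ImageFolder
from torchvision import transforms

tfm = transforms.Compose([transforms.Resize((128, 128)), transforms.ToTensor()])

# Hypothetical folders holding the original and counterfactual training images.
original = ImageFolder("rafdb/train", transform=tfm)
counterfactual = ImageFolder("rafdb/train_counterfactual", transform=tfm)

# 2N training samples: each original image plus its counterfactual version.
train_set = ConcatDataset([original, counterfactual])
train_loader = DataLoader(train_set, batch_size=64, shuffle=True)
```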

3.6 In-processing: Contrastive Counterfactual Fairness

Counterfactual fairness is defined with respect to the discrepancy between the predictions on an image $x_i$ and its counterfactual version $x'_i$. To be specific, as discussed in Sect. 3.2, counterfactual fairness requires the gap between $p(y_i|x_i)$ and $p(y_i|x'_i)$ to be minimal. An in-processing solution that fits this requirement very well is contrastive learning. In general, the goal of contrastive learning [6] is to learn an embedding space which minimises the embedding distance between a pair of images of the same class (a positive pair) and maximises the distance between a pair of "unrelated" images (a negative pair). In our setting, however, we seek to minimise the difference between the prediction probabilities on an image and its counterfactual version. We realise contrastive counterfactual fairness by feeding $x_i$ and its counterfactual version $x'_i$ through a Siamese network (Fig. 3). The following contrastive loss then seeks to minimise the discrepancy (bias) between the predictions of the two branches:

$$L_{con}(x_i, x'_i) = -\sum_{y \in Y} \mathbb{1}[f(x_i; \theta) = \hat{y}_i] \log p(y_i|x'_i) \quad (3)$$

where we penalise the Siamese network if the counterfactual prediction is not consistent with the predicted label $f(x_i; \theta)$ of the original image $x_i$. Each branch of the Siamese network has its own Cross Entropy loss so that each branch can predict the correct label independently. The overall loss is then defined as:

$$L_i = \alpha \big( L_{CE}(x_i, y_i) + L_{CE}(x'_i, y_i) \big) + L_{con}(x_i, x'_i) \quad (4)$$

where $L_{CE}$ is as defined in Eq. 2 and $\alpha$ is a hyper-parameter which we tuned to 1.5. By jointly minimising the two Cross Entropy objectives and $L_{con}$, the network learns a representation that minimises the difference between the predictions on the original and counterfactual images as well as the individual prediction errors for both. The authors of [10] leveraged a similar idea, though both works were done independently. It is similar in that, to enforce fairness, they add a regulariser between the logits of the image and its counterfactual, which is similar in function to $L_{con}$. The slight difference is that their overall loss function only accounts for the CE loss of the original image, whereas ours accounts for both the original and counterfactual losses via $L_{CE}(x_i, y_i)$ and $L_{CE}(x'_i, y_i)$ respectively.
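The sketch below illustrates one reasonable reading of Eqs. (2)–(4) for a weight-sharing Siamese setup; using the original branch's predicted label as a target for the counterfactual branch is our interpretation of Eq. (3), and α = 1.5 follows the text.

```python
import torch
import torch.nn.functional as F

def counterfactual_losses(model, x, x_cf, y, alpha=1.5):
    """One training step of the in-processing method (a sketch of Eqs. 2-4).

    `model` is shared between the two branches (weight-sharing Siamese setup);
    x are original images, x_cf their counterfactuals, y the expression labels.
    """
    logits = model(x)        # original branch
    logits_cf = model(x_cf)  # counterfactual branch (shared weights)

    ce = F.cross_entropy(logits, y)        # L_CE on original images
    ce_cf = F.cross_entropy(logits_cf, y)  # L_CE on counterfactual images

    # L_con: penalize the counterfactual branch when its prediction disagrees
    # with the label predicted on the original image (our reading of Eq. 3).
    pseudo = logits.argmax(dim=1).detach()
    l_con = F.cross_entropy(logits_cf, pseudo)

    return alpha * (ce + ce_cf) + l_con    # Eq. 4, with alpha = 1.5
```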


Fig. 3. An overview of the in-processing method: the original image $x_i$ and its counterfactual $x'_i$ are fed through a Siamese network, which is trained such that the discrepancy between the branches is penalised with a contrastive loss ($L_{con}$) while the prediction probabilities of both branches are maximised using the Cross Entropy loss ($L_{CE}$).

3.7 Post-processing: Reject Option Classification

Fig. 4. Post-processing method by Kamiran et al. [28] reclassifies samples around the decision boundary in favor of protected groups.

Post-processing approaches take the output of the model and modify it in a manner that achieves greater fairness. Here, we employ the Reject Option Classification suggested by Kamiran et al. [28]. This approach re-classifies the outputs the model is less certain about, i.e., the predictions that fall in a region around the decision boundary parameterised by $\tau$ (see Fig. 4). For instance, given a typical classification probability threshold of 0.5, if the model prediction is 0.85 or 0.15, the model is highly certain of its prediction; if the values are around 0.53 or 0.44, the input falls very close to the decision boundary and the model is less certain about its prediction. In order to improve the fairness of the prediction, we reclassify the predicted output if it belongs to the minority group. More formally, if a sample $x_i$ falls in the "critical" region $1 - \tau \le p(y|x_i) \le \tau$, where $0.5 \le \tau \le 1$, we reclassify $x_i$ as $y$ if $x_i$ belongs to a minority group; otherwise, i.e., when $p(y|x_i) > \tau$, we accept the predicted output class $y$. In our experiments, we set $\tau = 0.6$ as suggested by Kamiran et al. [28]. This method captures the innate human intuition of giving the benefit of the doubt to samples from the minority group that the model is unsure of.
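A sketch of this reclassification rule is given below. It follows the binary-style formulation above; the single favourable `target_class` is an assumption for illustration, and the multi-class details in [28] may differ.

```python
import torch

def reject_option_classify(probs, is_minority, target_class, tau=0.6):
    """Reject Option Classification [28], as described above (a sketch).

    probs: (N, C) softmax outputs; is_minority: (N,) bool mask. Minority-group
    samples whose top probability falls inside the critical region
    [1 - tau, tau] are reassigned to `target_class` (assumed favourable label).
    """
    preds = probs.argmax(dim=1)
    top_p = probs.max(dim=1).values
    critical = (top_p >= 1 - tau) & (top_p <= tau)   # low-confidence region
    preds = torch.where(critical & is_minority,
                        torch.full_like(preds, target_class), preds)
    return preds

# Example with tau = 0.6: only the first (uncertain, minority) sample is flipped.
p = torch.tensor([[0.55, 0.45], [0.85, 0.15]])
print(reject_option_classify(p, torch.tensor([True, True]), target_class=1))
```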

4 Experimental Setup

4.1 Dataset

We chiefly conduct our experiments on the RAF-DB [32] dataset. RAF-DB contains labels for facial expressions of emotion (Surprise, Fear, Disgust, Happy, Sad, Anger and Neutral) and sensitive attribute labels for gender, race and age. We exclude images labelled as "Unsure" for gender. In addition, the age binning system is likely to introduce greater variation noise; for instance, an individual aged 4 is likely to look very different from someone aged 19, yet they fall in the same category. As such, we have chosen to restrict our analysis to the sensitive attributes gender and race. We utilise a subset of the dataset consisting of 14,388 images, with 11,512 samples used for training and 2,876 samples used for testing in our experiments. This training and testing split is pre-defined according to the instructions in the original dataset [32].

4.2 Implementation and Training Details

We first generate a set of counterfactual images using the method delineated in Sect. 3.3. This is done for both the training and testing images of RAF-DB. Our task subsequently focuses on classifying the seven categories of facial expressions of emotion. We then classify the counterfactual RAF-DB images (Fig. 2) to evaluate counterfactual biases, as shown in Table 2.

Training Details. ResNet-18 [23] is used for the baseline model as well as for the mitigation models, as illustrated in Fig. 1. For all models, we take the PyTorch implementation of ResNet and train it from scratch with the Adam optimizer [30], a mini-batch size of 64, and an initial learning rate of 0.001 (except for the in-processing method, for which 0.0005 worked slightly better). The learning rate decays by a factor of 0.1 every 40 epochs. The maximum number of training epochs is 100, but early stopping is applied if the accuracy does not increase after 30 epochs. For the pre-processing method, we train a network with both the original and counterfactual images (Fig. 1). For the in-processing method, we have two Siamese branches with shared weights (Fig. 3). For the post-processing approach, we train a network with both the original and counterfactual images (Fig. 4) but take the output predictions and reclassify them according to the methodology delineated in Sect. 3.7.


Image Pre-processing and Augmentation. All images are cropped to ensure faces appear in approximately similar positions. The images are subsequently resized to 128 × 128 pixels and then fed into the networks as input. During training, we apply the following commonly used augmentation methods: randomly cropping the images to a slightly smaller size (i.e. 96 × 96), rotating them by a small angle (i.e. in the range from −15° to 15°), and horizontally mirroring them at random.

4.3 Evaluation Measures

In this paper, we use two measures, accuracy and F1-score, to evaluate prediction quality, and two measures, equal opportunity and causal fairness, to evaluate fairness. Equal opportunity ($M_{EO}$), a group-based metric, is used to compare group fairness between models [22]. It can be understood as capturing the largest accuracy gap among all demographic groups:

$$M_{EO} = \frac{\min_{s \in S} M_{ACC}(s)}{\max_{s \in S} M_{ACC}(s)} \quad (5)$$

where $M_{ACC}(s)$ is the accuracy for a certain demographic group $s$. We also include a causality-based fairness metric, counterfactual fairness ($M_{CF}$), because it has often been noted that commonly used fairness metrics based on classification evaluation measures such as accuracy, precision, recall and TP rate are insufficient to capture the bias present [29,44]. $M_{CF}$ is defined as:

$$M_{CF} = \frac{1}{N} \sum_{i \in N} \mathbb{1}[f(x_i; \theta) = f(x'_i; \theta)] \quad (6)$$

where we compare the labels predicted by $f(\cdot\,; \theta)$ on the original and counterfactual images. This is not a newly defined metric but a prevalent one based on an aggregated form of the counterfactual fairness defined in Sect. 3.2 and [31].
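A small sketch of how Eqs. (5) and (6) can be computed is given below; the array names and toy data are hypothetical.

```python
import numpy as np

def m_eo(y_true, y_pred, groups):
    """Equal opportunity ratio of Eq. (5): the minimum per-group accuracy
    divided by the maximum per-group accuracy."""
    accs = [np.mean(y_pred[groups == g] == y_true[groups == g])
            for g in np.unique(groups)]
    return min(accs) / max(accs)

def m_cf(pred_original, pred_counterfactual):
    """Counterfactual fairness of Eq. (6): fraction of samples whose predicted
    label is unchanged on the counterfactual image."""
    return np.mean(pred_original == pred_counterfactual)

# Toy usage with hypothetical predicted labels and group ids.
y_true = np.array([0, 1, 1, 0]); y_pred = np.array([0, 1, 0, 0])
groups = np.array(["m", "m", "f", "f"])
print(m_eo(y_true, y_pred, groups), m_cf(y_pred, np.array([0, 1, 1, 0])))
```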

Table 1. RAF-DB Test Set Distribution (Cauc: Caucasian, AA: African-American).

| Emotion | Male | Female | Cauc | AA | Asian | Percent |
| --- | --- | --- | --- | --- | --- | --- |
| Surprise | 138 | 159 | 260 | 16 | 21 | 10.3% |
| Fear | 43 | 36 | 61 | 5 | 13 | 2.7% |
| Disgust | 69 | 89 | 125 | 6 | 27 | 5.5% |
| Happy | 429 | 712 | 855 | 98 | 188 | 39.7% |
| Sad | 147 | 239 | 291 | 30 | 65 | 13.4% |
| Angry | 119 | 45 | 144 | 10 | 10 | 5.7% |
| Neutral | 312 | 339 | 489 | 39 | 123 | 22.6% |
| Percent | 43.7% | 56.3% | 77.4% | 7.1% | 16.4% | |

5 Results

5.1 An Analysis of Dataset Bias and Counterfactual Bias

Dataset Bias Analysis. First, we conduct a preliminary bias analysis by highlighting the different biases present. As highlighted in Table 1, there is a slight dataset bias across gender: 56.3% of the subjects are female, while 43.7% are male. There is a greater bias across race: 77.4% of the subjects are Caucasian, 15.5% are Asian, and only 7.1% are African-American.

Table 2. Proportion of samples that stayed consistent with the original classification after counterfactual manipulation (skin tone change). The values are $M_{CF}$ values. Classification is performed using the baseline model.

| Emotion | Male | Female | Cauc | AA | Asian |
| --- | --- | --- | --- | --- | --- |
| Surprise | 0.34 | 0.31 | 0.33 | 0.38 | 0.24 |
| Fear | 0.16 | 0.25 | 0.16 | 0.20 | 0.38 |
| Disgust | 0.20 | 0.13 | 0.15 | 0.17 | 0.22 |
| Happy | 0.44 | 0.56 | 0.52 | 0.53 | 0.48 |
| Sad | 0.22 | 0.27 | 0.26 | 0.20 | 0.25 |
| Angry | 0.29 | 0.27 | 0.28 | 0.20 | 0.30 |
| Neutral | 0.33 | 0.36 | 0.36 | 0.28 | 0.33 |
| $M_{CF}$ | 0.34 | 0.41 | 0.39 | 0.39 | 0.37 |

Counterfactual Bias Analysis. This involves calculating the proportion of the baseline model’s predictions that remained the same between the original and counterfactual images. The specific formulation is captured by Eq. 6. Though simple in nature, it forms a crucial cornerstone in evaluating Counterfactual Fairness. With reference to Table 2, we see that, for a majority of samples, the predicted labels did not remain the same for the counterfactual images. Across the sensitive attribute Gender, performance accuracy is slightly more consistent for Females. Across the sensitive attribute Race, performance accuracy is comparatively more consistent for Caucasians and Asians. This phenomena may be correlated with class size numbers as evidenced in Table 1. A similar trend is true across emotions. We see that the emotion “Happy” has the highest consistency. This may be due to the larger sample size and the fact that “Happy” is considered an emotion that is relatively easier to recognise and label. On the other hand, we see that the “Fear” class has the lowest consistency. This might be due to the fact that the emotion fear has lesser samples and is more ambiguous, variable and perhaps harder to identify and label. This hints that bias may not only vary across sensitive attribute but across emotion categories as well. We would like to highlight that this counterfactual analysis only analyses whether

256

J. Cheong et al.

Table 3. Accuracy and fairness scores from the fine-tuned standalone models trained on the combined training and tested on the combined test set which includes both the original and counterfactual images. Emotion

1. Pre-processing Gender Race M F Cau AA A

2. In-processing Gender Race M F Cau AA A

3. Post-processing Gender Race M F Cau AA A

Surprise Fear Disgust Happy Sad Angry Neutral

0.58 0.35 0.29 0.75 0.44 0.52 0.66

0.59 0.31 0.25 0.89 0.45 0.51 0.63

0.55 0.37 0.24 0.90 0.46 0.48 0.61

MACC

0.61 0.63 0.61 0.65 0.67 0.65 0.72 0.69 0.69 0.68 0.65 0.71 0.68 0.68 0.70

MEO

0.97

MCF

0.53 0.57 0.54 0.56 0.61 0.58 0.63 0.59 0.63 0.65 0.39 0.44 0.41 0.44 0.39

MCF (Avg.) 0.55

0.59 0.26 0.26 0.76 0.48 0.47 0.63

0.58 0.34 0.26 0.72 0.47 0.52 0.65 0.91 0.57

0.59 0.30 0.33 0.87 0.35 0.45 0.51

0.60 0.19 0.31 0.84 0.51 0.30 0.66

0.91 0.60

0.63 0.28 0.31 0.92 0.49 0.43 0.68

0.63 0.33 0.30 0.90 0.47 0.50 0.67 0.99 0.62

0.50 0.30 0.33 0.90 0.42 0.50 0.58

0.52 0.15 0.24 0.91 0.50 0.25 0.64

0.67 0.22 0.35 0.93 0.47 0.37 0.62

0.91 0.42

0.61 0.34 0.28 0.92 0.46 0.45 0.62

0.72 0.40 0.25 0.88 0.37 0.45 0.58

0.60 0.12 0.39 0.94 0.52 0.40 0.62

0.96 0.41

predictions remained consistent between the original and counterfactual images and has no bearing on whether the initial prediction was correct. 5.2

Bias Mitigation Results with Counterfactual Images

Next, we evaluate to what extent we are able to mitigate bias via the methods proposed in the Methodology section: At the pre-processing, in-processing and post-processing stage. With reference to Table 3, in terms of accuracy, there does not seem to be a difference between the pre-processing, in-processing and postprocessing methods. All were able to improve accuracy prediction to largely the same effect. However, we witness a difference in outcome across MEO . According to MEO , the pre-processing method is best for achieving fairness across gender whilst the in-processing method is best for achieving fairness across race. Though this measure highlight different effectiveness, all three methods seem comparable in terms of their ability to improve MEO across board. It is only in terms of MCF where we manage to observe a wider difference in performance disparity. The pre-processing and in-processing methods were able to improve MCF to a greater extent compared to the post-processing method. Out of the first two, it is evident that the in-processing method manages to outperform the other two across both sensitive attributes. It gives the highest MCF score of 0.60 and 0.62 across gender and race respectively compared to 0.55 and 0.57 for the pre-processing method. On the other hand, we see that the postprocessing method only gives 0.42 and 0.41 across gender and race respectively. This result is noteworthy in several ways. First, this highlights the importance of using different metrics for bias evaluation as this methodological gap would not have been picked up by the other two metrics (MACC and MEO ). Second,

Counterfactual Fairness for Facial Expression Recognition

257

Table 4. Stacked model combines the pre-, in- and post-processing stages. Emotion

Gender Race Male Female Caucasian AA Asian

Surprise Fear Disgust Happy Sad Angry Neutral

0.68 0.47 0.37 0.95 0.59 0.67 0.75

0.74 0.44 0.42 0.98 0.62 0.50 0.84

0.73 0.47 0.42 0.98 0.60 0.64 0.80

0.72 0.50 0.33 0.91 0.63 0.60 0.72

MACC

0.75

0.81

0.79

0.78 0.76

MEO

0.93

MCF

0.59

0.55 0.38 0.30 0.93 0.61 0.50 0.78

0.96 0.65

→ MCF (Avg.) 0.62

0.62

0.61 0.63

0.62

this underlines the need to use a variety of methods to tackle the problem of bias as a standalone post-processing method might be inadequate. Finally, with reference to Table 4, we see that the stacked approach improved scores across all evaluation metrics. This model comprises of a combination of the fine-tuned pre, in and post-processing methods. In Table 5, we see that the combined approach is comparatively better than all the standalone methods across most metrics. Out of the standalone methods, the in-processing method seems to be best in terms of achieving both MEO and MCF fairness. Table 5. Results summary showing that a combined stacked approach supersedes all other standalone models on most metrics. Model

6

MACC MF 1 MEO MCF

Original

0.65

0.54

0.97

0.30

Pre-processing

0.63

0.67

0.94

0.56

In-processing

0.69

0.68

0.95

0.61

Post-processing 0.68

0.65

0.94

0.42

Combined

0.71 0.95

0.62

0.78

Conclusion and Discussion

Overall, the stacked approach supersedes the rest across most measures: accuracy (MACC ), F1-score (MF 1 ), and Counterfactual Fairness (MCF ). A significant point is that our work agrees with the findings in [17] in many ways. One of the key findings was how pre-processing methods can have a huge effect on

258

J. Cheong et al.

disparity in prediction outcomes. Second, their evaluation showed that many of the measures of bias and fairness strongly correlate with each other. This is evident in our findings too as we see that the equal opportunity fairness correlates with class-wise performance accuracy across board. Hence, we argue the importance of using an orthogonal measure to capture the bias which would otherwise go unnoticed. In our experiments, the Counterfactual Fairness measure MCF fulfills this criteria. Indeed, we see that it captures the efficacy difference in achieving Counterfactual Fairness across the different bias mitigation strategies. This provides empirical evidence for the gaps highlighted in Sect. 2.1. Further, though the overall evaluation metrics have improved, we still observe bias when conducting a disaggregated analysis partitioned across the different sensitive attributes or emotion categories. For instance, the mitigation algorithms consistently performed poorly for certain emotion categories, e.g. “Disgust”. This aligns with the findings in [25,49] as such expressions are hypothesised to be less well-recognised than other prevalent emotions e.g. “Happy”. A key limitation of our work is that the methods that we propose are limited by dataset availability. We have used the annotations as provided by the original dataset owners [32] which were crowd-sourced and labelled by humans. However, this approach of treating race as an attribute fails to take into account the multi-dimensionality of race [21] which represents a research area that future researchers can look into. In addition, we have solely relied upon the original training-test split provided by the data repository. As highlighted in [17], algorithms are highly sensitive to variation in dataset composition and changes in training-test splits resulted in great variability in the accuracy and fairness measure performance of all algorithms. Another limitation is that of robustness and further research on adversarial attacks on fairness [38] should be investigated. We recognise that face recognition and by its extension, facial affect recognition has received criticism for its misuse and parallelism to facial phrenology. First, we would like to underscore that the ideas in this paper are meant to address the existing problem of bias and our intention is for it to be used for good. Second, we concur that many of the concerns raised are valid and we view this as an opportunity for future work extensions. Third, we hope this piece of work will encourage researchers and companies to shape solutions that ensure that the technology and applications developed are fair and ethical for all. Open Access Statement: For the purpose of open access, the authors have applied a Creative Commons Attribution (CC BY) licence to any Author Accepted Manuscript version arising. Data Access Statement: This study involved secondary analyses of preexisting datasets. All datasets are described in the text and cited accordingly. Licensing restrictions prevent sharing of the datasets. The authors thank Shan Li, Prof Weihong Deng and JunPing Du from the Beijing University of Posts and Telecommunications (China) for providing access to RAF-DB.

Counterfactual Fairness for Facial Expression Recognition

259

Acknowledgement. J. Cheong is supported by the Alan Turing Institute doctoral studentship and the Cambridge Commonwealth Trust. H. Gunes’ work is supported by the EPSRC under grant ref. EP/R030782/1.

References 1. Barocas, S., Hardt, M., Narayanan, A.: Fairness in machine learning. NIPS Tutor. 1, 2 (2017) 2. Binns, R.: Fairness in machine learning: Lessons from political philosophy. In: Conference on Fairness, Accountability and Transparency (2018) 3. Bromley, J., et al.: Signature verification using a “Siamese” time delay neural network. Int. J. Pattern Recogn. Artif. Intell. 7(04), 669–688 (1993) 4. Buolamwini, J., Gebru, T.: Gender shades: intersectional accuracy disparities in commercial gender classification. In: Conference on Fairness, Accountability and Transparency, pp. 77–91. PMLR (2018) 5. Cheong, J., Kalkan, S., Gunes, H.: The hitchhiker’s guide to bias and fairness in facial affective signal processing: overview and techniques. IEEE Signal Process. Mag. 38(6), 39–49 (2021) 6. Chopra, S., Hadsell, R., LeCun, Y.: Learning a similarity metric discriminatively, with application to face verification. In: CVPR (2005) 7. Chouldechova, A.: Fair prediction with disparate impact: a study of bias in recidivism prediction instruments. Big Data 5(2), 153–163 (2017) 8. Churamani, N., Kara, O., Gunes, H.: Domain-incremental continual learning for mitigating bias in facial expression and action unit recognition. arXiv preprint arXiv:2103.08637 (2021) 9. Crawford, K.: Time to regulate AI that interprets human emotions. Nature 592(7853), 167–167 (2021) 10. Dash, S., Balasubramanian, V.N., Sharma, A.: Evaluating and mitigating bias in image classifiers: a causal perspective using counterfactuals. In: WACV (2022) 11. Davani, A.M., Omrani, A., Kennedy, B., Atari, M., Ren, X., Dehghani, M.: Fair hate speech detection through evaluation of social group counterfactuals. arXiv preprint arXiv:2010.12779 (2020) 12. Davani, A.M., Omrani, A., Kennedy, B., Atari, M., Ren, X., Dehghani, M.: Improving counterfactual generation for fair hate speech detection. In: Workshop on Online Abuse and Harms (WOAH) (2021) 13. Denton, E., Hutchinson, B., Mitchell, M., Gebru, T.: Detecting bias with generative counterfactual face attribute augmentation. arXiv e-prints, pp. arXiv-1906 (2019) 14. Dinan, E., Fan, A., Williams, A., Urbanek, J., Kiela, D., Weston, J.: Queens are powerful too: mitigating gender bias in dialogue generation. In: EMNLP (2020) 15. Dwork, C., Hardt, M., Pitassi, T., Reingold, O., Zemel, R.: Fairness through awareness. In: Proceedings of the 3rd Innovations in Theoretical Computer Science Conference, pp. 214–226 (2012) 16. Ekman, R.: What the Face Reveals: Basic and Applied Studies of Spontaneous Expression Using the Facial Action Coding System (FACS). Oxford University Press, USA (1997) 17. Friedler, S.A., Scheidegger, C., Venkatasubramanian, S., Choudhary, S., Hamilton, E.P., Roth, D.: A comparative study of fairness-enhancing interventions in machine learning. In: Conference on Fairness, Accountability, and Transparency (2019) 18. Gajane, P., Pechenizkiy, M.: On formalizing fairness in prediction with machine learning. arXiv preprint arXiv:1710.03184 (2017)

260

J. Cheong et al.

19. Garcia, R., Wandzik, L., Grabner, L., Krueger, J.: The harms of demographic bias in deep face recognition research. In: Proceedings of International Conference on Biometrics (ICB), pp. 1–6 (2019) 20. Gunes, H., Schuller, B.: Categorical and dimensional affect analysis in continuous input: current trends and future directions. Image Vis. Comput. 31(2), 120–136 (2013) 21. Hanna, A., Denton, E., Smart, A., Smith-Loud, J.: Towards a critical race methodology in algorithmic fairness. In: Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency, pp. 501–512 (2020) 22. Hardt, M., Price, E., Srebro, N.: Equality of opportunity in supervised learning. In: NIPS (2016) 23. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: CVPR (2016) 24. Hoffman, A.: Where fairness fails: data, algorithms and the limits of antidiscrimination discourse. J. Inf. Commun. Soc. 22, 900–915 (2019) 25. Howard, A., Zhang, C., Horvitz, E.: Addressing bias in machine learning algorithms: A pilot study on emotion recognition for intelligent systems. In: Proceedings of Advanced Robotics Social Impacts (ARSO) (2017) 26. Jain, N., Olmo, A., Sengupta, S., Manikonda, L., Kambhampati, S.: Imperfect imaganation: Implications of GANs exacerbating biases on facial data augmentation and snapchat face lenses. Artif. Intell. 304, 103652 (2022) 27. Joo, J., K¨ arkk¨ ainen, K.: Gender slopes: Counterfactual fairness for computer vision models by attribute manipulation. In: Workshop on Fairness, Accountability, Transparency and Ethics in Multimedia (2020) 28. Kamiran, F., Karim, A., Zhang, X.: Decision theory for discrimination-aware classification. In: International Conference on Data Mining (2012) 29. Kilbertus, N., Rojas-Carulla, M., Parascandolo, G., Hardt, M., Janzing, D., Sch¨ olkopf, B.: Avoiding discrimination through causal reasoning. In: NIPS, pp. 656–666 (2017) 30. Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) 31. Kusner, M., Loftus, J., Russell, C., Silva, R.: Counterfactual fairness. In: NIPS (2017) 32. Li, S., Deng, W., Du, J.: Reliable crowdsourcing and deep locality-preserving learning for expression recognition in the wild. In: CVPR (2017) 33. Liu, M., et al.: StGAN: a unified selective transfer network for arbitrary image attribute editing. In: CVPR (2019) 34. Loftus, J.R., Russell, C., Kusner, M.J., Silva, R.: Causal reasoning for algorithmic fairness. arXiv preprint arXiv:1805.05859 (2018) 35. Lu, K., Mardziel, P., Wu, F., Amancharla, P., Datta, A.: Gender bias in neural natural language processing. In: Nigam, V., et al. (eds.) Logic, Language, and Security. LNCS, vol. 12300, pp. 189–202. Springer, Cham (2020). https://doi.org/ 10.1007/978-3-030-62077-6 14 36. Maudslay, R.H., Gonen, H., Cotterell, R., Teufel, S.: It’s all in the name: mitigating gender bias with name-based counterfactual data substitution. In: EMNLPIJCNLP (2019) 37. Mehrabi, N., Morstatter, F., Saxena, N., Lerman, K., Galstyan, A.: A survey on bias and fairness in machine learning. ACM Comput. Surv. (CSUR) 54(6), 1–35 (2019) 38. Mehrabi, N., Naveed, M., Morstatter, F., Galstyan, A.: Exacerbating algorithmic bias through fairness attacks. In: AAAI (2021)

Counterfactual Fairness for Facial Expression Recognition

261

39. Nabi, R., Shpitser, I.: Fair inference on outcomes. In: AAAI (2018) 40. Ngxande, M., Tapamo, J., Burke, M.: Bias remediation in driver drowsiness detection systems using generative adversarial networks. IEEE Access 8, 55592–55601 (2020). https://doi.org/10.1109/ACCESS.2020.2981912 41. Niu, Y., Tang, K., Zhang, H., Lu, Z., Hua, X.S., Wen, J.R.: Counterfactual VQA: a cause-effect look at language bias. In: CVPR (2021) 42. Pearl, J.: Causality. Cambridge University Press, Cambridge (2009) 43. Russell, J.A.: A circumplex model of affect. J. Pers. Soc. Psychol. 39(6), 1161 (1980) 44. Salimi, B., Rodriguez, L., Howe, B., Suciu, D.: Interventional fairness: causal database repair for algorithmic fairness. In: International Conference on Management of Data (2019) 45. Sariyanidi, E., Gunes, H., Cavallaro, A.: Automatic analysis of facial affect: a survey of registration, representation, and recognition. IEEE TPAMI 37(6), 1113– 1133 (2014) 46. Verma, S., Rubin, J.: Fairness definitions explained. In: International Workshop on Software Fairness (Fairware), pp. 1–7. IEEE (2018) 47. Wang, W., Feng, F., He, X., Zhang, H., Chua, T.S.: Clicks can be cheating: counterfactual recommendation for mitigating clickbait issue. In: ACM SIGIR Conference on Research and Development in Information Retrieval (2021) 48. Wong, A.: Mitigating gender bias in neural machine translation using counterfactual data. M.A. thesis, City University of New York (2020) 49. Xu, T., White, J., Kalkan, S., Gunes, H.: Investigating bias and fairness in facial expression recognition. In: Bartoli, A., Fusiello, A. (eds.) ECCV 2020. LNCS, vol. 12540, pp. 506–523. Springer, Cham (2020). https://doi.org/10.1007/978-3-03065414-6 35 50. Zafar, M.B., Valera, I., Rogriguez, M.G., Gummadi, K.P.: Fairness constraints: mechanisms for fair classification. In: Artificial Intelligence and Statistics, pp. 962– 970. PMLR (2017)

Improved Cross-Dataset Facial Expression Recognition by Handling Data Imbalance and Feature Confusion Manogna Sreenivas1(B) , Sawa Takamuku2 , Soma Biswas1 , Aditya Chepuri3 , Balasubramanian Vengatesan3 , and Naotake Natori2 1

3

Indian Institute of Science, Bangalore, India [email protected] 2 Aisin Corporation, Tokyo, Japan Aisin Automotive Haryana Pvt Ltd., Bangalore, India

Abstract. Facial Expression Recognition (FER) models trained on one dataset (source) usually do not perform well on a different dataset (target) due to the implicit domain shift between different datasets. In addition, FER data is naturally highly imbalanced, with a majority of the samples belonging to few expressions like neutral, happy and relatively fewer samples coming from expressions like disgust, fear, etc., which makes the FER task even more challenging. This class imbalance of the source and target data (which may be different), along with other factors like similarity of few expressions, etc., can result in unsatisfactory target classification performance due to confusion between the different classes. In this work, we propose an integrated module, termed DIFC, which can not only handle the source Data Imbalance, but also the Feature Confusion of the target data for improved classification of the target expressions.We integrate this DIFC module with an existing Unsupervised Domain Adaptation (UDA) approach to handle the domain shift and show that the proposed simple yet effective module can result in significant performance improvement on four benchmark datasets for Cross-Dataset FER (CD-FER) task. We also show that the proposed module works across different architectures and can be used with other UDA baselines to further boost their performance.

Keywords: Facial expression recognition adaptation · Class imbalance

1

· Unsupervised domain

Introduction

The need for accurate facial expression recognition models is ever increasing, considering its applications in driver assistance systems, understanding social behaviour in different environments [8,28], Human Computer Interaction [10], security applications [1], etc. Though deep learning has proven to work remarkably well for generic object classification tasks, classifying human expressions c The Author(s), under exclusive license to Springer Nature Switzerland AG 2023  L. Karlinsky et al. (Eds.): ECCV 2022 Workshops, LNCS 13805, pp. 262–277, 2023. https://doi.org/10.1007/978-3-031-25072-9_17

DIFC for Cross-Dataset FER

263

still remains challenging due to the subjective nature of the task [3,16]. Since data annotation is itself an expensive task, it is important that FER models trained on one dataset can be adapted seamlessly to other datasets. The large domain shift between the real-world datasets [7,13,21,25,35] makes this problem extremely challenging. This problem is addressed in the UDA setting, where the objective is to transfer discriminative knowledge learnt from a labelled source domain to a target domain with different data distribution, from which only unlabelled samples are accessible. Most of the current UDA approaches [11,27,31] try to address this domain shift by aligning the source and target features. In this work, we aim to address complementary challenges which can adversely affect the FER performance. One such challenge is the huge source data imbalance that is usually present in FER datasets since the number of labelled training examples for a few expressions like neutral and happy is usually significantly more compared to that of the less frequently occurring expressions like disgust, fear, etc. However, the final goal is to recognize the expressions in the target dataset, which may not have the same amount of imbalance as the source. In addition, there may be other factors, like the inherent similarity between two expressions etc. that may result in confusion between different target classes. Here, we propose a simple, yet effective module termed as DIFC (Data Imbalance and Feature Confusion), which addresses both these challenges simultaneously. Specifically, we modify the state-of-the-art technique LDAM [4] which is designed to handle class imbalance for the supervised classification task, such that it can also handle target feature confusion in an unsupervised domain adaptation scenario. This module can be seamlessly integrated with several UDA methods [11,27,31]. In this work, we integrate it with an existing UDA technique, Maximum Classifier Discrepancy (MCD) [27] and evaluate its effectiveness for the CD-FER task on four benchmark datasets, namely JAFFE [25], SFEW2.0 [7], FER 2013 [13] and ExpW [35]. Thus, the contributions of this work can be summarized as: 1. We propose a simple yet effective DIFC module for the CD-FER task. 2. The module can not only handle source data imbalance, but also target feature confusion in an integrated manner. 3. The proposed module can be used with several existing UDA approaches for handling these challenges. 4. Experiments on four real-world benchmark datasets show the effectiveness of the proposed DIFC module.

2

Related Work

Unsupervised Domain Adaptation: The objective of UDA is to utilize labelled source domain samples along with unlabelled target domain samples to learn a classifier that can perform well on the target domain. A wide spectrum of UDA methods follow adversarial training mechanism inspired by Generative Adversarial Networks [12]. Domain Adversarial Neural Networks [11] uses a two-player game, where the feature extractor aims to align the source and

264

M. Sreenivas et al.

target features, while the domain discriminator aims to distinguish between the domains. This results in a domain invariant feature space and hence, translating the source classifier to the target. Conditional Adversarial Domain Adaptation [23] leverages the discriminative information provided by the classifier to aid domain alignment. Another line of works [15,26] perform distribution alignment in the pixel space instead of feature space. Maximum Classifier Discrepancy (MCD) [27] uses two classifiers to identify misaligned target features and further aligns them in an adversarial fashion. In MDD [34], a new distribution divergence measure called Margin Disparity Discrepancy is proposed, which has a strong generalization bound that can ease minimax optimization. In SWD [17], the authors use a Wasserstein distance metric to measure the discrepancy between classifiers. SAFN [31] is a simple non-adversarial UDA method where the features are learnt progressively with large norms as they are better transferable to the target domain. Cross-Dataset FER: Domain shifts in facial expression recognition can be attributed to several factors like pose, occlusion, race, lighting conditions etc. Also, inconsistencies in data due to noisy source labels, high class imbalance, etc., add to the challenges. We briefly discuss prior works [5,20,29,32,36] that address this problem. In [32], unsupervised domain adaptive dictionary learning is used to minimize the discrepancy between the source and target domains. A transductive transfer subspace learning method that combines the labelled source domain with an auxiliary set of unlabelled samples from the target domain to jointly learn a discriminative subspace was proposed in [36]. In [29], after pretraining the network using source samples, a Generative Adversarial Network is trained to generate image samples from target distribution, which is then used along with the source samples to finetune the network. Contrary to these works where only the marginal distribution between source and target are aligned, [20] proposes to minimize the discrepancy between the class conditional distributions across FER datasets as well. In [37], they propose to preserve class-level semantics while adversarially aligning source and target features, along with an auxiliary uncertainty regularization to mitigate the effect of uncertain images in CD-FER task. The recent state-of-the-art AGRA framework [5,30] proposes to integrate graph representation adaptation with adversarial learning to adapt global and local features across domains effectively. Class Imbalance Methods: The problem of class imbalance has been quite extensively studied in the supervised image classification setting. The traditional ways of mitigating imbalance are re-sampling and re-weighting. The majority classes are under-sampled while minority classes are over-sampled to balance the classes in a re-sampling strategy [2]. However, this duplicates the minority classes, leading to overfitting, while undersampling the majority classes could leave out important samples. On the other hand, re-weighting refers to adaptively weighing a sample’s loss based on its class. Weighting by inverse class frequency is a commonly used strategy. [6] propose to use a re-weighting scheme by estimating the effective number of samples per class. In [14], a hybrid loss is proposed that jointly performs classification and clustering. A max-margin constraint is used to

DIFC for Cross-Dataset FER

265

Fig. 1. Imbalance ratio for different classes vary across FER datasets.

achieve equispaced classification boundaries by simultaneously minimizing intraclass variance and maximizing inter-class variance. In LDAM [4], class imbalance is handled by enforcing a larger margin for the minority classes. They show that enforcing class-dependent margins along with a deferred re-weighting strategy effectively handles class imbalance in the supervised classification problem. We propose to modify this LDAM loss for the UDA setting to simultaneously handle source data imbalance and target confusion.

3

Problem Definition and Notations

s We assume that we have access to a labelled source dataset Ds = {(xis , ysi )}ni=1 , i where ys ∈ {1, . . . , K} and K denotes the number of classes. The objective is to t , which does correctly classify the samples from the target dataset Dt = {xit }ni=1 not have the class labels, but shares the same label space as Ds . Here, ns and nt denote the number of source and target samples respectively. Now, we discuss the proposed approach in detail.

4

Proposed Approach

Most of the existing UDA approaches [11,23,27,34] aim to align the source and target features to mitigate the domain shift between them. In this work, we address two complementary challenges, namely source data imbalance and target feature confusion, which have been relatively less explored in the context of CD-FER task. These two challenges are handled in an integrated manner in the proposed DIFC module, which we describe next. 4.1

Handling Data Imbalance in DIFC

As already mentioned, FER datasets are naturally very imbalanced, with few classes like happy and neutral having a large number of samples, compared to

266

M. Sreenivas et al.

classes like fear, disgust, etc. This can be seen from Fig. 1, where we plot the imbalance ratio of all the classes for the four datasets used in this work. For a given dataset, if we denote the number of samples available for class k by Nk , its imbalance ratio is calculated as Imbalance ratiok =

Nk , minj (Nj )

j, k ∈ {1, . . . , K}

(1)

As mentioned earlier, K denotes the number of classes. We observe that the amount of imbalance varies across datasets, and the majority and minority classes may also differ when conditioned on the dataset. We now describe how we handle this imbalance for the FER task. The goal of UDA is to perform well on the annotated source data and adapt to the target data simultaneously. Many UDA approaches [11,27,34] utilize two losses for this task, namely (i) classification loss on the annotated source samples and (ii) adaptation loss to bridge the domain gap between the source and target. Usually, the supervision from the labelled source domain Ds is used to learn the final classifier by minimizing the Cross-Entropy(CE) loss. Specifically, given a sample (xs , ys ) ∈ Ds , let the corresponding output logits from the model be  denoted as zs = [z1 , z2 , . . . , zK ] . The CE loss is then computed as   exp (zys ) LCE (zs , ys ) = − log K (2) k=1 exp (zk ) In general, if the training data is highly imbalanced, the standard cross-entropy loss may lead to poor generalization for minority classes because of the smaller margin between these classes. This problem due to data imbalance has been extensively explored in the supervised classification setting [4,6,14]. Here, we adopt the very successful Label Distribution Aware Margin (LDAM) approach [4] and modify it suitably for our task. The LDAM loss addresses the problem of class imbalance in classification by enforcing a class-dependent margin, with the margin being larger for the minority classes relative to that of the majority classes. Given the number of samples in each class Nk , k ∈ {1, .., K}, the class dependent margins Δ ∈ RK and the LDAM loss (termed here as Data Imbalance or DI loss) for a sample (xs , ys ) is computed as ezys −Δys  ezys −Δys + k=ys ezk γ where Δk = 1/4 for k ∈ {1, . . . , K} Nk

LDI (zs , ys ; Δ) = − log

(3)

where γ is a hyperparameter. In [4], this loss, along with a deferred re-weighting scheme complementing each other, is proven to address the imbalance issue effectively for long-tail label distributed classification tasks without trading off the performance on the majority classes. Following this scheme, we weight this loss using class-specific weights after a few epochs. We use the subscript DI to indicate that this is used for mitigating the effect of source data imbalance in the

DIFC for Cross-Dataset FER

267

proposed DIFC module. Although this is very successful in handling data imbalance for supervised classification tasks, our goal is to correctly classify the target data. Figure 1 shows that amount of imbalance varies across datasets, suggesting that the margins computed using the source label distribution as in Eq. (3) may not be optimal for the target data. We now discuss how we address the target confusion along with the source data imbalance seamlessly in the proposed DIFC module. 4.2

Handling Feature Confusion in DIFC

The final goal is to improve the classification accuracy of the target features. Usually, samples from the minority classes tend to be confused with its neighbouring classes in the feature space. The target features can be confused because of the source as well as target data imbalance. But the target data imbalance may not be identical to that of the source. In addition, there may be other factors, e.g., few expressions are inherently close to one another compared to the others, which may result in increased confusion between certain classes. These unknown, difficult to quantify factors result in target feature confusion, which cannot be handled by LDAM or by aligning source and target distributions using UDA techniques. We now explain how this challenge is addressed in the proposed DIFC module. The goal is to quantitatively measure and reduce the confusion between the target classes at any stage of the adaptation process. As the target data is unlabeled, one commonly used technique is to minimize the entropy of the unlabeled target data to enforce confident predictions, which in turn reduces the confusion among target classes. For this task, we empirically found that this decreased the target accuracy. Upon further analysis, we observed that several target examples were being pushed towards the wrong classes, which resulted in this performance decrease. This might be because classifying subtle facial expressions is, in general, very challenging and subjective. Thus, in the DIFC module, we try to find the confusion in a class-wise manner, hence reducing the influence of wrong class predictions of the individual target samples. Here, we propose to use the softmax scores pt = sof tmax(zt ) predicted by the model to estimate a confusion score for each class. We first group the target samples xt ∈ Dt based on the predictions made by the model into sets Sk , k ∈ {1, . . . , K}, where Sk contains all the target features classified into class k by the current model, i.e., Sk = {xt |ˆ yt = k};

where

yˆt = argmax(pt )

(4)

Using these predicted labels yˆt , we aim to quantify the confusion of each class in the target dataset. In general, samples from a particular class are confused with its neighbouring classes. But there are few classes which are more confusing than the others. Conditioned on the target dataset, this trend of confusion can differ from that of source. For a sample xt ∈ Sk , if p1t and p2t are the top two

268

M. Sreenivas et al.

softmax scores in the prediction vector pt ∈ RK , we formulate the confusion score of class k as 1 confk = (5) Ext ∈Sk |p1t − p2t | As xt ∈ Sk , p1t corresponds to its confidence score for class k, however the second score p2t can correspond to any class j = k. This implies that, for all the target samples predicted to belong to class k, we compute the mean differences between the highest and second-highest softmax scores. If the mean difference is large, it implies that the model is not confusing its predicted class k with any other class, and the computed confusion score for class k is less. On the other hand, if for majority of the samples predicted to belong to a class, the average difference between the top-2 scores is small, it implies that this class is more confusing and thus gets a high confusion score. We emphasize that this confusion is not computed in a sample-wise manner as it can be noisy due to wrong predictions. We propose to modify the margins in Eq. (3) based on these estimated class confusion scores. Towards this goal, the class indices are sorted in decreasing order of confusion to get the ordered class indices C, such that C[1] corresponds to the most confusing class and C[K] corresponds to the least confusing class. We propose to use exponentially decreasing margin updates that emphasize on increasing the margin for the top few confusing classes without affecting the decision boundaries of the other classes. If ΔC[j] denotes the margin for class C[j], the class margins and DIFC loss are computed as follows: 



ΔC[j] = ΔC[j] +

2j−1



(6)

LDIF C = LDI (xt , yt ; Δ ) where  is a hyperparameter and refers to the increase in margin for the most confusing class i.e., C[1]. Using this formulation, we not only learn good decision boundaries taking into account the imbalanced source samples, but also reduce the confusion among the target domain samples. 4.3

Choosing Baseline UDA Approach

There has been a plethora of work in UDA for image classification, and it is not clear which one is better suited for this task. To choose a baseline UDA approach, we perform an empirical study of a few UDA approaches on SFEW2.0 [7] dataset, using RAF-DB [21] as the source data. The five approaches chosen are: (i) DANN [11]: classical UDA technique which forms the backbone for SOTA methods like AGRA [5], (ii) CADA [23]: approach which utilizes classifier outputs in a DANN-like framework, (iii) MCD [27]: approach which uses multiple classifiers to perform adversarial domain alignment, (iv) STAR [24]: very recent approach which extends the concept of multiple classifiers in MCD using a stochastic classifier, (v) SAFN [31]: a non-adversarial UDA approach that encourages learning features with larger norm as they are more transferable.

DIFC for Cross-Dataset FER

269

Fig. 2. We observe from the t-SNE plots for DANN (a), MCD (b) and SAFN (c) that the classes are better clustered and separated in MCD. (d) empirically shows the effectiveness of MCD when compared to others.

We observe from Fig. 2d that CADA [23], MCD [27] and STAR [24] achieve very similar accuracy, and all of them perform significantly better than DANN and SAFN. This implies that incorporating class information or class discrimination improves the classification performance of the target features. CD-FER task is characterized by large intra-class variations, small inter-class variations and domain shift. In such a fine-grained classification task, class-agnostic alignment based on domain confusion may not be sufficient. Confusing target samples often lie between two source clusters and could get aligned to the wrong class in this case. The effectiveness of using class discrimination is also evident from the t-SNE plots using DANN, SAFN and MCD (similar to CADA and STAR) in Fig. 2. Finally, in this work, we decided to use MCD [27] as the baseline UDA approach for the two following reasons: 1. It performs similar to the other approaches like CADA [23], STAR [24]. 2. Several recent works like SWD [17], MDD [34], STAR [24] build upon the principle of MCD. However, we emphasize that the proposed DIFC module can be seamlessly integrated with other UDA methods (based on completely different techniques) as an alternative for CE loss, which we demonstrate in Sect. 5.4.

270

M. Sreenivas et al.

Algorithm 1: MCD with DIFC module Input: Labelled source domain data Ds = {(xis , ysi ), i = 1, ., ns } Unlabelled target domain data Dt = {xit , i = 1, ., nt }. Feature extractor F Two randomly initialized classifiers C1 and C2 . Training: Handling Data Imbalance Use Data Imbalance (DI) loss as classification loss. for i=1 to T1 epochs i.e Lcls ← LDI in eq. (3) Train F, C1 , C2 using eq. (7). Handling Feature Confusion Update margins and then use DIFC loss as classification loss. for i=T1 + 1 to T2 epochs i.e. Lcls ← LDIF C in eq. (6) Train F, C1 , C2 using eq. (7). Output: Trained feature extractor F and classifiers C1 and C2 .

4.4

Integrating DIFC with Baseline UDA

Here, we briefly describe Maximum Classifier Discrepancy (MCD) [27] and how the DIFC module is integrated with it. MCD consists of three modules, a feature extractor F and two classifiers C1 and C2 . Here, the feature extractor acts as the generator, and the classifiers C1 and C2 play the role of the discriminator. The generator and classifiers are trained to learn accurate decision boundaries by minimizing cross entropy loss on source domain samples. In order to align the source and target domain features, the classifiers C1 and C2 are leveraged to identify target samples beyond the support of source by maximizing the prediction discrepancy between the classifiers. The generator F is then trained to minimize the discrepancy so that the target features align with that of source. Given (xs , ys ) ∈ Ds and xt ∈ Dt , the generator F and classifiers C1 , C2 are optimized in three steps as follows min Lcls (C1 (F (xs )) , ys ) + Lcls (C2 (F (xs )) , ys )

F,C1 ,C2

max C1 (F (xt )) − C2 (F (xt ))1

C1 ,C2

(7)

min C1 (F (xt )) − C2 (F (xt ))1 F

where Lcls refers to the standard CE loss in MCD. First, we propose to address data imbalance using LDI as the classification loss. Once the source samples are well separated, feature confusion is addressed

DIFC for Cross-Dataset FER

271

using the final DIFC module. As two classifiers are used in MCD, the class prediction is based on the average of the two scores given by pt,i = sof tmax(Ci (F (xt ))), 1 pt = (pt,1 + pt,2 ) 2 yˆt = argmaxk (pt )

i = 1, 2 (8)

The DIFC loss is then computed as described in Eq. (6). During testing, given a target sample xt , the class prediction is made using Eq. (8). The final integrated algorithm (MCD with DIFC module) is described in Algorithm 1.

5 5.1

Experiments Datasets Used

We demonstrate the effectiveness of the proposed method on several publicly available datasets spanning different cultures, pose, illumination, occlusions, etc. For fair comparison, we follow the benchmark [5] and use RAF-DB [21] as the source dataset. We use other FER datasets, namely JAFFE [25], SFEW2.0 [7], FER2013 [13] and ExpW [35] as the target data. For all the datasets, the images are labelled with one of the seven expressions, namely surprise, fear, disgust, happy, anger, sad and neutral. 5.2

Implementation Details

Model Architecture. We use the same backbone as that used in [5] for fair comparison. A ResNet-50 backbone pretrained on MS-Celeb-1M dataset is used to extract local and global features. Firstly, MT-CNN [33] network is used to obtain the face bounding boxes as well as five landmarks corresponding to the two eyes, the nose and the two ends of the lips. Following [5], the face crop is resized to 112 × 112 × 3 image resolution and fed to the ResNet-50 backbone, from which a feature map of size 7 × 7 × 512 is obtained at the end of the fourth ResNet block. The global features are obtained by convolving this with a 7 × 7 × 64 filter followed by Global Average Pooling, resulting in a 64dimensional feature vector. The feature maps from the second ResNet block of dimension 28 × 28 × 128 are used to extract the local features. This feature map is used to obtain five crops of 7 × 7 × 128 surrounding each corresponding landmark location. Each such feature map is further convolved with a 7 × 7 × 64 filter followed by Global Average Pooling, resulting in a 64-dimensional local feature descriptor for each landmark. The global and local features are then concatenated to obtain a 384 (64 × (1+5)) dimensional feature. An additional fully connected layer enables global and local feature connections to get the final feature vector of 384 dimensions. We use two classifiers with different random initializations for MCD, each being a fully connected layer that does a 7-way classification of the 384-dimensional feature vectors.

272

M. Sreenivas et al.

Table 1. Target classification accuracy (%) of the proposed framework compared with state-of-the-art methods. Method

JAFFE SFEW2.0 FER2013 ExpW Avg

CADA [23] SAFN [31] SWD [17] LPL [22] DETN [19] ECAN [20] JUMBOT [9] ETD [18] AGRA [5]

52.11 61.03 54.93 53.05 55.89 57.28 54.13 51.19 61.50

Proposed DIFC 68.54

53.44 52.98 52.06 48.85 49.40 52.29 51.97 52.77 56.43

57.61 55.64 55.84 55.89 52.29 56.46 53.56 50.41 58.95

63.15 64.91 68.35 66.90 47.58 47.37 63.69 67.82 68.50

56.58 58.64 57.79 56.17 51.29 53.35 55.84 55.55 61.34

56.87

58.06

71.20 63.67

Training Details. We train the backbone and the two classifiers using stochastic gradient descent (SGD) with an initial learning rate of 0.001, momentum of 0.9, and a weight decay of 0.0005. Initially, during the domain adaptation, only the DI loss is used for the initial epochs which addresses the source imbalance. The complete DIFC loss is used after few epochs to ensure that the target class predictions are more reliable and can be used to compute the feature confusion. In all our experiments, after training the model with only the DI loss for the initial 20 epochs, the learning rate is reduced to 0.0001 and the model is further fine-tuned with the complete DIFC loss for another 20 epochs. We tune γ to normalise the class margins Δj , j ∈ {1, . . . , K}, so that the largest enforced margin is 0.3. We set  to 0.02 for all the experiments. The same protocol without any modification is followed for all the target datasets. We use Pytorch framework and perform all experiments on a single NVIDIA GTX 2080Ti GPU using a batch size of 32. 5.3

Results on Benchmark Datasets

We report the results of the proposed framework on four datasets, i.e. JAFFE [25], SFEW2.0 [7], FER2013 [13] and ExpW [35] as the target and RAFDB as the source in Table 1. Comparison with the state-of-the-art approaches is also provided. The results of the other approaches are taken directly from [5]. We observe that the proposed framework performs significantly better than the state-of-the-art for almost all datasets, except for SFEW2.0, where it is second to [5]. But on an average, it outperforms all the other approaches. 5.4

Additional Analysis

Here, we perform additional experiments to study the effect of different components of the loss function, integrate it with existing UDA methods. These show that our module is effective in widely varying experimental settings.

DIFC for Cross-Dataset FER

273

Table 2. Importance of both components of the proposed DIFC loss. Method JAFFE SFEW2.0 FER2013 ExpW Avg CE DI

64.79 67.14

53.73 55.90

55.80 57.30

69.10 70.70

60.85 62.76

DIFC

68.54

56.87

58.06

71.20 63.67

Table 3. Order of confusion for different target datasets Dataset JAFFE SFEW2.0 FER2013 ExpW

Surprise Fear Disgust Happy Sad Anger Neutral 5 4 6 3

1 2 2 1

2 5 1 2

7 7 4 7

3 3 3 4

6 6 7 5

4 1 5 6

Ablation Study: To understand the impact of different components of the proposed DIFC loss, we perform experiments using MCD as baseline UDA method with CE (Eq. 2) loss, DI or LDAM loss (Eq. 3) and the proposed DIFC loss (Eq. 6). The results in Table 2 show that using the DI loss improves target accuracy when compared to using the standard CE loss. Specifically, addressing source data imbalance improves the target accuracy by about 2% on average across all four target datasets. However, as the imbalance exists not only in the source but also in the target datasets, which may be different as shown in Fig. 1, the margins derived using source label distribution may not be optimal for the target dataset. The DIFC module adapts these margins based on the target feature confusion, further improving the target accuracy. Analysis of Confusion Scores: In order to analyze the metric proposed to measure confusion (Eq. 5), we compute the order of confusion obtained for each dataset after addressing the source data imbalance in Table 3. We observe that the confusing classes vary across datasets and also differ from the minority classes (Fig. 1). Based on this order, we divide the classes into two sets, confusing (with order 1, 2, 3) and non-confusing (with order 4,5,6,7). Since the total accuracy as reported in Table 1 can be biased towards the majority classes, here we analyze the average class accuracy for each of these two sets in Table 4 after handling data imbalance and further target confusion, which we refer as DI and DIFC respectively. We observe that (1) the average accuracy of the non-confusing classes is, in general, significantly more than that of the confusing classes, except for SFEW2.0, where they are very similar; (2) the DIFC module significantly improves the average accuracy for confusing classes without compromising the performance on the other classes. Table 6 shows a few model predictions from different target datasets for DI loss and the final DIFC loss. These results show that incorporating DIFC loss can correct several samples which were misclassified even after addressing source imbalance.

274

M. Sreenivas et al. Table 4. Average accuracy for confusing vs other classes. Dataset

Confusing classes Non confusing classes DI DIFC DI DIFC

JAFFE

48.01 51.22

80.88

80.88

SFEW2.0 50.70 51.89

46.88

48.02

FER2013 34.57 38.09

67.58 67.41

ExpW

66.25

30.78 31.20

66.75

Table 5. DIFC with baseline UDA methods. Backbone: ResNet-18, Source: RAF-DB, Target: SFEW2.0 Method DANN SAFN MCD Avg CE DI

48.67 50.12

50.46 51.81

52.75 53.25

50.63 51.73

DIFC

51.33

53.01 55.66 53.33

DIFC Module Integrated with Other UDA Approaches: Most UDA methods [5,11,27,31] use CE loss to learn the classifier from the labelled source dataset and an adaptation loss driving the alignment between source and target features. However, the decision boundaries learnt using CE loss are not very effective in the presence of data imbalance. Here, we show that the DIFC module can be integrated with other UDA methods as an alternative for CE loss. We select three UDA approaches whose adaptation mechanisms are principally different from each other. DANN [11] and MCD [27] are adversarial methods that use domain discriminator and multiple classifiers respectively. As discussed, MCD incorporates class-discriminative information in the adaptation process unlike DANN. On the other hand, SAFN [31] is a non-adversarial UDA method. We use ResNet-18 backbone, RAF-DB as source and SFEW2.0 as target dataset in these experiments. Incorporating DIFC with each of these improves the target accuracy on SFEW2.0 dataset by about 2.5% when compared to using CE loss, as shown in Table 5. Complexity Analysis. The base feature extractor being common for AGRA [5] and our method, we report the additional number of parameters in the two methods. AGRA has a total of 318,464 extra parameters accounting for two intradomain GCNs (2 × 64 × 18), inter-domain GCN (64 × 64), Classifier (384 × 7) and Domain discriminator (2 × 384 × 384 + 384 × 1). On the other hand, the proposed framework has only 152,832 additional parameters which is due to an extra FC layer (384 × 384) and two classifiers (384 × 7) used in MCD. The proposed method outperforms current state-of-the-art results using only about half the additional parameters when compared to AGRA.

DIFC for Cross-Dataset FER

275

Table 6. Comparison of model predictions for DI (blue) and DIFC (orange) loss. The correct and incorrect predictions are marked with ✓and ✗respectively.

6

Conclusion

In this work, we propose a novel Data Imbalance and Feature Confusion (DIFC) module for the Cross-Dataset FER task. Firstly, the proposed module effectively mitigates the effect of source data imbalance and hence learns better decision boundaries. But this can be insufficient due to the shift in label distribution of the target data compared to the source data. To handle this and other unknown subjective factors that might be present, we devise the DIFC module to mitigate such confusion among the target classes. Specifically, the proposed DIFC module incorporates confusion in target data into the supervised classification loss of the baseline UDA framework to learn improved decision boundaries. Extensive experiments in varied settings demonstrate the effectiveness of the DIFC module for the CD-FER task. Additionally, the DIFC module can be seamlessly integrated with several existing UDA methods as an alternative for the standard CE loss, thereby further improving their performance. Acknowledgements. This work is partly supported through a research grant from AISIN.

276

M. Sreenivas et al.

References 1. Al-Modwahi, A.A.M., Sebetela, O., Batleng, L.N., Parhizkar, B., Lashkari, A.H.: Facial expression recognition intelligent security system for real time surveillance. In: Proceedings of World Congress in Computer Science, Computer Engineering, and Applied Computing (2012) 2. Barandela, R., Rangel, E., S´ anchez, J.S., Ferri, F.J.: Restricted decontamination for the imbalanced training sample problem. In: Sanfeliu, A., Ruiz-Shulcloper, J. (eds.) CIARP 2003. LNCS, vol. 2905, pp. 424–431. Springer, Heidelberg (2003). https://doi.org/10.1007/978-3-540-24586-5 52 3. Brooks, J.A., Chikazoe, J., Sadato, N., Freeman, J.B.: The neural representation of facial-emotion categories reflects conceptual structure. Proc. Natl. Acad. Sci. 116(32), 15861–15870 (2019) 4. Cao, K., Wei, C., Gaidon, A., Arechiga, N., Ma, T.: Learning imbalanced datasets with label-distribution-aware margin loss. In: NeurIPS, vol. 32 (2019) 5. Chen, T., Pu, T., Wu, H., Xie, Y., Liu, L., Lin, L.: Cross-domain facial expression recognition: A unified evaluation benchmark and adversarial graph learning. IEEE Trans. Pattern Anal. Mach. Intell. (2021) 6. Cui, Y., Jia, M., Lin, T.Y., Song, Y., Belongie, S.: Class-balanced loss based on effective number of samples. In: CVPR, pp. 9268–9277 (2019) 7. Dhall, A., Goecke, R., Lucey, S., Gedeon, T.: Static facial expression analysis in tough conditions: data, evaluation protocol and benchmark. In: ICCV Workshops, pp. 2106–2112 (2011) 8. Edwards, J., Jackson, H., Pattison, P.: Erratum to “emotion recognition via facial expression and affective prosody in schizophrenia: a methodological review” [clinical psychology review 22 (2002) 789–832]. Clin. Psychol. Rev. 22, 1267–1285 (2002) 9. Fatras, K., Sejourne, T., Flamary, R., Courty, N.: Unbalanced minibatch optimal transport; applications to domain adaptation. In: Meila, M., Zhang, T. (eds.) ICML, pp. 3186–3197 (2021) 10. Fragopanagos, N., Taylor, J.: Emotion recognition in human-computer interaction. Neural Netw. 18(4), 389–405 (2005) 11. Ganin, Y., et al.: Domain-adversarial training of neural networks. J. Mach. Learn. Res. 17(59), 1–35 (2016) 12. Goodfellow, I., et al.: Generative adversarial nets. In: NeurIPS, vol. 27 (2014) 13. Goodfellow, I.J., et al.: Challenges in representation learning: a report on three machine learning contests. Neural Netw. 64, 59–63 (2015) 14. Hayat, M., Khan, S., Zamir, S.W., Shen, J., Shao, L.: Gaussian affinity for maxmargin class imbalanced learning. In: ICCV (2019) 15. Hoffman, J., et al.: CyCADA: cycle-consistent adversarial domain adaptation. In: ICML, vol. 80, pp. 1989–1998 (2018) 16. Jack, R.E., Garrod, O.G.B., Yu, H., Caldara, R., Schyns, P.G.: Facial expressions of emotion are not culturally universal. Proc. Natl. Acad. Sci. 109(19), 7241–7244 (2012) 17. Lee, C.Y., Batra, T., Baig, M.H., Ulbricht, D.: Sliced wasserstein discrepancy for unsupervised domain adaptation. In: CVPR, pp. 10277–10287 (2019) 18. Li, M., Zhai, Y.M., Luo, Y.W., Ge, P.F., Ren, C.X.: Enhanced transport distance for unsupervised domain adaptation. In: CVPR (2020) 19. Li, S., Deng, W.: Deep emotion transfer network for cross-database facial expression recognition. In: ICPR, pp. 3092–3099 (2018)

DIFC for Cross-Dataset FER

277

20. Li, S., Deng, W.: A deeper look at facial expression dataset bias. IEEE Trans. Affect. Comput. (2020) 21. Li, S., Deng, W., Du, J.: Reliable crowdsourcing and deep locality-preserving learning for expression recognition in the wild. In: CVPR, pp. 2584–2593. IEEE (2017) 22. Li, S., Deng, W., Du, J.: Reliable crowdsourcing and deep locality-preserving learning for expression recognition in the wild. In: CVPR, pp. 2584–2593 (2017) 23. Long, M., CAO, Z., Wang, J., Jordan, M.I.: Conditional adversarial domain adaptation. In: NeurIPS, vol. 31 (2018) 24. Lu, Z., Yang, Y., Zhu, X., Liu, C., Song, Y.Z., Xiang, T.: Stochastic classifiers for unsupervised domain adaptation. In: CVPR, pp. 9108–9117 (2020) 25. Lyons, M., Akamatsu, S., Kamachi, M., Gyoba, J.: Coding facial expressions with gabor wavelets. In: Proceedings Third IEEE International Conference on Automatic Face and Gesture Recognition, pp. 200–205 (1998) 26. Murez, Z., Kolouri, S., Kriegman, D., Ramamoorthi, R., Kim, K.: Image to image translation for domain adaptation. In: CVPR, pp. 4500–4509 (2018) 27. Saito, K., Watanabe, K., Ushiku, Y., Harada, T.: Maximum classifier discrepancy for unsupervised domain adaptation. In: CVPR (2018) 28. Sajjad, M., Zahir, S., Ullah, A., Akhtar, Z., Muhammad, K.: Human behavior understanding in big multimedia data using CNN based facial expression recognition. Mob. Netw. Appl. 25(4), 1611–1621 (2020) 29. Wang, X., Wang, X., Ni, Y.: Unsupervised domain adaptation for facial expression recognition using generative adversarial networks. Comput. Intell. Neurosci. (2018) 30. Xie, Y., Chen, T., Pu, T., Wu, H., Lin, L.: Adversarial graph representation adaptation for cross-domain facial expression recognition. In: Proceedings of the 28th ACM International conference on Multimedia (2020) 31. Xu, R., Li, G., Yang, J., Lin, L.: Larger norm more transferable: an adaptive feature norm approach for unsupervised domain adaptation. In: ICCV (2019) 32. Yan, K., Zheng, W., Cui, Z., Zong, Y.: Cross-database facial expression recognition via unsupervised domain adaptive dictionary learning. In: NeurIPS, pp. 427–434 (2016) 33. Zhang, K., Zhang, Z., Li, Z., Qiao, Y.: Joint face detection and alignment using multitask cascaded convolutional networks. IEEE Signal Process. Lett. 23(10), 1499–1503 (2016) 34. Zhang, Y., Liu, T., Long, M., Jordan, M.: Bridging theory and algorithm for domain adaptation. In: ICML, pp. 7404–7413 (2019) 35. Zhang, Z., Luo, P., Loy, C.C., Tang, X.: Learning social relation traits from face images. In: ICCV, pp. 3631–3639 (2015) 36. Zheng, W., Zong, Y., Zhou, X., Xin, M.: Cross-domain color facial expression recognition using transductive transfer subspace learning. IEEE Trans. Affect. Comput. 9(1), 21–37 (2016) 37. Zhou, L., Fan, X., Ma, Y., Tjahjadi, T., Ye, Q.: Uncertainty-aware cross-dataset facial expression recognition via regularized conditional alignment. In: Proceedings of the 28th ACM International Conference on Multimedia, pp. 2964–2972 (2020)

Video-Based Gait Analysis for Spinal Deformity

Himanshu Kumar Suman¹ and Tanmay Tulsidas Verlekar¹,²(B)

¹ Department of CSIS, BITS Pilani, K K Birla Goa Campus, Sancoale 403726, Goa, India
{h20210066,tanmayv}@goa.bits-pilani.ac.in
² APPCAIR, BITS Pilani, K K Birla Goa Campus, Sancoale 403726, Goa, India
https://www.bits-pilani.ac.in/goa/tanmayv/profile

Abstract. In this paper, we explore the area of classifying spinal deformities unintrusively using machine learning and RGB cameras. We postulate that any changes to posture due to spinal deformity can induce specific changes in people's gait. These changes are not limited to the spine's bending but manifest in the movement of the entire body, including the feet. Thus, spinal deformities such as Kyphosis and Lordosis can be classified much more effectively by observing people's gait. To test our claim, we present a bidirectional long short-term memory (BiLSTM) based neural network that operates using the key points on the body to classify the deformity. To evaluate the system, we captured a dataset containing 29 people simulating Kyphosis, Lordosis and their normal gait under the supervision of an orthopaedic surgeon using an RGB camera. Results suggest that gait is a better indicator of spinal deformity than spine angle.

Keywords: Gait analysis · LSTM · Spinal deformity · Dataset · Pattern recognition

1 Introduction

Gait can be defined as the manner of walking that results from coordinated interplay between the nervous, musculoskeletal and cardiorespiratory systems [7]. It consists of repeating a sequence of movements which comprise a gait cycle. Analysis of the gait cycle can reveal a wide range of information about people's identity [11], gender, age group [8], body weight, feelings and even emotions [21]. The gait cycle can be altered by factors such as neurological or systemic disorders, diseases, genetics, injuries, or ageing. Thus, medical diagnosis of such factors often involves capturing patients' gait to detect patterns unique to those factors. Most medical professionals capture people's gait through the use of sensor-based systems, such as optoelectronics, force plates or electromyographs [13]. The output of these systems can be used to estimate various gait-related features that medical professionals inspect to perform their diagnoses. With the advent of machine learning (ML), systems can now be developed to learn the patterns and perform gait analysis automatically.


A limitation of most sensor-based systems is that they are expensive, which limits their availability in most general clinics. They are also challenging to operate, requiring significant training and calibration to use effectively. These limitations have motivated the use of RGB cameras to capture people's gait [14]. Their presence on even the most inexpensive cell phones and surveillance cameras makes gait analysis for medical diagnoses affordable and accessible to people in the comfort of their own homes. This paper explores using ML and RGB cameras to develop a system that can classify spinal deformities through gait analysis.

1.1 State-of-the-Art Review

In the field of medicine, sensor-based systems are extensively used to capture gait for medical diagnoses. For instance, in [25], posture and movement-related features are captured with ultrasonic systems attached to an athlete's heels and analysed to prevent injury. Recovery from surgical procedures, such as knee arthroplasty, is often analysed using joint motion, estimated from the output of optoelectronic motion capture systems [27]. Some daily tasks, such as walking up the stairs, can be good indicators of people's health. Works such as [26] captured older adults' gait during such tasks using inertial sensors. They then used ML techniques to predict the likelihood of a fatal fall. ML can also be used to learn specific foot patterns, such as spastic gait, which manifests in people with cerebral palsy. In [20], ground reaction forces captured using force plate sensors are used to detect the presence of spastic gait. Since some diseases become more prominent with time, the changes in regular gait patterns can be learnt to analyse the severity of certain diseases. This is demonstrated in [6], where the stage of Parkinson's disease is identified using the step length and back angle, estimated from the output of a marker-based optoelectronic system.

As discussed in Sect. 1, the use of RGB cameras and ML can make gait analysis possible without the need for medical professionals. The framework for such remote gait analysis is presented in [2], where a system hosted on a server processes uploaded video recordings. It then emails the diagnosis, classifying people's gait across five different types. It should be noted that most systems that operate using RGB cameras extract silhouettes from the video sequences during pre-processing. Most aggregate them to create a gait representation, such as the gait energy image (GEI). The GEIs can then be used by ML techniques such as support vector machines (SVM) [15] or convolutional neural networks (CNN), such as the fine-tuned VGG-16 [24], to perform classification. Apart from the GEI, other gait representations, such as the skeleton energy image, are also used by CNNs to improve their accuracy [10]. To capture relevant temporal features, some systems have also used recurrent neural networks (RNN) to process the silhouettes directly [3].

While most methods focus on classification, some RGB camera-based systems focus on extracting biomechanical features that medical professionals can directly use in their diagnoses [23]. These systems emphasise detecting features, such as the time of heel strike and toe-off, with accuracies equivalent to the gold standard in gait analysis, the optoelectronic motion capture system [22]. However, the operation of such systems is currently limited to observations in the sagittal plane. Some systems use these features to learn changes in gait due to restrictions in movement [16] or the severity of diseases such as Parkinson's [17].

1.2 Motivation and Contribution

The state-of-the-art review suggests there is scope for using RGB cameras to perform gait analysis in the field of medicine. Existing systems based on RGB cameras have demonstrated their effectiveness by operating equivalently to most sensor-based systems. Additionally, they offer advantages such as being unintrusive, affordable, and operable without medical professionals. These advantages make such systems of great interest to research and the medical community. Currently, most of these RGB camera-based systems focus on the movement of feet and legs to perform gait analysis. Apart from the feet and legs, gait also contains information about people's posture. Thus, this paper explores the novel problem of using gait analysis to classify spinal deformity captured using an RGB camera. We present:

– A gait classification system that can classify Kyphosis, Lordosis and normal gait;
– A video dataset containing 29 people simulating Kyphosis, Lordosis and normal gait (captured under the supervision of an orthopaedic surgeon);
– An evaluation of the proposed system and a comparison with spinal analysis (a benchmark suggested by the orthopaedic surgeon).

2 Proposed System

This paper considers two spinal deformities, Kyphosis [1] and Lordosis, that affect a large proportion of the elderly population [12]. They are usually diagnosed by observing spinal curvatures in X-ray images. Recently, ML has also been explored to develop deep learning-based systems for the automatic analysis of X-ray images [9]. Even with such advancements, elderly people are still required to self-assess their discomfort and approach a medical professional to obtain a diagnosis. We propose a 2D camera-based gait analysis system that can classify captured gait as Kyphosis, Lordosis or normal. It analyses recordings of people captured using a cell phone or surveillance camera to offer a preliminary diagnosis at home or in elderly care facilities. An advantage of such analysis is that it can occur before people experience any significant discomfort. The proposed system consists of two blocks, as illustrated in Fig. 1:

– Pose estimation: to obtain a 2D pose from an input image;
– BiLSTM-based network: to learn the temporal patterns available in the 2D poses across a video sequence and classify the gait as Kyphosis, Lordosis or normal.


Fig. 1. Block diagram of the proposed system

2.1 Pose Estimation

The pose estimation is performed using OpenPose [4], a deep CNN capable of identifying and classifying 135 key points on the human body. Given an RGB image, OpenPose returns a set of (X, Y) coordinates along with the mapping between them to create a skeleton-like representation of the body. The OpenPose network uses VGG-19 [19] as its backbone. Its output branches into two additional CNNs. The first network builds a confidence map for each key point. The second network estimates the degree of association between two key points to create a mapping. This process is repeated in stages, where the network successively refines its output. Out of the available 135 key points, we select 25 key points, as illustrated in Fig. 2(a), to create a 50-dimensional vector. The key-point vectors belonging to a video sequence are concatenated to create the input to the BiLSTM-based network. The dataset discussed in Sect. 3 contains video sequences with a large number of frames capturing multiple gait cycles. Thus, we sample 128 frames at a time from the video sequence, such that the frames capture at least one gait cycle. The frames at the beginning and the end capture a slightly different view of the people when compared to frames captured when people are perpendicular to the camera (see Fig. 2(c)).
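As a concrete illustration of this input construction, the sketch below stacks per-frame OpenPose detections into the (128, 50) sequence described above. It is a minimal sketch assuming the standard OpenPose JSON output (one file per frame, with a 'people'/'pose_keypoints_2d' list of (x, y, confidence) triplets) and a single detected person; the choice of the 25 indices and the uniform frame sub-sampling are illustrative stand-ins, not the paper's exact procedure.

```python
import json
import numpy as np

# Stand-in for the 25 key points kept in the paper (Fig. 2(a)).
SELECTED_KEYPOINTS = list(range(25))

def load_keypoint_sequence(json_files, num_frames=128):
    """Stack per-frame OpenPose outputs into a (num_frames, 50) array of (x, y) pairs."""
    frames = []
    for path in json_files:
        with open(path) as f:
            people = json.load(f)["people"]
        # Assumes a single detected person per frame.
        kp = np.asarray(people[0]["pose_keypoints_2d"]).reshape(-1, 3)
        frames.append(kp[SELECTED_KEYPOINTS, :2].reshape(-1))  # drop confidences -> 50-d
    frames = np.stack(frames)
    # Uniformly sub-sample (or repeat) frames to a fixed-length window.
    idx = np.linspace(0, len(frames) - 1, num_frames).astype(int)
    return frames[idx]
```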

2.2 BiLSTM-Based Network

The vectorised output of OpenPose is processed by a BiLSTM to capture the temporal information available between frames. It is composed of two LSTM [5] networks arranged to process information in two directions, from beginning to end and from end to beginning [18]. Since a gait cycle consists of a sequence of events that follow one another, an LSTM can effectively capture the context of the events at the beginning of the gait cycle. Feeding the input in the reverse direction to an LSTM allows it to hold the context of the events at the end of the gait cycle. Combining the two through a BiLSTM allows the preservation of context in both directions at any point in time.


Fig. 2. Output of OpenPose: (a) selected key points, (b) pose at the beginning of the sequence, (c) pose when people are perpendicular to the camera, (d) pose at the end of the sequence.

Structurally, the LSTM is a special type of recurrent neural network capable of handling long-term dependencies in sequential data. This is achieved by introducing a memory cell, which is capable of retaining additional information. Access to the memory cell is controlled by a number of neural networks called gates. The first gate, called the output gate, is used to read the entries from the memory cell. The second gate, referred to as the input gate, decides when to read data into the memory cell. Lastly, the forget gate governs the mechanism to reset the content of the cell. The LSTM processes the input sequentially and passes the content of only the hidden state as its output. To offer the LSTM a look-ahead ability, a second LSTM is introduced in a BiLSTM, which processes the input in the reverse direction. The hidden states are shared across the two LSTMs to process the input more flexibly. The outputs of the two networks are concatenated to create the final output of the BiLSTM network. The proposed system uses a single layer of BiLSTM to generate a 100-dimensional output vector containing the relevant temporal information. It then uses dense layers to perform the final classification task. The proposed network contains two fully connected dense layers with a dropout of 0.2 between them. The first layer contains 64 neurons with a sigmoid activation function. The second layer has an output size of three with a softmax activation function. It outputs the probabilities of an input gait being Kyphosis, Lordosis and normal. The proposed system is trained using categorical cross-entropy as the loss function and the Adam optimiser with a learning rate of 0.0001. The number of epochs is set to 300, with early stopping monitoring the loss with a patience of 20. The batch size for training is set to 20.
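A minimal sketch of this classifier is given below, assuming the Keras API (the paper does not state which framework was used); the layer sizes and training hyperparameters follow the description above.

```python
import tensorflow as tf
from tensorflow.keras import layers, models, callbacks

def build_bilstm_classifier(seq_len=128, feat_dim=50):
    model = models.Sequential([
        layers.Input(shape=(seq_len, feat_dim)),
        layers.Bidirectional(layers.LSTM(50)),   # concatenated final states -> 100-d vector
        layers.Dense(64, activation="sigmoid"),
        layers.Dropout(0.2),
        layers.Dense(3, activation="softmax"),   # Kyphosis / Lordosis / normal
    ])
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model

# Training setup as described: 300 epochs, early stopping on the loss (patience 20), batch size 20.
early_stop = callbacks.EarlyStopping(monitor="loss", patience=20)
# model.fit(x_train, y_train, epochs=300, batch_size=20, callbacks=[early_stop])
```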

3 Spinal Deformity Dataset

The use of RGB cameras and ML for the classification of spinal deformity through gait analysis is, to the best of our knowledge, a novel problem. No publicly available datasets currently exist to evaluate such systems. Hence, we captured a dataset, called the spinal deformity dataset, with 29 people simulating Kyphosis, Lordosis and their normal gait. The decision to simulate was made due to the ethics and privacy concerns of patients. Most patients hesitate to be recorded with a camera. Access to elderly people was also restricted due to the COVID-19 pandemic. However, to maintain quality control, an orthopaedic surgeon was consulted. The surgeon advised on the movements and posture to be maintained during the simulations and later validated all the captured sequences.

The sequences were captured using a GoPro Hero 9, a 4K RGB video camera operating at 30 fps. People walked along a line perpendicular to the camera, approximately 3 m away. The camera was mounted on a tripod stand, raised 1.25 m from the ground. The 29 participants, aged 20 to 27 years, took part in the construction of the dataset. They were captured on three separate days, where each participant simulated three different types of gait. Each participant was captured twice per simulation, walking from right to left. Each sequence was between 7 to 10 s long, capturing multiple (3–4) gait cycles. Before capturing the sequences, the participants were instructed on how to simulate Kyphosis and Lordosis and underwent practise runs.

Kyphosis is caused by forward bending of the dorsal spine and appears as a hump or curve of the upper back, as illustrated in Fig. 3(a). The participants simulated it by leaning their heads forward compared to the rest of the body. Kyphosis also causes fatigue in the back and legs, which was simulated using smaller step lengths. Lordosis is caused by backward curving of the lumbar spine, giving a swayback appearance with the buttocks being more pronounced. The participants simulated it by adopting a backwards-leaning posture while walking (see Fig. 3(b)). Also, back pain and discomfort in moving were simulated with restricted leg movement. No instructions were offered during the recording of the normal gait. However, adhering to the medical protocol, arm movement was restricted in all three cases (see Fig. 3). It should be noted that the participants for the simulation are in their 20s, while the target demographic of the proposal is the elderly. The purpose of the dataset, in its current form, is to highlight the ability of the proposed system to distinguish spinal deformity from bad posture. We intend to improve the dataset by capturing actual patients in the future.

4 Results

To evaluate the proposed system, we split the spinal deformity dataset into a training and a test set. The training set is composed of 48 Kyphosis, 42 Lordosis and 45 normal sequences captured from 24 people on the first and second days. The test set comprises 5 Kyphosis, 5 Lordosis and 5 normal sequences captured from 5 people on the third day. The evaluation is performed such that the training and the test sets are mutually exclusive with respect to the participants. It should be noted that although 29 people participated in the dataset acquisition process, several sequences were discarded by the orthopaedic surgeon due to poor or exaggerated simulations.

Fig. 3. Simulation of (a) Kyphosis, (b) Lordosis, (c) normal gait.

Following the orthopaedic surgeon's guidance, a benchmark is created to compare the performance of the proposed system. The benchmark analysis focuses on the bending of the spine. The bend is indicated by the angle between the line joining key points 0 and 8 (as illustrated in Fig. 2(a)) obtained from OpenPose and the vertical axis. A k-nearest neighbour classifier is then trained on all the training sequences and evaluated on the test set, similar to the proposed system. The benchmark system is designed under the assumption that, even when using an RGB camera, the captured spine contains sufficient information to classify the spinal deformity. The results of the evaluation are reported in Table 1.

Table 1. Performance of the systems on the spinal deformity dataset.

System             Accuracy (%)
Benchmark          82.23
Proposed system    91.97
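The following sketch illustrates the benchmark described above: a spine angle computed from key points 0 and 8 of the OpenPose output, fed to a k-nearest-neighbour classifier (scikit-learn is used here). The sign convention, the per-sequence averaging of the pose, and the value of k are assumptions, as the paper does not specify them.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def spine_angle(kp_frame):
    """Angle (degrees) between the line joining key points 0 and 8 and the
    vertical image axis. kp_frame is a (25, 2) array of (x, y) coordinates."""
    head, hip = kp_frame[0], kp_frame[8]
    dx, dy = head[0] - hip[0], head[1] - hip[1]
    return np.degrees(np.arctan2(dx, -dy))  # image y grows downward; 0 = upright

def sequence_angle(sequence):
    """Spine angle of the mean pose of one (num_frames, 50) key-point sequence."""
    return spine_angle(sequence.reshape(len(sequence), 25, 2).mean(axis=0))

def train_benchmark(train_sequences, train_labels, k=3):
    angles = np.array([[sequence_angle(s)] for s in train_sequences])
    return KNeighborsClassifier(n_neighbors=k).fit(angles, train_labels)
```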

The results suggest that gait is far more effective than the spine angle data alone. Using the benchmark, the mean angles associated with the Kyphosis, Lordosis and normal training data are −9.1°, 10.8° and −1.2°, respectively. However, each of them has a standard deviation of approximately ±6°. Thus, as many as 35% of the normal sequences get classified as Kyphosis (see Table 2). The problem can become severe if all the frames at the beginning and end of the video sequence are considered for processing. As the position of the people in the field of view of the camera changes, the skeleton-like structure of the key points may appear skewed (see Fig. 2). The proposed system addresses this issue by learning the change in view of the people captured by the RGB camera.

Table 2. Confusion matrix for the benchmark system.

                         Prediction
Ground truth    Kyphosis    Lordosis    Normal
Kyphosis        91.25       0.31        8.44
Lordosis        0.63        94.53       4.84
Normal          35.31       3.75        60.94

Some misclassifications can also occur because some people have a slouching posture while walking normally. In such cases, the spine angle might not be a reliable feature. The proposed system is an improvement over the benchmark system as it does not limit its focus to just the spine angle. Kyphosis and Lordosis cause the centre of gravity to shift forward or backwards due to the bending of the spine, which limits the body's overall movement while walking. A similar effect can also be seen in the legs, where the discomfort caused by the spinal deformities restricts their movement. The proposed system uses the body and leg movement information to improve its classification accuracy. It should also be noted that the benchmark system makes misclassifications across Kyphosis, Lordosis and the normal gait. In contrast, the proposed system has a precision of 100% in classifying Lordosis (see Table 3). Thus, it can be concluded that the use of gait by the proposed system allows it to obtain better results.

Table 3. Confusion matrix for the proposed system.

                         Prediction
Ground truth    Kyphosis    Lordosis    Normal
Kyphosis        85.16       0           14.84
Lordosis        0           97.81       2.19
Normal          7.03        0           92.97

5 Conclusions

The use of an RGB camera for gait analysis has gained significant interest in recent years. Several systems have been proposed that use ML to classify different types of gait. Most of these systems focus on the movement of the feet and leg regions. Since gait also captures posture information, we propose a BiLSTM-based system to classify gait as Kyphosis, Lordosis or normal. The system processes a video sequence using OpenPose to obtain important key points on people's bodies. It then trains on the key points to perform classification. The proposed system addresses a novel problem not yet discussed by the research community, to the best of our knowledge. Thus, to test the proposed system, we captured a dataset containing 29 people simulating Kyphosis, Lordosis and normal gait under the guidance of an orthopaedic surgeon. We also created a benchmark system using the orthopaedic surgeon's inputs. The results highlight the advantage of using gait for classification over the spine angle. This paper can be considered a pilot study. Possible future directions include improving the system to assess the severity of the spinal deformity. The domain of explainable artificial intelligence can also be explored so that the proposed system can offer preliminary diagnoses to people and useful insights to medical professionals. Finally, the dataset can be improved by enrolling patients suffering from spinal deformities.

Acknowledgement. We thank the orthopaedic surgeon Dr. Hemant Patil for helping us construct the spinal deformity dataset.

References

1. Ailon, T., Shaffrey, C.I., Lenke, L.G., Harrop, J.S., Smith, J.S.: Progressive spinal kyphosis in the aging population. Neurosurgery 77(suppl 1), S164–S172 (2015)
2. Albuquerque, P., Machado, J.P., Verlekar, T.T., Correia, P.L., Soares, L.D.: Remote gait type classification system using markerless 2D video. Diagnostics 11(10), 1824 (2021)
3. Albuquerque, P., Verlekar, T.T., Correia, P.L., Soares, L.D.: A spatiotemporal deep learning approach for automatic pathological gait classification. Sensors 21(18), 6202 (2021)
4. Cao, Z., Simon, T., Wei, S.E., Sheikh, Y.: Realtime multi-person 2D pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017)
5. Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Comput. 9(8), 1735–1780 (1997)
6. Jellish, J., et al.: A system for real-time feedback to improve gait and posture in Parkinson's disease. IEEE J. Biomed. Health Inform. 19(6), 1809–1819 (2015)
7. Kerrigan, C.K.: Gait Analysis in The Science of Rehabilitation, vol. 2. Diane Publishing, Collingdale (2000)
8. Kozlowski, L.T., Cutting, J.E.: Recognizing the sex of a walker from a dynamic point-light display. Percept. Psychophys. 21(6), 575–580 (1977)


9. Lee, H.M., Kim, Y.J., Cho, J.B., Jeon, J.Y., Kim, K.G.: Computer-aided diagnosis for determining sagittal spinal curvatures using deep learning and radiography. J. Digital Imaging 35, 1–14 (2022)
10. Loureiro, J., Correia, P.L.: Using a skeleton gait energy image for pathological gait classification. In: 2020 15th IEEE International Conference on Automatic Face and Gesture Recognition (FG 2020), pp. 503–507. IEEE (2020)
11. Makihara, Y., Nixon, M.S., Yagi, Y.: Gait recognition: databases, representations, and applications. In: Computer Vision: A Reference Guide, pp. 1–13. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-03243-2_883-1
12. Milne, J., Lauder, I.: Age effects in kyphosis and lordosis in adults. Ann. Hum. Biol. 1(3), 327–337 (1974)
13. Muro-De-La-Herran, A., Garcia-Zapirain, B., Mendez-Zorrilla, A.: Gait analysis methods: an overview of wearable and non-wearable systems, highlighting clinical applications. Sensors 14(2), 3362–3394 (2014)
14. Nieto-Hidalgo, M., Ferrández-Pastor, F.J., Valdivieso-Sarabia, R.J., Mora-Pascual, J., García-Chamizo, J.M.: Vision based gait analysis for frontal view gait sequences using RGB camera. In: García, C.R., Caballero-Gil, P., Burmester, M., Quesada-Arencibia, A. (eds.) UCAmI 2016. LNCS, vol. 10069, pp. 26–37. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-48746-5_3
15. Nieto-Hidalgo, M., García-Chamizo, J.M.: Classification of pathologies using a vision based feature extraction. In: Ochoa, S.F., Singh, P., Bravo, J. (eds.) UCAmI 2017. LNCS, vol. 10586, pp. 265–274. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-67585-5_28
16. Ortells, J., Herrero-Ezquerro, M.T., Mollineda, R.A.: Vision-based gait impairment analysis for aided diagnosis. Med. Biol. Eng. Comput. 56(9), 1553–1564 (2018). https://doi.org/10.1007/s11517-018-1795-2
17. Rupprechter, S., et al.: A clinically interpretable computer-vision based method for quantifying gait in Parkinson's disease. Sensors 21(16), 5437 (2021)
18. Schuster, M., Paliwal, K.K.: Bidirectional recurrent neural networks. IEEE Trans. Signal Process. 45(11), 2673–2681 (1997)
19. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)
20. Slijepcevic, D., et al.: Automatic classification of functional gait disorders. IEEE J. Biomed. Health Inform. 22(5), 1653–1661 (2017)
21. Troje, N.F.: Decomposing biological motion: a framework for analysis and synthesis of human gait patterns. J. Vis. 2(5), 2 (2002)
22. Verlekar, T.T., De Vroey, H., Claeys, K., Hallez, H., Soares, L.D., Correia, P.L.: Estimation and validation of temporal gait features using a markerless 2D video system. Comput. Methods Programs Biomed. 175, 45–51 (2019)
23. Verlekar, T.T., Soares, L.D., Correia, P.L.: Automatic classification of gait impairments using a markerless 2D video-based system. Sensors 18(9), 2743 (2018)
24. Verlekar, T.T., Correia, P.L., Soares, L.D.: Using transfer learning for classification of gait pathologies. In: 2018 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2376–2381. IEEE (2018)
25. Wahab, Y., Bakar, N.A.: Gait analysis measurement for sport application based on ultrasonic system. In: 2011 IEEE 15th International Symposium on Consumer Electronics (ISCE), pp. 20–24. IEEE (2011)


26. Wang, K., et al.: Differences between gait on stairs and flat surfaces in relation to fall risk and future falls. IEEE J. Biomed. Health Inform. 21(6), 1479–1486 (2017)
27. Wang, W., Ackland, D.C., McClelland, J.A., Webster, K.E., Halgamuge, S.: Assessment of gait characteristics in total knee arthroplasty patients using a hierarchical partial least squares method. IEEE J. Biomed. Health Inform. 22(1), 205–214 (2017)

TSCom-Net: Coarse-to-Fine 3D Textured Shape Completion Network

Ahmet Serdar Karadeniz, Sk Aziz Ali(B), Anis Kacem, Elona Dupont, and Djamila Aouada

SnT, University of Luxembourg, 1511 Kirchberg, Luxembourg
{AhmetSerdar.Karadeniz,SkAziz.Ali,Anis.Kacem,Elona.Dupont,Djamila.Aouada}@uni.lu

Abstract. Reconstructing 3D human body shapes from 3D partial textured scans remains a fundamental task for many computer vision and graphics applications – e.g., body animation and virtual dressing. We propose a new neural network architecture for 3D body shape and high-resolution texture completion – TSCom-Net – that can reconstruct the full geometry from mid-level to high-level partial input scans. We decompose the overall reconstruction task into two stages – first, a joint implicit learning network (SCom-Net and TCom-Net) that takes a voxelized scan and its occupancy grid as input to reconstruct the full body shape and predict vertex textures. Second, a high-resolution texture completion network that utilizes the predicted coarse vertex textures to inpaint the missing parts of the partial 'texture atlas'. A thorough experimental evaluation on the 3DBodyTex.V2 dataset shows that our method achieves competitive results with respect to the state-of-the-art while generalizing to different types and levels of partial shapes. The proposed method has also ranked second in Track 1 of the SHApe Recovery from Partial textured 3D scans (SHARP [2,37]) 2022 challenge (https://cvi2.uni.lu/sharp2022/).

Keywords: 3D Reconstruction · Shape completion · Texture-inpainting · Implicit function · Signed distance function

1 Introduction

3D textured body shape completion and reconstruction from visual sensors plays a key role in a wide range of applications such as gaming, fashion, and virtual reality [30,31]. The challenges of this task and the methods to tackle it vary depending on the sensors used to capture the human shape and texture. Some existing methods focus mainly on 3D textured body shape reconstruction from a single monocular image [17,19,22,42,49]. For instance, under the body shape symmetry assumption, methods like Human Mesh Recovery (HMR) [18] use a statistical body model [25] to reconstruct the body shape and pose. Despite the impressive advances in this line of work [22,42,49], the reconstructed shapes still cannot capture high-level geometrical details, such as wrinkles and ears. This is due to the lack of original geometrical information and the projective distortions in 2D images.


Fig. 1. 3D Partial Textured Body Shape Completion. The proposed deep learning-based method reconstructs and completes the surface geometry of a dressed or minimally-clothed partial body scan, and inpaints the missing regions with high-resolution texture. TSCom-Net has the flexibility to either apply dense texture mapping or vertex-based color estimation as per the desired application scenario.

On the other hand, 3D scanners and RGB-D sensors can provide richer information about the geometry of body shapes [41]. Nevertheless, they are often subject to partial acquisitions due to self-occlusions, restricted sensing range, and other limitations of scanning systems [8,46]. While most existing works focus on completing the geometry of body shapes from 3D partial scans [8,38,46], less interest has been dedicated to completing both the texture and the geometry at the same time. Nonetheless, this problem remains critical in real-world applications where complete and realistic human reconstructions are usually required. Aware of this need, recent competitions, such as the SHApe Recovery from Partial Textured 3D Scans (SHARP) challenges [1,3,37], emerged in the research community to foster the development of joint shape and texture completion techniques from partial 3D acquisitions. The results of these competitions showed promising techniques [9,36] with room for improvement.

The problem of textured 3D body shape completion from 3D partial scans, as shown in Fig. 1, can naturally be decomposed into two challenging subtopics – (i) partial shape surface completion and (ii) partial texture completion of the reconstructed shape. In this setup, especially when no fixed UV-parametrization of the shape is available, methods need to rely on inherent shape structure and texture correlation cues from the available body parts. Taking this direction, IFNet-Texture [9] uses implicit functions to perform high-quality shape completion and vertex-based texture predictions. Despite the impressive results in the SHARP challenges [1,37], the method [9] does not output high-resolution texture and often over-smooths vertex colors over the whole shape. On the other hand, the 3DBooSTeR method [36] decouples the problems of shape and texture completion and solves them in a sequential model. The shape completion of [36] is based on body models [12], while the texture completion is tackled as an image-based texture-atlas inpainting. [36] can complete high-resolution partial texture, but suffers from large shape and texture modelling artifacts.

Our method overcomes the weaknesses of both [9] and [36]. The shape completion part of our TSCom-Net is designed to learn a continuous implicit function representation [28,29,48], as also followed by Chibane et al. [8]. Furthermore, we identify that the IFNet-Texture method [9], an extension of [8], uses a vertex-based inference-time texture estimation process that ignores the shape and color feature correlation cues at the learning phase. For this reason, we employ end-to-end trainable vertex-based shape and texture completion networks – SCom-Net and TCom-Net – to boost their performance with joint learning. Next, we propose to refine the predicted vertex-based texture completions directly in the texture-atlas of 3D partial scans. This is achieved using an image inpainting network [23,36], reusing the predicted vertex textures, and yielding high-resolution texture. It is notable that, unlike [20,36], TSCom-Net does not require a fixed UV-parametrization, which makes the considered 'texture charts' random, discontinuous, and more complex in nature across the samples. Overall, our contributions can be summarized as follows:

1. An end-to-end trainable joint implicit function network for 3D partial shape and vertex texture completion. We propose an early fusion scheme of shape and texture features within this network.
2. A high-resolution texture-atlas inpainting network, which uses partial convolutions [23] to refine the predicted vertex textures. At the same time, this module is flexible and can be plugged with any other 3D shape and vertex texture reconstruction module.
3. An extensive experimental evaluation of TSCom-Net showing its multi-stage improvements, its comparison w.r.t. the participants in the SHARP 2022 challenge [3], and its generalization capabilities to different types of input data partiality.

The rest of the paper is organized as follows. After summarizing the methods related to our line of work in Sect. 2, we present the core components of TSCom-Net in Sect. 3. The experiments and evaluation are reported and discussed in Sect. 4. Finally, Sect. 5 concludes the paper and outlines future work.

2 Related Works

Deep Implicit Function and 3D Shape Representation Learning. Supervised learning methods for 3D shape processing require an efficient input data representation. Apart from common learning representations of 3D shapes – such as regular voxel grids [24,49], point clouds [32], octrees [44], Barnes-Hut 2D-tree [5], depth-maps [27], and meshes [16] – the popularity of implicit representations [28,29,48] has recently increased. These representations [29,48] serve as a continuous representation over the volumetric occupancy grid [28] to encode the iso-surface of a shape. Therefore, given any query 3D point, it returns its binary occupancy in the encoded iso-surface. This continuous representation is extremely useful when the input shapes are partial, since spatial locations in missing regions can be queried based on the available regions.

Learning-Based 3D Shape Completion. 3D reconstruction and completion of shapes, especially for human 3D body scans, is tackled in different ways [13–15,39,40,43,46,47]. Some methods deal with partial point clouds as input [13,15,46,47]. A subset of these methods do not consider textured scans and focus on different types of shape quantization – e.g., voxel grids [15], octrees, and sparse vectors [46] – with the aim of capturing only fine geometric details. Furthermore, many of these models [7,13,46] cannot always predict a single shape corresponding to the partial input shape. Another traditional way for body shape completion from a set of partial 3D scans is by non-rigidly fitting a template mesh [4,6,11]. However, improper scaling [4], computational speed [6], and articulated body pose matching [11] are major problems for these methods. More closely related works to ours are [8,9,36]. Saint et al. [36] and Chibane et al. [9] address the same 3D textured shape completion problem, where the former uses 3D-CODED [12] and non-rigid refinement to reconstruct the shape and the latter uses deep implicit functions [8]. While [36] provides a fixed UV-parametrization useful for the texture completion, it cannot recover extreme partial body scans due to its restriction to a template body model [25]. On the other hand, [8] can recover coarse-to-fine geometrical details using multiscale implicit representations. However, the implicit representations cannot preserve the UV-parametrization of the input partial shape, which restricts the texture completion to a vertex-based solution [9]. TSCom-Net builds on top of the implicit shape representation of [8,9] and extends it to produce higher-resolution texture.

Learning-Based 3D Textured Shape Completion. Completing the texture of 3D body shapes is a challenging task due to many factors – e.g., varying shades, unknown clothing boundaries, and complex styles and stripes. This task can either be tackled by completing the RGB texture for each vertex of the completed mesh [9], or by directly completing the texture-atlas [10,36]. While the former takes advantage of the shape structure of the vertices to predict textures, its resolution depends on the resolution of the shape, which is often limited due to memory constraints. On the other hand, texture-atlas completion allows for high-resolution texture generation, but does not take advantage of the structure of the shape when no fixed UV-parametrization is given [20,36], or when multiple charts are provided [10,36]. To address these limitations, TSCom-Net uses both paradigms. Firstly, it completes the vertex-based textures, then refines them in the texture-atlas image space using a dedicated inpainting module.

3 Proposed Approach – TSCom-Net

Given a 3D partial body scan X = (V, T, E), a tuple defining the set of vertices V, triangles T, and edges E, along with its corresponding partial texture A, our aim is to recover the 3D complete body scan Y with high-resolution texture.


Fig. 2. Overview of TSCom-Net. TSCom-Net has two main components – (I) an end-to-end trainable vertex-based shape and color completion network (on the left) that returns low-resolution and often over-smoothed vertex colors. The underlying architecture of this module is based on IFNet [8] and IFNet-Texture [9], along with newly added modifications to boost both shape and color completion jointly. (II) A coarse-to-fine texture inpainting module (using 2D partial convolutions [23]) that takes the partially textured and fully reconstructed body shape, the background masks, and the coarse texture and its masks for the missing regions as input, and outputs a high-resolution, complete 'texture atlas'.

To achieve that, we propose a deep learning-based framework, TSCom-Net, with two intermediate stages (see Fig. 2). First, we reconstruct the shape and coarse vertex texture of the given partial scan. Then, we further inpaint the partial texture atlas with another network in a coarse-to-fine manner.

3.1 Joint Shape and Texture Completion

Let X_s ∈ R^{N×N×N} be the discretized voxel grid of the input partial scan X with resolution N. Each occupied voxel has the value of 1 (0 otherwise). Both the ground-truth and partial scans are normalized. Similarly, another voxel grid, X_c, is constructed with the texture of the input partial scan X. Each occupied voxel of X_c has an RGB texture value in [0, 1]^3 obtained from the partial texture (−1, −1, −1 otherwise).

TSCom-Net consists of two implicit feature networks, SCom-Net and TCom-Net, that learn to complete the shape and texture of a partial surface, respectively. Different from [9], the two sub-networks are trained jointly to enable the interaction between shape and texture learning. An early fusion technique is utilized to improve the color accuracy by incorporating the shape features in addition to the color features of the partial scan. Consequently, the texture network TCom-Net learns to use the shape information extracted by SCom-Net to predict a more accurate texture. In what follows, we start by describing the shape completion, then present the texture completion.


Shape Completion: The input discretized voxel grid X_s is encoded as in [8] using a 3D convolutional encoder g_s to obtain a set of n multiscale shape features,

    g_s(X_s) := S_1, S_2, ..., S_n,                                          (1)

where S_i ∈ R^{d_i × K × K × K}, 1 ≤ i ≤ n, denote the shape features with resolution K = N / 2^{i−1} and channel dimension d_i = d_1 × 2^{i−1}. Accordingly, S_n has the lowest resolution but the highest channel dimensionality among the shape features. Given the features S_i, the completion of the shape is achieved by predicting the occupancies of the query points {p_j}_{j=1}^{M}, where p_j ∈ R^3. At training time, the points p_j are sampled from the ground-truth Y. During inference, the p_j are the centroids of all voxels in X_s. Multiscale features S^p_{i,j} := φ(S_i, p_j) can be extracted for each point p_j using a grid sampling function φ, taking and flattening the features of S_i around the neighborhood of p_j. Finally, p_j and S^p_{i,j} are decoded with f_s, consisting of sequential 1D convolutional layers, to predict the occupancy value s_j of p_j,

    f_s(p_j, S^p_{1,j}, S^p_{2,j}, ..., S^p_{n,j}) := s_j.                    (2)

The completed mesh structure X̂ is obtained by applying marching cubes [26] on the voxel grid X_s with the predicted occupancy values s_j.

Texture Completion: For the texture completion, the colored voxel grid X_c is encoded using a 3D convolutional encoder g_c to obtain a set of m multiscale texture features,

    g_c(X_c) := C_1, C_2, ..., C_m,                                          (3)

where C_i ∈ R^{r_i × L × L × L}, 1 ≤ i ≤ m, denote the texture features with resolution L = N / 2^{i−1} and channel dimension r_i = r_1 × 2^{i−1}. Texture completion is carried out by predicting the RGB values of the query points {q_j}_{j=1}^{M}, where q_j ∈ R^3. The points q_j are sampled from the ground-truth Y during training. At inference time, the q_j are the vertices of the reconstructed mesh X̂. Similar to the shape completion, the grid sampling function φ(C_i, q_j) allows extracting multiscale texture features C^q_{i,j} for each point q_j. The shape and texture completions are fused at the texture decoder level. This is achieved by concatenating the multiscale shape and texture features before feeding them to a decoder f_c consisting of sequential 1D convolutional layers. In particular, the RGB values c_j for each point q_j are given by

    f_c(q_j, S^q_{1,j}, S^q_{2,j}, ..., S^q_{n,j}, C^q_{1,j}, C^q_{2,j}, ..., C^q_{m,j}) := c_j.    (4)

Finally, vertex textures are obtained by attaching the predicted c_j to X̂.
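To make the grid sampling function φ and the point decoders f_s and f_c more concrete, the PyTorch sketch below samples multiscale 3D feature grids at continuous query points with F.grid_sample and decodes the concatenated per-point features with 1 × 1 Conv1D layers. This is an illustrative re-implementation in the spirit of IF-Net [8], not the authors' code; neighbourhood flattening and displacement sampling are omitted for brevity, and the query points are assumed to be normalised to [−1, 1]^3 in grid_sample's (x, y, z) ordering.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def sample_point_features(feature_grids, points):
    """phi: trilinearly sample each 3D feature grid at the query points.

    feature_grids: list of tensors of shape (B, C_i, K_i, K_i, K_i)
    points:        (B, M, 3) query coordinates in [-1, 1]^3
    returns:       (B, sum_i C_i, M) concatenated per-point features
    """
    grid = points.view(points.size(0), 1, 1, -1, 3)        # treat the M points as a 1x1xM volume
    sampled = []
    for feat in feature_grids:
        s = F.grid_sample(feat, grid, align_corners=True)  # (B, C_i, 1, 1, M)
        sampled.append(s.view(feat.size(0), feat.size(1), -1))
    return torch.cat(sampled, dim=1)

class PointDecoder(nn.Module):
    """Sequential 1x1 Conv1D decoder in the spirit of f_s / f_c (Eqs. (2) and (4));
    the widths follow the c1-512, c1-256, c1-256, c1-out description in Sect. 4.1."""
    def __init__(self, in_dim, out_dim=1):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(in_dim, 512, 1), nn.ReLU(),
            nn.Conv1d(512, 256, 1), nn.ReLU(),
            nn.Conv1d(256, 256, 1), nn.ReLU(),
            nn.Conv1d(256, out_dim, 1),                    # occupancy (1) or RGB (3)
        )

    def forward(self, point_features):                     # (B, in_dim, M)
        return self.net(point_features)                    # (B, out_dim, M)
```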

3.2 Texture Refinement

The proposed joint shape and texture networks are able to predict the vertex textures of a completed mesh. However, the predicted vertex textures have two limitations: (1) they are predicted for the full body, including the existing regions of the partial body. Consequently, high-level texture details of the input partial body might be lost in the predicted textured body mesh; (2) the resolution of the vertex textures depends on the resolution of the reconstructed shape. This implies that high-resolution vertex textures come at the cost of a high-resolution predicted shape, which is not straightforward to obtain due to memory constraints.

Fig. 3. Texture Refinement. (a) the texture-atlas is transferred to the completed 3D shape, (b) the masks for the missing regions and background are identified, (c) the vertex textures are projected into the transferred texture-atlas, (d) the masks for missing regions are updated by unmasking the regions of the projected vertex textures, (e) the final inpainted texture-atlas.

To overcome these issues, we use a texture-atlas [21,36] based refinement of the predicted vertex textures. This refinement reuses the coarse vertex textures predicted by the joint implicit shape and texture network and refines them in the 2D image space, while preserving the original texture from the partial input scan. In particular, the original texture of the partial scan is transferred to the completed shape by a ray-casting algorithm, as in [36]. This allows the creation of a UV map and a texture atlas A for the completed mesh, as depicted in Fig. 3(a). Following [36], the missing regions and the background regions are identified in the transferred texture atlas to create the two binary masks M and M_b shown in Fig. 3(b). Using the UV map created by the texture transfer, the vertex textures are projected to obtain a coarsely completed texture-atlas A_c, as sketched in Fig. 3(c). The mask for missing regions M is then updated with the projected vertex textures, yielding a coarse mask M_c, as displayed in Fig. 3(d).

Given the coarsely completed texture-atlas A_c, the coarse mask of missing regions M_c, and the background mask M_b, the problem of texture refinement is formulated as an image inpainting one. Specifically, we opt for the texture-atlas inpainting method proposed in [36], which adapts partial convolution inpainting [23] to the context of texture-atlas. Partial convolutions in [23,36] extend standard convolutions to convolve the information from unmasked regions (i.e., white regions in the binary masks). Formally, let us consider a convolution filter defined by the weights w, the bias b, and the feature values A^w_c of the texture-atlas A_c for the current sliding window. Given the coarse mask of missing regions M_c and the corresponding background mask M_b, the partial convolution at every location, similarly defined in [36], is expressed as

    a_c = w^T (A^w_c ⊙ M_c ⊙ M_b) · sum(1) / sum(M_c ⊙ M_b) + b,   if sum(M_c ⊙ M_b) > 0,
    a_c = 0,                                                        otherwise,              (5)

where ⊙ denotes element-wise multiplication, and 1 has the same shape as M_c but with all elements being 1. As proposed in [36], the masks M_c are updated after every partial convolution, while the background masks are passed to all partial convolution layers without being updated, by applying do-nothing convolution kernels. The partial convolutional layers are employed in a U-Net architecture [33] instead of standard convolutions. At training time, the vertex textures are sampled from the ground-truth texture-atlas. The same loss functions as in [23,36] are used to train the network. It is important to highlight that the proposed texture refinement is different from [36] as it reuses the predicted vertex textures instead of inpainting the texture-atlas from scratch. We show in Sect. 4 that the proposed refinement outperforms both inpainting from scratch and the vertex-texture-based completion. Furthermore, we reveal that such refinement can be used to improve other vertex-based texture completion methods.
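A minimal single-channel-mask sketch of such a partial convolution layer is shown below, assuming PyTorch. It follows the re-weighting and mask-update rules of Eq. (5) and [23]; in TSCom-Net the mask passed in would be the product M_c ⊙ M_b, with M_b kept fixed by the caller. Per-channel masks, grouped convolutions and the U-Net wiring of [33,36] are omitted.

```python
import torch
import torch.nn.functional as F
from torch import nn

class PartialConv2d(nn.Conv2d):
    """Convolution restricted to valid (unmasked) pixels, with mask update."""

    def forward(self, x, mask):
        # x: (B, C, H, W) features; mask: (B, 1, H, W) with 1 = valid pixel.
        with torch.no_grad():
            ones = torch.ones(1, 1, *self.kernel_size, device=x.device, dtype=x.dtype)
            valid = F.conv2d(mask, ones, stride=self.stride,
                             padding=self.padding, dilation=self.dilation)
        out = super().forward(x * mask)                  # w^T (X . M) + b
        scale = ones.numel() / valid.clamp(min=1e-8)     # sum(1) / sum(M) re-weighting
        if self.bias is not None:
            b = self.bias.view(1, -1, 1, 1)
            out = (out - b) * scale + b
        else:
            out = out * scale
        new_mask = (valid > 0).to(x.dtype)               # mask update rule
        return out * new_mask, new_mask
```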

4 Experimental Results

Dataset and Evaluation Metrics. Our method was trained on the 3DBodyTex.v2 dataset [34,35,37], which has been recently used as a benchmark for the SHARP challenge [3]. The dataset contains a large variety of human poses and different clothing types from 500 different subjects. Each subject is captured with 3 different poses and different clothing types, such as close-fitting or arbitrary casual clothing. The number of ground truth scans in the training set is 2094. A number of 15904 partial scans for training and validation were generated using the routines provided by the SHARP challenge organizers [2]. The number of unseen (during training) scans in the evaluation set is 451.

As considered in the SHARP challenges, the evaluation is conducted in terms of shape scores S_s, texture scores S_t, and area scores S_a. Shape and texture scores are calculated via measuring surface-to-surface distances by sampling points from the ground truth and the reconstructed meshes. The area score estimates the similarity between the triangle areas of the meshes. The final score S_r is calculated as S_r = (1/2) S_a (S_s + S_t). Details about these metrics can be found in [2,37]¹. The evaluation of the completed meshes is performed via the CodaLab system provided by the SHARP challenge².
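For clarity, the per-scan score aggregation reads as follows (a trivial sketch; note that the tables report averages of per-scan values, so the identity need not hold exactly for the reported means):

```python
def final_score(shape_score, texture_score, area_score):
    """S_r = 0.5 * S_a * (S_s + S_t), computed per scan with scores in [0, 1]."""
    return 0.5 * area_score * (shape_score + texture_score)
```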

4.1 Network Training Details

We trained the joint implicit networks using the Adam optimizer with a learning rate of 10⁻⁴ for 54 epochs. The model was trained on an NVIDIA RTX A6000 GPU using the PyTorch library. The query points for training are obtained from the ground truth surfaces by sampling 100000 points. During training, we sub-sample 50000 of these points at each iteration. Gaussian random noise N(σ) is added to each point to move the sampled point near or far from the surface, depending on the σ value. Similar to [9], σ is chosen as 0.01 for half of the points and 0.1 for the other half. We did not add noise to the query points of the texture network, as its goal is to predict the color value of points sampled from the surface. Partial scans are voxelized by sampling 100000 points from the partial surface and setting the occupancy value of the nearest voxel to 1. Similarly, for the colored voxelization, the value of the nearest voxel is set to the RGB value of the sampled point obtained from the corresponding texture-atlas. The input voxel resolution is 128 and the resolution for the final retrieval is 256.

We follow a similar naming convention as in [45] for our architecture details. Let c3-k denote a Conv3D-ReLU-BatchNorm block and d3-k denote two Conv3D-ReLU blocks followed by one BatchNorm layer, where k is the number of filters. The kernel size for all 3D convolutional layers is 3 × 3. Let gs-i represent the grid feature sampling of the query points from the output of the previous layer, and mp denote a 3D max-pooling layer. Let c1-k denote a Conv1D layer with 1 × 1 kernel, where k is the number of output features, with ReLU activation except for the output layer, where the activation is linear. The encoder architecture for the shape and texture networks is composed of the following layers: gs-0, c3-16, gs-1, mp, d3-32, d3-32, gs-2, mp, d3-64, d3-64, gs-3, mp, d3-128, gs-4, mp, d3-128, d3-128, gs-5³. The decoder architecture for the shape network contains: c1-512, c1-256, c1-256, c1-1. The decoder architecture for the texture network consists of: c1-512, c1-256, c1-256, c1-3. The partial convolutional network is trained with the Adam optimizer and a learning rate of 10⁻⁴ for 330000 iterations. The original texture size of 2048 × 2048 was used for training the network with a batch size of 1.
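The voxelization step described above can be sketched as follows; trimesh is used here only for illustration, and the color lookup assumes per-vertex colors rather than the texture-atlas sampling used in the paper. Scans are assumed to be normalised to the unit cube.

```python
import numpy as np
import trimesh

def voxelize_partial_scan(mesh, n=128, n_points=100_000):
    """Build the occupancy grid X_s and the colored grid X_c from a partial scan."""
    points, face_idx = trimesh.sample.sample_surface(mesh, n_points)
    idx = np.clip((points * n).astype(int), 0, n - 1)        # nearest voxel per sample

    x_s = np.zeros((n, n, n), dtype=np.float32)              # occupancy grid
    x_c = -np.ones((n, n, n, 3), dtype=np.float32)           # color grid, -1 = empty
    x_s[idx[:, 0], idx[:, 1], idx[:, 2]] = 1.0

    vertex_colors = np.asarray(mesh.visual.vertex_colors)[:, :3] / 255.0
    sample_colors = vertex_colors[mesh.faces[face_idx]].mean(axis=1)
    x_c[idx[:, 0], idx[:, 1], idx[:, 2]] = sample_colors
    return x_s, x_c
```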

4.2 Results and Evaluation

In this section, a qualitative and quantitative analysis of the results of TSCom-Net against the results of the SHARP challenge participants is presented.

Footnotes:
1. The details about the metrics can be accessed via https://gitlab.uni.lu/cvi2/cvpr2022-sharp-workshop/-/blob/master/doc/evaluation.md
2. The leaderboard on CodaLab can be accessed via https://codalab.lisn.upsaclay.fr/competitions/4604#results
3. Sampling features gs-1, gs-2, ..., gs-5 are flattened and concatenated. Shape features are also added here for the texture encoding.


Table 1. Quantitative Results for SHARP 2022. The best and second best scores are denoted in bold-black and bold-gray colors respectively.

Method              Shape Score (%)   Area Score (%)   Texture Score (%)   Final Score (%)
IFNet-Texture [9]   85.44 ± 2.93      96.26 ± 6.35     81.25 ± 7.61        83.34 ± 6.86
Method Raywit       85.91 ± 7.14      93.96 ± 3.96     83.45 ± 8.43        84.68 ± 7.63
Method Rayee        86.13 ± 7.32      96.26 ± 3.61     83.23 ± 8.31        84.68 ± 7.74
Method Janaldo      89.76 ± 4.97      96.76 ± 2.28     87.10 ± 6.33        88.43 ± 5.56
TSCom-Net (Ours)    85.75 ± 6.15      96.68 ± 2.89     83.72 ± 6.95        84.73 ± 6.50

Other participants of the challenge employ implicit networks for the shape reconstruction with vertex-based texture completion [3]. We also compare our method to the state-of-the-art IFNet-Texture [9] method, which won the previous editions of SHARP in 2020 and 2021 [37]. The shape network of [9] is re-trained with the generated partial data of SHARP 2022. For the vertex texture predictions, the available pretrained model provided by the authors was used.

Quantitative Evaluation: In Table 1, we illustrate the quantitative results using the metrics of the SHARP challenge [37]. Overall, our method is ranked second in the leaderboard with a final score of 84.73%, obtaining a better texture score than Method-Rayee, Method-Raywit and IFNet-Texture [9]. It should be noted that the texture score depends on the shape score as well, since it is not possible to get a correct texture score for an incorrectly predicted shape.

Table 2. Effectiveness of our texture refinement.

Method                                      Texture score (%)
Method-Janaldo                              87.10 ± 6.33
Method-Janaldo + Our texture refinement     87.54 ± 6.19

Furthermore, we demonstrate the effectiveness of our texture refinement method by applying it to the predictions of Method-Janaldo. The results in Table 2 show that our inpainting network introduced in Sect. 3.2 improves the texture score by 0.44%. This shows that the proposed texture refinement can be used as an additional component to improve other shape and texture completion methods that predict low-resolution vertex textures. Finally, Fig. 4 shows the distributions of shape, texture, and final scores for the predictions of TSCom-Net, Method-Janaldo, Method-Janaldo with our texture refinement, and IFNet-Texture [9]. The first row of this figure illustrates the overall distribution of the scores and highlights the superiority of our texture scores w.r.t. IFNet-Texture [9]. While Method-Janaldo outperforms our results, it can be observed that if we endow it with our texture refinement scheme, the distribution of texture scores becomes slightly higher.


Fig. 4. Score Distribution. Distribution of shape, color and combined final scores (score/100 along the x-axes of the plots). The first row reports the distribution of the scores on all the scans. The second row focuses on casual-outfit scans while the third one reports the distributions on fitness-outfit scans.

In the second and third rows of Fig. 4, we report the distribution of the scores on scans with casual outfits and fitness outfits. Unsurprisingly, casual-outfit scans were more challenging than fitness-outfit ones for all methods. Nevertheless, our approaches recorded more improvements w.r.t. Method-Janaldo and IFNet-Texture on the casual-outfit scans than on the fitness-outfit ones. It is important to mention that the reported scores are the result of mapping distance values (shape or texture) to scores in percentage using a parametric function. This mapping might make the scores from different methods close to each other, depending on the chosen parameters. We note that the original parameters of the SHARP challenge 2022 metrics are used. Thus, we further conducted a qualitative evaluation of our method.

Qualitative Evaluation: Figure 5 shows a qualitative comparison of our approach to other competing methods on models with casual outfit and fitness outfit. Considering the missing coat region of the top-row model, it can be noted that IFNet-Texture [9] and Method-Raywit are unable to predict the correct colors. On the other hand, Method-Rayee and Method-Janaldo produce over-smoothed textures, creating blurry artifacts. Neither of these effects can be observed in the TSCom-Net predictions. In the second row, the bottom legs and feet appear to be the most difficult regions to recover. Method-Rayee and Method-Raywit fail to produce the correct shape for these missing regions. While IFNet-Texture [9] and Method-Janaldo are able to recover the correct shape, both generate white color artifacts on the jeans. Our method is the only one to produce more reasonable texture predictions, showing a sharper change in color between the jeans and the feet.


Fig. 5. Qualitative results. Visual comparisons of textured body shape completion results by different competing methods (as per Table 1). The first two rows depict results on samples with casual outfits, i.e., more variation on garment texture pattern. The last two rows depict results on samples with fitness outfits.


Similar to the second model, it is apparent that our results on the third model are sharper for the regions with a color change from skin to black when compared to the other results. The visual comparisons illustrate that our results are of higher resolution with better texture representation than the competing approaches, despite the close quantitative scores.

4.3 Ablation Study

Fig. 6. Multi-stage improvements of texture-atlas inpainting in TSCom-Net.

Table 3. Quantitative scores of TSCom-Net with multi-stage improvements.

Method                      Shape Score (%)   Area Score (%)   Texture Score (%)   Final Score (%)
IFNet-Texture [9]           85.44 ± 2.93      96.26 ± 6.35     81.25 ± 7.61        83.34 ± 6.86
Texture-transfer Baseline   85.75 ± 6.15      96.68 ± 2.89     56.51 ± 18.98       71.13 ± 11.11
Ours w/o Coarse Masks       85.75 ± 6.15      96.68 ± 2.89     81.04 ± 7.92        83.39 ± 6.87
Ours w/o Tex. Refine.       85.75 ± 6.15      96.68 ± 2.89     83.27 ± 7.08        84.53 ± 6.54
Ours w/ Bilinear Interp.    85.75 ± 6.15      96.68 ± 2.89     83.66 ± 6.95        84.71 ± 6.50
Ours w/o Partial Conv.      85.75 ± 6.15      96.68 ± 2.89     83.68 ± 6.96        84.71 ± 6.50
TSCom-Net (Ours)            85.75 ± 6.15      96.68 ± 2.89     83.72 ± 6.95        84.73 ± 6.50

Our joint implicit function learning for shape and texture (TSCom-Net w/o Texture Refinement) gives a 1.19% increase in the final score with respect to [9] (cf. Table 3), demonstrating the effect of early fusion between SCom-Net and TCom-Net. Only transferring the partial texture to the reconstructed shape, where the color values for the missing regions are all black, gives us a baseline texture score of 56.51%. Training an inpainting network directly on the partial textures (TSCom-Net w/o Coarse Masks) increases the texture score to 81.04%, which is 0.21% lower than [9] and 24.53% higher than the baseline.

302

A. S. Karadeniz et al.

bilinear interpolation of the vertex colors in missing regions is giving a texture score of 83.66%. All-in-all partial convolutions, instead of standard convolutions, gives stable and more sharper texture inpainting results. Figure 6 depicts how the different types artifacts or lack of sharpness appear when other options of inpainting are tested. Finally, TSCom-Net consisting of all the components is giving the highest final score of 84.73%. Table 4. Generalizability of TSCom-Net when trained and tested on samples with different types of partiality. Method
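The partial convolutions referenced above follow Liu et al. [23]. For illustration only, the sketch below is a minimal PyTorch rendering of the idea (a mask-aware convolution that re-normalises by the number of valid pixels in each window); it is not the exact layer used in TSCom-Net, and the class and argument names are our own.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PartialConv2d(nn.Module):
    """Mask-aware convolution in the spirit of Liu et al. [23]: hole pixels are
    ignored and outputs are re-scaled by the fraction of valid pixels per window."""

    def __init__(self, in_ch, out_ch, kernel_size=3, stride=1, padding=1):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, kernel_size, stride, padding)
        # All-ones kernel used only to count valid pixels in each window.
        self.register_buffer("ones", torch.ones(1, 1, kernel_size, kernel_size))
        self.win = kernel_size * kernel_size
        self.stride, self.padding = stride, padding

    def forward(self, x, mask):
        # x: B x C x H x W features; mask: B x 1 x H x W with 1 = valid, 0 = hole.
        with torch.no_grad():
            valid = F.conv2d(mask, self.ones, stride=self.stride, padding=self.padding)
        out = self.conv(x * mask)
        bias = self.conv.bias.view(1, -1, 1, 1)
        # Re-normalise by window_size / #valid inputs; zero positions with no valid input.
        scale = self.win / valid.clamp(min=1.0)
        out = (out - bias) * scale + bias
        new_mask = (valid > 0).float()
        return out * new_mask, new_mask


if __name__ == "__main__":
    layer = PartialConv2d(3, 16)
    img = torch.randn(1, 3, 64, 64)
    mask = (torch.rand(1, 1, 64, 64) > 0.3).float()   # random holes
    feat, new_mask = layer(img, mask)
    print(feat.shape, new_mask.shape)                  # both spatially 64 x 64
```

The returned mask marks positions that received at least one valid input, so stacking such layers gradually shrinks the hole region as inpainting proceeds.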

Table 4. Generalizability of TSCom-Net when trained and tested on samples with different types of partiality.
Method | Training partiality | Testing partiality | Shape score (%) | Texture score (%) | Final score (%)
IFNet-Texture [9] | T2 | T1 | 87.99 ± 4.65 | 84.32 ± 5.73 | 86.15 ± 5.09
IFNet-Texture [9] | T1 | T1 | 86.49 ± 3.96 | 89.28 ± 2.89 | 87.88 ± 3.32
3DBooSTeR [36] | T1 | T1 | 58.81 ± 14.99 | 72.31 ± 6.79 | 65.57 ± 3.32
TSCom-Net (Ours) | T2 | T1 | 88.05 ± 4.84 | 85.46 ± 5.56 | 86.75 ± 5.14

Generalization to Other Types of Partiality: In this section, we evaluate the generalization capability of our model to other types of partiality. In particular, we consider the view-based partiality (T2) introduced in SHARP 2022 [3] to train the models, and use the hole-based partiality (T1) introduced in previous editions of SHARP [37] for inference. This implies that the networks trained on (T2) have never seen hole-based partial scans (T1). Table 4 compares the generalization capability of our method with that of IFNet-Texture [9], which is also trained on (T2) and tested on the (T1) subset. Following these settings, we obtain an increase of 1.14% in the texture score compared to [9]. We also compare our model trained on (T2) and tested on (T1) to IFNet-Texture [9] and 3DBooSTeR [36], both trained on (T1) and tested on (T1). In this case, our approach significantly outperforms [36] while achieving comparable results to [9], although both were trained on (T1).

5 Conclusion

This paper presents a method for completing the shape and texture of 3D partial scans. Joint implicit feature networks are proposed for learning to complete the shape and texture. Moreover, a new coarse-to-fine texture refinement network was introduced. It generates a high-resolution texture from the predicted coarse vertex texture and the available partial texture. Experimental evaluations show that our method gives visually more appealing results than the state-of-the-art and is positioned second in the SHARP 2022 challenge. In the future, we plan to make all TSCom-Net modules end-to-end trainable for completing 3D scans and refining the texture. At the same time, we will investigate neural implicit radiance fields for texture completion (with an editable 2D UV texture map) and 3D surface reconstruction.


Acknowledgement. This work is partially funded by the National Research Fund (FNR), Luxembourg, under the project references BRIDGES21/IS/16849599/FREE3D and CPPP17/IS/11643091/IDform, and the industrial project grant IF/17052459/CASCADES by Artec 3D.

References
1. SHARP 2021, the 2nd shape recovery from partial textured 3D scans. https://cvi2.uni.lu/sharp2021/. Accessed 23 July 2022
2. SHARP 2022 Repository, the repository of the 3rd shape recovery from partial textured 3D scans. https://gitlab.uni.lu/cvi2/cvpr2022-sharp-workshop. Accessed 23 July 2022
3. SHARP 2022, the 3rd shape recovery from partial textured 3D scans. https://cvi2.uni.lu/sharp2022/. Accessed 23 July 2022
4. Ali, S.A., Golyanik, V., Stricker, D.: NRGA: gravitational approach for non-rigid point set registration. In: International Conference on 3D Vision (3DV) (2018)
5. Ali, S.A., Kahraman, K., Reis, G., Stricker, D.: RPSRNet: end-to-end trainable rigid point set registration network using Barnes-Hut 2D-tree representation. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2021)
6. Ali, S.A., Yan, S., Dornisch, W., Stricker, D.: FoldMatch: accurate and high fidelity garment fitting onto 3D scans. In: IEEE International Conference on Image Processing (ICIP) (2020)
7. Arora, H., Mishra, S., Peng, S., Li, K., Mahdavi-Amiri, A.: Multimodal shape completion via implicit maximum likelihood estimation. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2022)
8. Chibane, J., Alldieck, T., Pons-Moll, G.: Implicit functions in feature space for 3D shape reconstruction and completion. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2020)
9. Chibane, J., Pons-Moll, G.: Implicit feature networks for texture completion from partial 3D data. In: Bartoli, A., Fusiello, A. (eds.) ECCV 2020. LNCS, vol. 12536, pp. 717–725. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-66096-3_48
10. Deng, J., Cheng, S., Xue, N., Zhou, Y., Zafeiriou, S.: UV-GAN: adversarial facial UV map completion for pose-invariant face recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7093–7102 (2018)
11. Golyanik, V., Reis, G., Taetz, B., Stricker, D.: A framework for an accurate point cloud based registration of full 3D human body scans. In: International Conference on Machine Vision and Applications (ICMVA) (2017)
12. Groueix, T., Fisher, M., Kim, V.G., Russell, B.C., Aubry, M.: 3D-CODED: 3D correspondences by deep deformation. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11206, pp. 235–251. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01216-8_15
13. Gurumurthy, S., Agrawal, S.: High fidelity semantic shape completion for point clouds using latent optimization. In: IEEE Winter Conference on Applications of Computer Vision (WACV) (2019)
14. Han, X.F., Laga, H., Bennamoun, M.: Image-based 3D object reconstruction: state-of-the-art and trends in the deep learning era. IEEE Trans. Pattern Anal. Mach. Intell. 43(5), 1578–1604 (2019)


15. Han, X., Li, Z., Huang, H., Kalogerakis, E., Yu, Y.: High-resolution shape completion using deep neural networks for global structure and local geometry inference. In: IEEE International Conference on Computer Vision (ICCV) (2017)
16. Hanocka, R., Hertz, A., Fish, N., Giryes, R., Fleishman, S., Cohen-Or, D.: MeshCNN: a network with an edge. ACM Trans. Graph. (TOG) 38(4), 1–12 (2019)
17. Hasler, N., Ackermann, H., Rosenhahn, B., Thormählen, T., Seidel, H.P.: Multilinear pose and body shape estimation of dressed subjects from image sets. In: IEEE Computer Society Conference on Computer Vision and Pattern Recognition (2010)
18. Kanazawa, A., Black, M.J., Jacobs, D.W., Malik, J.: End-to-end recovery of human shape and pose. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018)
19. Kanazawa, A., Tulsiani, S., Efros, A.A., Malik, J.: Learning category-specific mesh reconstruction from image collections. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11219, pp. 386–402. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01267-0_23
20. Lazova, V., Insafutdinov, E., Pons-Moll, G.: 360-degree textures of people in clothing from a single image. In: International Conference on 3D Vision (3DV) (2019)
21. Lévy, B., Petitjean, S., Ray, N., Maillot, J.: Least squares conformal maps for automatic texture atlas generation. ACM Trans. Graph. (TOG) 21(3), 362–371 (2002)
22. Li, X., et al.: Self-supervised single-view 3D reconstruction via semantic consistency. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12359, pp. 677–693. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58568-6_40
23. Liu, G., et al.: Image inpainting for irregular holes using partial convolutions. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11215, pp. 89–105. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01252-6_6
24. Liu, Z., Tang, H., Lin, Y., Han, S.: Point-voxel CNN for efficient 3D deep learning. In: Advances in Neural Information Processing Systems, vol. 32 (2019)
25. Loper, M., Mahmood, N., Romero, J., Pons-Moll, G., Black, M.J.: SMPL: a skinned multi-person linear model. ACM Trans. Graph. (TOG) 34(6), 1–16 (2015)
26. Lorensen, W.E., Cline, H.E.: Marching cubes: a high resolution 3D surface construction algorithm. ACM SIGGRAPH Comput. Graph. 21(4), 163–169 (1987)
27. Malik, J., et al.: HandVoxNet: deep voxel-based network for 3D hand shape and pose estimation from a single depth map. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2020)
28. Mescheder, L., Oechsle, M., Niemeyer, M., Nowozin, S., Geiger, A.: Occupancy networks: learning 3D reconstruction in function space. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2019)
29. Park, J.J., Florence, P., Straub, J., Newcombe, R., Lovegrove, S.: DeepSDF: learning continuous signed distance functions for shape representation. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2019)
30. Patel, P., Huang, C.H.P., Tesch, J., Hoffmann, D.T., Tripathi, S., Black, M.J.: AGORA: avatars in geography optimized for regression analysis. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13468–13478 (2021)
31. Prokudin, S., Black, M.J., Romero, J.: SMPLpix: neural avatars from 3D human models. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1810–1819 (2021)


32. Qi, C.R., Su, H., Mo, K., Guibas, L.J.: PointNet: deep learning on point sets for 3D classification and segmentation. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2017)
33. Ronneberger, O., Fischer, P., Brox, T.: U-Net: convolutional networks for biomedical image segmentation. In: International Conference on Medical Image Computing and Computer-Assisted Intervention (2015)
34. Saint, A., et al.: 3DBodyTex: textured 3D body dataset. In: International Conference on 3D Vision (3DV) (2018)
35. Saint, A., Cherenkova, K., Gusev, G., Aouada, D., Ottersten, B., et al.: BODYFITR: robust automatic 3D human body fitting. In: IEEE International Conference on Image Processing (ICIP) (2019)
36. Saint, A., Kacem, A., Cherenkova, K., Aouada, D.: 3DBooSTeR: 3D body shape and texture recovery. In: Bartoli, A., Fusiello, A. (eds.) ECCV 2020. LNCS, vol. 12536, pp. 726–740. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-66096-3_49
37. Saint, A., et al.: SHARP 2020: the 1st shape recovery from partial textured 3D scans challenge results. In: Bartoli, A., Fusiello, A. (eds.) ECCV 2020. LNCS, vol. 12536, pp. 741–755. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-66096-3_50
38. Sarkar, K., Varanasi, K., Stricker, D.: Learning quadrangulated patches for 3D shape parameterization and completion. In: International Conference on 3D Vision (3DV) (2017)
39. Sinha, A., Unmesh, A., Huang, Q., Ramani, K.: SurfNet: generating 3D shape surfaces using deep residual networks. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2017)
40. Tatarchenko, M., Dosovitskiy, A., Brox, T.: Octree generating networks: efficient convolutional architectures for high-resolution 3D outputs. In: IEEE International Conference on Computer Vision (ICCV) (2017)
41. Tian, Y., Zhang, H., Liu, Y., Wang, L.: Recovering 3D human mesh from monocular images: a survey. arXiv preprint arXiv:2203.01923 (2022)
42. Wang, J., Zhong, Y., Li, Y., Zhang, C., Wei, Y.: Re-identification supervised texture generation. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2019)
43. Wang, N., et al.: Pixel2Mesh: generating 3D mesh models from single RGB images. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11215, pp. 55–71. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01252-6_4
44. Wang, P.S., Liu, Y., Guo, Y.X., Sun, C.Y., Tong, X.: O-CNN: octree-based convolutional neural networks for 3D shape analysis. ACM Trans. Graph. (TOG) 36(4), 1–11 (2017)
45. Wang, T.C., Liu, M.Y., Zhu, J.Y., Tao, A., Kautz, J., Catanzaro, B.: High-resolution image synthesis and semantic manipulation with conditional GANs. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018)
46. Yan, X., Lin, L., Mitra, N.J., Lischinski, D., Cohen-Or, D., Huang, H.: ShapeFormer: transformer-based shape completion via sparse representation. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2022)
47. Yuan, W., Khot, T., Held, D., Mertz, C., Hebert, M.: PCN: point completion network. In: International Conference on 3D Vision (3DV) (2018)


48. Zheng, Z., Yu, T., Dai, Q., Liu, Y.: Deep implicit templates for 3D shape representation. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2021)
49. Zheng, Z., Yu, T., Wei, Y., Dai, Q., Liu, Y.: DeepHuman: 3D human reconstruction from a single image. In: IEEE International Conference on Computer Vision (ICCV) (2019)

Deep Learning-Based Assessment of Facial Periodic Affect in Work-Like Settings

Siyang Song1(B), Yiming Luo2, Vincenzo Ronca3, Gianluca Borghini3, Hesam Sagha4, Vera Rick5, Alexander Mertens5, and Hatice Gunes1 (S. Song and Y. Luo contributed equally.)

1 AFAR Lab, University of Cambridge, Cambridge, UK ([email protected])
2 Tsinghua University, Beijing, China
3 BrainSigns, Rome, Italy
4 audEERING, Gilching, Germany
5 RWTH Aachen University, Aachen, Germany

Abstract. Facial behaviour forms an important cue for understanding human affect. While a large number of existing approaches successfully recognize affect from facial behaviours occurring in daily life, facial behaviours displayed in work settings have not been investigated to a great extent. This paper presents the first study that systematically investigates the influence of spatial and temporal facial behaviours on human affect recognition in work-like settings. We first introduce a new multi-site data collection protocol for acquiring human behavioural data under various simulated working conditions. Then, we propose a deep learning-based framework that leverages both spatio-temporal facial behavioural cues and background information for workers’ affect recognition. We conduct extensive experiments to evaluate the impact of spatial, temporal and contextual information for models that learn to recognize affect in work-like settings. Our experimental results show that (i) workers’ affective states can be inferred from their facial behaviours; (ii) models pre-trained on naturalistic datasets prove useful for predicting affect from facial behaviours in work-like settings; and (iii) task type and task setting influence the affect recognition performance.

1 Introduction

Human affect is a key indicator of various human internal states, including mental well-being [3,17] and personality [12,14], as well as people's working behaviours [11,13,33]. Therefore, accurately understanding employees' affect in their working environment would enable managers to identify risks to workers' safety, collect a history of risk exposures and monitor individuals' health status, which would further help employers in shaping organizational attitudes and decisions. Since facial behaviours are a reliable source for affect recognition and can be easily recorded in a non-invasive way, a large number of existing approaches have been devoted to inferring affect from the human face. These approaches


frequently claim that static facial displays, spatio-temporal facial behaviours and even the background in a face image can provide useful cues for affect recognition. Subsequently, existing approaches can be categorized as: static face-based solutions that infer affect from each static facial display [7,10,24], full frame-based solutions that utilize not only the facial display but also background cues [1], and spatio-temporal solutions that consider the temporal evolution of the target face [5,15,22,28].

Although some of these approaches can accurately identify human affect in both in-lab and naturalistic conditions, none of these works have focused on analysing affect in working environments. This is largely due to the fact that there is neither a well-developed protocol for worker facial behaviour data collection nor a publicly available worker facial behaviour dataset for developing affect recognition systems in work settings (Problem 1). Since different working conditions would lead workers to express behaviours in different manners [6,18,29], existing approaches that are developed using naturalistic or in-lab facial data may fail to accurately recognize affect from facial expressions expressed in work environments (Problem 2). Another key issue is that these approaches mainly provide affect predictions for a single facial image or video frame. However, in real-world working conditions, a large number of workers may need to be recorded and assessed many times a day. Subsequently, the limited computing and disk resources that are typical for SMEs (small-to-medium scale businesses) would not always allow making frame-level affect predictions in real time, and storing these for a high number of workers (Problem 3). More specifically, a common and realistic requirement for a real-world worker affect recognition system that could be adopted by SMEs is to predict each worker's affective state regularly but for a certain period of time, i.e., providing periodical affect predictions.

In this paper, we aim to address the three problems described above. Firstly, we introduce a new data collection protocol for acquiring naturalistic human audio-visual and physiological behavioural signals stimulated under different work-like conditions, tasks and stress levels. Using this protocol, we acquired the first human working facial behaviour database, called WorkingAge DB, which is collected at four different sites with participants of different backgrounds (addressing Problem 1). Then, we benchmark several standard deep learning-based solutions on the collected WorkingAge DB, providing a set of baseline results for workers' periodical facial affect recognition (addressing Problem 2 and Problem 3). We further investigate the influence of the frame rate on different models' periodical facial affect recognition performance, under the assumption that the SMEs where such a system is to be deployed typically have limited computational and disk resources, as well as the influence of task type, recording site, gender, and various feature representations. In summary, the main contributions of this paper are listed as follows:

– We propose a new protocol to acquire human facial behavioural data under various simulated working conditions together with the self-reported emotion/affect/workload state for each condition.
– Based on the proposed protocol, we collect a facial behaviour dataset in work-like settings across multiple sites with participants from different cultural


and language backgrounds. To the best of our knowledge, this is the first cross-cultural human facial behaviour dataset that is collected under various simulated working conditions.
– We benchmark several standard deep learning-based video-level behaviour approaches for workers' periodical dimensional affect recognition, and specifically investigate the influence of task type, recording site, gender, and feature representations on the models' performance. This provides a set of baselines for future studies that will focus on facial behaviour understanding in work settings.

Code access: our code for the experiments is made available at https://github.com/takuyara/Working-Age-Baselines.

2 A Protocol for Human Facial Behaviour Data Acquisition in Work-Like Settings

Although many existing face datasets [15,16,20,21,23,32] have been annotated with dimensional affect labels, to the best of our knowledge, none of them has been collected in a working environment. Moreover, these datasets only provided static face/frame-level labels without periodical affect labels (labels that reflect the subject’s affective state for a certain period of time (each clip in this paper)). As a result, none of them is suitable to be used for the purpose of investigating the relationship between human facial behaviours and affective states in working environments. To bridge this research gap, we propose a new protocol for acquiring human facial behavioural data displayed in various simulated working conditions, and annotated with self-reported affect labels that reflect workers’ periodical affective states. The details of this protocol is described below. Sensors Setup: The sensor setup of the protocol is illustrated in Fig. 1. During the recording, the participant sits at a table, where a laptop is placed to display slides that guide the participant to undertake a number of tasks based on a predefined order. To record visual information (including facial behaviours), a Logit web camera is placed in front of the participant. Additionally, a GoPro camera is also placed on the keyboard of the laptop to record facial behaviours during the Operations Task Game when the participant lowers the head on a panel to pick up items. Specifically, we build our model based on facial behaviours recorded by both cameras, i.e., only facial behaviours triggered by the operation task (explained in following paragraphs) are recorded by the GoPro camera. Work-like Tasks. To simulate several working conditions, the proposed protocol consists of three work-like tasks including the N-back tasks, the video conference tasks and the operation game (i.e., the Doctor game). Additionally, we set an Eyes Open and Close task as the first task to acquire the baseline behaviours of each participant. The details of the four tasks are listed below: – Eyes Open and Close: this task contains two sub-tasks, (i) Eyes Open: keeping the eyes open for 1 min; and (ii) Eyes Closed: keeping the eyes closed for 1 min. This task aims to acquire the baseline behaviours of each participant and help them get familiar with the task and the data acquisition environment.


Fig. 1. Hardware settings and example recordings, where the bottom-left figure displays the participant conducting a sub-task under the ‘stressful condition’ and is recorded by the logit camera. The bottom-right figure displays the participant conducting an operation sub-task and is recorded by the GoPro camera.

– N-back task: this task is a continuous performance task that is commonly employed to simulate different working memory capacity [9]. In our protocol, it simulates an office-related activity that does not need intensive physical work but causes mental strain. This task contains six sub-tasks in the following order: (i) Baseline (NBB): looking at the N-back interface for 1 min without reacting; (ii) Easy game 1 (NBE01): playing 0-back game for 2 min (typing the number that has just been shown on the screen); (iii) Hard game 1 (NBH01): playing 2-back game for 2 min (typing the number that has just been shown on the screen two turns ago); (iv) Hard game 2 (NBH02): playing 2-back game for 2 min; (v) Easy game 2 (NBE02): playing 0-back game for 2 min; and (vi) Stressful hard game (NBS): playing 2-back game for 2 min with 85 dB background noise while a human experimenter is seated in the same room as the participant. – Video conference task: this task simulates a teleworking scenario, in which employees are frequently requested to interact and coordinate with colleagues who are not physically present. It contains three sub-tasks: (i) Baseline (WEB): looking at Microsoft Teams screen without reacting; (ii) Positive emotions (WEP): describing the happiest memory in one’s life to the human experimenter for 2 min via Microsoft Teams video conferencing application (aiming to stimulate positive emotions); and (iii) Negative emotions (WEN): describing the most negative/sad memory in one’s life to the


human experimenter for 2 min via the Microsoft Teams video conferencing application (aiming to stimulate negative emotions). During both tasks, if needed, the human experimenter would ask several neutral and factual questions to keep the participant talking for about 2 min.
– Operation task (Doctor Game): this task requires the participant to use tweezers to pick up objects from a panel, simulating an assembly line scenario. It contains six sub-tasks: (i) Baseline (DB): looking at the doctor game panel without reacting; (ii) Easy game 1 (DE01): picking up and removing 5 objects from the panel within 2 min; (iii) Hard game 1 (DH01): picking up and removing as many objects as possible from the panel within 3 min; (iv) Easy game 2 (DE02): picking up and removing 5 objects from the panel within 2 min; (v) Hard game 2 (DH02): picking up and removing as many objects as possible from the panel within 3 min; and (vi) Stressful hard game (DS): removing as many objects as possible from the panel within 3 min with 85 dB background noise while a human experimenter is seated in the same room as the participant. After finishing each sub-task, the participant is asked to fill in two questionnaires.

Self-Reported Questionnaires. In our protocol, we propose to obtain periodical emotion and affect annotations from each participant, i.e., sub-task-level emotion and affect annotations. Specifically, two questionnaires are employed: the Geneva Emotion Wheel (GEW) [25], which measures the intensities of several categorical emotions of the participant along 5 scales (from low to high), and the Self-Assessment Manikin (SAM) [4], which measures the intensities of three affect dimensions (arousal, valence and power) from unhappy/calm/controlled (1) to happy/excited/in-control (9) on a 9-point scale.

Data Acquisition. The study was approved by the relevant Departmental Ethics Committee. Additional COVID-19 related measures were also put in place prior to the study. Prior to data acquisition, each participant is provided with an information sheet and is asked to read and sign a consent form. Following these procedures, the experimenter explains all the details (e.g., purpose, tasks, questionnaires, etc.) of the study to the participant. Then, the participant is asked to enter the room and sit in front of the laptop. The experimenter then leaves the recording room and goes to the operations room next door to remotely start both cameras. The instructions related to each work-like task are displayed to the participant on the laptop screen. After each main task (N-back, video conference and operation tasks), the participant is instructed to relax for 4 minutes by listening to calming music. This aims to help the participant get back to their baseline/neutral affective and cognitive state, to prevent the positive or negative emotions caused by the previous tasks from impacting the subsequent tasks.
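To make the N-back sub-tasks above concrete (0-back: type the digit that was just shown; 2-back: type the digit shown two trials earlier), the following snippet scores a sequence of responses. It is only a sketch of the task logic as described here, not the software used in the actual protocol, and the function name is hypothetical.

```python
def nback_accuracy(stimuli, responses, n):
    """Fraction of scoreable trials answered correctly in an n-back task.

    stimuli:   digits shown to the participant, in presentation order
    responses: what the participant typed on each trial (None if no answer)
    n:         0 for the easy (0-back) game, 2 for the hard (2-back) game
    """
    correct, scoreable = 0, 0
    for t, answer in enumerate(responses):
        if t < n:                      # the first n trials have no valid target
            continue
        scoreable += 1
        if answer is not None and answer == stimuli[t - n]:
            correct += 1
    return correct / scoreable if scoreable else 0.0


# 2-back example: targets are the digits shown two trials earlier.
stimuli = [3, 7, 3, 9, 3]
responses = [None, None, 3, 7, 3]      # all three scoreable trials correct
print(nback_accuracy(stimuli, responses, n=2))   # 1.0
```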

3 The WorkingAge Facial Behaviour Dataset

Based on the proposed protocol, we acquired a human facial behaviour dataset in work-like settings in four different sites located in three countries: the


audEERING and RWTH Aachen University in Germany, the University of Cambridge in the United Kingdom, and BrainSigns in Italy (referred to as AUD, RWTH, UCAM and BS in this paper). The collected dataset contains data from a total of 55 participants: 7 from the AUD site, 16 from the BS site, 20 from the RWTH site, and 12 from the UCAM site. It contains 935 clips (each clip corresponds to a sub-task, and 17 sub-tasks were recorded for each participant), of which 605 are annotated using the proposed protocol (participants were not asked to provide self-report annotations for the NBE01, NBH01, DB01 and DE01 tasks to reduce their annotation burden; as a result, these clips were not used in the experiments). The videos collected by the Logit camera and the GoPro camera are recorded at 24 and 30 fps, with frame resolutions of 1280×720 and 1920×1080, respectively. The mean, standard deviation, maximum and minimum of each annotated sub-task's duration in our dataset are listed in Table 1. It is clear that participants need longer periods of time to undertake the WEP and WEN tasks, while the DB, DE02 and DS tasks take a shorter time to complete.

Table 1. Statistics of clip durations (seconds) in the WorkingAge dataset.

Sub-task | Mean | Standard deviation | Maximum | Minimum
NBB | 73.1 | 15.9 | 111.4 | 25.9
NBE02 | 91.4 | 29.8 | 131.2 | 26.1
NBH02 | 93.4 | 29.8 | 135.3 | 29.1
NBS | 97.1 | 27.0 | 129.9 | 36.5
WEB | 57.8 | 13.5 | 81.2 | 20.8
WEP | 136.6 | 40.2 | 206.2 | 49.0
WEN | 134.5 | 35.0 | 224.0 | 54.1
DB | 61.0 | 7.2 | 77.5 | 38.1
DE02 | 60.6 | 41.2 | 204.2 | 10.5
DH02 | 100.3 | 41.8 | 185.3 | 32.4
DS | 61.6 | 39.3 | 212.1 | 14.2

In addition, the distributions of the self-reported arousal and valence labels (based on the SAM questionnaire) are illustrated in Fig. 2. In this paper, we further quantize the valence/arousal labels into three classes: positive (scores 7, 8, or 9), neutral (scores 4, 5, or 6) and negative (scores 1, 2, or 3). As we can see, during the three baseline sub-tasks, participants reported a relatively high valence (72, 88 and 11 subjects have positive, neutral and negative valence status) and low arousal value (6, 46 and 119 subjects have positive, neutral and negative arousal). However, it can be seen that the increasing mental workload requirement clearly leads participants to report lower valence (with mean valence


scores of 5.9, 5.8 and 5.7 for DE02, DH02 and DS as well as 6.0, 5.5 and 5.3 for NBE02, NBH02 and NBS) and higher arousal (with mean arousal scores of 4.4, 4.8 and 5.0 for DE02, DH02 and DS as well as 3.4, 4.0 and 5.2 for NBE02, NBH02 and NBS). We also see that the valence label distributions are quite different for the three video conference sub-tasks, i.e., most participants reported a relatively neutral valence status during the baseline condition, while positive and negative memory-based conversations caused most participants to report corresponding positive and negative valence values, with mean valence scores of 5.8, 7.1 and 4.0 for WEB, WEP and WEN, respectively.

Fig. 2. The distribution of the self-reported valence intensities for each subtask.
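The three-class quantisation of the 1-9 SAM ratings used above (1-3 negative, 4-6 neutral, 7-9 positive) amounts to a simple threshold rule; the helper below merely illustrates that mapping, with a name of our own choosing.

```python
def quantize_sam(score):
    """Map a 1-9 SAM valence/arousal rating to the three classes used in the paper."""
    if not 1 <= score <= 9:
        raise ValueError("SAM scores are expected on a 1-9 scale")
    if score <= 3:
        return "negative"
    if score <= 6:
        return "neutral"
    return "positive"


print([quantize_sam(s) for s in (2, 5, 9)])   # ['negative', 'neutral', 'positive']
```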

4 Periodical Facial Affect Recognition

Although facial behaviours have been proven to be informative for inferring human affect under natural or controlled conditions, none of the previous studies evaluated the feasibility of using workers’ facial behaviours to infer affect in terms of valence and arousal. In this section, we implement three standard deep learning-based short video-level modelling approaches to provide a benchmark for the task of facial affect recognition in work-like settings. Baseline 1. Given a short face video, the first baseline starts with generating frame-level affect predictions for all frames, which are then combined to output periodical affect prediction. In particular, we individually employed two


frame-level facial analysis models, i.e., a ResNet-50 [8] that is pre-trained for facial expression recognition (i.e., pre-trained on the FER 2013 dataset) and a GraphAU model [19] that is pre-trained for facial action units (AUs) recognition (i.e., pre-trained on the BP4D dataset [31]). Specifically, we individually use the latent feature output by the second-last fully connected layer of the ResNet50, as well as the 12 AU predictions generated by the GraphAU model, as the frame-level facial features. Then, we individually apply a multi-layer perceptron (MLP) on each of them to provide frame-level valence and arousal predictions, which is trained by re-using the clip-level self-reported valence/arousal scores as the frame-level label. To obtain periodical (clip-level) affect predictions, we combine all frame-level predictions of the target clip with the following widely-used strategies: (i) using the mode prediction of all frame-level predictions as the periodical affect predictions (i.e., GraphAU(P)-MODE and ResNet(P)-MODE); (ii) applying a Long-short-term-memory Network (LSTM) to combine all frame-level predictions (i.e., GraphAU(P)-LSTM and ResNet(P)-LSTM); and (iii) applying spectral encoding algorithm [26,27] to produce a spectral heatmap from all frame-level predictions, which is then fed to a 1D-CNN to generate periodical affect predictions (i.e., GraphAU(P)-SE and ResNet(P)-SE). Baseline 2. The second baseline also applies the same two pre-trained models used in baseline 1 to provide frame-level facial features. Differently from the baseline 1, we employ three long-term modelling strategies to combine all frame-level facial features (the latent feature vectors produced by the last FC layer of the corresponding frame-level model) of the clip as the clip-level (periodical) affect representation: (i) averaging all frame-level facial features (i.e., GraphAU(F)AVERAGE and ResNet(F)-AVERAGE); (ii) applying LSTM to process all frame-level facial features (i.e., GraphAU(F)-LSTM and ResNet(F)-LSTM); and (iii) spectral encoding all frame-level facial features (i.e., GraphAU(F)-SE and ResNet(F)-SE). These clip-level affect representations are then fed to either an MLP (for (i)) or 1D-CNN (for (ii) and (iii)) to generate clip-level affect predictions. Baseline 3. The third baseline applies a spatio-temporal CNN (Temporal Pyramid Network (TPN) [30]) to process the facial sequence. In particular, we first divide each clip into several segments, where each consists of 160 frames, and down-sample each segment to 32 frames. We then feed the cropped face sequence (32 frames) to TPN for affect classification. If a clip contains multiple segments, then the clip-level predictions are achieved by averaging all segment-level predictions.
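As a hedged illustration of Baseline 2 (ii), the sketch below pools per-frame feature vectors with a bidirectional LSTM and maps the resulting clip-level representation to three affect classes. The layer sizes and names are placeholders rather than the exact configuration used in the paper.

```python
import torch
import torch.nn as nn

class ClipLSTMClassifier(nn.Module):
    """Clip-level affect classifier: frame features -> BiLSTM -> 3-class logits."""

    def __init__(self, feat_dim=2048, hidden=128, num_classes=3):
        super().__init__()
        self.lstm = nn.LSTM(feat_dim, hidden, batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * hidden, num_classes)

    def forward(self, frame_feats):
        # frame_feats: B x T x feat_dim (e.g. per-frame CNN features for a clip)
        _, (h_n, _) = self.lstm(frame_feats)
        clip_repr = torch.cat([h_n[0], h_n[1]], dim=-1)   # concat fwd/bwd final states
        return self.head(clip_repr)


model = ClipLSTMClassifier()
logits = model(torch.randn(4, 120, 2048))   # 4 clips, 120 frames each
print(logits.shape)                          # torch.Size([4, 3])
```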

5 Experiments

5.1 Experimental Setup

Data Pre-processing: We re-sampled all video clips to 24 fps to make videos recorded by different cameras to have the same frame rate. Then, we used


OpenFace 2.0 [2] to crop and align the face from each frame, where frames with failed or low-confidence face detection are treated as black images.

Training and Evaluation Protocol: We use a leave-one-site-out validation protocol for model training and evaluation, i.e., each time we use all clips from three sites to train the model and evaluate the trained model on the remaining site. The final reported results are obtained by averaging the validation results of the four folds.

Model Settings and Training Details: There are three main settings for our baseline models: (i) frame-level feature extraction; (ii) clip-level (periodical) facial behaviour representation extraction; and (iii) classifiers. Specifically, we used the output (2048-D) of the second-last layer of the ResNet-50 and the 12 predicted action unit occurrences as the frame-level facial features, respectively. For the clip-level representation extraction, we used a bidirectional LSTM with one hidden layer, as well as a spectral encoding algorithm with a resolution of 256 (the 80 lowest frequencies are selected). Finally, the MLP classifier has 3 layers for mode/averaging-based classification, while the 1D-CNN has 3 convolution blocks, each consisting of a convolution layer, a ReLU activation and a dropout layer, whose channel and hidden sizes vary depending on the size of the input feature. We used a batch size of 512 and an initial learning rate of 0.001 for all experiments.

Evaluation Metrics: Considering that the samples are unbalanced in terms of the arousal and valence label distributions, in this paper we employ the Unweighted Average Recall (UAR) to evaluate the different baselines on facial affect recognition in work-like settings.
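The UAR metric and the leave-one-site-out protocol described above can be summarised as follows. This is a schematic sketch with hypothetical data structures (clips represented as dictionaries carrying a 'site' field), not the authors' evaluation code.

```python
import numpy as np

def uar(y_true, y_pred, classes=("negative", "neutral", "positive")):
    """Unweighted Average Recall: the mean of per-class recalls."""
    recalls = []
    for c in classes:
        idx = [i for i, y in enumerate(y_true) if y == c]
        if idx:  # skip classes absent from this fold
            recalls.append(np.mean([y_pred[i] == c for i in idx]))
    return float(np.mean(recalls))


def leave_one_site_out(clips, sites=("AUD", "BS", "RWTH", "UCAM")):
    """Yield (held-out site, train clips, validation clips) for each of the four folds."""
    for held_out in sites:
        train = [c for c in clips if c["site"] != held_out]
        val = [c for c in clips if c["site"] == held_out]
        yield held_out, train, val
```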

5.2 Baseline Results of Leave-One-Site-Out Cross-Validation

Tables 2 and 3 list the valence and arousal classification UAR results achieved by all baseline systems for all sub-tasks (the models are trained using all clips in the training set regardless of the task type). It can be seen that almost all baselines achieved over the chance-level classification UAR (33.33%), with the GraphAU(P)-SE system achieving the best valence UAR result (42.31%) and the ResNet(F)-SE system achieving the best arousal UAR result (41.20%). Meanwhile, we found that if we only use the mode prediction of all frame-level predictions, both valence and arousal classification results are clearly worse than most of other systems, i.e., the two corresponding systems only achieved less than 34% valence and arousal classification accuracy. These results indicate that: (i) according to Fig. 3, the long-term modelling for either frame-level predictions or features is a crucial step to achieve more reliable periodical arousal/valence predictions, as simply choosing the mode prediction from all frame-level predictions or averaging all frame-level features clearly provided the worst results; (ii) the frame-level facial analysis (AU recognition/facial expression recognition) models that are pre-trained using the lab-based facial datasets (AU or facial expression datasets) can still extract human affect-informative facial features from


Table 2. The UAR results achieved for worker’s valence recognition, where the name of each method is formatted as frame level facial feature-long term model, where P and F represent the frame-level prediction and facial features, respectively. For example, ResNet(P)-SE denote the system that applies ResNet facial features to make framelevel affect predictions, and then using spectral encoding algorithm to summarise all frame-level valence/arousal predictions as the clip-level valence/arousal prediction. Model

NBB

NBE02 NBH02 NBS

DB

DE02

WEP

WEN

Total

GraphAU(P)-SE

0.3814

0.4279

0.4206

0.4691

0.6667

0.4026 0.4611

DH02

DS

0.4461 0.5324

WEB

0.3280

0.2857

0.4231

GraphAU(P)-LSTM

0.3921

0.4254

0.4444

0.4414

0.6667

0.3898

0.4611

0.4428

0.5139

0.2899

0.3175

0.4209

GraphAU(P)-MODE

0.2918

0.2667

0.3254

0.3241

0.5417

0.2700

0.2019

0.2525

0.3287

0.2899

0.2619

0.3009

GraphAU(F)-SE

0.3584

0.4019

0.4246

0.4784 0.6875

0.3929

0.3963

0.3956

0.4491

0.2984

0.3810

0.3995

GraphAU(F)-LSTM

0.2857

0.4153

0.3921

0.4254

0.4444

0.4414

0.6667

0.3898

0.4611

0.4428

0.5139

0.3090

GraphAU(F)-AVERAGE 0.3685

0.3994

0.3651

0.3951

0.6250

0.3570

0.4500

0.4209

0.4722

0.3640 0.3333

0.3970

ResNet(P)-SE

0.3653

0.4402

0.4243

0.4484

0.7150

0.3796

0.4956

0.3881

0.5207

0.3564

0.3254

0.4226

ResNet(P)-LSTM

0.3847

0.4438 0.4473

0.4444

0.6558

0.3977

0.4974

0.4171

0.5207

0.3023

0.2857

0.4176

ResNet(P)-MODE

0.3153

0.3394

0.2897

0.4558

0.2778

0.3465

0.3124

0.3622

0.3504

0.4127 0.3344

ResNet(F)-SE

0.3951 0.4438 0.4473 0.4722

0.6550

0.3750

0.5140 0.4211

0.5393 0.3023

0.2857

0.4225

ResNet(F)-LSTM

0.3847

0.4438 0.4358

0.6542

0.3838

0.5140 0.4171

0.5393 0.3023

0.2857

0.4191

ResNet(F)-AVERAGE

0.3847

0.4438 0.4473 0.4444

0.7167 0.3801

0.4974

0.4316

0.5207

0.3023

0.2857

0.4219

TPN

0.3751

0.3740

0.4785

0.2571

0.4002

0.3062

0.2915

0.3379

0.3350

0.3798

0.3016

0.4444 0.2940

0.2686

Table 3. The UAR results achieved for worker’s arousal recognition. Model

NBB

NBE02 NBH02 NBS

DB

DE02

WEP

WEN

Total

GraphAU(P)-SE

0.5605

0.3417

0.3506

0.4058

0.3464

0.4842 0.4077

DH02

DS

0.4118 0.3538

WEB

0.4167

0.3801

0.3999

GraphAU(P)-LSTM

0.5474

0.3625

0.3975

0.4058

0.4071

0.3667

0.3310

0.3725

0.3547

0.3774

0.3581

0.3678

GraphAU(P)-MODE

0.5658

0.4319 0.3232

0.2580

0.2389

0.3741

0.3902

0.3081

0.3155

0.4358 0.2327

0.3427

GraphAU(F)-SE

0.5711

0.3819

0.4149 0.4014

0.4224

0.4201

0.3634

0.3880

0.4153

0.4100

0.3973

0.3967

GraphAU(F)-LSTM

0.5474

0.4153

0.4134

0.4058

0.4071

0.3667

0.3310

0.3725

0.3645

0.3774

0.3581

0.3759

GraphAU(F)-AVERAGE 0.5974

0.4042

0.3983

0.4203

0.4309 0.3667

0.3310

0.3725

0.3645

0.3774

0.3581

0.3790

ResNet(P)-SE

0.5244

0.3636

0.3837

0.3610

0.4115

0.3842

0.3860

0.3889

0.3873

0.3988

0.4074 0.3834

ResNet(P)-LSTM

0.5231

0.3206

0.3692

0.3900

0.4122

0.3491

0.3651

0.4074

0.3775

0.4129

0.3454

ResNet(P)-MODE

0.4359

0.3011

0.3202

0.3320

0.3118

0.3667

0.3684

0.3519

0.3595

0.4014

0.3639

0.3386

ResNet(F)-SE

0.6346 0.4133

0.3775

0.4295

0.4036

0.4719

0.4106

0.3519

0.4101

0.3775

0.3406

0.4120

ResNet(F)-LSTM

0.5462

0.3925

0.4037

0.4466 0.4029

0.4649

0.4496 0.4021

0.4869 0.3837

0.3285

0.4032

ResNet(F)-AVERAGE

0.5231

0.3945

0.3996

0.3900

0.4122

0.3991

0.4002

0.4074

0.3954

0.3758

0.3639

0.3858

TPN

0.4823

0.4877

0.3543

0.3403

0.3168

0.1935

0.2391

0.2951

0.1807

0.2602

0.3498

0.3182

0.3671

facial displays triggered by work-like tasks, as their features frequently provide around 40% UAR for both tasks (the chance-level UAR is around 33% for three-class classification); and (iii) directly pairing workers' facial sequences with clip-level affect labels to train spatio-temporal models does not provide superior results, which further validates that frame-level facial analysis models pre-trained on facial datasets acquired in naturalistic settings are beneficial for facial affect analysis in work-like settings.

5.3 Ablation Studies

Task Type: Since workers’ facial behaviours are highly correlated with the task type and task setting, we also specifically investigate which tasks can trigger most affect-informative facial behaviours in Fig. 4, where we report the average


Fig. 3. The average valence and arousal UAR results of all baseline models.

UAR results achieved by all baselines for each task. The facial behaviours triggered by four sub-tasks allow the model to achieve over 40% valence recognition UAR, which is clearly superior to the results achieved on the other sub-tasks. Therefore, we hypothesize that different subjects display affect with different intensities (valence) when undertaking different sub-tasks of the same task (even though their facial behaviour may be similar). Meanwhile, the facial behaviours displayed during the N-back baseline sub-task are very informative for predicting subjects' arousal, i.e., the arousal UAR achieved for the baseline sub-task of the N-back task shows more than a 14.76% absolute improvement over the results of the other sub-tasks. This finding suggests that human facial displays before conducting the memory task may be reliable for inferring workers' arousal state.

Table 4. The four-fold cross-validation results achieved by our best models (GraphAU(P)-SE for valence, ResNet(P)-SE for arousal).
Validation set | AUD | BS | RWTH | UCAM
Valence | 0.3715 | 0.3319 | 0.3438 | 0.3337
Arousal | 0.3621 | 0.3385 | 0.3405 | 0.3523


Fig. 4. The influence of different tasks on valence and arousal prediction.

Recording Site: We also explore the differences in affect classification across sites. Table 4 displays the leave-one-site-out four-fold cross-validation results. It is clear that the data collected at different sites impact the valence classification results, with around a 4% UAR difference between the lowest (0.3319 (BS)) and the highest (0.3715 (AUD)). These results indicate that people at different sites may display different facial behaviours when expressing valence. On the other hand, the performance variations for arousal classification are much smaller, indicating that the relationship between arousal and workers' facial behaviours is more stable as compared to valence.

Gender: We report the gender-dependent worker valence/arousal classification UAR results using all our baselines, where leave-one-site-out cross-validation is applied to either male or female facial data. Both the valence prediction results (0.3912 for male, 0.3714 for female and 0.3943 for gender-independent) and the arousal prediction results (0.3711 for male, 0.3588 for female and 0.3878 for gender-independent) indicate that male facial behaviours are more correlated with their affect status. Moreover, it should be noted that the gender-independent experiment achieved better performance than the gender-dependent experiments. This might be caused by the fact that the gender-dependent experiments have less data for model training. In addition, these results might indicate that females and males do not display large variations when expressing their affect via facial behaviours in work-like settings, i.e., such small variations cannot compensate for the negative impact of the reduced amount of training data on worker affect prediction models (Table 5).


Table 5. The average UAR results achieved for different genders.
Gender | Male | Female | Both
Valence | 0.3912 | 0.3741 | 0.3943
Arousal | 0.3711 | 0.3588 | 0.3878

Feature Representation and Long-Term Modelling: We also specifically investigate the influence of different model configurations on periodical affect classification performance. As we can see from Table 6, the backbone pre-trained on a naturalistic facial expression dataset achieved slightly better affect recognition UAR results than the backbone pre-trained on a facial AU dataset, and both are clearly higher than the chance-level prediction. This means that both the facial expression and the AU-related facial features obtained from the workers' faces are correlated with their self-reported affective state. Then, using facial features as the frame-level representation to construct the clip-level facial behaviour representation is the superior option, as this setting achieved better UAR results for the recognition of both valence and arousal. We assume this is because frame-level features retain more affect-related facial cues than frame-level predictions, and thus, during long-term modelling, frame-level feature-based clip-level representations can encode more affect-related temporal behavioural cues. Finally, simply choosing the mode of all frame-level predictions or the average of all frame-level features provided the worst results among all long-term modelling strategies, while spectral encoding achieved the best average performance across all baselines. This is because the encoded spectral representation contains multi-scale clip-level temporal dynamics.

Table 6. The average results of different baseline configurations.
Strategy | Valence | Arousal
Backbone: GraphAU | 0.3928 | 0.3770
Backbone: ResNet | 0.4064 | 0.3817
Frame-level feature: Prediction | 0.3866 | 0.3666
Frame-level feature: Features | 0.4126 | 0.3921
Long-term modelling: SE | 0.4169 | 0.3980
Long-term modelling: LSTM | 0.4182 | 0.3785
Long-term modelling: MODE/AVG | 0.3635 | 0.3615

6 Conclusion

In this paper, we presented the first study that systematically investigated the face-based periodical valence and arousal analysis in work-like settings. More specifically, this paper introduced a worker facial behaviour data acquisition


protocol and the first cross-cultural human facial behaviour dataset in work-like settings. We also provided a set of deep learning-based baselines for face-based worker affect recognition. The results show that facial behaviours triggered by different tasks are informative for inferring valence and arousal states, but the performance is dependent on the task type and task setting. Our future work will focus on developing more advanced domain-specific loss functions and network architectures for multi-modal worker affect recognition.

Acknowledgement. This work is funded by the European Union's Horizon 2020 research and innovation programme project WorkingAge, under grant agreement No. 82623. S. Song and H. Gunes are also partially supported by the EPSRC/UKRI under grant ref. EP/R030782/1. Y. Luo contributed to this work while undertaking a summer research study at the Department of Computer Science and Technology, University of Cambridge.

References
1. Antoniadis, P., Pikoulis, I., Filntisis, P.P., Maragos, P.: An audiovisual and contextual approach for categorical and continuous emotion recognition in-the-wild. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 3645–3651 (2021)
2. Baltrusaitis, T., Zadeh, A., Lim, Y.C., Morency, L.P.: OpenFace 2.0: facial behavior analysis toolkit. In: 2018 13th IEEE International Conference on Automatic Face and Gesture Recognition (FG 2018), pp. 59–66. IEEE (2018)
3. Borghini, G., et al.: Stress assessment by combining neurophysiological signals and radio communications of air traffic controllers. In: International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), pp. 851–854. IEEE (2020)
4. Bradley, M.M., Lang, P.J.: Measuring emotion: the self-assessment manikin and the semantic differential. J. Behav. Ther. Exp. Psych. 25(1), 49–59 (1994)
5. Du, Z., Wu, S., Huang, D., Li, W., Wang, Y.: Spatio-temporal encoder-decoder fully convolutional network for video-based dimensional emotion recognition. IEEE Trans. Affect. Comput. 12, 565–572 (2019)
6. Giorgi, A., et al.: Wearable technologies for mental workload, stress, and emotional state assessment during working-like tasks: a comparison with laboratory technologies. Sensors 21(7), 2332 (2021)
7. Guo, J., et al.: Dominant and complementary emotion recognition from still images of faces. IEEE Access 6, 26391–26403 (2018)
8. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016)
9. Herreras, E.B.: Cognitive neuroscience; the biology of the mind. Cuadernos de Neuropsicología/Panamerican J. Neuropsychol. 4(1), 87–90 (2010)
10. Ilyas, C.M.A., Rehm, M., Nasrollahi, K., Madadi, Y., Moeslund, T.B., Seydi, V.: Deep transfer learning in human-robot interaction for cognitive and physical rehabilitation purposes. Pattern Anal. Appl. 25, 1–25 (2021)
11. Ilyas, C.M.A., Song, S., Gunes, H.: Inferring user facial affect in work-like settings. arXiv preprint arXiv:2111.11862 (2021)


12. Izard, C.E.: Human Emotions. Emotions, Personality, and Psychotherapy. Plenum Press, New York (1977)
13. Jenkins, J.M.: Self-monitoring and turnover: the impact of personality on intent to leave. J. Organ. Behav. 14(1), 83–91 (1993)
14. Keltner, D.: Facial expressions of emotion and personality. In: Handbook of Emotion, Adult Development, and Aging, pp. 385–401. Elsevier (1996)
15. Kollias, D., et al.: Deep affect prediction in-the-wild: Aff-Wild database and challenge, deep architectures, and beyond. Int. J. Comput. Vision 127(6), 907–929 (2019)
16. Kossaifi, J., et al.: SEWA DB: a rich database for audio-visual emotion and sentiment research in the wild. IEEE Trans. Pattern Anal. Mach. Intell. 43, 1022–1040 (2019)
17. Lerner, J.S., Li, Y., Valdesolo, P., Kassam, K.S.: Emotion and decision making. Annu. Rev. Psychol. 66, 799–823 (2015)
18. Lohse, M., Rothuis, R., Gallego-Pérez, J., Karreman, D.E., Evers, V.: Robot gestures make difficult tasks easier: the impact of gestures on perceived workload and task performance. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pp. 1459–1466 (2014)
19. Luo, C., Song, S., Xie, W., Shen, L., Gunes, H.: Learning multi-dimensional edge feature-based AU relation graph for facial action unit recognition. In: Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence (2022)
20. McKeown, G., Valstar, M., Cowie, R., Pantic, M., Schroder, M.: The SEMAINE database: annotated multimodal records of emotionally colored conversations between a person and a limited agent. IEEE Trans. Affect. Comput. 3(1), 5–17 (2011)
21. Mollahosseini, A., Hasani, B., Mahoor, M.H.: AffectNet: a database for facial expression, valence, and arousal computing in the wild. IEEE Trans. Affect. Comput. 10(1), 18–31 (2017)
22. Mou, W., Gunes, H., Patras, I.: Alone versus in-a-group: a multi-modal framework for automatic affect recognition. ACM Trans. Multimed. Comput. Commun. Appl. (TOMM) 15(2), 1–23 (2019)
23. Ringeval, F., Sonderegger, A., Sauer, J., Lalanne, D.: Introducing the RECOLA multimodal corpus of remote collaborative and affective interactions. In: 2013 10th IEEE International Conference and Workshops on Automatic Face and Gesture Recognition (FG), pp. 1–8. IEEE (2013)
24. Sariyanidi, E., Gunes, H., Cavallaro, A.: Automatic analysis of facial affect: a survey of registration, representation, and recognition. IEEE Trans. Pattern Anal. Mach. Intell. 37(6), 1113–1133 (2014)
25. Scherer, K.R.: What are emotions? And how can they be measured? Soc. Sci. Inf. 44(4), 695–729 (2005)
26. Song, S., Jaiswal, S., Shen, L., Valstar, M.: Spectral representation of behaviour primitives for depression analysis. IEEE Trans. Affect. Comput. 13, 829–844 (2020)
27. Song, S., Shen, L., Valstar, M.: Human behaviour-based automatic depression analysis using hand-crafted statistics and deep learned spectral features. In: 2018 13th IEEE International Conference on Automatic Face and Gesture Recognition (FG 2018), pp. 158–165. IEEE (2018)
28. Song, S., Sánchez-Lozano, E., Kumar Tellamekala, M., Shen, L., Johnston, A., Valstar, M.: Dynamic facial models for video-based dimensional affect estimation. In: Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops (2019)


29. Tsai, Y.F., Viirre, E., Strychacz, C., Chase, B., Jung, T.P.: Task performance and eye activity: predicting behavior relating to cognitive workload. Aviat. Space Environ. Med. 78(5), B176–B185 (2007)
30. Yang, C., Xu, Y., Shi, J., Dai, B., Zhou, B.: Temporal pyramid network for action recognition. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 591–600 (2020)
31. Zhang, X., et al.: BP4D-Spontaneous: a high-resolution spontaneous 3D dynamic facial expression database. Image Vis. Comput. 32(10), 692–706 (2014)
32. Zhao, G., Huang, X., Taini, M., Li, S.Z., Pietikäinen, M.: Facial expression recognition from near-infrared videos. Image Vis. Comput. 29(9), 607–619 (2011)
33. Zimmerman, R.D.: Understanding the impact of personality traits on individuals' turnover decisions: a meta-analytic path model. Pers. Psychol. 61(2), 309–348 (2008)

Supervision by Landmarks: An Enhanced Facial De-occlusion Network for VR-Based Applications

Surabhi Gupta(B), Sai Sagar Jinka, Avinash Sharma, and Anoop Namboodiri

Center for Visual Information Technology, International Institute of Information Technology, Hyderabad, India
{surabhi.gupta,jinka.sagar}@research.iiit.ac.in, {asharma,anoop}@iiit.ac.in

Abstract. Face possesses a rich spatial structure that can provide valuable cues to guide various face-related tasks. The eyes are considered an important sociovisual cue for effective communication. They are an integral feature of facial expressions as they are an important aspect of interpersonal communication. However, virtual reality headsets occlude a significant portion of the face and restrict the visibility of certain facial features, particularly the eye region. Reproducing this region with realistic content and handling complex eye movements such as blinks is challenging. Previous facial inpainting methods are not capable enough to capture subtle eye movements. In view of this, we propose a working solution to refine the reconstructions, particularly around the eye region, by leveraging inherent eye structure. We introduce spatial supervision and a novel landmark predictor module to regularize per-frame reconstructions obtained from an existing image-based facial de-occlusion network. Experiments verify the usefulness of our approach in enhancing the quality of reconstructions to capture subtle eye movements. Keywords: Face image inpainting · Landmark guided facial de-occlusion · HMD removal · Virtual reality · Eye consistency

Supplementary Information: The online version contains supplementary material available at https://doi.org/10.1007/978-3-031-25072-9_21.

1 Introduction

Social telepresence and interaction are essential for human survival. Since globalization, there has been a considerable increase in users interacting remotely, which witnessed a tremendous surge during the Covid-19 pandemic. Traditional video conferencing platforms such as Microsoft Teams, WhatsApp, etc., gained immense popularity during the pandemic. However, they lack immersiveness and compromise realism, which impacts the user's experience and is undesirable. With the integration of virtual reality into communication platforms, current technologies have witnessed a breakthrough in enhancing user experience with a sense of heightened social existence and


interaction. Faces convey vital socio-visual cues that are important for effective communication. However, one of the major challenges with virtual reality devices such as HMDs is the occlusion they cause over the face when worn. These devices occlude almost 30–40 percent of the face, obscuring essential social cues, particularly the eye region, which hinders the user's experience. Several approaches have been proposed in the literature to tackle this problem, but none of them produces photorealistic results that could be integrated into hybrid telepresence systems. Existing face image inpainting approaches often suffer from incoherency in generating smooth reconstructions when applied to video frames. This incoherency is highly noticeable in the eye region, which is undesirable, and is generally visible as jittering of the eyelids in successive video frames. Specific eye movements, such as blinking, are usually involuntary in humans, natural and unavoidable. Thus, it is important to retain this characteristic for effective communication.

Synthesizing eyes, including iris and eyelid reconstruction with appropriate eye gaze, has been attempted before using 3D model-based approaches. Nonetheless, these require high-quality data and incur expensive training costs. [6] and [17] are examples of such 3D model-based approaches to HMD de-occlusion. However, they work only with frontal face images and fail in cases of extreme head poses. Another set of works, such as [14,15], uses an inpainting approach to correct and animate the eye gaze of high-resolution, unconstrained portrait face images. Since these methods have not been validated on videos, they fail to generate consistent eye motions across frames.

Interestingly, one of the biggest advantages when dealing with digital face images in computer vision is their rich spatial structure. In digital images, this structure is generally represented in the form of 2D/3D coordinates, heatmaps, and edges, and is provided as an auxiliary input to the network. Many works in the literature have exploited these spatial constraints to achieve better-quality reconstructions in face inpainting and generation tasks. Recently, [11] proposed image-to-face video inpainting using a spatio-temporal nested GAN architecture. They used 3D residual blocks to capture inter-frame dependencies. The authors showed that conditioning facial inpainting on landmarks yielded stable reconstructions. Nonetheless, it has only been validated with a specific type of circular mask that covers the eyes, nose, and mouth. [12] is another face image inpainting method that is guided by facial landmarks. Personalized facial de-occlusion networks such as [3] have been proposed in the literature to generate plausible reconstructions. However, they are not controllable and thus cannot handle eye movements.

Thus, we tackle this problem using an image generation/synthesis approach applied to faces, using additional information that is easily accessible from modern HMD devices with eye-tracking capabilities. This work aims to generate high-quality facial reconstructions in and around the eye region with consistent eye motions in the presence of occluders such as HMDs. Our primary focus is to handle instability during eye movements, which is noticeable mainly around the eye region. For this, we leverage the spatial property of faces, i.e., facial landmarks, to guide the model to synthesize the eye region with minimal artifacts that look realistic and plausible.
Figure 1 presents high-quality and photo-realistic results produced by our method showing the efficacy of our approach of using spatial supervision to control complex eye motions such as eye blinks and rolling of eyeballs.


Fig. 1. Photo-realistic results generated by our proposed facial de-occlusion network, targeting complex eye motions.

To summarize, we make the following contributions:
1. We propose a potential solution to refine the reconstructions in the eye region.
2. To achieve this, we leverage spatial constraints such as landmarks to improve consistency in the eye region by feeding eye landmark heatmaps as an auxiliary input to the network along with the occluded face image.
3. To further improve the fidelity of the reconstruction, we use an additional loss function to regularize the training based on the landmarks.

2 Related Work

To see what is not present in an image is one of the most exciting yet challenging tasks in computer vision. In the digital domain, we often refer to it as image restoration or image inpainting. It has applications in medical image processing, watermark removal, restoring old photographs, and object removal. Inpainting has been an active research topic for many years, and several works have been proposed in the literature. Recently, this area has seen tremendous interest for image synthesis and image completion in AR/VR. This section discusses the most relevant existing works in detail.

2.1 Facial De-occlusion and HMD Removal Methods

De-occluding face images in the presence of large occluders such as HMDs is a highly ill-posed problem. Several works such as [6,9] have been proposed in the literature to address this issue. However, none promises to provide usable results in practice, as these have only been validated for frontal face images with rectangular masks. Since they use an additional reference image of the person, they fail in cases of different pose variations


between occluded and reference images. Recently, [3] presented an approach to tackle the problem of facial de-occlusion by training a person-specific model in VR settings. It generates plausible and natural-looking reconstructions but might fail to maintain smooth eye movements across consecutive frames. To address this issue, we can use extra information provided by modern HMD devices equipped with eye-tracking to generate consistent eye motions.

2.2 Structure-Guided Image Inpainting

Figuring out missing regions without any prior information is a difficult task. Many prior works have successfully used landmarks for the task of face generation and synthesis. Previous image inpainting methods, such as [5, 12] use edges, landmarks, and other structural information as an auxiliary input to guide the reconstructions. This extra supervision has proven effective in helping the model fill the missing region with appropriate content. However, these are image-based approaches and might not guarantee to generate consistent results across frames. Thus, we cannot directly use these methods to generate smooth reconstructions, particularly in the eye region.

3 Proposed Method

3.1 The Architecture

We build upon the architecture proposed in [3] and use an attention-enabled encoder-decoder architecture followed by a novel Landmark Heatmap Predictor (LHP) module that acts as a regularizer to enhance the reconstruction in and around the eye region. We train this network in an end-to-end fashion in two stages using a dedicated loss function. [3] is an existing facial de-occlusion network, and we consider it our baseline.

Fig. 2. Illustration of our proposed architecture.


Attention-Based Encoder-Decoder: We utilize an encoder-decoder architecture with an attention module to inpaint the missing regions of the face, particularly the eye region. The primary function of the attention module is to focus on reconstructing this region with high-frequency details such as hair, facial accessories, and appearances. It also helps the model to generalize to unseen and novel appearances, hairstyles, etc. The encoder-decoder comprises ResNet and Inverted ResNet layers, with a bottleneck layer of 99 dimensions. For the Inverted ResNet, the first convolution in the ResNet block is replaced by a 4 × 4 deconv layer. The attention module is composed of four convolution layers: Conv(4m, 3), Conv(4m, 3), Conv(8m, 3) and Conv(2m, 3), where m denotes the base number of filters and Conv(m, k) denotes a convolutional layer with m output channels and kernel size k. Given an occluded face image X_{occ} as input, this network hallucinates the missing region in order to reconstruct the unoccluded image X_{rec} against the ground-truth unoccluded image X_{gt}.

Landmark Heatmap Predictor Module: We employ another encoder-decoder network to refine the reconstruction around the eye region. The primary aim of this network is to predict the eye landmark heatmap of the reconstructed image, based on which we can regularize the final reconstructed image using a loss function. This landmark heatmap predictor network is composed of ResNet and Inverted ResNet layers. The input to this network is the reconstructed image X_{rec} produced by the attention-based encoder-decoder network. The output is a 42-channel landmark heatmap, denoted by LHP(X_{rec}), where each channel corresponds to one of 42 eye landmarks. Figure 2 illustrates an overview of the proposed pipeline.

3.2 Spatial Supervision Using Landmarks

Per-frame predictions from a traditional image-based facial de-occlusion network such as [3] suffer from temporal discontinuity and flickering, especially in the eyelids. Therefore, to stabilize the eye movements, we leverage eye landmarks as an auxiliary input to guide smooth reconstructions of the eyelids that are much more realistic and consistent. This supervision helps the model preserve the structure of the eyelids. For better and enriched representations, we prefer 2D heatmaps over 2D coordinates. Each landmark is represented by a separate heatmap, interpreted as a grayscale image. We merge all heatmaps into a single-channel grayscale image, which is then concatenated with the occluded RGB input image in the channel dimension. This is further fed to the attention-based encoder-decoder to generate plausible and stable reconstructions.

3.3 Loss Functions

The primary goal of this pipeline is to generate plausible facial inpainted reconstructions consistent with the other frames in the sequence while preserving the landmark structure of the eyes. To serve this purpose, we use the following loss functions. The first loss function ensures that the generated reconstruction is in close proximity to the ground-truth unoccluded image. Thus, we formulate a pixel-based L1 loss to penalize reconstruction errors:

L_{rec} = \| X_{gt} - X_{rec} \|_1    (1)


However, using only the reconstruction loss generates blurry reconstructions. Thus, we adopt the architecture of the DCGAN discriminator [7] in the pipeline, denoted by D, to compute the adversarial loss that forces the encoder-decoder to reconstruct high-fidelity outputs by sharpening the blurred images:

L_{adv} = \log(D(X_{gt})) + \log(1 - D(X_{rec}))    (2)

To further stabilize the adversarial training, we use an SSIM-based structural similarity loss, as defined in [1], that helps to improve the alignment of high-frequency image elements:

L_{ssim} = SSIM(X_{rec}, X_{gt})    (3)

In order to emphasize the quality of reconstruction in the masked region, i.e., the invalid pixels, we use a mask-based loss function. Here, we use the binary mask image as additional supervision to the network and input image while training. This helps mitigate the blinking artifacts around the eye region for stable reconstructions:

L_{mask} = \| I_{mask} \odot X_{gt} - I_{mask} \odot X_{rec} \|_1    (4)

where I_{mask} refers to the single-channel binary mask image in which white pixels (1) correspond to the occluded region and black pixels (0) correspond to the remaining unoccluded region, and \odot is element-wise multiplication. Apart from providing a landmark heatmap along with the occluded input, we also regularize the reconstructions based on landmarks using a loss function. To prevent irregularities in the eye region and preserve the eyelid shape, we utilize a landmark heatmap prediction loss that regularizes the inpainted reconstructions based on the predicted eye landmark heatmaps LHP(X_{rec}) and the ground-truth eye landmark heatmaps H. Here, for each landmark l_i ∈ R^2, H_i consists of a 2D normal distribution centered at l_i with a standard deviation of σ:

L_{lhm} = \| H - LHP(X_{rec}) \|_2    (5)

Thus, the final training objective can be written as

L_{final} = λ_{rec} L_{rec} + λ_{adv} L_{adv} + λ_{ssim} L_{ssim} + λ_{mask} L_{mask} + λ_{lhm} L_{lhm}    (6)

where λ_{rec}, λ_{adv}, λ_{ssim}, λ_{mask} and λ_{lhm} are the corresponding weight parameters for each loss term.
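As a concrete illustration of Eqs. (1)-(6), the sketch below renders the ground-truth heatmaps H as 2D Gaussians and combines the five loss terms. The module names (generator, lhp, discriminator), the SSIM helper, the value of sigma, and the use of 1 - SSIM for the structural term are assumptions made for illustration only, not the authors' released implementation.

    import torch
    import torch.nn.functional as F

    def gaussian_heatmaps(landmarks, size=256, sigma=2.0):
        """landmarks: (42, 2) pixel coordinates l_i; returns H with one Gaussian channel per landmark.
        sigma is not reported in the paper and is a placeholder."""
        ys, xs = torch.meshgrid(torch.arange(size), torch.arange(size), indexing="ij")
        ys, xs = ys.float()[None], xs.float()[None]                       # (1, H, W)
        cx, cy = landmarks[:, 0, None, None], landmarks[:, 1, None, None]
        return torch.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2 * sigma ** 2))   # (42, H, W)

    def total_loss(x_occ_lm, x_gt, mask, H, generator, lhp, discriminator, ssim_fn,
                   w_rec=1.0, w_adv=0.25, w_ssim=60.0, w_mask=1.0, w_lhm=1.0):
        """x_occ_lm: occluded image concatenated with the merged landmark heatmap (Sect. 3.2),
        mask: binary occlusion mask I_mask, H: (42, H, W) ground-truth heatmaps.
        Default weights follow the values reported in Sect. 4.1."""
        x_rec = generator(x_occ_lm)                                  # attention-based encoder-decoder
        l_rec = F.l1_loss(x_rec, x_gt)                               # Eq. (1)
        l_adv = (torch.log(discriminator(x_gt) + 1e-8)
                 + torch.log(1 - discriminator(x_rec) + 1e-8)).mean()  # Eq. (2); D is assumed to output
                                                                       # probabilities; in practice this term
                                                                       # is split between D and G updates
        l_ssim = 1.0 - ssim_fn(x_rec, x_gt)                          # Eq. (3), taken as 1 - SSIM so that
                                                                     # minimizing the loss raises the SSIM
        l_mask = F.l1_loss(mask * x_rec, mask * x_gt)                # Eq. (4)
        l_lhm = F.mse_loss(lhp(x_rec), H)                            # Eq. (5)
        return (w_rec * l_rec + w_adv * l_adv + w_ssim * l_ssim
                + w_mask * l_mask + w_lhm * l_lhm)                   # Eq. (6)

Note that, in the two-stage schedule described in Sect. 4.1, the individual terms are enabled incrementally rather than all at once.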

4 Experiments and Results

4.1 Dataset and Training Settings

Dataset Preparation: We train the network on different face video sequences for multiple identities. We train a person-specific model for every identity on 4–5 sequences captured in various appearances, including apparel, hairstyle, facial accessories, and different head poses. Videos are recorded at a resolution of 1280 × 720 at 30 fps using a regular smart-phone and then cropped to 256 × 256 for training. Note that there is no


overlap between the training and test set. To test the ability of our model to generalize to novel appearances, we validate it with completely unseen videos that are not seen during the training process. The dataset is available here. For the provision of spatial supervision, we use Mediapipe [2] to detect and localize 42 landmarks around the eye, including iris landmarks. As discussed in Sect. 3.2, we create a heatmap for every landmark coordinate. Since we extract pseudo landmarks directly from unoccluded ground truth that is inherently spatially aligned with the occluded face, we directly append landmark heatmaps with the occluded input image without any further processing. Inference with Real Occlusion: The eye information might not be directly accessible during inference when wearing regular virtual reality headsets. Fortunately, modern devices allow eye tracking using IR cameras mounted inside headsets. We can extract this information from eye images captured using these cameras. Unfortunately, the images captured by these cameras are not aligned with the face image. Hence, we need to calibrate both the eye and face camera as proposed in [8, 17] to align the eye images with the face image coordinate system. However, due to the unavailability of these headsets, we opt for pseudo landmarks extracted from ground-truth images to provide supervision to the model. As discussed, we extract these landmarks using the Mediapipe [2] face landmark detector. It is to be noted that these landmarks do not adhere well to an anatomically defined point across every video frame and thus have local noise in them generated due to the inaccuracy of the facial landmark detector. Training Strategy: We follow a similar two-stage training strategy proposed in [3]. In the first stage, we only train the encoder-decoder network without an attention module on unoccluded images along with their corresponding eye landmark images of the person using the first three losses aforementioned in Sect. 3.3, each added incrementally after 400, 100, and 300 epochs, respectively. In the second stage, we fine-tune the same encoder-decoder with the attention module and the landmark prediction module on occluded images of the same person and their corresponding eye landmark images using two additional loss functions. We use the same three losses as the first stage. Apart from this, we also use a landmark heatmap prediction loss to regularize the reconstructions generated from the attention-based encoder-decoder network and a mask-based loss to minimize reconstruction errors in the masked region. We use λrec = 1, λadv = 0.25, λssim = 60, λlhm = 1 and λmask = 1. 4.2 Results In this section, we present the results of our method and discuss its superiority over existing approaches. We first compare the visual quality of the reconstruction generated by our method with popular state-of-the-art inpainting methods, followed by a quantitative analysis using standard evaluation metrics. To further validate the efficacy of our approach, we also report an ablation study conducted in the scope of this work. Qualitative Comparison: For visual comparisons, we evaluate our method against various image inpainting methods across 20 subjects. Results highlighted in Figs. 4, 6,


7 show that the reconstructions generated using our method are visually pleasing and consistent across frames compared to other inpainting methods. Reconstructions generated using our approach, as shown in row (C) of Fig. 4, show the significance of landmark supervision and the regularization loss in capturing eye movements such as blinks. However, predictions using other approaches are often incoherent across frames. As visible in row (F), DeepFillv2 [13] fails to generate plausible reconstructions in the eye region. LaFIn [12] and Edge-Connect [5] generate superior reconstructions compared to DeepFillv2; however, they cannot handle eye movements. Besides, there is a noticeable discrepancy between the left and right eyes that looks unnatural. Baseline [3] produces natural-looking reconstructions but cannot handle eye blinks. In Fig. 3, we also show the reconstruction error (l2 error) between the results generated by the different image inpainting methods and the ground truth for better justification. For a better comparison, please refer to the supplementary video.
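The error maps of Fig. 3 can be obtained with a per-pixel l2 distance; the normalization step below is an assumption made only for visualization.

    import numpy as np

    def l2_error_map(pred, gt):
        """pred, gt: (H, W, 3) float images in [0, 1]; returns an (H, W) per-pixel l2 error map."""
        err = np.sqrt(((pred - gt) ** 2).sum(axis=-1))
        return err / (err.max() + 1e-8)   # rescaled to [0, 1] for display (assumption)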

Fig. 3. Qualitative results showing the reconstruction error (l2 error) between the results generated by different image inpainting methods and the ground truth.


Fig. 4. Qualitative comparison with SOTA image inpainting methods. From row (A-G) are Occluded (input), Original (ground-truth), Ours, Baseline [3], Edge-connect [5], DeepFillv2 [13] and LaFIn [12] respectively. From left to right are consecutive frames of unseen testing video.


Quantitative Comparison: To quantify the quality of the reconstructions, we use standard image quality metrics such as SSIM [10], PSNR [4], and LPIPS [16]. For SSIM and PSNR, a higher value indicates better reconstruction quality, and vice-versa. Similarly, for LPIPS, a lower value indicates better perceptual quality, and vice-versa. Table 1 shows the quantitative comparison of our proposed method with other state-of-the-art face inpainting methods. As reported, our method (in bold) performs better in all evaluation metrics than other methods such as Edge-connect [5], DeepFillv2 [13], Baseline [3] and LaFIn [12].

Table 1. Quantitative comparison of our method with other state-of-the-art image inpainting methods.

Method            SSIM↑   PSNR↑    LPIPS↓
LaFIn [12]        0.914   23.693   0.0601
EdgeConnect [5]   0.908   23.10    0.0689
DeepFillv2 [13]   0.845   19.693   0.117
Baseline [3]      0.918   29.025   0.042
Ours              0.949   31.417   0.0235
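For reference, the three metrics of Tables 1 and 2 are commonly computed as sketched below; the exact evaluation protocol used by the authors (data range, crops, averaging) is not specified, so treat this only as an indicative recipe. The lpips package follows [16], while the scikit-image implementations follow [10] and [4].

    import torch
    import lpips
    from skimage.metrics import structural_similarity, peak_signal_noise_ratio

    lpips_fn = lpips.LPIPS(net='alex')            # perceptual distance of [16]

    def image_quality(pred, gt):
        """pred, gt: (H, W, 3) uint8 images; returns (SSIM, PSNR, LPIPS)."""
        s = structural_similarity(pred, gt, channel_axis=2)   # channel_axis needs scikit-image >= 0.19
        p = peak_signal_noise_ratio(gt, pred)
        to_tensor = lambda x: torch.from_numpy(x).permute(2, 0, 1)[None].float() / 127.5 - 1.0
        l = lpips_fn(to_tensor(pred), to_tensor(gt)).item()   # LPIPS expects inputs in [-1, 1]
        return s, p, l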

5 Ablation Studies

We perform several ablation studies to understand the various aspects of our model. We first analyze the effect of providing spatial supervision to the model in enhancing reconstruction quality, both qualitatively and quantitatively. As depicted in Figs. 5 and 8, our model with landmarks produces aesthetically pleasing eyes and preserves the eyelid shape, in contrast to the one without landmark supervision. This is due to the guidance provided by the landmarks, which helps the model enforce consistency in eye movements, including the opening and closing of the eyes. However, it is to be noted that this does not ensure that the eye movements are temporally coherent. Secondly, we show the effect of using a regularizing loss function based on landmark heatmaps to penalize the errors caused by the model. Table 2 reports the positive impact of using eye landmarks and the landmark-based loss function in guiding the reconstruction in the eye region.


Table 2. Ablation study showing the significance of using landmark supervision on the reconstruction quality. Here, LHM is the auxiliary landmark heatmap provided along with the input image and L_{lhm} is the regularizing loss function.

Method                      SSIM↑   PSNR↑    LPIPS↓
Baseline [3]                0.918   29.025   0.042
Ours (with LHM)             0.926   29.272   0.0418
Ours (with LHM + L_{lhm})   0.949   31.417   0.0235

Fig. 5. Testing results showing the effect of using landmarks as an auxiliary input to the network. Rows (1–4) are the occluded input, the original (ground truth), and the results with and without landmarks, respectively. From left to right are temporally consecutive frames from the original 30 fps videos.


Fig. 6. Qualitative comparison with SOTA image inpainting methods. From row (A-G) are Occluded (input), Original (ground-truth), Ours, Baseline [3], Edge-connect [5], DeepFillv2 [13] and LaFIn [12] respectively. From left to right are consecutive frames of unseen testing video.


Fig. 7. Qualitative comparison with SOTA image inpainting methods. From row (A-G) are Occluded (input), Original (ground-truth), Ours, Baseline [3], Edge-connect [5], DeepFillv2 [13] and LaFIn [12] respectively. From left to right are consecutive frames of unseen testing video.


Fig. 8. Testing results showing the effect of using landmarks as an auxiliary input to the network. Rows (1–4) are the occluded input, the original (ground truth), and the results with and without landmarks, respectively. From left to right are temporally consecutive frames from the original 30 fps videos.

6 Conclusion

We present this work as an enhancement of existing facial de-occlusion networks that explicitly focuses on improving eye synthesis. We show that providing landmark information during the inpainting process can yield superior-quality, photorealistic reconstructions, including in the eye region. We discuss how this information can be retrieved: 1) by extracting pseudo landmarks from ground-truth images and 2) by using modern HMD devices capable of tracking eye movements. To further enhance the generated output, we propose a landmark-based loss function that acts as a regularizing term to improve reconstruction quality and helps capture subtle eye movements such as eye blinks. We conducted qualitative and quantitative analyses and reported superior results over other SOTA inpainting methods to justify the usefulness of our approach.

References 1. Browatzki, B., Wallraven, C.: 3FabRec: fast few-shot face alignment by reconstruction. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (2020)


2. Grishchenko, I., Ablavatski, A., Kartynnik, Y., Raveendran, K., Grundmann, M.: Attention mesh: high-fidelity face mesh prediction in real-time (2020) 3. Gupta, S., Shetty, A., Sharma, A.: Attention based occlusion removal for hybrid telepresence systems. In: 19th Conference on Robots and Vision (CRV) (2022) 4. Hor´e, A., Ziou, D.: Image quality metrics: PSNR vs. SSIM. In: ICPR (2010) 5. Nazeri, K., Ng, E., Joseph, T., Qureshi, F.Z., Ebrahimi, M.: Edgeconnect: generative image inpainting with adversarial edge learning (2019) 6. Numan, N., ter Haar, F., Cesar, P.: Generative RGB-D face completion for head-mounted display removal. In: 2021 IEEE Conference on Virtual Reality and 3D User Interfaces Abstracts and Workshops (VRW). IEEE (2021) 7. Radford, A., Metz, L., Chintala, S.: Unsupervised representation learning with deep convolutional generative adversarial networks (2016) 8. Thies, J., Zoll¨ofer, M., Stamminger, M., Theobalt, C., Nießner, M.: FaceVR: Real-Time Facial Reenactment and Eye Gaze Control in Virtual Reality (2016) 9. Wang, M., Wen, X., Hu, S.M.: Faithful face image completion for HMD occlusion removal. In: 2019 IEEE International Symposium on Mixed and Augmented Reality Adjunct (ISMAR-Adjunct). IEEE (2019) 10. Wang, Z., Bovik, A.C., Sheikh, H.R., Simoncelli, E.P.: Image quality assessment: from error visibility to structural similarity. IEEE Trans. Image Process. 13, 600–612 (2004) 11. Wu, Y., Singh, V., Kapoor, A.: From image to video face inpainting: spatial-temporal nested GAN (STN-GAN) for usability recovery. In: 2020 IEEE Winter Conference on Applications of Computer Vision (2020) 12. Yang, Y., Guo, X., Ma, J., Ma, L., Ling, H.: LaFIn: Generative landmark guided face inpainting (2019) 13. Yu, J., Lin, Z., Yang, J., Shen, X., Lu, X., Huang, T.S.: Generative image inpainting with contextual attention (2018) 14. Zhang, J., et al.: Unsupervised high-resolution portrait gaze correction and animation. IEEE Trans. Image Process. 31, 1572–1586 (2022) 15. Zhang, J., et al.: Dual in-painting model for unsupervised gaze correction and animation in the wild. In: ACM MM (2020) 16. Zhang, R., Isola, P., Efros, A.A., Shechtman, E., Wang, O.: The unreasonable effectiveness of deep features as a perceptual metric. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2018 17. Zhao, Y., et al.: Mask-off: synthesizing face images in the presence of head-mounted displays. In: 2019 IEEE Conference on Virtual Reality and 3D User Interfaces (VR) (2019)

Consistency-Based Self-supervised Learning for Temporal Anomaly Localization

Aniello Panariello(B), Angelo Porrello, Simone Calderara, and Rita Cucchiara

AImageLab, University of Modena and Reggio Emilia, Modena, Italy
{Aniello.Panariello,Angelo.Porrello,Simone.Calderara,Rita.Cucchiara}@unimore.it

Abstract. This work tackles Weakly Supervised Anomaly Detection, in which a predictor is allowed to learn not only from normal examples but also from a few labeled anomalies made available during training. In particular, we deal with the localization of anomalous activities within the video stream: this is a very challenging scenario, as training examples come only with video-level annotations (and not frame-level). Several recent works have proposed various regularization terms to address it, i.e., by enforcing sparsity and smoothness constraints over the weakly-learned frame-level anomaly scores. In this work, we get inspired by recent advances within the field of self-supervised learning and ask the model to yield the same scores for different augmentations of the same video sequence. We show that enforcing such an alignment improves the performance of the model on XD-Violence.

Keywords: Video anomaly detection · Temporal action localization · Weakly supervised · Self supervised learning

1 Introduction

The goal of Video Anomaly Detection is to detect and localize anomalous events occurring in a video stream. Usually, these events involve human actions such as abuse, falling, fighting, theft, etc. In the last decades, such a task has gained relevance thanks to its potential for a video surveillance pipeline. In light of the widespread presence of CCTV cameras, employing enough personnel to examine all the footage is indeed infeasible and, therefore, automatic tools must be exploited. Traditional approaches leverage low level features [16] to deal with this task, computed either on visual cues [3,22] or object trajectories [8,35]. These techniques – which could suffer and be unreliable in some contexts [20,32] – have been surpassed by reconstruction-based approaches [19]: these methods require a set of normal data to learn a model of regularity; afterward, an example can be assessed as “anomalous” if it substantially deviates from the learned model. In this regard, c The Author(s), under exclusive license to Springer Nature Switzerland AG 2023  L. Karlinsky et al. (Eds.): ECCV 2022 Workshops, LNCS 13805, pp. 338–349, 2023. https://doi.org/10.1007/978-3-031-25072-9_22


the architectural design often resorts to deep unsupervised autoencoders [45,55], which may include several strategies for regularizing the latent space. The major recent trend [42] regards the exploitation of weak supervision, which provides the learning phase also with examples from the anomalous classes. However, the annotations come in a video-level format: the learner does not know in which time steps the anomaly will show, but only if it does at least one time. For such a reason, the proposed techniques often fall into the framework of multiple instance learning [17,43,47], devising additional optimization constraints that mitigate the lack of frame-level annotations. In this work, we attempt to complement the existing regularization techniques with an idea from the fields of self-supervised learning and consistency regularization [6,15,41,48]. In this context, data augmentation is exploited to generate positive pairs, consisting of a couple of slightly different versions of the same example. Afterward, the network is trained to output very similar representations, in a self-supervised manner (no class labels are required). Similarly, we devise a data augmentation strategy tailored to sequences and encourage the network to assign the same frame-level anomaly scores for the elements of each positive pair. Such a strategy resembles the smoothness constraint often imposed in recent works; however, while those approaches focus on adjacent time steps, our approach can randomly span over longer temporal windows. To evaluate the merits of our proposal, we conduct experiments on the XD-Violence [47] dataset. We found that the presence of our regularization term yields remarkable improvements, although it is not yet enough to reach state-of-the-art approaches.

2 Related Work

Video Anomaly Detection. Traditionally researchers and practitioners exploit object trajectories [8,34,35], low-level handcrafted features [3,22,33] (such as Histogram of Gradients and Histogram of Flows), and connected component analysis cues [2,4,5]. These traditional approaches have proven to be effective on benchmark datasets, but turn out to be still ineffective when used on a real domain. This means that these methods do not adapt well to anomalies that have never been seen before. For this reason, the task of anomaly detection has recently been approached with deep neural networks. Most recent works lean towards unsupervised methods [1,19] usually based on deep autoencoders. These models leverage the reconstruction error to measure how much an incoming example conforms to the training set distribution. Remarkably, the authors of [29] resorted to a variation of this common paradigm: namely, they proposed an approach that learns the distribution of normal data by guessing the appearance of future frames. In this respect, the idea was to compare the actual frame and the predicted one: if their difference is high, then it is possible to assume that an anomalous event occurred. While it is easy to label whole videos (e.g., anomaly present or not), the availability of fine-grained labeled anomalous videos is often scarce. For this reason, weakly supervised methods have seen a great advance. Among these works,


Sultani et al. [42] introduced a new paradigm for video anomaly detection termed Multiple Instance Learning (MIL), upon which most of the subsequent weakly supervised methods are built. In this scenario, a positive video (i.e., containing an anomaly) and a negative one are taken into account at each iteration. These videos are usually split into segments; the model assigns them an anomaly score through a sequence of convolutional layers followed by an MLP. The scores are collected into a positive bag and a negative bag: since the former contains at least one anomaly (while the latter contains none), the maximum score of the positive bag has to be as far as possible from the maximum score of the negative bag. To impose this constraint, the objective function contains a MIL ranking loss as well as smoothness and sparsity constraints. Zhou et al. [55] introduced the attention mechanism combined with the MIL approach to improve the localization of the anomalies.

Temporal Action Localization. The majority of the works dealing with Temporal Action Localization are either fully supervised or weakly supervised. The former can be broadly categorized into one-stage [27,30] and two-stage methods [26,49,53]. For the first type, in works such as [27], action boundaries and labels are predicted simultaneously; in [30] this concept is developed by exploiting Gaussian kernels to dynamically optimize the temporal scale of each action proposal. On the other hand, two-stage methods initially generate temporal action proposals and then classify them. Proposals can be produced by exploiting the anchor mechanism [13,18,50], a sliding window [40], or by combining confident starting and ending frames of an action [26,28]. Weakly supervised methods have been pioneered by UntrimmedNet [44], STPN [36] and AutoLoc [39], in which the action instances are localized by applying a threshold on the class activation sequence. This paradigm has been recently brought into the video anomaly detection setting in [31,46].
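For context only, a minimal sketch of the MIL ranking objective of [42] summarized above is given below (this is not part of the method proposed in this paper); the margin and the weight of the constraints are assumptions.

    import torch

    def mil_ranking_loss(pos_scores, neg_scores, lam=8e-5, margin=1.0):
        """pos_scores, neg_scores: (T,) segment scores of an anomalous and of a normal video (the two bags)."""
        rank = torch.clamp(margin - pos_scores.max() + neg_scores.max(), min=0)  # hinge on the bag maxima
        smooth = ((pos_scores[:-1] - pos_scores[1:]) ** 2).sum()                 # temporal smoothness
        sparse = pos_scores.sum()                                                # sparsity of anomalies
        return rank + lam * (smooth + sparse)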

3 Proposed Method

In the following, we present the proposed model and its training objective. The third part is dedicated to explaining the process of proposal generation, which consists of a post-processing step grouping adjacent similar scores into contiguous discrete intervals.

3.1 Model

In our set-up, each video is split into segments of 16 frames, with no overlap between consecutive segments. To extract video-level features, some works [12,52] opted for combining 2D-CNNs and Recurrent Neural Networks; instead, we give each segment to a pre-trained I3D network [10]. Indeed, the authors of [24] have shown that I3D features can prove effective for video anomaly detection, even when a shallow classifier such as XGBoost [9,14] is employed for later classification. In our case, the I3D network is pre-trained on the Kinetics [21] dataset and not fine-tuned on our target data.
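A sketch of this feature-extraction step follows; the i3d callable stands for a Kinetics-pretrained I3D backbone returning one feature vector per 16-frame clip, and its interface (as well as the pre-processing) is an assumption.

    import torch

    def video_to_segments(frames, seg_len=16):
        """frames: (N, H, W, 3) uint8 video; returns (T, seg_len, H, W, 3) non-overlapping segments."""
        t = frames.shape[0] // seg_len
        return frames[: t * seg_len].reshape(t, seg_len, *frames.shape[1:])

    @torch.no_grad()
    def extract_sequence(frames, i3d):
        """Returns the sequence X = (x_1, ..., x_T) of per-segment I3D features."""
        segs = torch.as_tensor(video_to_segments(frames))                   # (T, 16, H, W, 3)
        clips = segs.permute(0, 4, 1, 2, 3).float() / 255.0                 # (T, 3, 16, H, W)
        return torch.stack([i3d(clip[None]).flatten() for clip in clips])   # (T, D)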


Fig. 1. Overview of the proposed framework. (left) Augmentation function sampling two slightly different sequences out of a single one. The original sequence gets split into windows and, for each, we randomly sample a single feature vector. This is done twice to obtain two sequences X^A and X^B. (right) Both sequences are separately fed to the model, obtaining two sequences of attention weights λ^A and λ^B, pulled closer together by the alignment loss.

This way, each example is represented by a variable-length sequence of T feature vectors X = (x_1, x_2, ..., x_T). During training, examples come with a label y indicating whether an anomalous event appears in that sequence at least one time. Hence, given a training set of examples {(X_j, y_j)}_{j=1}^{N} forged in the above-mentioned manner, we seek to train a neural network f(·; θ) that solves such a task with the lowest empirical error:

\min_{\theta} \frac{1}{N} \sum_{i=1}^{N} \mathrm{BCE}(f(X_i; \theta), y_i)    (1)

where BCE(·, ·) stands for the binary cross entropy loss function. For the architectural design of f (·; θ), we took inspiration from [36]. It consists of two main parts, discussed in the following paragraphs: namely, the computation of attention coefficients and the creation of an aggregate video-level representation. Attention Coefficients. The aim of this module is to assign a weight λt ∈ [0, 1] to each element of the input sequence. As explained in the following, these weights will identify the most salient segments of the video i.e., the likelihood of having observed an abnormal event within each segment. The module initially performs a masked temporal 1D convolution [25] to allow each feature vector to encode information from the past. Such a transformation – which does not alter the number of input feature vectors – is followed by two fully connected layers activated by ReLU functions, except for the last layer where a sigmoid function is employed. Video-Level Representation. Once we have guessed the attention values, we exploit them to aggregate the input feature vectors. Such an operation – which


resembles a temporal weighted average pooling presented in [37] – produces a single feature vector with unchanged dimensionality; formally:

\bar{x} = \sum_{t=1}^{T} \lambda_t x_t    (2)

We finally feed it to a classifier g(·), composed of two fully connected layers. The final output represents the guess of the network for the value of y.
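A compact PyTorch sketch of f(·; θ) as described above is given below; the feature dimension, hidden width and kernel size of the causal convolution are not reported, so the values here are placeholders.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class WeakAnomalyNet(nn.Module):
        def __init__(self, d=1024, hidden=512, k=3):        # d: assumed I3D feature dimension
            super().__init__()
            self.pad = k - 1                                 # left-only padding -> masked (causal) conv
            self.conv = nn.Conv1d(d, d, kernel_size=k)
            self.att = nn.Sequential(nn.Linear(d, hidden), nn.ReLU(),
                                     nn.Linear(hidden, 1), nn.Sigmoid())
            self.cls = nn.Sequential(nn.Linear(d, hidden), nn.ReLU(),
                                     nn.Linear(hidden, 1))

        def forward(self, x):                                # x: (B, T, D) I3D features
            z = self.conv(F.pad(x.transpose(1, 2), (self.pad, 0))).transpose(1, 2)
            lam = self.att(z).squeeze(-1)                    # (B, T) attention coefficients in [0, 1]
            pooled = (lam.unsqueeze(-1) * x).sum(dim=1)      # Eq. (2): temporal weighted pooling
            logit = self.cls(pooled).squeeze(-1)             # video-level prediction (logit of y)
            return logit, lam

The per-segment T-CAM activations a_t used later in Sect. 3.3 can be obtained by applying the same classifier to the individual feature vectors x_t.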

3.2 Training Objective

As mentioned before, we train our network in a weakly supervised fashion, i.e., only video-level labels are provided to the learner. However, to provide a stronger training signal and to encourage the attention coefficients to highlight salient events, we follow recent works [7,51] and use some additional regularization terms that encode a prior knowledge we retain about the dynamics of abnormality. Often, the presence of anomalous activities is characterized by the properties of sparseness and smoothness. Namely, anomalies appear rarely (i.e., normal events dominate) and transitions between the two modalities usually occur throughout multiple frames. Such a peculiarity represents a prior we would like to enforce over the scores learned by the model: the majority of them should be close to zero and have similar values across neighboring video segments. In formal terms, the first constraint can be injected by penalizing the l1 norm [11] of the attention weights, as follows:

L_{sp} = \| \lambda \|_1    (3)

while the second one can be carried out by imposing adjacent coefficients to vary as little as possible:

L_{sm} = \sum_{t=1}^{T-1} (\lambda_t - \lambda_{t+1})^2    (4)
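In code, the two regularizers of Eqs. (3) and (4) amount to a couple of lines; averaging over the batch is an assumption.

    def sparsity_loss(lam):
        """lam: (B, T) attention coefficients. Eq. (3): l1 norm of the weights."""
        return lam.abs().sum(dim=-1).mean()

    def smoothness_loss(lam):
        """Eq. (4): squared variation of adjacent coefficients."""
        return ((lam[..., :-1] - lam[..., 1:]) ** 2).sum(dim=-1).mean()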

Alignment Loss. Our main contribution consists in adding a consistency-based regularization term to the overall objective function. Overall, the idea is to generate two slightly different sequences out of a single one and, then, to encourage the model to produce the same attention coefficients for the two inputs. To do so, we introduce a data augmentation function, shown in Fig. 1, that allows us to forge different versions X^A and X^B from the same example X. In more detail, we split each sequence (x_1, x_2, ..., x_T) into fixed-size blocks, whose length L is a hyperparameter we always set to 3; afterward, we randomly choose a feature vector within each block. Once the variants X^A and X^B have been created, we ask the network to minimize the following objective function:

L_a = \sum_{t=1}^{T} (\lambda^A_t - \lambda^B_t)^2    (5)


where λ^A and λ^B are respectively the attention coefficients computed by the network for X^A and X^B. With this additional regularization term, we seek to enforce that not only adjacent time-steps should have the same weight, but also those lying within a wider temporal horizon.

Overall Objective. Finally, the objective function will be:

L ≡ L_{cl} + α L_{sp} + β L_{sm} + γ L_a    (6)

where the parameters α, β, and γ give different weights to each loss component and L_{cl} is the binary cross entropy loss function.
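Putting the pieces together, the sketch below implements the block-wise augmentation of Fig. 1, the alignment term of Eq. (5), and one optimization step of Eq. (6) with the weights reported in Sect. 4; it reuses the sparsity and smoothness helpers sketched after Eq. (4) and a model like the one in Sect. 3.1. Computing the classification loss on one of the two augmented views is an assumption.

    import torch
    import torch.nn.functional as F

    def sample_view(x, block=3):
        """x: (T, D) feature sequence; keep one randomly chosen vector per block of length L = 3."""
        n = x.shape[0] // block
        idx = torch.arange(n) * block + torch.randint(0, block, (n,))
        return x[idx]

    def alignment_loss(lam_a, lam_b):                       # Eq. (5)
        return ((lam_a - lam_b) ** 2).sum()

    def training_step(model, x, y, alpha=2e-8, beta=0.002, gamma=0.5):
        """model: a network such as the sketch in Sect. 3.1; x: (T, D) sequence, y: video-level label tensor."""
        x_a, x_b = sample_view(x)[None], sample_view(x)[None]
        logit, lam_a = model(x_a)
        _, lam_b = model(x_b)
        l_cl = F.binary_cross_entropy_with_logits(logit, y.view(1).float())
        return (l_cl + alpha * sparsity_loss(lam_a) + beta * smoothness_loss(lam_a)
                + gamma * alignment_loss(lam_a, lam_b))     # Eq. (6)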

3.3 Temporal Proposal

During inference, we refine the anomaly scores by applying a post-processing step. Usually, two segments considered paramount by the network are interleaved by "holes", mostly due to noisy acquisitions or poor representations. The purpose of this phase is therefore to merge temporally close detections into a single retrieved candidate. In particular, as done in [54], we initially take out from the candidate set all those time-steps whose corresponding attention scores are lower than a certain threshold (in our experiments, < 0.35). The remaining non-zero scores will be used to generate the temporal proposal. To generate the proposals, we do not use the rough coefficients, but instead a more refined version. In particular, we compute a 1-d activation map in the temporal domain, called Temporal Class Activation Map (T-CAM) [54], which indicates the relevance of the segment t in the prediction of one of the two classes involved (normal vs anomalous). Each value a_t of such an activation map is computed as a_t = g(x_t), i.e., the guess of the classifier g(·) (introduced in Sect. 3.1) when masking the contributions of all time-steps except the t-th one. Furthermore, we extract the Weighted T-CAM, which combines the attention weight and the T-CAM activation values, i.e., ψ_t = λ_t · a_t. This operation lets us emphasize the most important features for generating the proposal. The last operation involves interpolating the weighted scores in the temporal axis and taking the bounding box that covers the largest connected component [36] to generate the final proposal [54]. The anomaly score for each proposal is then computed as:

\mathrm{score} = \sum_{t=t_{start}}^{t_{end}} \frac{\lambda_t \cdot a_t}{t_{end} - t_{start} - 1}    (7)

where t_{start} and t_{end} represent the beginning and the ending of a single proposal.
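The proposal-generation step described above, together with Eq. (7), could be implemented roughly as follows; the interpolation to frame resolution and the order of thresholding are assumptions, and scipy's connected-component labelling stands in for the procedure of [36,54].

    import numpy as np
    from scipy.ndimage import label

    def temporal_proposal(lam, a, n_frames, thr=0.35):
        """lam: (T,) attention weights, a: (T,) T-CAM activations a_t = g(x_t), n_frames: video length.
        Returns (t_start, t_end, score) for the largest connected component of the weighted T-CAM."""
        psi = lam * a                                              # weighted T-CAM, psi_t = lam_t * a_t
        psi = np.interp(np.linspace(0, len(psi) - 1, n_frames),
                        np.arange(len(psi)), psi)                  # interpolate along the temporal axis
        comps, n = label(psi >= thr)                               # discard low scores, group the rest
        if n == 0:
            return None
        largest = max(range(1, n + 1), key=lambda c: int((comps == c).sum()))
        idx = np.flatnonzero(comps == largest)
        t_start, t_end = int(idx[0]), int(idx[-1])
        score = psi[t_start:t_end + 1].sum() / max(t_end - t_start - 1, 1)   # Eq. (7)
        return t_start, t_end, score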

4 Experiments

We conduct our experiments on the XD-Violence Dataset [47], a multi-modal dataset that contains scenes from different sources such as movies, sports, games,


Table 1. For different levels and metrics, results of our model with and without the proposed align loss. There is an improvement on almost all metrics when leveraging the proposed term.

Align loss   Video level       Segment level     Frame level proposal   Frame level
             AUC%    AP%       AUC%    AP%       AUC%    AP%            AUC%    AP%
-            97.91   98.36     84.39   66.75     85.14   68.01          84.57   65.96
✓            97.79   98.28     85.49   66.87     90.23   71.68          85.65   66.05

Table 2. Comparison with recent works. We report the frame-level AP score on XD-Violence for both unsupervised and weakly-supervised methods. All the competitors exploit the I3D network for extracting features from RGB frames.

Supervision          Method                  AP%
Unsupervised         SVM baseline            50.78
                     OCSVM [38]              27.25
                     Hasan et al. [19]       30.77
Weakly Supervised    Ours (no align loss)    68.01
                     Ours                    71.68
                     Sultani et al. [42]     75.68
                     Wu et al. [47]          75.41
                     RTFM [43]               77.81

news, and live scenes. It holds a great variety concerning the devices used for video acquisition; indeed, the examples were captured by CCTV cameras, handheld cameras, or car driving recorders. There are 4754 videos for a total of 217 hours: among all these, 2405 are violent videos and 2349 are non-violent. The training set consists of 3954 examples; the test set, instead, features 800 ones, split into 500 violent and 300 non-violent videos. The dataset comes with multiple modalities such as RGB, optical flow, and audio; however, we restrict our analysis only to the RGB input domain. XD-Violence also comes with segment-level ground truth labels; however, we only use video-level annotation during training, conforming to the weakly supervised setting. Differently, we exploit both the segment-level and framelevel annotations during the test phase, thus evaluating the model’s capabilities for fine-grained localization. Metrics. We used the most popular metrics in anomaly detection settings, namely the Area Under Receiver Operating Characteristic Curve (AUC) and the Area Under Precision-Recall Curve (AP). While the AUC tends to be optimistic in the case of unbalanced datasets, the AP gives a more accurate evaluation of these scenarios.
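Both metrics are available off the shelf, e.g., in scikit-learn; a frame-level evaluation sketch (assuming y_true holds the 0/1 frame annotations of the test set and y_score the corresponding anomaly scores produced by the model) is:

    from sklearn.metrics import roc_auc_score, average_precision_score

    def frame_level_metrics(y_true, y_score):
        """Returns (AUC, AP) over the concatenated test-set frames."""
        return roc_auc_score(y_true, y_score), average_precision_score(y_true, y_score)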

[Plots of Fig. 2: anomaly score (y-axis) versus sequence length in frames (x-axis) for four test videos, with the temporal proposal scores and the ground truth overlaid.]

Fig. 2. Qualitative examples of the capabilities of our model to perform anomaly localization. The temporal proposal scores are indicated with a blue line, while the weighted T-CAM scores and the ground truth are shown in green and red respectively. (Color figure online)

We assess the performance of the classification head by providing these two metrics at the video level; we then use the attention scores, their interpolation, as well as the temporal proposals to assess the other granularities. Training Details. We use the Adam optimizer [23] with a learning rate of 10^{-4} for the first 10 epochs and 10^{-5} for the remaining 40 epochs. The hyperparameters for the loss components are set to: α = 2 × 10^{-8}, β = 0.002, γ = 0.5. The threshold for discarding the low weighted T-CAM scores is set to 0.35; the batch size equals 8. Results. Table 1 reports the comparison of the baseline approach with and without the proposed alignment objective. It can be observed that its addition leads to a remarkable improvement in most of the metrics. In particular, we have a gain of about 1% in AUC for the segment- and frame-level metrics, while the AP remains almost the same. The greatest improvement subsists for the temporal proposal metric, where we gain 5 points in AUC and around 4 points in AP. The video-level metrics remain approximately the same but still very high. When comparing our approach with other recent works (see Table 2), it can be seen that it outperforms the unsupervised state-of-the-art methods; however, it is in turn surpassed by the weakly supervised ones. We conjecture that such


a gap is mainly due to the bag representations inherent in these approaches, which could confer superior robustness; therefore, we leave to future work the extension of our idea to these methods. Qualitative Analysis. Figure 2 presents several qualitative results, showing two fight scenes in the first row, and a riot and an explosion in the second one. We notice that the scores rightly increase when an anomalous action begins. Unfortunately, they always remain close to the uncertainty regime (assuming a score around 0.5) and never tend towards discrete decisions. In future work, we are going to address this issue as well. Looking at the original videos, we can also explain why the scores yielded by the model are noisy and subject to local fluctuations. Indeed, sudden character or camera movements are likely to intervene and are mistaken for real anomalies by the model, which is susceptible to these visual discontinuities due to the lack of segment-level annotations during training. By contrast, when relying on the entire video sequence, the model can recognize them correctly.

5 Conclusions

This work proposes a novel strategy to learn effective frame-level scores in weakly supervised settings when only video-level annotations are made available to the learner. Our proposal – which builds upon recent advances in the field of selfsupervised learning – relies on maximizing the alignment between the attention weights of two different augmentations of the same input sequence. We show that a base network equipped also with other common regularization strategies (e.g. sparsity and smoothness) brings even more improvements. We found that it is not enough to achieve state-of-the-art performance; however, we leave aside for future works a comprehensive investigation of its applicability to more advanced architectures. Acknowledgments. This work has been supported in part by the InSecTT project, funded by the Electronic Component Systems for European Leadership Joint Undertaking under grant agreement 876038. The Joint Undertaking receives support from the European Union’s Horizon 2020 research and innovation programme and AU, SWE, SPA, IT, FR, POR, IRE, FIN, SLO, PO, NED and TUR. The document reflects only the author’s view and the Commission is not responsible for any use that may be made of the information it contains.

References 1. Abati, D., Porrello, A., Calderara, S., Cucchiara, R.: Latent space autoregression for novelty detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2019) 2. Amraee, S., Vafaei, A., Jamshidi, K., Adibi, P.: Anomaly detection and localization in crowded scenes using connected component analysis. Multim. Tools Appl. 77,14767–14782 (2018)


3. Benezeth, Y., Jodoin, P.M., Saligrama, V., Rosenberger, C.: Abnormal events detection based on spatio-temporal co-occurrences. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. IEEE (2009) 4. Bolelli, F., Allegretti, S., Baraldi, L., Grana, C.: Spaghetti labeling: directed acyclic graphs for block-based connected components labeling. IEEE Trans. Image Process. 29, 1999 –2012 (2019) 5. Bolelli, F., Allegretti, S., Grana, C.: One DAG to rule them all. IEEE Trans. Pattern Anal. Mach. Intell. (99), 1–1 (2021) 6. Boschini, M., Buzzega, P., Bonicelli, L., Porrello, A., Calderara, S.: Continual semisupervised learning through contrastive interpolation consistency. arXiv preprint arXiv:2108.06552 (2021) 7. Cai, R., Zhang, H., Liu, W., Gao, S., Hao, Z.: Appearance-motion memory consistency network for video anomaly detection. In: Proceedings of the AAAI Conference on Artificial Intelligence (2021) 8. Calderara, S., Heinemann, U., Prati, A., Cucchiara, R., Tishby, N.: Detecting anomalies in people’s trajectories using spectral graph analysis. Comput. Vis. Image Underst. 115, 1099–1111 (2011) 9. Candeloro, L., et al.: Predicting WNV circulation in Italy using earth observation data and extreme gradient boosting model. Remote Sens 12(18), 3064 (2020) 10. Carreira, J., Zisserman, A.: Quo vadis, action recognition? A new model and the kinetics dataset. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2017) 11. Cascianelli, S., Costante, G., Crocetti, F., Ricci, E., Valigi, P., Luca Fravolini, M.: Data-based design of robust fault detection and isolation residuals via lasso optimization and Bayesian filtering. Asian J. Control 23, 57–71 (2021) 12. Cascianelli, S., Costante, G., Devo, A., Ciarfuglia, T.A., Valigi, P., Fravolini, M.L.: The role of the input in natural language video description. IEEE Trans. Multim. 22, 271 –283 (2019) 13. Chao, Y.W., Vijayanarasimhan, S., Seybold, B., Ross, D.A., Deng, J., Sukthankar, R.: Rethinking the faster r-CNN architecture for temporal action localization. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2018) 14. Chen, T., Guestrin, C.: Xgboost: a scalable tree boosting system. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (2016) 15. Chen, T., Kornblith, S., Norouzi, M., Hinton, G.: A simple framework for contrastive learning of visual representations. In: International Conference on Machine Learning (2020) 16. Dalal, N., Triggs, B.: Histograms of oriented gradients for human detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. IEEE (2005) 17. Feng, J.C., Hong, F.T., Zheng, W.S.: MIST: multiple instance self-training framework for video anomaly detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2021) 18. Gao, J., Yang, Z., Chen, K., Sun, C., Nevatia, R.: Turn tap: temporal unit regression network for temporal action proposals. In: IEEE International Conference on Computer Vision (2017) 19. Hasan, M., Choi, J., Neumann, J., Roy-Chowdhury, A.K., Davis, L.S.: Learning temporal regularity in video sequences. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2016)


20. Hu, X., Hu, S., Huang, Y., Zhang, H., Wu, H.: Video anomaly detection using deep incremental slow feature analysis network. IET Comput. Vis. 10, 258–267 (2016) 21. Kay, W., et al.: The kinetics human action video dataset. arXiv preprint arXiv:1705.06950 (2017) 22. Kim, J., Grauman, K.: Observe locally, infer globally: a space-time MRF for detecting abnormal activities with incremental updates. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. IEEE (2009) 23. Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. In: International Conference on Learning Representations (2014) 24. Koshti, D., Kamoji, S., Kalnad, N., Sreekumar, S., Bhujbal, S.: Video anomaly detection using inflated 3D convolution network. In: International Conference on Inventive Computation Technologies (ICICT). IEEE (2020) 25. Lea, C., Flynn, M.D., Vidal, R., Reiter, A., Hager, G.D.: Temporal convolutional networks for action segmentation and detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2017) 26. Lin, T., Liu, X., Li, X., Ding, E., Wen, S.: BMN: boundary-matching network for temporal action proposal generation. In: IEEE International Conference on Computer Vision (2019) 27. Lin, T., Zhao, X., Shou, Z.: Single shot temporal action detection. In: Proceedings of the 25th ACM International Conference on Multimedia (2017) 28. Lin, T., Zhao, X., Su, H., Wang, C., Yang, M.: BSN: boundary sensitive network for temporal action proposal generation. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11208, pp. 3–21. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01225-0 1 29. Liu, W., Luo, W., Lian, D., Gao, S.: Future frame prediction for anomaly detectiona new baseline. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2018) 30. Long, F., Yao, T., Qiu, Z., Tian, X., Luo, J., Mei, T.: Gaussian temporal awareness networks for action localization. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2019) 31. Lv, H., Zhou, C., Cui, Z., Xu, C., Li, Y., Yang, J.: Localizing anomalies from weakly-labeled videos. IEEE Trans. Image Process. 30, 4505–4515 (2021) 32. Medel, J.R., Savakis, A.: Anomaly detection in video using predictive convolutional long short-term memory networks. arXiv preprint arXiv:1612.00390 (2016) 33. Mehran, R., Oyama, A., Shah, M.: Abnormal crowd behavior detection using social force model. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. IEEE (2009) 34. Monti, A., Porrello, A., Calderara, S., Coscia, P., Ballan, L., Cucchiara, R.: How many observations are enough? knowledge distillation for trajectory forecasting. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2022) 35. Morris, B.T., Trivedi, M.M.: Trajectory learning for activity understanding: Unsupervised, multilevel, and long-term adaptive approach. IEEE Trans. Pattern Anal. Mach. Intell. 33, 2287–2301 (2011) 36. Nguyen, P., Liu, T., Prasad, G., Han, B.: Weakly supervised action localization by sparse temporal pooling network. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2018) 37. Porrello, A., et al.: Spotting insects from satellites: modeling the presence of culicoides imicola through deep CNNs. In: 2019 15th International Conference on Signal-Image Technology & Internet-Based Systems (SITIS). IEEE (2019)


38. Sch¨ olkopf, B., Williamson, R.C., Smola, A., Shawe-Taylor, J., Platt, J.: Support vector method for novelty detection. In: Advances in Neural Information Processing Systems (1999) 39. Shou, Z., Gao, H., Zhang, L., Miyazawa, K., Chang, S.-F.: AutoLoc: weaklysupervised temporal action localization in untrimmed videos. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11220, pp. 162–179. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01270-0 10 40. Shou, Z., Wang, D., Chang, S.F.: Temporal action localization in untrimmed videos via multi-stage CNNs. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2016) 41. Sohn, K., et al.: Fixmatch: simplifying semi-supervised learning with consistency and confidence. In; Advances in Neural Information Processing Systems (2020) 42. Sultani, W., Chen, C., Shah, M.: Real-world anomaly detection in surveillance videos. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2018) 43. Tian, Y., Pang, G., Chen, Y., Singh, R., Verjans, J.W., Carneiro, G.: Weaklysupervised video anomaly detection with robust temporal feature magnitude learning. In: IEEE International Conference on Computer Vision (2021) 44. Wang, L., Xiong, Y., Lin, D., Van Gool, L.: Untrimmednets for weakly supervised action recognition and detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2017) 45. Wang, X., et al.: Robust unsupervised video anomaly detection by multipath frame prediction. IEEE Trans. Neural Netw. Learn. Syst. (2021) 46. Wu, J., et al.: Weakly-supervised spatio-temporal anomaly detection in surveillance video. In: International Joint Conferences on Artificial Intelligence (2021) 47. Wu, P., et al.: Not only look, but also listen: learning multimodal violence detection under weak supervision. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12375, pp. 322–339. Springer, Cham (2020). https://doi. org/10.1007/978-3-030-58577-8 20 48. Xie, Q., Dai, Z., Hovy, E., Luong, T., Le, Q.: Unsupervised data augmentation for consistency training. In: Conference on Advances in Neural Information Processing Systems (2020) 49. Xu, H., Das, A., Saenko, K.: R-c3d: region convolutional 3d network for temporal activity detection. In: IEEE International Conference on Computer Vision (2017) 50. Yang, L., Peng, H., Zhang, D., Fu, J., Han, J.: Revisiting anchor mechanisms for temporal action localization. IEEE Trans. Image Process. (99), 1–1 (2020) 51. Yu, G., et al.: Cloze test helps: effective video anomaly detection via learning to complete video events. In: Proceedings of the 28th ACM International Conference on Multimedia, pp. 583–591 (2020) 52. Yue-Hei Ng, J., Hausknecht, M., Vijayanarasimhan, S., Vinyals, O., Monga, R., Toderici, G.: Beyond short snippets: deep networks for video classification. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2015) 53. Zeng, R., et al.: Graph convolutional networks for temporal action localization. In: IEEE International Conference on Computer Vision (2019) 54. Zhou, B., Khosla, A., Lapedriza, A., Oliva, A., Torralba, A.: Learning deep features for discriminative localization. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2016) 55. Zhu, Y., Newsam, S.: Motion-aware feature for improved video anomaly detection. In: British Machine Vision Conference (2019)

Perspective Reconstruction of Human Faces by Joint Mesh and Landmark Regression

Jia Guo1, Jinke Yu1, Alexandros Lattas2, and Jiankang Deng1,2(B)

1 Insightface, London, UK
2 Imperial College London, London, UK
{a.lattas,j.deng16}@imperial.ac.uk

Abstract. Even though 3D face reconstruction has achieved impressive progress, most orthogonal projection-based face reconstruction methods cannot achieve accurate and consistent reconstruction results when the face is very close to the camera, due to the distortion under perspective projection. In this paper, we propose to simultaneously reconstruct the 3D face mesh in the world space and predict 2D face landmarks on the image plane to address the problem of perspective 3D face reconstruction. Based on the predicted 3D vertices and 2D landmarks, the 6DoF (6 Degrees of Freedom) face pose can be easily estimated by a PnP solver to represent the perspective projection. Our approach achieves 1st place on the leader-board of the ECCV 2022 WCPA challenge and our model is visually robust under different identities, expressions and poses. The training code and models are released to facilitate future research: https://github.com/deepinsight/insightface/tree/master/reconstruction/jmlr.

Keywords: Monocular 3D face reconstruction · Perspective projection · Face pose estimation

1 Introduction

Monocular 3D face reconstruction has been widely applied in many fields such as VR/AR applications (e.g., movies, sports, games), video editing, and virtual avatars. Reconstructing human faces from monocular RGB data is a well-explored field and most of the approaches can be categorized into optimization-based methods [2,3,14] or regression-based methods [7,11,13,28].

For the optimization-based methods, a prior of face shape and appearance [2,3] is used. The pioneering work 3D Morphable Model (3DMM) [2] represents the face shape and appearance in a PCA-based compact space and the fitting is then based on the principle of analysis-by-synthesis. In [3], an in-the-wild texture model is employed to greatly simplify the fitting procedure without optimization on the illumination parameters. To model textures in high fidelity, GANFit [14] harnesses Generative Adversarial Networks (GANs) to train a very powerful generator of facial texture in UV space and constrains the latent parameter by a state-of-the-art face recognition model [1,6,8,9,29,30]. However, the iterative optimization in these methods is not efficient for real-time inference.

For the regression-based methods, a series of methods are based on synthetic renderings of human faces [15,26] or 3DMM-fitted data [28] to perform supervised training of a regressor that predicts the latent representation of a prior face model (e.g., 3DMM [24], GCN [27], CNN [25]) or 3D vertices in different representation formats [7,13,17,19]. Genova et al. [15] propose a 3DMM parameter regression technique that is based on synthetic renderings and Tran et al. [24] directly regress 3DMM parameters using a CNN trained on fitted 3DMM data. Zhu et al. [28] propose 3D Dense Face Alignment (3DDFA) by taking advantage of the Projected Normalized Coordinate Code (PNCC). In [27], a joint shape and texture auto-encoder using direct mesh convolutions is proposed based on a Graph Convolutional Network (GCN). In [25], CNN-based shape and texture decoders are trained on the unwrapped UV space for non-linear 3D morphable face modelling. Jackson et al. [19] propose a model-free approach that reconstructs a voxel-based representation of the human face. DenseReg [17] regresses horizontal and vertical tessellation which is obtained by unwrapping the template shape and transferring it to the image domain. PRN [13] predicts a position map in the UV space of a template mesh. RetinaFace [7] directly predicts projected vertices on the image plane within the face detector. Besides these supervised regression methods, there are also self-supervised regression approaches. Deng et al. [11] train a 3DMM parameter regressor based on a photometric reconstruction loss with skin attention masks, a perception loss based on FaceNet [23], and multi-image consistency losses. DECA [12] robustly produces a UV displacement map from a low-dimensional latent representation.

Although the above studies have achieved good face reconstruction results, existing methods mainly use orthogonal projection to simplify the projection process of the face. When the face is close to the camera, the distortion caused by perspective projection cannot be ignored [20,31]. Due to the use of orthogonal projection, existing methods cannot well explain the facial distortion caused by perspective projection, leading to poor performance. To this end, we propose a joint 3D mesh and 2D landmark regression method in this paper. Monocular 3D face reconstruction usually includes two tasks: 3D face geometric reconstruction and face pose estimation. However, we avoid explicit face pose estimation as in RetinaFace [7] and DenseLandmark [26]. The insight behind this strategy is that the evaluation metric for face reconstruction is usually computed in the camera space, and the regression error on the 6DoF parameters propagates to every vertex. Even though dense vertex or landmark regression seems redundant, the regression error of each point is individual. Most importantly, explicit pose parameter regression can result in a drift-alignment problem [5], which makes the 2D visualization of the face reconstruction unsatisfying. By contrast, the 6 Degrees of Freedom (6DoF) face pose can be easily estimated by a PnP solver based on our predicted 3D vertices and 2D landmarks.

To summarize, the main contributions of this work are:
– We propose a Joint Mesh and Landmark Regression (JMLR) method to reconstruct 3D face shape under perspective projection, and the 6DoF face pose can be further estimated by PnP.


Fig. 1. Straightforward solutions for perspective face reconstruction. (a) direct 3D vertices regression, (b) two-stream 3D vertices regression and 6DoF prediction, and (c) our proposed Joint Mesh and Landmark Regression (JMLR).

– The proposed JMLR achieves first place on the leader-board of the ECCV 2022 WCPA challenge.
– The visualization results show that the proposed JMLR is robust under different identities, exaggerated expressions and extreme poses.

2 Our Method

2.1 3D Face Geometric Reconstruction

In Fig. 1, we show three straightforward solutions for perspective face reconstruction. The simplest solution, as shown in Fig. 1(a), is to directly regress the 3D vertices and the 6DoF pose (i.e., Euler angles and translation vector) from the original image (i.e., 800 × 800); however, the performance of this method is very limited, as mentioned by the challenge organizer. Another improved solution, illustrated in Fig. 1(b), is that the face shape should be predicted from the local facial region while the 6DoF pose can be obtained from the global image. Therefore, the local facial region is cropped and resized into 256 × 256, and then this face patch is fed into a ResNet [18] to predict the 1,220 vertices. To predict the 6DoF information, the region outside the face is blackened and the original 800 × 800 image is then resized into 256 × 256 as the input of another ResNet. In RetinaFace [7], explicit pose estimation is avoided by direct mesh regression on the image plane, as direct pose parameter regression can result in misalignment under challenging scenarios. However, RetinaFace only considered orthographic face reconstruction.


In this paper, we slightly change the regression target of RetinaFace, but still employ the insight behind RetinaFace, that is, avoiding direct pose parameter regression. More specifically, we directly regress the 3D facial mesh in the world space as well as the projected 2D facial landmarks on the image plane, as illustrated in Fig. 1(c).

For 3D facial mesh regression in the world space, we predict a fixed number of N = 1,220 vertices, V = [v_x^0, v_y^0, v_z^0; ...; v_x^{N-1}, v_y^{N-1}, v_z^{N-1}], on a pre-defined topological triangle context (i.e., 2,304 triangles). These corresponding vertices share the same semantic meaning across different faces. With the fixed triangle topology, every pixel on the face can be indexed by the barycentric coordinates and the triangle index, thus there exists a pixel-wise correspondence with the 3D face. In [26], 703 dense landmarks covering the entire head, including ears, eyes, and teeth, are proved to be effective and efficient to encode facial identity and subtle expressions for 3D face reconstruction. Therefore, 1K-level dense vertices/landmarks are a good balance between accuracy and efficiency for face reconstruction. As each 3D face is represented by concatenating its N vertex coordinates, we employ the following vertex loss to constrain the location of the vertices:

\[
L_{vert} = \frac{1}{N}\sum_{i=1}^{N} \left\| v_i(x, y, z) - v_i^*(x, y, z) \right\|_1, \qquad (1)
\]

where N = 1,220 is the number of vertices, v is the prediction of our model and v^* is the ground truth. By taking advantage of the 3D triangulation topology, we consider the edge length loss [7]:

\[
L_{edge} = \frac{1}{3M}\sum_{i=1}^{M} \left\| e_i - e_i^* \right\|_1, \qquad (2)
\]

where M = 2,304 is the number of triangles, e is the edge length calculated from the prediction and e^* is the edge length calculated from the ground truth. The edge graph is a fixed topology as shown in Fig. 1(c). For projected 2D landmark regression, we also employ a distance loss to constrain the predicted landmarks to be close to the landmarks projected from the ground truth:

\[
L_{land} = \frac{1}{N}\sum_{i=1}^{N} \left\| p_i(x, y) - p_i^*(x, y) \right\|_1, \qquad (3)
\]

where N = 1,220 is the number of vertices, p is the prediction of our model and p^* is the ground truth generated by perspective projection. By combining the vertex loss and the edge loss for 3D mesh regression with the landmark loss for 2D projected landmark regression, we define the following perspective face reconstruction loss:

\[
L = L_{vert} + \lambda_0 L_{edge} + \lambda_1 L_{land}, \qquad (4)
\]

where λ0 is set to 0.25 and λ1 is set to 2 according to our experimental experience.
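To make the training objective concrete, the following is a minimal PyTorch sketch of Eqs. (1)-(4); the tensor shapes, the function name, and the per-triangle edge computation are our own illustrative assumptions, not the authors' released code.

import torch

def jmlr_loss(pred_verts, gt_verts, pred_lmks, gt_lmks, triangles,
              lambda_edge=0.25, lambda_land=2.0):
    # pred_verts, gt_verts: (B, 1220, 3) 3D vertices in the world space
    # pred_lmks, gt_lmks:   (B, 1220, 2) projected 2D landmarks on the image plane
    # triangles:            (2304, 3) long tensor with the fixed triangle topology
    l_vert = (pred_verts - gt_verts).abs().sum(-1).mean()            # Eq. (1)

    def edge_lengths(v):
        # lengths of the three edges of every triangle, shape (B, 2304, 3)
        a, b, c = v[:, triangles[:, 0]], v[:, triangles[:, 1]], v[:, triangles[:, 2]]
        return torch.stack([(a - b).norm(dim=-1),
                            (b - c).norm(dim=-1),
                            (c - a).norm(dim=-1)], dim=-1)
    l_edge = (edge_lengths(pred_verts) - edge_lengths(gt_verts)).abs().mean()  # Eq. (2)

    l_land = (pred_lmks - gt_lmks).abs().sum(-1).mean()              # Eq. (3)
    return l_vert + lambda_edge * l_edge + lambda_land * l_land      # Eq. (4)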

2.2 6DoF Estimation

Based on the predicted 3D vertices in the world space, the projected 2D landmarks on the image plane, and the camera intrinsic parameters, we can easily employ the Perspective-n-Point (PnP) algorithm [21] to compute the 6DoF facial pose parameters (i.e., the rotation of roll, pitch, and yaw as well as the 3D translation of the camera with respect to the world). Even though directly regressing the 6DoF pose parameters from a single image by a CNN (Fig. 1(a)) is also feasible, it achieves much worse performance than our method due to the non-linearity of the rotation space. In perspective projection, the 3D face shape V_world is transformed from the world coordinate system to the camera coordinate system by using the 6DoF face pose (i.e., the rotation matrix R ∈ R^{3×3} and the translation vector T ∈ R^{1×3}) with known intrinsic camera parameters K:

\[
V_{camera} = K(V_{world} R + T), \qquad (5)
\]

where K is related to (1) the coordinates of the principal point (the intersection of the optical axes with the image plane) and (2) the ratio between the focal length and the size of the pixel. The intrinsic parameters K can be easily obtained through an off-line calibration step. Knowing 2D-3D point correspondences as well as the intrinsic parameters, pose estimation is straightforward by calling cv.solvePnP().
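For illustration, a hedged sketch of this PnP step with OpenCV is given below; the array shapes, the iterative solver flag, and the zero-distortion assumption are ours and are not specified by the paper.

import cv2
import numpy as np

def estimate_6dof(verts_world, lmks_2d, K):
    # verts_world: (1220, 3) predicted vertices in world space
    # lmks_2d:     (1220, 2) predicted landmarks on the image plane
    # K:           (3, 3) known camera intrinsic matrix
    ok, rvec, tvec = cv2.solvePnP(
        verts_world.astype(np.float64), lmks_2d.astype(np.float64),
        K.astype(np.float64), np.zeros(5),        # assume no lens distortion
        flags=cv2.SOLVEPNP_ITERATIVE)
    R, _ = cv2.Rodrigues(rvec)                    # rotation vector -> 3x3 rotation matrix
    return R, tvec.reshape(3)                     # world-to-camera rotation and translation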

3 Experimental Results

3.1 Dataset

In the perspective face reconstruction challenge, 250 volunteers were invited to record the training and test datasets. These volunteers sit in a random environment, and the 3D acquisition equipment (i.e., an iPhone 11) is fixed in front of them, at a distance ranging from about 0.3 to 0.9 m. Each subject is asked to perform 33 specific expressions with two head movements (from looking left to looking right, and from looking up to looking down). The triangle mesh and head pose information of the RGB image is obtained by the built-in ARKit toolbox. Then, the original data are pre-processed to unify the image size and camera intrinsics, as well as to eliminate the shifting of the principal point of the camera. In this challenge, 200 subjects with 356,640 instances are used as the training set, and 50 subjects with 90,545 instances are used as the test set. Note that there is only one face in each image and the location of each face is provided by the face-alignment toolkit [4]. The ground truth of the 3D vertices and the pose transform matrix of the training set is provided. The unit of the 3D ground truth is meters. The facial triangle mesh is made up of 1,220 vertices and 2,304 triangles. The indices of the 68 landmarks [10] among the 1,220 vertices are also provided.

3.2 Evaluation Metrics

For each test image, the challenger should predict the 1,220 3D vertices in world space (i.e., V_world ∈ R^{1220×3}) and the pose transform matrix (i.e., the rotation matrix R ∈ R^{3×3} and the translation vector T ∈ R^{1×3}) from world space to camera space:

\[
V_{world} = \begin{bmatrix} v_x^0 & v_y^0 & v_z^0 \\ v_x^1 & v_y^1 & v_z^1 \\ \vdots & \vdots & \vdots \\ v_x^{1219} & v_y^{1219} & v_z^{1219} \end{bmatrix}, \quad
R = \begin{bmatrix} r_{00} & r_{01} & r_{02} \\ r_{10} & r_{11} & r_{12} \\ r_{20} & r_{21} & r_{22} \end{bmatrix}, \quad
T = \begin{bmatrix} t_x & t_y & t_z \end{bmatrix}.
\]

Then, we can compute the transformed vertices in camera space by V_camera = V_world R + T. In this challenge, the error of the 3D vertices and the face pose is measured in camera space. Four transformed sets of vertices are computed as follows:

\[
V_1 = V^{gt} R^{gt} + T^{gt}, \quad V_2 = V^{pred} R^{pred} + T^{pred}, \quad
V_3 = V^{gt} R^{pred} + T^{pred}, \quad V_4 = V^{pred} R^{gt} + T^{gt}. \qquad (6)
\]

Finally, the L2 distances between the pairs (V_1, V_2), (V_1, V_3), and (V_1, V_4) are calculated and combined into the final distance error:

\[
L_{error} = \| V_1 - V_2 \|_2 + \| V_1 - V_3 \|_2 + 10\,\| V_1 - V_4 \|_2, \qquad (7)
\]

where geometry accuracy across different identities and expressions is emphasized by the factor of 10. On the challenge leader-board, the distance error is multiplied by 1,000, thus the distance error is reported in millimeters instead of meters.
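A small NumPy sketch of this metric is shown below; reading the pairwise L2 distance as the mean per-vertex Euclidean distance, and the final scaling to millimeters, are our assumptions about how the leader-board score is computed.

import numpy as np

def challenge_score(V_pred, R_pred, T_pred, V_gt, R_gt, T_gt):
    # V_*: (1220, 3), R_*: (3, 3), T_*: (1, 3) or (3,)
    V1 = V_gt @ R_gt + T_gt
    V2 = V_pred @ R_pred + T_pred
    V3 = V_gt @ R_pred + T_pred
    V4 = V_pred @ R_gt + T_gt
    d = lambda a, b: np.linalg.norm(a - b, axis=-1).mean()      # mean per-vertex L2 distance
    return 1000.0 * (d(V1, V2) + d(V1, V3) + 10.0 * d(V1, V4))  # reported in millimeters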

3.3 Implementation Details

Input. We employ five facial landmarks predicted by RetinaFace [7] for normalized face cropping at the resolution of 256 × 256. Colour jitter and flip data augmentation are also used during our training.

Fig. 2. Computation redistribution for the backbone. (a) the default ResNet-34 design. (b) our searched ResNet structure for perspective face reconstruction.


Table 1. Ablation study regarding augmentation, network structure, and loss. The results are reported on 40 subjects from the training set and the rest 160 subjects are used for training.

Aug     | Network                    | Loss          | Flops  | Score
Flip    | ResNet34                   | L             | 5.12G  | 34.31
No Flip | ResNet34                   | L             | 5.12G  | 34.76
Flip    | ResNet18                   | L             | 3.42G  | 34.95
Flip    | ResNet50                   | L             | 5.71G  | 34.70
Flip    | Searched ResNet            | L             | 5.13G  | 34.13
Flip    | ResNet34                   | Lvert + Lland | 5.12G  | 34.60
Flip    | ResNet34 & Searched ResNet | L             | 10.25G | 33.39

Backbone. We employ ResNet [18] as our backbone. In addition, we refer to SCRFD [16] to optimize the computation distribution over the backbone. For the ResNet design, there are four stages (i.e., C2, C3, C4 and C5) operating at progressively reduced resolution, with each stage consisting of a sequence of identical blocks. For each stage i, the degrees of freedom include the number of blocks d_i (i.e., network depth) and the block width w_i (i.e., number of channels). Therefore, the backbone search space has 8 degrees of freedom, as there are 4 stages and each stage i has 2 parameters: the number of blocks d_i and the block width w_i. Following RegNet [22], we perform uniform sampling of d_i ≤ 24 and w_i ≤ 512 (w_i is divisible by 8). As state-of-the-art backbones have increasing widths, we also constrain the search space according to the principle of w_{i+1} ≥ w_i. The searched ResNet structure for perspective face reconstruction is illustrated in Fig. 2(b).

Loss Optimization. We train the proposed joint mesh and landmark regression method for 40 epochs with the SGD optimizer. We set the learning rate to 0.1 and the batch size to 64 × 8. The training is warmed up by 1,000 steps and then the poly learning rate scheduler is used.
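As a rough illustration of this schedule, the sketch below builds a warm-up plus polynomial decay with PyTorch's LambdaLR; the decay power, the total number of steps, and the placeholder model are our assumptions, since the paper does not specify them.

import torch

model = torch.nn.Linear(8, 8)                          # placeholder for the JMLR network
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
warmup_steps, total_steps, power = 1000, 100000, 0.9   # total_steps and power are assumed

def poly_warmup(step):
    if step < warmup_steps:
        return (step + 1) / warmup_steps               # linear warm-up over 1,000 steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return max(0.0, 1.0 - progress) ** power           # polynomial ("poly") decay

scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, poly_warmup)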

3.4 Ablation Study

In Table 1, we run several experiments to validate the effectiveness of the proposed augmentation, network structure, and loss. We split the training data into 80% of the subjects for training and 20% of the subjects for validation. From Table 1, we have the following observations: (1) flipping augmentation can decrease the reconstruction error by 0.45, as we compare experiments 1 and 2; (2) the capacity of ResNet-34 is well matched to the dataset, as we compare the performance between different network structures (e.g., ResNet-18, ResNet-34, and ResNet-50); (3) the searched ResNet with a similar computation cost as ResNet-34 can slightly decrease the reconstruction error by 0.18; (4) without the topology regularization by the proposed edge loss, the reconstruction error increases by 0.29, confirming the effectiveness of the mesh regression proposed in RetinaFace [7]; (5) with the model ensemble, the reconstruction error significantly reduces by 0.92.

Table 2. Leaderboard results. The proposed JMLR achieves the best overall score compared to other methods.

Rank       | Team         | Score     | ‖V1 − V2‖2 | ‖V1 − V3‖2 | 10 ‖V1 − V4‖2
1st (JMLR) | EldenRing    | 32.811318 | 8.596190   | 8.662822   | 15.552306
2nd        | faceavata    | 32.884849 | 8.833569   | 8.876343   | 15.174936
3rd        | raccoon&bird | 33.841992 | 8.880053   | 8.937624   | 16.024315

3.5 Benchmark Results

For the submission to the challenge, we employ all of the 200 training subjects and combine all the above-mentioned training tricks (e.g., flip augmentation, the searched ResNet, joint mesh and landmark regression, and the model ensemble). As shown in Table 2, we rank first on the challenge leader-board and our method outperforms the runner-up by 0.07.

3.6 Result Visualization

In this section, we display qualitative results for perspective face reconstruction on the test dataset. As we care about the accuracy of geometric face reconstruction and 6DoF estimation, we visualize the projected 3D vertices on the image space both in the formats of vertex mesh and rendered mesh. As shown in Fig. 3, the proposed method shows accurate mesh regression results across 50 different test subjects (i.e., genders and ages). In Fig. 4 and Fig. 5, we select two subjects under 33 different expressions. The proposed method demonstrates precise mesh regression results across different expressions (i.e., blinking and mouth opening). In Fig. 6, Fig. 7, and Fig. 8, we show the mesh regression results under extreme poses (e.g., large yaw and pitch variations). The proposed method can easily handle profile cases as well as large pitch angles. From these visualization results, we can see that our method is effective under different identities, expressions and large poses.


Fig. 3. Predicted meshes on 50 test subjects. 68 landmarks are also indexed for better visualization. The proposed method shows stable and accurate mesh regression results across different subjects (i.e., genders and ages).


Fig. 4. Predicted meshes of ID-6159 under 33 different expressions. 68 landmarks are also indexed for better visualization. The proposed method shows stable and accurate mesh regression results across different expressions (i.e., blinking and mouth opening).


Fig. 5. Predicted meshes of ID-239749 under 33 different expressions. 68 landmarks are also indexed for better visualization. The proposed method shows stable and accurate mesh regression results across different expressions (i.e., blinking and mouth opening).


Fig. 6. Predicted meshes of different identities under different poses. 68 landmarks are also indexed for better visualization. The proposed method shows stable and accurate mesh regression results across different facial poses (i.e., yaw and pitch).


Fig. 7. Predicted meshes of different identities under different poses. 68 landmarks are also indexed for better visualization. The proposed method shows stable and accurate mesh regression results across different facial poses (i.e., yaw and pitch).


Fig. 8. Predicted meshes of different identities under different poses. 68 landmarks are also indexed for better visualization. The proposed method shows stable and accurate mesh regression results across different facial poses (i.e., yaw and pitch).

4 Conclusion

In this paper, we explore 3D face reconstruction under perspective projection from a single RGB image. We implement a straightforward algorithm, in which joint face 3D mesh and 2D landmark regression is proposed for perspective 3D face reconstruction. We avoid explicit 6DoF prediction but employ a PnP solver for 6DoF face pose estimation given the predicted 3D vertices in the world space and the predicted 2D landmarks on the image plane. Both quantitative and qualitative experimental results demonstrate the effectiveness of our approach for perspective 3D face reconstruction and 6DoF pose estimation. Our submission to the ECCV 2022 WCPA challenge ranks first on the leader-board, and the training code and pre-trained models are released to facilitate future research in this direction.

References 1. An, X., et al.: Killing two birds with one stone: Efficient and robust training of face recognition CNNs by partial fc. In: CVPR (2022) 2. Blanz, V., Vetter, T.: A morphable model for the synthesis of 3D faces. In: SIGGRAPH (1999) 3. Booth, J., Roussos, A., Ponniah, A., Dunaway, D., Zafeiriou, S.: Large scale 3D morphable models. Int. J. Ccomput. Vis. 126, 233–254 (2018) 4. Bulat, A., Tzimiropoulos, G.: How far are we from solving the 2D & 3D face alignment problem? (and a dataset of 230,000 3D facial landmarks). In: ICCV (2017) 5. Chaudhuri, B., Vesdapunt, N., Wang, B.: Joint face detection and facial motion retargeting for multiple faces. In: CVPR (2019) 6. Deng, J., Guo, J., Liu, T., Gong, M., Zafeiriou, S.: Sub-center ArcFace: boosting face recognition by large-scale noisy web faces. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12356, pp. 741–757. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58621-8 43 7. Deng, J., Guo, J., Ververas, E., Kotsia, I., Zafeiriou, S.: Retinaface: Single-shot multi-level face localisation in the wild. In: CVPR (2020) 8. Deng, J., Guo, J., Xue, N., Zafeiriou, S.: Arcface: additive angular margin loss for deep face recognition. In: CVPR (2019) 9. Deng, J., Guo, J., Yang, J., Lattas, A., Zafeiriou, S.: Variational prototype learning for deep face recognition. In: CVPR (2021) 10. Deng, J., et al.: The Menpo benchmark for multi-pose 2D and 3D facial landmark localisation and tracking. Int. J. Comput. Vis. 127, 599–624 (2019) 11. Deng, Y., Yang, J., Xu, S., Chen, D., Jia, Y., Tong, X.: Accurate 3D face reconstruction with weakly-supervised learning: from single image to image set. In: CVPR Workshops (2019) 12. Feng, Y., Feng, H., Black, M.J., Bolkart, T.: Learning an animatable detailed 3D face model from in-the-wild images. In: SIGGRAPH (2021) 13. Feng, Y., Wu, F., Shao, X., Wang, Y., Zhou, X.: Joint 3D face reconstruction and dense alignment with position map regression network. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) Computer Vision – ECCV 2018. LNCS, vol. 11218, pp. 557–574. Springer, Cham (2018). https://doi.org/10.1007/978-3-03001264-9 33 14. Gecer, B., Ploumpis, S., Kotsia, I., Zafeiriou, S.: Ganfit: Generative adversarial network fitting for high fidelity 3d face reconstruction. In: CVPR (2019) 15. Genova, K., Cole, F., Maschinot, A., Sarna, A., Vlasic, D., Freeman, W.T.: Unsupervised training for 3D morphable model regression. In: CVPR (2018)


16. Guo, J., Deng, J., Lattas, A., Zafeiriou, S.: Sample and computation redistribution for efficient face detection. In: ICLR (2022) 17. G¨ uler, R.A., Trigeorgis, G., Antonakos, E., Snape, P., Zafeiriou, S., Kokkinos, I.: Densereg: fully convolutional dense shape regression in-the-wild. In: CVPR (2017) 18. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: CVPR (2016) 19. Jackson, A.S., Bulat, A., Argyriou, V., Tzimiropoulos, G.: Large pose 3D face reconstruction from a single image via direct volumetric CNN regression. In: ICCV (2017) 20. Kao, Y., et al.: Single-image 3D face reconstruction under perspective projection. arXiv:2205.04126 (2022) 21. Marchand, E., Uchiyama, H., Spindler, F.: Pose estimation for augmented reality: a hands-on survey. IEEE Trans. Visual. Comput. Graph. 22, 2633–2651 (2015) 22. Radosavovic, I., Kosaraju, R.P., Girshick, R., He, K., Doll´ ar, P.: Designing network design spaces. In: CVPR (2020) 23. Schroff, F., Kalenichenko, D., Philbin, J.: FaceNet: a unified embedding for face recognition and clustering. In: CVPR (2015) 24. Tran, A.T., Hassner, T., Masi, I., Medioni, G.: Regressing robust and discriminative 3D morphable models with a very deep neural network. In: CVPR (2017) 25. Tran, L., Liu, X.: Nonlinear 3D face morphable model. In: CVPR (2018) 26. Wood, E., et al.: 3D face reconstruction with dense landmarks. In: In: Avidan, S., Brostow, G., Cisse, M., Farinella, G.M., Hassner, T. (eds.) Computer Vision– ECCV 2022. ECCV 2022. LNCS, vol .13673. Springer, Cham (2021). https://doi. org/10.1007/978-3-031-19778-9 10 27. Zhou, Y., Deng, J., Kotsia, I., Zafeiriou, S.: Dense 3D face decoding over 2500fps: Joint texture & shape convolutional mesh decoders. In: CVPR (2019) 28. Zhu, X., Lei, Z., Liu, X., Shi, H., Li, S.Z.: Face alignment across large poses: a 3D solution. In: CVPR (2016) 29. Zhu, Z., et al.: Webface260m: a benchmark for million-scale deep face recognition. Trans. Pattern Anal. Mach. Intell. (2022) 30. Zhu, Z., et al.: Webface260m: a benchmark unveiling the power of million-scale deep face recognition. In: CVPR (2021) 31. Zielonka, W., Bolkart, T., Thies, J.: Towards metrical reconstruction of human faces. In: ECCV (2022). https://doi.org/10.1007/978-3-031-19778-9 15

Pixel2ISDF: Implicit Signed Distance Fields Based Human Body Model from Multi-view and Multi-pose Images

Jianchuan Chen1, Wentao Yi1, Tiantian Wang2, Xing Li3, Liqian Ma4, Yangyu Fan3, and Huchuan Lu1(B)

1 Dalian University of Technology, Dalian, China
{Janaldo,raywit}@mail.dlut.edu.cn, [email protected]
2 University of California, Merced, USA
[email protected]
3 Northwestern Polytechnical University, Xi'an, China
fan [email protected]
4 ZMO AI Inc., Sacramento, USA

Abstract. In this report, we focus on reconstructing clothed humans in the canonical space given multiple views and poses of a human as the input. To achieve this, we utilize the geometric prior of the SMPLX model in the canonical space to learn the implicit representation for geometry reconstruction. Based on the observation that the topology between the posed mesh and the mesh in the canonical space is consistent, we propose to learn latent codes on the posed mesh by leveraging multiple input images and then assign the latent codes to the mesh in the canonical space. Specifically, we first leverage normal and geometry networks to extract the feature vector for each vertex on the SMPLX mesh. Normal maps are adopted for better generalization to unseen images compared to 2D images. Then, features for each vertex on the posed mesh from multiple images are integrated by MLPs. The integrated features, acting as the latent code, are anchored to the SMPLX mesh in the canonical space. Finally, the latent code for each 3D point is extracted and utilized to calculate the SDF. Our work for reconstructing the human shape in the canonical pose achieves the 3rd-best performance in the WCPA MVP-Human Body Challenge.

Keywords: Implicit representation · SMPLX · Signed distance function · Latent codes fusion

1 Introduction

High-fidelity human digitization has attracted a lot of interest for its applications in VR/AR, image and video editing, telepresence, virtual try-on, etc. (J. Chen, W. Yi, T. Wang and X. Li contributed equally.) In this work, we target reconstructing high-quality 3D clothed humans in the canonical space given multiple views and multiple poses of a human performer as the input. Our network utilizes multiple images as the input and learns the implicit representation for given points in the 3D space. Inspired by the advantages of implicit representations, such as arbitrary topology and continuous representation, we adopt this representation to reconstruct high-fidelity clothed 3D humans. To learn the geometry in the canonical space, we utilize the SMPLX mesh [15] in the canonical space as the geometric guidance. Due to the correspondence between the posed mesh and the mesh in the canonical space, we propose to first learn the latent codes on the posed mesh and then assign the latent codes to the canonical mesh based on the correspondence. By utilizing the posed mesh, image information in the 2D space can be included by projecting the posed mesh to the image space.

Given the multi-view images as input, normal and geometry networks are utilized to extract the features for the vertices on the SMPLX mesh [15]. We utilize a normal map as the intermediate prediction, which helps generate sharp reconstructed geometry [21]. To integrate the features from different views or poses, we utilize a fusion network to generate a weighted summation of multiple features. Specifically, we first concatenate the features with the means and variances of the features from all inputs, followed by a Multi-layer Perceptron (MLP) predicting the weight and transformed features. Then, the weighted features are integrated into the latent code through another MLP. The latent code learned by the neural network is anchored to the SMPLX mesh in the canonical space, which serves as the geometry guidance to reconstruct the 3D geometry. Because of the sparsity of the vertices, we utilize a SparseConvNet to generate effective features for any 3D point following [19]. Finally, we use trilinear interpolation to extract the latent code, followed by an MLP to produce the SDF, which models the signed distance of a point to the nearest surface.

2 Related Work

Implicit Neural Representations. Recently, the implicit representation encoded by a deep neural network has gained extensive attention, since it can represent continuous fields and can generate the details on the clothes, such as wrinkles and facial expressions. Implicit representation has been applied successfully to shape representation [16,18,26]. Here we utilize the signed distance function (SDF) as the implicit representation, which is a continuous function that outputs the point’s signed distance to the closest surface, whose sign encodes whether the point is inside (negative) or outside (positive) of the watertight surface. The underlying surface is implicitly represented by the isosurface of SDF (·) = 0. 3D Clothed Human Reconstruction. Reconstructing humans has been widely studied given the depth maps [4,22,27], images [2,10,14], or videos [11,13]


Fig. 1. Overview of our clothed human reconstruction pipeline.

as the input. Utilizing the SMPLX mesh, the existing works show promising results with the RGB image as the input. NeuralBody [19] adopts the posed SMPLX mesh to construct the latent code volume that aims to extract the implicit pose code. ICON [24] proposes to refine the SMPLX mesh and normal map iteratively. SelfRecon [9] utilizes the initial canonical SMPLX body mesh to calculate the canonical implicit SDF. Motivated by these methods, we propose to utilize the SMPLX mesh in the canonical space as a geometric prior to reconstruct the clothed humans.

3 Methodology

Given the multi-view and multi-pose RGB images of a human and the corresponding SMPLX parameters as the input, we aim to reconstruct the clothed 3D geometry in the canonical space. Here the input images are denoted as {I_k}_{k=1}^K, where k denotes the image index and K is the number of images. The corresponding SMPLX parameters are denoted as {θ_k, β_k, s_k, t_k}_{k=1}^K, where θ_k, β_k are the pose and shape parameters of SMPLX and s_k, t_k are the camera parameters used for projection.

3.1 Multi-view and Multi-pose Image Feature Encoding

Our feature extraction networks take the multi-view and multi-pose images {I_k}_{k=1}^K as input and output the geometric feature maps that help to predict the 3D geometry in the canonical space. Specifically, we first adopt the normal network F_normal to extract the normal maps and then utilize the geometry network F_geometry to generate geometric feature maps {F_k}_{k=1}^K that will be further utilized to extract the pixel-aligned features for the vertices on the posed mesh:

\[
F_k = F_{geometry}(F_{normal}(I_k)), \quad k = 1, 2, \dots, K. \qquad (1)
\]


In particular, we adopt the pretrained image-to-image translation network from PIFuHD [21] as our normal network, and use the pretrained ResUNet34 [8] backbone as our geometry network.

3.2 Structured Latent Codes

After obtaining the geometric feature maps {F_k}_{k=1}^K, we extract the pixel-aligned features for each vertex v_k^i on the posed mesh M(θ_k, β_k). For each vertex v_k^i, we first project it to the image space by utilizing the weak-perspective projection Φ according to the camera scale s_k and translation t_k, and then adopt the bilinear interpolation operation to extract the pixel-aligned features f_k^i:

\[
\Phi: \hat{x} = s\,\Pi(x) + t, \quad x \in \mathbb{R}^3, \ \hat{x} \in \mathbb{R}^2, \qquad (2)
\]
\[
f_k^i = \mathrm{bilinear}\left(F_k, \Phi\left(v_k^i, s_k, t_k\right)\right), \qquad (3)
\]

where Π is the orthogonal projection, x is a point in the 3D space and x̂ is the projected point in the 2D image space.

To integrate the feature of the i-th vertex in the canonical space from multiple views/poses, we use a fusion network that takes {f_k^i}_{k=1}^K as the input and outputs the integrated feature l_i, as illustrated in Fig. 1. Specifically, the mean μ^i and variance σ^i of the features {f_k^i}_{k=1}^K are calculated and then concatenated with f_k^i to serve as the input of an MLP. The MLP predicts a new feature vector and a weight w_k^i for each feature, which yields a weighted sum of the features from the multiple inputs. The weighted feature is finally forwarded to another MLP for feature integration, producing l_i, which serves as the structured latent code for the vertex v_i. Different from the latent code in NeuralBody [19], which is initialized randomly for optimizing a specific human, our latent code is a feature vector learned by a network with the normal map as the input, which can generalize to humans unseen during training.

The above-mentioned latent codes are generated based on the posed mesh. The posed mesh and the canonical mesh share the same latent codes because of their topology correspondence, which forms the set of latent codes Q = {l_1, l_2, ..., l_{N_V}}, l_i ∈ R^d. Here N_V represents the number of vertices.
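The sketch below illustrates Eqs. (2)-(3) and the mean/variance-based fusion in PyTorch; the layer widths, the softmax over the predicted scores, and the assumption that the projected coordinates are already normalized to [-1, 1] are ours and may differ from the authors' implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

def pixel_aligned_features(feat_maps, verts, s, t):
    # feat_maps: (K, C, H, W) geometric feature maps, verts: (K, N, 3) posed SMPLX vertices
    # s: (K, 1) scales, t: (K, 2) translations of the weak-perspective cameras
    xy = verts[..., :2] * s[:, None, :] + t[:, None, :]          # Eq. (2), assumed in [-1, 1]
    f = F.grid_sample(feat_maps, xy.unsqueeze(2), mode='bilinear',
                      align_corners=True)                        # Eq. (3), (K, C, N, 1)
    return f.squeeze(-1).permute(0, 2, 1)                        # (K, N, C)

class FusionNet(nn.Module):
    def __init__(self, c, d=64):
        super().__init__()
        self.score_mlp = nn.Sequential(nn.Linear(3 * c, c), nn.ReLU(), nn.Linear(c, c + 1))
        self.out_mlp = nn.Sequential(nn.Linear(c, d), nn.ReLU(), nn.Linear(d, d))

    def forward(self, f):                                         # f: (K, N, C)
        mu = f.mean(0, keepdim=True).expand_as(f)
        var = f.var(0, unbiased=False, keepdim=True).expand_as(f)
        h = self.score_mlp(torch.cat([f, mu, var], dim=-1))       # transformed feature + score
        feat, w = h[..., :-1], torch.softmax(h[..., -1:], dim=0)  # weights across the K inputs
        return self.out_mlp((w * feat).sum(0))                    # (N, d) structured latent codes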

3.3 Implicit Neural Shape Field

The learned latent codes are anchored to a human body model (SMPLX [15]) in the canonical space. SMPLX is parameterized by shape and pose parameters with N_V = 10,475 vertices and N_J = 54 joints. The locations of the latent codes Q = {l_1, l_2, ..., l_{N_V}} are transformed for learning the implicit representation by forwarding the latent codes into a neural network. To query the latent code at continuous 3D locations, trilinear interpolation is adopted for each point. However, the latent codes are relatively sparse in the 3D space, and directly calculating the latent codes using trilinear interpolation will generate zero vectors for most points. To overcome this challenge, we use a SparseConvNet [19] to form a latent feature volume V ∈ R^{H×H×H×d} which diffuses the codes defined on the mesh surface to the nearby 3D space:

\[
V = F_{diffuse}\left(\{l_1, l_2, \dots, l_{N_V}\}\right). \qquad (4)
\]

Specifically, to obtain the latent code for each 3D point, trilinear interpolation is employed to query the code at continuous 3D locations. Here the latent code is forwarded into a neural network ϕ to predict the SDF F_sdf(x) for a 3D point x:

\[
F_{sdf}(x) = \varphi(\mathrm{trilinear}(V, x)). \qquad (5)
\]
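A minimal sketch of the query in Eqs. (4)-(5) is given below; the MLP width and the assumption that the query points are normalized to the volume's [-1, 1] range are ours.

import torch
import torch.nn as nn
import torch.nn.functional as F

sdf_mlp = nn.Sequential(nn.Linear(64, 256), nn.ReLU(),
                        nn.Linear(256, 256), nn.ReLU(),
                        nn.Linear(256, 1))

def query_sdf(volume, points):
    # volume: (1, d, D, H, W) diffused latent feature volume from the SparseConvNet
    # points: (M, 3) query locations, assumed normalized to [-1, 1] volume coordinates
    grid = points.view(1, -1, 1, 1, 3)
    code = F.grid_sample(volume, grid, mode='bilinear',   # trilinear for 5D inputs
                         align_corners=True)              # (1, d, M, 1, 1)
    code = code.view(volume.shape[1], -1).t()             # (M, d) per-point latent codes
    return sdf_mlp(code).squeeze(-1)                      # (M,) signed distances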

3.4 Loss Function

During training, we sample N_S spatial points X surrounding the ground-truth canonical mesh. To train the implicit SDF, we deploy a mixed-sampling strategy: 20% of the points are sampled uniformly in the whole space and 80% are sampled near the surface. We adopt the mixed-sampling strategy for the following two reasons. First, sampling only uniformly in the 3D space puts more weight on the points outside the mesh during network training, which results in overfitting when sampling around the iso-surface. Second, sampling points far away from the reconstructed surface contributes little to geometry reconstruction, which increases the pressure of network training. Overall, we enforce a 3D geometric loss L_sdf and a normal constraint loss L_normal:

\[
L = \lambda_{sdf} L_{sdf} + \lambda_{normal} L_{normal}. \qquad (6)
\]

3D Geometric Loss. Given a sampling point x ∈ X, we employ the L2 loss between the predicted F_sdf(x) and the ground truth G_sdf(x), both truncated by a threshold δ:

\[
L_{sdf} = \frac{1}{N_S} \sum_{x \in \mathcal{X}} \left\| C(F_{sdf}(x), \delta) - C(G_{sdf}(x), \delta) \right\|_2, \qquad (7)
\]

where C(·, δ) = max(−δ, min(·, δ)).

Normal Constraint Loss. Beyond the geometric loss, to make the predicted surface smoother, we deploy the Eikonal loss [6] to encourage the gradient norms of the sampling points to be close to 1 (Fig. 2):

\[
L_{normal} = \frac{1}{N_S} \sum_{x \in \mathcal{X}} \left( \left\| \nabla_x F_{sdf}(x) \right\|_2 - 1 \right)^2. \qquad (8)
\]
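The following is a hedged PyTorch sketch of the objective in Eqs. (6)-(8); the truncation threshold and the loss weights are placeholders, since the report does not state their values.

import torch

def total_loss(model, points, gt_sdf, delta=0.1, lambda_sdf=1.0, lambda_normal=0.1):
    points = points.clone().requires_grad_(True)
    pred = model(points)                                    # (N,) predicted signed distances
    clamp = lambda s: torch.clamp(s, -delta, delta)         # C(., delta)
    l_sdf = (clamp(pred) - clamp(gt_sdf)).abs().mean()      # Eq. (7)

    grad = torch.autograd.grad(pred.sum(), points, create_graph=True)[0]
    l_normal = (grad.norm(dim=-1) - 1.0).pow(2).mean()      # Eq. (8), Eikonal term
    return lambda_sdf * l_sdf + lambda_normal * l_normal    # Eq. (6)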


Fig. 2. The pipeline for data processing.

4 Experiments

4.1 Datasets

We use the WCPA [1] dataset as the training and testing datasets, which consists of 200 subjects for training and 50 subjects for testing. Each subject contains 15 actions, and each action contains 8 RGB images from different angles (0, 45, 90, 135, 180, 225, 270, 315). Each image is a 1280 × 720 JPEG file. The ground truth of the canonical pose is a high-resolution 3D mesh with detailed information about clothes, faces, etc. For the training phase, we randomly select 4 RGB images with different views and poses as inputs to learn the 3D human models.

4.2 Image Preprocessing

Image Cropping. For each image, we first apply VarifocalNet [28] to detect the bounding boxes to localize the humans. Next, we crop the input images to the resolution of 512 × 512 according to the bounding boxes. When the cropped images exceed the bounds of the input images, the cropped images are padded with zeros.

Mask Generation. For each image, we first apply DensePose [7] to obtain part segmentations. Then we set the parts on the human as the foreground human mask and set the values of the background image pixels to zero. The masked images serve as the input of our network. The mask is then refined using MatteFormer [17], which can generate better boundary details. Following [23], the trimap adopted in [17] is generated based on the mask using the erosion and dilation operations.

4.3 SMPLX Estimation and Optimization

It is very challenging to get accurate SMPLX from 2D RGB images due to the inherent depth ambiguity. First, we use Openpose [3] to detect the 2D keypoints


Fig. 3. The visualization results on the test dataset.

of the person in the image and then use ExPose [5] to estimate the SMPLX parameters and camera parameters. However, due to extreme illumination conditions and complex backgrounds, the SMPLX obtained in this way is not sufficiently accurate, and we need to refine the SMPLX parameters further. We utilize the 2D keypoints and masks to optimize the SMPLX parameters θ, β. For the 2D keypoint loss, given the SMPLX 3D joint locations J(θ, β), we project them to the 2D image using the weak-perspective camera parameters s, t. For the mask loss, we utilize PyTorch3D [20] to render the 2D mask given the posed mesh M(θ, β). Then the mask loss is calculated based on the rendered mask and the pseudo ground truth mask M_gt:

\[
\theta^*, \beta^* = \min_{\theta, \beta} \left\| \Phi(J(\theta, \beta), s, t) - J_{gt} \right\| + \lambda \left\| \Psi(M(\theta, \beta), s, t) - M_{gt} \right\|. \qquad (9)
\]
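A sketch of this refinement loop is given below; smplx_joints, project and render_mask stand in for the SMPLX layer, the weak-perspective projection and the PyTorch3D silhouette renderer, and the use of Adam with these settings is our assumption rather than the report's specification.

import torch

def refine_smplx(theta, beta, s, t, joints_gt, mask_gt,
                 smplx_joints, project, render_mask, lam=1.0, iters=200):
    theta = theta.clone().requires_grad_(True)
    beta = beta.clone().requires_grad_(True)
    opt = torch.optim.Adam([theta, beta], lr=1e-2)
    for _ in range(iters):
        opt.zero_grad()
        j2d = project(smplx_joints(theta, beta), s, t)   # Phi(J(theta, beta), s, t)
        sil = render_mask(theta, beta, s, t)             # Psi(M(theta, beta), s, t)
        loss = (j2d - joints_gt).norm(dim=-1).mean() \
             + lam * (sil - mask_gt).abs().mean()        # Eq. (9)
        loss.backward()
        opt.step()
    return theta.detach(), beta.detach()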

4.4 Implementation Details

During training, we randomly choose K = 4 images from a total of 8 × 15 images as input to extract the 2D feature maps. Taking into account the memory limitation, we set the latent feature volume V ∈ R^{224×224×224×64}, where each latent code l_i ∈ R^{64}. During training, we randomly sample N_t = 10,000 points around the complete mesh. To stably train our network, we initialize the SDF to approximate a unit sphere [25]. We adopt the Adam optimizer [12] and set the learning rate lr = 5e−4; training takes about 40 h on 2 Nvidia GeForce RTX 3090 24 GB GPUs. For inference, the surface mesh is extracted from the zero-level set of the SDF by running marching cubes on a 256^3 grid.
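For reference, the sketch below extracts the zero-level set with scikit-image's marching cubes using a point-wise SDF query function such as the one sketched earlier; the bounding volume of the canonical body is an assumed value.

import numpy as np
from skimage import measure

def extract_mesh(query_fn, res=256, bound=1.0):
    xs = np.linspace(-bound, bound, res)
    pts = np.stack(np.meshgrid(xs, xs, xs, indexing='ij'), axis=-1).reshape(-1, 3)
    sdf = query_fn(pts).reshape(res, res, res)                  # evaluate the SDF on the grid
    verts, faces, _, _ = measure.marching_cubes(sdf, level=0.0)
    verts = verts / (res - 1) * 2 * bound - bound               # grid indices -> world coordinates
    return verts, faces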

4.5 Results

Our method achieved very good results in the challenge, demonstrating the superiority of our method for 3D human reconstruction from multi-view and multi-pose images. To further analyze the effectiveness of our method, we performed different ablation experiments. According to the results shown in Table 1 and Fig. 3, removing the background allows the model to better reconstruct the human body. The two-stage approach of predicting the normal map as the input has a stronger generalization ability on unseen data compared to directly using the image as input. What is more important is that the precision of SMPLX has a significant impact on the reconstruction performance, demonstrating the necessity of SMPLX optimization.

Table 1. Quantitative metrics of different strategies on the test dataset.

                   | w/o mask | w/o normal map | w/o SMPLX refinement | Full model
Chamfer distance ↓ | 1.1277   | 1.0827         | 1.1285               | 0.9985

5 Conclusion

Modeling 3D humans accurately and robustly from challenging multi-view and multi-pose RGB images is a difficult problem, due to the variety of body poses, viewpoints, light conditions, and other environmental factors. To this end, we have contributed a deep-learning based framework that incorporates the parametric SMPLX model and a non-parametric implicit function for reconstructing a 3D human model from multiple images. Our key idea is to overcome these challenges by constructing structured latent codes as the inputs for the implicit representation. The latent codes integrate the features for vertices in the canonical pose from different poses or views.

References 1. Eccv 2022 WCPA challenge (2022). https://tianchi.aliyun.com/competition/entr ance/531958/information 2. Bogo, F., Kanazawa, A., Lassner, C., Gehler, P., Romero, J., Black, M.J.: Keep It SMPL: automatic estimation of 3D human pose and shape from a single image. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9909, pp. 561–578. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46454-1 34 3. Cao, Z., Hidalgo, G., Simon, T., Wei, S., Sheikh, Y.: OpenPose: realtime multiperson 2d pose estimation using part affinity fields. IEEE Trans. Pattern Anal. Mach. Intell. 43, 172–186 (2021) 4. Chibane, J., Alldieck, T., Pons-Moll, G.: Implicit functions in feature space for 3D shape reconstruction and completion. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 6970–6981 (2020) 5. Choutas, V., Pavlakos, G., Bolkart, T., Tzionas, D., Black, M.J.: Monocular expressive body regression through body-driven attention. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12355, pp. 20–40. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58607-2 2 6. Gropp, A., Yariv, L., Haim, N., Atzmon, M., Lipman, Y.: Implicit geometric regularization for learning shapes. arXiv preprint arXiv:2002.10099 (2020)


7. G¨ uler, R.A., Neverova, N., Kokkinos, I.: DensePose: dense human pose estimation in the wild. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7297–7306 (2018) 8. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016) 9. Jiang, B., Hong, Y., Bao, H., Zhang, J.: SelfRecon: self reconstruction your digital avatar from monocular video. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5605–5615 (2022) 10. Kanazawa, A., Black, M.J., Jacobs, D.W., Malik, J.: End-to-end recovery of human shape and pose. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7122–7131 (2018) 11. Kanazawa, A., Zhang, J.Y., Felsen, P., Malik, J.: Learning 3D human dynamics from video. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5614–5623 (2019) 12. Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) 13. Kocabas, M., Athanasiou, N., Black, M.J.: VIBE: video inference for human body pose and shape estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5253–5263 (2020) 14. Kolotouros, N., Pavlakos, G., Black, M.J., Daniilidis, K.: Learning to reconstruct 3d human pose and shape via model-fitting in the loop. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 2252–2261 (2019) 15. Loper, M., Mahmood, N., Romero, J., Pons-Moll, G., Black, M.J.: SMPL: a skinned multi-person linear model. ACM Trans. Graphics 34(6), 1–16 (2015) 16. Mescheder, L., Oechsle, M., Niemeyer, M., Nowozin, S., Geiger, A.: occupancy networks: Learning 3d reconstruction in function space. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 4460– 4470 (2019) 17. Park, G., Son, S., Yoo, J., Kim, S., Kwak, N.: MatteFormer: transformer-based image matting via prior-tokens. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 11696–11706 (June 2022) 18. Park, J.J., Florence, P., Straub, J., Newcombe, R., Lovegrove, S.: DeepSDF: learning continuous signed distance functions for shape representation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 165–174 (2019) 19. Peng, S., et al.: Neural body: Implicit neural representations with structured latent codes for novel view synthesis of dynamic humans, In: CVPR (2021) 20. Ravi, N., et al.: Accelerating 3d deep learning with pytorch3d. arXiv preprint arXiv:2007.08501 (2020) 21. Saito, S., Simon, T., Saragih, J., Joo, H.: PIFuHD: multi-level pixel-aligned implicit function for high-resolution 3d human digitization. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 84–93 (2020) 22. Wang, L., Zhao, X., Yu, T., Wang, S., Liu, Y.: NormalGAN: learning detailed 3d human from a single RGB-D image. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12365, pp. 430–446. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58565-5 26 23. Wang, T., Liu, S., Tian, Y., Li, K., Yang, M.H.: Video matting via consistencyregularized graph neural networks. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 4902–4911 (2021)


24. Xiu, Y., Yang, J., Tzionas, D., Black, M.J.: Icon: Implicit clothed humans obtained from normals. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR). vol. 2 (2022) 25. Yariv, I., et al.: Multiview neural surface reconstruction by disentangling geometry and appearance. Adv. Neural. Inf. Process. Syst. 33, 2492–2502 (2020) 26. Yifan, W., Wu, S., Oztireli, C., Sorkine-Hornung, O.: ISO-points: optimizing neural implicit surfaces with hybrid representations. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 374–383 (2021) 27. Yu, T., et al.: Doublefusion: Real-time capture of human performances with inner body shapes from a single depth sensor. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7287–7296 (2018) 28. Zhang, H., Wang, Y., Dayoub, F., Sunderhauf, N.: Varifocalnet: An IOU-aware dense object detector. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8514–8523 (2021)

UnconFuse: Avatar Reconstruction from Unconstrained Images

Han Huang1(B), Liliang Chen1,2, and Xihao Wang1,3

1 OPPO Research Institute, Beijing, China
[email protected]
2 University of Pittsburgh, Pittsburgh, USA
3 Xidian University, Xi'an, China

Abstract. This report proposes an effective solution for 3D human body reconstruction from multiple unconstrained frames, developed for the ECCV 2022 WCPA Challenge: From Face, Body and Fashion to 3D Virtual Avatars I (track 1: Multi-View Based 3D Human Body Reconstruction). We reproduce the reconstruction method presented in MVP-Human [19] as our baseline and make several improvements tailored to the particular setting of this challenge. We finally achieve a score of 0.93 on the official testing set, taking 1st place on the leaderboard.

Keywords: 3D human reconstruction · Avatar · Unconstrained frames

1 Method

The purpose of this challenge is to explore various methods to reconstruct a high-quality T-pose 3D human in the canonical space from multiple unconstrained frames, where the character can have totally different actions. Many existing works [3,11,12,14,18] have proved that pixel-aligned image features are significant for recovering details of the body surface, and spatially-aligned local features queried from multi-view images can greatly improve the robustness of the reconstructions since more regions of the body can be observed. These methods use well-calibrated camera parameters to ensure accurate inter-view alignment and 3D-to-2D alignment, while the setup of multiple unconstrained frames makes it much more difficult to achieve precise alignment. In recent years, tracking-based methods have been explored to reconstruct humans from single-view videos or unconstrained multiple frames [1,8,15,18,19]. To our knowledge, the research purpose of MVP-Human [19] is most related to our challenge, aiming to reconstruct an avatar consisting of geometry and skinning weights [5] from multiple unconstrained images. We reproduce the method proposed in MVP-Human and make some modifications. In this section, we introduce the details of our pipeline and experiment settings, through which we have achieved a fairly high accuracy on the challenging MVP-Human dataset [19].

In general, we query pixel-aligned image features from several images and fuse them through the parametric body motion. Then, the occupancy field of the canonical space is predicted from these fused features. Like MVP-Human, normals from SMPL models are also introduced to improve the reconstruction quality. Considering the requirements of this challenge, we focus more on the geometry reconstruction quality in the canonical space and only use one backbone for both occupancy and skinning weights prediction. Our experiments demonstrate that our data preprocessing, data augmentation, model architecture design, feature aggregation strategy and online hard example mining (OHEM) [13] all contribute to a higher score.

1.1 Data Preprocessing

Data preprocessing is the basis of the entire task, and it strongly affects the reconstruction accuracy. We perform the following processes for the training and testing data.

Human Detection and Image Crop. The size of the raw images is 1280 × 720 and the character in the image is lying down. We rotate the images and detect the human bounding box with the off-the-shelf method HumanDet [4]. To make the images more friendly to the encoder networks, we crop them to 512 × 512 and ensure the character is roughly at the center of the image.

Segmentation. We extract the foreground mask at the resolution of 512 × 512 through PGN [2].

SMPL Fitting. We apply the state-of-the-art method PyMAF [16] to fit SMPL parameters for each image. Ideally, one character with different actions or viewpoints should share the same SMPL shape, but the shape parameters extracted frame by frame are slightly different. To regularize the shape consistency of the same character, we take a joint optimization strategy for SMPL fitting, obtaining only one set of shape parameters for each character. The fitting errors at the beginning and at the end of the joint optimization process are shown in Fig. 1 and Fig. 2, respectively; the fitting error is clearly reduced when the joint optimization finishes.

Sampling. Like the training strategy proposed in PIFu [11], we sample points in the 3D space and around the iso-surface. For each training iteration, we sample 14,756 points close to the ground-truth body surface and 1,628 points randomly distributed in a 3D bounding box containing the ground-truth body.

Normals. In order to better utilize the prior knowledge of the SMPL model, we calculate the point-wise normals of the SMPL mesh, and use the K-Nearest Neighbor (KNN) algorithm to obtain the normal information of all sampling points.

Dataset Split. Considering that the challenge does not provide the ground truth for the testing set, we manually split the training data into a training set and a validation set. Specifically, we randomly select 190 subjects from all 200 subjects for training, and the remaining 10 subjects for validation.

Data Augmentation. In order to achieve a model with better generalizability, we attempt to enhance the diversity of the training data by image augmentation, which simulates various texture, material or lighting conditions at the image level. We randomly jitter the brightness, hue, saturation, and contrast, where the brightness, saturation and contrast factors are all set to 0.2 and the hue factor is set to 0.05.
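A minimal sketch of this photometric augmentation with torchvision is shown below; applying the jitter independently to each cropped frame is our assumption about the pipeline.

from torchvision import transforms

augment = transforms.Compose([
    transforms.ColorJitter(brightness=0.2, contrast=0.2, saturation=0.2, hue=0.05),
    transforms.ToTensor(),
])
# tensor = augment(pil_image)   # applied to a cropped 512x512 RGB frame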


Fig. 1. Part of the results at the beginning of the iteration. On the bottom row, the white contour represents the fitting error; the thinner the contour width, the smaller the error.

Fig. 2. Part of the results at the end of the iteration. On the bottom row, the white contour represents the fitting error; the thinner the contour width, the smaller the error.

1.2 Model Design and Configuration

The architecture of our method is shown in Fig. 3. As mentioned above, this challenge aims to reconstruct a high-quality T-pose human in the canonical space with detailed garments from multiple unconstrained images. Following the framework of MVP-Human [19], voxels or sampling points in the canonical space are warped with the SMPL motion to align with the posed images. Then, occupancy and skinning weights are predicted from the pixel-aligned features through MLPs. Finally, the explicit mesh is extracted through Marching Cubes [9]. The loss functions for training our model are quite similar to the ones used in PIFu and MVP-Human. Since this challenge cares more about the geometry in the canonical space, we do not assign a specialized backbone for skinning weight prediction. Instead of the stacked hourglass network used in MVP-Human, we introduce HRNet as the image encoder, which shows better reconstruction accuracy and also faster runtime according to MonoPort [6,7]. In addition, we apply the position encoding [10] to the 3D coordinates of the queried points in the canonical space and concatenate them with the corresponding local image features and normals extracted from the SMPL body surface. In our baseline method, we simply apply average pooling to fuse features from different frames. In order to weigh the contribution of image features from different viewpoints to a 3D point, we introduce the self-attention mechanism


Fig. 3. The overview of our model architecture. Points in the canonical space are warped with SMPL motion to align with unconstrained multi-frame images. Then, local image features are fused by self-attention. Finally, the occupancy or skinning weights are predicted from fused features through MLPs.

proposed in DeepMultiCap [17] for feature aggregation. Our experiments show that such a feature fusion strategy leads to better performance in this task.
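As an illustration of the position encoding applied to the canonical-space point coordinates mentioned above, the following NeRF-style sketch can be used; the number of frequency bands is an assumption, since the report does not state it.

import math
import torch

def positional_encoding(x, num_freqs=6):
    # x: (N, 3) canonical-space point coordinates
    freqs = (2.0 ** torch.arange(num_freqs, dtype=x.dtype, device=x.device)) * math.pi
    angles = x.unsqueeze(-1) * freqs                       # (N, 3, num_freqs)
    enc = torch.cat([torch.sin(angles), torch.cos(angles)], dim=-1)
    return torch.cat([x, enc.flatten(1)], dim=-1)          # (N, 3 + 6 * num_freqs)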

2 Results

As shown in Table 1, with all the strategies mentioned above, our model finally achieves a score (chamfer distance to the ground-truth mesh) of 0.93 on the testing set.

Table 1. Quantitative results of our methods.

Our methods with different settings                  Chamfer ↓
Our baseline                                         1.13
Baseline + PE                                        1.09
Baseline + PE + Att                                  0.96
Baseline + PE + Att + DA                             0.95
Our final model (baseline + PE + Att + DA + HM)      0.93

PE: position encoding for point coordinates, Att: self-attention for multi-frame feature fusion, DA: data augmentation, HM: online hard case mining.

3 Discussion

We thank the challenge organizers for providing this valuable academic exchange opportunity. The novel and interesting setup contributes to the innovation of this specific research area. We are glad to explore avatar reconstruction methods and lead the race. Although our quantitative results look good and obtain 1st place on the leaderboard, the meshes generated by our method are still very smooth and lack high-fidelity details. We will further improve our method for better reconstruction quality under limited data.



References

1. Chen, L., Li, J., Huang, H., Guo, Y.: CrossHuman: learning cross-guidance from multi-frame images for human reconstruction. arXiv preprint arXiv:2207.09735 (2022)
2. Gong, K., Liang, X., Li, Y., Chen, Y., Yang, M., Lin, L.: Instance-level human parsing via part grouping network. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11208, pp. 805–822. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01225-0_47
3. Huang, Z., Xu, Y., Lassner, C., Li, H., Tung, T.: ARCH: animatable reconstruction of clothed humans. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (June 2020)
4. HumanDet (2020). https://github.com/Project-Splinter/human_det
5. Kavan, L., Collins, S., Žára, J., O'Sullivan, C.: Skinning with dual quaternions. In: Proceedings of the 2007 Symposium on Interactive 3D Graphics and Games, pp. 39–46 (2007)
6. Li, R., Olszewski, K., Xiu, Y., Saito, S., Huang, Z., Li, H.: Volumetric human teleportation. In: ACM SIGGRAPH 2020 Real-Time Live, pp. 1–1 (2020)
7. Li, R., Xiu, Y., Saito, S., Huang, Z., Olszewski, K., Li, H.: Monocular real-time volumetric performance capture. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12368, pp. 49–67. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58592-1_4
8. Li, Z., Yu, T., Zheng, Z., Guo, K., Liu, Y.: PoseFusion: pose-guided selective fusion for single-view human volumetric capture. In: IEEE Conference on Computer Vision and Pattern Recognition (June 2021)
9. Lorensen, W.E., Cline, H.E.: Marching cubes: a high resolution 3D surface construction algorithm. ACM SIGGRAPH Comput. Graphics 21(4), 163–169 (1987)
10. Mildenhall, B., Srinivasan, P.P., Tancik, M., Barron, J.T., Ramamoorthi, R., Ng, R.: NeRF: representing scenes as neural radiance fields for view synthesis. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12346, pp. 405–421. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58452-8_24
11. Saito, S., Huang, Z., Natsume, R., Morishima, S., Kanazawa, A., Li, H.: PIFu: pixel-aligned implicit function for high-resolution clothed human digitization. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 2304–2314 (2019)
12. Saito, S., Simon, T., Saragih, J., Joo, H.: PIFuHD: multi-level pixel-aligned implicit function for high-resolution 3D human digitization. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (June 2020)
13. Shrivastava, A., Gupta, A., Girshick, R.: Training region-based object detectors with online hard example mining. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 761–769 (2016)
14. Xiu, Y., Yang, J., Tzionas, D., Black, M.J.: ICON: implicit Clothed humans Obtained from Normals. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 13296–13306 (June 2022)
15. Yu, T., et al.: DoubleFusion: real-time capture of human performances with inner body shapes from a single depth sensor. In: The IEEE International Conference on Computer Vision and Pattern Recognition (CVPR). IEEE (June 2018)



16. Zhang, H., et al.: 3D human pose and shape regression with pyramidal mesh alignment feedback loop. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 11446–11456 (2021)
17. Zheng, Y., et al.: DeepMultiCap: performance capture of multiple characters using sparse multiview cameras. In: IEEE International Conference on Computer Vision (ICCV 2021) (2021)
18. Zheng, Z., Yu, T., Liu, Y., Dai, Q.: PaMIR: parametric model-conditioned implicit representation for image-based human reconstruction. IEEE Trans. Pattern Anal. Mach. Intell. (99), 1–1 (2021). https://doi.org/10.1109/TPAMI.2021.3050505
19. Zhu, X., et al.: MVP-Human dataset for 3D human avatar reconstruction from unconstrained frames. arXiv preprint arXiv:2204.11184 (2022)

HiFace: Hybrid Task Learning for Face Reconstruction from Single Image

Wei Xu1,2, Zhihong Fu2(B), Zhixing Chen2, Qili Deng2, Mingtao Fu2, Xijin Zhang2, Yuan Gao2, Daniel K. Du2, and Min Zheng2

1 Beijing University of Posts and Telecommunications, Beijing, China
[email protected]
2 ByteDance Inc., Beijing, China
{fuzhihong.2022,chenzhixing.omega,dengqili,fumingtao.fmt,zhangxijin,gaoyuan.azeroth,dukang.daniel,zhengmin.666}@bytedance.com

Abstract. The task of 3D face reconstruction in the WCPA challenge takes a monocular image as input and outputs 3D face geometry, a problem that has been studied for decades. Considerable works have been published, among which PerspNet significantly outperforms the other methods under perspective projection. However, as the UV coordinates are distributed unevenly, the UV mapping process introduces inevitable precision degradation in dense regions of the reconstructed 3D faces. Thus, we design a vertex refinement module to overcome this precision degradation. We also design a multi-task learning module to enhance the 3D features. By carefully designing and organizing the vertex refinement module and the multi-task learning module, we propose a hybrid task learning based 3D face reconstruction method called HiFace. Our HiFace achieves the 2nd place in the final official ranking of the ECCV 2022 WCPA Challenge, which demonstrates the superiority of our HiFace.

Keywords: 3D face reconstruction · Perspective projection · 6DoF pose estimation

1 Introduction

3D human face modeling enables various applications such as interactive face editing, video-conferencing, extended reality, metaverse applications, and visual effects for movie post-production. However, most 3D face reconstruction methods employ orthogonal projection, which limits the application potential. The WCPA challenge requires modeling the facial distortion caused by perspective projection. To our knowledge, PerspNet [10] achieves great success in this area and is selected as the official baseline. However, PerspNet suffers from severe precision degradation in dense regions such as eye edges and mouth corners when employing UV mapping, due to the uneven distribution of 3D vertices. To address this, we design a vertex refinement module that makes all 3D vertices more accurate by predicting extra offsets. Moreover, in order



to enhance the 3D features, we propose a multi-task learning module. Specifically, it learns to regress the normal UV map, the depth map, and the normal map for each single image. Since each of the three contains rich 3D structure information, the network is guided to learn more semantic 3D features. Notably, this learning module is only used in the training phase and is removed in the inference phase, so it improves the face reconstruction performance at zero inference cost. By carefully designing and organizing the vertex refinement module and the multi-task learning module, we propose a hybrid task learning based 3D face reconstruction method called HiFace. We conduct extensive experiments that indicate the effectiveness of the proposed modules and the superiority of our HiFace. In summary, the main contributions of our work in the WCPA challenge are three-fold.

1. We propose a vertex refinement module to overcome the precision degradation introduced by the UV mapping process.
2. We propose a multi-task learning module that guides the network to enhance 3D features without additional inference cost.
3. Extensive experiments and the final official ranking show that our method outperforms a significant number of approaches by a large margin on the test set, demonstrating the superiority of our method.

2 Related Work

We focus on 3D face reconstruction methods that take only a single image as input under perspective projection.

2.1 3DMM Based

Most of the related methods [1,7,15–17] rely on the 3D Morphable Model (3DMM) [2], which is a powerful reconstruction tool due to its effective control of the facial space. However, the statistical 3DMM adds structure and priors to the face reconstruction process. It imposes representation limitations, especially when fine-grained facial details are required. S2F2 [5] proposes a coarse-to-fine fashion to deliver high-fidelity face reconstruction from multiple inference stages. It achieves visually appealing reconstruction. However, the reference model also restricts the face geometry reconstructed by S2F2.

2.2 Model-Free

Some methods [6,9,10] propose to achieve model-free reconstruction. Specifically, [9] maps the image pixels to full 3D facial geometry via volumetric CNN regression. [6] reconstructs full facial structure along with semantic meaning. [10] proposes Perspective Network (PerspNet) to recover 3D facial geometry under



Fig. 1. The overview of our method. The pipeline is inherited from the PerspNet. The components newly employed by us are highlighted with the dotted boxes.

perspective projection by estimating the 6DoF pose. PerspNet outperforms current state-of-the-art methods on public datasets. In this report, we mainly explore the face reconstruction performance in a model-free framework on the WCPA challenge dataset.

3 Method

3.1 Architecture

The whole architecture of our method is shown in Fig. 1. The UV position map, the reconstructed 3D face, the 2D-3D correspondence, the corresponding 3D points, the face segment, and the sampled 2D points are inherited from PerspNet. We newly propose the vertex refinement module (VRM) and the multi-task learning module. In the following, we introduce the proposed modules in detail.


3.2 Vertex Refinement Module

The proposed VRM is designed to refine all 3D vertices. Specifically, following the 3D face shape reconstruction in PerspNet, the VRM first predicts the UV position map S ∈ R^{H×W×3} that records the 3D coordinates of the 3D facial structure. Then, taking the UV coordinates, we apply grid sampling on the predicted UV position map to obtain the 3D vertices X′. In sparse regions, the UV coordinates are distinguishable, so the corresponding regions of the UV position map are accurate enough. However, since the UV coordinates are distributed unevenly, there is inherent precision degradation in generating the target UV position map where the UV coordinates are dense. Therefore, in addition to predicting the UV position map, the VRM also predicts extra vertex-wise offsets Δ ∈ R^{n×3} for the reconstructed 3D face shape. The offsets Δ are added to the 3D vertices X′ to obtain the refined reconstructed 3D face shape X ∈ R^{n×3}. In the training phase, we employ a weighted L1 loss function and a plain L1 loss function to supervise the training of the UV position map and of the 3D vertices, respectively. They can be denoted as

L_S = ||(S − Ŝ) · W(S_coord)||,  (1)

where Ŝ is the target UV position map, S_coord ∈ R^{2×n} represents the UV coordinates that record the 2D locations of the n 3D vertices in the UV position map, and W(·) is a weight function for S, set to 1 at the UV coordinates and 0 elsewhere. Notably, the WCPA challenge dataset does not provide target UV position maps; hence, we generate them offline.

X = X′ + Δ,   L_X = ||X − X̂||,  (2)

where X̂ is the target 3D vertices.
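For clarity, a condensed sketch of the refinement step in Eq. (2), assuming PyTorch: the predicted UV position map is grid-sampled at the fixed UV coordinates and the predicted vertex-wise offsets are added. Tensor shapes and the helper name are illustrative, not the authors' implementation.

```python
# Condensed sketch of the vertex refinement step (Eq. (2)); assumes PyTorch.
# The UV position map S has shape (B, 3, H, W); uv_coords holds the fixed UV
# coordinates of the n mesh vertices, normalized to [-1, 1] for grid_sample.
import torch
import torch.nn.functional as F


def refine_vertices(uv_position_map, uv_coords, offsets):
    """
    uv_position_map: (B, 3, H, W) predicted UV position map S
    uv_coords:       (n, 2) vertex locations in the UV plane, in [-1, 1]
    offsets:         (B, n, 3) vertex-wise offsets Delta predicted by the VRM head
    returns:         (B, n, 3) refined 3D face shape X = X' + Delta
    """
    b = uv_position_map.shape[0]
    grid = uv_coords.view(1, 1, -1, 2).expand(b, -1, -1, -1)        # (B, 1, n, 2)
    sampled = F.grid_sample(uv_position_map, grid, align_corners=True)
    x_prime = sampled.squeeze(2).permute(0, 2, 1)                   # (B, n, 3)
    return x_prime + offsets


# Usage with placeholder tensors (n = 1220 vertices, as in the challenge mesh).
s = torch.rand(2, 3, 256, 256)
uv = torch.rand(1220, 2) * 2 - 1
delta = torch.zeros(2, 1220, 3)
x = refine_vertices(s, uv, delta)   # (2, 1220, 3)
```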

3.3 Multi-task Learning Module

PerspNet extracts 3D features to predict UV position maps. During our experiments, we find that using heavy backbones brings overfitting. To alleviate this, we design a multi-task learning module (MTLM) to guide the network to learn more 3D semantic information. Specifically, the multi-task learning module has three outputs, i.e. normal UV maps S_normalUV ∈ R^{H×W×3}, depth maps S_depth ∈ R^{H×W×3}, and normal maps S_normal ∈ R^{H×W×3}. Each of them contains specific 3D structure information. The module shares the same decoded 3D features with the UV position map prediction branch, guiding the 3D features to encode richer 3D information. We use the L1 loss to supervise the training of the multi-task learning module, which can be denoted as

L_M = λ1 ||S_normalUV − Ŝ_normalUV|| + λ2 ||S_depth − Ŝ_depth|| + λ3 ||S_normal − Ŝ_normal||,  (3)

where λ1, λ2, λ3 are balance coefficients, set to 40, 1, and 1, respectively.
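As a small illustration, a hedged PyTorch sketch of the weighted multi-task L1 loss in Eq. (3); the function name and default weights (taken from the text) are the only assumptions.

```python
# Sketch of the multi-task L1 loss in Eq. (3); assumes PyTorch. The balance
# coefficients follow the values given in the text (40, 1, 1).
import torch.nn.functional as F


def multitask_loss(pred_normal_uv, gt_normal_uv, pred_depth, gt_depth,
                   pred_normal, gt_normal, weights=(40.0, 1.0, 1.0)):
    l1, l2, l3 = weights
    return (l1 * F.l1_loss(pred_normal_uv, gt_normal_uv)
            + l2 * F.l1_loss(pred_depth, gt_depth)
            + l3 * F.l1_loss(pred_normal, gt_normal))
```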



Fig. 2. We present the strategies that boost the face reconstruction performance on the WCPA challenge dataset (validation set).

4 Experiments

4.1 Dataset

The WCPA challenge dataset contains 447,185 images captured from 250 subjects. Each subject performs 33 sets of actions throughout the frames, such as turning the head from left to right with the mouth closed and turning the head from up to down with the eyes open. Variations in persons, postures, and environments present challenges for the face reconstruction task. Officially, 50 subjects are assigned to the test set. We divide the remaining 200 subjects into training and validation sets in a ratio of 3:1. The following results and comparisons are evaluated on the validation set.

4.2 Implementation Details

Hyper-parameters. The number of points n in the reconstructed 3D face is 1,220. We use the AdamW optimizer to train the networks. The initial learning rate is 1e–4, decreasing to 1e–5 after the 15th epoch. The weight decay is set to 1e–3. The resolutions of the input images, segments, and ground-truth maps are 384 × 384. The number of sampled 2D points is 1,024. The batch size is set to 64. We train all models for 25 epochs.

Strategies. In order to make the distance error as low as possible, we utilize extensive strategies. Specifically, as shown in Fig. 2, our baseline is a reproduced PerspNet that achieves a distance error of 41.50 on the validation set. PerspNet resizes the input images to 192 × 192. We argue that the face detail information is inadequate in such a setting, so we resize the input images and target maps to 384 × 384, which reduces the distance error by 1.34. Besides, the original ResNet-18 backbone is replaced with Swin-L pretrained on ImageNet-22k [4] to enhance the features. Quantitatively, after being equipped with this deeper backbone, our model further decreases the distance error by 2.67. We also change all batch



normalization (BN) layers to synchronized batch normalization (SyncBN) [14] layers, which brings a distance error reduction of 0.01. For the estimation of rotation and translation, we replace the non-differentiable solvePnP implemented by OpenCV with the differentiable EPro-PnP [3], which reduces the distance error by 0.02. Moreover, dice loss [13] is used to supervise the training of the face segment branch, yielding a distance error reduction of 0.18. The proposed VRM and MTLM decrease the distance error by 0.61. Finally, we ensemble the results output by several models, which brings the distance error to 34.46.
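For illustration, a hedged PyTorch sketch of the optimization setup listed under the Implementation Details above (AdamW, learning rate 1e-4 dropping to 1e-5 after the 15th epoch, weight decay 1e-3); the model is a placeholder and the scheduler choice is an assumption.

```python
# Minimal sketch of the optimization setup described in the Implementation Details;
# assumes PyTorch, with a placeholder model standing in for the full network.
from torch import nn, optim

model = nn.Linear(10, 10)  # placeholder model
optimizer = optim.AdamW(model.parameters(), lr=1e-4, weight_decay=1e-3)
# Learning rate drops from 1e-4 to 1e-5 (factor 0.1) after the 15th epoch.
scheduler = optim.lr_scheduler.MultiStepLR(optimizer, milestones=[15], gamma=0.1)

for epoch in range(25):
    # ... one training epoch over the 384x384 crops with batch size 64 ...
    scheduler.step()
```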

4.3 Comparisons and Ablation Studies

Backbones. We first explore the impact of different backbones on our baseline. As listed in Table 1, Swin-L performs better than ResNet-18 and CBNet on the validation set in terms of distance error. Thus, Swin-L is used as the feature extraction network in the following experiments.

Table 1. The impact of different backbones on the WCPA challenge dataset (validation set).

Backbone        Distance error ↓
ResNet-18 [8]   40.16
CBNet [11]      38.79
Swin-L [12]     37.49

VRM and MTLM. Then, we further analyze the effectiveness of the proposed VRM and MTLM on the validation set. As shown in Table 2, the two modules are both effective in reducing the distance error. In particular, VRM brings a significant distance error reduction. When they work together in our method HiFace, the lowest distance error is obtained, demonstrating their effectiveness.

Table 2. The effectiveness of VRM and MTLM on the WCPA challenge dataset (validation set).

VRM   MTLM   Distance error ↓
✗     ✗      37.28
✓     ✗      36.71
✗     ✓      37.23
✓     ✓      36.67

Visualization. As shown in Fig. 3, we present visualizations on the WCPA challenge dataset (test set). The pure images are the inputs. The estimated



Fig. 3. Visualizations on the WCPA challenge dataset (test set). Our HiFace achieves high-quality face reconstruction with the input images containing different persons in various postures.



Fig. 4. We present the predicted meshes of ID-239763 in the WCPA challenge dataset (test set). There are 33 different expressions. Our HiFace shows reliable face reconstruction abilities.



6DoF poses and reconstructed 3D faces are rendered on the input images to present the face reconstruction quality. Our HiFace successfully reconstructs faces from input images containing different persons in various postures. We further present the face reconstruction performance of our HiFace on 33 different expressions. As shown in Fig. 4, the reconstruction results are accurate and robust across different expressions.

4.4 Discussions

While participating in the WCPA challenge, we tried some other approaches. 1) Utilizing the SOTA differentiable PnP method EPro-PnP for 6DoF solving to achieve end-to-end training and inference: experiments show that the resulting distance error reduction is quite limited. We speculate that EPro-PnP may disturb the learning of structure information. 2) Regressing the reconstructed 3D faces directly instead of sampling them from UV position maps: this decreases the distance error slightly, but it is a sub-optimal method compared to VRM. We argue that the UV mapping naturally introduces 3D face structure information and helps the network to learn semantic 3D features.

4.5 Conclusion

The official baseline method PerspNet works well in most cases of the WCPA challenge. However, the inevitable precision degradation introduced by the UV mapping process limits further performance improvement. To address this, we propose a hybrid task learning based 3D face reconstruction method called HiFace. Specifically, we propose a vertex refinement module that predicts additional vertex-wise offsets to compensate for the precision degradation, and a multi-task learning module to enhance the 3D features. Extensive experiments and the final official ranking show the superiority of our HiFace.

References

1. Andrus, C., et al.: FaceLab: scalable facial performance capture for visual effects. In: The Digital Production Symposium, pp. 1–3 (2020)
2. Blanz, V., Vetter, T.: A morphable model for the synthesis of 3D faces. In: Proceedings of the 26th Annual Conference on Computer Graphics and Interactive Techniques, pp. 187–194 (1999)
3. Chen, H., Wang, P., Wang, F., Tian, W., Xiong, L., Li, H.: EPro-PnP: generalized end-to-end probabilistic perspective-n-points for monocular object pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 2781–2790 (2022)
4. Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., Fei-Fei, L.: ImageNet: a large-scale hierarchical image database. In: 2009 IEEE Conference on Computer Vision and Pattern Recognition, pp. 248–255. IEEE (2009)
5. Dib, A., Ahn, J., Thebault, C., Gosselin, P.H., Chevallier, L.: S2F2: self-supervised high fidelity face reconstruction from monocular image. arXiv preprint arXiv:2203.07732 (2022)



6. Feng, Y., Wu, F., Shao, X., Wang, Y., Zhou, X.: Joint 3D face reconstruction and dense alignment with position map regression network. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) Computer Vision – ECCV 2018. LNCS, vol. 11218, pp. 557–574. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01264-9_33
7. Gerig, T., et al.: Morphable face models - an open framework. In: 2018 13th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2018), pp. 75–82. IEEE (2018)
8. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016)
9. Jackson, A.S., Bulat, A., Argyriou, V., Tzimiropoulos, G.: Large pose 3D face reconstruction from a single image via direct volumetric CNN regression. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 1031–1039 (2017)
10. Kao, Y., et al.: Single-image 3D face reconstruction under perspective projection. arXiv preprint arXiv:2205.04126 (2022)
11. Liu, Y., et al.: CBNet: a novel composite backbone network architecture for object detection. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 34, pp. 11653–11660 (2020)
12. Liu, Z., et al.: Swin transformer: hierarchical vision transformer using shifted windows. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 10012–10022 (2021)
13. Milletari, F., Navab, N., Ahmadi, S.A.: V-Net: fully convolutional neural networks for volumetric medical image segmentation. In: 2016 Fourth International Conference on 3D Vision (3DV), pp. 565–571. IEEE (2016)
14. Peng, C., et al.: MegDet: a large mini-batch object detector. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 6181–6189 (2018)
15. Suwajanakorn, S., Kemelmacher-Shlizerman, I., Seitz, S.M.: Total moving face reconstruction. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8692, pp. 796–812. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-10593-2_52
16. Thies, J., Zollhofer, M., Stamminger, M., Theobalt, C., Nießner, M.: Face2Face: real-time face capture and reenactment of RGB videos. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2387–2395 (2016)
17. Wu, C., Bradley, D., Gross, M., Beeler, T.: An anatomically-constrained local deformation model for monocular face capture. ACM Trans. Graphics 35(4), 1–12 (2016)

Multi-view Canonical Pose 3D Human Body Reconstruction Based on Volumetric TSDF

Xi Li(B)

University of Electronic Science and Technology of China, Chengdu, China
[email protected]

Abstract. In this report, we present our solution for track 1, multi-view based 3D human body reconstruction, of the ECCV 2022 WCPA Challenge: From Face, Body and Fashion to 3D Virtual Avatars. We developed a variant network based on TetraTSDF, called TetraTSDF++ in this report, to reconstruct detailed 3D human body models in the canonical T-pose from multi-view images of the same person with different actions and viewing angles. The method first fuses the features of the different views, then infers the volumetric truncated signed distance function (TSDF) within a preset human body shell, and finally obtains the human surface through the marching cubes algorithm. The best chamfer distance score of our solution is 0.9751, and our solution took 2nd place on the leaderboard.

Keywords: Multi-view · Human body reconstruction · Canonical T-pose · TSDF

1 Introduction

3D human body reconstruction from a single image is ill-posed due to the lack of sufficient and clear visual clues. With the help of deep neural networks and parametric 3D human body models like SMPL [5], it is possible to improve the plausibility of the prediction results to some extent. Obviously, by integrating more views, neural networks can rely on clearer visual information to reconstruct a more refined three-dimensional model. Most current studies assume different views of the same human body maintaining the same posture [3,7]. MVPHuman [8] recently raised a new problem, namely reconstructing 3D human avatars from multiple unconstrained frames, which is independent of assumptions on camera calibration, capture space, and constrained motion. To address this valuable new problem, this report develops TetraTSDF++, a variant network based on TetraTSDF [6], to reconstruct detailed T-pose 3D human avatars from



multi-view images of a person with different poses. Our method first uses PaddleSeg to extract the foreground from the images, then fuses the features of the different views and predicts the volumetric TSDF within the preset human body outer shell; finally, the human body surface is obtained by the marching cubes algorithm. The reason why TSDF is used as the representation of the human model is that a fine mesh model contains millions of vertices and faces: it is still difficult to process such meshes directly with neural networks such as graph convolutional networks, and a standard voxel representation would consume huge computing resources. In addition, parametric human body models like SMPL are limited by too few faces and do not provide sufficient surface details such as clothes, hair, and so on. Therefore, TetraTSDF proposes to use the volumetric TSDF values within a preset human body shell to express the human body surface, which effectively reduces the computational burden and reconstructs human avatars with details.

2 Dataset

MVP-Human (Multi-View and multi-Pose 3D Human) is a large-scale dataset [8], which contains 400 subjects, each of which has 15 scans in different poses and 8-view images for each pose, providing 6,000 3D scans and 48,000 in-the-wild images in total. Each image is a 1280 × 720 jpeg file. Each action contains 8 RGB images from different angles [0, 45, 90, 135, 180, 225, 270, 315]. Each sample includes a number of pairs of multi-view RGB images, texture maps, canonical pose ground truth, and skinning weights.

2.1 Data Analysis

After analyzing the dataset, we eliminated the bad-case subject 226831, so there are 199 valid samples; we preliminarily use 150 samples as the training set and 49 as the validation set. In order to obtain better scores in the competition, we analyzed the data distribution of the test set and selected actions and angles online with approximately the same distribution during training.

2.2 Data Preprocessing

Data processing includes two aspects: one is to separate the image foreground, and the other is to generate TSDF binary files from the canonical pose ground-truth models.

Human Segmentation. The images of the MVPHuman dataset are all collected in the real world. The person is in the center of the images, but inevitably there is much redundant background information in the pictures, which interferes with the prediction. Therefore, correctly segmenting the foreground helps to ensure the inference ability of the neural network. We used



the PP-HumanSeg algorithm from PaddleSeg [1,4] as a tool to separate the foreground, and finally placed the human body in the center of a newly generated 512 × 512 image. The body occupies about 95% of the area of the whole image, and the background is uniformly set to dark green (Fig. 1 shows some image processing results).

Fig. 1. Image processing results of different angles from subject 197019 Action01

Generate TSDF. On the other hand, in order to reduce the training time, we generate the corresponding TSDF binary file for each ground-truth model offline. This differs from the method in [6], which computes the volumetric TSDF values after changing the posture of the human body into a star-shape using the SMPL parameters. In order to avoid the loss of accuracy caused by the posture change, we compute the volumetric TSDF values directly in the T-pose, while the preset human body shell still follows the approach in [6].

3 Method

3.1 Network Architecture

We developed TetraTSDF++ to reconstruct detailed 3D human avatars in the T-pose from multi-view images of the same person with different actions and angles. The method first fuses the features of the different views and then infers the volumetric truncated signed distance function (TSDF) within the preset human body shell (Fig. 2 shows the network architecture of our TetraTSDF++).

3.2 TSDF

We compute the truncated TSDF values between each voxel summit and its nearest point according to a threshold on the Euclidean distance (Fig. 3 shows the preset human body outer shell and the canonical pose ground-truth 3D mesh). Let v be a voxel summit and v̂ the nearest point to v on the canonical pose GT; then the Euclidean distance is

ε = ||v − v̂||_2.  (1)



Fig. 2. Network architecture of TetraTSDF++. The inputs are images of the same person with different actions and from different angles. The image features of the multiple views are extracted by a weight-sharing backbone network, which consists of an hourglass network and a convolutional neural network. The backbone outputs a one-dimensional feature vector per view; the image features of all views are added and fed into the PCN to predict the volumetric TSDF values. The PCN is a coarse-to-fine upsampling network, and each level of the PCN conforms to the preset human body model. Finally, the marching cubes algorithm is used to reconstruct the mesh from the TSDF.

Fig. 3. Calculate volumetric TSDF value in the preset human body outer shell.

Let n̂ be the normal vector at v̂, and let

ρ = n̂ · (v − v̂).  (2)

To sum up, the TSDF is computed as

tsdf = ρ/τ   if τ ≥ ε
tsdf = 0     if τ < ε and ρ ≤ 0
tsdf = 1     if τ < ε and ρ > 0,  (3)

where τ is a threshold value of 3 cm.


3.3 Loss Function

We use the mean squared error between the predicted and ground-truth TSDF as the loss function.

3.4 Implementation Details

We build the networks on the TensorFlow and Keras frameworks. We use a batch size of 4, a learning rate of 4e–4, and 120 epochs, and the learning rate is divided by 10 when the epoch reaches 100. We train and run inference on a single NVIDIA Tesla V100 GPU.
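A hedged tf.keras sketch of this training configuration; the placeholder model, the Adam optimizer, and the callback-based schedule are assumptions (the report only specifies the framework, batch size, learning rate, epoch count, and the drop at epoch 100).

```python
# Sketch of the training configuration described above; assumes tf.keras,
# with a placeholder model standing in for TetraTSDF++.
import tensorflow as tf

model = tf.keras.Sequential([tf.keras.layers.Dense(1)])  # placeholder network


def schedule(epoch, lr):
    # Initial learning rate 4e-4, divided by 10 when the epoch reaches 100.
    return 4e-4 if epoch < 100 else 4e-5


# MSE between predicted and ground-truth TSDF, as in Sect. 3.3.
model.compile(optimizer=tf.keras.optimizers.Adam(4e-4), loss="mse")
callbacks = [tf.keras.callbacks.LearningRateScheduler(schedule)]
# model.fit(train_data, epochs=120, batch_size=4, callbacks=callbacks)
```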

4 Experiment

In this challenge, the chamfer distance (CD) between the GT meshes and the predicted meshes is used as the score; the smaller the CD, the better the reconstruction quality. We recorded the scores of different versions of our method, which are given in Table 1.

Table 1. Experiment results of different methods.

Methods                   Chamfer distance (CD)
Mean Shape                1.2187
TetraTSDF++(Xception)     1.1641
TetraTSDF++(concat)       1.0159
TetraTSDF++(add)          0.9932
TetraTSDF++(add*)         0.9751

Note that we computed the average of all 3D human models given in the dataset and took this 'Mean Shape' model as the inference result for every sample in the test set, as a benchmark. Different versions of our method outperform this benchmark, which validates the effectiveness of our method. 'TetraTSDF++(Xception)' means that the traditional convolutional neural network Xception [2] is used as the backbone network. For 'TetraTSDF++(concat)', the backbone network is Hourglass Network + CNN, and the multi-view features are fused by concatenation. The experimental results show that Hourglass Network + CNN is better than Xception for this task. 'TetraTSDF++(add)' replaces the concat operation with addition, and for 'TetraTSDF++(add*)' we use 191 of the 199 valid samples as the training set and 8 as the validation set, obtaining the best score when more training samples are used. Finally, a visualization of the human body reconstruction of our solution is given in Fig. 4.



Fig. 4. Reconstruction result visualization.

5 Conclusion

This challenge and MVPHuman proposed a new problem, that is, to reconstruct 3D human avatars from multiple unconstrained images, which is independent of the assumptions of camera calibration, capture space and constrained motion. In this report, we developed TetraTSDF++, a variant network based on TetraTSDF, to reconstruct detailed T-pose 3D human avatars from multi-view images of one person with different poses. We utilized appropriate data processing, effective networks and some strategies and finally achieved good results in this challenge.

References

1. Authors, P.: PaddleSeg, end-to-end image segmentation kit based on PaddlePaddle (2019). https://github.com/PaddlePaddle/PaddleSeg
2. Chollet, F.: Xception: deep learning with depthwise separable convolutions. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1251–1258 (2017)
3. Kolotouros, N., Pavlakos, G., Daniilidis, K.: Convolutional mesh regression for single-image human shape reconstruction. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 4501–4510 (2019)
4. Liu, Y., et al.: PaddleSeg: a high-efficient development toolkit for image segmentation (2021)
5. Loper, M., Mahmood, N., Romero, J., Pons-Moll, G., Black, M.J.: SMPL: a skinned multi-person linear model. ACM Trans. Graphics 34(6), 1–16 (2015)
6. Onizuka, H., Hayirci, Z., Thomas, D., Sugimoto, A., Uchiyama, H., Taniguchi, R.-i.: TetraTSDF: 3D human reconstruction from a single image with a tetrahedral outer shell. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 6011–6020 (2020)
7. Zheng, Z., Yu, T., Liu, Y., Dai, Q.: PaMIR: parametric model-conditioned implicit representation for image-based human reconstruction. IEEE Trans. Pattern Anal. Mach. Intell. 44(6), 3170–3184 (2021)
8. Zhu, X., et al.: MVP-Human dataset for 3D human avatar reconstruction from unconstrained frames. arXiv preprint arXiv:2204.11184 (2022)

End to End Face Reconstruction via Differentiable PnP

Yiren Lu1(B) and Huawei Wei2

1 State University of New York at Buffalo, Buffalo, USA
[email protected]
2 Tencent, Shenzhen, China
[email protected]

Y. Lu—Work done during an internship at Tencent.
Y. Lu and H. Wei—Contributed equally to this work.

Abstract. This is a challenge report for the ECCV 2022 WCPA Challenge, Face Reconstruction Track, briefly explaining how we approached this challenge. We design a two-branch network to accomplish this task, whose roles are face reconstruction and face landmark detection. The former outputs canonical 3D face coordinates. The latter outputs pixel coordinates, i.e. the 2D mapping of the 3D coordinates under head pose and perspective projection. In addition, we utilize a differentiable PnP (Perspective-n-Points) layer to finetune the outputs of the two branches. Our method achieves very competitive quantitative results on the MVP-Human dataset and wins a 3rd prize in the challenge.

Keywords: Computer vision · Face alignment · Head pose estimation · Face reconstruction · Face landmark detection

1 Introduction

Face reconstruction from a single image has a wide range of applications in entertainment and CG fields, such as AR make-up and face animation. Current methods [1,4,5,7] generally use a simple orthogonal projection for face reconstruction. The premise of this assumption is that the face is far enough away from the camera. When the distance is very close, the face suffers from severe perspective distortion, and the reconstructed face then deviates considerably from the real face. To solve this problem, perspective projection must be adopted as the camera model. In this paper, we propose a method to reconstruct 3D faces under perspective projection. We develop a two-branch network to regress the 3D facial canonical mesh and the 2D face landmarks simultaneously. For the 3D branch, we adopt a compact PCA representation [7] to replace the dense 3D points, which makes the network easier to learn. For the 2D branch, we take uncertainty learning into account. Specifically, the Gaussian negative log-likelihood loss [11] is introduced as our loss function. It enhances the robustness of the model, especially for faces with large poses and extreme expressions.



In addition, a differentiable PnP (Perspective-n-Points) layer [2] is leveraged to finetune our network after the two branches converge. It brings interaction between the two branches, so that they can be jointly trained and promote each other. Our method achieves very competitive results on the MVP-Human dataset.

Fig. 1. This figure shows the overall pipeline of our method. First, the 800 × 800 image will go through a Coarse Landmark Detector to get coarse landmarks so that we can utilize them to crop the face out. Then the cropped 256 × 256 image will be fed into a shared backbone with different regressors. After that, a canonical mesh and a 256 × 256 landmark will be output. Finally, we use the unwarped landmark and the canonical mesh to find out the Rotation and Translation through PnP.

2 Methodology

We decouple the problem into two parts: head pose estimation and canonical mesh prediction. For head pose estimation, the commonly used methods can be divided into two categories. The first category directly regresses the 6DoF pose from the image. The second utilizes PnP (Perspective-n-Points) to compute the 6DoF pose from the correspondence between 2D and 3D coordinates. We find it hard for the network to directly regress the 6DoF pose well, especially the translation along the depth axis, so we choose the second category as our framework. Our strategy is to regress the canonical 3D mesh and the 2D facial landmarks separately. To this end, we develop a two-branch network, which respectively outputs the canonical 3D face coordinates and the landmarks, i.e. the perspective projection of the 3D coordinates. When both branches converge, we use a differentiable PnP layer to jointly optimize the two branches. Its purpose is to make the 6DoF pose computed from the 2D and 3D coordinates as close to the ground-truth pose as possible. In the inference phase, we directly use PnP to compute the 6DoF pose from the outputs of the two branches. An overview of our pipeline is shown in Fig. 1.



2.1 Data Preparation

The original image size is 800 × 800, and training with images of this size is very inefficient. Hence, it is necessary to crop the images to an appropriate size. To achieve this, we train a network that predicts the 2D face landmarks on the 800 × 800 image. We utilize these landmarks to frontalize the face (rotate the face to a state where the roll is 0) and crop the face out. After this process, we obtain a 256 × 256 aligned image and a warp matrix for each image, with which the predicted 256 × 256 landmarks can be transformed back to the corresponding positions on the original 800 × 800 image.

2.2 Facial Landmark Detection

Generally, facial landmark detection is treated as a regression task that finds the corresponding position of each landmark on the image. We also follow this strategy, but with some small modifications.

Probabilistic Landmark Regression [14]. Instead of only regressing the coordinates x and y of each landmark point, we introduce the Gaussian negative log-likelihood loss (GNLL) to regress the uncertainty of each point along with its coordinates:

L_gnll = Σ_{i=1}^{|Lmk|} ( log(σ_i²) + ||μ_i − μ̂_i||² / (2σ_i²) ).  (1)

Here Lmk represents the set of all 1220 landmarks, and Lmk_i = (μ_i, σ_i). In Lmk_i, μ_i = (x_i, y_i) represents the ground-truth coordinates of each landmark, and σ_i indicates the uncertainty or invisibility of each landmark. We find that, with the help of GNLL, the predicted landmarks are more accurate than when training with the regular MSE loss.

CoordConv [13]. We apply CoordConv in our landmark detection branch, where it plays a position encoding role. With this technique, the input of the landmark detection model has 5 channels (RGBXY) instead of 3 channels (RGB). We find that CoordConv helps to improve the accuracy of the landmark regression.
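For illustration, a hedged PyTorch sketch of the GNLL in Eq. (1); predicting log(σ²) instead of σ is a common numerical-stability choice assumed here, not necessarily the authors' exact parameterization.

```python
# Sketch of the Gaussian negative log-likelihood in Eq. (1); assumes PyTorch.
# Predicting log(sigma^2) is an assumption made for numerical stability.
import torch


def gnll_loss(pred_mu, gt_mu, pred_log_var):
    """
    pred_mu, gt_mu: (B, 1220, 2) predicted / ground-truth landmark coordinates
    pred_log_var:   (B, 1220) predicted log(sigma_i^2) per landmark
    """
    sq_err = ((pred_mu - gt_mu) ** 2).sum(dim=-1)           # ||mu_i - mu_hat_i||^2
    loss = pred_log_var + sq_err / (2.0 * pred_log_var.exp())
    return loss.sum(dim=-1).mean()                          # sum over landmarks, mean over batch
```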

2.3 3D Face Reconstruction

In this branch, we need to predict the coordinates of each point of the canonical mesh. The most direct way is to regress the coordinates directly; however, we find this strategy not ideal. Our strategy is to convert the mesh into a compact PCA representation. We verify that with this strategy the network is easier to train and higher accuracy can be obtained.



Construct PCA Space. We manually select a face with a neutral expression from each of the 200 persons in the training set. In addition, to enhance the representation ability of the PCA, 51 ARKit blendshape meshes (https://developer.apple.com/documentation/arkit/arfaceanchor/blendshapelocation) are involved (the TongueOut blendshape is excluded). We use these 251 face meshes to fit a PCA model, whose number of components is eventually 250. The PCA model is then used to fit the coefficients for each training sample.

Loss Function. We use the Vertex Distance Cost (VDC) and the Weighted Parameter Distance Cost (WPDC) proposed by 3DDFA V2 [6,7] as our loss functions. The VDC loss is

L_vdc = ||X_3d^pred − X_3d^gt||²,  (2)

where X_3d^pred represents the 3D mesh predicted by the network and X_3d^gt the ground-truth 3D mesh. The WPDC loss is

L_wpdc = ||W · (PCA_pred − PCA_gt)||²,  (3)

where PCA_pred represents the predicted PCA coefficients, PCA_gt the ground-truth ones, and W the weight of each coefficient.

2.4 PnPLoss

In order to give the two branches stronger interaction, we jointly optimize them through a differentiable PnP layer. Specifically, we utilize EPro-PnP [2], recently proposed at CVPR 2022. EPro-PnP interprets the pose as a probability distribution and replaces the argmin function with a softargmin in the optimization procedure. Hence, it is a differentiable operation and can be plugged into neural network training. The inputs of the PnP layer are the 2D landmarks and the 3D mesh coordinates, and the pose is output. By making this pose as close to the ground-truth pose as possible, the coordinate predictions of the 2D/3D branches are finetuned to a more accurate state. The PnP loss is expressed as

P^pred = EproPnP(X_2d^pred, X_3d^pred),  (4)

L_pnp = ||P^pred X_3d^gt − P^gt X_3d^gt||_2 + ||P^pred X_3d^pred − P^gt X_3d^gt||_2,  (5)

where X_2d^pred and X_3d^pred represent the 2D landmarks and the 3D mesh predicted by the network, and P^pred stands for the predicted pose output by EPro-PnP. Finally, the total loss function is

L_total = λ_1 L_gnll + λ_2 L_vdc + λ_3 L_wpdc + λ_4 L_pnp.  (6)



One thing to note is that we do not train from scratch with this PnP layer. We first train the network for several epochs without it, and then use the EPro-PnP layer to finetune the training.

2.5 Inference Phase

In the inference phase, we utilize OpenCV to implement the PnP module. Before solving PnP, we first unwarp the landmarks predicted on the 256 × 256 image back to 800 × 800 by multiplying with the inverse of the warp matrix.
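A hedged sketch of this inference step using OpenCV; the camera intrinsics K, the 3×3 homogeneous warp matrix, and the function name are placeholders standing in for the challenge-provided values and the preprocessing transform.

```python
# Sketch of the inference-time PnP step; assumes OpenCV and NumPy. The intrinsics
# K and the 3x3 warp matrix are placeholders for the actual challenge values.
import cv2
import numpy as np


def solve_pose(landmarks_256, warp_matrix, canonical_mesh, K):
    # Unwarp 256x256 landmarks back to the original 800x800 image.
    pts = np.concatenate([landmarks_256, np.ones((len(landmarks_256), 1))], axis=1)
    pts_800 = (np.linalg.inv(warp_matrix) @ pts.T).T[:, :2]

    # Standard PnP between the canonical 3D mesh and the unwarped 2D landmarks.
    ok, rvec, tvec = cv2.solvePnP(
        canonical_mesh.astype(np.float64),
        pts_800.astype(np.float64),
        K, None, flags=cv2.SOLVEPNP_ITERATIVE,
    )
    R, _ = cv2.Rodrigues(rvec)   # rotation vector -> 3x3 rotation matrix
    return R, tvec
```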

3 Experiments

3.1 Experimental Details

We use the aligned 256 × 256 images as training samples and apply color jitter and flip augmentation during training. ResNet50 [8] (pretrained on ImageNet [3]) is utilized as our backbone. The regressor of each 2D/3D branch is a simple FC layer. In the training stage, we use the Adam optimizer [12] with a learning rate of 1e–4. We first train for 10 epochs and then finetune the model for 5 epochs with EPro-PnP. The λ's in the total loss function are 0.01, 20, 10, and 2, respectively.

3.2 Head Pose Estimation

In this section we compare the performance of several methods for estimating the head pose Rt. We choose MAE_r, MAE_t, and ADD as our evaluation metrics, where MAE_r is the mean absolute error over the three components of the Euler angles (yaw, pitch, and roll), MAE_t is the mean absolute error over the three components of the translation (x, y, and z), and ADD is the second term of the challenge loss, shown in Eq. (7):

ADD = ||P^gt X_3d^gt − P^pred X_3d^gt||_2.  (7)

Direct 6DoF with CoordConv. This is our baseline method. As we are trying to regress the Rt that corresponds to the 800 × 800 image from a 256 × 256 image, we need to give the network some extra information. Therefore, we add a 2-channel coordmap XY indicating where each pixel of the 256 × 256 image is located on the original 800 × 800 image, and feed the 5-channel input to a ResNet50 backbone and a linear regressor. The output consists of 9 numbers, 6 for the rotation and 3 for the translation. We supervise the training with a geodesic loss [9] for the rotation and an MSE loss for the translation. From Table 1, we can see that although it cannot compete with PerspNet [10], it is much better than Direct 6DoF without the coordmap.
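A hedged NumPy sketch of building the 5-channel RGBXY input described above; the normalization to [-1, 1] and the helper name are assumptions, and the warp matrix is treated as the 3×3 transform from the original image to the crop.

```python
# Sketch of the 5-channel RGBXY input for the CoordConv variant; assumes NumPy.
# The coordmap stores, for each pixel of the 256x256 crop, its location in the
# original 800x800 image (normalized to [-1, 1] here, an assumption).
import numpy as np


def make_rgbxy(image_256, warp_matrix):
    h, w = image_256.shape[:2]
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    pts = np.stack([xs.ravel(), ys.ravel(), np.ones(h * w)], axis=0)      # (3, h*w)
    orig = np.linalg.inv(warp_matrix) @ pts                               # back to 800x800
    coordmap = (orig[:2] / 800.0) * 2.0 - 1.0
    coordmap = coordmap.reshape(2, h, w).transpose(1, 2, 0)               # (h, w, 2)
    return np.concatenate(
        [image_256.astype(np.float32), coordmap.astype(np.float32)], axis=-1
    )                                                                     # (h, w, 5)
```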



MSELoss & PnP. In this experiment we use landmark detection together with PnP to compute Rt; the choice of backbone and regressor is the same as in the previous experiment. The input of the network is simply the 3-channel RGB image, and we use only the landmark MSE loss to supervise the training. We can see that it outperforms PerspNet and results in an ADD loss of 9.79.

GNLL & PnP. In this experiment, we increase the input of the network from 3 channels to 5 channels by adding a 2-channel coordmap XY (the range of the coordmap is [0, 255], normalized into [–1, 1]) and concatenating it with the RGB channels. Besides, we utilize GNLL instead of the landmark MSE loss to supervise the training. This experiment reaches an ADD loss of 9.66, slightly better than the previous one.

Table 1. Comparison of different methods for 6DoF head pose estimation on the validation set. The results of Direct 6DoF and PerspNet are from the PerspNet [10] paper.

Method                     MAE_r   MAE_t   ADD
Direct 6DoF                1.87    9.06    21.39
Direct 6DoF (CoordConv)    1.11    5.58    13.42
PerspNet                   0.99    4.18    10.01
MSELoss & PnP              0.89    4.02    9.79
GNLL & PnP                 0.82    3.95    9.66

3.3 Face Reconstruction

Direct 3D Points. In this experiment we directly regress each 3D coordinate of the canonical mesh, using a point-wise MSE loss to supervise the training. From Table 2, we can see that the result is not satisfying: it only reaches a mean error of 1.82 mm.

Regress PCA Coefficients. Instead of regressing 3D coordinates, we regress the 250 PCA coefficients, which should be easier for the network to learn. The backbone and regressor remain the same as in the Direct 3D experiment. As expected, PCA results in a mean error of 1.68 mm and performs better than PerspNet.

3.4 Finetuning with PnP Layer

In this experiment, we introduce the newly proposed EPro-PnP. Instead of training from scratch, we finetune the network from the previous experiment. After finetuning, we can see from Table 3 that MAE_t and the ADD loss decrease noticeably. We attribute the effectiveness to the fact that the PnP layer can



Table 2. Comparison of different methods for face reconstruction on the validation set. The result of PerspNet is from the PerspNet paper.

Method      Median (mm)   Mean (mm)
PerspNet    1.72          1.76
Direct 3D   1.79          1.82
PCA         1.63          1.68

enhance the interaction between the 2D/3D branches. It drives the two branches not only to pursue a lower coordinate regression error, but also to make the pose estimation more accurate.

Table 3. Comparison of 6DoF head pose estimation with and without EPro-PnP on the validation set.

Method                     MAE_r   MAE_t   ADD
GNLL & PnP                 0.82    3.95    9.66
GNLL & PnP & EPro-PnP      0.81    3.86    9.48

3.5 Qualitative Results

We show some qualitative results in Fig. 2. It can be observed that our method can accurately reconstruct faces even in large poses and extreme expressions. In some cases, our method even outperforms the ground truth.

4 Tricks

4.1 Flip and Merge

When running inference on the test set, we flip the input image and pass both the flipped and the unflipped versions through the network at the same time. After obtaining the two results, we merge them by averaging to obtain the final result. One thing worth noting is that, in our tests, this strategy only works for the face reconstruction branch; for the landmark detection branch it is ineffective. We think this is because the face reconstruction task with PCA is, to some extent, similar to a classification problem.
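A minimal PyTorch sketch of this flip-and-merge test-time trick; any left/right correspondence handling that may be needed after mirroring is not spelled out in the report and is omitted here.

```python
# Sketch of the flip-and-merge test-time trick; assumes PyTorch. Left/right
# correspondence handling after mirroring is an implementation detail omitted here.
import torch


@torch.no_grad()
def flip_and_merge(model, image):
    """image: (B, C, H, W); model returns PCA coefficients of shape (B, 250)."""
    coeff = model(image)
    coeff_flipped = model(torch.flip(image, dims=[3]))   # horizontal flip along width
    return 0.5 * (coeff + coeff_flipped)
```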



Fig. 2. Visual comparison between our method and the ground truth obtained by the structured-light sensor of the iPhone 11. In some extreme cases, such as the first and fifth columns, our method outperforms the ground-truth result, which suggests we have learned the essence of this task.

5 Conclusions

In this competition, we design a two-branch network to solve the 3D face reconstruction task, and we propose to utilize a differentiable PnP layer to jointly optimize the two branches. Our method achieves a competitive result, scoring 33.84 on the leaderboard and ranking 3rd.

References

1. Bai, Z., Cui, Z., Liu, X., Tan, P.: Riggable 3D face reconstruction via in-network optimization. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 6216–6225 (June 2021)
2. Chen, H., Wang, P., Wang, F., Tian, W., Xiong, L., Li, H.: EPro-PnP: generalized end-to-end probabilistic perspective-n-points for monocular object pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 2781–2790 (2022)
3. Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., Fei-Fei, L.: ImageNet: a large-scale hierarchical image database. In: 2009 IEEE Conference on Computer Vision and Pattern Recognition, pp. 248–255 (2009). https://doi.org/10.1109/CVPR.2009.5206848
4. Feng, Y., Feng, H., Black, M.J., Bolkart, T.: Learning an animatable detailed 3D face model from in-the-wild images (2020). https://doi.org/10.48550/ARXIV.2012.04012, https://arxiv.org/abs/2012.04012
5. Feng, Y., Wu, F., Shao, X., Wang, Y., Zhou, X.: Joint 3D face reconstruction and dense alignment with position map regression network (2018). https://doi.org/10.48550/ARXIV.1803.07835, https://arxiv.org/abs/1803.07835



6. Guo, J., Zhu, X., Lei, Z.: 3DDFA (2018). https://github.com/cleardusk/3DDFA
7. Guo, J., Zhu, X., Yang, Y., Yang, F., Lei, Z., Li, S.Z.: Towards fast, accurate and stable 3D dense face alignment. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.M. (eds.) ECCV 2020. LNCS, vol. 12364, pp. 152–168. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58529-7_10
8. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016)
9. Hempel, T., Abdelrahman, A.A., Al-Hamadi, A.: 6D rotation representation for unconstrained head pose estimation. arXiv preprint arXiv:2202.12555 (2022)
10. Kao, Y., et al.: Single-image 3D face reconstruction under perspective projection. arXiv preprint arXiv:2205.04126 (2022)
11. Kendall, A., Gal, Y.: What uncertainties do we need in Bayesian deep learning for computer vision? In: Guyon, I., et al. (eds.) Advances in Neural Information Processing Systems, vol. 30. Curran Associates, Inc. (2017)
12. Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. CoRR abs/1412.6980 (2015)
13. Liu, R., et al.: An intriguing failing of convolutional neural networks and the CoordConv solution. arXiv abs/1807.03247 (2018)
14. Wood, E., et al.: 3D face reconstruction with dense landmarks (2022). https://doi.org/10.48550/ARXIV.2204.02776, https://arxiv.org/abs/2204.02776

W20 - Safe Artificial Intelligence for Automated Driving


The realization of highly automated driving relies heavily on the safety of AI. Demonstrations of current systems that are showcased on appropriate portals can give the impression that AI has already achieved sufficient performance and is safe. However, this by no means represents statistically significant evidence that AI is safe. A change in the environment in which the system is deployed quickly leads to significantly reduced performance of DNNs. The occurrence of natural or adversarial perturbations to the input data has fatal consequences for the safety of DNNs. In addition, DNNs offer insufficient explainability of their behavior, which drastically complicates the detection of mispredictions as well as the proof that AI is safe. This workshop addresses all topics related to the safety of AI in the context of highly automated driving.

October 2022

Timo Saemann Oliver Wasenmüller Markus Enzweiler Peter Schlicht Joachim Sicking Stefan Milz Fabian Hüger Seyed Ghobadi Ruby Moritz Oliver Grau Frederik Blank Thomas Stauner

One Ontology to Rule Them All: Corner Case Scenarios for Autonomous Driving

Daniel Bogdoll1,2(B), Stefani Guneshka2, and J. Marius Zöllner1,2

1 FZI Research Center for Information Technology, Karlsruhe, Germany
[email protected]
2 Karlsruhe Institute of Technology, Karlsruhe, Germany

D. Bogdoll and S. Guneshka—Contributed equally.

Abstract. The core obstacle towards a large-scale deployment of autonomous vehicles currently lies in the long tail of rare events. These are extremely challenging since they do not occur often in the training data used for deep neural networks. To tackle this problem, we propose the generation of additional synthetic training data, covering a wide variety of corner case scenarios. As ontologies can represent human expert knowledge while enabling computational processing, we use them to describe scenarios. Our proposed master ontology is capable of modeling scenarios from all common corner case categories found in the literature. From this one master ontology, arbitrary scenario-describing ontologies can be derived. In an automated fashion, these can be converted into the OpenSCENARIO format and subsequently executed in simulation. In this way, challenging test and evaluation scenarios can also be generated.

Keywords: Corner cases · Ontology · Scenarios · Synthetic data · Simulation · Autonomous driving

1 Introduction

For selected Operational Design Domains (ODD), autonomous vehicles of SAE level 4 [32] can already be seen on the roads [43]. However, it remains highly debated how the safety of these vehicles can be demonstrated and steadily improved in a structured way. In Germany, the first country with a federal law for level 4 autonomous driving, the safety of such vehicles needs to be demonstrated based on a catalog of test scenarios [17,18]. However, the coverage of rare but highly relevant corner cases [26] in scenario-based descriptions poses a significant challenge [34]. Data-driven, learned scenario generation approaches currently tend to focus on adversarial scenarios with a high risk of collision [13,20,36,42], neglecting other forms of corner cases. While there exist comprehensive taxonomies on the types and categories of corner cases [8,21], there exists no generation method tailored to these most important long-tail scenes and scenarios. Based on this, the verification and validation during testing, and ultimately the scalability of autonomous driving systems to larger ODDs in real-world deployments, remain enormous challenges.



To tackle these challenges, it is necessary to generate a large variety of rare corner case scenarios for the purposes of training, testing, and evaluation of autonomous vehicle systems. As shown by Tuncali et al. [40], model-, data-, and scenario-based methods can be used for this purpose. An extensive overview of these can be found in [6]. However, the authors find that, "while there are knowledge-based descriptions and taxonomies for corner cases, there is little research on machine-interpretable descriptions" [6]. To fill this gap between knowledge- and data-driven approaches for the description and generation of corner case scenarios (we follow the definitions of scene and scenario by Ulbrich et al. [41], where a scene is a snapshot and a scenario consists of successive scenes), we propose the first scenario generation method which is capable of generating all corner case categories, also called levels, described by Breitenstein et al. [8] in a scalable fashion, where all types of scenarios can be derived from a single master ontology. Based on the resulting scenario-describing ontologies, synthetic data of rare corner case scenarios can be generated automatically. This newly created training data will hopefully contribute to an increased robustness of deep neural networks to anomalies, helping to make them safer. For a general introduction to ontologies in the domain of information sciences, we refer to [31]. More details can be found in [19]. All code and data are available on GitHub. The remainder of this work is structured as follows: In Sect. 2, we provide an overview of related ontology-based scene and scenario description methods and outline the identified research gap. In Sect. 3, we describe how our proposed master ontology is designed and how scenario ontologies are automatically designed and generated from it. In Sect. 4, we demonstrate how different, concrete scenarios can be derived from the master ontology and how the resulting ontologies can be used to execute them in simulation. Finally, in Sect. 5, we provide a brief summary and outline next steps and future directions.

2 Related Work

While there exist many ways of describing scenarios [6], ontologies are the most powerful way of doing so, as they are not only human- and machine-readable but also extremely scalable for the generation of scenarios when used in the right fashion [22]. Ontologies are widely used for the description of scenarios. In the work of Bagschik et al. [5], an ontology is presented which describes simple highway scenarios based on a set of pre-defined keywords. In a later work, Menzel et al. [29] extend the concept to generate OpenSCENARIO and OpenDRIVE scenarios, while many of the relevant details were not modelled in the ontology itself but in post-processing steps. For the description of the surrounding environment of a vehicle, Fuchs et al. [15] especially focus on lanes and occupying traffic participants, while neglecting their actions. Li et al. [28] also create scenarios which are executed in a simulation environment, covering primarily situations where sudden braking maneuvers are necessary. Thus, their ontology is very domain-specific. They build upon their previous works [27,39,44]. Tahir and Alexander [38] propose an ontology that focuses on intersections due to their high collision rates. They show that their scenarios can be executed in simulation, while focusing on changing weather conditions. While they claim to have developed an ontology, the released code [45] only contains scripted scenarios, which might be derived from an ontology structurally. Herrmann et al. [22] propose an ontology for dataset creation, with a demonstrated focus on pedestrian detection, including pedestrian occlusions. Their ontology is structurally inspired by the Pegasus model [37] and consists of 22 sub-ontologies. It is capable of describing a wide variety of scenarios and translating them into simulation. However, since the ontology itself is neither described in detail nor publicly available, it does not become clear whether each frame requires a separate ontology or whether the ontology itself is able to describe temporal scenarios. In the OpenXOntology project by ASAM [3], an ontology is being developed with the purpose of unifying their different products, such as OpenSCENARIO or OpenDRIVE. Based on the large body of previous work in the field of scenario descriptions, this ontology is promising for further developments. However, at the moment, it serves the purpose of a taxonomy. Finally, de Gelder et al. [16] propose an extensive framework for the development of a "full ontology of scenarios". At the moment, they have not developed the ontology itself yet, which is why their work cannot be compared to existing ontologies.

Table 1. Comparison of related ontologies and our proposed ontology for scenario descriptions in the field of autonomous driving.

| Authors | Year | Temporal scenario description | Arbitrary environments | Arbitrary objects | Scenario simulation | Corner case categorization | Ontology available |
|---|---|---|---|---|---|---|---|
| Fuchs et al. [15] | 2008 | – | – | ✓ | – | – | – |
| Hummel [24] | 2010 | – | ✓ | – | – | – | – |
| Hülsen et al. [25] | 2011 | ✓ | ✓ | – | – | – | – |
| Armand et al. [1] | 2014 | – | – | – | – | – | – |
| Zhao et al. [46] | 2017 | – | ✓ | – | – | – | ✓ |
| Bagschik et al. [5] | 2018 | ✓ | – | – | – | – | – |
| Chen and Kloul [12] | 2018 | ✓ | – | – | – | – | – |
| Huang et al. [23] | 2019 | ✓ | – | – | – | – | – |
| Menzel et al. [29] | 2019 | ✓ | ✓ | – | ✓ | – | – |
| Li et al. [28] | 2020 | ✓ | ✓ | – | ✓ | – | – |
| Tahir and Alexander [38] | 2022 | ✓ | – | – | ✓ | – | – |
| Herrmann et al. [22] | 2022 | – | ✓ | ✓ | ✓ | – | – |
| ASAM [3] | 2022 | – | – | – | – | – | – |
| Proposed ontology | 2022 | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |

In addition to ontologies that are explicitly designed to describe scenarios, there are others that also focus on decision-making aspects. In this category, Hummel [24] developed an ontology capable of describing intersections to a degree where the ontology can also be used to infer knowledge about the scenes. While this is a general attribute of ontologies, she provides a set of rules for the analysis.


Hülsen et al. [25] also describe intersections based on an ontology, focusing on the road layout, while interactions between entities cannot be modeled in detail. In [1], this issue is addressed, as Armand et al. focus on such interactions. They also propose rules to infer knowledge from their ontology. These rules are partly attributed to the decision-making of an ego vehicle, e.g., whether it should stop or continue. Due to their strong focus on actions and interactions, they struggle to describe complex scenarios in a more general way. Zhao et al. [46] developed a set of three ontologies, namely Map, Car, and Control. Based on these, they are capable of describing complex scenes for vehicles only. While the scenes do contain temporal information, such as paths for vehicles, these are only broad descriptions and not detailed enough to model complex scenarios. Huang et al. [23] present a similar work that is able to describe a wide variety of scenarios based on classes for road networks for highway and urban scenarios, the ego vehicle and its behavior, static and dynamic objects, as well as scenario types. However, it is designed to derive driving decisions from the descriptions instead of simulating these scenarios. Chen and Kloul [12], on the other hand, propose an ontology that is primarily designed to describe highway scenarios, with a special focus on weather circumstances. To model corner case scenarios, the requirements for an ontology are very complex. In general, it needs to be able to describe all types of scenes and scenarios. For the temporal context, an ontology needs to be able to a) describe scenarios. Furthermore, it needs to be able to b) describe arbitrary environments and c) arbitrary objects. Following an open world assumption, we define "arbitrary", with respect to environments and objects, as the possibility to include such without changing any classes or properties of the ontology. This means, e.g., referencing external sources, such as OpenDRIVE files for environments or CAD files for objects. An ontology needs to be designed in a way that d) the described scenarios can also be simulated. Finally, e) information about the corner case levels needs to be included for details and knowledge extraction. In Table 1, we provide an overview of the previously introduced ontologies related to these attributes and also mention whether the ontology itself is published online. While some authors, such as [15,24], released their ontologies previously, the provided links do not contain them anymore, which is why we excluded outdated sources. A trend can be observed, where recent approaches focus more on the aspect of scenario simulation. However, to the best of our knowledge, there exists no ontology to date that is able to describe and simulate long-tail corner case events. Our proposed ontology fills this gap, being able to generate ontology scenarios for all corner case levels and execute them in simulation.

3 Method

In order to generate corner case scenarios, we have developed a Master Ontology which is the foundation for the creation of specific scenarios and provides the structure for all elements of a scenario. Based on this, all common corner case categories found in the literature can be addressed. For the creation of scenarios, we have developed an Ontology Generator module, which is our interface to human Scenario Designers, who do not need any expertise in the field of ontologies in order to design scenarios. For each designed scenario, a concrete Scenario Ontology is created. This is a major advantage over purely coded scenarios, as the complete scenario description is available in a human- and machine-readable form, which directly enables knowledge extraction, analysis, and further processing, such as exports into other formats or combinations of scenarios, for all created scenarios on any level of detail. Finally, our OpenSCENARIO Conversion module converts this ontology into an OpenSCENARIO file, which can directly be simulated in the CARLA simulator. An overview can be found in Fig. 1.

Fig. 1. Flow diagram of our proposed method for the description and generation of corner case scenarios. Based on a corner case taxonomy and the OpenSCENARIO language, a Master Ontology was developed, containing all necessary attributes to describe complex scenarios. In a 1 : n relation, ontologies describing individual scenarios can be derived. In an automated fashion, these scenarios are then converted into the OpenSCENARIO format, enabling the direct execution in simulation environments.
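To make this flow more tangible, the following sketch chains the modules from Fig. 1 together in Python. The module and function names are hypothetical placeholders rather than the released code, and the ScenarioRunner call assumes its --openscenario command-line option and an already running CARLA server.

```python
# Illustrative end-to-end sketch of the pipeline in Fig. 1 (hypothetical module
# and function names; not taken from the released code).
import subprocess

from ontology_generator import build_scenario_ontology        # hypothetical
from openscenario_conversion import ontology_to_openscenario  # hypothetical

# 1) Scenario Designer input -> Scenario Ontology derived from the Master Ontology
scenario_owl = build_scenario_ontology("fog_scenario")

# 2) Scenario Ontology -> OpenSCENARIO (.xosc) file
xosc_file = ontology_to_openscenario(scenario_owl)

# 3) Execute the scenario in CARLA via the ScenarioRunner
subprocess.run(
    ["python", "scenario_runner.py", "--openscenario", xosc_file],
    check=True,
)
```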

3.1 Master Ontology

At first, we describe the Master Ontology, which is the skeleton of every concrete scenario. With its help, different scenarios can be described by instantiating the different classes, using individuals, and setting property assertions between them. The Master Ontology is closely aligned with the OpenSCENARIO documentation [4], since the ontology is used for the automatic generation of scenarios. Within the ontology, it is also possible to describe the concrete category of a corner case, based on the categories introduced by Breitenstein et al. [8]. The master ontology, as shown in Fig. 2, consists of 100 classes, 53 object properties, 44 data properties, 67 individuals, and 683 axioms. The 100 classes are either classes for the description of the corner case category or derived from the OpenSCENARIO documentation [4], which means that the definitions of the different OpenSCENARIO elements can also be found there. They are used as parents for the different individuals we have created within the ontology. The 53 object properties and the 44 data properties are used to connect the different parts of a scenario, in order to embed individuals into concrete scenarios. For a better understanding and a more structured explanation, the proposed Master Ontology can be divided into seven main groups: Scenario and Environment, Entities, Main Scenario Elements, Actions, Conditions, Weather and Time, and Corner Case Level. We describe these in more detail in the following, with each group marked with an individual color in Fig. 2.


Fig. 2. Master ontology, best viewed at 1,600% and in color. The ontology is capable of describing scenarios based on the seven sections scenario and environment, entities, main scenario elements, actions, conditions, weather and time of day, and corner case level. These seven sections are further explained in Sect. 3.1. Adapted from [19]. (Color figure online)

Scenario and Environment (Red). In order to be able to describe a scenario, the Master Ontology provides the Scenario class, which acts as the root of the ontology. Together with the Scenario class, different object and data properties are provided. Those are used as connections between the different scenario elements, such as the Entities, Towns, or the Storyboard. Towns are CARLA-specific environments used in the ontology. CARLA allows users to create custom and thus arbitrary environments.

Entities (Green). This group holds the different entities Vehicle, Pedestrian, Bicycle, and Misc. For arbitrary Entities, the Misc class can be utilized. If specific movement patterns are wanted, the classes Vehicle, Pedestrian, and Bicycle are also already available. The individuals can then be connected to 3D assets from the CARLA blueprint library [9], which can be extended with external objects. This way, a Scenario Designer is able to add arbitrary assets to a scenario.

Main Scenario Elements (Yellow). The main scenario elements are used to build the core of any scenario. The highest level is the Storyboard, which includes an Init and a Story. A Story has at least one Act, which needs at least a StartTrigger and can optionally include a StopTrigger. Acts are also a container for different ManeuverGroups, which logically include Maneuvers. The Maneuvers then must include at least one Event, which is also activated by a StartTrigger. Finally, each Event needs to include at least one Action. These are the main components of the OpenSCENARIO scenario description language and thus necessary parts of each scenario. For each of them, a corresponding connecting property also exists, e.g., has_event, has_action, has_init_action.

Actions (Dark Blue). To be able to describe the maneuvers of the different Entities, different Actions are represented within the ontology. Those include, e.g., TeleportAction, which sets the position of an Entity, or RelativeLaneChangeAction, which describes a lane change of an Entity.


Conditions (Light Blue). As part of the StartTrigger and StopTrigger elements, Conditions are used to activate them. Conditions are divided into the two subclasses ByEntityCondition and ByValueCondition. In general, the difference between those two is that the ByEntityCondition is always in regard to an entity, e.g., how close a vehicle is to another vehicle, while the ByValueCondition is always in regard to a value, e.g., the passed simulation time. Depending on the type of the Condition, different values must be met in order for the StartTrigger or StopTrigger to be activated. As an example, the SimulationTimeCondition can be used as a trigger with respect to the simulation time, using arithmetic rules.

Weather and Time of Day (Orange). To set the weather, the underlying CARLA town can be modified individually. This includes the weather conditions, which are subdivided into fog, precipitation, and the position of the sun. Also, the time of day can be set.

Corner Case Level (Pink). In the long tail of rare scenarios, each can be related to a specific corner case level. The first to propose an extensive taxonomy on these were Breitenstein et al. [7], with a focus on camera-based sensor setups. This taxonomy was extended by Heidecker et al. [21] to generalize it to a set of three top-level layers, namely Sensor Layer, Content Layer, and Temporal Layer, as shown in Fig. 3. In this work, we focus on camera-related corner cases, which is why the master ontology uses a mixed model, where the top-level layers from Heidecker et al. and the underlying corner case levels from Breitenstein et al. are used, as shown in Fig. 2, making a future extension of the master ontology to further sensors effortless, as they fall into the same top-level layers. Occurrences on the hardware or physical level, such as dead pixels or overexposure, can be simulated with subsequent scripts during the simulation phase. Details on those corner cases can be placed in the individual scenario ontologies by creating specific individuals of the respective corner case classes of the Master Ontology.

Fig. 3. Corner case categorization from Heidecker et al. [21]. The columns show different layers and levels of corner cases, while the rows are related to specific sensors. For each combination, examples are provided.
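As a rough illustration of how such a class and property structure can be expressed with Owlready2, the following minimal sketch declares a few classes and properties along the OpenSCENARIO hierarchy described above. The names mirror the groups of this section but are illustrative assumptions, not the released master ontology.

```python
# Minimal Owlready2 sketch: a few classes and properties mirroring the main
# scenario elements and entities described above (illustrative only).
from owlready2 import Thing, ObjectProperty, DataProperty, get_ontology

onto = get_ontology("http://example.org/master_ontology_sketch.owl")

with onto:
    class Scenario(Thing): pass
    class Storyboard(Thing): pass
    class Init(Thing): pass
    class Story(Thing): pass
    class Act(Thing): pass
    class ManeuverGroup(Thing): pass
    class Maneuver(Thing): pass
    class Event(Thing): pass
    class Action(Thing): pass
    class Entity(Thing): pass
    class Vehicle(Entity): pass
    class Pedestrian(Entity): pass

    # Object properties connect the elements along the OpenSCENARIO hierarchy.
    class has_storyboard(ObjectProperty):
        domain = [Scenario]
        range = [Storyboard]
    class has_entity(ObjectProperty):
        domain = [Scenario]
        range = [Entity]
    class has_init(ObjectProperty):
        domain = [Storyboard]
        range = [Init]
    class has_init_action(ObjectProperty):
        domain = [Init]
        range = [Action]
    class has_story(ObjectProperty):
        domain = [Storyboard]
        range = [Story]
    class has_event(ObjectProperty):
        domain = [Maneuver]
        range = [Event]
    class has_action(ObjectProperty):
        domain = [Event]
        range = [Action]

    # Data properties hold literal values, e.g., a target speed for an action.
    class target_speed(DataProperty):
        range = [float]

onto.save(file="master_ontology_sketch.owl")
```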


In addition to these groups, the ontology contains 67 individuals, which are divided into Constants and Default Individuals. There are two types of constants: OpenSCENARIO Constants, such as arithmetic or priority rules, and CARLA Constants, such as assets. The default individuals are used to help a Scenario Designer create scenarios faster and more easily. These include common patterns, such as default weather conditions or a trigger that activates when the simulation starts running. In addition, a default ego vehicle is also included in the Master Ontology, which has a set of cameras and a BoundingBox attached to it. As the last part of the ontology, the 683 axioms represent the connections and rules between the entities and the properties within the ontology, along with the individuals.

3.2 Scenario Ontology Generation

The manual creation of ontologies is a very time-consuming, exhausting, and error-prone process, which additionally requires expertise in the general field of ontologies and related software. Thus, and to ensure that the OpenSCENARIO Conversion module functions properly, we have developed the Ontology Generator module, which takes as input a scripted version of a scenario and creates a Scenario Ontology as a result. The concept behind the Ontology Generator is to use the Master Ontology as a base for a scenario description and automatically create the necessary individuals and property assertions between them. To read and write ontologies, we utilize the Python library Owlready2 [33]. The Master Ontology is read by the Ontology Generator, which uses, depending on the scenario, all classes, properties, and default individuals needed. The result is a new Scenario Ontology, which has the same structure as the Master Ontology with respect to classes and properties, but includes newly created individuals for the scenario designed by the Scenario Designer. Since the Master Ontology is built based on the OpenSCENARIO documentation [2], which is a very powerful and flexible framework, it allows for many possible combinations. This gives a Scenario Designer large flexibility with respect to the design of new scenarios. The Ontology Generator is well documented. This way, no prior experience with the OpenSCENARIO format is necessary. Table 2 provides an overview of the functions available to the human Scenario Designer to create scenarios. With the help of the Ontology Generator, every part which was defined within the Master Ontology can be utilized. For example, the functions within the first group shown in Table 2 are used to create every main scenario element. Algorithm 1 shows what a partly abstracted implementation by a Scenario Designer looks like. The full example can be found in the GitHub repository. In Sect. 4, we demonstrate an exemplary Scenario Ontology which was generated by the Ontology Generator. In this demonstration, the scenario ontology from Algorithm 1 is related to the visualization in Fig. 5.
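The following snippet sketches the core mechanism described above, again under the illustrative naming from the previous sketch: the master ontology is loaded with Owlready2, new individuals are created, and property assertions embed them into a concrete scenario. It is a simplified stand-in, not the released Ontology Generator.

```python
# Sketch of the core mechanism: create individuals and property assertions
# with Owlready2 (illustrative names, simplified stand-in for the module).
from owlready2 import get_ontology

master = get_ontology("file://master_ontology_sketch.owl").load()

# New individuals for a concrete scenario (prefix "indiv_" as in Sect. 4.1).
scenario   = master.Scenario("indiv_fog_scenario")
storyboard = master.Storyboard("indiv_storyboard")
story      = master.Story("indiv_story")
ego        = master.Vehicle("indiv_ego_vehicle")

# Property assertions embed the individuals into the scenario structure.
scenario.has_storyboard.append(storyboard)
scenario.has_entity.append(ego)
storyboard.has_story.append(story)

# Export the resulting Scenario Ontology.
master.save(file="indiv_fog_scenario.owl")
```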


Table 2. Overview of methods of the Ontology Generator module, which are available to a human Scenario Designer. The methods are divided into seven groups, five of which were introduced in Sect. 3.1. The groups Assets and Other are related to 3D objects and miscellaneous functions, respectively.

Main scenario elements: newScenario, newStoryboard, newInit, newStory, newAct, newManeuverGroup, newManeuver, newEvent, newAction
Entities: newEgoVehicle, newCar, newPedestrian, newMisc
Actions: newEnvironmentAction, newSpeedAction, newTeleportActionWithPosition, newTeleportActionWithRPAO, newRelativeLaneChangeAction
Environment and weather: newEnvironment, newTimeOfDay, newWeather, newFog, newPrecipitation, newSun, changeWeather
Conditions: newSimulationCondition, newRelativeDistanceCondition, newStoryboardESCondition, newTraveledDistanceCondition
Assets: newAsset, getBicycleAssets, getCarAssets, getPedestrianAssets, getMiscAssets
Other: newStopTrigger, newStartTrigger, newRoadCondition, newTransitionDynamics, setCornerCase

Algorithm 1: Creation of a scenario ontology with the Ontology Generator, where the ego vehicle enters a foggy area (incl. abstract elements)

    import OntologyGenerator as OG
    import MasterOntology as MO
    ego_vehicle ← MO.ego_vehicle                                              // default ego vehicle individual
    weather_def ← MO.def_weather                                              // default weather individual
    Initialize teleport_action(ego_vehicle), speed_action(ego_vehicle)
    init_scenario ← OG.newInit(speed_action, teleport_action, weather_def)    // starting conditions for the storyboard
    Initialize traveled_distance_condition
    Trigger ← OG.newStartTrigger(traveled_distance_condition)                 // trigger condition: ego vehicle travelled a defined distance
    Initialize weather(sun, fog, precipitation)
    Initialize time_of_day, road_condition
    env ← OG.newEnv(time_of_day, weather, road_condition)
    env_action ← OG.newEnvAction(env)                                         // foggy environment after the trigger
    Initialize Event, Maneuver, ManeuverGroup, Act, Story, Storyboard          // necessary OpenSCENARIO elements
    Export ScenarioOntology

3.3 Scenario Simulation

After a scenario is described with the help of individuals within a scenario ontology, it is read by the OpenSCENARIO Conversion module, as shown in Fig. 1. From these concrete scenarios, the conversion module generates OpenSCENARIO files. These can be directly simulated without any further adjustments. Since the OpenSCENARIO files include simulator-specific details, we have focused on the CARLA simulation environment [10], using CARLA version 0.9.13. In an earlier stage, we were also able to show compatibility with the esmini [14] environment. When the Ontology Generator module is used to create the scenario ontologies, their structural integrity is ensured, which is a necessary requirement for the conversion module. This means that each scenario ontology is correctly provided to be processed by the conversion module. Theoretically, scenario ontologies could also be created manually to be processed by the conversion module. However, human errors are likely, preventing the correct processing by the conversion module. While each scenario ontology is able to cover multiple corner cases, the created ontologies are fully modular. This means that, given the same environment, our method is capable of combining multiple already existing scenario ontologies into a new single scenario ontology. In such cases, where the number of scenario individuals is n > 1, a pre-processing stage is triggered which extends the ontology to combine all n provided scenarios into a single new scenario S_fusion. For this purpose, this stage creates a new scenario, storyboard, and init. Subsequently, for every included scenario, the algorithm goes through its stories, entities, and init actions and merges them into S_fusion. For the final creation of the OpenSCENARIO file, the conversion module utilizes the property assertions between individuals to create the corresponding Python objects, which are then used by the PYOSCX library [35] to create the OpenSCENARIO file. These files can then be read by the ScenarioRunner [11] and executed in CARLA. In the following Sect. 4, we demonstrate a set of ten simulated scenarios.
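As a rough sketch of the described pre-processing stage, the following function merges the stories, entities, and init actions of n scenario individuals into a single new scenario S_fusion. The property names follow the illustrative naming used earlier and are assumptions, not the released implementation.

```python
# Hedged sketch of the pre-processing stage: merge n scenario individuals into
# a single new scenario S_fusion (illustrative property names from the earlier
# Owlready2 sketch, not the released code).
def fuse_scenarios(master, scenarios):
    fused      = master.Scenario("indiv_fused_scenario")
    storyboard = master.Storyboard("indiv_fused_storyboard")
    init       = master.Init("indiv_fused_init")
    fused.has_storyboard.append(storyboard)
    storyboard.has_init.append(init)

    for scenario in scenarios:
        for sb in scenario.has_storyboard:
            # merge all stories and init actions of the included scenario
            storyboard.has_story.extend(sb.has_story)
            for old_init in sb.has_init:
                init.has_init_action.extend(old_init.has_init_action)
        # merge the entities of the included scenario
        fused.has_entity.extend(scenario.has_entity)
    return fused
```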

4 Evaluation

For the evaluation, we have created a diverse scenario catalog containing scenarios from all corner case levels. Following Breitenstein et al. [7], these cover different levels of complexity and thus criticality, starting with simpler sensor layer cases and ending with highly complex temporal layer corner cases. For the qualitative evaluation, we show the feasibility of the approach outlined in Fig. 1 and demonstrate with a set of ten scenarios that descriptions made by a human Scenario Designer are translated into proper Scenario Ontologies and simulated correctly. For the selection of the exemplary corner case scenarios, we considered three types of sources. First, we used examples provided by the literature, such as the ones provided by Breitenstein et al. [8]. Second, various video sources, such as third-person and dash-cam videos of traffic situations [30], were used for inspiration. Third, multiple brainstorming sessions took place, where personal experiences were collected. Afterwards, we narrowed the selection down to a set of eight representative scenarios. Two more were created by combining two of those eight scenarios.


Table 3. Overview of scenario ontologies, which were derived from the master ontology and subsequently executed in simulation. These exemplary scenarios cover all corner case categories.

| # | Corner case level | Individuals | Scenario description |
|---|---|---|---|
| (a) | Sensor Layer - Hardware Level | – | Dead Pixel: Camera sensor affected |
| (b) | Content Layer - Domain Level | 94 | Domain Shift: Sudden weather change |
| (c) | Content Layer - Object Level | 93 | Single-Point Anomaly: Unknown object on the road |
| (d) | Content Layer - Scene Level | 164 | Collective Anomaly: Multiple known objects on the road |
| (e) | Content Layer - Scene Level | 111 | Contextual Anomaly: Known non-road object on the road |
| (f) | Temporal Layer - Scenario Level | 94 | Novel Scenario: Unexpected event in another lane |
| (g) | Temporal Layer - Scenario Level | 104 | Risky Scenario: A risky maneuver |
| (h) | Temporal Layer - Scenario Level | 95 | Anomalous Scenario: Unexpected traffic participant behaviour |
| (i) | Combined: (d) and (f) | 156 | Combined: Collective and Novel Scenario |
| (j) | Combined: (f) and (h) | 122 | Combined: Novel and Anomalous Scenario |

An overview of these scenarios can be found in Table 3, and visualizations of all scenarios are shown in Fig. 4. In the a) Dead Pixel scenario, an arbitrary scenario can be chosen, where only the sensor itself will be affected. The b) Domain Shift is a sudden weather change, for which we created a scenario where the ego vehicle suddenly drives into a dense fog. For the c) Single-Point Anomaly, we chose to simulate a falling vending machine on the road. This scenario also inspired the e) Contextual Anomaly, which also features falling objects on the road, but in this case the objects are traffic signs. This can occur, for example, in a very windy environment. For the d) Collective Anomaly, we chose to simulate a large number of running pedestrians in front of the ego vehicle, which can happen, for example, during a sports event. For the f) Novel Scenario, we described a scenario where a cyclist performs unexpected maneuvers in the opposite lane. The g) Risky Scenario which we chose is a close cut-in maneuver in front of the ego vehicle. The last corner case category is the h) Anomalous Scenario, for which we have chosen a pedestrian who suddenly runs in front of the ego vehicle. To demonstrate the scalability of our approach, we have also combined scenarios. In the i) combination of the Novel Scenario and the Collective Anomaly, a large number of running pedestrians are in front of the ego vehicle, while a cyclist performs unexpected maneuvers in the opposite lane, next to the pedestrians. In addition, we also merged the Novel Scenario with the Anomalous Scenario, resulting in j) a pedestrian walking in front of the ego vehicle and the cyclist.

4.1 Scenario Ontologies

At the core of each demonstrated scenario lies a Scenario Ontology. In the following, we present the construction of an exemplary scenario and describe how it is represented in the Scenario Ontology with individuals. An example graph of the exemplary ontology can be seen in Fig. 5. The ontology has 94 individuals, which means that 27 new individuals were created, since the Master Ontology has 67 default individuals.


Fig. 4. Visualization of the realized corner case scenarios as listed in Table 3.


In Fig. 5, every used class, property, and individual is presented. Each individual whose name starts with "indiv_" is a newly created part of the Scenario Ontology; every other individual is either a default individual or a constant. The graph starts from the top with the Scenario individual, which is connected to a CARLA town and a newly created Storyboard. As mentioned earlier, every Storyboard has an Init and a Story. In this particular Init, there are only the Actions, which are responsible for the position and the speed of the EgoVehicle, and connections to the default EnvironmentAction. The most interesting part of this scenario, however, can be found deep within the Story: the second EnvironmentAction, which creates a dense fog inside the scenario. This Action is triggered by the indiv_DistanceStartTrigger, which has a TraveledDistanceCondition as its Condition. Since this type of Condition is an EntityCondition, it requires a connection to an Entity, in our case the ego vehicle. This StartTrigger is activated when the ego vehicle has travelled a certain distance. After this Event is executed, the scenario comes to its end. Descriptions and visualizations of the remaining nine Scenario Ontologies can be found in [19], and the code to recreate all ten scenarios is available in the GitHub repository.

Fig. 5. Scenario ontology describing a vehicle entering a foggy area. Best viewed at 1,600 %. The ontology describes the scenario based on the seven sections scenario and environment, entities, main scenario elements, actions, conditions, weather and time of day, and corner case level. Reprinted from [19].

5 Conclusion

Our work focuses on the generation of rare corner case scenarios in order to provide additional input training data as well as test and evaluation scenarios. In this way, it contributes to potentially safer deep neural networks, which may become more robust to anomalies. The proposed Master Ontology is the core of our approach and enables the creation of specific scenarios for all of the corner case levels developed by Breitenstein et al. [8]. From the one Master Ontology, concrete scenario ontologies can be derived. The whole process after the design of such an ontology is fully automated, which includes the generation of the ontologies themselves, based on the human input for the Ontology Generator, as well as the conversion into the OpenSCENARIO format. This allows for a direct simulation of the scenarios in the CARLA simulation environment without any further adjustments. Since ontologies are a highly complex domain, we have put an emphasis on the design of the Ontology Generator module, which does not require expertise in the field of ontologies for a human to create scenarios. We have demonstrated our approach with a set of ten concrete scenarios, which cover all corner case levels. As an outlook, we would first like to discuss existing limitations of our work. At the moment, the master ontology is focused on camera-related corner cases, which is why the current implementation only includes hardware-related corner cases for camera sensors. Also, the subsequent scripts for camera-related corner cases are currently not triggered automatically, but manually. For the Master Ontology, we have implemented a large part of the OpenSCENARIO standard in order to demonstrate our designed scenarios. However, for future scenarios, modifications of the master ontology might be necessary. We have provided instructions for such extensions in our GitHub repository. While the master ontology supports arbitrary objects, in our demonstrated scenarios we were only able to utilize assets which were already included within CARLA, as we ran into compilation issues with respect to the Unreal Engine and the utilized CARLA version. Our learnings as well as ideas to address this issue can also be found in the GitHub repository. Finally, we would like to point out future directions. The extraction of corner case levels from the ontology itself could be automated, e.g., with knowledge extraction methods based on Semantic Web Rule Language (SWRL) rules. However, this is a challenging field in itself. Based on the generated scenarios, which were created by human Scenario Designers, automated variations can be introduced to drastically increase the number of available scenarios, as shown by [22,28,29]. In this way, a powerful combination of knowledge- and data-driven scenario generation can be achieved for the long tail of rare corner cases.

Acknowledgment. This work results partly from the project KI Data Tooling (19A20001J) funded by the Federal Ministry for Economic Affairs and Climate Action (BMWK).

References 1. Armand, A., Filliat, D., Iba˜ nez-Guzman, J.: Ontology-based context awareness for driving assistance systems. In: IEEE Intelligent Vehicles Symposium Proceedings (2014) 2. ASAM: ASAM OpenSCENARIO. https://www.asam.net/standards/detail/opens cenario. Accessed 28 Feb 2022 3. ASAM: ASAM OpenXOntology. https://www.asam.net/project-detail/asam-open xontology/. Accessed 28 Feb 2022


4. ASAM: OpenSCENARIO Documentation. https://releases.asam.net/OpenSCEN ARIO/1.0.0/ASAM OpenSCENARIO BS-1-2 User-Guide V1-0-0.html. Accessed 28 Jan 2022 5. Bagschik, G., Menzel, T., Maurer, M.: Ontology based scene creation for the development of automated vehicles. In: IEEE Intelligent Vehicles Symposium (IV) (2018) 6. Bogdoll, D., et al.: Description of corner cases in automated driving: goals and challenges. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops (2021) 7. Breitenstein, J., Term¨ ohlen, J.A., Lipinski, D., Fingscheidt, T.: Systematization of corner cases for visual perception in automated driving. In: IEEE Intelligent Vehicles Symposium (IV) (2020) 8. Breitenstein, J., Term¨ ohlen, J.A., Lipinski, D., Fingscheidt, T.: Corner cases for visual perception in automated driving: some guidance on detection approaches. arXiv:2102.05897 (2021) 9. CARLA: CARLA Blueprint Library. https://carla.readthedocs.io/en/latest/bp library/. Accessed 28 Feb 2022 10. CARLA: CARLA Simulator. https://carla.org/. Accessed 28 Feb 2022 11. CARLA: Scenario Runner Github. https://github.com/carla-simulator/scenario runner. Accessed 28 Feb 2022 12. Chen, W., Kloul, L.: An ontology-based approach to generate the advanced driver assistance use cases of highway traffic. In: Proceedings of the 10th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (KEOD) (2018) 13. Ding, W., Chen, B., Li, B., Eun, K.J., Zhao, D.: Multimodal safety-critical scenarios generation for decision-making algorithms evaluation. IEEE Robot. Autom. Lett. 6, 1551–1558 (2021) 14. esmini: Esmini github. https://github.com/esmini/esmini. Accessed 28 Feb 2022 15. Fuchs, S., Rass, S., Lamprecht, B., Kyamakya, K.: A model for ontology-based scene description for context-aware driver assistance systems. In: International ICST Conference on Ambient Media and Systems (2010) 16. de Gelder, E., et al.: Towards an ontology for scenario definition for the assessment of automated vehicles: an object-oriented framework. IEEE Trans. Intell. Veh. 7(2), 300–314 (2022) 17. (Germany), B.: Verordnung zur Regelung des Betriebs von Kraftfahrzeugen ¨ mit automatisierter und autonomer Fahrfunktion und zur Anderung straßenverkehrsrechtlicher Vorschriften. https://dserver.bundestag.de/brd/2022/0086-22. pdf. Accessed 15 June 2022 ¨ 18. (Germany), B.: Entwurf eines Gesetzes zur Anderung des Straßenverkehrsgesetzes und des Pflichtversicherungsgesetzes - Gesetz zum autonomen Fahren (2021). https://www.bmvi.de/SharedDocs/DE/Anlage/Gesetze/Gesetze-19/gesetz-aende rung-strassenverkehrsgesetz-pflichtversicherungsgesetz-autonomes-fahren.pdf? bl ob=publicationFile. Accessed 15 June 2022 19. Guneshka, S.: Ontology-based corner case scenario simulation for autonomous driving. Bachelor thesis, Karlsruhe Institute of Technology (KIT) (2022) 20. Hanselmann, N., Renz, K., Chitta, K., Bhattacharyya, A., Geiger, A.: King: generating safety-critical driving scenarios for robust imitation via kinematics gradients. arXiv:2204.13683 (2022) 21. Heidecker, F., et al.: An application-driven conceptualization of corner cases for perception in highly automated driving. In: IEEE Intelligent Vehicles Symposium (IV) (2021)


22. Herrmann, M., et al.: Using ontologies for dataset engineering in automotive AI applications. In: Design, Automation and Test in Europe Conference and Exhibition (DATE) (2022) 23. Huang, L., Liang, H., Yu, B., Li, B., Zhu, H.: Ontology-based driving scene modeling, situation assessment and decision making for autonomous vehicles. In: AsiaPacific Conference on Intelligent Robot Systems (ACIRS) (2019) 24. Hummel, B.: Description logic for scene understanding at the example of urban road intersections. Ph.D. thesis, Karlsruhe Institute of Technology (KIT) (2009) 25. H¨ ulsen, M., Z¨ ollner, J.M., Weiss, C.: Traffic intersection situation description ontology for advanced driver assistance. In: IEEE Intelligent Vehicles Symposium (IV) (2011) 26. Karpathy, A.: Tesla Autonomoy Day (2019). https://youtu.be/Ucp0TTmvqOE? t=8671. Accessed 15 June 2022 27. Klueck, F., Li, Y., Nica, M., Tao, J., Wotawa, F.: Using ontologies for test suites generation for automated and autonomous driving functions. In: IEEE International Symposium on Software Reliability Engineering Workshops (ISSREW) (2018) 28. Li, Y., Tao, J., Wotawa, F.: Ontology-based test generation for automated and autonomous driving functions. Inf. Softw. Technol. 117, 106200 (2020) 29. Menzel, T., Bagschik, G., Isensee, L., Schomburg, A., Maurer, M.: From functional to logical scenarios: Detailing a keyword-based scenario description for execution in a simulation environment. In: IEEE Intelligent Vehicles Symposium (IV) (2019) 30. Minute, M.: Trees falling on road (2017). https://www.youtube.com/watch? v=3VsLeUtXvxk&ab channel=Mad1Minute. Accessed 21 July 2022 31. Noy, N., Mcguinness, D.: Ontology development 101: a guide to creating your first ontology. In: Knowledge Systems Laboratory, vol. 32 (2001) 32. On-Road Automated Driving Committee: Taxonomy and Definitions for Terms Related to Driving Automation Systems for On-Road Motor Vehicles. Standard J3016–202104, SAE International (2021) 33. OWLReady2: Welcome to Owlready2’s documentation! (2021). https://owlready2. readthedocs.io/en/v0.36/. Accessed 28 Feb 2022 34. Pretschner, A., Hauer, F., Schmidt, T.: Tests f¨ ur automatisierte und autonome Fahrsysteme. Informatik Spektrum 44, 214–218 (2021) 35. pyoscx: scenariogeneration (2022). https://github.com/pyoscx/scenariogeneration. Accessed 20 July 2022 36. Rempe, D., Philion, J., Guibas, L.J., Fidler, S., Litany, O.: Generating useful accident-prone driving scenarios via a learned traffic prior. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2022) 37. Schoener, H.P., Mazzega, J.: Introduction to Pegasus. In: China Autonomous Driving Testing Technology Innovation Conference (2018) 38. Tahir, Z., Alexander, R.: Intersection focused situation coverage-based verification and validation framework for autonomous vehicles implemented in Carla. In: Mazal, J., et al. (eds.) Modelling and Simulation for Autonomous Systems (2022) 39. Tao, J., Li, Y., Wotawa, F., Felbinger, H., Nica, M.: On the industrial application of combinatorial testing for autonomous driving functions. In: IEEE International Conference on Software Testing, Verification and Validation Workshops (ICSTW) (2019) 40. Tuncali, C.E., Fainekos, G., Prokhorov, D., Ito, H., Kapinski, J.: Requirementsdriven test generation for autonomous vehicles with machine learning components. IEEE Trans. Intell. Veh. 5(2), 265–280 (2020)


41. Ulbrich, S., Menzel, T., Reschka, A., Schuldt, F., Maurer, M.: Defining and substantiating the terms scene, situation, and scenario for automated driving. In: Proceedings of ITSC (2015) 42. Wang, J., et al.: AdvSim: generating safety-critical scenarios for self-driving vehicles. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2021) 43. Waymo: Waymo One (2022). https://waymo.com/waymo-one/. Accessed 15 June 2022 44. Wotawa, F., Li, Y.: From ontologies to input models for combinatorial testing. In: International Conference on Testing Software and Systems (ICTSS) (2018) 45. Zaid, T.: Intersection focused situation coverage-based verification and validation framework for autonomous vehicles implemented in CARLA (2022). https://github.com/zaidtahirbutt/Situation-Coverage-based-AV-Testing-Framew ork-in-CARLA. Accessed 20 July 2022 46. Zhao, L., Ichise, R., Liu, Z., Mita, S., Sasaki, Y.: Ontology-based driving decision making: a feasibility study at uncontrolled intersections. IEICE Trans. Inf. Syst. 100(D(7)), 1425–1439 (2017)

Parametric and Multivariate Uncertainty Calibration for Regression and Object Detection

Fabian Küppers1,2(B), Jonas Schneider1, and Anselm Haselhoff2

1 e:fs TechHub GmbH, Gaimersheim, Germany
{fabian.kueppers,jonas.schneider}@efs-auto.com
2 Ruhr West University of Applied Sciences, Bottrop, Germany
[email protected]

Abstract. Reliable spatial uncertainty evaluation of object detection models is of special interest and has been the subject of recent work. In this work, we review the existing definitions for uncertainty calibration of probabilistic regression tasks. We inspect the calibration properties of common detection networks and extend state-of-the-art recalibration methods. Our methods use a Gaussian process (GP) recalibration scheme that yields parametric distributions as output (e.g., Gaussian or Cauchy). The usage of GP recalibration allows for a local (conditional) uncertainty calibration by capturing dependencies between neighboring samples. The use of parametric distributions such as Gaussian allows for a simplified adaption of calibration in subsequent processes, e.g., for Kalman filtering in the scope of object tracking. In addition, we use the GP recalibration scheme to perform covariance estimation, which allows for a post-hoc introduction of local correlations between the output quantities, e.g., position, width, or height in object detection. To measure the joint calibration of multivariate and possibly correlated data, we introduce the quantile calibration error, which is based on the Mahalanobis distance between the predicted distribution and the ground truth to determine whether the ground truth is within a predicted quantile. Our experiments show that common detection models overestimate the spatial uncertainty in comparison to the observed error. We show that the simple Isotonic Regression recalibration method is sufficient to achieve a good uncertainty quantification in terms of calibrated quantiles. In contrast, if normal distributions are required for subsequent processes, our GP-Normal recalibration method yields the best results. Finally, we show that our covariance estimation method is able to achieve the best calibration results for joint multivariate calibration. All code is open source and available at https://github.com/EFS-OpenSource/calibration-framework.



Fig. 1. Qualitative example of spatial uncertainty regression for object detection tasks. (a) A probabilistic RetinaNet [22] outputs (independent) normal distributions for each dimension. (b) On the one hand, we can recalibrate these distributions using GP-Beta [31] which yields multiple (independent) distributions of arbitrary shape. (c) On the other hand, we can also use our multivariate (mv.) GP-Normal method which recalibrates the uncertainty of Gaussian distributions and is also able to capture correlations between different dimensions.

1 Introduction

Obtaining reliable uncertainty information is a major concern especially for safety-critical applications such as autonomous driving [7,8,35]. For camerabased environment perception, it is nowadays possible to utilize object detection algorithms that are based on neural networks [1]. Such detection methods can be used in the context of object tracking (tracking-by-detection) to track individual objects within an image sequence [2,34]. Besides the position of individual objects, common tracking algorithms such as Kalman filtering require additional uncertainty information. However, these detection models are known to produce unreliable confidence information [10,19,27]. In this context, uncertainty calibration is the task of matching the predicted model uncertainty with the observed error. For example, in the scope of classification, calibration requires that the predicted confidence of a forecaster should match the observed accuracy. In contrast, the task for probabilistic regression models is to target the true ground-truth score for an input using a probability distribution as the model output [13,15]. An object detection model can also be trained to output probabilistic estimates for the position information using Gaussian distributions [7,11–13]. We refer to this as the probabilistic regression branch of a detection model. While there is a large consent about the metrics and definitions in the scope of classification calibration [10,17,18,24,25], research for regression calibration still differs in definitions as well as in the used evaluation metrics [16,20,21,31]. Recent work has proposed several methods to apply post-hoc calibration for regression tasks [16,20,21,31]. A flexible and input-dependent recalibration method is the GP-Beta provided by [31] that utilizes a Gaussian Process (GP) for recalibration parameter estimation. However, this method provides non-parametric distributions as calibration output which might be disadvantageous for subsequent applications such as Kalman filtering since these


applications commonly require parametric distributions, e.g., Gaussians [2]. In contrast, the authors in [20,21] propose a temperature scaling [10] for the variance of a Gaussian distribution which, however, is not sensitive to a specific input. Furthermore, most of the current research only focuses on the calibration of 1D regression problems. However, especially for object detection, it might be necessary to jointly inspect the calibration properties in all dimensions of the probabilistic regression branch. For example, a large object might require a different uncertainty calibration compared to smaller objects. A representative uncertainty quantification is of special interest especially for safety-critical applications such as autonomous driving or tracking tasks, e.g., Kalman filtering. However, a joint recalibration of multiple dimensions is not possible using the current existing regression calibration methods. Therefore, we seek for a multivariate regression recalibration method that offers the flexibility in the parameter estimation of methods such as GP-Beta [31] but also provides parametric probability distributions. Contributions. In this work, we focus on the safety-relevant task of object detection and provide a brief overview over the most important definitions for regression calibration. We adapt the idea of Gaussian process (GP) recalibration [31] by using parametric probability distributions (Gaussian or Cauchy) as calibration output. Furthermore, we investigate a method for a joint regression calibration of multiple dimensions. The effect of joint uncertainty calibration is qualitatively demonstrated in Fig. 1. On the one hand, this method is able to capture possible (conditional) correlations within the data to subsequently learn covariances. On the other hand, our method is also able to recalibrate multivariate Gaussian probability distributions with full covariance matrices. The task of joint multivariate recalibration for regression problems hasn’t been addressed so far to best of our knowledge.

2 Definitions for Regression Calibration and Related Work

In this section, we give an overview of the related work regarding uncertainty calibration for regression. Furthermore, we summarize the most important definitions for regression calibration. In this way, we can examine calibration in a unified context that allows us to better compare the calibration definitions as well as the associated calibration methods. In the scope of uncertainty calibration, we work with models that output an uncertainty quantification for each forecast. Let X ∈ X denote the input for a regression model h(X) that predicts the aleatoric (data) uncertainty as a probability density f_{Y|X} ∈ S, where S denotes the set of all possible probability distributions, so that S_Y := f_{Y|X}(y|x) = h(x), S_Y ∈ S. The predicted probability distribution targets the ground truth Y ∈ Y = R in the output space. Let F_{Y|X} : Y → (0, 1) further denote the respective cumulative distribution function (CDF), and F_{Y|X}^{-1} : (0, 1) → Y the (inverse) quantile function.
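For a Gaussian forecast, these quantities can be evaluated directly; the following small example with SciPy (and assumed values for the predicted mean and standard deviation) illustrates the notation.

```python
# Notation example for a Gaussian forecast S_Y = N(mu, sigma^2) of h(x);
# mu and sigma are assumed values for a single input x.
from scipy.stats import norm

mu, sigma = 2.0, 0.5

cdf_value = norm.cdf(2.5, loc=mu, scale=sigma)    # F_{Y|X}(y): Y -> (0, 1)
quantile  = norm.ppf(0.95, loc=mu, scale=sigma)   # F_{Y|X}^{-1}(tau): (0, 1) -> Y
```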


Quantile Calibration. We start with the first definition for regression calibration in terms of quantiles provided by [16]. The authors argue that a probabilistic forecaster is well calibrated if an estimated prediction interval for a certain quantile level τ ∈ (0, 1) covers the ground-truth Y in 100τ% of the cases, i.e., if the predicted and the empirical CDF match given sufficient data. More formally, a probabilistic forecaster is quantile calibrated if

P(Y ≤ F_{Y|X}^{-1}(τ)) = τ,  ∀τ ∈ (0, 1),    (1)

is fulfilled. This also holds for two-sided quantiles [16]. Besides their definition of quantile calibration, the authors in [16] propose to use the isotonic regression calibration method [37] from classification calibration to recalibrate the cumulative distribution function predicted by a probabilistic forecaster. We adapt this method and use it as a reference in our experiments.

Distribution Calibration. In classification calibration, the miscalibration of a classifier is conditioned on the predicted confidence [10], i.e., we inspect the miscalibration of samples with equal confidence. In contrast, the definition of quantile calibration in (1) only considers the marginal probability over all distributional moments [20,21,31]. Therefore, the authors in [31] extend the definition for regression calibration to distribution calibration. This definition requires that an estimated probability distribution should match the observed error distribution given sufficient data. Therefore, a model is distribution calibrated if

f(Y = y | S_Y = s) = s(y)    (2)

holds for all continuous distributions s ∈ S that target Y and all y ∈ Y [31]. Following Theorem 1 in [31], a distribution calibrated probabilistic forecaster is also quantile calibrated (but not vice versa). In addition, the authors in [31] propose the GP-Beta recalibration method, an approach that uses beta calibration [17] in conjunction with GP parameter estimation to recalibrate the cumulative distribution and to achieve distribution calibration. The Isotonic Regression as well as the GP-Beta approaches exploit the fact that the CDF is defined in the [0, 1] interval, which allows for an application of recalibration methods from the scope of classification calibration. While the former method [16] seeks the optimal recalibration parameters globally, the latter approach [31] utilizes GP parameter estimation to find the optimal recalibration parameters for a single distribution using all samples whose distribution moments are close to the target sample. This allows for a local uncertainty recalibration.

Variance Calibration. Besides the previous definitions, the authors in [21] and [20] consider normal distributions over the output space with mean μ_Y(X) and variance σ_Y²(X) that are implicit functions of the base network with X as input. The authors define the term of variance calibration. This definition requires that the predicted variance of a forecaster should match the observed variance for a certain variance level. For example, given a certain set of predictions by an unbiased probabilistic forecaster with equal variance, the mean squared error (which is equivalent to the observed variance in this case) should match the predicted variance. This must hold for all possible variances. Thus, variance calibration is defined as

E_{X,Y}[(Y − μ_Y(X))² | σ_Y²(X) = σ²] = σ²    (3)

for all σ² ∈ R_{>0} [20,21]. Note that any variance calibrated forecaster is quantile calibrated if (a) the forecaster is unbiased, i.e., E_{Y|X}[Y] = μ = μ_Y(X) for all X, and (b) the ground-truth data is normally distributed around the predicted mean. In this case, the predicted normal distribution N(μ_Y(X), σ_Y²(X)) matches the observed data distribution N(μ, σ²) for any σ² ∈ R_{>0}, and thus F_{Y|X}^{-1}(τ) = F_{obs}^{-1}(τ) for any τ ∈ (0, 1). If the observed variance does not depend on the mean, i.e., Cov(μ, σ²) = 0, and conditions (a) and (b) are fulfilled, then a variance calibrated forecaster is also distribution calibrated, as the condition in (2) is met because S is restricted to normal distributions and μ is no influential factor. For uncertainty calibration, the authors in [21] and [20] use temperature scaling [10] to rescale the variance of a normal distribution. This method optimizes the negative log likelihood (NLL) to find the optimal rescaling parameter.

In conclusion, although the definition of quantile calibration [16] is straightforward and intuitive, only the marginal probabilities are considered, which are independent of the predicted distribution moments. In contrast, the definition for distribution calibration [31] is more restrictive, as it requires a forecaster to match the observed distribution conditioned on the forecaster's output. If a forecaster is distribution calibrated and unbiased, and if the ground-truth data follows a normal distribution, then the definition for variance calibration [20,21] is met, which is the most restrictive definition for regression calibration but also very useful when working with normal distributions.

Further Research. A different branch of yielding prediction uncertainties is quantile regression, where a forecaster outputs estimates for certain quantile levels [4,6]. The advantage of quantile regression is that no parametric assumption about the output distribution is necessary. However, these methods haven't been used for object detection so far, thus we focus on detection models that output a parametric estimate for the object position [7,11–13]. As opposed to post-hoc calibration, there are also methods that aim to achieve regression calibration during model training, e.g., calibration loss [8], maximum mean discrepancy (MMD) [5], or f-Cal [3]. The former approach adds a loss term to match the predicted variances with the observed squared error, while the latter methods introduce a second loss term to perform distribution matching during model training. The advantage of these approaches is that an additional calibration data set is not needed. However, it is necessary to retrain the whole network architecture. Thus, we focus on post-hoc calibration methods in this work.
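To illustrate the quantile recalibration baseline discussed in this section, the following sketch fits an isotonic mapping from predicted to empirical CDF values in the spirit of [16], assuming Gaussian forecasts and using scikit-learn; it is a simplified illustration, not the implementation evaluated in this work.

```python
# Sketch of CDF recalibration with isotonic regression in the spirit of [16];
# Gaussian forecasts (mu, sigma) and ground-truth targets y are assumed.
import numpy as np
from scipy.stats import norm
from sklearn.isotonic import IsotonicRegression

def fit_quantile_recalibrator(mu, sigma, y):
    t = norm.cdf(y, loc=mu, scale=sigma)             # predicted CDF at the ground truth
    t_sorted = np.sort(t)
    empirical = np.arange(1, len(t) + 1) / len(t)    # empirical CDF of these values
    iso = IsotonicRegression(y_min=0.0, y_max=1.0, out_of_bounds="clip")
    iso.fit(t_sorted, empirical)
    return iso                                       # maps predicted CDF -> calibrated CDF

# Usage: calibrated quantile level for a new forecast and a candidate value y_new:
# tau_cal = iso.predict([norm.cdf(y_new, loc=mu_new, scale=sigma_new)])[0]
```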

3 Joint Parametric Regression Calibration

Several tasks such as object detection are inference problems with multiple dimensions, i.e., the joint estimation of position, width, and height of objects is necessary. Therefore, the output space extends to Y = R^K, where K denotes the number of dimensions. In the scope of object detection, most probabilistic models predict independent distributions for each quantity [7,11,13]. We start with the GP-Beta framework by [31]. Given a training set D, the goal is to build a calibration mapping from uncalibrated distributions to calibrated ones. Since the CDF of a distribution is bound to the [0, 1] interval, the authors in [31] derive a beta link function from the beta calibration method [17] to rescale the CDF of a distribution with a total of three rescaling parameters. These rescaling parameters w ∈ W = R^3 are implicitly obtained by a GP model gp(0, k, B) with zero mean, where k is the covariance kernel function of the GP and B is the coregionalization matrix that captures correlations between the parameters [31]. The authors argue that the usage of a GP model allows for finding the optimal recalibration parameters w.r.t. distributions that are close to the actual prediction. During model training, the parameters w for the beta link function are trained using the training data distribution given by

f(Y|D) = ∫_W f(Y|w, D) f(w|D) dw,    (4)

with the likelihood f(Y|w, D) obtained by the beta link function and f(w|D) as the posterior obtained by the GP model. However, the integral in (4) is analytically intractable due to the non-linearity within the link function in the likelihood [31]. Thus, the authors in [31] use a Monte-Carlo sampling approach for model training and inference. Furthermore, the authors use an approximate GP because the GP covariance matrix has complexity O(N²). Thus, during optimization, N* pseudo inducing points are learnt to represent the training set [14,30]. This guarantees a stable and fast computation of new calibrated distributions during inference. The inducing points, the coregionalization matrix, and the kernel parameters are optimized using the evidence lower bound (ELBO). We refer the reader to [31] for a detailed derivation of the beta link likelihood.

Parametric Recalibration. A major advantage of this approach is that it captures dependencies between neighboring input samples, i.e., it reflects a conditional recalibration of the estimated probability distributions. However, the resulting distribution has no parametric form, as it is represented by a CDF of arbitrary shape. For this reason, we propose a novel recalibration method, GP-Normal, which can be seen as a temperature scaling that rescales the variance σ² by a scaling factor w [20,21]. In contrast to the baseline approaches in [20,21], we use the GP scheme by [31] to estimate a weight w ∈ R_{>0} for each distribution individually, so that

log(w) ∼ gp(0, k)    (5)


Fig. 2. Regression example using a cosine function with noise to generate ground-truth samples (green). The noise increases towards the function's maximum. (a) shows an unbiased estimator with equally sampled variance (blue). (b) Isotonic Regression accounts for marginal uncertainty across all samples, while (c) Variance Scaling only has a dependency on the predicted variance. In contrast, (d) GP-Beta and (e) GP-Normal capture information from the whole predicted distribution and thus are able to properly recalibrate the uncertainty. (Color figure online)

can be used to rescale the variance σ̂² = w · σ² with the same training likelihood given in (4). Note that a coregionalization matrix B is not required as only a single parameter is estimated. We use the exponential function to guarantee positive estimates for w. The advantage of this approach compared to the standard GP-Beta is that the output distribution has a closed-form representation. Furthermore, the uncertainty recalibration is sensitive to neighboring samples, i.e., it allows for a conditional uncertainty recalibration of the input distributions, compared to the standard Isotonic Regression [16] or Variance Scaling [20,21]. A comparison of common calibration methods and our GP-Normal method is given in Fig. 2 for an artificial example. In general, we can use any parametric continuous probability distribution to derive a calibrated distribution. For example, if we inspect the distribution of the error obtained by a Faster R-CNN [26] on the MS COCO dataset [23] (shown in Fig. 3), we can observe that the error rather follows a Cauchy distribution Cauchy(x₀, γ) with location parameter x₀ ∈ R and scale γ ∈ R_{>0}. Therefore, we further propose the GP-Cauchy method that is similar to GP-Normal but utilizes a Cauchy distribution as the likelihood within the GP framework. This


Fig. 3. Prediction error of the mean estimate μ_Y(X) of a Faster R-CNN on MS COCO. We can only get a poor quality of fit using a Gaussian distribution in this case. In contrast, the Cauchy distribution achieves a good approximation of the error distribution as this distribution allows for heavier tails.

allows for deriving a scale weight w (cf. (5)) so that γ̂ = w · σ leads to a calibrated probability distribution Cauchy(x₀, γ̂) using μ_Y(X) = x₀ as an approximation.

Joint Recalibration. In many applications, such as object detection, a probabilistic regression model needs to jointly infer the output distribution for multiple dimensions. Usually, the output of such a model is parameterized using independent Gaussian distributions for each dimension [11,13,15]. We can adapt the GP-Normal, GP-Cauchy, as well as the GP-Beta methods [31] to jointly recalibrate these independent distributions. Since we assert normal distributions as calibration input, we use the same kernel function as [31], developed by [32], that is defined by

k((μ_i, Σ_i), (μ_j, Σ_j)) = θ^K |Σ_ij|^{-1/2} exp(−(1/2) (μ_i − μ_j)^T Σ_ij^{-1} (μ_i − μ_j)),   (6)

where θ ∈ R is the kernel lengthscale parameter and Σ_ij = Σ_i + Σ_j + θ²I defines the covariance, with I as the identity matrix. In the GP-Beta framework, the coregionalization matrix B is used to introduce correlations in the output data. We extend this framework to jointly recalibrate probability distributions within a multidimensional output space Y ∈ Y = R^K. We extend the coregionalization matrix B by additional entries for each dimension. Therefore, the parameter estimation in (5) for the GP-Normal and the GP-Cauchy methods extends to

log(w) ∼ gp(0, k, B),   (7)

where B ∈ R^{K×K} and w ∈ R^K_{>0}. The likelihood of the GP calibration methods can be interpreted as the product over K independent dimensions.

Covariance Estimation. Since the GP-Normal method yields a parametric normal distribution, it is also possible to capture correlations between all output quantities in Y by introducing a covariance estimation scheme. First, we capture the marginal correlation coefficients ρ_ij ∈ [−1, 1] between dimensions i


and j. This allows for the computation of a covariance matrix Σ for all samples in the training set D. Second, we use the LDL factorization of Σ and perform a rescaling of the lower triangular L and the diagonal matrix D by the weights w_L ∈ R^{J−K} and w_D ∈ R^K_{>0}, respectively, with the total number of parameters J = (K/2)(K + 1) + K, where

log(w_D), w_L ∼ gp(0, k, B),   (8)

using B ∈ R^{J×J}. After covariance reconstruction, the likelihood of the model is obtained by the multivariate normal distribution using the mean μ and the rescaled covariance matrix Σ̂ = L̂D̂L̂^T, where L̂ = w_L ⊙ L and D̂ = w_D ⊙ D using the broadcasted weights (⊙ denotes the element-wise multiplication). On the one hand, it is now possible to introduce conditional correlations between independent Gaussian distributions using the GP-Normal method. On the other hand, we can also use this method for covariance recalibration if covariance estimates are already given in the input data set D [12]. In this case, we can directly use the LDL factorization on the given covariance matrices to perform a covariance recalibration. Both approaches, covariance estimation and covariance recalibration, might be used to further improve the uncertainty quantification. This is beneficial if these uncertainties are incorporated in subsequent processes.
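As an illustration of the LDL-based rescaling step, the sketch below rescales a covariance matrix via its factors. It assumes a positive-definite covariance and applies the lower-triangular weights only to the strictly lower triangle; the weight values are placeholders standing in for the GP outputs, not values from the paper.

```python
# Sketch: rescale a covariance matrix via its LDL factors, as used for
# covariance estimation/recalibration. w_L, w_D are placeholder weights.
import numpy as np

def rescale_covariance(cov, w_L, w_D):
    """cov: (K, K) positive-definite matrix; w_L: weights for the strictly
    lower triangle of L; w_D: positive weights for the diagonal of D."""
    K = cov.shape[0]
    C = np.linalg.cholesky(cov)               # cov = C C^T
    d = np.diag(C) ** 2                       # diagonal entries of D
    L = C / np.diag(C)                        # unit lower-triangular factor
    L_hat = np.eye(K)
    L_hat[np.tril_indices(K, -1)] = w_L * L[np.tril_indices(K, -1)]
    D_hat = np.diag(w_D * d)
    return L_hat @ D_hat @ L_hat.T            # rescaled covariance

cov = np.array([[2.0, 0.5], [0.5, 1.0]])
print(rescale_covariance(cov, w_L=np.array([0.8]), w_D=np.array([1.2, 0.9])))
```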

4 Measuring Miscalibration

Besides proper scoring rules such as the negative log likelihood (NLL), there is actually no common consensus about how to measure regression miscalibration. A common metric to measure the quality of predicted quantiles is the Pinball loss L_Pin(τ) [33] for a certain quantile level τ. For regression calibration evaluation, a mean Pinball loss L_Pin = E_τ[L_Pin(τ)] over multiple quantile levels is commonly used to get an overall measure. Recently, the authors in [20] defined the uncertainty calibration error (UCE) in the scope of variance calibration, which measures the difference between the predicted uncertainty and the actual error. Similar to the expected calibration error (ECE) within classification calibration, the UCE uses a binning scheme with M bins over the variance to estimate the calibration error w.r.t. the predicted variance. More formally, the UCE is defined by

UCE := Σ_{m=1}^{M} (N_m / N) |MSE(m) − MV(m)|,   (9)

with MSE(m) and MV(m) as the mean squared error and the mean variance within each bin, respectively, and N_m as the number of samples within bin m [20]. Another similar metric, the expected normalized calibration error (ENCE), has been proposed by [21] and is defined by

ENCE := (1/M) Σ_{m=1}^{M} |RMSE(m) − RMV(m)| / RMV(m),   (10)


where RMSE(m) and RMV(m) denote the root mean squared error and the root mean variance within each bin, respectively. The advantage of the ENCE is that the miscalibration can be expressed as a percentage of the RMV. Thus, the error is independent of the output space.

Measuring Multivariate Miscalibration. Besides the NLL, none of these metrics is able to jointly measure the miscalibration of multiple dimensions or to capture the influence of correlations between multiple dimensions. Therefore, we derive a new metric, the quantile calibration error (QCE). This metric is based on the well-known normalized estimation error squared (NEES) that is used for Kalman filter consistency evaluation [2, p. 232], [34, p. 292]. Given an unbiased base estimator that outputs a mean μ_Y(X) and covariance Σ_Y(X) estimate with K dimensions for an input X, the NEES ε is the squared Mahalanobis distance between the predicted distribution f_Y and the ground truth Y. The hypothesis that the predicted estimation errors are consistent with the observed estimation errors is evaluated using a χ²-test. This test is accepted if the mean NEES is below a certain quantile threshold a_τ for a target quantile τ, i.e., ε̄_{μ,Σ} ≤ a_τ. The threshold a_τ ∈ R_{>0} is obtained by a χ²_K distribution with K degrees of freedom, so that a_τ = χ²_K(τ). The advantage of the NEES is that it measures a normalized estimation error and can also capture correlations between multiple dimensions.

For regression calibration evaluation, we adapt this idea and measure the difference between the fraction of samples that fall into the acceptance interval for a certain quantile τ and the quantile level τ itself. However, this would only reflect the marginal calibration error. Therefore, we would like to measure the error conditioned on distributions with similar properties. For instance, the dispersion of a distribution can be captured using the standardized generalized variance (SGV), which is defined by σ_G² = det(Σ)^{1/K} [28,29]. In the univariate case, the SGV is equal to the variance σ². Using the SGV as a property of the distributions, we define the QCE as

QCE(τ) := E_{X,σ_G}[ |P(ε(X) ≤ a_τ | σ_G) − τ| ].   (11)

Similar to the ECE, UCE, and ENCE, we use a binning scheme, but rather over the square root σ_G of the SGV to achieve a better data distribution for binning. The QCE measures the fraction of samples whose NEES falls into the acceptance interval for each bin, separately. Therefore, the QCE for a certain quantile τ is approximated by

QCE(τ) ≈ Σ_{m=1}^{M} (N_m / N) |freq(m) − τ|,   (12)

where freq(m) = (1/N_m) Σ_{i=1}^{N_m} 1(ε(x_i) ≤ a_τ), using 1(·) as the indicator function. Similar to the Pinball loss, we can approximate a mean QCE = E_τ[QCE(τ)] over multiple quantile levels to get an overall quality measure. Furthermore, we can


use this metric to measure the quality of multivariate uncertainty estimates in the following experiments. Note that we can also use the QCE to measure non-Gaussian distributed univariate random variables. The derivation of the QCE for multivariate, non-parametric distributions is the subject of future work.
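To make the binned metrics above concrete, the following sketch computes the UCE of Eq. (9) and a univariate QCE in the sense of Eqs. (11)–(12). It is an illustrative re-implementation under simplifying assumptions (equal-width bins, a single output dimension K = 1), not the authors' reference code.

```python
# Illustrative computation of UCE (Eq. 9) and a univariate QCE (Eqs. 11-12).
# Simplifying assumptions: equal-width bins and one output dimension (K = 1).
import numpy as np
from scipy.stats import chi2

def uce(y, mu, var, n_bins=20):
    err2 = (y - mu) ** 2
    edges = np.linspace(var.min(), var.max(), n_bins + 1)
    idx = np.clip(np.digitize(var, edges) - 1, 0, n_bins - 1)
    total = 0.0
    for m in range(n_bins):
        mask = idx == m
        if mask.any():
            total += mask.mean() * abs(err2[mask].mean() - var[mask].mean())
    return total

def qce(y, mu, var, tau=0.95, n_bins=20):
    nees = (y - mu) ** 2 / var                 # squared Mahalanobis distance (K = 1)
    a_tau = chi2.ppf(tau, df=1)                # acceptance threshold for quantile tau
    sigma_g = np.sqrt(var)                     # sqrt of the SGV; equals std. dev. for K = 1
    edges = np.linspace(sigma_g.min(), sigma_g.max(), n_bins + 1)
    idx = np.clip(np.digitize(sigma_g, edges) - 1, 0, n_bins - 1)
    total = 0.0
    for m in range(n_bins):
        mask = idx == m
        if mask.any():
            total += mask.mean() * abs((nees[mask] <= a_tau).mean() - tau)
    return total
```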

5 Experiments

We evaluate our methods using probabilistic object detection models that are trained on the Berkeley DeepDrive 100k [36] and the MS COCO 2017 [23] data sets. We use a RetinaNet R50-FPN [22] and a Faster R-CNN R101-FPN [26] as base networks. These networks are trained using a probabilistic regression output where a mean and a variance score are inferred for each bounding box quantity [11,13,15]. For uncertainty calibration training and evaluation, these networks are used to infer objects on the validation sets of the Berkeley DeepDrive 100k and COCO 2017 data sets. Unfortunately, we cannot use the respective test sets, as we need the ground-truth information for both calibration training and evaluation. Therefore, we split both validation datasets and use the first half for the training of the calibration methods, while the second half is used for calibration evaluation. Finally, it is necessary to match all bounding box predictions with the ground-truth objects to determine the estimation error. For all experiments, we use an intersection over union (IoU) score of 0.5 for object matching. For calibration training and evaluation, we adapt the methods Isotonic Regression [16], Variance Scaling [20,21], and GP-Beta [31] and provide an optimized reimplementation in our publicly available repository (https://github.com/EFS-OpenSource/calibration-framework). Furthermore, this repository also contains the new GP-Normal method and its multivariate extension for covariance estimation and covariance recalibration, as well as the GP-Cauchy method. The GP methods are based on GPyTorch [9], which provides an efficient implementation for GP optimization. For calibration evaluation, we use the negative log likelihood (NLL), the Pinball loss, UCE [20], ENCE [21], and our proposed QCE. The UCE, ENCE, and QCE use binning schemes with M = 20 bins, respectively, while the Pinball loss and the QCE use confidence levels from τ = 0.05 up to τ = 0.95 in steps of 0.05 to obtain an average calibration score over different quantile levels. Note that we cannot report the UCE and ENCE for the GP-Cauchy method, as the Cauchy distribution has no defined variance. Furthermore, we only report the multivariate QCE for Gaussian distributions in our experiments, since this metric is based on the squared Mahalanobis distance for Gaussian distributions in the multivariate case. On the one hand, we examine the calibration properties of each dimension independently and denote the average calibration metrics over all dimensions. On the other hand, we further compute the multivariate NLL and QCE to measure calibration for multivariate distributions. The results are given in Table 1.

Observations. The experiments show that the simple Isotonic Regression [16] achieves the best results regarding quantile calibration (cf. metrics QCE and


Table 1. Calibration results for Faster R-CNN R101-FPN and RetinaNet R50-FPN evaluated on the Berkeley DeepDrive (BDD) and MS COCO evaluation datasets before and after uncertainty calibration with different calibration methods. Univariate metrics: NLL, QCE, LPin, UCE, ENCE; multivariate metrics: NLL, QCE.

Faster R-CNN, BDD
  Method                 NLL    QCE    LPin   UCE      ENCE   | mv. NLL  mv. QCE
  Uncalibrated           3.053  0.040  1.079  19.683   0.454  | 12.210   0.071
  Isotonic Reg. [16]     2.895  0.017  1.059  39.157   0.303  | 11.579   –
  GP-Beta [31]           2.941  0.057  1.077  3.635    0.199  | 11.764   –
  Var. Scaling [20,21]   2.962  0.061  1.086  3.361    0.175  | 11.848   0.131
  GP-Normal              2.962  0.059  1.084  3.289    0.188  | 11.848   0.128
  GP-Normal (mv.)        2.968  0.054  1.150  3.234    0.191  | 11.584   0.133
  GP-Cauchy              3.011  0.050  1.189  –        –      | 12.045   –

Faster R-CNN, COCO
  Uncalibrated           3.561  0.154  3.055  32.899   0.096  | 14.245   0.256
  Isotonic Reg. [16]     3.340  0.020  2.715  42.440   0.121  | 13.360   –
  GP-Beta [31]           3.412  0.074  2.750  51.455   0.140  | 13.649   –
  Var. Scaling [20,21]   3.554  0.131  2.952  33.155   0.093  | 14.216   0.222
  GP-Normal              3.554  0.130  2.949  48.167   0.132  | 14.235   0.200
  GP-Normal (mv.)        3.562  0.121  3.298  28.382   0.087  | 13.955   0.216
  GP-Cauchy              3.406  0.039  2.897  –        –      | 13.624   –

RetinaNet, BDD
  Uncalibrated           4.052  0.130  1.847  39.933   0.491  | 16.208   0.234
  Isotonic Reg. [16]     3.224  0.068  1.814  39.498   0.252  | 12.898   –
  GP-Beta [31]           3.419  0.091  2.169  14.892   0.173  | 13.677   –
  Var. Scaling [20,21]   3.392  0.095  2.203  15.833   0.180  | 13.568   0.131
  GP-Normal              3.392  0.095  2.205  15.820   0.180  | 13.568   0.131
  GP-Normal (mv.)        3.434  0.097  3.356  15.342   0.200  | 13.353   0.133
  GP-Cauchy              3.316  0.097  1.944  –        –      | 13.262   –

RetinaNet, COCO
  Uncalibrated           4.694  0.115  4.999  553.244  0.530  | 18.778   0.153
  Isotonic Reg. [16]     3.886  0.029  4.534  105.343  0.113  | 15.544   –
  GP-Beta [31]           4.169  0.142  5.093  77.017   0.071  | 16.677   –
  Var. Scaling [20,21]   4.204  0.167  5.606  83.653   0.072  | 16.815   0.207
  GP-Normal              4.204  0.167  5.606  84.367   0.072  | 16.815   0.207
  GP-Normal (mv.)        4.236  0.158  6.233  82.074   0.087  | 16.455   0.194
  GP-Cauchy              3.936  0.043  4.716  –        –      | 15.745   –

Pinball loss) as well as for the NLL. In contrast, the calibration methods Variance Scaling and GP-Normal only lead to slight or no improvements w.r.t. quantile calibration but achieve better results for UCE and ENCE. These metrics measure the miscalibration by asserting normal distributions, which leads to the assumption that these methods are beneficial if normal distributions are required for subsequent processes. To get a better insight, we further inspect the error distribution as well as the reliability diagrams for quantile regression in Figs. 3 and 4, respectively. On the one hand, our experiments show that the estimation error of the base networks is centered around 0 but also has heavier tails compared to a representation by a normal distribution. In these cases, the error distribution can be better expressed in terms of a Cauchy distribution (GP-Cauchy). On


Fig. 4. Reliability diagram which shows the fraction of ground-truth samples that are covered by the predicted quantile for a certain τ . The Faster R-CNN underestimates the prediction uncertainty. The best calibration performance w.r.t. quantile calibration is reached by Isotonic Regression [16] and GP-Cauchy.

the other hand, we cannot find a strong connection between the error and the bounding box position and/or shape. This can be seen by comparing the calibration performance of GP-Normal and the simpler Variance Scaling method that does not take additional distribution information into account. Therefore, marginal recalibration in terms of the basic quantile calibration is sufficient in this case, which explains the success of the simple Isotonic Regression calibration method. Although the GP-Beta is also not bound to any parametric output distribution and should also lead to quantile calibration, it is restricted to the beta calibration family of functions, which limits the representation power of this method. In contrast, Isotonic Regression calibration is more flexible in its shape for the output distribution. However, applications such as Kalman filtering require a normal distribution as input. Although Isotonic Regression achieves good results with respect to quantile calibration, it only yields poor results in terms of variance calibration (cf. UCE and ENCE). In contrast, Variance Scaling and especially the multivariate GP-Normal achieve the best results since these methods are designed to represent the error distribution as a Gaussian. In a nutshell, if the application requires extracting statistical moments (e.g., mean and variance) for subsequent processing, Isotonic Regression is not well suited since the calibration properties are aligned w.r.t. the quantiles and not a parametric representation. In comparison, a direct calibration with parametric models (e.g., GP-Normal) leads to better properties regarding the statistical moments. Therefore, we conclude that the error distribution is represented best by the simple Isotonic Regression when working with quantiles, and by Variance Scaling and multivariate GP-Normal when working with normal distributions. In our experiments, the error distribution follows a Cauchy distribution rather than a normal distribution. Furthermore, we could not find a strong connection between position information and the error distribution. Thus, marginal recalibration is sufficient in this case. Therefore, the choice of the calibration


method depends on the use case: if the quantiles are of interest, we advise the use of Isotonic Regression or GP-Beta. If a representation in terms of a continuous parametric distribution is required, the GP-Cauchy or GP-Normal achieve reasonable results. If a normal distribution is required, we strongly suggest the use of the multivariate GP-Normal recalibration method as it yields the best calibration results for the statistical moments and is well suited for a Gaussian representation of the uncertainty.

6 Conclusion

For environment perception in the context of autonomous driving, calibrated spatial uncertainty information is of major importance to reliably process the detected objects in subsequent applications such as Kalman filtering. In this work, we give a brief overview of the existing definitions for regression calibration. Furthermore, we extend the GP recalibration framework by [31] and use parametric probability distributions (Gaussian, Cauchy) as calibration output on the one hand. This is an advantage when subsequent processes require a certain type of distribution, e.g., a Gaussian distribution for Kalman filtering. On the other hand, we use this framework to perform covariance estimation of independently learnt distributions for multivariate regression tasks. This enables a post-hoc introduction of (conditional) dependencies between multiple dimensions and further improves the uncertainty representation. We adapt the most relevant recalibration techniques and compare them to our parametric GP methods to recalibrate the spatial uncertainty of probabilistic object detection models [11–13]. Our examinations show that common probabilistic object detection models are too underconfident in their spatial uncertainty estimates. Furthermore, the distribution of the prediction error is more similar to a Cauchy distribution than to a normal distribution (cf. Fig. 3). We cannot find significant correlations between prediction error and position information. Therefore, the definition for quantile calibration is sufficient to represent the uncertainty of the examined detection models. This explains the superior performance of the simple Isotonic Regression method, which performs a mean- and variance-agnostic (marginal) recalibration of the cumulative distribution functions. In contrast, we can show that our multivariate GP-Normal method achieves the best results in terms of variance calibration, which is advantageous when normal distributions are required as output.

Acknowledgement. The authors gratefully acknowledge support of this work by Elektronische Fahrwerksysteme GmbH, Gaimersheim, Germany. The research leading to the results presented above is funded by the German Federal Ministry for Economic Affairs and Energy within the project "KI Absicherung - Safe AI for automated driving".


References

1. Banerjee, K., Notz, D., Windelen, J., Gavarraju, S., He, M.: Online camera lidar fusion and object detection on hybrid data for autonomous driving. In: 2018 IEEE Intelligent Vehicles Symposium (IV), pp. 1632–1638. IEEE (2018)
2. Bar-Shalom, Y., Li, X.R., Kirubarajan, T.: Estimation with Applications to Tracking and Navigation: Theory Algorithms and Software. Wiley, New York (2004)
3. Bhatt, D., Mani, K., Bansal, D., Murthy, K., Lee, H., Paull, L.: f-Cal: calibrated aleatoric uncertainty estimation from neural networks for robot perception. arXiv preprint arXiv:2109.13913 (2021)
4. Chung, Y., Neiswanger, W., Char, I., Schneider, J.: Beyond pinball loss: quantile methods for calibrated uncertainty quantification. In: Advances in Neural Information Processing Systems, vol. 34 (2021)
5. Cui, P., Hu, W., Zhu, J.: Calibrated reliable regression using maximum mean discrepancy. In: Advances in Neural Information Processing Systems, vol. 33 (2020)
6. Fasiolo, M., Wood, S.N., Zaffran, M., Nedellec, R., Goude, Y.: Fast calibrated additive quantile regression. J. Am. Stat. Assoc. 116, 1402–1412 (2020). https://doi.org/10.1080/01621459.2020.1725521
7. Feng, D., Harakeh, A., Waslander, S.L., Dietmayer, K.: A review and comparative study on probabilistic object detection in autonomous driving. IEEE Trans. Intell. Transp. Syst. (2021)
8. Feng, D., Rosenbaum, L., Glaeser, C., Timm, F., Dietmayer, K.: Can we trust you? On calibration of a probabilistic object detector for autonomous driving. arXiv preprint (2019)
9. Gardner, J.R., Pleiss, G., Bindel, D., Weinberger, K.Q., Wilson, A.G.: GPyTorch: blackbox matrix-matrix Gaussian process inference with GPU acceleration. In: Advances in Neural Information Processing Systems (2018)
10. Guo, C., Pleiss, G., Sun, Y., Weinberger, K.Q.: On calibration of modern neural networks. In: Proceedings of the 34th International Conference on Machine Learning, Proceedings of Machine Learning Research, vol. 70, pp. 1321–1330. PMLR, August 2017
11. Hall, D., et al.: Probabilistic object detection: definition and evaluation. In: The IEEE Winter Conference on Applications of Computer Vision, pp. 1031–1040 (2020)
12. Harakeh, A., Smart, M., Waslander, S.L.: BayesOD: a Bayesian approach for uncertainty estimation in deep object detectors. In: 2020 IEEE International Conference on Robotics and Automation (ICRA), pp. 87–93. IEEE (2020)
13. He, Y., Zhu, C., Wang, J., Savvides, M., Zhang, X.: Bounding box regression with uncertainty for accurate object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2888–2897 (2019)
14. Hensman, J., Matthews, A., Ghahramani, Z.: Scalable variational Gaussian process classification. In: Artificial Intelligence and Statistics, pp. 351–360. PMLR (2015)
15. Kendall, A., Gal, Y.: What uncertainties do we need in Bayesian deep learning for computer vision? In: Advances in Neural Information Processing Systems (NIPS), pp. 5574–5584 (2017)
16. Kuleshov, V., Fenner, N., Ermon, S.: Accurate uncertainties for deep learning using calibrated regression. In: International Conference on Machine Learning (ICML), pp. 2801–2809 (2018)
17. Kull, M., Silva Filho, T., Flach, P.: Beta calibration: a well-founded and easily implemented improvement on logistic calibration for binary classifiers. In: Artificial Intelligence and Statistics, pp. 623–631 (2017)


18. Kumar, A., Liang, P.S., Ma, T.: Verified uncertainty calibration. In: Advances in Neural Information Processing Systems, vol. 32, pp. 3792–3803. Curran Associates, Inc. (2019). http://papers.nips.cc/paper/8635-verified-uncertainty-calibration.pdf
19. Küppers, F., Kronenberger, J., Shantia, A., Haselhoff, A.: Multivariate confidence calibration for object detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, pp. 326–327 (2020)
20. Laves, M.H., Ihler, S., Fast, J.F., Kahrs, L.A., Ortmaier, T.: Well-calibrated regression uncertainty in medical imaging with deep learning. In: Medical Imaging with Deep Learning, pp. 393–412. PMLR (2020)
21. Levi, D., Gispan, L., Giladi, N., Fetaya, E.: Evaluating and calibrating uncertainty prediction in regression tasks. arXiv preprint abs/1905.11659 (2019)
22. Lin, T.Y., Goyal, P., Girshick, R., He, K., Dollár, P.: Focal loss for dense object detection. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV), pp. 2980–2988 (2017)
23. Lin, T.-Y., et al.: Microsoft COCO: common objects in context. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8693, pp. 740–755. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-10602-1_48
24. Naeini, M.P., Cooper, G.F., Hauskrecht, M.: Binary classifier calibration using a Bayesian non-parametric approach. In: Proceedings of the 2015 SIAM International Conference on Data Mining, pp. 208–216 (2015). https://doi.org/10.1137/1.9781611974010.24
25. Naeini, M., Cooper, G., Hauskrecht, M.: Obtaining well calibrated probabilities using Bayesian binning. In: Proceedings of the 29th AAAI Conference on Artificial Intelligence, pp. 2901–2907 (2015)
26. Ren, S., He, K., Girshick, R., Sun, J.: Faster R-CNN: towards real-time object detection with region proposal networks. In: Advances in Neural Information Processing Systems (NIPS), pp. 91–99 (2015)
27. Schwaiger, F., Henne, M., Küppers, F., Schmoeller Roza, F., Roscher, K., Haselhoff, A.: From black-box to white-box: examining confidence calibration under different conditions. In: Proceedings of the Workshop on Artificial Intelligence Safety 2021 (SafeAI 2021) Co-located with the Thirty-Fifth AAAI Conference on Artificial Intelligence (AAAI 2021) (2021)
28. SenGupta, A.: Tests for standardized generalized variances of multivariate normal populations of possibly different dimensions. J. Multivariate Anal. 23(2), 209–219 (1987)
29. SenGupta, A.: Generalized variance. Encycl. Stat. Sci. 6053 (2004)
30. Snelson, E., Ghahramani, Z.: Sparse Gaussian processes using pseudo-inputs. Adv. Neural Inf. Process. Syst. 18, 1257 (2006)
31. Song, H., Diethe, T., Kull, M., Flach, P.: Distribution calibration for regression. In: Proceedings of the 36th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 97, pp. 5897–5906. PMLR, Long Beach, California, USA, 9–15 June 2019
32. Song, L., Zhang, X., Smola, A., Gretton, A., Schölkopf, B.: Tailoring density estimation via reproducing kernel moment matching. In: Proceedings of the 25th International Conference on Machine Learning, pp. 992–999 (2008)
33. Steinwart, I., Christmann, A.: Estimating conditional quantiles with the help of the pinball loss. Bernoulli 17(1), 211–225 (2011)
34. Van Der Heijden, F., Duin, R.P., De Ridder, D., Tax, D.M.: Classification, Parameter Estimation and State Estimation: An Engineering Approach Using MATLAB. Wiley, New York (2005)


35. Yang, Z., Li, J., Li, H.: Real-time pedestrian and vehicle detection for autonomous driving. In: 2018 IEEE Intelligent Vehicles Symposium (IV), pp. 179–184. IEEE (2018)
36. Yu, F., et al.: BDD100K: a diverse driving video database with scalable annotation tooling. CoRR (2018)
37. Zadrozny, B., Elkan, C.: Transforming classifier scores into accurate multiclass probability estimates. In: Proceedings of the Eighth International Conference on Knowledge Discovery and Data Mining, 23–26 July 2002, Edmonton, Alberta, Canada, pp. 694–699 (2002). https://doi.org/10.1145/775047.775151

Reliable Multimodal Trajectory Prediction via Error Aligned Uncertainty Optimization

Neslihan Kose1(B), Ranganath Krishnan2, Akash Dhamasia1, Omesh Tickoo2, and Michael Paulitsch1

1 Intel Labs, Munich, Germany
{neslihan.kose.cihangir,akash.dhamasia,michael.paulitsch}@intel.com
2 Intel Labs, Hillsboro, USA
{ranganath.krishnan,omesh.tickoo}@intel.com

Abstract. Reliable uncertainty quantification in deep neural networks is crucial in safety-critical applications such as automated driving for trustworthy and informed decision-making. Assessing the quality of uncertainty estimates is challenging as ground truth for uncertainty estimates is not available. Ideally, in a well-calibrated model, uncertainty estimates should perfectly correlate with model error. We propose a novel error aligned uncertainty optimization method and introduce a trainable loss function to guide the models to yield good quality uncertainty estimates aligning with the model error. Our approach targets continuous structured prediction and regression tasks, and is evaluated on multiple datasets including a large-scale vehicle motion prediction task involving real-world distributional shifts. We demonstrate that our method improves average displacement error by 1.69% and 4.69%, and the uncertainty correlation with model error by 17.22% and 19.13% as quantified by the Pearson correlation coefficient on two state-of-the-art baselines.

Keywords: Reliable uncertainty quantification · Robustness · Multimodal trajectory prediction · Error aligned uncertainty calibration · Safety-critical applications · Informed decision-making · Safe artificial intelligence · Automated driving · Real-world distributional shift

1 Introduction

Conventional deep learning models often tend to make unreliable predictions [16,30] and do not provide a measure of uncertainty in regression tasks. Incorporating uncertainty estimation into deep neural networks enables informed decision-making, as it indicates how certain the model is about its prediction. In safety-critical applications such as automated driving (AD), in addition to making accurate predictions, it is important to quantify the uncertainty associated with predictions [15] in order to establish trust in the models.


Fig. 1. Figure shows how our approach improves model calibration when it is incorporated into a state-of-the-art uncertainty-aware vehicle trajectory prediction baseline (Deep Imitative Model [34]) on the Shifts dataset [29]. Here, c and ADE denote the log-likelihood score (certainty measure) and the average displacement error (robustness measure), respectively. In well-calibrated models, we expect a higher c value for samples with lower ADE and a lower c value for samples with higher ADE.

A well-calibrated model should yield low uncertainty when the prediction is accurate, and indicate high uncertainty for inaccurate predictions. Since the ground truths for uncertainty estimates are not available, uncertainty calibration is challenging, and has been explored mostly for classification tasks or post-hoc fine-tuning so far. In this paper, we propose a novel error aligned uncertainty optimization technique introducing a differentiable secondary loss function to obtain well-calibrated models for regression-based tasks, including challenging continuous structured prediction tasks such as vehicle motion prediction. Figure 1 shows how our approach improves model calibration when it is incorporated into a state-of-the-art (SoTA) uncertainty-aware vehicle trajectory prediction model, which involves predicting the next trajectories (i.e., possible future states) of agents around the automated vehicle. Vehicle trajectory prediction is a crucial component of AD. This area has recently received considerable attention from both industry and academia [5,14,18,26]. Reliable uncertainty quantification for the prediction module is of critical importance for the planning module of an autonomous system to make the right decision, leading to safer decision-making. Related safety standards like ISO/PAS 21448 [20] suggest uncertainty is an important safety aspect. In addition to being a safety-critical application, trajectory prediction is also a challenging task for the uncertainty estimation domain due to the structured and continuous nature of predictions. In this paper, our contributions are as follows:

– We propose a novel error aligned uncertainty (EaU) optimization method and introduce the EaU calibration loss to guide the models to provide reliable uncertainty estimates correlating with the model error. To the best of our knowledge, this is the first work to introduce a trainable loss function to obtain


reliable uncertainty estimates for continuous structured prediction (e.g., trajectory prediction) and regression tasks.
– We evaluate the proposed method with the large-scale real-world Shifts dataset and benchmark [29] for the vehicle trajectory prediction task using two different baselines. We also use the UCI datasets [9] as an additional evaluation of our approach considering multiple regression tasks. The results from extensive experiments demonstrate that our method improves SoTA model performance on these tasks and yields well-calibrated models in addition to improved robustness, even under real-world distributional shifts.

In the rest of the paper, our approach is explained considering vehicle motion prediction, as it is a safety-critical application of AD and a challenging application of uncertainty quantification due to the continuous, structured, and multimodal nature of predictions. Here, multimodality refers to multiple trajectory predictions that are plausible for a target vehicle in a certain context. Our method is applicable to any regression task, which we later show on the UCI benchmark.

2 Related Work

This paper focuses on reliable uncertainty quantification for continuous structured prediction tasks such as trajectory prediction, in addition to any regression-based task, introducing a novel differentiable loss function that correlates uncertainty estimates with model error. There are various approaches to estimate uncertainty in neural network predictions, including Bayesian [4,10,13,36,37] and non-Bayesian [24,25,35] methods. Our proposed error aligned uncertainty optimization is orthogonal to and can complement existing uncertainty estimation methods, as it can be applied as a secondary loss function together with any of these methods to obtain calibrated, good quality uncertainty estimates in addition to improved robustness. So far, the existing solutions for well-calibrated uncertainties have mainly been developed for classification tasks or are based on post-hoc calibration. In [21], the authors propose an optimization method that is based on the relationship between accuracy and uncertainty. For this purpose, they introduced the differentiable accuracy versus uncertainty calibration (AvUC) loss, which enables the model to provide well-calibrated uncertainties in addition to improved accuracy. The paper shows promising results; however, it is applicable only to classification tasks. We follow the insights from AvUC to build a loss-calibrated inference method for time-series based prediction and regression tasks. Post-hoc methods [22,23,27] are based on recalibrating a pretrained model with a sample of independent and identically distributed (i.i.d.) data, and have been devised mainly for classification or small-scale regression problems. [22] proposed a post-hoc method to calibrate the output of any 1D regression algorithm based on Platt scaling [32]. In [27], a truth discovery framework is proposed, integrating ensemble-based and post-hoc calibration methods based on an accuracy-preserving truth estimator. The authors show that post-hoc methods can be enhanced by truth discovery-regularized optimization. To the best of our knowledge, there is no existing post-hoc calibration solution for the time-series


based multi-variate regression tasks such as trajectory prediction, as they bring additional challenges when designing these methods. [30] has also shown that post-hoc methods fail to provide well-calibrated uncertainties under distributional shifts in the real world. Recently, the Shifts dataset and benchmark [29] were introduced for the analysis of uncertainty and robustness to distributional shift on multimodal vehicle trajectory prediction. Multimodal techniques in this context output a distribution over multiple trajectories [6,8,17,28,31]. The Shifts dataset is currently the largest publicly available vehicle motion prediction dataset with 600,000 scenes, including real-world distributional shifts. The Shifts benchmark incorporates uncertainties into the vehicle motion prediction pipeline and proposes metrics and solutions on how to jointly assess robustness and uncertainties in this task. In [29], the authors emphasize that in continuous structured prediction tasks such as vehicle trajectory prediction, there is still much potential for further development of informative measures of uncertainty and well-calibrated models. To the best of our knowledge, this is the first work proposing an orthogonal solution based on a new trainable secondary loss function that can complement existing uncertainty estimation methods for continuous structured prediction tasks to improve the quality of their uncertainty measures.

3 Setup for Vehicle Trajectory Prediction

Vehicle trajectory prediction aims to predict the future states of agents in the scene. Here, the training set is denoted as D_train = {(x_i, y_i)}_{i=1}^N. y and x denote the ground truth trajectories and the high-dimensional features representing the scene context, respectively. Each y = (s_1, ..., s_T) denotes the future trajectory of a given vehicle observed by the automated vehicle perception stack. Each state s_t corresponds to the dx- and dy-displacement of the vehicle at timestep t, where y ∈ R^{T×2} (Fig. 1). Each scene has a duration of M seconds. It is divided into K seconds of context features and L seconds of ground truth targets for prediction, separated by the time T = 0. The goal is to predict the future trajectory of vehicles at time T ∈ (0, L] based on the information available for time T ∈ [−K, 0]. In [29], the authors propose a solution on how to incorporate uncertainty into motion prediction, and introduce two types of uncertainty quantification metrics:

– Per-trajectory confidence-aware metric: For a given input x, the stochastic model accompanies its D top trajectory predictions with scalar per-trajectory confidence scores (c^(i) | i ∈ 1, ..., D).
– Per-prediction request uncertainty metric: U is computed by aggregating the D top per-trajectory confidence scores into a single uncertainty score, which represents the model's uncertainty in making any prediction for the target agent.


As shown in Fig. 1, the input of the model is a single scene context x consisting of static input features (a map of the environment that can be augmented with extra information such as crosswalk occupancy, lane availability, direction, and speed limit) and time-dependent input features (e.g., occupancy, velocity, acceleration, and yaw for vehicles and pedestrians in the scene). The output of the model is D top trajectory predictions (y^(d) | d ∈ 1, ..., D) for the future movements of the target vehicle together with their corresponding certainty (c^(d) | d ∈ 1, ..., D), as in Fig. 1, or uncertainty (u^(d) | d ∈ 1, ..., D) scores, as well as a single per-prediction request uncertainty score U. In the paper, we use c and u as our uncertainty measure interchangeably according to the context. A higher c indicates a lower u. In the presence of an unfamiliar or high-risk scene context, an automated vehicle associates a high per-prediction request uncertainty [29]. However, since uncertainty estimates do not have ground truth, it is challenging to assess their quality. In our paper, we jointly assess the quality of uncertainty measures and robustness to distributional shift according to the following problem dimensions:

Robustness to distributional shift is mainly assessed via metrics such as the Average Displacement Error (ADE) or the Mean Square Error (MSE) in the case of continuous structured prediction and regression tasks, respectively. ADE is the standard performance metric for time-series data and measures the quality of a prediction y with respect to the ground truth y* as given in Eq. 1, where y = (s_1, ..., s_T):

ADE(y) := (1/T) Σ_{t=1}^{T} ||s_t − s_t*||₂.   (1)

The analysis is done with two types of evaluation datasets, which are the in-distribution and shifted datasets. Models are considered more robust if a smaller degradation in performance is observed on the shifted data. The quality of uncertainty estimates should be jointly assessed with the robustness measure, as there could be cases such that the model performs well on shifted data (e.g., for certain examples of distributional shift) and poorly on in-distribution data (e.g., on data from a not well-represented part of the training set) [29]. Joint assessment enables us to understand whether measures of uncertainty correlate well with the presence of an incorrect prediction or high error. Our goal is to have reliable uncertainty estimates which correlate well with the corresponding error measure. For this purpose, we propose an error aligned uncertainty (EaU) optimization technique to get well-calibrated models in addition to improved robustness.
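As a small illustration of these quantities (a sketch with invented array names, not the benchmark code), the snippet below computes the ADE of Eq. (1) and aggregates top-D per-trajectory log-likelihood scores into a single per-prediction request uncertainty U by negating their mean:

```python
# Sketch: ADE of a predicted trajectory (Eq. 1) and aggregation of
# per-trajectory certainty scores into a per-prediction uncertainty U.
# Array names and shapes are illustrative assumptions.
import numpy as np

def ade(pred, gt):
    """pred, gt: (T, 2) arrays of displacements; returns Eq. (1)."""
    return np.linalg.norm(pred - gt, axis=1).mean()

def per_prediction_uncertainty(log_likelihoods):
    """log_likelihoods: (D,) per-trajectory certainty scores c^(d)."""
    return -np.mean(log_likelihoods)   # higher U means less certain overall

pred, gt = np.zeros((25, 2)), np.ones((25, 2))
print(ade(pred, gt), per_prediction_uncertainty(np.array([3.2, 1.1, 0.4])))
```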

4 Error Aligned Uncertainty Optimization

In order to develop well-calibrated models for continuous structured prediction and regression tasks, we propose a novel Error aligned Uncertainty Calibration (EaUC) loss, which can be used as a secondary loss utility function relying on theoretically sound Bayesian decision theory [3]. A task-specific utility function is used for optimal model learning in Bayesian decision theory. The EaUC loss


is intended to be used as a secondary task-specific utility function to achieve the objective of training the model to provide well-calibrated uncertainty estimates. We build our method with the insights from the AvUC loss [21] that was introduced for classification tasks, and extend this previous work on classification to continuous structured prediction and regression tasks. In regression and continuous structured prediction tasks, robustness is typically measured in terms of Mean Square Error (MSE) and Average Displacement Error (ADE), respectively, instead of an accuracy score. Therefore, our robustness measure is ADE here, as we explain our technique on the motion prediction task. Lower MSE and ADE indicate more accurate results.

              Uncertainty
              certain    uncertain
ADE  low      LC         LU
     high     HC         HU

EaU = (n_LC + n_HU) / (n_LC + n_LU + n_HC + n_HU).   (2)

In the rest of this paper, we use the notations in Eq. 2 to represent the number of accurate & certain samples (n_LC), inaccurate & certain samples (n_HC), accurate & uncertain samples (n_LU), and inaccurate & uncertain samples (n_HU). Here, L and H refer to low and high error, respectively, and C and U refer to a certain and an uncertain sample, respectively. Ideally, we expect the model to be certain about its prediction when it is accurate and to provide high uncertainty when making inaccurate predictions (Fig. 1). So, our aim is to have more certain samples when the predictions are accurate (LC) and uncertain samples when they are inaccurate (HU), compared to having uncertain samples when they are accurate (LU) and certain samples when they are inaccurate (HC). For this purpose, we propose the Error aligned Uncertainty (EaU) measure shown in Eq. 2. A reliable and well-calibrated model provides a higher EaU measure (EaU ∈ [0, 1]). Equation 3 shows how we assign each sample to the corresponding class of samples:

n_LU := Σ_i 1(ade_i ≤ ade_th and c_i ≤ c_th);
n_HC := Σ_i 1(ade_i > ade_th and c_i > c_th);
n_LC := Σ_i 1(ade_i ≤ ade_th and c_i > c_th);                    (3)
n_HU := Σ_i 1(ade_i > ade_th and c_i ≤ c_th);

In Eq. 3, we use the average displacement error ade_i as our robustness measure to classify a sample as accurate or inaccurate by comparing it with a task-dependent threshold ade_th. The samples are classified as certain or uncertain according to their certainty score c_i, which is based on the log-likelihood in our motion prediction use case. Similarly, the log-likelihood of each sample, which is our certainty measure, is compared with a task-dependent threshold c_th.
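For illustration, the (non-differentiable) counting version of the EaU measure from Eqs. (2)–(3) can be written as follows; the threshold values here are placeholders:

```python
# Sketch: hard (non-differentiable) EaU measure from Eqs. (2)-(3).
# ade, certainty: 1-D arrays; thresholds are illustrative placeholders.
import numpy as np

def eau_measure(ade, certainty, ade_th=0.8, c_th=0.6):
    low, certain = ade <= ade_th, certainty > c_th
    n_lc = np.sum(low & certain)        # accurate & certain
    n_lu = np.sum(low & ~certain)       # accurate & uncertain
    n_hc = np.sum(~low & certain)       # inaccurate & certain
    n_hu = np.sum(~low & ~certain)      # inaccurate & uncertain
    return (n_lc + n_hu) / max(n_lc + n_lu + n_hc + n_hu, 1)
```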


As the equations in Eq. 3 are not differentiable, we propose differentiable approximations to these functions and introduce a trainable uncertainty calibration loss (L_EaUC) in Eq. 4. This loss serves as the utility-dependent penalty term within the loss-calibrated approximate inference framework for regression and continuous structured prediction tasks:

L_EaUC = −log( (n_LC + n_HU) / (n_LC + n_LU + n_HC + n_HU) ),   (4)

where

n_LU = Σ_{i: ade_i ≤ ade_th and c_i ≤ c_th} (1 − tanh(ade_i)) · (1 − c_i);
n_LC = Σ_{i: ade_i ≤ ade_th and c_i > c_th} (1 − tanh(ade_i)) · c_i;
n_HC = Σ_{i: ade_i > ade_th and c_i > c_th} tanh(ade_i) · c_i;
n_HU = Σ_{i: ade_i > ade_th and c_i ≤ c_th} tanh(ade_i) · (1 − c_i).

We utilize the hyperbolic tangent as a bounding function to scale the error and/or certainty measures to the range [0, 1]. The intuition behind the approximate functions is that the bounded error tanh(ade) → 0 when the predictions are accurate and tanh(ade) → 1 when they are inaccurate. In our approach, the proposed EaUC loss is used as a secondary loss. Equation 5 shows the final loss function used for the continuous structured prediction task:

L_Final = L_NLL + (β × L_EaUC).   (5)

β is a hyperparameter for the relative weighting of the EaUC loss with respect to the primary loss (e.g., the Negative Log Likelihood (NLL) in Eq. 5). Our approach consists of three hyperparameters, which are ade_th, c_th, and β. The details on how to set these hyperparameters are provided in Sect. 5.3. Under ideal conditions, the proxy functions in Eq. 4 are equivalent to the indicator functions defined in Eq. 3. This calibration loss enables the model to learn to provide well-calibrated uncertainties in addition to improved robustness. Algorithm 1 shows the implementation of our method with the Behavioral Cloning (BC) model [7], which is the baseline model of the Shifts benchmark. BC consists of encoder and decoder stages. The encoder is a Convolutional Neural Network (CNN) which captures the scene context, and the decoder consists of Gated Recurrent Unit (GRU) cells applied at each time step to capture the predictive distribution for future trajectories. In our approach, we add the proposed L_EaUC loss as a secondary loss to this pipeline.
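The differentiable loss of Eq. (4) can be sketched in a few lines of PyTorch. The snippet below is an illustrative re-implementation under our reading of Eqs. (4)–(5), not the authors' released code; the threshold values and the epsilon are placeholder choices, and the certainty c is assumed to be already scaled to [0, 1].

```python
# Sketch of the differentiable EaUC loss (Eq. 4); certainty c is assumed
# to be pre-scaled to [0, 1]. Thresholds and eps are illustrative choices.
import torch

def eauc_loss(ade, c, ade_th=0.8, c_th=0.6, eps=1e-8):
    low, certain = ade <= ade_th, c > c_th
    e = torch.tanh(ade)                               # bounded error in [0, 1)
    n_lc = ((1 - e) * c)[low & certain].sum()
    n_lu = ((1 - e) * (1 - c))[low & ~certain].sum()
    n_hc = (e * c)[~low & certain].sum()
    n_hu = (e * (1 - c))[~low & ~certain].sum()
    return -torch.log((n_lc + n_hu + eps) / (n_lc + n_lu + n_hc + n_hu + eps))

# total objective as in Eq. (5): loss = nll + beta * eauc_loss(ade, c)
```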

5 Experiments and Results

In our experiments, we incorporated the proposed calibration loss L_EaUC into two stochastic models of the recently introduced Shifts benchmark for uncertainty quantification of multimodal trajectory prediction, and into one Bayesian model for uncertainty quantification of regression tasks as an additional evaluation of our approach. The results are provided by incorporating the new loss into state-of-the-art and diverse baseline architectures.

5.1 Multimodal Vehicle Trajectory Prediction

We use the real-world Shifts vehicle motion prediction dataset and benchmark [29]. In this task, distributional shift is prevalent involving real ‘in-the-wild’ distributional shifts, which brings challenges for robustness and uncertainty estimation. Shifts dataset has data from six geographical locations, three seasons, three times of day, and four weather conditions to evaluate the quality of uncertainty under distributional shift. Currently, it is the largest publicly available vehicle motion prediction dataset containing 600, 000 scenes, which consists of both indistribution and shifted partitions. In Shifts benchmark, optimization is done based on NLL objective, and results are reported for two baseline architectures, which are stochastic Behavioral Cloning (BC) Model [7] and Deep Imitative Model (DIM) [34]. We report our results incorporating the ‘Error Aligned Uncertainty Calibration’ loss LEaU C as secondary loss to Shifts pipeline as shown in Algorithm 1. Algorithm 1. EaUC loss incorporated to BC model for trajectory prediction 1: 2: 3: 4: 5: 6: 7: 8: 9: 10: 11: 12: 13: 14: 15: 16: 17: 18: 19: 20: 21: 22: 23: 24: 25:

Given dataset D = {X, Y } Given K seconds of context features (x) and L seconds of ground truth targets (y) separated by T = 0 for each scene let parameters of MultiVariate Normal Distribution (MVN) (μ, Σ)  approx MVN distribution Y = N (μ, Σ) define learning rate schedule α repeat Sample B index set of training samples; Dm = {(xi , yi )}B  batch-size i=1 for i ∈ B do zi = encoder(xi )  encodes visual input xi for t ∈ L do  decode a trajectory t zi = decoder(ˆ y , zi )  unrolls the GRU t t μ , Σ = linear(zi )  Predicts the location and scale of the MVN distribution. t t Sample  ∼ N (μ , Σ ) yˆi t = yˆi t +  end for Calculate Average Displacement Error (ADE) L t 1 ˆt adei := L  ADE wrt to ground truth y t=1 ||yi − yi ||2 . Calculate Certainty Score   L ci = log(Πt=1 N yˆi t |μ(yit , xi ; θ), Σ(yit , xi ; θ ))  log-likelihood of a trajectory yˆi in context xi to come from an expert yi (teacher-forcing) end for Compute nLC , nLU , nHC , nHU  components of EaUC loss penalty Compute loss-calibrated objective (total loss): L = LNLL + βLEaUC Compute the gradients of loss function w.r.t to weights w, ΔLw Update the parameters w: w ← w − αΔLw until L has converged, or when stopped


Our aim is to learn predictive distributions that capture reliable uncertainty. This allows us to quantify uncertainties in the predictions during inference through a sampling method and to predict multiple trajectories of a target vehicle for the next 5 s using data collected with a 5 Hz sampling rate. During training, for both the BC and DIM models, the density estimator (likelihood model) is generated by teacher-forcing (i.e., from the distribution of ground truth trajectories). The same settings as in the Shifts benchmark are used: the model is trained with the AdamW [33] optimizer with a learning rate (LR) of 0.0001, using a cosine annealing LR schedule with 1 epoch warmup, and gradient clipping at 1. Training is stopped after 100 epochs in each experiment. During inference, Robust Imitative Planning [12] is applied. Sampling is applied on the likelihood model considering a predetermined number of predictions G = 10. The top D = 5 predictions of the model (or of multiple models when using ensembles) are selected according to their log-likelihood based certainty scores. In this paper, we demonstrate the predictive performance of our model using the weightedADE metric:

weightedADE_D(q) := Σ_{d∈D} c̃^(d) · ADE(y^(d)).   (6)

In Eq. 6, c̃ represents the confidence score for each prediction, computed by applying a softmax to the log-likelihood scores as in the Shifts benchmark. The joint assessment of uncertainty quality and model robustness is analyzed following the same experimental setup and evaluation methodology as in the Shifts benchmark, which applies error and F1 retention curves for this purpose. Lower R-AUC (area under the error retention curve) and higher F1-AUC (area under the F1 retention curve) indicate better calibration performance. In these retention curves, weightedADE is the error and the retention fraction is based on the per-prediction uncertainty score U. Mean averaging is applied while computing U, which is based on the per-plan log-likelihoods, as well as for the aggregation of ensemble results. An improvement in the area under these curves can be achieved either with a better model providing lower overall error, or with better uncertainty estimates such that more errorful predictions are rejected earlier [29]. The secondary loss incentivizes the model to align the uncertainty with the average displacement error (ADE) while training the model. Our experiments are conducted by setting β (see Eq. 5) to 200, and ade_th and c_th (see Eq. 4) to 0.8 and 0.6, respectively, for both the BC and DIM models. In order to scale the robustness (ade_i) and certainty (c_i) measures to a proper range for the bounding function tanh or for direct use, we apply post-processing. We scale the ade_i values by 0.5 so that samples with an ADE below 1.6 are assigned as accurate. Our analysis on the Shifts dataset shows that the log-likelihood based certainty measures c_i of most samples lie within the [0, 100] range, so we clip values below 0 and above 100 to 0 and 100, respectively, and then normalize these certainty measures to the [0, 1] range for direct use in Eq. 4. The applied post-processing steps and hyperparameter selection should be adapted to each performed task considering the initial training epochs (see Sect. 5.3 for details).
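A minimal sketch of Eq. (6) and the softmax-based confidence weighting described above (illustrative, with assumed array shapes):

```python
# Sketch: weightedADE (Eq. 6) for the D top trajectories of one agent.
# preds: (D, T, 2), gt: (T, 2), log_liks: (D,) per-trajectory scores.
import numpy as np

def weighted_ade(preds, gt, log_liks):
    ades = np.linalg.norm(preds - gt, axis=2).mean(axis=1)   # ADE per trajectory
    c = np.exp(log_liks - log_liks.max())
    c = c / c.sum()                                          # softmax confidences c~
    return float((c * ades).sum())
```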


Fig. 2. Figure shows the correlation of accuracy with the variance of the top-3 predicted trajectories based on certainty score, with/without our loss, for the DIM baseline on the Shifts dataset. With the EaUC loss, we observe lower variance for accurate predictions and higher variance for inaccurate predictions.

Correlation Analysis: Table 1 shows the correlation between the uncertainty measure and the prediction error using the Pearson correlation coefficient (r) [2] and the classification of samples as accurate and inaccurate using the AUROC metric.

– We observe 17.22% and 19.13% improvement in the correlation between uncertainty and error as quantified by Pearson's r when the L_EaUC loss is incorporated into the BC and DIM models, respectively (Table 1).
– Setting the accurate prediction threshold to 1.6, we observe 6.55% and 8.02% improvement in AUROC when L_EaUC is incorporated into the BC and DIM models, respectively, for detecting samples as accurate versus inaccurate based on the uncertainty measures (Table 1).

Figure 2 provides example cases for both accurate and inaccurate trajectory predictions for the DIM baseline, which shows that the EaUC loss improves model calibration.

Quality Assessment of Uncertainty and Model Performance Using Retention Curves: Table 2 shows the results for the joint quality assessment of uncertainty and robustness using the R-AUC, F1-AUC, and F1@95% metrics. Predictive performance is computed with the weightedADE metric, which is also the error metric of the retention plots. Figure 3 shows error retention plots and F1-weightedADE retention plots for both BC and DIM baselines with/without our calibration loss. The results in Table 2 and Fig. 3 show that:


Table 1. Evaluating the correlation between uncertainty estimation and prediction error with the Pearson correlation coefficient r and the AUROC metric for binary classification of samples as accurate and inaccurate.

Model (MA, K = 1)   Pearson corr. coeff. r ↑        AUROC ↑
                    In      Shifted  Full           In      Shifted  Full
BC                  0.476   0.529    0.482          0.756   0.817    0.763
BC-EaUC∗            0.557   0.624    0.565          0.811   0.833    0.813
DIM                 0.475   0.522    0.481          0.754   0.816    0.761
DIM-EaUC∗           0.567   0.620    0.573          0.819   0.838    0.822

Table 2. Predictive robustness (weightedADE) and joint assessment of uncertainty and robustness performance (F1-AUC, F1@95%, R-AUC) of the two baselines with/without EaUC loss. Lower is better for weightedADE and R-AUC; higher is better for F1-AUC and F1@95%.

Model (MA, K = 1)   weightedADE ↓          R-AUC ↓               F1-AUC ↑              F1@95% ↑
                    In     Shifted Full    In     Shifted Full   In     Shifted Full   In     Shifted Full
BC                  1.475  1.523   1.481   0.423  0.357   0.416  0.631  0.646   0.633  0.848  0.884   0.852
BC-EaUC∗            1.450  1.521   1.456   0.336  0.290   0.330  0.652  0.666   0.654  0.875  0.884   0.876
DIM                 1.465  1.523   1.472   0.418  0.358   0.411  0.632  0.647   0.634  0.851  0.884   0.855
DIM-EaUC∗           1.388  1.514   1.403   0.328  0.290   0.323  0.651  0.664   0.653  0.881  0.888   0.882

Table 3. Impact of assigning higher weights to the LC class in the L_EaUC loss on robustness and calibration performance.

Model (MA, K = 1)   WeightedADE ↓   R-AUC ↓
BC-EaUC             1.880           0.355
BC-EaUC*            1.456           0.330
DIM-EaUC            1.697           0.344
DIM-EaUC*           1.403           0.323

Quality Assessment of Uncertainty and Model Performance Using Retention Curves: Table 2 shows the results of the joint quality assessment of uncertainty and robustness using the R-AUC, F1-AUC and F1@95% metrics. Predictive performance is computed with the weightedADE metric, which is also the error metric of the retention plots. Figure 3 shows the error retention plots and F1-weightedADE retention plots for both the BC and DIM baselines with and without our calibration loss. The results in Table 2 and Fig. 3 show that:

– There is an improvement of 1.69% and 4.69% in weightedADE, and of 20.67% and 21.41% in R-AUC, for the BC and DIM models, respectively, on the full dataset. We outperform both baselines, providing well-calibrated uncertainties in addition to improved robustness.
– R-AUC, F1-AUC and F1@95% improve for both models on the Full, In and Shifted datasets with our L_EaUC loss, which indicates better calibration performance under all three metrics.
– weightedADE is higher on the Shifted dataset than on the In dataset, confirming that the error is higher for out-of-distribution data.


Fig. 3. Error and F1-weightedADE as a function of the amount of retained samples, based on uncertainty scores. The retention plots are presented for the robust imitative planning method applied to the two baselines (BC and DIM), with and without the EaUC loss, using the full dataset. The retention fraction is based on the per-prediction-request uncertainty metric U, which is computed by Mean Averaging (MA) of the per-trajectory log-likelihood scores. Lower weightedADE and higher F1-weightedADE are better.

Impact of Weight Assignment to the Class of Accurate & Certain Samples (LC) in the EaUC Loss: In safety-critical scenarios, it is very important to be certain about a prediction when that prediction is accurate, in addition to having a greater number of accurate samples. To incentivize the model during training towards predictions that are both accurate and certain, we experimented with giving a higher weight to the LC class of samples while computing Eq. 4. Equation 7 shows how we assign higher weights to these samples in our loss, where γ > 1. We force the algorithm to better learn the samples of this class and obtain both improved calibration (well-calibrated uncertainties) and improved robustness (lower weightedADE) with this approach, as shown in Table 3.

L_{EaUC} = -\log\Big(\frac{\gamma \cdot n_{LC} + n_{HU}}{\gamma \cdot n_{LC} + n_{LU} + n_{HC} + n_{HU}}\Big)    (7)

BC-EaUC/DIM-EaUC and BC-EaUC*/DIM-EaUC* denote the results according to Eq. 4 and Eq. 7, respectively. BC-EaUC* and DIM-EaUC* provide better performance in terms of robustness (weightedADE) and model calibration (R-AUC) compared to BC-EaUC and DIM-EaUC; hence all experiments in this section are reported with our loss according to Eq. 7, with γ = 3, for both models. Our analysis showed that increasing the γ value above some threshold causes degradation in performance. Table 3 shows the results on the Full dataset. Another observation in this experiment is that, even though BC-EaUC and DIM-EaUC provide slightly higher weightedADE compared to their corresponding baselines (BC and DIM in Table 2), they provide better-quality uncertainty estimates, as quantified by R-AUC.

Study with Ensembles: Table 4 shows that, using ensembles, better performance is achieved both in predictive model performance and in the joint assessment of uncertainty and robustness, for both BC and BC-EaUC*. EaUC together with ensembles provides the best results.
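A differentiable sketch of the weighted loss in Eq. 7 is given below. The tanh-based soft class memberships are our assumption, motivated by the bounding function mentioned earlier; the exact soft counting of Eq. 4 is not reproduced here.

```python
import torch

def eauc_loss(ade, certainty, ade_th=0.8, c_th=0.6, gamma=3.0, eps=1e-8):
    """Weighted error-aligned uncertainty loss (sketch of Eq. 7).

    ade:       per-sample (scaled) average displacement error, tensor
    certainty: per-sample certainty in [0, 1], tensor
    """
    # Soft indicators: close to 1 when the condition holds, decaying otherwise.
    accurate = 1.0 - torch.tanh(torch.relu(ade - ade_th))       # low-error samples
    certain = 1.0 - torch.tanh(torch.relu(c_th - certainty))    # high-certainty samples

    n_lc = torch.sum(accurate * certain)                  # accurate & certain
    n_lu = torch.sum(accurate * (1.0 - certain))          # accurate & uncertain
    n_hc = torch.sum((1.0 - accurate) * certain)          # inaccurate & certain
    n_hu = torch.sum((1.0 - accurate) * (1.0 - certain))  # inaccurate & uncertain

    numerator = gamma * n_lc + n_hu
    denominator = gamma * n_lc + n_lu + n_hc + n_hu
    return -torch.log(numerator / (denominator + eps) + eps)
```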


Table 4. Impact analysis of the L_EaUC loss on ensembles using the aforementioned metrics for robustness and joint assessment.

Model (MA)          WeightedADE ↓            R-AUC ↓                  F1-AUC ↑                 F1@95% ↑
                    In     Shifted  Full     In     Shifted  Full     In     Shifted  Full     In     Shifted  Full
BC (K = 1)          1.475  1.523    1.481    0.423  0.357    0.416    0.631  0.646    0.633    0.848  0.884    0.852
BC (K = 3)          1.420  1.433    1.421    0.392  0.323    0.384    0.639  0.651    0.640    0.853  0.887    0.857
BC-EaUC* (K = 1)    1.450  1.521    1.456    0.336  0.290    0.330    0.652  0.666    0.654    0.875  0.884    0.876
BC-EaUC* (K = 3)    1.290  1.373    1.300    0.312  0.271    0.308    0.647  0.659    0.648    0.890  0.898    0.891

5.2 UCI Regression

We evaluate our method on the UCI [9] datasets following [19] and [13]. We use a Bayesian neural network (BNN) with Monte Carlo dropout [13] for approximate Bayesian inference. In this setup, we use a neural network with two fully-connected hidden layers of 100 neurons each and ReLU activations. A dropout layer with probability 0.5 is used after each hidden layer, with 20 Monte Carlo samples for approximate Bayesian inference. Following [1], we find the optimal hyperparameters for each dataset using Bayesian optimization with HyperBand [11] and train the models with the SGD optimizer and a batch size of 128. The predictive variance over the Monte Carlo forward passes is used as the uncertainty measure in the EaUC loss function.

A BNN trained with the secondary EaUC loss yields lower predictive negative log-likelihood and lower RMSE on multiple UCI datasets, as shown in Table 5. Figure 4 shows the model error retention plot based on the model uncertainty estimates for the UCI boston housing regression dataset. The BNN trained with the loss-calibrated approximate inference framework using the EaUC loss yields lower model error as the most certain samples are retained, with 15.5% lower AUC compared to the baseline BNN and 31.5% lower AUC compared to random retention (which does not consider the uncertainty estimates). These results demonstrate that EaUC enables obtaining informative and well-calibrated uncertainty estimates from neural networks for reliable decision making.
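The following sketch mirrors the described setup (two hidden layers of 100 units, dropout 0.5, 20 Monte Carlo samples); the training loop, SGD settings and the HyperBand search are omitted, and the class and function names are ours:

```python
import torch
import torch.nn as nn

class MCDropoutRegressor(nn.Module):
    """Two hidden layers of 100 units, ReLU, dropout p=0.5 after each
    hidden layer, matching the UCI setup described in the text."""
    def __init__(self, in_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 100), nn.ReLU(), nn.Dropout(p=0.5),
            nn.Linear(100, 100), nn.ReLU(), nn.Dropout(p=0.5),
            nn.Linear(100, 1),
        )

    def forward(self, x):
        return self.net(x)

def mc_predict(model, x, n_samples=20):
    """Monte Carlo forward passes with dropout kept active. Returns the
    predictive mean and variance; the variance is the uncertainty measure
    fed into the EaUC loss."""
    model.train()  # keep dropout active at inference time
    with torch.no_grad():
        preds = torch.stack([model(x) for _ in range(n_samples)], dim=0)
    return preds.mean(dim=0), preds.var(dim=0)
```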

Table 5. Average predictive negative log-likelihood (NLL, lower is better) and test root mean squared error (RMSE, lower is better) on the UCI regression benchmark.

UCI dataset   Test NLL ↓                Test RMSE ↓
              BNN       BNN (EaUC)      BNN      BNN (EaUC)
boston        2.424     2.203           2.694    2.317
concrete      3.085     3.016           5.310    4.580
energy        1.675     1.643           0.973    0.671
yacht         2.084     1.684           2.024    0.787
power         2.885     2.829           4.355    4.018
naval         −4.729    −4.294          0.001    0.001
kin8nm        −0.930    −0.967          0.086    0.074
protein       2.888     2.851           4.351    4.181

Fig. 4. Error retention plot for BNN on UCI boston housing dataset to evaluate the quality of model uncertainty (lower AUC is better).

5.3 Hyperparameter Selection

Our solution requires three hyperparameters: the thresholds ade_th and c_th, which assign each sample to the appropriate prediction category (certain & accurate; certain & inaccurate; uncertain & inaccurate; uncertain & accurate), and the β parameter, which sets the relative weighting of our secondary loss with respect to the primary loss. Initially we train the model without the secondary calibration loss for a few epochs; we then analyze both the ADE and the log-likelihood values to set the thresholds required for the calibration loss, following the same strategy as [21]. We perform a grid search to tune the threshold parameters. Post-processing is applied to the certainty measures to map their values to the [0, 1] range for direct use in our loss function. β is chosen such that the introduced secondary loss contributes significantly to the final loss; we select this value such that the secondary loss is at least half of the primary loss. For the regression experiments, we found the optimal hyperparameters for each dataset using Bayesian optimization with HyperBand [11].
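As an illustration of such a grid search, the sketch below tunes the two thresholds on a held-out split; the selection criterion (maximizing the fraction of samples that are either accurate & certain or inaccurate & uncertain) is our assumption, since the text only states that a grid search is performed:

```python
import itertools
import numpy as np

def grid_search_thresholds(ade_val, certainty_val,
                           ade_grid=(0.6, 0.8, 1.0),
                           c_grid=(0.4, 0.5, 0.6, 0.7)):
    """Tune (ade_th, c_th) on a held-out split by maximizing the fraction
    of error-aligned samples (accurate & certain, or inaccurate & uncertain)."""
    ade = np.asarray(ade_val)
    cert = np.asarray(certainty_val)
    best = None
    for ade_th, c_th in itertools.product(ade_grid, c_grid):
        accurate = ade < ade_th
        certain = cert > c_th
        aligned = np.mean(accurate == certain)  # LC or HU samples
        if best is None or aligned > best[0]:
            best = (aligned, ade_th, c_th)
    return best  # (aligned fraction, ade_th, c_th)
```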

6 Conclusions

Reliable uncertainty estimation is crucial for safety-critical systems to enable safe decision making. In this paper, we proposed a novel error aligned uncertainty (EaU) optimization method and introduced a differentiable EaU calibration loss that can be used as a secondary penalty loss to incentivize models to yield well-calibrated uncertainties in addition to improved robustness. To the best of our knowledge, this is the first work to introduce a trainable uncertainty calibration loss function to obtain reliable uncertainty estimates for time-series based multi-variate regression tasks such as trajectory prediction. We evaluated our method on a real-world vehicle motion prediction task with the large-scale Shifts dataset involving distributional shifts. We also showed that the proposed method can be applied to any regression task by evaluating it on the UCI regression benchmark. The experimental results show that the proposed method achieves well-calibrated models yielding reliable uncertainty quantification in addition to improved robustness, even under real-world distributional shifts. The EaU loss can also be utilized for post-hoc calibration of pretrained models, which we will explore in future work. We hope our method can be used along with well-established robust baselines to advance the state-of-the-art in uncertainty estimation, benefiting various safety-critical real-world AI applications.

Acknowledgements. This project has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No 956123. This research has also received funding from the Federal Ministry of Transport and Digital Infrastructure of Germany in the project Providentia++ (01MM19008F).

References

1. Antorán, J., Allingham, J., Hernández-Lobato, J.M.: Depth uncertainty in neural networks. Adv. Neural Inf. Process. Syst. 33, 10620–10634 (2020)


2. Benesty, J., Chen, J., Huang, Y., Cohen, I.: Pearson correlation coefficient, pp. 37–40 (2009) 3. Berger, J.O.: Statistical Decision Theory and Bayesian Analysis. Springer, New York (1985). https://doi.org/10.1007/978-1-4757-4286-2 4. Blundell, C., Cornebise, J., Kavukcuoglu, K., Wierstra, D.: Weight uncertainty in neural network. In: International Conference on Machine Learning, pp. 1613–1622 (2015) 5. Casas, S., Gulino, C., Suo, S., Luo, K., Liao, R., Urtasun, R.: Implicit latent variable model for scene-consistent motion forecasting. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12368, pp. 624–641. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58592-1 37 6. Chai, Y., Sapp, B., Bansal, M., Anguelov, D.: Multipath: multiple probabilistic anchor trajectory hypotheses for behavior prediction. Conference on Robot Learning (CoRL) (2019) 7. Codevilla, F., M¨ uller, M., L´ opez, A., Koltun, V., Dosovitskiy, A.: End-to-end driving via conditional imitation learning. In: IEEE International Conference on Robotics and Automation (ICRA), pp. 4693–4700 (2018) 8. Cui, H., et al.: Multimodal trajectory predictions for autonomous driving using deep convolutional networks. In: International Conference on Robotics and Automation (ICRA) (2019) 9. Dua, D., Graff, C.: UCI machine learning repository (2017). http://archive.ics.uci. edu/ml 10. Dusenberry, M.W., et al.: Efficient and scalable Bayesian neural nets with rank-1 factors (2020) 11. Falkner, S., Klein, A., Hutter, F.: BOHB: robust and efficient hyperparameter optimization at scale. In: International Conference on Machine Learning, pp. 1437– 1446. PMLR (2018) 12. Filos, A., Tigas, P., McAllister, R., Rhinehart, N., Levine, S., Gal, Y.: Can autonomous vehicles identify, recover from, and adapt to distribution shifts? In: International Conference on Machine Learning (2020) 13. Gal, Y., Ghahramani, Z.: Dropout as a Bayesian approximation: representing model uncertainty in deep learning. In: International Conference on Machine Learning, pp. 1050–1059. PMLR (2016) 14. Gao, J., et al.: VectorNet: encoding HD maps and agent dynamics from vectorized representation. In: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 11522–11530 (2020) 15. Ghahramani, Z.: Probabilistic machine learning and artificial intelligence. Nature 521(7553), 452–459 (2015) 16. Guo, C., Pleiss, G., Sun, Y., Weinberger, K.Q.: On calibration of modern neural networks. In: Proceedings of the 34th International Conference on Machine Learning, vol. 70, pp. 1321–1330. JMLR.org (2017) 17. Hong, J., Sapp, B., Philbin, J.: Rules of the road: predicting driving behavior with a convolutional model of semantic interactions. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 8454–8462 (2019) 18. Huang, X., McGill, S.G., Williams, B.C., Fletcher, L., Rosman, G.: Uncertaintyaware driver trajectory prediction at urban intersections. In: 2019 International Conference on Robotics and Automation (ICRA), pp. 9718–9724. IEEE (2019) 19. Immer, A., Bauer, M., Fortuin, V., R¨ atsch, G., Emtiyaz, K.M.: Scalable marginal likelihood estimation for model selection in deep learning. In: International Conference on Machine Learning, pp. 4563–4573. PMLR (2021)


20. ISO: Road vehicles - safety of the intended functionality. ISO/PAS 21448: 2019(EN), (Standard) (2019) 21. Krishnan, R., Tickoo, O.: Improving model calibration with accuracy versus uncertainty optimization. Adv. Neural. Inf. Process. Syst. 33, 18237–18248 (2020) 22. Kuleshov, V., Fenner, N., Ermon, S.: Accurate uncertainties for deep learning using calibrated regression. In: Proceedings of the 35th International Conference on Machine Learning, vol. 80, pp. 2796–2804 (2018) 23. Kumar, A., Liang, P.S., Ma, T.: Verified uncertainty calibration. In: Advances in Neural Information Processing Systems, vol. 32 (2019) 24. Lakshminarayanan, B., Pritzel, A., Blundell, C.: Simple and scalable predictive uncertainty estimation using deep ensembles. In: Advances in Neural Information Processing Systems, pp. 6402–6413 (2017) 25. Liu, J.Z., Lin, Z., Padhy, S., Tran, D., Bedrax-Weiss, T., Lakshminarayanan, B.: Simple and principled uncertainty estimation with deterministic deep learning via distance awareness. arXiv preprint arXiv:2006.10108 (2020) 26. Liu, Y., Zhang, J., Fang, L., Jiang, Q., Zhou, B.: Multimodal motion prediction with stacked transformers. In: 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 7573–7582 (2021) 27. Ma, C., Huang, Z., Xian, J., Gao, M., Xu, J.: Improving uncertainty calibration of deep neural networks via truth discovery and geometric optimization. In: Uncertainty in Artificial Intelligence (UAI), pp. 75–85 (2021) 28. Makansi, O., Ilg, E., Cicek, O., Brox, T.: Overcoming limitations of mixture density networks: a sampling and fitting framework for multimodal future prediction. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 7144–7153 (2019) 29. Malinin, A., et al.: Shifts: a dataset of real distributional shift across multiple largescale tasks. In: Thirty-Fifth Conference on Neural Information Processing Systems Datasets and Benchmarks Track (2021) 30. Ovadia, Y., et al.: Can you trust your model’s uncertainty? Evaluating predictive uncertainty under dataset shift. In: Advances in Neural Information Processing Systems, vol. 32 (2019) 31. Phan-Minh, T., Grigore, E.C., Boulton, F.A., Beijbom, O., Wolff, E.M.: Covernet: multimodal behavior prediction using trajectory sets. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 14062–14071 (2020) 32. Platt, J.C.: Probabilistic outputs for support vector machines and comparisons to regularized likelihood methods. In: Advances in Large Margin Classifiers, pp. 61–74 (1999) 33. Pytorch: Weighted adam optimizer (2021). https://pytorch.org/docs/stable/ generated/torch.optim.AdamW.html 34. Rhinehart, N., McAllister, R., Levine, S.: Deep imitative models for flexible inference, planning, and control. CoRR (2018) 35. Van Amersfoort, J., Smith, L., Teh, Y.W., Gal, Y.: Uncertainty estimation using a single deep deterministic neural network, vol. 119, pp. 9690–9700 (2020) 36. Welling, M., Teh, Y.W.: Bayesian learning via stochastic gradient langevin dynamics. In: Proceedings of the 28th International Conference on Machine Learning (ICML 2011), pp. 681–688 (2011) 37. Zhang, R., Li, C., Zhang, J., Chen, C., Wilson, A.G.: Cyclical stochastic gradient MCMC for Bayesian deep learning. In: International Conference on Learning Representations (2020)

PAI3D: Painting Adaptive Instance-Prior for 3D Object Detection Hao Liu(B) , Zhuoran Xu, Dan Wang, Baofeng Zhang, Guan Wang, Bo Dong, Xin Wen, and Xinyu Xu Autonomous Driving Division of X Research Department, JD Logistics, Beijing, China {liuhao163,xuzhuoran,wangdan257,zhangbaofeng13,wangguan151,dongbo14, wenxin16,xinyu.xu}@jd.com

Abstract. 3D object detection is a critical task in autonomous driving. Recently, multi-modal fusion-based 3D object detection methods, which combine the complementary advantages of LiDAR and camera, have shown great performance improvements over mono-modal methods. However, so far, no method has attempted to utilize instance-level contextual image semantics to guide 3D object detection. In this paper, we propose a simple and effective Painting Adaptive Instance-prior for 3D object detection (PAI3D) to fuse instance-level image semantics flexibly with point cloud features. PAI3D is a multi-modal sequential instance-level fusion framework. It first extracts instance-level semantic information from images; the extracted information, including object categorical label, point-to-object membership and object position, is then used to augment each LiDAR point in the subsequent 3D detection network to guide and improve detection performance. PAI3D outperforms the state-of-the-art with a large margin on the nuScenes dataset, achieving 71.4 in mAP and 74.2 in NDS on the test split. Our comprehensive experiments show that instance-level image semantics contribute the most to the performance gain, and PAI3D works well with any good-quality instance segmentation model and any modern point cloud 3D encoder, making it a strong candidate for deployment on autonomous vehicles.

Keywords: 3D object detection · Instance-level multi-modal fusion

1 Introduction

3D object detection is critical for autonomous driving. LiDAR, camera and Radar are the typical sensors equipped on autonomous driving cars to facilitate 3D object detection.


Fig. 1. PAI3D fuses point cloud and images at instance-level for 3D object detection. We utilize image instance segmentation to obtain 3D instance priors, including instance-level semantic label and 3D instance center, to augment the raw LiDAR points. We select one frame from the nuScenes dataset and show the front half point cloud and the three frontal image views. Best viewed in color.

To date, point-cloud-only 3D object detection methods [22,33,39,40,44,47] achieve significantly better performance than camera-only monocular 3D object detection methods [8,26,37] on almost all large-scale autonomous driving datasets, such as nuScenes [1] and the Waymo Open Dataset [32], making LiDAR the primary, first-choice sensor due to its accurate and direct depth sensing ability. However, point cloud data may be insufficient in some scenarios, e.g., where objects are far away with very few points reflected, or where objects are nearby but very small or thin. Recently, more and more works employ both point cloud and images for 3D object detection [7,10,21,28,35,38,42,45,46], aiming to exploit the complementary characteristics of 3D and 2D sensors to achieve multi-modal fusion, and these methods have demonstrated a great performance boost over mono-modal approaches.

Existing multi-modal fusion methods either predict semantic labels from images, which are used as semantic priors to indicate foreground points in the 3D point cloud [35,38], or incorporate implicit pixel features learned from an image encoder into the 3D detection backbone [7,21]. Since the objective of 3D object detection is to identify each individual object instance, most of these methods, however, fail to reveal the instance-level semantics of the 2D images, and hence hardly provide clear guidance for distinguishing instance-level features in 3D, especially for objects of the same category. Therefore, it is necessary to go beyond the existing category-level fusion and explore instance-level fusion for more explicit detection guidance.

Along the line of extracting instance-level information to improve 3D object detection, previous studies [11,44] have shown that representing and detecting objects as points is effective. These methods focus on using centers to facilitate ground-truth matching during training for a single modality, either point cloud or images, and the pros and cons of utilizing instance centers from an additional modality and sharing them throughout the detection pipeline have not been studied yet. Our intuition is that extracting instance centers from images as instance priors and then feeding these instance priors into a 3D object detection


network for a deep fusion with the raw LiDAR points will be highly beneficial for the 3D object detection task. However, a naive, straightforward implementation of such an idea may yield degraded performance due to erroneous semantic labels caused by point-to-pixel projection errors. Such errors can occur easily in an autonomous driving system equipped with multi-modal sensors.

To address the above challenges, in this paper we propose a novel sequential multi-modal fusion method named PAI3D, which utilizes instance priors extracted from image instance segmentation to achieve improved 3D object detection. The term "PAI" stands for Painting Adaptive Instance-prior from 2D images into 3D point cloud, where "Adaptive" refers to adaptively dealing with erroneous semantic labels caused by point-to-pixel projection errors. Figure 1 schematically illustrates the instance-level fusion between point cloud and images.

As shown by the PAI3D architecture in Fig. 2, PAI3D includes three key learnable modules. The Instance Painter predicts instance-level semantic labels from images; it then augments the raw point cloud with these instance-level semantic labels by point-to-pixel projection. Since the augmented point cloud sometimes contains erroneous instance labels, caused by projection mismatch introduced by inaccurate sensor calibration, object occlusions or LiDAR-camera synchronization errors, we design the Adaptive Projection Refiner to correct the erroneous instance labels and output a 3D center as an instance prior for each augmented LiDAR point. Moreover, road scene objects can exhibit large-scale variations, ranging from long buses or trucks to small pedestrians. Although good performance can still be achieved by detecting objects of similar scale from a single feature map associated with one detection head [48], it is generally difficult to accurately detect objects with large-scale variations from a single feature map associated with one detection head. To overcome this difficulty, we propose the Fine-granular Detection Head with multiple feature maps of varying receptive fields to localize specific-scaled objects from specific-scaled feature maps, which greatly improves detection accuracy for objects with large-scale variations.

To summarize, the main contributions of this paper include the following.
(1) We propose a novel instance-level fusion method for 3D object detection, PAI3D, where instance-level semantics, including the semantic label and the 3D instance center, are utilized as priors to augment the raw LiDAR points, resulting in significant performance improvement.
(2) The Adaptive Projection Refiner is proposed to adaptively correct the erroneous instance-level semantics, which makes the proposed method more robust and tolerant to imperfect sensor calibration, object occlusions and LiDAR-camera synchronization errors.
(3) We design the Fine-granular Detection Head with varying receptive fields to detect objects with varying scales. This module creates a fine-granular match between the object scale and the corresponding receptive field in multi-resolution feature maps, enabling more accurate detection of objects with large-scale variations.

We first evaluate PAI3D on the well-known large-scale nuScenes dataset [1]. PAI3D outperforms the state-of-the-art (SOTA) with a large margin, achieving


71.4 mAP and 74.2 NDS on the test split (no external data used). We then present several ablation results to offer insights into the effect of each module, different 3D backbone encoders and instance segmentation quality. Finally, we apply PAI3D to a proprietary autonomous driving dataset. The presented experimental results demonstrate that PAI3D improves 3D detection at far ranges and for small and thin objects.

2 Related Work

3D object detection has been thoroughly studied in recent years. Besides the large body of work that solely employs 3D point cloud, much research has emerged that adopts multi-modal sensors, primarily LiDAR and camera, for 3D object detection (also called fusion-based methods [7,10,21,28,35,38,42,45,46]), utilizing image semantics for improved 3D detection accuracy. Our PAI3D method belongs to this category of multi-modal fusion-based 3D object detection.

2.1 LiDAR Based 3D Object Detection

LiDAR-based 3D detection methods employ the point cloud as the sole input and can be roughly grouped into three categories: view-based, point-based, and voxel-based. View-based methods project points onto either a Bird's-eye View (BEV) (PointPillars [22], PIXOR [40], Complex-YOLO [31], Frustum-PointPillars [27]) or a range view (RSN [33], RangeDet [14], RangeIoUDet [23]) to form a 2D feature map; 2D CNNs are then applied to this feature map to detect 3D objects. Point-based methods (PointRCNN [30], STD [43], 3DSSD [41]) directly predict 3D objects from the extracted point cloud features [29]. Such methods typically employ down-sampling [41] or segmentation [30,43] strategies to efficiently search for objects in the point cloud. Voxel-based methods utilize 3D convolution (VoxelNet [47]) or 3D sparse convolution [15] (SECOND [39], SA-SSD [16], CenterPoint [44]) to encode 3D voxelized features, and most succeeding efforts focus on exploring auxiliary information [16,44] to enhance detection from voxelized data.

2.2 Multi-modal Fusion Based 3D Object Detection

Recently, more and more research has emerged that combines LiDAR and camera sensing for 3D object detection, where rich visual semantics from images are fused with point cloud 3D geometry to improve performance. We group these works into two categories based on how fusion is performed. One-shot fusion methods [7,21,46] exploit implicit information embedded in image features and adopt an end-to-end deep neural network to fuse the two modalities and predict 3D bounding boxes. Sequential fusion methods [10,28,35,38,42,45] first predict semantic labels from images and then augment LiDAR points with these labels by point-to-pixel projection; visual semantics are explicitly exploited in these methods, and 3D detection is done in point cloud space once the semantic labels are acquired in the first phase.


Among one-shot fusion methods, MV3D [7] builds proposal-level fusion by generating proposals from the LiDAR BEV followed by multi-view feature map fusion. AVOD [21] implements feature-level fusion, where two identical but separate feature extractors generate feature maps from both the LiDAR BEV map and the image, followed by multi-modal feature fusion with a region proposal network to generate 3D proposals. Note that the misalignment across multiple views, i.e. the LiDAR BEV and front view and the camera perspective view, can pose great challenges to MV3D and AVOD, where the inconsistency in the fused features degrades the final detection performance. 3D-CVF [46] improves on the cross-view misalignment issue by representing point cloud and images in BEV feature maps so that the two modalities can be well aligned. Nevertheless, the lack of depth in images makes the conversion from the camera perspective view to the LiDAR BEV less accurate.

Among sequential fusion methods, IPOD [42] utilizes image semantic segmentation to label the foreground and background in LiDAR points, and then detects objects from only the foreground points. PointPainting [35] further attaches categorical semantic labels to each LiDAR point, which improves mAP significantly. HorizonLiDAR3D [10] adopts a similar idea to PointPainting, but it employs object detection instead of image semantic segmentation to label the points encompassed by the detection bounding boxes. Inspired by PointPainting, PointAugmenting [36] replaces the hard categorical labels with softer features extracted from the backbone, leading to further improved detection performance. FusionPainting [38] applies semantic segmentation to both image and point cloud, and an attention mechanism is then introduced to fuse the two modalities. MVP [45] utilizes image instance segmentation to generate virtual points that make object point clouds denser and hence easier to detect. MTC-RCNN [28] uses PointPainting in a cascaded manner that leverages 3D box proposals to improve 2D segmentation quality, which is then used to further refine the 3D boxes.

Most existing fusion-based approaches concentrate on feature-level or pixel-to-point-level fusion from image to point cloud. The proposed PAI3D is a sequential fusion method that utilizes instance-level image semantics as 3D instance priors to guide 3D object detection; these 3D instance priors include the object category label, the point-to-object membership and the object position. Our ablation experiment results show that these instance priors are able to greatly boost 3D object detection performance.

3 Painting Adaptive Instance for Multi-modal 3D Object Detection

In this section, we present the proposed PAI3D method; its architecture overview is shown in Fig. 2. Let (x, y, z, r) be a LiDAR point, where x, y, z are the 3D coordinates and r is the reflectance.


Fig. 2. PAI3D architecture overview. PAI3D consists of three main components: Instance Painter extracts 3D instance and semantic priors from image instance segmentation result, Adaptive Projection Refiner corrects the erroneous semantic labels caused by projection mismatch and extract a center for a 3D instance, and Finegranular Detection Head is capable of detecting objects with large-scale variations. False Positive Augmentation (FPA) serves as an additional training strategy to eliminate false positive proposals by online augmentation.

Given a point cloud L = {ℓ_i}_{i=1}^{N} and M views of images {I_j}_{j=1}^{M} that cover the 360-degree surrounding view, our goal is to utilize {I_j}_{j=1}^{M} to obtain, for each point, its semantic label s, the 3D instance C such that ℓ_i ∈ C, and the 3D instance center Ĉ = (C_x, C_y, C_z). We augment each raw point with the semantic label and the 3D instance center, i.e., ℓ_i is represented by (x, y, z, r, s, C_x, C_y, C_z), and we then detect 3D objects from the set of augmented LiDAR points. To achieve this goal, we design the Instance Painter and the Adaptive Projection Refiner to obtain (C, s) and Ĉ, respectively. Moreover, we propose the Fine-granular Detection Head with multiple feature maps of varying receptive fields to detect objects with varying scales.
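A minimal sketch of the resulting augmented point representation is shown below (array layout and function name are ours):

```python
import numpy as np

def augment_points(points_xyzr, semantic_labels, instance_centers):
    """Form the augmented representation (x, y, z, r, s, Cx, Cy, Cz) that is fed
    to the 3D detection network. points_xyzr: (N, 4); semantic_labels: (N,);
    instance_centers: (N, 3), the center of the instance each point belongs to
    (e.g. zeros for background points)."""
    s = semantic_labels.reshape(-1, 1).astype(points_xyzr.dtype)
    return np.concatenate([points_xyzr, s, instance_centers], axis=1)  # (N, 8)
```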

3.1 Instance Painter

The goal of the Instance Painter is to obtain the 3D object instances and their semantic labels (e.g. Car, Cyclist) as instance priors from the point cloud, utilizing the images as "assistive" input (shown in Fig. 3). This is achieved in three steps, described in the following.

First, image instance segmentation is performed to extract the 2D instance masks and semantic labels from the images. Compared with image semantic segmentation, instance segmentation provides not only instance-aware semantic labels but also clear object instance delineation, allowing us to more accurately aggregate points belonging to the same 2D instance into one 3D instance in the subsequent instance aggregation step. In addition, the detailed visual texture, appearance and color information in images yields much higher instance segmentation performance than instance segmentation from point cloud.

Second, the Instance Painter performs a frustum-based 3D-to-2D projection [35,45] to project LiDAR points onto the images. The transformation from the LiDAR coordinate frame to the image coordinate frame is defined as K · T_(cam←ego) · T_(ego_t1←ego_t2) · T_(ego←lidar), where T_(ego←lidar), T_(ego_t1←ego_t2) and T_(cam←ego) are the transformations from the LiDAR frame to the ego frame, from the ego frame at time t2 to that at time t1, and from the ego frame to the target camera frame, respectively. K denotes the target camera's intrinsic matrix.
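A sketch of this projection step is given below, assuming 4x4 homogeneous transforms and a 3x3 intrinsic matrix; the camera forward-axis convention and the helper name are our assumptions:

```python
import numpy as np

def project_lidar_to_image(points_xyz, T_ego_from_lidar, T_egot1_from_egot2,
                           T_cam_from_ego, K):
    """Project LiDAR points into a camera image. All transforms are 4x4
    homogeneous matrices; K is the 3x3 camera intrinsic matrix."""
    n = points_xyz.shape[0]
    pts_h = np.concatenate([points_xyz, np.ones((n, 1))], axis=1).T  # 4 x N

    # LiDAR frame -> ego frame -> ego frame at camera timestamp -> camera frame
    pts_cam = T_cam_from_ego @ T_egot1_from_egot2 @ T_ego_from_lidar @ pts_h

    # Keep points in front of the camera (z forward assumed), apply intrinsics.
    in_front = pts_cam[2, :] > 0.1
    uvw = K @ pts_cam[:3, in_front]
    uv = uvw[:2, :] / uvw[2:3, :]   # pixel coordinates (u, v)
    return uv.T, in_front
```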


Fig. 3. Instance Painter workflow. Best viewed in color. It aims to obtain the 3D object instances and their semantic labels to facilitate the 3D object detection task in the second stage. First, it uses an image instance segmentation model to extract the 2D instance masks and semantic labels. Then it projects the point cloud onto the 2D instance masks to obtain 3D instance priors with semantic labels. However, the aggregated 3D instance may contain points with erroneous semantic labels, e.g. the points at the top of car #1 are actually building points. This problem is solved by the Adaptive Projection Refiner.

Finally, the Instance Painter outputs a 3D instance prior by aggregating the LiDAR points that are projected into the same 2D object instance. Note that two special cases need to be taken care of. First, the front and rear portions of an object can appear in two images of adjacent camera views due to truncation at the image border; in this case we merge the two point clusters projected into the two images of the same object into one complete 3D instance. Second, an object might be projected into two images of adjacent camera views if it occurs in the overlapping field of view of two adjacent cameras, and hence the 2D instance label and instance mask predicted from the two images might conflict with each other; in this case the instance label and instance mask with the higher segmentation score are chosen for 3D instance aggregation.
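The per-point painting itself can be a simple mask lookup, as in the sketch below (our own simplification; the cross-view merging described above is omitted):

```python
import numpy as np

def paint_projected_points(uv, instance_mask, semantic_mask):
    """Look up the 2D instance id and semantic label under each projected
    point. uv: (M, 2) pixel coordinates from the projection step; the masks
    are (H, W) integer maps where -1 (here) denotes background/no instance."""
    h, w = instance_mask.shape
    u = np.round(uv[:, 0]).astype(int)
    v = np.round(uv[:, 1]).astype(int)
    inside = (u >= 0) & (u < w) & (v >= 0) & (v < h)

    instance_ids = np.full(uv.shape[0], -1, dtype=instance_mask.dtype)
    labels = np.full(uv.shape[0], -1, dtype=semantic_mask.dtype)
    instance_ids[inside] = instance_mask[v[inside], u[inside]]
    labels[inside] = semantic_mask[v[inside], u[inside]]
    return instance_ids, labels
```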

3.2 Adaptive Projection Refiner

In a real-world autonomous car, many factors may inevitably lead to erroneous semantic labels in a 3D instance prior; these factors include imprecise 2D instance segmentation (especially at object boundaries), object occlusions, and point-to-pixel projection errors caused by inaccurate sensor calibration and/or multi-sensor synchronization offsets. In addition, the projection path from point cloud to image is a frustum-shaped 3D subspace, implying that all the points in this subspace will be projected to the same instance mask (Fig. 4 (a)), which makes it prone to merging a 3D instance with points from a distant object (Fig. 4 (b)). Therefore, if a mean center is directly calculated from the projected points, this mean center will deviate too much from the true 3D instance center (Fig. 4 (b)).


Fig. 4. Adaptive Projection Refiner. 3D instance priors obtained from Instance Painter may not accurately capture object’s true shape and scale due to many inevitable factors. Mean center usually deviates too much from the real position of the 3D instance center (a) and (b). Adaptive Projection Refiner refines the shape of the 3D instance by detecting the salient cluster closest to the ego car, computing cluster medoid as the 3D instance center (c), and employing a cascaded attention module to adaptively fuse the features from all modalities (d).

To solve these problems, we design the Adaptive Projection Refiner to refine the 3D instance prior that may contain erroneous semantic labels and to compute the medoid as the center vector representing the 3D instance. A cascaded attention module is then proposed to adaptively fuse the raw LiDAR points, semantic labels, and instance centers.

Salient Cluster Detection. The discovered 3D instances and the background LiDAR points with erroneous labels can manifest in arbitrary shapes and scales. We propose to utilize density-based spatial clustering, i.e., DBSCAN [12] or its efficient variants [18,24], to refine the 3D instance shape by detecting the salient cluster towards the ego car and removing the background cluster. The salient cluster here denotes the cluster with the largest number of LiDAR points that is closest to the ego car. Moreover, the salient cluster medoid is computed to represent the 3D instance center, since the medoid is more robust to outliers than the mean center (Fig. 4 (c)).

Cascaded Attention. We design a cascaded channel attention mechanism to learn how to adaptively fuse the raw LiDAR points, semantic labels, and the 3D instance centers, which are fed to this module as channels. Specifically, each attention component uses a multilayer perceptron (MLP) to obtain a weight mask over all channels and then produces features as a weighted sum of the channels. These learned features are then encoded by a PointPillars [22] or VoxelNet [47] encoder. To preserve point-level features, we use a skip connection to concatenate the voxelized features with the raw LiDAR points (Fig. 4 (d)).

Discussion. The 3D instance center plays two important roles in 3D object detection. First, it represents the membership of a point to a potential object, allowing the detector to more easily distinguish the boundaries of different objects. Second, it encodes the object position and provides a significant clue for the location regression task, leading to increased position prediction accuracy in 3D object detection.
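A sketch of salient cluster detection and medoid extraction is shown below, using scikit-learn's DBSCAN; the clustering parameters and the exact scoring of "salient" (most points, then closest to the ego car at the origin) are our interpretation of the description above:

```python
import numpy as np
from sklearn.cluster import DBSCAN

def salient_cluster_medoid(points_xyz, eps=0.5, min_samples=5):
    """Refine one painted 3D instance: cluster its points with DBSCAN, keep
    the salient cluster (most points, then closest to the ego car), and return
    that cluster's medoid as the 3D instance center. eps and min_samples are
    illustrative values, not taken from the paper."""
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(points_xyz)

    best_label, best_score = None, None
    for lbl in set(labels) - {-1}:                 # -1 marks noise points
        cluster = points_xyz[labels == lbl]
        dist = np.linalg.norm(cluster.mean(axis=0))
        score = (len(cluster), -dist)              # many points, near ego
        if best_score is None or score > best_score:
            best_label, best_score = lbl, score

    cluster = points_xyz if best_label is None else points_xyz[labels == best_label]

    # Medoid: the cluster point minimizing total distance to all other points.
    d = np.linalg.norm(cluster[:, None, :] - cluster[None, :, :], axis=-1)
    return cluster[np.argmin(d.sum(axis=1))]
```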


Fig. 5. A popular detection head structure (a) uses a single aggregated feature map to detect objects of all scales. Our proposed Fine-granular Detection Head (b) picks up a feature map of appropriate receptive field from pyramid feature maps to detect objects with specific scale. The size of the receptive field of a feature map is positively correlated with the object scale.

The Instance Painter and the Adaptive Projection Refiner form the first stage of PAI3D. This stage provides a wealth of guidance information to the 3D detection network in the second stage, including the object semantic label, the point-to-object membership, and the object position, before the 3D detection network even begins to detect 3D objects. Experimental results in Sect. 4 show that the two modules in the first stage bring significant 3D detection improvement.

3.3 Additional Enhancements for 3D Object Detection

Fine-Granular Detection Head. Detecting objects of various scales is crucial for a modern 3D detector. A recently proposed learning strategy, the Group Head [48], divides object categories into several groups based on object shapes or sizes, and a detection head is then assigned to each group to capture the intra-scale variations among its objects. However, the common Group Head takes a single feature map as input, meaning that the features it receives from the backbone are at a single scale level. Recently proposed detectors, e.g., CenterPoint [44] and MVP [45], aggregate three scale levels into one feature map to alleviate this problem, but they still use one feature map for all object scales (Fig. 5 (a)). A recent study (YOLOF [6]) shows that passing multiple scaled feature maps to multiple detection heads is essential for improving detection performance.

We propose the Fine-granular Detection Head, which detects specific-scaled objects with specific-scaled feature maps (Fig. 5 (b)). In detail, an FPN-like [20] (e.g. PANet [25], BiFPN [34]) detection neck is used to generate a series of feature maps with different scales of receptive fields, and a Group Head is then attached to each generated feature map so that each detection head becomes scale aware.


Consequently, we can dispatch differently scaled objects of different categories to different scale-aware heads. Under this design, a scale-aware head naturally picks up a feature map with an appropriately scaled receptive field to match the object scale. Likewise, this module also improves the detection performance for objects with large intra-category scale variations by simultaneously learning the intra-scale variance of one category with multiple detection heads.

False Positive Augmentation. Typically, sparse LiDAR points are more prone to generating false positive detections; hence we propose False Positive Augmentation to address this issue. After training from scratch is complete, a fine-tuning stage is invoked, at which point we build an offline database of false positive proposals by running inference on the whole training dataset. We then randomly sample false positive examples from this database and paste them into the raw point cloud, which is repeated for each successive fine-tuning step. The pasted false positive examples do not require extra annotations, as they are recognized as background LiDAR points. Our ablation experiments demonstrate the effectiveness of this strategy in boosting detection performance.
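A minimal sketch of such scale-aware heads is given below; the channel sizes, the number of groups and the 1x1-convolution heads are illustrative simplifications rather than the paper's exact head design:

```python
import torch.nn as nn

class FineGranularHeads(nn.Module):
    """Sketch of scale-aware detection heads: each category group reads the
    pyramid level whose receptive field matches its object scale (small
    objects from the high-resolution level, large objects from the coarse
    one)."""

    def __init__(self, level_channels=(64, 128, 256), outputs_per_group=8):
        super().__init__()
        self.heads = nn.ModuleList(
            nn.Conv2d(c, outputs_per_group, kernel_size=1) for c in level_channels
        )

    def forward(self, pyramid_feats):
        # pyramid_feats: FPN-like feature maps ordered from fine to coarse;
        # group i (e.g. pedestrians/cyclists ... buses/trucks) uses level i.
        return [head(feat) for head, feat in zip(self.heads, pyramid_feats)]
```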

4 Experiments

We evaluate PAI3D against existing 3D object detection methods on the nuScenes [1] dataset and present a series of ablation results to offer insights into the effect of each module, of different 3D encoders, and of instance segmentation quality.

4.1 The NuScenes Dataset

The nuScenes dataset [1] is one of the most well-known public large-scale datasets for autonomous driving. It is widely used by researchers to evaluate common computer vision tasks such as 3D object detection and tracking. The data was collected by a 32-beam LiDAR with a scanning frequency of 20 Hz and six RGB cameras with a capture frequency of 12 Hz. It provides 3D annotations for ten classes at 2 Hz over a 360-degree field of view. The dataset contains 40K fully annotated frames, split into 700 training scenes, 150 validation scenes, and 150 testing scenes. The official evaluation metrics are mean Average Precision (mAP) [13] and the NuScenes Detection Score (NDS) [1]. No external data are used for the training and evaluation on the nuScenes dataset.

4.2 Training and Evaluation Settings

We train and evaluate our method on four distributed NVIDIA P40 GPUs, configured to run synchronized batch normalization with a batch size of 4 per GPU. Following [38], we use HTCNet [3] (we directly use the checkpoint trained on nuImages [1] provided by MMDetection3D [9], without any modifications) for both image instance segmentation in PAI3D and 2D semantic segmentation in PointPainting [35].

For the 3D object detection task, following [1,38,44,45] we stack ten LiDAR sweeps into the keyframe sample to obtain a denser point cloud. We evaluate PAI3D's performance with two different 3D encoders, VoxelNet [47] and PointPillars [22]. For VoxelNet, the detection range is set to [−54, 54] × [−54, 54] × [−5, 3] in meters for the X, Y, Z axes, respectively, and the voxel size is (0.075, 0.075, 0.075) in meters. For PointPillars, the detection range is set to [−51.2, 51.2] × [−51.2, 51.2] × [−5, 3] in meters with a voxel size of (0.2, 0.2, 8) in meters.
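For reference, the two voxelization settings described above can be summarized as configuration dictionaries (the dictionary layout is ours, not the codebase's):

```python
# Illustrative configuration for the two voxelization settings described above.
VOXELNET_CFG = dict(
    point_cloud_range=[-54.0, -54.0, -5.0, 54.0, 54.0, 3.0],  # x_min, y_min, z_min, x_max, y_max, z_max
    voxel_size=[0.075, 0.075, 0.075],
    num_lidar_sweeps=10,
)
POINTPILLARS_CFG = dict(
    point_cloud_range=[-51.2, -51.2, -5.0, 51.2, 51.2, 3.0],
    voxel_size=[0.2, 0.2, 8.0],
    num_lidar_sweeps=10,
)
```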


Table 1. The 3D object detection results on the nuScenes test set. Object categories include Car, truck (Tru), Bus, trailer (Tra), construction vehicle (CV), pedestrian (Ped), motorcycle (MC), bicycle (Bic), traffic cone (TC), and barrier (Bar). "L" and "C" in Modality refer to "LiDAR" and "Camera" respectively. The proposed PAI3D outperforms all existing 3D detection methods, including LiDAR-camera fusion-based and LiDAR-alone methods, with a large margin on all object categories (except traffic cone) and with both mAP and NDS metrics.

Method               Modality  mAP↑  NDS↑  Car   Tru   Bus   Tra   CV    Ped   MC    Bic   TC    Bar
PointPillars [22]    L         30.5  45.3  68.4  23.0  28.2  23.4  4.1   59.7  27.4  1.1   30.8  38.9
WYSIWYG [19]         L         35.0  41.9  79.1  30.4  46.6  40.1  7.1   65.0  18.2  0.1   28.8  34.7
PointPainting [35]   L & C     46.4  58.1  77.9  35.8  36.1  37.3  15.8  73.3  41.5  24.1  62.4  60.2
CBGS [48]            L         52.8  63.3  81.1  48.5  54.9  42.9  10.5  80.1  51.5  22.3  70.9  65.7
CVCNET [4]           L         55.8  64.2  82.6  49.5  59.4  51.1  16.2  83.0  61.8  38.8  69.7  69.7
CenterPoint [44]     L         58.0  65.5  84.6  51.0  60.2  53.2  17.5  83.4  53.7  28.7  76.7  70.9
HotSpotNet [5]       L         59.3  66.0  83.1  50.9  56.4  53.3  23.0  81.3  63.5  36.6  73.0  71.6
MVP [45]             L & C     66.4  70.5  86.8  58.5  67.4  57.3  26.1  89.1  70.0  49.3  85.0  74.8
CenterPointV2 [44]   L & C     67.1  71.4  87.0  57.3  69.3  60.4  28.8  90.4  71.3  49.0  86.8  71.0
FusionPainting [38]  L & C     68.1  71.6  87.1  60.8  68.5  61.7  30.0  88.3  74.7  53.5  85.0  71.8
PAI3D (Ours)         L & C     71.4  74.2  88.4  62.7  71.3  65.8  37.8  90.3  80.8  58.2  83.2  75.5

4.3 nuScenes Test Set Results

Table 1 shows the results of PAI3D in comparison to recent SOTA approaches on the nuScenes test set. Following [38,44], we perform test time augmentation (TTA) operations, including flip and rotation, and we apply NMS to merge the results of five models with different voxel sizes. As seen from Table 1, PAI3D achieves 71.4 mAP and 74.2 NDS, outperforming all existing 3D detection methods, including LiDAR-camera fusion-based and LiDAR-alone methods, with a large margin on all object categories (except traffic cone) and with both the mAP and NDS metrics. Note that the latest methods such as CenterPointV2 (CenterPoint [44] with PointPainting [35]) and FusionPainting [38] are already very strong; nevertheless, PAI3D outperforms these two strong baselines by 4.3 in mAP and 2.8 in NDS.

Qualitative results of PAI3D in comparison to the strong SOTA baseline CenterPoint [44] (reproduced with the latest code from the official CenterPoint GitHub repository: https://github.com/tianweiy/CenterPoint) + PointPainting (with HTCNet [3]) are presented in Fig. 7 (a).


Table 2. Ablation results for the contribution of each module in PAI3D on the nuScenes validation set. FDH: Fine-granular Detection Head, IP: Instance Painter, SL: Semantic Label, Cen: Instance Center, APR: Adaptive Projection Refiner, FPA: False Positive Augmentation. The baseline is the reproduced CenterPoint with the PointPillars encoder.

Method                              FDH  IP (SL)  IP (Cen)  APR  FPA   mAP↑          NDS↑
CenterPoint [44]                                                       50.3          60.2
CenterPoint (reproduced baseline)                                      52.1          61.4
PAI3D (ours)                        ✓                                  52.6 (0.5↑)   62.3 (0.9↑)
PAI3D (ours)                        ✓    ✓                             62.2 (9.6↑)   66.9 (4.6↑)
PAI3D (ours)                        ✓    ✓        ✓                    64.1 (1.9↑)   67.7 (0.8↑)
PAI3D (ours)                        ✓    ✓        ✓         ✓          64.5 (0.4↑)   68.2 (0.5↑)
PAI3D (ours)                        ✓    ✓        ✓         ✓    ✓     64.7 (0.2↑)   68.4 (0.2↑)

The results show that the instance-level semantics fused with point geometry in PAI3D lead to fewer false positive detections and higher recall for sparse objects.

4.4 Ablation Study

In this section, we design a series of ablation studies on the nuScenes validation set to analyze the contribution of each module of the proposed method.

Contribution of Each Module. Incremental experiments are performed to analyze the effectiveness of each module, benchmarking against the reproduced CenterPoint [44] with the PointPillars [22] encoder as the baseline. As seen from Table 2, the Fine-granular Detection Head improves over the single Group Head [48] used in the baseline by 0.5 mAP and 0.9 NDS. Notably, the highest performance boost comes from the Instance Painter (IP), which increases mAP by 11.5 and NDS by 5.4 over the baseline with FDH, suggesting that instance-level image semantics are tremendously helpful. Within IP, the semantic label brings a 9.6 mAP and 4.6 NDS boost, and using the instance center further improves performance by 1.9 mAP and 0.8 NDS. The Adaptive Projection Refiner yields a further increase of 0.4 mAP and 0.5 NDS, indicating that PAI3D, even without refining the 3D instances, is already quite robust to erroneous semantic labels. False Positive Augmentation contributes an additional 0.2 mAP and 0.2 NDS. In total, our method outperforms the baseline by 12.6 mAP and 7.0 NDS, making PAI3D the new SOTA method in fusion-based 3D object detection.


Table 3. Ablation results for 3D encoders on the nuScenes validation set. We implement PAI3D with two widely used 3D encoders, VoxelNet and PointPillars, and benchmark its performance on the nuScenes validation set against SOTA fusion-based methods. "L" and "C" in Modality refer to "LiDAR" and "Camera" respectively.

Method                                              Modality  3D encoder    mAP↑  NDS↑
CenterPoint [44]                                    L         VoxelNet      56.4  64.8
CenterPoint (reproduced)                            L         VoxelNet      58.9  66.5
CenterPoint (reproduced) + PointPainting (HTCNet)   L & C     VoxelNet      64.9  69.3
CenterPoint + virtual-point [45]                    L & C     VoxelNet      65.9  69.6
FusionPainting (2D Painting) [38]                   L & C     VoxelNet      63.8  68.5
FusionPainting (2D+3D Painting) [38]                L & C     VoxelNet      66.5  70.7
PAI3D (ours)                                        L & C     VoxelNet      67.6  71.1
CenterPoint [44]                                    L         PointPillars  50.3  60.2
CenterPoint (reproduced)                            L         PointPillars  52.1  61.4
CenterPoint (reproduced) + PointPainting (HTCNet)   L & C     PointPillars  61.8  66.2
PAI3D (ours)                                        L & C     PointPillars  64.7  68.4

3D Encoders. As shown in the PAI3D architecture in Fig. 2, the 3D encoders fuse the instance-level image semantics with the point geometry features, and we are interested in how sensitive PAI3D is to the choice of 3D encoder. To that end, we compare PAI3D with strong fusion-based SOTA methods, including CenterPoint [44] (reproduced) + PointPainting [35] (HTCNet), CenterPoint + virtual-point [45] and FusionPainting [38], and we select two widely used 3D encoders, VoxelNet [47] and PointPillars [22], as the 3D encoders for these methods. For the sake of fair comparison, we do not use any two-stage refinement of the detected 3D proposals or dynamic voxelization.

As seen from Table 3, PAI3D consistently surpasses all existing fusion-based SOTA methods for both 3D encoders with a large margin, indicating that the strength of PAI3D mainly arises from the non-3D-encoder modules such as the Instance Painter, and that its performance is less influenced by the choice of 3D encoder. Notice that PAI3D, even though it only uses semantics from images, still outperforms FusionPainting (2D+3D Painting) [38], which uses semantics from both image and point cloud segmentation, for the VoxelNet encoder.

Fig. 6. Ablation results for instance segmentation quality. Instance segmentation quality and 3D detection performance are positively correlated, meaning that better 2D instance segmentation leads to better 3D object detection performance.


Fig. 7. Qualitative results of PAI3D compared with CenterPoint (reproduced) + PointPainting applied to the nuScenes validation set. Predicted boxes are shown in red, ground truth boxes in green, and detection differences in black circles. (Color figure online)

Instance Segmentation Quality. [35] shows that image segmentation quality can dramatically impact the final 3D object detection performance. To examine this issue, we conduct ablation experiments to plot the correlation curve between instance segmentation quality (measured by mask AP) and 3D detection performance (measured by mAP). We select several instance segmentation models of increasing quality, ranging from Mask R-CNN [17] to Cascade Mask R-CNN [2] to HTCNet [3] from MMDetection3D [9], and implement them in PAI3D; we then add the lower bound of no semantic painting and the upper bound of painting with ground-truth labels. All instance segmentation models are trained on the nuImages [1] dataset, and all these 3D detection methods are trained on the nuScenes training set. Finally, we obtain the 3D object detection mAP of these methods on the nuScenes validation set.

As shown in Fig. 6, instance segmentation quality and 3D detection performance are positively correlated, meaning that better 2D instance segmentation leads to better 3D object detection performance. In addition, we observe that model-based painting yields much worse 3D detection performance than ground-truth-based painting; we point out that this performance gap can be narrowed by improving instance segmentation quality and reducing point-to-pixel projection errors. We believe more in-depth research along this direction is worthwhile to pursue in our future work.

5 Conclusions

This paper presents PAI3D, a simple and effective sequential instance-level fusion method for 3D object detection, which realizes instance-level fusion between 2D image semantics and 3D point cloud geometry features. PAI3D outperforms all SOTA 3D detection methods, including LiDAR-camera fusion-based and LiDAR-alone methods, with a large margin on almost all object categories of the nuScenes dataset. Our thorough ablation experiments show that instance-level image semantics contribute the most to the performance gain, and that PAI3D works well with any good-quality instance segmentation model and any modern point cloud 3D encoder network. Qualitative results on a real-world autonomous driving dataset demonstrate that PAI3D yields fewer false positive detections and higher recall for distant sparse objects and hard-to-see objects.


References 1. Caesar, H., et al.: nuScenes: a multimodal dataset for autonomous driving. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 11621–11631 (2020) 2. Cai, Z., Vasconcelos, N.: Cascade R-CNN: high quality object detection and instance segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 43(5), 1483–1498 (2019) 3. Chen, K., et al.: Hybrid task cascade for instance segmentation. In: IEEE Conference on Computer Vision and Pattern Recognition (2019) 4. Chen, Q., Sun, L., Cheung, E., Yuille, A.L.: Every view counts: cross-view consistency in 3D object detection with hybrid-cylindrical-spherical voxelization. In: Advances in Neural Information Processing Systems (2020) 5. Chen, Q., Sun, L., Wang, Z., Jia, K., Yuille, A.: Object as hotspots: an anchor-free 3D object detection approach via firing of hotspots. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12366, pp. 68–84. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58589-1 5 6. Chen, Q., Wang, Y., Yang, T., Zhang, X., Cheng, J., Sun, J.: You only look onelevel feature. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13039–13048 (2021) 7. Chen, X., Ma, H., Wan, J., Li, B., Xia, T.: Multi-view 3D object detection network for autonomous driving. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1907–1915 (2017) 8. Chen, Y., Tai, L., Sun, K., Li, M.: MonoPair: monocular 3D object detection using pairwise spatial relationships. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 12093–12102 (2020) 9. Contributors, M.: MMDetection3D: OpenMMLab next-generation platform for general 3D object detection (2020). https://github.com/open-mmlab/ mmdetection3d 10. Ding, Z., et al.: 1st place solution for Waymo open dataset challenge-3D detection and domain adaptation. arXiv preprint arXiv:2006.15505 (2020) 11. Duan, K., Bai, S., Xie, L., Qi, H., Huang, Q., Tian, Q.: CenterNet: keypoint triplets for object detection. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 6569–6578 (2019) 12. Ester, M., Kriegel, H.P., Sander, J., Xu, X., et al.: A density-based algorithm for discovering clusters in large spatial databases with noise. In: KDD, vol. 96, pp. 226–231 (1996) 13. Everingham, M., Van Gool, L., Williams, C.K., Winn, J., Zisserman, A.: The pascal visual object classes (VOC) challenge. Int. J. Comput. Vision 88(2), 303–338 (2010) 14. Fan, L., Xiong, X., Wang, F., Wang, N., Zhang, Z.: RangeDet: in defense of range view for lidar-based 3D object detection. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 2918–2927 (2021) 15. Graham, B., Engelcke, M., Van Der Maaten, L.: 3D semantic segmentation with submanifold sparse convolutional networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 9224–9232 (2018) 16. He, C., Zeng, H., Huang, J., Hua, X.S., Zhang, L.: Structure aware single-stage 3D object detection from point cloud. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 11873–11882 (2020) 17. He, K., Gkioxari, G., Doll´ ar, P., Girshick, R.: Mask R-CNN. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 2961–2969 (2017)


18. He, Y., et al.: MR-DBSCAN: an efficient parallel density-based clustering algorithm using MapReduce. In: 2011 IEEE 17th International Conference on Parallel and Distributed Systems, pp. 473–480. IEEE (2011) 19. Hu, P., Ziglar, J., Held, D., Ramanan, D.: What you see is what you get: exploiting visibility for 3D object detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 11001–11009 (2020) 20. Kirillov, A., Girshick, R., He, K., Doll´ ar, P.: Panoptic feature pyramid networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 6399–6408 (2019) 21. Ku, J., Mozifian, M., Lee, J., Harakeh, A., Waslander, S.L.: Joint 3D proposal generation and object detection from view aggregation. In: 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 1–8. IEEE (2018) 22. Lang, A.H., Vora, S., Caesar, H., Zhou, L., Yang, J., Beijbom, O.: PointPillars: fast encoders for object detection from point clouds. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 12697–12705 (2019) 23. Liang, Z., Zhang, Z., Zhang, M., Zhao, X., Pu, S.: RangeIoUDet: range image based real-time 3D object detector optimized by intersection over union. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 7140–7149 (2021) 24. Liu, H., Oyama, S., Kurihara, M., Sato, H.: Landmark FN-DBSCAN: an efficient density-based clustering algorithm with fuzzy neighborhood. J. Adv. Comput. Intell. 17(1), 60–73 (2013) 25. Liu, S., Qi, L., Qin, H., Shi, J., Jia, J.: Path aggregation network for instance segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 8759–8768 (2018) 26. Liu, Z., Wu, Z., T´ oth, R.: Smoke: single-stage monocular 3D object detection via keypoint estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, pp. 996–997 (2020) ¨ Laugier, C.: Frustum-PointPillars: 27. Paigwar, A., Sierra-Gonzalez, D., Erkent, O., a multi-stage approach for 3D object detection using RGB camera and lidar. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 2926–2933 (2021) 28. Park, J., Weng, X., Man, Y., Kitani, K.: Multi-modality task cascade for 3D object detection. arXiv preprint arXiv:2107.04013 (2021) 29. Qi, C.R., Su, H., Mo, K., Guibas, L.J.: PointNet: deep learning on point sets for 3D classification and segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 652–660 (2017) 30. Shi, S., Wang, X., Li, H.: PointRCNN: 3D object proposal generation and detection from point cloud. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 770–779 (2019) 31. Simon, M., Milz, S., Amende, K., Gross, H.-M.: Complex-YOLO: an Euler-regionproposal for real-time 3D object detection on point clouds. In: Leal-Taix´e, L., Roth, S. (eds.) ECCV 2018. LNCS, vol. 11129, pp. 197–209. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-11009-3 11 32. Sun, P., et al.: Scalability in perception for autonomous driving: Waymo open dataset. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 2446–2454 (2020) 33. Sun, P., et al.: RSN: range sparse net for efficient, accurate lidar 3D object detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5725–5734 (2021)

PAI3D: Painting Adaptive Instance-Prior for 3D Object Detection

475

34. Tan, M., Pang, R., Le, Q.V.: EfficientDet: scalable and efficient object detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 10781–10790 (2020) 35. Vora, S., Lang, A.H., Helou, B., Beijbom, O.: PointPainting: sequential fusion for 3D object detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 4604–4612 (2020) 36. Wang, C., Ma, C., Zhu, M., Yang, X.: PointAugmenting: cross-modal augmentation for 3D object detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 11794–11803 (2021) 37. Wang, T., Zhu, X., Pang, J., Lin, D.: FCOS3D: fully convolutional one-stage monocular 3D object detection. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 913–922 (2021) 38. Xu, S., Zhou, D., Fang, J., Yin, J., Bin, Z., Zhang, L.: FusionPainting: multimodal fusion with adaptive attention for 3D object detection. In: 2021 IEEE International Intelligent Transportation Systems Conference (ITSC), pp. 3047–3054. IEEE (2021) 39. Yan, Y., Mao, Y., Li, B.: Second: sparsely embedded convolutional detection. Sensors 18(10), 3337 (2018) 40. Yang, B., Luo, W., Urtasun, R.: Pixor: real-time 3D object detection from point clouds. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7652–7660 (2018) 41. Yang, Z., Sun, Y., Liu, S., Jia, J.: 3DSSD: point-based 3D single stage object detector. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 11040–11048 (2020) 42. Yang, Z., Sun, Y., Liu, S., Shen, X., Jia, J.: IPOD: intensive point-based object detector for point cloud. arXiv preprint arXiv:1812.05276 (2018) 43. Yang, Z., Sun, Y., Liu, S., Shen, X., Jia, J.: STD: sparse-to-dense 3D object detector for point cloud. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 1951–1960 (2019) 44. Yin, T., Zhou, X., Krahenbuhl, P.: Center-based 3D object detection and tracking. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 11784–11793 (2021) 45. Yin, T., Zhou, X., Kr¨ ahenb¨ uhl, P.: Multimodal virtual point 3D detection. In: Advances in Neural Information Processing Systems, vol. 34 (2021) 46. Yoo, J.H., Kim, Y., Kim, J., Choi, J.W.: 3D-CVF: generating joint camera and LiDAR features using cross-view spatial feature fusion for 3D object detection. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12372, pp. 720–736. Springer, Cham (2020). https://doi.org/10.1007/978-3-03058583-9 43 47. Zhou, Y., Tuzel, O.: VoxelNet: end-to-end learning for point cloud based 3D object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4490–4499 (2018) 48. Zhu, B., Jiang, Z., Zhou, X., Li, Z., Yu, G.: Class-balanced grouping and sampling for point cloud 3D object detection. arXiv preprint arXiv:1908.09492 (2019)

Validation of Pedestrian Detectors by Classification of Visual Detection Impairing Factors

Korbinian Hagn and Oliver Grau

Intel Deutschland GmbH, Lilienthalstraße 15, 85579 Neubiberg, Bayern, Germany
{korbinian.hagn,oliver.grau}@intel.com

Abstract. Validation of AI-based perception functions is a key cornerstone of safe automated driving. Building on the use of richly annotated synthetic data, a novel pedestrian detector validation approach is presented that enables the detection of training data biases such as missing poses, ethnicities, geolocations, genders or ages. We define a range of visual impairment factors, e.g. occlusion or contrast, which are deemed to be influential on the detection of a pedestrian object. A classifier is trained to distinguish pedestrian objects only by these visual detection impairment factors, which makes it possible to find pedestrians that should be detectable but are missed by the detector under test due to underlying training data biases. Experiments demonstrate that our method detects pose, ethnicity and geolocation data biases on the CityPersons and the EuroCity Persons datasets. Further, we evaluate the overall influence of these impairment factors on the detection performance of a pedestrian detector.

1 Introduction

Fig. 1. A classifier can be trained to distinguish detectable from non-detectable pedestrians according to a set of visual impairment factors. Objects in the detectable set which have not been detected indicate a data bias of the pedestrian detector.

Supplementary Information: The online version contains supplementary material available at https://doi.org/10.1007/978-3-031-25072-9_33.

Modern deep learning-based object detectors are capable of detecting objects with unprecedented accuracy. The detection boundary of a vision task is often

compared to human detection capabilities. While closing in on this detection limit, it is necessary to understand which properties of an object in an image make it difficult for a human to detect this very object, and how we are able to exploit this detection boundary for validation. In Fig. 1 we argue that for a set of pedestrian objects in a dataset, there is (i) a non-detectable subset, defined as persons that are not detectable due to their visual impairment factors, which we will introduce in the course of this paper, (ii) the detectable subset, which is the set of pedestrian objects that are detectable by a human observer or a state-of-the-art object detector trained on a global super set without any data biases, and (iii) a subset consisting of objects that are detected by a state-of-the-art detector trained on a geo-specific domain with likely data biases. The difference between the detectable and detected subsets is then mainly defined by differences in the training data of the detector. The boundary of the detectable subset itself is defined by visual detection impairment factors, like occlusion and contrast. If, for example, the contrast is close to 0 or the occlusion is close to 100%, then the object, while still being present in the image, simply cannot be detected even by a human observer.
Our validation approach makes use of all three subsets by identifying pedestrians that should be detectable according to their visual impairment factors, i.e., classifying whether a pedestrian object is in the non-detectable or the detectable set. We then evaluate the difference between the detectable subset and the actually detected subset, which we obtain from the predictions of a pedestrian detector. We demonstrate by investigation of this difference that we can reveal data biases in the pedestrian detector under test, like missing poses, ethnicities or geolocations, originating from the used training dataset.
This validation strategy is largely enabled by the recent progress in synthetic data generation, which gives new possibilities to produce high-realism data for computer vision tasks. Especially for understanding the prediction decision-making of a detector, the pixel-accurate ground-truth and the rich meta-data that can be extracted from the data generation process are highly beneficial to find relations between a visual factor of an object, e.g., its contrast, and its detection.
The contributions of this paper are as follows:
– We identify factors that impair the visual detection capability of a pedestrian object and investigate each factor's importance to the detectability.
– We demonstrate the usage of these factors by training a classifier to distinguish detectable from non-detectable pedestrian objects and utilizing this classifier to validate a pedestrian detector for biases of its underlying training dataset.

2 Related Works

Validation Strategies. A validation strategy of a perception function which is part of the automotive driving stack is of high importance to guarantee a safe driving function. Our method hereby detects pedestrian detection faults, i.e. a

functional insufficiency as defined by [12], addressing the lack of generalization as defined by [22]. To validate for a functional insufficiency, the detection, creation and testing with corner cases has been addressed by several previous works [1,2]. Bolte et al. [3] define a corner case as a "non-predictable relevant object in relevant location". Following this definition, we propose a new method that validates a pedestrian detector through the detection of corner cases by learning to classify the "non-predictable" property from visual detection impairment factors.
Validation with Synthetic Data. Several synthetic datasets for automotive perception functions have been proposed [19,21,25]. While these datasets are very valuable for benchmarking, methods for validation should allow direct control of the generation of synthetic data, especially for automotive tasks like pedestrian detection; the most notable examples here are Carla [10] and the LGSVL Simulator [20]. Instead of simulating a whole automotive driving scenario, variational automotive scene creation approaches have also been introduced for detector evaluation [11,18,23]. These works already showed validation results by detecting domain shifts between training and validation datasets. Our method not only detects data biases with semantic relevance, i.e. geolocation, ethnicity, etc., but also missing pose information in the training data, by utilizing a classifier to distinguish detectable from non-detectable pedestrians through analyzing the proposed visual detection impairment factors.
Visual Detection Impairing Factors. Besides label noise and sensor noise, detection impairing factors such as occlusion rate and contrast have been analyzed for their effect on object detection performance. While there are factors that are believed to have no influence on the detection, for example contrast [26], we found evidence that contrast is still relevant to the object detection performance. Distance and occlusion, on the other hand, are acknowledged to be among the most influential factors and have already been addressed by several works [7,8] and incorporated into a safety relevance distance metric [17]. We not only address these factors and their influence on the detection performance, but add several new factors relevant to the detection performance and investigate their influence.
Pedestrian Detection Models. While models based on the Transformer architecture [9] have recently gained popularity, most automotive detection benchmarks1,2 are still dominated by convolutional neural network (CNN) based methods. In this work we use the Cascade R-CNN [5] pedestrian detector, a development of the R-CNN [14] and Fast R-CNN [13] models. The HRNet [24] feature extraction backbone is pre-trained on ImageNet [6].

1 CityPersons: https://github.com/cvgroup-njust/CityPersons.
2 EuroCity Persons: https://eurocity-dataset.tudelft.nl/eval/benchmarks/detection.

3 Methodology

Our method is described in several sub-steps: first, the generation of synthetic training and validation data; next, the definition of the visual detection impairment factors and their extraction from synthetic data; last, the definition and training of a detectability classifier and the validation of a pedestrian detector for data biases.

3.1 Synthetic Data Generation

The synthetic data used in this contribution is generated by a data synthesis pipeline and includes special modules to compute meta- or ground-truth data which is hard or impossible to observe and measure in real data. One example is the pixel-accurate occlusion rate of an object, obtained by differencing the mask of the unoccluded object, computed in a separate rendering pass, from the mask of the occluded object in the complete scene. To achieve a representative calibration of the detector, the synthetic data should have similar characteristics as real scenes; the difference is also described as the domain gap between real and synthetic data. We achieve highly realistic synthetic data on three levels: i) an automated scene generator produces scenes with similar complexity as those in real data, ii) we use similar 3D objects from an asset database, and iii) a realistic sensor simulation builds on the work described in [15]. The rendering process delivers realistic scene illumination in linear color space with floating-point accuracy based on the Blender Cycles path-tracing rendering engine (blender.org), followed by a sensor simulation that includes effects like sensor noise, lens distortions and chromatic aberrations, and a tone mapping to integer sRGB color space. The parameters of the sensor simulation are tuned to match the characteristics of the Cityscapes data, similar to [16].
For the purpose of this paper, we synthesize a dataset that contains complex urban street scenes with a variation of objects (about 300), such as different houses, vehicles and street elements, and about 150 different human characters automatically placed from an asset database. Figure 2 depicts some example frames from that dataset. The synthetic data generation pipeline also computes various metadata and ground-truth, including semantic and instance segmentation, the distance of objects to the camera, occlusion rates, 2D + 3D bounding boxes, and radiometric object features including contrast measures as introduced above. The dataset is split into a training set to calibrate the detectability classifier and a validation set to detect data biases of the pedestrian detector under test. One of the advantages of synthetic data in our method is the precision and deterministic nature of the label and bounding box meta-data, which is free from noise, as all the generated labeling data is pixel accurate.
For evaluations with this dataset, the pedestrians in the images are prefiltered to guarantee that only the considered impairment factors are influential on a pedestrian detection. This means that pedestrians that are too close to


Fig. 2. Our fully parameterizable generation pipeline allows rendering pedestrians at any size, occlusion, time of day, and distance to the camera.

one another cannot be detected due to non-maximum suppression (NMS) and are therefore ignored. The resulting synthetic dataset D_synth consists of 17012 images of pedestrians for training and 26745 images of pedestrians for evaluation.

3.2 Visual Detection Impairing Factors

The major factors of a pedestrian object that we consider to be influential on the detection capability of a detector are visualized in Fig. 3. We begin with the placement of a pedestrian in an image. This information is extracted from the bounding box coordinates of the ground-truth; the coordinates are defined by the center coordinate o_cx, o_cy and the width o_w and height o_h of the box in pixels. Next are the distance o_d to the observer, i.e., the camera, in [m] and the number of visible pixels o_vp of the object. The distance to the observer is extracted from the 3D placement in the rendered scene. The number of visible pixels is extracted from the instance segmentation label by counting the pixels that belong to the person. Determining the occlusion rate o_ocl, i.e., the ratio of visible pixels to all pixels of an object, is done by extracting the number of occluded and non-occluded pixels from the instance segmentation ground-truth with, and without, occluding objects. Last, we define different contrast measures as visual impairment factors. The first contrast measure o_cfull is defined as the Euclidean distance of the mean object RGB color to its surrounding background. This is done by dilating the instance segmentation mask of the person and subtracting the undilated segmentation mask, so that we obtain the surrounding 5-pixel border of the object. The contrast is then calculated as the Euclidean distance of the mean object color to the mean surrounding background color. Another contrast measure o_cmean is defined by segmenting the object into 12 smaller segments and calculating the

Fig. 3. The potential detection performance impairing factors we consider in this work: (a) bounding box coordinates (ocx , ocy , oh , ow ), (b) distance and number of visible pixels of a pedestrian (od , ovp ), (c) rate of occlusion (oocl ), (d) contrast of a pedestrian (red) to its background (blue) calculated by the full pedestrian silhouette (ocf ull ), segment wise (ocmean ) and edge wise (ocedge ).

mean RGB color of each segment and the Euclidean distance to its neighboring mean background color. The resulting contrast measure is then derived by averaging over all segments. The last considered contrast measure o_cedge is calculated with only the 5 pixels on the edge of the instance label, again taking the mean color and the Euclidean distance to the mean of the surrounding background.
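To make the factor extraction more concrete, the following is a minimal sketch of how two of these factors could be computed from the pixel-accurate ground-truth. The array layouts, the 5-pixel border width and the function names are illustrative assumptions, not the authors' implementation.

```python
import numpy as np
from scipy.ndimage import binary_dilation

def occlusion_rate(visible_mask: np.ndarray, unoccluded_mask: np.ndarray) -> float:
    """o_ocl: fraction of the person's pixels hidden by occluding objects.

    visible_mask    -- instance mask rendered with the full scene (occluders present)
    unoccluded_mask -- instance mask of the same person rendered without occluders
    """
    total = unoccluded_mask.sum()
    return 1.0 - visible_mask.sum() / max(int(total), 1)

def full_contrast(image: np.ndarray, instance_mask: np.ndarray, border: int = 5) -> float:
    """o_cfull: Euclidean distance between the mean RGB color of the person and the
    mean color of a thin background border around its silhouette."""
    # Dilate the mask and subtract the undilated mask to get the surrounding border ring.
    dilated = binary_dilation(instance_mask, iterations=border)
    background_ring = dilated & ~instance_mask
    mean_object = image[instance_mask].mean(axis=0)        # mean RGB inside the person
    mean_background = image[background_ring].mean(axis=0)  # mean RGB of the border ring
    return float(np.linalg.norm(mean_object - mean_background))
```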

3.3 Classification of Detectable Pedestrians

We define the classification loss to train a classifier to distinguish persons being detectable or not. This classifier is then used to analyze a pedestrian detector for data biases due to missing data in the training data. Classification Loss. We begin with a synthetic image dataset of pedestrian objects o ∈ Ω = {o1 , . . . , oO } where O is the number of pedestrians in a set of all pedestrian objects defined as Ω. Each pedestrian was either detected (1) or missed (0) by the pedestrian detector under test. In the accumulation step each pedestrian object is enriched with the previously defined detection impairment

factors, and thus we obtain the sample vector of the object's metadata o defined as o = (o_cx, o_cy, o_h, o_w, o_d, o_vp, o_ocl, o_cfull, o_cmean, o_cedge), where every entry in this vector is normalized to μ = 0 and σ² = 1 and corresponds to one of the detection impairment factors previously described. With the enriched pedestrian objects and their corresponding target class (detected, missed) we then train a supervised deep neural network as classifier. The classification loss is the default cross-entropy loss defined as

J^cls(p, u) = − Σ_{i∈S} u_i log(p_i).   (1)

The classification output is a discrete probability distribution p = (p_0, ..., p_s), i.e. a confidence, computed by a softmax, with s being the predicted class, s ∈ S = {0, ..., S−1}. Here, S = 2, i.e., the pedestrian is detectable (s = 1) or the pedestrian is non-detectable (s = 0). The respective target class is defined as u = (u_0, ..., u_s), with the detectable target class defined as 1 and the non-detectable target class as 0. The overall function g of our classifier is then described as

g : Ω → S.   (2)

The classifier learns a mapping of pedestrian objects Ω to a corresponding detectability class S.
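As a rough sketch, the detectability classifier could be implemented as follows. The paper only specifies a fully connected network with 5 hidden layers, 10 input factors and 2 output classes; the hidden width, optimizer settings and variable names below are assumptions made for illustration.

```python
import torch
import torch.nn as nn

class DetectabilityClassifier(nn.Module):
    """Fully connected classifier g: impairment-factor vector o -> {non-detectable, detectable}."""
    def __init__(self, n_factors: int = 10, n_classes: int = 2, hidden: int = 64):
        super().__init__()
        layers, width = [], n_factors
        for _ in range(5):                      # 5 hidden layers as stated in the paper
            layers += [nn.Linear(width, hidden), nn.ReLU()]
            width = hidden
        layers.append(nn.Linear(width, n_classes))
        self.net = nn.Sequential(*layers)

    def forward(self, o: torch.Tensor) -> torch.Tensor:
        return self.net(o)                      # logits; a softmax yields p in Eq. (1)

# Training step with the standard cross-entropy loss of Eq. (1):
model = DetectabilityClassifier()
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

o_batch = torch.randn(32, 10)                  # normalized impairment factors (dummy data)
u_batch = torch.randint(0, 2, (32,))           # 1 = detected, 0 = missed
loss = criterion(model(o_batch), u_batch)
loss.backward()
optimizer.step()
```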

Fig. 4. Training a classifier to distinguish between detectable and non-detectable pedestrian objects.

Training the Classifier. Our approach to train a classifier to distinguish between detectable and non-detectable pedestrian objects is sketched in Fig. 4. We use the synthesized training dataset D_Synth^train as described in Sect. 3.2. These

images are inferred by a state-of-the-art Cascade R-CNN [5] pedestrian detector with a HRNet [24] backbone pretrained on ImageNet [6]. This detector was further trained on a sub-set of our synthetic dataset. In parallel to the inference process, we extract our defined visual detection impairment factors on a per-pedestrian-object basis from each synthetic data frame. Next, the inference results of the detector and the extracted impairment factors are accumulated into the pedestrian object set Ω and the target class set S. Ω is the classification input, with each row being a pedestrian object with its vectorized visual impairment factors o and a corresponding detectability class s in S as target class. These inputs are then used to train a fully connected deep neural network with 5 hidden layers to learn the mapping g(.), i.e. the detectability classification. Training this classifier results in a high classification F1 score of 0.93. The F1 score is the harmonic mean of precision and recall and is regularly used to evaluate binary classification tasks.
Detection of Validation Samples. The trained detectability classifier is used to validate another pedestrian detector as described in Fig. 5. The detectors under test in our validation experiments are Cascade R-CNN detectors trained on the CityPersons [27] (CP) dataset and on the EuroCity Persons [4] (ECP) dataset. For validation, we use the previously rendered synthetic validation dataset D_Synth^val. In this dataset we integrated our validation samples, i.e., if we want to validate a pedestrian detector for biases due to the geolocation of the training data, we have to add different pedestrian assets from various other geolocations. Similarly, we can add differently posed pedestrians and different gender, age-group or ethnicity assets to the validation images. The pedestrian detector under test generates inference results on the validation set images. In parallel, the visual impairment factors per pedestrian object are extracted from each image. The detections and the impairment factors are accumulated into the sets Ω and S. The previously trained classifier g(.) classifies the input objects into detectable or non-detectable. The results, i.e. the confidence score per object prediction and the target detectability class, are stored in the validation result dataset D_Synth^val,cl. The validation results are then enhanced by the per-pedestrian meta information from D_Meta^val, which we obtain from the image synthesis process. This meta information consists of the asset ID, the geolocation, pose, gender, age-group and ethnicity and is useful to draw conclusions about underlying data biases in the training data.
The validation result D_Synth^val,cl is now used to find detection biases of the pedestrian detector under test. Therefore, we simply filter the dataset for pedestrian objects that are classified as detectable (prediction = 1) but were missed by the pedestrian detector under test (detectability = 0). From this filtered validation result the data biases are extracted by a statistical analysis of occurrences, as described in the Results and Discussion section.
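The bias-mining step at the end of this pipeline amounts to a simple filter-and-count over the validation result set. A sketch under assumed column names (they are not taken from the paper) could look as follows.

```python
import pandas as pd

# Hypothetical layout of the validation result set D_Synth^val,cl joined with the meta data:
# one row per pedestrian object, with the classifier output and the detector outcome.
results = pd.DataFrame({
    "asset_id":             [2, 11, 13, 7, 19],
    "geolocation":          ["Asia", "Europe", "Europe", "Europe", "Africa"],
    "pose":                 ["standing", "sitting", "kneeling", "standing", "waving"],
    "predicted_detectable": [1, 1, 1, 0, 1],   # classifier g(.): 1 = detectable
    "detected_by_detector": [0, 0, 0, 0, 1],   # detector under test: 1 = detected
})

# Pedestrians that should be detectable but were missed -> candidate data-bias cases.
bias_candidates = results[(results["predicted_detectable"] == 1) &
                          (results["detected_by_detector"] == 0)]

# Statistical analysis of occurrences, e.g. which poses or geolocations dominate the misses.
print(bias_candidates.groupby(["geolocation", "pose"]).size().sort_values(ascending=False))
```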

Fig. 5. Generation of validation data to detect data biases in the pedestrian detector.

4 Results and Discussion

We present two different results. First, we analyze the validation results from pedestrian detectors trained on the CityPersons (CP) and on the EuroCity Persons (ECP) dataset. This step is separated into the detection of geolocation, gender, age-group and ethnicity data biases and into the detection of data biases due to missing poses. Second, we analyze the influence of each individual visual detection impairment factor on the detection performance, i.e. the miss rate.

4.1 Evaluation of Pedestrian Detection Data Biases

The data biases found by our approach can mainly be grouped into two categories. The first category are data biases attributed to missing semantic characteristics in the training dataset. These errors can occur due to differences in pedestrians from different geolocations, or missing genders, ethnicities or age-groups in the training and validation sets. The second category are data biases due to missing poses, which may not only originate from missing training data but also from an incapability of the detector, i.e. missing aspect ratios of anchor boxes.
Validation of Data Bias Characteristics. To generate meaningful results from the validation result set D_Synth^val,cl we filter all pedestrians according to their frequency. We deem a pedestrian to be meaningful if the ratio of its frequency in the result set to its overall frequency in the synthetic validation data is above 10%. Following the pre-filtering of the validation results we can plot the frequency of each pedestrian for the (a) CP and the (b) ECP dataset, as depicted in Fig. 6. For the CP dataset 23 persons cause a data bias error, compared to only 15 persons for the ECP dataset. In the CP dataset we found that pedestrians from non-European geolocations and with an ethnicity other than Caucasian

Fig. 6. Distribution of found data biases.

will have a significant influence on miss-detections due to the data bias in the training data. This bias is very prominent if the person wears clothing typical for their geolocation but uncommon in Europe, as can be seen with ID 2, a male veiled in traditional Arabian clothing. A similar observation can also be made with the ECP dataset. While many occurrences of persons with female gender can be observed, the originating cause is often a combination with an uncommon pose, as in ID 19 with the waving hand or in ID 23 carrying an additional trolley. Sitting or kneeling poses, especially in combination with the child age-group, are the most common sources of data bias found in both datasets, as can be seen with the IDs 11, 13 and 21. For the child age-group as well as for the kneeling and sitting poses, the overall height of the person is smaller than usual. This strongly suggests that too few persons in sitting or kneeling poses are included in both datasets. Comparing the observations from the CP and the ECP datasets, we see the CP dataset to have a marginally smaller data bias on sitting and kneeling children but a higher data bias on uncommon poses of standing persons.
Validation of Pose Data Bias. To further filter the observations of found data biases we can evaluate whether the persons in the previous result set have at least a bounding box prediction with a ground-truth intersection over union (IoU) of > 0.25. Because a detection is only counted as a valid detection with an IoU threshold of > 0.5, we can lower this threshold in the validation to inspect whether there was at least a partial detection of a person. Exemplary predictions where the person was not counted as detected, but a partial detection was predicted nonetheless, can be seen in Fig. 7.
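A small helper for this partial-detection check might look as follows; the box format (x1, y1, x2, y2) and the function names are assumptions made for the example.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / max(area_a + area_b - inter, 1e-9)

def is_partial_detection(gt_box, predicted_boxes):
    """True if the best prediction overlaps the ground truth with 0.25 < IoU <= 0.5,
    i.e. the person was not counted as detected but was at least partially found."""
    best = max((iou(gt_box, p) for p in predicted_boxes), default=0.0)
    return 0.25 < best <= 0.5
```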

Fig. 7. Non-detected persons (blue) with partial predictions (red), i.e. 50% > IoU > 25%. (Color figure online)

If we now further filter the previous validation result sets to only include persons with a partial detection of IoU > 0.25 we get the frequency distributions as depicted in Fig. 8.

Fig. 8. Distribution of found data biases due to missing pose data.

Again, the IDs 11, 13, 14 and 23 can be found in both datasets, suggesting that the characteristics, i.e. geolocation, age-group, gender and ethnicity, are known by the detector but the pose information is missing. The ID 19 and 21 were partially detected with the ECP dataset, but they are missing entirely now in the CP dataset, i.e. while the ECP dataset suffers from missing pose information on these persons, the CP dataset suffers from a data bias because not even a partial prediction was made. Similarly, the ID 1 and 10 are partially detected with the CP dataset but not with the ECP dataset. Additionally, with the ID 23 person we found that the CP dataset did miss the detections because the luggage trolley is defined to be part of the bounding box of this person. While the ECP prediction would, in most cases, predict a loose bounding box for the person that is sufficient to count as detected, the

CP bounding box prediction was tighter on the person. Careful evaluation and definition of the ground-truth is therefore key to producing meaningful validation results without any ambiguity.
Summarizing, the CP dataset is missing more pedestrian characteristics than the ECP dataset. Both datasets have a strong bias towards European persons, and both datasets are missing pose information on kneeling, sitting or bent-over persons, especially if they are children.

4.2 Detection Impact of Visual Impairment Factors

To understand whether the visual impairment factors are meaningful for a detectable or non-detectable classification, we have to investigate the relation between detection performance and each individual factor. Therefore, the predictions on the synthetic training set D_Synth^train are evaluated along with each individual impairment factor, visualized as a histogram. Figure 9 shows the count of detected (blue) and non-detected (red) persons per bin. Additionally, the miss rate of each factor is calculated per bin and plotted into each histogram. The miss rate is defined as the number of missed detections (detected = 0) over the number of all persons, i.e., detected and non-detected.
A major influence factor is obviously the size of an object, and therefore the number of visible pixels of a pedestrian in the image, which is evident in several interrelated impairment factors: small height, small width, high distance and a low number of visible pixels lead to a high miss rate. Increasing the occlusion rate leads to a reduced number of visible pixels and an increased miss rate as well. The contrast measures show a decreasing tendency for increased values of contrast, but not as pronounced as the other size-related factors and with occasional outliers at higher values.
Calculating the Spearman correlation of miss rate and the individual impairment factors emphasizes these observations, as shown in Table 1. The size-related factors (o_h, o_w, o_d, o_vp, o_ocl) show high positive or negative correlations, where the width o_w has the lowest absolute correlation to the miss rate. For the contrast measures we see a high negative correlation for the o_cfull contrast measure and lower negative correlations for the other two contrast measures.

Table 1. Spearman correlation of the visual impairment factors and miss rate.

       o_h      o_w      o_d     o_vp     o_ocl    o_cfull   o_cmean   o_cedge   o_cy    o_cx
ρ_s   −0.886   −0.744   0.995   −0.995    0.993    −0.953    −0.451    −0.343    0.093   0.017
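The per-factor correlations in Table 1 can be reproduced in a few lines once the binned miss rates are available. The sketch below assumes a simple equal-width binning, which is an illustrative choice rather than the authors' exact procedure.

```python
import numpy as np
from scipy.stats import spearmanr

def factor_missrate_correlation(factor_values: np.ndarray, detected: np.ndarray, n_bins: int = 20) -> float:
    """Spearman correlation between one visual impairment factor and the binned miss rate.

    factor_values -- one impairment factor per pedestrian (e.g. occlusion rate)
    detected      -- 1 if the pedestrian was detected, 0 if missed
    """
    edges = np.linspace(factor_values.min(), factor_values.max(), n_bins + 1)
    bin_idx = np.clip(np.digitize(factor_values, edges) - 1, 0, n_bins - 1)

    centers, miss_rates = [], []
    for b in range(n_bins):
        mask = bin_idx == b
        if mask.any():                                   # skip empty bins
            centers.append(edges[b:b + 2].mean())
            miss_rates.append(1.0 - detected[mask].mean())
    rho, _ = spearmanr(centers, miss_rates)
    return rho
```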

Last, we visualize the remaining factors ocy and ocx , i.e. the horizontal and vertical placement of a person in the image, on a heatmap with their individual marginal distributions and the corresponding miss rate plots in Fig. 10.

Fig. 9. Influence of visual impairing factors of a pedestrian on the detection performance, i.e. miss rate (gray).

The horizontal placement has no influence on the miss rate, which is confirmed by the Spearman correlation of 0.017. The vertical placement with a similar low correlation however shows several high miss rate values closer to the bottom of the image which we attribute to outliers. Additionally, a strong increase of the miss rate at the center of the image can be observed. This is due to persons at great distances which are hard to detect aligning at a vertical position around the horizon. Similarly, the sharp distribution of ocy at around 0.42 stems from the fixed vertical camera angle in our synthetic data. Finally, due to missing data at lower ocy values we cannot conclude that this factor has no influence.

Fig. 10. Influence of person center point position in the image on detection performance, i.e. miss rate (gray).

5 Conclusions

Validation of AI based perception functions is an essential building block for safe automated driving. A failed pedestrian detection due to an underlying data bias in the training data can lead to grave consequences. In this work we presented a method to find data biases of a detector and its underlying real-world dataset by classifying a person according to several visual detection impairment factors into a detectable and non-detectable class through the use of rich annotated synthetic data. We demonstrated the effectiveness of this approach for the realworld datasets CityPersons (CP) and EuroCity Person (ECP) by detecting 23 ethnicity and geolocation based data biases in the CP and 15 in the ECP dataset. Further, 6 missing pedestrian poses were identified in both datasets. Here we remarked that the ground-truth of the validation data has to be carefully defined to generate meaningful data bias findings. Last, we investigated the influence of each visual impairment factor on the overall pedestrian detection performance and came to the conclusion that the visible pixels and distance to the observer have the highest influence with very high Spearman correlations (|ρs | = 0.995) to the miss rate. The contrast measure ocf ull showed a very good correlation (|ρs | = 0.953) while the horizontal position in the image has no effect (|ρs | =

0.017). The vertical position likewise indicates no influence, but due to a lack of data this result is inconclusive.

Acknowledgement. The research leading to these results is funded by the German Federal Ministry for Economic Affairs and Energy within the project "Methoden und Maßnahmen zur Absicherung von KI basierten Wahrnehmungsfunktionen für das automatisierte Fahren (KI-Absicherung)". The authors would like to thank the consortium for the successful cooperation.

References 1. Abrecht, S., Gauerhof, L., Gladisch, C., Groh, K., Heinzemann, C., Woehrle, M.: Testing deep learning-based visual perception for automated driving. ACM Trans. Cyber-Phys. Syst. (TCPS) 5(4), 1–28 (2021) 2. Bernhard, J., Schulik, T., Schutera, M., Sax, E.: Adaptive test case selection for DNN-based perception functions. In: 2021 IEEE International Symposium on Systems Engineering (ISSE), pp. 1–7. IEEE (2021) 3. Bolte, J.A., Bar, A., Lipinski, D., Fingscheidt, T.: Towards corner case detection for autonomous driving. In: 2019 IEEE Intelligent vehicles symposium (IV), pp. 438–445. IEEE (2019) 4. Braun, M., Krebs, S., Flohr, F.B., Gavrila, D.M.: Eurocity persons: a novel benchmark for person detection in traffic scenes. IEEE Trans. Pattern Anal. Mach. Intell. 41, 1844–1861 (2019). https://doi.org/10.1109/TPAMI.2019.2897684 5. Cai, Z., Vasconcelos, N.: Cascade R-CNN: high quality object detection and instance segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 43, 1483–1498 (2019) 6. Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., Fei-Fei, L.: Imagenet: a large-scale hierarchical image database. In: CVPR 2009 (2009) 7. Dollar, P., Wojek, C., Schiele, B., Perona, P.: Pedestrian detection: a benchmark. In: 2009 IEEE Conference on Computer Vision and Pattern Recognition, pp. 304– 311 (2009). https://doi.org/10.1109/CVPR.2009.5206631 8. Dollar, P., Wojek, C., Schiele, B., Perona, P.: Pedestrian detection: an evaluation of the state of the art. IEEE Trans. Pattern Anal. Mach. Intell. 34(4), 743–761 (2011) 9. Dosovitskiy, A., et al.: An image is worth 16 × 16 words: transformers for image recognition at scale. In: 9th International Conference on Learning Representations, ICLR 2021, Virtual Event, Austria, 3–7 May 2021. OpenReview.net (2021). https://openreview.net/forum?id=YicbFdNTTy 10. Dosovitskiy, A., Ros, G., Codevilla, F., Lopez, A., Koltun, V.: Carla: an open urban driving simulator. In: Conference on Robot Learning, pp. 1–16. PMLR (2017) 11. Gannamaneni, S., Houben, S., Akila, M.: Semantic concept testing in autonomous driving by extraction of object-level annotations from Carla. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 1006–1014 (2021) 12. Gauerhof, L., Munk, P., Burton, S.: Structuring validation targets of a machine learning function applied to automated driving. In: Gallina, B., Skavhaug, A., Bitsch, F. (eds.) SAFECOMP 2018. LNCS, vol. 11093, pp. 45–58. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-99130-6 4 13. Girshick, R.: Fast R-CNN. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 1440–1448 (2015)

14. Girshick, R., Donahue, J., Darrell, T., Malik, J.: Rich feature hierarchies for accurate object detection and semantic segmentation. In: Proceedings of the IEEE Conference on Computer Vision And Pattern Recognition, pp. 580–587 (2014) 15. Grau, O., Hagn, K., Sha, Q.S.: A Variational synthesis approach for deep validation. In: Fingscheidt, T., Gottschalk, H., Houben, S. (eds.) Deep Neural Networks and Data for Automated Driving, pp. 359–381. Springer, Cham (2022). https:// doi.org/10.1007/978-3-031-01233-4 13 16. Hagn, K., Grau, O.: Improved sensor model for realistic synthetic data generation. In: Computer Science in Cars Symposium. CSCS 2021, Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3488904.3493383 17. Lyssenko, M., Gladisch, C., Heinzemann, C., Woehrle, M., Triebel, R.: From evaluation to verification: towards task-oriented relevance metrics for pedestrian detection in safety-critical domains. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 38–45 (2021) 18. Lyssenko, M., Gladisch, C., Heinzemann, C., Woehrle, M., Triebel, R.: Instance segmentation in Carla: methodology and analysis for pedestrian-oriented synthetic data generation in crowded scenes. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 988–996 (2021) 19. Richter, S.R., Vineet, V., Roth, S., Koltun, V.: Playing for data: ground truth from computer games. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9906, pp. 102–118. Springer, Cham (2016). https://doi.org/10. 1007/978-3-319-46475-6 7 20. Rong, G., et al.: LGSVL simulator: a high fidelity simulator for autonomous driving. In: 2020 IEEE 23rd International Conference on Intelligent Transportation Systems (ITSC), pp. 1–6. IEEE (2020) 21. Ros, G., Sellart, L., Materzynska, J., Vazquez, D., Lopez, A.M.: The Synthia dataset: a large collection of synthetic images for semantic segmentation of urban scenes. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3234–3243 (2016) 22. S¨ amann, T., Schlicht, P., H¨ uger, F.: Strategy to increase the safety of a DNN-based perception for had systems. arXiv preprint arXiv:2002.08935 (2020) 23. Syed Sha, Q., Grau, O., Hagn, K.: DNN analysis through synthetic data variation. In: Computer Science in Cars Symposium. CSCS 2020, Association for Computing Machinery, New York, NY, USA (2020). https://doi.org/10.1145/3385958.3430479 24. Wang, J., et al.: Deep high-resolution representation learning for visual recognition. TPAMI 43, 3349–3364 (2019) 25. Wrenninge, M., Unger, J.: Synscapes: A photorealistic synthetic dataset for street scene parsing (2018) 26. Zhang, S., Benenson, R., Omran, M., Hosang, J., Schiele, B.: How far are we from solving pedestrian detection? In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2016 27. Zhang, S., Benenson, R., Schiele, B.: Citypersons: a diverse dataset for pedestrian detection. In: CVPR (2017)

Probing Contextual Diversity for Dense Out-of-Distribution Detection Silvio Galesso(B) , Maria Alejandra Bravo, Mehdi Naouar, and Thomas Brox University of Freiburg, Freiburg, Germany [email protected] Abstract. Detection of out-of-distribution (OoD) samples in the context of image classification has recently become an area of interest and active study, along with the topic of uncertainty estimation, to which it is closely related. In this paper we explore the task of OoD segmentation, which has been studied less than its classification counterpart and presents additional challenges. Segmentation is a dense prediction task for which the model’s outcome for each pixel depends on its surroundings. The receptive field and the reliance on context play a role for distinguishing different classes and, correspondingly, for spotting OoD entities. We introduce MOoSe, an efficient strategy to leverage the various levels of context represented within semantic segmentation models and show that even a simple aggregation of multi-scale representations has consistently positive effects on OoD detection and uncertainty estimation. Keywords: Out-of-distribution detection

· Semantic segmentation

1 Introduction

Supplementary Information: The online version contains supplementary material available at https://doi.org/10.1007/978-3-031-25072-9_34.

Imagine you see a pattern, an object, or a scene configuration you do not know. You will identify it as novel and it will attract your attention. This ability to deal with an open world and to identify novel patterns at all semantic levels is one of the many ways how human perception differs from contemporary machine learning. Most deep learning setups assume a closed world with a fixed set of known classes to choose from. However, many real-world tasks do not match this assumption. Very often, maximum deviations from the training samples are the most interesting data points. Accordingly, novelty/anomaly/out-of-distribution detection has attracted more and more interest recently. Outside of data regimes with limited variation, such as in industrial inspection [14,50], the common approaches to identify unseen patterns derive uncertainty estimates from an existing classification model and mark samples with large uncertainty as novel or out-of-distribution [3,27,31,41,56]. This approach comes with a conflict between the classifier focusing on features that help discriminate between the known classes

Fig. 1. Our method obtains predictions based on diverse contextual information from different dilated convolutions, exploiting the hierarchical structure of semantic segmentation architectures. On the anomalous pixels (cyan in the ground truth) the contextual predictions diverge, allowing us to improve upon the global model’s uncertainty score, which is overconfident in classifying the object as a car. From improved uncertainty we get better out-of-distribution detection. More details in Fig. 2 and Sect. 3.2.

and the need for rich and diverse features that can identify out-of-distribution patterns. This is especially true for semantic segmentation, where a pixel's class is not only defined by its own appearance, but also by the context it appears in. Based on context information only and ignoring appearance, a segmentation model could assume that a large animal (or an oversized telephone, like in the example in Fig. 1) in the middle of the road is a vehicle, while based on local appearance only it could believe that pictures on a billboard are, for example, actual people in the flesh. In order to combine context and local appearance, modern segmentation networks feature modules with different receptive fields and resolutions, designed to extract diverse representations including different amounts of contextual cues. While for known objects the different cues mostly align with the model's notion of a semantic class, in the case of novel objects the representations at multiple context levels tend to disagree. This can be used as an indicator for uncertainty. Indeed, our approach builds on this idea by using multiple heads as probes for comprehensive multi-scale cues, and obtains an aggregated uncertainty estimate for out-of-distribution (OoD) segmentation. We show that this strategy improves uncertainty estimates over using a single global prediction and often even over regular ensembles, while being substantially more efficient than the latter. It also raises the bar on the common benchmarks for OoD segmentation. We call our model MOoSe, for Multi-head OoD Segmentation. Source code is available at https://github.com/MOoSe-ECCV22/moose_eccv2022.

2 Related Work

Out of Distribution Detection. Out-of-distribution detection is closely related to uncertainty estimation. Under the assumption that a model should be uncertain about samples far from its training distribution, model uncertainty can be used as a proxy score for detecting outliers [29,37]. Several methods for OoD detection, including ours, rely on an existing model trained on a semantic task

on the in-distribution data, such as image classification or segmentation [28,29]. Different techniques have been developed to improve outlier detection by means of uncertainty scores, either at inference time [28,41] or while learning the representations [3,27,30,31,56].
Another set of methods for uncertainty estimation is inspired by Bayesian neural networks, which produce probabilistic predictions [6]. For example, Monte-Carlo dropout [21,34] approximates the predictive distribution by sampling a finite number of parameter configurations at inference time using random dropout masks. Although arguably not strictly Bayesian, ensembles [37,55] also approximate the predictive distribution by polling a set of independently trained models fixed at inference time. Attributes of the predictive distribution, such as its entropy, can be used as a measure for uncertainty [1,53]. Alternatively, one can directly model the distribution of the training data, and the likelihood estimated by the resulting model can be used to detect outliers [35,45,51,62]. Other approaches rather rely on learning pretext tasks on the in-distribution data as a proxy for density estimation. Examples of such tasks are reconstruction [17,36,42,47,57] and classification of geometric image transformations [22].
Dense OoD Detection. Methods that are effective at recognizing outlier images do not always scale well to dense OoD detection, where individual pixels in each image need to be classified as in-distribution or anomalous. A recent work [28] has found that advanced methods like generative models [2,25,52] and Monte-Carlo dropout [21] are outperformed by metrics derived from the predictions of a pre-trained semantic segmentation network, such as the values of the segmentation logits. Several recent works [4,7,9,23] focus on the improvement of such segmentation by-products. In particular, the practice of Outlier Exposure [30], originally developed for recognition, has recently gained popularity in dense anomaly detection: several approaches revolve around using outlier data during training, either real data [4,9,54] or data sampled from a generative model [23,36]. While our method does not need outlier exposure to work, we show that it can be beneficially combined with it.
As mentioned above, deep ensembles [37] are a versatile tool and a gold standard for uncertainty estimation, making them a popular choice for anomaly detection [38,55]. Their relative scalability and effectiveness made them a viable option for uncertainty estimation in dense contexts, including anomaly segmentation [19,20]: ensembles are a simple and almost infallible way of improving the quality of uncertainty scores of neural networks. At the core of ensemble techniques is diversity between models, which is often provided by random weight initialization and data bootstrapping [32,37], sometimes by architectural differences [61]. While these sources of diversity are of proven efficacy and versatility, they are generic and ignore the requirements of the task at hand, introducing significant computational costs. Multi-headed ensembles mitigate this drawback by sharing the largest part of the network and drawing their diversity only from independent, lightweight heads [39,40,46].

Even though it is related to multi-headed ensembles, our method exploits an additional source of diversity: leveraging a widespread architectural design specific to semantic segmentation, MOoSe captures the variety of contextual information and receptive field within the same model. This allows for performance improvements that are equal or superior to those of bootstrapped ensembles, at a fraction of the computational cost.

3 Multi-head Context Networks

Fig. 2. MOoSe: illustration of the multi-head architecture based on the DeepLabV3 semantic segmentation model. During training (a) all the probes learn standard semantic segmentation, while the rest of the network (pre-trained) is unaffected. At test time (b) the uncertainties of all heads (contextual and global) are pooled together into an improved scoremap for OoD detection.

3.1 Contextual Diversity in Semantic Segmentation Networks

Consider a semantic segmentation decoder based on the popular spatial pyramid architecture [10,11,13,63], i.e. containing either a Spatial Pyramid Pooling (SPP [59]) or Atrous Spatial Pyramid Pooling (ASPP [10]) structure. Such structures include a series of pooling or dilated convolutional operations applied in parallel to a set of feature embeddings. We refer to these operations as spatial pyramid modules. Each spatial pyramid module has a unique pooling scale or receptive field, allowing for scale invariance and providing the decoder with various degrees of context: the larger the receptive field the more global context at the cost of a loss in detail. The outputs of the spatial pyramid modules are concatenated and fed to a segmentation head that produces a single prediction. In this section we propose a method for extracting the contextual diversity between the representations produced by the different spatial pyramid modules.

By exploiting said diversity we are able to improve OoD segmentation performance and network calibration without the need for expensive ensembling of multiple models. Our method is non-invasive with regard to the main semantic segmentation task and lightweight in terms of computational cost, making it suitable for real-world applications without affecting segmentation performance. In addition, since our approach improves the uncertainty scores of an existing model, it can easily be combined with other state-of-the-art approaches.

3.2 Probing Contextual Diversity

We start from a generic semantic segmentation architecture featuring K spatial pyramid modules F = {f_c1, ..., f_cK}, each with a different context size c_k, as in Fig. 2. The encoder features θ are fed to the pyramid modules, producing a set of contextual embeddings Φ = {φ_c1, ..., φ_cK}, where φ_ck = f_ck(θ). The segmentation output is produced by the global head h_g, which takes all the concatenated features Φ. We denote the output logits of the global head as z_g := h_g(Φ).
Our method consists of a simple addition to this generic architecture, namely the introduction of probes to extract context-dependent information from the main model. We pair each spatial pyramid module f_ck with a contextual prediction head h_ck, which is trained to produce segmentation logits z_ck = h_ck(φ_ck) from the context features of size c_k (and the global pooling features, see [11]). Given the input image x, we denote the prediction distributions as

p(ŷ | x, h_g) = softmax(z_g)   and   p(ŷ | x, h_ck) = softmax(z_ck)   (1)

for the global and contextual heads respectively. We train all the heads using a standard cross-entropy loss given N classes and the ground truth segmentation y (we drop the mean operation over the spatial dimensions for simplicity):

L_CE(x, y) = − Σ_{k=1}^{K} Σ_{i=1}^{N} y_i · log p(ŷ_i | x, h_ck),   (2)

although any semantic segmentation objective could be used. The contextual heads are designed to act as probes and extract information from the context-specific representations. For this reason, we do not backpropagate the gradients coming from the contextual heads to the rest of the network, and only update the weights of the heads themselves. The spatial pyramid modules are distinct operations, each with its specific scope depending on its context size. By stopping the gradients before the spatial pyramid modules we force each head to solve the same segmentation task but using different features, preserving prediction diversity. As a byproduct, our architectural modifications do not interfere with the rest of the network and the main segmentation task.
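A minimal sketch of this probing scheme in PyTorch is given below; the module and variable names are invented for illustration, and the probes are deliberately trained on detached pyramid features so that no gradient reaches the shared network, mirroring the stop-gradient described above.

```python
import torch
import torch.nn as nn

class ProbedSegmenter(nn.Module):
    def __init__(self, encoder, pyramid_modules, global_head, num_classes, feat_channels=256):
        super().__init__()
        self.encoder = encoder                       # pre-trained, left untouched
        self.pyramid = nn.ModuleList(pyramid_modules)
        self.global_head = global_head
        # one lightweight contextual probe per spatial pyramid module (assumed channel count)
        self.probes = nn.ModuleList(
            [nn.Conv2d(feat_channels, num_classes, kernel_size=1) for _ in pyramid_modules]
        )

    def forward(self, x):
        theta = self.encoder(x)
        phis = [f(theta) for f in self.pyramid]              # contextual embeddings
        z_global = self.global_head(torch.cat(phis, dim=1))  # regular segmentation output
        # detach(): probes see the context features but send no gradient back to the model
        z_probes = [probe(phi.detach()) for probe, phi in zip(self.probes, phis)]
        return z_global, z_probes

def probe_loss(z_probes, target):
    # every probe solves the same segmentation task on its own context features
    ce = nn.CrossEntropyLoss()
    return sum(ce(z, target) for z in z_probes)
```

In practice only the probe parameters would be handed to the optimizer, so the shared encoder, pyramid modules and global head remain exactly as trained.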

Head Architecture. The architecture of the contextual heads is based on the global head of the base segmentation model. For example, for DeepLabV3 it consists of a projection block to bring the number of channels down to 256, followed by a sequence of d prediction blocks (3 × 3 convolution, batch normalization, ReLU), plus a final 1 × 1 convolution for prediction. The head depth d can be tuned according to the predictive power necessary to process contextual information, depending on the difficulty of the dataset.
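The head description above translates almost directly into a small module; the input channel count and the exact composition of the projection block are assumptions, the rest follows the projection / d prediction blocks / 1 × 1 classifier layout described in the text.

```python
import torch.nn as nn

def contextual_head(in_channels: int, num_classes: int, d: int = 1, width: int = 256) -> nn.Sequential:
    """Probe head: projection to `width` channels, d conv-BN-ReLU blocks, 1x1 classifier."""
    blocks = [nn.Conv2d(in_channels, width, kernel_size=1),
              nn.BatchNorm2d(width), nn.ReLU(inplace=True)]
    for _ in range(d):
        blocks += [nn.Conv2d(width, width, kernel_size=3, padding=1),
                   nn.BatchNorm2d(width), nn.ReLU(inplace=True)]
    blocks.append(nn.Conv2d(width, num_classes, kernel_size=1))
    return nn.Sequential(*blocks)
```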

3.3 Out-of-Distribution Detection with MOoSe

A model for dense OoD detection should assign to each location in the input image an anomaly score. To obtain per-pixel OoD scores we test three scoring functions, applied to the outputs of the segmentation heads: maximum softmax probability (MSP) [29], prediction entropy (H) [53] and maximum logit (ML) [28]. We adapt each scoring function to work with predictions from multiple heads. The maximum softmax probability is computed on the average predicted distribution over all the heads, including the global head:

S_MSP = − max_{i∈[1,N]} [ 1/(K+1) ( p(ŷ_i | x, h_g) + Σ_{k=1}^{K} p(ŷ_i | x, h_ck) ) ].   (3)

Similarly, for the entropy we compute the entropy of the expected output distribution:

S_H = H[ 1/(K+1) ( p(ŷ | x, h_g) + Σ_{k=1}^{K} p(ŷ | x, h_ck) ) ],   (4)

where H denotes the information entropy. For maximum logit we average the logits over the different heads and compute their negated maximum:

S_ML = − max_{i∈[1,N]} [ 1/(K+1) ( z_{g,i} + Σ_{k=1}^{K} z_{ck,i} ) ].   (5)

All scores should be directly proportional to the model’s belief of a pixel belonging to an anomalous object, therefore for MSP and ML the negatives are taken.
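In code, the three pooled scores of Eqs. (3)-(5) reduce to a few tensor operations over the per-head outputs. The sketch below assumes logits of shape (heads, classes, H, W) with the global head first, which is a layout chosen for the example rather than taken from the paper.

```python
import torch
import torch.nn.functional as F

def moose_scores(head_logits: torch.Tensor):
    """head_logits: (K+1, N, H, W) -- global head followed by K contextual probes."""
    probs = F.softmax(head_logits, dim=1)          # per-head class distributions
    mean_probs = probs.mean(dim=0)                 # 1/(K+1) * sum over heads
    mean_logits = head_logits.mean(dim=0)

    s_msp = -mean_probs.max(dim=0).values                                       # Eq. (3)
    s_entropy = -(mean_probs * mean_probs.clamp_min(1e-12).log()).sum(dim=0)    # Eq. (4)
    s_ml = -mean_logits.max(dim=0).values                                       # Eq. (5)
    return s_msp, s_entropy, s_ml
```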

4 Experiments

In this section we evaluate our approach on out-of-distribution detection, comparing it to ensembles (Sect. 4.4) and to the state of the art (Sect. 4.5).

4.1 Datasets and Benchmarks

StreetHazards [28] is a synthetic dataset for semantic segmentation and OoD detection. It features street scenes in diverse settings, created with the CARLA simulation environment [18]. The 1500 test samples feature instances from 250 different anomalous objects, diverse in appearance, location, and size.

The BDD-Anomaly [28] dataset is derived from the BDD100K [60] semantic segmentation dataset by removing the samples containing instances of the motorcycle, bicycle, and train classes, and using them as a test set for OoD segmentation, yielding a 6280/910/810 training/validation/test split. BDD-Anomaly and StreetHazards constitute the CAOS benchmark [28].
Fishyscapes - LostAndFound [5,49] is a dataset for road obstacle detection, designed to be used in combination with the Cityscapes [15] driving dataset. Its test split contains 1203 images of real street scenes featuring road obstacles, whose presence is marked in the segmentation ground truth.
RoadAnomaly [42] consists of 60 real-world images of diverse anomalous objects in driving environments, collected from the internet. The images come with pixel-wise annotations of the anomalous objects, making them suitable for testing models trained on driving datasets. Results for the anomaly track of the SegmentMeIfYouCan [8] benchmark can be found in the supplementary material.

4.2 Evaluation Metrics

We evaluate OoD detection performance using the area under the precision-recall curve (AUPR) and the false positive rate at 95% true positive rate (FPR95). As is customary, anomalous pixels are considered positives. Other works include results for the area under the ROC curve (AUROC); however, the AUPR is to be preferred to this metric in the presence of heavy class imbalance, which is the case for anomaly segmentation [16].

Table 1. CAOS benchmark. Comparison between single model/global head (Global), multi-head ensembles (MH-Ens), standard deep ensembles (DeepEns) and MOoSe on dense out-of-distribution detection. The results are for DeepLabV3 and PSPNet models with ResNet50 backbones. All three scoring functions (maximum softmax probability (MSP), entropy (H), maximum logit (ML)) are considered. All results are percentages, best results are shown in bold.

                       StreetHazards                     BDD-Anomaly
                       DeepLabV3        PSPNet           DeepLabV3        PSPNet
Score fn.  Method      AUPR↑   FPR95↓   AUPR↑   FPR95↓   AUPR↑   FPR95↓   AUPR↑   FPR95↓
MSP        Global      9.11    22.37    9.65    22.04    7.01    22.47    6.75    23.63
           MH-Ens      9.69    21.40    9.84    22.49    7.55    25.50    8.07    23.41
           DeepEns     10.22   21.09    10.61   20.75    7.64    21.53    8.52    21.31
           MOoSe       12.53   21.05    11.28   21.94    8.66    22.49    8.11    24.09
H          Global      11.89   22.07    12.28   21.77    10.23   20.64    9.89    21.69
           MH-Ens      12.59   21.10    12.45   22.29    10.62   23.51    11.73   20.76
           DeepEns     13.43   20.62    13.39   20.35    11.39   19.31    12.32   18.83
           MOoSe       15.43   19.89    14.52   21.20    12.59   19.27    12.35   20.98
ML         Global      13.57   23.27    13.43   27.71    10.69   15.60    10.68   16.79
           MH-Ens      13.99   21.86    13.64   28.30    10.69   20.19    12.40   15.08
           DeepEns     14.57   21.79    14.14   25.82    11.40   14.66    12.26   13.96
           MOoSe       15.22   17.55    15.29   20.46    12.52   13.86    12.88   13.94
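For reference, the AUPR and FPR95 metrics reported above can be computed from per-pixel anomaly scores and binary ground-truth masks roughly as in the sketch below (using scikit-learn; the function and variable names are illustrative and not taken from the paper's code).

```python
import numpy as np
from sklearn.metrics import average_precision_score, roc_curve

def aupr_and_fpr95(scores, labels):
    """scores: anomaly scores per pixel; labels: 1 for anomalous pixels, 0 otherwise."""
    scores = np.asarray(scores).ravel()
    labels = np.asarray(labels).ravel()

    # AUPR with anomalous pixels treated as positives.
    aupr = average_precision_score(labels, scores)

    # FPR at the threshold where the true positive rate first reaches 95%.
    fpr, tpr, _ = roc_curve(labels, scores)
    idx = min(int(np.searchsorted(tpr, 0.95)), len(fpr) - 1)
    fpr95 = fpr[idx]
    return aupr, fpr95
```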

4.3 Experimental Setup

MOoSe relies on semantic segmentation models to perform dense OoD detection. For the experiments on StreetHazards and BDD-Anomaly we report results for two convolutional architectures (DeepLabV3 [10] and PSPNet [63], each with ResNet50 and ResNet101 backbones) and one transformer-based architecture (Lawin [58], see supplementary material). For the experiments on LostAndFound and RoadAnomaly we use DeepLabV3+ [12] with a ResNet101 backbone, trained on Cityscapes (parameters available at https://github.com/NVIDIA/semantic-segmentation) or BDD100k [60], respectively.

Training. We build MOoSe on top of fully trained semantic segmentation networks, by adding the prediction heads and training them jointly for segmentation on the respective dataset, using a standard pixel-wise cross-entropy loss. Although nothing prevents training the whole model together, for fairness of comparison we only apply the loss to the probes. In order to prevent any alteration to the main model while training the heads, we stop gradient propagation through the rest of the network and make sure that the normalization layers do not update their statistics during forward propagation. The heads are trained for 80 epochs, or until saturation of segmentation performance (mIoU).

MOoSe introduces two hyperparameters: the learning rate and the depth d of the contextual heads. By default we use d = 1 for the models trained on StreetHazards and d = 3 otherwise. While the performance gains depend on these, we find that our method is robust to configuration changes, as we show in an ablation study in the supplementary material.
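The probe-training recipe above (frozen backbone, no gradient flow into the main network, fixed normalization statistics) can be approximated as in the following PyTorch sketch; `backbone`, `heads` and the optimizer setup are placeholders rather than the authors' actual code.

```python
import torch
import torch.nn.functional as F

def train_probes_step(backbone, heads, images, targets, optimizer):
    """One training step that updates only the probe heads."""
    backbone.eval()                       # keep BatchNorm statistics frozen
    with torch.no_grad():                 # no gradients through the main network
        features = backbone(images)
    features = features.detach()

    loss = 0.0
    for head in heads:                    # contextual probes, trained jointly
        logits = head(features)
        loss = loss + F.cross_entropy(logits, targets, ignore_index=255)

    optimizer.zero_grad()
    loss.backward()                       # gradients reach only the heads' parameters
    optimizer.step()
    return float(loss)
```

The optimizer passed in is assumed to be built over the heads' parameters only, so the backbone weights remain untouched even if gradients were to reach them.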

4.4 Comparison with Ensembles

In this section we compare MOoSe with the single-prediction baseline (global head) and with two types of ensembles. Deep ensembles [37] (DeepEns) consist of sets of independent segmentation networks, each trained on a different random subset of 67% of the original data, starting from a different random parameter initialization [32]. Similarly, multi-head ensembles (MH-Ens) are trained on random data subsets, but share the same encoder and only feature diverse prediction heads, for increased efficiency. We compare to ensembles with 5 members/heads to match the number of heads in our method. Additionally, we pick the ensemble member with the median AUPR performance to serve both as the single-model baseline and as initialization for MOoSe. The shared backbone of the multi-head ensembles also comes from the same model.

Table 1 shows results for the CAOS benchmark (StreetHazards and BDD-Anomaly) using DeepLabV3 and PSPNet as base architectures, with ResNet50 [26] backbones. We report results for the three OoD scoring functions described in Sect. 3.3; results for MOoSe are averaged over 3 runs, and standard deviations are available in the supplementary material.

Fig. 3. OoD segmentation & head contribution. Test image from the StreetHazards dataset: a street scene containing anomalous objects (indicated in cyan in the ground truth (b)). The contextual predictions (e-h) diverge on the outliers, improving the entropy score map (c, d). The example shows an interesting failure case: the street sign (in-distribution) on the right also sparks disagreement between the heads, resulting in increased entropy and thus a false positive.

For all datasets, architectures, scoring functions, and metrics, MOoSe consistently outperforms its respective global head. Similarly, MOoSe outperforms multi-head ensembles, as well as deep ensembles in most cases, while having a smaller computational cost than both. In accordance with observations in other works [28], the maximum-logit scoring function tends to outperform entropy, most notably in terms of FPR95 and on the BDD-Anomaly dataset. Both scoring functions consistently outperform the maximum softmax probability. Moreover, maximum logit appears to combine well with MOoSe by effectively reducing false positives. Results for models using the ResNet101 backbone are available in the supplementary material.

Figure 3 shows an example of OoD segmentation on a driving scene. The top row compares the entropy obtained using the global head (3c) and our multi-head approach (3d). The probes of MOoSe disagree on the nature of the anomalous objects in the image, and the aggregated entropy score outlines the anomalous objects more clearly than the global head. However, prediction disagreement also produces false positives for smaller inlier objects, such as the street sign on the right, highlighting a possible failure mode of our approach.

Computational Costs. In Table 2 we compare our method against ensembles in terms of computational costs, reporting the number of parameters of each model and the estimated runtime of a forward pass. We consider DeepLabV3 and PSPNet with ResNet50 and MOoSe head depth 1. Deep ensembles have the highest parameter count and runtime, 5 times those of a single network. MOoSe compares favorably to both ensembles on all architectures. The larger size and runtime of PSPNet compared to DeepLab is due to its higher-dimensional representations, which can be reduced with projection layers before the probes.


Table 2. Computational costs. Estimated computational costs of MOoSe in comparison with ensembles. We report the number of parameters (in millions) and the estimated forward pass runtime on StreetHazards (in milliseconds), estimated on a single Nvidia RTX2080Ti GPU using the PyTorch [48] benchmarking utilities.

Architecture                   Single   MH-Ens   DeepEns   MOoSe
DeepLabV3    Parameters (M)    40       104      198       43
             Runtime (ms)      113      286      583       121
PSPNet       Parameters (M)    47       139      233       94
             Runtime (ms)      107      246      542       183

4.5 Comparison with the State of the Art

Here we compare MOoSe with the best approaches for dense OoD detection that do not require negative training data (see Sect. 4.5).

The CAOS Benchmark. On StreetHazards and BDD-Anomaly we compare with TRADI [20], SynthCP [57], OVNNI [19], Deep Metric Learning (DML) [7], and the approach by Grcic et al. [23] that uses outlier exposure with generated samples. TRADI and OVNNI require multiple forward passes per sample, increasing the evaluation run-time (or memory requirements) considerably. Table 3(a) shows that MOoSe compares favorably to existing works on both datasets and on all metrics. We note that, given its non-invasive nature, MOoSe is compatible with other approaches, and can for example be combined with the loss of DML.

Table 3. State-of-the-art comparison. Left - CAOS benchmark: MOoSe, in combination with the max-logit scoring function, outperforms all other methods on StreetHazards, except for DML in terms of FPR95. On BDD-Anomaly MOoSe performs the best in both metrics. Right: MOoSe yields improvements on both Fishyscapes LostAndFound (FS - LaF) and RoadAnomaly, but on the former benchmark is outperformed by Standardized Max-Logits (Std.ML).

Left - CAOS benchmark:
                 StreetHazards        BDD-Anomaly
Method           AUPR↑    FPR95↓      AUPR↑    FPR95↓
TRADI [20]       7.2      25.3        5.6      26.9
SynthCP [57]     9.3      28.4        –        –
OVNNI [19]       12.6     22.2        6.7      25.0
Grcic [23]       12.7     25.2        –        –
DML [7]          14.7     17.3        –        –
MOoSe ML         15.22    17.55       12.52    13.86

Right - Fishyscapes LostAndFound and RoadAnomaly:
                     FS - LaF             RoadAnomaly
Method               AUPR↑    FPR95↓      AUPR↑    FPR95↓
MSP   Global         3.06     37.46       23.76    51.32
MSP   MOoSe          7.13     33.72       31.53    43.41
H     Global         6.23     37.34       32.00    49.14
H     MOoSe          12.08    32.58       41.48    36.78
ML    Global         10.25    37.45       37.86    39.03
ML    MOoSe          13.64    32.32       43.59    32.12
Resynth. [42]        5.70     48.05       –        –
DML [7]              –        –           37       37
Std.ML [33]          31.05    21.52       25.82    49.74


Fig. 4. OoD segmentation. Examples on LostAndFound and RoadAnomaly, anomalous objects shown in cyan in the second column. The first row shows an example in which our model is able to recognize anomalous objects in their entirety, where the global head fails. The second row shows a failure case, where our model only marks the borders of the obstacle on the road and produces false positives, performing worse than the global head. The example in the last row is from RoadAnomaly: MOoSe detects the traffic cones better than the global head, but introduces noise in the background and still fails to detect the manhole.

LostAndFound, RoadAnomaly. We extend our evaluation of MOoSe to other real-world benchmarks, Fishyscapes LostAndFound and RoadAnomaly, using models trained on Cityscapes and BDD100k [60] respectively, as described in Sect. 4.3. We report our results in Table 3(b) and include results for the comparable (not needing negative training data) state-of-the-art methods DML, Standardized Max-Logits [33] and Image Resynthesis [42]. As on the CAOS benchmark, we observe that the adoption of MOoSe improves OoD detection performance on both benchmarks, regardless of the chosen scoring function. In the supplementary material we include results for the SegmentMeIfYouCan benchmark (similar to RoadAnomaly), where Image Resynthesis is currently state of the art. On the other hand, Table 3(b) shows that Image Resynthesis does not perform well on LostAndFound. Standardized Max-Logits has remarkable results on LostAndFound but not on RoadAnomaly, where MOoSe works best.

Figure 4 shows examples of OoD detection on LostAndFound and RoadAnomaly. In the first example MOoSe improves over the entropy heatmap of the global head. In the second example, however, our method still fails to detect the obstacles and produces more false positives. Increased false positives are also visible in the background of the third example, although here MOoSe also improves the detection of some anomalous objects.

Outlier Exposure. Several methods for anomaly segmentation rely on negative training data from a separate source. While this technique introduces some drawbacks, such as a reliance on the choice of the negative data and a potential negative impact on segmentation, it has been shown to improve OoD detection on the common benchmarks. Following the procedure described in [9] as “entropy training”, we investigate whether our method can also benefit from outlier exposure. Indeed, results show that outlier exposure boosts MOoSe+ML to 53.19 AUPR (+22%) and 24.38 FPR95 (−24%) on RoadAnomaly. Full results on all scoring functions are available in the supplementary material.

5 Analysis

Our approach relies on a collection of different predictions to improve OoD detection. Previous literature on network ensembles puts the spotlight on diversity [39,43], emphasizing that multiple estimators can be helpful only if their predictions are diverse and each contributes useful information to the cumulative decision. In this section we address some points to better understand the working principle of the method and verify its underlying hypotheses. Specifically, we investigate: 1) the effect of MOoSe on prediction diversity, 2) whether contextual aggregation can be responsible for prediction diversity, and 3) how this translates into better OoD detection.

5.1 Quantifying Diversity: Variance and Mutual Information

We are interested in comparing MOoSe to the closely related ensembles in terms of prediction diversity, of which a simple metric is variance. We compare the average variance of the output distributions of MOoSe and ensembles on the StreetHazards and BDD-Anomaly validation sets, as reported in Table 4 (left). On both datasets our method's predictions have higher variance than both ensembles. Variance, however, gives us no insight into what the predictions disagree upon, and is therefore of limited interest. From the literature on Bayesian networks we can borrow a more informative metric: the mutual information (MI) between the model distribution and the output distribution [44]. Consider an ensemble of K networks, or a multi-head model with K heads. Each model or head produces a prediction p(ŷ | x, k). We can compute the mutual information between the distribution of the models k and the distribution of their predictions as:

\mathrm{MI}(\hat{y}, k \mid x) = H\Big[ \frac{1}{K} \sum_{k=1}^{K} p(\hat{y} \mid x, k) \Big] - \frac{1}{K} \sum_{k=1}^{K} H\big[ p(\hat{y} \mid x, k) \big],    (6)

which is the entropy of the expected output distribution minus the average entropy of the output distributions. MI is high for a sample x if the predictions are individually confident but also in disagreement with each other. This tells us how much additional information the diversity brings to the overall model: if all the predictions are equally uncertain about the same samples they disagree on, then aggregating them will not affect the aggregated uncertainty estimate. In Table 4 (left) we report the average MI on StreetHazards and BDD-Anomaly validation, comparing again MOoSe and ensembles. Similarly to variance, our method’s predictions have higher MI than both ensemble types, indicating that contextual probing not only produces more diversity in absolute terms, but also that this diversity adds more information to the model’s predictive distribution. Finally, in Table 4 (left) we report the Expected Calibration Error [24] of all methods, to show that even if ensembles are better calibrated than the baseline, it is MOoSe that performs the best at uncertainty estimation overall.
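The diversity metrics used here (per-pixel predictive variance and the mutual information of Eq. (6)) can be computed from the per-head class distributions as sketched below; the tensor layout and names are assumptions made for illustration.

```python
import torch

def diversity_metrics(head_probs):
    """head_probs: tensor of shape (K, N, H, W) with per-head class distributions."""
    eps = 1e-12
    mean_p = head_probs.mean(dim=0)                                   # (N, H, W)

    # Per-pixel variance of the predicted distributions, averaged over classes.
    variance = head_probs.var(dim=0, unbiased=False).mean(dim=0)      # (H, W)

    # Eq. (6): entropy of the mean prediction minus the mean per-head entropy.
    h_mean = -(mean_p * (mean_p + eps).log()).sum(dim=0)              # (H, W)
    h_each = -(head_probs * (head_probs + eps).log()).sum(dim=1)      # (K, H, W)
    mutual_info = h_mean - h_each.mean(dim=0)                         # (H, W)
    return variance, mutual_info
```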


Table 4. Left - variance and Mutual Information (MI) show higher diversity for MOoSe than for ensembles, while lower ECE shows that our approach yields better calibrated predictions. All metrics are computed on DeepLabV3-ResNet50. Right - single-dilation: OoD detection results (AUPR) for single-dilation models (SD) with different dilation rates (1, 12, 24, 36), compared to standard multi-dilation MOoSe. The first row shows results for the global head, the second row adds the probes, and the bottom two rows show the absolute and relative improvement.

Left:
             StreetHazards             BDD-Anomaly
Method       Var.↑   MI↑     ECE↓      Var.↑   MI↑     ECE↓
Global       –       –       .038      –       –       .123
MH-Ens       0.20    .004    .039      0.30    .022    .104
DeepEns      0.54    .012    .032      1.19    .054    .103
MOoSe        1.05    .034    .031      1.34    .062    .093

Right:
             SD1     SD12    SD24    SD36    MOoSe
Var.         0.41    1.05    0.93    0.83    1.34
Global       8.2     8.1     9.1     8.6     10.2
+probes      10.1    10.1    10.6    10.5    13.1
Chng.        1.9     2.0     1.5     1.9     2.9
Chng. %      22.7    24.9    16.4    22.3    27.9

5.2 Context as a Source of Diversity

In the previous section we showed that our approach produces highly diverse predictions. In this section we investigate the source of this diversity: our hypothesis is that each head relies differently on contextual information depending on the dilation rate of its respective spatial pyramid module, resulting in diverse predictive behaviors. We test this hypothesis by evaluating the ability of each head to perform semantic segmentation when only contextual information is available. We corrupt the pixels of the foreground classes in BDD-Anomaly (pole, traffic light, traffic sign, person, car, truck, bus) with random uniform noise while leaving the background pixels unchanged, then we evaluate how well each head can still classify the corrupted foreground pixels by relying on the context. An example of the process can be seen in Fig. 5 (left).

Figure 5 (right) shows the mIoU on the noisy foreground as a percentage of the foreground mIoU on the original clean image. We observe that dilation rate and robustness to foreground corruption are proportional to each other at multiple noise levels, as further illustrated by the qualitative example in the figure. The difference in result quality between dilation rates confirms the validity of contextual aggregation as a source of prediction diversity, as anticipated by the comparison with regular (non-contextual) ensembles on variance and mutual information in Sect. 5.1.
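The corruption protocol can be reproduced roughly as in the sketch below; the class indices, value range and blending parameter are assumptions made for illustration, not the paper's exact setup.

```python
import torch

FOREGROUND_IDS = [5, 6, 7, 11, 13, 14, 15]   # assumed label ids for the listed classes

def corrupt_foreground(image, label_map, noise_level=1.0):
    """Replace foreground pixels with uniform noise while keeping the background intact.

    image: (3, H, W) tensor in [0, 1]; label_map: (H, W) integer class map.
    noise_level in [0, 1] blends between the original pixel and pure noise.
    """
    fg_mask = torch.zeros_like(label_map, dtype=torch.bool)
    for cls in FOREGROUND_IDS:
        fg_mask |= label_map == cls

    noise = torch.rand_like(image)                     # uniform noise in [0, 1]
    mask = fg_mask.unsqueeze(0).float()                # (1, H, W), broadcast over channels
    corrupted = image * (1 - mask * noise_level) + noise * mask * noise_level
    return corrupted, fg_mask
```

The retained mIoU in Fig. 5 (right) is then simply the foreground mIoU on the corrupted images divided by the foreground mIoU on the clean images.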

5.3 Effect of Contextual Diversity on OoD Detection

The results presented in Sect. 4 already show that contextual probing improves performance on the task. Moreover, results obtained from the application of MOoSe to transformer-based models (7.3% average AUPR increase on StreetHazards across scoring functions), which are available in the supplementary material, indicate that the principle is applicable across architectures and that its gains are not an artifact of CNNs.


Fig. 5. Left - example of foreground corruption: the cars are corrupted with noise and the head with the largest dilation rate (h36) can still largely segment them, unlike the no-dilation head h1. Right - corruption robustness: we evaluate semantic segmentation of each probe on the corrupted foreground objects. The retained mIoU (mIoU_corrupt / mIoU_clean) increases with dilation rate, indicating more reliance on context. Results for BDD-Anomaly on DeepLabV3.

The last point to address is the contribution of contextual diversity to out-of-distribution detection. To quantify this contribution, we performed an ablation study removing receptive field diversity from DeepLabV3 by using the same dilation rate for all the convolutions in the spatial pyramid module. We train several versions of this single-dilation (SD) MOoSe, each with a different dilation rate, and present the comparison with standard MOoSe in Table 4 (right). Firstly, all single-dilation models have lower prediction variance than regular MOoSe. Secondly, although the single-dilation models still outperform their global head, MOoSe yields larger gains than all SD models, both in absolute and relative terms. While these results confirm that contextual diversity is crucial for the success of our method, they also show that there are further contributing factors, consistent with the known benefits of ensembles.

6 Conclusion

In this work we proposed a simple and effective approach for improving dense out-of-distribution detection by leveraging the properties of segmentation decoders to obtain a set of diverse predictions. Our experiments showed that MOoSe yields consistent gains on a variety of datasets and model architectures, and that it compares favorably with computationally much more expensive ensembles. We showed that our approach also outperforms other state-of-the-art approaches, and that due to its simplicity it can easily be combined with them. Even though we tested our method on various architectures, and despite the versatility of the main idea, one current limitation of MOoSe is its reliance on a specific architectural paradigm: the spatial pyramid. We also identified false positives among small objects as an inherent failure mode of our approach, which could potentially be mitigated by combining MOoSe with alternative concepts that act at a single contextual scale.


References 1. Abdar, M., et al.: A review of uncertainty quantification in deep learning: Techniques, applications and challenges. arXiv preprint arXiv:2011.06225 (2020) 2. Baur, C., Wiestler, B., Albarqouni, S., Navab, N.: Deep autoencoding models for unsupervised anomaly segmentation in brain MR images. In: Crimi, A., Bakas, S., Kuijf, H., Keyvan, F., Reyes, M., van Walsum, T. (eds.) BrainLes 2018. LNCS, vol. 11383, pp. 161–169. Springer, Cham (2019). https://doi.org/10.1007/978-3030-11723-8 16 3. Bergman, L., Hoshen, Y.: Classification-based anomaly detection for general data. In: International Conference on Learning Representations (2019) ˇ 4. Bevandi´c, P., Kreˇso, I., Orˇsi´c, M., Segvi´ c, S.: Simultaneous semantic segmentation and outlier detection in presence of domain shift. In: Fink, G.A., Frintrop, S., Jiang, X. (eds.) DAGM GCPR 2019. LNCS, vol. 11824, pp. 33–47. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-33676-9 3 5. Blum, H., Sarlin, P.E., Nieto, J., Siegwart, R., Cadena, C.: The fishyscapes benchmark: Measuring blind spots in semantic segmentation. arXiv preprint arXiv:1904.03215 (2019) 6. Blundell, C., Cornebise, J., Kavukcuoglu, K., Wierstra, D.: Weight uncertainty in neural network. In: International Conference on Machine Learning, pp. 1613–1622. PMLR (2015) 7. Cen, J., Yun, P., Cai, J., Wang, M.Y., Liu, M.: Deep metric learning for open world semantic segmentation. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 15333–15342 (2021) 8. Chan, R., et al.: SegMentMeifYouCan: A benchmark for anomaly segmentation (2021) 9. Chan, R., Rottmann, M., Gottschalk, H.: Entropy maximization and meta classification for out-of-distribution detection in semantic segmentation. CoRR abs/2012.06575 (2020). https://arxiv.org/abs/2012.06575 10. Chen, L.C., Papandreou, G., Kokkinos, I., Murphy, K., Yuille, A.L.: DeepLab: semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs. IEEE Trans. Pattern Anal. Mach. Intell. 40(4), 834– 848 (2017) 11. Chen, L.C., Papandreou, G., Schroff, F., Adam, H.: Rethinking atrous convolution for semantic image segmentation. arXiv preprint arXiv:1706.05587 (2017) 12. Chen, L.C., Zhu, Y., Papandreou, G., Schroff, F., Adam, H.: Encoder-decoder with atrous separable convolution for semantic image segmentation. In: European Conference on Computer Vision (ECCV) (2018) 13. Cheng, B., et al.: Panoptic-DeepLab: a simple, strong, and fast baseline for bottomup panoptic segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 12475–12485 (2020) 14. Cohen, N., Hoshen, Y.: Sub-image anomaly detection with deep pyramid correspondences. arXiv preprint arXiv:2005.02357 (2020) 15. Cordts, M., et al.: The cityscapes dataset for semantic urban scene understanding. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) 16. Davis, J., Goadrich, M.: The relationship between precision-recall and roc curves. In: Proceedings of the 23rd International Conference on Machine Learning, pp. 233–240 (2006)


17. Di Biase, G., Blum, H., Siegwart, R., Cadena, C.: Pixel-wise anomaly detection in complex driving scenes. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 16918–16927 (2021) 18. Dosovitskiy, A., Ros, G., Codevilla, F., Lopez, A., Koltun, V.: CARLA: an open urban driving simulator. In: Conference on Robot Learning, pp. 1–16. PMLR (2017) 19. Franchi, G., Bursuc, A., Aldea, E., Dubuisson, S., Bloch, I.: One versus all for deep neural network incertitude (OVNNI) quantification. CoRR abs/2006.00954 (2020). https://arxiv.org/abs/2006.00954 20. Franchi, G., Bursuc, A., Aldea, E., Dubuisson, S., Bloch, I.: TRADI: tracking deep neural network weight distributions. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12362, pp. 105–121. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58520-4 7 21. Gal, Y., Ghahramani, Z.: Dropout as a bayesian approximation: representing model uncertainty in deep learning. In: International Conference on Machine Learning, pp. 1050–1059. PMLR (2016) 22. Golan, I., El-Yaniv, R.: Deep anomaly detection using geometric transformations. In: Neural Information Processing Systems (NeurIPS) (2018) ˇ 23. Grci´c, M., Bevandi´c, P., Segvi´ c, S.: Dense open-set recognition with synthetic outliers generated by real NVP. arXiv preprint arXiv:2011.11094 (2020) 24. Guo, C., Pleiss, G., Sun, Y., Weinberger, K.Q.: On calibration of modern neural networks. In: International Conference on Machine Learning, pp. 1321–1330. PMLR (2017) 25. Haselmann, M., Gruber, D.P., Tabatabai, P.: Anomaly detection using deep learning based image completion. In: 2018 17th IEEE International Conference on Machine Learning and Applications (ICMLA), pp. 1237–1242. IEEE (2018) 26. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016) 27. Hein, M., Andriushchenko, M., Bitterwolf, J.: Why relu networks yield highconfidence predictions far away from the training data and how to mitigate the problem. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 41–50 (2019) 28. Hendrycks, D., Basart, S., Mazeika, M., Mostajabi, M., Steinhardt, J., Song, D.: Scaling out-of-distribution detection for real-world settings. arXiv preprint arXiv:1911.11132 (2019) 29. Hendrycks, D., Gimpel, K.: A baseline for detecting misclassified and out-ofdistribution examples in neural networks. In: Proceedings of International Conference on Learning Representations (2017) 30. Hendrycks, D., Mazeika, M., Dietterich, T.: Deep anomaly detection with outlier exposure. In: International Conference on Learning Representations (2018) 31. Hendrycks, D., Mazeika, M., Kadavath, S., Song, D.: Using self-supervised learning can improve model robustness and uncertainty. In: Advances in Neural Information Processing Systems (NeurIPS) (2019) 32. Ilg, E., et al.: Uncertainty estimates and multi-hypotheses networks for optical flow. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 652–667 (2018) 33. Jung, S., Lee, J., Gwak, D., Choi, S., Choo, J.: Standardized max logits: a simple yet effective approach for identifying unexpected road obstacles in urban-scene segmentation. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pp. 15425–15434 (October 2021)


34. Kendall, A., Badrinarayanan, V., Cipolla, R.: Bayesian SegNet: model uncertainty in deep convolutional encoder-decoder architectures for scene understanding. In: British Machine Vision Conference 2017. BMVC 2017 (2017) 35. Kirichenko, P., Izmailov, P., Wilson, A.G.: Why normalizing flows fail to detect out-of-distribution data. arXiv preprint arXiv:2006.08545 (2020) 36. Kong, S., Ramanan, D.: OpenGAN: open-set recognition via open data generation. arXiv preprint arXiv:2104.02939 (2021) 37. Lakshminarayanan, B., Pritzel, A., Blundell, C.: Simple and scalable predictive uncertainty estimation using deep ensembles. In: Neural Information Processing Systems (NeurIPS) (2017) 38. Lee, K., Lee, K., Lee, H., Shin, J.: A simple unified framework for detecting out-ofdistribution samples and adversarial attacks. In: Advances in Neural Information Processing Systems. 31 (2018) 39. Lee, S., Purushwalkam, S., Cogswell, M., Crandall, D., Batra, D.: Why m heads are better than one: training a diverse ensemble of deep networks. arXiv preprint arXiv:1511.06314 (2015) 40. Li, H., Ng, J.Y.H., Natsev, P.: EnsembleNet: end-to-end optimization of multiheaded models. arXiv preprint arXiv:1905.09979 (2019) 41. Liang, S., Li, Y., Srikant, R.: Enhancing the reliability of out-of-distribution image detection in neural networks. In: International Conference on Learning Representations (2018) 42. Lis, K.M., Nakka, K.K., Fua, P., Salzmann, M.: Detecting the unexpected via image resynthesis. In: International Conference On Computer Vision (ICCV), pp. 2152– 2161 (2019). https://doi.org/10.1109/ICCV.2019.00224, http://infoscience.epfl.ch/ record/269093 43. Liu, L., et al.: Deep neural network ensembles against deception: Ensemble diversity, accuracy and robustness. In: 2019 IEEE 16th International Conference on Mobile Ad Hoc and Sensor Systems (MASS), pp. 274–282. IEEE (2019) 44. Malinin, A., Mlodozeniec, B., Gales, M.: Ensemble distribution distillation. arXiv preprint arXiv:1905.00076 (2019) 45. Nalisnick, E., Matsukawa, A., Teh, Y.W., Gorur, D., Lakshminarayanan, B.: Do deep generative models know what they don’t know? arXiv preprint arXiv:1810.09136 (2018) 46. Narayanan, A.R., Zela, A., Saikia, T., Brox, T., Hutter, F.: Multi-headed neural ensemble search. In: Workshop on Uncertainty and Robustness in Deep Learning (UDL@ICML2021) (2021) 47. Nguyen, D.T., Lou, Z., Klar, M., Brox, T.: Anomaly detection with multiplehypotheses predictions. In: International Conference on Machine Learning, pp. 4800–4809. PMLR (2019) 48. Paszke, A., et al.: Automatic differentiation in pytorch. In: NIPS 2017 Workshop on Autodiff (NIPS-W) (2017) 49. Pinggera, P., Ramos, S., Gehrig, S., Franke, U., Rother, C., Mester, R.: Lost and found: detecting small road hazards for self-driving vehicles. In: 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 1099–1106. IEEE (2016) 50. Rudolph, M., Wandt, B., Rosenhahn, B.: Same Same but DifferNet: semisupervised defect detection with normalizing flows. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1907– 1916 (2021)


51. Schirrmeister, R., Zhou, Y., Ball, T., Zhang, D.: Understanding anomaly detection with deep invertible networks through hierarchies of distributions and features. In: Advances in Neural Information Processing Systems. 33 (2020) 52. Schlegl, T., Seeb¨ ock, P., Waldstein, S.M., Schmidt-Erfurth, U., Langs, G.: Unsupervised anomaly detection with generative adversarial networks to guide marker discovery. In: Niethammer, M. (ed.) IPMI 2017. LNCS, vol. 10265, pp. 146–157. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-59050-9 12 53. Smith, L., Gal, Y.: Understanding measures of uncertainty for adversarial example detection. arXiv preprint arXiv:1803.08533 (2018) 54. Vojir, T., Sipka, T., Aljundi, R., Chumerin, N., Reino, D.O., Matas, J.: Road anomaly detection by partial image reconstruction with segmentation coupling. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 15651–15660 (2021) 55. Vyas, A., Jammalamadaka, N., Zhu, X., Das, D., Kaul, B., Willke, T.L.: Out-ofdistribution detection using an ensemble of self supervised leave-out classifiers. In: Proceedings of the European Conference on Computer Vision (ECCV) (September 2018) 56. Winkens, J., et al.: Contrastive training for improved out-of-distribution detection. arXiv preprint arXiv:2007.05566 (2020) 57. Xia, Y., Zhang, Y., Liu, F., Shen, W., Yuille, A.L.: Synthesize then compare: detecting failures and anomalies for semantic segmentation. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12346, pp. 145–161. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58452-8 9 58. Yan, H., Zhang, C., Wu, M.: Lawin transformer: Improving semantic segmentation transformer with multi-scale representations via large window attention. CoRR abs/2201.01615 (2022), https://arxiv.org/abs/2201.01615 59. Yoo, D., Park, S., Lee, J.Y., So Kweon, I.: Multi-scale pyramid pooling for deep convolutional representation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pp. 71–80 (2015) 60. Yu, F., Xian, W., Chen, Y., Liu, F., Liao, M., Madhavan, V., Darrell, T.: BDD100K: a diverse driving video database with scalable annotation tooling. arXiv preprint arXiv:1805.04687 2(5), 6 (2018) 61. Zaidi, S., Zela, A., Elsken, T., Holmes, C., Hutter, F., Teh, Y.W.: Neural ensemble search for performant and calibrated predictions. In: Workshop on Uncertainty and Robustness in Deep Learning (UDL@ICML2020) (2020) 62. Zhang, H., Li, A., Guo, J., Guo, Y.: Hybrid models for open set recognition. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12348, pp. 102–117. Springer, Cham (2020). https://doi.org/10.1007/978-3-03058580-8 7 63. Zhao, H., Shi, J., Qi, X., Wang, X., Jia, J.: Pyramid scene parsing network. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2881–2890 (2017)

Adversarial Vulnerability of Temporal Feature Networks for Object Detection

Svetlana Pavlitskaya, Nikolai Polley, Michael Weber, and J. Marius Zöllner

FZI Research Center for Information Technology, 76131 Karlsruhe, Germany
Karlsruhe Institute of Technology (KIT), 76131 Karlsruhe, Germany
[email protected]

Abstract. Taking into account information across the temporal domain helps to improve environment perception in autonomous driving. However, it has not been studied so far whether temporally fused neural networks are vulnerable to deliberately generated perturbations, i.e. adversarial attacks, or whether temporal history is an inherent defense against them. In this work, we study whether temporal feature networks for object detection are vulnerable to universal adversarial attacks. We evaluate attacks of two types: imperceptible noise for the whole image and locally-bound adversarial patch. In both cases, perturbations are generated in a white-box manner using PGD. Our experiments confirm, that attacking even a portion of a temporal input suffices to fool the network. We visually assess generated perturbations to gain insights into the functioning of attacks. To enhance the robustness, we apply adversarial training using 5-PGD. Our experiments on KITTI and nuScenes datasets demonstrate, that a model robustified via K-PGD is able to withstand the studied attacks while keeping the mAP-based performance comparable to that of an unattacked model.

Keywords: Adversarial attacks · Temporal fusion · Object detection

1 Introduction

Deep neural networks (DNNs) have become an indispensable component of environment perception in autonomous driving systems. The inherent vulnerability of DNNs to adversarial attacks [25] makes adversarial robustness one of the crucial requirements before their wide adoption in autonomous vehicles is possible. Recent studies [3,6,11] demonstrate that adversarial attacks can be performed in the real world and thus present a significant threat to self-driving cars. Previous works have already investigated the adversarial vulnerability of DNNs for specific tasks like object detection [11] or semantic segmentation [14,17], and also of DNNs with specific architectures like sensor fusion [29] or multi-task learning [13]. In this work, we focus on temporal feature networks as a model under attack.


Fig. 1. We evaluate universal attacks on temporal feature networks for object detection (here – with a universal patch). Adversarial patch suppresses all detections of an undefended baseline and creates fake detections instead (top). Detection of ground truth objects by a robust model is no longer restrained by an adversarial patch (bottom). Note, that the patch against robust models has a more complex structure

Although the emphasis of recent studies was mostly on LiDAR data used in conjunction with camera images, temporal fusion is a further approach to increase object detection accuracy and deserves attention. There has been very little research into fusing multiple images in the temporal dimension for object detection. Object detectors with temporal fusion, however, receive more context from previous images and outperform single-image object detectors [30]. In particular, the prediction of occluded objects that are visible in previous images is a possible advantage. To the best of our knowledge, we are the first to explore attacks and defenses for this type of DNN.

As an exemplary model under attack, we consider the late slow fusion architecture for object detection proposed by Weber et al. [30], which is in turn inspired by DSOD [23] and expanded to incorporate several input images. The model proposed by Weber et al. outperformed its non-temporal counterpart, demonstrating the importance of temporal history for object detection accuracy.

The main goal of our work is to study whether CNNs with temporal fusion are prone to adversarial attacks and what impact the temporal history has on the adversarial vulnerability of these models. For this, we evaluate two variants of universal white-box attacks against this model: with an adversarial patch (see Fig. 1) and with adversarial noise. In our setting, a single instance of malicious input is generated to attack all possible data. This way, if printed out, the generated patch can be used for an attack in the real world, as has already been demonstrated for state-of-the-art object detectors in previous works [3,24,26]. To increase the adversarial robustness of the model under attack, we consider adversarial training using K-PGD [12].

2 Related Work

2.1 Adversarial Attacks

Since the discovery of adversarial attacks by Szegedy et al. [25], a number of algorithms to attack DNNs have been proposed, the most prominent being the Fast Gradient Sign Method (FGSM) [9] and Projected Gradient Descent (PGD) [12]. While these attacks are traditionally performed in a per-instance manner, i.e. an adversarial perturbation is applied to a single image in order to fool a model, universal perturbations that are able to attack multiple instances are also possible [15]. Universal adversarial inputs pose a special threat because they are also able to attack images beyond the training data.

Another line of research aims at developing physical adversarial attacks. For this, visible adversarial perturbations are generated within a certain image area. The resulting adversarial patch can then be printed out to perform an attack in the real world. After their introduction by Brown et al. [3], adversarial patches have been shown to successfully fool various deep learning models, including object detectors [11,18], semantic segmentation networks [17] and end-to-end driving models [19]. A combination of adversarial patches with universal attacks is especially interesting: given the transferability of adversarial examples across DNNs, such attacks might also be performed in a black-box manner in the real world [26,32].

Although no previous work on the adversarial vulnerability of temporal fusion networks is known, some effort has already been made in the community to develop adversarial attacks against sensor fusion models [29,33]. In particular, the work by Yu et al. has revealed that late fusion is more robust against attacks than early fusion [33]. The evaluated dataset, however, is relatively small, with only 306 training and 132 validation samples from the KITTI dataset.

2.2 Adversarial Training

Adversarial training (AT) is currently one of the few defenses that are able to combat even strong attacks [2]. It consists in training a DNN while adding adversarial inputs to each minibatch of training data. It has recently been shown that adversarial training not only increases the robustness of neural networks to adversarial attacks but also leads to better interpretability [28]. While the idea originates from [9], the first strong defense was demonstrated with a multi-step PGD algorithm (the K-PGD adversarial training) [12]. For each minibatch, a forward/backward step is first executed k times to generate an adversarial input, and then a single forward/backward step follows, which updates the model parameters. The PGD loop thus drastically increases the overall training time. For this reason, K-PGD adversarial training is intractable for large datasets.

One of the recently proposed strategies to speed up adversarial training is the so-called AT for free [21], which reuses gradient information during training. Instead of performing separate gradient calculations to generate adversarial examples during training, adversarial perturbations and model parameters are updated simultaneously in a single backward pass. This way, multiple FGSM steps are performed on a single minibatch to simulate the PGD algorithm while concurrently training the model. The authors were the first to apply adversarial training to the large-scale ImageNet classification task. The robustified models demonstrate resistance to attacks comparable to that of K-PGD, while being 7 to 30 times faster. The approach, however, is still more time-consuming than standard training. A similar method to accelerate adversarial training, named YOPO (You Only Propagate Once), is proposed by Zhang et al. [34]. In YOPO, the gradients of the early network layers are frozen and reused to generate an adversarial input. YOPO is four to five times faster than K-PGD, although results are only provided for relatively small datasets; it reaches a performance similar to free adversarial training while being less computationally expensive. Most recently, a further approach to accelerate adversarial training was proposed by Wong et al. [31]. Starting from the assumption that iterative attacks like K-PGD do not necessarily lead to more robust defenses, the authors propose to use R-FGSM AT instead, which applies FGSM after a random initialization. This adversarial training method is claimed to be as effective as K-PGD.

Enhancing adversarial training to increase robustness against universal attacks was first addressed by Shafahi et al. [22]. Each training step uses FGSM to update a universal adversarial perturbation, which is then simultaneously used to update the model parameters. The proposed extension to adversarial training introduces almost no additional computational cost, which makes adversarial training on large datasets possible. A further approach to harden DNNs against universal attacks is shared adversarial training, proposed in [16]. To generate a shared perturbation, each batch is split into heaps, which are then attacked with single perturbations. These perturbations are aggregated, shared across heaps, and further used for standard adversarial training.

3 Adversarial Attacks

3.1 Threat Model

We consider two types of white-box attacks in this work: adversarial patch and adversarial noise. All attacks are performed in a universal manner, i.e. a single perturbation is used to attack all images [15]. We use a slightly modified version of the PGD algorithm [12] for the attack. In order to apply PGD in a universal manner, an initially empty mask is introduced and added to each input image. We then update only this mask, in contrast to the original PGD, which updates the input images directly. We use Adam [10] for faster PGD convergence, as suggested in [5]. We also do not take the sign of the gradients but use the actual gradient values instead. In the unsigned case, pixels that strongly affect the prediction can be modified to a much larger extent than less important pixels. The noise initialization strategy is a further setting influencing training speed. For the FGSM attack, the advantage of initialization with random values has already been shown [27]. We perform patch-based attacks with randomly initialized patch values and noise-based attacks initialized with Xavier [8].
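A minimal sketch of this universal perturbation loop is given below, assuming a PyTorch detector with a callable training loss; the image resolution, loss function and data loader are illustrative placeholders, not the authors' implementation.

```python
import torch

def universal_noise_attack(model, data_loader, detection_loss,
                           epsilon=10 / 255, epochs=100, lr=1e-3):
    """Train a single universal additive perturbation against all images."""
    model.eval()
    delta = torch.zeros(1, 3, 576, 1024, requires_grad=True)    # shared noise mask
    optimizer = torch.optim.Adam([delta], lr=lr)                 # Adam instead of signed steps

    for _ in range(epochs):
        for images, targets in data_loader:                      # broadcasts over sequence frames
            adv = (images + delta).clamp(0, 1)
            loss = -detection_loss(model(adv), targets)          # maximize the detection loss
            optimizer.zero_grad()
            loss.backward()                                      # unsigned gradients used by Adam
            optimizer.step()
            with torch.no_grad():
                delta.clamp_(-epsilon, epsilon)                  # keep the noise imperceptible
    return delta.detach()
```

The same mask is added to every frame of a sequence via broadcasting, matching the universal setting in which one perturbation attacks all inputs.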

3.2 Adversarial Training

Adversarial training expands the training dataset with adversarial examples created on the fly during training. Training on this dataset should enable the model to predict the correct label even in the presence of an attack. We consider the established K-PGD AT with a patch or noise generated for each input sequence. We create a single adversarial example (with either patch or noise) for all images in an input sequence. Thus, the generated adversarial perturbation is not universal and is only intended to fool the data from the current batch. The use of this approach is motivated by the recent observation that defending against non-universal attacks also protects against universal attacks [16].

Although the Free [21] and YOPO [34] approaches offer a considerable speed-up in training, they are not applicable in our case. Both algorithms require the same loss function to train the model and to create adversarial examples. This is only possible for untargeted attacks, where the goal is to maximize the loss for the correct class. Instead, we want to perform object-vanishing attacks, which require a different loss function than the one used to train the model. Therefore, reusing the gradients already computed for the parameter update step, as foreseen in these AT algorithms, is not possible in our case.
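The per-batch adversarial training step described here can be sketched as follows; the object-vanishing attack loss, the tensor layout and the model interface are assumptions used only for illustration.

```python
import torch

def kpgd_at_step(model, images, targets, train_loss, vanish_loss,
                 optimizer, k=5, epsilon=5 / 255, attack_lr=1e-2):
    """One K-PGD adversarial training step on a batch of input sequences."""
    # Inner loop: craft one perturbation shared by all frames of each sequence.
    delta = torch.zeros_like(images[:, :1], requires_grad=True)   # (B, 1, C, H, W), broadcast over time
    attack_opt = torch.optim.Adam([delta], lr=attack_lr)
    for _ in range(k):                                            # k extra forward/backward passes
        adv = (images + delta).clamp(0, 1)
        attack_opt.zero_grad()
        vanish_loss(model(adv), targets).backward()               # drive detections towards "empty"
        attack_opt.step()
        with torch.no_grad():
            delta.clamp_(-epsilon, epsilon)

    # Outer step: update the model on the adversarial batch.
    adv = (images + delta.detach()).clamp(0, 1)
    optimizer.zero_grad()
    train_loss(model(adv), targets).backward()
    optimizer.step()
```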

4 Experimental Setup

4.1 Dataset

For the evaluation, we consider the late slow fusion architecture for object detection proposed by Weber et al. [30]. While the original work by Weber et al. focused on the KITTI Object Detection dataset [7], we additionally run our experiments on the nuScenes data [4]. We focus on models with four input frames with equal temporal distance.

KITTI. Images from the object detection benchmark are resized to 1224×370. We only select images that have the corresponding three temporally preceding frames delivered in the dataset, resulting in 3689 sequences of length four for training and 3754 sequences of length four for validation.

Fig. 2. Architecture of the temporal feature network used for baselines. bs denotes the bottleneck size. Inception blocks use k × 1 × 1 convolutions for depth reduction.

nuScenes. The dataset contains 850 annotated scenes, where each scene contains about 40 keyframes per camera, taken at a frequency of 2 Hz. We generate input sequences of length four from the keyframes as follows: if keyframes {a, b, c, d, e} belong to the same scene, the two sequences {a, b, c, d} and {b, c, d, e} are created. For the training dataset, we use front-facing and backward-facing images; for the latter, we invert the order of the sequence. Additionally, we augment the training dataset by horizontal mirroring of images. Since the nuScenes dataset contains images from a left-driving country, the introduced images remain within the domain. We use the 700/150 scene split for training and validation, as recommended by nuScenes, so that the training subset contains 52060 and the validation subset 5569 sequences of length four. Images are resized from an initial 1600×900 to 1024×576 pixels and normalized to a range of [0,1]. For the attacks, we do not use horizontally mirrored images, so that the training dataset contains 26030 images. Since we focus on the detection of cars and pedestrians, the nuScenes classes adult, child, construction worker, police officer, stroller and wheelchair are consolidated into one new class pedestrian. As a second class we define car, which corresponds to the nuScenes class vehicle.car. We do not include other vehicles available in the nuScenes labels (e.g. buses, motorcycles, trucks, and trailers) to minimize intra-class variance.
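The sliding-window construction of length-four keyframe sequences can be expressed as a small helper; the function name and input format are assumptions for illustration.

```python
def build_sequences(scene_keyframes, length=4):
    """scene_keyframes: keyframe identifiers of one scene, in temporal order.

    Returns all consecutive windows of the given length, e.g. for keyframes
    [a, b, c, d, e] the sequences [a, b, c, d] and [b, c, d, e].
    """
    return [scene_keyframes[i:i + length]
            for i in range(len(scene_keyframes) - length + 1)]

# build_sequences(["a", "b", "c", "d", "e"]) -> [["a","b","c","d"], ["b","c","d","e"]]
```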

4.2 Baselines

The baselines follow the late slow fusion architecture (see Fig. 2) proposed by Weber et al. [30]. The input is a sequence of images with temporal distance Δt = 500 ms, and the prediction is learned for the last input frame. Each input image is processed by a separate 2D convolutional stem, identical to the 2D DSOD detector [23]. The stems are followed by a series of 3D dense blocks, interleaved with transition blocks. The subsequent Inception redux blocks reduce the temporal depth by half. Finally, a detector subnet similar to that of YOLOv2 [20] follows. For each dataset, we deliberately defined new anchor boxes to have high IoU scores with the bounding boxes in the training data. For this, we analyzed all objects in the training dataset and clustered them using k-means with IoU as a distance metric, as sketched below. As a result, we obtained eight anchor boxes, which comprise various box shapes to detect both pedestrians (upright vertical rectangles) and cars (horizontal rectangles).

We train the baselines on an NVIDIA RTX 2080 Ti GPU. KITTI models are trained for 50 epochs, whereas nuScenes models are trained for 100 epochs. For validation, the PASCAL VOC implementation of mAP is used. The mAP is always calculated with a confidence threshold of 0.01 and an NMS threshold of 0.5.
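A minimal version of IoU-based k-means anchor clustering (in the spirit of YOLOv2) is shown below; it operates on box widths and heights only, and all names are illustrative rather than the authors' code.

```python
import numpy as np

def iou_wh(boxes, anchors):
    """IoU between boxes and anchors given as (width, height), centers aligned."""
    inter = np.minimum(boxes[:, None, 0], anchors[None, :, 0]) * \
            np.minimum(boxes[:, None, 1], anchors[None, :, 1])
    union = boxes[:, 0:1] * boxes[:, 1:2] + anchors[None, :, 0] * anchors[None, :, 1] - inter
    return inter / union

def kmeans_anchors(boxes_wh, k=8, iters=100, seed=0):
    """Cluster (width, height) pairs with 1 - IoU as the distance metric."""
    rng = np.random.default_rng(seed)
    anchors = boxes_wh[rng.choice(len(boxes_wh), k, replace=False)].astype(float)
    for _ in range(iters):
        assign = np.argmax(iou_wh(boxes_wh, anchors), axis=1)   # nearest anchor = highest IoU
        for j in range(k):
            members = boxes_wh[assign == j]
            if len(members) > 0:
                anchors[j] = np.median(members, axis=0)          # robust cluster update
    return anchors
```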


Table 1. APcar in % of the 1-class KITTI baseline and AT models.

Model / Attack                      No attack   Universal   Universal noise,   Universal noise,
                                                patch       ε = 5/255          ε = 10/255
Baseline                            74.82       13.62       34.73              2.08
5-PGD AT with patch                 66.47       52.18       24.46              23.07
5-PGD AT with noise, ε = 5/255      62.77       42.82       61.52              61.75
5-PGD AT with noise, ε = 10/255     53.24       32.15       53.24              53.07

Table 2. mAP in % of the 2-class KITTI baseline and AT models.

Model / Attack                      No attack   Universal   Universal noise,   Universal noise,
                                                patch       ε = 5/255          ε = 10/255
Baseline                            50.93       4.36        34.49              4.43
5-PGD AT with patch                 50.86       44.18       23.88              14.83
5-PGD AT with noise, ε = 5/255      41.95       27.90       41.95              41.35
5-PGD AT with noise, ε = 10/255     38.50       25.76       38.49              38.49

We train models to detect either only objects of the class car (1-class models) or objects of the classes car and pedestrian (2-class models). Tables 1 and 2 show the performance of the 1-class and 2-class KITTI baselines. Whereas the average precision for the class car is comparable for both models (APcar reaches 74.82% for the 1-class model vs. 73.21% for the 2-class model), the worse performance for the underrepresented class pedestrian (28.65%) explains the overall worse mAP of the 2-class model. Tables 3 and 4 summarize the results for the nuScenes baselines. Due to class imbalance (108K annotated cars vs. 48K annotated pedestrians), the results of the 2-class baseline for the underrepresented class pedestrian are again significantly worse. The results achieved on the KITTI and nuScenes baselines, described below, are comparable unless stated otherwise.

4.3 Adversarial Noise Attack

For a universal noise attack, we initially applied the proposed universal PGD with the changes motivated above. However, it turned out that this attack leads to the detection of nonexistent new objects (see Fig. 3a). The loss maximization objective apparently favors creating new features that resemble objects rather than preventing the detection of existing objects. We have therefore adapted the attack algorithm by replacing the gradient ascent with gradient descent on an empty label. Figure 3b shows that adversarial noise generated using targeted PGD on an empty label can successfully suppress all objects present in the input. All perturbations are trained using the Adam optimizer for 100 epochs.


Table 3. APcar in % of the 1-class nuScenes baseline and AT models.

Model / Attack                      No attack   Universal   Universal noise,   Universal noise,
                                                patch       ε = 5/255          ε = 10/255
Baseline                            73.20       6.20        9.03               0.15
5-PGD AT with patch                 74.04       61.89       27.71              27.65
5-PGD AT with noise, ε = 5/255      72.24       3.51        71.85              68.19

Table 4. mAP in % of the 2-class nuScenes baseline and AT models. For this model, AT with reused patches and R-FGSM AT were additionally evaluated.

Model / Attack                      No attack   Universal   Universal noise,   Universal noise,
                                                patch       ε = 5/255          ε = 10/255
Baseline                            27.55       0.98        0.74               0.16
5-PGD AT with patch                 28.69       17.95       7.02               0.53
5-PGD AT with noise, ε = 5/255      27.10       12.83       25.78              20.69
AT with reused patches              27.24       9.13        6.01               0.48
R-FGSM AT with noise, ε = 5/255     27.51       2.72        0.24               0.02

4.4 Adversarial Patch Attack

To generate a universal patch, we apply PGD with unsigned gradients using the Adam optimizer for 100 epochs. We evaluated patches of size 71 × 71, 51 × 51 and 31 × 31. As expected, the largest patch led to a stronger attack and was used in the following experiments. Note that a 71 × 71 patch still covers only about 1% of the image area for both KITTI and nuScenes images. To evaluate the impact of patch position, we evaluated a total of 33 patch positions. Average precision for different positions fluctuated only within a few percentage points, so the patch position apparently has only a minor impact on the attack strength.
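Applying such a universal patch to a batch of input frames amounts to overwriting a fixed image region, roughly as below; the patch size, position and tensor layout are assumptions for illustration.

```python
import torch

def apply_patch(images, patch, top=100, left=100):
    """Paste a universal patch into every frame of every sequence.

    images: (B, T, 3, H, W) input sequences in [0, 1]; patch: (3, P, P) trainable tensor.
    """
    p = patch.shape[-1]
    patched = images.clone()
    patched[..., top:top + p, left:left + p] = patch.clamp(0, 1)
    return patched

# During the attack, only the patch receives gradients, e.g.:
# patch = torch.rand(3, 71, 71, requires_grad=True)
# optimizer = torch.optim.Adam([patch], lr=1e-2)
```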

4.5 Adversarial Training

We apply adversarial training as a method to improve the robustness of the studied models. We consider K-PGD AT and two additional AT strategies: (1) reusing patches already generated against the baseline during training and (2) R-FGSM.


Fig. 3. Predictions of the 2-class nuScenes baseline on images attacked with universal noise, ε = 5/255.

Fig. 4. Predictions of the nuScenes baselines on images attacked with universal patch

In the case of K-PGD AT, creating an adversarial example iteratively with k steps increases the number of forward and backward propagations by a factor of k. We therefore used a small k = 5 per adversarial attack during training and drastically increased the learning rate of the Adam optimizer to make the attacks possible. Training 5-PGD AT on nuScenes and KITTI thus took five times longer than training the corresponding baseline, both for adversarial noise and for the adversarial patch.

In the case of pre-generated patches for adversarial training, we first generated a pool of patches against the baseline as described above. We then trained a model while adding a randomly chosen patch at each training step with a 50% probability. The AT with reused patches involves no generation of new patches, so its duration is comparable to regular model training. Finally, R-FGSM AT was trained with adversarial noise with ε = 5/255.

5 Evaluation

5.1 Attacks on 1-Class vs. on 2-Class Baselines

Visual assessment of the adversarial noise patterns helps to understand how the attack functions. We observed different behavior of attacks on 1-class and 2-class models. In particular, universal noise generated to attack 1-class baselines evidently contains structures resembling cars, whereas noise attacking 2-class models exhibits no such patterns (see Fig. 5). Apparently, the attack aims at mimicking existing objects if they all belong to one class. Adversarial patches generated for both types of models, however, look similar. Patch-based attacks on the nuScenes baselines tend to make the model detect non-existing cars in the patch (see Fig. 4), whereas attacks against the KITTI 2-class baseline rather make it find pedestrians in the patch. This might be explained by the different proportion of pedestrians in the corresponding datasets. Patches against the nuScenes 2-class model never mimicked pedestrians, because pedestrians are highly underrepresented in the training data.

5.2 Impact of Temporal Horizon

Temporal fusion models have better performance due to the incorporation of temporal history. To assess which portion of the history suffices to attack the model, we perform an exemplary evaluation with the ε = 10/255 adversarial noise attack on the 1-class KITTI baseline. We attacked parts of the input sequence using adversarial noise that was initially generated for the whole input sequence of length four, and using adversarial noise generated specifically for the corresponding portion of the input (see Table 5). We observed that perturbations deliberately generated for specific frames work better when attacking them than those generated for the whole input sequence. In both cases, the attack works best when the frames immediately preceding the current frame are attacked. On the contrary, attacking only the oldest frame leads to the worst results. Also, perturbing only the frame for which the prediction is made, while not attacking the temporal history at all, leads to a significantly weaker attack. Finally, the more preceding temporal history frames are attacked, the stronger the attack. These results confirm that the later images in the input sequence are more important for the prediction. Furthermore, single attacked images that appear later in the input sequence cause a larger error than those which appear earlier.

Furthermore, we evaluate the impact of the temporal horizon. In addition to the already evaluated models with four input images, we also evaluated models with a smaller sequence length. In particular, for the 1-class nuScenes model we observe a drop of roughly 10 percentage points for each reduction of the number of input images: 73.20% mAP for a temporal history of length four, 63.12% for length three and 52.18% for length two. The attack strength also decreases correspondingly. We thus conclude that a larger temporal horizon helps to enhance not only the performance on clean data, but also the adversarial robustness.


Fig. 5. Universal noise with ε = 5/255, generated to attack baselines and models robustified via 5-PGD AT with noise, ε = 5/255. Original pixel value range [−5, 5] mapped to [0, 255] for better visibility

5.3 Robustness of the Adversarially Trained Models

To evaluate the robustness of the robustified models, we attack them again with newly generated universal patches and noises. Tables 1–4 demonstrate the results. All models robustified via AT demonstrate performance similar to the baseline on non-attacked data, with almost no accuracy drop on the malicious data for the nuScenes models and a small drop for the KITTI models. For the KITTI dataset, we have additionally evaluated AT with adversarial noise with ε = 5/255 and ε = 10/255 (see Tables 1 and 2). The larger epsilon leads to worse performance on clean data due to the larger perturbation.


Table 5. Attacking a part of the sequence. The attacked frames (combinations of t−3, t−2, t−1, t) are highlighted in the original table; prediction is performed for frame t. mAP in % is reported for the 1-class KITTI baseline, attacked with ε = 10/255 adversarial noise, generated either for all four inputs or for each evaluated case

Noise generated for all four inputs   Noise generated for each evaluated case
74.82                                 74.82
70.68                                 58.46
67.02                                 45.43
53.14                                 36.21
23.91                                 5.77
50.84                                 22.14
32.45                                 10.13
6.19                                  2.96
3.80                                  2.74
2.08                                  2.08

Fig. 6. Universal patches to attack different nuScenes 2-class models

Moreover, we have evaluated AT with reused patches and R-FGSM AT on the 2-class nuScenes model (see Table 4). As expected, the model defended with reused patches is less robust to attacks than the one trained with 5-PGD. Surprisingly, the R-FGSM AT method has completely failed to defend against attacks. We explain this behavior with the catastrophic overfitting phenomenon, mentioned in the original work by Wong et al. [31] and in a more recent study by Andriushchenko et al. [1], which challenges the original claim that using randomized initialization prevents this overfitting. Figure 6 compares patches generated for the adversarially trained models with the patch generated against the nuScenes 2-class baseline. In the case of patch reuse, the patch contains more green and yellow pixels than the original patch. In the case of K-PGD adversarial training, the patch is brighter and contains more white pixels.


Fig. 7. Per-instance adversarial noise with ε = 5/255 generated to attack the nuScenes 2-class model on a single input sequence. Original pixel value range [−5, 5] mapped to [0, 255] for better visibility

Fig. 8. Per-instance adversarial patch generated to attack nuScenes 2-class model on single input sequence

Interestingly, the patch generated for a model that was adversarially trained with adversarial noise is the only one that contains structures resembling a car. Figure 5 compares universal noises generated to attack the KITTI and nuScenes baselines and the corresponding models defended via 5-PGD AT with ε = 5/255 noise. Both contain a streak of color at the horizon line and wave-like patterns at the bottom. We again conclude that perturbations attacking robustified models exhibit a more complex structure.

5.4 Robustness of the AT-Trained Models Against Per-Instance Attacks

Finally, we examine whether adversarially trained models also become robust against per-instance attacks. For this, we take an exemplary input sequence and generate adversarial perturbations against it. Each attack is trained for 1000 steps. The per-instance noise attack (see Fig. 7) manages to completely suppress all detections of the corresponding model. Analogously to universal attacks, non-universal noise attacks against the hardened model look much more complex. Similarly, per-instance patches (see Fig. 8) against the robustified model show complex structures resembling cars.


Interestingly, this patch is also unable to efficiently attack the model; several cars are still correctly detected after applying this patch. Overall, while models hardened with the evaluated adversarial training strategies are very successful in resisting universal attacks while preserving high accuracy, they remain unprotected against per-instance attacks. Universal attacks are, however, much more feasible with regard to real-life settings.

6 Conclusion

In this work, we have studied the adversarial vulnerability of temporal feature networks for object detection. The architecture proposed by Weber et al. [30] was used as an exemplary model under attack. Our experiments on the KITTI and nuScenes datasets have demonstrated that the studied temporal fusion model is susceptible to both universal patch and noise attacks. Furthermore, we have explored different adversarial training strategies as a defense measure. Out of the three evaluated methods, the 5-PGD approach with per-instance adversarial noise has proven to be the most powerful. The R-FGSM strategy, however, has failed to defend against the studied attacks. 5-PGD adversarial training was able to withstand newly created universal attacks, and the robustified networks have demonstrated only a slight drop in performance on clean data. Our experiments with attacking a portion of the temporal history have demonstrated that the frames immediately preceding the current frame have a greater impact on the model decision and thus lead to stronger attacks when manipulated. We have further observed that reducing the temporal horizon worsens both the performance and the adversarial robustness of the model. We have compared the universal and per-instance perturbations generated to attack the baseline and the robustified models. In all cases, we observed that in order to attack a hardened neural network, the adversarial perturbation has to exhibit a much more complex structure. In particular, a universal patch against the most robust model (5-PGD with noise) contains a pattern resembling a car. Our adversarially trained models, however, remain vulnerable to non-universal attacks such as per-instance-generated noise or patches. This stresses the need for further research in this area. Since the computation time for adversarial training is still a bottleneck, adapting gradient re-usage strategies like [21] or [34] to models which use different loss functions to learn an adversarial perturbation and to update the model weights might be a promising line of research for future work.

Acknowledgement. The research leading to these results is funded by the German Federal Ministry for Economic Affairs and Climate Action within the project "KI Absicherung" (grant 19A19005W) and by KASTEL Security Research Labs. The authors would like to thank the consortium for the successful cooperation.


References

1. Andriushchenko, M., Flammarion, N.: Understanding and improving fast adversarial training. Adv. Neural Inf. Proc. Syst. (NIPS) 33, 16048–16059 (2020)
2. Athalye, A., Carlini, N., Wagner, D.: Obfuscated gradients give a false sense of security: circumventing defenses to adversarial examples. In: International Conference on Machine Learning (ICML). PMLR (2018)
3. Brown, T.B., Mané, D., Roy, A., Abadi, M., Gilmer, J.: Adversarial patch. In: Advances in Neural Information Processing Systems (NIPS) - Workshops (2017)
4. Caesar, H., Bankiti, V., Lang, A.H., Vora, S., Liong, V.E., et al.: nuScenes: a multimodal dataset for autonomous driving. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2020)
5. Carlini, N., Wagner, D.A.: Towards evaluating the robustness of neural networks. In: Symposium on Security and Privacy (SP). IEEE (2017)
6. Eykholt, K., et al.: Robust physical-world attacks on deep learning visual classification. In: Conference on Computer Vision and Pattern Recognition (CVPR). IEEE (2018)
7. Geiger, A., Lenz, P., Urtasun, R.: Are we ready for autonomous driving? The KITTI vision benchmark suite. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2012). https://doi.org/10.1109/CVPR.2012.6248074
8. Glorot, X., Bengio, Y.: Understanding the difficulty of training deep feedforward neural networks. In: International Conference on Artificial Intelligence and Statistics (AISTATS) (2010)
9. Goodfellow, I.J., Shlens, J., Szegedy, C.: Explaining and harnessing adversarial examples. In: International Conference on Learning Representations (ICLR) (2015)
10. Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. In: International Conference on Learning Representations (ICLR) (2015)
11. Lee, M., Kolter, Z.: On physical adversarial patches for object detection. CoRR abs/1906.11897 (2019)
12. Madry, A., Makelov, A., Schmidt, L., Tsipras, D., Vladu, A.: Towards deep learning models resistant to adversarial attacks. In: International Conference on Learning Representations (ICLR) (2018)
13. Mao, C., et al.: Multitask learning strengthens adversarial robustness. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12347, pp. 158–174. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58536-5_10
14. Metzen, J.H., Kumar, M.C., Brox, T., Fischer, V.: Universal adversarial perturbations against semantic image segmentation. In: International Conference on Computer Vision (ICCV). IEEE Computer Society (2017)
15. Moosavi-Dezfooli, S.M., Fawzi, A., Fawzi, O., Frossard, P.: Universal adversarial perturbations. In: Conference on Computer Vision and Pattern Recognition (CVPR). IEEE (2017)
16. Mummadi, C.K., Brox, T., Metzen, J.H.: Defending against universal perturbations with shared adversarial training. In: International Conference on Computer Vision (ICCV) (2019)
17. Nesti, F., Rossolini, G., Nair, S., Biondi, A., Buttazzo, G.C.: Evaluating the robustness of semantic segmentation for autonomous driving against real-world adversarial patch attacks. In: Proceedings of the Winter Conference on Applications of Computer Vision (WACV). IEEE (2022)
18. Pavlitskaya, S., Codau, B., Zöllner, J.M.: Feasibility of inconspicuous GAN-generated adversarial patches against object detection. In: International Joint Conference on Artificial Intelligence (IJCAI) - Workshops (2022)


19. Pavlitskaya, S., Ünver, S., Zöllner, J.M.: Feasibility and suppression of adversarial patch attacks on end-to-end vehicle control. In: International Conference on Intelligent Transportation Systems (ITSC). IEEE (2020)
20. Redmon, J., Farhadi, A.: YOLO9000: better, faster, stronger. In: Conference on Computer Vision and Pattern Recognition (CVPR). IEEE (2017)
21. Shafahi, A., et al.: Adversarial training for free! In: Advances in Neural Information Processing Systems (NIPS) (2019)
22. Shafahi, A., Najibi, M., Xu, Z., Dickerson, J., Davis, L.S., Goldstein, T.: Universal adversarial training. In: AAAI Conference on Artificial Intelligence (2020)
23. Shen, Z., Liu, Z., Li, J., Jiang, Y.G., Chen, Y., Xue, X.: DSOD: learning deeply supervised object detectors from scratch. In: International Conference on Computer Vision (ICCV). IEEE (2017)
24. Song, D., et al.: Physical adversarial examples for object detectors. In: USENIX Workshop on Offensive Technologies, WOOT 2. USENIX Association (2018)
25. Szegedy, C., et al.: Intriguing properties of neural networks. In: International Conference on Learning Representations (ICLR) (2014)
26. Thys, S., Ranst, W.V., Goedemé, T.: Fooling automated surveillance cameras: adversarial patches to attack person detection. In: Conference on Computer Vision and Pattern Recognition (CVPR) - Workshops. Computer Vision Foundation. IEEE (2019)
27. Tramèr, F., Kurakin, A., Papernot, N., Goodfellow, I., Boneh, D., McDaniel, P.: Ensemble adversarial training: attacks and defenses. In: International Conference on Learning Representations (ICLR) (2018)
28. Tsipras, D., Santurkar, S., Engstrom, L., Turner, A., Madry, A.: Robustness may be at odds with accuracy. In: International Conference on Learning Representations (ICLR) (2019)
29. Tu, J., et al.: Exploring adversarial robustness of multi-sensor perception systems in self driving. CoRR abs/2101.06784 (2021)
30. Weber, M., Wald, T., Zöllner, J.M.: Temporal feature networks for CNN based object detection. In: Intelligent Vehicles Symposium (IV). IEEE (2021)
31. Wong, E., Rice, L., Kolter, J.Z.: Fast is better than free: revisiting adversarial training. In: International Conference on Learning Representations (ICLR) (2020)
32. Xu, K., et al.: Adversarial T-shirt! Evading person detectors in a physical world. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12350, pp. 665–681. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58558-7_39
33. Yu, Y., Lee, H.J., Kim, B.C., Kim, J.U., Ro, Y.M.: Investigating vulnerability to adversarial examples on multimodal data fusion in deep learning. CoRR abs/2005.10987 (2020)
34. Zhang, D., Zhang, T., Lu, Y., Zhu, Z., Dong, B.: You only propagate once: accelerating adversarial training via maximal principle. In: Advances in Neural Information Processing Systems (NIPS) (2019)

Towards Improved Intermediate Layer Variational Inference for Uncertainty Estimation

Ahmed Hammam1,2(B), Frank Bonarens1, Seyed Eghbal Ghobadi3, and Christoph Stiller2

1 Stellantis, Opel Automobile GmbH, Rüsselsheim am Main, Germany
[email protected]
2 Institute of Measurement and Control Systems, Karlsruhe Institute of Technology, Karlsruhe, Germany
3 Technische Hochschule Mittelhessen, Giessen, Germany

Abstract. DNNs have been excelling at many computer vision tasks, achieving many milestones and continuing to prove their validity. It is important for DNNs to express their uncertainties so that they can be taken into account for the whole application output, whilst maintaining their performance. Intermediate layer variational inference has been a promising approach to estimate uncertainty in real time, beating state-of-the-art approaches. In this work, we propose an enhancement of intermediate layer variational inference by adding Dirichlet distributions to the network architecture. This improves, on the one hand, the reliability of the uncertainty estimation and, on the other hand, enables the detection of out-of-distribution samples. Results show that with the addition of Dirichlet distributions the DNN is able to maintain its segmentation performance whilst boosting its uncertainty estimation capabilities.

Keywords: Deep neural networks · Uncertainty estimation · Trustworthiness · Out-of-distribution detection

1 Introduction

Deep learning has been a breakthrough in the field of computer vision, being the go-to approach for several applications such as medical imaging [1,2] and Highly Automated Driving (HAD) [3,4]. For HAD systems, deep neural networks (DNNs) have been the most dominant technique for several applications such as sensor fusion [5,6] and path planning [7], contributing to several tasks such as object detection [8,9] and image semantic segmentation [10,11], among many others. Even though deep learning techniques have set new benchmarks for several tasks, they are very complex and hardly interpretable [12] due to the massive amount of parameters in the DNN. Furthermore, brittleness is a main concern regarding DNNs, as they are very susceptible to failure due to operational domain shift or to minor perturbations of the input [13,14].


Moreover, DNNs show a tendency to be overconfident, which causes great concern as this results in unreliably high confidence in incorrect predictions. These drawbacks become serious limitations for HAD functions, as they could lead to false detections in terms of misclassifications and mislocalizations, which can cause problems for the whole application process [12]. Hence, this has become the main concern for safety-critical applications needing high reliability and calibrated confidence scores. Another critical issue is the DNN's out-of-distribution (OoD) detection capability, which is essential to ensure the reliability and safety of machine learning systems [15,16]. For HAD systems, it is always desired that the driving system alerts the user when driving in unusual scenes or when an unknown object has been encountered, presuming that it won't be able to make a safe decision whilst driving.

In recent years, a common solution to these drawbacks has been to empower DNNs with the ability to express their uncertainty about their output predictions [17,18]. This paradigm towards expressing uncertainty has opened the door for researchers to investigate and implement various approaches and techniques to quantify uncertainty estimates [19,20]. Such estimates not only allow the DNN to express its uncertainty, but can also be used as a way to find anomalies in the network's output, and can be utilized to flag out-of-distribution classes apparent in the input.

Intermediate layer variational inference (ILVI) [21] has been one of the approaches to estimate epistemic uncertainty by using the variational inference technique with the addition of an intermediate multivariate layer. It has shown uncertainty estimation results comparable to the state-of-the-art approaches whilst running in real time, making it applicable to HAD functions. In this paper, we propose an enhancement to the ILVI approach by modeling the DNN outputs using Dirichlet distributions. This improvement boosts the DNN's uncertainty estimation and out-of-distribution detection capability. Furthermore, it improves the calibration and the uncertainty distributional separation efficiency and provides a more reliable uncertainty estimation whilst maintaining the segmentation performance.

The remainder of the paper is organized as follows: Sect. 2 presents related work on Dirichlet modeling and its uses in uncertainty estimation, whilst the proposed approach is discussed and explained in Sect. 3. The experiments conducted to test our approach are presented in Sect. 4, and a conclusion and overview of the work are given in Sect. 5.

2 Related Work

2.1 State-of-the-Art Uncertainty Estimation Approaches

To understand and model the uncertainty in DNNs, Bayesian deep learning has been the fundamental approach used to lay the foundations for all approaches [22–24]. However, this has been a demanding and challenging task, mainly due to the vast number of parameters in the DNN, which hinders the classical standard Bayesian inference approaches and yields intractable models. Many works have been proposed recently to solve the intractability problem, but only a few have been applicable to large-scale applications [25,26].


The two prominent approaches are Monte-Carlo Dropout (MC Dropout) [27] and Deep Ensembles [28]. The MC dropout approach adds extra dropout layers within the DNN, and at inference time the input image is passed more than once through the DNN to estimate the uncertainty. Deep ensembles follow a similar idea by training more than one DNN with different initial weights or on different splits of the training dataset. The input image is then passed to the DNN ensemble members, and all outputs are used to estimate the uncertainty. Even though both methods are the current state-of-the-art approaches for DNN uncertainty estimation, they both suffer from the inability to run in real time. Furthermore, MC Dropout often decreases DNN performance. Especially in the case of HAD functions, this critical issue causes a main concern when it comes to implementation and deployment, as it would affect their applicability.
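For illustration, a minimal sketch of MC dropout inference as described above is given below; the tiny model, layer sizes, and number of passes are arbitrary stand-ins.

```python
# MC dropout inference: keep dropout active at test time, pass the input T times,
# and use the spread of the outputs as a simple uncertainty estimate.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Dropout(p=0.5), nn.Linear(64, 19))

def mc_dropout_predict(model, x, T=20):
    model.train()                                  # keeps the dropout layers stochastic
    with torch.no_grad():
        probs = torch.stack([torch.softmax(model(x), dim=-1) for _ in range(T)])
    return probs.mean(dim=0), probs.var(dim=0)     # predictive mean and per-class variance

mean_p, var_p = mc_dropout_predict(model, torch.rand(4, 16))
```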

2.2 Dirichlet Distributions for Uncertainty Estimation

A recently growing line of work focuses on using Dirichlet distributions in training DNNs, being the natural prior of categorical distributions [29–31]. Dirichlet models have proved to provide more reliable uncertainty estimates whilst improving the uncertainty estimation itself. Furthermore, Dirichlet distributions have also been used in several works for OoD detection, showing prominent results [32,33]. Previous works [29,30] have introduced prior networks to model uncertainty using the concentration parameters of the Dirichlet model. As illustrated in Fig. 1, a Dirichlet distribution has its concentration focused on one class for a correct prediction with low uncertainty, whereas a flat distribution over all classes indicates high uncertainty for an indecisive prediction. The authors of [29] proposed to train the DNN to mimic a Dirac distribution for in-domain samples and a flat distribution for out-of-domain samples. To do so, they propose to learn the parameters of a Dirichlet distribution by training the DNN using the Kullback-Leibler (KL) divergence loss

Loss = E_{p_in(x)}[KL[Dir(θ_in | α̂) || p(θ_in | α_in)]] + E_{p_out(x)}[KL[Dir(θ_out | α̃) || p(θ_out | α_out)]],

where α_in and α_out represent the DNN's predictions for in-distribution and out-of-distribution targets respectively, and θ_in and θ_out the ground truth probability distributions for in- and out-of-distribution targets respectively. α̂ and α̃ denote the desired in-distribution and out-of-distribution concentration parameters respectively, where α̃ is a flat concentration obtained by setting all α̃_c = 1 and α̂ is a one-hot target. This idea has been further improved in [32] by using the reverse KL divergence, which has proven to be more reliable in uncertainty representation and to improve robustness against adversarial attacks. Similar to prior networks, [30] proposed to combine prior networks and the likelihood and then maximize the whole posterior. Based on the Dempster-Shafer theory of evidence, the Dirichlet model concentration parameters are trained using the expected mean square error. To further penalize the Dirichlet model towards having increased uncertainty on incorrect detections, they add the second term of the KL divergence from [29]. With that, it is ensured that the network is trained both on the correct classes and also obtains a proper uncertainty representation on incorrect predictions.


2.3 Out-of-Distribution Detection

Machine learning models are mostly trained based on the closed-world assumption, where the test data are assumed to be drawn from the same distribution as the training data [34,35]. However, realistically this is not always the case: when models are deployed in an open-set world, test samples may not always come from the same distribution as the training data. It is therefore crucial for the system to raise an alert whenever an OoD sample has been detected. This task is divided into four subcategories: anomaly detection, novelty detection, open-set recognition, and out-of-distribution detection [36]. In this work we are concerned with the latter case, where we would like to detect out-of-distribution samples and show high uncertainty on them. Several works have focused on such approaches due to their need for HAD functions [37–39].

To test the DNN's capabilities for OoD detection, a question arises: what data sources shall be used to test OoD detection? There are several solutions. One approach is the generation of OoD samples with generative adversarial networks [40,41]. This method showed promising results on simple images such as handwritten characters; however, its complexity increases for natural image datasets due to the difficulty of generating high-quality images in high-dimensional space. Another line of work proposes to use external datasets to train the DNN on OoD samples. This helps the DNN to explore much more data and therefore improves the DNN's OoD detection capabilities [42]. Moreover, a more practical approach for OoD data is using the training dataset itself. This approach, named leave-out class, is applied by removing one class from the training dataset and afterwards testing the DNN's OoD capabilities using this class. This has proved to be a very practical and efficient approach towards OoD analysis [43,44].

Motivated by the previous works [29–31,33], we seek to use Dirichlet models as a method to produce more reliable uncertainty estimates and also to improve on the out-of-distribution detection task. Formulating the loss function using the KL divergence is usually challenging to train and can cause a poor error surface for optimizing the DNN, due to the harsh conditions of mimicking a certain concentration distribution [29]. In this work we choose to formulate our loss function directly by maximizing the likelihood of the Dirichlet concentration parameters rather than using the KL divergence.

3 Methodology

3.1 Dirichlet Models

A supervised network aims to predict the target value y ∈ Y for an input x ∈ X, where the input space X corresponds to the space of images.


Accordingly, a supervised machine learning problem with the task of semantic segmentation has a target space Y consisting of a finite set of K classes, where the task for the network is to predict the class of each pixel out of the set of classes K. For our purpose, a DNN is defined as a function f_w : X → Y, parameterized by weights w, which maps an input x ∈ X to an output f_w(x) ∈ Y. Given the probability simplex S = {(θ_1, . . . , θ_K) : θ_i ≥ 0, ∑_i θ_i = 1}, the Dirichlet distribution is a probability density function on vectors θ ∈ S, characterized by concentration parameters α = {α_1, . . . , α_K}, as:

Dir(θ; α) = (1 / B(α)) ∏_{i=1}^{K} θ_i^{α_i − 1}          (1)

where the normalizing constant 1/B(α) involves the multivariate Beta function B(α) = (∏_{i=1}^{K} Γ(α_i)) / Γ(α_0), with α_0 = ∑_{i=1}^{K} α_i, and Γ(x) denotes the Gamma function, defined as Γ(x) = ∫_0^∞ t^{x−1} e^{−t} dt. To model the Dirichlet distribution, the concentration parameters α correspond to each class output from the DNN as follows: α = f_w(x), where α changes with each input x, and θ denotes the ground truth probability distribution.

3.2 Proposed Loss Function

Our new loss function introduced in this paper is based upon maximizing the likelihood of the Dirichlet-based DNN model. To maximize the likelihood of the Dirichlet distribution, we minimize the negative log-likelihood as follows:

F(α; θ) = log Dir(θ; α) = log [ (1 / B(α)) ∏_{i=1}^{K} θ_i^{α_i − 1} ]
        = log Γ(∑_{j=1}^{K} α_j) − ∑_{i=1}^{K} log Γ(α_i) + ∑_{i=1}^{K} (α_i − 1) log θ_i          (2)

Fig. 1. Dirichlet plots


where θ represents the desired probability distribution to be maximized. The loss function aims to produce reliable uncertainty estimates by handling the DNN's correct and incorrect predictions separately, as shown in Fig. 1. Our main goal is to produce correct predictions with low uncertainty, similar to Fig. 1(a). To do so, the DNN should produce a high concentration towards the correct class, which can be achieved by maximizing the likelihood using the ground truth label probability, i.e., a one-hot vector. For incorrect predictions, we would like to have high uncertainty. To achieve this, equal probability for all classes is required; therefore the likelihood should be maximized using an equal-probability vector, yielding a result similar to Fig. 1(b). Accordingly, we propose two loss functions to be minimized, very similar to each other, one for full DNN training including all classes and one for the task of out-of-distribution detection:

Loss = H(p, q) + Loss_ILVI + F(α_correct; θ_correct) + F(α_incorrect; θ_incorrect),          (3)

Loss_OOD = H(p, q) + Loss_ILVI + F(α_in; θ_in) + F(α_out; θ_out),          (4)

where α_correct and α_incorrect in Eq. (3) are the network's concentration parameters representing the correct and incorrect DNN predictions respectively, α_in and α_out are the network's concentration parameters representing the in-distribution and out-of-distribution classes respectively, and H(p, q) denotes the cross-entropy loss between the DNN's softmax output p and the ground-truth labels q. Accordingly, θ_correct and θ_in represent the ground truth probability distribution for the correct classes and in-distribution classes respectively, and θ_incorrect and θ_out represent the equal-probability vector used to yield high uncertainty.

Building upon our previous ILVI work, an intermediate latent layer is added within the network, at the decoder level, in the form of a multivariate Gaussian distribution with mean and variance instead of point estimates. This substitution has previously been shown to enable much faster uncertainty estimation and to improve the overall segmentation performance of the network [21]. Accordingly, the first two terms in both loss functions correspond to the evidence lower bound (ELBO) of the intermediate-layer variational inference network:

ELBO = E_{q_φ}[log p(y | θ, x)] − KL(q(φ) || p(θ | X, Y)),  with  E_{q_φ}[log p(y | θ, x)] = − ∑_c p(y_c) log p(ŷ_c)          (5)

where the KL divergence is between the true and approximate models. In this case, we assume the prior to be Gaussian and optimize the new variational approximations using the KL divergence.
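The following sketch shows how the Dirichlet log-likelihood of Eq. (2) and the correct/incorrect terms of Eq. (3) could be computed with standard tensor operations. It is not the authors' implementation; in particular, the sign convention (negating the log-likelihood so that it can be minimized) and all tensor shapes are assumptions.

```python
# A minimal sketch of the Dirichlet terms of the proposed loss (Eqs. 2-3).
import torch

def dirichlet_log_likelihood(alpha, theta, eps=1e-8):
    """log Dir(theta; alpha) as in Eq. (2). alpha, theta: (N, K); theta rows sum to 1."""
    term1 = torch.lgamma(alpha.sum(dim=1))                    # log Γ(Σ_j α_j)
    term2 = torch.lgamma(alpha).sum(dim=1)                    # Σ_i log Γ(α_i)
    term3 = ((alpha - 1.0) * torch.log(theta + eps)).sum(dim=1)
    return term1 - term2 + term3                              # (N,)

def dirichlet_terms(alpha, labels, correct_mask, num_classes):
    """F(α_correct; θ_correct) + F(α_incorrect; θ_incorrect), cf. Eq. (3)."""
    one_hot = torch.nn.functional.one_hot(labels, num_classes).float()
    flat = torch.full_like(one_hot, 1.0 / num_classes)        # equal probabilities -> high uncertainty
    ll_correct = dirichlet_log_likelihood(alpha[correct_mask], one_hot[correct_mask])
    ll_incorrect = dirichlet_log_likelihood(alpha[~correct_mask], flat[~correct_mask])
    # maximizing the likelihood = minimizing its negative (assumed sign convention)
    return -(ll_correct.mean() + ll_incorrect.mean())

# toy usage with arbitrary values
alpha = torch.rand(8, 19) * 5 + 1                             # concentration parameters from the DNN
labels = torch.randint(0, 19, (8,))
correct = torch.tensor([True, True, False, True, False, True, True, False])
loss_dirichlet = dirichlet_terms(alpha, labels, correct, num_classes=19)
```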

4 Experiments and Results

In this section, a comparison between a DNN with ILVI only (Baseline Loss) and a DNN with ILVI and Dirichlet distributions (Dirichlet Loss) is presented for both OoD detection and full DNN training, together with the relevant metrics for each task. For both DNNs, a modified DeepLab V3+ [10] is used to accommodate the newly added intermediate layer of the ILVI approach. DeepLab V3+, by Chen et al. [10], is one of the prominent semantic segmentation DNNs, performing well on several widely known benchmarking datasets, and is used here as the base model to test our proposed approach.

Experiments have been performed on the task of street-scene semantic segmentation. The DNN's task is to predict the class of each pixel out of the domain of classes (car, pedestrian, etc.) from a front camera module. The Cityscapes dataset [45] is used in all experiments, where both the images and the ground-truth labels are available. In the experiments, 3475 finely annotated images collected from various German cities are used (2975 for training and 500 validation images used for evaluation).

In this paper, the uncertainty metric used is the predictive entropy, which is given by:

Ĥ[y | x] = − ∑_c ( (1/T) ∑_t p(y = c | x, w_t) ) log ( (1/T) ∑_t p(y = c | x, w_t) )          (6)

where y is the output variable, c ranges over all classes, T is the number of stochastic forward-pass samples, p(y = c | x, w_t) is the softmax probability of the input x being in class c, and w_t are the model parameters of the t-th sample.
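A small sketch of the predictive entropy of Eq. (6) is given below, assuming the softmax outputs of T stochastic forward passes are already stacked into one tensor; shapes are hypothetical.

```python
# Predictive entropy from T stochastic forward passes, following Eq. (6).
import torch

def predictive_entropy(probs, eps=1e-12):
    """probs: (T, N, C, H, W) softmax outputs of T passes -> entropy map (N, H, W)."""
    mean_p = probs.mean(dim=0)                                    # (1/T) Σ_t p(y = c | x, w_t)
    return -(mean_p * torch.log(mean_p + eps)).sum(dim=1)

probs = torch.softmax(torch.randn(5, 2, 19, 8, 16), dim=2)        # T=5 passes, 19 classes
uncertainty = predictive_entropy(probs)                           # per-pixel predictive entropy
```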

4.1 OoD Methodology

To evaluate the efficiency of out-of-distribution detection, it is necessary to keep classes out of the training set. We follow the leave-out class approach by leaving out a class from the training set and then evaluating the OoD performance on it [43]. The Cityscapes dataset itself also defines some void labels; however, they are not consistent and not sufficient. Therefore, following [43], we choose the classes pedestrian and rider from the Cityscapes dataset as the leave-out classes, added to the void labels predefined by Cityscapes.

Distributional Separation Efficiency

We aim for high certainty on correct predictions and low certainty on incorrect predictions. To that end, we assess how well the DNN is able to separate in-distribution from out-of-distribution classes solely based on the uncertainty estimation. A perfect scenario would be to have a peak at low certainty for the out-of-distribution classes, whilst having a peak at high certainty for the in-distribution classes. Accordingly, both distributions are plotted with respect to certainty and then compared using the Wasserstein distance metric.


A high value indicates dissimilar, distinctive distributions, and vice versa for a low Wasserstein distance value. Figure 2(a) shows the separation for the DNN trained with the baseline loss: both distributions overlap and peak at high certainty. This indicates the inefficiency of the network in distinguishing between in-distribution and out-of-distribution classes. When the Dirichlet loss is added to the baseline loss, Fig. 2(b) shows that both distributions have separate, non-overlapping peaks.
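As an illustration of this separation measure, the sketch below computes the Wasserstein distance between the certainty values of in-distribution and out-of-distribution pixels; the certainty map and OoD mask are synthetic placeholders.

```python
# Wasserstein distance between in- and out-of-distribution certainty distributions.
import numpy as np
from scipy.stats import wasserstein_distance

rng = np.random.default_rng(0)
certainty = rng.uniform(0.0, 1.0, size=(2, 64, 128))       # e.g. 1 - normalized entropy, per pixel
ood_mask = rng.uniform(size=certainty.shape) < 0.1         # True where the leave-out class lies

in_domain = certainty[~ood_mask].ravel()
out_domain = certainty[ood_mask].ravel()
separation = wasserstein_distance(in_domain, out_domain)   # higher = better separated
print(f"Wasserstein distance: {separation:.3f}")
```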

Fig. 2. OoD separation plots

Table 1. Distributional separation efficiency metrics: Wasserstein distance and average uncertainty estimation

                Wasserstein Distance (↑)   Average Uncertainty in %
                                           In-Domain (↓)   Out-of-Domain (↑)   Difference (↑)
Baseline Loss   24.4                       5.1             23.2                22.1
Dirichlet Loss  67.3                       10.3            82.8                72.4

Table 1 shows the Wasserstein distance between both distributions. The Dirichlet loss more than doubles the distance, reflecting its better separation efficiency. Furthermore, to better quantify the separation efficiency, the average uncertainty estimates for in-distribution and out-of-distribution pixels, as well as the difference between them, are also reported in Table 1; the greater the difference, the better the separation performance. The separation efficiency visible in Fig. 2 is also reflected in Table 1, where it can be observed through the difference between the average uncertainty for the in- and out-of-domain classes. A large difference between both domains indicates efficient separation and a reliable uncertainty estimation. The results show that even though the baseline loss has a lower uncertainty on in-distribution classes, the difference between in-distribution and out-of-distribution is not as high and representative as for the Dirichlet loss.


Segmentation Performance

As seen in Table 2, both DNNs have nearly the same mean intersection over union (IoU). Moreover, Fig. 5 shows samples of the DNN output illustrating the effectiveness of the Dirichlet loss. As previously explained, the DNN is trained with ignore labels on the pedestrian and rider pixels. This should lead the DNN to exhibit high uncertainty on the pedestrian and rider pixels together with noisy segmentation, indicating the inability to conclusively segment this area of the image. Figure 5 shows that the baseline DNN does not exhibit this behavior, but instead interpolates the pedestrian and rider regions and segments them as part of the background. Additionally, there is no uncertainty on the previously assigned ignore regions, indicating full confidence of the DNN in the incorrect, interpolated segmentation. On the contrary, the hypothesis is verified for the Dirichlet loss: in this case the DNN does generate noisy segmentation on the ignore regions, indicating the inability to decisively classify these pixels, which is also reflected in high uncertainty. This is because of the DNN's low confidence in its segmentation of this area. Additional black labels in the uncertainty maps correspond to the other predefined void parts of the scene, such as the car hood, as seen in the ground-truth segmentation.

Table 2. Mean intersection over union for out-of-domain detection

                Mean IoU % (↑)
Baseline Loss   69.86
Dirichlet Loss  69.53

4.2 Full Training with All Classes

Distributional Separation Efficiency

The separation efficiency plots in Fig. 4 show a much better separation between both distributions, with an increased mean difference between them. Furthermore, the incorrect predictions' distribution shows a much wider spread, with its peak shifted towards low certainty. Table 3 shows a higher Wasserstein distance between both distributions for the new loss, indicating better separation performance. Moreover, the table also shows that the average uncertainty difference for the new loss is much higher than for the baseline loss.


Table 3. Distributional separation efficiency metrics: Wasserstein distance and average uncertainty estimation

                Wasserstein Distance (↑)   Average Uncertainty in %
                                           In-Domain (↓)   Out-of-Domain (↑)   Difference (↑)
Baseline Loss   23.7                       3.1             28.8                25.6
Dirichlet Loss  31.5                       9.4             72.1                62.7

Accuracy vs. Uncertainty Metrics

An important factor for deep neural networks is not only to be accurate about their predictions but also to be certain about them. Following [25], ratios between accuracy and certainty are defined and quantified to compare the uncertainty performance of DNNs. After obtaining the DNN predictions and their respective uncertainty estimates, they are compared with the ground truth. Afterwards, four main components are calculated: accurate and certain (n_ac), accurate and uncertain (n_au), inaccurate and certain (n_ic), and inaccurate and uncertain (n_iu). These are all pixel-based metrics, comparing the ground truth with the segmentation class and its corresponding estimated uncertainty value. Certainty and uncertainty in these components are based upon varying thresholds; hence their values change, defining four different components for each uncertainty threshold. These four components are then used to calculate the three conditional probabilities needed for this evaluation:

p(accurate | certain) = n_ac / (n_ac + n_ic),   p(uncertain | inaccurate) = n_iu / (n_ic + n_iu),   AvU = (n_ac + n_iu) / (n_ac + n_au + n_ic + n_iu).

AvU stands for accuracy vs. uncertainty, describing the probability of getting a good prediction out of the network, either accurate and certain or inaccurate and uncertain. Higher values of each metric at each threshold correspond to better performance. With varying uncertainty thresholds, a new set of the four fundamental components is generated, altering the values of each metric. Figure 3 shows the three ratios at varying thresholds for both loss functions.

Table 4. Accuracy and Certainty Metrics

                P(Accurate | Certain) in % (↑)   P(Uncertain | Inaccurate) in % (↑)   A vs. U in % (↑)
Baseline Loss   86.8                             28.1                                 81.9
Dirichlet       92.7                             65.4                                 84.9
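A brief sketch of the four components and the three derived ratios is shown below; the per-pixel accuracy and certainty arrays are synthetic, and the threshold sweep mimics the evaluation of Fig. 3.

```python
# Accuracy-vs-uncertainty metrics from boolean per-pixel accuracy/certainty masks.
import numpy as np

def avu_metrics(accurate, certain):
    n_ac = np.sum(accurate & certain)           # accurate and certain
    n_au = np.sum(accurate & ~certain)          # accurate and uncertain
    n_ic = np.sum(~accurate & certain)          # inaccurate and certain
    n_iu = np.sum(~accurate & ~certain)         # inaccurate and uncertain
    p_acc_given_cert = n_ac / max(n_ac + n_ic, 1)
    p_unc_given_inacc = n_iu / max(n_ic + n_iu, 1)
    avu = (n_ac + n_iu) / max(n_ac + n_au + n_ic + n_iu, 1)
    return p_acc_given_cert, p_unc_given_inacc, avu

rng = np.random.default_rng(0)
accurate = rng.uniform(size=10_000) < 0.9
certainty = rng.uniform(size=10_000)
for thr in (0.3, 0.5, 0.7):                     # sweep the certainty threshold as in Fig. 3
    print(thr, avu_metrics(accurate, certainty > thr))
```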

Table 4 summarizes the plots of Fig. 3 by reporting the area under each curve. It can be seen that the Dirichlet loss surpasses the baseline loss in all three metrics. Taking a closer look, we can see a large improvement in P(Uncertain | Inaccurate) whilst still maintaining high performance on the other two metrics.


Fig. 3. Accuracy vs Certainty Plot

This is not usually observed, as methods trying to improve the uncertainty representation typically pay for it with reduced performance on the other two metrics. Relating this to the separation plots in Fig. 4, it can be stated that the Dirichlet loss provides a reliable and representative uncertainty, reflecting low uncertainty on accurate and high uncertainty on inaccurate predictions.

Looking more towards safety analysis, we also plot the remaining accuracy rate (RAR = AC / (AC + IC + AU + IU)) vs. the remaining error rate (RER = IC / (AC + IC + AU + IU)) [46] in Fig. 4. Plotting both metrics together at varying thresholds provides the ratio between accurate and inaccurate results yielded at each certainty value. This helps in understanding the expected error at any certainty threshold, which can further be used in safety-relevant tasks. The best case is to have the highest accuracy whilst maintaining the lowest possible error. The red dot in both subfigures of Fig. 4 indicates the best operating certainty threshold to yield the highest accuracy. Table 5 shows that the Dirichlet loss surpasses the baseline loss, providing an accuracy higher by around 5% whilst simultaneously decreasing the error by almost 4%.

Receiver Operating Characteristic Curve Metrics

The Receiver Operating Characteristic (ROC) is commonly used to evaluate the performance of uncertainty estimation in recent literature [REFs]. The ROC was initially developed to relate the true positive rate (TPR = TP / (TP + FN)) and the false positive rate (FPR = FP / (FP + TN)) of binary classifiers. For the case at hand, we label correct predictions as positive and incorrect ones as negative [47]. Two common metrics are used to summarize the ROC performance of the classifier: the area under the ROC curve (AUROC) and the false positive rate at 95% true positive rate (FPR@0.95TPR). The latter metric shows the probability that a sample is a negative sample when the true positive rate is as high as 95% [35]. Table 5 shows that the Dirichlet loss has a much higher AUROC value than the baseline, whilst also having almost half the FPR@0.95TPR value of the baseline. These metrics reflect the reliability of the uncertainty estimation. This shows that the Dirichlet loss amends the baseline loss function by enabling the DNN to represent its fallacies through the uncertainty estimation.
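The ROC-based metrics can be computed as in the following sketch, which labels correct predictions as positives and scores pixels by their certainty; inputs are synthetic and the FPR@0.95TPR lookup is one possible implementation.

```python
# AUROC and FPR@0.95TPR for correctness prediction from certainty scores.
import numpy as np
from sklearn.metrics import roc_auc_score, roc_curve

rng = np.random.default_rng(0)
correct = rng.uniform(size=20_000) < 0.85           # 1 = correct prediction (positive class)
certainty = np.clip(0.6 * correct + 0.4 * rng.uniform(size=20_000), 0.0, 1.0)

auroc = roc_auc_score(correct, certainty)
fpr, tpr, _ = roc_curve(correct, certainty)
fpr_at_95_tpr = fpr[np.searchsorted(tpr, 0.95)]     # first operating point reaching 95% TPR
print(f"AUROC: {auroc:.3f}  FPR@0.95TPR: {fpr_at_95_tpr:.3f}")
```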


Table 5. Performance comparison between both baseline and Dirichlet loss for RAR and RER, AUROC metrics, calibration and segmentation performance.

                RAR % (↑)   RER % (↓)   AUROC % (↑)   FPR@0.95TPR % (↓)   ECE % (↓)   MCE % (↓)   Mean IoU % (↑)
Baseline Loss   85.72       12.91       87.69         51.33               6.74        17.69       67.12
Dirichlet Loss  90.46       8.98        96.11         20.91               2.71        13.21       66.93

Fig. 4. Distributional separational efficiency, RER vs. RAR and calibration results for the fully trained DNN with all classes.

Calibration Metrics

DNN calibration is an essential aspect for the safety of the automated driving system. Accordingly, the DNN should be able to provide a calibrated uncertainty measure for its predictions. The calibration metrics used in this work to assess the performance of the DNN are the maximum calibration error (MCE) and the expected calibration error (ECE) [48,49]. All DNN predictions and uncertainties are first sorted into fixed-size bins β_i, i = 1, . . . , B, according to their certainty. For each bin β_i, accuracy (acc) and certainty (cert) are computed as acc_i = TP_i / |β_i| and cert_i = (1/|β_i|) ∑_{j=1}^{|β_i|} ĉ_j, where |β_i| indicates the number of samples in β_i, ĉ_j is the respective certainty, and TP_i denotes the number of correctly classified samples in β_i. The ECE and MCE are then calculated as MCE = max_i |acc_i − cert_i| and ECE = (1/B) ∑_{i=1}^{B} |acc_i − cert_i|. Figure 4 shows the calibration plots for both losses, whilst Table 5 presents the ECE and MCE values.
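A minimal sketch of ECE and MCE with fixed-size certainty bins, following the definitions above, is given below; empty bins are simply skipped, and the inputs are synthetic.

```python
# ECE and MCE from per-pixel correctness and certainty values.
import numpy as np

def ece_mce(correct, certainty, num_bins=10):
    edges = np.linspace(0.0, 1.0, num_bins + 1)
    gaps = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (certainty >= lo) & (certainty <= hi) if hi >= 1.0 else (certainty >= lo) & (certainty < hi)
        if in_bin.sum() == 0:
            continue                            # skip empty bins
        acc = correct[in_bin].mean()            # acc_i = TP_i / |beta_i|
        cert = certainty[in_bin].mean()         # cert_i = mean certainty in the bin
        gaps.append(abs(acc - cert))
    gaps = np.array(gaps)
    return gaps.mean(), gaps.max()              # ECE (mean gap), MCE (max gap)

rng = np.random.default_rng(0)
certainty = rng.uniform(size=50_000)
correct = rng.uniform(size=50_000) < certainty  # a synthetically well-calibrated model
print(ece_mce(correct, certainty))
```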


The figure and table reflect the dominance of the Dirichlet loss over the baseline loss, as it has lower ECE and MCE values. Furthermore, it can be seen from the figure that the calibration curve up to 90% is almost identical to the target calibration, which is why the Dirichlet loss has a much lower ECE than the baseline loss. It can also be seen that the Dirichlet loss has a lower calibration performance for the last accuracy category (over 90%). This can be seen as a reflection of the maximum certainty of about 95% visible in Fig. 4: the Dirichlet loss pushes the DNN to be prudent about its detections, not allowing 100% certainty levels.

Segmentation Performance

Table 5 quantifies the segmentation performance in terms of mean IoU. It can be seen that both loss functions achieve almost the same mean IoU on the Cityscapes test. Figure 6 shows samples of the segmentation and uncertainty estimates. Both losses yield almost the same segmentation performance when compared visually; however, it can be clearly seen that the new Dirichlet loss represents higher uncertainty values in areas where the DNN was not able to segment the classes correctly. Furthermore, ignore regions in the semantic segmentation labels (such as the car hood and background scenery) appear with high uncertainty and noisy segmentation for the Dirichlet loss, whereas, on the contrary, such a result is not visible in the baseline loss results. Overall, this demonstrates the reliability of the Dirichlet loss approach in better representing areas of incorrect predictions.

Fig. 5. Out-of-domain detection segmentation samples. Two sample images are being shown: the first column shows both input image and ground truth semantic segmentation, the middle column shows uncertainty estimation and semantic segmentation of baseline loss, and the last column shows uncertainty estimation and semantic segmentation of the Dirichlet loss.


Fig. 6. Full DNN training segmentation and uncertainty samples. Two sample images are being shown: the first column shows both input image and ground truth semantic segmentation, the middle column shows uncertainty estimation and semantic segmentation of baseline loss, and the last column shows uncertainty estimation and semantic segmentation of the Dirichlet loss.

5 Conclusion and Future Work

In this work we have presented an enhanced uncertainty estimation approach, extending the ILVI approach with Dirichlet distributions, and tested it on both OoD detection and the general uncertainty estimation of the network. The newly proposed approach achieves the same semantic segmentation performance as the baseline ILVI approach, but yields a much more reliable uncertainty estimation and a better-calibrated network with an efficient distributional separation. For future work, we recommend the use of Dirichlet models alone, without the need for cross-entropy, as this could further improve the calibration and the reliability of the DNN's uncertainty estimation. Furthermore, it is recommended to verify the effectiveness of this approach on other tasks, such as object detection, and on other datasets.

Acknowledgments. The research leading to these results is partly funded by the German Federal Ministry for Economic Affairs and Climate Action within the project "Methoden und Maßnahmen zur Absicherung von KI basierten Wahrnehmungsfunktionen für das automatisierte Fahren (KI-Absicherung)". The authors would like to thank the consortium for the successful cooperation.


References

1. Kermany, D.S., et al.: Identifying medical diagnoses and treatable diseases by image-based deep learning. Cell 172(5), 1122–1131 (2018)
2. Lundervold, A.S., Lundervold, A.: An overview of deep learning in medical imaging focusing on MRI. Zeitschrift für Medizinische Physik 29(2), 102–127 (2019)
3. Zhu, M., Wang, X., Wang, Y.: Human-like autonomous car-following model with deep reinforcement learning. Transp. Res. Part C: Emerg. Technol. 97, 348–368 (2018)
4. Wang, X., Jiang, R., Li, L., Lin, Y., Zheng, X., Wang, F.-Y.: Capturing car-following behaviors by deep learning. IEEE Trans. Intell. Transp. Syst. 19(3), 910–920 (2017)
5. Wu, X., et al.: Sparse fuse dense: towards high quality 3D detection with depth completion. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5418–5427 (2022)
6. Wang, C.-H., Chen, H.-W., Fu, L.-C.: VPFNet: voxel-pixel fusion network for multi-class 3D object detection. arXiv preprint arXiv:2111.00966 (2021)
7. Aradi, S.: Survey of deep reinforcement learning for motion planning of autonomous vehicles. IEEE Trans. Intell. Transp. Syst. 23, 740–759 (2020)
8. Liu, W., et al.: SSD: single shot MultiBox detector. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9905, pp. 21–37. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46448-0_2
9. Redmon, J., Farhadi, A.: YOLOv3: an incremental improvement. arXiv preprint arXiv:1804.02767 (2018)
10. Chen, L.-C., Zhu, Y., Papandreou, G., Schroff, F., Adam, H.: Encoder-decoder with atrous separable convolution for semantic image segmentation. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 801–818 (2018)
11. Wang, J., et al.: Deep high-resolution representation learning for visual recognition. IEEE Trans. Pattern Anal. Mach. Intell. 43(10), 3349–3364 (2020)
12. Willers, O., Sudholt, S., Raafatnia, S., Abrecht, S.: Safety concerns and mitigation approaches regarding the use of deep learning in safety-critical perception tasks. In: Casimiro, A., Ortmeier, F., Schoitsch, E., Bitsch, F., Ferreira, P. (eds.) SAFECOMP 2020. LNCS, vol. 12235, pp. 336–350. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-55583-2_25
13. Akhtar, N., Mian, A.: Threat of adversarial attacks on deep learning in computer vision: a survey. IEEE Access 6, 14410–14430 (2018)
14. Ozdag, M.: Adversarial attacks and defenses against deep neural networks: a survey. Procedia Comput. Sci. 140, 152–161 (2018)
15. Smuha, N.A.: The EU approach to ethics guidelines for trustworthy artificial intelligence. Comput. Law Rev. Int. 20(4), 97–106 (2019)
16. Dietterich, T.G.: Steps toward robust artificial intelligence. AI Mag. 38(3), 3–24 (2017)
17. Kendall, A.: Geometry and uncertainty in deep learning for computer vision. PhD thesis, University of Cambridge, UK (2019)
18. Gal, Y.: Uncertainty in deep learning (2016)
19. Abdar, M., et al.: A review of uncertainty quantification in deep learning: techniques, applications and challenges. Inf. Fusion 76, 243–297 (2021)
20. Gawlikowski, J., et al.: A survey of uncertainty in deep neural networks. arXiv preprint arXiv:2107.03342 (2021)


21. Hammam, A., Ghobadi, S.E., Bonarens, F., Stiller, C.: Real-time uncertainty estimation based on intermediate layer variational inference. In: Computer Science in Cars Symposium, pp. 1–9 (2021)
22. MacKay, D.J.C.: A practical Bayesian framework for backpropagation networks. Neural Comput. 4(3), 448–472 (1992)
23. Graves, A.: Practical variational inference for neural networks. In: Advances in Neural Information Processing Systems 24 (2011)
24. Blundell, C., Cornebise, J., Kavukcuoglu, K., Wierstra, D.: Weight uncertainty in neural network. In: International Conference on Machine Learning, pp. 1613–1622. PMLR (2015)
25. Mukhoti, J., Gal, Y.: Evaluating Bayesian deep learning methods for semantic segmentation. arXiv preprint arXiv:1811.12709 (2018)
26. Gustafsson, F.K., Danelljan, M., Schon, T.B.: Evaluating scalable Bayesian deep learning methods for robust computer vision. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, pp. 318–319 (2020)
27. Gal, Y., Ghahramani, Z.: Dropout as a Bayesian approximation: representing model uncertainty in deep learning. In: International Conference on Machine Learning, pp. 1050–1059. PMLR (2016)
28. Lakshminarayanan, B., Pritzel, A., Blundell, C.: Simple and scalable predictive uncertainty estimation using deep ensembles. arXiv preprint arXiv:1612.01474 (2016)
29. Malinin, A., Gales, M.: Predictive uncertainty estimation via prior networks. In: Advances in Neural Information Processing Systems 31 (2018)
30. Sensoy, M., Kaplan, L., Kandemir, M.: Evidential deep learning to quantify classification uncertainty. In: Advances in Neural Information Processing Systems 31 (2018)
31. Nandy, J., Hsu, W., Lee, M.L.: Towards maximizing the representation gap between in-domain & out-of-distribution examples. Adv. Neural Inf. Process. Syst. 33, 9239–9250 (2020)
32. Malinin, A., Gales, M.: Reverse KL-divergence training of prior networks: improved uncertainty and adversarial robustness. In: Advances in Neural Information Processing Systems 32 (2019)
33. Charpentier, B., Zügner, D., Günnemann, S.: Posterior network: uncertainty estimation without OOD samples via density-based pseudo-counts. Adv. Neural Inf. Process. Syst. 33, 1356–1367 (2020)
34. Drummond, N., Shearer, R.: The open world assumption. In: eSI Workshop: The Closed World of Databases meets the Open World of the Semantic Web, vol. 15, p. 1 (2006)
35. Hendrycks, D., Gimpel, K.: A baseline for detecting misclassified and out-of-distribution examples in neural networks. arXiv preprint arXiv:1610.02136 (2016)
36. Yang, J., Zhou, K., Li, Y., Liu, Z.: Generalized out-of-distribution detection: a survey. arXiv preprint arXiv:2110.11334 (2021)
37. Nitsch, J., et al.: Out-of-distribution detection for automotive perception. In: 2021 IEEE International Intelligent Transportation Systems Conference (ITSC), pp. 2938–2943. IEEE (2021)
38. Filos, A., Tigkas, P., McAllister, R., Rhinehart, N., Levine, S., Gal, Y.: Can autonomous vehicles identify, recover from, and adapt to distribution shifts? In: International Conference on Machine Learning, pp. 3145–3153. PMLR (2020)
39. Chan, R., et al.: SegmentMeIfYouCan: a benchmark for anomaly segmentation. arXiv preprint arXiv:2104.14812 (2021)


40. Vernekar, S., Gaurav, A., Abdelzad, V., Denouden, T., Salay, R., Czarnecki, K.: Out-of-distribution detection in classifiers via generation. arXiv preprint arXiv:1910.04241 (2019)
41. Zhou, D.-W., Ye, H.-J., Zhan, D.-C.: Learning placeholders for open-set recognition. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 4401–4410 (2021)
42. Hendrycks, D., Mazeika, M., Dietterich, T.: Deep anomaly detection with outlier exposure. arXiv preprint arXiv:1812.04606 (2018)
43. Jourdan, N., Rehder, E., Franke, U.: Identification of uncertainty in artificial neural networks. In: Proceedings of the 13th Uni-DAS e.V. Workshop Fahrerassistenz und automatisiertes Fahren, vol. 2, p. 12 (2020)
44. Vyas, A., Jammalamadaka, N., Zhu, X., Das, D., Kaul, B., Willke, T.L.: Out-of-distribution detection using an ensemble of self supervised leave-out classifiers. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 550–564 (2018)
45. Cordts, M., et al.: The Cityscapes dataset for semantic urban scene understanding. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3213–3223 (2016)
46. Henne, M., Schwaiger, A., Roscher, K., Weiss, G.: Benchmarking uncertainty estimation methods for deep learning with safety-related metrics. In: SafeAI@AAAI, pp. 83–90 (2020)
47. Liang, S., Li, Y., Srikant, R.: Enhancing the reliability of out-of-distribution image detection in neural networks. arXiv preprint arXiv:1706.02690 (2017)
48. Neumann, L., Zisserman, A., Vedaldi, A.: Relaxed softmax: efficient confidence auto-calibration for safe pedestrian detection (2018)
49. Naeini, M.P., Cooper, G., Hauskrecht, M.: Obtaining well calibrated probabilities using Bayesian binning. In: Twenty-Ninth AAAI Conference on Artificial Intelligence (2015)

Explainable Sparse Attention for Memory-Based Trajectory Predictors

Francesco Marchetti, Federico Becattini(B), Lorenzo Seidenari, and Alberto Del Bimbo

Università degli Studi di Firenze, Florence, Italy
{francesco.marchetti,federico.becattini,lorenzo.seidenari,alberto.bimbo}@unifi.it

Abstract. In this paper we address the problem of trajectory prediction, focusing on memory-based models. Such methods are trained to collect a set of useful samples that can be retrieved and used at test time to condition predictions. We propose Explainable Sparse Attention (ESA), a module that can be seamlessly plugged into several existing memory-based state-of-the-art predictors. ESA generates a sparse attention in memory, thus selecting a small subset of memory entries that are relevant for the observed trajectory. This enables an explanation of the model's predictions with reference to previously observed training samples. Furthermore, we demonstrate significant improvements on three trajectory prediction datasets.

Keywords: Trajectory prediction · Memory networks · Explainability · Attention

1 Introduction

Decision-making in autonomous vehicles is often hard to explain and interpret due to the black-box nature of deep learning models, which are commonly deployed on self-driving cars. This makes it hard to assess responsibilities in case of accidents or anomalous behaviors and thus poses an obstacle to ensuring safety. In this paper we focus on the task of trajectory prediction, which is a core component dedicated to safety in an autonomous driving system. In particular, we study trajectory prediction methods based on Memory Augmented Neural Networks (MANNs) [32–34,54], which have recently obtained remarkable results. What makes these models interesting is their capacity to offer a certain degree of explainability about their predictions. Memory-based trajectory predictors leverage a storage of past trajectory representations to obtain cues about previously observed futures [32,33], likely endpoint goals [54] or information about the social context [34]. Methods such as MANTRA [32,33] and MemoNet [54] create a persistent memory with relevant observations corresponding to training samples. How memory is accessed plays a pivotal role in the effectiveness of the model and implies a form of attention to select relevant memory cells.

544

F. Marchetti et al.

stored in memory. In such a way, individual samples can be retrieved at inference time to condition predictions based on known motion patterns. This realizes an effective access-by-content mechanism but does not allow multiple elements to jointly concur to produce an output. At the same time, each output can be explained by attributing a certain relevance to memory samples. Thus, selecting a small subset of samples from memory provides a way of explaining a decision, since the exact training samples that lead to a certain prediction can be identified. In this paper we propose an improved memory controller that we used to read relevant samples. Our controller is based on sparse attention between the read key and stored samples. Differently from prior work, this allows the model to combine cues from different samples, considerably improving the effectiveness of the predictor. By forcing the attention to be sparse, we can inspect the model decisions with reference to a restricted number of sample, thus making the model explainable. For this reason we refer to our proposed method as ESA (Explainable Sparse Attention). The main contributions of this paper are the following: – We present ESA, a novel addressing mechanism to enhance memory-based trajectory predictors using sparse attentions. This enables a global reasoning involving potentially every sample in memory yet focusing only on relevant instances. The advantages are twofold: at training time it reduces redundancies in memory and at test time it yields significant improvements in trajectory prediction benchmarks. – To address the multimodal nature of the task, we predict multiple futures using a multi-head controller that attends in different ways to the samples in memory. – To leverage information from all memory entries and at the same time exploit cues from a limited set of training samples, we enhance the memory controller with a sparsemax activation function [35] instead of a softmax. This allows the model to condition predictions on a linear combination of a restricted subset of stored samples. – Memory Augmented Neural Networks offer explainability by design. We explore how future predictions can be explained with reference to stored samples and how sparsemax further improves the quality of the explanations.

2 2.1

Related Works Trajectory Prediction

The task of trajectory prediction can be formalized as the problem of calculating the future state of an object, given the observable past states of that same object plus additional observable variables such as the surrounding map and other agents’ past states. The main source of knowledge resides in past agents’ motion [1,15]. Moreover, an accurate representation of the environment is often sought to provide physical constraints and obtain physically correct predictions [5,6,25,47,49]. Finally, a lot of effort has gone into modelling the interaction between moving agents, addressing so-called social dynamics [1,18,20,21,25,30,44,55].

Explainable Sparse Attention for Memory-Based Trajectory Predictors

545

Trajectory prediction may address pedestrian-only scenarios as well as automotive settings. For the latter a model of the road, such as lane configuration, is critical [2,5,6,10]. Indeed, road layouts physically constrain the motion of vehicles. Especially when dealing with pedestrian motion, a correct model of social interaction allows to improve future trajectory accuracy. When looking at the autonomous driving scenario, for which most of the trajectories to be forecasted are from vehicles, there is no clear evidence that social interaction modelling is beneficial [6,25]. Indeed, a vast number of approaches focused on modelling pedestrian interactions [1,18,20,21,28,38,45]. An effective and natural approach to social interaction modelling is applying Graph Neural Networks (GNN) [22,36,48], modelling agents as nodes. Recently, a different take to trajectory forecasting has gained traction. Instead of regressing a sequence of future steps, goal based methods estimate intentions, i.e. the spatial goal the agent seeks [9,19,31,54,56]. These approaches proved a high effectiveness attaining state-of-the-art results in many benchmarks. 2.2

Memory and Attention

Traditional Neural Networks have been recently augmented with Memory Modules yielding a new class of methods: Memory Augmented Neural Networks (MANN) [16,53]. By augmenting a NN with a memory we enable the capability to retain a state, similarly to Recurrent Neural Networks (RNN) but with more flexibility. Differently from RNNs a MANN will rely on an external addressable memory instead of exploiting a latent state. This is beneficial in terms of explainability since it is easier to establish a correspondence between memorized features and inputs. Moreover, external memories can retain information during the whole training, making it possible to learn rare samples and deal with longtail phenomena. The Neural Turing Machine (NTM) is the first known instance of a MANN, demonstrating the capability to retain information and perform reasoning on knowledge stored in memory. For this tasks NTMs are superior to RNNs. NTMS have been recently extended and improved [17,46,50,53]. As we will show in this work MANNs are extremely flexible and can address a large variety of problems: person re-identification [39], online learning [40], visual question answering [24,29] and garment recommendation [8,13]. MANNs have been also proposed to perform trajectory forecasting. MANTRA is a MANN specifically developed to perform multiple trajectory prediction [32,33]. Trajectories are encoded with recurrent units and the external memory is populated during training. At inference time the key-value store of the memory is elegantly exploited to obtain multiple predictions out of a single input past, simply by addressing multiple features in memory. Recently, an approach from the same authors, exploited an external differentiable memory module to learn social rules and perform joint predictions [34]. In this case the memory is emptied at each episode and controllers are trained to store relevant features for each agent, allowing the model to exploit the social interactions of observed agents.

546

F. Marchetti et al.

Observaon

?

Predicons

Encoder

Obs state

Decoder

Memory Memory Controller Reading states Keys

Values

Fig. 1. General architecture of memory-based trajectory predictors.

Transformer based architectures have shown remarkable performance on sequence to sequence problems [12,52]. Attention based approaches are especially interesting since they allow to easily provide explainable outputs. In trajectory forecasting transformers were applied successfully using both their original and bidirectional variant [15]. Obtaining explainable outputs is critical, especially for autonomous driving systems. Recently, using discrete choice models has shown potential to derive interpretable predictions for trajectory forecasting [23].

3

Memory-Based Trajectory Predictors

Memory-based trajectory predictors are a class of trajectory forecasting models based on Memory-Augmented Neural Networks (MANN). The main idea is that, during a training phase, the network learns to build a knowledge base from the training samples. To this end, a controller is trained to store relevant samples in an external memory. The memory can then be accessed at inference time to retrieve relevant information to forecast the future of a given observation. The success of this approach lies in the fact that the model can match a past trajectory with multiple memory entries and obtain cues about possible outcomes of previously observed similar patterns. The actual prediction of the model can be conditioned on this recalled information to leverage additional knowledge rather than the observed sample alone. Several variants of memory-based trajectory predictors have been proposed in literature [32,33,54]. All these methods share the same structure: (i) first, an encoder learns meaningful representations of the input data; (ii) a writing controller is trained to store relevant and non redundant samples in memory; (iii) a reading controller retrieves relevant information; (iv) finally, a decoder translates the encoded data into a future trajectory. In order to read meaningful cues to inform the decoding process, memory banks are always treated as an associative storage divided into keys and values. Existing methods [32,33,54] mainly differ in the kind of data that memory keys and values represent. Keys, however, must share a common feature space with the data fed as input to

Explainable Sparse Attention for Memory-Based Trajectory Predictors MANTRA PAST

MANTRA-M FUTURE

PAST

MAP

547

MemoNet FUTURE

PAST+SOCIAL

INTENTION

Fig. 2. Memory content for MANTRA [32], MANTRA-M [33] and MemoNet [54]. Memories are divided into keys (left) and values (right).

the model. In fact, at inference time, the current observations is encoded into a latent state which is used to access memory via a similarity function. To comply with the multimodal nature of the trajectory forecasting task, memorybased trajectory predictors can generate multiple futures by retrieving K diverse observations from memory to condition the decoder in different ways. A general scheme of a memory-based trajectory predictor is shown in Fig. 1. In this work we consider three different models: – MANTRA [32] was the first to propose a memory-based trajectory predictor. Memory is populated using a writing controller which decides whether or not to store individual samples based on their usefulness for the prediction task. Memory keys are encodings of past trajectories, while memory values are encodings of the respective futures. – The MANTRA model has been improved in [33] to include contextual information from the surrounding environment. The model now stores as memory keys both encodings of the past and of the semantic segmentation of the surrounding map. We refer to this model as MANTRA-M. – MemoNet [54] adds a social encoder to produce the latent features, thus allowing joint predictions for multiple agents and relies on a different memory structure. Memory keys are still represented by past trajectories but memory values represent intentions, i.e. final goals where the agent may be directed to. In addition, MemoNet uses a trainable reading controller and a clusteringbased decoding process to improve the diversity of the predictions. Overall, whereas the three models share the same structure, they differ in the information that they store in memory. Most importantly MANTRA and MANTRA-M use the memory bank to inform the decoder with whole future trajectories while MemoNet informs the decoder with intentions. Figure 2 summarizes what kind of information is stored in memory for each model. All the models access memory through a similarity function: cosine similarity for MANTRA and MANTRA-M and a learnable addresser for MemoNet. It is important to underline that all the models use their similarity function to retrieve

548

F. Marchetti et al.

Fig. 3. Explainable Sparse Attention. Each head of the controller attends to memory samples in different ways to generate a prediction.

the top-K samples, which are used to individually generate different futures. In this paper, we propose to remove the top-K mechanism in favor of a multi-head sparse attention that combines information from all memory samples at once.

4

Method

In this paper we propose a novel reading controller for memory-based trajectory predictors named Explainable Sparse Attention (ESA). ESA is divided into multiple read heads. Heads are dedicated to extracting different information from memory, which will then be decoded in parallel into multiple diverse futures. Each head is fed with the encoding of the current observation. A projection layer maps the input into a latent query vector Q ∈ RM . Similarly, the same projection is applied to each memory key to obtain a key matrix K ∈ RN ×M , where N is the number of samples in memory and M the dimension of the projection space. The query Q and the key matrix K are compared using a scaled dot product, followed by a sparsemax activation function [35] to obtain attention weights over memory locations:   Qi K T αi = sparsemax √ i (1) M The final reading state is a weighted sum of the memory values V with the attention scores α = {α0 , ..., αN −1 } followed by a tanh activation to regularize the output reading state rs :    αi Vi rs = tanh (2) i

Explainable Sparse Attention for Memory-Based Trajectory Predictors

549

Each head of the ESA controller learns different projections of queries and keys, thus learning different ways to attend to the samples in memory. An overview of the ESA controller is shown in Fig. 3. Differently from prior work, the ESA controller outputs a reading state that can potentially depend on the whole memory content. This is an advantage since it allows the model to achieve better generalization capabilities by leveraging future information from multiple samples instead of just one. This makes the predictions also more robust to outliers or corrupted samples that might poison the memory bank and condition the output towards wrong predictions. 4.1

Sparsemax vs Softmax

The ESA controller uses a sparsemax activation function instead of the more common softmax. Sparsemax is an activation function that is equivalent to the Euclidean projection of the input vector onto the probability simplex, i.e., the space of points representing a probability distribution between mutually exclusive categories: 2 (3) sparsemax(z) = argmin p − z , p∈ΔK−1

where z ∈ RK and ΔK−1 := {p ∈ RK |1 p = 1, p ≥ 0} is a (K − 1)-dimensional simplex. It has been shown that, when the input is projected, it is likely to hit the boundary of the simplex, making the output sparse [35]. In addition to producing sparse activations, sparsemax shares most properties of softmax: by definition, the output is a probability distribution; relative ordering between input elements is maintained; the output is invariant to constant addition; differentiability. For memory-based trajectory predictors that perform a top-K sample retrieval, softmax or sparsemax can be exchanged without affecting the output, since only ordering is important rather than the actual attention values. The usage of sparsemax in the attention mechanism of ESA, however, affects the model by changing the importance of individual samples in the decoding process. There are two important considerations to be made. First, the decoder will receive a reading state with less noise. Being rs a linear combination of all memory samples, zeroing out the coefficients of most samples, will allow the decoder to focus only on elements that are indeed relevant to the current prediction. Second, attending only to a small subset of elements enables a better model explainability. Attended samples can in fact be used to interpret why the predictor produces its outputs.

5

Experiments

We demonstrate the effectiveness of the ESA controller by experimenting with MANTRA [32], MANTRA-M [33] and MemoNet [54]. We compare the original models against our improved version with the ESA controller. We use the

550

F. Marchetti et al.

original experimental setting for each model, testing the models on different trajectory prediction datasets. We first provide an overview of the experimental setting, including datasets and metrics, and we then evaluate the model both quantitatively and qualitatively. 5.1

Evaluation Metrics and Datasets

We report results using two common metrics for vehicle trajectory prediction: Average Displacement Error (ADE) and Final Displacement Error (FDE). ADE is the average L2 error between all future timesteps and FDE is the error at the final timestep. ADE indicates the overall correctness of a prediction, while FDE quantifies the quality of a prediction at a given future horizon. Following recent literature [25,34,54], we take the best out of K predictions to account for the intrinsic multimodality of the task. To evaluate our models we use the following datasets: KITTI [14] The dataset consists of hours of navigation in real-world road traffic scenarios. Object bounding boxes, tracks, calibrations, depths and IMU data were acquired through Velodyne laser scanner, GPS localization system and stereo camera Rig. From these data the trajectories of the vehicles were extracted and divided into scenarios of fixed length. For the evaluation phase, we considered the split used in [25,32]. Each example has a total duration of 6 s where the past trajectory is 2 s long and the future trajectory 4 s. The train dataset contains 8613 examples while the test dataset 2907. Argoverse [6] This dataset is composed by 325k vehicle trajectories acquired in an area of 1000 km2 in the cities of Pittsburgh and Miami. In addition to the trajectories, HD maps containing lane centerlines, traffic direction, ground height and drivable areas are available. Each example has a duration of 5 s, 2 s for the past and 3 s for the future. The dataset is split into train, validation and test. We report results on the validation set v1.1, for which ground truth data is publicly available. SDD [42] The Stanford Drone Dataset is composed of pedestrians and bicycles trajectories acquired by a bird’s eye view drone at 2.5 Hz on a university campus. The split commonly adopted by other state-of-the-art methods (Trajnet challenge [43]) was used for the experiments. The dataset size is 14k scenarios where each trajectory is expressed in pixels. Each example is divided into past trajectories of 3.2 s and future trajectories of 4.8. 5.2

Results

Table 1, Table 2 and Table 3 show the results obtained on the KITTI [14], Argoverse [6] and SDD [42] datasets respectively. For KITTI, we added the ESA controller on MANTRA [32]. The same procedure was done for the Argoverse dataset, where we have used the MANTRA-M model [33], which leverages both trajectory and map information. We have used MemoNet [54] for demonstrating the capabilities of the ESA controller in the SDD dataset. We refer to the three enhanced models as MANTRA+ESA, MANTRA-M+ESA and MemoNet+ESA.

Explainable Sparse Attention for Memory-Based Trajectory Predictors

551

Table 1. Results on the KITTI dataset. ESA leads to considerable improvements against the standard version of MANTRA [32] varying the number of predictions K. Method K=1

K=5

K = 20

ADE

FDE

1s

2s

3s

4s

1s

2s

3s

4s

Kalman [32]

0.51

1.14

1.99

3.03

0.97

2.54

4.71

7.41

Linear [32]

0.20

0.49

0.96

1.64

0.40

1.18

2.56

4.73

MLP [32]

0.20

0.49

0.93

1.53

0.40

1.17

2.39

4.12

DESIRE [25]

-

-

-

-

0.51

1.44

2.76

4.45

MANTRA [32]

0.24

0.57

1.08

1.78

0.44

1.34

2.79

4.83

MANTRA+ESA

0.24

0.50

0.91

1.48

0.41

1.13

2.30

4.01

SynthTraj [3]

0.22

0.38

0.59

0.89

0.35

0.73

1.29

2.27

DESIRE [25]

-

-

-

-

0.28

0.67

1.22

2.06

MANTRA [32]

0.17

0.36

0.61

0.94

0.30

0.75

1.43

2.48 2.11

MANTRA+ESA

0.21

0.35

0.55

0.83

0.31

0.66

1.20

DESIRE [25]

-

-

-

-

-

-

-

2.04

MANTRA [32]

0.16

0.27

0.40

0.59

0.25

0.49

0.83

1.49

MANTRA+ESA

0.17

0.27

0.38

0.56

0.24

0.47

0.76

1.43

Table 2. Results on Argoverse varying the number of predictions K. Errors in meters. Method

ADE 1s

K=1 K=6 K = 10 K = 20

FDE 3s

1s

Off-road (%)

Memory size

3s

MANTRA-M [33]

0.72

2.36

1.25

5.31

1.62%

75,424

MANTRA-M+ESA

0.58

1.76

0.96

3.95

1.84%

9,701

MANTRA-M [33]

0.56

1.22

0.84

2.30

3.27%

12,467

MANTRA-M+ESA

0.47

0.93

0.68

1.57

2.32%

2,337

MANTRA-M [33]

0.53

1.00

0.77

1.69

4.17%

6,566

MANTRA-M+ESA

0.44

0.80

0.63

1.20

2.98%

1,799

MANTRA-M [33]

0.52

0.84

0.73

1.16

7.93%

2,921

MANTRA-M+ESA

0.45

0.73

0.65

0.88

3.14%

1,085

In Table 1 we can observe that MANTRA+ESA significantly lowers the prediction error compared to its MANTRA counterpart, especially for a small number of predictions and for long term prediction horizons (4 s). Indeed, for the single prediction case (K = 1), FDE at 4s decreases from 4.83 m to 4.01 m, with an improvement of 0.82 m (17.18%). With a higher number of predictions (K = 5), the FDE error drops from 2.48 m to 2.11 m with a reduction of 0.37 m (14.91%). In almost all metrics, MANTRA+ESA achieves state-of-the-art results. Similarly, we observe significant improvements for MANTRA-M+ESA on the Argoverse dataset, compared to its original formulation [33] (Table 2). We report gains up to 1.36 m in the K = 1 FDE error at 4 s (25.61% improvement). An even larger relative error decrement is reported for K = 6 with an improvement of 31.74%. Similar considerations can be drawn for the other evaluation settings. Moreover, we show that the amount of information that the network saves in memory with the ESA controller is drastically reduced. In Table 2, we can observe that memory size of MANTRA-M+ESA is significantly smaller than that of the original model, with a difference of 81.25% (10,130 elements). The

552

F. Marchetti et al.

Table 3. Results on SDD varying the number of predictions K. Errors in pixels. K = 20

K=5

Method

ADE FDE Method

ADE FDE

Method

ADE

FDE

Social-STGCNN [36] Trajectron++ [45] SoPhie [44] NMMP [36] EvolveGraph [26] EvolveGraph [26] CF-VAE [4] Goal-GAN [9] P2TIRL [11]

20.60 19.30 16.27 14.67 13.90 13.90 12.60 12.20 12.58

10.27 8.96 8.62 9.96 8.87 7.69 8.11 8.56 8.02

DESIRE [25] Ridel et al. [41] MANTRA [32] PECNet [31] PCCSNet [51] TNT [56] SMEMO [34]

19.25 14.92 13.51 12.79 12.54 12.23 11.64

34.05 27.97 27.34 25.98 21.16 21.12

33.10 32.70 29.38 26.72 22.90 22.90 22.30 22.10 22.07

SimAug [27] MANTRA [32] PCCSNet [51] PECNet [31] LB-EBM [37] Expert-Goals [19] SMEMO [34] MemoNet [54] MemoNet+ESA

19.71 17.76 16.16 15.88 15.61 14.38 13.06 12.66 12.97

MemoNet [54] 13.92 27.18 MemoNet+ESA 12.21 23.03

Fig. 4. Comparison between MANTRA-M+ESA and MANTRA-M [33] on Argoverse. Past trajectories are in blue, ground-truth in green and predictions in red. The gray region represents the drivable area of the map. (Color figure online)

ESA controller, thanks to its sparse attention and learned weighted combination of future states read from memory, is able to further reduce redundancy and space occupancy, as well as guaranteeing better performance. In addition, we also have a relevant improvement in the number of generated predictions that do not go off-road. Interestingly, with 20 predictions, we manage to reduce the number trajectories that go astray by half (7.93% vs 3.14%). As we can see from the qualitative examples in Fig. 4 and Fig. 5, the ESA controller allows to generate trajectories with a better multi-modality than the classic MANTRA versions. This yields a lower error and demonstrates the importance of having a model that is able to explore all the plausible directions and speeds, given a certain past movement and context. In the experiment with the SDD dataset (Table 3), we can observe large gains for K = 5. For K = 20 the results generated by Memonet+ESA are similar to the ones obtained with the original MemoNet formulation, with an improvement for ADE and a slight drop for FDE when predicting 20 different futures. Nonetheless, using the ESA controller, we are able to reduce significantly the memory footprint. Indeed, in Table 4, we can observe that with only 7104 elements in

Explainable Sparse Attention for Memory-Based Trajectory Predictors

553

Fig. 5. Comparison between MANTRA+ESA and MANTRA on KITTI. Past trajectory is in blue, ground-truth in green and predictions in red. (Color figure online)

Table 4. Comparison of the results as the ratio of the Memonet thresholds varies with Memonet+ESA. MemoNet θpast /θint ADE/FDE Memory size w/o ESA

0 0.5 1 5 10

8.65/12.84 8.59/12.70 8.56/12.66 9.22/14.29 9.64/15.57

17970 (100.0%) 15442 (85.9%) 14652 (81.5%) 10698 (59.5%) 6635 (36.9%)

w/ ESA

-

8.02/12.97

7104 (39.5%)

Fig. 6. MemoNet+ESA and MemoNet comparison on SDD. Blue: past; green: ground-truth; red: predictions. (Color figure online)

memory we can reach similar results to MemoNet, which instead requires 14,652 elements in memory (46% difference). In MemoNet, memory is initialized by writing all past and intention features available in the training data and then a filtering algorithm erases redundant memory instances. The algorithm removes all those elements with similar starting and ending points. The metric used is the L2-norm and the proximity threshold is determined by two configurable parameters, θpast and θint , related to starting point and destination distance respectively. In our MemoNet+ESA model we retain the classic memory controller used with key-value memory augmented networks [7,13,32,33], which directly optimizes redundancies with a task loss and does not require memory filtering. Our final memory has a size of 7104 samples. In MemoNet instead, as the ratio of the filtering algorithm thresholds varies, the results change significantly. With a memory size similar to ours (θpast /θint = 10, memory size of 6635) FDE is about 17% lower. At the same time, MemoNet requires twice the samples in memory compared to MemoNet+ESA to obtain a comparable error rate. In Fig. 6 we show a qualitative comparison on SDD between MemoNet and MemoNet+ESA.

554

F. Marchetti et al.

Fig. 7. Explainability analysis on Argoverse. The first column contains the predictions generated with sparsemax and softmax. For each prediction we show the attention vector over memory locations (bottom) and we plot the sum of the semantic maps (left) and the trajectories (right), both weighed by ESA attention.

5.3

Explainability

Providing explainable outputs is a fundamental and crucial aspect for autonomous driving. Since predictors are to be deployed in safety-critical systems, what causes a behavior must be interpretable and observable. One of the advantages of using memory-based trajectory predictors is to explicitly have a link between the memory features used to generate the prediction and the associated training samples. Indeed, by design we can identify the specific samples that allow the generation of a given future trajectory. This in general is not possible with a neural network, where knowledge is distilled in its weights during training. In our work, thanks to the ESA controller, we improve the quality of the explanations. Thanks to the sparsemax activation, each future feature fed to the decoder is a linear combination of a small but significant subset of memory elements. In fact, sparsemax allows the model to generate sparse attentions over the memory. On the contrary, with a softmax activation function, the attention would be smoother and identifying individual sample responsibilities would be harder. By using the sparsemax activation, we want there to be an evident causeeffect relationship between what is read from memory and the generated output. The quality of the explanation however depends on the architecture of the prediction and in particular on what kind of information is stored in memory. In the following we show how predictions can be interpreted using the MANTRA-M+ESA and MemoNet+ESA models. In Fig. 7 we show an example from the Argoverse dataset, comparing the usage of sparsemax and softmax in the MANTRA-M+ESA model. In the figure, we show attention over memory cells, as well as past, future and semantic maps of the samples retrieved from memory. For each prediction, we represent the attention as a heatmap, normal-

Explainable Sparse Attention for Memory-Based Trajectory Predictors

555

Table 5. Quantative analysis of attention values generated by the ESA controller with softmax and sparsemax Att. Values Softmax Sparsemax

Fig. 8. Normalized histograms of attention values, binned in 0.001-width intervals (x-axis).

>0 (%)

100%

2.37%

>0 (mean)

2328

55

Max

0.21

0.47

Mean

0.0004

0.0180

Fig. 9. Explainability for MemoNet+ESA. X-dots are attention-weighted memory intentions associated with predictions of the same color.

ized by its maximum value. We also plot directly on the map all past and future trajectories in memory, weighing their intensity with the respective normalized attention value. In a similar way, we show a semantic heatmap generated by a weighted sum of all maps. Interestingly, the predictions generated by the model using the two activations are very similar. The substantial difference lies in which memory samples are used to generate such predictions. With sparsemax, most attention values are equal to 0 (blue lines in the attention heatmap). Maps and trajectories corresponding to positive attentions can be interpreted as a scenario consistent with the generated prediction. Instead, using softmax, we have a soft attention vector and no element in memory is clearly identifiable as responsible of the prediction. The semantic heatmaps appear similar for all the futures. We can make a quantitative analysis of this behavior. In Table 5 we report the average number of attention values greater than 0, the maximum attention value and the average attention value. Using sparsemax, only 2.37% of the attention values is positive with an average of 55 elements for each example. On the other hand, with softmax, all memory attention values are positive, the maximum attention value is halved and on average attention values are 45 times lower. This demonstrates that sparsemax allows the model to focus only on relevant memory elements, thus providing interpretable insights about the model’s behavior. The same conclusion can be drawn by looking at Fig. 8. We show the normalized histogram of attention values, binned in intervals of width 0.001. Attention values with softmax concentrate close to 0, while with sparsemax are more spreaded.

556

F. Marchetti et al.

Fig. 10. When changing the observed map, the controller focuses on different memory samples. Past trajectory is in blue, ground-truth in green and the predictions in red. (Color figure online)

A similar analysis can be done for MemoNet+ESA, which stores intentions as endpoint coordinates instead of future trajectories and maps. In Fig. 9, we show for each prediction the intentions retrieved from memory, weighed by attention. The final position of the generated trajectories is always in a neighborhood of the intentions considered to be relevant by the ESA controller. In addition, to verify the robustness of the model and its explainability, we perform an ablation study on MANTRA-M+ESA. We manually perturb the input and observe how this affects both the predictions and the explainability. In particular, we change the feature of the semantic map, leaving the past unchanged and observe which are the elements in memory that the model focuses on. As we can observe in Fig. 10, different trajectories are generated which are coherent with the new map. The ESA controller also focuses on memory instances that are related to the new semantic map.

6

Conclusions

We proposed ESA a novel reading controller based on explainable sparse attention for Memory-based Trajectory Predictors. Differently from the prior work, ESA allows to generate predictions based on different combinations of the elements in memory, leading to better generalization and robustness. Furthermore, thanks to the sparsemax activation function, it is possible to identify a small subset of samples relevant to generate the output. We tested ESA on top of state

Explainable Sparse Attention for Memory-Based Trajectory Predictors

557

of the art Memory-based Trajectory Predictors obtaining considerable improvements and demonstrated the explainability of the predictions. Acknowledgements. This work was supported by the European Commission under European Horizon 2020 Programme, grant number 951911 - AI4Media.

References 1. Alahi, A., Goel, K., Ramanathan, V., Robicquet, A., Fei-Fei, L., Savarese, S.: Social LSTM: human trajectory prediction in crowded spaces. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 961–971 (2016) 2. Berlincioni, L., Becattini, F., Galteri, L., Seidenari, L., Bimbo, A.D.: Road layout understanding by generative adversarial inpainting. In: Escalera, S., Ayache, S., Wan, J., Madadi, M., G¨ u¸cl¨ u, U., Bar´ o, X. (eds.) Inpainting and Denoising Challenges. TSSCML, pp. 111–128. Springer, Cham (2019). https://doi.org/10.1007/ 978-3-030-25614-2 10 3. Berlincioni, L., Becattini, F., Seidenari, L., Del Bimbo, A.: Multiple future prediction leveraging synthetic trajectories (2020) 4. Bhattacharyya, A., Hanselmann, M., Fritz, M., Schiele, B., Straehle, C.N.: Conditional flow variational autoencoders for structured sequence prediction (2020) 5. Caesar, H., et al.: nuScenes: a multimodal dataset for autonomous driving. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 11621–11631 (2020) 6. Chang, M.F., et al.: Argoverse: 3D tracking and forecasting with rich maps. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 8748–8757 (2019) 7. De Divitiis, L., Becattini, F., Baecchi, C., Bimbo, A.D.: Disentangling features for fashion recommendation. ACM Trans. Multimedia Comput. Commun. Appl. (TOMM) (2022) 8. De Divitiis, L., Becattini, F., Baecchi, C., Del Bimbo, A.: Style-based outfit recommendation. In: 2021 International Conference on Content-Based Multimedia Indexing (CBMI), pp. 1–4. IEEE (2021) 9. Dendorfer, P., Osep, A., Leal-Taixe, L.: Goal-GAN: multimodal trajectory prediction based on goal position estimation. In: Proceedings of the Asian Conference on Computer Vision (ACCV) (2020) 10. Deo, N., Trivedi, M.M.: Multi-modal trajectory prediction of surrounding vehicles with maneuver based LSTMS. In: 2018 IEEE Intelligent Vehicles Symposium (IV), pp. 1179–1184. IEEE (2018) 11. Deo, N., Trivedi, M.M.: Trajectory forecasts in unknown environments conditioned on grid-based plans. arXiv preprint arXiv:2001.00735 (2020) 12. Devlin, J., Chang, M.W., Lee, K., Toutanova, K.: Bert: pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) 13. De Divitiis, L., Becattini, F., Baecchi, C., Del Bimbo, A.: Garment recommendation with memory augmented neural networks. In: Del Bimbo, A., et al. (eds.) ICPR 2021. LNCS, vol. 12662, pp. 282–295. Springer, Cham (2021). https://doi. org/10.1007/978-3-030-68790-8 23

558

F. Marchetti et al.

14. Geiger, A., Lenz, P., Urtasun, R.: Are we ready for autonomous driving? The kitti vision benchmark suite. In: 2012 IEEE Conference on Computer Vision and Pattern Recognition, pp. 3354–3361. IEEE (2012) 15. Giuliari, F., Hasan, I., Cristani, M., Galasso, F.: Transformer networks for trajectory forecasting. In: 2020 25th International Conference on Pattern Recognition (ICPR), pp. 10335–10342. IEEE (2021) 16. Graves, A., Wayne, G., Danihelka, I.: Neural turing machines. arXiv preprint arXiv:1410.5401 (2014) 17. Graves, A., et al.: Hybrid computing using a neural network with dynamic external memory. Nature 538(7626), 471–476 (2016) 18. Gupta, A., Johnson, J., Fei-Fei, L., Savarese, S., Alahi, A.: Social GAN: socially acceptable trajectories with generative adversarial networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2255–2264 (2018) 19. He, Z., Wildes, R.P.: Where are you heading? Dynamic trajectory prediction with expert goal examples. In: Proceedings of the International Conference on Computer Vision (ICCV) (2021) 20. Helbing, D., Molnar, P.: Social force model for pedestrian dynamics. Phys. Rev. E 51(5), 4282 (1995) 21. Ivanovic, B., Pavone, M.: The trajectron: probabilistic multi-agent trajectory modeling with dynamic spatiotemporal graphs. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 2375–2384 (2019) 22. Kosaraju, V., Sadeghian, A., Martin-Martin, R., Reid, I., Rezatofighi, H., Savarese, S.: Social-bigat: multimodal trajectory forecasting using bicycle-GAN and graph attention networks. In: Advances in Neural Information Processing Systems, vol. 32. Curran Associates, Inc. (2019) 23. Kothari, P., Kreiss, S., Alahi, A.: Human trajectory forecasting in crowds: a deep learning perspective. arXiv preprint arXiv:2007.03639 (2020) 24. Kumar, A., et al.: Ask me anything: dynamic memory networks for natural language processing. In: International Conference on Machine Learning, pp. 1378–1387 (2016) 25. Lee, N., et al.: Desire: distant future prediction in dynamic scenes with interacting agents. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 336–345 (2017) 26. Li, J., Yang, F., Tomizuka, M., Choi, C.: Evolvegraph: multi-agent trajectory prediction with dynamic relational reasoning. In: Proceedings of the Neural Information Processing Systems (NeurIPS) (2020) 27. Liang, J., Jiang, L., Hauptmann, A.: SimAug: learning robust representations from simulation for trajectory prediction. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12358, pp. 275–292. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58601-0 17 28. Lisotto, M., Coscia, P., Ballan, L.: Social and scene-aware trajectory prediction in crowded spaces. In: Proceedings of the IEEE International Conference on Computer Vision Workshops (2019) 29. Ma, C., et al.: Visual question answering with memory-augmented networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 6975–6984 (2018) 30. Ma, Y., Zhu, X., Zhang, S., Yang, R., Wang, W., Manocha, D.: Trafficpredict: trajectory prediction for heterogeneous traffic-agents. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33, pp. 6120–6127 (2019)

Explainable Sparse Attention for Memory-Based Trajectory Predictors

559

31. Mangalam, K., et al.: It is not the journey but the destination: endpoint conditioned trajectory prediction. arXiv preprint arXiv:2004.02025 (2020) 32. Marchetti, F., Becattini, F., Seidenari, L., Del Bimbo, A.: Mantra: memory augmented networks for multiple trajectory prediction. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2020) 33. Marchetti, F., Becattini, F., Seidenari, L., Del Bimbo, A.: Multiple trajectory prediction of moving agents with memory augmented networks. IEEE Trans. Pattern Anal. Mach. Intell. (2020) 34. Marchetti, F., Becattini, F., Seidenari, L., Del Bimbo, A.: Smemo: social memory for trajectory forecasting. arXiv preprint arXiv:2203.12446 (2022) 35. Martins, A., Astudillo, R.: From softmax to sparsemax: a sparse model of attention and multi-label classification. In: International Conference on Machine Learning, pp. 1614–1623. PMLR (2016) 36. Mohamed, A., Qian, K., Elhoseiny, M., Claudel, C.: Social-STGCNN: a social spatio-temporal graph convolutional neural network for human trajectory prediction. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 14424–14432 (2020) 37. Pang, B., Zhao, T., Xie, X., Wu, Y.N.: Trajectory prediction with latent belief energy-based model. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 11814–11824 (2021) 38. Pellegrini, S., Ess, A., Schindler, K., Van Gool, L.: You’ll never walk alone: modeling social behavior for multi-target tracking. In: 2009 IEEE 12th International Conference on Computer Vision, pp. 261–268. IEEE (2009) 39. Pernici, F., Bruni, M., Del Bimbo, A.: Self-supervised on-line cumulative learning from video streams. Comput. Vis. Image Underst. 197, 102983 (2020) 40. Rebuffi, S.A., Kolesnikov, A., Sperl, G., Lampert, C.H.: ICARL: incremental classifier and representation learning. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2001–2010 (2017) 41. Ridel, D., Deo, N., Wolf, D., Trivedi, M.: Scene compliant trajectory forecast with agent-centric spatio-temporal grids. IEEE Robot. Autom. Lett. 5(2), 2816–2823 (2020). https://doi.org/10.1109/LRA.2020.2974393 42. Robicquet, A., Sadeghian, A., Alahi, A., Savarese, S.: Learning social etiquette: human trajectory understanding in crowded scenes. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9912, pp. 549–565. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46484-8 33 43. Sadeghian, A., Kosaraju, V., Gupta, A., Savarese, S., Alahi, A.: Trajnet: towards a benchmark for human trajectory prediction. arXiv preprint (2018) 44. Sadeghian, A., Kosaraju, V., Sadeghian, A., Hirose, N., Rezatofighi, H., Savarese, S.: Sophie: an attentive GAN for predicting paths compliant to social and physical constraints. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1349–1358 (2019) 45. Salzmann, T., Ivanovic, B., Chakravarty, P., Pavone, M.: Trajectron++: multiagent generative trajectory forecasting with heterogeneous data for control. arXiv preprint arXiv:2001.03093 (2020) 46. Santoro, A., Bartunov, S., Botvinick, M., Wierstra, D., Lillicrap, T.: Meta-learning with memory-augmented neural networks. In: International Conference on Machine Learning, pp. 1842–1850 (2016) 47. Shafiee, N., Padir, T., Elhamifar, E.: Introvert: human trajectory prediction via conditional 3D attention. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 16815–16825 (2021)

560

F. Marchetti et al.

48. Shi, L., et al.: SGCN: sparse graph convolution network for pedestrian trajectory prediction. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8994–9003 (2021) 49. Srikanth, S., Ansari, J.A., Sharma, S., et al.: INFER: intermediate representations for future prediction. In: IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2019) (2019) 50. Sukhbaatar, S., Weston, J., Fergus, R., et al.: End-to-end memory networks. In: Advances in Neural Information Processing Systems, pp. 2440–2448 (2015) 51. Sun, J., Li, Y., Fang, H.S., Lu, C.: Three steps to multimodal trajectory prediction: Modality clustering, classification and synthesis. arXiv preprint arXiv:2103.07854 (2021) 52. Vaswani, A., et al.: Attention is all you need. In: Advances in Neural Information Processing Systems, vol. 30 (2017) 53. Weston, J., Chopra, S., Bordes, A.: Memory networks. arXiv preprint arXiv:1410.3916 (2014) 54. Xu, C., Mao, W., Zhang, W., Chen, S.: Remember intentions: retrospectivememory-based trajectory prediction. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 6488–6497 (2022) 55. Yuan, Y., Weng, X., Ou, Y., Kitani, K.: Agentformer: agent-aware transformers for socio-temporal multi-agent forecasting. arXiv preprint arXiv:2103.14023 (2021) 56. Zhao, H., et al.: TNT: target-driven trajectory prediction. arXiv abs/2008.08294 (2020)

Cycle-Consistent World Models for Domain Independent Latent Imagination Sidney Bender1 , Tim Joseph2(B) , and J. Marius Z¨ ollner2,3 1

2

TU Berlin (TUB), Berlin, Germany [email protected] FZI Research Center for Information Technology, Karlsruhe, Germany {joseph,zoellner}@fzi.de 3 Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany [email protected]

Abstract. End-to-end autonomous driving seeks to solve the perception, decision, and control problems in an integrated way, which can be easier to generalize at scale and be more adapting to new scenarios. However, high costs and risks make it very hard to train autonomous cars in the real world. Simulations can therefore be a powerful tool to enable training. Due to slightly different observations, agents trained and evaluated solely in simulation often perform well there but have difficulties in real-world environments. To tackle this problem, we propose a novel model-based reinforcement learning approach called Cycleconsistent World Models. Contrary to related approaches, our model can embed two modalities in a shared latent space and thereby learn from samples in one modality (e.g., simulated data) and be used for inference in different domain (e.g., real-world data). Our experiments using different modalities in the CARLA simulator showed that this enables CCWM to outperform state-of-the-art domain adaptation approaches. Furthermore, we show that CCWM can decode a given latent representation into semantically coherent observations in both modalities. Keywords: Domain adaption models · Cycle-GAN

1

· Reinforcement learning · World

Introduction

Many real-world problems, in our case autonomous driving, can be modeled as high-dimensional control problems. In recent years, there has been much research effort to solve such problems in an end-to-end fashion. While solutions based on imitation learning try to mimic the behavior of an expert, approaches based on reinforcement learning try to learn new behavior to maximize the expected future cumulative reward given at each step by a reward function. In a wide range of areas, reinforcement learning agents can achieve super-human performance [25,28,33] and outperform imitation learning approaches [32]. c The Author(s), under exclusive license to Springer Nature Switzerland AG 2023  L. Karlinsky et al. (Eds.): ECCV 2022 Workshops, LNCS 13805, pp. 561–574, 2023. https://doi.org/10.1007/978-3-031-25072-9_38

562

S. Bender et al.

However, for high-dimensional observation spaces many reinforcement learning algorithms that are considered state-of-the-art learn slowly or fail to solve the given task at all. Moreover, when the agent fails to achieve satisfactory performance for a given task, it is hard to analyze the agent for possible sources of failure. Model-based reinforcement learning promises to improve upon these aspects. Recent work has shown that model-based RL algorithms can be a magnitude more data-efficient on some problems [7,11,13,14,19,21]. Additionally, since a predictive world model is learned, one can analyze the agent’s perception of the world [5]. Still, such agents are mostly trained in simulations [1,4,31] since interaction with the real world can be costly (for example, the cost for a fleet of robots or the cost to label the data). Some situations should be encountered to learn, but must never be experienced outside of simulation (e.g., crashing an autonomous vehicle). While simulations allow generating many interactions, there can be a substantial mismatch between the observations generated by the simulator and the observations that the agent will perceive when deployed to the real world. Furthermore, observations from simulation and reality are mostly unaligned, i.e., there is no one-to-one correspondence between them. This mismatch is often called the domain gap [9] between the real and simulated domain. When the domain gap is not taken into account, the behavior of an agent can become unpredictable as it may encounter observations in reality that have never been seen before in simulation. One family of approaches to reduce this gap is based on the shared-latent space assumption [22]. The main idea is that the semantics of an observation are located in a latent space from which a simulated and an aligned real observation can be reconstructed. Approaches grounded on this assumption have recently been able to achieve impressive results in areas such as style transfers [16] and imitation learning [2]. Inspired by this, we propose adopting the idea of a shared latent space to model-based reinforcement learning by constructing a sequential shared-latent variable model. Our main idea is to create a model that allows to plan via latent imagination independently of the observation domain. The model is trained to project observation sequences from either domain into a shared latent space and to predict the future development in this latent space. By repeatedly rolling out the model one can then plan or train a policy based on low-dimensional state trajectories. Our contributions can be summarized as follows: 1. We present a novel cycleconsistent world model (CCWM) that can embed two similar partially observable Markov decision processes that primarily differ in their observation modality into a shared latent space without the need for aligned data. 2. We show that observation trajectories of one domain can be encoded into a latent space from which CCWM can decode an aligned trajectory in the other domain. This can be used as a mechanism to make the agent interpretable. 3. We test our model in a toy environment and train a policy via latent imagination first and then evaluate and show that it is also able to learn a shared latent representation for observations from a more complex environment based on the CARLA simulator.

Cycle-Consistent World Models

2

563

Preliminaries

Sequential Latent Variable Models. In contrast to model-free reinforcement learning (RL), model-based RL explicitly learns an approximate transition model of the environment to predict the next observation xt+1 from the current observation xt and the chosen action at [30]. The model is used to rollout imagined trajectories xt+1 , at+1 , xt+2 , at+2 , ... which can be either used to find the best future actions or to train a policy without the need to interact with the real environment. A problem with such a model is that rollouts become computationally expensive for high-dimensional observation spaces. For this reason, many recent model-based RL algorithms make use of sequential latent variable models. Instead of learning a transition function in observation space X ⊆ RdX , observations are first projected into a lower-dimensional latent space S ⊆ RdS with dS  dX . Then a latent transition function can be used to rollout trajectories of latent states st+1 , at+1 , st+2 , at+2 , ... computationally efficient [12,15]. Since naive learning of latent variable models is intractable, a prevailing way to train such models is by variational inference [20]. The resulting model consists of the following components: – Dynamics models: prior pθ (st |st−1 , at−1 ) and posterior qθ (st |st−1 , at−1 , xt ) – Observation model: pθ (xt |st ) Furthermore, at each time step the resulting loss function encourages the ability to reconstruct observations from the latent states while at the same time enforcing to be able to predict the future states from past observations. This loss function is also known as the negative of the evidence lower bound (ELBO): Lt = − E[pθ (xt |st )] + E[KL(qθ (st |st−1 , at−1 , xt )  pθ (st |st−1 , at−1 ))] 

qθ (st |x≤t ,a≤t )





reconstruction loss Lrecon

qθ (st−1 |x≤t−1 ,a≤t−1 )





(1)



regularization loss Lreg

Shared Latent Space Models. We want to enable our model to jointly embed unaligned observation from two different modalities of the same partially observable Markov decision process into the same latent space. Let XA and XB be two observation domains (e.g., image domains with one containing RGB images and the other one containing semantically segmented images). In aligned domain translation, we are given samples (xB , xB ) drawn from a joint distribution PXA ,XB (xA , xB ). In unaligned domain translation, we are given samples drawn from the marginal distributions PXA (xA ) and PXB (xB ). Since an infinite set of possible joint distributions can yield the given marginal distributions, it is impossible to learn the actual joint distribution from samples of the marginals without additional assumptions. A common assumption is the shared-latent space assumption [23,24]. It postulates that for any given pair of samples (xA , xB ) ∼ PXA ,XB (xA , xB ) there exists a shared latent code s in a shared-latent space such that both samples

564

S. Bender et al.

can be generated from this code, and that this code can be computed from any of the two samples. In other words, we assume that there exists a function with s = E A→S (xA ) that maps from domain XA to a latent space S and a function with xA = GS→A (s) that maps back to the observation domain. Similarly, the functions s = E B→S (xB ) and xB = GS→B must exist and map to/from to the same latent state. Directly from these assumptions follows that observations of domain A can be translated to domain B via encoding and decoding and the same must hold for the opposite direction: GS→B (E A→S (xA )) ∈ XB GS→A (E B→S (xA )) ∈ XA

(2)

Another implication of the shared latent space assumption is that observations from one domain can be translated the other one and back to the original domain (cycle-consistency [37]): E A→S (xa ) = E B→S (GS→B (E A→S (xA ))) E B→S (xb ) = E A→S (GS→A (E B→S (xB )))

(3)

The fundamental idea is that by enforcing both of them on semantically similar input domains, the model embeds semantically similar samples close to each other in the same latent space.

3

Cycle-Consistent World Models

In this section, we present our cycle-consistent world model (CCWM). Considering the structure of sequential latent variable models and the constraints resulting from the shared latent space assumption, we show how both can be integrated into a single unified model. In the following, we explain the model architecture and the associated loss terms (Fig. 1). Architecture. As our model is a sequential latent variable model, it includes all the components that have been presented in Sect. 2, namely a prior transition model pθ (st |st−1 , at−1 ), a posterior transition model qθ (st |st−1 , at−1 , ht ) and an A observation model pA θ (xt |st ) with DecA (st ) = mode(pθ (xt |st )). Additionally, we define a feature extractor with ht = EncA (xt ) and a reward model pA θ (rt |st ). So far, this model can be used as the basis of an RL-agent that acts on a single domain by first building up the current latent representation st using the feature extractor and posterior and then rolling out future trajectories st+1 , st+2 , ... with their associated rewards with the prior dynamics and the reward model. To project to and from another domain XB into the same latent space S we add another feature extractor EncB (xt ) and observation model pB θ (xt |st ) with DecB (st ) = mode(pB θ (xt |st )). Both are similar to their domain XA counterparts but do not share any weights. The prior dynamics model is shared since it does not depend on observation. In contrast, we need another posterior dynamics

Cycle-Consistent World Models

Domain A

Cyclic Loss a2latent encoder

b2latent encoder

565

Domain B

Transition Model

Latent Code Latent Code Latent Codes

latent2a decoder Reconstruction Loss

Regularization Loss

latent2b decoder Adversarial Loss

Fig. 1. Cycle-consistent world model. In the pictured situation, a sequence of top camera images is used a the input. The images are encoded frame-wise into latent states and forward predicted by the transition model. From these latent codes, reconstructed top camera images and images translated to semantic top camera images are calculated. From the translated images, cyclic latent codes are calculated. Finally, the four losses can be calculated, which enforce Eqs. (2) and (3).

model for domain B, but since we let it share weights with its domain A counterpart, we effectively only have a single posterior dynamics model. Additionally, we add a reward model pθ (rt |st ) that also is shared between both domains so that latent trajectories can be rolled out independently of the observation domain. A major advantage of this approach is that we can train a policy with our model without regard to the observation domains. B Finally, for training only, we need two discriminators DisA φ and Disφ to distinguish between real and generated samples for each domain. It is important to note that the discriminators have a separate set of parameters φ. Losses. Given a sequence of actions and observations {at , xt }k+H t=k ∼ DA from a dataset DA collected in a single domain XA , we first roll out the sequential latent variable model using the posterior to receive an estimate for the posterior distribution q(st |st−1 , at−1 , xt ) and the prior distribution q(st |st−1 , at−1 , xt ) for each time step. We can then calculate the following losses: Lrecon is the reconstruction loss of the sequential latent variable model and Lreg (q, p) = KL(q  p) is the regularization loss that enforces predictability of futures state as shown in Eq. 1. Ladv (x) = DisB (x) is an adversarial loss that penalizes translations from domain XA to XB via S that are outside of domain XB to enforce Eq. 2 of the shared latent space assumption. Here, DisB is a PatchGAN [17] based discriminator that is trained alongside our model to differentiate between real and generated observations. The cycle loss Lcyc (q, p) = KL(q  p) is derived from the cycle constraints of Eq. 3 and calculates the KL-divergence between the posterior state distributions conditioned on observations and states from domain A and conditioned on observations and states that have been translated to domain → scyc (see Algorithm 1; line 7, 8 and 12). To calculate B, i.e. xt → st → xtrans t t


To calculate the cyclic loss, it is necessary to roll out a second set of state trajectories using the cyclic encoding h_t^cyc and the cyclic state s_t^cyc. For sequences of domain B, we train with the same loss functions, but with every occurrence of A and B interchanged. This is also shown in Algorithm 1, lines 26 and 28.

Algorithm 1: Training Routine of the CCWM

Input: Replay buffers D_A and D_B, encoders Enc_a and Enc_b, decoders Dec_a and Dec_b, model parameters Θ, discriminator parameters Φ

 1  Function L_gen(Enc_1, Dec_1, Enc_2, Dec_2, x_{1:T}):
 2      foreach t ∈ T do
 3          h_t ← Enc_1(x_t)
 4          s_t ∼ q(s_t | s_{t−1}, h_t)
 5          x_t^recon ← Dec_1(s_t)
 6          x_t^trans ← Dec_2(s_t)
 7          h_t^cyc ← Enc_2(x_t^trans)
 8          s_t^cyc ∼ q(s_t^cyc | s_{t−1}^cyc, h_t^cyc)
 9          L_ret += L_recon(x_t, x_t^recon)
10          L_ret += L_adv(x_t^trans)
11          L_ret += L_reg(q(s_t | s_{t−1}, h_t), p(s_t | s_{t−1}))
12          L_ret += L_cyc(q(s_t | s_{t−1}, h_t), q(s_t | s_{t−1}^cyc, h_t^cyc))
13      end
14      return L_ret
15  Function L_dis(Enc_1, Dec_2, x^1, x^2):
16      foreach t ∈ T do
17          h_t ← Enc_1(x_t^1)
18          s_t ∼ q(s_t | s_{t−1}, h_t)
19          x_t^trans ← Dec_2(s_t)
20          L_ret += L_adv(x_t^2) + (1 − L_adv(x_t^trans))
21      end
22      return L_ret
23  while not converged do
24      Draw sequence x_{a,1:T} ∼ D_A
25      Draw sequence x_{b,1:T} ∼ D_B
26      L_gen = L_gen(Enc_a, Dec_a, Enc_b, Dec_b, x_{a,1:T}) + L_gen(Enc_b, Dec_b, Enc_a, Dec_a, x_{b,1:T})
27      Update model parameters Θ ← Θ + ΔL_gen
28      L_dis = L_dis(Enc_a, Dec_b, x_{a,1:T}, x_{b,1:T}) + L_dis(Enc_b, Dec_a, x_{b,1:T}, x_{a,1:T})
29      Update discriminator parameters Φ ← Φ + ΔL_dis
30  end
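To illustrate how the generator-side losses of Algorithm 1 could be computed in practice, here is a hedged PyTorch sketch for a single time step and a single domain-A observation. The interface (posterior/prior returning torch Normal distributions, enc_a/enc_b, dec_a/dec_b, a discriminator dis_b) and the GAN sign convention are our assumptions, not the authors' code.

```python
import torch.nn.functional as F
from torch.distributions import kl_divergence

def generator_losses_step(model, dis_b, s_prev, s_prev_cyc, a_prev, x_t):
    """One time step of L_gen for a domain-A observation (sketch only).
    `model` is assumed to expose enc_a/enc_b, dec_a/dec_b, and prior(s, a) /
    posterior(s, a, h) that each return a torch.distributions.Normal."""
    h_t = model.enc_a(x_t)
    q_t = model.posterior(s_prev, a_prev, h_t)   # q(s_t | s_{t-1}, a_{t-1}, h_t)
    p_t = model.prior(s_prev, a_prev)            # p(s_t | s_{t-1}, a_{t-1})
    s_t = q_t.rsample()

    x_recon = model.dec_a(s_t)                   # reconstruction in domain A
    x_trans = model.dec_b(s_t)                   # translation into domain B

    # cyclic pass: re-encode the translated observation with the domain-B encoder
    h_cyc = model.enc_b(x_trans)
    q_cyc = model.posterior(s_prev_cyc, a_prev, h_cyc)
    s_cyc = q_cyc.rsample()

    loss_recon = F.mse_loss(x_recon, x_t)        # L_recon
    loss_adv = -dis_b(x_trans).mean()            # L_adv: make the translation look real to Dis_B
                                                 # (sign convention is illustrative)
    loss_reg = kl_divergence(q_t, p_t).mean()    # L_reg = KL(q || p)
    loss_cyc = kl_divergence(q_t, q_cyc).mean()  # L_cyc: cycle consistency in latent space
    return loss_recon + loss_adv + loss_reg + loss_cyc, s_t, s_cyc
```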

4 Related Work

Control with Latent Dynamics. World Models [10] learn latent dynamics in a two-stage process to evolve linear controllers in imagination. PlaNet [15] learns them jointly and solves visual locomotion tasks by latent online planning. Dreamer [12,14] extends PlaNet by replacing the online planner with a learned policy that is trained by back-propagating gradients through the transition function of the world model. MuZero [28] learns task-specific reward and value models to solve challenging tasks but requires large amounts of experience. While all these approaches achieve impressive results, they are limited to their training domain and have no inherent way to adapt to another domain.

Domain Randomization. James et al. [18] introduce a novel approach to cross the visual reality gap, called Randomized-to-Canonical Adaptation Networks (RCANs), that uses no real-world data. RCAN learns to translate randomized rendered images into their equivalent non-randomized, canonical versions. In turn, this allows real images to be translated into canonical simulated images. Xu et al. [36] showed that random convolutions (RC) as data augmentation can greatly improve the robustness of neural networks. Random convolutions are approximately shape-preserving and may distort local textures. RC outperformed related approaches such as [26,34,35] by a wide margin, and we therefore consider it the state of the art in domain randomization.

Unsupervised Domain Adaptation. The original Cycle-GAN [37] learns to translate images from one domain to another by including a cycle loss and an adversarial loss in training. Liu et al. [23] extend this idea with weight sharing of the inner layers and a normalization loss on the latent state, which enables embedding images of semantically similar domains into the same latent space. Learning to drive [3] uses this idea to train an imitation learning agent in simulation that successfully drives in reality. In RL-Cycle-GAN [27], a Cycle-GAN with an RL scene consistency loss is used, and the authors show that even without the RL scene consistency loss, RCAN [18] is outperformed by a wide margin. To the best of our knowledge, RL-Cycle-GAN is the state of the art for unsupervised domain adaptation.

5 Experiments

First, we demonstrate our model in a small toy environment. Then we show its potential in a more realistic setting related to autonomous driving, based on the CARLA simulator [8].

Implementation. Our prior and posterior transition models are implemented as recurrent state-space models (RSSM) [15]. In the RSSM, we exchanged the GRU [6] for a convolutional GRU [29]. A challenge of integrating the ideas of a world model and a shared latent space assumption is that a shared latent space is easier to enforce on a large, three-dimensional tensor-shaped latent space, whereas most world models use a low-dimensional vector latent space. A bigger


latent space makes it easier to embed and align both modalities, but it leads to a less informative self-supervised encoding for the downstream heads, such as the reward model. As we show in our ablation study, choosing the right height and width of the latent space is crucial for successful learning.

Proof of Concept. Reinforcement learning environments are often very complex, so evaluation and model analysis can become hard for complex models such as ours. Additionally, domain adaptation complicates evaluation even more. For this reason, we first construct a toy environment, which we call ArtificialV0, to show that our idea works in principle. ArtificialV0 is constructed as follows: a state of ArtificialV0 is the position of a red and a blue dot. Its state space is the box [−1, 1] × [−1, 1]. As observations, we use images of the red and the blue dot on a white background. The goal is to move the red dot towards the blue dot. The actions are steps of the red dot with an action space of [−0.2, 0.2] × [−0.2, 0.2]. The negative Euclidean distance between the blue and the red dot is used as the reward. An episode terminates as soon as the Euclidean distance is smaller than 0.1. The other modality is constructed identically, but the observation images are inverted. Advantages of ArtificialV0 are that the actions and observations are easy to interpret and that the optimal policy is easy to implement as a reference benchmark. The optimal policy moves the red dot on a straight line towards the blue dot and achieves an average return of −2.97. We find that CCWM achieves a similar average return after 30K environment steps in an online setting in both modalities, despite only giving it access to a small offline dataset of 5000 disjoint observations from the inverted modality without downstream information. In Fig. 2, one can see that a trajectory can be started in the inverted modality and successfully continued in both modalities. This indicates that the model is capable of embedding both modalities into a shared latent space.
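For concreteness, a minimal NumPy sketch of such an environment is given below; the class name, image resolution, and dot rendering are our illustrative choices, not the authors' code.

```python
import numpy as np

class ArtificialV0:
    """Toy environment sketch following the description above (not the authors' code).
    State: positions of a red and a blue dot in [-1, 1]^2; the agent moves the red dot."""
    def __init__(self, size=64, inverted=False, seed=0):
        self.size = size              # observation resolution
        self.inverted = inverted      # second modality: colour-inverted images
        self.rng = np.random.default_rng(seed)

    def reset(self):
        self.red = self.rng.uniform(-1, 1, size=2)
        self.blue = self.rng.uniform(-1, 1, size=2)
        return self._render()

    def step(self, action):
        action = np.clip(action, -0.2, 0.2)               # action space [-0.2, 0.2]^2
        self.red = np.clip(self.red + action, -1.0, 1.0)
        dist = np.linalg.norm(self.red - self.blue)
        reward = -dist                                    # negative Euclidean distance
        done = dist < 0.1                                 # terminate when the dots are close
        return self._render(), reward, done, {}

    def _render(self):
        img = np.ones((self.size, self.size, 3), dtype=np.float32)   # white background
        for pos, colour in ((self.red, (1, 0, 0)), (self.blue, (0, 0, 1))):
            r, c = ((pos + 1) / 2 * (self.size - 1)).astype(int)[::-1]
            img[max(r - 2, 0):r + 3, max(c - 2, 0):c + 3] = colour
        return 1.0 - img if self.inverted else img        # inverted images form modality B

def optimal_policy(env):
    """Reference policy: step straight towards the blue dot."""
    return np.clip(env.blue - env.red, -0.2, 0.2)
```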

Fig. 2. Qualitative results on ArtificialV0. The top row shows the observations recorded from the environment when one observation is given to the model and the policy is rolled out. It shows that the model can learn the optimal policy (bringing the red/turquoise dot towards the blue/yellow dot on a straight line) with downstream information only from the original modality, yet it also works in the reversed modality. The second row is the prediction of our CCWM back into the domain from which the agent retrieved the initial observation. The last row is the cross-modality prediction. (Color figure online)


Table 1. Comparison with the state-of-the-art. We measured the quality of the reward prediction with the relative squared error against predicting the mean reward to show that something better than predicting the mean is learned. Furthermore, we determined how well the different models can predict the next states based on the peak signal-to-noise ratio (PSNR) between the real future observations and the predicted observations. We can see that all domain adaptation methods can transfer the reward predictions while only using one modality. Our CCWM achieved the best reward transfer and the best video prediction. It is worth mentioning that the cross-modality reward predictions with only one modality and with RC were unstable, varying strongly over time steps depending on the initialization. Since single modality and RC are fast, while CycleGAN and CCWM are slow, we show the results after training approximately 24 h on an NVIDIA GTX1080TI to keep the comparison fair.

Approach          Reward RSE   Reward RSE (cross-modality)   PSNR
Single modality   0.25         3.86                          10.21
RC                0.31         0.49                          11.39
CycleGAN          0.28         0.57                          12.28
Ours              0.23         0.48                          13.91

Table 2. Ablation study on the size of the latent space. The models are identical except that the convolutional GRU is used at different downsampling scales of the network. We can see that latent spaces smaller than 4×4 have trouble minimizing all objectives at once, and their reward RSE does not fall significantly below simply predicting the mean.

Latent space size   Reward RSE   Reward RSE (cross-modality)   PSNR
1×1                 0.92         1.18                          13.00
2×2                 0.95         1.10                          13.80
4×4                 0.57         0.57                          13.81
8×8                 0.23         0.48                          13.91

Experiment Setup. To show the potential of our approach in a more realistic environment, we also evaluate our model in the CARLA simulator. We use images from a semantic camera as the first modality and images from an RGB camera as the second modality. Both look down onto the cars from a bird's-eye view. For an even more realistic setting, one could replace the top-view RGB camera with an RGB surround-view camera in a real car and the schematic top view with an RGB surround-view camera in simulation. However, since we do not have access to a real car with such sensors and we are restricted in computational resources, we simplified the problem for now. Arguably, the visual difference between the RGB camera from the simulation and the real-world RGB camera is smaller than the visual difference between the RGB camera in the simulation and the schematic view of the simulation, so there is reason to believe that a


transfer from the RGB camera of the simulation to the RGB camera of the real world would work as well.

Comparison with the State-of-the-Art. To show that the constructed domain gap is not trivial and that our model outperforms current domain adaptation methods, we compare our model with 1) no adaptation to the other modality at all, 2) the random convolutions (RC) [36] approach, which we regard as the state of the art in domain randomization, and 3) RL-Cycle-GAN [27], which we consider the state of the art in unsupervised domain adaptation. All models are reimplemented and integrated into our codebase. Apart from their core ideas, they are kept as similar as possible regarding network structure, network size, and other hyperparameters. The performance of a world model rises and falls with two factors: 1) how well the model can predict the current reward based on the current state, and 2) how accurate the prediction of the next states is. We recorded three disjoint offline datasets with the CARLA roaming agent (an agent controlled by the CARLA traffic manager). The first contains trajectories of observations of the semantic view with downstream information. The second contains trajectories of observations of the RGB camera without downstream information. The third contains aligned semantic and RGB camera trajectories and downstream information. The first and the second dataset are used for training the models, and the third is used for evaluation. The model without any domain adaptation is trained on the first dataset in the regular Dreamer style. The RC model is trained on the first dataset with randomized inputs. The RL-Cycle-GAN model is trained by first learning a Cycle-GAN-based translation from the first modality to the second modality; then the model is trained on the translated observations of the first dataset. CCWM is trained as described in the previous section on the first and the second dataset.

Results. All models are evaluated on the third dataset in the following ways. First, we qualitatively analyze the model's predictive power for the next states. We warm up the model by feeding it some observations and then predict the next observations of the target domain, as shown in Fig. 3. A noteworthy general advantage of CCWM is that it can predict into both modalities simultaneously, since both share a latent representation, which can be practical for error analysis. Besides the qualitative analysis of the state predictions based on the predicted observations, we also compare the predictions quantitatively by calculating the PSNR between the predicted and the real trajectory, as seen in Table 1. Furthermore, we compare the reward prediction in the target domain, where no downstream information was available. In both the qualitative and the quantitative comparison, one can see that our model outperforms the other approaches.
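The two quantitative metrics reported in Table 1 can be written down compactly; the following is a small sketch of how they could be computed (our formulation of the metrics described above, not the authors' evaluation code).

```python
import numpy as np

def relative_squared_error(pred, target):
    """RSE relative to the mean predictor: values below 1 mean the model is
    better than simply predicting the mean reward (sketch of the metric)."""
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    return np.sum((pred - target) ** 2) / np.sum((target.mean() - target) ** 2)

def psnr(pred, target, max_val=1.0):
    """Peak signal-to-noise ratio between predicted and ground-truth frames."""
    mse = np.mean((np.asarray(pred) - np.asarray(target)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(max_val ** 2 / mse)
```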


Fig. 3. Qualitative results on CARLA. The first row shows the ground truth of the semantic top camera sampled from dataset 3, and the second row the baseline of what happens if the Dreamer is trained in one modality and rolled out in the other modality. Rows 3 and 4 show the state-of-the-art comparison with random convolutions and with inputs pre-translated by a Cycle-GAN; both were also trained only with the RGB top camera. The 5th and 6th rows show our model rolled out aligned in both modalities. For all models, the previous 19 frames and the first frame of the ground truth are fed into the model, and then the model is rolled out for fifteen time steps (every second frame is shown).

Analysis. The advantage of our approach over RC is that RC generalizes over the random distortions of the input image that a random convolution layer can emulate; these might include the semantic segmentation mask, but they also include many other distributions, which makes RC less directed despite its simplicity. Pre-translating with a Cycle-GAN follows a more directed approach but cannot train the whole network end-to-end. Furthermore, it first encodes a training image, then decodes it to a different domain, and then encodes it again to derive downstream information and predict future states. This is a longer path than encoding only once, as CCWM does, and it leaves room for well-known problems of adversarial nets, such as artifacts in the image, which hinder training progress.

Ablation Study. Although probabilistic graphical models and reinforcement learning approaches are generally sensitive to hyperparameters, the size of the latent space has proven to be especially significant. As shown in Table 2, a 1 × 1 latent space, as is common in many model-based RL approaches, performs poorly, while bigger latent spaces provide much better performance. Our explanation for this is twofold. Firstly, related approaches such as UNIT [23] cannot translate images well with a tiny latent space and instead use huge latent spaces. Secondly, in autonomous driving, it might not be beneficial to compress the whole complicated scene, with multiple cars that all have their own location, direction, and speed, into one vector; instead, we give the network the inductive bias to


represent each of them in its own vector and calculate the dynamics through the convolutional GRU with its suitable local inductive bias. Another important consideration is the weighting of the different losses, which needs to be chosen carefully. The reward loss tends to get stuck around the mean since its signal is relatively weak, so its weight should be chosen relatively high. The KL-based losses in the latent space can become very large and destroy the whole model within a single update step. On the other hand, a high regularization loss leads to bad predictive capabilities, and a high cyclic loss leads to a bad alignment of the modalities.

6 Conclusion

In this work, we introduced cycle-consistent world models, a world model for model-based reinforcement learning that is capable of embedding two modalities into the same latent space. We developed a procedure to train our model and showed its performance in a small toy environment and in a more complex environment based on the CARLA simulator. Furthermore, we compared it in an offline setting with two state-of-the-art approaches in domain adaptation, namely RC and RL-Cycle-GAN. We outperformed RC by being more directed and Cycle-GAN by training end-to-end without the necessity to encode twice. In the future, we plan to extend our model by training a full model-based RL agent that is able to learn to control a vehicle in simulation and generalize to reality given only offline data from reality without any reward information.

Acknowledgement. The research leading to these results is funded by the German Federal Ministry for Economic Affairs and Climate Action within the project "KI Delta Learning" (Förderkennzeichen 19A19013L). The authors would like to thank the consortium for the successful cooperation.

References

1. Bellemare, M.G., Naddaf, Y., Veness, J., Bowling, M.: The arcade learning environment: an evaluation platform for general agents. J. Artif. Intell. Res. 47, 253–279 (2013)
2. Bewley, A., et al.: Learning to drive from simulation without real world labels. CoRR abs/1812.03823 (2018). http://arxiv.org/abs/1812.03823
3. Bewley, A., et al.: Learning to drive from simulation without real world labels. In: 2019 International Conference on Robotics and Automation (ICRA), pp. 4818–4824. IEEE (2019)
4. Brockman, G., et al.: OpenAI Gym. CoRR abs/1606.01540 (2016). http://arxiv.org/abs/1606.01540
5. Chen, J., Li, S.E., Tomizuka, M.: Interpretable end-to-end urban autonomous driving with latent deep reinforcement learning. IEEE Trans. Intell. Transp. Syst. 1–11 (2021). https://doi.org/10.1109/TITS.2020.3046646
6. Cho, K., van Merrienboer, B., Bahdanau, D., Bengio, Y.: On the properties of neural machine translation: encoder-decoder approaches. CoRR abs/1409.1259 (2014). http://arxiv.org/abs/1409.1259


7. Chua, K., Calandra, R., McAllister, R., Levine, S.: Deep reinforcement learning in a handful of trials using probabilistic dynamics models. In: Bengio, S., Wallach, H., Larochelle, H., Grauman, K., Cesa-Bianchi, N., Garnett, R. (eds.) Advances in Neural Information Processing Systems, vol. 31. Curran Associates, Inc. (2018). https://proceedings.neurips.cc/paper/2018/file/3de568f8597b94bda53149c7d7f5958c-Paper.pdf
8. Dosovitskiy, A., Ros, G., Codevilla, F., Lopez, A., Koltun, V.: CARLA: an open urban driving simulator. In: Conference on Robot Learning, pp. 1–16. PMLR (2017)
9. Ganin, Y., Lempitsky, V.: Unsupervised domain adaptation by backpropagation. In: Bach, F., Blei, D. (eds.) Proceedings of the 32nd International Conference on Machine Learning, Proceedings of Machine Learning Research, Lille, France, vol. 37, pp. 1180–1189. PMLR (2015). http://proceedings.mlr.press/v37/ganin15.html
10. Ha, D., Schmidhuber, J.: World models. arXiv preprint arXiv:1803.10122 (2018)
11. Hafner, D., Lillicrap, T., Ba, J., Norouzi, M.: Dream to control: learning behaviors by latent imagination. arXiv preprint arXiv:1912.01603 (2019)
12. Hafner, D., Lillicrap, T., Ba, J., Norouzi, M.: Dream to control: learning behaviors by latent imagination. In: International Conference on Learning Representations (2020). https://openreview.net/forum?id=S1lOTC4tDS
13. Hafner, D., et al.: Learning latent dynamics for planning from pixels. In: International Conference on Machine Learning, pp. 2555–2565. PMLR (2019)
14. Hafner, D., Lillicrap, T., Norouzi, M., Ba, J.: Mastering Atari with discrete world models. arXiv preprint arXiv:2010.02193 (2020)
15. Hafner, D., et al.: Learning latent dynamics for planning from pixels. CoRR abs/1811.04551 (2018). http://arxiv.org/abs/1811.04551
16. Huang, X., Liu, M.Y., Belongie, S., Kautz, J.: Multimodal unsupervised image-to-image translation. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 172–189 (2018)
17. Isola, P., Zhu, J.Y., Zhou, T., Efros, A.A.: Image-to-image translation with conditional adversarial networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1125–1134 (2017)
18. James, S., et al.: Sim-to-real via sim-to-sim: data-efficient robotic grasping via randomized-to-canonical adaptation networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 12627–12637 (2019)
19. Janner, M., Fu, J., Zhang, M., Levine, S.: When to trust your model: model-based policy optimization. In: Wallach, H., Larochelle, H., Beygelzimer, A., d'Alché-Buc, F., Fox, E., Garnett, R. (eds.) Advances in Neural Information Processing Systems, vol. 32. Curran Associates, Inc. (2019). https://proceedings.neurips.cc/paper/2019/file/5faf461eff3099671ad63c6f3f094f7f-Paper.pdf
20. Kingma, D.P., Welling, M.: Auto-encoding variational Bayes (2014)
21. Kurutach, T., Clavera, I., Duan, Y., Tamar, A., Abbeel, P.: Model-ensemble trust-region policy optimization. In: International Conference on Learning Representations (2018). https://openreview.net/forum?id=SJJinbWRZ
22. Liu, M.Y., Breuel, T., Kautz, J.: Unsupervised image-to-image translation networks. arXiv preprint arXiv:1703.00848 (2017)
23. Liu, M.Y., Breuel, T., Kautz, J.: Unsupervised image-to-image translation networks. In: Guyon, I., et al. (eds.) Advances in Neural Information Processing Systems, vol. 30. Curran Associates, Inc. (2017). https://proceedings.neurips.cc/paper/2017/file/dc6a6489640ca02b0d42dabeb8e46bb7-Paper.pdf


24. Liu, M.Y., Tuzel, O.: Coupled generative adversarial networks. In: Lee, D., Sugiyama, M., Luxburg, U., Guyon, I., Garnett, R. (eds.) Advances in Neural Information Processing Systems, vol. 29. Curran Associates, Inc. (2016). https://proceedings.neurips.cc/paper/2016/file/502e4a16930e414107ee22b6198c578f-Paper.pdf
25. Mnih, V., et al.: Playing Atari with deep reinforcement learning. arXiv preprint arXiv:1312.5602 (2013)
26. Qiao, F., Zhao, L., Peng, X.: Learning to learn single domain generalization. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 12556–12565 (2020)
27. Rao, K., Harris, C., Irpan, A., Levine, S., Ibarz, J., Khansari, M.: RL-CycleGAN: reinforcement learning aware simulation-to-real. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 11157–11166 (2020)
28. Schrittwieser, J., et al.: Mastering Atari, Go, chess and shogi by planning with a learned model. Nature 588(7839), 604–609 (2020)
29. Siam, M., Valipour, S., Jagersand, M., Ray, N.: Convolutional gated recurrent networks for video segmentation. In: 2017 IEEE International Conference on Image Processing (ICIP), pp. 3090–3094. IEEE (2017)
30. Sutton, R.S., Barto, A.G.: Reinforcement Learning: An Introduction. A Bradford Book, Cambridge (2018). ISBN 0262039249
31. Tassa, Y., et al.: DeepMind Control Suite. CoRR abs/1801.00690 (2018). http://arxiv.org/abs/1801.00690
32. Toromanoff, M., Wirbel, E., Moutarde, F.: End-to-end model-free reinforcement learning for urban driving using implicit affordances. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 7153–7162 (2020)
33. Vinyals, O., et al.: Grandmaster level in StarCraft II using multi-agent reinforcement learning. Nature 575(7782), 350–354 (2019)
34. Volpi, R., Namkoong, H., Sener, O., Duchi, J., Murino, V., Savarese, S.: Generalizing to unseen domains via adversarial data augmentation. arXiv preprint arXiv:1805.12018 (2018)
35. Wang, H., Ge, S., Xing, E.P., Lipton, Z.C.: Learning robust global representations by penalizing local predictive power. arXiv preprint arXiv:1905.13549 (2019)
36. Xu, Z., Liu, D., Yang, J., Raffel, C., Niethammer, M.: Robust and generalizable visual representation learning via random convolutions. arXiv preprint arXiv:2007.13003 (2020)
37. Zhu, J.Y., Park, T., Isola, P., Efros, A.A.: Unpaired image-to-image translation using cycle-consistent adversarial networks. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 2223–2232 (2017)

W21 - Real-World Surveillance: Applications and Challenges

W21 - Real-World Surveillance: Applications and Challenges

This was the second edition of the RWS workshop, focusing on identifying and dealing with challenges of deploying machine learning models in real-world applications. The workshop also included a challenge on thermal object detection on the largest thermal dataset that was annotated for this challenge. October 2022

Kamal Nasrollahi Sergio Escalera Radu Tudor Ionescu Fahad Shahbaz Khan Thomas B. Moeslund Anthony Hoogs Shmuel Peleg Mubarak Shah

Strengthening Skeletal Action Recognizers via Leveraging Temporal Patterns

Zhenyue Qin1(B), Pan Ji2, Dongwoo Kim3, Yang Liu1,4, Saeed Anwar1,4, and Tom Gedeon5

1 Australian National University, Canberra, Australia
[email protected]
2 OPPO US Research, Palo Alto, USA
3 GSAI POSTECH, Pohang, South Korea
4 Data61-CSIRO, Sydney, Australia
5 Curtin University, Perth, Australia

Abstract. Skeleton sequences are compact and lightweight. Numerous skeleton-based action recognizers have been proposed to classify human behaviors. In this work, we aim to incorporate components that are compatible with existing models and further improve their accuracy. To this end, we design two temporal accessories: discrete cosine encoding (DCE) and chronological loss (CRL). DCE facilitates models to analyze motion patterns from the frequency domain and meanwhile alleviates the influence of signal noise. CRL guides networks to explicitly capture the sequence’s chronological order. These two components consistently endow many recently-proposed action recognizers with accuracy boosts, achieving new state-of-the-art (SOTA) accuracy on two large datasets.

1 Introduction

Accurately recognizing human actions is essential for many applications such as human-robot interaction [8], sports analysis [34], and smart healthcare services [28]. Recently, skeleton-based action recognition has attracted increasing attention [4,7,16]. Technically, skeleton data are concise, free from environmental noise (such as background clutter and clothing), and lightweight, enabling fast processing on edge devices [4]. Ethically, skeletonized humans are privacy-protective and racially unbiased, promoting fair training and inference. Recent approaches to skeleton-based action recognition primarily concentrate on devising new network architectures to more comprehensively extract motion patterns from the skeleton trajectories [2,39,41]. Most designs focus on proposing various graph operations to capture spatial interactions, while the temporal extractors of [2,30,41] are inherited from [39]. In contrast, in this paper, we propose two components that are compatible with existing skeletal action recognizers and allow them to extract temporal movement features more robustly and adequately. (Supplementary Information: the online version contains supplementary material available at https://doi.org/10.1007/978-3-031-25072-9_39.)


Fig. 1. Our proposed discrete cosine encoding and chronological loss (DCE-CRL) consistently improve the accuracy of numerous state-of-the-art models. The results are evaluated on NTU60 with the joint feature.

We first present the discrete cosine embedding, which promotes extraction of a motion trajectory’s frequency information for a robust skeleton-based action recognizer. To achieve this, we do not explicitly transform the action sequence into the frequency domain; instead, we enrich the sequence by concatenating it with additional encoding. As such, the action recognizer’s temporal convolutions on this extra encoding absorb frequency information in the downstream process. A primary advantage of the proposed encoding is that it is compatible with most skeleton-based classifiers since most of these models take temporal sequences as input rather than frequency input. Furthermore, this encoding is followed by temporal convolutions of different receptive field sizes, so that frequency features of various temporal resolutions can be effectively learned. Moreover, it also allows for flexible control of the concatenated encoding’s frequency. Since temporal noisy pieces of an action (e.g, joints in wrongly estimated locations and their sudden absence in some frames) turn out to contain more high-frequency signals than normal parts, we can emphasize the action sequence’s low-frequency components to alleviate the noise’s adverse influence. Second, we propose a loss function that facilitates a skeleton-based action recognizer to explicitly capture information about the action sequence’s chronological order. The chronological order represents the relative temporal order between two frames, i.e., this information depicts whether a frame appears earlier or later than the other. This information offers rich clues, and sometimes is indispensable, for reliably differentiating between actions. For example, it is intuitive that distinguishing between putting on shoes and taking off shoes is almost impossible without knowing the chronological order of frames. We summarize our contributions as follows: i) We propose discrete cosine embedding. It allows a skeleton-based action recognizer to analyze actions in the frequency domain, and also supports


disentangling the signal into different frequency components. We can thus alleviate noise’s adverse influence by highlighting low-frequency components. ii) We present the chronological loss function. It directly guides an action recognizer to explicitly capture the action sequence’s temporal information, promoting differentiating temporally confusing action pairs such as putting on shoes vs taking off shoes and wearing jacket vs taking off jacket. iii) The proposed two accessories are widely compatible with existing skeletonbased action recognizers. We experimentally show that the two components consistently boost the accuracy of a number of recent models. iv) Equipped with these two accessories, a simple model is competitive with more complex models in accuracy and consumes substantially fewer parameters and less inference time. When combining the two components with a more complicated network, we achieve new state-of-the-art accuracy.

2 Related Work

2.1 Recognizing Skeleton-Based Actions

In early work, the joint coordinates of the entire sequence were encoded into a feature embedding for skeleton-based action recognition [36]. Since these models did not explore the internal correlations between frames, they lost information about the action trajectories, resulting in low classification accuracy. The advent of convolutional neural networks (CNNs) improved the capture of correlations between joints and boosted the performance of skeleton-based action recognition models [16,24,37]. However, CNNs cannot model the skeleton's topology graph and thus miss the topological relationships between joints. Graph convolution networks (GCNs) were introduced to model these topological relations by representing skeletons as graphs, in which nodes and edges denote joints and bones. ST-GCN [39] aggregated joint features spatially with graph convolution layers and extracted temporal variations between consecutive frames with convolutions. In later work, Li et al. proposed AS-GCN [19], which uses learnable adjacency matrices to substitute for the fixed skeleton graph, further improving recognition accuracy. Subsequently, Si et al. designed AGC-LSTM [32], which integrates graph convolution layers into LSTM gate operations to learn long-range temporal dependencies in action sequences. The 2s-AGCN model [31] utilized bone features and learnable residual masks to increase the skeleton's topological flexibility and ensembled models trained separately with joints and bones to boost classification accuracy. Recently, more innovative techniques, such as self-attention [41], shift convolution [3,4], and graph-based dropout [2], have been introduced into GCNs. MS-G3D [26] employed graph 3D convolutions to extract features from the spatial-temporal domain simultaneously.

2.2 Capturing Chronological Information

Extracting temporal features has been extensively studied for skeleton-based action recognition. Many of these methods are proposed to more adequately


capture motion patterns within a local time window (the adjacent several frames) [19,26]. Instead, we investigate learning the video-wide temporal evolution. Recent skeleton-based action recognition work uses graph neural networks as backbones. These models usually do not directly capture the relative temporal order of the entire motion sequence. Instead, they gradually increase the sizes of the temporal receptive fields to progressively capture temporal information of longer sequences as the networks go deeper. In general action recognition, one of the pioneering works that captures the entire video's temporal information used a linear ranking machine, named rank pooling, to model the temporal order of a video [10]. Then, Fernando et al. proposed hierarchical rank pooling to more completely capture the action sequence's complex and discriminative dynamics [9]. Around the same time, [12] stated that utilizing subsequences provides more features that are specifically about temporal evolution than merely using single frames. Subsequently, Fernando et al. showed how to use deep learning to learn the parameters of rank pooling [11]. Later, Cherian et al. further generalized rank pooling's form with subspaces and formulated the objective as a Riemannian optimization problem [5,6]. The approaches along the above direction usually model learning the video-wide temporal order as an optimization problem. The training is generally not end-to-end, with the exception of [11], which first transforms the length-varying video into a fixed-length sequence; the chronological information is then learned on the new representation and requires expensively computing the gradients of a bi-level optimization, resulting in slow convergence. In this work, based on the spirit of rank pooling, we propose a loss function that facilitates models to explicitly capture the sequence-wide chronological information, referred to as the chronological loss.

2.3 Noise Alleviation

In the literature, some work aims to alleviate the noise of skeleton-based action sequences: e.g., [22] proposed a gate in a recurrent neural network to address noise, and [15] partitioned the skeleton of a human body into different parts and used a part's representation to reduce the noise from a single joint. In human motion prediction, Mao et al. applied a discrete cosine transform (DCT) to reduce the noise within the motion trajectories [27]. Our proposed encoding also draws inspiration from [27] and DCTs; however, our approach is different. In [27], to separate signals of different frequencies, the motion trajectory is first mapped to the frequency domain, and the graph model then processes the signals in the frequency domain. In contrast, skeleton-based action recognizers are usually designed to process motion trajectories in the temporal domain. We found that first converting the signals from the temporal to the frequency domain and then using the transformed signals leads to a plunge in accuracy of more than 10% in our experiments. We instead propose an encoding mechanism that facilitates disentangling sequences into different frequencies. At the same time, the encoding is still in the


Fig. 2. Illustration of noisy coordinates, highlighted by the red circles. The hand joints jitter when observing the hand parts across the three frames. In the second frame, the right elbow bends at an unrealistic angle. (Color figure online)

Fig. 3. Illustration of the framework incorporating the two temporal accessories, read left to right. The skeleton action sequence is first preprocessed by the DCE unit. It then passes through the deep learning backbone, which can be flexibly replaced with most existing skeleton-based action recognizers. The features then undergo spatial pooling. Finally, the features diverge into two parallel branches, one for CRL and the other for classification.

temporal domain rather than in the frequency domain. The proposed encodings are thus compatible with many existing models.

3 Proposed Approach

Our core motivation is to strengthen the capability of skeleton-based action recognizers to more comprehensively and adequately extract temporal features. To this end, we propose two temporal accessories that can be compatibly inserted into existing models. One is the discrete cosine encoding, which supports mitigating the undesired impact of noise. The other is the chronological loss function, which explicitly guides the model in learning relative temporal orders. In the following, we explain in detail how the two components enhance the extraction of temporal features. We then describe the framework that simultaneously integrates the two proposed accessories at the end of this section.

3.1 Discrete Cosine Encoding

In skeleton-based action recognition literature, the input features are usually 3D coordinate-based action movement trajectories along the temporal axis. As in


Fig. 2, we often observe noise in coordinates, resulting in a wobbling of nodes in the temporal dimension. Action recognizers may unexpectedly misinterpret this as a fast movement, negatively affecting classification accuracy. Although some actions contain fast movements, one can distinguish them from coordinate noise because (i) there is a physical limit to human actions’ motion speed, and (ii) noise in coordinates often exhibits oscillating patterns over spatial locations. If we look closer at the oscillation of spatial locations over consecutive frames, we find that the center of oscillation tends to follow the trajectory of the original action, with a small displacement over the frames. The underlying action trajectory is usually reflected in a low-frequency movement signal. In contrast, the oscillation can generally be viewed as a high-frequency signal that does not affect the actors’ main trajectories. We can thus mitigate the noise by suppressing high-frequency signals. To this end, we need to decompose the signals into representations over different frequencies. Meanwhile, the decomposed signals remain suitable for existing action recognizers. It is non-trivial to design decomposition approaches that simultaneously satisfy the two requirements. Naively, we may resort to standard methods that map signals in the temporal domain to the frequency domain before feeding them into skeleton-based action recognizers. Nevertheless, three downsides emerge if we follow this path. Firstly, the actions like hopping and jumping up contain fast movements that are intrinsically high-frequency information. Blindly removing all high-frequency components leads to unexpected loss of necessary high-frequency information that are distinct cues for action recognition. To verify this, we mapped original sequences to the frequency domain, and evenly divided the signals into the lower and higher frequencies. Then, we abandoned the higher half and reverted the rest of the signals back to the temporal domain. Compared to the original signals, using these processed sequences dropped the accuracy from 87% to 72%. Secondly, when an action involves more than one person, the temporal cooccurrences between the two peoples’ interacting patterns provide rich clues. However, information about temporal locality inevitably reduces as a result of transferring into the frequency domain. Although some approaches like wavelet transform preserve partial temporal locality, the sizes of temporal windows need to be manually determined, which undesirably limits the flexibility of learning comprehensive features. Thirdly, existing action recognizers are usually not designed for processing frequency-domain signals; hence, if we simply feed the frequency signals into such an action recognizer or concatenate them with the original time-domain sequence, the physical meaning will be illogical. In sum, we aim to design an encoding mechanism that satisfies two requirements: (i) it supports disentangling high- and low-frequency components of a sequence, enabling mitigating noise; (ii) it can be meaningfully concatenated with the original sequence to prevent the unexpected loss of necessary highfrequency information that are distinct cues for action recognition and ensure compatibility with existing recognizers.


Our key idea for addressing the above-mentioned requirements is inspired by the following observation. We notice that many existing action recognizers extract movement features with temporal convolutions. Temporal convolutions conduct weighted sums of features within a local time window. Hence, we intend to leverage temporal convolutions to transform the temporal sequences into signals within the frequency domain. The exact procedure is as follows. We concatenate the original sequence x = [x_0, ..., x_{T−1}] ∈ R^T (T is the frame number) with the following encoding:

    [b_0 ⊙ x, ..., b_k ⊙ x, ..., b_{K−1} ⊙ x] ∈ R^{K×T}.    (1)

We define B = [b_0, ..., b_k, ..., b_{K−1}] ∈ R^{K×T} as a collection of basis sequences. K is a hyperparameter controlling the complexity of B, and ⊙ represents the Hadamard (element-wise) product. To be more specific, the form of a single basis sequence can be written as:

    b_k = [ cos( π/T (0 + 1/2) k ), ..., cos( π/T (T − 1 + 1/2) k ) ].

A larger k corresponds to a basis sequence b_k with a higher frequency. Thus, selecting K close to T leads to the encoding preserving more high-frequency information of x, and vice versa. In our implementation, K is set to 8, and the encoding is applied to each joint's 3D coordinates separately and then concatenated with the existing features. A larger K leads to slightly better performance; a more detailed ablation study on K is in the supplementary materials. The shape of the tensor for an action sequence changes from R^{C×T×N×M} to R^{(K+1)C×T×N×M}, where C is the feature channel number, T is the sequence length, N is the joint number, and M is the subject number.

One may raise the concern that concatenating with the original sequence seems to preserve the noise. However, our empirical studies show that concatenating with the original signal leads to slightly higher accuracy than discarding the original signal. We conjecture that the network may use the DCE as a reference to proactively mitigate the noise of the original sequence. More detailed experimental results are in the supplementary materials.

The motivation behind applying such an encoding mechanism is to directly guide the action recognizers to easily capture the signal's frequency information. The basis sequences come from the type-II discrete cosine transform. Hence, a simple summation of the encoding along the temporal direction reveals a sequence's frequency distribution. To facilitate more flexible learning of the frequency information, we replace the summation with a neural network:

    d_φ = f_φ(b_k ⊙ x),    (2)

where f_φ is a neural network parameterized by φ. Consequently, d_φ can extract frequency-related information on an arbitrary subsequence of x via temporal convolution. For example, convolution over b_k ⊙ x with a 1 × 3 filter can learn how much the signals from three consecutive frames are aligned with a given


frequency k. If we stack multiple temporal convolution layers, the model can then extrapolate frequency information of longer subsequences. Since temporal convolutional layers are commonly employed in skeleton action recognition models, the encoding can be seamlessly integrated into most existing models to improve their performance. We summarize the encoding's advantages as follows:

i) The encoding disentangles the action sequence into multiple sequences corresponding to different frequencies. We can mitigate the high-frequency ones to lighten the noise's adverse influence on recognizing actions.
ii) The encoding is in the temporal domain. It can thus be meaningfully concatenated with the original sequence and fed into standard skeletal recognizers.
iii) Since temporal convolutions comprise receptive fields of various sizes, they capture frequency information over various time ranges.

A minimal sketch of the encoding is given below.
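The following PyTorch sketch builds the encoding defined in Eq. (1) from the basis sequences above; the function name and the (C, T, N, M) tensor layout follow the description in this section, while the code itself is our illustration rather than the released implementation.

```python
import torch

def discrete_cosine_encoding(x, K=8):
    """Build the DCE and concatenate it with the input (sketch under the
    paper's definitions; variable names are ours).
    x: skeleton tensor of shape (C, T, N, M) -> returns ((K+1)*C, T, N, M)."""
    C, T, N, M = x.shape
    t = torch.arange(T, dtype=x.dtype, device=x.device)
    k = torch.arange(K, dtype=x.dtype, device=x.device)
    # b_k[t] = cos(pi / T * (t + 1/2) * k), the type-II DCT basis sequences
    basis = torch.cos(torch.pi / T * (t + 0.5).unsqueeze(0) * k.unsqueeze(1))   # (K, T)
    # Hadamard product of every basis sequence with every channel of x
    encoded = basis.view(K, 1, T, 1, 1) * x.unsqueeze(0)                        # (K, C, T, N, M)
    encoded = encoded.reshape(K * C, T, N, M)
    return torch.cat([x, encoded], dim=0)                                       # ((K+1)*C, T, N, M)
```

Because the output stays in the temporal domain and only grows the channel dimension, it can be fed directly to any recognizer that accepts skeleton tensors, after adjusting the first layer's input channels.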

3.2 Chronological Loss Function

We also aim to design a loss function that enables the action recognizer to correctly capture the chronological order of an action sequence. The order depicts whether a frame appears earlier or later than another frame. To facilitate learning the chronological order, we have to address the two challenges below:

i) Which function expresses the relative order between two arbitrary frames?
ii) What loss function directs the action recognizer to accurately capture the chronological order?

In the following, we describe the approaches to tackle the two problems and present the chronological loss that simultaneously resolves both challenges.

Modeling Relative Order Between Frames: Here we explain how to mathematically describe the relative order between two frames. To this end, we need to formulate three chronological relationships: a frame appears earlier than another frame, later than it, or no temporal order exists between them. Inspired by [10], we assign each frame at time t a value f_φ(h_t), where f is a function (we use a multi-layer perceptron), h_t is the feature of the frame at time t, and φ is the learnable parameter. Then, we use the difference between f_φ(h_t) and f_φ(h_{t'}) to indicate the temporal order between the frames at times t and t'. That is,

– f_φ(h_i) < f_φ(h_j) iff frame i appears earlier than frame j;
– f_φ(h_i) > f_φ(h_j) iff frame i appears later than frame j;
– f_φ(h_i) = f_φ(h_j) iff the action performer stays idle during the period between frames i and j.

We refer to the sequence of values [f_φ(h_1), f_φ(h_2), ..., f_φ(h_T)] as the chronological values, with sequence length T.


Formulating the Chronological Loss Function: We have developed a mathematical form that reflects the temporal order between two frames. Now, we aim to devise a loss function that accurately guides the chronological values to indicate the correct temporal order of an action sequence. That is, the chronological values should monotonically increase along with the temporal evolution:

    ∀t : f_φ(h_t) ≤ f_φ(h_{t+1}).    (3)

Creating a loss function that guides the model to satisfy the above objective is non-trivial. A poor design causes fluctuations in the chronological values within the sequence. For instance, to let f_φ(h_t) be smaller than f_φ(h_{t+1}), we may directly add their difference for every two adjacent frames as the loss:

    Σ_{t=1}^{T−1} ( f_φ(h_t) − f_φ(h_{t+1}) ).    (4)

However, optimizing the above loss function unintentionally compromises the desired objective specified in Eq. 3. The problem is rooted in the negative loss that arises when f_φ(h_t) ≤ f_φ(h_{t+1}): it can neutralize the positive loss, resulting in undesirably overlooking the cases of f_φ(h_t) > f_φ(h_{t+1}) for some t within the sequence. Motivated by the above failure, we notice that to create a loss function that properly guides the model toward fully satisfying the condition in Eq. 3, we have to ensure that every local decrease of the chronological values is penalized by the loss. To this end, we modify the loss function in Eq. 4 by leveraging the ReLU function to set the negative loss to zero. The loss function then becomes:

    Σ_{t=1}^{T−1} ReLU( f_φ(h_t) − f_φ(h_{t+1}) ).    (5)

As a result, if an earlier frame's chronological value is unintentionally larger than that of the next frame, a positive loss value emerges and is not neutralized. This loss encourages the model to eliminate the local decrease. The loss value is zero if and only if the chronological values monotonically increase along the temporal evolutionary direction of the skeleton sequence.
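A compact PyTorch sketch of Eq. (5) is shown below; the two-layer MLP for f_φ and the (batch, T, feat_dim) feature layout are illustrative assumptions.

```python
import torch
import torch.nn as nn

class ChronologicalLoss(nn.Module):
    """Sketch of the chronological loss in Eq. (5) (our illustrative implementation).
    A small MLP f_phi maps each frame feature h_t to a scalar chronological value;
    every local decrease of these values along time is penalised via ReLU."""
    def __init__(self, feat_dim):
        super().__init__()
        self.f_phi = nn.Sequential(nn.Linear(feat_dim, feat_dim), nn.ReLU(),
                                   nn.Linear(feat_dim, 1))

    def forward(self, h):                    # h: (batch, T, feat_dim), time-ordered
        v = self.f_phi(h).squeeze(-1)        # chronological values, (batch, T)
        return torch.relu(v[:, :-1] - v[:, 1:]).sum(dim=1).mean()
```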

3.3 Framework

The framework incorporating the two temporal accessories is displayed in Fig. 3.¹ It is composed of five components. DCE is the first part and preprocesses the skeleton action sequence before feeding it to the second unit, the deep learning backbone, which can be flexibly replaced with arbitrary skeleton-based action recognizers. The third piece is a spatial pooling layer. The processing pathway is then divided into two parallel branches forming the remaining two components: one for CRL and the other for classification. The dual loss functions are jointly applied during training.

Some icons in the paper’s illustrations are from flaticon.
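Putting the pieces together, the following hedged sketch wraps an arbitrary backbone with the two accessories, building on the two sketches above. The wrapper class, the temporal averaging before classification, and the loss weight are our assumptions rather than the authors' released code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DCECRLWrapper(nn.Module):
    """Sketch of the framework in Fig. 3 around an arbitrary skeleton-based
    recognizer. The backbone is assumed to map the DCE-augmented input of
    shape (batch, (K+1)*C, T, N, M) to per-frame features of shape
    (batch, T, feat_dim) after spatial pooling."""
    def __init__(self, backbone, feat_dim, num_classes, K=8, crl_weight=1.0):
        super().__init__()
        self.backbone, self.K, self.crl_weight = backbone, K, crl_weight
        self.classifier = nn.Linear(feat_dim, num_classes)
        self.crl = ChronologicalLoss(feat_dim)   # from the sketch above

    def forward(self, x, labels=None):
        # apply the DCE preprocessing per sample (discrete_cosine_encoding sketched earlier)
        x = torch.stack([discrete_cosine_encoding(s, self.K) for s in x])
        feats = self.backbone(x)                      # (batch, T, feat_dim)
        logits = self.classifier(feats.mean(dim=1))   # temporal average for classification
        if labels is None:
            return logits
        # dual objective: classification loss plus the chronological loss
        return F.cross_entropy(logits, labels) + self.crl_weight * self.crl(feats)
```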


Table 1. Comparison of using CRL and DCE individually and their ensemble. We include the top-1 accuracy, the total number of parameters (#Params), and GFlops. #Ens is the number of models used in an ensemble. BSL is the baseline. Joint and Bone denote the use of joint and bone features, respectively. Accuracy improvements over the corresponding BSL are shown in parentheses.

Methods                 #Ens   NTU60 X-Sub   NTU60 X-View   NTU120 X-Sub   NTU120 X-Set   #Params (M)   GFlops
BSL (Joint)             1      87.2          93.7           81.9           83.7           1.44          19.4
CRL (Joint)             1      88.2 (+1.0)   95.2 (+1.5)    82.5 (+0.6)    84.3 (+0.6)    1.77          19.6
DCE (Joint)             1      88.5 (+1.3)   95.0 (+1.3)    83.1 (+1.2)    85.1 (+1.4)    1.46          19.6
Ens: DCE-CRL (Joint)    2      89.7 (+2.5)   96.0 (+2.3)    84.6 (+2.7)    86.3 (+2.6)    3.21          39.2
BSL (Bone)              1      88.2          93.6           84.0           85.7           1.44          19.4
CRL (Bone)              1      89.4 (+1.2)   95.3 (+1.7)    84.7 (+0.7)    86.6 (+0.9)    1.77          19.6
DCE (Bone)              1      90.0 (+1.8)   95.1 (+1.5)    85.5 (+1.5)    86.6 (+0.9)    1.46          19.6
Ens: DCE-CRL (Bone)     2      90.7 (+2.5)   96.2 (+2.6)    87.0 (+3.0)    88.3 (+2.6)    3.21          39.2
Ens: BSL (Joint+Bone)   2      89.3          94.7           85.9           87.4           2.88          38.8
Ens: CRL (Joint+Bone)   2      90.6 (+1.3)   96.2 (+1.5)    86.1 (+0.2)    87.8 (+0.4)    3.54          39.2
Ens: DCE (Joint+Bone)   2      90.6 (+1.3)   96.6 (+1.9)    86.9 (+1.0)    88.4 (+1.0)    2.92          38.9

4 Experiments

4.1 Datasets

NTU60 [29]: A widely used benchmark skeleton dataset, NTU60 contains 56,000 videos collected in a laboratory environment with Microsoft Kinect V2. The dataset is challenging because of the different viewing angles of the skeletons, various subject skeleton sizes, varying speeds of action performance, similar trajectories between several actions, and the limited number of hand joints for capturing detailed hand actions. NTU60 has two subtasks, cross-subject and cross-view. For cross-subject, the data of half of the subjects are used for training and those of the rest for validation. For the cross-view task, cameras are placed in different locations; half of the setups are used for training and the others for validation.

NTU120 [20]: An extension of NTU60 with more subjects and more camera positions and angles, resulting in a more challenging dataset with 113,945 videos.

4.2 Experimental Setups

Our models use PyTorch as the framework and are trained on one NVIDIA 3090 GPU for 60 epochs. Stochastic gradient descent (SGD) with momentum 0.9 is used as the optimizer. The learning rate is initialized as 0.05 and decays to 10% at epochs 28, 36, 44, and 52. Following [30], we preprocess the skeleton data with normalization and translation, padding each video clip to a common length of 300 frames by repeating the action. Models are trained independently for NTU60 and NTU120. Our DCE-CRL is generally compatible with existing models. We choose a modified MS-G3D [26] as our baseline architecture for studying the proposed


Table 2. Comparison of action recognition on two benchmark datasets. We compare the recognition accuracy, the total number of parameters (#Para), and the inference cost (in GFlops). #Ens is the number of models used in an ensemble. J and B denote the use of joint and bone features, respectively. BSL represents the utilized backbone model without applying the DCE-CRL. The last row of each subtable is our method, which achieves the top accuracy; Acc↑ denotes the accuracy improvement.

(a) NTU60.

Methods              #Ens   X-Sub Acc   Acc↑   X-View Acc   Acc↑   #Para (M)   GFlops
Lie Group [35]       1      50.1        -      52.8         -      -           -
STA-LSTM [33]        1      73.4        -      81.2         -      -           -
VA-LSTM [40]         1      79.2        -      87.7         -      -           -
HCN [18]             1      86.5        -      91.1         -      -           -
MAN [38]             1      82.7        -      93.2         -      -           -
ST-GCN [39]          1      81.5        -      88.3         -      2.91        16.4
AS-GCN [19]          1      86.8        -      94.2         -      7.17        35.5
AGC-LSTM [32]        2      89.2        -      95.0         -      -           -
2s-AGCN [31]         4      88.5        -      95.1         -      6.72        37.2
DGNN [30]            4      89.9        -      96.1         -      8.06        71.1
Bayes-GCN [42]       1      81.8        -      92.4         -      -           -
Our method:
Ens: DCE-CRL (J+B)   2      90.6        8.8    96.6         4.2    2.92        39.2

(b) NTU120.

Methods               #Ens   X-Sub Acc   Acc↑   X-Set Acc   Acc↑   #Para (M)   GFlops
PA LSTM [29]          1      25.5        -      26.3        -      -           -
Soft RNN [14]         1      36.3        -      44.9        -      -           -
Dynamic [13]          1      50.8        -      54.7        -      -           -
ST LSTM [22]          1      55.7        -      57.9        -      -           -
GCA-LSTM [23]         1      58.3        -      59.2        -      -           -
Multi-Task Net [16]   1      58.4        -      57.9        -      -           -
FSNet [21]            1      59.9        -      62.4        -      -           -
Multi CNN [17]        1      62.2        -      61.8        -      -           -
Pose Evo Map [25]     1      64.6        -      66.9        -      -           -
SkeleMotion [1]       1      67.7        -      66.9        -      -           -
2s-AGCN [31]          1      82.9        -      84.9        -      6.72        37.2
Our method:
Ens: DCE-CRL (J+B)    2      87.0        4.1    88.4        3.5    2.92        39.2

methods. We omit the G3D module because 3D graph convolutions are time-consuming, doubling the training time according to our empirical evaluation. We refer to this simplified model as BKB, standing for the backbone. More details about BKB are in the supplementary materials. We will release the code once the paper is published.
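The training configuration of this section can be sketched as follows, assuming the decay is applied multiplicatively at each listed epoch (a MultiStepLR-style schedule); `model` and `train_loader` are placeholders rather than released code.

```python
import torch

# Sketch of the schedule in Sect. 4.2: SGD with momentum 0.9, initial lr 0.05,
# decayed to 10% at epochs 28, 36, 44, 52, trained for 60 epochs.
optimizer = torch.optim.SGD(model.parameters(), lr=0.05, momentum=0.9)
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[28, 36, 44, 52], gamma=0.1)

for epoch in range(60):
    for x, labels in train_loader:        # train_loader is assumed to exist
        optimizer.zero_grad()
        loss = model(x, labels)           # e.g., the DCE-CRL wrapper sketched earlier
        loss.backward()
        optimizer.step()
    scheduler.step()
```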

4.3 Ablation Studies

In Table 1, we report the results of evaluating the influence of CRL and DCE. The assessments include applying the two components individually and comparing against training without the proposed methods. Existing approaches show large accuracy improvements from ensembling multiple models [2,3,26]; hence, we also evaluate ensembling two models trained with the components separately. We make three observations. First, both CRL and DCE substantially improve the baseline action recognizer's accuracy over all four benchmarks. In most cases, DCE enhances the recognizer more than CRL, reflected in the larger accuracy improvement, which implies that DCE may provide more complementary information to the recognizer. Second, previous work indicates that using bone features leads to higher accuracy than using joint coordinates. We find that both CRL and DCE improve accuracy more in the bone domain than in the joint domain, implying that bones are not only a more suitable representation for skeleton-based action recognition but also carry richer potential features to be further exploited. Third, ensembling the two models trained with CRL and DCE further boosts the accuracy by a large margin, suggesting that CRL and DCE provide complementary information to each other. We conjecture that CRL focuses on capturing the chronological order, while DCE alleviates the noise.


Table 3. Comparison with and without the DCE-CRL on confusing actions. The "Action" column shows the ground-truth labels, and the "Similar action" columns show the model's most confused prediction without and with the DCE and/or CRL, respectively. Acc↑ is the accuracy improvement due to using the DCE and/or CRL; improvements of at least 5% are considered substantial.

(a) Examples of actions whose confusing ones temporally evolve oppositely.

Action                 Joint Acc (%)   Similar action         Joint with DCE Acc (%)   Acc↑ (%)   Similar action
Wearing jacket         91.0            Taking off jacket      93.9                     2.9        Putting on bag
Taking off a shoe      81.0            Wearing a shoe         83.2                     2.2        Wearing a shoe
Putting on headphone   85.6            Taking off headphone   87.7                     2.1        Sniff

(b) Actions whose accuracy gets most improved by the CRL.

Action                 Joint Acc (%)   Similar action        Joint with CRL Acc (%)   Acc↑ (%)   Similar action
Opening a box          66.4            Folding paper         73.9                     7.5        Folding paper
Making a phone call    78.2            Playing with phone    84.4                     6.2        Playing with phone
Playing with phone     60.0            Stapling book         64.7                     4.7        Stapling book
Typing on a keyboard   64.4            Writing               69.1                     4.7        Writing
Balling up paper       65.7            Folding paper         69.9                     4.2        Folding paper

(c) Actions whose accuracy gets most improved by the DCE.

Action                 Joint Acc (%)   Similar action         Joint with DCE Acc (%)   Acc↑ (%)   Similar action
Yawn                   63.5            Hush (quiet)           72.2                     8.7        Hush (quiet)
Balling up paper       65.7            Folding paper          73.9                     8.2        Playing magic cube
Wipe                   83.0            Touching head          90.9                     7.9        Touching head
Playing magic cube     61.7            Playing with phone     68.7                     7.0        Counting money
Making ok sign         38.1            Making victory sign    44.0                     5.9        Making victory sign

One may assume that DCE's boost is due to the dimension increase rather than the encoding itself. To answer this, we also evaluate the BKB's accuracy with two other approaches that extend the input dimensionality to match that of the sequence with DCE: 1) the extended dimensions are the element-wise product of the original sequence and values randomly generated between −1 and 1, a range that coincides with the cosine function's amplitude; 2) the enlarged dimensions are a simple repetition of the original skeleton 3D coordinates. We observe that feeding either of the above two inputs to the BKB leads to an accuracy decrease: using the joint features of the NTU120 cross-subject setting, accuracy drops slightly from 81.9% to 80.7% and 81.2%, respectively, for the first and second mechanisms. Therefore, the DCE's boost in accuracy is due to the unique properties of the proposed components rather than to dimension growth.
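For concreteness, here is a minimal sketch of the two control encodings described above, assuming a skeleton sequence stored as a NumPy array of shape (T, V, 3) (frames, joints, 3D coordinates) and an expansion factor d; the array layout and the factor d are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def random_scaled_expansion(seq, d, rng=np.random.default_rng(0)):
    # Control 1: extra dimensions are the element-wise product of the original
    # sequence and random values drawn uniformly from [-1, 1].
    extras = [seq * rng.uniform(-1.0, 1.0, size=seq.shape) for _ in range(d)]
    return np.concatenate([seq] + extras, axis=-1)   # shape (T, V, 3 * (d + 1))

def repetition_expansion(seq, d):
    # Control 2: extra dimensions simply repeat the original 3D coordinates.
    return np.concatenate([seq] * (d + 1), axis=-1)  # shape (T, V, 3 * (d + 1))
```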


Fig. 4. Recognition accuracy on the perturbed validation set with varying noise levels. DCE stands for encoding the skeleton action sequence with DCE.

4.4 Improvement Analysis

To gain insight into the working mechanisms of DCE and CRL, we quantitatively compare each action's recognition accuracy before and after applying the two proposed accessories to the BKB model. CRL is designed to dispel the confusion between two actions where one is visually like the temporal inverse of the other, by explicitly encoding the chronological order of a skeleton sequence. For example, the hand joints of opening a box move in the opposite direction along time to those of folding paper. We observe that the confusion in many such pairs of actions is greatly reduced; we list three action examples in Table 3a. Note that, as a result of applying CRL, the most confusing action accordingly changes from the one with the opposite execution direction along time to one with a similar motion trajectory, which implies that the previously missing information on temporal evolution has been sufficiently filled in. Beyond these pairs, CRL also greatly enhances the accuracy of more general actions, as Table 3b shows. This indicates that information about temporal evolution is universally beneficial for accurately recognizing general actions, and that such information is insufficiently captured without CRL. On the other hand, as Table 3c depicts, DCE also broadly boosts the recognizer's accuracy over many actions. We hypothesize this is because most skeleton action sequences contain a certain level of noise, and our DCE effectively mitigates its adverse influence on skeleton-based action recognition.

4.5 Noise Alleviation

The proposed DCE decomposes the original skeleton action sequence into different frequency components. We select the lower ones with the intention of alleviating noise in the signal. Here, we empirically verify the effectiveness of noise mitigation. To this end, for each instance in the validation set, we randomly sample noise from a spherical Gaussian distribution: n ∼ N (0, I) with


the same shape as the sequence. We then add it to the sequence as x := x + ε · n, where ε is a hyperparameter that manually controls the intensity of the noise; a larger ε imposes stronger noise. We use the cross-subject setting and the joint features of NTU120. As Fig. 4 shows, using DCE consistently resists the accuracy drop on the perturbed validation set better than not using DCE across a range of ε values, directly verifying the effectiveness of the proposed DCE's noise alleviation.
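The perturbation protocol above amounts to a few lines of code; the sketch below assumes validation sequences stored as NumPy arrays and a generic evaluate(model, sequences) accuracy routine, both of which are placeholders rather than the authors' code.

```python
import numpy as np

def perturb(seq, eps, rng=np.random.default_rng(0)):
    # x := x + eps * n, with n drawn from a spherical Gaussian of the same shape as x.
    noise = rng.standard_normal(seq.shape)
    return seq + eps * noise

# Example: measure the accuracy drop for increasing noise intensities.
# for eps in (0.01, 0.02, 0.05, 0.1):
#     acc = evaluate(model, [perturb(x, eps) for x in validation_sequences])
```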

4.6 Compatibility with Existing Models

The proposed DCE-CRL is not merely applicable to the baseline model; the two components are widely compatible with existing networks. To quantify how much DCE-CRL improves an action recognizer's accuracy, we train recently proposed models with and without DCE-CRL. Each model pair is trained and compared fairly by setting the same hyperparameters. The utilized dataset is NTU60 cross-subject with the joint features. We summarize the results in Fig. 1. Across all seven selected action recognizers, DCE-CRL consistently increases the recognizer's accuracy, indicating DCE-CRL's broad compatibility.

4.7 Comparison with SOTA Accuracy

We show in Table 2 that extending the simple baseline with DCE-CRL outperforms current competitive state-of-the-art action recognizers, even though the number of parameters and the inference time of the utilized model (BKB) are far smaller than those of the SOTA networks. Note that DCE-CRL is generally compatible with existing models; hence, replacing the simple backbone with a more complicated model can achieve even higher accuracy. To illustrate, we equip MSG3D with DCE-CRL and obtain accuracies of 87.5% and 89.1% on the cross-subject and cross-setup settings of NTU120, respectively. As a comparison, the original MSG3D's accuracies on these two settings, without the proposed components, are 86.9% and 88.4%, respectively. Similarly, this combination achieves 91.8% and 96.8% on the cross-subject and cross-view settings of NTU60, reaching new SOTA accuracy.

5 Conclusion

We propose two temporal accessories that are compatible with existing skeleton-based action recognizers and boost their accuracy. On the one hand, discrete cosine encoding (DCE) enables a model to comprehensively capture information from the frequency domain; additionally, we can proactively alleviate signal noise by highlighting the low-frequency components of an action sequence. On the other hand, the proposed chronological loss directly makes a network aware of the action sequence's temporal direction. Our experimental results on benchmark datasets show that incorporating the two components consistently improves existing skeleton-based action recognizers' performance.


References 1. Caetano, C., Sena, J., Br´emond, F., Dos Santos, J.A., Schwartz, W.R.: Skelemotion: a new representation of skeleton joint sequences based on motion information for 3d action recognition. In: 2019 16th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS), pp. 1–8. IEEE (2019) 2. Cheng, K., Zhang, Y., Cao, C., Shi, L., Cheng, J., Lu, H.: Decoupling GCN with dropgraph module for skeleton-based action recognition. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12369, pp. 536–553. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58586-0 32 3. Cheng, K., Zhang, Y., He, X., Chen, W., Cheng, J., Lu, H.: Skeleton-based action recognition with shift graph convolutional network. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 183–192 (2020) 4. Cheng, K., Zhang, Y., He, X., Cheng, J., Lu, H.: Extremely lightweight skeletonbased action recognition with ShiftGCN++. IEEE Trans. Image Process. 30, 7333– 7348 (2021) 5. Cherian, A., Fernando, B., Harandi, M., Gould, S.: Generalized rank pooling for activity recognition. In: Proceedings of the IEEE Conference On Computer Vision and Pattern Recognition, pp. 3222–3231 (2017) 6. Cherian, A., Sra, S., Gould, S., Hartley, R.: Non-linear temporal subspace representations for activity recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2197–2206 (2018) 7. Du, Y., Wang, W., Wang, L.: Hierarchical recurrent neural network for skeleton based action recognition. In: Proceedings of the IEEE Conference On Computer Vision and Pattern Recognition, pp. 1110–1118 (2015) 8. Fanello, S.R., Gori, I., Metta, G., Odone, F.: Keep it simple and sparse: real-time action recognition. J. Mach. Learn. Res. 14, 2617–2640 (2013) 9. Fernando, B., Anderson, P., Hutter, M., Gould, S.: Discriminative hierarchical rank pooling for activity recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1924–1932 (2016) 10. Fernando, B., Gavves, E., Oramas, J., Ghodrati, A., Tuytelaars, T.: Rank pooling for action recognition. IEEE Trans. Pattern Anal. Mach. Intell. 39(4), 773–787 (2016) 11. Fernando, B., Gould, S.: Learning end-to-end video classification with rankpooling. In: International Conference on Machine Learning, pp. 1187–1196. PMLR (2016) 12. Fernando, B., Gould, S.: Discriminatively learned hierarchical rank pooling networks. Int. J. Comput. Vision 124(3), 335–355 (2017) 13. Hu, J.F., Zheng, W.S., Lai, J., Zhang, J.: Jointly learning heterogeneous features for RGB-D activity recognition. In: Proceedings of the IEEE Conference On Computer Vision and Pattern Recognition, pp. 5344–5352 (2015) 14. Hu, J.F., Zheng, W.S., Ma, L., Wang, G., Lai, J., Zhang, J.: Early action prediction by soft regression. IEEE Trans. Pattern Anal. Mach. Intell. 41(11), 2568–2583 (2018) 15. Ji, X., Cheng, J., Feng, W., Tao, D.: Skeleton embedded motion body partition for human action recognition using depth sequences. Signal Process. 143, 56–68 (2018) 16. Ke, Q., Bennamoun, M., An, S., Sohel, F., Boussaid, F.: A new representation of skeleton sequences for 3d action recognition. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3288–3297 (2017)


17. Ke, Q., Bennamoun, M., An, S., Sohel, F., Boussaid, F.: Learning clip representations for skeleton-based 3d action recognition. IEEE Trans. Image Process. 27(6), 2842–2855 (2018) 18. Li, C., Zhong, Q., Xie, D., Pu, S.: Co-occurrence feature learning from skeleton data for action recognition and detection with hierarchical aggregation. In: International Joint Conference on Artificial Intelligence (IJCAI) (2018) 19. Li, M., Chen, S., Chen, X., Zhang, Y., Wang, Y., Tian, Q.: Actional-structural graph convolutional networks for skeleton-based action recognition. In: Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3595–3603 (2019) 20. Liu, J., Shahroudy, A., Perez, M., Wang, G., Duan, L.Y., Kot, A.C.: NTU RGB+D 120: a large-scale benchmark for 3d human activity understanding. IEEE Trans. Pattern Anal. Mach. Intell. 42(10), 2684–2701 (2019) 21. Liu, J., Shahroudy, A., Wang, G., Duan, L.Y., Kot, A.C.: Skeleton-based online action prediction using scale selection network. IEEE Trans. Pattern Anal. Mach. Intell. 42(6), 1453–1467 (2019) 22. Liu, J., Shahroudy, A., Xu, D., Wang, G.: Spatio-temporal LSTM with trust gates for 3d human action recognition. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9907, pp. 816–833. Springer, Cham (2016). https:// doi.org/10.1007/978-3-319-46487-9 50 23. Liu, J., Wang, G., Hu, P., Duan, L.Y., Kot, A.C.: Global context-aware attention LSTM networks for 3d action recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1647–1656 (2017) 24. Liu, M., Liu, H., Chen, C.: Enhanced skeleton visualization for view invariant human action recognition. Pattern Recogn. 68, 346–362 (2017) 25. Liu, M., Yuan, J.: Recognizing human actions as the evolution of pose estimation maps. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1159–1168 (2018) 26. Liu, Z., Zhang, H., Chen, Z., Wang, Z., Ouyang, W.: Disentangling and unifying graph convolutions for skeleton-based action recognition. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 143– 152 (2020) 27. Mao, W., Liu, M., Salzmann, M., Li, H.: Learning trajectory dependencies for human motion prediction. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 9489–9497 (2019) 28. Saggese, A., Strisciuglio, N., Vento, M., Petkov, N.: Learning skeleton representations for human action recognition. Pattern Recogn. Lett. 118, 23–31 (2019) 29. Shahroudy, A., Liu, J., Ng, T.T., Wang, G.: NTU RGB+ D: A large scale dataset for 3d human activity analysis. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1010–1019 (2016) 30. Shi, L., Zhang, Y., Cheng, J., Lu, H.: Skeleton-based action recognition with directed graph neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 7912–7921 (2019) 31. Shi, L., Zhang, Y., Cheng, J., Lu, H.: Two-stream adaptive graph convolutional networks for skeleton-based action recognition. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp. 12026–12035 (2019) 32. Si, C., Chen, W., Wang, W., Wang, L., Tan, T.: An attention enhanced graph convolutional LSTM network for skeleton-based action recognition. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 1227–1236 (2019)


33. Song, S., Lan, C., Xing, J., Zeng, W., Liu, J.: An end-to-end spatio-temporal attention model for human action recognition from skeleton data. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 31 (2017) 34. Tran, D., Wang, H., Torresani, L., Ray, J., LeCun, Y., Paluri, M.: A closer look at spatiotemporal convolutions for action recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 6450–6459 (2018) 35. Vemulapalli, R., Arrate, F., Chellappa, R.: Human action recognition by representing 3d skeletons as points in a lie group. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 588–595 (2014) 36. Wang, L., Huynh, D.Q., Koniusz, P.: A comparative review of recent kinect-based action recognition algorithms. IEEE Trans. Image Process. (TIP) 29, 15–28 (2020). https://doi.org/10.1109/TIP.2019.2925285 37. Wang, L., Koniusz, P., Huynh, D.: Hallucinating IDT descriptors and i3d optical flow features for action recognition with CNNs. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) (2019). https://doi.org/10. 1109/ICCV.2019.00879 38. Xie, C., Li, C., Zhang, B., Chen, C., Han, J., Liu, J.: Memory attention networks for skeleton-based action recognition. In: International Joint Conference on Artificial Intelligence (IJCAI) (2018) 39. Yan, S., Xiong, Y., Lin, D.: Spatial temporal graph convolutional networks for skeleton-based action recognition. In: Thirty-second AAAI Conference on Artificial Intelligence (2018) 40. Zhang, P., Lan, C., Xing, J., Zeng, W., Xue, J., Zheng, N.: View adaptive recurrent neural networks for high performance human action recognition from skeleton data. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 2117–2126 (2017) 41. Zhang, P., Lan, C., Zeng, W., Xing, J., Xue, J., Zheng, N.: Semantics-guided neural networks for efficient skeleton-based human action recognition. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 1112–1121 (2020) 42. Zhao, R., Wang, K., Su, H., Ji, Q.: Bayesian graph convolution LSTM for skeleton based action recognition. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) (2019)

Which Expert Knows Best? Modulating Soft Learning with Online Batch Confidence for Domain Adaptive Person Re-Identification

Andrea Zunino1(B), Christopher Murray1, Richard Blythman1, and Vittorio Murino1,2,3

1 Ireland Research Center, Huawei Technologies Co. Ltd., Dublin, Ireland
2 Pattern Analysis and Computer Vision (PAVIS), Istituto Italiano di Tecnologia, Genova, Italy
3 University of Verona, Verona, Italy
{andrea.zunino,vittorio.murino}@iit.it, {christopher.murray,richard.blythman}@huawei.com

Abstract. Deploying a person re-identification (Re-ID) system in a real scenario requires adapting a model trained on one labeled dataset to a different environment, with no person identity information. This poses an evident challenge that can be faced by unsupervised domain adaptation approaches. Recent state-of-the-art methods adopt architectures composed of multiple models (a.k.a. experts), and transfer the learned knowledge from the source domain by clustering and assigning hard pseudo-labels to unlabeled target data. While this approach achieves outstanding accuracy, the clustering procedure is typically sub-optimal, and the experts are simply combined to learn in a collaborative way, thus limiting the final performance. In order to mitigate the effects of noisy pseudo-labels and better exploit the experts' knowledge, we propose to combine soft supervision techniques in a novel multi-expert domain adaptation framework. We introduce a novel weighting mechanism for soft supervisory learning, named Online Batch Confidence, which takes expert reliability into account on an online, per-batch basis. We conduct experiments across popular cross-domain Re-ID benchmarks, showing that our model outperforms the current state of the art.

Keywords: Human re-identification · Unsupervised domain adaptation · Clustering · Soft learning · Mutual mean teaching

1 Introduction

Person re-identification (Re-ID) targets the identification of a person across non-overlapping cameras. Given a query image of a person, the goal is to recognize the same subject in images acquired by different cameras by analysing and extracting appearance information only (without using biometric cues).


Fig. 1. Our proposed multi-expert domain adaptation technique for Re-ID consisting of: (left) global Hard Pseudo-Label Assignment, and (right) online Soft Label Learning. At each epoch, two multi-branch teachers extract features that are globally clustered, providing the hard pseudo-labels to the data. This label assignment is then locally exploited during the soft loss computations. Each soft contribution is weighted according to how well a teacher-expert locally groups the data in the batch. Our method assigns higher weight to the experts that map global pseudo-labels to more compact feature groups in the online batch of data. Two weighted guidances are further generated from each teacher-expert (lighter yellow and red), providing soft labels to the student-experts (darker yellow and red) of both Net 1 and Net 2 in an asymmetric-branch configuration. (Color figure online)

Unsupervised domain adaptation (UDA) introduces more challenging issues in the Re-ID task. UDA aims to refine a model trained in a source domain (i.e., using labeled data taken from a fixed set of cameras) by using unlabeled target images acquired in a different domain (e.g., acquired by different environmental conditions). Recent state-of-the-art UDA Re-ID methods assign hard pseudo-labels to unlabeled target data [1,7,15,20–24] in order to re-cast the problem as a supervised one. To generate these pseudo-labels, a clustering algorithm is typically used that exploits feature similarity to assign informative initial labels, which are subsequently refined offline once per epoch. However, the pseudo-labels are generally very noisy, and this is due to the limited ability to transfer the source-domain features to the target domain, to difficulties with estimating the unknown number of identities, and the sub-optimal behavior of the clustering algorithm. To increase the discrimination capability of a Re-ID system and also to mitigate the negative impact of noisy labels, recent works have adopted an ensemble of multiple networks (a.k.a. experts) [7,23], which can generate better predictions for unlabelled samples than the output of a single network at the most recent training epoch. The distribution of the training data is roughly captured after the networks are trained with hard pseudo-labels, and thus the class predictions of such networks can serve as soft labels for training in a teacher-student framework [13,17]. A number of recent Re-ID studies have found great success using such approaches. For example, the mutual mean-teaching (MMT) framework [7]


combines the ideas of using the temporal average of a network for generating predictions [16], and deep mutual learning [25], where two identical networks with different initializations are soft supervising each other. Chen et al. [1] introduced a self-mean teacher strategy where multiple branches in a single network (and its temporal average) act as individual sub-networks to generate distinct predictions. Each branch gets supervision from its peer branch of the different structure, which enhances the divergence between paired teacher-student networks. Multiple Expert Brainstorming (MEB) uses three networks with different architectures, i.e. experts and their temporal averages [23]. To accommodate the heterogeneity of experts, MEB-Net introduces a regularization scheme to modulate the experts’ authority according to the quality of the generated clusters in the target domain. In general, the research trend is leaning towards larger ensembles with a higher number of soft supervisory signals that can be provided by either (i) temporal averages, (ii) a collaboration of two or more individual networks, or (iii) collaborative sub-networks. We suggest that these approaches are complementary. In this work, we explore the benefits of incorporating all of these methods within a novel soft classification loss. By doing this, we are able to put together the advantages of feature learning that were exploited separately on prior art, in order to be more robust to noisy labels. Inspired by the MMT framework [7], our approach exploits a multi-branch architecture where the individual branches act as experts. We first take advantage of the availability of multiple teachers, i.e. self and mutual, during the adaptation and combine the soft guidance. As the number of predictions acting as soft supervision is increased, it becomes important to consider the confidence of the prediction generated by a particular approach. Most existing frameworks neglect the confidence of the various networks that take part in online peer-teaching, or adopt weighting mechanisms that are not flexible during the epochs (e.g., authority regularization [23]). These strategies in fact disregard the reliability of each expert on smaller subsets of data observed during each training iteration, impacting the learning process negatively. To counterbalance this downside, we design a novel online batch confidence score where the inter- and intra-cluster metrics of each expert are computed on-the-fly for each individual batch of samples. Our method does not require additional local clustering, but exploits the hard pseudo-label assignment provided by the initial global clustering in each training iteration (see Fig. 1). By estimating a tailored weight to each batch, we can better assess the expert knowledge, thus allowing a better adaptation to the data and increasing the reliability of the training process of each expert and, hence, overall. We evaluate our proposed model on the popular cross-domain Re-ID benchmarks, where we achieve state-of-the-art results for Re-ID in UDA scenario. In summary, the contributions of our work are: – We propose a new multi-expert domain adaptation technique for Re-ID introducing a novel unified soft loss to contrast the noise in the pseudo-labels’ assignment. The unified soft loss incorporates multiple supervisory signals from (i) models’ temporal averages, (ii) collaboration of individual networks,


and (iii) collaborative sub-networks, in a unified self, mutual, and multiple-branch learning approach. – We also introduce a novel online weighting mechanism for the soft-learning strategy that dynamically takes into account the expert confidence in grouping the images in each batch, resulting in further improved overall robustness to label noise. – By exploiting the final ensemble of expert networks, we obtain state-of-the-art results in popular domain-adaptive Re-ID settings.

2 Related Work

We identified two classes of approaches in relation to our proposed work: Domain Translation-Based Methods. Domain translation-based methods [2,4,5,9,19,29] provide an effective way to make use of source domain images and their valuable ground-truth identities. These methods generate source-to-target translated images and fine-tune the Re-ID model with those samples. Maintaining the original inter-sample relations and distributions of source-domain data during translation is critical for generating informative training samples, and the generated images should preserve the original identities. To do this, ID-based regularizations are adopted on either pixel level [19] or feature level [2,4,5] by means of contrastive loss [4] or classification loss [2,5]. Ge et al. [9] focus on carefully translating the source-domain images to the target domain and they introduce online relation-consistency regularization to maintain inter-sample relations during the training. Zhong et al. [29] propose ECN+, a framework equipped with a memory module that can effectively exploit the sample relations over both source and target images. Different from these methods, our model is first pretrained on the source images, and then the source domain is not further utilized in the pipeline, i.e. the model is adapted by only considering the unlabeled target samples, which makes it effective to be used in source (data) free scenarios too. Pseudo Labelling-Based Methods. Pseudo labelling-based methods [1,7,15, 20–24] focus only on target-domain data and they aim at learning the distribution of the unlabeled samples. In general, they demonstrate better performance in the UDA Re-ID problem by adopting a preliminary clustering phase where pseudo-labels are assigned to unlabelled target samples. UDAP [15] was one of the first works to use pseudo-label generation for UDA, but in a self-teaching scenario. PAST [24] proposed a self-training strategy that progressively augments the model capability during adaptation by adopting conservative and promoting stages alternately. SSG [6] extracted features of all persons in target dataset and group them by three different cues, whole bodies, upper parts and lower parts assigning an independent set of pseudo-labels to each group. MMT [7] incorporated mutual learning between a pair of networks via pseudo-label generation. Along with the use of hard pseudo-labels generated during the clustering phase, each mean teacher network provides online soft labels to supervise the mutual


network during domain adaptation. ABMT [1] builds on this idea by promoting diversity between the two experts by modeling them as asymmetric branches that share the same backbone, instead of using separate identical networks. AWB [18] enforces the mutual experts to learn different representations by asymmetrically amplifying feature maps in their proposed Attentive WaveBlock module. ACT [20] diversified the learning process between its two networks by filtering only inliers for one of the experts, while the other learns from data that is as diverse as possible. Opting for varying architectures across three experts, MEB-Net [23] proposed an authority regularization scheme for weighing the contribution of each expert to the mutual learning process using the global clustering metrics of each expert. SpCL [8] proposed a self-paced contrastive learning framework with hybrid memory which gradually creates more reliable clusters and learning target distribution. It fully exploits all available data, source and target, with hybrid memory for joint feature learning. Our method differs from the above approaches as we combine self-teaching with mutual teaching, while also introducing a novel mechanism for weighting the experts contribution during the soft losses computation based on online batch-level confidence. This regularization exploits the target data only during adaptation and captures more efficiently their finer differences thanks to more reliable experts’ guidance during the training process. In this way, we put more emphasis on the expert that has better knowledge on the batch sample identities, as well as leverage a dynamic mechanism allowing the weighting of the experts to change throughout each epoch.

3 Method

Our approach adopts a standard two-stage training scheme consisting of supervised learning in a source domain and unsupervised adaptation to a target domain, with some modifications. During the supervised learning phase, two similar architectures are pre-trained with different initializations on the source dataset. The architectures are composed of multiple branches, where each branch represents an expert. When the pre-trained networks are adapted to the target domain in the second phase, we perform clustering on target samples at each training epoch in order to assign hard pseudo-labels, which are then utilized to fine-tune the two collaborative networks by self and mutual learning. In addition to the hard pseudo-labels, our approach exploits online soft labels produced by specific branch network predictions. In practice, each branch of the network learns from two other experts (see Fig. 1). Moreover, a novel weighting mechanism is employed to enhance the online reliability of each single expert during adaptation on a small batch of data. The knowledge from multiple experts is then mixed for target domain adaptation.

3.1 Supervised Pre-training on Source Domain

In the first stage, we train the multi-branch network in a fully supervised way on the source domain. We adopt the two-branch backbone of [1], which


considers average and max pooling branches (we will refer to them in the following using subscripts a and m, respectively). The network is equipped with two branches of different structure, which thus have different levels of discrimination capability. We define these branches as experts. Given a source sample x_i ∈ X_s (subscript s denotes the source domain) and its ground truth identity y_i ∈ Y_s, the two-branch network (with parameters θ) produces two feature representations, F_a(x_i|θ) and F_m(x_i|θ). The representations are then encoded to produce two predictions, P_a(x_i|θ) and P_m(x_i|θ), respectively. For simplicity, let us denote F_e(x_i|θ) and P_e(x_i|θ) as the feature representation and prediction of the expert e ∈ E = {a, m} of the network. Each single branch of the network is optimized independently with a cross-entropy loss L_ce and a batch-hard triplet loss [10] L_tri. Assuming y_i is one-hot encoded, the cross-entropy loss is defined as:

L_{ce} = -\sum_{x_i \in X_s} \sum_{j=1}^{|Y_s|} y_{ij} \log P_e^j(x_i|\theta).    (1)

For the triplet loss, we adopt the batch-hard choice of positive and negative samples [10]. We generate batches by randomly sampling C classes of human identities and H images for each class. The triplet loss is given by:

L_{tri} = \sum_{j=1}^{C} \sum_{i=1}^{H} \max\big(0,\; D_{\max}(F_e(x_i^j|\theta), F_e(x_p^j|\theta)) + m_{tri} - D_{\min}(F_e(x_i^j|\theta), F_e(x_n^w|\theta))\big),    (2)

where x_i^j represents the data sample corresponding to the i-th image of the j-th person in the batch, x_p^j and x_n^w indicate the positive and negative samples in each batch, m_{tri} denotes the triplet distance margin, and D_max and D_min select the hard pairs as the maximum and minimum cosine distances between the representations F_e (i.e., F_a or F_m) of x_i^j and the positive and negative samples in the batch, respectively. The whole network is trained in the source domain with a combination of both losses:

L_s = \sum_{e \in E} \big( L_{ce}(P_e(X_s|\theta), Y_s) + L_{tri}(F_e(X_s|\theta), Y_s) \big).    (3)
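To make Eqs. (1)–(3) concrete, the following PyTorch-style sketch computes the per-branch cross-entropy and batch-hard triplet terms with cosine distances; the margin value of 0.3 and the averaging over the batch (instead of summing) are illustrative assumptions, not values taken from the paper.

```python
import torch
import torch.nn.functional as F

def batch_hard_triplet_loss(feats, labels, margin=0.3):
    # feats: (B, D) branch features F_e(x|theta); labels: (B,) identity labels.
    feats = F.normalize(feats, dim=1)
    dist = 1.0 - feats @ feats.t()                        # pairwise cosine distances
    same = labels.unsqueeze(0) == labels.unsqueeze(1)     # (B, B) same-identity mask
    hardest_pos = dist.masked_fill(~same, float('-inf')).max(dim=1).values   # D_max over positives
    hardest_neg = dist.masked_fill(same, float('inf')).min(dim=1).values     # D_min over negatives
    return F.relu(hardest_pos + margin - hardest_neg).mean()

def source_loss(outputs, labels, margin=0.3):
    # outputs: {'a': (logits, feats), 'm': (logits, feats)} for the two experts, as in Eq. (3).
    loss = 0.0
    for logits, feats in outputs.values():
        loss = loss + F.cross_entropy(logits, labels) \
                    + batch_hard_triplet_loss(feats, labels, margin)
    return loss
```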

3.2 Unsupervised Adaptation on Target Domain

We pre-train two networks using Eq. 3 with different initializations. Our approach uses an MMT-based [7] framework where two pairs of networks are trained during the adaptation. In each training iteration, we assume the k-th pre-trained network is parameterized by θk , k ∈ K = {1, 2}. Such networks will be referred to as student networks. For robustness, we create a supervisory network parameterized by the temporal average of the weights and biases of each student network,


which will be referred to as teacher networks. The parameters of the k-th teacher at the current iteration T are denoted as Θ_T^k and are updated as:

\Theta_T^k = \alpha\, \Theta_{T-1}^k + (1 - \alpha)\, \theta^k,    (4)

where α ∈ [0, 1] is the scale factor, and the initial temporal average parameters are Θ_0^k = θ^k. Our unsupervised adaptation process is based on three main stages: (1) clustering and hard-label learning, (2) soft-label learning, and (3) weighting of the soft losses. Clustering and Hard Pseudo-Label Learning. At the beginning of each epoch, we need to assign IDs to the unlabelled target training data (see left of Fig. 1). For each target sample x_i ∈ X_t, the multi-branch teacher predicts two feature representations, F_a(x_i|Θ^k) and F_m(x_i|Θ^k), which are concatenated to provide a unique set of features F_{a,m}(x_i|Θ^k). These features are further combined as ensemble representations over the teacher networks (k ∈ K) to obtain F(x_i), which is used in the clustering phase to obtain a universal hard pseudo-label set. Mini-batch k-means clustering [7] is performed to cluster all target-domain features F(x_i) and assign them pseudo-IDs. The produced pseudo-IDs are then used as hard pseudo-labels y_i ∈ Y_t for the target training samples x_i. Much like the source-domain pre-training, each k-th student network is trained independently on the clustered target samples with a cross-entropy loss:

L_{t,ce}^k = \sum_{e \in E} L_{ce}(P_e(X_t|\theta^k), Y_t),    (5)

and a triplet loss:

L_{t,tri}^k = \sum_{e \in E} L_{tri}(F_e(X_t|\theta^k), Y_t),    (6)

where Y_t is the hard pseudo-label set, and P_e(X_t|θ^k) and F_e(X_t|θ^k) are the predictions and the feature representations of the target samples produced by the k-th student network, respectively. Soft Pseudo-Label Learning. Following recent methods [7,23], our approach leverages the temporally-averaged model of each network, generating more reliable soft pseudo-labels to supervise the student networks. Specifically, in the computation of the soft losses we exploit the multi-branch nature of the networks so that two teachers (i.e., the self and the mutual mean-teachers) are available for each student. ABMT [1] is the first work to tackle UDA for Re-ID with a multi-branch architecture. In that work, better performance is demonstrated when asymmetric-branch guidance is adopted between the self mean-teacher and the student. Following this idea, we propose to guide each student branch with the predictions and feature representations of its own self and mutual mean-teachers. The predictions from the self and mutual teachers supervise the k-th student network with a soft cross-entropy loss [11] in an asymmetric-branch configuration. This can be formulated as:

L_{t,sce}^k = - \sum_{n \in K} \sum_{\substack{j,w \in E \\ j \neq w}} \sum_{x_i \in X_t} P_j(x_i|\Theta_T^n) \log\big(P_w(x_i|\theta^k)\big).    (7)
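For intuition, one (n, j, w) term of this loss is simply a soft cross-entropy between a teacher branch's class probabilities and the peer student branch's predictions; the sketch below averages over the batch instead of summing, which is an assumption of this illustration.

```python
import torch

def soft_cross_entropy(teacher_probs, student_logits):
    # teacher_probs: (B, C) probabilities P_j(x|Theta^n); student_logits: (B, C) raw student scores.
    student_log_probs = torch.log_softmax(student_logits, dim=1)
    return -(teacher_probs * student_log_probs).sum(dim=1).mean()
```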

Similarly, to further enhance the teacher-student networks' discriminative capacity, the feature representations in the teachers supervise those of the student with a soft triplet loss:

L_{t,stri}^k = - \sum_{n \in K} \sum_{\substack{j,w \in E \\ j \neq w}} \sum_{x_i \in X_t} T_j(x_i|\Theta_T^n) \log\big(T_w(x_i|\theta^k)\big),    (8)

where

T_e(x_i|\theta^k) = \frac{\exp\big(\| F_e(x_i|\theta^k) - F_e(x_n|\theta^k) \|_2\big)}{\exp\big(\| F_e(x_i|\theta^k) - F_e(x_p|\theta^k) \|_2\big) + \exp\big(\| F_e(x_i|\theta^k) - F_e(x_n|\theta^k) \|_2\big)}

is the softmax triplet distance of the sample x_i, its hardest positive x_p, and its hardest negative x_n in a mini-batch. By minimizing Eq. 8, the softmax triplet distance between student and teachers is encouraged to be as small as possible. Weighting of Soft Losses. The assigned hard pseudo-labels (as described above) provide a unified label set on which to compare the reliability of the different experts. To accommodate the heterogeneity of the experts, we propose a weighting mechanism inside the soft losses, which modulates the reliability of the different experts according to how well the identities of the samples in a given batch agree with the global hard pseudo-labels (see right of Fig. 1). Following the same batch selection as before, we sample C assigned identities and n instances per identity for a given batch of N target samples. For each expert e in each network k, we denote its Online Batch Confidence M_e^k as:

M_e^k = \frac{\sum_{i=1}^{C} n \, \| \mu_i - \mu \|^2}{\sum_{i=1}^{C} \sum_{x \in C_i} \| F_e(x|\Theta_T^k) - \mu_i \|^2},    (9)

where \mu_i = \sum_{x \in C_i} F_e(x|\Theta_T^k)/n and \mu = \sum_{i=1}^{N} F_e(x_i|\Theta_T^k)/N are the average feature of C_i, i.e. the group denoting the i-th identity, and the average feature of all batch target samples, respectively. The numerator measures, summed over all groups, how far each group centroid lies from the global batch centroid, i.e. how spread out the groups are on average. The denominator accumulates, again over all groups, the distance between each group centroid and the features assigned to that group, i.e. how compact the samples within each group are. A larger M_e^k means better discrimination capability over the small batch of data. Before the computation of the losses for each batch, we calculate M_e^k for each expert and define a set of four weights, w_a^1, w_m^1, w_a^2, w_m^2, to be used in the soft-loss computation as:

w_e^k = \frac{|K|\,|E|\, M_e^k}{\sum_{n \in K} \sum_{j \in E} M_j^n},    (10)

where |K| and |E| are the number of student networks and of branches per student network participating in the learning process, respectively.


With this Online Batch Confidence, our weighting mechanism modulates the reliability of the experts to facilitate better discrimination in the target domain. Such regularization is efficient, and the set of weights changes throughout the epoch, following the variability in each expert's discriminative ability from batch to batch. We then re-define the soft cross-entropy loss in Eq. 7 as the following weighted sum:

L_{t,sce}^k = - \sum_{n \in K} \sum_{\substack{j,w \in E \\ j \neq w}} \sum_{x_i \in X_t} w_j^n\, P_j(x_i|\Theta_T^n) \log\big(P_w(x_i|\theta^k)\big),    (11)

and the soft triplet loss in Eq. 8 as the following weighted sum:

L_{t,stri}^k = - \sum_{n \in K} \sum_{\substack{j,w \in E \\ j \neq w}} \sum_{x_i \in X_t} w_j^n\, T_j(x_i|\Theta_T^n) \log\big(T_w(x_i|\theta^k)\big).    (12)

The teacher-student networks are trained end-to-end with Eqs. 5, 6, 11, and 12, leading to the overall loss for target adaptation:

L_t = (1 - \lambda_{ce})(L_{t,ce}^1 + L_{t,ce}^2) + \lambda_{ce}(L_{t,sce}^1 + L_{t,sce}^2) + (1 - \lambda_{tri})(L_{t,tri}^1 + L_{t,tri}^2) + \lambda_{tri}(L_{t,stri}^1 + L_{t,stri}^2).    (13)

Once training is over, we use for prediction the feature ensemble extracted by the two trained teacher networks, i.e., the representations F_{a,m}(Θ^k) combined over k ∈ K.
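As an illustration of the Online Batch Confidence of Eqs. (9)–(10), the following PyTorch-style sketch computes the per-expert confidence on a batch and normalizes it into weights; the tensor shapes and the small epsilon guard are assumptions of this sketch, not details taken from the paper.

```python
import torch

def online_batch_confidence(feats, pseudo_ids):
    # feats: (N, D) features F_e(x|Theta_T^k) of one expert for the current batch;
    # pseudo_ids: (N,) hard pseudo-labels assigned by the global clustering.
    mu = feats.mean(dim=0)                       # global batch centroid
    between = feats.new_zeros(())
    within = feats.new_zeros(())
    for pid in pseudo_ids.unique():
        group = feats[pseudo_ids == pid]
        mu_i = group.mean(dim=0)
        between += group.shape[0] * (mu_i - mu).pow(2).sum()   # numerator of Eq. (9)
        within += (group - mu_i).pow(2).sum()                  # denominator of Eq. (9)
    return between / (within + 1e-12)

def obc_weights(confidences):
    # confidences: {(k, e): M_e^k} for all |K| * |E| experts; returns w_e^k as in Eq. (10).
    total = sum(confidences.values())
    return {key: len(confidences) * m / total for key, m in confidences.items()}
```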

4 Experiments

We first introduce the Re-ID datasets used in our experiments. Then, the adopted architecture and setup are described. Finally, the results are discussed along with an ablation study.

4.1 Datasets and Evaluation

We consider the most popular ReID benchmark datasets following recent works [1,7,23]. DukeMTMC-reID [27]: This dataset contains 36,411 images of 1,812 identities from 8 high-resolution cameras. It is divided into a training set of 16,522 images belonging to 702 identities that are randomly selected from the overall images, and the testing set comprises the other 2,228 query images and 17,661 gallery ones. Market-1501 [26]: This dataset contains 32,668 labeled images of 1,501 individuals in total acquired by 6 different cameras. The dataset is divided into a training set of 12,936 images of 751 individuals, and a test set of 19,732 images of 750 people (with 3,368 query images and 16,364 gallery images). MSMT17 [19]: This dataset includes 126,441 images of 4,101 identities captured by 15 different cameras, considering both outdoor and indoor scenarios. It is divided into a training set of 32,621 images of 1,041 individuals, and a


Table 1. Comparison with state-of-the-art UDA methods during domain adaptation on popular datasets. The top and the second-top results are highlighted in bold and underlined, respectively.

Methods | Duke→Market mAP / Rank-1 | Market→Duke mAP / Rank-1 | Market→MSMT mAP / Rank-1 | Duke→MSMT mAP / Rank-1
UDAP [15] | 53.7 / 75.8 | 49.0 / 68.4 | - / - | - / -
PAST [24] | 54.6 / 78.4 | 54.3 / 72.3 | - / - | - / -
ACT [20] | 60.6 / 80.5 | 54.5 / 72.4 | - / - | - / -
ECN+ [29] | 63.8 / 84.1 | 54.4 / 74.0 | 15.2 / 40.4 | 16.0 / 42.5
SSG [6] | 68.7 / 86.2 | 60.3 / 76.0 | 16.6 / 37.6 | 18.3 / 41.6
MMT [7] (ResNet-50) | 71.2 / 87.7 | 65.1 / 78.0 | 21.6 / 46.1 | 23.5 / 50.0
ABMT [1] (ResNet-50) | 78.3 / 92.5 | 69.1 / 82.0 | 23.2 / 49.2 | 26.5 / 54.3
SpCL [8] | - / - | - / - | 26.8 / 53.7 | - / -
Ours (ResNet-50) | 81.1 / 93.7 | 70.9 / 83.9 | 25.1 / 50.9 | 28.2 / 55.4

MEB-Net [23] | 76.0 / 89.9 | 66.1 / 79.6 | - / - | - / -
MMT [7] (IBN-ResNet-50) | 76.5 / 90.9 | 68.7 / 81.8 | 26.3 / 52.5 | 29.7 / 58.8
ABMT [1] (IBN-ResNet-50) | 80.4 / 93.0 | 70.8 / 83.3 | 27.8 / 55.5 | 33.0 / 61.8
AWB [18] | 81.0 / 93.5 | 70.9 / 83.8 | 29.0 / 57.3 | 29.5 / 61.0
Ours (IBN-ResNet-50) | 84.1 / 94.4 | 73.0 / 84.2 | 29.1 / 57.7 | 33.3 / 62.0

test set of 93,820 images of 3,060 people (with 11,659 query images and 82,161 gallery images). Following common practice in the Re-ID problem, we use the mean average precision (mAP) and the cumulative matching characteristics (CMC) at Rank-1 to evaluate the performance of our proposed method. In the evaluations, we use the DukeMTMC-reID or Market-1501 dataset as the source domain and the others as target domains.

4.2 Architectures and Setup

Networks. We test our approach using ResNet-50 and IBN-ResNet-50 [14] backbones pre-trained on ImageNet [3]. Following [1], layer 4 of the network is duplicated to create the multi-branch structure. The first branch remains unchanged from the one used in the original backbone: 3 bottlenecks and global average pooling (GAP). The second branch is composed of 4 bottlenecks and global max pooling (GMP). The two branches of the teacher networks act as experts. Source Domain Supervised Pre-training. The models are trained for 80 epochs, and 64 images of 16 identities are selected for each iteration. For each epoch, the number of iterations is set to 200. The input image size is set to 256 × 128 pixels. We adopt the Adam optimizer [12] to train all models with a weight decay of 5 × 10−4 . The initial learning rate is set to 3.5 × 10−4 , which is reduced to 3.5 × 10−5 and 3.5 × 10−6 after 40 and 70 epochs. Target Domain Unsupervised Adaptation. The models are adapted for 40 epochs. For each epoch, the networks are trained for 800 iterations with a


Table 2. Ablation study for the single modules of our method, using IBN-ResNet-50.

Variants | Duke→Market mAP / Rank-1 | Market→Duke mAP / Rank-1
0 - Direct Transfer | 38.3 / 67.9 | 36.7 / 57.0
1 - Baseline single-branch + MMT [7] | 76.5 / 90.9 | 68.7 / 81.8
2 - Baseline multi-branch + SMT [1] | 80.4 / 93.0 | 70.8 / 83.3
3 - Baseline multi-branch + MMT | 82.6 / 93.6 | 70.9 / 83.3
4 - Baseline multi-branch + MMT + SMT | 82.8 / 93.7 | 71.5 / 83.6
5 - Baseline multi-branch + MMT + SMT + weighting | 83.2 / 93.9 | 72.2 / 83.9
6 - Ours | 84.1 / 94.4 | 73.0 / 84.2

fixed learning rate of 3.5 × 10−4. For each iteration, the batch size is set to 64, with a random selection of 16 clustering-based pseudo-identities (cluster labels) and 4 images for each pseudo-identity. The images are resized to 256 × 128 pixel resolution and augmented with random horizontal flipping and random erasing [28]. We adopt the Adam optimizer [12] with a weight decay of 5 × 10−4. The mean-teacher network is initialized and updated as in Eq. 4 with a smoothing coefficient α = 0.999. We set the loss coefficients to λ_ce = 0.5 and λ_tri = 0.8. When performing clustering, we select the optimal k values of k-means following [7,18], i.e. 500 for Duke→Market, 700 for Market→Duke, and 1500 for Duke→MSMT and Market→MSMT.
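A minimal sketch of the mean-teacher update of Eq. (4) with the smoothing coefficient used here (α = 0.999); copying the buffers directly from the student is an assumption of this sketch rather than a detail stated in the paper.

```python
import torch

@torch.no_grad()
def ema_update(teacher, student, alpha=0.999):
    # Eq. (4): Theta_T^k = alpha * Theta_{T-1}^k + (1 - alpha) * theta^k, parameter by parameter.
    for t_p, s_p in zip(teacher.parameters(), student.parameters()):
        t_p.mul_(alpha).add_(s_p, alpha=1.0 - alpha)
    # Running statistics (e.g., BatchNorm buffers) are copied from the student.
    for t_b, s_b in zip(teacher.buffers(), student.buffers()):
        t_b.copy_(s_b)
```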

4.3 Comparison with State-of-the-Art Methods

We compare the performance of our proposed method with recent state-of-the-art (SOTA) UDA methods in Table 1. We choose the popular UDA settings Duke→Market, Market→Duke, Duke→MSMT17, and Market→MSMT17, and report results for both the simpler and the more powerful backbone, i.e., the ResNet-50 and IBN-ResNet-50 models. The first part of Table 1 shows the Re-ID performance of methods using the ResNet-50 network, while the second part shows the comparison between methods adopting the more powerful backbone. In general, our approach achieves SOTA results for both the mAP and Rank-1 metrics in all cross-dataset settings, using either ResNet-50 or IBN-ResNet-50; the improvement is thus observed regardless of the specific backbone, which further validates the robustness of our proposed method. Our approach ranks second only to SpCL [8] in the Market→MSMT17 scenario, the only scenario in which the two methods can be compared. SpCL adopts a contrastive learning strategy and augmented supervision, since it makes use of both source and target data during adaptation, while our approach is more flexible, since it can be applied even when the source data is no longer available for target adaptation.

4.4 Ablation Study

We analyse several possible learning structures to validate the effectiveness of our proposed method. We perform an ablation study on the Duke→Market and Market→Duke UDA settings and report results in Table 2 for the different variants of our approach. After the baseline performance in line (0), showing the accuracy without applying any adaptation, in line (1) we report the performance of a standard IBN-ResNet-50 backbone inside a mutual mean-teaching framework [7]. In line (2), we consider the multi-branch backbone proposed in [1] in a self mean-teaching framework. The latter backbone can easily be adopted in the MMT framework, line (3), where the asymmetric-branch guidance comes from the mutual teacher, or from both the mutual and self teachers equally, line (4). By weighting the soft losses, with the two teacher experts guiding the predictions and feature representations of the student experts, we further improve performance, line (5). We finally leverage the two trained teachers in an ensemble to showcase our complete method, line (6). We note that the different contributions consistently and progressively improve the baselines: from the simpler single-branch + MMT version to the complete method, we see significant mAP gains (namely, 7.6% and 4.3% for the two addressed settings). We highlight that the proposed strategy applies on top of the expert models, so a similar behavior can also be expected when the backbone architecture is different (e.g., ResNet-50, which is not so different from the IBN-ResNet-50 used in our analysis).

5 Conclusions

In this work, we addressed unsupervised domain adaptation for person Re-ID. We proposed a novel multi-expert domain adaptation framework to tackle the noisy-labelling effect resulting from the sub-optimal clustering stage typically adopted in recent approaches. We combined the benefits of soft supervision techniques and introduced a novel local weighting mechanism for guiding the expert knowledge in the soft loss computation. Our results demonstrated state-of-the-art performance in cross-domain Re-ID scenarios.

References 1. Chen, H., Lagadec, B., Bremond, F.: Enhancing diversity in teacher-student networks via asymmetric branches for unsupervised person re-identification. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (2021) 2. Chen, Y., Zhu, X., Gong, S.: Instance-guided context rendering for cross-domain person re-identification. In: International Conference on Computer Vision (2019) 3. Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., Fei-Fei, L.: Imagenet: a large-scale hierarchical image database. In: IEEE Conference on Computer Vision and Pattern Recognition (2009)


4. Deng, W., Zheng, L., Ye, Q., Kang, G., Yang, Y., Jiao, J.: Image-image domain adaptation with preserved self-similarity and domain-dissimilarity for person reidentification. In: IEEE Conference on Computer Vision and Pattern Recognition (2018) 5. Deng, W., Zheng, L., Ye, Q., Yang, Y., Jiao, J.: Similarity-preserving image-image domain adaptation for person re-identification. arXiv preprint arXiv:1811.10551 (2018) 6. Fu, Y., Wei, Y., Wang, G., Zhou, Y., Shi, H., Huang, T.S.: Self-similarity grouping: a simple unsupervised cross domain adaptation approach for person reidentification. In: International Conference on Computer Vision (2019) 7. Ge, Y., Chen, D., Li, H.: Mutual mean-teaching: pseudo label refinery for unsupervised domain adaptation on person re-identification. In: International Conference on Learning Representations (2019) 8. Ge, Y., Zhu, F., Chen, D., Zhao, R., Li, H.: Self-paced contrastive learning with hybrid memory for domain adaptive object re-id. In: Advances in Neural Information Processing Systems (2020) 9. Ge, Y., Zhu, F., Zhao, R., Li, H.: Structured domain adaptation with online relation regularization for unsupervised person re-id. arXiv e-prints pp. arXiv-2003 (2020) 10. Hermans, A., Beyer, L., Leibe, B.: In defense of the triplet loss for person reidentification. arXiv preprint arXiv:1703.07737 (2017) 11. Hinton, G., Vinyals, O., Dean, J.: Distilling the knowledge in a neural network. arXiv preprint arXiv:1503.02531 (2015) 12. Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) 13. Lopez-Paz, D., Bottou, L., Sch¨ olkopf, B., Vapnik, V.: Unifying distillation and privileged information. In: International Conference on Learning Representations (2016) 14. Pan, X., Luo, P., Shi, J., Tang, X.: Two at once: enhancing learning and generalization capacities via ibn-net. In: European Conference on Computer Vision (2018) 15. Song, L., et al.: Unsupervised domain adaptive re-identification: theory and practice. Pattern Recogn. 102, 107173 (2020) 16. Tarvainen, A., Valpola, H.: Mean teachers are better role models: weight-averaged consistency targets improve semi-supervised deep learning results. arXiv preprint arXiv:1703.01780 (2017) 17. Wang, L., Yoon, K.J.: Knowledge distillation and student-teacher learning for visual intelligence: a review and new outlooks. IEEE Trans. Pattern Anal. Mach. Intell. (2021) 18. Wang, W., Zhao, F., Liao, S., Shao, L.: Attentive waveblock: complementarityenhanced mutual networks for unsupervised domain adaptation in person reidentification. arXiv preprint arXiv:2006.06525 (2020) 19. Wei, L., Zhang, S., Gao, W., Tian, Q.: Person transfer GAN to bridge domain gap for person re-identification. In: IEEE Conference on Computer Vision and Pattern Recognition (2018) 20. Yang, F., et al.: Asymmetric co-teaching for unsupervised cross-domain person re-identification. In: AAAI (2020) 21. Yu, H.X., Zheng, W.S., Wu, A., Guo, X., Gong, S., Lai, J.H.: Unsupervised person re-identification by soft multilabel learning. In: IEEE Conference on Computer Vision and Pattern Recognition (2019)


22. Zhai, Y., et al.: Ad-cluster: augmented discriminative clustering for domain adaptive person re-identification. In: IEEE Conference on Computer Vision and Pattern Recognition (2020) 23. Zhai, Y., Ye, Q., Lu, S., Jia, M., Ji, R., Tian, Y.: Multiple expert brainstorming for domain adaptive person re-identification. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12352, pp. 594–611. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58571-6 35 24. Zhang, X., Cao, J., Shen, C., You, M.: Self-training with progressive augmentation for unsupervised cross-domain person re-identification. In: International Conference on Computer Vision (2019) 25. Zhang, Y., Xiang, T., Hospedales, T.M., Lu, H.: Deep mutual learning. In: IEEE Conference on Computer Vision and Pattern Recognition (2018) 26. Zheng, L., Shen, L., Tian, L., Wang, S., Wang, J., Tian, Q.: Scalable person re-identification: a benchmark. In: International Conference on Computer Vision (2015) 27. Zheng, Z., Zheng, L., Yang, Y.: Unlabeled samples generated by GAN improve the person re-identification baseline in vitro. In: IEEE Conference on Computer Vision and Pattern Recognition (2017) 28. Zhong, Z., Zheng, L., Kang, G., Li, S., Yang, Y.: Random erasing data augmentation. In: AAAI (2020) 29. Zhong, Z., Zheng, L., Luo, Z., Li, S., Yang, Y.: Learning to adapt invariance in memory for person re-identification. IEEE Trans. Pattern Anal. Mach. Intell. (2020)

Cross-Modality Attention and Multimodal Fusion Transformer for Pedestrian Detection

Wei-Yu Lee(B), Ljubomir Jovanov, and Wilfried Philips

TELIN-IPI, Ghent University-imec, Gent, Belgium
{Weiyu.Lee,Ljubomir.Jovanov,Wilfried.Philips}@ugent.be

Abstract. Pedestrian detection is an important challenge in computer vision due to its various applications. To achieve more accurate results, thermal images have been widely exploited as complementary information to assist conventional RGB-based detection. Although existing methods have developed numerous fusion strategies to utilize the complementary features, research that focuses on exploring features exclusive to each modality is limited. On this account, the features specific to one modality cannot be fully utilized, and the fusion results can easily be dominated by the other modality, which limits the upper bound of discrimination ability. Hence, we propose the Cross-modality Attention Transformer (CAT) to explore the potential of modality-specific features. Further, we introduce the Multimodal Fusion Transformer (MFT) to identify the correlations between the modality data and perform feature fusion. In addition, a content-aware objective function is proposed to learn better feature representations. The experiments show that our method can achieve state-of-the-art detection performance on public datasets. The ablation studies also show the effectiveness of the proposed components.

Keywords: Cross-modality fusion · Multimodal pedestrian detection · Transformer

1 Introduction

Pedestrian detection is one of the most important challenges in computer vision due to its various applications, including autonomous driving, robotics, drones, and video surveillance. To achieve more accurate and robust results, thermal images have been widely exploited as complementary information to solve the challenging problems that impede conventional RGB-based detection, such as background clutter, occlusions, or adverse lighting conditions. The radiated heat of pedestrians provides sufficient features to differentiate the shape of humans from the background, but thermal images lose visual appearance details. On the other


hand, RGB cameras can capture fine visual characteristics of pedestrians (e.g., texture). Hence, designing a fusion scheme that effectively utilizes the different visual cues from the thermal and RGB modalities has become a popular research interest. In the existing methods, numerous fusion strategies have been exploited to utilize complementary features from color and thermal images [9,28,32]. In addition to simple feature concatenation [9], semantic information [13] and attention mechanisms [29] have also been introduced to improve detection performance. Furthermore, in order to better exploit the characteristics of the different modalities, the illumination condition is also considered during feature fusion [6,14]. However, most previous studies only focus on performing feature fusion and exploiting the fused features, instead of exploring features specific to each modality (i.e., modality-specific features) before fusion [9,11,13,20,31]. Specifically, most methods from the literature put emphasis on performing detection after the features from the color and thermal images are fused, while the features exclusive to each modality are not entirely utilized in the fusion process. As a consequence, some of the features specific to one modality cannot be fully utilized. For instance, the texture or color of RGB images in bad illumination conditions might not be properly explored due to the domination of strong features from the other modality in the fusion process. However, in conventional RGB-based detection, the texture of the objects provides important cues that make the targets distinct from the background clutter. Without fully considering specific features, the fused information becomes the main discriminative cue, which limits the upper bound of discrimination ability. In this paper, we propose a novel network architecture for multimodal pedestrian detection based on exploring the potential of modality-specific features to boost the detection performance. The first key idea of this paper is the better exploitation of modality-specific features by cross-referencing the complementary modality data in order to obtain more discriminative details. Instead of extracting features from the modalities with independent feature extractors, we suggest that aligned thermal-visible image pairs can act as a "consultant" for each other to discover potential specific features. An argument for this reasoning can be found in set theory: fused features in fact correspond to an intersection of the features from the thermal and RGB images, while modality-specific features remain unused in the disjoint parts of each modality's feature set, which shows that a large amount of features goes unexploited. While pixel-level fusion resembles finding the intersection of two sets, we would like to introduce a union of the features present in both modalities. This is accomplished by a unique multimodal transformer with a novel cross-referencing self-attention mechanism, called the Cross-modality Attention Transformer (CAT), which considers cross-modality information as keys to compute the attention weights on the current modality's values. Furthermore, after extracting modality-specific features, we propose a Multimodal Fusion Transformer (MFT) to perform the fusion process on every pair of ROIs. Our MFT further improves the detection performance by merging the multimodal features simultaneously with the help of the self-attention mechanism. Moreover, in order to learn distinct feature representations between foreground

610

W.-Y. Lee et al.

(pedestrian) and background, a novel content-aware objective function is proposed to guide the model, which shortens the intra-class variation and expands the inter-class variation. Our main contributions can be summarized as follows: – We have identified the disadvantages of the recent techniques relying on fused features for multimodal pedestrian detection, and introduced a new modality fusion scheme, which can effectively utilize modality-specific information of each modality. – To our best knowledge, we are among the first to propose a Transformerbased network to enhance modality-specific features by cross-referencing the other modality. In feature extraction, we rely on the Cross-Modality Attention Transformer (CAT), which identifies related features in both modalities and consults the second modality in order to perform information aggregation based on the union of sets instead of the intersection type of fusion. – Thanks to the ability of transformer networks to identify correlations between heterogeneous data, in this case RGB and thermal ROIs, our proposed Multimodal Fusion Transformer (MFT) performs association between detected regions in a more efficient way, compared to classical CNN methods. – In our experiments, we qualitatively and quantitatively verify the performance of our model against recent relevant methods and achieve comparable or better detection results on the KAIST and CVC-14 datasets.

2 2.1

Related Work Multimodal Pedestrian Detection

Although deep learning methods have significantly advanced and dominated conventional RGB-based detection, detecting pedestrians in adverse weather and lighting conditions, background clutter or occlusions, is still a non-trivial problem. Motivated by this, numerous researchers have developed different fusion schemes relying on an additional modality to improve detection performance. Hwang et al. [7] have proposed a widely used pedestrian dataset with synchronized color and thermal image pairs. Next, Liu et al. compared various fusion architectures and proposed an important baseline model based on the halfway fusion [9] and Faster R-CNN [22]. Then, in the papers of Konig et al. [11] and Park et al. [20], the fusion models based on Faster R-CNN were also proposed. Moreover, in order to distinguish pedestrian instances from hard negatives, additional attributes have been introduced. For instance, Li et al. [13] leveraged the auxiliary semantic segmentation task to boost pedestrian detection results. Zhang et al. [29] also utilized the weak semantic labels to learn an attentive mask helping the model to focus on the pedestrians. In addition, the illumination condition of the scenes is also studied to improve the fusion results. For example, Guan et al. [6] and Li et al. [14] estimate the lighting condition of the images to determine the weights between thermal and color features. Furthermore, the misaligned and unpaired problems between the modalities have been investigated

Cross-Modality Attention and Multimodal Fusion Transformer

611

by Zhang et al. [31] and Kim et al. [10]. However, in most of the aforementioned methods, pedestrian detection is usually performed after the features have been fused. The discussion about exploring or enhancing the specific features of each modality is quite limited. 2.2

Multimodal Transformers

The self-attention mechanism of transformers has shown its advantages in many natural language processing and computer vision tasks [3,25], and recently, it has been also applied to various multimodal fusion problems, such as image and video retrieval [1,4], image/video captioning [8,15,18,24], visual tracking [27] and autonomous driving [21]. Typically, multimodal inputs to transformers are allowed free attention flow between different modalities [19]. For instance, spatial regions in the image and audio spectrum would be considered and aggregated simultaneously without limitation. However, unlike audiovisual learning tasks, aligned thermal and color image pairs have more features in common, such as the shape of objects. Directly applying traditional self-attention to the image pairs might have difficulties to extract useful features due to the redundant information. In addition, the literature on fusing thermal and color images relying on transformers for pedestrian detection is quite limited. Hence, the strategies for utilization of shared information and exploring specific cues of each modality remains an important issue.

3

Proposed Method

The objective of our proposed fusion scheme is to explore potential modalityspecific features and perform multimodal pedestrian detection using thermal and color image pairs as input. Our model consist of three main parts: (1) two-stream feature extractor with cross-modality attention, (2) modality-specific region proposals, and (3) multimodal fusion. As illustrated in Fig. 1, we rely on two independent feature extractors to obtain the features from color and thermal image pair Ic and It . Simultaneously with the extraction, we feed feature maps from feature extractors to our proposed Cross-modality Attention Transformer (CAT), for considering cross-modality information used to enhance modality-specific features. The attended results are concatenated with each original input feature maps and forwarded to the corresponding region proposal network RPNc , and RPNt to find modality-specific ROIs. Finally, the proposed Multimodal Fusion Transformer (MFT) merges the ROI pair Rc and Rt from ROI pooling module and output classification and bounding box predictions. In the following subsections, we explain the details of each contribution (Fig. 2). 3.1

Cross-Modality Attention Transformer

In previous studies, numerous fusion methods have been proposed to merge and utilize the attributes of each modality in a proper manner. However, the

612

W.-Y. Lee et al. Feature maps

Modality-specific features

VGG16

ROI Pooling

Classification Results

concat

RGB images Cross -Modality Attention Transformer

Region Proposal Network ( )

Multimodal Fusion Transformer

Region Proposal Network ( )

ROI pairs

concat Modality-specific features

VGG16

Bounding Box Results

ROI Pooling

Feature maps

Thermal images Two-stream feature extractor

Modality-specific region proposals

Multimodal fusion

Fig. 1. Our proposed multimodal pedestrian detection network. We propose a twostream feature extractor with Cross-modality Attention Transformer (CAT) to extract modality-specific features. After fetching the modality-specific region proposals, Multimodal Fusion Transformer (MFT) is introduced to merge the ROI pairs for class and bounding box predictions.

Fig. 2. Our proposed Cross-modality Attention Transformer (CAT). We divide feature maps fc and ft from the two feature extractors into patches as input, and fold the output patches to form the attended modality-specific feature maps yc and yt . Different from previous works, we introduce cross-modality attention to utilize the other modality information for encouraging the model to focus on the regions ignored by the current modality but highlighted by the other.

discussion about enhancing modality-specific features before fusion is quite limited. Modality-specific information plays an important role in single modality detection, such as textures in color images. Therefore, in this part, we aim at enhancing and preserving the modality-specific features before fusion by our

Cross-Modality Attention and Multimodal Fusion Transformer

613

Fig. 3. Illustration of cross-modality attention. Instead of computing the queries with all modality keys, we introduce a novel way to only reference the complementary keys for introducing new perspectives from the other modality.

proposed cross-modality attention mechanism. Specifically, we argue that with our enhanced modality-specific features it is possible to achieve more accurate region proposals. Cross-Modality Attention. As we have already learned, fused features usually cannot fully exploit modality-specific information and become dominated by strong cues from one of the modalities. In order to explore specific features of each modality, we introduce a novel attention mechanism to reference crossmodality data and to enhance our feature extraction. The main idea behind this design is to utilize the cross-modality information in order to encourage the model to focus on the regions ignored by the current modality but highlighted by the other. For this purpose, we use cross-modality attention transformers, which significantly outperform conventional CNN in discovering subtle and distant correlations in different modalities. As illustrated in Fig. 3, instead of computing the scaled dot-product of the query with all modality keys, we only compute the scaled dot-product with the complementary keys. To be more specific, following [3]’s standard ViT architecture, we divide the input feature map pairs fci and fti from a certain layer i of the feature extractors into Nx patches for each modality, and then we flatten the patches and project them into Nx dx -dimensional input sequences as Nx 1 x (x1c , ..., xN c ) and (xt , ..., xt ). Further, we add Nx dx -dimensional learnable positional embeddings Eposx to each modality, and define the new matrices as Xc and Xt of size RNx ×dx . Different from the traditional transformer layers in ViT [3], we introduce CAT layers containing cross-modality attention module to utilize the other modality information. Our cross-modality attention operates on the queries and values from color patches as Qc , Vc , keys from thermal patches as Kt . Hence, in every

614

W.-Y. Lee et al.

CAT layer, the cross-modality attention matrix of the color sequence for single head can be written as: Attention(Qc , Kt , Vc ) = softmax(αQc KtT )Vc ,

(1)

where α is the scaling factor, and Qc = W q Xc , Kt = W k Xt , and Vc = d ×

dx

W v Xc are linear transformations. W q , W k , W v ∈ R x Nh are the weight parameters for query, key, and value projections and Nh is the number of heads. As same as above, the attention matrix of the thermal sequence is: Attention(Qt , Kc , Vt ) = softmax(αQt KcT )Vt . Different from the other multimodal transformers [16,21,26], we do not consider all the modality keys to find the attention weights. Instead, we introduce the perspective from the other modality by cross-referencing the keys to see if there is any target sensed by the other sensor and enhance the current sensor’s features. The output of the cross-modality attention module is then passed into Layer Normalization and MLP layers to get the attended features for color and thermal modalities. Next, we repeat several CAT layers and fold the attended output patches to form the feature maps yci and yti , which represent the additional modality-specific features learned from the other modality. Note that we still use the values from each modality to form the outputs, which means we do not directly fuse the multimodal features here. Then, we concatenate the output feature maps to the corresponding input features. Network-in-Network (NIN) [17] is applied to reduce the dimension and merge them with the input features. Furthermore, we apply our CAT on three different scales for considering coarse to fine-grained modality-specific features in practice. Modality-Specific Region Proposal. In order to fully exploit the modalityspecific features, we propose two independent region proposal networks to find the proposals separately. Different from the previous works [9,11,13,20,31], we suggest that using the fused features to perform region proposal might limit the discrimination ability because the fused features might be dominated by one modality without considering the other. Therefore, in our work, we perform region proposal separately relying on our enhanced modality-specific features to explore the potential candidates. Afterwards, we use IoU threshold to match the proposals from the two modalities to fetch ROI pairs. In order to maximize the recall rate, we form the union of the proposals to involve all the possible candidates and to avoid mismatches. In other words, if a proposal from thermal sensor is not matched, we still use the bounding box to fetch the ROI from the RGB sensor to form a ROI pair. 3.2

Multimodal Fusion

In order to optimally use the modality-specific cues from thermal and color images to perform detection, we propose a Multimodal Fusion Transformer (MFT) to perform the fusion of the features. Instead of considering the whole

Cross-Modality Attention and Multimodal Fusion Transformer

615

Fig. 4. Our proposed Multimodal Fusion Transformer (MFT). We divide ROI pairs from the two modalities into patches and prepend class and bbox for learning image representations. Different from CAT, we treat all the modality input sequences equally to apply pair-wised attention for computing attention weights.

image pair with background clutter, our fusion scheme only focuses on combining each ROI pair separately for lower computation time. In addition, to learn more discriminative feature representations, we introduce content-aware loss to enforce MFT to group the features based on the content. Modality-Specific Features Fusion. As illustrated in Fig. 4, we divide each ROI into Nr patches and project the flatten patches into Nr dr -dimensional input sequences as (rc1 , ..., rcNr ) and (rt1 , ..., rtNr ). In addition, we also add Nr dr -dimensional learnable positional embeddings Eposr , and similar to [3], we introduce two dr -dimensional learnable embeddings class and bbox, whose output state serve as the image representation for classification and bounding box predictions. Then, we concatenate input sequences with the embeddings as input and feed them into the MFT layers. Different from the traditional ViT [3], we propose pair-wised attention module inside the MFT layers to apply the self-attention to all the input sequences to perform prediction. In particular, for merging the ROI pairs, we need to consider all the pair-wised modality patches to transfer the complementary information. During the pair-wised attention, we mix all the paired sequences and compute the scaled dot-product of queries with all the modality keys to fuse the features from the two sensors. After several MFT layers, finally, we forward the output of class and bbox token to two independent MLP layers for the classification prediction Pcls and bounding box prediction Pbbox . We argue the differences from the previous studies [6,14] as follows: without designing external network

616

W.-Y. Lee et al.

to learn balancing parameters, we utilize the self-attention mechanism to find the attention weightings and perform information fusion on each ROI pair. Content-Aware Loss. For learning better feature representations, we propose content-aware loss to utilize the label information of each ROI pair. We impose content-aware loss Lca on each output state of class token Ycls,c , where c indicates the class label, 0 for background and 1 for foreground, to maximize the inter-class discrepancy and minimize intra-class distinctness. Specifically, for each input batch, we average the all the Ycls,1 to fetch the representative feature  of foreground: Yf g = batch Ycls,1 /Nf , where Nf represents the number of outputs with pedestrian annotation. As the same way, we can fetch representative feature of background Ybg . Then, the distance between  each Ycls,c and the corre= sponding center feature can be calculated as: d bg batch Ycls,0 − Ybg  , df g =  batch Ycls,1 − Yf g . With the above definitions, the proposed content-aware loss Lca can be written as: Lca = max (dbg + df g − Ybg − Yf g  + m, 0),

(2)

where m > 0 is the margin enforcing the separation between foreground and background features. By this way, we shorten the intra-class feature distance and expand the inter-class distance for better discrimination ability. For each training iteration, we optimize the following objective function: Ltotal = LRP Nc + LRP Nt + Lcls + Lbbox + λLca ,

(3)

where LRP Nc and LRP Nt represent the loss from region proposal networks [22] of color and thermal modality, and Lca is weighted by a balancing parameter λ. Similar to the Faster R-CNN [22], we use cross entropy and smooth L1 loss as the classification Lcls and bounding box regression loss Lbbox of MFT. Moreover, for better understanding, we also show the pseudo-code of our proposed method in Algorithm 1 to illustrate the whole process.

4

Experiments

In order to evaluate the performance of our proposed method, we conduct several experiments on the KAIST [7] and CVC-14 [5] datasets to compare with the previous methods. Furthermore, we also conduct ablation studies to demonstrate the impact of the proposed components. In all the experiments, we follow [31]’s settings and use log-average Miss Rate over the range of [10−2 , 100 ] false positive per image (FPPI) as the main metric to compare the performance. 4.1

Dataset and Implementation Details

KAIST Dataset. The KAIST dataset [7] consists of 95, 328 visible-thermal image pairs captured in urban environment. The annotation includes 1, 182 unique pedestrians with 103, 128 bounding boxes. After applying the standard

Cross-Modality Attention and Multimodal Fusion Transformer

617

Algorithm 1: Multimodal Fusion for Pedestrian Detection Input : Color Image: Ic , Thermal Image: It Output : Classification and bounding box predictions Pcls , Pbbox // Step 1: Feature extraction for three different scales i during feature extraction do Extract color and thermal feature maps fci and fti from VGGc (Ic ) and VGGt (It ). // Enhance modality-specific features by Cross-modality Attention Transformer (CAT) yci = CAT(fci ), yti = CAT(fti ) // Concatenate with fc and ft , and reduce the dimension with NIN [17] fci+1 = NIN(fci , yci ), fti+1 = NIN(fti , yti ) end // Step 2: Modality-specific region proposals // Use feature extractor outputs Fc and Ft as inputs Rc = ROIpooling(RPNc (Fc )), Rt = ROIpooling(RPNt (Ft )) // Step 3: ROI matching Rp = ROImatch(Rc , Rt ) // Step 4: Multimodal Fusion // Perform feature fusion and prediction by Multimodal Fusion Transformer (MFT) k forall ROI pairs Rp and learnable embeddings class and bbox do k k k Pcls , Pbbox = MFT(Rp , class, bbox) end

criterion [7], there are 7, 601 training image pairs and 2, 252 testing pairs. We train our model on the paired annotations released by [31] and apply horizontal flipping with single scale 600 for data augmentation. For fair comparison with the reference state-of-the-art methods, we evaluate the performance on “reasonable” day, night, and all-day subsets defined by [7] with sanitized labels [13]. CVC-14 Dataset. The CVC-14 dataset [5] consist of 7, 085 and 1, 433 visible (grey) and thermal frames captured in various scenes at day and night for training and testing. Different from the KAIST dataset, the field of views of the thermal and visible image pairs are not fully overlapped and calibrated well. The authors provided separated annotations for each modality and cropped image pairs to make thermal and visible images share the same field of view. In our experiments, we use the cropped image pairs and annotations to evaluate our method. The data augmentation strategy is as same as the KAIST dataset. Implementation. In our proposed method, we use two independent VGG-16 [23] pretrained on ImageNet [12] to extract color and thermal modality feature. Subsequently, we apply our proposed Cross-modality Attention Transformer (CAT) on the last three different scales with patch size 16, 3, and 3 without overlap to reference the other modality. Each of the transformer contains 3 CAT layers. For the proposed Multimodal Fusion Transformer (MFT), we also use 3 MFT layers with patch size 3, and each input patch dimension is 1, 024. Except the attention modules, we follow [3]’s architecture to design CAT and MFT layers. The λ parameter for Lca is 0.001. For more details, please refer to the Supplementary Materials.

618

W.-Y. Lee et al.

Table 1. Comparisons with the state-of-the-art methods on the KAIST dataset. Methods

Feature extractor

Miss Rate (IoU = 0.5) All Day Night

ACF [7]

-

47.32% 42.57% 56.17%

Halfway Fusion [9]

VGG-16

25.75% 24.88% 26.59%

Fusion RPN + BF [11] VGG-16

18.29% 19.57% 16.27%

IAF + RCNN [14]

VGG-16

15.73% 14.55% 18.26%

IATDNN + IASS [6]

VGG-16

14.95% 14.67% 15.72%

CIAN [30]

VGG-16

14.12% 14.77% 11.13%

MSDS-RCNN [13]

VGG-16

11.34% 10.53% 12.94%

AR-CNN [31]

VGG-16

9.34%

9.94%

8.38%

MBNet [32]

ResNet-50 8.13%

8.28%

7.86%

MLPD [10]

VGG-16

7.58%

7.95%

6.95%

Ours

VGG-16

7.03% 7.51% 6.53%

Table 2. Comparisons with the state-of-the-art methods on the CVC-14 dataset.

4.2

Methods

Feature extractor

Miss Rate (IoU = 0.5) All Day Night

MACF [20]

-

69.71%

72.63%

65.43%

Choi et al. [2]

VGG-16

63.34%

63.39%

63.99%

Halfway Fusion [20] VGG-16

31.99%

36.29%

26.29%

Park et al. [20]

VGG-16

26.29%

28.67%

23.48%

AR-CNN [31]

VGG-16

22.1%

24.7%

18.1%

MBNet [32]

VGG-16

21.1%

24.7%

13.5%

MLPD [10]

VGG-16

21.33%

24.18%

17.97%

Ours

VGG-16

20.58% 23.97% 12.85%

Quantitative Results

Evaluation on the KAIST Dataset. As illustrated in Table 1, we evaluate our method and compare it with other recent related methods. Our method achieves 7.03% MR, 7.51% MR, and 6.53% MR on day, night, and all-day subsets under the 0.5 IoU threshold. This table clearly shows that our method can achieve superior performance on all the subsets. Evaluation on the CVC-14 Dataset. As for the KAIST dataset, we show the evaluations of our method and compare it with other state-of-the-art models in Table 2. In this table, we follow the setting introduced in [20] to conduct the experiment. Our method achieves 20.58% MR, 23.97% MR, and 12.85% MR on

Cross-Modality Attention and Multimodal Fusion Transformer

619

Table 3. Ablation studies of proposed cross-modality attention and content-aware loss on the KAIST dataset. The baseline model is Halfway fusion [9], and we evaluate the model with or without the proposed components to verify the improvements. Methods CAT MFT Cross-modality Pair-wised Contentattention attention aware loss

Miss Rate (IoU = 0.5) All Day Night

Baseline -

-

-

25.75% 24.88% 26.59%

Ours

  -

 

13.54% 12.42% 10.14% 7.03%

 

14.87% 13.75% 10.87% 7.51%

13.01% 12.11% 9.74% 6.53%

day, night, and all-day subsets under the 0.5 IoU threshold. We can observe that our method outperforms the other method under all the subsets. 4.3

Ablation Study

Effects of Cross-Modality Attention. For further analysis of the effect of our proposed cross-modality attention, we conduct an experiment to compare the results with or without the cross-modality attention mechanism. In Table 3 we list four models to demonstrate the advantages of our method. Instead of using crossmodality attention in CAT, we use pair-wised attention to allow free attention flow between the modalities to compute the attention matrix (computing queries with all the modality keys) and to compare it with our proposed method. We find that cross-modality attention improves the performance of the reference models by a large margin. The performance can be improved by referencing the complementary information from the other modality rather than allowing free attention flow and directly fusing all the modality data in the early stage. In addition, we also show the feature maps of the models with and without cross-modality attention in Fig. 5 to demonstrate our advantages qualitatively. Heat maps in this figure are generated by averaging the final convolution layer of RGB feature maps and superimposing them on the RGB image. We observe that with our proposed cross-modality attention, the model can correctly identify the pedestrians. In contrast, without cross-modality attention, the model can hardly focus on the targets. Effects of Content-Aware Loss. In Table 3, we also compare the results with and without the content-aware loss to discuss the effect of content-aware loss. We can see that our proposed loss further improves the detection performance irrespective of the cross-modality attention or pair-wised attention. Especially when the content-aware loss is applied to the model with cross-modality attention, the result shows the best performance and largest improvement among all the combinations, which can demonstrate the effectiveness this component.

620

W.-Y. Lee et al.

Fig. 5. The effects of cross-modality attention transformer. The left column (a) and (b) show the RGB image with detection results comparison, and the right column (c) and (d) show the feature maps with/without our proposed cross-modality attention transformer. The green boxes represent the correct detection results, and the red and orange boxes represent false negative and false positive samples respectively. (Color figure online)

Effects of Proposed Components. In order to demonstrate the contributions of all the proposed components, we evaluate the model with/without the components to verify the improvements. In Table 4, we use Halfway fusion [9] as our baseline model, and then apply the proposed components gradually to show the performance difference. First, we only apply CAT to enhance modality-specific features and concatenate the features for single region proposal network. There is only a marginal improvement because the potential ROIs might not be fully explored. In addition, for different proposals, an advanced fusion process is also required to handle

Cross-Modality Attention and Multimodal Fusion Transformer

621

Table 4. Ablation studies of our proposed components on the KAIST dataset. The baseline model is Halfway fusion [9], and we evaluate the model with/without the proposed components to verify the improvements. The results show that the proposed transformer significantly improves the detection performance and outperforms the previous models to achieve the best performance among all the combinations. Methods CAT Modality-specific MFT Miss Rate (IoU = 0.5) RPNs All Day Night Baseline Ours

  

-

-

25.75% 24.88% 26.59%

 



20.45% 21.66% 20.47% 13.12% 14.45% 12.87% 7.03% 7.51% 6.53%

various illumination scenes. Secondly, we apply two independent region proposal networks for each modality to find proposals and merge the ROI pairs by concatenation. This leads to a larger improvement, demonstrating that the enhanced modality-specific features and independent RPNs can truly help the model to find more potential proposals. Furthermore, we apply MFT to fuse the ROI pairs instead of feature concatenation to verify the effect of fusion by the attention mechanism. The results show that the proposed transformer significantly improves the detection performance and outperforms the previous models to achieve the best performance among all the combinations.

5

Conclusions

In this paper, we propose a novel fusion scheme to combine visible and thermal image pairs to perform multimodal pedestrian detection. We introduce the Cross-modality Attention Transformer (CAT) to reference complementary information from the other modality during feature extraction to investigate the potential of modality-specific features to improve detection performance. Instead of directly fusing all the modality data, by our proposed cross-modality attention, we can extract more discriminative details. Moreover, we propose modalityspecific region proposal networks to explore the potential candidates and merge the modality features pair-wisely by our proposed Multimodal Fusion Transformer (MFT) to make better predictions. Finally, a novel content-aware loss is proposed to separate the foreground and background features to increase the discrimination ability. The experiment results on the public KAIST and CVC-14 datasets confirm that our method can achieve state-of-the-art performance, and the ablation studies also clarify the effectiveness of the proposed components. Acknowledgments. This work was funded by EU Horizon 2020 ECSEL JU research and innovation programme under grant agreement 876487 (NextPerception) and by the Flemish Government (AI Research Program).

622

W.-Y. Lee et al.

References 1. Bain, M., Nagrani, A., Varol, G., Zisserman, A.: Frozen in time: a joint video and image encoder for end-to-end retrieval. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) (2021) 2. Choi, H., Kim, S., Park, K., Sohn, K.: Multi-spectral pedestrian detection based on accumulated object proposal with fully convolutional networks. In: International Conference on Pattern Recognition (ICPR) (2016) 3. Dosovitskiy, A., et al.: An image is worth 16x16 words: transformers for image recognition at scale. In: ICLR (2021) 4. Gabeur, V., Sun, C., Alahari, K., Schmid, C.: Multi-modal transformer for video retrieval. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12349, pp. 214–229. Springer, Cham (2020). https://doi.org/10.1007/ 978-3-030-58548-8 13 5. Gonz´ alez, A., et al.: Pedestrian detection at day/night time with visible and fir cameras: a comparison. Sensors 16(6), 820 (2016) 6. Guan, D., Cao, Y., Yang, J., Cao, Y., Yang, M.Y.: Fusion of multispectral data through illumination-aware deep neural networks for pedestrian detection. Inf. Fusion 50, 148–157 (2019) 7. Hwang, S., Park, J., Kim, N., Choi, Y., Kweon, I.S.: Multispectral pedestrian detection: benchmark dataset and baselines. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2015) 8. Iashin, V., Rahtu, E.: Multi-modal dense video captioning. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) (2020) 9. Jingjing Liu, Shaoting Zhang, S.W., Metaxas, D.: Multispectral deep neural networks for pedestrian detection. In: Proceedings of the British Machine Vision Conference (BMVC) (2016) 10. Kim, J., Kim, H., Kim, T., Kim, N., Choi, Y.: MLPD: multi-label pedestrian detector in multispectral domain. IEEE Robot. Autom. Lett. 6(4), 7846–7853 (2021) 11. Konig, D., Adam, M., Jarvers, C., Layher, G., Neumann, H., Teutsch, M.: Fully convolutional region proposal networks for multispectral person detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) (2017) 12. Krizhevsky, A., Sutskever, I., Hinton, G.E.: Imagenet classification with deep convolutional neural networks. In: Advances in Neural Information Processing Systems (2012) 13. Li, C., Song, D., Tong, R., Tang, M.: Multispectral pedestrian detection via simultaneous detection and segmentation. In: Proceedings of the British Machine Vision Conference (BMVC) (2018) 14. Li, C., Song, D., Tong, R., Tang, M.: Illumination-aware faster R-CNN for robust multispectral pedestrian detection. Pattern Recognit. 85, 161–171 (2019) 15. Li, G., Zhu, L., Liu, P., Yang, Y.: Entangled transformer for image captioning. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) (2019) 16. Li, Z., Li, Z., Zhang, J., Feng, Y., Zhou, J.: Bridging text and video: a universal multimodal transformer for audio-visual scene-aware dialog. IEEE/ACM Trans. Audio Speech Lang. Process. 29, 2476–2483 (2021) 17. Lin, M., Chen, Q., Yan, S.: Network in network. arXiv preprint arXiv:1312.4400 (2013)

Cross-Modality Attention and Multimodal Fusion Transformer

623

18. Lu, J., Batra, D., Parikh, D., Lee, S.: ViLBERT: pretraining task-agnostic visiolinguistic representations for vision-and-language tasks. In: Advances in Neural Information Processing Systems (2019) 19. Nagrani, A., Yang, S., Arnab, A., Jansen, A., Schmid, C., Sun, C.: Attention bottlenecks for multimodal fusion. In: Advances in Neural Information Processing Systems (2021) 20. Park, K., Kim, S., Sohn, K.: Unified multi-spectral pedestrian detection based on probabilistic fusion networks. Pattern Recognit. 80, 143–155 (2018) 21. Prakash, A., Chitta, K., Geiger, A.: Multi-modal fusion transformer for end-to-end autonomous driving. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2021) 22. Ren, S., He, K., Girshick, R., Sun, J.: Faster R-CNN: towards real-time object detection with region proposal networks. In: Advances in Neural Information Processing Systems (2015) 23. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) 24. Sun, C., Myers, A., Vondrick, C., Murphy, K., Schmid, C.: VideoBERT: a joint model for video and language representation learning. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) (2019) 25. Vaswani, A., et al.: Attention is all you need. In: Advances in Neural Information Processing Systems (2017) 26. Wang, W., Chen, C., Ding, M., Yu, H., Zha, S., Li, J.: TransBTS: multimodal brain tumor segmentation using transformer. In: International Conference on Medical Image Computing and Computer-Assisted Intervention (2021) 27. Xiao, Y., Yang, M., Li, C., Liu, L., Tang, J.: Attribute-based progressive fusion network for RGBT tracking (2022) 28. Xu, D., Ouyang, W., Ricci, E., Wang, X., Sebe, N.: Learning cross-modal deep representations for robust pedestrian detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2017) 29. Zhang, H., Fromont, E., Lef`evre, S., Avignon, B.: Guided attentive feature fusion for multispectral pedestrian detection. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) (2021) 30. Zhang, L., et al.: Cross-modality interactive attention network for multispectral pedestrian detection. Inf. Fusion 50, 20–29 (2019) 31. Zhang, L., Zhu, X., Chen, X., Yang, X., Lei, Z., Liu, Z.: Weakly aligned cross-modal learning for multispectral pedestrian detection. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) (2019) 32. Zhou, K., Chen, L., Cao, X.: Improving multispectral pedestrian detection by addressing modality imbalance problems. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12363, pp. 787–803. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58523-5 46

See Finer, See More: Implicit Modality Alignment for Text-Based Person Retrieval Xiujun Shu1 , Wei Wen1 , Haoqian Wu1 , Keyu Chen1 , Yiran Song1,3 , Ruizhi Qiao1(B) , Bo Ren2 , and Xiao Wang4(B) 1 Tencent Youtu Lab, Shenzhen, China {xiujunshu,jawnrwen,linuswu,yolochen,ruizhiqiao}@tencent.com, [email protected] 2 Tencent Youtu Lab, Hefei, China [email protected] 3 Shanghai Jiao Tong University, Shanghai, China 4 Anhui University, Hefei, China [email protected]

Abstract. Text-based person retrieval aims to find the query person based on a textual description. The key is to learn a common latent space mapping between visual-textual modalities. To achieve this goal, existing works employ segmentation to obtain explicitly cross-modal alignments or utilize attention to explore salient alignments. These methods have two shortcomings: 1) Labeling cross-modal alignments are time-consuming. 2) Attention methods can explore salient cross-modal alignments but may ignore some subtle and valuable pairs. To relieve these issues, we introduce an Implicit Visual-Textual (IVT) framework for text-based person retrieval. Different from previous models, IVT utilizes a single network to learn representation for both modalities, which contributes to the visual-textual interaction. To explore the fine-grained alignment, we further propose two implicit semantic alignment paradigms: multi-level alignment (MLA) and bidirectional mask modeling (BMM). The MLA module explores finer matching at sentence, phrase, and word levels, while the BMM module aims to mine more semantic alignments between visual and textual modalities. Extensive experiments are carried out to evaluate the proposed IVT on public datasets, i.e., CUHK-PEDES, RSTPReID, and ICFG-PEDES. Even without explicit body part alignment, our approach still achieves state-of-the-art performance. Code is available at: https://github.com/TencentYoutuResearch/PersonRetrieval-IVT. Keywords: Text-based person retrieval Cross-model retrieval

1

· Person search by language ·

Introduction

Person re-identification (re-ID) has many applications, e.g., finding suspects or lost children in surveillance, and tracking customers in supermarkets. As a subX. Shu and W. Wen—Equal contribution. c The Author(s), under exclusive license to Springer Nature Switzerland AG 2023  L. Karlinsky et al. (Eds.): ECCV 2022 Workshops, LNCS 13805, pp. 624–641, 2023. https://doi.org/10.1007/978-3-031-25072-9_42

See Finer, See More

625

Fig. 1. Illustration of the key idea of IVT. Previous methods use separate models to extract features, while we utilize a single network for both modalities. The shared parameters contribute to learning a common space mapping. Besides, we explore semantic alignment using three-level matchings. Not only see finer, but also see more semantic alignments.

task of person re-ID, text-based person retrieval (TPR) has attracted remarkable attention in recent years [23,39,48]. This is due to the fact that textual descriptions are easily accessible and can describe more details in a natural way. For example, police officers usually access surveillance videos and take the deposition from witnesses. Textual descriptions can provide complementary information and even are critical in scenes where images are missing. Text-based person retrieval needs to process visual and textual modalities, and its core is to learn a common latent space mapping between them. To achieve this goal, current works [16,42] firstly utilize different models to extract features, i.e., ResNet50 [14] for visual modality, and LSTM [46] or BERT [17] for textual modality. Then they are devoted to exploring visual-textual part pairs for semantic alignment. However, these methods have at least two drawbacks that may lead to suboptimal cross-modal matching. First, separate models lack modality interaction. Each model usually contains many layers with a large number of parameters, and it is difficult to achieve full interaction just using a matching loss at the end. To relieve this issue, some works [21,45] on general image-text pre-training use cross-attention to conduct interaction. However, they require to encode all possible image-text pairs to compute similarity scores, which leads to quadratic time complexity at the inference stage. How to design a more suitable network for the TPR task still needs profound thinking. Second, labeling visual-textual part pairs, e.g., head, upper body, and lower body, is time-consuming, and some pairs may be missing due to the variability of textual descriptions. For example, some text contains descriptions of hairstyles and pants, but others do not contain this information. Some researchers begin to explore implicit local alignment to mine part matching [39,48]. To ensure reliability, partial local matchings with high confidence are usually selected. However, these parts usually belong to salient regions that can be easily mined by global alignment, i.e., they do not bring additional information gain. According to our observation, local semantic matching should not only see finer, but also see more. Some subtle visual-textual cues, e.g., hairstyle and logo on clothes, maybe easily ignored but could be complementary to global matching. To solve the above problems, we first introduce an Implicit Visual-Textual (IVT) framework, which can learn representation for both modalities only using

626

X. Shu et al.

a single network (See Fig. 1(b)). This benefits from the merit that Transformer can operate on any modality that can be tokenized. To avoid the shortcomings of separate models and cross-attention, i.e., separate models lack modality interaction and cross-attention models are quite slow at the inference stage, IVT supports separate feature extracting to ensure retrieval speed and shares some parameters that contribute to learning a common latent space mapping. To explore fine-grained modality matching, we further propose two implicit semantic alignment paradigms: multi-level alignment (MLA) and bidirectional mask modeling (BMM). The two paradigms do not require extra manual labeling and can be easily implemented. Specifically, as shown in Fig. 1(c), MLA aims to explore finegrained alignment by using sentence, phrase, and word-level matchings. BMM shares a similar idea with MAE [13] and BEIT [2] in that both learn better representation through random masking. The difference is that the latter two aim at single-modal autoencoding-style reconstruction, while BMM does not reconstruct images but focuses on learning cross-modal matching. By masking a certain percentage of visual and textual tokens, BMM forces the model to mine more useful matching cues. The proposed two paradigms could not only see finer but also see more semantic alignments. Extensive experiments demonstrate the effectiveness on the TPR task. Our contributions can be summarized as three folds: (1) We propose to tackle the modality alignment from the perspective of backbone network and introduce an Implicit Visual-Textual (IVT) framework. This is the first unified framework for text-based person retrieval. (2) We propose two implicit semantic alignment paradigms, i.e., MLA and BMM, which enable the model to mine finer and more semantic alignments. (3) We conduct extensive experiments and analyses on three public datasets. Experimental results show that our approach achieves state-of-the-art performance.

2

Related Work

Text-Based Person Retrieval. Considering the great potential economic and social value of text-based person retrieval, Li et al. propose the first benchmark dataset CUHK-PEDES [23] in 2017, and also build a baseline, i.e., GNA-RNN, based on LSTM network. Early works utilize ResNet50 and LSTM to learn representations for visual-textual modalities, and then utilize matching loss to align them. For example, CMPM [47] associates the representations across different modalities using KL divergence. Besides aligning the features, some works study the identity cue [22], which helps learn discriminative representations. Since text-based person retrieval requires fine-grained recognition of human bodies, later works start to explore global and local associations. Some works [4,5] utilize visual-textual similarity to mine part alignments. ViTAA [42] segments the human body and utilizes k-reciprocal sampling to associate the visual and textual attributes. Surbhi et al. [1] propose to create semantic-preserving embeddings through attribute prediction. Since visual and textual attributes require pre-processing, more works attempt to use attention mechanisms to explore finegrained alignment. PMA [16] proposes a pose-guided multi-granularity attention

See Finer, See More

627

network. HGAN [48] splits the images into multi-scale stripes and utilizes attention to select top-k part alignments. Other works include adversarial learning and relation modeling. TIMAM [30] learns modality-invariant feature representations using adversarial and cross-modal matching objectives. A-GANet [26] introduces the textual and visual scene graphs consisting of object properties and relationships. In summary, most current works learn modality alignment by exploiting local alignments. In this work, we study the modality alignment from different perspectives, in particular, how to obtain full modality interaction and how to achieve local alignment simply. The proposed framework can effectively address these issues and achieve satisfying performance. Transformer and Image-Text Pre-training. Transformer [38] is firstly proposed for machine translation in the natural language processing (NLP) community. After that, many follow-up works are proposed and set new state-of-theart one after another, such as BERT [17], GPT series [3,34,35]. The research on Transformer-based representations in computer vision is also becoming a hot spot. Early works like ViT [9] and Swin Transformer [27] take the dividing patches as input, like the discrete tokens in NLP, and achieve state-of-the-art performance on many downstream tasks. Benefiting from the merit that Transformer can operate on any modality that can be tokenized, it has been utilized in the multi- or cross-modal tasks intuitively [41]. Lu et al. [29] propose the ViLBERT to process both visual and textual inputs in separate streams that interact through co-attentional Transformer layers. Oscar [24] and VinVL [45] take the image, text, and category tags as inputs and find that the category information and stronger object detector can bring better results. Many recent works study which architecture is better for multi- or cross-modal tasks, e.g., UNITER [6], ALBEF [21], and ViLT [18]. These works generally employ several training objectives to support multiple downstream vision-language tasks. The most relevant downstream task for us is image-text retrieval. To obtain modality interaction, these works generally employ cross-attention in the fusion blocks. However, they have a very slow retrieval speed at the inference stage because they need to predict the similarity of all possible image-text pairs. The ALIGN [15] and CLIP [33] are large-scale vision-language models pre-trained using only contrastive matching. These separate models are suitable for imagetext retrieval, but they generally achieve satisfying performance in zero-shot settings. Some works, e.g., switch Transformers [10], VLMO [40], attempt to optimize the network structure so that both retrieval and other visual-language tasks can be supported. This paper is greatly inspired by these works and aims to relieve the modality alignment for text-based person retrieval.

3 3.1

Methodology Overview

To tackle the modality alignment in text-based person retrieval, we propose an Implicit Visual-Textual (IVT) framework as shown in Fig. 2. It consists of a unified visual-textual network and two implicit semantic alignment paradigms, i.e.,

628

X. Shu et al.

Fig. 2. Architecture of the proposed IVT Framework. It consists of a unified visual-textual network and two implicit semantic alignment paradigms, i.e., multilevel alignment (MLA) and bidirectional mask modeling (BMM). The unified visualtextual network contains parameter-shared and specific modules, which contribute to learning common space mapping. MLA aims to see “finer” by exploring local and global alignments from three-level matchings. BMM aims to see “more” by mining more semantic alignments from random masking for both modalities.

multi-level alignment (MLA) and bidirectional mask modeling (BMM). One key idea of IVT lies in tackling the modality alignment using a unified network. By sharing some modules, e.g., the layer normalization and multi-head attention, the unified network contributes to learning common space mapping between visual and textual modalities. It can also learn modality-specific cues using different modules. The two implicit semantic alignment paradigms, i.e., MLA and BMM, are proposed to explore fine-grained alignment. Different from previous methods that use manually processed parts or select salient parts from attention, the two paradigms could mine not only finer but also more semantic alignments, which is another key idea of our proposed IVT. 3.2

Unified Visual-Textual Network

Embedding. As shown in Fig. 2, the input are image-text pairs, which provide the appearance characteristics of a person from visual and textual modalities. Let the image-text pairs denoted as {xi , ti , yi }|N i=1 , where xi , ti , yi denote the image, text, and identity label, respectively. N is the total number of samples. For an input image xi ∈ RH×W ×C , it is firstly split into K = H · W/P 2 patches, where

See Finer, See More

629

P denotes the patch size, and then linearly projected into patch embeddings {fkv }|K k=1 . This operation can be realized using a single convolutional layer. The v , and patch embeddings are then prepended with a learnable class token fcls v v . added with a learnable position embedding fpos and a type embedding ftype v v v v fv = [fcls , f1v , ..., fK ] + fpos + ftype .

(1)

For the input text ti , it generally consists of one or several sentences and each sentence has a sequence of words. The pretrained word embedding is leveraged to project words into token vectors: t t t t t ft = [fcls , f1t , ..., fM , fsep ] + fpos + ftype ,

(2)

t t where fcls and fsep denote the start and end tokens. M indicates the length of t t is the position embedding and ftype is the type tokenized subword units. fpos embedding.

Visual-Textual Encoder. Current works on the TPR task utilize separate models, which lack full modality interaction. Some recent works on general image-text pre-training attempt to utilize cross-attention to achieve modality interaction. However, cross-attention requires encoding all possible image-text pairs at the inference stage, which leads to a very slow retrieval speed. Based on these observations, we propose to take the unified visual-textual network for the TPR task. The unified network has a quick retrieval speed and supports modality interaction. As shown in Fig. 2, the network follows a standard architecture of ViT [9] and stacks L blocks in total. In each block, two modalities share layer normalization (LN) and multi-head self-attention (MSA), which contribute to learning common space mapping between visual and textual modalities. It is because the shared parameters help learn common data statistics. For example, LN would calculate the mean and standard deviation of input token embeddings, and shared LN would learn the statistically common values of both modalities. This can be regarded as a “modality interaction” from a data-level perspective. As the visual and textual modalities are not the same, each block has modality-specific feedforward layers, i.e., the “Img” and “Txt” modules in Fig. 2. They are used to capture modality-specific information by switching to visual or textual inputs. The complete processing in each block can be denoted as follows: v/t

= MSA(LN(fi−1 )) + fi−1 ,

v/t

= MLPimg/txt (LN(fi )) + fi ,

fi fi

v/t

v/t

v/t

(3) v/t

(4)

where LN denotes the layer normalization and MSA denotes the multi-head v/t attention. i is the index of the blocks. fi−1 is the visual or textual output of the (i − 1)th block and also the input of the ith block. MLPimg/txt denotes the v/t

modality-specific feed-forward layers. fi

is the output of the ith block.

630

X. Shu et al.

Output. The class token of the last block serves as the global representation, i.e., fimg and ftxt in Fig. 2. The dimension of both feature vectors are 768, and the outputs are normalized using the LN layer. 3.3

Implicit Semantic Alignment

Multi-level Alignment. Fine-grained alignment has been demonstrated to be the key to achieving performance improvement, such as segmented attributes [42] and stripe-based parts [11,48]. These methods can be regarded as explicit part alignment, namely telling the model which visual-textual parts should be aligned. In this work, we propose an implicit alignment method, i.e., multi-level alignment, which is intuitive but very effective. As shown in Fig. 2, the input image is firstly augmented to get three types of augmented ones, e.g., horizontal flipping and random cropping. The input text generally consists of one or several sentences. We split them into more short sentences according to periods and commas. These short sentences are regarded as “phrase-level” representation, which describe partial appearance characteristics of the human body. To mine finer parts, we further utilize the Natural Language Toolkit (NLTK) [28] to extract nouns and adjectives, which describe specific local characteristics, e.g., bag, clothes, or pants. The three-level textual descriptions, i.e., sentence-level, phrase-level, and word-level, correspond to the three types of augmented images. The three-level image-text pairs are randomly generated at each iteration. In this way, we construct a matching process that gradually refines from global to local, forcing the model to mine finer semantic alignments. The major difference between our approach and previous work is that we do not explicitly define visual semantic parts, instead, automatically explore aligned visual parts guided by three-level textual descriptions. This is due to the following observations that inspire our TPR framework: 1) Previous explicit aligned methods lack inconsistency between the training and inference phases. During training, previous methods utilize an unsupervised way to explore local alignments, e.g., select the top-k salient alignments based on similarities. During the inference stage, only the global embeddings for each modality would be used, resulting in the inconsistent issue. 2) The explicit local alignment makes the training easier but makes it harder at the inference stage. Oversimpilified task design leads to worse generalization performance of the model. Even though we do not provide visual parts, but rather the full images, the model tries hard to mine local alignment at the training stage, thus achieving better performance at the inference stage due to the consistency. Bidirectional Mask Modeling. To automatically mine local alignment, recent methods [11,48] split images into stripes and utilize attention to select top-k part alignments. However, they ignore the fact that top-k part alignments are usually salient cues, which may have been mined by the global alignment. Therefore, these parts may bring limited information gains. We argue that local alignments

See Finer, See More

631

should not only be finer, but also more diverse. Some subtle visual-textual cues may be complementary to global alignment. As shown in Fig. 2, we propose a bidirectional mask modeling (BMM) method to mine more semantic alignments. For the image and text tokens, we randomly mask some percentage of them and then force the visual and textual outputs to keep in alignment. In general, the masked tokens correspond to specific patches of the image or words of the text. If specific patches or words are masked, the model would try to mine useful alignment cues from other patches or words. Let us take the lady in Fig. 2 as an example, if the salient words “orange coat” and “black pants” are masked, the model would pay more attention to other words, e.g., ponytail hairstyle, red backpack. In this way, more subtle visual-textual alignments can be explored. At the training stage, this method makes it more difficult for the model to align image and text but helps it to mine more semantic alignments at the inference stage. The above method shares a similar idea to Random Erasing [50], MAE [13], and BEIT [2]. However, Random Erasing masks only one region. MAE and BEIT aim at image reconstruction using an autoencoding style. The proposed BMM does not reconstruct images but focuses on cross-modal matching. 3.4

Loss Function

We utilize the commonly used cross-modal projection matching (CMPM) loss [47] to learn visual-textual alignment, which is defined as follows: Lcmpm =

B B 1  pi,j  pi,j · log , B i=1 j=1 qi,j + 

exp(fiT · fj ) , pi,j = B T k=1 exp(fi · fk )

yi,j qi,j = B , k=1 yi,k

(5) (6)

where pi,j denotes the matching probability. fi and fj denote the global features of different modalities. B is the mini-batch size. qi,j denotes the normalized true matching probability.  is a small number to avoid numerical problems. The CMPM loss represents the KL divergence from distribution q to p. Following previous work [47], the matching loss is computed in two directions, i.e., image-to-text and text-to-image. The total loss can be denoted as follows: t2v L = Lcmpm + Lv2t cmpm .

4 4.1

(7)

Experiment Experimental Setup

Datasets. We evaluate our approach on three benchmark datasets, i.e., CUHK-PEDES [23], RSTPReid [51], and ICFG-PEDES [8]. Specifically,

632

X. Shu et al.

Table 1. Comparison with SOTA methods on CUHK-PEDES.

Method            R1     R5     R10
CNN-RNN [36]      8.07   -      32.47
GNA-RNN [23]      19.05  -      53.64
PWM-ATH [5]       27.14  49.45  61.02
GLA [4]           43.58  66.93  76.26
MIA [31]          53.10  75.00  82.90
A-GANet [26]      53.14  74.03  81.95
ViTAA [42]        55.97  75.84  83.52
IMG-Net [43]      56.48  76.89  85.01
CMAAM [1]         56.68  77.18  84.86
HGAN [48]         59.00  79.49  86.60
NAFS [11]         59.94  79.86  86.70
DSSL [51]         59.98  80.41  87.56
MGEL [39]         60.27  80.01  86.74
SSAN [8]          61.37  80.15  86.73
NAFS [11]         61.50  81.19  87.51
TBPS [12]         61.65  80.98  86.78
TIPCB [7]         63.63  82.82  89.01
Baseline (Ours)   55.75  75.68  84.13
IVT (Ours)        65.59  83.11  89.21

Table 2. Comparison with SOTA methods on RSTPReid.

Method            R1     R5     R10
DSSL [51]         32.43  55.08  63.19
Baseline (Ours)   37.40  60.90  70.80
IVT (Ours)        46.70  70.00  78.80

Table 3. Comparison with SOTA methods on ICFG-PEDES.

Method            R1     R5     R10
Dual Path [49]    38.99  59.44  68.41
CMPM+CMPC [47]    43.51  65.44  74.26
MIA [31]          46.49  67.14  75.18
SCAN [20]         50.05  69.65  77.21
ViTAA [42]        50.98  68.79  75.78
SSAN [8]          54.23  72.63  79.53
Baseline (Ours)   44.43  63.50  71.00
IVT (Ours)        56.04  73.60  80.22

CUHK-PEDES [23] contains 40,206 images of 13,003 persons and 80,440 description sentences. It is split into a training set with 34,054 images and 68,126 description sentences, a validation set with 3,078 images and 6,158 description sentences, and a testing set with 3,074 images and 6,156 description sentences. RSTPReid [51] is collected from MSMT17 [44] and contains 20,505 images of 4,101 persons. Each image has two sentences and each sentence is no shorter than 23 words. The training, validation, and testing sets have 3,701, 200, and 200 identities, respectively. ICFG-PEDES [8] is also collected from MSMT17 [44] and contains 54,522 images of 4,102 persons. Each image has one description sentence with an average of 37.2 words. The training and testing subsets contain 34,674 image-text pairs of 3,102 persons and 19,848 image-text pairs of the remaining 1,000 persons, respectively. We also explore pre-training on four image captioning datasets: Conceptual Captions (CC) [37], SBU Captions [32], COCO [25], and Visual Genome (VG) [19], which together provide about 4M image-text pairs. Evaluation Metric. The cumulative matching characteristic (CMC) curve is a precision curve that provides the recognition precision at each rank. Following previous works, R1, R5, and R10 are reported when comparing with state-of-the-art (SOTA) models. The mean average precision (mAP), i.e., the average precision over all queries, is also reported in the ablation studies for future comparison. Implementation Details. The proposed framework follows the standard architecture of ViT-Base [9]. The patch size is set to 16 × 16 and the dimensions of


both visual and textual features are 768. The input images are resized to 384 × 128. We use horizontal flipping and random cropping for data augmentation. At the pre-training stage, we utilize 64 Nvidia Tesla V100 GPUs with FP16 training. At the fine-tuning stage, we employ four V100 GPUs and set the mini-batch size to 28 per GPU. SGD is used as the optimizer with a weight decay of 1e-4. The learning rate is initialized to 5e-3 with cosine learning rate decay.
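As a side note on the R1/R5/R10 metric used throughout the experiments, a minimal sketch of rank-k retrieval accuracy over a text-to-image similarity matrix is shown below; the function name and tensor layout are our own illustrative choices, not the evaluation code of the paper.

```python
import torch

def rank_k_accuracy(sim: torch.Tensor, query_ids: torch.Tensor,
                    gallery_ids: torch.Tensor, ks=(1, 5, 10)):
    """Compute R@k for text-to-image retrieval.

    sim:         (num_queries, num_gallery) similarity matrix.
    query_ids:   (num_queries,) person identity of each text query.
    gallery_ids: (num_gallery,) person identity of each gallery image.
    """
    order = sim.argsort(dim=1, descending=True)        # ranked gallery indices per query
    ranked_ids = gallery_ids[order]                    # identities in ranked order
    match = ranked_ids == query_ids.unsqueeze(1)       # True where a correct image appears
    return {k: match[:, :k].any(dim=1).float().mean().item() for k in ks}
```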

4.2 Comparison with State-of-the-Art Methods

In this section, we report our experimental results and compare with other SOTA methods on CUHK-PEDES [23], RSTPReid [51], and ICFG-PEDES [8]. Note that the Baseline in Table 1, Table 2, and Table 3 denotes the vanilla IVT without the pre-training, bidirectional mask modeling (BMM), and multi-level alignment (MLA) components. We employ BERT [17] to initialize the "txt" module and the ImageNet pre-trained model from [9] to initialize the other modules.

Table 4. Ablation results of components on CUHK-PEDES. "Base" denotes our baseline method; "Pre" is short for pre-training.

No.  Components              R1     R5     R10    mAP
1    Base                    55.75  75.68  84.13  53.36
2    Base + Pre              60.06  78.56  85.22  56.64
3    Base + BMM              60.43  79.55  86.19  56.65
4    Base + MLA              61.00  80.60  87.23  56.88
5    Base + Pre + BMM        62.88  81.60  87.54  59.34
6    Base + Pre + MLA        63.87  82.67  88.42  59.52
7    Base + BMM + MLA        64.00  82.72  88.95  58.99
8    Base + Pre + BMM + MLA  65.59  83.11  89.21  60.66

Table 5. The computational efficiency of several methods. "Time" denotes the retrieval time for testing CUHK-PEDES on a Tesla V100 GPU.

Method      Architecture     Para (M)  Time (s)
ViLT [18]   Transformer      96.50     103,320
ALBEF [21]  Transformer      209.56    12,240
NAFS [11]   ResNet + BERT    189.00    78
SSAN [8]    ResNet + LSTM    97.86     31
TBPS [12]   ResNet + BiGRU   84.83     26
IVT         Transformer      166.45    42

Results on CUHK-PEDES. As shown in Table 1, our baseline method achieves 55.75%, 75.68%, and 84.13% on R1, R5, and R10, respectively. It already achieves comparable or even better performance than many recent works, e.g., MIA [31], ViTAA [42], and CMAAM [1]. These experiments demonstrate the effectiveness of the unified visual-textual network for text-based person retrieval. In comparison, our proposed IVT obtains 65.59%, 83.11%, and 89.20% on these metrics, which is significantly better than our baseline, with considerable improvements of +9.84%, +7.43%, and +5.07%, respectively. It should be noted that many recent SOTA algorithms adopt complex operations, e.g., segmentation, attention, or adversarial learning. Even so, our approach outperforms existing SOTA algorithms, e.g., DSSL [51] and NAFS [11], while being easy to implement. These results fully validate the effectiveness of our approach for text-based person retrieval.


Results on RSTPReid. As RSTPReid is newly released, only DSSL [51] has reported results on it. As shown in Table 2, DSSL [51] achieves 32.43%, 55.08%, and 63.19% on R1, R5, and R10, respectively. In contrast, the proposed method achieves 46.70%, 70.00%, and 78.80%, exceeding DSSL [51] by a large margin, i.e., +14.27%, +14.92%, and +15.61%. It should be noted that even our baseline method exceeds DSSL, which benefits from our unified visual-textual network. Moreover, our IVT outperforms the baseline by a large margin. These experiments fully validate the advantages of our proposed modules.

Results on ICFG-PEDES. The experimental results on the ICFG-PEDES dataset are reported in Table 3. The baseline method achieves 44.43%, 63.50%, and 71.00% on R1, R5, and R10, respectively, while our proposed IVT achieves 56.04%, 73.60%, and 80.22%, which again validates the effectiveness of our proposed modules for the TPR task. Compared with other SOTA algorithms, e.g., SSAN [8] and ViTAA [42], our results are also significantly better. Notably, even without complex operations to mine local alignments, IVT still achieves SOTA performance.

In summary, our IVT yields the best performance in terms of all metrics on the three benchmark datasets. The superior performance is due not only to the well-designed unified visual-textual network, but also to the effective implicit semantic alignment paradigms. We hope our work can bring new insights to the text-based person retrieval community.

4.3 Ablation Study

To better understand the contribution of each component in our framework, we conduct a comprehensive empirical analysis in this section. The results of the different components of our framework on the CUHK-PEDES [23] dataset are shown in Table 4.

Effectiveness of Multi-level Alignment. As shown in Table 4, the Baseline achieves 55.75% and 53.36% on R1 and mAP, respectively. After introducing the MLA module, the overall performance is improved to 61.00% and 56.88%, i.e., gains of +5.25% and +3.52%, respectively. These results demonstrate the effectiveness of the proposed MLA strategy. Further analysis shows that MLA enables the model to mine fine-grained matching through sentence-, phrase-, and word-level alignments, which in turn improves the visual and textual representations.

Effectiveness of Bidirectional Mask Modeling. The BMM strategy also plays an important role in our framework, as shown in Table 4. Comparing No. 1 and No. 3, R1 and mAP are improved from 55.75% and 53.36% to 60.43% and 56.65%, i.e., gains of +4.68% and +3.29%, respectively. These results fully validate the effectiveness of the BMM strategy for text-based person retrieval.

Effectiveness of Pre-training. Even without pre-training, IVT achieves 64% R1 accuracy (see No. 7), outperforming current SOTA methods, e.g., NAFS (61.50%) and TBPS (61.65%).

Fig. 3. The top-5 ranking results. The green/red boxes denote the true/false results. (Color figure online)

Fig. 4. Comparison of heat maps between the baseline method and IVT.

To obtain better-generalized features, we pre-train our model using a large-scale image-text corpus. As illustrated in Table 4, the overall performance is again improved significantly: R1 and mAP increase from 55.75% and 53.36% to 60.06% and 56.64% with pre-training (see No. 1 and No. 2). In addition, comparing No. 5/No. 7 with No. 8 in Table 4 also shows that pre-training improves the final results. We therefore conclude that pre-training indeed brings more generalized features, which further boost the final matching accuracy.

Fig. 5. Ablation results of masking ratios on CUHK-PEDES.

Effects of Masking Ratio. The BMM method requires setting a masking ratio, and this section studies its effect on the final performance. The ablation results are shown in Fig. 5. The experiments are conducted on CUHK-PEDES [23] and only the "Baseline+BMM" method is utilized. As shown in Fig. 5, the performance has been


improved gradually as the masking ratio increases. When the masking ratio is set to zero, which is equivalent to the baseline method, the performance is the worst. R1 and mAP reach their peak values when the ratio is set to 0.3. As the masking ratio continues to increase beyond that, the performance gradually decreases. This is because the model cannot mine enough semantic alignments when the masking ratio is too large, which reduces the final performance.

4.4 Qualitative Results


Top-5 Ranking Results. As shown in Fig. 3, we give three examples of the top-5 ranking results. Overall, the retrieved top-5 images show high correlations between the visual attributes and the textual descriptions, even for the false matching results. Compared with the baseline method, our proposed IVT retrieves more positive samples, because it can capture more fine-grained alignments. For example, Fig. 3(b) searches for a person with "white shirt, black jacket with red and white stripe on it and black pants". For the baseline method, the top-1 retrieved image has all these attributes but ignores other details, e.g., "older man". IVT captures this detail and even captures the attribute "carrying a bag in his right hand". Figure 3(c) shows a difficult case. All the retrieved images have the attributes "wearing a striped polo shirt, jeans and sneakers", but all are negative samples for the baseline method. Specifically, the baseline method ignores the description "holding a bag on his right hand", while our IVT captures this detail and retrieves more positive samples.

Fig. 6. Visualization of part alignment between visual and textual modalities. (a) Upper Body, (b) Backpack, (c) Lower Body, (d) Shoes. We compute the similarities between word-level text and all image patches. The brighter the part, the higher the similarity.

Visualization of Sentence-Level Heat Map. To better understand the visual and textual alignment, we give some visualizations of sentence-level heat maps in Fig. 4. The heat maps are obtained by visualizing the similarities between the textual [CLS] token and all visual tokens. The brighter a region is, the more similar it is to the text. In general, the textual descriptions correspond well to the human bodies, demonstrating that the model has learned the semantic relevance between the visual and textual modalities. Compared with the baseline method, IVT focuses on more of the attributes of the human body described by the text. For example, the man (1st


row, 1st column) is wearing grey shorts. The baseline method ignores this attribute, but IVT captures it. Hence, the proposed IVT responds to more accurate and diverse person attributes than the baseline method.

Visualization of Local Alignment. To validate the ability of fine-grained alignment, we further conduct word-level alignment experiments. As shown in Fig. 6, we give four types of human attributes, i.e., upper body, backpack, lower body, and shoes; each row in Fig. 6 shares the same attribute. The brighter an area in the image, the more similar it is to the given textual attribute description. As shown in Fig. 6, our method can recognize not only salient body regions, e.g., clothes and pants, but also subtle parts, e.g., handbags and shoes. These visualization results show that our model focuses on exactly the correct body parts given a word-level attribute description, indicating that our approach is capable of exploring fine-grained alignments even without explicit visual-textual part alignment. This benefits from the two proposed implicit semantic alignment paradigms, i.e., MLA and BMM. We can therefore conclude that our proposed method indeed achieves See Finer and See More for text-based person retrieval.

4.5 Computational Efficiency Analysis

In this section, we analyze the number of parameters and the retrieval time at the inference stage. As shown in Table 5, we mainly compare recent methods in the TPR field, e.g., NAFS [11], SSAN [8], and TBPS [12], and typical methods in general image-text retrieval, e.g., ViLT [18] and ALBEF [21]. Since LSTM/BiGRU backbones have fewer parameters than a Transformer, our IVT has more parameters than SSAN and TBPS but comparable retrieval time; moreover, the performance of these methods is limited by the text-modeling ability of LSTM. Due to the use of non-local attention, NAFS has 189M parameters and its retrieval time reaches 78 s, both exceeding our IVT. Compared with general image-text retrieval methods, e.g., ViLT and ALBEF, our IVT has a significant advantage in retrieval time. Specifically, ViLT needs 103,320 s to test CUHK-PEDES, whereas our IVT only needs 42 s. This is because such methods need to encode all possible image-text pairs, rather than extracting features only once. Overall, our method is competitive in terms of both parameters and retrieval efficiency.

5 Discussion

From the failure cases in Fig. 3, we find two characteristics of the TPR task. First, the text description is usually not comprehensive and corresponds to only part of the visual features. Second, the textual representation tends to ignore subtle features, especially for relatively long descriptions. From the extensive experiments, we draw two conclusions: 1) a unified network is effective for the TPR task and may be regarded as a backbone network in the future; 2) more subtle part alignments should be mined, rather than only the salient part pairs. Even without complex operations, the proposed approaches can still mine


fine-grained semantic alignments and achieve satisfactory performance. We hope they can bring new insights to the TPR community.

6 Conclusion

This paper proposes to tackle modality alignment from two perspectives: the backbone network and implicit semantic alignment. First, an Implicit Visual-Textual (IVT) framework is introduced for text-based person retrieval, which learns visual and textual representations with a single network. Benefiting from its architecture of shared and modality-specific modules, it can guarantee both retrieval speed and modality interaction. Second, two implicit semantic alignment paradigms, i.e., BMM and MLA, are proposed to explore fine-grained alignment. The two paradigms see "finer" through three-level matching and see "more" by mining more semantic alignments. Extensive experimental results on three public datasets demonstrate the effectiveness of the proposed IVT framework for text-based person retrieval.

7 Broader Impact

Text-based person retrieval has many potential applications in surveillance, e.g., finding suspects, lost children, or elderly people. This technology can enhance the safety of the cities we live in. This work demonstrates the effectiveness of a unified network and implicit alignments for the TPR task. The potential negative impact is that surveillance data about pedestrians may cause privacy breaches. Hence, the data collection process should be consented to by the pedestrians, and data utilization should be regulated.

Acknowledgement. This work is supported by the National Natural Science Foundation of China (No. 62102205).

References 1. Aggarwal, S., Radhakrishnan, V.B., Chakraborty, A.: Text-based person search via attribute-aided matching. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), pp. 2617–2625 (2020) 2. Bao, H., Dong, L., Wei, F.: BEit: BERT pre-training of image transformers. In: International Conference on Learning Representations (ICLR) (2022) 3. Brown, T., et al.: Language models are few-shot learners. In: Advances in Neural Information Processing Systems (NeurIPS), vol. 33, pp. 1877–1901 (2020) 4. Chen, D., et al.: Improving deep visual representation for person re-identification by global and local image-language association. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 54–70 (2018) 5. Chen, T., Xu, C., Luo, J.: Improving text-based person search by spatial matching and adaptive threshold. In: 2018 IEEE Winter Conference on Applications of Computer Vision (WACV), pp. 1879–1887 (2018)


6. Chen, Y.-C., et al.: UNITER: UNiversal image-TExt representation learning. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12375, pp. 104–120. Springer, Cham (2020). https://doi.org/10.1007/978-3-03058577-8 7 7. Chen, Y., Zhang, G., Lu, Y., Wang, Z., Zheng, Y.: TIPCB: a simple but effective part-based convolutional baseline for text-based person search. Neurocomputing 494, 171–181 (2022) 8. Ding, Z., Ding, C., Shao, Z., Tao, D.: Semantically self-aligned network for text-toimage part-aware person re-identification. arXiv preprint arXiv:2107.12666 (2021) 9. Dosovitskiy, A., et al.: An image is worth 16x16 words: transformers for image recognition at scale. In: International Conference on Learning Representations (ICLR) (2020) 10. Fedus, W., Zoph, B., Shazeer, N.: Switch transformers: scaling to trillion parameter models with simple and efficient sparsity. arXiv preprint arXiv:2101.03961 (2021) 11. Gao, C., et al.: Contextual non-local alignment over full-scale representation for text-based person search. arXiv preprint arXiv:2101.03036 (2021) 12. Han, X., He, S., Zhang, L., Xiang, T.: Text-based person search with limited data. In: The British Machine Vision Conference (BMVC) (2021) 13. He, K., Chen, X., Xie, S., Li, Y., Doll´ ar, P., Girshick, R.: Masked autoencoders are scalable vision learners. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2022) 14. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 770–778 (2016) 15. Jia, C., et al.: Scaling up visual and vision-language representation learning with noisy text supervision. In: International Conference on Machine Learning (ICML), pp. 4904–4916. PMLR (2021) 16. Jing, Y., Si, C., Wang, J., Wang, W., Wang, L., Tan, T.: Pose-guided multigranularity attention network for text-based person search. In: Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), vol. 34, pp. 11189–11196 (2020) 17. Kenton, J.D.M.W.C., Toutanova, L.K.: Bert: pre-training of deep bidirectional transformers for language understanding. In: Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT), pp. 4171–4186 (2019) 18. Kim, W., Son, B., Kim, I.: ViLT: vision-and-language transformer without convolution or region supervision. In: International Conference on Machine Learning (ICML), pp. 5583–5594 (2021) 19. Krishna, R., et al.: Visual genome: connecting language and vision using crowdsourced dense image annotations. Int. J. Comput. Vis. (IJCV) 123(1), 32–73 (2017) 20. Lee, K.H., Chen, X., Hua, G., Hu, H., He, X.: Stacked cross attention for imagetext matching. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 201–216 (2018) 21. Li, J., Selvaraju, R., Gotmare, A., Joty, S., Xiong, C., Hoi, S.C.H.: Align before fuse: vision and language representation learning with momentum distillation. In: Advances in Neural Information Processing Systems (NeurIPS), vol. 34 (2021) 22. Li, S., Xiao, T., Li, H., Yang, W., Wang, X.: Identity-aware textual-visual matching with latent co-attention. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV), pp. 1890–1899 (2017) 23. Li, S., Xiao, T., Li, H., Zhou, B., Yue, D., Wang, X.: Person search with natural language description. 
In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1970–1979 (2017)


24. Li, X., et al.: Oscar: object-semantics aligned pre-training for vision-language tasks. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12375, pp. 121–137. Springer, Cham (2020). https://doi.org/10.1007/ 978-3-030-58577-8 8 25. Lin, T.-Y., et al.: Microsoft COCO: common objects in context. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8693, pp. 740–755. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-10602-1 48 26. Liu, J., Zha, Z.J., Hong, R., Wang, M., Zhang, Y.: Deep adversarial graph attention convolution network for text-based person search. In: Proceedings of the 27th ACM International Conference on Multimedia (MM), pp. 665–673 (2019) 27. Liu, Z., et al.: Swin transformer: hierarchical vision transformer using shifted windows. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pp. 10012–10022 (2021) 28. Loper, E., Bird, S.: NLTK: the natural language toolkit. arXiv preprint cs/0205028 (2002) 29. Lu, J., Batra, D., Parikh, D., Lee, S.: ViLBERT: pretraining task-agnostic visiolinguistic representations for vision-and-language tasks. In: Advances in Neural Information Processing Systems (NeurIPS), vol. 32 (2019) 30. Sarafianos, N., Xu, X., Kakadiaris, I.A.: Adversarial representation learning for text-to-image matching. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pp. 5813–5823 (2019) 31. Niu, K., Huang, Y., Ouyang, W., Wang, L.: Improving description-based person re-identification by multi-granularity image-text alignments. IEEE Trans. Image Process. (TIP) 29, 5542–5556 (2020) 32. Ordonez, V., Kulkarni, G., Berg, T.: Im2text: describing images using 1 million captioned photographs. In: Advances in Neural Information Processing Systems (NeurIPS), vol. 24 (2011) 33. Radford, A., et al.: Learning transferable visual models from natural language supervision. In: International Conference on Machine Learning (ICML), pp. 8748– 8763. PMLR (2021) 34. Radford, A., Narasimhan, K., Salimans, T., Sutskever, I.: Improving language understanding by generative pre-training (2018) 35. Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I., et al.: Language models are unsupervised multitask learners. OpenAI Blog 1(8), 9 (2019) 36. Reed, S., Akata, Z., Lee, H., Schiele, B.: Learning deep representations of finegrained visual descriptions. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 49–58 (2016) 37. Sharma, P., Ding, N., Goodman, S., Soricut, R.: Conceptual captions: a cleaned, hypernymed, image alt-text dataset for automatic image captioning. In: Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (ACL), pp. 2556–2565 (2018) 38. Vaswani, A., et al.: Attention is all you need. In: Advances in Neural Information Processing Systems (NeurIPS), vol. 30 (2017) 39. Wang, C., Luo, Z., Lin, Y., Li, S.: Text-based person search via multi-granularity embedding learning. In: International Joint Conference on Artificial Intelligence (IJCAI), pp. 1068–1074 (2021) 40. Wang, W., Bao, H., Dong, L., Wei, F.: VLMo: unified vision-language pre-training with mixture-of-modality-experts. arXiv preprint arXiv:2111.02358 (2021) 41. Wang, X., et al.: Large-scale multi-modal pre-trained models: a comprehensive survey (2022). https://github.com/wangxiao5791509/MultiModal BigModels Survey


42. Wang, Z., Fang, Z., Wang, J., Yang, Y.: ViTAA: visual-textual attributes alignment in person search by natural language. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12357, pp. 402–420. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58610-2 24 43. Wang, Z., Zhu, A., Zheng, Z., Jin, J., Xue, Z., Hua, G.: Img-net: inner-cross-modal attentional multigranular network for description-based person re-identification. J. Electron. Imaging (JEI) 29(4), 043028 (2020) 44. Wei, L., Zhang, S., Gao, W., Tian, Q.: Person transfer GAN to bridge domain gap for person re-identification. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 79–88 (2018) 45. Zhang, P., et al.: VinVL: revisiting visual representations in vision-language models. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 5579–5588 (2021) 46. Zhang, S., Zheng, D., Hu, X., Yang, M.: Bidirectional long short-term memory networks for relation classification. In: Proceedings of the 29th Pacific Asia Conference on Language, Information and Computation (PACLIC), pp. 73–78 (2015) 47. Zhang, Y., Lu, H.: Deep cross-modal projection learning for image-text matching. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 686–701 (2018) 48. Zheng, K., Liu, W., Liu, J., Zha, Z.J., Mei, T.: Hierarchical gumbel att ention network for text-based person search. In: Proceedings of the 28th ACM International Conference on Multimedia (MM), pp. 3441–3449 (2020) 49. Zheng, Z., Zheng, L., Garrett, M., Yang, Y., Xu, M., Shen, Y.D.: Dual-path convolutional image-text embeddings with instance loss. ACM Trans. Multimedia Comput. Commun. Appl. (TOMM) 16(2), 1–23 (2020) 50. Zhong, Z., Zheng, L., Kang, G., Li, S., Yang, Y.: Random erasing data augmentation. In: Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), vol. 34, pp. 13001–13008 (2020) 51. Zhu, A., et al.: DSSL: deep surroundings-person separation learning for text-based person retrieval. In: Proceedings of the 29th ACM International Conference on Multimedia (MM), pp. 209–217 (2021)

Look at Adjacent Frames: Video Anomaly Detection Without Offline Training

Yuqi Ouyang, Guodong Shen, and Victor Sanchez

University of Warwick, Coventry, UK
{yuqi.ouyang,guodong.shen,v.f.sanchez-silva}@warwick.ac.uk

Abstract. We propose a solution to detect anomalous events in videos without the need to train a model offline. Specifically, our solution is based on a randomly-initialized multilayer perceptron that is optimized online to reconstruct video frames, pixel-by-pixel, from their frequency information. Based on the information shifts between adjacent frames, an incremental learner is used to update the parameters of the multilayer perceptron after observing each frame, thus allowing anomalous events to be detected along the video stream. Traditional solutions that require no offline training are limited to operating on videos with only a few abnormal frames. Our solution breaks this limit and achieves strong performance on benchmark datasets.

Keywords: Video anomaly detection · Unsupervised · Offline training · Online learning · Multilayer perceptron · Discrete wavelet transform

1 Introduction

Video anomaly detection (VAD) aims at detecting anomalous events in a video scene. Since it is hard to define all possible anomalous events a priori and, moreover, these anomalies may occur infrequently, VAD is rarely solved by supervised learning. It is therefore common to rely exclusively on normal video data to train a model for the detection of anomalous events; hence, the lack of examples of abnormal events during training defines the inherent challenge of this task. However, real-life experience shows that a person may react to a biker moving among pedestrians because the biker moves distinctively, without knowing that the biker is an abnormality in this context. Inspired by such human intelligence, VAD can also be solved without any knowledge learned from an offline training process by analyzing spatio-temporal differences between adjacent frames, since these differences are usually significant where anomalous events occur [10,15,20]. The working mechanism of this type of VAD solution, which requires no offline training and is hereinafter referred to as online VAD, is illustrated in Fig. 1 along with the working mechanism of offline VAD. Note that instead of training the model offline, online VAD updates the model sequentially after observing each video frame, thus requiring no training data.

Note that in this paper, online VAD does not refer to VAD solutions that are trained offline but operate at high frame rates.



Fig. 1. Working mechanisms of offline and online VAD.


Fig. 2. Idea of online VAD.

One main advantage of online VAD is that a model can learn beyond the data used for offline training and adapt much better to the data source during operation, which eases the issue of concept drift caused by, e.g., any data distribution differences between the offline and online data. In offline VAD, a model is trained first to understand the patterns of videos depicting normal events. Once trained, it is used on new videos depicting normal events and possibly, abnormal events. In this case, the concept drift due to the distribution differences between the offline and online data depicting normal events may result in poor performance during operation. To deal with such a concept drift, one can use human support to regularly identify those normal frames on which the model performs poorly, and then retrain the model to recognize these frames correctly. However, such an approach undoubtedly comes with an extra workload. Despite the fact that the performance of VAD solutions has improved recently [9,35], online VAD solutions are still scarce [28,36]. Moreover, the existing ones are usually flawed. For example, [10,15,20], which are pioneers in online VAD, attain poor performance on videos with a large number of anomalous events. Motivated by these limitations, we present our idea for online VAD in Fig. 2. Our idea focuses on the spatio-temporal differences between adjacent frames, hereinafter referred to as adjacent shifts. It uses an incremental learner (IL) that accounts for the adjacent shifts and sequentially updates the model from a set of randomly-initialized parameters. The IL is expected to easily adapt to gradual adjacent shifts, which usually exist between normal frames and should result in no anomaly detections. Conversely, the IL is expected to encounter difficulties in adapting to drastic adjacent shifts, which usually exist between abnormal frames or between a normal and an abnormal frame, thus resulting in anomaly detections. More specifically, as illustrated in Fig. 3, our solution generates error maps as the detection results by reconstructing spatio-temporal information and pixel coordinates into pixel values and comparing the reconstructed frames with the original ones. In this work, we use the discrete wavelet transform (DWT) to summarize the spatio-temporal information of a video sequence and a Multilayer Perceptron (MLP) for the frame reconstruction task. The MLP is set to be updated by the IL while adapting to the adjacent shifts, after observing each frame. The contributions of this work are summarized as follows:


Fig. 3. Our solution for online VAD.

Fig. 4. Our problem definition.

• We introduce the first MLP-based model that uses frequency information to produce pixel-level VAD results. • More importantly, we design a novel solution for online VAD, i.e., detecting anomalous events with no offline training, where the network parameters are optimized sequentially after observing each frame starting from random initialization. • We achieve state-of-the-art performance on benchmark datasets regardless of the temporal order or the number of abnormal frames in a video.

2 Related Work

Offline VAD. State-of-the-art offline VAD solutions use deep-learning techniques. These models can be classified as unsupervised, weakly-supervised, or self-supervised depending on the training data and how these data are used. Unsupervised models exclusively employ normal videos for training, with data mapping via an encoder-decoder structure being the most frequently used strategy. The baseline approaches in this context are based on data reconstruction or prediction, where anomalous events are detected based on the reconstruction or prediction errors, respectively [12,21,24,29,38,49,52]. Adversarial losses can also be applied in such approaches, where the generator is adversarially trained to perform the data mapping so that abnormal videos lead to unrealistic outputs with large reconstruction or prediction errors [19,30,37,45,51]. If the features extracted from the normal and abnormal videos form two distinct and compact clusters, anomalous events can be detected in a latent subspace. In this type of approach, two ways are usually used to train a model. Either separately training the feature extraction and clustering steps [13,14,32,41,48], or jointly training both steps end-to-end [1–3,11,27,34]. Weakly-supervised models use a small number of abnormal frames with labels for training. Based on the triplet loss [18] or multiple instance learning strategy [7,35,40,44,53,54], such models can learn to increase the inter-class distance between normal and abnormal data. Weakly-supervised models outperform unsupervised models but they require abnormal data with ground truth labels [7,44,50].


Self-supervised models are becoming increasingly popular [33,46]. These models are trained with both normal and abnormal videos but without ground truth labels. They are usually trained iteratively by simulating labels through decision-making and then optimizing parameters based on the simulated labels. Data collection is easier for these models because they do not require ground truth labels. However, because these models may use some of the test videos in the training stage, their results should be interpreted with caution. Online VAD. Compared to offline VAD solutions, online models are limited both in number and performance [28,36]. Three main models [10,15,20] have been designed to tackle VAD online. The seminal work in [10] detects anomalous events by aggregating anomaly values from several frame shuffles, where the anomaly value in each shuffle is calculated by measuring the similarity between all previous frames and the frames in a sliding window. Although that particular model cannot work with video streams, it sets an important precedent for VAD online. The follow-up works in [15,20] propose analyzing adjacent frame batches, where anomalies are defined as abrupt differences in spatio-temporal features between two adjacent batches. However, instead of randomly initializing a model and continuously optimizing it along the video stream, these two works repeat the random initialization of parameters every time a new frame is observed, ignoring the fact that frames may share common information, such as the scene background. Moreover, the performance of these three solutions, i.e., [10,15,20], degrades when used on videos with a large number of abnormal frames.2 The reason for this is that those models assume a video only contains a few abnormal frames, thus a large number of abnormal frames violate their assumptions regarding the spatiotemporal distinctiveness of abnormal frames in a video. Offline VAD with Further Optimization. Recently, several offline VAD solutions that are further optimized online have been proposed [4–6]. Specifically, these models are first trained offline and then refined online to reduce false positives. Namely, the model processes each frame online and adds potential normal frames to a learning set where false positives identified manually are also included [4]. Such a learning set is then used to further optimize the model so that it can more accurately learn the normal patterns beyond the offline training data. Scene adaptation has also been recently addressed in VAD [23,25] based on recent advance in meta-learning [8]. Specifically, the model is first trained offline to produce acceptable results for a variety of scenes, and is then further optimized with only a few frames from a specific scene of interest, thus rapidly adapting to that scene.

3 Proposed Solution

Problem Definition. Based on our objective of generating pixel-level detections online in the form of error maps, we start by providing the problem definition:

Videos with more than 50 % of frames being abnormal.


Fig. 5. The workflow of our solution at timestep t. b1 and b2 respectively indicate the number of high-frequency maps in tensors Bt1 and Bt2 of the temporal DWT.

• At the current timestep t, compute the error map $M^t_\circ$ based on the network parameters $\theta^t_\circ$, where anomalies are indicated at the pixel level.
• Define the loss $\ell(\theta^t)$ to optimize $\theta^t_\circ$ into $\theta^t_\bullet$.
• Use $\theta^t_\circ$ and $\theta^t_\bullet$ to define $\theta^{t+1}_\circ$ for computing an error map at timestep t + 1.

Note that we use two types of subscripts, i.e., a clear circle ◦ and a black circle •, where ◦ (•) denotes the initial (final) network parameters or the detection results before (after) the optimization process. We illustrate our problem definition in Fig. 4, starting from the first frame of a video stream, i.e., t = 0, where the IL is in charge of updating the network parameters from $\theta^t_\circ$ to $\theta^t_\bullet$ and subsequently computing $\theta^{t+1}_\circ$.

3.1 Workflow

Figure 5 illustrates the workflow of our solution at the current timestep t. It first applies a temporal DWT to a set of n frames of size h × w, denoted by $V^t$, to generate DWT coefficients, where the last frame of $V^t$ is the current frame $I^t$. The DWT coefficients are used in conjunction with the pixel coordinates $C^t$ as the input $T^t$ of a pixel-level MLP to reconstruct $I^t$, thus leading to an error map $M^t$ by comparing the reconstruction results with the ground-truth frame. $M^t$ is then used by the IL to define the loss $\ell(\theta^t)$. Based on $\ell(\theta^t)$, the IL updates the network parameters from $\theta^t_\circ$ to $\theta^t_\bullet$, and then calculates $\theta^{t+1}_\circ$ for the next timestep.

Discrete Wavelet Transform. In this work, we use the temporal DWT to summarize the spatio-temporal information of a video sequence because it has been shown to provide motion information that fits the human visual system [16]. As shown in Fig. 6 (left), two levels of temporal DWT are performed on the set of frames $V^t$, resulting in four tensors of sub-bands: two low-frequency tensors, $A^t_1$ and $A^t_2$, and two high-frequency tensors, $B^t_1$ and $B^t_2$. Note that the temporal DWT does not change the spatial dimensions. The input tensor $T^t$ of our MLP is defined as:

$$T^t = \left[\, C^t,\; A^t,\; B^t_1,\; B^t_2 \,\right], \qquad (1)$$
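A minimal sketch of this input construction is given below, using PyWavelets for the two-level temporal DWT; the stacking order follows Eq. (1), but the function name, coordinate scaling, and shapes are our own illustrative assumptions rather than the authors' released code.

```python
import numpy as np
import pywt  # PyWavelets

def build_input_tensor(frames: np.ndarray) -> np.ndarray:
    """Build T^t from a clip of n grayscale frames with shape (n, h, w), cf. Eq. (1)."""
    n, h, w = frames.shape

    # Two levels of temporal DWT along the time axis (Daubechies 2, as in the paper's settings).
    a1, b1 = pywt.dwt(frames, "db2", axis=0)   # A_1^t (low-frequency), B_1^t (high-frequency)
    a2, b2 = pywt.dwt(a1, "db2", axis=0)       # A_2^t, B_2^t

    # Appearance information: last low-frequency map of A_1^t.
    appearance = a1[-1:, :, :]                 # shape (1, h, w)

    # Pixel coordinates, rescaled to [-0.5, 0.5] as in the implementation details.
    ys, xs = np.meshgrid(np.linspace(-0.5, 0.5, h),
                         np.linspace(-0.5, 0.5, w), indexing="ij")
    coords = np.stack([ys, xs])                # shape (2, h, w)

    # Stack along the channel (first) dimension: [C^t, A^t, B_1^t, B_2^t].
    return np.concatenate([coords, appearance, b1, b2], axis=0)
```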

In Figs. 5 and 6, the channel dimensions of the data are omitted.


Fig. 6. Details of our temporal DWT (left) and our MLP (right). a1 and a2 (b1 and b2 ) respectively indicate numbers of low-frequency (high-frequency) maps in tensors At1 and At2 (Bt1 and Bt2 ) of the temporal DWT. Layers of the MLP are depicted with the dimensions of their outputs. The error map is computed by comparing two grayscale images.

where $A^t \in \mathbb{R}^{1\times h\times w}$ is the last low-frequency map of the tensor $A^t_1$. Here $T^t$ is formed by stacking the coordinate tensor $C^t$, the low-frequency map $A^t$ (appearance information), the first-level high-frequency tensor $B^t_1$ (sparse motion information), and the second-level high-frequency tensor $B^t_2$ (dense motion information) along the first dimension.

Multilayer Perceptron. We use an MLP to reconstruct frames because it has been shown to effectively map pixel coordinates into pixel values [39,42,43]. As illustrated in Fig. 6 (right), our MLP maps $T^t$ into the reconstructed frame $\hat{I}^t$. Given such an MLP with parameters $\theta^t$, the resulting error map, $M^t$, and its mean squared error (MSE), $\epsilon^t$, are respectively computed as follows:

$$M^t = \left( I^t - f_{\theta^t}(T^t) \right)^{2}, \qquad (2)$$

$$\epsilon^t = \frac{1}{hw} \left\| M^t \right\|_{1}, \qquad (3)$$

where $f_{\theta^t}(\cdot)$ denotes the mapping function of the MLP, $(\cdot)^2$ indicates element-wise square, and $\|\cdot\|_1$ indicates the $\ell_1$-norm. The MLP is optimized by an IL, as detailed next.
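A rough PyTorch sketch of such a pixel-level MLP and of Eqs. (2)-(3) is shown below. The sine activations follow the implementation notes (the settings of [39]), but the class itself, its layer sizes, and the tensor layout are our own illustrative choices, not the authors' code.

```python
import torch
import torch.nn as nn

class CoordMLP(nn.Module):
    """Pixel-wise MLP: maps per-pixel features (coords + DWT values) to a pixel value."""
    def __init__(self, in_channels: int, hidden: int = 256, depth: int = 4):
        super().__init__()
        layers, c = [], in_channels
        for _ in range(depth):
            layers.append(nn.Linear(c, hidden))
            c = hidden
        self.hidden_layers = nn.ModuleList(layers)
        self.out = nn.Linear(c, 1)  # grayscale pixel value

    def forward(self, t):                      # t: (in_channels, h, w)
        x = t.permute(1, 2, 0)                 # (h, w, in_channels): one feature vector per pixel
        for layer in self.hidden_layers:
            x = torch.sin(layer(x))            # sine activations; no activation on the last layer
        return self.out(x).squeeze(-1)         # reconstructed frame (h, w)

def error_map_and_mse(frame, recon):
    m = (frame - recon) ** 2                   # Eq. (2): element-wise squared error
    return m, m.mean()                         # Eq. (3): MSE over the h*w pixels
```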

3.2 Incremental Learner

As illustrated in Fig. 7 (left), at the current timestep t the IL comprises a comparator that defines the loss $\ell(\theta^t)$ from the MLP results, an adapter that optimizes the parameters of the MLP based on $\ell(\theta^t)$, and a clipper that calculates the initial parameters of the MLP for the next timestep, i.e., $\theta^{t+1}_\circ$.

The Comparator. This component compares the reconstruction results of frame $I^{t-1}$, as reconstructed by the MLP with parameters $\theta^{t-1}_\bullet$, with those of frame $I^t$, as reconstructed by the MLP with parameters $\theta^t$. Based on this comparison, it defines the loss $\ell(\theta^t)$ for the MLP at timestep t.

To compute the error map and the MSE while optimizing the MLP, follow Eqs. (2) and (3) but replace $\theta^t$ with the current parameter set involved in the optimization; e.g., $M^t_\circ = \left(I^t - f_{\theta^t_\circ}(T^t)\right)^2$, $\epsilon^t_\circ = \frac{1}{hw}\left\|M^t_\circ\right\|_1$.


Fig. 7. The incremental learner at the current timestep t (left). An example of a drastic adjacent shift that results in a large value for the loss function (top-right). The iterative process used to update the parameters of the MLP (middle-right). Computation of the initial parameters of the MLP for timestep t + 1 (bottom-right).

This loss is based on MSE values:

$$\ell(\theta^t) = \mathrm{ReLU}\!\left(\epsilon^t - \epsilon^{t-1}_\bullet\right), \quad \text{if } t \geq 1. \qquad (4)$$

The rationale behind the loss in Eq. (4) is that this comparison accounts for the adjacent shift between frames $I^{t-1}$ and $I^t$. Figure 7 (top-right) shows an example in which a drastic adjacent shift allows a well-fitted MLP ($\theta^{t-1}_\bullet$) to accurately reconstruct frame $I^{t-1}$, while preventing the preceding MLP ($\theta^t_\circ$) from accurately reconstructing frame $I^t$. Hence, Eq. (4) implicitly measures the adjacent shift by resulting in a large loss value, i.e., a large $\ell(\theta^t_\circ)$. For the first frame of a video sequence, i.e., timestep t = 0, there is no previous information for the comparator to define the loss in Eq. (4). In this case, the randomly-initialized MLP is under a cold start and may not generate accurate reconstruction results. Hence, we adjust Eq. (4) for t = 0 as follows:

$$\ell(\theta^t) = \mathrm{ReLU}\!\left(\epsilon^t - \epsilon\right), \quad \text{if } t = 0, \qquad (5)$$

where $\epsilon$ is a user-defined MSE value that defines a target to be achieved for an accurate reconstruction.

To compute the loss while optimizing the MLP, follow Eq. (4) or Eq. (5) but replace $\theta^t$ with the current parameter set involved in the optimization; e.g., $\ell(\theta^t_\circ) = \mathrm{ReLU}(\epsilon^t_\circ - \epsilon^{t-1}_\bullet)$ if $t \geq 1$. We set $\epsilon^{t-1}_\bullet = \max\left(\epsilon^{t-1}_\bullet, \epsilon\right)$ in Eq. (4) to prevent our MLP from being overfitted towards an MSE value lower than the lower bound $\epsilon$.


Fig. 8. The mathematical operations of the IL along a video stream.

The Adapter. Based on the loss $\ell(\theta^t)$, the adapter adapts to the adjacent shift by optimizing the parameters of the MLP over iterations of gradient descent (GD); the $(i+1)$-th GD iteration is defined as:

$$\theta^t_i \;\xrightarrow{\;-\nabla_{\theta^t_i}\,\ell(\theta^t_i),\ \ \text{if } \ell(\theta^t_i) > l \text{ or } i < k\;}\; \theta^t_{i+1}, \qquad (6)$$

where $i \in [0, k]$. Here we use two user-defined parameters, $k$ and $l$, to control the end of the iterative optimization: the adapter stops at the current GD iteration if the number of GD iterations is large enough ($i \geq k$) or if the loss is small enough ($\ell(\theta^t_i) \leq l$). This iterative process is illustrated in Fig. 7 (middle-right), where the adapter optimizes the parameters from $\theta^t_\circ$ to $\theta^t_\bullet$ (note that $\theta^t_\circ = \theta^t_{i=0}$). Let $k^t$ denote the number of GD iterations used by the adapter at timestep t. A large loss $\ell(\theta^t)$, potentially caused by a drastic adjacent shift, will force the adapter to spend many iterations of GD, i.e., a large $k^t$, to optimize the network parameters.

The Clipper. According to Fig. 7 (left), the loss $\ell(\theta^{t+1})$ at the next timestep $t+1$ is computed by comparing $M^t_\bullet$, as computed under the parameter set $\theta^t_\bullet$, and $M^{t+1}$, as computed under the parameter set $\theta^{t+1}$. Since the adapter generates $\theta^t_\bullet$, we need to define the initial parameter set $\theta^{t+1}_\circ$; this is accomplished by the clipper. As illustrated in Fig. 7 (middle-right), a large value of $k^t$ may imply a drastic shift between frames $I^{t-1}$ and $I^t$, and such a drastic shift may indicate that the current frame $I^t$ is abnormal. Hence, the MLP may have been well-fitted to reconstruct the abnormal frame $I^t$, and such a well-fitted parameter set $\theta^t_\bullet$ may not be appropriate for initializing $\theta^{t+1}_\circ$. Based on these observations, $\theta^{t+1}_\circ$ is defined based on knowledge transfer [31] by clipping between $\theta^t_\circ$ and $\theta^t_\bullet$ as follows:

$$\theta^{t+1}_\circ = \theta^t_\circ + (k^t)^{-\frac{1}{2}}\left(\theta^t_\bullet - \theta^t_\circ\right), \quad \text{if } t \geq 1. \qquad (7)$$
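A small PyTorch-style sketch of the clipping rule in Eq. (7) is given below; it assumes the two parameter sets are stored as state dicts, which is our own convention for the illustration.

```python
import torch

@torch.no_grad()
def clip_parameters(theta_init: dict, theta_final: dict, k_t: int) -> dict:
    """theta_o^{t+1} = theta_o^t + (k^t)^{-1/2} (theta_bullet^t - theta_o^t), cf. Eq. (7)."""
    scale = k_t ** -0.5
    return {name: theta_init[name] + scale * (theta_final[name] - theta_init[name])
            for name in theta_init}

# Usage sketch: many GD iterations (a large k^t) shrink the step taken towards the freshly
# adapted parameters, so knowledge fitted to a likely-abnormal frame is mostly rejected.
# mlp.load_state_dict(clip_parameters(theta_before, mlp.state_dict(), k_t))
```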


Algorithm 1. The IL
1:  i ← 0
2:  θ_0^t ← θ_◦^t
3:  while True do
4:      M_i^t ← (I^t − f_{θ_i^t}(T^t))^2
5:      ε_i^t ← (1/hw) ||M_i^t||_1
6:      M_◦^t ← M_0^t, ε_◦^t ← ε_0^t when i = 0
7:      ℓ(θ_i^t) ← COMPARE(ε, ε_•^{t−1}, ε_i^t)
8:      if ℓ(θ_i^t) ≤ l or i ≥ k then
9:          break
10:     else
11:         θ_{i+1}^t ← ADAPT(α, ∇_{θ_i^t} ℓ(θ_i^t))
12:         i ← i + 1
13: θ_•^t ← θ_i^t, M_•^t ← M_i^t, ε_•^t ← ε_i^t
14: θ_◦^{t+1} ← CLIP(θ_◦^t, θ_•^t, k^{t−1})
15: k^t ← (if i = 0 then 1 else i end)
16: M_det^t ← (if t = 0 then M_•^t else M_◦^t end)
17: RETURN(θ_◦^{t+1}, ε_•^t, k^t, M_det^t)

Algorithm 2. Computation of detection maps
1:  M_det ← ∅
2:  t ← 0
3:  θ_◦^0 ← RAND()
4:  ε_•^{−1} ← null
5:  k^{−1} ← null
6:  while True do
7:      if V^t not exists then
8:          break
9:      V^t ← TRANSFORM(V^t)
10:     C^t ← TRANSFORM(C^t)
11:     A_1^t, B_1^t, A_2^t, B_2^t ← DWT(V^t)
12:     T^t ← [C^t, [A_1^t]_{a1,:,:}, B_1^t, B_2^t]
13:     θ_◦^{t+1}, ε_•^t, k^t, M_det^t ← IL(I^t, T^t, θ_◦^t, ε_•^{t−1}, k^{t−1})
14:     M_det ← M_det ∪ {M_det^t}
15:     t ← t + 1
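For readers who prefer code, a compact PyTorch-style sketch mirroring Algorithm 1 (comparator loss, adapter loop, and clipping) is given below. The function name, the optimizer choice, and the use of the current step's iteration count in the clipper (following Eq. (7)) are our own illustrative assumptions, not the authors' implementation.

```python
import copy
import torch
import torch.nn.functional as F

def il_step(mlp, inp, frame, eps_prev, k, l, lr=1e-5):
    """One incremental-learning step at timestep t (illustrative only).

    mlp:      coordinate MLP holding the initial parameters theta_o^t.
    inp:      input tensor T^t;  frame: ground-truth frame I^t.
    eps_prev: epsilon_bullet^{t-1} (or the target value epsilon at t = 0).
    k, l:     user-defined caps on GD iterations and on the loss.
    """
    theta_init = copy.deepcopy(mlp.state_dict())
    opt = torch.optim.Adam(mlp.parameters(), lr=lr)

    det_map, i = None, 0
    while True:
        recon = mlp(inp)
        err_map = (frame - recon) ** 2          # Eq. (2)
        mse = err_map.mean()                    # Eq. (3)
        if i == 0:
            det_map = err_map.detach()          # detection map under theta_o^t
        loss = F.relu(mse - eps_prev)           # comparator, Eqs. (4)-(5)
        if loss.item() <= l or i >= k:          # stopping criteria of the adapter
            break
        opt.zero_grad()
        loss.backward()
        opt.step()
        i += 1

    # Clipper, here following Eq. (7) with the current step's iteration count.
    k_t = max(i, 1)
    theta_final = mlp.state_dict()
    scale = k_t ** -0.5
    theta_next = {n: theta_init[n] + scale * (theta_final[n] - theta_init[n]) for n in theta_init}
    # Note: for t = 0 the paper instead uses the error map obtained after optimization as the detection map.
    return theta_next, mse.item(), k_t, det_map
```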

For a large $k^t$, potentially caused by a drastic adjacent shift, $\theta^{t+1}_\circ$ is set to be close to $\theta^t_\circ$, thus rejecting the learned knowledge $\theta^t_\bullet$ acquired from a potentially abnormal frame $I^t$. This is illustrated in Fig. 7 (bottom-right). Since there is no previously learned knowledge at t = 0, the MLP is randomly initialized for this first frame as:

$$\theta^0_\circ = \tilde{\theta}, \qquad (8)$$

where $\tilde{\theta}$ indicates a random set of parameters. At t = 1, the MLP is initialized without clipping parameters, to avoid using the randomly-initialized parameters $\theta^0_\circ$, as:

$$\theta^1_\circ = \theta^0_\bullet. \qquad (9)$$

Figure 8 illustrates the complete functionality of the IL along a video stream, while Algorithm 1 summarizes it.

Detection and Anomaly Inference. As depicted in Fig. 8, at the current timestep t the MLP with initial parameters $\theta^t_\circ$ results in an error map $M^t_\circ$ as the detection result, hereinafter called a detection map. For all observed frames, these detection maps are:

$$\mathcal{M}_{det} = \{M^0_\bullet, M^1_\circ, M^2_\circ, \ldots, M^t_\circ\}, \qquad (10)$$

where at t = 0 we use the detection map $M^0_\bullet$ computed after the optimization. The detection maps can be visually displayed as heatmaps that depict the abnormal pixels, or alternatively, one can calculate the MSE values associated with these maps to numerically quantify the anomalies. Algorithm 2 summarizes the process used to compute the detection maps.

This is because $M^0_\circ$ is a noisy error map generated by the MLP under random initialization. See examples of detection maps in Fig. 11.

4 Experiments

Datasets. We test our model on three benchmark datasets. The UCSD Ped2 [26] dataset is a single-scene dataset depicting a pedestrian walkway where the anomalous events are individuals cycling, driving, or skateboarding. The CUHK Avenue [22] dataset is also a single-scene dataset, depicting a subway entrance with various types of anomalous events, such as individuals throwing papers and dancing. The ShanghaiTech [24] dataset is more difficult to analyze as it has complex anomalous events across multiple scenes.

Non-continuous Videos. In each benchmark dataset, videos depicting the same scene may not be continuously shot by one camera. Hence, our model regards these videos as different video streams. However, since such video streams may share common information, e.g., the background, instead of randomly initializing our model on the first frame of these video streams, as specified in Eq. (8), our model initializes the parameters by transferring the knowledge learned on the last frame of the previous video stream. Note that for continuous videos depicting the same scene and shot by one camera, our model computes the initial parameters as specified by Fig. 8.

Implementation Details. All frames are transformed into grayscale images and resized to 230 × 410 and 240 × 428 for the CUHK Avenue and ShanghaiTech datasets, respectively. All pixel coordinates and pixel values are rescaled to the range [−0.5, 0.5]. Daubechies 2 is used as the filter for the temporal DWT on each set of video frames $V^t$ of length n = 16. We randomly initialize the parameters of our MLP based on the settings provided in [39]. Adam [17] is used with learning rates 1 × 10−4 and 1 × 10−5 on the first frame and subsequent frames of all videos, respectively. The two user-defined parameters, $\epsilon$ and $l$, are set to 1 × 10−4 and 1 × 10−6 on all videos, respectively. $k$ is set to 500 on the first 5 frames of each video, and then to 100 on the remaining frames.

Evaluation Metrics. We use the frame-level Area Under the Curve (AUC) of the Receiver Operating Characteristic (ROC) curve as the evaluation metric. Higher AUC values indicate better model performance.
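Since the frame-level AUC is the only metric used, a short sketch of its computation is included here for completeness. The per-frame anomaly score is assumed to be the (max-normalized) MSE of the detection map, which is our reading of the evaluation rather than a quoted implementation.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def frame_level_auc(error_maps, labels):
    """error_maps: list of (h, w) detection maps; labels: per-frame 0/1 ground truth."""
    scores = np.array([m.mean() for m in error_maps])   # frame-level anomaly value (MSE)
    scores = scores / (scores.max() + 1e-12)            # normalization, as in Fig. 9
    return roc_auc_score(np.asarray(labels), scores)
```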

4.1 Performance

In Table 1, we tabulate the performance of our model and other baseline models, where we highlight the best performing model in bold and underline the second best performing model. The left part of Table 1 tabulates the performance of online VAD solutions, while the right part compares ours with offline VAD solutions. Note that since the other three online VAD solutions cannot work on videos with a large number of abnormal frames while our model does not have 10

11

The dataset has 13 scenes but there are no test videos for the last scene, thus we only focus on the first 12 scenes. All activation functions are Sine and only the last layer has no activation function.

652

Y. Ouyang et al.

Table 1. Comparisons in frame-level AUC values (%) on three benchmark datasets with online VAD models (left) and offline VAD models (right). Giorno Iones- Liu [10] UCSD – Ped2 CUHK 78.3 Avenue Shang– haiTech

cu [15] [20] 82.2

Ours

87.5 96.5

80.6

84.4 90.2





83.1

Chang

Lu

[3]

[23]

UCSD 96.5 Ped2 CUHK 86.0 Avenue Shang73.3 haiTech

Wang Sun [47]

[41]

Cai

Lv

[2]

[25]

George- Liu

scu [9]

[21]

Ours

96.2 –



96.6 96.9 99.8

99.3 96.5

85.8 87.0

89.6 86.6 89.5 92.8

91.1 90.2

77.9 79.3

74.7 73.7 73.8 90.2

76.2 83.1

that restriction, substantially more videos are tested by our solution. Despite more videos being analyzed, our model outperforms the best model previously reported in [20] by 9.0 % and 5.8 % AUC, respectively, on the UCSD Ped2 and CUHK Avenue datasets. Note that we report the first result computed by an online VAD solution on the ShanghaiTech dataset, i.e., an AUC value of 83.1 %. Compared to offline VAD solutions, the performance of our solution is very competitive, outperforming most of the models on the ShanghaiTech dataset with an AUC value of 83.1 %, which confirms our robustness to deal with complex scenes. The best performing offline solution is [9], which relies on multiple tasks where several networks are separately trained offline to jointly detect anomalies. Our solution attains competitive results by detecting video anomalies on-the-fly without offline training under a single framework based on frame reconstruction. 4.2

Further Studies

Abnormal Start. As specified in Eq. (5), our solution optimizes the MLP on the first frame of a video stream for accurate reconstruction results even if that frame may be abnormal. However, even in such a case, the detection of anomalies at subsequent abnormal frames is not affected. We illustrate this in Fig. 9, which depicts the anomaly values computed on the UCSD Ped2 test video 008 as normalized MSE values (see the blue curve). In this sequence, all frames are abnormal. Notice that even after optimizing the MLP on the first abnormal frame to achieve a low reconstruction error, which leads to an anomaly value close to 0, the anomaly values increase rather than remaining low for subsequent frames. These values eventually become quite informative to detect anomalous events. The reason for this behavior is that our solution exploits the drastic adjacent shifts that exist between abnormal frames. Hence, despite being finetuned on the first abnormal frame, our MLP still leads to high MSE values on subsequent abnormal frames. Unlimited Amount of Anomalies. Figure 9 also confirms that our solution works very well even on a video containing only abnormal frames, which is a major improvement from the previous online VAD solutions [10,15,20], whose performance degrades on videos with a large number of abnormal frames (i.e., more than 50 % of the frames depicting anomalous events). As mentioned before,

Look at Adjacent Frames

653

Fig. 9. Anomaly values of the UCSD Ped2 video 008, where frames 23, 33 and 179 are pinned with a green circle. (Color figure online) Table 2. Frame-level AUC values (%) when testing with and without the clipper. Ours Ours w/o clipper CUHK avenue

86.3

90.2

Shang- 79.8 haiTech

83.1

Fig. 10. Frame 23, 33 and 179 of the UCSD Ped2 video 008 and their corresponding detection maps.

a large number of abnormal frames in a video violate their assumptions that a video only contains a few abnormal frames, i.e., the spatio-temporal distinctiveness of abnormal frames in a video. Conversely, our solution makes no such assumptions, thus being capable of working on videos containing plenty of abnormal frames, or even on videos with abnormal frames only. Pixel-Level Detection. In Fig. 9, we select three frames, i.e., frames 23, 33 and 179 (pinned with green circles),12 and illustrate these frames and their detection maps in Fig. 10. By examining these detection maps, one can see that since the cycling event is detected in all three frames, the skateboarding event should be responsible for the changes in anomaly values in Fig. 9. Specifically, its presence and disappearance, respectively, lead to an increase in the anomaly value of the second frame (frame 33 ) and a decrease in the third frame (frame 179 ). Moreover, when the skateboarder gradually enters (leaves) the scene, our solution detects it with an increasing (decreasing) trend on anomaly values (see the blue curve before frame 33, and before frame 179 in Fig. 9), which shows that our solution can detect anomalous events at the frame boundaries where the anomalies are usually not evident enough. 12

12. Index numbers in the figure are 8, 18 and 164, respectively, since our model begins analyzing this sequence at the 16th frame.


Fig. 11. Examples of frames and their detection maps from the UCSD Ped2 (rows 1–2), CUHK Avenue (rows 3–4) and ShanghaiTech (rows 5–8) datasets, where the type of anomalous event is indicated above each example. Our solution accurately detects various types of anomalous events at the pixel level.

Figure 11 shows examples of detection maps and their corresponding frames (see footnote 13). These sample results confirm the advantages of our pixel-level detections by demonstrating that our solution can identify video anomalies at a fine level of granularity. For example, it can detect the umbrellas held by bikers (see the example in the last row, third column) and small abnormal objects located in the background (see the examples in the sixth row, first column, and in the last row, fourth column). The visual results in Fig. 11 also show that our solution

13. We use the matplotlib rainbow colormap, with its colors replaced by black at low values to increase the contrast.
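As a concrete illustration of the visualization described in this footnote, the following is a minimal matplotlib sketch. The fraction of the low end replaced by black (the lowest 1/8 of the colormap here) is an assumption for illustration; the paper does not state the exact cutoff.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import ListedColormap

# Rainbow colormap whose lowest entries are replaced by black, so that
# small reconstruction errors stay dark in the detection maps.
colors = plt.get_cmap("rainbow")(np.linspace(0.0, 1.0, 256))
colors[:32] = (0.0, 0.0, 0.0, 1.0)  # assumed cutoff, for illustration only
detection_cmap = ListedColormap(colors)

# Usage (error_map is a 2-D array of per-pixel reconstruction errors):
# plt.imshow(error_map, cmap=detection_cmap); plt.colorbar(); plt.show()
```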


Fig. 12. Anomaly values based on normalized MSE from frames 620 to 715 of the CUHK Avenue dataset video 006 (left). Frames 625 (row 1) and 631 (row 2) and their detection maps computed without (column 2) and with (column 3) the clipper (right).

accurately reconstructs the scene background and foreground, where pixel errors may be hardly recognizable. This demonstrates that the IL used by our solution successfully transfers common information along the video stream.

The Clipper. We perform an ablation study to showcase the functionality of the clipper. First, we run our solution on two benchmark datasets without using the clipper, i.e., directly setting θ°_t = θ•_{t−1}, and then compare its performance in terms of AUC values with that attained by using the clipper (see Table 2). These results show that using the clipper increases performance by 3.9 % and 3.3 % on the CUHK Avenue and ShanghaiTech datasets, respectively. To further understand the effect of the clipper on performance, we focus on frames 620 to 715 of the CUHK Avenue test video 006, which depict the end of an anomalous event, i.e., a person trespasses and walks out of the scene. The anomaly values computed with and without the clipper are plotted in Fig. 12 (left) as blue and red curves, respectively. Frames 625 and 631 and their detection maps are depicted in Fig. 12 (right) (see footnote 14). Regardless of whether the clipper is used, our solution produces informative detection maps to detect the anomalous event in frame 625 (first row). For frame 631 (second row), which is the first frame after the anomalous event ends, one can see that when the clipper is not used, our solution produces a noisier detection exactly in the region where the person leaves the scene. The pixels erroneously marked as abnormal in this error map are the result of overfitting the model on the previous abnormal frames. As depicted by the red curve in Fig. 12 (left), such an overfitted model generates higher anomaly values in subsequent normal frames, which results in poor performance. Hence, without clipping the knowledge learned from previous frames, the model may not be appropriate for subsequent frames.

14. Index numbers in the figure are 610 and 616, respectively, since our model begins analyzing this sequence at the 16th frame.
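To make the per-frame online procedure discussed above concrete, the following is a minimal sketch, assuming a generic coordinate-based MLP (a torch nn.Module mapping pixel coordinates, possibly augmented with DWT coefficients, to pixel values). The anomaly value is the frame's reconstruction MSE (normalized over the video for plotting, as in Figs. 9 and 12). The clipping rule shown, which bounds how far the carried-over parameters may drift from a reference set, is only a stand-in assumption; the exact form of the paper's clipper is not given in this excerpt.

```python
import torch


@torch.no_grad()
def carry_over_with_clipper(mlp, theta_prev, theta_ref, radius=0.05):
    """Initialize the MLP for frame t from the previous frame's parameters.

    Without the clipper this is simply theta_t = theta_{t-1}. The rule used
    here, keeping each parameter within `radius` of a reference set theta_ref,
    is an illustrative assumption, not the paper's exact clipper.
    """
    for p, p_prev, p_ref in zip(mlp.parameters(), theta_prev, theta_ref):
        p.copy_(p_ref + torch.clamp(p_prev - p_ref, -radius, radius))


def score_and_finetune(mlp, coords, frame, steps=5, lr=1e-3):
    """Anomaly value of the current frame, then incremental finetuning on it."""
    with torch.no_grad():
        mse = torch.mean((mlp(coords) - frame) ** 2).item()  # anomaly value
    optimizer = torch.optim.Adam(mlp.parameters(), lr=lr)
    for _ in range(steps):
        optimizer.zero_grad()
        loss = torch.mean((mlp(coords) - frame) ** 2)
        loss.backward()
        optimizer.step()
    theta_new = [p.detach().clone() for p in mlp.parameters()]
    return mse, theta_new
```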

5 Conclusion

In this paper, we proposed a solution for online VAD in which offline training is no longer required. Our solution is based on a pixel-level MLP that reconstructs frames from pixel coordinates and DWT coefficients. Based on the information shifts between adjacent frames, an incremental learner is used to optimize the MLP online to produce detection results along a video stream. Our solution accurately detects anomalous events at the pixel level, achieves strong performance on benchmark datasets, and surpasses other online VAD models by being capable of working with any number of abnormal frames in a video. Our future work will focus on improving performance by reducing the number of false positive detections.

References 1. Abati, D., Porrello, A., Calderara, S., Cucchiara, R.: Latent space autoregression for novelty detection. In: CVPR, pp. 481–490 (2019) 2. Cai, R., Zhang, H., Liu, W., Gao, S., Hao, Z.: Appearance-motion memory consistency network for video anomaly detection. AAAI 35(2), 938–946 (2021) 3. Chang, Y., Tu, Z., Xie, W., Yuan, J.: Clustering driven deep autoencoder for video anomaly detection. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12360, pp. 329–345. Springer, Cham (2020). https://doi. org/10.1007/978-3-030-58555-6 20 4. Doshi, K., Yilmaz, Y.: Continual learning for anomaly detection in surveillance videos. In: CVPRW, pp. 1025–1034 (2020) 5. Doshi, K., Yilmaz, Y.: A modular and unified framework for detecting and localizing video anomalies. In: WACV, pp. 3982–3991 (2022) 6. Doshi, K., Yilmaz, Y.: Rethinking video anomaly detection - a continual learning approach. In: WACV, pp. 3961–3970 (2022) 7. Feng, J.C., Hong, F.T., Zheng, W.S.: MIST: multiple instance self-training framework for video anomaly detection. In: CVPR, pp. 14009–14018 (2021) 8. Finn, C., Abbeel, P., Levine, S.: Model-agnostic meta-learning for fast adaptation of deep networks. In: ICML, pp. 1126–1135 (2017) 9. Georgescu, M.I., Barbalau, A., Ionescu, R.T., Khan, F.S., Popescu, M., Shah, M.: Anomaly detection in video via self-supervised and multi-task learning. In: CVPR, pp. 12742–12752 (2021) 10. Del Giorno, A., Bagnell, J.A., Hebert, M.: A discriminative framework for anomaly detection in large videos. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9909, pp. 334–349. Springer, Cham (2016). https://doi. org/10.1007/978-3-319-46454-1 21 11. Gong, D., et al.: Memorizing normality to detect anomaly: memory-augmented deep autoencoder for unsupervised anomaly detection. In: ICCV, pp. 1705–1714 (2019) 12. Hasan, M., Choi, J., Neumann, J., Roy-Chowdhury, A.K., Davis, L.S.: Learning temporal regularity in video sequences. In: CVPR, pp. 733–742 (2016) 13. Hinami, R., Mei, T., Satoh, S.: Joint detection and recounting of abnormal events by learning deep generic knowledge. In: ICCV, pp. 3639–3647 (2017)


14. Ionescu, R.T., Khan, F.S., Georgescu, M., Shao, L.: Object-centric auto-encoders and dummy anomalies for abnormal event detection in video. In: CVPR, pp. 7834– 7843 (2019) 15. Ionescu, R.T., Smeureanu, S., Alexe, B., Popescu, M.: Unmasking the abnormal events in video. In: ICCV, pp. 2914–2922 (2017) 16. Jin, B., et al.: Exploring spatial-temporal multi-frequency analysis for high-fidelity and temporal-consistency video prediction. In: CVPR, pp. 4553–4562 (2020) 17. Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. In: ICLR (2015) 18. Liu, W., Luo, W., Li, Z., Zhao, P., Gao, S.: Margin learning embedded prediction for video anomaly detection with a few anomalies. In: IJCAI, pp. 3023–3030 (2019) 19. Liu, W., Luo, W., Lian, D., Gao, S.: Future frame prediction for anomaly detection - a new baseline. In: CVPR, pp. 6536–6545 (2018) 20. Liu, Y., Li, C., P´ oczos, B.: Classifier two sample test for video anomaly detections. In: BMVC, p. 71 (2018) 21. Liu, Z., Nie, Y., Long, C., Zhang, Q., Li, G.: A hybrid video anomaly detection framework via memory-augmented flow reconstruction and flow-guided frame prediction. In: ICCV, pp. 13588–13597 (2021) 22. Lu, C., Shi, J., Jia, J.: Abnormal event detection at 150 FPS in MATLAB. In: ICCV, pp. 2720–2727 (2013) 23. Lu, Y., Yu, F., Reddy, M.K.K., Wang, Y.: Few-shot scene-adaptive anomaly detection. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12350, pp. 125–141. Springer, Cham (2020). https://doi.org/10.1007/978-3030-58558-7 8 24. Luo, W., Liu, W., Gao, S.: A revisit of sparse coding based anomaly detection in stacked RNN framework. In: ICCV, pp. 341–349 (2017) 25. Lv, H., Chen, C., Cui, Z., Xu, C., Li, Y., Yang, J.: Learning normal dynamics in videos with meta prototype network. In: CVPR, pp. 15425–15434 (2021) 26. Mahadevan, V., Li, W., Bhalodia, V., Vasconcelos, N.: Anomaly detection in crowded scenes. In: CVPR, pp. 1975–1981 (2010) 27. Markovitz, A., Sharir, G., Friedman, I., Zelnik-Manor, L., Avidan, S.: Graph embedded pose clustering for anomaly detection. In: CVPR, pp. 10536–10544 (2020) 28. Mohammadi, B., Fathy, M., Sabokrou, M.: Image/video deep anomaly detection: a survey (2021) 29. Morais, R., Le, V., Tran, T., Saha, B., Mansour, M., Venkatesh, S.: Learning regularity in skeleton trajectories for anomaly detection in videos. In: CVPR, pp. 11988–11996 (2019) 30. Nguyen, T.N., Meunier, J.: Anomaly detection in video sequence with appearancemotion correspondence. In: ICCV, pp. 1273–1283 (2019) 31. Nichol, A., Achiam, J., Schulman, J.: On first-order meta-learning algorithms (2018) 32. Ouyang, Y., Sanchez, V.: Video anomaly detection by estimating likelihood of representations. In: ICPR, pp. 8984–8991 (2021) 33. Pang, G., Yan, C., Shen, C., Hengel, A.v.d., Bai, X.: Self-trained deep ordinal regression for end-to-end video anomaly detection. In: CVPR, pp. 12170–12179 (2020) 34. Park, H., Noh, J., Ham, B.: Learning memory-guided normality for anomaly detection. In: CVPR, pp. 14360–14369 (2020) 35. Purwanto, D., Chen, Y.T., Fang, W.H.: Dance with self-attention: a new look of conditional random fields on anomaly detection in videos. In: ICCV, pp. 173–183 (2021)


36. Ramachandra, B., Jones, M.J., Vatsavai, R.R.: A survey of single-scene video anomaly detection (2020) 37. Ravanbakhsh, M., Nabi, M., Sangineto, E., Marcenaro, L., Regazzoni, C., Sebe, N.: Abnormal event detection in videos using generative adversarial nets. In: ICIP, pp. 1577–1581 (2017) 38. Shen, G., Ouyang, Y., Sanchez, V.: Video anomaly detection via prediction network with enhanced spatio-temporal memory exchange. In: ICASSP, pp. 3728–3732 (2022) 39. Sitzmann, V., Martel, J.N., Bergman, A.W., Lindell, D.B., Wetzstein, G.: Implicit neural representations with periodic activation functions. In: NeurIPS (2020) 40. Sultani, W., Chen, C., Shah, M.: Real-world anomaly detection in surveillance videos. In: CVPR, pp. 6479–6488 (2018) 41. Sun, C., Jia, Y., Hu, Y., Wu, Y.: Scene-aware context reasoning for unsupervised abnormal event detection in videos. In: ACM MM, pp. 184–192 (2020) 42. Tancik, M., et al.: Learned initializations for optimizing coordinate-based neural representations. In: CVPR, pp. 2846–2855 (2021) 43. Tancik, M., et al.: Fourier features let networks learn high frequency functions in low dimensional domains. In: NeurIPS (2020) 44. Tian, Y., Pang, G., Chen, Y., Singh, R., Verjans, J.W., Carneiro, G.: Weaklysupervised video anomaly detection with robust temporal feature magnitude learning. In: ICCV, pp. 4975–4986 (2021) 45. Vu, H., Nguyen, T.D., Le, T., Luo, W., Phung, D.: Robust anomaly detection in videos using multilevel representations. AAAI 33(01), 5216–5223 (2019) 46. Wang, S., Zeng, Y., Liu, Q., Zhu, C., Zhu, E., Yin, J.: Detecting abnormality without knowing normality: a two-stage approach for unsupervised video abnormal event detection. In: ACM MM, pp. 636–644 (2018) 47. Wang, Z., Zou, Y., Zhang, Z.: Cluster attention contrast for video anomaly detection. In: ACM MM, pp. 2463–2471 (2020) 48. Xu, D., Ricci, E., Yan, Y., Song, J., Sebe, N.: Learning deep representations of appearance and motion for anomalous event detection. In: BMVC, pp. 8.1–8.12 (2015) 49. Ye, M., Peng, X., Gan, W., Wu, W., Qiao, Y.: AnoPCN: video anomaly detection via deep predictive coding network. In: ACM MM, pp. 1805–1813 (2019) 50. Zaheer, M.Z., Mahmood, A., Astrid, M., Lee, S.-I.: CLAWS: clustering assisted weakly supervised learning with normalcy suppression for anomalous event detection. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12367, pp. 358–376. Springer, Cham (2020). https://doi.org/10.1007/978-3030-58542-6 22 51. Zaigham Zaheer, M., Lee, J.H., Astrid, M., Lee, S.I.: Old is gold: redefining the adversarially learned one-class classifier training paradigm. In: CVPR, pp. 14171– 14181 (2020) 52. Zhao, Y., Deng, B., Shen, C., Liu, Y., Lu, H., Hua, X.S.: Spatio-temporal autoencoder for video anomaly detection. In: ACM MM, pp. 1933–1941 (2017) 53. Zhong, J.X., Li, N., Kong, W., Liu, S., Li, T.H., Li, G.: Graph convolutional label noise cleaner: train a plug-and-play action classifier for anomaly detection. In: CVPR, pp. 1237–1246 (2019) 54. Zhu, Y., Newsam, S.D.: Motion-aware feature for improved video anomaly detection. In: BMVC, p. 270 (2019)

SOMPT22: A Surveillance Oriented Multi-pedestrian Tracking Dataset

Fatih Emre Simsek1,2(B), Cevahir Cigla1, and Koray Kayabol2

1 Aselsan Inc., Ankara, Turkey, {fesimsek,ccigla}@aselsan.com.tr
2 Gebze Technical University, Kocaeli, Turkey, [email protected]

Abstract. Multi-object tracking (MOT) has been dominated by the use of track by detection approaches due to the success of convolutional neural networks (CNNs) on detection in the last decade. As the datasets and bench-marking sites are published, research direction has shifted towards yielding best accuracy on generic scenarios including re-identification (reID) of objects while tracking. In this study, we narrow the scope of MOT for surveillance by providing a dedicated dataset of pedestrians and focus on in-depth analyses of well performing multi-object trackers to observe the weak and strong sides of state-of-the-art (SOTA) techniques for real-world applications. For this purpose, we introduce SOMPT22 dataset; a new set for multi person tracking with annotated short videos captured from static cameras located on poles with 6-8 m in height positioned for city surveillance. This provides a more focused and specific benchmarking of MOT for outdoor surveillance compared to public MOT datasets. We analyze MOT trackers classified as one-shot and two-stage with respect to the way of use of detection and reID networks on this new dataset. The experimental results of our new dataset indicate that SOTA is still far from high efficiency, and single-shot trackers are good candidates to unify fast execution and accuracy with competitive performance. The dataset will be available at: sompt22.github.io.

Keywords: Multiple object tracking · Pedestrian detection · Surveillance · Dataset


1 Introduction

Multi-object tracking (MOT) is a popular computer vision problem that focuses on tracking objects and extracting their trajectories under different scenarios that are later used for various purposes. The use cases of MOT are various, including but not limited to autonomous cars, video surveillance, missile systems, and sports. The output of MOT can be used to predict the next positions of the objects, prevent collisions, or extract statistics about the scene and object behaviours. Detection is the first and perhaps the most crucial stage in MOT: it identifies the object or target and defines the purpose of the track. There are different well-defined types of objects, such as pedestrians, faces, vehicles, animals, aircraft, blood cells, stars, or any change resulting from motion with respect to the background, that need to be tracked in several scenarios. In the real world, monitored objects can change visibility over time due to environmental conditions, the physical properties of the object/motion, or crowdedness. The change in visibility, as well as the existence of multiple objects, yields occlusions, object/background similarities, and interactions between instances. These effects, combined with environmental factors such as illumination change, introduce common challenges and difficulties in the MOT literature, both in the detection and tracking stages. In recent years, various algorithms and approaches have been proposed via CNNs to attack the aforementioned problems. Research on MOT has also gained popularity with the appearance of the first MOTChallenge [7] in 2014. This challenge contains annotated videos with object (pedestrian) bounding boxes among consecutive frames that are captured through moving or static cameras. In the MOT challenge, the variety of data is large, including eye-level videos captured on moving vehicles and high-level moving or stationary surveillance cameras at various heights. This variation of the data limits the understanding of the capabilities of state-of-the-art (SOTA) techniques, since the positions of the cameras and the types of scene objects change the scenarios, as mentioned previously. These challenges require MOT approaches to be generic and capable of tracking vehicles or pedestrians under varying conditions. However, MOT techniques, as well as any other technique, should be optimized per scenario to get maximum efficiency and low false alarms for real-world usage. In this paper, we focus on the MOT problem for static, single cameras positioned at around 6–8 m in height for the surveillance of pedestrians. In this manner, we create a new dataset and analyze well-known MOT algorithms on this dataset to understand the performance of SOTA for surveillance. This approach is expected to help optimize MOT algorithms by further analyzing motion behaviour and object variations for semi-crowded scenes in this limited scope. Public MOT datasets challenge the detection and tracking performance of MOT algorithms by increasing the density of the frames. We, on the other hand, try to challenge MOT algorithms in terms of long-term tracking by keeping sequences longer with fewer tracks. The paper continues with a discussion of related work in the following section, which summarizes recent and popular approaches and datasets for MOT. Section 3 is devoted to the description of the MOT problem in terms of a


surveillance perspective, discussing challenges and use cases. Section 4 presents the details of the newly introduced dataset as well as the evaluation approach utilized that clearly analyzes the detection and tracking steps. Section 5 shows the experimental results that evaluate the performance of well-known MOT approaches in the proposed dataset. Section 6 concludes the paper with a discussion of the pros and cons of the MOT techniques as well as future comments.

2 Related Work

MOT is a common problem in most vision systems that involves relating objects among consecutive frames. This temporal relation provides an ID for each individual in a scene, which is used to derive additional information about the attributes and behaviors of the objects and the scene statistics. In that manner, assigning an ID to an object and tracking it correctly are the first steps in gathering high-level inference about the scenes. In this section, we summarize the methods (in the scope of pedestrian tracking) proposed for MOT and the datasets that play a key role in the training and relatively high performance of modern CNNs.

2.1 MOT Methods

MOT techniques involve two main stages: detection and association, a.k.a., tracking. The detection determines the main purpose of the tracking by indicating the existence of an object type within a scene. Once objects are detected, next step is the association of objects. Throughout the MOT literature, various detection methodologies have been utilized involving moving object detection [47], blob detection [1], feature detection [22], predefined object detection [5]. Until the last decade, hand-crafted features and rules have been utilized to detect objects in a scene. CNNs have dominated smart object detection with the help of a large amount of annotated objects and computational power, yielding wide application areas for machine learning. Once the objects are detected in a frame, association along the consecutive frames can be provided via two different approaches: assign a single object tracking per object or optimize a global cost function that relates all the objects detected in both frames. The first approach utilizes representation of the detected objects by defining a bounding box, foreground mask or sparse features. Then, these representations are searched along the next frame within a search region that is defined by the characteristics of the motion of the objects. In these types of methods, object detection is not required for each frame; instead, it is performed with low frequency to update the tracks with new observations. Feature matching [15], Kalman filters [16], correlation-based matching [13] are the main tools for tracking representations among consecutive frames. The second type of tracking exploits detection for each frame and relating the objects based on their similarities defined by several constraints such as position, shape, appearance, etc. The Joint Probabilistic Data Association Filter (JPDAF) [46] and the Hungarian algorithm [17] are the most widely used


approaches to provide one-to-one matching along consecutive frames. In this way, independently extracted objects per frame are matched based on similarity criteria. As CNNs improve, reID networks have also been utilized to yield robust similarities between objects in consecutive frames. The MOT literature has recently focused on the second approach, tracking-by-detection, where CNNs are utilized to detect objects [21,28,49] in each frame, and object similarities are extracted using different approaches [2,4,9,37,43] that are fed into a matrix. The matrix representation involves the objects in the rows and columns; one side is devoted to tracked objects, the other to the newcomers. The Hungarian algorithm performed on the similarity matrix yields an optimal one-to-one match. Methods mostly differ in the similarity formulation of the matrix, whereas the correspondence search is mostly achieved by the Hungarian algorithm. TransTrack [33], TrackFormer [25], and MOTR [40] have recently attempted to use the attention mechanism for tracking objects in videos, following the current focus on utilizing transformers [34] in vision tasks. In these works, the query used to associate the same objects across frames is transmitted to the following frames using the features of prior tracklets. To maintain tracklet consistency, the appearance information in the query is also crucial.
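As a concrete illustration of this association step, the following is a minimal sketch (not any specific tracker's implementation) that builds an IoU-based similarity matrix between tracked boxes and newly detected boxes and solves the one-to-one matching with the Hungarian algorithm via scipy. The threshold value is an assumption for illustration.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def iou(box_a, box_b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    x2, y2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter + 1e-9)


def associate(track_boxes, det_boxes, iou_threshold=0.3):
    """One-to-one matching of tracked objects (rows) to new detections (columns)."""
    if len(track_boxes) == 0 or len(det_boxes) == 0:
        return [], set(range(len(track_boxes))), set(range(len(det_boxes)))
    sim = np.array([[iou(t, d) for d in det_boxes] for t in track_boxes])
    rows, cols = linear_sum_assignment(-sim)  # maximize total IoU
    matches = [(r, c) for r, c in zip(rows, cols) if sim[r, c] >= iou_threshold]
    unmatched_tracks = set(range(len(track_boxes))) - {r for r, _ in matches}
    unmatched_dets = set(range(len(det_boxes))) - {c for _, c in matches}
    return matches, unmatched_tracks, unmatched_dets
```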

2.2 Datasets

Datasets with ground-truth annotations are important for object detection and reID networks, which form the fundamental steps of modern MOT techniques. In this manner, we briefly summarize the existing person detection and multi-object tracking datasets which are within the scope of this study. In both sets, bounding boxes are utilized to define objects with a label that indicates the type of object, such as face, human, or vehicle. On the other hand, there are apparent differences between object detection and multi-object tracking datasets. Firstly, there is no temporal and spatial relationship between adjacent frames in object detection datasets. Second, there is no unique identification number of objects in object detection datasets. These differences make the creation of multi-object tracking datasets more challenging than that of object detection datasets.

Table 1. A comparison of person detection datasets

Dataset             | # countries | # cities | # seasons | # images (day/night) | # persons (day/night) | # density (person/image) | Resolution  | Weather  | Year
Caltech [8]         | 1           | 1        | 1         | 249884/-             | 289395/-              | 1                        | 640 × 480   | Dry      | 2009
KITTI [12]          | 1           | 1        | 1         | 14999/-              | 9400/-                | 0.6                      | 1240 × 376  | Dry      | 2013
COCOPersons [20]    | -           | -        | -         | 64115/-              | 257252                | 4                        | -           | -        | 2014
CityPersons [41]    | 3           | 27       | 3         | 5000/-               | 31514/-               | 6.3                      | 2048 × 1024 | Dry      | 2017
CrowdHuman [30]     | -           | -        | -         | 15000/-              | 339565/-              | 22.6                     | -           | -        | 2018
VisDrone [50]       | 1           | 1        | 1         | 10209/-              | 54200/-               | 5.3                      | 2000 × 1500 | -        | 2018
EuroCityPersons [3] | 12          | 31       | 4         | 40219/7118           | 183182/35323          | 4.5                      | 1920 × 1024 | Dry, wet | 2018
WiderPerson [42]    | -           | -        | 4         | 8000/-               | 236073/-              | 29.9                     | 1400 × 800  | Dry, wet | 2019
Panda [35]          | 1           | 1        | 1         | 45/-                 | 122100/-              | 2713.8                   | > 25k × 14k | -        | 2020

2.3 Person Detection Datasets

The widespread use of person detection datasets dates back to 2005 with the INRIA dataset [5]. Two more datasets, TudBrussels [36] and DAIMLER [10], then emerged in 2009 to serve the detection community. These three datasets enabled structured progress on the detection problem. However, as algorithm performance improved, these datasets were replaced by more diverse and dense datasets, e.g., Caltech [8] and KITTI [12]. The CityPersons [41] and EuroCityPersons [3] datasets came to the fore with different country, city, climate and weather conditions. Despite the prevalence of these datasets, they all suffer from a low-density problem (persons per frame): no more than 7. Crowd scenes are significantly underrepresented. The CrowdHuman [30] and WiderPerson [42] datasets attack this deficiency and increase the density to 22. Recently, the Panda [35] dataset has been released, which is a very high resolution (25k × 15k) human-oriented detection and tracking dataset where relative object sizes are very small w.r.t. the full image. This dataset focuses on very wide-angle surveillance through powerful processors by merging multiple high-resolution images. Another common surveillance dataset is VisDrone [50], which includes 11 different object types that differentiate humans and various vehicles. This dataset is captured by drones whose viewpoints are much higher than in typical surveillance, the platforms are moving, and bird's-eye views are observed. A summary of the datasets with various statistics is shown in Table 1.

Table 2. A comparison of SOMPT22 with other popular multi-object tracking datasets

Dataset         | # images | # persons | # density (ped/img) | # tracks | Camera angle | Fps | Camera state  | Resolution  | Year
PETS [11]       | 795      | 4476      | 5.6                 | 19       | High view    | 7   | Static        | 768 × 576   | 2009
KITTI [12]      | 8000     | 47000     | 4.78                | 917      | Eye-level    | -   | Moving        | 1240 × 376  | 2013
CUHK-SYSU [38]  | 18184    | 96143     | 4.2                 | 8432     | Eye-level    | -   | Moving        | 600 × 800   | 2016
PRW [45]        | 6112     | 19127     | 3.1                 | 450      | Eye-level    | -   | Moving        | 1920 × 1080 | 2017
PathTrack [24]  | 255000   | 2552000   | 10                  | 16287    | Unstructured | 25  | Moving        | 1280 × 720  | 2017
MOT17 [26]      | 11235    | 300373    | 26                  | 1666     | Eye-level    | 30  | Moving/static | 1920 × 1080 | 2016
MOT20 [6]       | 13410    | 1.6M      | 120                 | 3456     | High view    | 25  | Moving/static | 1920 × 1080 | 2020
BDD100K [39]    | 318000   | 3.3M      | 10.3                | 131000   | Eye-level    | 30  | Moving/static | 1280 × 720  | 2020
DanceTrack [32] | 105855   | 1M        | 10                  | 990      | Eye-level    | 20  | Static        | 1920 × 1080 | 2021
SOMPT22         | 21602    | 801341    | 37                  | 997      | High view    | 30  | Static        | 1920 × 1080 | 2022

Multi-object Tracking Datasets. There is a corpus of multi-object tracking datasets that involve different scenarios for pedestrians. In terms of autonomous driving, the pioneering MOT benchmark is the KITTI suite [12], which provides labels for object detection and tracking in the form of bounding boxes. Visual surveillance-centric datasets focus on dense scenarios where people interact and often occlude each other and other objects. PETS [11] is one of the first datasets in this application area. The MOTChallenge suite [7] has played a central role in the benchmarking of multi-object tracking methods. This challenge provides consistently labeled crowded tracking sequences [6,18,26]. MOT20 [6] boosts the complexity of the challenge by increasing the density of the frames. MOT20 introduces a large number of bounding boxes; however, the scenes are over-crowded and the motion directions are not varied enough to correspond to real surveillance scenarios such as squares and intersections. The recently published BDD100K [39]


dataset covers more than 100K videos with different environmental, weather, and geographic circumstances under unconstrained scenarios. In addition to these, the CUHK-SYSU [38], PRW [45], PathTrack [24] and DanceTrack [32] datasets serve variety for multi-object tracking. These datasets are diverse in terms of static/moving camera; eye-level, high view angle and low/high resolution as shown in Table 2.

3 Problem Description

As shown in Table 2 the existing benchmarks mostly tackle MOT problem within autonomous driving perspective (eye level capture) due to the advances of vehicle technology. On the other hand, surveillance is one of the fundamental applications of video analysis that serves city-facility security, law enforcement, and smart city applications. As in most outdoor surveillance applications, the cameras are located high to cover large areas to observe and analyze. Stationary high-view cameras involve different content characteristics compared to common datasets provided in MOT, including severe projective geometry effects, larger area coverage, and longer but slower object motion. Therefore, it is beneficial to narrow the scope of MOT and analyze-optimize existing approaches for the surveillance problem. It is also important to consider the object type to be tracked during the scope limitation. In surveillance, the are two main object types under the spot, pedestrians and vehicles. Pedestrians have unpredictable motion patterns, interact in different ways with the other individuals, yielding various occlusion types, while showing 2D object characteristics due to thin structure. On the other hand, vehicles move faster along predefined roads with predictable (constant velocity-constant acceleration) motion models, interfere with other vehicles with certain rules defined by traffic enforcement, and suffer from object view point changes due to thickness in all 3 dimensions. In that manner, there are significant differences between the motion patterns and object view-point changes of pedestrians and vehicles that have influence on identification features and tracking constraints that definitely require careful attention. That is the main reason, the challenges differentiate by the type of objects as in [27] and [7]. As a result of wide area coverage and high-view camera location, the viewpoints of pedestrians change significantly, where the distant objects can be observed frontally, and closer objects get high tilt angles w.r.t. camera. Besides, objects‘ slow relative motion yield longer tracks within the scene that requires tracking to be robust against various view changes for long periods of time. In this type of scene capture, object sizes in terms of image resolution get smaller, and the number of appearance-based tempters (human-like structures such as tree trunk, poles, seats, etc.) increases, while the videos involve less motion compared to eye level scene capture. Thus, object detection becomes more difficult and requires special attention to outliers as well as appearance changes in long tracks. In addition, consistent camera positioning enables the use of 3D geometry cues in terms of projective imaging, where several assumptions can be exploited


such as objects closer to the camera being the occluders on a planar scene. Apart from the challenges in detection, especially observation of a longer object motion introduces new problems for tracking, mostly the change of floor illumination and object view point due to the object motion or occlusions. Constraining the MOT problem from a surveillance perspective, we propose a new annotated dataset and try to diagnose the capability of state-of-the-art MOT algorithms on pedestrians. As mentioned previously, MOT is achieved by two steps, detection and tracking where we also base our evaluation on the recent popular techniques for both steps. SOTA one-shot object detectors can be grouped into two: anchor-based, e.g., Yolov3 [28] and anchor-free, e.g., CenterNet [49]. When we analyzed the 20 best performing MOT algorithms in the MOTChallenge [7] benchmarks, we observed that these algorithms were based upon either CenterNet/FairMOT or Yolo algorithms. Therefore, we decided to build experiments around these basic algorithms to assess the success of detection and tracking in the SOMPT22 dataset. FairMOT [44] and CenterTrack [48] are two one-shot multi-object trackers based on the CenterNet algorithm. FairMOT adds a reID head on top of the backbone to extract people embeddings. CenterTrack adds a displacement head to predict the next position of the centers of people. Two of the most common association methods are SORT [2] and DeepSORT [37]. SORT uses IOU(Intersection over union) and Kalman filter [16] as the criterion for linking detection to tracking. DeepSORT incorporates deep features of the detected candidates for linking detection to tracking. Three one-shot multi-object trackers (CenterTrack, FairMOT and Yolov5 & SORT) and one two-stage multi-object tracker (Yolov5 & DeepSORT) were trained to benchmark tracking performances.
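Since SORT-style association relies on a Kalman filter for motion prediction, the following is a minimal sketch of a constant-velocity Kalman filter over a bounding-box state. The state parameterization used here (center, width, height plus velocities) is a simplification chosen for illustration; the original SORT [2] tracks scale and aspect ratio instead, and noise settings are assumptions.

```python
import numpy as np


class ConstantVelocityBox:
    """Constant-velocity Kalman filter over a box state (cx, cy, w, h)."""

    def __init__(self, box, dt=1.0):
        self.x = np.array(list(box) + [0.0, 0.0, 0.0, 0.0])  # state + velocities
        self.P = np.eye(8) * 10.0          # state covariance (assumed init)
        self.F = np.eye(8)                 # constant-velocity transition model
        for i in range(4):
            self.F[i, i + 4] = dt
        self.H = np.eye(4, 8)              # we observe only the box, not velocity
        self.Q = np.eye(8) * 1e-2          # process noise (assumption)
        self.R = np.eye(4) * 1.0           # measurement noise (assumption)

    def predict(self):
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.x[:4]                  # predicted box for IoU matching

    def update(self, box):
        z = np.asarray(box, dtype=float)
        y = z - self.H @ self.x                          # innovation
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)         # Kalman gain
        self.x = self.x + K @ y
        self.P = (np.eye(8) - K @ self.H) @ self.P
```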

4 SOMPT22 Dataset

4.1 Dataset Construction

Video Collection. In order to obtain surveillance videos for MOT evaluation, 7/24 static cameras publicly streaming located at 6–8 m high poles are chosen all around the globe. Some of the chosen countries are Italy, Spain, Taiwan, the United States, and Romania. These cameras mostly observe squares and road intersections where pedestrians have multiple moving directions. Videos are recorded for around one minute at different times of the day to produce a variety of environmental conditions. In total, 14 videos were collected, by default using 9 videos as a training set and 5 as a test set. It is important to note that faces of the pedestrians are blurred to be anonymous in a way that does not affect pedestrian detection and reID features significantly. We conducted the object detection test with and without face blurring and have not observed any difference for base algorithms. Annotation. Intel’s open-source annotation tool CVAT [14] is used to annotate the collected videos. Annotation is achieved by first applying a pre-trained model to have rough detection and tracking labels, which are then fine tuned by human annotators. The annotated labels include bounding boxes and identifiers


Fig. 1. Statistics of MOT17, MOT20 and proposed SOMPT22 for persons of the training datasets (histogram of height, aspect ratio, density, occlusion and track length)

(unique track ID) of each person in the MOTChallenge [7] format. The file format is a CSV text file that contains one instance of an object per line. Each line contains the frame ID, the track ID, the top-left corner, and the width and height of the bounding box. In order to enable track continuity, partially and fully occluded objects are also annotated as long as they appear in the video again. Bounding boxes whose dimensions overflow the image boundaries are trimmed to stay inside the image. Bounding boxes also include the occluded parts of a person.
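As a hedged sketch of how such an annotation file can be consumed, the snippet below parses only the fields named above (frame ID, track ID, top-left corner, width, height); any additional columns, such as confidence or visibility, are ignored if present. The example file path is hypothetical.

```python
import csv
from collections import defaultdict


def load_annotations(path):
    """Parse a MOTChallenge-style ground-truth CSV into per-frame box lists."""
    frames = defaultdict(list)
    with open(path, newline="") as f:
        for row in csv.reader(f):
            frame_id, track_id = int(row[0]), int(row[1])
            left, top, width, height = map(float, row[2:6])
            frames[frame_id].append(
                {"track_id": track_id, "bbox": (left, top, width, height)}
            )
    return frames


# Example (hypothetical path): boxes = load_annotations("SOMPT22-02/gt/gt.txt")
```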

4.2 Dataset Statistics

Table 2 presents some important statistics of the existing and proposed datasets. SOMPT22 has density of 37 pedestrians per frame that is between MOT17 and MOT20. MOT20 is a big step up in terms of number of person in MOTChallenge datasets. It is the most dense dataset right now. On the other hand, MOT17 and MOT20 are not mainly surveillance-oriented datasets rather they are proposed to challenge algorithms in terms of detection and occlusion. Especially in MOT20, videos are recorded during crowded events or at a metro station while people get off the train. The motion patterns of the pedestrians are not variable in these videos; each video includes a dominant direction with much fewer different directions. This is not a natural pattern of motion for people on surveillance cameras. On the contrary, people in SOMPT22 dataset act more spontaneously through city squares covering almost every direction. Figure 1 shows statistical benchmarking of MOT17, MOT20 and the proposed dataset. Although SOMPT22 has more images compared to MOT17 and MOT20, the number of tracks is the least. This shows that MOT17 and MOT20 have shorter tracklets with shorter sequences than SOMPT22. A tracklet is a section of a track that a moving object is following as built by an image recognition system. This is an expected result, where surveillance oriented cameras cover larger


field-of-views that enable longer observation of each individual. SOMPT22 provides a high-view dataset, which is lacking among MOT datasets. In this manner, the SOMPT22 dataset challenges algorithms in terms of long-term detection, recognition, and tracking, which require robust adaptation to changes in the scale and viewpoints of pedestrians. We trained a YoloV5 [28] with a medium model backbone on the SOMPT22 training sequences, obtaining the detection results presented in Table 8. A detailed breakdown of the detection bounding boxes on individual sequences and the annotation statistics is shown in Table 3.

Table 3. SOMPT22 dataset & detection bounding box statistics

Sequence   | Resolution  | Length (sec) | FPS | Frames | Boxes  | Tracks | Density | nDet.  | nDet/fr. | Min height | Max height
Train
SOMPT22-02 | 1280 × 720  | 20     | 30 | 600   | 16021  | 45  | 28 | 15281  | 25 | 75 | 300
SOMPT22-04 | 1920 × 1080 | 60     | 30 | 1800  | 35853  | 44  | 20 | 35697  | 19 | 15 | 363
SOMPT22-05 | 1920 × 1080 | 60     | 30 | 1800  | 55857  | 40  | 31 | 58764  | 32 | 18 | 212
SOMPT22-07 | 1920 × 1080 | 60     | 30 | 1800  | 82525  | 94  | 46 | 81835  | 45 | 17 | 330
SOMPT22-08 | 1920 × 1080 | 60     | 30 | 1800  | 121708 | 138 | 68 | 124880 | 69 | 32 | 400
SOMPT22-10 | 1920 × 1080 | 60     | 30 | 1800  | 102077 | 139 | 57 | 104928 | 58 | 31 | 492
SOMPT22-11 | 1920 × 1080 | 60     | 30 | 1800  | 41285  | 58  | 23 | 42706  | 23 | 22 | 417
SOMPT22-12 | 1920 × 1080 | 60     | 30 | 1800  | 72334  | 74  | 40 | 72987  | 40 | 16 | 506
SOMPT22-13 | 1920 × 1080 | 30     | 30 | 900   | 8244   | 38  | 9  | 8528   | 9  | 28 | 400
Test
SOMPT22-01 | 1920 × 1080 | 60     | 30 | 1800  | 49970  | 75  | 28 | 59134  | 32 | 24 | 355
SOMPT22-03 | 1280 × 720  | 40     | 30 | 1200  | 19527  | 42  | 16 | 17249  | 14 | 45 | 270
SOMPT22-06 | 1920 × 1080 | 60     | 30 | 1800  | 122225 | 121 | 68 | 104148 | 57 | 29 | 321
SOMPT22-09 | 1920 × 1080 | 60     | 30 | 1800  | 59092  | 57  | 33 | 50442  | 28 | 18 | 311
SOMPT22-14 | 1920 × 1080 | 30     | 30 | 902   | 14623  | 32  | 16 | 12781  | 14 | 29 | 439
Total      | -           | 12 min | 30 | 21602 | 801341 | 997 | 37 | 789360 | 36 | 15 | 506
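For completeness, the per-sequence figures of the kind listed in Table 3 can be derived directly from the annotation files. The sketch below assumes the hypothetical load_annotations helper introduced earlier and its output structure; it is not part of any released toolkit.

```python
def sequence_stats(frames):
    """Aggregate per-sequence statistics in the spirit of Table 3.

    `frames` maps frame id -> list of {"track_id", "bbox"} entries, as produced
    by the load_annotations sketch above (an assumed helper for illustration).
    """
    n_frames = len(frames)
    n_boxes = sum(len(v) for v in frames.values())
    track_ids = {d["track_id"] for v in frames.values() for d in v}
    heights = [d["bbox"][3] for v in frames.values() for d in v]
    return {
        "frames": n_frames,
        "boxes": n_boxes,
        "tracks": len(track_ids),
        "density": round(n_boxes / max(n_frames, 1)),
        "min_height": min(heights) if heights else None,
        "max_height": max(heights) if heights else None,
    }
```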

4.3 Evaluation Metrics

The multi-object tracking community used MOTA [31] as the main benchmark metric for a long time. This measure combines three sources of error: false positives, missed targets, and identity switches. However, recent results have revealed that this metric places too much weight on detection quality rather than association quality. Higher Order Tracking Accuracy (HOTA) [23] has since been proposed to correct this historical bias. HOTA is the geometric mean of detection accuracy and association accuracy, averaged across localization thresholds. In our benchmarking, we use HOTA as the main performance metric. We also use the AssA and IDF1 [29] scores to measure association performance. AssA is the Jaccard association index averaged over all matching detections and then averaged over the localization thresholds. IDF1 is the ratio of correctly identified detections to the average number of ground-truth and computed detections. We use DetA and MOTA for detection quality. DetA is the Jaccard detection index averaged over the localization thresholds. ID switch is the number of identity switches (ID switch ratio = #ID switches/recall). The complexity of the algorithms is measured according to the processing cost (fps), including only the tracking step. The fps values may or may not be provided by the authors, with non-standard hardware configurations. MOTChallenge [7] does not officially take the reported fps of the algorithms into account in the evaluation process.
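To illustrate how the headline HOTA score combines its two components, the following is a minimal sketch. Computing DetA and AssA themselves requires the full matching procedure of [23] and is omitted here; the example threshold values in the comment are hypothetical.

```python
import numpy as np


def hota(det_a, ass_a):
    """HOTA from per-threshold DetA and AssA values.

    det_a, ass_a: arrays of detection / association accuracy evaluated at the
    same set of localization thresholds. HOTA at each threshold is the
    geometric mean of DetA and AssA, and the final score is the average of
    these values over the thresholds.
    """
    det_a, ass_a = np.asarray(det_a), np.asarray(ass_a)
    return float(np.mean(np.sqrt(det_a * ass_a)))


# Example with hypothetical per-threshold values:
# hota([0.55, 0.50, 0.42], [0.48, 0.45, 0.40])
```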

5 Experiments

5.1 Experiment Setup

Table 4 depicts the experimental configurations of the object detectors, multiobject trackers, and association algorithms. As can be seen in Table 1, CrowdHuman is a recent person detection dataset with a large volume and density of images. CenterTrack was pre-trained on CrowdHuman dataset [30] by us. FairMOT and YoloV5 had already been pre-trained on it by the respective authors. The model parameters pre-trained on this dataset were utilized to initialize the detectors and trackers. Then, we fine-tune (transfer learning) CenterTrack, FairMOT and YoloV5 on the proposed SOMPT22 train dataset via 240, 90 and 90 epochs correspondingly. For the sake of fairness, we keep the network input resolution fixed for all detectors and trackers during the training and inference phase. We followed the presented training protocols of the detectors and trackers in their respective source codes. Therefore, each object detector and tracker has its own pre-processing techniques, data augmentation procedures, hyperparameter tuning processes as well as accepted dataset annotation formats e.g. yolo [28] for YoloV5, MOTChallenge [7] for FairMOT and COCO [20] for CenterTrack. DeepSORT algorithm has CNN based feature extractor module. This module is pre-trained on the Market1501 [19] public reID dataset. All algorithms are implemented and executed on the PyTorch framework with Python, some of which are provided by the corresponding authors. The inference experiments were conducted on an Intel i7-8700k CPU PC with Nvidia GTX1080ti (11GB) GPU. As following tracking-by-detection technique, detection is performed for each frame independently. We use the Kalman filter [16] and bounding box intersection over union as initial stages for all trackers to associate the detection results along consecutive frames. Further details of the experimented trackers are given in Table 5. YoloV5 object detector constructs a two-stage multi-object tracker in collaboration with the DeepSORT algorithm. YoloV5 & SORT, CenterTrack and


FairMOT algorithms are three one-shot trackers that have only one backbone to extract deep features from objects. The multi-object tracker formed by cascading the YoloV5 and SORT algorithms is classified as a one-shot tracker because the association is performed purely on the CPU. These three multi-object trackers are trained in an end-to-end fashion. The DeepSORT association algorithm and FairMOT benefit from reID features, while CenterTrack and SORT accomplish the association task without reID features. CenterTrack exploits an additional head within the detection framework that predicts the displacement of object centers.

Table 4. Algorithms & specs (* not trained by us)

Algorithms       | Function             | Type     | Backbone | Resolution | Train dataset
CenterTrack [48] | Multi-object tracker | One-shot | DLA34    | 640 × 640  | CH / + SOMPT22
FairMOT [44]     | Multi-object tracker | One-shot | DLA34    | 640 × 640  | CH* / + SOMPT22
YoloV5 [28]      | Object detector      | One-shot | Darknet  | 640 × 640  | CH* / + SOMPT22
DeepSORT [37]    | Associator           | -        | CNN      | 64 × 128   | Market1501* [19]
SORT [2]         | Associator           | -        | -        | 64 × 128   | -
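For reference, the training setup described in Sect. 5.1 and Table 4 can be summarized as a small configuration sketch. The structure and key names below are ours, and optimizer settings, schedules, and augmentations follow each method's original recipe and are not reproduced here.

```python
# Hedged summary of the fine-tuning setup (CH = CrowdHuman pre-training).
FINETUNE_CONFIG = {
    "CenterTrack": {"pretrain": "CrowdHuman", "finetune": "SOMPT22",
                    "epochs": 240, "input_size": (640, 640)},
    "FairMOT":     {"pretrain": "CrowdHuman", "finetune": "SOMPT22",
                    "epochs": 90, "input_size": (640, 640)},
    "YoloV5":      {"pretrain": "CrowdHuman", "finetune": "SOMPT22",
                    "epochs": 90, "input_size": (640, 640)},
    "DeepSORT_reID": {"pretrain": "Market1501", "finetune": None,
                      "input_size": (64, 128)},
}
```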

Table 5. Association methods of the MOT algorithms

Methods                | Box IOU | Re-ID features | Kalman filter | Displacement
CenterTrack [48]       | ✓       |                | ✓             | ✓
FairMOT [44]           | ✓       | ✓              | ✓             |
YoloV5 & DeepSORT [4]  | ✓       | ✓              | ✓             |
YoloV5 [28] & SORT [2] | ✓       |                | ✓             |

5.2 Benchmark Results

In this section, we will compare and contrast the performance of the four aforementioned trackers according to the HOTA [23] and CLEAR [31] metrics and the inference speed, as shown in Table 6. We can observe that the detection performance (DetA) of CenterTrack is better than that of FairMOT, which may be due to the displacement head that improves the localization of people. On the other hand, the association performance (AssA) of FairMOT is better than that of CenterTrack with the help of the reID head that adds a strong cue to the association process. CenterTrack requires less computation sources compared to FairMOT. According to the HOTA score which is the geometric mean of DetA and AssA, combination of YoloV5 and SORT variants outperform the other techniques significantly. The key role in this result belongs to the detection accuracy, where YoloV5 improves the detection by at least 10%. DeepSORT and SORT


approaches perform similarly to each other, with certain expected deviations. DeepSORT adds reID-based matching of object patches over SORT and decreases the number of ID switches by 80 % while increasing the computational complexity by about 2.5×. However, the use of reID representations introduces some decrease in association accuracy (AssA), which is probably due to long tracklets that change appearance significantly. YoloV5 is an anchor-based object detector. Fine selection of the anchor combination seems to lead to better detection performance on surveillance cameras than anchor-free methods, e.g., CenterTrack and FairMOT. This reveals the importance of detection in the track-by-detection paradigm, which is the most common approach in the MOT literature. MOT20 is the public dataset most similar to SOMPT22 in terms of camera perspective. Therefore, we repeated the same experiments on the MOT20 dataset to observe the contribution of SOMPT22 to the MOT algorithms there as well. Table 7 shows the benchmark of the MOT algorithms on the MOT20 train set. The comparative results are parallel to those given in Table 6, where the YoloV5 and (Deep)SORT methods outperform. It is also clear that transfer learning on our proposed dataset SOMPT22 increases performance.

Table 6. MOT algorithms performance on the SOMPT22 test set before and after finetuning on SOMPT22 (↑: higher is better, ↓: lower is better)

Methods                | HOTA ↑    | DetA ↑    | AssA ↑    | MOTA ↑    | IDF1 ↑    | IDsw ↓    | FPS ↑
CenterTrack [48]       | 23.4/32.9 | 31.2/42.3 | 18.0/26.1 | 35.6/43.5 | 23.4/35.3 | 5426/3843 | 57.7
FairMOT [44]           | 36.3/37.7 | 38.2/34.6 | 35.5/41.8 | 41.0/43.8 | 44.8/42.9 | 1789/1350 | 12
YoloV5 & DeepSORT [4]  | 39.5/43.2 | 48.8/47.2 | 32.4/39.8 | 58.4/55.9 | 47.1/53.3 | 162/152   | 26.4
YoloV5 [28] & SORT [2] | 44.5/45.1 | 47.8/47.1 | 41.7/43.5 | 59.4/56   | 55.9/55.1 | 747/822   | 70.2

Table 7. MOT algorithms performance on the MOT20 train set before and after finetuning on SOMPT22 (↑: higher is better, ↓: lower is better)

Methods                | HOTA ↑    | DetA ↑    | AssA ↑    | MOTA ↑    | IDF1 ↑    | IDsw ↓
CenterTrack [48]       | 13.2/18.8 | 12.7/28.8 | 14.2/12.5 | 14.3/33.4 | 12.1/20.7 | 64/1318
FairMOT [44]           | 15.4/18.1 | 7.9/16.8  | 30.9/22.8 | 3.3/−21.1 | 11.4/15.1 | 232/153
YoloV5 & DeepSORT [4]  | 26.8/41.5 | 35.8/52.5 | 20.4/33.0 | 47.3/67.5 | 33.2/51.0 | 345/5859
YoloV5 [28] & SORT [2] | 32.1/48.0 | 32.7/50.4 | 31.7/45.9 | 42.7/63.2 | 44.1/62.6 | 4788/3013

As mentioned, combination of SORT based associator and anchor-based object detector YoloV5 performs better than one-shot MOT algorithms. In addition, detection performance is critical for overall tracking. To observe the contribution of the SOMPT22 dataset to the performance of object detection solely, we evaluated the YoloV5 object detector on the test set of SOMPT22 after fine tuning with the same dataset. Detection scores are shown in Table 9. The precision and recall measurements are calculated as 0.89 and 0.68 respectively, which


indicates that there is still room for improvement of detection in the surveillance scenario. The problem in surveillance is the wide field of view, which results in small objects that are rather difficult to detect. We provide these public detections as a baseline for the tracking challenge so that the trackers can be trained and tested.

Table 8. Overview of the performance of the YoloV5 detector trained on the SOMPT22 training dataset

Sequence   | Precision | Recall | mAP@.5 | mAP@.5:.95
Train
SOMPT22-02 | 0.98 | 0.89 | 0.96 | 0.61
SOMPT22-04 | 0.98 | 0.94 | 0.98 | 0.73
SOMPT22-05 | 0.97 | 0.92 | 0.97 | 0.73
SOMPT22-07 | 0.96 | 0.85 | 0.93 | 0.66
SOMPT22-08 | 0.96 | 0.82 | 0.93 | 0.66
SOMPT22-10 | 0.97 | 0.84 | 0.94 | 0.68
SOMPT22-11 | 0.98 | 0.92 | 0.97 | 0.74
SOMPT22-12 | 0.99 | 0.95 | 0.98 | 0.83
SOMPT22-13 | 0.98 | 0.93 | 0.95 | 0.67
Test
SOMPT22-01 | 0.85 | 0.72 | 0.79 | 0.41
SOMPT22-03 | 0.95 | 0.72 | 0.80 | 0.42
SOMPT22-06 | 0.89 | 0.66 | 0.77 | 0.43
SOMPT22-09 | 0.87 | 0.51 | 0.60 | 0.33
SOMPT22-14 | 0.94 | 0.70 | 0.80 | 0.50


Table 9. The performance of the YoloV5 [28] object detector evaluated on the test set of SOMPT22

Train set     | Precision | Recall    | mAP@.5    | mAP@.5:.95
CH / +SOMPT22 | 0.88/0.89 | 0.65/0.68 | 0.74/0.79 | 0.40/0.42

Figure 2 shows some success and failure cases of the YoloV5 & SORT method on the SOMPT22 dataset, where green indicates successful detections and tracks and red indicates failures. The detector fails in cluttered regions where pedestrians are occluded or where pedestrian-like structures appear, and the tracker fails in turn. Detector failures lead to ID switches, trajectory fragmentation, and the loss of long-term tracking. The experimental results on the proposed dataset indicate that SOTA is still not very efficient. On the other hand, detection plays a key role in the overall tracking performance. Thus, enriching detectors with additional attributes and representations that support the association of objects seems to be a competitive alternative in terms of lower computational complexity and higher


Fig. 2. Some success/failure cases of YoloV5 & SORT MOT algorithm on SOMPT22. First row shows true detections in green and false detections in red. Second row shows true trajectories in green and fragmented trajectories in red

performance. Two-stage approaches clearly perform better and leave room for improvement: they are fast and yield a low number of ID switches with much more accurate association.

6 Conclusion and Future Work

In this paper, we propose a new multi-pedestrian tracking dataset: SOMPT22. This dataset contains surveillance-oriented video sequences captured from publicly streaming city cameras. The motivation is to reveal the bias in existing datasets, which tend to be captured either at eye level for autonomous driving systems or from a high view in crowded scenes. We believe that the ability to analyze the complex motion patterns of people in daily life in a well-constrained surveillance scenario is necessary for building more robust and intelligent trackers. SOMPT22 serves as such a platform to encourage future work on this idea. The benchmark of the four most common tracking approaches on SOMPT22 demonstrates that the multi-object tracking problem is still far from being solved, with at most a 48 % HOTA score, and requires fundamental improvements before resorting to heuristics in specific scenarios. The FairMOT and CenterTrack multi-object trackers show complementary performance on the detection and association parts of the tracking task. On the other hand, improved detection with YoloV5 followed by SORT-based trackers outperforms the joint trackers. Moreover, SORT provides higher tracking scores, apart from ID switches, compared to DeepSORT. These results indicate that detection is the key to better tracking, and that reID features require special attention to be incorporated within the SORT framework.

References 1. Alberola-L´ opez, C., Casar-Corredera, J.R., Ruiz-Alzola, J.: A comparison of CFAR strategies for blob detection in textured images. In: 1996 8th European Signal Processing Conference (EUSIPCO 1996), pp. 1–4 (1996)


2. Bewley, A., Ge, Z., Ott, L., Ramos, F., Upcroft, B.: Simple online and realtime tracking. In: 2016 IEEE International Conference on Image Processing (ICIP). IEEE (2016). https://doi.org/10.1109/icip.2016.7533003 3. Braun, M., Krebs, S., Flohr, F., Gavrila, D.M.: EuroCity persons: a novel benchmark for person detection in traffic scenes. IEEE Trans. Pattern Anal. Mach. Intell. 41(8), 1844–1861 (2019). https://doi.org/10.1109/tpami.2019.2897684 4. Brostr¨ om, M.: Real-time multi-object tracker using YOLOv5 and deep sort with OSNet (2022). https://github.com/mikel-brostrom/Yolov5 DeepSort OSNet 5. Dalal, N., Triggs, B.: Histograms of oriented gradients for human detection. In: 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2005), vol. 1, pp. 886–893 (2005). https://doi.org/10.1109/CVPR.2005. 177 6. Dendorfer, P., et al.: MOT20: a benchmark for multi object tracking in crowded scenes. arXiv:2003.09003 (2020). http://arxiv.org/abs/1906.04567 7. Dendorfer, P., et al.: Motchallenge: a benchmark for single-camera multiple target tracking. Int. J. Comput. Vis. 129(4), 845–881 (2020) 8. Dollar, P., Wojek, C., Schiele, B., Perona, P.: Pedestrian detection: an evaluation of the state of the art. IEEE Trans. Pattern Anal. Mach. Intell. 34(4), 743–761 (2012). https://doi.org/10.1109/TPAMI.2011.155 9. Du, Y., Song, Y., Yang, B., Zhao, Y.: Strongsort: make deepsort great again (2022). https://doi.org/10.48550/arxiv.2202.13514 10. Enzweiler, M., Gavrila, D.M.: Monocular pedestrian detection: survey and experiments. IEEE Trans. Pattern Anal. Mach. Intell. 31(12), 2179–2195 (2009). https:// doi.org/10.1109/TPAMI.2008.260 11. Ferryman, J., Shahrokni, A.: Pets 2009: dataset and challenge (2009) 12. Geiger, A., Lenz, P., Urtasun, R.: Are we ready for autonomous driving? The kitti vision benchmark suite. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2012) 13. Henriques, J.F., Caseiro, R., Martins, P., Batista, J.: High-speed tracking with kernelized correlation filters. IEEE Trans. Pattern Anal. Mach. Intell. 37(3), 583– 596 (2015). https://doi.org/10.1109/TPAMI.2014.2345390 14. INTEL: Cvat. https://openvinotoolkit.github.io/cvat/docs/ 15. Jabar, F., Farokhi, S., Sheikh, U.U.: Object tracking using SIFT and KLT tracker for UAV-based applications. In: 2015 IEEE International Symposium on Robotics and Intelligent Sensors (IRIS), pp. 65–68 (2015). https://doi.org/10.1109/IRIS. 2015.7451588 16. Kalman, R.E.: A new approach to linear filtering and prediction problems. Trans. ASME-J. Basic Eng. 82, 35–45 (1960). https://doi.org/10.1109/CVPR.2005.177 17. Kuhn, H.W.: Variants of the Hungarian method for assignment problems (1956) 18. Leal-Taix´e, L., Milan, A., Reid, I., Roth, S., Schindler, K.: MOTChallenge 2015: towards a benchmark for multi-target tracking. arXiv:1504.01942 (2015). http:// arxiv.org/abs/1504.01942 19. Zheng, L., Shen, L., Tian, L., Wang, S., Wang, J., Tian, Q.: Scalable person reidentification: a benchmark (2015) 20. Lin, T.Y., et al.: Microsoft coco: common objects in context (2014). https://doi. org/10.48550/arxiv.1405.0312 21. Liu, W., et al.: SSD: single shot MultiBox detector. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9905, pp. 21–37. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46448-0 2 22. Lowe, D.G.: Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vision 60, 91–110 (2004)


23. Luiten, J., et al.: HOTA: a higher order metric for evaluating multi-object tracking. Int. J. Comput. Vision 129(2), 548–578 (2020) 24. Manen, S., Gygli, M., Dai, D., Van Gool, L.: Pathtrack: fast trajectory annotation with path supervision (2017). https://doi.org/10.48550/arxiv.1703.02437 25. Meinhardt, T., Kirillov, A., Leal-Taixe, L., Feichtenhofer, C.: Trackformer: multiobject tracking with transformers (2021). https://doi.org/10.48550/arxiv.2101. 02702 26. Milan, A., Leal-Taix´e, L., Reid, I., Roth, S., Schindler, K.: MOT16: a benchmark for multi-object tracking. arXiv:1603.00831 (2016). http://arxiv.org/abs/1603.00831 27. Naphade, M., et al.: The 5th AI city challenge. In: The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops (2021) 28. Redmon, J., Farhadi, A.: Yolov3: an incremental improvement (2018). https://doi. org/10.48550/arxiv.1804.02767 29. Ristani, E., Solera, F., Zou, R.S., Cucchiara, R., Tomasi, C.: Performance measures and a data set for multi-target, multi-camera tracking (2016). https://doi.org/10. 48550/arxiv.1609.01775 30. Shao, S., et al.: Crowdhuman: a benchmark for detecting human in a crowd (2018). https://doi.org/10.48550/arxiv.1805.00123 31. Stiefelhagen, R., Bernardin, K., Bowers, R., Rose, R., Michel, M., Garofolo, J.: The clear 2007 evaluation (2007). https://doi.org/10.1007/978-3-540-68585-2 1 32. Sun, P., et al.: Dancetrack: multi-object tracking in uniform appearance and diverse motion (2021). https://doi.org/10.48550/arxiv.2111.14690 33. Sun, P., et al.: Transtrack: multiple object tracking with transformer (2020). https://doi.org/10.48550/arxiv.2012.15460 34. Vaswani, A., et al.: Attention is all you need (2017). https://doi.org/10.48550/ arxiv.1706.03762 35. Wang, X., et al.: Panda: a gigapixel-level human-centric video dataset. In: 2020 IEEE International Conference on Computer Vision and Pattern Recognition (CVPR). IEEE (2020) 36. Wojek, C., Walk, S., Schiele, B.: Multi-cue onboard pedestrian detection. In: 2009 IEEE Conference on Computer Vision and Pattern Recognition, pp. 794– 801 (2009). https://doi.org/10.1109/CVPR.2009.5206638 37. Wojke, N., Bewley, A., Paulus, D.: Simple online and realtime tracking with a deep association metric (2017). https://doi.org/10.48550/arxiv.1703.07402 38. Xiao, T., Li, S., Wang, B., Lin, L., Wang, X.: Joint detection and identification feature learning for person search (2016). https://doi.org/10.48550/arxiv.1604.01850 39. Yu, F., et al.: BDD100K: a diverse driving dataset for heterogeneous multitask learning (2018). https://doi.org/10.48550/arxiv.1805.04687 40. Zeng, F., Dong, B., Zhang, Y., Wang, T., Zhang, X., Wei, Y.: MOTR: end-to-end multiple-object tracking with transformer (2021). https://doi.org/10.48550/arxiv. 2105.03247 41. Zhang, S., Benenson, R., Schiele, B.: Citypersons: a diverse dataset for pedestrian detection (2017). https://doi.org/10.48550/arxiv.1702.05693 42. Zhang, S., Xie, Y., Wan, J., Xia, H., Li, S.Z., Guo, G.: Widerperson: a diverse dataset for dense pedestrian detection in the wild (2019). https://doi.org/10. 48550/arxiv.1909.12118 43. Zhang, Y., et al.: Bytetrack: multi-object tracking by associating every detection box (2021). https://doi.org/10.48550/arxiv.2110.06864 44. Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: FairMOT: on the fairness of detection and re-identification in multiple object tracking. Int. J. Comput. Vision 129(11), 3069–3087 (2021). https://doi.org/10.1007/s11263-021-01513-4

SOMPT22 Dataset

675

45. Zheng, L., Zhang, H., Sun, S., Chandraker, M., Yang, Y., Tian, Q.: Person reidentification in the wild (2016). https://doi.org/10.48550/arxiv.1604.02531 46. Zhou, B., Bose, N.: An efficient algorithm for data association in multitarget tracking. IEEE Trans. Aerosp. Electron. Syst. 31(1), 458–468 (1995). https://doi.org/ 10.1109/7.366327 47. Zhou, D., Zhang, H.: Modified GMM background modeling and optical flow for detection of moving objects. In: 2005 IEEE International Conference on Systems, Man and Cybernetics, vol. 3, pp. 2224–2229 (2005). https://doi.org/10.1109/ ICSMC.2005.1571479 48. Zhou, X., Koltun, V., Kr¨ ahenb¨ uhl, P.: Tracking objects as points (2020). https:// doi.org/10.48550/arxiv.2004.01177 49. Zhou, X., Wang, D., Kr¨ ahenb¨ uhl, P.: Objects as points (2019). https://doi.org/10. 48550/arxiv.1904.07850 50. Zhu, P., et al.: Detection and tracking meet drones challenge. IEEE Trans. Pattern Anal. Mach. Intell. (2021). https://doi.org/10.1109/TPAMI.2021.3119563

Detection of Fights in Videos: A Comparison Study of Anomaly Detection and Action Recognition

Weijun Tan1,2(B) and Jingfeng Liu1

1 Jovision-Deepcam Research Institute, Shenzhen, China
{sz.twj,sz.ljf}@jovision.com, {weijun.tan,jingfeng.liu}@deepcam.com
2 LinkSprite Technology, Longmont, CO 80503, USA
[email protected]

Abstract. Detection of fights is an important surveillance application in videos. Most existing methods use supervised binary action recognition. Since frame-level annotations are very hard to get for anomaly detection, weakly supervised learning using multiple instance learning is widely used. This paper explores the detection of fights in videos as one special type of anomaly detection and as binary action recognition. We use the UBI-Fight and NTU-CCTV-Fight datasets for most of the study since they have frame-level annotations. We find that anomaly detection performs similarly to, or even better than, action recognition. Furthermore, we study using anomaly detection as a toolbox to generate training datasets for action recognition in an iterative way, conditioned on the performance of the anomaly detection. Experiment results show that we achieve state-of-the-art performance on three fight detection datasets.

Keywords: Fight detection · Video analysis · 3D CNN · I3D · Anomaly detection · Action recognition

1 Introduction

Surveillance cameras are widely used in public places for safety purposes. Empowered by machine learning and artificial intelligence, surveillance cameras become smarter through automatic object or event detection and recognition. Video anomaly detection identifies the time and location of abnormal objects or events in videos. Examples include industrial anomaly detection, security anomaly detection, and more. This paper studies a special type of anomaly detection - single-type abnormal event detection in videos. Specifically, we study the detection of fights in public places [1,4,19,21,27,29], but the algorithms can be easily extended to other single-type event detection, e.g., gun event detection. A fight is a common event that needs the attention of safety personnel to prevent it from escalating and becoming more destructive or even deadly.


In video anomaly detection, more attention is given to the detection of general abnormal events. This is fair, since the algorithms for the detection of general abnormal events can mostly be used for the detection of a single-type abnormal event. There are two major categories of anomaly detection algorithms. The first is self-supervised learning using only normal videos. A model is learned to represent normal videos. Based on the modeling of the past frames, a prediction of a new frame is made. An abnormal event is reported if the actual new frame differs from the predicted frame by more than a pre-set threshold. In this category, latent-space features or generative models can be used, with both hand-crafted and deep learning features. For more details of this approach, please refer to [18].

The second is a weakly supervised learning approach, where a weak annotation label of whether an abnormal event is present in a video is provided. In this case, multiple instance learning (MIL) is used [22]. From a pair of abnormal and normal videos, a positive bag of instances is formed on the abnormal video and a negative bag of instances on the normal video. The model tries to maximize the distance between the maximum scores of the two bags.

Anomaly detection is closely related to action recognition. However, action recognition typically needs frame-level annotation, which is very hard to get for large-scale video datasets. That is why self- and weakly supervised learning has become prevalent. For single-type abnormal event detection, annotating the frame-level labels becomes relatively easier. For fight detection, there are at least two datasets, the UBI-Fight [4] and the NTU-CCTV-Fight [19], which provide frame-level labels. In this paper, we mainly use the UBI-Fight dataset in our study, while other datasets are used to benchmark our algorithms' performance. With the frame-level labels, we can use supervised action detection. It is believed that supervised learning should outperform self- or weakly supervised learning. However, this has not been well studied in the literature. To our knowledge, except for [4], we are the first to explore using anomaly detection for the detection of single-type abnormal events in videos.

In this work, we do a comparison study of weakly supervised learning vs. supervised learning. Our goal is to suggest which one should be used in terms of performance and the resources required to achieve it. Second, we study whether weakly supervised learning can be used as a data-generation tool to produce training data for supervised action recognition. The idea is to find the most reliable snippets from the abnormal videos and the hardest snippets from the normal videos. This data-generation approach helps the training of supervised action recognition when an annotated dataset is unavailable. Since we have frame-level annotation labels in the two datasets we use, we can compare the performance of this action recognition with the one using the frame annotation labels.
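To make the MIL formulation above concrete, the following is a minimal PyTorch sketch of a ranking objective over one positive and one negative bag, in the spirit of [22]. The function name and the smoothness/sparsity weights are illustrative choices, not values taken from this paper.

```python
import torch

def mil_ranking_loss(abnormal_scores, normal_scores,
                     lambda_smooth=8e-5, lambda_sparse=8e-5):
    """Hinge loss between the max-scoring instances of a positive (abnormal)
    and a negative (normal) bag, plus temporal smoothness and sparsity terms.

    abnormal_scores, normal_scores: 1-D tensors of per-segment anomaly scores
    in [0, 1] for one abnormal and one normal video, respectively.
    """
    # MIL ranking: the top abnormal segment should outscore the top segment
    # of the normal video by a margin of 1.
    ranking = torch.relu(1.0 - abnormal_scores.max() + normal_scores.max())
    # Temporal smoothness: neighbouring segments should have similar scores.
    smooth = ((abnormal_scores[1:] - abnormal_scores[:-1]) ** 2).sum()
    # Sparsity: only a few segments of an abnormal video are truly abnormal.
    sparse = abnormal_scores.sum()
    return ranking + lambda_smooth * smooth + lambda_sparse * sparse
```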

2 Related Work

In this section, we review the literature on both general anomaly detection and fight detection in videos.

2.1 General Anomaly Detection

Weakly supervised anomaly detection has shown substantially improved performance over the self-supervised approaches by leveraging the available video-level annotations. This annotation only gives a binary label of abnormal or normal for a video. Sultani et al. [22] propose the MIL framework using only video-level labels and introduce the large-scale anomaly detection dataset, UCF-Crime. This work inspires quite a few follow-up studies [4,6,14,17,23,25,26,28]. However, in the MIL-based methods, abnormal video labels are not easy to use effectively. Typically, the classification score is used to tell whether a snippet is abnormal or normal. This score is noisy in the positive bag, where a normal snippet can be mistakenly taken as the top abnormal event in an anomalous video. To deal with this problem, Zhong et al. [28] treat this as binary classification under a noisy-label problem and use a graph convolutional network (GCN) to clean the label noise. In RTFM [23], a robust temporal feature magnitude (RTFM) is used to select the most reliable abnormal snippets from the abnormal videos and the normal videos. They unify representation learning and anomaly score learning by a temporal feature ranking loss, enabling better separation between normal and abnormal feature representations and improving the exploitation of weak labels compared to previous MIL methods. In [14], multiple sequence learning (MSL) is used. MSL uses a sequence of multiple instances as the optimization unit instead of a single instance as in MIL. In [4], an iterative weak and self-supervised classification framework is proposed, where the key idea is to add new data to the learning set. They use Bayesian classifiers to choose the most reliable segments for the weak and self-supervised paradigms.

There is very little work on supervised learning for anomaly detection since frame-level annotation is very hard to get. Two examples are [15] and [13].

2.2 Fight Detection Using Action Recognition

Detection of fights in videos mostly follows the approach of action recognition. It is simply a binary classification task to classify fight or non-fight actions. Typical methods include 2D CNN feature extraction followed by some type of RNN, or 3D CNN feature extraction [1,19,21,27,29]. Early work uses the Hockey and Peliculas datasets [2], among others, which are easy tasks. With pre-trained 2D or 3D CNN feature extraction plus some feature aggregation techniques, the prediction accuracy can be very close to 100% [21,27,29]. Later, a few more realistic datasets from surveillance or mobile cameras were made available, including the one in [1], the NTU-CCTV-Fight dataset [19], and the UBI-Fight dataset [4]. On the NTU-CCTV-Fight dataset, the frame mean average precision (mAP) is only 79.5% [19], and on the UBI-Fights dataset, the frame detection AUC is 81.9% [4].

3 Proposed Methods

In this section, we first compare supervised action recognition and weakly supervised anomaly detection for detecting fights in videos. Our purpose is not to present new action recognition or anomaly detection algorithms but to use existing algorithms to shed light on which direction we should go. This serves as a baseline for how well we can do and motivates the proposed solution.

3.1 Action Recognition of Fights

There are many approaches for action recognition on trimmed video clips, including 2D CNN+RNN and 3D CNN, with or without attention mechanisms, transformers, and many variations. Please refer to [11] for a review of action recognition. In this work, we use the standard I3D CNN action recognition network [3], the R(2+1)D CNN recognition network [24], and their variants with late temporal fusion using BERT [9], as shown in Fig. 1(a). This approach achieves SOTA performance on the UCF101 [20] and HMDB51 [12] action recognition datasets.

In video action recognition tasks, the videos are already trimmed to the correct action boundary. These videos can be short or long, with tens or thousands of frames. One important point is how to select the proper frames for the training and testing of the action recognition. A few studies propose choosing smartly which frames to use [8] or using all frames [16]. In typical action recognition networks, 16, 32, or 64 frames are used in a video snippet sample. Using too many frames in one training step can easily cause out-of-memory problems, since these frames need to be kept in GPU memory for the gradient calculation in backpropagation. When the video clip is too long, a typical method is first to cut the clip into shorter, equal-length segments, then sample a snippet of the given length at a random location in each segment. A problem with this sampling method is that the selected snippet may not be optimal for the action recognition training. We will discuss this problem later in this work with experiment results; a sketch of the multi-set sampling we use is given below.
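The following is a minimal sketch of the multi-set sampling just mentioned: several 16-frame snippets are drawn from a long clip and their backbone features are averaged before classification. The `backbone` and `classifier` callables are placeholders for any of the 3D CNN models used in this work, not a specific implementation from the paper.

```python
import torch

def classify_long_clip(frames, backbone, classifier, num_sets=4, snippet_len=16):
    """Sample several snippets from one long clip, average their backbone
    features, and classify once.

    frames: tensor (C, T, H, W) with T >= snippet_len.
    backbone: maps a (1, C, snippet_len, H, W) snippet to a feature vector.
    classifier: maps the averaged feature to class logits.
    """
    c, t, h, w = frames.shape
    feats = []
    for _ in range(num_sets):
        # Random temporal location for each snippet.
        start = torch.randint(0, t - snippet_len + 1, (1,)).item()
        snippet = frames[:, start:start + snippet_len].unsqueeze(0)
        feats.append(backbone(snippet))
    # Average features over the sampled sets before the classification layers.
    return classifier(torch.stack(feats).mean(dim=0))
```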

3.2 Anomaly Detection of Fights

To our knowledge, very little work treats the detection of fights as an anomaly detection task, except for [4]. This is probably because an annotated dataset is relatively easy to get, so people tend to use action recognition. The second reason is that people do not realize the power of anomaly detection when action recognition methods are available off the shelf. In reality, when we try to deploy such a system in a practical application, we usually find that there are always cases not covered by current datasets. In this case, the weak supervision method makes the work of preparing the new dataset a lot easier than for supervised action recognition.

We have reviewed the MIL-based anomaly detection methods in the previous section. We find that RTFM [23], shown in Fig. 1(b), gives excellent performance using a relatively simple architecture. The key innovation is using a temporal feature magnitude to discriminate abnormal snippets from normal ones. The AUC of RTFM on the UCF-Crime dataset is 84.03% when only the RGB modality is used with the I3D backbone.

In this work, we modify the RTFM framework [23] as our anomaly detector for fight detection in videos. In addition to the basic binary cross-entropy loss, the temporal smoothness loss, and the sparsity loss [22], a new loss on the feature magnitude of the top-k selected abnormal and normal segments is introduced,

$$\mathrm{loss}_{FM} = \begin{cases} \max\big(0,\; m - d_k(X_i, X_j)\big), & \text{if } y_i = 1,\ y_j = 0 \\ 0, & \text{otherwise} \end{cases} \qquad (1)$$

where y is the label, X is a snippet, d_k is a distance function that separates the top-k magnitude segment features of the abnormal video y_i and the normal video y_j, and m is a predefined margin. The form of this loss function is similar to the MIL ranking loss in [22], but it is now applied to the feature magnitudes of the top-k magnitude segments.
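A minimal PyTorch sketch of Eq. (1) follows, assuming one plausible choice of d_k (the difference of mean top-k L2 feature magnitudes between the abnormal and normal video); the margin value and function name are illustrative, not taken from the paper.

```python
import torch

def feature_magnitude_loss(abn_feats, nor_feats, k=3, margin=100.0):
    """Sketch of the feature-magnitude margin loss of Eq. (1).

    abn_feats: (num_segments, feat_dim) features of one abnormal video (y=1).
    nor_feats: (num_segments, feat_dim) features of one normal video (y=0).
    The top-k segments by L2 feature magnitude are selected from each video,
    and their mean magnitudes are pushed apart by at least `margin`.
    """
    abn_mag = abn_feats.norm(p=2, dim=1).topk(k).values.mean()
    nor_mag = nor_feats.norm(p=2, dim=1).topk(k).values.mean()
    # d_k is realised here as the difference of mean top-k magnitudes.
    return torch.relu(margin - (abn_mag - nor_mag))
```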

3.3 Iterative Anomaly Detection and Action Recognition

In addition to the extensive comparative study of anomaly detection and action recognition, we propose to use anomaly detection and action recognition iteratively, as shown in Fig. 1. Our novelty is to use anomaly detection to find good-quality training data for the action recognition. Since the UBI-Fight dataset [4] has frame-level annotation, we use it extensively in our study. When anomaly detection is used, we only use the video-level annotation and ignore (or assume unknown) the frame-level annotations. We use the frame-level annotation in the supervised action recognition.

In the anomaly detection dataset, we know there are one or more abnormal snippets in every abnormal positive video. In every normal negative video, we know all snippets are negative. So, the intuitive thinking is that we only need to generate positive training data for action recognition. In our study, we find that the negative training data can also be refined. When random data sampling is used, all samples have an equal chance of being picked for a training epoch. However, not all samples contribute equally to the training process. The training prefers hard negative samples, which increase the model's discrimination capability. They help not only the performance but also the convergence.

In the RTFM training process, every video is divided into 32 equal-length segments, the same as what is done in [22]. The features of all 16-frame snippets in a segment are averaged as the feature for this segment. The top three segments whose feature magnitudes are the largest are picked, and their average magnitude is used in training. At inference time, the original video snippets are used, and a classification score is generated for every snippet. This score is used to pick the snippets for the action recognition. Please note that this is different from using the classification score in [22], where the score is used to pick the most probable segment in training. In RTFM, the score is used at inference time after the training is already done.


This score is typically large for an abnormal positive sample, very close to 1.0. We notice that when RTFM converges too fast or overfits, the picked snippets may contain wrong results, meaning that normal snippets are picked as abnormal snippets, which is very detrimental to the action recognition performance. So we use a threshold such as 0.995 and pick snippets whose scores are higher than it. Normal negative samples' scores are typically small but can be large. Our strategy is to use a threshold around 0.5; the samples whose scores are larger than this threshold are hard samples. One other factor we need to consider is balancing the number of generated positive and negative samples. We can adjust this threshold to make the two counts close to each other. A sketch of this selection rule is given below.

There are two cases in which this iterative method can be used. In anomaly detection, we use the I3D backbone [3] pre-trained on the Imagenet [5] and Kinetics-400 [10] datasets. We use the above method to generate training samples for the action recognition network. In the first case, the backbones of the anomaly detection and the action recognition are the same. In this case, after the first iteration of the action recognition, its updated backbone can be used to regenerate the video features for anomaly detection in the second iteration. In the second case, the backbone of the action recognition network is different from that of the anomaly detection network. Then, after the anomaly detection network generates training samples for the action recognition, this process can stop. Even though the action recognition network could regenerate video features for anomaly detection, the anomaly detection would then become a different network; this process is not preferred and is not tested in our work. In this sense, the anomaly detection network is more like a toolbox for generating training data for a different action recognition network.
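The selection rule described above can be written down in a few lines; the thresholds 0.995 and 0.5 come from the text, while the function name is a hypothetical helper.

```python
def select_training_snippets(snippet_scores, video_label,
                             pos_thresh=0.995, hard_neg_thresh=0.5):
    """Pick snippets for action-recognition training from per-snippet anomaly
    scores produced by the trained anomaly detector in inference mode.

    snippet_scores: list of floats, one anomaly score per snippet of a video.
    video_label: 1 for an abnormal (fight) video, 0 for a normal video.
    Returns the indices of the selected snippets: confident positives from
    abnormal videos, hard negatives from normal videos.
    """
    if video_label == 1:
        return [i for i, s in enumerate(snippet_scores) if s > pos_thresh]
    return [i for i, s in enumerate(snippet_scores) if s > hard_neg_thresh]
```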

4 Experiments

4.1 Datasets

A small fight detection dataset in [1] is used. It consists of trimmed video clips suitable for action recognition and includes 150 fight videos and 150 normal videos.

UBI-Fight. The UBI-Fights dataset is released in [4]. It provides a wide diversity of fighting scenarios. It includes 80 h of video fully annotated at the frame level, consisting of 1000 videos, where 216 videos contain a fight event and 784 are normal daily life situations.

NTU-CCTV-Fight. The NTU-CCTV-Fight dataset is released in [19]. It includes 1000 videos, each including at least one fight event. No normal videos are included. Frame-level annotations are provided in a JSON format. It mixes two types of cameras - surveillance and mobile cameras - making fights harder to detect in this dataset.


Table 1. Action recognition results on three fight detection datasets. Notes: [1] this method uses weakly supervised anomaly detection; we put it here for comparison with our action recognition results. [2] we use multiple sets of snippets and average the features as input to the classification layers for better performance.

| Dataset | Method | Frames | Acc | AUC | mAP |
| Dataset [1] | 2DCNN+Bi-LSTM | 5 random | 0.72 | – | – |
| Dataset [1] | Ours R(2+1)D-BERT | 32 random | 0.9562 | – | – |
| UBI-Fight [4] | Iterative WS/SS [1] | 16 avg | – | 0.819 | – |
| UBI-Fight | Ours R(2+1)D-BERT | 32 random | 0.9177 | 0.9150 | – |
| UBI-Fight | Ours I3D | 32 random | 0.8861 | 0.9058 | – |
| UBI-Fight | Ours I3D | 16 random [2] | 0.80 | 0.8149 | – |
| NTU-CCTV-Fight [19] | RGB+Flow-2DCNN | 16 | – | – | 0.795 |
| NTU-CCTV-Fight | Ours R(2+1)D-BERT | 32 random | 0.8244 | 0.8715 | 0.8275 |

4.2 Implementation Details

For anomaly detection, we use the RTFM codebase [23] in PyTorch. We use the I3D network with a Resnet18 backbone [3] for the video feature generation. A learning rate of 1e-3 is used, and the training runs for 100 epochs. Two dataset iterators, one for the abnormal data and the other for the normal data, are used. This way, the pairing of abnormal and normal data is random, even when the numbers of abnormal and normal samples are different (see the sketch below).

For the action recognition, we also use a codebase in PyTorch. We use the I3D network with a Resnet18 backbone [3] and the R(2+1)D network with a Resnet34 backbone [24]. We use BERT-based late temporal fusion similar to [9]. For the I3D network, we use the model pre-trained on the Imagenet [5] and Kinetics-400 [10] datasets. For the R(2+1)D network, we use the model pre-trained on the IG65 dataset [7]. Our initial learning rate is 1e-5, and the learning rate is reduced by a factor of 0.1 on a plateau with a patience of 5 epochs. The checkpoint with the best validation accuracy is saved for evaluation on the testing dataset.

Evaluation Metric. For action recognition, we use accuracy as the metric. The frame AUC is used for all results on the UBI-Fight dataset [4], while the frame mAP is used for results on the NTU-CCTV-Fight dataset [19].
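The following is a minimal sketch of the paired abnormal/normal iterators mentioned above; the toy tensor datasets (sized after UBI-Fights) and the loop body are illustrative stand-ins, not the paper's actual data pipeline.

```python
import itertools
import torch
from torch.utils.data import DataLoader, TensorDataset

# Toy stand-ins for pre-extracted segment features (sizes intentionally unequal).
abnormal_set = TensorDataset(torch.randn(216, 32, 1024))  # 216 abnormal videos
normal_set = TensorDataset(torch.randn(784, 32, 1024))    # 784 normal videos

abnormal_loader = DataLoader(abnormal_set, batch_size=1, shuffle=True)
normal_loader = DataLoader(normal_set, batch_size=1, shuffle=True)

# Cycling one iterator lets every step see a random abnormal/normal pair even
# though the two sets have different sizes.
for (abn_feats,), (nor_feats,) in zip(abnormal_loader, itertools.cycle(normal_loader)):
    pass  # compute the ranking / feature-magnitude losses on this pair
```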

4.3 Action Recognition Results

We first show the action recognition results in accuracy (Acc, 4th column), frame AUC (5th column), and mAP (6th column) on a few fight detection datasets. These results all outperform previous SOTA results. They also serve as a baseline or bound for our iterative anomaly detection and action recognition approach.


Table 2. Anomaly detection results vs. action recognition results on the UBI-Fight dataset. AR means action recognition.

| Method | Crop | AUC |
| I3D16F action recognition (AR) | 1-crop | 0.8149 |
| RTFM | 10-crop | 0.9192 |
| RTFM | 1-crop | 0.9097 |
| RTFM w/o MSNL | 1-crop | 0.8692 |
| RTFM-Action Recognition (AR) | 1-crop | 0.9129 |
| RTFM-AR w/o MSNL | 1-crop | 0.8708 |
| RTFM-AR w/o MSNL, 1-snippet | 1-crop | 0.8408 |

Shown in Table 1 are the results where we test both the standard I3D [3] and the more advanced R(2+1)D-BERT [9]. Other than the backbone and classification network, the number of frames input to the 3D CNN and how they are sampled are also important. For long action recognition samples, an ideal sampling method is to use all available frames in every epoch of training, similar to what is done in anomaly detection. However, due to limited GPU memory, this is very hard to do if the backbone network is being trained. In our test, we find that I3D with 16 frames does not perform well and suspect that the small number of frames may be one of the causes. So we sample multiple sets (typically 4 or 8) of 16 frames in an epoch and average the features before they are fed to the classification layers. The accuracy improves from 0.778 to 0.800 in Table 1.

In this subsection, we focus on the anomaly detection results on the UBI-Fight dataset [4]. We use the anomaly detection method of RTFM [23]. All videos are first divided into 32 equal-length segments. Then features are extracted for all 16-frame snippets sequentially in every segment, and these features are averaged as the representative feature for this segment. The standard I3D pre-trained on the Imagenet is used. The dimension of the feature is 1024.

Shown in Table 2 are the anomaly detection results with their corresponding action recognition results for reference. Since all snippets use 16 frames, we put the I3D with 16 frames (I3D16F) in the table for comparison. The original RTFM uses 10-crop data augmentation. However, in practice, particularly at inference time, it is impossible to use 10-crop, so we use the 1-crop result as our baseline, with an AUC of 0.9097. We see that this value is much better than the I3D16F result for a few reasons. First, the RTFM uses a powerful MS-NL aggregation neck before the classification head. When we turn off this aggregation neck, the AUC becomes 0.8692, which is still better than the I3D16F's 0.8149. We do more analysis and find that this is partly due to using all 16-frame snippets in the segment. As explained in Subsect. 3.1, in every epoch of the action recognition training, due to the limit of the GPU memory, only one set of 16 frames is randomly sampled. We do an experiment in Subsect. 4.3, where we use multiple sets of 16 frames and achieve better performance. This effect becomes less noticeable when 32- or 64-frame snippets are used. In anomaly detection, since the backbone is not trained, the features of all snippets can be calculated and used in the training of the classification head. The second cause is the nature of the MIL in the anomaly detection [22]. The MIL uses the segments whose scores, or feature magnitudes in the RTFM, are the maximum, instead of randomly chosen ones. Particularly on the negative samples, this has the effect of using hard samples. We do a test to confirm this effect, where we randomly choose a negative segment, and the performance of the anomaly detection gets worse.

4.4 Anomaly Detection or Action Recognition?

We devise a set of supervised anomaly detection experiments for our analysis. Instead of using the whole abnormal videos as positive samples in the RTFM, we use the frame-level annotation and extract the ground truth positive portions as the positive samples. These are the same as the trimmed positive samples used in the training of action recognition in Subsect. 4.3. There are no changes in the negative samples. We remove all other loss functions and only use the BCE loss. We also remove the selection of the top-3 feature-magnitude segments and replace it with a random choice. Since all the positive segments use ground truth abnormal portions, this setup cannot make mistakes in choosing normal snippets as abnormal ones. This essentially makes the RTFM a different implementation of action recognition. There are still two differences. The first is that the backbone is not trained. The second is that all snippets of a segment are still used for feature calculation. The results are shown in Table 2, denoted by RTFM-AR. The AUCs are now a little higher than the ones using weak supervision. The AUC for the one without MSNL is still higher than that of the I3D16F, whose AUC is 0.8149.

Going one step further, we remove the feature averaging over all snippets in a segment and use one randomly chosen snippet per segment instead. At the same time, we reduce the number of segments per video and change other settings so that the effective batch number is equivalent to that in the action recognition. The AUC reduces significantly to 0.8408, but is still higher than that of the I3D16F. This is very likely because, in the RTFM action recognition, the training runs a lot faster since the I3D backbone is not trained. As more epochs are run, all the snippets have a better chance of being picked. In the standard I3D16F, since the backbone is trained, the training runs very slowly, and some snippets may never get a chance to be picked.

From these experiments, we notice that with the same backbone and a similar classification head, the weakly supervised anomaly detection (RTFM w/o MSNL) and supervised action recognition (RTFM-AR w/o MSNL) achieve almost the same performance. With the additional MSNL neck, anomaly detection achieves better performance. Since the data preparation and training of anomaly detection are much easier, we conclude that anomaly detection is preferred for the detection of a single-type abnormal event in videos.

4.5 Iterative Anomaly Detection and Action Recognition

Fig. 1. Pipeline of iterative anomaly detection and action recognition.

In this subsection, we explore the iterative anomaly detection and action recognition, as shown in Fig. 1. The primary purpose is to find out whether anomaly detection can be used as a practical toolbox to generate training samples for standard action recognition and achieve the best performance of supervised action recognition.

In the first iteration, we train the RTFM [23] anomaly detector using features extracted from an I3D network pre-trained on the Imagenet. To have the best-quality snippet selection for the action recognition, we keep the MSNL aggregation neck in the RTFM. After the training is done, we run the training data through the RTFM in inference mode. Snippets from the abnormal videos whose scores are larger than 0.995 are selected as positive samples, and snippets from the normal videos whose scores are larger than 0.5 are selected as hard negative samples. Then we train the standard I3D16F action recognition network using these generated training samples. Please note that the validation and test data do not go through the anomaly detection generation. This completes the first iteration of the anomaly detection and action recognition interaction.

In the second iteration, the updated I3D backbone is used to generate a new dataset for anomaly detection, and the same RTFM training process follows on the newly generated data. Similarly, new training data are generated for the action recognition, and the action recognition is retrained. This process can keep going for a few iterations.

Iter RTFM I3D-AR

UBI-Fight UBI-Fight UBI-Fight

0-th – 0.8149 (AUC) 1st 0.9097 0.8789 2nd 0.9279 0.8115

NTU-CCTV-Fight 0-th – 0.8275(mAP) NTU-CCTV-Fight 1st 0.8140 0.8070

686

W. Tan and J. Liu

Shown in Table 3 are the anomaly detection and action recognition AUCs in the first two iterations. We call the previous standard action recognition the 0-th iteration. It is observed that the I3D16F AUC improves from 0.8149 to 0.8789 in the first iteration. This proves that the RTFM anomaly detector can be used to generate training samples for the I3D16F action recognition. We do some deeper analysis and find that the hard negative samples contribute to this performance improvement. If the I3D16F action recognition can be trained well enough and as many as possible snippets can be used, we believe that the I3D16F itself can also achieve this AUC of 0.8789. However, the AUC does not improve further in the 2nd iteration. Analysis shows that even though the RTFM AUC improves a little bit from the 1st iteration, the scores’ values also get larger. In this sense, the RTFM is overfitted, and more normal video snippets have scores close to 1; therefore are selected erroneously as positive samples for the action recognition. That is why the I3D16F action recognition AUC goes back to 0.8115 in the 2nd iteration. We repeat this same experiment on the NTU-CCTV-Fight dataset but with I3D32F since 32-frames perform better than 16-frames. Since there are surveillance cameras and mobile cameras mixed in this dataset, the fights are harder to detect in this dataset. In the first iteration, The anomaly detection’s AUC is 0.8140, and the I3D32F action recognition’s mAP is 0.8070, worse than that in the 0-th iteration, which is 0.8275. Our analysis shows that the anomaly detection performance must be good enough to generate good quality training samples for action recognition. In Table 2, the anomaly detection AUC is 0.9097, while it is only 0.8140 in Table 3. We conjecture that there exists a threshold between 0.8140 and 0.9097. When the anomaly detection’s AUC is larger than the threshold, the generated training samples will help the performance of the subsequent action recognition. Since the performance does not improve in the 1st iteration, we do not continue the 2nd iteration. But overall, the best action recognition performance (0.8789 AUC on UBI-Fight) is about the same as that of the anomaly detection (0.8692 RTFM w/o MSNL on UBI-Fight in Table 2). 4.6

Comparison with SOTA Results

Our results in Table 1 are better than existing SOTA results on the three datasets. Furthermore, if we take the best result out of all our studies, the best AUC on the UBI-Fight dataset [4] is 0.9192 in Table 2.

5

Conclusion

In this paper, we do a comparison study of weakly supervised anomaly detection and supervised action recognition. Based on our experiment results, we find that anomaly detection can work as well as or even better than action recognition. Since the weak supervision annotations are a lot easier to get, anomaly detection is a very good choice for detecting single-type abnormal events in videos. In

Detection of Fights

687

addition to the detection of fights in videos, we also study the detection of gun events in videos and find that the conclusion is the same. We also find that anomaly detection can be used as a toolbox for the generation of training data for supervised action recognition in some conditions. In this study, the condition is that the anomaly detection’s AUC in the first iteration must be good enough. This may not generalize to other different scenarios.

References 1. Aktı, S., Tataro˘ glu, G., Ekenel, H.: Vision-based fight detection from surveillance cameras. In: IEEE/EURASIP 9th International Conference on Image Processing Theory, Tools and Applications (2019) 2. Bermejo Nievas, E., Deniz Suarez, O., Bueno Garc´ıa, G., Sukthankar, R.: Violence detection in video using computer vision techniques. In: Real, P., Diaz-Pernil, D., Molina-Abril, H., Berciano, A., Kropatsch, W. (eds.) CAIP 2011. LNCS, vol. 6855, pp. 332–339. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3642-23678-5 39 3. Carreira, J., Zissermana, A.: Quo vadis, action recognition? A new model and the kinetics dataset. In: CVPR (2017) 4. Degardin, B., Proen¸ca, H.: Human activity analysis: iterative weak/self-supervised learning frameworks for detecting abnormal events. In: 2020 IEEE International Joint Conference on Biometrics (IJCB), pp. 1–7. IEEE (2019) 5. Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., Fei-Fei, L.: ImageNet: a large-scale hierarchical image database. In: CVPR 2009 (2009) 6. Feng, J.C., Hong, F.T., Zheng, W.S.: MIST: multiple instance self-training framework for video anomaly detection. In: CVPR (2021) 7. Ghadiyaram, D., Feiszli, M., Tran, D., Yan, X., Wang, H., Mahajan, D.: Large-scale weakly-supervised pre-training for video action recognition. In: CVPR (2019) 8. Gowda, S.N., Rohrbach, M., Sevilla-Lara, L.: Towards understanding action recognition smart frame selection for action recognition. In: AAAI (2021) 9. Kalfaoglu, M.E., Kalkan, S., Alatan, A.A.: Late temporal modeling in 3D CNN architectures with BERT for action recognition. In: Bartoli, A., Fusiello, A. (eds.) ECCV 2020. LNCS, vol. 12539, pp. 731–747. Springer, Cham (2020). https://doi. org/10.1007/978-3-030-68238-5 48 10. Kay, W., et al.: The kinetics human action video dataset. arXiv preprint 1705.06950 (2017) 11. Kong, Y., Fu, Y.: Human action recognition and prediction: a survey. Int. J. Comput. Vis. 130, 1366–1401 (2021) 12. Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: HMDB: a large video database for human motion recognition. In: International Conference on Computer Vision (ICCV) (2011) 13. Landi, F., Snoek, C.G.M., Gucchiara, R.: Anomaly locality in video surveillance. arXiv preprint 1901(10364), p. 10 (2019). https://doi.org/10.48550/ARXIV.1901. 10364, https://arxiv.org/abs/1901.10364 14. Li, S., Liu, F., Jiao, L.: Self-training multi-sequence learning with transformer for weakly supervised video anomaly detection. In: AAAI (2022) 15. Liu, K., Ma, H.: Exploring background-bias for anomaly detection in surveillance videos. In: ACM MM (2019)

688

W. Tan and J. Liu

16. Liu, X., Pintea, S.L., Nejadasl, F.K., Booij, O.: No frame left behind: Full video action recognition. In: CVPR (2021) 17. Lv, H., Zhou, C., Cui, Z., Xu, C., Li, Y., Yang, J.: Localizing anomalies from weakly-labeled videos. IEEE Trans. Image Process. (TIP) 30, 4505–4515 (2021) 18. Pang, G., Shen, C., Cao, L., Hengel, A.V.D.: Deep learning for anomaly detection: a review. ACM Comput. Surv. 54(2), 1–38 (2021) 19. Perez, M., Kot, A.C., Rocha, A.: Detection of real-world fights in surveillance videos. In: IEEE ICASSP (2019) 20. Soomro, K., Zamir, A.R., Shah, M.: UCF101: a dataset of 101 human action classes from videos in the wild. CRCV-TR-12-01 (2012) 21. Sudhakaran, S., Lanz, S.: Learning to detect violent videos using convolutional long short-term memory. In: 2017 14th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS), pp. 1–6 (2017). https://doi.org/10. 1109/AVSS.2017.8078468 22. Sultani1, W., Chen, C., Shah, M.: Real-world anomaly detection in surveillance videos. In: CVPR (2018) 23. Tian, Y., Pang, G., Chen, Y., Singh, R., Verjans, J., Carneiro, G.: Weaklysupervised video anomaly detection with robust temporal feature magnitude learning. arXiv preprint arXiv:2101.10030 (2021) 24. Tran, D., Wang, H., Torresani, L., Ray, J., LeCun, Y., Paluri, M.: A closer look at spatiotemporal convolutions for action recognition. In: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 6450–6459 (2018). https:// doi.org/10.1109/CVPR.2018.00675 25. Wu, J., et al.: Weakly-supervised spatio-temporal anomaly detection in surveillance video. In: IJCAI (2021) 26. Wu, P., Liu, J.: Learning causal temporal relation and feature discrimination for anomaly detection. IEEE Trans. Image Process. 30, 3513–3527 (2021). https:// doi.org/10.1109/TIP.2021.3062192 27. Xu, Q., See, J., Lin, W.: Localization guided fight action detection in surveillance videos. In: 2019 IEEE International Conference on Multimedia and Expo (ICME), pp. 568–573 (2019). https://doi.org/10.1109/ICME.2019.00104 28. Zhong, J.X., Li, N., Kong, W., Liu, S., Li, T.H., Li, G.: Graph convolutional label noise cleaner: train a plug-and-play action classifier for anomaly detection. In: CVPR (2019) 29. Zhou, P., Ding, Q., Luo, H., Hou, X.: Violent interaction detection in video based on deep learning. J. Phys. Conf. Ser. 844, 012044 (2017)

Privacy-Preserving Person Detection Using Low-Resolution Infrared Cameras Thomas Dubail(B) , Fidel Alejandro Guerrero Pe˜ na , Heitor Rapela Medeiros , Masih Aminbeidokhti , Eric Granger , and Marco Pedersoli Laboratoire d’imagerie, de vision et d’intelligence artificielle (LIVIA), Department of Systems Engineering, ETS Montreal, Montreal, Canada {thomas.dubail.1,fidel-alejandro.guerrero-pena.1, heitor.rapela-medeiros.1,masih.aminbeidokhti.1}@ens.etsmtl.ca, {eric.granger,marco.pedersoli}@etsmtl.ca

Abstract. In intelligent building management, knowing the number of people and their location in a room are important for better control of its illumination, ventilation, and heating with reduced costs and improved comfort. This is typically achieved by detecting people using compact embedded devices that are installed on the room’s ceiling, and that integrate low-resolution infrared camera, which conceals each person’s identity. However, for accurate detection, state-of-the-art deep learning models still require supervised training using a large annotated dataset of images. In this paper, we investigate cost-effective methods that are suitable for person detection based on low-resolution infrared images. Results indicate that for such images, we can reduce the amount of supervision and computation, while still achieving a high level of detection accuracy. Going from single-shot detectors that require bounding box annotations of each person in an image, to auto-encoders that only rely on unlabelled images that do not contain people, allows for considerable savings in terms of annotation costs, and for models with lower computational costs. We validate these experimental findings on two challenging topview datasets with low-resolution infrared images. Keywords: Deep learning · Privacy-preserving person detection · Low-resolution infrared images · Weak supervision · Embedded systems

1

Introduction

Intelligent building management solutions seek to maximize the comfort of occupants, while minimizing energy consumption. These types of solutions are crucial for reducing the use of fossil fuels with a direct impact on the environment. Such energy-saving is usually performed by adaptively controlling lighting, heating, ventilation, and air-conditioning (HVAC) systems based on building occupancy, and in particular the number of people present in a given room. For this, lowcost methods are needed to assess the level of room occupancy, and efficiently control the different systems within the building. c The Author(s), under exclusive license to Springer Nature Switzerland AG 2023  L. Karlinsky et al. (Eds.): ECCV 2022 Workshops, LNCS 13805, pp. 689–702, 2023. https://doi.org/10.1007/978-3-031-25072-9_46

690

T. Dubail et al.

Among the different levels of occupancy information that can be extracted in an intelligent building [25], Sun et al. define the location of occupants as the most important and fine-grained for smart building control. Given the recent advances in machine learning and computer vision, most solutions usually rely on deep convolutional neural networks (CNNs) to detect people [7,8]. Despite the high level of accuracy that can be achieved with CNNs for visual object detection based on RGB images, their implementation for real-world video surveillance applications incurs in a high computational complexity, privacy issues, and gender and race biases [4,22]. Finally, building occupancy management solutions are typically implemented on compact embedded devices, rigidly installed on the ceiling or portals of rooms, and integrating inexpensive cameras that can capture low-resolution IR images. To mitigate these issues, He et al. [9] have proposed a privacy-preserving object detector that blurs people’s faces before performing detection. To strengthen the detector against gender/race biases, the same authors proposed a face-swapping variation that also preserves privacy at the cost of increased computational complexity. Regardless of the good performance, their approach does not ensure confidentiality at the acquisition level, relying on RGB sensors to build the solution. Furthermore, their detector was designed for fully annotated settings using COCO [16] as the base dataset. This makes it difficult to generalize to people detection under different capture conditions (like when cameras are located on the ceiling on compact embedded systems), and extreme changes in the environment. In addition, it is difficult to collect and annotate image data to train or fine-tune CNN-based object detectors for a given application, so weakly-supervised or unsupervised training is a promising approach. In contrast, our work tackles the occupants location problem by detecting people in infrared (IR) images at low resolution, which avoids most of the abovementioned issues on privacy. Low resolution not only reduces computational complexity but also improves privacy, i.e., a detection on high-resolution infrared images would not be enough as it is possible to re-identify people [32]. More specifically, we analyse people detection with different levels of supervision. In this work, we compare unsupervised, weakly-supervised and fully supervised solutions. This is an essential aspect of the detection pipeline since producing bounding box annotations is very expensive, and there is a lack of good opensource object detection datasets for low-resolution infrared scenarios. In fact, reducing the level of supervision can lead to improved scalability for real applications and reduced computation, which is important considering the use of the proposed algorithms on embedded devices. The contributions of this work are the following. (i) We propose cost-effective methods for estimating room occupancy under a low-supervision regime based on low-resolution IR images, while preserving users privacy. (ii) We provide an extensive empirical comparison of several cost-effective methods that are suitable for person detection using low-resolution IR cameras. Results indicate that, using top-view low-resolution images, methods that rely on weakly-labeled image data can provide good detection results, and thus save annotation efforts and reduce

Privacy-Preserving Person Detection Using Low-Resolution Infrared Cameras

691

the required complexity of the detection model. (iii) To investigate the performance of person detecting methods on low-resolution IR images, our results are shown in two challenging datasets – the FIR-Image-Action and DistechIR datasets. Finally, we provide bounding box annotations for the FIR-ImageAction [31] dataset1 .

2

Related Work

Privacy-preserving methods are of great interest to the scientific community. As a result, multiple approaches have been proposed in the past to circumvent the challenge. Ryoo et al. [21] proposed a method for learning a transformation that obtained multiple low-resolution images from a high-resolution RGB source. The method proved effective for action recognition even when inputs’ resolution were down-sampled to 16 × 12. These findings were validated by others [6,29] using similar downsampling-based techniques to anonymize the people displayed in the images. On the other hand, recent approaches [9,20] have focused on producing blurred or artificial versions of people’s faces while preserving the rest of the image intact. These methods usually rely on Generative Adversarial Networks (GANs) to preserve image utility, while producing unidentifiable faces. Specifically, the method of He et al. [9] is one of the first approaches to apply such anonymization in an object detection task. Despite recent advances in the area, these proposals are specialized for RGB images, which are anonymized after the undisclosed acquisition. As an alternative to RGB cameras, others authors [15,26–28] have proposed using low-resolution IR cameras to preserve anonymity at the phase of image acquisition. While most of these works target action recognition task, we focus on people detection as in [5,24]. Complementary to privacy preservation, this work also focuses on studying detection methods with different levels of supervision. Such a study aims to find techniques that reduce annotation costs without a significant performance reduction when compared with fully supervised approaches. In this work, we followed the auto-encoders (AE)-based anomaly-detection method similar to the one proposed by Baur et al. [2]. The technique is used to learn the distribution of typical cases, and applied later to identify abnormal regions within the image. However, different from their proposal, we focus on object detection instead of image segmentation. We also evaluate weakly-supervised detection methods based on Class Activation Maps (CAMs) [23,33] following the authors algorithm. Nonetheless, we use a customized CNN to be consistent with our low-resolution inputs. Finally, we consider also the fully supervised techniques Single Shot Detector [17] and Yolo v5 [11] as upper-bound references, sharing the same backbone as the previously described approaches. The aim of this work is not to use the latest developments for each type of technique. Instead, we aim to achieve a good trade-off between performance and computational complexity with simple and commonly used techniques. In particular, we favor simple 1

https://github.com/ThomasDubail/FIR-Image-Action-Localisation-Dataset.

692

T. Dubail et al.

occupants’ detection approaches that can run in embedded devices and with capabilities for handling low-resolution infrared images.

3

Person Detection with Different Levels of Supervision

Several cost-effective methods may be suitable for person detection using lowresolution IR cameras installed on ceilings. The goal of object detection is to find a mapping fθ such that fθ (x) = z, where z are the probabilities that a bounding box belongs to each class. Note that such a mapping can be obtained using any level of supervision. In this paper, we seek to compare the detection accuracy of methods that rely on different levels of image annotation, and thereby assess the complexity needed to design embedded person detection systems. In this work we consider a “fully annotated” dataset of IR images at low resolution F = {(x0 , b0 ), (x1 , b1 ), ..., (xN , bN )}, with x an IR image and b = {(c0 , d0 , w0 , h0 ), ..., (cB , dB , wB , hB )} rectangular regions enclosing the objects of interest, also known as bounding boxes. Without loss of generality, we use a center pixel representation (center x, center y, width, height) [17] for defining the bounding boxes. In the given formulation, all bounding boxes belong to the persons’ category. Consequently, a “weakly annotated” IR dataset is defined as W = {(x0 , y0 ), ..., (xN , yN )} in which yi ∈ {0, 1} corresponds to an image-level annotation indicating whether a person is present (yi = 1) or not (yi = 0) in the image xi . Finally, at the lowest level of supervision, an “unlabeled” dataset containing only IR images without annotations is expressed as U = {xi }. Please note that for this study, the datasets F, W, and U are drawn from the same pool of IR images but with different levels of annotations. The rest of this section details the methods compared in this paper, each one trained according to a different level of supervision. Here, the backbone of all deep learning based methods remained the same. We focus on low-cost methods that can potentially be implemented on compact embedded devices. 3.1

Detection Through Thresholding

Let x be an IR thermal image from U. The people within the images appear as high-temperature blobs easily distinguishable from the low-temperature background, Fig. 1a. Such a property allows us to directly apply a threshold-based mapping gτ (x) = Φ(x ≥ τ ) to obtain the persons’ location. In this formulation, we use · to refer to the Iverson bracket notation, which denotes the binarization of the image x according to the threshold τ , Fig. 1b. Here, the value of τ is manually determined according to a validation set, or automatically following Otsu’s method [19], hereafter referred as Threshold and Otsu’s Threshold respectively. Finally, a mapping Φ converts the segmentation map into a bounding box by taking the minimum and maximum pixels from each binary blob (see Fig. 1c). This method is more seen as a post-processing step for the following methods, although we have evaluated it to have a lower bound for detection.

Privacy-Preserving Person Detection Using Low-Resolution Infrared Cameras

(a)

(b)

693

(c)

Fig. 1. Example of an IR image (a) that is binarized using the threshold τ = 29 (b), and represented as a bounding box (c).

3.2

Unsupervised Anomaly-Based Detection Using Auto-encoders

For this next approach, people are considered anomalies for the distribution of empty rooms. In this work, we follow the method proposed by Baur et al. [2] to model the background distribution using an auto-encoder (AE) fθ trained using only empty rooms, W 0 = {xi | yi = 0}. Such an approach acts as a background reconstruction technique whenever an anomaly is present, i.e., the AE will not be able to reconstruct it. Then, we can highlight the anomaly by taking the difference between the input image and the obtained reconstruction, x − fθ (x) (see Fig. 2). Finally, the anomaly detection method for person detection is defined as Λθ,τ (x) = gτ (x − fθ (x)) where gτ is the thresholding technique explained in the previous section. Thus, the detection is performed in a two-step process: anomaly boosting and anomaly segmentation-localization, which can be done by setting a threshold τ . The encoder architecture comprises six convolutional layers with kernels of size 3 × 3. Max-pooling operations are used every two convolutional layers to increase the field of view. The decoder follows a symmetrical architecture of the encoder using transposed convolution with a stride of 2 as upsampling technique. The bottleneck uses a linear layer with 256 neurons which encode input information as a vector projected onto the latent space. Finally, a reconstruction loss

Encoder

Input

Decoder

Reconstruction

Residual

Fig. 2. Unsupervised anomaly-based method for person detection from low-resolution IR images.

694

T. Dubail et al.

is used to guide the training process and find a feasible minimizer θ for solving our background reconstruction task. In this work the Mean Square Error (MSE)  1 loss is used, LM SE (x, θ) = (xi − fθ (xi ))2 . In later sections we refer |W 0 | 0 xi ∈W

to this approach as deep auto-encoder or simply dAE. Besides the classical AE, several types of hourglass architectures have been used for anomaly detection [2]. One of the most popular versions are the variational auto-encoders (dVAE ) [13] where the latent vector is considered to be drawn from a given probabilistic distribution. Here we use the same architecture for both AE-based methods with a distinction for the loss function where the KL-divergence regularization is added to the MSE loss to enforce a normal distribution to the latent space. 3.3

3.3 Weakly-Supervised Detection Using Class Activation Mapping

Let (x, y) be a generic tuple from W, where x is an image and y ∈ {0, 1} is a category indicating whether a person is present in x or not. A weakly-supervised approach for object localization, by exploiting only the image-level annotation during training, learns a mapping c_ϕ to retrieve the object location during evaluation. This kind of approach, based on Class Activation Map (CAM) techniques [1,14,30], has been widely explored in the literature. The principle is based on the use of the compound function c_ϕ(x) = (c¹_ψ ∘ c⁰_φ)(x), which relies on a feature extractor c⁰_φ followed by a binary classifier c¹_ψ. Here c⁰_φ is implemented as a CNN and c¹_ψ as a multi-layer perceptron. Then, the following minimization based on the cross-entropy loss is performed in order to find the optimal set of weights:

$$\min_{\phi,\psi} \; -\sum_{(x,y)\in\mathcal{W}} y \cdot \log c_\phi(x).$$

Once a feasible set of weights is found, a non-parametric transformation function uses the output from the feature extractor c⁰_φ to produce an activation map for each category in the task, M(c⁰_φ(x)). Note that in this task, the computation of such an activation map is only performed whenever a positive classification is obtained, i.e., c_ϕ(x) ≥ 0.5. The architecture used for c⁰_φ is the same as in the encoder of the dAE technique, but without the max-pooling layers, to avoid losing resolution. In this work, we use three variants of the Global Average Pooling based mapping M. The first is the classic weighting approach proposed by [33], hereafter referred to as CAM, the second is the gradient-based CAM proposed by [23], known as GradCAM, and the last one is the hierarchical approach known as LayerCAM [10]. As in the previous techniques, the final localization is obtained using the thresholding-based mapping g_τ.
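The following is a minimal sketch of the classic CAM variant described above: a fully-convolutional feature extractor without pooling, a linear classifier on globally averaged features, and a heatmap obtained by weighting the last feature maps with the classifier weights. The layer count and channel width are assumptions for illustration only.

```python
import torch
import torch.nn as nn

class CAMClassifier(nn.Module):
    """Image-level person/no-person classifier whose last conv features are
    reused to build a class activation map (classic CAM; sizes assumed)."""

    def __init__(self, channels=32):
        super().__init__()
        self.features = nn.Sequential(            # c0_phi: no pooling, keeps 24x32
            nn.Conv2d(1, channels, 3, padding=1), nn.ReLU(),
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(),
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU())
        self.classifier = nn.Linear(channels, 1)   # c1_psi on globally pooled features

    def forward(self, x):
        fmap = self.features(x)                            # (N, C, H, W)
        logit = self.classifier(fmap.mean(dim=(2, 3)))     # global average pooling
        return torch.sigmoid(logit), fmap

    def cam(self, fmap):
        # Weight each feature map by the classifier weight of the "person" class.
        w = self.classifier.weight.view(1, -1, 1, 1)       # (1, C, 1, 1)
        heatmap = torch.relu((w * fmap).sum(dim=1))        # (N, H, W)
        return heatmap / (heatmap.amax(dim=(1, 2), keepdim=True) + 1e-8)

model = CAMClassifier()
prob, fmap = model(torch.rand(1, 1, 24, 32))
if prob.item() >= 0.5:                                     # only localize positives
    heat = model.cam(fmap)                                 # then threshold with g_tau
```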

3.4 Fully-Supervised Detection Using Single Shot Detectors

At the higher level of supervision, we explore the mapping function with bounding box annotations within the cost function. Let h_ϑ be a mapping parameterized over ϑ, which produces bounding box predictions. Among the different types of detectors existing in the literature, we used the Single Shot Detector (SSD) [17] and Yolo v5 [11] solutions, since they are suitable for low-resolution images and allow using a custom backbone without serious implications for the training process. In this study, we enforce the same architecture for feature extraction as in the AE and CAM approaches. However, unlike the previous techniques, such supervised mappings do not require composition with the thresholding function g_τ. Let (x, b) ∈ F be an IR image with its corresponding bounding box annotation. The learning process for both methods solves the optimization problem ϑ* = arg min_ϑ L(x, b, ϑ), minimizing the cost of the model L. Despite their differences in terms of representation, in both cases the loss function uses a supervised approach that measures the difference between the output z = h_ϑ(x) and the expected detections b.
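For illustration only, the snippet below shows the generic torchvision (≥ 0.13) training interface for an off-the-shelf SSD with a two-class head; it is not the custom low-resolution backbone used in this work, and the single fake IR frame (replicated to three channels) and box are placeholders.

```python
import torch
from torchvision.models.detection import ssd300_vgg16

# Off-the-shelf SSD with a 2-class head (background + person); the paper instead
# plugs the same small custom backbone as the AE/CAM models into the detector.
model = ssd300_vgg16(weights=None, weights_backbone=None, num_classes=2)
model.train()

# One fake low-resolution frame, replicated to 3 channels, with one person box.
images = [torch.rand(3, 24, 32)]
targets = [{
    "boxes": torch.tensor([[4.0, 6.0, 12.0, 20.0]]),  # (x1, y1, x2, y2)
    "labels": torch.tensor([1]),
}]

# torchvision detection models return a dict of losses in training mode.
loss_dict = model(images, targets)
loss = sum(loss_dict.values())
loss.backward()
```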

4 Experimental Methodology

4.1 Datasets

In this study, two datasets were used to assess IR person detection using models trained with different levels of annotation - the public FIR-Image-Action [31] dataset and our Distech-IR dataset.

1) FIR-Image-Action with bounding box annotations. The FIR-Image-Action [31] dataset includes 110 annotated videos. We randomly selected 36 videos from this pool for the test set and the other 74 for training and validation. Furthermore, training and validation sets were separated using a random selection of the frames (70% and 30%, respectively). All the approaches have been trained using the same data partition to ensure comparability. To the best of our knowledge, there are no low-resolution IR datasets with bounding box annotations for person detection. Therefore, we annotated this dataset at the bounding box level. The dataset was created by Haoyu Zhang of Visiongo Inc. for video-based action recognition. It offers 126 videos with a total combined duration of approximately 7 h. Since this study aims to evaluate the performance of different techniques for IR-based people localization, we only used the IR images provided by the authors for our experiments. Nevertheless, it is worth mentioning that two modalities are available within the dataset: RGB with a spatial resolution of 320 × 240 acquired at 24 FPS, and IR with a resolution of 32 × 24 sampled at 8 FPS. Although the RGB modality falls outside the scope of this work, we used it for obtaining bounding box annotations, as described later. As part of this work's contributions, we have created and publicly released the localization annotations for 110 videos out of the 126, for both RGB and IR modalities. Since there is redundancy within neighboring frames and our application does not require video processing, we further sampled the IR dataset, obtaining the equivalent of 2 FPS videos.

We used a semi-automated approach to obtain bounding box annotations for the challenging low-resolution IR images in FIR-Image-Action. First, we create bounding box annotations by hand for a randomly selected subset of the RGB frames. We carefully curated these bounding boxes to reduce the impact of misplacement when decreasing the resolution for the IR modality. Then, an SSD detector [17], h_ϑ, was trained over the annotated RGB subset and used afterward to obtain pseudo-labels over the remaining unannotated partition of the RGB dataset. A new randomly selected subset is then curated, and the h_ϑ training is repeated using a larger partition of the data. This process was repeated three times, resulting in a fully annotated version of the RGB dataset. Finally, bounding box annotations for the IR dataset are obtained by pairing images from both modalities, followed by a coordinate-alignment procedure. Since the IR and RGB videos were out of synchronization, the initial time shift was manually determined using an overlay visualization of both modalities (see Fig. 3c). Such a synchronization was performed individually for every video. The final IR localization annotations were obtained by linear interpolation of the bounding box coordinates from the labeled RGB dataset; the parameters for the alignment were estimated using linear regression (a sketch of the interpolation step is given below). An example of the obtained bounding box annotation for both RGB and IR modalities can be observed in Fig. 3a and b, respectively.

(a)

(b)

(c)

Fig. 3. Example of an RGB image with its ground truth (a), the corresponding IR image with the aligned bounding box (b), and an overlay of RGB and IR modalities (c) from the FIR-Image-Action dataset.

2) Distech-IR The second dataset, named hereafter Distech-IR, followed the same separation proportions containing 1500 images for training, 500 for validation, and 800 for testing. Such a dataset, similar to FIR-Image-Action, contains two modalities of images (RGB and IR) with their corresponding bounding box level annotations provided by Distech Controls Inc. The dataset reflects the increasing interest by the industry for privacy preserving-based solutions for person localization and constitute an actual use case for this task. The Distech-IR dataset also proved to reflect better real-world scenarios since it is composed of seven rooms with different levels of difficulties, i.e., heat radiating appliances, sun-facing windows, and more than one person per room. For simulating deployment, we used rooms not seen before during training for the test set. Figure 4 shows some examples of images from both datasets.

Privacy-Preserving Person Detection Using Low-Resolution Infrared Cameras

(a)

(b)

(c)

697

(d)

Fig. 4. Examples of IR images with their corresponding ground truth for FIR-ImageAction (a) and Distech-IR (b)-(d) datasets.

4.2

Implementation Details

We used normalization by 50◦ to ensure small scale within the input map. Adam optimizer [12] was employed with an initial learning rate of 10−4 and decay of 0.2 with a patience of ten epochs. Additionally, a 15 epoch patience early-stopping was implemented. Then, the best model according to the validation loss was selected for each case. The time calculations were evaluated on an Intel Xeon CPU at 2.3 Ghz, however the training of the models was done on an NVIDIA Tesla P100 GPU. Each experiment was performed 3 times with different seeds. The validation protocols are presented with each dataset in Sect. 4.1. 4.3

Performance Metrics

In this study, we use the optimal Localization Recall Precision (oLRP) [18] in order to characterize the ability of each method to detect the presence of people and locate them. This metric allows us to evaluate with the same measurement methods that provide bounding box detections without a score associated (such as AEs and CAMs) and methods with a detection score (as SSD and Yolo v5 detectors). Additionally, as stated by the authors, it reflects the localization quality more accurately than other measurements, providing separate measures for the different errors that a detection method can commit. The metric takes values between 0 and 100, with lower values being better. As part of the metric computation, we also calculate the localization (oLRPloc ), False Positive (oLRPF P ), and False Negative (oLRPF N ) components which provide more insights of methods behavior. Finally, execution time on the same hardware has been computed to obtain an approximation of the time complexity.

5

Results and Discussion

Tables 1 and 2 summarize the obtained results for FIR-Image-Action and Distech-IR datasets, respectively. As expected, the fully supervised approaches obtained the best performance for most metrics and comparable results to the other approaches regarding False Negatives. The methods proved useful for locating people in diverse scenarios, even under low-resolution settings. However, they

698

T. Dubail et al.

Table 1. Performance of detection methods on the FIR-Image-Action dataset. All metrics are calculated with an IoU of 0.5. Model

oLRP↓

Threshold

oLRPloc ↓ oLRPF P ↓ oLRPF N ↓ Time(ms)↓

86.5 ± 0.1 32.3

45.3

44.2

0.4

Otsu’s threshold 83.5 ± 0.1 31.6

45.7

27.7

0.7

dVAE

74.7 ± 1.3 31.2

26.6

24.5

13.0

dAE

77.4 ± 1.3 30.4

31.2

29.1

12.3

CAM

85.1 ± 1.1 34.3

41.2

29.0

11.4

GradCAM

85.5 ± 3.2 34.5

43.0

32.1

24.3

LayerCAM

84.8 ± 2.2 34.9

37.2

33.1

25.6

SSD

63.8 ± 2.7 25.3

12.6

18.6

46.6

Yolo v5

56.9 ± 1.8 25.5

6.3

6.2

45.9

take longer to execute than the second-best performed techniques, which is an important downside to take into account for measuring real-time occupancy levels in intelligent buildings. In particular, the dAE approach was 3.7 times faster than Yolo v5 and 3.8 times faster than SSD. We can refer to the AE-based anomaly detection approaches as second-placed strategies. Both dAE and dVAE showed similar performance in terms of LRP and efficiency between them. Furthermore, the techniques obtained localization performance comparable to the fully-supervised approaches, especially in more complex situations like the Distech-IR dataset. This result is remarkable considering that localization supervision is not used during AE training, and only empty room images were used (10% of the data in the FIR-Image-Action and 30% in Distech-IR). The methods also showed acceptable execution times for the application. Table 2. Performance of detection methods on the Distech-IR dataset. All metrics are calculated with an IoU of 0.5. Model

oLRP↓

Threshold

oLRPloc ↓ oLRPF P ↓ oLRPF N ↓ Time(ms)↓

93.6 ± 2.4 37.1

72.7

54.1

0.4

Otsu’s threshold 95.5 ± 1.2 34.2

83.3

50.0

0.7

dVAE

83.3 ± 8.9 33.2

32.7

40.6

13.0

dAE

82.7 ± 9.0 32.4

33.7

40.0

12.3

CAM

93.1 ± 1.9 37.6

59.7

52.3

11.4

GradCAM

91.6 ± 2.3 37.5

50.3

48.8

24.3

LayerCAM

91.1 ± 2.5 37.7

45.5

50.3

25.6

SSD

82.0 ± 7.2 31.1

26.3

44.7

46.6

Yolo v5

80.2 ± 7.7 30.2

31.4

37.4

45.9

Privacy-Preserving Person Detection Using Low-Resolution Infrared Cameras

(a)

(b)

(c)

699

(d)

Fig. 5. Examples of low-resolution IR people detection results. Overlay of RGB and IR modalities with their corresponding ground truth (a), along with bounding box predictions of dVAE (b), gradCAM (c), and Yolo v5 (d).

700

T. Dubail et al.

CAM methods provide a lower level of performance than other methods, despite having access to class-label annotations for training. Indeed, these methods are known to activate strongly for discriminant regions of an input image (since the backbone CNN is trained to discriminate classes), and be affected by complex image backgrounds [3]. These two factors affect its ability to define precise contours around a person. As can be seen in the tables, Otsu’s Threshold provides good person localization. However, it assumes a multi-modal intensity distribution for finding the threshold, which leads to false person localization in empty rooms. This effect can be observed by the high values of oLRPF P . The rest of the approaches showed comparable results to this last one but were still far from the fullysupervised process. Figure 5 shows some examples of the obtained result over the FIR-Image-Action for dVAE, gradCAM, and Yolo v5 methods. As expected, real scenarios like those depicted in Distech-IR proved harder to generalize. A decrease in the performance was observed in all levels of supervision with a significant drop of 18.2% oLRP for SSD and 23.3% for Yolo v5. A smaller decrease was observed for AEs obtaining even closer results to the one from supervised approaches. The primary issue in this dataset was the large number of False Negative which almost doubled the FN obtained for FIR-Image-Action.

6

Conclusions

In this work, we presented a study comprising different methods with increasing levels of supervision for privacy-preserving person localization. Our experimental results over two low-resolution top-view IR datasets showed that reduced image-level supervision is enough for achieving results almost comparable to a fully-supervised detectors. Specifically, AE-based approaches proved to perform similarly to Yolo v5 in real-world scenarios by only using images of empty rooms for training and with 3.7 times less execution time. Such a result is significant for reducing annotation costs and improving the scalability of intelligent building applications. Additionally, we detailed the process for producing bounding box annotations for low-resolution IR images and provided the localization for the publicly available dataset FIR-Image-Action. Acknowledgements. This work was supported by Distech Controls Inc., and the Natural Sciences and Engineering Research Council of Canada (RGPIN-2018-04825).

References 1. Bae, W., Noh, J., Kim, G.: Rethinking class activation mapping for weakly supervised object localization. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12360, pp. 618–634. Springer, Cham (2020). https://doi. org/10.1007/978-3-030-58555-6 37 2. Baur, C., Wiestler, B., Albarqouni, S., Navab, N.: Deep autoencoding models for unsupervised anomaly segmentation in brain MR images. arXiv:1804.04488 [cs] 11383, pp. 161–169 (2019)

Privacy-Preserving Person Detection Using Low-Resolution Infrared Cameras

701

3. Belharbi, S., Sarraf, A., Pedersoli, M., Ayed, I.B., McCaffrey, L., Granger, E.: FCAM: full resolution cam via guided parametric upscaling. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 3490–3499 (2022) 4. Buolamwini, J., Gebru, T.: Gender shades: intersectional accuracy disparities in commercial gender classification. In: Conference on Fairness, Accountability and Transparency, pp. 77–91. PMLR (2018) 5. Cao, J., Sun, L., Odoom, M.G., Luan, F., Song, X.: Counting people by using a single camera without calibration. In: 2016 Chinese Control and Decision Conference (CCDC), pp. 2048–2051. IEEE, May 2016 6. Chen, J., Wu, J., Konrad, J., Ishwar, P.: Semi-coupled two-stream fusion convnets for action recognition at extremely low resolutions. In: 2017 IEEE Winter Conference on Applications of Computer Vision (WACV), pp. 139–147. IEEE (2017) 7. Chen, Z., Wang, Y., Liu, H.: Unobtrusive sensor-based occupancy facing direction detection and tracking using advanced machine learning algorithms. IEEE Sens. J. 18(15), 6360–6368 (2018) 8. Gao, C., Li, P., Zhang, Y., Liu, J., Wang, L.: People counting based on head detection combining AdaBoost and CNN in crowded surveillance environment. Neurocomputing 208, 108–116 (2016) 9. He, P., et al.: Privacy-preserving object detection. arXiv preprint arXiv:2103.06587 (2021) 10. Jiang, P.T., Zhang, C.B., Hou, Q., Cheng, M.M., Wei, Y.: LayerCAM: exploring hierarchical class activation maps for localization. IEEE Trans. Image Process. 30, 5875–5888 (2021) 11. Jocher, G., et al.: ultralytics/yolov5: v6.1 - TensorRT, TensorFlow Edge TPU and OpenVINO Export and Inference, February 2022 12. Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) 13. Kingma, D.P., Welling, M.: Auto-Encoding Variational Bayes, May 2014 14. Kitano, H.: Classification and localization of disease with bounding boxes from chest X-ray images, p. 6 (2020) 15. Tao, L., Volonakis, T., Tan, B., Zhang, Z., Jing, Y.: 3D convolutional neural network for home monitoring using low resolution thermal-sensor array. In: 3rd IET International Conference on Technologies for Active and Assisted Living (TechAAL 2019), pp. 1–6. Institution of Engineering and Technology, London, UK (2019) 16. Lin, T.-Y., et al.: Microsoft COCO: common objects in context. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8693, pp. 740–755. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-10602-1 48 17. Liu, W., et al.: SSD: single shot multibox detector. arXiv:1512.02325 [cs] 9905, pp. 21–37 (2016) 18. Oksuz, K., Cam, B.C., Kalkan, S., Akbas, E.: One metric to measure them all: Localisation Recall Precision (LRP) for evaluating visual detection tasks. arXiv:2011.10772 [cs], November 2021 19. Otsu, N.: A threshold selection method from gray-level histograms. IEEE Trans. Syst. Man Cybern. 9(1), 62–66 (1979) 20. Ren, Z., Lee, Y.J., Ryoo, M.S.: Learning to anonymize faces for privacy preserving action detection. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11205, pp. 639–655. Springer, Cham (2018). https://doi. org/10.1007/978-3-030-01246-5 38

702

T. Dubail et al.

21. Ryoo, M.S., Rothrock, B., Fleming, C., Yang, H.J.: Privacy-preserving human activity recognition from extreme low resolution. In: Thirty-First AAAI Conference on Artificial Intelligence (2017) 22. Schwemmer, C., Knight, C., Bello-Pardo, E.D., Oklobdzija, S., Schoonvelde, M., Lockhart, J.W.: Diagnosing gender bias in image recognition systems. Socius 6, 2378023120967171 (2020) 23. Selvaraju, R.R., Cogswell, M., Das, A., Vedantam, R., Parikh, D., Batra, D.: GradCAM: Visual Explanations from Deep Networks via Gradient-based Localization. Int. J. Comput. Vision 128(2), 336–359 (2020) 24. Shengsheng, Yu., Chen, X., Sun, W., Xie, D.: A robust method for detecting and counting people. In: 2008 International Conference on Audio. Language and Image Processing, pp. 1545–1549. IEEE, Shanghai, China, July 2008 25. Sun, K., Zhao, Q., Zou, J.: A review of building occupancy measurement systems. Energy Build. 216, 109965 (2020) 26. Tao, L., Volonakis, T., Tan, B., Jing, Y., Chetty, K., Smith, M.: Home activity monitoring using low resolution infrared sensor. arXiv:1811.05416 [cs], November 2018 27. Tateno, S., Meng, F., Qian, R., Hachiya, Y.: Privacy-preserved fall detection method with three-dimensional convolutional neural network using low-resolution infrared array sensor. Sensors 20(20), 5957 (2020) 28. Tateno, S., Meng, F., Qian, R., Li, T.: Human motion detection based on low resolution infrared array sensor. In: 2020 59th Annual Conference of the Society of Instrument and Control Engineers of Japan (SICE), pp. 1016–1021. IEEE, Chiang Mai, Thailand, September 2020 29. Wang, Z., Chang, S., Yang, Y., Liu, D., Huang, T.S.: Studying very low resolution recognition using deep networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4792–4800 (2016) 30. Yang, S., Kim, Y., Kim, Y., Kim, C.: Combinational class activation maps for weakly supervised object localization. In: 2020 IEEE Winter Conference on Applications of Computer Vision (WACV), pp. 2930–2938. IEEE, Snowmass Village, CO, USA, March 2020 31. Zhang, H.: FIR-Image-Action-Dataset (2020). https://github.com/visiongo-kr/ FIR-Image-Action-Dataset#fir-image-action-dataset 32. Zheng, H., Zhong, X., Huang, W., Jiang, K., Liu, W., Wang, Z.: Visible-infrared person re-identification: a comprehensive survey and a new setting. Electronics 11, 454 (2022) 33. Zhou, B., Khosla, A., Lapedriza, A., Oliva, A., Torralba, A.: Learning deep features for discriminative localization. arXiv:1512.04150 [cs], December 2015

Gait Recognition from Occluded Sequences in Surveillance Sites Dhritimaan Das1 , Ayush Agarwal2 , and Pratik Chattopadhyay1(B) 1

2

Indian Institute of Technology (BHU), Varanasi 221005, India {dhritimaand.cd.eee17,pratik.cse}@iitbhu.ac.in Motilal Nehru National Institute of Technology, Allahabad 211004, India [email protected] Abstract. Gait recognition is a challenging problem in real-world surveillance scenarios where the presence of occlusion causes only fractions of a gait cycle to get captured by the monitoring cameras. The few occlusion handling strategies in gait recognition proposed in recent years fail to perform reliably and robustly in the presence of an incomplete gait cycle. We improve the state-of-the-art by developing novel deep learning-based algorithms to identify the occluded frames in an input sequence and henceforth reconstruct these occluded frames. Specifically, we propose a two-stage pipeline consisting of occlusion detection and reconstruction frameworks, in which occlusion detection is carried out by employing a VGG-16 model, following which an LSTM-based network termed RGait-Net is employed to reconstruct the occluded frames in the sequence. The effectiveness of our method has been evaluated through the reconstruction Dice score as well as through the gait recognition accuracy obtained by computing the Gait Energy Image feature from the reconstructed sequence. Extensive evaluation using public data sets and comparative study with other methods verify the suitability of our approach for potential application in real-life scenarios. Keywords: Gait recognition · Occlusion detection reconstruction · Real-world surveillance

1

· Occlusion

Introduction

Over the past decade, rapid advances in the field of Computer Vision-based gait recognition show that a gait-based biometric identification system can be potentially used to identify suspects in surveillance sites to strengthen public security. The main advantage of gait over any other biometric feature is that it can be captured by a surveillance camera from a distance, and the videos/images captured by the camera are not required to be of very high resolution. The literature on gait recognition shows that any gait feature extraction algorithm work satisfactorily only if a complete gait cycle is available [14,25,36]. However, in most real-life situations, the presence of a complete cycle is not guaranteed. This is D. Das and A. Agarwal—Contributed equally to the work. c The Author(s), under exclusive license to Springer Nature Switzerland AG 2023  L. Karlinsky et al. (Eds.): ECCV 2022 Workshops, LNCS 13805, pp. 703–719, 2023. https://doi.org/10.1007/978-3-031-25072-9_47

704

D. Das et al.

because static or dynamic occlusion can occur at any time during the video capture phase. Static occlusion may be caused due to the presence of a stationary object such as a pillar, or a standing person. On the contrary, dynamic occlusion is caused by moving objects, such as walking people. Figure 1 provides an example dynamic occlusion scenario in gait sequences using background-subtracted frames. It can be observed from Fig. 1 that due to the presence of occlusion, binary silhouette frames of a target subject also get corrupted resulting in a substantial loss of silhouette-level information that drastically affects the effectiveness of the computed gait features.

Fig. 1. Binary silhouettes extracted from a gait video with dynamic occlusion

Traditional gait recognition techniques are not suited to perform recognition effectively in the presence of occlusion. Only a few techniques in the literature have attempted to provide solutions to this challenging problem. Existing approaches to handling occlusion in gait recognition are mainly of three types: (i) methods that reconstruct (or predict) occluded frames [28,37], (ii) methods that follow a key pose-based recognition approach and compare features corresponding to the available poses in a gait cycle [15,16], and (iii) methods that carry out feature reconstruction instead of frame-by-frame reconstruction [5,6]. However, the effectiveness of these approaches depends on several factors, e.g., the approach in [28] works satisfactorily only if multiple occluded cycles are available, while the methods in [5,6,15,16,37] are effective only for small degrees of occlusion. Dependency on these unrealistic constraints limits the practical applicability of the existing methods in real-world scenarios. In this work, we take up this challenge and propose a fully automated approach for occlusion detection and reconstruction by employing deep neural network models. Specifically, a VGG-16-based model has been used for occlusion detection, and an LSTMbased generator termed Reconstruct Gait Net (abbreviated RGait-Net) has been employed to reconstruct the occluded frames. To the best of our knowledge, there exists only one gait data set containing occluded gait sequences in the public domain, namely the TUM-IITKGP gait data set [27]. However, due to the unavailability of clean ground truth sequences corresponding to an occluded sequence, this data cannot be used to train the RGait-Net occlusion reconstruction model. Hence, we consider another popular gait data set of unoccluded sequences, namely the CASIA-B data [46,50], and prepare an extensive dataset for training the RGait-Net by introducing varying degrees of synthetic occlusion in its sequences. Our algorithm can be conveniently integrated with the existing security setup in surveillance sites such as

Gait Recognition from Occluded Sequences in Surveillance Sites

705

airports, railway stations, and shopping malls to improve public security. The main contributions of the work may be summarized as follows: – To the best of our knowledge, ours is the first-ever work on appearance-based gait recognition that employs deep neural network-based models to detect occlusion and predict missing frames in a given occluded gait sequence. – We propose a novel multi-objective loss function to train the RGait-Net model suitably to reconstruct the corrupted frames from any given occluded gait sequence. – The effectiveness of our approach against varying degrees of occlusion has been established through extensive experimental evaluation and comparison with existing popular gait recognition approaches. – The pre-trained models will be made publicly available for further comparative studies.

2

Related Work

Initial approaches to gait recognition can be classified as either appearance-based or model-based. The appearance-based techniques aim to utilize and extract meaningful features from the binary silhouettes and then use them for gait recognition. The work in [25] presents the first appearance-based gait feature called Gait Energy Image (GEI) that computes the average of gait features over a complete gait cycle. Following this approach, other approaches such as [36], introduce a pose-based feature called Pose Energy Image (PEI), which is extracted from fractional parts of a gait cycle instead of the complete cycle. Similarly, [14] and [16] also use fractional part of the gait cycle to extract features and then use them for gait recognition using the RGB, depth, and skeleton streams from Kinect. The work in [47] presents a feature termed Active Energy Image (AEI) that captures dynamic information more effectively than GEI. In [44], the GEI features are projected into a lower-dimensional space using Marginal Fisher Analysis and recognition is carried out in this projected space. A viewpoint-invariant gait recognition approach is described in [18], in which cyclic gait analysis is performed to identify key frames present in a sequence. The above approaches consider training and test sequences that are captured from a uniform viewpoint. Later, a few cross-view gait recognition techniques have also been proposed, e.g., [8] that performs recognition by aligning the GEIs [25] from different viewpoints using coupled bi-linear discriminant projection. While the appearance-based approaches extract gait features from silhouette shape variation over a gait cycle, the model-based methods attempt to fit the kinematics of human motion into a pre-defined model. For example, in [4], an articulated 3D points cylinder has been considered as a gait model. The static gait features are extracted from the height, stride length, and other body-part dimensions of the fitted model, while the kinematics of gait are extracted from the lower part of the body. In [19] also, skeleton-based gait features extracted from the lower parts of the human body are fitted to a simple harmonic motion

706

D. Das et al.

model. The work in [9] considers a skeleton model and fits the static body structural parameters such as distance from neck to torso, stride length, etc., to the model. The work in [20] predicts the body parts and extracts features from the Fourier Transform of the motion data derived from these body parts and, finally, uses a KNN classifier to predict the class of a subject. In [31], the silhouettes are divided into seven segments and an ellipse is fitted to each of these segments and geometric parameters like ratio of major to minor axis length, etc., are considered for computing the gait features. With the advancement of Deep Learning, CNN-based recognition systems are being widely used in several applications at present. For example, in [29] and [34], a CNN architecture is trained on sensor data from the accelerometer and smartphone gyroscope that use the temporal and frequency domain data to extract features, and then an SVM-based classification is performed. In [38], the averaged feature computed from an incomplete gait cycle is fed to a CNN generator to predict the GEI feature corresponding to a full cycle. To compensate for the scarce training data available for gait recognition as well as to avoid underfitting, the authors in [1] suggest employing a small-scale CNN consisting of four convolutional layers (with eight feature maps in each layer) and four pooling layers for gait recognition. CNN-based generators have also been extensively used to perform cross-view gait recognition and handle varying co-variate conditions during the training and testing phases. For example, the method in [41] performs recognition by employing a deep Siamese architecture-based feature comparison. This method has been seen to work satisfactorily for large view angle variation. In [23], a gait recognition algorithm is presented that utilizes a GAN-based model to transform the gait features with co-variate conditions to that without co-variate conditions. In [45], the authors propose a method to handle gait recognition from varying viewpoints and co-variate conditions. This method uses a GAN to remove co-variate features from any input test sequence and generates a GEI without co-variate features which is next used for classification. The approach in [26] introduces MGAN that learns view-specific features and transform gait features from one view to another. However, multiple such models must be trained for transformations between any arbitrary views, which is cost-intensive. To handle the shortcomings of [26,49] describes another GANbased model termed the View Transformation GAN (VT-GAN) which performs transformation between any arbitrary pair of views. The approach in [24] also discusses a view-invariant gait recognition method in which features are learned in the Cosine space. For effective feature learning and transformation, an angular loss function and a triplet loss function are employed. Approaches described in [12,13] introduce a specific type of neural network termed GaitSet that extracts useful spatio-temporal information from a gait sequence and integrate this information for view transformation. The work in [40] uses knowledge distillation on GaitSet and introduces a lightweight CNN model that performs equivalently to the GaitSet model. The GaitPart introduced in [21] is developed as an improvement to GaitSet, in which micromotions from different body parts are encoded and aggregated to derive features for gait recognition.

Gait Recognition from Occluded Sequences in Surveillance Sites

707

With the introduction of RGB-D cameras, such as Kinect, some frontal view gait recognition techniques [14,39] have also been developed. The work in [7] jointly exploits structured data and temporal information using a spatiotemporal neural network model, termed TGLSTM, to effectively learn long and short-term dependencies along with the graph structure. Some recent approaches towards handling the problem of occlusion in gait recognition are discussed next. Occlusion reconstruction has been done using a Gaussian process dynamic model in [37]. In [30], the authors proposed an approach that uses Support Vector Machine-based regression to reconstruct the occluded data. These reconstructed data are first projected onto a PCA subspace, following which classification of the projected features has been carried out in this canonical subspace. Three different techniques for the reconstruction of missing frames have been discussed in [32], out of which the first approach uses an interpolation of polynomials, the second one uses auto-regressive prediction, and the last one uses a method involving projection onto a convex set. In [2], an algorithm focusing on tracking pedestrians has been presented in which the results are evaluated on a synthetic data set containing sequences with partial occlusion. Some approaches involving human tracking [3,48] and activity recognition [33,43] techniques have also been developed that handle the challenging problem of occlusion detection and reconstruction. The work in [35] presents modeling and characterization of occlusion in videos. In [5], a deep neural network-based gait recognition approach has been proposed to reconstruct GEI from an input sequence that does not contain all the frames of a complete gait cycle. An improvement to the above work has been proposed in [6] by the same authors in which the transformation from incomplete GEI computed from an incomplete gait cycle to the complete GEI has been performed progressively using an Autoencoder-based neural network. In addition, in [17], a robust gait representation technique termed Frame Difference Energy Image (FDEI) is introduced, which accounts for the loss of information due to self-occlusion in a gait cycle. From the extensive literature survey, it is observed that most existing gait recognition approaches require a complete gait cycle for proper functioning. The few occlusion handling approaches in gait recognition suffer either from lack of robustness or depend on certain unrealistic constraints which are difficult to match with real-world scenarios. Further, to date, no neural network-based approach has been developed to reconstruct the missing/occluded frames in a gait sequence. We improve the state-of-the-art research on gait recognition by proposing a novel deep neural network-based approach for occlusion detection and reconstruction, as detailed in the following section.

3

Proposed Approach

In this section, we describe our proposed gait-based person identification approach in the presence of occlusion. A schematic diagram explaining the steps of the proposed approach is diagrammatically shown using Fig. 2. With reference

708

D. Das et al.

Fig. 2. An overview of the pipeline followed for gait recognition along with the proposed occlusion detection and reconstruction setup.

to the figure, given a set of RGB training sequences, first standard preprocessing steps such as background subtraction, silhouette extraction, and dimension normalization to fixed height and width are applied to all frames of each RGB sequence, as in any other image-based gait recognition approaches [25,36]. From the training sequences, the GEI features are computed, and a database of gallery GEI features is constructed. Now, given an RGB test sequence, similar preprocessing steps are carried out to obtain the normalized silhouette for each frame. Next, an occlusion detection model is employed that scans each frame of an input gait sequence and classifies it as ‘Occluded ’ or ‘Unoccluded ’. The silhouette classified as occluded is next predicted using our RGait-Net occlusion reconstruction model. Finally, the GEI feature is computed from the above reconstructed sequence and its dimension is reduced by applying Principal Component Analysis (PCA) and retaining only those components that preserve more than 98 % variation present in the data. A Random Forest classifier which is trained on the gallery set of GEIs is then used to predict the class of the reconstructed GEI. The above diagram has been drawn by assuming the training sequences are free from occlusion. The preprocessing and GEI feature computation stages are explained in detail in [25], and hence, we avoid discussions on these topics. 3.1

Occlusion Detection Using VGG-16

Occlusion detection is carried out in an automated manner by employing a deep VGG-16 Convolutional Neural Network as the backbone of the occlusion detection network. This model is trained using an extensive gallery set of occluded and unoccluded binary silhouettes. While preparing this gallery set, the unoccluded silhouettes are taken from both the CASIA-B [46,50] and the TUM-IITKGP

Gait Recognition from Occluded Sequences in Surveillance Sites

709

[27] datasets, whereas the occluded silhouettes are considered from the statically/dynamically occluded sequences present in the TUM-IITKGP dataset. The model takes as input a single binary frame and classifies it as either ‘Occluded ’ or ‘Unoccluded ’. For an input frame Fi , let P1 (i) and P2 (i) denote the probability of ‘Occluded ’ and ‘Unccluded ’, respectively. Then if yi denotes the ground truth class label for Fi , the cross-entropy loss Locc is computed as: Locc = −yi log(P1 (i)) − (1 − yi )log(P2 (i)).

(1)

The optimal hyper-parameters of this model, i.e., learning rate and dropout rate, are determined through cross-validation and grid-search considering different configurations. 3.2

Occlusion Reconstruction Through RGait-Net

Since the gait pattern of any person can be viewed as a spatio-temporal sequence data, in this work each occluded frame present in a binary gait sequence is predicted by utilizing the information given by a few previous unoccluded (or, reconstructed) frames. We propose to perform this reconstruction by employing a deep neural network, more specifically a Fully Convolutional Long-Short-Term Memory (FC-LSTM) network. The FC-LSTM model used in this work consists of a deep Auto-Encoder with seven convolutional layers and four pooling layers stacked on top of each other and is termed as the Reconstruct Gait Net (RGaitNet). The network architecture with the convolutional layers is shown in Fig. 3.

Fig. 3. Architecture of the proposed occlusion reconstruction model

With reference to the figure, each layer of the network is represented by Li where i denotes the layer number. We use a kernel size of 3 for all the convolutional layers and a kernel size of 2 for the max-pooling and up-sampling operations. Tanh activation is used at the output of all convolutional layers. If (k −1) number of previous unoccluded (or, predicted frames) are used to predict the k th frame using RGait-Net, then (k −1) different 256 × 256 × 1 dimensional binary images are stacked one after another before feeding to the input layer L1 of the network, as shown in the figure. If a gait sequence has multiple occluded frames,

710

D. Das et al.

we follow the same process, i.e., detection of the occluded frame followed by reconstructing the occluded frame using previously predicted/unoccluded frames through RGait-Net. For regularization, we consider dropout layers and apply batch normalization after each convolutional layer. Furthermore, to improve the robustness of our model against noisy data, we add zero-centered Gaussian noise to the input frames during training. We also use skip connections in our model to achieve better gradient flow and minimize the effect of vanishing gradient. Let the function learned by the occlusion reconstruction model be represented by T . This function takes as input a sequence of frame-level features from the previous (k −1) frames, denoted by F1 , F2 , ..., Fk−1 , to predict the feature for k . Mathematically, this can be represented as: the k th frame, i.e., F k = T (F1 , F2 , ..., Fk−1 ). F

(2)

A multi-objective loss function is used to train the proposed RGait-Net. This loss has two components: (a) a binary cross-entropy loss, and (b) a dice loss. The pixel-wise binary cross-entropy loss (Lrec ) between the reconstructed image and the ground truth image is computed as follows: I

Lrec = −

1 k (i)) + (1 − Fk (i))log(1 − F k (i))], [Fk (i)log(F I i=1

(3)

k (i) where I denotes the total number of pixels in each image and Fk (i) and F th respectively denote the pixel intensity of the i pixel corresponding to the ground truth and the reconstructed binary frames. The dice loss LDice is computed as the negative of the Sørensen-Dice similarity score (or, the dice score) [10]. This dice score between two images provides a value in the range [0,1 ], where a value closer to ‘1 ’ indicates high overlap, and a value closer to ‘0 ’ corresponds to insignificant overlap. The dice loss between the ground truth and the reconstructed binary silhouette images is computed as: T

LDice = −

k Fk 2F T

k F k + F T Fk F k

.

(4)

The final objective function is computed as a weighted summation of the binary cross-entropy and the dice coefficient, as shown in (5). Ltotal = λ1 Lrec + λ2 LDice ,

(5)

where λ1 and λ2 are positive constants. Given an occluded gait sequence, occlusion detection is first performed to identify the frames that have to be reconstructed. Next, the proposed RGait-Net-based reconstruction model is employed to predict the occluded frames one by one. Once all occluded frames in a sequence are reconstructed, the GEI feature [25] is extracted, its dimension is reduced by applying Principal Component Analysis (PCA) by retaining 98% variance, and a Random Forest classifier with bagging is used to compute the gait recognition accuracy from the reconstructed sequences.

Gait Recognition from Occluded Sequences in Surveillance Sites

4

711

Experimental Analysis

The proposed algorithm has been implemented on a system with 96 GB RAM, one i9-18 core processor, along with three GPUs: one Titan Xp with 12 GB RAM, 12 GB frame-buffer memory, and 256 MB BAR1 memory, and two GeForce GTX 1080 Ti with 11 GB RAM, 11 GB frame-buffer memory and 256 MB of BAR1 memory. Evaluation of the proposed approach has been done on two public data sets, namely the CASIA-B [46] and the TUM-IITKGP [27] data sets. Among these, the CASIA B data consists of 106 subjects, and for each of these subjects, the data was captured under the following conditions: (a) six sequences with normal walk (nm-01 to nm-06 ), (b) two sequences with carrying bag (bg-01 and bg-02 ), (c) two sequences with coat (cl-01 and cl-02 ). In the present set of experiments, we use the normal walking sequences (i.e., sequences nm-01 to nm-06 ) captured from the fronto-parallel view. Among these, four sequences of each subject are considered for training the RGait-Net and the Random Forest with GEI features, while the remaining two are used during the testing phase by introducing synthetic occlusions of varying degrees. The TUM-IITKGP data set, on the other hand, consists of walking videos of 35 subjects under the following conditions: (a) one video of normal walking without any occlusion, (b) one with carrying bag, (c) one with wearing gown, (d) one with static occlusion, and lastly (e) one with dynamic occlusion. It may be noted that both the CASIA-B and TUM-IITKGP data sets consist of background-subtracted frame sequences extracted from RGB videos. Therefore, the generation of binary silhouettes through background subtraction was not required in our experiments. Only the silhouette cropping and normalization to a fixed height and width have been applied to each frame as a preprocessing step. As explained in Sect. 3, we use unoccluded gait cycles to construct the gallery of GEI features for the 35 subjects for the TUM-IITKGP dataset, while videos with static and dynamic occlusion are used during the testing phase. The sequences present in the TUM-IITKGP data set are sufficiently large, and we extract eight gait cycles from the normal walking video and four sequences from each of the static and dynamic occluded videos. The eight GEIs corresponding to normal walking of each of the 35 subjects results in a total of 280 GEIs, which are used for training the Random Forest classifier to recognize humans from their gait signatures. Evaluation is also done on eight occluded sequences corresponding to the videos with static and dynamic occlusion in this data set. Throughout this section, the test sequences in the TUM-IITKGP data are labeled as Sequence 1, Sequence 2, ..., Sequence 8, respectively. 4.1

Training Details

The VGG-16-based occlusion detection network is trained from scratch by constructing a dataset of 1524 silhouette images consisting of 664 unoccluded silhouettes selected from both the TUM-IITKGP and the CASIA-B gait data sets, along with 860 occluded silhouettes (with both static and dynamic occlusion) from the TUM-IITKGP dataset. RMSprop optimizer is used to train the network

712

D. Das et al.

till convergence i.e., the loss (computed from (1)) between successive epochs is smaller than a predefined threshold . Usually,  0.5 Lc = E(fiv ,fiz )∈(F v ,F z ) [ i ] (6) αc KL(C(fiv ), C(fiz )), otherwise. where fiz is the feature vector extracted from generated images zi , C is the fullyconnected layer that produces the classification probability for each identify of image i, and KL(·) is the Kullback-Leibler divergence between two probabilities. Since the augmentation mechanism uses different distortions for RGB and intermediate images, the model may not extract similar features from these types of images, so we use the distance of identification probability between them. (3) Overall Loss for LUPI. The loss proposed to leverage a privileged intermediate domain Llupi is defined by combining Lc and Ltri : Llupi = Ltri + λLc ,

(7)

where λ is a predefined trade off parameter to balance the color-free loss. Also, to extract features that are discriminant according to person identity, the total loss L in Eq. 3 includes an identity loss: L = Llupi + Lid .

(8)

By optimizing DL model parameters with identity supervision, using an Lid like cross-entropy [48] allows for feature representations that are identity-variant.

4 4.1

Results and Discussion Experimental Methodology

Datasets: The two datasets RegDB [28], and SYSU-MM01 [39] have been widely used in cross-modal research. SYSU-MM01 is a large-scale dataset containing 22,258 visible images and 11,909 infrared images (captured from four

Visible-Infrared Person-ReID Using LUPI

729

RBG and two near-infrared cameras) of 491 individuals. Following its authors, 395 identities were used as the training set, and 96 identities were used as the testing set for fairness comparisons with other papers. It contains two evaluation modes in SYSU-MM01 based on images in the gallery: single-shot and multi-shot. The former setting allows us to select a random image from each person’s identity within the gallery, while the latter setting allows us to select ten images from each identity. The RegDB dataset consists of 4,120 visible and 4,120 infrared images representing 412 identities, which are collected from a single visible and an infrared camera co-located. Each identity contains ten visible images and ten infrared images from consecutive clips. Additionally, ten trail configurations randomly divide the images into two identical sets (206 identifiers) for training and testing. Two different test settings are used – matching infrared (query) with visible (gallery), and vice versa. Performance Measures: We use the Cumulative Matching Characteristics (CMC), and Mean Average Precision (mAP) as evaluation metrics. Implementation Details: Our proposed approach is implemented using the PyTorch framework. For fair comparison with the other methods, the ResNet50 [13] CNN was used as backbone for feature extraction, and pre-trained on ImageNet [8]. Our training process and model was implemented on the code released by [48], following the same hyper-parameter configuration. For Sect. 3.2, we generate three uniform random numbers, and then normalize them for each channel of visible images to create color-free images. At first, all the input images are resized to 288 × 144, then cropped randomly, and filled with zero-padding or with the mean of pixels as a data augmentation approach. A stochastic gradient descent (SGD) optimizer is used in the optimization process. The used margin used for the triplet loss is the same as [48], and the warming-up strategy is applied in the first ten epochs. In order to perform the triplet loss in each minibatch during training, the group must contain at least two distinct individuals for whom at least two images (one for anchor and one for positive) have to be included. So, in each training batch, we select bs = 8 persons from the dataset and np = 4 positive images for each person, still following [48] parameters. The αc = 0.5 and λ = 10 is set. 4.2

Comparison with the State-of-Art

Tables 1 and 2 compare the results with our proposed approach against stateof-the-art cross-modal VI-ReID methods published in the past two years, on the SYSU-MM01 and RegDB datasets. The results show that our method outperforms these methods in various settings. Our method does not require additional and costly image generation, conversion from visible to infrared, or adversarial learning process. It is very to generate random channel images using matrix dot production, which is a simple mathematics operation with low computational overhead. Also, our model learns a robust feature representation against different cross-modality matching settings. Notably, on the large-scale SYSU-MM01

730

M. Alehdaghi et al.

Table 1. Performance of state-of-art techniques for cross-modal person ReID on the SYSU-MM01 dataset. Family

All search

Method

R1(%) R10(%) mAP(%) R1(%) R10(%) mAP(%)

Representation DZP [40]

14.8

54.1

16.0

20.58

68.38

26.92

38.0

82.3

38.4







SFANET [25] 65.74

92.98

60.83

71.60

96.60

80.05

HAT [49]

55.29

92.14

53.89







xModal [23]

49.92

89.79

50.73

62.1

95.75

69.37

CAJ [46]

69.88



66.89

76.26

97.88

80.37

EDFL [24]

36.94

84.52

40.77







BDTR [44]

27.82

67.34

28.42

32.46

77.42

42.46

AGW [48]

47.50



47.65

54.17

91.14

62.97

cmGAN [7]

26.97

67.51

31.49

31.63

77.23

42.19

MANN [12]

30.04

74.34

32.18







TSLFN [52]

62.09

93.74

48.02

59.74

92.07

64.91

DDAG [47]

54.75

90.39

53.02

61.02

94.06

67.98

JFLN [1]

66.11

95.69

64.93







MPANet [41]

70.58

96.21

68.24

76.74

98.21

80.95

AGAN [35]

42.4

85.0

40.7

45.9

92.7

45.3

D2 RL [38]

28.9

70.6

29.2







JSIA [34]

45.01

85.7

29.5

43.8

86.2

52.9

Hi-CMD [5]

34.94

77.58

35.94







TS-GAN [51]

58.3

87.8

55.1

62.1

90.8

71.3

HCML [45]

14.32

53.16

16.16







DHML [50]

60.08

90.89

47.65

56.30

91.46

63.70

SSFT [26]

63.4

91.2

62.0

70.50

94.90

72.60

SIM [15]

56.93



LbA [30]

55.41

CDP [9]

Translation

Co-learning

CM-NAS [10] 61.99 Ours

Indoor search

71.08

60.88







54.14

61.02



66.33

92.87

60.02

67.01

97.02

72.95

96.42

67.56

82.35

98.3

82.73

dataset, our model improves the Rank-1 accuracy and the mAP score by 23.58% and 19.11%, respectively, over the AGW model [48] (our baseline). Recall that our approach can be used on top of any cross-modal approach, so the performances obtained working over the baseline are promising. On SYSU-MM01, our model shows an improvement of 5.61% in terms of Rank-1 and of 1.78% mAP in comparison to the second best approach considering Indoor Search scenario. For RegDB dataset, from Visible to Thermal ReID, our model improves the best R1 and mAP by respectively 2.92% and 1.80 %. Similar improvement can be observed from the RegDB Thermal to Visible setting. Another advantage of our approach is that it can be combined with other state-of-art models without any overhead during inference, since it only triggers training process to force visible features such that they are less discriminative on colors.

Visible-Infrared Person-ReID Using LUPI

731

Table 2. Performance of state-of-art techniques for cross-modal person ReID on the RegDB dataset. Family

Visible → Thermal

Method

R1(%) R10(%) mAP(%) R1(%) R10(%) mAP(%)

Representation DZP [40]

17.75

34.21

18.90

16.63

34.68

17.82

65.3

84.5

62.1

64.40

84.50

61.50

SFANET [25] 76.31

91.02

68.00

70.15

85.24

63.77

HAT [49]

71.83

87.16

67.56

70.02

86.45

66.30

xModal [23]

62.21

83.13

60.18







CAJ [46]

85.03

95.49

79.14

84.75

95.33

77.82

EDFL [24]

48.43

70.32

48.67

51.89

72.09

52.13

BDTR [44]

33.47

58.96

31.83

34.21

58.74

32.49

AGW [48]

70.05

-

50.19

70.49

87.21

65.90

MANN [12]

48.67

71.55

41.11

38.68

60.82

32.61

DDAG [47]

69.34

86.19

63.46

68.86

85.15

61.80

MPANet [41]

83.70



80.9

82.8



80.7

CDP [9]

Translation

Co-learning

AGAN [35]

57.9



53.6

56.3



53.4

D2 RL [38]

43.4

66.1

44.1







JSIA [34]

48.5



49.3

48.1



48.9

Hi-CMD [5]

70.93

86.39

66.04







TS-GAN [51]

68.2



69.4







HCML [45]

24.44

47.53

20.80

21.70

45.02

22.24

SSFT [26]

71.0



71.7







SIM [15]

74.47



75.29

75.24

-

78.30

LbA [30]

74.17

-

67.64

72.43



65.64

95.18

80.32

82.57

94.51

78.31

98.3

82.73

86.80

96.02

81.26

CM-NAS [10] 84.54 Ours

4.3

Thermal → Visible

87.95

Ablation Study

In this section, impact on performance of each component in our model is analyzed. At first, it begins with domain shift between visible and infrared images, then comparing the loss functions on training process. Also the qualitative results are discussed in the supplementary. Domain Shift between Visual, Infrared and Gray. In a way to determine the domain shift between domains, we explored two distinct strategies, the first Maximum Mean Discrepancy (MMD) [11] which is done in supplementary and then measuring and comparing mAP and Rank-1 directly. We compared such metrics to measure the domain shift on model performance, by playing with different modalities as query and gallery in the inference phase while training only from each modality (three first columns in Table 3). To do so, we used the AGW model [48] with its default settings since its architecture corresponds to our final model, except we used only a single branch here by which we fed the query and gallery images. Indeed, AGW used two Resnet50 with shared parameters between the visible and infrared branches for the cross-modal ReID

732

M. Alehdaghi et al.

Table 3. Results of uni-modal and multi-modal training on Visual (V), Infrared (I), and Grayscale (G) modality images of the SYSU-MM01 dataset under the multi-shot setting. G images are grayscale of V images. During testing, query and gallery images are processed by the model separately. Note that when the query and gallery pairs are in the same modality, the same cameras are excluded to avoid the existence of images from in the same scene. Testing

Training modalities V

Query → Gallery R1

I mAP R1

G mAP R1

I-V mAP R1

I-V-G mAP

R1

V→V

97.95 91.67 60.61 20.24 95.69

78.60 97.40 91.82 97.22

I→ I

49.76

27.05 95.96 80.49

21.97 94.90 80.38 56.93

mAP 89.22

96.92 82.78

G→G

88.36

54.26 49.30 16.30 96.10 80.06 90.16 60.35

95.94

I→ V

9.86

6.83

15.72 5.92

14.20

9.01

58.69 41.57

65.16 46.94

84.16

I→G

10.89

6.50

13.09 5.71

16.15

9.22

50.85 27.79

64.98 45.01

purpose, which should not be used this way here since, in the inference phase, we need to feed the cross-modal images from one branch. In practice, we observe that the model trained with only gray images performs better on other modalities. For instance, the mAP of the Gray model is 5% better than the Visible only model when working from infrared images (Respectively 27.05% and 21.97%). Similarly, the Gray model is ahead of the Infrared one by a large margin when performing from visible images, respectively 78.60% and 20.24% mAP. Those results confirm the intermediate position of the gray domain in between the two main domains. Moreover, this model has better performance to retrieve the best candidate (R1 Accuracy) when the Query comes from Infrared and Gallery from Visual in comparison to model trained only with Visuals (11.2 to 9.86). Effect of Random Gray Image Generation. As the visual differences between gray and infrared images are much lower than between visible and infrared images in our eyes, we take advantage of such phenomena by learning our model with the gray version of colored images alongside infrared and visible images. In fact, instead of directly optimizing the distances between infrared and visible, the model utilizes gray images as a bridge, and tries to represent infrared and visible features so that they are similar to the gray ones. The result are shown Table 3. The main result in the table concerns the ability of the models to well re-identify from Infrared to Visible. Here, the gray model version (I-VG training) is successfully ahead of the model trained with infrared and visible images only (I-V), reaching 46.94% against 41.57% mAP respectively. Also, the gray model is much better to translate from and to the gray domain, which was expected but is of good omen regarding the intermediate characterisation of this domain. Finally, we can also observe pretty equivalent performances under V to V and I to I testing settings, making the gray version the way to go for a robust person ReID. Additionally, in comparison with V training and I training

Visible-Infrared Person-ReID Using LUPI

733

Table 4. Evaluation of each component on the SYSU-MM01 dataset under single- and multi-shot settings. Model

R1(%)

mAP(%)

Memory

Single-Shot Multi-Shot Single-Shot Multi-Shot Pre-trained

2.26

3.31

3.61

1.71

Visible-trained

7.41

9.86

8.83

6.83

23.5M

Baseline

49.41

58.69

41.55

38.17

23.5M

Grayscale

61.45

65.16

59.46

46.94

23.7M

RandG

64.50

69.01

61.05

50.84

23.7M

RandG + Aug

71.08

77.23

66.76

61.53

23.7M

94.90

87.45

80.38

23.5M

Infrared (train-test) 90.34

23.5M

Table 5. Impact on performance of loss functions for the SYSU-MM01 dataset. Strategy

Lc Ltri R1 (%) R10(%) mAP (%)

Baseline

×

×

42.89

83.51

43.51

Triplet Loss ×



55.56

90.40

52.96

Color Loss



×

53.97

89.32

51.68

Ours





64.50

95.19

61.05

(Table 3), which are respectively lower and upper-bound, the performance of our approach is close to the upper-bound, showing that the model can benefit from the privileged missed modality in LUPI paradigm (Table 4). Effect of Intermediate Dual Triplet Loss. Combined with the dual triplet loss, it has been demonstrated that the bridging cross-modality triplet loss provides strong supervision to improve discrimination in the testing phase. Using such regularization, as shown in Table 5, improves the mAP by 9 and rank-1 by 12%. Effect of Color-Free Loss. A comparison of the results of our model with and without that loss function is conducted in order to verify the effectiveness of the proposed loss function in focusing on color-independent discriminative parts of visible images. Table 5 shows that when the model uses such regularizing loss, it improves the mAP by 8 and rank-1 by 11 percentile points.

5

Conclusions

In this paper, we formulated the cross-modal VI-ReID as a problem in learning under privileged information problem, by considering PI as an intermediate domain between infrared and visible modalities. We devised a new method to generate images between visible and infrared domains that provides additional

734

M. Alehdaghi et al.

information to train a deep ReID model through intermediate domain adaptation. Furthermore, we show that using randomly generated grayscale images as PI can significantly improve cross-modality recognition. Future directions involve using adaptive image generation based on images from the main modalities, and an end-to-end learning process to find the one channel images as infrared version of visible. Our experiments on the challenging SYSU-MM01 [39] and RegDB [28] datasets show that our proposed approach can significantly outperform stateof-art methods specialized for cross-modal VI-ReID. This does not require extra parameters, nor additional computation during inference. Acknowledgements. This work was supported by Nuvoola AI Inc., and the Natural Sciences and Engineering Research Council of Canada.

References 1. Chen, K., Pan, Z., Wang, J., Jiao, S., Zeng, Z., Miao, Z.: Joint feature learning network for visible-infrared person re-identification. In: Peng, Y., et al. (eds.) PRCV 2020. LNCS, vol. 12306, pp. 652–663. Springer, Cham (2020). https://doi.org/10. 1007/978-3-030-60639-8 54 2. Chen, Y., Wan, L., Li, Z., Jing, Q., Sun, Z.: Neural feature search for RGB-infrared person re-identification. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 587–597 (2021) 3. Cho, J.W., Kim, D.J., Choi, J., Jung, Y., Kweon, I.S.: Dealing with missing modalities in the visual question answer-difference prediction task through knowledge distillation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 1592–1601 (2021) 4. Choi, C., Kim, S., Ramani, K.: Learning hand articulations by hallucinating heat distribution. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 3104–3113 (2017) 5. Choi, S., Lee, S., Kim, Y., Kim, T., Kim, C.: HI-CMD: hierarchical cross-modality disentanglement for visible-infrared person re-identification. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 10257– 10266 (2020) 6. Crasto, N., Weinzaepfel, P., Alahari, K., Schmid, C.: Mars: motion-augmented RGB stream for action recognition. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 7882–7891 (2019) 7. Dai, P., Ji, R., Wang, H., Wu, Q., Huang, Y.: Cross-modality person reidentification with generative adversarial training. In: IJCAI, vol. 1, p. 2 (2018) 8. Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., Fei-Fei, L.: ImageNet: a large-scale hierarchical image database. In: 2009 IEEE Conference on Computer Vision and Pattern Recognition, pp. 248–255. IEEE (2009) 9. Fan, X., Luo, H., Zhang, C., Jiang, W.: Cross-spectrum dual-subspace pairing for RGB-infrared cross-modality person re-identification. ArXiv abs/2003.00213 (2020) 10. Fu, C., Hu, Y., Wu, X., Shi, H., Mei, T., He, R.: CM-NAS: cross-modality neural architecture search for visible-infrared person re-identification. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 11823–11832 (2021)


11. Gretton, A., Borgwardt, K.M., Rasch, M.J., Sch¨ olkopf, B., Smola, A.: A kernel two-sample test. J. Mach. Learn. Res. 13(1), 723–773 (2012) 12. Hao, Y., Li, J., Wang, N., Gao, X.: Modality adversarial neural network for visiblethermal person re-identification. Pattern Recogn. 107, 107533 (2020) 13. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016) 14. Hoffman, J., Gupta, S., Darrell, T.: Learning with side information through modality hallucination. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 826–834 (2016) 15. Jia, M., Zhai, Y., Lu, S., Ma, S., Zhang, J.: A similarity inference metric for RGBinfrared cross-modality person re-identification. arXiv preprint arXiv:2007.01504 (2020) 16. Jiang, J., Jin, K., Qi, M., Wang, Q., Wu, J., Chen, C.: A cross-modal multigranularity attention network for RGB-IR person re-identification. Neurocomputing 406, 59–67 (2020) 17. Kampffmeyer, M., Salberg, A.B., Jenssen, R.: Urban land cover classification with missing data modalities using deep convolutional neural networks. IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens. 11(6), 1758–1768 (2018) 18. Kiran, M., Praveen, R.G., Nguyen-Meidine, L.T., Belharbi, S., Blais-Morin, L.A., Granger, E.: Holistic guidance for occluded person re-identification. In: British Machine Vision Conference (BMVC) (2021) 19. Kniaz, V.V., Knyaz, V.A., Hlad˚ uvka, J., Kropatsch, W.G., Mizginov, V.: ThermalGAN: multimodal color-to-thermal image translation for person re-identification in multispectral dataset. In: Leal-Taix´e, L., Roth, S. (eds.) ECCV 2018. LNCS, vol. 11134, pp. 606–624. Springer, Cham (2019). https://doi.org/10.1007/978-3030-11024-6 46 20. Kumar, S., Banerjee, B., Chaudhuri, S.: Improved landcover classification using online spectral data hallucination. Neurocomputing 439, 316–326 (2021) 21. Lambert, J., Sener, O., Savarese, S.: Deep learning under privileged information using heteroscedastic dropout. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 8886–8895 (2018) 22. Lezama, J., Qiu, Q., Sapiro, G.: Not afraid of the dark: NIR-VIS face recognition via cross-spectral hallucination and low-rank embedding. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 6628–6637 (2017) 23. Li, D., Wei, X., Hong, X., Gong, Y.: Infrared-visible cross-modal person reidentification with an X modality. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 34, pp. 4610–4617 (2020) 24. Liu, H., Cheng, J.: Enhancing the discriminative feature learning for visiblethermal cross-modality person re-identification. CoRR abs/1907.09659 (2019). https://arxiv.org/abs/1907.09659 25. Liu, H., Ma, S., Xia, D., Li, S.: SFANet: a spectrum-aware feature augmentation network for visible-infrared person re-identification. arXiv preprint arXiv:2102.12137 (2021) 26. Lu, Y., et al.: Cross-modality person re-identification with shared-specific feature transfer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13379–13389 (2020) 27. Mekhazni, D., Bhuiyan, A., Ekladious, G., Granger, E.: Unsupervised domain adaptation in the dissimilarity space for person re-identification. In: Vedaldi, A.,

Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12372, pp. 159–174. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58583-9 10
28. Nguyen, D.T., Hong, H.G., Kim, K.W., Park, K.R.: Person recognition system based on a combination of body images from visible light and thermal cameras. Sensors 17(3), 605 (2017)
29. Pande, S., Banerjee, A., Kumar, S., Banerjee, B., Chaudhuri, S.: An adversarial approach to discriminative modality distillation for remote sensing image classification. In: Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops (2019)
30. Park, H., Lee, S., Lee, J., Ham, B.: Learning by aligning: visible-infrared person re-identification using cross-modal correspondences. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 12046–12055 (2021)
31. Pechyony, D., Vapnik, V.: On the theory of learning with privileged information. In: Advances in Neural Information Processing Systems, vol. 23 (2010)
32. Saputra, M.R.U., et al.: DeepTIO: a deep thermal-inertial odometry with visual hallucination. IEEE Robot. Autom. Lett. 5(2), 1672–1679 (2020)
33. Vapnik, V., Vashist, A.: A new learning paradigm: learning using privileged information. Neural Netw. 22(5–6), 544–557 (2009)
34. Wang, G.A., et al.: Cross-modality paired-images generation for RGB-infrared person re-identification. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 34, pp. 12144–12151 (2020)
35. Wang, G., Zhang, T., Cheng, J., Liu, S., Yang, Y., Hou, Z.: RGB-infrared cross-modality person re-identification via joint pixel and feature alignment. In: The IEEE International Conference on Computer Vision (ICCV), October 2019
36. Wang, M., Deng, W.: Deep visual domain adaptation: a survey. Neurocomputing 312, 135–153 (2018)
37. Wang, Z., Wang, Z., Zheng, Y., Wu, Y., Zeng, W., Satoh, S.: Beyond intra-modality: a survey of heterogeneous person re-identification. arXiv preprint arXiv:1905.10048 (2019)
38. Wang, Z., Wang, Z., Zheng, Y., Chuang, Y.Y., Satoh, S.: Learning to reduce dual-level discrepancy for infrared-visible person re-identification. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 618–626 (2019)
39. Wu, A., Zheng, W.S., Gong, S., Lai, J.: RGB-IR person re-identification by cross-modality similarity preservation. Int. J. Comput. Vision 128(6), 1765–1785 (2020)
40. Wu, A., Zheng, W.S., Yu, H.X., Gong, S., Lai, J.: RGB-infrared cross-modality person re-identification. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 5380–5389 (2017)
41. Wu, Q., et al.: Discover cross-modality nuances for visible-infrared person re-identification. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 4330–4339 (2021)
42. Xu, X., Wu, S., Liu, S., Xiao, G.: Cross-modal based person re-identification via channel exchange and adversarial learning. In: Mantoro, T., Lee, M., Ayu, M.A., Wong, K.W., Hidayanto, A.N. (eds.) ICONIP 2021. LNCS, vol. 13108, pp. 500–511. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-92185-9 41
43. Yang, Y., Zhang, T., Cheng, J., Hou, Z., Tiwari, P., Pandey, H.M., et al.: Cross-modality paired-images generation and augmentation for RGB-infrared person re-identification. Neural Networks 128, 294–304 (2020)
44. Ye, M., Lan, X., Wang, Z., Yuen, P.C.: Bi-directional center-constrained top-ranking for visible thermal person re-identification. IEEE Trans. Inf. Forensics Secur. 15, 407–419 (2020)


45. Ye, M., Lan, X., Li, J., Yuen, P.: Hierarchical discriminative learning for visible thermal person re-identification. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 32 (2018) 46. Ye, M., Ruan, W., Du, B., Shou, M.Z.: Channel augmented joint learning for visible-infrared recognition. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 13567–13576 (2021) 47. Ye, M., Shen, J., J. Crandall, D., Shao, L., Luo, J.: Dynamic dual-attentive aggregation learning for visible-infrared person re-identification. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12362, pp. 229–247. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58520-4 14 48. Ye, M., Shen, J., Lin, G., Xiang, T., Shao, L., Hoi, S.C.H.: Deep learning for person re-identification: a survey and outlook. arXiv preprint arXiv:2001.04193 (2020) 49. Ye, M., Shen, J., Shao, L.: Visible-infrared person re-identification via homogeneous augmented tri-modal learning. IEEE Trans. Inf. Forensics Secur. 16, 728–739 (2020) 50. Zhang, Q., Cheng, H., Lai, J., Xie, X.: DHML: deep heterogeneous metric learning for VIS-NIR person re-identification. In: Sun, Z., He, R., Feng, J., Shan, S., Guo, Z. (eds.) CCBR 2019. LNCS, vol. 11818, pp. 455–465. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-31456-9 50 51. Zhang, Z., Jiang, S., Huang, C., Li, Y., Da, X., R.Y.: RGB-IR cross-modality person ReID based on teacher-student GAN model. Pattern Recogn. Lett. 150, 155–161 (2021) 52. Zhu, Y., Yang, Z., Wang, L., Zhao, S., Hu, X., Tao, D.: Hetero-center loss for cross-modality person re-identification. Neurocomputing 386, 97–109 (2020)

Video in 10 Bits: Few-Bit VideoQA for Efficiency and Privacy

Shiyuan Huang1(B), Robinson Piramuthu2, Shih-Fu Chang1, and Gunnar A. Sigurdsson2

1 Columbia University, New York, USA
[email protected]
2 Amazon Alexa AI, Cambridge, USA

Abstract. In Video Question Answering (VideoQA), answering general questions about a video requires its visual information. Yet, video often contains redundant information irrelevant to the VideoQA task. For example, if the task is only to answer questions similar to “Is someone laughing in the video?”, then all other information can be discarded. This paper investigates how many bits are really needed from the video in order to do VideoQA by introducing a novel Few-Bit VideoQA problem, where the goal is to accomplish VideoQA with few bits of video information (e.g., 10 bits). We propose a simple yet effective task-specific feature compression approach to solve this problem. Specifically, we insert a lightweight Feature Compression Module (FeatComp) into a VideoQA model, which learns to extract task-specific tiny features of as little as 10 bits that are optimal for answering certain types of questions. We demonstrate more than 100,000-fold storage efficiency over MPEG4-encoded videos and 1,000-fold over regular floating point features, with just 2.0–6.6% absolute loss in accuracy, which is a surprising and novel finding. Finally, we analyze what the learned tiny features capture, demonstrate that they have eliminated most of the non-task-specific information, and introduce a Bit Activation Map to visualize what information is being stored. This decreases the privacy risk of the data by providing k-anonymity and robustness to feature-inversion techniques, which can influence the machine learning community by allowing data to be stored with privacy guarantees while still performing the task effectively.

Keywords: Video question answering · Feature compression · Privacy · Applications

1 Introduction

Video data exemplifies various challenges with machine learning data: It is large and has privacy issues (e.g. faces and license plates). For example, an autonomous robot driving around may collect gigabytes of video data every hour, quickly filling up all available storage with potentially privacy-sensitive information. Yet, video is useful for a variety of computer vision applications, e.g. action detection [25], object tracking [31], and video question answering (VideoQA) [19,44].


Fig. 1. We introduce the problem of Few-Bit VideoQA, where only a few bits of video information are allowed for VideoQA tasks. Our proposed method compresses the features inside a neural network to an extreme degree: a video input of 1 MB can be compressed down to 10 bits and still solve common question-answering tasks with a high degree of accuracy, such as “Is someone laughing?”. This has both storage and privacy implications. Here the Target Task is all questions related to laughing, a subset of all possible tasks in the Task Space. Regular Features carry additional task-irrelevant information, visualized with questions and bits of corresponding color; they correspond to 512 floating point features from a state-of-the-art video network, such as [40].

With increasing adoption of computer vision applications, and mobile devices, efficiently storing video data becomes an increasingly important problem. VideoQA [19,44] is a general machine learning task that requires analyzing video content to answer a general question. As such, it contains elements of multiple video problems, such as classification, retrieval, and localization. The questions in VideoQA are not all available beforehand, which means that the model needs to extract general semantic information that is still useful enough to answer a diverse set of questions, such as any question about the task “Human Activities”, for example. Concretely, the system may have been trained on queries including “Is the person laughing?”, and at inference time, the system needs to record and store a video, and may later receive the query “Do they look happy?” which was not seen at training time. In contrast with classification problems where the system could potentially store the answer to all possible queries, a VideoQA system needs to keep enough information to answer any query about the video. While early video analysis used hand-crafted visual descriptors to represent videos, recent advances in deep neural networks are able to learn highly abstracted features [3,30,32]. These features still contain more information than what is actually needed in a specific task. For example, a popular video encoder network ResNet3D [40] extracts a 512-dim floating number feature, which requires 16,384 bits. If the task is to only answer questions about the


Table 1. Overview of different levels of feature storage and privacy limitations.

                           | Data amount                       | Can identify user?    | Can discriminate user?
Original data              | 1,000,000 bits (e.g. MPEG4 video) | Yes                   | Yes
Regular feature            | 16,000 bits (e.g. 512 floats)     | Yes (can be inverted) | Yes
Regular compressed feature | 100 bits                          | No                    | Yes
Task-compressed feature    | 10 bits                           | No                    | No (k-anonymity [33])

happiness of a person, such as “do people look happy”, we theoretically only need 1 bit of information to accomplish that task, by encoding the existence of person + laughing. A single 16,384-bit feature encodes much more information than required to answer this question. Moreover, it is hard to interpret what information is captured by continuous features, and hence hard to guarantee that the features do not contain any sensitive information, which may carry privacy concerns.

Much of the recent work on VideoQA focuses on learning stronger vision features, improved architectures, or designing better multi-modal interaction [8,20,22,29,36]. This paper instead investigates how many bits are really needed from video in current VideoQA tasks. To this end, we introduce a novel “Few-Bit VideoQA” problem, which aims to accomplish VideoQA where only a few bits of information from the video are allowed. To our knowledge, the study of few-bit features is an understudied problem, with applications to storing and cataloging large amounts of data for use by machine learning applications.

We provide a simple yet effective task-specific compression approach towards this problem. Our method is inspired by recent learning-based image and video coding [4,7,9,13,14,24,27,38,39,43], which learns low-level compression with the goal of optimizing visual quality. In contrast, our method looks at compressing high-level features, as shown in Fig. 1. Given a video understanding task, we compress the deep video features into few bits (e.g., 10 bits) to accomplish the task. Specifically, we utilize a generic Feature Compression module (FeatComp), which can be inserted into neural networks for end-to-end training. FeatComp learns compressed binarized features that are optimized towards the target task. In this way, our task-specific feature compression can achieve a high compression ratio and also address the issue of privacy. This approach can store large amounts of features on-device or in the cloud, limiting privacy issues for stored features, or transmitting only privacy-robust features from a device. Note that this work is orthogonal to improvements made by improved architectures, offers insights into how much video data is needed for a given VideoQA dataset, and provides a novel way to significantly reduce storage and privacy risks in machine learning applications.

10 bits is a concrete number we use throughout this paper to demonstrate the advantage of tiny features. While 10 bits seems like too little to do anything useful, we surprisingly find that predicting these bits can narrow down the solution space enough that the model can correctly pick among different answers in VideoQA [19,44]. Different tasks require different numbers of bits, and we see in


Sect. 4 that there are different losses of accuracy for different tasks at the same number of bits. Storing only 10 bits, we can assure that the stored data does not contain any classes of sensitive information that would require more than 10 bits to be stored. For example, we can use 33 bits (8 billion unique values) as a rule-of-thumb threshold below which features stop being able to discriminate between people in the world. In Table 1 we show different levels of features and how they compare to this threshold. At the highest level, Original Data, we would have the full image or video with all its privacy limitations (for reference, videos in the MSRVTT-QA dataset are 630 KB on average, at 320 × 240 resolution and 15 s per video). Even with feature extraction, various techniques exist to invert the model and reconstruct the original data [10,28]. Using Regular Compressed Features (i.e., off-the-shelf compression on features in Sect. 5.1, or task-independent compression in Sect. 4.1) would still pose some privacy threats. Our Task-Compressed Features provide privacy guarantees from their minimal size.

We experiment on public VideoQA datasets to analyze how many bits are needed for VideoQA tasks. In our experiments, the “Task” is typically the question types in a specific dataset, such as TGIF-Action [17], but in a practical system this could be a small set of the most common queries, or even a single type of question such as “do people look happy in the video?”. Note that while the “Task” of all questions in a single dataset might seem very diverse and difficult to compress, it is still much narrower than any possible question about any information in the video, such as “What color is the car?”, “What actor is in the video?”, etc.

In summary, our contributions are:
1. We introduce a novel Few-Bit VideoQA problem, where only a few bits of video information are used for VideoQA, and we propose a simple yet effective task-specific feature compression approach that learns to extract task-specific tiny features of very few bits.
2. An extensive study of how many bits of information are needed for VideoQA. We demonstrate that we lose just 2.0%–6.6% in accuracy using only 10 bits of data, providing a new perspective on how much visual information helps in VideoQA.
3. We validate and analyze the task specificity of the learned tiny features, and demonstrate their storage efficiency and privacy advantages.

The outline of the paper is as follows. In Sect. 2, we review related work in VideoQA, image/video coding and deep video representations. In Sect. 3 we introduce the Few-Bit VideoQA problem, and our simple solution with feature compression. In Sect. 4 we discuss several experiments to analyze and validate our approach. Finally, in Sect. 5 we demonstrate applications of this novel problem, including distributing tiny versions of popular datasets and privacy.

2 Related Work

Video Question Answering. VideoQA is a challenging task that requires the system to output answers given a video and a related question [17,21,37,44,45].


Fig. 2. Pipeline of our generic feature compression approach (FeatComp) towards Few-Bit VideoQA, which follows the procedure of encoding, binarization and decoding. It has learned to only encode information relevant to the questions of interest through task-specific training.

Recent approaches include multi-modal transformer models [20,22] and graph convolutional networks [15]. We instead look at VideoQA under limited bits, which shares some philosophy with work that has looked at how much visual content is needed for Visual Question Answering [11].

Image and Video Coding. Image and video coding is a widely studied problem which aims to compress image/video data with minimum loss of human perceptual quality. In the past decades, standard video codecs like HEVC [35] and AVC [42], and image codecs like JPEG [41], have been used for compressing image/video data. More recently, learning-based image/video compression [4,6,7,9,12–14,24,27] has been proposed to replace the codec components with deep neural networks that optimize the entire coding framework end-to-end, to achieve better compression ratios. All of these existing approaches compress with the goal of pixel-level reconstruction. Our approach is inspired by these works but is applied in a different context—we aim to solve a novel Few-Bit VideoQA problem.

Deep Video Representation. Deep neural networks have been shown effective at learning compact video representations, which is now a favored way to store video data for machine learning applications. With the emergence of large-scale video datasets like Kinetics [3] and HowTo100M [30], recent advances in representation learning [8,20,22,29,36] extract continuous video features which contain rich semantic information. Such pre-computed video features can be successfully applied to video understanding tasks like action detection, action segmentation, video question answering, etc. [20,22,29]. However, deep video features could still contain more information than what is actually needed by a specific task. In this work, we instead focus on learning tiny video features, where we aim to use few bits of data to accomplish the target task.

3 Few-Bit VideoQA

In this section, we first establish the problem of Few-Bit VideoQA; then provide a simple and generic task-specific compression solution; finally, we introduce our simple implementation based on a state-of-the-art VideoQA model.

3.1 Problem Formulation

In a standard VideoQA framework, a feature extractor (e.g., ResNet3D [40]) is applied to a video sequence to extract the video embedding x, which is a high-level and compact representation of the video, normally a vector of floating point numbers. We write M(·) as the VideoQA task performer, and the output of M(x, q) is the predicted answer in text, where q refers to the text embedding of the question. In our context, we assume M(·) is a neural network that can be trained with an associated task objective function L_task. Though the compact feature x already has a much smaller size compared to the original pixel data, it still raises storage and privacy concerns as discussed in the introduction. To this end, we introduce a novel problem of Few-Bit VideoQA, where the goal is to accomplish VideoQA tasks with only a few bits of visual information, i.e., we want to perform M(x, q) with the size of x less than N bits, where N is small (e.g., N = 10).

3.2 Approach: Task-Specific Feature Compression

We propose a simple yet effective approach towards the problem of Few-Bit VideoQA. As shown in Fig. 2, we insert a feature compression bottleneck (FeatComp) between the video feature extractor and the task performer M to compress x. We borrow ideas from compression approaches in image/video coding [38,39,43], but rather than learning pixel-level reconstruction, we train the compression module solely from a video task loss, which has not been explored before to our knowledge. In fact, FeatComp is a generic module that could be applied to other machine learning tasks as well.

Encoding and Decoding. FeatComp follows the encode-binarize-decode procedure to transform floating-point features into binary, and then decode them back to floating-point features that can be fed into the task performer. In encoding, we first project x to the target dimension, x' = f_enc(x), where x' ∈ R^N and N is the predefined bit level to compress into. Binarization is applied directly on feature values, so we map the feature values to a fixed range: we use batch normalization (BN) [16] to encourage bit variance, and then a hyperbolic tangent tanh(·) to convert all values to [−1, 1]. Binarization is inherently a non-differentiable operation; in order to incorporate it in the learning process, we use stochastic binarization [38,39,43] during training. The final equation for the encode-binarize-decode procedure is:

x_dec = FeatComp(x) = (f_dec ∘ BIN ∘ tanh ∘ BN ∘ f_enc)(x)    (1)

where ∘ denotes function composition. The output after the binarization step is x_bin ∈ {0, 1}^N, which is the N-bit compressed feature to be stored.

Learning Task-Specific Compression. FeatComp is generic and can be inserted between any usual feature extractor and task performer M(·). The task performer instead takes the decoded feature to operate the task: M(x_dec) or M(x_dec, q). To compress in a task-specific way, FeatComp is trained along with M(·) with the objective L = L_task, where L_task is the target task objective.

Simple Implementation. FeatComp can be easily implemented in any VideoQA model. As an instantiation, we choose a recent state-of-the-art model, ClipBERT [20], as our baseline, and add our compression module to study the number of bits required for VideoQA. Specifically, ClipBERT follows a pipeline similar to Fig. 2; we insert FeatComp after the feature extractor. For encoding and decoding, we use fully connected layers, where f_enc: R^(T×h×w×D) → R^N and f_dec: {0, 1}^N → R^(T×h×w×D). x is flattened to a single vector to be encoded and binarized, and the decoded x_dec can be reshaped to the original size. The answer is then predicted by M(x_dec, q). For FeatComp with N-bit compression, we write FeatComp-N. This implementation is simple and generic, and as we see in Sect. 4, it already works surprisingly well. Various other architectures for encoding, decoding, and binarization could be explored in future research.

Intuition of FeatComp. To learn a compression of the features, various methods could be explored. We could cluster the videos or the most common answers into 1024 clusters and encode the cluster ID in 10 bits, etc. For a task such as video action classification, where only the top class prediction is needed, directly encoding the final answer may work. However, in a general video-language task such as VideoQA, the number of possible questions is much larger than 1024, even for questions about a limited topic. Our method can be interpreted as learning 2^N clusters end-to-end that are predictable and useful for answering any questions related to the task.
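To make the encode-binarize-decode procedure of Eq. (1) more concrete, a minimal PyTorch-style sketch of such a bottleneck is given below. This is our own illustrative reading of the description above rather than the authors' released code: the class names, the straight-through gradient used for the stochastic binarizer, and the single fully connected encoder/decoder are assumptions.

```python
import torch
import torch.nn as nn


class StochasticBinarize(torch.autograd.Function):
    """Stochastic binarization with a straight-through gradient (our assumption;
    the paper only states 'stochastic binarization' following [38,39,43])."""

    @staticmethod
    def forward(ctx, x):
        # x lies in [-1, 1]; sample b in {-1, +1} with P(b = +1) = (x + 1) / 2
        prob = (x + 1.0) / 2.0
        return torch.where(torch.rand_like(prob) < prob,
                           torch.ones_like(x), -torch.ones_like(x))

    @staticmethod
    def backward(ctx, grad_output):
        # Straight-through estimator: pass the gradient through unchanged
        return grad_output


class FeatComp(nn.Module):
    """Encode -> binarize -> decode bottleneck around a flattened video feature."""

    def __init__(self, feat_dim, n_bits=10):
        super().__init__()
        self.f_enc = nn.Linear(feat_dim, n_bits)   # project to N dimensions
        self.bn = nn.BatchNorm1d(n_bits)           # encourage bit variance
        self.f_dec = nn.Linear(n_bits, feat_dim)   # decode back for the task head

    def forward(self, x):
        # x: (batch, feat_dim) flattened video feature
        z = torch.tanh(self.bn(self.f_enc(x)))     # values in [-1, 1]
        b = StochasticBinarize.apply(z)            # training-time stochastic bits in {-1, +1}
        x_bin = (b > 0).to(torch.uint8)            # the N-bit feature that would be stored
        x_dec = self.f_dec(b)                      # decoded feature fed to the task performer M
        return x_dec, x_bin
```

At inference time, only the N-bit x_bin would need to be stored; a deterministic sign-based binarization (rather than the stochastic one used during training) and the decoder f_dec would then recover the feature that is fed to the task performer M.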

4 Experiments

In this section, we show the experimental results on Few-Bit VideoQA, study how much visual information is needed for different VideoQA tasks, and analyze what the bits capture.

Datasets. We consider two public VideoQA datasets: 1) TGIF-QA [17] consists of 72K GIF videos, 3.0 s long on average, and 165K QA pairs. We experiment on three TGIF-QA tasks—Action (e.g. “What does the woman do 5 times?”) and Transition (e.g. “What does the man do after talking?”), which are multiple-choice questions with 5 candidate answers, and FrameQA (e.g. “What does an airplane drop which bursts into flames?”), which contains general questions with single-word answers. 2) MSRVTT-QA [44] consists of 10k videos of 10–30 s each and 254K general QA pairs, e.g. “What are three people sitting on?”.


Implementation Details. We leverage the ClipBERT model pre-trained on COCO Captions [5] and Visual Genome Captions [18], and train on each VideoQA dataset separately, following [20]. During training, we randomly initialize FeatComp and finetune the rest of the network; we randomly sample T = 1 clip for TGIF-QA and T = 4 clips for MSRVTT-QA, where in each clip we only sample the middle frame. We fix the ResNet backbone and set the learning rate to 5 × 10^−5 for FeatComp and 10^−6 for M (a small optimizer sketch is given after the baseline list below). We use the same VideoQA objective and AdamW [26] optimizer as in [20]. During inference, we uniformly sample T_test clips to predict the answer, where T_test = T unless noted otherwise. We set N = 1, 2, 4, 10, 100, 1000 for the different bit levels. Code will be made available.

Evaluation Metrics. QA accuracy at different bit levels.

Baselines. To our knowledge, there is no prior work directly comparable; hence we define the following baselines:
– Floats: the original floating-point-based ClipBERT (which uses 16-bit precision; we use this for the bit calculation); it provides a performance upper bound using enough bits of information.
– Q-only: answer prediction solely from the question.
– Random Guess: randomly choose a candidate/English word for multiple-choice/single-word-answer QA.

Additionally, since our approach follows an autoencoder-style design, we also study whether an objective of feature reconstruction helps with Few-Bit VideoQA. We add the following two approaches for comparison:
– R-only: learn FeatComp solely with a reconstruction loss, L_R = MSE(x, x_dec), then finetune M using the learned compressed features with the task objective.
– FeatComp+R: learn FeatComp from both task and reconstruction objectives, i.e., L = L_task + L_R.

We also study how test-time sampling affects the results with:
– 4×FeatComp: test-time sampling with T_test = 4T.
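The two-learning-rate setup described in the implementation details might be configured roughly as follows in PyTorch. This is only a hedged illustration of the stated hyper-parameters; the module handles (featcomp, clipbert) and the weight-decay value are placeholders, not part of the paper.

```python
import torch

# Assumed module handles: `featcomp` (the compression bottleneck, trained from
# scratch) and `clipbert` (the pre-trained task performer M, finetuned slowly).
optimizer = torch.optim.AdamW(
    [
        {"params": featcomp.parameters(), "lr": 5e-5},  # newly initialized FeatComp
        {"params": clipbert.parameters(), "lr": 1e-6},  # finetuned ClipBERT-style performer
    ],
    weight_decay=0.01,  # placeholder value; not stated in the text above
)
```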

4.1 Few-Bit VideoQA Results

We demonstrate our Few-Bit VideoQA results in Fig. 3 and compare to the baselines. As expected, more bits yield accuracy improvements. Notably, for our FeatComp, even a 1-bit video encoding yields a 7.2% improvement over Q-only on TGIF-Action. On all datasets, at 1000 bits we maintain only a small performance drop while compressing over 1,000 times. At 10 bits, the drop is within 2.0–6.6% while compressing over 100,000 times, demonstrating that 10 bits of visual information already provide significant aid in VideoQA tasks. We also compute the supervision size as a naive upper bound on the bits required to accomplish the task. Theoretically, the compressed features should be able to differentiate the texts in all QA pairs.


Method           | Bits        | TGIF-Action | TGIF-FrameQA | TGIF-Transition | MSRVTT-QA
Random Guess     | 0           | 20.0        | 0.05         | 20.0            | 0.07
Q-only           | 0           | 61.8        | 47.3         | 77.2            | 31.7
R-only           | 10          | 63.7        | 47.3         | 83.5            | 31.7
FeatComp+R       | 10          | 75.8        | 47.4         | 82.8            | 32.2
FeatComp         | 10          | 76.3        | 50.9         | 85.5            | 33.6
Floats           | 1.8M - 9.8M | 82.4 (82.9) | 57.5 (59.4)  | 87.5 (87.5)     | 37.0 (37.0)
Supervision Size | —           | 30.9 Bits   | 59.2 Bits    | 57.7 Bits       | 59.5 Bits

Fig. 3. Analysis of how bit size affects VideoQA accuracy on MSRVTT-QA and TGIF-QA. Floats refers to the original ClipBERT model, which serves as an upper bound on performance without bit constraints. We report both the numbers we reproduced and, in parentheses, the results cited from the paper. With our simple approach FeatComp, we can reach high performance using only a few bits. Notably, at the 10-bit level (see table), we get only a 2.0–6.6% absolute loss in accuracy.

Hence, for each task, we use bzip2 (https://www.sourceware.org/bzip2) to compress the text file of the training QA pairs, and use its size as the upper bound. We observe a correspondence between supervision size and compression difficulty. TGIF-Action and TGIF-Transition are easier to compress, requiring fewer bits to obtain substantial performance gains, and they also appear to require fewer bits in supervision size. In contrast, TGIF-FrameQA and MSRVTT-QA are harder to compress, which is also reflected by their relatively larger supervision sizes.

Role of Reconstruction Loss. Our approach FeatComp is learned in a task-specific way in order to remove any unnecessary information. On the other hand, it is also natural to compress with the goal of recovering the full data information. Here we investigate whether direct feature supervision helps learn better compressed features, using FeatComp+R and R-only. We can see that integrating feature reconstruction harms the accuracy at every bit level. In fact, R-only can be considered a traditional lossy data compression that tries to recover the full data values. This implies that recovering feature values requires more information than performing the VideoQA task; hence the feature reconstruction loss may bring task-irrelevant information that hurts the task performance.


Table 2. Task-specificity of FeatComp-10. Compression learned from an irrelevant source task is less helpful for a target task, while compression from the same task gives the best performance.

Source task | Target task
            | Action | FrameQA | Transition
Action      | 76.9   | 47.6    | 83.8
FrameQA     | 62.3   | 51.7    | 77.1
Transition  | 76.0   | 47.3    | 85.0
Q-Only      | 61.8   | 47.3    | 77.2

Role of Temporal Context. For longer videos, frame sampling is often crucial for task accuracy. We study its impact in Fig. 3 (FeatComp vs. 4×FeatComp). We can see that denser sampling benefits the compression ratio, especially at higher bit levels on longer videos such as MSRVTT-QA. For example, 400-bit compression with T_test = 16 outperforms 1000-bit compression with T_test = 4. However, on the short-video dataset TGIF-QA it does not yield improvements, which implies that TGIF-QA videos can be well understood from single frames.

How Task-Specific are the Features? Here we evaluate task-specificity by analyzing how much information is removed by the compression. We use FeatComp-10 learned on a source TGIF-QA task to extract the compressed features x_bin, then train a new VideoQA model for different target tasks on top of x_bin. For a fair comparison, we use the same M and decoding layers and follow the previous practice of initializing M from the model pre-trained on COCO Captions and Visual Genome, and randomly initialize the decoding layers. We then train the network with learning rate 5 × 10^−5 for 20 epochs. Table 2 shows the results, where we also cite Q-only from Fig. 3. Compression from the same task gives the best result, as expected. Note that Action and Transition are similar tasks in that they query for similar actions, but Transition asks about the change of actions over time; FrameQA instead queries for objects, which has minimal similarity with Action/Transition. Compression from a highly relevant source task (Action vs. Transition) gives fairly high performance. However, compression from an irrelevant source task (Action/Transition vs. FrameQA) yields similar performance to Q-only (0 bits), which again implies that the learned compression discards any information unnecessary for its source task.

4.2 Qualitative Analysis

Here we study qualitatively what the learned bits are capturing. We apply Grad-CAM [34] directly on the compressed features x_bin to find the salient regions over frames.


Grad-CAM is originally a tool for localizing the regions sensitive to a class prediction: it computes a weighted average of feature maps based on their gradients with respect to the target class, which localizes the regions that contribute most to that class prediction. Here we instead construct a Bit Activation Map (BAM) by treating each binary bit x_bin^i as a “class”: 1 is a positive class while 0 is negative. We calculate BAM at the last convolutional layer of the ResNet. We average the BAMs over all bits, where for x_bin^i = 0 we multiply them by −1. Note that no class annotations, labels or predictions are used for BAM. Figure 4a shows heat map visualization results on TGIF-Action for FeatComp-10. The learned compression is capturing salient regions (e.g., eyes, lips, legs) which align with human perception. We also study in Fig. 4b how important temporal signals are captured, where we average the BAM over frames. The frames with high BAM scores tend to capture more important semantic information across time that is relevant to the task.

Additionally, we investigated whether the bits capture task-specific information by analyzing the correspondences between VideoQA vocabularies and compressed features. We calculate a word feature as the averaged FeatComp-10 bit features over the videos whose associated QAs contain this word, and find its top-7 closest words based on Euclidean distance. Table 3 shows some sample results on TGIF-Action. Neighboring words tend to demonstrate semantic associations, e.g., ‘eyes’ is closely related to ‘blink’, ‘smile’ and ‘laugh’; ‘step’ is associated with similar actions like ‘walk’ and ‘jump’, implying that the compression captures task-relevant semantic information.

Table 3. Top-7 closest words from FeatComp-10 on TGIF-Action.

Word   | Most similar words
head   | bob neck stroke fingers raise cigarette open
wave   | touch man bob hands hand stroke raise
spin   | around ice pants jump foot kick knee
step   | walk spin around ice pants jump kick
guitar | object who keys black a strum bang
eyes   | blink smile laugh cigarette lip tilt sleeve

5 Applications of Few-Bit VideoQA

The problem of Few-Bit VideoQA has practical applications to data storage efficiency and privacy. Below we discuss how our approach can compress a video dataset into a tiny dataset and use that to achieve the same task (Sect. 5.1); then we demonstrate the privacy advantages of few-bit features (Sect. 5.2).

5.1 Tiny Datasets

With our task-specific compression approach, we can represent a video using only a few bits, which allows extreme compression of gigabytes of video datasets into


Fig. 4. Qualitative visualization examples of the bit activation map w.r.t. FeatComp-10 compressed features x_bin. Note that no question, answer, prediction, or class label was used for these visualizations; this visualizes what the network used to generically compress the video.

almost nothing. For example, TGIF-QA contains 72K videos whose MPEG4-encoded form takes up about 125 GB of storage; with FeatComp-10, we end up with a 90 KB dataset. We follow Sect. 4.1 to train a model using the stored compressed features x_bin, with source and target tasks being the same, to evaluate the feature quality. Table 4 compares the compression size and testing accuracy on the TGIF-QA and MSRVTT-QA tasks. Our tiny datasets achieve performance similar to Fig. 3 at all bit levels. In addition to MPEG4 video and uncompressed Floats, we also report the size of the Floats compressed with an off-the-shelf lossless compression standard (ZIP). The result implies that the original features indeed contain a lot of information that is not easily compressible, and that we are learning a meaningful compression.
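As a quick sanity check of the 90 KB figure above (our own back-of-the-envelope arithmetic, taking the rounded 72K video count at face value and 1 KB = 1,000 bytes):

```latex
72{,}000 \text{ videos} \times 10 \text{ bits/video} = 720{,}000 \text{ bits}
 = 90{,}000 \text{ bytes} \approx 90\,\text{KB}.
```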

5.2 Privacy Advantages from Tiny Features

Here we demonstrate how our tiny features offer privacy advantages. Advantages of Data Minimization. Intuitively, we expect 10 bits of information to not contain very much sensitive information. The principle of Shannon Information5 , or the pigeonhole principle, tells us that 10 bits of information 4 5

numpy.savez compressed. en.wikipedia.org/wiki/Information content.


Table 4. VideoQA accuracies with our compressed datasets at different bit levels. Looking at 10-bit, we can get 100,000-fold storage efficiency while maintaining good performance.

Datasets         | TGIF-QA Size | Action | FrameQA | Transition | MSRVTT-QA Size | What | Who  | All
1-bit set        | 9 KB         | 68.3   | 47.3    | 82.4       | 1.3 KB         | 24.8 | 37.7 | 31.8
10-bit set       | 90 KB        | 76.9   | 51.7    | 85.0       | 13 KB          | 27.4 | 44.1 | 33.7
1000-bit set     | 9 MB         | 80.8   | 54.4    | 87.2       | 1.3 MB         | 28.3 | 46.0 | 34.9
Floats set       | 16.2 GB      | 82.4   | 57.5    | 87.5       | 12.3 GB        | 31.7 | 45.6 | 37.0
Comp. floats set | 14.0 GB      | 82.4   | 57.5    | 87.5       | 9.5 GB         | 31.7 | 45.6 | 37.0
MPEG4 set        | 125 GB       | 82.4   | 57.5    | 87.5       | 6.3 GB         | 31.7 | 45.6 | 37.0

Table 5. What sensitive information can be stored in 10 bits?

Impossible (bits)          | Possible (bits)
Credit card num. (≈53)     | Gender (≈2)
Social security num. (≈30) | Skin color (≈3)
Street address (≈28)       | Social class (≈3)
License plate (≈36)        | ...
Personal image (≈100,000)  |
Phone number (≈33)         |
...                        |

Fig. 5. Feature inversion with increasingly compressed features.

cannot contain a full image (approximately 10 KB). Using 10 bits as an example, we can divide sensitive information into two groups based on whether or not it can be captured by 10 bits, as in Table 5. Note that we assume the data was not in the training set, as otherwise the model could be used to extract that potentially sensitive information (training networks with differential privacy can relax this assumption [1]). By capturing only 10 bits, we can assure a user that the stored data does not contain any classes of sensitive information that require more bits to be stored. Furthermore, considering that there are 8B unique people in the world, the identity of a person can be captured in 33 bits, so we cannot reconstruct the identity of a person from 10 bits. The identity can be identifying information, photographic identity, biometrics, etc. This is consistent with the following experiment, where feature-inversion techniques do not seem to work, whereas they do work on regular compressed 16,384-bit features (as in Table 1). This effectively de-identifies the data.

Robustness to Feature Inversion. Storing only a few bits also has applications to preventing the reconstruction of the input from stored features (i.e., feature inversion). Various methods exist for inverting features [10,28], typically by


optimizing an input to match the feature output, with an optional regularization. To evaluate this, we started with a re-implementation of the Fredrikson et al. attack [10] for model inversion (using the open-source implementation at github.com/Koukyosyumei/secure ml), but instead of the class probability, we used an MSE loss between the target feature and the model feature output before the last linear layer, or after the binarizer. We used a two-layer neural network trained on the AT&T Faces Dataset [2]. In Fig. 5 we demonstrate how the inversion deteriorates as the model uses fewer bits. The 1,280-bit result has no compression (40 32-bit numbers), whereas the rest of the results have increasing compression. All models were trained until at least 97.5% accuracy on the training set.

Feature Quantization. In Fredrikson et al. [10] the authors note that rounding the floating point numbers at the 0.01 level seems to make reconstruction difficult and offer that as a mitigation strategy. This suggests that compression of features will likely help defend against model inversion, and we verify this here. For comparison, rounding a 32-bit float (less than 1) at the 0.01 level corresponds to 200 different numbers, which can be encoded in 8 bits, a 4-fold compression. Our method can compress 512 32-bit floats (16,384 bits) into 10 bits, a 1,638-fold compression, while having a defined performance on the original task.

K-Anonymity of Tiny Features. For a system that has N users and stores 10 bits (1,024 unique values), we can assure the user that any data that is stored is indistinguishable from the data of approximately N/1,024 other users. This ensures privacy by implementing k-anonymity [33], assuming the 10 bits are uniformly distributed; for example, if N = 10^7, then k ≈ 10,000. Finally, we can use a variable number of stored bits to ensure any stored bits are sufficiently non-unique, similar to hash-based k-anonymity for password checking [23].
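The MSE-based inversion probe described above might look roughly like the following, assuming a PyTorch setting. The handles net.features (the feature output being matched), the optimizer, the step count, and the image shape used for the AT&T faces are illustrative assumptions, not the authors' code.

```python
import torch

def invert_feature(net, target_feat, image_shape=(1, 1, 112, 92), steps=2000, lr=0.1):
    """Optimize an input image so that net.features(x) matches target_feat (MSE)."""
    x = torch.zeros(image_shape, requires_grad=True)        # start from a blank image
    opt = torch.optim.Adam([x], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = torch.nn.functional.mse_loss(net.features(x), target_feat)
        loss.backward()
        opt.step()
        with torch.no_grad():
            x.clamp_(0, 1)                                   # keep pixels in a valid range
    return x.detach()
```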

6 Conclusion

In this work, we introduce a novel Few-Bit VideoQA problem, which aims to do VideoQA with only a few bits of video information used; we propose a simple and generic FeatComp approach that can be used to learn task-specific tiny features. We experiment over VideoQA benchmarks and demonstrate a surprising finding that a video can be effectively compressed using as little as 10 bits towards accomplishing a task, providing a new perspective on how much visual information helps in VideoQA. Furthermore, we demonstrate the storage, efficiency, and privacy advantages of task-specific tiny features—tiny features ensure no sensitive information is contained while still offering good task performance. We hope these results will influence the community by offering insight into how much visual information is used by question answering systems, and


opening up new applications for storing large amounts of features on-device or in the cloud, limiting privacy issues for stored features, or transmitting only privacy-robust features from a device.

Acknowledgement. We would like to thank Vicente Ordonez for his valuable feedback on the manuscript.

References 1. Abadi, M., et al.: Deep learning with differential privacy. In: Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security (2016) 2. Cambridge, A.L.: The orl database. https://www.cl.cam.ac.uk/research/dtg/ attarchive/facedatabase.html 3. Carreira, J., Zisserman, A.: Quo Vadis, action recognition? A new model and the kinetics dataset. In: CVPR (2017) 4. Chadha, A., Andreopoulos, Y.: Deep perceptual preprocessing for video coding. In: CVPR (2021) 5. Chen, X., et al.: Microsoft coco captions: data collection and evaluation server. arXiv (2015) 6. Choi, J., Han, B.: Task-aware quantization network for JPEG image compression. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12365, pp. 309–324. Springer, Cham (2020). https://doi.org/10.1007/978-3-03058565-5 19 7. Djelouah, A., Campos, J., Schaub-Meyer, S., Schroers, C.: Neural inter-frame compression for video coding. In: ICCV (2019) 8. Feichtenhofer, C., Fan, H., Malik, J., He, K.: SlowFast networks for video recognition. In: ICCV (2019) 9. Feng, R., Wu, Y., Guo, Z., Zhang, Z., Chen, Z.: Learned video compression with feature-level residuals. In: CVPR Workshops (2020) 10. Fredrikson, M., Jha, S., Ristenpart, T.: Model inversion attacks that exploit confidence information and basic countermeasures. In: Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security (2015) 11. Goyal, Y., Khot, T., Summers-Stay, D., Batra, D., Parikh, D.: Making the V in VQA matter: elevating the role of image understanding in visual question answering. In: CVPR, July 2017 12. He, T., Sun, S., Guo, Z., Chen, Z.: Beyond coding: detection-driven image compression with semantically structured bit-stream. In: PCS (2019) 13. Hu, Z., Chen, Z., Xu, D., Lu, G., Ouyang, W., Gu, S.: Improving deep video compression by resolution-adaptive flow coding. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12347, pp. 193–209. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58536-5 12 14. Hu, Z., Lu, G., Xu, D.: FVC: a new framework towards deep video compression in feature space. In: CVPR (2021) 15. Huang, D., Chen, P., Zeng, R., Du, Q., Tan, M., Gan, C.: Location-aware graph convolutional networks for video question answering. In: AAAI (2020) 16. Ioffe, S., Szegedy, C.: Batch normalization: accelerating deep network training by reducing internal covariate shift. In: ICML (2015) 17. Jang, Y., Song, Y., Yu, Y., Kim, Y., Kim, G.: TGIF-QA: toward spatio-temporal reasoning in visual question answering. In: CVPR (2017)


18. Krishna, R., et al.: Visual genome: connecting language and vision using crowdsourced dense image annotations. IJCV 123, 32–73 (2017) 19. Le, T.M., Le, V., Venkatesh, S., Tran, T.: Hierarchical conditional relation networks for video question answering. In: CVPR (2020) 20. Lei, J., et al.: Less is more: ClipBERT for video-and-language learning via sparse sampling. In: CVPR (2021) 21. Lei, J., Yu, L., Bansal, M., Berg, T.L.: TVQA: localized, compositional video question answering. arXiv preprint arXiv:1809.01696 (2018) 22. Li, L., Chen, Y.C., Cheng, Y., Gan, Z., Yu, L., Liu, J.: Hero: Hierarchical encoder for video+language omni-representation pre-training. In: EMNLP (2020) 23. Li, L., Pal, B., Ali, J., Sullivan, N., Chatterjee, R., Ristenpart, T.: Protocols for checking compromised credentials. In: Proceedings of the 2019 ACM SIGSAC Conference on Computer and Communications Security (2019) 24. Lin, J., Liu, D., Li, H., Wu, F.: M-LVC: multiple frames prediction for learned video compression. In: CVPR (2020) 25. Lin, T., Liu, X., Li, X., Ding, E., Wen, S.: BMN: boundary-matching network for temporal action proposal generation. In: ECCV (2019) 26. Loshchilov, I., Hutter, F.: Decoupled weight decay regularization. arXiv preprint arXiv:1711.05101 (2017) 27. Lu, G., Ouyang, W., Xu, D., Zhang, X., Cai, C., Gao, Z.: DVC: an end-to-end deep video compression framework. In: CVPR (2019) 28. Mahendran, A., Vedaldi, A.: Understanding deep image representations by inverting them. In: CVPR (2015) 29. Miech, A., Alayrac, J.B., Smaira, L., Laptev, I., Sivic, J., Zisserman, A.: Endto-end learning of visual representations from uncurated instructional videos. In: CVPR (2020) 30. Miech, A., Zhukov, D., Alayrac, J.B., Tapaswi, M., Laptev, I., Sivic, J.: HowTo100M: learning a text-video embedding by watching hundred million narrated video clips. In: ICCV (2019) 31. Minderer, M., Sun, C., Villegas, R., Cole, F., Murphy, K., Lee, H.: Unsupervised learning of object structure and dynamics from videos. arXiv preprint arXiv:1906.07889 (2019) 32. Russakovsky, O., et al.: ImageNet large scale visual recognition challenge. IJCV 115(3), 211–252 (2015) 33. Samarati, P., Sweeney, L.: Protecting privacy when disclosing information: kanonymity and its enforcement through generalization and suppression (1998) 34. Selvaraju, R.R., Cogswell, M., Das, A., Vedantam, R., Parikh, D., Batra, D.: GradCAM: visual explanations from deep networks via gradient-based localization. In: ICCV (2017) 35. Sullivan, G.J., Ohm, J.R., Han, W.J., Wiegand, T.: Overview of the high efficiency video coding (HEVC) standard. IEEE Trans. Circ. Syst. Video Technol. 22(12), 1649–1668 (2012) 36. Sun, C., Myers, A., Vondrick, C., Murphy, K., Schmid, C.: VideoBERT: a joint model for video and language representation learning. In: ICCV (2019) 37. Tapaswi, M., Zhu, Y., Stiefelhagen, R., Torralba, A., Urtasun, R., Fidler, S.: MovieQA: understanding stories in movies through question-answering. In: CVPR (2016) 38. Toderici, G., et al.: Variable rate image compression with recurrent neural networks. arXiv preprint arXiv:1511.06085 (2015) 39. Toderici, G., et al.: Full resolution image compression with recurrent neural networks. In: CVPR (2017)


40. Tran, D., Wang, H., Torresani, L., Ray, J., LeCun, Y., Paluri, M.: A closer look at spatiotemporal convolutions for action recognition. In: CVPR (2018) 41. Wallace, G.K.: The JPEG still picture compression standard. IEEE Trans. Consum. Electron. 38, 18–34 (1992) 42. Wiegand, T., Sullivan, G.J., Bjontegaard, G., Luthra, A.: Overview of the H.264/AVC video coding standard. IEEE Trans. Circ. Syst. Video Technol. 13, 560–576 (2003) 43. Wu, C.-Y., Singhal, N., Kr¨ ahenb¨ uhl, P.: Video compression through image interpolation. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11212, pp. 425–440. Springer, Cham (2018). https://doi.org/10.1007/ 978-3-030-01237-3 26 44. Xu, D., et al.: Video question answering via gradually refined attention over appearance and motion. In: ACM MM (2017) 45. Zhu, L., Xu, Z., Yang, Y., Hauptmann, A.: Uncovering the temporal context for video question answering. IJCV 124, 409–421 (2017)

ChaLearn LAP Seasons in Drift Challenge: Dataset, Design and Results

Anders Skaarup Johansen1(B), Julio C. S. Jacques Junior2, Kamal Nasrollahi1,3, Sergio Escalera2,4, and Thomas B. Moeslund1

1 Aalborg University, Copenhagen, Denmark
{asjo,kn,tmb}@create.aau.dk
2 Computer Vision Center, Barcelona, Spain
[email protected]
3 Milestone Systems, Brondby, Denmark
[email protected]
4 University of Barcelona, Barcelona, Spain
[email protected]

Abstract. In thermal video security monitoring, the reliability of deployed systems relies on having varied training data that can effectively generalize and give consistent performance in the deployed context. However, for security monitoring of an outdoor environment, the amount of variation introduced to the imaging system would require extensive annotated data to fully cover for training and evaluation. To this end we designed and ran a challenge to stimulate research towards alleviating the impact of concept drift on object detection performance. We used an extension of the Long-Term Thermal Imaging Dataset, composed of thermal data acquired from 14th May 2020 to 30th April 2021, with a total of 1689 2-min clips with bounding-box annotations for 4 different categories. The data covers a wide range of weather conditions and object densities, with the goal of measuring the thermal drift over time from the coldest day/week/month of the dataset. The challenge attracted 184 registered participants, which was considered a success from the perspective of the organizers. While participants managed to achieve higher mAP when compared to a baseline, concept drift remains a strongly impactful factor. This work describes the challenge design, the adopted dataset and the obtained results, as well as discusses top-winning solutions and future directions on the topic.

1 Introduction

In the context of thermal video security monitoring, the sensor types responsible for quantifying the observed infrared radiation as a thermograph can be split into two groups: sensors that produce relative thermographs and sensors that produce absolute thermographs. Absolute thermographs can correlate the observed radiation directly with temperature, whereas relative thermographs produce observations relative to the “coldest” and “warmest” radiation. In security monitoring contexts the absolute temperature readings produced by an absolute thermograph are not necessary and can potentially suppress thermal details


when observing a thermally uniform environment. Furthermore, the price of absolute thermal cameras is much higher than that of their relative counterparts. When performing image recognition tasks the visual appearance of objects and their surroundings is very important, and in an outdoor context that is subject to changes in temperature, weather, sun radiation, among others, the visual appearance of objects and their surroundings changes quite drastically. This is further expanded by societal factors like the recent pandemic, which could introduce mandatory masks. This is known as “concept drift”: the objects remain the same, but the concept definition, which is observed through their representation, changes. While in theory it could be possible to collect a large enough dataset encompassing the weather conditions, the actors within the context, usually people, also dress and act differently over time. Furthermore, the cost of producing such a dataset would be quite extensive, as potentially years worth of data would have to be annotated. Typically, a deployed object detector would start from a pretrained baseline, and the model would have to be retrained when the observed context drifts too far away from the training context. The reliability of such a system is questionable, as deployed algorithms tend not to have a way to quantify their performance during deployment, and extra data would have to be routinely annotated to verify that the system is still performing as expected. To address this issue and foster more research into the long-term reliability of deployed learning-based object detectors, a benchmark for classifying the impact of concept drift could greatly benefit the field.

The ECCV 2022 ChaLearn LAP Seasons in Drift Challenge aims to propose a setting for evaluating the impact of concept drift on a month-to-month basis and for evaluating the impact of concept drift in a weighted manner. The problem of concept drift is exacerbated with limited training data, particularly when the distribution of the visual appearance in the data is similar. To explore the consistency of performance across varied levels of concept drift, particularly for object detection algorithms, an extended set of frames spanning several months was annotated. The challenge attracted a total of 184 participants on its different tracks. With a total of 691 submissions at the different challenge stages and tracks, from over 180 participants, the challenge managed to successfully establish a benchmark for thermal concept drift. Top-winning solutions outperformed the baseline by a large margin following distinct strategies, detailed in Sect. 4.

The rest of the paper is organized as follows. In Sect. 2 we present the related work. The challenge design, which includes a short description of the adopted dataset, evaluation protocol and baseline, is detailed in Sect. 3. Challenge results and top-winning solutions are discussed in Sect. 4. Finally, conclusions and suggestions for future research directions are drawn in Sect. 5.

2 Related Work

Popular thermal detection and segmentation datasets, such as KAIST [13] and FLIR-ADAS [24], provide both thermal and visible images. A large part of academic research has focused on leveraging multi-modal input [10,16,29,30] or on using the aligned visible/thermal pairs for unsupervised domain adaptation between the visible and thermal domains [7,10,25,28]. Approaches that leverage the multi-modal input directly typically use siamese-style networks to perform modality-specific feature extraction and subsequently combine the information with a learned fusion scheme [16,25,29]; alternatively, simple concatenation or addition is performed after initial feature extraction [10,30]. In contrast, a network can be optimized to be domain agnostic: HeatNet [25] and DANNet [28] leverage an adversarial approach to guide the network to extract domain-agnostic features.

It has been shown that in security monitoring contexts the fusion of visible and thermal images outperforms either modality alone [14,17]; however, in real-world scenarios camera installations tend to be single-sensor setups. While thermal cameras are robust to changes in weather and lighting conditions, they still struggle with the change in visual appearance of objects due to changes in scene temperature [6,8,9,15,17]. Early work [9] leveraged edges to highlight objects, making detection robust to this variation as long as the relative contrast between objects and their surroundings remained consistent. Recent studies take research from the visible imaging domain and apply it directly to the thermal domain [6,17]. Until recently, thermal-specific detection methods have been a rarity, and it has been shown that contextual information is important for increasing robustness to day/night variation in thermal-only object detection [15,23]. By conditioning the latent representation with an auxiliary day/night classification head, both day and night accuracy can be significantly increased [15]. A similar increase in performance can be gained with a combination of a shallow feature extractor and residual FPN-style connections [8]. Most notably, the residual connections are leveraged during training to enforce the learning of discriminative features throughout the network; they serve no purpose during inference and can therefore be removed.

3 Challenge Design

The ECCV 2022 Seasons in Drift Challenge1 aimed to spotlight the problem of concept drift in a security monitoring context, to highlight the challenges and limitations of existing methods, and to provide a direction of research for the future. The challenge used an extension of the LTD Dataset [21], which consists of thermal footage spanning multiple seasons, detailed in Sect. 3.1. The challenge was split into 3 different tracks associated with thermal object detection, each with the same evaluation criteria and data but varying the amount of train data as well as the time span of the data, as detailed next.
– Track 1 - Detection at day level: Train on a predefined single day of data and evaluate concept drift across time2. The day is the 13th of February 2020, the coldest day in the recorded data; since the relative thermal appearance of objects varies the least in colder environments, this is our starting point.

1 Challenge - https://chalearnlap.cvc.uab.cat/challenge/51/description/.
2 Track 1 (on Codalab) - https://codalab.lisn.upsaclay.fr/competitions/4272.


– Track 2 - Detection at week level: Train on a predefined single week of data and evaluate concept drift across time3. The selected week is that of the 13th–20th of February 2020 (i.e. expanding from our starting point).
– Track 3 - Detection at month level: Train on a predefined single month of data and evaluate concept drift across time4. The selected month is the entire month of February 2020.
The training data is chosen by selecting the coldest day and its surrounding data, as cold environments introduce the least amount of concept drift. Each track aims at evaluating how robust a given detection method is to concept drift by training on limited data from a specific time period (a day, week or month in February) and evaluating performance across time, by validating and testing on months of unseen data (Jan., Mar., Apr., May, Jun., Jul., Aug. and Sep.). The February data is only present in the training set, and the remaining months are equally split between validation and test.
Each track is composed of two phases, i.e., a development and a test phase. At the development phase, public train data was released and participants needed to submit their predictions with respect to a validation set. At the test (final) phase, participants needed to submit their results with respect to the test data, which was released just a few days before the end of the challenge. Participants were ranked, at the end of the challenge, using the test data. It is important to note that this competition involved the submission of results (and not code). Therefore, participants were required to share their code and trained models after the end of the challenge so that the organizers could reproduce the results submitted at the test phase, in a "code verification stage". At the end of the challenge, top-ranked methods that passed the code verification stage were considered valid submissions.

3.1 The Dataset

The dataset used in the challenge is an extension of the Long-Term Thermal Imaging dataset [21] and spans 188 days in the period from the 14th of May 2020 to the 30th of April 2021, with a total of 1689 2-minute clips sampled at 1 fps with associated bounding box annotations for 4 classes (Human, Bicycle, Motorcycle, Vehicle). The dataset includes data from all hours of the day in a wide array of weather conditions, overlooking the harborfront of Aalborg, Denmark. It depicts the drastic changes in appearance of the objects of interest, as well as of the scene over time, in a static security monitoring context, with the aim of enabling the development of robust algorithms for real-world deployment. Figure 1 illustrates the camera setup and two annotated frames of the dataset, obtained at different time intervals. For a detailed explanation of the dataset's weather contents, an overview can be found in the original dataset paper [21].

3 Track 2 (on Codalab) - https://codalab.lisn.upsaclay.fr/competitions/4273.
4 Track 3 (on Codalab) - https://codalab.lisn.upsaclay.fr/competitions/4276.


Fig. 1. Illustration of the camera setup (a) and two annotated frames of the dataset, captured at different time intervals (b, c).

As for the extended annotations provided with this challenge, we can observe that the distribution of classes is heavily skewed towards the classes that are most commonly observed in the context. As can be seen in Table 1, the total number of occurrences of each class is heavily skewed towards the Person class. Furthermore, as can be seen in Fig. 2, each class follows roughly the same trend in terms of the density with which it occurs. While for all classes the most common number of instances of a given object in an image is one, the range of occurrences is greater for the Person category. The camera used for recording the dataset was elevated above the observed area, so objects often appear very distant from the camera; in combination with the resolution of the camera, most objects appear very small in the image (see Fig. 1). Table 1 summarizes the number of objects from each class pertaining to each size category. The size is classified using the same scheme as the COCO dataset [19], where objects with area < 32², 32² < area < 96², and area > 96² pixels are considered small, medium and large, respectively. The density of object sizes is also illustrated in Fig. 3, where it can be seen more clearly that the vast majority of objects fall within the small category. This holds true for the Person, Bicycle and Motorcycle classes, whereas the Vehicle class covers all size categories more evenly. This is a result of larger vehicles only being allowed to drive in the area closest to the camera.

Table 1. Object frequency observed for each COCO-style size category.

Size     Person      Bicycle   Motorcycle  Vehicle
Small    5.663.804   288.081   27.153      113.552
Medium   176.881     5.192     5.240       37.007
Large    454         7         0           550.696
Total    5.841.139   293.280   32.393      701.255
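For readers who want to reproduce the size breakdown of Table 1, the bucketing follows directly from the COCO area thresholds stated above. The helper below is our own illustrative sketch; the function name is not part of the challenge toolkit.

```python
def coco_size_category(box_w: float, box_h: float) -> str:
    """Bucket a bounding box into the COCO-style size categories used in Table 1.

    Areas are measured in squared pixels: small < 32^2, medium < 96^2, large otherwise.
    """
    area = box_w * box_h
    if area < 32 ** 2:
        return "small"
    if area < 96 ** 2:
        return "medium"
    return "large"


# A distant pedestrian of roughly 10x25 px is "small"; a nearby truck of 150x90 px is "large".
assert coco_size_category(10, 25) == "small"
assert coco_size_category(150, 90) == "large"
```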


Fig. 2. Histograms of object density across the dataset for (a) Person, (b) Bicycle, (c) Vehicle and (d) Motorcycle: number of objects per frame (x-axis) vs. number of occurrences (y-axis).

3.2 Evaluation Protocol

The challenge followed the COCO evaluation5 scheme for mAP. The primary metric is mAP across 10 different IoU thresholds (ranging from 0.5 to 0.95 in 0.05 increments). This is calculated for each month in the validation/test set, and each model is then ranked based on a weighted average over the months (more distant months having a larger weight, as more concept drift is present), referred to as mAPw in the analysis of the results (Table 2). The evaluation is performed leveraging the official COCO evaluation tools6.
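As an illustration of how such a month-weighted score can be computed with the official COCO tooling, the sketch below evaluates one COCO-format prediction file per test month with pycocotools and combines the per-month mAP values with linearly increasing weights. The file layout and the exact weight values are assumptions made for illustration; the paper does not specify the weights used by the challenge server.

```python
# Sketch of the month-weighted score (mAPw), assuming one COCO-format
# ground-truth/prediction file pair per test month. The per-month weights
# below are illustrative placeholders, not the challenge's official values.
from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval

MONTHS = ["Jan", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep"]
WEIGHTS = {m: i + 1 for i, m in enumerate(MONTHS)}  # more drift -> larger weight (assumed)


def monthly_map(gt_json: str, det_json: str) -> float:
    """COCO mAP@[.5:.95] for a single month."""
    coco_gt = COCO(gt_json)
    coco_dt = coco_gt.loadRes(det_json)
    ev = COCOeval(coco_gt, coco_dt, iouType="bbox")
    ev.evaluate()
    ev.accumulate()
    ev.summarize()
    return ev.stats[0]  # AP @ IoU=0.50:0.95


def weighted_map(gt_files: dict, det_files: dict) -> float:
    """Weighted average of the per-month mAP values (our stand-in for mAPw)."""
    scores = {m: monthly_map(gt_files[m], det_files[m]) for m in MONTHS}
    total_w = sum(WEIGHTS.values())
    return sum(WEIGHTS[m] * scores[m] for m in MONTHS) / total_w
```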

3.3 The Baseline

The baseline is a YOLOv5 model with the default configuration from the Ultralytics7 repository, including augmentations. It was trained with a batch size of 64 for 300 epochs, with an input image size of 384×288, and the best performing model was chosen. Naturally, the labels were converted to the normalized YOLO format ([cls] [c_x] [c_y] [w] [h]) for both training and evaluation.

5 https://cocodataset.org/#detection-eval.
6 https://github.com/cocodataset/cocoapi.
7 https://github.com/ultralytics/yolov5.


Fig. 3. Illustration of object size (height×width, in pixels) across the dataset for (a) Person, (b) Bicycle, (c) Vehicle and (d) Motorcycle. The white outlines separate the areas that would be labeled as small, medium and large following the COCO standards.

For submission on the Codalab platform, the predictions were converted back to the ([cls] [tl_x] [tl_y] [br_x] [br_y]) corner coordinates. The models were all trained on the same machine with 2× Nvidia RTX 3090 GPUs, and all training was conducted as multi-GPU training using the PyTorch distributed learning module.
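The two label formats mentioned above differ only in normalization and box parameterization. A minimal sketch of the conversion is given below; the function name and the default image size (the baseline's 384×288 input) are our own choices for illustration rather than part of the challenge tooling.

```python
def yolo_to_corners(cls_id, cx, cy, w, h, img_w=384, img_h=288):
    """Convert a normalized YOLO box ([cls] [c_x] [c_y] [w] [h]) to the
    absolute corner format ([cls] [tl_x] [tl_y] [br_x] [br_y]) used for the
    Codalab submissions. Pass the native frame resolution if it differs from
    the baseline's 384x288 input."""
    tl_x = (cx - w / 2.0) * img_w
    tl_y = (cy - h / 2.0) * img_h
    br_x = (cx + w / 2.0) * img_w
    br_y = (cy + h / 2.0) * img_h
    return cls_id, tl_x, tl_y, br_x, br_y


# Example: a centred box covering half the frame in each dimension.
print(yolo_to_corners(0, 0.5, 0.5, 0.5, 0.5))  # (0, 96.0, 72.0, 288.0, 216.0)
```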

4 Challenge Results and Winning Methods

The challenge ran from the 25th of April 2022 to the 24th of June 2022 through Codalab8 [22], a powerful open source framework for running competitions that involve result or code submission. It attracted a total of 184 registered participants: 82, 52 and 50 on tracks 1, 2 and 3, respectively. During the development phase we received 267 submissions from 17 active teams in track 1, 117 submissions from 6 teams in track 2, and 96 submissions from 4 teams in track 3.

8 Codalab - https://codalab.lisn.upsaclay.fr.


At the test (final) phase, we received 84 submissions from 23 active teams in track 1, 55 submissions from 22 teams in track 2, and 72 submissions from 24 teams in track 3. The reduction in the number of submissions from the development to the test phase is explained by the fact that the maximum number of submissions per participant at the final phase was limited to 3, to minimize the chance of participants improving their results by trial and error.

Table 2. Codalab leaderboards∗ at the test (final) phase.

Participant             mAPw   mAP    Jan    Mar    Apr    May    Jun    Jul    Aug    Sep
Track 1 (day level)
Team GroundTruth∗       .2798  .2832  .3048  .3021  .3073  .2674  .2748  .2306  .2829  .2955
Team heboyong∗          .2400  .2434  .3063  .2952  .2905  .2295  .2318  .1901  .2615  .1419
Team BDD                .2386  .2417  .2611  .2775  .2744  .2383  .2371  .1961  .2365  .2122
Team Charles            .2382  .2404  .2676  .2848  .2794  .2388  .2416  .2035  .2446  .1630
Team Relax              .2279  .2311  .2510  .2642  .2556  .2138  .2336  .1856  .2214  .2235
Baseline∗               .0870  .0911  .1552  .1432  .1150  .0669  .0563  .0641  .0835  .0442
Track 2 (week level)
Team GroundTruth∗       .3236  .3305  .3708  .3502  .3323  .2774  .2924  .2506  .3162  .4542
Team heboyong∗          .3226  .3301  .3691  .3548  .3279  .2827  .2856  .2435  .3112  .4662
Team Hby                .3218  .3296  .3722  .3556  .3256  .2806  .2818  .2432  .3067  .4714
Team PZH                .3087  .3156  .3999  .3588  .3212  .2596  .2744  .2502  .3013  .3592
Team BDD                .3007  .3072  .3557  .3367  .3141  .2562  .2735  .2338  .2936  .3942
Baseline∗               .1585  .1669  .2960  .2554  .2014  .1228  .0982  .1043  .1454  .1118
Track 3 (month level)
Team GroundTruth∗       .3376  .3464  .4142  .3729  .3414  .3032  .2933  .2567  .3112  .4779
Team heboyong∗          .3241  .3316  .3671  .3538  .3289  .2838  .2864  .2458  .3132  .4735
Team BDD                .3121  .3186  .3681  .3445  .3248  .2680  .2843  .2450  .3062  .4076
Team PZH                .3087  .3156  .3999  .3588  .3212  .2596  .2744  .2502  .3013  .3592
Team BingDwenDwen       .2986  .3054  .3565  .3477  .3241  .2702  .2707  .2337  .2808  .3598
Baseline∗               .1964  .2033  .3068  .2849  .2044  .1559  .1535  .1441  .1944  .1827
Top solutions are highlighted in bold, and solutions that passed the "code verification stage" are marked with a ∗.

4.1 The Leaderboard

The leaderboards at the test phase for the different tracks are shown in Table 2. Note that we only show the top-5 solutions per track, in addition to the baseline results. Top solutions that passed the "code verification stage" are highlighted in bold. The full leaderboard of each track can be found on the respective Codalab competition webpage.

As expected, Table 2 shows that overall better results are obtained with more train data. That is, a model trained at the month level is overall more accurate than the same model trained at the week level, which is in turn overall more accurate than the one trained at the day level. However, the performance improvement obtained when training at the month level (compared to the week level) is smaller than that obtained when training at the week level (compared to the day level), particularly when a large shift in time is observed (e.g., from Jun. to Sep.), suggesting that the increase of train data from week to month level may have a small impact when large shifts are observed. This was also observed by Team heboyong (described in Sect. 4.3), who reported having used only week-level data to train their model on Tracks 2 and 3, based on the observation that using more data was not improving the final result. This raises an interesting point: even for winning approaches, the variation in the training data is much more important than the amount of training data. A further analysis of what causes the loss of mAP across time is given in Sect. 4.4.

Table 3 shows some general information about the top winning approaches. As can be seen from Table 3, common strategies employed by the top-winning solutions are the use of pre-trained models combined with data augmentation. Next, we briefly introduce the top-winning solutions that passed the code verification stage, based on the information provided by the authors. For detailed information, we refer the reader to the associated fact sheets, available for download on the challenge webpage (see footnote 1). Two participants (i.e., Team GroundTruth and Team heboyong) ranked best on all tracks. Each participant applied the same method on all tracks, trained at the day, week or month level, as detailed in the following.

Table 3. General information about the top winning approaches (Top-1: Team GroundTruth; Top-2: Team heboyong). The fact-sheet criteria covered are: pre-trained model; external data; data augmentation; use of the provided validation set as part of the training set at the final phase; handcrafted features; spatio-temporal feature extraction; object tracking; leveraging timestamp information; use of empty frames present in the dataset; and construction of any type of prior to condition for visual variety.

4.2 Top-1: Team GroundTruth

Team GroundTruth proposed to take advantage of temporal and contextual information to improve object detection performance. Based on Scaled-YOLOv4 [26], they first perform sparse sampling at the input. The best sampling setting is defined based on experiments with different sampling methods (i.e., average sampling, random sampling, and active sampling). Mosaic [1] data augmentation is then used to improve the detector's recognition ability and robustness to small objects. To obtain a more accurate and robust model at the inference stage, they adopt Model Soups [27] for model integration, given the results obtained by Scaled-YOLOv4-P6 and Scaled-YOLOv4-P7 detectors trained with different hyperparameters, combined with horizontal-flip data augmentation to further improve detection performance. Given a video sequence of region proposals and their corresponding class scores, Seq-NMS [12] associates bounding boxes in adjacent frames using a simple overlap criterion and then selects boxes to maximize a sequence score. Those boxes are used to suppress overlapping boxes in their respective frames and are subsequently re-scored to boost weaker detections. Thus, Seq-NMS [12] is applied as post-processing to further improve the performance. An overview of the proposed pipeline is illustrated in Fig. 4.

Fig. 4. Top-1 winning solution pipeline: Team GroundTruth.
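To make the model-integration step concrete, the snippet below sketches the uniform "soup" of Model Soups [27]: the parameters of several fine-tuned checkpoints of the same architecture are simply averaged. This is a minimal illustration of the idea only, not Team GroundTruth's actual implementation, and the checkpoint paths in the usage comment are hypothetical.

```python
import copy

import torch


def uniform_soup(state_dicts):
    """Uniform 'model soup' [27]: average the parameters of several fine-tuned
    checkpoints of the same architecture. Illustrative sketch only."""
    soup = copy.deepcopy(state_dicts[0])
    for key in soup:
        if torch.is_floating_point(soup[key]):
            # Element-wise mean of this parameter across all checkpoints.
            soup[key] = torch.stack([sd[key].float() for sd in state_dicts]).mean(dim=0)
    return soup


# Usage sketch (checkpoint paths are hypothetical placeholders):
# ckpts = [torch.load(p, map_location="cpu")["state_dict"] for p in ["p6_a.pt", "p6_b.pt"]]
# model.load_state_dict(uniform_soup(ckpts))
```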

4.3 Top-2: Team heboyong

Team heboyong employed Cascade R-CNN [4], a two-stage object detection algorithm, as the main architecture, with a Swin Transformer [20] backbone. According to the authors, the Swin Transformer gives better results than other CNN-based backbones. CBNetV2 [18] is used to enhance the Swin Transformer and further improve accuracy, and MMDetection [5] is adopted as the main framework. During training, only 30% of the train data is randomly sampled to reduce overfitting, combined with different data augmentation methods, such as large-scale jitter, random crop, MixUp [31], Albu augmentation [3] and Copy-Paste [11]. At the inference stage, they use Soft-NMS [2] and flip augmentation to further enhance the results. An overview of the proposed pipeline is illustrated in Fig. 5. They also reported not having addressed well the long-tail problem caused by the extreme sparsity of the bicycle and motorcycle categories, which resulted in low mAP for these two categories.
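For reference, Soft-NMS [2], used by the team at inference time, decays the scores of detections that overlap an already selected box instead of discarding them outright. The NumPy sketch below implements the Gaussian variant as a self-contained illustration; the team presumably relied on the framework's built-in operator rather than custom code like this.

```python
import numpy as np


def soft_nms(boxes: np.ndarray, scores: np.ndarray, sigma: float = 0.5,
             score_thr: float = 0.001) -> list:
    """Gaussian Soft-NMS [2]: decay the scores of boxes that overlap a selected
    detection by exp(-IoU^2 / sigma) instead of removing them.

    boxes: (N, 4) array of [x1, y1, x2, y2]; scores: (N,) array.
    Returns the indices of kept boxes in order of selection.
    """
    boxes = boxes.astype(np.float64)
    scores = scores.astype(np.float64).copy()
    idxs = np.arange(len(scores))
    keep = []
    while idxs.size > 0:
        best = int(np.argmax(scores[idxs]))
        i = idxs[best]
        keep.append(int(i))
        idxs = np.delete(idxs, best)
        if idxs.size == 0:
            break
        # IoU between the selected box and every remaining candidate.
        x1 = np.maximum(boxes[i, 0], boxes[idxs, 0])
        y1 = np.maximum(boxes[i, 1], boxes[idxs, 1])
        x2 = np.minimum(boxes[i, 2], boxes[idxs, 2])
        y2 = np.minimum(boxes[i, 3], boxes[idxs, 3])
        inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        area_r = (boxes[idxs, 2] - boxes[idxs, 0]) * (boxes[idxs, 3] - boxes[idxs, 1])
        iou = inter / (area_i + area_r - inter)
        scores[idxs] *= np.exp(-(iou ** 2) / sigma)  # Gaussian score decay
        idxs = idxs[scores[idxs] > score_thr]        # drop boxes whose score collapsed
    return keep
```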

4.4 What Challenges the Models the Most?

In this section we analyze the performance of the baseline, Team GroundTruth's and Team heboyong's models on the test set. In particular, we inspect the performance of each model with regard to temperature, humidity, object area and object density. Temperature and humidity are chosen because these two factors were found to have the highest correlation with visual concept drift [21]. Additionally, because of the uneven distribution of object densities across the dataset, the impact of object density is also investigated.


Impact of Temperature. As can be observed in Fig. 6, the performance of the models degrades as the temperature increases. This is expected, as the available training data was picked from the coldest month, so warmer scenes are not properly represented in the training data; as mentioned in Sect. 3, this was done deliberately, since temperature is one of the most impactful factors of concept drift in thermal images [21]. The baseline model shows severe degradation compared to the winner and Team heboyong, while the performance degrades consistently for all models. Interestingly, Team heboyong's method is distinctly more sensitive to concept drift with the smaller training set, while the winning solution seems to perform consistently regardless of the amount of training data.
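The per-weather breakdowns in Figs. 6 and 7 amount to grouping frame-level scores by the recorded sensor metadata. Below is a minimal sketch of such a binned analysis with pandas; the column names (`temperature`, `frame_ap`) and the input file are hypothetical placeholders for illustration, not the organizers' actual tooling.

```python
# Hedged sketch: bin per-frame AP by the temperature recorded for each frame,
# as in the per-temperature analysis of Fig. 6. Column names and the input
# CSV ("frame_metrics.csv") are assumed placeholders.
import pandas as pd

df = pd.read_csv("frame_metrics.csv")  # columns: frame_id, temperature, humidity, frame_ap
temp_bins = pd.cut(df["temperature"], bins=range(-10, 41, 5))  # 5-degree Celsius bins
per_bin = df.groupby(temp_bins)["frame_ap"].agg(["mean", "count"])
print(per_bin)

# The same pattern applies to humidity (Fig. 7), object size or object density:
# simply swap the column being binned.
```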

Fig. 5. Top-2 winning solution pipeline: Team heboyong.

Impact of Humidity. According to the original dataset paper [21], humidity is one of the most impactful factors of concept drift, as it tends to correlate positively with the different types of weather. This leads to a quite interesting observation, which can be made across all tracks: as can be seen in Fig. 7, the mAP of the detectors increases with humidity across all tracks. This could be because higher humidity tends to correlate with the level of rain clouds, which would explain partially cloudy conditions being more difficult for the detectors, as the visual appearance of the image is less uniform.
Impact of Object Size. As would be expected, the models converge towards fitting bounding boxes to the most dominant object size in the training data (see Table 1).


Fig. 6. Overview of performance with samples separated with regard to the temperature recorded for the given frame, for (a) Track 1: Day, (b) Track 2: Week and (c) Track 3: Month.

Fig. 7. Overview of performance with samples separated with regard to the humidity recorded for the given frame, for (a) Track 1: Day, (b) Track 2: Week and (c) Track 3: Month.

As shown in Fig. 8, the models obtain very good performance on the most common object sizes and struggle with objects as they increase in size and rarity. In this case the participants show a strong improvement over the baseline and also manage to become more robust towards the rarer cases. As can also be observed in the figure, this problem is increasingly alleviated as the amount of training data grows.
Impact of Object Density. As shown in Fig. 2, the density of objects in the majority of the images is towards the lower end; as such, one would expect the detectors' mAP to degrade when a scene becomes more crowded and the individual objects become more difficult to detect due to occlusions. However, what is observed (Fig. 9) is that the mAP of the highlighted methods remains consistent as density increases, while the performance across densities also correlates with the amount of training data.

Fig. 8. Overview of performance with samples separated with regard to the size of the objects' bounding boxes, for (a) Track 1: Day, (b) Track 2: Week and (c) Track 3: Month.

Fig. 9. Overview of performance with samples separated with regard to the object density of the frame, for (a) Track 1: Day, (b) Track 2: Week and (c) Track 3: Month.

5 Conclusions

The Seasons in Drift challenge attracted over 180 participants, who made 480 submissions during the validation phase and 211 submissions on the test set for a potential place on the final leaderboard. While measuring the impact of thermal concept drift on detection performance in a security monitoring context is a very understudied problem, a large number of people participated. Many of the participants managed to beat the proposed baseline by quite a large margin, especially with limited training data, and achieved solutions that degrade far less with respect to drift than the baseline. Although great improvements can be observed, the problem of concept drift still negatively affects the performance of the participating methods. Interestingly, while the winner's and Team heboyong's methods use different architectures, the impact of concept drift seems to transcend the choice of state-of-the-art object detectors. This lends merit to investigating methods that condition the layers of the network on the input image, introducing an avenue for the model to learn an adaptive approach, as opposed to learning a generalized model specific to the thermal conditions of the training context. As can be observed in Figs. 8 and 9, the size of the observed objects seems to be a more challenging factor than the density with which they occur. Detection of small objects is a known and well-documented problem and, despite the nature of thermal cameras, persists as an issue in the thermal domain. Further research could be done to learn more scale-invariant object detectors, or to rely entirely on methods other than an RPN or anchors to produce object proposals.

Acknowledgements. This work has been partially supported by the Milestone Research Program at AAU, the Spanish project PID2019-105093GB-I00 and by ICREA under the ICREA Academia programme.

References
1. Bochkovskiy, A., Wang, C., Liao, H.M.: YOLOv4: optimal speed and accuracy of object detection. CoRR abs/2004.10934 (2020)
2. Bodla, N., Singh, B., Chellappa, R., Davis, L.S.: Soft-NMS - improving object detection with one line of code. In: ICCV (2017)
3. Buslaev, A.V., Iglovikov, V.I., Khvedchenya, E., Parinov, A., Druzhinin, M., Kalinin, A.A.: Albumentations: fast and flexible image augmentations. Information 11(2), 125 (2020)
4. Cai, Z., Vasconcelos, N.: Cascade R-CNN: delving into high quality object detection. In: CVPR (2018)
5. Chen, K., et al.: MMDetection: open MMLab detection toolbox and benchmark. CoRR abs/1906.07155 (2019)
6. Chen, Y.Y., Jhong, S.Y., Li, G.Y., Chen, P.H.: Thermal-based pedestrian detection using Faster R-CNN and region decomposition branch. In: ISPACS (2019)
7. Dai, D., Van Gool, L.: Dark model adaptation: semantic image segmentation from daytime to nighttime. In: ITSC (2018)
8. Dai, X., Yuan, X., Wei, X.: TIRNet: object detection in thermal infrared images for autonomous driving. Appl. Intell. 51(3), 1244–1261 (2021)
9. Davis, J.W., Keck, M.A.: A two-stage template approach to person detection in thermal imagery. In: WACV-W (2005)
10. Devaguptapu, C., Akolekar, N., M Sharma, M., N Balasubramanian, V.: Borrow from anywhere: pseudo multi-modal object detection in thermal imagery. In: CVPR-W (2019)
11. Ghiasi, G., et al.: Simple Copy-Paste is a strong data augmentation method for instance segmentation. In: CVPR (2021)
12. Han, W., et al.: Seq-NMS for video object detection. CoRR abs/1602.08465 (2016)
13. Hwang, S., Park, J., Kim, N., Choi, Y., So Kweon, I.: Multispectral pedestrian detection: benchmark dataset and baseline. In: CVPR (2015)
14. Jia, X., Zhu, C., Li, M., Tang, W., Zhou, W.: LLVIP: a visible-infrared paired dataset for low-light vision. In: ICCV (2021)
15. Kieu, M., Bagdanov, A.D., Bertini, M., del Bimbo, A.: Task-conditioned domain adaptation for pedestrian detection in thermal imagery. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12367, pp. 546–562. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58542-6_33


16. Kim, J., Kim, H., Kim, T., Kim, N., Choi, Y.: MLPD: multi-label pedestrian detector in multispectral domain. Robot. Autom. Lett. 6(4), 7846–7853 (2021)
17. Krišto, M., Ivasic-Kos, M., Pobar, M.: Thermal object detection in difficult weather conditions using YOLO. IEEE Access 8, 125459–125476 (2020)
18. Liang, T., et al.: CBNetV2: a composite backbone network architecture for object detection. CoRR abs/2107.00420 (2021)
19. Lin, T.-Y., et al.: Microsoft COCO: common objects in context. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8693, pp. 740–755. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-10602-1_48
20. Liu, Z., et al.: Swin Transformer: hierarchical vision transformer using shifted windows. In: ICCV (2021)
21. Nikolov, I., et al.: Seasons in drift: a long-term thermal imaging dataset for studying concept drift. In: NeurIPS (2021)
22. Pavao, A., et al.: CodaLab Competitions: an open source platform to organize scientific challenges. Ph.D. thesis, Université Paris-Saclay, FRA (2022)
23. Siris, A., Jiao, J., Tam, G.K., Xie, X., Lau, R.W.: Scene context-aware salient object detection. In: ICCV (2021)
24. Teledyne FLIR: FLIR ADAS Dataset. https://www.flir.com/oem/adas/adas-dataset-form/
25. Vertens, J., Zürn, J., Burgard, W.: HeatNet: bridging the day-night domain gap in semantic segmentation with thermal images. In: IROS (2020)
26. Wang, C., Bochkovskiy, A., Liao, H.: Scaled-YOLOv4: scaling cross stage partial network. In: CVPR (2021)
27. Wortsman, M., et al.: Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time. CoRR abs/2203.05482 (2022)
28. Wu, X., Wu, Z., Guo, H., Ju, L., Wang, S.: DANNet: a one-stage domain adaptation network for unsupervised nighttime semantic segmentation. In: CVPR (2021)
29. Zhang, H., Fromont, E., Lefèvre, S., Avignon, B.: Multispectral fusion for object detection with cyclic fuse-and-refine blocks. In: ICIP (2020)
30. Zhang, H., Fromont, E., Lefèvre, S., Avignon, B.: Guided attentive feature fusion for multispectral pedestrian detection. In: WACV (2021)
31. Zhang, H., Cissé, M., Dauphin, Y.N., Lopez-Paz, D.: mixup: beyond empirical risk minimization. CoRR abs/1710.09412 (2017)
