Document Analysis and Recognition – ICDAR 2021 Workshops: Lausanne, Switzerland, September 5–10, 2021, Proceedings, Part I (Image Processing, Computer Vision, Pattern Recognition, and Graphics) 303086197X, 9783030861971



English · 499 pages


Table of contents :
Foreword
Preface
Organization
Contents – Part I
Contents – Part II
ICDAR 2021 Workshop on Graphics Recognition (GREC)
GREC 2021 Preface
Organization
General Chair
Program Committee Chairs
Steering Committee
Program Committee
Relation-Based Representation for Handwritten Mathematical Expression Recognition
1 Introduction
2 Related Works
2.1 Sequence Transcription Approach for HME Recognition.
2.2 Representation of MEs in Sequence Transcription Model
3 Our Approach
3.1 Relation-Based Representation
3.2 End-to-End HME Recognition
4 Experiments
4.1 Datasets
4.2 Ablation Experiments of Relation-Based Method
4.3 Detailed Experiment Results on CROHME Datasets
4.4 In-depth Analysis
5 Conclusion
References
A Public Ground-Truth Dataset for Handwritten Circuit Diagram Images
1 Introduction
2 Related Work
3 Images
3.1 Drawing Surfaces and Instruments
3.2 Capturing
4 Annotations
4.1 Auxiliary Classes
4.2 Symbol Classes
4.3 Geometry
5 Statistics
6 Baseline Performance
References
A Self-supervised Inverse Graphics Approach for Sketch Parametrization
1 Introduction
2 Related Work
2.1 Inverse Graphics and Parametrized Curves
2.2 Transformers and Parallel Decoding
3 Methodology
3.1 Mathematical Background
3.2 Problem Formulation
3.3 Model Architecture
3.4 Objective Functions
4 Experimental Evaluation
4.1 Dataset and Implementation Details
4.2 Model Evaluation
4.3 Zero-Shot Evaluation
4.4 Towards High Fidelity Approximations via Overfitting
5 Conclusion
References
Border Detection for Seamless Connection of Historical Cadastral Maps
1 Introduction
2 Related Work
3 Historical Cadastral Maps
4 Proposed Detection Methods
4.1 Landmark Detector
4.2 Border Line Detector
4.3 Break Point Detector
4.4 Edge-Line Detector
5 Experiments
5.1 Dataset
5.2 Evaluation Criteria
5.3 Landmark Detection
5.4 Border Line and Break Points Detection
5.5 Edge-Line Detection
5.6 Final Results
6 Conclusions and Future Work
References
Data Augmentation for End-to-End Optical Music Recognition
1 Introduction
2 Methodology
2.1 Neural End-to-End Recognition Framework
2.2 Data Augmentation Procedures
3 Experiments
3.1 Corpora
3.2 Metrics
3.3 CRNN Configuration
4 Results
5 Conclusions
References
Graph-Based Object Detection Enhancement for Symbolic Engineering Drawings
1 Introduction
2 Related Work
3 Printed Circuit ED Dataset
4 Methodology
4.1 Symbol Recognition
4.2 Connection Identification
4.3 Anomaly Detection
4.4 Training Set for Anomaly Detection
4.5 Anomaly Detection Techniques
4.6 Detection of False Negatives
4.7 Graph Refinement
5 Results
5.1 Fine-Tuning of Faster RCNN
5.2 Anomaly Detection
5.3 Improving Faster RCNN Recall Using Anomaly Detection
5.4 Graph Refinement
6 Conclusion
References
ScanSSD-XYc: Faster Detection for Math Formulas
1 Introduction
2 ScanSSD-XYc
3 Results
4 Conclusion
References
Famous Companies Use More Letters in Logo: A Large-Scale Analysis of Text Area in Logo
1 Introduction
2 Related Work
2.1 Logo Design Analysis
2.2 Clustering and Ranking with Deep Representation
3 Logo Image Dataset—LLD-logo
4 Analysis 1: How Much Are Texts Used in Logo?
4.1 Text Detection in Logo Image
4.2 Text Area Ratio and the Number of Text Boxes
4.3 The Ratio of Three Logo Types
4.4 Distribution of the Text Area Ratio
4.5 The Location of the Text Area
4.6 Does a Famous Company Show the Name on Its Logo or Not?
5 Analysis 2: Cluster-Wise Correlation Analysis Between the Number of Followers and the Text Area Ratio
5.1 Logo Image Clustering by DeepCluster
5.2 The Number of Followers at Each Cluster
5.3 The Text Area Ratio of Each Cluster
5.4 Logos in Several Clusters
5.5 Cluster-Wise Correlation Analysis Between the Number of Followers and the Text Area Ratio
6 Analysis 3: Estimation of the Number of Followers from Logo Image
6.1 Regression-Based Estimation
6.2 Ranking-Based Estimation
7 Conclusion
References
MediTables: A New Dataset and Deep Network for Multi-category Table Localization in Medical Documents
1 Introduction
2 Related Work
3 MediTables Dataset
4 The Modified U-Net Deep Network
5 Experiment Setup
5.1 Datasets
5.2 Training and Implementation Details
6 Experiments and Analysis
7 Conclusion
References
Online Analysis of Children Handwritten Words in Dictation Context
1 Introduction
2 Existing Copying Analysis Engine
2.1 Segmentation
2.2 Letter Hypothesis Computation
2.3 Best Segmentation Path Search
3 Adaption of the Engine in a Dictation Context
3.1 Double Input, Baseline Strategy
3.2 Phonetic Hypotheses Generation Strategy
4 Feedback Typology
5 Results
5.1 Analysis Results
5.2 Feedback Results
6 Conclusion
References
A Transcription Is All You Need: Learning to Align Through Attention
1 Introduction
2 Related Work
3 Proposed Method
3.1 Sequence to Sequence Model
3.2 Attention Mask Tuning
4 Experiments and Results
5 Conclusion
References
Accurate Graphic Symbol Detection in Ancient Document Digital Reproductions
1 Introduction
2 Related Work
3 Data Preprocessing
3.1 Scraping Public Databases
3.2 Cleaning Duplicates
3.3 Binarization
3.4 Dataset Design
3.5 Initial Symbol Clustering
4 Modeling Approach
4.1 Updating Identification Probabilities
4.2 Latent Clustering
4.3 Optimization Objective
5 Quantifying Model Performance
6 Conclusions
References
ICDAR 2021 Workshop on Camera-Based Document Analysis and Recognition (CBDAR)
CBDAR 2021 Preface
Organization
Workshop Chairs
Program Committee
Inscription Segmentation Using Synthetic Inscription Images for Text Detection at Stone Monuments
1 Introduction
2 Related Work
3 Proposed Method
3.1 Creating Images of Pseudo-inscription
3.2 Network Structure
4 Experiments
4.1 Training Data, Validation Data, and Data Augmentation
4.2 Implementation Details
4.3 Results for Validation Data
4.4 Results for Real Inscription Images
5 Conclusion
References
Transfer Learning for Scene Text Recognition in Indian Languages
1 Introduction
1.1 Related Work
2 Datasets and Motivation
3 Models
4 Experiments
5 Results
6 Conclusion
References
How Far Deep Learning Systems for Text Detection and Recognition in Natural Scenes are Affected by Occlusion?
1 Introduction
2 Literature Review
2.1 Text Detection in Natural Scenes
2.2 Text Recognition in Natural Scenes
2.3 Datasets of Text in Natural Images
3 ISTD-OC Dataset
4 Experimental Evaluation
4.1 Evaluation Metrics
4.2 Evaluation of Text Detection Approaches
4.3 Evaluation of Text Recognition Approaches
5 Conclusion and Future Work
References
CATALIST: CAmera TrAnsformations for Multi-LIngual Scene Text Recognition
1 Introduction
2 Related Work
3 Motivation
4 Methodology
4.1 The CATALIST Model
4.2 The ALCHEMIST Videos
4.3 The Videos
5 Experiments
6 Results
7 Frame-Wise Accuracies for all Transformations
8 Conclusion
9 Future Work
References
DDocE: Deep Document Enhancement with Multi-scale Feature Aggregation and Pixel-Wise Adjustments
1 Introduction
2 Related Work
3 Multi-scale Feature Aggregation for Document Enhancement
4 Experiments
4.1 Quantitative Evaluation
4.2 Qualitative Evaluation
4.3 Ablation Studies
5 Conclusions
References
Handwritten Chess Scoresheet Recognition Using a Convolutional BiLSTM Network
1 Introduction
2 Offline Chess Scoresheet Recognition
2.1 Preprocessing
2.2 BiLSTM Neural Network Architecture
2.3 Post-processing
2.4 Data Augmentation
3 The Handwritten Chess Scoresheet (HCS) Dataset
4 Training and Results
5 Conclusion
References
ICDAR 2021 Workshop on Arabic and Derived Script Analysis and Recognition (ASAR)
ASAR 2021 Preface
Organization
General Chairs
Program Committee
RASAM – A Dataset for the Recognition and Analysis of Scripts in Arabic Maghrebi
1 Introduction
2 Dataset and Arabic Maghrebi Manuscripts of the BULAC
2.1 Existing Resources for Arabic Scripts
2.2 Dataset Composition
2.3 Selected Manuscripts
3 Ground-Truth Content and Creation
3.1 Structure Description
3.2 Specifications for Transcription
4 Evaluation of the Crowdsourcing Campaign and HTR Models
4.1 General Considerations and Implementation Protocol
4.2 Benefits of Fine-Tuning and Transfer Learning for an Under-Resourced Language
5 Conclusion
References
Towards Boosting the Accuracy of Non-latin Scene Text Recognition
1 Introduction
2 Related Work
3 Motivation and Datasets
4 Underlying Model
5 Results
6 Conclusion
References
Aolah Databases for New Arabic Online Handwriting Recognition Algorithm
1 Introduction
2 Present Databases for Handwritten Arabic Letters
2.1 LMCA (2008) [11, 12]
2.2 OHASD (2010) [13]
2.3 ADAB (2011) [14, 15]
2.4 ALTEC (2014) [16]
2.5 QHW (2014) [17]
2.6 Online-KHATT (2018) [18]
3 Proposed AOLAH Databases for Arabic Online Handwriting Letters and Strokes
4 Proposed Arabic Online Handwriting Recognition Algorithm
4.1 Preprocessing Stage
4.2 Feature Extraction Stage
4.3 Classification Stage
5 Optimum Proposed Recognition Model
5.1 Testing of the Optimum Model
6 Conclusion
References
Line Segmentation of Individual Demographic Data from Arabic Handwritten Population Registers of Ottoman Empire
1 Introduction
2 Related Works
3 Dataset Description
4 Automatic Object Detection and Line Segmentation Method
4.1 CNN-based Object Detection Method
4.2 Line Segmentation Method
5 Experimental Results and Discussion
5.1 Individual Object Detection Results
5.2 Line Detection Results
6 Conclusion
References
Improving Handwritten Arabic Text Recognition Using an Adaptive Data-Augmentation Algorithm
1 Introduction and Related Works
2 Challenges and Characteristics of Arabic Handwriting
3 Deep Neural Networks for Handwritten Text Recognition
4 Adaptive Data Augmentation
5 Experimentation and Results
5.1 Handwritten Arabic Text Databases
5.2 Applying Adaptive Data Augmentation Algorithm
5.3 Results of Arabic Handwriting Recognition
5.4 Comparison with Other State-Of-The-Art Systems
6 Conclusion
References
High Performance Urdu and Arabic Video Text Recognition Using Convolutional Recurrent Neural Networks
1 Introduction
2 Related Work
3 Methodology
3.1 Preprocessing
3.2 Model Architecture
3.3 Model Training
3.4 Loss Function
4 Experimental Evaluation
4.1 Datasets
4.2 Data Preparation
4.3 Data Augmentation
4.4 Evaluation Metrics
4.5 Hyper-parameter Selection
4.6 Experiments Performed
5 Results and Discussion
6 Conclusion
References
ASAR 2021 Online Arabic Writer Identification Competition
1 Introduction
2 ADAB Database
3 Participating Systems
4 Results and Discussion
5 Conclusion
References
ASAR 2021 Competition on Online Signal Restoration Using Arabic Handwriting Dhad Dataset
1 Introduction
2 Dataset and Task Description
3 Participating Methods
3.1 RHM (REGIM-Heuristic-Method)
3.2 RDL (REGIM-DL-LSTM)
3.3 RDV (REGIM-DL-VGG)
3.4 RDAG (REGIM-DL-Attention-GRU)
4 Evaluation and Results
4.1 Root Mean Square Error
4.2 Euclidean Distance
4.3 Visual Comparison of the Recovered Velocity
4.4 Analysis and Discussion
5 Conclusion
References
ASAR 2021 Competition on Online Arabic Character Recognition: ACRC
1 Introduction
2 LMCA Database
3 ACRC 2021: Competition Setup
3.1 Tasks
3.2 Evaluation Criteria
3.3 Experimental Protocol
4 Description of Participated Systems
4.1 REGIM-GS-DBLSTM
4.2 REGIM-DBLSTM-SVM
4.3 REGIM-EPC-LSTM
4.4 RDAG (REGIM-DL-Attention-GRU)
5 Experimental Results
5.1 Evaluation Results
5.2 Discussion
6 Conclusion
References
ASAR 2021 Competition on Online Arabic Word Recognition
1 Introduction
2 ADAB Database
2.1 Presentation
3 Participating Systems
3.1 REGIM-FPC-Segmentation
3.2 REGIM-GS-Segmentation
3.3 REGIM-DBLSTM-SVM
4 Tests and Results
4.1 General Remarks
4.2 Challenge 1
4.3 Challenge 2
4.4 Challenge 3
5 Conclusion
References
ICDAR 2021 Workshop on Computational Document Forensics (IWCDF)
IWCDF 2021 Preface
Organization
Workshop Chairs
Recognition of Laser-Printed Characters Based on Creation of New Laser-Printed Characters Datasets
1 Introduction
2 Related Work
2.1 Document Forgery Detection
2.2 Causes of Differences on Laser Printed Characters Qualities
3 Method
3.1 Creation of New Datasets
3.2 Features
3.3 Recognition
4 Experiments
4.1 Data Acquirement
4.2 Data Recognition
5 Conclusion
5.1 Create New Laser-Printed Characters Dataset
5.2 Confirm the Proposed Method Accuracy
References
CheckSim: A Reference-Based Identity Document Verification by Image Similarity Measure
1 Introduction
2 State of the Art
3 Problem Statement
4 Proposed Method
4.1 Model Architectures
4.2 CNN Architecture Selection
4.3 A Custom Cost Function for Prediction
5 Experiments and Results
5.1 Training Data Preparation
5.2 Some Implementation Details
5.3 Results and Interpretation
6 Conclusion
References
Crossing Number Features: From Biometrics to Printed Character Matching
1 Introduction
2 Challenges of Printed Document Protection
3 Proposed Method
3.1 Pre-processing Operations
3.2 Feature Extraction
3.3 Template Matching
4 Experimental Results
4.1 Database Description
4.2 Feature Extraction
4.3 Character Matching
5 Discussion
6 Conclusions
References
Writer Characterization from Handwriting on Papyri Using Multi-step Feature Learning
1 Introduction
2 Related Work
3 Materials and Methods
3.1 Dataset
3.2 Pre-processing
3.3 Data Preparation
3.4 Two-Step Fine Tuning
4 Experiments and Results
5 Conclusion
References
Robust Hashing for Character Authentication and Retrieval Using Deep Features and Iterative Quantization
1 Introduction
2 Literature Review
2.1 Non-deep Learning Based Hashing
2.2 Deep Learning Based Hashing
3 Proposed Model
3.1 Feature Extraction
3.2 Hash Construction
3.3 Character Authentication and Retrieval
4 Experimental Results
4.1 Character Datasets and Performance Metrics
4.2 Parameter Determination
4.3 Performance Test
4.4 Performance Comparison
5 Conclusion
References
Correction to: Accurate Graphic Symbol Detection in Ancient Document Digital Reproductions
Correction to: Chapter “Accurate Graphic Symbol Detection in Ancient Document Digital Reproductions” in: E. H. Barney Smith and U. Pal (Eds.): Document Analysis and Recognition – ICDAR 2021 Workshops, LNCS 12916, https://doi.org/10.1007/978-3-030-86198-8_12
Author Index

LNCS 12916

Elisa H. Barney Smith Umapada Pal (Eds.)

Document Analysis and Recognition – ICDAR 2021 Workshops Lausanne, Switzerland, September 5–10, 2021 Proceedings, Part I

Lecture Notes in Computer Science

Founding Editors
Gerhard Goos, Karlsruhe Institute of Technology, Karlsruhe, Germany
Juris Hartmanis, Cornell University, Ithaca, NY, USA

Editorial Board Members
Elisa Bertino, Purdue University, West Lafayette, IN, USA
Wen Gao, Peking University, Beijing, China
Bernhard Steffen, TU Dortmund University, Dortmund, Germany
Gerhard Woeginger, RWTH Aachen, Aachen, Germany
Moti Yung, Columbia University, New York, NY, USA


More information about this subseries at http://www.springer.com/series/7412


Editors
Elisa H. Barney Smith, Boise State University, Boise, ID, USA
Umapada Pal, Indian Statistical Institute, Kolkata, India

ISSN 0302-9743, ISSN 1611-3349 (electronic)
Lecture Notes in Computer Science
ISBN 978-3-030-86197-1, ISBN 978-3-030-86198-8 (eBook)
https://doi.org/10.1007/978-3-030-86198-8
LNCS Sublibrary: SL6 – Image Processing, Computer Vision, Pattern Recognition, and Graphics

© Springer Nature Switzerland AG 2021, corrected publication 2022

Chapter “Accurate Graphic Symbol Detection in Ancient Document Digital Reproductions” is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/). For further details see licence information in the chapter.

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG.
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland.

Foreword

Our warmest welcome to the proceedings of ICDAR 2021, the 16th IAPR International Conference on Document Analysis and Recognition, which was held in Switzerland for the first time. Organizing an international conference of significant size during the COVID-19 pandemic, with the goal of welcoming at least some of the participants physically, is similar to navigating a rowboat across the ocean during a storm. Fortunately, we were able to work together with partners who have shown a tremendous amount of flexibility and patience including, in particular, our local partners, namely the Beaulieu convention center in Lausanne, EPFL, and Lausanne Tourisme, and also the international ICDAR advisory board and IAPR-TC 10/11 leadership teams who have supported us not only with excellent advice but also financially, encouraging us to set up a hybrid format for the conference.

We were not a hundred percent sure if we would see each other in Lausanne but we remained confident, together with almost half of the attendees who registered for on-site participation. We relied on the hybridization support of a motivated team from the Luleå University of Technology during the pre-conference, and professional support from Imavox during the main conference, to ensure a smooth connection between the physical and the virtual world. Indeed, our welcome is extended especially to all our colleagues who were not able to travel to Switzerland this year. We hope you had an exciting virtual conference week, and look forward to seeing you in person again at another event of the active DAR community.

With ICDAR 2021, we stepped into the shoes of a longstanding conference series, which is the premier international event for scientists and practitioners involved in document analysis and recognition, a field of growing importance in the current age of digital transitions. The conference is endorsed by IAPR-TC 10/11 and celebrates its 30th anniversary this year with the 16th edition. The very first ICDAR conference was held in St. Malo, France in 1991, followed by Tsukuba, Japan (1993), Montreal, Canada (1995), Ulm, Germany (1997), Bangalore, India (1999), Seattle, USA (2001), Edinburgh, UK (2003), Seoul, South Korea (2005), Curitiba, Brazil (2007), Barcelona, Spain (2009), Beijing, China (2011), Washington DC, USA (2013), Nancy, France (2015), Kyoto, Japan (2017), and Sydney, Australia in 2019.

The attentive reader may have remarked that this list of cities includes several venues for the Olympic Games. This year the conference was hosted in Lausanne, which is the headquarters of the International Olympic Committee. Not unlike the athletes who were recently competing in Tokyo, Japan, the researchers profited from a healthy spirit of competition, aimed at advancing our knowledge on how a machine can understand written communication. Indeed, following the tradition from previous years, 13 scientific competitions were held in conjunction with ICDAR 2021 including, for the first time, three so-called “long-term” competitions, addressing wider challenges that may continue over the next few years.


Other highlights of the conference included the keynote talks given by Masaki Nakagawa, recipient of the IAPR/ICDAR Outstanding Achievements Award, and Mickaël Coustaty, recipient of the IAPR/ICDAR Young Investigator Award, as well as our distinguished keynote speakers Prem Natarajan, vice president at Amazon, who gave a talk on “OCR: A Journey through Advances in the Science, Engineering, and Productization of AI/ML”, and Beáta Megyesi, professor of computational linguistics at Uppsala University, who elaborated on “Cracking Ciphers with ‘AI-in-the-loop’: Transcription and Decryption in a Cross-Disciplinary Field”.

A total of 340 publications were submitted to the main conference, which was held at the Beaulieu convention center during September 8–10, 2021. Based on the reviews, our Program Committee chairs accepted 40 papers for oral presentation and 142 papers for poster presentation. In addition, nine articles accepted for the ICDAR-IJDAR journal track were presented orally at the conference and a workshop was integrated in a poster session. Furthermore, 12 workshops, 2 tutorials, and the doctoral consortium were held during the pre-conference at EPFL during September 5–7, 2021, focusing on specific aspects of document analysis and recognition, such as graphics recognition, camera-based document analysis, and historical documents.

The conference would not have been possible without hundreds of hours of work done by volunteers in the organizing committee. First of all we would like to express our deepest gratitude to our Program Committee chairs, Josep Lladós, Daniel Lopresti, and Seiichi Uchida, who oversaw a comprehensive reviewing process and designed the intriguing technical program of the main conference. We are also very grateful for all the hours invested by the members of the Program Committee to deliver high-quality peer reviews. Furthermore, we would like to highlight the excellent contribution by our publication chairs, Liangrui Peng, Fouad Slimane, and Oussama Zayene, who negotiated a great online visibility of the conference proceedings with Springer and ensured flawless camera-ready versions of all publications. Many thanks also to our chairs and organizers of the workshops, competitions, tutorials, and the doctoral consortium for setting up such an inspiring environment around the main conference. Finally, we are thankful for the support we have received from the sponsorship chairs, from our valued sponsors, and from our local organization chairs, which together enabled us to put in the extra effort required for a hybrid conference setup.

Our main motivation for organizing ICDAR 2021 was to give practitioners in the DAR community a chance to showcase their research, both at this conference and its satellite events. Thank you to all the authors for submitting and presenting your outstanding work. We sincerely hope that you enjoyed the conference and the exchange with your colleagues, be it on-site or online.

September 2021

Andreas Fischer Rolf Ingold Marcus Liwicki

Preface

Our heartiest welcome to the proceedings of the ICDAR 2021 Workshops, which were organized under the 16th International Conference on Document Analysis and Recognition (ICDAR) held in Lausanne, Switzerland during September 5–10, 2021. We are delighted that this conference was able to include 13 workshops. The workshops were held in Lausanne during September 5–7, 2021. Some were held in a hybrid live/online format and others were held entirely online, with space at the main conference for in-person participants to attend. The workshops received over 100 papers on diverse document analysis topics, and these volumes collect the edited papers from 12 of the workshops. We sincerely thank the ICDAR general chairs for trusting us with the responsibility for the workshops, and for assisting us with the complicated logistics in order to include remote participants. We also want to thank the workshop organizers for their involvement in this event of primary importance in our field. Finally, we thank the workshop presenters and authors without whom the workshops would not exist. September 2021

Elisa H. Barney Smith Umapada Pal

Organization

Organizing Committee

General Chairs
Andreas Fischer, University of Applied Sciences and Arts Western Switzerland
Rolf Ingold, University of Fribourg, Switzerland
Marcus Liwicki, Luleå University of Technology, Sweden

Program Committee Chairs
Josep Lladós, Computer Vision Center, Spain
Daniel Lopresti, Lehigh University, USA
Seiichi Uchida, Kyushu University, Japan

Workshop Chairs
Elisa H. Barney Smith, Boise State University, USA
Umapada Pal, Indian Statistical Institute, India

Competition Chairs
Harold Mouchère, University of Nantes, France
Foteini Simistira, Luleå University of Technology, Sweden

Tutorial Chairs
Véronique Eglin, Institut National des Sciences Appliquées, France
Alicia Fornés, Computer Vision Center, Spain

Doctoral Consortium Chairs
Jean-Christophe Burie, La Rochelle University, France
Nibal Nayef, MyScript, France


Publication Chairs
Liangrui Peng, Tsinghua University, China
Fouad Slimane, University of Fribourg, Switzerland
Oussama Zayene, University of Applied Sciences and Arts Western Switzerland, Switzerland

Sponsorship Chairs
David Doermann, University at Buffalo, USA
Koichi Kise, Osaka Prefecture University, Japan
Jean-Marc Ogier, University of La Rochelle, France

Local Organization Chairs
Jean Hennebert, University of Applied Sciences and Arts Western Switzerland, Switzerland
Anna Scius-Bertrand, University of Applied Sciences and Arts Western Switzerland, Switzerland
Sabine Süsstrunk, École Polytechnique Fédérale de Lausanne, Switzerland

Industrial Liaison
Aurélie Lemaitre, University of Rennes, France

Social Media Manager
Linda Studer, University of Fribourg, Switzerland

Workshops Organizers

W01-Graphics Recognition (GREC)
Jean-Christophe Burie, La Rochelle University, France
Richard Zanibbi, Rochester Institute of Technology, USA
Motoi Iwata, Osaka Prefecture University, Japan
Pau Riba, Universitat Autònoma de Barcelona, Spain

W02-Camera-based Document Analysis and Recognition (CBDAR)
Sheraz Ahmed, DFKI, Kaiserslautern, Germany
Muhammad Muzzamil Luqman, La Rochelle University, France


W03-Arabic and Derived Script Analysis and Recognition (ASAR)
Adel M. Alimi, University of Sfax, Tunisia
Bidyut Baran Chaudhuri, Indian Statistical Institute, Kolkata, India
Fadoua Drira, University of Sfax, Tunisia
Tarek M. Hamdani, University of Monastir, Tunisia
Amir Hussain, Edinburgh Napier University, UK
Imran Razzak, Deakin University, Australia

W04-Computational Document Forensics (IWCDF)
Nicolas Sidère, La Rochelle University, France
Imran Ahmed Siddiqi, Bahria University, Pakistan
Jean-Marc Ogier, La Rochelle University, France
Chawki Djeddi, Larbi Tebessi University, Algeria
Haikal El Abed, Technische Universitaet Braunschweig, Germany
Xunfeng Lin, Deakin University, Australia

W05-Machine Learning (WML)
Umapada Pal, Indian Statistical Institute, Kolkata, India
Yi Yang, University of Technology Sydney, Australia
Xiao-Jun Wu, Jiangnan University, China
Faisal Shafait, National University of Sciences and Technology, Pakistan
Lianwen Jin, South China University of Technology, China
Miguel A. Ferrer, University of Las Palmas de Gran Canaria, Spain

W06-Open Services and Tools for Document Analysis (OST)
Fouad Slimane, University of Fribourg, Switzerland
Oussama Zayene, University of Applied Sciences and Arts Western Switzerland, Switzerland
Lars Vögtlin, University of Fribourg, Switzerland
Paul Märgner, University of Fribourg, Switzerland
Ridha Ejbali, National School of Engineers Gabes, Tunisia

W07-Industrial Applications of Document Analysis and Recognition (WIADAR)
Elisa H. Barney Smith, Boise State University, USA
Vincent Poulain d'Andecy, Yooz, France
Hiroshi Tanaka, Fujitsu, Japan

W08-Computational Paleography (IWCP)
Isabelle Marthot-Santaniello, University of Basel, Switzerland
Hussein Mohammed, University of Hamburg, Germany


W09-Document Images and Language (DIL)
Andreas Dengel, DFKI and University of Kaiserslautern, Germany
Cheng-Lin Liu, Institute of Automation of Chinese Academy of Sciences, China
David Doermann, University of Buffalo, USA
Errui Ding, Baidu Inc., China
Hua Wu, Baidu Inc., China
Jingtuo Liu, Baidu Inc., China

W10-Graph Representation Learning for Scanned Document Analysis (GLESDO)
Rim Hantach, Engie, France
Rafika Boutalbi, Trinov, France, and University of Stuttgart, Germany
Philippe Calvez, Engie, France
Balsam Ajib, Trinov, France
Thibault Defourneau, Trinov, France

Contents – Part I

ICDAR 2021 Workshop on Graphics Recognition (GREC) Relation-Based Representation for Handwritten Mathematical Expression Recognition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Thanh-Nghia Truong, Huy Quang Ung, Hung Tuan Nguyen, Cuong Tuan Nguyen, and Masaki Nakagawa


A Public Ground-Truth Dataset for Handwritten Circuit Diagram Images . . . . Felix Thoma, Johannes Bayer, Yakun Li, and Andreas Dengel


A Self-supervised Inverse Graphics Approach for Sketch Parametrization. . . . Albert Suso, Pau Riba, Oriol Ramos Terrades, and Josep Lladós


Border Detection for Seamless Connection of Historical Cadastral Maps . . . . Ladislav Lenc, Martin Prantl, Jiří Martínek, and Pavel Král


Data Augmentation for End-to-End Optical Music Recognition . . . . . . . . . . . Juan C. López-Gutiérrez, Jose J. Valero-Mas, Francisco J. Castellanos, and Jorge Calvo-Zaragoza


Graph-Based Object Detection Enhancement for Symbolic Engineering Drawings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Syed Mizanur Rahman, Johannes Bayer, and Andreas Dengel ScanSSD-XYc: Faster Detection for Math Formulas . . . . . . . . . . . . . . . . . . Abhisek Dey and Richard Zanibbi Famous Companies Use More Letters in Logo: A Large-Scale Analysis of Text Area in Logo . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Shintaro Nishi, Takeaki Kadota, and Seiichi Uchida MediTables: A New Dataset and Deep Network for Multi-category Table Localization in Medical Documents . . . . . . . . . . . . . . . . . . . . . . . . . Akshay Praveen Deshpande, Vaishnav Rao Potlapalli, and Ravi Kiran Sarvadevabhatla




Online Analysis of Children Handwritten Words in Dictation Context . . . . . . Omar Krichen, Simon Corbillé, Eric Anquetil, Nathalie Girard, and Pauline Nerdeux


A Transcription Is All You Need: Learning to Align Through Attention . . . . Pau Torras, Mohamed Ali Souibgui, Jialuo Chen, and Alicia Fornés




Accurate Graphic Symbol Detection in Ancient Document Digital Reproductions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Zahra Ziran, Eleonora Bernasconi, Antonella Ghignoli, Francesco Leotta, and Massimo Mecella


ICDAR 2021 Workshop on Camera-Based Document Analysis and Recognition (CBDAR) Inscription Segmentation Using Synthetic Inscription Images for Text Detection at Stone Monuments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Naoto Morita, Ryunosuke Inoue, Masashi Yamada, Takatoshi Naka, Atsuko Kanematsu, Shinya Miyazaki, and Junichi Hasegawa Transfer Learning for Scene Text Recognition in Indian Languages. . . . . . . . Sanjana Gunna, Rohit Saluja, and C. V. Jawahar How Far Deep Learning Systems for Text Detection and Recognition in Natural Scenes are Affected by Occlusion?. . . . . . . . . . . . . . . . . . . . . . . Aline Geovanna Soares, Byron Leite Dantas Bezerra, and Estanislau Baptista Lima CATALIST: CAmera TrAnsformations for Multi-LIngual Scene Text Recognition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Shivam Sood, Rohit Saluja, Ganesh Ramakrishnan, and Parag Chaudhuri DDocE: Deep Document Enhancement with Multi-scale Feature Aggregation and Pixel-Wise Adjustments . . . . . . . . . . . . . . . . . . . . . . . . . . Karina O. M. Bogdan, Guilherme A. S. Megeto, Rovilson Leal, Gustavo Souza, Augusto C. Valente, and Lucas N. Kirsten Handwritten Chess Scoresheet Recognition Using a Convolutional BiLSTM Network . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Owen Eicher, Denzel Farmer, Yiyan Li, and Nishatul Majid







ICDAR 2021 Workshop on Arabic and Derived Script Analysis and Recognition (ASAR) RASAM – A Dataset for the Recognition and Analysis of Scripts in Arabic Maghrebi . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Chahan Vidal-Gorène, Noëmie Lucas, Clément Salah, Aliénor Decours-Perez, and Boris Dupin Towards Boosting the Accuracy of Non-latin Scene Text Recognition . . . . . . Sanjana Gunna, Rohit Saluja, and C. V. Jawahar

265

282

Contents – Part I

xv

Aolah Databases for New Arabic Online Handwriting Recognition Algorithm. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Samia Heshmat and Mohamed Abdelnafea

294

Line Segmentation of Individual Demographic Data from Arabic Handwritten Population Registers of Ottoman Empire . . . . . . . . . . . . . . . . . Yekta Said Can and M. Erdem Kabadayı

312

Improving Handwritten Arabic Text Recognition Using an Adaptive Data-Augmentation Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Mohamed Eltay, Abdelmalek Zidouri, Irfan Ahmad, and Yousef Elarian

322

High Performance Urdu and Arabic Video Text Recognition Using Convolutional Recurrent Neural Networks . . . . . . . . . . . . . . . . . . . . . . . . . Abdul Rehman, Adnan Ul-Hasan, and Faisal Shafait

336

ASAR 2021 Online Arabic Writer Identification Competition . . . . . . . . . . . . Thameur Dhieb, Houcine Boubaker, Sourour Njah, Mounir Ben Ayed, and Adel M. Alimi ASAR 2021 Competition on Online Signal Restoration Using Arabic Handwriting Dhad Dataset . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Besma Rabhi, Abdelkarim Elbaati, Tarek M. Hamdani, and Adel M. Alimi ASAR 2021 Competition on Online Arabic Character Recognition: ACRC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Yahia Hamdi, Houcine Boubaker, Tarek M. Hamdani, and Adel M. Alimi ASAR 2021 Competition on Online Arabic Word Recognition . . . . . . . . . . . Hanen Akouaydi, Houcine Boubaker, Sourour Njah, Mourad Zaied, and Adel M. Alimi

353

366

379 390

ICDAR 2021 Workshop on Computational Document Forensics (IWCDF) Recognition of Laser-Printed Characters Based on Creation of New Laser-Printed Characters Datasets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Takeshi Furukawa

407

CheckSim: A Reference-Based Identity Document Verification by Image Similarity Measure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Nabil Ghanmi, Cyrine Nabli, and Ahmad-Montaser Awal

422

Crossing Number Features: From Biometrics to Printed Character Matching . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Pauline Puteaux and Iuliia Tkachenko

437

xvi

Contents – Part I

Writer Characterization from Handwriting on Papyri Using Multi-step Feature Learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Sidra Nasir, Imran Siddiqi, and Momina Moetesum Robust Hashing for Character Authentication and Retrieval Using Deep Features and Iterative Quantization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Musab Al-Ghadi, Théo Azzouza, Petra Gomez-Krämer, Jean-Christophe Burie, and Mickaël Coustaty Correction to: Accurate Graphic Symbol Detection in Ancient Document Digital Reproductions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Zahra Ziran, Eleonora Bernasconi, Antonella Ghignoli, Francesco Leotta, and Massimo Mecella Author Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

451

466

C1

483

Contents – Part II

ICDAR 2021 Workshop on Machine Learning (WML) Benchmarking of Shallow Learning and Deep Learning Techniques with Transfer Learning for Neurodegenerative Disease Assessment Through Handwriting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Vincenzo Dentamaro, Paolo Giglio, Donato Impedovo, and Giuseppe Pirlo

7

Robust End-to-End Offline Chinese Handwriting Text Page Spotter with Text Kernel. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Zhihao Wang, Yanwei Yu, Yibo Wang, Haixu Long, and Fazheng Wang

21

Data Augmentation vs. PyraD-DCNN: A Fast, Light, and Shift Invariant FCNN for Text Recognition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Ahmad-Montaser Awal, Timothée Neitthoffer, and Nabil Ghanmi

36

A Handwritten Text Detection Model Based on Cascade Feature Fusion Network Improved by FCOS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Ruiqi Feng, Fujia Zhao, Shanxiong Chen, Shixue Zhang, and Dingwang Wang

51

Table Structure Recognition Using CoDec Encoder-Decoder . . . . . . . . . . . . Bhanupriya Pegu, Maneet Singh, Aakash Agarwal, Aniruddha Mitra, and Karamjit Singh

66

Advertisement Extraction Using Deep Learning . . . . . . . . . . . . . . . . . . . . . Boraq Madi, Reem Alaasam, Ahmad Droby, and Jihad El-Sana

81

Detection and Localisation of Struck-Out-Strokes in Handwritten Manuscripts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Arnab Poddar, Akash Chakraborty, Jayanta Mukhopadhyay, and Prabir Kumar Biswas Temporal Classification Constraint for Improving Handwritten Mathematical Expression Recognition . . . . . . . . . . . . . . . . . . . . . . . . . . . . Cuong Tuan Nguyen, Hung Tuan Nguyen, Kei Morizumi, and Masaki Nakagawa Using Robust Regression to Find Font Usage Trends . . . . . . . . . . . . . . . . . Kaigen Tsuji, Seiichi Uchida, and Brian Kenji Iwana

98

113

126

xviii

Contents – Part II

Binarization Strategy Using Multiple Convolutional Autoencoder Network for Old Sundanese Manuscript Images . . . . . . . . . . . . . . . . . . . . . . . . . . . . Erick Paulus, Jean-Christophe Burie, and Fons J. Verbeek A Connected Component-Based Deep Learning Model for Multi-type Struck-Out Component Classification. . . . . . . . . . . . . . . . . . . . . . . . . . . . . Palaiahnakote Shivakumara, Tanmay Jain, Nitish Surana, Umapada Pal, Tong Lu, Michael Blumenstein, and Sukalpa Chanda Contextualized Knowledge Base Sense Embeddings in Word Sense Disambiguation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Mozhgan Saeidi, Evangelos Milios, and Norbert Zeh

142

158

174

ICDAR 2021 Workshop on Open Services and Tools for Document Analysis (OST) Automatic Generation of Semi-structured Documents. . . . . . . . . . . . . . . . . . Djedjiga Belhadj, Yolande Belaïd, and Abdel Belaïd DocVisor: A Multi-purpose Web-Based Interactive Visualizer for Document Image Analytics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Khadiravana Belagavi, Pranav Tadimeti, and Ravi Kiran Sarvadevabhatla

191

206

ICDAR 2021 Workshop on Industrial Applications of Document Analysis and Recognition (WIADAR) Object Detection Based Handwriting Localization . . . . . . . . . . . . . . . . . . . . Yuli Wu, Yucheng Hu, and Suting Miao Toward an Incremental Classification Process of Document Stream Using a Cascade of Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Joris Voerman, Ibrahim Souleiman Mahamoud, Aurélie Joseph, Mickael Coustaty, Vincent Poulain d’Andecy, and Jean-Marc Ogier Automating Web GUI Compatibility Testing Using X-BROT: Prototyping and Field Trial. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Hiroshi Tanaka A Deep Learning Digitisation Framework to Mark up Corrosion Circuits in Piping and Instrumentation Diagrams. . . . . . . . . . . . . . . . . . . . . . . . . . . Luis Toral, Carlos Francisco Moreno-García, Eyad Elyan, and Shahram Memon

225

240

255

268

Contents – Part II

Playful Interactive Environment for Learning to Spell at Elementary School. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Sofiane Medjram, Véronique Eglin, Stephane Bres, Adrien Piffaretti, and Jobert Timothée

xix

277

ICDAR 2021 Workshop on Computational Paleography (IWCP) A Computational Approach of Armenian Paleography . . . . . . . . . . . . . . . . . Chahan Vidal-Gorène and Aliénor Decours-Perez

295

Handling Heavily Abbreviated Manuscripts: HTR Engines vs Text Normalisation Approaches . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Jean-Baptiste Camps, Chahan Vidal-Gorène, and Marguerite Vernet

306

Exploiting Insertion Symbols for Marginal Additions in the Recognition Process to Establish Reading Order . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Daniel Stökl Ben Ezra, Bronson Brown-DeVost, and Pawel Jablonski

317

Neural Representation Learning for Scribal Hands of Linear B . . . . . . . . . . . Nikita Srivatsan, Jason Vega, Christina Skelton, and Taylor Berg-Kirkpatrick READ for Solving Manuscript Riddles: A Preliminary Study of the Manuscripts of the 3rd ṣaṭka of the Jayadrathayāmala . . . . . . . . . . . . Olga Serbaeva and Stephen White

325

339

ICDAR 2021 Workshop on Document Images and Language (DIL) A Span Extraction Approach for Information Extraction on Visually-Rich Documents . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Tuan-Anh D. Nguyen, Hieu M. Vu, Nguyen Hong Son, and Minh-Tien Nguyen Recurrent Neural Network Transducer for Japanese and Chinese Offline Handwritten Text Recognition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Trung Tan Ngo, Hung Tuan Nguyen, Nam Tuan Ly, and Masaki Nakagawa MTL-FoUn: A Multi-Task Learning Approach to Form Understanding . . . . . Nishant Prabhu, Hiteshi Jain, and Abhishek Tripathi VisualWordGrid: Information Extraction from Scanned Documents Using a Multimodal Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Mohamed Kerroumi, Othmane Sayem, and Aymen Shabou

353

364

377

389

xx

Contents – Part II

A Transformer-Based Math Language Model for Handwritten Math Expression Recognition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Huy Quang Ung, Cuong Tuan Nguyen, Hung Tuan Nguyen, Thanh-Nghia Truong, and Masaki Nakagawa Exploring Out-of-Distribution Generalization in Text Classifiers Trained on Tobacco-3482 and RVL-CDIP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Stefan Larson, Navtej Singh, Saarthak Maheshwari, Shanti Stewart, and Uma Krishnaswamy Labeling Document Images for E-Commence Products with Tree-Based Segment Re-organizing and Hierarchical Transformer . . . . . . . . . . . . . . . . . Peng Li, Pingguang Yuan, Yong Li, Yongjun Bao, and Weipeng Yan Multi-task Learning for Newspaper Image Segmentation and Baseline Detection Using Attention-Based U-Net Architecture . . . . . . . . . . . . . . . . . . Anukriti Bansal, Prerana Mukherjee, Divyansh Joshi, Devashish Tripathi, and Arun Pratap Singh Data-Efficient Information Extraction from Documents with Pre-trained Language Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Clément Sage, Thibault Douzon, Alex Aussem, Véronique Eglin, Haytham Elghazel, Stefan Duffner, Christophe Garcia, and Jérémy Espinas






ICDAR 2021 Workshop on Graph Representation Learning for Scanned Document Analysis (GLESDO) Representing Standard Text Formulations as Directed Graphs . . . . . . . . . . . . Frieda Josi, Christian Wartena, and Ulrich Heid


Multivalent Graph Matching for Symbol Recognition . . . . . . . . . . . . . . . . . D. K. Ho, J. Y. Ramel, and N. Monmarché


Key Information Recognition from Piping and Instrumentation Diagrams: Where We Are? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Rim Hantach, Gisela Lechuga, and Philippe Calvez Graph Representation Learning in Document Wikification . . . . . . . . . . . . . . Mozhgan Saeidi, Evangelos Milios, and Norbert Zeh


Graph-Based Deep Generative Modelling for Document Layout Generation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Sanket Biswas, Pau Riba, Josep Lladós, and Umapada Pal


Author Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .


ICDAR 2021 Workshop on Graphics Recognition (GREC)

GREC 2021 Preface

Graphics recognition is the subfield of document recognition dealing with graphic entities such as tables, charts, illustrations in figures, and notations (e.g., for music and mathematics). Graphics often help describe complex ideas much more effectively than text alone. As a result, recognizing graphics is useful for understanding the information content in documents, the intentions of document authors, and for identifying the domain of discourse in a document. Since the 1980s, researchers in the graphics recognition community have addressed the analysis and interpretation of graphical documents (e.g., electrical circuit diagrams, engineering drawings, etc.), handwritten and printed graphical elements/cues (e.g., logos, stamps, annotations, etc.), graphics-based information retrieval (comics, music scores, etc.) and sketches, to name just a few of the challenging topics in this area.

The GREC workshops provide an excellent opportunity for researchers and practitioners at all levels of experience to meet and share new ideas and knowledge about graphics recognition methods. The workshops enjoy strong participation from researchers in both industry and academia. The aim of this workshop is to maintain a very high level of interaction and creative discussions between participants, maintaining a workshop spirit, and not being tempted by a ‘mini-conference’ model.

The 14th edition of the International Workshop on Graphic Recognition (GREC 2021) built on the success of the thirteen previous editions held at Penn State University (USA, 1995), Nancy (France, 1997), Jaipur (India, 1999), Kingston (Canada, 2001), Barcelona (Spain, 2003), Hong Kong (China, 2005), Curitiba (Brazil, 2007), La Rochelle (France, 2009), Seoul (South Korea, 2011), Lehigh (USA, 2013), Nancy (France, 2015), Kyoto (Japan, 2017), and Sydney (Australia, 2019).

Traditionally, for each paper session at GREC, an invited presentation describes the state of the art and open questions for the session’s topic, which is followed by short presentations of each paper. Each session concludes with a panel discussion moderated by the invited speaker, in which the authors and attendees discuss the papers presented along with the larger issues identified in the invited presentation.

For this 14th edition of GREC, the authors had the opportunity to submit short or long papers depending on the maturity of their research. From 14 submissions, we selected 12 papers from authors in eight different countries, comprising nine long papers and three short papers. Each submission was reviewed by at least two expert reviewers, with the majority (10 papers) reviewed by three expert reviewers. We would like to take this opportunity to thank the Program Committee members for their meticulous reviewing efforts. A hybrid mode was set up to welcome both on-site and online participants. The workshop consisted of three sessions dealing with the issues of “Handwritten Graphics”, “Typeset Graphics”, and “Handwritten Text and Annotations”. Timothy Hospedales, from the Institute of Perception, Action and Behaviour of the University of



Edinburgh, UK, offered an interesting keynote talk focused on the topic of sketch analysis, a recent issue addressed by many researchers of our community.

September 2021

Jean-Christophe Burie Richard Zanibbi Motoi Iwata Pau Riba

Organization

General Chair
Jean-Christophe Burie, La Rochelle University, France

Program Committee Chairs
Richard Zanibbi, Rochester Institute of Technology, USA
Motoi Iwata, Osaka Prefecture University, Japan
Pau Riba, Universitat Autònoma de Barcelona, Spain

Steering Committee
Alicia Fornés, Universitat Autònoma de Barcelona, Spain
Bart Lamiroy, Université de Lorraine, France
Rafael Lins, Federal University of Pernambuco, Brazil
Josep Lladós, Universitat Autònoma de Barcelona, Spain
Jean-Marc Ogier, Université de la Rochelle, France

Program Committee
Sébastien Adam, University of Rouen Normandy, France
Eric Anquetil, INSA, France
Samit Biswas, Indian Institute of Engineering Science and Technology, India
Jorge Calvo-Zaragoza, University of Alicante, Spain
Bertrand Coüasnon, INSA, France
Mickaël Coustaty, La Rochelle Université, France
Kenny Davila, Universidad Tecnológica Centroamericana, Honduras
Sounak Dey, Universitat Autònoma de Barcelona, Spain
Alicia Fornés, Universitat Autònoma de Barcelona, Spain
Ichiro Fujinaga, McGill University, Canada
Alexander Gribov, Environmental Systems Research Institute, USA
Nina S. T. Hirata, University of São Paulo, Brazil
Bart Lamiroy, University of Reims Champagne-Ardenne, France
Christoph Langenhan, Technical University of Munich, Germany
Rafael Lins, Federal University of Pernambuco, Brazil


Josep Lladós, Universitat Autònoma de Barcelona, Spain
Alexander Pacha, Vienna University of Technology, Austria
Umapada Pal, Indian Statistical Institute, India
Oriol Ramos-Terrades, Universitat Autònoma de Barcelona, Spain
Romain Raveaux, University of Tours, France
Christophe Rigaud, La Rochelle Université, France
K. C. Santosh, University of South Dakota, USA


Relation-Based Representation for Handwritten Mathematical Expression Recognition

Thanh-Nghia Truong¹, Huy Quang Ung¹, Hung Tuan Nguyen², Cuong Tuan Nguyen¹ (✉), and Masaki Nakagawa¹

¹ Department of Computer and Information Sciences, Tokyo University of Agriculture and Technology, 2-24-16 Naka-cho, Koganei-shi, Tokyo 184-8588, Japan
[email protected], [email protected]
² Institute of Global Innovation Research, Tokyo University of Agriculture and Technology, 2-24-16 Naka-cho, Koganei-shi, Tokyo 184-8588, Japan

© Springer Nature Switzerland AG 2021
E. H. Barney Smith and U. Pal (Eds.): ICDAR 2021 Workshops, LNCS 12916, pp. 7–19, 2021.
https://doi.org/10.1007/978-3-030-86198-8_1

Abstract. This paper proposes a relation-based sequence representation that enhances offline handwritten mathematical expression (HME) recognition. Commonly, a LaTeX-based sequence represents the 2D structure of an HME as a 1D sequence. Consequently, the LaTeX-based sequence becomes longer, and HME recognition systems have difficulty in extracting its 2D structure. We propose a new representation for HMEs according to the relations of symbols, which shortens the LaTeX-based representation. We use an offline end-to-end HME recognition system that adopts weakly supervised learning to evaluate the proposed representation. Recognition experiments indicate that the proposed relation-based representation helps the HME recognition system achieve higher performance than the LaTeX-based representation. In fact, the HME recognition system achieves recognition rates of 53.35%, 52.14%, and 53.13% on the datasets of the Competition on Recognition of Online Handwritten Mathematical Expressions (CROHME) 2014, 2016, and 2019, respectively. These results are more than 2 percentage points higher than the LaTeX-based system.

Keywords: Relation-based representation · Offline recognition · Handwritten mathematical expressions · End-to-end recognition

1 Introduction

Mathematical expressions play an essential role in scientific documents since they are indispensable for describing problems, theories, and solutions in math, physics, and many other fields. Due to the rapid emergence of pen-based and touch-based input devices such as digital pens, tablets, and smartphones, people have begun to use handwriting interfaces as an input method. Although this input method is natural and convenient, it is useful if and only if handwritten mathematical expressions (HMEs) are correctly recognized.

There are two approaches to recognizing handwriting, based on the type of patterns. One approach uses sequences of pen-tip or finger-top coordinates collected from modern electronic devices, termed online patterns. A sequence of coordinates from pen/finger-down to pen/finger-up is called a stroke.



Online input allows strokes to be separated using time sequence information, but it is troubled by writing-order variation and stroke duplication. Another approach processes handwritten images captured from a scanner or camera, termed offline patterns. Offline patterns are also easily converted from online patterns by discarding temporal information. Generally, offline recognition has a problem of segmentation, but it is free from various stroke orders or duplicated strokes. It is also free from the order of symbols written to form an HME. In this paper, we focus on offline HME recognition since it can be used to recognize HMEs written on paper and those written on tablets, which are widely spreading into education.

HME recognition has been studied for decades, from the early top-down methods [1] and bottom-up methods [2] to grammar-based [3–5], tree-based [6], and graph-based methods [7], until deep neural networks (DNNs) were introduced in recent years [8–10]. Generally, three subtasks are involved in both online and offline HME recognition [11, 12]: symbol segmentation, symbol recognition, and structural analysis. If these three subtasks are executed independently, however, the total task of HME recognition is limited because the isolated subtasks are separately optimized.

The DNN-based approach has recently been successfully used to parse the structures directly from HME training samples. It deals with HME recognition as an input-to-sequence problem [13], where an input can be either an HME image (OffHME) or an online HME (OnHME) pattern. This approach is flexible and achieves good performance because it uses shared global context to recognize HMEs. It is also liberated from manual tuning of each component and combination. However, the approach requires a large amount of training data to improve the generalization of the DNN-based models. Moreover, DNNs also have difficulty extracting the 2D structure of an HME from a 1D LaTeX sequence when 1D LaTeX sequences are used for ground truth [6]. This problem is revealed in low recognition rates, as shown for the CROHME 2014, 2016, and 2019 testing sets, i.e., the expression recognition rates are around 50% [11, 12, 14]. Thus, there is a demand for appropriate 2D structural representations of HMEs for DNNs.

In this work, we propose a relation-based representation instead of the LaTeX-based representation to improve the expression recognition rate of DNN-based OffHME recognizers. We conduct experiments on the CROHME datasets [11, 12, 14] to demonstrate the improvement of state-of-the-art offline HME recognition systems [8, 10] by training them with our proposed relation-based representation. We present the following contributions:

(1) We propose a relation-based representation rather than the LaTeX-based representation for HMEs so that DNNs straightforwardly extract the 2D structure of an HME.

(2) We evaluate the proposed representation with an end-to-end OffHME recognition system [10]. The OffHME recognition model trained with the LaTeX-based representation achieves 51.34%, 49.16%, and 50.21% on the CROHME 2014, 2016, and 2019 testing sets, respectively. The relation-based representation helps the recognition model achieve expression recognition rates of 53.35%, 52.14%, and 53.13% on the CROHME testing sets.



The rest of the paper is organized as follows. Section 2 provides an overview of the state of the art. Section 3 presents our proposed method. Section 4 describes our datasets and experiments and discusses the results. Section 5 draws the conclusion and outlines future work.

2 Related Works

In this section, we review the sequence transcription-based end-to-end approach for OffHME recognition and its problems. Our proposed relation-based representation, combined with the OffHME recognition system proposed in [9], tries to address these problems.

2.1 Sequence Transcription Approach for HME Recognition

In recent years, many researchers have used Convolutional Neural Networks (CNN) and Long Short-Term Memory (LSTM) networks to solve computer vision and related tasks. CNNs and LSTMs have shown their strength in solving automatic segmentation, detection, and recognition [15–17]. The encoder-decoder framework has been successfully applied to HME recognition [8–10]. This approach jointly handles symbol segmentation, symbol recognition, and classification of spatial relations for HME recognition, whereas the general grammar-based approach attempts to solve the three sub-tasks separately and its grammar has to be carefully designed to avoid a lack of generality and extensibility. Many encoder-decoder systems [8–10, 13] show their strength in solving HME recognition. An end-to-end method is composed of many jointly optimized components so that it can achieve a global optimization rather than a local optimization.

An end-to-end HME recognition method consists of two main components, an encoder and a decoder, as shown in Fig. 1, which predicts an output, e.g., a label sequence corresponding to a given input image. The encoder, an LSTM or BLSTM network (for online data) or a CNN (for offline data), accepts an input pattern and encodes the input into high-level features. The decoder, which is an LSTM network or a recurrent neural network (RNN) with gated recurrent units (GRU), constructs a label sequence using the encoded high-level features.

Fig. 1. General network structure of end-to-end methods.


2.2 Representation of MEs in Sequence Transcription Model

LaTeX-Based Sequence Transcription. Generally, an end-to-end OffHME recognition method has two main problems in extracting the 2D structure of an HME. First, it does not efficiently learn the structures of HMEs, as the end-to-end HME recognition model sometimes outputs sequences that disobey the LaTeX grammar [18, 19]. For example, the decoder might generate ungrammatical sequences like x^{2 and x^{2}} for the expression x^{2}. This problem suggests that a 1D LaTeX sequence is not effective in representing the 2D structures of mathematical expressions. In fact, LaTeX needs many symbols (e.g., {, }, _ and ^) to represent the mathematical scope of its sub-expressions, which makes it long and complex. Secondly, a LaTeX sequence has some ambiguities in representing a 2D structure, which causes inconsistency of context. LaTeX uses special symbols (e.g., _ and ^) to represent spatial relations among sub-expressions. The _ symbol represents both the subscript and below relations (e.g., a_{i} and \sum_{i}i). Similarly, the ^ symbol represents both the superscript and upper relations (e.g., a^{i} and \sum_{i=0}^{N}i).

Tree-Based Sequence Transcription. Tree-based representation [19] could deal with these ambiguities since it uses relation attributes. However, the method is inefficient because it produces long sub-tree sequences for complex HMEs.

3 Our Approach

In this section, we propose a relation-based representation for HMEs. Then, we show an OffHME recognition system to evaluate the representation.

3.1 Relation-Based Representation

As mentioned above, a LaTeX sequence represents the 2D structure of an HME as a 1D sequence. In more detail, it uses the pair of symbols { and } to define a mathematical scope for each sub-expression, as shown in Fig. 2. Therefore, a sub-expression contains not only its string but also the relation symbol and the two symbols { and }.

Fig. 2. Example of an expression (\sum _ { i = 0 } ^ { N } i) with its 2D structure: relation symbol, starting symbol, string, and closing symbol of each sub-expression.


Table 1. Examples of LaTeX-based representations and corresponding relation-based representations.

# | LaTeX-based representation        | Relation-based representation                            | No. of reduced symbols
1 | a ^ {2}                           | a \sup 2 \end                                            | 1
2 | a _ {2}                           | a \sub 2 \end                                            | 1
3 | \sqrt {2 x}                       | \sqrt 2 x \end                                           | 1
4 | \lim _ {x \rightarrow 0} x ^ {2}  | \lim \below x \rightarrow 0 \end x \sup 2 \end           | 2
5 | \int _ {2} ^ {3} x d x            | \int \below 2 \end \above 3 \end x d x                   | 2
6 | \sum _ {i = 1} ^ {\infty} a _ {i} | \sum \below i = 1 \end \above \infty \end a \sub i \end  | 3
7 | \frac {a} {b}                     | \frac a \end b \end                                      | 2

We reduce the number of symbols needed to represent each sub-expression by reusing the relation symbol as its starting symbol and using the single symbol \end as the closing symbol of its scope. In this way, we can reduce one symbol for each sub-expression. To solve the problem of ambiguity in the relations of the LaTeX-based representation, we use \sub and \below for the subscript and below relations, respectively, instead of _. Moreover, we use \sup and \above for the superscript and above relations, respectively, instead of ^. The number of symbols does not change, but we can remove the ambiguity.

Table 1 shows some examples of LaTeX-based representations and the corresponding relation-based representations. For every basic expression that has only one sub-expression, such as _{} or ^{}, we convert it to the relation-based representation by changing the relation symbol and using \end as the closing symbol of its scope. For every complex expression that has multiple sub-expressions, such as a sum expression \sum_{}^{} or an integral expression \int_{}^{}, we separate its sub-expression _{}^{} into the two basic sub-expressions _{} and ^{} and then convert them as basic expressions. For every fraction expression \frac{}{}, especially, we dispense with the \above and \below relation symbols, assuming the first sub-expression is above and the second sub-expression is below the fraction bar.

For the relation-based representation, we do not use {, }, _ and ^ but instead use \end, \sub, \below, \sup, and \above. Therefore, the number of symbol classes in the dictionary increases by one. On the other hand, we can reduce one symbol for every sub-expression inside an expression. Eventually, we can reduce a large number of symbols for complex HMEs. It should be noted that the LaTeX-based and relation-based representations can be converted to each other.
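To make the conversion concrete, the following is a minimal sketch (not the authors' code) of a tokenized LaTeX-to-relation converter that covers only the cases listed in Table 1; the helper read_group and the set BELOW_BASES of base symbols whose _ and ^ denote below/above relations are illustrative assumptions.

# Sketch of the LaTeX-to-relation conversion described above (illustrative only).
BELOW_BASES = {r"\sum", r"\lim", r"\int"}  # bases whose _ / ^ mean below / above

def read_group(tokens, i):
    """Return the tokens inside the brace group starting at index i and the
    index just after its closing brace."""
    assert tokens[i] == "{"
    depth, j = 1, i + 1
    while depth:
        depth += {"{": 1, "}": -1}.get(tokens[j], 0)
        j += 1
    return tokens[i + 1:j - 1], j

def to_relation(tokens):
    out, i, last_base = [], 0, None
    while i < len(tokens):
        tok = tokens[i]
        if tok in ("_", "^") and i + 1 < len(tokens) and tokens[i + 1] == "{":
            if tok == "_":
                rel = r"\below" if last_base in BELOW_BASES else r"\sub"
            else:
                rel = r"\above" if last_base in BELOW_BASES else r"\sup"
            body, i = read_group(tokens, i + 1)
            out += [rel] + to_relation(body) + [r"\end"]
        elif tok == r"\sqrt" and i + 1 < len(tokens) and tokens[i + 1] == "{":
            body, i = read_group(tokens, i + 1)
            out += [r"\sqrt"] + to_relation(body) + [r"\end"]
        elif tok == r"\frac":  # \frac{a}{b} -> \frac a \end b \end
            num, i = read_group(tokens, i + 1)
            den, i = read_group(tokens, i)
            out += [r"\frac"] + to_relation(num) + [r"\end"] + to_relation(den) + [r"\end"]
        else:
            out.append(tok)
            last_base = tok
            i += 1
    return out

# Example: \sum_{i=0}^{N} i  ->  \sum \below i = 0 \end \above N \end i
print(" ".join(to_relation([r"\sum", "_", "{", "i", "=", "0", "}",
                            "^", "{", "N", "}", "i"])))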


3.2 End-to-End HME Recognition

The recent state-of-the-art offline HME recognition model, DenseWAP [20], uses DenseNet for the encoder of the Watch, Attend and Parse (WAP) model [8]. We integrate a symbol classifier into the DenseWAP model, which is named Rec-wsl [10]. The Rec-wsl system consists of three principal components: a convolutional neural network-based encoder, a gated recurrent unit-based decoder with an attention mechanism, and an added symbol classifier, as shown in Fig. 3. In the training process, Rec-wsl uses both the decoder and the symbol classifier for training the model. Note that the branch of the symbol classifier is trained by weakly supervised learning. In the evaluation process, Rec-wsl uses only the decoder to generate the output sequence.

In the forward step, the encoder extracts high-level features from an HME image and feeds them to the decoder using the attention mechanism to generate an output, as shown by the blue arrows in Fig. 3. The encoded high-level features are also transferred to the symbol classifier to determine whether each symbol exists in the HME image. In the backward step of the training process, the backpropagation of gradients is computed from both the decoder and symbol classifier outputs. It is then used to update the entire Rec-wsl model, as shown by the red arrows in Fig. 3. Thus, Rec-wsl is optimized through two loss functions, an expression-level loss function of the decoder and a symbol-level loss function of the symbol classifier.

The decoder is trained by minimizing the cross-entropy (CE) loss of a predicted relation-based sequence from an input HME. Each word y_t is predicted word-by-word to generate a complete output sequence (y_1, ..., y_T) for the input HME, and the expression-level loss for each batch of HMEs is obtained as shown in Eq. (1).

Fig. 3. Network structure of Rec-wsl.

loss_{expression} = \sum_{i=1}^{batchsize} \sum_{t=1}^{T} -\log p_{reg}(y_t | HME_i) = \sum_{i=1}^{batchsize} \sum_{t=1}^{T} -\log p(y_t | g_{1:t-1}; y_{1:t-1}; \phi_i)    (1)


where
• t: the t-th predicted time step.
• g_{1:t-1}: ground-truth symbols from the first to the (t-1)-th time step.
• p_{reg}: symbol prediction probability.
• y_{1:t-1}: previously predicted symbols at the t-th time step.
• \phi_i: context of the i-th HME in the batch of HMEs.

The symbol classifier predicts the probability of occurrence of every symbol in the dictionary while minimizing the binary cross-entropy (BCE) loss. For each class c, there are two possibilities, i.e., occurring or not occurring in an input image. For each batch of HMEs, the symbol-level loss is obtained as shown in Eq. (2).

loss_{symbol} = \sum_{i=1}^{batchsize} \sum_{c=1}^{C} BCE_{c,i} = \sum_{i=1}^{batchsize} \sum_{c=1}^{C} -[ t_{c,i} \log y_{c,i} + (1 - t_{c,i}) \log(1 - y_{c,i}) ]    (2)

where
• t_{c,i}: binary target in {0, 1} for the i-th HME, with 0 indicating that class c does not occur in the image and 1 indicating the opposite.
• y_{c,i}: occurrence probability of class c for the i-th HME.
• C: total number of symbol classes in the dictionary.

Rec-wsl uses the combined loss of the symbol-level BCE loss (loss_{symbol}) and the expression-level CE loss (loss_{expression}), as shown in Eq. (3).

loss_{combined} = loss_{expression} + \alpha \cdot loss_{symbol}    (3)

where \alpha is the weight of loss_{symbol}. When \alpha is near zero, the model primarily uses the expression-level loss. In our experiments, we choose \alpha to be 0.5, based on the best value in the validation experiment on the CROHME 2014 validation set.
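As an illustration only (not the authors' code), the combined loss of Eqs. (1)–(3) can be written compactly in PyTorch; the tensor shapes and the padding convention below are assumptions.

import torch.nn.functional as F

def combined_loss(decoder_logits, target_seq, symbol_logits, symbol_targets,
                  pad_id, alpha=0.5):
    """decoder_logits: (B, T, V) decoder scores per time step,
    target_seq: (B, T) ground-truth token ids (pad_id where unused),
    symbol_logits: (B, C) symbol-classifier scores,
    symbol_targets: (B, C) binary occurrence targets."""
    # Expression-level loss (Eq. 1): cross-entropy summed over time steps and batch.
    loss_expression = F.cross_entropy(
        decoder_logits.flatten(0, 1), target_seq.flatten(),
        ignore_index=pad_id, reduction="sum")
    # Symbol-level loss (Eq. 2): binary cross-entropy over all classes and batch.
    loss_symbol = F.binary_cross_entropy_with_logits(
        symbol_logits, symbol_targets.float(), reduction="sum")
    # Combined loss (Eq. 3) with the weight alpha (0.5 in the paper).
    return loss_expression + alpha * loss_symbol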

4 Experiments

In this section, we first briefly describe the datasets used for evaluation and then show the results of the experiments as well as an in-depth analysis of the results. We show the performance of the proposed representation by comparing it with the benchmark trained with the LaTeX-based representation [10]. Then, we compare our OffHME recognition system with other systems on the CROHME 2014, 2016, and 2019 testing sets.


4.1 Datasets

Data samples and evaluation metrics were introduced in CROHME [11, 12, 14]. Revisions of the datasets and metrics from CROHME 2011 to CROHME 2019 were made for more precise evaluation. For testing on the CROHME 2014 and 2016 testing sets, we use the CROHME 2014 training set to train the proposed model and the CROHME 2013 testing set for validation. For testing on the CROHME 2019 testing set, we use the training set and validation set provided by the contest. In our experiments, we use the CROHME tool LgEval [12] to evaluate our proposed model. We mainly use the expression recognition rate (ExpRate) as the metric, which is the percentage of correctly predicted HMEs over all HMEs.

4.2 Ablation Experiments of Relation-Based Method

In this section, we evaluate the proposed method using the CROHME validation tool on the CROHME 2014, 2016, and 2019 testing sets. Table 2 compares the OffHME recognition systems, DenseWAP and Rec-wsl, trained with the LaTeX-based representation and with the proposed relation-based representation (DenseWAP trained with the LaTeX representation is termed LaTeX-based DenseWAP, while that with the relation-based representation is Relation-based DenseWAP, and similarly for Rec-wsl). Compared with the LaTeX-based representation, the relation-based representation substantially improved the ExpRate of both HME recognition systems (DenseWAP and Rec-wsl) by more than 2 points on the CROHME testing sets.

4.3 Detailed Experiment Results on CROHME Datasets

Table 3, Table 4, and Table 5 compare the ExpRate of the proposed method with other offline HME recognition systems on the CROHME 2014, 2016, and 2019 testing sets, respectively. These tables consist of the correct ExpRate (Correct), the ExpRate with a single error excepted (≤1), with up to 2 errors excepted (≤2), and with up to 3 errors excepted (≤3).

Table 2. Comparison of ExpRate (%) on different CROHME testing sets.

System                   | CROHME 2014 | CROHME 2016 | CROHME 2019
LaTeX-based DenseWAP     | 48.28       | 44.90       | 47.29
Relation-based DenseWAP  | 51.42       | 46.99       | 49.71
LaTeX-based Rec-wsl      | 51.34       | 49.16       | 50.21
Relation-based Rec-wsl   | 53.35       | 52.14       | 53.13


Table 3. Comparison of ExpRate (%) on the CROHME 2014 testing set.

System                     | Correct    | ≤1         | ≤2         | ≤3
MyScript [11]              | 62.68      | 72.31      | 75.15      | 76.88
Valencia [11]              | 37.22      | 44.22      | 47.26      | 50.20
Nantes [11]                | 26.06      | 33.87      | 38.54      | 39.96
Tokyo [11]                 | 25.66      | 33.16      | 35.90      | 37.32
WAP [8]                    | 46.55      | 61.16      | 65.21      | 66.13
TAP [23]                   | 46.86      | 61.87      | 65.82      | 66.63
PAL-v2 [9]                 | 48.88      | 64.50      | 69.78      | 73.83
LaTeX-based DenseMSA [19]  | 43.0 ± 1.0 | 57.8 ± 1.4 | 61.9 ± 1.8 | 63.2 ± 1.7
Tree-based DenseMSA [19]   | 49.1 ± 0.9 | 64.2 ± 0.9 | 67.8 ± 1.0 | 68.6 ± 1.6
LaTeX-based Rec-wsl        | 51.34      | 64.50      | 69.78      | 73.12
Relation-based Rec-wsl     | 53.35      | 65.21      | 70.28      | 72.82

Table 4. Comparison of ExpRate (%) on the CROHME 2016 testing set.

System                     | Correct    | ≤1         | ≤2         | ≤3
MyScript [12]              | 67.65      | 75.59      | 79.86      | –
Wiris [12]                 | 49.61      | 60.42      | 64.69      | –
Tokyo [12]                 | 43.94      | 50.91      | 53.70      | –
Sao Paolo [12]             | 33.39      | 43.50      | 49.17      | –
Nantes [12]                | 13.34      | 21.02      | 28.33      | –
WAP [8]                    | 44.55      | 57.10      | 61.55      | 62.34
TAP [18]                   | 41.3       | –          | –          | –
PAL-v2 [9]                 | 49.61      | 64.08      | 70.27      | 73.50
LaTeX-based DenseMSA [19]  | 40.1 ± 0.8 | 54.3 ± 1.0 | 57.8 ± 0.9 | 59.2 ± 0.8
Tree-based DenseMSA [19]   | 48.5 ± 0.9 | 62.3 ± 0.9 | 65.3 ± 0.7 | 65.9 ± 0.6
LaTeX-based Rec-wsl        | 49.16      | 57.63      | 60.33      | 61.63
Relation-based Rec-wsl     | 52.14      | 63.21      | 69.40      | 72.54

The MyScript system was built on the principle that segmentation, recognition, and interpretation have to be handled concurrently at the same level to produce the best candidate. They used a large number of extra training samples of HME patterns. The Valencia system parsed expressions using two-dimensional stochastic context-free grammars. The WAP system [8] used an end-to-end network to learn LaTeX sequences directly from OffHME patterns.


The TAP system [13] used an end-to-end network to learn LaTeX sequences directly from OnHME patterns. PAL-v2 [21] used paired adversarial learning to learn semantic-invariant features. LaTeX-based DenseMSA [20] was the updated WAP model that used multiple DenseNet-based encoders with multiple scales to deal with different symbol sizes. Tree-based DenseMSA [19] was the system that used the DenseMSA model to learn a tree representation instead of the LaTeX-based representation. LaTeX-based DenseMSA and Tree-based DenseMSA used multiple HME recognition models and averaged the recognition rates of the models. LaTeX-based Rec-wsl [10] was the benchmark to compare with our system, where a LaTeX-based recognition system was used.

Table 5. Comparison of ExpRate (%) on the CROHME 2019 testing set.

System                     | Correct    | ≤1         | ≤2         | ≤3
TAP [18]                   | 41.7       | –          | –          | –
LaTeX-based DenseMSA [19]  | 41.7 ± 0.9 | 55.5 ± 0.9 | 59.3 ± 0.5 | 60.7 ± 0.6
Tree-based DenseMSA [19]   | 51.4 ± 1.3 | 66.1 ± 1.4 | 69.1 ± 1.2 | 69.8 ± 1.1
LaTeX-based Rec-wsl        | 50.21      | 62.14      | 66.56      | 68.81
Relation-based Rec-wsl     | 53.13      | 63.89      | 68.47      | 70.89

For the comparison of ExpRate on the CROHME 2014 testing set, our proposed relation-based Rec-wsl system achieved a 2.01-point higher ExpRate (53.35%) than the LaTeX-based Rec-wsl benchmark (51.34%). The relation-based Rec-wsl system also achieved a better ExpRate with 1 and 2 errors excepted than LaTeX-based Rec-wsl. As shown in Table 4, the relation-based Rec-wsl achieves an ExpRate of 52.14% on the CROHME 2016 testing set. This result is competitive with the other participating systems in the contest that did not use extra samples and with the state-of-the-art systems. Note that MyScript used a large number of extra training samples of HME patterns, as mentioned above. The Wiris system won the CROHME 2016 competition using a pre-trained language model based on more than 590,000 formulas from the Wikipedia formula corpus [22]. For the ExpRate on the CROHME 2019 testing set in Table 5, our relation-based Rec-wsl outperforms the state-of-the-art methods for OffHME (Tree-based DenseMSA) and for OnHME (TAP) [18]. However, the ExpRate of our model is still limited compared with other systems in the CROHME 2019 contest. Note that these systems used extra training HME patterns to train their HME recognition models.

4.4 In-depth Analysis

In this section, we make an in-depth analysis of our system to understand it better and to explore directions for improving the recognition rate in the future.


Table 6. Comparison of the total number of symbols and the maximum sequence length in the CROHME datasets.

Dataset                   | Total number of symbols                 | Max sequence length
                          | LaTeX-based | Relation-based(*)         | LaTeX-based | Relation-based(*)
CROHME 2014 training set  | 136,873     | 116,894 (85.40%)          | 96          | 76 (79.17%)
CROHME 2014 testing set   | 15,897      | 13,600 (85.55%)           | 204         | 157 (76.96%)
CROHME 2016 testing set   | 20,505      | 17,280 (84.27%)           | 108         | 88 (81.48%)
CROHME 2019 training set  | 19,947      | 17,056 (85.51%)           | 88          | 73 (82.95%)
CROHME 2019 testing set   | 157,030     | 133,906 (85.27%)          | 96          | 76 (79.17%)

(*) Values of the "Relation-based" columns are in the form of "number of symbols (percentage in comparison with the LaTeX-based representation)".

Table 6 lists the total number of symbols and the number of symbols of the longest sequence in the CROHME datasets. We also show the percentage of reduction in the number of symbols when the proposed relation-based representation is used instead of the LaTeX-based representation. As shown in Table 6, we reduced the number of symbols by 15% to 20% when the relation-based representation is adopted instead of the LaTeX-based representation. We expect the shorter sequences to be easier for the decoder to predict. The relation-based Rec-wsl achieves higher performance than the LaTeX-based Rec-wsl. However, the proposed system still outputs misrecognized candidates that might not follow mathematical grammar. Note that we do not use any mathematical rules in the parsing stage of the HME recognition system. There is room to improve the end-to-end HME recognition system by integrating mathematical rules.

5 Conclusion

In this work, we proposed a relation-based representation for HMEs and used it instead of the LaTeX-based representation. We used an end-to-end OffHME recognition model [10] to evaluate the proposed representation. Our recognition system trained with the relation-based representation improved the expression recognition rates from 51.34%, 49.16%, and 50.21% to 53.35%, 52.14%, and 53.13% on the CROHME 2014, 2016, and 2019 testing sets, respectively, compared with the same system trained with the LaTeX-based representation. The results are also competitive with the state-of-the-art methods that do not use extra data. The proposed representation helps the HME recognition model learn the 2D structures of HMEs better when generating the candidates.


In our future work, we plan to improve the HME recognition system using mathematical grammatical constraints.

Acknowledgement. This work is partially supported by the Grant-in-Aid for Scientific Research (A) 19H01117 and that for Early-Career Scientists 21K17761.

References

1. Anderson, R.H.: Syntax-directed recognition of hand-printed two-dimensional mathematics. In: Interactive Systems for Experimental Applied Mathematics - Proceedings of the Association for Computing Machinery Inc. Symposium, pp. 436–459. ACM (1967)
2. Chang, S.K.: A method for the structural analysis of two-dimensional mathematical expressions. Inf. Sci. 2, 253–272 (1970)
3. Álvaro, F., Sánchez, J.A., Benedí, J.M.: Recognition of on-line handwritten mathematical expressions using 2D stochastic context-free grammars and hidden Markov models. Pattern Recognit. Lett. 35, 58–67 (2014)
4. Belaid, A., Haton, J.P.: A syntactic approach for handwritten mathematical formula recognition. IEEE Trans. Pattern Anal. Mach. Intell. PAMI-6, 105–111 (1984)
5. Garain, U., Chaudhuri, B.B.: Recognition of online handwritten mathematical expressions. IEEE Trans. Syst. Man Cybern. Part B 34, 2366–2376 (2004)
6. Zhang, T., Mouchère, H., Viard-Gaudin, C.: A tree-BLSTM-based recognition system for online handwritten mathematical expressions. Neural Comput. Appl. 32(9), 4689–4708 (2018). https://doi.org/10.1007/s00521-018-3817-2
7. Zhang, T., Mouchère, H., Viard-Gaudin, C.: Online handwritten mathematical expressions recognition by merging multiple 1D interpretations. In: Proceedings of 15th International Conference on Frontiers in Handwriting Recognition, pp. 187–192 (2016)
8. Zhang, J., et al.: Watch, attend and parse: an end-to-end neural network based approach to handwritten mathematical expression recognition. Pattern Recognit. 71, 196–206 (2017)
9. Wu, J.W., Yin, F., Zhang, Y.M., Zhang, X.Y., Liu, C.L.: Image-to-markup generation via paired adversarial learning. In: Proceedings of Joint European Conference on Machine Learning and Knowledge Discovery in Databases, pp. 18–34 (2018)
10. Truong, T., Nguyen, C.T., Phan, K.M., Nakagawa, M.: Improvement of end-to-end offline handwritten mathematical expression recognition by weakly supervised learning. In: Proceedings of 17th International Conference on Frontiers in Handwriting Recognition, pp. 181–186 (2020)
11. Mouchère, H., Viard-Gaudin, C., Zanibbi, R., Garain, U.: ICFHR 2014 competition on recognition of on-line handwritten mathematical expressions. In: Proceedings of 14th International Conference on Frontiers in Handwriting Recognition, pp. 791–796 (2014)
12. Mouchère, H., Viard-Gaudin, C., Zanibbi, R., Garain, U.: ICFHR 2016 competition on recognition of online handwritten mathematical expressions. In: Proceedings of 15th International Conference on Frontiers in Handwriting Recognition, pp. 607–612 (2016)
13. Zhang, J., Du, J., Dai, L.: Track, attend, and parse (TAP): an end-to-end framework for online handwritten mathematical expression recognition. IEEE Trans. Multimed. 21, 221–233 (2019)
14. Mahdavi, M., Zanibbi, R., Mouchère, H.: ICDAR 2019 CROHME + TFD: competition on recognition of handwritten mathematical expressions and typeset formula detection. In: Proceedings of 15th International Conference on Document Analysis and Recognition, pp. 1533–1538 (2019)


15. Shelhamer, E., Long, J., Darrell, T.: Fully convolutional networks for semantic segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 39, 640–651 (2017)
16. Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu, C.-Y., Berg, A.C.: SSD: single shot multibox detector. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9905, pp. 21–37. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46448-0_2
17. Sundermeyer, M., Schlüter, R., Ney, H.: LSTM neural networks for language modeling. In: Proceedings of 13th Annual Conference of the International Speech Communication Association, pp. 194–197 (2012)
18. Zhang, J., Du, J., Yang, Y., Song, Y.-Z., Dai, L.: SRD: a tree structure based decoder for online handwritten mathematical expression recognition. IEEE Trans. Multimed. 1, 1–10 (2020)
19. Zhang, J., Du, J., Yang, Y., Si, Y.S., Lirong, W.: A tree-structured decoder for image-to-markup generation. In: Proceedings of 37th International Conference on Machine Learning, pp. 11076–11085 (2020)
20. Zhang, J., Du, J., Dai, L.: Multi-scale attention with dense encoder for handwritten mathematical expression recognition. In: Proceedings of 24th International Conference on Pattern Recognition, pp. 2245–2250 (2018)
21. Wu, J.-W., Yin, F., Zhang, Y.-M., Zhang, X.-Y., Liu, C.-L.: Handwritten mathematical expression recognition via paired adversarial learning. Int. J. Comput. Vision 128(10–11), 2386–2401 (2020). https://doi.org/10.1007/s11263-020-01291-5
22. Zanibbi, R., Aizawa, A., Kohlhase, M., Ounis, I., Topic, G., Davila, K.: NTCIR-12 MathIR task overview. In: Proceedings of 12th NTCIR Conference, pp. 299–308 (2016)
23. Zhang, J., Du, J., Dai, L.: A GRU-based encoder-decoder approach with attention for online handwritten mathematical expression recognition. In: Proceedings of 14th International Conference on Document Analysis and Recognition, pp. 902–907 (2017)

A Public Ground-Truth Dataset for Handwritten Circuit Diagram Images

Felix Thoma1,2, Johannes Bayer1,2(B), Yakun Li1, and Andreas Dengel1,2

1 Smart Data and Knowledge Services Department, DFKI GmbH, Kaiserslautern, Germany
{felix.thoma,johannes.bayer,yakun.li,andreas.dengel}@dfki.de
2 Computer Science Department, TU Kaiserslautern, Kaiserslautern, Germany

Abstract. The development of digitization methods for line drawings – especially in the area of electrical engineering – relies on the availability of publicly available training and evaluation data. This paper presents such an image set along with annotations. The dataset consists of 1152 images of 144 circuits by 12 drafters and 48 539 annotations. Each of these images depicts an electrical circuit diagram taken by consumer-grade cameras under varying lighting conditions and perspectives. A variety of different pencil types and surface materials has been used. For each image, all individual electrical components are annotated with bounding boxes and one out of 45 class labels. In order to simplify a graph extraction process, different helper symbols like junction points and crossovers are introduced, while texts are annotated as well. The geometric and taxonomic problems arising from this task as well as the classes themselves and statistics of their appearances are stated. The performance of a standard Faster RCNN on the dataset is provided as an object detection baseline.

Keywords: Circuit diagram · Ground truth · Line drawing

1 Introduction

The sketch of a circuit diagram is a common and important step in the digital system design process. Even though the sketch gives the designer a lot of freedom to revise, it is still time-consuming to transform the sketch into a formal digital representation for existing tools to simulate the designed circuit. There are mainly two approaches which have been applied to address the transformation of a sketch into a useful format for existing simulation tools. The first approach allows the user to draw the diagram in an on-line fashion on a screen and tries to recognize the sketch by interpreting pen strokes, as shown in [2,7] and [3]. The second, off-line approach assumes that the diagrams are drawn with a pen on paper, from which an image is captured afterwards. This approach relies on machine learning and computer vision algorithms, as shown in [4,8,9] and [1], to process the images. It is important to collect large and high-quality training data for the machine learning and computer vision approach, especially when deep learning algorithms are applied.


However, most of this literature [1,4,8,9] makes only little of the data it relies on publicly available, and to the best of the knowledge of this paper's authors, there is no publicly available comparable dataset.

Fig. 1. Dataset sample with annotations: (a) full image; (b) detail showing junction, crossover and text annotations.

The dataset described in this paper mainly addresses this issue and is publicly available at https://osf.io/ju9ck/ under the MIT Licence (https://opensource.org/licenses/MIT). Previous datasets described in the literature will be summarized in Sect. 2. Afterwards, characteristics of the hand-drawn circuit diagram images contained in the dataset will be stated in Sect. 3. In the end, Sect. 6 will present basic recognition results using a Faster-RCNN [10] on the presented dataset.

2 Related Work

Recent literature on the off-line approaches mentioned in Sect. 1 mainly focuses on the algorithms to recognize sketches, but the datasets they rely on are usually not publicly accessible. While [1] presented a comprehensive system to recognize hand-drawn sketches and transform the recognition results to Verilog code, there was no description of the concrete number of images and the variety of symbols. In [9], the use of Fourier Descriptors as a feature representation of hand-drawn circuit components and subsequent SVM classification is proposed. As test data, only 10 hand-drawn circuit sketches have been used, but only one circuit sketch was shown as an example. The authors of [4] gave information on the number of nodes and components included in their hand-drawn sketches but without revealing the exact number of sketches. [8] mentioned an evaluation on 100 hand-drawn circuit diagrams. However, the sample quantity is relatively small.

3 Images

Every image of the dataset contains a hand-drawn electrical circuit (see Fig. 1). Each of the 12 drafters was instructed to draw 12 circuits, each of them twice. The drafters (with varying engineering backgrounds) were guided to produce reasonable results. They were supplied with publicly available circuits to reproduce or could create something on their own. For these two versions, the drafters were allowed to alter the circuit geometrically. Each of the drafts was photographed four times, resulting in an overall image count of 1152. Taking multiple images of the same drawing was based on the following considerations:

– The dataset can conveniently be used for style transfer learning
– Since different images of the same drawing contain the same annotation items, they can be used for automatic verification of the labeling process
– The cost of creating a single sample is reduced
– Different real-world captured images are a realistic and rich replacement for artificial data augmentation
– Different samples of the same drawing must be carefully annotated to ensure that the object detection task is well-defined (no ambiguities)
– Enforcing diverse capturing conditions aims to support the use of camera feeds (Fig. 2)

Fig. 2. From each circuit, 8 image samples are created: four images of each of two drawings (D1-P1 to D1-P4 and D2-P1 to D2-P4). The two drawings can differ not only in background and pencil type, but also in their layout.

Semantically identical circuits can be drawn using different syntactic representations. For example, a voltage source can be represented by a single DC symbol, or instead a VCC and a GND symbol can be used. Terminal symbols with textual voltage level descriptions can serve as a third option.

3.1 Drawing Surfaces and Instruments

The depicted surface types include: plain paper of various colors, ruled paper, squared paper, coordinate paper, glass, whiteboard, cardboard and aluminum foil. Pens, pencils, permanent markers and whiteboard markers of different colors have been utilized for drawing. A few drawings contain multiple colors or text that has been highlighted in a dedicated color. Buckling, kinking, bending, spots, transillumination, and paper cracks can be found as distortions on the samples. Some samples have been treated by corrective means, while others have been created using a ruler as a drawing aid rather than free-hand sketching.

3.2 Capturing

All images have been captured by consumer cameras and no artificial image distortion has been applied. Some images have been cropped, and company logos have been obliterated. The images have been taken from different positions and orientations. That way, naturally observable distortions like (motion) blur and light reflections from surfaces as well as ink are captured in the dataset. Despite these disturbances, no major occlusions are present. More precisely, circuit symbols and their connections are recognizable by humans, and the circuits are oriented right side up. Often, circuit drawings are surrounded by common (office environment) objects. The sample image sizes range between 864 by 4 160 and 396 by 4 032 pixels, and the images are stored as JPEG files.

4 Annotations

Bounding box annotations stored in the PascalVOC [5] format have been chosen for symbols and texts as a trade-off between the effort in ground truth generation and the completeness in capturing the available information from the images. The downsides of this decision are the missing rotation information of the symbols as well as the lack of wire layout capture. The latter will be addressed by the auxiliary classes below. The annotations have been created manually using labelImg [12].
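For illustration, a minimal sketch of reading one such PascalVOC annotation file (the tag names follow the standard VOC layout; the example file name is hypothetical):

import xml.etree.ElementTree as ET

def read_voc_boxes(xml_path):
    """Return a list of (class_name, (xmin, ymin, xmax, ymax)) tuples."""
    boxes = []
    root = ET.parse(xml_path).getroot()
    for obj in root.iter("object"):
        name = obj.findtext("name")          # e.g. "junction", "crossover", "text"
        bb = obj.find("bndbox")
        coords = tuple(int(float(bb.findtext(t)))
                       for t in ("xmin", "ymin", "xmax", "ymax"))
        boxes.append((name, coords))
    return boxes

# boxes = read_voc_boxes("circuit_annotation.xml")  # hypothetical file name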

4.1 Auxiliary Classes

In order to address the lack of wire-line annotation and to simplify the extraction of a circuit graph, auxiliary classes are introduced:

– junction represents all connections or kink-points of wire edges. Since wire corners and junctions share the same semantic from the electrical graph point of view, they share the same label.
– crossover represents the optical crossing of a pair of lines that is not physically connected.
– terminal denotes an 'open' wire ending, i.e. circuit input or output lines.


– text encloses a line of text which usually denotes physical quantities or the name of a terminal or a component.

Some symbols have outgoing lines (like potentiometers and capacitors) or corners (like transistors or switches) which are considered part of their definition. Hence, these lines (or line parts) are also included in the bounding boxes, and junction annotations within them are neglected. Likewise, text annotations related to symbols are only omitted if they are part of the symbol definition (like an 'M' in the motor symbol). If they however explain component parts (like integrated circuit pins) or the component itself (like integrated circuit type labels), they are annotated, even if they are located inside the symbol.

Fig. 3. Total count of label instances in the full dataset. Junctions, crossover and text are not considered due to their dominance in the dataset.

4.2 Symbol Classes

The dataset contains both IEEE/ANSI as well as IEC/DIN symbols. Individual circuit drawings are allowed to contain both standards, as well as a mix of analogue and digital symbols. Some drafters also used non-standard-conforming symbols, for example, a speaker that is explicitly drawn with a coil to show its type of construction.


The classes have been chosen to represent electrically connectable components. For example, if there is a full-wave rectifier depicted as four diodes, each of the diodes is annotated individually. Likewise, a relay label is used as an annotation rather than a switch and an inductor (or the two inductors of a transformer), since the components are coupled inductively. However, if the individual parts are separated so that other components are in between, individual labeling is used as a makeshift. To simplify annotation, some of the electrical taxonomy is neglected. For example, all bipolar transistors, MOSFETs and IGBTs are labeled as transistors.

4.3 Geometry

Generally, the bounding box annotations have been chosen to completely capture the symbol of interest, while allowing for a margin that includes the surrounding space. Often, there is an overlap between tightly placed symbols. The task of fully capturing all parts of a symbol was not always well-defined, given hand-drawn straight lines that are slightly wavy and connected by rounded corners.

5 Statistics

The dataset has a total of 48 539 annotations among 45 classes. The junction class occurs 18 221 times, the text class 14 263 times and the crossover class 1 311 times. All other electrical symbol counts are listed in detail in Fig. 3. The frequency of symbol occurrences per circuit is displayed in Fig. 4.

6 Baseline Performance

A suggested division of the dataset into subsets for training (125 circuits), validation (6 circuits) and testing (12 circuits, from the second drafter) is provided, so that images of every circuit (and therefore every drawing) are exclusively assigned to only one of the sets. In order to establish a baseline performance on the dataset, a Faster RCNN [10] with a ResNet152 [6] backbone was applied using the Torchvision [11] implementation, a learning rate of 0.007 and a batch size of 8, yielding an mAP score of 52%.
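The following sketch (an assumption-based illustration, not the authors' training code) shows how such a baseline could be assembled with torchvision using the reported configuration; the FPN backbone helper, the SGD momentum and the training loop are assumptions, and the backbone helper's signature differs slightly between torchvision versions.

import torch
from torchvision.models.detection import FasterRCNN
from torchvision.models.detection.backbone_utils import resnet_fpn_backbone

NUM_CLASSES = 45 + 1  # 45 annotation classes plus background

# ResNet-152 backbone (here with FPN) and a Faster R-CNN head on top.
backbone = resnet_fpn_backbone("resnet152", pretrained=True)
model = FasterRCNN(backbone, num_classes=NUM_CLASSES)
optimizer = torch.optim.SGD(model.parameters(), lr=0.007, momentum=0.9)

def train_one_epoch(model, loader, optimizer, device):
    model.train()
    for images, targets in loader:  # targets: dicts with "boxes" and "labels"
        images = [img.to(device) for img in images]
        targets = [{k: v.to(device) for k, v in t.items()} for t in targets]
        losses = model(images, targets)      # dict of the individual R-CNN losses
        loss = sum(losses.values())
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()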


Fig. 4. (a) Total symbol count per circuit; the histogram has 20 evenly spaced bins. (b) Distribution of the symbol count per circuit that the symbol is part of. Junctions, crossover and text are not considered due to their dominance in the dataset.

Acknowledgement. The authors cordially thank Thilo Pütz, Anshu Garg, Marcus Hoffmann, Michael Kussel, Shahroz Malik, Syed Rahman, Mina Karami Zadeh, Muhammad Nabeel Asim and all other drafters who contributed to the dataset. This research was funded by the German Bundesministerium für Bildung und Forschung (Project SensAI, grant no. 01IW20007).

References

1. Abdel-Majeed, M., et al.: Sketic: a machine learning-based digital circuit recognition platform. Turkish J. Electr. Eng. Comput. Sci. 28, 2030–2045 (2020)
2. Alvarado, C., Davis, R.: SketchRead: a multi-domain sketch recognition engine. In: Proceedings of the 17th Annual ACM Symposium on User Interface Software and Technology, UIST 2004, pp. 23–32. ACM, New York, NY, USA (2004). https://doi.org/10.1145/1029632.1029637
3. Alvarado, C., et al.: LogiSketch: a free-sketch digital circuit design and simulation system. In: Hammond, T., Valentine, S., Adler, A., Payton, M. (eds.) The Impact of Pen and Touch Technology on Education. HIS, pp. 83–90. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-15594-4_8
4. Edwards, B., Chandran, V.: Machine recognition of hand-drawn circuit diagrams. In: 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing, Proceedings (Cat. No. 00CH37100), vol. 6, pp. 3618–3621 (2000)
5. Everingham, M., et al.: The pascal visual object classes (VOC) challenge. Int. J. Comput. Vis. 88(2), 303–338 (2010). https://doi.org/10.1007/s11263-009-0275-4
6. He, K., et al.: Deep residual learning for image recognition. CoRR, abs/1512.03385 (2015). http://arxiv.org/abs/1512.03385


7. Liwicki, M., Knipping, L.: Recognizing and simulating sketched logic circuits. In: Khosla, R., Howlett, R.J., Jain, L.C. (eds.) KES 2005, Part III. LNCS (LNAI), vol. 3683, pp. 588–594. Springer, Heidelberg (2005). https://doi.org/10.1007/11553939_84
8. Moetesum, M., et al.: Segmentation and recognition of electronic components in hand-drawn circuit diagrams. EAI Endorsed Trans. Scalable Inf. Syst. 5, e12 (2018)
9. Patare, M.D., Joshi, M.: Hand-drawn digital logic circuit component recognition using SVM. Int. J. Comput. Appl. 143, 24–28 (2016)
10. Ren, S., et al.: Faster R-CNN: towards real-time object detection with region proposal networks. CoRR, abs/1506.01497 (2015). http://arxiv.org/abs/1506.01497
11. Torchvision. https://github.com/pytorch/vision
12. Tzutalin: LabelImg. https://github.com/tzutalin/labelImg

A Self-supervised Inverse Graphics Approach for Sketch Parametrization

Albert Suso(B), Pau Riba, Oriol Ramos Terrades, and Josep Lladós

Computer Vision Center and Computer Science Department, Universitat Autònoma de Barcelona, Catalunya, Spain
[email protected], {priba,oriolrt,josep}@cvc.uab.cat

Abstract. The study of neural generative models of handwritten text and human sketches is a hot topic in the computer vision field. The landmark SketchRNN provided a breakthrough by sequentially generating sketches as a sequence of waypoints, and more recent articles have managed to generate fully vector sketches by coding the strokes as Bézier curves. However, previous attempts with this approach all need a ground truth consisting of the sequence of points that make up each stroke, which seriously limits the datasets the model is able to train on. In this work, we present a self-supervised end-to-end inverse graphics approach that learns to embed each image into its best fit of Bézier curves. The self-supervised nature of the training process allows us to train the model on a wider range of datasets, but also to perform better after-training predictions by applying an overfitting process on the input binary image. We report qualitative and quantitative evaluations on the MNIST and the Quick, Draw! datasets.

Keywords: Inverse graphics · Sketch parametrization · Bézier curve · Chamfer distance · Symbol recognition

1 Introduction

The pervasive nature of touchscreen devices has motivated the emergence of the sketch modality for a broad range of tasks. Examples of novel tasks dealing with the sketch modality are sketch classification [35], sketch-based image retrieval [9], sketch-to-image synthesis [6] and sketch-guided object detection [33]. In the abstract nature of sketches lies the main difficulty of these representations. Sketches have been used over centuries or millennia to transmit information, even before written records. By means of a few strokes, humans have been able to create understandable abstractions of the real world. Human beings are able to decipher this condensed information despite the absence of visual cues such as color and texture that are present, for instance, in natural images. In addition, the variability in this modality is extremely large and can range from amateur to expert sketches, or even from realistic to stylistic or artistic representations. All in all, this makes the sketch a natural, yet hard, modality for humans to interact with these devices.


Human sketches consist of a set of strokes that are traditionally represented as raster images or as sequences of 2D points. However, approaches dealing with such representations must deal with important drawbacks. On the one hand, treating sketches as a digitized sequence of 2D points leads to dense and noisy representations. On the other hand, raster images suffer from non-smooth drawings that limit their representational capacity. From a Graphics Recognition perspective, raster-to-vector conversion has been the standard procedure for representing line drawings in a compact way, preserving the original shape invariant to geometric transformations. While vectorial representations have been adopted as a standard in domains like architecture or engineering [10], in sketching they are not so frequent. Recently, BézierSketch [8] proposed to overcome the drawbacks of sequences of 2D points for sketch representation by generating a more compact and smooth representation by means of Bézier curves. In such a case, the large sequence of points is transformed into a few parametrized curves. However, such an approach cannot handle raster representations.

In the present work, we propose an inverse graphics approach able to generate an approximation of a sketch image by means of a few strokes, i.e. Bézier curves. The proposed methodology is able to obtain the desired representation in a self-supervised manner. Moreover, for cases where a high-fidelity representation is required, we propose to employ a single-image overfitting technique. In addition, we present an evaluation on two well-known datasets of hand-drawn sketches and handwritten digits. To summarize, the main contributions of this work are:

– A novel self-supervised inverse graphics approach for mapping raster binary images to a set of parametrized Bézier curves.
– A new probabilistic Chamfer loss that leads to a significantly better training than the original Chamfer distance.
– An overfitting mechanism to obtain high quality representations.

The rest of this paper is organized as follows. Section 2 reviews the relevant works of the state of the art. Section 3 presents the proposed methodology to encode raster images into a set of Bézier curves. Section 4 conducts an extensive evaluation on two datasets, namely on hand-drawn sketches and handwritten symbols. Finally, Sect. 5 draws the conclusions and the future work.

2 Related Work

2.1 Inverse Graphics and Parametrized Curves

Bézier curves and Splines are powerful tools in Computer Graphics that have been widely used in interactive and surface design [3,29]. Traditional optimization algorithms that fit data by Bézier curves and Splines require expensive per-sample alternating optimization [21,26] or iterative inference in expensive generative models [18,27], which make them unsuitable for large-scale or online applications. Recent works have taken the approach of learning a neural network that maps strokes to Bézier curves in a single shot [8], solving the limitations of the previous attempts.


However, all these methods require as input the sequence of 2D points that make up each stroke, which makes them unsuitable for working on image data. Inverse graphics is the line of work that aims to estimate 3D scene parameters from raster images without supervision, by predicting the input of a render program that can reconstruct the image [17,28]. A specialized case of inverse graphics is to estimate the parameters of 2D curves, and recent models have been proposed to tackle this problem. An RNN-based agent named SPIRAL [23] learned to draw characters in terms of brush Bézier curves, but it had the drawback of being extremely costly due to its reliance on Policy Gradient [31] and a black-box renderer. Another recent RNN-based model [8] learned to draw sketches in a self-supervised manner by fitting Bézier curves to strokes. It uses a white-box Bézier curve renderer, but it can only handle vector representations of input sketches. In contrast to the state-of-the-art approaches to curve fitting and 2D inverse graphics, our model is designed to handle raster sketch and handwritten data, learns in a self-supervised manner, and makes single-shot predictions.

2.2 Transformers and Parallel Decoding

Transformers were introduced by Vaswani et al. [34] as an attention-based building block for machine translation tasks. Attention mechanisms [1] aggregate information from an entire input sequence, and the Transformer architecture introduced self-attention layers that update each element by aggregating information from the whole sequence. Transformers perform global computations and have perfect memory, which makes them better than RNNs on long sequences. In recent years, Transformers have been replacing RNNs in many problems in natural language processing, speech processing and computer vision [4,15,22,24,32]. Transformers were first used as auto-regressive models, able to generate one output token at a time similarly to the previous sequence-to-sequence models [30]. However, the time complexity motivated the development of parallel sequence generation in the domains of machine translation [11], word representation learning [15], speech recognition [5] and object detection [4]. We also combine transformers and parallel decoding for their suitable trade-off between computational cost and the ability to perform the global computations required for sketch embedding.

3 Methodology

3.1 Mathematical Background

A main step when training inverse graphics models is rendering the estimated curve on a canvas. As explained in Sect. 2.1, this rendering is done through parametric curves like Bézier curves or Splines. In this section, we introduce the main mathematical concepts required to understand the proposed architecture, described in Sect. 3.3.


Bézier curves are parametric curves defined by a sequence of M control points CP = (P_m)_{m=0}^{M-1}. A Bézier curve B with M control points CP is formally defined as:

B(t; CP) = \sum_{m=0}^{M-1} B_{m,M-1}(t) \cdot P_m    (1)

where t \in [0, 1] is the parameter of the curve and B_{m,M-1}(t) = \binom{M-1}{m} t^m (1-t)^{M-1-m}. The curve starts at P_0 and ends at P_{M-1}, whereas the control points (P_1, \ldots, P_{M-2}) control its trajectory, as shown in Fig. 1.

Fig. 1. Examples of Bézier curves with 3 and 4 control points, respectively.
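As a small illustration of Eq. (1), the following sketch (an assumed helper, not the authors' code) samples points on a Bézier curve from its control points using the Bernstein basis:

import numpy as np
from scipy.special import comb

def bezier_points(control_points, num_samples=100):
    """control_points: (M, 2) array; returns (num_samples, 2) points B(t; CP)."""
    M = control_points.shape[0]
    t = np.linspace(0.0, 1.0, num_samples)[:, None]          # (T, 1)
    m = np.arange(M)[None, :]                                 # (1, M)
    # Bernstein basis B_{m,M-1}(t) = C(M-1, m) t^m (1 - t)^(M-1-m)
    basis = comb(M - 1, m) * t**m * (1.0 - t)**(M - 1 - m)    # (T, M)
    return basis @ control_points                             # (T, 2)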

Probabilistic Bézier curves are an extension of Bézier curves in which the control points are replaced by a set of distributions driving their position [14]. Their definition is similar to that of the usual Bézier curves: given the M control points CP = (P_i)_{i=0}^{M-1} of a Bézier curve, its probabilistic version defines M normal distributions, each centered at a control point P_i with covariance matrix \Sigma_i. Therefore, each point B(t; CP) of the probabilistic Bézier curve follows a normal distribution:

B(t; CP) \sim N(\mu(t), \Sigma(t))    (2)

with

\mu(t) = \sum_{i=0}^{M-1} B_{i,M-1}(t) P_i   and   \Sigma(t) = \sum_{i=0}^{M-1} B_{i,M-1}(t)^2 \Sigma_i    (3)

Basically, the probabilistic Bézier curve is a stochastic process with a 2D Gaussian density function f_t(x, y; CP) for t \in [0, 1]. Using this probabilistic construction, we can compute a differentiable probability map, shown in Fig. 2, given by:

pmap(x, y; CP) = \max_{t \in [0,1]} f_t(x, y; CP)    (4)
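A possible implementation sketch of the probability map of Eq. (4) (not the authors' code; it assumes isotropic covariances Σ_i = σ²I and a uniform sampling of t):

import numpy as np
from scipy.special import comb

def probability_map(control_points, size=64, sigma=1.5, steps=50):
    """control_points: (M, 2) means of the probabilistic control points."""
    M = control_points.shape[0]
    t = np.linspace(0.0, 1.0, steps)[:, None]                 # (steps, 1)
    m = np.arange(M)[None, :]                                  # (1, M)
    basis = comb(M - 1, m) * t**m * (1.0 - t)**(M - 1 - m)     # (steps, M)
    mu = basis @ control_points                                 # mu(t): (steps, 2)
    var = (basis**2).sum(axis=1) * sigma**2                    # Sigma(t) when Sigma_i = sigma^2 I
    ys, xs = np.mgrid[0:size, 0:size]
    grid = np.stack([xs, ys], axis=-1).astype(float)           # (size, size, 2)
    d2 = ((grid[None] - mu[:, None, None, :])**2).sum(-1)      # (steps, size, size)
    dens = np.exp(-0.5 * d2 / var[:, None, None]) / (2 * np.pi * var[:, None, None])
    return dens.max(axis=0)                                     # per-pixel max over t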


Fig. 2. Image of a Bézier curve B and three probability maps generated with the control points P_i of B as the means μ_i of the probabilistic control points, for different values of σ in the covariance matrix Σ = σI, shared by all the probabilistic control points.

Chamfer distance is a measure of dissimilarity between two finite sets of points S_1, S_2 [7,20] and is given by:

d_{CD}(S_1, S_2) = \frac{1}{|S_1|} \sum_{p_i \in S_1} \min_{p_j \in S_2} ||p_i - p_j||_2 + \frac{1}{|S_2|} \sum_{p_j \in S_2} \min_{p_i \in S_1} ||p_j - p_i||_2    (5)

Traditionally, the Chamfer distance has been used in computer vision with binary images to compare shape contours, line-based drawings and sketches. In this case, the Chamfer distance is computed as:

d_{CD}(I_1, I_2) = \frac{\langle I_1, dmap(I_2) \rangle}{||I_1||_1} + \frac{\langle dmap(I_1), I_2 \rangle}{||I_2||_1}    (6)

where dmap(·) is a distance transform operator applied to binary images. Thus, each pixel of the resulting image after applying the dmap(I) operator has a value equal to the Euclidean distance between this pixel and the nearest non-zero pixel in I, see Fig. 3.
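A small sketch (an assumed helper, not the paper's code) of Eq. (6) using SciPy's Euclidean distance transform as dmap(·):

import numpy as np
from scipy.ndimage import distance_transform_edt

def image_chamfer_distance(i1, i2):
    """i1, i2: binary images whose nonzero pixels belong to the drawing."""
    i1 = (np.asarray(i1) > 0).astype(np.float64)
    i2 = (np.asarray(i2) > 0).astype(np.float64)
    # dmap(I): per-pixel Euclidean distance to the nearest nonzero pixel of I.
    dmap1 = distance_transform_edt(1.0 - i1)
    dmap2 = distance_transform_edt(1.0 - i2)
    return float((i1 * dmap2).sum() / i1.sum() + (dmap1 * i2).sum() / i2.sum())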

3.2 Problem Formulation

Let X be a hand-drawn symbol dataset containing binary symbol images. Let us define Y_{N,M} as the space of all possible combinations of N Bézier curves of M control points. Note that N, M \in \mathbb{N} are fixed and application dependent. Then, given a binary image x_i \in X as input, the proposed model f(·) has the objective of yielding an approximation \hat{y}_i \in Y_{N,M} of the image x_i. Thus, given any renderer r(·) and a measure of similarity between two images s(·, ·), the problem is formally defined as:

\theta = \arg\max_{\theta} s(r(\hat{y}_i), x_i) = \arg\max_{\theta} s(r(f(x_i; \theta)), x_i)    (7)

where \theta are the learnable parameters of the model f(·). Note that standard renderers r(·) are not differentiable and, therefore, this objective cannot be directly optimized by backpropagation.


Fig. 3. Example of an image and its distance map.

The proposed framework is divided into two main components. The first is the model f(·), which transforms the input image into the corresponding set of Bézier curves. The second component is the loss function, which makes a self-supervised learning process possible and provides the embedding with the desired properties.

3.3 Model Architecture

The proposed model learns a function f(·) that, given a binary image x_i \in X as input, yields its best-fit approximation \hat{y}_i \in Y_{N,M} in terms of a set of N Bézier curves of M control points. The full model can be seen in Fig. 4. It has three main components: a convolutional (CNN) backbone ψ(·) that extracts image features, an encoder-decoder transformer, and a final fully-connected layer that outputs the N × M control points corresponding to the desired N Bézier curves.

Fig. 4. Overview of the proposed model. In the output example, the model has predicted 3 Bézier curves of 3 control points. Each control point appears in the image drawn as a star.

CNN Backbone. We use a 6-block ResNet [13] that transforms the initial image x \in \{0, 1\}^{1×64×64} into a feature map f \in \mathbb{R}^{512×2×2}, from which we extract 4 feature vectors of size 512.


Transformer Encoder. We use a 6-layer transformer encoder following the standard implementation described in [34], except that it does not have positional encoding. The transformer encoder receives as input the sequence of 4 feature vectors generated by the CNN backbone.

Transformer Decoder. The decoder consists of a 6-layer transformer decoder that maps N input embeddings of size 512 into N output embeddings of size 512. The two differences with the original work [34] are that we do not use positional encoding and that we decode the N objects in parallel instead of sequentially in an autoregressive manner. The N input embeddings, namely object queries, are learnt following the work of Carion et al. [4].

Linear Layer. The final prediction is computed by a simple fully-connected layer that outputs the N × M control points corresponding to the desired N Bézier curves. Thus, each of the N curves is represented by M control points of 2 real numbers. The control points are passed through a sigmoid activation and finally multiplied by the size of the image. In this way, the proposed model is independent of the input image size.
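A condensed PyTorch sketch of this architecture follows; the hyper-parameters and the stand-in convolutional backbone are assumptions, not the authors' implementation.

import torch
import torch.nn as nn

class Img2Bezier(nn.Module):
    def __init__(self, n_curves=8, n_ctrl=4, d_model=512, img_size=64):
        super().__init__()
        self.img_size, self.n_curves, self.n_ctrl = img_size, n_curves, n_ctrl
        # Stand-in CNN backbone mapping (B, 1, 64, 64) -> (B, d_model, 2, 2).
        self.backbone = nn.Sequential(
            nn.Conv2d(1, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(128, 256, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(256, d_model, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(2))
        self.encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model, nhead=8), num_layers=6)
        self.decoder = nn.TransformerDecoder(
            nn.TransformerDecoderLayer(d_model, nhead=8), num_layers=6)
        self.queries = nn.Parameter(torch.randn(n_curves, d_model))  # learned object queries
        self.head = nn.Linear(d_model, n_ctrl * 2)

    def forward(self, x):                                    # x: (B, 1, 64, 64)
        B = x.size(0)
        feat = self.backbone(x).flatten(2).permute(2, 0, 1)  # 4 tokens: (4, B, d_model)
        memory = self.encoder(feat)
        tgt = self.queries.unsqueeze(1).expand(-1, B, -1)    # (N, B, d_model)
        out = self.decoder(tgt, memory)                      # parallel decoding of N queries
        ctrl = torch.sigmoid(self.head(out)) * self.img_size # scale to image coordinates
        return ctrl.permute(1, 0, 2).reshape(B, self.n_curves, self.n_ctrl, 2)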

3.4 Objective Functions

This section describes the two proposed loss functions, both allowing a self-supervised training of the proposed model. The first loss is simply an implementation of the Chamfer distance between the input and the prediction, and the second loss is a probabilistic approximation of the Chamfer distance. Note that the proposed losses are independent and the training process is regulated by one of them.

Chamfer Loss. Let x \in X be the input image and S be the set of points of the sketches contained in x. Let S_p be the sequence of points sampled from the N Bézier curves defined by the predicted control points \hat{y} = f(x). We apply Eq. (1) to \hat{y}, where t takes values between 0 and 1 with a step of 1/\tau, to obtain S_p = \{p_i\}_{i=1}^{\tau N} with its points p_i being differentiable with respect to \hat{y}. Then, the proposed Chamfer loss is just the Chamfer distance:

L_{CD} = d_{CD}(S, S_p) = \frac{1}{|S_p|} \sum_{p_j \in S_p} \min_{p_i \in S} ||p_j - p_i||_2 + \frac{1}{|S|} \sum_{p_i \in S} \min_{p_j \in S_p} ||p_i - p_j||_2    (8)

Probabilistic Chamfer Loss. Let x \in X be the input image of size K. Let \hat{y} = \{CP_n\}_{n=1}^{N} be the set of Bézier curves predicted by the model and CP_n = (P_{nm})_{m=0}^{M-1} be the set of control points of the n-th predicted Bézier curve, as defined in Sect. 3.1. Let S_p be the sequence of points of \hat{y} computed as for the first loss. Then, our probabilistic version of the Chamfer distance is given by

L_{PCD} = \frac{\langle \hat{x}, dmap(x) \rangle}{||\hat{x}||_1} + \frac{1}{|S|} \sum_{p_i \in S} \min_{p_j \in S_p} ||p_i - p_j||_2    (9)

where

\hat{x}[k, l] = \max_{CP \in \hat{y}} pmap(k, l; CP), \quad \forall k, l = 0, \ldots, K-1    (10)

We highlight that \hat{x} is an image of the same size as the input image x and that it is differentiable with respect to the probabilistic Bézier curve parameters (\hat{y} and \Sigma). Also, the points p_j \in S_p are differentiable with respect to the predicted control points \hat{y}.
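For reference, the point-based Chamfer loss of Eq. (8) can be written in a few lines of PyTorch (assumed shapes, not the authors' implementation):

import torch

def chamfer_loss(target_pts, pred_pts):
    """target_pts: (|S|, 2) points of the input sketch,
    pred_pts: (|Sp|, 2) differentiable points sampled from the predicted curves."""
    d = torch.cdist(target_pts, pred_pts)          # pairwise distances, (|S|, |Sp|)
    return d.min(dim=1).values.mean() + d.min(dim=0).values.mean()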

4 Experimental Evaluation

4.1 Dataset and Implementation Details

MNIST Dataset [19]: The well known MNIST dataset contains 60000 28 × 28 images of handwritten thick digits. Our inverse graphics framework is only able to produce curves without width, and therefore we have applied a preprocessing step mainly consisting in a skeletonization of the digits. We have also placed the 28 × 28 digits in a random position into 64 × 64 images. Quick, Draw! Dataset [12]: Quick, Draw! is a large sketch dataset collected thanks to an online drawing game in which players had to draw an object from a given category within a time-limit. It consists in 50 million drawings across 345 categories, but we have only trained the model in the categories apple, axe, banana and baseball bat. Each draw comes as a sequence of coordinates with different scales, we have resized all the sequences to fit in a 64 × 64 image and we draw them by joining the consecutive coordinates with lines. Implementation Details: We train our model with the Adam [16] optimizer with an initial learning rate of 5·10−5 that is decreased by a factor of 10−0.5 if the loss is not reduced during 8 epochs. We use a batch size of 64. The training takes 120 epochs into MNIST, and 200 epochs into Quick, Draw!. The whole framework was implemented with PyTorch [25] deep learning tool. The code is available at https://github.com/AlbertSuso/InverseGraphicsSketchParametrization grecPaper. 4.2

4.2 Model Evaluation

In this section, we provide a quantitative and qualitative evaluation of our model. In addition, we carefully compare the two proposed losses as well as the parameters that define the number of Bézier curves per image and the number of control points per curve.

Fig. 5. Chamfer distance validation metric obtained with models that predict different numbers of Bézier curves per image and control points per curve, trained using the Chamfer Loss and the Probabilistic Chamfer Loss. The models have been trained over the classes apple, axe, banana and baseball bat of the Quick, Draw! dataset (a) and over the MNIST dataset (b).

Figure 5 shows the average Chamfer distance that different models have obtained over the validation set. The models differ in the number of Bézier curves they predict for each input image, in the number of control points per curve, and in the loss function they were trained on. As we can see, the Probabilistic Chamfer Loss significantly outperforms the original Chamfer Loss in nearly all the experiments in both datasets. Our hypothesis is that this happens because, as shown in Fig. 6, the probability map provides information in a neighborhood of the Bézier curve, and this helps to smooth the loss surface. Looking at the results, it is clear that, as expected due to its greater complexity, the Quick, Draw! dataset has proven to be more challenging than MNIST.

Fig. 6. Image of a Bézier curve, its probability map with Σ = σI and a binarization of the probability map showing its nonzero pixels.


Fig. 7. Images of the (a) Quick, Draw! and (b) MNIST datasets (second row) and their predictions by a model trained with the Probabilistic Chamfer Loss (first row) and by a model trained with the original Chamfer Loss (third row). The model trained on MNIST predicts 3 Bézier curves of 3 control points each (per image), whereas the model trained on Quick, Draw! predicts 8 Bézier curves of 4 control points each.

Figure 7 shows a qualitative evaluation of the model and both losses over the Quick, Draw! and MNIST datasets. As can be seen, while the results obtained with the Chamfer Loss are not bad, the results obtained with the Probabilistic Chamfer Loss are far better from the human point of view. It is also interesting that, especially in the Quick, Draw! dataset, the predicted images are smoother than the original ones and sometimes even add details that enhance the visualization from the human point of view. We highlight that, although the model outputs a fixed number of Bézier curves, it is able to adapt to simpler input images. When the input image is simple and the required number of curves is lower than N, the model produces repeated output curves and therefore implicitly decreases the number of predicted Bézier curves.

4.3 Zero-Shot Evaluation

Despite presenting a self-supervised approach which does not require any class information, the set of images used for training might affect the final result. Therefore, we present a study on the generalization properties of our model to


verify that it does, indeed, learn a generic model able to obtain the proper Bézier representation. Thus, we propose a zero-shot evaluation where images belonging to unseen classes have been used. This experimental setting has been applied to both datasets. The results in Fig. 10 (see the before-overfitting bars in blue) show that the model generalizes reasonably well to unseen classes. Note that, even though the model has never seen these classes before, it is able to obtain a fairly good approximation. Note also that some of the selected classes do require a higher number of curves for a perfect fit and, therefore, the model is not able to obtain an accurate solution for them. However, we conclude that our model does not overfit to any specific category but still cannot deal with highly detailed sketches such as “sweater”.

4.4 Towards High Fidelity Approximations via Overfitting

The self-supervised nature of the training process allows us to obtain highly accurate predictions by applying an overfitting process over the image being considered. Basically, the process consists of performing between 50 and 250 training steps over the same input image to obtain the set of curves whose Chamfer distance with respect to the original image is the lowest. Figure 8 shows an example of this overfitting process. There, we can observe that, by increasing the number of steps, our model is able to recover fine-grained details that were missed by a single inference pass.

Fig. 8. Example of an overfitting process. The top-left image is the original image, the Step 0 image is the one predicted without overfitting and the other images have been obtained during the process and are arranged clockwise.
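A minimal sketch of this per-image refinement is given below. It assumes a trained model and a differentiable loss function (for instance, a wrapper around the Chamfer loss above that first samples the predicted curves); the function names, the snapshot strategy and the learning rate are illustrative choices of ours.

```python
import copy
import torch

def overfit_single_image(model, image, sketch_points, loss_fn,
                         steps=250, lr=5e-5):
    """Refine a trained model on one input image and keep the best prediction.

    image         : (1, C, H, W) tensor with the sketch to vectorize.
    sketch_points : (|S|, 2) tensor with the points of the input sketch.
    loss_fn       : maps (predicted control points, sketch points) to a scalar.
    """
    model = copy.deepcopy(model)          # do not disturb the original weights
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    best_loss, best_control_points = float("inf"), None

    for _ in range(steps):                # 50-250 steps, as reported in the text
        optimizer.zero_grad()
        control_points = model(image)     # predicted Bezier control points
        loss = loss_fn(control_points, sketch_points)
        loss.backward()
        optimizer.step()
        if loss.item() < best_loss:       # keep the set of curves with the
            best_loss = loss.item()       # lowest Chamfer distance seen so far
            best_control_points = control_points.detach().clone()
    return best_control_points
```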

The overfitting technique, while not the principal objective of the paper, opens new possibilities for obtaining a high-resolution and high-fidelity representation of sketches in a vectorized format.

Fig. 9. Comparison of the average Chamfer distance between our best models for the MNIST and Quick, Draw! datasets before and after overfitting.

Fig. 10. Chamfer distance comparison for the (a) MNIST and (b) Quick, Draw! datasets in a zero-shot setting. The model has been trained over a training set consisting of the classes two, three, four, six and eight for MNIST and apple, axe, banana and baseball bat for Quick, Draw!. In addition, the model obtained a Chamfer evaluation metric of 0.25 and 0.38 over the training classes before overfitting for each dataset, respectively.


Figure 9 shows the improvement of the trained model before and after these additional training steps for both datasets. In particular, we considered the best model from Fig. 5. Figure 10 provides a zero-shot comparison of the same process. After only a few training iterations, there is a large improvement in the average Chamfer distance for each class. Note that even hard examples that would ideally require more curves, such as sweater, are able to decrease their distance by a large margin.

5 Conclusion

In this work, we have proposed a study on an inverse graphics problem for handwritten digit and sketch images. Even though previous works have studied such abstract representations for problems such as classification, retrieval, or generation, the online nature of these images has often been neglected. A relevant exception is the work of Bhunia et al. [2], which exploits this feature to self-supervise a sketch encoder. However, their proposed vectorization generates long and noisy point-wise data. In comparison, our system is able to generate compact representations in terms of Bézier curves. Our model has demonstrated its potential for generating approximations of a given sketch using a fixed number of Bézier curves without requiring any ground-truth information. In addition, if a highly accurate approximation is required, an overfitting strategy can be adopted.

The research developed in this work opens some interesting research lines. In comparison to the aforementioned work [2], the obtained model is able to develop a deeper understanding of the input image to generate the required set of Bézier curves. Therefore, we will explore the potential of this work for pretraining a feature extractor in problems such as classification and retrieval.

Acknowledgment. This work has been partially supported by the Spanish projects RTI2018-095645-B-C21 and FCT-19-15244, the Catalan project 2017-SGR-1783, and the CERCA Program/Generalitat de Catalunya.

References

1. Bahdanau, D., Cho, K.H., Bengio, Y.: Neural machine translation by jointly learning to align and translate. In: International Conference on Learning Representations, ICLR (2015)
2. Bhunia, A.K., Chowdhury, P.N., Yang, Y., Hospedales, T., Xiang, T., Song, Y.Z.: Vectorization and rasterization: self-supervised learning for sketch and handwriting. In: CVPR (2021)
3. de Boor, C.: A Practical Guide to Spline, vol. 27, January 1978. https://doi.org/10.2307/2006241
4. Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020, Part I. LNCS, vol. 12346, pp. 213–229. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58452-8_13


5. Chan, W., Saharia, C., Hinton, G., Norouzi, M., Jaitly, N.: Imputer: sequence modelling via imputation and dynamic programming. In: International Conference on Machine Learning, ICML, pp. 1403–1413 (2020)
6. Chen, W., Hays, J.: SketchyGAN: towards diverse and realistic sketch to image synthesis. In: CVPR (2018)
7. Dantanarayana, L., Dissanayake, G., Ranasinge, R.: C-log: a chamfer distance based algorithm for localisation in occupancy grid-maps. CAAI Trans. Intell. Technol. 1(3), 272–284 (2016)
8. Das, A., Yang, Y., Hospedales, T., Xiang, T., Song, Y.-Z.: BézierSketch: a generative model for scalable vector sketches. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020, Part XXVI. LNCS, vol. 12371, pp. 632–647. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58574-7_38
9. Dey, S., Riba, P., Dutta, A., Lladós, J., Song, Y.Z.: Doodle to search: practical zero-shot sketch-based image retrieval. In: CVPR, pp. 2179–2188 (2019)
10. Egiazarian, V., et al.: Deep vectorization of technical drawings. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020, Part XIII. LNCS, vol. 12358, pp. 582–598. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58601-0_35
11. Gu, J., Bradbury, J., Xiong, C., Li, V.O., Socher, R.: Non-autoregressive neural machine translation. In: International Conference on Learning Representations, ICLR (2018)
12. Ha, D., Eck, D.: A neural representation of sketch drawings. In: International Conference on Learning Representations, ICLR (2018)
13. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: CVPR (2016)
14. Hug, R., Hübner, W., Arens, M.: Introducing probabilistic Bézier curves for n-step sequence prediction. In: AAAI Conf. Artif. Intell., vol. 34, issue 06, pp. 10162–10169 (2020). https://doi.org/10.1609/aaai.v34i06.6576
15. Kenton, J.D., Ming-Wei, C., Toutanova, L.K.: BERT: pre-training of deep bidirectional transformers for language understanding. In: NAACL-HLT, pp. 4171–4186 (2019)
16. Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. In: International Conference on Learning Representations, ICLR (2015)
17. Kulkarni, T.D., Whitney, W.F., Kohli, P., Tenenbaum, J.: Deep convolutional inverse graphics network. In: International Conference on Neural Information Processing Systems (2015)
18. Lake, B.M., Salakhutdinov, R., Tenenbaum, J.B.: Human-level concept learning through probabilistic program induction. Science 350(6266), 1332–1338 (2015)
19. Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proc. IEEE 86(11), 2278–2324 (1998). https://doi.org/10.1109/5.726791
20. Liu, M.Y., Tuzel, O., Veeraraghavan, A., Chellappa, R.: Fast directional chamfer matching, pp. 1696–1703 (2010). https://doi.org/10.1109/CVPR.2010.5539837
21. Liu, Y., Wang, W.: A revisit to least squares orthogonal distance fitting of parametric curves and surfaces. In: Chen, F., Jüttler, B. (eds.) GMP 2008. LNCS, vol. 4975, pp. 384–397. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-79246-8_29
22. Lüscher, C., et al.: RWTH ASR systems for LibriSpeech: hybrid vs attention. In: Proceedings of the Interspeech, pp. 231–235 (2019)
23. Mellor, J.F., et al.: Unsupervised doodling and painting with improved spiral. arXiv preprint arXiv:1910.01007 (2019)


24. Parmar, N., et al.: Image transformer. In: International Conference on Machine Learning, ICML, pp. 4055–4064 (2018)
25. Paszke, A., et al.: PyTorch: an imperative style, high-performance deep learning library. arXiv preprint arXiv:1912.01703 (2019)
26. Plass, M., Stone, M.: Curve-fitting with piecewise parametric cubics. In: Proceedings of the Annual Conference on Computer Graphics and Interactive Techniques, pp. 229–239 (1983)
27. Revow, M., Williams, C., Hinton, G.: Using generative models for handwritten digit recognition. IEEE PAMI 18(6), 592–606 (1996). https://doi.org/10.1109/34.506410
28. Romaszko, L., Williams, C.K., Moreno, P., Kohli, P.: Vision-as-inverse-graphics: obtaining a rich 3D explanation of a scene from a single image. In: IEEE International Conference on Computer Vision, pp. 851–859 (2017)
29. Salomon, D.: Curves and Surfaces for Computer Graphics. Springer-Verlag, New York (2005). https://doi.org/10.1007/0-387-28452-4
30. Sutskever, I., Vinyals, O., Le, Q.V.: Sequence to sequence learning with neural networks. In: ICONIP, pp. 3104–3112 (2014)
31. Sutton, R.S., McAllester, D.A., Singh, S.P., Mansour, Y., et al.: Policy gradient methods for reinforcement learning with function approximation. Adv. Neural Inf. Process. Syst. 99, 1057–1063 (1999)
32. Synnaeve, G., et al.: End-to-end ASR: from supervised to semi-supervised learning with modern architectures. arXiv preprint arXiv:1911.08460 (2019)
33. Tripathi, A., Dani, R.R., Mishra, A., Chakraborty, A.: Sketch-guided object localization in natural images. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020, Part VI. LNCS, vol. 12351, pp. 532–547. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58539-6_32
34. Vaswani, A., et al.: Attention is all you need. In: Advances in Neural Information Processing Systems (2017)
35. Zhang, H., Liu, S., Zhang, C., Ren, W., Wang, R., Cao, X.: SketchNet: sketch classification with web images. In: CVPR (2016)

Border Detection for Seamless Connection of Historical Cadastral Maps

Ladislav Lenc1,2(B), Martin Prantl1,2, Jiří Martínek1,2, and Pavel Král1,2

1 Department of Computer Science and Engineering, Faculty of Applied Sciences, University of West Bohemia, Plzeň, Czech Republic
{llenc,perry,jimar,pkral}@kiv.zcu.cz
2 NTIS - New Technologies for the Information Society, Faculty of Applied Sciences, University of West Bohemia, Plzeň, Czech Republic

Abstract. This paper presents a set of methods for the detection of important features in historical cadastral maps. The goal is to allow a seamless connection of the maps based on such features. The connection is very important so that the maps can be presented online and utilized easily. To the best of our knowledge, this is the first attempt to solve this task fully automatically. Compared to manual annotation, which is very time-consuming, we can significantly reduce the costs and provide comparable or even better results. We concentrate on the detection of cadastre borders and important points lying on them. Neighboring map sheets are connected according to the common border. However, the shape of the border may differ in some subtleties. The differences are caused by the fact that the maps are hand-drawn. We thus aim at detecting a representative set of corresponding points on both sheets that are used for transformation of the maps so that they can be neatly connected. Moreover, the border lines are important for masking the outside of the cadastre area. The tasks are solved using a combination of fully convolutional networks and conservative computer vision techniques. The presented approaches are evaluated on a newly created dataset containing manually annotated ground-truths. The dataset is freely available for research purposes, which is another contribution of this work.

Keywords: Historical document images · Cadastral maps · Fully convolutional networks · FCN · Computer vision

1 Introduction

In recent years, there has been a growing interest in the field of historical document and map processing. Many such materials already exist in digital form (scanned images), and the efforts of current research lie in the automatic processing of such documents (e.g. information retrieval, OCR and full-text search, but also


computer vision tasks such as noise reduction, localization and segmentation of regions of interest, and many others). All these tasks are very important for making the documents easily accessible and usable.

This paper deals with historical cadastral Austro-Hungarian maps from the 19th century. The maps are available as isolated map sheets covering particular cadastre areas. To fully utilize such materials, it is very important to create a continuous seamless map that can be easily presented online. The map sheets are arranged in a rectangular grid [1]. Each cell of the grid can contain one or more map sheets (if the grid lies on the border of two or more cadastre areas). A preliminary step is to place the map sheets into corresponding cells according to the textual content contained in the sheets.

In this work, we concentrate on merging sheets lying at the same position in the grid. The sheets have to be connected according to the common border line present in both sheets. However, the maps are hand-drawn and there might be tiny differences in the position and shape of corresponding border lines. We thus have to transform the sheets so that they can be neatly put together. Two map sheets belonging to the same cell in the grid are shown in Fig. 1.

Our goal is twofold. First, we have to find a set of corresponding points in the two adjacent sheets that allow the transformation. The second goal is to find the border line, which allows us to mask the regions outside the cadastre area.

The presented methods are developed as a part of a robust system for a seamless connection of map sheets which will require minimal manual interaction from the user. This is the first attempt to solve this task fully automatically. We utilize machine learning approaches based on fully convolutional neural networks (FCN) complemented by analytic computer vision approaches. It is important to note that there is potential for using the same system for processing cadastre maps from neighboring countries that were also part of the Austro-Hungarian empire and have the same form (i.e. Slovakia, Hungary, Austria and parts of Ukraine, Serbia or Croatia).

An important contribution is the creation of an annotated dataset which is used for training of the neural networks as well as for evaluation of the methods. This dataset is freely available for research purposes.

2 Related Work

To the best of our knowledge, there are no studies utilizing deep learning for exactly the same task. We thus present relevant papers utilizing neural networks for map analysis and processing, with a particular focus on segmentation methods based on neural networks. A complete survey of the historical cadastral map digitization process is provided by Ignjatić et al. in [2]. The authors try to pave the way for employing deep neural networks in this field, highlighting potential challenges in map digitization.


Fig. 1. Example of map sheets belonging to the same cell; The map areas are complements to each other; The outer area of one sheet is the inner area of the other one.

Convolutional neural networks are used for the analysis of aerial maps by Timilsina et al. in [3]. The objective of this work is to identify the tree coverage of cadastre parcels. Illegal building detection from satellite images is investigated by Ostankovich and Afanasyev in [4]. Their approach integrates various computer vision techniques with the GoogLeNet model and obtains reasonable and acceptable results that are verified against cadastral maps.

A method for cadastre border detection utilizing FCNs was presented by Xia et al. in [5]. They were able to effectively extract cadastral boundaries, especially when a large proportion of the cadastral boundaries is visible. Another example of cadastre border extraction has been explored by Fetai et al. [6]. Nyandwi et al. [7] present a detection and extraction method for visible cadastral boundaries from very high-resolution satellite images. They compared their approach with human analysis and were able to obtain good results on rural parcels. On the other hand, the methods for urban area extraction have significant room for improvement. Kestur et al. [8] presented an approach for road detection from images acquired by sensors carried by an unmanned aerial vehicle (UAV). They obtained promising results with a novel U-shaped FCN (UFCN) model that is able to take inputs of arbitrary size.

A combination of neural networks and mathematical morphology for historical map segmentation was proposed in [9]. The approach first utilizes a CNN-based network for edge detection, followed by morphological processing which helps to close the shapes and extract the final object contours.

To sum up the related work, we can say that variants of CNNs, especially the FCN architecture, are suitable for our task of detection and segmentation of the important map features.

3 Historical Cadastral Maps

This section is intended to facilitate a basic understanding of the maps we work with and the terminology used in the following sections.

The map sheets are arranged in a rectangular grid. The position (cell) of a sheet in the grid is described by the so-called “nomenclature”. Each map sheet contains a map frame which defines the area of the cell that the sheet covers. The resolution of the scanned sheets is circa 8400 × 6850 pixels and the dimension of the map frame is usually around 7750 × 6200 pixels. The area inside the map frame can be divided into the map itself and the blank area outside the map. The map area is delimited by the cadastre border, which is marked as a black line with map symbols (dots, triangles, crosses etc.) determining the type of the border.

Our main concern is finding map sheets from neighboring cadastre areas that belong to the same cell. Such sheets are the most problematic ones for the creation of the seamless map. We focus on the detection of landmarks (“Grenzsteine” in German) that are marked as black/red dots lying on the cadastre border line. These points are important because their positions are physically marked in the terrain and they are present on both map sheets covering a particular cell. Another type of important point is a significant direction change on the border line. We will denote such points as “break points”. These points are important mainly for map sheets containing no landmark positions. Both types of points can be used for transformation and connection of the map sheets. The outside of the border line is usually accompanied by a colored (mostly red) area denoted as “edge-line”. The border line, landmarks and edge-line are illustrated in Fig. 2.

Fig. 2. Detail of a cadastre border with landmarks (red/black symbols) and red edge-line on the outside of the border (Color figure online)

4 Proposed Detection Methods

The scheme of the overall map sheet processing pipeline is depicted in Fig. 3. We assume that the input map sheet is already cropped according to the bounding box given by the map frame. Our task is to find a set of points that can be used for connection with the corresponding (complementing) map sheet. We utilize four separate algorithms, namely the Landmark Detector, Border Line Detector, Break Point Detector and Edge-line Detector, to extract all necessary information. We first search for explicitly drawn landmarks and also identify the border line. If the amount of landmark points is low, or we find no landmarks at all, the border line is further processed by the Break Point Detector, which seeks significant direction changes in the border line. Such points are used together with the landmark points to create a set of Map Border Key-points that are used for map transformation and connection. The Edge-line Detector is useful for removing false positive landmark points. Moreover, the position of the edge-line relative to the border line is crucial for determining the inner and outer area of the map. According to the border line and edge-line, we can construct a map area mask which is subsequently utilized for map area masking.

Fig. 3. Overall scheme of the map sheet processing pipeline

4.1 Landmark Detector

The landmark detection algorithm utilizes an FCN trained to predict landmark positions. We use the architecture proposed by Wick and Puppe [10]. In our preliminary experiments, we compared this architecture with U-Net [11], which is one of the first and most popular FCN architectures, and obtained comparable results. Moreover, U-Net has ten times more parameters and is computationally much more demanding. Because of the large size of the input images and the need to preserve relatively small details, we chose a patch-based approach instead of processing the whole (resized) map sheet at once. The training is performed on rectangular image crops that are extracted randomly from annotated training examples. We only ensure that each extracted patch contains at least one landmark if there are any in the map sheet. See Sect. 5.1 for examples of annotated ground-truths.


In the prediction process, we divide the map sheet into a set of overlapping patches and predict each of them with the network. The prediction result is a composition of the individual patch predictions. It is a black and white image where white points represent the predicted landmarks. We then apply a post-processing step based on connected components analysis, after which we perform a reduction of false positives by removing components smaller than a specified area threshold (50 px). The centres of the remaining components comprise the resulting set of landmark positions. Further filtering of the resulting points is done according to the detected edge-line (see Sect. 4.4): we keep only the points that are sufficiently close to the edge-line (within a distance threshold of 20 px).
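A possible OpenCV realization of this post-processing is sketched below. The 50 px area and 20 px distance thresholds come from the text; the function and variable names are ours, and the edge-line is assumed to be available as a list of points.

```python
import cv2
import numpy as np

def filter_landmark_prediction(pred_mask: np.ndarray,
                               edge_line_points: np.ndarray,
                               min_area: int = 50,
                               max_dist: float = 20.0) -> np.ndarray:
    """Turn a binary FCN prediction into landmark coordinates.

    pred_mask        : uint8 binary image, white pixels = predicted landmarks.
    edge_line_points : (K, 2) array of (x, y) points on the detected edge-line.
    Returns an (L, 2) array of landmark centres.
    """
    # Connected components analysis of the composed prediction mask.
    n, labels, stats, centroids = cv2.connectedComponentsWithStats(
        pred_mask, connectivity=8)
    landmarks = []
    for i in range(1, n):                       # label 0 is the background
        if stats[i, cv2.CC_STAT_AREA] < min_area:
            continue                            # remove small false positives
        centre = centroids[i]                   # (x, y) centre of the component
        # Keep only centres sufficiently close to the detected edge-line.
        dists = np.linalg.norm(edge_line_points - centre, axis=1)
        if dists.min() <= max_dist:
            landmarks.append(centre)
    return np.array(landmarks)
```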

4.2 Border Line Detector

The border line detector uses the same neural network architecture as the landmark detector. The training is also performed using patches randomly extracted from the training images. We take only patches coinciding with the border line. The trained network is used for patch-wise prediction of a map sheet. The final prediction result contains the predicted border line mask.

4.3 Break Point Detector

The input of this algorithm is the detected border line. Our goal is to find significant direction changes on the line. The predicted border line may have varying width and is also disconnected in some cases. We therefore apply morphological closing with a circular structuring element (size 10 pixels) in order to fill small holes in the border. Then we compute the skeleton [12] to get a one-pixel-wide representation. The skeleton is further simplified using an implementation of the Douglas-Peucker algorithm [13]. This way we obtain a line (or several line segments) represented by a series of points. The final step is the reduction of non-significant angles. We inspect all triplets of consecutive points (P1, P2 and P3) and calculate the difference in direction between the vectors (P1, P2) and (P2, P3). If this direction difference α is lower than a specified angle threshold (we use a value of 10 degrees), we discard point P2. Figure 4 illustrates the process of direction difference computation.
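These steps map onto standard image-processing primitives, as in the sketch below (OpenCV and scikit-image). The 10 px structuring element and 10-degree threshold come from the text; the Douglas-Peucker tolerance `epsilon` and the contour-based traversal of the skeleton are our own illustrative choices, not the authors' exact implementation.

```python
import cv2
import numpy as np
from skimage.morphology import skeletonize

def detect_break_points(border_mask: np.ndarray,
                        epsilon: float = 5.0,
                        angle_threshold: float = 10.0) -> list:
    """Find significant direction changes on a predicted border-line mask."""
    # Close small holes in the predicted border with a circular element.
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (10, 10))
    closed = cv2.morphologyEx(border_mask, cv2.MORPH_CLOSE, kernel)

    # One-pixel-wide skeleton of the border line.
    skeleton = skeletonize(closed > 0).astype(np.uint8) * 255

    break_points = []
    contours, _ = cv2.findContours(skeleton, cv2.RETR_LIST, cv2.CHAIN_APPROX_NONE)
    for contour in contours:
        # Douglas-Peucker simplification of each line segment.
        poly = cv2.approxPolyDP(contour, epsilon, closed=False).reshape(-1, 2)
        # Keep only points where the direction changes by at least the threshold.
        for p1, p2, p3 in zip(poly, poly[1:], poly[2:]):
            v1, v2 = p2 - p1, p3 - p2
            cos = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2) + 1e-9)
            angle = np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))
            if angle >= angle_threshold:
                break_points.append(tuple(p2))
    return break_points
```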

4.4 Edge-Line Detector

We use traditional image-processing techniques (edge detection, flood fill, morphology etc.) for this task. We divide the pipeline into the following major steps and describe each of them separately.


Fig. 4. Illustration of direction change at three consecutive border-line points

Edge Detection. The goal of this step is to create a binary edge map (a code sketch of this step is given at the end of this subsection). To partially suppress noise in the input, a median filter (with kernel size 9) is used. Edges are then detected by the Sobel operator [14] for each RGB channel separately. A single-channel version is created by taking the maximal difference among the channels. To create the binary edge map, we use Otsu's thresholding [15].

Components Detection. In this step, we identify areas (components) with similar colors. We first transform the RGB color space into HSV, which is more suitable for this task. Then we detect the initial components by iterating over the image pixel by pixel. For every pixel P of the image that is not already part of any component and is not an edge pixel, we start the flood filling algorithm [16]. A newly processed pixel Q is added to the current component only if it satisfies the following conditions:

1. Q is not an edge pixel – its value in the edge map corresponds to the background;
2. The differences between pixels P and Q in the H and S channels are lower than a specified threshold (the pixels are considered to have a similar color).

The result of the initial components detection can be seen in the left part of Fig. 5. It can be observed that there are many small, noise-like components. We first filter “holes” formed by edge detection by merging them with the neighboring component (if there is just one). Second, we iteratively search for small components that have only one unique surrounding neighbor. This process utilizes a max-heap implementation. For the remaining components, we apply morphological opening, which removes fine details and jagged edges; thus we improve stability and performance. The result of the filtering can be seen in the right part of Fig. 5.

Components Filtering. We further filter the remaining components. We focus on long and narrow components that should represent edge-line candidates. To obtain the width of the components, we use a signed distance field (SDF) [17]. The length of a component is obtained from its skeleton [12]. Components with sufficient length and width (according to specific thresholds) are marked as edge-line candidates.
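The edge-map step referenced above could be implemented as follows. The kernel size of 9 comes from the text; the function name, the Sobel kernel size and the reading of "maximal difference" as the maximal per-channel gradient magnitude are our assumptions.

```python
import cv2
import numpy as np

def binary_edge_map(image_bgr: np.ndarray) -> np.ndarray:
    """Create the binary edge map used for component detection."""
    # Partially suppress noise with a median filter (kernel size 9).
    smoothed = cv2.medianBlur(image_bgr, 9)

    # Sobel gradient magnitude for each color channel separately.
    magnitudes = []
    for channel in cv2.split(smoothed):
        gx = cv2.Sobel(channel, cv2.CV_32F, 1, 0, ksize=3)
        gy = cv2.Sobel(channel, cv2.CV_32F, 0, 1, ksize=3)
        magnitudes.append(cv2.magnitude(gx, gy))

    # Single-channel version: maximal response among the channels.
    combined = np.max(np.stack(magnitudes), axis=0)
    combined = cv2.normalize(combined, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)

    # Binarize with Otsu's thresholding.
    _, edge_map = cv2.threshold(combined, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    return edge_map
```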


Fig. 5. Component detection example. Each component has a different color, the edges are black. Left: Initial components; Right: After filtering

Nonetheless, after the previous step, we still have many false positives (e.g. roads, rivers). To remove them, we exploit the knowledge that there is a border line with map symbols (mostly “dots”) in the immediate vicinity of the edge-line as shown in the middle part of Fig. 6. The “dots” are detected using the V channel of the HSV color space. We apply morphological opening to eliminate thin lines and texts that can cause false detections. Then we apply Harris corner detector [18] followed by another morphological opening for noise removal. The result is depicted in the right part of Fig. 6.

Fig. 6. Left: all detected edge-line candidates; Middle: original image; Right: “dots” on the border line

To filter out false positives, we use the following conditions:

1. The majority of “dots” are placed on one side of the candidate;
2. The distance between neighboring “dots” is in a specific range based on the input images (in our experiments, 20–100 px);


3. The distance between each “dot” and the skeleton of the edge-line is in a specific range (10–40 px).

The final detected edge-lines together with the associated “dots” are presented in Fig. 7.

Fig. 7. The final detected edge-lines and their corresponding “dots”; The central line is a vectorized representation of the edge-line.

5 Experiments

5.1 Dataset

In order to obtain the training data for proper FCN training, we have created a dataset containing 100 map sheets. The annotated ground-truths also allow us to evaluate the presented methods. The dataset is split into training (70 images), validation (10 images) and testing (20 images) parts.

For each of the map sheets, we have manually annotated a set of ground-truth images for landmarks, border line, break points and edge-line. The ground-truths for break points and edge-line were created only for the testing part because they are not used for network training. All ground-truth types are stored as binary images with a black background, and the objects of interest are marked in white. Landmarks are marked as white circles with a diameter corresponding approximately to the real size of the drawn landmarks. Border lines are drawn as a white line with a width similar to that of the original border line. Break point ground-truths were annotated in the same way as landmarks – white dots represent significant direction changes in the border line. Edge-line ground-truths contain a white mask for the red edge-line. The latter two types of ground-truths are used only for evaluation purposes.


All types of ground-truths are shown in Fig. 8. Note that the crop contains just one explicitly drawn landmark symbol, whereas the break-point ground-truth contains 4 points at border line direction changes. The dataset is freely available for research and education purposes at https://corpora.kiv.zcu.cz/map_border/.

Fig. 8. Example of a map sheet fragment with the corresponding ground-truths: (a) original image, (b) landmark GT, (c) break point GT, (d) border line GT, (e) edge-line GT

5.2 Evaluation Criteria

For the evaluation of landmark and break point detectors, we present the standard precision (P ), recall (R) and F1 score (F 1). The edge-line is represented as a set of consecutive points and thus we use the same evaluation criteria. We first count true positives (T P ), false positives (F P ) and false negatives (F N ) as follows. For each ground-truth point, we identify the closest predicted point. If their distance is smaller than a certain distance threshold T the point is predicted correctly (T P ). Otherwise, it is counted as either F P (present only in prediction) or F N (present only in ground-truth). Visualization of the evaluation process is depicted in Fig. 9. It shows a fragment of a processed map sheet with marked T P , F P and F N points. Green circles denote T P points, yellow circles are used for F N and yellow crosses for F P points. The colored circles show the distance threshold T used for computation of the scores.
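This matching procedure can be summarized by the following simplified sketch; it is not the authors' evaluation script, and the exact tie-breaking between multiple nearby points is our own simplification.

```python
import numpy as np

def point_detection_scores(gt: np.ndarray, pred: np.ndarray, threshold: float):
    """Precision, recall and F1 for point detection with a distance threshold T.

    gt, pred : (N, 2) / (M, 2) arrays of ground-truth and predicted points.
    A ground-truth point counts as a TP if its closest prediction lies within
    `threshold`; unmatched ground-truth points are FN, unmatched predictions FP.
    """
    if len(gt) == 0 or len(pred) == 0:
        return 0.0, 0.0, 0.0
    dists = np.linalg.norm(gt[:, None, :] - pred[None, :, :], axis=2)  # (N, M)
    tp = int(np.sum(dists.min(axis=1) <= threshold))
    fn = len(gt) - tp                                 # present only in ground-truth
    fp = int(np.sum(dists.min(axis=0) > threshold))   # present only in prediction
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1
```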



Fig. 9. Visualized T P (green circles), F P (yellow crosses) and F N (yellow circle) points (Color figure online)

5.3 Landmark Detection

The crucial part of the landmark detector is the network trained for landmark prediction. We performed an experiment with different network configurations in order to find suitable parameters. We utilized several sizes of the training patches. The patch size determines the context the network takes into consideration and can thus influence the overall result. We must also consider that larger patches are computationally more demanding. We also varied the number of patches extracted from one training map sheet. The main reason was to find the minimum amount of patches that provides enough variability to learn the features. We use the cross-entropy loss for network training and, as an optimizer, we utilize Adam [19] with the initial learning rate set to 0.001. We also apply early stopping based on the validation loss to prevent overfitting.

Results for different patch sizes and different numbers of training patches extracted from one image are presented in Table 1. The best performing configuration is typeset in bold. We report the scores for two values of the distance threshold T. A smaller threshold causes lower scores; however, it ensures that the predicted points are really close to the ground-truth ones. The ground-truth images contain 33 break points on average. The presented recalls are thus sufficient for our goal (5–10 detected landmarks are usually enough for the transformation and connection). When using T = 5, the average distance of the corresponding points is 1.3 px. Using T = 2, we reduce this value to 0.8 px. We have performed a comparative experiment with landmarks annotated by two human annotators. On three randomly selected map sheets, we obtained an average distance of 1.5 px between


the annotations. We thus can state that the presented approach performs better than a human annotator and the resulting set of detected landmarks is usable for map sheets connection.

Table 1. Evaluation of the landmark detection algorithm with different sizes and numbers of the training patches extracted from one map sheet

Patches       Patch size (w × h)   Threshold 5             Threshold 2
                                   P      R      F1        P      R      F1
10 patches    320 × 240            37.6   34.7   36.1      22.2   19.8   20.9
              640 × 480            65.0   56.8   60.6      31.5   27.9   29.6
              960 × 720            85.5   76.2   80.6      62.2   49.7   55.2
              1280 × 960           86.5   81.3   83.8      66.0   57.0   61.2
25 patches    320 × 240            53.9   37.9   44.5      33.3   21.0   25.8
              640 × 480            74.6   36.4   48.9      54.5   25.1   34.4
              960 × 720            84.5   80.6   82.5      61.7   53.4   57.2
              1280 × 960           82.2   31.7   45.7      63.9   21.1   31.7
50 patches    320 × 240            79.3   76.2   77.7      54.8   48.7   51.6
              640 × 480            85.2   84.7   84.9      62.7   58.3   60.4
              960 × 720            86.0   71.7   78.2      65.2   50.6   57.0
              1280 × 960           84.5   85.3   84.9      61.2   57.6   59.4
75 patches    320 × 240            74.2   90.5   81.5      56.0   64.9   60.1
              640 × 480            78.9   84.7   81.7      58.6   59.7   59.2
              960 × 720            88.3   82.0   85.1      67.3   58.2   62.4
              1280 × 960           88.4   38.7   53.9      71.7   28.0   40.3
100 patches   320 × 240            78.7   89.2   83.6      59.0   59.3   59.1
              640 × 480            76.5   37.7   50.5      53.9   24.6   33.7
              960 × 720            86.5   80.0   83.2      59.6   50.9   54.9
              1280 × 960           87.1   81.6   84.2      65.2   56.1   60.3

5.4 Border Line and Break Points Detection

In this section, we experiment with neural network configurations in the same way as for the landmark detection. We do not evaluate the predicted border line; instead, the evaluation is performed for the detected break points using the same criteria as in the case of landmarks. The settings and hyperparameters of the network are also the same. The results are summarized in Table 2. We report the scores for two values of the distance threshold, namely T = 5 and T = 2. We can observe that the scores are significantly lower compared to the landmark detection. However, a recall exceeding 60% is still sufficient for finding enough points usable for the map sheet connection. It is also important to note that the break point detector is applied only on map sheets containing an insufficient amount of landmarks, which happens only in a minority of cases.

5.5 Edge-Line Detection

To validate the results of the proposed solution, we have compared our detected edge-lines against the ground-truth data.

Table 2. Evaluation of the break point detection algorithm with different sizes and numbers of training patches

Patches       Patch size (w × h)   Threshold 5             Threshold 2
                                   P      R      F1        P      R      F1
10 patches    320 × 240            13.6   33.6   19.3      4.0    9.4    5.6
              640 × 480            10.8   34.7   16.5      3.1    7.2    4.4
              960 × 720            14.9   38.3   21.5      3.6    9.2    5.2
              1280 × 960           25.1   58.7   35.2      7.1    18.1   10.2
25 patches    320 × 240            6.4    42.0   11.1      0.9    6.2    1.6
              640 × 480            19.6   50.8   28.3      6.6    16.5   9.4
              960 × 720            20.7   46.6   28.7      7.3    14.5   9.7
              1280 × 960           34.0   45.9   39.1      7.6    17.0   10.5
50 patches    320 × 240            14.2   54.6   22.6      4.7    16.9   7.4
              640 × 480            26.0   63.7   36.9      7.5    18.3   10.7
              960 × 720            28.3   52.8   36.8      9.5    17.5   12.3
              1280 × 960           32.3   50.2   39.3      8.6    19.3   12.0
75 patches    320 × 240            15.9   50.6   24.1      6.4    20.5   9.8
              640 × 480            20.3   49.2   28.8      6.4    15.8   9.2
              960 × 720            35.1   46.5   40.0      9.7    17.6   12.5
              1280 × 960           37.5   52.6   43.8      6.3    12.8   8.5
100 patches   320 × 240            14.5   57.1   23.2      5.1    17.1   7.9
              640 × 480            30.7   53.9   39.2      4.1    9.0    5.7
              960 × 720            32.6   58.5   41.9      12.9   24.9   17.0
              1280 × 960           33.9   65.5   44.7      8.9    18.4   12.0

To overcome problems with the detected edge-lines, which may contain noise, we have vectorized the edge-lines in both datasets. The lines are represented as sets of equally distant points. Therefore, instead of pixel-based comparisons, we use vector-based calculations that are more robust. The downside is that the points from the ground-truth and the results may not be aligned. For the comparison, we have used the threshold-based procedure defined in Sect. 5.2. We have tested three different threshold values T selected to be near the median distance between points in the ground-truth dataset (the median distance is 27.3 px). We present the results of the edge-line detection in Table 3.

Table 3. Evaluation of edge-line detection algorithm

Threshold   P      R      F1
20          73.6   75.4   74.5
30          80.1   81.5   81.2
40          81.5   86.4   83.9

This table clearly shows that the best results are obtained for the threshold value of 40. This configuration gives a very high F1 score (F1 = 83.9%). This quality of detection is sufficient for our needs: we do not require the detected edge-line to be pixel accurate; our goal is to have the correct orientation with respect to the border line.

5.6 Final Results

In this final experiment, we applied the above-described methods to the whole task of seamless map sheet connection and visualize the results.

Fig. 10. Example of connected maps (left) with the detail of the connection in the complicated areas (right)

Figure 10 shows one example of the resulting connection of two neighboring map sheets. The corresponding detected break points on these sheets are marked by blue cross symbols. Based on a manual analysis of a small sample containing 100 resulting images, we can claim that the proposed approaches are sufficient to be integrated into the final system.

6 Conclusions and Future Work

We have presented a set of methods for the detection of important features in historical cadastral maps. The methods represent the fundamental part of a system for seamless connection of the individual map sheets. We have focused on


finding important points lying on the cadastre border that are usable for the seamless connection of neighboring map sheets. Another task is the detection of the cadastre border, which allows masking of the area outside the map. The presented methods are built upon FCN networks in combination with traditional computer vision techniques. Only the task of edge-line detection is solved using solely traditional computer vision techniques; the reason is the relatively more complicated annotation of ground-truths for this task and also the sufficient results obtained by the presented solution.

The methods are evaluated on a newly created dataset containing ground-truths for all four solved tasks. We have compared several configurations of the algorithms and searched for the best performing ones. We have further visually demonstrated that the best implemented methods have sufficient accuracy to be integrated into the final system.

We have utilised a standard FCN architecture and there is surely room for further improvements using different architectures. We can also concentrate on strategies for network training, e.g. the balance between positive and negative training samples.

Acknowledgement. This work has been partly supported by Grant No. SGS-2019018 Processing of heterogeneous data and its specialized applications.

References

1. Timár, G., Molnár, G., Székely, B., Biszak, S., Varga, J., Jankó, A.: Digitized maps of the Habsburg Empire - the map sheets of the second military survey and their georeferenced version (2006)
2. Ignjatić, J., Nikolić, B., Rikalović, A.: Deep learning for historical cadastral maps digitization: overview, challenges and potential (2018)
3. Timilsina, S., Sharma, S., Aryal, J.: Mapping urban trees within cadastral parcels using an object-based convolutional neural network. ISPRS Ann. Photogram. Remote Sens. Spatial Inf. Sci. 4, 111–117 (2019)
4. Ostankovich, V., Afanasyev, I.: Illegal buildings detection from satellite images using GoogLeNet and cadastral map. In: 2018 International Conference on Intelligent Systems (IS), pp. 616–623. IEEE (2018)
5. Xia, X., Persello, C., Koeva, M.: Deep fully convolutional networks for cadastral boundary detection from UAV images. Remote Sens. 11(14), 1725 (2019)
6. Fetai, B., Oštir, K., Kosmatin Fras, M., Lisec, A.: Extraction of visible boundaries for cadastral mapping based on UAV imagery. Remote Sens. 11(13), 1510 (2019)
7. Nyandwi, E., Koeva, M., Kohli, D., Bennett, R.: Comparing human versus machine-driven cadastral boundary feature extraction. Remote Sens. 11(14), 1662 (2019)
8. Kestur, R., et al.: UFCN: a fully convolutional neural network for road extraction in RGB imagery acquired by remote sensing from an unmanned aerial vehicle. J. Appl. Remote Sens. 12(1), 016020 (2018)
9. Chen, Y., Carlinet, E., Chazalon, J., Mallet, C., Duménieu, B., Perret, J.: Combining deep learning and mathematical morphology for historical map segmentation, arXiv preprint arXiv:2101.02144 (2021)


10. Wick, C., Puppe, F.: Fully convolutional neural networks for page segmentation of historical document images. In: 13th IAPR International Workshop on Document Analysis Systems (DAS), pp. 287–292. IEEE (2018)
11. Ronneberger, O., Fischer, P., Brox, T.: U-Net: convolutional networks for biomedical image segmentation. In: Navab, N., Hornegger, J., Wells, W.M., Frangi, A.F. (eds.) MICCAI 2015. LNCS, vol. 9351, pp. 234–241. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-24574-4_28
12. Zhang, T.Y., Suen, C.Y.: A fast parallel algorithm for thinning digital patterns. Commun. ACM 27(3), 236–239 (1984). https://doi.org/10.1145/357994.358023
13. Douglas, D.H., Peucker, T.K.: Algorithms for the reduction of the number of points required to represent a digitized line or its caricature. Cartographica: Int. J. Geogr. Inf. Geovisualization 10(2), 112–122 (1973)
14. Kanopoulos, N., Vasanthavada, N., Baker, R.L.: Design of an image edge detection filter using the Sobel operator. IEEE J. Solid-State Circuits 23(2), 358–367 (1988)
15. Otsu, N.: A threshold selection method from gray-level histograms. IEEE Trans. Syst. Man Cybern. 9(1), 62–66 (1979)
16. Burtsev, S., Kuzmin, Y.: An efficient flood-filling algorithm. Comput. Graphics 17(5), 549–561 (1993). http://www.sciencedirect.com/science/article/pii/009784939390006U
17. Felzenszwalb, P.F., Huttenlocher, D.P.: Distance transforms of sampled functions. Theory Comput. 8(19), 415–428 (2012). http://www.theoryofcomputing.org/articles/v008a019
18. Harris, C., Stephens, M.: A combined corner and edge detector. In: Proceedings of the 4th Alvey Vision Conference, pp. 147–151 (1988)
19. Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization, arXiv preprint arXiv:1412.6980 (2014)

Data Augmentation for End-to-End Optical Music Recognition

Juan C. López-Gutiérrez1, Jose J. Valero-Mas2, Francisco J. Castellanos2, and Jorge Calvo-Zaragoza2(B)

1 Alicante, Spain
2 Department of Software and Computing Systems, University of Alicante, Alicante, Spain
{jjvalero,fcastellanos,jcalvo}@dlsi.ua.es

Abstract. Optical Music Recognition (OMR) is the research area that studies how to transcribe the content from music documents into a structured digital format. Within this field, techniques based on Deep Learning represent the current state of the art. Nevertheless, their use is constrained by the large amount of labeled data required, which constitutes a relevant issue when dealing with historical manuscripts. This drawback can be palliated by means of Data Augmentation (DA), which encompasses a series of strategies to increase the amount of data without the need of manually labeling new images. This work studies the applicability of specific DA techniques in the context of end-to-end staff-level OMR methods. More precisely, considering two corpora of historical music manuscripts, we applied different types of distortions to the music scores and assessed their contribution in an end-to-end system. Our results show that some transformations are much more appropriate than others, leading to up to a 34.5% relative improvement with respect to the scenario without DA.

Keywords: Optical music recognition · Data augmentation · Deep learning

1 Introduction

Music sources, as one of the cornerstones of human culture, have historically been preserved and transmitted through written documents called music scores [22]. In this sense, there exist archives with a vast amount of compositions which only exist in physical formats or, in some cases, as scanned images [18]. Note that retrieving digital versions of such pieces is of remarkable interest, not only in terms of cultural heritage preservation, but also for tasks such as dissemination, indexing or musicological studies, among many others [11]. For a long time, this digitization process has been carried out in a manual fashion, resulting in tedious and error-prone tasks with limited scalability.

J. C. López-Gutiérrez—Independent Researcher.
This work was supported by the Generalitat Valenciana through grant APOSTD/2020/256, grant ACIF/2019/042, and project GV/2020/030.



Fortunately, the outstanding advances in technology during the last decades have enabled the emergence of an automated version of this transcription process, namely Optical Music Recognition (OMR). OMR represents the field that studies how to computationally read music notation in written documents to store the content in a structured format [4]. While proposals in this field have typically relied on traditional computer vision strategies, adequately designed for a particular notation and engraving mechanism [19], recent developments in machine learning, and more especially in the so-called Deep Learning paradigm [14], have led to a considerable renewal of learning-based approaches which allow more general formulations.

Within this paradigm, end-to-end frameworks operating at the staff level stand for the current state of the art in OMR [2,6]. These schemes map the image of a single staff onto the sequence of music symbols that appear therein. In addition to their effectiveness, their ability to be trained without an explicit alignment between the image and the position of the music symbols stands out. Nevertheless, in the case of historical documents, it is usual to find a lack of labeled data, which could limit the applicability of state-of-the-art methods. In addition, the great amount of manuscripts scattered all over the world awaiting to be digitized hinders their labeling, which is necessary to reliably apply modern Deep Learning strategies.

One of the most common strategies for addressing this deficiency is the so-called Data Augmentation (DA) paradigm [21]. DA is a family of procedures which creates new artificial samples by performing controlled distortions on the initial data. In general, this solution is capable of providing more robust and proficient models than those trained with only the initial unaltered data. The reader may check the work by Journet et al. [12] for a thorough revision of such processes applied to image data.

While DA techniques are quite common in the context of learning-based systems involving images, the existing procedures are typically devised for general-purpose tasks. In this regard, we consider that certain distortions may be more adequate and realistic than others in the OMR context. For instance, while flipping an image is a common DA process, in the case at issue this type of distortion is not realistic. However, contrast changes or simulated bulges do represent alterations typically found when scanning ancient music books. Moreover, focusing on the neural end-to-end models introduced before, their input is a staff image that has been previously cropped from a full page. Therefore, some augmentation processes, such as rotation or padding, should be carried out in the context of the original location on the page and not just applied to the segmented sample. It must be highlighted that, while there exist some precedents to our work, such as the one by Baró et al. [3] in which DA was studied in the context of handwritten music scores, such DA processes have not yet been evaluated on the particular neural end-to-end architectures considered in our case.

For all the above, this work proposes a collection of DA processes specifically adapted for neural staff-level end-to-end OMR systems. Additionally, instead of considering the DA and training stages independently, we perform these two


processes in a joint manner by directly distorting the images used at each iteration of the training procedure of the model. The results obtained when considering several corpora of historical documents prove the validity of our proposal, reaching relative improvement figures of up to 34.5% when compared to the non-augmented scenario. The rest of the work is structured as follows: Sect. 2 presents the proposal of the work; Sect. 3 introduces the experimental set-up used for assessing our methodology; Sect. 4 presents and discusses the results obtained; finally, Sect. 5 concludes the work and indicates some avenues for further research.

2 Methodology

Our proposal builds upon neural end-to-end OMR systems that work at the staff level; i.e., given an image of a single staff, our goal is retrieving the series of symbols that appear therein. It is important to note that the meaning of musical symbols relies on two geometrical pieces of information: shape and height (vertical position of the symbol in the staff), which typically indicate duration and pitch, respectively. In this regard, and following previous work in this field [1,6], each possible combination of shape and height is considered as a unique symbol. Concerning this, our recognition approach can be defined as a sequence labeling task [9], where each input must be associated with a sequence of symbols from a predefined alphabet, but without the need of aligning them with respect to the image.

Formally, let T = {(x_i, z_i) : x_i ∈ X, z_i ∈ Z}_{i=1}^{|T|} represent a set of data, where sample x_i is drawn from the image space X, and z_i = (z_{i1}, z_{i2}, ..., z_{iN_i}) corresponds to its transcript in terms of music-notation symbols. Note that Z = Σ*, where Σ represents the music-symbol vocabulary. The recognition task can be formalized as learning the underlying function g : X → Z.

The rest of the section introduces the neural approach used for approximating function g(·), as well as the different DA techniques to be considered.

2.1 Neural End-to-End Recognition Framework

We consider a Convolutional Recurrent Neural Network (CRNN) to approximate g (·), given its good results in end-to-end OMR approaches [2,6,20]. CRNN constitutes a particular neural architecture formed by an initial block of convolutional layers, which aim at learning the adequate set of features for the task, followed by another group of recurrent stages, which model the temporal dependencies of the elements from the initial feature-learning block [20]. To attain a proper end-to-end scheme, the CRNN is trained using the Connectionist Temporal Classification (CTC) algorithm [10], which allows optimizing the weights of the neural network using unsegmented sequential data. In our case, this means that, for a given staff image xi ∈ X , we only have its associated sequence of characters zi ∈ Z as its expected output, without any correspondence at pixel level or similar input-output alignment. Due to its particular


training procedure, CTC requires the inclusion of an additional "blank" symbol within the vocabulary, i.e., $\Sigma' = \Sigma \cup \{\text{blank}\}$. The output of the CRNN can be seen as a posteriorgram, i.e., the probability of each $\sigma \in \Sigma'$ being located in each frame of the input image. Most commonly, the actual sequential prediction is obtained from this posteriorgram by using a greedy approach, which keeps the most probable symbol per frame, merges repeated symbols in consecutive frames, and eventually removes the blank tokens.
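For illustration, the greedy decoding step can be sketched as follows (a minimal sketch assuming a NumPy posteriorgram of shape frames × |Σ'|; the function name and the convention that the blank index is passed explicitly are our assumptions, not details from the paper):

```python
import numpy as np

def ctc_greedy_decode(posteriorgram: np.ndarray, blank_idx: int) -> list:
    """Greedy CTC decoding: best symbol per frame, collapse repeats, drop blanks."""
    best_per_frame = posteriorgram.argmax(axis=1)            # most probable symbol per frame
    collapsed = [s for i, s in enumerate(best_per_frame)      # merge consecutive repeats
                 if i == 0 or s != best_per_frame[i - 1]]
    return [s for s in collapsed if s != blank_idx]           # remove blank tokens
```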

2.2 Data Augmentation Procedures

Once the neural end-to-end recognition scheme has been introduced, we shall now present the particular DA framework proposed in this work. In contrast to other existing augmentation processes that perform an initial processing of the entire corpus to derive a new (larger) set of data, our proposal is directly embedded in the network pipeline for performing these transformations in an online fashion. Concerning the actual transformations considered in this work, all the alterations are based on possible distortions that a given score image may depict, especially those found in historical documents. The following sections provide a concise description for each of the transformations considered, as well as a graphical example. For the sake of compactness, let us consider Fig. 1 as the reference (undistorted) staff image for the sections below.

Fig. 1. Reference staff image for the different data transformation examples.

Contrast Variation. This first augmentation process alters the images to simulate different capture devices. For instance, depending on the type of lens, the images obtained may show a different color distribution. To simulate this, a contrast enhancement may be applied to the image by means of image quality improvement techniques such as the Contrast Limited Adaptive Histogram Equalization (CLAHE) algorithm [17]. In this sense, we apply this algorithm with a probability of 50% to any given image at each iteration of the training process, leaving the image unaltered otherwise. The tile size parameter is set to 8 × 8. Figure 2 shows an example of the resulting altered image.

Erosion and Dilation Operations. The thickness of the symbols in historical scores varies greatly because they were written with a quill. To emulate this effect, the staff symbols can be made thicker or thinner through erosion and dilation image processing operations, respectively.


Fig. 2. Resulting staff sample after altering Fig. 1 with the ‘Contrast’ augmentation.

To achieve such an effect, at each training iteration we randomly decide whether to apply an erosion, a dilation, or to leave the staff undistorted. If an erosion or a dilation is to be applied, the size of the kernel is drawn randomly and uniformly between 2 and 4. Figures 3a and 3b show the result of applying the erosion and dilation processes to the example staff image, respectively.
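As an illustration, a minimal OpenCV sketch of the contrast and erosion/dilation augmentations described above follows. The 50% probability, the 8 × 8 tile size, and the 2–4 kernel range come from the paper; the helper names and the assumption of grayscale input are ours:

```python
import random
import cv2
import numpy as np

def augment_contrast(img: np.ndarray) -> np.ndarray:
    """Apply CLAHE (tile size 8x8) with probability 0.5; assumes an 8-bit grayscale image."""
    if random.random() < 0.5:
        clahe = cv2.createCLAHE(tileGridSize=(8, 8))
        img = clahe.apply(img)
    return img

def augment_erosion_dilation(img: np.ndarray) -> np.ndarray:
    """Randomly erode, dilate, or leave the staff image untouched."""
    choice = random.choice(["erode", "dilate", "none"])
    if choice == "none":
        return img
    k = random.randint(2, 4)                      # kernel size drawn uniformly in [2, 4]
    kernel = np.ones((k, k), np.uint8)
    op = cv2.erode if choice == "erode" else cv2.dilate
    return op(img, kernel)
```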

Fig. 3. Resulting staff sample after altering Fig. 1 with the ‘Erosion’ and ‘Dilation’ augmentations.

Margin Alteration. The process of localizing each individual staff on a full-page image is highly arbitrary, regardless of whether it is done manually or by computational means. Therefore, both training and test samples will present varying margins around the staff itself. As we shall observe in the results, this has a clear impact on the accuracy of the model. In order to increase the robustness against this phenomenon, we randomly modify the position of the bounding box containing a single staff. This modification is performed by sampling four values from a normal distribution with mean μ = 0 and variance σ² = 10, i.e., N(0, 10), each of which is added to one of the four values defining the two corners of the bounding box. In standard image processing, this alteration would be applied by padding the staff image with a constant value. This is not negative in itself, but in our case we would lose potentially valuable information, since we do not only have the image of the staff but also the image of the complete page where it is located. To preserve this information, the modification described above is applied before cropping the staff from the page.


Figure 4a shows an example of distorting this region using the usual image-processing strategy, while Fig. 4b shows how the context can be preserved with our own process.
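A minimal sketch of this margin alteration, assuming staff bounding boxes given as (x1, y1, x2, y2) page coordinates (the N(0, 10) jitter is from the paper; the function name and coordinate convention are our assumptions):

```python
import numpy as np

def jitter_staff_bbox(page: np.ndarray, bbox, sigma2: float = 10.0) -> np.ndarray:
    """Perturb the staff bounding box with N(0, sigma2) noise, then crop from the full page."""
    noise = np.random.normal(loc=0.0, scale=np.sqrt(sigma2), size=4)   # variance 10 -> std sqrt(10)
    x1, y1, x2, y2 = np.round(np.array(bbox, dtype=float) + noise).astype(int)
    h, w = page.shape[:2]
    x1, x2 = np.clip([x1, x2], 0, w)              # keep the box inside the page
    y1, y2 = np.clip([y1, y2], 0, h)
    return page[y1:y2, x1:x2]
```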

Fig. 4. Resulting staff sample after altering Fig. 1 with the ‘Margin’ augmentation.

Rotation Transformation. Another typical condition in score images is an imperfect alignment with respect to the horizontal axis. This might be caused by the process of capturing the image or by the nature of the engraving mechanism in the physical source. To mimic this effect, at each training iteration a variable-angle rotation is applied to the samples: a random value is chosen uniformly in the range [−3°, 3°] to serve as the rotation angle, so no rotation is also possible. It is worth highlighting that the rotation is applied to the full-page image before cropping the staff. As in the previous augmentation case, the contextual information of the staff location is therefore preserved. This produces images such as the one in Fig. 5b, whereas a generic rotation would produce what is depicted in Fig. 5a.

Wavy Pattern. Another possible phenomenon is the emergence of wavy patterns caused either by the process of engraving or, more commonly, by the scanning, especially when book formats are involved. To emulate this effect, we consider the fish-eye transformation already implemented in the OpenCV library. For each sample in each training iteration, we apply it with a probability of 50%.


Fig. 5. Resulting staff sample after altering Fig. 1 with the ‘Rotation’ augmentation.

We manually tuned the parameters of the distortion for the case at issue. Following the notation in the OpenCV implementation, the vector of distortion coefficients is set to $D = [0, -0.25, 1, 0]$, whereas the camera matrix $K$ is given by

$$K = \begin{pmatrix} w/3 & 0 & w/2 \\ 0 & -w/2 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$

where $w$ denotes the width of the image. Figure 6 shows an example of this process applied to the reference staff considered. Note that the gaps introduced by this distortion are filled with the average pixel value of the sample.
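As an illustration, one plausible realisation of this fisheye-style warp with OpenCV's fisheye module is sketched below using the parameters above; the exact OpenCV call used by the authors is not stated, so the choice of undistortImage, the gap-filling step, and the function name are assumptions:

```python
import cv2
import numpy as np

def wavy_augment(img: np.ndarray, p: float = 0.5) -> np.ndarray:
    """Apply a fisheye-style warp with probability p to simulate wavy/bulged pages."""
    if np.random.rand() >= p:
        return img
    h, w = img.shape[:2]
    K = np.array([[w / 3, 0, w / 2],
                  [0, -w / 2, 0],
                  [0, 0, 1]], dtype=np.float64)
    D = np.array([0.0, -0.25, 1.0, 0.0]).reshape(4, 1)
    warped = cv2.fisheye.undistortImage(img, K, D, Knew=K)
    warped[warped == 0] = int(img.mean())   # crude gap fill with the average pixel value
    return warped
```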

Fig. 6. Resulting staff sample after altering Fig. 1 with the ‘Wavy’ augmentation.

All. In addition to the individual augmentation procedures described above, we also study the case in which all of them are applied sequentially at each training iteration. It must be remarked that the operations are applied in the order in which they have been presented in this paper, although a different order could have been used. It is also worth noting that each of these operations has a non-null probability of leaving the image unaltered, according to the criteria detailed for each procedure.

3 Experiments

This section introduces the corpora considered for assessing the validity of the proposed methodology, as well as the evaluation metrics used. Finally, the particular neural end-to-end recognition model used in this work is described.

3.1 Corpora

We consider two corpora of music scores in Mensural notation, i.e., the music engraving system used during most of the 16th and 17th centuries in the Western music tradition. These two sets are now described:
– Capitan [5]: Handwritten ninety-six-page manuscript of a missa (sacred music) dated from the 17th century. An example of a page from this corpus is depicted in Fig. 7a.
– Seils [16]: Collection of one hundred and fifty-five typeset pages corresponding to an anthology of Italian madrigals of the 16th century, namely the Symbolically Encoded Il Lauro Secco anthology. Figure 7b shows a page of the collection.

Fig. 7. Page examples of the two corpora used in the experiments.

Given that our proposal considers a neural end-to-end model which works at the staff level, it is necessary to first process the complete pages of the introduced corpora to segment those image excerpts. While both collections include the manually-annotated bounding boxes for segmenting those regions of interest, in order to increase the cases of study, in this work we also consider the possibility of automatically extracting those regions. In this regard, we shall study two different scenarios which differ on the nature of these bounding boxes: – Annotated: Case in which the different staves are extracted using the bounding box annotations of the original corpora.


– Predicted: In this scenario, the different bounding boxes are estimated using the approach by Castellanos et al. [7] based on the so-called Selectional Auto-Encoders.

It must be noted that both scenarios provide the same number of staves, so the number of experimental samples does not differ between modalities. The details of each corpus are provided in Table 1 in terms of the engraving mechanism, number of pages, staves, and vocabulary size.

Table 1. Details of the corpora in terms of the engraving mechanism and the number of pages, staves, and the cardinality of the vocabulary.

Corpus   Engraving    Pages  Staves  Vocabulary
Capitan  Handwritten  97     737     320
Seils    Typeset      150    1,278   182

Finally, we consider a 5-fold cross-validation (5-CV) set-up with three data partitions (train, validation, and test) corresponding to 60%, 20%, and 20% of the whole set of staves, respectively. Since we consider two different scenarios for extracting these staves, in our experiments we assess the performance of the recognition framework when trained with either the annotated or the predicted bounding boxes, while the test data always comprises the predicted boxes.

3.2 Metrics

Regarding the performance evaluation of the recognition framework, we considered the Symbol Error Rate (SER), as in other neural end-to-end OMR works [2,6]. This metric is defined as the average number of editing operations (insertions, deletions, or substitutions) necessary to match the sequence predicted by the model with that of the ground truth, normalized by the length of the latter. Mathematically, it may be defined as:

$$\mathrm{SER}\,(\%) = \frac{\sum_{i=1}^{|S|} \mathrm{ED}(z_i, \hat{z}_i)}{\sum_{i=1}^{|S|} |z_i|} \qquad (1)$$

where $\mathrm{ED}(\cdot,\cdot)$ stands for the string edit distance [15], $S$ for the set of test data, and $z_i$ and $\hat{z}_i$ for the target and estimated sequences, respectively.
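For illustration, a corpus-level SER following Eq. (1) can be computed as in the following minimal sketch; the plain dynamic-programming edit-distance helper and the function names are our own, not the authors' implementation:

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two symbol sequences."""
    prev_row = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr_row = [i]
        for j, h in enumerate(hyp, start=1):
            curr_row.append(min(prev_row[j] + 1,              # deletion
                                curr_row[j - 1] + 1,          # insertion
                                prev_row[j - 1] + (r != h)))  # substitution
        prev_row = curr_row
    return prev_row[-1]

def symbol_error_rate(references, hypotheses):
    """SER (%): total edit operations over total reference length."""
    total_edits = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    total_len = sum(len(r) for r in references)
    return 100.0 * total_edits / total_len
```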

3.3 CRNN Configuration

Due to its reported good results when tackling these corpora [1,7], this work replicates the neural configuration from Calvo-Zaragoza et al. [6] as the base configuration for the experiments. Table 2 describes this architecture in detail.


Table 2. CRNN configuration considered. Notation: Conv(f, wc × hc) stands for a convolution layer of f filters of size wc × hc pixels, BatchNorm performs the normalization of the batch, LeakyReLU(α) represents a leaky rectified linear unit activation with negative slope value of α, MaxPool(wp × hp) stands for the max-pooling operator of dimensions wp × hp pixels, BLSTM(n) denotes a bidirectional long short-term memory unit with n neurons, and Dropout(d) performs the dropout operation with d probability.

Layer 1: Conv(64, 5 × 5), BatchNorm, LeakyReLU(0.20), MaxPool(2 × 2)
Layer 2: Conv(64, 5 × 5), BatchNorm, LeakyReLU(0.20), MaxPool(2 × 1)
Layer 3: Conv(128, 3 × 3), BatchNorm, LeakyReLU(0.20), MaxPool(2 × 1)
Layer 4: Conv(128, 3 × 3), BatchNorm, LeakyReLU(0.20), MaxPool(2 × 1)
Layer 5: BLSTM(256), Dropout(0.50)
Layer 6: BLSTM(256), Dropout(0.50)

Finally, this architecture was trained via backpropagation using the CTC loss with the ADAM optimizer [13], a fixed learning rate of $10^{-3}$, and a batch size of 16 samples. We set a maximum of 1,000 epochs, keeping the network weights of the best validation result. All images were scaled at the input of the model to a height of 64 pixels, maintaining the aspect ratio for the width, and converted to grayscale for simplicity; other color spaces could also be used.
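A minimal PyTorch sketch of this configuration follows. The layer hyperparameters are taken from Table 2; details such as the padding, the reshaping between the convolutional and recurrent blocks, and the placement of dropout via the LSTM's inter-layer dropout are our assumptions:

```python
import torch
import torch.nn as nn

class CRNN(nn.Module):
    """CRNN following Table 2: four conv blocks, two BLSTM layers, output over Σ ∪ {blank}."""
    def __init__(self, num_symbols: int):
        super().__init__()
        def block(c_in, c_out, kernel, pool):
            return nn.Sequential(
                nn.Conv2d(c_in, c_out, kernel, padding=kernel // 2),
                nn.BatchNorm2d(c_out),
                nn.LeakyReLU(0.20),
                nn.MaxPool2d(pool))
        self.conv = nn.Sequential(
            block(1, 64, 5, (2, 2)),
            block(64, 64, 5, (2, 1)),
            block(64, 128, 3, (2, 1)),
            block(128, 128, 3, (2, 1)))
        # input height 64 is pooled down to 4, giving 128 * 4 features per frame
        self.rnn = nn.LSTM(input_size=128 * 4, hidden_size=256, num_layers=2,
                           bidirectional=True, dropout=0.50, batch_first=True)
        self.fc = nn.Linear(2 * 256, num_symbols + 1)   # +1 for the CTC blank symbol

    def forward(self, x):                       # x: (batch, 1, 64, width)
        f = self.conv(x)                        # (batch, 128, 4, width / 2)
        f = f.permute(0, 3, 1, 2).flatten(2)    # (batch, frames, features)
        out, _ = self.rnn(f)
        return self.fc(out).log_softmax(-1)     # posteriorgram for the CTC loss
```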

4 Results

This section presents and discusses the results obtained. Since the experiments were performed in a cross-validation scheme, the reported figures are the average values obtained over the folds for each of the cases considered. These results are the ones obtained on the test partition since, as commented, the validation partition was used for optimizing the weights of the neural network. Table 3 shows the error rates obtained for the introduced DA processes applied to the two corpora considered. Note that the influence of whether the bounding boxes are annotated or extracted is also assessed.

Regarding the Capitan corpus, for the case of the annotated bounding boxes, the base recognition model obtains an error rate of 14.34%. Once the different DA processes are introduced, this error rate decreases, with overall improvements in the range of 1% to 4%, depending on the precise method considered. Among all of them, the overall minimum is achieved by the Rotation process with a rate of 9.39%, which represents a relative improvement of 34.5% with respect to the base case. Note that applying all the augmentation processes also remarkably improves the base result, achieving an error rate of 10.88%, i.e., a relative improvement of roughly 24%.

When considering the predicted bounding boxes for the Capitan set, a similar trend to that of the annotated boxes is observed. In this case, though, the base recognition error is 12.86%, which is almost 2% better than in the previous scenario. It must be noted that, while the augmentation processes generally improve the recognition rate, the use of the Erosion & dilation method slightly increases the base error rate.


Table 3. Results obtained in terms of the Symbol Error Rate (%) for the different DA processes and corpora considered. The results are divided into two row sections according to the source of the bounding boxes (annotated or predicted). The All and No aug. rows represent applying all and none of the augmentation processes, respectively. The best results for each corpus and bounding box source correspond to the lowest values in each section.

Augmentation procedure    Symbol error rate (%)
                          Capitan   Seils
Annotated bounding boxes
  Contrast                13.14     4.97
  Erosion & dilation      11.89     5.50
  Margin                  10.57     2.96
  Rotation                 9.39     3.07
  Wavy                    10.63     3.76
  All                     10.88     3.43
  No aug.                 14.34     5.24
Predicted bounding boxes
  Contrast                11.31     3.95
  Erosion & dilation      12.92     4.40
  Margin                  10.02     2.84
  Rotation                 9.52     3.17
  Wavy                    10.01     3.40
  All                     12.11     3.78
  No aug.                 12.86     4.01

Nevertheless, as in the previous scenario, the Rotation process is the one achieving the best overall performance with a figure of 9.52%, which represents a relative improvement of almost 26% with respect to the base case. While the improvement observed when jointly considering all augmentation processes is not as remarkable as in the previous case, the result still outperforms the base case.

Moving on to the Seils corpus, the considerably lower error rates with respect to the Capitan set are quite noticeable. This difference in performance, which ranges between 6% and 9%, is due to the typeset nature of this set: its printed engraving is considerably more uniform and, therefore, easier to recognize than the handwritten engraving of Capitan. Focusing on the case of annotated bounding boxes, an error rate of 5.24% is achieved when no augmentation process is applied. As in the Capitan case, the inclusion of the different augmentation processes generally results in a performance improvement, with the sole exception of the Erosion & dilation case, for which the error rate increases. In this scenario, the Margin process achieves the


best overall recognition rate, with an error of 2.96%, which represents a relative improvement of 43.5% with respect to the base case. Also note that, when jointly considering all augmentation processes, the error rate decreases to 3.43%, i.e., a relative improvement compared to the base case of 35.5%.

With regard to the predicted bounding boxes, it can again be observed that all the methods reduce the base error rate except for the Erosion & dilation case, whose application worsens the recognition performance by almost 0.40%. The Margin method, conversely, achieves the minimum error rate with a value of 2.84%, which represents roughly a 29% relative improvement. Note that, when considering all augmentation processes, the base error rate is lowered to 3.78%, which, despite not being the sharpest reduction, represents almost a 6% relative improvement.

In general terms, the Margin and Rotation processes are the ones which achieve the best overall recognition rates. Indeed, according to the experiments, Rotation is the best DA process on Capitan, whereas Margin is the best on Seils. Since Capitan is a handwritten manuscript, the variability in the staff skew within each page is greater than in the printed case, so we may assume that the Rotation DA process reinforces this aspect by adding more variability to the training data, thereby yielding a more robust learning model. Conversely, due to the printed engraving of Seils, the staff-region skew is less noticeable, so a greater rotation variability in the training data is not as profitable, since the test data does not present this characteristic. Note, however, that these two DA procedures are the best cases for both corpora, regardless of the engraving, with the exception of the Wavy process, which also retrieves competitive results, especially when using predicted staff images. On the contrary, attending to the presented figures, the Erosion & dilation distortion may be deemed the least competitive since, in most cases, its use implies an increase in the overall error rate. Nevertheless, the joint use of all the augmentation processes consistently improves the results with respect to the case of not considering any of these methods, thus confirming their relevance in the context of neural-based recognition schemes.

Finally, it must be noted that the use of predicted boxes outperforms the manually-annotated scenario, most likely because the test staves are extracted using the automated approach. If we focus on the error rate figures, we only observe improvements close to 1.5% for Capitan and 1% for Seils. While these may be considered marginal reductions, note the reduced improvement margin that these corpora allow, especially Seils. So, although the predicted staves improve the base results from 14.34% to 12.86% for Capitan and from 5.24% to 4.01% for Seils, the relative improvement is close to 10.3% and 23.5%, respectively, thus being a substantial reduction factor with respect to the cases in which the annotated bounding boxes are used.

5 Conclusions

Music scores have historically represented the main vehicle for music preservation. Nevertheless, physical formats are unavoidably associated with gradual degradation when they are not preserved under strict conditions. The digitization of these documents would make it possible to easily preserve, index, or even disseminate this valuable heritage. Optical Music Recognition (OMR) stands for the family of computational methods meant to read music notation from scanned documents and transcribe their content into a digital structured format. This field is deemed one of the key processes in music heritage preservation since, in contrast to prone-to-error manual annotation campaigns, such techniques are inherently scalable.

Currently, neural end-to-end models are considered the state of the art in OMR tasks. Nevertheless, these models generally require a large amount of data to achieve competitive performance rates. This fact constrains their application to historical music documents given the scarcity of annotated data. In this regard, data augmentation processes, i.e., methods that create artificial samples by performing controlled distortions on the initial data to increase its variability and provide more robust recognition models, have typically been considered in learning-based systems for image-related tasks. However, these processes are meant for general recognition and/or classification tasks, so a proper study of the most adequate transformation methods for OMR is of remarkable relevance.

This work performs a comparative study of different augmentation processes for neural end-to-end OMR methods in the context of historical documents. The results obtained suggest that two transformations are the most adequate for this type of data, leading to relative improvements of up to 34.5% with respect to the non-augmented scenario. More precisely, these methods are the margin alteration, which represents the case in which the staff is not centered in the image, and the rotation distortion, which simulates a certain degree of slanting of the staff.

In light of these results, for future work we consider checking the validity of these findings by extending the study to other historical music corpora. Besides, simulating other distortions such as slanting or ink bleeding may also report a boost in performance, as they constitute common problems observed in this type of documents. In addition, as an alternative to directly distorting existing data, we aim at exploring the use of Generative Adversarial Networks for artificially creating data samples [8]. Finally, we also consider studying the performance of other learning-based schemes suitable for this task, such as Sequence-to-Sequence or Transformer models.


References

1. Alfaro-Contreras, M., Valero-Mas, J.J.: Exploiting the two-dimensional nature of agnostic music notation for neural optical music recognition. Appl. Sci. 11(8), 3621 (2021)
2. Baró, A., Badal, C., Fornés, A.: Handwritten historical music recognition by sequence-to-sequence with attention mechanism. In: 17th International Conference on Frontiers in Handwriting Recognition, ICFHR 2020, Dortmund, Germany, 8–10 September 2020, pp. 205–210. IEEE (2020)
3. Baró, A., Riba, P., Calvo-Zaragoza, J., Fornés, A.: From optical music recognition to handwritten music recognition: a baseline. Pattern Recogn. Lett. 123, 1–8 (2019)
4. Calvo-Zaragoza, J., Hajič, J., Jr., Pacha, A.: Understanding optical music recognition. ACM Comput. Surv. (CSUR) 53(4), 1–35 (2020)
5. Calvo-Zaragoza, J., Toselli, A.H., Vidal, E.: Handwritten music recognition for mensural notation: formulation, data and baseline results. In: 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), vol. 1, pp. 1081–1086. IEEE (2017)
6. Calvo-Zaragoza, J., Toselli, A.H., Vidal, E.: Handwritten music recognition for mensural notation with convolutional recurrent neural networks. Pattern Recogn. Lett. 128, 115–121 (2019)
7. Castellanos, F.J., Calvo-Zaragoza, J., Inesta, J.M.: A neural approach for full-page optical music recognition of mensural documents. In: Proceedings of the 21st International Society for Music Information Retrieval Conference, ISMIR, pp. 23–27 (2020)
8. Creswell, A., White, T., Dumoulin, V., Arulkumaran, K., Sengupta, B., Bharath, A.A.: Generative adversarial networks: an overview. IEEE Signal Process. Mag. 35(1), 53–65 (2018)
9. Graves, A.: Supervised sequence labelling. In: Graves, A. (ed.) Supervised Sequence Labelling with Recurrent Neural Networks, pp. 5–13. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-24797-2_2
10. Graves, A., Fernández, S., Gomez, F., Schmidhuber, J.: Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks. In: Proceedings of the 23rd International Conference on Machine Learning, ICML 2006, pp. 369–376. ACM, New York (2006)
11. Hajič, J., Jr., Kolárová, M., Pacha, A., Calvo-Zaragoza, J.: How current optical music recognition systems are becoming useful for digital libraries. In: Proceedings of the 5th International Conference on Digital Libraries for Musicology, pp. 57–61 (2018)
12. Journet, N., Visani, M., Mansencal, B., Van-Cuong, K., Billy, A.: DocCreator: a new software for creating synthetic ground-truthed document images. J. Imaging 3(4), 62 (2017)
13. Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. In: 3rd International Conference on Learning Representations, San Diego, USA (2015)
14. LeCun, Y., Bengio, Y., Hinton, G.: Deep learning. Nature 521(7553), 436–444 (2015)
15. Levenshtein, V.I.: Binary codes capable of correcting deletions, insertions, and reversals. Soviet Physics Doklady 10(8), 707–710 (1966)
16. Parada-Cabaleiro, E., Batliner, A., Schuller, B.W.: A diplomatic edition of Il Lauro Secco: ground truth for OMR of white mensural notation. In: ISMIR, pp. 557–564 (2019)


17. Pizer, S.M., et al.: Adaptive histogram equalization and its variations. Comput. Vis. Graph. Image Process. 39(3), 355–368 (1987)
18. Pugin, L.: The challenge of data in digital musicology. Front. Digital Humanit. 2, 4 (2015)
19. Rebelo, A., Capela, G., Cardoso, J.S.: Optical recognition of music symbols. Int. J. Doc. Anal. Recognit. (IJDAR) 13(1), 19–31 (2010)
20. Shi, B., Bai, X., Yao, C.: An end-to-end trainable neural network for image-based sequence recognition and its application to scene text recognition. IEEE Trans. Pattern Anal. Mach. Intell. 39(11), 2298–2304 (2017)
21. Shorten, C., Khoshgoftaar, T.M.: A survey on image data augmentation for deep learning. J. Big Data 6(1), 60 (2019)
22. Treitler, L.: The early history of music writing in the west. J. Am. Musicol. Soc. 35(2), 237–279 (1982)

Graph-Based Object Detection Enhancement for Symbolic Engineering Drawings

Syed Mizanur Rahman1,2, Johannes Bayer1,2(B), and Andreas Dengel1,2

1 Smart Data & Knowledge Services Department, DFKI GmbH, Kaiserslautern, Germany
{mizanur rahman.syed,johannes.bayer,andreas.dengel}@dfki.de
2 Computer Science Department, TU Kaiserslautern, Kaiserslautern, Germany

Abstract. The identification of graphic symbols and interconnections is a primary task in the digitization of symbolic engineering diagram images like circuit diagrams. Recent approaches propose the use of Convolutional Neural Networks (CNNs) for the identification of symbols in engineering diagrams. Although the recall and precision of CNN-based object recognition algorithms are high, false negatives result in some input symbols being missed or misclassified. The missed symbols induce errors in the circuit-level features of the extracted circuit, which can be identified using graph-level analysis. In this work, a custom annotated printed circuit image set, which is made publicly available in conjunction with the source code of the experiments of this paper, is used to fine-tune a Faster RCNN network to recognise component symbols, while blob detection is used to identify interconnections between symbols and generate a graph representation of the extracted circuit components. The graph structure is then analysed using graph convolutional neural networks and node degree comparison to identify graph anomalies potentially resulting from false negatives of the object recognition module. Anomaly predictions are then used to identify image regions with potentially missed symbols, which are subjected to image transforms and re-input to the Faster RCNN. This results in a significant improvement in component recall, which increases to 91% on the test set. The general tools used by the analysis pipeline can also be applied to other engineering diagrams given the availability of similar datasets.

Keywords: Graph convolutional network · Circuit diagram · Graph refinement

1 Introduction

Graph-based symbolic engineering drawings (like circuit diagrams or piping and instrumentation diagrams) use graphical symbols and line segments to represent the components of technical facilities or devices as well as their interconnections.


In addition, these images can contain texts that provide further information about individual components. The digitization of such images implies the extraction of this information (components, interconnections, and texts) to obtain a complete graph description of the digitized source document. An early attempt at such an extraction is described in [20]. Symbols connected through lines or multiple segments are a feature of many engineering diagrams, such as mechanical engineering diagrams and Piping and Instrumentation Diagrams (P&IDs). In this paper, the GraphFix framework1 is proposed as an approach to digitize circuit diagrams, but it is envisioned to be applicable to other similar document types with the help of suitable datasets.

GraphFix is a multi-stage information extraction framework to identify different types of information in an engineering document. First, a Faster RCNN [15] is trained to identify component symbols, which form the nodes of the extracted graph. Based on that, blob detection is used to predict the connections (wiring) between the components, which make up the graph's edges. The resulting graph structure is amenable to the application of graph refinement and error detection algorithms. There are two main types of errors in the component proposal list: false positives, i.e. component proposals in regions of the diagram which do not contain any identifiable symbol, and false negatives, which can result either from a symbol in the input image being misclassified or from it not being recognised as a symbol region at all. The latter are referred to as Unmarked False Negatives (UFNs) in this work and result in graph anomalies, which can be detected with the help of Graph Convolutional Networks (GCNs) or with node degree comparison. An attempt is also made to refine symbol labels in the component list generated by the Faster RCNN, using graph-level features and symbol characteristics such as position and size. Some of these methods achieve up to 60% recall@1, but this refinement does not improve the overall recall of the framework when its predictions are combined with the Faster RCNN results. Using graph anomalies, however, UFNs can be detected, which results in an improvement in recall@1 of up to 2–4%.

This paper is further structured as follows: Sect. 2 describes related work on the digitization of engineering diagrams (EDs) and circuit diagrams in particular, and briefly introduces graph refinement and other concepts touched upon in this work. Section 3 provides information on the printed circuit ED dataset used to train and test GraphFix, as well as the different data augmentation techniques used to improve the object recognition module's performance. The different processing steps of GraphFix are explained in the Methodology (Sect. 4), and a review of the overall performance and of the different refinement and error detection techniques is presented in the Results (Sect. 5). In conclusion, Sect. 6 summarises the salient contributions of this work as well as limitations of the framework, which can be addressed in future research.

1 https://github.com/msyed-unikl/GraphFix

2 Related Work

Digitization of document types such as electric circuits, floor plans and P&IDs shares many common features. [12] lists the identification of symbols, interconnections and text as some of the main challenges in the digitization of engineering diagrams such as circuit diagrams. Circuit diagrams can use different standards and conventions to represent symbols for circuit components, which makes it difficult to produce a well-defined dataset that captures variations in symbol style, pose and scale [12]. Initial attempts to identify symbols in circuit diagrams, such as [2], extract graphical primitives such as lines and circles from images and apply rule-based templates to identify component symbols. [11] proposes a system to identify symbols in hand-drawn engineering diagrams based on subgraph isomorphism, representing symbols and drawings as relational graphs, with which the system could also learn to identify new symbols. More recent approaches such as [13] extract graphical features from hand-drawn circuits and input them to a neural network for symbol recognition. Other notable machine learning based approaches, as opposed to rule matching, include [1], which proposes a probabilistic-SVM classifier using Histogram of Oriented Gradients (HOG) and Radon Transform features, and [4], which uses geometric analysis of vertical, horizontal and circuit space features to identify electric and electronic symbols. Direct comparisons of the effectiveness of the different methods are not possible, as these systems are trained on different (and often private) datasets with varying sets of symbols, despite the availability of a standard dataset for electrical circuits [18].

Alternative approaches to symbol detection using Convolutional Neural Networks (CNNs) were proposed by [6]. More recent attempts in this direction include [17], which combines a deep learning based Faster RCNN [15] and semantic segmentation with other statistical methods such as morphology and component filtration for the vectorization of floor plans. [14] employs a VGG-19 based Fully Convolutional Network to identify symbols in P&ID diagrams. [22] uses a Region based Convolutional Network (RCNN) to generate region proposals for 'symbol' and 'dummy' regions in P&ID images, and [21] also proposes the use of RCNNs for the identification of symbols in P&ID images. GraphFix follows a similar approach by training a Faster RCNN module to recognise electric component symbols in the custom printed circuit ED dataset.

Multiple computer vision based approaches have also been applied to the task of identifying connections between symbols in EDs. [4] applies a Depth First Search by considering darkened pixels as nodes and introducing edges between nodes for adjacent pixels. [14] applies the Hough Lines Transform to identify pipelines between symbols in P&IDs. GraphFix uses blob detection to identify connections between identified component symbols. Blobs are regions in an image that differ from their surroundings in terms of image features such as pixel colour [19]. Blob detection can be carried out with a number of algorithms such as the Laplacian of Gaussian (LoG) or the Difference of Gaussians (DoG) [19].

Symbol extraction as well as connection identification from ED images are associated with potential errors such as misclassification (for symbol extraction


only), false negatives and false positives. Some works have attempted to use graph or network level features to identify errors in the extraction process. The use of graph based rules to modify graph structures extracted from P&ID diagrams is proposed in [3]. [14] creates a forest structure with the extracted components and uses properties of P&ID diagrams to detect false positive pipelines identified by the Hough Transform algorithm. [5] applies morphological operations to the shape-graph space of a tree of connected components extracted from maps to filter out components not corresponding to expected layers. In [16], graph features are used to detect node labels for regions in floorplan images by converting floorplans to Region Adjacency Graphs and using Zernike moments as node attributes. The Graph Neural Networks and Edge Networks proposed in [16] achieve up to 100% accuracy on the ILIPso dataset [8] in predicting node labels. Experiments attempting to predict node labels (component symbols) using graph or regional features in this work (Graph Refinement) showed limited success and have much lower accuracy compared to the Faster RCNN module. The GraphFix pipeline achieves its improvement over the Faster RCNN's recall by identifying regions in the circuit image with symbols that have been missed by the Faster RCNN module; this is done by identifying anomalies in the graph extracted from the circuit images using node degree comparison and Graph Convolutional Networks (GCNs).

GCNs provide a semi-supervised graph based approach to predict node labels [9]. The Fourier basis of a graph is given by the eigenvectors of the graph Laplacian, defined as $L = D - A$, where $D$ is the diagonal degree matrix (diagonal elements are the degrees of the nodes and all other elements are 0) and $A$ is the adjacency matrix of the graph [10]. Graph convolutions can be defined on this Fourier basis, but such an approach is prohibitively expensive for large graphs, as the calculation of the eigenvector matrix is $O(N^2)$, where $N$ is the number of nodes in the graph. Localised spectral filters that make spectral convolutions computationally feasible were proposed in [7], which also put forward a truncated expansion of Chebyshev polynomials as an approximation of the eigendecomposition of the Laplacian. [9] further develops this model to propose a GCN layer which can be represented by

$$Z = \tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}} X \Theta$$

In this equation from [9], $\tilde{A} = A + I_N$, where $A$ represents the adjacency matrix of the graph, and $\tilde{D}_{ii} = \sum_j \tilde{A}_{ij}$. $X \in \mathbb{R}^{N \times C}$ is the input signal (with $C$ real input parameters for $N$ nodes) and $\Theta \in \mathbb{R}^{C \times F}$, where $F$ is the number of filter maps. A GCN layer is a 1-localized convolution, and multiple GCN layers can be stacked, which approximates higher-order localized convolutions [9].
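For illustration, the propagation rule above can be implemented directly; the following is a minimal PyTorch sketch of such a layer and of a two-layer anomaly classifier of the kind used later in this paper (the hidden dimension, the dense adjacency representation, and the absence of a graph library are our assumptions):

```python
import torch
import torch.nn as nn

class GCNLayer(nn.Module):
    """One GCN layer: Z = D~^{-1/2} A~ D~^{-1/2} X Theta, with A~ = A + I."""
    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.theta = nn.Linear(in_dim, out_dim, bias=False)

    def forward(self, x: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        a_tilde = adj + torch.eye(adj.size(0), device=adj.device)   # add self-loops
        d_inv_sqrt = a_tilde.sum(dim=1).pow(-0.5)
        norm_adj = d_inv_sqrt[:, None] * a_tilde * d_inv_sqrt[None, :]
        return self.theta(norm_adj @ x)

class AnomalyGCN(nn.Module):
    """Two stacked GCN layers scoring each node as anomalous / non-anomalous."""
    def __init__(self, in_dim: int, hidden_dim: int, num_classes: int = 2):
        super().__init__()
        self.gc1 = GCNLayer(in_dim, hidden_dim)
        self.gc2 = GCNLayer(hidden_dim, num_classes)

    def forward(self, x, adj):
        h = torch.relu(self.gc1(x, adj))
        return torch.log_softmax(self.gc2(h, adj), dim=-1)
```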

3 Printed Circuit ED Dataset

Computer Aided Design (CAD) software used to generate circuit diagrams can use different symbols and standards to represent symbols. A dataset used to train an object detection module to identify various types of symbols for component


types should consist of diagrams from various sources, to capture component symbols differing in style, pose, and other visual features and characteristics. Hand-drawn circuits are not included in the ground truth to avoid extreme heterogeneity2. The ground truth needed to run training, validation and testing for the Faster RCNN module and the graph algorithms is generated by scraping an online source3 for circuit diagram images, which are converted to a standard image format. Component symbols in these images are then manually annotated4 to mark symbols in the image with the corresponding component labels. Certain conventions and rules are necessary in order to maintain consistency in the use of labels for components across multiple diagram standards. Some of the important labelling conventions are:

– Circuit components are labeled such that there are no overlapping components.
– Wires are not annotated. However, wire crossings and overlaps are labelled as symbols.
– Simple variations in symbol style, line colour and spatial orientation are tolerated and grouped under the same label. However, when a different symbol type is consistently used for a qualified component (such as 'ground' and 'digital ground'), separate label categories are created.
– If a composite component (consisting of multiple smaller components) is repeatedly encountered, the composite symbol is treated as a label category of its own. For example, a rectifier bridge is labelled as one rectifier bridge as opposed to four diodes.

Ground truth graphs for the electric circuits are generated by inputting the component regions (bounding boxes from the manually annotated symbol list) to blob detection to identify interconnections. The resulting graphs, carrying symbols and connecting edges, then comprise the ground truth for the graph based methods. In total, the ground truth consists of 218 annotated printed circuit EDs. These are divided into 182 images for training, 18 diagrams for validation at training time, and 18 images set aside for testing the framework. 85 symbol categories have been identified and the dataset contains 8697 annotations. For each image in the training set, 8 variants are generated by randomly applying image transforms such as scaling, horizontal flipping, colour inversion and Gaussian noise (Fig. 1).

4 Methodology

GraphFix proposes a modular extraction of information from diagram images and their refinement. After image pre-processing (such as format conversion), the Faster RCNN object recognition module identifies component symbols in

2 https://github.com/msyed-unikl/printerd-circuit-ED-dataset
3 discovercircuits.com
4 LabelImg - https://github.com/tzutalin/labelImg


Fig. 1. Circuit digitization workflow proposed by GraphFix.

the image and their locations. This information is then used to identify connections between components using blob detection. The graph generated with the extracted symbols and connections can be subject to graph refinement, where graph anomalies are detected and used to identify component symbols missed by the object recognition module. Finally, select regions of the diagram are subject to image augmentation (cropping and scaling) and input again to the Faster RCNN module to identify these missing symbols.

4.1 Symbol Recognition

The classification head of a pre-trained (COCO dataset) Faster RCNN module with a ResNet 50 backbone5 is replaced with a new head to classify component symbols, and the model is retrained (fine-tuned) on the training dataset. Object proposals from the Faster RCNN consist of a bounding box, a label and a confidence score. Multiple proposals with different confidence scores can be generated for the same component symbol. GraphFix merges overlapping bounding boxes to create a 'prediction cluster' with labels sorted by confidence score. The list of prediction clusters is then compared with the components in the ground truth to generate precision and recall metrics for the symbol detection task. With grid search, training parameters such as the optimiser algorithm, training batch size and learning rate can be adjusted to maximise recall@1 on the validation set (see Table 1).
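For reference, replacing the classification head of the pre-trained torchvision Faster RCNN typically looks like the following minimal sketch (the function name and the handling of the background class are our assumptions):

```python
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

def build_symbol_detector(num_symbol_classes: int):
    """Load a COCO-pretrained Faster RCNN (ResNet-50 FPN) and swap in a new classification head."""
    model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=True)
    in_features = model.roi_heads.box_predictor.cls_score.in_features
    # +1 because torchvision reserves class index 0 for the background
    model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_symbol_classes + 1)
    return model
```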

4.2 Connection Identification

After component symbols and their locations are identified, connections between components can be extracted using blob detection and some simple and intuitive

5 https://pytorch.org/vision/stable/models.html#object-detection-instancesegmentation-and-person-keypoint-detection


computer vision related operations. First, all component regions are blanked out by merging them into the background. The image is then converted to grayscale, its colours are inverted, and a threshold is applied to remove noise. A blob detection algorithm is then applied to identify unique blobs, and a bounding box is computed for each blob. The corners of these bounding boxes are checked for proximity to symbol components, and blobs in proximity to two or three symbols are shortlisted. If a blob bounding box is in proximity to exactly two symbols, a connection between the symbols (a graph edge between the symbol nodes) is identified. If three symbols are in proximity to the blob bounding box, two of the three symbols are selected such that the area of the rectangle between the two symbol 'touch points' is maximal (Fig. 2).


Fig. 2. a. Section of circuit for connection identification b. After symbol removal, thresholding and colour inversion c. Identification of blobs with vertices in vicinity of component symbols (discovercircuits.com).
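A minimal sketch of the pre-processing and blob extraction step using OpenCV connected components; the concrete threshold, the minimum blob area, and the assumption of a light background are ours, not the authors' exact parameters:

```python
import cv2
import numpy as np

def find_connection_blobs(image_bgr: np.ndarray, symbol_boxes: list) -> list:
    """Blank out symbol regions, binarize the wiring, and return bounding boxes of wire blobs."""
    img = image_bgr.copy()
    for (x1, y1, x2, y2) in symbol_boxes:               # merge symbol regions into the (light) background
        img[y1:y2, x1:x2] = 255
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    inverted = cv2.bitwise_not(gray)
    _, binary = cv2.threshold(inverted, 50, 255, cv2.THRESH_BINARY)   # drop light noise
    n, _, stats, _ = cv2.connectedComponentsWithStats(binary)
    # each stats row: x, y, width, height, area; label 0 is the background
    return [(s[0], s[1], s[0] + s[2], s[1] + s[3]) for s in stats[1:] if s[4] > 10]
```

The returned blob boxes can then be tested for proximity to two or three symbol bounding boxes, as described above.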

There are some drawbacks to this technique:

– Component symbols must be erased before blob detection is applied. This is easily achieved for the ground truth, as the symbols are identified manually. During testing, however, the Faster RCNN object proposals can completely miss symbols (UFNs). This results in missing connections linked to these symbols as well as 'overshot' connections, which occur when symbols such as wire crossings or junctions are missed: the symbol is then misidentified as part of a wire in the image and incorrect connections can be added to the graph.
– The proposed method also misses complicated wire representations which loop around a component symbol.

Despite these drawbacks, the method identifies connections with high precision and recall. However, these quantities cannot be measured with the current printed circuit ED dataset, as it does not list connections between symbols, and the algorithm's output is therefore only suitable for manual assessment. Blob detection can also be used to identify text annotations; on circuit images in the printed circuit ED dataset it performs on par with the EAST deep learning based text recognition algorithm [23]. However, applying OCR systems to the image regions containing text results in inaccurate character and symbol recognition, and text is thus not considered for graph refinement.

4.3 Anomaly Detection

Most errors seen in the output of the blob detection based connection identification stem from symbols missed by the Faster RCNN module. Identifying these Unmarked False Negatives (UFNs) can lead to the correction of a number of such errors. These false negatives do not have any corresponding node in the extracted electric graph, as no object proposal corresponding to the ground truth object exists in the Faster RCNN output (in contrast, there are also other false negatives in the ground truth, which have overlapping object predictions with incorrect labels). When blob detection is carried out on the output of the Faster RCNN, connections between UFNs and the nodes connected to UFNs in the ground truth will be missed or misidentified. These nodes connected to the UFNs in the ground truth are present in the Faster RCNN output (ignoring the possibility of two UFNs connected to one another) and are termed anomalies. Two methods, node degree comparison and GCNs, are applied to the task of identifying anomalies (Fig. 3).


Fig. 3. a. Sample graph generated from symbol and connection identification steps (discovercircuits.com). b. Unmarked False Negative (UFN) Identification Process with Image Augmentation.

4.4 Training Set for Anomaly Detection

To train models to identify anomalies, training set images are input to the Faster RCNN model and blob detection to generate graphs for the circuits. The anomalies in these graphs can be identified by comparing them with the ground truth and identifying nodes connected to UFNs. For Faster RCNN trained on the basic dataset up to 5% of nodes in ground truth are UFNs and 8% of nodes in the Anomaly Detection training set are anomalies. This results in under-sampling of anomaly nodes, which has to be handled while training models.

4.5 Anomaly Detection Techniques

Node Degree Comparison. Using the anomaly detection training set, two discrete probability distributions of the observed degree for each symbol type can be calculated. The first distribution is for instances where the symbol is encountered as an anomaly and the second is for symbol instances that do not occur as anomalies. These two distributions for a symbol type are represented as $P(D_n = d_n \mid A_n = \text{True})$ and $P(D_n = d_n \mid A_n = \text{False})$, i.e. the probabilities that the degree $D_n$ of node $n$ takes an observed natural number $d_n$ given that $A_n$ is True (node $n$ is anomalous) or False (node $n$ is not an anomaly). These distributions are then used to predict whether node occurrences in the test set graphs are anomalies:

$$A_n = \text{True} \quad \text{if} \quad P(D_n = d_n \mid A_n = \text{True}) > P(D_n = d_n \mid A_n = \text{False})$$

Using Bayes' theorem, an alternative can also be suggested where, instead of $P(D_n = d_n \mid A_n = \text{True})$ and $P(D_n = d_n \mid A_n = \text{False})$, the posteriors $P(A_n = \text{True} \mid D_n = d_n)$ and $P(A_n = \text{False} \mid D_n = d_n)$ are estimated with the prior probabilities $P(A_n = \text{True})$ and $P(A_n = \text{False})$, and the condition becomes:

$$A_n = \text{True} \quad \text{if} \quad P(D_n = d_n \mid A_n = \text{True}) \cdot P(A_n = \text{True}) > P(D_n = d_n \mid A_n = \text{False}) \cdot P(A_n = \text{False})$$

In practice, $P(A_n = \text{True})$ is usually very small (less than 0.1), and therefore anomaly detection with Bayesian priors has an extremely low recall.
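A minimal sketch of the node degree comparison rule above; the training data layout and helper names are our assumptions:

```python
from collections import defaultdict

def fit_degree_distributions(nodes):
    """Estimate P(D = d | A) per symbol type from (symbol, degree, is_anomaly) training tuples."""
    counts = defaultdict(lambda: defaultdict(int))
    totals = defaultdict(int)
    for symbol, degree, is_anomaly in nodes:
        counts[(symbol, is_anomaly)][degree] += 1
        totals[(symbol, is_anomaly)] += 1
    def prob(symbol, is_anomaly, degree):
        total = totals[(symbol, is_anomaly)]
        return counts[(symbol, is_anomaly)][degree] / total if total else 0.0
    return prob

def is_anomaly(prob, symbol, degree):
    """Flag a node as anomalous when its degree is more likely under the anomaly distribution."""
    return prob(symbol, True, degree) > prob(symbol, False, degree)
```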

Fig. 4. A two layer GCN network to detect anomalies in circuits. In a three layer GCN, the output of the second layer (768 out channels) is input to a second ReLU activation and then input to a third GCN layer followed by a softmax function

GCNs: The anomaly training dataset can also be used to train a GCN based network to identify anomalies in test set graph outputs from the Faster RCNN and blob detection. In addition to anomalies in the Faster RCNN output in the training set, nodes are artificially dropped in the training set graphs, to generate additional anomalies. All neighbours previously connected to a dropped node


can be marked as anomalies. Multiple variants of each training set circuit graph are generated and nodes are dropped randomly. The data is then used to train GCN networks to identify anomaly nodes. Multiple architectures are trained by varying training parameters such as the number of GCN layers, the probability of a node being dropped, and the number of variants for each electric graph, in order to maximise recall for anomaly detection (Fig. 4). The performance of the two anomaly detection methods is presented in the Results section.

4.6 Detection of False Negatives

After identifying anomalies in the electric graphs obtained from the Faster RCNN output, GraphFix proceeds to use this information to identify the UFNs causing the anomalies, based on the following observations:

– Anomalies are connected to the UFN in the ground truth, so a UFN is expected to be in the proximity of anomalies in most cases.
– There should be a connection between an anomaly and any new object proposal for a UFN. This is useful for dropping false positive UFN proposals.
– A UFN can be connected to one or more anomalies in the ground truth; this assumption allows for the clustering of anomalies in close proximity.

Some further assumptions are made to cluster anomalies and to identify suitable search areas in which UFNs may potentially be located:

– Distances between components in the circuit are assumed to be distributed normally with mean avg_distance and standard deviation std_dev; the distance between an anomaly and a UFN can then be expressed in these terms.
– avg_distance and std_dev can also be used to identify anomaly clusters, since anomalies in a cluster should be connected to the same UFN.

GraphFix uses a heuristic method to cluster anomalies, first defining $D_c = \alpha_c \cdot \text{avg\_distance} + \beta_c \cdot \text{std\_dev}$, with $\alpha_c$ and $\beta_c$ set based on experimentation. Each node is initially assigned to its own cluster. Two clusters are aggregated if the minimum distance between nodes of the two clusters is below $D_c$, and this step is repeated until no more clusters can be aggregated. If an anomaly cluster has more than one node, the UFN is expected to be located near the center of these anomaly nodes.

An anomaly cluster with a single node can result from a UFN positioned above, below, to the left or to the right of the anomaly. Hence, a square search area with side $S_c = 2 \cdot D_p$, where $D_p = \alpha_p \cdot \text{avg\_distance} + \beta_p \cdot \text{std\_dev}$ and $\alpha_p$ and $\beta_p$ are set by trial and error, is assigned to such single-node clusters. Clusters with two anomalies result in a rectangular search area centered at the centroid of the two anomaly components, with both the length and the breadth of the rectangle set to $S_c$.


For clusters with more than two anomalies, the search area is chosen as the minimum rectangle that covers all the anomalies in the cluster, assuming that the anomalies result from a single UFN surrounded by them. After finalizing the search areas, two alternative methods are tested to identify UFNs within the selected regions:

– By default, the Faster RCNN outputs proposals with a confidence score greater than 0.05. The threshold for object proposals is lowered to 0.01, and the proposals with confidence scores between 0.01 and 0.05 are stored in a separate list, which can then be searched for object proposals inside the search areas.
– Search areas can be cropped and subjected to image transforms such as stretching. The transformed image is input to the Faster RCNN module to identify object proposals, which then have to be converted back to the coordinates of the original circuit image.

Experiments on the printed circuit ED dataset show that image augmentation yields better results than lowering the Faster RCNN confidence threshold, and it was therefore selected for further processing. Proposals identified using image augmentation are compared with the original output of the Faster RCNN to identify and delete repeats corresponding to components already identified. Finally, blob detection is carried out to identify connections between anomaly nodes and the new object proposals. Any new object proposal not connected to an anomaly is dropped as a false positive to arrive at the final list of extracted components.
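A minimal sketch of the anomaly clustering heuristic described above; the coordinate representation, the default coefficients, and the function signature are our assumptions (the paper sets the coefficients experimentally):

```python
import math

def cluster_anomalies(points, avg_distance, std_dev, alpha_c=1.0, beta_c=1.0):
    """Agglomerate anomaly locations: merge clusters whose closest members lie within D_c.

    points: list of (x, y) anomaly centers."""
    d_c = alpha_c * avg_distance + beta_c * std_dev
    clusters = [[p] for p in points]              # every anomaly starts in its own cluster
    merged = True
    while merged:
        merged = False
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                if min(math.dist(a, b) for a in clusters[i] for b in clusters[j]) < d_c:
                    clusters[i].extend(clusters.pop(j))
                    merged = True
                    break
            if merged:
                break
    return clusters
```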


Fig. 5. a. Correctly identified anomalies (red borders with red fill) and false negative anomalies (red borders with yellow fill) b. Search areas (green dashed borders) and detected Unmarked False Negatives UFNs (green fill) on a sample circuit (discovercircuits.com). (Color figure online)

4.7 Graph Refinement

Representing electric circuits as graphs with components as nodes and interconnections as edges opens up the possibility of applying graph refinement techniques to the extracted information. Graph refinement can be applied to a number of tasks in graphs, such as (i) error detection (identifying wrong node labels or erroneous connections) and (ii) identification of missing edges. For graphs generated from engineering diagrams, the detection of incorrectly labeled nodes is of special interest, as it can potentially improve symbol recall. Two techniques (symbol location and size based prediction, and GCNs) to identify node labels using graph and image features are presented here. The methods are trained on ground truth graph data and tested on the graphs generated from the Faster RCNN output on the test set diagrams.

Symbol Location and Size Based Prediction. Circuit diagrams utilise certain conventions, such as the placement of voltage sources at the top and the representation of ground connections at the bottom of the diagram. In addition, the dimensions of a symbol bounding box are a good indicator of the label type. This information is normalised for the components in the images and used to train a neural network to predict the symbol labels. The trained network is then used to predict node labels from the same information in the Faster RCNN output on the test set.

Graph Convolutional Networks. Neither the node embeddings nor the symbol location and size based label prediction methods make use of all the available graph features; therefore, a 2-layer GCN is trained to predict the label of a node, taking the graph adjacency matrix and node data as input. Two versions of the GCN are tested: in one, symbol position and size data are included as node data; in the second, the Faster RCNN prediction scores for the labels are also appended to the node data.

5 Results

Extraction results for the GraphFix framework are presented in four parts: results from fine-tuning the Faster RCNN module with different approaches, anomaly detection, the improvement in extraction after anomaly detection, and finally the results from the other graph refinement techniques. The precision@1 and recall@1 metrics are used to quantify the performance on the symbol detection and anomaly detection tasks.

5.1 Fine-Tuning of Faster RCNN

The Faster RCNN object detection module provides a good baseline for the symbol extraction task. Data augmentation and parameter optimization help improve the performance even further, and a recall@1 of 89.47% is achieved with a combination of these techniques (see Table 1).

Table 1. Results of Faster RCNN training on test set

Model       | Dataset   | Precision | Recall | F-Measure | Max Recall
Faster RCNN | Basic     | 83.82%    | 83.16% | 83.49%    | 85.44%
Faster RCNN | Augmented | 86.91%    | 87.37% | 87.14%    | 89.47%

5.2 Anomaly Detection

To compare node degree and GCN based anomaly detection, Recall@1 and Precision@1 metrics are calculated by comparing the list of anomaly nodes in the anomaly dataset with the list of anomalies predicted by the two methods on the test set (see Table 2). These metrics vary from one GCN training run to another; therefore, the results for the model with the best recall over 3 training attempts are reported below. Additionally, the number of variants for a model is decided by increasing the number until no further improvement in recall is observed. In contrast, these metrics remain constant for the node degree comparison model unless the underlying Faster RCNN model is retrained (Table 3).

Table 2. Anomaly detection using node degree based methods

Anomaly detection method | Precision@1 | Recall@1 | F-Measure
Node degree              | 30.60%      | 58.95%   | 40.29%

Table 3. Best results for anomaly detection using GCNs

GCN         | Node Drop % | Variants | Precision@1 | Recall@1 | F-Measure
2 layer GCN | 20.00%      | 8        | 53.49%      | 27.38%   | 36.22%
2 layer GCN | 25.00%      | 8        | 52.00%      | 30.95%   | 38.80%
2 layer GCN | 33.33%      | 6        | 49.02%      | 30.86%   | 37.88%
2 layer GCN | 50.00%      | 6        | 41.82%      | 27.38%   | 33.09%
3 layer GCN | 25.00%      | 10       | 45.90%      | 33.33%   | 38.62%
3 layer GCN | 30.00%      | 8        | 46.03%      | 34.52%   | 39.46%
3 layer GCN | 33.33%      | 6        | 42.42%      | 33.33%   | 37.33%
3 layer GCN | 40.00%      | 4        | 42.42%      | 33.33%   | 37.33%

GCN based anomaly detection offers higher precision than the node degree comparison, but the node degree method delivers higher recall. Additionally, some observations can be made about the low F-measure for both models:

– Errors from previous processing steps, i.e. Faster RCNN and blob detection for connection detection, result in errors in both the training and test datasets for anomaly detection.


– The node degree comparison model outputs many false positives, which correspond to issues in the graph structure caused by problems other than UFNs, such as false positives in the Faster RCNN output, incorrect output from blob detection, and the use of infrequent circuit symbols or wire configurations.
– The improvement in GCN precision and recall with data augmentation (node drops in variants) leads to the hypothesis that the method can achieve better results with additional training data.

Since anomaly detection is carried out to detect UFNs, node degree comparison is chosen for further experiments because of its higher recall: more search areas increase the chances of detecting UFNs.

5.3 Improving Faster RCNN Recall Using Anomaly Detection

Using anomaly predictions from node degree comparison, a fixed image transformation (scaling by a factor of 1.15) is applied to search areas to identify new object proposals. New proposals overlapping with existing components or without connections to anomalies are dropped, and precision@1 and recall@1 metrics for the expanded component list are calculated (see Fig. 5). The anomaly detection method is applied to multiple Faster RCNN models to observe the consistency of the improvement in results (see Table 4).

Table 4. Improved symbol recall with UFN & anomaly detection

Model           | Dataset   | Precision (Faster RCNN) | Recall (Faster RCNN) | F-Meas (Faster RCNN) | Precision (after anomaly detection) | Recall (after anomaly detection) | F-Meas (after anomaly detection)
Faster RCNN I   | Basic     | 83.82% | 83.16% | 83.49% | 75.09% | 86.84% | 80.54%
Faster RCNN II  | Augmented | 85.79% | 87.37% | 86.57% | 79.67% | 89.34% | 84.23%
Faster RCNN III | Augmented | 85.60% | 89.21% | 87.37% | 80.93% | 91.05% | 85.69%
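The overlap test used above to delete re-detected duplicates (new proposals that coincide with components already extracted) can be sketched as a simple IoU filter. The 0.5 threshold is an assumed value, not one reported in the paper.

```python
def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter + 1e-9)

def keep_new_proposals(new_boxes, existing_boxes, thresh=0.5):
    """Drop re-detected proposals that overlap an already extracted component."""
    return [nb for nb in new_boxes
            if all(iou(nb, eb) < thresh for eb in existing_boxes)]
```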

Anomaly detection based identification of UFNs results in a consistent improvement in the recall metrics. A slight drawback is the drop in precision, which is largely a result of rectangular corners and segments of wires being identified as wire connections after image augmentation. Simple domain-specific rules could eliminate many of these wrong proposals; however, this is not implemented in order to keep the proposed framework general.

5.4 Graph Refinement

For graph refinement techniques, two parameters are tested - recall@1 for the refinement method itself and the recall of a combined model which adds confidence scores of the refinement method and the Faster RCNN model trained on the basic dataset (see Table 5).


Table 5. Recall for graph based label prediction methods. Recall for the Faster RCNN model: 83.55%

Refinement method                               | Recall | Recall combined with Faster RCNN
Neural network with component position and size | 58.95% | 80.39%
2 layer GCN                                     | 41.05% | 81.47%
2 layer GCN with Faster RCNN confidence scores  | 80.06% | 81.84%
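The combined model in the last column of Table 5 adds the refinement method's per-class confidence scores to the Faster RCNN scores before picking a label. A minimal sketch of that combination step, assuming equal weighting of the two score vectors:

```python
import numpy as np

def combined_label(refine_scores, frcnn_scores):
    """Sum the refinement model's class scores with the Faster RCNN scores
    for the same node and return the highest-scoring class index."""
    return int(np.argmax(np.asarray(refine_scores) + np.asarray(frcnn_scores)))

# Toy usage with three symbol classes.
print(combined_label([0.1, 0.7, 0.2], [0.5, 0.3, 0.2]))  # -> 1
```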

6 Conclusion

GraphFix successfully demonstrates the use of graph level features to improve symbol and connection identification in circuit diagram images. The graph level abstraction of the proposed enhancement techniques allows the framework to be applied to other symbolic engineering document types such as P&IDs, subject to the availability of suitable datasets, providing further scope for research on this topic. Additionally, the framework can potentially support symbol extraction approaches other than Faster RCNN that provide symbol class and position information, as well as other connection identification methods. Although GraphFix achieves a high recall on the tested dataset, multiple improvements, such as identifying connection leads for components, improving the OCR recognition, and refining the graph refinement stage, can improve the accuracy of the digitization and increase the practical utility of the extraction. Additionally, graph level features can also be extracted from available electronic versions of engineering diagrams (such as circuit netlists), providing more ground truth for the graph based refinement models.


ScanSSD-XYc: Faster Detection for Math Formulas

Abhisek Dey and Richard Zanibbi(B)

Rochester Institute of Technology, Rochester, NY 14623, USA
{ad4529,rxzvcs}@rit.edu

Abstract. Detecting formulas offset from text and embedded within textlines is a key step for OCR in scientific documents. The Scanning Single Shot Detector (ScanSSD) detects formulas using visual features, applying a convolutional neural network within windows in a large document image (600 dpi). Detections are pooled to produce page-level bounding boxes. The system works well, but is rather slow. In this work we accelerate ScanSSD in multiple ways: (1) input and output routines have been replaced by matrix operations, (2) the detection window stride (offset) can now be adjusted separately for training and testing, with fewer windows used in testing, and (3) merging with non-maximal suppression (NMS) in windows and pages has been replaced by merging overlapping detections using XY-cutting at the page level. Our fastest model processes 3 pages per second on a Linux system with a GTX 1080Ti GPU, Intel i7-7700K CPU, and 32 GB of RAM.

Keywords: Math formula detection · Single-Shot Detector (SSD)

1 Introduction

ScanSSD [4] by Mali et al. detects math formula regions as bounding boxes around formulas embedded in text or offset from it, using a Convolutional Neural Network (CNN). It is based on the original SSD [2], a single-pass network with a VGG-16 [8] backbone that locates and classifies regions. Unlike SSD, which was designed to detect objects in the wild, a number of modifications allowed ScanSSD to obtain strong performance for math formula detection. Figure 1 shows an overview of the detection process in ScanSSD. To mitigate the issue of low recall on large PDF images and non-square aspect ratios, ScanSSD uses a sliding window detection strategy. A 1200 × 1200 window is slid across the PDF page images rendered at 600 dpi to generate sub-images resized to 512 × 512 as the input to the network. A stride of 10%, or 120 pixels, across and down was used as the striding factor. Detections from these windows are passed through a second stage, comprised of a page-level pixel-based voting algorithm that determines the page-level predictions. Pixels vote based on the number of detection boxes they intersect, after which pixel scores are binarized, with formula detections defined by connected components in the binarized voting grid. Initial detections are then cropped around connected components in the original page image that they contain or intersect.
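For illustration only, the following sketch enumerates the window origins implied by the description above (1200 × 1200 windows at a 120-pixel stride); clamping the final window to the page edge is an assumption rather than a detail taken from ScanSSD.

```python
def starts(length, win=1200, stride=120):
    """Window start offsets along one axis, clamping the last window to the edge."""
    if length <= win:
        return [0]
    s = list(range(0, length - win + 1, stride))
    if s[-1] != length - win:
        s.append(length - win)
    return s

def window_origins(page_h, page_w, win=1200, stride=120):
    """Top-left corners of all win x win windows slid over a page image."""
    return [(y, x) for y in starts(page_h, win, stride)
                   for x in starts(page_w, win, stride)]

print(len(window_origins(6600, 5100)))  # e.g. a US Letter page scanned at 600 dpi
```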


Fig. 1. Sliding window-based formula detection in ScanSSD.

ScanSSD obtains state-of-the-art results with a 79.6 F-score (IoU 0.5) on the TFD-ICDAR2019v2 dataset, but is quite slow. In this work we have made changes to ScanSSD that allow us to retain comparable accuracy while decreasing execution times by more than 300 times. Details are provided below.

2 ScanSSD-XYc

ScanSSD-XYc streamlines detection and eliminates redundant operations in ScanSSD, leading to decreases in execution and training times. Substantial changes were made across the pre-processing, windowing and pooling stages, as discussed below.

Pre-processing. A major bottleneck in ScanSSD was the pre-processing stage, where all windows were generated as separate files. Furthermore, inefficient loading and parallelization resulted in the GPU waiting for batches. To address this, the I/O framework was completely revised. Windows for a page are now generated using a single tensor operation applied to the page image. Ground-truth regions that wholly or partially fall within windows are also identified by another tensor operation. This decreased execution times by 28% over ScanSSD, before incorporating the additional modifications discussed below.

In the original ScanSSD windowing algorithm, many windows used in training contain no ground-truth regions. SSD detects objects at multiple scales, using a grid of initial detection regions at each scale. This leads to thousands of SSD candidate predictions in a single window, and even where target formulas are present in a window, the vast majority of candidate detections are negatives. To reduce this imbalance, windows without ground-truth regions are ignored in training, in addition to hard negative mining. Page images are also padded at their edges to be an even multiple of 1200 pixels high and wide, matching the area covered by the fixed-size sliding window.

Windowing. ScanSSD uses a fixed 10% stride for training and inference; we instead use a smaller 5% stride in the training stage to see more examples of the same target formula alongside other page contents, and then run testing/inference using large strides (e.g., 75%) for faster detection. A stride of 60 pixels (5%) was used for training, and differing strides from 10% to 100% were used for execution. As shown in Fig. 2, smaller strides enable the network to see sub-regions multiple times, and improve network convergence. The page edge regions are seen only once, but predominantly consist of white margins.

Fig. 2. Left: blue lines show regions for 100% strides of a 1200 × 1200 window, with each sub-region seen once by the network. Right: red lines illustrate regions at left split into four by 50% strides of the same 1200 × 1200 window (shown in blue). Except for cells around the border, each sub-region indicated by the red lines is seen four times. (Color figure online)

XY-Cutting to Merge Window-Level Predictions. ScanSSD filters predicted regions twice: once at the window level and again at the page level. Window-level predictions undergo non-maximal suppression (NMS), and are then stitched together at the page level. Window-level regions vote pixel-wise, producing a vote map over the image which is binarized. Remaining connected components are treated as the final detections. A post-processing step then fits detections tightly around the connected components in the original page image that they intersect. This is an expensive process, illustrated in Fig. 1.

As shown in Fig. 3, ScanSSD-XYc simply filters window-level detections with less than 50% confidence and then pools detections at the page level (Fig. 3 left). XY-cutting [1,5] segments documents into axis-aligned rectangular regions using pixel projections. We use XY-cutting as a partitioning algorithm over detected formula boxes, recursively segmenting the page until regions contain only one set of overlapping boxes. Finally, overlapping boxes are merged to predict the final formula regions (Fig. 3 right). Although the worst-case time complexity (all overlapping boxes) with n boxes is O(n^2) for both NMS and XY-cuts [7], for XY-cut the worst case is rare. XY-cut recursively groups boxes based on spatial positions, and successive splits may be checked independently of one another.

Fig. 3. Left: XY-cutting of Pooled Window-Level Predictions. Orange horizontal lines show splits along the y-axis, and blue vertical lines show splits in the x-direction. Cutting is performed only on detection boxes, and stops when all detection boxes in a region overlap. Right: Final page-level predictions after merging overlapping boxes.

In ScanSSD-XYc, cropping is used only for computing evaluation metrics, as blank space at detection borders may be ignored in our applications, but unfairly penalizes detections when using IoU detection metrics.
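The page-level merging step can be sketched as a recursive partition of the pooled detection boxes: cut wherever the x- or y-projection of the boxes leaves a gap, and merge whatever remains together in a leaf region. This is a simplified illustration, not the authors' implementation.

```python
def merge_by_xy_cut(boxes):
    """Recursively split a set of (x1, y1, x2, y2) detection boxes wherever
    their x- or y-projections leave a gap; boxes left together in a leaf
    region are merged into a single predicted formula box."""
    if not boxes:
        return []
    if len(boxes) == 1:
        return list(boxes)

    def split(axis):
        lo, hi = axis, axis + 2            # 0/2 for x, 1/3 for y
        order = sorted(boxes, key=lambda b: b[lo])
        reach = order[0][hi]
        for i in range(1, len(order)):
            if order[i][lo] > reach:       # gap in the projection -> cut here
                return order[:i], order[i:]
            reach = max(reach, order[i][hi])
        return None

    for axis in (1, 0):                    # y-projection gaps (horizontal cuts), then x
        parts = split(axis)
        if parts:
            return merge_by_xy_cut(parts[0]) + merge_by_xy_cut(parts[1])

    # No cut possible: all remaining boxes overlap, so merge them.
    xs1, ys1, xs2, ys2 = zip(*boxes)
    return [(min(xs1), min(ys1), max(xs2), max(ys2))]

print(merge_by_xy_cut([(0, 0, 10, 10), (5, 5, 15, 12), (40, 0, 50, 10)]))
# -> [(0, 0, 15, 12), (40, 0, 50, 10)]
```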

3 Results

We present preliminary results for ScanSSD-XYc on the TFD-ICDAR2019v2 [3] dataset, which fixes some missing annotations in TFD-ICDAR2019. There are 446 training page images and 236 testing page images taken from 48 PDFs of math papers. Page images are document scans (600 dpi). Experiments were run using a Core i7-7700K CPU with a GTX 1080Ti GPU and 32 GB of RAM. Table 1 compares ScanSSD-XYc with systems that participated in the ICDAR 2019 CROHME competition [3] and with ScanSSD. ScanSSD-XYc performs comparably to the original ScanSSD at a 72.4 F-score (IoU of 0.5). RIT 1 was a default implementation of YoloV3, and RIT 2 was an earlier version of SSD [6].

Table 1. Comparison of preliminary results from ScanSSD-XYc (using 75% stride)

Model | IoU 0.5 Precision | IoU 0.5 Recall | IoU 0.5 F-score | IoU 0.75 Precision | IoU 0.75 Recall | IoU 0.75 F-score

TFD-ICDAR2019 (Original)
ScanSSD     | 85.1 | 75.9 | 80.2 | 77.4 | 69.0 | 73.0
RIT 2       | 83.1 | 67.0 | 75.4 | 75.3 | 62.5 | 68.3
RIT 1       | 74.4 | 68.5 | 71.3 | 63.2 | 58.2 | 60.6
Mitchiking  | 36.9 | 27.0 | 31.2 | 19.1 | 13.9 | 16.1

TFD-ICDAR2019v2
ScanSSD     | 84.8 | 74.9 | 79.6 | 78.1 | 69.0 | 73.3
ScanSSD-XYc | 84.8 | 63.1 | 72.4 | 77.2 | 57.4 | 65.9

To diagnose the lower recall scores observed for ScanSSD-XYc, we studied the effect of stride size at test time on accuracy and speed. Table 2 breaks down results by stride size: a smaller stride produces more windows for a page image. While 100% strides had the smallest number of windows and the fastest execution time (3.1 pages/second), a 75% stride gave us the highest accuracy while still processing 1.56 pages/second. ScanSSD-XYc using the smallest stride (10%) is over three times (300%) faster than ScanSSD using the same stride, which takes 90 s/page. As seen in Table 2, the 75% stride produces the best balance between speed and accuracy, roughly matching the precision of the original ScanSSD, with some reduction in recall (10.8% for IoU of 0.5, 11.6% for IoU of 0.75). Opportunities to increase recall are described in the next section.

Table 2. ScanSSD-XYc results on TFD-ICDAR2019v2 for different strides

Stride (%) | IoU 0.5 Precision | IoU 0.5 Recall | IoU 0.5 F-score | IoU 0.75 Precision | IoU 0.75 Recall | IoU 0.75 F-score | Secs/Page
10  | 41.6 | 63.1 | 50.2 | 36.4 | 55.3 | 43.9 | 28.4
25  | 79.1 | 62.6 | 69.9 | 71.5 | 56.6 | 63.1 | 4.8
50  | 83.0 | 63.4 | 71.9 | 75.3 | 57.6 | 65.2 | 1.3
75  | 84.8 | 63.1 | 72.4 | 77.2 | 57.4 | 65.9 | 0.6
100 | 80.7 | 61.4 | 69.7 | 71.0 | 54.0 | 61.3 | 0.3

In Table 2 we see little change in recall for different strides, while precision varies. Furthermore, the presence of figures in the evaluation set impacted precision, as these were detected as math (i.e., false positives). The training set contains 9 of 36 documents with at least one figure, while the test set contains 4 of 10 documents with at least one figure. Non-text regions such as figures and tables are often detected wholly or in part by ScanSSD-XYc (see Fig. 3), and we plan to address this in future work. We expected the smallest stride to perform best, but smaller strides produce more partial predictions, which are fit less tightly around targets at the page level and result in more false positives.

4 Conclusion

We have presented an accelerated version of the Scanning Single Shot Detector in this paper. Combining window and page level operations, and eliminating NMS and post-processing, led to much shorter execution times with some loss in recall. The entire system was refactored to facilitate faster, scalable tensorized I/O operations. Our long-term goal is to improve the usability of SSD-based detection for use in formula indexing for retrieval applications. We hope to improve detection effectiveness by exploring causes of low recall for small formulas, and issues with over-merging across and between text lines. We think that setting a relative threshold in the page level XY-cuts based on the sizes of the boxes and detection confidences can mitigate the issue of low recall due to over-merging. We will also attempt to use additional training (beyond two epochs) and different weight initializations and other tuning parameters to better optimize the detector parameters.

Acknowledgements. We thank the anonymous reviewers for their helpful comments, along with Alex Keller, Ayush Shah, and Jian Wu for assistance with the design of ScanSSD-XYc. This material is based upon work supported by grants from the Alfred P. Sloan Foundation (G-2017-9827) and the National Science Foundation (IIS-1717997 (MathSeer) and 2019897 (MMLI)).

References

1. Ha, J., Haralick, R., Phillips, I.: Recursive X-Y cut using bounding boxes of connected components. In: Proceedings of 3rd International Conference on Document Analysis and Recognition, vol. 2, pp. 952–955 (1995). https://doi.org/10.1109/ICDAR.1995.602059
2. Liu, W., et al.: SSD: single shot MultiBox detector. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9905, pp. 21–37. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46448-0_2
3. Mahdavi, M., Zanibbi, R., Mouchere, H., Viard-Gaudin, C., Garain, U.: ICDAR 2019 CROHME + TFD: competition on recognition of handwritten mathematical expressions and typeset formula detection. In: Proceedings of the International Conference on Document Analysis and Recognition, ICDAR, pp. 1533–1538 (2019). https://doi.org/10.1109/ICDAR.2019.00247
4. Mali, P., Kukkadapu, P., Mahdavi, M., Zanibbi, R.: ScanSSD: scanning single shot detector for mathematical formulas in PDF document images. arXiv (2020)
5. Nagy, G., Seth, S.: Hierarchical representation of optically scanned documents. In: Proceedings - International Conference on Pattern Recognition, pp. 347–349 (1984)
6. Redmon, J., Farhadi, A.: YOLOv3: an incremental improvement (2018). http://arxiv.org/abs/1804.02767
7. Samet, H.: Foundations of Multidimensional and Metric Data Structures. Series on Computer Graphics and Geometric Modeling. Morgan Kaufmann, San Francisco (2005). (XY-cut described in Section 2.1.2)
8. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. In: International Conference on Learning Representations (2015)

Famous Companies Use More Letters in Logo: A Large-Scale Analysis of Text Area in Logo

Shintaro Nishi(B), Takeaki Kadota, and Seiichi Uchida

Kyushu University, Fukuoka, Japan
[email protected]

Abstract. This paper analyzes a large number of logo images from the LLD-logo dataset with recent deep learning-based techniques, to understand not only the design trends of logo images but also their correlation to the owner company. In particular, we focus on three correlations: between logo images and their text areas, between the text areas and the number of followers on Twitter, and between the logo images and the number of followers. Among various findings is a weak positive correlation between the text area ratio and the number of followers of the company. In addition, deep regression and deep ranking methods can catch correlations between the logo images and the number of followers.

Keywords: Logo image analysis · DeepCluster · RankNet

1 Introduction

Logo is a graphic design for the public identification of a company or an organization. Therefore, each logo is carefully created by a professional designer, while considering various aspects of the company. In other words, each logo represents the policy, history, philosophy, commercial strategy, etc., of the company, along with its visual publicity aim. Logos can be classified roughly into three types: logotype, logo symbol, and their mixture1. Figure 1 shows several examples of each type. A logotype is a logo comprised of only letters and often shows a company name or its initial letters. A logo symbol is a logo comprised of some abstract mark, icon, or pictogram. A mixed logo is comprised of a mixture of logotype and logo symbol.

Fig. 1. Logo image examples.

The general purpose of this paper is to analyze logo images from various aspects to understand the relationship among their visual appearance, text area, and the popularity of the company. For this purpose, we utilize various state-of-the-art techniques of text detection and image clustering, and a large logo dataset called LLD-logo [16]. To the authors' best knowledge, logo image analysis has not been attempted on a large scale or in an objective way using recent machine-learning-based image analysis techniques. Through the analysis, it is possible to reveal not only the trends of logo design but also the relationship between a company and its logo. The analysis results are helpful for designing new logos that are appropriate to the company's intention.

1 There are several variations in the logo type classification. For example, in [1], the three types are called "text-based logo", "iconic or symbolic logo", and "mixed logo", respectively.

The first analysis is how texts are used in logos. Specifically, we detect the text area in each logo image; if there is no text, the logo image is a logo symbol. If there is text only, it is a logotype. Otherwise, it is a mixed logo. Using 122,920 logo images from [16] and a recent text detection technique [4], it is possible to understand the ratio of the three logo classes. Moreover, it is possible to quantify the text area ratio and the text location in mixed logos.

The second and more important analysis is the relationship between the text area ratio and the popularity of the company. The logo images in the LLD-logo dataset are collected from Twitter; therefore, it is possible to know the number of followers, which is a good measure of the company's popularity. Since most logotypes are company names, the analysis will give a hint to answering whether famous companies tend to appeal their names in their logo or not. Since this relationship is very subtle, we try to catch it by a coarse view using DeepCluster [7].

The last analysis is the relationship between the whole logo image (including both text areas and symbols) and the company's popularity, by using regression analysis and ranking analysis. If it is possible to realize a regression function and/or a ranking function with reasonably high accuracy, they are very useful as references for better logo design that fits the company's popularity.

The main contributions of this paper are summarized as follows:
– This paper provides the first large-scale objective analysis of the relationship among logo images, texts in logos, and company popularity. The analysis results will give various hints for the logo design strategy.
– The robust estimation using DeepCluster revealed the positive correlation between the popularity of a company and the text area ratio in its logo.
– Regression and ranking analyses showed the possibility of estimating the absolute popularity (i.e., the number of followers) or the relative popularity from logo images; this result suggests we can evaluate the goodness of the logo design by the learned regression and ranking functions.
– In addition to the above results, we derive several reliable statistics, such as the ratio among three logo types (logotypes, logo symbols, and mixed logos) and the text area ratio and text location in logo images.

2 Related Work

2.1 Logo Design Analysis

Logo design has been studied from several aspects, especially marketing research. According to Adîr et al. [1], logos are classified into three types: "a text defined logo", "an iconic or symbolic logo", and "a mixed logo". In this paper, we call them logotype, logo symbol, and mixed logo, respectively. A brief survey by Zhao [23] discusses the recent diversity of logos. Sundar et al. [19] analyzed the effect of the vertical position of a logo on the package. Luffarelli et al. [12] analyzed how symmetry and asymmetry in logo design give different impressions by using hundreds of crowd-workers; they conclude that asymmetric logos give more arousing impressions. Recently, Luffarelli et al. [11] report an inspiring result that the descriptiveness of logos positively affects the brand evaluation. Here, descriptiveness means that the logo describes explicitly, by texts and illustrations, what the company is doing. This report, which is also based on subjective analysis using crowd-workers, gives insights into the importance of texts in logo design.

Computer science has recently started to deal with logo design, stimulated by public datasets of logo images. Earlier datasets, such as UMD-LogoDatabase2, were small. Nowadays, in contrast, dataset sizes have become much larger. WebLogo-2M [18] is the earliest large dataset prepared for the logo detection task. The dataset contains 194 different logos captured in 1,867,177 web images. Then, Sage et al. [16] published the LLD-logo dataset, along with the LLD-icons dataset. The former contains 122,920 logo images collected from Twitter and the latter 486,377 favicon images (32 × 32 pixels) collected via web-crawling. They used the dataset for logo synthesis. Recently, the Logo-2k+ dataset was released by Wang et al. [21], which contains 2,341 different logos in 167,140 images. The highlight of this dataset is that the logo images collected via web-crawling are classified into 10 company classes (Food, Clothes, Institution, Accessories, etc.). In [21], the logo images are applied to a logo classification task.

Research to quantify logo designs by some objective methodology is still not so common. One exceptional trial is Karamatsu et al. [9], where the favicons from LLD-icons are analyzed by a top-rank learning method for understanding the trends of favicon designs in each company type. This paper also attempts to quantify logo design by analyzing the relationships among logo images, the text areas in them, and their popularity (given as the number of followers on Twitter).

2 Not available now. According to [15], it contained 123 logo images.

2.2 Clustering and Ranking with Deep Representation

We will use DeepCluster [7] for clustering the LLD-logo images. Recently, the combination of clustering techniques with deep representations has become popular [2,14]. DeepCluster is a so-called self-supervised technique for clustering, i.e., an unsupervised learning task, and has already been extended to tackle new tasks. For example, Zhan et al. [22] developed an online deep clustering method to avoid the alternating training procedure in DeepCluster. Tang et al. [20] apply deep clustering to unsupervised domain adaptation tasks. In this paper, we combine DeepCluster with a learning-to-rank technique. Specifically, we combine RankNet [6] (originally a multi-layer perceptron) with the convolutional neural network trained as DeepCluster. The idea of RankNet has also been applied to many applications, such as image attractiveness evaluation [13]. Recent developments since RankNet are well summarized in [8].

3 Logo Image Dataset—LLD-logo [16]

In this paper, we use the LLD-logo dataset by Sage et al. [16]. LLD-logo contains 122,920 logo images collected from Twitter profile images using its API called Tweepy. Several careful filtering operations were made on the collection to exclude facial images, harmful images and illustrations, etc. The size of the individual logo images in the dataset is 63 × 63 to 400 × 400. The logo images in the LLD-logo dataset have two advantages over logo image datasets that contain camera-captured images, such as Logo-2K [21]. First, logo images from LLD-logo are so-called born-digital and thus have no disturbance from uneven light, geometric distortion, low resolution, blur, etc. The second advantage of LLD-logo is its meta-data, including the number of followers, which is a very good index for understanding the popularity of the individual companies3. Figure 2 shows the distribution of the number of followers of the companies in the LLD-logo dataset. Note that the bins of this histogram are logarithmic; this is a common way to analyze popularity, especially Twitter follower distributions [3,5,10,17]. This distribution shows that companies with 10²–10⁴ followers are the majority, and there are several exceptional companies with fewer than 10 followers or more than 10⁷ followers4.

4 Analysis 1: How Much Are Texts Used in Logo?

4.1 Text Detection in Logo Image

As shown in Fig. 1, logo images often include texts. Especially, a logotype is just a text (including a single letter); that is, the text fills 100% of the logo area.

3 The number of followers is provided in the file LLD-logo.hdf5, which is provided with the LLD-logo image data. Precisely, the resource of meta data/twitter/user objects in this hdf5 file contains the followers count data, which corresponds to the number of followers.
4 For the logarithmic plot, we exclude the companies with zero followers. The number of such companies is around 1,000 and thus has no serious effect on our discussion.


Fig. 2. Distribution of followers of the companies listed in LLD-logo dataset.

Fig. 3. Text detection from logo images by CRAFT [4]. The red boxes are the detected text areas, whereas the blue boxes are the bounding box of the whole logo image.

In contrast, a logo symbol does not include any text, and a mixed logo will have a text area ratio somewhere in between. In other words, by observing the text area ratio in the individual logo images, we can understand not only the frequency of the three types, but also the ratio trend within the mixed type. Recent scene text detection techniques allow us to extract the text area from a logo image, even when the texts are heavily decorated, rotated, or overlaid on a complex background. We use CRAFT [4], one of the state-of-the-art scene text detectors. Figure 3 shows several text detection results by CRAFT. They prove that CRAFT can detect the texts in logos very accurately, even with the above difficulties. We therefore use the text bounding boxes detected by CRAFT in the following experiments. Note that the detection results by CRAFT are not completely perfect. Especially in logo designs, the border between texts and illustrations is very ambiguous. For example, in the logo of "MARZANO", the calligraphic 'M' in the blue triangle is missed. In the logo of "Tutonics", the 'o' is not detected as a letter due to its illustration-like design. As indicated by those examples, we cannot expect perfect extraction results from logo images. Even so, CRAFT still extracts truly text parts quite accurately in most cases.

4.2 Text Area Ratio and the Number of Text Boxes

In the following analysis, we often use two simple metrics: the text area ratio and the number of text boxes. The text area ratio is the ratio of the text area to the whole logo area. The text area is specified by all the bounding boxes (red boxes in Fig. 3) in the logo image. The whole logo area, which is shown as the blue bounding box in Fig. 3, is defined by the bounding box of the design element of the logo. Therefore, the whole logo area is often smaller than the image size. For example, the whole logo area of the logo of "NS" is almost the same as the text bounding box, and thus its text area ratio is about 100%5. The number of text boxes is simply defined as the number of text boxes detected by CRAFT. Since CRAFT can separate a multi-line text into its lines and then give the boxes for each line, the number of text boxes increases with the number of lines. If there is sufficient space between two words in a text line, CRAFT gives different text boxes for the words. If not, CRAFT will give a single box, as shown in the "SpeedUp" logo.
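A minimal sketch of the text area ratio computation described above. Whether overlapping text boxes are counted once or summed is not specified in the text; this sketch rasterises the boxes so that overlaps are counted only once.

```python
import numpy as np

def text_area_ratio(text_boxes, logo_bbox):
    """Fraction of the whole-logo bounding box covered by detected text boxes.
    Boxes are (x1, y1, x2, y2) in pixels."""
    lx1, ly1, lx2, ly2 = logo_bbox
    mask = np.zeros((ly2 - ly1, lx2 - lx1), dtype=bool)
    for x1, y1, x2, y2 in text_boxes:
        mask[max(y1 - ly1, 0):y2 - ly1, max(x1 - lx1, 0):x2 - lx1] = True
    return mask.sum() / mask.size

print(text_area_ratio([(10, 10, 60, 30)], (0, 0, 100, 50)))  # -> 0.2
```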

4.3 The Ratio of Three Logo Types

As the first analysis, we classify each logo image into one of the three logo types using the text area ratio. This classification is straightforward: if the text area ratio is 0%, it is a logo symbol. If the text area ratio is more than 90%, we treat it as a logotype. Otherwise, it is a mixed logo. Note that we set 90% as the class boundary between logotype and mixed (instead of 100%) because the bounding boxes of the whole logo and the text area are not always completely the same, even for logotypes. The classification result over all 122,920 logos in the large logo set shows that the ratio of the three types is 4% (logotype) : 26% (logo symbol) : 70% (mixed). The fact that mixed logos are the clear majority indicates that most companies show their name with some symbol in their logo. Surprisingly, logotypes are a very small minority. Logo symbols are much more common than logotypes but less than half as common as mixed logos.
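The three-way classification rule just described can be written directly as:

```python
def logo_type(text_area_ratio):
    """No text -> logo symbol; text covering more than 90% of the logo
    area -> logotype; otherwise -> mixed logo."""
    if text_area_ratio == 0.0:
        return "logo symbol"
    if text_area_ratio > 0.9:
        return "logotype"
    return "mixed"

print(logo_type(0.25))  # -> "mixed"
```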

4.4 Distribution of the Text Area Ratio

Figure 4 (a) shows the histogram of the text area ratios, where logo symbols are excluded. This suggests that the text areas often occupy only 10–30% of the whole logo area. Figure 4 (b) shows a two-dimensional histogram to understand the relationship between the text area ratio and the number of text boxes. The histogram says that the most frequent case is a single text box with 10–30% occupancy. Logos with multiple text boxes exist but are not so common. It should also be noted that the text area ratio and the number of text boxes are not positively correlated; even if the ratio increases, the number of text boxes does not.

5 As we will see later, the text area ratio sometimes exceeds 100%. There are several reasons behind this; one major reason is the ambiguity of the whole logo's bounding box and the individual text areas' bounding boxes. If the latter slightly exceeds the former, the ratio exceeds 100%. Some elaborate post-processing might reduce those cases; however, they do not significantly affect our median-based analysis.


Fig. 4. (a) Distribution of the text area ratios. (b)Two-dimensional histogram showing the relationship between the text area ratio and the number of text boxes.

Fig. 5. Distribution of text regions in the vertical direction for different text area ratios. The frequency is normalized at each vertical bin (i.e., each text area ratio).

Fig. 6. Distributions of Twitter followers for logotype and logo symbol.

4.5 The Location of the Text Area

Figure 5 visualizes the vertical location of the text area for different text area ratios. Since the logo height is normalized for this visualization, the top and bottom of the histogram correspond to the top and bottom of logo images, respectively.

A ranking function r(x) should satisfy r(x1) > r(x2) when x1 should be ranked higher than x2. In our case, x represents a logo image, and if logo image x1 has more followers than x2, this condition should be satisfied. If we have an accurate ranking function r(x), it can predict relative popularity, such as r(x1) > r(x2) or r(x1) < r(x2), for a given pair of images {x1, x2}. RankNet realizes a nonlinear ranking function with its neural network framework. The original RankNet [6] is implemented as a multi-layer perceptron, and we substitute it with DenseNet-169 to deal with stronger nonlinearity and image inputs. The same training, validation, and testing sets as the regression


experiment are used. From those sets, 30,000, 10,000, and 20,000 "pairs" are randomly created and used for training, validating, and testing RankNet with DenseNet-169. The cross-entropy loss was used to train RankNet. Again, the standard early stopping rule (no improvement in the validation accuracy for 10 epochs) is used for termination. Unlike the regression task, the performance of the (bipartite) learning-to-rank task is evaluated by the successful ranking rate for the test pairs {x1, x2}. If x1 should be ranked higher than x2 and the learned RankNet correctly evaluates r(x1) > r(x2), this is a successful case. The chance rate is 50%. The successful ranking rates for the 30,000 training pairs and 10,000 test pairs are 60.13% and 57.19%, respectively. This is still surprisingly high accuracy, because RankNet determines the superiority or inferiority of two given logo images with about 60% accuracy (i.e., not 50%). This result also proves that the logo images themselves contain some factor that correlates with the number of followers.
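For reference, the pairwise RankNet objective used here can be sketched as follows; the exact implementation details (batching and the score network producing r(x)) are simplified, and only the loss computation is shown.

```python
import torch
import torch.nn.functional as F

def ranknet_pair_loss(score_hi, score_lo):
    """RankNet-style pairwise loss: the model should score the logo with more
    followers (score_hi) above the other one. P(correct order) is the sigmoid
    of the score difference, trained with cross-entropy against the label 1."""
    prob = torch.sigmoid(score_hi - score_lo)
    return F.binary_cross_entropy(prob, torch.ones_like(prob))

# Toy usage with scores from some backbone r(x) (DenseNet-169 in the paper).
s_hi = torch.tensor([2.3, 0.1])   # scores of the higher-ranked logos in two pairs
s_lo = torch.tensor([1.1, 0.8])   # scores of the lower-ranked logos
print(ranknet_pair_loss(s_hi, s_lo))
```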

7 Conclusion

This paper analyzed logo images from various viewpoints. Especially, we focused on three correlations: between logo images and their text areas, between the text areas and the number of followers on Twitter, and between the logo images and the number of followers. The first correlation analysis, with a state-of-the-art text detector [4], revealed that the ratio of logotypes, logo symbols, and mixed logos is 4%, 26%, and 70%, respectively. In addition, the ratio and the vertical location of the text areas were quantified. The second correlation analysis, with DeepCluster [7], revealed a weak positive correlation between the text area ratio and the number of followers. The third correlation analysis revealed that deep regression and deep ranking methods can catch some hints of the popularity (i.e., the number of followers) and the relative popularity, respectively, just from logo images. As summarized above, recent deep learning-based technologies, as well as large public logo image datasets, help to analyze complex visual designs, i.e., logo images, in an objective, large-scale, and reproducible manner. Further research attempts, such as the application of explainable AI techniques to deep regression and deep ranking, company-type-wise analysis, font style estimation, etc., will help to understand the experts' knowledge of logo design.

Acknowledgment. This work was supported by JSPS KAKENHI Grant Number JP17H06100.

References

1. Adîr, G., Adîr, V., Pascu, N.E.: Logo design and the corporate identity. Procedia Soc. Behav. Sci. 51, 650–654 (2012)


2. Aljalbout, E., Golkov, V., Siddiqui, Y., Strobel, M., Cremers, D.: Clustering with deep learning: taxonomy and new methods. arXiv preprint arXiv:1801.07648 (2018)
3. Ardon, S., et al.: Spatio-temporal analysis of topic popularity in twitter. arXiv preprint arXiv:1111.2904 (2011)
4. Baek, Y., Lee, B., Han, D., Yun, S., Lee, H.: Character region awareness for text detection. In: CVPR (2019)
5. Bakshy, E., Hofman, J.M., Mason, W.A., Watts, D.J.: Everyone's an influencer: quantifying influence on twitter. In: ACM WSDM (2011)
6. Burges, C., et al.: Learning to rank using gradient descent. In: ICML (2005)
7. Caron, M., Bojanowski, P., Joulin, A., Douze, M.: Deep clustering for unsupervised learning of visual features. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) Computer Vision – ECCV 2018. LNCS, vol. 11218, pp. 139–156. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01264-9_9
8. Guo, J., et al.: A deep look into neural ranking models for information retrieval. Inf. Proc. Manag. 57(6), 102067 (2020)
9. Karamatsu, T., Suehiro, D., Uchida, S.: Logo design analysis by ranking. In: ICDAR (2019)
10. Lerman, K., Ghosh, R., Surachawala, T.: Social contagion: an empirical study of information spread on digg and twitter follower graphs. arXiv preprint arXiv:1202.3162 (2012)
11. Luffarelli, J., Mukesh, M., Mahmood, A.: Let the logo do the talking: the influence of logo descriptiveness on brand equity. J. Marketing Res. 56(5), 862–878 (2019)
12. Luffarelli, J., Stamatogiannakis, A., Yang, H.: The visual asymmetry effect: an interplay of logo design and brand personality on brand equity. J. Marketing Res. 56(1), 89–103 (2019)
13. Ma, N., Volkov, A., Livshits, A., Pietrusinski, P., Hu, H., Bolin, M.: An universal image attractiveness ranking framework. In: WACV (2019)
14. Min, E., Guo, X., Liu, Q., Zhang, G., Cui, J., Long, J.: A survey of clustering with deep learning: from the perspective of network architecture. IEEE Access 6, 39501–39514 (2018)
15. Neumann, J., Samet, H., Soffer, A.: Integration of local and global shape analysis for logo classification. Patt. Recog. Lett. 23(12), 1449–1457 (2002)
16. Sage, A., Agustsson, E., Timofte, R., Van Gool, L.: Logo synthesis and manipulation with clustered generative adversarial networks. In: CVPR (2018)
17. Stringhini, G., et al.: Follow the green: growth and dynamics in twitter follower markets. In: Internet Measurement Conference (2013)
18. Su, H., Gong, S., Zhu, X.: WebLogo-2M: scalable logo detection by deep learning from the web. In: ICCVW (2017)
19. Sundar, A., Noseworthy, T.J.: Place the logo high or low? Using conceptual metaphors of power in packaging design. J. Marketing 78(5), 138–151 (2014)
20. Tang, H., Chen, K., Jia, K.: Unsupervised domain adaptation via structurally regularized deep clustering. In: CVPR (2020)
21. Wang, J., et al.: Logo-2k+: a large-scale logo dataset for scalable logo classification. In: AAAI (2020)
22. Zhan, X., Xie, J., Liu, Z., Ong, Y.S., Loy, C.C.: Online deep clustering for unsupervised representation learning. In: CVPR (2020)
23. Zhao, W.: A brief analysis on the new trend of logo design in the digital information era. In: ESSAEME (2017)

MediTables: A New Dataset and Deep Network for Multi-category Table Localization in Medical Documents

Akshay Praveen Deshpande, Vaishnav Rao Potlapalli, and Ravi Kiran Sarvadevabhatla(B)

Centre for Visual Information Technology, International Institute of Information Technology, Hyderabad (IIIT-H), Hyderabad 500032, India
[email protected]
https://github.com/atmacvit/meditables

Abstract. Localizing structured layout components such as tables is an important task in document image analysis. Numerous layout datasets with document images from various domains exist. However, healthcare and medical documents represent a crucial domain that has not been included so far. To address this gap, we contribute MediTables, a new dataset of 200 diverse medical document images with multi-category table annotations. MediTables contains a wide range of medical document images with variety in capture quality, layouts, skew, occlusion and illumination. The dataset images include pathology, diagnostic and hospital-related reports. In addition to document diversity, the dataset includes implicitly structured tables that are typically not present in other datasets. We benchmark state of the art table localization approaches on the MediTables dataset and introduce a custom-designed U-Net which exhibits robust performance while being drastically smaller in size compared to strong baselines. Our annotated dataset and models represent a useful first step towards the development of focused systems for medical document image analytics, a domain that mandates robust systems for reliable information retrieval. The dataset and models can be accessed at https://github.com/atmacvit/meditables.

Keywords: Document analysis · Table localization · Healthcare · Medical · Semantic segmentation · Instance segmentation

1 Introduction

Document tables have been an efficient and effective technique for communicating structured information. With the advent of digital media, most document tables exist in PDF documents. Consequently, there have been efforts to develop algorithms for machine-based detection and understanding of tabular content from such media.



Fig. 1. Samples from our MediTables dataset. T1 and T2 represent two different kinds of table annotations. The diversity of the medical documents and table configurations is visible in the figure.

To understand and develop efficient, accurate algorithms for table localization and understanding, diverse datasets of document images have been created and made available to the research community. In general, tables appear in varying formats and layouts which thwart heuristic approaches. Of late, deep learning [12] has proven to be a powerful mechanism for obtaining state of the art results on many computer vision tasks, including table understanding. While the existing datasets cover a number of domains, documents related to the healthcare and medical domain are conspicuously absent. We seek to address this gap by contributing a new, annotated dataset. The documents in this dataset pose challenges not encountered in other datasets; thus, this dataset adds to the diversity of the dataset pool. Most of the available document datasets tend to contain similar levels of illumination and capture quality. Due to qualitative differences between existing datasets and medical documents, we also discover that pre-trained, deep-learning based table localization models trained on existing datasets do not generalize sufficiently to documents from the medical domain. Therefore, we also introduce a customized deep network for table localization in medical domain documents. Specifically, we make the following contributions:
– A medical document image dataset called MediTables with annotations for two different types of tables.
– A deep learning model for table localization in medical document images.
The dataset and models can be accessed at https://github.com/atmacvit/meditables.

2 Related Work

Approaches for table localization characterize the problem either as a detection problem (i.e., identify axis-aligned bounding boxes for tables) or as a segmentation problem (i.e., obtain pixel-level labeling for tables). We review the literature related to these two approaches below.

Table Detection: There has been substantial work done in the wider context of table detection in document images and PDFs. Many of the earlier approaches were heuristic-based. Ha et al. [7] proposed the Recursive X-Y cut algorithm, which recursively decomposes the document into blocks that are then used to build an X-Y tree, which is later used for segmentation. Kieninger et al. [11] proposed a system called T-Recs which uses bottom-up clustering of word entities and word geometries into blocks for the segmentation of tables. Yildiz et al. [25] introduced a system known as pdf2table which uses a tool called pdf2html to convert the PDF file to its XML counterpart containing information regarding the absolute location of text chunks. This XML, along with certain heuristic rules, is then used to perform table localization. Fang et al. [1] also presented a method which works on PDF documents. They use a four-step approach for performing table detection: first, they parse PDF files and perform layout analysis; separators and delimiting information are then mined to localize tables. Hao et al. [8] propose a method where table regions are initially selected based on some heuristic rules and a convolutional neural network is then trained, which is used to classify the selected regions into tables and non-tables. Huang et al. [9] use a modified YOLO framework for table localization with a post-processing step for additional performance improvement. Schreiber et al. [20] use a fine-tuned version of the popular object detection framework Faster RCNN [4] to detect tables in document images. Siddiqui et al. [21] propose a framework called DeCNT; in this approach, they combine a deformable CNN with an RCNN or an FCN instead of a conventional CNN. Gilani et al. [3] perform various distance transforms on the original image to create an alternative representation which is subsequently fed into a Faster R-CNN model for detecting tables.

Table Segmentation: Yang et al. [24] propose a multi-modal Fully Convolutional Network which segments document images into various page elements (text, charts, tables) using both image information and the underlying text in the image. Kavasidis et al. [10] model table localization as a semantic segmentation problem and use a Fully Convolutional Network pretrained on saliency detection datasets to develop visual cues similar to those of tables and graphs. The predictions of the network are further refined using Conditional Random Fields (CRFs). CascadeTabNet [14] is a Mask-RCNN based network trained to localize table regions in reasonably structured documents (Fig. 2).

3 MediTables Dataset

We introduce MediTables, a dataset of 200 multi-class annotated medical document images. The document images were scraped from various sources on the


Fig. 2. The architecture for our modified U-Net. Table 1. Table coverage by type in MediTables. Type

# of documents # of tables Total Train Validation Test Total Train Validation Test

T1 and T2 79 73 T1 only 48 T2 only

60 43 27

7 15 8

12 15 13

– 190 140

– 126 100

– 29 15

– 35 25

Total

130

30

40

330

226

44

60

200

internet. Our dataset contains a wide range of medical document images with variety in capture quality, layouts, skew, occlusion and illumination. They are a good representation of prevalent healthcare, medical images such as pathology, diagnostic and hospital-related reports. There are two kinds of annotated tables in our dataset (see Fig. 1): – The first kind of table, hereafter referred to as T1, are tables which follow a conventional layout and tend to have some sort of demarcation between rows or between columns. – The other kind of table, hereafter referred to as T2, consists of formatted data in key-value format which are usually found for fields such as names, identification numbers, addresses, age, etc. The inclusion of such annotations can facilitate efficient retrieval of crucial meta-information from medical document images. Additional statistics related to the dataset can be seen in Table 1.

4 The Modified U-Net Deep Network

Due to the informal nature of document capture and consequent distortions induced, table layouts in our dataset are non-rectangular. Therefore, we model the task as a semantic segmentation problem and use a custom-designed UNet [18] to localize tables. We chose a U-Net based model due to its success for segmentation tasks trained on small datasets. U-Net uses skip connections, which improves gradient flow and allows stable weight updates. The original U-Net [18] has four skip connections across the network and requires cropping to meet the size demands for establishing skip connections. We modify the original U-Net to exclude the cropping in the cropping and copying step. This is because the skip connections are performed across layers that have the same spatial size. Thus, our modified U-Net has only three skip connections. Another difference is that we use a single convolution layer per down-sampling layer compared to double convolutions used in the original U-Net since we found performance to be empirically better with this design choice. The model consists of a contraction and an expansion section. The contraction section of the model consists of 4 convolutional layers with 3 × 3 filters of 64, 128, 256 and 512 channels successively. Each layer is followed by a Rectified Linear Unit (ReLU) activation layer. The convolutions are all single stride with a padding size of 1. The outputs of each convolution layer are max-pooled using a 2×2 kernel. The expansion section consists of 3 convolutional layers which perform upsampling using a 2 × 2 kernel with a padding of 1. After each upsampling step, features from the contraction section that have the same spatial dimensions as the current set of features are concatenated. The output, which has the same spatial dimensions as the input image, is obtained by applying 1 × 1 convolution filters on the feature output of the expansion section.
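The description above leaves a few details open (for example, the exact upsampling operator and which levels are pooled). The following PyTorch sketch is one consistent reading: single 3 × 3 convolutions with 64/128/256/512 channels, pooling after the first three levels so that three 2 × 2 upsampling steps and three skip connections restore the input resolution, and a final 1 × 1 convolution. The use of transposed convolutions and the three output classes (background, T1, T2) are assumptions, not details taken from the paper.

```python
import torch
import torch.nn as nn

class ModifiedUNetSketch(nn.Module):
    """Sketch of the modified U-Net described above: one conv per contraction
    level, three skip connections, 1x1 output convolution."""
    def __init__(self, in_ch=3, n_classes=3):
        super().__init__()
        self.c1 = nn.Conv2d(in_ch, 64, 3, stride=1, padding=1)
        self.c2 = nn.Conv2d(64, 128, 3, stride=1, padding=1)
        self.c3 = nn.Conv2d(128, 256, 3, stride=1, padding=1)
        self.c4 = nn.Conv2d(256, 512, 3, stride=1, padding=1)
        self.u3 = nn.ConvTranspose2d(512, 256, 2, stride=2)
        self.u2 = nn.ConvTranspose2d(512, 128, 2, stride=2)
        self.u1 = nn.ConvTranspose2d(256, 64, 2, stride=2)
        self.out = nn.Conv2d(128, n_classes, 1)
        self.pool, self.relu = nn.MaxPool2d(2), nn.ReLU(inplace=True)

    def forward(self, x):
        f1 = self.relu(self.c1(x))                  # H,   64 channels
        f2 = self.relu(self.c2(self.pool(f1)))      # H/2, 128
        f3 = self.relu(self.c3(self.pool(f2)))      # H/4, 256
        f4 = self.relu(self.c4(self.pool(f3)))      # H/8, 512
        x = torch.cat([self.u3(f4), f3], dim=1)     # skip 1 -> H/4, 512
        x = torch.cat([self.u2(x), f2], dim=1)      # skip 2 -> H/2, 256
        x = torch.cat([self.u1(x), f1], dim=1)      # skip 3 -> H,   128
        return self.out(x)                          # per-pixel class logits

logits = ModifiedUNetSketch()(torch.zeros(1, 3, 512, 512))  # -> (1, 3, 512, 512)
```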

5 Experiment Setup

For our experiments, we consider multiple popular table localization datasets as described below.

5.1 Datasets

Marmot: The Marmot table recognition dataset consists of 2000 PDF document images with diverse page layouts and tabular formats. The Marmot layout analysis of fixed-layout documents dataset consists of 244 clean document images and comprises 17 labels for fragments in a document. From this, we selected 400 relevant images with tables. UNLV: The UNLV dataset [22] contains 2889 scanned document images from sources such as newspapers and business letters. The resolution of these images is 200 to 300 DPI. We use the subset of 427 images containing tables for our experiments.


Table 2. Performance comparison of original U-Net and modified U-Net evaluated on the validation set of MediTables.

| Model          | Loss              | IoU (%) | PPA (%) | F1 (%) |
| Original U-Net | L_BCE             | 71.57   | 86.31   | 81.26  |
|                | L_IoU             | 74.78   | 88.74   | 85.10  |
|                | L_BCE + L_IoU     | 76.21   | 89.39   | 86.20  |
| Modified U-Net | L_BCE             | 73.20   | 88.20   | 83.90  |
|                | L_IoU             | 75.72   | 89.60   | 85.40  |
|                | L_BCE + L_IoU     | 75.81   | 90.06   | 87.08  |

UW3: The dataset consists of 1600 scanned, skew-corrected document images. We selected 120 document images which contain at least one table. ICDAR Datasets: We use 124 documents from the ICDAR 2013 table detection competition [5]. Additionally, we use 549 document images with tables from the ICDAR 2017 Page Object Detection dataset [2]. TableBank: Recently, researchers from Beihang University and Microsoft Research Asia collected TableBank [13], the largest document-image dataset with tables. It consists of 417,000 labeled tables and their clean source documents. To augment the datasets, we applied various standard augmentations to the images, such as Gaussian blurring, rotation, salt-and-pepper noise, Poisson noise and affine transformations. The augmentations were applied randomly to obtain a dataset consisting of 52,482 images.

5.2 Training and Implementation Details

To begin with, we trained our modified U-Net model using combined data from four existing datasets (Sect. 5.1). The resulting model was fine-tuned on the training set of 130 images sourced from our MediTables dataset. All images and the corresponding label map targets were resized to 512 × 512. For the optimization of our network, we used the popular Adam optimizer with a learning rate of 5 × 10⁻⁴ and a mini-batch size of 16. The training was conducted in two phases. In the first phase of training, we used the per-pixel binary cross-entropy loss

$L_{BCE} = -y \log(\bar{y}) - (1 - y) \log(1 - \bar{y})$,

where $y$ is the ground-truth label and $\bar{y}$ is the prediction, for 15 epochs. In the second phase (i.e., from the 16th epoch onwards), we included the logarithmic version of the IoU loss [16]

$L_{IoU} = -\ln\left(\frac{\hat{X} \cap X}{\hat{X} \cup X}\right) \qquad (1)$

where $\hat{X}$ is the predicted image mask and $X$ is the corresponding ground-truth label mask. In contrast with the per-pixel cross-entropy loss, the IoU loss optimizes for the table regions in a more direct manner and turns out to be crucial for overall performance, as we shall see shortly.

Table 3. Performance comparison between pre-trained and fine-tuned models (PPA is not defined for detection models).

(a) Models trained only on existing document datasets and evaluated on the test set of MediTables.

| Model                 | IoU (%)       | PPA (%)       | F1 (%)        |
| TableBank [13]        | 89.27 ± 18.21 | NA            | 90.26 ± 17.24 |
| YOLO-v3 [17]          | 19.85 ± 08.10 | NA            | 17.44 ± 07.09 |
| pix2pixHD [23]        | 21.06 ± 02.44 | 90.24 ± 06.13 | 42.37 ± 06.32 |
| CascadeTabNet [14]    | 83.08 ± 19.00 | 95.70 ± 07.00 | 93.05 ± 01.00 |
| Modified U-Net (Ours) | 21.32 ± 16.16 | 71.14 ± 22.03 | 32.44 ± 21.10 |

(b) Models pre-trained on existing document datasets and fine-tuned on MediTables.

| Model                 | IoU (%)       | PPA (%)       | F1 (%)        |
| TableBank [13]        | 95.15 ± 04.84 | NA            | 97.10 ± 01.36 |
| YOLO-v3 [17]          | 45.48 ± 25.47 | NA            | 45.34 ± 11.41 |
| pix2pixHD [23]        | 88.16 ± 06.91 | 97.61 ± 01.51 | 97.63 ± 00.74 |
| CascadeTabNet [14]    | 89.61 ± 17.82 | 97.16 ± 05.26 | 95.07 ± 08.00 |
| Modified U-Net (Ours) | 96.77 ± 02.03 | 99.51 ± 00.21 | 99.48 ± 00.12 |

The final loss L, used during the second phase of training, is a weighted combination of the aforementioned losses, i.e., $L = \lambda_1 L_{BCE} + \lambda_2 L_{IoU}$ with $\lambda_1 = 1$ and $\lambda_2 = 20$. The network is trained for a total of 58 epochs. Finally, we combined the training and validation sets and re-trained the model with the hyper-parameters and stopping criteria determined from the validation set experiments. The final model is evaluated on a disjoint test set (40 images). The entire training was performed using four Nvidia GeForce GTX 1080 Ti GPUs.
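For illustration, here is a minimal sketch of the two-phase loss schedule described above, assuming PyTorch and a soft (sigmoid-based) differentiable IoU; the epsilon constant and the reduction over the batch are our choices, not details given in the paper.

```python
import torch
import torch.nn.functional as F

def log_iou_loss(pred_logits, target, eps=1e-6):
    """Logarithmic IoU loss -ln(|X_hat ∩ X| / |X_hat ∪ X|), computed on
    soft (sigmoid) masks so that it remains differentiable."""
    pred = torch.sigmoid(pred_logits)
    inter = (pred * target).sum(dim=(1, 2, 3))
    union = (pred + target - pred * target).sum(dim=(1, 2, 3))
    return -torch.log((inter + eps) / (union + eps)).mean()

def training_loss(pred_logits, target, epoch, lambda_bce=1.0, lambda_iou=20.0):
    """Phase 1 (first 15 epochs): per-pixel BCE only.
    Phase 2 (16th epoch onwards): weighted BCE + log-IoU combination."""
    bce = F.binary_cross_entropy_with_logits(pred_logits, target)
    if epoch < 15:
        return bce
    return lambda_bce * bce + lambda_iou * log_iou_loss(pred_logits, target)
```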

6 Experiments and Analysis

For all experiments, we compare performance using the standard measures – Intersection over Union (IoU) [19], Per-pixel Average (PPA) [15] and F1 (Dice) score [6] – averaged over the evaluation set. The IoU is calculated between the tabular part of the document image and the nearest tabular mask in the model's output. The F1 score is calculated using the conventional Dice coefficient formula for tabular regions of the document and the corresponding model outputs.

Modified v/s Original U-Net: To begin with, we compared the performance of the original and the modified U-Net. As Table 2 shows, our modified U-Net has a small but significant performance advantage over the original version, justifying its choice. Our modified U-Net is also considerably smaller than the original (Table 4).

Table 4. Comparison of models by number of parameters.

| Model                 | # parameters (M = million) |
| TableBank [13]        | 17M  |
| YOLO-v3 [17]          | 65M  |
| pix2pixHD [23]        | 188M |
| CascadeTabNet [14]    | 83M  |
| Original U-Net [18]   | 8M   |
| Modified U-Net (Ours) | 3.5M |
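For reference, a small sketch of how such per-image measures can be computed from binary masks with NumPy; this is our own illustration of the standard definitions (we read PPA as the fraction of correctly labelled pixels, an assumption), not the authors' evaluation code.

```python
import numpy as np

def iou(pred, gt):
    """Intersection over Union between two binary masks."""
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return inter / union if union > 0 else 1.0

def per_pixel_accuracy(pred, gt):
    """Fraction of pixels whose predicted label matches the ground truth."""
    return (pred == gt).mean()

def f1_dice(pred, gt):
    """Dice coefficient / F1 score for the tabular (foreground) class."""
    inter = np.logical_and(pred, gt).sum()
    denom = pred.sum() + gt.sum()
    return 2 * inter / denom if denom > 0 else 1.0
```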


As mentioned previously, the task of table localization can be viewed either as a table detection task or as a table segmentation task. Consequently, we compared performance in these two task settings by customized training of two popular object detection models and three semantic segmentation models.

Detection Models: For our experiments, we used TableBank [13], an open-source table detector trained on 163,417 MS Word documents and 253,817 LaTeX documents. We fine-tuned TableBank directly using the MediTables training set. We also trained a YOLO-v3 [17] object detection model. Unlike TableBank training, we followed the protocol used for our proposed model (i.e., pre-training on existing document datasets) (Sect. 5.2).

Segmentation Models: We trained pix2pixHD [23], a popular pixel-level image translation model. Keeping the relatively small size of our dataset in mind, we trained a scaled-down version. In addition, we trained CascadeTabNet [14], a state-of-the-art segmentation model developed specifically for table segmentation. As a post-processing step, we performed morphological closing on the results from the segmentation approaches for noise reduction and for filling holes in the output masks.

As a preliminary experiment, we examined performance when the models pre-trained on existing document datasets were directly evaluated on the MediTables test set (i.e., without any fine-tuning). As Table 3a shows, recent models developed specifically for table localization (CascadeTabNet [14], TableBank [13]) show good performance. Our modified U-Net's performance is relatively inferior, likely due to its inability to directly bridge the domain gap between existing datasets and medical domain documents. Upon fine-tuning the document-dataset pre-trained models with data from the target domain (i.e., the MediTables dataset), a very different picture emerges (Table 3b). Our proposed approach (modified U-Net) outperforms strong baselines and existing approaches, including the models customized for table localization. We hypothesize that this is due to the ability of our modified U-Net to judiciously utilize the within-domain training data to close the gap between the pre-trained and fine-tuned settings in an effective manner. Another important observation is that the deviation from the average for our model is typically the smallest among all models. Finally, it is important to note that this superior performance is achieved even though our model is drastically smaller than the customized and state-of-the-art baselines (see Table 4).


Fig. 3. Table localization results of the various models for the images with the highest IoU scores according to our model's predictions.

Performance Comparison by Table Type: The previous experiments focused on evaluation over all tables. To examine performance by table type (T1, T2), the previous 2-class (table, background) formulation was replaced by a 3-class prediction setup (T1, T2 and background). As Table 5 shows, our modified U-Net once again performs best across table types and measures. Figure 3 shows qualitative results on the images with the highest IoU score as per our model's predictions. The superior quality of our results is evident. A similar set of results, broken down by table type, can be viewed in Fig. 4. Images with the lowest IoU score as per our model's predictions can be viewed in Fig. 5. These represent the most challenging images. As mentioned previously, a high degree of skew and a small footprint of the table in the image generally affect our model's performance. However, even for these images, the quality of the results is quite acceptable.


Table 5. Per-table type performance of models pre-trained on existing document datasets and fine-tuned on MediTables (PPA is not defined for detection models).

| Model                 | T1 IoU (%)    | T1 PPA (%)    | T1 F1 (%)     | T2 IoU (%)    | T2 PPA (%)    | T2 F1 (%)     |
| TableBank [13]        | 92.59 ± 11.27 | NA            | 95.75 ± 00.07 | 84.90 ± 16.20 | NA            | 90.95 ± 10.30 |
| YOLO-v3 [17]          | 30.88 ± 02.83 | NA            | 47.18 ± 14.47 | 56.38 ± 07.58 | NA            | 72.10 ± 06.19 |
| pix2pixHD [23]        | 85.42 ± 01.65 | 92.74 ± 06.82 | 92.13 ± 06.38 | 92.67 ± 01.34 | 99.10 ± 00.89 | 96.19 ± 00.71 |
| CascadeTabNet [14]    | 92.66 ± 10.87 | 93.22 ± 10.37 | 95.84 ± 06.39 | 93.44 ± 10.21 | 93.76 ± 10.22 | 96.29 ± 06.14 |
| Modified U-Net (Ours) | 95.48 ± 03.82 | 99.08 ± 00.91 | 97.69 ± 01.96 | 94.30 ± 00.30 | 99.62 ± 00.37 | 97.06 ± 01.61 |

Fig. 4. Tables T1 (green) and T2 (red) segmentation results of three semantic segmentation models and two object detection models on the testing set of MediTables. Note that predictions (colors) may also be incorrect in terms of table type labels (T1,T2) in some instances. (Color figure online)


Fig. 5. Table localization results of the various models for the images with the lowest IoU scores according to our model's predictions.

7 Conclusion

In this paper, we have presented a dataset of diverse healthcare and medical document images. We hope that our efforts encourage the community to expand our dataset and build upon our findings to enable a richer understanding of medical document images. Given its distinct nature, we also expect our dataset to be considered along with existing datasets when benchmarking new table localization approaches in the future. We have also proposed a compact yet high-performing approach for localizing and categorizing tables in medical documents. The performance of the proposed approach is greatly facilitated by our choice of a segmentation network (as opposed to a detection network), the skip-connectivity for enhanced gradient flow, and our choice of losses and training procedure. Our model has the potential to operate as the first step in a processing pipeline for understanding tabular content in medical documents. Another significant advantage is the compact size of our model, making it potentially attractive for deployment on mobile and embedded devices.

References

1. Fang, J., Gao, L., Bai, K., Qiu, R., Tao, X., Tang, Z.: A table detection method for multipage pdf documents via visual seperators and tabular structures. In: 2011 International Conference on Document Analysis and Recognition, pp. 779–783. IEEE (2011)


2. Gao, L., Yi, X., Jiang, Z., Hao, L., Tang, Z.: ICDAR 2017 competition on page object detection. In: 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), vol. 1, pp. 1417–1422. IEEE (2017)
3. Gilani, A., Qasim, S.R., Malik, I., Shafait, F.: Table detection using deep learning. In: 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), vol. 1, pp. 771–776. IEEE (2017)
4. Girshick, R.: Fast R-CNN. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 1440–1448 (2015)
5. Göbel, M., Hassan, T., Oro, E., Orsi, G.: ICDAR 2013 table competition. In: 2013 12th International Conference on Document Analysis and Recognition, pp. 1449–1453. IEEE (2013)
6. Goyal, M., Yap, M.H., Hassanpour, S.: Multi-class semantic segmentation of skin lesions via fully convolutional networks. arXiv preprint arXiv:1711.10449 (2017)
7. Ha, J., Haralick, R.M., Phillips, I.T.: Recursive X-Y cut using bounding boxes of connected components. In: Proceedings of 3rd International Conference on Document Analysis and Recognition, vol. 2, pp. 952–955. IEEE (1995)
8. Hao, L., Gao, L., Yi, X., Tang, Z.: A table detection method for PDF documents based on convolutional neural networks. In: 2016 12th IAPR Workshop on Document Analysis Systems (DAS), pp. 287–292. IEEE (2016)
9. Huang, Y., et al.: A YOLO-based table detection method. In: 2019 International Conference on Document Analysis and Recognition (ICDAR), pp. 813–818. IEEE (2019)
10. Kavasidis, I., et al.: A saliency-based convolutional neural network for table and chart detection in digitized documents. In: Ricci, E., Rota Bulò, S., Snoek, C., Lanz, O., Messelodi, S., Sebe, N. (eds.) ICIAP 2019. LNCS, vol. 11752, pp. 292–302. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-30645-8_27
11. Kieninger, T., Dengel, A.: The T-Recs table recognition and analysis system. In: Lee, S.-W., Nakano, Y. (eds.) DAS 1998. LNCS, vol. 1655, pp. 255–270. Springer, Heidelberg (1999). https://doi.org/10.1007/3-540-48172-9_21
12. LeCun, Y., Bengio, Y., Hinton, G.: Deep learning. Nature 521(7553), 436 (2015)
13. Li, M., Cui, L., Huang, S., Wei, F., Zhou, M., Li, Z.: TableBank: table benchmark for image-based table detection and recognition. arXiv preprint arXiv:1903.01949 (2019)
14. Prasad, D., Gadpal, A., Kapadni, K., Visave, M., Sultanpure, K.: CascadeTabNet: an approach for end to end table detection and structure recognition from image-based documents (2020)
15. Prusty, A., Aitha, S., Trivedi, A., Sarvadevabhatla, R.K.: Indiscapes: instance segmentation networks for layout parsing of historical Indic manuscripts. In: 2019 International Conference on Document Analysis and Recognition (ICDAR), pp. 999–1006. IEEE (2019)
16. Rahman, M.A., Wang, Y.: Optimizing intersection-over-union in deep neural networks for image segmentation. In: Bebis, G., et al. (eds.) ISVC 2016. LNCS, vol. 10072, pp. 234–244. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-50835-1_22
17. Redmon, J., Farhadi, A.: YOLOv3: an incremental improvement. arXiv preprint arXiv:1804.02767 (2018)
18. Ronneberger, O., Fischer, P., Brox, T.: U-Net: convolutional networks for biomedical image segmentation. In: Navab, N., Hornegger, J., Wells, W.M., Frangi, A.F. (eds.) MICCAI 2015. LNCS, vol. 9351, pp. 234–241. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-24574-4_28


19. Sarvadevabhatla, R.K., Dwivedi, I., Biswas, A., Manocha, S.: SketchParse: towards rich descriptions for poorly drawn sketches using multi-task hierarchical deep networks. In: Proceedings of the 25th ACM International Conference on Multimedia, pp. 10–18 (2017)
20. Schreiber, S., Agne, S., Wolf, I., Dengel, A., Ahmed, S.: DeepDeSRT: deep learning for detection and structure recognition of tables in document images. In: 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), vol. 1, pp. 1162–1167. IEEE (2017)
21. Siddiqui, S.A., Malik, M.I., Agne, S., Dengel, A., Ahmed, S.: DeCNT: deep deformable CNN for table detection. IEEE Access 6, 74151–74161 (2018)
22. Taghva, K., Nartker, T., Borsack, J., Condit, A.: UNLV-ISRI document collection for research in OCR and information retrieval 3967 (2000)
23. Wang, T.C., Liu, M.Y., Zhu, J.Y., Tao, A., Kautz, J., Catanzaro, B.: pix2pixHD: high-resolution image synthesis and semantic manipulation with conditional GANs
24. Yang, X., Yumer, E., Asente, P., Kraley, M., Kifer, D., Lee Giles, C.: Learning to extract semantic structure from documents using multimodal fully convolutional neural networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 5315–5324 (2017)
25. Yildiz, B., Kaiser, K., Miksch, S.: pdf2table: a method to extract table information from PDF files. In: IICAI, pp. 1773–1785 (2005)

Online Analysis of Children Handwritten Words in Dictation Context

Omar Krichen(B), Simon Corbillé, Eric Anquetil, Nathalie Girard, and Pauline Nerdeux

Univ Rennes, CNRS, IRISA, 35000 Rennes, France
{omar.krichen,simon.corbille,eric.anquetil,nathalie.girard,pauline.nerdeux}@irisa.fr

Abstract. This paper presents a method for fine analysis of children's handwriting on pen-based tablets. This work is in the context of the P2IA project, funded by the French government, which aims at designing a virtual notebook to foster handwriting learning for primary school pupils. In this work, we consider the task of analysing handwritten words in the context of a dictation exercise. This task is complex due to different factors: the children do not yet master the morphological aspects of handwriting, nor do they master orthography or translating phonetic sounds into actual graphemes (parts of words). In order to tackle this problem, we extend to the context of dictation exercises an analysis engine that was previously developed to deal with copying exercises. Two strategies were developed. The first one is a baseline approach and relies on a double child input: the pupil types the word on a virtual keyboard after writing it with the stylus, so that prior knowledge of the written word drives the engine analysis. The second one relies on a single input: the child's handwritten strokes. To drive the analysis, this strategy consists in generating hypotheses that are phonetically similar to the dictated instruction, which act as probable approximations of the written word (sequence of letters) and cover potential orthographic mistakes by the pupil. To assist the learning process of the pupils, the engine returns different types of real-time feedback, depending on the confidence of the analysis process (confident assessment of errors, warning, or reject).

Keywords: Handwriting recognition · Online handwriting · Digital learning

1 Introduction

This work is part of an Innovation and Artificial Intelligence Partnership (P2IA¹), which supports the construction of solutions serving fundamental learning in French and mathematics in cycle 2 (CP, CE1, CE2).

¹ https://eduscol.education.fr/1911/partenariat-d-innovation-et-intelligenceartificielle-p2ia.

© Springer Nature Switzerland AG 2021. E. H. Barney Smith and U. Pal (Eds.): ICDAR 2021 Workshops, LNCS 12916, pp. 125–140, 2021. https://doi.org/10.1007/978-3-030-86198-8_10


Here we are interested in defining a solution to help elementary school students learn spelling. We propose a method to automatically analyse the production of handwritten words, written on pen tablets, in the context of dictation exercises. The pedagogical foundation of this work lies in several studies that demonstrate the positive impact of using educational systems in the classroom, especially pen-based tablets. In a critical review study [10], the authors reported that, among 12 highly trustworthy studies, 9 observed positive learning outcomes for the pupils, whereas 3 observed no difference in learning outcomes between the tablet setup and the traditional pen-and-paper one. Moreover, the authors of [9] demonstrate that providing prompt feedback, which is facilitated by the use of digital tools, is a key factor in improving learning performance. In this context, defining a solution to help learning spelling involves different tasks. The first is to recognise children's handwritten words, and the second is to understand potential mistakes in order to give children appropriate feedback for each type of mistake. The task of analysing children's handwritten words is an open challenge. Indeed, even if deep learning based methods have made great progress in handwritten word recognition [13,14], most of them are targeted at adult data and are not suited to cope with children's distorted handwriting.

This work is an extension of a previous project called IntuiScript, whose objective was to help preschool children learn to write. To achieve this goal, copy exercises were designed, where the word instruction is displayed to the child, who must reproduce it on the interface. The scientific challenge was to finely analyse the handwriting quality, in terms of letter shape, direction and order, in order to provide children with feedback on improving their writing skills. As the word instruction was displayed, it served as prior knowledge (groundtruth) to guide the process of recognising and analysing the child's written words and thus limit misinterpretation. The challenge was to deal with the degraded nature of the handwriting (incorrect letter shapes). This previous work, with positive pedagogical results presented in [1], was transferred to the Learn & Go company and integrated into the "Kaligo" solution, now used in French and English schools. Since this handwriting analysis method is based on knowledge of the word instruction and its display to the child, it is not robust enough to reliably extract letter-level segmentation when the child does not write the expected letters, which happens when the word is dictated (without display).

As an extension of IntuiScript, the pedagogical objective of the P2IA project is to help children acquire orthographic knowledge, i.e. to learn graphemes and phonemes. The target population is primary school children who have acquired prior handwriting skills, and dictation/spelling exercises are proposed, in which the dictated instruction is heard but not seen by the child. The scientific challenge is to design an intelligent tutoring system [12] that is able to provide orthographic feedback to the child. In a dictation context, the analysis task is more complex since the engine does not know what the child has written. We are faced with orthographic and phonetic errors, since the child only hears the instruction, as well as morphological errors.


For clarity purposes, we define three important notions that will be present throughout this paper:
– The instruction: the dictated word that the child has to correctly spell/reproduce;
– The handwriting/handwritten strokes: what the child actually wrote using the tablet stylus;
– The groundtruth: the letter sequence corresponding to the child's production.
Figures 1, 2 and 3 illustrate examples of the errors we encounter in this context. The instruction (or dictated word) is written in the box with a black border. Fig. 4 illustrates the wide variety of pupils' orthographic and morphological errors when the instruction "mes" ("my" in French) is dictated. As a consequence, the groundtruth is not available, since it is likely to be unrelated to the dictated instruction.

Fig. 1. Morphological error on the shape of “o”, orthographic/phonetic error (missing “s” at the end of the word)

Fig. 2. Morphological error on the shape of the letter “d”, orthographic error: substitution of “eux” by “e”

Fig. 3. Orthographic errors: substitution of "c" by "qu", likewise for "mm" and "m"

Fig. 4. Examples of pupils errors for the word “mes”

To address this challenge, we propose a phonetic and morphological analysis strategy for the dictation context. This strategy is based on extending the existing IntuiScript analysis engine, which was suited to the copy context. The approach is based on two independent modes:


– Double input strategy: in addition to writing the dictated word with the pen, the child also uses the tablet's virtual keyboard to reproduce his/her production. This straightforward strategy provides the analysis engine with the necessary groundtruth prior knowledge to interpret the word correctly. It represents an intermediate solution that gives us an ideal baseline for the engine's performance.
– Single input, phonetic hypotheses generation strategy: in order to be free from the user-typed input and to cover the eventual errors made by the pupils, we integrate a phonetic engine whose role is to generate, given an instruction, phonetically similar pseudo-words (with the same sound as the instruction). Guided by these hypotheses, the analysis engine tries to predict the actual groundtruth.

As children are in a learning process, one of the challenges is to provide relevant feedback to the child in real time and to be as precise and exact as possible. We therefore need to minimise feedback errors, but we also need to moderate the level of detail of the comments according to the confidence of the analysis. Therefore, we define a moderation strategy (feedback generation mechanism) based on the confidence of the analysis engine. We evaluate these contributions on children's data: 1087 words collected in the classroom.

The paper is organised as follows. The existing analysis engine is presented in Sect. 2. Sect. 3 describes the engine extension and adaptation to the dictation context, whereas Sect. 4 illustrates the typology of the generated feedback. Experiments are presented in Sect. 5. Conclusions and future works are given in Sect. 6.

2 Existing Copying Analysis Engine

In this section, we present the main principles of the existing analysis engine, which was designed for a copy context. Fig. 5 illustrates the analysis workflow.

Fig. 5. Workflow of analysis engine in copy context

As discussed earlier in the paper, the inputs of the engine are the instruction and the pupil handwriting. The word analysis is divided into multiple steps as follows.

2.1 Segmentation

The segmentation process is based on two steps. First, the online signal is segmented into primary elements by extracting all possible cutting points around the significant descending areas [2]. Second, as illustrated in Fig. 6 for the word "juste", a segmentation lattice is constructed. The first level of the lattice/graph is built from the primary segmentation by associating ascending areas with a descending area. The second level is made by merging two nodes from the first level, and so on for the next levels. The goal is then to find the path in this lattice which corresponds to the character decomposition of the word written by the pupil. In Fig. 6, the best path is highlighted in green, and the explicit segmentation result is shown in the top right corner of the figure. This explicit segmentation is needed to analyse the letters in context and to provide precise feedback, which is not possible with current deep neural network approaches.

Fig. 6. Segmentation graph for the word “juste” (Color figure online)

More details regarding the construction of the segmentation lattice can be found in [3].

2.2 Letter Hypothesis Computation

The next step is to compute letter hypotheses for each node of the segmentation lattice. First of all, a recognition score is computed with the Evolve classifier [4], based on fuzzy inference [5]. These letter hypotheses are filtered in order to keep only the ones corresponding to letters expected in the instruction, as well as the ones with the best scores. This is the recognition step. Afterwards, an analysis score is computed for each hypothesis with a confidence-based classifier [6], which uses intra- and inter-class scoring to deal with confusion between letters. Only the n best letter hypotheses belonging to the expected word and whose score is superior to a defined threshold are selected as valid hypotheses; the other ones are discarded.


If there is no letter hypothesis verifying these two conditions, the letter with the best recognition score is kept as the sole valid hypothesis of the node. This is the analysis step. This two-step process allows the information contained in the instruction to guide the selection of letter hypotheses. However, this approach is only suited to the case where the pupil does not commit any orthographic error. That is why it fits the copy context, where the child sees the instruction to be reproduced. In a dictation context, the approach has to be adapted to the fact that the child does not necessarily know the spelling of the instruction.

2.3 Best Segmentation Path Search

The computation of the letter hypotheses is the entry point of the search for word segmentation paths within the segmentation lattice. A word-level analysis score is calculated by filtering the node-level analysis scores for each possible path. This is combined with an n-gram score and a spatial coherence score. The n-gram score is related to the presence of bigrams or letters of the word instruction. The spatial coherence score is calculated from the letter hypotheses of the paths, character models and the handwriting, which makes it possible to check the consistency between a hypothesis and its real size. For more details, see [3]. This score (analysis, n-gram and spatial coherence) provides a metric of the writing quality, which was used as reward feedback for the children in previous works [1]. The final path of the segmentation lattice chosen by the analyser as the recognised word is the path that minimises the edit distance with the instruction. The edit distance considered is a Damerau-Levenshtein edit distance [7], with optimised edition costs learned from the letter analysis (to deal with the confusion errors of the recognition and analysis process). This best path is not necessarily the one that maximises the analysis score. This way of retrieving the best path is well suited to the copy context, but becomes obsolete when the prior knowledge of the groundtruth is unavailable, as in the dictation context. In this section, we have presented the principles of the analysis engine, specifically how knowledge of the expected word, in a copy context, guides the analysis process. In the next section, we present our new contribution to adapt this engine to a dictation context.
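To make the decision criterion concrete, here is a minimal sketch of a Damerau-Levenshtein distance with pluggable substitution costs; the `sub_cost` argument is only a placeholder for the optimised costs learned by the analyser, and the closing comment merely outlines how the lattice path closest to the instruction would be selected.

```python
def weighted_edit_distance(hyp, ref,
                           sub_cost=lambda a, b: 0.0 if a == b else 1.0,
                           ins_cost=1.0, del_cost=1.0, swap_cost=1.0):
    """Restricted Damerau-Levenshtein (optimal string alignment) distance
    between a path hypothesis and a reference letter sequence."""
    n, m = len(hyp), len(ref)
    d = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = i * del_cost
    for j in range(1, m + 1):
        d[0][j] = j * ins_cost
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d[i][j] = min(d[i - 1][j] + del_cost,
                          d[i][j - 1] + ins_cost,
                          d[i - 1][j - 1] + sub_cost(hyp[i - 1], ref[j - 1]))
            # Transposition of two adjacent letters
            if (i > 1 and j > 1 and hyp[i - 1] == ref[j - 2]
                    and hyp[i - 2] == ref[j - 1]):
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + swap_cost)
    return d[n][m]

# In the copy context, the recognised word would be the lattice path whose
# letter sequence minimises this distance with respect to the instruction:
# best_path = min(paths, key=lambda p: weighted_edit_distance(p, instruction))
```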

3 Adaptation of the Engine in a Dictation Context

Since we are in a dictation context, the guidance provided by the instruction in the analysis process becomes obsolete when the child makes orthographic errors. To deal with this problem, we have designed two strategies that allow the engine to overcome this new challenge.

3.1 Double Input, Baseline Strategy

The first strategy to adapt the engine to this new context is a straightforward one: after the completion of the handwritten production, the system asks the child to enter, with the keyboard, what he/she has written.


As a consequence, the engine has explicit knowledge of the groundtruth, which is the childtyping (what the child typed). To illustrate the key role of this prior knowledge, Figs. 7 and 8 present the analysis results of a pupil's handwritten word in two modes: using only the instruction as prior knowledge, and using the childtyping as prior knowledge. The difference between the two modes lies in the analysis results of the "a" node. Since the instruction is to write the word "mes", the letters "m", "e" and "s" guide the analysis process in the letter hypotheses computation step. Even if the letter "a" is the best-ranked hypothesis for the highlighted node, it is discarded due to the analysis filters and another hypothesis, "e", is considered instead. It is clear that using the explicit groundtruth as prior knowledge solves this problem.

Fig. 7. Analysis driven by the Instruction (mes), recognised: mei

Fig. 8. Analysis driven by Childtyping (mai), recognised: mai

This intermediary strategy provides a baseline of how the engine would perform in ideal conditions. Surprisingly, the teachers associated with the project estimated that asking the child to type what he/she wrote could also have pedagogical benefits.

3.2 Phonetic Hypotheses Generation Strategy

The second strategy aims to be free from the user-defined groundtruth and to predict it given only the instruction and the handwritten strokes. It is based on the integration of a phonetic hypotheses generation module into the analysis workflow. This module is based on the Phonetisaurus engine [8], a stochastic Grapheme-to-Phoneme (G2P) WFST (Weighted Finite State Transducer). This WFST is based on the principle of joint sequences [11] to align grapheme sequences with their corresponding phonemes in the learning phase. An N-gram model is generated from the aligned joint sequences and transformed into a WFST model. This G2P model is then able to predict the pronunciation of a new word. To adapt it to our problem, the G2P model is combined with a P2G (Phoneme-to-Grapheme) model so that the output of the combined model, given a new word, is a set of phonetically similar pseudo-words. We choose to generate, for each instruction, the 50 best hypotheses according to the Phonetisaurus engine ranking (for more details, see [8]). Fig. 9 illustrates the new phonetic analysis chain.
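The G2P/P2G composition can be sketched as follows; `g2p_nbest` and `p2g_nbest` are hypothetical wrappers around a trained Phonetisaurus-style model (they are not Phonetisaurus API calls), and the ranking and de-duplication logic is our assumption rather than the engine's exact behaviour.

```python
def generate_phonetic_hypotheses(instruction, g2p_nbest, p2g_nbest, n=50):
    """Compose G2P and P2G to obtain pseudo-words that sound like the
    dictated instruction. `g2p_nbest(word)` yields phoneme sequences,
    `p2g_nbest(phonemes)` yields (pseudo_word, score) candidate spellings."""
    scored = {}
    for phonemes in g2p_nbest(instruction):
        for pseudo_word, score in p2g_nbest(phonemes):
            # Keep the best score seen for each distinct pseudo-word
            scored[pseudo_word] = max(score, scored.get(pseudo_word, float("-inf")))
    ranked = sorted(scored, key=scored.get, reverse=True)
    return ranked[:n]   # e.g. something like {"mes", "mais", "mai", "met", ...} for "mes"
```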


Fig. 9. Phonetic analysis chain workflow

A) Phonetic Hypotheses as Prior Knowledge in the Analysis Process. The first adaptation of the analysis chain lies in modifying the filters that are used for computing the letter hypotheses. We explained in the last section that, for each segmentation node, only the letters that belong to the instruction/groundtruth are kept as valid hypotheses in the node. Since the groundtruth is unknown, the new analysis chain is guided by all the phonetic hypotheses generated from the instruction (dictated word). For the example illustrated in Figs. 7 and 8, instead of having "mes" (the instruction) or "mai" (the child-typed groundtruth) as the expected sequence, the engine will have {mes, mais, mai, met, med, ...}. This means that neither "e" nor "a" will be discarded a priori by the analysis filters. The filter criteria are that the letter belongs to one of the expected sequences and that the analysis score of the letter is among the n best scores. Another impact of having these phonetic hypotheses guiding the analysis is that other hypotheses will be taken into account, such as the letter "d", as shown in Fig. 10. As a consequence, "e" is discarded here since it is not among the n best hypotheses.

Fig. 10. Analysis scores of the “a” segmentation node

The construction of the word paths is the same as in the basic analysis chain. The difference is in the computation of the n-gram score of each path, since all bigrams of all phonetic hypotheses are included in the computation of this score.


B) Phonetic Hypotheses as Best Path Decision Criteria. We presented in Sect. 2 the edition distance computation between the analyser's word hypotheses and the instruction/groundtruth, which enables the analysis engine to retrieve the handwritten word. This decision criterion is still suited to the double input strategy, since the child types the written word on the keyboard. However, it becomes obsolete without the prior knowledge of the groundtruth. To tackle this problem, we compute the phonetic correspondence of each word segmentation path generated by the engine. By phonetic correspondence, we mean the phonetic hypothesis that has the minimum edition distance with the segmentation path. The edition score of each path will then depend on the generated phonetic hypotheses, as well as on the optimised letter substitution costs learned by the analyser. As an example of this process, in Fig. 11 we can see that two segmentation paths (A and C in the figure) have a minimal edition distance score (0), since they are equal to two phonetic hypotheses, with "alors" being the dictated word, i.e. the instruction. We can also see that the edition score of "alxr" (B in the figure) is equal to 0.67; this number represents the optimised substitution cost of "o" with "x". This is an interesting example since there is a path that corresponds to the instruction (alors). However, the groundtruth is equal to "alor", which means that the pupil made an orthographic mistake. It is clear that if the engine relied on the instruction to guide the interpretation process, it would have made an analysis error by choosing "alors" as the best path.

Fig. 11. Segmentation paths with their phonetic correspondences

In this particular case, the edition distance is not sufficient since we have two competing paths with the same edition score. To solve this issue, we include the analysis score in the decision criteria, since this score reflects the handwriting quality and enables the engine to discriminate between competing paths. The phonetic analysis score is defined as follows:

$Score(path) = \frac{1}{1 + |EditScore(path)|} \times 0.3 + analysisScore(path) \times 0.7$

With this phonetic analysis score, the best path returned by the engine corresponds, in this example, to the groundtruth "alor", since the third path (C in the figure) has the highest analysis score among the two competing paths.
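A minimal sketch of this scoring step, reusing an edit distance function such as the one sketched in Sect. 2.3; the helper names are ours and only illustrate the formula above.

```python
def phonetic_correspondence(path_letters, hypotheses, edit_distance):
    """Phonetic hypothesis closest (in edit distance) to a segmentation path."""
    return min(hypotheses, key=lambda h: edit_distance(path_letters, h))

def phonetic_analysis_score(path_letters, analysis_score, hypotheses, edit_distance):
    """Combined score: edit-distance term (weight 0.3) plus the
    handwriting-quality analysis score (weight 0.7)."""
    best_hyp = phonetic_correspondence(path_letters, hypotheses, edit_distance)
    edit = edit_distance(path_letters, best_hyp)
    return (1.0 / (1.0 + abs(edit))) * 0.3 + analysis_score * 0.7
```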


The phonetic correspondence is also important because it reflects the confidence of the phonetic analysis engine. Since there is no prior knowledge of the groundtruth, this phonetic correspondence is a relatively efficient approximation. C) Optimisation of the Phonetic Analysis Chain. Using all the letters from the phonetic hypotheses as prior knowledge in the letter hypotheses computation step enables the engine to consider more possibilities in the path construction step, compared to the restrictive filters of the copying analysis engine or of the double input strategy. However, one drawback of this phonetic strategy is that the correct letter hypothesis can sometimes be skipped in favour of others, as shown in the example in Fig. 12. For the word "rien", when all the phonetic hypotheses letters are added to the analysis filters, the analysis engine is not able to retrieve the groundtruth, whereas it is found when the analysis filters are restricted to "r", "i", "e", and "n".

Fig. 12. Degradation of the analysis performance by adding phonetic hypotheses

To tackle this issue, we integrate the notion of analysis competition, illustrated in Fig. 13. This competition is between two analysis instances in order to find the best path, such that the first analysis (basic analysis in the figure) is guided by the instruction, and the second one (phonetic analysis) is guided by the phonetic hypotheses related to the expected word. If the basic analysis best path is equal to the expected word, its phonetic analysis score is computed and compared to the phonetic analysis score of the phonetic analysis best path. The path with the highest score is chosen as the final analysis result. If the basic analysis best path is different from the expected path, the phonetic analysis best path is considered as the final result. This process enables the engine to retrieve a portion of the correctly written words that were misinterpreted before. We will study the impact of this optimisation in Sect. 5.
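A compact sketch of this competition rule; the path objects exposing a `letters` attribute and the `score` callable (the phonetic analysis score above) are hypothetical stand-ins for the engine's internal structures, not the authors' code.

```python
def final_best_path(instruction, basic_best, phonetic_best, score):
    """Competition between the instruction-guided (basic) analysis and the
    phonetic-hypotheses-guided analysis."""
    if basic_best.letters == instruction:
        # The word seems correctly copied: keep whichever competing path
        # obtains the higher phonetic analysis score.
        return max(basic_best, phonetic_best, key=score)
    # Otherwise, trust the phonetic analysis result.
    return phonetic_best
```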


Fig. 13. Competition between analysis instances

After having presented our two strategies related to the orthographic analysis of pupils handwritten words, we present in the next section the feedback generation approach.

4

Feedback Typology

The objective of feedback generation by the system is to make the pupil aware of his/her orthographic errors. These errors relate to the difference between what was dictated and what was written. Therefore, the feedback highlights, in the pupil's handwriting, the eventual insertions, deletions and substitutions of letters and accents. Table 1 illustrates some examples of such generated feedback. The red colour highlights a wrong insertion, whereas orange highlights a substitution.

Table 1. Feedback examples for dictated words.

| Instruction | Recognized | Error                 | Feedback |
| belle       | belle      | none                  | (image)  |
| alors       | allore     | insertion             | (image)  |
| céréale     | cerèâle    | substitution, accents | (image)  |

Depending on the confidence of the analysis, three feedback statuses are possible:
– High confidence: the recognised word is equal to the childtyping if the first strategy is activated, or is equal to its phonetic correspondence if the phonetic analysis strategy is activated ⟹ the feedback is returned with a high degree of confidence. The feedback examples in Table 1 are high confidence feedback.
– Medium confidence: there is one letter that distinguishes the recognised word from the childtyping/phonetic correspondence ⟹ the system generates a warning on the uncertain zone/letter and the other feedback is returned with a lesser degree of confidence.


– Reject: there is more than one letter that distinguishes the recognised word from the childtyping/phonetic correspondence ⟹ no fine feedback is given to the child: the system informs the pupil that it was not able to analyse the production.

Table 2 presents two examples of medium confidence feedback and reject. For the first example, the engine recognises "alard" instead of the groundtruth "alord", while the instruction is "alors". The blue feedback on the "o" corresponds to a warning directed to the pupil, whereas the substitution of "s" by "d" is highlighted in grey, since there is a lesser degree of confidence in this feedback. For the second example, the phonetic analysis engine recognised "lemjeur", which is completely unrelated phonetically to the instruction "bonjour" or to any of its phonetic hypotheses. Therefore, the system does not provide any fine feedback.

Table 2. Examples of "medium confidence" feedback and reject.

| Dictated | Ground truth | Recognized | Confidence | Feedback |
| alors    | alord        | alard      | Medium     | (image)  |
| bonjour  | lezour       | lmjeur     | reject     | (image)  |
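The confidence policy above can be summarised by the following sketch, assuming that a unit-cost edit distance is used to count the letters distinguishing the recognised word from its reference (childtyping or phonetic correspondence); the status names and the function signature are ours.

```python
def feedback_status(recognised, reference, edit_distance):
    """Map the number of differing letters to a feedback status:
    0 -> high confidence, 1 -> medium confidence (warning), >1 -> reject."""
    d = edit_distance(recognised, reference)
    if d == 0:
        return "high"      # detailed orthographic feedback is returned
    if d == 1:
        return "medium"    # warning on the uncertain letter, softer feedback
    return "reject"        # the system reports it could not analyse the word
```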

In this section, we have presented the feedback typology and our strategy to cope with analysis uncertainty. In the next section, we present the performance of the analysis engine, as well as the pertinence of the feedback generated for the children.

5 Results

In this section, we base our evaluation of the dictation-adapted engine on a dataset of 1078 pseudo-words collected from children who use the system in the classroom. Due to GDPR restrictions on children's private data, we are not yet able to share this dataset publicly. Table 3 presents some samples of this dataset enriched with the engine's feedback.

5.1 Analysis Results

Table 4 presents the performance of each analysis mode/strategy on the test set. By analysis performance, we mean the correct segmentation and recognition of the handwritten strokes, which automatically leads to correct feedback. The fact that the ground truth is already available in the analysis with double input mode allows this strategy to have the best analysis rate by far (80.6%). It is interesting, however, to note that even with this ideal-setting baseline, the engine still has an error rate of 19.4%, which demonstrates the complexity of the task in hand.

Table 3. Feedback examples for dictated words: green (H) → correct high confidence feedback, green (M) → correct medium confidence feedback, red (H) → error in high confidence feedback, blue (R) → feedback reject.

| Instruction | Child handwriting, feedback |
| Partir      | (image) |
| Croire      | (image) |
| Mes         | (image) |

The phonetic analysis strategy achieves a significantly lower recognition rate (69.4%), yet it is by far better than the existing copying analysis strategy (with only the instruction as prior knowledge). The gap between phonetic analysis and double input analysis can be explained by the fact that there are a lot of incorrectly written productions that are not phonetically similar to the instruction, which renders the phonetic guidance obsolete. The competition between analysis instances in the optimised phonetic analysis strategy results in a gain of 2% (71.4% recognition rate).

Table 4. Analysis performance of each strategy.

| Approach                            | Correctly analysed | Analysis rate |
| Copying analysis                    | 633                | 58.72%        |
| Double input analysis (childtyping) | 869                | 80.6%         |
| Phonetic analysis                   | 748                | 69.4%         |
| Optimised phonetic analysis         | 770                | 71.4%         |

In any case, there is room for improvement, for instance by optimising the recognition engines used to identify letter hypotheses, using the amount of data that is being collected to improve the letter models. Moreover, the phonetic analysis chain can be improved by relearning the substitution costs and optimising the phonetic hypotheses generation process. Given that there is some uncertainty in the interpretation robustness, it is important that the feedback generation strategy minimises the impact of the analysis engine's errors.


5.2 Feedback Results

Table 5 presents the feedback generation results for the baseline double input analysis mode (first three columns) and for the optimised phonetic analysis mode (last three columns).

Table 5. Feedback generation pertinence.

| Confidence     | Analysis with double input           | Optimised phonetic analysis          |
|                | Ratio        | Errors | Error rate   | Ratio        | Errors | Error rate   |
| High           | 857 (79.4%)  | 2      | 0.2%         | 819 (75.9%)  | 111    | 13.5%        |
| Medium         | 137 (12.7%)  | 0      | 0%           | 85 (7.9%)    | 0      | 0%           |
| Reject         | 88 (8.16%)   | 0      | 0%           | 174 (16.1%)  | 0      | 0%           |
| Total feedback | 994 (92.2%)  | 2      | 0.2%         | 904 (83.8%)  | 111    | 12.2%        |

We can see, for the baseline double input, that even though the analysis rate is "only" 80%, the feedback error ratio (total feedback errors/total feedback) is limited to 0.2%. This is due to the fact that most of the words that were not correctly analysed were either rejected or considered as medium confidence feedback. We can therefore conclude that the defined feedback statuses and the feedback generation decision criteria enable the system to perform well in the context of the analysis with double input. Moreover, it is clear from the table that using the phonetic correspondence of the recognised word as a criterion for feedback generation is not as precise as using the baseline double input strategy, since it is an approximation of the ground truth. We can also see that there are more rejects (174 versus 88), which is explained by the fact that we have not yet found a way to deal with children's productions that are phonetically incoherent with the instruction. In any case, we do observe the same improvement in the error ratio (feedback error ratio = 12.2% whereas analysis error ratio = 28.6%), which is encouraging. Finally, the possible improvements discussed for the analysis process would have a big impact on the feedback pertinence.

6 Conclusion

In this paper, we present an original approach for the orthographic analysis of children's handwritten words in a dictation context. This approach is based on the extension of an existing analysis engine that was suited to copying exercises. Dictation exercises are more challenging since the child only hears the word he/she has to reproduce. As a consequence, we are faced with more morphological and orthographic errors. We defined two strategies to cope with this challenge. The first, intermediary approach puts the user in the analysis loop, as the pupil has to type the word he/she has written on the keyboard after completing the production.


This explicit groundtruth is then used as prior knowledge to drive the handwriting analysis process and to retrieve the written word. The second strategy aims to add fluidity to the interaction and to be free from the user-defined groundtruth; it is based on the generation of phonetically similar hypotheses for each instruction, which can cover a wide range of orthographic errors. We can consider that each phonetic hypothesis is a probable approximation of the groundtruth. We adapted the analysis process and the path search decision criteria to cope with the fact that the ground truth is unknown. The experiments showed the improvement of the system performance with the integration of these new strategies, and the pertinence of the feedback generated for the pupils. Our future work consists in extending this analysis engine to the interpretation of short sentences.

Acknowledgement. "P2IA" is funded by the French government. We would like to thank the project partners from Learn & Go, the University of Rennes 2, LP3C lab, INSA Rennes, University of Rennes 1 and IRISA lab. Additionally, parts of these works were supported by LabCom "Scripts and Labs", funded by the French National Agency for Research (ANR).

References

1. Bonneton-Botté, N.: Can tablet apps support the learning of handwriting? An investigation of learning outcomes in kindergarten classroom. Comput. Educ. 38 (2020)
2. Anquetil, E., Lorette, G.: On-line handwriting character recognition system based on hierarchical qualitative fuzzy modeling. In: Progress in Handwriting Recognition, pp. 109–116. World Scientific, New York (1997)
3. Simonnet, D., Girard, N., Anquetil, E., Renault, M., Thomas, S.: Evaluation of children cursive handwritten words for e-education. Pattern Recogn. Lett. 121, 133–139 (2018)
4. Almaksour, A., Anquetil, E.: Improving premise structure in evolving Takagi-Sugeno neuro-fuzzy classifiers. Evolv. Syst. 2, 25–33 (2011)
5. Takagi, T., Sugeno, M.: Fuzzy identification of systems and its applications to modeling and control. IEEE Trans. Syst. Man Cybern. SMC-15(1), 116–132 (1985)
6. Simonnet, D., Anquetil, E., Bouillon, M.: Multi-criteria handwriting quality analysis with online fuzzy models. Pattern Recogn. 69, 310–324 (2017)
7. Damerau, F.J.: A technique for computer detection and correction of spelling errors. Commun. ACM 7, 171–176 (1964)
8. Novak, J., Dixon, P., Minematsu, N., Hirose, K., Hori, C., Kashioka, H.: Improving WFST-based G2P conversion with alignment constraints and RNNLM N-best rescoring (2012)
9. Chickering, A.W., Zelda, F.G.: Seven principles for good practice in undergraduate education. AAHE Bull. 39(7), 3–7 (1987)
10. Haßler, B., Major, L., Hennessy, S.: Tablet use in schools: a critical review of the evidence for learning outcomes. J. Comput. Assist. Learn. 32, 139–156 (2016)


11. Bisani, M., Ney, H.: Joint-sequence models for grapheme-to-phoneme conversion. Speech Commun. 50, 434–451 (2008)
12. Nkambou, R., Mizoguchi, R., Bourdeau, J.: Advances in Intelligent Tutoring Systems. Studies in Computational Intelligence, vol. 308. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-14363-2
13. Sheng, H., Schomaker, L.: Deep adaptive learning for writer identification based on single handwritten word images. Pattern Recogn. 88, 64–74 (2008)
14. Kang, L., Rusiñol, M., Fornés, A., Riba, P., Villegas, M.: Unsupervised writer adaptation for synthetic-to-real handwritten word recognition. In: The IEEE Winter Conference on Applications of Computer Vision, pp. 3502–3511 (2020)

A Transcription Is All You Need: Learning to Align Through Attention

Pau Torras(B), Mohamed Ali Souibgui, Jialuo Chen, and Alicia Fornés

Computer Vision Center, Computer Science Department, Universitat Autònoma de Barcelona, Barcelona, Spain
[email protected], {msouibgui,jchen,afornes}@cvc.uab.cat

Abstract. Historical ciphered manuscripts are a type of document where graphical symbols are used to encrypt their content instead of regular text. Nowadays, expert transcriptions can be found in libraries alongside the corresponding manuscript images. However, those transcriptions are not aligned, so they are barely usable for training deep learning-based recognition methods. To solve this issue, we propose a method to align each symbol in the transcript of an image with its visual representation by using an attention-based Sequence to Sequence (Seq2Seq) model. The core idea is that, by learning to recognise the symbol sequence within a cipher line image, the model also identifies each symbol's position implicitly through an attention mechanism. Thus, the resulting symbol segmentation can later be used for training algorithms. The experimental evaluation shows that this method is promising, especially taking into account the small size of the cipher dataset.

Keywords: Handwritten symbol alignment · Hand-drawn symbol recognition · Sequence to Sequence · Attention models

1 Introduction

Historical ciphered manuscripts have recently attracted the attention of many researchers [6], not only for their own historical value, but also because of the challenges related to the transcription, decryption and interpretation of their contents. Indeed, many of these ciphers apply different techniques to hide their content from plain sight, for example, by using invented symbol alphabets. An example of a ciphered manuscript¹ is illustrated in Fig. 1. Transcribing the sequence of symbols in the manuscript is the first step in the decryption pipeline [9]. However, machine learning-based recognition methods require annotated data, which is barely available. Indeed, an accurate labelling (e.g. annotation at symbol level) is desired, since it can then be used for training symbol classification, segmentation, spotting methods, etc. But the few expert transcriptions are often available at paragraph or line level. For this reason, we propose to align each transcribed symbol with its representation in the manuscript image by using an attention-based Seq2Seq model [4], which implicitly infers the position of relevant visual features for every character output step.

¹ https://cl.lingfil.uu.se/~bea/copiale/.

© Springer Nature Switzerland AG 2021. E. H. Barney Smith and U. Pal (Eds.): ICDAR 2021 Workshops, LNCS 12916, pp. 141–146, 2021. https://doi.org/10.1007/978-3-030-86198-8_11


Fig. 1. An example of the Copiale ciphered manuscript, related to an 18th-century German secret society, namely the “oculist order”.

The rest of the paper is organized as follows. First, in Sect. 2 we delve into relevant alignment methods present in the literature. We describe our approach in detail in Sect. 3, and the experiments in Sect. 4. Finally, in Sect. 5 we present some future work avenues and a few closing words.

2 Related Work

Many approaches exist for the task of alignment, and they vary depending on the nature of the aligned manuscript. An example of domain-specific alignment can be found in Riba et al. [7], which consists of an image-to-image alignment using Dynamic Time Warping for detecting variations in music score compositions without the need of transcriptions. Similarly, Kassis et al. [5] use Siamese Neural Networks to align two handwritten text images with the same contents but different writing styles. Image-to-text alignment has also been researched. For example, Romero et al. [8] use Hidden Markov Models (HMM) and a dynamic programming algorithm to find candidate transcriptions of text lines and align them to the ground truth sequence. Fischer et al. [3] use HMMs and a first recognition pass. A combination of both approaches is proposed by Ezra et al. [2], where they overfit a recogniser on the input data and generate a synthetic version of the image, which is then aligned. They also use the OCR output and edit distances between said output and the ground truth for better performance.

3 Proposed Method

In this section we describe our architecture. We have used the Seq2Seq model with an attention mechanism proposed by Kang et al. [4] for HTR (Handwritten Text Recognition). We have adapted this technique for the task of alignment and performed several modifications related to the way attention masks are presented to improve their accuracy and flexibility for aligning cipher symbols.


Fig. 2. Representation of the Seq2Seq model and the placement of attention within the pipeline.

3.1 Sequence to Sequence Model

The Seq2Seq model is an Encoder-Decoder architecture, which means that it processes an input set of vectors sequentially, generates an intermediate representation from them and then generates an output sequence based on said representation. The addition of an attention mechanism makes it possible for the intermediate representation to contain an unset number of vectors, since the model learns to assess the relevance of each of them and to condition on those most useful at the current step. Our Seq2Seq model, depicted in Fig. 2, has a VGG19 convolutional network with its last max pooling layer removed as the first step in the pipeline, which accepts an 800 × 64 × 3 px image of a text line as input. This generates a 50 × 4 × 512-element (flattened to 50 × 2048) representation, which the Encoder, a stack of Gated Recurrent Units (GRU), further annotates into the hidden state H. Then, for each Decoder inference step, a vector Att of dimension 50 is computed through an attention mechanism. The input for the Decoder, another GRU stack, is the Hadamard product between the Att vector and the hidden state H, concatenated with the previously inferred symbol. Our hypothesis is that there is a direct correlation between the position of a highly active attention mask and the position of the associated inferred symbol in the output text sequence, which enables us to perform alignment while learning to recognise lines.
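To make the decoding step concrete, here is a small PyTorch sketch of one attention-weighted decoder step over the 50 encoder annotations; the additive attention scorer, the summation of the weighted annotations into a single context vector (one possible reading of the Hadamard-product description) and all dimensions other than the 50 × 2048 annotation size are our assumptions, not the authors' exact implementation.

```python
import torch
import torch.nn as nn

class AttentionStep(nn.Module):
    """One decoder step: score the 50 encoder annotations, weight them with
    the attention vector Att, and feed the result to the decoder GRU together
    with the embedding of the previously inferred symbol."""

    def __init__(self, enc_dim=2048, dec_dim=256, emb_dim=64, n_symbols=126):
        super().__init__()
        self.attn = nn.Linear(enc_dim + dec_dim, 1)   # additive attention scorer
        self.embed = nn.Embedding(n_symbols, emb_dim)
        self.gru = nn.GRUCell(enc_dim + emb_dim, dec_dim)
        self.out = nn.Linear(dec_dim, n_symbols)

    def forward(self, H, dec_state, prev_symbol):
        # H: (batch, 50, enc_dim); dec_state: (batch, dec_dim); prev_symbol: (batch,)
        expanded = dec_state.unsqueeze(1).expand(-1, H.size(1), -1)
        scores = self.attn(torch.cat([H, expanded], dim=-1)).squeeze(-1)
        att = torch.softmax(scores, dim=1)            # (batch, 50) attention mask
        context = (att.unsqueeze(-1) * H).sum(dim=1)  # attended annotation vector
        gru_in = torch.cat([context, self.embed(prev_symbol)], dim=-1)
        new_state = self.gru(gru_in, dec_state)
        return self.out(new_state), new_state, att    # att is reused for alignment
```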

3.2 Attention Mask Tuning

When applying the original Seq2Seq model to our data we found a major limitation: the attention mechanism can only provide a discrete set of fixed positions of a set width, since the attention mask is a 50-element vector that represents relevant areas in an 800 px wide image. This made segmenting narrow characters or long sequences very difficult. Moreover, since the output of the attention mechanism is the result of a Softmax layer, no more than one attention band per character has a significant value.


Thus, we improved the model by treating the attention mask as a histogram and fitting a Normal distribution onto it in order to find the position and width of the character more precisely. Every character mask is computed as

m_low = μ − k · c_l · σ,    m_high = μ + k · c_h · σ + 1,    (1)

where m_low and m_high are the lower and upper bounds of the character mask, μ is the mean of the histogram, σ is its standard deviation, c_l and c_h are distribution skewness correction factors, and k is a scaling factor that keeps only a set fraction of the standard deviation. In this work, k was set to 0.5 after a tuning process in order to keep only the most relevant samples of the distribution within the bounds. The skewness correction factors were computed as the ratio between the sum of all bins before or after the mean and the total sum of bins (excluding the highest-valued one). Note also that both m_low and m_high need to be converted from the attention mask coordinate space into the image coordinate space.

Finally, since the model's only input is an image, the quality of the alignment relies on the model's capacity to produce a good output sequence of tokens. Given that our goal is not recognising the image but aligning the associated transcription, the final mask prediction can be corrected by finding the shortest edit path between the output and ground truth sequences using Levenshtein's algorithm, removing unnecessary masks or adding padding when required. The underlying assumption, that the shortest edit path corresponds to the sequence of mistakes the model actually made, is quite strong, but we found that it prevents (or, at the very least, alleviates) misalignment in the majority of cases.
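A minimal sketch of this mask-tuning step follows, assuming the 50-bin attention vector is available as a NumPy array. Our reading of the skewness correction (excluding the highest bin from both numerator and denominator) and the bin-to-pixel conversion are assumptions.

```python
# Derive character mask bounds from an attention vector, following Eq. (1); k = 0.5 as in the text.
import numpy as np

def mask_bounds(att, image_width=800, k=0.5):
    att = np.asarray(att, dtype=float)
    bins = np.arange(len(att))
    w = att / att.sum()                                # normalise the histogram
    mu = (w * bins).sum()                              # histogram mean
    sigma = np.sqrt((w * (bins - mu) ** 2).sum())      # histogram standard deviation
    keep = np.ones(len(att), dtype=bool)
    keep[att.argmax()] = False                         # exclude the highest-valued bin
    total = att[keep].sum()
    c_low = att[keep & (bins < mu)].sum() / total      # mass before the mean
    c_high = att[keep & (bins >= mu)].sum() / total    # mass after the mean
    m_low = mu - k * c_low * sigma
    m_high = mu + k * c_high * sigma + 1
    scale = image_width / len(att)                     # attention-bin -> pixel coordinates
    return m_low * scale, m_high * scale
```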

4 Experiments and Results

In this section we present the experiments performed to assess the viability of our model. We trained the Seq2Seq recogniser using line-level samples from the Copiale cipher, with early stopping after 30 epochs without Symbol Error Rate (SER) improvement on the validation set. Table 1 includes relevant information about the dataset and the model's hyperparameters.

Table 1. Relevant training information for experiment reproducibility.

Optimiser          | Adam
Learning rate (LR) | 3 · 10^−4
LR checkpoints     | @ 20, 40, 60, 80, 100 epochs
LR sigma           | 0.5
Loss function      | Cross-entropy
Training samples   | 649
Validation samples | 126
Test samples       | 139
Dataset classes    | 126
Avg. line length   | 42

For comparison, we used as baseline a learning-free method [1], which segments the line into isolated symbols based on connected components analysis.


Table 2. Experimental results, considering different Intersection over Union (IoU) thresholds. Metrics are Precision (Prec.), Recall (Rec.), F1 score and Average Precision (AP). Symbol Error Rate of the classifier is 0.365.

Exp.     | IoU t = 0.25             | IoU t = 0.50             | IoU t = 0.75
         | Prec. Rec.  F1    AP     | Prec. Rec.  F1    AP     | Prec. Rec.  F1    AP
Baseline | 0.31  0.30  0.31  0.12   | 0.25  0.24  0.24  0.07   | 0.14  0.16  0.14  0.02
Ours     | 0.94  0.89  0.91  0.84   | 0.59  0.55  0.57  0.34   | 0.10  0.09  0.10  0.001

Fig. 3. Qualitative Results. Model’s prediction in red, ground truth in green and the intersection of both in blue color. Each successive line within the image is a time step. These are fragments of a longer alignment sequence, cut for readability purposes. (a), (b) and (c) are examples of output quality patterns.

First, connected components are extracted; then, grouping rules join the components that likely belong to the same symbol; finally, the components are aligned to the sequence of transcribed symbols. Quantitative results are shown in Table 2. Segmentation accuracy is evaluated through the Intersection over Union (IoU): the percentage of masks whose ground truth and prediction overlap by at least a ratio t of the union of both areas. As can be seen, our approach surpasses the baseline method in most scenarios. The analysis of these results shows three general patterns:

– Correct alignment (Fig. 3a): There is a considerable proportion of cases with an overall correct alignment and limited error. Perfect masks are however rare, with some degree of error on the sides being relatively frequent.
– Slight misalignment (Fig. 3b): Mostly caused by incorrect edit paths chosen after recognition, narrow symbols or very long sequences, which cause the attention masks to be comparatively broader.
– Misalignment (Fig. 3c): Incorrect alignment when encountering very rare symbols or high output SER, mostly due to limited training data.

Finally, we note that we tested the potential for bootstrapping training under the same parameters with synthetic samples. We created a 40,000-sample dataset
using segmented symbols from real Copiale pages. We trained the model on random-width lines in which each symbol appears under a uniform distribution and fine-tuned it on real samples. However, results did not improve, which we attribute to characters in the synthetic lines being broader than their real counterparts, which caused attention masks to skip symbols.
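For reference, the IoU criterion behind Table 2 can be sketched as follows. Representing each mask as a horizontal (x0, x1) span and using greedy one-to-one matching are simplifying assumptions of ours.

```python
# Mask-level IoU and precision/recall at a threshold t, as used in Table 2 (simplified).
def iou(a, b):
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0

def precision_recall(preds, gts, t=0.5):
    matched, tp = set(), 0
    for p in preds:
        best = max(range(len(gts)), key=lambda i: iou(p, gts[i]), default=None)
        if best is not None and best not in matched and iou(p, gts[best]) >= t:
            matched.add(best)
            tp += 1
    prec = tp / len(preds) if preds else 0.0
    rec = tp / len(gts) if gts else 0.0
    return prec, rec
```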

5 Conclusion

We have proposed an alignment method based on Seq2Seq models. Our method shows encouraging results given the small dataset. The main hindrances are the difficulty of training a very accurate model and the need for further mask processing in order to find bounding boxes correctly. Thus we believe that results might improve by refining the Levenshtein algorithm to include confidence data when choosing the right edits, or by modifying the attention mechanism to allow more than one high-activation mask. It is also worth exploring supervised attention mask training to avoid having to tune masks after recognition, since some character-level annotated samples are available for the data we are working with, which might boost performance further.

Acknowledgement. This work has been supported by the Swedish Research Council, grant 2018-06074, DECRYPT – Decryption of Historical Manuscripts, the Spanish project RTI2018-095645-B-C21 and the CERCA Program / Generalitat de Catalunya.

References

1. Baró, A., Chen, J., Fornés, A., Megyesi, B.: Towards a generic unsupervised method for transcription of encoded manuscripts. In: DATeCH, pp. 73–78 (2019)
2. Ezra, D.S.B., Brown-DeVost, B., Dershowitz, N., Pechorin, A., Kiessling, B.: Transcription alignment for highly fragmentary historical manuscripts: the Dead Sea Scrolls. In: ICFHR, pp. 361–366 (2020)
3. Fischer, A., Frinken, V., Fornés, A., Bunke, H.: Transcription alignment of Latin manuscripts using hidden Markov models. In: HIP, pp. 29–36 (2011)
4. Kang, L., Toledo, J.I., Riba, P., Villegas, M., Fornés, A., Rusiñol, M.: Convolve, attend and spell: an attention-based sequence-to-sequence model for handwritten word recognition. In: Brox, T., Bruhn, A., Fritz, M. (eds.) GCPR 2018. LNCS, vol. 11269, pp. 459–472. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-12939-2_32
5. Kassis, M., Nassour, J., El-Sana, J.: Alignment of historical handwritten manuscripts using Siamese neural network. In: ICDAR, vol. 1, pp. 293–298 (2017)
6. Megyesi, B., et al.: Decryption of historical manuscripts: the DECRYPT project. Cryptologia 44(6), 545–559 (2020)
7. Riba, P., Fornés, A., Lladós, J.: Towards the alignment of handwritten music scores. In: GREC, pp. 103–116 (2015)
8. Romero-Gómez, V., Toselli, A.H., Bosch, V., Sánchez, J.A., Vidal, E.: Automatic alignment of handwritten images and transcripts for training handwritten text recognition systems. In: DAS, pp. 328–333 (2018)
9. Souibgui, M.A., Fornés, A., Kessentini, Y., Tudor, C.: A few-shot learning approach for historical ciphered manuscript recognition. In: ICPR, pp. 5413–5420 (2021)

Accurate Graphic Symbol Detection in Ancient Document Digital Reproductions

Zahra Ziran(B), Eleonora Bernasconi, Antonella Ghignoli, Francesco Leotta, and Massimo Mecella

Sapienza Università di Roma, Rome, Italy
{zahra.ziran,eleonora.bernasconi,antonella.ghignoli,francesco.leotta,massimo.mecella}@uniroma1.it

Abstract. Digital reproductions of historical documents from Late Antiquity to early medieval Europe contain annotations in handwritten graphic symbols or signs. The study of such symbols may potentially reveal essential insights into the social and historical context. However, finding such symbols in handwritten documents is not an easy task, requiring the knowledge and skills of expert users, i.e., paleographers. An AI-based system can be designed, highlighting potential symbols to be validated and enriched by the experts, whose decisions are used to improve the detection performance. This paper shows how this task can benefit from feature auto-encoding, showing how detection performance improves with respect to trivial template matching.

Keywords: Paleography · Graphic symbol detection · Image processing · Machine learning

1 Introduction

A huge number of historical documents from Late Antiquity to early medieval Europe exist in public databases. The NOTAE project (NOT A writtEn word but graphic symbols) is meant to study graphic symbols, which were added by the authors of these documents with several different meanings. This task is very different from processing words and letters in natural language, as the symbols we look for can be orthogonal to the content, making contextual analysis useless. Labeling document pictures with the positions of graphic symbols, even in an unsupervised manner, requires the knowledge of domain experts, paleographers in particular. Unfortunately, this task does not scale up well considering the high number of documents.

This research is part of the project NOTAE: NOT A writtEn word but graphic symbols, which has received funding from the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation program (Advanced Grant 2017, GA n. 786572, PI Antonella Ghignoli). See also http://www.notae-project.eu.

This paper proposes a system that helps curators identify potential candidates for different categories of symbols. Researchers are then allowed to revise the annotations in order to improve the performance of the tool in the long run. A method for symbol detection has already been proposed in the context of the NOTAE project [1]. There, the authors created a graphic symbols database and an identification pipeline to assist the curators. The symbol engine takes images as input, then uses the database objects as queries. It detects symbols and reduces noise in the output by clustering the identified symbols. Before this operation, the user is required to choose the binarization threshold from a prepared selection. The approach proposed in this paper has several differences with respect to the original one. First of all, in this new version of the tool, we rely on OPTICS [2] instead of DBSCAN [3] for clustering purposes. OPTICS uses the hyper-parameters MaxEps and MinPts in almost the same way as DBSCAN, but it distinguishes cluster densities on a more continuous basis; in contrast, DBSCAN considers only a floor for cluster density and filters noise by identifying those objects that are not contained in any cluster. In addition, our proposed pipeline implements an algorithm that sorts the objects of a cluster by confidence score and selects the top match, so the pipeline can control the number of predictions over different types of graphic symbols. Moreover, the first tool has shown a high number of false positives, which are difficult to filter out. Here, we show how automatic detection of symbols can benefit from feature auto-encoding, showing how detection performance improves with respect to trivial template matching.

The paper is organized as follows. Section 2 summarizes prior work on document analysis and digital paleography tools, metric learning, and graphic symbol spotting. Section 3 presents our data cleaning process and image preprocessing tailored to the specific data domain, i.e., ancient documents. Section 4 covers the inner details of the proposed method. Section 5 shows experimental results. Finally, Sect. 6 concludes the paper with a final discussion.

2 Related Work

In [7], the authors discuss the recent availability of large-scale digital libraries, where historians and other scientists can find the information they need to help answer their research questions. However, as they state, researchers are still left with their traditional tools and limitations, and that is why they propose two new tools designed to address the need for document analysis at scale. Firstly, they consider a tool to match handwriting, applied to documents that are fragmented and scattered across tens of libraries. They also note the shortcomings of computer software in recommending matching scores without providing persuasive and satisfactory reasoning for researchers, as the ground truth is itself the subject of study and active research. Secondly, they mention a paleographic classification tool that recommends matching styles and dates for a given writing fragment. According to them, it seems like paleography
researchers are interested in the why of recommender system outputs as much as they value their accuracy.

Variational auto-encoders (VAE) [8] train a model to generate feature embeddings under the assumption that samples of a class should have the same embeddings. In [9], given intra-class variance such as illumination and pose, the authors challenge that assumption. They believe that minimizing a loss function risks over-fitting on training data by ignoring each class's essential features; moreover, by minimizing the loss function, the model could learn discriminative features based on intra-class variances. They also illustrate how the model struggles to generalize, as samples from different classes but with the same set of intra-class variances cluster at the central part of the latent space. In addition to the KL-divergence [10] and reconstruction loss terms used in prior work, they add two metric-based loss terms. One of the new terms helps minimize the distance between samples of the same class in the presence of intra-class variations. The other new loss term prevents intra-class variances, which different classes might share, from overpowering the essential features that carry representational value. Their framework, deep variational metric learning (DVML), disentangles class-specific discriminative features from intra-class variations. Furthermore, per their claim, it significantly improves the performance of deep metric learning methods in experiments on the following datasets: CUB-200-2011, Cars196, and Stanford Online Products. In this work, we sample from the latent space by calculating the Kaiming-normal function, also known as He initialization [11], and we use that as epsilon to relate the mean and variance of the data distribution.

In [12], the authors focus on the problem of symbol spotting in graphical documents, architectural drawings in particular. They state the problem in the form of a paradox, as recognizing symbols requires segmenting the input image, while the segmentation task should be done on a recognized region of interest. Furthermore, they want a model that works on digital schematics and scanned documents where distortions and blurriness are natural. They also aim to build an indexing system for engineers and designers who might want to access an old document in an extensive database with a given symbol drawing that could only partially describe the design. With those considerations in mind, the authors propose a vectorization technique that builds symbols as a hierarchy of more basic geometric shapes. Then, they introduce a method for tiling the input document picture in a flexible and input-dependent manner. Their approach approximates arcs with low-poly segments [13,14] and puts constraints on subsets of line segments such as distance ratios, angles, scales, etc. This way, they can model slight variations in the way that symbol queries build full representational graphs.


Fig. 1. Related tables in the NOTAE database

3 Data Preprocessing

3.1 Scraping Public Databases

With their expertise and knowledge of the domain, the NOTAE curators have gathered a database in which one can find documents, hands, symbols, and digital reproductions, among much other useful information. The database tables are connected as a knowledge graph [15] (see Fig. 1). For example, a symbol present inside a script has an associated hand; the hand, in turn, comes from a document with an identification number; we can then retrieve the digital reproduction the symbol comes from using the document ID.

3.2 Cleaning Duplicates

One of the implicit assumptions in dataset design is that sample images are unique. Scraped data is not clean, and it is likely to contain duplicates. Web pictures come from different sources with different sizes and compression algorithms for encoding/decoding, so comparing an image pixel by pixel against the rest of the dataset to determine whether it is a duplicate will not work most of the time. Using coarser features instead of raw pixels can discard many trivial details that are not noticeable to the human eye. Moreover, we observed that some digital reproductions of the same picture have been ever so slightly cropped across public databases. So, the features need to be invariant under minor variations in the data distribution but different enough between two unique pictures. The difference hash (dHash) algorithm [16] processes images and generates fixed-length hashes based on visual features, and it has worked outstandingly well for our use case. In particular, generating 256-bit hashes and then using a relative Hamming distance threshold of 0.25 detects all duplicates. Among duplicate versions of scraped data, we chose the one with the highest resolution. By comparing image hashes against each other, we managed to clean the scraped data and thus create two datasets of unique samples: graphic symbols and digital reproductions. Figure 2 shows the difference in the quality of the obtained results.
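A possible implementation of this duplicate-cleaning step is sketched below, assuming the third-party imagehash and Pillow packages; hash_size=16 yields the 256-bit hashes mentioned above, and all file paths are placeholders.

```python
# Duplicate detection with 256-bit dHash and a relative Hamming-distance threshold of 0.25.
from PIL import Image
import imagehash

def dhash256(path):
    # hash_size=16 gives a 16 x 16 = 256-bit difference hash
    return imagehash.dhash(Image.open(path).convert("L"), hash_size=16)

def find_duplicates(paths, rel_threshold=0.25):
    hashes = [(p, dhash256(p)) for p in paths]
    duplicates = []
    for i in range(len(hashes)):
        for j in range(i + 1, len(hashes)):
            # imagehash overloads subtraction to return the Hamming distance in bits
            if (hashes[i][1] - hashes[j][1]) / 256.0 <= rel_threshold:
                duplicates.append((hashes[i][0], hashes[j][0]))
    return duplicates
```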

Fig. 2. The effect of the proposed solution. On the left, the result without cleaning duplicates; on the right, after the cleaning operation.

3.3 Binarization

Digital reproductions contain various supports such as papyrus, wooden tablets, slate, and parchment. In addition, due to preservation conditions and the passage of time, parts of the documents have been lost, and we deal with partial observations of ancient texts and symbols. Accordingly, a pre-processing step seems necessary to foreground the handwritten parts and clear the background of harmful features and noise. Then, the issue of which threshold works best for such a diverse set of documents surfaces. In that regard, we follow the prior work [1] and hand-pick one value out of five prepared, input-dependent threshold values. We find the first two threshold values by performing K-means clustering on the input image and then choosing the red channel, which is the most indicative value. Next, we calculate the other three thresholds as linear functions of the first two (taking the average, for example). Template matching works on each color channel (RGB) separately, and so it returns three normalized correlation values; consequently, the peak-finding function should take their average in order to find the location of the most probable box (see Sect. 4.2 for more on peak-finding in template matching). However, since document pictures have a wide range of supports with various colors and materials, using color images is suboptimal, whereas binary images work best. First, we remove the background using the selected threshold value. Next, we apply the erosion operator to further remove noise and the marginal parts. Finally, we fill the foreground with black to get the binary image. In our experiments, the binarization step has proven to be at least an order of magnitude more effective in reducing false positives compared to when we tried color images.
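A rough sketch of the threshold preparation and binarization follows, assuming OpenCV and scikit-learn. Running K-means directly on the red channel and using the average as the example linear combination are simplifications of ours; the exact recipe of [1] is not reproduced here.

```python
# Candidate thresholds from K-means on the red channel, then simple binarization + erosion.
import cv2
import numpy as np
from sklearn.cluster import KMeans

def candidate_thresholds(image_bgr, n_clusters=2):
    red = image_bgr[:, :, 2].reshape(-1, 1).astype(np.float32)
    centers = KMeans(n_clusters=n_clusters, n_init=10).fit(red).cluster_centers_.ravel()
    t1, t2 = sorted(centers)
    return [t1, t2, (t1 + t2) / 2.0]      # first two from K-means, third as a linear function

def binarize(image_bgr, threshold):
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    _, fg = cv2.threshold(gray, threshold, 255, cv2.THRESH_BINARY_INV)  # keep dark foreground
    fg = cv2.erode(fg, np.ones((3, 3), np.uint8))                        # remove noise / margins
    return 255 - fg                        # black foreground on a white background
```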

3.4 Dataset Design

The simple baseline begins with the binarization of document pictures and template matching using the NOTAE graphic symbols database. These two steps already make for an end-to-end pipeline that identifies graphic symbols in a given picture (see Fig. 3). Next, we split our preferred set of unique binarized digital reproductions into three subsets: train, validation, and test. The partitioning ratio is 80% training data and 10% each for the test and validation subsets.


Fig. 3. Identification pipeline

3.5 Initial Symbol Clustering

As discussed in the introduction, in this new version of the annotation tool we moved from DBSCAN to OPTICS for symbol clustering. A description of how OPTICS forms denser clusters follows. First, it defines the core-distance as the minimum distance within the Eps-neighborhood of an object such that it satisfies the MaxEps condition. In general, the core-distance is less than MaxEps, and that is why there is a MaxEps rather than a fixed Eps in OPTICS. Then, it uses the core-distance to define the reachability score as a function of one object with respect to another: the reachability of object o with respect to a different object p is defined as the maximum of two values, the core-distance of o or the distance between o and p. Reachability is not defined if objects o and p are not connected. Using a cluster expansion loop with the given core distances, OPTICS can reorder database objects between subclusters and superclusters, where cluster cores come earlier and noise later. Object ordering plus reachability values prove to be much more flexible than the naive cluster-density condition in the way DBSCAN works.
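A minimal clustering sketch with scikit-learn's OPTICS implementation is shown below; the feature dimensionality, distance metric and parameter values are placeholders, not the values used in the tool.

```python
# Clustering identified-symbol features with OPTICS (MaxEps/MinPts-style hyper-parameters).
import numpy as np
from sklearn.cluster import OPTICS

features = np.random.rand(60, 128)               # e.g. embeddings of identified boxes
clustering = OPTICS(min_samples=5, max_eps=0.5, metric="cosine").fit(features)

labels = clustering.labels_                       # -1 marks noise, as with DBSCAN
order = clustering.ordering_                      # cluster ordering of the samples
reachability = clustering.reachability_[order]    # values for a reachability plot
```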

4 Modeling Approach

Our target is to determine very particular graphic symbols in a digital reproduction and find their positions as smaller rectangles inside the picture frame. The NOTAE database supplies the templates we look for, so the simplest possible model is a template matching algorithm. It takes a picture and a set of templates as inputs and returns a set of bounding boxes and the confidence scores assigned to each of them as outputs. One could then select the final predictions from the top of the boxes sorted by their scores. However, in practice, we observed that this simple model returns too many false positives: bounding boxes with relatively high confidence scores that contain non-symbols. Moreover, the rate of false positives increases as a linear function of
the database size. This inefficiency in naive template matching poses a problem, since the NOTAE system design relies on the growth of the database for making its suggestions smarter. So, template matching is a simple and fast model for identifying graphic symbols in a document picture, but it has relatively limited precision. In the identification pipeline, the template matching step is performed for every graphic symbols database object. For one object (template), the algorithm returns a field of correlation densities over the input document picture, as many values as there are pixels in the given picture, and we select the one with the maximum score as the final match. Template matching also uses five different sizes of each object. The scales range from 5% of the picture width up to 20%, because that is about the size of the symbols in documents. Hence, the first step of the pipeline produces five bounding boxes per database object. After template matching is over, we can recover some precision by updating the confidence scores. Fast template matching is possible by transforming visual data from the spatial domain to the frequency domain; one can ignore some high-frequency features to speed up the process and then transform the results back to the spatial domain, and in [17] the Fourier transform is used in this way to reduce computational complexity. However, once the algorithm has queried the database and is done with its prediction process, we can afford to update the confidence scores using a more computationally complex approach that would be quite infeasible right from the beginning. We engineer visual features for both database objects and the regions of interest (ROI) predicted by template matching for that purpose. Suppose u and v are two such features extracted using a method of our choice (u represents a template while v represents an identification ROI, for example). If we find the lengths of these feature vectors, then it becomes easy to see how similar they are:

correlation = <u, v> / (|u| · |v|),

where <·, ·> denotes the inner product on the vector space of features, |·| denotes the length of a vector, and the correlation lies in the closed interval [−1, 1]. Due to reasons that will be discussed later in this section, we can build features in a particular latent space that preserve the same metric from the previous step. In fact, we propose to build a discriminator that uses the correlation between features to update the prediction probabilities and prune away false positives. We already identified potential candidates for graphic symbols in a document picture, then discriminated against some of them based on engineered features, and finally filtered outliers based on size. However, all those steps pertain to individual and local symmetries rather than considering what an ensemble of identifications has in common. That is where clustering and unsupervised classification come into play and further reduce false positives. Using the same engineered features, be it histogram of oriented gradients (HOG) or learned embeddings, a clustering algorithm can group the identified symbols into one cluster and label the rest of the identifications as noise. In this last major step
to improve the results, global symmetries are the main deciding factor as to whether a bounding box should be in the graphic symbols group or not. In the clustering step, individual boxes relate to each other via a distance function. Setting a minimum neighborhood threshold, clusters of specific densities can form, as discussed in the previous section. At the end of the identification pipeline, a non-maximum suppression algorithm is applied: among overlapping bounding boxes, those with lower confidence scores are removed in favor of the top match. See Fig. 3 for a representation of our identification pipeline.
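A sketch of the multi-scale matching step with OpenCV is given below. The five evenly spaced scales and the fixed aspect-ratio resizing are assumptions of ours, and only the top peak over the five scales is returned, mirroring the top-match selection described above.

```python
# Multi-scale template matching over a binarized document image (uint8 grayscale).
import cv2
import numpy as np

def top_match(document, template, scales=np.linspace(0.05, 0.20, 5)):
    doc_w = document.shape[1]
    best = []
    for s in scales:
        w = max(8, int(s * doc_w))
        h = max(8, int(w * template.shape[0] / template.shape[1]))   # keep aspect ratio
        t = cv2.resize(template, (w, h))
        res = cv2.matchTemplate(document, t, cv2.TM_CCOEFF_NORMED)   # normalized correlation field
        _, score, _, (x, y) = cv2.minMaxLoc(res)                     # peak of the correlation field
        best.append((score, (x, y, w, h)))
    return max(best)                                                 # top match over the five scales
```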

4.1 Updating Identification Probabilities

Let T, F, M and D be events: T the event that a box is a true positive, F the event that a box is a false positive, M the event that the template Matching model labels a box as true positive, and D the event that the Discriminator model labels a box as true positive. Also, let MD be the event that both the template Matching and Discriminator models label a box as true positive. Note that the prior probabilities should sum to one: P(T) + P(F) = 1. Next, appealing to Bayes' theorem for the symbol identification task, we can write down the posterior probability of such events:

P(T|MD) = P(MD|T) · P(T) / [P(MD|T) · P(T) + P(MD|F) · P(F)],

or, equivalently,

P(T|MD) = P(MD|T) · P(T) / P(MD).

First, the template matching model acts on the graphic symbols database (the input document picture is implicit here, as it stays constant throughout the pipeline) and returns one match per pixel in the document picture. A suitable cut-off threshold, used as a hyper-parameter, reduces the number of candidates based on the confidence scores, so we only select the top match for each database object (template). Next, the discriminator model acts on the top matches. Thus the template matching model and the discriminator model participate in a function composition at two different levels of abstraction; in this composition, template matching works with raw pixels, whereas discrimination works with high-level embedding vectors:

updated scores = Discriminate ∘ Match(database),

where ∘ denotes function composition, first applying Match and then Discriminate on the database, object by object. If we assume that the events M and D are independent (or only slightly correlated), then we can say that they are conditionally independent given T:

P(MD|T) = P(M|T) · P(D|T).

Therefore the updated probability will be

P(T|MD) = P(M|T) · P(D|T) · P(T) / P(MD).

Performing some computation to simplify the posterior probability:

P(T|MD) = P(M|T) · P(D, T) / P(MD)
        = P(M|T) · P(T|D) · P(D) / P(MD)
        = P(M|T) · P(T|D) / Q(1, 2),

where Q(1, 2) = P(MD) / P(D). Since 1 = P(T|MD) + P(F|MD), it follows that

1 = [P(M|T) · P(T|D) + P(M|F) · P(F|D)] / Q(1, 2),

so Q(1, 2) = P(M|T) · P(T|D) + P(M|F) · P(F|D), and the updated probability is

P(T|MD) = P(M|T) · P(T|D) / [P(M|T) · P(T|D) + P(M|F) · P(F|D)].   (1)

Q: Where do we get the value P(M|T) from?
A: The Average Recall (AR) of the template matching function gives the value for P(M|T). It is the probability that the fast template matching algorithm identifies a symbol given that it is a true symbol.

Q: Where do we get the value P(T|D) from?
A: The Average Precision (AP) of the discriminator function gives the value for P(T|D). It is the probability that a symbol is true given that the discriminator model has labeled it positive.

Q: What does P(M|F) mean?
A: It is the probability that the template matching model identifies a symbol given that it is negative.

Q: What does P(F|D) mean?
A: It is the probability that a symbol is false given that the discriminator model has labeled it positive.

The template matching model produces potential bounding boxes in a digital reproduction with the graphic symbols database. Next in the pipeline, we use an attention mechanism to discriminate for the boxes that are more likely to be true given the digital reproduction. The discriminator is indifferent to the location of the query symbol and only cares about whether the matching box is similar to it or not. Therefore, the discrimination step is in essence an image classification task. Figure 4 shows how the two steps, symbol matching and classification, share the same database objects. The discrimination model, step 5 in Fig. 3, introduces a posterior probability function P(T|D) and assigns to every box a value from −1 to 1. The sequential update of information now changes to first consider event M and then update with event D. Finally, we can normalize the discrimination confidence score by adding one unit and dividing by 2 to get a proper probability value in [0, 1] (an affine transformation), and use it to replace the score from the template matching step. The posterior probability P(T|MD) is correlated to the scores coming from both steps: template matching and discrimination. The rest of the pipeline works the same (see Fig. 5). In the following subsection, we use this result to focus on the feature engineering that maximizes P(T|D), that is, the true positive rate given the second event, discrimination.

Fig. 4. Filtering noise with low overhead as the inference has lower latency.

Fig. 5. Identifications on the left side and ground truth on the right.
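A numerical sketch of the update in Eq. (1) follows, using the AR/AP interpretation given in the questions above; the figures used are placeholders, not measured values.

```python
# Posterior probability that a box is a true positive given both models labeled it positive, Eq. (1).
def posterior_true(p_m_given_t, p_t_given_d, p_m_given_f, p_f_given_d):
    num = p_m_given_t * p_t_given_d
    den = num + p_m_given_f * p_f_given_d
    return num / den

# e.g. AR of template matching = 0.8, AP of the discriminator = 0.6 (placeholders)
print(posterior_true(0.8, 0.6, p_m_given_f=0.3, p_f_given_d=0.4))  # ~0.8
```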

4.2 Latent Clustering

By now, we have established that the probability that a graphic symbol is true, given that the discriminator model has labeled it positive, works based on a correlation between the source symbol and the target identification. As indicated, we need to look more closely at the choice of metric and distance functions, because the more accurately we determine the actual distance between two objects, the better we can reason about whether the two objects in question are related and why. Suppose the distribution of the graphic symbols database is described by a manifold M. Here, we do not assume any structure beyond that there is a probability p(x) that we discover a given object x in it, except for maybe a smooth frame at x for applying convolutional filters. Since it is a complex manifold, as is the case with most objects in the real world, it could be intractable to explain with a reasonable amount of information. Therefore, we defer to a latent manifold M̃, which is finite-dimensional and could potentially explain the most important aspects that we care about in objects from M. What we need here is a map, such as φ, from manifold M into manifold M̃ such that our choice of metric in the latent manifold M̃ results in a predictable corresponding metric in the original manifold M. Accordingly, we can reason about unseen objects knowing that for every input in the domain of the graphic symbols manifold, there will be a predictable output in the co-domain of the latent manifold; predictable in the sense that our metric in the latent space works as expected. In this context, the encoder model plays the part of the inverse of a smooth map. It maps objects from the pixel space onto the latent space:

Encode : pixel space → latent space.

Suppose that p and v are vector representations of an ROI (inside a document picture) and a graphic symbol, respectively. Next, we define a few smooth maps for computing the probabilities of our modeling approach:

P(M|T) := arg max_{i,j} Match(p_{i,j}, v),

given by Match(p_{i,j}, v) = p_{i,j} ∗ v, the inner product between normalized elements from the template matching sliding window at (i, j) of the input picture and normalized database elements, which makes sense if both vector spaces are of the same actual dimension. Here i and j are the maximum arguments of the term on the right, which reflects our process of selecting the top match based on confidence scores. We take the maximum value among the inner products so that it corresponds to the most probable location in the document picture. Similarly,

Discriminate(p_{i,j}, v) := P(T|D),

given by the correlation between the latent representations of p_{i,j} and v. For taking symbols from the pixel space to the latent space (embeddings), we can use the encoder part of a variational auto-encoder (VAE) model. We trained a VAE model on the graphic symbols database in a self-supervised manner to get the embeddings of unseen symbols. The model uses a deep residual architecture (ResNet18 in Fig. 6) [18], and the bottleneck of this neural network is the latent layer from which the features are sampled.
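Under our reading of the text, the discrimination score can be sketched as a cosine similarity between encoder embeddings, followed by the affine map to [0, 1] described in Sect. 4.1. The encoder callable below is assumed to wrap the ResNet18-based VAE encoder; it is a sketch, not the authors' exact code.

```python
# Score an (ROI, symbol) pair with the frozen VAE encoder and cosine similarity.
import torch
import torch.nn.functional as F

@torch.no_grad()
def discriminate(encoder, roi, symbol):
    # roi, symbol: tensors of shape (1, C, H, W), preprocessed like the training data
    z_roi = encoder(roi).flatten(1)
    z_sym = encoder(symbol).flatten(1)
    score = F.cosine_similarity(z_roi, z_sym).item()   # correlation in [-1, 1]
    return (score + 1.0) / 2.0                          # affine map to [0, 1]
```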


Fig. 6. The VAE encodes graphic symbols, upper row, and decodes them, lower row.

4.3 Optimization Objective

The loss function should look like the following equation since, according to Eq. (1) from earlier in this section, we want the training objective to minimize P(M|F) and P(F|D) while maximizing P(M|T) and P(T|D) (up to a proxy function):

L = α · [reconstruction] + β · [KL divergence],

where α and β are hyper-parameters in R. The reconstruction loss term above is the mean square error between the input image and its decoded counterpart. A point in the latent space should be similar to a sample from the normal distribution if we want the model to learn a smooth manifold. When the latent distribution and the normal distribution are the most alike, the KL-divergence loss term should be approximately equal to zero. Adding the relative entropy loss term to the loss function justifies our assumption on the learned manifold being a smooth one.
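In PyTorch form, this objective might look as follows; the closed-form KL term assumes the usual diagonal-Gaussian encoder, and the weights are placeholders.

```python
# Weighted sum of MSE reconstruction and KL-divergence to a standard normal prior.
import torch

def vae_loss(x, x_rec, mu, log_var, alpha=1.0, beta=1.0):
    reconstruction = torch.mean((x - x_rec) ** 2)
    kl = -0.5 * torch.mean(1 + log_var - mu.pow(2) - log_var.exp())
    return alpha * reconstruction + beta * kl
```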

5 Quantifying Model Performance

In order to perform evaluation, it is helpful to imagine the annotation tool as a generic function that maps elements from an input domain to the output. In our case, in particular, we want to map tuples of the form (document picture, symbol) to a bounding box array. As evaluation method, we employed mean Average Precision (mAP) [5], which outputs the ratio of true symbols over all of the identified symbols. Additionally, we annotated the dataset using the Pascal VOC [6] format in order to evaluate the system using well-established tools. We used an object detection model by the moniker CenterNet ResNet50 V2 512 × 512 [19], which was pre-trained on the MS COCO 17 dataset [20]. It is a single-stage detector that has achieved 29.5% mAP with the COCO evaluation tools. In order to repurpose it for our work, we generated annotations for 183 unique digital reproductions using our pipeline and then fine-tuned the object detection model on the annotated data. It is not easy to measure how helpful our approach is using offline training, as the model outputs have to be first justified by the model and then interpreted and validated by domain experts. Therefore, the evaluation protocol in this section merely focuses on the coherency and accuracy of the results. The different variants of the identified symbols datasets are partitioned with different ratios and random seeds, so they also serve as a multi-fold testing apparatus. This section considers improvements in precision, since it is normal for symbol spotting methods to perform well in terms of recall.

In the spirit of an iterative pipeline design, we generated seven different identified symbols datasets, using roughly identical digital reproductions and graphic symbols, to validate our data and modeling approach. For the baseline, we bypassed steps 3 and 5 in the pipeline (Fig. 3) and used HOG features to have a model as close as possible to prior work [1]. Next, we used the encoder with a binary classifier and generated mark 3. This modification puts steps 3 and 5 of the pipeline into effect. We compared the evaluation results of MK3 with those of the baseline model: the precision roughly doubles, suggesting the effectiveness of the discrimination step in improving the true positive rate. Mark 5 follows the same architecture as mark 3; however, it adds discrimination based on bounding box area and foreground density after the discrimination with the posterior probability, which further improved the results (compare the third and the fourth columns in Table 1). Then, we modified the pipeline by training the encoder and hard-wiring a discriminator function to calculate posterior probabilities using cosine similarity. The object detection model trained on the identified symbols mark 6 dataset yielded new evaluation results. MK6 annotations look much better than their predecessors in a qualitative way. Interestingly, MK6 annotations seem to generalize well over different scales (see the bottom image in Fig. 5), as it is the first dataset in the series to identify small symbols as well.

The evaluation of MK3 was the point at which we picked up on the trend that we could gain model performance by focusing more on the data rather than the model. By manually labeling the binarized version of the graphic symbols database, we excluded almost half of the objects as non-symbols to get to a dataset of 722 graphic symbols. So, we should attribute some of the improvements over the baseline model to the data cleaning process. That process called for training the auto-encoder model again with the clean data. Table 2 brings the final improvement rates over the baseline with MK3, MK5, and MK6. We added the validation set to Table 2 and Fig. 7 in order to show that our approach is not sensitive to the choice of hyper-parameters, because test results are strongly correlated with validation results.

Fig. 7. Improvements in mAP validate the pipeline. The horizontal line is the baseline.
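The COCO-style evaluation [5] behind Table 1 can be sketched as follows, assuming the ground truth and detections have been exported to COCO JSON files; the file names are placeholders.

```python
# COCO detection evaluation producing the AP/AR entries reported in Table 1.
from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval

coco_gt = COCO("annotations_test.json")              # hand-validated boxes (placeholder path)
coco_dt = coco_gt.loadRes("detections_test.json")    # detector predictions (placeholder path)

evaluator = COCOeval(coco_gt, coco_dt, iouType="bbox")
evaluator.evaluate()
evaluator.accumulate()
evaluator.summarize()   # prints AP/AR at the IoU thresholds and object sizes used in Table 1
```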


Table 1. Symbol identification performance results related to the identified symbols datasets: the baseline, mark 3, mark 5 and mark 6 (all evaluated on their respective test sets at training step 2000).

Metric        | Baseline | MK3   | MK5   | MK6   | Comment
AP            | 0.011    | 0.022 | 0.048 | 0.028 | AP at IoU = .50:.05:.95 (primary metric)
AP@IoU = .50  | 0.047    | 0.101 | 0.148 | 0.102 | AP at IoU = .50 (PASCAL VOC metric)
AP@IoU = .75  | 0.000    | 0.003 | 0.015 | 0.012 | AP at IoU = .75 (strict metric)
AP@small      | 0.009    | 0.000 | 0.000 | 0.022 | AP for small objects: area < 32²
AP@medium     | 0.016    | 0.021 | 0.028 | 0.033 | AP for medium objects: 32² < area < 96²
AP@large      | 0.000    | 0.051 | 0.083 | 0.025 | AP for large objects: area > 96²
AR@max=1      | 0.003    | 0.008 | 0.019 | 0.006 | AR given 1 detection per image
AR@max=10     | 0.024    | 0.053 | 0.071 | 0.033 | AR given 10 detections per image
AR@max=100    | 0.083    | 0.126 | 0.158 | 0.085 | AR given 100 detections per image
AR@small      | 0.075    | 0.000 | 0.000 | 0.021 | AR for small objects: area < 32²
AR@medium     | 0.102    | 0.130 | 0.094 | 0.088 | AR for medium objects: 32² < area < 96²
AR@large      | 0.000    | 0.076 | 0.246 | 0.086 | AR for large objects: area > 96²

Table 2. Guiding the identification pipeline design by measuring the relative change in mAP, dataset to dataset.

Identified symbols dataset | mAP relative change (valid.) | mAP relative change (test)
Mark 3                     | 69%                          | 51%
Mark 5                     | 102%                         | 119%
Mark 6                     | 80%                          | 86%

MK5 performs at least twice as well as the baseline, and so it is a good candidate to replace it as a new baseline; we therefore expect it to perform as well on unseen data. The following relation allows us to calculate the relative change in mAP:

relative change in mAP = (proposed mAP − baseline mAP) / (baseline mAP) · 100%.

Table 2 presents the relative change in mAP while Table 1 puts the main challenge metric into its proper context. As an illustration, mark 5 outperforms the baseline by 102% and 119% in the validation and test subsets, respectively.

6 Conclusions

In this paper, we have shown how the detection scores provided by fast template matching can be the key to annotating extensive databases in an efficient way. In previous work, the idea was that the bigger the database grows, the more brilliant the symbol engine gets. However, larger databases also cause more false positives due to inefficiencies in template matching. In this work, we
first removed duplicates and then hand-picked binarized versions of the scraped images. Then, we designed a series of identified graphic symbols datasets to validate our hypotheses on data and modeling. The confidence scores of symbol matching were updated using a binary classifier whose discriminative features are sampled from the latent space as an approximation of the original space. Next, we justified our assumptions about the effectiveness of our distance function in providing a metric for filtering false positives. Not only did we manage to recover the results of the baseline model, but there was also a significant improvement in model performance across the validation and test subsets. Even though many false positives make it through the final stage of the pipeline, we illustrated how a trained detection model generalizes well on the annotated data and why it solves the paradox of segmenting for spotting or spotting for segmentation. Our approach applies to intelligent assistants for database curators and researchers. In a domain where labeled data is scarce, we have adopted evaluation metrics that enable researchers to quantify model performance with weakly labeled data. The fact that modifications to the pipeline have a clear impact on model performance in terms of the relative change in mAP helps define a reward function. Based on the behavior of model performance, we believe that the relative change in mAP could introduce a new term to the loss function. In future work, we would like to see agents that can use this metric to fill in the gaps between sparse learning signals from domain experts during interactive training sessions.

References

1. Boccuzzi, M., et al.: Identifying, classifying and searching graphic symbols in the NOTAE system. In: Ceci, M., Ferilli, S., Poggi, A. (eds.) IRCDL 2020. CCIS, vol. 1177, pp. 111–122. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-39905-4_12
2. Ankerst, M., Breunig, M.M., et al.: OPTICS: ordering points to identify the clustering structure. ACM SIGMOD Rec. (1999). https://doi.org/10.1145/304181.304187
3. Ester, M., Kriegel, H.P., et al.: A density-based algorithm for discovering clusters in large spatial databases with noise. In: Proceedings of the 2nd International Conference on Knowledge Discovery and Data Mining, Portland, OR, pp. 226–231. AAAI Press (1996)
4. Huang, J., Rathod, V., et al.: Speed/accuracy trade-offs for modern convolutional object detectors. In: CVPR (2017)
5. COCO detection evaluation. https://cocodataset.org/#detection-eval. Accessed 17 Mar 2021
6. Everingham, M., Winn, J.: The PASCAL Visual Object Classes Challenge 2012 (VOC2012) Development Kit (2012)
7. Wolf, L., Potikha, L., Dershowitz, N., Shweka, R., Choueka, Y.: Computerized paleography: tools for historical manuscripts. In: 18th IEEE International Conference on Image Processing, pp. 3545–3548 (2011). https://doi.org/10.1109/ICIP.2011.6116481
8. Kingma, D.P., Welling, M.: Auto-Encoding Variational Bayes (2014). arXiv:1312.6114


9. Lin, X., Duan, Y., Dong, Q., Lu, J., Zhou, J.: Deep variational metric learning. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11219, pp. 714–729. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01267-0_42
10. Kullback, S., Leibler, R.A.: On information and sufficiency. Ann. Math. Statist. 22(1), 79–86 (1951). https://doi.org/10.1214/aoms/1177729694
11. He, K., et al.: Delving deep into rectifiers: surpassing human-level performance on ImageNet classification. In: Proceedings of the IEEE International Conference on Computer Vision (2015)
12. Rusiñol, M., Lladós, J.: Symbol spotting in technical drawings using vectorial signatures. In: Liu, W., Lladós, J. (eds.) GREC 2005. LNCS, vol. 3926, pp. 35–46. Springer, Heidelberg (2006). https://doi.org/10.1007/11767978_4
13. Ramer, U.: An iterative procedure for the polygonal approximation of plane curves. Comput. Graph. Image Process. 1(3), 244–256 (1972). https://doi.org/10.1016/S0146-664X(72)80017-0
14. Douglas, D.H., Peucker, T.K.: Algorithms for the reduction of the number of points required to represent a digitized line or its caricature (2011). https://doi.org/10.1002/9780470669488.ch2
15. Bernasconi, E., et al.: Exploring the historical context of graphic symbols: the NOTAE knowledge graph and its visual interface. In: IRCDL 2021, pp. 147–154 (2021)
16. Krawetz, N.: Kind of Like That. The Hacker Factor Blog (2013). http://www.hackerfactor.com/blog/?/archives/529-Kind-of-Like-That.html. Accessed 29 June 2021
17. Lewis, J.P.: Fast template matching. In: Vision Interface 95, Canadian Image Processing and Pattern Recognition Society, Quebec City, Canada, pp. 120–123 (1995)
18. He, K., Zhang, X., et al.: Deep residual learning for image recognition (2015). arXiv:1512.03385
19. Duan, K., Bai, S., et al.: CenterNet: keypoint triplets for object detection (2019). arXiv:1904.08189
20. Lin, T.-Y., et al.: Microsoft COCO: common objects in context. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8693, pp. 740–755. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-10602-1_48

Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence and indicate if changes were made. The images or other third party material in this chapter are included in the chapter’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the chapter’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

ICDAR 2021 Workshop on Camera-Based Document Analysis and Recognition (CBDAR)

CBDAR 2021 Preface

We are glad to welcome you to the proceedings of the 9th edition of the International Workshop on Camera-based Document Analysis and Recognition (CBDAR 2021). CBDAR 2021 builds on the success of the previous eight editions held in 2019 (Sydney, Australia), 2017 (Kyoto, Japan), 2015 (Nancy, France), 2013 (Washington DC, USA), 2011 (Beijing, China), 2009 (Barcelona, Spain), 2007 (Curitiba, Brazil), and 2005 (Seoul, South Korea). CBDAR aims to move away from the comfort zone of scanned paper documents and to investigate innovative ways of capturing and processing both paper documents and other types of human-created information in the world around us, using cameras. Since the first edition of CBDAR in 2005, the research focus has shifted many times, but CBDAR's mission to provide a natural link between document image analysis and the wider computer vision community, by attracting cutting-edge research on camera-based document image analysis, has remained very relevant.

In this 9th edition of CBDAR we received eight submissions, coming from authors in four different countries (Brazil, India, Japan, and the USA). Each submission was carefully reviewed by three expert reviewers. The Program Committee of the workshop comprised 12 members. We would like to take this opportunity to thank all the reviewers for their meticulous reviewing efforts. Taking into account the recommendations of the reviewers, we selected six papers for presentation in the workshop, which results in an acceptance rate of 75% for CBDAR 2021. We would like to thank the MDPI Journal of Imaging for sponsoring the best paper award and the CircularSeas project, co-financed by the Interreg Atlantic Area Program through the European Regional Development Fund, for supporting the successful organization of CBDAR 2021.

The participation of attendants from both academia and industry has remained an essential aspect of CBDAR, and the program of this edition was carefully crafted to appeal to both. Apart from the presentation of new scientific work, we were pleased to include an invited talk. We hope that the program of CBDAR 2021 attracted much interest in the community and that the participants enjoyed the workshop which, in order to facilitate participation, was held in a hybrid mode. We are looking forward to meeting again in person at a future CBDAR event.

September 2021

Sheraz Ahmed
Muhammad Muzzamil Luqman

Organization

Workshop Chairs

Sheraz Ahmed (DFKI, Kaiserslautern, Germany)
Muhammad Muzzamil Luqman (L3i, La Rochelle University, France)

Program Committee

Anna Zhu (Wuhan University of Technology, China)
Cheng-Lin Liu (Chinese Academy of Sciences, China)
C. V. Jawahar (IIIT, India)
Dimosthenis Karatzas (Universitat Autonoma de Barcelona, Spain)
Faisal Shafait (National University of Science and Technology, Pakistan)
Joseph Chazalon (EPITA Research and Development Lab., France)
Kenny Davila (Universidad Tecnológica Centroamericana, Honduras)
Masakazu Iwamura (Osaka Prefecture University, Japan)
Muhammad Imran Malik (National University of Science and Technology, Pakistan)
Nibal Nayef (MyScript, France)
Petra Gomez-Krämer (L3i, La Rochelle University, France)
Umapada Pal (Indian Statistical Institute, Kolkata, India)

Inscription Segmentation Using Synthetic Inscription Images for Text Detection at Stone Monuments

Naoto Morita, Ryunosuke Inoue, Masashi Yamada(B), Takatoshi Naka, Atsuko Kanematsu, Shinya Miyazaki, and Junichi Hasegawa

School of Engineering, Chukyo University, Toyota, Japan
{t317076,t317009}@m.chukyo-u.ac.jp, {myamada,t-naka,kanematsu,miyazaki,hasegawa}@sist.chukyo-u.ac.jp

Abstract. Stone monuments have historical value, and the inscriptions engraved on them can tell us about the events and people at the time of their installation. Photography is an easy way to record inscriptions; however, the light falling on the monument, the resulting shadows, and the innate texture of the stone can make the text in the photographs unclear and difficult to recognize. This paper presents a method for inferring pixel-wise text areas in a stone monument image by developing a deep learning network that can deduce the shape of kanji characters. Our method uses pseudo-inscription images for training a deep neural network, which are generated by synthesizing a shaded image representing the engraved text with a stone texture image. Through experiments using a High Resolution Net (HRNet), we confirm that the HRNet achieves high accuracy in the task of inscription segmentation and that training with pseudo-inscription images is effective in detecting inscriptions on real stone monuments. Thus, synthetic inscription images can facilitate efficient and accurate detection of text on stone monuments, thereby contributing to further history research.

Keywords: Synthetic data · Inscription · Text detection · Deep learning

1 Introduction

Stone monuments have an important place in history research. The inscriptions engraved on them can vividly inform us about the events that occurred at the time of their installation and the life of people living at that time. For example, the "Three Stelae of Kozuke Province" in Japan is a group of historical monuments and an important source of information about cultural interactions in East Asia from the 7th to the 8th century. Photography has been used as an easy way to record these inscriptions. However, the light present at the time of photography, the resulting shadows created, and the basic texture of the stone can make the text in the photographs indistinct and difficult to recognize. This
paper proposes a method for inferring pixel-wise text areas from the inscription images of stone monuments using deep learning that would help to decipher the script engraved on the stone. In an inscription image, the text area corresponds to the area on the image that has been engraved according to the shape of the text characters. The image on the left side of Fig. 1 is an actual inscription. Depending on how the light hits the engraved area, a shadow is cast over part of the engraved area; the brightness of the reflected light varies depending on the inclination of the engraved surface. In other words, the shading in the text area is not uniform. Hence, it is difficult to detect the text area using simple binarization, such as Otsu’s method. Therefore, regarding the process as a semantic segmentation task would better infer the original shape of the text from the shading. Thus, the current paper refers to this task as inscription segmentation. The inscription segmentation method can infer the shape of the engraved text from the shape of the shadows and shading features, even if the shading of a text area is not uniform. The binary image on the right side of Fig. 1 represents an ideal inscription segmentation that was created manually by checking the inscription image visually. The white pixels indicate the text areas. The goal of this research is to develop a deep learning network that can make such inferences. If the image output from such a network is used in conjunction with the original photo, it can support the task of reading the text. It can also be used as a preprocessing method for automatic text recognition.

Fig. 1. (left) The engraved parts are not uniform in shade. (right) The white pixels indicate the text areas; This image was created manually by checking the inscription image visually.

Fig. 2. Factors making text obscure: dirt, stone texture, and weathering.

Our target is stone monuments in Japan that contain Chinese characters, ‘kanji’. However, deep learning requires sufficient training data and datasets of stone monuments containing kanji are not yet available. In addition, it takes a lot of effort to create a dataset for training from images of actual stone monuments. Therefore, we use computer generated technology to generate pseudo-inscription images, which are then used for deep learning. Generally, it is necessary to learn the character features of all the classes in order to recognize individual characters in an inscription. There are more than 2,000 classes of kanji characters currently


used in Japan, and more than 5,000 if we include characters used in older eras. However, for our purpose of detecting a text area, it is not necessary to learn the character features of all the classes because kanji characters are composed of basic strokes. If we can detect the individual strokes, the text area has been detected. Based on this assumption, we created 50 images containing 900 distinct kanji characters in total and generated pseudo-inscription images from them. The procedure we perform to generate a pseudo-inscription image is as follows: We create a three-dimensional shape representing the engraved characters; we set a light source, render it to generate a shaded image, and synthesize the shaded image with a stone texture image prepared in advance. Using pseudoinscription images as training data, we aim to obtain a model that is capable of detecting text areas. Note that in actual stone monuments, the factors that obscure the shading of the engraved text are not only the stone texture but also other factors that include dirt and weathering, as seen in Fig. 2. The pseudoinscription images we generated did not take into account heavy dirt and weathering. In our experiments, we will determine how severe stains and weathering affect the results of inscription segmentation. To the best of our knowledge, this is the first study that involves training a network using pseudo-inscription images and detecting text engraved on stone monuments. The main contribution of this study can be summarized as follows: (1) We propose a low-cost image synthesis technique suitable for dynamic pseudo-inscription image generation in the learning process. (2) We verify that the network trained with pseudo-inscription images outperforms conventional binarization methods in the task of inscription segmentation.

2 Related Work

The tasks of text detection can be categorized into the task of estimating the region surrounding the text and the segmentation task of detecting pixel-wise areas of individual characters. Baek et al. proposed CRAFT for the former task [2]. Document binarization and inscription segmentation belong to the latter task. There are several studies that have attempted to perform text detection and text recognition on inscription images. Bhat et al. proposed a method for restoring degraded inscriptions on ancient stone monuments in India using a binarization method based on phase-based feature maps and geodesic morphology [3]. Liu et al. proposed a method for obtaining the bounding box of characters engraved on oracle bones using a conventional model of object detection [12]. Using an entropy-based feature extraction algorithm, Qin et al. proposed another method for obtaining the bounding box of characters from a scene with stone monuments [17]. Binarization is a conventional approach to separate text and background from document images. In the case of historical documents, binarization is not an easy task due to the presence of faded text, ink bled through the page, and stains. Kitadai et al. proposed a deciphering support system using four


basic binarization methods [10]. Peng et al. proposed a binarization method incorporating multi-resolution attention that achieved better accuracy than the best model in the ICDAR2017 competition [15,16]. In addition, methods using a two-dimensional morphological network [13] and cascading modular U-nets [9] have been proposed. The difference between inscription segmentation of stone monuments and binarization of document images tasks is that the text on a document image is black or gray with almost constant shading, whereas the text on stone monument images is incomplete and needs to be reconstructed based on shadows and shading. It is also necessary to deal with a variety of stone textures. The objective of this study is inscription segmentation of Japanese stone monuments. Datasets that can be used for deep learning have not been created or are not publicly available. Therefore, in this study, we generate pseudoinscription images and use them to train the network. There has been research to artificially generate images of text and use them to increase the accuracy of text recognition. Jaderberg et al. [7] generated text in various fonts, styles, and arrangements, and created a dataset of automatically generated text images that are difficult to recognize. They showed that a network trained on it is effective for text recognition in real scenes. Tensmeyer et al. proposed a DGT-CycleGAN that generated highly realistic synthetic text data [22]. However, these networks are not intended for inscriptions and do not consider engraved text. The purpose of this study is to verify the effectiveness of our generated pseudo-data for inscription segmentation, select a representative deep network, and optimize it for this purpose. Some of the representative networks that have been proposed for semantic segmentation are a fully convolutional network (FCN), SegNet, Unet, FCDenseNet, DeepLabV3, and High Resolution Net (HRNet) [1,4,8,18–21]. HRNet, in particular, has achieved higher accuracy than the other networks. In this study, we use an HRNet in an optimized form.

3 Proposed Method

In this section, the method of generating pseudo-inscription images and the deep neural network used in this research are described.

3.1 Creating Images of Pseudo-inscription

The pseudo-inscription image is generated using the following procedure:
1. Generate the text image.
2. Generate the three-dimensional mesh data with the text engraved into it.
3. Generate a shaded text image by setting a light source and rendering it.
4. Blend with a stone texture image.

The text image is composed of the handwritten kanji characters found in a calligraphy instruction book [14] containing 1,000 different classes of characters written in three different calligraphy styles: standard (‘kaisyo-cho’), cursive (‘sosho’), and running (‘gyo-sho’). Of these, 50×18 kai-sho characters are extracted,


18 at a time, and transferred to a single image, making a total of 50 images. The text image is a binary image with a spatial resolution of 320×448. Each of the 50 images is numbered and denoted by m1 , m2 , · · · , m50 . The text images created are shown in Fig. 3.

Fig. 3. Fifty binary text images including 18 distinct kanji characters per image, 18×50 distinct kanji characters in total.

Fig. 4. Three-dimensional model of engraved text. Two types of engraving: vertical (left) and wedge-shaped (right).

Fig. 5. Polar coordinates of light location (r, θ, φ) : r = ∞, θ = 30◦ , 45◦ , 60◦ , φ = 0◦ , 45◦ , · · · , 315◦ .

Next, we prepare a planar grid mesh with the same resolution as the text image, and move the mesh vertices corresponding to the interior of the characters in the −z-axis direction. This generates a mesh that represents the three-dimensional shape of the engraved character. Two types of engraving methods are adopted, as seen in Fig. 4. The first is vertical engraving, in which all the z-coordinates of the vertices inside the character are set to z = −c, where c is an arbitrary constant satisfying c > 0. The second is wedge-shaped engraving. For this, the binary text image is distance-transformed to find the distance d from the nearest black pixel; the z-coordinates of the vertices inside the character are set to z = −(ad)^(1/3), where a is an arbitrary constant satisfying a > 0. We determine the values of c and a as follows: The stroke width in the binary text images is about


10 pixels. Therefore, we set the value of c to 5 so that the engraving depth of the vertical type is half the stroke width. For the wedge-shaped type, we set the value of a to 10 so that the maximum depth is 3.6 when the stroke width is 10. Next, the three-dimensional mesh is loaded into the renderer (Blender 2.83 LTS), parallel light sources are set, and a shaded image is generated. (r, θ, φ) is the polar coordinate representation of the position of the light source. All combinations of r = ∞, θ = 0°, 45°, · · ·, 315°, and φ = 30°, 45°, 60° are rendered. This results in 24 shaded images with different shading from a single three-dimensional mesh. Figure 4 shows two examples of shaded images generated with the two different engraving methods. Figure 5 shows the 24 different light source directions (θ, φ). The number of shaded images generated from a single text image m_k is 48 (2 types of engraving methods and 24 light source positions). The set of 48 images is denoted by S_k = {s_k1, s_k2, · · ·, s_k48}. The next step synthesizes the shaded image with a stone texture. Solid textures are often used to virtually reproduce stone texture [11]. However, mapping a solid texture to an engraved three-dimensional mesh is computationally more expensive than synthesizing two-dimensional images. In this study, in order to perform data augmentation at a low cost, a pre-prepared image of the stone texture is combined with the shaded image generated earlier. Figure 6 shows the 18 different stone texture images that were actually used. These are free images that are publicly available as stone slab images. When blending the stone textures in the data augmentation method described below, they are randomly scaled to produce synthetic images with diversity. In this section, we explain the synthesis method. The images used for synthesis are the shaded, stone texture, and binary text images; s, t, and m represent the shaded, stone texture, and binary text images, respectively, and m̄ denotes the inverted image of m. The blended image is called a pseudo-inscription image, denoted by p and calculated as

p = m̄ ⊙ s ⊙ t + α m + (1 − α) m ⊙ s ⊙ t,    (1)

where ⊙ represents the Hadamard product (element-wise product). The term m̄ ⊙ s ⊙ t represents the non-text area, which is taken from s ⊙ t. The text area is composed of the binary text image m and s ⊙ t with a blend ratio of α. In this study, 0 ≤ α ≤ 0.3 is used. Figure 7 shows examples of a pseudo-inscription image p for α = 0.0, 0.1, 0.2, 0.3. There are two reasons to blend the binary text image m into the text area. The first is that the engraved text area is often light gray in actual stone monuments. The second is to facilitate the learning process by providing samples in which it is difficult (e.g., α = 0.0 in Fig. 7) and easy (e.g., α = 0.3 in Fig. 7) to detect text areas during the training phase.
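To make the generation step concrete, the sketch below reproduces the two engraving depth profiles and the blending of Eq. (1) in Python. It is only an illustration of the formulas above: the function and variable names are ours, the Blender rendering step is not reproduced, and all images are assumed to be arrays of the same shape with values in [0, 1].

    import numpy as np
    from scipy.ndimage import distance_transform_edt

    def engraving_depth(text_mask, mode="wedge", c=5.0, a=10.0):
        # text_mask: binary text image, 1 = character (white), 0 = background (black)
        if mode == "vertical":
            return np.where(text_mask > 0, -c, 0.0)           # z = -c inside the strokes
        d = distance_transform_edt(text_mask > 0)              # distance to the nearest black pixel
        return np.where(text_mask > 0, -np.cbrt(a * d), 0.0)   # z = -(a*d)^(1/3)

    def blend_pseudo_inscription(s, t, m, alpha):
        # Eq. (1): p = m_bar*s*t + alpha*m + (1 - alpha)*m*s*t, all products element-wise
        m_bar = 1.0 - m
        return m_bar * s * t + alpha * m + (1.0 - alpha) * m * s * t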


Fig. 6. Eighteen texture images of a stone slab.

Fig. 7. Pseudo-inscription images created by blending shaded, texture, and binary text images: α = 0.0, 0.1, 0.2, 0.3.

3.2 Network Structure

We use an optimized form of an HRNet. The input is a 3-channel image, and the output is a 1-channel image. The structure of the HRNet used in this study is shown in Fig. 8. The numbers below the feature map denote the numbers of channels.

Fig. 8. Structure of HRNet.

This HRNet has two down-sampling blocks and two up-sampling blocks at the beginning and end, respectively, and a high-resolution module in between. Considering the memory limitation of the GPU used in this research, down-sampling is performed first before inputting to the high-resolution module. The down-sampling and up-sampling blocks change the spatial resolution by factors of 1/2 and 2, respectively. The high-resolution module is composed of down-sampling blocks, up-sampling blocks, conv blocks, residual bottleneck blocks, and residual basic blocks. The implementation details of these blocks are shown in Table 1. In this implementation, the HRNet has equal input and output resolutions when the input resolution is a multiple of 32.


The residual block used is not ResNet [6], but the one proposed in Deep Pyramidal Residual Networks [5]. The image input to the high-resolution module first passes through four bottleneck blocks. The image processed in the bottleneck block is input to both the conv and down-sampling blocks and is converted into a 32-channel image with the resolution maintained and a 64-channel image with the resolution halved. Thereafter, in the high-resolution module, the number of image channels is doubled for each down-sampling and halved for each up-sampling. At the transition to each resolution, down-sampling or up-sampling is repeated until the resolution reaches that of the destination image. At that point, all the transitioned images are added before the ReLU function at the end of the down-sampling and up-sampling blocks, and the rectified linear unit (ReLU) function is then applied. In addition, all residual blocks are basic blocks, except for the four bottleneck blocks immediately after the input to the module. The high-resolution module finally up-samples the low-resolution images to match the resolution of all the images, applies the ReLU function, and outputs their concatenation.

Table 1. Configuration of each block in HRNet.

Down-sampling Blk:        conv k:4, s:2, p:1 → BatchNorm → ReLU
Up-sampling Blk:          up conv k:4, s:2, p:1 → BatchNorm → ReLU
Conv Blk:                 conv k:3, s:1, p:1 → BatchNorm → ReLU
                          (last block of HRNet: conv k:3, s:1, p:1 only)
Residual Bottleneck Blk:  BatchNorm → conv k:1, s:1, p:0 → BatchNorm → ReLU → conv k:3, s:1, p:1
                          → BatchNorm → ReLU → conv k:1, s:1, p:0 → BatchNorm
Residual Basic Blk:       BatchNorm → conv k:3, s:1, p:1 → BatchNorm → ReLU → conv k:3, s:1, p:1 → BatchNorm
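As a reading aid, the following PyTorch sketch spells out the layer orderings of Table 1. It is not the authors' implementation: the block and argument names are ours, and for brevity the residual blocks assume equal input and output channel counts, whereas the pyramidal residual blocks of [5] grow the channel count and zero-pad the shortcut.

    import torch.nn as nn

    class DownsamplingBlk(nn.Sequential):          # conv k:4, s:2, p:1 -> BatchNorm -> ReLU
        def __init__(self, in_ch, out_ch):
            super().__init__(nn.Conv2d(in_ch, out_ch, 4, stride=2, padding=1),
                             nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True))

    class UpsamplingBlk(nn.Sequential):            # up conv k:4, s:2, p:1 -> BatchNorm -> ReLU
        def __init__(self, in_ch, out_ch):
            super().__init__(nn.ConvTranspose2d(in_ch, out_ch, 4, stride=2, padding=1),
                             nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True))

    class ConvBlk(nn.Sequential):                  # conv k:3, s:1, p:1 -> BatchNorm -> ReLU
        def __init__(self, in_ch, out_ch):
            super().__init__(nn.Conv2d(in_ch, out_ch, 3, padding=1),
                             nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True))

    class ResidualBasicBlk(nn.Module):             # BN -> conv3 -> BN -> ReLU -> conv3 -> BN (+ skip)
        def __init__(self, ch):
            super().__init__()
            self.body = nn.Sequential(
                nn.BatchNorm2d(ch), nn.Conv2d(ch, ch, 3, padding=1),
                nn.BatchNorm2d(ch), nn.ReLU(inplace=True),
                nn.Conv2d(ch, ch, 3, padding=1), nn.BatchNorm2d(ch))
        def forward(self, x):
            return x + self.body(x)

    class ResidualBottleneckBlk(nn.Module):        # BN -> conv1 -> BN -> ReLU -> conv3 -> BN -> ReLU -> conv1 -> BN (+ skip)
        def __init__(self, ch):
            super().__init__()
            self.body = nn.Sequential(
                nn.BatchNorm2d(ch), nn.Conv2d(ch, ch, 1),
                nn.BatchNorm2d(ch), nn.ReLU(inplace=True),
                nn.Conv2d(ch, ch, 3, padding=1),
                nn.BatchNorm2d(ch), nn.ReLU(inplace=True),
                nn.Conv2d(ch, ch, 1), nn.BatchNorm2d(ch))
        def forward(self, x):
            return x + self.body(x)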

4 Experiments

This section describes the two experiments that were conducted as part of this study. In the first experiment, we vary the number of data used for training and observe the relation between the number of training data and accuracy. We assume that it is not necessary to learn all classes of kanji characters for inscription segmentation because if the stroke of a character can be detected then characters that are combinations of strokes can also be detected. The validity of this assumption is verified through this experiment. In the second experiment, the accuracy of the proposed network (which is trained on pseudo-inscription images) in segmenting actual inscription images is measured. Thus, we verify whether the network trained with only pseudo-inscription images can detect the text of real inscriptions.

4.1 Training Data, Validation Data, and Data Augmentation

The input to the network is a pseudo-inscription image that is generated dynamically by the data augmentation process described below. The target is a binary text image. In the training and validation phases, we first specify the range of shaded image sets S_1, S_2, · · ·, S_50 to be used for inscription image generation. In this experiment, we specify 10 sets S_1, S_2, · · ·, S_10 for the validation data and n sets S_11, S_12, · · ·, S_{10+n} for the training data. For instance, for n = 5, the 5 sets from S_11 to S_15 are used for training. Each set S_k = {s_k1, · · ·, s_k48} has 48 images. Therefore, a total of N_val = 480 shaded images are used for the validation phase and N_train = 48n shaded images for the training phase. Pseudo-inscription images to be input to the network are generated by the following data augmentation method:

1. Contrast-stretch the shaded image s_ki so that the minimum and maximum intensities are 0 and 255, respectively. Let s′ be the image obtained by this processing.
2. Determine the crop size (w, h) = (r_0, r_0) and crop position c_0 randomly. Obtain the cropped image from s′, resize it to 192 × 192 so that the resolution is a multiple of 32, and let s″ be the obtained image. Produce the cropped image from the binary text image m_k using the same crop size and crop position, resize it in the same way, and let m″ be the obtained image.
3. Randomly select a stone texture image t_j. Randomly determine the crop size (r_1, r_1) and crop position c_1. Obtain the cropped image from t_j, resize it to 192 × 192, and let t″ be the obtained image.
4. Randomly determine the blending ratio α and synthesize s″, m″, and t″ as in Eq. 1 to generate a pseudo-inscription image p of size 192 × 192.
5. Randomly change the hue, saturation, and brightness of p.

Figure 9 shows 16 examples of the results of the aforementioned data augmentation process from step 1 to 4. The binary text image m″, generated in step 2, is the target of the network. In the experiment, 96 ≤ r_0, r_1 ≤ 288; this sets the magnification of the scaling transformations between 2/3 and 2. In the training and validation phases, we also add images of stone texture without text as negative samples to the input data of the network. In this case, the target image is an image in which all pixels are black. When the number of data for training and validation is N, the number of negative samples added is 0.4N, which accounts for approximately 29% of the total.
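A compact version of steps 1-4 is sketched below for a single training pair. It is only an illustration under our own assumptions: grayscale float images in [0, 1], shaded images already contrast-stretched (step 1), textures at least 288 pixels on each side, OpenCV used for resizing, and the colour jitter of step 5 left out; none of the names come from the authors' code.

    import numpy as np
    import cv2

    RNG = np.random.default_rng()

    def augment_sample(shaded, mask, textures, out_size=192):
        # step 2: random (r0 x r0) crop of the shaded image and text mask, resized to 192 x 192
        r0 = int(RNG.integers(96, 289))
        y = int(RNG.integers(0, shaded.shape[0] - r0 + 1))
        x = int(RNG.integers(0, shaded.shape[1] - r0 + 1))
        s = cv2.resize(shaded[y:y + r0, x:x + r0], (out_size, out_size))
        m = cv2.resize(mask[y:y + r0, x:x + r0], (out_size, out_size),
                       interpolation=cv2.INTER_NEAREST)
        # step 3: random stone texture, random (r1 x r1) crop, resized to 192 x 192
        tex = textures[int(RNG.integers(len(textures)))]
        r1 = int(RNG.integers(96, 289))
        ty = int(RNG.integers(0, tex.shape[0] - r1 + 1))
        tx = int(RNG.integers(0, tex.shape[1] - r1 + 1))
        t = cv2.resize(tex[ty:ty + r1, tx:tx + r1], (out_size, out_size))
        # step 4: blend with a random alpha in [0, 0.3] following Eq. (1)
        alpha = RNG.uniform(0.0, 0.3)
        p = (1.0 - m) * s * t + alpha * m + (1.0 - alpha) * m * s * t
        return p, m            # network input and target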

4.2 Implementation Details

The loss function is the mean squared error. In the training phase, the learning rate is 10^-4. The optimization algorithm used is adaptive moment estimation (Adam). The batch size is 48, and the samples are shuffled randomly. All experiments are implemented in PyTorch and run on a workstation with a 3.5 GHz 10-core CPU, 64 GB RAM, a GeForce RTX 2080 Ti, and Windows 10. For n = 40, the training and validation phases of 300 epochs take about 5 h.
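The stated setup corresponds roughly to the following PyTorch training skeleton. It is a minimal sketch under our own assumptions: the model and dataset objects are placeholders, and validation, scheduling, and logging are omitted.

    import torch
    from torch import nn

    def train(model, train_set, epochs=300, device="cuda"):
        model = model.to(device)
        criterion = nn.MSELoss()                                     # mean squared error loss
        optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)    # adaptive moment estimation
        loader = torch.utils.data.DataLoader(train_set, batch_size=48, shuffle=True)
        for _ in range(epochs):
            for image, target in loader:
                image, target = image.to(device), target.to(device)
                optimizer.zero_grad()
                loss = criterion(model(image), target)
                loss.backward()
                optimizer.step()
        return model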


Fig. 9. Pseudo-inscription images generated by data augmentation processing.

4.3 Results for Validation Data

The shaded image sets used in the training phase are S_11, S_12, · · ·, S_{10+n}. In this experiment, we compare the values of the loss function during training for five cases: n = 1, 5, 10, 20, and 40. The number of classes of kanji characters appearing in the training data is 18, 90, 180, 360, and 720, respectively. To make the total number of iterations equal, the maximum number of epochs is set to 12000, 2400, 1200, 600, and 300, respectively. The validation data are S_1, S_2, · · ·, S_10, and the number of classes appearing in them is 180; these classes are different from the character classes used for training. The loss during training is shown in Fig. 10. From left to right, the graphs correspond to n = 1, 5, 10, 20, and 40. In each graph, the horizontal axis is the number of epochs and the vertical axis is the loss. The smaller n is, the larger the difference in loss between the training and validation data. This overfitting suggests that the number of classes of kanji characters in the training data is not sufficient. The loss on the validation data becomes smaller as n becomes larger, but there is little difference in the loss between n = 20 and n = 40. This indicates that the effect of increasing the number of classes in the training data is attenuated.

Fig. 10. Loss for training and validation data: From left to right, these are loss graphs when n is 1, 5, 10, 20, and 40.

Table 2 shows the precision, recall, and f-measure for the validation data. To calculate these, a single-channel image output from the network is binarized with a threshold of 128. The precision, recall, and f-measure are defined as follows: precision = TP / (TP + FP), recall = TP / (TP + FN), and f-measure = 2 · precision · recall / (precision + recall), where TP, FP, and FN are the numbers of pixels with true positives, false positives, and false negatives, respectively. Furthermore, when n = 40, we train up to a maximum epoch of 500 so that the loss would converge. The bottom row of Table 2 shows the results for n = 40 with a maximum of 500 epochs.
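For reference, the pixel-wise metrics can be computed as in the short sketch below (our own helper, assuming an 8-bit network output and a binary ground-truth mask).

    import numpy as np

    def precision_recall_fmeasure(output, target, threshold=128):
        pred = output >= threshold            # binarize the network output at 128
        gt = target > 0                       # white pixels are the text area
        tp = np.logical_and(pred, gt).sum()
        fp = np.logical_and(pred, ~gt).sum()
        fn = np.logical_and(~pred, gt).sum()
        precision = tp / max(tp + fp, 1)
        recall = tp / max(tp + fn, 1)
        f = 2 * precision * recall / max(precision + recall, 1e-12)
        return precision, recall, f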


Table 2. Accuracy for validation data.

Setting         HRNet
n    Epochs     Precision   Recall   f-measure
1    12000      0.69        0.54     0.60
5    2400       0.85        0.74     0.79
10   1200       0.86        0.73     0.79
20   600        0.89        0.84     0.86
40   300        0.89        0.78     0.83
40   500        0.92        0.86     0.88

Fig. 11. Inference results for validation data: input (top row), output of HRNet (second row), and ground truth (third row).

The output of the network for the eight inputs of the validation phase is shown in Fig. 11. The first row shows the input images before color jittering. The fourth input image from the left is a negative sample containing only a stone texture without text. The second row shows the outputs of HRNet, and the third row shows the targets. Visual evaluation confirms that our HRNet performs sufficiently accurate segmentation.

4.4 Results for Real Inscription Images

Next, the results for actual stone monuments are described. The images of actual stone monuments in this experiment were taken at Hakodate Park and Koushoji Temple in Japan. The image of the actual stone monument is resized at 0.05 intervals from 0.1 to 1.8, for a total of 32 different magnifications. The result with the highest value of f-measure is used for evaluation. Figure 12 shows the actual image of the stone monument, HRNet output, and ground truth. The ground truth was created through visual examination of the monument image; the parts that were difficult to see owing to various factors such as dirt were inferred by using knowledge of kanji. For the stone monument in (a), the text areas of “ ”, “ ”, and “ ” are detected correctly. For the stone monument in (b), the over-detection of the stone texture is prevented, and the darker shadows in the stroke are detected. However, the entire stroke is not detected, and the output stroke is thin in some areas. For the stone monument


in (c), text areas that are difficult to see with the naked eye are detected, but there is over-detection at the right edge and top of the output image. For the stone monument in (d), the text in the dirty area in the center is not sufficiently detected. In order to cope with this problem, it is necessary to create and train a pseudo-inscription image that reproduces various kinds of stains.

Fig. 12. Inference results for four real inscription images: input (left), output of HRNet (center), and ground truth (right).

The second column of Table 3 shows the precision, recall, and f-measure for the same examples as Fig. 12. The values are inferior to the accuracy for the pseudo-inscription images described earlier. This suggests the network might overfit pseudo-inscription images. However, we believe that the output results are valid enough for the purpose of assisting inscription reading. Next, we compare the quantitative accuracy of our method with that of binarization methods without learning. Kitadai et al. proposed a deciphering support system [10] using the following four basic binarization methods: (1) converting to grayscale and thresholding, (2) thresholding each BGR component channel and obtaining their logical product, (3) thresholding the difference image from the representative background color, and (4) converting to hsv and thresholding the saturation component channel. In our experiment, two types of thresholding were used: Otsu’s method and adaptive thresholding. In addition, the mean color of the input image was used in (3) instead of the representative background color. These four binarization methods and two types of thresholding were applied to the input image to obtain eight binarization results. The results with the highest f-measure among the eight binarization results are shown in the third column of Table 3. For all the four samples, the f-measure of HRNet is better than that of the conventional binarization methods.
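For concreteness, our reading of the four baseline binarizations is sketched below with OpenCV. Only the Otsu variants are shown, and details such as the binarization polarity are our own choices rather than those of [10].

    import cv2
    import numpy as np

    def binarization_baselines(bgr):
        # Otsu variants of the four methods described above; which side of the threshold
        # counts as text (the *_INV flags) is our own choice.
        results = {}
        gray = cv2.cvtColor(bgr, cv2.COLOR_BGR2GRAY)
        _, results["gray"] = cv2.threshold(gray, 0, 255,
                                           cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)          # (1)
        chans = [cv2.threshold(bgr[:, :, c], 0, 255,
                               cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1] for c in range(3)]
        results["bgr_and"] = cv2.bitwise_and(cv2.bitwise_and(chans[0], chans[1]), chans[2])   # (2)
        diff = np.linalg.norm(bgr.astype(np.float32) - bgr.reshape(-1, 3).mean(0), axis=2)
        diff = cv2.normalize(diff, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
        _, results["mean_diff"] = cv2.threshold(diff, 0, 255,
                                                cv2.THRESH_BINARY + cv2.THRESH_OTSU)          # (3)
        sat = cv2.cvtColor(bgr, cv2.COLOR_BGR2HSV)[:, :, 1]
        _, results["saturation"] = cv2.threshold(sat, 0, 255,
                                                 cv2.THRESH_BINARY + cv2.THRESH_OTSU)         # (4)
        return results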


Table 3. Accuracy achieved for real inscription images shown in Fig. 12 (a), (b), (c) and (d).

Input   Image Size   HRNet                              Binarization proposed in [10]
                     Precision   Recall   f-measure     Precision   Recall   f-measure
a       160 × 480    0.72        0.85     0.78          0.70        0.55     0.62
b       96 × 256     0.74        0.82     0.78          0.14        0.62     0.23
c       576 × 800    0.61        0.73     0.66          0.16        0.55     0.25
d       572 × 768    0.91        0.50     0.65          0.10        0.44     0.17

Figure 13 shows the other three stone monuments and the output results. In (a), part of the character “ ” in the second line from the left is illegible due to weathering. The pseudo-inscription image does not currently represent this kind of weathering; hence, the strokes in the weathered area are not detected in the output result. The stone monument in (b) is difficult to read because of the high contrast texture. Many text areas are not detected because of this. The inscription in (c) is written in a peculiar calligraphy style. Our pseudo-inscription image contains only characters in the standard calligraphy style, ‘kai-sho’, and may need to be adapted to a variety of calligraphy styles. From these experimental results, we can confirm that inscription segmentation is possible by learning pseudo-inscription images and can be used to support reading. The accuracy for real stone monuments can be further improved by increasing the variation of pseudo-inscription images by increasing the variation of stone texture patterns, calligraphy styles, character placement, and character size. The proposed method assumes that the user can eventually select an appropriate image resolution. In order to reduce the human efforts, the network model needs to be improved to be robust to the variety of input image resolutions. In addition, a more rigorous quantitative evaluation using a dataset with more real images is needed to confirm the practicality of the method.

Fig. 13. Inference results for three real stone monuments including weathering (left), intense texture of shades (center), and peculiar calligraphy style (right).

5 Conclusion

In this paper, we proposed a method for generating pseudo-inscription images. The proposed method is a low-cost technique to synthesize shaded images and stone textures, and is suitable for dynamic pseudo-inscription image generation in the learning process. In the experiments, an HRNet was trained using this pseudo-inscription image. We confirmed that inscription segmentation does not necessarily require learning every class of kanji characters and that learning images containing 720 characters can be used for inscription segmentation of 180 other different characters. The network trained on the pseudo-inscription images was applied to the inscription segmentation of real stone monuments. The results showed that the network was capable of inscription segmentation and could be used to support reading. Thus, the proposed approach can contribute to history research involving text detection. However, in order to improve the accuracy for real images, it is necessary to increase the variation of pseudo-inscription images, and deal with dirt and weathering. In addition, in order to prove that every kanji character can be detected, we need to prepare a dataset including more than 5,000 kinds of kanji characters and verify the detection accuracy. A more rigorous quantitative evaluation using a dataset with more real images is also needed to confirm the practicality of the method. In future works, we plan to expand the dataset, and conduct these experiments. Acknowledgement. This work was supported by JSPS KAKENHI Grant Number JP20H01304.

References 1. Badrinarayanan, V., Kendall, A., Cipolla, R.: SegNet: a deep convolutional encoder-decoder architecture for image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 39(12), 2481–2495 (2017) 2. Baek, Y., Lee, B., Han, D., Yun, S., Lee, H.: Character region awareness for text detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2019 3. Bhat, S., Seshikala, G.: Restoration of characters in degraded inscriptions using phase based binarization and geodesic morphology. Int. J. Recent Technol. Eng. 7(6), 1070–1075 (2019) 4. Chen, L., Papandreou, G., Schroff, F., Adam, H.: Rethinking atrous convolution for semantic image segmentation. CoRR abs/1706.05587 (2017). http://arxiv.org/ abs/1706.05587 5. Han, D., Kim, J., Kim, J.: Deep pyramidal residual networks. In: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 6307–6315 (2017) 6. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 770–778 (2016) 7. Jaderberg, M., Simonyan, K., Vedaldi, A., Zisserman, A.: Synthetic data and artificial neural networks for natural scene text recognition. In: Workshop on Deep Learning, NIPS (2014)


8. J´egou, S., Drozdzal, M., Vazquez, D., Romero, A., Bengio, Y.: The one hundred layers Tiramisu: Fully convolutional DenseNets for semantic segmentation. In: 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), pp. 1175–1183 (2017) 9. Kang, S., Iwana, B.K., Uchida, S.: Cascading modular u-nets for document image binarization. In: 2019 International Conference on Document Analysis and Recognition (ICDAR), pp. 675–680 (2019) 10. Kitadai, A., Saito, K., Hachiya, D., Nakagawa, M., Baba, H., Watanabe, A.: Support system for archaeologists to read scripts on Mokkans. In: 2005 International Conference on Document Analysis and Recognition (ICDAR), vol. 2, pp. 1030–1034 (2005) 11. Kopf, J., Fu, C.W., Cohen-Or, D., Deussen, O., Lischinski, D., Wong, T.T.: Solid texture synthesis from 2D exemplars. ACM Trans. Graph. (Proc. SIGGRAPH 2007) 26(3), 2:1–2:9 (2007) 12. Liu, G., Xing, J., Xiong, J.: Spatial pyramid block for oracle bone inscription detection. In: Proceedings of the 2020 9th International Conference on Software and Computer Applications, pp. 133–140 (2020) 13. Mondal, R., Chakraborty, D., Chanda, B.: Learning 2D morphological network for old document image binarization. In: 2019 International Conference on Document Analysis and Recognition (ICDAR), pp. 65–70 (2019) 14. Ono, G.: Thousand Character Classic in Three Styles, Kai, Gyo and So. Maar-sha (1982). (in Japanese) 15. Peng, X., Wang, C., Cao, H.: Document binarization via multi-resolutional attention model with DRD loss. In: 2019 International Conference on Document Analysis and Recognition (ICDAR), pp. 45–50 (2019) 16. Pratikakis, I., Zagoris, K., Barlas, G., Gatos, B.: ICDAR 2017 competition on document image binarization (DIBCO 2017). In: 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), vol. 01, pp. 1395– 1403 (2017) 17. Qin, X., Chu, X., Yuan, C., Wang, R.: Entropy-based feature extraction algorithm for stone carving character detection. J. Eng. 2018(16), 1719–1723 (2018) 18. Ronneberger, O., Fischer, P., Brox, T.: U-Net: convolutional networks for biomedical image segmentation. In: Navab, N., Hornegger, J., Wells, W.M., Frangi, A.F. (eds.) MICCAI 2015. LNCS, vol. 9351, pp. 234–241. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-24574-4 28 19. Shelhamer, E., Long, J., Darrell, T.: Fully convolutional networks for semantic segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 39(4), 640–651 (2017) 20. Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 5686–5696 (2019) 21. Sun, K., et al.: High-resolution representations for labeling pixels and regions. CoRR abs/1904.04514 (2019) 22. Tensmeyer, C., Brodie, M., Saunders, D., Martinez, T.: Generating realistic binarization data with generative adversarial networks. In: 2019 International Conference on Document Analysis and Recognition (ICDAR), pp. 172–177 (2019)

Transfer Learning for Scene Text Recognition in Indian Languages Sanjana Gunna(B) , Rohit Saluja , and C. V. Jawahar Centre for Vision Information Technology, International Institute of Information Technology, Hyderabad 500032, India {sanjana.gunna,rohit.saluja}@research.iiit.ac.in, [email protected], https://github.com/firesans/STRforIndicLanguages

Abstract. Scene text recognition in low-resource Indian languages is challenging because of complexities like multiple scripts, fonts, text size, and orientations. In this work, we investigate the power of transfer learning for all the layers of deep scene text recognition networks from English to two common Indian languages. We perform experiments on the conventional CRNN model and STAR-Net to ensure generalisability. To study the effect of change in different scripts, we initially run our experiments on synthetic word images rendered using Unicode fonts. We show that the transfer of English models to simple synthetic datasets of Indian languages is not practical. Instead, we propose to apply transfer learning techniques among Indian languages due to similarity in their n-gram distributions and visual features like the vowels and conjunct characters. We then study the transfer learning among six Indian languages with varying complexities in fonts and word length statistics. We also demonstrate that the learned features of the models transferred from other Indian languages are visually closer (and sometimes even better) to the individual model features than those transferred from English. We finally set new benchmarks for scene-text recognition on Hindi, Telugu, and Malayalam datasets from IIIT-ILST and Bangla dataset from MLT17 by achieving 6%, 5%, 2%, and 23% gains in Word Recognition Rates (WRRs) compared to previous works. We further improve the MLT-17 Bangla results by plugging in a novel correction BiLSTM into our model. We additionally release a dataset of around 440 scene images containing 500 Gujarati and 2535 Tamil words. WRRs improve over the baselines by 8%, 4%, 5%, and 3% on the MLT-19 Hindi and Bangla datasets and the Gujarati and Tamil datasets. Keywords: Scene text recognition · Transfer learning · Photo OCR Multi-lingual OCR · Indian languages · Indic OCR · Synthetic data

1 Introduction

Scene-text recognition or Photo-Optical Character Recognition (Photo-OCR) aims to read scene-text in natural images. It is an essential step for a wide c Springer Nature Switzerland AG 2021  E. H. Barney Smith and U. Pal (Eds.): ICDAR 2021 Workshops, LNCS 12916, pp. 182–197, 2021. https://doi.org/10.1007/978-3-030-86198-8_14


Fig. 1. Clockwise from top-left; “Top: Annotated Scene-text images, Bottom: Baselines’ predictions (row-1) and Transfer Learning models’ predictions (row-2)”, from Gujarati, Hindi, Bangla, Tamil, Telugu and Malayalam. Green, red, and “ ” represent correct predictions, errors, and missing characters, respectively. (Color figure online)

variety of computer vision tasks and has enjoyed significant success in several commercial applications [9]. Photo-OCR has diverse applications like helping the visually impaired, data mining of street-view-like images for information used in map services, and geographic information systems [2]. Scene-text recognition conventionally involves two steps; i) Text detection and ii) Text recognition. Text detection typically consists of detecting bounding boxes of word images [4]. The text recognition stage involves reading cropped text images obtained from the text detection stage or from the bounding box annotations [13]. In this work, we focus on the task of text recognition. The multi-lingual text in scenes is a crucial part of human communication and globalization. Despite the popularity of recognition algorithms, non-Latin language advancements have been slow. Reading scene-text in such low resource languages is a challenging research problem as it is generally unstructured and appears in diverse conditions such as scripts, fonts, sizes, and orientations. Hence a large amount of dataset is usually required to train the scene-text recognition models. Conventionally, the synthetic dataset is used to deal with the problem since a large number of fonts are available in such low resource languages [13]. The synthetic data may also serve as an exciting asset to perform controlled experiments, e.g., to study the effect of transfer learning with the change in script or language text. We investigate such effects for transfer from English to two Indian languages in this work, i.e., Hindi and Gujarati. We also explore the transferability of features among six different Indian languages. We share 2500 scene text word images obtained from over 440 scenes in Gujarati and Tamil to demonstrate such effects. In Fig. 1, we illustrate the sample annotated images from our datasets, and IIIT-ILST and MLT datasets, and the predictions of our


models. The overall methodology we follow is that we first generate the synthetic datasets in the six Indian languages. We describe the dataset generation process and motivate the work in Sect. 2. We then train the two deep neural networks we introduce in Sect. 3 on the individual language datasets. Subsequently, we apply transfer-learning on all the layers of different networks from one language to another. Finally, as discussed in Sect. 4, we fine-tune the networks on standard datasets and examine their performance on real scene-text images in Sect. 5. We finally conclude the work in Sect. 6. The summary of our contributions are as follows: 1. We investigate the transfer learning of complete scene-text recognition models i) from English to two Indian languages and ii) among the six Indian languages, i.e., Gujarati, Hindi, Bangla, Telugu, Tamil, and Malayalam. 2. We also contribute two datasets of around 500 word images in Gujarati and 2535 word images in Tamil from a total of 440 Indian scenes. 3. We achieve gains of 6%, 5%, and 2% in Word Recognition Rates (WRRs) on IIIT-ILST Hindi, Telugu, and Malayalam datasets in comparison to previous works [13,20]. On the MLT-19 Hindi and Bangla datasets and our Gujarati and Tamil datasets, we observe the WRR gains of 8%, 4%, 5%, and 3%, respectively, over our baseline models. 4. For the MLT-17 Bangla dataset, we show a striking improvement of 15% in Character Recognition Rate (CRR) and 24% in WRR compared to Buˇsta et al. [2], by applying transfer-learning from another Indian language and plugging in a novel correction RNN layer into our model. 1.1

Related Work

We now discuss datasets and associated works in the field of photo-OCR. Works of Photo-OCR on Latin Datasets: As stated earlier, the process of Photo-OCR conventionally includes two steps: i) Text detection and ii) Text recognition. With the success of Convolutional Neural Networks (CNN) for object detection, the works have been extended to text detection, treating words or lines as the objects [12,27,37]. Liao et al. [10] extend such works to real-time detection in scene images. Karatzas et al. [8] and Buˇsta et al. [1] present more efficient and accurate methods for text detection. Towards reading scene-text, Wang et al. [30] propose an object recognition pipeline based on a ground truth lexicon. It achieves competitive performance without the need for an explicit text detection step. Shi et al. [21] propose a Convolutional Recurrent Neural Network (CRNN) architecture, which integrates feature extraction, sequence modeling, and transcription into a unified framework. The model achieves remarkable performances in both lexicon-free and lexicon-based scene-text recognition tasks. Liu et al. [11] introduce Spatial Attention Residue Network (STAR-Net) with spatial transformer-based attention mechanism to remove image distortions, residue convolutional blocks for feature extraction, and an RNN block for decoding the text. Shi et al. [22] propose a segmentation-free Attentionbased method for Text Recognition (ASTER) by adopting Thin-Plate-Spline


(TPS) as a rectification unit. It tackles complex distortions and reduces the difficulty of irregular text recognition. The model incorporates ResNet to improve the network’s feature representation module and employs an attention-based mechanism combined with a Recurrent Neural Network (RNN) to form the prediction module. Uber-Text is a large-scale Latin dataset that contains around 117K images captured from 6 US cities [36]. The images are available with linelevel annotations. The French Street Name Signs (FSNS) data contains around 1000K annotated images, each with four street sign views. Such datasets, however, contain text-centric images. Reddy et al. [16] recently release RoadText-1K to introduce challenges with generic driving scenarios where the images are not text-centric. RoadText-1K includes 1000 video clips (each 10 s long at 30 fps) from the BDD dataset, annotated with English transcriptions [32]. Works of Photo-OCR on Non-Latin Datasets: Recently, there has been an increasing interest in scene-text recognition for non-Latin languages such as Chinese, Korean, Devanagari, Japanese, etc. Several datasets like RCTW (12k scene images), ReCTS-25k (25k signboard images), CTW (32k scene images), and RRC-LSVT (450k scene images) from ICDAR’19 Robust Reading Competition (RRC) exist for Chinese [23,25,33,35]. Arabic datasets like ARASTEC (260 images of signboards, hoardings, and advertisements) and ALIF (7k text images from TV Broadcast) also exist in the scene-text recognition community [28,31]. Korean and Japanese scene-text recognition datasets include KAIST (2, 385 images from signboards, book covers, and English and Korean characters) and DOST (32k sequential images) [5,7]. The MLT dataset available from the ICDAR’17 RRC contains 18k scene images (around 1 − 2k images per language) in Arabic, Bangla, Chinese, English, French, German, Italian, Japanese, and Korean [15]. The ICDAR’19 RRC builds MLT-19 over top of MLT-17 to contain 20k scene images containing text from Arabic, Bangla, Chinese, English, French, German, Italian, Japanese, Korean, and Devanagari [14]. The RRC also provides 277k synthetic images in these languages to assist the training. Mathew et al. [13] train the conventional encoder-decoder, where Convolutional Neural Network (CNN) encodes the word image features. An RNN decodes them to produce text on synthetic data for Indian languages. Here an additional connectionist temporal classification (CTC) layer aligns the RNN’s output to labels. The work also releases an IIIT-ILST dataset for testing that reports Word Recognition Rates (WRRs) of 42.9%, 57.2%, and 73.4% on 1K real images in Hindi, Telugu, and Malayalam, respectively. Buˇsta et al. [2] proposes a CNN (and CTC) based method for text localization, script identification, and text recognition. The model is trained and tested on 11 languages of MLT-17 dataset. The WRRs are above 65% for Latin and Hangul and are below 47% for the remaining languages. The WRR reported for Bengali is 34.20%. Recently, an OCR-on-the-go model and obtain the WRR of 51.01% on the IIIT-ILST Hindi dataset and the Character Recognition Rate (CRR) of 35% on a multi-lingual dataset containing 1000 videos in English, Hindi, and Marathi [20]. Around 2322 videos in these languages recorded with controlled camera movements like tilt, pan, etc., are additionally shared at https://catalist-2021.github.io/.


Table 1. Statistics of synthetic data. μ, σ represent mean, standard deviation.

Language     # Images   Train   Test    μ, σ word length   # Fonts
English      17.5M      17M     0.5M    5.12, 2.99         >1200
Gujarati     2.5M       2M      0.5M    5.95, 1.85         12
Hindi        2.5M       2M      0.5M    8.73, 3.10         97
Bangla       2.5M       2M      0.5M    8.48, 2.98         68
Tamil        2.5M       2M      0.5M    10.92, 3.75        158
Telugu       5M         5M      0.5M    9.75, 3.43         62
Malayalam    7.5M       7M      0.5M    12.29, 4.98        20

Transfer Learning in Photo-OCR: With the advent of deep learning in the last decade, transfer learning became an essential part of vision models for tasks such as detection and segmentation [17,18]. The CNN layers pre-trained on the ImageNet classification dataset are conventionally used in such models for better initialization and performance [19]. The scene-text recognition works also use the CNN layers from models pre-trained on the ImageNet dataset [11,21,22]. However, to the best of our knowledge, there are no significant efforts on transfer learning from one language to another in the field of scene-text recognition, although transfer learning seems to be naturally suitable for reading low-resource languages. We investigate the possibilities of transfer learning in all the layers of deep photo-OCR models.

2 Datasets and Motivation

We now discuss the datasets we use and the motivation for our work. Synthetic Datasets: As shown in Table 1, we generate 2.5M , or more, word images each in Hindi, Bangla, Tamil, Telugu, and Malayalam1 with the methodology proposed by Mathew et al. [13]. For each Indian language, we use 2M images for training our models and the remaining set for testing. Sample images of our synthetic data are shown in Fig. 2. For English, we use the models pretrained on the 9M MJSynth and 8M SynthText images [3,6]. We generate 0.5M synthetic images in English with over 1200 fonts for testing. As shown in Table 1, English has a lower average word length than Indian languages. We list the Indian languages in the increasing order of language complexity, with visually similar scripts placed consecutively, in Table 1. Gujarati is chosen as the entry point from English to Indian languages as it has the lowest word length among all Indian languages. Subsequently, like English, Gujarati does not have a top-connector line that connects different characters to form a word in Hindi and Bangla (refer to Fig. 1 and 2). Also, the number of Unicode fonts available in Gujarati is 1

For Telugu and Malayalam, our models trained on 2.5M word images achieved results lower than previous works, so we generate more examples equal to Mathew et al. [13].


Fig. 2. Clockwise from top-left: synthetic word images in Gujarati, Hindi, Bangla, Tamil, Telugu, & Malayalam. Notice that a top-connector line connects the characters to form a word in Hindi or Bangla. Some vowels and characters appear above and below the generic characters in Indian languages, unlike English.

fewer than those available in other Indian languages. Next, we choose Hindi, as Hindi characters are similar to Gujarati characters and the average word length of Hindi is higher than Gujarati. Bangla has comparable word length statistics with Hindi and shares the property of the top-connector line with Hindi. Still, we keep it after Hindi in the list as its characters are visually dissimilar and more complicated than Gujarati and Hindi. We use less than 100 for fonts in Hindi, Bangla, and Telugu. We list Tamil after Bangla because these languages share similar vowels’ appearance (see the glyphs above general characters in Fig. 2). Tamil and Malayalam have the highest variability in word length and visual complexity compared to other languages. Please note that we have over 150 fonts available in Tamil. Real Datasets: We also perform experiments on the real datasets from IIITILST, MLT-17, and MLT-19 datasets (refer to Sect. 1.1 for these datasets). To enlarge scene-text recognition research in complex and straight forward lowresource Indian Languages, we release 500 and 2535 annotated word images in Gujarati and Tamil. We crop the word images from 440 annotated scene images, which we obtain by capturing and compiling Google images. We illustrate sample annotated images of different datasets in Fig. 1. Similar to MLT datasets, we annotate the Gujarati and Tamil datasets using four corner points around each word (see Tamil image at bottom-right of Fig. 1). IIIT-ILST dataset has twopoint annotations leading to an issue of text from other words in the background of a cropped word image as shown in the Hindi scene at the top-middle of Fig. 1. Motivation: As discussed earlier in Sect. 1.1, most of the scene-text recognition works use the pre-trained Convolutional Neural Networks (CNN) layers for improving results. We now motivate the need for transfer learning of the com-


Fig. 3. Distribution of Char. n-grams (n ∈ [1, 5]) from 2.5M words in English, Gujarati, Hindi, Bangla, and Tamil (top to bottom): Top-5 (left) and All (right).

plete recognition models discussed in Sect. 1 and the models we use in Sect. 3 among different languages. As discussed in these sections, the Recurrent Neural Networks (RNNs) form another integral component of such reading models. Therefore, we illustrate the distribution of character-level n-grams they learn in Fig. 32 for the first five languages we discussed in the previous section (we notice that the last two languages also follow the similar trend). On the left, we show 2

For plots on the right, we use moving average of 10, 100, 1000, 1000, 1000 for 1-grams, 2-grams, 3-grams, 4-grams, and 5-grams, respectively.


the frequency distribution of top-5 n-grams, (n ∈ [1, 5]). On the right, we show the frequency distribution of all n-grams with n ∈ [1, 5]. We use 2.5M words from each language for these plots. We consider both capital and small letters separately for English, as it is crucial for the text recognition task. Despite this, we note that top-5 n-grams are composed of small letters. The Indian languages, however, do not have small and capital letters like English. However, the total number of English letters (given that small letters are different from capitals) is of the same order as Indian languages. The x-values (≤100) for the drops in 1-gram plots (blue curves) of Fig. 3 also illustrates this. So it becomes possible to compare the distributions. Next, we note that most of the top-5 n-grams comprise vowels for all the languages. Moreover, the overall distributions are similar for all the languages. Hence, we propose that the RNN layers’ transfer among the models of different languages is worth an investigation. It is important to note the differences between the n-grams of English and Indian languages. Many of the top-5 n-grams in English are the complete word forms, which is not the case with Indian languages owing to their richness in inflections (or fusions) [29]. Also, note that the second and the third 1-gram for Hindi and Bangla in Fig. 3 (left), known as Halanta, is a common feature of top-5 Indic n-grams. The Halanta forms an essential part of joint glyphs or aksharas (as advocated by Vinitha et al. [29]). In Figs. 1 and 2, the vowels, or portions of the joint glyphs for word images in Indian languages, often appear above the top-connector line or below the generic consonants. All this, in addition to complex glyphs in Indian languages, makes transfer learning from English to Indian languages ineffective, which is detailed in Sect. 5. Thus, we also investigate the transferability of features among the Indic scene-text recognition models in the subsequent sections.
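The n-gram statistics summarized here can be reproduced with a few lines of Python; the snippet below is a generic helper of ours (the vocabulary file name is only illustrative, not part of the released data).

    from collections import Counter

    def char_ngram_distribution(words, max_n=5):
        counts = {n: Counter() for n in range(1, max_n + 1)}
        for w in words:
            for n in range(1, max_n + 1):
                counts[n].update(w[i:i + n] for i in range(len(w) - n + 1))
        return counts

    # e.g. the five most frequent character trigrams of a word list stored in vocab.txt:
    # top5 = char_ngram_distribution(open("vocab.txt", encoding="utf-8").read().split())[3].most_common(5)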

3 Models

This section explains the two models we use for transfer learning in Indian languages and a plug-in module we propose for learning the correction mechanism in the recognition systems. CRNN Model: The first model we train is Convolutional-Recurrent Neural Network (CRNN), which is the combination of CNN and RNN as shown in Fig. 4 (left). The CRNN network architecture consists of three fundamental components, i) an encoder composed of the standard VGG model [24], ii) a decoder consisting of RNN, and iii) a Connectionist Temporal Classification (CTC) layer to align the decoded sequence with ground truth. The CNN-based encoder consists of seven layers to extract feature representations from the input image. The model abandons fully connected layers for compactness and efficiency. It replaces standard squared pooling with 1 × 2 sized rectangular pooling windows for 3rd and 4th max-pooling layer to yield feature maps with a larger width. A two-layer Bi-directional Long Short-Term Memory (BiLSTM) model, each with a hidden size of 256 units, then decodes the features. During the training phase, the CTC layer provides non-parameterized supervision to align the decoded predictions


Fig. 4. CRNN model (left) and STAR-Net with a correction BiLSTM (right).

with the ground truth. The greedy decoding is used during the testing stage. We use the PyTorch implementation of the model by Shi et al. [21]. STAR-Net: As shown in Fig. 4 (right), the STAR-Net model consists of three components, i) a Spatial Transformer to handle image distortions, ii) a Residue Feature Extractor consisting of a residue CNN and an RNN, and iii) a CTC layer to align the predicted and ground truth sequences. The transformer consists of a spatial attention mechanism achieved via a CNN-based localization network, a sample, and an interpolator. The localizer predicts the parameters of an affine transformation. The sampler and the nearest-neighbor interpolator use the transformation to obtain a better version of the input image. The transformed image acts as the input to the Residue Feature Extractor, which includes the CNN and a single-layer BiLSTM of 256 units. The CNN used here is based on the inception-resnet architecture, which can extract robust image features required for the task of scene-text recognition [26]. The CTC layer finally provides the non-parameterized supervision for text alignment. The overall model consists of 26 convolutional layers and is end-to-end trainable [11]. Correction BiLSTM: After training the STAR-Net model on a real dataset, we add a correction BiLSTM layer (of size 1×256), an end-to-end trainable module, to the end of the model (see Fig. 4 top-right). We train the complete model again on the same dataset to implicitly learn the error correction mechanism.
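The exact wiring of the correction BiLSTM is not spelled out above, so the following PyTorch sketch should be read only as one plausible realization under our assumptions: a bidirectional LSTM of hidden size 256 placed on the per-timestep features of the trained recognizer, followed by a fresh projection to the CTC label space (the feature and class sizes are illustrative).

    import torch
    from torch import nn

    class CorrectionBiLSTM(nn.Module):
        def __init__(self, feat_dim=512, hidden=256, num_classes=100):
            super().__init__()
            self.rnn = nn.LSTM(feat_dim, hidden, bidirectional=True, batch_first=True)
            self.proj = nn.Linear(2 * hidden, num_classes)   # logits fed to the CTC loss

        def forward(self, feats):            # feats: (batch, time, feat_dim) from the base model
            out, _ = self.rnn(feats)
            return self.proj(out)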


Table 2. Results of individual CRNN & STAR-Net models on synthetic datasets.

Language     CRNN-CRR   CRNN-WRR   STAR-Net-CRR   STAR-Net-WRR
English      77.13      38.21      86.04          57.28
Gujarati     94.43      81.85      97.80          91.40
Hindi        89.83      73.15      95.78          83.93
Bangla       91.54      70.76      95.52          82.79
Tamil        82.86      48.19      95.40          79.90
Telugu       87.31      58.01      92.54          71.97
Malayalam    92.12      70.56      95.84          82.10

4 Experiments

The images, resized to 150 × 18, form the input of STAR-Net. The spatial transformer module, as shown in Fig. 4 (right), then outputs the image of size 100×32. The inputs to the CNN Layers of CRNN and STAR-Net are of the same size, i.e., 100 × 32, and the output size is 25 × 1 × 256. The STAR-Net localization network has four plain convolutional layers with 16, 32, 64, and 128 channels. Each layer has the filter size, stride, and padding size of 3, 1, and 1, followed by a 2 × 2 max-pooling layer with a stride of 2. Finally, a fully connected layer of size 256 outputs the parameters which transform the input image. We train all our models on 2M or more synthetic word images as discussed in Sect. 2. We use the batch size of 16 and the ADADELTA optimizer for stochastic gradient descent (SGD) for all the experiments [34]. The number of epochs varies between 10 to 15 for different experiments. We test our models on 0.5 M synthetic images for each language. We use the word images from IIIT-ILST, MLT-17, and MLT-19 datasets for testing on real datasets. We fine-tune the Bangla models on 1200 training images and test them on 673 validation images from the MLT-17 dataset to fairly compare with Buˇsta et al. [1]. Similarly, we fine-tune only our best Hindi model on the MLT-19 dataset and test it on the IIIT-ILST dataset to compare with OCR-on-the-go (since it is also trained on real data) [20]. To demonstrate generalizability, we also test our models on 3766 Hindi images and 3691 Bangla images available from MLT-19 datasets [14]. For Gujarati and Tamil, we use 75% of word images to fine-tune our models and the remaining 25% for testing.

5 Results

In this section, we discuss the results of our experiments with i) individual models for each language, ii) the transfer learning from English to two Indian languages, and iii) the transfer learning from one Indian language to another. Performance on Synthetic Datasets: It is essential to compare the results on synthetic datasets of different languages sharing common backgrounds, as it provides a good intuition about the difficulty in reading different scripts. In Tables 2


Table 3. Results of transfer learning (TL) on synthetic datasets. Parentheses contain results from Table 2. TL among Indic scripts improves STAR-Net results.

Language             | CRNN-CRR      | CRNN-WRR      | STAR-Net-CRR  | STAR-Net-WRR
English → Gujarati   | 92.71 (94.43) | 77.06 (81.85) | 97.50 (97.80) | 90.90 (91.40)
English → Hindi      | 88.11 (89.83) | 70.12 (73.15) | 94.50 (95.78) | 80.90 (83.93)
Gujarati → Hindi     | 91.98 (89.83) | 73.12 (73.15) | 96.12 (95.78) | 84.32 (83.93)
Hindi → Bangla       | 91.13 (91.54) | 70.22 (70.76) | 95.66 (95.52) | 82.81 (82.79)
Bangla → Tamil       | 81.18 (82.86) | 44.74 (48.19) | 95.95 (95.40) | 81.73 (79.90)
Tamil → Telugu       | 87.20 (87.31) | 56.24 (58.01) | 93.25 (92.54) | 74.04 (71.97)
Telugu → Malayalam   | 90.62 (92.12) | 65.78 (70.56) | 94.67 (95.84) | 77.97 (82.10)

and 3, we present the results of our experiments with synthetic datasets. As noted in Table 2, the CRNN model achieves the Character Recognition Rates (CRRs) and Word Recognition Rates (WRRs) of i) 77.13% and 38.21% in English and ii) above 82% and 48% on the synthetic dataset of all the Indian languages (refer to columns 1 and 2 of Table 2). The low accuracy on the English synthetic test set is due to the presence of more than 1200 different fonts (refer Sect. 2). Nevertheless, using a large number of fonts in training helps in generalizing the model for real settings [3,6]. The STAR-Net achieves remarkably better performance than CRNN on all the datasets, with the CRRs and WRRs above 90.48 and 65.02 for Indian languages. The reason for this is spatial attention mechanism and powerful residual layers, as discussed in Sect. 3. As shown in columns 3 and 5 of Table 2, the WRR of the models trained in Gujarati, Hindi, and Bangla are higher than the other three Indian languages despite common backgrounds. The experiments show that the scripts in latter languages pose a tougher reading challenge than the scripts in former languages. We present the results of our transfer learning experiments on the synthetic datasets in Table 3. The best individual model results from Table 2 are included in parenthesis for comparison. We begin with the English models as the base because the models have trained on over 1200 fonts and 17M word images as discussed in Sect. 2, and are generic. However, in the first two rows of the table, we note that transferring the layers from the model trained on the English dataset to Gujarati and Hindi is inefficient in improving the results compared to the individual models. The possible reason for the inefficiency is that Indic scripts have many different visual and slightly different n-gram characteristics from English, as discussed in Sect. 2. We then note that as we try to apply transfer learning among Indian languages with CRNN (rows 3–7, columns 1–2 in Table 3), only some combinations work well. However, with STAR-Net (rows 3–7, columns 3– 4 in Table 3), transfer learning helps improve results on the synthetic dataset from a simple language to a complex language3 . For Malayalam, we observe that the individual STAR-Net model is better than the one transferred from Telugu, perhaps due to high average word length (refer Sect. 2). 3

We also found transfer learning from a more complex language to a simpler one to be effective, though slightly less so than the results reported here.
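The transfer itself is a simple weight-reuse procedure. The sketch below is an illustrative recipe (not the authors' exact code): all feature-extraction and sequence layers are copied from the source-language checkpoint, and only the final classifier is re-initialised because the target script has a different character set. The layer name 'fc', the hidden size, and the checkpoint format are assumptions.

```python
# Illustrative sketch of source->target language transfer for a recognizer.
import torch
import torch.nn as nn

def transfer_model(model, source_ckpt_path, num_target_classes, hidden=512):
    state = torch.load(source_ckpt_path, map_location='cpu')
    # Drop the source-language output layer; its shape depends on the source charset.
    state = {k: v for k, v in state.items() if not k.startswith('fc.')}
    model.load_state_dict(state, strict=False)          # keep CNN/BiLSTM weights
    model.fc = nn.Linear(hidden, num_target_classes)    # new head for the target script
    return model

# Fine-tuning then proceeds as usual, e.g. with ADADELTA as in the experiments:
# optimizer = torch.optim.Adadelta(model.parameters())
```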


Table 4. Results on real datasets. FT indicates fine-tuned.

Language  | Dataset   | # Images | Model                  | CRR   | WRR
Gujarati  | Ours      | 125      | CRNN                   | 84.93 | 64.80
          |           |          | STAR-Net               | 85.63 | 64.00
          |           |          | STAR-Net Eng→Guj       | 78.48 | 60.18
          |           |          | STAR-Net Hin→Guj       | 88.47 | 69.60
Hindi     | IIIT-ILST | 1150     | Mathew et al. [13]     | 75.60 | 42.90
          |           |          | CRNN                   | 78.84 | 46.56
          |           |          | STAR-Net               | 78.72 | 46.60
          |           |          | STAR-Net Eng→Hin       | 77.43 | 44.81
          |           |          | STAR-Net Guj→Hin       | 79.12 | 47.79
          |           |          | OCR-on-the-go [20]     | –     | 51.09
          |           |          | STAR-Net Guj→Hin FT(a) | 83.64 | 56.77
Hindi     | MLT-19    | 3766     | CRNN                   | 86.56 | 64.97
          |           |          | STAR-Net               | 86.53 | 65.79
          |           |          | STAR-Net Guj→Hin       | 89.42 | 72.96
Bangla    | MLT-17    | 673      | Bušta et al. [2]       | 68.60 | 34.20
          |           |          | CRNN                   | 71.16 | 52.74
          |           |          | STAR-Net               | 71.56 | 55.48
          |           |          | STAR-Net Hin→Ban       | 72.16 | 57.01
          |           |          | W/t Correction BiLSTM  | 83.30 | 58.07
Bangla    | MLT-19    | 3691     | CRNN                   | 81.93 | 74.26
          |           |          | STAR-Net               | 82.80 | 77.48
          |           |          | STAR-Net Hin→Ban       | 82.91 | 78.02
Tamil     | Ours      | 634      | CRNN                   | 90.17 | 70.44
          |           |          | STAR-Net               | 89.69 | 71.54
          |           |          | STAR-Net Ban→Tam       | 89.97 | 72.95
Telugu    | IIIT-ILST | 1211     | Mathew et al. [13]     | 86.20 | 57.20
          |           |          | CRNN                   | 81.91 | 58.13
          |           |          | STAR-Net               | 82.21 | 59.12
          |           |          | STAR-Net Tam→Tel       | 82.39 | 62.13
Malayalam | IIIT-ILST | 807      | Mathew et al. [13]     | 92.80 | 73.40
          |           |          | CRNN                   | 84.12 | 70.36
          |           |          | STAR-Net               | 91.50 | 72.73
          |           |          | STAR-Net Tel→Mal       | 92.70 | 75.21

(a) Fine-tuned on MLT-19 dataset as discussed earlier. We fine-tune all the layers.

Performance on Real Datasets: Table 4 depicts the performance of our models on the real datasets. At first, we observe that for each Indian language, the overall performance of the individual STAR-Net model is better than the individual CRNN model (except for Gujarati and Hindi, where the results are very close). Based on this and similar observations in the previous section, we present the results of transfer learning experiments on real datasets only with the STAR-Net model4 . Next, similar to the previous section, we observe that the transfer learning from English to Gujarati and Hindi IIIT-ILST datasets (rows 3 4

We also tried transfer learning with CRNN; STAR-Net was more effective.



Fig. 5. CNN Layers visualization in the Top: CRNN models trained on Hindi, English→Hindi, and Gujarati→Hindi; and Bottom: STAR-Net models trained on Gujarati, English→Gujarati, and Hindi→Gujarati. Red boxes indicate the regions where the features for the model transferred from English are activated (as white), whereas the features from the other two models are not. (Color figure online)

and 8 in Table 4) is not as effective as the individual models in these Indian languages (rows 2 and 7 in Table 4). Finally, we observe that performance improves with transfer learning from a simple language to a complex language, except for Hindi→Gujarati, where Hindi is the most straightforward source available for Gujarati. We achieve better performance than the previous works, i.e., Bušta et al. [1], Mathew et al. [13], and OCR-on-the-go [20]. Overall, we observe WRR increases of 6%, 5%, 2%, and 23% on the IIIT-ILST Hindi, Telugu, and Malayalam datasets and the MLT-17 Bangla dataset, respectively, compared to the previous works. On the MLT-19 Hindi and Bangla datasets, we achieve gains of 8% and 4% in WRR over the baseline individual CRNN models. On the datasets we release for Gujarati and Tamil, we improve the baseline WRRs by 5% and 3%. We present the


qualitative results of our baseline CRNN models as well as best transfer learning models in Fig. 1. The green and red colors represent the correct predictions and errors, respectively. “ ” represents the missing character. As can be seen, most of the mistakes are single-character errors. Since we observe the highest gain of 23% in WRR (and 4% in CRR) for the MLT-17 Bangla dataset (Table 4), we further try to improve these results. We plug in the correction BiLSTM (refer Sect. 3) to the best model (row 18 of Table 4). The results are shown in row 19 of Table 4. As shown, the correction BiLSTM improves the CRR further by a notable margin of 11% since the BiLSTM works on character level. We also observe the 1% WRR gain, thereby achieving the overall 24% WRR gain (and 15% CRR gain) over Buˇsta et al. [1]. Features Visualization: In Fig. 5 for the CRNN model (top three triplets), we visualize the learned CNN layers of the individual Hindi model, the “English →Hindi” model, and the “Gujarati→Hindi” model. The red boxes are the regions where the first four CNN layers of the model transferred from English to Hindi are different from the other two models. The feature visualization again strengthens our claim that transfer from the English reading model to any Indian language dataset is inefficient. We notice a similar trend for the Gujarati STAR-Net models, though the initial CNN layers look very similar to word images (bottom three triplets in Fig. 5). The similarity also demonstrates the better learnability of STAR-Net compared to CRNN, as observed in previous sections.

6 Conclusion

We generated 2.5M or more synthetic images in six different Indian languages with varying complexities to investigate the language transfers for two scenetext recognition models. The underlying view is that the transfer of image features is standard in deep models, and the transfer of language text features is a plausible and natural choice for the reading models. However, we observe that transferring the generic English photo-OCR models (trained on over 1200 fonts) to Indian languages is inefficient. Our models transferred from one Indian language to another perform better than the previous works or the new baselines we created for individual languages. We, therefore, set the new benchmarks for scene-text recognition in low-resource Indian languages. The proposed Correction BiLSTM, when plugged into the STAR-Net model and trained end-to-end, further improves the results.

References 1. Buˇsta, M., Neumann, L., Matas, J.: Deep textspotter: an end-to-end trainable scene text localization and recognition framework. In: ICCV (2017) 2. Buˇsta, M., Patel, Y., Matas, J.: E2E-MLT - an unconstrained end-to-end method for multi-language scene text. In: Carneiro, G., You, S. (eds.) ACCV 2018. LNCS, vol. 11367, pp. 127–143. Springer, Cham (2019). https://doi.org/10.1007/978-3030-21074-8 11


3. Gupta, A., Vedaldi, A., Zisserman, A.: Synthetic data for text localisation in natural images. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2315–2324 (2016) 4. Huang, Z., Zhong, Z., Sun, L., Huo, Q.: Mask R-CNN with pyramid attention network for scene text detection. In: WACV, pp. 764–772. IEEE (2019) 5. Iwamura, M., Matsuda, T., Morimoto, N., Sato, H., Ikeda, Y., Kise, K.: Downtown Osaka scene text dataset. In: Hua, G., J´egou, H. (eds.) ECCV 2016, Part I. LNCS, vol. 9913, pp. 440–455. Springer, Cham (2016). https://doi.org/10.1007/978-3-31946604-0 32 6. Jaderberg, M., Simonyan, K., Vedaldi, A., Zisserman, A.: Synthetic data and artificial neural networks for natural scene text recognition. In: Workshop on Deep Learning, NIPS (2014) 7. Jung, J., Lee, S., Cho, M.S., Kim, J.H.: Touch TT: scene text extractor using touchscreen interface. ETRI J. 33(1), 78–88 (2011) 8. Karatzas, D., et al.: ICDAR 2015 competition on robust reading. In: 2015 ICDAR, pp. 1156–1160. IEEE (2015) 9. Lee, C.Y., Osindero, S.: Recursive recurrent nets with attention modeling for OCR in the wild. In: CVPR, pp. 2231–2239 (2016) 10. Liao, M., Shi, B., Bai, X., Wang, X., Liu, W.: Textboxes: a fast text detector with a single deep neural network. In: AAAI, pp. 4161–4167 (2017) 11. Liu, W., Chen, C., Wong, K.Y.K., Su, Z., Han, J.: STAR-Net: a spatial attention residue network for scene text recognition. In: BMVC, vol. 2 (2016) 12. Liu, X., Liang, D., Yan, S., Chen, D., Qiao, Y., Yan, J.: FOTS: fast oriented text spotting with a unified network. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 5676–5685 (2018) 13. Mathew, M., Jain, M., Jawahar, C.: Benchmarking scene text recognition in Devanagari, Telugu and Malayalam. In: ICDAR, vol. 7, pp. 42–46. IEEE (2017) 14. Nayef, N., et al.: ICDAR2019 robust reading challenge on multi-lingual scene text detection and recognition-RRC-MLT-2019. In: 2019 International Conference on Document Analysis and Recognition (ICDAR), pp. 1582–1587. IEEE (2019) 15. Nayef, N., et al.: Robust reading challenge on multi-lingual scene text detection and script identification - RRC-MLT. In: 14th ICDAR, vol. 1, pp. 1454–1459. IEEE (2017) 16. Reddy, S., Mathew, M., Gomez, L., Rusinol, M., Karatzas, D., Jawahar, C.: RoadText-1K: text detection & recognition dataset for driving videos. In: 2020 IEEE International Conference on Robotics and Automation (ICRA), pp. 11074– 11080. IEEE (2020) 17. Ren, S., He, K., Girshick, R., Sun, J.: Faster R-CNN: towards real-time object detection with region proposal networks. arXiv preprint arXiv:1506.01497 (2015) 18. Romera, E., Alvarez, J.M., Bergasa, L.M., Arroyo, R.: ERFnet: efficient residual factorized ConvNet for real-time semantic segmentation. IEEE Trans. Intell. Transp. Syst. 19(1), 263–272 (2017) 19. Russakovsky, O., et al.: Imagenet large scale visual recognition challenge. Int. J. Comput. Vis. 115(3), 211–252 (2015) 20. Saluja, R., Maheshwari, A., Ramakrishnan, G., Chaudhuri, P., Carman, M.: OCR On-the-Go: robust end-to-end systems for reading license plates and street signs. In: 15th IAPR International Conference on Document Analysis and Recognition (ICDAR), pp. 154–159. IEEE (2019) 21. Shi, B., Bai, X., Yao, C.: An end-to-end trainable neural network for image-based sequence recognition and its application to scene text recognition. IEEE Trans. Pattern Anal. Mach. Intell. 39(11), 2298–2304 (2016)


22. Shi, B., Yang, M., Wang, X., Lyu, P., Yao, C., Bai, X.: ASTER: an attentional scene text recognizer with flexible rectification. IEEE Trans. Pattern Anal. Mach. Intell. 41, 2035–2048 (2018) 23. Shi, B., et al.: ICDAR2017 competition on reading Chinese text in the wild (RCTW-17). In: 14th ICDAR, vol. 1, pp. 1429–1434. IEEE (2017) 24. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) 25. Sun, Y., et al.: ICDAR 2019 competition on large-scale street view text with partial labeling - RRC-LSVT. In: 2019 International Conference on Document Analysis and Recognition (ICDAR), pp. 1557–1562. IEEE (2019) 26. Szegedy, C., Ioffe, S., Vanhoucke, V., Alemi, A.: Inception-v4, Inception-Resnet and the impact of residual connections on learning. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 31 (2017) 27. Tian, Z., Huang, W., He, T., He, P., Qiao, Yu.: Detecting text in natural image with connectionist text proposal network. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016, Part VIII. LNCS, vol. 9912, pp. 56–72. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46484-8 4 28. Tounsi, M., Moalla, I., Alimi, A.M., Lebouregois, F.: Arabic characters recognition in natural scenes using sparse coding for feature representations. In: 13th ICDAR, pp. 1036–1040. IEEE (2015) 29. Vinitha, V., Jawahar, C.: Error detection in indic OCRs. In: 2016 12th IAPR Workshop on Document Analysis Systems (DAS), pp. 180–185. IEEE (2016) 30. Wang, K., Babenko, B., Belongie, S.: End-to-end scene text recognition. In: ICCV, pp. 1457–1464. IEEE (2011) 31. Yousfi, S., Berrani, S.A., Garcia, C.: ALIF: a dataset for Arabic embedded text recognition in TV broadcast. In: 2015 13th International Conference on Document Analysis and Recognition (ICDAR), pp. 1221–1225. IEEE (2015) 32. Yu, F., Xian, W., Chen, Y., Liu, F., Liao, M., Madhavan, V., Darrell, T.: BDD100K: a diverse driving video database with scalable annotation tooling. arXiv preprint arXiv:1805.04687, vol. 2, no. 5, p. 6 (2018) 33. Yuan, T., Zhu, Z., Xu, K., Li, C., Mu, T., Hu, S.: A large Chinese text dataset in the wild. J. Comput. Sci. Technol. 34(3), 509–521 (2019) 34. Zeiler, M.: ADADELTA: an adaptive learning rate method, p. 1212 (December 2012) 35. Zhang, R., et al.: ICDAR 2019 robust reading challenge on reading Chinese text on signboard. In: 2019 International Conference on Document Analysis and Recognition (ICDAR), pp. 1577–1581. IEEE (2019) 36. Zhang, Y., Gueguen, L., Zharkov, I., Zhang, P., Seifert, K., Kadlec, B.: Uber-text: a large-scale dataset for optical character recognition from street-level imagery. In: SUNw: Scene Understanding Workshop-CVPR, vol. 2017 (2017) 37. Zhou, X., et al.: East: an efficient and accurate scene text detector. In: CVPR, pp. 2642–2651 (2017)

How Far Deep Learning Systems for Text Detection and Recognition in Natural Scenes are Affected by Occlusion?

Aline Geovanna Soares, Byron Leite Dantas Bezerra, and Estanislau Baptista Lima

University of Pernambuco, Recife, Brazil
{ags4,ebl2}@ecomp.poli.br, [email protected]
http://ppgec.upe.br

Abstract. With the rise of deep learning, significant advances have been made in scene text detection and recognition in natural images. However, the severe threat that occlusion poses to algorithm performance remains an open issue, due to the lack of consistent real-world datasets, richer annotations, and evaluations specific to the occlusion problem. Therefore, unlike previous works in this field, our paper addresses occlusions in scene text recognition. The goal is to evaluate the effectiveness and efficiency of existing deep architectures for scene text detection and recognition at various occlusion levels. First, we investigated state-of-the-art scene text detection and recognition methods and evaluated the performance of these deep architectures on the ICDAR 2015 dataset without any generated occlusion. Second, we created a methodology to generate large datasets of scene text in natural images with occlusion ranging between 0 and 100%. From this methodology, we produced ISTD-OC, a dataset derived from the ICDAR 2015 database to evaluate deep architectures under different levels of occlusion. The results demonstrate that these existing deep architectures, which have achieved state-of-the-art results, are still far from understanding text instances in a real-world scenario. Unlike the human vision system, which can comprehend occluded instances by contextual reasoning and association, our extensive experimental evaluations show that current scene text recognition models are inefficient when high occlusions exist in a scene. Nevertheless, for scene text detection, segmentation-based methods, such as PSENet and PAN, are more robust in predicting higher levels of occluded texts. In contrast, methods that detect at the character level, such as CRAFT, perform unsatisfactorily under heavy occlusion. When it comes to recognition, attention-based methods that benefit from contextual information have performed better than CTC-based methods. Keywords: Occlusion

· Text scenes · Deep learning

Supported by the Foundation for the Support of Science and Technology of the State of Pernambuco (FACEPE), Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES) - Finance Code 001, and CNPq - Brazilian agencies.

1 Introduction

Text is one of humanity’s most influential inventions and has played an essential role in human life, used for communication and information transfer, so far from ancient times. The text has rich and precise semantic information embodied that is very useful in a wide range of computer vision-based applications such as robotics, image search, automotive assistance, text localization, text identification, and so on [3,15]. Therefore, text detection and recognition in natural scenes have become vital and active research topics in computer vision and document analysis. One of the primary reasons for this is the rapid developments in camera-based applications on portable devices such as smartphones and tablets, which have facilitated the acquisition and processing of large numbers of images with text every day [11]. Due to the increase in high-performance and lowcost image capturing devices, natural scene text recognition (STR) applications rapidly expand and become more popular. Therefore, in the past few years, several techniques have been explored to solve scene text recognition. Although several remarkable breakthroughs have been made in the pipeline, text recognition in natural images is still challenging, caused by the significant variations of scene text in color, font, orientations, languages, spatial layout, uncontrollable background, camera resolution, partial occlusion, motion blurriness, among others problems [19,28]. In the last few years, deep learning methods, notably Convolutional Neural Network (CNN), have become popular in the computer vision community by substantially advancing the state-of-the-art of various tasks such as image classification, object detection, but also in many other practical applications, including scene text recognition [25]. Recently, with excellent performance on generic visual recognition, CNN has become one of the most attractive methods for scene text recognition, and many methods have been proposed, for instance, PSENet [25], EAST [30], PAN [26], and CRAFT [3]. Such methods have obtained stateof-the-art results in terms of accuracy in benchmarks such as ICDAR 2015 [10], MSRA-TD500 [27] and COCO-Text [22]. However, some of these existing methods failed in some complex cases, such as partial occlusion, arbitrarily shaped and curved texts, which are difficult to represent with a single rectangle. Furthermore, in several studies such as those carried out by Raisi et al. [19], partial occlusion appears as an open issue. It is a thick, contiguous, and spatially additive noise that partially hides one object from the other and represents a severe threat to a pattern recognition system’s performance, hindering the observation and reconstruction of information in a reliable manner [29]. Also, few researchers focus on overcoming this challenge instead of improving scene text recognition performance by extracting more robust and effective visual features [1,17]. In this paper, we address the issue of occlusions in scene text identification. We first investigated and evaluated the effectiveness of four popular deep architectures for scene text detection, i.e., PAN [26], CRAFT [3], PSENet [25], EAST [30], and others four for scene text recognition task, i.e., ROSETTA [4], RARE [21], CRNN [20], and STAR-Net [12] on ICDAR 2015 dataset without any generated occlusion. Second, we present as a contribution of this work a methodology


to implement occlusion and generate large datasets of occluded scene text in natural images from benchmarks such as ICDAR 2015. This brand new dataset, named Incidental Scene Text Dataset - Occlusion (ISTD-OC), contains 15000 images for detection and 65450 cropped word images for recognition, with levels of occlusions in an interval that ranges from zero to a hundred percent. We also streamline the research over the ISTD-OC dataset by conducting a comparative evaluation on the eight state-of-the-art text identification algorithms reviewed, a baseline reference for future research in the specific occlusion problem. The goal is to provide a dataset with different occlusion levels to explore how far these existing deep architectures perceive texts in different occlusion levels. To present the results achieved through our analysis, the remaining parts of this paper are arranged as follows: In Sect. 2, we briefly summarize the state-of-theart deep models for scene text detection and recognition, and take a look at some scene text datasets like ICDAR 2015 dataset, warning the need of specific data for a treat the occlusion problem. In Sect. 3, we show how the occlusion methodology was created to generate the ISTD-OC dataset. Section 4 indicate some evaluation protocols and a discussion about the algorithms’ performances. Finally, in Sect. 5, we synthesize our contribution and present future work ideas.

2 Literature Review

Scene text detection and recognition in natural images have received much attention in the computer vision community during the past decades. This section introduces existing methods, grouped into two main categories: text detection and text recognition in natural scenes. We briefly review both classical machine learning approaches and recent deep learning approaches. Benchmark datasets are also presented.

2.1 Text Detection in Natural Scenes

Text detection is still a popular and active research area in the computer vision field. Text detection aims to determine text from the input image, often represented by a bounding box [11]. Most previous methods are usually combining handcraft feature extraction techniques and machine learning models. These features are used to discriminate region or non-region of text scene image, which can be based in sliding window and connected component methods [11,13]. Mishra et al. [14] developed a standard sliding window model to detect characters’ potential locations in natural scenes using standards sliding window methods and character aspect ratio prior. Connected component-based methods first extract candidate components from the image using, for instance, similar properties such as color clustering, boundaries, or textures and then filter out non-text components using manually designed rules or automatically trained classifiers on handcrafted features [11,19]. There are two usual methods, i.e., stroke width transform (SWT) and maximally stable extremal regions (MSER), which are more efficient and robust


than sliding-window schemes [11]. However, these classical methods require multiple complicated sequential steps, with demanding and repetitive pre-processing and post-processing stages, so errors quickly propagate to later levels. Furthermore, these methods may fail in challenging situations such as detecting text under low illumination, text with multiple connected characters, and occlusion [11,13,19]. On the other hand, most current methods explore deep learning models to detect text in natural scenes [15]. Deep learning techniques have advantages over traditional models, simplifying pipelines and expanding the generalization capacity of the system, and they have gradually become the mainstream [19,30]. Recent deep learning-based text detection methods were inspired by object detection architectures and can be classified into bounding-box regression-based and segmentation-based approaches [19]. Bounding-box regression-based methods aim to predict candidate bounding boxes around text, as is usually done for objects [19]. EAST [30] proposes a highly simplified pipeline that is efficient enough to perform inference at real-time speed: an FCN detects text regions directly, without candidate aggregation and word partition steps, and NMS then yields word- or line-level detections. The model predicts rotated boxes or quadrangles of words or text lines at each point in the text region. Segmentation-based methods aim to classify text regions at the pixel level [19]. Wang et al. [25] proposed the progressive scale expansion network (PSENet) to better detect adjacent text instances; the algorithm finds kernels at multiple scales and accurately separates text instances that are close to each other. In CRAFT [3], character affinity maps are used to connect detected characters into a single word, and a weakly supervised framework trains a character-level detector. However, this model requires many training images, which increases processing time and can be challenging for platforms with limited resources. Wang et al. [26] proposed an efficient and accurate arbitrary-shaped text detector, the Pixel Aggregation Network (PAN), featuring low-computational-cost segmentation and learnable post-processing. The architecture introduces multi-level information to guide better segmentation.

2.2 Text Recognition in Natural Scenes

Text recognition in natural scenes is an essential task in computer vision, with many applications. However, text recognition in natural images is still challenging [5]. Therefore, researchers have proposed numerous methods for this task in recent years, building on extensive studies. In the past two decades, most classical methods have followed bottom-up approaches, in which classified characters are linked up into words, or top-down approaches that recognize words directly [19]. For example, in [24] HOG features are first extracted from each sliding window, and then a pre-trained nearest-neighbor or SVM classifier is applied to classify the characters in the input word image. However, in bottom-up schemes, the low representation capability of handcrafted features cannot achieve sufficient recognition accuracy or support models that handle text recognition in the wild. Top-down methods require that all input words belong to the dictionary of the dataset to reach a better recognition rate [19]. Recent advances in deep neural network architectures have boosted the performance of scene text recognition models [28], bringing improvements in hardware systems, automatic feature learning, generalization, and the capacity to run complex algorithms in real time [5]. To translate a text instance into computer-readable string sequences, one of the main approaches is segmentation-free methods. The idea is to map the entire text-instance image directly into a target string sequence through an encoder-decoder framework, thus avoiding character segmentation [5]. These models generally contain image pre-processing, feature representation, sequence modeling, and prediction stages [5]. The two major prediction approaches are CTC-based and attention-based. For instance, [4,12,23] use connectionist temporal classification (CTC) [7] to predict character sequences. The development of many attention-based methods has improved STR models, with applications to 2D prediction problems; implicit attention is applied automatically to enhance deep features in the decoding process [5] and also improves contextual understanding, as in STAR-Net. Both CTC-based and attention-based approaches have advantages and limitations. For instance, attention-based approaches can achieve higher recognition accuracy on isolated-word recognition tasks but perform worse than CTC-based approaches on sentence recognition. Therefore, the right prediction method differs according to the application scenario and its constraints [5,11,13].

2.3 Datasets of Text in Natural Images

Most research suggests that deep learning approaches need large amounts of data to demonstrate good performance [28]. However, labeling and collecting training data is usually costly [6]. Datasets fall into two main categories: those with real-world images and those containing synthetic images [28]. Internet images, street-view imagery, Google Images, and Google Glass captures are some of the image sources in these datasets [28]. For instance, SVT is collected from Google Street View and contains 647 images for evaluation and 245 for training; many of the images are noisy, blurry, or low-resolution and contain perspective projections, augmented by annotations for numbers and text [3]. SynthText (ST) [8] is an example of a synthetically generated dataset and was initially designed for scene text detection. For all of the detection and recognition approaches mentioned above, the primary purpose is to achieve better performance on challenges such as arbitrary shapes, adjacent instances, and multiple languages, but none of the methods proposed in the state of the art focuses on images where a part of the text is missing due to occlusion. Baek et al. [2] highlight how each of the works differs in constructing and using their datasets, investigate the bias caused by the inconsistency when comparing performance between different works, show that prior works have used different sets of training datasets and combined real-world and synthetic data, and call into question whether reported gains are due to the proposed module's contribution or to better or more extensive training data. Thus, in order to allow a fair comparison with the work done in the literature, we propose in this work a methodology to apply partial occlusions at different levels to the benchmark ICDAR 2015 dataset [10], intending to evaluate baseline algorithms on occluded text images without inconsistencies and disparity. ICDAR 2015 Dataset. The ICDAR15 dataset, named "Incidental Scene Text," was created for the Robust Reading Competition of ICDAR 2015. It is commonly used as a reference for the assessment of text detection and recognition schemes. The detection part has 1,500 images in total, consisting of 1,000 training and 500 testing images, and the recognition part consists of 4,468 images for training and 2,077 images for testing. This irregular dataset contains real-world images of text on signboards, books, posters, and other objects, with word-level axis-aligned bounding-box annotations, captured by Google Glasses. Moreover, text instances are often blurred or skewed since they were acquired without the users' prior preference or intention; thus, many are noisy, rotated, low-resolution, and captured under complex background conditions.

3 ISTD-OC Dataset

There are some ways to generate occlusions in scenes from the real world, such as [18]. Our work collected 1500 images for detection and 4468 images for recognition from the ICDAR 2015 dataset, where the core idea is to deliver images with a range of partial occlusions and a certain degree of randomness.1 To do this, we developed a systematic methodology in which, for each image, word instances are occluded differently, as shown in Fig. 1. The occlusion size is chosen considering the proportion of the text size to the original image. Although the methodology implements regular occlusions in the form of a rectangle, this is sufficient for an initial analysis of current models' generalization ability in the face of occlusion. Each occlusion corresponds to a random part of the original image overlaid on the text, and its size is generated from a Gaussian distribution. In this way, very small texts are prevented from receiving very large occlusions, and vice versa. The occlusion level varies between 0 and 100%, so that we can analyze the techniques graphically and determine at what level of occlusion performance deteriorates.

1 https://github.com/alinesoares1/ISTD-OC-Dataset.


Fig. 1. An illustrative example of the occlusion generation based on the ICDAR 2015 dataset; (a) is an original image from ICDAR 2015 dataset with no occlusions, (b), (c) and (d) are samples of occlusion generation in 30%, 50% and 90% levels, respectively.

The equations that determine the size of the occlusion can be seen below. W and H represent the width and height of the image, respectively, and the width and height of the bounding box are represented by w and h. Alpha (α) is a parameter that indicates how much weight is given to the random term relative to the box-to-image proportion; in our experiments, we used 0.55. rand is a random value generated from a Gaussian distribution with a mean of 10% and a standard deviation of 5%, which defines the degree of occlusion. pbwW is the ratio between the width of the text box and the width of the image, and pbhH is the ratio between the height of the text box and the height of the image, i.e., the fractions w/W and h/H, respectively. The occlusion of the text region is bw wide and bh high, given by:

bw = α × w × rand + (1 − α) × w × pbwW    (1)

bh = α × h × rand + (1 − α) × h × pbhH    (2)

A sample of the occlusion generation can be seen in Fig. 2. Once the occlusion size is defined and the text instances are detected in the original image (see Fig. 2a and 2b), another part of the image with the dimensions of the occlusion is pasted over the text region, as shown in Fig. 2c. To make the task of identifying occluded texts a little more complex, in Fig. 2d–e we add noise to the RGB channels of the occlusion, which prevents models from learning to locate occlusions by matching identical regions.
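The procedure described above can be sketched in a few lines of NumPy. The sketch below follows Eqs. (1)-(2) for the occlusion size, copies a patch from another region of the image, adds per-channel noise, and pastes the patch over part of the text box; the paste position inside the box and the noise range are assumptions of this sketch.

```python
# Minimal NumPy sketch of the ISTD-OC occlusion procedure (Eqs. 1-2).
import numpy as np

def occlude_word(img, box, alpha=0.55, rng=np.random.default_rng()):
    H, W = img.shape[:2]
    x, y, w, h = box                                         # axis-aligned text box
    rand = rng.normal(loc=0.10, scale=0.05)                  # degree of occlusion
    bw = int(alpha * w * rand + (1 - alpha) * w * (w / W))   # Eq. (1)
    bh = int(alpha * h * rand + (1 - alpha) * h * (h / H))   # Eq. (2)
    bw, bh = max(1, min(bw, w)), max(1, min(bh, h))
    # Copy a patch of the same size from a random location elsewhere in the image.
    sx = rng.integers(0, max(1, W - bw))
    sy = rng.integers(0, max(1, H - bh))
    patch = img[sy:sy + bh, sx:sx + bw].astype(np.int16)
    patch += rng.integers(-20, 21, size=patch.shape)         # RGB noise (assumed range)
    patch = np.clip(patch, 0, 255).astype(img.dtype)
    # Paste the noisy patch over the text region (here anchored at the box corner).
    img[y:y + bh, x:x + bw] = patch
    return img
```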


Fig. 2. Sample of occlusion generation in ICDAR 2015 dataset.

It is interesting to note that, even though we use the ICDAR 2015 dataset as the reference for assessing the occlusion problem and generating occluded text images, the occlusion generation methodology can apply different levels of obstruction to the text of any other desired dataset once the text regions are located through their coordinates. The choice of this specific benchmark as a source is due only to its recurrent use in the analysis of baseline algorithms, which avoids possible inconsistencies during the evaluation process.

4 Experimental Evaluation

Our work aims to evaluate the effectiveness of existing deep architectures at scene text detection and recognition on the proposed dataset under different occlusion levels. In this work, CRAFT [3], EAST [30], PAN [26], and PSENet [25] were used for text detection, due to the recent results achieved by these models on this task [19]. It is worth mentioning that for each method, except EAST and PSENet, we used the corresponding pre-trained model, trained on the ICDAR15 dataset, directly from the authors' GitHub page. For EAST and PSENet, we trained the algorithms on ICDAR15 using the authors' code. For testing, ISTD-OC has been used. This evaluation strategy allows us to analyze the impact of occlusions on these techniques and avoids a biased evaluation. For scene text recognition, the deep learning-based techniques chosen are based on the study by Baek et al. [2], which introduced a unified four-stage STR framework that most existing STR models match. The inconsistencies of training and evaluation datasets and the resulting performance gaps were discussed in their work; therefore, models such as CRNN [20], RARE [21], STAR-Net [12], and ROSETTA [4] were re-implemented within the proposed framework and under consistent settings. The best model found for each technique shows performance competitive with state-of-the-art methods on the ICDAR 2015 dataset. All recognition models have been trained on a combination of the SynthText [8] and MJSynth [9] datasets. For evaluation, 2,077 occluded cropped word images from ISTD-OC were used. The experiments on the ISTD-OC dataset were performed approximately one hundred times for every model evaluated, that is, ten times in each interval, with a step of 10%, between 0 and 100% occlusion.
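The evaluation protocol just described can be summarized as a simple sweep over occlusion levels. The sketch below is illustrative only; build_istd_oc and evaluate are hypothetical helpers standing in for the dataset generation and the per-model metric computation.

```python
# Sketch of the evaluation sweep: ten runs per occlusion interval, 0% to 100%
# in steps of 10%, averaging the chosen metric for each model.
def sweep_occlusion_levels(models, runs_per_level=10):
    results = {name: {} for name in models}
    for level in range(0, 101, 10):                  # 0%, 10%, ..., 100%
        for name, model in models.items():
            scores = []
            for _ in range(runs_per_level):          # occlusions are random, so repeat
                dataset = build_istd_oc(occlusion_level=level)   # hypothetical helper
                scores.append(evaluate(model, dataset))          # hypothetical helper
            results[name][level] = sum(scores) / len(scores)
    return results
```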

4.1 Evaluation Metrics

Text Detection. Models evaluated on the ICDAR dataset commonly use its protocols for quantitative comparison among text detection techniques. To quantify the chosen methods' performance, we utilize the standard evaluation metrics Precision (P) and Recall (R) from the information retrieval field. Precision and Recall are based on the ICDAR15 intersection over union (IoU) metric [10], which is obtained for the j-th ground-truth and i-th detection bounding box as follows:

IoU = Area(Gj ∩ Di) / Area(Gj ∪ Di)    (3)

and the threshold for counting a correct detection is IoU ≥ 0.5. Furthermore, the H-mean or F1-score is also used, as follows:

F1-Score = 2 × (P × R) / (P + R)    (4)
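The sketch below illustrates how these detection metrics can be computed: a detection counts as correct when its IoU with an unmatched ground-truth box is at least 0.5, and precision, recall, and F1-score follow. The greedy one-to-one matching is an assumption of this sketch, not a detail taken from the official protocol.

```python
# Illustrative computation of IoU-based precision, recall, and F1-score.
def iou(a, b):
    # Boxes are (x1, y1, x2, y2) with x1 < x2 and y1 < y2.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0

def detection_scores(gt_boxes, det_boxes, thr=0.5):
    matched, used = 0, set()
    for d in det_boxes:
        for j, g in enumerate(gt_boxes):
            if j not in used and iou(d, g) >= thr:
                used.add(j)
                matched += 1
                break
    p = matched / len(det_boxes) if det_boxes else 0.0
    r = matched / len(gt_boxes) if gt_boxes else 0.0
    f1 = 2 * p * r / (p + r) if (p + r) > 0 else 0.0
    return p, r, f1
```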

Text Recognition. Based on the most common evaluation metrics in STR systems, the word error rate (WER) and the character error rate (CER) were adopted in this work. These metrics indicate the amount of text that the applied model did not recognize correctly. CER is defined from the minimum number of editing operations at the character level with respect to the ground truth; WER is defined in the same way, but at the word level [16].
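Both rates can be computed with a standard Levenshtein edit distance. The sketch below is an illustrative implementation; the normalization by the reference length follows the usual definition of CER/WER and is an assumption of this sketch rather than a detail quoted from the paper.

```python
# Illustrative CER/WER computation via Levenshtein (edit) distance.
def edit_distance(ref, hyp):
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i] + [0] * len(hyp)
        for j, h in enumerate(hyp, 1):
            cur[j] = min(prev[j] + 1,              # deletion
                         cur[j - 1] + 1,           # insertion
                         prev[j - 1] + (r != h))   # substitution
        prev = cur
    return prev[-1]

def cer(ref, hyp):
    return edit_distance(list(ref), list(hyp)) / max(1, len(ref))

def wer(ref, hyp):
    return edit_distance(ref.split(), hyp.split()) / max(1, len(ref.split()))
```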

4.2 Evaluation of Text Detection Approaches

Figures 3, 4 and 5 illustrate the detection performance of the selected state-of-the-art text detection methods, namely CRAFT [3], PSENet [25], PAN [26] and EAST [30], related to the precision, F1-score and recall metrics, respectively. In Figs. 3a, 4a and 5a, we show box plots to depict the lack of robustness of the models facing occlusion and how it affects the performance of these text detection algorithms on the scene. Initially, we focus on these models' generalization capability, i.e., how a model trained on one dataset can detect and recognize challenging text instances on "other" datasets. From these graphs, it is also possible to infer the variability, median, outliers, and other information from the data obtained. Comparing the CRAFT approach's performance on images without occlusions and on highly occluded samples, we observe a performance decline of 37% in precision, as shown in Fig. 3b. In contrast, PSENet, the second-best model in precision on ICDAR15 with high occlusions, had the worst F1-score decline, dropping by 40% compared with non-occluded images, as we can see in Fig. 4b. Besides that, even though PAN reaches 70% precision for occlusion ranges between 10% and 50%, when it comes to recall, EAST and PSENet performed better, as shown in Fig. 5b. Moreover, it has been observed that the decline in


performance increases significantly when text instances are affected not only by occlusion but by a combination of incidental and diversified text detection, i.e., low-resolution images, complex backgrounds.

Fig. 3. Detection performance of deep state-of-the-art models based on precision metric.

Fig. 4. Detection performance of deep state-of-the-art models based on recall metric.

Fig. 5. Detection performance of deep state-of-the-art models based on F1-score metric.


Figure 6 presents qualitative results of our evaluation for each model. Even though both the PAN and PSENet approaches offer better robustness in detecting occluded text due to their capacity to learn information gradually, these methods' performances are still far from perfect, especially when occlusion levels exceed 70%. Moreover, although significant progress has been made with deep learning and these models have shown generalizability to many scene text datasets even with synthetic data, they fail to adapt to varying inputs, such as text instances with occlusion. The CRAFT model synthetically generates character-level ground truth for existing word-level real-world datasets to compensate for the lack of character annotations. However, it cannot infer enough information about word-level occlusion, and it is computationally costly as it requires much data to generate the affinity maps. The EAST model is based on an object detection approach and needs handcrafted information about orientation, anchor design, etc., which limits the efficiency of the model.

4.3 Evaluation of Text Recognition Approaches

We evaluated our generated ISTD-OC dataset with occlusion, as shown in Fig. 8, on several scene text recognition benchmarks, i.e., ROSETTA [4], RARE [21], CRNN [20], and STAR-Net [12]. Figures 7a and 7b summarize the comparative results in terms of character and word error rate for these models. It can be seen from these figures that all methods suffer accuracy declines in the presence of high occlusions, although models that contain a rectification module perform a little better under occlusion than the other models. STAR-Net, for example, includes a rectification module in its feature extraction stage for spatially transforming text images, and RARE combines a Spatial Transformer Network (STN) and a Sequence Recognition Network (SRN). In the STN, an input image is spatially transformed into a rectified image, i.e., the STN produces an image that contains regular text, since a text recognizer works best when its input images contain tightly bounded regular text. Besides, the module combination used by Baek et al. [2] for RARE assigns an attention module for prediction and a BiLSTM for the sequence modeling stage. This allows the model to outperform the CTC-based module combinations, such as CRNN and ROSETTA, because attention methods handle the alignment problem in irregular text better than CTC-based methods. It is worth noting that even though all investigated algorithms are pre-trained only with synthetic datasets, they can still handle moderate occlusion levels in natural scenes, based on the word error rate shown in Fig. 7b. Furthermore, in general, attention-based methods that benefit from a deep backbone for feature extraction and a transformation network for rectification have performed better than CTC-based methods such as CRNN, STAR-Net, and ROSETTA. Besides that, we believe that contextual information is significant for feature representation in scene text recognition methods, since focusing on informative regions in images helps generate discriminative features. For example, when humans have


Fig. 6. Qualitative evaluation of state-of-the-art text detection models on ISTD-OC. Each row presents a sample of results for PSENet (a), EAST (b), CRAFT (c) and PAN (d) at occlusion levels of 20%, 40% and 80%.

Fig. 7. Performance of the state-of-the-art text recognition models based on the CER and WER metrics.

to identify a text region in a scene image, no matter how complicated the background clutters are, the localization of the text instance is subserved by local processes sensitive to local features and a global process of retrieving structural context.


Fig. 8. Samples of cropped word instances from ISTD-OC dataset under 10% (a), 40% (b), 60% (c) and (d) 70% occlusion levels.

5 Conclusion and Future Work

In this work, we investigated state-of-the-art deep architectures for scene text detection and recognition in the case of occlusion. As a contribution of this work, we proposed a systematic methodology to generate occlusions, which resulted in the ISTD-OC dataset. The released dataset is designed to examine and compare benchmark models' ability to handle different occlusion levels. The experimental results suggest that existing deep architectures for scene text detection and recognition are far from the human visual system's ability to read occluded text in natural daily-life situations. Nevertheless, this research is of significant importance to the scene text recognition field and can serve as a reference for further studies in complex and diverse scenes. Despite its contributions, this work is only a small step towards robustly recognizing occluded text in natural scenes. In future work, we should improve the way occlusion is generated so that it better represents real-world scenarios. Moreover, we will investigate the causes of failure of existing models for scene text detection and recognition under different occlusion levels in order to build a single, lightweight network for end-to-end scene text detection and recognition that can handle different levels of occlusion.

References 1. Adak, C., Chaudhuri, B.B., Blumenstein, M.: Impact of struck-out text on writer identification. In: 2017 International Joint Conference on Neural Networks (IJCNN), pp. 1465–1471. IEEE (2017) 2. Baek, J., Han, Y., Kim, J.O., Lee, J., Park, S.: What is wrong with scene text recognition model comparisons? Dataset and model analysis (2019) 3. Baek, Y., Lee, B., Han, D., Yun, S., Lee, H.: Character region awareness for text detection (2019)


4. Borisyuk, F., Gordo, A., Sivakumar, V.: Rosetta: large scale system for text detection and recognition in images. In: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 71–79 (2018) 5. Chen, X., Jin, L., Zhu, Y., Luo, C., Wang, T.: Text recognition in the wild: a survey. ACM Comput. Surv. (CSUR) 54, 1–35 (2020) 6. Efimova, V., Shalamov, V., Filchenkov, A.: Synthetic dataset generation for text recognition with generative adversarial networks, vol. 1143315, p. 62 (2020). https://doi.org/10.1117/12.2558271 7. Graves, A., Fern´ andez, S., Gomez, F., Schmidhuber, J.: Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks. In: Proceedings of the 23rd International Conference on Machine Learning, pp. 369–376 (2006) 8. Gupta, A., Vedaldi, A., Zisserman, A.: Synthetic data for text localisation in natural images. In: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, December 2016, pp. 2315–2324 (2016). https:// doi.org/10.1109/CVPR.2016.254 9. Jaderberg, M., Simonyan, K., Vedaldi, A., Zisserman, A.: Synthetic data and artificial neural networks for natural scene text recognition. arXiv preprint arXiv:1406.2227 (2014) 10. Karatzas, D., et al.: ICDAR 2015 competition on robust reading. In: 2015 13th International Conference on Document Analysis and Recognition (ICDAR), pp. 1156–1160. IEEE (2015) 11. Lin, H., Yang, P., Zhang, F.: Review of scene text detection and recognition. Arch. Comput. Methods Eng. 27(2), 433–454 (2020). https://doi.org/10.1007/s11831019-09315-1 12. Liu, W., Chen, C., Wong, K.Y.K., Su, Z., Han, J.: Star-net: a spatial attention residue network for scene text recognition. In: BMVC, vol. 2, p. 7 (2016) 13. Long, S., He, X., Yao, C.: Scene text detection and recognition: the deep learning era. Int. J. Comput. Vis. 129, 1–24 (2020) 14. Mishra, A., Alahari, K., Jawahar, C.: Scene text recognition using higher order language priors (2012) 15. Mittal, A., Shivakumara, P., Pal, U., Lu, T., Blumenstein, M., Lopresti, D.: A new context-based method for restoring occluded text in natural scene images. In: Bai, X., Karatzas, D., Lopresti, D. (eds.) DAS 2020. LNCS, vol. 12116, pp. 466–480. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-57058-3 33 16. Neto, A.F.D.S., Bezerra, B.L.D., Toselli, A.H.: Towards the natural language processing as spelling correction for offline handwritten text recognition systems. Appl. Sci. 10(21), 7711 (2020). https://doi.org/10.3390/app10217711 17. Nisa, H., Thom, J.A., Ciesielski, V., Tennakoon, R.: A deep learning approach to handwritten text recognition in the presence of struck-out text. In: International Conference Image and Vision Computing New Zealand (December 2019). https:// doi.org/10.1109/IVCNZ48456.2019.8961024 18. Qi, J., et al.: Occluded video instance segmentation. arXiv preprint arXiv:2102.01558 (2021) 19. Raisi, Z., Naiel, M.A., Fieguth, P., Jun, C.V.: Text detection and recognition in the wild: a review, pp. 13–15 (2020) 20. Shi, B., Bai, X., Yao, C.: An end-to-end trainable neural network for image-based sequence recognition and its application to scene text recognition. IEEE Trans. Pattern Anal. Mach. Intell. 39(11), 2298–2304 (2016)


21. Shi, B., Wang, X., Lyu, P., Yao, C., Bai, X.: Robust scene text recognition with automatic rectification. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4168–4176 (2016) 22. Veit, A., Matera, T., Neumann, L., Matas, J., Belongie, S.: Coco-text: dataset and benchmark for text detection and recognition in natural images. arXiv preprint arXiv:1601.07140 (2016) 23. Wang, J., Hu, X.: Gated recurrent convolution neural network for OCR. In: Advances in Neural Information Processing Systems, vol. 30, pp. 335–344 (2017) 24. Wang, K., Babenko, B., Belongie, S.: End-to-end scene text recognition. In: 2011 International Conference on Computer Vision, pp. 1457–1464. IEEE (2011) 25. Wang, W., et al.: Shape robust text detection with progressive scale expansion network. In: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, June 2019, pp. 9328–9337 (2019). https://doi. org/10.1109/CVPR.2019.00956 26. Wang, W., et al.: Efficient and accurate arbitrary-shaped text detection with pixel aggregation network. In: Proceedings of the IEEE International Conference on Computer Vision, October 2019, pp. 8439–8448 (2019). https://doi.org/10.1109/ ICCV.2019.00853 27. Yao, C., Bai, X., Liu, W., Ma, Y., Tu, Z.: Detecting texts of arbitrary orientations in natural images. In: 2012 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1083–1090. IEEE (2012) 28. Yuan, T.L., Zhu, Z., Xu, K., Li, C.J., Hu, S.M.: Chinese text in the wild. arXiv preprint arXiv:1803.00085 (2018) 29. Zhao, F., Feng, J., Zhao, J., Yang, W., Yan, S.: Robust LSTM-autoencoders for face de-occlusion in the wild. IEEE Trans. Image Process. 27, 778–790 (2016) 30. Zhou, X., et al.: EAST: an efficient and accurate scene text detector (2015)

CATALIST: CAmera TrAnsformations for Multi-LIngual Scene Text Recognition

Shivam Sood (1), Rohit Saluja (2), Ganesh Ramakrishnan (1), and Parag Chaudhuri (1)

(1) IIT Bombay, Mumbai, India, {ssood,ganesh,paragc}@cse.iitb.ac.in
(2) IIIT Hyderabad, Hyderabad, India, [email protected]
https://catalist-2021.github.io

Abstract. We present a CATALIST model that 'tames' the attention (heads) of an attention-based scene text recognition model. We provide supervision to the attention masks at multiple levels, i.e., line, word, and character levels, while training the multi-head attention model. We demonstrate that such supervision improves training performance and testing accuracy. To train CATALIST and its attention masks, we also present a synthetic data generator ALCHEMIST that enables the synthetic creation of large scene-text video datasets, along with mask information at character, word, and line levels. We release a real scene-text dataset of 2k videos, CATALISTd, with videos of real scenes that potentially contain scene-text in a combination of three different languages, namely, English, Hindi, and Marathi. We record these videos using 5 types of camera transformations - (i) translation, (ii) roll, (iii) tilt, (iv) pan, and (v) zoom - to create transformed videos. The dataset and other useful resources are available as a documented public repository for use by the community.

Keywords: Scene text recognition · Video dataset · OCR in the wild · Multilingual OCR · Indic OCR · Video OCR

1 Introduction

Reading the text in modern street signs generally involves detecting the boxes around each word in the street signs and then recognizing the text in each box. Reading street signs is challenging because they often appear in various languages, scripts, font styles, and orientations. Reading the end-to-end text in scenes has the advantage of utilizing the global context in street signs, enhancing the learning of patterns. One crucial factor that separates a character-level OCR system from an end-to-end OCR system is reading order. Attention is thus needed to locate the initial characters, read them, and track the correct reading order in the form of change in characters, words, lines, paragraphs, or columns (in multi-column texts). This observation forms the motivation for our work. c Springer Nature Switzerland AG 2021  E. H. Barney Smith and U. Pal (Eds.): ICDAR 2021 Workshops, LNCS 12916, pp. 213–228, 2021. https://doi.org/10.1007/978-3-030-86198-8_16


Fig. 1. Sample video frames from CATALISTd

Obtaining large-scale multi-frame video annotations is a challenging problem due to unreliable OCR systems and expensive human efforts. The predictions obtained on videos by most OCR systems are fluctuating, as we motivate in Sect. 3. The fluctuations in the accuracy of the extracted text may also be due to various external factors such as partial occlusions, motion blur, complex font types, distant text in the videos. Thus, such OCR outputs are not reliable for downstream applications such as surveillance, traffic law enforcement, and crossborder security system. In this paper, we demonstrate that the photo OCR systems can improve by guiding the attention masks based on the orientations and positions of the camera. We improve an end-to-end attention-based photo-OCR model on continuous video frames by taming the attention masks in synthetic videos and on novel controlled datasets that we record for capturing possible camera movements. We begin by motivating our work in Sect. 3. We base a video scene-text recognition model (referred to as CATALIST) on partly supervised attention. Like a teacher holding a lens through which a student can learn to read on a board, CATALIST exploits supervision for attention masks at multiple levels (as shown in Fig. 3). Some of the attention masks might be interpreted as covering different orientations in frames during individual camera movements (through separate masks). In contrast, others might focus on the line, word, or character level reading order. We train CATALIST using synthetic data generated using a non-trivial extension of SynthText [6]. The extension allows for the generation of text videos using different camera movements while also preserving character-level information. We describe the CATALIST model which ‘tames’ the attention (heads) in Sect. 4.1. We demonstrate that providing direct supervision to attention masks at multiple levels (i.e., line, word, and character levels) yields improvement in the recognition accuracy.


To train CATALIST and its attention masks, we present a synthetic data generator ALCHEMIST1 that enables the synthetic creation of large scene-text video datasets, along with mask information at the character, word, and line levels. We describe the procedure to generate synthetic videos in Sect. 4.2. We also present a new video-based real scene-text dataset, CATALISTd, in Sect. 4.3. Figure 1 shows sample video frames from the dataset. We create these videos using 5 types of camera transformations: (i) translation, (ii) roll, (iii) tilt, (iv) pan, and (v) zoom. We provide the dataset and experimental details in Sect. 5. We summarize the results in Sect. 6 and conclude the work in Sect. 8.

2 Related Work

We now introduce approaches that tackle various issues in the field of photo OCR. Works specific to text localization are proposed by Gupta et al. [6]. Liao et al. [11,13] extend such work to real-time detection in end-to-end scenes. Karatzas et al. [9] and Bušta et al. [3] present better solutions in terms of accuracy and speed. The problem of scene-text spotting, however, remains complicated owing to variations in illumination, capturing methods, and weather conditions. Moreover, the movement of the camera (or of objects containing text) and motion blur in videos can make it harder to recognize the scene-text correctly. There has been rising interest in end-to-end scene-text recognition in images over the last decade [2,3,9,10,16]. Recent text-spotters by Bušta et al. [3,4] include deep models that are trained end-to-end but with supervision at the level of text as well as at the level of words and text-boxes. The two recent breakthroughs in this direction, which work directly on complete scene images without supervision at the level of text boxes, are:

1. STN-OCR by Bartz et al. [2]: A single neural network for text detection and text recognition. The model contains a spatial transformer network that encodes the input scene image. It then applies a recurrent model over the encoded image features to output a sequence of grids. Combining the grids and the input image returns the series of word images present in the scene. Another spatial transformer network processes the word images for recognition. This work does not need supervision at the level of detection.

2. Attention-OCR by Wojna et al. [19]: This work employs an inception network (proposed by Szegedy et al. [18]) as an encoder and an LSTM with attention as a decoder. The work is interesting because it does not involve any cropping of word images but works on the principle of soft segmentation through attention. The Attention-OCR model performs character-level recognition directly on the complete scene image, thus utilizing the global context while reading the scene. This model has an open-source TensorFlow (a popular library for deep learning by Abadi et al. [1]) implementation.

1 ALCHEMIST stands for synthetic video generation in order to tame Attention for Language (line, word, character, etc.) and other camera-CHangEs and coMbinatIons for Scene Text.


Fig. 2. Frame-wise accuracy of 3 text-spotters on a simple video exhibiting pan

Both of these works experiment on the French Street Name Signs (FSNS) dataset, on which Attention-OCR performs the best. The Attention-OCR model also outperforms another line-level segmentation-based method (refer to the work by Smith et al. [17]) on the FSNS dataset. Recently, the OCR-on-the-go model outperformed these models on the FSNS dataset using a multi-head attention mechanism [15]. In this work, we set new benchmarks for reading Indian street signs in a large number of video frames. The FSNS dataset contains around 10M images annotated with end-to-end transcriptions similar to ours. Different large-scale datasets are available in English. Uber-Text by Zhang et al. [21] includes over 0.1M images annotated at line level, captured from 6 US cities. Reddy et al. [14] annotate 1000 video clips from the BDD dataset [20] at line level. We provide end-to-end transcriptions for our dataset, similar to FSNS. Additionally, we also share noisy annotations at the word and paragraph levels for each frame.

3 Motivation

We motivate our work of training scene-text spotting models on real (as well as synthetic) videos captured via continuous camera movements. Various end-to-end scene-text spotters, such as the ones proposed by Bušta et al. [3,4],


train on synthetic as well as augmented real data to cover different capturing perspectives and orientations. The problem, however, is that during the training phase such models do not exploit all the continuous perspectives/orientations captured by the camera movement (or scene movement). Thus the OCR output fluctuates when tested on all (or random) video frames. Also, when deploying such models on real-time videos, two scenarios may occur. Firstly, multi-frame consensus is desirable to improve OCR accuracy or to support interactive systems. Secondly, since it is computationally expensive to assess each frame for readability, it is not possible to verify the quality of the frame to be OCR-ed. In either scenario, the recognition system needs to work reasonably well on continuous video frames.

We present the frame-level accuracy of E2E-MLT, proposed by Bušta et al. [4], on an 8 s video clip with a frame size of 480 × 260 in the first plot of Fig. 2 (with sample frames shown at the bottom). Since the model does not work for Hindi, we recognize the Hindi text using the OCR-on-the-go model [15]. As shown, the E2E-MLT model produces the most unstable text on a simple video (from the test dataset), with an average character accuracy of 83.1% and a standard deviation of 9.20. The reason is that E2E-MLT, which does not train on continuous video frames, produces extra text-boxes on many of them during the detection phase, so extra noise characters or strings are observed during recognition. For instance, the correct text “Jalvihar Guest House” appears in 18 frames, the text “Jalvihar arG Guest House” appears in 10 frames, and the text “Jalvihar G Guest House” appears in 9 frames. The text “Jalvihar G arGu Guesth R H House” appears in one of the frames.

The instability in the video text, however, reduces when we use the OCR-on-the-go model by Saluja et al. [15] to read these video frames. As shown in the second plot of Fig. 2, we achieve a (higher) average character accuracy of 94.54% and a (lower) standard deviation of 4.15. This model works on the principle of end-to-end recognition and soft detection via unsupervised attention. The instability reduces further, as shown in the third plot of Fig. 2, when we train our CATALIST model on the continuous video datasets proposed in this work.

4 Methodology

We use the end-to-end attention-based encoder-decoder model proposed by Wojna et al. [19]. For better inference of attention masks and improved recognition, we use the multi-head version of this model proposed by Saluja et al. [15]. In Fig. 3, we present the CATALIST model, which uses multi-task learning to update the attention masks. Each mask is updated based on two loss functions. For end-to-end supervision, we use cross-entropy loss. To train the attention heads, we use dice loss [12] between the predicted masks and the segmented masks obtained using text-boxes from SynthText, proposed by Gupta et al. [6]. We also transform the synthetic images, along with their text-boxes, to form videos, which we describe at the end of this section.
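To make the multi-task objective concrete, the following is a minimal sketch (not the authors' code) of how the cross-entropy sequence loss and the dice losses on the three supervised attention masks could be combined; all tensor names and the mask_weight factor are illustrative assumptions.

```python
import tensorflow as tf

def dice_loss(pred_mask, gt_mask, eps=1e-6):
    # Soft Dice loss between a predicted attention mask and a binary
    # ground-truth segmentation mask (both in [0, 1], shape [B, H, W]).
    inter = tf.reduce_sum(pred_mask * gt_mask, axis=[1, 2])
    union = tf.reduce_sum(pred_mask, axis=[1, 2]) + tf.reduce_sum(gt_mask, axis=[1, 2])
    return 1.0 - tf.reduce_mean((2.0 * inter + eps) / (union + eps))

def catalist_style_loss(char_logits, char_labels,
                        line_mask, word_mask, char_mask,
                        gt_line, gt_word, gt_char,
                        mask_weight=1.0):
    # End-to-end supervision: cross-entropy over the decoded character sequence.
    ce = tf.reduce_mean(
        tf.nn.sparse_softmax_cross_entropy_with_logits(
            labels=char_labels, logits=char_logits))
    # Direct supervision of the first three attention heads with dice loss.
    mask_loss = (dice_loss(line_mask, gt_line) +
                 dice_loss(word_mask, gt_word) +
                 dice_loss(char_mask, gt_char))
    return ce + mask_weight * mask_loss
```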

Fig. 3. CATALIST tames attention masks at multiple levels of granularity. The first three masks, namely the line, word, and char masks, are supervised. The remaining attention masks are set free. The figure shows the first four attention masks.

4.1 The CATALIST Model

As shown in Fig. 3, the powerful inception-based encoder proposed by Szegedy et al. [18], which performs multiple convolutions in parallel, enhances the ability to read text at multiple resolutions. We extract the features f from the input image using this inception-based encoder. Moreover, the multi-head attention mechanism in our model exploits: i) the splits of the features f into f_L, f_w, f_c, f_f (see Footnote 2), etc. (refer to Fig. 3), ii) one-hot-encoded (OHE) vectors (OHE_L, OHE_w, OHE_c, OHE_f, etc.; see Footnote 3) for both the x and y coordinates of each feature split, and iii) the hidden state at the previous decoding step (h_{t-1}) of an LSTM (decoder). To learn the attention at multiple levels of granularity, we provide supervision to the first three masks in the form of line-, word-, and character-level segmented binary images. The remaining masks are left free to assist in and exploit the end-to-end recognition/supervision. Thus we refer to the first three of them as the line mask, word mask, and char mask in Fig. 3. We also hard-code the word mask to remain inside the line mask, and the character mask to remain inside the word mask. The context vectors (c_L, c_w, c_c, c_f, etc.), which are obtained after applying the attention mechanism, are fed into the LSTM to decode the characters in the input image. It is important to note that, for each input frame, the features f and their splits remain fixed, whereas the attention masks move in line with the decoded characters. Thus, we avoid using simultaneous supervision for all the character masks (or word masks or line masks) in a frame. Instead, we use a sequence of masks (in the form of segmented binary images) at each level for all the video frames.

2 f_L represents the features used for producing line masks, f_w the features used for word masks, f_c the features used for character masks, and f_f the features used for free attention masks.
3 OHE_L, OHE_w, OHE_c, and OHE_f are the one-hot-encoded coordinate vectors for the corresponding features f_L, f_w, f_c, and f_f.
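As an illustration of the attention mechanism described above, the sketch below computes a single attention head from one feature split, coordinate one-hot encodings, and the previous LSTM hidden state, and reduces it to a context vector c_t = Σ_{i,j} mask · f. It is a simplified stand-in, not the authors' implementation; the layer sizes are assumptions, static spatial dimensions are assumed, and in a real model the Dense layers would be created once and reused across decoding steps.

```python
import tensorflow as tf

def attention_head(feat, prev_hidden, units=256):
    # feat: [B, Hp, Wp, C] feature split for one head (e.g., f_L for the line head).
    # prev_hidden: [B, D] decoder LSTM hidden state from the previous step.
    h, w = feat.shape[1], feat.shape[2]          # assumed static spatial dims
    b = tf.shape(feat)[0]
    ys = tf.tile(tf.eye(h)[tf.newaxis, :, tf.newaxis, :], [b, 1, w, 1])  # one-hot row index
    xs = tf.tile(tf.eye(w)[tf.newaxis, tf.newaxis, :, :], [b, h, 1, 1])  # one-hot column index
    hid = tf.tile(prev_hidden[:, tf.newaxis, tf.newaxis, :], [1, h, w, 1])
    scores = tf.keras.layers.Dense(1)(
        tf.nn.tanh(tf.keras.layers.Dense(units)(
            tf.concat([feat, ys, xs, hid], axis=-1))))
    mask = tf.nn.softmax(tf.reshape(scores, [-1, h * w]), axis=-1)
    mask = tf.reshape(mask, [-1, h, w, 1])             # soft attention mask over positions
    context = tf.reduce_sum(mask * feat, axis=[1, 2])  # c_t = sum_{i,j} mask_ij * f_ij
    return context, mask
```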


We accomplish this by keeping the word-level as well as the line-level segmented images constant and moving the character-level segmented images while decoding the characters in each word. Once the decoding of all the characters in a word is complete, the word-level segmented image moves to the next word in the line, and the character-level image keeps moving as usual. Once the model has decoded all the characters in a line of text, the line-level (and word-level) segmented image moves to the next line, and the character-level segmented image continues to move within the word image.

4.2 The ALCHEMIST Videos

We generate synthetic data for training the attention masks (as well as the complete model) using our data generator, which we refer to as ALCHEMIST. ALCHEMIST enables the synthetic creation of large scene-text video datasets. It overlays synthetic text on videos under the 12 different transformations described in the next section. By design, we preserve the information about the transformation performed, along with the character, word, and line positions (as shown in Fig. 7). This information in the synthetic data provides fairly detailed supervision for the attention masks in the CATALIST model. We build ALCHEMIST as an extension of an existing fast and scalable engine, SynthText, proposed by Gupta et al. [6].

Methodology: According to the pinhole camera model, a 2-D point x (in homogeneous coordinates) in the image captured by a camera is given by Eq. 1:

x = K [R|t] X    (1)

Here K is the intrinsic camera matrix, R and t are the rotation and translation matrices respectively, and X is a 3-D point in real-world coordinates, also in homogeneous coordinates.

Fig. 4. For videos with camera pan, we find a homography between the corners of a rectangle and 4 points equidistant from them (which form one of the blue trapeziums). (Color figure online)

Fig. 5. Generating a video with camera pan (3 frames at the bottom for the dark-blue, green, and light-blue perspectives, respectively) from an image (at the top). (Color figure online)

For generating synthetic videos, we first select a fixed crop within the synthetic image (denoted by the green rectangle in Fig. 5). We then warp the corners of the crop by finding a planar homography matrix H (using the algorithm given by Hartley et al. [7]) between the corner coordinates and four points equidistant from the corners (the direction depends on the kind of transformation, as explained later). For Fig. 4 (and Fig. 5), we find the planar homography matrix H between the corners of one of the blue trapeziums and the green rectangle. Thus, instead of the 2-D point x in the homogeneous coordinate system introduced earlier, we obtain a transformed point x_new defined in Eq. 2:

x_new = H K [R|t] X    (2)

Here, H is the known homography. The above equation is a simplification of

x_new = K T [R|t] X = K T K^{-1} K [R|t] X    (3)

where T is the unknown transformation matrix. We then warp the complete image using H and crop the rectangular region (the green rectangle in Fig. 5) to obtain the video frames. To find all the homography matrices for a video with camera pan, we consider the corners of the trapezium moving towards the rectangle corners. Once the homography matrix becomes the identity matrix, we move the corners of the trapezium away from the rectangle in the direction opposite to the initial flow (to form the mirrors of the initial trapeziums, e.g., the light-blue trapezium in Fig. 5). The process for generating videos with camera tilt is similar to that of pan; the only difference is that the trapeziums in videos with camera tilt have their vertical sides parallel (as shown in Fig. 6a), whereas the trapeziums in videos with camera pan have their horizontal sides parallel. For the videos with camera roll, we utilize the homography matrices between the corners of the rectangles rotating around the text center and the base (horizontal) box, as shown in Fig. 6b.
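The following is a minimal sketch, using OpenCV, of the pan-frame generation described above: a trapezium whose corners are equidistant from the crop corners is mapped to the fixed rectangle, the full image is warped with the resulting homography, and the rectangle is cropped. The corner-displacement pattern and the max_offset value are illustrative assumptions, not the paper's exact parameters.

```python
import cv2
import numpy as np

def pan_frames(image, crop_rect, num_frames=30, max_offset=40):
    # crop_rect = (x, y, w, h): the fixed green rectangle inside the synthetic image.
    x, y, w, h = crop_rect
    rect = np.float32([[x, y], [x + w, y], [x + w, y + h], [x, y + h]])
    frames = []
    for t in np.linspace(1.0, -1.0, num_frames):   # offsets go +max -> 0 -> -max (mirror)
        d = t * max_offset
        # Trapezium with its horizontal sides kept parallel (top shorter, bottom longer),
        # each corner displaced by d from the corresponding rectangle corner.
        trap = np.float32([[x + d, y], [x + w - d, y],
                           [x + w + d, y + h], [x - d, y + h]])
        H, _ = cv2.findHomography(trap, rect)      # maps trapezium -> rectangle
        warped = cv2.warpPerspective(image, H, (image.shape[1], image.shape[0]))
        frames.append(warped[y:y + h, x:x + w])    # crop the rectangular region
    return frames
```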


Fig. 6. Generating videos with camera (a) tilt, (b) roll, (c) zoom and (d) translation (cropped frames at the bottom of each panel)

For videos with camera translation, we use the regions covered by a rectangle moving from one text boundary to the other to generate the frames, as shown in Fig. 6d. We make sure that the complete text, with only rare partial occlusion of boundary characters, lies within each frame of the videos.

Fig. 7. Sample frames from the synthetic videos with multi-level text-boxes (Color figure online)


We also use the homography H to transform the multi-level text-boxes in the cropped image. Figure 7 depicts sample video frames with text-boxes at the line, word, and character levels (see Footnote 4), shown in blue, green, and red, respectively.

Table 1. Distribution of videos in the CATALISTd dataset

S.No.  Transformation type  Number of videos
1.     Translation          736
2.     Roll                 357
3.     Tilt                 387
4.     Pan                  427
5.     Zoom                 402

4.3 The CATALISTd Videos

We now present a new video-based scene-text dataset, which we refer to as CATALISTd. Every video in CATALISTd contains scene-text, potentially in a combination of three different languages, namely English, Hindi, and Marathi. For each such scene-text, we create 12 videos using 12 different types of camera transformations, broadly categorized into 5 groups: (i) four types of translation (left, right, up, and down), (ii) two types of roll (clockwise and anti-clockwise), (iii) two types of tilt (up-down or down-up motion), (iv) two types of pan (left-right and right-left), and (v) two types of zoom (in or out). We use a camera with a tripod stand to record all these videos so as to have uniform control. We summarize the distribution of the different types of videos in Table 1. It is important to note that there are four types of translation, whereas there are only two types for each of the other transformations. We capture these videos at 25 fps with a resolution of 1920 × 1080.

5 Experiments

We synthesize around 12,000 videos using the ALCHEMIST data generator, which we use only for training the models. We use 50 Unicode fonts5 and 18 license plate fonts6 to render text in these videos. The duration and frame rate of each video are 5 s and 25 fps, respectively. Moreover, we record a total of around 2k real videos (uniformly divided across the 12 camera transformations) using a camera mounted on a tripod stand for the CATALISTd dataset. The setup allows smooth camera movements for roll, tilt, pan, and zoom. We record the horizontal translation videos with the camera and tripod moving on a skateboard. The other translation videos, which exhibit top-to-bottom and reverse movements, have jitter because our tripod does not allow for smooth translation while recording such videos. We use a train:test split of 75:25 and carefully avoid letting any testing labels (as well as redundancy of the scenes) be present in the training data. We additionally record around 1k videos using handheld mobile phones and use them for training the models. Finally, we also make use of the 640 videos shared by Saluja et al. [15]. We refer to the complete training dataset described above as CATALISTALL in the next sections.

4 For Devanagari (the script used for Hindi and Marathi), we carefully consider the boxes at the level of joint glyphs instead of characters, since rendering characters individually (to obtain character-level text-boxes) hampers the glyph substitution rules that form the joint glyphs in Devanagari.
5 http://indiatyping.com/index.php/download/top-50-hindi-unicode-fonts-free.
6 https://fontspace.com/category/license%20plate.

Fig. 8. A sample video frame from the ICDAR’15 competition with text-boxes sorted using our algorithm

We further add the ICDAR’15 English video dataset of 25 training videos (13,450 frames) and 24 testing videos (14,374 frames) by Karatzas et al. [9] to our datasets. For each frame in the ICDAR’15 dataset, we first cluster the text-boxes into paragraphs and then sort the paragraph text-boxes from top-left to bottom-right. A sample video frame with this reading order and the text-boxes sorted using our algorithm is shown in Fig. 8. We visually verify that the reading order remains consistent throughout the appearance and disappearance of text in the videos. The reading order changes only when a new piece of text appears in the video or an old piece of text disappears from it.
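The paper does not spell out the sorting algorithm; the sketch below is one plausible, simplified version that groups word boxes into text lines (rather than full paragraphs) by vertical proximity and orders them top-left to bottom-right. The box format and the y_tol tolerance are assumptions.

```python
def sort_reading_order(boxes, y_tol=20):
    # boxes: list of (x, y, w, h) word boxes in one frame.
    lines = []
    for box in sorted(boxes, key=lambda b: b[1]):     # scan top to bottom
        for line in lines:
            if abs(line[-1][1] - box[1]) <= y_tol:    # close enough vertically: same line
                line.append(box)
                break
        else:
            lines.append([box])                        # start a new line
    ordered = []
    for line in sorted(lines, key=lambda l: min(b[1] for b in l)):
        ordered.extend(sorted(line, key=lambda b: b[0]))  # left to right within a line
    return ordered
```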


Although we record the controlled videos at a high resolution of 1920 × 1080, we work with a frame size of 480 × 260 for all videos, owing to the more limited size of the videos captured on mobile devices as well as to reduce the training time on a large number of video frames. To account for resolution and to remove the frames without text, we extract 480 × 260 sized clips containing the mutually exclusive text regions in the videos from the ICDAR’15 dataset. Features of size 14 × 28 × 1088 are extracted from the mixed-6a layer of inception-resnet-v2 [18]. The maximum sequence length of the labels is 180, so we unroll the LSTM decoder for 180 steps. We train all the models for 15 epochs.

Table 2. Test accuracy on different datasets.

S.No.  Training model                            Training data    Test data                        Char. Acc.  Seq. Acc.
1.     OCR-on-the-go (8 free masks)              OCR-on-the-go^a  OCR-on-the-go 200 test videos    35.00 [15]  1.30
2.     CATALIST model (8 free masks)             CATALISTALL^b    OCR-on-the-go 200 test videos    65.50       7.76
3.     CATALIST model (3 superv., 5 free masks)  CATALISTALL      OCR-on-the-go 200 test videos    68.67       7.91
4.     CATALIST model (8 free masks)             CATALISTALL      491 CATALISTd videos             73.97       6.50
5.     CATALIST model (3 superv., 5 free masks)  CATALISTALL      491 CATALISTd videos             73.60       7.96
6.     CATALIST model (8 free masks)             CATALISTALL      24 ICDAR’15 Competition videos   34.37       1.70
7.     CATALIST model (3 superv., 5 free masks)  CATALISTALL      24 ICDAR’15 Competition videos   35.48       0.72

a 640 real videos + 700k synthetic images.
b 3.7k real videos + 12k synthetic videos.

6 Results

We now present the results of the CATALIST model on the different datasets described in the previous section. It is important to note that we use a single CATALIST model, trained jointly on all the datasets (CATALISTALL) at once.

Results on the OCR-on-the-go Dataset. In the first three rows of Table 2, we show the results on the test data used for the OCR-on-the-go model by Saluja et al. [15]. The first row shows the results of that work. As shown in row 2, there is a dramatic improvement in character accuracy of 30.50% (from 35.0% to 65.5%) as well as in sequence accuracy of 6.46% (from 1.30% to 7.76%), due to the proposed CATALIST model as well as the ALCHEMIST and CATALISTd datasets we have created. Adding the multi-level mask supervision to the CATALIST model further improves the accuracies by 3.17% (from 65.50% to 68.67%) and 0.15% (from 7.76% to 7.91%).

Results on the CATALISTd Dataset. As shown in the fourth and fifth rows of Table 2, a gain of 1.46% (6.50 to 7.96) is observed in the sequence accuracy of the CATALIST model when we use the mask supervision. We, however, observe a slight gain of 0.37% in character-level accuracy when all the masks are set free (i.e., trained without any direct supervision).

Results on the ICDAR’15 Competition Dataset. We observe a gain of 1.11% (from 34.37% to 35.48%) in character-level accuracy on the ICDAR’15 competition dataset due to mask supervision. The end-to-end sequence accuracy for this dataset is as low as 1.70% for the model with all free masks and lowers further (by 0.98%) for the model with the first 3 masks trained using direct semantic supervision. We observe that the lower sequence accuracy for this dataset is due to the complex reading order in the frames.

7 Frame-Wise Accuracies for All Transformations

In Fig. 2, we presented the frame-level accuracy of E2E-MLT (with the Hindi text recognized using the OCR-on-the-go model), the OCR-on-the-go model, and the present work on an 8 s video exhibiting pan. In this section, we present the frame-level accuracy of the above-mentioned text-spotters for the other transformations: roll, zoom, tilt, and translation. The accuracy plots for a video with 88 frames (at 25 fps) exhibiting roll (clockwise) are shown in Fig. 9a. We use the formula in Eq. 4 to calculate the character accuracy, taking noise characters into consideration:

Accuracy = 100 * (length(GT) - edit_distance(P, GT)) / length(GT)    (4)

Here, GT denotes the ground-truth sequence and P is the predicted sequence. For some of the frames (with large amounts of transformation), the predicted sequence contains many noise characters. As a result, the edit distance between the predicted sequence and the ground-truth sequence may exceed the length of the ground-truth sequence; thus we get negative accuracy for some of the frames in Fig. 9a. As shown, our model has the highest mean and the lowest standard deviation for this video as well. This demonstrates the importance of training the CATALIST model with mask supervision on continuous video frames. Furthermore, it is essential to note that all the models perform poorly at the start of this video due to larger amounts of rotation compared to the later parts of the video.
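A small, self-contained implementation of the accuracy in Eq. 4 (with a standard Levenshtein edit distance) is sketched below; the example strings come from the discussion in Sect. 3.

```python
def edit_distance(a, b):
    # Standard Levenshtein distance via dynamic programming.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]

def char_accuracy(pred, gt):
    # Eq. 4: accuracy becomes negative when the prediction contains many
    # noise characters and the edit distance exceeds the ground-truth length.
    return 100.0 * (len(gt) - edit_distance(pred, gt)) / len(gt)

print(char_accuracy("Jalvihar G Guest House", "Jalvihar Guest House"))  # 90.0
```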


Fig. 9. Frame-wise accuracy of 3 text-spotters on videos exhibiting (a) roll, (b) zoom, (c) tilt and (d) translation

In Fig. 9b, we present similar plots for a video with 58 frames exhibiting zoom (out). The signboard in the video contains only Hindi text. The E2E-MLT model, however, outputs some English characters due to script mis-identification. Owing to this, the accuracy of the topmost plot (E2E-MLT + OCR-on-the-go) in Fig. 9b is the most unstable. Our model again achieves the highest mean and the lowest standard deviation across all the video frames. The plots for a video with 75 frames exhibiting tilt (up-down) are shown in Fig. 9c. As shown, contrary to the other figures, the OCR-on-the-go model performs poorly on this video, perhaps because it overfits to its license plates dataset. E2E-MLT generalizes well with respect to the OCR-on-the-go model; however, our model has the highest average accuracy. In Fig. 9d, we present similar plots for a video with 121 frames exhibiting translation (upward).


As discussed earlier in Sect. 5, the video clips recorded with the vertical camera movements in the setup possess jitter because the tripod does not allow for smooth translation while recording such videos. Our model, however, outputs the text with the highest accuracy and lowest standard deviation for the video we present in Fig. 9d.

8 Conclusion

In this paper, we presented CATALIST, a multi-task model for reading scene-text in videos, and ALCHEMIST, a data generator that produces videos from text images. These synthetic videos mimic the behaviour of videos captured with five different camera movements. We also presented the CATALISTd dataset of around two thousand real videos recorded with the camera movements mentioned above. By training the CATALIST model on both real and synthetic videos, we set new benchmarks for the task of reading multi-lingual scene-text in Hindi, Marathi, and English. The multi-level mask supervision improved either character or sequence accuracy (or both) on three different datasets with varying complexities.

9 Future Work

The camera movement information in the CATALISTd dataset is well suited to Capsule Networks [5,8]. Unlike conventional CNNs, capsule networks are viewpoint invariant. The transformation information can help capsules with video scene-text detection by helping the network learn about camera movements.

Acknowledgment. We thank Shubham Shukla for dataset collection and annotation efforts.

References

1. Abadi, M., et al.: TensorFlow: a system for large-scale machine learning. In: 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 2016), pp. 265–283 (2016)
2. Bartz, C., Yang, H., Meinel, C.: STN-OCR: a single neural network for text detection and text recognition. arXiv preprint arXiv:1707.08831 (2017)
3. Bušta, M., Neumann, L., Matas, J.: Deep TextSpotter: an end-to-end trainable scene text localization and recognition framework. In: International Conference on Computer Vision (2017)
4. Bušta, M., Patel, Y., Matas, J.: E2E-MLT - an unconstrained end-to-end method for multi-language scene text. In: Carneiro, G., You, S. (eds.) ACCV 2018. LNCS, vol. 11367, pp. 127–143. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-21074-8_11
5. Duarte, K., Rawat, Y.S., Shah, M.: VideoCapsuleNet: a simplified network for action detection. arXiv preprint arXiv:1805.08162 (2018)
6. Gupta, A., Vedaldi, A., Zisserman, A.: Synthetic data for text localisation in natural images. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2315–2324 (2016)
7. Hartley, R., Zisserman, A.: Multiple View Geometry in Computer Vision. Cambridge University Press, Cambridge (2003)
8. Hinton, G.E., Sabour, S., Frosst, N.: Matrix capsules with EM routing. In: International Conference on Learning Representations (2018)
9. Karatzas, D., et al.: ICDAR 2015 competition on robust reading. In: 2015 13th International Conference on Document Analysis and Recognition (ICDAR), pp. 1156–1160. IEEE (2015)
10. Karatzas, D., et al.: ICDAR 2013 robust reading competition. In: 2013 12th International Conference on Document Analysis and Recognition, pp. 1484–1493. IEEE (2013)
11. Liao, M., Shi, B., Bai, X., Wang, X., Liu, W.: TextBoxes: a fast text detector with a single deep neural network. In: AAAI, pp. 4161–4167 (2017)
12. Milletari, F., Navab, N., Ahmadi, S.A.: V-Net: fully convolutional neural networks for volumetric medical image segmentation. In: 2016 Fourth International Conference on 3D Vision (3DV), pp. 565–571. IEEE (2016)
13. Liao, M., Shi, B., Bai, X.: TextBoxes++: a single-shot oriented scene text detector. CoRR abs/1801.02765 (2018)
14. Reddy, S., Mathew, M., Gomez, L., Rusinol, M., Karatzas, D., Jawahar, C.: RoadText-1K: text detection & recognition dataset for driving videos. In: 2020 IEEE International Conference on Robotics and Automation (ICRA), pp. 11074–11080. IEEE (2020)
15. Saluja, R., Maheshwari, A., Ramakrishnan, G., Chaudhuri, P., Carman, M.: Robust end-to-end systems for reading license plates and street signs. In: 2019 15th IAPR International Conference on Document Analysis and Recognition (ICDAR), pp. 154–159. IEEE (2019)
16. Shi, B., Bai, X., Yao, C.: An end-to-end trainable neural network for image-based sequence recognition and its application to scene text recognition. IEEE Trans. Pattern Anal. Mach. Intell. 39(11), 2298–2304 (2017)
17. Smith, R., et al.: End-to-end interpretation of the French street name signs dataset. In: Hua, G., Jégou, H. (eds.) ECCV 2016, Part I. LNCS, vol. 9913, pp. 411–426. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46604-0_30
18. Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., Wojna, Z.: Rethinking the inception architecture for computer vision. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2818–2826 (2016)
19. Wojna, Z., et al.: Attention-based extraction of structured information from street view imagery. In: 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), vol. 1, pp. 844–850. IEEE (2017)
20. Yu, F., et al.: BDD100K: a diverse driving video database with scalable annotation tooling. arXiv preprint arXiv:1805.04687 (2018)
21. Zhang, Y., Gueguen, L., Zharkov, I., Zhang, P., Seifert, K., Kadlec, B.: Uber-Text: a large-scale dataset for optical character recognition from street-level imagery. In: SUNw: Scene Understanding Workshop - CVPR (2017)

DDocE: Deep Document Enhancement with Multi-scale Feature Aggregation and Pixel-Wise Adjustments

Karina O. M. Bogdan1(B), Guilherme A. S. Megeto1, Rovilson Leal1, Gustavo Souza1, Augusto C. Valente1, and Lucas N. Kirsten2

1 Instituto de Pesquisas Eldorado, Campinas, SP, Brazil
{karina.bogdan,guilherme.megeto,rovilson.junior,gustavo.souza,augusto.valente}@eldorado.org.br
2 HP Inc., Porto Alegre, RS, Brazil
[email protected]

Abstract. Digitizing a document with a smartphone can be a difficult task under suboptimal environment lighting conditions. The presence of shadows and insufficient illumination may reduce the quality of the content, such as its readability, colors, or other aesthetic aspects. In this work, we propose a lightweight neural network to enhance photographed document images using a feature extraction module that aggregates multi-scale features and a pixel-wise adjustment refinement step. We also provide a comparison with different methods, including methods not originally proposed for document enhancement. We focused on the aesthetic aspects of the images, for which we used traditional image quality assessment (IQA) metrics and others based on deep learning models of human quality perception of natural images. Our deep document enhancement (DDocE) method was able to lessen the negative effects of different artifacts, such as shadows and insufficient illumination, while also maintaining good color consistency, resulting in a better final enhanced image than the ones obtained with other methods.

Keywords: Deep learning · Document enhancement · Image enhancement

1 Introduction

The popularity, multitasking purpose, and processing power of smartphones have been leading to a substitution of dedicated hardware for different tasks and a concentration of applications within these devices. For the task of digitizing an image, dedicated hardware such as a flatbed scanner provides a highly controlled lighting environment. On the other hand, using a mobile phone to provide a high-quality scanned aspect of a photographed document can be difficult when there are suboptimal environment lighting conditions, resulting in


images with artifacts caused by inadequate illumination, presence of shadows, or blurriness due to camera focus.

Many existing approaches for image enhancement focus on natural images, especially to solve the problem of low-light environments and degraded images [8,20,26,30]. Natural image enhancement shares challenging aspects with document enhancement, mostly due to the influence of environment lighting. However, some aspects are especially crucial for the document enhancement task (e.g., the presence of shadows in text regions, and crumpled paper), since they can interfere with the readability of the text. Moreover, the enhancement must preserve the color regions and provide a uniform enhancement over each region of the document image to avoid undesired effects, for instance, on regions that are less affected by the artifacts.

Techniques for document enhancement include traditional image processing algorithms [5,25] and machine learning approaches [10,13,17,24]. The enhancement can focus on the document geometry, which can be distorted, i.e., not flat considering the camera perspective, requiring techniques to unwarp the deformed document through rectification [18,23]; or it can focus on removing scanning artifacts introduced in the image capturing process, such as the effects of uneven illumination over the document [5,10,13,17,24]. The latter can also consider a document image with a crumpled remaining aspect, but in this case the capture is taken with the unwarped version of the document.

As mentioned above, one of the most challenging aspects of document enhancement arises from non-uniform illumination over the captured document. Although shadow removal methods [17,24] have a similar objective regarding the elimination of shadows and uneven illumination, they mostly do not enhance other aesthetic aspects of the document, such as colors and the remaining artifacts of crumpled paper. For instance, Lin et al. [17] train their method on shadow/shadow-free document pairs captured in the same environment, as their objective is purely shadow removal. In this work, we are not only interested in removing shadows and uneven illumination artifacts, but also in enhancing the photographed image to be as close as possible to its original digital version, or a high-quality artifact-free scanned image, before printing the document. An example of the task and the results of state-of-the-art models are presented in Fig. 1.

Fig. 1. Example of document enhancement results. The method of Fan [5] and our model present the best results, whereas HP Inc. [13] and Zero-DCE [8] over-enhance the image, specifically the top region of the apple, for which Zero-DCE loses the color saturation of the region.

We address the task of document enhancement through a method that aggregates features from multiple scales, producing enhancement adjustments at the pixel level to be applied cumulatively in a refinement process, based on Guo et al. [8]. The proposed method is designed in an end-to-end fashion, being, to the best of our knowledge, one of the first deep learning methods to this end that is trained from scratch, i.e., without requiring pre-trained models. As shown in the experiments section, the method is not only able to deal with a wide range of documents, including colored documents, but also produces a lightweight solution for the task.

Our main contributions include the following. We propose a lightweight enhancement model, named DDocE, for photographed documents that uses a feature extraction module based on a multi-scale feature aggregation approach that expands the pixel-neighborhood view, which is demonstrated to be an important factor in dealing with colored documents. We present an investigation of other state-of-the-art models, which were not originally proposed for the problem, to build a strong comparison and possible baseline for document enhancement. We provide an extensive set of experiments using a robust evaluation protocol that considers complementary IQA metrics, including metrics based on the human perceptual assessment of natural images, to assess the quality of the produced enhancements. Moreover, we also propose a simple data augmentation technique to improve the enhancements.

2 Related Work

Enhancement of Natural Images. The enhancement of natural images has the aesthetic objective of making a digital image more pleasant to human perception considering different characteristics such as color balance, contrast, brightness, sharpness, etc. [7]. Works such as [2,8,14,19,20,30] aim to automate the human effort of providing partial or full image enhancement. Although all these models tackle the image enhancement problem, they focus on specific branches of the task. For instance, the enhancement can focus on improving aesthetic aspects that are highly affected [2,8,14] or not [19,20,30] by environment factors introduced in the capture process. Among the models that focus on correcting environment artifacts in digital images, Islam et al. [14] proposed a generative model for underwater image enhancement, while Guo et al. [8] and Atoum et al. [2] proposed models for low-light exposure image enhancement. Moreover, Guo et al. [8] focused on global enhancement through pixel-wise adjustments applied over a refinement process to remove the underexposed aspect of the image, and Atoum et al. [2] focused on color enhancement by proposing a color-wise attention model that separates the enhancement of the lightness and color components.

Enhancement of Document Images. The goal of document enhancement is to improve the digital image quality of a document by removing artifacts or restoring any degradation introduced by the capturing process (e.g., using a mobile camera or a flatbed scanner) to facilitate human readability, which could aid the extraction and recognition of textual content by an Optical Character Recognition (OCR) system, but mostly aiming to restore its original digital appearance [6,10,11]. In situations with controlled environments (e.g., controlled illumination) or simpler images (e.g., black and white), traditional image processing techniques such as histogram equalization and binarization may be sufficient to remove noise and enhance a region of interest [10]. However, for colored documents and environments with inadequate lighting, binarization, for example, will affect the document appearance, which will no longer resemble the original digital version. For those cases, a more robust document enhancement approach is required. To that end, Fan [5] proposed a watershed-based segmentation model with traditional image processing techniques, such as color, illumination, and sharpening corrections. He et al. [10], on the other hand, proposed a deep learning model that learns binarization maps to be used in a refinement process of a model that learns degradation patterns in document images. Finally, HP Inc. [13] proposed a method that first tries to identify the document content through segmentation and then uses this segmentation mask to enhance the document image using an ensemble of modern deep learning architectures.

Image Quality Assessment (IQA). In order to evaluate the quality of a digital image, enhancement approaches mostly rely on image quality assessment (IQA) metrics that evaluate the aesthetic aspect of the improved images [3,4,21,28,31]. For the specific task of document image quality assessment (DIQA), some metrics have also been proposed, such as the work of Hussain et al. [12], which proposed a no-reference metric (different from the scenario considered here, since we expect to compare our enhanced image to a reference image), and the work of Alaei et al. [1], which proposed a model based on local and global texture similarity to predict image quality on a data set considering only JPEG compression distortions applied to a small set of reference images (no open-source implementation was found to reproduce the proposed metric).

3 Multi-scale Feature Aggregation for Document Enhancement

We propose a deep learning approach that aggregates multi-scale and contextual features from the input document image and generates enhancement curves to be applied at the pixel level in a refinement process. First, the model extracts features from the input image with a squeeze-and-expand architecture that uses an intermediate multi-scale context aggregation. The enhancement features obtained in this first step are then used to predict pixel-wise curves that improve the photographed image and produce an artifact-free (i.e., enhanced) version of it. This process uses a refinement procedure where, at each step, one pixel-wise enhancing curve is applied to the previous enhanced image, with the last step generating the final enhanced image. An overview of our model is presented in Fig. 2. The following outlines each component of our model.

Fig. 2. Overview of the model. From an input image I we first squeeze and generate features f_s. Then, we pass f_s through the multi-scale feature aggregation, generating features c. From c, the expand module generates the enhancement features f_e that are used to estimate the n pixel-wise curves with the same shape as I. These curves are applied cumulatively over the input image in order to enhance it. To visualize the pixel-wise adjustments in the figure, we considered all RGB adjustments for n = 3 curves altogether.

Squeeze-Expand Modules. The squeeze and expand modules work as an encoding and decoding process. The squeeze module downsamples the input image I ∈ R^{H×W×C}, where C is the number of color channels, into a feature vector f_s ∈ R^{H'×W'×C_s}, with H' ≤ H and W' ≤ W. Hence, this first module decreases the resolution of the input image for the following multi-scale aggregation module, which processes the features while maintaining their resolution. The features f_s are then used by the multi-scale feature aggregation module to generate the context-aggregated features c ∈ R^{H'×W'×2C_s}. The expand module then upscales the feature vector c to f_e ∈ R^{H×W×C_e}; that is, we upscale c to the original height and width of the input image I to generate the pixel-wise adjustments in the curve estimation step. The squeeze module is composed of two convolutional layers with kernel size 3 × 3 and stride 2. The expand module is composed of two transposed convolutional layers with kernel size 3 × 3 and stride 2, where C_e = 32.
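A minimal tf.keras sketch of the squeeze and expand modules as described above is given below; the channel width Cs and the activation functions are assumptions, since the paper only fixes the kernel size, stride, number of layers, and Ce = 32.

```python
import tensorflow as tf
from tensorflow.keras import layers

def squeeze_module(cs=32):
    # Two stride-2 3x3 convolutions: I (H x W x C) -> f_s (H/4 x W/4 x Cs).
    return tf.keras.Sequential([
        layers.Conv2D(cs, 3, strides=2, padding="same", activation=tf.nn.leaky_relu),
        layers.Conv2D(cs, 3, strides=2, padding="same", activation=tf.nn.leaky_relu),
    ])

def expand_module(ce=32):
    # Two stride-2 3x3 transposed convolutions: c (H' x W' x 2Cs) -> f_e (H x W x Ce).
    return tf.keras.Sequential([
        layers.Conv2DTranspose(ce, 3, strides=2, padding="same", activation=tf.nn.leaky_relu),
        layers.Conv2DTranspose(ce, 3, strides=2, padding="same", activation=tf.nn.leaky_relu),
    ])
```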

Multi-scale Feature Aggregation. To aid in the identification of camera-captured artifacts such as noise and shadows, one can consider different regions of the pixel neighborhood in order to capture more context and ease the algorithm's effort to differentiate artifacts from real parts of the original document content. This expanded view of the pixel neighborhood can even enable a proper enhancement of particular regions that are more affected than others (e.g., under non-uniform illumination over the document). Towards a global enhancement that is still capable of dealing with different local artifacts, we propose to use a feature aggregator that encloses multi-scale features from the input image I. These multi-scale features are extracted by expanding the view of the input features' pixel neighborhood, augmenting the receptive field of the convolutional operations through dilated convolutions. We extract these features with a module based on the work of Yu et al. [29] for semantic segmentation, which originally expands the receptive field for feature extraction from 3 × 3 up to 67 × 67 of the feature resolution, such that each expansion step represents a different scale of the receptive field. In our feature extraction process, we expand the receptive field up to 69 × 69 and follow the architecture of Yu et al. [29] for the dilation factors of the convolutional operations. This module is added between our squeeze and expand modules and uses as input the features f_s ∈ R^{H'×W'×C_s} from the squeeze module to aggregate multi-scale information and generate a contextual feature vector c ∈ R^{H'×W'×2C_s}. Therefore, we maintain the feature resolution in c but change the receptive field over the network. The features c are then used by the expand module, which generates f_e. The final architecture of the multi-scale feature aggregation is composed of eight blocks of one convolution and one Leaky ReLU activation (α = 0.2), following the structure of Table 1, with skip connections between initial and mid-level layers as illustrated in Fig. 2; a sketch of this module is given after Table 1.

Table 1. Architecture of the multi-scale feature aggregation module that encloses and processes different scales of the input features f_s.

Layer                                 1     2     3     4      5      6      7      8      Output
Convolution                           3×3   3×3   3×3   3×3    3×3    3×3    3×3    3×3    –
Dilation factor                       1     1     2     4      8      16     1      1      –
Receptive field                       3×3   5×5   9×9   17×17  33×33  65×65  67×67  69×69  –
Output channels                       Cs    Cs    Cs    Cs     Cs     Cs     Cs     Cs     2Cs
Input channels                        Cs    Cs    Cs    Cs     Cs     2Cs    2Cs    2Cs    2Cs
Skip connection (concatenated layer)  –     –     –     –      1      2      3      4      –
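The following sketch assembles the eight dilated-convolution blocks of Table 1 in tf.keras. It reflects one reading of the skip-connection row (the output of block i, for i = 5..8, is concatenated with the output of block i - 4 before the next block); the paper's exact wiring may differ, and the example shapes are assumptions.

```python
import tensorflow as tf
from tensorflow.keras import layers

def multi_scale_aggregation(fs, cs=32):
    # Eight blocks of 3x3 conv + LeakyReLU(0.2); dilation rates follow Yu et al.
    dilations = [1, 1, 2, 4, 8, 16, 1, 1]
    act = layers.LeakyReLU(0.2)
    outs = []
    x = fs
    for i, d in enumerate(dilations):
        x = act(layers.Conv2D(cs, 3, dilation_rate=d, padding="same")(x))
        outs.append(x)
        if i >= 4:                          # blocks 5..8 concatenate blocks 1..4
            x = tf.concat([x, outs[i - 4]], axis=-1)
    return x                                # context features c with 2*Cs channels

# Usage sketch (assumed shapes):
# fs = tf.random.normal([1, 64, 64, 32])
# c = multi_scale_aggregation(fs)           # -> shape [1, 64, 64, 64]
```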



Pixel-Wise Enhancement Process. From the aggregated features and their posterior expansion, which generates the features f_e, we estimate n pixel-wise enhancement curves to be applied over the mobile-photographed document in a refinement process. The pixel-wise curve estimation step produces curve maps A_i ∈ R^{H×W×C}, for i ∈ N with 1 ≤ i ≤ n, from f_e, where each curve is applied at its corresponding iteration i of the refinement process. Each curve estimation model, which generates A_i from f_e, is implemented using a convolutional layer of kernel size 3 × 3 and stride 1 followed by a hyperbolic tangent activation function. We adapted the refinement process of Guo et al. [8], which was initially proposed for low-light exposure of natural images with no target reference, to a full-reference scenario, where we consider the original digital image of the document (i.e., the PDF version) as the target (ground truth). The refinement process generates an image E_i ∈ R^{H×W×C}, with 1 ≤ i ≤ n, after a sequence of operations, defined in Eq. 1, applied over the previous enhancement. The result at each step is defined by:

E_i = E_{i-1} + A_i E_{i-1} (1 - E_{i-1})    (1)

where E_0 is the normalized input image I ∈ [0, 1], and A_i ∈ R^{H×W×C} is a trainable matrix containing coefficients α_{kl} ∈ [−1, 1], with 0 ≤ k ≤ H and 0 ≤ l ≤ W, that control the magnitude and exposure level of each pixel in the second term of Eq. 1. The second term of Eq. 1 works as a highlight-and-diminish operation on the enhanced image E_{i-1}, as it is encouraged to learn how to uncover low-light-exposure or shadow regions and how to remove noise in the image under the full-reference loss function. Therefore, after the last cumulative enhancement step, we obtain the final enhanced image Î, such that Î = E_n. The loss function used to train the model consists of the L1 full-reference loss between the image enhanced by our model, Î ∈ R^{H×W×C}, and the expected enhancement, or ground truth, I_GT ∈ R^{H×W×C}:

L = λ ||I_GT − Î||_1    (2)

where λ is a weight2 for the enhancement comparison.
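A compact sketch of the refinement step of Eq. 1 and the loss of Eq. 2 is given below; using the mean absolute error as the L1 term is an assumption about how the norm is reduced.

```python
import tensorflow as tf

def refine(image, curves):
    # Eq. 1: E_i = E_{i-1} + A_i * E_{i-1} * (1 - E_{i-1}).
    # image: input E_0 normalized to [0, 1]; curves: list of n maps A_i in [-1, 1]
    # with the same shape as the image (n = 3 in the paper's experiments).
    e = image
    for a in curves:
        e = e + a * e * (1.0 - e)
    return e                                   # final enhanced image I_hat = E_n

def l1_loss(i_hat, i_gt, lam=50.0):
    # Eq. 2: full-reference L1 loss (lambda = 50 in the implementation details).
    return lam * tf.reduce_mean(tf.abs(i_gt - i_hat))
```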

4 Experiments

Implementation Details. We empirically set λ = 50 and n = 3, and considered RGB images (C = 3). The networks were trained from scratch with the Adam [16] optimizer with a learning rate of 10^-4 (β1 = 0.9). The weights of our models were initialized from a normal distribution N(0, 0.02), and the models were trained for up to 50 epochs. We reduced the learning rate by a factor of 0.1 after 5 epochs without significant improvement in the validation loss and, similarly, stopped the training after 30 epochs without significant improvement in the validation loss. All convolutional layers use padding that ensures the output has the same shape as the input. A dropout of 0.05 is applied in the first two layers of the multi-scale aggregation module. All models were implemented with TensorFlow 2.3.0.

Datasets. For our experiments, we used a private dataset composed of several photographed types of documents, such as plain text, magazines, articles, advertising documents, flyers, and receipts; in black and white or including colors; and also with variations of shadows. Since such data are scarce and difficult to obtain, for the train set we considered as ground truth the enhancements obtained by the method of Fan [5], which were manually chosen to prevent the addition of poor enhancements as ground truth. In the test set, we used pairwise images composed of mobile-captured document images and their original digital form (i.e., the PDF version) as the ground-truth images. Hence, we considered two sets of data: the train set, composed of 674 pairs of images (the photographed document and the Fan [5] enhanced image), and the test set, composed of 198 pairs of images (the photographed document and the original digital image). Overall, all image pairs were manually chosen in order to build a dataset composed of high-quality ground-truth enhancements that present a good level of improvement compared to the raw (photographed) image. In order to augment the data, we performed random horizontal and vertical flips, and extracted 256 × 256 overlapping patches from the training images with a sliding-window technique with a stride of 128. Additionally, we evaluated a sampling method with variable window size to increase the pixel neighborhood of the image patches, which we refer to as pyramid sampling. The pyramid setup considers several window sizes, with patches of size 256 × 256 (stride 128), 512 × 512 (stride 256), 1024 × 1024 (stride 512), and the full image. Since the document images contain large blank regions, we also apply a Laplacian operator over the patches to discard samples below an absolute sum-of-gradient threshold of 9 · 10^5 (a code sketch of this filtering step is given at the end of this subsection).

Evaluation Metrics. For the evaluation protocol, we considered the following full-reference IQA metrics: peak signal-to-noise ratio (PSNR), multi-scale structural similarity (MS-SSIM) [28], PieAPP [21], WaDIQaM [3], LPIPS [31], and DISTS [4], where the last four are perceptual metrics based on deep learning. The quantitative results were calculated considering a standard resolution of 512 × 512 for all test images. Therefore, for testing, we considered the full images at this fixed resolution, which differs from the data augmentation applied to the train set, which considered patches.

Baseline Models. We compare our model with six baseline methods: U-Net [22], Pix2Pix [15], Pix2PixHD [27], Fan [5], HP Inc. [13]3, and Zero-DCE [8]. The first three were proposed for semantic segmentation and image-to-image translation, and the last three for image enhancement tasks. Although the first three methods were not originally proposed for enhancement tasks, they are competitive or state-of-the-art models and some of them have been adapted to similar tasks [10]. Therefore, to strengthen our evaluation protocol, we adapted the original task of these models to perform full-reference document enhancement, which considers both the photographed document and the corresponding original image of the document to be available during training. For a fair comparison, we trained the following models from scratch with the same dataset used to train our model: U-Net [22], Pix2Pix [15], Pix2PixHD [27], and Zero-DCE [8]. Moreover, we used the pre-trained model of HP Inc. [13], which was trained with a dataset composed partially of the dataset considered in this work.

2 Although we considered only one loss, we used an adaptive optimizer with this setup. Therefore, we maintained this definition here for reproducibility purposes.
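The patch filtering described in the Datasets paragraph above can be sketched as follows; interpreting the "absolute sum of gradient" as the sum of absolute Laplacian responses, and converting patches to grayscale first, are assumptions.

```python
import cv2
import numpy as np

def informative_patches(image, size=256, stride=128, grad_thresh=9e5):
    # Sliding-window patch extraction; patches whose summed absolute Laplacian
    # response falls below the threshold (largely blank regions) are discarded.
    patches = []
    h, w = image.shape[:2]
    for y in range(0, h - size + 1, stride):
        for x in range(0, w - size + 1, stride):
            patch = image[y:y + size, x:x + size]
            gray = cv2.cvtColor(patch, cv2.COLOR_BGR2GRAY)
            lap = cv2.Laplacian(gray, cv2.CV_64F)
            if np.abs(lap).sum() >= grad_thresh:
                patches.append(patch)
    return patches
```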

4.1 Quantitative Evaluation

The quantitative comparison between our model and the baselines for document enhancement is presented in Table 2.

3 Code and pre-trained models kindly provided by the authors.

Table 2. Quantitative results. The first group of results corresponds to models not originally proposed for document enhancement but adapted to it. The second group corresponds to models originally proposed for image enhancement: for documents (Fan [5] and HP Inc. [13]) and natural images (Zero-DCE [8]), respectively. Best results for each metric are highlighted in boldface. # P. indicates the rounded-up number of parameters.

Model                        # P.   PSNR↑        MS-SSIM↑    PieAPP↓      WaDIQaM↑    LPIPS↓       DISTS↓
U-Net [22]                   2M     15.9 ± 2.7   0.81 ± .07  0.89 ± 0.74  0.56 ± .10  0.166 ± .05  0.196 ± .05
Pix2Pix [15]                 57M    16.2 ± 2.7   0.81 ± .08  1.18 ± 0.81  0.55 ± .13  0.168 ± .05  0.191 ± .05
Pix2PixHD [27]               113M   15.7 ± 2.7   0.79 ± .09  0.95 ± 1.12  0.51 ± .12  0.206 ± .06  0.236 ± .06
Fan [5]                      –      15.1 ± 3.4   0.80 ± .08  1.00 ± 0.94  0.58 ± .08  0.163 ± .05  0.162 ± .05
HP Inc. [13]                 5M     16.9 ± 3.0   0.84 ± .07  1.49 ± 0.88  0.57 ± .10  0.157 ± .05  0.169 ± .04
Zero-DCE [8] full-reference  79k    15.9 ± 3.0   0.82 ± .08  1.10 ± 0.84  0.58 ± .09  0.165 ± .05  0.186 ± .04
Our model                    595k   16.1 ± 2.7   0.82 ± .07  1.09 ± 0.86  0.58 ± .09  0.157 ± .05  0.167 ± .05
Our model w/ pyramid         595k   16.1 ± 3.0   0.83 ± .07  1.08 ± 0.79  0.63 ± .06  0.143 ± .05  0.147 ± .04

Although the first group of baselines was not originally proposed for the full-reference document enhancement problem, they achieved competitive results. Among them, Pix2PixHD [27] generated the worst result. However, this is a promising result considering that the model does not use a pixel-wise reconstruction loss (e.g., L1 or L2) as the others do. Overall, U-Net [22] was the best model of the first group, since it achieved the best result in four of the six IQA metrics. Nevertheless, despite the results of six IQA metrics, an assertive conclusion on this first group is difficult to make, as some results are very close to each other. Regarding the second group of models, which were originally proposed for image enhancement, we first notice that the algorithm proposed by Fan [5] produced competitive results despite being a traditional approach based only on image processing techniques. Overall, the work of HP Inc. [13] generated the best results considering the PSNR and MS-SSIM metrics, and values close to our best results. We also include the results of the Zero-DCE [8] model adapted to our full-reference task. Although our refinement process is based on Guo et al. [8], we achieved better results, which indicates the relevance of our proposed feature extraction based on multi-scale aggregation. Moreover, our model provides a lightweight solution considering the approximate number of parameters of each model evaluated.

4.2 Qualitative Evaluation

We present examples of qualitative results in Figs. 3, 4, and 5. From these figures we can observe that there are particular types of documents that are challenging for all models, such as the first document in Fig. 3 for which all models missed the gray dog positioned at the center of the document, probably mistaking this region for a shadow or scanning artifact. The result with the U-Net [22] model preserves residual pixels of that part, but still lacks the proper enhancement of the region. Regarding documents with colored regions, some models presented

Fig. 3. Qualitative results of each model evaluated for documents with large colored regions.

a better filling of existing colored regions in the enhanced image by preserving the color saturation of the areas. Examples include the head of the orange dog in the first document of Fig. 3 and the cups in the second document in the same figure, for which our model achieved the best qualitative results. Another challenging scenario is the one presented in Fig. 4, which includes heavily shaded regions. For the first document in Fig. 4, with the shade concentrated on the bottom-left corner area, our model generated the best results but

Fig. 4. Qualitative results of each model evaluated for documents with large areas of text and shade artifacts.

still kept residual artifacts in the corner of the shaded region. The model of HP Inc. [13] also generated a good result, but left traces of residual artifacts which are more attenuated in ours. On the other hand, for the second document, the best results were achieved by Fan [5] and by our model with the pyramid sampling strategy. Although both results leave residual artifacts in the enhanced image, mostly by whitening the region outside the shade, our model obtained the best results by diminishing the shade effect and maintaining a uniform aspect

Fig. 5. Qualitative results of each model evaluated for a document with colored regions and with a shade that can be mistaken for a scanning artifact.

in the enhancement. Nevertheless, there are parts of a document that can be incorrectly interpreted as scanning artifacts, such as the shade in the document of Fig. 5, which only our model was capable of enhancing correctly by maintaining the butterfly shadow. Lastly, considering the results of our model with different sampling strategies, we noticed that the dense sampling provided by the pyramid strategy, which samples large views of the image and thus adds more context information, generated better results, mostly aiding in the removal of residual scanning artifacts.

4.3 Ablation Studies

Feature Extraction. Since our main contribution lies in the feature extraction module, we provide a comparison of different models for this part. We start by considering the feature extractor of Guo et al. [8], as we use their pixel-wise refinement process. Then, we also considered the architecture of Hamaguchi et al. [9] for the multi-scale aggregation module, which uses a different setup for the dilation factors of the convolutional operations. Moreover, we considered variations of our model obtained by removing the squeeze and expand modules, and by using a traditional convolutional operation (dilation factor of 1) instead of the setup of the multi-scale aggregation module presented in Sect. 3. A comparison between the results of the ablation study is presented in Fig. 6, and the quantitative results in Table 3 (first and second groups of results). One aspect that we noticed in almost all results with colored documents is that other approaches for the feature extraction generally obtain results where the colored regions are poorly filled with their true colors, although other aspects such as shadows are correctly enhanced. Even the variations of our model result in poorly filled regions, which could indicate the importance of the squeeze, multi-scale aggregation, and expand modules altogether. Nonetheless, regions of text with a colored background are still a challenge for our model, as the background region is sometimes overly enhanced, generating an overly brightened aspect where parts of the colored region lose their saturation.
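As a minimal sketch of the kind of module being ablated here, the block below chains a squeeze convolution, several parallel dilated convolutions, and an expand convolution in PyTorch. The channel counts, dilation factors, and activation are illustrative assumptions and not the exact configuration of Sect. 3, which lies outside this excerpt.

```python
import torch
import torch.nn as nn

class MultiScaleAggregation(nn.Module):
    """Sketch of a squeeze -> parallel dilated convolutions -> expand block.

    Channel counts and dilation rates are illustrative assumptions, not the
    paper's exact configuration.
    """
    def __init__(self, channels: int = 64, squeezed: int = 16, dilations=(1, 2, 4, 8)):
        super().__init__()
        self.squeeze = nn.Conv2d(channels, squeezed, kernel_size=1)
        self.branches = nn.ModuleList(
            [nn.Conv2d(squeezed, squeezed, kernel_size=3, padding=d, dilation=d)
             for d in dilations]
        )
        self.expand = nn.Conv2d(squeezed * len(dilations), channels, kernel_size=1)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        s = self.act(self.squeeze(x))                          # reduce channels
        multi = [self.act(b(s)) for b in self.branches]        # different receptive fields
        return self.act(self.expand(torch.cat(multi, dim=1)))  # fuse the scales

# Example: features = MultiScaleAggregation()(torch.randn(1, 64, 128, 128))
```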


Table 3. Ablation experiments: first group, variations in the feature extractor of our model; second group, variations of our model; third group, variations in the refinement process. Best results for each metric are highlighted in boldface.

| Model | PSNR↑ | MS-SSIM↑ | PieAPP↓ | WaDIQaM↑ | LPIPS↓ | DISTS↓ |
| Our model w/ Zero-DCE [8] extractor | 15.7 ± 3.0 | 0.81 ± .08 | 0.94 ± .84 | 0.58 ± .07 | 0.163 ± .05 | 0.163 ± .04 |
| Multi-scale aggregation w/ Hamaguchi et al. [9] | 16.0 ± 2.7 | 0.81 ± .08 | 1.02 ± .88 | 0.59 ± .08 | 0.161 ± .05 | 0.167 ± .04 |
| Multi-scale aggregation w/ Hamaguchi et al. [9] w/o squeeze-expand | 15.9 ± 2.9 | 0.82 ± .08 | 1.01 ± .85 | 0.59 ± .07 | 0.160 ± .05 | 0.162 ± .04 |
| Our model w/o squeeze-expand | 16.1 ± 2.9 | 0.82 ± .07 | 0.92 ± .80 | 0.60 ± .07 | 0.154 ± .05 | 0.162 ± .04 |
| Multi-scale aggregation (dilation factor of 1) | 16.1 ± 2.8 | 0.81 ± .08 | 1.03 ± .86 | 0.59 ± .08 | 0.158 ± .05 | 0.159 ± .04 |
| Our model w/ n = 1 | 14.2 ± 2.0 | 0.81 ± .07 | 1.21 ± .98 | 0.46 ± .12 | 0.178 ± .05 | 0.231 ± .05 |
| Our model w/ n = 5 | 16.1 ± 2.8 | 0.82 ± .08 | 1.02 ± .81 | 0.59 ± .09 | 0.155 ± .05 | 0.166 ± .04 |
| Our model w/ n = 8 | 16.1 ± 2.7 | 0.82 ± .07 | 1.07 ± .88 | 0.58 ± .09 | 0.157 ± .05 | 0.169 ± .04 |
| Our model (n = 3) | 16.1 ± 2.7 | 0.82 ± .07 | 1.09 ± .86 | 0.58 ± .09 | 0.157 ± .05 | 0.167 ± .05 |
| Our model (n = 3) w/ pyramid | 16.1 ± 3.0 | 0.83 ± .07 | 1.08 ± .79 | 0.63 ± .06 | 0.143 ± .05 | 0.147 ± .04 |

Refinement Process. To understand how the number of learned adjustments impacts the results, we also evaluated different numbers of refinement iterations. Guo et al. [8] obtained their best results for low-light enhancement of natural images with n = 8 refinements. Hence, we evaluated this number along with n = 1 and n = 5, and we present the results in the third group of Table 3. From these results, we observe that one iteration alone is not enough to enhance the document images. On the other hand, five and eight refinements obtain results similar to our default setup of n = 3. Moreover, the qualitative results presented in Fig. 6 show that the difference between n = 3 and n = 8 refinements is hardly noticeable, as the enhanced results look very similar.
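For reference, the sketch below shows an iterative pixel-wise curve refinement of the kind popularized by Zero-DCE [8], on which our refinement process is based; the quadratic curve shown here is the Zero-DCE formulation and is only an assumption about the exact parameterization used in our model.

```python
import torch

def refine(image: torch.Tensor, alphas: list) -> torch.Tensor:
    """Apply n pixel-wise quadratic curve adjustments, in the spirit of Zero-DCE [8]:
    x <- x + alpha * x * (1 - x), once per refinement step.

    `image` is in [0, 1] with shape (B, C, H, W); `alphas` holds one per-pixel
    curve-parameter map per iteration (n = 3 in the default setup described above).
    """
    x = image
    for alpha in alphas:                # one curve map per refinement step
        x = x + alpha * x * (1.0 - x)   # keeps values in [0, 1] for alpha in [-1, 1]
    return x
```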

Fig. 6. Comparison between the results from the ablation studies: variations in the feature extractor of our model, variations of our model, and variations in the refinement process.

5 Conclusions

We have presented a model to enhance photographed document images based on a multi-scale aggregation procedure. The DDocE model first extracts enhancement features from the input image, which are then used in a pixel-wise enhancement and refinement process. We included in our evaluation protocol models not originally proposed for the document enhancement task, which we adapted to the full-reference enhancement problem. Through an extensive set of experiments, we show the role of each component of our model, indicating the importance of all of them together. Although there are still some open


challenging scenarios, such as the proper enhancement of colored background regions with text and of complex shadows, we show that our model generates promising results, maintaining the content and improving the visual quality of the input photographed image, while having the advantage of being lightweight.

Acknowledgments. This work was done in cooperation with HP Inc. R&D Brazil, using incentives of the Brazilian Informatics Law (Law No. 8.248 of 1991). The authors would like to thank Sebastien Tandel, Erasmo Isotton, Rafael Borges, and Ricardo Ribani.

References

1. Alaei, A., Conte, D., Blumenstein, M., Raveaux, R.: Document image quality assessment based on texture similarity index. In: IAPR Workshop on Document Analysis Systems (DAS), pp. 132–137 (2016)
2. Atoum, Y., Ye, M., Ren, L., Tai, Y., Liu, X.: Color-wise attention network for low-light image enhancement. In: CVPRW, pp. 2130–2139 (2020)
3. Bosse, S., Maniry, D., Müller, K., Wiegand, T., Samek, W.: Deep neural networks for no-reference and full-reference image quality assessment. IEEE TIP 27(1), 206–219 (2018)
4. Ding, K., Ma, K., Wang, S., Simoncelli, E.P.: Image quality assessment: unifying structure and texture similarity. CoRR abs/2004.07728 (2020)
5. Fan, J.: Enhancement of camera-captured document images with watershed segmentation. In: CBDAR, pp. 87–93 (2007)
6. Farrahi Moghaddam, R., Cheriet, M.: A variational approach to degraded document enhancement. IEEE TPAMI 32(8), 1347–1361 (2010)
7. Lv, F., Li, Y., Lu, F.: Attention guided low-light image enhancement with a large scale low-light simulation dataset. CoRR abs/1908.00682 (2020)
8. Guo, C., et al.: Zero-reference deep curve estimation for low-light image enhancement. In: CVPR, pp. 1780–1789 (2020)
9. Hamaguchi, R., Fujita, A., Nemoto, K., Imaizumi, T., Hikosaka, S.: Effective use of dilated convolutions for segmenting small object instances in remote sensing imagery. In: WACV, pp. 1442–1450 (2018)
10. He, S., Schomaker, L.: DeepOtsu: document enhancement and binarization using iterative deep learning. Pattern Recognit. 91, 379–390 (2019)
11. Hedjam, R., Cheriet, M.: Historical document image restoration using multispectral imaging system. Pattern Recognit. 46(8), 2297–2312 (2013)
12. Hussain, M., Wahab, A.W.A., Idris, Y.I.B., Ho, A.T., Jung, K.H.: Image steganography in spatial domain: a survey. Signal Process.: Image Commun. 65, 46–66 (2018)
13. HP Inc.: A workflow for document enhancement through content segmentation and multiple enhancements. Technical Disclosure Commons (2020). https://www.tdcommons.org/dpubs_series/3119
14. Islam, M.J., Xia, Y., Sattar, J.: Fast underwater image enhancement for improved visual perception. IEEE Robot. Autom. Lett. 5(2), 3227–3234 (2020)
15. Isola, P., Zhu, J.Y., Zhou, T., Efros, A.A.: Image-to-image translation with conditional adversarial networks. In: CVPR, pp. 5967–5976 (2017)
16. Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. In: ICLR (2015)


17. Lin, Y.H., Chen, W.C., Chuang, Y.Y.: BEDSR-Net: a deep shadow removal network from a single document image. In: CVPR, pp. 12905–12914 (2020)
18. Ma, K., Shu, Z., Bai, X., Wang, J., Samaras, D.: DocUNet: document image unwarping via a stacked U-Net. In: CVPR, pp. 4700–4709 (2018)
19. Moran, S., Marza, P., McDonagh, S., Parisot, S., Slabaugh, G.: DeepLPF: deep local parametric filters for image enhancement. In: CVPR, pp. 12826–12835 (2020)
20. Moran, S., McDonagh, S., Slabaugh, G.: CURL: neural curve layers for global image enhancement. CoRR abs/1911.13175 (2020)
21. Prashnani, E., Cai, H., Mostofi, Y., Sen, P.: PieAPP: perceptual image-error assessment through pairwise preference. In: CVPR, pp. 1808–1817 (2018)
22. Ronneberger, O., Fischer, P., Brox, T.: U-Net: convolutional networks for biomedical image segmentation. In: Navab, N., Hornegger, J., Wells, W.M., Frangi, A.F. (eds.) MICCAI 2015, Part III. LNCS, vol. 9351, pp. 234–241. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-24574-4_28
23. Tian, Y., Narasimhan, S.G.: Rectification and 3D reconstruction of curved document images. In: CVPR, pp. 377–384 (2011)
24. Wang, B., Chen, C.L.P.: An effective background estimation method for shadows removal of document images. In: ICIP, pp. 3611–3615 (2019)
25. Wang, B., Chen, C.L.P.: Local water-filling algorithm for shadow detection and removal of document images. Sensors 20(23), 6929 (2020)
26. Wang, R., Zhang, Q., Fu, C.W., Shen, X., Zheng, W.S., Jia, J.: Underexposed photo enhancement using deep illumination estimation. In: CVPR, pp. 6842–6850 (2019)
27. Wang, T.C., Liu, M.Y., Zhu, J.Y., Tao, A., Kautz, J., Catanzaro, B.: High-resolution image synthesis and semantic manipulation with conditional GANs. In: CVPR, pp. 8798–8807 (2018)
28. Wang, Z., Simoncelli, E.P., Bovik, A.C.: Multiscale structural similarity for image quality assessment. In: The Thirty-Seventh Asilomar Conference on Signals, Systems & Computers, vol. 2, pp. 1398–1402 (2003)
29. Yu, F., Koltun, V.: Multi-scale context aggregation by dilated convolutions. In: ICLR (2016)
30. Zamir, S.W., et al.: Learning enriched features for real image restoration and enhancement. CoRR abs/2003.06792 (2020)
31. Zhang, R., Isola, P., Efros, A.A., Shechtman, E., Wang, O.: The unreasonable effectiveness of deep features as a perceptual metric. In: CVPR, pp. 586–595 (2018)

Handwritten Chess Scoresheet Recognition Using a Convolutional BiLSTM Network

Owen Eicher(B), Denzel Farmer, Yiyan Li, and Nishatul Majid
Fort Lewis College, Durango, CO, USA
{oeicher,dfarmer,yli,nmajid}@fortlewis.edu

Abstract. Chess players in Over-the-Board (OTB) events record their moves by hand on scoresheets, which event organizers later digitize for an official record. This paper presents a framework for decoding these handwritten scoresheets automatically, using a convolutional BiLSTM neural network designed and trained specifically to handle chess moves. Our proposed network is pretrained with the IAM handwriting dataset [1] and later fine-tuned with our own Handwritten Chess Scoresheet (HCS) dataset [2,3]. We also developed two basic post-processing schemes that improve accuracy by cross-checking moves between the two scoresheets collected from the players with the white and black pieces. The autonomous post-processing involves no human input and achieves a Move Recognition Accuracy (MRA) of 90.1% on our test set. A second, semi-autonomous algorithm requests user input on particularly ambiguous cases; in our testing, this approach requests user input on 7% of the cases and increases MRA to 97.2%. Along with this recognition framework, we are also releasing the HCS dataset, which contains scoresheets collected from actual chess events, digitized using standard cellphone cameras, and tagged with associated ground truths. This is the first reported work on handwritten chess move recognition, and we believe it has the potential to revolutionize the scoresheet digitization process for the thousands of chess events that happen each day.

Keywords: Chess scoresheet recognition · Offline handwriting recognition · Convolutional BiLSTM network · Latin handwriting recognition · Handwritten chess dataset

1 Introduction

Chess is one of the most popular board games in the world, with approximately 605 million regular players (and more playing every day) [4]. During an Over-the-Board (OTB) chess event, both players in each match are generally required to record their own and their opponent's moves on a chess recording sheet, or scoresheet, by hand. These moves are usually recorded in Standard Algebraic Notation (SAN). Figure 1, captured from an OTB chess tournament, shows how players write moves during a live match, along with the scoresheets collected from


the players with white and black pieces. These scoresheets contain a complete move history along with player information and match results. They are useful to keep as official game records and to settle any disputes that may arise, hence their popularity in competitive settings. Moves are not recorded digitally in an attempt to prevent computer assisted cheating, which is a big issue in the world of chess. The scoresheet system, while secure, leaves event organizers with sometimes hundreds of scoresheets which must be manually digitized into Portable Game Notation (PGN). This process requires extensive time and labor, as organizers must not only type each move, but in the case of a conflict or illegible writing, play out the game on a chess board to infer moves. As an alternative to this legacy approach, we propose a deep neural network architecture to perform automatic conversion from pictures of these scoresheets to a PGN text file. In most practical cases, a cellphone camera is an extremely convenient way of collecting data, much more so than a flat-bed scanner. Keeping this in mind, we developed our detection system, from data acquisition to training and testing, to perform digitization on cellphone camera photos.

Fig. 1. Captured from an Over-the-Board (OTB) chess tournament (left), highlighting scoresheets from the players with white (middle) and black pieces (right).

In theory, chess moves could be recognized with a standard offline Latin handwriting recognition system, since chess universally uses Latin symbols. But this generic approach has its shortcomings. Chess scoresheets offer many distinguishing features that we can leverage to significantly increase move recognition accuracy, including:

– Two scoresheets for each match are available, since both players write on their own copy of all the moves played. These copies can be cross-referenced for validation.
– Chess moves use a much smaller character set than the entire Latin alphabet, allowing only 31 characters as opposed to the 100+ ASCII characters.
– Traditional post-processing techniques (spell checking, Natural Language Processing or NLP, etc.) do not apply; instead, chess moves can be ruled valid or invalid based on SAN syntax errors or move illegality dictated by the game rules.


– Handwritten moves are contained inside well-defined bounding boxes (Fig. 2) with some natural degree of shift, so the process of individual move segmentation is much simpler than in unconstrained handwriting. Furthermore, the vast majority of chess moves are 2–5 characters long with only a few rare exceptions, and none are longer than 7 characters.

These differences between the generic Latin script and handwritten chess moves make it worthwhile to approach this problem separately, while invoking the traditional wisdom of offline handwriting recognition. To accomplish this, we pretrain a deep neural network on an existing Latin handwriting dataset, IAM [1], and later fine-tune our model with redefined classes and network size adjustments. We also use post-processing algorithms to restrict output and improve accuracy, since valid chess moves allow only a limited set of letters, numbers, and symbols (Tables 1 and 2) in specific positions (e.g., moves cannot start with a number, cannot end with a character, etc.).

At this time of writing, no other work has been reported on a handwritten chess scoresheet recognition framework. Services for digitizing chess scoresheets such as Reine Chess [5] currently exist, but they require games to be recorded on their proprietary scoresheets with very specific formats. They cannot be applied to existing scoresheet formats, and would require tournaments to alter their structure, causing a variety of problems. Scoresheet-specific solutions also offer no way to retroactively digitize scoresheets and cannot be easily applied to other documents. The scope of this project is to create a general offline handwriting recognition framework, which in the future may be applicable to a variety of documents, with chess scoresheets featured as a single application. Furthermore, there was previously no publicly available chess scoresheet dataset that we could use for our network training. Therefore, we accumulated our own database, the Handwritten Chess Scoresheet (HCS) dataset [2,3]. This dataset contains scoresheets from actual chess events, digitized with a cellphone camera and tagged with associated ground truths. It is a publicly accessible dataset intended to encourage further research.

Handwriting recognition generally takes one of two approaches: either character spotting followed by isolated character recognition, or full-word recognition. The first approach works better for scripts with large alphabets and other complicated attributes like fusion of characters [6]. Our framework uses the latter approach, as it does not require character-level ground-truth location tagging, which makes data preparation much simpler. The script in our problem is also not complex enough to warrant character spotting. We use a Recurrent Neural Network (RNN), which can take sequence input and predict sequence outputs, to solve this problem. Specifically, we use the Bidirectional Long Short-Term Memory (BiLSTM) variant of the RNN, as basic RNNs are susceptible to vanishing gradients. In contrast, Long Short-Term Memory (LSTM) cells retain information seen earlier in the sequence more effectively, and bidirectional LSTMs (BiLSTMs) are able to make predictions using information in both the positive and negative time directions. This approach has proven to be extremely powerful in word recognition: using a variety of datasets for training, Breuel et al. achieved a 0.6% Character Error Rate (CER) for English text trained on several datasets [7].


Shkarupa et al. achieved 78% word-level accuracy in classifying unconstrained Latin handwriting on the KNMP Chronicon Boemorum and Stanford CCCC datasets [8]. Dutta et al. were able to achieve a 12.61% Word Error Rate (WER) when recognizing unconstrained handwriting on the IAM dataset [9]. Alternatively, gated recurrent unit (GRU) networks can function as lighter versions of LSTMs, since they behave similarly but use fewer trainable parameters [10]. Although they are faster to train, in most cases LSTMs outperform GRUs in terms of recognition accuracy, which is a key concern for a framework to be reliable in practical use. Different segmented-text recognition approaches for problems like bank check recognition, signature verification, tabular or form-based text identification, etc., can be relevant to our chess move recognition problem. One notable demonstration of recognizing segmented handwritten text was presented by Su et al., who used two RNN classifiers with Histogram of Oriented Gradients (HOG) and traditional geometric features to obtain 93.32% accuracy [11].

Table 1. Standard Algebraic Notation (SAN) for chess moves. Short and long castles (not included in the table) are denoted by 'O-O' and 'O-O-O' respectively.

This paper presents an end-to-end system for offline chess scoresheet recognition with a convolutional BiLSTM neural network, as an alternative to the existing inefficient method of manual digitization. Each handwritten move is extracted from its scoresheet during pre-processing and passed through the network to generate a prediction. An autonomous post-processing algorithm outputs a final prediction for the text box after cross-checking with the game's second scoresheet. Alternatively, we present a semi-autonomous post-processing algorithm that significantly increases accuracy to near human level by requesting user input for especially difficult handwriting samples and invalid entries. Even the semi-autonomous process can save hundreds of hours for event organizers and drastically reduce the human labor required for digitizing chess games.

2 Offline Chess Scoresheet Recognition

2.1 Preprocessing

A chess scoresheet has predefined boxes where players enter moves during a game. While there are different styles of scoresheets, each that we encountered ordered moves in essentially the same way: four columns, each representing 30 moves, with columns and rows separated by solid grid lines. To train our network to recognize these moves, we developed an algorithm that isolates each move based on those grid lines. First, we convert our RGB images to gray-scale, and then to binary images with an adaptive threshold using Otsu's method. Then we use two long, thin kernels (one horizontal and one vertical) with sizes relative to the input image dimensions, and morphological operations (erosion followed by dilation) with those kernels to generate an image containing only grid lines. With this simplified image, we use a border following algorithm [12] to generate a hierarchical tree of contours. Each contour is compressed into 4 points, storing only the corners of each quadrilateral. Any contour which is significantly larger or smaller than the size of a single move-box (again, calculated relative to the total image size) can be ignored. The final contours are sorted based on their positions relative to one another, and each is labelled by game, move, and player. Finally, we crop each move-box with a padding on the top and bottom of 15% and 25% respectively, since written moves overflow the bottom of their box more often and more severely than the top. This process is displayed in Fig. 2. We did not pad the box sides because chess moves are short and the players rarely need to cross the side boundaries. This method of pre-processing is nearly agnostic to scoresheet style and will work with any scoresheet style that uses four columns and solid grid lines.
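A minimal sketch of this extraction procedure with OpenCV is given below. The Otsu binarization, the erosion-then-dilation extraction of grid lines, the contour detection, and the 15%/25% vertical padding follow the description above, while the kernel lengths, the expected move-box area, and the filtering bounds are assumptions made for illustration.

```python
import cv2
import numpy as np

def extract_move_boxes(bgr_image: np.ndarray):
    """Sketch of move-box extraction: Otsu binarization, morphological grid-line
    extraction, contour detection, area filtering, and padded cropping."""
    gray = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2GRAY)
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

    h, w = binary.shape
    horiz_k = cv2.getStructuringElement(cv2.MORPH_RECT, (max(1, w // 20), 1))
    vert_k = cv2.getStructuringElement(cv2.MORPH_RECT, (1, max(1, h // 20)))
    # Erosion followed by dilation keeps only long horizontal / vertical strokes.
    horiz = cv2.dilate(cv2.erode(binary, horiz_k), horiz_k)
    vert = cv2.dilate(cv2.erode(binary, vert_k), vert_k)
    grid = cv2.bitwise_or(horiz, vert)

    contours, _ = cv2.findContours(grid, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
    expected = (w * h) / 240.0          # rough area of one move-box (assumption)
    boxes = []
    for c in contours:
        x, y, bw, bh = cv2.boundingRect(c)
        if 0.3 * expected < bw * bh < 3.0 * expected:   # drop far too small/large contours
            top, bottom = int(0.15 * bh), int(0.25 * bh)
            y0, y1 = max(0, y - top), min(h, y + bh + bottom)
            boxes.append((x, y0, x + bw, y1))
    # Sorting by position and labelling by game/move/player would follow here.
    return sorted(boxes, key=lambda b: (b[0], b[1]))
```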


Table 2. Example chess moves in SAN format. A disambiguation (Dis.) file or rank is used only when multiple pieces can move to the same destination square.

| Sample move | Move description | Piece | Dis. file | Dis. rank | Capture | Dest. square | Promote | Check/mate |
| Nf3 | Knight to f3 | N | | | | f3 | | |
| R1f4 | Rank-1 Rook to f4 | R | | 1 | | f4 | | |
| Bxe5 | Bishop takes e5 | B | | | x | e5 | | |
| Rdf8 | d-file Rook to f8 | R | d | | | f8 | | |
| Qh4e1 | h4 Queen to e1 | Q | h | 4 | | e1 | | |
| e8=Q | Pawn to e8, promotes to Queen | | | | | e8 | =Q | |
| e4 | Pawn to e4 | | | | | e4 | | |
| Qxf7# | Queen takes f7, checkmate | Q | | | x | f7 | | # |
| Bxc3+ | Bishop takes c3, check | B | | | x | c3 | | + |
| dxe5 | d-file Pawn takes e5 | | d | | x | e5 | | |
| O-O | Short castle | | | | | | | |
| O-O-O | Long castle | | | | | | | |


Fig. 2. Stages of move-box extraction: (a) grayscale image, (b) morphological operations generate a mask, (c) contour detection and sorting from the mask, and (d) sorted contours cropping with top and bottom padding of 15% and 25%.

2.2 BiLSTM Neural Network Architecture

We present a convolutional BiLSTM network for recognizing handwritten chess moves, as outlined in [13]. The network includes 10 layers, separated into 3 functional groups. The first 7 layers are convolutional and take in the gray-scale image of a single move-box, scaled to a resolution of 64 × 256 pixels. These layers act as feature extractors and convert the image data into a feature map. This feature map removes unnecessary information and creates a sequence input for the recurrent layers. The recurrent model itself consists of 2 BiLSTM layers, which take the feature map and generate a sequence output for a final dense network. The BiLSTM neural network is a modification of the traditional RNN which takes advantage of hidden memory units to better process sequence data. The final dense network converts the LSTM sequence output into a loss matrix. Afterwards, a Connectionist Temporal Classification (CTC) loss function [14] is applied which converts this matrix into a string. The CTC loss function and decoder better estimate the network loss than more general loss functions and allow simple repeat errors common to the LSTM layers to be corrected. This loss matrix encodes a string as a vector of one-hot encoded character vectors, so the final matrix has dimensions of the number of allowed characters by the maximum output length (plus a small padding on output length for blank characters).

First, we pretrained our network with the IAM dataset [1], one of the most used collections of unconstrained handwritten English sentences. Approximately 86,000 words of this dataset were used to pretrain our network so that it could learn the overall Latin script and later be modified for chess move recognition. Figure 3 shows several sample images from the IAM dataset. This process of transfer learning is a widely used technique which can achieve faster and better network training, and it is especially useful when working with a small dataset. While pretraining, the network uses 2 BiLSTM layers with 256 hidden units and a dense network which outputs a loss matrix of 81 × 31. These dimensions allow


pretraining on the IAM dataset which has a total of 80 allowed character classes (entire Latin alphabet, numerals and frequently used symbols) and a maximum word length of 27.

Fig. 3. Sample images from the IAM dataset used for pretraining.

Afterwards, we restructured this pretrained network for chess move recognition. The SAN format uses 30 allowed characters, with the maximum possible move length being 7. This move recording standard is the most widely used chess notation in the world, adopted by almost all international chess organizations, including the United States Chess Federation (USCF) and the Fédération Internationale des Échecs (FIDE), which oversees all world-class competitions. To match the SAN system, we reduced the BiLSTM layers from 256 hidden units to 64, and re-tuned the final dense layer to output a loss matrix of 30 × 12 instead of 81 × 31. Figure 4 shows the network with both the pretraining and fine-tuning structures. The maximum output length is slightly overestimated, allowing the network additional characters to compensate for CTC 'blank' characters and double spotting of a single character (which occurs frequently with LSTM layers). Since the SAN system does not include any valid chess moves with adjacent repetition, these extra characters are completely removed during the decoding phase.
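The sketch below illustrates such a convolutional BiLSTM recognizer with a CTC head, written in PyTorch for concreteness. The seven convolutional layers, the two BiLSTM layers with 64 hidden units, and the 30-character output follow the description above; the filter counts, the pooling placement, the resulting sequence length (which differs from the 12 time steps reported), and the framework itself are assumptions of this sketch rather than the exact network used.

```python
import torch
import torch.nn as nn

class ChessMoveRecognizer(nn.Module):
    """Illustrative convolutional BiLSTM + CTC model for 64x256 move-box images."""

    def __init__(self, num_classes: int = 30 + 1):  # +1 for the CTC blank symbol
        super().__init__()
        chans = [1, 32, 32, 64, 64, 128, 128, 128]
        convs = []
        for i in range(7):                      # seven convolutional feature extractors
            convs += [nn.Conv2d(chans[i], chans[i + 1], 3, padding=1),
                      nn.BatchNorm2d(chans[i + 1]), nn.ReLU(inplace=True)]
            if i < 3:                           # pool three times: 64x256 -> 8x32
                convs.append(nn.MaxPool2d(2))
        self.features = nn.Sequential(*convs)
        self.rnn = nn.LSTM(input_size=128 * 8, hidden_size=64, num_layers=2,
                           bidirectional=True, batch_first=True)
        self.head = nn.Linear(2 * 64, num_classes)

    def forward(self, x):                       # x: (B, 1, 64, 256)
        f = self.features(x)                    # (B, 128, 8, 32)
        f = f.permute(0, 3, 1, 2).flatten(2)    # (B, 32, 128*8): one step per image column
        out, _ = self.rnn(f)
        # (B, T=32, classes); transpose to (T, B, classes) before nn.CTCLoss
        return self.head(out).log_softmax(-1)
```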

Fig. 4. Layer graph of the BiLSTM network. Orange and green values show the dimensions for the pretraining network and our chess move recognition network respectively. Pooling, batch normalization, and reshaping layers not pictured. (Color figure online)


The convolutional BiLSTM has a variety of advantages over other handwriting recognition structures. The LSTM works with sequence input and output data, allowing for variable-length moves, which is crucial for a chess game (for example, both 'e4' and 'Qh4xe1' are valid algebraic notation moves). Unlike character-specific handwriting detection networks, an LSTM-based network does not require labeled bounding boxes around individual characters, which would involve a much more intensive ground-truth tagging process. The network's consideration of the entire input image, and not just a single character, also allows it to take advantage of context clues, which are often extremely important when decoding chess moves. If the network can gain a general understanding of the syntax of algebraic notation (e.g., that the first characters of a move are often in the form 'Piece' 'a-h' '1-8', e.g. Nf3), then it can make much more accurate inferences about the ground truth of an ambiguous character. The bidirectional structure of the BiLSTM layers makes inference easier, allowing the network to leverage context from the entire input as it reads.

2.3 Post-processing

Given the availability of two unique scoresheets for each game (one written by each player) and the fixed structure of the algebraic notation for every chess move, we developed two simple post-processing strategies.

Autonomous System: The autonomous system relies on cross-checking between the white and black scoresheets and spotting invalid notations to improve accuracy. Here, invalid notation refers to a syntactic error rather than an illegal move with respect to game rules. The system compares the two predictions for each move along with their confidence values. The confidence value is calculated as the exponential of the negative CTC loss between the raw network output and its decoded string. The system then makes the following logical decisions:

– If the predictions agree, the system accepts that prediction regardless of their confidence values.
– In case of a prediction conflict where both of the moves are valid, the prediction with the higher confidence score is accepted.
– In case of a prediction conflict where both of the moves are invalid, the prediction with the higher confidence score is accepted.
– In case of a prediction conflict where one of the moves is invalid and one is valid, the valid prediction is accepted regardless of confidence values.

For example, if the network prediction is 'NG4' (invalid) for one sheet and 'Ng4' (valid) for the other, the final prediction becomes 'Ng4', regardless of their confidence scores. Alternatively, if both moves were valid (e.g. 'Ng4' and 'Ne4') or both invalid (e.g. 'NG4' and 'NG8'), the system accepts the one with the higher confidence regardless of their validity. After all predictions of a game, both scoresheets are presented to the user with confident predictions labelled in blue and lower-confidence predictions highlighted in pink, as shown in Fig. 10.


We chose a confidence threshold of 95% in this autonomous approach. This allows the user to take a brief look at the network-generated results and spot obvious issues.

Semi-autonomous System: To increase accuracy even further, we can leverage the user on certain moves which are more likely to be wrong. This allows significant improvements to accuracy with minimal interruption to the user. As with autonomous post-processing, we take in two sets of predictions with their respective confidence values, one for the scoresheet written by each player. We classify each prediction in each set as confident if it meets a confidence threshold of 90%, and valid if it follows correct SAN. If a move is both confident and valid, we additionally label it as certain, as we are more certain that it accurately represents what the player wrote. Otherwise it is either invalid or low-confidence, so we label it uncertain. In the following cases, the algorithm makes a decision without interruption:

– If both predictions are the same and at least one is certain, then that prediction is accepted.
– If the predictions are different, but one is certain and the other is uncertain, then the certain prediction is accepted.

While most prediction pairs fall into one of these two categories, if neither applies then interrupts are handled as follows:

– Strong Conflict: If the predictions are different and both certain, then the user is prompted for a manual entry.
– Weak Conflict: If the predictions are different and both uncertain, then the user is prompted for a manual entry.
– Weak Match: If the predictions are the same but neither is certain, then the user is also prompted.

These rules are summarized in the sketch after this list.
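The sketch below is an illustration of these rules, not the authors' code. The confidence definition (exponential of the negative CTC loss) and the 90% certainty threshold follow the description above, while the SAN validity check is a simplified regular expression rather than a complete syntax validator.

```python
import math
import re

CONF_THRESHOLD = 0.90   # "confident" threshold of the semi-autonomous system

# Simplified SAN syntax check (castling, piece moves, captures, promotion, check/mate).
SAN_RE = re.compile(r"^(O-O(-O)?|[KQRBN]?[a-h]?[1-8]?x?[a-h][1-8](=[QRBN])?)[+#]?$")

def confidence(ctc_loss: float) -> float:
    """Confidence as described above: exponential of the negative CTC loss."""
    return math.exp(-ctc_loss)

def resolve(white_pred, black_pred):
    """Decide a move from the two players' (text, confidence) predictions.
    Returns the accepted move, or None when the user must be prompted."""
    def certain(pred):
        text, conf = pred
        return conf >= CONF_THRESHOLD and SAN_RE.match(text) is not None

    w_cert, b_cert = certain(white_pred), certain(black_pred)
    if white_pred[0] == black_pred[0] and (w_cert or b_cert):
        return white_pred[0]              # agreement, at least one certain prediction
    if w_cert != b_cert:                  # exactly one certain prediction
        return white_pred[0] if w_cert else black_pred[0]
    return None                           # strong conflict, weak conflict, or weak match

# Example: resolve(("Rd8", 0.97), ("Re8", 0.97)) -> None (strong conflict, prompt user)
```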

Fig. 5. Example cases where the user is interrupted for a manual entry in the semi-autonomous post-processing system: Strong Conflict (left), where both predictions are certain, and Weak Conflict (right), where both predictions are uncertain.

Figure 5 gives examples of strong and weak conflict cases. The strong conflict example involves two certain predictions, ‘Rd8’ and ‘Re8’, each valid and confident (with confidences of 97%). Strong conflicts occur most often when one of the players makes a mistake in writing. These cases are impossible to


correct without going through a move-by-move game analysis. The weak conflict example shows predictions of 'Nxf3+' and 'Nx3+', each with low confidence values (less than 70%). Both of these moves are considered uncertain: 'Nxf3+' is associated with a low confidence value, and 'Nx3+' is both low in confidence and an invalid SAN. Through our semi-autonomous system, this causes a user interrupt to ensure a successful PGN conversion. When the user is interrupted, we display both full scoresheets along with the network's predictions and ask the user to input the correct move with a user interface, as shown in Fig. 6.

Fig. 6. The semi-autonomous user interface, displaying both scoresheets from a game with the pertinent moves highlighted, along with a zoomed view of the move-boxes and the network’s guess for each.

2.4 Data Augmentation

Fig. 7. Original image (top left) and ten augmented images. Ground truth: Nb4

At this first release, our HCS dataset is small for training a deep neural network, and therefore it was necessary to increase the training set size by data augmentation. By applying meaningful image transformations to each move-box image,


it is possible to simulate a larger dataset, which almost always ensures a more generally trained network. We generated 10 augmented images for each sample, using one of the following transformations with a randomly chosen magnitude:

– rotation (between −10° and 10°),
– horizontal scaling (between −20% and 20%), and
– horizontal shear (between −15° and 15°).

Figure 7 shows a few samples of these augmented images. More complex data augmentation techniques are available and could potentially have been used to push the network performance even further, but we kept this process simple with just the basic techniques.
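A minimal sketch of this augmentation step with OpenCV follows; it assumes a grayscale move-box with a light background, and the interpolation and border handling are illustrative choices rather than the exact ones used.

```python
import random
import cv2
import numpy as np

def augment(move_box: np.ndarray) -> np.ndarray:
    """Apply one of the three transformations described above with a random
    magnitude in the stated ranges (assumes a grayscale, light-background image)."""
    h, w = move_box.shape[:2]
    choice = random.choice(["rotate", "hscale", "hshear"])
    if choice == "rotate":
        angle = random.uniform(-10, 10)
        m = cv2.getRotationMatrix2D((w / 2, h / 2), angle, 1.0)
    elif choice == "hscale":
        sx = 1.0 + random.uniform(-0.20, 0.20)
        m = np.float32([[sx, 0, (1 - sx) * w / 2], [0, 1, 0]])   # scale about the center
    else:
        shear = np.tan(np.radians(random.uniform(-15, 15)))
        m = np.float32([[1, shear, -shear * h / 2], [0, 1, 0]])  # centered horizontal shear
    return cv2.warpAffine(move_box, m, (w, h),
                          flags=cv2.INTER_LINEAR, borderValue=255)

# samples = [augment(img) for _ in range(10)]   # ten augmented copies per move-box
```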

3 The Handwritten Chess Scoresheet (HCS) Dataset

There was no publicly available chess scoresheet dataset at the time of this research, so it was necessary to develop our own, the HCS dataset [2,3], to train and test our network. This dataset consists of 158 total games spanning 215 pages of chess scoresheet images digitized using a standard cellphone camera in natural lighting conditions. These images are tightly cropped, and a standard corner-detection-based transformation is applied to eliminate perspective distortion. The headers and footers were also cropped out from each image in order to maintain player anonymity. The scoresheets were collected from actual chess events; they were not artificially prepared by volunteers. Despite posing challenges for training, identifying handwriting "in the wild" has more generality than identifying text from artificial, pristine images. The data is therefore very diverse, consisting of many different handwriting and scoresheet styles, varieties of ink colors, and natural occurrences of crossed-off, rewritten, and out-of-the-box samples, a few of which are shown in Fig. 9.

Each scoresheet in the dataset contains 120 text boxes (60 boxes for each player). However, many of these boxes are empty, since most chess matches last far fewer than 60 moves. Omitting the empty boxes, there are approximately 13,810 handwritten chess moves in the first release of our dataset. Some of these scoresheets came with a photocopied or carbon-copied version as well. These were included in the dataset, since such variations of images provide a natural form of data augmentation which can be useful for the training process. We manually created the ground truth version of each game and stored it in a text file; a sample is shown in Fig. 8. Both the images and the ground truth text files are stored with a naming convention given by:

[game#] [page#] [move#] [white/black].png/txt

The HCS dataset [2,3] is public and free to use for researchers who want to work on similar problems. It can be accessed at https://sites.google.com/view/chess-scoresheet-dataset or http://tc11.cvc.uab.es/datasets/HCS 1.


Fig. 8. Raw images (left) and ground truth labels (right) from the HCS dataset.

Fig. 9. Clear samples (left) and messy samples (right) from the HCS dataset.

4 Training and Results

Our BiLSTM network is trained with 2,345 unique box-label pairs from 35 unique pages, along with their photocopied or carbon-copied versions where available. This generated a training set of 4,706 image pairs. We perform 10:1 data augmentation to expand this to 47,060 image pairs. These data are separated into batches of 32 and trained with a learning rate of 0.0005 for 10 epochs. For testing, we used 7 games, or 14 scoresheets, which translates into 828 testing image pairs. This test set was composed of data from writers/players unseen by the network during training. Not all of the games in the HCS dataset have unique (not photocopied) white and black player copies, which is fine for network training but not for testing. Our post-processing pipeline uses a comparison framework over two unique copies, so the test set was carefully chosen from the games where both the white and black player scoresheets are available.

Table 3. Character Recognition Accuracy (CRA), Move Recognition Accuracy (MRA), and interruption rate for various levels of post-processing.

| Post-processing | CRA | MRA | Interruption rate |
| None | 92.1% | 82.5% | None |
| Autonomous | 95.5% | 90.1% | None |
| Semi-autonomous | 98.1% | 97.2% | 7.00% |

Our results, outlined in Table 3, demonstrate not just the effectiveness of our network, with a raw move recognition accuracy of 82.5%, but also the


effectiveness of our post-processing schemes. The autonomous post-processing system increases this accuracy by 9.2%, translating to 6 prediction errors for an average 30-move game. Figure 10 shows an example of the output prediction for a partial game, with each move labelled and low-confidence moves highlighted. This presentation allows the user to quickly spot and check any prediction errors.
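The paper does not spell out the exact metric definitions, so the sketch below shows one plausible reading: MRA as exact-match accuracy over moves, and CRA as one minus the character error rate computed from the Levenshtein distance.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]

def mra(predictions, ground_truths):
    """Move Recognition Accuracy: fraction of exactly matching moves."""
    hits = sum(p == g for p, g in zip(predictions, ground_truths))
    return hits / len(ground_truths)

def cra(predictions, ground_truths):
    """Character Recognition Accuracy approximated as 1 - character error rate."""
    errors = sum(edit_distance(p, g) for p, g in zip(predictions, ground_truths))
    total = sum(len(g) for g in ground_truths)
    return 1.0 - errors / total
```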

Fig. 10. The network output of a scoresheet portion as presented to the user. Low confidence predictions (< 95%) are presented in pink for further inspection.

Leveraging the user with the semi-autonomous system improves the accuracy by a further 7.9% (17.8% over the raw prediction), while only requesting user input on 7% of moves. This means that for a 30-move game a user will be prompted roughly 4 times on average, and these minor interrupts reduce errors to only a couple of moves. The accuracy of the semi-autonomous algorithm is much higher, with a move error rate of 2.8%. This is comparable to, or sometimes even better than, the human error rate that occurs naturally with purely manual entry. Many of these errors are caused when players accidentally write different moves, and one of those moves is above the confidence threshold while the other is not. In that case the user is not prompted for the move, and whichever prediction has the higher confidence is selected, which may or may not be correct, but is currently always counted as an error. This situation is common, so our true accuracy may be slightly higher than the measured accuracy of 97.2%. Since our current ground truth values are based on the handwriting itself, and not on what was played in the game, we may output a prediction which is correct with respect to the game, but not with respect to the handwriting of one scoresheet.

Errors from both the autonomous and semi-autonomous systems are primarily caused by incorrect predictions, when the network does not accurately recognize an individual move. These errors generally fall into one of three categories: rare move structures, illegible writing, or information lost due to cropping. Rare or less frequent moves, such as pawn promotions, moves involving disambiguation files or ranks, or even long castles, introduce training data imbalance. While easily recognizable to a human, our small dataset includes very few (or sometimes no) examples of these move structures, so the network makes a simpler and shorter prediction. A larger dataset, or selectively introducing such less frequent moves, could counteract this data imbalance and therefore reduce this error case. Messy,


illegible writing, which includes crossed-out or cramped letters (examples shown in Fig. 9), often causes single-move errors; however, most of these are solved by our post-processing algorithms. Both the autonomous and semi-autonomous algorithms cross-check both players' scoresheets, one of which is likely to be more legibly written. Finally, many players do not write a move entirely within its move-box, and this missing information also causes prediction errors.

5 Conclusion

We live in an era where deep learning can be applied to a vast number of complex problems, and yet surprisingly little work has been done on handwritten chess move recognition. A handwriting recognition framework is particularly useful for chess, since players are bound to record moves by hand in order to prevent computer-aided cheating. Chess is one of the most popular board games today, and its popularity is growing faster than ever. Here, we present an attempt to save the thousands of man-hours currently being spent by event organizers every day to transform chess scoresheets into standard PGN files. While saving a substantial amount of manpower, our framework makes the record-keeping process extremely convenient. Chess scoresheet digitization has a variety of uses, from post-game engine analysis to efficient game publishing, and in most professional and serious events it is mandatory to keep these records. Therefore, we firmly believe our approach can have a big influence and potentially revolutionize the chess event management system practiced today.

Starting from an extremely limited number of data samples, we were able to achieve an accuracy of above 90% with no human interaction needed at all. This autonomous approach presents the resulting scoresheet to the user with highlights for the low-confidence predictions. Our semi-autonomous approach increased accuracy to near 98%, with an average of only 4 interrupts for manual entry required per game. Although this performance is comparable to the actual human error level, even this 2% error rate can cause issues in practical cases. This small gap to a perfect result could be further reduced by increasing the training data size and including rare-occurrence moves like pawn promotions or moves with disambiguation files/ranks. Along with that, a sophisticated post-processing system could be implemented which not only identifies SAN syntax errors, but also detects illegal moves by assessing the current board position from the move history. With tools from graph theory, it could be possible to determine the fewest number of moves requiring correction in order to have a top-to-bottom valid game; this could take the system accuracy very close to 100%. Another possibility is to utilize the most played moves in a certain position during post-processing, which could be especially useful in common chess openings. Thanks to the rapid growth of online chess, many open-source databases are available to obtain this information. In order to encourage further research and development in this area, we are releasing the HCS dataset to the public.


Acknowledgements. We want to express our sincerest gratitude to Mr. George Lundy, FIDE National Arbiter (US) at Chandra Alexis Chess Club, for his enthusiasm and support in this research.

References

1. Marti, U.-V., Bunke, H.: The IAM-database: an English sentence database for offline handwriting recognition. Int. J. Doc. Anal. Recogn. 5(1), 39–46 (2002). https://doi.org/10.1007/s100320200071
2. Eicher, O., Farmer, D., Li, Y., Majid, N.: Handwritten Chess Scoresheet dataset (2021). https://sites.google.com/view/chess-scoresheet-dataset
3. Eicher, O., Farmer, D., Li, Y., Majid, N.: Handwritten Chess Scoresheet Dataset (HCS) (2021). http://tc11.cvc.uab.es/datasets/HCS 1
4. Chess.com: How popular is chess? (2020). https://www.chess.com/news/view/how-popular-is-chess-8306
5. Sudharsan, R., Fung, A.: Reine chess (2020). https://www.reinechess.com/
6. Majid, N., Smith, E.H.B.: Segmentation-free Bangla offline handwriting recognition using sequential detection of characters and diacritics with a faster R-CNN. In: 2019 International Conference on Document Analysis and Recognition (ICDAR), pp. 228–233. IEEE (2019)
7. Breuel, T.M., Ul-Hasan, A., Al-Azawi, M.A., Shafait, F.: High-performance OCR for printed English and Fraktur using LSTM networks. In: 2013 12th International Conference on Document Analysis and Recognition, pp. 683–687 (2013)
8. Shkarupa, Y., Mencis, R., Sabatelli, M.: Offline handwriting recognition using LSTM recurrent neural networks. In: The 28th Benelux Conference on Artificial Intelligence (2016)
9. Dutta, K., Krishnan, P., Mathew, M., Jawahar, C.: Improving CNN-RNN hybrid networks for handwriting recognition. In: 2018 16th International Conference on Frontiers in Handwriting Recognition (ICFHR), pp. 80–85 (2018)
10. Cho, K., et al.: Learning phrase representations using RNN encoder-decoder for statistical machine translation. arXiv preprint arXiv:1406.1078 (2014)
11. Su, B., Zhang, X., Lu, S., Tan, C.L.: Segmented handwritten text recognition with recurrent neural network classifiers. In: 2015 13th International Conference on Document Analysis and Recognition (ICDAR), pp. 386–390. IEEE (2015)
12. Suzuki, S.: Topological structural analysis of digitized binary images by border following. Comput. Vis. Graph. Image Process. 30(1), 32–46 (1985)
13. Shi, B., Bai, X., Yao, C.: An end-to-end trainable neural network for image-based sequence recognition and its application to scene text recognition. IEEE Trans. Pattern Anal. Mach. Intell. 39(11), 2298–2304 (2016)
14. Graves, A., Schmidhuber, J.: Offline handwriting recognition with multidimensional recurrent neural networks. In: Advances in Neural Information Processing Systems, Vancouver, Canada, pp. 545–552 (2009)

ICDAR 2021 Workshop on Arabic and Derived Script Analysis and Recognition (ASAR)

ASAR 2021 Preface

We are glad to welcome you to the proceedings of the 4th International Workshop on Arabic and Derived Script Analysis and Recognition (ASAR 2021), which was held on September 16, 2021, in Lausanne, Switzerland, in conjunction with ICDAR (the 16th International Conference on Document Analysis and Recognition). The workshop was organized by the REGIM Lab (University of Sfax, Tunisia).

The ASAR workshop provides an excellent opportunity for researchers and practitioners at all levels of experience to meet colleagues and to share new ideas and knowledge about Arabic and derived script document analysis and recognition methods. The workshop enjoys strong participation from researchers in both industry and academia.

In this 4th edition of ASAR we received 14 submissions, coming from authors in 10 different countries. Each submission was reviewed by three expert reviewers. The Program Committee of the workshop comprised 21 members, who generated a total of 42 reviews. We would like to take this opportunity to thank the Program Committee members and sub-reviewers for their meticulous reviewing efforts! Taking into account the recommendations of the Program Committee members, we selected nine papers for presentation in the workshop, resulting in an acceptance rate of 64%.

This edition of ASAR included a keynote speech by Sourour Njah and Houcine Boubaker from the University of Sfax, Tunisia, on the "Beta Elliptic Model for On-line Handwriting Generation and Analysis". We would also like to thank the organizations that have supported us, especially the ICDAR organizers and the REGIM Lab, University of Sfax, Tunisia. We hope that you enjoyed the workshop!

Adel M. Alimi
Bidyut Baran Chaudhuri
Fadoua Drira
Tarek M. Hamdani
Amir Hussain
Imran Razzak

Organization

General Chairs

Adel M. Alimi, University of Sfax, Tunisia
Bidyut Baran Chaudhuri, Indian Statistical Institute, Kolkata, India
Fadoua Drira, University of Sfax, Tunisia
Tarek M. Hamdani, University of Monastir, Tunisia
Amir Hussain, Edinburgh Napier University, UK
Imran Razzak, Deakin University, Australia

Program Committee

Alireza Alaei, Southern Cross University, Australia
Mohamed Ben Halima, University of Sfax, Tunisia
Syed Saqib Bukhari, German Research Center for Artificial Intelligence (DFKI), Germany
Haikal El Abed, German International Cooperation (GIZ) GmbH, Germany
Jihad El-Sana, Ben-Gurion University of the Negev, Israel
Najoua Essoukri Ben Amara, University of Sousse, Tunisia
Jaafar Alghazo, Prince Mohammad Bin Fahd University, Saudi Arabia
Afef Kacem Echi, Laboratoire LATICE, Tunisia
Slim Kanoun, University of Sfax, Tunisia
Driss Mammass, Ibn Zohr University, Morocco
Ikram Moalla, University of Sfax, Tunisia
Volker Märgner, Technische Universität Braunschweig, Germany
Mark Pickering, University of New South Wales, Australia
Samia Maddouri Snoussi, University of Jeddah, Saudi Arabia
Daniel Wilson-Nunn, The Alan Turing Institute, UK

RASAM – A Dataset for the Recognition and Analysis of Scripts in Arabic Maghrebi

Chahan Vidal-Gorène¹,⁴(B), Noëmie Lucas², Clément Salah³,⁵, Aliénor Decours-Perez⁴, and Boris Dupin⁴

¹ École Nationale des Chartes – Université Paris Sciences & Lettres, 65 rue Richelieu, 75003 Paris, France. [email protected]
² GIS Moyen-Orient et mondes musulmans – UMS 2000 (CNRS/EHESS), 96 boulevard Raspail, 75006 Paris, France. [email protected]
³ Sorbonne-Université, Faculté des Lettres, 21 rue de l'école de médecine, 75006 Paris, France
⁴ Calfa, MIE Bastille, 50 rue des Tournelles, 75003 Paris, France. {alienor.decours,boris.dupin}@calfa.fr
⁵ Institut d'Histoire et Anthropologie des Religions, Faculté de Théologie et Sciences des Religions, Université de Lausanne, 1015 Lausanne, Switzerland. [email protected]

Abstract. The Arabic scripts raise numerous issues in text recognition and layout analysis. To overcome these, several datasets and methods have been proposed in recent years. Although the latter are focused on common scripts and layouts, many Arabic writings and written traditions remain under-resourced. We therefore propose a new dataset comprising 300 images representative of the handwritten production of the Arabic Maghrebi scripts. This dataset is the achievement of a collaborative work undertaken in the first quarter of 2021, and it offers several levels of annotation and transcription. The article intends to shed light on the specificities of these writings and manuscripts, as well as highlight the challenges of their recognition. The collaborative tools used for the creation of the dataset are assessed, and the dataset itself is evaluated with state-of-the-art methods in layout analysis. The word-based text recognition method used and experimented on for these writings achieves a CER of 4.8% on average. The pipeline described constitutes an experience feedback for the quick creation of data and the training of effective HTR systems for Arabic scripts and non-Latin scripts in general.

Keywords: Arabic Maghrebi scripts · Dataset · Manuscripts · Layout analysis · HTR · Crowdsourcing

This work was carried out with the financial support of the French Ministry of Higher Education, Research and Innovation. It is in line with the scientific focus on digital humanities defined by the Research Consortium Middle-East and Muslim Worlds (GIS MOMM). We would also like to thank all the transcribers and people who took part in the hackathon and ensured its successful completion.

1 Introduction

The automatic analysis of handwritten documents has become a classic preliminary step for numerous digital humanities projects that benefit from the mass digitization policy of heritage institutions. Following the competitions organized in recent years, notably at ICFHR and ICDAR, several robust architectures for the layout analysis of historical documents have been developed [8], whose application to non-Latin script documents provides equivalent results [10,14]. HTR architectures specialized for a type of document or for a single hand also achieve very high recognition scores, as do the proven pipelines composed of character-level HTR and post-processing [7], even though the literature is mostly based on Latin scripts. Non-Latin, cursive, and RTL writings, like the Arabic scripts, remain an open problem in digital humanities, with a wide variety of approaches [11]. Although specialized databases have emerged in recent years (see infra Sect. 2.1), they are often focused on the layout [6,10] and on common documents and writings, leaving out numerous under-resourced written traditions.

We present a new dataset for the analysis and the recognition of handwritten Arabic documents, the first dataset focused on the writings called "Maghrebi scripts", also known as "Western scripts" or "round scripts". This term encompasses a variety of styles that have common characteristics and are poorly represented in digital humanities. These scripts, dating back to the 10th century, have nevertheless been widely used in the Islamic West (al-Andalus and North Africa), as well as in sub-Saharan Africa, until the 20th century.¹ They have numerous specificities that differentiate them from the classical problems met in Arabic handwritten character recognition. The rounded shape of these scripts may be explained by the writing tool used, that is, qalams made from large reed straws cut in half lengthwise, with a pointed nib and not a beveled one as in the Islamic East [3]. Therefore, the Maghrebi scripts constitute a family of rounded scripts that share a number of characteristics, first of all very rounded loops, which can be seen in the manuscripts of the present dataset (see infra Sect. 2.3). The main characteristics² of the scripts are displayed in Table 1.

The dataset, resulting from a collaborative hackathon held from January to April 2021, intends to cover a large spectrum of the handwritten production in Maghrebi scripts. The choice of a multilevel annotation (semantic and baseline annotation, word-level and character-level transcription) aims to offer to the scientific community a comprehensive dataset dedicated to the creation and evaluation of complete HTR pipelines, from layout analysis to text recognition, for this written tradition. After a short presentation of the related work on datasets, we propose a complete description of the manuscripts, the annotations, and the editorial choices made for the transcription. The creation of the dataset

1. The history and the origins of these scripts have been an important open scientific debate [3,4]. The most recent works, in particular those of U. Bongianino, have foregrounded the different itineraries (from books to Qurans, from al-Andalus to the Maghreb) followed by these writings between the 10th and the 13th century [4].
2. Characteristics are taken from U. Bongianino [4]; theoretical realizations are taken from the article of N. Van den Boogert upon which U. Bongianino draws [13].


Table 1. Some characteristics and realizations of Maghrebi scripts

2 Dataset and Arabic Maghrebi Manuscripts of the BULAC

2.1 Existing Resources for Arabic Scripts

Following the competitions organized in recent years at ICFHR and ICDAR, several datasets for the analysis of Arabic written documents have emerged. Such is the case of RASM2018 [6], focused on the Arabic scientific manuscripts of the Qatar Digital Library, which is used to evaluate the layout analysis of documents in Arabic scripts. The 100 images of the dataset are annotated at different levels: regions, polygons, lines and text. The Arabic scripts, however, present unique challenges for text-region detection and baseline detection. Therefore, in order to include these specificities, RASM2018 has been considerably expanded by BADAM [10], a dataset focused on the Arabic scripts comprising 400 images annotated at the text-region and baseline level. These datasets cover a wide-ranging production of texts in non-Maghrebi Arabic scripts to enable the training of dedicated models [10]. More generally, there are other smaller or more specialized datasets, like HADARA80P [12] and VML-HD [9], annotated at the region and word level, KERTAS [2], dedicated to manuscript dating, or WAHD [1], dedicated to writer identification. Aside from the targeted tasks and the languages concerned, these datasets shed light on the variety of existing perspectives on handwriting, either inspired by Latin languages or by a word-based approach. An FCN followed by post-processing for baseline extraction gives robust results even on the most complex layouts [10,14]. Manuscripts in Arabic Maghrebi scripts are largely excluded from these datasets.

2.2 Dataset Composition

RASAM is available under an open license (see footnote 3). It comprises 300 annotated images extracted from three manuscripts selected among the collections of the Bibliothèque universitaire des langues et civilisations (BULAC) (see footnote 4). The images of the dataset are in JPEG format and have varying resolutions, from 96 DPI to 400 DPI. Experiments are carried out on the hackathon results (v1.0, 297 images). The dataset was expanded in June 2021 (v1.1, 300 images, including minor corrections). Two manuscripts of the dataset have been chosen among the 150 Arabic manuscripts available online (MS.ARA.1977 and MS.ARA.609); the third and last manuscript (MS.ARA.417) of the corpus has been digitized at our request. The variety of topics, the representative type of the Maghrebi script, as well as the diversity of layouts have informed our choice of manuscripts to annotate and transcribe for this dataset.

3. https://github.com/calfa-co/rasam-dataset
4. The BULAC holds the second largest collection of Arabic manuscripts in France (2,458 identified documentary units). The BULAC collections contain a substantial proportion of manuscripts copied in Maghrebi script. 150 Arabic manuscripts are available online on the website of the BINA digital library.


The aim is to obtain versatile analysis models for this written tradition. Two manuscripts belong to the historical genre, whereas the third deals with inheritance law (fiqh al-farāʾiḍ). The small number of manuscripts is justified by the need to quickly achieve usable HTR models. The pages are not sequential and have been randomly selected, in order to cover all the variations of a single copyist and the different layouts within a single manuscript.

2.3 Selected Manuscripts

MS.ARA.1977: The manuscript MS.ARA.1977 (see footnote 5) is a compilation of 249 pages, most of which (p. 1–201) is a historical treatise written by the Andalusian historian Abū ʿAbd Allāh Muḥammad b. ʿAlī b. Muḥammad al-Šuṭaybī (d. 963/1556), a disciple of the great Maliki jurist Aḥmad Zarrūq (d. 899/1493). The second, 38-page-long text (p. 205–243) deals with the customs and practices relating to the prophet Muḥammad; as for the third text (p. 247–249), it is a recollection of the words of the scholar al-Ḥasan b. Masʿūd al-Yūsī (d. 1102/1691), a native of the north-west of the Moroccan Middle Atlas. It deals with the mission that the prophet Muḥammad would have entrusted to the Berber tribes to conquer the Maghreb. The annotation and transcription have been carried out on the first and main part of the compilation, which was copied by Muḥammad b. Mubārak al-Barāšī around 1259/1843 (see footnote 6). Compiled on paper (305 × 210 mm), the manuscript pages contain 31 lines, with the exception of the last three, which contain 27 each, and of pages 67–68, 202–204 and 245–246, which are left blank. While the main text is written in black ink, here and there the copyist has used red ink (e.g. for section titles) and green ink (e.g. to indicate poetry verses). This manuscript features a series of marginalia: besides the catchwords at the bottom of the pages, numerous corrections and notes are displayed along the text. The characteristics of the ink and of the handwriting lead us to assume that they were made by the copyist himself. The same can be assumed for the manuscript MS.ARA.609 (see infra).

MS.ARA.609: The manuscript MS.ARA.609 (see footnote 7) is a treatise in verse on arithmetic, inheritances and wills. Written by its author (d. 953/1546 or 983/1575) around 946/1540, the poem and its commentary are about the science of successions (ʿilm al-farāʾiḍ) and the arithmetic knowledge it requires. The author, one of the great names of the Maliki school of the 10th/16th century, was born in 919/1513 near Biskra.

5. MS.ARA.1977, Collections patrimoniales numérisées de la BULAC.
6. Muḥammad b. Mubārak al-Barāšī is also the copyist of the second text. There is no such mention for the third text: the paleographical characteristics of the pages lead us to assume that it is the work of another hand.
7. MS.ARA.609, Collections patrimoniales numérisées de la BULAC.


He is the author of numerous didactic poems, often accompanied by their commentaries, in several scholarly fields (logic, arithmetic, rhetoric, law). Compiled on paper (210 × 175 mm), the copy of the manuscript was completed in 1146/1734 by Abū l-Qāsim b. Muḥammad b. Abū l-Qāsim al-Duraydī. The manuscript has 202 pages – 100 written folios, folios 1 and 2 having been left blank – and each page contains 25 lines. The main text is written in black ink, but the copyist has used red ink on several occasions (e.g. tables, numbers or poetry verses). This manuscript contains numerous numbers, fractions and tables throughout the text. The numbers are written in Indo-Arabic numerals. As in MS.ARA.1977, catchwords, additions, corrections and glosses are displayed in the lateral, top and bottom margins.

MS.ARA.417: The manuscript (see footnote 8), dated 1292/1875, was copied from the manuscript no. 1061 of the National Library of Algeria. It narrates the history of the Beys of Oran in the 13th century and was written by the secretary of Ḥasān Bey (1817–1831). It distinguishes itself from the two previous manuscripts by its length and its layout: it consists of 48 folios with pages of 12 lines, each of them with fewer than 10 words per line. The manuscript, very well written, is in black ink, even though the copyist uses red ink sporadically (e.g. for chapter headings or some separators). In the lateral margins, another hand, which looks identical to that of the cataloguing note on the first page, has added the names of the beys in Arabic and some dates. The same hand seems to have added some vocalization marks on some folios and some corrections in blue ink.

Fig. 1. MS.ARA.1977 (p. 42), MS.ARA.609 (p. 124) and MS.ARA.417 (f. 12v)

3 Ground-Truth Content and Creation

Each image is associated with a PAGE-XML file describing the entire ground truth and the associated metadata. The annotations have been produced automatically on the Calfa Vision platform, then manually checked over the course of a collaborative hackathon.

8. See the bibliographic record on CALAMES.


The overall set of constitutive parts of a manuscript page is annotated. We offer, for each image: (i) a semantic annotation of the regions, (ii) an annotation of the baselines (polylines), (iii) a polygon framing every line associated with a baseline, and (iv) the transcription. In figures, the dataset comprises 300 images, 676 text-regions, 7,540 lines and 483,725 characters (v1.1).
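Since each image comes with such a PAGE-XML file, aggregate figures of this kind can be reproduced by walking the XML tree. The sketch below is a minimal example, not the Calfa Vision tooling; it only assumes the standard PAGE elements (TextRegion, TextLine, TextEquiv/Unicode), and the folder name is hypothetical.

```python
import glob
import xml.etree.ElementTree as ET

def local(tag):
    # Strip the XML namespace: '{...}TextLine' -> 'TextLine'.
    return tag.rsplit('}', 1)[-1]

def count_ground_truth(folder):
    n_regions = n_lines = n_chars = 0
    for path in glob.glob(f"{folder}/*.xml"):
        root = ET.parse(path).getroot()
        for elem in root.iter():
            name = local(elem.tag)
            if name == "TextRegion":
                n_regions += 1
            elif name == "TextLine":
                n_lines += 1
                # Count characters from the line-level transcription only,
                # to avoid double counting any region-level TextEquiv.
                for unicode_el in elem.iter():
                    if local(unicode_el.tag) == "Unicode" and unicode_el.text:
                        n_chars += len(unicode_el.text)
                        break
    return n_regions, n_lines, n_chars

# print(count_ground_truth("rasam-dataset/data"))  # hypothetical folder layout
```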

3.1 Structure Description

Text-Region: The layout analysis consisted of identifying all the text-regions. We have defined 5 classes: text (300), marginalia (171), catchword (102), table (53) and numbering (50). The classes are not uniformly distributed but constitute clearly identifiable items. To prevent an overwhelming variety and ambiguity of classes, the 5 defined classes can incorporate some regions that would traditionally be separated. Thus, the titles, often written in color, are not separated but included in the text class. Likewise, all content in the margins (everything outside of the main text-region) is encompassed in the marginalia class, with the exception of the catchwords. Table and numbering respectively refer to the tables located inside and outside the main text-region and to all fractions within the text (see Fig. 2).

Fig. 2. Semantic classification of text-regions and baselines

Baseline: We adopted annotation by baseline, suited to the Arabic scripts and interoperable with other state-of-the-art datasets. Each segment of a sentence has its own baseline. In the event of overlap, as displayed in Fig. 2 for the marginal note, there is no continuity in the baseline. Besides, the outline of the latter follows the text line actually present in the manuscript and not a theoretical line that would link two segments of a single sentence (for instance, in the case of a line break for versification or due to a table – see Fig. 2). The reading order can be managed afterwards in post-processing. The manuscripts present numerous curved lines (see Fig. 2), in particular at the end of sentences and in the marginal notes. In this case, the baseline follows the same scheme as BADAM [10], with a theoretical rotation point matching the line curvature. Markers of verses or other signs for aesthetic purposes have not been taken into consideration (see Fig. 3, no. 3). Lastly, in the event of characters composed of strokes extending beyond the body of the character (e.g. the letter nūn in Fig. 3, no. 1) and located at the end of a sentence, the baseline has been extended to cover the entire shape of the character, even in the absence of the theoretical writing line.


Polygons: Each line of text is extracted with a surrounding polygon, drawn with an adaptive seam carve implemented on the annotation platform [14]. Polygons have been manually proofread to integrate all the constitutive strokes of a given character, including ascenders and descenders, as well as the associated diacritics (see Fig. 3, no. 4). There remain overlaps between the polygons of lines (see Fig. 3, no. 2), but the HTR results have demonstrated that these overlaps have very little impact on HTR predictions (see infra Sect. 4.2).

Fig. 3. Special cases for baselines and polygons annotation (MS.ARA.609 and MS.ARA.1977)

3.2 Specifications for Transcription

A common framework for the text input was defined to preserve the uniformity of the transcriptions across participants. The aim is to achieve an HTR model producing predictions as close as possible to the original text, and thus to offer a complete transcription and allow for a wide range of editorial choices. The dataset comprises 54 classes, detailed in Table 2.

Table 2. Letter distribution in the RASAM dataset (v1.1)

We have notably produced transcriptions that restore the spaces in Arabic, even when there are no visually discernible spaces in the manuscript. In view of the great variety of character morphologies in the Maghrebi manuscripts, we favored the word-based approach over the character-based approach, where word separation is managed in post-processing [5].


Table 3. Spelling conventions (examples from MS.ARA.609 and MS.ARA.1977)

In order to remain as close as possible to the text, the transcription follows the spelling habits of the copyist, even when they depart from the norm of standard Arabic. Hence, the frequent confusions, in particular in MS.ARA.609, between ḍād and ẓāʾ, and between ṣād and ṭāʾ, have been retained. For instance, a ṭāʾ spelled as a tāʾ marbūṭa, or the other way around, was kept as written. Furthermore, the demonstratives hāḏā, hāḏihi, and in some instances ḏālika, which are spelled in modern Arabic with a defective form or with a dagger alif, are often spelled here with their archaic form with a medial alif: we maintain the spelling of this alif in our transcriptions (see Table 3).

Table 4. Some realizations of the hamza in MS.ARA.609 and MS.ARA.1977

The punctuation, when present, has not been transcribed. In the event of vocalization signs or šadda, the participants were free to transcribe them or not; the priority was on the characters and words.


When a character could not be understood, it was to be replaced by the sign #. When it was impossible to read because of an alteration of the manuscript, nothing was to be transcribed (see Table 3). Two particular cases caught the attention of the transcribers and were much debated: the hamza and its different forms, and the diacritic signs of the yāʾ. The main rule adopted for the hamza was the following: if there is no hamza, do not add one, and transcribe the hamza as it is written in the other cases (see Table 4). However, in MS.ARA.609 and MS.ARA.417, the copyists used a singular but consistent hamza shape. In MS.ARA.609, the hamza was written as a full sukūn. In these cases, we have considered that this was the way the copyist realized the letter (see Table 5).

Table 5. Specific realizations of the hamza in MS.ARA.609

In the cases where the hamza was written below the line, it was drawn on the alif maqṣūra. The issue of the diacritics of the yāʾ was raised for the MS.ARA.1977 manuscript in particular, in which the copyist only wrote these diacritics in rare instances. It was decided not to correct and to transcribe as close as possible to the text, with only three exceptions: for the preposition and the relative pronouns which, in the Maghrebi script, sometimes form a kind of glyph (see Table 6).

Table 6. Exceptions regarding yāʾ and its diacritics

4 Evaluation of the Crowdsourcing Campaign and HTR Models

The annotations have been produced with the Calfa Vision platform (see footnote 9) [14], which incorporates – besides online collaborative work on an image in real time – models for layout analysis and HTR prediction. These models are automatically assessed and fine-tuned according to the corrections made by the contributors to the project, in order to speed up the checking task for the next images.

9. https://vision.calfa.fr

4.1 General Considerations and Implementation Protocol

The crowdsourcing campaign gathered 14 participants from January to April 2021, divided into three projects. The contributors were paired: one annotated and the other checked. Three main tasks were set to annotate each image.
1. Annotation and check of the layout analysis. Text-region (polygon) and baseline (polyline) detection is automatically carried out beforehand. A first team is entrusted with the verification of the predictions and, when necessary, with the correction of the polygon and polyline shapes. The specifications to follow are defined in Sect. 3.1.
2. Transcription of the text. Once the layout is verified, the page is manually transcribed according to the specifications described in Sect. 3.2. Some pre-annotations are produced in a second phase, once the HTR models are sufficiently precise (see Table 9).
3. Extraction of the lines, with a surrounding polygon for each transcribed line.
After each task, the images are re-assigned to enable cross-checking and to smooth out annotation habits. When all three tasks are completed, a comprehensive verification is carried out by a team of administrators.

4.2 Benefits of Fine-Tuning and Transfer Learning for an Under-Resourced Language

The layout analysis models provided by Calfa Vision for handwritten document annotation projects are trained on an extensive dataset of various handwritten documents, both medieval and recent [14]. A first assessment was realized on BADAM, with a precision of 0.9132 and a recall of 0.8575 [14].

Layout Analysis and Baseline Models: After each proofreading of the predictions, the models are evaluated. The automatic re-training has been processed with batches of 50 verified images from the three manuscripts. Evaluation is carried out on the remaining images to annotate. Each batch of 50 was compiled to encompass a wide range of the assessed layouts.

Table 7. Fine-tuning of Calfa Vision models for baseline prediction (v1.0)

Model                          Precision  Recall  F1-score
Default                        0.8886     0.9522  0.9193
Model 1 (Default + batch 1)    0.9627     0.9720  0.9673
Model 2 (Model 1 + batch 2)    0.9650     0.9716  0.9683
Model 3 (Model 2 + batch 3)    0.9680     0.9756  0.9718
Model 4 (Model 3 + batch 4)    0.9762     0.9694  0.9728
Model 5 (Model 4 + batch 5)    0.9769     0.9700  0.9734


We use the metric of the cBAD competition [8], as implemented on Calfa Vision [14]. From the first fine-tuning, we notice a significant increase in the model's ability to correctly predict baselines on the dataset images. We also notice a steady improvement in precision and F1-score. The already high recall varies little throughout the fine-tuning, with a slight dip when the batch is predominantly comprised of very curved lines. The results displayed in Table 7 constitute a baseline for the ability of HTR pipelines to identify the lines of text in common Arabic Maghrebi script manuscripts. The very high score achieved by the first model demonstrates a strong ability to rapidly reach quality fine-tuning for an under-resourced language (see Fig. 4). At the region level, we evaluate the relevance with an Intersection over Union (IoU) metric. The default model is already convincing at identifying the area of the main text, but without distinction between the main text, the catchwords and the marginal notes. Table and numbering are not considered (Table 8).

Table 8. Fine-tuning of Calfa Vision models for text-region prediction (v1.0). Average IoU per class: T = text, M = marginalia, C = catchword, Tab = table, N = numbering.

Model                          T       M       C       Tab     N
Default                        0.9780  –       –       –       –
Model 1 (Default + batch 1)    0.9673  0.2177  0.3221  0.0866  0.0095
Model 2 (Model 1 + batch 2)    0.9751  0.3809  0.4068  0.1823  0.0213
Model 3 (Model 2 + batch 3)    0.9720  0.3617  0.6197  0.1900  0.1285
Model 4 (Model 3 + batch 4)    0.9680  0.5528  0.7772  0.2737  0.1826
Model 5 (Model 4 + batch 5)    0.9685  0.6268  0.8853  0.2813  0.1219
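The region-level scores above, and the line-polygon relevance discussed below, are Intersection over Union values between annotated and predicted polygons. A minimal sketch of this metric using shapely is given below; it illustrates the measure, not the evaluation code of the platform.

```python
from shapely.geometry import Polygon

def region_iou(coords_a, coords_b):
    """IoU between two annotated regions given as lists of (x, y) points,
    e.g. the Coords of a predicted and a ground-truth TextRegion."""
    a, b = Polygon(coords_a), Polygon(coords_b)
    if not a.is_valid or not b.is_valid:
        a, b = a.buffer(0), b.buffer(0)   # common fix for self-intersecting polygons
    inter = a.intersection(b).area
    union = a.union(b).area
    return inter / union if union else 0.0

# toy example: two 10x10 quadrilaterals overlapping by half -> IoU = 1/3
print(region_iou([(0, 0), (10, 0), (10, 10), (0, 10)],
                 [(5, 0), (15, 0), (15, 10), (5, 10)]))
```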

The fine-tuning mechanically triggers a decrease in the score of main-text identification, but also the rapid inclusion of the other classes (see Fig. 4). Concerning the text-regions, the latest model achieves an average IoU of 0.8534. The margins and the catchwords are sometimes very close to the main text, which makes them difficult to distinguish from it. As for the tables and numbering, their very low frequency and unequal distribution across batches result in lower scores. The relevance of the numbering class may also be questioned based on these outcomes (see footnote 10). We achieve similar results when training a model from scratch on the whole dataset; we nevertheless notice a quick integration of the new text-region classes in the fine-tuned models (see Fig. 4). At the level of line polygons, with the same metric, we measure a global relevance of 94%, whatever the curvature of the line. The main difficulty encountered concerns the diacritics, which sometimes are not encompassed by the polygon and must be manually corrected. An adjustment of the seam carve was made as early as batch 20 to better manage the line height. The verification of the polygons nevertheless constitutes the most time-consuming task.

10. The numbering class is not kept in v1.1 of the dataset, for which we notice a 9% average gain in the identification of the catchword and table classes.


For the first proofreading task of layout, baseline and polygon predictions, the time saved amounts to 75%. On average, the full process (predictions and proofreading) takes 7 min per image with the default model, brought down to 4 min per image from the first model onwards. The time required drops to 3.2 min with Model 3, then to 2.5 min with the last model. At this point, the proofreading can be confined to the curved lines of the marginal notes. Results are summarized in Fig. 4.

Fig. 4. Evolution of fine-tuning and effects on proofreading time for layout analysis and transcription (v1.0)

HTR Models: Two types of models have been created and evaluated with the default architecture proposed by Calfa Vision. The first type consists of HTR models specific to a project, i.e. specialized on one manuscript to accompany its transcription. Here, we have chosen batches of 20 corrected images.

(i) Models Dedicated to a Project: Each manuscript has its own difficulties and a specific number of lines (see supra Sect. 2.3). We have measured the learning ability of HTR models on each manuscript to accompany the transcription (see Table 9). The models are trained incrementally as new transcriptions are checked, and evaluated on the following folios of the project.

Table 9. Evolution of the CER (%) for dedicated HTR models (v1.0)

Model                          MS.ARA.1977  MS.ARA.609  MS.ARA.417
Model 1 (batch 1)              9.17         19.69       21.96
Model 2 (Model 1 + batch 2)    7.99         19.07       15.03
Model 3 (Model 2 + batch 3)    6.28         20.68       13.85
Model 4 (Model 3 + batch 4)    6.08         14.46       10.32
Final (Model 4 + batch 5)      5.71         7.80        7.10
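The CER reported here is the usual character error rate: the edit (Levenshtein) distance between prediction and ground truth, accumulated over the test lines and divided by the total number of ground-truth characters. A compact reference implementation is sketched below as an illustration; it is not the Calfa Vision evaluation code.

```python
def levenshtein(a: str, b: str) -> int:
    # Classic dynamic-programming edit distance (insertions, deletions, substitutions).
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]

def cer(predictions, references):
    # Aggregate CER in %: total edit distance over total ground-truth characters.
    errors = sum(levenshtein(p, r) for p, r in zip(predictions, references))
    length = sum(len(r) for r in references)
    return 100.0 * errors / max(length, 1)

print(cer(["kitab"], ["kitaab"]))  # toy example: one missing character -> 16.67
```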


MS.ARA.609 and MS.ARA.417 have fewer text lines, so their CER stays high until batch 4 (see footnote 11). However, we observe a significant gain in annotation speed upon application of the first model, with the average transcription time cut from 49 min for an unassisted transcription to 29 min as of the first batch and 21 min with the last model, hence an average gain of 42% (see Fig. 4). In detail, the gain is 56% for MS.ARA.1977 (with extrema of 1 h 15 min unassisted and 20 min with a model) and 45% for MS.ARA.609 (with extrema of 45 min unassisted and 13 min with a model). Word separation, a classical HTR issue, is accurate at 80.4% for each specialized model. Most character-level prediction errors concern characters in the initial or final position in the word. Among the most frequent errors, the final nūn can be confused with the rāʾ or the zāy; similar confusions affect the dāl and the ḍād. Furthermore, we observe difficulties in identifying hyphenations between words, leading to misidentification of words. It should also be noted that when several characters with superscript or subscript diacritics follow each other (e.g. a sequence tāʾ, nūn, šīn or a sequence bāʾ, yāʾ, fāʾ), the prediction of this sequence of letters is frequently incorrect and random, but adding more context with mixed models shows a significant improvement of predictions in these cases (see Fig. 5 and Table 10).

Table 10. Example of predictions on MS.ARA.1977 and MS.ARA.609

(ii) HTR Models for Arabic Maghrebi Scripts: We have also measured the relevance of transfer learning between manuscripts. Some transcription campaigns have indeed progressed more quickly than others, and re-purposing a specialized model for another manuscript proves beneficial. Results of transfer learning are shown in Fig. 5. For each model, we have used 80% of the data for training and 20% for testing. Four mixed models have been evaluated. The confusion matrix highlights the big discrepancies between the three manuscripts, and no specialized model can achieve a CER below 20% on the other manuscripts. The manuscripts display a wide variety of text density, loop shapes and diacritics management that could affect the benefits of transfer. These limitations also lead to very different shapes of polygons.

11. With better polygons (dataset v1.1), the CER decreases more quickly: 16.6 for batch 1, then 15.87, 13.67, 11.52, and finally 6.67 for the last batch.


Fig. 5. Impact of transfer learning on CER (v1.0)

In contrast, we observe a much higher convergence of the mixed models. The MS.ARA.1977 and MS.ARA.417 manuscripts benefit more from this transfer than MS.ARA.609, which presents specific difficulties. The CER of MS.ARA.417 drops below 4% with mixed models. Although we observe no real CER gain on each individual manuscript, the transfer favors greater robustness for word separation, with a gain of 8.54% on average, which is consistent with the word-based approach. Experiments also show an ability to correct misprints in the manual transcriptions. The complete model demonstrates great versatility for these scripts, with an average CER of 4.8%. Moreover, first experiments on v1.1 tend to indicate that a precise polygonization of lines, with all diacritics included, is necessary when little data is available (see footnote 11). At the word level, however, the gain remains marginal with a larger dataset. In practice, the editorial choice to manually transcribe the majority of the dataset was made to limit possible typos. For an annotation project, this annotation approach with mixed models is deemed effective and will be implemented in future work, as transfer shows good and fast specialization with a slight fine-tuning.

5 Conclusion

We have presented a new dataset comprising 300 annotated pages of Arabic Maghrebi script manuscripts of the BULAC. The chosen manuscripts display various layouts, sometimes very complex, with diverse deteriorations. The selected scripts encompass a representative panel of the handwritten production in Maghrebi scripts, in order to foster the emergence of robust HTR systems for these writings. Our work contributes to the commitment of the French scientific community to the study of the Maghreb and to the promotion of Maghrebi archives and manuscripts. Though the norms of transcription may be subject to evolution, our evaluations already attest to a good recognition of these scripts, with an average CER of 4.8% over the three manuscripts.


The Arabic scripts, and the Arabic Maghrebi scripts in particular, raise several difficulties for layout processing and recognition. We demonstrate that a crowdsourcing approach incorporating automatic fine-tuning and transfer learning is a successful strategy for data creation for under-resourced languages. It achieves results similar to those of the state of the art for manuscripts in Latin scripts. Future work will focus on evaluating the versatility of this dataset and the HTR capabilities for Maghrebi Arabic scripts.

References
1. Abdelhaleem, A., Droby, A., Asi, A., Kassis, M., Asam, R.A., El-sanaa, J.: WAHD: a database for writer identification of Arabic historical documents. In: 2017 1st International Workshop on Arabic Script Analysis and Recognition, pp. 64–68 (2017)
2. Adam, K., Baig, A., Al-Maadeed, S., Bouridane, A., El-Menshawy, S.: KERTAS: dataset for automatic dating of ancient Arabic manuscripts. Int. J. Doc. Anal. Recogn. (IJDAR) 21(4), 283–290 (2018). https://doi.org/10.1007/s10032-018-0312-3
3. Ben Azzouza, N.: Les corans de l'occident musulman médiéval : état des recherches et nouvelles perspectives. Perspectives 2, 104–130 (2017)
4. Bongianino, U.: The origins and development of Maghribī round scripts, Arabic paleography in the Islamic West (4th/10th–6th/12th centuries). Ph.D. thesis, University of Oxford (2017)
5. Camps, J.B., Vidal-Gorène, C., Vernet, M.: Handling heavily abbreviated manuscripts: HTR engines vs text normalisation approaches (2021). Accepted for the IWCP workshop of ICDAR 2021
6. Clausner, C., Antonacopoulos, A., Mcgregor, N., Wilson-Nunn, D.: ICFHR 2018 competition on recognition of historical Arabic scientific manuscripts - RASM2018. In: 2018 16th International Conference on Frontiers in Handwriting Recognition (ICFHR), pp. 471–476 (2018)
7. Clérice, T.: Evaluating deep learning methods for word segmentation of scripta continua texts in Old French and Latin. J. Data Min. Digit. Humanit. 2020 (2020). https://jdmdh.episciences.org/6264
8. Diem, M., Kleber, F., Sablatnig, R., Gatos, B.: cBAD: ICDAR2019 competition on baseline detection. In: 2019 International Conference on Document Analysis and Recognition (ICDAR), pp. 1494–1498 (2019)
9. Kassis, M., Abdalhaleem, A., Droby, A., Alaasam, R., El-Sana, J.: VML-HD: the historical Arabic documents dataset for recognition systems. In: 2017 1st International Workshop on Arabic Script Analysis and Recognition, pp. 11–14 (2017)
10. Kiessling, B., Ezra, D.S.B., Miller, M.T.: BADAM: a public dataset for baseline detection in Arabic-script manuscripts. In: Proceedings of the 5th International Workshop on Historical Document Imaging and Processing, HIP 2019, pp. 13–18. Association for Computing Machinery (2019)
11. Milo, T., Martínez, A.G.: A new strategy for Arabic OCR: archigraphemes, letter blocks, script grammar, and shape synthesis. In: Proceedings of the 3rd International Conference on Digital Access to Textual Cultural Heritage, DATeCH2019, pp. 93–96. Association for Computing Machinery, New York (2019)


12. Pantke, W., Dennhardt, M., Fecker, D., Märgner, V., Fingscheidt, T.: An historical handwritten Arabic dataset for segmentation-free word spotting - HADARA80P. In: 2014 14th International Conference on Frontiers in Handwriting Recognition, pp. 15–20 (2014)
13. Van Den Boogert, N.: Some notes on Maghribi script. Manuscripts Middle East 4, 30–43 (1989)
14. Vidal-Gorène, C., Dupin, B., Decours-Perez, A., Riccioli, T.: A modular and automated annotation platform for handwritings: evaluation on under-resourced languages (2021). Accepted for the ICDAR 2021 Main Conference

Towards Boosting the Accuracy of Non-latin Scene Text Recognition

Sanjana Gunna, Rohit Saluja, and C. V. Jawahar
Centre for Vision Information Technology, International Institute of Information Technology, Hyderabad 500032, India
{sanjana.gunna,rohit.saluja}@research.iiit.ac.in, [email protected]
https://github.com/firesans/NonLatinPhotoOCR

Abstract. Scene-text recognition is remarkably better in Latin languages than the non-Latin languages due to several factors like multiple fonts, simplistic vocabulary statistics, updated data generation tools, and writing systems. This paper examines the possible reasons for low accuracy by comparing English datasets with non-Latin languages. We compare various features like the size (width and height) of the word images and word length statistics. Over the last decade, generating synthetic datasets with powerful deep learning techniques has tremendously improved scene-text recognition. Several controlled experiments are performed on English, by varying the number of (i) fonts used to create the synthetic data and (ii) created word images. We discover that these factors are critical for the scene-text recognition systems. The English synthetic datasets utilize over 1400 fonts while Arabic and other non-Latin datasets utilize less than 100 fonts for data generation. Since some of these languages are a part of different regions, we garner additional fonts through a region-based search to improve the scene-text recognition models in Arabic and Devanagari. We improve the Word Recognition Rates (WRRs) on the Arabic MLT-17 and MLT-19 datasets by 24.54% and 2.32% compared to previous works or baselines. We achieve WRR gains of 7.88% and 3.72% for the IIIT-ILST and MLT-19 Devanagari datasets.

Keywords: Scene-text recognition · Photo OCR · Multilingual OCR · Arabic OCR · Synthetic data · Generative adversarial network

1 Introduction

The task of scene-text recognition involves reading text from natural images. It finds applications in aiding the visually impaired and in extracting information for map services and geographical information systems by mining data from street-view-like images [2]. The overall pipeline for scene-text recognition involves a text detection stage followed by a text recognition stage. Predicting the bounding boxes around word images is called text detection [6]. The next step involves recognizing text from the cropped word images obtained from the labeled or predicted bounding boxes [12]. In this work, we focus on improving text recognition in non-Latin languages.


Fig. 1. Comparing STAR-Net’s performance on IIIT5K [13] dataset when trained on synthetic data created using a varying number of fonts and training samples.

Multilingual text recognition has witnessed notable growth due to the impact of globalization, which leads to international and intercultural communication. Unlike for English and other Latin datasets, the recognition algorithms proposed so far have not recorded similar accuracies on non-Latin datasets. Reading text from non-Latin images is challenging due to the distinct variation in the scripts used, the writing systems, the scarcity of data, and the fonts. In Fig. 1, we illustrate the analysis of Word Recognition Rates (WRR) on the IIIT5K English dataset [13] obtained by varying the number of training samples and fonts used in the synthetic data. The training performed with STAR-Net [11] shows that extending the number of fonts leads to better WRR gains than increasing the amount of training data. We incorporate the new fonts found using a region-based online search to generate synthetic data in Arabic and Devanagari. The motivation behind this work is described in Sect. 3. The methodology to train the deep neural network on the Arabic and Devanagari datasets is detailed in Sect. 4. The results and conclusions from this study are presented in Sect. 5 and 6, respectively. The contributions of this work are as follows:
1. We study two parameters of synthetic datasets that are crucial to the performance of reading models on the IIIT5K English dataset: i) the number of training examples and ii) the number of diverse fonts (see footnote 1).

1. We also investigated other reasons for low recognition rates in non-Latin languages, like comparing the size of word images of Latin and non-Latin real datasets, but could not find any significant variations (or exciting differences). Moreover, we observe very high word recognition rates (>90%) when we test our non-Latin models on the held-out synthetic datasets, which shows that learning to read the non-Latin glyphs is trivial for the existing deep models. Refer to https://github.com/firesans/STRforIndicLanguages for more details.

Table 1. Latin and non-Latin scene-text recognition datasets.

Multilingual: IIIT-ILST-17 (3K words, 3 languages), MLT-17 (18K scenes, 9 languages), MLT-19 (20K scenes, 10 languages), OCR-on-the-go-19 (1000 scenes, 3 languages), CATALIST-21 (2322 scenes, 3 languages)
Arabic: ARASTEC-15 (260 signboards, hoardings, advertisements), MLT-17,19
Chinese: RCTW-17 (12K scenes), ReCTS-25K-19 (25K signboards), CTW-19 (32K scenes), RRC-LSVT-19 (450K scenes), MLT-17,19
Korean: KAIST-11 (2.4K signboards, book covers, characters), MLT-17,19
Japanese: DOST-16 (32K images), MLT-17,19
English: SVT-10 (350 scenes), SVT-P-13 (238 scenes, 639 words), IIIT5K-12 (5K words), IC11 (485 scenes, 1564 words), IC13 (462 scenes), IC15 (1500 scenes), COCO-Text-16 (63.7K scenes), CUTE80-14 (80 scenes), Total-Text-19 (2201 scenes), MLT-17,19

2. We share 55 additional fonts in Arabic and 97 new fonts in Devanagari, which we found using a region-wise online search. These fonts were not used in previous scene-text recognition works.
3. We apply our learnings to improve the state-of-the-art results for two non-Latin languages, Arabic and Devanagari.

2 Related Work

Recently, there has been increasing interest in scene-text recognition for a few widely spoken non-Latin languages around the globe, such as Arabic, Chinese, Devanagari, Japanese, and Korean. Multilingual datasets have been introduced to tackle such languages due to their unique characteristics. As shown in Table 1, Mathew et al. [12] release the IIIT-ILST dataset containing around 1K images for each of three non-Latin languages. The MLT dataset from the ICDAR'17 RRC contains images from Arabic, Bangla, Chinese, English, French, German, Italian, Japanese, and Korean [15]. The ICDAR'19 RRC builds MLT-19 on top of MLT-17 to contain text from Arabic, Bangla, Chinese, English, French, German, Italian, Japanese, Korean, and Devanagari [14]. The recent OCR-on-the-go and CATALIST (see footnote 2) datasets include around 1000 and 2322 annotated videos in Marathi, Hindi, and English [19]. Arabic scene-text recognition datasets include ARASTEC and MLT-17,19 [26]. Chinese datasets cover RCTW, ReCTS-25k, CTW, and RRC-LSVT from the ICDAR'19 Robust Reading Competition (RRC) [23,24,31,33]. Korean and Japanese scene-text recognition datasets include KAIST and DOST [7,9]. Different English datasets are listed in the last row of Table 1 [3,10,13–17,20,27,28,30].

2. https://catalist-2021.github.io/


Various models have been proposed for the task of scene-text recognition. Wang et al. [29] present an object recognition module that achieves competitive performance by training on ground-truth lexicons without any explicit text detection stage. Shi et al. [21] propose a Convolutional Recurrent Neural Network (CRNN) architecture. It achieves remarkable performance in both lexicon-free and lexicon-based scene-text recognition tasks and is used by Mathew et al. [12] for three non-Latin languages. Liu et al. [11] introduce the Spatial Attention Residue Network (STAR-Net) with a spatial transformer-based attention mechanism, which handles image distortions. Shi et al. [22] propose a segmentation-free Attention-based method for Text Recognition (ASTER). Mathew et al. [12] achieve Word Recognition Rates (WRRs) of 42.9%, 57.2%, and 73.4% on 1K real images in Hindi, Telugu, and Malayalam, respectively. Bušta et al. [2] propose a CNN- and CTC-based method for text localization, script identification, and text recognition, tested on 11 languages (including Arabic) of the MLT-17 dataset. The WRRs are above 65% for Latin and Hangul and below 47% for the remaining languages (46.2% for Arabic). Therefore, we aim to improve non-Latin recognition models.

3 Motivation and Datasets

This section explains the motivation behind our work. Here we also describe the datasets used for our experiments on non-Latin scene-text recognition.

Motivation: To study the effect of fonts and training examples on scene-text recognition performance, we randomly sample 100 and 1000 fonts from the set of over 1400 English fonts used in previous works [5,8]. For the 1400-font setting, we use the datasets available from earlier photo-OCR works on synthetic dataset generation [5,8]. For 100 and 1000 fonts, we generate synthetic images by following a simplified version of the methodology proposed by Mathew et al. [12]. We therefore create three different synthetic datasets. Moreover, we simultaneously vary the number of training samples over 0.5M, 5M, and 20M samples. By changing these two parameters, we train our model (refer to Sect. 4) on the above synthetic datasets and test it on the IIIT5K dataset. We observe that the model trained on around 20M samples generated with over 1400 fonts achieves state-of-the-art accuracy on the IIIT5K dataset [11]. As shown in Fig. 1, the WRR of the model trained on 5M samples generated using over 1400 fonts is very close to the recorded WRR with 20M samples. Moreover, models trained on 1400 fonts outperform the models trained on 1000 and 100 fonts by a margin of 10%, because of improved (font) diversity and better but more complex dataset generation methods. Also, in Fig. 1, as we increase the number of fonts from 1000 to 1400, the WRR gap between the models trained on 5M and 20M samples moderately increases (from 0% to around 2%). Finally, this analysis highlights the importance of increasing the number of fonts used for synthetic dataset generation, and ultimately of improving the scene-text recognition models.

Table 2. Synthetic data statistics. μ, σ represent mean and standard deviation.

Language     # Images  μ, σ word length  # Fonts
English      17.5M     5.12, 2.99        >1400
Arabic       5M        6.39, 2.26        140
Devanagari   5M        8.73, 3.10        194

Fig. 2. Synthetic word images in Arabic and Devanagari.

Datasets: As shown in Table 2, we generate over 17M word images in English and 5M word images each in Arabic and Devanagari, using the tools provided by Mathew et al. [12]. We use 140 and 194 fonts for Arabic and Devanagari, respectively. Previous works use 85 and 97 fonts for these languages [1,12]. Since the two languages are spoken in different regions, we found 55 additional fonts in Arabic and 97 new fonts in Devanagari using a region-wise online search (see footnote 3). We use these additional fonts, which we will share with this work, and, as we will see in Sect. 5, we also perform some of our experiments with them. Sample images of our synthetic data are shown in Fig. 2. As shown in Table 2, English has the lowest average word length among the languages mentioned, while Arabic and Devanagari have comparable average word lengths. Note that we use over 1400 fonts for English, whereas the number of diverse fonts available for the non-Latin languages is relatively low. We run our models on the Arabic and Devanagari test sets from the MLT-17, IIIT-ILST, and MLT-19 datasets (see footnote 4). The results are summarized in Sect. 5.

3. Additional fonts found using region-based online search are available at: www.sanskritdocuments.org/, www.tinyurl.com/n84kspbx, www.tinyurl.com/7uz2fknu, www.ctan.org/tex-archive/fonts/shobhika?lang=en, www.hindi-fonts.com/, www.fontsc.com/font/tag/arabic; more fonts are shared at https://github.com/firesans/NonLatinPhotoOCR.
4. We could not obtain the ARASTEC dataset discussed in the previous section.


Fig. 3. Model used to train on non-Latin datasets.

4 Underlying Model

We now describe the model we train for our experiments. We use STAR-Net because of its capacity to handle different image distortions [11]. It has a Spatial Transformer network, a Residue Feature Extractor, and a Connectionist Temporal Classification (CTC) layer. As shown in Fig. 3, the first component consists of a spatial attention mechanism achieved via a CNN-based localisation network that predicts affine transformation parameters to handle image distortions. The second component consists of a Convolutional Neural Network (CNN) and a Recurrent Neural Network (RNN). The CNN is an inception-resnet architecture, which helps in extracting robust image features [25]. The last component provides the non-parameterized supervision for text alignment. The overall end-to-end trainable model consists of 26 convolutional layers [11]. The input to the spatial transformer module has a resolution of 150×48. The spatial transformer outputs an image of size 100×32 for the next stage (the Residue Feature Extractor). We train all our models on 5M synthetic word images, as discussed in the previous section. We use a batch size of 32 and the ADADELTA optimizer for our experiments [32]. We train each model for 10 epochs and test on Arabic and Devanagari word images from the IIIT-ILST, MLT-17, and MLT-19 datasets. Only for the Arabic MLT-17 dataset, we fine-tune our models on the training images and test them on the validation images for a fair comparison with Bušta et al. [1]. For Devanagari, we present additional results on the IIIT-ILST dataset by fine-tuning our best model on the MLT-19 dataset. We fine-tune all the layers of our model for the two settings mentioned above. To further improve our models, we add an LSTM layer of size 1 × 256 to the STAR-Net model pre-trained on synthetic data. The additional layer corrects the model's bias towards the synthetic datasets, and hence we call it the correction LSTM. We plug in the correction LSTM before the CTC layer, as shown in Fig. 3 (top-right). After attaching the LSTM layer, we fine-tune the complete network on the real datasets.
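As an illustration of where the correction LSTM sits, the sketch below shows a recurrent head with an extra 1 × 256 LSTM inserted just before the CTC projection, in PyTorch. The feature dimensions, class count and the bidirectional base RNN are assumptions made for the sake of a runnable example; only the 256-unit correction layer and its position before the CTC layer come from the text.

```python
import torch
import torch.nn as nn

class CTCHeadWithCorrectionLSTM(nn.Module):
    """Recurrent head of a scene-text recognizer with an extra 'correction' LSTM
    inserted just before the CTC projection (a sketch, not the authors' code)."""
    def __init__(self, feat_dim=512, hidden=256, num_classes=100):
        super().__init__()
        self.base_rnn = nn.LSTM(feat_dim, hidden, batch_first=True, bidirectional=True)
        # The 1 x 256 correction LSTM added after pre-training on synthetic data.
        self.correction = nn.LSTM(2 * hidden, 256, batch_first=True)
        self.proj = nn.Linear(256, num_classes)  # num_classes includes the CTC blank

    def forward(self, feats):                     # feats: (batch, time, feat_dim)
        x, _ = self.base_rnn(feats)
        x, _ = self.correction(x)
        return self.proj(x).log_softmax(dim=-1)

# one CTC training step on dummy features
model = CTCHeadWithCorrectionLSTM()
ctc = nn.CTCLoss(blank=0, zero_infinity=True)
feats = torch.randn(4, 25, 512)                   # 4 word images, 25 time steps
log_probs = model(feats).permute(1, 0, 2)         # CTC expects (time, batch, classes)
targets = torch.randint(1, 100, (4, 10))
loss = ctc(log_probs, targets,
           input_lengths=torch.full((4,), 25, dtype=torch.long),
           target_lengths=torch.full((4,), 10, dtype=torch.long))
loss.backward()
```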

Table 3. Results of our experiments on real datasets. FT means fine-tuned.

Language    Dataset    # Images  Model                                          CRR (%)  WRR (%)
Arabic      MLT-17     951       Bušta et al. [2]                               75.00    46.20
                                 STAR-Net (85 Fonts) FT                         88.48    66.38
                                 STAR-Net (140 Fonts) FT                        89.17    68.51
                                 STAR-Net (140 Fonts) FT with Correction LSTM   90.19    70.74
Devanagari  IIIT-ILST  1150      Mathew et al. [12]                             75.60    42.90
                                 STAR-Net (97 Fonts)                            77.44    43.38
                                 STAR-Net (194 Fonts)                           77.65    44.27
                                 STAR-Net (194 Fonts) FT on MLT-19 data         79.45    50.02
                                 STAR-Net (194 Fonts) FT with Correction LSTM   80.45    50.78
Arabic      MLT-19     4501      STAR-Net (85 Fonts)                            71.15    40.05
                                 STAR-Net (140 Fonts)                           75.26    42.37
Devanagari  MLT-19     3766      STAR-Net (97 Fonts)                            84.60    60.83
                                 STAR-Net (194 Fonts)                           85.87    64.55

5 Results

Table 3 depicts the performance of our experiments on the real datasets. For the Arabic MLT-17 dataset and the Devanagari IIIT-ILST dataset, we achieve recognition rates better than Bušta et al. [1] and Mathew et al. [12]. With the STAR-Net models trained on 140 and 194 fonts, we already surpass these baselines (rows 3 and 7 in the last column of Table 3). By fine-tuning the Devanagari model on the MLT-19 dataset, the CRR and WRR gains rise to 3.85% and 7.12%. By adding the correction LSTM layer to the best models, we achieve the highest CRR and WRR gains of 15.19% and 24.54% for Arabic, and 5.25% and 7.88% for Devanagari, over the previous works. The final results for the two datasets discussed above can be seen in rows 3 and 7 of the last column of Table 3. As shown in Table 3, for the MLT-19 Arabic dataset, the model trained on 5M samples generated using 85 fonts achieves a CRR of 71.15% and a WRR of 40.05%. Increasing the number of diverse fonts to 140 gives a CRR gain of 4.11% and a WRR gain of 2.32%. For the MLT-19 Devanagari dataset, the model trained on 5M samples generated using 97 fonts achieves a CRR of 84.60% and a WRR of 60.83%. Increasing the number of fonts to 194 gives a CRR gain of 1.27% and a WRR gain of 3.72%.


Fig. 4. Histogram of correct words (x = 0) and words with x errors (x > 0). FT represents the models fine-tuned on real datasets.

It is also interesting to note that the WRRs of our models on the MLT-17 Arabic and MLT-19 Devanagari datasets are very close to the WRR of the English model trained on 5M samples generated using 100 fonts (refer to the yellow curve in Fig. 1). This supports our claim that the number of fonts used to create the synthetic dataset plays a crucial role in improving photo-OCR models in different languages. To present the overall improvements obtained by utilizing extra fonts and the correction LSTM at a higher level, we examine the histograms of the edit distance between the pairs of predicted and corresponding ground-truth words in Fig. 4. Such histograms are used in one of the previous works on OCR error correction [18]. The bars at an edit distance of 0 represent the words correctly predicted by the models. The subsequent bars at edit distance x > 0 represent the number of words with x erroneous characters. As can be seen in Fig. 4, overall, with the increase in the number of fonts and subsequently with the correction LSTM, i) the number of correct words (x = 0) increases for each dataset, and ii) the number of incorrect words (x > 0) reduces for many values of x across the different datasets. We observe a few exceptions in each histogram where the frequency of incorrect words is higher for the best model than for the others, e.g., at an edit distance of 2 for the Arabic MLT-17 dataset. These differences (or exceptions) show that the recognitions by the different models complement each other.


Fig. 5. Clockwise from top-left: WA-ECR of our models tested on MLT-17 Arabic, IIIT-ILST Devanagari, MLT-19 Devanagari, and MLT-19 Arabic datasets.

Another exciting way to compare the output of different OCR systems is the Word-Averaged Erroneous Character Rate (WA-ECR), as proposed by Agam et al. [4]. The WA-ECR for word length l is the ratio of i) the number of erroneous characters in the set of all l-length ground-truth words (e_l) to ii) the number of l-length ground-truth words (n_l) in the test set, i.e. WA-ECR(l) = e_l / n_l. As shown by the red dots and the right y-axis of the plots in Fig. 5, the frequency of words generally reduces with an increase in word length beyond 4. Therefore, the denominator term tends to decrease the WA-ECR for short words. Moreover, as the word length increases, it becomes more difficult for the OCR model to predict all the characters correctly. Naturally, the WA-ECR tends to increase with word length for an OCR system. In Fig. 5, we observe that our models trained on ≥140 fonts (blue curves) have lower WA-ECR across different word lengths than the ones trained on fewer fonts (e.g., for word lengths greater than 8 in the top-right plot of Fig. 5), and the correction LSTM further enhances this effect. We also observe that the correction LSTM reduces the WA-ECR for the MLT-17 Arabic dataset for word lengths in the range [6, 11] (compare the green and blue curves in the top-left plot). Interestingly, the WA-ECR of some of our models drops after word lengths of 10 and 14 for the MLT-19 Arabic and MLT-19 Devanagari datasets.
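A minimal sketch of the WA-ECR computation described above: it takes pre-computed character error counts per ground-truth word (e.g. edit distances against the predictions), groups them by word length l, and returns e_l / n_l for each length. The toy input is illustrative.

```python
from collections import defaultdict

def wa_ecr(word_errors):
    """word_errors: list of (ground_truth_word, num_erroneous_characters) pairs.
    Returns {word length l: e_l / n_l}, i.e. erroneous characters averaged over
    all ground-truth words of length l."""
    e = defaultdict(int)   # e_l: erroneous characters for length l
    n = defaultdict(int)   # n_l: number of ground-truth words of length l
    for word, errors in word_errors:
        length = len(word)
        e[length] += errors
        n[length] += 1
    return {length: e[length] / n[length] for length in sorted(n)}

# toy usage with transliterated ground-truth words and per-word error counts
print(wa_ecr([("kitab", 0), ("qalam", 1), ("dar", 2)]))
```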

Fig. 6. Real word images in Arabic (top) and Devanagari (bottom). Below the images: predictions from i) the baseline model.

If the computed distance is greater than a decision threshold T0, the document image is considered as forged. Two kinds of errors are possible. A false rejection (FR) corresponds to a genuine document that is predicted as forged. On the other hand, a false acceptance (FA) corresponds to a forged document that is predicted as genuine. Both errors are undesirable but, depending on the application, one may be more serious than the other. In the context of identity document verification, an FR has more serious consequences than an FA. Therefore, an FR error should be penalized more than an FA error. We define a global cost function C_global as follows:

C_global = (3 × C_FR + C_FA) / 4        (3)

where C_FR and C_FA are the costs of FR and FA errors, respectively. The value of 3 for the FR penalty factor is fixed empirically. This cost function (Eq. 3) is used to find the optimum value T0 of the decision threshold.
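In practice, T0 can be selected on the validation set by sweeping candidate thresholds and keeping the one that minimizes Eq. 3. The sketch below interprets C_FR and C_FA as the false rejection and false acceptance rates measured at threshold t (an assumption); the threshold grid and the toy data are illustrative.

```python
import numpy as np

def global_cost(distances, labels, t, fr_penalty=3.0):
    """distances: model distances on validation pairs; labels: 1 = genuine/similar pair,
    0 = forged/different. A pair is predicted 'different' when distance > t."""
    pred_diff = distances > t
    c_fr = np.mean(pred_diff[labels == 1])     # genuine pairs predicted as forged
    c_fa = np.mean(~pred_diff[labels == 0])    # forged pairs predicted as genuine
    return (fr_penalty * c_fr + c_fa) / (fr_penalty + 1.0)   # Eq. 3 with penalty = 3

def find_t0(distances, labels, grid=np.linspace(0.0, 3.0, 301)):
    costs = [global_cost(distances, labels, t) for t in grid]
    return grid[int(np.argmin(costs))]

# toy example: genuine pairs have small distances, forged pairs large ones
rng = np.random.default_rng(0)
d = np.concatenate([rng.normal(0.8, 0.3, 500), rng.normal(2.0, 0.4, 500)])
y = np.concatenate([np.ones(500), np.zeros(500)])
print(find_t0(d, y))
```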

Fig. 4. Examples of genuine and forged documents with masked personal information: (a) a genuine FR ID document; (b) a genuine FR RP document; (c) a forged FR ID document where the photo is replaced (its background is different from that in (a)); (d) a forged FR RP document where the altered pattern is shown in red.

5 Experiments and Results

The presented models are tested on two verification tasks, photoVerif and patternVerif. These tasks concern the verification of the following ROIs, respectively:
– Photo background: check that the identity photo was not altered or replaced, by comparing the query photo background to the reference one.
– Security visual pattern: check the presence of security patterns and their conformity to those defined in the reference model, as in [6].
Our data set is built from a real-life application. As a real-world data set contains only very few forged documents, we generated a few hundred altered documents by randomly tampering with visual pattern areas (using copy-paste of randomly selected areas within the document) and/or photo areas (by replacing original photos with other ones taken from the web). The majority of these alterations are generated automatically by means of a specific tool developed for this purpose. Other alterations are made manually in order to imitate realistic document falsifications. Figure 4 illustrates some examples of genuine and forged documents taken from our data set. Four types of French identity documents are used for these experiments: identity card (FR ID), passport (FR PA), driving licence (FR DL), and residence permit (FR RP), as described in Table 1. All these documents contain an identity photo, but only FR RP contains visual patterns and thus it is the only one concerned by the patternVerif task.

Table 1. Training and test data sets.

Type    Training and validation set (Genuine / Forged)    Test set (Genuine / Forged)
FR ID   117 / 100                                          564 / 299
FR PA   104 / 100                                          345 / 125
FR DL   122 / 100                                          205 / 172
FR RP   149 / 100                                          304 / 198

5.1 Training Data Preparation

Given an input image, the identity document is localized and aligned with the reference image as described in [4] (see Fig. 5a). Then, the photos as well as the security patterns are localized based on their positions in the reference. The corresponding patches are extracted to be compared to their counterparts in the reference image. Examples of the extracted patches are shown in Fig. 5b and 5c.
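Once the query document has been aligned to the reference, extracting the comparison patches amounts to cropping the ROI boxes defined on the reference model. A minimal Pillow sketch is given below; the ROI names, coordinates and file name are hypothetical placeholders, not the real template values.

```python
from PIL import Image

# ROIs defined once on the reference model as (left, top, right, bottom) boxes;
# the coordinates below are illustrative, not the real template values.
REFERENCE_ROIS = {
    "photo_bg_top_left": (40, 60, 120, 140),
    "photo_bg_top_right": (220, 60, 300, 140),
    "visual_pattern_1": (400, 30, 520, 90),
}

def extract_patches(aligned_image_path, rois=REFERENCE_ROIS):
    """Crop every predefined ROI from a query image already aligned to the reference."""
    image = Image.open(aligned_image_path)
    return {name: image.crop(box) for name, box in rois.items()}

# patches = extract_patches("query_aligned.jpg")  # hypothetical file name
```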


Fig. 5. Patches extraction. (a) Image alignment and patch extraction: on the left, document identification and localization from an input image based on keypoint matching, as detailed in [4]; on the right, the query document image is aligned with the reference and the patches are extracted, in green the patches used for patternVerif and in red those used for photoVerif (the top-left and top-right sub-regions not covered by the photo are used). (b) Examples of patches used for patternVerif. (c) Examples of patches used for photoVerif.

Training Pairs and Triplets Construction. To select training samples, we adopted offline random mining: the pairs/triplets are randomly created beforehand, as follows. Once extracted, the patches are grouped based on a similarity criterion, i.e. the patches corresponding to the same ROI are grouped together if they belong to genuine documents; let {P_i}_i denote the set of these groups. All the altered patches (even if they are not similar to each other) are grouped into the same class N.

Pairs Construction. First, a random image is taken from a randomly selected group P_k while respecting an equitable distribution between the different groups. Then, alternately, a similar image (from the same group P_k) or a different image (from the groups N ∪ {P_i}_i \ P_k) is randomly selected. This operation is repeated until a sufficiently large data set is obtained. A number of 55,000 pairs was experimentally found to be a good trade-off between effective and rapid learning.

Triplet Construction. As for pair generation, we first randomly select an anchor sample from a group P_k (which is itself randomly selected). Then, a positive sample (from the same group P_k) and a negative sample (from the other groups) are randomly selected, until the desired number of triplets is obtained (also fixed to 55,000). Once the pairs (or triplets) data set is constructed, we split it into two sets: training and validation.
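A simplified sketch of this offline random mining, assuming the genuine patches are already grouped by ROI into groups P_i and the altered patches form class N; the alternation and balancing details are illustrative approximations of the procedure described above.

```python
import random

def build_pairs(groups, altered, n_pairs=55000, seed=0):
    """groups: dict name -> list of genuine patches of one ROI; altered: list of
    altered patches (class N). Returns (patch_a, patch_b, is_similar) triples,
    alternating similar and dissimilar pairs."""
    rng = random.Random(seed)
    names = list(groups)
    pairs = []
    for i in range(n_pairs):
        k = names[i % len(names)]                 # roughly equitable over groups
        anchor = rng.choice(groups[k])
        if i % 2 == 0:                            # similar pair: same group
            pairs.append((anchor, rng.choice(groups[k]), 1))
        else:                                     # dissimilar: other groups or class N
            pool = altered + [p for n in names if n != k for p in groups[n]]
            pairs.append((anchor, rng.choice(pool), 0))
    return pairs

def build_triplets(groups, altered, n_triplets=55000, seed=0):
    rng = random.Random(seed)
    names = list(groups)
    triplets = []
    for i in range(n_triplets):
        k = names[i % len(names)]
        anchor, positive = rng.choice(groups[k]), rng.choice(groups[k])
        negative_pool = altered + [p for n in names if n != k for p in groups[n]]
        triplets.append((anchor, positive, rng.choice(negative_pool)))
    return triplets
```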

5.2 Some Implementation Details

We train the proposed models using SGD with momentum. A minibatch of 20 samples and a fixed learning rate of 10^-4 are used. We use a momentum of 0.9, a weight decay of 0.05, and a maximum of 10 epochs. All training was performed on a GeForce GTX 1080 Ti.
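These settings translate directly into a standard PyTorch configuration, sketched below with a placeholder embedding network and a standard triplet margin loss; the network, patch size and margin value are illustrative assumptions, and only the optimizer hyper-parameters come from the text.

```python
import torch
import torch.nn as nn

embedder = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, 128))  # placeholder network

optimizer = torch.optim.SGD(embedder.parameters(),
                            lr=1e-4, momentum=0.9, weight_decay=0.05)
triplet_loss = nn.TripletMarginLoss(margin=1.0)   # margin value is illustrative

# one training step on a dummy minibatch of 20 triplets of 64x64 RGB patches
anchor, positive, negative = (torch.randn(20, 3, 64, 64) for _ in range(3))
loss = triplet_loss(embedder(anchor), embedder(positive), embedder(negative))
optimizer.zero_grad()
loss.backward()
optimizer.step()
```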

5.3 Results and Interpretation

Evaluation Metric. To evaluate the proposed approach, the following error rates are used:
– False Rejection Rate (FRR), defined as the ratio of similar images predicted as different to the total number of similar images.
– False Acceptance Rate (FAR), defined as the ratio of different images predicted as similar to the total number of different images.

Results. During training, the models are periodically evaluated (every 500 iterations) in terms of loss and accuracy. This evaluation is used to save, at any moment during training, the model instance that has achieved the best accuracy. One can notice that the siamese model converges more quickly than the triplet one. It achieves an accuracy of 98.7% (the blue point in Fig. 6a) in fewer than 50,000 iterations and remains stable around an accuracy of about 97%. As for the triplet network, it converges gradually after more than 400,000 iterations and achieves an accuracy of 99.2% (the blue point in Fig. 6b).

Fig. 6. Evolution of loss (in red) and accuracy (in green) during the training of the siamese network (6a) and the triplet network (6b). (Color figure online)

Before using the trained models for prediction on the test set, the decision threshold is optimized as discussed in Sect. 4.3, using the validation set. To this end, the curve C_global = f(T) is built (see Fig. 7). The optimum thresholds are 1.7 and 1.2 for the siamese and the triplet models respectively. Using the optimized thresholds, the trained models are evaluated on the test set at patch level as well as document level.
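A possible way to pick the threshold on the validation set is sketched below; it assumes, as one simple instance of the global cost curve C_global = f(T) mentioned above, an equally weighted sum of FRR and FAR, whereas the exact cost used in Sect. 4.3 may be weighted differently.

```python
import numpy as np

def select_threshold(distances, labels, t_min=0.0, t_max=3.0, steps=300):
    """Scan candidate thresholds T on the validation set and keep the one that
    minimizes a global cost; here C_global(T) = FRR(T) + FAR(T) as a simple instance."""
    d = np.asarray(distances)
    y = np.asarray(labels)                        # 1 = similar pair, 0 = different pair
    best_t, best_cost = None, float("inf")
    for t in np.linspace(t_min, t_max, steps):
        frr = np.mean(d[y == 1] >= t)             # similar pairs rejected
        far = np.mean(d[y == 0] < t)              # different pairs accepted
        cost = frr + far
        if cost < best_cost:
            best_t, best_cost = t, cost
    return best_t
```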


Fig. 7. Optimization of the decision threshold for the siamese network (7a) and the triplet network (7b) based on global cost minimization.

A document is considered forged if the two photo patches (or the two visual patterns for the FR RP document) are predicted as different from the reference. Table 2 summarizes the obtained results on the test set. One can notice that, at both levels, the siamese network gives a better FRR for most of the document types, leading to a weighted average FRR of 1.27, which is better than that of the triplet network (1.55). The triplet network, on the other hand, is better in terms of FAR, with a weighted average of 2.27 vs. 2.52 for the siamese network. This can be explained by the nature of the loss functions used for both networks. As the triplet network optimizes its loss using similar and different pairs simultaneously, it tends to achieve a good balance between similarity and dissimilarity detection. The SNN, on the other hand, tends to better learn similarity, and identifying non-similar pairs is thus harder. Based on the obtained results, the siamese model is retained and integrated into our industrial solution.

Table 2. FRR and FAR on the test set obtained by the siamese and the triplet networks.

Type   | Patch level                               | Document level
       | Siamese network     | Triplet network     | Siamese network     | Triplet network
       | FRR (%)  | FAR (%)  | FRR (%)  | FAR (%)  | FRR (%)  | FAR (%)  | FRR (%)  | FAR (%)
FR ID  | 1.95     | 3.68     | 2.04     | 3.34     | 1.24     | 2.68     | 1.60     | 2.34
FR PA  | 0.72     | 3.20     | 1.01     | 2.4      | 0.58     | 2.4      | 0.87     | 1.60
FR DL  | 1.46     | 2.30     | 1.71     | 2.61     | 0.98     | 1.16     | 1.46     | 1.74
FR RP  | 2.8      | 4.29     | 2.46     | 3.78     | 2.30     | 3.54     | 2.30     | 3.03

To show the efficiency of the proposed models, we compare them to classical image comparison approaches based on handcrafted features. To this end, we implement three of the most widely used descriptors in the image matching and image retrieval fields. Table 3 shows that the proposed models significantly outperform the HOG-, LBP- and SIFT-based methods, in terms of both FRR and FAR.


Table 3. Comparison of the proposed models to classical approaches.

Methods                | Patch level          | Document level
                       | FRR (%)  | FAR (%)   | FRR (%)  | FAR (%)
Siamese model          | 1.76     | 3.40      | 1.27     | 2.52
Triplet model          | 1.84     | 3.21      | 1.55     | 2.27
HOG + euclidean dist   | 7.44     | 10.20     | 4.72     | 9.70
LBP + euclidean dist   | 8.78     | 15.05     | 6.84     | 13.22
SIFT matching          | 5.04     | 6.49      | 4.37     | 5.92

6 Conclusion

We proposed a reference-based approach that compares a set of predefined ROIs within the input image to their equivalents in the reference one. For this purpose, two different CNN-based architectures were explored. It was experimentally shown that siamese and triplet models, known to be well adapted to comparing two images, are robust to various image capture imperfections. The contrastive and triplet loss functions allow these networks to learn relevant high-level features for image comparison, giving better results than classical approaches based on handcrafted descriptors. The main advantage of this work is that it presents a generic algorithm for image comparison based on a learned similarity function. The latter can be applied to several other verification tasks on identity documents, such as face matching, background verification, etc. In order to improve our system, we are currently studying different combination strategies to benefit from the advantages of each model. Furthermore, we plan to improve the pair/triplet sampling algorithm in order to use the data more efficiently by studying online pair/triplet sampling techniques.

References 1. Ansari, M.D., Ghrera, S., Tyagi, V.: Pixel-based image forgery detection: a review. IETE J. Educ. 55, 40–46 (2014) 2. Birajdar, G.K., Mankar, V.H.: Digital image forgery detection using passive techniques: a survey. Digit. Investig. 10(3), 226–245 (2013) 3. Centeno, A.B., Terrades, O.R., Canet, J.L., Morales, C.C.: Evaluation of texture descriptors for validation of counterfeit documents. In: 14th IAPR International Conference on Document Analysis and Recognition, ICDAR 2017, Kyoto, Japan, 9–15 November 2017, pp. 1237–1242. IEEE (2017) 4. Chiron, G., Ghanmi, N., Awal, A.M.: Id documents matching and localization with multi-hypothesis constraints. In: 25th International Conference on Pattern Recognition (ICPR), pp. 3644–3651 (2020)


5. Cruz, F., Sidere, N., Coustaty, M., D’Andecy, V.P., Ogier, J.-M.: Local binary patterns for document forgery detection. In: 14th IAPR International Conference on Document Analysis and Recognition, ICDAR 2017, Kyoto, Japan, 9–15 November 2017, pp. 1223–1228. IEEE (2017) 6. Ghanmi, N., Awal, A.M.: A new descriptor for pattern matching: Application to identity document verification. In: 2018 13th IAPR International Workshop on Document Analysis Systems (DAS), pp. 375–380 (2018) 7. Jain, R., Doermann, D.: VisualDiff: document image verification and change detection. In: 2013 12th International Conference on Document Analysis and Recognition, pp. 40–44 (2013) 8. Zhang, J., Feng, Z., Su, Y.: A new approach for detecting copy-move forgery in digital images. In: 2008 11th IEEE Singapore International Conference on Communication Systems, pp. 362–366 (2008) 9. Kang, L., Cheng, X.: Copy-move forgery detection in digital image. In: 2010 3rd International Congress on Image and Signal Processing, vol. 5, pp. 2419–2421 (2010) 10. Lin, Z., He, J., Tang, X., Tang, C.-K.: Fast, automatic and fine-grained tampered JPEG image detection via DCT coefficient analysis. Pattern Recognit. 42(11), 2492–2501 (2009) 11. Melekhov, I., Kannala, J., Rahtu, E.: Siamese network features for image matching. In: 2016 23rd International Conference on Pattern Recognition (ICPR), pp. 378– 383 (2016) 12. Popescu, A.C., Farid, H.: Exposing digital forgeries by detecting duplicated image regions (2020) 13. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) 14. Stamm, M.C., Wu, M., Ray Liu, K.J.: Information forensics. An overview of the first decade. IEEE Access 1, 167–200 (2013) 15. Valaitis, V., Marcinkevicius, V., Jurevicius, R.: Learning aerial image similarity using triplet networks. In: Sergeyev, Y.D., Kvasov, D.E. (eds.) NUMTA 2019. LNCS, vol. 11974, pp. 195–207. Springer, Cham (2020). https://doi.org/10.1007/ 978-3-030-40616-5 15 16. Wang, J., et al.: Learning fine-grained image similarity with deep ranking. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2014, Columbus, OH, USA, 23–28 June 2014, pp. 1386–1393. IEEE Computer Society (2014) 17. Wu, J., Ye, Y., Chen, Y., Weng, Z.: Spot the difference by object detection (2018) 18. Yuan, X., Liu, Q., Long, J., Lei, H., Wang, Y.: Deep image similarity measurement based on the improved triplet network with spatial pyramid pooling. Inf. 10(4), 129 (2019) 19. Zagoruyko, S., Komodakis, N.: Learning to compare image patches via convolutional neural networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2015 20. Zhou, P., Han, X., Morariu, V.I., Davis, L.S.: Learning rich features for image manipulation detection. CoRR, abs/1805.04953 (2018)

Crossing Number Features: From Biometrics to Printed Character Matching

Pauline Puteaux1 and Iuliia Tkachenko2(B)

1 LIRMM, Université de Montpellier, CNRS, Montpellier, France
[email protected]
2 LIRIS, Université Lumière Lyon 2, CNRS, Lyon, France
[email protected]

Abstract. Nowadays, the security of both digital and hard-copy documents has become a real issue. As a solution, numerous integrity check approaches have been designed. The challenge lies in finding features which are robust to the print-and-scan process. In this paper, we propose a new method of printed-and-scanned character matching based on the adaptation of biometric features. After the binarization and the skeletonization of a character, feature points are extracted by computing crossing numbers. The feature point set can then be smoothed to make it more suitable for template matching. Various experimental results show that an accuracy of more than 95% is achieved for print-and-scan resolutions of 300 dpi and 600 dpi. We have also highlighted the feasibility of the proposed method in the case of a double print-and-scan operation. The comparison with a state-of-the-art method shows that the generalization of the proposed matching method is possible when different fonts are used.

Keywords: Printed document · Feature extraction · Print-and-scan process · Crossing numbers · Matching method

1 Introduction

Due to the broad availability of professional image editing tools, cheap scanning devices and advances in high-quality printing technologies, there is a pressing need for fast, reliable and cost-efficient document authentication techniques. The current pandemic situation forces people and administrative centers to use digital copies of hard-copy documents. Official hard-copy documents have specific security elements that can be efficiently used for document authentication, such as moiré patterns, holograms, or specific inks [24]. Nevertheless, the digital copies of such documents can be easily tampered with using image editing tools (like Photoshop or Gimp) [2] or using novel deep learning approaches [18,27,28]. That is why there is a strong need for efficient and robust solutions for printed-and-scanned document integrity check.


For the described situations, we need to work with documents in two formats (hard-copy and electronic soft-copy). This type of document is called a hybrid document [6]. For hybrid documents, the integrity check must work identically for both soft-copy and hard-copy versions. That means that if the document was printed and captured several times without tampering, the integrity check should label it as authentic. One of the first hybrid protection systems was presented in [25], where it was proposed to construct hash digests for the text in electronic and printed documents. The technique, based on Optical Character Recognition (OCR) software and a classical cryptographic message authentication code, gave good performance. Later it was shown that OCR software cannot give stable results due to the impact of the Print-and-Scan (P&S) process [8,22]. In this paper, we do not aim to improve the accuracy and stability of OCR methods; instead, we would like to find features extracted from character skeletons that are robust to the P&S impact and can be used for character representation. These features are then matched with a template in order to identify a character. We consider this work as a first step towards the construction of a text fuzzy hash that can then easily be stored in a high-capacity barcode and integrated into documents (or stored in a database) as a document representation. The rest of the paper is organized as follows. We introduce the existing document authentication methods and the impact of the P&S process on hard-copy and printed-and-scanned documents in Sect. 2. The proposed feature extraction method as well as the proposed matching methods are presented in Sect. 3. We show the experimental results in Sect. 4. Several future paths are discussed in Sect. 5. Finally, we conclude in Sect. 6.

2 Challenges of Printed Document Protection

There exist several approaches for hard-copy document authentication. The first one is a forensics approach that aims at identifying the printer and scanner [4] that were used to produce a given hard-copy document and its scanned version. Here, specific features are extracted from the printed characters according to different techniques, such as the gray-level co-occurrence matrix [13,14], noise energy, contour roughness and the average gradient of character edges [19]. These features are then classified using different machine learning methods (LDA, SVM, etc.) in order to identify the printer. In the last few years, some forensics methods based on deep learning have appeared [9,15]. In [15], the authors presented human-interpretable extensions of forensics algorithms that can assist human experts in understanding the forensics results. This approach cannot be used for hybrid document authentication as we cannot control the printer and scanner used, and thus the forensics features cannot ensure the authenticity of a hybrid document. The second approach aims to add a specific copy-sensitive code to the document that is used to detect unauthorized duplication of the document [17,23]. These solutions take advantage of the stochastic nature of the Print-and-Scan (P&S) process. Nevertheless, these copy-sensitive codes can only distinguish between the first print and all subsequent re-prints of the document. Thus, this approach cannot be extended to hybrid document authentication either.


However, if such a code has a high storage capacity, as in [23], it could be used for document hash storage. The third approach works with hybrid documents [6]. The contributors of this approach introduced the notion of stability in the document processing domain. The main idea is to separate the document into primary elements such as images [5], text [8], layout [7] and tables [1], and to represent these elements by stable features. These stable features remain unchanged when the document is printed-and-scanned using different resolutions. The text integrity check can also be done by another approach that consists in constructing a document hash using a specific feature code extracted from each character [21]. The authors show that the proposed solution can resist affine transformations, JPEG compression and low-level noise, but is not robust to median filtering. Specific features based on the character skeleton can also be used for character recognition [12]. The authors reported recognition results comparable to those obtained with a deep learning approach. In [22], the authors suggest using PCA for character feature extraction and a minimal Euclidean distance for character recognition. The main problem of this approach is the extraction of correct bounding boxes and stable features, as the P&S process impacts the shape and color of the printed characters. In addition, this machine learning based method cannot be generalized; it is thus necessary to re-train the model for each font type. Images after the P&S process are affected by noise, blur and other changes [20,29]. Therefore, the P&S communication channel is always characterized by a loss of information. The loss can be minimal and imperceptible to the naked eye, but it is significant for an authentication test or integrity check. When a soft-copy document is printed and scanned, some noise is added by the printer and the scanner. Therefore, if a hard-copy document is scanned and re-printed, it suffers from a double impact of the P&S process. In the general case, the document can be scanned and reprinted several times. Nevertheless, the most realistic situations are: 1) one P&S operation, when the soft-copy was printed and then scanned or captured with a camera by a person before sending it to an authority center; 2) double P&S, when the hard-copy was scanned and then reprinted by an authority center (or a person). That is why, in this paper, we work with characters printed once (P&S) or printed twice (double P&S).

3 Proposed Method

When a hard-copy document is scanned several times, a slightly different document image is obtained each time [31] due to the optical characteristics of the capture devices. A similar problem exists in biometrics: it is known that an enrolled fingerprint differs in several ways from the stored fingerprint template. In this work, we want to adapt biometric features for character feature extraction and matching with a template.


3.1 Pre-processing Operations

The pre-processing steps of a fingerprint matching process consist of binarization and thinning (or skeletonization). As the P&S process strongly impacts the character shape, we need to apply some morphological operations before the binarization step in order to fill the holes (which appear due to the inhomogeneous spread of ink during printing, or to quantization and compression operations during scanning). In this paper, we do not focus on the search for the best pre-processing operations; we use 1) an opening operation to correct possible errors of the P&S process, 2) the classical Otsu binarization method [16] and 3) the classical thinning method based on the medial axis transform [11].
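A minimal sketch of these three steps is given below using scikit-image; it substitutes skimage's skeletonize for the Matlab medial-axis thinning actually used by the authors (see Sect. 4.1), so it should be read as an approximation of the pipeline rather than the exact implementation.

```python
from skimage import io, color
from skimage.filters import threshold_otsu
from skimage.morphology import opening, square, skeletonize

def preprocess_character(path, open_size=3):
    """Opening, Otsu binarization and thinning, mirroring the three steps above."""
    img = io.imread(path)
    gray = color.rgb2gray(img) if img.ndim == 3 else img / 255.0   # assumes 8-bit input
    smoothed = opening(gray, square(open_size))   # grayscale opening fills small bright holes in the ink
    binary = smoothed < threshold_otsu(smoothed)  # ink (dark) pixels become True
    return skeletonize(binary)                    # one-pixel-wide skeleton of the character
```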

3.2 Feature Extraction

The feature extraction is done from the skeleton image of a character, in digital (during the template construction phase) or printed-and-scanned form. We analyze the pixels of the binary image considering their neighborhood. Two pixel values are possible: 0 for black and 1 for white (the skeleton is represented by white pixels). In order to extract significant feature points, we compute the crossing numbers [30]. For each skeleton pixel, the associated crossing number is defined as half of the sum of absolute differences between pairs of adjacent pixel values taken in circular order around the pixel, i.e. CN = (1/2) * sum(|P_i - P_(i+1)|, i = 1..8), with P_9 = P_1, where P_1, ..., P_8 are the 8 neighbors. Depending on the value of its associated crossing number (CN), five pixel types can be defined, as presented in Fig. 1: CN = 0 (isolated point), CN = 1 (ending point), CN = 2 (connective point), CN = 3 (bifurcation point) and CN = 4 (crossing point).

Fig. 1. Five pixel types as a function of their associated crossing number (the centered pixel framed in red is the pixel of interest). (Color figure online)
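The crossing number computation itself is straightforward; the sketch below applies the definition above to every skeleton pixel and keeps the isolated, ending, bifurcation and crossing points as feature points, in line with the choices discussed next. Function names are illustrative.

```python
import numpy as np

# Offsets of the 8 neighbours of a pixel, in circular order.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
              (1, 1), (1, 0), (1, -1), (0, -1)]

def crossing_number_points(skeleton):
    """Return (row, col, CN) for every skeleton pixel, where
    CN = 0.5 * sum_i |P_i - P_{i+1}| over the circular 8-neighbourhood."""
    skel = (np.asarray(skeleton) > 0).astype(int)
    points = []
    for r in range(1, skel.shape[0] - 1):
        for c in range(1, skel.shape[1] - 1):
            if not skel[r, c]:
                continue
            p = [skel[r + dr, c + dc] for dr, dc in NEIGHBOURS]
            cn = sum(abs(p[i] - p[(i + 1) % 8]) for i in range(8)) // 2
            points.append((r, c, cn))
    return points

def feature_points(skeleton):
    """Keep CN = 0 (isolated), 1 (ending), 3 (bifurcation), 4 (crossing);
    CN = 2 marks ordinary connective pixels and is discarded."""
    return [(r, c, cn) for r, c, cn in crossing_number_points(skeleton) if cn != 2]
```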

In biometrics, only ending and bifurcation points are generally considered for minutiae extraction from fingerprints. However, in the case of printed characters, it is interesting to consider isolated and crossing points in addition. Indeed, isolated points are relevant for the characters 'i' and 'j', and crossing points for 'x'. Furthermore, for some characters, serifs can be present. A serif is a small line or stroke attached to the end of a longer stroke in a character. Serifs can induce the extraction of additional feature points. Indeed, in the case of a serif, there


are a bifurcation point and one or two close ending points. These last extracted features are not significant. Therefore, we propose a smoothed version (S) of the feature extraction step, which consists in removing these ending points from the feature point set. In addition, serifs are not present in all fonts, so the smoothed version of the features is better adapted to the generalization of the matching methods. In order to perform the smoothing operation, we compute the Euclidean distance between a bifurcation point and each extracted ending point. If this distance is lower than a threshold th, the ending point is considered to be part of a serif and is removed from the feature point set. Note that the value of the threshold th is fixed experimentally and depends on the template database. The practical interest of this smoothing operation is illustrated in Sect. 4.2 and Fig. 4.
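A sketch of this smoothing step is shown below; note that the text only describes removing the serif ending points, while the example of Fig. 4 suggests that the type of the adjacent bifurcation point may also change afterwards, a detail this sketch does not attempt to reproduce.

```python
import math

def smooth_feature_points(points, th=15):
    """Remove ending points (CN == 1) lying within distance `th` of a bifurcation
    point (CN == 3), since such endings typically belong to serifs."""
    bifurcations = [(r, c) for r, c, cn in points if cn == 3]
    smoothed = []
    for r, c, cn in points:
        if cn == 1 and any(math.hypot(r - br, c - bc) < th for br, bc in bifurcations):
            continue                               # serif ending: drop it
        smoothed.append((r, c, cn))
    return smoothed
```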

3.3 Template Matching

In order to compare a printed-and-scanned character with the digital template, we test three different matching methods:
– M1: For each feature point extracted from the printed-and-scanned character, we look for the closest (in terms of Euclidean distance between coordinates) reference point in the template of the digital character. Therefore, two extracted feature points can be associated with the same reference point. We then average the distances computed for each extracted feature point.
– M2: For each reference point in the template of the digital character, we look for the closest (in terms of Euclidean distance between coordinates) feature point extracted from the printed-and-scanned character. We then average the distances computed for each reference point.
– M3: Same as M1, but we only consider the reference points which have the same crossing number type as the extracted feature point to compare. Moreover, we focus on templates of digital characters which have approximately the same number of reference points as the number of features extracted from the printed-and-scanned character (equal numbers ±2).
The template of the digital character which has the smallest difference score with the printed-and-scanned character gives us the associated letter of the alphabet. The performance of each of these methods is discussed in Sect. 4.
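The following sketch illustrates the distance computation behind M1 and the additional constraints that turn it into M3 (type-consistent matching and the ±2 filter on the number of points); M2 is obtained by swapping the roles of the query and the template points, and the function names are illustrative.

```python
import math

def avg_nearest_distance(query_points, template_points, match_type=False):
    """M1-style score: average, over the extracted feature points, of the distance
    to the closest template reference point (restricted to the same CN type for M3)."""
    if not query_points:
        return float("inf")
    total = 0.0
    for r, c, cn in query_points:
        candidates = [(tr, tc) for tr, tc, tcn in template_points
                      if not match_type or tcn == cn]
        if not candidates:
            return float("inf")
        total += min(math.hypot(r - tr, c - tc) for tr, tc in candidates)
    return total / len(query_points)

def recognize_m3(query_points, templates, tol=2):
    """M3: only templates whose number of reference points is within +/- tol of the
    query are considered; the letter with the smallest score is returned."""
    best_letter, best_score = None, float("inf")
    for letter, template_points in templates.items():
        if abs(len(template_points) - len(query_points)) > tol:
            continue
        score = avg_nearest_distance(query_points, template_points, match_type=True)
        if score < best_score:
            best_letter, best_score = letter, score
    return best_letter
```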

4 Experimental Results

In this section, we present the database used and the pre-processing steps performed. Then, we show the features extracted for matching, and we discuss the results obtained for character images printed-and-scanned once with 300 dpi and 600 dpi resolutions.

4.1 Database Description

In our database1, we have 10 images per lower-case character with the Times New Roman font, which gives us in total 260 images per P&S resolution. All images are of size 100 × 100 pixels with the character centered in the image. For images printed-and-scanned with 300 dpi, we apply a ×2 resize before centering the characters in the images. We illustrate some images from our database in Fig. 2. We can notice that there are small imperfections for characters printed with 300 dpi in comparison with those printed with 600 dpi or with the digital samples. In addition, the printed characters are in grayscale. Thus, we need to do some pre-processing before the skeleton extraction process.

Fig. 2. The letters from our database: a) digital sample, b) sample printed and scanned with 300 dpi, c) sample printed and scanned with 600 dpi.

After several experiments, we found that the best results for our database are obtained using a morphological opening with a square structuring element of size 3 × 3. After this operation, the character color is more homogeneous and the binarization process works better. We binarize the characters using classical Otsu binarization. After the binarization, the skeletonization is done using the built-in Matlab function (bwskel) that uses the medial axis transform based on the thinning algorithm introduced in [11]. This function uses 4-connectivity for 2-D images and gives us sufficient results for character skeleton extraction. Examples of skeletons extracted using this function are illustrated in Fig. 3.

Fig. 3. The skeletons extracted from characters: a) digital sample, b) sample printed and scanned with 300 dpi, c) sample printed and scanned with 600 dpi.

1 The database is available on demand. Contact: [email protected].


From Fig. 3, we notice that the skeleton is less noisy in the case of the digital sample and the sample printed and scanned with 600 dpi. In addition, several strokes detected in the samples of Fig. 3b–c are not present in the template (Fig. 3a). Therefore, we have decided to compare the proposed matching methods using non-smoothed and smoothed features.

4.2 Feature Extraction

The comparison of non-smoothed and smoothed features is illustrated in Fig. 4. We present the features extracted from the skeletons of both the digital and the printed-and-scanned (600 dpi) versions of the 'v' character. The feature points are displayed in red for better visualization. Moreover, the coordinates and type of each point are also indicated. From the skeleton of the digital character, we extract four feature points: one bifurcation point (CN = 3) and three ending points (CN = 1). From the skeleton of the printed-and-scanned character, the number of feature points is larger. Indeed, there are three bifurcation points and five ending points. However, due to the presence of serifs, some feature points are not relevant for comparison and matching. Using the smoothing operation described in Sect. 3.2, with a threshold th = 15 according to our experiments on the template database, they are removed from the feature point set. One can then note that, after this operation, the same number of feature points is obtained for the digital character and the printed-and-scanned version. In addition, we can remark that their coordinates are quite close to each other, which is an important property to ensure a good match.

Digital (template): non-smoothed: [(28, 30, 1), (28, 70, 1), (60, 51, 3), (67, 51, 1)]; smoothed: [(28, 30, 1), (28, 70, 1), (60, 51, 1)]
P&S 600 dpi: non-smoothed: [(28, 29, 1), (28, 43, 1), (28, 60, 1), (29, 68, 1), (31, 65, 3), (32, 36, 3), (61, 49, 3), (68, 49, 1)]; smoothed: [(32, 36, 1), (31, 65, 1), (61, 49, 1)]

Fig. 4. Example of skeleton extraction and crossing number extraction.

4.3 Character Matching

For the character matching experiments, we constructed a database with 26 templates of digital characters. These templates are used during the matching process in order to recognize 260 character images printed-and-scanned with 300 dpi resolution and 260 character images printed-and-scanned with 600 dpi resolution. We do not test the matching methods with digital characters as they always have the same features and, thus, are always correctly recognized. Table 1 shows the results obtained for each letter when the characters were printed-and-scanned with 300 dpi resolution. A large number of characters is matched correctly in 100% of cases even if we use non-smoothed features (see rows M1, M2, M3 in Table 1). Nevertheless, for some characters, the use of smoothed features significantly improves the matching results (for example, for the letters 'k', 'm', 'v'). The smoothed features improve the results for the characters that have additional strokes at the end points. In general, we can conclude that the matching method M3 with smoothed features gives the best matching results. This is confirmed by the mean values shown in Table 2 (Pre-processing 1). Indeed, we see that the mean correct matching rate for characters printed-and-scanned with 300 dpi resolution, obtained with the SM3 method, is equal to 95%. Table 1. Character matching rates using different crossing number comparison techniques (P&S 300 dpi). a

b

c d e

f

g

h

i

j

M1

1

0.9 1 1 0.9 1

1

1

1

0.8 1

M2

1

0.9 1 1 1

1

1

0.1 0.8 1

M3

1

0.9 1 1 0.9 1

1

0.4 1

SM1 1

0.9 1 1 0.9 1

1

1

SM2 1

0.9 1 1 1

1

1

0.1 0.7 1

SM3 1

0.9 1 1 0.9 1

1

1

k

l

m

0.5 1

0.5 0.8 0.7

0.8 0.9 0.5 0.5

0.9 0.8 1

0.5 1

0.1 0.5 0.7

0.9 0.8 1

0.5 1

n

o

p q r

s

t

u

v

w

x

y

z

M1

1

1

1 1 1

0.8 1

1

0.1 1

1

1

0.9

M2

1

1

1 1 1

0.8 1

1

0.5 0.6 1

0.9 0.9

M3

0.3 1

1 1 1

1

1

1

0.1 1

1

0.8 0.9

1

1

1

0.8 0.9

SM1 1

1

1 1 1

0.8 1

SM2 1

1

1 1 1

0.8 0.7 1

0.7 0.9 0.9 0.8 0.7

SM3 1

1

1 1 1

1

1

1

1

1 1

1

0.8 0.9

Figure 5 illustrates the results obtained for each letter when the characters were printed-and-scanned with 600 dpi resolution. From these results, the same conclusions can be drawn: the matching method M3 with smoothed features (SM3)


gives the best matching results. The mean correct matching rate with the SM3 method is equal to 97.31% (see Table 2, Pre-processing 1). We can conclude that the proposed feature extraction and matching methods work better for these images as the resolution, and thus the image quality, is higher. We only need to improve the results for the letters 'b', 'e', 'o' and 'z' by adjusting the skeletonization step. These results are very promising and show that we can extract stable features and construct a text hash.

Fig. 5. Character matching rates using different crossing number comparison techniques (P&S 600 dpi): a) non-smoothed methods, b) smoothed methods.

5 Discussion

The results presented in the previous section show that the suggested method works well for character images printed-and-scanned using different resolutions. Nevertheless, characters after a double P&S have more distortions [22]. Therefore, when we tested our matching methods with images after a double P&S process, we obtained the results presented in Table 2 under the title “Pre-processing 1”. We notice that the recognition results for characters after double


P&S are drastically lower than the results after a single P&S with 300 dpi and 600 dpi, and equal to 74.23%. Analyzing these results, we visualized the character images after the pre-processing operations (introduced in Sect. 4) in Fig. 6b. From this illustration, we can conclude that the suggested pre-processing operations are not adapted to character images after a double P&S process.

Fig. 6. Examples of a character: a) after double P&S, b) after “pre-processing 1” used in Sect. 4.1, c) after “pre-processing 2” presented in this section.

The double P&S process changes the character shape (see Fig. 6a) due to the scanner quantization and compression steps. Therefore, we need to apply a specific pre-processing to improve the character shape. In order to fill the holes in the images (see Fig. 6b), we use a 2 × 2 open-close operation. This pre-processing step significantly improves the character shape and makes the color more homogeneous (see Fig. 6c). After these morphological operations, the thinning operation works better and the matching results are significantly improved (see Table 2 under the title “Pre-processing 2”).

Table 2. Percentage of correctly recognized characters using the suggested crossing number comparison techniques.

                            M1       M2       M3       SM1      SM2      SM3
Pre-processing 1
  P&S 300 dpi               91.92%   86.54%   84.62%   94.23%   82.69%   95.00%
  P&S 600 dpi               95.77%   90.77%   73.46%   97.31%   86.15%   97.31%
  double P&S 600 dpi        66.92%   65.77%   54.23%   73.08%   66.54%   74.23%
Pre-processing 2
  P&S 300 dpi               85.38%   86.15%   83.46%   84.62%   78.08%   88.08%
  P&S 600 dpi               89.62%   88.85%   90.00%   90.00%   85.38%   91.15%
  double P&S 600 dpi        78.08%   75.38%   76.54%   80.00%   70.77%   83.46%

Table 2 shows that the correctly used pre-processing operations can significantly improve the recognition results: the recognition rate improves up to 83.46%. Nevertheless, the recognition results of characters printed once drop off from


95–97% to 88–91%. Thus, for the moment, we have not found a unique set of pre-processing operations for both characters printed once and characters printed twice. This study helps us to identify several future paths:
– Find a unique set of pre-processing operations for characters printed once and twice. We can test image noise reduction and image sharpening methods, or even try to perform the pre-processing step using a deep learning approach [3].
– Find a stable skeleton extraction method that ensures the stable extraction of the proposed features. For this, we can, for example, extract the noise-resistant digital Euclidean connected skeletons introduced in [10].
– Process the text by lines in order to avoid the problems with bounding box extraction. Indeed, we can extract the same features line by line and then use a fuzzy hash function for text hashing. This solution can be similar to the sketchprint presented in [26].
Finally, we compare our matching method with the PCA-based approach from [22]. The results are reported in Table 3, where the abbreviations F1F1 and F2F2 mean that training and testing were done using the same font (F1: Times New Roman, F2: Arial), and the abbreviation F1F2 means that F1 was used for training and F2 for testing. We note that the PCA approach gives slightly better results when the training and testing sets come from the same font. Nevertheless, the generalization is better using our proposed matching method SM3: the matching accuracy is 17–20% higher in the case of P&S 300 dpi and P&S 600 dpi, and it is 8% higher in the case of double P&S with “pre-processing 2”.

Table 3. Comparison of the proposed SM3 matching technique with the PCA-based approach from [22].

                            Proposed matching SM3         PCA approach [22]
                            F1F1     F1F2     F2F2        F1F1     F1F2     F2F2
Pre-processing 1
  P&S 300 dpi               95%      85.77%   85.38%      99.62%   66.54%   94.62%
  P&S 600 dpi               97.31%   82.69%   85%         91.15%   65.00%   97.69%
  double P&S 600 dpi        74.23%   70.00%   68.08%      96.54%   76.15%   91.15%
Pre-processing 2
  P&S 300 dpi               88.08%   86.15%   84.62%      94.23%   66.92%   95.00%
  P&S 600 dpi               91.15%   86.54%   85.38%      97.31%   66.54%   98.08%
  double P&S 600 dpi        83.46%   83.84%   78.85%      90.77%   76.15%   90.77%

From the results in Table 3 and from the feature extraction method, we can conclude that the proposed matching methods are better suited to fuzzy text hash construction than the PCA-based method.

6 Conclusions

Nowadays, the use of both soft-copy and hard-copy documents is increasing significantly. At the same time, due to the accessibility of editing tools and printing/capturing devices, as well as the improvements of deep learning techniques, the number of document counterfeits increases each year. To fight document counterfeiting, it is important to design novel integrity check systems that work well for both versions of documents (soft- and hard-copies). In this paper, we have proposed crossing-number-based features for printed character matching. These features are extracted from the character skeletons. We have explored the use of non-smoothed and smoothed features. The smoothed features show more stable and higher accuracy results of 95% and 97.31% for characters printed with 300 dpi and 600 dpi, respectively. Additional experiments were done for characters printed-and-scanned twice. When the pre-processing operations are chosen correctly, the matching accuracy for double P&S characters reaches 83.46%. In addition, it was shown that the proposed matching methods can be generalized when different fonts are used for template construction and matching. In the future, we would like to modify the pre-processing operations (noise reduction and skeletonization) in order to improve the matching results and to construct a compact text hash using the proposed features for printed document integrity check.

References 1. Alh´eriti`ere, H., Cloppet, F., Kurtz, C., Ogier, J.M., Vincent, N.: A document straight line based segmentation for complex layout extraction. In: IAPR International Conference on Document Analysis and Recognition (ICDAR), vol. 1, pp. 1126–1131. IEEE (2017) 2. Artaud, C., Sid`ere, N., Doucet, A., Ogier, J.M., Poulain D’Andecy, V.: Find it! Fraud detection contest report. In: International Conference on Pattern Recognition (ICPR), pp. 13–18. IEEE (2018) 3. Bui, Q.A., Mollard, D., Tabbone, S.: Selecting automatically pre-processing methods to improve OCR performances. In: IAPR International Conference on Document Analysis and Recognition (ICDAR), vol. 1, pp. 169–174. IEEE (2017) 4. Chiang, P.J., et al.: Printer and scanner forensics: models and methods. In: Sencar, H.T., Velastin, S., Nikolaidis, N., Lian, S. (eds.) Intelligent Multimedia Analysis for Security Applications, pp. 145–187. Springer, Heidelberg (2010). https://doi. org/10.1007/978-3-642-11756-5 7 5. Eskenazi, S., Bodin, B., Gomez-Kr¨ amer, P., Ogier, J.M.: A perceptual image hashing algorithm for hybrid document security. In: IAPR International Conference on Document Analysis and Recognition (ICDAR), vol. 1, pp. 741–746. IEEE (2017) 6. Eskenazi, S., Gomez-Kr¨ amer, P., Ogier, J.-M.: When document security brings new challenges to document analysis. In: Garain, U., Shafait, F. (eds.) IWCF 2012/2014. LNCS, vol. 8915, pp. 104–116. Springer, Cham (2015). https://doi. org/10.1007/978-3-319-20125-2 10 7. Eskenazi, S., Gomez-Kr¨ amer, P., Ogier, J.M.: The Delaunay document layout descriptor. In: Symposium on Document Engineering, pp. 167–175. ACM (2015)


8. Eskenazi, S., Gomez-Kr¨ amer, P., Ogier, J.M.: A study of the factors influencing OCR stability for hybrid security. In: IAPR International Conference on Document Analysis and Recognition (ICDAR), vol. 9, pp. 3–8. IEEE (2017) 9. Ferreira, A., et al.: Data-driven feature characterization techniques for laser printer attribution. IEEE Trans. Inf. Forensics Secur. 12(8), 1860–1873 (2017) 10. Leborgne, A., Mille, J., Tougne, L.: Noise-resistant digital Euclidean connected skeleton for graph-based shape matching. J. Visual Commun. Image Representation 31, 165–176 (2015) 11. Lee, T.C., Kashyap, R.L., Chu, C.N.: Building skeleton models via 3-D medial surface axis thinning algorithms. CVGIP: Graph. Models Image Process. 56(6), 462–478 (1994) 12. Lipkina, A., Mestetskiy, L.M.: Grapheme approach to recognizing letters based on medial representation. In: International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 4: VISAPP, pp. 351–358 (2019) 13. Mikkilineni, A.K., Khanna, N., Delp, E.J.: Texture based attacks on intrinsic signature based printer identification. In: Media Forensics and Security, vol. 7541, p. 75410T. International Society for Optics and Photonics (2010) 14. Mikkilineni, A.K., Khanna, N., Delp, E.J.: Forensic printer detection using intrinsic signatures. In: Media Watermarking, Security, and Forensics, vol. 7880, p. 78800R. International Society for Optics and Photonics (2011) 15. Navarro, L.C., Navarro, A.K., Rocha, A., Dahab, R.: Connecting the dots: toward accountable machine-learning printer attribution methods. J. Vis. Commun. Image Representat. 53, 257–272 (2018) 16. Otsu, N.: A threshold selection method from gray-level histograms. IEEE Trans. Syst. Man Cybern. 9(1), 62–66 (1979) 17. Picard, J.: Digital authentication with copy-detection patterns. In: Electronic Imaging, pp. 176–183. International Society for Optics and Photonics (2004) 18. Roy, P., Bhattacharya, S., Ghosh, S., Pal, U.: STEFANN: scene text editor using font adaptive neural network. In: Conference on Computer Vision and Pattern Recognition (CVPR), pp. 13228–13237. IEEE/CVF (2020) 19. Shang, S., Memon, N., Kong, X.: Detecting documents forged by printing and copying. EURASIP J. Adv. Sig. Process. 2014(1), 1–13 (2014) 20. Solanki, K., Madhow, U., Manjunath, B.S., Chandrasekaran, S., El-Khalil, I.: Print and scan resilient data hiding in images. IEEE Trans. Inf. Forensics Secur. 1(4), 464–478 (2006) 21. Tan, L., Sun, X.: Robust text hashing for content-based document authentication. Inf. Technol. J. 10(8), 1608–1613 (2011) 22. Tkachenko, I., Gomez-Kr¨ amer, P.: Robustness of character recognition techniques to double print-and-scan process. In: IAPR International Conference on Document Analysis and Recognition (ICDAR), vol. 09, pp. 27–32 (2017) 23. Tkachenko, I., Puech, W., Destruel, C., Strauss, O., Gaudin, J.M., Guichard, C.: Two-level QR code for private message sharing and document authentication. IEEE Trans. Inf. Forensics Secur. 11(3), 571–583 (2016) 24. Van Renesse, R.L.: Optical document security. Appl. Opt. 13528, 5529–34 (1996) 25. Vill´ an, R., Voloshynovskiy, S., Koval, O., Deguillaume, F., Pun, T.: Tamperproofing of electronic and printed text documents via robust hashing and datahiding. In: Security, Steganography, and Watermarking of Multimedia Contents, vol. 6505, p. 65051T. International Society for Optics and Photonics (2007)


26. Voloshynovskiy, S., Diephuis, M., Holotyak, T.: Mobile visual object identification: from SIFT-BoF-RANSAC to sketchprint. In: Media Watermarking, Security, and Forensics 2015, vol. 9409, p. 94090Q. International Society for Optics and Photonics (2015) 27. Wu, L., et al.: Editing text in the wild. In: International Conference on Multimedia, pp. 1500–1508. ACM (2019) 28. Yang, Q., Huang, J., Lin, W.: SwapText: image based texts transfer in scenes. In: Conference on Computer Vision and Pattern Recognition (CVPR), pp. 14700– 14709. IEEE/CVF (2020) 29. Yu, L., Niu, X., Sun, S.: Print-and-scan model and the watermarking countermeasure. Image Vis. Comput. 23(9), 807–814 (2005) 30. Zhao, F., Tang, X.: Preprocessing and postprocessing for skeleton-based fingerprint minutiae extraction. Pattern Recogn. 40(4), 1270–1281 (2007) 31. Zhu, B., Wu, J., Kankanhalli, M.S.: Print signatures for document authentication. In: Conference on Computer and Communications Security, pp. 145–154. ACM (2003)

Writer Characterization from Handwriting on Papyri Using Multi-step Feature Learning

Sidra Nasir(B), Imran Siddiqi, and Momina Moetesum

Department of Computer Science, Bahria University, Islamabad, Pakistan
{imran.siddiqi,momina.buic}@bahria.edu.pk

Abstract. Identification of scribes from historical manuscripts has remained an equally interesting problem for paleographers as well as pattern classification researchers. Though significant research endeavors have been made to address the writer identification problem in contemporary handwriting, the problem remains challenging when it comes to historical manuscripts, primarily due to the degradation of documents over time. This study targets scribe identification from ancient documents using Greek handwriting on papyri as a case study. The technique relies on segmenting the handwriting from the background and extracting keypoints which are likely to carry writer-specific information. Using the handwriting keypoints as centers, small fragments (patches) are extracted from the image and are employed as units of feature extraction and subsequent classification. Decisions from the fragments of an image are then combined to produce image-level decisions using a majority vote. Features are learned using a two-step fine-tuning of convolutional neural networks, where the models are first tuned on contemporary handwriting images (a relatively larger dataset) and later tuned on the small set of writing samples under study. The preliminary findings of the experimental study are promising and establish the potential of the proposed ideas in characterizing the writer from a challenging set of writing samples.

Keywords: Scribe identification · FAST keypoints · ConvNets · IAM Dataset · Greek Papyrus

1 Introduction

The recent years have witnessed an increased acceptability of digital solutions by handwriting experts [1], thanks primarily to the success of joint projects between paleographers and computer scientists. Handwriting develops and evolves over years and, in addition to being the most common form of non-verbal communication, it also carries rich information about the individual producing the writing. Furthermore, geographical, cultural and social attributes are also reflected in the writing style as well as its evolution. These present interesting research


avenues for paleographers, who are typically interested in identifying the scribe, place and origin of a manuscript, or in attesting to the authenticity of a document. The digitization of ancient documents has increased significantly in the last few decades [2,3] and serves multiple purposes. On the one hand, digitization serves to preserve the rich cultural heritage and make it publicly available, while on the other hand, such digitized collections offer a spectrum of challenges to the pattern classification community [4]. Among such digitization projects, the prominent ones include Madonne [4], the International Dunhuang Project (IDP) [5], and NAVIDOMASS (NAVIgation in Document MASSes) [6]. In addition to digitization, such projects also include the development of computerized tools for paleographers, facilitating tasks like keyword spotting, transcription and retrieval. The key idea of such systems is to assist rather than replace the human experts. These semi-automatic tools narrow down the search space to enable the experts to focus on a limited set of samples for in-depth and detailed analysis [7]. The SPI (System for Paleographic Inspection) tool [8], for example, has been developed to assist paleographers in classifying and identifying scripts and in studying the morphology of writing strokes. Exploiting the idea that similar strokes are likely to be produced in similar temporal and geographical circumstances, such systems allow paleographers to infer useful information about the origin of a manuscript. Furthermore, the notion of similarity can also be effectively employed to conclude the authenticity of a document and/or identify its scribe. Identification of the scribe can also serve to implicitly estimate the date and origin of a manuscript by correlating it with a scribe's active time period [9]. Scribe identification from historical manuscripts is also the subject of our current study. As opposed to contemporary documents, identification of the writer from historical manuscripts poses a number of challenges, mainly due to the degradation of documents over time. Typical challenges include the removal of noise, the segmentation of handwriting (from the background) and the extraction of writing components for feature extraction and subsequent classification, hence hampering the direct application of established writer identification techniques to historical documents. In addition, the writing instruments as well as the writing mediums have also witnessed an evolution over time, from stone to papyrus, parchment, paper, etc., and must be taken into account when designing computational features to characterize the writer. Identification of writers from handwriting typically relies on capturing the writing style, which has been validated to be unique for an individual [10]. The writing style can be captured from computational features either at the global (page or paragraph) or at the local (words, characters or graphemes) level. At the global level, for instance, textural features have been widely employed to capture the writing style of the writer [11–13]. Likewise, at relatively smaller scales of observation, low-level statistical features are typically computed from parts of handwriting like characters or graphemes [10]. Another well-known and effective


technique is to characterize the writer by the probability of producing certain writing patterns, i.e. the codebook [14,15], which is similar in many aspects to the bag-of-words model. With the recent paradigm shift from hand-engineered to machine-learned features, feature learning using (deep) convolutional neural networks has also been investigated for writer identification [16]. This paper investigates the problem of writer characterization from handwritten scripts written on papyrus and extends our previous work presented in [17]. Handwriting images are first pre-processed and key points in the writing are identified. Subsequently, small fragments of handwriting are extracted using the key points as centers. These fragments are then employed to extract features using ConvNets. Since the dataset under study is scarce and the existing pre-trained CNN models are mostly trained on images very different from handwriting, a two-step fine-tuning of CNNs is employed. The networks are first tuned on a large collection of contemporary handwriting images and then tuned on the papyrus images. Writer identification decisions for patches are combined into image-level decisions using a majority vote. Experiments are carried out on the GRK-Papyri [18] dataset and the performance of keypoint-based sampling is compared with that of dense sampling. We organize this paper as follows. Section 2 discusses the recent notable studies on similar problems, followed by the introduction of the dataset and the proposed methods in Sect. 3. Section 4 introduces the experimental protocol and the reported results along with the accompanying discussion. Finally, we summarize the key findings of our research and conclude the paper in Sect. 5.

2 Related Work

Development of computerized solutions for analysis of ancient documents and handwriting has received a renewed interest of the document analysis and handwriting recognition community in the recent years [19–22]. From the view point of writer characterization, the key challenge is to identify the discriminative features that allow capturing the writer-specific writing style of an individual. Another important design choice is the scale of observation at which the handwriting is analyzed and computational features are extracted that may vary from a complete page to partial words or small fragments in writing. As discussed previously, there has been an increased tendency to learn features from data (samples under study) rather than designing the features for different classification tasks. The problem of writer identification has also witnessed a similar trend where features learned using ConvNets are known to outperform the conventional hand-crafted features. Hence, our discussion will be more focused on machine learning-based techniques for writer identification. The comprehensive reviews covering writer identification can be found in [23,24].


Among significant recent contributions to identification of writers, He et al. [25] proposed a deep neural network (FragNet) to characterize writer from limited handwriting (single words or small blocks). The proposed network comprises of two pathways, a feature pyramid that maps the input samples to feature maps and the classification pathway to predict the writer identity. The method was evaluated on four different datasets (IAM, CVL, Firemaker & CERUG-ER) and reported promising identification rates. In another study, Kumar et al. [26] proposed a CNN model to identify writers from handwriting in Indic languages and carried out evaluations on word as well as document levels. Tang and Wu [27] employed CNN with joint Bayesian technique for writer identification. Data augmentation is applied to generate multiple samples for each writer and the CNN is used as a feature extractor. The experimental study was carried out on the CVL datasets and ICDAR2013. The Top-1 identification rates of more than 99% are achieved in different experiments. In another relevant study, pre-trained AlexNet is employed as feature extractor to identify the writer from Japanese handwritten characters [28]. Likewise, the authors in [29] employed ResNet with a semi-supervised learning approach. The unlabeled data was regulated using WLSR (Weighted Label Smoothing Regularization). Words from the CVL data set were used as labelled data, while those from the IAM dataset were used as unlabeled data in the experiment. Among other studies, Fiel et al. [30] maps the features from the handwritten images to feature vectors using ConvNet and identify writer using the k-nearest neighbor classifier. Christlein et al. [21] study was more focused on unsupervised feature learning using the SIFT descriptors and a residual network. Keglevic et al. [22] proposed the use of a triplet network for learning the similarities between handwriting patches. The network is trained by maximizing the interclass and minimizing the intra-class distances. He et al. [13] proposed a multitask learning method focused on a deep adaptive technique for the identification writers from single word handwritten images. Feature set is enhanced by reusing the features learned on auxiliary tasks. A new adaptive layer was introduced that enhanced the accuracy of the deep adaptive method as compared to simple adaptive and non-adaptive methods. Among other studies exploiting deep neural networks for writer identification, Nguyen et al. [31] present and end-to-end system to characterize writer while Vincent et al. [32] propose to employ the CNN feature maps as local descriptors which are mapped to global descriptors using the Gaussian Mixture Model supervector encoding. In a recent study, Javidi et al. [33] extend the ResNet architecture and combine the handwriting thickness information to improve the identification rates.


From the view point of historical documents, in some cases, the techniques developed for contemporary handwriting have been investigated [34]. Lai et al. [35], for instance, employ pathlet and SIFT features and encode them using a bagged VLAD technique to characterize the writer. In [36] exploit the textural information in handwriting to identify the writer from historical manuscripts. The textural information is captured through a combination of oriented Basic Image Features (oBIFs) at different scales. Classification is carried out using a number of distance metrics which are combined to arrive at a final decision and an accuracy of 77.39% on ICDAR 2017 Historical WI dataset was achieved. In another work, Chammas et al. [37] extract patches from handwriting using the SIFT descriptor and use CNN for feature extraction. The features are encoded using multi-VLAD and an exemplar SVM was employed for classification. Experimental study was carried on the ICDAR2019 HDRC-IR dataset and yielded an accuracy of 97%. Among other studies targeting historical documents, Cilia et al. [38] proposed transfer learning using pre-trained ConvNets to recognize writers of digitized images from a Bible of the XII century. The work was later extended [39] and evaluated on the medieval manuscripts (Avila Bible). Likewise, writer identification was adapted for historical documents in [40] and the same method was employed to the GRK-Papyri dataset in [18] with FAST keypoints. In another similar study, Studer et al. [41] also investigate the performance of different pretrained CNNs on different task like character recognition, dating and identification of writing style. In one of our recent works [17], we applied dense sampling (using square windows) on binarized images of handwriting and employed different pre-trained CNNs for feature extraction. Experiments on the GRK-papyri dataset reported a writer identification rate of 54%. The discussion on different writer identification methods reveals that the domain has been heavily dominated by deep learning-based system in the recent years. The literature is very rich when it comes to contemporary documents, however, analysis of historical manuscripts, calls for further research endeavors. Nonetheless, given the challenges in historical documents, automatic feature learning represents an attractive choice for writer identification as well as other related tasks.

3 Materials and Methods

This section presents in detail the proposed technique for identifying writers from papyrus handwriting. We first provide an overview of the dataset used in our study, then the details of pre-processing, sampling and classification using pre-trained CNNs. Small fragments are extracted from the handwriting and are mapped to feature vectors using a two-step fine-tuned CNN. The key steps of our study are presented in Fig. 1, and each of these steps is elaborated in the following sections.


Fig. 1. System overview

3.1 Dataset

We carry out the experimental study of our system on a papyri dataset (GRK-Papyri) [18]. The dataset comprises images of Greek handwriting on papyri from the 6th century A.D. There are a total of 50 images in the dataset, produced by 10 different scribes. Sample images from the dataset are exhibited in Fig. 2. All images are digitized as JPEGs and have varied spatial resolutions and DPIs.

Fig. 2. Sample images in the GRK-Papyri dataset [18]

A few of the images are digitized as three-channel RGB images while others are in grayscale. Typical degradations in the documents include low contrast, glass reflections and the varying papyrus fiber in the background. The number of samples per writer also varies, as summarized in Fig. 3.


Fig. 3. Distribution of samples in training and test set in the GRK-Papyri dataset

3.2 Pre-processing

The handwriting images under study contain various papyrus fiber backgrounds; hence, prior to feature extraction, we convert all images to grayscale. Furthermore, in an attempt to ensure that the learned features depend on the handwriting and not on the image background, we investigate different image pre-processing techniques prior to feature learning. These include segmentation of handwriting from the background using adaptive binarization [42], extraction of edges in writing strokes using the Canny edge detector, edge detection on adaptively binarized images and segmentation using a deep learning based binarization method, DeepOtsu [43]. The results of these different techniques are illustrated in Fig. 4.
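As a rough illustration of these pre-processing variants, the sketch below uses OpenCV and scikit-image (the DeepOtsu variant is omitted since it requires a trained model); the Sauvola window size and the Canny thresholds are illustrative assumptions, not the exact settings used in our experiments.

# Illustrative sketch of the pre-processing variants (DeepOtsu omitted); the
# Sauvola window size and Canny thresholds are assumptions, not tuned values.
import cv2
import numpy as np
from skimage.filters import threshold_sauvola

def preprocess(gray, mode="binary"):
    """Pre-process a grayscale handwriting image (uint8 array)."""
    if mode == "gray":
        return gray
    if mode == "binary":                       # adaptive (Sauvola) binarization [42]
        t = threshold_sauvola(gray, window_size=25)
        return ((gray > t) * 255).astype(np.uint8)
    if mode == "edges":                        # Canny edges of the writing strokes
        return cv2.Canny(gray, 50, 150)
    if mode == "edges_on_binary":              # edge detection on the binarized image
        return cv2.Canny(preprocess(gray, "binary"), 50, 150)
    raise ValueError(f"unknown mode: {mode}")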

Fig. 4. Output images from the various image enhancement techniques

3.3 Data Preparation

An important choice in extracting writer-specific features from handwriting is the scale of observation. As discussed earlier, page, paragraph, line, word and sub-word levels have been investigated as units of feature extraction. Since we employ fine-tuned ConvNets to extract features, resizing complete images or paragraphs to a small matrix (to match the input layer of the network) would not be very meaningful. Another approach commonly employed in such cases is to carry out a dense sampling of handwriting using small windows (patches), which are then employed for feature extraction and subsequent classification [17]. Extracting such patches, though simple, does not ensure that important writer-specific writing strokes are always preserved (Fig. 5). Since the position of windows with respect to the handwriting is random, important writing strokes could be split across multiple windows, resulting in a loss of important information. Furthermore, a number of studies have established that each writer employs a specific set of writing gestures to produce the strokes and these redundant patterns can be exploited to characterize the writer [14,15].

Fig. 5. Patches produced using dense sampling with square windows

These patterns can also be shared across different characters as illustrated in Fig. 6 where high morphological similarity can be observed among fragments across different characters.

Fig. 6. Morphologically similar fragments across different characters: (a) characters g, y; (b) characters h, f, k


To extract the potentially discriminative fragments in handwriting, we apply keypoint detection on the binarized images of handwriting. Subsequently, using these keypoints as centers, small fragments around them are extracted and are employed as units of feature extraction. Similar ideas have been presented in [30,37,44,45] and have been effective in characterizing the writer. Keypoints are typically the locations where the writing stroke changes direction abruptly. Likewise, intersecting strokes also represent potential keypoints. A number of keypoint detectors have been proposed by the computer vision community; among these, we have chosen to employ the FAST keypoint detector [46]. Fragments of size 50 × 50, extracted using FAST keypoints from one of the binarized images in the dataset, are illustrated in Fig. 7.
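The following sketch illustrates this fragment extraction with OpenCV's FAST detector; the 50 × 50 fragment size follows the description above, while the detector threshold is an assumed value.

# Sketch of fragment extraction around FAST keypoints [46]; the detector
# threshold is an assumption, the 50x50 fragment size follows the text.
import cv2

def extract_fragments(binary_img, frag_size=50, fast_threshold=20):
    half = frag_size // 2
    detector = cv2.FastFeatureDetector_create(threshold=fast_threshold)
    keypoints = detector.detect(binary_img, None)
    h, w = binary_img.shape[:2]
    fragments = []
    for kp in keypoints:
        x, y = int(round(kp.pt[0])), int(round(kp.pt[1]))
        # keep only fragments that lie completely inside the image
        if half <= x <= w - half and half <= y <= h - half:
            fragments.append(binary_img[y - half:y + half, x - half:x + half])
    return fragments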

Fig. 7. Fragments extracted from a Binarized (DeepOtsu) image using FAST keypoints

3.4 Two-Step Fine Tuning

Once writing fragments are extracted, we proceed to the extraction of features using ConvNets. For many classification tasks, it is common to borrow the architecture and weights of a network trained on large dataset(s) and adapt it for the problem under study. The convolutional base of such pre-trained models can be used as a feature extractor and these features can then be employed with any classifier. It is also common to fine-tune a model by continuing back-propagation on all or a subset of layers and replacing the classification layer to match the requirements of the problem at hand. An important consideration in adapting CNNs for feature extraction and classification is the similarity of the source and target datasets. It is important to mention that most of the available models are trained on the ImageNet [47] dataset, which is quite different from the handwriting images under study. To address this point, we employ a multi-step fine-tuning approach. We first fine-tune the standard models trained on the ImageNet dataset on writing samples from the IAM dataset [48]. The IAM dataset contains more than 1,500 writing samples from 650 different writers. Although these are contemporary documents and do not suffer from the


degradations encountered in historical documents, since the images contain handwriting, the features learned on these images are likely to be more effective in characterizing the writer of the papyrus handwriting. The effectiveness of such multi-step fine-tuning has also been established in a number of other studies [17,49]. Once a network is fine-tuned on writing samples from the IAM dataset, we subsequently tune it on the relatively smaller set of writing samples in the GRK-Papyri dataset, as sketched below. The softmax layer of each investigated model is adapted to match the 10 classes (10 scribes).
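A minimal PyTorch sketch of this two-step fine-tuning is given below; the optimizer, learning rate and number of epochs are assumptions, and iam_loader / papyri_loader stand for DataLoaders over the corresponding writing fragments (not defined here).

# Sketch of the two-step fine-tuning (ImageNet -> IAM -> GRK-Papyri); the training
# hyper-parameters are assumptions and the two DataLoaders are placeholders.
import torch
import torch.nn as nn
from torchvision import models

def fine_tune(model, loader, num_classes, epochs=10, lr=1e-3, device="cuda"):
    model.fc = nn.Linear(model.fc.in_features, num_classes)  # replace the classification layer
    model = model.to(device)
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    criterion = nn.CrossEntropyLoss()
    model.train()
    for _ in range(epochs):
        for fragments, writers in loader:
            fragments, writers = fragments.to(device), writers.to(device)
            optimizer.zero_grad()
            criterion(model(fragments), writers).backward()
            optimizer.step()
    return model

model = models.resnet34(pretrained=True)                 # ImageNet weights
model = fine_tune(model, iam_loader, num_classes=650)    # step 1: IAM writers
model = fine_tune(model, papyri_loader, num_classes=10)  # step 2: the 10 papyri scribes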

4 Experiments and Results

This section presents the details of the experiments and the obtained results along with the accompanying discussion. The GRK-Papyri dataset is organized to support the task of writer identification under two different experimental protocols, leave-one-out and a split into training (20 images) and test (30 images) sets. Since our technique relies on feature learning, a leave-one-out protocol is not very feasible as it would require fine-tuning the models for each of the 50 runs. Hence, we use the part of the dataset that is distributed into training and test sets. The pre-trained models investigated include three well-known architectures, namely VGG16 [50], InceptionV3 [51] and ResNet [52]. All models are first fine-tuned on writing samples from the IAM database and subsequently on the Greek handwriting images. In the first experiment, we compare the performance of the employed pre-processing techniques (using InceptionV3). The identification rates are reported at the fragment (patch) level as well as at the image level by combining the decisions using majority voting. It can be seen from Fig. 8 that DeepOtsu outperforms the other techniques, reporting identification rates of 32% and 57% at the patch and document levels, respectively. Consequently, for the subsequent evaluations, we employ DeepOtsu as the pre-processing technique. In the next series of experiments, we first compute writer identification rates by directly fine-tuning the standard pre-trained models on the Greek writings, followed by the two-step fine-tuning. The results of these experiments are presented in Table 1. It can be seen that in all cases, two-step fine-tuning results in a notable improvement in identification rates. The highest reported identification rate is 64% with ResNet. Taking into account the small dataset and the associated challenges, the reported performance seems to be very promising. For comparison purposes, we outline the performance of a couple of other studies that employ the same dataset for evaluation. One of the well-known works on this problem is presented in [18], where the authors employ NLNBNN with FAST keypoints. Identification rates of 30.0% and 26.6% are reported in this study using the leave-one-out and train/test protocols, respectively. Nasir et al. [17] employ a similar two-step fine-tuning with DeepOtsu as the binarization method. However, rather than extracting writing patches using keypoints, the authors use dense sampling to extract rectangular windows from handwriting images and report an identification rate of 54% with ResNet.
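As a small illustration of the fragment-to-image aggregation used to obtain the image-level identification rates above, the image-level decision can be obtained by majority voting over the fragment-level predictions:

# Majority voting over fragment-level predictions to obtain the image-level writer.
from collections import Counter

def image_level_writer(fragment_predictions):
    """fragment_predictions: iterable of predicted writer ids for one document image."""
    return Counter(fragment_predictions).most_common(1)[0][0]

# e.g. image_level_writer([3, 3, 7, 3, 1]) -> 3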


Fig. 8. Identification rates from different pre-processing techniques using two-step fine-tuning of InceptionV3

Table 1. Single and two-step fine-tuning performance on different pre-trained ConvNets (fragments).

Networks         | Fine-Tuning Scheme  | Fragment Level | Image Level
VGG16 [50]       | ImageNet→Papyri     | 0.16           | 0.32
VGG16 [50]       | ImageNet→IAM→Papyri | 0.19           | 0.41
InceptionV3 [51] | ImageNet→Papyri     | 0.25           | 0.45
InceptionV3 [51] | ImageNet→IAM→Papyri | 0.32           | 0.57
ResNet-50 [52]   | ImageNet→Papyri     | 0.35           | 0.58
ResNet-50 [52]   | ImageNet→IAM→Papyri | 0.38           | 0.61
ResNet-34 [52]   | ImageNet→Papyri     | 0.33           | 0.60
ResNet-34 [52]   | ImageNet→IAM→Papyri | 0.40           | 0.645

Using the same experimental protocol, we report an identification rate of 64.5%, representing an enhancement of about 10%. It also signifies the importance of the choice of the handwriting unit that is employed to extract features. Extracting patches around keypoints is likely to preserve important writer-specific patterns which might be lost in a random dense sampling using windows (Table 2).

Table 2. Comparison with the state of the art

Study                | Technique                             | Identification Rate
Mohammed et al. [18] | NLNBNN with FAST                      | 30%
Nasir et al. [17]    | Dense sampling + two-step fine-tuning | 54%
Proposed             | FAST keypoints + two-step fine-tuning | 64%

5 Conclusion

This paper addressed the problem of identification of scribes from historical manuscripts and, more specifically, Greek handwriting on papyrus as a case study. Handwriting is segmented from the noisy documents and keypoints in the writing are identified using the FAST keypoint detector. Using the keypoints as centers, small patches around these points are extracted from the handwriting and are employed as units for feature extraction. Patches are mapped to features using a two-step fine-tuning of well-known ConvNet architectures. Image-level identification rates of up to 64% are reported which, considering the challenging set of images, are indeed promising. In our further study on this problem, we aim to handle images where the writer information is not available in the ground truth. In such cases, a similarity measure can be designed that allows assessing how ‘similar’ two writing styles are, hence implicitly providing information on the scribe, date of origin, etc. Pre-processing of such documents is a critical step and can be further investigated to allow more effective feature learning, eventually leading to enhanced performance.
Acknowledgments. The authors would like to thank Dr. Isabelle Marthot-Santaniello from the University of Basel, Switzerland, for making the dataset available.

References 1. Hamid, A., Bibi, M., Siddiqi, I., Moetesum, M.: Historical manuscript dating using textural measures. In: 2018 International Conference on Frontiers of Information Technology (FIT), pp. 235–240. IEEE (2018) 2. Baird, H.S., Govindaraju, V., Lopresti, D.P.: Document analysis systems for digital libraries: challenges and opportunities. In: Marinai, S., Dengel, A.R. (eds.) DAS 2004. LNCS, vol. 3163, pp. 1–16. Springer, Heidelberg (2004). https://doi.org/10. 1007/978-3-540-28640-0 1 3. Le Bourgeois, F., Trinh, E., Allier, B., Eglin, V., Emptoz, H.: Document images analysis solutions for digital libraries. In: First International Workshop on Document Image Analysis for Libraries, 2004, Proceedings, pp. 2–24. IEEE (2004) 4. Sankar, K.P., Ambati, V., Pratha, L., Jawahar, C.V.: Digitizing a Million books: challenges for document analysis. In: Bunke, H., Spitz, A.L. (eds.) DAS 2006. LNCS, vol. 3872, pp. 425–436. Springer, Heidelberg (2006). https://doi.org/10. 1007/11669487 38 5. Klemme, A.: International dunhuang project: the silk road online. Reference Reviews (2014) 6. Jouili, S., Coustaty, M., Tabbone, S., Ogier, J.-M.: NAVIDOMASS: structuralbased approaches towards handling historical documents. In: 2010 20th International Conference on Pattern Recognition, pp. 946–949 (2010) 7. Hamid, A., Bibi, M., Moetesum, M., Siddiqi, I.: Deep learning based approach for historical manuscript dating. In: 2019 International Conference on Document Analysis and Recognition (ICDAR), pp. 967–972 (2019) 8. Aiolli, F., Ciula, A.: A case study on the system for paleographic inspections (SPI): challenges and new developments. Comput. Intell. Bioeng. 196, 53–66 (2009)


9. He, S., Samara, P., Burgers, J., Schomaker, L.: Image-based historical manuscript dating using contour and stroke fragments. Pattern Recogn. 58, 159–171 (2016) 10. Srihari, S.N., Cha, S.-H., Arora, H., Lee, S.: Individuality of handwriting. J. Forensic Sci. 47(4), 1–17 (2002) 11. Said, H.E., Tan, T.N., Baker, K.D.: Personal identification based on handwriting. Pattern Recogn. 33(1), 149–160 (2000) 12. He, Z., You, X., Tang, Y.Y.: Writer identification using global wavelet-based features. Neurocomputing 71(10–12), 1832–1841 (2008) 13. He, S., Schomaker, L.: Deep adaptive learning for writer identification based on single handwritten word images. Pattern Recogn. 88, 64–74 (2019) 14. Bulacu, M., Schomaker, L.: Text-independent writer identification and verification using textural and allographic features. IEEE Trans. Pattern Anal. Mach. Intell. 29(4), 701–717 (2007) 15. Siddiqi, I., Vincent, N.: Text independent writer recognition using redundant writing patterns with contour-based orientation and curvature features. Pattern Recogn. 43(11), 3853–3865 (2010) 16. Xing, L., Qiao, Y.: DeepWriter: a multi-stream deep CNN for text-independent writer identification. In: 2016 15th International Conference on Frontiers in Handwriting Recognition (ICFHR), pp. 584–589. IEEE (2016) 17. Nasir, S., Siddiqi, I.: Learning features for writer identification from handwriting on Papyri. In: Djeddi, C., Kessentini, Y., Siddiqi, I., Jmaiel, M. (eds.) MedPRAI 2020. CCIS, vol. 1322, pp. 229–241. Springer, Cham (2021). https://doi.org/10. 1007/978-3-030-71804-6 17 18. Mohammed, H. Marthot-Santaniello, I., M¨ argner, V.: GRK-Papyri: a dataset of greek handwriting on papyri for the task of writer identification. In: 2019 International Conference on Document Analysis and Recognition (ICDAR), pp. 726–731 (2019) 19. Rehman, A., Naz, S., Razzak, M.I., Hameed, I.A.: Automatic visual features for writer identification: a deep learning approach. IEEE Access 7, 17149–17157 (2019) 20. Xing, L., Qiao, Y.: DeepWriter: a multi-stream deep CNN for text-independent writer identification. In: 2016 15th International Conference on Frontiers in Handwriting Recognition (ICFHR), pp. 584–589 (2016) 21. Christlein, V., Gropp, M., Fiel, S., Maier, A.: Unsupervised feature learning for writer identification and writer retrieval. In: 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), vol. 01, pp. 991–997 (2017) 22. Keglevic, M., Fiel, S., Sablatnig, R.: Learning features for writer retrieval and identification using triplet CNNs. In: 2018 16th International Conference on Frontiers in Handwriting Recognition (ICFHR), pp. 211–216 (2018) 23. Awaida, S.M., Mahmoud, S.A.: State of the art in off-line writer identification of handwritten text and survey of writer identification of Arabic text. Educ. Res. Rev. 7(20), 445–463 (2012) 24. Tan, G.J., Sulong, G., Rahim, M.S.M.: Writer identification: a comparative study across three world major languages. Forensic Sci. Int. 279, 41–52 (2017) 25. He, S., Schomaker, L.: FragNet: writer identification using deep fragment networks. IEEE Trans. Inf. Forensics Secur. 15, 3013–3022 (2020) 26. Kumar, B., Kumar, P., Sharma, A.: RWIL: robust writer identification for Indic language. In: 2018 Second International Conference on Intelligent Computing and Control Systems (ICICCS), pp. 695–700 (2018)


27. Tang, Y., Wu, X.: Text-independent writer identification via CNN features and joint Bayesian. In: 2016 15th International Conference on Frontiers in Handwriting Recognition (ICFHR), pp. 566–571, October 2016 28. Nasuno, R., Arai, S.: Writer identification for offline Japanese handwritten character using convolutional neural network. In: Proceedings of the 5th IIAE (Institute of Industrial Applications Engineers) International Conference on Intelligent Systems and Image Processing, pp. 94–97 (2017) 29. Chen, S., Wang, Y., Lin, C.-T., Ding, W., Cao, Z.: Semi-supervised feature learning for improving writer identification. Inf. Sci. 482, 156–170 (2019) 30. Fiel, S., Sablatnig, R.: Writer identification and retrieval using a convolutional neural network. In: Azzopardi, G., Petkov, N. (eds.) CAIP 2015. LNCS, vol. 9257, pp. 26–37. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-23117-4 3 31. Nguyen, H.T., Nguyen, C.T., Ino, T., Indurkhya, B., Nakagawa, M.: Textindependent writer identification using convolutional neural network. Pattern Recogn. Lett. 121, 104–112 (2019) 32. Christlein, V., Bernecker, D., Maier, A., Angelopoulou, E.: Offline writer identification using convolutional neural network activation features. In: Gall, J., Gehler, P., Leibe, B. (eds.) GCPR 2015. LNCS, vol. 9358, pp. 540–552. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-24947-6 45 33. Javidi, M., Jampour, M.: A deep learning framework for text-independent writer identification. Eng. Appl. Artif. Intell. 95, 103912 (2020) 34. Schomaker, L., Franke, K., Bulacu, M.: Using codebooks of fragmented connectedcomponent contours in forensic and historic writer identification. Pattern Recogn. Lett. 28(6), 719–727 (2007) 35. Lai, S., Zhu, Y., Jin, L.: Encoding pathlet and SIFT features with bagged VLAD for historical writer identification. IEEE Trans. Inf. Forensics Secur. 15, 3553–3566 (2020) 36. Abdeljalil, G., Djeddi, C., Siddiqi, I., Al-Maadeed, S.: Writer identification on historical documents using oriented basic image features. In: 2018 16th International Conference on Frontiers in Handwriting Recognition (ICFHR), pp. 369–373 (2018) 37. Chammas, M., Makhoul, A., Demerjian, J.: Writer identification for historical handwritten documents using a single feature extraction method. In: 19th International Conference on Machine Learning and Applications (ICMLA 2020) (2020) 38. Cilia, N.D., et al.: A two-step system based on deep transfer learning for writer identification in medieval books. In: Vento, M., Percannella, G. (eds.) CAIP 2019. LNCS, vol. 11679, pp. 305–316. Springer, Cham (2019). https://doi.org/10.1007/ 978-3-030-29891-3 27 39. Cilia, N., De Stefano, C., Fontanella, F., Marrocco, C., Molinara, M., Di Freca, A.S.: An end-to-end deep learning system for medieval writer identification. Pattern Recogn. Lett. 129, 137–143 (2020) 40. Mohammed, H., M¨ argner, V., Stiehl, H.S.: Writer identification for historical manuscripts: analysis and optimisation of a classifier as an easy-to-use tool for scholars from the humanities. In: 2018 16th International Conference on Frontiers in Handwriting Recognition (ICFHR), pp. 534–539 (2018) 41. Studer, L., et al.: A comprehensive study of ImageNet pre-training for historical document image analysis. arXiv preprint arXiv:1905.09113 (2019) 42. Sauvola, J., Pietik¨ ainen, M.: Adaptive document image binarization. Pattern Recogn. 33(2), 225–236 (2000) 43. He, S., Schomaker, L.: DeepOtsu: document enhancement and binarization using iterative deep learning. 
Pattern Recogn. 91, 379–390 (2019)


44. Fiel, S., Hollaus, F., Gau, M., Sablatnig, R.: Writer identification on historical Glagolitic documents. In: Document Recognition and Retrieval XXI, vol. 9021, p. 902102. International Society for Optics and Photonics (2014) 45. Bennour, A., Djeddi, C., Gattal, A., Siddiqi, I., Mekhaznia, T.: Handwriting based writer recognition using implicit shape codebook. Forensic Sci. Int. 301, 91–100 (2019) 46. Rosten, E., Drummond, T.: Machine learning for high-speed corner detection. In: Leonardis, A., Bischof, H., Pinz, A. (eds.) ECCV 2006. LNCS, vol. 3951, pp. 430– 443. Springer, Heidelberg (2006). https://doi.org/10.1007/11744023 34 47. Deng, J., Dong, W., Socher, R., Li, L.-J., Li, K., Fei-Fei, L.: ImageNet: a large-scale hierarchical image database. In: 2009 IEEE Conference on Computer Vision and Pattern Recognition, pp. 248–255. IEEE (2009) 48. Marti, U.-V., Bunke, H.: The IAM-database: an English sentence database for offline handwriting recognition. Int. J. Doc. Anal. Recogn. 5(1), 39–46 (2002) 49. Gazda, M., Hireˇs, M., Drot´ ar, P.: Multiple-fine-tuned convolutional neural networks for Parkinson’s disease diagnosis from offline handwriting. IEEE Trans. Syst. Man Cybern. Syst. (2021) 50. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) 51. Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., Wojna, Z.: Rethinking the inception architecture for computer vision. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2818–2826 (2016) 52. Targ, S., Almeida, D., Lyman, K.: Resnet in resnet: Generalizing residual architectures. arXiv preprint arXiv:1603.08029 (2016)

Robust Hashing for Character Authentication and Retrieval Using Deep Features and Iterative Quantization
Musab Al-Ghadi, Théo Azzouza, Petra Gomez-Krämer, Jean-Christophe Burie, and Mickaël Coustaty
L3i, La Rochelle University, Avenue Michel Crépeau, 17042 La Rochelle Cedex 1, France
{musab.alghadi,petra.gomez,jean-christophe.burie,mickael.coustaty}@univ-lr.fr

Abstract. This paper proposes a hashing approach for character authentication and retrieval based on the combination of a convolutional neural network (CNN) and the iterative quantization (ITQ) algorithm. This hashing approach is made up of two steps: feature extraction and hash construction. The feature extraction step reduces high-dimensional data into low-dimensional discriminative features by applying a CNN model, while the hash construction step quantizes continuous real-valued features into discrete binary codes by applying ITQ. These two steps are combined in this work to achieve two objectives: (i) a hash should have a good anti-collision (discriminative) capability for distinct characters, and (ii) a hash should also be quite robust to the common image content-preserving operations. Experiments were conducted in order to analyze and identify the most appropriate parameters to achieve higher authentication and retrieval performances. The experiments are performed on two public character datasets, MNIST and Font-Char74K. The results show that the proposed approach builds hashes that are quite discriminative for distinct characters and also quite robust to the common image content-preserving operations.

Keywords: Hashing · CNN · Iterative quantization · Character authentication and retrieval

1 Introduction

Nowadays, a large part of the documents that we use in our daily life exist in hybrid forms (i.e. native digital, digitized, printed/scanned). With the widespread use of advanced editing tools, document content can be easily forged and tampered with. These challenges lead to an emerging need for developing efficient document authentication and retrieval systems. Image hashing is widely applied in image authentication and image retrieval [1,2]. Generally, a hashing scheme maps high-dimensional data from its original space (such as images,


videos, documents) into low-dimensional compact codes [3]. The hash of an image is constructed in two basic steps: image feature extraction and hash construction. Hashing approaches can be classified into two main groups: cryptographic hashing and perceptual hashing. Cryptographic hashing is very sensitive to any change in the input data; it maps the input data into a hash code without attention to the semantic features (the features most relevant to human perception) of the input data. In contrast, the more recent perceptual hashing (learning based hashing) is less sensitive to alterations of the perceptual (visual) features of the input data. Cryptographic hashing achieves randomization (the hash is unpredictable and secure against malicious attacks) and fragility (discrimination; the hash of any image perceptually distinct from another is different); in addition to these two properties, perceptual hashing achieves robustness (the hashes of two perceptually similar images are the same) and compactness (the size of the hashing value should be much smaller than that of the original space). These desired properties of perceptual hashing make it preferable to cryptographic hashing for achieving content-based authentication and retrieval of hybrid documents. Indeed, perceptual hashing can be further classified into traditional hashing and learning based hashing. Traditional hashing approaches, which rely on hand-crafted features to build the hash, have not advanced much in terms of authentication and retrieval performances in recent years due to their limited ability to fully capture the semantics of images [1,4]. Recently, learning based hashing was proposed instead of traditional hashing to overcome these drawbacks [1]. Learning based hashing approaches can in turn be classified into non-deep learning and deep learning hashing. The difference between them is the nature of the features that are used to build a hash. The non-deep learning approach extracts the features from the raw or transformed data, while deep learning hashing employs deep features that are extracted via a CNN to build a hash. Both approaches aim to build a hash that has a discriminative capability for visually distinct images [1] and fulfills four properties: robustness, compactness, randomization, and fragility. As the character is the basic unit of text and text is the most important content of documents, most content-based hashing methods have been applied to characters. A skeleton-based hashing approach for document authentication is proposed in [5]. This approach uses the properties of the character skeleton, including the conjunctions of skeleton lines and their directions, to build hash vectors and then to match the genuine character against the character under control. The method is robust to low-level geometric transformations and noise, but not to median filtering. In [6], a text hashing approach is proposed based on random sampling of the text into components (characters or words). Each component, obtained by a segmentation, is sampled into rectangles of random sizes and positions. Afterwards, each rectangle is represented by the mean of its pixel values. The vectors of means are then quantified and randomized to produce the final hash of each component. Recently, a character classification based hashing approach was proposed in [7]. It uses principal component analysis (PCA) in the case of a double print-and-scan process to build the character hash. The


hash generation in this work is too sensitive even to light content-preserving operations. The motivation of this paper is to provide a new approach for authenticating and retrieving the entirety of the content of a text document through robust hashing, in order to fight against fraud and falsification. Concretely, this objective is based on two functional themes: document authentication and document retrieval. In this paper, we work on the basic unit of the text, which is the character, and we propose a robust hashing approach for character authentication and retrieval that takes advantage of a CNN. The CNN is used here to extract highly discriminative features from the character and then to build a hash which has a good anti-collision (discriminative) capability for distinct characters and is also quite robust to the common image content-preserving operations. Two influences on hash anti-collision and robustness are studied: (i) the influence of different typefaces and fonts, where the characters have been printed/written in both digital and handwritten forms (i.e. the same character may have different shapes if it is written in two fonts or by two hands), and (ii) the influence of the common character content-preserving operations. Throughout this paper, distinct characters are defined as characters which are visually distinct, such as (‘A’, ‘a’), (‘K’, ‘M’), (‘n’, ‘N’); the same characters typed using different fonts are considered as similar characters, such as (‘K’, ‘K’), (‘m’, ‘m’). The contributions of this paper can be summarized as follows:
• To the best of our knowledge, the proposed approach is the first attempt to design a robust hashing approach for character authentication and retrieval w.r.t. different fonts and handwriting.
• The proposed approach uses a CNN model and the ITQ algorithm to construct a hash in a supervised way.
• The proposed approach fulfills the four desired properties of a learning based hashing scheme (robustness, compactness, randomization, and fragility).
• The results show that the proposed approach builds hashes that are quite discriminative for distinct characters and also quite robust to the common image content-preserving operations.
• Finally, the experimental results on two public datasets (MNIST and Font-Char74K) achieve state-of-the-art performance in character authentication and retrieval applications.
The rest of this paper is organized as follows: the literature review is presented in Sect. 2. Section 3 presents the proposed approach. Section 4 is dedicated to the experimental results and the comparisons with previous studies. The paper ends with a conclusion in Sect. 5.

2 Literature Review

Basically, any learning based hashing approach includes three elements: the hash function, the similarity measures, and the loss function for optimization objectives [8].


A hash function maps an input item x into a binary code y (i.e. y = h(x) ∈ {0, 1}), aiming to retrieve the true nearest neighbor search result for a query q. In learning based hashing approaches, the input item x of the hash function is a discriminative (semantic) feature extracted from a CNN. The similarity between two hash codes y_i and y_j in the hash code space is expressed by the Hamming distance or the Euclidean distance. The Hamming distance between two hash codes, each of length L, is formulated as d^h_ij = Σ_{l=1}^{L} δ[y_il ≠ y_jl], and the corresponding similarity is s^h_ij = L − d^h_ij. The Euclidean distance is formulated as d^h_ij = ||y_i − y_j||_2. A loss function is defined to preserve the similarity between the input space and the hash code space. This is achieved by minimizing the gap between the nearest hash codes in the hash code space and the true search result in the input space. The loss function can be applied in three forms: pairwise similarity, multiwise similarity, and quantization-based similarity [8].

2.1 Non-deep Learning Based Hashing

Most non-deep learning approaches learn from the extracted features of the image to build a hash that has a discriminative capability for visually distinct images and is robust to different content-preserving operations. Locality-Sensitive Hashing (LSH) [9] is a data-independent hashing scheme; no learning is involved and it quantizes the different data dimensions into the same bit length regardless of their importance. Spectral Hashing (SH) [10] assigns more bits to the more relevant directions (which carry more information). Anchor Graph Hashing (AGH) [11] and Spherical Hashing (SpH) [12] represent the input data as a graph, and the data on the same dimension are quantized to similar hash codes. Principal Component Analysis Hashing (PCAH) [13] performs PCA on the feature vectors X, then uses the top m eigenvectors of the matrix XX^T as columns of the projection matrix W ∈ R^{d×m} to generate a hash code of m bits. PCAH is formulated as h(x) = sgn(W^T x). PCAH uses the same number of bits for the different projected dimensions, which leads to low similarity preservation. This challenge is solved in Isotropic Hashing (IsoH) [14] by balancing the variance of the input data and making the variances equal along the different data dimensions. Iterative Quantization (ITQ) [15] seeks to maximize the bit variance when transforming given vectors into a binary representation by minimizing the quantization error through finding the optimal rotation matrix. K-means Hashing (KMH) [16] performs k-means clustering in the input space and then quantizes the centroids based on the Euclidean distance. An input item is associated to the approximate nearest neighbor (centroid) via the Hamming distance.

2.2 Deep Learning Based Hashing

The learning based hashing allows to represent the data in a binary space from the extracted deep features via CNN. The Deep Hashing (DH) and Supervised Deep Hashing (SDH) [17] approaches are based on a CNN network to extract


deep features that are highly discriminative and often yield better search performance. Indeed, these approaches learn the binary codes by maintaining three constraints along the deep network layers. These constraints aim to minimize the quantization loss, balance the bit distribution in the binary codes and allow bit independence. Unsupervised Triplet Hashing (UTH) [18] feeds three forms of images (the anchor images, the rotated anchor images and random images) into a triplet-CNN network to maximize the similarity preservation by extracting more discriminative image features. Furthermore, UTH incorporates a latent layer in order to optimize the discriminative features for fast image retrieval. Similarity-Adaptive Deep Hashing (SADH) [19] is an unsupervised hashing approach that combines three modules in order to achieve highly similarity-preserving hash codes. The first module consists of training a deep hash model that helps to generate discriminative features for the processed images. The second module updates the similarity graph matrix, which is then input to the third module to optimize the hash codes. Deep Quantization (DeepQuan) [20] is a deep unsupervised hashing approach that manipulates deep features via product quantization to minimize the quantization error. Hash Generative Adversarial Network (HashGAN) [21] is an unsupervised deep hashing approach composed of three networks: a generator, a discriminator and an encoder. The discriminator and encoder networks are incorporated together to train on an unlabeled dataset efficiently. A novel loss function is proposed in HashGAN to balance the bit distribution and to allow bit independence. Recently, Sparse Graph based Self-supervised Hashing (SGSH) [22] was proposed to solve the performance degradation problem of traditional graph based hashing approaches such as AGH [11]. The SGSH approach adopts a self-supervised reconstruction to enhance the retrieval performance and allows more scalability by conserving the sparse neighborhood relationship rather than the fully connected neighborhood relationship. In general, most of the approaches mentioned in this section have limitations in establishing a good trade-off between building a hash that is robust to the common content-preserving operations and keeping it sensitive to non-authorized image content operations.

3 Proposed Model

The proposed hashing approach for efficient character authentication and retrieval is made up of two steps: feature extraction and hash construction. The feature extraction step reduces high-dimensional data into low-dimensional discriminative features via a CNN model, while the hash construction step quantizes continuous real-valued features into discrete binary codes via the ITQ algorithm. The general structure of the proposed approach is shown in Fig. 1.


Fig. 1. Structure of the proposed hashing approach.

3.1 Feature Extraction

As CNNs are excellent feature extractors, used efficiently in down-scaling tasks to extract characteristics relevant enough to represent images and to distinguish them correctly, the proposed approach starts by applying a CNN architecture to extract the discriminative features from the processed character in order to maximize the similarity preservation between the input space and the hash code space. The LeNet-5 architecture, which was proposed by Yann LeCun et al. [23], is used in the design of the proposed approach. The first block in Fig. 1 illustrates the LeNet-5 architecture in the feature extraction step. The LeNet-5 architecture consists of two convolutional (Conv) layers and two average pooling (Avg pool) layers, followed by a flattening convolutional layer, two fully-connected (FC) layers, and ends with a softmax classifier. The configuration of the LeNet-5 architecture is illustrated in Table 1.

Table 1. Configuration of the LeNet-5 network.

Layer | Feature maps  | Filter | Stride | Padding/Pooling | Activation function
C1    | 6 × 28 × 28   | 5 × 5  | 1 × 1  | Pad 2           | Hyperbolic tangent (tanh)
S2    | 6 × 14 × 14   | 2 × 2  | 2 × 2  | Pool 2 × 2      | Sigmoid
C3    | 16 × 10 × 10  | 5 × 5  | 1 × 1  | Pad 0           | Hyperbolic tangent (tanh)
S4    | 16 × 5 × 5    | 2 × 2  | 2 × 2  | Pool 2 × 2      | Sigmoid
C5    | 120 × 1 × 1   | 5 × 5  | 1 × 1  | Pad 0           | Hyperbolic tangent (tanh)
F6    | 84            | –      | –      | –               | Hyperbolic tangent (tanh)
F7    | 10 (output)   | –      | –      | –               | Softmax


The proposed approach selects the 84 values obtained from the LeNet-5 model as discriminative features for each character to construct the hash. It is worth noting that the LeNet-5 architecture was chosen to prove the feasibility and the relevance of the proposed hashing approach, but this step could be replaced by any other CNN architecture from the literature.
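A minimal PyTorch sketch of this feature extractor, following the layer configuration of Table 1, is given below; it assumes 28 × 28 single-channel inputs (larger Font-Char74K characters would first be resized) and omits the training details.

# LeNet-5 variant following Table 1; forward(..., return_features=True)
# returns the 84-d F6 activations used as the character feature vector.
import torch
import torch.nn as nn

class LeNet5(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 6, kernel_size=5, padding=2), nn.Tanh(),  # C1
            nn.AvgPool2d(2, stride=2), nn.Sigmoid(),               # S2
            nn.Conv2d(6, 16, kernel_size=5), nn.Tanh(),            # C3
            nn.AvgPool2d(2, stride=2), nn.Sigmoid(),               # S4
            nn.Conv2d(16, 120, kernel_size=5), nn.Tanh(),          # C5
        )
        self.f6 = nn.Sequential(nn.Flatten(), nn.Linear(120, 84), nn.Tanh())  # F6: 84-d features
        self.classifier = nn.Linear(84, num_classes)               # F7 (softmax via cross-entropy)

    def forward(self, x, return_features=False):
        f = self.f6(self.features(x))
        return f if return_features else self.classifier(f)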

3.2 Hash Construction

This step maps the extracted features of the character into hash codes with the least quantization error via the ITQ algorithm. The ITQ algorithm [15] is applied to construct the hash codes based on a random orthogonal matrix that minimizes the quantization error. ITQ starts by applying the PCA algorithm on the centered data to reduce the d dimensions of the data. The top m eigenvectors of the data covariance matrix X^T X are kept to obtain the projection matrix W. The PCA algorithm is used in ITQ to obtain another representation of the features in a space with maximum variance. The features become independent of each other and more discriminative. Hence, the projected data is formulated as V = XW. Afterwards, ITQ alternates two stages in order to find a random orthogonal matrix R that minimizes the quantization error and to learn a binary code matrix B ∈ {−1, 1}. The first stage fixes R and updates B; here, R is initialized randomly and B is updated by assigning each projected data point to the nearest vertex of the binary hypercube {−1, 1}^m. Updating B in this stage is formulated as B = sgn(V R). The second stage fixes B and updates R; this is achieved by computing the singular value decomposition (SVD) of B^T V as B^T V = UΣA^T and setting R = AU^T. Generally, the iterative quantization is formulated as follows to find the optimal solution for the function F:

F(B, R) = ||B − V R||_F^2
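A compact sketch of this hash construction step is given below, assuming the matrix of 84-dimensional CNN features as input; the PCA projection is obtained through an SVD of the centered data, and the random seed is an illustrative choice.

# Sketch of ITQ [15] on the CNN features: PCA projection to n_bits dimensions,
# then 50 alternations between binary-code and rotation updates.
import numpy as np

def itq_hash(features, n_bits=64, n_iter=50, seed=0):
    """features: (n_samples, 84) array of F6 activations. Returns 0/1 codes and (W, R)."""
    rng = np.random.default_rng(seed)
    X = features - features.mean(axis=0)            # center the data
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    W = Vt[:n_bits].T                                # PCA projection matrix (84, n_bits)
    V = X @ W                                        # projected data
    R, _ = np.linalg.qr(rng.standard_normal((n_bits, n_bits)))  # random orthogonal init
    for _ in range(n_iter):
        B = np.sign(V @ R)                           # fix R, update binary codes B
        B[B == 0] = 1
        U, _, At = np.linalg.svd(B.T @ V)            # fix B, update R (SVD of B^T V)
        R = At.T @ U.T                               # R = A U^T
    codes = (np.sign(V @ R) > 0).astype(np.uint8)    # final 0/1 hash codes
    return codes, (W, R)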

3.3 Character Authentication and Retrieval

In order to verify or retrieve a character based on its hash, the Hamming distance (hd) between the hash code of the query character y_c from the test set and that of the top returned character y_t from the gallery set is calculated. The distance hd between y_c and y_t, each of length L, is formulated according to Eq. 1:

hd = hd(y_c, y_t) = Σ_{l=1}^{L} δ[y_cl ≠ y_tl]          (1)

If hd is less than the pre-defined threshold λ, then the two characters are considered as similar (identical/not-tampered); otherwise, the two characters are considered as distinct (falsified/tampered).
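A minimal sketch of this decision rule, with the hash codes as 0/1 NumPy arrays and λ chosen per dataset (Sect. 4.2), could look as follows:

# Authentication decision of Eq. 1: compare two hash codes via their Hamming distance.
import numpy as np

def hamming_distance(code_a, code_b):
    return int(np.count_nonzero(code_a != code_b))

def is_identical(query_code, gallery_code, lam):
    """True if the pair is considered similar (identical / not tampered)."""
    return hamming_distance(query_code, gallery_code) < lam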

4 Experimental Results

This section details the experimental results of the proposed approach.

4.1 Character Datasets and Performance Metrics

To evaluate the performance of the proposed hashing approach, the MNIST [24] and Font-Char74K [25] datasets were selected. The MNIST dataset consists of 70,000 gray-scale images of size 28 × 28 in 10 classes of handwritten digits from ‘0’ to ‘9’. Each image is represented by a 784-dimensional vector with real-valued pixels as elements. Specifically, 1,000 images (100 images per class) are sampled randomly to form the test set (i.e. the queries) and the remaining 69,000 images are taken as the training set and also as the gallery set. The Font-Char74K dataset consists of 62,992 gray-scale images of size 128 × 128 in 62 classes of digits from ‘0’ to ‘9’ and English letters from ‘a’ to ‘z’ and ‘A’ to ‘Z’. All characters in the Font-Char74K dataset were synthesized in 254 different fonts and 4 styles (normal, bold, italic, and bold+italic). From the Font-Char74K dataset, 6,200 images (100 images per class) are sampled randomly to form the test set (i.e. the queries) and the remaining 56,792 images are taken as the training set and also as the gallery set. These proportions for the training and test sets have been selected to make our performance comparison consistent with some recent related work (DeepQuan [20]; HashGAN [21]; SGSH [22]). Some example images from the MNIST and Font-Char74K datasets are displayed in Fig. 2.

Fig. 2. Examples of test images.

The implementation of the proposed approach has been carried out in Python using the PyTorch library. The implementation starts by training the LeNet-5 model using the training set, with a batch size of 128, a learning rate of 0.001 and the cross-entropy loss function. The 84 values obtained from the fully-connected layer (F6) of the LeNet-5 model are used as the feature vector of each digit/character in the MNIST/Font-Char74K datasets. PCA reduces the feature vector length according to the preferred length of the hash codes. ITQ iterates 50 times in order to find the optimal orthogonal matrix; this setting is reported in [15].


We evaluate the character authentication and retrieval performances based on four standard evaluation metrics: mean Average Precision (mAP), mAP curves w.r.t. a different number of bits as hash code length (mAP w.r.t. number-of-bits), mAP curves w.r.t. a different number of returned samples (mAP@top-return-samples), and the true positive rate – false positive rate curves (TPR-FPR). mAP@k refers to the mean average precision at rank k; this metric is adopted in most of the related work. The AP@k ratio is computed according to Eq. 2:

AP@k = (1 / GTP) Σ_{k=1}^{n} P@k × rel@k          (2)

where GTP refers to the total number of ground truth positives, P@k refers to the precision at rank k, and rel@k is a relevance function which equals 1 if the returned image at rank k is relevant and 0 otherwise. The receiver operating characteristic (ROC) curve is used to evaluate the authentication performance as well as the retrieval output quality. The TPR and FPR of the ROC curve indicate the robustness and the discriminative capability of the character hash, respectively. The TPR and FPR ratios, as reported in [26], are calculated according to Eq. 3:

TPR(λ) = n_1(hd < λ) / N_1,   FPR(λ) = n_2(hd < λ) / N_2          (3)

where n_1 is the number of identical pairs of characters classified as similar characters, n_2 is the number of distinct pairs of characters classified as similar characters, and N_1 and N_2 correspond to the total numbers of identical pairs and distinct pairs of characters, respectively. λ is the authentication threshold used to consider a pair of characters as identical or distinct.
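As an illustration, the two ratios of Eq. 3 can be computed over a set of labelled code pairs as in the sketch below, where pairs is an assumed list of (code_a, code_b, identical) tuples:

# Sketch of the TPR/FPR computation of Eq. 3 for a given threshold lam.
import numpy as np

def tpr_fpr(pairs, lam):
    n1 = n2 = N1 = N2 = 0
    for code_a, code_b, identical in pairs:
        below = int(np.count_nonzero(code_a != code_b)) < lam
        if identical:
            N1 += 1
            n1 += below
        else:
            N2 += 1
            n2 += below
    return n1 / N1, n2 / N2   # (TPR, FPR)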

4.2 Parameter Determination

The authentication threshold λ has a direct influence on the performance of the proposed approach and it needs to be determined. To do so, the Hamming distance hd between the hash codes of each query character and the top returned character from the gallery set is calculated. Figures 3(a) and (b) show the hd distributions for hashes of identical characters (visually identical) and of different characters (visually distinct) in the MNIST dataset for the ITQ-only [15] approach and the proposed approach, respectively. Figures 4(a) and (b) show the corresponding hd distributions for the Font-Char74K dataset for the ITQ-only [15] approach and the proposed approach, respectively. From Fig. 3 it can be seen that the proposed approach makes it possible to define λ, in comparison to ITQ-only [15]: almost all Hamming distances for identical characters are less than 2 while almost all Hamming distances between distinct characters are greater than or equal to 1. Hence, we can define λ = 1 as the authentication threshold for the MNIST dataset. On the other hand, from Fig. 4 it can be seen that with λ = 0 the proposed approach is capable of discriminating more identical characters in comparison to

Fig. 3. Distribution of the Hamming distances of the MNIST dataset: (a) ITQ-only [15], (b) proposed approach.

Fig. 4. Distribution of the Hamming distances of the Font-Char74K dataset: (a) ITQ-only [15], (b) proposed approach.

ITQ-only [15]. Indeed, for the proposed approach the Hamming distance converges for identical characters and diverges for distinct characters. Hence, we can define λ = 0 as the authentication threshold for the Font-Char74K dataset.

4.3 Performance Test

Figure 5 shows the authentication and retrieval performances of the proposed approach and ITQ-only [15] through the mAP curves w.r.t. the number-of-bits and the mAP curves w.r.t. the top-return-samples on the MNIST and Font-Char74K datasets. We can see that the proposed approach outperforms the ITQ-only [15] approach by large margins on the MNIST and Font-Char74K datasets w.r.t. the two performance metrics. From Fig. 5(a) it can be seen that the proposed approach brings an increase of at least 14.7% and 21.1% in mAP for the different numbers of bits on the MNIST and Font-Char74K datasets, respectively, compared with ITQ-only [15]. Specifically, the mAP w.r.t. number-of-bits increases as the length of the hash code increases. The mAP@top-return-characters with 64 bits in Fig. 5(b) shows that the proposed approach brings an enhancement in mAP results on the MNIST and Font-Char74K datasets compared with ITQ-only [15].


Fig. 5. mAP of the proposed approach and ITQ-only [15] on the MNIST and Font-Char74K datasets: (a) mAP w.r.t. the number of bits, (b) mAP@top-return-characters at 64 bits.

Moreover, Fig. 6 shows the authentication and retrieval performances in TPR-FPR curves @64bits w.r.t. the top-return-character and with a varying authentication threshold λ. The presented TPR and FPR curves demonstrate the robustness and discriminative capability of the hashes of the proposed approach over ITQ-only [15].

Fig. 6. FPR-TPR curves of the proposed approach and ITQ-only [15] on the MNIST and Font-Char74K datasets at 64 bits w.r.t. the top returned character: (a) MNIST, (b) Font-Char74K.

To evaluate the robustness of the proposed hashing approach to the common content-preserving operations, Table 2 presents mAP@1000 results against eleven kinds of corruptions that are applied to the characters of MNIST and Font-Char74K. At the same time, Table 2 allows a comparison between the achieved mAP@1000 of the proposed hashing approach and the ITQ-only [15] approach. The reported mAP@1000 results are calculated with 64-bit hash codes (64-bit hash codes are selected here because they present the highest performance in Fig. 5). For this purpose, half of the query characters (50 characters out of 100 characters per class for the MNIST and Font-Char74K datasets) are exposed to one of eleven kinds of corruptions (including JPEG compression, median filtering, salt & pepper noise, and Gaussian noise) with different


factors in each experiment. Accordingly, the whole test set is shuffled randomly to allow a random selection of the query character in each test. The mAP@1000 results in Table 2 show that the proposed approach presents more stability, by about 16% and 21% for the MNIST and Font-Char74K datasets respectively, compared with ITQ-only [15]. These results also show that the robustness of a hash built by the proposed approach remains appealing even with corrupted samples. To visualize the authentication and retrieval process, the top 10 retrieved characters for the digit ‘3’ are presented in Fig. 7 for the ITQ-only [15] approach and the proposed approach. The visualized results demonstrate the robustness and discriminative capability of the hash generated by the proposed approach over ITQ-only [15]. Red boxes indicate retrieved characters which are false positives.

Table 2. mAP@1000 of the proposed approach and ITQ-only [15] against some kinds of corruptions on MNIST and Font-Char74K.

Distortion       | Factor               | MNIST: ITQ-only [15] | MNIST: Proposed | Font-Char74K: ITQ-only [15] | Font-Char74K: Proposed
(all values are mAP@1000 computed with 64-bit hash codes)
JPEG compression | 90                   | 38.41 | 53.47 | 39.07 | 60.57
JPEG compression | 80                   | 37.95 | 54.24 | 39.39 | 60.10
JPEG compression | 70                   | 38.62 | 52.42 | 39.15 | 59.59
Median filtering | Window (3 × 3)       | 39.49 | 53.93 | 38.07 | 60.65
Median filtering | Window (5 × 5)       | 39.41 | 54.07 | 38.80 | 61.25
Salt and pepper  | Noise density = 0.01 | 37.08 | 55.63 | 38.70 | 58.64
Salt and pepper  | Noise density = 0.05 | 40.49 | 55.32 | 39.76 | 60.02
Salt and pepper  | Noise density = 0.10 | 38.16 | 54.41 | 38.56 | 59.42
Gaussian noise   | σ = 0.1              | 37.60 | 52.83 | 38.83 | 61.46
Gaussian noise   | σ = 0.5              | 37.54 | 56.20 | 39.04 | 60.70
Gaussian noise   | σ = 1                | 38.93 | 54.27 | 38.95 | 61.35

4.4 Performance Comparison

A comparison study between the proposed approach and some state-of-the-art approaches on the MNIST dataset is presented in Table 3. To the best of our knowledge, these works only report results on the MNIST dataset as a character dataset. Table 3 shows that the mAP results of the proposed approach outperform all mentioned approaches except the HashGAN [21] approach. The improvement in mAP results is achieved with a large margin, especially with long hash codes (64 bits). Considering the mAP@1000 results, the 16-bit, 32-bit and 64-bit experiments show that the proposed approach outperforms the second best approach, ITQ-only [15], by about 6%, 14%, and 14%, respectively. Considering the mAP@All results, the 16-bit, 32-bit and 64-bit experiments show that the proposed approach outperforms the second best approach, SGSH [22], by about 14%, 21% and 21%, respectively. On the other


Fig. 7. Visualization of the top 10 retrieved images for ITQ-only [15] and the proposed approach with 64-bit hash codes.

Table 3. mAP@1000 and mAP@All results of the proposed approach and other state-of-the-art approaches on the MNIST dataset, with 16-, 32- and 64-bit hash codes. ‘–’ indicates the value has not been reported.

Method         | mAP@1000 (16 bits) | mAP@1000 (32 bits) | mAP@1000 (64 bits) | mAP@All (16 bits) | mAP@All (32 bits) | mAP@All (64 bits)
LSH [9]        | 42.10 | 50.45 | 66.23 | 20.88 | 25.83 | 31.71
SH [10]        | 52.97 | 65.45 | 65.45 | 25.81 | 30.77 | 24.10
AGH [11]       | –     | –     | –     | 39.92 | 33.39 | 28.64
SpH [12]       | 59.72 | 64.37 | 67.60 | 26.64 | 25.72 | 34.75
PCAH [13]      | 60.98 | 64.47 | 63.31 | 27.33 | 24.85 | 31.71
ITQ-only [15]  | 70.06 | 76.86 | 80.23 | 41.18 | 43.82 | 45.37
KMH [16]       | 59.12 | 70.32 | 67.62 | 32.12 | 33.29 | 35.78
DH [17]        | –     | –     | –     | 43.14 | 44.97 | 46.74
SDH [17]       | –     | –     | –     | 46.75 | 51.01 | 52.50
UTH [18]       | 43.15 | 46.58 | 49.88 | –     | –     | –
SADH [19]      | –     | –     | –     | 46.22 | 43.03 | 41.00
DeepQuan [20]  | –     | –     | –     | 60.30 | 55.50 | 52.54
HashGAN [21]   | 94.31 | 95.48 | 96.39 | 91.13 | 92.70 | 93.93
SGSH [22]      | –     | –     | –     | 62.66 | 65.25 | 69.11
Proposed       | 76.04 | 90.79 | 94.95 | 76.99 | 86.88 | 90.28

hand, Table 3 shows that the mAP results of HashGAN [21] outperform those of the proposed approach. The superiority of HashGAN [21] is explained in the literature [22,27] by the application of the reconstructive loss function in the encoder module. However, HashGAN [21] outperforms the proposed approach by about 3% in the 64-bit experiments for both mAP@1000 and mAP@All. Nevertheless, the results in Table 3, especially with 64-bit hash


codes, confirm the robustness and discriminative capability of the hashes generated by the proposed approach for distinct characters.

5 Conclusion

A robust hashing approach for character authentication and retrieval has been proposed in this paper. With respect to different fonts and handwriting, a CNN model has been used to extract highly discriminative features of characters in order to build a hash. The generated hashes have a good discriminative capability for distinct characters and are also quite robust to the common image content-preserving operations. The authentication and retrieval performances were evaluated on two public character datasets and compared with related approaches. Compared to the related state of the art, our approach provides an overall better performance. Our future work will focus on extending the proposed approach to achieve authentication and retrieval on words, sentences, or text-lines.
Acknowledgements. This work is financed by the Nouvelle-Aquitaine Region (SVPIoT project, reference 2017-1R50108-00013407), the ANR CHIST-ERA SPIRIT project (reference ANR-16-CHR2-0004), the ANR LabCom IDEAS (reference ANR-18-LCV30008) and the FUI IDECYS+ project.

References 1. Ng, W., Li, J., Tian, X., Wang, H., Kwong, S., Wallace, J.: Multi-level supervised hashing with deep features for efficient image retrieval. Neurocomputing 399, 171– 182 (2020). https://doi.org/10.1016/j.neucom.2020.02.046 2. Wang, Y., Song, J., Zhoua, K., Liua, Y.: Image alignment based perceptual image hash for content authentication. Sig. Process. Image Commun. 80, 115642 (2020). https://doi.org/10.1016/j.image.2019.115642 3. Chaidaroon, S., Park, D.H., Chang, Y., Fang, Y.: node2hash: graph aware deep semantic text hashing. Inf. Process. Manage. 57(6), 102143 (2020). https://doi. org/10.1016/j.ipm.2019.102143 4. Du, L., Ho, A., Cong, R.: Perceptual hashing for image authentication: a survey. Sig. Process. Image Commun. 81, 115713 (2020). https://doi.org/10.1016/j.image. 2019.115713 5. Tan, L., Sun, X.: Robust text hashing for content-based document authentication. Inf. Technol. J. 10(8), 1608–1613 (2011). https://doi.org/10.3923/itj.2011.1608. 1613 6. Vill´ an, R., Voloshynovskiy, S., Koval, O.J., Deguillaume, F., Pun, T.K.: Tamperproofing of electronic and printed text documents via robust hashing and datahiding. In: Proceedings of SPIE - The International Society for Optical Engineering, San Jose, CA, United States, vol. 6505, pp. 1–12 (2007). https://doi.org/10.1117/ 12.704097 7. Tkachenko, I., Gomez-Kr¨ amer, P.: Robustness of character recognition techniques to double print-and-scan process. In: Proceedings of the 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), Kyoto, Japan, pp. 27–32 (2017). https://doi.org/10.1109/ICDAR.2017.392


8. Wang, J., Zhang, T., Song, J., Sebe, N., Shen, H.T.: A survey on learning to hash. IEEE Trans. Pattern Anal. Mach. Intell. 40(4), 769–790 (2018). https://doi.org/ 10.1109/TPAMI.2017.2699960 9. Andoni, A., Indyk, P.: Near-optimal hashing algorithms for approximate nearest neighbor in high dimensions. In: Proceedings of 47th Annual IEEE Symposium on Foundations of Computer Science (FOCS), Berkeley, CA, pp. 459–468 (2006). https://doi.org/10.1109/FOCS.2006.49 10. Weiss, Y., Torralba, A., Fergus, R.: Spectral hashing. In: Proceedings of the 21st ACM International Conference on Neural Information Processing Systems (NIPS), Vancouver, British Columbia, Canada, pp. 1753–1760 (2008) 11. Liu, W., Wang, J., Kumar, S., Chang, S-F.: Hashing with graphs. In: Proceedings of the 28th International Conference on Machine Learning (ICML), Bellevue, WA, USA, pp. 1–8 (2011) 12. Heo, J., Lee, Y., He, J., Chang, S., Yoon, S.: Spherical hashing. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Providence, RI, USA, pp. 2957–2964 (2012). https://doi.org/10.1109/CVPR.2012. 6248024 13. Wang, J., Kumar, S., Chang, S.-F.: Semi-supervised hashing for large-scale search. IEEE Trans. Pattern Anal. Mach. Intell. 34(12), 2393–2406 (2012). https://doi. org/10.1109/TPAMI.2012.48 14. Kong, W., Wu-Jun Li, W.J.: Isotropic hashing. In: Proceedings of the 25th ACM International Conference on Neural Information Processing Systems, 57 Morehouse Lane, Red Hook, NY, United States, pp. 11646–1654 (2012) 15. Gong, Y., Lazebnik, S., Gordo, A., Perronnin, F.: Iterative quantization: a procrustean approach to learning binary codes for large-scale image retrieval. IEEE Trans. Pattern Anal. Mach. Intell. 35(12), 2916–2929 (2013). https://doi.org/10. 1109/TPAMI.2012.193 16. He, K., Wen, F., Sun, J.: K-means hashing: an affinity-preserving quantization method for learning binary compact codes. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Portland, OR, USA, pp. 2938–2945 (2013). https://doi.org/10.1109/CVPR.2013.378 17. Liong, V.E., Lu, J., Wang, G., Moulin, P., Zhou, J.: Deep hashing for compact binary codes learning. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, pp. 2475–2483 (2015). https:// doi.org/10.1109/CVPR.2015.7298862 18. Huang, S., Xiong, Y., Zhang, Y., Wang, J.: Unsupervised triplet hashing for fast image retrieval. In: Proceedings of the on Thematic Workshops of ACM Multimedia, California, USA, pp. 84–92 (2017). https://doi.org/10.1145/3126686.3126773 19. Shen, F., Xu, Y., Liu, L., Yang, Y., Huang, Z., Shen, H.T.: Unsupervised deep hashing with similarity-adaptive and discrete optimization. IEEE Trans. Pattern Anal. Mach. Intell. 40(12), 3034–3044 (2018). https://doi.org/10.1109/TPAMI. 2018.2789887 20. Chen, J., Cheung, W.K., Wang, A.: Learning deep unsupervised binary codes for image retrieval. In: Proceedings of the 27th International Joint Conference on Artificial Intelligence (IJCAI), Stockholm, pp. 613–619 (2018). https://doi.org/ 10.24963/ijcai.2018/85 21. Dizaji, K.G., Zheng, F., Nourabadi, N.S., Yang, Y., Deng, C., Huang, H.: Unsupervised deep generative adversarial hashing network. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, pp. 3664–3673 (2018). https://doi.org/10.1109/CVPR.2018.00386

Robust Hashing for Character Authentication and Retrieval

481

22. Wang, W., Zhang, H., Zhang, Z., Liu, L., Shao, L.: Sparse graph based selfsupervised hashing for scalable image retrieval. Inf. Sci. 547, 622–640 (2021). https://doi.org/10.1016/j.ins.2020.08.092 23. Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proc. IEEE 86(11), 2278–2324 (1998). https://doi.org/10. 1109/5.726791 24. The MNIST database. http://yann.lecun.com/exdb/mnist/. Accessed 30 Jan 2021 25. Campos, T.D., Babu, B.R., Varma, M.: Character recognition in natural images. In: Proceedings of the 4th International Conference on Computer Vision Theory and Applications (VISIGRAPP), vol. 2, pp. 273–280 (2009). https://doi.org/10. 5220/0001770102730280 26. Ouyang, J., Coatrieux, G., Shu, H.: Robust hashing for image authentication using quaternion discrete Fourier transform and log-polar transform. Digit. Sig. Process. 41, 98–109 (2015). https://doi.org/10.1016/j.dsp.2015.03.006 27. Wang, W., Shen, Y., Zhang, H., Liu, L.: Semantic-rebased cross-modal hashing for scalable unsupervised text-visual retrieval. Inf. Process. Manage. 57(6), 102374 (2020). https://doi.org/10.1016/j.ipm.2020.102374

Correction to: Accurate Graphic Symbol Detection in Ancient Document Digital Reproductions
Zahra Ziran, Eleonora Bernasconi, Antonella Ghignoli, Francesco Leotta, and Massimo Mecella

Correction to: Chapter “Accurate Graphic Symbol Detection in Ancient Document Digital Reproductions” in: E. H. Barney Smith and U. Pal (Eds.): Document Analysis and Recognition – ICDAR 2021 Workshops, LNCS 12916, https://doi.org/10.1007/978-3-030-86198-8_12

Chapter “Accurate Graphic Symbol Detection in Ancient Document Digital Reproductions” was previously published non-open access. It has now been changed to open access under a CC BY 4.0 license and the copyright holder updated to ‘The Author(s)’.

The updated version of this chapter can be found at https://doi.org/10.1007/978-3-030-86198-8_12

© The Author(s) 2022
E. H. Barney Smith and U. Pal (Eds.): ICDAR 2021 Workshops, LNCS 12916, p. C1, 2022.
https://doi.org/10.1007/978-3-030-86198-8_34

Author Index

Abdelnafea, Mohamed I-294 Agarwal, Aakash II-66 Ahmad, Irfan I-322 Akouaydi, Hanen I-390 Alaasam, Reem II-81 Al-Ghadi, Musab I-466 Alimi, Adel M. I-353, I-366, I-379, I-390 Anquetil, Eric I-125 Aussem, Alex II-455 Awal, Ahmad-Montaser I-422, II-36 Azzouza, Théo I-466

Bansal, Anukriti II-440 Bao, Yongjun II-424 Baptista Lima, Estanislau I-198 Bayer, Johannes I-20, I-74 Belagavi, Khadiravana II-206 Belaïd, Abdel II-191 Belaïd, Yolande II-191 Belhadj, Djedjiga II-191 Ben Ayed, Mounir I-353 Berg-Kirkpatrick, Taylor II-325 Bernasconi, Eleonora I-147 Biswas, Prabir Kumar II-98 Biswas, Sanket II-525 Blumenstein, Michael II-158 Bogdan, Karina O. M. I-229 Boubaker, Houcine I-353, I-379, I-390 Bres, Stephane II-277 Brown-DeVost, Bronson II-317 Burie, Jean-Christophe I-466, II-142

Calvez, Philippe II-504 Calvo-Zaragoza, Jorge I-59 Camps, Jean-Baptiste II-306 Can, Yekta Said I-312 Castellanos, Francisco J. I-59 Chakraborty, Akash II-98 Chanda, Sukalpa II-158 Chaudhuri, Parag I-213 Chen, Jialuo I-141 Chen, Shanxiong II-51 Corbillé, Simon I-125 Coustaty, Mickaël I-466, II-240

d’Andecy, Vincent Poulain II-240 Decours-Perez, Aliénor I-265, II-295 Dengel, Andreas I-20, I-74 Dentamaro, Vincenzo II-7 Deshpande, Akshay Praveen I-112 Dey, Abhisek I-91 Dhieb, Thameur I-353 Douzon, Thibault II-455 Droby, Ahmad II-81 Duffner, Stefan II-455 Dupin, Boris I-265

Eglin, Véronique II-277, II-455 Eicher, Owen I-245 Elarian, Yousef I-322 Elbaati, Abdelkarim I-366 Elghazel, Haytham II-455 El-Sana, Jihad II-81 Eltay, Mohamed I-322 Elyan, Eyad II-268 Espinas, Jérémy II-455

Farmer, Denzel I-245 Feng, Ruiqi II-51 Fornés, Alicia I-141 Furukawa, Takeshi I-407 Garcia, Christophe II-455 Geovanna Soares, Aline I-198 Ghanmi, Nabil I-422, II-36 Ghignoli, Antonella I-147 Giglio, Paolo II-7 Girard, Nathalie I-125 Gomez-Krämer, Petra I-466 Gunna, Sanjana I-182, I-282 Hamdani, Tarek M. I-366, I-379 Hamdi, Yahia I-379 Hantach, Rim II-504 Hasegawa, Junichi I-167 Heid, Ulrich II-475 Heshmat, Samia I-294 Ho, D. K. II-488 Hu, Yucheng II-225

Impedovo, Donato II-7 Inoue, Ryunosuke I-167 Iwana, Brian Kenji II-126 Jablonski, Pawel II-317 Jain, Hiteshi II-377 Jain, Tanmay II-158 Jawahar, C. V. I-182, I-282 Joseph, Aurélie II-240 Joshi, Divyansh II-440 Josi, Frieda II-475 Kabadayı, M. Erdem I-312 Kadota, Takeaki I-97 Kanematsu, Atsuko I-167 Kerroumi, Mohamed II-389 Kirsten, Lucas N. I-229 Král, Pavel I-43 Krichen, Omar I-125 Krishnaswamy, Uma II-416 Larson, Stefan II-416 Leal, Rovilson I-229 Lechuga, Gisela II-504 Leite Dantas Bezerra, Byron I-198 Lenc, Ladislav I-43 Leotta, Francesco I-147 Li, Peng II-424 Li, Yakun I-20 Li, Yiyan I-245 Li, Yong II-424 Lladós, Josep I-28, II-525 Long, Haixu II-21 López-Gutiérrez, Juan C. I-59 Lu, Tong II-158 Lucas, Noëmie I-265 Ly, Nam Tuan II-364 Madi, Boraq II-81 Mahamoud, Ibrahim Souleiman II-240 Maheshwari, Saarthak II-416 Majid, Nishatul I-245 Martínek, Jiří I-43 Mecella, Massimo I-147 Medjram, Sofiane II-277 Megeto, Guilherme A. S. I-229 Memon, Shahram II-268 Miao, Suting II-225 Milios, Evangelos II-174, II-509 Mitra, Aniruddha II-66

Miyazaki, Shinya I-167 Mizanur Rahman, Syed I-74 Moetesum, Momina I-451 Monmarché, N. II-488 Moreno-García, Carlos Francisco II-268 Morita, Naoto I-167 Morizumi, Kei II-113 Mukherjee, Prerana II-440 Mukhopadhyay, Jayanta II-98

Nabli, Cyrine I-422 Naka, Takatoshi I-167 Nakagawa, Masaki I-7, II-113, II-364, II-403 Nasir, Sidra I-451 Neitthoffer, Timothée II-36 Nerdeux, Pauline I-125 Ngo, Trung Tan II-364 Nguyen, Cuong Tuan I-7, II-113, II-403 Nguyen, Hung Tuan I-7, II-113, II-364, II-403 Nguyen, Minh-Tien II-353 Nguyen, Tuan-Anh D. II-353 Nishi, Shintaro I-97 Njah, Sourour I-353, I-390 Ogier, Jean-Marc II-240 Pal, Umapada II-158, II-525 Paulus, Erick II-142 Pegu, Bhanupriya II-66 Piffaretti, Adrien II-277 Pirlo, Giuseppe II-7 Poddar, Arnab II-98 Potlapalli, Vaishnav Rao I-112 Prabhu, Nishant II-377 Prantl, Martin I-43 Puteaux, Pauline I-437 Rabhi, Besma I-366 Ramakrishnan, Ganesh I-213 Ramel, J. Y. II-488 Rehman, Abdul I-336 Riba, Pau I-28, II-525 Saeidi, Mozhgan II-174, II-509 Sage, Clément II-455 Salah, Clément I-265 Saluja, Rohit I-182, I-213, I-282 Sarvadevabhatla, Ravi Kiran I-112, II-206 Sayem, Othmane II-389

Serbaeva, Olga II-339 Shabou, Aymen II-389 Shafait, Faisal I-336 Shivakumara, Palaiahnakote II-158 Siddiqi, Imran I-451 Singh, Arun Pratap II-440 Singh, Karamjit II-66 Singh, Maneet II-66 Singh, Navtej II-416 Skelton, Christina II-325 Son, Nguyen Hong II-353 Sood, Shivam I-213 Souibgui, Mohamed Ali I-141 Souza, Gustavo I-229 Srivatsan, Nikita II-325 Stewart, Shanti II-416 Stökl Ben Ezra, Daniel II-317 Surana, Nitish II-158 Suso, Albert I-28 Tadimeti, Pranav II-206 Tanaka, Hiroshi II-255 Terrades, Oriol Ramos I-28 Thoma, Felix I-20 Timothée, Jobert II-277 Tkachenko, Iuliia I-437 Toral, Luis II-268 Torras, Pau I-141 Tripathi, Abhishek II-377 Tripathi, Devashish II-440 Truong, Thanh-Nghia I-7, II-403 Tsuji, Kaigen II-126


Uchida, Seiichi I-97, II-126 Ul-Hasan, Adnan I-336 Ung, Huy Quang I-7, II-403 Valente, Augusto C. I-229 Valero-Mas, Jose J. I-59 Vega, Jason II-325 Verbeek, Fons J. II-142 Vernet, Marguerite II-306 Vidal-Gorène, Chahan I-265, II-295, II-306 Voerman, Joris II-240 Vu, Hieu M. II-353 Wang, Dingwang II-51 Wang, Fazheng II-21 Wang, Yibo II-21 Wang, Zhihao II-21 Wartena, Christian II-475 White, Stephen II-339 Wu, Yuli II-225 Yamada, Masashi I-167 Yan, Weipeng II-424 Yu, Yanwei II-21 Yuan, Pingguang II-424 Zaied, Mourad I-390 Zanibbi, Richard I-91 Zeh, Norbert II-174, II-509 Zhang, Shixue II-51 Zhao, Fujia II-51 Zidouri, Abdelmalek I-322 Ziran, Zahra I-147