An Introduction to Image Classification: From Designed Models to End-to-End Learning [1 ed.]
ISBN: 9819978815, 9789819978816, 9789819978823, 9789819978847
Image classification is a critical component in computer vision tasks and has numerous applications. Traditional methods
Language: English
Pages: xvi, 290
Year: 2024
Table of contents:
Preface
What the Book Is About
How Should the Book Be Used
How the Book Is Structured
Acknowledgments
Contents
Chapter 1: Image Classification: A Computer Vision Task
1.1 What Is Image Classification and Why Is It Difficult?
1.2 Image Classification as a Structured Process
1.3 Python Implementation Details
1.3.1 Python Development Environment
1.3.2 Basic Modules
1.3.3 Some Basic Operations on Images
1.4 Exercises
1.4.1 Programming Project P1.1: Reading and Displaying MNIST Data
1.4.2 Exercise Questions
References
Chapter 2: Image Features: Extraction and Categories
2.1 Image Acquisition Artifacts
2.2 Using Pixel Values as Features
2.3 Using Texture Features
2.3.1 Haralick's Texture Features
2.3.2 Gabor Filter Banks
2.3.3 Local Binary Pattern
2.3.4 Using Texture Measures for Image Classification
2.4 Using Edges and Corners
2.4.1 Edge Detection
2.4.2 Corners
2.5 HOG Features
2.6 SIFT Features and the Bag-of-Visual-Words Approach
2.6.1 SIFT Computation
2.6.2 From SIFT to Secondary Features: Bag of Visual Words
2.7 Exercises
2.7.1 Programming Project P2.1: Orientation Histograms
2.7.2 Programming Project P2.2: A Cluster Measure in Feature Space
2.7.3 Exercise Questions
References
Chapter 3: Feature Reduction
3.1 Unsupervised Feature Reduction
3.1.1 Selection Based on Feature Variance
3.1.2 Principal Component Analysis
3.2 Supervised Feature Reduction
3.2.1 Forward and Backward Feature Selection
3.2.2 Linear Discriminant Analysis
3.3 Exercises
3.3.1 Programming Project P3.1: Comparing Variance in Feature Space
3.3.2 Exercise Questions
References
Chapter 4: Bayesian Image Classification in Feature Space
4.1 Bayesian Decision Making
4.2 Generative Classification Models
4.2.1 Likelihood Functions from Feature Histograms
4.2.2 Parametrized Density Functions as Likelihood Functions
4.3 Practicalities of Classifier Training
4.3.1 The Use of Benchmark Databases
4.3.2 Feature Normalization
4.3.3 Training and Test Data
4.3.4 Cross-validation
4.3.5 Hyperparameters
4.3.6 Measuring the Classifier Performance
4.3.7 Imbalanced Data Sets
4.4 Exercises
4.4.1 Programming Project P4.1: Classifying MNIST Data
4.4.2 Exercise Questions
References
Chapter 5: Distance-Based Classifiers
5.1 Nearest Centroid Classifier
5.1.1 Using the Euclidean Distance
5.1.2 Using the Mahalanobis Distance
5.2 The kNN Classifier
5.2.1 Why Does the kNN Classifier Estimate a Posteriori Probabilities?
5.2.2 Efficient Estimation by Space Partitioning
5.3 Exercises
5.3.1 Programming Project P5.1: Features and Classifiers
5.3.2 Exercise Questions
References
Chapter 6: Decision Boundaries in Feature Space
6.1 Heuristic Linear Decision Boundaries
6.1.1 Linear Decision Boundary
6.1.2 Non-linear Decision Boundaries
6.1.3 Solving a Multiclass Problem
6.1.4 Interpretation of Sample Distance from the Decision Boundary
6.2 Support Vector Machines
6.2.1 Optimization of a Support Vector Machine
6.2.2 Soft Margins
6.2.3 Kernel Functions
6.2.4 Extensions to Multiclass Problems
6.3 Logistic Regression
6.3.1 Binomial Logistic Regression
6.3.2 Multinomial Logistic Regression
6.3.3 Kernel Logistic Regression
6.4 Ensemble Models
6.4.1 Bagging
6.4.2 Boosting
6.5 Exercises
6.5.1 Programming Project P6.1: Support Vector Machines
6.5.2 Programming Project P6.2: Label the Imagenette Data I
6.5.3 Exercise Questions
References
Chapter 7: Multi-Layer Perceptron for Image Classification
7.1 The Perceptron
7.1.1 Feedforward Step
7.1.2 Logistic Regression by a Perceptron
7.1.3 Stochastic Gradient Descent, Batches, and Minibatches
7.2 Multi-Layer Perceptron
7.2.1 A Universal, Trainable Classifier
7.2.2 Networks with More Than Two Layers
7.3 Training a Multi-Layer Perceptron
7.3.1 The Backpropagation Algorithm
7.3.2 The Adam Optimizer
7.4 Exercises
7.4.1 Programming Project P7.1: MNIST and CIFAR10 Labeling by MLP
7.4.2 Exercise Questions
References
Chapter 8: Feature Extraction by Convolutional Neural Network
8.1 The Convolution Layer
8.1.1 Limited Perceptive Field, Shared Weights, and Filters
8.1.2 Border Treatment
8.1.3 Multichannel Input
8.1.4 Stride
8.2 Convolutional Building Blocks
8.2.1 Sequences of Convolution Layers in a CBB
8.2.2 Pooling
8.2.3 1 × 1 Convolutions
8.2.4 Stacking Building Blocks
8.3 End-to-End Learning
8.3.1 Gradient Descent in a Convolutional Neural Network
8.3.2 Initial Experiments
8.4 Exercises
8.4.1 Programming Project P8.1: Inspection of a Trained Network
8.4.2 Exercise Questions
References
Chapter 9: Network Set-Up for Image Classification
9.1 Network Design
9.1.1 Convolution Layers in a CBB
9.1.2 Border Treatment in the Convolution Layer
9.1.3 Pooling for Classification
9.1.4 How Many Convolutional Building Blocks?
9.1.5 Inception Blocks
9.1.6 Fully Connected Layers
9.1.7 The Activation Function
9.2 Data Set-Up
9.2.1 Preparing the Training Data
9.2.2 Data Augmentation
9.3 Exercises
9.3.1 Programming Project P9.1: End-to-End Learning to Label CIFAR10
9.3.2 Programming Project P9.2: CIFAR10 Labeling with Data Augmentation
9.3.3 Exercise Questions
References
Chapter 10: Basic Network Training for Image Classification
10.1 Training, Validation, and Test
10.1.1 Early Stopping of Network Training
10.1.2 Fixing Further Hyperparameters
10.2 Basic Decisions for Network Training
10.2.1 Weight Initialization
10.2.2 Loss Functions
10.2.3 Optimizers, Learning Rate, and Minibatches
10.2.4 Label Smoothing
10.3 Analyzing Loss Curves
10.3.1 Loss After Convergence Is Still Too High
10.3.2 Loss Does Not Reach a Minimum
10.3.3 Training and Validation Loss Deviate
10.3.4 Random Fluctuation of the Loss Curves
10.3.5 Discrepancy Between Validation and Test Results
10.4 Exercises
10.4.1 Programming Project P10.1: Label the Imagenette Data II
10.4.2 Exercise Questions
References
Chapter 11: Dealing with Training Deficiencies
11.1 Advanced Augmentation Techniques
11.1.1 Cutout Augmentation
11.1.2 Adding Noise to the Input
11.1.3 Adversarial Attacks
11.1.4 Virtual Adversarial Training
11.1.5 Data Augmentation by a Generative Model
11.1.6 Semi-supervised Learning with Unlabeled Samples
11.2 Improving Training
11.2.1 Transfer Learning
11.2.2 Weight Regularization
11.2.3 Batch Normalization and Weight Normalization
11.2.4 Ensemble Learning and Dropout
11.2.5 Residual Neural Networks
11.3 Exercises
11.3.1 Programming Project P11.1: Transfer Learning
11.3.2 Programming Project P11.2: Label the Imagenette Data III
11.3.3 Programming Project P11.3: Residual Networks
11.3.4 Exercise Questions
References
Chapter 12: Learning Effects and Network Decisions
12.1 Inspection of Trained Filters
12.1.1 Display of Trained Filter Values
12.1.2 Deconvolution for Analyzing Filter Influence
12.1.3 About Linearly Separable Features
12.2 Optimal Input
12.3 How Does a Network Decide?
12.3.1 Occlusion Analysis
12.3.2 Class Activation Maps
12.3.3 Grad-CAM
12.4 Exercises
12.4.1 Programming Project P12.1: Compute Optimal Input Images
12.4.2 Programming Project P12.2: Occlusion Analysis
12.4.3 Exercise Questions
References
Index