Table of contents : Foreword Preface Acknowledgements Contents About the Authors Abbreviations Part IIntroduction 1 Overview and Contributions 1.1 Introduction 1.2 Research Issues on Unsupervised Outlier Detection 1.3 Overview of the Book 1.4 Contributions 1.5 Conclusions 2 Developments in Unsupervised Outlier Detection Research 2.1 Introduction 2.1.1 A Brief Overview of the Early Developments in Outlier Analysis 2.2 Some Standard Unsupervised Outlier Detection Approaches 2.2.1 Probabilistic Model-Based Outlier Detection Approach 2.2.2 Clustering-Based Outlier Detection Approaches 2.2.3 Distance-Based Outlier Detection Approaches 2.2.4 Density-Based Outlier Detection Approaches 2.2.5 Outlier Detection for Time Series 2.3 Performance Evaluation Metrics of Outlier Detection Approaches 2.3.1 Precision, Recall and Rank Power 2.4 Conclusions References Part IINew Developments in Unsupervised Outlier Detection Research 3 A Fast Distance-Based Outlier Detection Technique Using a Divisive Hierarchical Clustering Algorithm 3.1 Introduction 3.2 Related Work 3.2.1 Distance-Based Outlier Detection Research 3.2.2 A Divisive Hierarchical Clustering Algorithm for Approximate kNN Search 3.2.3 An Efficiency Analysis of DHCA for Distance-Based Outlier Detection 3.3 The Proposed Fast Distance-Based Outlier Detection Algorithm 3.3.1 A Simple Idea 3.3.2 The Proposed CPU-Efficient DB-Outlier Detection Method 3.3.3 Time Complexity Analysis 3.3.4 Data Structure for Implementing DHCA 3.4 Scale to Very Large Databases with I/O Efficiency 3.5 Performance Evaluation 3.5.1 Data Characteristics 3.5.2 The Impact of Input K on Running Time 3.5.3 Comparison with Other Methods 3.5.4 Effectiveness of DHCA for kNN Search 3.5.5 The Impact of Curse of Dimensionality 3.5.6 Scale to Very Large Databases with I/O Efficiency 3.5.7 Discussion 3.6 Conclusions References 4 A k-Nearest Neighbor Centroid-Based Outlier Detection Method 4.1 Introduction 4.2 K-means Clustering and Its Application to Outlier Detection 4.2.1 K-means Clustering 4.2.2 K-means Clustering-Based Outlier Detection 4.3 A kNN-Centroid-Based Outlier Detection Algorithm 4.3.1 General Idea 4.3.2 Definition for an Outlier Indicator 4.3.3 Formal Definition of kNN-Based Centroid 4.3.4 Two New Formulations of Outlier Factors 4.3.5 Determination of k 4.3.6 The Complexity Analysis 4.3.7 The Proposed Outlier Detection Algorithm 4.4 A Performance Study 4.4.1 Performance on Synthetic Datasets 4.4.2 Performance on Real Datasets 4.4.3 Performance on High-Dimensional Real Datasets 4.4.4 Discussion 4.5 Conclusions References 5 A Minimum Spanning Tree Clustering-Inspired Outlier Detection Technique 5.1 Introduction 5.2 Background 5.2.1 Minimum Spanning Tree-Based Clustering 5.2.2 Minimum Spanning Tree Clustering-Based Outlier Detection 5.3 An Improved MST-Clustering-Inspired Outlier Detection Algorithm 5.3.1 A Simple Idea 5.3.2 Two New Outlier Factors 5.3.3 The Proposed MST-Clustering-Inspired Outlier Detection Algorithm 5.3.4 Time Complexity Analysis 5.4 A Performance Study 5.4.1 Performance on Synthetic Datasets 5.4.2 Performance on Multi-dimensional Real Datasets 5.4.3 Performance of the Proposed Algorithm with Varying SOM-TH 5.5 Concluding Remarks References 6 A k-Nearest Neighbour Spectral Clustering-Based Outlier Detection Technique 6.1 Introduction 6.2 Spectral Clustering and Its Application to Outlier Detection 6.2.1 Preliminaries 6.2.2 Spectral Clustering-Based Outlier Detection 6.3 The Proposed Spectral Clustering-Based Outlier Mining Algorithm 6.3.1 A Simple Idea 6.3.2 The Proposed Outlier Detection Algorithm 6.3.3 Complexity Analysis 6.4 Experimental Results 6.4.1 Performance of Our Algorithm on Synthetic Data 6.4.2 Performance of Our Algorithm on Real Data 6.5 Discussion 6.6 Conclusions References 7 Enhancing Outlier Detection by Filtering Out Core Points and Border Points 7.1 Introduction 7.2 Related Work 7.2.1 Density-Based Clustering with DBSCAN 7.2.2 Density-Based Clustering for Outlier Detection 7.3 The Proposed Enhancer for Outlier Mining 7.3.1 A Simple Idea 7.3.2 Some Definitions 7.3.3 Our Proposed Outlier Detection Algorithm 7.3.4 The Complexity Analysis 7.4 Experiments and Results 7.4.1 Performance of Our Algorithm on Synthetic Data 7.4.2 Performance of Our Algorithm on Real Data 7.5 Conclusions References Part IIIApplications 8 An Effective Boundary Point Detection Algorithm Via k-Nearest Neighbors-Based Centroid 8.1 Introduction 8.2 Related Work 8.2.1 Outlier and Boundary Point Detection 8.2.2 EMST Algorithms 8.3 Boundary Point Detection Based on kNN Centroid 8.3.1 Definitions 8.3.2 The Proposed Boundary Point and Outlier Detection Algorithm 8.3.3 The Complexity Analysis 8.4 The Proposed Fast Approximate EMST Algorithm 8.4.1 Our Clustering-Inspired EMST Algorithm 8.4.2 Time Complexity Analysis 8.5 Experiments and Results 8.5.1 Performance Evaluation of the Proposed Boundary Point Detection Algorithm 8.5.2 Performance Evaluation of the Fast Approximate EMST Algorithm 8.6 Conclusions References 9 A Nearest Neighbor Classifier-Based Automated On-Line Novel Visual Percept Detection Method 9.1 Introduction 9.2 A Percept Learning System 9.2.1 Feature Generation 9.2.2 Similarity Measure 9.2.3 Percept Formation 9.2.4 A Fast Approximate Nearest Neighbor Classifier 9.3 An On-Line Novelty Detection Method 9.3.1 A Threshold Selection Method 9.3.2 Eight-Connected Structure Element Filter 9.3.3 Tree Update Method 9.4 Experiments and Results 9.4.1 Experiment I: An Indoor Environment 9.4.2 Experiment II: An Outdoor Environment 9.5 Conclusions References 10 Unsupervised Fraud Detection in Environmental Time Series Data 10.1 Introduction 10.2 Related Work 10.2.1 Point Outliers 10.2.2 Shape Outliers 10.3 Method 10.3.1 A Simple Idea 10.3.2 Selecting an Appropriate Threshold for Fraud Detection 10.3.3 The Complexity Analysis 10.4 Experiments and Results 10.4.1 Fraud Detection on Wastewater Discharge Concentration Data 10.4.2 Fraud Detection on Gas Emission Concentration Data 10.5 Conclusions References