New Developments in Unsupervised Outlier Detection: Algorithms and Applications 9789811595189, 9789811595196

This book enriches unsupervised outlier detection research by proposing several new distance-based and density-based out

223 74 10MB

English Pages 277 [287] Year 2021

Table of contents :
Foreword
Preface
Acknowledgements
Contents
About the Authors
Abbreviations
Part IIntroduction
1 Overview and Contributions
1.1 Introduction
1.2 Research Issues on Unsupervised Outlier Detection
1.3 Overview of the Book
1.4 Contributions
1.5 Conclusions
2 Developments in Unsupervised Outlier Detection Research
2.1 Introduction
2.1.1 A Brief Overview of the Early Developments in Outlier Analysis
2.2 Some Standard Unsupervised Outlier Detection Approaches
2.2.1 Probabilistic Model-Based Outlier Detection Approach
2.2.2 Clustering-Based Outlier Detection Approaches
2.2.3 Distance-Based Outlier Detection Approaches
2.2.4 Density-Based Outlier Detection Approaches
2.2.5 Outlier Detection for Time Series
2.3 Performance Evaluation Metrics of Outlier Detection Approaches
2.3.1 Precision, Recall and Rank Power
2.4 Conclusions
References
Part IINew Developments in Unsupervised Outlier Detection Research
3 A Fast Distance-Based Outlier Detection Technique Using a Divisive Hierarchical Clustering Algorithm
3.1 Introduction
3.2 Related Work
3.2.1 Distance-Based Outlier Detection Research
3.2.2 A Divisive Hierarchical Clustering Algorithm for Approximate kNN Search
3.2.3 An Efficiency Analysis of DHCA for Distance-Based Outlier Detection
3.3 The Proposed Fast Distance-Based Outlier Detection Algorithm
3.3.1 A Simple Idea
3.3.2 The Proposed CPU-Efficient DB-Outlier Detection Method
3.3.3 Time Complexity Analysis
3.3.4 Data Structure for Implementing DHCA
3.4 Scale to Very Large Databases with I/O Efficiency
3.5 Performance Evaluation
3.5.1 Data Characteristics
3.5.2 The Impact of Input K on Running Time
3.5.3 Comparison with Other Methods
3.5.4 Effectiveness of DHCA for kNN Search
3.5.5 The Impact of Curse of Dimensionality
3.5.6 Scale to Very Large Databases with I/O Efficiency
3.5.7 Discussion
3.6 Conclusions
References
4 A k-Nearest Neighbor Centroid-Based Outlier Detection Method
4.1 Introduction
4.2 K-means Clustering and Its Application to Outlier Detection
4.2.1 K-means Clustering
4.2.2 K-means Clustering-Based Outlier Detection
4.3 A kNN-Centroid-Based Outlier Detection Algorithm
4.3.1 General Idea
4.3.2 Definition for an Outlier Indicator
4.3.3 Formal Definition of kNN-Based Centroid
4.3.4 Two New Formulations of Outlier Factors
4.3.5 Determination of k
4.3.6 The Complexity Analysis
4.3.7 The Proposed Outlier Detection Algorithm
4.4 A Performance Study
4.4.1 Performance on Synthetic Datasets
4.4.2 Performance on Real Datasets
4.4.3 Performance on High-Dimensional Real Datasets
4.4.4 Discussion
4.5 Conclusions
References
5 A Minimum Spanning Tree Clustering-Inspired Outlier Detection Technique
5.1 Introduction
5.2 Background
5.2.1 Minimum Spanning Tree-Based Clustering
5.2.2 Minimum Spanning Tree Clustering-Based Outlier Detection
5.3 An Improved MST-Clustering-Inspired Outlier Detection Algorithm
5.3.1 A Simple Idea
5.3.2 Two New Outlier Factors
5.3.3 The Proposed MST-Clustering-Inspired Outlier Detection Algorithm
5.3.4 Time Complexity Analysis
5.4 A Performance Study
5.4.1 Performance on Synthetic Datasets
5.4.2 Performance on Multi-dimensional Real Datasets
5.4.3 Performance of the Proposed Algorithm with Varying SOM-TH
5.5 Concluding Remarks
References
6 A k-Nearest Neighbour Spectral Clustering-Based Outlier Detection Technique
6.1 Introduction
6.2 Spectral Clustering and Its Application to Outlier Detection
6.2.1 Preliminaries
6.2.2 Spectral Clustering-Based Outlier Detection
6.3 The Proposed Spectral Clustering-Based Outlier Mining Algorithm
6.3.1 A Simple Idea
6.3.2 The Proposed Outlier Detection Algorithm
6.3.3 Complexity Analysis
6.4 Experimental Results
6.4.1 Performance of Our Algorithm on Synthetic Data
6.4.2 Performance of Our Algorithm on Real Data
6.5 Discussion
6.6 Conclusions
References
7 Enhancing Outlier Detection by Filtering Out Core Points and Border Points
7.1 Introduction
7.2 Related Work
7.2.1 Density-Based Clustering with DBSCAN
7.2.2 Density-Based Clustering for Outlier Detection
7.3 The Proposed Enhancer for Outlier Mining
7.3.1 A Simple Idea
7.3.2 Some Definitions
7.3.3 Our Proposed Outlier Detection Algorithm
7.3.4 The Complexity Analysis
7.4 Experiments and Results
7.4.1 Performance of Our Algorithm on Synthetic Data
7.4.2 Performance of Our Algorithm on Real Data
7.5 Conclusions
References
Part IIIApplications
8 An Effective Boundary Point Detection Algorithm Via k-Nearest Neighbors-Based Centroid
8.1 Introduction
8.2 Related Work
8.2.1 Outlier and Boundary Point Detection
8.2.2 EMST Algorithms
8.3 Boundary Point Detection Based on kNN Centroid
8.3.1 Definitions
8.3.2 The Proposed Boundary Point and Outlier Detection Algorithm
8.3.3 The Complexity Analysis
8.4 The Proposed Fast Approximate EMST Algorithm
8.4.1 Our Clustering-Inspired EMST Algorithm
8.4.2 Time Complexity Analysis
8.5 Experiments and Results
8.5.1 Performance Evaluation of the Proposed Boundary Point Detection Algorithm
8.5.2 Performance Evaluation of the Fast Approximate EMST Algorithm
8.6 Conclusions
References
9 A Nearest Neighbor Classifier-Based Automated On-Line Novel Visual Percept Detection Method
9.1 Introduction
9.2 A Percept Learning System
9.2.1 Feature Generation
9.2.2 Similarity Measure
9.2.3 Percept Formation
9.2.4 A Fast Approximate Nearest Neighbor Classifier
9.3 An On-Line Novelty Detection Method
9.3.1 A Threshold Selection Method
9.3.2 Eight-Connected Structure Element Filter
9.3.3 Tree Update Method
9.4 Experiments and Results
9.4.1 Experiment I: An Indoor Environment
9.4.2 Experiment II: An Outdoor Environment
9.5 Conclusions
References
10 Unsupervised Fraud Detection in Environmental Time Series Data
10.1 Introduction
10.2 Related Work
10.2.1 Point Outliers
10.2.2 Shape Outliers
10.3 Method
10.3.1 A Simple Idea
10.3.2 Selecting an Appropriate Threshold for Fraud Detection
10.3.3 The Complexity Analysis
10.4 Experiments and Results
10.4.1 Fraud Detection on Wastewater Discharge Concentration Data
10.4.2 Fraud Detection on Gas Emission Concentration Data
10.5 Conclusions
References

New Developments in Unsupervised Outlier Detection: Algorithms and Applications
9789811595189, 9789811595196

Author / Uploaded
Xiaochun Wang
Xiali Wang
Mitch Wilkes

Similar Topics
Computers

0 0 0
Like this paper and download? You can publish your own PDF file online for free in a few minutes! Sign Up

File loading please wait...

Recommend Papers

New Developments in Unsupervised Outlier Detection: Algorithms and Applications [1st ed.] 9789811595189, 9789811595196

This book enriches unsupervised outlier detection research by proposing several new distance-based and density-based out

270 53 10MB Read more

Outlier Detection in Python (MEAP V01)

Outlier Detection in Python is a comprehensive guide to the statistical methods, machine learning, and deep learning app

117 14 8MB Read more

New Applications and Developments in Synthetic Peptide Chemistry 9783036596013

This is a reprint of articles from the Special Issue published online in the open access journal Pharmaceuticals (ISSN 1

115 96 12MB Read more

Metaheuristic Optimization Algorithms in Civil Engineering: New Applications 303045472X, 9783030454722

This book discusses the application of metaheuristic algorithms in a number of important optimization problems in civil

453 80 14MB Read more

New Algorithms, Architectures and Applications for Reconfigurable Computing 1402031270, 9781402031274

New Algorithms, Architectures and Applications for Reconfigurable Computing consists of a collection of contributions fr

117 117 4MB Read more

Landslides: Detection, Prediction and Monitoring: Technological Developments 3031238583, 9783031238581

This book intends to decipher the knowledge in the advancement of understanding, detecting, predicting, and monitoring l

190 70 18MB Read more

Machine Learning: Fundamental Algorithms for Supervised and Unsupervised Learning With Real-World Applications (Advanced Data Analytics Book 1)

Computers can't LEARN... Right?! Machine Learning is a branch of computer science that wants to stop programming co

214 49 1MB Read more

Collision Detection for Robot Manipulators: Methods and Algorithms 3031301943, 9783031301940

This book provides a concise survey and description of recent collision detection methods for robot manipulators. Beginn

164 102 3MB Read more

New Developments and Applications in Sensing Technology [1 ed.] 3642179428, 9783642179426

This book has focussed on different aspects of smart sensors and sensing technology, i.e. intelligent measurement, infor

311 157 14MB Read more

Practice of Intramedullary Locked Nails: New Developments in Techniques and Applications 3540253491, 9783540253495

The third volume of the "Practice of Intramedullary Locked Nails" places a special focus on recent advancement

118 54 12MB Read more