Table of contents:

Cover image
Title page
Table of Contents
Copyright
Dedication
Preface
Acknowledgment
Foreword
Foreword

1: Introduction
  Abstract
  1.1. Data, information, and knowledge
  1.2. Data Science: the art of data exploration
  1.3. What is not Data Science?
  1.4. Data Science tasks
  1.5. Data Science objectives
  1.6. Applications of Data Science
  1.7. How to read the book?
  References

2: Data, sources, and generation
  Abstract
  2.1. Introduction
  2.2. Data attributes
  2.3. Data-storage formats
  2.4. Data sources
  2.5. Data generation
  2.6. Summary
  References

3: Data preparation
  Abstract
  3.1. Introduction
  3.2. Data cleaning
  3.3. Data reduction
  3.4. Data transformation
  3.5. Data normalization
  3.6. Data integration
  3.7. Summary
  References

4: Machine learning
  Abstract
  4.1. Introduction
  4.2. Machine Learning paradigms
  4.3. Inductive bias
  4.4. Evaluating a classifier
  4.5. Summary
  References

5: Regression
  Abstract
  5.1. Introduction
  5.2. Regression
  5.3. Evaluating linear regression
  5.4. Multidimensional linear regression
  5.5. Polynomial regression
  5.6. Overfitting in regression
  5.7. Reducing overfitting in regression: regularization
  5.8. Other approaches to regression
  5.9. Summary
  References

6: Classification
  Abstract
  6.1. Introduction
  6.2. Nearest-neighbor classifiers
  6.3. Decision trees
  6.4. Support-Vector Machines (SVM)
  6.5. Incremental classification
  6.6. Summary
  References

7: Artificial neural networks
  Abstract
  7.1. Introduction
  7.2. From biological to artificial neuron
  7.3. Multilayer perceptron
  7.4. Learning by backpropagation
  7.5. Loss functions
  7.6. Activation functions
  7.7. Deep neural networks
  7.8. Summary
  References

8: Feature selection
  Abstract
  8.1. Introduction
  8.2. Steps in feature selection
  8.3. Principal-component analysis for feature reduction
  References

9: Cluster analysis
  Abstract
  9.1. Introduction
  9.2. What is cluster analysis?
  9.3. Proximity measures
  9.4. Exclusive clustering techniques
  9.5. High-dimensional data clustering
  9.6. Biclustering
  9.7. Cluster-validity measures
  9.8. Summary
  References

10: Ensemble learning
  Abstract
  10.1. Introduction
  10.2. Ensemble-learning framework
  10.3. Supervised ensemble learning
  10.4. Unsupervised ensemble learning
  10.5. Semisupervised ensemble learning
  10.6. Issues and challenges
  10.7. Summary
  References

11: Association-rule mining
  Abstract
  Acknowledgement
  11.1. Introduction
  11.2. Association analysis: basic concepts
  11.3. Frequent itemset-mining algorithms
  11.4. Association mining in quantitative data
  11.5. Correlation mining
  11.6. Distributed and parallel association mining
  11.7. Summary
  References

12: Big Data analysis
  Abstract
  12.1. Introduction
  12.2. Characteristics of Big Data
  12.3. Types of Big Data
  12.4. Big Data analysis problems
  12.5. Big Data analytics techniques
  12.6. Big Data analytics platforms
  12.7. Big Data analytics architecture
  12.8. Tools and systems for Big Data analytics
  12.9. Active challenges
  12.10. Summary
  References

13: Data Science in practice
  Abstract
  13.1. Need of Data Science in the real world
  13.2. Hands-on Data Science with Python
  13.3. Dataset preprocessing
  13.4. Feature selection and normalization
  13.5. Classification
  13.6. Clustering
  13.7. Summary
  References

14: Conclusion
  Abstract

Index