Data Science and Predictive Analytics. Biomedical and Health Applications using R [2 ed.]
9783031174827, 9783031174834
English
Pages [944]
Year 2023
Table of contents:
Preface
Second Edition Preface
DSPA Application and Use Disclaimer
Biomedical, Biosocial, Environmental, and Health Disclaimer
Book Content
Notations
Contents
Chapter 1: Introduction
1.1 Motivation
1.1.1 DSPA Mission and Objectives
1.1.2 Examples of Driving Motivational Problems and Challenges
1.1.2.1 Alzheimer's Disease
1.1.2.2 Parkinson's Disease
1.1.2.3 Swiss Cancer Study
1.1.2.4 Amyotrophic Lateral Sclerosis
1.1.2.5 Normal Brain Visualization
1.1.2.6 Neurodegeneration
1.1.2.7 Genomics Computing
1.1.2.7.1 Genetic Forensics - 2013-2016 Ebola Outbreak
1.1.2.7.2 Next-Generation Sequence (NGS) Analysis
1.1.2.7.3 Neuroimaging-Genetics
1.1.3 Common Characteristics of Big (Biomedical and Health) Data
1.1.4 Data Science
1.1.5 Predictive Analytics
1.1.6 High-Throughput Big Data Analytics
1.1.7 Examples of Data Repositories, Archives, and Services
1.1.8 Responsible Data Science and Ethical Predictive Analytics
1.1.8.1 Promoting FAIR Resource Sharing
1.1.8.2 Research Ethics
1.1.8.3 Understanding the Benefits and Detriments of Analytical Findings
1.1.8.4 Regulatory and Practical Issues in Handling Sensitive Data
1.1.8.5 Protection of Sensitive Information Versus Data Utility
1.1.8.6 Resource Provenance and Longevity
1.1.8.7 Examples of Inappropriate, Fake, or Malicious Use of Resources
1.1.9 DSPA Expectations
1.2 Foundations of R
1.2.1 Why Use R?
1.2.2 Getting Started with R
1.2.2.1 Install Basic Shell-Based R
1.2.2.2 GUI-Based R Invocation (RStudio)
1.2.2.3 RStudio GUI Layout
1.2.2.4 Software Updates
1.2.2.5 Some Notes
1.2.2.6 Help
1.2.2.7 Simple Wide-to-Long Data Format Translation
1.2.2.8 Data Generation
1.2.2.9 Input/Output (I/O)
1.2.2.10 Slicing and Extracting Data
1.2.2.11 Variable Conversion and Meta-Data
1.2.2.12 Data Selection and Manipulation
1.2.3 Mathematics, Statistics, and Optimization
1.2.3.1 Math Functions
1.2.3.2 Matrix Operations
1.2.3.3 Optimization and Model Fitting
1.2.3.4 Statistics
1.2.3.5 Distributions
1.2.4 Advanced Data Processing
1.2.4.1 Strings
1.2.5 Basic Plotting
1.2.5.1 QQ Normal Probability Plot
1.2.5.2 Low-Level Plotting Commands
1.2.5.3 General Graphics Parameters
1.2.6 Basic R Programming
1.2.7 Data Simulation Primer
1.3 Practice Problems
1.3.1 Long-to-Wide Data Format Translation
1.3.2 Data Frames
1.3.3 Data Stratification
1.3.4 Simulation
1.3.5 Programming
1.4 Appendix
1.4.1 Tidyverse
1.4.2 Additional R Documentation and Resources
1.4.3 HTML SOCR Data Import
1.4.4 R Debugging
Chapter 2: Basic Visualization and Exploratory Data Analytics
2.1 Data Handling
2.1.1 Saving and Loading R Data Structures
2.1.2 Importing and Saving Data from CSV Files
2.1.3 Importing Data from ZIP and SAV Files
2.1.4 Exploring the Structure of Data
2.1.5 Exploring Numeric Variables
2.1.6 Measuring Central Tendency - Mean, Median, and Mode
2.1.7 Measuring Spread - Variance, Quartiles, and the Five-Number Summary
2.1.8 Visualizing Numeric Variables - Boxplots
2.1.9 Visualizing Numeric Variables - Histograms
2.1.10 Uniform and Normal Distributions
2.1.11 Exploring Categorical Variables
2.1.12 Exploring Relationships Between Variables
2.1.12.1 Visualizing Relationships - Scatterplots
2.1.12.2 Examining Relationships - Two-Way Cross-Tabulations
2.1.13 Missing Data
2.1.13.1 Multivariate Data Simulation
2.1.13.2 TBI Data Example
2.1.13.3 Imputation via Expectation-Maximization
2.1.13.3.1 Types of Missing Data
2.1.13.3.2 General Idea of the EM Algorithm
2.1.13.3.3 EM-Based Imputation
2.1.13.3.4 Manual Implementation of EM-Based Imputation
2.1.13.3.5 Plotting the Complete and Imputed Data
2.1.13.3.6 Validation of EM-Imputation Using the R Package Amelia
2.1.14 Parsing Web Pages and Visualizing Tabular HTML Data
2.1.15 Cohort-Rebalancing (for Imbalanced Groups)
2.1.15.1 Example 1: Parkinson's Diseases Study
2.2 Exploratory Data Analytics (EDA)
2.2.1 Classification of Visualization Methods
2.2.2 Composition
2.2.2.1 Histograms and Density Plots
2.2.2.2 Pie Chart
2.2.2.3 Heat Map
2.2.3 Comparison
2.2.3.1 Paired Scatter Plots
2.2.3.2 Bar Plots
2.2.3.3 Trees and Graphs
2.2.3.4 Correlation Plots
2.2.4 Relationships
2.2.4.1 Data Modeler
2.2.4.1.1 Loading the Spectral Crystallography Data
2.2.4.1.2 Sample Distributions
2.2.4.1.3 Fitting Single-Sample Univariate Distribution Models
2.2.4.1.4 Visual Inspection
2.2.4.1.5 Quantitative Summaries
2.2.4.1.6 Mixture Distribution Data Modeling
2.2.4.1.7 Mixture-Distribution Model Fitting and Parameter Estimation
2.2.4.1.8 Plotting the Mixture Distribution Models
2.2.4.1.9 Reporting Model Parameter Estimates
2.2.4.2 2D Kernel Density and 3D Surface Plots
2.2.4.3 3D and 4D Visualizations
2.3 Practice Problems
2.3.1 Data Manipulation
2.3.2 Bivariate Relations
2.3.3 Missing Data
2.3.4 Surface Plots
2.3.5 Unbalanced Groups
2.3.6 Common Plots
2.3.7 Trees and Graphs
2.3.8 Data EDA Examples
2.3.9 Data Reports
Chapter 3: Linear Algebra, Matrix Computing, and Regression Modeling
3.1 Linear Algebra
3.1.1 Building Matrices
3.1.2 Matrix Subscripts
3.1.3 Addition and Subtraction
3.1.4 Multiplication
3.1.4.1 Element-Wise Multiplication
3.1.4.2 Matrix Multiplication (Product)
3.1.4.3 Matrix Inversion (Division)
3.2 Matrix Computing
3.2.1 Solving Systems of Equations
3.2.2 The Identity Matrix
3.2.3 Vectors, Matrices, and Scalars
3.2.4 Sample Statistics
3.2.5 Applications of Matrix Algebra in Linear Modeling
3.2.6 Finding Function Extrema (Min/Max) Using Calculus
3.2.7 Linear Modeling in R
3.3 Eigenspectra - Eigenvalues and Eigenvectors
3.4 Matrix Notation
3.5 Linear Regression
3.5.1 Sample Covariance Matrix
3.6 Linear Multivariate Regression Modeling
3.6.1 Simple Linear Regression
3.6.2 Ordinary Least Squares Estimation
3.6.3 Regression Model Assumptions
3.6.4 Correlations
3.6.5 Multiple Linear Regression
3.7 Case Study 1: Baseball Players
3.7.1 Step 1: Collecting Data
3.7.2 Step 2: Exploring and Preparing the Data
3.7.2.1 Exploring Relationships Among Features - The Correlation Matrix
3.7.2.2 Multicollinearity and Feature-Selection in High-Dimensional Data
3.7.2.3 Visualizing Relationships Between Features
3.7.3 Step 3: Training a Model on the Data
3.7.4 Step 4: Evaluating Model Performance
3.7.5 Step 5: Improving Model Performance
3.7.5.1 Adding Nonlinear Relationships
3.7.5.2 Converting a Numeric Variable to a Binary Indicator
3.7.5.3 Adding Interaction Effects
3.8 Regression Trees and Model Trees
3.8.1 Adding Regression to Trees
3.9 Bayesian Additive Regression Trees (BART)
3.9.1 1D Simulation
3.9.2 Higher-Dimensional Simulation
3.9.3 Heart Attack Hospitalization Case-Study
3.9.4 Another Look at Case Study 2: Baseball Players
3.9.4.1 Step 3: Training a Model on the Data
3.9.4.2 Visualizing Regression Decision Trees
3.9.4.3 Step 4: Evaluating Model Performance
3.9.4.4 Measuring Performance with Mean Absolute Error
3.10 Practice Problems
3.10.1 How Is Matrix Multiplication Defined?
3.10.2 Scalar Versus Matrix Multiplication
3.10.3 Matrix Equations
3.10.4 Least Square Estimation
3.10.5 Matrix Manipulation
3.10.6 Matrix Transposition
3.10.7 Sample Statistics
3.10.8 Eigenvalues and Eigenvectors
3.10.9 Regression Forecasting Using Numerical Data
Chapter 4: Linear and Nonlinear Dimensionality Reduction
4.1 Motivational Example: Reducing 2D to 1D
4.2 Matrix Rotations
4.3 Summary (PCA, ICA, and FA)
4.4 Principal Component Analysis (PCA)
4.4.1 Principal Components
4.5 Independent Component Analysis (ICA)
4.6 Factor Analysis (FA)
4.7 Singular Value Decomposition (SVD)
4.7.1 SVD Summary
4.8 t-Distributed Stochastic Neighbor Embedding (t-SNE)
4.8.1 t-SNE Formulation
4.8.2 t-SNE Example: Hand-Written Digit Recognition
4.9 Uniform Manifold Approximation and Projection (UMAP)
4.9.1 Mathematical Formulation
4.9.2 Hand-Written Digits Recognition
4.9.3 Apply UMAP for Class-Prediction Using New Data
4.10 UMAP Parameters
4.10.1 Stability, Replicability, and Reproducibility
4.10.2 UMAP Interpretation
4.11 Dimensionality Reduction Case Study (Parkinson's Disease)
4.11.1 Step 1: Collecting Data
4.11.2 Step 2: Exploring and Preparing the Data
4.11.3 PCA
4.11.4 Factor Analysis (FA)
4.11.5 t-SNE
4.11.6 Uniform Manifold Approximation and Projection (UMAP)
4.12 Practice Problems
4.12.1 Parkinson's Disease Example
4.12.2 Allometric Relations in Plants Example
4.12.3 3D Volumetric Brain Study
Chapter 5: Supervised Classification
5.1 k-Nearest Neighbor Approach
5.2 Distance Function and Dummy Coding
5.2.1 Estimation of the Hyperparameter k
5.2.2 Rescaling of the Features
5.2.3 Rescaling Formulas
5.2.4 Case Study: Youth Development
5.2.5 Case Study: Predicting Galaxy Spins
5.3 Probabilistic Learning - Naïve Bayes Classification
5.3.1 Overview of the Naïve Bayes Method
5.3.2 Model Assumptions
5.3.3 Bayes Formula
5.3.4 The Laplace Estimator
5.3.5 Case Study: Head and Neck Cancer Medication
5.4 Decision Trees and Divide-and-Conquer Classification
5.4.1 Motivation
5.4.1.1 Hands-On Example: Iris Data
5.4.2 Decision Tree Overview
5.4.2.1 Divide and Conquer
5.4.2.2 Entropy
5.4.2.3 Misclassification Error and Gini Index
5.4.2.4 C5.0 Decision Tree Algorithm
5.4.2.5 Pruning the Decision Tree
5.4.3 Case Study 1: Quality of Life and Chronic Disease
5.4.4 Classification Rules
5.5 Case Study 2: QoL in Chronic Disease (Take 2)
5.6 Practice Problems
5.6.1 Iris Species
5.6.2 Cancer Study
5.6.3 Baseball Data
5.6.4 Medical Specialty Text-Notes Classification
5.6.5 Chronic Disease Case Study
Chapter 6: Black Box Machine Learning Methods
6.1 Neural Networks
6.1.1 From Biological to Artificial Neurons
6.1.2 Activation Functions
6.2 Network Topology
6.2.1 Network Layers
6.2.2 Training Neural Networks with Backpropagation
6.2.3 Case Study 1: Google Trends and the Stock Market - Regression
6.2.4 Simple NN Demo - Learning to Compute
6.2.5 Case Study 2: Google Trends and the Stock Market - Classification
6.3 Support Vector Machines (SVM)
6.3.1 Classification with Hyperplanes
6.3.1.1 Finding the Maximum Margin
6.3.1.2 Linearly Separable Data
6.3.1.3 Nonlinearly Separable Data
6.3.1.4 Using Kernels for Nonlinear Spaces
6.3.2 Case Study 3: Optical Character Recognition (OCR)
6.3.3 Case Study 4: Iris Flowers
6.3.4 Parameter Tuning
6.3.5 Improving the Performance of Gaussian Kernels
6.4 Ensemble Meta-Learning
6.4.1 Bagging
6.4.2 Boosting
6.4.3 Random Forests
6.4.4 Random Forest Algorithm (Pseudo Code)
6.4.4.1 Training Random Forests
6.4.4.2 Evaluating Random Forest Performance
6.4.5 Adaptive Boosting
6.5 Practice Problems
6.5.1 Problem 1: Google Trends and the Stock Market
6.5.2 Problem 2: Quality of Life and Chronic Disease
Chapter 7: Qualitative Learning Methods - Text Mining, Natural Language Processing, and Apriori Association Rules Learning
7.1 Natural Language Processing (NLP) and Text Mining (TM)
7.1.1 A Simple NLP/TM Example
7.1.2 Case Study: Job Ranking
7.1.3 Area Under ROC Curve
7.1.4 TF-IDF
7.1.4.1 Term Frequency (TF)
7.1.4.2 Inverse Document Frequency (IDF)
7.1.4.3 TF-IDF
7.1.5 Cosine Similarity
7.1.6 Sentiment Analysis
7.1.7 NLP/TM Analytics
7.2 Apriori Association Rules Learning
7.2.1 Association Rules
7.2.2 The Apriori Algorithm for Association Rule Learning
7.2.3 Rule Support and Confidence
7.2.4 Building a Set of Rules with the Apriori Principle
7.2.5 A Toy Example
7.2.6 Case Study 1: Head and Neck Cancer Medications
7.2.6.1 Visualizing Item Support - Item Frequency Plots
7.2.6.2 Visualizing Transaction Data - Plotting the Sparse Matrix
7.2.7 Graphical Depiction of Association Rules
7.2.8 Saving Association Rules to a File or a Data Frame
7.3 Summary
7.4 Practice Problems
7.4.1 Groceries
7.4.2 Titanic Passengers
Chapter 8: Unsupervised Clustering
8.1 ML Clustering
8.2 Silhouette Plots
8.3 The k-Means Clustering Algorithm
8.3.1 Pseudocode
8.3.2 Choosing the Appropriate Number of Clusters
8.3.3 Case Study 1: Divorce and Consequences on Young Adults
8.3.4 Model Improvement
8.3.4.1 Tuning the Hyperparameter k
8.3.5 Case Study 2: Pediatric Trauma
8.3.6 Feature Selection for k-Means Clustering
8.4 Hierarchical Clustering
8.5 Spectral Clustering
8.5.1 Image Segmentation Using Spectral Clustering
8.5.2 Point Cloud Segmentation Using Spectral Clustering
8.6 Gaussian Mixture Models
8.7 Summary
8.8 Practice Problems
8.8.1 Youth Development
Chapter 9: Model Performance Assessment, Validation, and Improvement
9.1 Measuring the Performance of Classification Methods
9.2 Evaluation Strategies
9.2.1 Binary Outcomes
9.2.2 Cross Tables, Contingency Tables, and Confusion-Matrices
9.2.3 Other Measures of Performance Beyond Accuracy
9.2.3.1 Silhouette Coefficient
9.2.3.2 The Kappa (κ) Statistic
9.2.3.3 Summary of the Kappa Score for Calculating Prediction Accuracy
9.2.3.4 Sensitivity and Specificity
9.2.3.5 Precision and Recall
9.2.3.6 The F-Measure
9.2.4 Visualizing Performance Tradeoffs (ROC Curve)
9.3 Estimating Future Performance (Internal Statistical Cross-validation)
9.3.1 The Holdout Method
9.3.2 Cross-validation
9.3.3 Bootstrap Sampling
9.4 Improving Model Performance by Parameter Tuning
9.4.1 Using Caret for Automated Parameter Tuning
9.5 Customizing the Tuning Process
9.6 Comparing the Performance of Several Alternative Models
9.7 Forecasting Types and Assessment Approaches
9.7.1 Overfitting
9.7.1.1 Example (US Presidential Elections)
9.7.1.2 Example (Google Flu Trends)
9.7.1.3 Example (Autism)
9.8 Internal Statistical Cross-validation
9.8.1 Example (Linear Regression)
9.8.2 Cross-validation Methods
9.8.2.1 Exhaustive Cross-validation
9.8.2.2 Nonexhaustive Cross-validation
9.8.3 Case Studies
9.8.3.1 Example 1: Prediction of Parkinson's Disease Using Adaptive Boosting (AdaBoost)
9.8.3.2 Example 2: Sleep Dataset
9.8.3.3 Example 3: Model-Based (Linear Regression) Prediction Using the Attitude Dataset
9.8.3.4 Example 4: Parkinson's Data (PPMI data)
9.8.4 Summary of CV Output
9.8.5 Alternative Predictor Functions
9.8.5.1 Logistic Regression
9.8.5.2 Quadratic Discriminant Analysis (QDA)
9.8.6 Foundation of LDA and QDA for Prediction, Dimensionality Reduction, or Forecasting
9.8.6.1 LDA (Linear Discriminant Analysis)
9.8.6.2 QDA (Quadratic Discriminant Analysis)
9.8.6.3 Neural Network
9.8.6.4 SVM
9.8.6.5 k-Nearest Neighbors Algorithm (k-NN)
9.8.6.6 k-Means Clustering (k-MC)
9.8.6.7 Spectral Clustering
9.8.6.8 Iris Petal Data
9.8.6.9 Spirals Data
9.8.7 Comparing Multiple Classifiers
Chapter 10: Specialized Machine Learning Topics
10.1 Working with Specialized Data and Databases
10.1.1 Data Format Conversion
10.1.2 Querying Data in SQL Databases
10.1.3 SparQL Queries
10.1.4 Real Random Number Generation
10.1.5 Downloading the Complete Text of Web Pages
10.1.6 Reading and Writing XML with the XML Package
10.1.7 Web Page Data Scraping
10.1.8 Parsing JSON From Web APIs
10.1.9 Reading and Writing Microsoft Excel Spreadsheets Using XLSX
10.2 Working with Domain-Specific Data
10.2.1 Working with Bioinformatics Data
10.2.2 Visualizing Network Data
10.3 Data Streaming
10.3.1 Definition
10.3.2 The stream Package
10.3.3 Synthetic Example - Random Gaussian Stream
10.3.4 Generate the Stream
10.3.4.1 K-Means Clustering
10.3.5 Sources of Data Streams
10.3.5.1 Static Structure Streams
10.3.5.2 Concept Drift Streams
10.3.5.3 Real Data Streams
10.3.6 Printing, Plotting, and Saving Streams
10.3.7 Stream Animation
10.3.8 Case Study: SOCR Knee Pain Data
10.3.9 Data Stream Clustering and Classification (DSC)
10.3.10 Evaluation of Data Stream Clustering
10.4 Optimization and Improving the Computational Performance
10.4.1 Generalizing Tabular Data Structures with dplyr
10.4.2 Making Data Frames Faster with data.table
10.4.3 Creating Disk-Based Data Frames with ff
10.4.4 Using Massive Matrices with bigmemory
10.5 Parallel Computing
10.5.1 Measuring Execution Time
10.5.2 Parallel Processing with Multiple Cores
10.5.3 Parallelization Using foreach and doParallel
10.5.4 GPU Computing
10.6 Deploying Optimized Learning Algorithms
10.6.1 Building Bigger Regression Models with biglm
10.6.2 Growing Bigger and Faster Random Forests with bigrf
10.6.3 Training and Evaluation Models in Parallel with caret
10.7 R Notebook Support for Other Programming Languages
10.7.1 R-Python Integration
10.7.2 Installing Python
10.7.3 Install the reticulate Package
10.7.4 Installing and Importing Python Modules
10.7.5 Python-Based Data Modeling
10.7.6 Visualization of the Results in R
10.7.7 R Integration with C/C++
10.8 Practice Problem
Chapter 11: Variable Importance and Feature Selection
11.1 Feature Selection Methods
11.1.1 Filtering Techniques
11.1.2 Wrapper
11.1.3 Embedded Techniques
11.1.4 Random Forest Feature Selection
11.1.5 Case Study - ALS
11.2 Regularized Linear Modeling and Controlled Variable Selection
11.2.1 General Questions
11.2.2 Model Regularization
11.2.3 Matrix Notation
11.2.4 Regularized Linear Modeling
11.2.4.1 Ridge Regression
11.2.4.2 Least Absolute Shrinkage and Selection Operator (LASSO) Regression
11.2.5 Predictor Standardization
11.2.6 Estimation Goals
11.2.7 Linear Regression
11.2.8 Drawbacks of Linear Regression
11.2.8.1 Assessing Prediction Accuracy
11.2.8.2 Estimating the Prediction Error
11.2.8.3 Improving the Prediction Accuracy
11.2.9 Variable Selection
11.2.10 Simple Regularization Framework
11.2.10.1 Role of the Penalty Term
11.2.10.2 Role of the Regularization Parameter
11.2.10.3 LASSO
11.2.11 General Regularization Framework
11.2.12 Likelihood Ratio Test (LRT), False Discovery Rate (FDR), and Logistic Transform
11.2.12.1 Likelihood Ratio Test (LRT)
11.2.12.2 False Discovery Rate (FDR)
11.2.12.3 Graphical Interpretation of the Benjamini-Hochberg (BH) Method
11.2.12.4 FDR Adjusting the p-Values
11.2.13 Logistic Transformation
11.2.13.1 Example: Heart Transplant Surgery
11.2.14 Implementation of Regularization
11.2.15 Computational Complexity
11.2.16 LASSO and Ridge Solution Paths
11.2.17 Regression Solution Paths - Ridge vs. LASSO
11.2.18 Choice of the Regularization Parameter
11.2.19 Cross-validation Motivation
11.2.20 n-Fold Cross-validation
11.2.21 LASSO 10-Fold Cross-validation
11.2.22 Stepwise OLS (Ordinary Least Squares)
11.2.23 Final Models
11.2.24 Model Performance
11.2.25 Summary
11.3 Knockoff Filtering (FDR-Controlled Feature Selection)
11.3.1 Simulated Knockoff Example
11.3.2 Knockoff Invocation
11.3.3 PD Neuroimaging-Genetics Case Study
11.4 Practice Problems
Chapter 12: Big Longitudinal Data Analysis
12.1 Classical Time-Series Analytic Approaches
12.1.1 Information Theoretic Model Evaluation Criteria
12.1.2 Time-Series Analysis
12.1.2.1 Autoregressive Integrated Moving Average Extended/Exogenous (ARIMAX) Model
12.1.2.2 Simulated ARIMAX Example
12.1.2.3 Google Trends Analytics
12.1.3 Structural Equation Modeling (SEM) - Latent Variables
12.1.3.1 Foundations of SEM
12.1.3.2 SEM Components
12.1.3.3 Case Study - Parkinson's Disease (PD)
12.1.3.4 Outputs of Lavaan SEM
12.1.4 Longitudinal Data Analysis - Linear Mixed Model
12.1.4.1 Mean Trend
12.1.4.2 Modeling the Correlation
12.1.5 Generalized Estimating Equations (GEE)
12.1.5.1 GEE Versus GLMM
12.1.6 PD/PPMI Case Study: SEM, GLMM, and GEE Modeling
12.2 Network-Based Approaches
12.2.1 Background
12.2.2 Recurrent Neural Networks (RNN)
12.2.3 Tensor Format Representation
12.2.4 Simulated RNN Case Study
12.2.5 Climate Data Study
12.2.5.1 Examine the ACF
12.2.6 Keras-Based Multicovariate LSTM Time-Series Analysis and Forecasting
12.2.6.1 Using Keras to Model Stateful LSTM Time-Series
12.2.6.2 Definitions
12.2.6.3 Keras Modeling of Time-Series Data
12.2.6.4 Keras Modeling of Image Classification Data (CIFAR10)
Chapter 13: Function Optimization
13.1 General Optimization Approach
13.1.1 First-Order Gradient-Based Optimization
13.1.2 Second-Order Hessian-Based Optimization
13.1.2.1 Newton's Method
13.1.2.2 Broyden-Fletcher-Goldfarb-Shanno (BFGS) Method
13.1.3 Gradient-Free Optimization
13.2 Free (Unconstrained) Optimization
13.2.1 Example 1: Minimizing a Univariate Function (Inverse-CDF)
13.2.2 Example 2: Minimizing a Bivariate Function
13.2.3 Example 3: Using Simulated Annealing to Find the Maximum of an Oscillatory Function
13.3 Constrained Optimization
13.3.1 Equality Constraints
13.3.2 Lagrange Multipliers
13.3.3 Inequality Constrained Optimization
13.3.3.1 Linear Programming (LP)
13.3.3.2 Mixed Integer Linear Programming (MILP)
13.3.4 Quadratic Programming (QP)
13.4 General Nonlinear Optimization
13.4.1 Dual Problem Optimization
13.4.1.1 Motivation
13.4.1.2 Example 1: Linear Example
13.4.1.3 Example 2: Quadratic Example
13.4.1.4 Example 3: More Complex Nonlinear Optimization
13.4.1.5 Example 4: Another Linear Example
13.5 Manual Versus Automated Lagrange Multiplier Optimization
13.6 Data Denoising
13.7 Sparse Matrices
13.8 Parallel Computing
13.9 Foundational Methods for Function Optimization
13.9.1 Basics
13.9.2 Gradient Descent
13.9.2.1 Gradient Descent Pseudo Algorithm
13.9.2.2 Example
13.9.2.3 Summary of Gradient Descent
13.9.3 Convexity
13.9.4 Foundations of the Newton-Raphson's Method
13.9.4.1 Newton-Raphson Method Pseudocode
13.9.4.2 Advantages and Disadvantages of the Newton-Raphson Method
13.9.5 Stochastic Gradient Descent
13.9.5.1 Stochastic Gradient Descent Pseudocode
13.9.6 Simulated Annealing (SANN)
13.9.6.1 SANN Pseudocode
13.10 Hands-On Examples
13.10.1 Example 1: Healthcare Manufacturer Product Optimization
13.10.1.1 Unconstrained Optimization
13.10.1.2 Constrained Optimization
13.10.1.3 Manual Solution
13.10.1.4 R-Based Automated Solution
13.10.2 Example 2: Optimization of the Booth's Function
13.10.3 Example 3: Extrema of the Bivariate Goldstein-Price Function
13.10.4 Example 4: Bivariate Oscillatory Function
13.10.5 Nonlinear Constraint Optimization Problem
13.11 Examples of Explicit Optimization Use in AI/ML
13.12 Practice Problems
Chapter 14: Deep Learning, Neural Networks
14.1 Perceptrons
14.2 Biological Relevance
14.3 Simple Neural Net Examples
14.3.1 Exclusive OR (XOR) Operator
14.3.2 NAND Operator
14.3.3 Complex Networks Designed Using Simple Building Blocks
14.4 Neural Network Modeling Using Keras
14.4.1 Iterations - Samples, Batches, and Epochs
14.4.2 Use-Case: Predicting Titanic Passenger Survival
14.4.3 EDA/Visualization
14.4.4 Data Preprocessing
14.4.5 Keras Modeling
14.4.6 NN Model Fitting
14.4.7 Convolutional Neural Networks (CNNs)
14.4.8 Model Exploration
14.4.9 Passenger Survival Forecasting Using New Data
14.4.10 Fine-Tuning the NN Model
14.4.11 Model Export and Import
14.5 Case Studies
14.5.1 Classification Example Using Sonar Data
14.5.2 Schizophrenia Neuroimaging Study
14.5.3 ALS Regression Example
14.5.4 IBS Study
14.5.5 Country QoL Ranking Data
14.5.6 Handwritten Digits Classification
14.5.6.1 Configuring the Neural Network
14.5.6.2 Training
14.5.6.3 Forecasting
14.5.6.4 Examining the Network Structure
14.5.6.5 Model Validation
14.6 Classifying Real-World Images Using Pretrained Tensorflow and Keras Models
14.6.1 Load the Pretrained Model
14.6.2 Load and Preprocess a New Image
14.6.3 Image Classification
14.6.4 Additional Image Classification Examples
14.6.4.1 Lake Mapourika, New Zealand
14.6.4.2 Beach
14.6.4.3 Volcano
14.6.4.4 Brain Surface
14.6.4.5 Face Mask: Synthetic Face Image
14.7 Data Generation: Simulating Synthetic Data
14.7.1 Fractal Shapes
14.7.2 Fake Images
14.7.3 Generative Adversarial Networks (GANs)
14.7.3.1 CIFAR10 Archive
14.7.3.2 Generator (G)
14.7.3.3 Discriminator
14.7.3.4 Training the DCGAN
14.7.3.5 Elements of the DCGAN Training
14.8 Transfer Learning
14.8.1 Text Classification Using Deep Network Transfer Learning
14.8.1.1 Binary Transfer Learning Label-Classification of Clinical Text
14.8.1.1.1 Define a Fresh New model1 De Novo
14.8.1.2 Naïve, Out-of-the-Box, Prior-Model Assessment (Without Retraining)
14.8.1.3 Simple Transfer Learning
14.8.1.4 Full-Scale Transfer Learning
14.8.2 Multinomial Transfer Learning Classification of Clinical Text
14.8.3 Binary Classification of Film Reviews
14.9 Image Classification
14.9.1 Performance Metrics
14.9.2 Torch Deep Convolutional Neural Network (CNN)
14.9.2.1 Data Import
14.9.2.2 Torch-Based Transfer Learning
14.9.3 Tensorflow Image Preprocessing Pipeline
14.9.3.1 Notes About the Tensorflow Pipeline Protocol
14.9.3.2 Network Layers
14.9.3.3 Model Tracking and Network Visualization
14.10 Additional References
14.11 Practice Problems
14.11.1 Deep Learning Classification
14.11.2 Deep Learning Regression
14.11.3 Image Classification
14.11.4 (Challenging Problem) Deep Convolutional Networks for 3D Volume Segmentation
Summary
Electronic Appendix (Table 1)
Glossary (Table 2)
Index