Table of contents : Cover Page Title Page Copyright Page Dedication About the Author Acknowledgement Preface Errata Table of Contents 1. Data Science Overview Structure Objectives Evolution of data analytics Define data science Domain knowledge Mathematical and scientific techniques Tools and technology Data science analysis types Data science job roles ML model development process Data visualizations Result communication Responsible and ethical AI Career in data science Conclusion 2. Mathematics Essentials Structure Objectives Introduction to linear algebra Scalar, vectors, matrices, and tensors Scalar Vectors Matrices Tensors The determinant Eigenvalues and Eigenvectors Eigenvalue decomposition and Singular Value Decomposition (SVD) Singular value decomposition Principal component analysis Multivariate calculus Differential Calculus Sum rule Power rule Special cases Trigonometric functions Product rule Chain rule Quotient rule Multiple variables Partial differentiation Total derivative Integral calculus Slices Definite vs.indefinite integrals The Gradient The Jacobian The Hessian The Lagrange multipliers Laplace interpolation Optimization The Gradient Descent algorithm Conclusion 3. Statistics Essentials Structure Objectives Introduction to probability and statistics Descriptive statistics The measure of central tendency Mean Median Mode Measures of variability Range Variance Covariance Standard Deviation Measure of asymmetry Modality Skewness Populations and samples Central Limit Theorem Sampling distribution Conditional probability Random variables Inferential statistics Probability distributions What is a probability distribution? Normal distribution Binomial distribution Poisson distribution Geometric distribution Exponential distribution Conclusion 4. Exploratory Data Analysis Structure Objectives What is EDA? Need for the EDA Understanding data Categorical variables Numeric variables Binning (numeric to categorical) Encoding Methods of EDA Key concepts of EDA Conclusion 5. Data Preprocessing Structure Objectives Introduction to data preprocessing Methods in data preprocessing Transformation into vectors Normalization Dealing with the missing values Conclusion 6. Feature Engineering Structure Objectives Introduction to feature engineering Importance of feature variable Feature engineering in machine learning Feature engineering techniques Imputation Handling outliers Binning Log Transform One-hot encoding Grouping operations Categorical column grouping Numerical column grouping Feature split Scaling Extracting date Applying feature engineering Conclusion 7. Machine Learning Algorithms Structure Objectives Introduction to machine learning Brief history of machine learning Classification of machine learning algorithms Top 10 algorithms of machine learning explained Building a machine learning model Conclusion 8. Productionizing Machine Learning Models Structure Objectives Types of ML production system Batch prediction Batch learning REST APIs Online learning Introduction to REST APIs Application Programming Interface (APIs) Hyper Text Transfer Protocol (HTTP) Client-server architecture Resource Flask framework Simple flask application Salary prediction model ML model user interface HTML template Conclusion 9. Data Flows in Enterprises Structure Objectives Introducing data pipeline Designing data pipeline ETL vs. ELT Scheduling jobs Messaging queue Passing arguments to data pipeline Conclusion 10. Introduction to Databases Structure Objectives Modern databases and terminology Relational database or SQL database Install PostgreSQL and pgAdmin Set-up a database and table Connect Python to Postgres Modify data pipeline to store in Postgres Document-oriented database or No-SQL Install MongoDB and compass client Create a database and collection Connect Python to MongoDB Modify data pipeline to store in MongoDB Graph databases Install and start Neo4j Add nodes and relations Filesystem as storage What is Filesystem? Filesystem as data store Hierarchy to store CSV Conclusion 11. Introduction to Big Data Structure Objectives Introducing Big Data Definition of Big Data Introducing Hadoop Hadoop Distributed File System (HDFS) MapReduce YARN Hadoop common Setting-up a Hadoop Cluster Installing a Hadoop Cluster Starting Hadoop cluster in Docker Word-count MapReduce Program Map program Reducer program MapReduce JAR Running Word Count in HDFS Cluster Conclusion 12. DevOps for Data Science Structure Objectives Introduction to DevOps Agile methodology, CI/CD, and DevOps DevOps for data science Source Code Management Quality Assurance Model objects andsecurity Production deployment Communication and collaboration Conclusion 13. Introduction to Cloud Computing Structure Objectives Introducing cloud computing Operating system model What is virtualization? What is cloud computing? Types of cloud services Infrastructure as a Service (IaaS) Platform as a Service (PaaS) Software as a Service (SaaS) Types of cloud infrastructure Public cloud Private cloud Hybrid cloud Data science and cloud computing Data Compute Integration Deployment Market growth of cloud Conclusion 14. Deploy Model to Cloud Structure Objectives Register for GCP free account GCP console Create VM and its properties Connecting and uploading code to VM Executing Python model on cloud Access the model via browser Scaling the resources in Cloud Conclusion 15. Introduction to Business Intelligence Structure Objectives What is business intelligence? Business intelligence analysis Business intelligence process Step 1: Data awareness Data types Data sources Step 2: Store data Data models Data storage Step 3: Business needs Key Performance Indicators (KPI) Data Visuals Step 4: a Visualization tool Time to insight Ease of use Step 5: Enable platform Data access Business users Business intelligence trends Gartner 2019 Magic Quadrant Conclusion 16. Data Visulazation Tools Structure Objectives Introduction to data visualization Data visualization types Data visualization tools Visualization tool features Introduction to Microsoft Power BI Use case Microsoft Power BI Microsoft Power BI console Load the data Create data visuals Publish the visuals Conclusion 17. Industry Use Case 1 - Form Assist Structure Objective Abstract Introduction Related Work Proposed work Work architecture NIST dataset Activation function – ReLU Dropout Data augmentation Optimization Feature extraction Image thresholding Classifier Results Conclusion Acknowledgment References 18. Industry Use Case 2 - People Reporter Structure Objective Abstract Introduction Event detection Work architecture Results Nipah virus outbreak in Kerala CSK enters the final of IPL 2018: OnePlus 6 launched in India Conclusion Acknowledgment References 19. Data Science Learning Resources Structure Objective Books Online courses Competitions Blogs and magazines University courses Conferences and events Meet-ups and interest groups YouTube channels and Podcasts Analytic reports and white paper Talk to people Conclusion 20. Do It Your Self Challenges Structure Objectives DIY challenge 1 – Analyzing the pathological slide for blood analysis Challenge overview Challenge statement Target users Resources IP source DIY challenge 2 – IoT based weather monitoring system Challenge overview Challenge statement Target Users Resources IP source DIY challenge 3 – Facial image-based BMI calculator Challenge overview Challenge statement Target users Resources IP source DIY challenge 4 – Chatbot assistant for Tourism in North East Challenge overview Challenge statement Target users Resources IP source DIY challenge 5 – Assaying and grading of fruits for e-procurement Challenge overview Challenge statement Target users Resources IP source Conclusion 21. Qs for DS Assessment Structure Objectives Data Science Overview Mathematics Essentials Statistics Essentials Exploratory Data Analysis Data Preprocessing Feature Engineering Machine Learning Algorithms Productionizing Machine Learning Models Data Flows in Enterprises Introduction to Databases Introduction to Big Data DevOps for Data Science Introduction to Cloud Computing Deploy Model to Cloud Introduction to Business Intelligence Data Visualization Tools Conclusion