Machine Learning Engineering with Python [Second Edition]
9781837631964
Transform your machine learning projects into successful deployments with this practical guide on how to build and scale
English
Pages 601
Year 2023
Table of contents:
Preface
Who this book is for
What this book covers
To get the most out of this book
Get in touch
Introduction to ML Engineering
Technical requirements
Defining a taxonomy of data disciplines
Data scientist
ML engineer
ML operations engineer
Data engineer
Working as an effective team
ML engineering in the real world
What does an ML solution look like?
Why Python?
High-level ML system design
Example 1: Batch anomaly detection service
Example 2: Forecasting API
Example 3: Classification pipeline
Summary
The Machine Learning Development Process
Technical requirements
Setting up our tools
Setting up an AWS account
Concept to solution in four steps
Comparing this to CRISP-DM
Discover
Using user stories
Play
Develop
Selecting a software development methodology
Package management (conda and pip)
Poetry
Code version control
Git strategies
Model version control
Deploy
Knowing your deployment options
Understanding DevOps and MLOps
Building our first CI/CD example with GitHub Actions
Continuous model performance testing
Continuous model training
Summary
From Model to Model Factory
Technical requirements
Defining the model factory
Learning about learning
Defining the target
Cutting your losses
Preparing the data
Engineering features for machine learning
Engineering categorical features
Engineering numerical features
Designing your training system
Training system design options
Train-run
Train-persist
Retraining required
Detecting data drift
Detecting concept drift
Setting the limits
Diagnosing the drift
Remediating the drift
Other tools for monitoring
Automating training
Hierarchies of automation
Optimizing hyperparameters
Hyperopt
Optuna
AutoML
auto-sklearn
AutoKeras
Persisting your models
Building the model factory with pipelines
Scikit-learn pipelines
Spark ML pipelines
Summary
Packaging Up
Technical requirements
Writing good Python
Recapping the basics
Tips and tricks
Adhering to standards
Writing good PySpark
Choosing a style
Object-oriented programming
Functional programming
Packaging your code
Why package?
Selecting use cases for packaging
Designing your package
Building your package
Managing your environment with Makefiles
Getting all poetic with Poetry
Testing, logging, securing, and error handling
Testing
Securing your solutions
Analyzing your own code for security issues
Analyzing dependencies for security issues
Logging
Error handling
Not reinventing the wheel
Summary
Deployment Patterns and Tools
Technical requirements
Architecting systems
Building with principles
Exploring some standard ML patterns
Swimming in data lakes
Microservices
Event-based designs
Batching
Containerizing
Hosting your own microservice on AWS
Pushing to ECR
Hosting on ECS
Building general pipelines with Airflow
Airflow
Airflow on AWS
Revisiting CI/CD for Airflow
Building advanced ML pipelines
Finding your ZenML
Going with the Kubeflow
Selecting your deployment strategy
Summary
Scaling Up
Technical requirements
Scaling with Spark
Spark tips and tricks
Spark on the cloud
AWS EMR example
Spinning up serverless infrastructure
Containerizing at scale with Kubernetes
Scaling with Ray
Getting started with Ray for ML
Scaling your compute for Ray
Scaling your serving layer with Ray
Designing systems at scale
Summary
Deep Learning, Generative AI, and LLMOps
Going deep with deep learning
Getting started with PyTorch
Scaling and taking deep learning into production
Fine-tuning and transfer learning
Living it large with LLMs
Understanding LLMs
Consuming LLMs via API
Coding with LLMs
Building the future with LLMOps
Validating LLMs
PromptOps
Summary
Building an Example ML Microservice
Technical requirements
Understanding the forecasting problem
Designing our forecasting service
Selecting the tools
Training at scale
Serving the models with FastAPI
Response and request schemas
Managing models in your microservice
Pulling it all together
Containerizing and deploying to Kubernetes
Containerizing the application
Scaling up with Kubernetes
Deployment strategies
Summary
Building an Extract, Transform, Machine Learning Use Case
Technical requirements
Understanding the batch processing problem
Designing an ETML solution
Selecting the tools
Interfaces and storage
Scaling of models
Scheduling of ETML pipelines
Executing the build
Building an ETML pipeline with advanced Airflow features
Summary
Other Books You May Enjoy
Index