Pretrain Vision and Large Language Models in Python
9781804618257
Master the art of training vision and large language models with conceptual fundaments and industry-expert guidance. Lea
English
Pages 258
Year 2023
Table of contents:
Pretrain Vision and Large Language Models in Python
Foreword
Contributors
About the author
Acknowledgment
About the reviewer
Preface
Who is this book for?
What this book covers
To get the most out of this book
Download the example code files
Conventions used
Get in touch
Share Your Thoughts
Download a free PDF copy of this book
Part 1: Before Pretraining
Chapter 1: An Introduction to Pretraining Foundation Models
The art of pretraining and fine-tuning
The Transformer model architecture and self-attention
State-of-the-art vision and language models
Top vision models as of April 2023
Contrastive pretraining and natural language supervision
Top language models as of April 2023
Language technique spotlight – causal modeling and the scaling laws
Encoders and decoders
Summary
References
Chapter 2: Dataset Preparation: Part One
Finding a dataset and use case for foundation modeling
Top pretraining use cases by industry
Delta – how different is your dataset?
Use the scaling laws to size your datasets
Fundamentals – scaling laws of neural language models
Bias detection and mitigation
Enhancing your dataset – multilingual, multimodal, and augmentations
Summary
References
Chapter 3: Model Preparation
Finding your best base model
Starting with the smallest base model you can
Trade-off – simplicity versus complexity
Finding your pretraining loss function
Pretraining loss functions in vision – ViT and CoCa
Pretraining loss functions in language – Alexa Teacher Model
Changing your pretraining loss function
Solving for your model size
Practical approaches to solving for your model size
Not all scaling laws are created equal
Planning future experiments
Summary
References
Part 2: Configure Your Environment
Chapter 4: Containers and Accelerators on the Cloud
What are accelerators and why do they matter?
Getting ready to use your accelerators
How to use accelerators on AWS – Amazon SageMaker
Optimizing accelerator performance
Hyperparameters
Infrastructure optimizations for accelerators on AWS
Troubleshooting accelerator performance
Summary
References
Chapter 5: Distribution Fundamentals
Understanding key concepts – data and model parallelism
What data parallel is all about
What model parallel is all about
Combining model and data parallel
Distributed training on Amazon SageMaker
Distributed training software
SM DDP
SMP library
Advanced techniques to reduce GPU memory
Tensor parallelism
Optimizer state sharding
Activation checkpointing
Sharded data parallelism
Bringing it all home with examples from models today
Stable Diffusion – data parallelism at scale
GPT-3 – model and data parallelism at scale
Summary
References
Chapter 6: Dataset Preparation: Part Two, the Data Loader
Introducing the data loader in Python
Building and testing your own data loader – a case study from Stable Diffusion
Creating embeddings – tokenizers and other key steps for smart features
Optimizing your data pipeline on Amazon SageMaker
Transforming deep learning datasets at scale on AWS
Summary
References
Part 3: Train Your Model
Chapter 7: Finding the Right Hyperparameters
Hyperparameters – batch size, learning rate, and more
Key hyperparameters in vision and language
Tuning strategies
Hyperparameter tuning for foundation models
Scaling up as a function of world size with SageMaker
Tuning on a sample of your data and updating based on world size
Summary
References
Chapter 8: Large-Scale Training on SageMaker
Optimizing your script for SageMaker training
Importing packages
Argument parsing
Top usability features for SageMaker training
Warm pools for rapid experimentation
SSM and SSH into training instances
Track jobs and experiments to replicate results
Summary
References
Chapter 9: Advanced Training Concepts
Evaluating and improving throughput
Calculating model TFLOPS
Using Flash Attention to speed up your training runs
Speeding up your jobs with compilation
Integrating compilation into your PyTorch scripts
Amazon SageMaker Training Compiler and Neo
Best practices for compilation
Running compiled models on Amazon’s Trainium and Inferentia custom hardware
Solving for an optimal training time
Summary
References
Part 4: Evaluate Your Model
Chapter 10: Fine-Tuning and Evaluating
Fine-tuning for language, text, and everything in between
Fine-tuning a language-only model
Fine-tuning vision-only models
Fine-tuning vision-language models
Evaluating foundation models
Model evaluation metrics for vision
Model evaluation metrics in language
Model evaluation metrics in joint vision-language tasks
Incorporating the human perspective with labeling through SageMaker Ground Truth
Reinforcement learning from human feedback
Summary
References
Chapter 11: Detecting, Mitigating, and Monitoring Bias
Detecting bias in ML models
Detecting bias in large vision and language models
Mitigating bias in vision and language models
Bias mitigation in language – counterfactual data augmentation and fair loss functions
Bias mitigation in vision – reducing correlation dependencies and solving sampling issues
Monitoring bias in ML models
Detecting, mitigating, and monitoring bias with SageMaker Clarify
Summary
References
Chapter 12: How to Deploy Your Model
What is model deployment?
What is the best way to host my model?
Model deployment options on AWS with SageMaker
Why should I shrink my model, and how?
Model compilation
Knowledge distillation
Quantization
Hosting distributed models on SageMaker
Model servers and end-to-end hosting optimizations
Summary
References
Part 5: Deploy Your Model
Chapter 13: Prompt Engineering
Prompt engineering – the art of getting more with less
From few- to zero-shot learning
Text-to-image prompt engineering tips
Image-to-image prompt engineering tips
Upscaling
Masking
Prompting for object-to-image with DreamBooth
Prompting large language models
Instruction fine-tuning
Chain-of-thought prompting
Summarization
Defending against prompt injections and jailbreaking
Advanced techniques – prefix and prompt tuning
Prefix tuning
Prompt tuning
Summary
References
Chapter 14: MLOps for Vision and Language
What is MLOps?
Common MLOps pipelines
Continuous integration and continuous deployment
Model monitoring and human-in-the-loop
MLOps for foundation models
MLOps for vision
AWS offerings for MLOps
A quick introduction to SageMaker Pipelines
Summary
References
Chapter 15: Future Trends in Pretraining Foundation Models
Techniques for building applications for LLMs
Building interactive dialogue apps with open-source stacks
Using RAG to ensure high accuracy in LLM applications
Is generation the new classification?
Human-centered design for building applications with LLMs
Other generative modalities
AWS offerings in foundation models
The future of foundation models
The future of pretraining
Summary
References
Index
Why subscribe?
Other Books You May Enjoy
Packt is searching for authors like you
Share Your Thoughts