Table of contents:

Cover
Front Matter

Part I. Foundation
  1. Introduction
  2. Markov Decision Processes
  3. Dynamic Programming
  4. Monte Carlo Methods
  5. Temporal Difference Learning

Part II. Value Function Approximation
  6. Linear Value Function Approximation
  7. Nonlinear Value Function Approximation
  8. Improvements to DQN

Part III. Policy Approximation
  9. Policy Gradient Methods
  10. Problems with Continuous Action Space
  11. Advanced Policy Gradient Methods

Part IV. Advanced Topics
  12. Distributed Reinforcement Learning
  13. Curiosity-Driven Exploration
  14. Planning with a Model: AlphaZero

Back Matter