Distributional Reinforcement Learning (Adaptive Computation and Machine Learning)
ISBN / LCCN: 9780262374019, 2022033240, 2022033241, 9780262048019, 9780262374026

The first comprehensive guide to distributional reinforcement learning, providing a new mathematical formalism for thinking about decisions from a probabilistic perspective.


English, 384 pages, 2023



Table of contents:
1 Introduction
1.1 Why Distributional Reinforcement Learning?
1.2 An Example: Kuhn Poker
1.3 How Is Distributional Reinforcement Learning Different?
1.4 Intended Audience and Organization
1.5 Bibliographical Remarks
2 The Distribution of Returns
2.1 Random Variables and Their Probability Distributions
2.2 Markov Decision Processes
2.3 The Pinball Model
2.4 The Return
2.5 The Bellman Equation
2.6 Properties of the Random Trajectory
2.7 The Random-Variable Bellman Equation
2.8 From Random Variables to Probability Distributions
2.9 Alternative Notions of the Return Distribution*
2.10 Technical Remarks
2.11 Bibliographical Remarks
2.12 Exercises
3 Learning the Return Distribution
3.1 The Monte Carlo Method
3.2 Incremental Learning
3.3 Temporal-Difference Learning
3.4 From Values to Probabilities
3.5 The Projection Step
3.6 Categorical Temporal-Difference Learning
3.7 Learning to Control
3.8 Further Considerations
3.9 Technical Remarks
3.10 Bibliographical Remarks
3.11 Exercises
4 Operators and Metrics
4.1 The Bellman Operator
4.2 Contraction Mappings
4.3 The Distributional Bellman Operator
4.4 Wasserstein Distances for Return Functions
4.5 ℓp Probability Metrics and the Cramér Distance
4.6 Sufficient Conditions for Contractivity
4.7 A Matter of Domain
4.8 Weak Convergence of Return Functions*
4.9 Random-Variable Bellman Operators*
4.10 Technical Remarks
4.11 Bibliographical Remarks
4.12 Exercises
5 Distributional Dynamic Programming
5.1 Computational Model
5.2 Representing Return-Distribution Functions
5.3 The Empirical Representation
5.4 The Normal Representation
5.5 Fixed-Size Empirical Representations
5.6 The Projection Step
5.7 Distributional Dynamic Programming
5.8 Error Due to Diffusion
5.9 Convergence of Distributional Dynamic Programming
5.10 Quality of the Distributional Approximation
5.11 Designing Distributional Dynamic Programming Algorithms
5.12 Technical Remarks
5.13 Bibliographical Remarks
5.14 Exercises
6 Incremental Algorithms
6.1 Computation and Statistical Estimation
6.2 From Operators to Incremental Algorithms
6.3 Categorical Temporal-Difference Learning
6.4 Quantile Temporal-Difference Learning
6.5 An Algorithmic Template for Theoretical Analysis
6.6 The Right Step Sizes
6.7 Overview of Convergence Analysis
6.8 Convergence of Incremental Algorithms*
6.9 Convergence of Temporal-Difference Learning*
6.10 Convergence of Categorical Temporal-Difference Learning*
6.11 Technical Remarks
6.12 Bibliographical Remarks
6.13 Exercises
7 Control
7.1 Risk-Neutral Control
7.2 Value Iteration and Q-Learning
7.3 Distributional Value Iteration
7.4 Dynamics of Distributional Optimality Operators
7.5 Dynamics in the Presence of Multiple Optimal Policies*
7.6 Risk and Risk-Sensitive Control
7.7 Challenges in Risk-Sensitive Control
7.8 Conditional Value-at-Risk*
7.9 Technical Remarks
7.10 Bibliographical Remarks
7.11 Exercises
8 Statistical Functionals
8.1 Statistical Functionals
8.2 Moments
8.3 Bellman Closedness
8.4 Statistical Functional Dynamic Programming
8.5 Relationship to Distributional Dynamic Programming
8.6 Expectile Dynamic Programming
8.7 Infinite Collections of Statistical Functionals
8.8 Moment Temporal-Difference Learning*
8.9 Technical Remarks
8.10 Bibliographical Remarks
8.11 Exercises
9 Linear Function Approximation
9.1 Function Approximation and Aliasing
9.2 Optimal Linear Value Function Approximations
9.3 A Projected Bellman Operator for Linear Value Function Approximation
9.4 Semi-Gradient Temporal-Difference Learning
9.5 Semi-Gradient Algorithms for Distributional Reinforcement Learning
9.6 An Algorithm Based on Signed Distributions*
9.7 Convergence of the Signed Algorithm*
9.8 Technical Remarks
9.9 Bibliographical Remarks
9.10 Exercises
10 Deep Reinforcement Learning
10.1 Learning with a Deep Neural Network
10.2 Distributional Reinforcement Learning with Deep Neural Networks
10.3 Implicit Parameterizations
10.4 Evaluation of Deep Reinforcement Learning Agents
10.5 How Predictions Shape State Representations
10.6 Technical Remarks
10.7 Bibliographical Remarks
10.8 Exercises
11 Two Applications and a Conclusion
11.1 Multiagent Reinforcement Learning
11.2 Computational Neuroscience
11.3 Conclusion
11.4 Bibliographical Remarks
Notation
References
Index
