Table of contents:
Cover
Front Matter
1. Introduction
2. Where Code Executes
3. Data Management
4. Expressing Parallelism
5. Error Handling
6. Unified Shared Memory
7. Buffers
8. Scheduling Kernels and Data Movement
9. Communication and Synchronization
10. Defining Kernels
11. Vectors and Math Arrays
12. Device Information and Kernel Specialization
13. Practical Tips
14. Common Parallel Patterns
15. Programming for GPUs
16. Programming for CPUs
17. Programming for FPGAs
18. Libraries
19. Memory Model and Atomics
20. Backend Interoperability
21. Migrating CUDA Code
Back Matter