About Programming Massively Parallel Processors (English Edition, 2nd Edition), authored by Kirk: We would like to offer some of our experience in teaching courses with this book. Since 2006, we have taught multiple types of courses: in one-semester format and in one-week intensive format. The original ECE498AL course has become a permanent course known as ECE408 or CS483 of the University of Illinois at Urbana-Champaign. We started to write up some early chapters of this book when we offered ECE498AL the second time. The first four chapters were also tested in an MIT class taught by Nicolas Pinto in the spring of 2009. Since then, we have used the book for numerous offerings of ECE408 as well as the VSCSE and PUMPS summer schools.
Drawing on his many years of experience teaching parallel computing courses, the author (Kirk) dissects, in a concise, intuitive, and practical way, the techniques required to write parallel programs, and uses a wealth of case studies to illustrate the entire development process of a parallel program, from computational thinking through to the final realization of an efficient, working parallel program.
Compared with the previous edition, Programming Massively Parallel Processors (English Edition, 2nd Edition) has been comprehensively revised and updated. It presents parallel programming more systematically: it introduces fundamental parallel algorithm patterns, supplies more background material, and covers several new practical programming techniques and tools. The specific updates are as follows:
Parallel patterns: Three new chapters on parallel patterns describe in detail many of the algorithms involved in parallel applications.
CUDA Fortran: This chapter gives a brief introduction to the Fortran programming interface for the CUDA architecture and illustrates CUDA programming through numerous examples.
OpenACC: This chapter introduces an open standard that expresses parallelism through compiler directives, simplifying the task of parallel programming.
Thrust: Thrust is an abstraction layer on top of CUDA C/C++. The book devotes a chapter to showing how the Thrust parallel template library can be used to implement high-performance applications with minimal programming effort (see the sketch after this list).
C++ AMP: A programming interface developed by Microsoft to simplify massively parallel programming in Windows environments.
NVIDIA's Kepler architecture: Explores the programming features of NVIDIA's high-performance, energy-efficient GPU architecture.
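To give a concrete flavor of the Thrust material mentioned above, here is a minimal sketch (not taken from the book) of the style of code that chapter covers: sorting a large array on the GPU through a single library call. It uses only standard Thrust containers and algorithms (thrust::host_vector, thrust::device_vector, thrust::sort, thrust::copy) and would be compiled with nvcc.

// Minimal illustrative Thrust sketch (not from the book): GPU sort with one library call.
#include <thrust/host_vector.h>
#include <thrust/device_vector.h>
#include <thrust/sort.h>
#include <thrust/copy.h>
#include <cstdlib>

int main() {
    // Fill a host vector with random integers.
    thrust::host_vector<int> h_vec(1 << 20);
    for (size_t i = 0; i < h_vec.size(); ++i)
        h_vec[i] = std::rand();

    // Copying into a device_vector transfers the data to GPU memory.
    thrust::device_vector<int> d_vec = h_vec;

    // Thrust dispatches a parallel sort on the device.
    thrust::sort(d_vec.begin(), d_vec.end());

    // Copy the sorted result back to the host.
    thrust::copy(d_vec.begin(), d_vec.end(), h_vec.begin());
    return 0;
}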
Preface
Acknowledgements
CHAPTER 1 Introduction
CHAPTER 2 History of GPU Computing
CHAPTER 3 Introduction to Data Parallelism and CUDA C
CHAPTER 4 Data-Parallel Execution Model
CHAPTER 5 CUDA Memories
CHAPTER 6 Performance Considerations
CHAPTER 7 Floating-Point Considerations
CHAPTER 8 Parallel Patterns: Convolution
CHAPTER 9 Parallel Patterns: Prefix Sum
CHAPTER 10 Parallel Patterns: Sparse Matrix-Vector Multiplication
CHAPTER 11 Application Case Study: Advanced MRI Reconstruction
CHAPTER 12 Application Case Study: Molecular Visualization and Analysis
CHAPTER 13 Parallel Programming and Computational Thinking
CHAPTER 14 An Introduction to OpenCL
CHAPTER 15 Parallel Programming with OpenACC
CHAPTER 16 Thrust: A Productivity-Oriented Library for CUDA
CHAPTER 17 CUDA FORTRAN
CHAPTER 18 An Introduction to C++ AMP
CHAPTER 19 Programming a Heterogeneous Computing Cluster
CHAPTER 20 CUDA Dynamic Parallelism
CHAPTER 21 Conclusion and Future Outlook