About the Authors
Preface
Introduction
1 High-Performance Computing
1.1 Trends in Computer Design
1.2 Traditional Computers and Their Limitations
1.3 Parallelism within a Single Processor
1.3.1 Multiple Functional Units
1.3.2 Pipelining
1.3.3 Overlapping
1.3.4 RISC
1.3.5 VLIW
1.3.6 Vector Instructions
1.3.7 Chaining
1.3.8 Memory-to-Memory and Register-to-Register Organizations
1.3.9 Register Set
1.3.10 Stripmining
1.3.11 Reconfigurable Vector Registers
1.3.12 Memory Organization
1.4 Data Organization
1.4.1 Main Memory
1.4.2 Cache
1.4.3 Local Memory
1.5 Memory Management
1.6 Parallelism through Multiple Pipes or Multiple Processors
1.7 Message Passing
1.8 Virtual Shared Memory
1.8.1 Routing
1.9 Interconnection Topology
1.9.1 Crossbar Switch
1.9.2 Timeshared Bus
1.9.3 Ring Connection
1.9.4 Mesh Connection
1.9.5 Hypercube
1.9.6 Multi-staged Network
1.10 Programming Techniques
1.11 Trends: Network-Based Computing
2 Overview of Current High-Performance Computers
2.1 Supercomputers
2.2 RISC-Based Processors
2.3 Parallel Processors
3 Implementation Details and Overhead
3.1 Parallel Decomposition and Data Dependency Graphs
3.2 Synchronization
3.3 Load Balancing
3.4 Recurrence
3.5 Indirect Addressing
3.6 Message Passing
3.6.1 Performance Prediction
3.6.2 Message-Passing Standards
3.6.3 Routing
4 Performance: Analysis, Modeling, and Measurements
4.1 Amdahl's Law
4.1.1 Simple Case of Amdahl's Law
4.1.2 General Form of Amdahl's Law
4.2 Vector Speed and Vector Length
4.3 Amdahl's Law--Parallel Processing
4.3.1 A Simple Model
4.3.2 Gustafson's Model
4.4 Examples of (r∞, n1/2)-values for Various Computers
4.4.1 CRAY J90 and CRAY T90 (One Processor)
4.4.2 General Observations
4.5 LINPACK Benchmark
4.5.1 Description of the Benchmark
4.5.2 Calls to the BLAS
4.5.3 Asymptotic Performance
5 Building Blocks in Linear Algebra
5.1 Basic Linear Algebra Subprograms
5.1.1 Level 1 BLAS
5.1.2 Level 2 BLAS
5.1.3 Level 3 BLAS
5.2 Levels of Parallelism
5.2.1 Vector Computers
5.2.2 Parallel Processors with Shared Memory
5.2.3 Parallel-Vector Computers
5.2.4 Cluster Computing
5.3 Basic Factorizations of Linear Algebra
5.3.1 Point Algorithm: Gaussian Elimination with Partial Pivoting
5.3.2 Special Matrices
5.4 Blocked Algorithms: Matrix-Vector and Matrix-Matrix Versions
5.4.1 Right-Looking Algorithm
5.4.2 Left-Looking Algorithm
5.4.3 Crout Algorithm
5.4.4 Typical Performance of Blocked LU Decomposition
5.4.5 Blocked Symmetric Indefinite Factorization
5.4.6 Typical Performance of Blocked Symmetric Indefinite Factorization
5.5 Linear Least Squares
5.5.1 Householder Method
5.5.2 Blocked Householder Method
5.5.3 Typical Performance of the Blocked Householder Factorization
5.6 Organization of the Modules
5.6.1 Matrix-Vector Product
5.6.2 Matrix-Matrix Product
5.6.3 Typical Performance for Parallel Processing
5.6.4 Benefits
5.7 LAPACK
5.8 ScaLAPACK
5.8.1 The Basic Linear Algebra Communication Subprograms (BLACS)
5.8.2 PBLAS
5.8.3 ScaLAPACK Sample Code
6 Direct Solution of Sparse Linear Systems
6.1 Introduction to Direct Methods for Sparse Linear Systems
6.1.1 Four Approaches
6.1.2 Description of Sparse Data Structure
6.1.3 Manipulation of Sparse Data Structures
6.2 General Sparse Matrix Methods
6.2.1 Fill-in and Sparsity Ordering
6.2.2 Indirect Addressing--Its Effect and How to Avoid It
6.2.3 Comparison with Dense Codes
6.2.4 Other Approaches
6.3 Methods for Symmetric Matrices and Band Systems
6.3.1 The Clique Concept in Gaussian Elimination
6.3.2 Further Comments on Ordering Schemes
6.4 Frontal Methods
6.4.1 Frontal Methods--Link to Band Methods and Numerical Pivoting
6.4.2 Vector Performance
6.4.3 Parallel Implementation of Frontal Schemes
6.5 Multifrontal Methods
6.5.1 Performance on Vector Machines
6.5.2 Performance on RISC Machines
6.5.3 Performance on Parallel Machines
6.5.4 Exploitation of Structure
6.5.5 Unsymmetric Multifrontal Methods
6.6 Other Approaches for Exploitation of Parallelism
6.7 Software
6.8 Brief Summary
7 Krylov Subspaces: Projection
7.1 Notation
7.2 Basic Iteration Methods: Richardson Iteration, Power Method
7.3 Orthogonal Basis (Arnoldi, Lanczos)
8 Iterative Methods for Linear Systems
8.1 Krylov Subspace Solution Methods: Basic Principles
8.1.1 The Ritz-Galerkin Approach: FOM and CG
8.1.2 The Minimum Residual Approach: GMRES and MINRES
8.1.3 The Petrov-Galerkin Approach: Bi-CG and QMR
8.1.4 The Minimum Error Approach: SYMMLQ and GMERR
8.2 Iterative Methods in More Detail
8.2.1 The CG Method
8.2.2 Parallelism in the CG Method: General Aspects
8.2.3 Parallelism in the CG Method: Communication Overhead
8.2.4 MINRES
8.2.5 Least Squares CG
8.2.6 GMRES and GMRES(m)
8.2.7 GMRES with Variable Preconditioning
8.2.8 Bi-CG and QMR
8.2.9 CGS
8.2.10 Bi-CGSTAB
8.2.11 Bi-CGSTAB(ℓ) and Variants
8.3 Other Issues
8.4 How to Test Iterative Methods
9 Preconditioning and Parallel Preconditioning
9.1 Preconditioning and Parallel Preconditioning
9.2 The Purpose of Preconditioning
9.3 Incomplete LU Decompositions
9.3.1 Efficient Implementations of ILU(0) Preconditioning
9.3.2 General Incomplete Decompositions
9.3.3 Variants of ILU Preconditioners
9.3.4 Some General Comments on ILU
9.4 Some Other Forms of Preconditioning
9.4.1 Sparse Approximate Inverse (SPAI)
9.4.2 Polynomial Preconditioning
9.4.3 Preconditioning by Blocks or Domains
9.4.4 Element by Element Preconditioners
9.5 Vector and Parallel Implementation of Preconditioners
9.5.1 Partial Vectorization
9.5.2 Reordering the Unknowns
9.5.3 Changing the Order of Computation
9.5.4 Some Other Vectorizable Preconditioners
9.5.5 Parallel Aspects of Reorderings
9.5.6 Experiences with Parallelism
10 Linear Eigenvalue Problems Ax = λx
10.1 Theoretical Background and Notation
10.2 Single-Vector Methods
10.3 The QR Algorithm
10.4 Subspace Projection Methods
10.5 The Arnoldi Factorization
10.6 Restarting the Arnoldi Process
10.6.1 Explicit Restarting
10.7 Implicit Restarting
10.8 Lanczos' Method
10.9 Harmonic Ritz Values and Vectors
10.10 Other Subspace Iteration Methods
10.11 Davidson's Method
10.12 The Jacobi-Davidson Iteration Method
10.12.1 JDQR
10.13 Eigenvalue Software: ARPACK, P_ARPACK
10.13.1 Reverse Communication Interface
10.13.2 Parallelizing ARPACK
10.13.3 Data Distribution of the Arnoldi Factorization
10.14 Message Passing
10.15 Parallel Performance
10.16 Availability
10.17 Summary
11 The Generalized Eigenproblem
11.1 Arnoldi/Lanczos with Shift-Invert
11.2 Alternatives to Arnoldi/Lanczos with Shift-Invert
11.3 The Jacobi-Davidson QZ Algorithm
11.4 The Jacobi-Davidson QZ Method: Restart and Deflation
11.5 Parallel Aspects
A Acquiring Mathematical Software
A.1 netlib
A.1.1 Mathematical Software
A.2 Mathematical Software Libraries
B Glossary
C Level 1, 2, and 3 BLAS Quick Reference
D Operation Counts for Various BLAS and Decompositions
Bibliography
Index