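Title: Genomics in the Cloud (Reprint Edition) (English Edition)
Category: Science and Technology - Natural Sciences - Biological Sciences
Authors: (US) Geraldine A. Van der Auwera; Brian D. O’Connor
Publisher: Southeast University Press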
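Synopsis
About the Book
Data in the genomics field is growing explosively. Within just a few years, the genomic data hosted by organizations such as the National Institutes of Health (NIH) has exceeded 50 PB (50 million GB), and these organizations are turning to cloud infrastructure to make the data available to the research community. How do you adapt analysis tools and protocols to access and analyze this volume of data in the cloud?
With this practical book, researchers will learn how to work with genomics algorithms using open source tools including the Genome Analysis Toolkit (GATK), Docker, WDL, and Terra. Geraldine Van der Auwera, longtime custodian of the GATK user community, and Brian O’Connor of the UC Santa Cruz Genomics Institute guide you through the process. You will learn by working with real data and genomics algorithms from the field.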
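This book covers:
Essential genomics and computing technology background;
Basic cloud computing operations;
Getting started with GATK, plus three major GATK Best Practices;
Automating analysis with scripted workflows written in WDL and run with Cromwell;
Scaling up workflow execution in the cloud, including parallelization and cost optimization;
Interactive analysis in the cloud using Jupyter notebooks;
Using Terra to ensure collaboration and computational reproducibility.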
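About the Author
Geraldine A. Van der Auwera, PhD, is the Director of Outreach and Communications for the Data Sciences Platform at the Broad Institute of MIT and Harvard.
Table of Contents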
Foreword
Preface
1. Introduction
The Promises and Challenges of Big Data in Biology and Life Sciences
Infrastructure Challenges
Toward a Cloud-Based Ecosystem for Data Sharing and Analysis
Cloud-Hosted Data and Compute
Platforms for Research in the Life Sciences
Standardization and Reuse of Infrastructure
Being FAIR
Wrap-Up and Next Steps
2. Genomics in a Nutshell: A Primer for Newcomers to the Field
Introduction to Genomics
The Gene as a Discrete Unit of Inheritance (Sort Of)
The Central Dogma of Biology: DNA to RNA to Protein
The Origins and Consequences of DNA Mutations
Genomics as an Inventory of Variation in and Among Genomes
The Challenge of Genomic Scale, by the Numbers
Genomic Variation
The Reference Genome as Common Framework
Physical Classification of Variants
Germline Variants Versus Somatic Alterations
High-Throughput Sequencing Data Generation
From Biological Sample to Huge Pile of Read Data
Types of DNA Libraries: Choosing the Right Experimental Design
Data Processing and Analysis
Mapping Reads to the Reference Genome
Variant Calling
Data Quality and Sources of Error
Functional Equivalence Pipeline Specification
Wrap-Up and Next Steps
3. Computing Technology Basics for Life Scientists
Basic Infrastructure Components and Performance Bottlenecks
Types of Processor Hardware: CPU, GPU, TPU, FPGA, OMG
Levels of Compute Organization: Core, Node, Cluster, and Cloud
Addressing Performance Bottlenecks
Parallel Computing
Parallelizing a Simple Analysis
From Cores to Clusters and Clouds: Many Levels of Parallelism
Trade-Offs of Parallelism: Speed, Efficiency, and Cost
Pipelining for Parallelization and Automation
Workflow Languages
Popular Pipelining Languages for Genomics
Workflow Management Systems
Virtualization and the Cloud
VMs and Containers
Introducing the Cloud
Categories of Research Use Cases for Cloud Services
Wrap-Up and Next Steps
4. First Steps in the Cloud
Setting Up Your Google Cloud Account and First Project
Creating a Project
Checking Your Billing Account and Activating Free Credits
Running Basic Commands in Google Cloud Shell
Logging in to the Cloud Shell VM
Using gsutil to Access and Manage Files
Pulling a Docker Image and Spinning Up the Container
Mounting a Volume to Access the Filesystem from Within the Container
Setting Up Your Own Custom VM
Creating and Configuring Your VM Instance
Logging into Your VM by Using SSH
Checking Your Authentication
Copying the Book Materials to Your VM
Installing Docker on Your VM
Setting Up the GATK Container Image
……
6. GATK Best Practices for Germline Short Variant Discovery
7. GATK Best Practices for Somatic Variant Discovery
8. Automating Analysis Execution with Workflows
9. Deciphering Real Genomics Workflows
10. Running Single Workflows at Scale with Pipelines API
11. Running Many Workflows Conveniently in Terra
12. Interactive Analysis in Jupyter Notebook
13. Assembling Your Own Workspace in Terra
14. Making a Fully Reproducible Paper
Glossary
Index
Last updated: 2025/1/31 19:38:55