Title: Streaming Systems (Reprint Edition, English)
Category:
Authors: Tyler Akidau, Slava Chernyak, Reuven Lax (USA)
Publisher: Southeast University Press
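Synopsis
Recommended Content
Streaming data is a big deal in big data these days. As more and more businesses seek to tame the massive unbounded data sets that span the globe, streaming systems have finally reached a level of maturity sufficient for mainstream adoption. With this practical guide, data engineers, data scientists, and developers will learn how to work with streaming data in a conceptual and platform-agnostic way.
Expanded from Tyler Akidau's popular blog posts "Streaming 101" and "Streaming 102", this book takes you from an introductory level to a nuanced understanding of the what, where, when, and how of processing real-time data streams. You will also dive deep into watermarks and exactly-once processing with co-authors Slava Chernyak and Reuven Lax.
Table of Contents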
Preface Or: What Are You Getting Yourself Into Here?
Part Ⅰ.The Beam Model
1.Streaming 101
Terminology: What Is Streaming?
On the Greatly Exaggerated Limitations of Streaming
Event Time Versus Processing Time
Data Processing Patterns
Bounded Data
Unbounded Data: Batch
Unbounded Data: Streaming
Summary
2.The What, Where, When, and How of Data Processing
Roadmap
Batch Foundations: What and Where
What: Transformations
Where: Windowing
Going Streaming: When and How
When: The Wonderful Thing About Triggers Is Triggers Are Wonderful Things!
When: Watermarks
When: Early/On-Time/Late Triggers FTW!
When: Allowed Lateness (i.e., Garbage Collection)
How: Accumulation
Summary
3.Watermarks
Definition
Source Watermark Creation
Perfect Watermark Creation
Heuristic Watermark Creation
Watermark Propagation
Understanding Watermark Propagation
Watermark Propagation and Output Timestamps
The Tricky Case of Overlapping Windows
Percentile Watermarks
Processing-Time Watermarks
Case Studies
Case Study: Watermarks in Google Cloud Dataflow
Case Study: Watermarks in Apache Flink
Case Study: Source Watermarks for Google Cloud Pub/Sub
Summary
4.Advanced Windowing
When/Where: Processing-Time Windows
Event-Time Windowing
Processing-Time Windowing via Triggers
Processing-Time Windowing via Ingress Time
Where: Session Windows
Where: Custom Windowing
Variations on Fixed Windows
Variations on Session Windows
One Size Does Not Fit All
Summary
5.Exactly-Once and Side Effects
Why Exactly Once Matters
Accuracy Versus Completeness
Side Effects
Problem Definition
Ensuring Exactly Once in Shuffle
Addressing Determinism
Performance
Graph Optimization
Bloom Filters
Garbage Collection
Exactly Once in Sources
Exactly Once in Sinks
Use Cases
Example Source: Cloud Pub/Sub
Example Sink: Files
Example Sink: Google BigQuery
Other Systems
Apache Spark Streaming
Apache Flink
Summary
Part Ⅱ.Streams and Tables
6.Streams and Tables
Stream-and-Table Basics Or: a Special Theory of Stream and Table Relativity
Toward a General Theory of Stream and Table Relativity
Batch Processing Versus Streams and Tables
A Streams and Tables Analysis of MapReduce
Reconciling with Batch Processing
What, Where, When, and How in a Streams and Tables World
What: Transformations
Where: Windowing
When: Triggers
How: Accumulation
A Holistic View of Streams and Tables in the Beam Model
A General Theory of Stream and Table Relativity
Summary
7.The Practicalities of Persistent State
Motivation
The Inevitability of Failure
Correctness and Efficiency
Implicit State
Raw Grouping
Incremental Combining
Generalized State
Case Study: Conversion Attribution
Conversion Attribution with Apache Beam
Summary
8.Streaming SQL
What Is Streaming SQL?
Relational Algebra
Time-Varying Relations
Streams and Tables
Looking Backward: Stream and Table Biases
The Beam Model: A Stream-Biased Approach
The SQL Model: A Table-Biased Approach
Looking Forward: Toward Robust Streaming SQL
Stream and Table Selection
Temporal Operators
Summary
9.Streaming Joins
All Your Joins Are Belong to Streaming
Unwindowed Joins
FULL OUTER
LEFT OUTER
RIGHT OUTER
INNER
ANTI
SEMI
Windowed Joins
Fixed Windows
Temporal Validity
Summary
10.The Evolution of Large-Scale Data Processing
MapReduce
Hadoop
Flume
Storm
Spark
MillWheel