Streaming data is a big deal in big data these days, and for good reason. Businesses crave ever more timely data, and streaming is a good way to achieve lower latency. Streaming systems are also a more natural way than batch processing to tame the massive, unbounded data sets that are increasingly common today.
Expanded from co-author Tyler Akidau’s popular series of blog posts "Streaming 101" and "Streaming 102", this practical book shows data engineers, data scientists, and developers how to work with streaming or event-time data in a conceptual and platform-agnostic way. You’ll go from a "101"-level understanding of stream processing to a nuanced grasp of the what, where, when, and how of processing real-time data streams.
Dive deep into topics including watermarks and windowing, as well as state and timers in the context of stream processing. Although the book uses Apache Beam code snippets to make examples concrete, it presents a general and broad explanation of streaming that's not tied to a specific framework.
Table of Contents
Part I. The Beam Model
Chapter 1. Streaming 101
Chapter 2. The What, Where, When, and How of Data Processing
Chapter 3. Watermarks
Chapter 4. Advanced Windowing
Chapter 5. Exactly-Once and Side Effects
Part II. Streams and Tables
Chapter 6. Streams and Tables
Chapter 7. The Practicalities of Persistent State
Chapter 8. Streaming SQL
Chapter 9. Streaming Joins
Chapter 10. The Evolution of Large-Scale Data Processing