Apache: Big Data 2016 has ended
Monday, May 9 • 5:10pm - 6:00pm
Building a Durable Real-Time Data Pipeline: Apache BookKeeper at Twitter - Sijie Guo & Leigh Stewart, Twitter

The log has proven to be a very powerful data structure for addressing challenging distributed systems problems. DistributedLog is a replicated log service built on top of Apache BookKeeper that provides infinite, ordered, append-only streams for building robust real-time systems. It is the foundation of Twitter’s durable real-time data pipeline and is used widely elsewhere at Twitter, in applications including a transactional database system, a search ingestion pipeline, and a real-time streaming data analytics platform. In this talk, Sijie Guo will discuss the challenges of building a durable real-time data pipeline, how Twitter built one, and how it is used to support workloads with different characteristics, from a strongly consistent distributed database to a real-time data analytics pipeline.


Sijie Guo

Currently works at Twitter on DistributedLog/BookKeeper. Apache BookKeeper PMC Chair. Previously worked at Yahoo! on a push notification system.

Monday May 9, 2016 5:10pm - 6:00pm PDT
Georgia B