Apache: Big Data 2016 has ended
Tuesday, May 10 • 3:00pm - 3:50pm
Apache Flume or Apache Kafka? How About Both? - Jayesh Thakrar, Conversant

Flume and Kafka are seen by some as serving the same function and are often considered mutually exclusive. This presentation describes an implementation in which both are used together as parts of a heterogeneous streaming data pipeline.

The presentation will cover the evolution of the pipeline and how it grew from a design meant to handle 20 billion log lines a day to one handling 90+ billion. It will also cover Flume customizations that ensure data uniqueness and that allow a fraction of the production data to be forked to QA systems for continuous regression testing.
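
The abstract does not say how the Flume customizations work, so the following is only a minimal sketch of the two general ideas it names: dropping duplicate events, and sending a fixed fraction of events to QA as well as production. It is plain Python, not Flume code, and every name in it (`stable_bucket`, `route`, `qa_fraction`, the `"production"` and `"qa"` labels) is a hypothetical chosen for illustration. It also assumes each event carries a unique ID, which the abstract does not state.

```python
import hashlib

BUCKETS = 10_000  # hypothetical resolution: fractions are honoured to 0.01%


def stable_bucket(event_id: str) -> int:
    """Map an event ID to a bucket in [0, BUCKETS) that never changes for that ID."""
    digest = hashlib.sha256(event_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % BUCKETS


def route(event_id: str, qa_fraction: float, seen: set) -> list:
    """Return the destinations for one event, or [] if the ID was already seen.

    `seen` is an in-memory set, which is a simplification: a pipeline at the
    scale described would need a bounded or distributed structure instead.
    """
    if event_id in seen:
        return []  # duplicate: drop it
    seen.add(event_id)

    destinations = ["production"]  # every unique event goes to production
    # Hash-based sampling: the same ID always gets the same decision,
    # so a replayed event cannot flip between "QA" and "not QA".
    if stable_bucket(event_id) < round(qa_fraction * BUCKETS):
        destinations.append("qa")
    return destinations


if __name__ == "__main__":
    seen_ids = set()
    sent_to_qa = sum(
        "qa" in route(f"event-{i}", 0.10, seen_ids) for i in range(10_000)
    )
    print(f"{sent_to_qa} of 10000 events were also sent to QA")
```

Hashing the ID instead of drawing a random number keeps the QA sample reproducible across restarts and replays, which is one reason a sketch like this might suit regression testing.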

Finally, the presentation will cover monitoring of the pipeline, both as a holistic view and as a detailed drill-down, along with the associated alerting.

Jayesh Thakrar

Sr. Software Engineer, Conversant
Jayesh Thakrar is a Sr. Data Engineer at Conversant (http://www.conversantmedia.com/). He is a data geek who gets to build and play with large data systems consisting of Hadoop, Spark, HBase, Cassandra, Flume and Kafka. To rest after a good day's work, he uses OpenTSDB with 500+ million...

Tuesday May 10, 2016 3:00pm - 3:50pm PDT
Regency A