Apache: Big Data 2016
Wednesday, May 11 • 4:10pm - 5:00pm
Data Science for the Datacenter: Analyzing Logs with Apache Spark - William Benton, Red Hat, Inc

Contemporary applications and infrastructure software leave behind a tremendous volume of metric and log data. This “digital exhaust” is inscrutable to humans and difficult for computers to analyze, since it is vast, complex, and not explicitly structured.

In this session, Will Benton introduces the log processing domain and gives practical advice for using Apache Spark to analyze log data, including data engineering techniques to impose structure on disparate log sources; data science approaches to detect infrastructure failures; language-processing techniques to characterize the text of log messages; best practices for tuning Spark and using newer Spark features; and ways to visualize your results. You’ll learn from Benton’s experience developing applications that analyze the vast log data generated within Red Hat’s network, and you’ll leave well prepared to analyze your own logs.

William Benton

Manager, Software Engineering and Sr. Principal Engineer, Red Hat, Inc
William Benton leads a team of data scientists and engineers at Red Hat, where he has applied machine learning to problems ranging from forecasting cloud infrastructure costs to designing better cycling workouts. His current focus is investigating the best ways to build and deploy...

Wednesday May 11, 2016 4:10pm - 5:00pm PDT
Plaza B