Wednesday, May 11 • 4:10pm - 5:00pm
Data Science for the Datacenter: Analyzing Logs with Apache Spark - William Benton, Red Hat, Inc.

Contemporary applications and infrastructure software leave behind a tremendous volume of metric and log data. This “digital exhaust” is inscrutable to humans and difficult for computers to analyze, since it is vast, complex, and not explicitly structured.

In this session, Will Benton will introduce the log processing domain and give you practical advice for using Apache Spark to analyze log data, including data engineering techniques to impose structure on disparate log sources; data science approaches to detect infrastructure failures; language-processing techniques to characterize the text of log messages; best practices for tuning Spark and using newer Spark features; and how to visualize your results. You’ll learn from Benton’s experience developing applications that analyze the vast log data generated within Red Hat’s network and leave well-prepared to analyze your own logs.

William Benton

Red Hat, Inc.
William Benton leads a data science team in Red Hat’s Emerging Technologies group, where he has contributed to several open-source distributed computing projects and applied analytic techniques to problems ranging from forecasting cloud infrastructure costs to designing better cycling workouts. He has also conducted research and development in the areas of static program analysis, managed language runtimes, logic databases, cluster...

Wednesday May 11, 2016 4:10pm - 5:00pm
Plaza B