Thursday, May 12 • 9:00am - 12:00pm
Getting Started with Machine Learning & Spark - Holden Karau, IBM (Additional Fee)


Apache Spark is a fast and general engine for distributed computing and big data processing, with APIs in Scala, Java, Python, and R. Apache Spark ships with built-in libraries for a variety of purposes, including SQL, Streaming, Graph Analysis, and Machine Learning. This talk will focus on how to use Spark for Machine Learning.

Apache Spark has two APIs for Machine Learning, the newer of which is focused on creating Machine Learning Pipelines. This talk will explore a simple classification problem in both APIs, followed by a tour of some of the different machine learning models. We will then talk about loading and saving models and the challenges faced when attempting to construct a real-time serving solution from Spark ML's models. From there we will explore some of the performance work being done inside Spark to improve machine learning.

Holden Karau

Principal Software Engineer, IBM
Holden Karau is a software development engineer and is active in open source. She is a co-author of Learning Spark and Fast Data Processing with Spark and has taught intro Spark workshops. Prior to IBM she worked on a variety of big data, search, and classification problems at Alpine, Databricks, Google, Foursquare, and Amazon. She graduated from the University of Waterloo with a Bachelor of Mathematics in Computer Science. Outside of computers she...

Thursday May 12, 2016 9:00am - 12:00pm
