Tuesday, May 10 • 10:00am - 10:50am
Breaking Spark: Top 5 Mistakes to Avoid When Using Apache Spark in Production - Neelesh Srinivas Salian, Cloudera

Apache Spark deployments have been growing over the past year. The volume of data analyzed and processed through the framework is massive, and it continues to push the boundaries of the engine.

This talk will focus on common problems observed in cluster environments set up with Apache Spark, based on the presenter’s experiences across 150+ production deployments.

When planning an Apache Spark deployment in a cluster, it is recommended to follow certain guidelines to help set up a real-world environment. The issues that can occur fall into five categories:

1) Scaling of the Architecture
2) Memory Configurations
3) End user Code
4) Incompatible Dependencies
5) Administration/Operation-related issues

These observations help improve the usability and supportability of Apache Spark and help avoid such issues in future deployments.

Neelesh Srinivas Salian

Engineer, Cloudera
Neelesh is a Customer Operations Engineer at Cloudera, Palo Alto, working on all components of Cloudera's Distribution Including Apache Hadoop (CDH). He is also a contributor to Apache Software Foundation projects such as Flink, Spark and Hadoop. He has previously spoken at multiple conferences about the top 5 issues in Spark. He holds a Master of Computer Science degree from North Carolina State University with a focus on Cloud...

Tuesday May 10, 2016 10:00am - 10:50am
Georgia B