Apache: Big Data 2016 has ended
Tuesday, May 10 • 11:20am - 12:10pm
Get the Best Out of Hive and Spark - Xuefu Zhang, Uber

Apache Hive is widely used for batch-oriented SQL workloads for ETL and data analytics in the Hadoop ecosystem. Its rich features haven’t been matched by any other available SQL-on-Hadoop tool. In fact, many of these tools are tied to and depend on Hive in one way or another. Apache Spark, on the other hand, offers a general data processing framework positioned to replace MapReduce with its faster data processing and efficient memory utilization. Moreover, one doesn’t have to abandon one for the other or juggle between the two in order to get both sets of benefits, as Hive on Spark maintains Hive’s feature richness while providing faster SQL-on-Hadoop execution. As adoption of Hive on Spark for production use grows, this presentation will share best practices for deployment and performance tuning that enable you to get the best out of the two projects.


Xuefu Zhang

Software Engineer, Uber Technologies
Xuefu Zhang has over 10 years’ experience in software development. Earlier this year he joined Uber as a software engineer from Cloudera, where he focused mainly on Apache Hive and Pig. He also worked in the Hadoop team at Yahoo when the majority of the development on...

Tuesday May 10, 2016 11:20am - 12:10pm PDT
Georgia A