Tuesday, May 10 • 11:20am - 12:10pm
Get the Best Out of Hive and Spark - Xuefu Zhang, Uber

Apache Hive is widely used for batch-oriented SQL workloads for ETL and data analytics in the Hadoop ecosystem. Its rich features haven’t been matched by any other available SQL-on-Hadoop tool. In fact, many of these tools are tied to and depend on Hive in one way or another. Apache Spark, on the other hand, offers a general data processing framework positioned to replace MapReduce with its faster data processing and efficient memory utilization. Moreover, one doesn’t have to abandon one for the other or juggle between the two to get both sets of benefits, as Hive on Spark maintains Hive’s feature richness while providing faster SQL-on-Hadoop execution. As Hive on Spark is adopted for production use, this presentation will share best practices for deployment and performance tuning that enable you to get the best out of the two projects.

Speakers
Xuefu Zhang

Software Engineer, Uber Technologies
Xuefu Zhang has over 10 years of experience in software development. Earlier this year he joined Uber as a software engineer from Cloudera, where he focused mainly on Apache Hive and Pig. He also worked on the Hadoop team at Yahoo when the majority of Hadoop development was still done there. In addition, he spent his early career at Informatica, gaining important experience in enterprise software development, especially in ETL and...


Tuesday May 10, 2016 11:20am - 12:10pm
Georgia A