Tuesday, May 10 • 11:20am - 12:10pm
Spark Cyborgs - Deep Integration of Spark with Parallel Relational Engines - Torsten Steinbach & Gustavo Arocena, IBM


In this session we describe a family of hybrid engines that result from a deep two-way integration between Spark and parallel RDBMSs. This integration differs from projects like Hive on Spark, which leverage Spark purely as an execution framework. It also goes beyond what’s possible with the current version of the DataSources API in terms of leveraging the capabilities of the storage backend. In our presentation you will learn about four essential building blocks of the hybrid engines:
1. Derive DataFrame partitioning implicitly from parallel RDBMS partitioning
2. Colocation and efficient data movement between Spark and RDBMS processes
3. Hybrid queries by augmenting parallel RDBMS with Spark
4. Spark machine learning integrated in RDBMS for relational data

Speakers

Gustavo Arocena

Big Data Architect, IBM
Gustavo Arocena is a Big Data Architect at the IBM Toronto Lab, with more than 10 years of experience in database technology and language processing. Recently he has led the design and implementation of several components of the Big SQL engine, including the Hive-compatible IO layer, the SQL INSERT statement, and the integration with Spark. Gustavo has several publications and has presented at multiple conferences. He holds a Master's degree...

Torsten Steinbach

IBM
Torsten has been a software architect for database technology at IBM for many years. He led product development for DB2 performance management tooling, Netezza workload management, and in-database analytics. Currently he works on IBM’s cloud data warehouse dashDB and its integrated analytic functions and languages such as R. Torsten has spoken at many IBM Insight conferences, as well as at Esri UC and IDUG.



Tuesday May 10, 2016 11:20am - 12:10pm
Plaza C
