Apache: Big Data 2016 has ended

Spark
Wednesday, May 11
 

2:00pm

Shared Memory Layer for Spark Applications - Dmitriy Setrakyan, GridGain
In this presentation we will talk about the need to share state across different Spark
jobs and applications and about several technologies that make it possible, including
Tachyon and Apache Ignite. We will dive into the importance of in-memory file systems
and shared in-memory RDDs with Apache Ignite, and present a hands-on demo
showing the advantages and disadvantages of each approach. We will
also discuss the requirements for storing data off-heap in order to achieve large horizontal
and vertical scale in applications using Spark and Ignite.

Speakers

Dmitriy Setrakyan

EVP Engineering, GridGain
Dmitriy Setrakyan is founder and Chief Product Officer at GridGain. Dmitriy has been working with distributed architectures for over 15 years and has expertise in the development of various middleware platforms, financial trading systems, CRM applications and similar systems. Prior...


Wednesday May 11, 2016 2:00pm - 2:50pm
Plaza C

3:00pm

Time Series Processing with Apache Spark - Josef Adersberger, QAware GmbH
A lot of data is best represented as time series: operational data, financial data, and even the data in general-purpose DWHs, where the dominant dimension is time. The area of time series databases is growing rapidly, but support in Spark for processing and analyzing time series data is still in its early stages. We present Chronix Spark, which provides a mature TimeSeriesRDD implementation for fast retrieval and complex analysis of time series data. Chronix Spark is open source software and battle-tested at a large German car manufacturer and a German telco. We show how we've used Chronix Spark in a real-life project and present benchmarks showing how it has outperformed common time series databases like OpenTSDB, KairosDB and InfluxDB. We lift the curtain and dive deep into the internals to show how we've achieved this.

Speakers

Josef Adersberger

CTO, QAware
Josef Adersberger is co-founder & CTO of QAware, a German custom software development company and CNCF silver member. He studied computer science in Rosenheim and Munich and holds a doctoral degree in software engineering. He is currently responsible for a large-scale cloud migration...



Wednesday May 11, 2016 3:00pm - 3:50pm
Plaza C

5:10pm

Mining Public Datasets Using Apache Zeppelin (incubating) and Spark - Alexander Bezzubov, NFLabs
There are a lot of public datasets available in the wild, and the number is growing. In the meantime, the ASF provides a plethora of free tools for any practitioner to build upon. In this talk Alexander will show how to leverage two of them, Zeppelin and Spark, for exploratory data analytics and for building a data product over two real datasets: CommonCrawl http://commoncrawl.org and GithubArchive https://www.githubarchive.org

Speakers

Alexander Bezzubov

Software Engineer, NFLabs
Alexander Bezzubov is an Apache Zeppelin contributor, PMC member and software engineer at NFLabs. Previous speaking experience includes Apache BigData NA 2016 in Vancouver, FOSSASIA 2016 in Singapore, and Apache BigData EU 2015 in Budapest.



Wednesday May 11, 2016 5:10pm - 6:00pm
Plaza C