Apache: Big Data 2016 has ended

Geospatial
Tuesday, May 10

2:00pm PDT

SciSpark: MapReduce in Atmospheric Sciences - Kim Whitehall, NASA Jet Propulsion Laboratory
The atmospheric science (AS) community generates model and observational data to simulate and monitor the Earth system. Big data in the AS community has arrived: high volumes (petabytes), at increasing velocity (to AS groups worldwide) and variety (of data formats and resolutions), are needed for the veracity of models and observation systems that add value to the policy-making process. As scientists require solutions that allow interaction with these big data, the community is interested in the MapReduce paradigm and Apache Spark. This talk presents a specific NASA Advanced Information Systems Technology (AIST) project called “SciSpark” that marries Apache Spark with climate science. SciSpark is a scalable system for interactive AS analysis. We will demonstrate SciSpark’s scientific data ingestion, visual interaction and metrics generation using the Spark engine.


Kim Whitehall

NASA Jet Propulsion Laboratory
Kim is a scientific applications software engineer at NASA’s Jet Propulsion Laboratory.

Tuesday May 10, 2016 2:00pm - 2:50pm PDT
Plaza A

3:00pm PDT

Geospatially Enable Your Hadoop, Accumulo, and Spark Applications with LocationTech Projects
What is the average predicted temperature of BC from 2050-2099 based on forecasting models? How many tweets containing the hashtag #apachecon were sent from Canada? In general: how do we ask location-based questions of very large sets of geospatial data? To answer these types of questions, existing large data processing frameworks like Hadoop, Accumulo and Spark need to be "geospatially enabled". LocationTech is a working group inside the Eclipse Foundation that is home to four open source projects doing exactly that: GeoTrellis, GeoWave, GeoMesa, and GeoJinni (sense a pattern?). In this talk, I will introduce what geospatial data is, discuss the challenges of processing large sets of geospatial data, and show how these four LocationTech projects work with Apache projects to overcome those challenges and let us get the most out of our large geospatial data.

Robert Emanuele

Software Developer, Azavea
Rob Emanuele is the maintainer of the open source geospatial library GeoTrellis, which provides geospatial capabilities to Apache Spark. He was the program chair for FOSS4G North America in 2015 and 2016. He is a member of the LocationTech Project Management Committee.

Tuesday May 10, 2016 3:00pm - 3:50pm PDT
Plaza A