Apache: Big Data 2016 has ended

Geospatial
Tuesday, May 10

9:00am PDT

Open Geospatial Standards and Open Source - George Percivall, Open Geospatial Consortium (OGC)
The aim of this talk and the geospatial track is to increase the benefits of implementing open source consistent with open geospatial standards. Open standards capture geospatial knowledge gained from previous experience for reuse. Accuracy in data exchange is increased by using standards. Even “simple” use cases done inconsistently cause errors, e.g. coordinate order. Standards from the Open Geospatial Consortium (OGC) applicable to Apache projects include: coordinate systems, geometry, grids, spatial relations, web services, encodings, metadata. Multiple Apache projects include geospatial implementations as highlighted in this track. To aid in code reuse, this track seeks to increase coordination across Apache projects based on geospatial standards as well as with other external activities. An anticipated outcome of this track is increasing geo-collaboration of Apache and OGC.
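
The coordinate-order problem mentioned in the abstract can be illustrated with a minimal Python sketch. It is not taken from the talk: the helper names (haversine_km, looks_like_lat_lon) and the sample points are assumptions chosen for illustration. It shows that a swapped pair is sometimes detectable by a range check, but silently yields a different result when both values fit in the latitude range.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given as (lat, lon) degrees."""
    r = 6371.0088  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def looks_like_lat_lon(lat, lon):
    """Range check only: catches a swap when the longitude exceeds +/-90 degrees."""
    return -90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0

# Illustrative points, written as (lat, lon).
vancouver = (49.2827, -123.1207)
seattle = (47.6062, -122.3321)

# A swapped pair is rejected here because -123.12 is not a valid latitude.
swap_detected = not looks_like_lat_lon(vancouver[1], vancouver[0])

# With small values the swap passes the range check and changes the answer.
origin = (50.0, 0.0)
d_ok = haversine_km(*origin, 10.0, 20.0)       # intended (lat=10, lon=20)
d_swapped = haversine_km(*origin, 20.0, 10.0)  # same numbers, order reversed

print(swap_detected, round(d_ok), round(d_swapped))
```

A range check alone is therefore not a substitute for an agreed axis order, which is the kind of convention a standard fixes.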

George Percivall

CTO, Chief Engineer, OGC
As CTO and Chief Engineer of the Open Geospatial Consortium (OGC), George Percivall is responsible for the OGC Interoperability Program and the OGC Compliance Program. His roles include articulating OGC standards as a coherent architecture, as well as addressing implications of technology...

Tuesday May 10, 2016 9:00am - 9:50am PDT
Plaza A

11:20am PDT

Applying Geospatial Analytics Using Apache Spark Running on Apache Mesos - Adam Mollenkopf, Esri
This session will explore how to apply spatiotemporal analytics using Apache Spark to high-velocity streaming data in motion and high-volume batch data at rest. It will compare available open source geospatial libraries, including Apache SIS, Magellan, JTS, and esri/geometry-api-java. Demonstrations will show how to integrate a geospatial library with Spark analytics and how to run these analytics on an Apache Mesos cluster to provide a highly scalable solution with elastic capabilities. Examples will focus on applications in the connected car space, smart cities, and smart communities.

Adam Mollenkopf

Real-Time & Big Data GIS Capability Lead, Esri
Adam Mollenkopf is responsible for the strategic direction Esri takes towards enabling real-time and big data capabilities in the ArcGIS platform. This includes having the ability to ingest real-time data streams from a wide variety of sources, performing continuous and recurring...

Tuesday May 10, 2016 11:20am - 12:10pm PDT
Plaza A

2:00pm PDT

SciSpark: MapReduce in Atmospheric Sciences - Kim Whitehall, NASA Jet Propulsion Laboratory
The atmospheric science (AS) community generates model and observational data to simulate and monitor the Earth system. Big data in the AS community has arrived: high volumes (petabytes), at increasing velocity (to AS groups worldwide) and variety (of data formats and resolutions), are needed for the veracity of models and observation systems that add value to the policy-making process. As scientists require solutions that allow interaction with these big data, the community is interested in the MapReduce paradigm and Apache Spark. This talk presents a specific NASA Advanced Information Systems Technology (AIST) project called “SciSpark” that marries Apache Spark with climate science. SciSpark is a scalable system for interactive AS analysis. We will demonstrate SciSpark’s scientific data ingestion, visual interaction and metrics generation using the Spark engine.


Kim Whitehall

NASA Jet Propulsion Laboratory
Kim is a scientific applications software engineer at NASA’s Jet Propulsion Laboratory.

Tuesday May 10, 2016 2:00pm - 2:50pm PDT
Plaza A

3:00pm PDT

Geospatially Enable Your Hadoop, Accumulo, and Spark Applications with LocationTech Projects - Robert Emanuele, Azavea
What is the average predicted temperature of BC from 2050-2099 based on forecasting models? How many tweets containing the hashtag #apachecon were sent from Canada? In general: how do we ask questions concerning location to very large sets of geospatial data? To answer these types of questions, existing large data processing frameworks like Hadoop, Accumulo and Spark need to be "geospatially enabled". LocationTech is a working group inside of the Eclipse Foundation that is home to 4 open source projects doing exactly that: GeoTrellis, GeoWave, GeoMesa, and GeoJinni (sense a pattern?). In this talk, I will give an introduction to what geospatial data is, talk about challenges in processing large sets of geospatial data, and talk about how these four LocationTech projects work with Apache projects to overcome those challenges and let us get the most out of our large geospatial data.

Robert Emanuele

Software Developer, Azavea
Rob Emanuele is the maintainer of the open source geospatial library GeoTrellis, which provides geospatial capabilities to Apache Spark. He was the program chair for FOSS4G North America in 2015 and 2016. He is a member of the LocationTech Project Management Committee.

Tuesday May 10, 2016 3:00pm - 3:50pm PDT
Plaza A
Wednesday, May 11

10:50am PDT

Hiding Some of Geospatial Complexity - Martin Desruisseaux, Geomatys
It is tempting to ignore the complexity of geospatial international standards on the assumption that everyone today uses coordinates given by GPS. But even though obsolescent, the NAD27 datum, for instance, is still critically important in the U.S., where it has been used for the definitions of many legal boundaries. Even with a modern datum, support of polar areas or supplemental dimensions can be challenging. In this talk, we will present a few key Apache SIS methods that handle a lot of this complexity: e.g. how to get Coordinate Reference Systems from strings and an estimation of transformation accuracy, through an API that avoids diving too deeply into the complexity of GIS. We will show an example of what happens under the hood during a cube transformation, to demonstrate what developers gain with SIS. Finally, we will present applications for trying Apache SIS without programming.


Martin Desruisseaux

Developer, Geomatys
I hold a Ph.D. in oceanography, but have continuously developed tools to support analysis work. I used C/C++ before switching to Java in 1997. I have developed geospatial libraries since then, initially as a personal project, then as a GeoTools contributor until 2008. I'm now...

Wednesday May 11, 2016 10:50am - 11:40am PDT
Plaza A

11:50am PDT

Geospatial Querying in Apache Marmotta - Sergio Fernandez, Redlink GmbH
Apache Marmotta provides different means of querying: SPARQL, LDPath, LDP, etc. GeoSPARQL provides an extension to the SPARQL constructs to represent and query geospatial data. The talk will present the development recently done to add GeoSPARQL support in Marmotta, going through the challenges and potential of this new set of features, and demoing some of them during the talk.

Sergio Fernández

Software Engineer, Redlink GmbH
I'm a software engineer specialized in innovation, with a focus on Data Architectures. My interests include Distributed Architectures, Data Integration, Linked Data and System Engineering. I've worked as a software engineer and project manager in different industries, but always somehow...

Wednesday May 11, 2016 11:50am - 12:40pm PDT
Plaza A

2:00pm PDT

Spatial Data Based People/Vehicles Trails Analysis to Support Precision Urban Planning - Yonghua Zeng, IBM
In this session, the presenter will share experience using Hadoop-based big data technology with large volumes of cellular signal, RFID, and GPS data to analyze and predict the trails of people and vehicles in support of precision urban planning. The architecture includes:
1) Data ingestion with Kafka and streaming to collect and preprocess the cellular signal, RFID, and GPS data generated in real time (200G+ per day)
2) Algorithm models with spatial data computation using Spark Core and MLlib to analyze and predict people and vehicle trails from the collected data
3) SQL-on-Hadoop technology to provide interactive query and analysis for frontend applications
4) Spatial data visualization with GIS and grid technology to render heatmaps, people residency distribution, real road traffic status, origin-destination (OD) maps, etc.
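
The grid-based heatmap in step 4 can be sketched in a few lines of plain Python. This is an illustrative assumption, not the presenter's implementation, and it deliberately leaves out Spark: the function names (grid_cell, heatmap_counts), the cell size, and the sample points are all hypothetical. Each position is snapped to a fixed-size cell, and the per-cell counts are what a heatmap would render.

```python
import math
from collections import Counter

def grid_cell(lon, lat, cell_deg):
    """Map a (lon, lat) position in degrees to integer grid indices."""
    return (math.floor(lon / cell_deg), math.floor(lat / cell_deg))

def heatmap_counts(points, cell_deg=0.01):
    """Count how many positions fall into each grid cell."""
    return Counter(grid_cell(lon, lat, cell_deg) for lon, lat in points)

# Hypothetical sample positions as (lon, lat) pairs.
sample_points = [
    (116.401, 39.901),
    (116.402, 39.902),
    (116.455, 39.955),
]

counts = heatmap_counts(sample_points, cell_deg=0.01)
for cell, n in sorted(counts.items()):
    print(cell, n)
```

In a distributed setting the same key-then-count shape would typically map to a grouped aggregation over the cell index; the cell size trades spatial detail against the number of cells to render.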


Yonghua Zeng

Solution Architect of Big Data, IBM
Henry Zeng is a senior architect of big data and analytics based in the IBM China Development Lab. Henry has more than 10 years of experience in the development and architecture of data management products, systems, and applications, and has published two books in this area. He is now the solution...

Wednesday May 11, 2016 2:00pm - 2:50pm PDT
Plaza A

3:00pm PDT

Crowd Learning for Indoor Positioning - Thomas Burgess, indoo.rs GmbH
Real-time accurate indoor positioning poses many new possibilities and challenges. At indoo.rs (an Austria-based start-up founded in 2010), we enable positioning within mobile applications (Android/iOS) so that users can find themselves and navigate through floor plans. In practice, we estimate location and movement using motion sensors and comparisons of radio scans (WiFi/iBeacon) to pre-measured reference measurements (fingerprints). We are currently transitioning from using dedicated measurements to an approach that learns and updates references by analyzing data from navigating users. This approach uses the Hadoop ecosystem to combine the output of the IoT network of mobiles and beacons with big-data-based machine learning and near real-time analytics (including visualizations). The complete solution reduces the implementation and maintenance cost of installing indoor location.
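
Comparing a radio scan to reference fingerprints can be illustrated with a generic k-nearest-neighbour sketch in Python. This is a textbook technique swapped in for illustration and is not the indoo.rs algorithm: the function names, the -100 dBm placeholder for unseen transmitters, and the sample data are all assumptions.

```python
import math

def rssi_distance(scan, fingerprint, missing_dbm=-100.0):
    """Euclidean distance between two {transmitter_id: RSSI in dBm} mappings.

    Transmitters seen in only one mapping are given an assumed weak value.
    """
    ids = set(scan) | set(fingerprint)
    return math.sqrt(sum(
        (scan.get(i, missing_dbm) - fingerprint.get(i, missing_dbm)) ** 2
        for i in ids
    ))

def estimate_position(scan, references, k=1):
    """Average the positions of the k reference fingerprints closest to the scan."""
    if not references:
        raise ValueError("at least one reference fingerprint is required")
    ranked = sorted(references, key=lambda ref: rssi_distance(scan, ref["rssi"]))
    nearest = ranked[:k]
    x = sum(ref["pos"][0] for ref in nearest) / len(nearest)
    y = sum(ref["pos"][1] for ref in nearest) / len(nearest)
    return (x, y)

# Hypothetical reference fingerprints: floor-plan position in metres plus RSSI values.
references = [
    {"pos": (0.0, 0.0), "rssi": {"ap1": -48.0, "ap2": -72.0}},
    {"pos": (10.0, 0.0), "rssi": {"ap1": -75.0, "ap2": -45.0}},
]
scan = {"ap1": -50.0, "ap2": -70.0}

print(estimate_position(scan, references, k=1))
```

In this picture, the crowd-learning approach described in the abstract amounts to keeping the reference list up to date from user data rather than from dedicated survey measurements.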

Thomas Burgess

Director of research, indoo.rs GmbH
Thomas is the CRO of indoo.rs and has led its research efforts since 2012. Earlier, he did his PhD in particle physics at Stockholm University for the AMANDA/IceCube neutrino telescopes, and worked as a postdoctoral researcher at University of Bergen for the ATLAS experiment at the...

Wednesday May 11, 2016 3:00pm - 3:50pm PDT
Plaza A

5:10pm PDT

Action! Does Your Project Want to Join Geospatial? - George Percivall, Open Geospatial Consortium (OGC)
This will be a capstone discussion of the geospatial track of Apache BD. Now that we have heard several excellent presentations about different projects implementing geospatial functions, is there interest in future activities?

Topics for Discussion
  • Is there interest in coordination across projects?
  • Is there interest in coordination outside of Apache?

Come join us! Future events on Big Geo Data
  • FOSS4G in Bonn, August 24 – 26 - confirmed
  • Apache in Seville later this year - To be discussed.

George Percivall

CTO, Chief Engineer, OGC
As CTO and Chief Engineer of the Open Geospatial Consortium (OGC), George Percivall is responsible for the OGC Interoperability Program and the OGC Compliance Program. His roles include articulating OGC standards as a coherent architecture, as well as addressing implications of technology...

Wednesday May 11, 2016 5:10pm - 6:00pm PDT
Plaza A