Apache: Big Data 2016 has ended

State-Future of $foo
Monday, May 9

10:40am PDT

Apache Hadoop 3 Current Status - Akira Ajisaka, NTT DATA
Do you want a Hadoop 3 release? It has been over 4 years since Hadoop 3 and Hadoop 2 diverged, and Hadoop 3 contains many significant improvements, such as the Shell Script Rewrite and MapReduce Native Optimization. Once Hadoop 3 is released, users will be able to benefit from these new features.
In this session, we will introduce the new features and incompatible changes in Hadoop 3 and explain how the release is being discussed in the Apache Hadoop community. In addition, Akira Ajisaka would like to discuss releasing Hadoop 3 with the participants, if possible.

Akira Ajisaka

Software Engineer, NTT DATA Corporation
Akira Ajisaka is a software engineer at NTT DATA, Japan. He belongs to the OSS Professional Services team, where he deploys and operates Hadoop clusters for customers. He sometimes troubleshoots them by investigating source code and creating patches to fix the problem. He is an Apache...

Monday May 9, 2016 10:40am - 11:30am PDT
Regency B

3:00pm PDT

Dockerized Hadoop Platform and Recent Updates in Apache Bigtop - Amir Sanjar, IBM & Yu-Hsin Yeh, Trend Micro
Apache Bigtop is a project focused on packaging, testing, and configuration management solutions for the Hadoop ecosystem. In this presentation, we’ll talk about how Bigtop Provisioner integrates with Docker Swarm, Docker Compose, and Docker Machine to give you the ability to run a fully distributed Hadoop cluster on Docker anywhere. In addition, the newly developed image pre-build feature substantially improves the user experience by cutting provisioning time to less than a minute. Another exciting piece of work in Bigtop over the past few months is the IBM PowerPC integration. To sum up, this talk covers:
1) How Bigtop Provisioner integrates with the Docker ecosystem to achieve multi-host Hadoop cluster deployment.
2) The integration of IBM PowerPC with Apache Bigtop.
3) Newly added Hadoop ecosystem components and some new features we’ve developed recently.

Amir Sanjar

Sr. Software Eng, IBM - Apache Bigtop PMC
Amir Sanjar has many years of experience in big data software and solution development at companies including IBM and Canonical. He is the inventor of several patents in areas of enterprise solution automation and wireless/cell technology. Currently, he leads big data ecosystem and...

Evans Ye

ASF member, Apache Bigtop Committer/PMC member/Former VP, Director of Taiwan Data Engineering Association, Apache Software Foundation
Yu-Hsin Yeh (Evans Ye) is a former VP of Apache Bigtop and is currently a committer and PMC member. He loves to code, automate things, and tackle big data challenges. Aside from engineering, he is also an enthusiast in giving talks to share software innovations and cutting-edge technologies...

Monday May 9, 2016 3:00pm - 3:50pm PDT
Regency B

4:10pm PDT

On the Bleeding Edge - Cassandra 3.4 and Beyond - Jonathan Haddad, DataStax
Cassandra is recognized as the best distributed database leveraging continuous availability and partition-tolerance for global deployments. With a strong open source history that began at Facebook to solve problems of absurdly massive scale, Cassandra has grown to be a huge project with a bright future. In this talk we will unpack exactly what that future is all about. With a brand new, high-performance Secondary Index implementation, SSTable encryption, a paradigm shift in architecture moving away from SEDA and towards thread-per-core, and Materialized Views and Aggregations, Cassandra is maturing as a powerful front-runner on the bleeding edge of the NoSQL space.

Jon Haddad

Evangelist for Apache Cassandra, DataStax
Jon has 15 years of experience in both development and operations. For 10 years he’s worked at various startups in southern California. For 2 years he was the maintainer of cqlengine, the Python object mapper for Cassandra, now integrated into the native Cassandra driver. He’s...

Monday May 9, 2016 4:10pm - 5:00pm PDT
Regency B

5:10pm PDT

Apache Tika - What’s New with 2.0? - Nick Burch, Quanticate
Apache Tika detects and extracts metadata and text from a huge range of file formats and types. From Search to Big Data, single file to internet scale, if you’ve got files, Tika can help you get out useful information!

Apache Tika has been around for nearly 10 years now, and with the passage of all that time, plus the new 2.0 release, a lot has changed. Not only has there been a huge increase in the number of supported formats, but the ways of using Tika have expanded, and some of the philosophies on the best way to handle things have altered with experience. Tika has also gained support for a wide range of programming languages and, more recently, for Big Data scale processing.

Whether you’re an old hand with Tika looking to know what’s hot or different with 2.0, or someone new looking to learn more about the power of Tika, this talk will have something in it for you!


Nick Burch

CTO, Quanticate
Nick began contributing to Apache projects in 2003, and hasn't looked back since! He's mostly involved in "Content" projects like Apache POI, Apache Tika and Apache Chemistry, as well as foundation-wide activities like Conferences and Travel Assistance. Nick is CTO at Quanticate, a...

Monday May 9, 2016 5:10pm - 6:00pm PDT
Regency B