Apache: Big Data 2016 has ended

Security
Tuesday, May 10

9:00am PDT

Enforcing Fine Grained Role Based Authorization in Multi-tenant Streaming Data Platforms - Ashish Singh, Cloudera
Reliable, high-rate ingestion of data from a large variety of sources is the first step toward answering big analytical questions, and Apache Kafka, a scalable publish-subscribe messaging system, is a popular choice for this goal. With the increasing adoption of Kafka, security has become more important than ever. In this talk, Ashish Singh will review recent advances in Kafka toward closing security gaps, and discuss how the addition of pluggable authorization in Kafka has enabled Apache Sentry (incubating) to provide enterprise-grade, fine-grained, role-based authorization in Kafka. The talk will conclude with a working demonstration of how administrators can rely on Sentry to enforce fine-grained authorization in multi-tenant Kafka platforms.

Ashish Singh

Software Engineer, Cloudera
Ashish Singh is a Software Engineer, working with Cloudera to empower the Hadoop ecosystem to answer bigger questions. Ashish studied Computer Science and Engineering at Ohio State University. Before working in the Big Data space, he worked on optimizing MPI collective communications...

Tuesday May 10, 2016 9:00am - 9:50am PDT
Regency E

10:00am PDT

Protecting Enterprise Data in Hadoop - Owen O'Malley, Hortonworks
Hadoop has long had strong authentication via integration with Kerberos,
authorization via User/Group/Other HDFS permissions, and auditing via
the audit log. Recent developments in Hadoop have added HDFS file access
control lists, pluggable encryption key provider APIs, HDFS snapshots,
and HDFS encryption zones. Together, these additions provide important new
data protection capabilities that every company should be using to protect
its data. This talk will cover what the new features are
and when and how to use them in enterprise production environments.
Upcoming features, including columnar encryption in the ORC columnar format,
will also be covered. By encrypting particular columns, enterprises can
control which users have access to particularly sensitive columns that
contain personally identifiable information or financial information.

Owen O’Malley

Co-founder & Sr Architect, Hortonworks
Owen O’Malley is a co-founder and architect at Hortonworks, which develops the completely open source Hortonworks Data Platform (HDP). HDP includes Hadoop and the large ecosystem of big data tools that enterprises need for data analytics. Owen has been working on Hadoop since 2006...

Tuesday May 10, 2016 10:00am - 10:50am PDT
Regency E

11:20am PDT

Apache Eagle - Identify Threats Instantly Through Policy Engine and User Profile - Medha Samant, eBay
Apache Eagle is an open source monitoring framework for Hadoop that instantly identifies access to sensitive data, recognizes attacks and malicious activity in Hadoop, and takes action in real time. Eagle provides a distributed, fault-tolerant policy engine and out-of-the-box machine learning models that build user profiles offline from historical user behavior and detect anomalies online.

Apache Eagle was initially created to fill some obvious gaps in the Hadoop security landscape and soon expanded to Hadoop system monitoring, including MapReduce job monitoring, data node anomaly detection, and master node garbage collection activity monitoring.

Apache Eagle’s core is a fully distributed policy evaluation engine. It solves problems that are common yet hard for traditional monitoring, such as horizontal scalability, data skew, policy fault tolerance, and a fluent stream DSL.

Medha Samant

Director, Product Management, eBay
Medha Samant is Director of Product Management at eBay, driving product strategy and execution for the Data Analytics and Business Intelligence platform. Medha has over 20 years of extensive and diversified experience across product development, product innovation and strategy, product...

Tuesday May 10, 2016 11:20am - 12:10pm PDT
Regency E

2:00pm PDT

Apache Kerby for Big Data Security - Kai Zheng, Intel
Big data platforms based on Apache Hadoop present numerous security, compliance, and integration challenges in both enterprise and Internet domains. This session will present a new, comprehensive authentication solution that uses Apache Kerby in Hadoop, allowing everyone to connect from anywhere in the ecosystem in a low-risk yet secure manner while benefiting from Kerberos. Apache Kerby has been a sub-project of Apache Directory since January 2015. It is an implementation of Kerberos in Java and will provide a rich, intuitive, and interoperable library and facilities that integrate multiple authentication mechanisms, including PKINIT, OTP, and token (OAuth 2.0). We will introduce and discuss the solution, the state of community development, highlighted features, and the roadmap. We will also give a demo and explain how Kerby’s embedded nature can be leveraged in Hadoop.


Kai Zheng

Kai is a senior software engineer at Intel who has worked in the big data and security fields for quite a few years. He is a key Apache Kerby initiator, an Apache Directory PMC member, and an Apache Hadoop committer.

Tuesday May 10, 2016 2:00pm - 2:50pm PDT
Regency E

3:00pm PDT

Enabling Universal Authorization Models Using Sentry - Hao Hao & Anne Yu, Cloudera
Sentry is a framework that provides fine-grained access control on data stored on a Hadoop cluster. Sentry has been used to manage authorization policies for Hive, Solr, Impala, and Kafka. A new generic authorization model has been implemented in Sentry to make it easy to protect the various types of data that exist in Hadoop engines. Supporting different authorization models is critical to meeting diverse security requirements. The architecture of the new generic model is designed to make it easy to plug in various authorization models, such as role-based access control and attribute-based access control, on top of the same storage service. In this talk, Anne and Hao will present the architecture of the generic model in the Sentry framework, which satisfies all of those goals. They will also explain how to create policies that protect data for different Hadoop engines.


Hao Hao

Hao Hao is a software engineer at Cloudera. She works on the Sentry project, a granular, role-based authorization module for Hadoop clusters. She is also a committer on the Apache Sentry (incubating) project. Hao has performed extensive research on smartphone security, web security while...

Anne Yu

Software Engineer, Cloudera
I am a software engineer working at Cloudera. I am also a PMC member and committer of Apache Sentry. I am interested in big data and its security technologies. I am also very interested in technology-driven education systems, such as AltSchool.

Tuesday May 10, 2016 3:00pm - 3:50pm PDT
Regency E