For a more thorough explanation of what Okera offers, please refer to the documentation here.
Okera solves the complex problem of managing and controlling access to many big data sources. It integrates transparently into Hadoop, S3, EMR, and Spark (and many more architectures!) to provide a unified metadata catalog and access control policy store. There are two modes of operation:
- Data access is handled directly by the integrated systems (e.g., Hive, Impala), with Okera acting only as the authorization layer.
- Data access, as well as policy access and enforcement, is managed by the Okera services.
Like massively parallel processing (MPP) databases, Okera is always available, taking on requests from a growing set of data processing systems, such as Spark, Python, Presto, Impala, and Hive, with no delay and no additional overhead. Its scalability and high availability make Okera the ideal backend for all access to data in many of the common storage systems, like S3, HDFS, and Kafka, which often provide only coarse-grained security controls. All access to data is vetted against the central access control policies and logged for auditing. Because Okera optimizes access to data using state-of-the-art technologies, queries are accelerated and can take further advantage of the acceleration mechanisms that Okera provides.
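To make the idea of vetting every request against a central policy store and logging it for auditing more concrete, here is a minimal Python sketch. It is not Okera's API: the `Grant` and `PolicyStore` classes, the `authorize` method, and the dataset and role names are all hypothetical and exist only to illustrate the pattern of column-level checks, which is finer-grained than the per-bucket or per-path controls that storage systems typically offer.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class Grant:
    """Hypothetical policy entry: a role may read some columns of a dataset."""
    role: str
    dataset: str
    columns: frozenset


@dataclass
class PolicyStore:
    """Hypothetical central store holding grants and an in-memory audit log."""
    grants: list = field(default_factory=list)
    audit_log: list = field(default_factory=list)

    def authorize(self, user, roles, dataset, columns):
        """Return True only if every requested column is granted to one of
        the user's roles; record the decision in the audit log either way."""
        requested = set(columns)
        allowed = set()
        for grant in self.grants:
            if grant.role in roles and grant.dataset == dataset:
                allowed |= grant.columns

        # An empty request is denied rather than trivially approved.
        decision = bool(requested) and requested <= allowed

        self.audit_log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "user": user,
            "dataset": dataset,
            "columns": sorted(requested),
            "allowed": decision,
        })
        return decision


if __name__ == "__main__":
    store = PolicyStore(grants=[
        Grant("analyst", "sales.orders", frozenset({"order_id", "amount"})),
    ])
    print(store.authorize("alice", {"analyst"}, "sales.orders", ["order_id"]))
    print(store.authorize("alice", {"analyst"}, "sales.orders", ["customer_email"]))
    print(len(store.audit_log), "audit entries recorded")
```

The point of the sketch is that denied requests are logged as well as granted ones, so the audit trail covers every access attempt. A real deployment would persist both the grants and the log rather than keep them in memory.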
Through integration with industry-standard security mechanisms, such as Kerberos as provided by Microsoft Active Directory servers, access to data is authenticated against the existing company-wide user directories.
All of this ensures that Okera does not introduce any additional complexity, but rather simplifies your life and makes access to data a reliable, secure, and transparent experience.