Metadata and data governance framework for Hadoop with lineage, audit, and role-based (RBAC) and attribute-based (ABAC) security
- Stars: 2.1k
- Forks: 911
- Open Issues: 139
Apache-2.0
- Java
- TypeScript
- JavaScript

About Apache Atlas
Apache Atlas is an extensible set of core governance services for Hadoop and the wider enterprise data ecosystem. It gives organizations a common metadata store so that metadata consumers can work together without point-to-point interfaces, and it helps teams meet compliance requirements.
It provides visibility into data through prescriptive and forensic models, technical and operational audit, and lineage enriched with business taxonomical metadata. Security is both role-based and attribute-based, and integration with Apache Ranger prevents unauthorized access paths to data at runtime.
Atlas is an Apache project that builds with Java and runs on a self-hosted server, with documented Docker build and run instructions. The distribution ships a server tarball plus hook tarballs for HBase, Hive, Impala, Kafka, Sqoop, Storm, Falcon, and Couchbase, which capture metadata from those systems.
Key features
- Prescriptive and forensic views of data
- Technical and operational audit
- Lineage enriched with business taxonomical metadata
- Role-based and attribute-based access control
- Hooks for Hive, HBase, Impala, Kafka, Sqoop, and more
Details
- On GitHub since: 2017
- Language: Java, TypeScript, JavaScript
- Builds with: Java 8, 11, or 17
- Security: RBAC and ABAC, Apache Ranger
- Self-hosted: Yes, Docker build available
- License: Apache-2.0
