Dagster

Cloud-native data pipeline orchestrator with declarative assets, lineage, and observability

Repository activity
  • Stars: 15.7k
  • Forks: 2.2k
  • Open Issues: 2.7k
License

Apache-2.0

Languages
  • Python
  • TypeScript
  • LookML
About Dagster

Dagster is a cloud-native data pipeline orchestrator for developing and maintaining data assets such as tables, data sets, machine learning models, and reports. It helps teams define what data should exist, run it at the right time, and keep it up to date across local development through production.

Assets are declared as Python functions, then scheduled and monitored in Dagster's web UI. Dagster provides integrated lineage and observability, centralized metadata, diagnostics, and cataloging, along with a declarative programming model for building reusable components and catching data quality issues early.

Dagster has a growing library of integrations for popular data tools and can be deployed to your infrastructure. It is Apache 2.0 licensed, available on PyPI, and officially supports Python 3.9 through Python 3.14.

Key features

  • Declare data assets as Python functions
  • Schedule and monitor pipelines in the web UI
  • Integrated lineage, observability, and metadata
  • Diagnostics, cataloging, and data quality checks
  • Integrations for popular data tools

Details

First released
2018
License
Apache-2.0
Platforms
Web · CLI
Deployment
Cloud · Self-hostable
Language
Python 3.9 to 3.14
Governance
Dagster Labs