16 Best Open Source Alternatives to Datadog

Updated July 2026

Good observability asks you to instrument everything: more hosts, more custom metrics, more trace and log volume. That is precisely what Datadog charges for. Billing tracks hosts, ingested gigabytes, and custom metrics, so the more thoroughly you measure your own systems, the faster the invoice climbs, and the telemetry that explains your infrastructure lives on someone else's servers. The open source alternatives here keep metrics, distributed traces, and logs in stores you run, retained on your terms rather than a billing tier's. Instrumenting one more service costs storage instead of a per-host line item, and you still get the correlated view across your stack that made the incident-time workflow worth having.

1. Netdata

79.2k · GPL-3.0 · C · Self-host

Netdata trades sampling intervals for resolution: it charts system and application metrics at one-second granularity, so a spike that a one-minute tool would average away shows up immediately. Install it and dashboards populate themselves with little configuration, which is how it stays useful for lean teams as much as large fleets.

Per-second metrics and visualizations
Collects from systems, containers, apps, logs, APIs, and synthetic checks
ML models per metric for anomaly detection
Built-in alerts with email, Slack, Telegram, PagerDuty, Discord, and Teams
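
As a rough illustration of why one-second granularity matters, the sketch below compares a one-minute average with the per-second peak for the same data. The sample values are invented for the example and do not come from Netdata; only the arithmetic is the point.

```python
from statistics import mean

# Hypothetical CPU utilisation samples, one per second for one minute:
# 57 quiet seconds at 20% plus a 3-second burst at 100%.
per_second = [20.0] * 60
per_second[30:33] = [100.0, 100.0, 100.0]

# What a tool that stores one point per minute would keep.
one_minute_average = mean(per_second)

# What a per-second chart would still show.
per_second_peak = max(per_second)

print(f"one-minute average: {one_minute_average:.1f}%")  # 24.0%
print(f"per-second peak:    {per_second_peak:.1f}%")     # 100.0%
```

A threshold alert at 80% would never fire on the averaged series, even though the host was saturated for three seconds.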

2. Grafana

74.4k · AGPL-3.0 · TypeScript · Self-host

Grafana is an open-source platform for monitoring and observability. It lets you query, visualize, alert on, and understand metrics no matter where they are stored, and it is built to create, explore, and share dashboards with a team.

Client-side visualizations for metrics and logs
Dynamic dashboards with template variables
Ad-hoc query exploration and side-by-side comparisons
Log exploration with preserved label filters

3. SigNoz

27.3k · Other · TypeScript · Self-host

SigNoz is an open source observability platform for monitoring applications, services, and infrastructure. It brings logs, metrics, and traces into one place so you can spot issues, troubleshoot downtime, and debug with richer context, positioned as an open source alternative to Datadog and New Relic.

Application performance monitoring with p99 latency, error rate, Apdex, and ops per second
Centralized logs with filters, query builder, and log charts
Distributed tracing with flamegraphs and Gantt charts
Metrics dashboards with pie, time-series, and bar chart panels

4. Apache SkyWalking

24.8k · Apache-2.0 · Java · Self-host

Apache SkyWalking is an open source APM system for microservices, cloud-native, and container-based architectures. It collects monitoring, tracing, and diagnostic data from distributed systems and brings service topology, service-centric observability, and dashboards together in one place.

Distributed tracing with service topology analysis
Metrics, logs, profiling, and alarms
Agents for Java, .NET Core, PHP, NodeJS, Go, Python, and more
eBPF-based monitoring and profiling with Rover

5. Jaeger

22.9k · Apache-2.0 · Go · Self-host

Jaeger is a distributed tracing system for monitoring and troubleshooting requests as they flow through complex distributed systems. By following a single request across every service it touches, teams can pinpoint latency, errors, and unexpected behavior in development or production.

Distributed tracing across complex service workflows
Ingests OpenTelemetry trace data over HTTP or gRPC
Pluggable storage backends for traces
Web UI for exploring traces and service dependencies

6. Vector

22k · MPL-2.0 · Rust · Self-host

Vector is an open-source observability data pipeline for collecting, transforming, and routing logs and metrics. It runs end-to-end as an agent or aggregator, so teams can consolidate telemetry flow and send data to current or future vendors. The focus is control over observability data, including cost reduction, enrichment, and data security.

Collect, transform, and route logs and metrics
Deploy as an agent or aggregator
Sources include Docker logs, files, HTTP, journald, Kafka, and sockets
Transforms include dedupe, filter, remap, Lua, and log-to-metric

7. OpenObserve

19.3k · AGPL-3.0 · TypeScript · Self-host

OpenObserve is a cloud-native observability tool for logs, metrics, traces, analytics, and real user monitoring. It is built for teams that want a single place to search, query, and alert on telemetry without the cost and complexity of separate tools.

Parquet columnar storage with S3-native design
Full-text log search, SQL queries, filters, and dashboards
Distributed tracing with OpenTelemetry
Metrics dashboards with SQL or PromQL

8. Pinpoint

13.8k · Apache-2.0 · Java · Self-host

Pinpoint is an application performance management tool for large-scale distributed systems, inspired by Google Dapper. It traces transactions end to end across services so you can see how components connect and quickly find problem areas and bottlenecks in complex applications.

ServerMap for distributed system topology
Real-time active thread chart
Request-response scatter chart
CallStack for code-level transaction visibility

9. Fluentd

13.6k · Apache-2.0 · Ruby · Self-host

Fluentd sits between your data sources and your backend systems as a single unified logging layer, so applications no longer need to know where their logs end up. It collects events from many sources and writes them to files, RDBMS, NoSQL, IaaS, SaaS, Hadoop, and other destinations.

Unified logging layer decouples sources from backends
500+ plugins for inputs and outputs
Writes to files, RDBMS, NoSQL, IaaS, SaaS, and Hadoop
Structured JSON events with fluent.conf routing

10. Quickwit

11.3k · Apache-2.0 · Rust · Self-host

Quickwit is a cloud-native search engine for observability data, focused on logs and distributed traces, with metrics support on the roadmap. It is an open-source alternative to Datadog, Elasticsearch, Loki, and Tempo for teams that need full-text search and analytics over large event data.

Full-text search and aggregation queries
Elasticsearch/OpenSearch-compatible ingest and search APIs
OTEL-native logs and traces with Jaeger-native tracing
Schemaless or strict schema indexing with schemaless analytics

11. HyperDX

9.6k · MIT · TypeScript · Self-host

HyperDX is an open source observability platform for finding production issues in logs, metrics, traces, errors, and session replays. It runs on top of a ClickHouse cluster and is built to make search and visualization faster across production telemetry.

Correlate logs, metrics, session replays, traces, and errors
Schema-agnostic search on an existing ClickHouse schema
Alerts, dashboards, event deltas, and live tailing
OpenTelemetry support with an included collector

12. Falco

9k · Apache-2.0 · C++ · Self-host

Falco is a cloud native runtime security tool for Linux. It detects and alerts on abnormal behavior and potential security threats in real time, acting as a kernel monitoring and detection agent that observes events such as syscalls.

Kernel-level event monitoring based on syscalls
Custom rules engine for host and container behavior
Container runtime and Kubernetes metadata enrichment
Off-host event analysis in SIEM or data lake systems

13. Fluent Bit

7.9k · Apache-2.0 · C · Self-host

Fluent Bit is a lightweight telemetry agent for collecting, processing, and forwarding logs, metrics, and traces from any source to any destination. It is built for Linux, Windows, macOS, BSD, and embedded environments, and is designed to use minimal CPU and memory.

70+ built-in plugins for inputs, filters, and outputs
SQL stream processing for analytics and transformations
Built-in TLS and SSL support with async I/O
Internal metrics exposed over HTTP and Prometheus

14. Coroot

7.8k · Apache-2.0 · Go · Self-host

Coroot is an open source observability and APM tool that brings metrics, logs, traces, and profiles together in one place. It cuts down manual investigation by turning that telemetry into actionable insights, including automated root cause analysis and SLO-based alerting.

Automatic collection of metrics, logs, traces, and profiles via eBPF
eBPF instrumentation with zero code changes
Service map, predefined inspections, and SLO-based alerting
Distributed tracing and continuous profiling

15. Uptrace

4.2k · AGPL-3.0 · Go · Self-host

Uptrace is an open source APM for monitoring applications and troubleshooting issues with OpenTelemetry traces, metrics, and logs. It is built for teams that want a single place to follow application behavior across telemetry data.

Single UI for traces, metrics, and logs
50+ pre-built dashboards
Alerting with Email, Slack, WebHook, and AlertManager
SQL-like span queries and PromQL-like metric queries

16. Elastic APM

1.3k · Other · Go · Self-host

Elastic APM Server is the application performance monitoring component of Elastic Observability. It receives data from Elastic APM agents instrumented in your applications and turns it into Elasticsearch documents, so performance data lands in the same store as your logs and metrics for hybrid-cloud applications.

Ingests data from Elastic APM agents
Stores APM data as Elasticsearch documents
End-to-end distributed tracing with metrics and logs in context
Accepts OpenTelemetry data

Our picks

Datadog's breadth is the hard part to match, so pick by whether you want one platform, cheap storage, or zero-touch instrumentation.

OpenTelemetry-native all-in-one: SigNoz

SigNoz brings logs, metrics, and traces into one interface built on OpenTelemetry and ClickHouse, positioned as a Datadog alternative. It covers APM with p99 latency and error rates, distributed tracing with flamegraphs, log search, and anomaly-detection alerts, and supports every language OpenTelemetry does.

Instant per-second infrastructure metrics: Netdata

Netdata charts system and application metrics at one-second resolution, so short spikes a coarser tool would average away stay visible. Dashboards populate themselves with little configuration, and a model trained per metric flags anomalies. It runs on Linux, macOS, Windows, Docker, and Kubernetes, keeps data on your own infrastructure, and exports to Prometheus, InfluxDB, and Graphite.

Zero-instrumentation via eBPF: Coroot

Coroot uses eBPF to collect metrics, logs, traces, and profiles with no code changes, so even legacy or third-party services are covered. On top of that data it builds a service map, distributed tracing, SLO-based alerting, and automated root cause analysis, and stores logs in ClickHouse. It suits teams that cannot easily re-instrument everything.

High-volume storage on the cheap: OpenObserve

OpenObserve handles logs, metrics, traces, and real user monitoring with Parquet columnar, S3-native storage, which keeps large volumes affordable. Logs support full-text search and SQL, metrics query in SQL or PromQL, and it deploys as a single Rust binary with native multi-tenancy. A strong fit when log and trace volume is the main cost driver.

Taking observability off Datadog

Datadog is hard to replace because it is not just a metrics tool; it is host and container monitoring, logs, traces, dashboards, alerting, service maps, synthetics, and a large integration catalog behind one data model. The first real choice is whether you want one open source platform that covers most of that surface, or a composed stack with separate systems for metrics, logs, and traces. A composed stack gives sharper control over retention and cost, but correlation, permissions, upgrades, and on-call ownership all become your responsibility.

Where open source usually lags is the glue rather than the core telemetry: tag conventions, saved views, monitor evaluation, alert routing, and integration defaults that teams stopped noticing. The collection paths themselves are in good shape, especially if you standardize on OpenTelemetry. SigNoz and Uptrace are built OpenTelemetry-native and put traces, metrics, and logs in one interface, so they land closest to Datadog's single-pane feel. Coroot takes a different route, using eBPF to collect metrics, logs, traces, and profiles with no code changes, which helps for legacy services you cannot easily instrument. Budget time to retrain engineers on new query languages and to rebuild runbooks tied to Datadog screens.

Cut over as a dual-write period, not a flag day. Inventory Datadog monitors, dashboards, metric names, tag keys, log pipelines, trace instrumentation, and SLOs, then move collection toward vendor-neutral agents and send the same telemetry to both systems until alert behavior matches. Dashboards and monitors often export as JSON through the API or infrastructure-as-code state, but the queries need translation. Historical logs and metrics depend on retention and archives, so treat old data as reference you can read, not something that will rehydrate cleanly into the new backend.
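
The inventory step can start from exported monitor JSON. The sketch below is a minimal example that assumes each exported monitor is an object with `name`, `query`, and `tags` fields; the sample export is made up, and the regular expression is a heuristic that only covers simple metric queries, so treat the result as a starting list rather than a complete inventory.

```python
import json
import re
from collections import Counter

# Hypothetical export: a JSON list of monitor objects (shape is an assumption).
EXPORT = """
[
  {"name": "High CPU",
   "query": "avg(last_5m):avg:system.cpu.user{env:prod} by {host} > 90",
   "tags": ["team:platform"]},
  {"name": "Disk filling",
   "query": "max(last_10m):max:system.disk.in_use{env:prod,role:db} > 0.9",
   "tags": ["team:data"]}
]
"""

# Heuristic: "<aggregator>:<metric.name>{<scope>}" inside a metric query.
METRIC_RE = re.compile(r"(?:avg|max|min|sum):([A-Za-z_][\w.]*)\{([^}]*)\}")


def inventory(monitors):
    """Count metric names and tag keys referenced by a list of monitors."""
    metrics, tag_keys = Counter(), Counter()
    for monitor in monitors:
        for metric, scope in METRIC_RE.findall(monitor.get("query", "")):
            metrics[metric] += 1
            for pair in scope.split(","):
                key, sep, _ = pair.strip().partition(":")
                if sep:  # skips wildcard scopes such as "*"
                    tag_keys[key] += 1
        for tag in monitor.get("tags", []):
            key, sep, _ = tag.partition(":")
            if sep:
                tag_keys[key] += 1
    return metrics, tag_keys


metrics, tag_keys = inventory(json.loads(EXPORT))
print("metrics:", sorted(metrics))
print("tag keys:", sorted(tag_keys))
```

The two counters give you the metric names the new backend must receive and the tag keys its queries will need before any monitor is translated.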

Frequently asked questions

What is the hardest part of replacing Datadog?

Rarely the collection. The hard part is recreating the operating model built around Datadog: tag standards, monitor ownership, dashboard conventions, alert routes, service views, and incident habits. In many teams Datadog is also the shared source of truth during an outage. Plan the replacement around those workflows first and storage engines second, or you will move the data and lose the muscle memory that made it useful.

Should I use one platform or a separate stack for metrics, logs, and traces?

It depends on how much operational load you can carry. A single platform like SigNoz keeps logs, metrics, and traces correlated in one interface with less to run. A composed stack of specialized systems gives sharper control over retention and cost per signal, but you own the correlation, upgrades, and access model. Smaller teams usually favor one platform; teams with strong cost pressure on a single signal often split it out.

How does OpenTelemetry help when leaving Datadog?

It decouples your instrumentation from any one backend. If your services emit OpenTelemetry, you can change where traces and metrics land without re-instrumenting code. SigNoz and Uptrace ingest OpenTelemetry natively, and Coroot uses eBPF to collect telemetry with no code changes at all. Standardizing on open protocols during a dual-write period is what lets you send the same data to Datadog and its replacement while you compare them.
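
To make the decoupling concrete, here is a minimal sketch of a fan-out exporter. It deliberately does not use the OpenTelemetry SDK: `Span`, `Exporter`, `InMemoryExporter`, and `FanOut` are hypothetical names invented for this example, and in practice the same fan-out is more often done in a collector configured with two exporters than in application code.

```python
from dataclasses import dataclass, field
from typing import Protocol


@dataclass(frozen=True)
class Span:
    """Simplified stand-in for a trace span (not the OpenTelemetry type)."""
    trace_id: str
    name: str
    duration_ms: float


class Exporter(Protocol):
    def export(self, span: Span) -> None: ...


@dataclass
class InMemoryExporter:
    """Stand-in for a real backend exporter; it only records what it receives."""
    label: str
    received: list = field(default_factory=list)

    def export(self, span: Span) -> None:
        self.received.append(span)


@dataclass
class FanOut:
    """Sends every span to all configured backends during the dual-write period."""
    exporters: list

    def export(self, span: Span) -> None:
        for exporter in self.exporters:
            try:
                exporter.export(span)
            except Exception as exc:  # one failing backend must not block the other
                print(f"export to {exporter!r} failed: {exc}")


current_backend = InMemoryExporter("current")
replacement_backend = InMemoryExporter("replacement")
pipeline = FanOut([current_backend, replacement_backend])

# Application code only ever talks to `pipeline`, never to a specific vendor.
pipeline.export(Span("t1", "GET /checkout", 41.5))
pipeline.export(Span("t2", "POST /orders", 120.0))

print(len(current_backend.received), len(replacement_backend.received))  # 2 2
```

Ending the dual-write period then means removing one entry from the exporter list, with no change to instrumented code.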

Will APM traces look the same after the switch?

Not exactly. Trace ingestion may be simple, but the experience around service maps, dependency views, error grouping, sampling controls, and flame graphs will differ. Datadog also nudges you toward particular naming and tagging conventions. Before switching, validate that traces propagate across your main services, that spans keep useful attributes, and that high-cardinality data will not overwhelm the new backend. Coroot and HyperDX build their trace views on ClickHouse, which shapes how queries feel.

How do logs change when moving off Datadog?

Logs are where cost and surprises usually appear. Datadog log pipelines, parsing rules, indexes, facets, and exclusion filters can accumulate for years. Rebuild them deliberately instead of forwarding everything: decide which logs need fast search, which need cheap retention, which fields must be parsed, and which sensitive values to drop before storage. OpenObserve leans on S3-backed columnar storage specifically to keep high log volume affordable.
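
Those decisions can be written down as explicit rules before any pipeline is configured. The sketch below is one hypothetical way to express them: the field names, the sensitive-key list, and the `hot` / `archive` / `drop` tiers are assumptions for illustration and are not taken from any particular tool.

```python
import re

# Hypothetical policy: keys that must never reach storage.
SENSITIVE_KEYS = {"password", "authorization", "credit_card"}
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def scrub(record: dict) -> dict:
    """Drop sensitive fields and mask e-mail addresses in string values."""
    clean = {}
    for key, value in record.items():
        if key.lower() in SENSITIVE_KEYS:
            continue
        if isinstance(value, str):
            value = EMAIL_RE.sub("<email>", value)
        clean[key] = value
    return clean


def route(record: dict) -> str:
    """Pick a storage tier: fast search, cheap retention, or no storage."""
    level = str(record.get("level", "")).lower()
    if level == "debug":
        return "drop"
    if level in {"error", "warn"}:
        return "hot"      # needs fast search during incidents
    return "archive"      # cheap retention, slower to query


records = [
    {"level": "error", "msg": "login failed for ana@example.com", "password": "hunter2"},
    {"level": "info", "msg": "healthcheck ok"},
    {"level": "debug", "msg": "cache miss"},
]

for record in records:
    print(route(record), scrub(record))
```

Keeping the policy in one reviewable place makes it easier to check that every old parsing rule and exclusion filter was either carried over or retired on purpose.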

Do I have to self-host everything after leaving Datadog?

No. Some teams run the whole stack themselves; others use managed hosting around the same open source components. The tradeoff is control versus operational load. Self-hosting gives you direct say over retention, data location, and tuning, while managed options cut upgrade and reliability work. Either way, understand the data formats, ingestion limits, and export paths before you commit, because those decide how easily you can move again later.

What should I test before cutting over alerts from Datadog?

Run Datadog and the replacement in parallel long enough to cover normal traffic, deploys, batch jobs, and at least one real incident or game day. Compare alert timing, missing-data behavior, recovery notifications, grouping, deduplication, and escalation. A monitor that looks equivalent on paper can still page too late, page too often, or miss a partial failure, and only a parallel run against real conditions exposes that.
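
A parallel run is easier to judge with a small comparison script. The sketch below assumes you can export, for each system, a mapping from monitor name to the times it fired; the data, the 15-minute matching window, and the 2-minute lateness tolerance are all invented for the example.

```python
from datetime import datetime, timedelta


def compare_alerts(old, new,
                   window=timedelta(minutes=15),
                   tolerance=timedelta(minutes=2)):
    """Pair each old-system firing with the nearest new-system firing."""
    report = {"missed": [], "late": [], "extra": []}
    for name in sorted(set(old) | set(new)):
        unmatched = sorted(new.get(name, []))
        for fired in sorted(old.get(name, [])):
            nearest = min(unmatched, key=lambda t: abs(t - fired), default=None)
            if nearest is None or abs(nearest - fired) > window:
                report["missed"].append((name, fired))
                continue
            unmatched.remove(nearest)
            if nearest - fired > tolerance:
                report["late"].append((name, fired, nearest - fired))
        # Anything left fired only in the replacement system.
        report["extra"].extend((name, t) for t in unmatched)
    return report


def at(hour, minute=0):
    return datetime(2026, 1, 15, hour, minute)


old_system = {"api-5xx": [at(10), at(14)], "disk-full": [at(11)]}
new_system = {"api-5xx": [at(10, 1), at(14, 6), at(16)]}

report = compare_alerts(old_system, new_system)
for kind, items in report.items():
    print(kind, items)
```

Here `disk-full` is reported as missed, the 14:00 `api-5xx` firing as six minutes late, and the 16:00 firing as extra, which are the three outcomes worth reviewing before cutover.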