How to Migrate from Datadog APM to OpenTelemetry in One Afternoon
A pragmatic four-hour migration playbook from Datadog APM to OpenTelemetry-native observability — including the gotchas nobody warns you about.
Most "migration guides" for observability are 47 pages of marketing dressed as documentation. This one assumes you have a Node application using the dd-trace package, you want to swap it for OpenTelemetry, and you have an afternoon. Here's the order of operations.
If you'd rather see a comparison of where the migrated traces could land, the Datadog alternative comparison covers the destination options.
Pre-flight (10 minutes)
Open three browser tabs:
- Your current Datadog APM dashboard for one service. Note the top 5 charts your team actually uses.
- Your application's `package.json`. Confirm `dd-trace` is in `dependencies`, not `devDependencies`.
- Your service's start command. Find where the agent is initialized — usually `import 'dd-trace/init.js'` at the top of `app.js`, or via a `NODE_OPTIONS='-r dd-trace/init'` env var.
If you're in containers, find the entrypoint command in your Dockerfile or Helm chart. Same logic applies.
Step 1 — install the OTel SDK (5 minutes)
npm install @opentelemetry/sdk-node @opentelemetry/auto-instrumentations-node @opentelemetry/exporter-trace-otlp-http
Or, if you want auto-instrumentation plus the SecureNow firewall in one package:
npm install securenow
(This is the one I'll use for the rest of the post because it collapses six dependencies into one. Pick whichever you prefer.)
Step 2 — replace the start command (2 minutes)
Find the line where dd-trace is loaded. There are usually two places it can be:
// Option A: explicit import at the top of app.js
import 'dd-trace/init.js';
// Option B: NODE_OPTIONS env var
NODE_OPTIONS='-r dd-trace/init' node app.js
Replace whichever you have with:
node -r securenow/register app.js
# or:
node -r @opentelemetry/auto-instrumentations-node/register app.js
That's the actual swap. Restart the service. You're now emitting OpenTelemetry spans instead of Datadog's proprietary format.
Step 3 — point at a destination (5 minutes)
The default OTLP endpoint is http://localhost:4318 (the OpenTelemetry collector). If you don't have one running, set:
SECURENOW_API_KEY=snk_live_... # if using SecureNow
# or:
OTEL_EXPORTER_OTLP_ENDPOINT=https://your-otlp-endpoint
The first option (SecureNow) is the simplest because it's a managed destination with a free tier. The second is what you'd do if you're standing up SigNoz, Grafana Tempo, or any self-hosted OTLP backend.
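If you'd rather wire the exporter in code than lean on env vars alone, here's a minimal sketch (the file name `instrumentation.cjs` and the service name are placeholders; preload it with `node -r ./instrumentation.cjs app.js`):

```js
// instrumentation.cjs — programmatic alternative to the register preload.
const { NodeSDK } = require('@opentelemetry/sdk-node');
const { getNodeAutoInstrumentations } = require('@opentelemetry/auto-instrumentations-node');
const { OTLPTraceExporter } = require('@opentelemetry/exporter-trace-otlp-http');

const sdk = new NodeSDK({
  serviceName: 'checkout-api', // placeholder: set your service's name
  // Honors OTEL_EXPORTER_OTLP_ENDPOINT; otherwise defaults to http://localhost:4318
  traceExporter: new OTLPTraceExporter(),
  instrumentations: [getNodeAutoInstrumentations()],
});

sdk.start();
```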
If you want to keep Datadog as the destination during the migration (so you don't have to rebuild dashboards immediately), Datadog accepts OTLP at:
OTEL_EXPORTER_OTLP_ENDPOINT=https://trace.agent.<region>.datadoghq.com
OTEL_EXPORTER_OTLP_HEADERS=DD-API-KEY=<your-datadog-api-key>
This is the path most teams take: data layer → OpenTelemetry; destination → still Datadog (for now).
Step 4 — verify (15 minutes)
Hit the service with a few requests. In your destination dashboard, look for:
- A trace with the right `service.name` (matches your app)
- Spans for HTTP requests, with the right path and status code
- Database spans, if applicable
- Error events on traces that 500'd
If something is missing, the most common cause is that the SDK loaded after your framework started. The -r flag is critical because it loads the OTel SDK before any of your require() calls patch the framework. If you skip the -r flag and try to import OTel at the top of app.js, you'll see partial coverage.
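To see why ordering matters, here's a sketch of the failure mode, using the hypothetical `instrumentation.cjs` from Step 3:

```js
// app.js — the SDK cannot patch modules that were loaded before it started.
const express = require('express');   // express is loaded and cached here...
require('./instrumentation.cjs');      // ...so starting the SDK afterwards patches nothing
// Preloading avoids this, because the SDK runs before any app require():
//   node -r ./instrumentation.cjs app.js
```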
Step 5 — port custom traces (30 minutes)
If you have hand-written dd-trace API calls, they need to be rewritten to OpenTelemetry's API. The good news: the concepts map 1:1. The bad news: the symbol names differ.
| Datadog | OpenTelemetry |
|---|---|
| `tracer.startSpan('op')` | `tracer.startSpan('op')` |
| `span.setTag('key', 'value')` | `span.setAttribute('key', 'value')` |
| `span.finish()` | `span.end()` |
| `tracer.scope().activate(span, fn)` | `context.with(trace.setSpan(context.active(), span), fn)` |
Search the codebase for dd-trace imports and tracer. references. Replace with the OTel equivalents. A team with 30 hand-written spans typically finishes this in 30–45 minutes.
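As a concrete sketch, a hand-ported span might look like this (the span name, attribute, and function are illustrative, not from your codebase):

```js
const { trace, context, SpanStatusCode } = require('@opentelemetry/api');

const tracer = trace.getTracer('billing'); // instrumentation scope name: your choice

// Previously: const span = tracer.startSpan('charge.process'); span.setTag(...); span.finish();
function processCharge(orderId) {
  const span = tracer.startSpan('charge.process');
  return context.with(trace.setSpan(context.active(), span), () => {
    try {
      span.setAttribute('order.id', orderId);
      // ... existing business logic ...
      return true;
    } catch (err) {
      span.recordException(err);
      span.setStatus({ code: SpanStatusCode.ERROR });
      throw err;
    } finally {
      span.end();
    }
  });
}
```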
Step 6 — port the dashboards (90 minutes)
This is the biggest chunk and it's mostly clicking, not engineering. List the 5–10 dashboards your team actually uses (be honest — the count is always lower than people think). For each:
- Latency histograms → query the same data in your new tool's chart language
- Error rate → either a status-code histogram or an exception count
- Throughput → request count per minute
- Top endpoints → group by `http.target` (OTel's standard attribute name)
If you're moving to PromQL-based tools (Grafana, Mimir, SigNoz), most of the queries are one-line PromQL. If you're moving to a SQL-based tool (SecureNow, ClickHouse-backed), the queries are SQL on the spans table.
Step 7 — port the alerts (45 minutes)
Same logic as dashboards. The actual alert rules (P99 > 500ms for 5 minutes, error rate > 1%, etc.) should be identical — only the query syntax changes. For most teams, alerts are 10–20 rules; budget two minutes per rule for finding the right metric and setting thresholds.
Step 8 — flip the agent off (5 minutes)
Once the new destination has 24–48 hours of data and you're confident, disable the Datadog agent's tracer:
DD_APM_ENABLED=false
Or remove the agent install entirely if you also want logs and infrastructure metrics off Datadog. The agent will keep running for log forwarding until you replace that too — which is a separate migration with its own checklist.
What goes wrong (and how to fix it)
"Spans missing entirely." The -r flag loaded after your framework. Confirm your start command actually has -r securenow/register or -r @opentelemetry/auto-instrumentations-node/register before the script name.
"Spans are there but missing attributes." OTel's auto-instrumentation respects framework conventions; Datadog's was more aggressive about attaching custom attributes. If a specific attribute is missing, you can add it in a span processor or set it manually on the active span.
"Errors aren't linked to traces." OTel's exception recording uses span.recordException(err). If you have global error handlers (Express's error middleware, etc.), OTel auto-instrumentation patches them, but custom error pipelines may need an explicit call.
"Trace propagation broken between services." OpenTelemetry uses W3C traceparent headers by default; Datadog uses x-datadog-trace-id. If you have two services where one is OTel and one is still on Datadog, configure both to support both headers — most SDKs allow this with a propagators config flag.
The afternoon's actual time budget
Realistic per-service: 30–60 minutes for the SDK swap and verification. Multiply by the number of services. Add 90 minutes for dashboards (one-time, not per-service). Add 45 minutes for alerts. Add a coffee break.
For a 5-service stack: ~5 × 45 minutes = 3.75 hours of focused work, plus 90 minutes of dashboard/alert work, plus parallel verification = ~6 hours. One engineer, one afternoon.
For a 50-service stack: do it 5 services at a time over a sprint. Don't do them all at once — the parallel verification is what makes it safe.
After the migration
You're now OpenTelemetry-native. The destination question — Datadog, SecureNow, SigNoz, Grafana Cloud — is now a separate decision, and changing it is a config-flag change rather than a rewrite. For most teams the second-order benefit (negotiation leverage at renewal) pays for the migration on its own.
Frequently Asked Questions
How long does a Datadog-to-OTel migration actually take?
For a single Node service: 30 minutes to 2 hours including verification. For a 10-service stack with shared instrumentation: roughly an afternoon. The time-consuming part is dashboard recreation, not the SDK swap.
Can I keep my Datadog dashboards?
Not directly. Datadog's query language doesn't translate to OTLP or PromQL. Plan to rebuild the 5–10 dashboards your team actually uses; ignore the rest. Most teams find this is the biggest single chunk of the migration.
What about my custom Datadog metrics?
OpenTelemetry has a metrics API; the migration is one-to-one for counters, gauges, and histograms. The catch is that you need to also migrate your alerting rules to whatever destination understands the new metric names.
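For instance, a custom counter might port like this (metric and attribute names are illustrative, and the SDK's meter provider must already be registered):

```js
const { metrics } = require('@opentelemetry/api');

const meter = metrics.getMeter('checkout');
const ordersPlaced = meter.createCounter('orders.placed', {
  description: 'Orders placed, previously a DogStatsD increment',
});

ordersPlaced.add(1, { region: 'us-east-1' });
```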
Will I lose data during the cutover?
Not if you run both in parallel for 48 hours. Standard practice: install OTel alongside the Datadog agent, send traces to both, verify, then disable the agent. Zero-downtime is achievable for any team that's careful.