How a Developer Caught a Supply Chain Attack in Trace Data
A realistic scenario of how a developer discovered a compromised npm package through unusual outbound HTTP calls in OpenTelemetry trace data — and used AI analysis to confirm and contain the threat.
Alex is a senior backend engineer at Meridian Finance, a fintech startup running twelve Node.js microservices in production. The team uses OpenTelemetry for observability and SecureNow for security monitoring. This is the story of how a routine dependency update turned into an incident — and how trace data revealed what traditional security tools missed entirely.
This scenario is fictional but constructed from real attack patterns documented in MITRE ATT&CK T1195.002 (Supply Chain Compromise: Software Supply Chain) and real-world incidents like the event-stream, ua-parser-js, and colors.js compromises. The technical details — the trace patterns, the investigation workflow, the detection queries — are all real.
Monday: The Routine Update
Monday morning, 9:15 AM. Alex runs the weekly dependency audit, a habit the team adopted after the ua-parser-js incident. The audit looks clean. Dependabot has a batch of minor version bumps queued:
format-utils 3.2.1 → 3.2.2
express-validator 7.0.1 → 7.0.2
lodash 4.17.21 → 4.17.22
Nothing alarming. format-utils is a small utility package the team uses for date and currency formatting — 200,000 weekly downloads, actively maintained, clean audit history. The version bump is a patch release: "Fixed locale handling for currency formats." Alex reviews the changelog, merges the PR, and deploys to staging.
Tests pass. Staging looks fine. The deploy goes to production at 2 PM.
This is how supply chain attacks work. They don't arrive as obvious malware. They ride in on packages you already trust, in patch releases that look like routine bug fixes. The 2024 Sonatype State of the Software Supply Chain Report documented a 245% increase in supply chain attacks year over year — and the vast majority were delivered through exactly this vector.
Tuesday Morning: Something Strange in the Traces
Tuesday, 10:30 AM. Alex opens SecureNow's Trace Explorer to investigate a performance regression reported by a teammate on the user-service. Filtering by the last 24 hours, the traces look normal at first glance. But while scrolling through individual traces, Alex notices something unexpected.
A trace for a POST /api/users/login request shows 7 spans. This endpoint usually generates 4: the HTTP handler, the authentication middleware, a database query to verify credentials, and the session creation. Three extra spans shouldn't be there.
Alex clicks the trace and opens the Gantt view. The standard login flow is visible: Express handler → auth middleware → SELECT * FROM users WHERE email = ? → session insert. But nested under the auth middleware, there are three additional client spans:
HTTPS GET telemetry-cdn.xyz/collect [32ms]
HTTPS POST telemetry-cdn.xyz/data [45ms]
HTTPS GET telemetry-cdn.xyz/config [18ms]
Alex has never heard of telemetry-cdn.xyz. The user service has no legitimate reason to make outbound HTTPS calls during the login flow — especially to a domain that isn't in the team's infrastructure inventory.
The first instinct is to check whether a teammate added some new analytics integration. Alex posts in Slack: "Anyone know what telemetry-cdn.xyz is? Seeing outbound calls from user-service during login." The response is immediate and unanimous: nobody recognizes it.
Tuesday: The AI Analysis Confirms the Suspicion
Alex selects the trace with the suspicious spans and clicks AI Analysis in SecureNow. The AI examines the complete span tree and returns a structured security report:
High Severity Finding: Suspicious Outbound Network Activity
The trace contains three HTTP client spans making requests to
telemetry-cdn.xyz, a domain not associated with any known legitimate service. These calls occur during the user authentication flow and exhibit patterns consistent with data exfiltration:
- GET /collect — Likely a beacon or initialization call
- POST /data — Sends a payload during a flow that processes user credentials
- GET /config — Retrieves configuration, possibly updating exfiltration parameters

Risk Assessment: The timing and placement of these calls within the authentication flow suggest credential or session token exfiltration. No legitimate formatting or utility library should make outbound network requests.
Recommended Action: Immediately investigate the source of these outbound calls. Check recently updated dependencies. Consider rolling back recent deployments until the source is identified.
The AI caught exactly what Alex suspected: a formatting utility library has no business making HTTP calls. The fact that it happens during login, when session tokens and credentials are in memory, makes it worse.
For details on how SecureNow's AI analysis detects these patterns, see AI-Powered Trace Analysis: Detecting Security Issues Hidden in Application Spans.
Tuesday: Forensic Investigation
Now Alex needs to understand the scope. How long has this been happening? Which services are affected? How much data was exfiltrated?
Alex opens SecureNow's Forensics console and starts asking questions.
Question 1: "Show all outbound HTTP calls to telemetry-cdn.xyz in the last 48 hours"
The natural language query converts to ClickHouse SQL and returns results in seconds:
SELECT
serviceName,
name AS operation,
attribute_string_http_method AS method,
attribute_string_http_url AS url,
response_status_code AS status,
timestamp
FROM signoz_traces.distributed_signoz_index_v2
WHERE timestamp >= now() - INTERVAL 48 HOUR
AND attribute_string_http_host = 'telemetry-cdn.xyz'
AND kind = 3
ORDER BY timestamp
The results are alarming. Three services have made calls to telemetry-cdn.xyz:
| Service | Call Count | First Seen | Last Seen |
|---|---|---|---|
| user-service | 847 | Mon 2:12 PM | Tue 10:28 AM |
| payment-service | 312 | Mon 2:15 PM | Tue 10:30 AM |
| account-service | 156 | Mon 2:18 PM | Tue 10:29 AM |
All three services started making these calls within minutes of Monday's deployment. The exfiltration has been running for about 20 hours.
Question 2: "How many unique user sessions were processed during outbound calls to telemetry-cdn.xyz?"
SELECT
serviceName,
uniqExact(traceID) AS affected_traces,
count(*) AS exfiltration_calls
FROM signoz_traces.distributed_signoz_index_v2
WHERE timestamp >= now() - INTERVAL 48 HOUR
AND attribute_string_http_host = 'telemetry-cdn.xyz'
AND kind = 3
GROUP BY serviceName
ORDER BY affected_traces DESC
The user-service alone has 847 affected traces — meaning roughly 847 user logins potentially had session tokens exfiltrated. The payment-service calls are even more concerning: payment processing flows that may have exposed transaction data.
Question 3: "What data was sent in the POST requests?"
Alex switches to the Trace Explorer and opens individual traces with the POST /data span. The span attributes reveal the request body size (attribute: http.request_content_length) — ranging from 200 to 400 bytes per POST. The exact payload isn't captured in the span (SecureNow doesn't log request bodies by default for security reasons), but the size is consistent with a JSON payload containing a session token, user ID, and email address.
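A quick sanity check supports that read. Sketching the hypothesized payload and measuring its serialized size lands in the same range; the field names and value lengths below are purely illustrative assumptions, not recovered data:

```javascript
// Back-of-the-envelope check: does a JSON payload carrying a session
// token, user ID, and email fit the observed 200-400 byte range?
// Field names and value lengths are illustrative assumptions.
const payload = {
  token: "a".repeat(171), // roughly the length of a compact session token
  userId: "usr_8f3a2c91d4e5",
  email: "alex@example.com",
};

const bytes = Buffer.byteLength(JSON.stringify(payload), "utf8");
console.log(bytes); // within the observed 200-400 byte range
```

This doesn't prove the payload contents, but it makes exfiltration of exactly these fields the simplest explanation for the observed sizes.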
The Timeline View
Alex opens the timeline view on a trace that includes the exfiltration spans. The visualization makes the attack pattern clear:
- User submits login credentials (0ms)
- Express auth middleware validates credentials (2ms)
- Database query verifies user record (15ms)
- GET telemetry-cdn.xyz/collect fires during token generation (18ms)
- Session token is created (20ms)
- POST telemetry-cdn.xyz/data sends payload with the fresh token (22ms)
- GET telemetry-cdn.xyz/config retrieves instructions (67ms)
- Response returned to user (70ms)
The attack is surgically precise. The malicious code hooks into the login flow, waits until credentials are verified and a session token is generated, then exfiltrates both. The GET /config call likely retrieves targeting rules — which users to exfiltrate, which to skip.
Tuesday Afternoon: Finding the Source
Alex needs to identify which dependency is making these calls. The trace spans show the calls originating from within the auth middleware execution context, which narrows the search. Alex checks package-lock.json for changes since the last known-good deployment:
git diff HEAD~1 package-lock.json | grep format-utils
There it is. format-utils went from 3.2.1 to 3.2.2. Alex inspects the package contents:
npm pack format-utils@3.2.2 --dry-run
Buried in the minified distribution, the 3.2.2 release includes a new file — lib/telemetry.js — that wasn't in 3.2.1. The code, when deobfuscated, reveals an HTTPS client that activates when it detects authentication-related function calls in the call stack, collects data from the execution context, and posts it to telemetry-cdn.xyz.
This matches the pattern described in MITRE ATT&CK T1195.002: the attacker compromised the package maintainer's account (or the package itself) and injected malicious code into a patch release. The code is specifically designed to target authentication flows — it doesn't activate for general formatting calls, making it nearly invisible to unit tests and functional testing.
Tuesday Afternoon: Containment
Alex executes the incident response in parallel:
Immediate Actions
- Pin to the last known-good version across all affected services: npm install format-utils@3.2.1 --save-exact
- Deploy the rollback to production immediately. With the lockfile pinned, the malicious 3.2.2 code is replaced.
- Invalidate all sessions created since Monday 2 PM. Every session token generated during the compromise window is potentially compromised.
- Notify affected users through the standard breach notification process.
Detection and Prevention
- Set up an alert rule in SecureNow to detect future outbound calls to unexpected domains:

SELECT
serviceName,
attribute_string_http_host AS destination,
count(*) AS call_count,
min(timestamp) AS first_seen
FROM signoz_traces.distributed_signoz_index_v2
WHERE timestamp >= now() - INTERVAL 15 MINUTE
AND serviceName IN (__USER_APP_KEYS__)
AND kind = 3
AND attribute_string_http_host NOT IN (
'api.stripe.com',
'api.sendgrid.com',
'db.meridianfinance.io',
'cache.meridianfinance.io',
'auth0.com'
)
AND attribute_string_http_host != ''
GROUP BY serviceName, destination
HAVING call_count > 5
ORDER BY call_count DESC

This allowlist-based approach flags any outbound connection to a domain not in the known-good list. It's aggressive — new integrations will trigger it until the allowlist is updated — but for a fintech processing financial data, that trade-off is appropriate.

- Use API Map to verify no other shadow endpoints appeared. Alex opens SecureNow's API Map Discovery feature to get an AI-enhanced view of all endpoints the services expose. The map confirms no unauthorized endpoints were added — the attack was limited to outbound exfiltration, not inbound backdoors.

- Report the package to the npm security team and request a CVE assignment.
What Traditional Tools Missed
The sobering part of this incident is what didn't catch it:
- npm audit — reported no known vulnerabilities for format-utils@3.2.2. The malicious code was injected into a new release, not a version with a published CVE.
- Snyk/Dependabot — no alerts. These tools check for known vulnerabilities in the advisory databases. A zero-day supply chain compromise has no advisory until someone reports it.
- SAST (static analysis) — the malicious code was minified and obfuscated in the distribution bundle. Static analysis of the team's source code found nothing because the exploit lived in node_modules.
- WAF — the attack involved outbound HTTPS calls, not inbound exploitation. The WAF inspects incoming traffic and has no visibility into outbound connections from the application.
- Unit tests — all tests passed. The malicious code only activates in production-like conditions when it detects real authentication flows.
The only tool that saw the attack was the OpenTelemetry trace data in SecureNow. Because the http instrumentation automatically captures all outbound HTTP calls as client spans, the exfiltration calls appeared as visible spans in the trace tree. The data was there from the first compromised request — it just needed someone (or some AI) to look at it.
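That coverage comes essentially for free in Node. Below is a minimal bootstrap sketch; the package names are the standard OpenTelemetry ones, but the service name and collector endpoint are placeholders, and your setup may differ:

```javascript
// tracing.js — loaded before the app (e.g. node -r ./tracing.js app.js)
// Minimal sketch; service name and exporter endpoint are placeholders.
const { NodeSDK } = require("@opentelemetry/sdk-node");
const { getNodeAutoInstrumentations } = require("@opentelemetry/auto-instrumentations-node");
const { OTLPTraceExporter } = require("@opentelemetry/exporter-trace-otlp-http");

const sdk = new NodeSDK({
  serviceName: "user-service",
  traceExporter: new OTLPTraceExporter({
    url: "http://otel-collector:4318/v1/traces",
  }),
  // The bundled http instrumentation records every outbound request as a
  // client span, including calls made by a compromised dependency.
  instrumentations: [getNodeAutoInstrumentations()],
});

sdk.start();
```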
Building Ongoing Supply Chain Defenses
Alex's experience led the Meridian Finance team to implement a permanent defense strategy built on trace observability:
1. Outbound Domain Allowlisting Alert
The alert rule from the containment phase becomes a permanent fixture. Every outbound connection to an unlisted domain triggers investigation. The team reviews and updates the allowlist quarterly.
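For teams that want a second check outside the query layer, the same logic can be expressed in-process. This sketch assumes a simplified span shape (serviceName, kind, httpHost) for illustration, not SecureNow's actual API:

```javascript
// In-process version of the allowlist alert: flag outbound (client) spans
// whose destination host is not on the known-good list. The span shape
// here is a simplified assumption for illustration.
const ALLOWED_HOSTS = new Set([
  "api.stripe.com",
  "api.sendgrid.com",
  "db.meridianfinance.io",
  "cache.meridianfinance.io",
  "auth0.com",
]);

function flagUnexpectedDestinations(spans, threshold = 5) {
  const counts = new Map();
  for (const span of spans) {
    // kind === 3 marks outbound HTTP client spans, matching the SQL filter
    if (span.kind !== 3 || !span.httpHost) continue;
    if (ALLOWED_HOSTS.has(span.httpHost)) continue;
    const key = `${span.serviceName} -> ${span.httpHost}`;
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  // Mirror the HAVING call_count > 5 clause: only noisy destinations alert
  return [...counts.entries()]
    .filter(([, n]) => n > threshold)
    .map(([destination, callCount]) => ({ destination, callCount }));
}

// Example: six calls from user-service to an unknown host trip the alert.
const spans = Array.from({ length: 6 }, () => ({
  serviceName: "user-service",
  kind: 3,
  httpHost: "telemetry-cdn.xyz",
}));
console.log(flagUnexpectedDestinations(spans));
```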
2. Span Count Baseline Alerts
Significant increases in span count for specific endpoints indicate new code paths — which could be legitimate feature additions or injected behavior:
SELECT
endpoint,
avg(span_count) AS avg_spans,
max(span_count) AS max_spans
FROM (
SELECT
traceID,
anyIf(name, kind = 2) AS endpoint,
count(*) AS span_count
FROM signoz_traces.distributed_signoz_index_v2
WHERE timestamp >= now() - INTERVAL 1 HOUR
AND serviceName IN (__USER_APP_KEYS__)
GROUP BY traceID
)
GROUP BY endpoint
HAVING avg_spans > 6
ORDER BY avg_spans DESC
When the average span count for the login endpoint jumped from 4 to 7, that should have been a detection signal. Now it is.
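The check reduces to a small aggregation. A sketch under the same simplified span shape assumption (traceID, kind, name), flagging endpoints whose current average span count has grown past a per-endpoint baseline:

```javascript
// Sketch of the span-count baseline check: count all spans per trace,
// label each trace by its server span's name, and flag endpoints whose
// average span count exceeds the baseline. Span shape is a simplified
// assumption for illustration.
function averageSpanCounts(spans) {
  const perTrace = new Map(); // traceID -> { endpoint, count }
  for (const s of spans) {
    const entry = perTrace.get(s.traceID) || { endpoint: null, count: 0 };
    entry.count += 1;
    if (s.kind === 2) entry.endpoint = s.name; // kind 2: server span
    perTrace.set(s.traceID, entry);
  }
  const totals = new Map(); // endpoint -> { sum, traces }
  for (const { endpoint, count } of perTrace.values()) {
    const t = totals.get(endpoint) || { sum: 0, traces: 0 };
    t.sum += count;
    t.traces += 1;
    totals.set(endpoint, t);
  }
  const averages = {};
  for (const [endpoint, { sum, traces }] of totals) {
    averages[endpoint] = sum / traces;
  }
  return averages;
}

// Flag endpoints whose current average exceeds baseline * tolerance.
function flagSpanCountAnomalies(averages, baselines, tolerance = 1.5) {
  return Object.entries(averages)
    .filter(([ep, avg]) => baselines[ep] && avg > baselines[ep] * tolerance)
    .map(([endpoint, avg]) => ({ endpoint, avg }));
}
```

With the login baseline at 4 spans and a 1.5x tolerance, the compromised trace's 7 spans would have fired this check.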
3. Post-Deploy Trace Comparison
After every deployment, the team runs a forensic comparison: "Show me new outbound destinations that appeared in the last hour that weren't present in the previous 24 hours." This catches supply chain compromises at the moment of deployment — reducing the exposure window from 20 hours (in Alex's case) to minutes.
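The comparison itself is just a set difference over observed destination hosts; the two input lists would come from queries over the two time windows:

```javascript
// Post-deploy comparison: destinations seen in the last hour that never
// appeared in the prior 24-hour baseline window. Inputs are host lists
// pulled from the trace store for each window.
function newDestinations(baselineHosts, recentHosts) {
  const baseline = new Set(baselineHosts);
  return [...new Set(recentHosts)].filter((host) => !baseline.has(host));
}

// Example: the compromised deploy introduces one unseen destination.
const baseline = ["api.stripe.com", "api.sendgrid.com", "auth0.com"];
const recent = ["api.stripe.com", "telemetry-cdn.xyz"];
console.log(newDestinations(baseline, recent)); // [ 'telemetry-cdn.xyz' ]
```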
4. Dependency Update Hygiene
The team adopted stricter practices aligned with the OpenSSF Scorecard recommendations:
- Pin exact versions in package.json (no ^ or ~ ranges)
- Review package-lock.json diffs in PRs
- Run npm audit signatures to verify package provenance
- Delay adoption of new patch releases by 48 hours for non-critical packages
For the complete guide on setting up security monitoring with OpenTelemetry, see Adding Security Observability to Your App in 15 Minutes. For building the alert rules that would have caught this sooner, see Application Security Monitoring Workflow.
The Lesson: Your Traces Already Tell the Story
Supply chain attacks are designed to evade the tools you traditionally rely on — audit databases, static analysis, WAFs, and penetration tests. They succeed because they operate inside the trust boundary of your application, using code you explicitly installed and your runtime environment explicitly loaded.
But they can't hide from your traces. Every network call a compromised package makes generates a span. Every file it reads, every environment variable it accesses, every DNS lookup it performs — if your application is instrumented with OpenTelemetry, the evidence is there.
The difference between catching a supply chain attack in 20 hours and catching it in 20 minutes is whether you're watching your traces for security signals. Alex was lucky — a performance investigation accidentally revealed the compromise. With the right alert rules and AI analysis in place, the detection would have been automatic.
Your dependencies are someone else's code running with your application's permissions. Your traces are the only system that sees what that code actually does.
Frequently Asked Questions
Can OpenTelemetry traces detect supply chain attacks?
Yes. Compromised packages often make unexpected network calls, access unusual resources, or exhibit anomalous timing patterns — all visible in OpenTelemetry trace data as additional spans.
What signs of supply chain compromise appear in traces?
Look for unexpected outbound HTTP/HTTPS calls to unknown domains, DNS lookups to suspicious hosts, unusual file system access, environment variable reads, and increased span counts from specific libraries.
How can developers protect against supply chain attacks?
Use lockfiles, audit dependencies regularly, monitor trace data for anomalous behavior, set up alerts for unexpected outbound connections, and use SecureNow's AI trace analysis to flag suspicious patterns.
What should I do if I find a compromised dependency?
Immediately pin the last known-good version, notify your security team, check all services using that dependency, report to the package registry, and set up alert rules to detect similar patterns.