fix(sidecar/telemetry): retry FFI telemetry batches when session config not yet available #1929

Merged: gh-worker-dd-mergequeue-cf854d[bot] merged 1 commit into main from glopes/sidecar-tel-retry on Apr 29, 2026
@cataphract

What does this PR do?

When the appsec helper posts telemetry via the in-process FFI path before the PHP IPC set_session_config message is processed, the telemetry client cache would be poisoned with a Config { endpoint: None } worker. All subsequent IPC enqueue_actions calls (including AddEndpoint) would reuse this bad client and never send HTTP requests to the agent, causing the Laravel8xTests "Endpoints are sent" test to time out.
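The ordering problem described above can be sketched as follows. This is a minimal illustration, not the sidecar's actual code: `ClientCache`, its string key, and the `http://localhost:8126` endpoint are hypothetical stand-ins, and the only behaviour assumed from the description is that `get_or_create` builds a client on first use and returns the cached one afterwards.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the telemetry worker configuration.
#[derive(Clone, Debug, Default, PartialEq)]
struct Config {
    endpoint: Option<String>,
}

/// Hypothetical client cache; the real cache key and value types differ.
#[derive(Default)]
struct ClientCache {
    clients: HashMap<String, Config>,
}

impl ClientCache {
    /// Returns the cached entry, building it with `make` only on first use.
    fn get_or_create(&mut self, key: &str, make: impl FnOnce() -> Config) -> Config {
        self.clients
            .entry(key.to_string())
            .or_insert_with(make)
            .clone()
    }
}

/// Replays the problematic ordering: FFI telemetry first, session config second.
fn poisoned_lookup() -> Config {
    let mut cache = ClientCache::default();
    // FFI path runs before set_session_config: no endpoint is known yet,
    // so the cache is filled with Config { endpoint: None }.
    cache.get_or_create("svc", Config::default);
    // The IPC path later knows the endpoint (assumed URL), but the entry
    // created above is reused and the closure never runs.
    cache.get_or_create("svc", || Config {
        endpoint: Some("http://localhost:8126".to_string()),
    })
}
```

In this sketch the second lookup still yields `endpoint: None`, which mirrors why later `enqueue_actions` calls never reached the agent.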

Fix: when get_telemetry_client finds session_config is None, return None instead of calling get_or_create with Config::default(). The receiver task retries the batch up to 3 times with a 1.5 s delay before dropping. In the normal case set_session_config arrives within milliseconds so the first retry succeeds and no telemetry is lost.
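A rough sketch of the fix's control flow, under stated assumptions: the types below are simplified and hypothetical, the real receiver task is asynchronous rather than a blocking loop, and "up to 3 times" is read here as one initial attempt followed by at most 3 retries. The description gives the delay as 1.5 s; the sketch takes it as a parameter.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Hypothetical, simplified session configuration.
#[derive(Clone, Debug, PartialEq)]
struct SessionConfig {
    endpoint: String,
}

/// Hypothetical session state; `None` until set_session_config is processed.
struct Session {
    session_config: Option<SessionConfig>,
}

#[derive(Debug, PartialEq)]
struct TelemetryClient {
    endpoint: String,
}

/// Returns `None` while the session config is missing, instead of
/// building (and caching) a client from a default configuration.
fn get_telemetry_client(session: &Session) -> Option<TelemetryClient> {
    session.session_config.as_ref().map(|cfg| TelemetryClient {
        endpoint: cfg.endpoint.clone(),
    })
}

/// Assumed reading of the description: 3 retries after the first attempt.
const MAX_RETRIES: u32 = 3;

/// Looks up the client, retrying with a fixed delay. Returns the client and
/// the number of retries used, or `None` when the batch should be dropped.
fn send_with_retry(
    mut lookup: impl FnMut() -> Option<TelemetryClient>,
    delay: Duration,
) -> Option<(TelemetryClient, u32)> {
    for attempt in 0..=MAX_RETRIES {
        if let Some(client) = lookup() {
            return Some((client, attempt));
        }
        // Wait before the next retry, but not after the final attempt.
        if attempt < MAX_RETRIES {
            sleep(delay);
        }
    }
    None
}
```

With a 1.5 s delay, a caller would pass `Duration::from_millis(1500)`; if the config shows up before the first retry, the batch is sent after a single wait.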

How to test the change?

Tested via ./gradlew test7.4-debug --tests "com.datadog.appsec.php.integration.Laravel8xTests.Endpoints are sent" -PuseHelperRust. The error reproduced reliably on my machine after at most 3 or 4 retries.

@cataphract cataphract requested review from a team as code owners April 27, 2026 15:28
@github-actions

github-actions Bot commented Apr 27, 2026

Clippy Allow Annotation Report

Comparing clippy allow annotations between branches:

  • Base Branch: origin/main
  • PR Branch: origin/glopes/sidecar-tel-retry

Summary by Rule

| Rule | Base Branch | PR Branch | Change |
| --- | --- | --- | --- |
| unwrap_used | 2 | 3 | ⚠️ +1 (+50.0%) |
| Total | 2 | 3 | ⚠️ +1 (+50.0%) |

Annotation Counts by File

| File | Base Branch | PR Branch | Change |
| --- | --- | --- | --- |
| datadog-sidecar/src/service/telemetry.rs | 2 | 3 | ⚠️ +1 (+50.0%) |

Annotation Stats by Crate

| Crate | Base Branch | PR Branch | Change |
| --- | --- | --- | --- |
| clippy-annotation-reporter | 5 | 5 | No change (0%) |
| datadog-ffe-ffi | 1 | 1 | No change (0%) |
| datadog-ipc | 21 | 21 | No change (0%) |
| datadog-live-debugger | 6 | 6 | No change (0%) |
| datadog-live-debugger-ffi | 10 | 10 | No change (0%) |
| datadog-profiling-replayer | 4 | 4 | No change (0%) |
| datadog-remote-config | 3 | 3 | No change (0%) |
| datadog-sidecar | 56 | 57 | ⚠️ +1 (+1.8%) |
| libdd-common | 10 | 10 | No change (0%) |
| libdd-common-ffi | 12 | 12 | No change (0%) |
| libdd-data-pipeline | 5 | 5 | No change (0%) |
| libdd-ddsketch | 2 | 2 | No change (0%) |
| libdd-dogstatsd-client | 1 | 1 | No change (0%) |
| libdd-profiling | 13 | 13 | No change (0%) |
| libdd-telemetry | 19 | 19 | No change (0%) |
| libdd-tinybytes | 4 | 4 | No change (0%) |
| libdd-trace-normalization | 2 | 2 | No change (0%) |
| libdd-trace-obfuscation | 8 | 8 | No change (0%) |
| libdd-trace-stats | 1 | 1 | No change (0%) |
| libdd-trace-utils | 15 | 15 | No change (0%) |
| Total | 198 | 199 | ⚠️ +1 (+0.5%) |

About This Report

This report tracks Clippy allow annotations for specific rules, showing how they've changed in this PR. Decreasing the number of these annotations generally improves code quality.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 1873fcefb9

Comment thread datadog-sidecar/src/service/telemetry.rs Outdated

@bwoebi bwoebi left a comment

I suppose that's a good stop-gap until we unify communication over the sidecar socket.

@codecov-commenter

codecov-commenter commented Apr 27, 2026

Codecov Report

❌ Patch coverage is 0% with 181 lines in your changes missing coverage. Please review.
✅ Project coverage is 71.65%. Comparing base (cff7291) to head (44e71ac).
⚠️ Report is 1 commit behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1929      +/-   ##
==========================================
- Coverage   71.80%   71.65%   -0.16%     
==========================================
  Files         434      434              
  Lines       69978    70091     +113     
==========================================
- Hits        50248    50222      -26     
- Misses      19730    19869     +139     
| Components | Coverage Δ |
| --- | --- |
| libdd-crashtracker | 66.00% <ø> (-0.02%) ⬇️ |
| libdd-crashtracker-ffi | 34.47% <ø> (ø) |
| libdd-alloc | 98.77% <ø> (ø) |
| libdd-data-pipeline | 85.86% <ø> (ø) |
| libdd-data-pipeline-ffi | 71.94% <ø> (ø) |
| libdd-common | 79.41% <ø> (ø) |
| libdd-common-ffi | 73.87% <ø> (ø) |
| libdd-telemetry | 68.06% <ø> (-0.06%) ⬇️ |
| libdd-telemetry-ffi | 19.37% <ø> (ø) |
| libdd-dogstatsd-client | 82.64% <ø> (ø) |
| datadog-ipc | 76.31% <ø> (ø) |
| libdd-profiling | 81.61% <ø> (ø) |
| libdd-profiling-ffi | 64.36% <ø> (ø) |
| datadog-sidecar | 28.84% <0.00%> (-0.52%) ⬇️ |
| datdog-sidecar-ffi | 8.41% <ø> (ø) |
| spawn-worker | 54.69% <ø> (ø) |
| libdd-tinybytes | 93.16% <ø> (ø) |
| libdd-trace-normalization | 81.71% <ø> (ø) |
| libdd-trace-obfuscation | 87.26% <ø> (ø) |
| libdd-trace-protobuf | 68.25% <ø> (ø) |
| libdd-trace-utils | 89.27% <ø> (ø) |
| libdd-tracer-flare | 86.88% <ø> (ø) |
| libdd-log | 74.69% <ø> (ø) |

@datadog-datadog-prod-us1

datadog-datadog-prod-us1 Bot commented Apr 27, 2026

Tests

🎉 All green!

❄️ No new flaky tests detected
🧪 All tests passed

🎯 Code Coverage (details)
Patch Coverage: 0.00%
Overall Coverage: 71.67% (-0.14%)

This comment will be updated automatically if new data arrives.
🔗 Commit SHA: 44e71ac

@dd-octo-sts

dd-octo-sts Bot commented Apr 27, 2026

Artifact Size Benchmark Report

aarch64-alpine-linux-musl

| Artifact | Baseline | Commit | Change |
| --- | --- | --- | --- |
| /aarch64-alpine-linux-musl/lib/libdatadog_profiling.so | 7.63 MB | 7.63 MB | 0% (0 B) 👌 |
| /aarch64-alpine-linux-musl/lib/libdatadog_profiling.a | 83.31 MB | 83.31 MB | 0% (0 B) 👌 |

aarch64-unknown-linux-gnu

| Artifact | Baseline | Commit | Change |
| --- | --- | --- | --- |
| /aarch64-unknown-linux-gnu/lib/libdatadog_profiling.a | 99.42 MB | 99.42 MB | 0% (0 B) 👌 |
| /aarch64-unknown-linux-gnu/lib/libdatadog_profiling.so | 10.10 MB | 10.10 MB | 0% (0 B) 👌 |

libdatadog-x64-windows

| Artifact | Baseline | Commit | Change |
| --- | --- | --- | --- |
| /libdatadog-x64-windows/debug/dynamic/datadog_profiling_ffi.dll | 25.19 MB | 25.19 MB | 0% (0 B) 👌 |
| /libdatadog-x64-windows/debug/dynamic/datadog_profiling_ffi.lib | 79.90 KB | 79.90 KB | 0% (0 B) 👌 |
| /libdatadog-x64-windows/debug/dynamic/datadog_profiling_ffi.pdb | 184.54 MB | 184.50 MB | -0.02% (-40.00 KB) 💪 |
| /libdatadog-x64-windows/debug/static/datadog_profiling_ffi.lib | 918.37 MB | 918.37 MB | 0% (0 B) 👌 |
| /libdatadog-x64-windows/release/dynamic/datadog_profiling_ffi.dll | 7.89 MB | 7.89 MB | 0% (0 B) 👌 |
| /libdatadog-x64-windows/release/dynamic/datadog_profiling_ffi.lib | 79.90 KB | 79.90 KB | 0% (0 B) 👌 |
| /libdatadog-x64-windows/release/dynamic/datadog_profiling_ffi.pdb | 23.67 MB | 23.67 MB | 0% (0 B) 👌 |
| /libdatadog-x64-windows/release/static/datadog_profiling_ffi.lib | 46.19 MB | 46.19 MB | 0% (0 B) 👌 |

libdatadog-x86-windows

| Artifact | Baseline | Commit | Change |
| --- | --- | --- | --- |
| /libdatadog-x86-windows/debug/dynamic/datadog_profiling_ffi.dll | 21.67 MB | 21.67 MB | 0% (0 B) 👌 |
| /libdatadog-x86-windows/debug/dynamic/datadog_profiling_ffi.lib | 81.14 KB | 81.14 KB | 0% (0 B) 👌 |
| /libdatadog-x86-windows/debug/dynamic/datadog_profiling_ffi.pdb | 188.61 MB | 188.60 MB | -0% (-8.00 KB) 👌 |
| /libdatadog-x86-windows/debug/static/datadog_profiling_ffi.lib | 904.02 MB | 904.02 MB | 0% (0 B) 👌 |
| /libdatadog-x86-windows/release/dynamic/datadog_profiling_ffi.dll | 6.13 MB | 6.13 MB | 0% (0 B) 👌 |
| /libdatadog-x86-windows/release/dynamic/datadog_profiling_ffi.lib | 81.14 KB | 81.14 KB | 0% (0 B) 👌 |
| /libdatadog-x86-windows/release/dynamic/datadog_profiling_ffi.pdb | 25.35 MB | 25.35 MB | 0% (0 B) 👌 |
| /libdatadog-x86-windows/release/static/datadog_profiling_ffi.lib | 43.67 MB | 43.67 MB | 0% (0 B) 👌 |

x86_64-alpine-linux-musl

| Artifact | Baseline | Commit | Change |
| --- | --- | --- | --- |
| /x86_64-alpine-linux-musl/lib/libdatadog_profiling.a | 74.27 MB | 74.27 MB | 0% (0 B) 👌 |
| /x86_64-alpine-linux-musl/lib/libdatadog_profiling.so | 8.55 MB | 8.55 MB | 0% (0 B) 👌 |

x86_64-unknown-linux-gnu

| Artifact | Baseline | Commit | Change |
| --- | --- | --- | --- |
| /x86_64-unknown-linux-gnu/lib/libdatadog_profiling.a | 91.78 MB | 91.78 MB | 0% (0 B) 👌 |
| /x86_64-unknown-linux-gnu/lib/libdatadog_profiling.so | 10.20 MB | 10.20 MB | 0% (0 B) 👌 |

@cataphract cataphract force-pushed the glopes/sidecar-tel-retry branch 2 times, most recently from 811fb16 to c933868 Compare April 27, 2026 19:01
@cataphract cataphract force-pushed the glopes/sidecar-tel-retry branch from 2eece2d to 44e71ac Compare April 29, 2026 10:56
@cataphract cataphract requested a review from a team as a code owner April 29, 2026 10:56
@cataphract

/merge

@gh-worker-devflow-routing-ef8351

gh-worker-devflow-routing-ef8351 Bot commented Apr 29, 2026

2026-04-29 13:54:08 UTC ℹ️ Start processing command /merge


2026-04-29 13:54:12 UTC ℹ️ MergeQueue: pull request added to the queue

The expected merge time in main is approximately 44m (p90).


2026-04-29 15:52:06 UTC ℹ️ MergeQueue: This merge request was merged


3 participants