Skip to content

Fix void post hook overwriting retval in OTel hook polyfill#3920

Merged
Leiyks merged 4 commits into
masterfrom
leiyks/fix-otel-void-posthook-retval
May 28, 2026
Merged

Fix void post hook overwriting retval in OTel hook polyfill#3920
Leiyks merged 4 commits into
masterfrom
leiyks/fix-otel-void-posthook-retval

Conversation

@Leiyks

@Leiyks Leiyks commented May 27, 2026

Copy link
Copy Markdown
Contributor

Summary

  • dd_uhook_end in ext/hook/uhook_otel.c masked the post closure's return type with IS_VOID (the raw type code, 14) instead of MAY_BE_VOID (the bitmask flag 1 << 14 used inside PURE_MASK). The AND between a bitmask and a raw type code is always zero, so the "skip retval replacement for :void post hooks" branch never fired and every :void post hook silently overwrote the function's actual return value with the closure's implicit-null return.
  • Add a .phpt regression covering both branches (:void preserves retval, typed-non-void overrides retval).

How the bug surfaced

dd-trace polyfills \OpenTelemetry\Instrumentation\hook and forces extension_loaded("opentelemetry") to true (ext/hook/uhook_otel.c:269). Packages like opentelemetry-auto-symfony register hooks against this polyfill even when the real OTel C extension is absent.

SymfonyInstrumentation::register() installs a :void post hook on Symfony\Component\HttpKernel\HttpKernel::handle. With the buggy mask, that :void closure's implicit null return was copied into the real Response retval. The outer Symfony\Component\HttpKernel\Kernel::handle's :Response runtime check then fired:

TypeError: Symfony\Component\HttpKernel\Kernel::handle(): Return value must be of type
Symfony\Component\HttpFoundation\Response, null returned

Not Symfony-version-specific. Triggers for any :void post hook on any function with a non-void return type.

@datadog-datadog-prod-us1-2

datadog-datadog-prod-us1-2 Bot commented May 27, 2026

Copy link
Copy Markdown

Pipelines  Tests

Fix all issues with BitsAI or with Cursor

⚠️ Warnings

🚦 17 Pipeline jobs failed

DataDog/apm-reliability/dd-trace-php | test_extension_ci: [8.3]   View in Datadog   GitLab

🔧 Fix in code (Fix with Cursor). 2 failed tests in dynamic configuration and environment reading. Error: A non-numeric value encountered in run-tests.php:3474.

DataDog/apm-reliability/dd-trace-php | test_extension_ci: [7.4]   View in Datadog   GitLab

🔄 Retry job. This looks flaky and may succeed on retry. 2 failed tests: HTTP Agent headers are ignored from userland and Installing a live debugger span decoration probe. Error: Connection timed out while fetching request

DataDog/apm-reliability/dd-trace-php | test_extension_ci: [8.1]   View in Datadog   GitLab

🔄 Retry job. This looks flaky and may succeed on retry. Job failed due to timeout during test execution.

View all 17 failed jobs.

❄️ 1 New flaky test detected

tmp/build_extension/tests/ext/sandbox/install_hook/otel_void_post_hook_preserves_retval.phpt (OpenTelemetry hook polyfill: post hook with `: void` return type must not overwrite the function&#39;s return value) from PHP.tmp.build_extension.tests.ext.sandbox.install_hook   View in Datadog (Fix with Cursor)
001&#43; Parse error: syntax error, unexpected &#39;:&#39;, expecting &#39;)&#39; in tmp/build_extension/tests/ext/sandbox/install_hook/otel_void_post_hook_preserves_retval.php on line 20
001- string(21) &#34;expected-return-value&#34;
002- string(24) &#34;overridden-by-typed-hook&#34;

New test introduced in this PR is flaky.

View in Flaky Test Management

ℹ️ Info

No other issues found (see more)

🧪 All tests passed

🎯 Code Coverage (details)
Patch Coverage: 100.00%
Overall Coverage: 60.71% (-0.01%)

Useful? React with 👍 / 👎

This comment will be updated automatically if new data arrives.
🔗 Commit SHA: 6211e0a | Docs | Datadog PR Page | Give us feedback!

@Leiyks Leiyks force-pushed the leiyks/fix-otel-void-posthook-retval branch from d4a4192 to b93b323 Compare May 27, 2026 12:30
@Leiyks Leiyks marked this pull request as ready for review May 27, 2026 12:33
@Leiyks Leiyks requested a review from a team as a code owner May 27, 2026 12:33

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b93b323195

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

\OpenTelemetry\Instrumentation\hook(
Target::class,
'handle',
pre: null,

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Keep the new PHPT parseable on PHP 7

In PHP 7.x runs of tests/ext/sandbox/install_hook, this new test is parsed as part of the PHPT and pre:/post: named arguments are a syntax error before the regression can run. The repository still carries PHP 7 test coverage and adjacent install-hook tests add explicit skips for PHP-version-specific syntax, so this will fail those jobs rather than testing the C fix; use positional arguments here (and skip only versions that cannot parse : void, if needed).

Useful? React with 👍 / 👎.

Leiyks added 3 commits May 27, 2026 14:42
dd_uhook_end in ext/hook/uhook_otel.c masked the post closure's return
type with IS_VOID (= 14, the raw type code) instead of MAY_BE_VOID
(= 1 << 14, the bitmask flag used inside PURE_MASK). The bitwise AND
between a bitmask and a raw type code is always zero, so the "skip
retval replacement for :void post hooks" branch never fired and every
:void post hook silently overwrote the function's actual return value
with the closure's implicit-null return.

The visible symptom is a TypeError on Symfony\Component\HttpKernel\Kernel::handle
when opentelemetry-auto-symfony is present in vendor/. dd-trace polyfills
\OpenTelemetry\Instrumentation\hook and forces extension_loaded("opentelemetry")
to true, so the package's :void post hook on HttpKernel::handle is installed
against the polyfill — and then nuked the Response the outer Kernel::handle
was about to return. The bug isn't Symfony-version-specific; it triggers
for any :void post hook on any function with a non-void return type.

Add a focused .phpt regression covering both the :void preserves-retval
path and the typed-non-void overrides-retval path.
The OpenTelemetry hook polyfill (dd_register_opentelemetry_wrapper) is
gated on PHP_VERSION_ID >= 80000 in ext/hook/uhook.c:1183, and the test
itself uses named-argument syntax (pre:, post:) introduced in PHP 8.0.
Without the SKIPIF, CI fails on every PHP 7.x leg with a parse error
before SKIPIF would even be evaluated.
@bwoebi

bwoebi commented May 27, 2026

Copy link
Copy Markdown
Collaborator

Fix is right, the test needs fixing :-)

@pr-commenter

pr-commenter Bot commented May 27, 2026

Copy link
Copy Markdown

Benchmarks [ tracer ]

Benchmark execution time: 2026-05-27 16:27:15

Comparing candidate commit 6211e0a in PR branch leiyks/fix-otel-void-posthook-retval with baseline commit 3fb0a73 in branch master.

Found 4 performance improvements and 4 performance regressions! Performance is the same for 186 metrics, 0 unstable metrics.

scenario:MessagePackSerializationBench/benchMessagePackSerialization-opcache

  • 🟩 execution_time [-5.508µs; -4.312µs] or [-5.239%; -4.100%]

scenario:PDOBench/benchPDOOverhead-opcache

  • 🟩 execution_time [-8.980µs; -7.039µs] or [-3.552%; -2.784%]

scenario:PDOBench/benchPDOOverheadWithDBM-opcache

  • 🟩 execution_time [-8.758µs; -6.703µs] or [-3.464%; -2.651%]

scenario:PHPRedisBench/benchRedisOverhead-opcache

  • 🟩 execution_time [-48.572µs; -35.359µs] or [-4.683%; -3.409%]

scenario:SamplingRuleMatchingBench/benchRegexMatching1

  • 🟥 execution_time [+51.264ns; +137.536ns] or [+3.498%; +9.386%]

scenario:SamplingRuleMatchingBench/benchRegexMatching2

  • 🟥 execution_time [+56.607ns; +148.593ns] or [+3.870%; +10.159%]

scenario:SamplingRuleMatchingBench/benchRegexMatching3

  • 🟥 execution_time [+65.053ns; +140.347ns] or [+4.438%; +9.574%]

scenario:SamplingRuleMatchingBench/benchRegexMatching4-opcache

  • 🟥 execution_time [+274.177ns; +506.823ns] or [+2.365%; +4.372%]

The MAY_BE_VOID regression test is meaningful only when DD_TRACE_ENABLED=1
at request shutdown. When tracing is disabled, ddtrace_post_deactivate
fires before PHP releases the extra reference it holds on :void closure
literals declared in the main script's op_array, producing a stray leak
warning unrelated to the IS_VOID/MAY_BE_VOID fix this test guards. Gate
the test on datadog.trace.cli_enabled=1 (with an explicit comment) so
make test_c_disabled cleanly skips it, mirroring the precedent set by
tests/ext/bailout_double_hook_clear.phpt.
@bwoebi

bwoebi commented May 27, 2026

Copy link
Copy Markdown
Collaborator

Nice find!

@Leiyks Leiyks merged commit 217e1c7 into master May 28, 2026
2110 of 2128 checks passed
@Leiyks Leiyks deleted the leiyks/fix-otel-void-posthook-retval branch May 28, 2026 10:16
@github-actions github-actions Bot added this to the 1.21.0 milestone May 28, 2026
@Leiyks Leiyks restored the leiyks/fix-otel-void-posthook-retval branch May 28, 2026 15:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants