[PROF-15268] POC: per-thread JFR-inflight counter (follow-up to #614)#629
Draft
r1viollet wants to merge 1 commit into
Draft
[PROF-15268] POC: per-thread JFR-inflight counter (follow-up to #614)#629r1viollet wants to merge 1 commit into
r1viollet wants to merge 1 commit into
Conversation
POC follow-up to #614 evaluating whether moving the SignalInflight counter from a global atomic to per-thread storage on ProfiledThread is a viable alternative. - ProfiledThread: adds _jfr_inflight (atomic RMW on owner-thread write, ACQUIRE-read from drain), _registry_next intrusive pointer, and a spinlock-protected registry head. initCurrentThread / freeKey / current() insert / remove around the pthread_key lifecycle. - SignalInflight: enter/exit prefer the current ProfiledThread's per- thread counter; fall back to the existing global counter for threads that fire signals before initCurrentThread runs. drain() iterates the registry summing per-thread counters plus the fallback. Cache-line contention on the counter is eliminated on the fast path. The J9 longjmp leak documented in signalInflight.h is not yet closed by this commit (needs segvHandler/busHandler hooks to reset the current thread's slot before chaining) — left for a follow-up if the design survives review. Verified: buildDebug, compileRelease --rerun-tasks, ShutdownTest, JavaProfilerTest, CollapsingSleepTest, SmokeWallTest (all cstack modes).
Contributor
CI Test ResultsRun: #28648085482 | Commit:
Status Overview
Legend: ✅ passed | ❌ failed | ⚪ skipped | 🚫 cancelled Summary: Total: 32 | Passed: 32 | Failed: 0 Updated: 2026-07-03 08:39:21 UTC |
Contributor
Benchmark Results (commit c27d0bd)Pipeline: https://gitlab.ddbuild.io/DataDog/apm-reliability/benchmarking-platform/-/pipelines/122534027 Commit:
|
| Benchmark | JDK | Latest | Dev | Δ (dev vs latest) | Issues L/D |
|---|---|---|---|---|---|
| akka-uct | 21 | ✅ 10314 ms (7 iters) | ✅ 10135 ms (7 iters) | ≈ -1.7% (±19.2%) | — / — |
| akka-uct | 25 | ✅ 8752 ms (8 iters) | ✅ 8932 ms (8 iters) | ≈ +2.1% (±20.4%) | — / — |
| finagle-chirper | 21 | ✅ 5973 ms (11 iters) | ✅ 5940 ms (11 iters) | ≈ -0.6% (±45.5%) | |
| finagle-chirper | 25 | ✅ 5526 ms (12 iters) | ✅ 5411 ms (12 iters) | ≈ -2.1% (±42.4%) | |
| fj-kmeans | 21 | ✅ 2834 ms (22 iters) | ✅ 2747 ms (23 iters) | ≈ -3.1% (±4.3%) | — / — |
| fj-kmeans | 25 | ✅ 2803 ms (22 iters) | ✅ 2833 ms (22 iters) | ≈ +1.1% (±4.5%) | — / — |
| future-genetic | 21 | ✅ 2191 ms (28 iters) | ✅ 2084 ms (30 iters) | 🟢 -4.9% | — / — |
| future-genetic | 25 | ✅ 2076 ms (30 iters) | ✅ 2140 ms (29 iters) | ≈ +3.1% (±4.4%) | — / — |
| naive-bayes | 21 | ✅ 1215 ms (47 iters) | ✅ 1298 ms (44 iters) | ≈ +6.8% (±58%) | — / — |
| naive-bayes | 25 | ✅ 1016 ms (56 iters) | ✅ 1016 ms (56 iters) | ≈ 0% (±53.9%) | — / — |
| reactors | 21 | ✅ 16142 ms (5 iters) | ✅ 16550 ms (5 iters) | ≈ +2.5% (±14.6%) | — / — |
| reactors | 25 | ✅ 19105 ms (5 iters) | ✅ 18537 ms (5 iters) | ≈ -3% (±7%) | — / — |
Internal counter details (ddprof)
ddprof internal counters, latest / dev (✅ = 0, · = unavailable):
| Benchmark | JDK | Dropped rec | Dropped jvmti | Dropped trace | Skipped WC | AGCT fail | Unwind fail |
|---|---|---|---|---|---|---|---|
| akka-uct | 21 | ✅ / ✅ | ✅ / ✅ | ✅ / ✅ | ✅ / ✅ | ✅ / ✅ | ✅ / ✅ |
| akka-uct | 25 | ✅ / ✅ | ✅ / ✅ | 7 / 4 | 2297 / 2130 | ✅ / ✅ | ✅ / ✅ |
| finagle-chirper | 21 | ✅ / ✅ | ✅ / ✅ | 1 / ✅ | 8310 / ✅ | ✅ / ✅ | ✅ / ✅ |
| finagle-chirper | 25 | ✅ / ✅ | ✅ / ✅ | 3 / 2 | 8690 / 8477 | ✅ / ✅ | ✅ / ✅ |
| fj-kmeans | 21 | ✅ / ✅ | ✅ / ✅ | ✅ / ✅ | ✅ / ✅ | ✅ / ✅ | ✅ / ✅ |
| fj-kmeans | 25 | ✅ / ✅ | ✅ / ✅ | 2 / 8 | 1257 / 1274 | ✅ / ✅ | ✅ / ✅ |
| future-genetic | 21 | ✅ / ✅ | ✅ / ✅ | ✅ / ✅ | ✅ / ✅ | ✅ / ✅ | ✅ / ✅ |
| future-genetic | 25 | ✅ / ✅ | ✅ / ✅ | 3 / 2 | 2879 / 2909 | ✅ / ✅ | ✅ / ✅ |
| naive-bayes | 21 | ✅ / ✅ | ✅ / ✅ | ✅ / ✅ | ✅ / ✅ | ✅ / ✅ | ✅ / ✅ |
| naive-bayes | 25 | ✅ / ✅ | ✅ / ✅ | 4 / 3 | 3486 / 3453 | ✅ / ✅ | ✅ / ✅ |
| reactors | 21 | ✅ / ✅ | ✅ / ✅ | ✅ / ✅ | ✅ / ✅ | ✅ / ✅ | ✅ / ✅ |
| reactors | 25 | ✅ / ✅ | ✅ / ✅ | ✅ / ✅ | 1937 / 1886 | ✅ / ✅ | ✅ / ✅ |
|
Contributor
Reliability & Chaos Results❌ 1 failure(s) detected Pipeline: https://gitlab.ddbuild.io/DataDog/java-profiler/-/pipelines/122534023 ❌ chaos: profiler tracer gmalloc amd64 21 0 3 temXchaos |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What does this PR do?
POC follow-up to #614. Moves the
SignalInflightcounter from a single global atomic to per-thread storage onProfiledThread, to see if the alternative discussed in PROF-15268 holds up.How
ProfiledThreadgets an_jfr_inflightfield (owner-thread atomic RMW, cross-thread ACQUIRE read from drain) and an intrusive_registry_nextpointer.ProfiledThread.initCurrentThread/freeKey/current()insert and remove.SignalInflight::enter/exituse the currentProfiledThreadwhen available and fall back to the existing global counter otherwise.SignalInflight::drain()iterates the registry summing per-thread counters plus the fallback.What this fixes vs #614
segvHandler/busHandlerbefore chaining. Not done in this POC.What this does not do
segvHandler/busHandlerhooks yet — the J9 longjmp leak is still open until that follow-up.initCurrentThreadruns.Local testing
buildDebug,compileRelease --rerun-tasks,ShutdownTest,JavaProfilerTest,CollapsingSleepTest,SmokeWallTest(all cstack modes). All green.For Datadog employees