Skip to content

Check the DBM comment region in place, dropping the duplicate-comment substring#11737

Open
dougqh wants to merge 6 commits into
masterfrom
dougqh/dbcommenter-scan-overload
Open

Check the DBM comment region in place, dropping the duplicate-comment substring#11737
dougqh wants to merge 6 commits into
masterfrom
dougqh/dbcommenter-scan-overload

Conversation

@dougqh

@dougqh dougqh commented Jun 24, 2026

Copy link
Copy Markdown
Contributor

What Does This Do

Replaces the per-call sql.substring used by SQLCommenter.hasDDComment.
In place of a sub-string, the code now uses a SubSequence which has lower overhead

Motivation

SubSequence avoids the allocation and copying of a regular sub-string.
And frequently, SubSequence is completely optimized away by escape analysis.

For this code path, the net effect is zero allocation and a modest ~12% improvement in throughput.

Additional Notes

Measured — SQLCommenterDuplicateCommentBenchmark (JDK 17, @Threads(8), @Fork(5), -prof gc)

throughput gc.alloc.rate.norm
before (substring) 23.5M ± 1.1M ops/s 140 B/op
after (range/view) 26.2M ± 1.5M ops/s ~0 (10⁻⁵)

The allocation delta (140 → 0 B/op) is the win and is exact/fork-stable. This path is scan-CPU-dominated, so the ~12% throughput is a modest bonus — the real, compounding win is the per-inject allocation removed across comment-bearing queries.

🤖 Generated with Claude Code

@dougqh dougqh added tag: performance Performance related changes tag: no release notes Changes to exclude from release notes inst: jdbc JDBC instrumentation comp: database Database Monitoring tag: ai generated Largely based on code generated by an AI or LLM labels Jun 24, 2026
@dd-octo-sts

dd-octo-sts Bot commented Jun 24, 2026

Copy link
Copy Markdown
Contributor

🟢 Java Benchmark SLOs — All performance SLOs passed

Suite Status
Startup 🟢 pass

SLO thresholds are defined here based on automatically generated metrics. A warning is raised when results are within 5% of the threshold.

PR vs. master results
Scenario Candidate master Δ (95% CI of mean)
startup:insecure-bank:iast:Agent 13.99 s 13.94 s [-0.5%; +1.2%] (no difference)
startup:insecure-bank:tracing:Agent 12.93 s 13.02 s [-1.5%; +0.1%] (no difference)
startup:petclinic:appsec:Agent 16.91 s 16.71 s [+0.2%; +2.2%] (maybe worse)
startup:petclinic:iast:Agent 16.82 s 16.93 s [-1.5%; +0.2%] (no difference)
startup:petclinic:profiling:Agent 15.93 s 16.89 s [-9.9%; -1.5%] (significantly better)
startup:petclinic:sca:Agent 16.83 s 16.66 s [+0.2%; +1.8%] (maybe worse)
startup:petclinic:tracing:Agent 16.09 s 16.21 s [-1.6%; +0.1%] (no difference)

Commit: 0e91753b · CI Pipeline · Benchmarking Platform UI


Load and DaCapo benchmarks can be triggered manually in the GitLab pipeline. Results will appear in the Benchmarking Platform UI after completion.

@dougqh

dougqh commented Jun 25, 2026

Copy link
Copy Markdown
Contributor Author

Benchmark results (zulu-17, JDK 17.0.7, @Threads(8), @Fork(5), -prof gc)

SQLCommenterDuplicateCommentBenchmark.alreadyCommented:

throughput gc.alloc.rate.norm
before (substring) 23.5M ± 1.1M ops/s 140 B/op
after (range/view) 26.2M ± 1.5M ops/s ~0 B/op (10⁻⁵)

The extractCommentContent substring (140 B/op) is gone — the in-place range scan and the SubSequence view it flows through are both EA-elided. The headline win is the allocation: this path is dominated by the nine indexOf scans (CPU the alloc-removal doesn't touch), so throughput is a modest ~1.1× uplift, now resolved at @Fork(5). Benchmark Javadoc refreshed in bd0983f.

@dougqh dougqh force-pushed the dougqh/dbcommenter-scan-overload branch from bd0983f to 24d8ea0 Compare June 29, 2026 18:16
dougqh added a commit that referenced this pull request Jun 29, 2026
…ld the view

Per review on #11737: move SubSequence.of to the call sites (SQLCommenter builds
the comment-region view; the String overload wraps the whole content) and make
SharedDBCommenter.containsTraceComment take a SubSequence instead of (String,
from, to). Behavior unchanged (SubSequence.contains -> Strings.regionContains);
test + dup-comment-benchmark Javadoc updated to the new shape.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@dougqh dougqh marked this pull request as ready for review June 29, 2026 19:53
@dougqh dougqh requested review from a team as code owners June 29, 2026 19:53
@dougqh dougqh requested review from jordan-wong and mhlidd and removed request for a team June 29, 2026 19:53
dougqh and others added 4 commits June 29, 2026 20:54
…tent substring

SQLCommenter.hasDDComment materialized sql.substring(commentStart, commentEnd)
on every duplicate-comment check just to feed containsTraceComment. Add a
SharedDBCommenter.containsTraceComment(sql, from, to) range overload that scans
the comment body in place, and a Strings.regionContains primitive it (and the
String delegate) build on -- no per-call substring. The String overload now
delegates to the range form, so Mongo behavior is unchanged.

regionContains is the allocation-free, copy-free primitive; the natural-reading
SubSequence.contains layer can delegate to it later. Boundary semantics unit-
tested on Strings.regionContains; DB needle behavior on the overloads; existing
SQLCommenter/SharedDBCommenter suites unchanged and green.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…diom)

Add SubSequence.contains (delegating to the Strings.regionContains primitive)
and rewrite containsTraceComment(sql, from, to) as
SubSequence.of(sql, from, to).contains(...) -- a 1-to-1 substitution for what
you'd idiomatically write on a substring, with no copy. EA elides the view in
the transient consumption; even if it doesn't, the string copy is still avoided.

Couples this branch to the SubSequence base (#10640); regionContains stays as
the allocation-free primitive the view delegates to.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Isolates the duplicate-comment guard (dbType=null skips the first-word scan;
already-DD-commented SQL makes inject return early). The extractCommentContent
substring allocated 140 B/op; the in-place range/view scan is EA-elided (~0).
@threads(8), @fork(2), -prof gc; numbers in the class Javadoc. Throughput is
flat-to-slightly-up but within @fork(2) noise -- this path is scan-CPU-dominated,
so the win is the allocation, not throughput.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
zulu-17 @fork(5), -prof gc: 23.5M -> 26.2M ops/s (~1.1x), 140 -> ~0 B/op.
Headline win is the allocation; @fork(5) tightens the earlier bimodal spread.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…scan-overload

# Conflicts:
#	internal-api/src/main/java/datadog/trace/util/SubSequence.java
#	internal-api/src/test/java/datadog/trace/util/SubSequenceTest.java
@datadog-official

This comment has been minimized.

@dougqh dougqh enabled auto-merge July 1, 2026 19:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

comp: database Database Monitoring inst: jdbc JDBC instrumentation tag: ai generated Largely based on code generated by an AI or LLM tag: no release notes Changes to exclude from release notes tag: performance Performance related changes

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants