Skip to content

Return SubSequence from SQLCommenter.getFirstWord to avoid per-inject substring#11736

Merged
gh-worker-dd-mergequeue-cf854d[bot] merged 5 commits into
masterfrom
dougqh/sqlcommenter-getfirstword-subseq
Jun 30, 2026
Merged

Return SubSequence from SQLCommenter.getFirstWord to avoid per-inject substring#11736
gh-worker-dd-mergequeue-cf854d[bot] merged 5 commits into
masterfrom
dougqh/sqlcommenter-getfirstword-subseq

Conversation

@dougqh

@dougqh dougqh commented Jun 24, 2026

Copy link
Copy Markdown
Contributor

What Does This Do

Updates SQLCommenter.getFirstWord to return a SubSequence instead of a sub-String.
By using SubSequence, we avoid the sub-string allocation and often the SubSequence will be optimized away completely via escape analysis.

Motivation

Reduce work in application threads
Reduce JDBC integration allocation

Additional Notes

SQLCommenter.getFirstWord returns a zero-copy SubSequence instead of sql.substring(...). Its result is only consumed by startsWith("{") / equalsIgnoreCase("call") in inject, so the per-inject substring allocation was pure waste. The returned view carries a "transient — do not retain" warning (it pins its backing String).

It also carries SubSequence.startsWith/equalsIgnoreCase (retracted from #10640 so they land with their motivating consumer rather than as speculative API).

SQLCommenterGetFirstWordBenchmark (JDK 17, @Threads(8), @Fork(5), -prof gc)

throughput gc.alloc.rate.norm
before (substring) 258.1M ± 21.0M ops/s 48 B/op
after (SubSequence) 508.0M ± 21.6M ops/s ~0 (10⁻⁴)

Escape analysis fully elides the view in the transient consumption: 48 B/op → ~0, ~2× throughput. (The win is contingent on non-escape — confirming why the view must not be retained.)

🤖 Generated with Claude Code

@dougqh dougqh added tag: performance Performance related changes tag: no release notes Changes to exclude from release notes type: refactoring inst: jdbc JDBC instrumentation tag: ai generated Largely based on code generated by an AI or LLM labels Jun 24, 2026
@dd-octo-sts

dd-octo-sts Bot commented Jun 24, 2026

Copy link
Copy Markdown
Contributor

🟢 Java Benchmark SLOs — All performance SLOs passed

Suite Status
Startup 🟢 pass

SLO thresholds are defined here based on automatically generated metrics. A warning is raised when results are within 5% of the threshold.

PR vs. master results
Scenario Candidate master Δ (95% CI of mean)
startup:insecure-bank:iast:Agent 13.96 s 13.92 s [-0.4%; +0.9%] (no difference)
startup:insecure-bank:tracing:Agent 12.84 s 13.00 s [-1.9%; -0.5%] (maybe better)
startup:petclinic:appsec:Agent 17.44 s 17.12 s [+0.9%; +2.7%] (maybe worse)
startup:petclinic:iast:Agent 17.41 s 17.41 s [-1.1%; +1.2%] (no difference)
startup:petclinic:profiling:Agent 17.37 s 17.32 s [-0.8%; +1.4%] (no difference)
startup:petclinic:sca:Agent 16.82 s 17.42 s [-7.7%; +1.0%] (no difference)
startup:petclinic:tracing:Agent 16.52 s 16.62 s [-1.7%; +0.5%] (no difference)

Commit: e66f238a · CI Pipeline · Benchmarking Platform UI


Load and DaCapo benchmarks can be triggered manually in the GitLab pipeline. Results will appear in the Benchmarking Platform UI after completion.

@dougqh

dougqh commented Jun 25, 2026

Copy link
Copy Markdown
Contributor Author

Benchmark results (zulu-17, JDK 17.0.7, @Threads(8), @Fork(5), -prof gc)

SQLCommenterGetFirstWordBenchmark.firstWordCheck:

throughput gc.alloc.rate.norm
before (substring) 258.1M ± 21.0M ops/s 48 B/op
after (SubSequence) 508.0M ± 21.6M ops/s ~0 B/op (10⁻⁴)

The SubSequence view is escape-analysis–elided in the transient firstWord check, so the old substring's 48 B/op (a String + its backing array) drops to ~0 and throughput rises ~2× at @Threads(8). @Fork(5) tightens the earlier bimodal @Fork(2) spread. Benchmark Javadoc refreshed in 89ca7b8.

dougqh and others added 4 commits June 29, 2026 11:06
…comparisons)

These are the ops a real classification parse needs but CharSequence lacks —
surfaced by porting SQLCommenter.getFirstWord (firstWord.startsWith("{"),
firstWord.equalsIgnoreCase("call")). Lets transient parse views be compared
without materializing a String. + JUnit 5 tests.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… substring

getFirstWord's result is only consumed by startsWith("{") and
equalsIgnoreCase("call"), so materializing a substring on every inject() was
pure waste. Return a zero-copy SubSequence view instead and use its
startsWith/equalsIgnoreCase. Behavior is unchanged (91 SQLCommenterTest cases
pass); the test compares getFirstWord(sql).toString().

This is the motivating consumer for SubSequence.startsWith/equalsIgnoreCase
(retracted from #10640 and carried here so they land with their use case).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Benchmarks the getFirstWord first-word scan, consuming the result via the real
transient pattern (startsWith, returning the boolean) so EA isn't faked away.
@threads(8), @fork(2), -prof gc: the old substring allocated 48 B/op per call;
the SubSequence view is fully EA-elided (~0 B/op), ~1.6x throughput. Numbers in
the class Javadoc.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
zulu-17 @fork(5), -prof gc: 258.1M -> 508.0M ops/s (~2x), 48 -> ~0 B/op.
@fork(5) tightens the earlier bimodal @fork(2) spread.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@dougqh dougqh force-pushed the dougqh/sqlcommenter-getfirstword-subseq branch from 89ca7b8 to 2f648c7 Compare June 29, 2026 15:10
@dougqh dougqh marked this pull request as ready for review June 29, 2026 17:04
@dougqh dougqh requested review from a team as code owners June 29, 2026 17:04
@dougqh dougqh requested review from ygree and removed request for a team June 29, 2026 17:04
* for large SQL). Consume it in place and discard; if the value must be retained, call {@link
* SubSequence#toString()} to detach a standalone copy.
*/
protected static SubSequence getFirstWord(String sql) {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good for the generic SubSequence case to have extra helper methods such as startsWith and equalsIgnoreCase. Looking back at the concrete getFirstWord use cases, though, it could be much simpler: I would eliminate getFirstWord completely, since it does not seem to add much value here.

It is only used to skip whitespace and check whether the next token is either { or a case-insensitive call followed by whitespace. That could be checked directly in a small amount of code.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair point. I've been erring on the side of keeping the changes 1-to-1 and very mechanical, but I do agree that further simplification might make sense here.

@dougqh dougqh enabled auto-merge June 30, 2026 17:32
@datadog-datadog-prod-us1

This comment has been minimized.

@dougqh dougqh added this pull request to the merge queue Jun 30, 2026
@dd-octo-sts

dd-octo-sts Bot commented Jun 30, 2026

Copy link
Copy Markdown
Contributor

/merge

@gh-worker-devflow-routing-ef8351

gh-worker-devflow-routing-ef8351 Bot commented Jun 30, 2026

Copy link
Copy Markdown

View all feedbacks in Devflow UI.

2026-06-30 18:25:48 UTC ℹ️ Start processing command /merge


2026-06-30 18:25:52 UTC ℹ️ MergeQueue: pull request added to the queue

The expected merge time in master is approximately 2h (p90).


2026-06-30 19:24:13 UTC ℹ️ MergeQueue: This merge request was merged

@github-merge-queue github-merge-queue Bot removed this pull request from the merge queue due to failed status checks Jun 30, 2026
@gh-worker-dd-mergequeue-cf854d gh-worker-dd-mergequeue-cf854d Bot merged commit 015a24f into master Jun 30, 2026
583 checks passed
@gh-worker-dd-mergequeue-cf854d gh-worker-dd-mergequeue-cf854d Bot deleted the dougqh/sqlcommenter-getfirstword-subseq branch June 30, 2026 19:24
@github-actions github-actions Bot added this to the 1.64.0 milestone Jun 30, 2026
dougqh added a commit that referenced this pull request Jun 30, 2026
The #11736 CharSequence (charAt-loop) versions are superseded by the
String-delegating overloads here; String-literal callers (SQLCommenter)
bind to the String overloads. Removes the redundant pair + their tests.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

inst: jdbc JDBC instrumentation tag: ai generated Largely based on code generated by an AI or LLM tag: no release notes Changes to exclude from release notes tag: performance Performance related changes type: refactoring

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants