fix(test): isolate e2e suites by unique workflow/execution id by Ma77Ball · Pull Request #5888 · apache/texera

Ma77Ball · 2026-06-22T09:11:26Z

What changes were proposed in this PR?

Add TestUtils.workflowContext(id, settings) that sets both workflowId and executionId to id, and make the DB fixtures plus setUp/cleanupWorkflowExecutionData take an id (the user email is derived from id to avoid the unique-email collision).
Give each materializing e2e suite a distinct id so concurrent suites no longer share an Iceberg result keyspace or DB rows: DataProcessingSpec=1, PauseSpec=2, ReconfigurationSpec=3, ReconfigurationIntegrationSpec=4.
Test-only change, no production code; BatchSizePropagationSpec and CheckpointSpec are untouched because they do not materialize results.

Any related issues, documentation, discussions?

Closes: #5887

How was this PR tested?

sbt "WorkflowExecutionService/Test/compile" compiles clean on Java 17.
Run the previously-flaky suite against the integration services (Postgres test DB + MinIO/S3 + Iceberg catalog, as in the build / amber CI job): sbt "WorkflowExecutionService/testOnly *DataProcessingSpec"; expect all DataProcessingSpec tests green with no CommitFailedException flake.
The timing-dependent flake could not be reproduced locally (no MinIO/Iceberg env), so final verification is the build / amber CI job staying green across re-runs.

Was this PR authored or co-authored using generative AI tooling?

Co-authored with Claude Opus 4.8 in compliance with ASF

…llision

github-actions · 2026-06-22T09:37:52Z

Automated Reviewer Suggestions

Based on the git blame history of the changed files, we recommend the following reviewers:

Contributors with relevant context: @Yicong-Huang, @aglinxinyuan
You can notify them by mentioning @Yicong-Huang, @aglinxinyuan in a comment.

codecov-commenter · 2026-06-22T10:16:31Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 54.11%. Comparing base (439ea72) to head (39f3d0c).

Additional details and impacted files

@@             Coverage Diff              @@
##               main    #5888      +/-   ##
============================================
+ Coverage     54.09%   54.11%   +0.02%     
+ Complexity     2817     2812       -5     
============================================
  Files          1103     1103              
  Lines         42650    42624      -26     
  Branches       4588     4586       -2     
============================================
- Hits          23070    23067       -3     
+ Misses        18236    18217      -19     
+ Partials       1344     1340       -4

Flag	Coverage Δ		*Carryforward flag
access-control-service	`70.44% <ø> (ø)`		Carriedforward from 3ced705
agent-service	`34.36% <ø> (ø)`		Carriedforward from 3ced705
amber	`55.63% <ø> (+0.01%)`	⬆️
computing-unit-managing-service	`1.65% <ø> (ø)`		Carriedforward from 3ced705
config-service	`57.35% <ø> (ø)`		Carriedforward from 3ced705
file-service	`57.36% <ø> (-1.24%)`	⬇️	Carriedforward from 3ced705
frontend	`48.15% <ø> (+0.06%)`	⬆️	Carriedforward from 3ced705
pyamber	`90.20% <ø> (ø)`		Carriedforward from 3ced705
python	`90.76% <ø> (ø)`		Carriedforward from 3ced705
workflow-compiling-service	`58.69% <ø> (ø)`		Carriedforward from 3ced705

*This pull request uses carry forward flags. Click here to find out more.

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

github-actions · 2026-06-22T11:23:54Z

⚠️ Benchmark changes need a look

🟢 0 better · 🔴 3 worse · ⚪ 12 noise (<±5%) · 0 without baseline

Compared against main 439ea72 benchmarked on this same runner, so the delta is largely free of cross-runner hardware noise. The "7d avg" column still reflects the gh-pages dashboard. Treat <±5% as noise unless repeated.

Dashboard · Run

	config	throughput	MB/s	latency	max Δ latest / 7d
🔴	bs=10 sw=10 sl=64	361	0.22	27,844/41,024/41,024 us	🔴 +10.7% / 🔴 +17.3%
⚪	bs=100 sw=10 sl=64	812	0.496	121,677/142,043/142,043 us	⚪ within ±5% / 🔴 -9.0%
⚪	bs=1000 sw=10 sl=64	918	0.56	1,093,160/1,131,893/1,131,893 us	⚪ within ±5% / 🔴 +12.4%

Baseline details

Latest main 439ea72 from same runner

config	metric	PR	latest main	7d avg	Δ latest	Δ 7d
bs=10 sw=10 sl=64	throughput	361 tuples/sec	384 tuples/sec	410.82 tuples/sec	-6.0%	-12.1%
bs=10 sw=10 sl=64	MB/s	0.22 MB/s	0.235 MB/s	0.251 MB/s	-6.4%	-12.3%
bs=10 sw=10 sl=64	p50	27,844 us	25,144 us	23,785 us	+10.7%	+17.1%
bs=10 sw=10 sl=64	p95	41,024 us	42,620 us	34,980 us	-3.7%	+17.3%
bs=10 sw=10 sl=64	p99	41,024 us	42,620 us	34,980 us	-3.7%	+17.3%
bs=100 sw=10 sl=64	throughput	812 tuples/sec	821 tuples/sec	891.94 tuples/sec	-1.1%	-9.0%
bs=100 sw=10 sl=64	MB/s	0.496 MB/s	0.501 MB/s	0.544 MB/s	-1.0%	-8.9%
bs=100 sw=10 sl=64	p50	121,677 us	118,606 us	112,277 us	+2.6%	+8.4%
bs=100 sw=10 sl=64	p95	142,043 us	146,972 us	139,802 us	-3.4%	+1.6%
bs=100 sw=10 sl=64	p99	142,043 us	146,972 us	139,802 us	-3.4%	+1.6%
bs=1000 sw=10 sl=64	throughput	918 tuples/sec	938 tuples/sec	1,041 tuples/sec	-2.1%	-11.8%
bs=1000 sw=10 sl=64	MB/s	0.56 MB/s	0.572 MB/s	0.635 MB/s	-2.1%	-11.9%
bs=1000 sw=10 sl=64	p50	1,093,160 us	1,069,627 us	972,714 us	+2.2%	+12.4%
bs=1000 sw=10 sl=64	p95	1,131,893 us	1,107,261 us	1,023,057 us	+2.2%	+10.6%
bs=1000 sw=10 sl=64	p99	1,131,893 us	1,107,261 us	1,023,057 us	+2.2%	+10.6%

Raw CSV

config_idx,batch_size,schema_width,string_len,num_batches,total_ms,total_tuples,total_bytes,tuples_per_sec,mb_per_sec,lat_p50_us,lat_p95_us,lat_p99_us
0,10,10,64,20,554.57,200,128000,361,0.220,27843.59,41024.08,41024.08
1,100,10,64,20,2462.66,2000,1280000,812,0.496,121676.66,142043.34,142043.34
2,1000,10,64,20,21784.47,20000,12800000,918,0.560,1093159.85,1131893.35,1131893.35

Ma77Ball · 2026-06-22T18:48:55Z

/request-review @aglinxinyuan @Yicong-Huang

Copilot

Pull request overview

This PR addresses flaky Amber end-to-end (e2e) tests caused by multiple suites concurrently sharing the same (workflowId, executionId) and DB fixture primary keys, which leads to collisions in the shared Iceberg result keyspace and Postgres rows. It introduces an explicit per-suite identifier to isolate workflows/executions and related DB fixtures so the suites can run concurrently without interfering.

Changes:

Added TestUtils.workflowContext(id, settings) to create a WorkflowContext with both workflowId and executionId set from a provided suite id.
Parameterized e2e DB fixtures and setUp/cleanupWorkflowExecutionData to use the suite id (including deriving a unique email from that id).
Updated the materializing e2e suites to use distinct ids (DataProcessing=1, Pause=2, Reconfiguration=3, ReconfigurationIntegration=4).

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated no comments.

Show a summary per file

File	Description
amber/src/test/scala/org/apache/texera/amber/engine/e2e/TestUtils.scala	Adds an id-based `WorkflowContext` constructor and id-parameterized DB fixtures/setup/cleanup to avoid cross-suite collisions.
amber/src/test/scala/org/apache/texera/amber/engine/e2e/DataProcessingSpec.scala	Assigns a stable per-suite id and uses it for workflow contexts and DB setup/cleanup.
amber/src/test/scala/org/apache/texera/amber/engine/e2e/PauseSpec.scala	Assigns a stable per-suite id and uses it for workflow context creation and DB setup/cleanup.
amber/src/test/scala/org/apache/texera/amber/engine/e2e/ReconfigurationSpec.scala	Assigns a stable per-suite id and uses it for workflow context creation and DB setup/cleanup.
amber/src/test/integration/org/apache/texera/amber/engine/e2e/ReconfigurationIntegrationSpec.scala	Assigns a stable per-suite id and uses it for workflow contexts and DB setup/cleanup (including warmup).

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Yicong-Huang

LGTM, thanks. Let's mark this PR a fix(test) and backport it to v1.2 I assume this flakiness should also happen for v1.2

Yicong-Huang · 2026-06-22T20:48:17Z

cc @xuang7 for backport

Yicong-Huang · 2026-06-23T05:18:46Z

As there are conflicts, @Ma77Ball please help on manually backport this to release/v1.2. The reason for backport is this flaky test could causes CI to fail from time to time on that branch.

…e/v1.2 backport] (#5903) ### What changes were proposed in this PR? - Backport of #5888 to `release/v1.2`, applied as `git cherry-pick -x 8803d08`. - Each amber e2e suite now gets a distinct id via a new `TestUtils.workflowContext(id, ...)` helper (DataProcessingSpec=1, PauseSpec=2, ReconfigurationSpec=3, ReconfigurationIntegrationSpec=4), and the DB fixture rows (user/workflow/version/execution, plus a per-id email) are derived from that id, so concurrent suites no longer collide in the shared Iceberg keyspace. - One conflict resolved, in `TestUtils.scala` imports only: kept release/v1.2's `org.apache.texera.amber.config.StorageConfig` path and took the cherry-pick's added `ExecutionIdentity`/`WorkflowIdentity`/`WorkflowSettings` imports needed by the new helper. The four spec files applied cleanly. - The `runWorkflowAndReadTerminalResults`/`readMaterializedResults` helpers that exist on `main` came from a separate change not on `release/v1.2`; #5888 does not touch them, so they are correctly absent here. ### Any related issues, documentation, discussions? Related to #5887 ### How was this PR tested? - Behavior is unchanged from #5888 (test-only isolation), so it relies on the existing e2e suites carried over verbatim. Reviewer can run them in the integration job: `sbt "amber/IntegrationTest/testOnly *ReconfigurationIntegrationSpec"`, and the unit e2e suites via `sbt "amber/testOnly *DataProcessingSpec *PauseSpec *ReconfigurationSpec"`; expect all green with no Iceberg `CommitFailedException` keyspace collisions across concurrently-running suites. - Local `sbt` could not load on `release/v1.2` (its `project/` lacks `AddMetaInfLicenseFiles`, unrelated to this change), so full compile/test is left to CI on this PR. ### Was this PR authored or co-authored using generative AI tooling? Co-authored with Claude Opus 4.8 in compliance with ASF

…#5888) ### What changes were proposed in this PR? - Add `TestUtils.workflowContext(id, settings)` that sets both `workflowId` and `executionId` to `id`, and make the DB fixtures plus `setUp`/`cleanupWorkflowExecutionData` take an `id` (the user email is derived from `id` to avoid the unique-email collision). - Give each materializing e2e suite a distinct id so concurrent suites no longer share an Iceberg result keyspace or DB rows: DataProcessingSpec=1, PauseSpec=2, ReconfigurationSpec=3, ReconfigurationIntegrationSpec=4. - Test-only change, no production code; BatchSizePropagationSpec and CheckpointSpec are untouched because they do not materialize results. ### Any related issues, documentation, discussions? Closes: apache#5887 ### How was this PR tested? - `sbt "WorkflowExecutionService/Test/compile"` compiles clean on Java 17. - Run the previously-flaky suite against the integration services (Postgres test DB + MinIO/S3 + Iceberg catalog, as in the `build / amber` CI job): `sbt "WorkflowExecutionService/testOnly *DataProcessingSpec"`; expect all DataProcessingSpec tests green with no CommitFailedException flake. - The timing-dependent flake could not be reproduced locally (no MinIO/Iceberg env), so final verification is the `build / amber` CI job staying green across re-runs. ### Was this PR authored or co-authored using generative AI tooling? Co-authored with Claude Opus 4.8 in compliance with ASF

Reworks the loop back-edge write address per the apache#5900 review decision: the URI is constant per-execution config, not per-iteration data, so it no longer rides State rows / the StateFrame envelope (the merged 3-column format has no loop_start_state_uri). Instead the scheduler resolves it and ships it to workers at setup: - PhysicalOp gains a compile-time `isLoopStart` marker (set by LoopStartOpDesc, mirroring requiresMaterializedExecution). - WorkflowExecutionCoordinator derives {LoopStart logical op id -> state URI of its single input port} from the final resource-allocated schedule; the two single-port/single-reader guards that lived in the Python worker's _compute_loop_start_id move controller-side. - InitializeExecutorRequest gains `map<string,string> loopStartStateUris` (field 4), sent to every worker each region (re-)run; the Python handler stores it on Context. No static LoopStart<->LoopEnd pairing is needed: a Loop End selects the entry by the loop_start_id carried on the StateFrame it consumes (nested loops work unchanged). - MainLoop: _compute_loop_start_id shrinks to the id parse; _jump_to_loop_start looks the URI up from the injected config and fails loudly (before the jump RPC) if it is missing; the inner-LoopStart nested pass-through gate keys on frame.loop_start_id instead of the removed URI field. - LoopIntegrationSpec adopts the per-suite id isolation from apache#5888 (specId 5), and WorkflowService drops a leftover errorSubject call removed on main by apache#5922. Tests: reworked test_main_loop.py to the 3-arg emit convention and the config-driven jump (plus a new missing-URI fail-loud case), added InitializeExecutorHandler coverage for the new field, pinned isLoopStart true/false in the Loop Start/End descriptor specs.

added unique specId to engine test cases to avoid iceberg keyspace co…

e35854f

…llision

github-actions Bot added engine fix amber-integration labels Jun 22, 2026

github-actions Bot assigned Ma77Ball Jun 22, 2026

Merge branch 'main' into fix-e2e-iceberg-keyspace-collision

c019c5d

Ma77Ball marked this pull request as ready for review June 22, 2026 11:07

github-actions Bot requested a review from aglinxinyuan June 22, 2026 11:16

Merge branch 'main' into fix-e2e-iceberg-keyspace-collision

9249d92

github-actions Bot requested a review from Yicong-Huang June 22, 2026 18:49

aglinxinyuan requested a review from Copilot June 22, 2026 19:43

Copilot started reviewing on behalf of aglinxinyuan June 22, 2026 19:46 View session

Copilot AI reviewed Jun 22, 2026

View reviewed changes

Yicong-Huang approved these changes Jun 22, 2026

View reviewed changes

xuang7 added the release/v1.2 back porting to release/v1.2 label Jun 22, 2026

Ma77Ball changed the title ~~test(amber): isolate e2e suites by unique workflow/execution id~~ fix(test): isolate e2e suites by unique workflow/execution id Jun 22, 2026

Merge branch 'main' into fix-e2e-iceberg-keyspace-collision

39f3d0c

Yicong-Huang removed the release/v1.2 back porting to release/v1.2 label Jun 23, 2026

Yicong-Huang enabled auto-merge June 23, 2026 05:27

Yicong-Huang added this pull request to the merge queue Jun 23, 2026

Merged via the queue into apache:main with commit 8803d08 Jun 23, 2026
32 of 36 checks passed

This was referenced Jun 23, 2026

Backport e2e suite Iceberg keyspace isolation (#5888) to release/v1.2 #5902

Closed

fix(test): isolate e2e suites by unique workflow/execution id [release/v1.2 backport] #5903

Merged

aglinxinyuan mentioned this pull request Jul 2, 2026

feat: add loop operators #5700

Draft

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix(test): isolate e2e suites by unique workflow/execution id#5888

fix(test): isolate e2e suites by unique workflow/execution id#5888
Yicong-Huang merged 4 commits into
apache:mainfrom
Ma77Ball:fix-e2e-iceberg-keyspace-collision

Ma77Ball commented Jun 22, 2026

Uh oh!

github-actions Bot commented Jun 22, 2026

Uh oh!

codecov-commenter commented Jun 22, 2026 •

edited

Loading

Uh oh!

github-actions Bot commented Jun 22, 2026 •

edited

Loading

Uh oh!

Ma77Ball commented Jun 22, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Yicong-Huang left a comment

Uh oh!

Yicong-Huang commented Jun 22, 2026

Uh oh!

Yicong-Huang commented Jun 23, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

Uh oh!

Conversation

Ma77Ball commented Jun 22, 2026

What changes were proposed in this PR?

Any related issues, documentation, discussions?

How was this PR tested?

Was this PR authored or co-authored using generative AI tooling?

Uh oh!

github-actions Bot commented Jun 22, 2026

Automated Reviewer Suggestions

Uh oh!

codecov-commenter commented Jun 22, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

github-actions Bot commented Jun 22, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

⚠️ Benchmark changes need a look

Uh oh!

Ma77Ball commented Jun 22, 2026

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Yicong-Huang left a comment

Choose a reason for hiding this comment

Uh oh!

Yicong-Huang commented Jun 22, 2026

Uh oh!

Yicong-Huang commented Jun 23, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

codecov-commenter commented Jun 22, 2026 •

edited

Loading

github-actions Bot commented Jun 22, 2026 •

edited

Loading