fix(kubernetes): terminate idle computing units by yrenat · Pull Request #6046 · apache/texera

yrenat · 2026-07-01T07:34:32Z

What changes were proposed in this PR?

This PR adds backend-side cleanup for idle Kubernetes computing units.

The main change is a scheduled cleanup task in the computing unit managing service that periodically scans active Kubernetes computing units and terminates units that have been inactive longer than a configurable timeout.
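A periodic task of this kind could be wired up roughly as follows. This is a minimal sketch, not the code in this PR: `IdleCleanupScheduler`, `start`, and the `cleanup` callback are hypothetical names, and it assumes a plain JDK `ScheduledExecutorService` is an acceptable way to run the check.

```scala
import java.util.concurrent.{Executors, ScheduledExecutorService, TimeUnit}
import scala.util.control.NonFatal

// Hypothetical helper, not taken from the PR.
object IdleCleanupScheduler {

  /** Runs `cleanup` on one background thread every `intervalMillis` milliseconds. */
  def start(intervalMillis: Long)(cleanup: () => Unit): ScheduledExecutorService = {
    val executor = Executors.newSingleThreadScheduledExecutor()
    val task: Runnable = () =>
      try cleanup()
      catch {
        // scheduleAtFixedRate suppresses all later runs once a run throws,
        // so failures are reported here instead of being rethrown.
        case NonFatal(e) => System.err.println(s"idle cleanup failed: ${e.getMessage}")
      }
    executor.scheduleAtFixedRate(task, intervalMillis, intervalMillis, TimeUnit.MILLISECONDS)
    executor
  }
}
```

The caller keeps the returned executor so it can call `shutdownNow()` when the service stops.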

The implementation includes the following changes:

- Added new Kubernetes configuration entries for:
  - computing unit idle timeout
  - computing unit idle check interval
- Exposed both settings through environment-variable-based configuration so deployment-side overrides can be applied without code changes.
- Added a scheduled background task in ComputingUnitManagingService that runs the idle cleanup logic at a fixed interval.
- Added idle Kubernetes computing unit termination logic in ComputingUnitManagingResource, which:
  - only considers Kubernetes computing units that are not already terminated
  - checks whether the computing unit has any active workflow executions
  - computes the latest execution activity timestamp from existing execution metadata
  - terminates the Kubernetes pod when the computing unit is considered idle past the configured timeout
  - updates the computing unit termination time in the database after cleanup
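The idle check described in the list above can be sketched as a pure function. This is illustrative only: `UnitSnapshot` and `IdlePolicy` are hypothetical names, the fields are a simplified stand-in for the real computing unit and execution records, and falling back to the creation time when no execution has been recorded is an assumption of this sketch rather than something the PR states.

```scala
import java.time.{Duration, Instant}

// Hypothetical, simplified view of a computing unit (not the Texera model).
final case class UnitSnapshot(
    id: Int,
    isKubernetes: Boolean,
    terminatedAt: Option[Instant],
    createdAt: Instant,
    hasActiveExecution: Boolean,
    lastExecutionActivity: Option[Instant]
)

object IdlePolicy {

  /** True when the unit is a live Kubernetes unit with no running work
    * and its latest activity is older than `timeout`.
    */
  def isIdle(unit: UnitSnapshot, now: Instant, timeout: Duration): Boolean = {
    val eligible =
      unit.isKubernetes && unit.terminatedAt.isEmpty && !unit.hasActiveExecution
    // Assumption: with no recorded execution, measure idleness from creation time.
    val lastActivity = unit.lastExecutionActivity.getOrElse(unit.createdAt)
    eligible && Duration.between(lastActivity, now).compareTo(timeout) > 0
  }

  /** Units that a cleanup pass would terminate. */
  def selectIdle(units: Seq[UnitSnapshot], now: Instant, timeout: Duration): Seq[UnitSnapshot] =
    units.filter(isIdle(_, now, timeout))
}
```

Keeping the decision free of Kubernetes and database calls would also make it straightforward to unit test, which is relevant given that the patch currently reports 0% coverage.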

The timeout and check interval are configurable through environment variables, so the behavior can be tuned for different deployment or testing needs without modifying the code.
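As an illustration of environment-variable overrides with safe fallbacks, the sketch below reads both settings from a map of variables. The variable names and the default values (30 minutes and 1 minute) are placeholders invented for this example; the actual names and defaults are defined by the PR's configuration files and are not shown in this description.

```scala
import scala.concurrent.duration._
import scala.util.Try

// Hypothetical loader, not taken from the PR.
object IdleCleanupConfig {
  // Placeholder variable names for illustration only.
  val TimeoutVar = "EXAMPLE_CU_IDLE_TIMEOUT_SECONDS"
  val IntervalVar = "EXAMPLE_CU_IDLE_CHECK_INTERVAL_SECONDS"

  /** Parses `name` as a positive number of seconds, else returns `default`. */
  def secondsFrom(env: Map[String, String], name: String, default: FiniteDuration): FiniteDuration =
    env
      .get(name)
      .flatMap(raw => Try(raw.trim.toLong).toOption) // ignore non-numeric values
      .filter(_ > 0) // ignore zero and negative values
      .map(_.seconds)
      .getOrElse(default)

  /** Returns (idle timeout, check interval); defaults are assumed values. */
  def load(env: Map[String, String] = sys.env): (FiniteDuration, FiniteDuration) =
    (secondsFrom(env, TimeoutVar, 30.minutes), secondsFrom(env, IntervalVar, 1.minute))
}
```

Passing the environment as a parameter lets tests supply their own map instead of mutating the process environment.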

Any related issues, documentation, discussions?

Fixes #5362

How was this PR tested?

Tested locally on the Kubernetes deployment flow.

Demo video: fix-idle-CU-demo.mp4

Was this PR authored or co-authored using generative AI tooling?

Generated-by: OpenAI Codex GPT-5

github-actions · 2026-07-01T07:34:42Z

👋 Thanks for opening this pull request, @yrenat!

It looks like the pull request description doesn't quite follow our template yet:

The What changes were proposed in this PR? section is missing; please keep the template's headings.
The How was this PR tested? section is missing; please keep the template's headings.
The Was this PR authored or co-authored using generative AI tooling? section is missing; please keep the template's headings.

Filling out the template helps reviewers understand and triage your contribution faster. Please edit the description to complete it. This message will disappear automatically once the template is followed.

You can find the template prompts by editing the description, or see CONTRIBUTING.md for the full contribution flow.

github-actions · 2026-07-01T07:34:51Z

Automated Reviewer Suggestions

Based on the git blame history of the changed files, we recommend the following reviewers:

Contributors with relevant context: @Ma77Ball, @aicam
You can notify them by mentioning @Ma77Ball, @aicam in a comment.

codecov-commenter · 2026-07-01T07:36:35Z

Codecov Report

❌ Patch coverage is 0% with 19 lines in your changes missing coverage. Please review.
✅ Project coverage is 56.80%. Comparing base (1a58433) to head (1b66a97).
⚠️ Report is 6 commits behind head on main.

Files with missing lines	Patch %	Lines
.../texera/service/ComputingUnitManagingService.scala	0.00%	11 Missing ⚠️
...rvice/resource/ComputingUnitManagingResource.scala	0.00%	8 Missing ⚠️

Additional details and impacted files

@@             Coverage Diff              @@
##               main    #6046      +/-   ##
============================================
- Coverage     56.82%   56.80%   -0.03%     
  Complexity     3023     3023              
============================================
  Files          1126     1126              
  Lines         43708    43727      +19     
  Branches       4733     4737       +4     
============================================
  Hits          24837    24837              
- Misses        17402    17421      +19     
  Partials       1469     1469

Flag	Coverage Δ	*Carryforward flag
access-control-service	`70.00% <ø> (ø)`
agent-service	`44.59% <ø> (ø)`	Carriedforward from 1a58433
amber	`58.64% <ø> (ø)`	Carriedforward from 1a58433
computing-unit-managing-service	`0.00% <0.00%> (ø)`
config-service	`52.30% <ø> (ø)`
file-service	`62.81% <ø> (ø)`
frontend	`49.97% <ø> (ø)`	Carriedforward from 1a58433
notebook-migration-service	`78.57% <ø> (ø)`
pyamber	`90.20% <ø> (ø)`	Carriedforward from 1a58433
python	`90.76% <ø> (ø)`	Carriedforward from 1a58433
workflow-compiling-service	`55.14% <ø> (ø)`

*This pull request uses carry forward flags.

github-actions · 2026-07-01T07:40:39Z

⚠️ Benchmark changes need a look

🟢 4 better · 🔴 5 worse · ⚪ 6 noise (<±5%) · 0 without baseline

Compared against main 104bcc4 benchmarked on this same runner, so the delta is largely free of cross-runner hardware noise. The "7d avg" column still reflects the gh-pages dashboard. Treat <±5% as noise unless repeated.

	config	throughput (tuples/sec)	MB/s	latency p50/p95/p99	max Δ latest / 7d
🔴	bs=10 sw=10 sl=64	368	0.225	23,977/73,209/73,209 us	🔴 +117.5% / 🔴 +385.8%
🟢	bs=100 sw=10 sl=64	926	0.565	104,775/127,589/127,589 us	🟢 -20.1% / 🔴 +18.6%
🟢	bs=1000 sw=10 sl=64	1,094	0.668	918,884/957,384/957,384 us	🟢 -6.2% / 🟢 -9.6%

Baseline details

Latest main 104bcc4 from same runner

config	metric	PR	latest main	7d avg	Δ latest	Δ 7d
bs=10 sw=10 sl=64	throughput	368 tuples/sec	433 tuples/sec	777.62 tuples/sec	-15.0%	-52.7%
bs=10 sw=10 sl=64	MB/s	0.225 MB/s	0.264 MB/s	0.475 MB/s	-14.8%	-52.6%
bs=10 sw=10 sl=64	p50	23,977 us	21,815 us	12,612 us	+9.9%	+90.1%
bs=10 sw=10 sl=64	p95	73,209 us	33,655 us	15,070 us	+117.5%	+385.8%
bs=10 sw=10 sl=64	p99	73,209 us	33,655 us	18,360 us	+117.5%	+298.7%
bs=100 sw=10 sl=64	throughput	926 tuples/sec	909 tuples/sec	988.31 tuples/sec	+1.9%	-6.3%
bs=100 sw=10 sl=64	MB/s	0.565 MB/s	0.555 MB/s	0.603 MB/s	+1.8%	-6.3%
bs=100 sw=10 sl=64	p50	104,775 us	104,560 us	101,066 us	+0.2%	+3.7%
bs=100 sw=10 sl=64	p95	127,589 us	159,743 us	107,594 us	-20.1%	+18.6%
bs=100 sw=10 sl=64	p99	127,589 us	159,743 us	115,830 us	-20.1%	+10.2%
bs=1000 sw=10 sl=64	throughput	1,094 tuples/sec	1,079 tuples/sec	1,019 tuples/sec	+1.4%	+7.3%
bs=1000 sw=10 sl=64	MB/s	0.668 MB/s	0.658 MB/s	0.622 MB/s	+1.5%	+7.4%
bs=1000 sw=10 sl=64	p50	918,884 us	918,865 us	986,982 us	+0.0%	-6.9%
bs=1000 sw=10 sl=64	p95	957,384 us	1,020,247 us	1,028,491 us	-6.2%	-6.9%
bs=1000 sw=10 sl=64	p99	957,384 us	1,020,247 us	1,058,493 us	-6.2%	-9.6%
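The Δ columns appear to be plain relative differences against the baseline. The sketch below assumes delta = (PR - baseline) / baseline, which reproduces the reported figures for the rows checked; `BenchmarkDelta` is a hypothetical helper and not part of the benchmark tooling.

```scala
import java.util.Locale

// Hypothetical helper for checking the table's percentages by hand.
object BenchmarkDelta {

  /** Relative change of `pr` against `baseline`, in percent. */
  def percentDelta(pr: Double, baseline: Double): Double =
    (pr - baseline) / baseline * 100.0

  /** Formats with an explicit sign and one decimal, matching the table style. */
  def formatDelta(delta: Double): String =
    "%+.1f%%".formatLocal(Locale.ROOT, delta)
}

object BenchmarkDeltaDemo extends App {
  import BenchmarkDelta._
  // bs=10 p95 latency: 73,209 us (PR) vs 33,655 us (latest main)
  println(formatDelta(percentDelta(73209, 33655))) // +117.5%
  // bs=10 p95 latency: 73,209 us (PR) vs 15,070 us (7d avg)
  println(formatDelta(percentDelta(73209, 15070))) // +385.8%
  // bs=10 throughput: 368 vs 433 tuples/sec (latest main)
  println(formatDelta(percentDelta(368, 433))) // -15.0%
}
```

For latency a positive delta is a regression, while for throughput a negative delta is, which is why the same row can show both signs.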

Raw CSV

config_idx,batch_size,schema_width,string_len,num_batches,total_ms,total_tuples,total_bytes,tuples_per_sec,mb_per_sec,lat_p50_us,lat_p95_us,lat_p99_us
0,10,10,64,20,543.26,200,128000,368,0.225,23976.79,73208.97,73208.97
1,100,10,64,20,2159.88,2000,1280000,926,0.565,104775.10,127589.32,127589.32
2,1000,10,64,20,18282.56,20000,12800000,1094,0.668,918884.38,957383.59,957383.59

fix(kubernetes): terminate idle computing units

1b66a97

github-actions Bot assigned yrenat Jul 1, 2026

github-actions Bot added fix common platform Non-amber Scala service paths labels Jul 1, 2026

yrenat wants to merge 1 commit into apache:main from yrenat:fix/idle-kubernetes-cus
