Skip to content

refactor(k8s): organize Helm templates into base/aws/on-prem#5757

Merged
aicam merged 4 commits into
apache:mainfrom
aicam:aws-eks/01-template-reorg
Jun 24, 2026
Merged

refactor(k8s): organize Helm templates into base/aws/on-prem#5757
aicam merged 4 commits into
apache:mainfrom
aicam:aws-eks/01-template-reorg

Conversation

@aicam

@aicam aicam commented Jun 17, 2026

Copy link
Copy Markdown
Contributor

What changes were proposed in this PR?

This is a behavior-neutral refactoring of the Helm chart under bin/k8s/ — it reorganizes the flat templates/ directory into a clear, self-documenting layout with no value or logic changes. It is the first in a planned series of small, non-breaking PRs that make the chart cleanly deployable on AWS/EKS while keeping on-prem/local as the unchanged default.

Helm renders templates/** recursively, so moving files into subdirectories does not change rendered output. Templates are now grouped two ways:

  • By where they applytemplates/base/ (shared by every deployment), templates/on-prem/ (minio-persistence.yaml), and templates/aws/ (placeholder for AWS-only, value-gated templates added in later PRs; empty for now).
  • Within base/, by component — one subfolder per service holding all of its manifests, e.g. base/access-control-service/, base/gateway/, base/workflow-computing-unit-manager/, base/workflow-computing-unit-pool/, etc.

A templates/README.md documents the base/aws/on-prem and per-component conventions, and .helmignore is added so *.md/.gitkeep are not loaded as manifests.

No values were renamed, added, or removed, and templates/aws/ is empty, so the default on-prem/local install is unchanged.

Any related issues, documentation, discussions?

Closes #5892 (refactor sub-task).
Part of #5891 — unify AWS (EKS) and on-premise Kubernetes deployment under bin/k8s (parent feature).
Follows the design discussion in #5641.

This is the first of a planned series of incremental, non-breaking PRs to add AWS/EKS deployment support to the chart; later PRs build on this structure (pluggable object storage, node placement, AWS load balancer, computing-unit warm pool, eksctl/runbook).

How was this PR tested?

This is a pure chart refactor, so it was verified to be a no-op:

Render no-op proof: helm template texera bin/k8s was captured before and after the file moves. The output is byte-identical — same resources both times — with the only differences being the # Source: provenance comments (the new paths) and a per-render randomized secret value. helm lint bin/k8s passes.

No unit tests were added because the change is limited to the organization of Helm chart files, which is validated by the render-diff and helm lint above rather than by JVM unit tests.

Was this PR authored or co-authored using generative AI tooling?

Generated-by: Claude Code (Claude Opus 4.8)

@github-actions github-actions Bot added docs Changes related to documentations infra labels Jun 17, 2026
@aicam aicam marked this pull request as draft June 17, 2026 18:47
Group the chart's templates by where they apply so the layout makes the
deployment surface obvious at a glance:

  templates/common/  resources every deployment needs (services, gateway,
                     postgres/lakefs/lakekeeper, computing-unit pool, RBAC)
  templates/onprem/  self-hosted-only resources (in-cluster MinIO)
  templates/aws/     AWS/EKS-only resources (added in later PRs; placeholder)

Helm renders templates/** recursively, so this is purely organizational:
`helm template` output is byte-identical to before the move (verified, modulo
the chart's pre-existing per-render random LakeFS keys). A templates/README.md
documents the convention and a .helmignore keeps the doc and .gitkeep
placeholder from being loaded as manifests.

Also bring values-development.yaml back in line with values.yaml (the ground
truth), which had drifted:
  - image source: docker.io/apache + 1.3.0-incubating-SNAPSHOT (was the stale
    ghcr.io/apache + latest), so both value files pull from the same place;
  - add the AUTH_JWT_SECRET entry to texeraEnvVars. values.yaml has it but the
    dev profile omitted it, so the computing-unit manager started without the
    secret and k8s computing-unit creation crashed with NoSuchElementException
    (None.get) in ComputingUnitManagingResource. Adding it (same dev-only
    default as values.yaml) makes CU creation work under the dev profile.

No behavior change to the default (on-prem) install.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@aicam aicam force-pushed the aws-eks/01-template-reorg branch from a1fcbab to bd9fa60 Compare June 22, 2026 16:51
@github-actions

Copy link
Copy Markdown
Contributor

Automated Reviewer Suggestions

Based on the git blame history of the changed files, we recommend the following reviewers:

  • Contributors with relevant context: @bobbai00, @aglinxinyuan, @mengw15
    You can notify them by mentioning @bobbai00, @aglinxinyuan, @mengw15 in a comment.

@aicam aicam requested a review from bobbai00 June 22, 2026 18:16

@Ma77Ball Ma77Ball left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall LGTM!! Please make sure the helm chart and Docker can still be built, but otherwise I think it can be merged and looks like good work.

Comment thread bin/k8s/values-development.yaml Outdated

@Ma77Ball Ma77Ball left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aicam, this might need to be adjusted on Docker or here.

Comment thread bin/k8s/values-development.yaml Outdated

@bobbai00 bobbai00 left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rename onprem to on-prem, common to base. Make sure you also update the README.md under the templates folder accordingly

Comment thread bin/k8s/values-development.yaml Outdated
…-development

- Rename templates/common -> templates/base and templates/onprem ->
  templates/on-prem per review (@bobbai00), and update templates/README.md
  accordingly.
- Revert values-development.yaml to apache/main: drop the
  imageRegistry/imageTag override (@Ma77Ball, @bobbai00) and the hardcoded
  AUTH_JWT_SECRET (@Ma77Ball), keeping this PR a pure mechanical reorg.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@aicam aicam changed the title refactor(k8s): organize Helm templates into common/aws/onprem refactor(k8s): organize Helm templates into base/aws/on-prem Jun 23, 2026
@aicam

aicam commented Jun 23, 2026

Copy link
Copy Markdown
Contributor Author

Thanks for the review @bobbai00 @Ma77Ball — addressed in ca0a4c7:

  • Folder names (@bobbai00): renamed templates/commontemplates/base and templates/onpremtemplates/on-prem, and updated templates/README.md to match.
  • imageRegistry/imageTag (@Ma77Ball, @bobbai00): reverted values-development.yaml back to ghcr.io/apache / latest so the out-of-the-box helm install -f values-development.yaml keeps pulling published images.
  • Hardcoded AUTH_JWT_SECRET (@Ma77Ball): dropped from this PR as well. This keeps PR1 a pure mechanical reorg with zero value/logic changes. I'll open a separate follow-up for an install-time generated dev secret (lookup + randAlphaNum) as you suggested.

Verification: helm template texera bin/k8s renders byte-identical output to the pre-rename commit (5085 lines, only # Source: paths differ), and helm lint is clean. The reorg remains behavior-neutral.

@aicam aicam requested a review from bobbai00 June 23, 2026 22:46
@bobbai00

Copy link
Copy Markdown
Contributor

Please update the PR description to match the latest code structure

@aicam

aicam commented Jun 24, 2026

Copy link
Copy Markdown
Contributor Author

Updated the PR description to match the final structure (base/aws/on-prem) and the pure-reorg scope (no value changes). Thanks @bobbai00!

@aicam aicam enabled auto-merge June 24, 2026 16:16
license-eye cannot determine the comment style of Helm's .helmignore and
fails the 'Check License Headers' job. Add it to paths-ignore in
.licenserc.yaml, consistent with .gitignore/.dockerignore/.gitkeep. Also
refresh the stale common/aws/onprem reference in the .helmignore comment to
base/aws/on-prem.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@aicam aicam disabled auto-merge June 24, 2026 16:21
@aicam aicam added this pull request to the merge queue Jun 24, 2026
Merged via the queue into apache:main with commit e55d1ab Jun 24, 2026
16 checks passed
@aicam aicam deleted the aws-eks/01-template-reorg branch June 24, 2026 17:13
xuang7 pushed a commit to ELin2025/texera that referenced this pull request Jul 1, 2026
…pache#5932)

### What changes were proposed in this PR?

This makes the **application tier's** object-storage target
configurable, as the first step toward supporting an external S3 store
alongside the in-cluster MinIO. It is **non-breaking**: the default
(on-prem / in-cluster MinIO) install renders an identical set of
resources.

Today `file-service` and `workflow-computing-unit-manager` hardcode the
S3 endpoint and credentials to the in-cluster MinIO Service (`{{
.Release.Name }}-minio`) and its auto-generated Secret. This PR routes
both through a new `storage.s3` values block:

- **`values.yaml`** — new `storage.s3` block: `endpoint`, `region`,
`existingSecret`, `accessKeyId`, `secretAccessKey`. All default to
empty.
- **`templates/base/_helpers.tpl`** (new) — helpers that resolve the S3
endpoint and the credentials Secret name/keys. When
`storage.s3.endpoint` is empty they fall back to the in-cluster MinIO
Service and its `{{ .Release.Name }}-minio` Secret (keys
`root-user`/`root-password`), so the default render is unchanged.
- **`templates/aws/s3-credentials-secret.yaml`** (new) — a `{{
.Release.Name }}-s3-credentials` Secret, rendered **only** when an
external `endpoint` is set and no `existingSecret` is supplied. Renders
nothing on the default install.
- **`file-service` / `workflow-computing-unit-manager` deployments** —
the `STORAGE_S3_*` env now comes from the helpers; `STORAGE_S3_REGION`
is added only in external mode.

How it behaves:
- **Default (no config):** services point at the in-cluster MinIO
exactly as before.
- **External S3:** set `storage.s3.endpoint` + `region` + credentials
(or `existingSecret`) → both services use that store; the chart
materializes the credentials Secret unless you bring your own.

**Out of scope (intentionally deferred to keep this PR small and
atomic):** LakeFS blockstore + Lakekeeper warehouse external-S3 wiring,
the `minio.enabled` switch to drop the in-cluster MinIO entirely, and a
`values-aws.yaml` example overlay. Those are the LakeFS/Lakekeeper half
of "make object storage pluggable" and will land as a follow-up.

### Any related issues, documentation, discussions?

Closes apache#5931 (app-tier `storage.s3` task).
Part of apache#5891 — unify AWS (EKS) and on-premise Kubernetes deployment
under `bin/k8s` (parent feature).
Follows apache#5757 (Helm template reorg) and the design discussion in apache#5641.

### How was this PR tested?

Verified the default install is unchanged and the external path renders
correctly:

1. **No-op render proof:** `helm template texera bin/k8s` on this branch
vs on `main` renders the **same 102 resources** — identical after
ignoring comments. The only textual artifact is one empty `---` document
from the gated-off `s3-credentials-secret.yaml` (its `# license` header
is emitted by Helm even though the `{{- if }}` body is empty); it
produces no Kubernetes object. `helm lint` passes.
2. **External S3 render:** `helm template ... --set
storage.s3.endpoint=https://s3.us-west-2.amazonaws.com --set
storage.s3.accessKeyId=… --set storage.s3.secretAccessKey=…` renders the
`*-s3-credentials` Secret, repoints both deployments'
`STORAGE_S3_ENDPOINT`/credentials at it (keys
`access-key-id`/`secret-access-key`), and adds `STORAGE_S3_REGION`.
3. **Bring-your-own Secret:** with `--set
storage.s3.existingSecret=my-creds`, the deployments reference
`my-creds` and the chart generates **no** Secret.
4. `helm lint bin/k8s` passes for both the default and the external
value sets.
5. `helm install` locally with Minikube and test creating dataset and
running workflows

No unit tests were added — the change is limited to Helm chart
values/templates, validated by the render diff and `helm lint` above.

### Was this PR authored or co-authored using generative AI tooling?

Generated-by: Claude Code (Claude Opus 4.8)

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

docs Changes related to documentations infra

Projects

None yet

Development

Successfully merging this pull request may close these issues.

refactor(k8s): reorganize Helm templates into common/aws/onprem + per-component subfolders

3 participants