feat(security): SSO/JWT authentication migration (Phase 1) by jsell-rh · Pull Request #1569 · ambient-code/platform

jsell-rh · 2026-05-12T17:14:41Z

Summary

SSO authentication migration — replaces OpenShift OAuth proxy with direct OIDC authentication via Keycloak. Includes a validated end-to-end native SSO deployment path (api-server + control plane) that bypasses the legacy backend and operator entirely.

What's implemented

Frontend BFF OIDC: Next.js acts as confidential OIDC client. Browser gets httpOnly session cookie, never sees a raw JWT. Authorization Code Flow with PKCE. Transparent token refresh (5-min access tokens, 30-min sessions).
Backend JWT validation: JWKS-based validation via lestrrat-go/jwx/v2. Validates signature, expiration, issuer, and audience.
K8s impersonation: Backend SA + Impersonate-User/Impersonate-Group headers preserve all existing RBAC enforcement without cluster OIDC federation. Prefers preferred_username over email for impersonation (matches Keycloak identity brokering claim order). Defaults to system:authenticated group when JWT has no groups.
Dual-path auth: JWT validation first, K8s TokenReview fallback for API keys (SA tokens). Both paths use impersonation.
SSO_ENABLED env var: SSO can be enabled via env var without depending on Unleash feature flag.
API server JWK_CERT_URL: Production environment no longer hardcodes the JWKS URL. Configurable via JWK_CERT_URL env var or --jwk-cert-url CLI flag, enabling non-RH-SSO OIDC providers (e.g., Keycloak).
Session expired UX: Global 401 detection via React Query cache, blocking dialog with login redirect, no retry storms.
E2E test auth: client_credentials grant from Keycloak with K8s SA fallback.
Local Keycloak: Kind cluster includes Keycloak with pre-configured realm, dev users, and protocol mappers. make kind-sso-toggle switches between SSO and legacy mode.
Native SSO deployment workflow: Complete workflow for deploying with Keycloak SSO, validated on hcmais cluster. See workflows/deploy-native-sso.md.
Migration ordering fix: roleBindings typedFKMigration ID changed from 202505130001 to 202603100139 to sort after the table-creation migration it depends on (202603100138 creates the role_bindings table; the old ID sorted before it, breaking fresh databases).

Production DB impact verified: the ambient-code staging RDS already has 202505130001 in its migrations table. Shipping 202603100139 means gormigrate treats it as a new migration and runs it. Every statement uses IF EXISTS/IF NOT EXISTS/DROP NOT NULL (all idempotent), so all operations no-op on production. The migrations table will contain both IDs — redundant but harmless. Fresh databases (Kind, new namespaces) will only have 202603100139 in the correct sort position.

Deployment overlays

Overlays reorganized by cluster under components/manifests/overlays/hcmais/:

Overlay	Purpose
`hcmais/jsell-sso-poc/`	Native SSO ambient deployment (api-server + control plane, no oauth-proxy)
`hcmais/keycloak/`	Keycloak instance deployment

Key files

Area	Files
Spec	`specs/security/sso-authentication.spec.md`
Deployment workflow	`workflows/deploy-native-sso.md`
API server JWKS config	`components/ambient-api-server/cmd/ambient-api-server/environments/e_production.go`
Backend SSO	`components/backend/handlers/sso.go`, `server/server.go`
Frontend OIDC	`components/frontend/src/lib/oidc.ts`, `session.ts`, `auth.ts`
Overlay (SSO PoC)	`components/manifests/overlays/hcmais/jsell-sso-poc/`
Overlay (Keycloak)	`components/manifests/overlays/hcmais/keycloak/`
RBAC	`base/rbac/backend-clusterrole.yaml`, `base/rbac/control-plane-*`

Default behavior

make kind-up deploys with legacy auth (SA token, no Keycloak redirect)
make kind-sso-toggle enables Keycloak OIDC for both frontend and backend
Existing deployments using oauth-proxy are unaffected — SSO code paths only activate with SSO_ENABLED=true or the sso-authentication Unleash flag
API server falls back to sso.redhat.com JWKS if JWK_CERT_URL is not set

Native SSO deployment (validated)

The end-to-end path without legacy components was validated on the hcmais cluster:

acpctl → Keycloak auth → API Server (REST) → DB
                                ↓ gRPC event
                          Control Plane → Runner Pod → Claude response
                                ↑ gRPC messages
                          Runner → API Server → acpctl

Prerequisites for a new environment:

Keycloak with 3 clients: ambient-frontend, ambient-control-plane (service account), ambient-cli (public)
Identity brokering with RH SSO (optional, for RH user federation)
Secrets: sso-credentials, control-plane-oidc, ambient-vertex
See workflows/deploy-native-sso.md for the complete checklist

Test plan

🤖 Generated with Claude Code

Summary by CodeRabbit

New Features
- SSO via OIDC/Keycloak with session-based login/logout, secure cookies, and backend JWT validation
- Dual-mode auth (legacy & SSO) behind a feature flag for gradual rollout
- Frontend session-expired dialog, explicit Logged Out page, and SSO-aware logout flow
- Improved E2E helpers to run tests in both legacy and SSO modes
Documentation
- Updated local dev, Kind/Keycloak setup, E2E, and SSO migration/deployment guides

Define desired state for migrating from OpenShift OAuth proxy to direct SSO/JWT authentication. Key decisions: - BFF pattern: Next.js as OIDC confidential client, browser gets session cookie - K8s impersonation: backend SA + Impersonate-User/Group preserves RBAC - Dual-path auth: JWT first, TokenReview fallback for API keys - Feature-flagged migration for incremental rollout - Supersedes ADR-0002 (raw token passthrough → impersonation) Includes migration workflow with consumer impact map and implementation notes. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

netlify · 2026-05-12T17:14:49Z

✅ Deploy Preview for cheerful-kitten-f556a0 canceled.

Name	Link
🔨 Latest commit	`2ce3305`
🔍 Latest deploy log	https://app.netlify.com/projects/cheerful-kitten-f556a0/deploys/6a185f395e8cb40008f67ab0

coderabbitai · 2026-05-12T17:14:49Z

📝 Walkthrough

<review_stack_artifact>

</review_stack_artifact>

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch jsell/spec/sso-authentication

✨ Simplify code

Create PR with simplified code
Commit simplified code in branch jsell/spec/sso-authentication

Reference the IAM consolidation proposal (PR #1466) as the long-term direction. This spec is Phase 1; future phases cover API keys → SSO service accounts, runner → OIDC token exchange, DB RBAC reconciler, and credential consolidation. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

… requirements - OIDC callback must coexist with existing integration auth routes - SSO client configuration requirements (one per environment, audience isolation) - Post-logout redirect URI and web origins specified Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- Kind/local-dev environments include Keycloak with pre-configured realm - Replaces static JWKS ConfigMap, DISABLE_AUTH mock mode, and OC_TOKEN - Same JWT validation code path as production (no dev-only auth logic) - Realm config version-controlled as JSON export - E2E tests use local Keycloak in Kind environments - Design decision: Keycloak Identity Brokering for deployed environments Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- Keycloak Deployment, realm JSON, and env var config for Kind overlay - Maps what it replaces (static JWKS, DISABLE_AUTH, test-user SA) - Identity Brokering section for deployed environments - Updated manifest changes to include Kind overlay additions/removals Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…d impersonation RBAC Slice 1 of SSO authentication migration (Phase 1): - Deploy Keycloak to Kind cluster with pre-configured realm (ambient-code) including confidential frontend client, public CLI client, and E2E client_credentials client. Dev users: developer/developer, admin/admin. - Add jwtauth package with JWKS-based JWT validation using lestrrat-go/jwx/v2. Validates signature, expiration, issuer, and audience. Extracts OIDC claims (sub, email, preferred_username, groups). - Add impersonate verb on users, groups, and serviceaccounts to backend-api ClusterRole for K8s impersonation under SSO auth. - Fix Kind overlay: relax runAsNonRoot for ambient-api-server, make control-plane OIDC env vars optional. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…mpersonation Slice 2 of SSO authentication migration (Phase 1): - Wire JWT validation into backend middleware: forwardedIdentityMiddleware validates JWT against Keycloak JWKS, extracts identity from OIDC claims (sub, email, preferred_username, groups), and stores validated claims in Gin context for reuse by handlers. - Add dual-path auth in getK8sClientsDefault: JWT validation first, then TokenReview fallback for API keys (K8s ServiceAccount tokens). - Use K8s impersonation (Impersonate-User/Group) instead of raw bearer token when SSO is enabled. Backend SA token + impersonation preserves all existing RBAC enforcement. - Fix SSAR cache key to include impersonated identity instead of shared SA token, preventing cross-user authorization cache leaks. - Gate SSO path behind "sso-authentication" Unleash feature flag. - Add SSO env vars (SSO_ISSUER_URL, SSO_AUDIENCE) to backend Kind overlay. - Fix Keycloak realm: add audience mapper and protocol mappers for sub, email, preferred_username claims in access token. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Slice 3 of SSO authentication migration (Phase 1): - Add openid-client v6, iron-session v8, and jose as dependencies - Create OIDC client layer (src/lib/oidc.ts): discovery, authorization URL construction with PKCE, code exchange, token refresh, end-session URL - Create encrypted session cookie management (src/lib/session.ts): iron-session with httpOnly/secure/sameSite cookies, transparent token refresh when access token is within 60s of expiry - Add SSO API routes: - /api/auth/sso/login: generates PKCE, stores verifier/state in cookies, redirects to Keycloak authorization endpoint - /api/auth/sso/callback: exchanges code for tokens, stores in session - /api/auth/sso/logout: destroys session, redirects to Keycloak logout - Add Next.js middleware: redirects unauthenticated page requests to SSO login when SSO_ENABLED=true - Modify buildForwardHeadersAsync: SSO path extracts JWT from session, sets Authorization: Bearer and X-Forwarded-* headers from JWT claims. All 97+ consumers are unaffected. - Update navigation logout to use SSO logout route when enabled - Update /api/me to accept Authorization header for auth check - Add SSO env vars to Kind frontend deployment patch - Support SSO_PUBLIC_ISSUER_URL for Kind dev (browser vs cluster URLs) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Keycloak supports any port for localhost redirect URIs per RFC 8252 section 7.3. Registering http://localhost/* (without port) accepts callbacks on any ephemeral port, eliminating port-forward mismatches. Also set webOrigins to "+" (all valid redirect origins) for CORS. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

openid-client v6 requires a standard URL instance (not NextURL). Construct callback URL from SSO_REDIRECT_URI base to match the redirect_uri sent during authorization, since request.url inside the container resolves to 0.0.0.0:3000 rather than localhost:11646. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

In Kind, Keycloak's iss response parameter uses the public URL (localhost:30090) while openid-client validates against the internal URL (keycloak-service:8080). Remap the iss param before passing to authorizationCodeGrant so RFC 9207 issuer validation passes. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Set KC_HOSTNAME to the internal service URL so Keycloak uses a consistent issuer in all tokens and OIDC responses, regardless of whether the browser reaches it via localhost:30090 or the server reaches it via keycloak-service:8080. This eliminates issuer mismatches in ID token validation. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

In Kind, the browser reaches Keycloak via localhost:30090 while backend and frontend servers use keycloak-service:8080. Keycloak sets the token issuer based on the authorization session URL, causing mismatches. Fixes: - Add alt issuer support to JWT validator (AddAltIssuer) so the backend accepts tokens from both internal and public Keycloak URLs. Production environments use a single URL and don't need alt issuers. - Use standard openid-client authorizationCodeGrant in production (full ID token validation). Fall back to manual token exchange in dev when SSO_PUBLIC_ISSUER_URL differs from SSO_ISSUER_URL. - Set cookies directly on redirect response in login route (cookies() API mutations don't transfer to NextResponse.redirect). - Derive post-login redirect origin from SSO_REDIRECT_URI to avoid container-internal 0.0.0.0:3000 address. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…ints The OIDC discovery config was cached as a module-level singleton with no expiry. If Keycloak restarted and got a new ClusterIP, token refresh calls would fail silently (ECONNREFUSED) and the session would be destroyed, logging the user out. Add a 5-minute TTL so the config is re-discovered periodically. This matches the Keycloak JWKS cache interval and ensures endpoint URLs stay current after dependency restarts. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

getUserSubjectFromContext now prefers userEmail (matching the Impersonate-User header) when creating RoleBindings. Previously it used userName (preferred_username), causing a mismatch: the RoleBinding subject would be "developer" but impersonation would use "developer@local.dev", so RBAC checks would fail. This ensures lazy RoleBinding creation in CreateProject works correctly with SSO impersonation — no manual RoleBindings needed. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- Callback route: redirect to /api/auth/sso/login instead of showing JSON error when OIDC state cookies are missing or exchange fails. Handles stale Keycloak sessions that skip the login page. - Logout route: derive post-logout redirect URI from SSO_REDIRECT_URI to avoid 0.0.0.0:3000 container address. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

NEXT_PUBLIC_* env vars are inlined at build time in Next.js client components, so they're unavailable when the image is built without them. Instead, expose ssoEnabled from the /api/me server route and read it in the navigation component via useCurrentUser(). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

The project layout had its own handleLogout hardcoded to /oauth/sign_out, separate from the main navigation. Unified both to use the runtime ssoEnabled flag from useCurrentUser(). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Slice 4 of SSO authentication migration (Phase 1): - Update extract-token.sh to obtain JWT from Keycloak via client_credentials grant (ambient-e2e client). Falls back to K8s SA token when Keycloak is not available. - Add audience and sub protocol mappers to ambient-e2e Keycloak client so tokens have proper aud claim for backend validation. - Add ClusterRoleBinding for e2e service account identity (service-account-ambient-e2e) so E2E tests can access projects. - No developer RoleBindings — JIT provisioning via CreateProject handles first-time access correctly. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

When the OIDC session expires and token refresh fails, the user now sees a blocking dialog instead of silent 401 errors: - Global 401 detection via QueryCache/MutationCache onError handlers - Skip retries on 401 to prevent request storms against the IdP - Non-dismissable AlertDialog with "Log in" button that preserves returnTo path so users land back on the same page - No "expiring soon" warning — server-side refresh handles access token renewal transparently; only surfaces when refresh token dies Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- Add logging to getAccessToken so token refresh attempts and failures are visible in pod logs (was silently swallowing errors). - Fix middleware to return 401 JSON for RSC/fetch requests instead of redirecting to Keycloak. Cross-origin redirects fail as XHR and cause CORS errors. Full page navigations still redirect to login. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

openid-client's refreshTokenGrant validates the ID token iss claim in the refresh response, which fails when the token was issued by localhost:30090 but the refresh goes through keycloak-service:8080. Use manual fetch to the token endpoint in split-URL mode (same approach as code exchange). Production uses the library's standard refreshTokenGrant with full validation. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

The split-URL problem (browser→localhost:30090, server→keycloak-service:8080) caused token issuer mismatches that broke refresh tokens and ID token validation. Every workaround added complexity. Root fix: proxy Keycloak through the frontend at /sso/* so browser and server both reach Keycloak through the same origin. Combined with KC_HOSTNAME=http://keycloak-service:8080, all tokens now have a consistent issuer that matches the discovery endpoint. Changes: - Add /sso/[...path] catch-all route that proxies to Keycloak, rewriting Location headers on redirects - Set KC_HOSTNAME to internal service URL for consistent token issuer - Update SSO_PUBLIC_ISSUER_URL to use the proxy path - Exclude /sso from auth middleware matcher - Remove unused next.config.js rewrites (build-time, not runtime) This eliminates: alt issuers on the backend, manual token exchange fallbacks, iss parameter remapping in callbacks, and CORS errors on session expiry redirects. Production deployments use a single URL and don't need the proxy. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Eliminates the split-URL issuer mismatch by properly configuring Keycloak's hostname-backchannel-dynamic feature: - KC_HOSTNAME=http://localhost:11646/sso — all tokens use the public URL as issuer, login pages render with proxy URLs - KC_HOSTNAME_BACKCHANNEL_DYNAMIC=true — internal services get backchannel URLs (token_endpoint, jwks_uri) via keycloak-service:8080 Frontend changes: - Manual OIDC discovery to bypass openid-client v6's issuer validation (known issue: github.com/panva/openid-client/issues/737) - Remove all split-URL workarounds (manual token exchange, iss remapping, URL rewriting in auth/logout/refresh) - openid-client's standard authorizationCodeGrant and refreshTokenGrant now work correctly for all flows Backend changes: - JWT validator uses discovered issuer from OIDC metadata (not the discovery URL) so it accepts the public issuer in tokens Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- Frontend README: replace OC_TOKEN/OAuth proxy header docs with SSO env vars and OIDC session model description - Frontend .env.example: add SSO_* vars, move OC_* to legacy section - Backend README: replace DISABLE_AUTH migration guide with Keycloak dev auth instructions (JWT and SA token examples) - E2E README: update quick start to use extract-token.sh (Keycloak client_credentials with K8s SA fallback) - Kind dev guide: add Keycloak to bootstrap steps, document dev credentials and session lifetimes - CONTRIBUTING.md: add Keycloak to kind-up description, update access instructions with login info - OPENSHIFT_OAUTH.md: mark as legacy, link to SSO spec Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…facing The sso-authentication Unleash flag controls which auth path the backend uses. It is not visible in workspace settings and is not user-configurable — ops enables it per-environment during migration. Kind dev cluster creates and enables it automatically. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…bient-code/platform into jsell/spec/sso-authentication

markturansky

Amber Code Review — feat(security): SSO/JWT authentication migration (Phase 1)

Summary

Substantial, well-structured SSO migration. The core auth path — JWKS-based JWT validation, K8s impersonation, dual-path (JWT + TokenReview) fallback, and encrypted httpOnly session cookie — is architecturally sound and the approach is correct. One critical security flaw in the OIDC callback (open redirect) must be fixed before merge, along with a few major issues around the Keycloak proxy surface and committed credential placeholders.

Findings

Critical

Open redirect in OIDC callback — components/frontend/src/app/api/auth/sso/callback/route.ts

const returnTo = cookieStore.get("oidc_return_to")?.value || "/";
// ...
return NextResponse.redirect(new URL(returnTo, origin));

new URL(absoluteUrl, base) ignores base when the first argument is already absolute. An attacker can craft /api/auth/sso/login?returnTo=https://evil.com; the server stores it in the httpOnly oidc_return_to cookie; after successful OIDC login the user is silently redirected to evil.com. Classic phishing vector.

Fix: validate returnTo is a relative path before storing/using it:

function safeReturnTo(value: string | undefined): string {
  if (!value) return "/";
  try {
    const url = new URL(value, "http://localhost");
    // Reject if the input was already absolute (has its own host)
    if (new URL(value, "http://localhost").origin !== "http://localhost") return "/";
    return url.pathname + url.search;
  } catch { return "/"; }
}

Apply in both login/route.ts (before storing to cookie) and callback/route.ts (before using from cookie).

Major

1. CHANGE_ME placeholders committed to cluster overlay — components/manifests/overlays/hcmais/jsell-sso-poc/

sso-credentials.yaml has SSO_CLIENT_SECRET: CHANGE_ME, SESSION_SECRET: CHANGE_ME. keycloak-admin-secret.yaml has password: CHANGE_ME. keycloak-db.yaml has POSTGRES_PASSWORD: CHANGE_ME. keycloak-realm.json has "clientSecret": "CHANGE_ME" for both the frontend client and the RH SSO identity provider.

These are in a cluster-specific overlay intended for production (hcmais). If applied without rotation the session cookie encryption key is trivially known. Should either:

Add a # REQUIRED: rotate before applying comment block and a pre-flight check in the deployment workflow
Or use ExternalSecret / SealedSecret so real values are never committed

The Kind overlay correctly uses dev-secret-do-not-use-in-prod labels and is isolated to dev. The hcmais overlay needs the same level of care.

2. Unrestricted Keycloak proxy — components/frontend/src/app/sso/[...path]/route.ts

This route proxies all GET and POST requests to SSO_ORIGIN (Keycloak) with no path allowlisting. It exposes every Keycloak HTTP endpoint to the public internet via the frontend, including the admin REST API (/auth/admin/...), token introspection, user endpoints, etc. Keycloak has its own auth on admin routes, but this needlessly widens the attack surface and makes ingress WAF rules harder.

Suggested: restrict to paths the OIDC flow actually needs:

const ALLOWED_PATHS = /^(\/realms\/[^/]+\/(protocol\/openid-connect|\.well-known))/;
if (!ALLOWED_PATHS.test(`/${path}`)) {
  return NextResponse.json({ error: "Not found" }, { status: 404 });
}

3. E2E auth bypass reachable with SSO enabled on Kind — components/manifests/overlays/kind/frontend-test-patch.yaml

E2E_TEST_HELPERS=true is set unconditionally in the Kind test patch. The e2e-login route's second guard (AMBIENT_ENV === "production") is only meaningful if production overlays explicitly set AMBIENT_ENV=production on the frontend deployment. The Kind overlay doesn't set it, so after make kind-sso-toggle (SSO on), the auth bypass endpoint is live at /api/auth/sso/e2e-login for anyone who can reach the Kind cluster and knows the route exists.

Dev-only cluster, so production impact is nil. But a developer running Kind SSO mode with external access (e.g., port-forwarded to a shared network) is exposed. Consider removing E2E_TEST_HELPERS from the standard Kind patch and requiring it to be set via a separate make kind-e2e-mode toggle, or add AMBIENT_ENV=development to the route guard so the check is explicit both ways.

Minor

4. Module-level OIDC config cache — components/frontend/src/lib/oidc.ts:5-9

cachedConfig and cachedAt are module-level variables. In serverless / Next.js edge deployments, module state is not guaranteed to persist across invocations, which could cause repeated OIDC discovery on every request. Consider using lru-cache or an explicit cache tied to the Next.js request lifecycle, or document that this is intentional for long-lived Node.js processes only.

5. Silent session destroy on missing refresh token — components/frontend/src/lib/session.ts:60-63

When the access token is expired and there's no refresh token, session.destroy() is called silently. No log entry makes it hard to debug "why did my session disappear?" Add: console.warn("SSO: session expired with no refresh token, destroying").

6. Kind Keycloak SecurityContext — components/manifests/overlays/kind/keycloak-deployment.yaml:16

runAsNonRoot: false for Kind Keycloak and Kind api-server (kind/api-server-security-patch.yaml). CLAUDE.md mandates runAsNonRoot on all containers. Dev-only, so no prod risk, but worth noting the deliberate divergence with an inline comment explaining why Keycloak requires it (# Keycloak upstream image requires root; acceptable in Kind dev only).

7. Empty smoke test — components/ambient-api-server/test/integration/integration_test.go:258-261

TestServerStarts has no assertions — the comment explains the intent, but this pattern is fragile: a future refactor that skips TestMain would leave a silently-passing test. Consider require.NotNil(t, helper.AppConfig()) or equivalent to make the assertion explicit.

Positive Highlights

JWT validator is well-built: JWKS cache with 5-min min-refresh, proper issuer/audience/expiration validation, manual issuer check to support alt-issuers (dev vs. public URL). The validator_test.go is thorough — expired, tampered signature, wrong issuer/audience, malformed, empty, minimal claims — all covered.
K8s impersonation approach is the right call: no cluster OIDC federation required, all existing RBAC bindings continue to work, SA tokens (API keys) accepted via TokenReview fallback. Identity claim resolution order (preferred_username > email > sub) is consistent across buildImpersonatingClients, setIdentityFromClaims, and getUserSubjectFromContext.
SSAR cache key improved from raw bearer token to OIDC sub — reduces cache churn on token rotation and is more correct semantically.
InitJWTValidator non-fatal design: server starts normally without SSO configured; the feature flag is also off in that state. Clean.
Session cookie correctly configured: httpOnly, SameSite=lax, encrypted via iron-session.
PKCE+state implemented correctly for Authorization Code Flow.
Migration ID change well-reasoned and idempotency analysis is rigorous.
forwardedIdentityMiddleware falls through to legacy header-based path on JWT failure, preserving backward compatibility.

Recommendations (priority order)

Fix open redirect in callback/route.ts and login/route.ts — validate returnTo is relative before storing/using. (~10 lines)
Add path allowlist to sso/[...path]/route.ts — block everything except /realms/*/protocol/openid-connect/* and /.well-known/*.
Address CHANGE_ME placeholders in hcmais/jsell-sso-poc/ — add deployment workflow gate or switch to ExternalSecret/SealedSecret.
Document / isolate E2E_TEST_HELPERS — avoid having it enabled by default on the Kind overlay when SSO is toggled on.
Items 4–7 are quality improvements that can be addressed in follow-up tickets.

Review by Amber · standards refs: specs/standards/security/security.spec.md, skills/amber-review/SKILL.md, CLAUDE.md

Critical: - Fix open redirect in OIDC callback — validate returnTo is a relative path before storing in cookie and before redirecting. Absolute URLs (e.g. https://evil.com) are rejected and replaced with "/". Major: - Add path allowlist to Keycloak proxy — only OIDC-required paths (realms/*/protocol/openid-connect, .well-known, login-actions, resources) are proxied. Admin API and other endpoints return 404. - Add warning comments to CHANGE_ME placeholders in sso-credentials.yaml pointing to the deployment workflow. Minor: - Add console.warn when session is destroyed due to missing refresh token - Add real assertion to TestServerStarts (verify helper initialized and server URL is non-empty) - Add inline comment explaining runAsNonRoot:false on Kind Keycloak Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

jsell-rh · 2026-05-21T18:39:11Z

Response to Amber code review (#4339633631)

All findings addressed in 1d194b00:

Critical — Open redirect

Fixed. Added safeReturnTo() validation in both login/route.ts and callback/route.ts. Absolute URLs are rejected and replaced with /. Validates by parsing with a localhost base and checking the origin matches.

Major 1 — CHANGE_ME placeholders

Added warning comment block pointing to workflows/deploy-native-sso.md Step 3a.

Major 2 — Keycloak proxy allowlist

Added ALLOWED_PATH_PREFIX regex restricting to realms/*/protocol/openid-connect, .well-known, login-actions, and resources. Everything else returns 404.

Major 3 — E2E auth bypass

Already addressed in prior commit — added AMBIENT_ENV === "production" guard. The Kind concern (dev cluster exposure) is valid but low-risk; noted for follow-up.

Minor items

OIDC config cache: Intentional for long-lived Node.js processes. Serverless edge is not a current target.
Silent session destroy: Added console.warn before session.destroy().
Kind SecurityContext: Added inline comment explaining the deliberate divergence.
Empty smoke test: Added real assertions — verifies testHelper initialized and RestURL returns non-empty.

markturansky

Re-review: SSO/JWT Authentication Migration (Phase 1)

Holistic re-review after fixup commit 1d194b00. Previous review posted 2026-05-21.

Fixes verified ✅

All six items from the initial review were addressed in the fixup commit. Each one is correctly implemented:

Open redirect in callback/login routes — safeReturnTo() now resolves all inputs against http://localhost and rejects anything whose origin differs from that base. Tested mentally against the classic attack vectors: https://evil.com → origin mismatch → /; //evil.com/path → resolves to http://evil.com/path → origin mismatch → /; javascript:alert(1) → URL parse throws → catch → /; /projects/foo → same origin → /projects/foo. Fix is correct.

Keycloak proxy path allowlist — regex /^realms\/[^/]+\/(protocol\/openid-connect|\.well-known|login-actions|resources)\b/ blocks everything except the four legitimate Keycloak endpoint families. Correct.

E2E auth bypass double guard — now requires both E2E_TEST_HELPERS === "true" and AMBIENT_ENV !== "production" before the bypass activates. Returns 404 otherwise. Correct.

CHANGE_ME comment on sso-credentials.yaml — header comment added, clearly marking placeholder values and linking to the deploy workflow. Correct.

runAsNonRoot in Kind overlay — inline comment explains why root is acceptable for the Keycloak upstream image in a dev-only context. Correct.

Integration test guard — TestServerStarts now fails fast with a clear message if TestMain didn't initialize the helper. Correct.

Remaining gaps ⚠️

1. CHANGE_ME comment missing from three sibling overlay files (minor → should fix)

sso-credentials.yaml got the # REQUIRED: Replace CHANGE_ME values before deploying header, but the other three files in the same overlay directory did not:

components/manifests/overlays/hcmais/jsell-sso-poc/keycloak-admin-secret.yaml
components/manifests/overlays/hcmais/jsell-sso-poc/keycloak-db.yaml
components/manifests/overlays/hcmais/jsell-sso-poc/keycloak-realm.json

All three contain literal placeholder strings that would break a real deployment if applied as-is. Same fix needed: add the required-replace comment at the top of each file.

2. E2E_TEST_HELPERS=true unconditional in Kind patch (minor → should fix)

components/manifests/overlays/kind/frontend-test-patch.yaml still sets E2E_TEST_HELPERS=true unconditionally. The E2E route is now double-guarded in code, but anyone running the Kind overlay gets an always-on test-helper surface. This should either be removed from the base Kind patch and injected only in CI test runs, or at minimum carry a comment explaining the intent. As it stands, any developer who applies the Kind overlay locally activates the bypass endpoint.

New bug: `session.idToken` is never stored 🐛 (must fix before merge)

logout/route.ts reads session.idToken to build the OIDC end_session_endpoint URL:

// logout/route.ts
const endSessionUrl = await getEndSessionUrl(
  session.idToken,   // ← always undefined
  redirectUrl,
);

getEndSessionUrl uses idToken as the id_token_hint parameter, which tells the identity provider which session to terminate. If it is undefined, the function falls back to passing only client_id, which means Keycloak may not terminate the upstream SSO session — only the local cookie is destroyed. Users will find themselves re-logged-in silently the next time they visit, which defeats single sign-out.

The root cause is in callback/route.ts. exchangeCode() returns idToken, but the callback only persists three of the four fields:

// callback/route.ts (current)
const tokens = await exchangeCode(code, codeVerifier, redirectUri);
session.accessToken  = tokens.accessToken;
session.refreshToken = tokens.refreshToken;
// session.idToken is dropped here   ← bug
session.expiresAt    = tokens.expiresAt;
await session.save();

Fix is one line:

session.idToken      = tokens.idToken;

This must be added before session.save() in the callback route.

Style nit (optional)

safeReturnTo() is copy-pasted identically into both login/route.ts and callback/route.ts. It should live in a shared module (e.g. src/lib/auth-utils.ts) and be imported from both routes. Not a blocker, but worth cleaning up before this ships to production to avoid the two copies drifting.

Verdict

The fixup commit is a solid response to the initial review — all six critical and major issues are correctly addressed. Two minor gaps remain (CHANGE_ME comments on sibling files, unconditional E2E_TEST_HELPERS in Kind patch), and there is one new must-fix bug: session.idToken is never persisted to the iron-session cookie, which silently breaks OIDC single sign-out.

Request changes on the idToken omission. The other items can be addressed in the same commit. Happy to re-review once those three lines are in.

…E_ME Must-fix: - Persist idToken to session cookie in OIDC callback so logout can pass id_token_hint to Keycloak end_session_endpoint (single sign-out) Cleanup: - Extract safeReturnTo() to shared lib/auth-utils.ts, import from both login and callback routes - Add CHANGE_ME warnings to keycloak-admin-secret.yaml, keycloak-db.yaml, and keycloak kustomization.yaml (keycloak-realm.json is JSON, noted in kustomization instead) - Add security comment to E2E_TEST_HELPERS in Kind overlay Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

jsell-rh · 2026-05-21T18:53:35Z

Response to re-review

All items addressed in 05bb922e:

idToken bug (must-fix): Added session.idToken = tokens.idToken in the callback route. Single sign-out via id_token_hint will now work correctly.

CHANGE_ME on sibling files: Added warnings to keycloak-admin-secret.yaml, keycloak-db.yaml, and keycloak/kustomization.yaml (JSON can't have comments, so noted there instead).

safeReturnTo dedup: Extracted to src/lib/auth-utils.ts, imported from both routes.

E2E_TEST_HELPERS comment: Added security note to the Kind overlay explaining the double-guard and advising against shared-network exposure.

…bber The sso-credentials.yaml contains CHANGE_ME placeholders. Including it in kustomize resources causes oc apply -k to overwrite manually-set secret values. Same pattern as the CRB namespace clobber fix. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

markturansky

Third Pass — Thorough Line-by-Line Review

Reviewed after commit 05bb922e. All three previous fixup commits now absorbed.

All prior issues confirmed resolved ✅

Every finding from reviews 1 and 2 is correctly addressed. I walked each file end-to-end below; I won't rehash what's already fixed.

New findings

MAJOR — hcmais Keycloak realm missing `groups` scope

File: components/manifests/overlays/hcmais/keycloak/keycloak-realm.json

The Kind realm correctly defines a realm-level groups clientScope with an oidc-group-membership-mapper, and all three clients include it in defaultClientScopes. The hcmais realm does neither:

Kind realm:    ambient-frontend defaultClientScopes: ['openid', 'email', 'profile', 'groups'] ✅
hcmais realm:  ambient-frontend defaultClientScopes: ['openid', 'email', 'profile']         ✗

There is no groups clientScope definition at the realm level and no per-client protocol mapper for group membership. Tokens issued by the hcmais Keycloak will have no groups claim.

In buildImpersonatingClients (handlers/sso.go):

groups := claims.Groups
if len(groups) == 0 {
    groups = []string{"system:authenticated"}
}

Every SSO user on hcmais will be impersonated as system:authenticated and nothing else. Any K8s RoleBinding bound to a specific group (e.g., an LDAP-synced group like ambient-users) will not be honoured. Users may see unexpected 403s on resources gated by group membership.

Fix: Add the groups clientScope to the hcmais realm, mirroring the Kind config:

"clientScopes": [
  {
    "name": "groups",
    "protocol": "openid-connect",
    "protocolMappers": [{
      "name": "groups",
      "protocolMapper": "oidc-group-membership-mapper",
      "config": {
        "full.path": "false",
        "access.token.claim": "true",
        "id.token.claim": "true",
        "claim.name": "groups"
      }
    }]
  }
]

And add "groups" to ambient-frontend.defaultClientScopes.

MAJOR — Identity claim precedence is inconsistent across the stack

There are three places that resolve "who is this user"; all three use a different priority order:

Site	Priority order
`buildImpersonatingClients` (sso.go)	`preferred_username` → `email` → `sub`
`setIdentityFromClaims` — `userID` (server.go)	`email` → `sub`
`setIdentityFromClaims` — `userName` (server.go)	`preferred_username` → `email`

K8s impersonation uses preferred_username as the primary. The context userID uses email as the primary. For any user that has both claims (the common case), the K8s identity is their preferred_username (e.g. jsell) but the application userID is their email (e.g. jsell@redhat.com). These two values are used in different places — userID for secret key naming, credential lookup; K8s identity for RBAC. If any RBAC RoleBinding is authored with the email address as the subject, it would match K8s impersonation only when preferred_username is absent. If it is authored with the username, userID-keyed lookups won't match.

The root fix is to make setIdentityFromClaims use preferred_username first for userID, matching the impersonation order:

// Consistent with buildImpersonatingClients
primaryID := claims.PreferredUsername
if primaryID == "" {
    primaryID = claims.Email
}
if primaryID == "" {
    primaryID = claims.Sub
}
c.Set("userID", SanitizeUserID(primaryID))
c.Set("userIDOriginal", primaryID)

MINOR — SSO middleware falls through to X-Forwarded-* headers after JWT failure

File: components/backend/server/server.go, forwardedIdentityMiddleware

When SSO is enabled and JWT validation fails (e.g., the request carries an API key that is not a Keycloak JWT), the middleware logs the failure and falls through to the legacy OAuth proxy path:

// JWT validation failed — fall through to header-based extraction
// (API keys will be handled by getK8sClientsDefault via TokenReview)
// [falls through to reading X-Forwarded-User, X-Forwarded-Email, etc.]

The comment is correct — API key identity is re-established by setIdentityFromTokenReview inside getK8sClientsDefault, which overwrites whatever the middleware set. So in practice no handler sees wrong identity. But the middleware shouldn't trust X-Forwarded-* headers at all when SSO is the active auth mode — those headers belonged to the OpenShift OAuth proxy era and should be treated as untrusted in the SSO path.

The minimal fix is a guarded return:

if SSOEnabled() {
    // JWT failed AND we're in SSO mode — don't fall through to OAuth proxy headers.
    // Identity will be established by tokenReviewIdentity in getK8sClientsDefault.
    c.Next()
    return
}

This is defense-in-depth: it closes the theoretical path where a direct-to-backend caller sets crafted X-Forwarded-* headers alongside an API key token.

MINOR — OIDC authorization request only asks for `openid` scope

File: components/frontend/src/lib/oidc.ts, buildAuthorizationUrl

scope: "openid",

The code later depends on email and preferred_username claims. These are delivered by Keycloak because email and profile are in the client's defaultClientScopes, but the request doesn't ask for them explicitly. If the Keycloak admin ever changes the default scopes, the claims silently disappear and auth degrades in a hard-to-diagnose way. Should be:

scope: "openid email profile",

MINOR — `getAccessToken()` doesn't persist refreshed `idToken`

File: components/frontend/src/lib/session.ts

refreshOIDCTokens returns a new idToken (Keycloak always issues a new one on refresh). The refresh path saves accessToken, refreshToken, and expiresAt but drops idToken:

session.accessToken  = tokens.accessToken;
session.refreshToken = tokens.refreshToken;
// session.idToken not updated here
session.expiresAt    = tokens.expiresAt;

The old idToken from initial login continues to work as id_token_hint at logout (Keycloak accepts it), so this isn't broken today. But Keycloak rotates session keys on refresh, and some configurations invalidate old ID tokens. Add session.idToken = tokens.idToken to the refresh block for correctness.

MINOR — `ambient-cli` client lacks an audience mapper

File: components/manifests/overlays/hcmais/keycloak/keycloak-realm.json (and Kind equivalent)

The ambient-frontend client has a protocol mapper that adds ambient-frontend to the aud claim of access tokens. The ambient-cli client has no such mapper.

The backend validates JWT audience against SSO_AUDIENCE (which is ambient-frontend). A user who authenticates via the CLI client will receive a token without aud: ambient-frontend, and the backend will reject it with a validation error. If the CLI is intended to reach the backend in Phase 1 (or Phase 2), this needs an audience mapper on the CLI client. If the CLI is out of scope for now, add a comment in the realm JSON noting this.

MINOR — `new URL(SSO_ISSUER_URL)` at module init in Keycloak proxy

File: components/frontend/src/app/sso/[...path]/route.ts

const SSO_ORIGIN = process.env.SSO_ISSUER_URL
  ? new URL(process.env.SSO_ISSUER_URL).origin
  : null;

This executes at module load time. If SSO_ISSUER_URL is set to a malformed value (missing protocol, trailing space, etc.), new URL() throws a TypeError that propagates as an unhandled module-load error, crashing the Next.js worker. Wrap it:

function parseSSOOrigin(): string | null {
  try {
    return process.env.SSO_ISSUER_URL
      ? new URL(process.env.SSO_ISSUER_URL).origin
      : null;
  } catch {
    console.error('SSO_ISSUER_URL is not a valid URL:', process.env.SSO_ISSUER_URL);
    return null;
  }
}
const SSO_ORIGIN = parseSSOOrigin();

Nits

Keycloak proxy Location rewrite uses String.replace (first match only):

value.replace(SSO_ORIGIN, request.nextUrl.origin + "/sso")

String.replace with a string argument replaces only the first occurrence. Use replaceAll or a global regex. Unlikely to matter in practice (Location headers don't repeat the origin), but correct API is replaceAll.

hcmais realm client secret in keycloak-realm.json is CHANGE_ME:

If the realm JSON is imported into Keycloak as-is (which the --import-realm flag does on startup), the ambient-frontend Keycloak client secret will be set to CHANGE_ME, separate from the SSO_CLIENT_SECRET in the K8s secret. Both must be updated together. The kustomization now has a comment about this; consider adding a line specifically calling out the realm JSON client secret requirement: "secret": "<must match SSO_CLIENT_SECRET>".

Overall assessment

The core implementation is solid. The JWT validator, JWKS cache, K8s impersonation, BFF OIDC session model, and E2E bypass guards are all well-designed. The two major findings (groups scope gap and identity precedence inconsistency) should be addressed before this goes to production — they don't block the PoC deployment but will cause real RBAC failures when users with group-based access controls hit the system.

The five minor items are fixes, not blockers for the PoC. Requesting changes on the groups scope and identity precedence; everything else is advisory.

iron-session's cookie limit (4096 bytes) is exceeded when idToken is included alongside accessToken and refreshToken — especially with Keycloak identity brokering which produces large tokens. Store idToken in a dedicated httpOnly oidc_id_token cookie. Logout reads it from there for id_token_hint. Session type no longer includes idToken. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- Redirect post-logout to /logged-out instead of app root (which triggers auto-redirect back to SSO, re-authenticating silently) - Add /logged-out page excluded from SSO middleware - Include ssoEnabled in /api/me response even when unauthenticated so the logout button always uses the correct path Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…s scope, proxy hardening Align setIdentityFromClaims with buildImpersonatingClients claim order (preferred_username > email > sub) so userID and K8s impersonation identity are consistent. Guard X-Forwarded-* header fallthrough when SSO is enabled. Add groups clientScope and audience mapper to both Kind and hcmais Keycloak realms. Request explicit OIDC scopes, persist refreshed idToken, and harden the Keycloak proxy init. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

markturansky · 2026-05-21T22:15:01Z

Fourth-pass review — amber

Reviewing 5 commits since 05bb922e (third pass):

235bc38 — exclude sso-credentials from kustomize
f00fbaf — idToken in separate cookie to avoid iron-session 4096-byte limit
6346433 — fix logout redirect loop, add /logged-out landing page
5da471a — address all third-pass findings (identity precedence, groups scope, proxy hardening)
f23c8bc — use Next.js Link on logged-out page

✅ All third-pass findings resolved

Finding	Status
[MAJOR] Identity claim precedence inconsistency (`setIdentityFromClaims` vs `buildImpersonatingClients`)	Fixed — both now `preferred_username > email > sub` (`server.go:188-197`)
[MAJOR] hcmais realm missing `groups` clientScope + `oidc-group-membership-mapper`	Fixed — `clientScopes` block added with mapper; both clients have `groups` in `defaultClientScopes`
[MINOR] X-Forwarded-* fallthrough when SSO is enabled and JWT fails	Fixed — `server.go:133-137`: explicit early return when JWT validation fails in SSO mode
[MINOR] OIDC scopes only requested `openid`	Fixed — `oidc.ts:55`: `scope: "openid email profile"`
[MINOR] `getAccessToken()` didn't persist refreshed idToken	Fixed — `session.ts:51-57`: refreshed idToken stored to `oidc_id_token` cookie
[MINOR] ambient-cli lacked audience mapper	Fixed — audience mapper added in both Kind and hcmais realm JSON
[NIT] Location rewrite used `replace` not `replaceAll`	Fixed — `route.ts:60`: `replaceAll`
[NIT] `SSO_ISSUER_URL` parse crash at module init	Fixed — `parseSSOOrigin()` wraps in try/catch, returns `null` → 503

✅ New commits look good

f00fbaf — idToken separate cookie
Correct approach for the 4096-byte iron-session limit. The oidc_id_token cookie is httpOnly, secure in production, sameSite: lax, maxAge: 86400. Logout reads and deletes it before calling Keycloak end_session_endpoint. One note: if token refresh fails and session.destroy() is called in the error path (session.ts:62), the oidc_id_token cookie is not cleared — it will persist up to 24 hours. Low risk (idToken alone doesn't grant platform access), but worth noting.

6346433 — logout redirect loop fix
Root cause correctly identified: logout was redirecting to / which auto-re-authenticated via Keycloak session. Now redirects to /logged-out. Middleware matcher now excludes logged-out (middleware.ts config). Clean fix.

235bc38 — kustomize exclusion of sso-credentials
Same pattern as the CRB fix. Prevents oc apply -k from clobbering manually-set secrets with CHANGE_ME values. Correct.

f23c8bc — Link component
Standard Next.js best practice. Good.

Notes (no action required)

SSO_ORIGIN is still eagerly evaluated at module init — it's now resilient to bad URLs (returns null → 503 instead of crashing), but if SSO_ISSUER_URL is unset at module load time, the proxy will serve 503 until the module is reloaded. Non-issue in K8s (env vars are set at pod start), but if this ever runs in a hot-reload or serverless context, the lazy init pattern (evaluate inside the handler) would be more robust.
X-Forwarded- still trusted in SSO mode when no Bearer token is present* — This is the intended API key / SA token path. My original concern was about JWT failure fallthrough, which is now fixed. When there's no token at all in SSO mode, X-Forwarded-* → TokenReview is the correct fallback chain.

Verdict: LGTM ✅

All major and critical findings across four review passes are resolved. The implementation is sound: PKCE + iron-session BFF, consistent identity claim ordering, groups propagation to Keycloak JWTs, proper X-Forwarded-* isolation in SSO mode, and clean logout flow. The draft status can be lifted when ready.

Amber — hcmai-01

Trailing space removal, line ending normalization, PEP 8 formatting fixes, and unused import cleanup. No functional changes. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…r pattern Use /projects prefix with /{id}/credentials at the route level instead of baking {id} into the PathPrefix, matching the standard gorilla/mux pattern used by other project-scoped resources. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…bient-code/platform into jsell/spec/sso-authentication

Replace jQuery snapshot approach ($inner.find + cy.wrap) with Cypress retry-able queries (cy.get) and wait for loading skeletons to settle before interacting with inputs. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

coderabbitai

Actionable comments posted: 1

Note

Due to the large number of review comments, Critical severity comments were prioritized as inline comments.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

components/backend/server/server.go (1)

138-177: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Stop trusting X-Forwarded-* identity headers in SSO mode.

When SSO is enabled, requests without a validated JWT still fall through to the legacy header path and populate userID, userName, userEmail, and userGroups from caller-supplied headers. In the direct-JWT deployment path those headers are not authenticated, so a client can spoof identity metadata for any downstream handler that reads Gin context before GetK8sClientsForRequest() runs.

Suggested change

 func forwardedIdentityMiddleware() gin.HandlerFunc {
 	return func(c *gin.Context) {
-		// SSO path: extract identity from JWT Bearer token
-		if (featureflags.IsEnabled("sso-authentication") || os.Getenv("SSO_ENABLED") == "true") && JWTValidator != nil {
+		ssoEnabled := featureflags.IsEnabled("sso-authentication") || os.Getenv("SSO_ENABLED") == "true"
+
+		// SSO path: extract identity from JWT Bearer token
+		if ssoEnabled && JWTValidator != nil {
 			if token := extractBearerToken(c); token != "" {
 				if claims, err := JWTValidator.Validate(token); err == nil {
 					setIdentityFromClaims(c, claims)
@@
 				return
 			}
 		}
+
+		if ssoEnabled {
+			if auth := c.GetHeader("Authorization"); auth != "" {
+				c.Set("authorizationHeader", auth)
+			}
+			c.Next()
+			return
+		}
 
 		// Legacy path: extract identity from OAuth proxy headers

As per coding guidelines, Flag missing authentication/authorization on API endpoints.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@components/backend/server/server.go` around lines 138 - 177, When SSO is
enabled (featureflags.IsEnabled("sso-authentication") ||
os.Getenv("SSO_ENABLED") == "true") and JWTValidator != nil, do not populate
legacy identity from X-Forwarded-* if there's no bearer token or
JWTValidator.Validate fails; change the middleware around extractBearerToken /
JWTValidator.Validate so that on missing/invalid JWT you skip the legacy header
block entirely (do not set "userID","userName","userEmail","userGroups" from
headers) and instead mark the request as unauthenticated (e.g., set a context
key like "ssoUnauthenticated" or abort with 401) so downstream handlers (before
GetK8sClientsForRequest / tokenReviewIdentity) cannot be spoofed; keep
setIdentityFromClaims only for the successful branch where JWTValidator.Validate
returns claims.

🟠 Major comments (22)

components/ambient-api-server/plugins/roleBindings/migration.go-35-35 (1)
35-35: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Propagate rollback failures instead of discarding them.

Rollback currently drops multiple errors, which can leave schema state partially reverted without surfacing failure. Return contextual errors per step (or collect and return joined errors) rather than ignoring them.
Suggested fix
 import (
+	"fmt"
 	"gorm.io/gorm"

 	"github.com/go-gormigrate/gormigrate/v2"
 	"github.com/openshift-online/rh-trex-ai/pkg/db"
 )
@@
 		Rollback: func(tx *gorm.DB) error {
-			_ = tx.Exec(`DROP INDEX IF EXISTS idx_role_bindings_credential_id`).Error
-			_ = tx.Exec(`DROP INDEX IF EXISTS idx_role_bindings_session_id`).Error
-			_ = tx.Exec(`DROP INDEX IF EXISTS idx_role_bindings_agent_id`).Error
-			_ = tx.Exec(`DROP INDEX IF EXISTS idx_role_bindings_project_id`).Error
-			_ = tx.Exec(`ALTER TABLE role_bindings DROP COLUMN IF EXISTS credential_id`).Error
-			_ = tx.Exec(`ALTER TABLE role_bindings DROP COLUMN IF EXISTS session_id`).Error
-			_ = tx.Exec(`ALTER TABLE role_bindings DROP COLUMN IF EXISTS agent_id`).Error
-			_ = tx.Exec(`ALTER TABLE role_bindings DROP COLUMN IF EXISTS project_id`).Error
-			_ = tx.Exec(`ALTER TABLE role_bindings ALTER COLUMN user_id SET NOT NULL`).Error
-			return tx.Exec(`ALTER TABLE role_bindings ADD COLUMN IF NOT EXISTS scope_id TEXT`).Error
+			exec := func(op, query string) error {
+				if err := tx.Exec(query).Error; err != nil {
+					return fmt.Errorf("%s: %w", op, err)
+				}
+				return nil
+			}
+			if err := exec("drop idx_role_bindings_credential_id", `DROP INDEX IF EXISTS idx_role_bindings_credential_id`); err != nil { return err }
+			if err := exec("drop idx_role_bindings_session_id", `DROP INDEX IF EXISTS idx_role_bindings_session_id`); err != nil { return err }
+			if err := exec("drop idx_role_bindings_agent_id", `DROP INDEX IF EXISTS idx_role_bindings_agent_id`); err != nil { return err }
+			if err := exec("drop idx_role_bindings_project_id", `DROP INDEX IF EXISTS idx_role_bindings_project_id`); err != nil { return err }
+			if err := exec("drop credential_id column", `ALTER TABLE role_bindings DROP COLUMN IF EXISTS credential_id`); err != nil { return err }
+			if err := exec("drop session_id column", `ALTER TABLE role_bindings DROP COLUMN IF EXISTS session_id`); err != nil { return err }
+			if err := exec("drop agent_id column", `ALTER TABLE role_bindings DROP COLUMN IF EXISTS agent_id`); err != nil { return err }
+			if err := exec("drop project_id column", `ALTER TABLE role_bindings DROP COLUMN IF EXISTS project_id`); err != nil { return err }
+			if err := exec("restore user_id not-null", `ALTER TABLE role_bindings ALTER COLUMN user_id SET NOT NULL`); err != nil { return err }
+			return exec("restore scope_id column", `ALTER TABLE role_bindings ADD COLUMN IF NOT EXISTS scope_id TEXT`)
 		},
As per coding guidelines, "Never silently swallow partial failures; every error path must propagate or be explicitly collected, never discarded".

Also applies to: 77-87
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@components/ambient-api-server/plugins/roleBindings/migration.go` at line 35,
The Rollback implementation for migration ID "202603100139" currently discards
errors from individual rollback steps (see Rollback and the block around lines
~77-87); update Rollback to collect all encountered errors instead of ignoring
them and return a single contextual error (e.g., using errors.Join or a
multi-error pattern) that includes which step failed and its error; ensure each
call in the rollback sequence (the same functions used in Up/Down) appends its
error with context and that Rollback returns the combined error if any failures
occurred.
components/manifests/overlays/kind/keycloak-realm.json-14-14 (1)
14-14: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Remove hardcoded client secrets and user passwords from realm seed

Lines 14, 128, 213-214, and 231-232 commit live credentials into source. Even for dev overlays, these values are easy to leak via repo clones/artifacts and easy to accidentally reuse.

Use externally managed secrets (generated per environment) and inject them at import time instead of storing plaintext in the realm JSON.

Also applies to: 128-128, 213-214, 231-232
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@components/manifests/overlays/kind/keycloak-realm.json` at line 14, The realm
JSON currently contains hardcoded credentials (e.g., the client "secret" field
and user "credentials"/"password" entries) which must be removed; replace those
literal values with external secret references or placeholders (for example: a
templated variable like "${KEYCLOAK_CLIENT_SECRET}" or a reference that your
deployment tooling can substitute from a Kubernetes Secret) and update any
import/manifest tooling to inject the real secrets at deploy/import time rather
than committing plaintext; search for the "secret" field and any
"credentials"/"password" entries in keycloak-realm.json (and similar occurrences
around the earlier referenced blocks) and change them to use environment/secret
injection placeholders and document the required secret names for your deploy
pipeline.
components/manifests/base/rbac/backend-clusterrole.yaml-95-98 (1)
95-98: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Tighten impersonation RBAC scope to least privilege

Line 97 includes serviceaccounts, but the SSO impersonation path uses only username + groups (components/backend/handlers/sso.go Line 35-Line 70). Keeping serviceaccounts here broadens impersonation authority without a clear requirement.
Suggested change
 - apiGroups: [""]
-  resources: ["users", "groups", "serviceaccounts"]
+  resources: ["users", "groups"]
   verbs: ["impersonate"]
As per coding guidelines components/manifests/**/*.yaml: "RBAC must follow least-privilege."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@components/manifests/base/rbac/backend-clusterrole.yaml` around lines 95 -
98, The impersonation RBAC currently allows impersonating "serviceaccounts"
which is unnecessary for the SSO flow and violates least-privilege; update the
ClusterRole resource list in backend-clusterrole.yaml by removing
"serviceaccounts" from the resources array so only "users" and "groups" remain
(confirm this aligns with the SSO handler in components/backend/handlers/sso.go
that only uses username + groups), then validate the manifest and tests to
ensure no other code relies on serviceaccount impersonation.
components/manifests/overlays/kind/api-server-security-patch.yaml-8-9 (1)
8-9: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Do not force API server pod to run as root

Line 9 sets runAsNonRoot: false, which weakens pod security and violates the restricted security context requirement.
Suggested hardening patch
 apiVersion: apps/v1
 kind: Deployment
 metadata:
   name: ambient-api-server
 spec:
   template:
     spec:
       securityContext:
-        runAsNonRoot: false
+        runAsNonRoot: true
+      containers:
+        - name: ambient-api-server
+          securityContext:
+            runAsNonRoot: true
+            allowPrivilegeEscalation: false
+            readOnlyRootFilesystem: true
+            capabilities:
+              drop: ["ALL"]
As per coding guidelines **/manifests/**/*.{yaml,yml}: "All containers must have restricted SecurityContext: runAsNonRoot, drop ALL capabilities, readOnlyRootFilesystem."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@components/manifests/overlays/kind/api-server-security-patch.yaml` around
lines 8 - 9, The manifest currently sets securityContext.runAsNonRoot: false
which forces the API server pod to run as root; change this to a restricted
securityContext by setting runAsNonRoot: true (and add a non-root UID via
runAsUser, e.g., 1000), and harden the container-level securityContext by
dropping all capabilities (capabilities.drop: ["ALL"]) and enabling
readOnlyRootFilesystem: true; update the relevant securityContext blocks in the
API server Pod/Deployment spec (look for the securityContext and container
entries for the API server) to include runAsNonRoot, runAsUser,
capabilities.drop, and readOnlyRootFilesystem.
components/manifests/overlays/kind/sso-credentials.yaml-1-7 (1)
1-7: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Secret resource is missing ownerReferences.

This new Secret is created without metadata.ownerReferences, which violates the manifest ownership rule and can leave orphaned cross-env credentials behind on teardown.

As per coding guidelines, **/{k8s,kubernetes,manifests,deploy,config}/**/*.{yaml,yml,json}: “Flag child Kubernetes resources (Jobs, Secrets, PVCs) missing OwnerReferences”.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@components/manifests/overlays/kind/sso-credentials.yaml` around lines 1 - 7,
The Secret resource "sso-credentials" is missing metadata.ownerReferences;
update the Secret manifest to add a metadata.ownerReferences entry pointing to
its owning resource (include apiVersion, kind, name matching the parent
resource, and the owner's uid), and set controller: true and blockOwnerDeletion:
true so Kubernetes will garbage-collect the Secret when the parent is removed;
if the owner UID is supplied dynamically, wire it through the templating system
(Helm/Kustomize/whatever generates this manifest) so the ownerReferences are
populated at render time.
components/manifests/overlays/kind/sso-credentials.yaml-8-14 (1)
8-14: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Remove committed secret values from manifest source.

Line 12 and Line 14 embed live-looking secret material directly in Git (SSO_CLIENT_SECRET, SESSION_SECRET). Even in dev overlays, this tends to leak into logs/forks and gets reused unintentionally. Use generated/externally managed secret injection (e.g., SealedSecret/ExternalSecret/SOPS) and keep only non-sensitive placeholders in tracked YAML.

As per coding guidelines, **/*.{go,ts,tsx,js,jsx} includes “Flag secrets/tokens logged in plaintext or hardcoded in source code”.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@components/manifests/overlays/kind/sso-credentials.yaml` around lines 8 - 14,
The manifest currently commits plaintext secrets for SSO_CLIENT_SECRET and
SESSION_SECRET in stringData; remove these hardcoded values and replace them
with non-sensitive placeholders (e.g., "<REPLACE_WITH_SECRET>") or leave the
keys empty, and switch to external secret management
(SealedSecret/ExternalSecret/SOPS or injection at deploy time) so secrets are
generated/loaded securely; update any deployment/helm/CI templates that
reference SSO_CLIENT_SECRET or SESSION_SECRET to pull from the chosen secret
provider and ensure the manifest only contains the SSO_* keys as placeholders,
not real secret material.
components/manifests/overlays/hcmais/jsell-sso-poc/control-plane-env-patch.yaml-1-30 (1)
1-30: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Add restricted SecurityContext to container specs.

Both control-plane-env-patch.yaml and frontend-sso-patch.yaml lack container security context. As per coding guidelines, all containers in manifests must have restricted SecurityContext: runAsNonRoot: true, drop ALL capabilities, and readOnlyRootFilesystem: true.
🔒 Proposed fix for control-plane-env-patch.yaml
       containers:
         - name: ambient-control-plane
+          securityContext:
+            runAsNonRoot: true
+            readOnlyRootFilesystem: true
+            allowPrivilegeEscalation: false
+            capabilities:
+              drop:
+                - ALL
           env:
🔒 Proposed fix for frontend-sso-patch.yaml
       containers:
       - name: frontend
+        securityContext:
+          runAsNonRoot: true
+          readOnlyRootFilesystem: true
+          allowPrivilegeEscalation: false
+          capabilities:
+            drop:
+              - ALL
         env:
As per coding guidelines: "All containers must have restricted SecurityContext: runAsNonRoot, drop ALL capabilities, readOnlyRootFilesystem"
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@components/manifests/overlays/hcmais/jsell-sso-poc/control-plane-env-patch.yaml`
around lines 1 - 30, Add a restricted securityContext to the
ambient-control-plane container spec: under the container entry for name
"ambient-control-plane" in the Deployment template, add securityContext with
runAsNonRoot: true, capabilities: { drop: ["ALL"] } and readOnlyRootFilesystem:
true; repeat the same for the corresponding container in the frontend SSO patch
(container name used there) to ensure both manifests include the required
restricted SecurityContext.
components/ambient-api-server/cmd/ambient-api-server/environments/e_production.go-31-37 (1)
31-37: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

CLI precedence is not preserved when flag is explicitly set to the default URL.

Current check treats “explicitly set to default” the same as “not set”, so JWK_CERT_URL can override an explicit CLI value. That contradicts the intended flag > env > default ordering for a security-critical trust endpoint.

Use an explicit “flag was set” signal from the flag parser (instead of comparing the value to defaultJwkCertURL) and branch on that boolean in OverrideConfig.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@components/ambient-api-server/cmd/ambient-api-server/environments/e_production.go`
around lines 31 - 37, The switch in OverrideConfig that checks c.Auth.JwkCertURL
against defaultJwkCertURL allows an explicitly-provided CLI default to be
overridden by the JWK_CERT_URL env var; change the logic to use the
flag-parser's "was set" boolean (the explicit flag-set signal) for JwkCertURL
instead of comparing to defaultJwkCertURL: if the CLI flag for JwkCertURL was
set, keep c.Auth.JwkCertURL as provided; else if os.Getenv("JWK_CERT_URL") != ""
set c.Auth.JwkCertURL from the env var; otherwise set it to
defaultJwkCertURL—update the branching in OverrideConfig and reference
c.Auth.JwkCertURL, defaultJwkCertURL, and the JWK_CERT_URL env var accordingly.
components/manifests/overlays/hcmais/keycloak/keycloak-db.yaml-4-25 (1)
4-25: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Add controller ownerReferences for child resources (Secret and PersistentVolumeClaim).

keycloak-db Secret and keycloak-db-data PVC are created without metadata.ownerReferences. In this repo, child resources are required to be owner-scoped; otherwise cleanup and lifecycle safety are not guaranteed.

As per coding guidelines, "/manifests//*.{yaml,yml}: All child resources (Jobs, Secrets, PVCs) must have OwnerReferences set with controller owner refs".
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@components/manifests/overlays/hcmais/keycloak/keycloak-db.yaml` around lines
4 - 25, The Secret named "keycloak-db" and the PVC "keycloak-db-data" lack
metadata.ownerReferences; add an ownerReferences array to both resources with
controller: true and the parent controller's apiVersion, kind, name and uid so
they are owner-scoped and garbage-collected with the parent; ensure the owner
reference entries include fields: apiVersion, kind, name, uid and controller:
true, matching the controller resource that manages these children (e.g., the
Keycloak/custom controller CR).
components/manifests/overlays/hcmais/keycloak/keycloak-deployment.yaml-81-85 (1)
81-85: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Harden container filesystem to read-only.

Security context is missing readOnlyRootFilesystem: true, which is required by manifest policy.
Proposed patch
           securityContext:
             allowPrivilegeEscalation: false
+            readOnlyRootFilesystem: true
             capabilities:
               drop:
                 - ALL
As per coding guidelines, "/manifests//*.{yaml,yml}: All containers must have restricted SecurityContext: runAsNonRoot, drop ALL capabilities, readOnlyRootFilesystem".
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@components/manifests/overlays/hcmais/keycloak/keycloak-deployment.yaml`
around lines 81 - 85, The container's SecurityContext block (securityContext /
allowPrivilegeEscalation / capabilities.drop) is missing readOnlyRootFilesystem;
update the SecurityContext for the affected container to include
readOnlyRootFilesystem: true alongside the existing allowPrivilegeEscalation:
false and capabilities.drop: [ALL], and ensure runAsNonRoot is set if not
already (add runAsNonRoot: true) so the manifest meets the restricted
SecurityContext policy.
components/backend/server/k8s.go-72-93 (1)
72-93: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Fail fast when SSO is enabled but JWT validation is unavailable.

This currently logs and continues, so SSO_ENABLED=true can boot with JWTValidator == nil and leave auth half-configured at runtime. Return an error here and abort startup when SSO is on but discovery/JWKS initialization fails.
Suggested change
-func InitJWTValidator() {
+func InitJWTValidator() error {
 	issuerURL := os.Getenv("SSO_ISSUER_URL")
 	audience := os.Getenv("SSO_AUDIENCE")
 	if issuerURL == "" {
+		if os.Getenv("SSO_ENABLED") == "true" {
+			return fmt.Errorf("SSO_ENABLED=true but SSO_ISSUER_URL is not set")
+		}
 		log.Printf("SSO: JWT validation not configured (SSO_ISSUER_URL not set)")
-		return
+		return nil
 	}

 	v, err := jwtauth.NewValidator(issuerURL, audience)
 	if err != nil {
-		log.Printf("SSO: failed to initialize JWT validator: %v", err)
-		return
+		return fmt.Errorf("failed to initialize JWT validator: %w", err)
 	}
@@
 	JWTValidator = v
 	log.Printf("SSO: JWT validator initialized (issuer=%s, audience=%s)", issuerURL, audience)
+	return nil
 }
if err := server.InitJWTValidator(); err != nil {
	log.Fatalf("Failed to initialize JWT validator: %v", err)
}
As per coding guidelines, Never silently swallow partial failures; every error path must propagate or be explicitly collected, never discarded.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@components/backend/server/k8s.go` around lines 72 - 93, InitJWTValidator
currently swallows errors and leaves JWTValidator nil; change its signature to
return an error (e.g., func InitJWTValidator() error), propagate and return
errors when issuer missing while SSO is enabled or when jwtauth.NewValidator
fails, and ensure it does not just log and continue; also update callers to
check the returned error and abort startup (e.g., log.Fatalf) when
InitJWTValidator returns non-nil so SSO_ENABLED cannot boot with a nil
JWTValidator; reference symbols: InitJWTValidator, JWTValidator,
jwtauth.NewValidator, SSO_ISSUER_URL, SSO_PUBLIC_ISSUER_URL.
components/frontend/src/lib/session.ts-10-12 (1)
10-12: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Remove the hardcoded session secret fallback.

Using a static default secret makes session cookies forgeable whenever SESSION_SECRET is unset.
Suggested fix
 const sessionOptions: SessionOptions = {
-  password: process.env.SESSION_SECRET || "dev-session-secret-must-be-at-least-32-chars-long",
+  password: (() => {
+    const secret = process.env.SESSION_SECRET;
+    if (!secret || secret.length < 32) {
+      throw new Error("SESSION_SECRET must be set and at least 32 characters");
+    }
+    return secret;
+  })(),
As per coding guidelines, "Flag secrets/tokens logged in plaintext or hardcoded in source code".
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@components/frontend/src/lib/session.ts` around lines 10 - 12, The
sessionOptions currently falls back to a hardcoded secret in the password field
which is insecure; remove the default string and require
process.env.SESSION_SECRET to be present, and fail fast if it's missing (e.g.,
throw an error or assert during initialization) so the app never runs with a
static secret; update the code that constructs sessionOptions (the password
property) to reference only process.env.SESSION_SECRET and add a startup check
that validates SESSION_SECRET is set and has sufficient length before creating
sessionOptions or calling any session-related functions.
components/frontend/src/lib/oidc.ts-29-33 (1)
29-33: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Add a timeout to OIDC discovery, and prevent HTTP issuers from bypassing HTTPS-only checks
fetch(wellKnownUrl) has no timeout, so auth can stall on network issues.
Suggested fix
const resp = await fetch(wellKnownUrl);
const resp = await fetch(wellKnownUrl, {

signal: AbortSignal.timeout(5000),

});
</details>
- `client.allowInsecureRequests(cachedConfig)` is enabled whenever `SSO_ISSUER_URL` is `http:`, which disables the library’s HTTPS-only restriction for discovery and subsequent OIDC calls—reject non-HTTPS in non-local environments or gate this behind an explicit dev-only opt-in.

<details>
<summary>🤖 Prompt for AI Agents</summary>
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @components/frontend/src/lib/oidc.ts around lines 29 - 33, The OIDC discovery
currently uses fetch(wellKnownUrl) with no timeout and unconditionally enables
client.allowInsecureRequests(cachedConfig) for http issuers; update the
discovery to use an AbortController timeout (e.g., configurable default like 5s)
around the fetch(wellKnownUrl) so network stalls abort, and validate the issuer
URL (wellKnownUrl or SSO_ISSUER_URL) to reject plain "http:" in non-local/test
environments—only call client.allowInsecureRequests(cachedConfig) when an
explicit dev-only opt-in flag (e.g., SSO_ALLOW_INSECURE=true) is set or when
running in a recognized local env; otherwise throw an error and avoid disabling
HTTPS-only checks. Use the existing wellKnownUrl, resp and metadata variables
and the client.allowInsecureRequests(cachedConfig) symbol to locate where to
apply these changes.
</details>

</blockquote></details>
<details>
<summary>components/frontend/src/lib/auth.ts-184-186 (1)</summary><blockquote>

`184-186`: _⚠️ Potential issue_ | _🟠 Major_ | _⚡ Quick win_

**Preserve token header contract in SSO forwarding path.**

`buildForwardHeadersSSO` sets `Authorization` but not `X-Forwarded-Access-Token`. This diverges from the existing forwarding contract and can break backend paths still reading the forwarded token header.

<details>
<summary>Suggested fix</summary>

```diff
   if (accessToken) {
+    headers['X-Forwarded-Access-Token'] = accessToken;
     headers['Authorization'] = `Bearer ${accessToken}`;
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@components/frontend/src/lib/auth.ts` around lines 184 - 186, The SSO
forwarding function buildForwardHeadersSSO sets headers['Authorization'] but
omits the legacy X-Forwarded-Access-Token header; update buildForwardHeadersSSO
so that whenever you add headers['Authorization'] = `Bearer ${accessToken}` you
also set headers['X-Forwarded-Access-Token'] = accessToken to preserve the
existing forwarding contract and ensure backends that read
X-Forwarded-Access-Token continue to receive the raw token.

components/frontend/src/app/api/auth/sso/e2e-login/route.ts-9-14 (1)

9-14: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Harden production blocking for the unauthenticated E2E login endpoint.

This route creates sessions from arbitrary tokens without user auth, so env-gating must be fail-closed. Relying only on AMBIENT_ENV is brittle if unset/mis-set in production runtime.

Suggested fix

+  const ambientEnv = (process.env.AMBIENT_ENV ?? "").toLowerCase();
   if (
     process.env.E2E_TEST_HELPERS !== "true" ||
-    process.env.AMBIENT_ENV === "production"
+    ambientEnv === "production" ||
+    process.env.NODE_ENV === "production"
   ) {
     return NextResponse.json({ error: "Not available" }, { status: 404 });
   }

As per coding guidelines, Flag missing authentication/authorization on API endpoints.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@components/frontend/src/app/api/auth/sso/e2e-login/route.ts` around lines 9 -
14, The current allowlist for the unauthenticated E2E login endpoint is too
permissive; change the fail-open check in route.ts so the endpoint is strictly
disabled unless multiple explicit test-safe signals are present. Replace the
single if that checks process.env.E2E_TEST_HELPERS and process.env.AMBIENT_ENV
with a fail-closed guard that requires (1) process.env.E2E_TEST_HELPERS ===
"true", (2) process.env.NODE_ENV !== "production" (or NODE_ENV ===
"development"), (3) process.env.AMBIENT_ENV !== "production", and (4) a
dedicated secret like process.env.E2E_TEST_SECRET is set and non-empty; if any
of those fail, return the 404/Not available response. This ensures the
conditional around the route in route.ts only enables the handler when multiple
independent test-only signals are present.

components/frontend/src/components/navigation.tsx-31-35 (1)

31-35: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Logout fallback misroutes unresolved users to legacy sign-out.

Line 31 treats me?.ssoEnabled unknown as false, so initial-load or /me error cases go to /oauth/sign_out. In SSO mode this can fail logout. Default unknown to SSO (legacy only on explicit false), and apply the same fix in components/frontend/src/app/projects/[name]/layout.tsx.

Suggested fix

   const handleLogout = () => {
-    if (me?.ssoEnabled) {
-      window.location.href = '/api/auth/sso/logout';
-    } else {
-      window.location.href = '/oauth/sign_out';
-    }
+    const logoutPath =
+      me?.ssoEnabled === false ? '/oauth/sign_out' : '/api/auth/sso/logout';
+    window.location.href = logoutPath;
   };

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@components/frontend/src/components/navigation.tsx` around lines 31 - 35, The
logout logic currently treats me?.ssoEnabled unknown as false and sends users to
the legacy '/oauth/sign_out'; change the condition to only use the legacy path
when ssoEnabled is explicitly false (e.g. check me?.ssoEnabled === false) and
otherwise default to the SSO logout URL ('/api/auth/sso/logout'), and apply this
same fix in both the Navigation logout code (where me and me?.ssoEnabled are
used) and the analogous logout block in the projects layout component
(components/frontend/src/app/projects/[name]/layout.tsx).

components/frontend/src/app/sso/[...path]/route.ts-37-43 (1)

37-43: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Stop forwarding sensitive inbound headers to Keycloak.

This forwards nearly all request headers upstream, including potentially cookie and authorization. That can leak frontend session material to the IdP proxy target.

Suggested fix

-  const headers = new Headers();
-  for (const [key, value] of request.headers.entries()) {
-    if (key === "host" || key === "connection") continue;
-    headers.set(key, value);
-  }
+  const headers = new Headers();
+  const blockedHeaders = new Set([
+    "host",
+    "connection",
+    "cookie",
+    "authorization",
+    "x-forwarded-access-token",
+  ]);
+
+  for (const [key, value] of request.headers.entries()) {
+    if (blockedHeaders.has(key.toLowerCase())) continue;
+    headers.set(key, value);
+  }
   headers.set("host", target.host);

As per coding guidelines, "Flag sensitive data leaked in API responses, WebSocket messages, or logs" and "Flag secrets/tokens logged in plaintext or hardcoded in source code."

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@components/frontend/src/app/sso/`[...path]/route.ts around lines 37 - 43, The
current header-copy loop in the SSO proxy blindly forwards inbound headers
(using request.headers -> headers) which may leak sensitive values like Cookie
or Authorization; modify the loop in route.ts to skip forwarding sensitive
headers (at minimum "cookie" and "authorization", plus hop-by-hop headers such
as "connection", "keep-alive", "proxy-authenticate", "proxy-authorization",
"te", "trailers", "transfer-encoding", "upgrade") and only set safe headers
(retain headers.set("host", target.host) as needed); ensure the code references
the existing headers variable and request.headers iteration and explicitly
excludes those header names before calling headers.set.

Makefile-1126-1137 (1)

1126-1137: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

kind-sso-toggle can report success after partial failure.

Lines 1126–1137 ignore failures for Unleash toggle and backend rollout (|| true), then print “Done.” This makes the target nondeterministic and hard to trust.

As per coding guidelines "Handle errors and edge cases explicitly rather than ignoring them".
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@Makefile` around lines 1126 - 1137, The kind-sso-toggle Makefile target
currently swallows failures (uses "|| true") for the Unleash curl toggle and
backend rollout, then prints "Done", which hides partial failures; update the
target (kind-sso-toggle) to treat those commands as fatal by removing "|| true"
for the curl POST to Unleash and the "kubectl rollout status
deployment/backend-api" (and/or explicitly check their exit codes), and add
explicit error handling/logging so the make target exits non‑zero and prints a
clear error message if the Unleash toggle (curl -X POST ...) or the backend
rollout check (kubectl rollout status deployment/backend-api) fails before
printing the final "Done" message.

e2e/scripts/extract-token.sh-33-50 (1)

33-50: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

SSO mode should fail if Keycloak token acquisition fails.

Lines 33–50 fall back to service-account token even when E2E_USE_SSO=true, which can make the SSO suite pass without actually exercising Keycloak JWT flow.

As per coding guidelines "Handle errors and edge cases explicitly rather than ignoring them".
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@e2e/scripts/extract-token.sh` around lines 33 - 50, The fallback loop
currently extracts a K8s SA token into TOKEN even when E2E_USE_SSO=true; change
the logic in extract-token.sh so that after attempting Keycloak token
acquisition you check if E2E_USE_SSO is set to "true" (or non-empty SSO flag)
and, if TOKEN is empty, immediately fail with a clear error and non-zero exit
rather than entering the kubectl fallback loop; specifically, update the section
around the TOKEN variable and the for-loop so the script tests E2E_USE_SSO
before attempting to read test-user-token and calls exit 1 with an explanatory
message when SSO was requested but no Keycloak token was obtained.

.github/workflows/e2e.yml-283-295 (1)

283-295: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Treat missing JWT validator initialization as a hard failure in SSO mode.

Line 291 only logs a warning and continues, which allows the SSO phase to proceed without verified JWT validation wiring.

Suggested fix

-          if [ "$i" -eq 15 ]; then
-            echo "WARNING: Backend JWT validator may not have initialized"
-            kubectl logs -n ambient-code -l app=backend-api --tail=20 | grep -i sso || true
-          fi
+          if [ "$i" -eq 15 ]; then
+            echo "ERROR: Backend JWT validator did not initialize in time"
+            kubectl logs -n ambient-code -l app=backend-api --tail=50 | grep -i sso || true
+            exit 1
+          fi

As per coding guidelines "Handle errors and edge cases explicitly rather than ignoring them".

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/e2e.yml around lines 283 - 295, The check that looks for
the "SSO: JWT validator initialized" log currently only prints a warning
("WARNING: Backend JWT validator may not have initialized") and continues;
change this to fail the workflow when running in SSO mode by exiting non‑zero
after emitting diagnostic logs. Locate the loop that greps for "SSO: JWT
validator initialized" (the for loop that echoes "Verifying backend SSO
initialization..." and the conditional that prints the WARNING) and replace the
warning+continue behavior with printing relevant kubectl logs for diagnosis and
calling exit 1 so the pipeline stops on missing JWT validator initialization.

e2e/cypress/support/commands.ts-33-38 (1)

33-38: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Don’t swallow /api/auth/sso/e2e-login failures.

Line 37 sets failOnStatusCode: false but the response is never validated, so SSO session setup can fail silently before cy.visit().

Suggested fix

-    cy.request({
+    cy.request({
       method: 'POST',
       url: '/api/auth/sso/e2e-login',
       body: { token },
-      failOnStatusCode: false,
+      failOnStatusCode: true,
+    }).its('status').should('eq', 200)
-    })
+

As per coding guidelines "Handle errors and edge cases explicitly rather than ignoring them".

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@e2e/cypress/support/commands.ts` around lines 33 - 38, The cy.request call to
'/api/auth/sso/e2e-login' currently uses failOnStatusCode: false and does not
validate the response, which can hide SSO setup failures; update the cy.request
for the SSO e2e-login (the POST with body { token }) to explicitly check the
response (e.g., assert response.status is 200 and/or validate expected response
body properties) and fail the test when the response is not successful (or
remove failOnStatusCode: false so Cypress fails on non-2xx). Ensure the
assertion occurs immediately after the cy.request so any SSO session setup error
is surfaced before cy.visit().

.github/workflows/e2e.yml-268-273 (1)

268-273: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Fail the workflow if the SSO flag cannot be turned on.

Line 272–273 currently ignores failure on the decisive /on call (|| true), so SSO tests may run while the feature is still off, giving invalid coverage.

Suggested fix

-        curl -sf -X POST "http://localhost:4242/api/admin/projects/default/features/sso-authentication/environments/development/on" \
-          -H "Authorization: $UNLEASH_ADMIN_TOKEN" 2>/dev/null || true
+        curl -sf -X POST "http://localhost:4242/api/admin/projects/default/features/sso-authentication/environments/development/on" \
+          -H "Authorization: $UNLEASH_ADMIN_TOKEN" >/dev/null

As per coding guidelines "Handle errors and edge cases explicitly rather than ignoring them".

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/e2e.yml around lines 268 - 273, The /on curl invocation
currently swallows failures with "|| true", so the workflow may continue if the
SSO feature toggle fails to enable; remove the "|| true" fallback from the curl
that posts to
"/api/admin/projects/default/features/sso-authentication/environments/development/on"
(the curl using -H "Authorization: $UNLEASH_ADMIN_TOKEN") so that the job fails
on non-zero exit, or alternatively replace it with an explicit
retry+exit-on-failure sequence that retries the same curl and exits non-zero if
it still fails.

🟡 Minor comments (1)

components/frontend/src/middleware.ts-31-32 (1)

31-32: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Keep query params in returnTo during SSO redirect.

Current redirect only stores pathname, so post-login navigation loses query-driven state.

Suggested fix

-  loginUrl.searchParams.set("returnTo", request.nextUrl.pathname);
+  loginUrl.searchParams.set(
+    "returnTo",
+    `${request.nextUrl.pathname}${request.nextUrl.search}`,
+  );

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@components/frontend/src/middleware.ts` around lines 31 - 32, The redirect
currently stores only request.nextUrl.pathname in
loginUrl.searchParams.set("returnTo", ...), losing query params; update that
call to include the full path and query (e.g., use request.nextUrl.pathname +
request.nextUrl.search or request.nextUrl.href/relative path) so the returnTo
preserves query-driven state before calling NextResponse.redirect(loginUrl).

🧹 Nitpick comments (2)

specs/security/sso-authentication.spec.md (2)

299-322: ⚡ Quick win

Clarify that Keycloak is opt-in, not the default auth mechanism.

Lines 302-303 state Keycloak "replaces" static JWKS/DISABLE_AUTH/OC_TOKEN "as the primary local auth mechanism," but the PR objectives and Kind doc updates show kind-up defaults to legacy auth with make kind-sso-toggle enabling SSO. The spec should clarify Keycloak is available but not active by default.

📝 Proposed clarification

 The Kind and local-dev environments SHALL include a Keycloak instance as part of the
 dev stack, providing a real OIDC flow without requiring VPN access to Red Hat SSO.
-This replaces the static JWKS ConfigMap, `DISABLE_AUTH=true` mock mode, and
-`OC_TOKEN` / `ENABLE_OC_WHOAMI` env vars as the primary local auth mechanism.
+Keycloak is deployed alongside the legacy auth mode and can be toggled on/off via
+`make kind-sso-toggle`. When enabled, it replaces the static JWKS ConfigMap,
+`DISABLE_AUTH=true` mock mode, and `OC_TOKEN` / `ENABLE_OC_WHOAMI` env vars.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@specs/security/sso-authentication.spec.md` around lines 299 - 322, Update the
spec wording to state Keycloak is an opt-in local auth option rather than the
default: replace language that says Keycloak "replaces" static JWKS /
DISABLE_AUTH / OC_TOKEN and instead note that the Kind/local-dev stack ships
with legacy auth by default and Keycloak can be enabled via the kind SSO toggle
(e.g., make kind-sso-toggle) or equivalent dev workflow; keep details about the
pre-configured realm, JWKS validation parity with production, and that
DISABLE_AUTH may remain as a non-production fallback.

121-136: 💤 Low value

Rephrase as already-implemented change, not future action.

Lines 121-136 present the e_production.go change as a "Required code change" the reader must ensure, but layer 12 of this PR already implements it. Rephrase to clarify the change is included: "The api-server's production environment has been updated to read JWK_CERT_URL from the environment..."

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@specs/security/sso-authentication.spec.md` around lines 121 - 136, Update the
spec text to present the e_production.go change as already implemented rather
than a future requirement: replace the "Required code change" phrasing with
past-tense wording that the api-server's production environment has been updated
to read JWK_CERT_URL from the environment and that the backend uses the
ServiceAccount token with impersonation headers for K8s API calls; reference the
existing implementation in e_production.go and mention the JWK_CERT_URL env var
and the impersonation ClusterRole change so readers understand these are
included in the current deployment.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 39134db5-341a-4cbd-ace9-5277d8c23a1d

📥 Commits

Reviewing files that changed from the base of the PR and between 351fba7 and 9575817.

⛔ Files ignored due to path filters (48)

components/ambient-api-server/pkg/api/openapi/.travis.yml is excluded by !**/pkg/api/openapi/**
components/ambient-api-server/pkg/api/openapi/README.md is excluded by !**/pkg/api/openapi/**
components/ambient-api-server/pkg/api/openapi/docs/Agent.md is excluded by !**/pkg/api/openapi/**
components/ambient-api-server/pkg/api/openapi/docs/AgentList.md is excluded by !**/pkg/api/openapi/**
components/ambient-api-server/pkg/api/openapi/docs/AgentPatchRequest.md is excluded by !**/pkg/api/openapi/**
components/ambient-api-server/pkg/api/openapi/docs/AgentSessionList.md is excluded by !**/pkg/api/openapi/**
components/ambient-api-server/pkg/api/openapi/docs/ApiAmbientV1ProjectsIdScheduledSessionsSsIdTriggerPost200Response.md is excluded by !**/pkg/api/openapi/**
components/ambient-api-server/pkg/api/openapi/docs/Credential.md is excluded by !**/pkg/api/openapi/**
components/ambient-api-server/pkg/api/openapi/docs/CredentialList.md is excluded by !**/pkg/api/openapi/**
components/ambient-api-server/pkg/api/openapi/docs/CredentialPatchRequest.md is excluded by !**/pkg/api/openapi/**
components/ambient-api-server/pkg/api/openapi/docs/CredentialTokenResponse.md is excluded by !**/pkg/api/openapi/**
components/ambient-api-server/pkg/api/openapi/docs/DefaultAPI.md is excluded by !**/pkg/api/openapi/**
components/ambient-api-server/pkg/api/openapi/docs/Error.md is excluded by !**/pkg/api/openapi/**
components/ambient-api-server/pkg/api/openapi/docs/InboxMessage.md is excluded by !**/pkg/api/openapi/**
components/ambient-api-server/pkg/api/openapi/docs/InboxMessageList.md is excluded by !**/pkg/api/openapi/**
components/ambient-api-server/pkg/api/openapi/docs/InboxMessagePatchRequest.md is excluded by !**/pkg/api/openapi/**
components/ambient-api-server/pkg/api/openapi/docs/List.md is excluded by !**/pkg/api/openapi/**
components/ambient-api-server/pkg/api/openapi/docs/ObjectReference.md is excluded by !**/pkg/api/openapi/**
components/ambient-api-server/pkg/api/openapi/docs/Project.md is excluded by !**/pkg/api/openapi/**
components/ambient-api-server/pkg/api/openapi/docs/ProjectHome.md is excluded by !**/pkg/api/openapi/**
components/ambient-api-server/pkg/api/openapi/docs/ProjectHomeAgent.md is excluded by !**/pkg/api/openapi/**
components/ambient-api-server/pkg/api/openapi/docs/ProjectList.md is excluded by !**/pkg/api/openapi/**
components/ambient-api-server/pkg/api/openapi/docs/ProjectPatchRequest.md is excluded by !**/pkg/api/openapi/**
components/ambient-api-server/pkg/api/openapi/docs/ProjectSettings.md is excluded by !**/pkg/api/openapi/**
components/ambient-api-server/pkg/api/openapi/docs/ProjectSettingsList.md is excluded by !**/pkg/api/openapi/**
components/ambient-api-server/pkg/api/openapi/docs/ProjectSettingsPatchRequest.md is excluded by !**/pkg/api/openapi/**
components/ambient-api-server/pkg/api/openapi/docs/Role.md is excluded by !**/pkg/api/openapi/**
components/ambient-api-server/pkg/api/openapi/docs/RoleBinding.md is excluded by !**/pkg/api/openapi/**
components/ambient-api-server/pkg/api/openapi/docs/RoleBindingList.md is excluded by !**/pkg/api/openapi/**
components/ambient-api-server/pkg/api/openapi/docs/RoleBindingPatchRequest.md is excluded by !**/pkg/api/openapi/**
components/ambient-api-server/pkg/api/openapi/docs/RoleList.md is excluded by !**/pkg/api/openapi/**
components/ambient-api-server/pkg/api/openapi/docs/RolePatchRequest.md is excluded by !**/pkg/api/openapi/**
components/ambient-api-server/pkg/api/openapi/docs/ScheduledSession.md is excluded by !**/pkg/api/openapi/**
components/ambient-api-server/pkg/api/openapi/docs/ScheduledSessionList.md is excluded by !**/pkg/api/openapi/**
components/ambient-api-server/pkg/api/openapi/docs/ScheduledSessionPatchRequest.md is excluded by !**/pkg/api/openapi/**
components/ambient-api-server/pkg/api/openapi/docs/Session.md is excluded by !**/pkg/api/openapi/**
components/ambient-api-server/pkg/api/openapi/docs/SessionList.md is excluded by !**/pkg/api/openapi/**
components/ambient-api-server/pkg/api/openapi/docs/SessionMessage.md is excluded by !**/pkg/api/openapi/**
components/ambient-api-server/pkg/api/openapi/docs/SessionMessagePushRequest.md is excluded by !**/pkg/api/openapi/**
components/ambient-api-server/pkg/api/openapi/docs/SessionPatchRequest.md is excluded by !**/pkg/api/openapi/**
components/ambient-api-server/pkg/api/openapi/docs/SessionStatusPatchRequest.md is excluded by !**/pkg/api/openapi/**
components/ambient-api-server/pkg/api/openapi/docs/StartRequest.md is excluded by !**/pkg/api/openapi/**
components/ambient-api-server/pkg/api/openapi/docs/StartResponse.md is excluded by !**/pkg/api/openapi/**
components/ambient-api-server/pkg/api/openapi/docs/User.md is excluded by !**/pkg/api/openapi/**
components/ambient-api-server/pkg/api/openapi/docs/UserList.md is excluded by !**/pkg/api/openapi/**
components/ambient-api-server/pkg/api/openapi/docs/UserPatchRequest.md is excluded by !**/pkg/api/openapi/**
components/backend/go.sum is excluded by !**/*.sum, !**/go.sum
components/frontend/package-lock.json is excluded by !**/package-lock.json, !**/package-lock.json

📒 Files selected for processing (99)

.github/workflows/e2e.yml
.specify/memory/constitution_update_checklist.md
.specify/scripts/bash/check-prerequisites.sh
.specify/scripts/bash/common.sh
.specify/scripts/bash/create-new-feature.sh
.specify/scripts/bash/setup-plan.sh
.specify/scripts/bash/update-agent-context.sh
.specify/templates/checklist-template.md
.specify/templates/plan-template.md
.specify/templates/spec-template.md
.specify/templates/tasks-template.md
CONTRIBUTING.md
Makefile
components/ambient-api-server/cmd/ambient-api-server/environments/e_production.go
components/ambient-api-server/plugins/roleBindings/migration.go
components/ambient-api-server/test/integration/integration_test.go
components/ambient-cli/cmd/acpctl/project/cmd.go
components/backend/README.md
components/backend/go.mod
components/backend/handlers/middleware.go
components/backend/handlers/models_test.go
components/backend/handlers/projects.go
components/backend/handlers/ssar_cache.go
components/backend/handlers/sso.go
components/backend/jwtauth/validator.go
components/backend/jwtauth/validator_test.go
components/backend/main.go
components/backend/server/k8s.go
components/backend/server/server.go
components/backend/server/server_test.go
components/frontend/.env.example
components/frontend/README.md
components/frontend/package.json
components/frontend/src/app/api/auth/sso/callback/route.ts
components/frontend/src/app/api/auth/sso/e2e-login/route.ts
components/frontend/src/app/api/auth/sso/login/route.ts
components/frontend/src/app/api/auth/sso/logout/route.ts
components/frontend/src/app/api/me/route.ts
components/frontend/src/app/logged-out/page.tsx
components/frontend/src/app/projects/[name]/layout.tsx
components/frontend/src/app/projects/[name]/sessions/[sessionName]/page.tsx
components/frontend/src/app/sso/[...path]/route.ts
components/frontend/src/components/navigation.tsx
components/frontend/src/components/providers/query-provider.tsx
components/frontend/src/components/session-expired-dialog.tsx
components/frontend/src/lib/auth-utils.ts
components/frontend/src/lib/auth.ts
components/frontend/src/lib/env.ts
components/frontend/src/lib/oidc.ts
components/frontend/src/lib/query-client.ts
components/frontend/src/lib/session.ts
components/frontend/src/middleware.ts
components/frontend/src/services/api/auth.ts
components/frontend/src/services/api/tasks.ts
components/manifests/base/rbac/backend-clusterrole.yaml
components/manifests/overlays/hcmais/jsell-sso-poc/ambient-api-server-env-patch.yaml
components/manifests/overlays/hcmais/jsell-sso-poc/api-server-route.yaml
components/manifests/overlays/hcmais/jsell-sso-poc/api-server-secret-patch.yaml
components/manifests/overlays/hcmais/jsell-sso-poc/backend-sso-patch.yaml
components/manifests/overlays/hcmais/jsell-sso-poc/clusterrolebindings.yaml
components/manifests/overlays/hcmais/jsell-sso-poc/control-plane-env-patch.yaml
components/manifests/overlays/hcmais/jsell-sso-poc/frontend-route.yaml
components/manifests/overlays/hcmais/jsell-sso-poc/frontend-sso-patch.yaml
components/manifests/overlays/hcmais/jsell-sso-poc/kustomization.yaml
components/manifests/overlays/hcmais/jsell-sso-poc/ldap-config.yaml
components/manifests/overlays/hcmais/jsell-sso-poc/minio-pvc-patch.yaml
components/manifests/overlays/hcmais/jsell-sso-poc/operator-config-openshift.yaml
components/manifests/overlays/hcmais/jsell-sso-poc/public-api-route.yaml
components/manifests/overlays/hcmais/jsell-sso-poc/sso-credentials.yaml
components/manifests/overlays/hcmais/keycloak/keycloak-admin-secret.yaml
components/manifests/overlays/hcmais/keycloak/keycloak-db.yaml
components/manifests/overlays/hcmais/keycloak/keycloak-deployment.yaml
components/manifests/overlays/hcmais/keycloak/keycloak-realm.json
components/manifests/overlays/hcmais/keycloak/kustomization.yaml
components/manifests/overlays/hcmais/keycloak/namespace.yaml
components/manifests/overlays/kind/api-server-security-patch.yaml
components/manifests/overlays/kind/backend-sso-patch.yaml
components/manifests/overlays/kind/control-plane-env-patch.yaml
components/manifests/overlays/kind/e2e-rolebinding.yaml
components/manifests/overlays/kind/frontend-test-patch.yaml
components/manifests/overlays/kind/keycloak-deployment.yaml
components/manifests/overlays/kind/keycloak-realm.json
components/manifests/overlays/kind/kustomization.yaml
components/manifests/overlays/kind/sso-credentials.yaml
components/runners/ambient-runner/ambient_runner/bridges/claude/mcp.py
components/runners/ambient-runner/ambient_runner/platform/auth.py
components/runners/ambient-runner/tests/test_grpc_client.py
components/runners/ambient-runner/tests/test_mcp_auth.py
components/runners/ambient-runner/tests/test_shared_session_credentials.py
docs/internal/deployment/OPENSHIFT_OAUTH.md
docs/internal/developer/local-development/kind.md
e2e/README.md
e2e/cypress.config.ts
e2e/cypress/support/commands.ts
e2e/scripts/extract-token.sh
e2e/scripts/wait-for-ready.sh
specs/security/sso-authentication.spec.md
workflows/deploy-native-sso.md
workflows/security/sso-migration.workflow.md

💤 Files with no reviewable changes (4)

components/frontend/src/services/api/tasks.ts
.specify/scripts/bash/common.sh
components/runners/ambient-runner/tests/test_grpc_client.py
components/runners/ambient-runner/tests/test_mcp_auth.py

coderabbitai · 2026-05-28T14:47:33Z

    local escaped_lang=$(printf '%s\n' "$NEW_LANG" | sed 's/[\[\.*^$()+{}|]/\\&/g')
    local escaped_framework=$(printf '%s\n' "$NEW_FRAMEWORK" | sed 's/[\[\.*^$()+{}|]/\\&/g')
    local escaped_branch=$(printf '%s\n' "$CURRENT_BRANCH" | sed 's/[\[\.*^$()+{}|]/\\&/g')
-    
+


⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/usr/bin/env bash set -euo pipefail tmp="$(mktemp)" trap 'rm -f "$tmp"' EXIT printf '[EXTRACTED FROM ALL PLAN.MD FILES]\n' > "$tmp" value='React & Next.js | v15' echo "Running current-style substitution with an unescaped replacement value..." set +e sed "s|\[EXTRACTED FROM ALL PLAN.MD FILES\]|- ${value}|" "$tmp" status=$? set -e echo "Exit status: $status (non-zero indicates delimiter breakage; zero with malformed output indicates '&' expansion issue)"

Repository: ambient-code/platform

Length of output: 324

🏁 Script executed:

#!/usr/bin/env bash set -euo pipefail # Show the relevant section from the script (around where the escaping/substitutions happen) target=".specify/scripts/bash/update-agent-context.sh" nl -ba "$target" | sed -n '280,370p'

Repository: ambient-code/platform

Length of output: 108

🏁 Script executed:

#!/usr/bin/env bash set -euo pipefail tmp="$(mktemp)" trap 'rm -f "$tmp"' EXIT printf '[EXTRACTED FROM ALL PLAN.MD FILES]\n' > "$tmp" run_case () { local label="$1" local value="$2" echo echo "$label" set +e out="$(sed "s|\[EXTRACTED FROM ALL PLAN.MD FILES\]|- ${value}|" "$tmp" 2>&1)" status=$? set -e echo "exit=$status" echo "output:" echo "$out" } run_case "Case: unescaped & (no | present)" 'React & Next.js' run_case "Case: unescaped backslash (no | present)" 'React \ Next.js' run_case "Case: unescaped | (breaks delimiter)" 'React & Next.js | v15'

Repository: ambient-code/platform

Length of output: 346

🏁 Script executed:

#!/usr/bin/env bash set -euo pipefail tmp="$(mktemp)" trap 'rm -f "$tmp"' EXIT printf '[EXTRACTED FROM ALL PLAN.MD FILES]\n' > "$tmp" # Expected safe escaping for sed replacement: escape & and backslash (and delimiter |) escape_sed_replacement() { printf '%s' "$1" | sed -e 's/[&|\\]/\\&/g' } value='React & Next.js | v15' escaped="$(escape_sed_replacement "$value")" echo "Escaped replacement value:" echo "$escaped" sed "s|\[EXTRACTED FROM ALL PLAN.MD FILES\]|- $escaped|" "$tmp"

Repository: ambient-code/platform

Length of output: 140

🏁 Script executed:

#!/usr/bin/env bash set -euo pipefail # Locate all sed uses of the bracket placeholders in this script rg -n --fixed-string 's|\[' ".specify/scripts/bash/update-agent-context.sh" || true rg -n --fixed-string '\[EXTRACTED FROM ALL PLAN.MD FILES\]' ".specify/scripts/bash/update-agent-context.sh" || true

Repository: ambient-code/platform

Length of output: 961

🏁 Script executed:

#!/usr/bin/env bash set -euo pipefail # Quick check: confirm the current escaping expression and whether it includes '&' or '|' rg -n "escaped_(lang|framework|branch)|sed 's/\\[\\\.\*\\^\\$\$\$\\+\\{\\}\\|]/" ".specify/scripts/bash/update-agent-context.sh" || true rg -n "sed 's/\\[\\\.\*\\^\\$\$\$\\+\\{\\}\\|]/\\\\&/g'" ".specify/scripts/bash/update-agent-context.sh" || true

Repository: ambient-code/platform

Length of output: 1199

Fix sed replacement escaping for dynamic values (&, \, and delimiter |)

The current escaping at .specify/scripts/bash/update-agent-context.sh lines 302-304 only prefixes regex metacharacters ([\[\.*^$()+{}|]) and does not escape sed replacement metacharacters. Dynamic plan fields containing:

& expands to the matched placeholder (confirmed: React & Next.js results in the placeholder being re-inserted into the output),

\ alters the replacement output (confirmed: React \ Next.js drops/remaps the backslash),

| breaks the sed command parsing (confirmed: React & Next.js | v15 fails with “unknown option to s'”). Adopt replacement-safe escaping (escape &, `, and | for the s|pattern|replacement| replacement context) before interpolating these values into the generated sed substitution strings.

🧰 Tools

🪛 Shellcheck (0.11.0)

[warning] 302-302: Declare and assign separately to avoid masking return values.

(SC2155)

[info] 302-302: Expressions don't expand in single quotes, use double quotes for that.

(SC2016)

[warning] 303-303: Declare and assign separately to avoid masking return values.

(SC2155)

[info] 303-303: Expressions don't expand in single quotes, use double quotes for that.

(SC2016)

[warning] 304-304: Declare and assign separately to avoid masking return values.

(SC2155)

[info] 304-304: Expressions don't expand in single quotes, use double quotes for that.

(SC2016)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.specify/scripts/bash/update-agent-context.sh around lines 302 - 305, The current escaping for NEW_LANG, NEW_FRAMEWORK and CURRENT_BRANCH (variables used to build escaped_lang, escaped_framework, escaped_branch) only handles regex metacharacters but not sed replacement metacharacters (&, \, |); update the sed pipeline that creates escaped_lang/escaped_framework/escaped_branch to first escape backslashes then escape & and the chosen sed delimiter (|) before interpolation — for example run two or three sed expressions (or a single sed -e chain) that replaces '\' with '\\', then '&' with '\&', and '|' with '\|' so the values are safe when used in the s|pattern|replacement| substitution. Ensure you update the commands that reference escaped_lang/escaped_framework/escaped_branch accordingly.

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@e2e/cypress/e2e/sessions.cy.ts`:
- Around line 1246-1247: The test currently uses
cy.get('`#runner-secrets-panel`').find('[class*="skeleton"]', { timeout: 0
}).should('not.exist'), which disables Cypress' retry and causes a race; remove
the explicit { timeout: 0 } (or replace it with a positive timeout) on the
.find() call so Cypress will retry until skeleton elements under
'`#runner-secrets-panel`' disappear before asserting should('not.exist').

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: c8803191-c40f-4926-b093-ce72ef1a8752

📥 Commits

Reviewing files that changed from the base of the PR and between 9575817 and d1b8d5e.

📒 Files selected for processing (1)

e2e/cypress/e2e/sessions.cy.ts

…ys test The #runner-secrets-panel element only renders conditionally after accordion expansion AND loading completion. Target the actual input elements by their runner-secret-* IDs instead, letting Cypress retry until they're in the DOM. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@e2e/cypress/e2e/sessions.cy.ts`:
- Line 1246: Replace the hardcoded secret in the test where the runner secret
input is selected (the cy.get('input[id^="runner-secret-"]') ...
.type('test-api-key-value') call) with a value read from an environment variable
(e.g., Cypress.env('RUNNER_SECRET') or a CYPRESS_ prefixed env var) and provide
a safe, non-sensitive fallback (like an obviously placeholder string) so the
test does not embed secret-like literals in source; update the .type(...) call
to use that env-derived value instead of the hardcoded 'test-api-key-value'.
- Line 1246: The global selector cy.get('input[id^="runner-secret-"]') can match
hidden or duplicate inputs and cause flakes; scope the query to the expanded
panel by first selecting the panel element (e.g.,
cy.get('`#runner-secrets-panel`') or the opened accordion container), assert it is
visible with .should('be.visible'), then use
.find('input[id^="runner-secret-"]').first() and perform .clear({ force: true
}).type('test-api-key-value', { force: true }) so the test targets the intended
input reliably; use the existing Cypress commands cy.get, .find, .should,
.clear, and .type to implement this change.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 17aaa800-266c-43ae-9e82-7d560579a040

📥 Commits

Reviewing files that changed from the base of the PR and between d1b8d5e and 2a22ba3.

📒 Files selected for processing (1)

e2e/cypress/e2e/sessions.cy.ts

coderabbitai · 2026-05-28T15:02:44Z

-            }
-          })
+          // Wait for the accordion panel to render and inputs to be available
+          cy.get('input[id^="runner-secret-"]', { timeout: 10000 }).first().clear({ force: true }).type('test-api-key-value', { force: true })


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Avoid hardcoding secret-like values in test source.

Line 1246 hardcodes a secret-looking value (test-api-key-value). Even in E2E tests, this pattern increases accidental leakage risk and can normalize insecure secret handling. Use an environment variable with a safe fallback.

As per coding guidelines, "Flag secrets/tokens logged in plaintext or hardcoded in source code".

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@e2e/cypress/e2e/sessions.cy.ts` at line 1246, Replace the hardcoded secret in the test where the runner secret input is selected (the cy.get('input[id^="runner-secret-"]') ... .type('test-api-key-value') call) with a value read from an environment variable (e.g., Cypress.env('RUNNER_SECRET') or a CYPRESS_ prefixed env var) and provide a safe, non-sensitive fallback (like an obviously placeholder string) so the test does not embed secret-like literals in source; update the .type(...) call to use that env-derived value instead of the hardcoded 'test-api-key-value'.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Scope the runner-secret input query to the expanded panel.

Line 1246 uses a global selector (input[id^="runner-secret-"]) and can match the wrong element if multiple matching inputs exist (or hidden content remains in DOM), causing flaky writes. Scope to #runner-secrets-panel (or the opened accordion content) and assert visibility before typing.

Suggested fix

- cy.get('input[id^="runner-secret-"]', { timeout: 10000 }).first().clear({ force: true }).type('test-api-key-value', { force: true }) + cy.get('`#runner-secrets-panel`', { timeout: 10000 }).should('be.visible') + cy.get('`#runner-secrets-panel`') + .find('input[id^="runner-secret-"]:visible', { timeout: 10000 }) + .first() + .clear({ force: true }) + .type(Cypress.env('E2E_RUNNER_API_KEY') ?? 'e2e-placeholder-key', { force: true })

As per coding guidelines, "Handle errors and edge cases explicitly rather than ignoring them".

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

cy.get('input[id^="runner-secret-"]', { timeout: 10000 }).first().clear({ force: true }).type('test-api-key-value', { force: true })

cy.get('`#runner-secrets-panel`', { timeout: 10000 }).should('be.visible')

cy.get('`#runner-secrets-panel`')

.find('input[id^="runner-secret-"]:visible', { timeout: 10000 })

.first()

.clear({ force: true })

.type(Cypress.env('E2E_RUNNER_API_KEY') ?? 'e2e-placeholder-key', { force: true })

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@e2e/cypress/e2e/sessions.cy.ts` at line 1246, The global selector cy.get('input[id^="runner-secret-"]') can match hidden or duplicate inputs and cause flakes; scope the query to the expanded panel by first selecting the panel element (e.g., cy.get('`#runner-secrets-panel`') or the opened accordion container), assert it is visible with .should('be.visible'), then use .find('input[id^="runner-secret-"]').first() and perform .clear({ force: true }).type('test-api-key-value', { force: true }) so the test targets the intended input reliably; use the existing Cypress commands cy.get, .find, .should, .clear, and .type to implement this change.

The test was intentionally written with jQuery if-guards that skip interaction when inputs aren't rendered (runner-types API may not return data in the Kind e2e environment). Prior commits made assertions mandatory, which broke CI. Restore the original defensive pattern. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

mergify · 2026-05-28T15:45:06Z

Merge Queue Status

✅ Entered queue — 2026-05-28 15:44 UTC · Rule: default
✅ Checks skipped · PR is already up-to-date
✅ Merged — 2026-05-28 15:45 UTC · at 2ce3305a95eeb3ffdb8a7b33173b4652b8605be1 · squash

This pull request spent 48 seconds in the queue, including 8 seconds running CI.

Required conditions to merge

jsell-rh and others added 27 commits May 12, 2026 15:05

Merge branch 'main' into jsell/spec/sso-authentication

047d324

Merge branch 'jsell/spec/sso-authentication' of https://github.com/am…

4e714b0

…bient-code/platform into jsell/spec/sso-authentication

markturansky reviewed May 21, 2026

View reviewed changes

jsell-rh and others added 4 commits May 21, 2026 15:05

fix(frontend): use Next.js Link component on logged-out page

f23c8bc

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

jsell-rh and others added 5 commits May 28, 2026 10:11

chore: format

a5f316f

chore: normalize whitespace and formatting across codebase

d26188a

Trailing space removal, line ending normalization, PEP 8 formatting fixes, and unused import cleanup. No functional changes. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Merge branch 'main' into jsell/spec/sso-authentication

9575817

Merge branch 'jsell/spec/sso-authentication' of https://github.com/am…

5f5d119

…bient-code/platform into jsell/spec/sso-authentication

jsell-rh marked this pull request as ready for review May 28, 2026 14:21

coderabbitai Bot reviewed May 28, 2026

View reviewed changes

Comment thread e2e/cypress/e2e/sessions.cy.ts Outdated

coderabbitai Bot reviewed May 28, 2026

View reviewed changes

mergify Bot added the queued label May 28, 2026

mergify Bot merged commit fb5b57c into main May 28, 2026
71 checks passed

mergify Bot deleted the jsell/spec/sso-authentication branch May 28, 2026 15:45

mergify Bot removed the queued label May 28, 2026

Conversation

jsell-rh commented May 12, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

What's implemented

Deployment overlays

Key files

Default behavior

Native SSO deployment (validated)

Test plan

Summary by CodeRabbit

Uh oh!

netlify Bot commented May 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

✅ Deploy Preview for cheerful-kitten-f556a0 canceled.

Uh oh!

coderabbitai Bot commented May 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

markturansky left a comment

Choose a reason for hiding this comment

Amber Code Review — feat(security): SSO/JWT authentication migration (Phase 1)

Summary

Findings

Critical

Major

Minor

Positive Highlights

Recommendations (priority order)

Uh oh!

jsell-rh commented May 21, 2026

Response to Amber code review (#4339633631)

Critical — Open redirect

Major 1 — CHANGE_ME placeholders

Major 2 — Keycloak proxy allowlist

Major 3 — E2E auth bypass

Minor items

Uh oh!

markturansky left a comment

Choose a reason for hiding this comment

Re-review: SSO/JWT Authentication Migration (Phase 1)

Fixes verified ✅

Remaining gaps ⚠️

New bug: session.idToken is never stored 🐛 (must fix before merge)

Style nit (optional)

Verdict

Uh oh!

jsell-rh commented May 21, 2026

Response to re-review

Uh oh!

markturansky left a comment

Choose a reason for hiding this comment

Third Pass — Thorough Line-by-Line Review

All prior issues confirmed resolved ✅

New findings

MAJOR — hcmais Keycloak realm missing groups scope

MAJOR — Identity claim precedence is inconsistent across the stack

MINOR — SSO middleware falls through to X-Forwarded-* headers after JWT failure

MINOR — OIDC authorization request only asks for openid scope

MINOR — getAccessToken() doesn't persist refreshed idToken

MINOR — ambient-cli client lacks an audience mapper

MINOR — new URL(SSO_ISSUER_URL) at module init in Keycloak proxy

Nits

Overall assessment

Uh oh!

markturansky commented May 21, 2026

Fourth-pass review — amber

✅ All third-pass findings resolved

✅ New commits look good

Notes (no action required)

Verdict: LGTM ✅

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 28, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

jsell-rh commented May 12, 2026 •

edited by coderabbitai Bot

Loading

netlify Bot commented May 12, 2026 •

edited

Loading

coderabbitai Bot commented May 12, 2026 •

edited

Loading

New bug: `session.idToken` is never stored 🐛 (must fix before merge)

MAJOR — hcmais Keycloak realm missing `groups` scope

MINOR — OIDC authorization request only asks for `openid` scope

MINOR — `getAccessToken()` doesn't persist refreshed `idToken`

MINOR — `ambient-cli` client lacks an audience mapper

MINOR — `new URL(SSO_ISSUER_URL)` at module init in Keycloak proxy

mergify Bot commented May 28, 2026 •

edited

Loading