codex-lb Helm Chart

Production-ready Helm chart for codex-lb, an OpenAI API load balancer with account pooling, usage tracking, and a dashboard.

Design Goal

This chart is organized around install modes, not cloud vendors.

The same chart should work on Docker Desktop, kind, EKS, GKE, OKE, and other Kubernetes distributions. Cluster-specific concerns such as storage classes, ingress classes, load balancer annotations, and secret backends are expressed through values, while the application install contract stays the same.
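For example, the same release can move between clusters by swapping one small per-cluster values file. Every key below appears elsewhere in this README; the file name is illustrative:

# values-my-cluster.yaml (name illustrative)
clusterDomain: cluster.local        # DNS suffix differs per distribution
ingress:
  enabled: true
  ingressClassName: nginx           # match the cluster's ingress controller
networkPolicy:
  ingressNSMatchLabels:
    kubernetes.io/metadata.name: ingress-nginx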

Prerequisites

  • Helm 3.7+
  • Kubernetes 1.32+
  • Optional:
    • Prometheus Operator for ServiceMonitor and PrometheusRule
    • cert-manager for automated ingress TLS
    • Gateway API CRDs for HTTPRoute
    • External Secrets Operator for externalSecrets.enabled=true

Version Policy

  • Minimum supported Kubernetes version: 1.32
  • Validation baseline in CI and smoke installs: 1.35

This is a project support policy. Cloud providers may keep older versions available for some time, but the chart and CI no longer optimize for pre-1.32 clusters.

Install Modes

1. Bundled

Use the bundled Bitnami PostgreSQL sub-chart. This is the easiest self-contained install mode for demos, development clusters, and disposable environments.

Key properties:

  • postgresql.enabled=true
  • values-bundled.yaml enables databaseMigrateOnStartup=true
  • the migration Job is reserved for upgrades (pre-upgrade)
  • fresh installs stay self-contained and single-replica friendly

Example:

helm install codex-lb oci://ghcr.io/soju06/charts/codex-lb \
  --set postgresql.auth.password=change-me \
  --set config.databaseMigrateOnStartup=true \
  --set migration.schemaGate.enabled=false

From source:

helm dependency build deploy/helm/codex-lb/
helm upgrade --install codex-lb deploy/helm/codex-lb/ \
  -f deploy/helm/codex-lb/values-bundled.yaml \
  --set postgresql.auth.password=change-me

2. External DB

Use an already reachable PostgreSQL database. This is the preferred production contract when the database is managed separately.

Key properties:

  • postgresql.enabled=false
  • direct DB URL or DB secret is available at install time
  • migration Job runs pre-install,pre-upgrade
  • application pods still keep the schema gate initContainer enabled

Supported DB wiring:

  • externalDatabase.url
  • externalDatabase.host, externalDatabase.port, externalDatabase.database, externalDatabase.user
  • externalDatabase.existingSecret
  • auth.existingSecret if one secret contains both database-url and encryption-key

Example using a direct URL:

helm install codex-lb oci://ghcr.io/soju06/charts/codex-lb \
  --set postgresql.enabled=false \
  --set externalDatabase.url='postgresql+asyncpg://user:pass@db.example.com:5432/codexlb'

Example using separate secrets:

helm install codex-lb oci://ghcr.io/soju06/charts/codex-lb \
  --set postgresql.enabled=false \
  --set externalDatabase.existingSecret=codex-lb-db \
  --set auth.existingSecret=codex-lb-app

From source:

helm upgrade --install codex-lb deploy/helm/codex-lb/ \
  -f deploy/helm/codex-lb/values-external-db.yaml \
  --set externalDatabase.url='postgresql+asyncpg://user:pass@db.example.com:5432/codexlb'

3. External Secrets

Use External Secrets Operator to materialize credentials.

Key properties:

  • externalSecrets.enabled=true
  • DB credentials are not assumed to exist at render time
  • migration Job remains post-install,pre-upgrade
  • application pods keep the schema gate initContainer enabled and wait for schema head before starting the app container

Example:

helm install codex-lb oci://ghcr.io/soju06/charts/codex-lb \
  --set postgresql.enabled=false \
  --set externalSecrets.enabled=true \
  --set externalSecrets.secretStoreRef.name=my-store

From source:

helm upgrade --install codex-lb deploy/helm/codex-lb/ \
  -f deploy/helm/codex-lb/values-external-secrets.yaml \
  --set externalSecrets.secretStoreRef.name=my-store
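
The my-store referenced above must already exist in the namespace. External Secrets Operator supports many backends; a minimal SecretStore sketch using AWS Secrets Manager, shown purely as one possible backend:

apiVersion: external-secrets.io/v1beta1
kind: SecretStore
metadata:
  name: my-store
spec:
  provider:
    aws:
      service: SecretsManager
      region: us-east-1
      auth:
        secretRef:
          accessKeyIDSecretRef:
            name: aws-creds          # pre-existing Secret with IAM credentials
            key: access-key-id
          secretAccessKeySecretRef:
            name: aws-creds
            key: secret-access-key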

Quick Start

No repo clone required — install directly from the OCI registry.

Docker Desktop / kind style cluster

Bundled PostgreSQL:

helm install codex-lb oci://ghcr.io/soju06/charts/codex-lb \
  --set postgresql.auth.password=local-dev-password \
  --set config.databaseMigrateOnStartup=true \
  --set migration.schemaGate.enabled=false

Managed PostgreSQL

helm install codex-lb oci://ghcr.io/soju06/charts/codex-lb \
  --set postgresql.enabled=false \
  --set externalDatabase.url='postgresql+asyncpg://user:pass@db.example.com:5432/codexlb'

From source (development)

If you need to customize the chart itself, clone the repo and install from path:

helm dependency build deploy/helm/codex-lb/
helm upgrade --install codex-lb deploy/helm/codex-lb/ \
  -f deploy/helm/codex-lb/values-bundled.yaml \
  --set postgresql.auth.password=local-dev-password

Included Value Overlays

Mode-centric overlays:

  • values-bundled.yaml
  • values-external-db.yaml
  • values-external-secrets.yaml

Environment-oriented overlays kept for convenience:

  • values-dev.yaml
  • values-staging.yaml
  • values-prod.yaml

The mode overlays define the installation contract. The environment overlays tune scale, observability, and routing posture.
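Mode and environment overlays compose with standard Helm layering; later -f files win. For example, external DB mode with the production posture:

helm install codex-lb oci://ghcr.io/soju06/charts/codex-lb \
  -f deploy/helm/codex-lb/values-external-db.yaml \
  -f deploy/helm/codex-lb/values-prod.yaml \
  --set externalDatabase.url='postgresql+asyncpg://user:pass@db.example.com:5432/codexlb'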

Schema and Migration Behavior

This chart intentionally keeps migration behavior explicit by install mode.

  • In external DB and external secrets modes, the chart relies on the dedicated migration Job to advance schema.
  • Application pods use a schema gate initContainer when migration.enabled=true, config.databaseMigrateOnStartup=false, and migration.schemaGate.enabled=true.
  • That initContainer runs python -m app.db.migrate wait-for-head and blocks the app container until the database is at Alembic head.
  • In bundled mode, values-bundled.yaml enables startup migration instead of the schema gate so fresh self-contained installs do not deadlock on helm install --wait.

This means:

  • bundled PostgreSQL installs bootstrap themselves without requiring a separate install-time migration writer
  • external DB installs with direct credentials can migrate before StatefulSet creation
  • external secrets installs fail closed instead of serving on a stale schema
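
For an external database, this contract corresponds to the following values, all referenced above and shown at the settings external DB mode expects:

migration:
  enabled: true
  schemaGate:
    enabled: true
config:
  databaseMigrateOnStartup: false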

Secret Model

The chart supports two secret patterns.

Single secret

Use auth.existingSecret when one secret contains both:

  • database-url
  • encryption-key

Split secrets

Use externalDatabase.existingSecret for the database URL and let the chart manage or reference a separate app secret for encryption-key.

When externalDatabase.existingSecret is set and auth.existingSecret is not, the chart-managed app secret contains only the encryption key; the StatefulSet reads CODEX_LB_DATABASE_URL from the external DB secret.
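
As a concrete sketch, the split pattern could be satisfied with two plain Secrets. The key name database-url in the external DB secret is an assumption carried over from the single-secret contract above, and the encryption-key generation is illustrative:

kubectl create secret generic codex-lb-db \
  --from-literal=database-url='postgresql+asyncpg://user:pass@db.example.com:5432/codexlb'
kubectl create secret generic codex-lb-app \
  --from-literal=encryption-key="$(openssl rand -base64 32)"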

Network Policy

When networkPolicy.enabled=true, the chart now fails closed for the main HTTP ingress port.

  • The chart does not open port 2455 to every namespace by default.
  • To allow ingress-controller traffic, set networkPolicy.ingressNSMatchLabels.
  • For custom cases, use networkPolicy.extraIngress.

Example:

networkPolicy:
  enabled: true
  ingressNSMatchLabels:
    kubernetes.io/metadata.name: ingress-nginx
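
For sources the namespace selector cannot express, networkPolicy.extraIngress accepts additional rules. A sketch assuming it takes standard NetworkPolicy ingress entries (the CIDR is illustrative):

networkPolicy:
  enabled: true
  extraIngress:
    - from:
        - ipBlock:
            cidr: 10.0.0.0/16
      ports:
        - port: 2455
          protocol: TCP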

Connection Pool Sizing

Each pod keeps its own SQLAlchemy pool.

total_connections = (databasePoolSize + databaseMaxOverflow) × replicas

Keep this within your PostgreSQL max_connections budget or place PgBouncer in front of the database.

Production Workload

Multi-replica production deployments require careful coordination of database connectivity, session routing, and graceful shutdown. This section covers the key patterns and tuning parameters.

Prerequisites for Multi-Replica

Single-replica deployments can use SQLite, but multi-replica requires PostgreSQL:

  • Database: PostgreSQL is mandatory for multi-replica because:

    • SQLite does not support concurrent writes from multiple pods
    • Leader election requires a shared database backend
    • Session bridge ring membership is stored in the database
  • Leader Election: Enabled by default (config.leaderElectionEnabled=true)

    • Ensures only one pod performs background tasks (e.g., session cleanup, metrics aggregation)
    • Uses database-backed locking with a TTL (config.leaderElectionTtlSeconds=30)
    • If the leader crashes, another pod acquires the lock within 30 seconds
  • Circuit Breaker: Enabled by default (config.circuitBreakerEnabled=true)

    • Protects upstream API endpoints from cascading failures
    • Opens after config.circuitBreakerFailureThreshold=5 consecutive failures
    • Enters half-open state after config.circuitBreakerRecoveryTimeoutSeconds=60 seconds
    • Prevents thundering herd when upstream is degraded
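
All of the above map to chart values; shown here at their documented defaults:

config:
  leaderElectionEnabled: true
  leaderElectionTtlSeconds: 30
  circuitBreakerEnabled: true
  circuitBreakerFailureThreshold: 5
  circuitBreakerRecoveryTimeoutSeconds: 60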

Session Bridge Ring

The session bridge is an in-memory cache of upstream WebSocket connections, shared across the pod ring.

Automatic Ring Membership (PostgreSQL)

When using PostgreSQL, ring membership is automatic and database-backed:

  • Each pod registers itself in the database on startup
  • Each pod auto-advertises its owner-handoff endpoint via headless-service DNS
  • The sessionBridgeInstanceRing field is optional and only needed for manual pod list override
  • Pods discover each other via database queries; no manual configuration required
  • Ring membership is cleaned up automatically when pods terminate

The chart configures each pod with:

  • StatefulSet name: <release>-codex-lb-workload
  • serviceName: <release>-codex-lb-bridge on the StatefulSet
  • CODEX_LB_HTTP_RESPONSES_SESSION_BRIDGE_INSTANCE_ID=$(POD_NAME)
  • CODEX_LB_HTTP_RESPONSES_SESSION_BRIDGE_ADVERTISE_BASE_URL=http://$(POD_NAME).<headless-service>.$(POD_NAMESPACE).svc.<clusterDomain>:2455

clusterDomain defaults to cluster.local. If your cluster uses another suffix, set:

clusterDomain: corp.internal

In most clusters no extra values are required for /responses owner handoff. If pods must be reached through a different internal address, override:

config:
  sessionBridgeAdvertiseBaseUrl: "http://codex-lb-internal.default.svc.cluster.local:2455"

When networkPolicy.enabled=true, the chart also allows port 2455 traffic between codex-lb pods so owner handoff can work without extra rules.
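
To inspect the pods behind the headless service, and therefore the addresses owner handoff can target, standard kubectl is enough:

kubectl get endpoints <release>-codex-lb-bridge -n <namespace>
kubectl get pods -l app.kubernetes.io/name=codex-lb -n <namespace> -o wide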

Manual Ring Override (Advanced)

If you need to manually specify the pod ring (e.g., for testing or debugging):

config:
  sessionBridgeInstanceRing: "codex-lb-0.codex-lb.default.svc.cluster.local,codex-lb-1.codex-lb.default.svc.cluster.local"

This is rarely needed in production; the database-backed discovery is preferred.

Connection Pool Budget

Each pod maintains its own SQLAlchemy connection pool. The total connections across all replicas must fit within PostgreSQL's max_connections:

(databasePoolSize + databaseMaxOverflow) × maxReplicas ≤ PostgreSQL max_connections

Example for values-prod.yaml:

config:
  databasePoolSize: 3
  databaseMaxOverflow: 2
autoscaling:
  maxReplicas: 20

Calculation: (3 + 2) × 20 = 100 connections, which fits within PostgreSQL's default max_connections=100.

Tuning:

  • Increase databasePoolSize if pods frequently wait for connections
  • Increase databaseMaxOverflow for temporary spikes, but keep it small (overflow is slower)
  • Reduce maxReplicas if you cannot increase PostgreSQL's max_connections
  • Use PgBouncer or pgcat as a connection pooler in front of PostgreSQL if needed
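
To check the server side of the budget, query the live setting with standard PostgreSQL tooling (strip the +asyncpg driver suffix from the chart's URL first; psql does not understand it):

psql 'postgresql://user:pass@db.example.com:5432/codexlb' -c 'SHOW max_connections;'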

values-prod.yaml Reference

The values-prod.yaml overlay is pre-configured for production multi-replica deployments:

replicaCount: 3                    # Start with 3 replicas
postgresql:
  enabled: false                   # Use external PostgreSQL
autoscaling:
  enabled: true
  minReplicas: 3
  maxReplicas: 20
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 600  # 10 min cooldown (see below)
affinity:
  podAntiAffinity: hard            # Spread pods across nodes
topologySpreadConstraints:
  - maxSkew: 1
    topologyKey: topology.kubernetes.io/zone  # Spread across zones
networkPolicy:
  enabled: true                    # Restrict ingress/egress
metrics:
  serviceMonitor:
    enabled: true                  # Prometheus scraping
  prometheusRule:
    enabled: true                  # Alerting rules
  grafanaDashboard:
    enabled: true                  # Pre-built dashboards
externalSecrets:
  enabled: true                    # Use External Secrets Operator

Install with:

helm install codex-lb oci://ghcr.io/soju06/charts/codex-lb \
  -f deploy/helm/codex-lb/values-prod.yaml \
  --set externalDatabase.url='postgresql+asyncpg://user:pass@db.example.com:5432/codexlb'

Graceful Shutdown Tuning

Graceful shutdown coordinates three timeout parameters to drain in-flight requests and session bridge connections:

preStopSleepSeconds (15s) → shutdownDrainTimeoutSeconds (30s) → terminationGracePeriodSeconds (60s)

Timeline:

  1. preStopSleepSeconds (15s): Pod receives SIGTERM

    • Sleep briefly to allow load balancer to remove the pod from rotation
    • Prevents new requests from arriving during shutdown
  2. shutdownDrainTimeoutSeconds (30s): Drain in-flight requests

    • HTTP server stops accepting new connections
    • Existing requests are allowed to complete (up to 30 seconds)
    • Session bridge connections are gracefully closed
  3. terminationGracePeriodSeconds (60s): Hard deadline

    • Total time from SIGTERM to SIGKILL
    • Must be ≥ preStopSleepSeconds + shutdownDrainTimeoutSeconds
    • Default 60s allows 15s + 30s + 15s buffer

Tuning:

  • Increase preStopSleepSeconds if your load balancer takes longer to deregister
  • Increase shutdownDrainTimeoutSeconds if requests typically take >30s to complete
  • Increase terminationGracePeriodSeconds proportionally (must be larger than the sum)
  • Keep the buffer small; long shutdown times delay pod replacement

Example for long-running requests:

preStopSleepSeconds: 20
shutdownDrainTimeoutSeconds: 60
terminationGracePeriodSeconds: 90

Scale-Down Caution

The stabilizationWindowSeconds: 600 (10 minutes) in values-prod.yaml is intentionally high.

Why?

  • Session bridge connections have idle TTLs (sessionBridgeIdleTtlSeconds=120 for API, sessionBridgeCodexIdleTtlSeconds=900 for Codex)
  • When a pod scales down, its in-memory sessions are lost
  • Clients reconnecting to a different pod must re-establish upstream connections
  • A 10-minute cooldown prevents rapid scale-down/up cycles that would thrash session state

Behavior:

  • HPA will scale down at most 1 pod every 2 minutes (when cooldown is active)
  • If load drops suddenly, scale-down is delayed by up to 10 minutes
  • This trades off faster scale-down for session stability
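
The one-pod-per-two-minutes rate quoted above corresponds to an HPA scale-down policy along these lines; a sketch, since values-prod.yaml may phrase it differently:

autoscaling:
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 600
      policies:
        - type: Pods
          value: 1
          periodSeconds: 120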

Tuning:

  • Reduce stabilizationWindowSeconds if you prioritize cost over session stability
  • Increase it if you see frequent session reconnections during scale events
  • Monitor sessionBridgeInstanceRing size changes in logs to detect scale-down impact

Security

The chart targets the Kubernetes Restricted Pod Security Standard.

  • runAsNonRoot: true
  • readOnlyRootFilesystem: true
  • allowPrivilegeEscalation: false
  • all Linux capabilities dropped
  • automountServiceAccountToken: false

Rollout controls for externally managed config:

  • rollout.reloader.enabled=true adds Stakater Reloader annotations
  • rollout.manualToken forces a StatefulSet rollout when external Secret contents change outside Helm
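
For example (the token value is arbitrary; bump it whenever the externally managed Secret rotates):

rollout:
  reloader:
    enabled: true
  manualToken: "rotate-2025-01"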

Ingress and Gateway API

The chart supports either classic Ingress or Gateway API.

Ingress example:

ingress:
  enabled: true
  ingressClassName: nginx
  hosts:
    - host: codex-lb.example.com
      paths:
        - path: /
          pathType: Prefix

Gateway API example:

gatewayApi:
  enabled: true
  parentRefs:
    - name: my-gateway
      namespace: gateway-system
  hostnames:
    - codex-lb.example.com

Upgrade Contract

helm upgrade codex-lb oci://ghcr.io/soju06/charts/codex-lb <your values...>

Upgrade guarantees:

  • External DB installs can migrate before StatefulSet creation.
  • External secrets installs keep the dedicated migration Job and fail closed behind the schema gate.
  • Bundled installs stay easy to bootstrap and keep the migration hook for upgrades.
  • StatefulSet pod-template checksums force rollouts when chart-managed ConfigMaps or Secrets change.
  • The workload resource name is intentionally different from the legacy Deployment name to avoid Helm kind-migration conflicts during upgrade.

Validation

Recommended after install:

helm test codex-lb -n <namespace>
kubectl get pods -n <namespace>
kubectl logs job/<release>-migrate -n <namespace>

If you are using a port-forwarded install:

kubectl port-forward svc/codex-lb 2455:2455 -n <namespace>
curl -i http://127.0.0.1:2455/health/live
curl -i http://127.0.0.1:2455/health/ready

Troubleshooting

Migration Job:

kubectl describe job <release>-migrate -n <namespace>
kubectl logs job/<release>-migrate -n <namespace>

App pod stuck in init:

kubectl describe pod -l app.kubernetes.io/name=codex-lb -n <namespace>
kubectl logs statefulset/<release>-codex-lb-workload -c wait-for-schema-head -n <namespace>

Health failures:

kubectl describe statefulset <release>-codex-lb-workload -n <namespace>
kubectl logs statefulset/<release>-codex-lb-workload -n <namespace>