fix(deps): pin aws-creds to fork with EKS Pod Identity support #1419

Merged
tlongwell-block merged 1 commit into main from fix/s3-pod-identity-creds
Jul 1, 2026
@tlongwell-block

Collaborator

Follow-up to #1417 (Redis TLS) — the second blocker on the bb-block relay deploy.

Problem

After the Redis TLS fix, the relay pod on bb-block failed with:

failed to initialize media storage: Could not get valid credentials

The pod authenticates to S3 via EKS Pod Identity (AWS_CONTAINER_CREDENTIALS_FULL_URI + AWS_CONTAINER_AUTHORIZATION_TOKEN_FILE). But aws-creds 0.39.1 — pulled transitively via rust-s3 0.37.2 — only reads the ECS RELATIVE_URI form, hardcodes the ECS endpoint, and sends no Authorization header. So it never gets creds for either S3 media or git CAS storage. Wren independently confirmed buzz is correctly onboarded to Pod Identity (cluster app-service standard), so the fix belongs in the creds layer, not infra.

Fix

Pin aws-creds to tlongwell-block/rust-s3@c9fce362, which adopts the aws-creds portion of upstream durch/rust-s3#449:

  • Falls through to AWS_CONTAINER_CREDENTIALS_FULL_URI when RELATIVE_URI is absent.
  • Reads the bearer token from AWS_CONTAINER_AUTHORIZATION_TOKEN_FILE (then AWS_CONTAINER_AUTHORIZATION_TOKEN) and sends it as Authorization.
  • Refresh-safe: sets expiration from the response, so rust-s3's auto-refresh() re-fetches on token expiry rather than silently 403ing after ~6h.
  • Security allowlist: the auth token is only sent to the documented Pod Identity loopback addresses (169.254.170.23 / [fd00:ec2::23]) over http/https; token-file read fails loud (no silent env fallback).
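
The behaviour described in the bullets above can be sketched roughly as follows. This is a minimal illustration, not the fork's actual code: `CredentialEndpoint`, `pick_endpoint`, `token_allowed_for`, and `resolve_token` are hypothetical names, the URL parsing is deliberately simplified (lowercase scheme only), and the file reader is passed in as a closure so the logic can be exercised without touching the filesystem.

```rust
use std::io;
use std::net::{IpAddr, Ipv4Addr, Ipv6Addr};

// Pod Identity agent addresses quoted in this PR.
const POD_IDENTITY_V4: IpAddr = IpAddr::V4(Ipv4Addr::new(169, 254, 170, 23));
const POD_IDENTITY_V6: IpAddr = IpAddr::V6(Ipv6Addr::new(0xfd00, 0x0ec2, 0, 0, 0, 0, 0, 0x23));

/// Hypothetical type: which container-credentials variable was chosen.
#[derive(Debug, PartialEq)]
enum CredentialEndpoint {
    Relative(String), // value of AWS_CONTAINER_CREDENTIALS_RELATIVE_URI
    Full(String),     // value of AWS_CONTAINER_CREDENTIALS_FULL_URI
}

/// Prefer RELATIVE_URI; fall through to FULL_URI when it is absent or empty.
fn pick_endpoint(relative: Option<&str>, full: Option<&str>) -> Option<CredentialEndpoint> {
    match (relative, full) {
        (Some(r), _) if !r.is_empty() => Some(CredentialEndpoint::Relative(r.to_string())),
        (_, Some(f)) if !f.is_empty() => Some(CredentialEndpoint::Full(f.to_string())),
        _ => None,
    }
}

/// Return true only for http/https URLs whose host is an allowlisted address.
fn token_allowed_for(url: &str) -> bool {
    let rest = match url.strip_prefix("http://").or_else(|| url.strip_prefix("https://")) {
        Some(rest) => rest,
        None => return false,
    };
    // The authority ends at the first path, query, or fragment delimiter.
    let authority = rest.split(|c: char| matches!(c, '/' | '?' | '#')).next().unwrap_or("");
    if authority.contains('@') {
        return false; // refuse URLs carrying userinfo
    }
    let host = match authority.strip_prefix('[') {
        // Bracketed IPv6 literal, optionally followed by ":port".
        Some(bracketed) => match bracketed.split_once(']') {
            Some((host, tail)) if tail.is_empty() || tail.starts_with(':') => host,
            _ => return false,
        },
        // IPv4 literal or hostname, optionally followed by ":port".
        None => authority.split(':').next().unwrap_or(""),
    };
    match host.parse::<IpAddr>() {
        Ok(ip) => ip == POD_IDENTITY_V4 || ip == POD_IDENTITY_V6,
        Err(_) => false,
    }
}

/// Token file wins over the env token; a failed file read is returned as an
/// error instead of silently falling back to the env value.
fn resolve_token(
    token_file: Option<&str>,
    token_env: Option<&str>,
    read_file: impl Fn(&str) -> io::Result<String>,
) -> io::Result<Option<String>> {
    if let Some(path) = token_file {
        let contents = read_file(path)?;
        return Ok(Some(contents.trim().to_string()));
    }
    Ok(token_env.map(|t| t.to_string()))
}
```

In a real caller the reader would presumably be `|p| std::fs::read_to_string(p)`, and the token would be attached as the `Authorization` header only when `token_allowed_for` returns true for the chosen URL.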

Scope (minimal)

  • [patch.crates-io] redirects only aws-creds. rust-s3 stays on crates.io 0.37.2; no S3 call sites and no git CAS logic change.
  • Cargo.lock delta is exactly one line (aws-creds source).
  • Temporary pin: a personal OSS fork used for test-then-upstream (per Tyler). Revert to crates.io once durch/rust-s3#449 lands; if it stalls into a long-lived prod dep, move the fork to a Block-managed org.
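
For readers unfamiliar with `[patch.crates-io]`, the redirect described above might look roughly like the fragment below. This is a sketch, not a copy of the actual diff: the GitHub URL form and the use of the abbreviated revision `c9fce362` as the `rev` value are assumptions based on the fork reference in this PR.

```toml
# Workspace-root Cargo.toml (sketch; exact form in the PR may differ)
[patch.crates-io]
# Redirect only aws-creds to the fork. rust-s3 itself still resolves
# from crates.io and picks up the patched aws-creds transitively.
aws-creds = { git = "https://github.com/tlongwell-block/rust-s3", rev = "c9fce362" }
```

Cargo only applies a patch when the fork's `aws-creds` version still satisfies the version requirement that rust-s3 declares, which is consistent with the lockfile changing only in the `source` of that one package.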

Human review

This adopts an upstream PR that had only bot review. Wren + I reviewed the full diff together (refresh safety, the allowlist, error mapping) — see thread. We are the human review it lacked.

Verification

  • cargo build -p buzz-media -p buzz-relay — clean against the fork.
  • cargo test -p buzz-media — 42 passed (MinIO round-trip ignored, needs live MinIO).
  • cargo test -p buzz-relay — 429 passed, 0 failed, incl. the git-CAS unit tests (2 live-S3 probes env-gated behind BUZZ_GIT_S3_PROBE=1).
  • Live media + git S3 round-trip and a token-refresh survival check will be verified on-cluster after the image ships.
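
As an illustration of how live probes can be env-gated, here is a minimal sketch. Only the variable name `BUZZ_GIT_S3_PROBE` comes from this PR; the helper names and the exact matching rule (enabled only when the value is exactly `1`) are assumptions, and the real tests may gate differently.

```rust
use std::env;

/// Hypothetical helper: the probe is enabled only when the flag is exactly "1".
fn flag_is_set(value: Option<&str>) -> bool {
    matches!(value, Some("1"))
}

/// Read the gate from the environment (variable name taken from the PR text).
fn live_probe_enabled() -> bool {
    flag_is_set(env::var("BUZZ_GIT_S3_PROBE").ok().as_deref())
}

/// Run `probe` only when the gate is open.
/// Returns Ok(false) when skipped and Ok(true) when the probe ran and passed.
fn run_probe_if_enabled(probe: impl FnOnce() -> Result<(), String>) -> Result<bool, String> {
    if !live_probe_enabled() {
        return Ok(false);
    }
    probe()?;
    Ok(true)
}
```

With this shape, a default `cargo test` run skips the probe and still passes, while an on-cluster run with the variable set exercises the real S3 round-trip.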

The relay pod on bb-block authenticates to S3 via EKS Pod Identity
(AWS_CONTAINER_CREDENTIALS_FULL_URI + AWS_CONTAINER_AUTHORIZATION_TOKEN_FILE),
but aws-creds 0.39.1 (pulled transitively via rust-s3 0.37.2) only reads
the ECS RELATIVE_URI form and sends no Authorization header. Result:
"failed to initialize media storage: Could not get valid credentials",
blocking both S3 media and git CAS storage.

Pin aws-creds to tlongwell-block/rust-s3@c9fce362, which adopts the
aws-creds portion of durch/rust-s3#449: FULL_URI fallback, bearer token
from the token file/env, Authorization header, and a loopback allowlist
(169.254.170.23 / [fd00:ec2::23]) that only sends the token to the
documented Pod Identity agent addresses. The fix is refresh-safe — it
sets expiration from the response, so rust-s3's auto-refresh re-fetches
on token expiry rather than silently 403ing after ~6h.
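
To show why recording the expiration matters, here is a minimal sketch of an expiry check. It is not rust-s3's actual refresh logic: `CachedCredentials`, `needs_refresh`, and the skew parameter are hypothetical names used only for illustration.

```rust
use std::time::{Duration, SystemTime};

/// Hypothetical cached credential set.
struct CachedCredentials {
    access_key: String,
    /// None means the provider response carried no expiry we recorded.
    expiration: Option<SystemTime>,
}

/// Refresh when the credentials expire within `skew` of `now`.
/// Without a recorded expiration this sketch never refreshes, which is the
/// failure mode the fix avoids: stale credentials keep being used.
fn needs_refresh(creds: &CachedCredentials, now: SystemTime, skew: Duration) -> bool {
    match creds.expiration {
        Some(expiry) => now + skew >= expiry,
        None => false,
    }
}
```

Under this model, credentials fetched without an `expiration` would never trigger a re-fetch, which matches the "silently 403ing" symptom described above once the underlying token lapses.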

Scope is minimal: only aws-creds is redirected; rust-s3 stays on
crates.io 0.37.2, and neither the S3 call sites nor the git CAS logic
change. Temporary pin pending upstream merge of durch/rust-s3#449.

Verified: buzz-media (42 passed) and buzz-relay (429 passed, incl. the
git-CAS unit tests) full suites green against the fork. Live S3/MinIO
probes remain env-gated and are validated on-cluster post-deploy.

Co-authored-by: Tyler Longwell <tlongwell@block.xyz>
Signed-off-by: Tyler Longwell <tlongwell@block.xyz>
@tlongwell-block tlongwell-block merged commit 86d6388 into main Jul 1, 2026
29 checks passed
@tlongwell-block tlongwell-block deleted the fix/s3-pod-identity-creds branch July 1, 2026 02:27