Skip to content

Output bucket id#1386

Open
giurgiur99 wants to merge 1 commit into
mainfrom
feature/compute-output-bucket
Open

Output bucket id#1386
giurgiur99 wants to merge 1 commit into
mainfrom
feature/compute-output-bucket

Conversation

@giurgiur99

@giurgiur99 giurgiur99 commented Jun 8, 2026

Copy link
Copy Markdown
Contributor

Fixes #1339

Changes proposed in this PR:

  • Adds optional outputBucketId to compute start (paid + free) — write job results straight into a persistent-storage bucket instead of outputs.tar
  • Docker C2D engine bind-mounts the bucket read-write at /data/outputs; results land as individual files (no archiving/upload step)
  • outputBucketId is a top-level plain string, mutually exclusive with output (the encrypted remote-storage path)
  • New validateOutputBucket: rejects when both are set (400), storage disabled (400), bad bucket id (400), or consumer not allowed on the bucket (403)
  • getDockerOutputMountObject added to the persistent-storage interface — implemented for LocalFS, throws "not implemented" for S3
  • Threaded through types, DB persistence (sqliteCompute), and both HTTP routes (/compute, /freeCompute)
  • Refactor: extracted failJobDataProvisioning to dedupe the fail-and-cleanup path
  • Docs updated (API.md, Storage.md, persistentStorage.md); integration tests cover bucket output + chaining/overwrite

@giurgiur99

Copy link
Copy Markdown
Contributor Author

/run-security-scan

@alexcos20 alexcos20 left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AI automated code review (Gemini 3).

Overall risk: low

Summary:
The PR successfully introduces the outputBucketId parameter to allow C2D jobs to write their results directly into a persistent storage bucket instead of an outputs.tar file. The changes are well-structured, include necessary updates to documentation and tests, and effectively integrate with the persistent storage factory. A minor security concern regarding world-writable permissions on the host filesystem is noted.

Comments:
• [WARNING][security] Changing the directory permission to 0o777 makes it globally readable, writable, and executable by any user on the host OS. While this is a common workaround for Docker volume mount permission issues, it introduces a security risk in shared hosting environments. Consider configuring the container to run with the host user's UID/GID, or use more restrictive permissions like 0o755 or 0o770 combined with proper group ownership.

-    await fsp.chmod(source, 0o777)
+    await fsp.chmod(source, 0o755) // Or handle ownership via chown

• [INFO][style] Excellent refactoring here. Extracting the repeated job failure and cleanup logic into the failJobDataProvisioning helper greatly improves maintainability and keeps the code DRY.
• [INFO][security] Good use of explicit access control checks (assertConsumerAllowedForBucket) to ensure that only authorized consumers can write to the specified output bucket. Returning an explicit HTTP 403 on failure is the correct behavior here.

@giurgiur99

Copy link
Copy Markdown
Contributor Author

AI automated code review (Gemini 3).

Overall risk: low

Summary: The PR successfully introduces the outputBucketId parameter to allow C2D jobs to write their results directly into a persistent storage bucket instead of an outputs.tar file. The changes are well-structured, include necessary updates to documentation and tests, and effectively integrate with the persistent storage factory. A minor security concern regarding world-writable permissions on the host filesystem is noted.

Comments: • [WARNING][security] Changing the directory permission to 0o777 makes it globally readable, writable, and executable by any user on the host OS. While this is a common workaround for Docker volume mount permission issues, it introduces a security risk in shared hosting environments. Consider configuring the container to run with the host user's UID/GID, or use more restrictive permissions like 0o755 or 0o770 combined with proper group ownership.

-    await fsp.chmod(source, 0o777)
+    await fsp.chmod(source, 0o755) // Or handle ownership via chown

• [INFO][style] Excellent refactoring here. Extracting the repeated job failure and cleanup logic into the failJobDataProvisioning helper greatly improves maintainability and keeps the code DRY. • [INFO][security] Good use of explicit access control checks (assertConsumerAllowedForBucket) to ensure that only authorized consumers can write to the specified output bucket. Returning an explicit HTTP 403 on failure is the correct behavior here.

source is already path-restricted, so we only give access to that particular bucket

@giurgiur99 giurgiur99 marked this pull request as ready for review June 9, 2026 11:04

@greptile-apps greptile-apps Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your free trial has ended. If you'd like to continue receiving code reviews, you can add a payment method here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

C2D output in bucket

2 participants