[fs] Support AWS S3 credentials provider mode #3540
@fresh-borzoni could you please help review this when you have time? Thanks!
fresh-borzoni
left a comment
@litiliu Thank you for the PR, looks good overall, couple of minor comments, PTAL
throw new IllegalArgumentException(
        "AssumeRole and a custom AWS credentials provider cannot be configured together.");
}
this.credentialProviderList =
The description says configuring DynamicTemporaryAWSCredentialsProvider for server mode is "rejected", but there's no guard, so it just gets instantiated here and throws NoAwsCredentialsException at the first token request. Can you clarify?
AWSCredentialsProvider createStsCredentialsProvider() {
    if (credentialProviderList != null) {
        AWSCredentials credentials = credentialProviderList.getCredentials();
        checkArgument(
This is effectively the "long-term creds only" gate, and it's the thing people will trip on: instance profiles and IRSA roles all return session creds, so they land here lazily, at token time.
Given s3.md already has an IRSA section, can we add a short note there for this new mode: long-term creds only, not compatible with AssumeRole? Otherwise it's a bit hidden for operational usage
import com.amazonaws.services.securitytoken.model.AssumeRoleResult;
import com.amazonaws.services.securitytoken.model.Credentials;
import com.amazonaws.services.securitytoken.model.GetSessionTokenResult;
import org.apache.commons.lang3.StringUtils;
commons-lang3 is only transitive here, shall we use org.apache.commons.lang3.StringUtils?
import org.apache.fluss.fs.s3.token.S3DelegationTokenProvider;
import org.apache.fluss.fs.s3.token.S3DelegationTokenReceiver;

import org.apache.commons.lang3.StringUtils;
        (accessKey == null) == (secretKey == null),
        "S3 access key and secret key must both be set or both be unset.");
if (accessKey == null) {
    if (hasCredentialProvider && roleArn != null) {
Since we're adding all these checks in this PR - this one fails fast, but the session/empty-cred cases only fail on the first token request.
Could we check those at construction too, so they're consistent? Not blocking though
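To make the suggestion concrete, here is a minimal sketch of what an eager check could look like. It is not the PR's code: `EagerCredentialCheckSketch`, `Creds`, and `validateLongTerm` are hypothetical names, and `Creds` stands in for the AWS SDK credential types so the example stays self-contained. It assumes that a non-null session token is what distinguishes session credentials from long-term ones.

```java
import java.util.function.Supplier;

class EagerCredentialCheckSketch {

    /** Hypothetical stand-in for resolved credentials; sessionToken is null for long-term keys. */
    static final class Creds {
        final String accessKey;
        final String secretKey;
        final String sessionToken;

        Creds(String accessKey, String secretKey, String sessionToken) {
            this.accessKey = accessKey;
            this.secretKey = secretKey;
            this.sessionToken = sessionToken;
        }
    }

    /** Resolves the credentials once and rejects empty or session credentials up front. */
    static void validateLongTerm(Supplier<Creds> provider) {
        Creds creds = provider.get();
        if (creds == null || isBlank(creds.accessKey) || isBlank(creds.secretKey)) {
            throw new IllegalArgumentException(
                    "The configured provider returned empty credentials.");
        }
        if (creds.sessionToken != null) {
            throw new IllegalArgumentException(
                    "The configured provider returned session credentials; "
                            + "long-term credentials are required.");
        }
    }

    static boolean isBlank(String s) {
        return s == null || s.trim().isEmpty();
    }

    public static void main(String[] args) {
        // Long-term keys pass silently.
        validateLongTerm(() -> new Creds("AKIAEXAMPLE", "secret", null));
        // Session credentials are rejected at construction time instead of at token time.
        try {
            validateLongTerm(() -> new Creds("ASIAEXAMPLE", "secret", "token"));
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

One trade-off to weigh: resolving credentials in the constructor may trigger file or network access at startup, which the current lazy path avoids.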
Purpose
Closes #3493.
This PR adds a server-side AWS S3 credentials provider mode for the S3 filesystem. When fs.s3a.aws.credentials.provider is explicitly configured in Fluss configuration, Fluss treats it as the authoritative server-side credential source instead of injecting the client delegated-token provider.
This is intended for deployments that use standard AWS SDK/Hadoop S3A providers such as com.amazonaws.auth.profile.ProfileCredentialsProvider, so rotated long-term credentials can be picked up by the provider without restarting Fluss servers.
The full motivation and design discussion are in #3493. This PR description keeps the reviewer-facing summary and the final implemented behavior.
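As a rough illustration of the behavior summarized above, the sketch below picks a credential mode from a plain map of Fluss options. The class, enum, and method names are hypothetical, and treating a blank value as "not configured" is an assumption. The actual PR resolves this through Hadoop configuration and an internal marker, which this sketch does not reproduce.

```java
import java.util.HashMap;
import java.util.Map;

class CredentialModeSketch {

    /** Hypothetical names for the two modes described in the summary. */
    enum Mode { SERVER_PROVIDER, DELEGATED_TOKEN }

    static final String PROVIDER_KEY = "fs.s3a.aws.credentials.provider";

    /** An explicitly configured provider wins; otherwise the delegated-token path is used. */
    static Mode resolve(Map<String, String> flussConf) {
        String provider = flussConf.get(PROVIDER_KEY);
        boolean explicit = provider != null && !provider.trim().isEmpty();
        return explicit ? Mode.SERVER_PROVIDER : Mode.DELEGATED_TOKEN;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        System.out.println(resolve(conf)); // no provider configured
        conf.put(PROVIDER_KEY, "com.amazonaws.auth.profile.ProfileCredentialsProvider");
        System.out.println(resolve(conf)); // provider explicitly configured
    }
}
```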
Brief change log
s3.aws.credentials.provider
s3a.aws.credentials.provider
fs.s3a.aws.credentials.provider
S3DelegationTokenProvider can distinguish an explicit Fluss provider from Hadoop default resources.
DynamicTemporaryAWSCredentialsProvider;
fs.s3a.assumed.role.arn;
Credential mode resolution:
Tests
mvn -pl fluss-filesystems/fluss-fs-s3 test -Dtest=S3FileSystemPluginTest,S3DelegationTokenProviderTest
Added/updated coverage for:
DynamicTemporaryAWSCredentialsProvider;
DynamicTemporaryAWSCredentialsProvider for server mode is rejected;
API and Format
No public API or storage format changes.
The PR adds an internal Hadoop configuration marker under
fluss.fs.s3.aws.credentials.provider.explicitly.configured. It is not a user-facing option; it only carries whether the provider was explicitly configured through Fluss config.
Documentation
No separate documentation update in this PR. The user-facing behavior and operational motivation are described in #3493.
Generative AI disclosure
AGENTS.md guidance.