Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support s3 requester pays #76

Open
kylebarron opened this issue Nov 6, 2024 · 5 comments · May be fixed by #81
Open

Support s3 requester pays #76

kylebarron opened this issue Nov 6, 2024 · 5 comments · May be fixed by #81
Assignees

Comments

@kylebarron
Copy link
Member

image https://docs.aws.amazon.com/AmazonS3/latest/userguide/ObjectsinRequesterPaysBuckets.html
@kylebarron
Copy link
Member Author

From discord:

Hi everyone, if anyone is familiar with the object_store implementation in arrow-rs I had a question; is there a way to pass in arbitrary AWS (specifically S3) options via HTTP headers? For example, I don't see implicit support for requester pays buckets, which is hinted at by passing in this header value pair: x-amz-request-payer : requester so I'm wondering if there's a way to add in our own options like that.

Not on a per-request basis, as it sort of breaks the abstraction of being any object store, but you could look at overriding the headers using ClientOptions when you build the AmazonS3 client

@kylebarron
Copy link
Member Author

I guess we have to set this header manually in ClientOptions::with_default_headers

@kylebarron kylebarron changed the title Document s3 requester pays Support s3 requester pays Nov 6, 2024
@kylebarron
Copy link
Member Author

So we need to set this in ClientOptions. But right now the only supported client options are those that we can pass in as strings. I.e. whatever gets passed to ClientOptions::with_config. But ClientConfigKey doesn't have an option for default_headers. So it's not currently possible to pass this in.

Instead, we should make ClientOptions a Python class, with its own constructor (not just created from a dict on the Rust side). Then a user can manually construct a ClientOptions class, and then pass that into an S3Store Python class to ensure we're enabling requester pays.

Or we could just add a requester_pays option to the S3Store constructor methods. That would probably be simpler for users.

We should probably do both of these two options.

@alukach alukach self-assigned this Nov 6, 2024
@alukach
Copy link
Member

alukach commented Nov 8, 2024

Instead, we should make ClientOptions a Python class, with its own constructor (not just created from a dict on the Rust side). Then a user can manually construct a ClientOptions class, and then pass that into an S3Store Python class to ensure we're enabling requester pays.

@kylebarron Any strong reason that you suggest a traditional Python class rather than a TypedDict? For something like an all-optional body of options, a TypedDict may make sense and fit in well with what we already do with things like GetOptions

@kylebarron
Copy link
Member Author

It's already a TypedDict essentially. It's already an all-optional body of options. (and we'll continue to allow that) But not everything can be strictly modeled as a dict of strings. We'd need it to be a pyclass to support class methods.

Here with_default_headers is a classmethod on the Rust side.

For easiest maintenance, I think the dict options should be passed directly into the underlying object_store implementation, so to support any other options on the ClientOptions rust struct it would need to be a Python pyclass.

@kylebarron kylebarron linked a pull request Nov 11, 2024 that will close this issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants