Add multi-file write support to the js and python sdks #451
|
I pushed an edit that should sketch a type cleanup. There are two unfinished parts:
const { path, writeOpts, writeFiles } = typeof pathOrFiles === 'string'
? { path: pathOrFiles, writeFiles: [{ data: dataOrOpts }], writeOpts: opts }
: { path: undefined, writeFiles: pathOrFiles, writeOpts: dataOrOpts }
const blobs = await Promise.all(writeFiles.map(f => new Response(f.data).blob()))
return files.length === 1 ? files[0] : files
Some edits I did:
Also, your previous code was mostly correct; this is more of an improvement. |
I also think the Let's keep it in the envd API though, because one of its uses is to ensure that the path people upload files to is fixed. |
@ValentaTomas I addressed your comments and added extra tests for some edge cases. |
One important thing to note, and I've added it as a code comment, is that we can't expect directories specified in the path of a multipart filename to be taken into consideration; I've tested it with and sure enough only the file name is used as the path; the rest of the path is stripped by the std lib's
|
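To make the behaviour described above concrete, here is a minimal Python sketch. It only mimics a server that keeps the final path component of a multipart filename; the helper name `stored_name` is hypothetical and this is not the envd code itself.

```python
import posixpath


def stored_name(multipart_filename: str) -> str:
    """Mimic a server that keeps only the last path component of a filename.

    Assumption: this mirrors the stripping described in the comment above;
    it is an illustration, not the actual server implementation.
    """
    return posixpath.basename(multipart_filename)


# Any directories in the filename are dropped before the file is stored.
print(stored_name("nested/dir/report.txt"))
print(stored_name("report.txt"))
```

Under that assumption, two uploads named `a/report.txt` and `b/report.txt` would collide on the same stored name, which is why the directory part cannot be relied on.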
Ok, this is a very good find — how do you think we should handle this? We want to be able to upload files with any path, but the stripping of paths might make sense to preserve, because it allows people to upload by link easily. This might require some changes to the envd. |
If it's really important not to break the current API spec, we could add a custom field to the multipart form data and look for it with some changes to |
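One way to picture the custom-field idea is the sketch below. The function `resolve_destination` and the `path_field` parameter are hypothetical names; the rule shown (an explicit field wins, otherwise fall back to the bare file name) is an assumption about how such a field could behave, not something the thread settles.

```python
import posixpath
from typing import Optional


def resolve_destination(filename: str, path_field: Optional[str] = None) -> str:
    """Pick where an uploaded part should be written.

    Hypothetical rule: if the client sent an explicit path field, use it;
    otherwise keep today's behaviour and use only the bare file name.
    """
    if path_field:
        return path_field
    return posixpath.basename(filename)


# Without the extra field, the directories in the filename are ignored.
print(resolve_destination("a/b/c.txt"))
# With the extra field, the full destination survives.
print(resolve_destination("c.txt", "/home/user/a/b/c.txt"))
```

Existing clients that never send the field would see no change under this rule, which is the point of keeping the API spec intact.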
@ValentaTomas I started by adding multi-file write support for
Considerations
I went the |
For the overload I think you can use the same system as we already have with https://github.com/e2b-dev/E2B/blob/beta/packages/python-sdk/e2b/sandbox_sync/process/main.py#L106 It should be the same thing, right? |
I also suggest naming and exporting all the types (from both SDKs). What do you say about having:
|
I'm thinking about what to do when you try to invoke write for multiple files and provide an empty array. Logically, you might want to notify the user that nothing was written, but throwing an error might not be optimal. If you are generating the field to write, you need to explicitly check if the array is empty; otherwise, you will get an error. In contrast to this, isn't writing 0 files a valid operation and you will also get an array with 0 results so everything is ok? |
Yeah, I'm thinking that we should probably do this, because people are already using the Beta SDK. |
From the perspective that there will likely be less control over which files, if any, are generated, your point makes sense. I will allow empty arrays. |
That's a good argument. I agree. I don't like dealing with arrays and checking the length as well |
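The conclusion reached here (an empty list is a valid write that yields an empty result) can be sketched as follows. The `WriteEntry` and `EntryInfo` names come from the thread, but their fields, the `write_files` name, and the in-memory `store` dictionary are assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class WriteEntry:
    path: str  # assumed field
    data: str  # assumed field


@dataclass
class EntryInfo:
    path: str  # assumed field


def write_files(files: List[WriteEntry], store: Dict[str, str]) -> List[EntryInfo]:
    """Write every entry into an in-memory stand-in for the sandbox filesystem.

    An empty list is not an error: the loop body never runs and the
    caller gets back an empty list of results.
    """
    results: List[EntryInfo] = []
    for entry in files:
        store[entry.path] = entry.data
        results.append(EntryInfo(path=entry.path))
    return results


store: Dict[str, str] = {}
print(write_files([], store))
print(write_files([WriteEntry("a.txt", "hello")], store))
```

Callers that generate the list of files dynamically then need no length check before calling.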
I also like the single upload method because it is something users already use a lot; requiring to always pass an array didn't sit well with me. For splitting the methods — I really don't want to end up with more methods for the same thing on the same level.
I think this is the better solution here than the |
For the multifile upload, you have to pass an array though, right? EDIT: If I understand it correctly, this is an argument for having the single file upload there, right |
It would be great if we adopted this approach in general. With TypeScript it's a thin line between using types to hide some abstraction and over-using types. |
I can maybe see us using the overloads in JS, but I really don't like this write signature in Python:
path_or_files: str | List[WriteEntry],
data_or_user: WriteData | Username = "user",
user_or_request_timeout: Optional[float | Username] = None,
It's not clear at all how to use these methods just from looking at the code. We should really aim to be as clear as possible just from the code. It raises many questions in my head: why are data and user mixed together? Why are user and request timeout mixed together? It's very unintuitive. |
Just to clear up a possible confusion here, because when I checked the start of the discussion it might not be clear:
This signature should never be seen by the user, because the method is
@overload
def write(
    self,
    path: str,
    data: WriteData,
    user: Username = "user",
    request_timeout: Optional[float] = None,
) -> EntryInfo: ...

@overload
def write(
    self,
    files: List[WriteEntry],
    user: Optional[Username] = "user",
    request_timeout: Optional[float] = None,
) -> List[EntryInfo]: ...
|
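For readers following the overload discussion, here is a self-contained sketch of how two `@overload` stubs can sit in front of one runtime implementation. It is a variation, not the PR's code: the field names, the in-memory `_store`, and the choice to make `user` keyword-only (which avoids the mixed positional parameters criticised above) are all assumptions. It needs Python 3.8+ for the positional-only `/` marker.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple, Union, overload


@dataclass
class WriteEntry:
    path: str  # assumed field
    data: str  # assumed field


@dataclass
class EntryInfo:
    path: str  # assumed field


class Filesystem:
    def __init__(self) -> None:
        # In-memory stand-in for the sandbox: path -> (user, data)
        self._store: Dict[str, Tuple[str, str]] = {}

    # The stubs are what type checkers and IDEs show to callers.
    @overload
    def write(self, path: str, data: str, /, *, user: str = "user") -> EntryInfo: ...
    @overload
    def write(self, files: List[WriteEntry], /, *, user: str = "user") -> List[EntryInfo]: ...

    # The single runtime implementation dispatches on the first argument.
    def write(
        self,
        path_or_files: Union[str, List[WriteEntry]],
        data: Optional[str] = None,
        /,
        *,
        user: str = "user",
    ) -> Union[EntryInfo, List[EntryInfo]]:
        if isinstance(path_or_files, str):
            if data is None:
                raise TypeError("data is required when writing a single path")
            self._store[path_or_files] = (user, data)
            return EntryInfo(path_or_files)
        results: List[EntryInfo] = []
        for entry in path_or_files:
            self._store[entry.path] = (user, entry.data)
            results.append(EntryInfo(entry.path))
        return results


fs = Filesystem()
print(fs.write("a.txt", "hi"))
print(fs.write([WriteEntry("b.txt", "1"), WriteEntry("c.txt", "2")], user="root"))
```

The implementation signature is still visible to anyone reading the source, which is the concern raised in the replies that follow.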
I see. I must have missed the I'll leave @jakubno to add any notes if he has anything additional. As I said, he has the most Python experience. |
It ain't true, users can see the source code for python libraries and I often go there |
The We had a lot of confusion around the users/permissions, so having this explicitly visible and modifiable seems better than hiding it again. EDIT: It is new in a sense that it is on all relevant methods in the Beta SDK. |
Well, I cannot argue with that. At that point isn't the usage clear? EDIT: What I wanted to say — are you worried about the overload implementation being confusing to people inspecting the code? |
Is this only a filesystem thing? Or is it on all methods now? My immediate thought would be to expect something like this on every relevant method.
I'm pretty sure this won't solve it. People will still be confused and won't know what to pass there. The solution is that users shouldn't need to mess with permissions by default. The filesystem should be accessible to the default user. I had very relevant feedback on this from our hackathon:
|
Would the pythonic That way a dev can use one or more args or w/o using arrays. |
I think it would be slightly problematic (more so in TS) because we also want to pass the |
plus don't we lose all the type hints? |
Yes, it is on all relevant methods. By default you don't need to mess with this, but when interacting with the filesystem, not having it exposed would prevent you from choosing which user an operation runs as. That's why I chose to do less magic here: the user you pass is used for the operation, there is a default of |
What are the use cases when you need it? It's a simple parameter but it opens a whole can of new questions like "how do I create a new user?" or "how do I switch to a different user when running a command?" that we'll need to provide answers for. That's why I'm not a big fan of this. It feels only half way there because it's missing all the technical content and utilities around it. |
The variadic argument could be typed, but generally, it should be the last argument. Otherwise, it won't work. |
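A short sketch of the typed variadic form mentioned above. The function name `write`, the `(path, data)` tuple shape, and the returned list of paths are assumptions made for illustration; the point is only that `*files` can carry a type and that any options after it become keyword-only.

```python
from typing import List, Optional, Tuple


def write(
    *files: Tuple[str, str],  # each item is an assumed (path, data) pair
    user: str = "user",
    request_timeout: Optional[float] = None,
) -> List[str]:
    """Accept zero or more files without requiring the caller to build a list.

    Everything declared after *files can only be passed by keyword, so
    options such as user cannot be confused with another file.
    """
    return [path for path, _data in files]


print(write(("a.txt", "1")))
print(write(("a.txt", "1"), ("b.txt", "2"), user="root"))
```

Type hints are kept in this form, but the return type is always a list, even for a single file.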
Sorry to get in between, but for the Browserbase SDK I was in a similar situation, we ended up with a |
Thanks for the input. Would you do it the same way again? I think I'm leaning increasingly more towards two separate methods. I'm not sure about having a third one to potentially handle both single and multi use case. |
Would do two methods. My thinking was (still is): I want a function to return X. If a function should return Y, I make a function that returns Y. A function that returns both X and Y is flaky |
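The "one function, one return type" argument can be sketched like this. The method names `write` and `write_files`, the field names, and the in-memory `_store` are hypothetical; the thread had not settled on naming at this point.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class WriteEntry:
    path: str  # assumed field
    data: str  # assumed field


@dataclass
class EntryInfo:
    path: str  # assumed field


class Sandbox:
    def __init__(self) -> None:
        # In-memory stand-in for the sandbox filesystem
        self._store: Dict[str, str] = {}

    def write(self, path: str, data: str) -> EntryInfo:
        """Single file in, single EntryInfo out."""
        self._store[path] = data
        return EntryInfo(path)

    def write_files(self, files: List[WriteEntry]) -> List[EntryInfo]:
        """List in, list out; delegates to write() for each entry."""
        return [self.write(f.path, f.data) for f in files]


sbx = Sandbox()
print(sbx.write("a.txt", "hi"))
print(sbx.write_files([WriteEntry("b.txt", "1"), WriteEntry("c.txt", "2")]))
```

Each signature is readable on its own, at the cost of a second method on the same level.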
@mishushakov Why did you end up with three in the end? |
Because it felt like if we have the other two why not have it also. But tbh., the whole thing was out of necessity, because I didn't want to start a browser every time I want to fetch a URL, |
Three sounds the worst :D Let's do the two separate if everybody agrees:
|
I agree about separate methods but not sure about the naming because there's already
Maybe feels a little weird? |
Alternative namespace solutions:
EDIT: I actually feel that maybe not having namespaces lowers the mental difficulty for me when using the SDK — you just type what you want to do as opposed to having to think in which namespace the method should be. EDIT 2: Also I would not even argue for overloaded |
Keeping a namespace or prefixing the method name could help disambiguate a bit, since that kind of I/O touches other Linux abstractions. |
I'm going with no namespaces. With the Stripe SDK it makes sense to have namespaces, as they have different "products" and the products have different methods available. We only have one "product" (or class), which is the sandbox, so we should only have sandbox methods at the top level, e.g. sandbox.writeFile. |
The question is then is this a single product API or a VM (with its components) API? |
Description
Filesystem.write method to accept multiple files
envd supports multipart with multiple files out of the box
Test