[Wip] Add an implementation of ReadAsync to PartialInputStream #589

Draft
wants to merge 1 commit into
base: master
from

Conversation

Numpsy
Contributor

@Numpsy Numpsy commented Mar 1, 2021

Spin-off from the comments in #576, where it looked like PartialInputStream needs an implementation of ReadAsync so that async read calls against the streams returned from ZipFile.GetInputStream actually do async reads from the base stream.

It's WIP because I'm not sure what to do about the locks around baseStream_ in the existing Read function: C# does not allow await inside a lock block.

I assume the locks are an attempt to prevent issues if someone tries to read from multiple zip entries at the same time, and they all try to manipulate the base stream?

I certify that I own, and have sufficient rights to contribute, all source code and related material intended to be compiled or integrated with the source code for the SharpZipLib open source product (the "Contribution"). My Contribution is licensed under the MIT License.

@Numpsy Numpsy changed the title from "[WipAdd an implementation of ReadAsync to PartialInputStream" to "[Wip] Add an implementation of ReadAsync to PartialInputStream" Mar 1, 2021
@Numpsy Numpsy marked this pull request as draft March 1, 2021 15:41
@Numpsy
Contributor Author

Numpsy commented May 1, 2021

@piksel Do you have any thoughts on the importance of the locks used in the existing PartialInputStream read functions, or on the best way to approach them in any async implementation?

The common suggestion seems to be to use SemaphoreSlim instead of lock() in async functions, which would be doable, but then ZipFile itself might need to manage the semaphore. As ZipFile.DisposeInternal tries to lock the baseStream itself, that might have issues of its own (though whether locking on and disposing of a member variable in a function that's called from a finalizer is sensible is another question, refs #44).
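
To make the SemaphoreSlim idea concrete, here is a rough sketch of how the seek-then-read pair could be guarded in an async read. It is not the SharpZipLib code: the class name PartialStreamSketch, its fields, and the idea of passing in one shared semaphore per base stream are all assumptions made for illustration.

```csharp
using System;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for PartialInputStream; not the SharpZipLib source.
public sealed class PartialStreamSketch : Stream
{
    private readonly Stream baseStream;
    private readonly SemaphoreSlim gate; // assumed to be shared by every reader of baseStream
    private readonly long end;
    private long readPos;

    public PartialStreamSketch(Stream baseStream, SemaphoreSlim gate, long start, long length)
    {
        this.baseStream = baseStream;
        this.gate = gate;
        this.readPos = start;
        this.end = start + length;
    }

    public override async Task<int> ReadAsync(byte[] buffer, int offset, int count, CancellationToken cancellationToken)
    {
        // WaitAsync takes the place of lock(), which cannot contain an await.
        await gate.WaitAsync(cancellationToken).ConfigureAwait(false);
        try
        {
            count = (int)Math.Min(count, end - readPos);
            if (count <= 0) return 0;

            // Seek and read stay together as one unit while the semaphore is held.
            baseStream.Seek(readPos, SeekOrigin.Begin);
            int read = await baseStream.ReadAsync(buffer, offset, count, cancellationToken).ConfigureAwait(false);
            readPos += read;
            return read;
        }
        finally
        {
            gate.Release();
        }
    }

    public override int Read(byte[] buffer, int offset, int count)
    {
        // The synchronous path uses the same semaphore so both paths exclude each other.
        gate.Wait();
        try
        {
            count = (int)Math.Min(count, end - readPos);
            if (count <= 0) return 0;
            baseStream.Seek(readPos, SeekOrigin.Begin);
            int read = baseStream.Read(buffer, offset, count);
            readPos += read;
            return read;
        }
        finally
        {
            gate.Release();
        }
    }

    public override bool CanRead => true;
    public override bool CanSeek => false;
    public override bool CanWrite => false;
    public override long Length => throw new NotSupportedException();
    public override long Position
    {
        get => throw new NotSupportedException();
        set => throw new NotSupportedException();
    }
    public override void Flush() { }
    public override long Seek(long offset, SeekOrigin origin) => throw new NotSupportedException();
    public override void SetLength(long value) => throw new NotSupportedException();
    public override void Write(byte[] buffer, int offset, int count) => throw new NotSupportedException();
}
```

In this sketch the semaphore would be created as new SemaphoreSlim(1, 1) by whoever owns the base stream and handed to each partial stream, which is why the owner (ZipFile in the real library) would probably have to manage it.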

@piksel
Member

piksel commented May 2, 2021

I think using Stream.Synchronized(Stream) should be roughly equivalent to using SemaphoreSlim (preventing our code from altering the stream).

Now, the problem is that it would still not be restricting code outside our library from altering the base stream. At least I don't think so. Using lock should prevent all access (since the lock is on the reference to the stream itself), but there is a lot of guesswork going on here. 😓

@Numpsy
Contributor Author

Numpsy commented May 2, 2021

As in, create the partial input stream from a synchronized stream, rather than the base stream?

Would that work in the case of PartialInputStream.Read, which does a seek on the base stream followed by a read? As in, would that be equivalent to locking Seek and Read independently rather than both as a unit, so that more than one thread could potentially do a Seek while others have reads pending?
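
A small sketch of the distinction I mean, assuming the synchronized wrapper only serializes each individual call. SeekReadSketch and both method names are made up for illustration and are not part of the library.

```csharp
using System.IO;

// Hypothetical helpers, only to contrast the two locking granularities.
public static class SeekReadSketch
{
    // With a wrapper from Stream.Synchronized, each call is serialized on its own,
    // so another thread could Seek between this Seek and this Read.
    public static int ReadAtPerCall(Stream synchronizedStream, long position, byte[] buffer)
    {
        synchronizedStream.Seek(position, SeekOrigin.Begin);
        return synchronizedStream.Read(buffer, 0, buffer.Length);
    }

    // Holding one lock across both calls makes Seek and Read a single unit,
    // as long as every other reader locks on the same object.
    public static int ReadAtAsUnit(Stream baseStream, long position, byte[] buffer)
    {
        lock (baseStream)
        {
            baseStream.Seek(position, SeekOrigin.Begin);
            return baseStream.Read(buffer, 0, buffer.Length);
        }
    }
}
```

On a single thread both helpers return the same data; the difference only shows up when several threads share the stream.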

@Numpsy
Contributor Author

Numpsy commented May 2, 2021

I don't know if you can really prevent a caller who constructed a ZipFile from an arbitrary stream from changing that stream themselves outside of the library code. Even if lock did prevent that, the stream is only locked part of the time anyway, so there would still be times when it could change.

@piksel
Member

piksel commented May 3, 2021

Yeah, I just don't know what else this is meant to do? The original code is synchronous, so what was the point?

Also Stream.Synchronized doesn't help that much as it's just synchronizing calls to read/write etc. Replacing the lock with a SemaphoreSlim is probably the sanest thing to do.

@Numpsy
Contributor Author

Numpsy commented May 3, 2021

> Yeah, I just don't know what else this is meant to do? The original code is synchronous, so what was the point?

I guessed that it was to avoid issues if the user tries to read the content of multiple zip entries at the same time from different threads (given that nothing in the library itself should try to do anything like that).

The lock at

looks more odd to me - not sure if that would be doing anything other than trying to protect a caller from disposing the ZipFile while they're still reading from it?
