-
Notifications
You must be signed in to change notification settings - Fork 895
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Threaded FS IO probably should be async now? #4159
Comments
IMO the code we have for this looks very complicated/scary. Rewriting it looks like a project, would probably be hard to know if we bring over all the desired properties. |
Ah, I do remember the days when installing Rust took a long time to finish: #1876 (cc the original author, if I'm not mistaken, @rbtcollins) |
Yah, so the underlying thing here is that rustup does tasks that look very much like offensive action to heuristic detectors that are wired into the syscall path on windows; and it also does a large number of operations that have data dependencies. And we have to run in very resource constrained environments, but also unpack very large files. For the first case, more details:
So for this case what we do is quite easy: we make sure that CloseHandle is in a thread of its own and continue with other IO while that takes place. tokio's io threads would be entirely suitable for this. Second case:Creating directories is not free, particularly when RUSTUP_HOME is on NFS (note that that bug was closed with a workaround, but we continued to improve the IO code so that it is genuinely fixed today, even with docs being installed). Most files are very small. To unpack efficiently when mkdir is not instant, we want to decompress as many files as will fit in RAM, and write them to disk as soon as the directory they belong in is created. We don't want to create or assert directory existence more than once per directory. Just waiting for the directory to be created leads to stalls where rustup does nothing and excessive wall clock times. tar files are not required to be topologically ordered: there is no guarantee that the file path 'a/b' will be preceeded by a directory 'a'. Well behaved compressors will do this of course, but malicious tars can also mess with things by violating this expectation, and if I recall correctly actually old rust artifacts fail this too. Most extractors - and rustup does this - end up taking a pragmatic approach and just implicitly creating directories when e.g. third case:Raspberry PI machines have less memory than the llvm lib that Rust ships, which is 500+ MB in size. To process that in memory as one chunk we would need to stream in the compressed archive, and stream out the output, which works completely fine - but runs into the aforementioned performance issues. However, Raspberry PI machines don't run Windows (case 1), and are typically local developer setups, so not (case 2). Thus on Raspberry PI accepting the pipeline stalls and not using the code that addresses case 1 or case 2 makes a lot of sense. We do have a memory-capped buffer : we have a few different sizes of IO buffer, and store content to be written in an IO buffer for later dispatch. When there are no buffers available (e.g. because the sum of IO buffers exceeds the supplied or inferred size limit, then we stop decompressing the tar and wait for an IO buffer to be available. memory pressureOur minimum footprint is then:
The only one of those that can be tuned is the IO buffer total size. The heuristic we use is actual physical memory - a guess that we took. I think it entirely likely that decompression memory is why we keep seeing issues here :- I think using a minimal-compression-window archive and including the maximum memory footprint of the decompressor in the memory accounting layer is the most realistic way to make rPi installation troublefree + fast. Alternatively, defaulting to single-threaded non-pipelined logic for rPi would also mitigate most of the problem by never buffering file content. Much of this is also discussed in this talk I gave. |
We are still using an async-unaware threadpool here, and it's known to produce problems (#3125):
rustup/src/diskio/threaded.rs
Lines 1 to 6 in fa4ae32
Anyway, using
RUSTUP_IO_THREADS=1
to limit concurrency feels a bit off. Will migrating toasync
help?The text was updated successfully, but these errors were encountered: