os.path.getsize very slow for Windows 11 #124126
Comments
Can you test with the same Python versions please? Also, could you avoid putting a loop inside the statement to test and rather use |
Thanks for the quick response @picnixz, I just updated the test results, now using exactly Python 3.12.5 :)
That was deliberate (I want to rule out the import time of |
You can rule out the import time by using
Generally, no, but I'm not sure whether the garbage collector could do something in between, hence the question. In my experience, Windows is generally slower at OS-related operations, so I'm not that shocked. Let's ask an expert on this topic: @zooba |
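For readers following the timing discussion, here is a minimal sketch of one way to keep import time out of a measurement and to avoid a hand-written loop inside the timed statement. It uses the `setup` and `number` arguments of `timeit.timeit`; this is not necessarily the exact option suggested in the comment above, and the scratch file and the call count of 10,000 are assumptions made for illustration.

```python
import os
import tempfile
import timeit

# Hypothetical scratch file, created only for this measurement.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"x" * 1024)
    path = tmp.name

NUMBER = 10_000  # assumed repetition count, not taken from the thread

try:
    # The import lives in `setup`, which timeit runs once and does not time.
    # Only the single statement in `stmt` is measured; timeit does the looping.
    seconds = timeit.timeit(
        stmt="os.path.getsize(path)",
        setup="import os",
        globals={"path": path},
        number=NUMBER,
    )
    per_call_us = seconds / NUMBER * 1e6
    print(f"{per_call_us:.2f} microseconds per call")
finally:
    os.remove(path)
```

Reporting the per-call time rather than the total makes runs with different `number` values easier to compare across machines.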
I just observed that |
Thanks a lot for the input, both script and result updated.
I thought Windows was just "slightly" slower, but I'm pretty surprised to see such a big gap. Also perhaps calling
    with open(file_path, "w", encoding="utf-8", newline="") as f:
        while os.path.getsize(file_path) < target_size:
            f.write(f"This is line number {line_number}\n")
In my case, it's much faster to guess a total number of lines and avoid using |
I'd be interested in knowing whether this is really |
Yep, solid idea! I assume it would be much faster to check the size of the bytes object than the file :) |
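A rough sketch of the idea discussed here: instead of asking the file system for the size on every iteration, keep a running count of the bytes written. The function name `fill_file` and the 10,000-byte target are hypothetical, chosen only for this example. Note one behavioural difference from the quoted loop: `os.path.getsize` sees only data already flushed to disk, while this counter tracks every byte handed to `write`.

```python
import os
import tempfile


def fill_file(path: str, target_size: int) -> int:
    """Write numbered lines until at least target_size bytes were written.

    Returns the number of lines written. (Hypothetical helper.)
    """
    written = 0
    line_number = 0
    with open(path, "w", encoding="utf-8", newline="") as f:
        while written < target_size:
            line = f"This is line number {line_number}\n"
            f.write(line)
            # Count encoded bytes, not characters, so the total
            # matches what ends up on disk.
            written += len(line.encode("utf-8"))
            line_number += 1
    return line_number


with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "demo.txt")
    lines = fill_file(demo_path, 10_000)
    # A single metadata call, made after the file is closed.
    final_size = os.path.getsize(demo_path)
    print(lines, final_size)
```

With `newline=""` no newline translation happens, so the in-memory count and the on-disk size should agree on every platform.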
Please try running on a Dev Drive to compare. It's not quite free of the issues that make Windows have slower I/O than Linux, but it's significantly better than using your default OS drive. I'd also be interested to know exactly which build of Windows you're running. One recent update includes a new API for getting file metadata that is implemented more like Linux (it doesn't require opening the file first, which Windows traditionally does). Python 3.12 should use the new API automatically, and some measurements have shown that it runs 3-4x faster than the old one. But overall, the slow file system is an OS issue, probably not a Python issue. To see a Python issue, you'll need to do native profiling of Python itself and show that we're somehow going through significantly more of our own code on one OS than another. Simple timings of OS operations are not really comparable in that way. |
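When reporting timings like these, it may help to include the exact OS build alongside the numbers. The sketch below collects that information with the standard library; the helper name `describe_environment` is hypothetical. It also checks that `os.path.getsize(path)` agrees with `os.stat(path).st_size`, which is what CPython's `getsize` returns, so any slowness is in the underlying stat call rather than in `getsize` itself.

```python
import os
import platform
import sys
import tempfile


def describe_environment() -> dict:
    """Collect version details worth attaching to a timing report.

    (Hypothetical helper, for illustration only.)
    """
    info = {
        "python": sys.version.split()[0],
        "system": platform.system(),
        "release": platform.release(),
        "version": platform.version(),
    }
    if sys.platform == "win32":
        # sys.getwindowsversion() exists only on Windows.
        info["windows_build"] = sys.getwindowsversion().build
    return info


info = describe_environment()
print(info)

# getsize is a thin wrapper over stat, so both report the same size.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"x" * 2048)
    stat_path = tmp.name
try:
    size_via_getsize = os.path.getsize(stat_path)
    size_via_stat = os.stat(stat_path).st_size
finally:
    os.remove(stat_path)
```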
Hi @zooba thanks a lot for the input and detailed explanation!
It's indeed much faster (17 sec vs 31 sec).
It should be
Fully understandable, thanks a lot for the input. But as an end user I don't quite know how to properly profile Python, so I opened this in case you can do something :) |
At least on Windows, the approach is to use Windows Performance Recorder to capture a trace and then Windows Performance Analyzer to attribute the CPU time to either one of Python's native modules (you won't get Python-specific information in there yet, but I'll be releasing a tool soon to help with that) or an OS module. It's quite a specialized job, I'll be honest! But there are people out there who know how to do it, and may also have the time and interest to see what's up (not me, right now).
This doesn't have the new API in it, so you're getting the Dev Drive accelerated time, but not the improved stat calls. I believe Insider builds should have it already. |
For reference, I just did |
Bug report
Bug description:
Summary
I noticed os.path.getsize runs much slower (38x) on Windows 11 than on Ubuntu 22.04-WSL2 (Windows 11 and Ubuntu 22.04-WSL2 running on the same physical machine and the same SSD, both tested while idle) and macOS Sonoma 14.6.1.
Test code
Test results
On Windows 11 (Version: 23H2, OS build: 22631.4169):
Windows 11 (dev drive):
On Ubuntu 22.04 WSL2:
On macOS 14.6:
CPython versions tested on:
3.12
Operating systems tested on:
Windows