os.path.getsize very slow for Windows 11 #124126

Open
DanielYang59 opened this issue Sep 16, 2024 · 11 comments
Labels
OS-windows, performance (Performance or resource usage), type-bug (An unexpected behavior, bug, or error)

Comments

@DanielYang59

DanielYang59 commented Sep 16, 2024

Bug report

Bug description:

Summary

I noticed that os.path.getsize runs much slower (about 38x) on Windows 11 than on Ubuntu 22.04 under WSL2 or on macOS Sonoma 14.6.1. Windows 11 and Ubuntu 22.04 (WSL2) were running on the same physical machine and the same SSD, and both were tested while idle.

Test code

import timeit
import platform


with open("test_file.txt", mode="w", encoding="utf-8", newline="") as file:
    for i in range(10):
        file.write(f"This is line {i}\n")

execution_time = timeit.timeit(stmt='os.path.getsize("test_file.txt")', number=1_000_000, setup="import os")

os_info = platform.system()
kernel_info = platform.release()
python_version = platform.python_version()

print(f"Execution time: {execution_time:.6f} seconds")
print(f"Operating System: {os_info}")
print(f"Kernel Version: {kernel_info}")
print(f"Python Version: {python_version}")

Test results

On Windows 11 (Version: 23H2, OS build: 22631.4169):

Execution time: 30.922192 seconds
Operating System: Windows
Kernel Version: 11
Python Version: 3.12.5

On Windows 11 (Dev Drive):

Execution time: 17.214313 seconds
Operating System: Windows
Kernel Version: 11
Python Version: 3.12.5

On Ubuntu 22.04 WSL2:

Execution time: 0.844529 seconds
Operating System: Linux
Kernel Version: 5.15.153.1-microsoft-standard-WSL2
Python Version: 3.12.5

On macOS 14.6:

Execution time: 0.811347 seconds
Operating System: Darwin
Kernel Version: 23.6.0
Python Version: 3.12.5

CPython versions tested on:

3.12

Operating systems tested on:

Windows

@DanielYang59 DanielYang59 added the type-bug An unexpected behavior, bug, or error label Sep 16, 2024
@picnixz
Contributor

picnixz commented Sep 16, 2024

Can you test with the same Python versions please? Also, could you avoid putting a loop inside the statement to test and rather use number=10000 for instance?

@picnixz picnixz added the pending The issue will be closed if no feedback is provided label Sep 16, 2024
@DanielYang59
Author

Thanks for the quick response @picnixz , I just updated the test results with Python 3.12.5 exactly :)

> Also, could you avoid putting a loop inside the statement to test and rather use number=10000 for instance?

That was deliberate (I want to rule out the import time of os). Is there any pitfall in having a loop inside the test statement?

@picnixz
Contributor

picnixz commented Sep 16, 2024

> want to rule out the import time of os

You can rule out the import time by using setup='import os'.

> is there any pitfall for having a loop inside the test statement

Generally, no, but I'm not sure whether the garbage collector could do something in between, hence the question.


From my experience, Windows is generally slower when doing OS-related operations, so I'm not that shocked. Let's ask an expert on this topic: @zooba

@picnixz picnixz added performance Performance or resource usage OS-windows and removed pending The issue will be closed if no feedback is provided labels Sep 16, 2024
@picnixz
Contributor

picnixz commented Sep 16, 2024

I just observed that os.path.getsize simply calls os.stat and returns the st_size field of the result. So the problem (if any) is the slowness of os.stat.
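A minimal sketch for checking this locally: it confirms that os.path.getsize and os.stat(...).st_size report the same value and times both calls with the same iteration count. The helper name compare_getsize_and_stat, the temporary file and the iteration count are all hypothetical choices made for illustration, and no particular timings are implied.

```python
import os
import tempfile
import timeit


def compare_getsize_and_stat(path, number=10_000):
    """Return (size_via_getsize, size_via_stat, getsize_seconds, stat_seconds)."""
    size_via_getsize = os.path.getsize(path)
    size_via_stat = os.stat(path).st_size
    # Time both calls with the same iteration count so they are comparable.
    getsize_seconds = timeit.timeit(lambda: os.path.getsize(path), number=number)
    stat_seconds = timeit.timeit(lambda: os.stat(path).st_size, number=number)
    return size_via_getsize, size_via_stat, getsize_seconds, stat_seconds


# Create a small throwaway file so the sketch is self-contained.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"x" * 123)
    demo_path = tmp.name

try:
    result = compare_getsize_and_stat(demo_path)
    print(result)
finally:
    os.remove(demo_path)
```

If the two timings are close on a given machine, that suggests the cost sits in the underlying stat call rather than in the thin os.path.getsize wrapper.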

@DanielYang59
Author

DanielYang59 commented Sep 16, 2024

> You can rule out the import time by using setup='import os'.

Thanks a lot for the input, both script and result updated.

> From my experience, Windows is generally slower when doing OS-related operations, so I'm not that shocked.

I thought Windows was only "slightly" slower, so I'm pretty surprised to see such a big gap.

Also, calling getsize a million times is perhaps a rare use case. In my case I was just trying to create a test file of a specific size, and found the following code taking forever on my Windows machine:

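line_number = 0
with open(file_path, "w", encoding="utf-8", newline="") as f:
    # Slow on Windows: every iteration performs an os.stat call via getsize.
    while os.path.getsize(file_path) < target_size:
        f.write(f"This is line number {line_number}\n")
        line_number += 1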

In my case, it's much faster to guess a total number of lines and avoid using getsize after writing each line :)
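One way to avoid guessing the line count is to keep a running total of the bytes written, so the loop never has to ask the OS for the file size. This is only a sketch: the helper name write_lines_until and the demo file name are hypothetical, and it assumes UTF-8 output with newline="" so that the bytes on disk match the encoded line lengths.

```python
import os
import tempfile


def write_lines_until(path, target_size):
    """Write numbered lines until at least target_size bytes have been written."""
    written = 0
    line_number = 0
    with open(path, "w", encoding="utf-8", newline="") as f:
        while written < target_size:
            line = f"This is line number {line_number}\n"
            f.write(line)
            # Count the encoded bytes ourselves instead of calling getsize.
            written += len(line.encode("utf-8"))
            line_number += 1
    return written


with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "test_file.txt")
    written_bytes = write_lines_until(demo_path, 1000)
    # A single getsize call after closing the file is enough to verify.
    final_size = os.path.getsize(demo_path)
    print(written_bytes, final_size)
```

The file ends up at or slightly above the target, because the last line is always written in full.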

@picnixz
Contributor

picnixz commented Sep 16, 2024

I'd be interested in knowing whether this is really os.stat that is 30x slower on Windows or not. For your specific use case, create a bytes object of the number of bytes you want, fill it with whatever you want and write it in binary mode and you should have a file of exact size.
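A minimal sketch of that suggestion, assuming the whole payload fits comfortably in memory: build a bytes object of exactly the requested length and write it in binary mode. The helper name write_file_of_size, the filler pattern and the demo file name are hypothetical.

```python
import os
import tempfile


def write_file_of_size(path, target_size, pattern=b"This is a line\n"):
    """Write exactly target_size bytes to path by repeating pattern."""
    if not pattern:
        raise ValueError("pattern must not be empty")
    # Repeat the pattern enough times, then trim to the exact length.
    repeats = target_size // len(pattern) + 1
    payload = (pattern * repeats)[:target_size]
    with open(path, "wb") as f:
        f.write(payload)
    return len(payload)


with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "exact_size.bin")
    payload_length = write_file_of_size(demo_path, 4096)
    exact_size = os.path.getsize(demo_path)
    print(payload_length, exact_size)
```

For very large targets, writing the pattern in fixed-size chunks would avoid holding the full payload in memory at once.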

@DanielYang59
Author

DanielYang59 commented Sep 16, 2024

> For your specific use case, create a bytes object of the number of bytes you want

Yep, solid idea! I assume it would be much faster to check the size of the bytes object than the file :)

@zooba
Member

zooba commented Sep 16, 2024

Please try running on a Dev Drive to compare. It's not quite free of the issues that make Windows have slower I/O than Linux, but it's significantly better than using your default OS drive.

I'd also be interested to know exactly which build of Windows you're running. One recent update includes a new API for getting file metadata that is implemented more like Linux (it doesn't require opening the file first, which Windows traditionally does). Python 3.12 should use the new API automatically, and some measurements have shown that it runs 3-4x faster than the old one.

But overall, the slow file system is an OS issue, probably not a Python issue. To see a Python issue, you'll need to do native profiling of Python itself and show that we're somehow going through significantly more of our own code on one OS than another. Simple timings of OS operations are not really comparable in that way.

@DanielYang59
Author

Hi @zooba thanks a lot for the input and detailed explanation!

> Please try running on a Dev Drive to compare.

It's indeed much faster (17 sec vs 31 sec).

> I'd also be interested to know exactly which build of Windows you're running.

It should be Version: 23H2, OS build: 22631.4169.

> But overall, the slow file system is an OS issue, probably not a Python issue. To see a Python issue, you'll need to do native profiling of Python itself and show that we're somehow going through significantly more of our own code on one OS than another. Simple timings of OS operations are not really comparable in that way.

Fully understandable, thanks a lot for the input. But as an end user I don't quite know how to properly profile Python, so I opened this in case you can do something :)

@zooba
Member

zooba commented Sep 18, 2024

> But as an end user I don't quite know how to properly profiling Python, so opened this in case you can do something :)

At least on Windows, the approach is to use Windows Performance Recorder to capture a trace and then Windows Performance Analyzer to attribute the CPU time to either one of Python's native modules (you won't get Python-specific information in there yet, but I'll be releasing a tool soon to help with that) or an OS module.

It's quite a specialized job, I'll be honest! But there are people out there who know how to do it, and may also have the time and interest to see what's up (not me, right now).

> It should be Version: 23H2, OS build: 22631.4169.

This doesn't have the new API in it, so you're getting the Dev Drive accelerated time, but not the improved stat calls. I believe Insider builds should have it already.

@zooba
Member

zooba commented Sep 18, 2024

For reference, I just did python3.12 -m timeit -n 100 -s "import os" "sum(os.path.getsize(s) for s in os.scandir(r'C:\Windows\System32'))" on a 22631 build and an unreleased 26100 build (both with Store install of 3.12.6) and got 178ms vs 57.8ms. So the new API should provide 2-3x speedup on this operation, and that should stack on top of the Dev Drive benefit (though I suspect part of the benefit is from bypassing the same drivers that Dev Drives disable, so it may not be a straight (1.5-2x) x (2-3x) = (3-6x) calculation).
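For anyone who wants to repeat this kind of measurement without depending on C:\Windows\System32, here is a portable sketch that runs the same scandir-plus-getsize sum over a temporary directory. The helper names, the file count and the repeat count are hypothetical. It also times entry.stat().st_size as a comparison point, on the assumption (taken from the os.scandir documentation) that DirEntry.stat() can often reuse information from the directory listing on Windows; that variant is not something measured in this thread.

```python
import os
import tempfile
import timeit


def total_size_via_getsize(directory):
    """Sum file sizes with one os.path.getsize (os.stat) call per entry."""
    with os.scandir(directory) as entries:
        return sum(os.path.getsize(entry) for entry in entries)


def total_size_via_direntry(directory):
    """Sum file sizes using the stat result attached to each DirEntry."""
    with os.scandir(directory) as entries:
        return sum(entry.stat().st_size for entry in entries)


with tempfile.TemporaryDirectory() as tmp_dir:
    # Populate the directory with 50 files of 10 bytes each.
    for index in range(50):
        with open(os.path.join(tmp_dir, f"file_{index}.bin"), "wb") as f:
            f.write(b"0123456789")

    getsize_total = total_size_via_getsize(tmp_dir)
    direntry_total = total_size_via_direntry(tmp_dir)
    getsize_seconds = timeit.timeit(lambda: total_size_via_getsize(tmp_dir), number=20)
    direntry_seconds = timeit.timeit(lambda: total_size_via_direntry(tmp_dir), number=20)
    print(getsize_total, direntry_total, getsize_seconds, direntry_seconds)
```

Results will depend heavily on the drive, the Windows build and any file system filter drivers in play, so the numbers are only meaningful when compared on the same machine.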
