Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Walk directories in sorted order for reproducibility #517

Merged
merged 10 commits into from
Jan 3, 2025

Conversation

dalcinl
Copy link
Contributor

@dalcinl dalcinl commented Nov 10, 2024

Implement a new utility function wrapping os.walk with the following modifications:

  • recurse directories in sorted order
  • recurse top-level *.dist-info/ directories last
  • list filenames in sorted order
  • list top-level *.dist-info/RECORD files last

* recurse directories in sorted order
* recurse top-level *.dist-info/ directories last
* list filenames in sorted order
* list top-level *.dist-info/RECORD files last
@dalcinl
Copy link
Contributor Author

dalcinl commented Nov 10, 2024

There is an additional nit that would be trivial to address in this PR. Currently, auditwheel outputs zip files using the usual convention of storing directories as an entry of zero bytes with name ending in "/". This is exactly what the zip utility does in Linux, but it is not what the wheel package does (and causes trouble with namespace packages).

What we could do (with a one-line change in this PR) is to still store subdirectories, but only if they are empty. Again, this is not what zip or python -m zipfile do. However, it is closer to what wheel does, albeit without the issue related to namespace packages. There is a minor backward incompatibility in the sense that the output zip files would no longer have all directory entries, but only those directories that are empty. However, upon unzipping, there would be no difference in the output directory tree. BTW, the wheel file spec says nothing about empty directories.

@dalcinl
Copy link
Contributor Author

dalcinl commented Nov 10, 2024

This PR covers #507 and provides additional features (.dist-info directories last, .dist-info/RECORD files last)

Copy link

codecov bot commented Dec 8, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 92.47%. Comparing base (2e00860) to head (457be20).
Report is 1 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #517      +/-   ##
==========================================
+ Coverage   92.35%   92.47%   +0.12%     
==========================================
  Files          20       20              
  Lines        1268     1289      +21     
  Branches      244      250       +6     
==========================================
+ Hits         1171     1192      +21     
  Misses         56       56              
  Partials       41       41              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@mayeut
Copy link
Member

mayeut commented Jan 3, 2025

Removing entries for non-empty folders also removes the permissions attribute for those. I'm gonna revert this for now.

@dalcinl
Copy link
Contributor Author

dalcinl commented Jan 3, 2025

@mayeut Maybe you want to keep the test for empty folders?

@mayeut mayeut merged commit 5fac207 into pypa:main Jan 3, 2025
11 checks passed
@mayeut
Copy link
Member

mayeut commented Jan 3, 2025

thanks @dalcinl

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants