-
-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. Weβll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix "skipped files" count calculation #10141
base: main
Are you sure you want to change the base?
Conversation
2933453
to
feeb8f5
Compare
feeb8f5
to
b29b3b6
Compare
@@ -51,6 +51,7 @@ class ModuleDescriptionDict(TypedDict): | |||
isarg: bool | |||
basepath: str | |||
basename: str | |||
isignored: bool |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was initially hoping to have all ignored files reporting go through this, but that
seems like it would be a larger refactor and out of scope.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you this look great. Agree that there is a large refactor to do it we want to have a verbose mode that make sense (display what will be skipped or not before the analysis and not after). I don't want to block bug fixes waiting for a refactor I don't have any bandwith for however.
This comment has been minimized.
This comment has been minimized.
Resolved the tests. I've decided not to walk the tree to find the precise number of skipped files since it can get Some of the failed tests as well as the ^ message above are both seemingly related to the os.walk |
Ah, I think astroid's |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the fix. I have one comment about how we store the result (also to not have to worry about storing too many unneeded str
in memory).
Tests themselves already look good so it is just about a little refactoring of the code :)
@@ -851,6 +852,7 @@ def _get_file_descr_from_stdin(self, filepath: str) -> Iterator[FileItem]: | |||
self.config.ignore_patterns, | |||
self.config.ignore_paths, | |||
): | |||
self.skipped_paths.add(filepath) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we instead store this as a an attribute on LinterStats
? Preferably just as an int
as we never actually need the file name. That would centralise all statistics counting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had that originally! I had two concerns that made me switch
- the ignore logic happens at a few different layers. From the way it's structured, there should be no overlap, but I was weary of double-counting
- in the next step (or potentially in this PR itself, depending on how you feel) I'd also like to use
get_module_files
to get an accurate file count rather than a βfile or module" count; it would also enable pylint to report precisely what got skipped, though I guess that could be reported as we iterate and ignore them
If you're cool with the first, I can switch it back and leave the second point to a future discussion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I think the first is fine. If it turns out we are double counting we would have an actual reproducer and know what the bug is. Let's not optimize for something that might not happen.
Just doing a quick confirmation with @Pierre-Sassoulas that we are fine with doing an initial PR where we just store this on LinterStats
. I don't want to you to change the code and then have another maintainer request the exact opposite.
As for the second change, I agree: let's do that in a follow up.
Type of Changes
Description
Closes #10073. This bug has existed ever since the logic was introduced in #9122. This was
the initial issue that inspired that PR, so we can see that the "skipped" count should obviously
have nothing to do with modules without docstrings.
Also enables more reports/output about skipped files in the future, which has been asked for.
I haven't implemented that since it would probably be a larger discussion as to how verbose
--verbose
should be.Testing
Fleshed out the test that was written when this was introduced that didn't actually test anything.
Ran this with several configs, including overlapping paths and modules. Additionally ran it on
our biggest repo with multiple packages, vendored code, etc, and got the expected results:
Before:
Checked 10xxx files, skipped 9391 files
(alarming!)After:
Checked 10xxx files, skipped 124 files
(whew, ok)