-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Possible improvements for sniff performance analysis #3784
Comments
I like it! And I think I like the option 2 most. |
I discussed this off-GitHub with @gsherwood and the current line of thinking is option 3, but then a PHPCS native version of that report, which would be the best of both worlds. |
Open question: I'm currently reporting the time in microseconds. Should it be seconds instead ? (to lower cognitive load) |
I read |
Actually it won't as in most cases, we're talking very small numbers, i.e. |
PR #3810 is now open to address this. |
Sorry I didn't comment sooner. This is excellent. |
Closing as replaced by PHPCSStandards/PHP_CodeSniffer#28 |
As a sniff author, it is not uncommon to receive reports about sniffs/standards being "slow".
To analyze which sniffs could be the culprit/which sniffs to focus on to improve performance, the
-vvv
command can be used, which at the very end of the debug logs shows a "SNIFF PROCESSING REPORT" with the sniffs which have been triggered during a run and the time each sniff took, ordered from slowest to fastest sniff.This information is super-useful, however to get a good impression of the performance of various sniffs, the best way to do so is to run the sniffs/standard against a large number of files, which makes the
-vvv
output completely unworkable as even for a simple run against 4 files, the output is easily over 60.000 lines, with a test run against 300 files, the output is well over 4 million lines...The current available verbosity levels and their output are defined as such:
-v
: file specific token and line counts with running timesvv
: the info from level 1 + verbose tokeniser output-vvv
: the info from level 1 + 2 + token processing output with sniff running timesNow there are a couple of ways I can think of to improve this situation and I'd like to hear if any of these could/would be acceptable.
1
(-v
) and2
(-vv
) levels, which would contain the information from level 1 + the sniff performance report.This would mean that
-vv
would now show the performance report, while to get the "old"-vv
report (+ performance info), one would now need to use-vvv
, and the "old"-vvv
would become-vvvv
.While this would be a BC-break, this type of verbosity is generally only ever used by sniff authors/PHPCS tokenizer fix contributors. So considering the very limited public this BC-break would hit, I hope this could be considered acceptable as an enhancement, which could go into a PHPCS 3.x minor.
Maybe using the
-r
for "(sniff) run time information" ?This would avoid the BC-break and would still make the information available in a more workable manner.
Performance
Report
type, possibly externally hosted, like in PHPCSDevTools.To allow for this, some changes would be needed to the
File::process()
method as it currently only records "time taken by each sniff" to the$this->listenerTimes
property ifPHP_CODESNIFFER_VERBOSITY > 2
, i.e. when the-vvv
option is enabled, which would run counter to the problem I'm trying to solve.It might also make life easier if, similar to the
File::getMetrics()
method, there would also be aFile::getListenerTimes()
method.It would still need some work-arounds as
$this->listenerTimes
is cumulative and only theReport::generateFileReport()
method has access to theFile
object, while the report would need to be generated in thegenerate()
method which doesn't have access to theFile
object, but that is doable.In my opinion, option 1 or 2 would be the preferred options as it would also allow standard maintainers to ask performance issue reporters to run the report, even when they can't share access to the (private) code base on which the performance issue was detected.
Having said that, option 3 would still be an improvement over the current situation.
Opinions ?
/cc @gsherwood @kukulich
P.S.: I'd, of course, be happy to make the changes, but would like some direction before spending significant time on this.
The text was updated successfully, but these errors were encountered: