Minor Optimization for the Generic Dominator Calculation Algorithm #1424
The generic dominator calculation algorithm sets the initial `todo` set to be all the nodes. This leads to redundant calculations that can be avoided.

Explanation:
The algorithm's main loop checks whether the dominators of the current node have changed by intersecting the dominator sets of all its previous nodes. Since the initial dominator set of every node is the entire node set, except for the head node (whose dominator set is `set([head])`), this intersection initially returns the entire node set for every node except the next nodes of the head node. Therefore, we can avoid these wasted cycles by setting the initial `todo` set to contain only them.

Note - I used the terms previous and next nodes instead of predecessors and successors, since this function supports both dominators and postdominators. I initially didn't notice the postdominator support and implemented an additional optimization that was correct only for dominator calculation. Luckily, the regression tests caught this.
Benchmarks:
When adding a log of the number of such "misses" and running the `graph.py` test suite, these are the results before this PR:

And after:
Not so dramatic - but I guess it might be beneficial for analysis of huge CFGs... Let's test it for large CFGs as well. I used this tester program which creates a pretty linear CFG of 20K nodes:
Here are the results before this PR:
And after:
About 3 times faster, nice...