performance: (~3x) optimize field multiplier and EC-adder for improved MSM and ECNTT #693
Describe the changes
fix: ECNTT for 5 < logn < 14 was single-threaded; it is now multi-threaded.
This PR uses inlining and loop unrolling to improve CPU performance for:
- ECNTT (multi-threaded), size=2^14, curve=BN254
- MSM (single-threaded), BLS12-377, logn=20
- MSM (multi-threaded), BLS12-377, logn=20
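For readers unfamiliar with the technique, here is a minimal sketch of what inlining and compile-time loop unrolling can look like for a multi-limb multiplier. It is not the code from this PR: `LIMBS`, `FORCE_INLINE`, `mul_add_carry`, `mul_wide_loop`, and `mul_wide_unrolled` are hypothetical names, the sketch covers only the schoolbook product without modular reduction, and it assumes a GCC/Clang toolchain (for `unsigned __int128`) with C++17 fold expressions.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <utility>

// Hypothetical helper macro: ask GCC/Clang to always inline, else fall back to `inline`.
#if defined(__GNUC__) || defined(__clang__)
#define FORCE_INLINE inline __attribute__((always_inline))
#else
#define FORCE_INLINE inline
#endif

constexpr std::size_t LIMBS = 4;  // assumed: 4 x 64-bit limbs (256-bit operands)
using Limbs = std::array<std::uint64_t, LIMBS>;
using Wide = std::array<std::uint64_t, 2 * LIMBS>;

// Computes a*b + acc + carry; writes the low 64 bits to `lo`, returns the high 64 bits.
// The sum cannot overflow 128 bits: (2^64-1)^2 + 2*(2^64-1) = 2^128 - 1.
FORCE_INLINE std::uint64_t mul_add_carry(std::uint64_t a, std::uint64_t b,
                                         std::uint64_t acc, std::uint64_t carry,
                                         std::uint64_t& lo) {
  unsigned __int128 t = static_cast<unsigned __int128>(a) * b + acc + carry;
  lo = static_cast<std::uint64_t>(t);
  return static_cast<std::uint64_t>(t >> 64);
}

// Baseline: schoolbook multiplication with ordinary runtime loops.
inline Wide mul_wide_loop(const Limbs& a, const Limbs& b) {
  Wide r{};
  for (std::size_t i = 0; i < LIMBS; ++i) {
    std::uint64_t carry = 0;
    for (std::size_t j = 0; j < LIMBS; ++j) {
      carry = mul_add_carry(a[i], b[j], r[i + j], carry, r[i + j]);
    }
    r[i + LIMBS] = carry;
  }
  return r;
}

// Unrolled inner loop: the fold expression expands to one call per J at compile time.
template <std::size_t I, std::size_t... J>
FORCE_INLINE void mul_row(const Limbs& a, const Limbs& b, Wide& r,
                          std::index_sequence<J...>) {
  std::uint64_t carry = 0;
  ((carry = mul_add_carry(a[I], b[J], r[I + J], carry, r[I + J])), ...);
  r[I + LIMBS] = carry;
}

// Unrolled outer loop: one mul_row instantiation per I.
template <std::size_t... I>
FORCE_INLINE void mul_rows(const Limbs& a, const Limbs& b, Wide& r,
                           std::index_sequence<I...>) {
  (mul_row<I>(a, b, r, std::make_index_sequence<LIMBS>{}), ...);
}

// Same result as mul_wide_loop, with no loop counters or branches left in the body.
inline Wide mul_wide_unrolled(const Limbs& a, const Limbs& b) {
  Wide r{};
  mul_rows(a, b, r, std::make_index_sequence<LIMBS>{});
  return r;
}
```

Both functions compute the same 512-bit product. The unrolled variant just hands the compiler straight-line code. Whether that is faster depends on the compiler and target, so it should be benchmarked rather than assumed.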
CONCLUSION: ICICLE MSM is bound by the manager thread. This is why the single-threaded case shows a 3x improvement, while in the multi-threaded case a Mac (8-10 cores) improves by 1.8x and an i9 (32 cores) shows no improvement. @LeonHibnik @mickeyasa @Koren-Brand
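One way to reason about this conclusion is an Amdahl-style toy model. Assume the wall time is a serial manager-thread cost plus worker arithmetic divided across the cores. The function `modeled_speedup` and its parameters are hypothetical and are not measured from ICICLE; the sketch only illustrates why speeding up the workers matters less as the core count grows.

```cpp
// Toy model (assumption, not a measurement):
//   wall_time = manager_time + worker_time / cores
// `worker_gain` is the factor by which the worker arithmetic gets faster.
inline double modeled_speedup(double manager_time, double worker_time,
                              unsigned cores, double worker_gain) {
  const double before = manager_time + worker_time / cores;
  const double after = manager_time + (worker_time / worker_gain) / cores;
  return before / after;
}
```

With no manager cost, the modeled speedup equals `worker_gain`. With a fixed non-zero manager cost, it shrinks toward 1 as `cores` increases, which matches the trend reported above.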