Cache Simulator Speedup Improvements: Elements per Iteration assumption? #138
Comments
@cod3monk do you have some time to discuss speed improvements / approximations?
Sorry for the delay. We have a meeting with cod3monk next week and will come back to you afterwards.
Hi @lukastruemper, sorry for the long delay and thanks for getting in touch with us. The warmup-phase should be limited by
@TomTheBear and @JanLJL will be in touch with you on how to proceed.
All right, thanks for the detailed reply! I'll talk to them :)
Hello,
I am applying your nice tool to typical stencil applications and I am observing very long simulation runtimes on high-dimensional stencils (several orders of magnitude longer than execution time). Most of the time is spent in the "warmup phase" and I am wondering about this:
kerncraft/kerncraft/cacheprediction.py
Line 563 in b5a302d
Does it assume that only one element is loaded/stored to the cache per iteration? On higher-dimensional stencils, I easily read 100-1000 elements per iteration.
So could something like this be used instead of element_size:
kerncraft/kerncraft/cacheprediction.py
Line 548 in b5a302d
, but estimated from the number of elements read per iteration? If this introduces some inaccuracy, would the result still be reasonably accurate?
I would have researched this in the related publications, but I couldn't find those details.
Thanks in advance!
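
To make the question concrete, here is a minimal sketch of how a warmup length could scale with the number of elements touched per iteration. It is not kerncraft's actual code: `warmup_iterations` and its parameters are hypothetical names, and the model assumes that every element touched in an iteration is new to the cache (no reuse between iterations).

```python
import math


def warmup_iterations(cache_size_bytes, element_size_bytes, elements_per_iteration=1):
    """Estimate the loop iterations needed to touch one cache's worth of data.

    Hypothetical helper for illustration only; not part of kerncraft.
    Assumes no reuse between iterations, so the result is a lower bound
    on the iterations actually needed to fill the cache.
    """
    if cache_size_bytes <= 0 or element_size_bytes <= 0 or elements_per_iteration <= 0:
        raise ValueError("all arguments must be positive")
    # Bytes of (assumed new) data brought in by a single iteration.
    bytes_per_iteration = element_size_bytes * elements_per_iteration
    return math.ceil(cache_size_bytes / bytes_per_iteration)


if __name__ == "__main__":
    cache = 32 * 1024 * 1024  # example: a 32 MiB cache
    # One 8-byte element per iteration.
    print(warmup_iterations(cache, 8))
    # 100 distinct 8-byte elements per iteration.
    print(warmup_iterations(cache, 8, 100))
```

Under this model the warmup shrinks by roughly the number of elements per iteration. Since neighbouring stencil iterations usually reuse most of their elements, the count of elements read per iteration probably overstates the new data per iteration, so an estimate based on it may end the warmup too early.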