Cache Simulator Speedup Improvements: Elements per Iteration assumption? #138

lukastruemper · 2022-02-27T11:00:33Z

Hello,

I am applying your nice tool to typical stencil applications and am seeing very long simulation runtimes on high-dimensional stencils (several orders of magnitude longer than the execution time). Most of the time is spent in the "warmup phase", and I am wondering about this line:

kerncraft/kerncraft/cacheprediction.py

Line 563 in b5a302d

warmup_increment = ceildiv(max_cache_size // element_size, max_steps // 2)

Does it assume that only one element is loaded/stored to the cache per iteration? On higher-dimensional stencils, I easily read 100-1000 elements per iteration.
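
To make the question concrete, here is a small sketch of the arithmetic with made-up numbers. The cache size, element size, step cap and per-iteration traffic below are all assumptions for illustration, and `ceildiv` is a local helper that is only assumed to behave like the one used in the line above.

```python
def ceildiv(a, b):
    """Ceiling integer division (local helper, assumed semantics)."""
    return -(-a // b)

# Illustrative values only; none of these come from kerncraft or a real machine file.
max_cache_size = 32 * 1024 * 1024          # bytes, e.g. a 32 MiB last-level cache
element_size = 8                           # bytes per double
max_steps = 100                            # assumed cap on warmup steps
bytes_per_iteration = 200 * element_size   # e.g. a stencil touching 200 doubles per iteration

# Increment as computed in the quoted line: sized as if one element fills the cache per iteration.
increment_by_element = ceildiv(max_cache_size // element_size, max_steps // 2)

# Hypothetical alternative: size the increment by the bytes touched per iteration.
increment_by_bytes = ceildiv(max_cache_size // bytes_per_iteration, max_steps // 2)

print(increment_by_element, increment_by_bytes)
```

With these assumed values the first increment is 83887 iterations and the second is 420, which is the gap the question is about.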

So could something like this be used instead of element_size:

kerncraft/kerncraft/cacheprediction.py

Line 548 in b5a302d

sympy.Integer(self.kernel.bytes_per_iteration))

, but computed from the number of elements read per iteration? If this introduces some inaccuracy, would the result still be reasonably accurate?

I tried to look this up in the related publications, but I couldn't find these details.

Thanks in advance!

lukastruemper · 2022-03-04T09:28:45Z

@cod3monk do you have some time to discuss speed improvements / approximations?

TomTheBear · 2022-03-09T13:43:43Z

Sorry for the delay. We have a meeting with cod3monk next week and will get back to you afterwards.

cod3monk · 2022-03-14T12:48:37Z

Hi @lukastruemper,

sorry for the long delay and thanks for getting in touch with us.

The warmup phase should be limited by the condition invalid_entries > 0 (invalid_entries becomes 0 once the cache is fully initialized, which indirectly takes the number of accessed elements into account). You can try decreasing warmup_increment, but I wouldn't expect much improvement. More likely targets for improvement are the functions called subsequently:

RRZE-HPC/pycachesim's backend.Cache.loadstore function and
Kerncraft's Kernel.compile_global_offsets

kerncraft/kerncraft/kernel.py

Line 535 in b5a302d

def compile_global_offsets(self, iteration=0, spacing=0):
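
As a rough illustration of the termination condition described above, here is a toy model of a warmup loop. It is not kerncraft's or pycachesim's code: the function name, its parameters, and the assumption that every iteration touches only previously untouched cache lines are all hypothetical simplifications.

```python
def warmup_iterations(cache_lines, lines_per_iteration, warmup_increment):
    """Toy model: kernel iterations simulated until no cache line is invalid.

    Assumes each iteration fills `lines_per_iteration` new lines, which ignores
    the reuse that real stencils have.
    """
    invalid_entries = cache_lines
    iterations = 0
    while invalid_entries > 0:
        # Simulate one warmup step of `warmup_increment` kernel iterations.
        iterations += warmup_increment
        filled = warmup_increment * lines_per_iteration
        invalid_entries = max(0, invalid_entries - filled)
    return iterations

# One line per iteration versus 100 lines per iteration, same step size.
print(warmup_iterations(1000, 1, 10))
print(warmup_iterations(1000, 100, 10))
```

In this model the loop stops as soon as the cache is full, so a kernel that touches more lines per iteration needs fewer simulated iterations regardless of the step size. The step size only determines how far the loop can overshoot that point. The model says nothing about the cost of each simulated access, which is where the functions named above come in.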

@TomTheBear and @JanLJL will be in touch with you about how to proceed.

lukastruemper · 2022-03-15T09:39:41Z

All right, thanks for the detailed reply! I'll talk to them :)
