Opt Backend Assembly #1370

jeremylt · 2023-10-10T15:03:07Z

The /cpu/self/opt/* backends should implement their own versions of diagonal and full assembly that assemble element by element. Most of the pieces already exist in the code, but they are spread out.

Current:

Assemble QFunction (all elements at once)
for (elem in l-vec) {
  Assemble Operator element
}

New:

for (elem in l-vec) {
  Assemble QFunction element
  Assemble Operator element
}

This is very similar to our approach to operator application, except that we would probably want to keep the block size set to 1 for simplicity. Then we can set /cpu/self/opt/serial as the operator fallback for /cpu/self/opt/blocked.

This would hopefully significantly decrease the assembly memory footprint (and speed things up) for the Opt, AVX, and XSMM backends.
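
To make the memory argument concrete, here is a rough, self-contained C sketch of the two loop structures above. It does not use the libCEED API: `assemble_qf_element`, `assemble_op_element`, `assemble_two_pass`, `assemble_by_element`, and the fixed `Q` are hypothetical stand-ins invented for illustration, and the "operator" is just a sum over quadrature data. Each driver returns the number of scratch doubles it needed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Quadrature points per element (hypothetical, fixed for this sketch). */
enum { Q = 4 };

/* Hypothetical stand-in for QFunction assembly on one element: fills qdata[0..Q). */
static void assemble_qf_element(size_t elem, double *qdata) {
  for (size_t q = 0; q < Q; q++) qdata[q] = (double)(elem + 1) * 0.25;
}

/* Hypothetical stand-in for operator assembly on one element:
   reduces the element's quadrature data to a single diagonal entry. */
static double assemble_op_element(const double *qdata) {
  double d = 0.0;
  for (size_t q = 0; q < Q; q++) d += qdata[q];
  return d;
}

/* "Current": assemble the QFunction for every element first, then loop over
   elements. Scratch storage grows with num_elem. Returns the number of scratch
   doubles used, or 0 if the allocation fails. */
size_t assemble_two_pass(size_t num_elem, double *diag) {
  assert(diag != NULL);
  double *qdata = malloc(num_elem * Q * sizeof(double));
  if (!qdata) return 0;
  for (size_t e = 0; e < num_elem; e++) assemble_qf_element(e, &qdata[e * Q]);
  for (size_t e = 0; e < num_elem; e++) diag[e] = assemble_op_element(&qdata[e * Q]);
  free(qdata);
  return num_elem * Q;
}

/* "New": assemble QFunction and operator together, one element at a time
   (block size 1). Scratch storage is a single element's worth. */
size_t assemble_by_element(size_t num_elem, double *diag) {
  assert(diag != NULL);
  double qdata[Q];
  for (size_t e = 0; e < num_elem; e++) {
    assemble_qf_element(e, qdata);
    diag[e] = assemble_op_element(qdata);
  }
  return Q;
}
```

Both drivers produce the same `diag`; only the scratch size differs (`num_elem * Q` doubles versus `Q`). In the real backends the per-element data is larger and the savings would depend on the operator, so treat this only as an illustration of the loop fusion.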

jeremylt added enhancement performance labels Oct 10, 2023
