Performance enhancement via matmul

Evaluation of scalars is essentially done via binary matmuls. This is a bit hard to see, but it is effectively what happens after the VMAP transformation.
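To make this easier to see, here is a minimal JAX sketch. It assumes that "binary matmul" means a matrix product over GF(2) (AND for multiply, XOR/parity for add); `eval_scalar` is a hypothetical stand-in for the per-scalar evaluation, not the project's actual function. Batching it with `jax.vmap` over both operands gives the same result as one matmul followed by a reduction mod 2.

```python
import jax
import jax.numpy as jnp


def eval_scalar(bits, x):
    # Hypothetical per-scalar evaluation: parity of the AND of two bit
    # vectors, i.e. a dot product over GF(2).
    return jnp.sum(bits & x) % 2


# Batch over the rows of A, then over the columns of X.
eval_rows = jax.vmap(eval_scalar, in_axes=(0, None))
eval_all = jax.vmap(eval_rows, in_axes=(None, 1), out_axes=1)

A = jnp.array([[1, 0, 1], [0, 1, 1]], dtype=jnp.int32)
X = jnp.array([[1, 1], [0, 1], [1, 0]], dtype=jnp.int32)

batched = eval_all(A, X)   # shape (2, 2), one scalar per (row, column) pair
as_matmul = (A @ X) % 2    # the same values written as a single matmul
```
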

Generally, kernels are not optimized for binary matmul, so one could consider using floating-point matmul instead. 
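A minimal sketch of the floating-point variant, again assuming the binary matmul is over GF(2). The helper name `binary_matmul_via_float` is illustrative only. The bits are cast to `float32`, multiplied with the regular matmul kernel, and reduced mod 2 afterwards. The per-entry counts stay exact in `float32` as long as the inner dimension is below 2**24.

```python
import jax
import jax.numpy as jnp


def binary_matmul_via_float(a_bits, b_bits):
    # Cast 0/1 entries to float32 so the standard matmul kernel is used.
    a = a_bits.astype(jnp.float32)
    b = b_bits.astype(jnp.float32)
    # Request full precision in case the backend's default matmul
    # precision is reduced.
    counts = jnp.matmul(a, b, precision=jax.lax.Precision.HIGHEST)
    # Each entry counts the matching 1-bits; its parity is the GF(2) result.
    return counts.astype(jnp.int32) % 2


A = jnp.array([[1, 0, 1], [0, 1, 1]], dtype=jnp.uint8)
X = jnp.array([[1, 1], [0, 1], [1, 0]], dtype=jnp.uint8)
result = binary_matmul_via_float(A, X)
```

Whether this is actually faster than an integer matmul depends on the backend and the matrix shapes, so it should be measured rather than assumed.
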

An additional strategy is bit-packing, which would also significantly save memory.
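A rough sketch of what bit-packing could look like, assuming GF(2) semantics. `pack_rows` and `packed_gf2_matmul` are hypothetical helper names. Packing eight bits into each `uint8` cuts memory by 8x compared with storing one bit per byte; the product is then computed with a bitwise AND followed by a population count.

```python
import jax
import jax.numpy as jnp


def pack_rows(bits):
    # bits: (n, k) array of 0/1 values -> (n, ceil(k / 8)) uint8 array.
    # Trailing bits in the last byte are zero-padded.
    return jnp.packbits(bits.astype(jnp.uint8), axis=-1)


def packed_gf2_matmul(a_packed, bt_packed):
    # a_packed: packed rows of A, shape (n, w).
    # bt_packed: packed rows of B transposed, shape (m, w).
    anded = a_packed[:, None, :] & bt_packed[None, :, :]   # (n, m, w)
    ones = jax.lax.population_count(anded).astype(jnp.int32)
    # Parity of the total number of shared 1-bits is the GF(2) dot product.
    return jnp.sum(ones, axis=-1) % 2


A = jnp.array([[1, 0, 1], [0, 1, 1]], dtype=jnp.uint8)
X = jnp.array([[1, 1], [0, 1], [1, 0]], dtype=jnp.uint8)

A_packed = pack_rows(A)       # shape (2, 1)
Xt_packed = pack_rows(X.T)    # shape (2, 1)
packed_result = packed_gf2_matmul(A_packed, Xt_packed)
```

Note that this version materializes an `(n, m, w)` intermediate, so for large matrices it would need tiling or a dedicated kernel; it is meant to show the idea, not to be fast.
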

Some profiling should definitely be done for this ticket. 
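As a possible starting point for that profiling, the sketch below times an integer matmul against the floating-point variant on random bits. The matrix size, repeat count, and function names are arbitrary assumptions; `block_until_ready()` is needed because JAX dispatches work asynchronously, and the warm-up call keeps compilation time out of the measurement.

```python
import time

import jax
import jax.numpy as jnp


def time_fn(fn, *args, repeats=10):
    fn(*args).block_until_ready()  # warm-up: excludes compile time
    start = time.perf_counter()
    for _ in range(repeats):
        fn(*args).block_until_ready()
    return (time.perf_counter() - start) / repeats


@jax.jit
def int_matmul(a, b):
    return (a @ b) % 2


@jax.jit
def float_matmul(a, b):
    counts = a.astype(jnp.float32) @ b.astype(jnp.float32)
    return counts.astype(jnp.int32) % 2


key_a, key_b = jax.random.split(jax.random.PRNGKey(0))
a = jax.random.bernoulli(key_a, 0.5, (256, 256)).astype(jnp.int32)
b = jax.random.bernoulli(key_b, 0.5, (256, 256)).astype(jnp.int32)

t_int = time_fn(int_matmul, a, b)
t_float = time_fn(float_matmul, a, b)
print(f"int32: {t_int * 1e3:.3f} ms, float32: {t_float * 1e3:.3f} ms")
```
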


Performance enhancement via matmul #31
