[minor] Improved `repeat_kv` eager perf #251

awgu · 2024-06-25T13:46:51Z

This PR improves the eager-mode performance of `repeat_kv`:

- In forward, this saves 4 `aten::slice` calls.
- In backward, this saves 4 `aten::fill` and 4 `aten::copy_` kernels with shape `(bs, seq_len, n_kv_heads, head_dim)`.

See pytorch/torchtitan#418 for details and traces.
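For readers without the linked traces at hand, here is a minimal sketch of the kind of change that would produce these savings. It assumes the original `repeat_kv` inserted the repeat dimension with `None`-indexing (`x[:, :, :, None, :]`), where each of the four `:` dispatches an `aten::slice` in eager mode, and that the replacement uses `torch.unsqueeze(x, dim=3)` instead. The function names `repeat_kv_indexing` and `repeat_kv_unsqueeze` are hypothetical and exist only for this comparison; the actual diff may differ.

```python
import torch


def repeat_kv_indexing(x: torch.Tensor, n_rep: int) -> torch.Tensor:
    """Assumed pre-change form: add the repeat dim via None-indexing."""
    bs, slen, n_kv_heads, head_dim = x.shape
    if n_rep == 1:
        return x
    return (
        x[:, :, :, None, :]  # four full slices plus one unsqueeze
        .expand(bs, slen, n_kv_heads, n_rep, head_dim)
        .reshape(bs, slen, n_kv_heads * n_rep, head_dim)
    )


def repeat_kv_unsqueeze(x: torch.Tensor, n_rep: int) -> torch.Tensor:
    """Assumed post-change form: add the repeat dim with a single unsqueeze."""
    bs, slen, n_kv_heads, head_dim = x.shape
    if n_rep == 1:
        return x
    return (
        torch.unsqueeze(x, dim=3)  # no slicing ops recorded for autograd
        .expand(bs, slen, n_kv_heads, n_rep, head_dim)
        .reshape(bs, slen, n_kv_heads * n_rep, head_dim)
    )


# Both variants should match torch.repeat_interleave along the head dim.
x = torch.randn(2, 3, 4, 5)
expected = torch.repeat_interleave(x, repeats=3, dim=2)
assert torch.equal(repeat_kv_indexing(x, 3), expected)
assert torch.equal(repeat_kv_unsqueeze(x, 3), expected)
```

The two variants return identical values and gradients; under the assumption above, the only difference is how many ops eager mode dispatches and records for backward.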

Since this repo hosts inference (forward-only) code, the change has little practical impact here. However, the repo also hosts the reference model implementation that others build on, including for training, so the change could still be helpful to them.

[minor] Improved repeat_kv eager perf

89435bd

facebook-github-bot added the `CLA Signed` label Jun 25, 2024
