Skip to content

Conversation

@scxiao
Copy link

@scxiao scxiao commented Jan 20, 2026

This PR adds a heuristic to set the KWidth for attention. kWidth is set to 8 by default, but if there are not enough elements for each lane to load 8 elements, kWidth is ajdusted to 4.

Complete the following tasks before sending your PR, and replace [ ] with
[x] to indicate you have done them.

  • I am not making a trivial change, such as fixing a typo in a comment.

  • I have written a PR description following these
    rules.

  • I have run pre-commit run --from-ref origin/main --to-ref HEAD.

  • Select one of the following.

    • I have added tests.
      • /test for lit tests
      • /unittest for C++ tests
      • /python/test for end-to-end tests
    • This PR does not need a test because FILL THIS IN.
  • Select one of the following.

    • I have not added any lit tests.
    • The lit tests I have added follow these best practices,
      including the "tests should be minimal" section. (Usually running Python code
      and using the instructions it generates is not minimal.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants