[TEST] flashinfer version upgrade to v0.2.0 #2054
base: main
Conversation
```python
class WrapperDispatch(Enum):
    SLIDING_WINDOW = auto()
    CROSS_ATTENTION = auto()
```
```python
def _grouped_size_compiled_for_decode_kernels(
```
With FlashInfer v0.2, Tensor Core will be enabled by default for decoding, so this is no longer necessary.
I was looking at this issue: flashinfer-ai/flashinfer#549. I can try running without this function to ensure that it works as intended.
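For context, here is a hedged sketch of the gate in question and the v0.2-style decode wrapper that makes it unnecessary; the heuristic body, workspace size, and layout below are illustrative assumptions, not this PR's exact code:

```python
import torch
import flashinfer

# Assumed pre-v0.2 heuristic: only a few GQA group sizes had compiled
# decode kernels, so other shapes fell back to the prefill kernel path.
def _grouped_size_compiled_for_decode_kernels(num_qo_heads: int, num_kv_heads: int) -> bool:
    return (num_qo_heads // num_kv_heads) in (1, 2, 4, 8)

# With FlashInfer v0.2, the decode wrapper can run on Tensor Cores itself,
# so the group-size gate above can be dropped.
workspace = torch.empty(128 * 1024 * 1024, dtype=torch.uint8, device="cuda")
decode_wrapper = flashinfer.BatchDecodeWithPagedKVCacheWrapper(
    workspace, "NHD", use_tensor_cores=True
)
```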
```diff
@@ -7,6 +7,8 @@
 Each backend supports two operators: extend (i.e. prefill with cached prefix) and decode.
 """
+
+#
```
Please delete this.
```diff
@@ -612,6 +619,8 @@ def call_begin_forward(
         self.num_qo_heads,
         self.num_kv_heads,
         self.head_dim,
+        q_data_type=self.q_data_type,
```
QQ Why do you specify the data type here?
I am trying to sync with @yzh119 on this; I was seeing a dtype mismatch in the BatchPrefillWithRaggedKVCacheWrapper plan function otherwise.
I've fixed it on the hopper branch. cc @yzh119, I'll land it soon.
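For reference, a minimal sketch of pinning the query dtype in the plan call; the shapes, indptr values, and dtype below are illustrative assumptions:

```python
import torch
import flashinfer

workspace = torch.empty(128 * 1024 * 1024, dtype=torch.uint8, device="cuda")
wrapper = flashinfer.BatchPrefillWithRaggedKVCacheWrapper(workspace, "NHD")

# Two requests with 4 and 5 tokens each (illustrative values).
qo_indptr = torch.tensor([0, 4, 9], dtype=torch.int32, device="cuda")
kv_indptr = torch.tensor([0, 4, 9], dtype=torch.int32, device="cuda")

# Passing q_data_type keeps the planned kernels consistent with the
# runtime query tensors; otherwise plan() may assume a default dtype
# and the mismatch only surfaces at run time.
wrapper.plan(
    qo_indptr,
    kv_indptr,
    num_qo_heads=16,
    num_kv_heads=16,
    head_dim=128,
    q_data_type=torch.float16,
)
```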
scripts/ci_install_dependency.sh (outdated)
```diff
@@ -5,4 +5,8 @@ Install the dependency in CI.
 pip install --upgrade pip
 pip install -e "python[all]"
 pip install transformers==4.45.2 sentence_transformers accelerate peft
-pip install flashinfer -i https://flashinfer.ai/whl/cu121/torch2.4/ --force-reinstall
+git clone https://github.com/flashinfer-ai/flashinfer.git --recursive
+git reset --hard 32d9510d67187f1f3a379cce81302cdd15a557d2 # Revert to before PR https://github.com/flashinfer-ai/flashinfer/pull/609 merged
```
We can use this for testing, not for release.
Yes, this makes sense. This PR mainly tests FlashInfer built from source for #2016, which is currently failing all tests; we wanted to find out where the issue stems from.
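For completeness, a from-source install along these lines would also need to enter the clone before resetting, then build the package; the cd, submodule, and build steps below are assumptions about the elided rest of the script:

```bash
git clone https://github.com/flashinfer-ai/flashinfer.git --recursive
cd flashinfer
# Pin to the commit before flashinfer-ai/flashinfer#609 merged.
git reset --hard 32d9510d67187f1f3a379cce81302cdd15a557d2
git submodule update --init --recursive
# Assumed build step: at this point the Python project lived under python/.
cd python
pip install -e .
```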
Force-pushed the branch: 180a79b → e58161e, a0938ab → fbb8844, fbb8844 → 3815058.
@james-p-xu If you want to use the latest version of FlashInfer, you can refer to https://github.com/flashinfer-ai/flashinfer-nightly/releases
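For example, one could install a nightly build by grabbing a wheel asset from that releases page; the tag and wheel name below are placeholders, not real artifact names:

```bash
# Placeholders: substitute a real tag and a wheel matching your
# Python/CUDA/torch versions from the releases page.
pip install https://github.com/flashinfer-ai/flashinfer-nightly/releases/download/<TAG>/<WHEEL>.whl
```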
Force-pushed the branch: 7b04976 → b93675b.
…) with QKV dtypes
Force-pushed the branch: da5ccdb → 6655727.
Motivation
Modifications
Checklist