
Add THD format support for Context Parallel #641

Merged (3 commits, May 13, 2024)

Conversation

@kunlunl (Contributor) commented on Jan 30, 2024:

Make Context Parallel support the THD format. Currently only the Flash Attention backend is supported, as Fused Attention does not yet support the THD format.
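For readers unfamiliar with the layout names: BSHD keeps a padded (batch, seq, heads, dim) tensor, while THD packs all sequences of a batch along a single "total tokens" dimension and uses flash-attn-style cumulative sequence lengths to mark the boundaries. A minimal sketch of the packing (all sizes here are hypothetical, not taken from this PR):

```python
from itertools import accumulate

# Hypothetical sizes for illustration only; not taken from the PR.
num_heads, head_dim = 2, 4
seq_lens = [3, 5, 2]   # three variable-length sequences in one batch

# BSHD would pad every sequence to max(seq_lens) and keep a batch dim;
# THD concatenates the sequences along one "total tokens" dimension.
total_tokens = sum(seq_lens)          # 10 tokens, no padding
thd_shape = (total_tokens, num_heads, head_dim)

# cu_seqlens records each sequence's start offset, flash-attn style,
# so attention kernels can recover the sequence boundaries.
cu_seqlens = [0] + list(accumulate(seq_lens))

print(thd_shape)   # (10, 2, 4)
print(cu_seqlens)  # [0, 3, 8, 10]
```

The absence of padding is what makes THD attractive for variable-length batches, but it also means every kernel must index through `cu_seqlens` rather than a fixed stride.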

@timmoon10 timmoon10 requested a review from cyanguwa February 8, 2024 20:39
@kunlunl force-pushed the add_thd_for_cp branch 2 times, most recently from 514b9b6 to afd7fe1 on March 14, 2024
@kunlunl (Contributor, Author) commented on Mar 14, 2024:

Added some custom CUDA kernels to replace the PyTorch native ops, so that THD matches BSHD performance when using context parallel.
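To see why THD needs custom kernels here: context parallel for causal attention commonly splits every sequence into 2 × cp_size chunks and assigns rank r the chunks r and (2 × cp_size − 1 − r) to balance work. With THD packing, gathering a rank's tokens means many small, per-sequence slices if done with native ops; fusing them into one indexed copy (and, in this PR, into dedicated CUDA kernels) avoids that overhead. A rough pure-Python sketch of the indexing, with illustrative parameters not taken from the PR:

```python
from itertools import accumulate

# Illustrative parameters, not from the PR's actual kernels.
cp_size, rank = 2, 0
seq_lens = [8, 4]                     # each divisible by 2 * cp_size
cu_seqlens = [0] + list(accumulate(seq_lens))
thd = list(range(cu_seqlens[-1]))     # stand-in for a packed (T, H, D) tensor

# For causal load balancing, each sequence is split into 2 * cp_size
# chunks; rank r keeps chunks r and (2 * cp_size - 1 - r), so every
# rank sees a comparable mix of early and late tokens.
def rank_token_indices(cu_seqlens, cp_size, rank):
    idx = []
    for start, end in zip(cu_seqlens[:-1], cu_seqlens[1:]):
        chunk = (end - start) // (2 * cp_size)
        for c in (rank, 2 * cp_size - 1 - rank):
            idx.extend(range(start + c * chunk, start + (c + 1) * chunk))
    return idx

# One gathered copy over all sequences is the access pattern worth fusing,
# instead of a Python loop of small per-sequence slice/copy ops.
local = [thd[i] for i in rank_token_indices(cu_seqlens, cp_size, rank)]
print(local)   # [0, 1, 6, 7, 8, 11]
```

In BSHD the same gather is a fixed-stride slice over the batch dimension, which native ops handle well; THD's per-sequence offsets are what make a fused kernel pay off.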

Resolved review threads on:
tests/pytorch/fused_attn/run_fused_attn_with_cp.py
transformer_engine/pytorch/csrc/extensions.h (3 threads)
@xrennvidia (Collaborator) commented:

Hi @kunlunl, it looks much better. I left some simple comments; I will let @cyanguwa review the CUDA code more carefully. Thanks.

@kunlunl force-pushed the add_thd_for_cp branch 2 times, most recently from dea2a00 to 43add29 on April 22, 2024
@xrennvidia (Collaborator) left a review:

LGTM. Thanks!

@cyanguwa (Collaborator) commented:

/te-ci pytorch

@cyanguwa (Collaborator) commented:

@kunlunl thanks for the PR. Could you fix the DCO and lint errors, please? Instructions are in the Details link. Thanks.

@cyanguwa (Collaborator) left a review:

LGTM. Pending context parallel CI.

@cyanguwa (Collaborator) commented on May 3, 2024:

@kunlunl could you also please update the pytest version to 7.2 in qa/L1_pytorch_context_parallel_test/test.sh? Thanks.

@kunlunl force-pushed the add_thd_for_cp branch 2 times, most recently from 044b028 to 80686b7 on May 6, 2024
@kunlunl (Contributor, Author) commented on May 6, 2024:

> thanks for the PR. Could you fix the DCO and lint errors please? Instructions are in the Details link. Thanks.

> could you also please update the pytest version to 7.2 in qa/L1_pytorch_context_parallel_test/test.sh? Thanks.

@cyanguwa Both are done.

@cyanguwa (Collaborator) commented:

/te-ci pytorch

@cyanguwa cyanguwa merged commit 476f659 into NVIDIA:main May 13, 2024
9 checks passed