
[C/PyTorch/Jax] Add support for more bias shapes #677

Merged (20 commits) on Feb 28, 2024

Conversation

@cyanguwa (Collaborator) commented on Feb 21, 2024

This PR

  • adds support for the [b,1,s,s], [b,h,s,s], and [1,1,s,s] bias shapes when dBias is not required. This applies to inference (bias.requires_grad = False), to cases where the bias serves as a workaround for an arbitrary mask (True/False -> 0/-inf; see the sketch after this list), and to the case where the ALiBi slopes tensor has shape [b,h].
  • only changes the F16_arbitrary_seqlen backend of the cuDNN fused attention.
  • changes the C, PyTorch and Jax attention implementations.
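
A minimal PyTorch sketch of the mask-as-bias workaround mentioned in the first bullet (illustrative only; the tensor names and sizes are assumptions, not TE code):

```python
import torch

b, h, s, d = 2, 8, 128, 64
q = torch.randn(b, h, s, d)
k = torch.randn(b, h, s, d)

# Arbitrary boolean mask (here: causal), True = keep, False = mask out.
keep = torch.ones(s, s, dtype=torch.bool).tril()

# Fold the mask into a [1,1,s,s] post-scale bias: True/False -> 0/-inf.
bias = torch.zeros(1, 1, s, s).masked_fill(~keep, float("-inf"))
bias.requires_grad_(False)  # no gradient is ever taken w.r.t. this bias

# Reference attention: the bias broadcasts over batch and heads.
scores = torch.matmul(q, k.transpose(-2, -1)) / d**0.5 + bias
probs = torch.softmax(scores, dim=-1)
```

The [b,1,s,s] and [b,h,s,s] cases work the same way; what this PR adds is that such a no-gradient bias in these shapes is now accepted by the F16_arbitrary_seqlen backend.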

@cyanguwa changed the title from [C/PyTorch] Add b1ss/bhss/11ss bias shapes when not requiring dBias to [C/PyTorch] Add b1ss/bhss/11ss bias shapes when dBias is not required on Feb 21, 2024
@cyanguwa changed the title from [C/PyTorch] Add b1ss/bhss/11ss bias shapes when dBias is not required to [C/PyTorch] Add support for more bias shapes when dBias is not required on Feb 21, 2024
@denera (Collaborator) commented on Feb 21, 2024

@cyanguwa I merged #676 into this PR. Still need to update the unit tests to include the new bias shapes.

@cyanguwa changed the title from [C/PyTorch] Add support for more bias shapes when dBias is not required to [C/PyTorch/Jax] Add support for more bias shapes when dBias is not required on Feb 21, 2024
@cyanguwa changed the title from [C/PyTorch/Jax] Add support for more bias shapes when dBias is not required to [C/PyTorch/Jax] Add support for more bias shapes on Feb 21, 2024
Signed-off-by: Charlene Yang <[email protected]>
@cyanguwa (Collaborator, Author) commented:

/te-ci

@denera (Collaborator) commented on Feb 22, 2024

/te-ci jax

@denera (Collaborator) commented on Feb 23, 2024

/te-ci jax

@cyanguwa (Collaborator, Author) commented on Feb 23, 2024

PyTorch pipeline with CUDA 12.3: 13014488. Fixing the A100 errors now.

Signed-off-by: Charlene Yang <[email protected]>
@cyanguwa (Collaborator, Author) commented on Feb 24, 2024

Pipeline 13043757 for PyTorch and Paddle has passed.

@denera (Collaborator) commented on Feb 24, 2024

/te-ci jax

@denera (Collaborator) commented on Feb 27, 2024

With the latest commit, all FP16 tests in TE/JAX CI are now passing with neg_inf = -2^15.

The BF16 failures look like this:

  • Any input with max_seqlen > 512 + any mask
  • Inputs with max_seqlen <= 512 and max_seqlen_q != max_seqlen_kv + NO_MASK

It's not clear to me yet if these BF16 failures are due to a bug in the pure-JAX/Flax reference function or the TE/JAX fused attn custom op.
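
For context on the neg_inf choice (a hedged aside, not stated in the thread): -2^15 = -32768 is still finite in float16, whose largest finite magnitude is roughly 65504, so it can stand in for -inf without the masked logits overflowing when computation happens in half precision. A small sketch of the dtype effect (illustrative values, shown with PyTorch tensors; the behavior is framework-independent):

```python
import torch

neg_inf = -2.0 ** 15   # -32768: finite and exactly representable in float16 and bfloat16

# A common sentinel like -1e9 overflows to -inf once cast down to float16 ...
print(torch.tensor(-1e9, dtype=torch.float16))     # tensor(-inf, dtype=torch.float16)
# ... while -2^15 survives the cast.
print(torch.tensor(neg_inf, dtype=torch.float16))  # tensor(-32768., dtype=torch.float16)

# The masked slot still collapses to ~0 probability after softmax.
logits = torch.tensor([2.0, 1.0, neg_inf], dtype=torch.float16)
print(torch.softmax(logits.float(), dim=-1))       # ~[0.73, 0.27, 0.00]
```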

Review threads on tests/jax/test_fused_attn.py and transformer_engine/jax/cpp_extensions.py (all resolved).
@cyanguwa (Collaborator, Author) commented:

/te-ci jax

@cyanguwa requested a review from zlsh80826 on February 27, 2024, 22:03
@denera (Collaborator) commented on Feb 28, 2024

/te-ci jax

@zlsh80826 (Collaborator) left a review comment:

LGTM

Review thread on tests/jax/test_fused_attn.py (resolved).
@cyanguwa merged commit b8eea8a into NVIDIA:main on Feb 28, 2024, with 15 checks passed.