Pull requests: NVIDIA/TransformerEngine
Plumbing correct bias dims from TE to cudnn
attention
#2537
opened Dec 20, 2025 by KshitijLakhani
Draft
13 tasks
[docs] Getting started refactor
documentation
#2534
opened Dec 18, 2025 by pggPL
8 of 13 tasks
[DO NOT MERGE] Get seqlens and offsets in O(N) space instead of O(N*N) space
do not merge
#2530
opened Dec 17, 2025 by KshitijLakhani
Draft
13 tasks
[JAX] Fix incorrect calculation of segment pos from segment ids
attention
bug
jax
#2523
opened Dec 16, 2025 by KshitijLakhani
5 of 13 tasks
[JAX] Calculate seqlens and offsets in O(N) space instead of O(N*N) space for THD sequences
attention
#2522
opened Dec 16, 2025 by KshitijLakhani
Draft
13 tasks
Documentation for cpu offloading
documentation
#2520
opened Dec 16, 2025 by pggPL
8 of 13 tasks
[PyTorch] Support cudagraph recomputation
#2518
opened Dec 16, 2025 by buptzyb
1 of 13 tasks
[DO NOT MERGE] Testing v2.6 + pr2201
attention
#2513
opened Dec 12, 2025 by KshitijLakhani
Draft
13 tasks
[common] Add support for cuBLASLt GEMM for GroupedTensor
MoE
#2502
opened Dec 10, 2025 by pggPL
8 tasks done
Add logic for block-scaled tensors with GEMM swizzled scales
enhancement
MoE
performance
refactor
#2486
opened Dec 6, 2025 by timmoon10
14 of 19 tasks
[JAX] Estimate post-RHT amax using regular amax
fp4
#2479
opened Dec 4, 2025 by jberchtold-nvidia
Draft
13 tasks
Add support for SWA (left, right) with FusedAttention
2.12.0
#2477
opened Dec 4, 2025 by sudhakarsingh27
22 of 28 tasks
[PyTorch] Documentation for op fuser API
documentation
#2447
opened Dec 3, 2025 by timmoon10
8 of 13 tasks
Fix transformer 2.9.0 (torch 2.9.1 used by SGLang 0.5.5) build
#2445
opened Dec 2, 2025 by yiakwy-xpu-ml-framework-team
13 tasks
[Common] Comm+GEMM overlap API updated to support cuBlasMp backend (incl. framework API)
#2443
opened Dec 2, 2025 by denera
5 of 13 tasks
[JAX] Better error message when Q, K, V are sharded differently
attention
jax
#2440
opened Dec 2, 2025 by jberchtold-nvidia
8 of 13 tasks