Pull requests: NVIDIA/TransformerEngine

Pull requests list

[PyTorch] Bunch of fixes for cpu offloading
#2535 opened Dec 19, 2025 by pggPL Draft
13 tasks
[docs] Getting started refactor
Labels: documentation
#2534 opened Dec 18, 2025 by pggPL
8 of 13 tasks
[JAX] Fix incorrect calculation of segment pos from segment ids
Labels: attention, bug, jax
#2523 opened Dec 16, 2025 by KshitijLakhani
5 of 13 tasks
Documentation for cpu offloading
Labels: documentation
#2520 opened Dec 16, 2025 by pggPL
8 of 13 tasks
[PyTorch] Support cudagraph recomputation
#2518 opened Dec 16, 2025 by buptzyb
1 of 13 tasks
[JAX] HLO FFI tests
Labels: jax
#2517 opened Dec 16, 2025 by jberchtold-nvidia
7 of 13 tasks
Cpu optimizations v2
Labels: cpu_overhead
#2514 opened Dec 12, 2025 by vthumbe1503 Draft
13 tasks
[Common] Optimize fused RoPE kernel performance
Labels: performance
#2508 opened Dec 11, 2025 by yaox12 Draft
13 tasks
[common] Add support for cuBLASLt GEMM for GroupedTensor
Labels: MoE
#2502 opened Dec 10, 2025 by pggPL
8 tasks done
Add logic for block-scaled tensors with GEMM swizzled scales
Labels: enhancement, MoE, performance, refactor
#2486 opened Dec 6, 2025 by timmoon10
14 of 19 tasks
Add support for SWA (left, right) with FusedAttention
Labels: 2.12.0
#2477 opened Dec 4, 2025 by sudhakarsingh27
22 of 28 tasks
[JAX] Einsum with quantization
#2474 opened Dec 3, 2025 by phu0ngng Draft
13 tasks
[PyTorch] Documentation for op fuser API
Labels: documentation
#2447 opened Dec 3, 2025 by timmoon10
8 of 13 tasks
[PyTorch] Enable post-RHT amax estimation
Labels: fp4
#2442 opened Dec 2, 2025 by negvet Draft
1 of 13 tasks
support cuda graph capture offloading module
#2435 opened Dec 1, 2025 by lhb8125 Draft
13 tasks