- Santa Clara
- https://www.linkedin.com/in/rdspring1
- @ryanspring13
Pinned
- NVIDIA/Fuser (Public): A Fusion Code Generator for NVIDIA GPUs (commonly known as "nvFuser")
- RUSH-LAB/LSH_Memory (Public): One-Shot Learning using Nearest-Neighbor Search (NNS) and Locality-Sensitive Hashing (LSH)
- PyTorch_GBW_LM (Public): PyTorch Language Model for 1-Billion Word (LM1B / GBW) Dataset
- Count-Sketch-Optimizers (Public): A compressed adaptive optimizer for training large-scale deep learning models using PyTorch
- LSH-Mutual-Information (Public): Use LSH Sampling for Mutual Information Estimation
Python 5
- lightning-thunder (Public, forked from Lightning-AI/lightning-thunder): Source-to-source compiler for PyTorch. It makes PyTorch programs faster on single accelerators and in distributed settings.
Python
536 contributions in the last year
Contribution activity
March 2025
Created 6 commits in 1 repository
Created a pull request in NVIDIA/Fuser that received 9 comments
Enforce shared memory alignment for TMA LoadStoreOps
This PR enforces the byte alignment requirements for TMA LoadStoreOps, which prevents illegal memory accesses (IMA) and incorrect results. If TMA LoadStoreOp is not detecte…
Opened 4 other pull requests in 1 repository
NVIDIA/Fuser: 3 open, 1 merged
- Add silu and bias epilogue matmul tests (Mar 18)
- [RFC] Create a basic binding for CPP Fusion in python frontend using AI Coding Tools (Mar 14)
- Load Epilogue Inputs with LdMatrix in Hopper Matmul Scheduler (Mar 13)
- Enable hard-coded index for LdMatrix and create basic copy tutorial (Mar 7)
Reviewed 21 pull requests in 1 repository
NVIDIA/Fuser: 21 pull requests
- Deprecate ParallelType::MisalignedVectorize (Mar 24)
- Introduce use_stmatrix parameter to MatmulParams (Mar 19)
- Change c10::irange to iota, part 1 (Mar 19)
- Load Epilogue Inputs with LdMatrix in Hopper Matmul Scheduler (Mar 19)
- Mark supports_segmentation=False in test_issue1273 (Mar 17)
- Add Blackwell MMA macros (Mar 17)
- Check that warps are only accessing the subpartition of TMem that it can access (Mar 14)
- indexAccumulate python api (Mar 14)
- TMem check the stride of outer dims (Mar 14)
- add register count checks for warp specialization with register sharing (Mar 14)
- Fix C++23 backport of zip and enumerate (Mar 13)
- Make Hopper mma tests sparse (Mar 12)
- register sharing, add launch bound, disable tests with illegal paras (Mar 12)
- Indexing for TMem ld and st (Mar 11)
- Tensor memory 32x32b data path pattern matching (Mar 11)
- Enable hard-coded index for LdMatrix and create basic copy tutorial (Mar 11)
- redo register sharing PR-3972 (Mar 10)
- Translate MatmulOp and LinearOp on Hopper without AxisMapping (Mar 10)
- Automatically save MatmulParams in extra_info in benchmarks (Mar 7)
- Update Hopper default matmul heuristic (Mar 6)
- format toString for SetMaxNReg and Return (Mar 3)
Opened 1 issue in 1 repository
NVIDIA/Fuser: 1 open
- Refactor IndexLowering::handle(const LoadStoreOp* ldst) (Mar 11)