Pinned
- linkedin/Liger-Kernel (Public): Efficient Triton Kernels for LLM Training
- NVIDIA/TensorRT-LLM (Public): TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficie…
- vllm-project/vllm (Public): A high-throughput and memory-efficient inference and serving engine for LLMs
- datamllab/automl-in-action-notebooks (Public): Jupyter notebooks for the code samples of the book "Automated Machine Learning in Action"
- pytorch/ao (Public): PyTorch native quantization and sparsity for training and inference
117 contributions in the last year
[Contribution calendar: day of week (Sunday to Saturday) by month, March to March]
Contribution activity
March 2025
Created 3 commits in 1 repository
Created a pull request in sgl-project/sglang that received 5 comments
Add deepseek style fused moe group gate selection kernel
Motivation
PR adapted and improved from #3191
Rewrite macro. Extended to support all power-of-2 # expert & # expert group, also all # topk_group & #…
Opened 4 other pull requests in 1 repository
sgl-project/sglang: 1 open, 1 closed, 2 merged
- [Not ready for merge] Remove macro definition for ROCM for __shfl_xor_sync (Mar 15)
- Add deepseek style fused moe group gate selection kernel (Mar 15)
- Fix per token fp8 quant precision (Mar 13)
- Add moe topk softmax templated from vllm (Mar 11)
Reviewed 6 pull requests in 2 repositories
sgl-project/sglang: 4 pull requests
- [quantization] fix channelwise conversion with scalar weight scale (Mar 19)
- Add deepseek style fused moe group gate selection kernel (Mar 18)
- Add deepseek style fused moe group gate selection kernel (Mar 17)
- Add moe topk softmax templated from vllm (Mar 15)
linkedin/Liger-Kernel: 2 pull requests
- Update README.md (Mar 18)
- Refactor chunked preference functions and distillation base class (Mar 3)
Created an issue in linkedin/Liger-Kernel that received 2 comments
Support DAPO Chunked loss
🚀 The feature, motivation and pitch
ByteDance DAPO is the open-sourced SOTA RL algorithm that achieves 50 points on AIME 2024 based on the Qwen2.5-…