- Carnegie Mellon University
- Pittsburgh
- @yi_xin_dong
Stars
vLLM’s reference system for K8S-native cluster-wide deployment with community-driven performance optimization
🐫 CAMEL: The first and the best multi-agent framework. Finding the Scaling Law of Agents. https://www.camel-ai.org
The AI developer platform. Use Weights & Biases to train and fine-tune models, and manage models from experimentation to production.
DeepEP: an efficient expert-parallel communication library
Official Repo for Paper "Optimizing Temperature for Language Models with Multi-Sample Inference"
verl: Volcano Engine Reinforcement Learning for LLMs
Clean, minimal, accessible reproduction of DeepSeek R1-Zero
A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations
Expressive, Easy-to-build, and High-performance Application Networks
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
A fast library for AutoML and tuning. Join our Discord: https://discord.gg/Cppx2vSPVP.
An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & RingAttention & RFT)
Cloud-Edge Collaboration Platform for Automated Synthetic Dataset Generation
A generative world for general-purpose robotics & embodied AI learning.
FastVideo is a lightweight framework for accelerating large video diffusion models.
[ICLR2025 Spotlight] MagicPIG: LSH Sampling for Efficient LLM Generation
SkyPilot: Run AI and batch jobs on any infra (Kubernetes or 15+ clouds). Get unified execution, cost savings, and high GPU availability via a simple interface.
Fast and memory-efficient exact attention
A curated list of papers on constrained decoding for LLMs, along with their code and resources.
Tile primitives for speedy kernels
AITemplate is a Python framework that renders neural networks into high-performance CUDA/HIP C++ code. Specialized for FP16 TensorCore (NVIDIA GPU) and MatrixCore (AMD GPU) inference.
nanobind: tiny and efficient C++/Python bindings
Puzzles for learning Triton, playable with minimal environment configuration.
【EMNLP 2024🔥】Video-LLaVA: Learning United Visual Representation by Alignment Before Projection
LLM abstractions that aren't obstructions