Showing results

vLLM’s reference system for K8S-native cluster-wide deployment with community-driven performance optimization

Python 919 122 Updated Mar 28, 2025

🐫 CAMEL: The first and the best multi-agent framework. Finding the Scaling Law of Agents. https://www.camel-ai.org

Python 11,232 1,164 Updated Mar 28, 2025

The AI developer platform. Use Weights & Biases to train and fine-tune models, and manage models from experimentation to production.

Python 9,676 720 Updated Mar 28, 2025
Cuda 2 Updated Mar 5, 2025

DeepEP: an efficient expert-parallel communication library

Cuda 7,326 684 Updated Mar 28, 2025

Official Repo for Paper "Optimizing Temperature for Language Models with Multi-Sample Inference"

Python 15 Updated Feb 16, 2025

verl: Volcano Engine Reinforcement Learning for LLMs

Python 5,807 581 Updated Mar 28, 2025

Clean, minimal, accessible reproduction of DeepSeek R1-Zero

Python 11,395 1,442 Updated Mar 10, 2025

A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations

Python 13,157 898 Updated Mar 25, 2025

Expressive, Easy-to-build, and High-performance Application Networks

Go 16 6 Updated Jan 30, 2025

Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.

C++ 2,941 188 Updated Mar 27, 2025

A fast library for AutoML and tuning. Join our Discord: https://discord.gg/Cppx2vSPVP.

Jupyter Notebook 4,089 526 Updated Mar 14, 2025

An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & RingAttention & RFT)

Python 5,943 583 Updated Mar 27, 2025

Cloud-Edge Collaboration Platform for Automated Synthetic Dataset Generation

Python 6 Updated Mar 14, 2025

A generative world for general-purpose robotics & embodied AI learning.

Python 24,579 2,149 Updated Mar 27, 2025

FastVideo is a lightweight framework for accelerating large video diffusion models.

Python 1,281 76 Updated Mar 28, 2025

[ICLR2025 Spotlight] MagicPIG: LSH Sampling for Efficient LLM Generation

Python 196 14 Updated Dec 16, 2024

SkyPilot: Run AI and batch jobs on any infra (Kubernetes or 15+ clouds). Get unified execution, cost savings, and high GPU availability via a simple interface.

Python 7,592 601 Updated Mar 28, 2025

Fast and memory-efficient exact attention

Python 16,592 1,571 Updated Mar 25, 2025

A curated list of papers related to constrained decoding of LLM, along with their relevant code and resources.

187 5 Updated Mar 22, 2025

structured outputs for llms

Python 9,919 762 Updated Mar 27, 2025

Tile primitives for speedy kernels

Cuda 2,189 130 Updated Mar 28, 2025

AITemplate is a Python framework which renders neural network into high performance CUDA/HIP C++ code. Specialized for FP16 TensorCore (NVIDIA GPU) and MatrixCore (AMD GPU) inference.

Python 4,622 376 Updated Dec 4, 2024

nanobind: tiny and efficient C++/Python bindings

C++ 2,674 221 Updated Mar 28, 2025

🙌 OpenHands: Code Less, Make More

Python 51,484 5,708 Updated Mar 28, 2025

Puzzles for learning Triton, play it with minimal environment configuration!

Python 267 25 Updated Dec 3, 2024

【EMNLP 2024🔥】Video-LLaVA: Learning United Visual Representation by Alignment Before Projection

Python 3,206 232 Updated Dec 3, 2024

LLM abstractions that aren't obstructions

Python 1,038 72 Updated Mar 24, 2025

The Runner for GitHub Actions 🚀

C# 5,204 1,029 Updated Mar 26, 2025