tanzelin430
  • University of Science and Technology of China
  • Hefei, China

Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting yo…

TypeScript 87,522 12,956 Updated Mar 29, 2025

This package contains the original 2012 AlexNet code.

Cuda 2,244 281 Updated Mar 12, 2025

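"A User's Guide to Open-Source Large Models": a tutorial tailored for Chinese users on quickly fine-tuning (full-parameter/LoRA) and deploying Chinese and international open-source large language models (LLMs) and multimodal large models (MLLMs) in a Linux environment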

Jupyter Notebook 14,212 1,625 Updated Mar 22, 2025

DeepSeek-V3/R1 inference performance simulator

Jupyter Notebook 88 8 Updated Mar 27, 2025

Analyze computation-communication overlap in V3/R1.

970 130 Updated Mar 21, 2025

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 43,011 6,534 Updated Mar 29, 2025

How much money do I actually need to maintain the same standard of living in different cities?

TypeScript 80 7 Updated Mar 21, 2025

How to optimize some algorithms in CUDA.

Cuda 2,054 184 Updated Mar 26, 2025

A bidirectional pipeline parallelism algorithm for computation-communication overlap in V3/R1 training.

Python 2,675 281 Updated Mar 10, 2025

🤱🏻 Turn any webpage into a desktop app with Rust. 🤱🏻 Easily build lightweight multi-platform desktop apps with Rust.

Rust 36,966 6,646 Updated Mar 25, 2025

FlashMLA: Efficient MLA decoding kernels

C++ 11,385 811 Updated Mar 1, 2025

A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations

Python 13,178 899 Updated Mar 25, 2025

Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation

6,947 230 Updated Mar 4, 2025

MoBA: Mixture of Block Attention for Long-Context LLMs

Python 1,695 101 Updated Mar 7, 2025

My learning notes and code for ML SYS.

Python 1,609 93 Updated Mar 29, 2025

Disaggregated serving system for Large Language Models (LLMs).

Jupyter Notebook 518 53 Updated Aug 19, 2024

Efficient and easy multi-instance LLM serving

Python 349 27 Updated Mar 28, 2025

Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translatio…

Python 2 Updated Jan 15, 2025

Materials for learning SGLang

357 24 Updated Mar 22, 2025

[ICLR 2025] DuoAttention: Efficient Long-Context LLM Inference with Retrieval and Streaming Heads

Python 442 25 Updated Feb 10, 2025

A modular graph-based Retrieval-Augmented Generation (RAG) system

Python 24,001 2,400 Updated Mar 27, 2025

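The globally designated official GitHub of "Run Xue" (润学, the study of "running", i.e. emigrating): compiles its aims, principles, theory, and various real-world examples of "running"; addresses the three major questions of why to run, where to run to, and how to run; and aims to become the core religion and core belief of the new Chinese.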

31,972 2,621 Updated Jul 31, 2024

SGLang is a fast serving framework for large language models and vision language models.

Python 12,615 1,388 Updated Mar 29, 2025

A flexible package manager that supports multiple versions, configurations, platforms, and compilers.

Python 4,600 2,371 Updated Mar 29, 2025

A low-latency & high-throughput serving engine for LLMs

Python 330 42 Updated Jan 31, 2025

A tool for bandwidth measurements on NVIDIA GPUs.

C++ 393 34 Updated Feb 7, 2025

LMDeploy is a toolkit for compressing, deploying, and serving LLMs.

Python 5,949 517 Updated Mar 28, 2025

Dynamic Memory Management for Serving LLMs without PagedAttention

C 333 25 Updated Mar 24, 2025

Nightly Build for LMDeploy

PowerShell 10 Updated Jan 28, 2025

Repository hosting code for "Actions Speak Louder than Words: Trillion-Parameter Sequential Transducers for Generative Recommendations" (https://arxiv.org/abs/2402.17152).

Python 959 178 Updated Mar 27, 2025