

A bidirectional pipeline parallelism algorithm for computation-communication overlap in V3/R1 training.

Python 2,673 281 Updated Mar 10, 2025

DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling

Cuda 5,101 534 Updated Mar 28, 2025

FlashMLA: Efficient MLA decoding kernels

C++ 11,383 811 Updated Mar 1, 2025

Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation

6,946 230 Updated Mar 4, 2025

OpenR: An Open Source Framework for Advanced Reasoning with Large Language Models

Python 1,735 132 Updated Jan 17, 2025

Python 39 3 Updated Jun 5, 2024

DIAMOND (DIffusion As a Model Of eNvironment Dreams) is a reinforcement learning agent trained in a diffusion world model. NeurIPS 2024 Spotlight.

Python 1,767 124 Updated Dec 6, 2024

Python 4 Updated Oct 20, 2024

A flexible and efficient training framework for large-scale alignment tasks

Python 333 28 Updated Feb 14, 2025

BigDL: Distributed TensorFlow, Keras and PyTorch on Apache Spark/Flink & Ray

Jupyter Notebook 2,674 731 Updated Mar 27, 2025

PyTorch distributed training acceleration framework

Python 46 8 Updated Feb 13, 2025

Efficient and easy multi-instance LLM serving

Python 348 27 Updated Mar 28, 2025

TePDist (TEnsor Program DISTributed) is an HLO-level automatic distributed system for DL models.

C++ 92 9 Updated Apr 22, 2023

Fast and memory-efficient exact attention

Python 16,587 1,571 Updated Mar 25, 2025

DaCe - Data Centric Parallel Programming

Python 516 133 Updated Mar 27, 2025

Research and development for optimizing transformers

Python 125 17 Updated Feb 16, 2021

Development repository for the Triton language and compiler

MLIR 15,010 1,890 Updated Mar 28, 2025

Python 8 Updated Oct 10, 2022

A C++ standalone library for machine learning

C++ 5,352 503 Updated Mar 28, 2025

A machine learning compiler for GPUs, CPUs, and ML accelerators

C++ 3,052 528 Updated Mar 28, 2025

A Python-level JIT compiler designed to make unmodified PyTorch programs faster.

Python 1,036 125 Updated Apr 17, 2024

The hacker's browser.

JavaScript 24,341 2,498 Updated Mar 20, 2025

EasyNLP: A Comprehensive and Easy-to-use NLP Toolkit

Python 2,111 255 Updated Nov 27, 2024

Official repository of OFA (ICML 2022). Paper: OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework

Python 2,487 249 Updated Apr 24, 2024

A framework for large scale recommendation algorithms.

Python 1,935 351 Updated Mar 9, 2025

Easy Parallel Library (EPL) is a general and efficient deep learning framework for distributed model training.

Python 267 49 Updated Mar 31, 2023

FastNN provides distributed training examples that use EPL.

Python 83 19 Updated Mar 11, 2022

An Industrial Graph Neural Network Framework

C++ 1,299 266 Updated Jul 1, 2024

GPU-scheduler-for-deep-learning

C++ 203 34 Updated Nov 5, 2020

Open source platform for the machine learning lifecycle

Python 19,944 4,424 Updated Mar 28, 2025