- [2025/09] XTuner V1 Released! A Next-Generation Training Engine Built for Ultra-Large MoE Models
XTuner V1 is a next-generation LLM training engine designed specifically for ultra-large-scale MoE models. Unlike traditional 3D-parallel training architectures, XTuner V1 is optimized for the MoE training scenarios that are now mainstream in academic research.
🚀 Dropless Training
- Scalable without complexity: Train 200B-scale MoE models without expert parallelism; 600B models require only intra-node expert parallelism
- Optimized parallelism strategy: A smaller expert-parallel dimension than traditional 3D approaches, enabling more efficient Dropless training (see the routing sketch below)
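To make the term concrete, here is a toy sketch of dropless top-k routing in plain PyTorch: every (token, expert) assignment is kept because there is no capacity limit, so nothing is dropped even when expert loads are uneven. This is an illustration only, not XTuner V1's actual routing code; all names and shapes are invented for the example.

```python
# Toy sketch of dropless top-k MoE routing (illustrative, not XTuner V1's code).
import torch
import torch.nn.functional as F

def dropless_route(hidden: torch.Tensor, router_weight: torch.Tensor, top_k: int = 2):
    """hidden: (num_tokens, d_model); router_weight: (d_model, num_experts)."""
    logits = hidden @ router_weight                       # (num_tokens, num_experts)
    gate, expert_idx = F.softmax(logits, dim=-1).topk(top_k, dim=-1)
    flat_expert = expert_idx.reshape(-1)                  # (num_tokens * top_k,)
    order = flat_expert.argsort()                         # group assignments by expert
    token_idx = torch.arange(hidden.size(0)).repeat_interleave(top_k)
    tokens_per_expert = torch.bincount(flat_expert, minlength=router_weight.size(1))
    # No capacity factor: every routed token is dispatched, however uneven the loads.
    dispatched = hidden[token_idx[order]]                 # expert inputs, grouped by expert
    return dispatched, gate.reshape(-1)[order], tokens_per_expert

hidden = torch.randn(8, 16)
router_weight = torch.randn(16, 4)
_, _, loads = dropless_route(hidden, router_weight)
print(loads.tolist())  # per-expert token counts are processed as-is, never truncated
```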
📚 Long Sequence Support
- Memory-efficient design: Train 200B MoE models with 64k-token sequences, without sequence parallelism, thanks to advanced memory optimization techniques
- Flexible scaling: Full support for DeepSpeed Ulysses sequence parallelism, with the maximum sequence length scaling linearly in the sequence-parallel size (see the sketch after this list)
- Robust performance: Maintains stability despite expert load imbalance during long-sequence training
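For readers unfamiliar with DeepSpeed Ulysses, the snippet below simulates its two all-to-all exchanges on a single process: sequence-sharded activations are traded for head-sharded ones before attention and traded back afterwards. The shapes are arbitrary and the collectives are replaced by plain tensor ops, so this is a conceptual sketch rather than the real distributed code path.

```python
# Single-process simulation of the DeepSpeed Ulysses reshard (conceptual only;
# real training replaces the cat/chunk pairs with torch.distributed all-to-all).
import torch

sp_size, seq_len, num_heads, head_dim = 4, 64, 8, 32

# Each "rank" starts with a contiguous sequence shard of the activations.
shards = [torch.randn(seq_len // sp_size, num_heads, head_dim) for _ in range(sp_size)]

# all-to-all #1: trade sequence shards for head shards, so every rank sees the
# full sequence but only num_heads // sp_size attention heads.
full_sequence = torch.cat(shards, dim=0)            # (seq_len, num_heads, head_dim)
head_shards = full_sequence.chunk(sp_size, dim=1)   # each (seq_len, num_heads // sp_size, head_dim)

# ... attention runs here on each rank's head slice over the full sequence ...

# all-to-all #2: invert the exchange, restoring sequence-sharded activations.
restored = torch.cat(head_shards, dim=1).chunk(sp_size, dim=0)
assert all(torch.equal(a, b) for a, b in zip(restored, shards))

# Doubling sp_size halves each rank's sequence shard, which is why the maximum
# trainable sequence length scales roughly linearly with the Ulysses group size.
```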
⚡ Superior Efficiency
- Massive scale: Supports MoE training up to 1T parameters
- Breakthrough performance: First to achieve FSDP training throughput that surpasses traditional 3D parallel schemes for MoE models above 200B scale
- Hardware optimization: Training efficiency on the Ascend A3 Supernode exceeds that of NVIDIA H800
XTuner V1 is committed to continuously improving the efficiency of pre-training, instruction fine-tuning, and reinforcement learning for ultra-large MoE models, with a special focus on Ascend NPU optimization.
Our vision is to establish XTuner V1 as a versatile training backend that seamlessly integrates with the broader open-source ecosystem.
| Model | GPU(FP8) | GPU(BF16) | NPU(BF16) |
|---|---|---|---|
| Intern S1 | ✅ | ✅ | ✅ |
| Intern VL | ✅ | ✅ | ✅ |
| Qwen3 Dense | ✅ | ✅ | ✅ |
| Qwen3 MoE | ✅ | ✅ | ✅ |
| GPT OSS | ✅ | ✅ | 🚧 |
| Deepseek V3 | ✅ | ✅ | 🚧 |
| KIMI K2 | ✅ | ✅ | 🚧 |
The algorithm component is actively evolving. We welcome community contributions - with XTuner V1, scale your algorithms to unprecedented sizes!
Implemented
- ✅ Multimodal Pre-training - Full support for vision-language model training
- ✅ Multimodal Supervised Fine-tuning - Optimized for instruction following
- ✅ GRPO - Group Relative Policy Optimization (a minimal sketch of its group-relative advantage follows this list)
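To show what GRPO computes, the sketch below implements the group-relative advantage that replaces a learned critic: each sampled response is scored against the other responses drawn for the same prompt. It is a minimal illustration; the tensor layout and epsilon are assumptions, not XTuner V1's actual implementation.

```python
# Minimal sketch of GRPO's group-relative advantage (illustrative only).
import torch

def grpo_advantages(rewards: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """rewards: (num_prompts, group_size) scalar rewards of the sampled responses."""
    mean = rewards.mean(dim=-1, keepdim=True)
    std = rewards.std(dim=-1, keepdim=True)
    # Normalizing within each prompt's group removes the need for a critic model.
    return (rewards - mean) / (std + eps)

# Example: 2 prompts, 4 sampled responses each.
rewards = torch.tensor([[1.0, 0.0, 0.0, 1.0],
                        [0.2, 0.9, 0.4, 0.1]])
print(grpo_advantages(rewards))
```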
Coming Soon
- MPO - Mixed Preference Optimization
- DAPO - Dynamic Sampling Policy Optimization
- Multi-turn Agentic RL - Advanced agent training capabilities
Seamless deployment with leading inference frameworks (a minimal LMDeploy example follows the list):
- LMDeploy
- vLLM
- SGLang
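For example, a fine-tuned checkpoint exported in Hugging Face format can be served with LMDeploy's pipeline API; the model path below is a placeholder for your own weights.

```python
# Serve a fine-tuned checkpoint with LMDeploy (the path is a placeholder).
from lmdeploy import pipeline

pipe = pipeline('/path/to/your/finetuned/model')
responses = pipe(['Introduce XTuner in one sentence.'])
print(responses)
```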
- You can use GraphGen to create synthetic data for fine-tuning.
We appreciate all contributions to XTuner. Please refer to CONTRIBUTING.md for the contributing guidelines.
The development of XTuner V1's training engine has been greatly inspired by and built upon the excellent work of the open-source community. We extend our sincere gratitude to the following pioneering projects:
Training Engine:
- Torchtitan - A PyTorch native platform for training generative AI models
- DeepSpeed - Microsoft's deep learning optimization library
- MindSpeed - Ascend's high-performance training acceleration library
- Megatron - NVIDIA's large-scale transformer training framework
Reinforcement Learning:
XTuner V1's reinforcement learning capabilities have been enhanced through insights and best practices from:
- veRL - Volcano Engine Reinforcement Learning for LLMs
- SLIME - THU's scalable RLHF implementation
- AReal - Ant Reasoning Reinforcement Learning for LLMs
- OpenRLHF - An Easy-to-use, Scalable and High-performance RLHF Framework based on Ray
We are deeply grateful to all contributors and maintainers of these projects for advancing the field of large-scale model training.
@misc{2023xtuner,
    title={XTuner: A Toolkit for Efficiently Fine-tuning LLM},
    author={XTuner Contributors},
    howpublished={\url{https://github.com/InternLM/xtuner}},
    year={2023}
}
This project is released under the Apache License 2.0. Please also adhere to the Licenses of models and datasets being used.