Repositories list
75 repositories
- LongSpec (Public)
- d4ft (Public)
- 🌾 OAT: A research-friendly framework for LLM online alignment, including preference learning, reinforcement learning, etc.
- 🚢 Data Toolkit for Sailor Language Models
- sailor2 (Public)
- Megatron-Sailor2 (Public)
- regmix (Public)
- oat-zero (Public)
- autofd (Public): Automatic Functional Differentiation in JAX
- I-FSJ (Public)
- InfNeRF (Public)
- sailor-llm (Public)
- Meta-Unlearning (Public)
- inceptionnext (Public): InceptionNeXt: When Inception Meets ConvNeXt (CVPR 2024)
- stde (Public)
- optim4rl (Public): Optim4RL is a Jax framework of learning to optimize for reinforcement learning.
- VocabularyParallelism (Public)
- sdft (Public): [ACL 2024] The official codebase for the paper "Self-Distillation Bridges Distribution Gap in Language Model Fine-tuning".
- Cheating-LLM-Benchmarks (Public)
- P-DoS (Public)
- CPO (Public)
- SimLayerKV (Public)
- Attention-Sink (Public): [ICLR 2025] When Attention Sink Emerges in Language Models: An Empirical View (Spotlight)
- scaling-with-vocab (Public): [NeurIPS-2024] 📈 Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies https://arxiv.org/abs/2407.13623
- envpool (Public): C++-based high-performance parallel environment execution engine (vectorized env) for general RL environments.