Stars
Universal LLM Deployment Engine with ML Compilation
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
Train transformer language models with reinforcement learning.
RewardBench: the first evaluation tool for reward models.
Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.
Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"
This project aims to reproduce Sora (OpenAI's T2V model); we welcome contributions from the open-source community.
LlamaIndex is the leading framework for building LLM-powered agents over your data.
The official repo of the Qwen (通义千问) chat and pretrained large language models proposed by Alibaba Cloud.
A curated collection of research papers exploring the utilization of LLMs for graph-related tasks.
🔥Highlighting the top ML papers every week.
DSPy: The framework for programming—not prompting—language models
Pretrained language models and related optimization techniques developed by Huawei Noah's Ark Lab.
Retrieval and Retrieval-augmented LLMs
This is the repository for our paper "INTERS: Unlocking the Power of Large Language Models in Search with Instruction Tuning"
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
Reference implementation for DPO (Direct Preference Optimization)
Python SDK, Proxy Server (LLM Gateway) to call 100+ LLM APIs in OpenAI format - [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropic, Sagemaker, HuggingFace, Replicate, Groq]
🌟 The Multi-Agent Framework: First AI Software Company, Towards Natural Language Programming
Official inference library for Mistral models
Easily use and train state of the art late-interaction retrieval methods (ColBERT) in any RAG pipeline. Designed for modularity and ease-of-use, backed by research.
This includes the original implementation of SELF-RAG: Learning to Retrieve, Generate and Critique through self-reflection by Akari Asai, Zeqiu Wu, Yizhong Wang, Avirup Sil, and Hannaneh Hajishirzi.
AutoAWQ implements the AWQ algorithm for 4-bit quantization with a 2x speedup during inference.
[Paper List] Papers integrating knowledge graphs (KGs) and large language models (LLMs)