AVNLP

24 repositories

rankers
Public
Modular LLM ranking library for Information Retrieval and RAG. Implements state-of-the-art Pairwise, Setwise, and Listwise ranking with structured generation and specialized models (RankZephyr, RankLlama). Features efficient sorting algorithms, sliding windows, and zero-shot capabilities.
ranker pairwise listwise pointwise setwise llm-rankers
Python
•
MIT License
•1•5•0•0•Updated Dec 22, 2025
llm-blender
Public
LLM-Blender: Ensembling framework that maximizes LLM performance via pairwise ranking. Employs PairRanker to rank candidates and GenFuser to merge outputs, generating superior responses by combining the diverse strengths of multiple open-source models.
rag llms rankers llm-blender
Python
•
MIT License
•3•36•0•0•Updated Dec 20, 2025
rag-model-training
Public
Training code for advanced RAG techniques: Adaptive-RAG, Corrective RAG, RQ-RAG, Self-RAG, Agentic RAG, and ReZero. Reproduces paper methodologies to fine-tune LLMs via SFT and GRPO for adaptive retrieval, corrective evaluation, query refinement, self-reflection, and agentic search behaviors.
rag sft rezero crag llm-training agentic-rag adaptive-rag self-rag grpo rq-rag
Python
•
MIT License
•2•5•0•0•Updated Dec 13, 2025
rag-pipelines
Public
Advanced RAG Pipelines and Evaluation
pubmed unstructured rag baml milvus earnings-calls contextual-ai llm langgraph rag-pipeline
Python
•
MIT License
•1•10•0•0•Updated Dec 7, 2025
rrf
Public
Performance Evaluation of Rankers and RRF Techniques for Retrieval Pipelines: Employs Diversity, Lost-in-the-Middle, and Similarity rankers to reorder documents and maximize LLM context window performance. Implements Hybrid Retrieval with Reciprocal Rank Fusion (RRF) and rigorous BEIR evaluation (NDCG, MAP, Recall, Precision).
ranker rrf reciprocal-rank-fusion similarity-ranker lost-in-the-middle-ranker diversity-ranker
Python
•
MIT License
•1•6•0•0•Updated Nov 23, 2025
dspy-opt
Public
Advanced RAG pipeline optimization framework using DSPy. Implements modular RAG pipelines with Query-Rewriting, Sub-Query Decomposition, and Hybrid Search via Weaviate. Automates prompt tuning and few-shot selection using MIPRO, COPRO, and BootstrapFewShot optimizers on datasets like FreshQA, HotpotQA, TriviaQA, Wikipedia and PubMedQA.
metadata-extraction query-rewriting rag weaviate dspy rag-pipeline deepeval sub-query-generation
Python
•
MIT License
•1•6•0•0•Updated Oct 31, 2025
biothink
Public
Self-Reflective Question Answering for Biomedical Reasoning
rag biomedical-question-answering self-rag grpo
Python
•
MIT License
•1•5•0•0•Updated Oct 14, 2025
med-reason-evals
Public
Python
•
MIT License
•0•2•0•0•Updated Oct 7, 2025
llm-finetuning
Public
Pipelines for Fine-Tuning LLMs using SFT and RLHF
lora fine-tuning ppo peft sft dpo kto p-tuning qlora orpo
Python
•
MIT License
•2•6•0•0•Updated Oct 7, 2025
avnlp.github.io
Public
MIT License
•0•0•0•0•Updated Oct 2, 2025
grpo
Public
Group Relative Policy Optimization (GRPO) Implementations
reward-functions rlhf grpo
Python
•
MIT License
•0•4•0•0•Updated Sep 3, 2025
prp
Public
Pairwise Ranking Prompting (PRP): Zero-shot LLM reranking library implementing efficient pairwise strategies (Heapsort, Sliding Window, All-Pairs). Mitigates position bias via bidirectional comparison and ensures reliability with structured Pydantic validation. Built for Haystack pipelines.
ranker pairwise prp pairwise-ranking-prompting
Python
•
MIT License
•0•3•0•0•Updated Jul 24, 2025
MedRAG
Public
Python
•
Other
•41•0•0•0•Updated Jul 16, 2025
scGPT
Public
Jupyter Notebook
•
MIT License
•311•0•0•0•Updated Jul 9, 2025
BioReason
Public
BioReason: Incentivizing Multimodal Biological Reasoning within a DNA-LLM Model
Jupyter Notebook
•
Apache License 2.0
•51•0•0•0•Updated Jun 14, 2025
dataloaders
Public
Dataloaders is a library for processing and formatting datasets used in Retrieval-Augmented Generation (RAG) pipelines, supporting their evaluation and analysis.
Python
•
MIT License
•1•4•0•0•Updated May 1, 2025
hyperparameter-tuning
Public
Effect of Optimizer Selection and Hyperparameter Tuning on Training Efficiency and LLM Performance
hyperparameter-tuning adam-optimizer sgd-optimizer sgd-momentum optimizers rmsprop-optimizer
Python
•
MIT License
•1•4•0•0•Updated Apr 16, 2025
vectordb
Public
Pipelines for Semantic Search, Metadata Filtering, Hybrid Search, Reranking, and Retrieval-Augmented Generation (RAG) on the TriviaQA, ARC, PopQA, FactScore, and Edgar datasets. These pipelines have been implemented using the Pinecone, Weaviate, Milvus, Qdrant and Chroma vector databases.
haystack pinecone weaviate milvus vector-database llm langchain
Python
•
MIT License
•1•4•0•0•Updated Jan 31, 2025
.github
Public
MIT License
•0•0•0•0•Updated Jan 2, 2025
PPO-for-Beginners
Public
A simple and well-styled PPO implementation. Based on my Medium series: https://medium.com/@eyyu/coding-ppo-from-scratch-with-pytorch-part-1-4-613dfc1b14c8.
Python
•
MIT License
•157•0•0•0•Updated Oct 1, 2024
direct-preference-optimization
Public
Reference implementation for DPO (Direct Preference Optimization)
Python
•
Apache License 2.0
•233•0•0•0•Updated Aug 11, 2024
self-biorag
Public
[ISMB '24] Self-BioRAG: Improving Medical Reasoning through Retrieval and Self-Reflection with Retrieval-Augmented Large Language Models
Python
•10•0•0•0•Updated Apr 4, 2024
GenePT
Public
Jupyter Notebook
•45•0•0•0•Updated Mar 18, 2024
finbert-hf
Public
Python
•
MIT License
•0•0•0•0•Updated Jan 30, 2023