JBAujogue/LLM-playground

Introduction

This project contains experiments on generative AI (GenAI).

Getting Started

WSL setup

This project was developed on Windows 11, while some components require Linux and therefore run inside a containerized environment backed by WSL.

Install WSL and create a Linux distribution following the official Microsoft documentation. See also the WSL basic commands.

Docker setup

  1. Install Docker Desktop or Podman on the Windows os.

    • Note: all previous versions of Docker Engine and CLI installed directly through Linux distributions must be uninstalled before installing Docker Desktop.
    • Activate the WSL integration in the Docker Desktop settings, following the Docker documentation.
  2. Install the NVIDIA Container Toolkit in your Linux distro:

    • Launch a Linux terminal by running the following command in a cmd (the --distribution flag is optional):
      wsl --distribution <distro-name>
    • Execute the install commands found in the NVIDIA documentation.
    • Allow Docker to use the NVIDIA Container Runtime by executing the commands found in the NVIDIA documentation.
  3. Additional tips:

    • Terminate a running Linux distribution:
      wsl -t <distro-name>     (terminate one distribution)
      wsl --shutdown           (terminate all distributions and the WSL kernel)

Database setup

  1. Get the latest Qdrant docker image by running (in CMD or bash):
    docker pull qdrant/qdrant

Python setup

This project uses Python 3.11 as its core interpreter and Poetry 1.6.1 as its dependency manager.

  1. Install Miniconda on Windows or on Linux.

  2. Create a new conda environment with

    conda env create -f environment.yml
  3. Activate the environment with

    conda activate llm-playground
  4. Move to the project directory, and install the project dependencies with

    poetry install
  5. Launch a jupyter server with

    jupyter notebook

How to use it

Run Qdrant vector database

We dedicate a docker container named qdrant-db to the vector database backend microservice; a minimal Python client sketch follows the commands below.

  • Create and run a new Qdrant vector database service:
    wsl -e ./scripts/services/qdrant/qdrant-db.sh
  • Run the existing Qdrant vector database service:
    docker start qdrant-db
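
Once the container is running, the database can be queried from Python. The sketch below is a minimal example, assuming the default Qdrant REST port 6333 and the qdrant-client package; the collection name and vectors are purely illustrative:

    from qdrant_client import QdrantClient
    from qdrant_client.models import Distance, PointStruct, VectorParams

    # Connect to the local Qdrant service (default REST port is 6333).
    client = QdrantClient(url="http://localhost:6333")

    # Create a small demo collection with 4-dimensional vectors (illustrative only).
    client.recreate_collection(
        collection_name="demo",
        vectors_config=VectorParams(size=4, distance=Distance.COSINE),
    )

    # Insert a couple of points, then run a nearest-neighbour search.
    client.upsert(
        collection_name="demo",
        points=[
            PointStruct(id=1, vector=[0.1, 0.2, 0.3, 0.4], payload={"text": "hello"}),
            PointStruct(id=2, vector=[0.4, 0.3, 0.2, 0.1], payload={"text": "world"}),
        ],
    )
    hits = client.search(collection_name="demo", query_vector=[0.1, 0.2, 0.3, 0.4], limit=1)
    print(hits)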

Run Text Generation Inference LLM service

We dedicate a docker container named tgi-service to a TGI-based LLM backend microservice; a minimal Python query sketch follows the commands below.

  • Create and run a new TGI LLM service: launch Docker Desktop, open a cmd or shell, and run

    wsl -e ./scripts/services/tgi/tgi-service.sh
  • Run the existing TGI LLM service: launch Docker Desktop, open a cmd or shell, and run

    docker start tgi-service
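
Once the container is running, the served model can be queried over HTTP. The sketch below is a minimal example using the huggingface_hub InferenceClient; the port is an assumption (check how tgi-service.sh maps it):

    from huggingface_hub import InferenceClient

    # Point the client at the local TGI endpoint (assumed to be exposed on port 8080).
    client = InferenceClient(model="http://localhost:8080")

    # Plain text-generation call against whichever model the service was started with.
    answer = client.text_generation(
        "What is a vector database?",
        max_new_tokens=128,
        temperature=0.7,
    )
    print(answer)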

Learning plan

1. Inference
Framework | Documentation | Examples | Comment
Huggingface transformers | | |
ctransformers | Github | | CPU-only, Unmaintained

Service | Documentation | Examples | Comment
vLLM | Github, Inference speed blog post, 2309 | Official quickstart, Official list of examples, Run in WSL | Linux-only
TGI: Text Generation Inference | Github, HF page | Run with WSL & Docker, Run again with WSL & Docker, External usage, Use with OpenAI / langchain / llama-index client | Linux-only
Triton Inference Server | Github, pytriton Github | tensorRT with TIS | Linux-only
Llamafile | | |
ollama | Github | ollama for Mixtral |
OpenLLM | Github | |
DeepSparse | Github | | CPU-only, Linux-only

SDK | Documentation | Examples | Comment
LangChain | | |
Llama-index | Github, Documentation | Official list of notebooks |
EmbedChain | | |
Jan (product) | Github | |

Further readings: List of LLM frameworks, AWS GenAI tutorials, Open LLM Huggingface leaderboard, MTEB leaderboard, hamelsmu llama-inference, can-ai-code-results
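
As a baseline for the frameworks listed above, here is a minimal Huggingface transformers inference sketch; gpt2 is only a lightweight placeholder model:

    from transformers import pipeline

    # Load a small causal LM (placeholder) and generate a short completion.
    generator = pipeline("text-generation", model="gpt2")
    output = generator("A vector database is", max_new_tokens=32)
    print(output[0]["generated_text"])
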
*
2. Compression, Quantization, Pruning
Method | Documentation | Examples | Paper
SparseML | Github | |
BitsAndBytes | HF docs | HF docs | 2208
GPTQ | HF blog | Official repo notebooks | 2210
AWQ: Activation-aware Weight Quantization | HF docs | notebook | 2306
SqueezeLLM | | | 2306
EXL2 | Github | Blog post |
HQQ: Half-Quadratic Quantization | Github | HQQ for Mixtral |
EETQ | Github | |
ATOM | | | 2310
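
To make the quantization entries above concrete, the sketch below loads a model in 4-bit with BitsAndBytes through the transformers integration; the model name is a placeholder and a CUDA GPU is assumed:

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

    model_id = "mistralai/Mistral-7B-v0.1"  # placeholder: any causal LM on the Hub

    # NF4 4-bit quantization with bfloat16 compute (requires a CUDA GPU).
    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.bfloat16,
    )

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        quantization_config=bnb_config,
        device_map="auto",
    )
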
*
3. Evaluation
Method | Documentation | Examples | Github
LLM-autoeval | | | Github
Deepeval | | Integration in Huggingface Trainer | Github
*
4. Prompt Engineering
Method | Documentation | Examples | Paper
Chain of thoughts | | |
Tree of thoughts | | |
Graph of thoughts | | |
Prompt injection | | |
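
As a small illustration of the first entry, the snippet below builds a zero-shot chain-of-thought prompt; the wording is one common pattern, not something defined by this repository:

    question = "A shop sells pens at 3 for 2 euros. How much do 9 pens cost?"

    # Zero-shot chain-of-thought: append a cue that elicits intermediate reasoning steps.
    cot_prompt = f"Question: {question}\nLet's think step by step."
    print(cot_prompt)
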
*
5. Data Ingestion
Method | Documentation | Examples | Paper
Retrieval Aware Fine-tuning (RAFT) | Github | | 2403
Automatic Data Selection in Instruction Tuning | | | 2312
Fill-In-The-Middle (FIM) transformation | | | 2207
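
To illustrate the FIM entry, the sketch below applies a document-level fill-in-the-middle transformation: the text is split into prefix / middle / suffix and rearranged with sentinel markers so a causal LM can learn to infill; the sentinel strings are illustrative, real tokenizers define their own special tokens:

    import random

    def fim_transform(text: str, rng: random.Random) -> str:
        """Rearrange a document into prefix-suffix-middle order with sentinel markers."""
        # Pick two random split points defining the prefix / middle / suffix spans.
        i, j = sorted(rng.sample(range(len(text) + 1), 2))
        prefix, middle, suffix = text[:i], text[i:j], text[j:]
        # PSM ordering: the model sees prefix and suffix, then generates the middle.
        return f"<PRE>{prefix}<SUF>{suffix}<MID>{middle}"

    rng = random.Random(0)
    print(fim_transform("def add(a, b):\n    return a + b\n", rng))
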
*
6. Retrieval-Augmented Generation
Method | Documentation | Examples | Paper
RAG | llama-index blog, llama-index documentation | | 2312
self-RAG | | |
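
A minimal llama-index RAG sketch, assuming a recent llama-index release (llama_index.core namespace), a local ./data folder of documents, and an already configured LLM / embedding backend:

    from llama_index.core import SimpleDirectoryReader, VectorStoreIndex

    # Load local documents, embed them into an in-memory vector index,
    # then answer a question using retrieved chunks as context.
    documents = SimpleDirectoryReader("./data").load_data()
    index = VectorStoreIndex.from_documents(documents)
    query_engine = index.as_query_engine()

    response = query_engine.query("What does this project do?")
    print(response)
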
*
7. Finetuning
Method | Documentation | Examples | Paper
PEFT: Parameter-Efficient FineTuning | | |
C-RLFT: Conditioned-Reinforcement Learning Fine-Tuning | | |
LoRA: Low Ranking Adaptation | | |
QLoRA: Quantized Low Ranking Adaptation | | |
DPO: Direct Preference Optimization | | | 2305
SPIN: Self-Play Finetuning | | | 2401
ASTRAIOS: Parameter-Efficient Instruction Tuning | | | 2401
LLAMA-pro: Progressive Learning of LLMs | | |
GaLore: Gradient Low Rank Projection | | | 2403
ORPO: Odds Ratio Preference Optimization | | | 2403
DNO: Direct Nash Optimization | | | 2404

Framework | Documentation | Examples | Comment
TRL | Github | Finetuning scripts |
Axolotl | Github | Finetuning script | Based on TRL, Multi-GPU with Accelerate
HF alignment handbook | Github | Finetuning scripts | Based on TRL, Multi-GPU with Accelerate
Adapters | Github | Train adapters around LLM or BERT models | Based on TRL, Multi-GPU with Accelerate
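
As a concrete instance of the PEFT / LoRA entries above, the sketch below wraps a causal LM with LoRA adapters using the peft library; the base model and hyperparameters are illustrative:

    from peft import LoraConfig, get_peft_model
    from transformers import AutoModelForCausalLM

    # Base model to adapt (placeholder name).
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    # LoRA configuration: low-rank updates on the attention projection layer.
    lora_config = LoraConfig(
        r=8,
        lora_alpha=16,
        lora_dropout=0.05,
        target_modules=["c_attn"],  # gpt2-specific; differs for other architectures
        task_type="CAUSAL_LM",
    )

    model = get_peft_model(model, lora_config)
    model.print_trainable_parameters()  # only the adapter weights are trainable
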
*
8. Model aggregation
Method | Documentation | Examples | Paper
MoE: Mixture of Experts | | | 2209
Model merging | HF blog, Model merging bibliography | |
*
9. Agents
Method | Documentation | Examples | Paper

Architectures

TGI-langchain-architecture (figure): architecture based on TGI and langchain, as proposed in this blog post.
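
A minimal sketch of the client side of this architecture, assuming the tgi-service container from above is exposed on localhost:8080 and using LangChain's HuggingFaceTextGenInference wrapper (from langchain_community, which also requires the text_generation package):

    from langchain_community.llms import HuggingFaceTextGenInference

    # Wrap the local TGI endpoint as a LangChain LLM (URL and parameters are assumptions).
    llm = HuggingFaceTextGenInference(
        inference_server_url="http://localhost:8080/",
        max_new_tokens=128,
        temperature=0.7,
    )

    print(llm.invoke("Summarize what Retrieval-Augmented Generation is."))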

References

Asset references
Inference | Leverage external source | Multi-turn interaction | Reasoning & intermediate steps | Agents
✅ Building AI Chatbots with Mistral and Llama2 | 🔲 7 Frameworks for Serving LLMs | 🔲 A Cheat Sheet and Some Recipes For Building Advanced RAG | 🔲 Why Are Advanced RAG Methods Crucial for the Future of AI? | 🔲 Llama-lab
