A naive Retrieval-Augmented Generation (RAG) implementation: it uses the M3E model for text embedding, Milvus for vector storage and retrieval, and Llama3-8B as the LLM.
The UI is shown below:
You can run the following script to start the Gradio web UI:
$ python3 rag.py
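Under the hood, the app follows the usual RAG pattern: embed the question with M3E, retrieve similar chunks from Milvus, and hand them to Llama3-8B as context. Below is a minimal sketch of that flow using LangChain; it is not the exact contents of rag.py, and the M3E variant (moka-ai/m3e-base), the collection name `rag_docs`, and the example question are illustrative assumptions.

```python
# Minimal RAG flow sketch (illustrative; not the exact code in rag.py).
# Assumes Ollama and Milvus are already running (see the preparation steps below).
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.llms import Ollama
from langchain_milvus import Milvus

# M3E embedding model, downloaded from Hugging Face on first use
# (requires the sentence-transformers package).
embeddings = HuggingFaceEmbeddings(model_name="moka-ai/m3e-base")

# Milvus as the vector store; "rag_docs" is an illustrative collection name.
vectorstore = Milvus(
    embedding_function=embeddings,
    collection_name="rag_docs",
    connection_args={"uri": "http://localhost:19530"},
)

# Llama3-8B served locally by Ollama.
llm = Ollama(model="wangshenzhi/llama3-8b-chinese-chat-ollama-q4")

question = "What is Retrieval Augmented Generation?"
docs = vectorstore.similarity_search(question, k=3)  # retrieve top-3 chunks
context = "\n\n".join(d.page_content for d in docs)
print(llm.invoke(f"Answer using this context:\n{context}\n\nQuestion: {question}"))
```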
A brief demo video is available here:
https://www.bilibili.com/video/BV1ff421X7YN
The Chinese explanation document is here.
If you can't run it directly, you may need to do some preparation, including but not limited to the following steps; quick sanity-check snippets follow the list:
- Install Ollama and run the LLM: download and install Ollama from https://ollama.com/, then pull and run the model (see the Ollama check after this list):
$ ollama run wangshenzhi/llama3-8b-chinese-chat-ollama-q4
- Install or update the vector database (Milvus standalone via Docker Compose; see the Milvus check after this list):
$ wget https://github.com/milvus-io/milvus/releases/download/v2.4.4/milvus-standalone-docker-compose.yml -O docker-compose.yml
$ docker-compose up -d
$ docker ps  # verify that the Milvus containers are up
- Install the required Python packages (see the embedding check after this list):
$ pip install -U gradio pymilvus transformers FlagEmbedding langchain langchain-core langchain_community langchain-milvus langchain-text-splitters pypdf2 bs4
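To verify the Ollama step, you can query its local REST API directly (Ollama listens on port 11434 by default; the prompt string is just an example):

```python
# Sanity check: ask the Ollama-served model for a short reply.
import requests

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "wangshenzhi/llama3-8b-chinese-chat-ollama-q4",
        "prompt": "Say hello in one short sentence.",
        "stream": False,  # return one JSON object instead of a stream
    },
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["response"])
```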
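To verify the Milvus step, a short pymilvus script can confirm that the standalone server is reachable on its default port:

```python
# Sanity check: connect to Milvus standalone and list what it knows about.
from pymilvus import connections, utility

connections.connect(alias="default", host="localhost", port="19530")
print("Milvus server version:", utility.get_server_version())
print("Existing collections:", utility.list_collections())
```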
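To verify the package installation and the embedding model, you can try encoding a couple of sentences with M3E. This sketch assumes the moka-ai/m3e-base variant and uses the sentence-transformers package, which is not in the pip command above and may need to be installed separately:

```python
# Sanity check: load M3E and embed two test sentences.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("moka-ai/m3e-base")  # downloads on first use
vecs = model.encode(["这是一条测试句子。", "This is a test sentence."])
print(vecs.shape)  # expected: (2, 768) for m3e-base
```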