
Retrieval Augmented Generation

A naive Retrieval Augmented Generation (RAG) implementation that uses the M3e model for text embedding, Milvus for vector storage and retrieval, and Llama3-8b as the LLM.
The UI is shown below:

[UI demo screenshot]
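
For orientation, the sketch below shows how these pieces fit together. It is a minimal illustration, not the actual rag.py: the library choices (the langchain_community wrappers and langchain-milvus) follow the package list further down, HuggingFaceEmbeddings additionally requires sentence-transformers to be installed, and the sample texts are made up.

from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.llms import Ollama
from langchain_milvus import Milvus

# M3e embedding model (moka-ai/m3e-base); needs sentence-transformers.
embedding = HuggingFaceEmbeddings(model_name="moka-ai/m3e-base")

# Index a couple of toy documents in Milvus (standalone default port 19530).
store = Milvus.from_texts(
    ["Milvus is a vector database.", "Llama3-8b is an open-weight LLM."],
    embedding,
    connection_args={"uri": "http://localhost:19530"},
)

# Llama3-8b served locally by Ollama (see the preparation steps below).
llm = Ollama(model="wangshenzhi/llama3-8b-chinese-chat-ollama-q4")

question = "What is Milvus?"
docs = store.similarity_search(question, k=2)  # retrieve the nearest chunks
context = "\n".join(d.page_content for d in docs)
print(llm.invoke(f"Answer using this context:\n{context}\n\nQuestion: {question}"))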


You can run the following script to start the Gradio web UI:

python3 rag.py
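
For reference, wiring an answer function into a Gradio web UI looks roughly like this (a stub sketch, not the actual rag.py; the answer function here is a hypothetical placeholder):

import gradio as gr

def answer(question: str) -> str:
    # In rag.py this step would retrieve context from Milvus and query the LLM.
    return f"(stub) You asked: {question}"

gr.Interface(fn=answer, inputs="text", outputs="text",
             title="Retrieval Augmented Generation").launch()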

A brief video walkthrough is available at:
https://www.bilibili.com/video/BV1ff421X7YN

The Chinese-language explanation document is here.


If you can't run it directly, you may need to do some preparation, including but not limited to:

  • Install Ollama and run the LLM: download and install Ollama from https://ollama.com/, then pull and run the model (a quick sanity check appears after this list):

> ollama run wangshenzhi/llama3-8b-chinese-chat-ollama-q4

  • Install or update the vector database (Milvus v2.4.4 standalone via Docker Compose; a connectivity check also appears after this list):

$ wget https://github.com/milvus-io/milvus/releases/download/v2.4.4/milvus-standalone-docker-compose.yml -O docker-compose.yml
$ docker-compose up -d
$ docker ps

  • Install the Python packages:

pip install -U gradio pymilvus transformers FlagEmbedding langchain langchain-core langchain_community langchain-milvus langchain-text-splitters pypdf2 bs4
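
To confirm Ollama is serving the model, you can query its local REST API (a minimal sketch; the /api/generate endpoint and default port 11434 are standard Ollama, but this snippet is not part of rag.py):

import json
import urllib.request

# Ask the local Ollama server for a single non-streamed completion.
req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=json.dumps({
        "model": "wangshenzhi/llama3-8b-chinese-chat-ollama-q4",
        "prompt": "Say hello.",
        "stream": False,
    }).encode(),
    headers={"Content-Type": "application/json"},
)
print(json.loads(urllib.request.urlopen(req).read())["response"])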
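
Likewise, to confirm the Milvus container is reachable, a quick pymilvus connectivity check (an illustrative sketch, not part of rag.py; 19530 is the Milvus standalone default port):

from pymilvus import connections, utility

# Connect to the standalone Milvus instance started by docker-compose.
connections.connect(host="localhost", port="19530")
print(utility.get_server_version())  # e.g. "v2.4.4"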
