HEAVEN: Hybrid-Vector Retrieval for Visually Rich Documents

Official repository for our paper "Hybrid-Vector Retrieval for Visually Rich Documents: Combining Single-Vector Efficiency and Multi-Vector Accuracy".

🔥News

[2025/11] ViMDoc is now available on Hugging Face🤗!

ViMDoc Benchmark

ViMDoc (Visually-rich Long Multi-Document Retrieval Benchmark) is a benchmark for evaluating visual document retrieval under both multi-document and long-document settings.

from datasets import load_dataset
dataset = load_dataset("kaistdata/ViMDoc", split="ViMDoc")

Format

Sample datasets are available in benchmark/{ViMDoc,OpenDocVQA,ViDoSeek,M3DocVQA}. Each folder contains a sample_query.json file with queries and their ground-truth document IDs:

{
    "id": "<query_id>",
    "query": "<query_text>",
    "doc_ids": ["<document_id>"]
}

Sample document pages are stored in sample_pages/.
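The query format above can be read and scored with a few lines of Python. The following is a minimal sketch, not part of the repository: the helper names `load_queries` and `recall_at_k` are hypothetical, and it assumes that sample_query.json holds a JSON list of objects shaped like the record shown above (a single object is also accepted).

```python
import json
from pathlib import Path


def load_queries(path):
    """Return {query_id: (query_text, set of ground-truth document IDs)}."""
    records = json.loads(Path(path).read_text(encoding="utf-8"))
    # Assumption: the file is a JSON list of records; wrap a lone object.
    if isinstance(records, dict):
        records = [records]
    return {r["id"]: (r["query"], set(r["doc_ids"])) for r in records}


def recall_at_k(ranked_doc_ids, relevant_doc_ids, k):
    """Fraction of ground-truth documents found in the top-k ranked results."""
    relevant = set(relevant_doc_ids)
    if not relevant:
        return 0.0
    top_k = set(ranked_doc_ids[:k])
    return len(top_k & relevant) / len(relevant)
```

For example, `recall_at_k(ranking, load_queries(path)[query_id][1], k=10)` scores one query, where `ranking` is any list of document IDs ordered by a retriever.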

Note: Full datasets for other benchmarks are available from their original sources: OpenDocVQA | ViDoSeek | M3DocVQA

Indexing

(1) Encoding (Query/Document)

cd indexing/encode

# Visual encoder
python encoder.py --encoder_type dse --folder ViMDoc
python encoder.py --encoder_type colqwen25 --folder ViMDoc

# Textual encoder
python ocr.py --device 0 --folder ViMDoc
python encoder.py --encoder_type nvembedv2 --folder ViMDoc
python encoder.py --encoder_type bge_m3_multi --folder ViMDoc

Available Encoders

Encoder	Modality	Type	HF Checkpoint
`colpali`	Visual	Multi-Vector	`vidore/colpali-v1.3`
`colqwen2`	Visual	Multi-Vector	`vidore/colqwen2-v1.0`
`colqwen25`	Visual	Multi-Vector	`vidore/colqwen2.5-v0.2`
`gme`	Visual	Single-Vector	`Alibaba-NLP/gme-Qwen2-VL-2B-Instruct`
`dse`	Visual	Single-Vector	`MrLight/dse-qwen2-2b-mrl-v1`
`visret`	Visual	Single-Vector	`openbmb/VisRAG-Ret`
`bge_m3_multi`	Textual (OCR)	Multi-Vector	`BAAI/bge-m3`
`bge_m3`	Textual (OCR)	Single-Vector	`BAAI/bge-m3`
`nvembedv2`	Textual (OCR)	Single-Vector	`nvidia/NV-Embed-v2`
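When scripting experiments over several encoders, it can help to have the table above as a lookup structure. The sketch below simply transcribes the table into Python; `EncoderInfo`, `ENCODERS`, and `encoders_by` are hypothetical names for illustration and are not taken from the repository code.

```python
from typing import NamedTuple


class EncoderInfo(NamedTuple):
    modality: str     # "visual" or "textual" (textual encoders run on OCR output)
    vector_type: str  # "single" or "multi"
    checkpoint: str   # Hugging Face checkpoint listed in the table


# Keys are the values accepted by --encoder_type, as listed in the table.
ENCODERS = {
    "colpali": EncoderInfo("visual", "multi", "vidore/colpali-v1.3"),
    "colqwen2": EncoderInfo("visual", "multi", "vidore/colqwen2-v1.0"),
    "colqwen25": EncoderInfo("visual", "multi", "vidore/colqwen2.5-v0.2"),
    "gme": EncoderInfo("visual", "single", "Alibaba-NLP/gme-Qwen2-VL-2B-Instruct"),
    "dse": EncoderInfo("visual", "single", "MrLight/dse-qwen2-2b-mrl-v1"),
    "visret": EncoderInfo("visual", "single", "openbmb/VisRAG-Ret"),
    "bge_m3_multi": EncoderInfo("textual", "multi", "BAAI/bge-m3"),
    "bge_m3": EncoderInfo("textual", "single", "BAAI/bge-m3"),
    "nvembedv2": EncoderInfo("textual", "single", "nvidia/NV-Embed-v2"),
}


def encoders_by(modality=None, vector_type=None):
    """List encoder names matching the given modality and/or vector type."""
    return [
        name
        for name, info in ENCODERS.items()
        if (modality is None or info.modality == modality)
        and (vector_type is None or info.vector_type == vector_type)
    ]
```

For instance, `encoders_by("visual", "single")` lists the candidates for a single-vector visual first stage, and `encoders_by(vector_type="multi")` lists the multi-vector encoders.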

(2) VS-Page Construction

cd indexing/vs-page

# Step 1: Document Layout Analysis
python DLA.py --dataset ViMDoc --device 0

# Step 2: Assemble & VS-page Encoding
python assemble.py \
    --dataset ViMDoc \
    --encoder_type dse \
    --reduction_factor 15 \
    --device 0
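This README does not define `--reduction_factor`. One plausible reading, stated here purely as an assumption, is that it sets how many original pages are summarized into a single VS-page, so a larger factor yields a smaller first-stage index. Under that assumption, the grouping step could be sketched as follows; `group_pages` is a hypothetical helper, not the logic of assemble.py.

```python
def group_pages(page_ids, reduction_factor):
    """Split an ordered list of page IDs into consecutive groups.

    Assumption: each group would back one VS-page, so the number of
    VS-pages is ceil(len(page_ids) / reduction_factor).
    """
    if reduction_factor < 1:
        raise ValueError("reduction_factor must be a positive integer")
    return [
        page_ids[i:i + reduction_factor]
        for i in range(0, len(page_ids), reduction_factor)
    ]
```

With a factor of 15, a 32-page document would be split into three groups of 15, 15, and 2 pages under this reading.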

Retrieval - HEAVEN

Run the complete HEAVEN pipeline (Stage 1 + Stage 2):

cd retrieval/heaven

python heaven.py \
    --folder ViMDoc \
    --stage1_model dse \
    --stage2_model colqwen25 \
    --device 0 \
    --preprocess

Stage 1 only:

python stage1.py --folder ViMDoc --model dse --alpha 0.1 --filter_ratio 0.5

Stage 2 only:

# Preprocess queries first
python preprocess.py --folder ViMDoc --model colqwen25

# Run Stage 2
python stage2.py --folder ViMDoc --model colqwen25 --stage1_model dse --k 200 --filter_ratio 0.25
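The two stages follow the general retrieve-then-rerank pattern named in the paper title: a cheap single-vector pass selects candidates, and a multi-vector pass reorders them. The sketch below illustrates that generic pattern in plain Python with a standard MaxSim late-interaction score. It is not the repository's implementation: the function names are hypothetical, `k` is assumed to play the role of the candidate count, and the `--alpha` and `--filter_ratio` options are not modeled because this README does not define them.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def maxsim(query_tokens, page_tokens):
    """Late-interaction score: best page-token match per query token, summed."""
    return sum(max(cosine(q, p) for p in page_tokens) for q in query_tokens)


def two_stage_retrieve(query_vec, query_tokens, single_index, multi_index, k):
    """Rank page IDs: single-vector top-k first, multi-vector rerank second.

    single_index: {page_id: vector}, multi_index: {page_id: list of vectors}.
    """
    # Stage 1: score every page with one vector and keep the k best.
    candidates = sorted(
        single_index,
        key=lambda pid: cosine(query_vec, single_index[pid]),
        reverse=True,
    )[:k]
    # Stage 2: rerank only the candidates with the costlier MaxSim score.
    return sorted(
        candidates,
        key=lambda pid: maxsim(query_tokens, multi_index[pid]),
        reverse=True,
    )
```

The point of the split is cost: MaxSim compares every query token with every page token, so restricting it to `k` candidates keeps the expensive step independent of corpus size.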

Structure

HEAVEN/
│
├── benchmark/                    
│   ├── ViMDoc/                  
│   ├── OpenDocVQA/            
│   ├── ViDoSeek/                
│   └── M3DocVQA/
│       
├── indexing/                      
│   ├── encode/                  
│   └── vs-page/
│               
├── retrieval/                    
│   ├── baseline/                   
│   └── heaven/
│                
└── run.sh

Citation

@article{kim2025hybrid,
  title={Hybrid-Vector Retrieval for Visually Rich Documents: Combining Single-Vector Efficiency and Multi-Vector Accuracy},
  author={Kim, Juyeon and Lee, Geon and Choi, Dongwon and Kim, Taeuk and Shin, Kijung},
  journal={arXiv preprint arXiv:2510.22215},
  year={2025}
}
