ViLBERTScore

This repository provides ViLBERTScore, an evaluation metric for image captioning built on ViLBERT, as described in our paper "ViLBERTScore: Evaluating Image Caption Using Vision-and-Language BERT".

Repository Setup

This code builds on the original ViLBERT paper and its repository. The setup steps below are almost the same as those in the original repository.

Create a fresh conda environment, and install all dependencies.

conda create -n vilbert-score python=3.6
conda activate vilbert-score
git clone https://github.com/hwanheelee1993/ViLBERTScore.git
cd ViLBERTScore
pip install -r requirements.txt

Install PyTorch.

conda install pytorch torchvision cudatoolkit=10.0 -c pytorch

Install this codebase as a package in this environment.

python setup.py develop

Pre-trained models

We used two pre-trained models in our work: the pretrained ViLBERT model and the model fine-tuned on 12 tasks. Please download both models from the original ViLBERT repository and save them in the "save" directory (two files: "pretrained_model.bin" and "multi_task_model.bin").
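
Before running the metric, it can help to confirm that both checkpoint files are in place. The following is a minimal sketch, assuming the "save" directory sits in the current working directory; the helper name `missing_checkpoints` is hypothetical and not part of this repository.

```python
from pathlib import Path

# File names as listed above. The location of "save" relative to the
# current working directory is an assumption.
EXPECTED_CHECKPOINTS = ("pretrained_model.bin", "multi_task_model.bin")


def missing_checkpoints(save_dir="save"):
    """Return the expected checkpoint file names that are not in save_dir."""
    root = Path(save_dir)
    return [name for name in EXPECTED_CHECKPOINTS if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_checkpoints()
    if missing:
        print("Missing checkpoint files:", ", ".join(missing))
    else:
        print("Both checkpoint files found.")
```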

Feature Extraction

Using Processed Dataset

We provide processed versions of Flickr8k and Composite in this link, including the pre-computed detection features. Download the files and extract them to the "data" directory.

Extracting Features for Other Dataset

We extract the detection features following the guidelines in this link. Please extract the features as described there and save them as "imgs_rcnn.pkl", which is a list of the extracted features.

Then create the files "cand_caps.pkl" and "gt_caps.pkl" (and, optionally, "scores.pkl" for computing correlation), each of which is a list of the corresponding entries.
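
The pickle files can be written with Python's standard `pickle` module. The sketch below is an assumption about the layout rather than a specification: it treats each file as a plain Python list aligned by index, so that entry i of every file refers to the same image, and it assumes each ground-truth entry may itself be a list of reference captions. The helper name `save_pickle_lists` is hypothetical. Check dataset.py in this repository for the exact structure the code expects.

```python
import pickle
from pathlib import Path


def save_pickle_lists(out_dir, cand_caps, gt_caps, scores=None):
    """Write caption lists (and optional human scores) as pickle files.

    Returns the sorted names of the files that were written.
    """
    # Assumed convention: all lists are aligned by index.
    if len(cand_caps) != len(gt_caps):
        raise ValueError("cand_caps and gt_caps must have the same length")
    if scores is not None and len(scores) != len(cand_caps):
        raise ValueError("scores must have one entry per candidate caption")

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    files = {"cand_caps.pkl": cand_caps, "gt_caps.pkl": gt_caps}
    if scores is not None:
        files["scores.pkl"] = scores  # optional, only used for correlation

    for name, payload in files.items():
        with open(out / name, "wb") as f:
            pickle.dump(payload, f)
    return sorted(files)
```

For example, `save_pickle_lists("data/my_dataset", cand_caps, gt_caps, scores)` would write the three files under a hypothetical `data/my_dataset` directory.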

Computing Score

You can compute the scores using the following command.

python compute_vilbertscore.py --dataset flickr8k
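
When "scores.pkl" is available, metric scores can be compared against the human judgments with a rank correlation. The sketch below computes Kendall's tau-b in plain Python. The choice of Kendall's tau and the function name `kendall_tau_b` are assumptions made for illustration, and this is not necessarily how compute_vilbertscore.py reports correlation.

```python
import math
from itertools import combinations


def kendall_tau_b(x, y):
    """Kendall's tau-b between two equal-length sequences of numbers.

    Returns NaN when either sequence is constant (correlation undefined).
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")

    concordant = discordant = ties_x = ties_y = 0
    # Compare every unordered pair of items once.
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0:
            ties_x += 1
        if dy == 0:
            ties_y += 1
        if dx * dy > 0:
            concordant += 1
        elif dx * dy < 0:
            discordant += 1

    n_pairs = len(x) * (len(x) - 1) // 2
    denom = math.sqrt((n_pairs - ties_x) * (n_pairs - ties_y))
    if denom == 0:
        return float("nan")
    return (concordant - discordant) / denom
```

This pairwise loop is quadratic in the number of items, which is fine for a quick check. For large datasets, `scipy.stats.kendalltau` computes the same tau-b variant by default and is a reasonable alternative.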

License

MIT license

Please cite the following paper if you use this code: "ViLBERTScore: Evaluating Image Caption Using Vision-and-Language BERT".

@inproceedings{lee2020vilbertscore,
  title={ViLBERTScore: Evaluating Image Caption Using Vision-and-Language BERT},
  author={Lee, Hwanhee and Yoon, Seunghyun and Dernoncourt, Franck and Kim, Doo Soon and Bui, Trung and Jung, Kyomin},
  booktitle={Proceedings of the First Workshop on Evaluation and Comparison of NLP Systems},
  pages={34--39},
  year={2020}
}
