Bottom-up attention with detectron2; Compatible with LXMERT; previewing.

liz109/py-bottom-up-attention

Bottom-up Attention with Detectron2

A detectron2 implementation with exactly the same model and weights as the Caffe VG Faster R-CNN provided in bottom-up-attention.

The features extracted with this repo are compatible with the LXMERT code and pre-trained models here. The original bottom-up-attention is implemented in Caffe, which is not easy to install and is inconsistent with training code written in PyTorch. This project therefore transfers the weights and models to detectron2, which can be installed in a few lines and has a PyTorch front end.

Installation

git clone https://github.com/airsplay/py-bottom-up-attention.git
cd py-bottom-up-attention

# Install python libraries
pip install -r requirements.txt
pip install 'git+https://github.com/cocodataset/cocoapi.git#subdirectory=PythonAPI'

# Install detectron2
python setup.py build develop

# or if you are on macOS
# MACOSX_DEPLOYMENT_TARGET=10.9 CC=clang CXX=clang++ python setup.py build develop

# or, as an alternative to `setup.py`, do
# pip install [--editable] .
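
After the steps above, a quick check that the main packages are importable can catch a failed build early. The snippet below is only a sketch: the module names in `EXPECTED_MODULES` are an assumption made for illustration (they are not read from `requirements.txt`), and `check_modules` is a hypothetical helper, not part of this repo.

```python
from importlib.util import find_spec

# Top-level module names the install steps are expected to provide.
# ASSUMPTION: this list is illustrative and not taken from requirements.txt.
EXPECTED_MODULES = ("torch", "detectron2", "pycocotools")


def check_modules(names=EXPECTED_MODULES):
    """Return a dict mapping each module name to True if it can be located."""
    # find_spec returns None for a top-level module that is not installed,
    # and it does not import the module itself.
    return {name: find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, found in check_modules().items():
        print(f"{name}: {'ok' if found else 'MISSING'}")
```

A module reported as MISSING usually means the corresponding install step did not complete in the active Python environment.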

Demos

Object Detection

demo vg detection

Feature Extraction

  1. Single image: demo extraction
  2. Batchwise extraction: demo batchwise extraction
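
For batchwise extraction, the image list has to be split into fixed-size groups before each group is passed to the model. The helper below is a generic sketch of that splitting step only; `batched_paths` is a hypothetical name, and how the demo itself builds its batches is not described here.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def batched_paths(paths: Iterable[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive lists of at most `batch_size` image paths."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    it = iter(paths)
    while True:
        # Take the next `batch_size` items; the final batch may be shorter.
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


if __name__ == "__main__":
    for batch in batched_paths(["a.jpg", "b.jpg", "c.jpg"], batch_size=2):
        print(batch)
```

A smaller batch size lowers peak GPU memory use at the cost of more forward passes.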

Feature Extraction Scripts for LXMERT

  1. For MS COCO (VQA): vqa script
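
Vision-and-language models that consume region features commonly expect box coordinates scaled to the [0, 1] range by the image width and height. The sketch below illustrates that scaling with plain Python. It assumes boxes are given as `(x1, y1, x2, y2)` in pixels; `normalize_boxes` is a hypothetical helper, and whether the vqa script already applies this step is not stated here.

```python
from typing import List, Sequence, Tuple

# ASSUMPTION: boxes are (x1, y1, x2, y2) corner coordinates in pixels.
Box = Tuple[float, float, float, float]


def normalize_boxes(boxes: Sequence[Box], img_w: float, img_h: float) -> List[Box]:
    """Scale pixel-space boxes to [0, 1] using the image width and height."""
    if img_w <= 0 or img_h <= 0:
        raise ValueError("image width and height must be positive")
    # x coordinates are divided by the width, y coordinates by the height.
    return [
        (x1 / img_w, y1 / img_h, x2 / img_w, y2 / img_h)
        for x1, y1, x2, y2 in boxes
    ]


if __name__ == "__main__":
    print(normalize_boxes([(0, 0, 320, 240), (160, 120, 640, 480)], 640, 480))
```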

References

Detectron2:

@misc{wu2019detectron2,
  author =       {Yuxin Wu and Alexander Kirillov and Francisco Massa and
                  Wan-Yen Lo and Ross Girshick},
  title =        {Detectron2},
  howpublished = {\url{https://github.com/facebookresearch/detectron2}},
  year =         {2019}
}

Bottom-up Attention:

@inproceedings{Anderson2017up-down,
  author = {Peter Anderson and Xiaodong He and Chris Buehler and Damien Teney and Mark Johnson and Stephen Gould and Lei Zhang},
  title = {Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering},
  booktitle={CVPR},
  year = {2018}
}

LXMERT:

@inproceedings{tan2019lxmert,
  title={LXMERT: Learning Cross-Modality Encoder Representations from Transformers},
  author={Tan, Hao and Bansal, Mohit},
  booktitle={Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing},
  year={2019}
}
