We use pre-trained CNN models to extract visual features from images. The features are stored in a binary format that can be read with BigFile; see our wiki page for detailed instructions.
wget http://lixirong.net/data/coco-cn/coco-cn_resnext-101_feat.tar.gz
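As a rough illustration of reading such binary features, here is a minimal sketch. It does not use the BigFile code itself, and the layout it reads is an assumption rather than something confirmed here: a directory holding `shape.txt` (image count and feature dimension), `id.txt` (whitespace-separated image ids in storage order), and `feature.bin` (row-major float32 values in native byte order). The function name `read_features` is hypothetical; the wiki page remains the authoritative reference.

```python
import os
from array import array


def read_features(feat_dir, wanted_ids):
    """Return {image_id: feature vector} for the requested ids.

    Assumed (not confirmed) layout of feat_dir:
      shape.txt   -> "<number of images> <feature dimension>"
      id.txt      -> whitespace-separated image ids, in storage order
      feature.bin -> row-major float32 matrix, native byte order
    """
    with open(os.path.join(feat_dir, "shape.txt")) as f:
        n_images, n_dims = (int(x) for x in f.read().split())

    with open(os.path.join(feat_dir, "id.txt")) as f:
        ids = f.read().split()
    if len(ids) != n_images:
        raise ValueError("id.txt and shape.txt disagree on the image count")

    # Map each image id to its row in the feature matrix.
    row_of = {image_id: row for row, image_id in enumerate(ids)}

    features = {}
    with open(os.path.join(feat_dir, "feature.bin"), "rb") as f:
        for image_id in wanted_ids:
            row = row_of.get(image_id)
            if row is None:
                continue  # silently skip ids that are not stored
            vec = array("f")
            # Jump to the start of this row, then read one vector.
            f.seek(row * n_dims * vec.itemsize)
            vec.fromfile(f, n_dims)
            features[image_id] = list(vec)
    return features
```

Calling `read_features("path/to/extracted/features", ["some_image_id"])` would return a dictionary with one entry per id found; both the path and the id are placeholders. Seeking to each row keeps memory use low, since only the requested vectors are loaded.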
- NUS-WIDE100: A set of 100 images randomly selected from the NUS-WIDE dataset for a user study.
- Chinese tag vocabulary: A set of 655 Chinese tags defined for the cross-lingual image tagging task.
- Sentences with typos: A list of the sentences in which typos have been detected so far. Although we tried our best to collect high-quality annotations, small typos are unfortunately inevitable.
- We thank the MediaMill team at the University of Amsterdam for generously providing their trained ResNeXt-101 model.
- We thank Miss Xinru Chen for checking the COCO-CN sentences for typos.