COCO-CN is a bilingual image description dataset enriching MS-COCO with manually written Chinese sentences and tags. The new dataset can be used for multiple tasks including image tagging, captioning and retrieval, all in a cross-lingual setting.
| Chinese sentences | COCO-CN train | COCO-CN val | COCO-CN test |
|---|---|---|---|
| human written | ✅ | ✅ | ✅ |
| human translation | ❌ | ❌ | ✅ |
| machine translation (Baidu) | ✅ | ✅ | ✅ |
- version 201805: 20,341 images (training / validation / test: 18,341 / 1,000 / 1,000), associated with 22,218 manually written Chinese sentences and 5,000 manually translated sentences. The data is freely available upon request; please submit your request via the Google Form. A sketch for sanity-checking the split sizes is given after this list.
- Precomputed image features: ResNeXt-101 (a loading sketch follows this list).
- COCO-CN-Results-Viewer: A lightweight tool to inspect the results of different image captioning systems on the COCO-CN test set, developed by Emiel van Miltenburg at Tilburg University.
- NUS-WIDE100: An extra test set.
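Once the annotations are obtained, the split sizes reported above can be sanity-checked programmatically. The sketch below assumes a hypothetical JSON file `coco_cn_captions.json` that maps each image id to its split and its manually written Chinese sentences; the file name and layout are illustrative assumptions, not the actual distribution format.

```python
import json
from collections import Counter

# Hypothetical layout (not the official distribution format):
# { "COCO_val2014_000000123456": {"split": "train", "sentences": ["...", "..."]}, ... }
with open("coco_cn_captions.json", encoding="utf-8") as f:
    annotations = json.load(f)

# Count images per split and the total number of manually written sentences.
split_images = Counter(entry["split"] for entry in annotations.values())
num_sentences = sum(len(entry["sentences"]) for entry in annotations.values())

# Figures reported for version 201805.
assert split_images["train"] == 18341
assert split_images["val"] == 1000
assert split_images["test"] == 1000
assert num_sentences == 22218
print(split_images, num_sentences)
```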
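The precomputed ResNeXt-101 features can be used directly for simple retrieval experiments. Below is a minimal sketch assuming the features are stored as a NumPy array with one row per image and an accompanying id list in the same order; the file names `resnext101_feats.npy` and `image_ids.txt` are hypothetical, not the names of the released files.

```python
import numpy as np

# Hypothetical files: one feature row per image, ids listed in the same order.
features = np.load("resnext101_feats.npy").astype(np.float32)
image_ids = [line.strip() for line in open("image_ids.txt", encoding="utf-8")]

# L2-normalize so that a dot product equals cosine similarity.
features /= np.linalg.norm(features, axis=1, keepdims=True) + 1e-12

def nearest_neighbors(query_index: int, k: int = 5):
    """Return the k images most similar to the query image, with scores."""
    scores = features @ features[query_index]
    order = np.argsort(-scores)
    return [(image_ids[i], float(scores[i])) for i in order[1:k + 1]]

print(nearest_neighbors(0))
```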
- 2018-12-16: Code for cross-lingual image tagging and captioning released.
- 2018-12-20: Code for cross-lingual image retrieval and our image annotation system released.
- 2019-01-13: The COCO-CN paper was accepted as a regular paper by the IEEE T-MM journal.
- 2021-02-03: Release of new annotations (4,573 images and 4,712 manually written sentences) collected via our iCap interactive image captioning system. The images have no overlap with the previously released dataset.
If you find COCO-CN useful, please consider citing the following paper:
- Xirong Li, Chaoxi Xu, Xiaoxu Wang, Weiyu Lan, Zhengxiong Jia, Gang Yang, Jieping Xu, COCO-CN for Cross-Lingual Image Tagging, Captioning and Retrieval, IEEE Transactions on Multimedia, Volume 21, Number 9, pages 2347-2360, 2019