VISA is a dataset of about 40k (39,880) Japanese-English parallel sentence pairs with corresponding video clips. Its key features are:
- The parallel sentences are subtitles from movies and TV episodes
- The source subtitles are ambiguous, which means they have multiple possible translations with different meanings
- We divide the dataset into two subsets, Polysemy and Omission, according to the cause of ambiguity
Polysemy:
放せ! --> Let me go!
Omission:
銃を持ってる。 --> I have a gun.
Split | Train | Validation | Test |
---|---|---|---|
Polysemy | 18,666 | 1,000 | 1,000 |
Omission | 17,214 | 1,000 | 1,000 |
Combined | 35,880 | 2,000 | 2,000 |
You can read the JSON files to find the mapping from video clips to parallel subtitle pairs. Each entry has the following form:
video_file_name: {
{ "ja": Japanese_subtitle },
{ "en": English_subtitle }
}
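The mapping above can be read with a few lines of Python. This is a minimal sketch, not an official loader: the function names and the clip file names are hypothetical, and because the schema above is only schematic, the code assumes each value is either a single object with `ja` and `en` keys or a list of one-key objects, and accepts both.

```python
import json
from typing import Dict, Iterator, Tuple


def normalize_entry(entry) -> Dict[str, str]:
    """Return {"ja": ..., "en": ...} for one video entry.

    Assumption: an entry is either a dict with both keys, or a list of
    one-key dicts such as [{"ja": ...}, {"en": ...}].
    """
    if isinstance(entry, dict):
        return {"ja": entry["ja"], "en": entry["en"]}
    merged: Dict[str, str] = {}
    for part in entry:
        merged.update(part)
    return {"ja": merged["ja"], "en": merged["en"]}


def iter_pairs(mapping) -> Iterator[Tuple[str, str, str]]:
    """Yield (video_file_name, japanese_subtitle, english_subtitle)."""
    for video_file_name, entry in mapping.items():
        pair = normalize_entry(entry)
        yield video_file_name, pair["ja"], pair["en"]


def load_pairs(json_path):
    """Load one JSON file (hypothetical path) into a list of triples."""
    with open(json_path, encoding="utf-8") as f:
        return list(iter_pairs(json.load(f)))


# In-memory example using the two subtitle pairs shown earlier;
# the clip names are made up for illustration.
example = {
    "clip_0001.mp4": [{"ja": "放せ!"}, {"en": "Let me go!"}],
    "clip_0002.mp4": {"ja": "銃を持ってる。", "en": "I have a gun."},
}
pairs = list(iter_pairs(example))
```

If the actual files use a different layout, only `normalize_entry` should need to change.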
Please note that by downloading the dataset, you agree to the following conditions:
- Do not re-distribute the dataset without our permission.
- The dataset can only be used for research purposes. Any other use is explicitly prohibited.
If you are interested in the video features of VISA, you can download them from the following links:
- The I3D Features of VISA: http://lotus.kuee.kyoto-u.ac.jp/~yihang/dataset/VISA_i3d.zip
- The RCNN Features of VISA: http://lotus.kuee.kyoto-u.ac.jp/~yihang/dataset/VISA_rcnn.zip
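As a rough sketch, one of the feature archives can be fetched and unpacked with the Python standard library. The helper names and local paths are hypothetical, and nothing is assumed about the file layout inside the archives: the code only extracts them and lists their contents.

```python
import urllib.request
import zipfile
from pathlib import Path

# URL taken from the list above.
I3D_URL = "http://lotus.kuee.kyoto-u.ac.jp/~yihang/dataset/VISA_i3d.zip"


def download(url, dest):
    """Download url to dest unless the file already exists; return the path."""
    dest = Path(dest)
    if not dest.exists():
        urllib.request.urlretrieve(url, str(dest))
    return dest


def extract_and_list(zip_path, out_dir):
    """Extract the archive into out_dir and return its sorted member names."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(out_dir)
        return sorted(archive.namelist())


# Usage (left commented out because the archive may be large):
# names = extract_and_list(download(I3D_URL, "VISA_i3d.zip"), "VISA_i3d")
# print(names[:5])
```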
If you find this dataset helpful, please cite our publication "VISA: An Ambiguous Subtitles Dataset for Visual Scene-Aware Machine Translation":
@inproceedings{li-etal-2022-visa,
title = "{VISA}: An Ambiguous Subtitles Dataset for Visual Scene-aware Machine Translation",
author = "Li, Yihang and
Shimizu, Shuichiro and
Gu, Weiqi and
Chu, Chenhui and
Kurohashi, Sadao",
booktitle = "Proceedings of the Thirteenth Language Resources and Evaluation Conference",
month = jun,
year = "2022",
address = "Marseille, France",
publisher = "European Language Resources Association",
url = "https://aclanthology.org/2022.lrec-1.725",
pages = "6735--6743",
}
If you have any questions about this dataset, please contact [email protected].