We use various synthetic and real datasets. More information is in Appendix F of the supplementary material. Some preprocessing scripts are included in the tools/ directory.
Dataset | Type | Remarks |
---|---|---|
MJSynth | synthetic | Case-sensitive annotations were extracted from the image filenames |
SynthText | synthetic | Processed with crop_by_word_bb_syn90k.py |
IC13 | real | Three archives: 857, 1015, 1095 (full) |
IC15 | real | Two archives: 1811, 2077 (full) |
CUTE80 | real | [1] |
IIIT5k | real | [1] |
SVT | real | [1] |
SVTP | real | [1] |
ArT | real | [2] |
LSVT | real | [2] |
MLT19 | real | [2] |
RCTW17 | real | [2] |
ReCTS | real | [2] |
Uber-Text | real | [2] |
COCO-Text v1.4 | real | Processed with coco_text_converter.py |
COCO-Text v2.0 | real | Processed with coco_2_converter.py |
OpenVINO | real | Annotations for a subset of Open Images. Processed with openvino_converter.py. |
TextOCR | real | Annotations for a subset of Open Images. Processed with textocr_converter.py. A horizontal version can be generated by passing --rectify_pose. |
[1] Case-sensitive annotations from Long and Yao + our corrections. Processed with case_sensitive_str_datasets_converter.py
[2] Archives used as-is from Baek et al. They are included in the dataset release for convenience. Please refer to their work for more info about the datasets.
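As an illustration of the MJSynth remark in the table, here is a minimal sketch of how a case-sensitive label could be read from an image filename. It assumes the commonly seen MJSynth naming pattern `<index>_<word>_<lexicon_id>.jpg`; the function name `label_from_mjsynth_filename` and the sample path are hypothetical and are not taken from the scripts in tools/.

```python
from pathlib import Path


def label_from_mjsynth_filename(path: str) -> str:
    """Return the word embedded in an MJSynth-style filename.

    Assumption: the file stem looks like `<index>_<word>_<lexicon_id>`,
    so the label is everything between the first and last underscore.
    """
    parts = Path(path).stem.split("_")
    if len(parts) < 3:
        raise ValueError(f"unexpected filename format: {path}")
    # Rejoin the middle fields in case the word itself contains underscores.
    return "_".join(parts[1:-1])


# Hypothetical sample path; the original casing of the word is preserved.
print(label_from_mjsynth_filename("2425/1/115_Lube_45484.jpg"))
```

Because the label is taken verbatim from the filename, its casing is kept, which is what makes the resulting annotations case-sensitive.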
The preprocessed archives are available here: val + test + most of train, TextOCR + OpenVINO
The expected filesystem structure is as follows:
data
├── test
│   ├── ArT
│   ├── COCOv1.4
│   ├── CUTE80
│   ├── IC13_1015
│   ├── IC13_1095  # Full IC13 test set. Typically not used for benchmarking but provided here for convenience.
│   ├── IC13_857
│   ├── IC15_1811
│   ├── IC15_2077
│   ├── IIIT5k
│   ├── SVT
│   ├── SVTP
│   └── Uber
├── train
│   ├── real
│   │   ├── ArT
│   │   │   ├── train
│   │   │   └── val
│   │   ├── COCOv2.0
│   │   │   ├── train
│   │   │   └── val
│   │   ├── LSVT
│   │   │   ├── test
│   │   │   ├── train
│   │   │   └── val
│   │   ├── MLT19
│   │   │   ├── test
│   │   │   ├── train
│   │   │   └── val
│   │   ├── OpenVINO
│   │   │   ├── train_1
│   │   │   ├── train_2
│   │   │   ├── train_5
│   │   │   ├── train_f
│   │   │   └── validation
│   │   ├── RCTW17
│   │   │   ├── test
│   │   │   ├── train
│   │   │   └── val
│   │   ├── ReCTS
│   │   │   ├── test
│   │   │   ├── train
│   │   │   └── val
│   │   ├── TextOCR
│   │   │   ├── train
│   │   │   └── val
│   │   └── Uber
│   │       ├── train
│   │       └── val
│   └── synth
│       ├── MJ
│       │   ├── test
│       │   ├── train
│       │   └── val
│       └── ST
└── val
    ├── IC13
    ├── IC15
    ├── IIIT5k
    └── SVT
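
After extracting the archives, it can help to confirm that the layout matches the tree above. The following is a minimal sketch, not part of the repository: the helper name `missing_dirs` and the `EXPECTED` mapping are hypothetical, and the root is assumed to be a directory named `data` in the current working directory. It checks only the dataset-level folders, not the split subfolders or their contents.

```python
from pathlib import Path

# Dataset-level directories taken from the tree above (splits are not checked).
EXPECTED = {
    "test": ["ArT", "COCOv1.4", "CUTE80", "IC13_1015", "IC13_1095", "IC13_857",
             "IC15_1811", "IC15_2077", "IIIT5k", "SVT", "SVTP", "Uber"],
    "train/real": ["ArT", "COCOv2.0", "LSVT", "MLT19", "OpenVINO", "RCTW17",
                   "ReCTS", "TextOCR", "Uber"],
    "train/synth": ["MJ", "ST"],
    "val": ["IC13", "IC15", "IIIT5k", "SVT"],
}


def missing_dirs(root, expected=EXPECTED):
    """Return the expected dataset directories that do not exist under root."""
    root = Path(root)
    return [
        f"{parent}/{name}"
        for parent, names in expected.items()
        for name in names
        if not (root / parent / name).is_dir()
    ]


if __name__ == "__main__":
    # Assumed root location; adjust to wherever the archives were extracted.
    missing = missing_dirs("data")
    if missing:
        print("Missing directories:")
        for d in missing:
            print(f"  {d}")
    else:
        print("All expected dataset directories are present.")
```

An empty result only means the folders exist; it says nothing about whether each dataset was extracted completely.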