Dataset Preparation Guide

To run the Accuracy Checker tool and the Model Quantizer with the prepared configs, you need to organize the <DATASET_DIR> folder with validation datasets in a certain way. This document describes how to prepare the validation data.

Each dataset description consists of the following sections:

instructions for downloading the dataset
the structure of <DATASET_DIR> that matches the dataset definition in the existing global configuration file (dataset_definitions.yml)
examples of how the dataset is used and represented in the global configuration file

You can find more detailed information about using predefined configuration files here.

ImageNet

How to download the dataset

To download images from ImageNet, you need to have an account and agree to the Terms of Access. Follow the steps below:

Go to the ImageNet homepage
If you have an account, click Login. Otherwise, click Signup in the upper right corner, provide your data, and wait for a confirmation email
Log in after receiving the confirmation email and go to the Download tab
Select Download Original Images
You will be redirected to the Terms of Access page. If you agree to the Terms, continue by clicking Agree and Sign
Click one of the links in the Download as one tar file section to select it
Unpack the archive

To download the annotation files, follow the steps below:

val.txt
1. Download the archive
2. Unpack val.txt from the archive caffe_ilsvrc12.tar.gz
val15.txt
1. Download the annotation file
2. Rename ILSVRC2017_val.txt to val15.txt
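
The two annotation steps above can also be scripted. The following is a minimal sketch using only the Python standard library. The function names are hypothetical, and it assumes the archive and the annotation file have already been downloaded; it also assumes val.txt may sit at any depth inside the archive, so it matches on the base name.

```python
import shutil
import tarfile
from pathlib import Path


def extract_val_txt(archive_path, dataset_dir):
    """Copy val.txt out of the archive into dataset_dir and return its path."""
    dataset_dir = Path(dataset_dir)
    dataset_dir.mkdir(parents=True, exist_ok=True)
    target = dataset_dir / "val.txt"
    with tarfile.open(str(archive_path), "r:gz") as archive:
        for member in archive.getmembers():
            # Match on the base name: the member may be stored as
            # "val.txt" or "./val.txt", depending on how the archive was built.
            if member.isfile() and Path(member.name).name == "val.txt":
                with archive.extractfile(member) as src, open(target, "wb") as dst:
                    shutil.copyfileobj(src, dst)
                return target
    raise FileNotFoundError(f"val.txt not found in {archive_path}")


def rename_val15(downloaded_file, dataset_dir):
    """Move ILSVRC2017_val.txt into dataset_dir under the name val15.txt."""
    dataset_dir = Path(dataset_dir)
    dataset_dir.mkdir(parents=True, exist_ok=True)
    target = dataset_dir / "val15.txt"
    shutil.move(str(downloaded_file), str(target))
    return target
```

For example, `extract_val_txt("caffe_ilsvrc12.tar.gz", dataset_dir)` followed by `rename_val15("ILSVRC2017_val.txt", dataset_dir)` would leave both annotation files in the dataset folder.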

Files layout

To use this dataset with OMZ tools, make sure <DATASET_DIR> contains the following:

ILSVRC2012_img_val - directory containing the ILSVRC 2012 validation images
val.txt - annotation file used for ILSVRC 2012
val15.txt - annotation file used for ILSVRC 2015
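
Before running the tools, it can help to confirm that the folder matches the layout listed above. This is a small sketch, not part of OMZ: the helper name `missing_entries` and the `IMAGENET_LAYOUT` tuple are hypothetical, and the check only tests that each entry exists, not that its contents are valid.

```python
from pathlib import Path

# Entries copied from the layout list above; pass a different tuple
# to check the layout of another dataset.
IMAGENET_LAYOUT = ("ILSVRC2012_img_val", "val.txt", "val15.txt")


def missing_entries(dataset_dir, expected=IMAGENET_LAYOUT):
    """Return the expected entries that are absent from dataset_dir."""
    root = Path(dataset_dir)
    return [name for name in expected if not (root / name).exists()]
```

An empty list means every expected entry is present. The same helper can be reused for the other datasets in this guide by passing their entries, including nested ones such as `WIDER_val/images`.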

Datasets in dataset_definitions.yml

imagenet_1000_classes is used to evaluate models trained on the ILSVRC 2012 dataset with 1000 classes. (model examples: alexnet, vgg16)
imagenet_1000_classes_2015 is used to evaluate models trained on the ILSVRC 2015 dataset with 1000 classes. (model examples: se-resnet-152, se-resnext-50)
imagenet_1001_classes is used to evaluate models trained on the ILSVRC 2012 dataset with 1001 classes (background label + original labels). (model examples: googlenet-v2-tf, resnet-50-tf)
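
The difference between the 1000-class and 1001-class definitions can be illustrated with a short sketch. It rests on two assumptions that are not stated in this guide: that each line of val.txt has the form `<image name> <class index>` with zero-based indices, and that the background label takes index 0, so every original label shifts up by one. The function `read_labels` is hypothetical and is not the converter used by the Accuracy Checker.

```python
def read_labels(lines, has_background=False):
    """Parse "<image name> <class index>" lines into (name, label) pairs.

    Assumption: with a background label at index 0, original labels shift by one.
    """
    offset = 1 if has_background else 0
    pairs = []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        name, label = line.split()
        pairs.append((name, int(label) + offset))
    return pairs
```

Under these assumptions, a line that yields label 65 for imagenet_1000_classes would yield label 66 for imagenet_1001_classes.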

Common Objects in Context (COCO)

How to download the dataset

To download the COCO dataset, follow the steps below:

Download 2017 Val images and 2017 Train/Val annotations
Unpack the archives

Files layout

To use this dataset with OMZ tools, make sure <DATASET_DIR> contains the following:

val2017 - directory containing the COCO 2017 validation images
instances_val2017.json - annotation file used for object detection and instance segmentation tasks
person_keypoints_val2017.json - annotation file used for human pose estimation tasks

Datasets in dataset_definitions.yml

ms_coco_mask_rcnn is used to evaluate models trained on the COCO dataset for object detection and instance segmentation tasks. A background label plus a label map with the 80 publicly available object categories is used. Annotations are saved in order of ascending image ID.
ms_coco_detection_91_classes is used to evaluate models trained on the COCO dataset for object detection tasks. A background label plus a label map with the 80 publicly available object categories is used; the original indexing over 91 categories is preserved. You can find more information about object category labels here. Annotations are saved in order of ascending image ID. (model examples: faster_rcnn_resnet50_coco, ssd_resnet50_v1_fpn_coco)
ms_coco_detection_80_class_with_background is used to evaluate models trained on the COCO dataset for object detection tasks. A background label plus a label map with the 80 publicly available object categories is used. Annotations are saved in order of ascending image ID. (model examples: faster-rcnn-resnet101-coco-sparse-60-0001, ssd-resnet34-1200-onnx)
ms_coco_detection_80_class_without_background is used to evaluate models trained on the COCO dataset for object detection tasks. A label map with the 80 publicly available object categories is used. Annotations are saved in order of ascending image ID. (model examples: ctdet_coco_dlav0_384, yolo-v3-tf)
ms_coco_keypoints is used to evaluate models trained on the COCO dataset for human pose estimation tasks. Each annotation stores multiple keypoints for one image. (model examples: human-pose-estimation-0001)
ms_coco_single_keypoints is used to evaluate models trained on the COCO dataset for human pose estimation tasks. Each annotation stores a single set of keypoints for an image, so several annotations can be associated with one image. (model examples: single-human-pose-estimation-0001)
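
The detection definitions above differ mainly in how category IDs are mapped to labels. The sketch below illustrates the three variants on a `categories` list shaped like the one in instances_val2017.json (dictionaries with `id` and `name` keys). The function `build_label_map` and its parameters are hypothetical; this is an illustration of the idea, not the Accuracy Checker's converter.

```python
def build_label_map(categories, has_background=False, keep_original_ids=False):
    """Map label indices to category names.

    keep_original_ids=True keeps the sparse original category IDs
    (the "91 classes" style); otherwise labels are renumbered contiguously.
    """
    ordered = sorted(categories, key=lambda category: category["id"])
    if keep_original_ids:
        label_map = {category["id"]: category["name"] for category in ordered}
    else:
        # Contiguous labels start at 1 when index 0 is reserved for background.
        start = 1 if has_background else 0
        label_map = {
            index: category["name"] for index, category in enumerate(ordered, start)
        }
    if has_background:
        label_map[0] = "background"
    return label_map


def sort_by_image_id(images):
    """Order image records by ascending image ID."""
    return sorted(images, key=lambda image: image["id"])
```

With a three-entry excerpt whose IDs are 1, 2, and 13, the contiguous variant without background produces labels 0, 1, 2, while the variant that preserves original indexing keeps 1, 2, 13 and adds background at 0.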

WIDER FACE

How to download the dataset

To download the WIDER Face dataset, follow the steps below:

Go to the WIDER FACE website
Go to the Download section
Select WIDER Face Validation images and download them from Google Drive or Tencent Drive
Select and download Face annotations
Unpack the archives

Files layout

To use this dataset with OMZ tools, make sure <DATASET_DIR> contains the following:

WIDER_val - directory containing the images directory
- images - directory containing the WIDER Face validation images
wider_face_split - directory containing the annotation file
- wider_face_val_bbx_gt.txt - annotation file
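
To sanity-check the annotation file after unpacking, a small parser can help. This sketch assumes a layout that this guide does not describe: an image path line, a face count line, then one row per face whose first four values are x, y, width, and height. It also assumes that an image with zero faces is still followed by one placeholder row. The function name is hypothetical.

```python
def parse_wider_annotation(lines):
    """Return {image path: [(x, y, width, height), ...]} from annotation lines."""
    rows = [line.strip() for line in lines if line.strip()]
    faces = {}
    index = 0
    while index < len(rows):
        image_path = rows[index]
        count = int(rows[index + 1])
        index += 2
        # Assumption: a zero-face image still has one placeholder row to skip.
        row_count = max(count, 1)
        boxes = [
            tuple(int(value) for value in row.split()[:4])
            for row in rows[index:index + row_count]
        ]
        faces[image_path] = boxes[:count]
        index += row_count
    return faces
```

It can be called on an open file, for example `parse_wider_annotation(open(path))`, since it only iterates over lines.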

Datasets in dataset_definitions.yml

wider is used to evaluate models on the WIDER Face dataset where the face is the first class. (model examples: mtcnn, retinaface-resnet50)
wider_without_bkgr is used to evaluate models on the WIDER Face dataset where the face is class zero. (model examples: mobilefacedet-v1-mxnet)

Visual Object Classes Challenge 2012 (VOC2012)

How to download the dataset

To download the VOC2012 dataset, follow the steps below:

Go to the VOC2012 website
Go to the Development Kit section
Select Download the training/validation data and download the archive
Unpack the archive

Files layout

To use this dataset with OMZ tools, make sure <DATASET_DIR> contains the following:

VOCdevkit/VOC2012 - directory containing the annotation, image, segmentation mask, and image set directories
- Annotations - directory containing the VOC2012 annotation files
- JPEGImages - directory containing the VOC2012 validation images
- ImageSets - directory containing the VOC2012 text files specifying lists of images for different tasks
  - Main/val.txt - image sets file for detection tasks
  - Segmentation/val.txt - image sets file for segmentation tasks
- SegmentationClass - directory containing the VOC2012 segmentation masks
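
The layout above ties each task to an image sets file that lists which images to use. The sketch below shows how those lists could be resolved to file paths. It assumes that each line of val.txt holds one image identifier, that images use the `.jpg` extension, detection annotations `.xml`, and segmentation masks `.png`. The function name `voc_samples` is hypothetical.

```python
from pathlib import Path


def voc_samples(dataset_dir, task="detection"):
    """Return (image path, ground-truth path) pairs for the validation split."""
    root = Path(dataset_dir) / "VOCdevkit" / "VOC2012"
    if task == "detection":
        list_file = root / "ImageSets" / "Main" / "val.txt"
        truth_dir, truth_ext = root / "Annotations", ".xml"
    elif task == "segmentation":
        list_file = root / "ImageSets" / "Segmentation" / "val.txt"
        truth_dir, truth_ext = root / "SegmentationClass", ".png"
    else:
        raise ValueError(f"unknown task: {task}")
    # Assumption: the image sets file holds one image identifier per line.
    image_ids = list_file.read_text().split()
    return [
        (root / "JPEGImages" / f"{image_id}.jpg", truth_dir / f"{image_id}{truth_ext}")
        for image_id in image_ids
    ]
```

Checking that every returned path exists is a quick way to confirm the archive was unpacked into the expected place.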

Datasets in dataset_definitions.yml

VOC2012 is used to evaluate models on the VOC2012 dataset for object detection tasks. A background label plus a label map with 20 object categories is used. (model examples: mobilenet-ssd, ssd300)
VOC2012_without_background is used to evaluate models on the VOC2012 dataset for object detection tasks. A label map with 20 object categories is used. (model examples: yolo-v2-ava-0001, yolo-v2-tiny-ava-0001)
VOC2012_Segmentation is used to evaluate models on the VOC2012 dataset for segmentation tasks. A background label plus a label map with 20 object categories is used. (model examples: deeplabv3)
