In this segmentation example we will import the ADE20K Outdoors dataset as a Cassandra dataset and then read the data into NVIDIA DALI.
As a first step, download the raw files from the dataset's Kaggle page:
or, if you have the Kaggle API installed, you can simply run:
$ kaggle datasets download -d residentmario/ade20k-outdoors
In the following we will assume the original images are stored in the
/data/ade20k/
directory.
We begin by starting the Cassandra server shipped with the provided Docker container:
# Start Cassandra server
$ /cassandra/bin/cassandra
Note that the shell prompt returns immediately. Wait until "state jump to NORMAL"
appears in the log output (about one minute).
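Instead of watching the log by hand, you can poll the cluster status with nodetool until the local node reports Up/Normal ("UN"). This is a minimal sketch, assuming nodetool ships alongside the server binary at /cassandra/bin, as in the container above:

```shell
# Wait (up to ~2 minutes) for the local Cassandra node to report Up/Normal.
# Takes an optional path to nodetool; defaults to the container's location.
wait_for_cassandra() {
    for _ in $(seq 1 24); do
        # "UN" at the start of a status line means Up/Normal
        "${1:-/cassandra/bin/nodetool}" status 2>/dev/null | grep -q '^UN' && return 0
        sleep 5
    done
    return 1
}

# Usage:
# wait_for_cassandra && echo "Cassandra is ready"
```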
The following commands insert the original dataset into Cassandra and use the plugin to read the images into NVIDIA DALI.
# - Create the tables in the Cassandra DB
$ cd examples/ade20k/
$ /cassandra/bin/cqlsh -f create_tables.cql
# - Fill the tables with data and metadata
$ python3 extract_serial.py /data/ade20k/images/training/ /data/ade20k/annotations/training/ --data-table=ade20k.data --metadata-table=ade20k.metadata
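The extraction step pairs each training image with its annotation mask; in ADE20K the two share the same file stem (e.g. an image ADE_train_00000001.jpg and a mask ADE_train_00000001.png). A minimal, hypothetical sketch of that pairing logic (not the actual code of extract_serial.py):

```python
from pathlib import Path

def pair_images_with_masks(image_root, mask_root):
    """Pair each .jpg image with the .png mask sharing its file stem."""
    masks = {p.stem: p for p in Path(mask_root).glob("*.png")}
    return [(img, masks[img.stem])
            for img in sorted(Path(image_root).glob("*.jpg"))
            if img.stem in masks]
```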
# - Read the list of UUIDs and cache it to disk
$ python3 cache_uuids.py --metadata-table=ade20k.metadata --rows-fn=ade20k.rows
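Caching the row UUIDs lets later runs skip the metadata scan. The on-disk format used by cache_uuids.py is plugin-specific; the sketch below only illustrates the idea with a generic pickle round-trip (hypothetical helpers, not the script's actual code):

```python
import pickle

def save_rows(rows, path):
    # Persist the list of row UUIDs so later runs can skip the metadata scan
    with open(path, "wb") as f:
        pickle.dump(rows, f)

def load_rows(path):
    # Restore the cached list of row UUIDs
    with open(path, "rb") as f:
        return pickle.load(f)
```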
# - Tight loop data loading test in host memory
$ python3 loop_read.py --data-table=ade20k.data --rows-fn=ade20k.rows
# - Tight loop data loading test in GPU memory (GPU:0)
$ python3 loop_read.py --data-table=ade20k.data --rows-fn=ade20k.rows --use-gpu
# - Sharded, tight loop data loading test, using 2 processes via torchrun
$ torchrun --nproc_per_node=2 loop_read.py --data-table=ade20k.data --rows-fn=ade20k.rows
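When launched via torchrun, each process reads a disjoint shard of the cached rows. One common scheme is a strided split; a minimal sketch of that idea (a hypothetical helper, not the plugin's actual sharding code):

```python
def shard_rows(rows, shard_id, num_shards):
    # Strided split: worker k takes elements k, k + num_shards, k + 2*num_shards, ...
    # Every row lands in exactly one shard, so the workers cover the dataset
    # without overlap.
    return rows[shard_id::num_shards]
```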
The same scripts can be used to read the dataset from the filesystem, using the standard DALI file reader.
# - Tight loop data loading test in host memory
$ python3 loop_read.py --reader=file --image-root=/data/ade20k/images/ --mask-root=/data/ade20k/annotations/
# - Tight loop data loading test in GPU memory (GPU:0)
$ python3 loop_read.py --reader=file --image-root=/data/ade20k/images/ --mask-root=/data/ade20k/annotations/ --use-gpu
# - Sharded, tight loop data loading test, using 2 processes via torchrun
$ torchrun --nproc_per_node=2 loop_read.py --reader=file --image-root=/data/ade20k/images/ --mask-root=/data/ade20k/annotations/