This repository contains the code for "Deep Fashion Analysis with Feature Map Upsampling and Landmark-Driven Attention", presented at the First Workshop on Computer Vision for Fashion, Art and Design (Fashion) of ECCV 2018.
Requirements: Python 3 and PyTorch >= 0.4.0. You also need tensorboardX, which can be installed with:
pip install tensorboardX
1. Prepare the Dataset
Download the "Category and Attribute Prediction Benchmark" of the DeepFashion dataset from http://mmlab.ie.cuhk.edu.hk/projects/DeepFashion/AttributePrediction.html . Extract all the files to a folder and put all the images in a folder named "img".
For example, if you put the dataset in /home/user/datasets/benchmark1/, the folder structure will be:
benchmark1/
    Anno/
    Eval/
    img/
    README.txt
Then set the variable "base_path" in src/const.py accordingly:
# in src/const.py
base_path = "/home/user/datasets/benchmark1/"
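Before moving on, it can help to confirm that the folder matches the layout shown above. The following is a minimal sketch and not part of this repository: the helper name `missing_entries` is hypothetical, and the list of expected sub-folders is taken from the example tree.

```python
import os

# Sub-folders expected directly under base_path, taken from the example layout.
EXPECTED_ENTRIES = ("Anno", "Eval", "img")


def missing_entries(base_path, expected=EXPECTED_ENTRIES):
    """Return the expected sub-folders that are absent from base_path.

    Hypothetical helper for illustration; it is not part of src/.
    """
    return [name for name in expected
            if not os.path.isdir(os.path.join(base_path, name))]


if __name__ == "__main__":
    # Same example path as in src/const.py; adjust it to your setup.
    base_path = "/home/user/datasets/benchmark1/"
    missing = missing_entries(base_path)
    if missing:
        print("Missing under", base_path + ":", ", ".join(missing))
    else:
        print("Dataset layout looks complete.")
```

An empty list means all three sub-folders were found.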
2. Create info.csv
python -m src.create_info
Make sure you have set "base_path" in src/const.py before running this command; otherwise you may encounter a FileNotFoundError. After the script finishes, you will find a file named "info.csv" under your "base_path".
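To check that the file was generated, you can look at its header and row count. This is a minimal sketch using only the standard library. The helper `summarize_csv` is hypothetical, it assumes the first line of the file is a header row, and it makes no assumption about which columns `src.create_info` actually writes.

```python
import csv
import os


def summarize_csv(path):
    """Return (column_names, number_of_data_rows) for a CSV file.

    Assumes the first row is a header; an empty file yields ([], 0).
    """
    with open(path, newline="") as f:
        reader = csv.reader(f)
        header = next(reader, [])
        n_rows = sum(1 for _ in reader)
    return header, n_rows


if __name__ == "__main__":
    # Same example path as in src/const.py; adjust it to your setup.
    base_path = "/home/user/datasets/benchmark1/"
    info_path = os.path.join(base_path, "info.csv")
    if os.path.isfile(info_path):
        columns, n_rows = summarize_csv(info_path)
        print(len(columns), "columns,", n_rows, "rows")
    else:
        print("info.csv not found; check base_path in src/const.py")
```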
3. Train the model
To train only the landmark branch, run:
python -m src.train --conf src.conf.lm
To train the landmark branch and the category/attribute prediction network jointly, run:
python -m src.train --conf src.conf.whole
You can monitor all the training losses and evaluation metrics via tensorboard. Please run:
tensorboard --logdir runs/
Then visit localhost:6006 for detailed information.
The following table shows the landmark localization results on the DeepFashion dataset. The numbers are normalized distances between the predictions and the ground truth (lower is better). Best results are marked in bold.
Methods | L.Collar | R.Collar | L.Sleeve | R.Sleeve | L.Waistline | R.Waistline | L.Hem | R.Hem | Avg. |
---|---|---|---|---|---|---|---|---|---|
FashionNet | 0.0854 | 0.0902 | 0.0973 | 0.0935 | 0.0854 | 0.0845 | 0.0812 | 0.0823 | 0.0872 |
DFA | 0.0628 | 0.0637 | 0.0658 | 0.0621 | 0.0726 | 0.0702 | 0.0658 | 0.0663 | 0.0660 |
DLAN | 0.0570 | 0.0611 | 0.0672 | 0.0647 | 0.0703 | 0.0694 | 0.0624 | 0.0627 | 0.0643 |
Wang et al. | 0.0415 | 0.0404 | 0.0496 | **0.0449** | 0.0502 | 0.0523 | **0.0537** | **0.0551** | 0.0484 |
Ours | **0.0332** | **0.0346** | **0.0487** | 0.0519 | **0.0422** | **0.0429** | 0.0620 | 0.0639 | **0.0474** |
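For readers who want to see how such a figure can be computed, here is a minimal sketch of a normalized landmark error. It assumes that x and y offsets are divided by the image width and height before taking the Euclidean distance, and that landmarks flagged as not visible are skipped. The function name and this exact normalization are assumptions for illustration and may differ from the evaluation code in this repository.

```python
import math


def normalized_error(pred, gt, width, height, visible=None):
    """Mean normalized distance between predicted and ground-truth landmarks.

    pred, gt: sequences of (x, y) pixel coordinates, one pair per landmark.
    width, height: image size in pixels, used to scale the offsets (assumption).
    visible: optional sequence of booleans; landmarks marked False are skipped.
    Returns None when no landmark is visible.
    """
    if visible is None:
        visible = [True] * len(gt)
    dists = []
    for (px, py), (gx, gy), vis in zip(pred, gt, visible):
        if not vis:
            continue
        dx = (px - gx) / width
        dy = (py - gy) / height
        dists.append(math.hypot(dx, dy))
    return sum(dists) / len(dists) if dists else None


if __name__ == "__main__":
    # One landmark, off by 3 px horizontally and 4 px vertically in a 100x100 image.
    print(normalized_error([(10, 10)], [(13, 14)], width=100, height=100))
```

In the usage line, the offsets scale to 0.03 and 0.04, so the printed error is approximately 0.05.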
The following table shows the category classification and attribute prediction results on the DeepFashion dataset. The two numbers in each cell are the top-3 and top-5 accuracy, respectively. Best results are marked in bold.
Methods | Category | Texture | Fabric | Shape | Part | Style | All |
---|---|---|---|---|---|---|---|
WTBI | 43.73 / 66.25 | 24.21 / 32.65 | 25.38 / 36.06 | 23.39 / 31.26 | 26.31 / 33.24 | 49.85 / 58.68 | 27.46 / 35.37 |
DARN | 59.48 / 79.58 | 36.15 / 48.15 | 36.64 / 48.52 | 35.89 / 46.93 | 39.17 / 50.14 | 66.11 / 71.36 | 42.35 / 51.95 |
FashionNet | 82.58 / 90.17 | 37.46 / 49.52 | 39.30 / 49.84 | 39.47 / 48.59 | 44.13 / 54.02 | 66.43 / 73.16 | 45.52 / 54.61 |
Lu et al. | 86.72 / 92.51 | - | - | - | - | - | - |
Corbiere et al. | 86.30 / 92.80 | 53.60 / 63.20 | 39.10 / 48.80 | 50.10 / 59.50 | 38.80 / 48.90 | 30.50 / 38.30 | 23.10 / 30.40 |
Wang et al. | 90.99 / 95.78 | 50.31 / 65.48 | 40.31 / 48.23 | 53.32 / 61.05 | 40.65 / 56.32 | 68.70 / **74.25** | 51.53 / 60.95 |
Ours | **91.16** / **96.12** | **56.17** / **65.83** | **43.20** / **53.52** | **58.28** / **67.80** | **46.97** / **57.42** | **68.82** / 74.13 | **54.69** / **63.74** |
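As a reminder of what a top-k figure means for single-label category classification, the sketch below counts a sample as correct when its true class is among the k highest-scoring classes. The function name is hypothetical and the code is not taken from this repository. Attribute prediction is multi-label and may be scored differently, so treat this only as an illustration of the category column.

```python
def top_k_accuracy(scores, labels, k):
    """Percentage of samples whose true label is among the k top-scoring classes.

    scores: list of per-sample score lists, one score per class.
    labels: list of true class indices, one per sample.
    """
    if not labels:
        raise ValueError("labels must not be empty")
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices sorted from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return 100.0 * hits / len(labels)


if __name__ == "__main__":
    scores = [[0.1, 0.7, 0.2],
              [0.5, 0.3, 0.2],
              [0.2, 0.3, 0.5]]
    labels = [1, 1, 0]
    for k in (1, 2, 3):
        print("top-%d accuracy: %.2f" % (k, top_k_accuracy(scores, labels, k)))
```

With these toy scores, one of three samples is correct at k=1, two at k=2, and all three at k=3.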
If this paper helps your work, please cite it as:
@inproceedings{liu2018deep,
  title={Deep Fashion Analysis with Feature Map Upsampling and Landmark-Driven Attention},
  author={Liu, Jingyuan and Lu, Hong},
  booktitle={European Conference on Computer Vision},
  pages={30--36},
  year={2018},
  organization={Springer}
}