MXNet implementation for: "Drop an Octave: Reducing Spatial Redundancy in Convolutional Neural Networks with Octave Convolution".
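Octave Convolution keeps a fraction `alpha` of the channels at half spatial resolution and exchanges information between the two resolutions inside every convolution. Below is a minimal Gluon sketch of the idea; the class name, layer layout, and parameters are illustrative assumptions, not the repo's actual API:

```python
import mxnet as mx
from mxnet.gluon import nn

class OctConv(nn.HybridBlock):
    """Octave convolution over a (high, low) pair of NCHW feature maps."""
    def __init__(self, channels, kernel_size, alpha=0.25, **kwargs):
        super(OctConv, self).__init__(**kwargs)
        ch_low = int(alpha * channels)   # low-frequency (half-resolution) channels
        ch_high = channels - ch_low      # high-frequency (full-resolution) channels
        pad = kernel_size // 2
        with self.name_scope():
            self.conv_hh = nn.Conv2D(ch_high, kernel_size, padding=pad)  # high -> high
            self.conv_hl = nn.Conv2D(ch_low, kernel_size, padding=pad)   # high -> low
            self.conv_lh = nn.Conv2D(ch_high, kernel_size, padding=pad)  # low -> high
            self.conv_ll = nn.Conv2D(ch_low, kernel_size, padding=pad)   # low -> low
            self.pool = nn.AvgPool2D(pool_size=2, strides=2)

    def hybrid_forward(self, F, x_high, x_low):
        # high-frequency output: same-resolution path + upsampled low-frequency path
        y_high = self.conv_hh(x_high) + F.UpSampling(
            self.conv_lh(x_low), scale=2, sample_type='nearest', num_args=1)
        # low-frequency output: pooled high-frequency path + same-resolution path
        y_low = self.conv_hl(self.pool(x_high)) + self.conv_ll(x_low)
        return y_high, y_low

# usage sketch: for alpha = 0.25 and 64 output channels, a (N, 48, 32, 32)
# high band pairs with a (N, 16, 16, 16) low band
# oct = OctConv(64, 3, alpha=0.25); oct.initialize()
# y_high, y_low = oct(x_high, x_low)
```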
Note:
- This repo is under development.
- Code for the ablation study (Symbol API)
- Trained models (ablation study)
- Code for the remaining experiments (Gluon API)
- Trained models (remaining experiments)
Training settings for the ablation study:
- Loss: Softmax
- Learning rate: Cosine (warm-up: 5 epochs, lr: 0.4; see the sketch after this list)
- MXNet API: Symbol API
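A minimal sketch of that schedule (linear warm-up for 5 epochs, then cosine decay to 0); the 120-epoch total is an assumed value, not read from the repo:

```python
import math

def learning_rate(epoch, total_epochs=120, warmup_epochs=5, base_lr=0.4):
    """Per-epoch learning rate: linear warm-up followed by cosine decay."""
    if epoch < warmup_epochs:
        # linear warm-up from base_lr / warmup_epochs up to base_lr
        return base_lr * (epoch + 1) / float(warmup_epochs)
    # cosine decay from base_lr down to 0 over the remaining epochs
    progress = (epoch - warmup_epochs) / float(total_epochs - warmup_epochs)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))
```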
Model | baseline | alpha = 0.125 | alpha = 0.25 | alpha = 0.5 | alpha = 0.75 |
---|---|---|---|---|---|
DenseNet-121 | 75.4 / 92.7 | 76.1 / 93.0 | 75.9 / 93.1 | -- | -- |
ResNet-26 | 73.2 / 91.3 | 75.8 / 92.6 | 76.1 / 92.6 | 75.5 / 92.5 | 74.6 / 92.1 |
ResNet-50 | 77.0 / 93.4 | 78.2 / 93.9 | 78.0 / 93.8 | 77.4 / 93.6 | 76.7 / 93.0 |
SE-ResNet-50 | 77.6 / 93.6 | 78.7 / 94.1 | 78.4 / 94.0 | 77.9 / 93.8 | 77.4 / 93.5 |
ResNeXt-50 | 78.4 / 94.0 | -- | 78.8 / 94.2 | 78.4 / 94.0 | 77.5 / 93.6 |
ResNet-101 | 78.5 / 94.1 | 79.2 / 94.4 | 79.2 / 94.4 | 78.7 / 94.1 | -- |
ResNeXt-101 | 79.4 / 94.6 | -- | 79.6 / 94.5 | 78.9 / 94.4 | -- |
ResNet-200 | 79.6 / 94.7 | 80.0 / 94.9 | 79.8 / 94.8 | 79.5 / 94.7 | -- |
Note:
- Top-1 / Top-5 single-center-crop accuracy is shown in the table (see the testing script; a sketch of the crop follows this note).
- All residual networks in the ablation study adopt the pre-activation version[1] for convenience.
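For reference, a sketch of a standard single-center-crop evaluation pipeline in Gluon; the 256/224 sizes and the normalization constants are the usual ImageNet conventions, assumed here rather than read from the testing script:

```python
from mxnet.gluon.data.vision import transforms

# resize the short side, then take a single center crop
eval_transform = transforms.Compose([
    transforms.Resize(256, keep_ratio=True),   # short side -> 256
    transforms.CenterCrop(224),                # single 224x224 center crop
    transforms.ToTensor(),                     # HWC uint8 -> CHW float in [0, 1]
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])
```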
Settings for the main results below:
- Learning rate: Cosine (warm-up: 5 epochs, lr: 0.4)
- MXNet API: Gluon API
Model | alpha | label smoothing[2] | mixup[3] | #Params | #FLOPs | Top-1 |
---|---|---|---|---|---|---|
0.75 MobileNet (v1) | .375 | | | 2.6 M | 213 M | 70.6 |
1.0 MobileNet (v1) | .5 | | | 4.2 M | 321 M | 72.4 |
1.0 MobileNet (v2) | .375 | Yes | | 3.5 M | 256 M | 72.0 |
1.125 MobileNet (v2) | .5 | Yes | | 4.2 M | 295 M | 73.0 |
Oct-ResNet-152 | .125 | Yes | Yes | 60.2 M | 10.9 G | 81.4 |
Oct-ResNet-152 + SE | .125 | Yes | Yes | 66.8 M | 10.9 G | 81.6 |
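A minimal sketch of the two regularizers marked in the table, assuming one-hot labels; the `eps` and `beta` values are common defaults from the cited papers, not necessarily the settings used in this repo:

```python
import numpy as np
import mxnet as mx

def smooth_labels(one_hot, eps=0.1):
    # label smoothing [2]: soften the one-hot target toward the uniform distribution
    num_classes = one_hot.shape[-1]
    return one_hot * (1.0 - eps) + eps / num_classes

def mixup_batch(data, one_hot, beta=0.2):
    # mixup [3]: blend each sample (and its label) with a shuffled partner
    lam = float(np.random.beta(beta, beta))
    index = mx.nd.array(np.random.permutation(data.shape[0]))
    mixed_data = lam * data + (1.0 - lam) * data.take(index)
    mixed_labels = lam * one_hot + (1.0 - lam) * one_hot.take(index)
    return mixed_data, mixed_labels
```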
Citation:
```
@article{chen2019drop,
  title={Drop an Octave: Reducing Spatial Redundancy in Convolutional Neural Networks with Octave Convolution},
  author={Chen, Yunpeng and Fan, Haoqi and Xu, Bing and Yan, Zhicheng and Kalantidis, Yannis and Rohrbach, Marcus and Yan, Shuicheng and Feng, Jiashi},
  journal={arXiv preprint arXiv:1904.05049},
  year={2019}
}
```
Third-party implementations:
- PyTorch implementation with ImageNet training log and pre-trained model by d-li14
- MXNet implementation with ImageNet training log by terrychenism
- Keras implementation with CIFAR-10 results by koshian2
Acknowledgements:
- Thanks to MXNet, Gluon-CV and TVM!
- Thanks to @Ldpe2G for sharing the code for calculating the #FLOPs (link).
- Thanks to Min Lin (Mila), Xin Zhao (Qihoo Inc.) and Tao Wang (NUS) for helpful discussions on the code development.
[1] He, K., et al. "Identity Mappings in Deep Residual Networks."
[2] Szegedy, C., et al. "Rethinking the Inception Architecture for Computer Vision."
[3] Zhang, H., et al. "mixup: Beyond Empirical Risk Minimization."
The code and the models are MIT licensed, as found in the LICENSE file.