Details of BinaryDenseNet or BinaryResNetE18 #8
Hi, I agree the framework could definitely benefit from more tutorials and documentation. However, it would be best if these hit the "sweet spot" and cover what you (or anyone else who stumbles upon this) are interested in. In the title you name the details of the networks, but in your comment you mention training in general. Which resources have you used or found already? Did you manage to build the framework on your machine? In the meantime, here are a few links that might help you get started, depending on what you are looking for:
Please let me know if this already helps, or whether you would like additional information.
Thanks very much for your patience and detailed answer. I have spent days reproducing your work in PyTorch, and I already get 56.3% using the hyperparameters from your wiki, which shows how solid your work is. I have also found your supplementary material on arXiv, including the detailed log of BinaryResNetE at 58.1%, and I will reproduce those results too. Thanks again for your detailed logs and supplementary material; they really help us a lot~
We also started trying out BinaryDenseNets and the first results seem promising. @LaVieEnRoseSMZ Which supplementary materials are you referring to? It doesn't seem like they are included in the arXiv version: https://arxiv.org/pdf/1906.08637.pdf
I am referring to the URL to the supplementary material given in the comments field of the arXiv paper.
Thanks. That's helpful 👍
I still have one more problem reading the code of the binary layer: I cannot find the definition and implementation of `det_sign`, which is used to quantize activations and weights. Could you please show me the URL for this part of the code? Thanks a lot in advance~
As described in Overview of changes, you can find the parts of the code for `det_sign` in combination with the gradient cancelling operator in BMXNet-v2/src/operator/contrib/gradient_cancel-inl.h, lines 97 to 112 at commit d0aaf81.
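Since you mentioned reproducing the networks in PyTorch: below is a minimal PyTorch sketch of what that combination computes, based on my reading of the linked operator. Treat the clipping threshold of 1.0 and all names as assumptions, not the repository's actual implementation (which is the C++ operator above).

```python
import torch

class DetSign(torch.autograd.Function):
    """Sketch: deterministic sign with gradient cancelling."""

    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        # Deterministic sign; map 0 to +1 so the output is strictly binary.
        return torch.where(x >= 0, torch.ones_like(x), -torch.ones_like(x))

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        # Straight-through estimator with gradient cancelling: zero the
        # gradient wherever the input fell outside [-1, 1] (assumed threshold).
        return grad_output * (x.abs() <= 1.0).to(grad_output.dtype)

det_sign = DetSign.apply
```

Used as `w_bin = det_sign(weight)`: the forward pass binarizes to {-1, +1}, while gradients flow back to the real-valued weights only where the inputs stayed inside the clipping range.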
Hi~ I am reproducing BinaryDenseNet from the paper. When I go through the code, I find three versions of DenseNet, called densenet, densenet_x, and densenet_y; which exact version was used in your final experiments? Also, I couldn't find DenseNet28 anywhere; would you mind showing me the code? Thanks a lot in advance~
We use "densenet.py" for all the experiments. Networks like BinaryDenseNet28/37 are created by using the DenseNet-specific configurations (initial feature number, reduction rate, growth rate) described in the supplementary material published with our paper: https://owncloud.hpi.de/s/1jrAUnqRAfg0TXH
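To make those knobs concrete, a hypothetical configuration table could look like the sketch below. The key names and all numbers are illustrative placeholders, not the repository's actual API; the real values for BinaryDenseNet28/37 are in the supplementary material linked above.

```python
# Hypothetical sketch; names and values are placeholders, not the repo's API.
binary_densenet_configs = {
    "binary_densenet28": {
        "init_features": 64,           # initial feature number (placeholder)
        "growth_rate": 64,             # channels added per dense layer (placeholder)
        "reduction": [2.7, 2.7, 2.2],  # reduction rate per transition block (placeholder)
    },
    "binary_densenet37": {
        "init_features": 64,           # placeholder
        "growth_rate": 64,             # placeholder
        "reduction": [3.3, 3.3, 4.0],  # placeholder
    },
}
```

densenet.py would consume values like these when constructing the dense blocks and transition layers; the sketch only shows how the three knobs fit together.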
Thanks for sharing all the details and the supplementary materials. We were able to reproduce your experiments (which unfortunately isn't the case with many other papers) 👍 @LaVieEnRoseSMZ If you are looking for a reimplementation of this paper using Larq (a Keras and TensorFlow based BNN library), you can also check out the pretrained models and training code at https://larq.dev/models/ and https://github.com/larq/zoo.
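As a quick illustration, loading one of those pretrained Larq Zoo models looks roughly like this. This is a sketch based on the larq.dev documentation; entry points may differ between larq-zoo versions, and the image path is a placeholder.

```python
import larq_zoo as lqz
import tensorflow as tf

# Load BinaryDenseNet28 with pretrained ImageNet weights.
model = lqz.literature.BinaryDenseNet28(weights="imagenet")

# Preprocess a placeholder image and run a prediction.
img = tf.keras.preprocessing.image.load_img("cat.jpg", target_size=(224, 224))
x = lqz.preprocess_input(tf.keras.preprocessing.image.img_to_array(img))
preds = model.predict(tf.expand_dims(x, axis=0))
print(lqz.decode_predictions(preds, top=5))
```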
Hi, recently I read your newly released paper "Back to Simplicity: How to Train Accurate BNNs from Scratch?" It is quite a good paper and inspires me a lot.
However, I am a little confused about the implementation in this paper. I am not familiar with the code structure of MXNet. Could you please write a more detailed readme, a tutorial, or anything similar that could explain the code and the training details?
Thanks a lot in advance~