
About the details of learning rate #7

Open
hongxin001 opened this issue Nov 2, 2020 · 1 comment

@hongxin001

There is a sentence in the appendix: "With batch normalization, we effectively cancel the learning rate of Meta-Weight-Net, and it works well with a fixed learning rate. "

I'm not sure what this means. Could you explain it in more detail? Does it mean we don't need to tune the learning rate of the meta network because of BN?
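To make the question concrete, here is my reading of the quoted sentence as a toy example (my own sketch, not code from this repository): batch normalization makes a layer's output invariant to the scale of the weights feeding into it, which is the usual argument for why BN loosens the dependence on the nominal learning rate.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy layer: Linear followed by BatchNorm1d (illustrative only).
lin = nn.Linear(10, 5, bias=False)
bn = nn.BatchNorm1d(5, affine=False)  # affine off to isolate the normalization itself
bn.train()

x = torch.randn(32, 10)
out1 = bn(lin(x))

# Scale the linear weights by a positive constant: the batch mean and std scale
# by the same factor, so the normalized output is essentially unchanged.
with torch.no_grad():
    lin.weight.mul_(10.0)
out2 = bn(lin(x))

print(torch.allclose(out1, out2, atol=1e-4))  # True (up to BN's eps)
```

Is this the effect the appendix refers to when it says the learning rate of Meta-Weight-Net is "cancelled"?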

@shanshuo

@xjtushujun Thanks for sharing the code of your nice work. I have the same questions as @hongxin001 about this part:

  1. Why can batch normalization cancel the learning rate of MW-Net?
  2. In the old version, you normalize the weights by their sum:
    w_v = w_new / norm_v
    while in the stable version of MW-Net you don't:
    l_f_meta = torch.sum(cost_v * v_lambda) / len(cost_v)
    Why is the new version more stable, and how does it avoid the output weights becoming all zero? (The two schemes are sketched side by side below.)
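For reference, here is a minimal toy comparison of the two weighting schemes as I understand them from the snippets above (tensor names `cost_v`, `v_lambda`, `w_v`, `norm_v` follow the quoted code; the values and shapes are my own assumptions):

```python
import torch

torch.manual_seed(0)

# Toy tensors, purely illustrative.
cost_v = torch.rand(8, 1)      # per-sample training losses
v_lambda = torch.rand(8, 1)    # per-sample weights from the meta network (sigmoid outputs)

# Old scheme (as quoted): normalize the weights so they sum to 1 before weighting.
# If the meta net drives every weight toward 0, norm_v -> 0 and the division
# becomes unstable, which is what I suspect the "all zeros" question is about.
norm_v = torch.sum(v_lambda)
w_v = v_lambda / norm_v if norm_v > 0 else v_lambda
l_old = torch.sum(cost_v * w_v)

# Stable scheme (as quoted): a plain mean of the weighted losses, with no division
# by the weight sum, so there is no denominator that can collapse to 0.
l_new = torch.sum(cost_v * v_lambda) / len(cost_v)

print(l_old.item(), l_new.item())
```

Is avoiding that division by the (possibly near-zero) weight sum the reason the new version is called the stable one?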
