
Heteroscedastic loss #3

Open
AndreMouton opened this issue May 20, 2020 · 1 comment

@AndreMouton

I'm having a similar issue to @SaumilShah66: I'm getting infs in adf.Softmax and, subsequently, nans in the heteroscedastic softmax loss function. In the paper you seem to suggest that you do not use a heteroscedastic loss, as it is intended for regression problems. Is there a reason why you use it in the training code for the classification problem?

@mattiasegu (Contributor) commented Jun 10, 2020

Hi @AndreMouton

As I replied to @SaumilShah66, it is a known problem that training with the heteroscedastic loss can be difficult because of numerical instability. As you noticed, we mentioned in the paper that it wasn't possible to train the heteroscedastic neural network from Kendall et al. because of numerical instability, amplified by the SoftMax layer. To address this when training the ADF network with the heteroscedastic loss (which we needed for the sake of completeness), we initialized the network weights from the best pretrained checkpoint on ResNet-18, with and without dropout. You can try it yourself; no modification to the code is needed, you only need to load one of the two available checkpoints before starting to train.
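As an aside, the infs reported above are characteristic of any softmax-like layer that exponentiates large logits directly. This is a generic illustration of the standard log-sum-exp stabilization, not the repository's actual adf.Softmax (which also propagates variances and may fail for other reasons):

```python
import math

def stable_softmax(logits):
    """Softmax stabilized with the log-sum-exp trick.

    A naive implementation computes exp(x) directly, which overflows
    for logits around 700+ (float64). Subtracting the max logit leaves
    the result mathematically unchanged but keeps every exp() argument
    <= 0, so no term can overflow to inf.
    """
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]
```

For example, `stable_softmax([1000.0, 1000.0, 0.0])` returns finite probabilities summing to 1, whereas a naive `exp`-then-normalize version would overflow on the first two entries.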
