Why doesn't the loss decrease during training? #11
Comments
I have run 50,000 iterations, and the loss stays between 1.2 and 1.4 the whole time.
Are you using the default hyperparameters for the model size? If so, those produce a very small model, and I do not expect it to do well. I am planning to experiment with a bigger model size to make sure the model can be trained.
Thank you for your reply. I will try the other hyperparameters.
@EdoardoBotta @xy-lin77 @taoumb I also encountered this issue. Have you tried anything new or found anything since?
I have updated the model size and dataset in a3812ff, making the model size the same as the original one from the paper and adding a textual embedding of the movie title as an additional feature. The loss shown in the progress bar is now a moving average, and it decreases steadily as the training loop runs.
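For anyone wondering why a moving average makes the trend easier to see: per-batch loss is noisy, so averaging the last few values smooths the number shown in the progress bar. Below is a minimal sketch of the idea, not the code from a3812ff. The class name `MovingAverage` and the `window` parameter are hypothetical, and the repository may use a different scheme (for example an exponential average).

```python
from collections import deque


class MovingAverage:
    """Simple moving average over the last `window` values (hypothetical helper)."""

    def __init__(self, window: int = 100) -> None:
        if window <= 0:
            raise ValueError("window must be positive")
        # deque with maxlen drops the oldest value automatically when full
        self.values = deque(maxlen=window)

    def update(self, value: float) -> float:
        """Record a new per-batch loss and return the current average."""
        self.values.append(float(value))
        return sum(self.values) / len(self.values)


if __name__ == "__main__":
    # Illustrative numbers only, not real training output.
    smoothed = MovingAverage(window=3)
    for step, loss in enumerate([1.4, 1.2, 1.3, 1.1, 1.0]):
        print(step, round(smoothed.update(loss), 4))
```

In a training loop, the value returned by `update` would be what gets displayed instead of the raw batch loss.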
Can this code be trained correctly?