Loss not decreasing #35
Comments
You could try lowering the learning rate a bit. I haven't used this code in a long time.
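@wushilian I've already lowered the learning rate to 1e-5. Should I lower it further? The loss just won't go down: it kept rising at first, stopped rising after a few epochs, but after training all night it is still around 1.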
@yyfanxing As I recall, I trained on the syn90k data a long time ago with a learning rate of 1e-4 and the Adam optimizer, and it converged.
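For reference, here is a minimal pure-Python sketch of how the learning rate affects an Adam run. It is not this repository's training code: `adam_step`, `train_toy`, and the toy loss `(w - 0.5)^2` are hypothetical names and choices made up for illustration, and the beta and epsilon values are the commonly used Adam defaults, assumed here rather than taken from the thread. It only shows that, with the same number of steps, 1e-5 makes less progress than 1e-4, so lowering the rate further is not guaranteed to help.

```python
import math


def adam_step(param, grad, state, lr=1e-4, beta1=0.9, beta2=0.999, eps=1e-8):
    """Apply one Adam update to a scalar parameter and return the new value."""
    state["t"] += 1
    # Exponential moving averages of the gradient and the squared gradient.
    state["m"] = beta1 * state["m"] + (1 - beta1) * grad
    state["v"] = beta2 * state["v"] + (1 - beta2) * grad * grad
    # Bias correction for the zero-initialised averages.
    m_hat = state["m"] / (1 - beta1 ** state["t"])
    v_hat = state["v"] / (1 - beta2 ** state["t"])
    return param - lr * m_hat / (math.sqrt(v_hat) + eps)


def train_toy(lr, steps=2000, start=1.0):
    """Minimise the toy loss (w - 0.5)^2 and return (first_loss, last_loss)."""
    w = start
    state = {"t": 0, "m": 0.0, "v": 0.0}
    losses = []
    for _ in range(steps):
        losses.append((w - 0.5) ** 2)
        grad = 2.0 * (w - 0.5)  # derivative of the toy loss
        w = adam_step(w, grad, state, lr=lr)
    return losses[0], losses[-1]


if __name__ == "__main__":
    for lr in (1e-4, 1e-5):
        first, last = train_toy(lr)
        print(f"lr={lr:g}: first loss {first:.4f}, last loss {last:.4f}")
```

In a real run the same comparison can be made by logging the loss for a fixed number of steps at each learning rate before committing to an overnight job.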
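I'm now training on 500,000 samples and it still struggles to converge. Is attention harder to converge on a large dataset? I get the feeling it wouldn't converge even after several days of training.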
@yyfanxing Did you ever solve this problem? I've recently been training with this model and also found that it doesn't converge.