Segmentation fault:11 #336
The proposal operator does not handle invalid input well, which
leads to a segmentation fault when the input contains NaN. This means your
Cascade R-CNN heads or the RPN head have blown up. You can try lowering the
learning rate for your task.
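One way to confirm this diagnosis is to check the logged metrics for non-finite values before the crash. The snippet below is only a minimal sketch in plain Python, not part of this project's code: the helper name `non_finite_entries` and the diverged values are hypothetical, and the healthy values are copied from the log in this thread.

```python
import math


def non_finite_entries(metrics):
    """Return the names of metrics whose values are NaN or infinite."""
    return [name for name, value in metrics.items() if not math.isfinite(value)]


# Values taken from the last logged batch in this thread (all finite).
healthy = {"RpnL1": 0.165552, "RcnnL1_1st": 0.603507, "RcnnL1_3rd": 1.856836}

# Hypothetical values illustrating a head that has diverged.
diverged = {"RpnL1": float("nan"), "RcnnL1_1st": 0.6, "RcnnL1_3rd": float("inf")}

print(non_finite_entries(healthy))
print(non_finite_entries(diverged))
```

If a check like this flags a loss, stopping training at that point gives a readable error instead of a segmentation fault inside the operator.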
…On Wed, Jun 17, 2020 at 10:21 AM dongzhenguo2016 ***@***.***> wrote:
`06-16 23:55:24 Epoch[0] Batch [3590] Iter: 3590/26046 Lr: 0.00500 Speed:
9.42 samples/sec Train-RpnAcc=0.997272, RpnL1=0.165742,
RcnnAcc_1st=0.985713, RcnnL1_1st=0.604444, RcnnAcc_2nd=0.986624,
RcnnL1_2nd=1.236113, RcnnAcc_3rd=0.984117, RcnnL1_3rd=1.859310,
06-16 23:55:28 Epoch[0] Batch [3600] Iter: 3600/26046 Lr: 0.00500 Speed:
9.50 samples/sec Train-RpnAcc=0.997278, RpnL1=0.165552,
RcnnAcc_1st=0.985734, RcnnL1_1st=0.603507, RcnnAcc_2nd=0.986646,
RcnnL1_2nd=1.234198, RcnnAcc_3rd=0.984152, RcnnL1_3rd=1.856836,
Segmentation fault: 11`
I recently encountered the same error while training
cascade_r101v1_fpn_1x. How can I solve it? It seems very strange.
My platform is Ubuntu 16.04
mxnet-cu100 1.6.0
Yes, reducing the learning rate does solve this problem. But after lowering it from 0.01 to 0.001, I found that mAP dropped by 1 point, which is not the result I want. I suspect that the local optimum reached with the lower learning rate is not as good as the one reached with the higher learning rate.