fix accuracy test errors #5348
This pull request was exported from Phabricator. Differential Revision: D61301698
@ppwwyyxx do you know if there were recent changes that led to a difference in accuracy metrics?
When did this start to happen? It could also be a change of CUDA version / precision. For example, are these now running on Ampere cards with TF32 enabled?
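To illustrate why TF32 can shift accuracy numbers: TF32 keeps float32's 8-bit exponent but only 10 of its 23 mantissa bits. Below is a minimal sketch that emulates this in pure Python. The helper names `to_tf32` and `dot` are hypothetical, and the emulation truncates the low bits, which is a simplification (real hardware may round differently and still accumulates in float32).

```python
import struct


def to_tf32(x: float) -> float:
    """Emulate TF32 input precision (assumption: plain truncation).

    Packs x as float32, clears the low 13 mantissa bits so that only
    10 mantissa bits remain, and unpacks the result.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    bits &= 0xFFFFE000  # keep sign, 8 exponent bits, top 10 mantissa bits
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def to_fp32(x: float) -> float:
    """Round-trip x through float32 for comparison."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def dot(a, b, cast):
    """Dot product whose inputs are first passed through `cast`."""
    return sum(cast(x) * cast(y) for x, y in zip(a, b))


if __name__ == "__main__":
    a = [1.0 + 2**-11, 0.1, 3.14159]
    b = [1.0, 0.7, 2.71828]
    print("fp32 inputs:", dot(a, b, to_fp32))
    print("tf32 inputs:", dot(a, b, to_tf32))
```

For instance, `to_tf32(1.0 + 2**-11)` returns `1.0`, because `2**-11` falls below the 10-bit mantissa resolution. Small per-element losses like this can accumulate across the matmuls and convolutions of a detection model and move the final metric slightly.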
@ppwwyyxx Great call! The last successful run (in June) was on Volta, and the first failed run was on Ampere. Setting |
Summary: Pull Request resolved: facebookresearch#5348

Some accuracy tests started to fail in between Jun 11 and Jun 17:
- ❌ mask_rcnn_R_50_FPN_inference_acc_test
- ✅ keypoint_rcnn_R_50_FPN_inference_acc_test
- ✅ fast_rcnn_R_50_FPN_inference_acc_test
- ❌ panoptic_fpn_R_50_inference_acc_test
- ✅ retinanet_R_50_FPN_inference_acc_test
- ❌ rpn_R_50_FPN_inference_acc_test
- ✅ semantic_R_50_FPN_inference_acc_test
- ❌ cascade_mask_rcnn_R_50_FPN_inference_acc_test

V1: update the yaml to reflect the new scores.
V5: it turns out that we can match the old scores by disabling tf32.

Differential Revision: D61301698
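For readers who want to try disabling TF32 themselves, here is a minimal sketch using PyTorch's documented backend flags. It is not necessarily the change made in this PR; the helper name `disable_tf32` is hypothetical, and the import is guarded so the snippet also runs where PyTorch is not installed.

```python
def disable_tf32() -> bool:
    """Turn off TF32 for matmuls and cuDNN convolutions in PyTorch.

    Returns True if the flags were set, False if PyTorch is unavailable.
    Hypothetical helper for illustration, not code from this PR.
    """
    try:
        import torch
    except ImportError:
        return False
    # Both flags are module-level settings; they can be set on CPU-only
    # builds too, but only affect Ampere-or-newer GPUs.
    torch.backends.cuda.matmul.allow_tf32 = False
    torch.backends.cudnn.allow_tf32 = False
    return True


if __name__ == "__main__":
    print("TF32 disabled:", disable_tf32())
```

Calling this once before running inference should make float32 ops use full precision, at some cost in speed on Ampere-class GPUs.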
1801c3e to 90a5c0b
90a5c0b to c04028f
c04028f to 28425c8
This pull request has been merged in 5b72c27.
Summary:
Some accuracy tests started to fail in between Jun 11 and Jun 17:
V1: update the yaml to reflect the new scores.
Differential Revision: D61301698