object label? #17
Sorry for the confusion. The object detection results provide both object category information and bounding boxes. Here, we only use the bounding boxes for inferring the HOI category. The training phase is the same as in the previous setting. In fact, * means we use the same model as ATL, but do not use the object category information during inference. Feel free to contact me if you have further questions. Regards,
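To illustrate the reply above, here is a minimal sketch of the `*` inference setting: the detector outputs carry a box, a category, and a score, but only the box is consumed for HOI inference. The function name `prepare_inference_inputs` and the `(box, category, score)` triple format are assumptions for illustration, not the repository's actual API.

```python
def prepare_inference_inputs(detections):
    """Hypothetical sketch: detections is a list of (box, category, score)
    triples from an object detector. For the '*' setting in Table 3, the
    object category is discarded and only the boxes are kept, so HOI
    inference cannot rely on detected category information."""
    return [box for box, category, score in detections]
```

With this, a detection like `((0, 0, 10, 10), "apple", 0.9)` contributes only its box `(0, 0, 10, 10)` to inference.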
Thank you for your reply.
@zhihou7 Hi, I have another question about the code. The function get_new_Trainval_N in lib/ult/ult.py is defined such that it uses "Trainval_N[4]". Why use "Trainval_N[4]" and not "Trainval_N[k]"?
Thanks for your comment. It should be Trainval_N[k]. It is a bug inherited from the code of VCL; I forgot to update the code. After fixing this bug, the performance improves a bit. This bug also means seen classes are not added for the zero-shot setting, so it only affects the performance slightly. I have updated the code. Thanks.
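For clarity, here is a hedged sketch of what the corrected function plausibly looks like. The signature and the assumption that `Trainval_N` maps HOI class indices to per-class sample counts are guesses for illustration; only the `[4]` vs `[k]` fix comes from the discussion above.

```python
def get_new_Trainval_N(Trainval_N, is_zero_shot, unseen_idx):
    """Hypothetical sketch: zero out the sample counts of unseen HOI
    classes for the zero-shot setting. The bug discussed above was
    writing to Trainval_N[4] (a fixed index) instead of Trainval_N[k]
    (the loop variable), so only one class was ever affected."""
    if is_zero_shot:
        for k in unseen_idx:
            Trainval_N[k] = 0  # was: Trainval_N[4] = 0 (the bug)
    return Trainval_N
```

With the fix, every index listed in `unseen_idx` is zeroed, not just index 4.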
Thank you for your quick reply.
@zhihou7 In the following code, if an image contains two pairs <h1, v1, o1> and <h1, v2, o1>, and the first one is in the unseen composition list, then you delete both pairs from the training data. Why don't you delete only the first one? In my view, deleting only the first one is closer to your description in the paper.
Here, GT[1] is the HOI label list of a HOI sample, e.g., [eat apple, hold apple]. If "eat apple" is an unseen category, I think it is fair to remove this HOI sample entirely, rather than only removing the annotation [eat apple]. Otherwise, the sample of "eat apple" would still exist but be unlabeled, which I think differs from the zero-shot setting.
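The sample-level filtering described above can be sketched as follows. This is a minimal illustration, assuming each GT entry is a tuple whose second element (`gt[1]`) is the HOI label list; the function name and tuple layout are assumptions, not the repository's actual code.

```python
def filter_zero_shot(samples, unseen_hois):
    """Hypothetical sketch: drop a whole training sample if ANY of its
    HOI labels is an unseen composition, rather than deleting only the
    unseen label. This avoids keeping an unlabeled instance of the
    unseen interaction in the training data."""
    kept = []
    for gt in samples:
        if any(label in unseen_hois for label in gt[1]):
            continue  # remove the whole HOI sample
        kept.append(gt)
    return kept
```

So a sample labeled `["eat apple", "hold apple"]` is removed entirely when "eat apple" is unseen, even though "hold apple" is a seen category.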
I get it, thank you.
Hi, could you explain the * in Table 3 of ATL?
You described it as "* means we only use the boxes of the detection results", but how do you use the category of the detection results in the training phase and the inference phase?