Thanks for your great work. I have two questions about how the critic dataset is constructed, as described in the paper. Specifically, I would like to understand the rule-based criteria used during the Negative Operations Sampling stage.
Is the evaluation based on a direct match with the ground-truth operations?
For actions formatted as "click + coordinate," how is the correctness judged? Since any coordinate that falls within the correct bounding box is technically a valid action, how is this handled in your evaluation criteria?
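To make the second question concrete, here is a minimal sketch of the two criteria I am contrasting. All names (`exact_match`, `in_bbox`, the sample coordinates) are hypothetical and not taken from the paper or the repository; it also assumes the ground truth provides a bounding box as `(x_min, y_min, x_max, y_max)`.

```python
from typing import Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # assumed layout: (x_min, y_min, x_max, y_max)


def exact_match(pred: Point, gt: Point) -> bool:
    """Strict criterion: the predicted click must equal the ground-truth click."""
    return pred == gt


def in_bbox(pred: Point, box: Box) -> bool:
    """Lenient criterion: any click inside the target element's box is correct."""
    x, y = pred
    x_min, y_min, x_max, y_max = box
    return x_min <= x <= x_max and y_min <= y <= y_max


# Hypothetical example: the click lands on the right element,
# but not on the exact ground-truth pixel.
gt_click = (120.0, 45.0)
gt_box = (100.0, 30.0, 140.0, 60.0)
pred_click = (130.0, 50.0)

print(exact_match(pred_click, gt_click))  # strict criterion
print(in_bbox(pred_click, gt_box))        # lenient criterion
```

Under the strict criterion this click would be sampled as a negative operation, while under the bounding-box criterion it would count as correct, which is why I am asking which one the rules use.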