If I read it correctly, the shape of the points input is `[B, N, 2]`, where B is the batch size and N is the number of points per image. The padding ensures that the point prompt also contains the 2D coordinates of two points, to make it compatible with the box prompt. Without a reshaping operation before the `torch.cat` operation, wouldn't the shape become `[B, N + 1, 2]` after the padding? This doesn't feel right. Since this `PromptEncoder` is used in SAM2 as well, it seems to impact both models.
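To make the question concrete, here is a minimal sketch of the padding step as I understand it. The helper name `pad_points` and the example sizes are my own assumptions for illustration, not the repository's API. It only shows the shapes that come out of concatenating along `dim=1`.

```python
import torch


def pad_points(points: torch.Tensor, labels: torch.Tensor):
    """Hypothetical helper: append one zero point (label -1) per batch item."""
    # One extra point per batch item, shape [B, 1, 2]
    padding_point = torch.zeros((points.shape[0], 1, 2), device=points.device)
    # Matching extra label per batch item, shape [B, 1]
    padding_label = -torch.ones((labels.shape[0], 1), device=labels.device)
    # Concatenating along dim=1 grows the point dimension from N to N + 1
    points = torch.cat([points, padding_point], dim=1)
    labels = torch.cat([labels, padding_label], dim=1)
    return points, labels


B, N = 2, 3  # assumed example sizes
points = torch.rand(B, N, 2)
labels = torch.ones(B, N)

padded_points, padded_labels = pad_points(points, labels)
print(padded_points.shape)  # torch.Size([2, 4, 2])
print(padded_labels.shape)  # torch.Size([2, 4])
```

In this sketch the result is `[B, N + 1, 2]` for the points and `[B, N + 1]` for the labels, which is the shape I am asking about.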
Please correct me if I misunderstand any part of this.
Thank you!
https://github.com/facebookresearch/segment-anything/blob/main/segment_anything/modeling/prompt_encoder.py#L81-L85