Questions about dimensions of tensors #24

Open
jonghakim35 opened this issue Nov 14, 2022 · 3 comments

@jonghakim35

jonghakim35 commented Nov 14, 2022

Hi, thanks for your great work.

Just for clarification, it would be great to know the dimensions of the tensors in Section 3.2.
Below is my understanding of the tensor dimensions when using the HICO-DET dataset.
If I have misunderstood anything, please kindly let me know.

\tilde{l}_o : (1, 80)
A_o : (80, 600)
l_v : (1, 117)
A_v : (117, 600)

Therefore, \bar{y} : (1, 600). Is this correct?
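
The shapes listed above can be checked with a small NumPy sketch. It assumes that the composed label is the element-wise AND of the two projections, \bar{y} = (\tilde{l}_o A_o) & (l_v A_v), which is my reading of Section 3.2 and not taken from the repository code. The co-occurrence matrices here are hypothetical random stand-ins for the real HICO-DET annotations, and every variable name is illustrative.

```python
import numpy as np

NUM_OBJECTS, NUM_VERBS, NUM_HOIS = 80, 117, 600

# Hypothetical stand-in for the HICO-DET annotations:
# 600 distinct (object, verb) pairs drawn at random.
rng = np.random.default_rng(0)
pairs = rng.choice(NUM_OBJECTS * NUM_VERBS, size=NUM_HOIS, replace=False)
hoi_to_object, hoi_to_verb = np.divmod(pairs, NUM_VERBS)

# Co-occurrence matrices: one column per HOI class, with a single 1
# in the row of the object (A_o) or verb (A_v) that the class uses.
hoi_ids = np.arange(NUM_HOIS)
A_o = np.zeros((NUM_OBJECTS, NUM_HOIS), dtype=np.int64)
A_o[hoi_to_object, hoi_ids] = 1
A_v = np.zeros((NUM_VERBS, NUM_HOIS), dtype=np.int64)
A_v[hoi_to_verb, hoi_ids] = 1

# One-hot object and verb labels for an arbitrarily chosen HOI class.
hoi = 42
l_o = np.zeros((1, NUM_OBJECTS), dtype=np.int64)
l_o[0, hoi_to_object[hoi]] = 1
l_v = np.zeros((1, NUM_VERBS), dtype=np.int64)
l_v[0, hoi_to_verb[hoi]] = 1

# Assumed composition rule: a class is active only if both its object
# and its verb are active.
y_bar = (l_o @ A_o) & (l_v @ A_v)

print(l_o.shape, A_o.shape, l_v.shape, A_v.shape, y_bar.shape)
```

Under these assumptions \bar{y} has shape (1, 600), and it can only mark classes that already have a column in A_o and A_v.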

Also, since the composed HOI label must lie in the original set of 600 HOI triplets, is it correct that this method cannot discover novel HOI triplets, and that the main focus of the work is learning affordances correctly via feature composition?

Again, thanks for sharing your great work.

@zhihou7
Owner

zhihou7 commented Nov 14, 2022

Hi @jonghakim35,
For ATL and VCL, you are right: in those two papers I predict the 600 HOI classes directly, which limits the label space. In practice, we can convert the 600-class labels into verb labels via A_v and, together with l_o, construct a new matrix of possible concepts; that is what I did for affordance recognition in ATL. For HOI Concept Discovery, I supervise the model with verb labels (dimension 117), represent the HOI labels with an 80x117 matrix, and obtain the pseudo verb labels from that matrix.
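
A rough sketch of the two conversions described above, under stated assumptions: the 600-class label is mapped to a 117-dimensional verb label by multiplying with the transpose of A_v, and the 80x117 matrix is taken to be the set of annotated object-verb combinations. The random co-occurrence matrices, the variable names, and the row lookup used for the pseudo verb labels are all illustrative guesses, not the actual implementation in this repository.

```python
import numpy as np

NUM_OBJECTS, NUM_VERBS, NUM_HOIS = 80, 117, 600

# Hypothetical stand-in for the HICO-DET annotations:
# 600 distinct (object, verb) pairs drawn at random.
rng = np.random.default_rng(0)
pairs = rng.choice(NUM_OBJECTS * NUM_VERBS, size=NUM_HOIS, replace=False)
hoi_to_object, hoi_to_verb = np.divmod(pairs, NUM_VERBS)

hoi_ids = np.arange(NUM_HOIS)
A_o = np.zeros((NUM_OBJECTS, NUM_HOIS), dtype=np.int64)
A_o[hoi_to_object, hoi_ids] = 1
A_v = np.zeros((NUM_VERBS, NUM_HOIS), dtype=np.int64)
A_v[hoi_to_verb, hoi_ids] = 1

# A multi-hot 600-class HOI label with two active classes.
y_hoi = np.zeros((1, NUM_HOIS), dtype=np.int64)
y_hoi[0, [3, 7]] = 1

# Assumed conversion: project the HOI label onto verbs through A_v.
verb_label = (y_hoi @ A_v.T > 0).astype(np.int64)   # shape (1, 117)

# Assumed 80x117 matrix: entry (o, v) is 1 when some HOI class
# combines object o with verb v.
concepts = (A_o @ A_v.T > 0).astype(np.int64)       # shape (80, 117)

# One possible reading of "pseudo verb labels": look up the row of
# the matrix that belongs to a given object class.
obj = int(hoi_to_object[3])
pseudo_verb_label = concepts[obj][None, :]          # shape (1, 117)

print(verb_label.shape, concepts.shape, pseudo_verb_label.shape)
```

Because the label space here is object x verb rather than the fixed 600 classes, entries of the 80x117 matrix outside the annotated combinations can represent new concepts.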

@jonghakim35
Author

Thanks for the quick and detailed clarification. It helped me a lot in understanding the paper!

@zhihou7
Owner

zhihou7 commented Nov 14, 2022

You are welcome. In ATL, I found that it is possible to recognize object affordance from an HOI model, but I was not aware of the deeper meaning of object affordance for HOI: it actually implies reasonable verb-object combinations.
