I am trying to use Habitat to generate a multimodal dataset (video+language annotations) which may help improve the spatial and temporal perception abilities of VLMs.
I am using the Humanoid to interact with the environment, such as navigation/pick/place. However, I found that when picking an object, the humanoid does not look at the object of interest (from the perspective of the head_rgb sensor); the object to pick is missing from the view. Besides, during the pick action, the robot's head RGB sensor clips through the humanoid model, resulting in a view from inside the robot.
Is there any method or hint that could solve or bypass these problems?
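For context, here is a minimal sketch of the kind of workaround I have in mind: re-orienting the head camera toward the target object and pulling it slightly back along the view direction so it does not end up inside the mesh. The sensor accessor (`sim._sensors["head_rgb"]._sensor_object`) and the assumption that the node's parent frame is roughly the world frame are my guesses and may not match the current habitat-lab API:

```python
import magnum as mn

def look_at_object(sim, target_pos, sensor_uuid="head_rgb", back_off=0.25):
    """Point the named visual sensor at target_pos (world frame) and back it
    off slightly along the viewing direction to avoid clipping into the body.

    NOTE: the way the sensor node is fetched below is an assumption about the
    habitat-lab internals and may need to be adapted to the installed version.
    """
    sensor = sim._sensors[sensor_uuid]._sensor_object  # assumed accessor
    node = sensor.node

    cam_pos = node.absolute_translation
    target = mn.Vector3(*target_pos)

    # Move the eye point backwards along the camera->object direction so the
    # camera sits outside the humanoid's head/torso geometry during the pick.
    view_dir = (target - cam_pos).normalized()
    eye = cam_pos - view_dir * back_off

    # Build a world-frame look-at transform and overwrite the node transform.
    # This assumes the node's parent frame is (close to) the world frame,
    # which may not hold for a sensor parented to an articulated head link.
    node.transformation = mn.Matrix4.look_at(eye, target, mn.Vector3(0.0, 1.0, 0.0))
```

If there is an intended way to do this (e.g. driving the humanoid's head/neck joints so the camera naturally tracks the pick target), I would prefer that over directly overriding the sensor transform as above.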