Add Optional Key-Point Visibility to `draw_keypoints()` (#8203)

Comments
Thank you for this greatly detailed feature proposal @bmmtstb! I think it makes sense. Instead of an additional argument, we could let the `keypoints` tensor carry the visibility as a third coordinate.
I thought about that too, but for me a separate argument would be easier to understand. Because there are pros and cons to both approaches, I started a list of all the things I just thought of. In my opinion the extra argument is more flexible, while demanding only a little more effort from the user.
If we follow your proposal, a few more questions arise:

- Will the third dimension be "force"-cast to a bool? In the end, the visibility tensor is easily extractable from the output tensor:

  ```python
  output = torch.ones((21, 3))
  kp, vis = output.split([2, 1], dim=-1)
  ```

- Even though we can't draw them yet, what about "real" 3D key points?
- Specific Type vs. Key-Point Type:
Thanks for your feedback @bmmtstb
That makes sense: it should be up to users to decide what the threshold should be, so let's go ahead with the separate `visibility` argument.
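Leaving the threshold to the user keeps the API simple; for example (a sketch, assuming per-key-point confidence scores from some upstream model):

```python
import torch

# Sketch: with a separate `visibility` argument, the user converts any
# per-key-point confidence score into a bool mask with a threshold of
# their own choosing. `scores` is an assumed [num_instances, K] tensor.
scores = torch.tensor([[0.9, 0.2, 0.7]])
visibility = scores > 0.5  # user-chosen threshold
print(visibility.tolist())  # [[True, False, True]]
```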
I will do a PR, but I can't promise how fast I can finish it. Fingers crossed for the weekend 🤞🏼

There's no rush at all - thanks for doing it!
🚀 The feature

I propose an optional key-point visibility flag for `torchvision.utils.draw_keypoints()`, to be able to draw human skeletons with key points that are not set / visible. If a key point is marked as invisible, the "dot" and the corresponding connection(s) in the `connectivity` / skeleton will not be drawn for this person on this image. Other people on the same image (`num_instances` > 1) can still have the full skeleton or another set of visible joints.

The `visibility` input can either be a `torch.BoolTensor` (or any other `Callable[bool]`) with the same number of entries `K` as the `keypoints` tensor, describing the visibility of each respective key point. If `num_instances` is bigger than one, there should either be a tensor of shape `[num_instances, K]` describing every person individually, or one of shape `[K]` describing all instances within this image at once. The `visibility` should be optional and therefore default to `True` / `torch.ones((num_instances, K))`.

Motivation, pitch
The current issue arises when key point coordinates are set to the origin, e.g., if they are not visible, not found, or otherwise not available.
Let's have a look at the example showing the possibilities of `draw_keypoints()` over at the docs. Given the image of the skateboarder, let some (other) model predict the key-point or joint coordinates as `(x, y, visibility)`, obtaining the following result: this is the result of the example, just that the `left_eye`, `left_ear`, and `left_hip` are annotated as "not visible", with key point coordinates of `(0, 0)`.

Plotting this result shows three lines connecting the skateboarder with the origin, which doesn't look good. On the left is the original image, on the right the one using `new_keypoints`, which has invisible key points.

Now imagine how that looks for other skeleton structures, like Halpe-FullBody (136 key points) or COCO-WholeBody (133 key points)...
Alternatives
It is possible to remove the "invisible" key points from the skeleton by updating the skeleton for every image and using something like:
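A hypothetical version of such a per-image pruning step (plain Python, illustrative names, not torchvision API): drop every connection whose endpoints include an invisible key point, then call `draw_keypoints()` per instance with the pruned connectivity.

```python
# Hypothetical workaround helper: prune the connectivity list so that
# no edge touches an "invisible" key point.
def prune_connectivity(connectivity, visible):
    """connectivity: list of (start_idx, end_idx); visible: list of bools."""
    return [(a, b) for a, b in connectivity if visible[a] and visible[b]]

coco_edges = [(0, 1), (0, 2), (1, 3), (2, 4)]  # excerpt of a COCO skeleton
visible = [True, False, True, True, True]       # key point 1 not visible
print(prune_connectivity(coco_edges, visible))  # [(0, 2), (2, 4)]
```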
But the "dots" are still printed in the upper left corner (see below), and the whole process is fairly cumbersome: the skeleton of every human has to be analysed and drawn separately, because the skeletons of different persons might have different exclusions and `draw_keypoints()` only accepts one connectivity for all instances.

Therefore, a second alternative would be to allow passing multiple connectivities, one for each instance. But that still doesn't solve the drawn "dots" problem and feels less intuitive than the proposed approach.
Additional context
This image is taken from the PoseTrack21 dataset and shows how a full-body skeleton fails when only the upper body gets detected by the bounding box. These are the original annotated key points within the annotated bounding box of the dataset. (Image source, not publicly available: PoseTrack21/images/val/010516_mpii_test/000048.jpg)