resolved issue with black and white images in nodes_mask.py #4110
Conversation
Nodes that use the composite function accept both b&w and rgb images. Both formats correctly display in the image preview node; however, they have transposed HW and incompatible C dimensions, even though both are accepted as inputs. This change introduces a function that converts black-and-white inputs to rgb so that composite operations do not unexpectedly fail while the image preview displays an image successfully.
I'll recap some discussion we had on the #backend-development Discord.

Ultimately, I think that the root cause of the issue here is the fact that a node is returning a `MASK` (i.e. a BHW tensor) when it says it's returning an `IMAGE` (i.e. a BHWC tensor). Unfortunately, as @shawnington said, this displays just fine in the Preview Image node, which confuses end-users and makes people think it's a valid image.

In my opinion, the right long-term fix is to prevent nodes from returning malformed types. The back-end could perform validation on built-in types (e.g. for `IMAGE`, ensuring that they are BHWC tensors) and throw an error on any nodes that fail this. This would force node authors not to ship buggy nodes like this.

In the shorter term (to maintain backward compatibility), we could throw warnings in the console to encourage node authors to fix their bugs.

A band-aid fix like this PR certainly doesn't hurt anything and would improve the user experience, so I think it's a reasonable stop-gap solution, but it really is just a band-aid. While this change would make this one node work when given a `MASK` that's masquerading as an `IMAGE`, many other nodes will still fail with it.

If we start finding ourselves adding this logic to more nodes, I really think we should be making the more foundational fix.
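To make the suggestion concrete, such back-end validation could be a small shape check run on every declared `IMAGE` output. The following is a minimal sketch under that assumption; the function name and the point where it would hook into execution are hypothetical, not part of ComfyUI's actual API:

```python
def validate_image_output(value):
    """Hypothetical check that a declared IMAGE output really is a BHWC
    tensor. A MASK (BHW) slipping through would fail the dimension check
    here, instead of failing later inside an unrelated downstream node."""
    shape = getattr(value, "shape", None)
    if shape is None:
        raise TypeError("IMAGE output must be a tensor-like object with a .shape")
    if len(shape) != 4:
        raise ValueError(f"IMAGE output must be BHWC (4 dims), got shape {tuple(shape)}")
    if shape[-1] not in (1, 3, 4):
        raise ValueError(f"IMAGE output has C={shape[-1]}; expected 1, 3, or 4 channels")
```

Run as a warning first (per the backward-compatibility point above), this would surface the offending node in the traceback rather than whichever node happens to consume the malformed tensor.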
```diff
@@ -5,7 +5,19 @@
 from nodes import MAX_RESOLUTION

 def tensor_to_rgb(img):
     # convert from bw to rgb : cwh -> bchw
```
I don't think this is correct. I'm fairly certain that masks in ComfyUI are BHW and images are BHWC. Your logic here works out when the batch size is 1, but will break if there is a batch of more than one mask.
Ah, I see -- `.movedim(-1, 1)` is called before calling this function for the RGB case -- so it would be BCHW. Still, I think you want to insert the channel dimension rather than the batch dimension.
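The distinction only shows up once the batch is larger than one. A small shape demonstration (using NumPy for brevity; the torch `unsqueeze` semantics are identical):

```python
import numpy as np

# A batch of two 4x4 masks: BHW
masks = np.zeros((2, 4, 4), dtype=np.float32)

# Inserting a *batch* dimension at index 0 gives (1, 2, 4, 4):
# downstream code would now read "2" as the channel count.
wrong = np.expand_dims(masks, 0)

# Inserting the *channel* dimension at index 1 gives (2, 1, 4, 4):
# a proper BCHW layout whose channel can be repeated out to RGB.
right = np.expand_dims(masks, 1)
rgb = np.repeat(right, 3, axis=1)  # (2, 3, 4, 4)
```

With a batch of one both layouts happen to have the same shape, which is why the original logic appears to work in simple tests.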
> Ah, I see -- `.movedim(-1, 1)` is called before calling this function for the RGB case -- so it would be BCHW. Still, I think you want to insert the channel dimension rather than the batch dimension.
Yeah, I am not sure why it wants BCHW when the LoadImage node outputs BHWC, but the composite nodes in nodes_mask.py definitely require this format to perform their operations, and it is also the format they return.

The batch dimension is needed because the composite node accesses the channel dimension by index position: C is expected at index 1, so HWC would also fail without the batch being added. The batch is only added if one is not already present; otherwise the channel is repeated at index 1.

I'm guessing all of this is a result of there not being an enforced format for what an image is through the pipeline. So forcing conformance to either BCHW or BHWC would fix a lot of the confusion around things like this.
I agree with this 100%. There is one slight error: masks are passed as numpy arrays, not tensors. The result of this "bug" (or what many users would consider a bug, since preview displays it properly) is that converting a mask into an image, by turning the numpy array into a tensor either directly or via PIL, looks like `image = torch.from_numpy(np.array(mask).astype(np.float32) / 255.0)`, or `mask = Image.fromarray(mask, mode="L")` followed by `torch.from_numpy(np.array(mask).astype(np.float32) / 255.0)`. At first glance this doesn't look problematic, and it will display in the image preview 100% correctly. However, both of these produce malformed images that happen to preview correctly. The correct way would be `mask = Image.fromarray(mask, mode="L").convert("RGB")` followed by `torch.from_numpy(np.array(mask).astype(np.float32) / 255.0)`. Typically you would squeeze the mask down to HW first, then unsqueeze afterwards.

What can further confuse users is that any image node that converts the tensor to PIL to use PIL image operations, such as `image = image.filter(ImageFilter.GaussianBlur(blur_radius))`, will work perfectly fine with the malformed black-and-white tensor once converted to a PIL image. It will also convert back to a tensor fine and once again preview fine. The user has now successfully passed a malformed image format through a node that performs operations on it, without error, and the actual source of the problem is obfuscated, because when an error is finally thrown it has no traceback to where the original malformed tensor was created.

The rest I agree with. This is a quick patch that only deals with things in nodes_mask.py. A longer-term solution should enforce a proper type for images. Perhaps a function similar to what I proposed, provided either as a node_helper shown in the examples for creating a custom node, or applied as validation on all image outputs (without the node author having a say) to ensure they conform to the expected BHWC format. A third option would be to prevent the preview node from displaying malformed images, and instead show an error message describing how the image is malformed.
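The difference between the two PIL paths described above can be seen purely in the resulting array shapes. A small sketch, assuming Pillow and NumPy are available (the mask here is a placeholder array, not ComfyUI data):

```python
import numpy as np
from PIL import Image

# A single-channel uint8 mask, shape HW
mask = np.zeros((64, 64), dtype=np.uint8)

# Without .convert("RGB"): the image stays mode "L", and round-tripping
# through numpy yields an HW array. This previews fine but, even after
# adding a batch dim, it is a BHW mask rather than a BHWC image.
bw = np.array(Image.fromarray(mask, mode="L")).astype(np.float32) / 255.0

# With .convert("RGB"): the round-trip yields HWC with 3 channels, and
# adding the batch dim produces a well-formed 1xHxWx3 IMAGE tensor.
rgb = np.array(Image.fromarray(mask, mode="L").convert("RGB")).astype(np.float32) / 255.0
image = rgb[None, ...]  # BHWC
```

The "L"-mode path loses the channel axis entirely, which is exactly the malformed shape that sails through preview and PIL-based nodes unnoticed.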
superseded by #4149
Nodes that use the composite function accept both b&w (`1:w:h`) and rgb (`b:3:h:w`) images as input. Both formats correctly display in the image preview node; however, the formats have transposed HW and incompatible C dimensions, which throws an error in nodes that attempt to composite them.

This is the normal format for PIL to create an image from a single-channel numpy array such as a mask if `.convert("RGB")` is not used. This is a fairly common omission, as it does not impact preview behavior, and compositing a mask preview over an RGB image was not a common workflow task. However, with the introduction of UnionControlnet, compositing of mask previews is now a common operation, as it is part of creating a conditioning image for `repaint` union operations.

The resultant image will pass through most image nodes without issue when converted back into a tensor, and will display properly in preview nodes, but will not composite properly with an "RGB" image. This can make it confusing for users as to where and what the source of the error is.

This change introduces a function that checks whether a tensor is in the PIL b&w shape `1:w:h` and converts it to the RGB shape `b:3:h:w`, so that composite operations do not unexpectedly fail while the image preview displays an image successfully.

An alternative solution is to make image inputs only accept tensors of shape `b:3:h:w`, instead of also accepting `1:w:h` and `b:1:w:h` images.
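For illustration, the kind of coercion this PR describes could look like the following. This is a sketch, not the PR's actual code, and it assumes (per the discussion above) that the composite nodes operate on BCHW with masks arriving as BHW:

```python
import torch

def tensor_to_rgb(img: torch.Tensor) -> torch.Tensor:
    """Sketch: coerce a b&w mask tensor into the b:3:h:w layout the
    composite nodes expect, passing real RGB batches through untouched."""
    if img.ndim == 3:
        # BHW mask: insert the channel dimension at index 1, not a batch dim
        img = img.unsqueeze(1)
    if img.shape[1] == 1:
        # Single channel: repeat it out to 3 so composite ops see b:3:h:w
        img = img.repeat(1, 3, 1, 1)
    return img
```

Because the channel dimension is inserted at index 1 rather than a batch dimension at index 0, this handles batches of more than one mask, which was the failure mode pointed out in review.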