update YUV format codes and documentation #214

Merged
merged 1 commit into from
Apr 3, 2023

christianrauch
Contributor

The YUV formats seem to be ambiguous. With this PR, I would like to clarify this and properly document their memory layout. There is no "yuv422_yuy2" format and the "yuv422" format refers to the format code YU16. The documentation mentions a "UYUV" format, which does not exist. Additionally, the website http://www.fourcc.org is not reachable anymore.

Using the Wayback Machine, I compared the memory layouts documented at http://www.fourcc.org with those in the Linux kernel documentation at https://www.kernel.org/doc/html/latest/. For yuv422, the memory layout described at fourcc.org matches V4L2_PIX_FMT_UYVY at kernel.org, and for yuv422_yuy2 it matches V4L2_PIX_FMT_YUYV.
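To make the difference between the two layouts concrete, here is a minimal sketch of how one 4-byte macropixel (two pixels sharing one U/V pair) is unpacked. It assumes the byte orders given in the kernel documentation: Y0 U Y1 V for V4L2_PIX_FMT_YUYV and U Y0 V Y1 for V4L2_PIX_FMT_UYVY. The function name `unpack_yuv422_packed` is hypothetical and is not part of any ROS or V4L2 API.

```python
def unpack_yuv422_packed(data: bytes, order: str) -> list:
    """Unpack packed YUV 4:2:2 bytes into one (Y, U, V) tuple per pixel.

    `order` is "YUYV" (bytes Y0 U Y1 V) or "UYVY" (bytes U Y0 V Y1).
    Hypothetical helper for illustration only.
    """
    if order not in ("YUYV", "UYVY"):
        raise ValueError(f"unsupported byte order: {order!r}")
    if len(data) % 4 != 0:
        raise ValueError("packed 4:2:2 data must be a multiple of 4 bytes")

    pixels = []
    for i in range(0, len(data), 4):
        if order == "YUYV":
            y0, u, y1, v = data[i:i + 4]
        else:  # UYVY
            u, y0, v, y1 = data[i:i + 4]
        # Both pixels of a macropixel share the same chroma sample.
        pixels.append((y0, u, v))
        pixels.append((y1, u, v))
    return pixels


macropixel = bytes([10, 20, 30, 40])
as_yuyv = unpack_yuv422_packed(macropixel, "YUYV")
as_uyvy = unpack_yuv422_packed(macropixel, "UYVY")
```

The same four bytes decode to different pixel values depending on the assumed order, which is why an encoding name that does not pin down the order is ambiguous.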

The NV21 and NV24 formats have unambiguous format codes across the kernel documentation. For the YUV 4:2:2 formats, different sources seem to use different names, hence I propose to use the fourcc instead of an ambiguous name.

See https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/include/uapi/linux/videodev2.h for a list of codes used with video devices.

@sgvandijk Since you added the "yuv422_yuy2" format via #78, can you comment if my changes make sense to you?

@christianrauch christianrauch requested a review from tfoote as a code owner January 4, 2023 12:41
Use format codes instead of ambiguous names.

Signed-off-by: Christian Rauch <[email protected]>
@sgvandijk
Contributor

The naming of the format was discussed in #76. As mentioned there, one of the reasons for adding the format was so that cv_bridge could then be extended to perform the right conversion. OpenCV defines:

  cv::COLOR_YUV2RGB_YUY2 = 115,
  cv::COLOR_YUV2BGR_YUY2 = 116, 

and then the aliases:

  cv::COLOR_YUV2RGB_YUYV = COLOR_YUV2RGB_YUY2,
  cv::COLOR_YUV2BGR_YUYV = COLOR_YUV2BGR_YUY2,
  cv::COLOR_YUV2RGB_YUNV = COLOR_YUV2RGB_YUY2,
  cv::COLOR_YUV2BGR_YUNV = COLOR_YUV2BGR_YUY2, 

This hints that they treat YUY2 as the canonical name, with YUYV and YUNV as alternative names, seemingly following the table that used to be at fourcc.org.

It is true that this doesn't follow the V4L2 codes and that V4L2 drivers return the code "YUYV", which I also had to handle in my V4L2 camera driver. However, I opted to be consistent with OpenCV as the more general framework rather than the more specific V4L2.

FWIW, Windows seems to use YUY2 as well and does not use YUYV as an alias AFAICT. Also see RFC2361.

FFmpeg mixes things a bit: running ffmpeg -f v4l2 -i /dev/video0 gives me:

Stream #0:0: Video: rawvideo (YUY2 / 0x32595559), yuyv422, 640x480, 147456 kb/s, 30 fps, 30 tbr, 1000k tbn, 1000k tbc

so they use the YUY2 FourCC, which they then map to the internal yuyv422 type.
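The hexadecimal value in that FFmpeg line is just the four ASCII characters packed into a little-endian 32-bit integer. A minimal sketch of that packing follows; the helper names `fourcc` and `fourcc_to_str` are hypothetical and only mirror the idea behind macros such as FFmpeg's MKTAG.

```python
def fourcc(tag: str) -> int:
    """Pack a four-character code into a little-endian 32-bit integer."""
    if len(tag) != 4:
        raise ValueError("a FourCC must be exactly four characters")
    a, b, c, d = tag.encode("ascii")
    # The first character ends up in the least significant byte.
    return a | (b << 8) | (c << 16) | (d << 24)


def fourcc_to_str(code: int) -> str:
    """Recover the four-character code from its packed integer form."""
    return code.to_bytes(4, "little").decode("ascii")


yuy2_code = fourcc("YUY2")
yuyv_code = fourcc("YUYV")
```

Here `hex(yuy2_code)` gives `0x32595559`, the value shown in the FFmpeg output, while "YUYV" packs to a different integer even though both tags describe the same memory layout.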

With all that said, I'd agree that YUYV would be clearer, and I wouldn't be against adopting it per se; I had that name in mind initially as well. But almost everywhere I look, except for V4L2, YUY2 is the common default name for this encoding.

@christianrauch
Contributor Author

Thanks for the detailed response. That means that YUYV is used on Linux and YUY2 is used on Windows, but they both describe the same memory layout. The ffmpeg pixel format mapping indeed shows that MKTAG('Y', 'U', 'Y', '2') and MKTAG('Y', 'U', 'Y', 'V') are mapped to AV_PIX_FMT_YUYV422. In this case, I am also fine with using YUY2 as the fourcc.

I wouldn't rely so much on OpenCV for choosing those names. It is more important that the names specify the precise memory layout, so that the image can be reconstructed manually from the raw data. When you get a data stream from a camera with a given fourcc, the mapping to the ROS "encodings" should be unambiguous (see also #204).

But in any case, I find the yuv422 encoding name ambiguous as there are multiple YUV 4:2:2 encodings.
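As a rough illustration of what an unambiguous mapping could look like, the sketch below maps each fourcc to exactly one encoding name, with YUY2 and YUYV treated as aliases of the same layout. The encoding strings are the legacy names discussed in this thread ("yuv422" for the UYVY layout, "yuv422_yuy2" for the YUYV layout); the table and the function `encoding_for_fourcc` are hypothetical and do not reflect the final names or any actual driver code.

```python
# Hypothetical lookup table: several fourccs may share one encoding,
# but each fourcc resolves to exactly one memory layout.
FOURCC_TO_ENCODING = {
    "UYVY": "yuv422",       # bytes: U Y0 V Y1
    "YUYV": "yuv422_yuy2",  # bytes: Y0 U Y1 V (Linux / V4L2 name)
    "YUY2": "yuv422_yuy2",  # same layout (Windows / OpenCV name)
}


def encoding_for_fourcc(tag: str) -> str:
    """Return the encoding name for a fourcc, or fail loudly if unknown."""
    try:
        return FOURCC_TO_ENCODING[tag]
    except KeyError:
        raise ValueError(f"no encoding known for fourcc {tag!r}") from None


same_layout = encoding_for_fourcc("YUYV") == encoding_for_fourcc("YUY2")
```

Rejecting unknown codes instead of guessing keeps a driver from silently publishing data under the wrong layout.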

Contributor

@tfoote tfoote left a comment

This looks cleaner and more clearly documented.

@clalancette
Contributor

CI:

  • Linux Build Status
  • Linux-aarch64 Build Status
  • Windows Build Status

@clalancette clalancette merged commit 0655d4e into ros2:rolling Apr 3, 2023
@christianrauch christianrauch deleted the update_yuv branch April 4, 2023 17:30
rr-tom-noble pushed a commit to rivelinrobotics/common_interfaces that referenced this pull request May 4, 2023
Use format codes instead of ambiguous names.

Signed-off-by: Christian Rauch <[email protected]>
Signed-off-by: Tom Noble <[email protected]>

4 participants