
Add HTJ2K Compressor #1883

Open
wants to merge 4 commits into base: htj2k-beta

Conversation

@palemieux palemieux commented Oct 13, 2024

This patch fulfills my action item from the September 19 TSC meeting.

The patch proposes to add support for High-Throughput JPEG 2000 (HTJ2K) compression to OpenEXR -- HTJ2K is JPEG 2000 with the HT block coder standardized in Rec. ITU-T T.814 | ISO/IEC 15444-15. As detailed in Evaluating HTJ2K as a compressor option for OpenEXR, HTJ2K demonstrates significant improvements in both speed and file size over the other OpenEXR compressors. Furthermore, HTJ2K is a worldwide standard that benefits from a diversity of implementations, builds on JPEG 2000 (which is in broad use in studio workflows), is appropriate for long-term preservation, has both lossy and lossless modes, and supports integer and floating-point samples. Finally, the unique features of HTJ2K could be powerful additions to high-performance OpenEXR workflows where the full image may not always be viewed at its full resolution.

The patch defines an ht256 compressor, which uses 256-scanline chunks (not a limitation of HTJ2K, which supports any number of scanlines) and the open-source OpenJPH library.
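For illustration, here is a minimal usage sketch (not part of the patch) showing how an application could select the new compressor through the existing C++ API; it assumes the HT256_COMPRESSION enum value added by this PR, whose name is still under discussion below.

// Hypothetical sketch: write an RGBA image with the proposed ht256 compressor.
#include <ImfRgbaFile.h>
#include <ImfHeader.h>
#include <ImfCompression.h>

void
writeHT256 (const char* fileName, const Imf::Rgba* pixels, int width, int height)
{
    Imf::Header header (width, height);
    header.compression () = Imf::HT256_COMPRESSION; // enum value added by this patch

    Imf::RgbaOutputFile file (fileName, header, Imf::WRITE_RGBA);
    file.setFrameBuffer (pixels, 1, width);
    file.writePixels (height);
}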


linux-foundation-easycla bot commented Oct 13, 2024

CLA Signed

The committers listed above are authorized under a signed CLA.

@cary-ilm
Member

The CI is failing in the validate step, which attempts to link a test program against the library. The error message indicates the link is failing with unresolved symbols, possibly because that cmake configuration is missing the new dependency.

However, this part of the CI has been rewritten in the PR I just merged. Can you rebase your branch onto the current main branch now? It may still fail, but hopefully it will be easier to resolve then at least.
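For reference, a typical rebase sequence for this would be something like the following; the remote names are an assumption (origin = your fork, upstream = AcademySoftwareFoundation/OpenEXR):

# fetch the latest main and replay the feature branch on top of it
git fetch upstream
git rebase upstream/main feature/add-ht-support-rebase
# rewriting history requires a force push to the fork
git push --force-with-lease origin feature/add-ht-support-rebase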

@palemieux palemieux force-pushed the feature/add-ht-support-rebase branch from 2a3f1c5 to 439cfe5 on December 13, 2024
@palemieux
Author

Can you rebase your branch onto the current main branch now? It may still fail, but hopefully it will be easier to resolve then at least.

@cary-ilm Done.

@cary-ilm
Member

The failure in the "Validate" step of the linux build is along the lines of what we discussed in the TSC meeting today: the check that makes sure "make install" installs just the right files. This output from the logs appears to indicate that the cmake configuration is causing the openjph library and headers to be installed, which shouldn't happen and will cause problems for downstream projects:

Error: The following files were installed but were not expected:
  bin/ojph_compress
  bin/ojph_expand
  include/OpenEXR/ImfHTCompressor.h
  include/OpenEXR/ImfHTKCompressor.h
  include/openjph/ojph_arch.h
  include/openjph/ojph_arg.h
  include/openjph/ojph_base.h
  include/openjph/ojph_codestream.h
  include/openjph/ojph_defs.h
  include/openjph/ojph_file.h
  include/openjph/ojph_mem.h
  include/openjph/ojph_message.h
  include/openjph/ojph_params.h
  include/openjph/ojph_version.h
  lib64/libopenjph.so
  lib64/libopenjph.so.0.18
  lib64/libopenjph.so.0.18.2
  lib64/pkgconfig/openjph.pc

Try to make sure your cmake "fetch" of openjph mirrors as closely as possible how the current setup handles libdeflate and Imath. I had a similar issue with them originally, but the current setup should handle the fetch properly.
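For illustration, a generic FetchContent sketch of the kind of fetch being suggested; this is not the PR's actual configuration (the real setup should mirror whatever the repository does for libdeflate and Imath), and it points at the add-export branch referenced elsewhere in this review rather than a tagged OpenJPH release:

# Sketch only: fetch OpenJPH without adding it to OpenEXR's install rules.
include(FetchContent)
FetchContent_Declare(openjph
  GIT_REPOSITORY https://github.com/palemieux/OpenJPH.git
  GIT_TAG add-export)
FetchContent_GetProperties(openjph)
if(NOT openjph_POPULATED)
  FetchContent_Populate(openjph)
  # EXCLUDE_FROM_ALL is intended to keep the openjph library, headers and the
  # ojph_compress/ojph_expand tools out of the parent project's default build
  # and install; whether that satisfies the validate check would need testing.
  add_subdirectory(${openjph_SOURCE_DIR} ${openjph_BINARY_DIR} EXCLUDE_FROM_ALL)
endif()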

@palemieux
Author

Thanks! What is the best way to run that validate step locally?

@palemieux
Author

@cary-ilm I remember now... I could never fix the following errors:

[cmake] CMake Error: install(EXPORT "OpenEXR" ...) includes target "Iex" which requires target "openjph" that is not in any export set.
[cmake] CMake Error: install(EXPORT "OpenEXR" ...) includes target "IlmThread" which requires target "openjph" that is not in any export set.
[cmake] CMake Error: install(EXPORT "OpenEXR" ...) includes target "OpenEXRCore" which requires target "openjph" that is not in any export set.
[cmake] CMake Error: install(EXPORT "OpenEXR" ...) includes target "OpenEXR" which requires target "openjph" that is not in any export set.
[cmake] CMake Error: install(EXPORT "OpenEXR" ...) includes target "OpenEXRUtil" which requires target "openjph" that is not in any export set.

The install(EXPORT) step seems to impose requirements on the openjph target that I could not find a way to satisfy.
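For what it's worth, one common workaround for this class of CMake error (sketched below, and not necessarily the right fix here) is to reference the in-tree openjph target only through a BUILD_INTERFACE generator expression, so the installed OpenEXR export files never mention it; the trade-off is that consumers of static OpenEXR libraries would then need to locate OpenJPH themselves:

# Sketch only: keep openjph out of the exported link interface so that
# install(EXPORT "OpenEXR" ...) does not require an exported openjph target.
target_link_libraries(OpenEXRCore PRIVATE "$<BUILD_INTERFACE:openjph>")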

@cary-ilm
Member

I'm not enough of a cmake expert to immediately know how to resolve this. Since this is a work in progress, you could just disable the CI "validate" step entirely for now: edit the workflow file on your branch and add an "if: false" line, or just delete the step. We'll need it resolved eventually before merging, but at least that will allow you to get on with other priorities.

I'm happy to help resolve this, but I won't have much time to look into it until mid-January.

@palemieux palemieux force-pushed the feature/add-ht-support-rebase branch 3 times, most recently from 2230cc5 to e28aaf5 on January 7, 2025

// make image filename
char input_filename[512] = { '\0' };
sprintf(input_filename, args->input_filename, frame_index + args->start_frame);

Check failure (Code scanning / CodeQL): Multiplication result converted to larger type (High, test). Multiplication result may overflow 'int' before it is converted to 'size_type'.

// make image filenames
char input_filename[512] = { '\0' };
char output_filename[512] = { '\0' };
sprintf(input_filename, args.input_filename, frame_index + args.start_frame);

Check failure (Code scanning / CodeQL): Multiplication result converted to larger type (High, test). Multiplication result may overflow 'int' before it is converted to 'size_type'.

char input_filename[512] = { '\0' };
char output_filename[512] = { '\0' };
sprintf(input_filename, args.input_filename, frame_index + args.start_frame);
sprintf(output_filename, args.output_filename, frame_index + args.start_frame);

Check failure (Code scanning / CodeQL): Multiplication result converted to larger type (High, test). Multiplication result may overflow 'int' before it is converted to 'size_type'.
@peterhillman
Contributor

Regarding exrperf and exrconv: OpenEXR now provides exrmetrics for performance analysis. exrmetrics can also do compression conversion (though that's not an obvious use from the name), so it can be used instead of exrconv. It looks like exrmetrics has more complete support for different channel and part types, but there are a couple of features that exrperf and exrconv offer that are missing from exrmetrics. Perhaps it would be better to add those as options to exrmetrics, probably as a separate PR, instead of adding new binaries?

@palemieux
Author

Perhaps it would be better to add those as options to exrmetrics, probably as a separate PR, instead of adding new binaries?

Yes. exrperf and exrconv will be removed before this is merged.

@palemieux palemieux marked this pull request as ready for review January 17, 2025 00:09
@palemieux
Author

@cary-ilm I believe this is ready for review and discussion at the upcoming TSC call.

@kmilos

kmilos commented Jan 21, 2025

Please either make this optional, and/or also support building with a system OpenJPH, like libdeflate. Bundling an (unreleased) feature branch/tag is not an ideal solution for long-term distribution...

@palemieux
Author

@kmilos I agree that referencing an unreleased branch/tag of a dependency is not ideal. I expect a release of OpenJPH to be referenced by the time this PR is merged.

@kmilos

kmilos commented Jan 21, 2025

Ok, makes sense, thanks @palemieux. In the meantime I still think it would also be good to prepare for detection of the system library, as is done for libdeflate...

@palemieux
Author

In the meantime I still think it would also be good to prepare for detection of the system library, as is done for libdeflate...

You mean use the system library if available (through find_package), and fall back to the GitHub dependency if the system library is not available?

@kmilos

kmilos commented Jan 21, 2025

You mean use the system library if available (through find_package), and fall back to the GitHub dependency if the system library is not available?

Exactly, this is how it is already done for libdeflate (one can skip the CONFIG detection branch as OpenJPH doesn't ship a CMake config, only a pkgconf one). And Imath as well. In addition there are options to force use of the "internal" (fetched from GitHub) ones...
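A rough sketch of that pattern is below; the option name is hypothetical, and OpenJPH is located through its pkg-config file (openjph.pc) since, as noted above, it does not ship a CMake config:

# Sketch: prefer a system OpenJPH, fall back to the fetched copy otherwise.
option(OPENEXR_FORCE_INTERNAL_OPENJPH "Always use the internal (fetched) OpenJPH" OFF)

if(NOT OPENEXR_FORCE_INTERNAL_OPENJPH)
  find_package(PkgConfig QUIET)
  if(PKG_CONFIG_FOUND)
    pkg_check_modules(OpenJPH QUIET IMPORTED_TARGET GLOBAL openjph)
  endif()
endif()

if(OpenJPH_FOUND AND NOT OPENEXR_FORCE_INTERNAL_OPENJPH)
  # use the system library through the imported pkg-config target
  add_library(openjph ALIAS PkgConfig::OpenJPH)
else()
  # fall back to FetchContent of the OpenJPH sources, as sketched earlier
endif()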

@palemieux
Author

Exactly, this is how it is already done for libdeflate (one can skip the CONFIG detection branch as OpenJPH doesn't ship a CMake config, only a pkgconf one). In addition there is an option to force use of the "internal" (fetched from GitHub) one...

Ok. No objection from me. Is that regularly tested?

@@ -192,6 +198,7 @@ static const std::map<std::string, Compression> CompressionNameToId = {
{"b44a", Compression::B44A_COMPRESSION},
{"dwaa", Compression::DWAA_COMPRESSION},
{"dwab", Compression::DWAB_COMPRESSION},
{"ht256", Compression::HT256_COMPRESSION},
Contributor


Suggested change
{"ht256", Compression::HT256_COMPRESSION},
{"ht", Compression::HT_COMPRESSION},

Rename ht256 to ht, dropping the 256-scanline chunk size from the compressor name.

Member


Does this still need to be resolved? "ht256" still appears in other places, such as in CompressionDesc on line 180 above; does that need to be consistent?

@TodicaIonut
Contributor

Consider exposing an HT compression quality level as a float: 0.0f for visually lossless, with larger values quantizing lossily (a floating-point base-error value). @palemieux

@palemieux palemieux force-pushed the feature/add-ht-support-rebase branch 2 times, most recently from 15fc4d8 to 2be7650 on February 2, 2025
@cary-ilm
Member

cary-ilm commented Feb 2, 2025

@palemieux, now that we're getting close with this, we do need to resolve the DCO and CLA; note those checks have been failing. The fix for the other failing checks should get merged to main shortly.

For the contributor license agreement you'll need to sign the form through the red "NOT COVERED" link above.

For the "Digital Certificate of Origin", each commit needs to be "signed" via the -s argument to git commit. This adds the Signed-off-by: <email> line to the commit message. You'll need to do this for every commit. You can do this retroactively to the existing commits via git commit -s --amend --no-edit. You'll then have to git push --force to overwrite the existing commits.

You could also squash everything into a single signed commit; the history of these commits will get lost when we merge anyway, since we generally "squash-and-merge" instead of "rebase-and-merge".

For the feature branch, we'll also need this PR to merge to the feature branch name, not to main as it stands now. Should we call that AcademySoftwareFoundation/htj2k?

I'll look this over in the next few days to see if anything further stands out.

Signed-off-by: Pierre-Anthony Lemieux <[email protected]>
@palemieux palemieux force-pushed the feature/add-ht-support-rebase branch from 2be7650 to c8a670e on February 2, 2025
@palemieux
Author

@cary-ilm CLA is executed, DCO is done.

Ok with AcademySoftwareFoundation/htj2k. Can you create the branch? I will then change the target branch of this PR and rebase the commit.

Member

@cary-ilm cary-ilm left a comment


Small nit: the white paper appears to use "High Throughput", with no hyphen, so better to stay consistent with that.

SUDO="sudo"
fi

git clone -b add-export https://github.com/palemieux/OpenJPH.git
Member


Should this be https://github.com/aous72/OpenJPH?

Author


Yes. It will be a tagged release of https://github.com/aous72/OpenJPH. I am waiting to ask the maintainer to tag the release until we have made sure that no further major changes to the OpenJPH build system are needed to accommodate OpenEXR. Let me know if we are at that point.

git clone -b add-export https://github.com/palemieux/OpenJPH.git
cd OpenJPH

# git checkout ${TAG}
Member


And uncomment this?

Author


Yes!

@cary-ilm
Member

cary-ilm commented Feb 3, 2025

@palemieux I created the branch and called it htj2k-beta.

@palemieux
Author

@cary-ilm The white paper is in error :(. The official name, per the ITU and ISO specification titles, is High-throughput JPEG 2000.

@palemieux palemieux changed the base branch from main to htj2k-beta February 3, 2025 04:08
@peterhillman
Contributor

Question about RGB channels: I see special handling of RGB in the code. I admit I don't fully understand what's happening there, but are RGB channels in all layers treated as RGB? Would an image with channels called diffuse.R, diffuse.G, diffuse.B compress as efficiently as one with plain R, G, B? What about an image part that has multiple sets of RGB channels? The channel mapping function seems to use the channel list to derive the mapping. Is that mapping serialized to the file to allow future changes in the mapping logic, or do both the encoder and decoder use the channel list?

@palemieux
Author

@peterhillman the compressor makes use of a JPEG 2000 (Part 1) feature that improves coding efficiency when the first three channels are RGB (or equivalent): in that scenario, a decorrelating transform (effectively an RGB-to-YUV transform) is applied to the channels before coding.
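For reference (this is background on the Part 1 feature, not code from the patch), the reversible decorrelating transform (RCT) that JPEG 2000 Part 1 applies to the first three components for lossless coding is

$$Y = \left\lfloor \frac{R + 2G + B}{4} \right\rfloor, \qquad C_b = B - G, \qquad C_r = R - G,$$

with the exact integer inverse

$$G = Y - \left\lfloor \frac{C_b + C_r}{4} \right\rfloor, \qquad R = C_r + G, \qquad B = C_b + G.$$

The lossy path uses the irreversible (floating-point) color transform instead; how either interacts with OpenEXR's half and float samples is up to the compressor implementation.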

Currently both the encoder and decoder determine the mapping at run time. The mapping could conceivably be serialized in the file, or the channel order could be modified by the encoder (if possible).

The algorithm could be modified to accommodate channels named something other than "R", "G" and "B". Does a naming convention exist?

How would multiple sets of RGB channels be signaled in the file? Is that a common use case?

More complex/flexible JPEG 2000 tools are available if needed.

@peterhillman
Contributor

Channel naming is defined in a couple of places:
https://openexr.com/en/latest/TechnicalIntroduction.html#channel-names
https://openexr.com/en/latest/InterpretingDeepPixels.html#channel-names-and-layers
You need to look for the "base name" - the part after the last dot, if there is one, or the whole name otherwise. (If the penultimate character in the channel name is a '.' then you can compare the last character against R, G or B.)
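A hypothetical helper illustrating that rule (illustrative only, not code from the PR):

// Sketch: split an OpenEXR channel name into layer and base name, and test
// whether three channels form one R/G/B triplet within the same layer.
#include <string>

static std::string
baseName (const std::string& channel)
{
    std::size_t dot = channel.rfind ('.');
    return dot == std::string::npos ? channel : channel.substr (dot + 1);
}

static std::string
layerName (const std::string& channel)
{
    std::size_t dot = channel.rfind ('.');
    return dot == std::string::npos ? std::string () : channel.substr (0, dot);
}

static bool
isRGBTriplet (const std::string& a, const std::string& b, const std::string& c)
{
    if (layerName (a) != layerName (b) || layerName (b) != layerName (c))
        return false;

    // the three base names must be exactly "R", "G" and "B" in some order
    std::string bases = baseName (a) + baseName (b) + baseName (c);
    return bases.size () == 3 && bases.find ('R') != std::string::npos &&
           bases.find ('G') != std::string::npos &&
           bases.find ('B') != std::string::npos;
}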

It's common for a multipart file to contain R G B A in one part and L1.R L1.G L1.B L1.A in another. Since the compression runs per-part it may only see L1.R L1.G L1.B L1.A, so that could be treated as an RGBA group.

Having multiple RGB layers within the same part is possible, e.g. six channels forming two layers: R G B L1.R L1.G L1.B. If the codec only supports one set of RGB channels it makes sense to use R G B if any of those exist, and fall back to the (alphabetically) first set of *.R *.G *.B channels found otherwise.

There would be a lot of compression efficiency to be gained from a codec that understood that RGB channels within each layer have similarity to each other, but you would also expect strong correlation between all the red components (say) of each layer. Perhaps a future OpenEXR codec could apply a more complex decorrelation transform itself before using JPEG2000.

Including the mapping may be useful, and it may even be useful to support custom mappings at write time. #1942 discusses this issue in lossy DWA files, though it's more critical there because it affects image quality not just file size.

@palemieux
Author

@peterhillman Thanks for the detailed information. Will review the code in light of the naming convention.

Would supporting custom mappings at write time require a change to the API? What kind of mappings do you have in mind?

There would be a lot of compression efficiency to be gained from a codec that understood that RGB channels within each layer have similarity to each other, but you would also expect strong correlation between all the red components (say) of each layer. Perhaps a future OpenEXR codec could apply a more complex decorrelation transform itself before using JPEG2000.

JPEG 2000 supports complex component transformations. The question is whether it is worth involving that machinery at this time. Is there a latent or existing use case for files with, say, multiple R channels within a part?

@peterhillman
Contributor

Would supporting custom mappings at write time require a change to the API? What kind of mappings do you have in mind?

Probably an API extension, yes, ideally one that's more general so it can be used across different codecs.

JPEG 2000 supports complex component transformations. The questions is whether it is worth involving that machinery at this time. Is there a latent or existing use case for files with, say, multiple R channels within a part?

Future work, I think, and probably a new compression method as far as OpenEXR is concerned.

Storing many channels within an OpenEXR file is relatively common, and those files get very large. Making compression work well in that case would be really beneficial. An example of multiple R channels is a render where different shader components (specular, diffuse, emission) or different CG lights are split out to allow interactive adjustments later without rerendering. The Beachball example sequence is a 3D stereo image and has separate RGB layers for left and right. It stores a single layer in each part. That is a common strategy, as it reduces the amount of data that needs to be loaded and decompressed before accessing a subset of all channels. However, if a codec is very efficient at compressing multiple RGB layers together, it would make sense to store all channels in a single part to reduce the total file size.

@cary-ilm
Member

cary-ilm commented Feb 3, 2025

@palemieux, well that's unfortunate for the grammar sticklers; even the Library of Congress document is a mix of "High Throughput", "High-Throughput" and "High-throughput". I do see the ISO document has "High-Throughput" in the title, so I'm fine with that. I removed the original suggestions but left a few where things should be updated.

@palemieux
Author

palemieux commented Feb 3, 2025

@peterhillman et al. Should the reordering currently performed automatically by the compressor be removed, and the application instead asked to provide the channels in RGB order if it wants decorrelation to be performed by the compressor? That would avoid baking the channel-reordering algorithm into the compressor.

palemieux and others added 3 commits February 3, 2025 09:18
Co-authored-by: Cary Phillips <[email protected]>
Signed-off-by: Pierre-Anthony Lemieux <[email protected]>
Co-authored-by: Cary Phillips <[email protected]>
Signed-off-by: Pierre-Anthony Lemieux <[email protected]>
Co-authored-by: Cary Phillips <[email protected]>
Signed-off-by: Pierre-Anthony Lemieux <[email protected]>
@peterhillman
Contributor

The C++ API doesn't track the order in which you add channels to the FrameBuffer or the ChannelList attribute; it maintains them in alphabetical order. That's also the order in which they are presented to the compressor, so making the codec aware of the ordering would be tricky.
There's something to be said for making the API as codec-agnostic as possible, avoiding codec-specific API functions or special access patterns that change behavior. There's often ambiguity about whether such settings should be presented to the user or handled by the software internally. It would be much simpler if the advice could be "name your channels according to the spec, and the codec will do the right thing".
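To illustrate the ordering point, a stand-alone example (not from the PR):

// ChannelList keeps channels sorted by name regardless of insertion order,
// so this prints B, G, R.
#include <ImfChannelList.h>
#include <iostream>

int
main ()
{
    Imf::ChannelList channels;
    channels.insert ("R", Imf::Channel (Imf::HALF));
    channels.insert ("B", Imf::Channel (Imf::HALF));
    channels.insert ("G", Imf::Channel (Imf::HALF));

    for (Imf::ChannelList::ConstIterator i = channels.begin (); i != channels.end (); ++i)
        std::cout << i.name () << "\n";
    return 0;
}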

@palemieux
Author

@peterhillman Ok. Thanks for the background. I am tempted to include the channel map in the codestream... in case the algorithm/naming convention changes in the future.
