Preemptively close chunks before big messages to isolate them #1291

mikael-s-persson · 2024-12-12T22:39:19Z

Changelog

Close chunks when big messages are written to isolate them, leading to much better read performance when going over small-message channels since those isolated chunks can be skipped over entirely.

Docs

None

Description

A very common situation is to have a few topics with very large messages (images, point clouds, etc.) and many topics with very small (and frequent) messages (state, transforms, statuses, etc.). When writing all those into a single ("merged") mcap file, it naturally leads to chunks typically containing a series of small messages and ending with one big message which fills the rest of the chunk and "closes" it. The result of this is that nearly all chunks are "mixed" (i.e., messages from all channels appear in each chunk). During playback / reading, some of the client code (e.g., C++) implement logic to skip over chunks that don't have any message of interest (topics being read). That logic is completely thwarted if most chunks are mixed in this pathological pattern.

So, this PR adds a very simple change. When an incoming message is very large (larger than the chunk size), instead of writing that message to the current chunk (mixing with prior smaller messages), it closes the current chunk and writes the big message to a fresh new chunk (which will be closed immediately after). The end result is that big-message channels get isolated into their own series of chunks, enabling the playback / readers to skip over them without loading those chunks to RAM or decompressing them.

In theory (but I have not seen this), compression could also perform better on the isolated channels, where it matters most. All I have seen (see below) is that incompressible data (such a protobuf wireformat) does not get uselessly compressed along with the larger "bytes" data (images, point-clouds, etc.). This does not affect the resulting file size but speeds up reading times. (BTW, it would be nice to be able to configure the hardcoded to 1KB minimum compression size in the writer)

The only slight drawback that I can see is getting some smaller / partial chunks. I guess there might be some rather unrealistic pathological cases (like interleaving every big and small message) where this could lead to a noticeable difference in file size (bigger chunk index). If you want a writer option to disable that behavior, that's fine, but I believe this should be the default.

I ran some tests on some simple mcap files with mixed channels, the result are seen below.

Before After

Running mcap-cli info on the recorded mcap file (2.7 GB):

library:   libmcap 1.4.0                                               
profile:                                                               
messages:  31870                                                       
duration:  3m50.316691925s                                             
start:     2024-08-08T13:03:44.098069201-04:00 (1723136624.098069201)  
end:       2024-08-08T13:07:34.414761126-04:00 (1723136854.414761126)  
compression:
    zstd: [6724/6725 chunks] [3.30 GiB/2.52 GiB (23.58%)] [11.21 MiB/sec] 
    : [1/6725 chunks] [516.00 B/516.00 B (0.00%)] [2.00 B/sec] 
channels:
    (1) [redacted_topic_0]           2121 msgs (9.21 Hz)    : foxglove.Grid [protobuf]             
    (2) [redacted_topic_1]           2121 msgs (9.21 Hz)    : [redacted] [protobuf]     
    (3) [redacted_topic_2]           2297 msgs (9.97 Hz)    : foxglove.PointCloud [protobuf]       
    (4) [redacted_topic_3]           2297 msgs (9.97 Hz)    : [redacted] [protobuf]     
    (5) [redacted_topic_4]          11517 msgs (50.01 Hz)   : foxglove.FrameTransforms [protobuf]  
    (6) [redacted_topic_5]          11517 msgs (50.01 Hz)   : [redacted] [protobuf]     
attachments: 0
metadata: 1

Running time mcap-cli cat --topic [redacted_topic_4] on the recorded mcap file (2.7 GB):

real  0m3.063s
user  0m4.179s
sys   0m0.662s

Running mcap-cli info on a recorded mcap file (2.7 GB):

library:   libmcap 1.4.0                                               
profile:                                                               
messages:  31870                                                       
duration:  3m50.316691925s                                             
start:     2024-08-08T13:03:44.098069201-04:00 (1723136624.098069201)  
end:       2024-08-08T13:07:34.414761126-04:00 (1723136854.414761126)  
compression:
    : [2121/8839 chunks] [276.02 KiB/276.02 KiB (0.00%)] [1.20 KiB/sec] 
    zstd: [6718/8839 chunks] [3.30 GiB/2.52 GiB (23.58%)] [11.21 MiB/sec] 
channels:
    (1) [redacted_topic_2]           2297 msgs (9.97 Hz)    : foxglove.PointCloud [protobuf]       
    (2) [redacted_topic_3]           2297 msgs (9.97 Hz)    : [redacted] [protobuf]     
    (3) [redacted_topic_4]          11517 msgs (50.01 Hz)   : foxglove.FrameTransforms [protobuf]  
    (4) [redacted_topic_5]          11517 msgs (50.01 Hz)   : [redacted] [protobuf]     
    (5) [redacted_topic_0]           2121 msgs (9.21 Hz)    : foxglove.Grid [protobuf]             
    (6) [redacted_topic_1]           2121 msgs (9.21 Hz)    : [redacted] [protobuf]     
attachments: 0
metadata: 1

Running time mcap-cli cat --topic [redacted_topic_4] on the recorded mcap file (2.7 GB):

real  0m0.070s
user  0m0.059s
sys   0m0.024s

james-rms · 2024-12-17T00:10:24Z

I think this makes sense and is a good improvement for most users. I'm inclined to believe it should be possible to disable this behavior with a writer option. I also think it changes the documented behavior of the chunkSize writer option, so that docstring should be updated. What do you think?

mikael-s-persson · 2024-12-17T02:04:48Z

Sounds good, I updated the comment for the chunkSize option and added a "noHugeMessageChunk" option to disable this behavior if needed.

jtbandes · 2024-12-17T03:09:54Z

This makes me wonder if we should instead close any chunk when writing a message if the current chunk size + new message size > chunkSize? That is, why check if a message itself is larger than a whole chunk, rather than just closing chunks when they are about to become larger chunkSize (rather than when they are already larger than chunkSize)?

jtbandes · 2024-12-17T03:10:28Z

FYI to @jhurliman in case you have any opinions on this change.

jhurliman · 2024-12-17T03:16:41Z

FYI to @jhurliman in case you have any opinions on this change.

This seems like good default behavior, I’m not sure it needs a configurable option?

EDIT: My personal inclination is toward @jtbandes idea. Bump the major version and change the behavior to chunk size being a (soft) ceiling instead of a floor

…iling, rather than soft floor

mikael-s-persson · 2024-12-18T17:13:41Z

My personal inclination is toward @jtbandes idea. Bump the major version and change the behavior to chunk size being a (soft) ceiling instead of a floor

Ok. I did that. I removed the new unit-test since it became identical to the existing one that I had to modify. I'll let you worry about bumping up the major version with your release process.

jtbandes

Initially I was worried that this might mean we end up writing Channel/Schema records into a different chunk from the Messages themselves. Upon reviewing the spec, it sounds like that is actually fine. Just calling this out in case anyone remembers differently :)

cpp/mcap/include/mcap/writer.inl

Preemptively close chunks before big messages to isolate them

9c256c8

mikael-s-persson requested review from jtbandes, james-rms and indirectlylit as code owners December 12, 2024 22:39

Fixed unit test, added a unit test

92953a3

Added option and doc updates to reflect this new behavior

cf5737b

mikael-s-persson added 2 commits December 18, 2024 11:33

Merge branch 'main' into feature/isolate_big_messages

1ff418a

Updated to @jtbandes suggested behavior of making chunkSize a soft ce…

98ef8ef

…iling, rather than soft floor

james-rms approved these changes Dec 18, 2024

View reviewed changes

jtbandes approved these changes Dec 19, 2024

View reviewed changes

cpp/mcap/include/mcap/writer.inl Outdated Show resolved Hide resolved

jtbandes requested a review from defunctzombie December 19, 2024 01:03

Fixed message size to actual record size and preamble

5c9e998

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Preemptively close chunks before big messages to isolate them #1291

Preemptively close chunks before big messages to isolate them #1291

mikael-s-persson commented Dec 12, 2024

james-rms commented Dec 17, 2024

mikael-s-persson commented Dec 17, 2024

jtbandes commented Dec 17, 2024

jtbandes commented Dec 17, 2024

jhurliman commented Dec 17, 2024 •

edited

Loading

mikael-s-persson commented Dec 18, 2024

jtbandes left a comment

Preemptively close chunks before big messages to isolate them #1291

Are you sure you want to change the base?

Preemptively close chunks before big messages to isolate them #1291

Conversation

mikael-s-persson commented Dec 12, 2024

Changelog

Docs

Description

james-rms commented Dec 17, 2024

mikael-s-persson commented Dec 17, 2024

jtbandes commented Dec 17, 2024

jtbandes commented Dec 17, 2024

jhurliman commented Dec 17, 2024 • edited Loading

mikael-s-persson commented Dec 18, 2024

jtbandes left a comment

Choose a reason for hiding this comment

jhurliman commented Dec 17, 2024 •

edited

Loading