-
Notifications
You must be signed in to change notification settings - Fork 84
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feature: Add option to force compressed data to be byte aligned #49
base: master
Are you sure you want to change the base?
Conversation
2b03734
to
1596167
Compare
@philippeitis @danielrh can i get some feedback on this please? |
I'm not actually a maintainer, but my feedback is largely nit-picking code style. However, given that the codebase itself was machine-generated, I don't think that's a major issue. Otherwise, I think providing a link in the PR to relevant documentation of this feature in other brotli implementations (if it exists, otherwise saying that it's original would also be fine) would be helpful. Again, not a maintainer, so take my comments with a grain of salt. |
@@ -340,6 +341,17 @@ value: u32) -> i32 { | |||
params.favor_cpu_efficiency = value != 0; | |||
return 1i32; | |||
} | |||
if p as (i32) == BrotliEncoderParameter::BROTLI_PARAM_BYTE_ALIGN as (i32) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think if p == BrotliEncoderParameter::BROTLI_PARAM_BYTE_ALIGN
should compile on rust 1.12? (I know that the rest of the code in this function uses the other convention, but from what I understand, this is all machine generated, and not how you'd naturally write the code).
@@ -721,7 +740,9 @@ fn EnsureInitialized<Alloc: BrotliAlloc> | |||
if (*s).params.quality == 0i32 || (*s).params.quality == 1i32 { | |||
lgwin = brotli_max_int(lgwin, 18i32); | |||
} | |||
EncodeWindowBits(lgwin, s.params.large_window, &mut (*s).last_bytes_, &mut (*s).last_bytes_bits_); | |||
if !(*s).params.bare_stream { | |||
EncodeWindowBits(lgwin, s.params.large_window, &mut (*s).last_bytes_, &mut (*s).last_bytes_bits_); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't need to do &mut (*s).field
- &mut s.field
works identically and looks nicer (again, machine generated code which isn't very natural).
src/enc/encode.rs
Outdated
@@ -2014,7 +2035,13 @@ fn WriteMetaBlockInternal<Alloc: BrotliAlloc, | |||
false, | |||
cb); | |||
if actual_is_last != is_last { | |||
BrotliWriteEmptyLastMetaBlock(storage_ix, storage) | |||
// insert empty block for byte alignment if required | |||
if params.byte_align && ((*storage_ix & 7u32 as (usize)) != 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better as (storage_ix & 7) != 0
(machine generated code)
src/enc/encode.rs
Outdated
@@ -2154,7 +2181,13 @@ fn WriteMetaBlockInternal<Alloc: BrotliAlloc, | |||
cb); | |||
} | |||
if actual_is_last != is_last { | |||
BrotliWriteEmptyLastMetaBlock(storage_ix, storage) | |||
// insert empty block for byte alignment if required | |||
if params.byte_align && ((*storage_ix & 7u32 as (usize)) != 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better as (storage_ix & 7) != 0 (machine generated code)
Thanks for following up. I was basing my additions on the existing codebase, I didn't realize it was machine generated. I think it'd be mildly confusing to mix use of e.g. (*s).field and s.field, but not my repo, and I'm find doing it however will get merged. To the best of my knowledge, this feature doesn't exist in any other Brotli implementation, and I can edit the PR description to reflect that a bit later. I can expand a bit on the motivation as well. |
Hi! Thanks for doing this--sorry I've been away for a bit but I'm back now...I'd love to hear about the motivation of this patch and what problem it's trying to solve...is it taking the catable command one step further to allow vanilla cat instead of the special broccoli cat tools to combine files? |
@danielrh Yes, the idea is to produce substreams which can be assembled with the generic This could be used in a number of scenarios, but the two main example I had in mind:
|
What do we want to do about the the issue of changes matching existing code conventions vs changes being reasonably idiomatic rust code? My sensibilities swing towards keeping code style consistent, but it's your call. |
799f10c
to
df6a88e
Compare
Ok this is really cool! We have a scenario that is similar to https://dropbox.tech/infrastructure/-broccoli--syncing-faster-by-syncing-less where we our customers upload to a content-addressable-store. Create a header block: Then you can both
|
@ryancdotorg Have you done anything more with this idea? I brought it over to https://github.com/johnterickson/BrotliSharpLib/tree/bytealign |
@johnterickson No, haven't been working on it. All this PR needs is a rebase, tests and updated docs... |
FYI I rebased it here: johnterickson@a3c51e8 |
This adds
byte_align
andbare_stream
parameters to enable production of Brotli compressed blocks that can be used to construct a complete stream with trivial byte-wise concatonation.To my knowledge, this functionality does not currently exist in any other Brotli implementation.
These options are exposed via the command line tool as
--bytealign
and--bare
(bare mode enables byte alignment).Byte align mode inserts an empty metadata block before the final empty block if the compressed data block is not already byte aligned.
Bare mode additionally omits the final empty block, and, in catable mode, the stream header.
TODO
Notes to maintainers
This is the first time I've ever touched Rust code, so please forgive me if I've done something silly.