Block splitter #4136
Conversation
mmmh,
I suspect the problem is related to the macro; I'm not completely sure what the wanted behavior is here...
edit: confirmed that, when I change the default block size to anything other than 128 KB, it breaks this test.
Who knew adding a single source file could require so much work: currently blocked trying to get the single-file library builder to work, and then each and every build system also requires updating its own list of files, in its own format and location.
Weird stuff:
It only happens during compilation of one specific unit, and the failure seems to correspond to one specific location.
And of course, it happens all the time on GitHub CI, but not on any other system where I can test the same code and build rule.
Force-pushed from a4c653c to 904fa69 (compare)
All tests passed, ready for review
Reviewed everything except zstd_preSplit.[hc]
CC=clang CFLAGS="-O3 -m32" FUZZER_FLAGS="--long-tests" make asan-fuzztest

# The following test seems to have issues on github CI specifically,
# it does not provide the `__mulodi4` instruction emulation
Does it have issues before this PR, or only after? If it is the latter, this might also have issues in the kernel, and we would need to do something similar to what we do with `ZSTD_div64()` (line 82 in b880f20):

#define ZSTD_div64(dividend, divisor) ((dividend) / (divisor))
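For illustration, a hedged sketch of what "something similar" might look like: a multiplication wrapper by analogy with `ZSTD_div64()`. The name `ZSTD_mul64` is hypothetical, not an existing zstd macro; a kernel build could override it with a helper that avoids the instrumented 64-bit multiply:

/* hypothetical analog of ZSTD_div64(): defaults to plain multiplication,
 * overridable by environments lacking the __mulodi4 emulation */
#define ZSTD_mul64(a, b) ((a) * (b))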
It looks like this might be a compiler bug.
Yes, the issue doesn't reproduce on my local Linux workstation.
I suspect a bug in the version of `ubsan` for `clang` currently deployed within GitHub CI.
It's likely that this issue will be fixed at some unspecified time in the future, but we can't wait for that: the CI needs to continue running. We'll re-enable this test when it can work properly on GitHub CI again.
Also, temporarily dropping the `ubsan` test for `clang` on 32-bit x86 targets seems like a minor inconvenience, given that:
- we mostly care about 64-bit `x64`
- `ubsan` 32-bit is still running in CI with `gcc`
- we keep the more important `asan` test for 32-bit on `clang`
lib/common/compiler.h
Outdated
@@ -278,6 +278,8 @@
 * Alignment check
 *****************************************************************/

+#define ZSTD_IS_POWER_2(a) (((a) & ((a)-1)) == 0)
nit: We don't need this for compile-time constants like array bounds; could we make this a function `ZSTD_isPow2()`?
done
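A minimal sketch of the requested helper (the exact name and placement in the merged code may differ):

/* function form of the power-of-2 check; same result as the macro,
 * including for 0 (returns 1) */
MEM_STATIC int ZSTD_isPow2(size_t v) { return (v & (v - 1)) == 0; }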
lib/compress/zstd_compress.c
Outdated
 * heuristic, tested as being "generally better".
 * do not split incompressible data though: respect the 3 bytes per block overhead limit.
 */
return savings ? 92 KB : 128 KB;
This looks like it is splitting if the savings is non-zero, so it will split if the savings is negative. Is that correct? If so, why do we want to do that?
Nope, `savings` will only be `> 0` if there were some observed savings (aka, a smaller `cSize` than `srcSize`) from previous blocks.
It also means that, for the first block, it will never split.
But this currently will also split if the savings is negative, which doesn't seem like the desired behavior. Is this what you intended?
Suggested change:
- return savings ? 92 KB : 128 KB;
+ return savings > 0 ? 92 KB : 128 KB;
Additionally, we need to compute how much overhead this can possibly add, and ensure that this block split can never grow us beyond the compress bound. E.g.
Suggested change:
- return savings ? 92 KB : 128 KB;
+ return savings >= 3 ? 92 KB : 128 KB;
done
> But this currently will also split if the savings is negative, which doesn't seem like the desired behavior. Is this what you intended?

That's a very good point.
In my mind, `savings` was boxed to remain positive, which it is as long as it remains within `compress_frameChunk()`. But as soon as we leave that function and return (streaming scenario), `savings` is regenerated from actual bytes consumed and written, which can be negative if data was so far incompressible.
This is fixed in the newer version, thanks to the `savings >= 3` test.
Great catch @terrelln !
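As a hedged illustration of the streaming scenario described above (field names borrowed from `ZSTD_CCtx`'s progression counters; the actual regeneration code may differ):

/* in streaming mode, savings is recomputed from running totals,
 * so a history of incompressible data makes it negative */
S64 const savings = (S64)cctx->consumedSrcSize - (S64)cctx->producedCSize;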
lib/compress/zstd_compress.c
Outdated
/* note: conservatively only split full blocks (128 KB) currently,
 * and even then only if there is more than 128 KB input remaining.
 */
if (srcSize <= 128 KB || blockSizeMax < 128 KB)
Does this mean that the block splitter won't work in streaming mode? What if the user passes in exactly 128 KB at a time?
Indeed, if the user passes exactly 128 KB at a time, block splitting will not be triggered.
I don't think that is a good idea. In my model of the streaming API, how you chunk your data doesn't impact the compression, with the exception of the final block. This seems like it could introduce confusing behavior. Can we get rid of this restriction?
OK, I'll look into it.
All full blocks are now sent to the splitter function, irrespective of the presence of additional data beyond the full block.
lib/compress/zstd_compress.c
Outdated
@@ -4556,7 +4584,7 @@ static size_t ZSTD_compress_frameChunk(ZSTD_CCtx* cctx,
        }
    }   /* if (ZSTD_useTargetCBlockSize(&cctx->appliedParams))*/

    if (cSize < blockSize) savings += (blockSize - cSize);
Do we also need to subtract savings if we grow the block? If we aren't precisely tracking the `savings`, I can see cases where an adversarial pattern could cause compression to fail.
OK, we can add some malus when a block is not compressed.
This will be a conservative over-estimate, but that's not important; we just want to make sure there is no possible scenario that could expand data by more than 3 bytes per full block.
> OK, we can add some malus when a block is not compressed.

I don't like this solution because it is a patch that hides the problem, and I don't see how it guarantees that the issue can never occur. I'd prefer to precisely calculate the `savings`, so we can know exactly if we are allowed to split or not.
I've added a paragraph which explains why a 1-byte malus to `savings` for each incompressible block is enough to guarantee that the 3-bytes-per-128 KB expansion limit cannot be breached.
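A hedged sketch of the accounting described above (simplified; the merged code later switched to precise tracking, shown further down in this thread):

if (cSize < blockSize) {
    savings += (S64)(blockSize - cSize);  /* compressed block: add observed savings */
} else {
    savings -= 1;  /* incompressible block: 1-byte malus, conservatively bounding expansion */
}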
 * with alignment control
 * Note : should happen only once, at workspace first initialization
 */
MEM_STATIC void* ZSTD_cwksp_reserve_object_aligned(ZSTD_cwksp* ws, size_t byteSize, size_t alignment)
Why do we need alignment smaller than 64? It seems it might be simpler to just keep 64-byte alignment on all allocations.
I guess it could also be done this way.
As align-8 is sufficient, align-64 felt like overkill.
Agreed. Even if this is a desirable change (why?), this seems like it could be done in a different PR.
The new `TMP_WORKSPACE` requires an 8-byte alignment.
The original `ZSTD_cwksp_reserve_object()` that was used here aligns on `sizeof(void*)`, which was breaking tests on 32-bit systems.
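A minimal usage sketch of the new allocator, assuming the sizes and names mentioned in this thread:

/* reserve the temporary workspace with explicit 8-byte alignment,
 * instead of the sizeof(void*) alignment of ZSTD_cwksp_reserve_object() */
void* const tmpWs = ZSTD_cwksp_reserve_object_aligned(&cctx->workspace, TMP_WORKSPACE_SIZE, 8);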
lib/compress/zstd_preSplit.c
Outdated
typedef struct {
    int events[HASHTABLESIZE];
    S64 nbEvents;
} FingerPrint;
nit: Just one word
Suggested change:
- } FingerPrint;
+ } Fingerprint;
done
lib/compress/zstd_preSplit.c
Outdated
#define CHUNKSIZE (8 << 10)

/* Note: technically, we use CHUNKSIZE, so that's 8 KB */
size_t ZSTD_splitBlock_4k(const void* src, size_t srcSize,
nit: Can we pick a better name, since as noted we're splitting by 8 KB currently? Maybe `ZSTD_splitBlock_byFixedChunks()`?
Yes, this name is fixed in the next patch.
I went ahead and merged the next patch (`sample5`) into this feature branch, so that it also updates the naming convention in the process.
lib/compress/zstd_preSplit.c
Outdated
initStats(fpstats);
recordFingerprint(&fpstats->pastEvents, p, CHUNKSIZE);
for (pos = CHUNKSIZE; pos < blockSizeMax; pos += CHUNKSIZE) {
    assert(pos <= blockSizeMax - CHUNKSIZE);
I know this holds because `blockSizeMax == (128 << 10)`, but can we fix the looping logic to guarantee this instead?
It would be cleaner to have this function support any `blockSizeMax`. Given that the check in the outer loop isn't super performance critical, I'd prefer clarity and explicit checks rather than assumptions.
done
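A sketch of what the fixed looping logic could look like (assumed shape; the merged code may differ): bound the loop so that a full CHUNKSIZE always remains, making the assert hold for any blockSizeMax:

for (pos = CHUNKSIZE; pos + CHUNKSIZE <= blockSizeMax; pos += CHUNKSIZE) {
    assert(pos <= blockSizeMax - CHUNKSIZE);  /* now guaranteed by the loop condition */
    recordFingerprint(&fpstats->newEvents, p + pos, CHUNKSIZE);
    /* ... split decision on fpstats ... */
}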
lib/compress/zstd_preSplit.c
Outdated
} else {
    mergeEvents(&fpstats->pastEvents, &fpstats->newEvents);
    ZSTD_memset(&fpstats->newEvents, 0, sizeof(fpstats->newEvents));
    penalty = penalty - 1 + (penalty == 0);
nit: Let's rewrite for clarity
Suggested change:
- penalty = penalty - 1 + (penalty == 0);
+ if (penalty > 0) {
+     --penalty;
+ }
lib/compress/zstd_cwksp.h
Outdated
@@ -272,7 +276,7 @@ MEM_STATIC size_t ZSTD_cwksp_bytes_to_align_ptr(void* ptr, const size_t alignByt
 * which we can allocate from the end of the workspace.
 */
MEM_STATIC void* ZSTD_cwksp_initialAllocStart(ZSTD_cwksp* ws) {
-    return (void*)((size_t)ws->workspaceEnd & ~(ZSTD_CWKSP_ALIGNMENT_BYTES-1));
+    return (void*)((size_t)ws->workspaceEnd & (size_t)~(ZSTD_CWKSP_ALIGNMENT_BYTES-1));
I don't think this cast is actually correct. You need to cast inside the not-expression, if your goal is to ensure that the not doesn't happen on a narrower type that then gets zero-extended during promotion to `size_t`.
Also, should we be casting to `intptr_t`, not `size_t`?
> Also, should we be casting to `intptr_t`, not `size_t`?

Unfortunately, `intptr_t` is not guaranteed to exist, while `size_t` is.
Also, since this is just for alignment checks, we only care about the last bits (realistically, the last 6 bits at most, for cache-line alignment). So we could even use a smaller type, but `size_t` is good enough already.
rewrote `ZSTD_cwksp_initialAllocStart()` for clarity
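For illustration, a sketch of one unambiguous form (not necessarily the exact merged rewrite): computing the complement directly in `size_t` removes any dependence on the width or signedness of the intermediate type:

/* if ZSTD_CWKSP_ALIGNMENT_BYTES-1 is an int, ~ operates on int;
 * casting the operand first makes the complement happen in size_t,
 * so all high bits of the mask are guaranteed to be set */
size_t const mask = ~(size_t)(ZSTD_CWKSP_ALIGNMENT_BYTES - 1);
return (void*)((size_t)ws->workspaceEnd & mask);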
lib/compress/zstd_compress.c
Outdated
@@ -137,11 +137,12 @@ ZSTD_CCtx* ZSTD_initStaticCCtx(void* workspace, size_t workspaceSize)
    ZSTD_cwksp_move(&cctx->workspace, &ws);
    cctx->staticSize = workspaceSize;

-   /* statically sized space. entropyWorkspace never moves (but prev/next block swap places) */
+   /* statically sized space. tmpWorkspace never moves (but prev/next block swap places) */
    if (!ZSTD_cwksp_check_available(&cctx->workspace, ENTROPY_WORKSPACE_SIZE + 2 * sizeof(ZSTD_compressedBlockState_t))) return NULL;
Don't we basically have to replace every usage of `ENTROPY_WORKSPACE_SIZE` with `TMP_WORKSPACE_SIZE`? Including here?
Not every instance, since in some cases it's really "just" about entropy, but quite possibly here. This might be a leftover.
edit: yes, it was a leftover, to be fixed.
Great catch @felixhandte !
 * with alignment control
 * Note : should happen only once, at workspace first initialization
 */
MEM_STATIC void* ZSTD_cwksp_reserve_object_aligned(ZSTD_cwksp* ws, size_t byteSize, size_t alignment)
Agreed. Even if this is a desirable change (why?), this seems like it could be done in a different PR.
Yeah, I think you should be able to avoid all of these cwksp changes if you just switch the entropy workspace allocation from being in the objects group (and allocation phase) to being in the aligned group.
This should just mean moving the allocation down into `ZSTD_reset_matchState()`, where we do the other aligned allocations, as well as changing (zstd/lib/compress/zstd_compress.c, line 1709 in b880f20):

size_t const entropySpace = ZSTD_cwksp_alloc_size(ENTROPY_WORKSPACE_SIZE);

to use `ZSTD_cwksp_aligned_alloc_size()`.
The `ZSTD_SLIPBLOCK_WORKSPACESIZE` is 8208, which is not an even multiple of 64. So it does mean we'll grow the workspace to a multiple of 64 and throw away 48 bytes. We could look at improving this in a follow-up PR.
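Concretely, the suggested one-line change would look something like this (a sketch based on the quoted line):

/* account for the entropy/tmp workspace with aligned-group sizing,
 * so it can be allocated alongside the other aligned allocations */
size_t const entropySpace = ZSTD_cwksp_aligned_alloc_size(ENTROPY_WORKSPACE_SIZE);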
Addressed the last comment by removing the need for 64-bit members in the workspace structure, so that it no longer needs 8-byte alignment. Note that signed 64-bit multiplications are still needed for the distance function, so this doesn't address the problem of the bug in the GitHub CI `ubsan`.
All comments addressed
Commit notes:
- instead of ingesting only full blocks, make an analysis of the data, and infer where to split.
- for better portability on the Linux kernel
- though I really wonder if this is a property worth maintaining.
- for non-64-bit systems
- let's fill the initial stats directly into the target fingerprint
- that samples 1 in 5 positions. This variant is fast enough for lazy2 and btlazy2, but it's less good in combination with the post-splitter at higher levels (>= btopt).
- reported by @terrelln
- …loss ensure data can never be expanded by more than 3 bytes per full block.
- suggested by @terrelln
- suggested by @terrelln
- following a discussion with @felixhandte
- detected by @felixhandte
- due to integration of `sample5` strategy, leading to better compression ratios on a range of levels
- strict C90 compliance test
- so that it can be stored using the standard alignment requirement (sizeof(void*)). The distance function still requires 64-bit signed multiplication though, so it won't change the issue regarding the bug in ubsan for clang 32-bit on github ci.
The logic in `ZSTD_compress_frameChunk()` doesn't look like it guarantees that it respects `ZSTD_compressBound()`. It looks like it is very likely to respect the bound, but I'm positive that the fuzzers will eventually find a case where it doesn't.
this helps make the streaming behavior more consistent, since it no longer depends on having more data presented on the input. Suggested by @terrelln.
I updated the blind-split strategy with the recommended updates from the code review.
That being said, the blind-split strategy was proposed initially as a way to leverage "something" from the capability to ingest partial blocks at lower levels, for which the initial split heuristic was not fast enough. Now that the execution plan seems able to offer block splitting all the way down to level 1, the blind-split strategy is less important, essentially reserved to negative compression levels.
Issue is, for negative compression levels, the blind-split strategy isn't effective. This is likely because negative compression levels drop the huffman compression of literals, which is where it had the most impact.
So, if it's not useful anywhere, it's debatable if this split strategy deserves to stay in the code. An alternative could be to just drop it.
instead of just for blind split. This is in anticipation of adversarial input that would intentionally target the sampling pattern of the split detector. Note that, even without this protection, splitting can never expand beyond ZSTD_COMPRESSBOUND(), because this upper limit uses a 1 KB block size worst-case scenario, and splitting never creates blocks that small. The protection is more to ensure that data is not expanded by more than 3 bytes per 128 KB full block, which is a much stricter limit.
the first block is no longer split since adding the `savings` over-split protection
I updated the policy around `savings`, applying the over-split protection in all cases instead of just for blind split. This is in anticipation of some adversarial input that would take advantage of the sampling pattern of faster split detectors. Note that, even without this protection, the `ZSTD_COMPRESSBOUND()` limit cannot be breached. The problem is more about the stricter limit of 3 bytes of expansion per 128 KB full block.
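As a rough worked illustration of that stricter limit (a hypothetical helper, using only the numbers from this thread):

/* upper bound on growth from splitting: at most 3 bytes of block-header
 * overhead per full 128 KB block of input */
static size_t maxSplitGrowth(size_t srcSize)
{
    size_t const nbFullBlocks = (srcSize + (128 << 10) - 1) / (128 << 10);
    return 3 * nbFullBlocks;
}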
 * otherwise only full blocks are used.
 * But being conservative is fine,
 * since splitting barely compressible blocks is not fruitful anyway */
savings += (S64)blockSize - (S64)cSize;
Thanks for fixing this! IMO this method is also clearer, because it explicitly tracks exactly the savings from compression.
Instead of ingesting full blocks only (128 KB), make an a-priori analysis of the data, and infer a position to split a block at a more appropriate boundary. This can notably happen in an archive scenario, at the boundary between 2 files of different nature within the archive. This leads to some non-trivial compression gains, for a correspondingly acceptable speed cost.
The benefit is higher when there isn't already a post-splitter (which exists at `btopt` levels and above (16+)), but even when a post-splitter is active, there is still some small compression ratio benefit, making this strategy desirable even for higher compression modes.
However, this input analysis is not free. The initial variant is currently reserved for higher compression strategies (>= `btopt`), where it combines well with the post-splitter operating on known sequences. A second, faster variant is applied to the middle strategies `btlazy2` and `lazy2`, where it improves compressed size by ~300 KB. It achieves its faster speed by sampling the input, instead of scanning it sequentially. For faster modes, the analysis is skipped and replaced by a static split size (no longer limited to 128 KB only). Through tests, it appears that a static 92 KB block size brings a higher compression ratio, at a small-to-negligible compression speed loss (mostly due to an increased number of blocks, hence of block headers).
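To make the a-priori analysis concrete, here is a simplified, hedged sketch of the chunk-fingerprint idea (illustrative only; the actual `zstd_preSplit.c` differs in hashing, sampling, and thresholds — note the signed 64-bit products, matching the distance-function remark above):

#include <stddef.h>

#define CHUNKSIZE (8 << 10)

typedef struct { unsigned events[256]; unsigned nbEvents; } Fingerprint;

/* histogram the byte values of one chunk */
static void recordFp(Fingerprint* fp, const unsigned char* p, size_t size)
{
    size_t n;
    for (n = 0; n < size; n++) fp->events[p[n]]++;
    fp->nbEvents += (unsigned)size;
}

/* cross-normalized L1 distance between two histograms:
 * a large value suggests a change of data nature => candidate split point */
static unsigned long long fpDistance(const Fingerprint* a, const Fingerprint* b)
{
    unsigned long long total = 0;
    int i;
    for (i = 0; i < 256; i++) {
        long long const d = (long long)a->events[i] * b->nbEvents
                          - (long long)b->events[i] * a->nbEvents;
        total += (unsigned long long)(d < 0 ? -d : d);
    }
    return total;
}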
Benchmarks focusing on compression savings were run on `silesia.tar` and `calgary.tar`, comparing `dev` against this PR.
Follow-up: make the `splitting` strategy selectable via a compression parameter.