You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Many of the existing tests use the FormatPage.hs reference implementation of the page binary format, the type PageLogical and its QC generator.
Prior to PR #195 the PageLogical representation was always valid, for all possible values of the representation (at least for the purpose of serialisation, which doesn't care about key sort order). That's because the reference implementation used an unbounded approach to serialisation, rather than insisting that the binary format fit within a fixed disk page size (like 4k).
In PR #195 we introduce fixed disk page sizes, to make the reference implementation and real implementation closer and thus give us an opportunity to simplify tests using the reference impl. A consequence is that PageLogical now represents a superset of valid pages, rather than exactly all valid pages, and so the encodePage now has to be partial.
Furthermore, the tests for the real implementation need (or could be simpler) with a few variations on key/value sequences.
Existing tests
We review what existing tests do, or could do, to help us decide what set of types and QC generators would help us simplify and/or improve the tests. We go in rough order from low to higher level.
RawPage
The RawPage module is for decoding.
Existing tests:
a range of unit tests
a range of focused property tests for each aspect of the page format, or part of the size distribution, comparing against the reference impl
Missing tests:
A general test against the reference impl for the whole page content
The property tests want a type that represents a valid page (with total serialisation) with its logical content. This would make toRawPage total.
PageAcc
This covers a single page, with many key/ops in it, but not the single page case with overflow.
Existing tests:
general prototype test which compares the reference and real impls for a sequence of key/ops, including checking the sequences end at the same point.
several specific instances of prototype providing guaranteed coverage (could perhaps be replaced by labels & coverage conditions)
The existing prototype comparison test could be simplified if we could turn it into a bulk comparison test. Would need a reference impl page construction that returns the suffix that didn't fit.
These tests want a sequence of key/ops that may or may not fit within a page. The generator should cover:
single page, all operations
more than will fit in a page
probably wants a range of sizes: small, near page size
These tests need to exclude single value pages that overflow. This is covered by PageAcc1.
PageAcc1
This covers a single page, with a single key/op in it, including the single page case with overflow.
Existing tests:
two property test covering equal serialised pages between real and reference impl for single key with inserts or upsert
instances of the two above for specific cases (could be replaced by labels and coverage conditions)
These tests want a generator for single key/operation pairs, wants a range of (single value) sizes: small, near page size, overflow, multiple overflow pages.
IndexCompact
The tests here use their own page summary types for test case generation and it's unclear that they would benefit from improved types or generators for pages.
RunAcc
This now covers a whole run, including the bloom filter and compact index.
Existing tests:
now-redundant test about page size == 4k
general test about page matches prototype
currently uses only nearly full pages but unclear why
generates keys that are >= 6 bytes, for compact index
can probably just use the full range of sizes: small, nealy full, overflow, multi-overflow
existing RunAcc tests are a bit of a mess after prior changes, they still talk about old PageAcc.
Missing tests:
Need to test multi-page RFP partitioning
Need general test for run consisting of multiple pages
These tests want a generator for key/op sequences that need not fit one page, and should cover multiple pages.
Need a RFP sequence partitioning, using reference impl within each partition. Reference impl does not need to be taught about RFP. But could provide a simple run construction using unfoldr + concat, in terms of a page construction returning the suffix.
Run
This is like RunAcc but now does file I/O. Much the same tests should still work at this level though.
Existing tests:
some unit tests
round trip write then read, for real impl with itself
property comparing write and open, for real impl with itself
Missing tests:
comparison of serialised run against reference impl
comparison of key/ops decoded from real run, vs original key/ops serialised by reference impl
The extra tests should mirror the RunAcc ones and so need the same types and generators.
Proposal
key/op pair sequences corresponding to a valid page (thus fitting)
key/op pair sequences that may or may not fit into a page
single key/op pair for the single-value page case
key/op pair sequences corresponding to runs of multiple pages (with keys >= 6 bytes)
What would each test use:
RawPage would use 1
PageAcc would use 2
PageAcc1 would use 3
RunAcc and Run would use 4
Extend reference impl to provide a construction function that returns the unconsumed suffix.
The text was updated successfully, but these errors were encountered:
The tests here use their own page summary types for test case generation and it's unclear that they would benefit from improved types or generators for pages.
RunAcc
This now covers a whole run, including the bloom filter and compact index.
Existing tests:
* now-redundant test about page size == 4k
* general test about page matches prototype
* currently uses only nearly full pages but unclear why
I did some digging in the git history, and I changed the generator at some point from using arbitrary :: Gen PageLogical to genFullPage, I think because the latter allows me to control how to generate keys (with >= 6 bytes), and the former doesn't. But, they don't have to be nearly full for the tests to work, I think, so that could be changed
* generates keys that are >= 6 bytes, for compact index
* can probably just use the full range of sizes: small, nealy full, overflow, multi-overflow
* existing RunAcc tests are a bit of a mess after prior changes, they still talk about old PageAcc.
Missing tests:
* Need to test multi-page RFP partitioning
* Need general test for run consisting of multiple pages
These tests want a generator for key/op sequences that need not fit one page, and should cover multiple pages.
Need a RFP sequence partitioning, using reference impl within each partition. Reference impl does not need to be taught about RFP. But could provide a simple run construction using unfoldr + concat, in terms of a page construction returning the suffix.
If we have this generator for key/op sequences that corresponding to runs of multiple pages, and if knows about RFP and contains sorted keys, then maybe we could use that instead of the page summary generator that we use for the compact index tests
Many of the existing tests use the
FormatPage.hs
reference implementation of the page binary format, the typePageLogical
and its QC generator.Prior to PR #195 the
PageLogical
representation was always valid, for all possible values of the representation (at least for the purpose of serialisation, which doesn't care about key sort order). That's because the reference implementation used an unbounded approach to serialisation, rather than insisting that the binary format fit within a fixed disk page size (like 4k).In PR #195 we introduce fixed disk page sizes, to make the reference implementation and real implementation closer and thus give us an opportunity to simplify tests using the reference impl. A consequence is that
PageLogical
now represents a superset of valid pages, rather than exactly all valid pages, and so theencodePage
now has to be partial.Furthermore, the tests for the real implementation need (or could be simpler) with a few variations on key/value sequences.
Existing tests
We review what existing tests do, or could do, to help us decide what set of types and QC generators would help us simplify and/or improve the tests. We go in rough order from low to higher level.
RawPage
The
RawPage
module is for decoding.Existing tests:
Missing tests:
The property tests want a type that represents a valid page (with total serialisation) with its logical content. This would make
toRawPage
total.PageAcc
This covers a single page, with many key/ops in it, but not the single page case with overflow.
Existing tests:
prototype
test which compares the reference and real impls for a sequence of key/ops, including checking the sequences end at the same point.prototype
providing guaranteed coverage (could perhaps be replaced by labels & coverage conditions)The existing
prototype
comparison test could be simplified if we could turn it into a bulk comparison test. Would need a reference impl page construction that returns the suffix that didn't fit.These tests want a sequence of key/ops that may or may not fit within a page. The generator should cover:
These tests need to exclude single value pages that overflow. This is covered by PageAcc1.
PageAcc1
This covers a single page, with a single key/op in it, including the single page case with overflow.
Existing tests:
These tests want a generator for single key/operation pairs, wants a range of (single value) sizes: small, near page size, overflow, multiple overflow pages.
IndexCompact
The tests here use their own page summary types for test case generation and it's unclear that they would benefit from improved types or generators for pages.
RunAcc
This now covers a whole run, including the bloom filter and compact index.
Existing tests:
Missing tests:
These tests want a generator for key/op sequences that need not fit one page, and should cover multiple pages.
Need a RFP sequence partitioning, using reference impl within each partition. Reference impl does not need to be taught about RFP. But could provide a simple run construction using unfoldr + concat, in terms of a page construction returning the suffix.
Run
This is like RunAcc but now does file I/O. Much the same tests should still work at this level though.
Existing tests:
Missing tests:
The extra tests should mirror the RunAcc ones and so need the same types and generators.
Proposal
What would each test use:
Extend reference impl to provide a construction function that returns the unconsumed suffix.
The text was updated successfully, but these errors were encountered: