@@ -64,12 +64,12 @@ then write the metadata into the reassembly section along with the trailer
at the end. This allows a stream to be converted to a Super Columnar file
in a single pass.

- ::: tip note
+ {{< tip "Note" >}}
That said, the layout is
flexible enough that an implementation may optimize the data layout with
additional passes or by writing the output to multiple files then
merging them together (or even leaving the Super Columnar entity as separate files).
- :::
+ {{< /tip >}}
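
As a sketch of this single-pass flow, the Go fragment below appends segments to the data section as they arrive, then writes the reassembly metadata followed by a trailer recording where that metadata begins. All names and the trailer layout here are hypothetical simplifications, not the actual format.

```go
package columnar

import (
	"encoding/binary"
	"os"
)

// writeSinglePass streams a file out in one pass: data section first,
// reassembly metadata second, trailer last.
func writeSinglePass(path string, segments [][]byte, reassembly []byte) error {
	f, err := os.Create(path)
	if err != nil {
		return err
	}
	defer f.Close()

	// Data section: append each segment as it is produced,
	// tracking only the growing file offset.
	var off int64
	for _, seg := range segments {
		n, err := f.Write(seg)
		if err != nil {
			return err
		}
		off += int64(n)
	}

	// Reassembly section: metadata accumulated during the pass.
	if _, err := f.Write(reassembly); err != nil {
		return err
	}

	// Trailer: record where the reassembly section starts so a
	// reader can locate it by seeking from the end of the file.
	return binary.Write(f, binary.LittleEndian, off)
}
```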

### The Data Section

@@ -85,17 +85,17 @@ There is no information in the data section for how segments relate
to one another or how they are reconstructed into columns. They are just
blobs of Super Binary data.

- ::: tip note
+ {{< tip "Note" >}}
Unlike Parquet, there is no explicit arrangement of the column chunks into
row groups; rather, they are allowed to grow at different rates so a
high-volume column might comprise many segments while a low-volume
column might be just one or several. This allows scans of low-volume record types
(the "mice") to perform well amongst high-volume record types (the "elephants"),
i.e., there are not a bunch of seeks with tiny reads of mice data interspersed
throughout the elephants.
- :::
+ {{< /tip >}}

- ::: tip TBD
+ {{< tip "TBD" >}}
The mice/elephants model creates an interesting and challenging layout
problem. If you let the row indexes get too far apart (call this "skew"), then
you have to buffer very large amounts of data to keep the column data aligned.
@@ -109,15 +109,15 @@ if you use lots of buffering on ingest, you can write the mice in front of the
elephants so the read path requires less buffering to align columns. Or you can
do two passes where you store segments in separate files then merge them at close
according to an optimization plan.
- :::
+ {{< /tip >}}
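
To make the differing growth rates concrete, here is a minimal Go sketch of per-column buffering; the `segment` type, its field names, and the flush threshold are all illustrative rather than taken from the spec. Each column flushes only when its own buffer fills, so an elephant emits many segments while a mouse may emit just one.

```go
package columnar

// segment records the byte range of one flushed chunk of column data.
// The field names are illustrative, not the spec's.
type segment struct {
	offset, length int64
}

// columnWriter buffers one column's encoded values independently of
// all other columns.
type columnWriter struct {
	pending  []byte
	segments []segment
}

const flushThreshold = 512 << 10 // hypothetical 512 KiB segment size

// write buffers an encoded value and flushes a segment once this
// column alone has accumulated enough data.
func (c *columnWriter) write(v []byte, fileOffset *int64) {
	c.pending = append(c.pending, v...)
	if len(c.pending) >= flushThreshold {
		c.flush(fileOffset)
	}
}

// flush appends the pending bytes to the data section (elided here)
// and records the resulting byte range for the reassembly section.
func (c *columnWriter) flush(fileOffset *int64) {
	if len(c.pending) == 0 {
		return
	}
	c.segments = append(c.segments, segment{*fileOffset, int64(len(c.pending))})
	*fileOffset += int64(len(c.pending))
	c.pending = c.pending[:0]
}
```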

### The Reassembly Section

The reassembly section provides the information needed to reconstruct
column streams from segments, and in turn, to reconstruct the original values
from column streams, i.e., to map columns back to composite values.

- ::: tip note
+ {{< tip "Note" >}}
Of course, the reassembly section also provides the ability to extract just subsets of columns
to be read and searched efficiently without ever needing to reconstruct
the original rows. How well this performs is up to any particular
@@ -127,7 +127,7 @@ Also, the reassembly section is in general vastly smaller than the data section
so the goal here isn't to express information in cute and obscure compact forms
but rather to represent data in an easy-to-digest, programmer-friendly form that
leverages Super Binary.
- :::
+ {{< /tip >}}

The reassembly section is a Super Binary stream. Unlike Parquet,
which uses an externally described schema
@@ -147,9 +147,9 @@ A super type's integer position in this sequence defines its identifier
encoded in the [super column](#the-super-column). This identifier is called
the super ID.

- ::: tip note
+ {{< tip "Note" >}}
Change the first N values to type values instead of nulls?
- :::
+ {{< /tip >}}

The next N+1 records contain reassembly information for each of the N super types
where each record defines the column streams needed to reconstruct the original
@@ -171,11 +171,11 @@ type signature:
In the rest of this document, we will refer to this type as `<segmap>` for
shorthand and refer to the concept as a "segmap".

- ::: tip note
+ {{< tip "Note" >}}
We use the type name "segmap" to emphasize that this information represents
a set of byte ranges where data is stored and must be read from *rather than*
the data itself.
- :::
+ {{< /tip >}}
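
On the read side, a segmap alone is enough to materialize a column stream: fetch each byte range and concatenate. A minimal sketch, reusing the illustrative `segment` type from the earlier sketch and ignoring any per-segment compression:

```go
package columnar

import "io"

// readColumn materializes one column stream by fetching each byte
// range named in its segmap and concatenating the results in order.
func readColumn(r io.ReaderAt, segmap []segment) ([]byte, error) {
	var stream []byte
	for _, s := range segmap {
		buf := make([]byte, s.length)
		if _, err := r.ReadAt(buf, s.offset); err != nil {
			return nil, err
		}
		stream = append(stream, buf...)
	}
	return stream, nil
}
```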

#### The Super Column

@@ -216,11 +216,11 @@ This simple top-down arrangement, along with the definition of the other
column structures below, is all that is needed to reconstruct all of the
original data.

- ::: tip note
+ {{< tip "Note" >}}
Each row reassembly record has its own layout of columnar
values and no attempt is made to store like-typed columns from different
schemas in the same physical column.
- :::
+ {{< /tip >}}

The notation `<any_column>` refers to any instance of the five column types:
* [`<record_column>`](#record-column),
@@ -296,9 +296,9 @@ in the same column order implied by the union type, and
* `tags` is a column of `int32` values where each subsequent value encodes
the tag of the union type indicating which column the value falls within.

- ::: tip TBD
+ {{< tip "TBD" >}}
Change code to conform to columns array instead of record{c0,c1,...}
- :::
+ {{< /tip >}}

The number of times each value of `tags` appears must equal the number of values
in each respective column.
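
This invariant makes reassembly a simple merge, as in the following sketch, which assumes each per-tag column has already been decoded into a slice of values (strings stand in for arbitrary values):

```go
package columnar

// reassembleUnion interleaves the per-tag columns back into the
// original value order by walking the tags column. Because each tag
// appears exactly as many times as its column has values, every
// cursor lands exactly at the end of its column.
func reassembleUnion(tags []int32, columns [][]string) []string {
	next := make([]int, len(columns)) // per-column read cursor
	out := make([]string, 0, len(tags))
	for _, tag := range tags {
		out = append(out, columns[tag][next[tag]])
		next[tag]++
	}
	return out
}
```
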
@@ -350,14 +350,14 @@ data in the file,
it will typically fit comfortably in memory and it can be very fast to scan the
entire reassembly structure for any purpose.

- ::: tip Example
+ {{< tip "Example" >}}
For a given query, a "scan planner" could traverse all the
reassembly records to figure out which segments will be needed, then construct
an intelligent plan for reading the needed segments and attempt to read them
in mostly sequential order, which could serve as
an optimizing intermediary between any underlying storage API and the
Super Columnar decoding logic.
- :::
+ {{< /tip >}}
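
A sketch of that idea, again in terms of the illustrative `segment` type: sort the needed segments by file offset and coalesce abutting ranges, so the storage layer sees fewer, mostly sequential reads.

```go
package columnar

import "sort"

// planScan orders the segments a query needs by file offset and
// merges adjacent ranges, turning scattered column reads into a
// mostly sequential pass over the data section.
func planScan(needed []segment) []segment {
	sort.Slice(needed, func(i, j int) bool {
		return needed[i].offset < needed[j].offset
	})
	var plan []segment
	for _, s := range needed {
		if n := len(plan); n > 0 && plan[n-1].offset+plan[n-1].length == s.offset {
			plan[n-1].length += s.length // coalesce abutting ranges
			continue
		}
		plan = append(plan, s)
	}
	return plan
}
```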

To decode the "next" row, its schema index is read from the root reassembly
column stream.
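
In outline, the top-level decode loop might look like the following sketch, where each value read from the root column selects which schema's reader supplies the next row; the `rowReader` interface is hypothetical.

```go
package columnar

// rowReader reconstructs rows of one schema from its column streams.
// Its mechanics are elided; only the dispatch loop matters here.
type rowReader interface {
	Next() (any, error)
}

// decode replays the original row order: each value in the root
// column is a schema ID naming which per-schema reader holds the
// next row.
func decode(rootIDs []int32, readers []rowReader) ([]any, error) {
	rows := make([]any, 0, len(rootIDs))
	for _, id := range rootIDs {
		row, err := readers[id].Next()
		if err != nil {
			return nil, err
		}
		rows = append(rows, row)
	}
	return rows, nil
}
```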