@@ -7,8 +7,10 @@ This document describes the DwarFS file system format, version 2.5.
 ## FILE STRUCTURE
 
 A DwarFS file system image is just a sequence of blocks, optionally
-prefixed by a "header", which is typically some sort of shell script.
-Each block has the following format:
+prefixed by a "header", which is typically some sort of shell script
+or other executable that intends to use the "bundled" DwarFS image.
+
+Each block in the DwarFS image has the following format:
 
          ┌───┬───┬───┬───┬───┬───┬───┬───┐
     0x00 │'D'│'W'│'A'│'R'│'F'│'S'│MAJ│MIN│ MAJ=0x02, MIN=0x05 for v2.5
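
The first row of the diagram can be read back with a few lines of code. The following is a minimal Python sketch, not DwarFS code: `parse_magic_and_version` and `BlockVersion` are hypothetical names introduced here, and only offsets 0x00 to 0x07 (the magic plus the MAJ and MIN bytes) are covered, since the rest of the block header is not shown in this hunk.

```python
from typing import NamedTuple

MAGIC = b"DWARFS"  # bytes 0x00..0x05, as shown in the diagram


class BlockVersion(NamedTuple):
    """Hypothetical container for the MAJ/MIN bytes at offsets 0x06 and 0x07."""
    major: int
    minor: int


def parse_magic_and_version(buf: bytes) -> BlockVersion:
    """Validate the 6-byte magic and return the two version bytes after it."""
    if len(buf) < 8:
        raise ValueError("need at least 8 bytes to read magic and version")
    if buf[:6] != MAGIC:
        raise ValueError("magic mismatch: not the start of a DwarFS block")
    # Indexing a bytes object yields an int, so no unpacking step is needed.
    return BlockVersion(major=buf[6], minor=buf[7])
```

For a v2.5 image, `parse_magic_and_version(b"DWARFS\x02\x05")` yields `BlockVersion(major=2, minor=5)`.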
@@ -61,8 +63,9 @@ A couple of notes:
 
 - A minor version number change will be backwards compatible, i.e. an
   old program will refuse to read a file system with a minor version
-  larger than the one it supports. However, a new program will still
-  read all file systems with a smaller minor version number.
+  larger than the one it supports. However, a new program can still
+  read all file systems with a smaller minor version number, although
+  very old versions may at some point no longer be supported.
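
The minor-version rule in this note can be expressed as a small predicate. This is a hedged Python sketch rather than the actual reader logic: `can_read` and all of its parameters are hypothetical, the treatment of a differing major version as unreadable is an assumption (the major-version rule is not part of this hunk), and `oldest_supported_minor` only models the remark that very old versions may eventually be dropped.

```python
def can_read(image_major: int, image_minor: int,
             supported_major: int = 2, supported_minor: int = 5,
             oldest_supported_minor: int = 0) -> bool:
    """Return True if a reader built for supported_major.supported_minor
    may read an image whose blocks carry image_major.image_minor."""
    if image_major != supported_major:
        # Assumption: images with a different major version are never readable.
        return False
    # An old reader refuses newer minor versions; a new reader accepts older
    # ones, down to whatever lower bound it still chooses to support.
    return oldest_supported_minor <= image_minor <= supported_minor
```

With the defaults, a v2.5 reader accepts v2.3 and v2.5 images and rejects a v2.6 image.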
 
 ### Header Detection
 
@@ -81,21 +84,32 @@ without any problems.
 
 ### Section Types
 
-There are currently 4 different section types.
+Currently, the following different section types are defined:
 
 - `BLOCK` (0):
   A block of data. This is where all file data is stored. There can be
-  an arbitrary number of blocks of this type.
+  an arbitrary number of blocks of this type. The file data can only be
+  interpreted using the metadata blocks. The metadata contains a list
+  of chunks for each file, each of which references a small part of the
+  data in a single `BLOCK`.
 
 - `METADATA_V2_SCHEMA` (7):
-  The schema used to layout the `METADATA_V2` block contents. This is
-  stored in "compact" thrift encoding.
+  The [schema](https://github.com/facebook/fbthrift/blob/main/thrift/lib/thrift/frozen.thrift)
+  used to layout the `METADATA_V2` block contents. This is stored in
+  "compact" thrift encoding. The metadata cannot be read without the
+  schema, as it defines the exact bit widths used to store each metadata
+  field.
 
 - `METADATA_V2` (8):
   This section contains the bulk of the metadata. It's essentially just
   a collection of bit-packed arrays and structures. The exact layout of
   each list and structure depends on the actual data and is stored
-  separately in `METADATA_V2_SCHEMA`.
+  separately in `METADATA_V2_SCHEMA`. The metadata format is defined in
+  [metadata.thrift](../thrift/metadata.thrift) and the binary format that
+  derives from that definition uses
+  [Frozen2](https://github.com/facebook/fbthrift/blob/main/thrift/lib/cpp2/frozen/Frozen.h).
+  Frozen2 is not only extremely space efficient, it also allows accessing
+  huge data structures directly through memory-mapping.
 
 - `SECTION_INDEX` (9):
   The section index is, well, an index of all sections in the file
@@ -117,7 +131,19 @@ There are currently 4 different section types.
 - `HISTORY` (10):
   File system history information as defined `thrift/history.thrift`.
   This is stored in "compact" thrift encoding. Zero or more history
-  sections are supported.
+  sections are supported. This section type is purely informational
+  and not needed to read the DwarFS image.
+
+### Compression Algorithms
+
+DwarFS supports a wide range of block compression algorithms, some of
+which require additional metadata. The full list of supported algorithms
+is defined in [`dwarfs/compression.h`](../include/dwarfs/compression.h).
+
+For compression algorithms with metadata, the metadata is defined in
+[`thrift/compression.thrift`](../thrift/compression.thrift). The metadata
+is stored in "compact" thrift encoding at the beginning of the block, just
+after the header.
 
 ## METADATA FORMAT
 