Try TerarkDB With Full Features
rockeet edited this page Sep 1, 2018 · 20 revisions
This page assumes you have already compiled terark rocksdb and downloaded terark-zip-rocksdb; if not, see here...
This page is for:
- People who need more granular control over `SSTable` (requires code changes in your application).
- People who want to know more details about `TerarkZipTable` (our implementation of `SSTable`).
NOTE: If you just want to try TerarkDB without changing your application's code, please refer to Quick Start.
In this RocksDB fork, we add some extra options to use TerarkDB, e.g.:
Utility | TerarkDB option |
---|---|
ldb | --use_terarkdb=1 |
- User comparators are not supported; encode your keys so that their bytewise lexicographic order is the order you need.
- `EnvOptions::use_mmap_reads` must be `true`; it can be set via `DBOptions::allow_mmap_reads`.
- `table_factory` is a member of `ColumnFamilyOptions`, not `DBOptions`!
- If you call `rocksdb::DB::Open()` with column families, you must set `table_factory` for each `ColumnFamilyDescriptor`:
// Caution: When calling this `Open` overload, you must
// set column_families[i].options.table_factory
//
// You may pass a rocksdb::Options object as db_options, but this
// db_options.table_factory will NOT be used!!
//
static Status Open(const DBOptions& db_options, const std::string& name,
const std::vector<ColumnFamilyDescriptor>& column_families,
std::vector<ColumnFamilyHandle*>* handles, DB** dbptr);
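Because user comparators are not supported, any ordering you need has to be built into the key bytes themselves. The following is a minimal sketch of one common way to do that for integer keys. It assumes fixed-width big-endian encoding is acceptable for your data, and `EncodeU64Key` and `EncodeI64Key` are hypothetical helper names, not part of TerarkDB or RocksDB.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper (not a TerarkDB/RocksDB API): encode an unsigned
// 64-bit integer as 8 big-endian bytes, so that bytewise comparison of
// the encoded keys matches numeric comparison of the original values.
std::string EncodeU64Key(std::uint64_t v) {
    std::string key(8, '\0');
    for (int i = 7; i >= 0; --i) {
        key[i] = static_cast<char>(v & 0xFF);  // lowest byte goes last
        v >>= 8;
    }
    return key;
}

// Hypothetical helper: for signed integers, flip the sign bit first so
// that negative values sort before non-negative ones.
std::string EncodeI64Key(std::int64_t v) {
    return EncodeU64Key(static_cast<std::uint64_t>(v) ^ (1ull << 63));
}
```

With this encoding, `EncodeU64Key(255)` sorts before `EncodeU64Key(256)` bytewise, which would not hold for a little-endian or decimal-string encoding of the same values.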
- Compile flags
CXXFLAGS += -I/path/to/terark-zip-rocksdb/src
- Linker flags
LDFLAGS += -L/path/to/terark-zip-rocksdb-lib
LDFLAGS += -lterark-zip-rocksdb-r
LDFLAGS += -lterark-zbs-r -lterark-fsa-r -lterark-core-r
- C++ code
#include <climits> // for INT_MAX, used below
#include <table/terark_zip_table.h>
/// other includes...
///....
TerarkZipTableOptions opt;
/// TerarkZipTable needs to create temp files during compression
opt.localTempDir = "/path/to/some/temp/dir"; // default is "/tmp"
/// 0 : no checksum
/// 1 : checksum metadata and index; verified on file load
/// 2 : checksum all data, one checksum per record; not verified on file
///     load. This incurs 4 bytes of overhead for each record
/// 3 : checksum all data with a single checksum value (no per-record
///     checksum); if the checksum doesn't match, the load will fail
opt.checksumLevel = 3; // default 1
/// < 0    : only the last level uses terarkZip;
///          this is equivalent to terarkZipMinLevel == num_levels-1
/// others : use terarkZip when curlevel >= terarkZipMinLevel;
///          this includes two special cases:
///            == 0          : all levels use terarkZip
///            >= num_levels : all levels use the fallback TableFactory
/// It has been shown that setting terarkZipMinLevel = 0 is the best choice:
/// when mixed with rocksdb's native SSTs, those SSTs may use too much
/// memory & SSD space, which degrades performance
opt.terarkZipMinLevel = 0; // default
/// optional
opt.softZipWorkingMemLimit = 16ull << 30; // default
opt.hardZipWorkingMemLimit = 32ull << 30; // default
/// lets rocksdb's compaction algorithm know the estimated SST file size
opt.estimateCompressionRatio = 0.2;
/// ratio of the global dictionary size to the total size of all values
opt.sampleRatio = 0.03;
/// the other options are tricky, just use the defaults
/// rocksdb options when using terark-zip-rocksdb:
/// fallback can be NULL
auto fallback = NewBlockBasedTableFactory(); // or NewAdaptiveTableFactory();
auto factory = NewTerarkZipTableFactory(opt, fallback);
options.table_factory.reset(factory);
/// terark-zip uses mmap
options.allow_mmap_reads = true;
/// universal compaction reduces write amplification and is friendlier to
/// large SST files; terark SST works better with larger SST files.
/// Although universal compaction needs 2x SSD space in the worst case,
/// with terark-zip's high compression the SSD space used is still much
/// smaller than with rocksdb's block compression scheme
options.compaction_style = rocksdb::kCompactionStyleUniversal;
/// a larger MemTable yields larger level0 SST files;
/// larger SST files make terark-zip work better
options.write_buffer_size = 1ull << 30; // 1G
options.target_file_size_base = 1ull << 30; // 1G
/// a single SST file on higher levels should be larger:
/// filesize(level[n+1]) = filesize(level[n]) * target_file_size_multiplier
options.target_file_size_multiplier = 2; // can be larger, such as 3,5,10
/// turn off rocksdb write slowdown (optional). If write slowdown is enabled
/// and writes really are slowed down, you may suspect that terark-zip caused it
options.level0_slowdown_writes_trigger = INT_MAX;
options.level0_stop_writes_trigger = INT_MAX;
options.soft_pending_compaction_bytes_limit = 0;
options.hard_pending_compaction_bytes_limit = 0;
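To make the `terarkZipMinLevel` rule and the `target_file_size_multiplier` formula from the comments above more concrete, here is a small standalone sketch that works out, per level, which table format would be chosen and how large the target file size would be. `LevelPlan` and `PlanLevels` are hypothetical names for illustration only. The sketch applies the formula literally starting from level 0, which is a simplifying assumption; check RocksDB's own documentation for how each compaction style actually uses these options.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical per-level summary; not a RocksDB or TerarkDB type.
struct LevelPlan {
    bool uses_terark_zip;            // false means the fallback TableFactory
    std::uint64_t target_file_size;  // in bytes
};

// Sketch of the rules stated in the option comments:
//  - terarkZipMinLevel < 0 behaves like num_levels - 1 (last level only)
//  - otherwise terarkZip is used when level >= terarkZipMinLevel
//  - filesize(level[n+1]) = filesize(level[n]) * multiplier
std::vector<LevelPlan> PlanLevels(int num_levels, int terark_zip_min_level,
                                  std::uint64_t file_size_base,
                                  int file_size_multiplier) {
    const int min_level =
        terark_zip_min_level < 0 ? num_levels - 1 : terark_zip_min_level;
    std::vector<LevelPlan> plan;
    std::uint64_t size = file_size_base;
    for (int level = 0; level < num_levels; ++level) {
        plan.push_back({level >= min_level, size});
        size *= static_cast<std::uint64_t>(file_size_multiplier);
    }
    return plan;
}
```

For example, `PlanLevels(4, -1, 1ull << 30, 2)` marks only level 3 as using terarkZip and gives target sizes of 1G, 2G, 4G and 8G, while passing `0` as the second argument marks every level as using terarkZip.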