Tr13 is a library for constructing and using read-only compact (memory-efficient) in-memory trie data structures, using raw (byte-sequences, byte[]
) keys and values. Raw values can be automatically
converted to/from basic JDK types such as java.lang.String
. Resulting trie
s are "raw", in that they are stored as raw byte sequences (ByteBuffer
, off-heap or in-heap, or byte[]
), meaning that garbage-collection overhead should be minimal as the main data area is either off-heap (for native or mmap'ed ByteBuffer
s), or a single byte[]
.
Main benefits for using such tries is both relatively low memory usage (typically 20 - 35% less than raw input size) and low GC impact.
Mailing list: http://groups.google.com/group/ning-tr13-users
Tries need to be built in lexicographic order, so pre-sorting may be needed. For that, you may want to check out Java Merge-sort project. Key and value types
Building is done in two steps:
- Construct raw trie structure in streaming fashion, either into a memory buffer (like
ByteArrayOutputStream
), or a File. - Construct actual
trie
from serialized raw sequence (File
,byte[]
).
Two-phase processing is needed since the actual result size is not known in advance.
This tr13 from LevelDB gist shows simple usage.
Check out [https://github.com/ning/tr13/wiki]