Semidbm is a fast, pure Python implementation of a dbm, which is a persistent key-value store. It allows you to get and set keys through a dict interface:
    import semidbm

    db = semidbm.open('testdb', 'c')  # 'c' creates the db if it does not exist
    db['foo'] = 'bar'
    print(db['foo'])
    db.close()
These values are persisted to disk, and you can later retrieve these key/value pairs:
    # Then at a later time:
    db = semidbm.open('testdb', 'r')
    # prints "bar"
    print(db['foo'])
It was written with these things in mind:
- Pure Python, supporting Python 2.6, 2.7, 3.3, and 3.4.
- Cross-platform: works on Windows, Linux, and Mac OS X.
- Supports CPython, PyPy, and Jython (versions 2.7-b3 and higher).
- Simple and fast (see Benchmarking Semidbm).
Read the semidbm docs for more information and how to use semidbm.
Semidbm started as an improvement over the dumbdbm library in the Python standard library. Below is a list of some of the improvements over dumbdbm.
Instead of an index file and a data file, the index and data have been consolidated into a single file. This single data file is only ever appended to; data already written to the file is never modified.
Semidbm uses an append-only file format. This has the potential to grow to large sizes as space is never reclaimed. Semidbm addresses this by adding a compact() method that will rewrite the data file to a minimal size.
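To make the append-only idea concrete, here is a minimal sketch using only the standard library. It is not semidbm's actual file format or code: the `AppendOnlyStore` class, its record layout, and the `.compact` temporary file name are all assumptions made up for illustration. It shows why repeated updates grow the file, why the index of keys lives in memory, and how a compaction pass can rewrite the file with only the latest value per key.

```python
import os
import struct
import tempfile

# Hypothetical record layout for this sketch only (not semidbm's format):
# 4-byte key length, 4-byte value length, key bytes, value bytes.
HEADER = struct.Struct("<II")


class AppendOnlyStore:
    def __init__(self, path):
        self.path = path
        self._open()

    def _open(self):
        # Rebuild the in-memory index by scanning every record in the file.
        # Later records for the same key overwrite earlier index entries.
        self.index = {}  # key -> (value offset, value size)
        open(self.path, "ab").close()  # create the file if it is missing
        self._file = open(self.path, "r+b")
        while True:
            header = self._file.read(HEADER.size)
            if len(header) < HEADER.size:
                break
            key_size, value_size = HEADER.unpack(header)
            key = self._file.read(key_size)
            self.index[key] = (self._file.tell(), value_size)
            self._file.seek(value_size, os.SEEK_CUR)

    def __setitem__(self, key, value):
        # Always append; bytes already on disk are never modified.
        self._file.seek(0, os.SEEK_END)
        self._file.write(HEADER.pack(len(key), len(value)) + key)
        self.index[key] = (self._file.tell(), len(value))
        self._file.write(value)
        self._file.flush()

    def __getitem__(self, key):
        offset, size = self.index[key]
        self._file.seek(offset)
        return self._file.read(size)

    def close(self):
        self._file.close()

    def compact(self):
        # Copy only the latest value of each key into a fresh file,
        # then swap it in place of the old one.
        tmp_path = self.path + ".compact"
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        fresh = AppendOnlyStore(tmp_path)
        for key in list(self.index):
            fresh[key] = self[key]
        fresh.close()
        self.close()
        os.replace(tmp_path, self.path)
        self._open()


# Overwrite one key many times, then compact.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.db")
store = AppendOnlyStore(demo_path)
for i in range(100):
    store[b"foo"] = ("bar%d" % i).encode("ascii")
size_before = os.path.getsize(demo_path)
store.compact()
size_after = os.path.getsize(demo_path)
latest = store[b"foo"]
store.close()
```

In this sketch `size_after` is smaller than `size_before` because the 99 stale records for `foo` are dropped. With semidbm itself, the equivalent step is calling the `compact()` method described above on an open db.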
Semidbm is significantly faster than dumbdbm (keep in mind both are pure Python libraries) in just about every way. The documentation shows the results of semidbm vs. other dbms, along with how to run the benchmarking script yourself.
Semidbm has the following limitations:
- Not thread safe; can't be accessed by multiple processes.
- The entire index must fit in memory. This essentially means that all of the keys must fit in memory.
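Given the thread-safety limitation, one common workaround within a single process is to serialize all access through a lock. The sketch below is an assumption about how you might do that, not something semidbm provides: `LockedStore` is a hypothetical wrapper, and a plain dict stands in for the db so the example runs without any dependencies. It does nothing for the multi-process limitation.

```python
import threading


class LockedStore:
    """Hypothetical wrapper that serializes access to a dict-like store."""

    def __init__(self, store):
        self._store = store
        self._lock = threading.Lock()

    def __getitem__(self, key):
        with self._lock:
            return self._store[key]

    def __setitem__(self, key, value):
        with self._lock:
            self._store[key] = value

    def __delitem__(self, key):
        with self._lock:
            del self._store[key]

    def __contains__(self, key):
        with self._lock:
            return key in self._store


# A plain dict stands in for an open db in this demo.
shared = LockedStore({})


def writer(worker_id):
    # Each worker writes 100 keys that no other worker touches.
    for i in range(100):
        shared["w%d-k%d" % (worker_id, i)] = "value"


threads = [threading.Thread(target=writer, args=(n,)) for n in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

missing = [
    (n, i) for n in range(4) for i in range(100)
    if "w%d-k%d" % (n, i) not in shared
]
```

In principle the same wrapper could be placed around the object returned by `semidbm.open(...)`, with all threads sharing the single wrapped handle.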
Post feedback and issues on GitHub issues, or check out the latest changes at the GitHub repo.