Elektra 0.9.10
quickdump is a storage plugin based on the dump format. It is considerably faster than the old dump plugin (see benchmarks.md), because it does not use commands and stores string lengths as binary data. These changes eliminate all string comparisons and integer-string conversions, which accounted for much of the time spent by the dump plugin.
The format is also useful for IPC and streaming, which is why the specload plugin uses it.
A quickdump file starts with the magic number 0x454b444200000003. The first 4 bytes are the ASCII codes for EKDB (for Elektra KDB); the remaining 4 bytes hold a version number. This 64-bit value is always stored as big-endian (i.e. the way it is written above).
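As a small illustration of the header layout, the following sketch packs the magic number as a big-endian 64-bit value and splits it back into its two halves. The helper name split_magic is hypothetical, and reading the version as a single unsigned 32-bit integer is an assumption based on the description above, not something taken from the plugin source.

```python
import struct

MAGIC = 0x454B444200000003  # magic number quoted in the text


def split_magic(header):
    """Split an 8-byte header into (tag, version).

    Assumption: the version occupies the last 4 bytes as one
    big-endian unsigned 32-bit integer.
    """
    if len(header) != 8:
        raise ValueError("header must be exactly 8 bytes")
    # ">4sI": big-endian, 4 raw bytes followed by an unsigned 32-bit integer
    tag, version = struct.unpack(">4sI", header)
    if tag != b"EKDB":
        raise ValueError("not a quickdump header")
    return tag, version


# Big-endian packing reproduces the byte order "as written".
header = struct.pack(">Q", MAGIC)
tag, version = split_magic(header)
```

Here header is the byte string EKDB followed by 00 00 00 03, so tag is b"EKDB" and version is 3.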
After the magic number the file is just a list of Keys. Each Key consists of a name, a value and any number of metakey names and values. Each name and value is written as a 64-bit length n followed by exactly n bytes of data. Strings are stored without a null terminator, so the length does not include one. When reading a string, the plugin allocates n+1 bytes and sets the last one to 0. Note that ALL lengths are stored in little-endian format, because most modern machines are little-endian. To save disk space, integers are written in a variable-length encoding, whose exact format is described below.
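The length-prefix scheme can be sketched as follows. This is a deliberately simplified illustration: it writes the length as a fixed 8-byte little-endian integer rather than in the variable-length encoding the format actually uses, it assumes UTF-8 for the string bytes, and the function names write_string and read_string are hypothetical.

```python
import io
import struct


def write_string(stream, text):
    """Write the byte length, then the bytes, with no terminator."""
    data = text.encode("utf-8")  # assumption: strings are UTF-8
    # Simplification: fixed 8-byte little-endian length instead of
    # the variable-length integer encoding.
    stream.write(struct.pack("<Q", len(data)))
    stream.write(data)


def read_string(stream):
    """Read a length-prefixed string into an (n + 1)-byte buffer."""
    raw = stream.read(8)
    if len(raw) != 8:
        raise EOFError("truncated length")
    (n,) = struct.unpack("<Q", raw)
    data = stream.read(n)
    if len(data) != n:
        raise EOFError("truncated data")
    buf = bytearray(n + 1)  # zero-initialised, so the last byte stays 0
    buf[:n] = data
    return bytes(buf)


stream = io.BytesIO()
write_string(stream, "abc")
encoded = stream.getvalue()
stream.seek(0)
decoded = read_string(stream)
```

The stored length is 3 for "abc", while the buffer returned by read_string holds 4 bytes, the last of which is the terminator added on reading.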
We don't store the full name of the key. Instead we only store the name relative to the parent key.
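A minimal sketch of deriving such a relative name is shown below. The function name relative_name and the example key names are hypothetical, and returning an empty string for the parent key itself is an assumption, not confirmed behaviour of the plugin.

```python
def relative_name(parent, key):
    """Return the part of `key` below `parent`."""
    if key == parent:
        return ""  # assumption: the parent key itself has an empty relative name
    prefix = parent.rstrip("/") + "/"
    if not key.startswith(prefix):
        raise ValueError("key is not below the parent key")
    return key[len(prefix):]


# Hypothetical key names, for illustration only.
name = relative_name("user:/tests/quickdump", "user:/tests/quickdump/a/b")
```

With these inputs, name is "a/b", so the shared prefix is never written to the file.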
The end of a key is marked by a null byte. This cannot be confused with null bytes embedded in binary key values, because every key value and metavalue is preceded by its length.
To distinguish between binary and string keys, the (length of the) key value is prefixed with either a b or an s. Each metakey is prefixed with an m, unless we detect that the same metakey was already present on a previous key (e.g. through keyCopyMeta). In this case the prefix c is used and, instead of the metakey name and value, we write the name of the previous key and the metakey name.
The basic idea of the format is to store integers in base 128. This means we only use 7 bits per byte, and the 8th bit (marker bit) indicates whether or not there are more bytes to read. However, to make things more efficient, we move all those marker bits to the first byte. Then we can read one byte and immediately know how many bytes follow. This is similar to what UTF-8 does.
The table below shows how the encoding works. The first byte is shown in full (x is either 0 or 1), then [n] indicates that n bytes of data follow.
Thanks to https://github.com/stoklund/varint for listing various integer encodings.
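To make the "marker bits in the first byte" idea concrete, here is a sketch of one possible prefix varint. The bit layout is an assumption chosen for illustration and may differ from what the plugin actually writes: the number of trailing zero bits in the first byte gives the number of additional bytes, a first byte of 0 means that 8 raw bytes follow, and the payload is little-endian. The function names are hypothetical.

```python
import io


def encode_varint(n):
    """Encode an unsigned 64-bit integer (assumed layout, see above)."""
    if not 0 <= n < 2**64:
        raise ValueError("value out of range")
    bits = n.bit_length()
    if bits > 56:
        # Too large for 8 bytes of 7-bit payload: zero marker + raw value.
        return b"\x00" + n.to_bytes(8, "little")
    extra = max(bits - 1, 0) // 7      # additional bytes after the first
    word = ((n << 1) | 1) << extra     # marker 1, then `extra` zero bits
    return word.to_bytes(extra + 1, "little")


def decode_varint(stream):
    """Decode one integer; the first byte alone tells how much to read."""
    head = stream.read(1)
    if len(head) != 1:
        raise EOFError("truncated varint")
    first = head[0]
    if first == 0:
        rest = stream.read(8)
        if len(rest) != 8:
            raise EOFError("truncated varint")
        return int.from_bytes(rest, "little")
    extra = (first & -first).bit_length() - 1  # count trailing zero bits
    rest = stream.read(extra)
    if len(rest) != extra:
        raise EOFError("truncated varint")
    word = int.from_bytes(head + rest, "little")
    return word >> (extra + 1)         # drop the marker bits


samples = [0, 1, 127, 128, 16383, 16384, 2**56 - 1, 2**56, 2**64 - 1]
roundtrip = [decode_varint(io.BytesIO(encode_varint(v))) for v in samples]
```

In this layout, values up to 127 take one byte, each further byte adds 7 bits of payload, and values that need more than 56 bits take 9 bytes. The decoder never has to inspect a continuation bit in the middle of a number, which is the efficiency gain described above.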
As with any other storage plugin, you simply use quickdump when mounting, importing or exporting.
None.
None.