@@ -6827,222 +6827,246 @@ Map Operations and Versioning
Store Builder
Store Builder
@mvstore_1021_p
@mvstore_1021_p
The <code>MVStore.Builder</code> provides a fluent interface to build a store when more complex configuration options are needed:
The <code>MVStore.Builder</code> provides a fluent interface to build a store when more complex configuration options are needed. The following code contains all supported configuration options:
@mvstore_1022_h3
@mvstore_1022_li
cacheSizeMB: the cache size in MB.
@mvstore_1023_li
compressData: compress the data when storing.
@mvstore_1024_li
encryptionKey: the encryption key for file encryption.
@mvstore_1025_li
fileName: the name of the file, for file based stores.
@mvstore_1026_li
readOnly: open the file in read-only mode.
@mvstore_1027_li
writeBufferSize: the size of the write buffer in MB.
@mvstore_1028_li
writeDelay: the maximum delay until committed changes are stored (unless stored explicitly).
@mvstore_1029_h3
R-Tree
R-Tree
@mvstore_1023_p
@mvstore_1030_p
The <code>MVRTreeMap</code> is an R-tree implementation that supports fast spatial queries. It can be used as follows:
The <code>MVRTreeMap</code> is an R-tree implementation that supports fast spatial queries. It can be used as follows:
@mvstore_1024_h2
@mvstore_1031_p
The default number of dimensions is 2. To use a different number of dimensions, use <code>new MVRTreeMap.Builder<String>().dimensions(3)</code>. The minimum number of dimensions is 1, the maximum is 255.
@mvstore_1032_h2
Features
Features
@mvstore_1025_h3
@mvstore_1033_h3
Maps
Maps
@mvstore_1026_p
@mvstore_1034_p
Each store supports a set of named maps. A map is sorted by key, and supports the common lookup operations, including access to the first and last key, iteration over some or all keys, and so on.
Each store supports a set of named maps. A map is sorted by key, and supports the common lookup operations, including access to the first and last key, iteration over some or all keys, and so on.
@mvstore_1027_p
@mvstore_1035_p
Also supported, and very uncommon for maps, is fast index lookup. The keys of the map can be accessed like a list (get the key at the given index, get the index of a certain key). That means getting the median of two keys is trivial, and ranges can be counted very quickly. The iterator supports fast skipping. This is possible because internally, each map is organized in the form of a counted B+-tree.
Also supported, and very uncommon for maps, is fast index lookup. The keys of the map can be accessed like a list (get the key at the given index, get the index of a certain key). That means getting the median of two keys is trivial, and ranges can be counted very quickly. The iterator supports fast skipping. This is possible because internally, each map is organized in the form of a counted B+-tree.
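The following is a minimal sketch of what index lookup offers. It uses a sorted <code>ArrayList</code> as a stand-in for the counted B+-tree, so inserts are slower here than in a real tree; the class and method names are hypothetical and are not part of the MVStore API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class IndexLookupSketch {
    // Sorted keys; a stand-in for the keys of a map (assumption: String keys).
    private final List<String> keys = new ArrayList<>();

    public void add(String key) {
        int pos = Collections.binarySearch(keys, key);
        if (pos < 0) {
            keys.add(-pos - 1, key); // insert at the sorted position
        }
    }

    // Get the key at the given index.
    public String keyAt(int index) {
        return keys.get(index);
    }

    // Get the index of a key, or (-(insertion point) - 1) if it is absent.
    public int indexOf(String key) {
        return Collections.binarySearch(keys, key);
    }

    // Count the keys in [from, to) by subtracting positions, without iterating.
    public int countRange(String from, String to) {
        return position(to) - position(from);
    }

    // The key halfway between the positions of two keys.
    public String medianBetween(String a, String b) {
        return keyAt((position(a) + position(b)) / 2);
    }

    private int position(String key) {
        int pos = indexOf(key);
        return pos < 0 ? -pos - 1 : pos;
    }

    public static void main(String[] args) {
        IndexLookupSketch m = new IndexLookupSketch();
        for (String k : new String[] {"d", "a", "c", "e", "b"}) {
            m.add(k);
        }
        System.out.println(m.keyAt(2));
        System.out.println(m.indexOf("d"));
        System.out.println(m.countRange("b", "e"));
        System.out.println(m.medianBetween("a", "e"));
    }
}
```

In a counted tree, each inner node stores the number of entries below it, so the same position arithmetic works without scanning the keys.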
@mvstore_1028_p
@mvstore_1036_p
In database terms, a map can be used like a table, where the key of the map is the primary key of the table, and the value is the row. A map can also represent an index, where the key of the map is the key of the index, and the value of the map is the primary key of the table (for non-unique indexes, the key of the map must also contain the primary key).
In database terms, a map can be used like a table, where the key of the map is the primary key of the table, and the value is the row. A map can also represent an index, where the key of the map is the key of the index, and the value of the map is the primary key of the table (for non-unique indexes, the key of the map must also contain the primary key).
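As an illustration of the table and index analogy, here is a small sketch that uses <code>java.util.TreeMap</code> in place of store maps. The class name, the key encoding (name, a NUL separator, then the zero-padded primary key) and the restriction to non-negative ids and NUL-free names are assumptions made for this example only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.TreeMap;

public class TableIndexSketch {
    // Table: primary key -> row (here the row is just a name).
    private final TreeMap<Integer, String> table = new TreeMap<>();
    // Non-unique index on the name: the map key contains the name and the
    // primary key, so rows with the same name get distinct index entries.
    private final TreeMap<String, Integer> nameIndex = new TreeMap<>();

    public void insert(int id, String name) {
        table.put(id, name);
        nameIndex.put(indexKey(name, id), id);
    }

    public String row(int id) {
        return table.get(id);
    }

    // Range scan over all index entries whose key starts with the name.
    public List<Integer> findIdsByName(String name) {
        String from = name + '\u0000';
        String to = name + '\u0001';
        return new ArrayList<>(nameIndex.subMap(from, to).values());
    }

    private static String indexKey(String name, int id) {
        // Zero-padding keeps the ids in numeric order within one name.
        return name + '\u0000' + String.format(Locale.ROOT, "%010d", id);
    }

    public static void main(String[] args) {
        TableIndexSketch t = new TableIndexSketch();
        t.insert(1, "Ann");
        t.insert(2, "Bob");
        t.insert(3, "Ann");
        System.out.println(t.findIdsByName("Ann"));
    }
}
```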
@mvstore_1029_h3
@mvstore_1037_h3
Versions / Transactions
Versions / Transactions
@mvstore_1030_p
@mvstore_1038_p
Multiple versions are supported. A version is a snapshot of all the data of all maps at a given point in time. A transaction is a number of actions between two versions.
Multiple versions are supported. A version is a snapshot of all the data of all maps at a given point in time. A transaction is a number of actions between two versions.
@mvstore_1031_p
@mvstore_1039_p
Versions / transactions are not immediately persisted; instead, only the version counter is incremented. If there is a change after switching to a new version, a snapshot of the old version is kept in memory, so that it can still be read.
Versions / transactions are not immediately persisted; instead, only the version counter is incremented. If there is a change after switching to a new version, a snapshot of the old version is kept in memory, so that it can still be read.
@mvstore_1032_p
@mvstore_1040_p
Old persisted versions are readable until the old data is explicitly overwritten. Creating a snapshot is fast: only the pages that are changed after a snapshot are copied. This behavior is also called COW (copy on write).
Old persisted versions are readable until the old data is explicitly overwritten. Creating a snapshot is fast: only the pages that are changed after a snapshot are copied. This behavior is also called COW (copy on write).
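The copy-on-write idea can be sketched with an immutable binary search tree, where each node plays the role of a page. This is not how the MVStore is implemented (it uses B-tree pages); the class and method names are hypothetical and only illustrate why snapshots and rollback are cheap.

```java
import java.util.ArrayList;
import java.util.List;

public class CopyOnWriteSketch {
    // An immutable node; in this sketch a "page" holds a single entry.
    private static final class Node {
        final int key;
        final String value;
        final Node left;
        final Node right;

        Node(int key, String value, Node left, Node right) {
            this.key = key;
            this.value = value;
            this.left = left;
            this.right = right;
        }
    }

    private Node root;                                      // current version
    private final List<Node> versions = new ArrayList<>();  // roots of old versions

    // Returns a new root; only the nodes on the path to the key are copied,
    // all other nodes are shared with the previous version.
    private static Node putInto(Node n, int key, String value) {
        if (n == null) {
            return new Node(key, value, null, null);
        } else if (key < n.key) {
            return new Node(n.key, n.value, putInto(n.left, key, value), n.right);
        } else if (key > n.key) {
            return new Node(n.key, n.value, n.left, putInto(n.right, key, value));
        }
        return new Node(key, value, n.left, n.right);
    }

    private static String find(Node n, int key) {
        while (n != null) {
            if (key == n.key) {
                return n.value;
            }
            n = key < n.key ? n.left : n.right;
        }
        return null;
    }

    public void put(int key, String value) {
        root = putInto(root, key, value);
    }

    public String get(int key) {
        return find(root, key);
    }

    // Creating a snapshot only remembers the current root.
    public int snapshot() {
        versions.add(root);
        return versions.size() - 1;
    }

    // Read from an old version.
    public String getAt(int version, int key) {
        return find(versions.get(version), key);
    }

    // Roll back by making the old root the current one again.
    public void rollbackTo(int version) {
        root = versions.get(version);
    }
}
```

For example, after <code>put(1, "a")</code>, <code>snapshot()</code> and <code>put(1, "b")</code>, the current version returns "b" for key 1 while the snapshot still returns "a".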
@mvstore_1033_p
@mvstore_1041_p
Rollback is supported (rollback to any old in-memory version or an old persisted version).
Rollback is supported (rollback to any old in-memory version or an old persisted version).
@mvstore_1034_h3
@mvstore_1042_h3
In-Memory Performance and Usage
In-Memory Performance and Usage
@mvstore_1035_p
@mvstore_1043_p
Performance of in-memory operations is comparable with <code>java.util.TreeMap</code> (many operations are actually faster), but usually slower than <code>java.util.HashMap</code>.
Performance of in-memory operations is comparable with <code>java.util.TreeMap</code> (many operations are actually faster), but usually slower than <code>java.util.HashMap</code>.
@mvstore_1036_p
@mvstore_1044_p
The memory overhead for large maps is slightly lower than for the regular map implementations, but there is a higher overhead per map. For maps with fewer than 25 entries, the regular map implementations use less memory on average.
The memory overhead for large maps is slightly lower than for the regular map implementations, but there is a higher overhead per map. For maps with fewer than 25 entries, the regular map implementations use less memory on average.
@mvstore_1037_p
@mvstore_1045_p
If no file name is specified, the store operates purely in memory. Except for persisting data, all features are supported in this mode (multi-versioning, index lookup, R-tree and so on). If a file name is specified, all operations occur in memory (with the same performance characteristics) until data is persisted.
If no file name is specified, the store operates purely in memory. Except for persisting data, all features are supported in this mode (multi-versioning, index lookup, R-tree and so on). If a file name is specified, all operations occur in memory (with the same performance characteristics) until data is persisted.
@mvstore_1038_h3
@mvstore_1046_h3
Pluggable Data Types
Pluggable Data Types
@mvstore_1039_p
@mvstore_1047_p
Serialization is pluggable. The default serialization currently supports many common data types, and uses Java serialization for other objects. The following classes are currently directly supported: <code>Boolean, Byte, Short, Character, Integer, Long, Float, Double, BigInteger, BigDecimal, byte[], char[], int[], long[], String, UUID</code>. The plan is to add more common classes (date, time, timestamp, object array).
Serialization is pluggable. The default serialization currently supports many common data types, and uses Java serialization for other objects. The following classes are currently directly supported: <code>Boolean, Byte, Short, Character, Integer, Long, Float, Double, BigInteger, BigDecimal, byte[], char[], int[], long[], String, UUID</code>. The plan is to add more common classes (date, time, timestamp, object array).
@mvstore_1040_p
@mvstore_1048_p
Parameterized data types are supported (for example one could build a string data type that limits the length for some reason).
Parameterized data types are supported (for example one could build a string data type that limits the length for some reason).
@mvstore_1041_p
@mvstore_1049_p
The storage engine itself does not have any length limits, so that keys, values, pages, and chunks can be very big (as big as fits in memory). Also, there is no inherent limit to the number of maps and chunks. Because a log structured storage is used, there is no special case handling for large keys or pages.
The storage engine itself does not have any length limits, so that keys, values, pages, and chunks can be very big (as big as fits in memory). Also, there is no inherent limit to the number of maps and chunks. Because a log structured storage is used, there is no special case handling for large keys or pages.
@mvstore_1042_h3
@mvstore_1050_h3
BLOB Support
BLOB Support
@mvstore_1043_p
@mvstore_1051_p
There is a mechanism that stores large binary objects by splitting them into smaller blocks. This allows storing objects that don't fit in memory. Streaming as well as random access reads on such objects are supported. This tool is written on top of the store (only using the map interface).
There is a mechanism that stores large binary objects by splitting them into smaller blocks. This allows storing objects that don't fit in memory. Streaming as well as random access reads on such objects are supported. This tool is written on top of the store (only using the map interface).
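A rough sketch of the block-splitting idea, using a <code>java.util.TreeMap</code> as a stand-in for a map in the store. The class name, the block numbering and the single-byte read method are assumptions for illustration; the actual tool may be organized differently.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.TreeMap;

public class BlobBlocksSketch {
    private final int blockSize;
    // Block number -> block data; only the map interface is used.
    private final TreeMap<Long, byte[]> blocks = new TreeMap<>();
    private long length;

    public BlobBlocksSketch(int blockSize) {
        this.blockSize = blockSize;
    }

    // Streams the input into the map; only one block is held in the buffer.
    public void store(InputStream in) {
        blocks.clear();
        length = 0;
        byte[] buffer = new byte[blockSize];
        long blockNo = 0;
        try {
            while (true) {
                int len = 0;
                while (len < blockSize) {
                    int n = in.read(buffer, len, blockSize - len);
                    if (n < 0) {
                        break; // end of stream
                    }
                    len += n;
                }
                if (len == 0) {
                    break;
                }
                blocks.put(blockNo++, Arrays.copyOf(buffer, len));
                length += len;
                if (len < blockSize) {
                    break; // the last block was a partial one
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Random access read: only the block containing the position is touched.
    // Returns the byte as 0..255, or -1 if the position is out of range.
    public int read(long pos) {
        if (pos < 0 || pos >= length) {
            return -1;
        }
        byte[] block = blocks.get(pos / blockSize);
        return block[(int) (pos % blockSize)] & 0xff;
    }

    public long length() {
        return length;
    }

    public int blockCount() {
        return blocks.size();
    }

    public static void main(String[] args) {
        BlobBlocksSketch blob = new BlobBlocksSketch(4);
        blob.store(new ByteArrayInputStream(new byte[] {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}));
        System.out.println(blob.blockCount() + " blocks, byte 5 = " + blob.read(5));
    }
}
```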
@mvstore_1044_h3
@mvstore_1052_h3
R-Tree and Pluggable Map Implementations
R-Tree and Pluggable Map Implementations
@mvstore_1045_p
@mvstore_1053_p
The map implementation is pluggable. In addition to the default MVMap (multi-version map), there is a multi-version R-tree map implementation for spatial operations (contain and intersection; nearest neighbor is not yet implemented).
The map implementation is pluggable. In addition to the default MVMap (multi-version map), there is a multi-version R-tree map implementation for spatial operations (contain and intersection; nearest neighbor is not yet implemented).
@mvstore_1046_h3
@mvstore_1054_h3
Concurrent Operations and Caching
Concurrent Operations and Caching
@mvstore_1047_p
@mvstore_1055_p
The default map implementation supports concurrent reads on old versions of the data. All such read operations can occur in parallel. Concurrent reads from the page cache, as well as concurrent reads from the file system are supported.
The default map implementation supports concurrent reads on old versions of the data. All such read operations can occur in parallel. Concurrent reads from the page cache, as well as concurrent reads from the file system are supported.
@mvstore_1048_p
@mvstore_1056_p
Storing changes can occur concurrently to modifying the data, as <code>store()</code> operates on a snapshot.
Storing changes can occur concurrently to modifying the data, as <code>store()</code> operates on a snapshot.
@mvstore_1049_p
@mvstore_1057_p
Caching is done on the page level. The page cache is a concurrent LIRS cache, which should be resistant against scan operations.
Caching is done on the page level. The page cache is a concurrent LIRS cache, which should be resistant against scan operations.
@mvstore_1050_p
@mvstore_1058_p
The default map implementation does not support concurrent modification operations on a map (the same as <code>HashMap</code> and <code>TreeMap</code>).
The default map implementation does not support concurrent modification operations on a map (the same as <code>HashMap</code> and <code>TreeMap</code>).
@mvstore_1051_p
@mvstore_1059_p
With the <code>MVMapConcurrent</code> implementation, read operations even on the newest version can happen concurrently with all other operations, without risk of corruption. This comes with slightly reduced speed in single threaded mode, the same as with other <code>ConcurrentHashMap</code> implementations. Write operations first read the relevant area from disk to memory (this can happen concurrently), and only then modify the data. The in-memory part of write operations is synchronized.
With the <code>MVMapConcurrent</code> implementation, read operations even on the newest version can happen concurrently with all other operations, without risk of corruption. This comes with slightly reduced speed in single threaded mode, the same as with other <code>ConcurrentHashMap</code> implementations. Write operations first read the relevant area from disk to memory (this can happen concurrently), and only then modify the data. The in-memory part of write operations is synchronized.
@mvstore_1052_p
@mvstore_1060_p
For fully scalable concurrent write operations to a map (in-memory and to disk), the map could be split into multiple maps in different stores ('sharding'). The plan is to add such a mechanism later when needed.
For fully scalable concurrent write operations to a map (in-memory and to disk), the map could be split into multiple maps in different stores ('sharding'). The plan is to add such a mechanism later when needed.
@mvstore_1053_h3
@mvstore_1061_h3
Log Structured Storage
Log Structured Storage
@mvstore_1054_p
@mvstore_1062_p
Currently, <code>store()</code> needs to be called explicitly to save changes. Changes are buffered in memory, and once enough changes have accumulated (for example 2 MB), all changes are written in one continuous disk write operation. (According to a test, write throughput of a common SSD increases with the block size up to a block size of 2 MB, and does not increase further beyond that.) But of course, if needed, changes can also be persisted if only a little data was changed. The estimated amount of unsaved changes is tracked. The plan is to automatically store in a background thread once there are enough changes.
Currently, <code>store()</code> needs to be called explicitly to save changes. Changes are buffered in memory, and once enough changes have accumulated (for example 2 MB), all changes are written in one continuous disk write operation. (According to a test, write throughput of a common SSD increases with the block size up to a block size of 2 MB, and does not increase further beyond that.) But of course, if needed, changes can also be persisted if only a little data was changed. The estimated amount of unsaved changes is tracked. The plan is to automatically store in a background thread once there are enough changes.
@mvstore_1055_p
@mvstore_1063_p
When storing, all changed pages are serialized, optionally compressed using the LZF algorithm, and written sequentially to a free area of the file. Each such change set is called a chunk. All parent pages of the changed B-trees are stored in this chunk as well, so that each chunk also contains the root of each changed map (which is the entry point to read this version of the data). There is no separate index: all data is stored as a list of pages. Per store, there is one additional map that contains the metadata (the list of maps, where the root page of each map is stored, and the list of chunks).
When storing, all changed pages are serialized, optionally compressed using the LZF algorithm, and written sequentially to a free area of the file. Each such change set is called a chunk. All parent pages of the changed B-trees are stored in this chunk as well, so that each chunk also contains the root of each changed map (which is the entry point to read this version of the data). There is no separate index: all data is stored as a list of pages. Per store, there is one additional map that contains the metadata (the list of maps, where the root page of each map is stored, and the list of chunks).
@mvstore_1056_p
@mvstore_1064_p
There are usually two write operations per chunk: one to store the chunk data (the pages), and one to update the file header (so it points to the latest chunk). If the chunk is appended at the end of the file, the file header is only written at the end of the chunk.
There are usually two write operations per chunk: one to store the chunk data (the pages), and one to update the file header (so it points to the latest chunk). If the chunk is appended at the end of the file, the file header is only written at the end of the chunk.
@mvstore_1057_p
@mvstore_1065_p
There is currently no transaction log, no undo log, and there are no in-place updates (however unused chunks are overwritten). To save space when persisting very small transactions, the plan is to use a transaction log where only the deltas are stored, until enough changes have accumulated to persist a chunk.
There is currently no transaction log, no undo log, and there are no in-place updates (however unused chunks are overwritten). To save space when persisting very small transactions, the plan is to use a transaction log where only the deltas are stored, until enough changes have accumulated to persist a chunk.
@mvstore_1058_p
@mvstore_1066_p
Old data is kept for at least 45 seconds (configurable), so that there are no explicit sync operations required to guarantee data consistency, but an application can also sync explicitly when needed. To reuse disk space, the chunks with the lowest amount of live data are compacted (the live data is simply stored again in the next chunk). To improve data locality and disk space usage, the plan is to automatically defragment and compact data.
Old data is kept for at least 45 seconds (configurable), so that there are no explicit sync operations required to guarantee data consistency, but an application can also sync explicitly when needed. To reuse disk space, the chunks with the lowest amount of live data are compacted (the live data is simply stored again in the next chunk). To improve data locality and disk space usage, the plan is to automatically defragment and compact data.
@mvstore_1059_p
@mvstore_1067_p
Compared to regular databases (that use a transaction log, undo log, and main storage area), the log structured storage is simpler, more flexible, and typically needs fewer disk operations per change, as data is only written once instead of two or three times, and because the B-tree pages are always full (they are stored next to each other) and can be easily compressed. But temporarily, disk space usage might actually be a bit higher than for a regular database, as disk space is not immediately re-used (there are no in-place updates).
Compared to regular databases (that use a transaction log, undo log, and main storage area), the log structured storage is simpler, more flexible, and typically needs fewer disk operations per change, as data is only written once instead of two or three times, and because the B-tree pages are always full (they are stored next to each other) and can be easily compressed. But temporarily, disk space usage might actually be a bit higher than for a regular database, as disk space is not immediately re-used (there are no in-place updates).
@mvstore_1060_h3
@mvstore_1068_h3
File System Abstraction, File Locking and Online Backup
File System Abstraction, File Locking and Online Backup
@mvstore_1061_p
@mvstore_1069_p
The file system is pluggable (it uses the same file system abstraction as H2). Support for encryption is planned using an encrypting file system. Other file system implementations support reading from a compressed zip or tar file.
The file system is pluggable (it uses the same file system abstraction as H2). Support for encryption is planned using an encrypting file system. Other file system implementations support reading from a compressed zip or tar file.
@mvstore_1062_p
@mvstore_1070_p
Each store may only be opened once within a JVM. When opening a store, the file is locked in exclusive mode, so that the file can only be changed from within one process. Files can be opened in read-only mode, in which case a shared lock is used.
Each store may only be opened once within a JVM. When opening a store, the file is locked in exclusive mode, so that the file can only be changed from within one process. Files can be opened in read-only mode, in which case a shared lock is used.
@mvstore_1063_p
@mvstore_1071_p
The persisted data can be backed up to a different file at any time, even during write operations (online backup). To do that, automatic disk space reuse needs to be first disabled, so that new data is always appended at the end of the file. Then, the file can be copied (the file handle is available to the application).
The persisted data can be backed up to a different file at any time, even during write operations (online backup). To do that, automatic disk space reuse needs to be first disabled, so that new data is always appended at the end of the file. Then, the file can be copied (the file handle is available to the application).
@mvstore_1064_h3
@mvstore_1072_h3
Encrypted Files
Encrypted Files
@mvstore_1065_p
@mvstore_1073_p
File encryption ensures the data can only be read with the correct password. Data can be encrypted as follows:
File encryption ensures the data can only be read with the correct password. Data can be encrypted as follows:
@mvstore_1066_p
@mvstore_1074_p
The following algorithms and settings are used:
The following algorithms and settings are used:
@mvstore_1067_li
@mvstore_1075_li
The password char array is cleared after use, to reduce the risk that the password is stolen even if the attacker has access to the main memory.
The password char array is cleared after use, to reduce the risk that the password is stolen even if the attacker has access to the main memory.
@mvstore_1068_li
@mvstore_1076_li
The password is hashed according to the PBKDF2 standard, using the SHA-256 hash algorithm.
The password is hashed according to the PBKDF2 standard, using the SHA-256 hash algorithm.
@mvstore_1069_li
@mvstore_1077_li
The length of the salt is 64 bits, so that an attacker can not use a pre-calculated password hash table (rainbow table). It is generated using a cryptographically secure random number generator.
The length of the salt is 64 bits, so that an attacker can not use a pre-calculated password hash table (rainbow table). It is generated using a cryptographically secure random number generator.
@mvstore_1070_li
@mvstore_1078_li
To speed up opening encrypted stores on Android, the number of PBKDF2 iterations is 10. The higher the value, the better the protection against brute-force password cracking attacks, but the slower it is to open a file.
To speed up opening encrypted stores on Android, the number of PBKDF2 iterations is 10. The higher the value, the better the protection against brute-force password cracking attacks, but the slower it is to open a file.
@mvstore_1071_li
@mvstore_1079_li
The file itself is encrypted using the standardized disk encryption mode XTS-AES. Only a little more than one AES-128 round per block is needed.
The file itself is encrypted using the standardized disk encryption mode XTS-AES. Only a little more than one AES-128 round per block is needed.
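The password handling described in the list above can be sketched with the standard Java cryptography API. This is an illustration under stated assumptions, not the code used by the MVStore: the class name is hypothetical, the 128-bit output length is an assumption, and the JDK algorithm <code>PBKDF2WithHmacSHA256</code> (available since Java 8) stands in for the PBKDF2 implementation the store uses.

```java
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;

public class KeyDerivationSketch {
    static final int SALT_BITS = 64;   // salt length from the list above
    static final int ITERATIONS = 10;  // iteration count from the list above
    static final int KEY_BITS = 128;   // assumption: output length for this sketch

    // Generates the salt with a cryptographically secure random number generator.
    public static byte[] newSalt() {
        byte[] salt = new byte[SALT_BITS / 8];
        new SecureRandom().nextBytes(salt);
        return salt;
    }

    // Derives a key using PBKDF2 with SHA-256 and clears the password afterwards.
    public static byte[] deriveKey(char[] password, byte[] salt) {
        PBEKeySpec spec = new PBEKeySpec(password, salt, ITERATIONS, KEY_BITS);
        try {
            SecretKeyFactory factory =
                    SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256");
            return factory.generateSecret(spec).getEncoded();
        } catch (GeneralSecurityException e) {
            // No checked exceptions are exposed to the caller.
            throw new IllegalStateException(e);
        } finally {
            spec.clearPassword();         // clears the copy held by the spec
            Arrays.fill(password, '\0');  // clears the caller's array
        }
    }

    public static void main(String[] args) {
        char[] password = "007".toCharArray();
        byte[] key = deriveKey(password, newSalt());
        System.out.println("key length in bytes: " + key.length);
    }
}
```

Because the password array is overwritten, the caller needs to supply a fresh array for each call.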
@mvstore_1072_h3
@mvstore_1080_h3
Tools
Tools
@mvstore_1073_p
@mvstore_1081_p
There is a tool (<code>MVStoreTool</code>) to dump the contents of a file.
There is a tool (<code>MVStoreTool</code>) to dump the contents of a file.
@mvstore_1074_h3
@mvstore_1082_h3
Exception Handling
Exception Handling
@mvstore_1075_p
@mvstore_1083_p
This tool does not throw checked exceptions. Instead, unchecked exceptions are thrown if needed. The error message always contains the version of the tool. The following exceptions can occur:
This tool does not throw checked exceptions. Instead, unchecked exceptions are thrown if needed. The error message always contains the version of the tool. The following exceptions can occur:
@mvstore_1076_code
@mvstore_1084_code
IllegalStateException
IllegalStateException
@mvstore_1077_li
@mvstore_1085_li
if a map was already closed or an IO exception occurred, for example if the file was locked, is already closed, could not be opened or closed, if reading or writing failed, if the file is corrupt, or if there is an internal error in the tool.
if a map was already closed or an IO exception occurred, for example if the file was locked, is already closed, could not be opened or closed, if reading or writing failed, if the file is corrupt, or if there is an internal error in the tool.
@mvstore_1078_code
@mvstore_1086_code
IllegalArgumentException
IllegalArgumentException
@mvstore_1079_li
@mvstore_1087_li
if a method was called with an illegal argument.
if a method was called with an illegal argument.
@mvstore_1080_code
@mvstore_1088_code
UnsupportedOperationException
UnsupportedOperationException
@mvstore_1081_li
@mvstore_1089_li
if a method was called that is not supported, for example trying to modify a read-only map or view.
if a method was called that is not supported, for example trying to modify a read-only map or view.
@mvstore_1082_h2
@mvstore_1090_h2
Similar Projects and Differences to Other Storage Engines
Similar Projects and Differences to Other Storage Engines
@mvstore_1083_p
@mvstore_1091_p
Unlike similar storage engines like LevelDB and Kyoto Cabinet, the MVStore is written in Java and can easily be embedded in Java and Android applications.
Unlike similar storage engines like LevelDB and Kyoto Cabinet, the MVStore is written in Java and can easily be embedded in Java and Android applications.
@mvstore_1084_p
@mvstore_1092_p
The MVStore is somewhat similar to the Berkeley DB Java Edition because it is also written in Java, and is also a log structured storage, but the H2 license is more liberal.
The MVStore is somewhat similar to the Berkeley DB Java Edition because it is also written in Java, and is also a log structured storage, but the H2 license is more liberal.
@mvstore_1085_p
@mvstore_1093_p
Like SQLite, the MVStore keeps all data in one file. Unlike SQLite, the MVStore uses a log structured storage. The plan is to make the MVStore both easier to use and faster than SQLite. In a recent (very simple) test, the MVStore was about twice as fast as SQLite on Android.
Like SQLite, the MVStore keeps all data in one file. Unlike SQLite, the MVStore uses a log structured storage. The plan is to make the MVStore both easier to use and faster than SQLite. In a recent (very simple) test, the MVStore was about twice as fast as SQLite on Android.
@mvstore_1086_p
@mvstore_1094_p
The API of the MVStore is similar to MapDB (previously known as JDBM) from Jan Kotek, and some code is shared between MapDB and JDBM. However, unlike MapDB, the MVStore uses a log structured storage. The MVStore does not have a record size limit.
The API of the MVStore is similar to MapDB (previously known as JDBM) from Jan Kotek, and some code is shared between MapDB and JDBM. However, unlike MapDB, the MVStore uses a log structured storage. The MVStore does not have a record size limit.
@mvstore_1087_h2
@mvstore_1095_h2
Current State
Current State
@mvstore_1088_p
@mvstore_1096_p
The code is still very experimental at this stage. The API as well as the behavior will probably change. Features may be added and removed (even though the main features will stay).
The code is still very experimental at this stage. The API as well as the behavior will probably change. Features may be added and removed (even though the main features will stay).
@mvstore_1089_h2
@mvstore_1097_h2
Requirements
Requirements
@mvstore_1090_p
@mvstore_1098_p
The MVStore is included in the latest H2 jar file.
The MVStore is included in the latest H2 jar file.
@mvstore_1091_p
@mvstore_1099_p
There are no special requirements to use it. The MVStore should run on any JVM as well as on Android.
There are no special requirements to use it. The MVStore should run on any JVM as well as on Android.
@mvstore_1092_p
@mvstore_1100_p
To build just the MVStore (without the database engine), run:
To build just the MVStore (without the database engine), run:
@mvstore_1093_p
@mvstore_1101_p
This will create the file <code>bin/h2mvstore-1.3.170.jar</code> (about 130 KB).
This will create the file <code>bin/h2mvstore-1.3.170.jar</code> (about 130 KB).
# The <code>MVStore.Builder</code> provides a fluent interface to build a store when more complex configuration options are needed:
# The <code>MVStore.Builder</code> provides a fluent interface to build a store when more complex configuration options are needed. The following code contains all supported configuration options:
@mvstore_1022_h3
@mvstore_1022_li
#cacheSizeMB: the cache size in MB.
@mvstore_1023_li
#compressData: compress the data when storing.
@mvstore_1024_li
#encryptionKey: the encryption key for file encryption.
@mvstore_1025_li
#fileName: the name of the file, for file based stores.
@mvstore_1026_li
#readOnly: open the file in read-only mode.
@mvstore_1027_li
#writeBufferSize: the size of the write buffer in MB.
@mvstore_1028_li
#writeDelay: the maximum delay until committed changes are stored (unless stored explicitly).
@mvstore_1029_h3
#R-Tree
#R-Tree
@mvstore_1023_p
@mvstore_1030_p
# The <code>MVRTreeMap</code> is an R-tree implementation that supports fast spatial queries. It can be used as follows:
# The <code>MVRTreeMap</code> is an R-tree implementation that supports fast spatial queries. It can be used as follows:
@mvstore_1024_h2
@mvstore_1031_p
# The default number of dimensions is 2. To use a different number of dimensions, use <code>new MVRTreeMap.Builder<String>().dimensions(3)</code>. The minimum number of dimensions is 1, the maximum is 255.
@mvstore_1032_h2
特徴
特徴
@mvstore_1025_h3
@mvstore_1033_h3
#Maps
#Maps
@mvstore_1026_p
@mvstore_1034_p
# Each store supports a set of named maps. A map is sorted by key, and supports the common lookup operations, including access to the first and last key, iteration over some or all keys, and so on.
# Each store supports a set of named maps. A map is sorted by key, and supports the common lookup operations, including access to the first and last key, iteration over some or all keys, and so on.
@mvstore_1027_p
@mvstore_1035_p
# Also supported, and very uncommon for maps, is fast index lookup. The keys of the map can be accessed like a list (get the key at the given index, get the index of a certain key). That means getting the median of two keys is trivial, and ranges can be counted very quickly. The iterator supports fast skipping. This is possible because internally, each map is organized in the form of a counted B+-tree.
# Also supported, and very uncommon for maps, is fast index lookup. The keys of the map can be accessed like a list (get the key at the given index, get the index of a certain key). That means getting the median of two keys is trivial, and ranges can be counted very quickly. The iterator supports fast skipping. This is possible because internally, each map is organized in the form of a counted B+-tree.
@mvstore_1028_p
@mvstore_1036_p
# In database terms, a map can be used like a table, where the key of the map is the primary key of the table, and the value is the row. A map can also represent an index, where the key of the map is the key of the index, and the value of the map is the primary key of the table (for non-unique indexes, the key of the map must also contain the primary key).
# In database terms, a map can be used like a table, where the key of the map is the primary key of the table, and the value is the row. A map can also represent an index, where the key of the map is the key of the index, and the value of the map is the primary key of the table (for non-unique indexes, the key of the map must also contain the primary key).
@mvstore_1029_h3
@mvstore_1037_h3
#Versions / Transactions
#Versions / Transactions
@mvstore_1030_p
@mvstore_1038_p
# Multiple versions are supported. A version is a snapshot of all the data of all maps at a given point in time. A transaction is a number of actions between two versions.
# Multiple versions are supported. A version is a snapshot of all the data of all maps at a given point in time. A transaction is a number of actions between two versions.
@mvstore_1031_p
@mvstore_1039_p
# Versions / transactions are not immediately persisted; instead, only the version counter is incremented. If there is a change after switching to a new version, a snapshot of the old version is kept in memory, so that it can still be read.
# Versions / transactions are not immediately persisted; instead, only the version counter is incremented. If there is a change after switching to a new version, a snapshot of the old version is kept in memory, so that it can still be read.
@mvstore_1032_p
@mvstore_1040_p
# Old persisted versions are readable until the old data is explicitly overwritten. Creating a snapshot is fast: only the pages that are changed after a snapshot are copied. This behavior is also called COW (copy on write).
# Old persisted versions are readable until the old data is explicitly overwritten. Creating a snapshot is fast: only the pages that are changed after a snapshot are copied. This behavior is also called COW (copy on write).
@mvstore_1033_p
@mvstore_1041_p
# Rollback is supported (rollback to any old in-memory version or an old persisted version).
# Rollback is supported (rollback to any old in-memory version or an old persisted version).
@mvstore_1034_h3
@mvstore_1042_h3
#In-Memory Performance and Usage
#In-Memory Performance and Usage
@mvstore_1035_p
@mvstore_1043_p
# Performance of in-memory operations is comparable with <code>java.util.TreeMap</code> (many operations are actually faster), but usually slower than <code>java.util.HashMap</code>.
# Performance of in-memory operations is comparable with <code>java.util.TreeMap</code> (many operations are actually faster), but usually slower than <code>java.util.HashMap</code>.
@mvstore_1036_p
@mvstore_1044_p
# The memory overhead for large maps is slightly lower than for the regular map implementations, but there is a higher overhead per map. For maps with fewer than 25 entries, the regular map implementations use less memory on average.
# The memory overhead for large maps is slightly lower than for the regular map implementations, but there is a higher overhead per map. For maps with fewer than 25 entries, the regular map implementations use less memory on average.
@mvstore_1037_p
@mvstore_1045_p
# If no file name is specified, the store operates purely in memory. Except for persisting data, all features are supported in this mode (multi-versioning, index lookup, R-tree and so on). If a file name is specified, all operations occur in memory (with the same performance characteristics) until data is persisted.
# If no file name is specified, the store operates purely in memory. Except for persisting data, all features are supported in this mode (multi-versioning, index lookup, R-tree and so on). If a file name is specified, all operations occur in memory (with the same performance characteristics) until data is persisted.
@mvstore_1038_h3
@mvstore_1046_h3
#Pluggable Data Types
#Pluggable Data Types
@mvstore_1039_p
@mvstore_1047_p
# Serialization is pluggable. The default serialization currently supports many common data types, and uses Java serialization for other objects. The following classes are currently directly supported: <code>Boolean, Byte, Short, Character, Integer, Long, Float, Double, BigInteger, BigDecimal, byte[], char[], int[], long[], String, UUID</code>. The plan is to add more common classes (date, time, timestamp, object array).
# Serialization is pluggable. The default serialization currently supports many common data types, and uses Java serialization for other objects. The following classes are currently directly supported: <code>Boolean, Byte, Short, Character, Integer, Long, Float, Double, BigInteger, BigDecimal, byte[], char[], int[], long[], String, UUID</code>. The plan is to add more common classes (date, time, timestamp, object array).
@mvstore_1040_p
@mvstore_1048_p
# Parameterized data types are supported (for example one could build a string data type that limits the length for some reason).
# Parameterized data types are supported (for example one could build a string data type that limits the length for some reason).
@mvstore_1041_p
@mvstore_1049_p
# The storage engine itself does not have any length limits, so that keys, values, pages, and chunks can be very big (as big as fits in memory). Also, there is no inherent limit to the number of maps and chunks. Due to using a log structured storage, there is no special case handling for large keys or pages.
# The storage engine itself does not have any length limits, so that keys, values, pages, and chunks can be very big (as big as fits in memory). Also, there is no inherent limit to the number of maps and chunks. Due to using a log structured storage, there is no special case handling for large keys or pages.
@mvstore_1042_h3
@mvstore_1050_h3
#BLOB Support
#BLOB Support
@mvstore_1043_p
@mvstore_1051_p
# There is a mechanism that stores large binary objects by splitting them into smaller blocks. This makes it possible to store objects that don't fit in memory. Streaming as well as random access reads on such objects are supported. This tool is written on top of the store (only using the map interface).
# There is a mechanism that stores large binary objects by splitting them into smaller blocks. This makes it possible to store objects that don't fit in memory. Streaming as well as random access reads on such objects are supported. This tool is written on top of the store (only using the map interface).
@mvstore_1044_h3
@mvstore_1052_h3
#R-Tree and Pluggable Map Implementations
#R-Tree and Pluggable Map Implementations
@mvstore_1045_p
@mvstore_1053_p
# The map implementation is pluggable. In addition to the default MVMap (multi-version map), there is a multi-version R-tree map implementation for spatial operations (contain and intersection; nearest neighbor is not yet implemented).
# The map implementation is pluggable. In addition to the default MVMap (multi-version map), there is a multi-version R-tree map implementation for spatial operations (contain and intersection; nearest neighbor is not yet implemented).
@mvstore_1046_h3
@mvstore_1054_h3
#Concurrent Operations and Caching
#Concurrent Operations and Caching
@mvstore_1047_p
@mvstore_1055_p
# The default map implementation supports concurrent reads on old versions of the data. All such read operations can occur in parallel. Concurrent reads from the page cache, as well as concurrent reads from the file system are supported.
# The default map implementation supports concurrent reads on old versions of the data. All such read operations can occur in parallel. Concurrent reads from the page cache, as well as concurrent reads from the file system are supported.
@mvstore_1048_p
@mvstore_1056_p
# Storing changes can occur concurrently with modifying the data, as <code>store()</code> operates on a snapshot.
# Storing changes can occur concurrently with modifying the data, as <code>store()</code> operates on a snapshot.
@mvstore_1049_p
@mvstore_1057_p
# Caching is done on the page level. The page cache is a concurrent LIRS cache, which should be resistant against scan operations.
# Caching is done on the page level. The page cache is a concurrent LIRS cache, which should be resistant against scan operations.
@mvstore_1050_p
@mvstore_1058_p
# The default map implementation does not support concurrent modification operations on a map (the same as <code>HashMap</code> and <code>TreeMap</code>).
# The default map implementation does not support concurrent modification operations on a map (the same as <code>HashMap</code> and <code>TreeMap</code>).
@mvstore_1051_p
@mvstore_1059_p
# With the <code>MVMapConcurrent</code> implementation, read operations even on the newest version can happen concurrently with all other operations, without risk of corruption. This comes with slightly reduced speed in single threaded mode, the same as with other <code>ConcurrentHashMap</code> implementations. Write operations first read the relevant area from disk to memory (this can happen concurrently), and only then modify the data. The in-memory part of write operations is synchronized.
# With the <code>MVMapConcurrent</code> implementation, read operations even on the newest version can happen concurrently with all other operations, without risk of corruption. This comes with slightly reduced speed in single threaded mode, the same as with other <code>ConcurrentHashMap</code> implementations. Write operations first read the relevant area from disk to memory (this can happen concurrently), and only then modify the data. The in-memory part of write operations is synchronized.
@mvstore_1052_p
@mvstore_1060_p
# For fully scalable concurrent write operations to a map (in-memory and to disk), the map could be split into multiple maps in different stores ('sharding'). The plan is to add such a mechanism later when needed.
# For fully scalable concurrent write operations to a map (in-memory and to disk), the map could be split into multiple maps in different stores ('sharding'). The plan is to add such a mechanism later when needed.
@mvstore_1053_h3
@mvstore_1061_h3
#Log Structured Storage
#Log Structured Storage
@mvstore_1054_p
@mvstore_1062_p
# Currently, <code>store()</code> needs to be called explicitly to save changes. Changes are buffered in memory, and once enough changes have accumulated (for example 2 MB), all changes are written in one continuous disk write operation. (According to a test, write throughput of a common SSD gets higher the larger the block size, until a block size of 2 MB, and then does not further increase.) If needed, changes can also be persisted when only a little data was changed. The estimated amount of unsaved changes is tracked. The plan is to automatically store in a background thread once there are enough changes.
# Currently, <code>store()</code> needs to be called explicitly to save changes. Changes are buffered in memory, and once enough changes have accumulated (for example 2 MB), all changes are written in one continuous disk write operation. (According to a test, write throughput of a common SSD gets higher the larger the block size, until a block size of 2 MB, and then does not further increase.) If needed, changes can also be persisted when only a little data was changed. The estimated amount of unsaved changes is tracked. The plan is to automatically store in a background thread once there are enough changes.
@mvstore_1055_p
@mvstore_1063_p
# When storing, all changed pages are serialized, optionally compressed using the LZF algorithm, and written sequentially to a free area of the file. Each such change set is called a chunk. All parent pages of the changed B-trees are stored in this chunk as well, so that each chunk also contains the root of each changed map (which is the entry point to read this version of the data). There is no separate index: all data is stored as a list of pages. Per store, there is one additional map that contains the metadata (the list of maps, where the root page of each map is stored, and the list of chunks).
# When storing, all changed pages are serialized, optionally compressed using the LZF algorithm, and written sequentially to a free area of the file. Each such change set is called a chunk. All parent pages of the changed B-trees are stored in this chunk as well, so that each chunk also contains the root of each changed map (which is the entry point to read this version of the data). There is no separate index: all data is stored as a list of pages. Per store, there is one additional map that contains the metadata (the list of maps, where the root page of each map is stored, and the list of chunks).
@mvstore_1056_p
@mvstore_1064_p
# There are usually two write operations per chunk: one to store the chunk data (the pages), and one to update the file header (so it points to the latest chunk). If the chunk is appended at the end of the file, the file header is only written at the end of the chunk.
# There are usually two write operations per chunk: one to store the chunk data (the pages), and one to update the file header (so it points to the latest chunk). If the chunk is appended at the end of the file, the file header is only written at the end of the chunk.
@mvstore_1057_p
@mvstore_1065_p
# There is currently no transaction log, no undo log, and there are no in-place updates (however unused chunks are overwritten). To save space when persisting very small transactions, the plan is to use a transaction log where only the deltas are stored, until enough changes have accumulated to persist a chunk.
# There is currently no transaction log, no undo log, and there are no in-place updates (however unused chunks are overwritten). To save space when persisting very small transactions, the plan is to use a transaction log where only the deltas are stored, until enough changes have accumulated to persist a chunk.
@mvstore_1058_p
@mvstore_1066_p
# Old data is kept for at least 45 seconds (configurable), so that there are no explicit sync operations required to guarantee data consistency, but an application can also sync explicitly when needed. To reuse disk space, the chunks with the lowest amount of live data are compacted (the live data is simply stored again in the next chunk). To improve data locality and disk space usage, the plan is to automatically defragment and compact data.
# Old data is kept for at least 45 seconds (configurable), so that there are no explicit sync operations required to guarantee data consistency, but an application can also sync explicitly when needed. To reuse disk space, the chunks with the lowest amount of live data are compacted (the live data is simply stored again in the next chunk). To improve data locality and disk space usage, the plan is to automatically defragment and compact data.
@mvstore_1059_p
@mvstore_1067_p
# Compared to regular databases (that use a transaction log, undo log, and main storage area), the log structured storage is simpler, more flexible, and typically needs fewer disk operations per change, as data is only written once instead of two or three times, and because the B-tree pages are always full (they are stored next to each other) and can be easily compressed. But temporarily, disk space usage might actually be a bit higher than for a regular database, as disk space is not immediately re-used (there are no in-place updates).
# Compared to regular databases (that use a transaction log, undo log, and main storage area), the log structured storage is simpler, more flexible, and typically needs fewer disk operations per change, as data is only written once instead of two or three times, and because the B-tree pages are always full (they are stored next to each other) and can be easily compressed. But temporarily, disk space usage might actually be a bit higher than for a regular database, as disk space is not immediately re-used (there are no in-place updates).
@mvstore_1060_h3
@mvstore_1068_h3
#File System Abstraction, File Locking and Online Backup
#File System Abstraction, File Locking and Online Backup
@mvstore_1061_p
@mvstore_1069_p
# The file system is pluggable (the same file system abstraction is used as in H2). Support for encryption is planned using an encrypting file system. Other file system implementations support reading from a compressed zip or tar file.
# The file system is pluggable (the same file system abstraction is used as in H2). Support for encryption is planned using an encrypting file system. Other file system implementations support reading from a compressed zip or tar file.
@mvstore_1062_p
@mvstore_1070_p
# Each store may only be opened once within a JVM. When opening a store, the file is locked in exclusive mode, so that the file can only be changed from within one process. Files can be opened in read-only mode, in which case a shared lock is used.
# Each store may only be opened once within a JVM. When opening a store, the file is locked in exclusive mode, so that the file can only be changed from within one process. Files can be opened in read-only mode, in which case a shared lock is used.
@mvstore_1063_p
@mvstore_1071_p
# The persisted data can be backed up to a different file at any time, even during write operations (online backup). To do that, automatic disk space reuse needs to be first disabled, so that new data is always appended at the end of the file. Then, the file can be copied (the file handle is available to the application).
# The persisted data can be backed up to a different file at any time, even during write operations (online backup). To do that, automatic disk space reuse needs to be first disabled, so that new data is always appended at the end of the file. Then, the file can be copied (the file handle is available to the application).
@mvstore_1064_h3
@mvstore_1072_h3
#Encrypted Files
#Encrypted Files
@mvstore_1065_p
@mvstore_1073_p
# File encryption ensures the data can only be read with the correct password. Data can be encrypted as follows:
# File encryption ensures the data can only be read with the correct password. Data can be encrypted as follows:
@mvstore_1066_p
@mvstore_1074_p
# The following algorithms and settings are used:
# The following algorithms and settings are used:
@mvstore_1067_li
@mvstore_1075_li
#The password char array is cleared after use, to reduce the risk that the password is stolen even if the attacker has access to the main memory.
#The password char array is cleared after use, to reduce the risk that the password is stolen even if the attacker has access to the main memory.
@mvstore_1068_li
@mvstore_1076_li
#The password is hashed according to the PBKDF2 standard, using the SHA-256 hash algorithm.
#The password is hashed according to the PBKDF2 standard, using the SHA-256 hash algorithm.
@mvstore_1069_li
@mvstore_1077_li
#The length of the salt is 64 bits, so that an attacker cannot use a pre-calculated password hash table (rainbow table). It is generated using a cryptographically secure random number generator.
#The length of the salt is 64 bits, so that an attacker cannot use a pre-calculated password hash table (rainbow table). It is generated using a cryptographically secure random number generator.
@mvstore_1070_li
@mvstore_1078_li
#To speed up opening an encrypted store on Android, the number of PBKDF2 iterations is 10. The higher the value, the better the protection against brute-force password cracking attacks, but the slower opening a file becomes.
#To speed up opening an encrypted store on Android, the number of PBKDF2 iterations is 10. The higher the value, the better the protection against brute-force password cracking attacks, but the slower opening a file becomes.
@mvstore_1071_li
@mvstore_1079_li
#The file itself is encrypted using the standardized disk encryption mode XTS-AES. Only a little more than one AES-128 round per block is needed.
#The file itself is encrypted using the standardized disk encryption mode XTS-AES. Only a little more than one AES-128 round per block is needed.
@mvstore_1072_h3
@mvstore_1080_h3
#Tools
#Tools
@mvstore_1073_p
@mvstore_1081_p
# There is a tool (<code>MVStoreTool</code>) to dump the contents of a file.
# There is a tool (<code>MVStoreTool</code>) to dump the contents of a file.
@mvstore_1074_h3
@mvstore_1082_h3
#Exception Handling
#Exception Handling
@mvstore_1075_p
@mvstore_1083_p
# This tool does not throw checked exceptions. Instead, unchecked exceptions are thrown if needed. The error message always contains the version of the tool. The following exceptions can occur:
# This tool does not throw checked exceptions. Instead, unchecked exceptions are thrown if needed. The error message always contains the version of the tool. The following exceptions can occur:
@mvstore_1076_code
@mvstore_1084_code
#IllegalStateException
#IllegalStateException
@mvstore_1077_li
@mvstore_1085_li
# if a map was already closed or an IO exception occurred, for example if the file was locked, is already closed, could not be opened or closed, if reading or writing failed, if the file is corrupt, or if there is an internal error in the tool.
# if a map was already closed or an IO exception occurred, for example if the file was locked, is already closed, could not be opened or closed, if reading or writing failed, if the file is corrupt, or if there is an internal error in the tool.
@mvstore_1078_code
@mvstore_1086_code
#IllegalArgumentException
#IllegalArgumentException
@mvstore_1079_li
@mvstore_1087_li
# if a method was called with an illegal argument.
# if a method was called with an illegal argument.
@mvstore_1080_code
@mvstore_1088_code
#UnsupportedOperationException
#UnsupportedOperationException
@mvstore_1081_li
@mvstore_1089_li
# if a method was called that is not supported, for example trying to modify a read-only map or view.
# if a method was called that is not supported, for example trying to modify a read-only map or view.
@mvstore_1082_h2
@mvstore_1090_h2
#Similar Projects and Differences to Other Storage Engines
#Similar Projects and Differences to Other Storage Engines
@mvstore_1083_p
@mvstore_1091_p
# Unlike similar storage engines like LevelDB and Kyoto Cabinet, the MVStore is written in Java and can easily be embedded in Java and Android applications.
# Unlike similar storage engines like LevelDB and Kyoto Cabinet, the MVStore is written in Java and can easily be embedded in Java and Android applications.
@mvstore_1084_p
@mvstore_1092_p
# The MVStore is somewhat similar to the Berkeley DB Java Edition because it is also written in Java, and is also a log structured storage, but the H2 license is more liberal.
# The MVStore is somewhat similar to the Berkeley DB Java Edition because it is also written in Java, and is also a log structured storage, but the H2 license is more liberal.
@mvstore_1085_p
@mvstore_1093_p
# Like SQLite, the MVStore keeps all data in one file. Unlike SQLite, the MVStore uses a log structured storage. The plan is to make the MVStore both easier to use and faster than SQLite. In a recent (very simple) test, the MVStore was about twice as fast as SQLite on Android.
# Like SQLite, the MVStore keeps all data in one file. Unlike SQLite, the MVStore uses a log structured storage. The plan is to make the MVStore both easier to use and faster than SQLite. In a recent (very simple) test, the MVStore was about twice as fast as SQLite on Android.
@mvstore_1086_p
@mvstore_1094_p
# The API of the MVStore is similar to MapDB (previously known as JDBM) from Jan Kotek, and some code is shared between MapDB and JDBM. However, unlike MapDB, the MVStore uses a log structured storage. The MVStore does not have a record size limit.
# The API of the MVStore is similar to MapDB (previously known as JDBM) from Jan Kotek, and some code is shared between MapDB and JDBM. However, unlike MapDB, the MVStore uses a log structured storage. The MVStore does not have a record size limit.
@mvstore_1087_h2
@mvstore_1095_h2
#Current State
#Current State
@mvstore_1088_p
@mvstore_1096_p
# The code is still very experimental at this stage. The API as well as the behavior will probably change. Features may be added and removed (even though the main features will stay).
# The code is still very experimental at this stage. The API as well as the behavior will probably change. Features may be added and removed (even though the main features will stay).
@mvstore_1089_h2
@mvstore_1097_h2
#Requirements
#Requirements
@mvstore_1090_p
@mvstore_1098_p
# The MVStore is included in the latest H2 jar file.
# The MVStore is included in the latest H2 jar file.
@mvstore_1091_p
@mvstore_1099_p
# There are no special requirements to use it. The MVStore should run on any JVM as well as on Android.
# There are no special requirements to use it. The MVStore should run on any JVM as well as on Android.
@mvstore_1092_p
@mvstore_1100_p
# To build just the MVStore (without the database engine), run:
# To build just the MVStore (without the database engine), run:
@mvstore_1093_p
@mvstore_1101_p
# This will create the file <code>bin/h2mvstore-1.3.170.jar</code> (about 130 KB).
# This will create the file <code>bin/h2mvstore-1.3.170.jar</code> (about 130 KB).
mvstore_1019_p=\ The following sample code shows how to create a store, open a map, add some data, and access the current and an old version\:
mvstore_1019_p=\ The following sample code shows how to create a store, open a map, add some data, and access the current and an old version\:
mvstore_1020_h3=Store Builder
mvstore_1020_h3=Store Builder
mvstore_1021_p=\ The <code>MVStore.Builder</code> provides a fluent interface to build a store if more complex configuration options are used\:
mvstore_1021_p=\ The <code>MVStore.Builder</code> provides a fluent interface to build a store if more complex configuration options are used. The following code contains all supported configuration options\:
mvstore_1022_h3=R-Tree
mvstore_1022_li=cacheSizeMB\:the cache size in MB.
mvstore_1023_p=\ The <code>MVRTreeMap</code> is an R-tree implementation that supports fast spatial queries. It can be used as follows\:
mvstore_1023_li=compressData\:compress the data when storing.
mvstore_1024_h2=Features
mvstore_1024_li=encryptionKey\:the encryption key for file encryption.
mvstore_1025_h3=Maps
mvstore_1025_li=fileName\:the name of the file, for file based stores.
mvstore_1026_p=\ Each store supports a set of named maps. A map is sorted by key, and supports the common lookup operations, including access to the first and last key, iterate over some or all keys, and so on.
mvstore_1026_li=readOnly\:open the file in read-only mode.
mvstore_1027_p=\ Also supported, and very uncommon for maps, is fast index lookup. The keys of the map can be accessed like a list (get the key at the given index, get the index of a certain key). That means getting the median of two keys is trivial, and ranges can be counted very quickly. The iterator supports fast skipping. This is possible because internally, each map is organized in the form of a counted B+-tree.
mvstore_1027_li=writeBufferSize\:the size of the write buffer in MB.
mvstore_1028_p=\ In database terms, a map can be used like a table, where the key of the map is the primary key of the table, and the value is the row. A map can also represent an index, where the key of the map is the key of the index, and the value of the map is the primary key of the table (for non-unique indexes, the key of the map must also contain the primary key).
mvstore_1028_li=writeDelay\:the maximum delay until committed changes are stored (unless stored explicitly).
mvstore_1029_h3=Versions / Transactions
mvstore_1029_h3=R-Tree
mvstore_1030_p=\ Multiple versions are supported. A version is a snapshot of all the data of all maps at a given point in time. A transaction is a number of actions between two versions.
mvstore_1030_p=\ The <code>MVRTreeMap</code> is an R-tree implementation that supports fast spatial queries. It can be used as follows\:
mvstore_1031_p=\ Versions / transactions are not immediately persisted; instead, only the version counter is incremented. If there is a change after switching to a new version, a snapshot of the old version is kept in memory, so that it can still be read.
mvstore_1031_p=\ The default number of dimensions is 2. To use a different number of dimensions, use <code>new MVRTreeMap.Builder<String>().dimensions(3)</code>. The minimum number of dimensions is 1, the maximum is 255.
mvstore_1032_p=\ Old persisted versions are readable until the old data is explicitly overwritten. Creating a snapshot is fast\:only the pages that are changed after a snapshot are copied. This behavior is also called COW (copy on write).
mvstore_1032_h2=Features
mvstore_1033_p=\ Rollback is supported (rollback to any old in-memory version or an old persisted version).
mvstore_1033_h3=Maps
mvstore_1034_h3=In-Memory Performance and Usage
mvstore_1034_p=\ Each store supports a set of named maps. A map is sorted by key, and supports the common lookup operations, including access to the first and last key, iterate over some or all keys, and so on.
mvstore_1035_p=\ Performance of in-memory operations is comparable with <code>java.util.TreeMap</code> (many operations are actually faster), but usually slower than <code>java.util.HashMap</code>.
mvstore_1035_p=\ Also supported, and very uncommon for maps, is fast index lookup. The keys of the map can be accessed like a list (get the key at the given index, get the index of a certain key). That means getting the median of two keys is trivial, and ranges can be counted very quickly. The iterator supports fast skipping. This is possible because internally, each map is organized in the form of a counted B+-tree.
mvstore_1036_p=\ The memory overhead for large maps is slightly lower than for the regular map implementations, but there is a higher overhead per map. For maps with fewer than 25 entries, the regular map implementations use less memory on average.
mvstore_1036_p=\ In database terms, a map can be used like a table, where the key of the map is the primary key of the table, and the value is the row. A map can also represent an index, where the key of the map is the key of the index, and the value of the map is the primary key of the table (for non-unique indexes, the key of the map must also contain the primary key).
mvstore_1037_p=\ If no file name is specified, the store operates purely in memory. Except for persisting data, all features are supported in this mode (multi-versioning, index lookup, R-tree and so on). If a file name is specified, all operations occur in memory (with the same performance characteristics) until data is persisted.
mvstore_1037_h3=Versions / Transactions
mvstore_1038_h3=Pluggable Data Types
mvstore_1038_p=\ Multiple versions are supported. A version is a snapshot of all the data of all maps at a given point in time. A transaction is a number of actions between two versions.
mvstore_1039_p=\ Serialization is pluggable. The default serialization currently supports many common data types, and uses Java serialization for other objects. The following classes are currently directly supported\:<code>Boolean, Byte, Short, Character, Integer, Long, Float, Double, BigInteger, BigDecimal, byte[], char[], int[], long[], String, UUID</code>. The plan is to add more common classes (date, time, timestamp, object array).
mvstore_1039_p=\ Versions / transactions are not immediately persisted; instead, only the version counter is incremented. If there is a change after switching to a new version, a snapshot of the old version is kept in memory, so that it can still be read.
mvstore_1040_p=\ Parameterized data types are supported (for example one could build a string data type that limits the length for some reason).
mvstore_1040_p=\ Old persisted versions are readable until the old data is explicitly overwritten. Creating a snapshot is fast\:only the pages that are changed after a snapshot are copied. This behavior is also called COW (copy on write).
mvstore_1041_p=\ The storage engine itself does not have any length limits, so that keys, values, pages, and chunks can be very big (as big as fits in memory). Also, there is no inherent limit to the number of maps and chunks. Due to using a log structured storage, there is no special case handling for large keys or pages.
mvstore_1041_p=\ Rollback is supported (rollback to any old in-memory version or an old persisted version).
mvstore_1042_h3=BLOB Support
mvstore_1042_h3=In-Memory Performance and Usage
mvstore_1043_p=\ There is a mechanism that stores large binary objects by splitting them into smaller blocks. This makes it possible to store objects that don't fit in memory. Streaming as well as random access reads on such objects are supported. This tool is written on top of the store (only using the map interface).
mvstore_1043_p=\ Performance of in-memory operations is comparable with <code>java.util.TreeMap</code> (many operations are actually faster), but usually slower than <code>java.util.HashMap</code>.
mvstore_1044_h3=R-Tree and Pluggable Map Implementations
mvstore_1044_p=\ The memory overhead for large maps is slightly better than for the regular map implementations, but there is a higher overhead per map. For maps with fewer than 25 entries, the regular map implementations use less memory on average.
mvstore_1045_p=\ The map implementation is pluggable. In addition to the default MVMap (multi-version map), there is a multi-version R-tree map implementation for spatial operations (contain and intersection; nearest neighbor is not yet implemented).
mvstore_1045_p=\ If no file name is specified, the store operates purely in memory. Except for persisting data, all features are supported in this mode (multi-versioning, index lookup, R-tree and so on). If a file name is specified, all operations occur in memory (with the same performance characteristics) until data is persisted.
mvstore_1046_h3=Concurrent Operations and Caching
mvstore_1046_h3=Pluggable Data Types
mvstore_1047_p=\ The default map implementation supports concurrent reads on old versions of the data. All such read operations can occur in parallel. Concurrent reads from the page cache, as well as concurrent reads from the file system are supported.
mvstore_1047_p=\ Serialization is pluggable. The default serialization currently supports many common data types, and uses Java serialization for other objects. The following classes are currently directly supported\: <code>Boolean, Byte, Short, Character, Integer, Long, Float, Double, BigInteger, BigDecimal, byte[], char[], int[], long[], String, UUID</code>. The plan is to add more common classes (date, time, timestamp, object array).
mvstore_1048_p=\ Storing changes can occur concurrently with modifying the data, as <code>store()</code> operates on a snapshot.
mvstore_1048_p=\ Parameterized data types are supported (for example one could build a string data type that limits the length for some reason).
mvstore_1049_p=\ Caching is done on the page level. The page cache is a concurrent LIRS cache, which should be resistant against scan operations.
mvstore_1049_p=\ The storage engine itself does not have any length limits, so that keys, values, pages, and chunks can be very big (as big as fits in memory). Also, there is no inherent limit to the number of maps and chunks. Due to using a log structured storage, there is no special case handling for large keys or pages.
mvstore_1050_p=\ The default map implementation does not support concurrent modification operations on a map (the same as <code>HashMap</code> and <code>TreeMap</code>).
mvstore_1050_h3=BLOB Support
mvstore_1051_p=\ With the <code>MVMapConcurrent</code> implementation, read operations even on the newest version can happen concurrently with all other operations, without risk of corruption. This comes with slightly reduced speed in single threaded mode, the same as with other <code>ConcurrentHashMap</code> implementations. Write operations first read the relevant area from disk to memory (this can happen concurrently), and only then modify the data. The in-memory part of write operations is synchronized.
mvstore_1051_p=\ There is a mechanism that stores large binary objects by splitting them into smaller blocks. This makes it possible to store objects that don't fit in memory. Streaming as well as random access reads on such objects are supported. This tool is written on top of the store (only using the map interface).
mvstore_1052_p=\ For fully scalable concurrent write operations to a map (in-memory and to disk), the map could be split into multiple maps in different stores ('sharding'). The plan is to add such a mechanism later when needed.
mvstore_1052_h3=R-Tree and Pluggable Map Implementations
mvstore_1053_h3=Log Structured Storage
mvstore_1053_p=\ The map implementation is pluggable. In addition to the default MVMap (multi-version map), there is a multi-version R-tree map implementation for spatial operations (contain and intersection; nearest neighbor is not yet implemented).
mvstore_1054_p=\ Currently, <code>store()</code> needs to be called explicitly to save changes. Changes are buffered in memory, and once enough changes have accumulated (for example 2 MB), all changes are written in one continuous disk write operation. (According to a test, write throughput of a common SSD increases with the block size up to a block size of 2 MB, and then does not increase further.) If needed, changes can also be persisted when only a little data was changed. The estimated amount of unsaved changes is tracked. The plan is to automatically store in a background thread once there are enough changes.
mvstore_1054_h3=Concurrent Operations and Caching
mvstore_1055_p=\ When storing, all changed pages are serialized, optionally compressed using the LZF algorithm, and written sequentially to a free area of the file. Each such change set is called a chunk. All parent pages of the changed B-trees are stored in this chunk as well, so that each chunk also contains the root of each changed map (which is the entry point to read this version of the data). There is no separate index\: all data is stored as a list of pages. Per store, there is one additional map that contains the metadata (the list of maps, where the root page of each map is stored, and the list of chunks).
mvstore_1055_p=\ The default map implementation supports concurrent reads on old versions of the data. All such read operations can occur in parallel. Concurrent reads from the page cache, as well as concurrent reads from the file system are supported.
mvstore_1056_p=\ There are usually two write operations per chunk\: one to store the chunk data (the pages), and one to update the file header (so it points to the latest chunk). If the chunk is appended at the end of the file, the file header is only written at the end of the chunk.
mvstore_1056_p=\ Storing changes can occur concurrently with modifying the data, as <code>store()</code> operates on a snapshot.
mvstore_1057_p=\ There is currently no transaction log, no undo log, and there are no in-place updates (however unused chunks are overwritten). To save space when persisting very small transactions, the plan is to use a transaction log where only the deltas are stored, until enough changes have accumulated to persist a chunk.
mvstore_1057_p=\ Caching is done on the page level. The page cache is a concurrent LIRS cache, which should be resistant against scan operations.
mvstore_1058_p=\ Old data is kept for at least 45 seconds (configurable), so that there are no explicit sync operations required to guarantee data consistency, but an application can also sync explicitly when needed. To reuse disk space, the chunks with the lowest amount of live data are compacted (the live data is simply stored again in the next chunk). To improve data locality and disk space usage, the plan is to automatically defragment and compact data.
mvstore_1058_p=\ The default map implementation does not support concurrent modification operations on a map (the same as <code>HashMap</code> and <code>TreeMap</code>).
mvstore_1059_p=\ Compared to regular databases (that use a transaction log, undo log, and main storage area), the log structured storage is simpler, more flexible, and typically needs fewer disk operations per change, as data is only written once instead of two or three times, and because the B-tree pages are always full (they are stored next to each other) and can be easily compressed. But temporarily, disk space usage might actually be a bit higher than for a regular database, as disk space is not immediately re-used (there are no in-place updates).
mvstore_1059_p=\ With the <code>MVMapConcurrent</code> implementation, read operations even on the newest version can happen concurrently with all other operations, without risk of corruption. This comes with slightly reduced speed in single threaded mode, the same as with other <code>ConcurrentHashMap</code> implementations. Write operations first read the relevant area from disk to memory (this can happen concurrently), and only then modify the data. The in-memory part of write operations is synchronized.
mvstore_1060_h3=File System Abstraction, File Locking and Online Backup
mvstore_1060_p=\ For fully scalable concurrent write operations to a map (in-memory and to disk), the map could be split into multiple maps in different stores ('sharding'). The plan is to add such a mechanism later when needed.
mvstore_1061_p=\ The file system is pluggable (the same file system abstraction that H2 uses). Support for encryption is planned using an encrypting file system. Other file system implementations support reading from a compressed zip or tar file.
mvstore_1061_h3=Log Structured Storage
mvstore_1062_p=\ Each store may only be opened once within a JVM. When opening a store, the file is locked in exclusive mode, so that the file can only be changed from within one process. Files can be opened in read-only mode, in which case a shared lock is used.
mvstore_1062_p=\ Currently, <code>store()</code> needs to be called explicitly to save changes. Changes are buffered in memory, and once enough changes have accumulated (for example 2 MB), all changes are written in one continuous disk write operation. (According to a test, write throughput of a common SSD increases with the block size up to a block size of 2 MB, and then does not increase further.) If needed, changes can also be persisted when only a little data was changed. The estimated amount of unsaved changes is tracked. The plan is to automatically store in a background thread once there are enough changes.
mvstore_1063_p=\ The persisted data can be backed up to a different file at any time, even during write operations (online backup). To do that, automatic disk space reuse needs to be first disabled, so that new data is always appended at the end of the file. Then, the file can be copied (the file handle is available to the application).
mvstore_1063_p=\ When storing, all changed pages are serialized, optionally compressed using the LZF algorithm, and written sequentially to a free area of the file. Each such change set is called a chunk. All parent pages of the changed B-trees are stored in this chunk as well, so that each chunk also contains the root of each changed map (which is the entry point to read this version of the data). There is no separate index\: all data is stored as a list of pages. Per store, there is one additional map that contains the metadata (the list of maps, where the root page of each map is stored, and the list of chunks).
mvstore_1064_h3=Encrypted Files
mvstore_1064_p=\ There are usually two write operations per chunk\: one to store the chunk data (the pages), and one to update the file header (so it points to the latest chunk). If the chunk is appended at the end of the file, the file header is only written at the end of the chunk.
mvstore_1065_p=\ File encryption ensures the data can only be read with the correct password. Data can be encrypted as follows\:
mvstore_1065_p=\ There is currently no transaction log, no undo log, and there are no in-place updates (however unused chunks are overwritten). To save space when persisting very small transactions, the plan is to use a transaction log where only the deltas are stored, until enough changes have accumulated to persist a chunk.
mvstore_1066_p=\ The following algorithms and settings are used\:
mvstore_1066_p=\ Old data is kept for at least 45 seconds (configurable), so that there are no explicit sync operations required to guarantee data consistency, but an application can also sync explicitly when needed. To reuse disk space, the chunks with the lowest amount of live data are compacted (the live data is simply stored again in the next chunk). To improve data locality and disk space usage, the plan is to automatically defragment and compact data.
mvstore_1067_li=The password char array is cleared after use, to reduce the risk that the password is stolen even if the attacker has access to the main memory.
mvstore_1067_p=\ Compared to regular databases (that use a transaction log, undo log, and main storage area), the log structured storage is simpler, more flexible, and typically needs fewer disk operations per change, as data is only written once instead of two or three times, and because the B-tree pages are always full (they are stored next to each other) and can be easily compressed. But temporarily, disk space usage might actually be a bit higher than for a regular database, as disk space is not immediately re-used (there are no in-place updates).
mvstore_1068_li=The password is hashed according to the PBKDF2 standard, using the SHA-256 hash algorithm.
mvstore_1068_h3=File System Abstraction, File Locking and Online Backup
mvstore_1069_li=The length of the salt is 64 bits, so that an attacker cannot use a pre-calculated password hash table (rainbow table). It is generated using a cryptographically secure random number generator.
mvstore_1069_p=\ The file system is pluggable (the same file system abstraction that H2 uses). Support for encryption is planned using an encrypting file system. Other file system implementations support reading from a compressed zip or tar file.
mvstore_1070_li=To speed up opening an encrypted store on Android, the number of PBKDF2 iterations is 10. The higher the value, the better the protection against brute-force password cracking attacks, but the slower opening a file is.
mvstore_1070_p=\ Each store may only be opened once within a JVM. When opening a store, the file is locked in exclusive mode, so that the file can only be changed from within one process. Files can be opened in read-only mode, in which case a shared lock is used.
mvstore_1071_li=The file itself is encrypted using the standardized disk encryption mode XTS-AES. Only a little more than one AES-128 round per block is needed.
mvstore_1071_p=\ The persisted data can be backed up to a different file at any time, even during write operations (online backup). To do that, automatic disk space reuse needs to be first disabled, so that new data is always appended at the end of the file. Then, the file can be copied (the file handle is available to the application).
mvstore_1072_h3=Tools
mvstore_1072_h3=Encrypted Files
mvstore_1073_p=\ There is a tool (<code>MVStoreTool</code>) to dump the contents of a file.
mvstore_1073_p=\ File encryption ensures the data can only be read with the correct password. Data can be encrypted as follows\:
mvstore_1074_h3=Exception Handling
mvstore_1074_p=\ The following algorithms and settings are used\:
mvstore_1075_p=\ This tool does not throw checked exceptions. Instead, unchecked exceptions are thrown if needed. The error message always contains the version of the tool. The following exceptions can occur\:
mvstore_1075_li=The password char array is cleared after use, to reduce the risk that the password is stolen even if the attacker has access to the main memory.
mvstore_1076_code=IllegalStateException
mvstore_1076_li=The password is hashed according to the PBKDF2 standard, using the SHA-256 hash algorithm.
mvstore_1077_li=\ if a map was already closed or an IO exception occurred, for example if the file was locked, is already closed, could not be opened or closed, if reading or writing failed, if the file is corrupt, or if there is an internal error in the tool.
mvstore_1077_li=The length of the salt is 64 bits, so that an attacker cannot use a pre-calculated password hash table (rainbow table). It is generated using a cryptographically secure random number generator.
mvstore_1078_code=IllegalArgumentException
mvstore_1078_li=To speed up opening an encrypted store on Android, the number of PBKDF2 iterations is 10. The higher the value, the better the protection against brute-force password cracking attacks, but the slower opening a file is.
mvstore_1079_li=\ if a method was called with an illegal argument.
mvstore_1079_li=The file itself is encrypted using the standardized disk encryption mode XTS-AES. Only a little more than one AES-128 round per block is needed.
mvstore_1080_code=UnsupportedOperationException
mvstore_1080_h3=Tools
mvstore_1081_li=\ if a method was called that is not supported, for example trying to modify a read-only map or view.
mvstore_1081_p=\ There is a tool (<code>MVStoreTool</code>) to dump the contents of a file.
mvstore_1082_h2=Similar Projects and Differences to Other Storage Engines
mvstore_1082_h3=Exception Handling
mvstore_1083_p=\ Unlike similar storage engines like LevelDB and Kyoto Cabinet, the MVStore is written in Java and can easily be embedded in a Java or Android application.
mvstore_1083_p=\ This tool does not throw checked exceptions. Instead, unchecked exceptions are thrown if needed. The error message always contains the version of the tool. The following exceptions can occur\:
mvstore_1084_p=\ The MVStore is somewhat similar to the Berkeley DB Java Edition because it is also written in Java, and is also a log structured storage, but the H2 license is more liberal.
mvstore_1084_code=IllegalStateException
mvstore_1085_p=\ Like SQLite, the MVStore keeps all data in one file. Unlike SQLite, the MVStore uses a log structured storage. The plan is to make the MVStore both easier to use as well as faster than SQLite. In a recent (very simple) test, the MVStore was about twice as fast as SQLite on Android.
mvstore_1085_li=\ if a map was already closed or an IO exception occurred, for example if the file was locked, is already closed, could not be opened or closed, if reading or writing failed, if the file is corrupt, or if there is an internal error in the tool.
mvstore_1086_p=\ The API of the MVStore is similar to MapDB (previously known as JDBM) from Jan Kotek, and some code is shared between MVStore and MapDB. However, unlike MapDB, the MVStore uses a log structured storage. The MVStore does not have a record size limit.
mvstore_1086_code=IllegalArgumentException
mvstore_1087_h2=Current State
mvstore_1087_li=\ if a method was called with an illegal argument.
mvstore_1088_p=\ The code is still very experimental at this stage. The API as well as the behavior will probably change. Features may be added and removed (even though the main features will stay).
mvstore_1088_code=UnsupportedOperationException
mvstore_1089_h2=Requirements
mvstore_1089_li=\ if a method was called that is not supported, for example trying to modify a read-only map or view.
mvstore_1090_p=\ The MVStore is included in the latest H2 jar file.
mvstore_1090_h2=Similar Projects and Differences to Other Storage Engines
mvstore_1091_p=\ There are no special requirements to use it. The MVStore should run on any JVM as well as on Android.
mvstore_1091_p=\ Unlike similar storage engines like LevelDB and Kyoto Cabinet, the MVStore is written in Java and can easily be embedded in a Java or Android application.
mvstore_1092_p=\ To build just the MVStore (without the database engine), run\:
mvstore_1092_p=\ The MVStore is somewhat similar to the Berkeley DB Java Edition because it is also written in Java, and is also a log structured storage, but the H2 license is more liberal.
mvstore_1093_p=\ This will create the file <code>bin/h2mvstore-1.3.170.jar</code> (about 130 KB).
mvstore_1093_p=\ Like SQLite, the MVStore keeps all data in one file. Unlike SQLite, the MVStore uses a log structured storage. The plan is to make the MVStore both easier to use as well as faster than SQLite. In a recent (very simple) test, the MVStore was about twice as fast as SQLite on Android.
mvstore_1094_p=\ The API of the MVStore is similar to MapDB (previously known as JDBM) from Jan Kotek, and some code is shared between MVStore and MapDB. However, unlike MapDB, the MVStore uses a log structured storage. The MVStore does not have a record size limit.
mvstore_1095_h2=Current State
mvstore_1096_p=\ The code is still very experimental at this stage. The API as well as the behavior will probably change. Features may be added and removed (even though the main features will stay).
mvstore_1097_h2=Requirements
mvstore_1098_p=\ The MVStore is included in the latest H2 jar file.
mvstore_1099_p=\ There are no special requirements to use it. The MVStore should run on any JVM as well as on Android.
mvstore_1100_p=\ To build just the MVStore (without the database engine), run\:
mvstore_1101_p=\ This will create the file <code>bin/h2mvstore-1.3.170.jar</code> (about 130 KB).