Hash-Index @ SwayDB
Simer Plaha
Posted on March 22, 2020
Hash-index @ SwayDB
Hash-indexes improves random read performance by 55%!
Hash-indexes like HashSet
or HashMap
are useful for serving random read requests. They increase performance for randomly accessed data, reduce total number of IOps and save CPU time. But they can also be costly for applications that rarely do random reads like
- Time-series or event-source data might only require fast sequential reads.
- Cold storage or archived data might not require random reads at all.
A configurable hash-index is required for SwayDB so various data storage requirements could be handled. The following were the requirements:
- Optionally enable or disable hash-index.
- Allow creation of perfect or nearly perfect hash-indexes.
- Create partially indexed hash-indexes with fallback to alternative indexes like binary-search & linear-search.
- Copy keys directly into the hash-index.
- Controlled concurrency.
- Compression.
Example configuration
The following creates a persistent Map
where hash-index is disabled.
Map<Integer, String, Void> map =
MapConfig
.functionsOff(Paths.get("myMap"), intSerializer(), stringSerializer())
.setRandomKeyIndex(RandomKeyIndex.disable())
.get();
map.put(1, "one");
map.get(1); //Optional[one]
The following enables hash-index (RandomKeyIndex
) with custom configuration.
Map<Integer, String, Void> map =
MapConfig
.functionsOff(Paths.get("myMap"), intSerializer(), stringSerializer())
.setRandomKeyIndex(
RandomKeyIndex
.builder()
.maxProbe(10) //re-hash 10 times to resolve hash collision
.minimumNumberOfKeys(20) //minimum keys required
.minimumNumberOfHits(10) //minimum indexed keys required
//use reference format. IndexFormat.copyKeys() can also be used here
//to copy keys directly into the hash-index
.indexFormat(IndexFormat.reference())
.allocateSpace(
(RandomKeyIndex.RequiredSpace optimalSpace) ->
//allocates 3 times more storage than the
//default optimal space for higher hit rate.
optimalSpace.requiredSpace() * 3
)
//allow async IO for all IO operations
.ioStrategy((IOAction ioAction) -> new IOStrategy.AsyncIO(true))
//Use LZ4 and fallback to Snappy.
.compression((UncompressedBlockInfo info) ->
Arrays.asList(
//try running LZ4 with minimum 20.0% compression
Compression.lz4Pair(
new Pair(LZ4Instance.fastestInstance(), new LZ4Compressor.Fast(20.0)),
new Pair(LZ4Instance.fastestInstance(), LZ4Decompressor.fastDecompressor())
),
//if LZ4 fails try Snappy with 20.0% compression.
new Compression.Snappy(20.0)
)
)
)
.get();
map.put(1, "one");
map.get(1); //Optional[one]
For a detail documentation of the configurations refer to documentation website.
Useful links
- SwayDB on GitHub.
- Java examples repo.
- Kotlin examples repo.
- Scala examples repo.
- Documentation website.
Posted on March 22, 2020
Join Our Newsletter. No Spam, Only the good stuff.
Sign up to receive the latest update from our blog.