Automated Aerospike All Flash Setup

Ken Tune

Posted on October 28, 2020

Introduction

Aerospike is a key-value database built to exploit SSD/flash storage, offering best-in-class throughput and latency at petabyte scale.

Standard Aerospike usage keeps the primary key index in DRAM and the data on SSD. Although Aerospike's DRAM usage is very low at 64 bytes per object, for very large numbers of objects (100bn+) users might wish to consider all flash mode, in which the primary key index is also placed on disk. More detail at all flash usage.

There are a number of non-trivial steps to go through to set up all flash. For that reason I've extended aerospike-ansible to allow automation of this process. This article walks through the automated process. It's envisaged that this will be useful for those evaluating the feature, or looking to get up and running with it quickly.

A working knowledge of aerospike-ansible is assumed. This introductory article may also be useful.

All Flash Calculations

In order to correctly configure a system for all flash, you need to know the number of partition-tree-sprigs appropriate for the object count in your database. You can think of a partition tree sprig as a mini primary key index - we use these to keep the primary key tree shallow, allowing us to look up a record's location more rapidly. More detail at sprigs.

The sprig count matters for all flash because we size the system so that each sprig fits inside a single disk block, minimising read and write overhead.
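To make the sizing idea concrete, here is a rough Python sketch of how a sprig count might be derived. It is not taken from the spreadsheet: the 4 KiB block size and the half-full target are assumptions, and the function name is purely illustrative. The 64-byte index entry is the figure quoted in the introduction, and 4096 is the fixed number of partitions in an Aerospike namespace.

```python
PARTITIONS = 4096        # fixed number of partitions per Aerospike namespace
BLOCK_BYTES = 4096       # assumption: one sprig should fit in a 4 KiB block
ENTRY_BYTES = 64         # index entry size quoted in the introduction
FILL_FRACTION = 0.5      # assumption: aim for sprigs that are about half full


def partition_tree_sprigs(unique_objects):
    """Smallest power of two giving each sprig room under the assumptions above."""
    entries_per_block = BLOCK_BYTES // ENTRY_BYTES           # 64 entries
    objects_per_partition = unique_objects / PARTITIONS
    sprigs_needed = objects_per_partition / (entries_per_block * FILL_FRACTION)
    sprigs = 1
    while sprigs < sprigs_needed:
        sprigs *= 2                                          # round up to a power of two
    return sprigs


print(partition_tree_sprigs(100_000_000))
```

Under these assumptions 100m objects gives 1024, which agrees with the value used later in this article. Treat the spreadsheet as the authority if the two ever differ.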

You can find details of the calculation here, but to make life easier a spreadsheet can be found in aerospike-ansible at assets/all-flash-calculator.xlsx.

[Screenshot: all-flash-calculator.xlsx]

Populate the yellow cells - # of objects, replication factor and object size.

The spreadsheet will calculate required partition-tree-sprigs.

It will also determine the fraction of available disk space that should be given over to the primary key index, based on the object size. In the screenshot, we can see that for 100m records, replication factor 2, average record size 1024 bytes, the overhead per record is 172 bytes and the overall record footprint is 2220 bytes, so approx 1/13 of the disk space should be allocated to the index.
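The arithmetic behind the 1/13 figure can be reproduced in a few lines of Python. This is only a sketch of the numbers quoted above, not the spreadsheet's actual formulas. It assumes the footprint is the record size multiplied by the replication factor plus the quoted overhead, which matches the 2220 bytes in the example. The function name is hypothetical.

```python
def index_share(record_size_bytes, replication_factor, overhead_bytes):
    """Return (footprint per record, footprint / overhead ratio).

    Assumption: footprint = record size * replication factor + overhead,
    which reproduces the 2220 bytes quoted in the example.
    """
    footprint = record_size_bytes * replication_factor + overhead_bytes
    return footprint, footprint / overhead_bytes


footprint, ratio = index_share(1024, 2, 172)
partitions_per_device = round(ratio)   # one partition in this many holds the index

print(footprint)               # 2220
print(round(ratio, 2))         # 12.91
print(partitions_per_device)   # 13
```

Rounding the ratio to 13 means one partition in thirteen is set aside for the index, which is where the 1/13 comes from.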

Using Aerospike-Ansible

In vars/cluster-config.yml

  • Set partitions_per_device to the value given in the spreadsheet - 13 in the example. The first partition on each device is used for the all flash index to ensure the correct index:data disk space ratio.
  • Add partition_tree_sprigs: YOUR_VALUE, where YOUR_VALUE is the figure from the spreadsheet - 1024 for this example

You will also need to

  • Set all_flash: true
  • Set enterprise: true
  • Provide a path to a valid Aerospike feature key using feature_key: /your/path/to/key. You must therefore be either a licensed Aerospike customer, or running an Aerospike trial.
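Put together, the relevant part of vars/cluster-config.yml for the worked example might look like the fragment below. This is a sketch rather than a copy of the shipped file. The sprig setting is written as partition_tree_sprigs to match the partition-tree-sprigs name used earlier, so check the variable names against your own checkout. The feature key path is a placeholder.

```yaml
# vars/cluster-config.yml (fragment) - values taken from the worked example
partitions_per_device: 13        # from the spreadsheet; the first partition holds the index
partition_tree_sprigs: 1024      # from the spreadsheet
all_flash: true
enterprise: true
feature_key: /your/path/to/key   # placeholder - point this at your own feature key
```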

Having done that, run

ansible-playbook aws-setup-plus-aerospike-install.yml

You should check that the aggregate disk space across your cluster exceeds the amount recommended in the spreadsheet.
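A quick way to sanity check this is to multiply out your cluster shape and compare it with the spreadsheet's figure. The Python sketch below does that. The cluster shape (3 nodes, one 475 GB device each) is a made-up example, and the required figure is simply 100m records at the 2220-byte footprint quoted earlier. The spreadsheet's recommendation may well be higher, so substitute its number for your own check.

```python
GB = 10**9


def aggregate_capacity_bytes(nodes, devices_per_node, device_size_bytes):
    """Total raw disk space across the cluster."""
    return nodes * devices_per_node * device_size_bytes


def has_enough_space(recommended_bytes, nodes, devices_per_node, device_size_bytes):
    """True if aggregate capacity exceeds the recommended amount."""
    capacity = aggregate_capacity_bytes(nodes, devices_per_node, device_size_bytes)
    return capacity > recommended_bytes


# Stand-in for the spreadsheet's recommendation: 100m records * 2220 bytes.
recommended = 100_000_000 * 2220

# Hypothetical cluster shape - replace with your own values.
print(has_enough_space(recommended, nodes=3, devices_per_node=1,
                       device_size_bytes=475 * GB))
```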

Verification

Once the setup process is complete, log into one of your cluster nodes

./scripts/cluster-quick-ssh.sh 

then start asadm (the admin tool) and run the info command.

[Screenshot: asadm info output]

The index type comes up as 'flash' as per the highlight.

Data Load

You can follow the instructions in benchmarking to quickly load some data into the new configuration.

As before, we can use asadm to examine the (highlighted) disk footprint of the primary key index, in this case for 10m records (20m including replicas).

[Screenshot: asadm output showing the primary key index footprint]

Conclusion

The aerospike-ansible tooling makes it easy to set up all flash for Aerospike and benefit from the DRAM saving it offers.


Cover image Michał Mancewicz
