Chastina Li 👩🏻💻
Posted on October 24, 2018
There are several ways to manage databases via code, for Amazon Web Service's Relational Database Service(RDS), you can use Terraform, Cloudformation Templates, or API libraries in your favorite language. While Terraform's statefiles
are a hassle to managed and sometimes it will delete your database and start from scratch. And CloudFormation templates are too verbose to be managed properly. In this blog post, I'm talking specifically about the Python AWS API library: boto3.
Creating a Postgres database
Firstly, we need to set up the boto
client:
import boto3
from botocore.config import Config
boto_config = Config(retries=dict(max_attempts=20))
client = boto3.client(
'rds', region_name='us-east-1', config=boto_config
)
then let's invoke the create_db_instance()
method with a set of configurations:
db_vars = {
"DBName": "db_name",
"DBInstanceIdentifier": "instance_name",
"AllocatedStorage": 20,
"DBInstanceClass": "db.m3.medium",
"Engine": "postgres",
"MasterUsername": "username",
"MasterUserPassword": "password",
"VpcSecurityGroupIds": [
"sg-0007c6489efbd9bca",
],
"DBSubnetGroupName": "my-subnet",
"DBParameterGroupName": "my-parameter-group",
"BackupRetentionPeriod": 7,
"MultiAZ": True,
"EngineVersion": "10.0.1",
"PubliclyAccessible": False,
"StorageType": "gp2",
}
client.create_db_instance(**db_vars)
Updating existing database
Updating database is a bit more complicated because it involves handling idempotency. In computing, an idempotent operation is one that has no additional effect if it is called more than once with the same input parameters. Why is it important? We don't want to update the same set of configurations on the database again and again, this causes confusion from boto
and will result in errors.
Part of handling idempotency is reconciliation with existing state. This usually involve computation of differences between desired state and current state. For boto
, this means whatever we give modify_db_instance
must represent this difference. In Python
, this can simply be done with a Dictionary comprehension:
updatable_boto_params = (
"AllocatedStorage",
"DBInstanceClass",
"DBParameterGroupName",
"DBSubnetGroupName",
"EngineVersion",
"Iops",
"StorageType",
)
updated_params = {
param: new_params.get(param)
for param in updatable_boto_params
if not current_params.get(param) == new_params.get(param)
}
where updatable_boto_params
is parameters that you allow to update, you can obviously lock this list down to a very small set of parameters for compliance or security purposes. See the complete list of updatable configurations here.
Now that we have the change set, we can invoke the modify_db_instance
:
client.modify_db_instance(**updated_params)
Waiting for it to finish
Database operations are usually long-running, when performing a series of steps sequentially (ie. create database and create a read replica), you need to wait for the synchronous process to finish. One way to tell if database has finished updating itself is by looking at the DBInstanceStatus
returned by the 'describe_db_instances' method. Status of available
means database has finished whatever it's doing and you can proceed to the next operation.
A better way is to use boto
's RDS waiter. It polls describe_db_instances()
every 30 seconds until a successful state is reached by default. But you can customize how frequently you want it to pool database status, and how long before you give up and declare the database not ready.
We can also use Python's context manager as a nice little wrapper of the waiter, we can also throw whatever exception handling logic into the context manager just to clean up the code:
from contextlib import contextmanager
from botocore.exceptions import WaiterError
@contextmanager
def wait_for_availability(instance_id, delay=120, retries=60):
" Yield back control after database is available "
yield True
try:
waiter = self.db_client.get_waiter("db_instance_available")
waiter.wait(
DBInstanceIdentifier=instance_id,
WaiterConfig={"Delay": delay, "MaxAttempts": retries},
)
except WaiterError as exception:
LOG.error("DBInstance: %s is not available.", instance_id)
raise exception
else:
LOG.info("DB instance: %s is now available.", instance_id)
break
with wait_for_availability(instance_id):
client.modify_db_instance(**updated_params)
In the code snippet above, we modify our database and pool for its status every 2 minutes and for a total of 60 times. After that the database is either available and we exit, or still unavailable and we raise an exception.
Database operations are usually time consuming, and guarantee idempotence is a major complexity to deal with. Fortunately, boto
provides a bunch of very useful APIs to satisfy our needs.
Posted on October 24, 2018
Join Our Newsletter. No Spam, Only the good stuff.
Sign up to receive the latest update from our blog.