bradenrichardson
Posted on December 20, 2021
I wrote some BAD code and redesigned on the fly...
In the last post we wrote the code to process a webhook event and set up the API infrastructure to support it.
This part was supposed to cover writing to a CSV file in an S3 bucket - well, that has definitely changed. I have redesigned the system around DynamoDB and, honestly, it's what I should have done from the start.
Here's what the new system looks like:
[architecture diagram]
Why not CSV in S3?
- It's very hard to append to a CSV file in S3 - the append functionality just isn't there in Boto3, so every new row means rewriting the whole object (see the sketch after this list)
- The solution would have required more engineering and maintenance
- Not as cool
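To make that first point concrete, here's a rough sketch of the read-modify-write dance you'd be stuck with for every single webhook event (bucket, key and row contents are made up for illustration):

import boto3

# S3 objects are immutable - to "append" one row you download the whole file,
# concatenate, and upload the whole thing again
s3 = boto3.client('s3')
obj = s3.get_object(Bucket='my-transactions-bucket', Key='transactions.csv')
existing = obj['Body'].read().decode('utf-8')
new_row = 'id123,Groceries,Food,42.50,Weekly shop,2021-12-20\n'  # illustrative row
s3.put_object(Bucket='my-transactions-bucket', Key='transactions.csv',
              Body=(existing + new_row).encode('utf-8'))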
Why DynamoDB?
- Fully managed by AWS
- Fantastic Boto3 support
- Will scale with the system; CSV in S3 would only have been practical for so long
Setting up DynamoDB
This was my first time using DynamoDB and it was really easy to set up - let's go through it:
Create a table in DynamoDB - this is all very self-explanatory; I'll be using the TransactionID as the partition key.
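If you'd rather script it than click through the console, the boto3 equivalent looks roughly like this (the billing mode here is an assumption, pick whatever suits you):

import boto3

# Roughly equivalent to the console setup: a table keyed only on TransactionID
dynamodb = boto3.client('dynamodb')
dynamodb.create_table(
    TableName='quicksightTest',
    AttributeDefinitions=[{'AttributeName': 'TransactionID', 'AttributeType': 'S'}],
    KeySchema=[{'AttributeName': 'TransactionID', 'KeyType': 'HASH'}],
    BillingMode='PAY_PER_REQUEST')  # on-demand billing assumed for this sketch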
... That's it - almost seems too easy, right? There are no tricks here; this is what a fully managed NoSQL database looks like. It's just very easy.
Writing to DynamoDB
This is where things get a little more interesting, although still quite simple. The objective is to modify both of our lambdas to write their respective payloads directly to DynamoDB.
Before we can attempt to write to DynamoDB, we need to make sure we have the correct permissions. Now, someone will probably say "this is your bad code right here" and they would be correct. Blanket "allow all" policies are bad, and I should feel bad. There is a security review staged for the end of v0.1 where I'll go through and make sure every policy is least-privilege compliant - right now I'm purely focused on making things work.
"Effect": "Allow",
"Action": [
"dynamodb:*"
Now that we have granted our lambdas supreme DynamoDB control we can write some code. The boto3 support for DynamoDB is great - it's really easy to use and super intuitive. One of my favourite things about DynamoDB is that you don't have to define a schema beforehand; at a minimum you need a partition key, and then you can just add any other key-value pair that you want. 'S' and 'N' denote string and number values respectively.
This is the function that the process webhook lambda uses:
import boto3

def write_to_dynamo(dictionary):
    # Each attribute carries its DynamoDB type: 'S' for string, 'N' for number
    dynamodb = boto3.client('dynamodb')
    dynamodb.put_item(TableName='quicksightTest', Item={
        'TransactionID': {'S': dictionary['ID']}, 'Category': {'S': dictionary['Category']},
        'ParentCategory': {'S': dictionary['ParentCategory']}, 'Value': {'N': dictionary['Value']},
        'Description': {'S': dictionary['Description']}, 'CreatedAt': {'S': dictionary['CreatedAt']}})
The Bad Code
Is code bad if it works? Probably yes.
Why was it bad? A lack of concurrency and an overall inefficient design.
I replicated the process webhook code for the provision new user lambda. Makes sense, right? It does basically the same thing, just on a larger scale.
Here's the code block as it stands:
import boto3

def write_to_dynamo(dictionary):
    dynamodb = boto3.client('dynamodb')
    a = 0
    # One put_item call per transaction, one after the other - this is the slow part
    for transaction in dictionary['id']:
        dynamodb.put_item(TableName='quicksightTest', Item={
            'TransactionID': {'S': dictionary['id'][a]}, 'Category': {'S': dictionary['category'][a]},
            'ParentCategory': {'S': dictionary['parentCategory'][a]}, 'Value': {'N': dictionary['value'][a]},
            'Description': {'S': dictionary['description'][a]}, 'CreatedAt': {'S': dictionary['createdAt'][a]}})
        a += 1

def lambda_handler(event, context):
    dictionary = create_Dictionary()
    write_to_dynamo(dictionary)
When I first tried running this, it timed out. Nothing to worry about - the lambda timeout was only set to a minute, and a large transaction history will take longer than that to download.
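Bumping the timeout is a quick console change, or a one-liner with boto3 if you prefer (the function name below is a placeholder):

import boto3

# Raise the lambda timeout to the 15-minute maximum (900 seconds)
boto3.client('lambda').update_function_configuration(
    FunctionName='provision-new-user',  # placeholder name
    Timeout=900)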
In the end I had to increase the lambda timeout to its max of 15 minutes, and my lambda ran for ~13 minutes. This is clearly not ideal, as my transaction history will only grow in size, meaning there will come a time when this code just does not work.
So what did I do?
Nothing.
If this were on the process webhook side I would have been forced to refactor and create a more efficient solution, but this function is only run once to provision a new user, so it can afford to be inefficient for now.
Even though it's ugly, it worked - I now have 4,418 records in DynamoDB and more being added with every webhook event. It did get me thinking about how I'm handling the compute, though; there are many ways I can improve the whole package. Stay tuned for those updates.
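As one example of the kind of improvement I mean (a sketch only, not what's deployed), boto3's batch_writer buffers the puts into batched requests instead of making one network call per transaction:

from decimal import Decimal
import boto3

def write_to_dynamo_batched(dictionary):
    # batch_writer sends items in BatchWriteItem calls of up to 25 at a time
    table = boto3.resource('dynamodb').Table('quicksightTest')
    with table.batch_writer() as batch:
        for a in range(len(dictionary['id'])):
            batch.put_item(Item={
                'TransactionID': dictionary['id'][a],
                'Category': dictionary['category'][a],
                'ParentCategory': dictionary['parentCategory'][a],
                'Value': Decimal(dictionary['value'][a]),  # the resource API wants Decimal for numbers
                'Description': dictionary['description'][a],
                'CreatedAt': dictionary['createdAt'][a]})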
Today's Progress
- Created a table in DynamoDB
- Wrote a function that writes to DynamoDB
- Imported my entire transaction history to DynamoDB!!!
Next Post
Next up we will go through querying DynamoDB with Athena and bringing our financial records into QuickSight - this is the fun stuff!