Building a low code URL shortener using API Gateway, StepFunctions and DynamoDB

ljacobsson

Lars Jacobsson

Posted on October 25, 2022


This post describes how you can implement a simple URL shortener using native low code AWS services. Some previous knowledge of API Gateway and StepFunctions is assumed.

Motivation

At Mathem, we use a home-built URL shortener in SMS communication with our customers. Until now it has been hosted in a legacy on-prem environment that will soon be turned off, so we had to move it to our serverless architecture in AWS one way or another.

There are plenty of serverless URL shorteners publicly available on GitHub. However, I wanted to explore whether this could be achieved without writing any Lambda function code at all.

I'm a huge fan of Lambda, but it does come with (for this use case, arguably insignificant) cold starts and the responsibility of keeping the runtime version up to date.

The StepFunctions team made a couple of releases during 2022 that have enabled state machines to do more than just orchestrate Lambda function invocations. This solution would not have been possible a few months ago, before the new intrinsic functions were released.

The full solution is available on GitHub

Creating a short URL

Architecture

State machine design

The state machine consists of three states: Initialise, Create hash and Store URL, plus some logic to handle duplicate hashes.

Let's go through each of them from top down:

Initialise

Initialise:
  Type: Pass
  OutputPath: $
  Parameters: 
    Splitter: "-"
    Attempts: 0
  Next: Create hash

This is a Pass state that initialises the execution. It passes on two parameters: Splitter, which is used to split the UUID in the next step, and Attempts, which is used to avoid an infinite loop if all hashes are already taken.

Create hash

To get a short but unique hash to hide the long URL behind, we'll make use of three new intrinsic functions: States.UUID, States.StringSplit and States.ArrayGetItem.

This is also a Pass state and looks like this:

Create hash:
  Type: Pass
  OutputPath: $
  Parameters:
    Hash.$: States.ArrayGetItem(States.StringSplit(States.UUID(), $.Splitter), 1)
    Attempts.$: States.MathAdd($.Attempts, 1)
  Next: Store URL

Note how the output from the first state, $.Splitter, is used here. Ideally we'd just write States.StringSplit(States.UUID(), "-"), but the StringSplit function expects a valid JSON path as the second argument.

The UUID is formatted like this: ca4c1140-dcc1-40cd-ad05-7b4aa23df4a8. Splitting it on the dash ('-') character gives us this array:

["ca4c1140", "dcc1", "40cd", "ad05", "7b4aa23df4a8"]

It's always divided into lowercase hexadecimal sequences of 8, 4, 4, 4 and 12 characters.

Next, we have to decide how long a hash we want, and that comes down to how many different URLs we need to support. Each character is one of 16 hexadecimal digits (0-9, a-f), so a hash of n characters gives 16^n combinations:
4 characters: 65,536 combinations
8 characters: 4,294,967,296 (about 4.29×10^9) combinations
12 characters: 281,474,976,710,656 (about 2.81×10^14) combinations

We are fine with 4 for our use case, so we'll access it using index 1: States.ArrayGetItem(splitArray, 1).

Note that we are limited to lower case characters. To make short hashes with a mix of casing, we'd need a Lambda function in the mix.
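To make the hash derivation concrete, here is a minimal Python sketch that mimics what the intrinsic functions in the Create hash state do. It is not Step Functions code: `create_hash` and `keyspace` are hypothetical helper names, and Python's `uuid.uuid4()` stands in for `States.UUID()`. Because UUID groups consist of hexadecimal digits, the keyspace helper assumes an alphabet of 16 characters.

```python
import uuid


def create_hash(splitter: str = "-", index: int = 1) -> str:
    """Mimic States.ArrayGetItem(States.StringSplit(States.UUID(), splitter), index)."""
    parts = str(uuid.uuid4()).split(splitter)  # five groups: 8-4-4-4-12 characters
    return parts[index]


def keyspace(length: int, alphabet_size: int = 16) -> int:
    """Number of distinct hashes of the given length over the given alphabet."""
    return alphabet_size ** length


if __name__ == "__main__":
    print(create_hash())   # a random 4-character lowercase hex string
    print(keyspace(4))     # size of the keyspace for 4 hex characters
```

Picking index 0 instead would yield the 8-character group, which is the simplest way to grow the keyspace without adding a Lambda function.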

Store URL

This state takes the output and stores it in DynamoDB using a native service integration:

Store URL:
  Type: Task
  Resource: arn:aws:states:::aws-sdk:dynamodb:putItem
  ResultPath: null
  Parameters:
    TableName: ${UrlTable}
    ConditionExpression: attribute_not_exists(Id)
    Item:
      Id:
        S.$: $.Hash
      Url:
        S.$: $$.Execution.Input.Url
      HitCount: 
        N: "0"
  Catch:
    - ErrorEquals: 
        - DynamoDb.ConditionalCheckFailedException
      Next: Continue trying?
      ResultPath: null
  End: true
Continue trying?:
  Type: Choice
  Choices:
    - Variable: $.Attempts
      NumericLessThan: 10
      Next: Create hash
  Default: Fail
Fail:
  Type: Fail

Note the ConditionExpression and the error handling in the Catch clause. Together they handle duplicate hashes: if the hash is already taken, the state machine simply generates a new one until it finds one that is available. As a safeguard, it bails out after 10 attempts. In a production environment you'd want an alarm for when that happens, as it indicates that the number of available hashes is running out.
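The control flow of Create hash, Store URL and Continue trying? can be sketched in plain Python. This is only an illustration under stated assumptions: an in-memory dict stands in for the DynamoDB table, `ConditionalCheckFailed` stands in for `DynamoDb.ConditionalCheckFailedException`, and all function names are hypothetical.

```python
import uuid
from typing import Callable, Dict

MAX_ATTEMPTS = 10  # mirrors the NumericLessThan: 10 check in "Continue trying?"


class ConditionalCheckFailed(Exception):
    """Stand-in for DynamoDB's ConditionalCheckFailedException."""


def put_if_absent(table: Dict[str, dict], item_id: str, url: str) -> None:
    """Emulate putItem with ConditionExpression attribute_not_exists(Id)."""
    if item_id in table:
        raise ConditionalCheckFailed(item_id)
    table[item_id] = {"Url": url, "HitCount": 0}


def new_hash() -> str:
    """Second group of a random UUID, as in the Create hash state."""
    return str(uuid.uuid4()).split("-")[1]


def create_short_url(
    table: Dict[str, dict], url: str, make_hash: Callable[[], str] = new_hash
) -> str:
    attempts = 0  # set by the Initialise state
    while True:
        candidate = make_hash()  # Create hash
        attempts += 1
        try:
            put_if_absent(table, candidate, url)  # Store URL
            return candidate
        except ConditionalCheckFailed:
            if attempts < MAX_ATTEMPTS:  # Continue trying?
                continue
            raise RuntimeError("no free hash found, giving up")  # Fail
```

Passing a custom `make_hash` makes the collision path easy to exercise without waiting for a real duplicate.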

Accessing a short URL

This state machine is much simpler and contains only a single state that does two things: it increments a hit counter and returns the long URL.

Architecture

The ASL looks like this and uses an SDK integration to DynamoDB:

StartAt: Do redirect
States:
  Do redirect:
    Type: Task
    Resource: arn:aws:states:::aws-sdk:dynamodb:updateItem
    Parameters:
      TableName: ${UrlTable}
      ConditionExpression: attribute_exists(Id)
      ReturnValues: ALL_NEW
      UpdateExpression: SET HitCount = HitCount + :incr
      ExpressionAttributeValues:
        :incr:
          N: "1"
      Key:
        Id:
          S.$: $.hash
    ResultSelector:
      Url.$: $.Attributes.Url.S
    End: true    
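To illustrate what this single state does with its input and output, here is a small Python sketch. It assumes an in-memory dict in place of the DynamoDB table and uses hypothetical function names; the item layout follows DynamoDB's typed attribute format so that the `$.Attributes.Url.S` path in the ResultSelector is easy to trace.

```python
from typing import Dict


class ConditionalCheckFailed(Exception):
    """Stand-in for DynamoDB's ConditionalCheckFailedException."""


def update_item(table: Dict[str, dict], hash_id: str) -> dict:
    """Emulate updateItem: require the item, add 1 to HitCount, return ALL_NEW."""
    if hash_id not in table:  # ConditionExpression: attribute_exists(Id)
        raise ConditionalCheckFailed(hash_id)
    item = table[hash_id]
    new_count = int(item["HitCount"]["N"]) + 1  # DynamoDB numbers travel as strings
    item["HitCount"] = {"N": str(new_count)}
    return {"Attributes": item}  # ReturnValues: ALL_NEW


def do_redirect(table: Dict[str, dict], hash_id: str) -> dict:
    result = update_item(table, hash_id)
    # ResultSelector: Url.$: $.Attributes.Url.S
    return {"Url": result["Attributes"]["Url"]["S"]}
```

An unknown hash raises the conditional check error instead of silently creating a new item, which is the purpose of the `attribute_exists(Id)` condition.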

Hooking the state machines up with API Gateway

At first I wanted to use an HttpApi to enjoy lower latency and lower cost, but it proved really hard to get the request and response mapping working. The main issue was that the output from the state machine comes as stringified JSON, and when using HttpApi, the $util.parseJson() function wasn't available. Shoutout to all Community Builders, and in particular @jimmydahlqvist, who got engaged in the problem <3

After much frustration I switched to a RestApi, which made my life easier. I will not go into details here, but let's zoom in on the request and response mapping. The full OpenAPI spec is referenced at the end of this section.

Create URL (POST /)

responses:
  200:
    statusCode: 200
    responseTemplates:
      application/json: 
        Fn::Sub: "#set($response = $input.path('$'))\n { \"ShortUrl\": \"https://${DomainName}/$util.parseJson($response.output).Hash\" }"
requestTemplates:
  application/json: 
    Fn::Sub: "#set($data = $input.json('$')) { \"input\": \"$util.escapeJavaScript($data)\", \"stateMachineArn\": \"${CreateUrl}\" }"

Redirect to URL (GET /{id})

responses:
  200:
    statusCode: 301
    responseTemplates:
      text/html: "#set($response = $input.path('$'))\n#set($context.responseOverride.header.Location = $util.parseJson($response.output).Url)"
requestTemplates:
  application/json: 
    Fn::Sub: "#set($data = $util.escapeJavaScript($input.params('id'))) { \"input\": \"{ \\\"hash\\\": \\\"$data\\\" }\", \"stateMachineArn\": \"${RedirectToUrl}\" }"

To see the above mappings in context, visit the OpenAPI spec in the full solution on GitHub.
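Since the double encoding in these templates is easy to get wrong, here is a rough Python equivalent of what the mappings produce. It is a sketch, not the VTL itself: the function names, the example ARN and the example domain are all hypothetical. The key point is that the state machine input and output are JSON documents carried as strings inside the outer JSON body, so they are encoded and parsed twice.

```python
import json
from typing import Dict, Tuple


def build_request(payload: dict, state_machine_arn: str) -> str:
    """Request mapping: the payload goes into "input" as a JSON *string*."""
    return json.dumps(
        {"input": json.dumps(payload), "stateMachineArn": state_machine_arn}
    )


def map_create_response(raw_response: str, domain_name: str) -> dict:
    """Response mapping for POST /: parse the body, then parse "output" again."""
    response = json.loads(raw_response)
    output = json.loads(response["output"])  # the role $util.parseJson plays
    return {"ShortUrl": f"https://{domain_name}/{output['Hash']}"}


def map_redirect_response(raw_response: str) -> Tuple[int, Dict[str, str]]:
    """Response mapping for GET /{id}: a 301 with a Location header."""
    output = json.loads(json.loads(raw_response)["output"])
    return 301, {"Location": output["Url"]}


if __name__ == "__main__":
    # Hypothetical ARN, used for illustration only.
    arn = "arn:aws:states:eu-west-1:123456789012:stateMachine:CreateUrl"
    print(build_request({"Url": "https://example.com/a/very/long/path"}, arn))
```

Forgetting the inner `json.dumps` or the second `json.loads` is the Python counterpart of the mapping problems described above.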

Conclusion

This article showed how we can use StepFunctions' new intrinsic functions together with its native SDK service integrations to create a fully functional, yet simple, URL shortener. It certainly comes with some limitations that Lambda can solve, and if you hit them, feel free to extend the workflow with a function.

If you have any improvements, such as converting to HttpApi or introducing better hashing, please submit a pull request.

Building this project, I spent 5% of my time on creating the state machines and 95% on the API Gateway mappings. I'm hoping to see improved SAM support for connecting API Gateway and synchronous StepFunctions Express state machines. Please upvote this issue if you agree.
