Filtering DynamoDB Streams before Lambda

tom_millner

Tom Milner

Posted on November 29, 2021

Event Filtering

AWS recently released a feature that could dramatically reduce your Lambda costs and improve the sustainability of your code. It allows you to filter events from SQS, Kinesis Data Streams and DynamoDB Streams before they invoke your Lambda function. See the official announcement here.

Before the release of this feature, every insert, update and delete operation on items in the source DynamoDB table caused the Lambda function to be invoked. Filtering had to be applied within the function to decide whether further processing should occur. This always seemed inherently wasteful to me, and this new feature is most definitely welcome.

Example

I have a use case where I maintain a running count of items per partition key on a DynamoDB table. To do this, I enabled the Streams feature on the source table and used it to trigger a Lambda function. As I am just counting items, the function only processes INSERT and REMOVE events and does not process MODIFY events. You can read more about it here. I should say that the item holding the counts is on the same table, so every update to that item puts a MODIFY event on the stream. That event in turn triggers the Lambda function again, only for the function to ignore it.
One of the earliest steps in the function is to check for an INSERT or REMOVE event. If the event is neither, the function exits. With the new ability to filter for INSERT and REMOVE events on the stream, I can discard MODIFY events before the Lambda function is invoked. Since every counted INSERT or REMOVE produces one MODIFY on the count item, this reduces the number of Lambda invocations by about 50%.
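
For context, the in-function check that the filter makes unnecessary might look like the sketch below. It is written in Python purely for illustration. The `handler`, `countable_records` and `update_count` names are hypothetical and not taken from the actual function, and the counting logic is a placeholder.

```python
# Event types that change the item count; MODIFY events are ignored.
COUNTED_EVENTS = {"INSERT", "REMOVE"}


def countable_records(event):
    """Return only the stream records that should change the running count."""
    return [
        record
        for record in event.get("Records", [])
        if record.get("eventName") in COUNTED_EVENTS
    ]


def update_count(record):
    """Hypothetical placeholder for the real counting logic."""
    return 1 if record["eventName"] == "INSERT" else -1


def handler(event, context):
    records = countable_records(event)
    if not records:
        # Nothing to count, yet without event filtering the invocation
        # has already happened and is billed.
        return {"processed": 0, "delta": 0}
    delta = sum(update_count(record) for record in records)
    return {"processed": len(records), "delta": delta}
```

With a stream filter in place, the early exit should rarely be reached, although keeping the check as a safeguard costs very little.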

Implementation

It was surprisingly easy to implement the filter: it just meant adding three lines of YAML to my SAM template.

Without filtering

      Events:
        DynamoDBEvent:
          Type: DynamoDB
          Properties:
            Stream:
              !GetAtt DynamoDBTable.StreamArn
            StartingPosition: TRIM_HORIZON
            BatchSize: 1

With filtering

      Events:
        DynamoDBEvent:
          Type: DynamoDB
          Properties:
            Stream:
              !GetAtt DynamoDBTable.StreamArn
            StartingPosition: TRIM_HORIZON
            BatchSize: 1
            FilterCriteria:
              Filters:
                - Pattern: '{"eventName": ["INSERT","REMOVE"]}'
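
Because the Pattern value is a JSON document embedded in a YAML string, quoting mistakes are easy to make. One way to avoid them, sketched here in Python with a hypothetical helper name, is to generate the string with `json.dumps` and paste the result into the template.

```python
import json


def event_name_pattern(*event_names):
    """Build a filter pattern string that matches any of the given eventName values."""
    return json.dumps({"eventName": list(event_names)})


# Print the string to paste into the template, wrapped in single quotes.
print(event_name_pattern("INSERT", "REMOVE"))
```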

Single Table Design

This feature is particularly useful where you combine single table design with DynamoDB Streams. In a single table design, you store several different entities in one table. If you have a Lambda function that targets only one entity in the table, you can now filter for events belonging to that entity and ignore the rest. Filter patterns can match on the item data written to the stream as well as on the event type. Depending on how you designed your partition key or other fields, you can reference them within the record's "dynamodb" field. Key attributes sit under "Keys", and each value is wrapped in its DynamoDB type descriptor, such as "S" for a string.

      FilterCriteria:
        Filters:
          # Assumes pk1 is a string (S) key attribute
          - Pattern: '{"dynamodb": {"Keys": {"pk1": {"S": [{"prefix": "ANIMAL#"}]}}}}'
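
To see what a key-based pattern has to match, it helps to look at the shape of a stream record: key attributes sit under `dynamodb.Keys`, each wrapped in a type descriptor such as `S` for a string. The Python sketch below is a deliberately simplified stand-in for Lambda's filter matching, covering only nested objects, exact values and `prefix` rules, and is meant for checking a pattern locally. The `matches` helper and the sample records are hypothetical, and it assumes `pk1` is a string attribute.

```python
def matches(pattern, data):
    """Simplified filter check: nested objects, exact values and prefix rules only."""
    for key, rule in pattern.items():
        if not isinstance(data, dict) or key not in data:
            return False
        value = data[key]
        if isinstance(rule, dict):
            # Nested object in the pattern: descend into the record.
            if not matches(rule, value):
                return False
        elif not any(_rule_matches(candidate, value) for candidate in rule):
            # A list in the pattern means "any of these rules".
            return False
    return True


def _rule_matches(candidate, value):
    """Check one rule: either a {"prefix": ...} object or an exact value."""
    if isinstance(candidate, dict) and "prefix" in candidate:
        return isinstance(value, str) and value.startswith(candidate["prefix"])
    return candidate == value


pattern = {"dynamodb": {"Keys": {"pk1": {"S": [{"prefix": "ANIMAL#"}]}}}}

# Trimmed-down sample records; real stream records carry more fields.
animal = {"eventName": "INSERT", "dynamodb": {"Keys": {"pk1": {"S": "ANIMAL#cat"}}}}
plant = {"eventName": "INSERT", "dynamodb": {"Keys": {"pk1": {"S": "PLANT#fern"}}}}

print(matches(pattern, animal))  # True
print(matches(pattern, plant))   # False
```

The real filter syntax supports more operators than this sketch, so treat it as a quick sanity check rather than a faithful emulator.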

Sustainability

I love the implication here that this feature reduces the carbon footprint of your code while also saving you money. This is sustainability in action. In my use case, there wasn't any code that could be removed. But I could see that there may be cases where code in a Lambda function could shift left into a filter. This would further reduce the footprint of the function and make it faster to load.
