Build with CDK (Cloud Development Kit) an infra for Image processing using Lambda functions, S3 buckets, and DynamoDB

nivekalara237

Kevin Lactio Kemta

Posted on July 23, 2024

Day 004 - 100DaysAWSIaCDevopsChallenge : Part 2

In this article, I am going to create a cloud architecture that resizes, in a Lambda function, every image object uploaded to an S3 bucket and saves its metadata in DynamoDB tables, following these steps:

  • Emit an event after every action of type s3:ObjectCreated:Put on the S3 bucket
  • A Lambda function captures the above event and then processes it
  • The Lambda function gets the original object by its key
  • If the object is an image file (with extension png, jpeg, jpg, bmp, webp or gif), resize the original image using the Jimp lib docs
  • Store the original and resized image metadata in the DynamoDB tables
  • Finally, store the resized image in another bucket

All the steps will be achieved using CDK (Cloud Development Kit) infrastructure as code.

I recently wrote the same article for Terraform lovers. So if you want to do the same but with Terraform, please refer to 👉🏽 this article

Architecture Diagram

The diagram

Create S3 buckets

The event attached to the bucket will be directed to a Lambda function. To create the S3 event for the Lambda function, we first need to create the bucket. Let's create our buckets using CDK:

const buckets = [
  new s3.CfnBucket(this, 'PicturesBucketResource', {
    bucketName: 'pictures-cdk-<HASH>', // bucket names must be lowercase and globally unique
    publicAccessBlockConfiguration: {
      ignorePublicAcls: true,
      blockPublicAcls: true,
      blockPublicPolicy: true,
      restrictPublicBuckets: true
    },
    objectLockEnabled: false,
    tags: [{key: 'Name', value: `s3:bucket:PicturesBucketResource`}]
  }),
  new s3.CfnBucket(this, 'ThumbnailsBucketResource', {
    bucketName: 'thumbnails-pictures-cdk-<HASH>',
    publicAccessBlockConfiguration: {
      ignorePublicAcls: true,
      blockPublicAcls: true,
      blockPublicPolicy: true,
      restrictPublicBuckets: true
    },
    objectLockEnabled: false,
    tags: [{key: 'Name', value: `s3:bucket:ThumbnailsBucketResource`}]
  })
]

The ignorePublicAcls, blockPublicAcls, blockPublicPolicy and restrictPublicBuckets parameters make the bucket publicly inaccessible.

The objectLockEnabled parameter indicates whether this bucket has an Object Lock configuration enabled; it applies only to new resources.

Now that the bucket is created 🙂, let's attach a trigger to it that will notify the Lambda function when a new object is uploaded to the bucket.

new s3.CfnBucket(this, 'PicturesBucketResource', {
  bucketName: 'pictures-cdk-<HASH>',
  ...
  notificationConfiguration: {
    lambdaConfigurations: [{
      event: 's3:ObjectCreated:*',
      function: lambdaFunctionResource.functionArn
    }]
  },
  ...
}),

Note that lambdaFunctionResource is our Lambda function, which will be created later in the Lambda function section.

⚠️ Note: As mentioned in the AWS Docs, an S3 bucket supports only one notification configuration. To work around this limitation, if you have more than one notification target (Lambda invocation, SNS topic trigger, etc.), I suggest creating a single Lambda notification and, inside that Lambda function, dispatching the event to the other resources (such as other Lambdas, SQS, SNS, etc.), as sketched below.
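Here is a minimal sketch of that dispatch pattern, assuming a hypothetical topic ARN and queue URL provided through environment variables (this is only an illustration, not part of the stack built in this article):

import { SNSClient, PublishCommand } from '@aws-sdk/client-sns'
import { SQSClient, SendMessageCommand } from '@aws-sdk/client-sqs'

const sns = new SNSClient({})
const sqs = new SQSClient({})

// Dispatcher handler: fan the incoming S3 event out to the other consumers.
export const handler = async (event: any) => {
  const payload = JSON.stringify(event)
  // TOPIC_ARN and QUEUE_URL are assumed environment variables.
  await sns.send(new PublishCommand({ TopicArn: process.env.TOPIC_ARN, Message: payload }))
  await sqs.send(new SendMessageCommand({ QueueUrl: process.env.QUEUE_URL, MessageBody: payload }))
}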

Lambda function

Now that the bucket is created and the event notification trigger is properly configured, let's create our Lambda function to catch all messages emitted by the bucket. The function code will perform the following operations:

  • Retrieve the created object - The first operation of our Lambda is to retrieve the created object if it is an image.
  • Resize the image - Use the Jimp library to create a thumbnail of the original object.
  • Upload the resized image to a dedicated S3 bucket - Upload the thumbnail image to another S3 bucket.
  • Save the original and resized image metadata - After the image is resized without error, metadata such as the URL, object key, size, etc., will be stored in two DynamoDB tables: one for the original image and another for the resized image.

Before creating the Lambda function, we need to grant it the necessary permissions to interact with other resources and vice versa:

  • Allow the bucket to invoke the function - lambda:InvokeFunction
  • Allow the Lambda function to get objects from the bucket - s3:GetObject and s3:GetObjectAcl
  • Allow the Lambda function to put items into the DynamoDB tables - dynamodb:PutItem.
Lambda Assume Role

Generate an IAM policy that allows the sts:AssumeRole action for the Service principal lambda.amazonaws.com. To do this, we will use a CDK Level 2 construct instead of a Level 1 construct. ⚠️ Note that Level 1 constructs generally start with Cfn<ResourceName> and map one-to-one to CloudFormation resources, so they are less abstract. Examples: CfnRoute, CfnBucket, etc. Level 2 constructs are more abstract and do not start with Cfn.
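As a quick illustration of the difference, both snippets below declare a bucket (the construct IDs and bucket names are only examples):

// Level 1 (L1): maps one-to-one to the CloudFormation AWS::S3::Bucket resource
new s3.CfnBucket(this, 'ExampleL1Bucket', { bucketName: 'example-l1-bucket' })

// Level 2 (L2): higher-level construct with sensible defaults and helper methods
new s3.Bucket(this, 'ExampleL2Bucket', {
  bucketName: 'example-l2-bucket',
  blockPublicAccess: s3.BlockPublicAccess.BLOCK_ALL
})

Back to our role, defined with the Level 2 iam.Role construct: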

const lambdaRole = new iam.Role(this, "LambdaExecRole", {
  roleName: "s3-lambda-execution-role",
  assumedBy: new iam.ServicePrincipal('lambda.amazonaws.com',{
    region: "us-east-1"
  }),
  ... // more configurations
})


The above block generates the trust policy that allows the Lambda service to assume the role:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "lambda.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
Create IAM Policy for lambda

Let's create a new policy to allow Lambda to:

  • get S3 objects
  • create S3 objects
  • put items into DynamoDB tables.

Let's go back to the Lambda Assume Role section and complete the Role:

const props = {
  picturesBucketName: "pictures-cdk-<HASH>",
  thumbnailsBucketName: "thumbnails-pictures-cdk-<HASH>",
  pictureTableName: "PicturesTable",
  thumbnailTableName: "ThumbnailsTable",
    ...
};
...

const lambdaRole = new iam.Role(this, 'LambdaExecRole', {
  roleName: 's3-lambda-execution-role',
  assumedBy: new iam.ServicePrincipal('lambda.amazonaws.com',{
    region: "us-east-1"
  }),
  path: '/',
  inlinePolicies: {
    logging: new iam.PolicyDocument({
      assignSids: true,
      statements: [
        new iam.PolicyStatement({
          effect: iam.Effect.ALLOW,
          actions: [
            'logs:CreateLogGroup',
            'logs:CreateLogStream',
            'logs:PutLogEvents'
          ],
          resources: ['*']
        })
      ]
    }),
    putItemDynamoDB: new iam.PolicyDocument({
      assignSids: true,
      statements: [
        new iam.PolicyStatement({
          effect: iam.Effect.ALLOW,
          actions: [
            'dynamodb:PutItem'
          ],
          resources: [
            `arn:aws:dynamodb:<REGION>:<ACCOUNT_ID>:table/${props.pictureTableName}`,
            `arn:aws:dynamodb:<REGION>:<ACCOUNT_ID>:table/${props.thumbnailTableName}`
          ]
        })
      ]
    }),
    s3Bucket: new iam.PolicyDocument({
      assignSids: true,
      statements: [
        new iam.PolicyStatement({
          effect: iam.Effect.ALLOW,
          actions: [
            's3:PutObject'
          ],
          resources: [
            `arn:aws:s3:::${props.thumbnailsBucketName}`,
            `arn:aws:s3:::${props.thumbnailsBucketName}/*`
          ]
        }),
        new iam.PolicyStatement({
          effect: iam.Effect.ALLOW,
          actions: [
            's3:GetObject',
            's3:GetObjectAcl'
          ],
          resources: [
            `arn:aws:s3:::${props.picturesBucketName}`,
            `arn:aws:s3:::${props.picturesBucketName}/*`
          ]
        })
      ]
    })
  }
});

In the policy, we have also allowed Lambda to log its activities to CloudWatch, which lets us visualize everything that happens inside the Lambda, as shown below:

logging: new iam.PolicyDocument({
  assignSids: true,
  statements: [
    new iam.PolicyStatement({
      effect: iam.Effect.ALLOW,
      actions: [
        'logs:CreateLogGroup',
        'logs:CreateLogStream',
        'logs:PutLogEvents'
      ],
      resources: ['*']
    })
  ]
})
Create the lambda function

Before creating the Lambda construct, we first need to write the code that will be executed inside the function. Below is the organization and content of the directory:

./assets
   |___lambda
       |___ node_modules/
       |___ index.ts
       |___ package.json     

The content of assets/lambda/package.json file:

{
  "main": "index.ts",
  "type": "module",
  "scripts": {},
  "dependencies": {
    "@aws-sdk/client-dynamodb": "^3.613.0",
    "@aws-sdk/client-s3": "^3.613.0",
    "@aws-sdk/lib-dynamodb": "^3.613.0",
    "@jimp/plugin-resize": "^0.22.12",
    "jimp": "^0.22.12",
    "uuid": "^10.0.0"
  },
  "devDependencies": {
    "@types/uuid": "^10.0.0"
  }
}

Run npm install inside the assets/lambda directory to install dependencies.

cd assets/lambda
npm install

The function source code assets/lambda/index.ts:

import { GetObjectCommand, PutObjectCommand, S3Client } from '@aws-sdk/client-s3'
import { DynamoDBClient } from '@aws-sdk/client-dynamodb'
import { DynamoDBDocumentClient, PutCommand } from '@aws-sdk/lib-dynamodb'
import { v4 as UUID } from 'uuid'
import Jimp from 'jimp'


const region = process.env.REGION || 'us-east-1'
const thumbsDestBucket = process.env.THUMBS_BUCKET_NAME
const picturesTableName = process.env.DYNAMODB_PICTURES_TABLE_NAME
const thumbnailsTableName = process.env.DYNAMODB_THUMBNAILS_PICTURES_TABLE_NAME
const s3Client = new S3Client({
  region
})
const dynClient = new DynamoDBClient({
  region: region
})
const documentClient = DynamoDBDocumentClient.from(dynClient)

export const handler = async (event: any, context: any) => {
  for (const _record of [...event.Records || []]) {
    const bucket = _record.s3.bucket.name
    const objKey = decodeURIComponent(_record.s3.object.key.replace(/\+/g, ' '))

    if (/[\/.](jpeg|png|jpg|gif|webp|bmp)$/.test(objKey)) {
      try {
        const originalObject = await s3Client.send(new GetObjectCommand({
          Bucket: bucket,
          Key: objKey
        }))
        console.log('Get S3 Object: [OK]')
        const imageBody = await originalObject.Body?.transformToByteArray()
        const image = await Jimp.read(Buffer.from(imageBody!.buffer))
        const thumbnail = await image.resize(128, Jimp.AUTO)
          .getBufferAsync(Jimp.MIME_PNG)
        console.log('Image resized: [OK]')
        await s3Client.send(new PutObjectCommand({
          Bucket: thumbsDestBucket,
          Key: objKey,
          Body: thumbnail
        }))
        console.log('Put resized image into S3 bucket: [OK]')
        const itemPictureCommand = new PutCommand({
          TableName: picturesTableName,
          Item: {
            ID: UUID(),
            ObjectKey: objKey,
            BucketName: bucket,
            Region: region,
            CreatedAt: Math.floor(Date.now() / 1000),
            FileSize: _record.s3.object.size
          }
        })
        await documentClient.send(itemPictureCommand)
        console.log('Put original metadata into DynamoDB Table: [OK]');
        const itemThumbCommand = new PutCommand({
          TableName: thumbnailsTableName,
          Item: {
            ID: UUID(),
            ObjectKey: objKey,
            BucketName: thumbsDestBucket,
            Region: region,
            CreatedAt: Math.floor(Date.now() / 1000),
            FileSize: thumbnail.byteLength
          }
        })

        await documentClient.send(itemThumbCommand)
        console.log('Put resized metadata into DynamoDB Table: [OK]')
        console.debug({
          statusCode: 200,
          body: JSON.stringify({
            object: `${bucket}/${objKey}`,
            thumbs: `${thumbsDestBucket}/${objKey}`
          })
        })
      } catch (e) {
        console.log(e)
        console.debug({
          statusCode: 500,
          body: JSON.stringify(e)
        })
      }
    } else {
      console.log('The object type is not supported. Supported image types are: jpeg, png, jpg, gif, webp, or bmp')
    }
  }
}

🚨🚨 If you want to process large images, I recommend using Sharp instead of Jimp: Sharp is much faster because it uses native code and the highly efficient libvips processing library, and it is well suited for high-performance workloads.
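As a rough sketch, the resize step could then look like this with Sharp (assuming imageBody is the byte array read from the original S3 object, as in the handler above):

import sharp from 'sharp'

// Produce a 128px-wide PNG thumbnail; the height scales automatically.
const thumbnail = await sharp(Buffer.from(imageBody!))
  .resize({ width: 128 })
  .png()
  .toBuffer()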

Let's return to the CDK. Now that the source code is ready, we can create our CDK Lambda function resource.

const lambdaFunction = new NodejsFunction(this, 'zeLambdaFunction', {
  functionName: 'ProcessingImageFunction',
  role: lambdaRole,
  handler: 'handler',
  runtime: lambda.Runtime.NODEJS_20_X,
  architecture: lambda.Architecture.ARM_64,
  timeout: Duration.seconds(10),
  memorySize: 512,
  bundling: {
    minify: false,
    format: OutputFormat.CJS,
    logLevel: LogLevel.VERBOSE
  },
  entry: './assets/lambda/index.ts',
  environment: {
    REGION: 'us-east-1',
    TRIGGER_BUCKET_NAME: props.picturesBucketName,
    THUMBS_BUCKET_NAME: props.thumbnailsBucketName,
    DYNAMODB_THUMBNAILS_PICTURES_TABLE_NAME: props.thumbnailTableName,
    DYNAMODB_PICTURES_TABLE_NAME: props.pictureTableName
  },
  allowPublicSubnet: false
})

Tags.of(lambdaFunction).add('Name', `lambda:ProcessingImageFunction`)

⚠️⚠️ Note: Every time index.ts changes, CDK will redeploy the Lambda resource on the next deployment.

And the last construct, and the most important one in our Lambda section, is the Lambda permission, which grants the S3 bucket permission to invoke the Lambda for all object-created events:

const permission = new lambda.CfnPermission(this, 'LambdaPermission', {
  functionName: lambdaFunction.functionArn,
  sourceArn: `arn:aws:s3:::${props.picturesBucketName}`,
  action: 'lambda:InvokeFunction',
  principal: 's3.amazonaws.com',
  sourceAccount: '<ACCOUNT_ID>' // your AWS account ID, not a region
})

Tell CDK to create the Lambda function and the permission construct before creating the bucket, to avoid deployment errors.

const cfnFunctionResource = lambdaFunction.node.findChild('Resource') as lambda.CfnFunction;
pictureBucket?.addDependency(permission)
pictureBucket?.addDependency(cfnFunctionResource)

Create DynamoDB tables

We are now going to create two DynamoDB tables to persist information about the original object and the resized image. Since the Lambda function is already granted dynamodb:PutItem, let's define those tables:

const tables = [props.pictureTableName, props.thumbnailTableName]
  .map(tbName => {
    return new dynamoDB.CfnTable(this, `DynamoDBTable-${tbName}`, {
      tableName: tbName,
      billingMode: 'PAY_PER_REQUEST',
      tableClass: 'STANDARD',
      keySchema: [{
        keyType: 'HASH',
        attributeName: 'ID'
      }, {
        keyType: 'RANGE',
        attributeName: 'ObjectKey'
      }],
      attributeDefinitions: [{
        attributeName: 'ID',
        attributeType: 'S'
      }, {
        attributeName: 'ObjectKey',
        attributeType: 'S'
      }],
      tags: [{key: 'Name', value: `dynamodb:${tbName}`}]
    })
  })
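With all the constructs in place, a typical workflow (assuming a standard, already bootstrapped CDK project; the image file name is just an example) looks like this:

cdk synth    # inspect the generated CloudFormation template
cdk deploy   # deploy the stack

# then test by uploading an image to the pictures bucket
aws s3 cp ./photo.jpg s3://pictures-cdk-<HASH>/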

🥳✨woohaah!!!
We have reached the end of the article.
Thank you so much 🙂


You can find the full source code in the GitHub Repo


If you feel more comfortable with Terraform, feel free to consult the same article that I wrote for Terraform IaC. 👉🏽 Terraform

Feel free to leave a comment if you need more clarification or if you encounter any issues during the execution of your code.
