An AWS Summer: EFS & Lambda + Serverless Framework
Adolfo Estevez
Posted on October 5, 2020
The autumn equinox has just passed, which makes it a perfect moment to look back and review some of the features AWS released this past summer - in no particular order, just because I think they are cool - and useful :)
Serverless challenges
If you've been developing serverless applications for a while, I'm pretty sure you have faced a few challenges apart from the old cold start problem - which has been solved to a great extent by the Provisioned Concurrency feature.
For instance, let's say you need to load large rule files consumed by a Lambda function that implements a rules engine, or you need to keep data files produced dynamically by the function between invocations. Lambda provides some local space - 512MB in /tmp - that you may use, but it's small and ephemeral, so it is not useful for those kinds of scenarios.
Other solutions come to mind: storing the data in a data store such as RDS, DynamoDB or S3 ... but these come at a high price in development effort, performance and cost. What would happen if we had peaks of several hundred - or thousands - of requests per second, loading big files at startup and writing files to a data store concurrently?
Well, at the very least, we could take a big performance hit, depending on the size of the files, the latency of retrieving them at startup plus the cold start of the Lambdas - enter provisioned concurrency - plus the latency of storing the intermediate files in the data stores - storing to and retrieving from S3 is not the same as doing it with DynamoDB.
So no alternative? Well, we are in luck, as AWS released EFS support for Lambda in June!
Amazon EFS is widely known, so I'm not going to delve deep into the service. Just to mention that Amazon Elastic File System provides an NFS file system that scales on demand, providing high throughput and low latency. It's very useful when shared storage and parallel access from several services are needed.
Configuration & Considerations
"With power comes responsibility", or in our case, powerful features come with some configuration constraints. EFS is reached through mount targets placed in subnets within a VPC, which means that our Lambda functions have to run within a VPC as well. That comes with a price: IP addressing, a possible performance hit, and loss of direct connectivity to AWS services outside the VPC, so a NAT Gateway or VPC endpoints (PrivateLink interface endpoints or gateway endpoints) might be needed, depending on the use case.
That constraint was vastly improved last year when Hyperplane ENIs for Lambda were released, allowing just a few ENIs - and therefore a few IPs - to handle a large number of Lambda invocations, decoupling function scaling from ENI provisioning.
Configuration - Serverless Framework
The configuration of a Lambda function running within a VPC can be fairly simple - if it only needs to access the VPC resources - and goes under the vpc label in serverless.yml.
It needs a security group for the Lambda function, the IDs of the subnet(s) where the ENI(s) will be placed, and permissions to create, delete, and describe network interfaces.
The Lambda function is now running within our VPC, with an ENI placed in each selected subnet, but in order to access the EFS instance a few more permissions need to be granted.
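The original screenshots are not reproduced here, so as a rough sketch of the permissions involved, the snippet below assembles an IAM policy document in Python. The action names are the standard EC2 and EFS client actions; the function name `build_policy` and the sample ARN are hypothetical, and whether you also need `elasticfilesystem:ClientRootAccess` depends on your setup.

```python
import json

# Actions the execution role needs so Lambda can manage its ENIs in the VPC.
VPC_ACTIONS = [
    "ec2:CreateNetworkInterface",
    "ec2:DescribeNetworkInterfaces",
    "ec2:DeleteNetworkInterface",
]

# Actions that let an NFS client mount and write to an EFS file system.
EFS_ACTIONS = [
    "elasticfilesystem:ClientMount",
    "elasticfilesystem:ClientWrite",
]


def build_policy(file_system_arn: str) -> dict:
    """Return an IAM policy document covering VPC networking and EFS access."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            # The ENIs are created by Lambda, so their ARNs are not known in advance.
            {"Effect": "Allow", "Action": VPC_ACTIONS, "Resource": "*"},
            {"Effect": "Allow", "Action": EFS_ACTIONS, "Resource": file_system_arn},
        ],
    }


if __name__ == "__main__":
    # Hypothetical ARN, for illustration only.
    arn = "arn:aws:elasticfilesystem:eu-west-1:123456789012:file-system/fs-12345678"
    print(json.dumps(build_policy(arn), indent=2))
```

The same statements could be written directly in the IAM section of serverless.yml; Python is used here only to keep the example self-contained.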
Now the EFS file system can be created within the VPC. The console, CloudFormation, Serverless, the AWS CLI, an AWS SDK, etc ... could be used for that.
After creating the instance, an access point needs to be provided to give applications access. This is a new resource: "AWS::EFS::AccessPoint". It can be created from the console, or through a CloudFormation file - we will need to supply the EFS ID: ${self.provider}.
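If you prefer the AWS SDK route mentioned above, the following is a minimal sketch of creating the access point with boto3. It is not the configuration used in this post: the helper names, the POSIX uid/gid of 1000, the "/lambda" root directory and the "750" permissions are all assumptions chosen for illustration.

```python
import uuid


def build_access_point_request(file_system_id: str, uid: int = 1000,
                               gid: int = 1000, path: str = "/lambda") -> dict:
    """Build keyword arguments for the EFS CreateAccessPoint API call."""
    return {
        "ClientToken": str(uuid.uuid4()),  # idempotency token
        "FileSystemId": file_system_id,
        # POSIX identity enforced for every client using this access point.
        "PosixUser": {"Uid": uid, "Gid": gid},
        # Directory exposed as the root; EFS creates it if it is missing.
        "RootDirectory": {
            "Path": path,
            "CreationInfo": {
                "OwnerUid": uid,
                "OwnerGid": gid,
                "Permissions": "750",
            },
        },
    }


def create_access_point(file_system_id: str) -> str:
    """Create the access point and return its ARN (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the helper above works without boto3

    efs = boto3.client("efs")
    response = efs.create_access_point(**build_access_point_request(file_system_id))
    return response["AccessPointArn"]
```

Keeping the request builder separate from the API call makes it easy to inspect what will be sent before touching a real account.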
Finally, we link the file system to the Lambda function, providing the ARN of the EFS, the ARN of the access point, and the local mount path.
The EFS instance is ready to be accessed by the Lambda function :)
Solution
I have used the Serverless Framework to produce the solution - but AWS SAM with Cloud9, as the official alternative, could have been used instead. I have quite a bit of experience with Serverless, having introduced it to a few companies - including Everis - with great success.
Let's create - or transfer - a rules file that can be accessed from the Lambda function :)
Different services could be used to transfer the files, like AWS DataSync, an EC2 instance, or even creating the files from code. The files we transfer from EC2 are accessible from the Lambda functions, so we'll use this method.
After the EC2 instance has been created - a t2.micro is enough - in one of the subnets of the VPC that has access to the EFS ENIs, a directory will be needed: /efs. That directory doesn't have any link to the EFS instance yet, so we'll need to mount the file system on it.
One way to do it is using the EFS tools:
sudo yum install -y amazon-efs-utils
An access point was created previously that we can use to mount the directory. It's easy to get the command line needed from the web console. Just go to Amazon EFS > Access Point > id link, and press the Attach button.
After mounting the directory, the files can be transferred to the /efs directory.
At this point, access to the directory from the Lambda function should work. I have coded a minimal Lambda function that lists the files contained in the directory.
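The function itself was shown as a screenshot, so here is a minimal sketch of what such a handler could look like, written in Python as an assumption (the runtime is not stated in this post). The `EFS_MOUNT_PATH` environment variable and the `/mnt/efs` default are hypothetical; note that Lambda requires the local mount path to start with /mnt/.

```python
import json
import os

# Assumption: the mount path is passed in through an environment variable.
# Lambda requires local mount paths to start with /mnt/.
MOUNT_PATH = os.environ.get("EFS_MOUNT_PATH", "/mnt/efs")


def list_files(directory: str) -> list:
    """Return the sorted names of the regular files directly inside directory."""
    return sorted(
        entry.name for entry in os.scandir(directory) if entry.is_file()
    )


def handler(event, context):
    """Lambda entry point: respond with the files found on the EFS mount."""
    try:
        body = {"path": MOUNT_PATH, "files": list_files(MOUNT_PATH)}
        status = 200
    except OSError as error:
        # Covers a missing mount, permission problems, and similar I/O errors.
        body = {"error": str(error)}
        status = 500
    # Response shape expected by an API Gateway proxy integration.
    return {"statusCode": status, "body": json.dumps(body)}
```

Returning a 500 with the error message instead of raising makes a misconfigured mount visible straight from the HTTP response while testing.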
The solution is now ready to be deployed. Keep in mind that I have only shown parts of the serverless.yml, equivalent to the CloudFormation file you might use to provision the infrastructure - I will leave that to you as an exercise.
serverless deploy --stage dev --region eu-west-1
The framework provides a URL, as I created an API Gateway that invokes the Lambda function.
I have captured the request trace from CloudWatch Logs, where we can see the files in /efs - test.txt and rules.txt - and the low latency of the request.
Other Use Cases
- Loading big libraries that Lambda layers can't handle.
- Files that are updated regularly.
- Files that need locks for concurrent access.
- Access to big files - zip / unzip.
- Using different computing architectures - EC2, ECS - to process the same files.
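To illustrate the locking use case from the list above, this is a small sketch of appending to a shared file under an exclusive advisory lock. It assumes a Python runtime and relies on the POSIX byte-range locks exposed by `fcntl.lockf`, which NFSv4 file systems such as EFS support; the function name `append_with_lock` is hypothetical.

```python
import fcntl
import os


def append_with_lock(path: str, line: str) -> None:
    """Append one line to path while holding an exclusive advisory lock."""
    with open(path, "a") as handle:
        fcntl.lockf(handle, fcntl.LOCK_EX)  # blocks until the lock is free
        try:
            handle.write(line.rstrip("\n") + "\n")
            handle.flush()
            os.fsync(handle.fileno())  # push the data out before unlocking
        finally:
            fcntl.lockf(handle, fcntl.LOCK_UN)
```

The lock is advisory, so it only protects the file if every writer (Lambda, EC2, ECS) goes through the same locking code.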