Developing serverless REST APIs with API Gateway and AWS Lambda is now common for tasks like form submissions or backend workflows. With Amazon allowing the API Gateway timeout to be extended beyond 29 seconds, these APIs can now handle complex workflows such as long-running machine learning predictions and generative AI tasks.

In this article, we’ll explore how to leverage the AWS Lambda Power Tuning tool to optimize serverless REST APIs for both performance and cost efficiency.

What Is AWS Lambda Power Tuning?

AWS Lambda Power Tuning is a state machine built with AWS Step Functions that runs a Lambda function at different memory allocations, measures cost and duration for each, and recommends the configuration that reduces cost or improves performance.
It works with any Lambda runtime.

Getting Started

Choose your deployment option from the guide and deploy the AWS Lambda Power Tuning tool. Once deployed, the tool's state machine appears in AWS Step Functions.

Executing the Tool

Key Input Parameters

The key parameters for executing the state machine are listed below:

Parameter	Description
lambdaARN	Required. ARN of the Lambda function to optimize.
num	Required. Number of invocations per power configuration (min: 5, recommended: 10–100).
powerValues	Optional. Memory values to test (128MB–10,240MB). Defaults to values set at deployment.
payload	Optional. Request payload for the API. Can be a single static payload used for every invocation, or a list of payloads with relative weights.
payloadS3	Optional. S3 object location of the payload, for larger payloads (>256KB).
parallelInvocation	Runs all invocations in parallel if set to true (default: false).
strategy	Can be "cost", "speed", or "balanced". With "cost" the tool suggests the cheapest option; with "speed" it suggests the fastest option. With "balanced" it chooses a compromise between cost and speed according to the parameter "balancedWeight".
balancedWeight	Represents the trade-off between cost and speed. Value is between 0 and 1, where 0.0 is equivalent to the "speed" strategy and 1.0 is equivalent to the "cost" strategy (default: 0.5).
preProcessorARN	ARN of a Lambda function to run before each invocation of the target function.
postProcessorARN	ARN of a Lambda function to run after each invocation of the target function.
includeOutputResults	Includes average cost and duration for each configuration in the final output (default: false).
onlyColdStarts	Forces all invocations to be cold starts.

For a detailed description of these parameters, refer to the official documentation.
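
As a rough illustration, the limits listed in the table can be checked before an execution is started. The sketch below is not part of the tool: `validate_tuning_input` and `start_tuning` are hypothetical helper names, and `sfn_client` is assumed to be a boto3 Step Functions client (or any object exposing a compatible `start_execution` method).

```python
import json

VALID_STRATEGIES = {"cost", "speed", "balanced"}


def validate_tuning_input(tuning_input):
    """Return a list of problems found in a Power Tuning input dict.

    Hypothetical helper: it only checks the limits listed in the table above.
    """
    problems = []
    if not tuning_input.get("lambdaARN"):
        problems.append("lambdaARN is required")
    num = tuning_input.get("num")
    if not isinstance(num, int) or num < 5:
        problems.append("num is required and must be at least 5")
    power_values = tuning_input.get("powerValues", [])
    if isinstance(power_values, list):  # only explicit lists are checked here
        for value in power_values:
            if not 128 <= value <= 10240:
                problems.append(f"powerValues entry {value} is outside 128-10240 MB")
    strategy = tuning_input.get("strategy")
    if strategy is not None and strategy not in VALID_STRATEGIES:
        problems.append(f"unknown strategy: {strategy}")
    weight = tuning_input.get("balancedWeight")
    if weight is not None and not 0 <= weight <= 1:
        problems.append("balancedWeight must be between 0 and 1")
    return problems


def start_tuning(sfn_client, state_machine_arn, tuning_input):
    """Validate the input, then start one execution of the state machine."""
    problems = validate_tuning_input(tuning_input)
    if problems:
        raise ValueError("; ".join(problems))
    response = sfn_client.start_execution(
        stateMachineArn=state_machine_arn,
        input=json.dumps(tuning_input),  # Step Functions expects a JSON string
    )
    return response["executionArn"]
```

With boto3 installed, the client would typically be created with `boto3.client("stepfunctions")` and passed in as `sfn_client`. Injecting the client keeps the helper easy to exercise without AWS credentials.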

Inputs

Let’s explore how to configure inputs for executing the AWS Lambda Power Tuning tool in a scenario where API Gateway is configured with proxy integration to expose AWS Lambda as a REST API.

Consider the following input:

{
  "lambdaARN": "<arn of the function being executed>",
  "powerValues": [ 128, 256, 512, 1024, 1536, 2048, 2560, 3072],
  "num": 10,
  "strategy": "speed",
  "payload": {...},
  "parallelInvocation": true,
  "includeOutputResults": true,
  "onlyColdStarts": true
}

Here's a breakdown of the input parameters and their effects:

lambdaARN: Specifies the ARN (Amazon Resource Name) of the Lambda function being tested.
powerValues: The function will be executed with the following memory allocations: 128 MB, 256 MB, 512 MB, 1024 MB, 1536 MB, 2048 MB, 2560 MB, and 3072 MB.
num: 10 parallel executions will be performed for each memory allocation.
strategy: In this case, the speed strategy is selected, meaning the tool will focus on finding the allocation that minimizes execution time.
payload: The data passed as input to the Lambda function during each invocation. Refer to "Input format of a Lambda function for proxy integration" for a full description of the Lambda input. The input changes based on the HTTP method (GET, PUT, POST, DELETE, etc.).
parallelInvocation: The tool will invoke the Lambda function in parallel for each memory allocation, allowing for faster testing across configurations.
includeOutputResults: The average cost and average duration for every memory allocation will be included in the state machine output.
onlyColdStarts: Ensures that all invocations are cold starts.
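
To make the `payload` field more concrete, here is a minimal sketch that builds a proxy-integration event and wraps it in the state machine input shown above. The helper names, the `/orders` endpoint, and the ARN are placeholders for illustration; a real proxy event carries more fields than the ones filled in here.

```python
import json


def build_proxy_event(method, path, body=None, query=None, path_params=None):
    """Build a minimal API Gateway REST proxy-integration event.

    Only commonly read fields are filled in; a real event carries more
    (requestContext, multi-value headers, stage variables, ...). For routes
    with path parameters, "resource" would hold the template, e.g. /orders/{id}.
    """
    return {
        "resource": path,
        "path": path,
        "httpMethod": method,
        "headers": {"Content-Type": "application/json"},
        "queryStringParameters": query,
        "pathParameters": path_params,
        # With proxy integration the body arrives as a string, not an object.
        "body": json.dumps(body) if body is not None else None,
        "isBase64Encoded": False,
    }


def build_tuning_input(lambda_arn, event):
    """Wrap a proxy event in the state machine input shown above."""
    return {
        "lambdaARN": lambda_arn,
        "powerValues": [128, 256, 512, 1024, 1536, 2048, 2560, 3072],
        "num": 10,
        "strategy": "speed",
        "payload": event,
        "parallelInvocation": True,
        "includeOutputResults": True,
        "onlyColdStarts": True,
    }


# Placeholder ARN and a hypothetical /orders endpoint, for illustration only.
event = build_proxy_event("POST", "/orders", body={"item": "book", "qty": 2})
tuning_input = build_tuning_input(
    "arn:aws:lambda:us-east-1:123456789012:function:orders-api", event
)
print(json.dumps(tuning_input, indent=2))
```

The detail most likely to trip up a first run is the `body`: the handler behind a proxy integration expects a JSON string, so passing a nested object there would not reflect what API Gateway actually sends.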

Sample Inputs

The table below links to sample inputs on GitHub for testing various request types.

HTTP Method	GitHub URL
POST	Click for sample input
PUT	Click for sample input
GET	Click for sample input
GET (With Path Parameters)	Click for sample input
GET (With QueryString)	Click for sample input
DELETE	Click for sample input
PATCH	Click for sample input

Weighted Payloads

The tool also lets you define multiple payloads for an HTTP method, which suits scenarios where payload structures vary significantly and can affect performance or cost. Refer to "Weighted Payloads" in the official documentation to understand how weighted payloads work.

HTTP Method	GitHub URL
POST (With Weighted Payloads)	Click for sample input
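
As a sketch of the idea, a weighted payload is a list of entries, each holding a `payload` and a relative `weight`. The helper names and request bodies below are hypothetical, and the share calculation is only a proportional estimate; the tool's own rounding may differ.

```python
def weighted_payloads(entries):
    """Turn (payload, weight) pairs into the list form used for `payload`."""
    return [{"payload": payload, "weight": weight} for payload, weight in entries]


def expected_shares(weighted, num):
    """Estimate how many of `num` invocations each payload would receive.

    Simple proportional estimate (weight / total weight); the tool's own
    rounding may differ.
    """
    total = sum(entry["weight"] for entry in weighted)
    return [num * entry["weight"] / total for entry in weighted]


# Hypothetical request events for a POST endpoint; weights are relative.
small = {"httpMethod": "POST", "path": "/orders", "body": '{"items": 1}'}
large = {"httpMethod": "POST", "path": "/orders", "body": '{"items": 500}'}

payload = weighted_payloads([(small, 30), (large, 10)])
print(expected_shares(payload, num=20))  # → [15.0, 5.0]
```

Choosing weights that mirror production traffic (here, three small requests for every large one) keeps the averages reported by the tool closer to what real users would experience.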

Pre/post-processing functions

The tool can also run custom logic before and after each execution of the Lambda function. This logic must be implemented as Lambda functions. Refer to "Pre/Post-processing functions" in the official documentation to understand how pre/post-processing functions work.

HTTP Method	GitHub URL
POST (With Pre/Post functions)	Click for sample input
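
A pre-processor might look like the sketch below. It assumes, as the tool's documentation describes, that the pre-processor receives the original payload and that a non-empty return value replaces it for the target invocation. The `orderId` field and the use case (avoiding duplicate-key errors across repeated test invocations) are illustrative assumptions.

```python
import json
import uuid


def handler(event, context):
    """Hypothetical pre-processor: give every test request a unique order ID.

    `event` is assumed to be the original payload from the tuning input, i.e.
    an API Gateway proxy event whose `body` is a JSON string.
    """
    body = json.loads(event.get("body") or "{}")
    body["orderId"] = str(uuid.uuid4())  # illustrative field name
    # Return a new event so the original payload is left untouched.
    return {**event, "body": json.dumps(body)}
```

The function's ARN would then be supplied as `preProcessorARN` in the state machine input. A post-processor follows the same handler shape and could, for example, delete the records created during the test.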

Output

A sample output is shown below:

{
  "output": {
    "power": 2048,
    "cost": 0.0000018816000000000001,
    "duration": 54.95933333333334,
    "stateMachine": {
      "executionCost": 0.00075,
      "lambdaCost": 0.0013002423000000002,
      "visualization": "https://lambda-power-tuning.show/#encodeddata"
    },
    "stats": [
      {
        "value": 128,
        "averagePrice": 9.345000000000001e-7,
        "averageDuration": 443.89949999999993
      },
      {
        "value": 256,
        "averagePrice": 8.358000000000001e-7,
        "averageDuration": 198.4030000000001
      },
      {
        "value": 512,
        "averagePrice": 7.56e-7,
        "averageDuration": 88.51350000000001
      },
      {
        "value": 1024,
        "averagePrice": 0.0000011256,
        "averageDuration": 66.2415
      },
      {
        "value": 1536,
        "averagePrice": 0.000001512,
        "averageDuration": 58.86233333333332
      },
      {
        "value": 2048,
        "averagePrice": 0.0000018816000000000001,
        "averageDuration": 54.95933333333334
      },
      {
        "value": 2560,
        "averagePrice": 0.0000023940000000000003,
        "averageDuration": 56.222
      },
      {
        "value": 3072,
        "averagePrice": 0.0000028728000000000007,
        "averageDuration": 56.312333333333335
      }
    ]
  }
}

A brief description of the output fields is given below:

Key	Description
output.power	The optimal memory configuration (RAM).
output.cost	The corresponding average cost (per invocation).
output.duration	The corresponding average duration (per invocation).
output.stateMachine.executionCost	The AWS Step Functions cost corresponding to this state machine execution (fixed value for "worst" case).
output.stateMachine.lambdaCost	The AWS Lambda cost corresponding to this state machine execution (depending on number of executions and average execution time).
output.stateMachine.visualization	A URL to visualize and inspect average statistics about cost and performance. Note: Average statistics are NOT shared with the server, as all data is encoded in the URL hash, client-side only.
output.stats	The average duration and cost for every tested power value configuration. Only included if `includeOutputResults` is set to a truthy value.
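
When `includeOutputResults` is enabled, the `stats` array can also be inspected programmatically. The following sketch uses the values from the sample output above (rounded); the helper names are hypothetical and the comparison is a simple ratio, not the tool's own selection logic.

```python
# "stats" entries from the sample output above, with values rounded.
stats = [
    {"value": 128, "averagePrice": 9.345e-7, "averageDuration": 443.90},
    {"value": 256, "averagePrice": 8.358e-7, "averageDuration": 198.40},
    {"value": 512, "averagePrice": 7.56e-7, "averageDuration": 88.51},
    {"value": 1024, "averagePrice": 1.1256e-6, "averageDuration": 66.24},
    {"value": 1536, "averagePrice": 1.512e-6, "averageDuration": 58.86},
    {"value": 2048, "averagePrice": 1.8816e-6, "averageDuration": 54.96},
    {"value": 2560, "averagePrice": 2.394e-6, "averageDuration": 56.22},
    {"value": 3072, "averagePrice": 2.8728e-6, "averageDuration": 56.31},
]


def fastest(stats):
    """Configuration with the lowest average duration."""
    return min(stats, key=lambda s: s["averageDuration"])


def cheapest(stats):
    """Configuration with the lowest average price per invocation."""
    return min(stats, key=lambda s: s["averagePrice"])


def compare(stats, baseline_mb, candidate_mb):
    """Speedup and cost ratio of one memory setting relative to another."""
    by_value = {s["value"]: s for s in stats}
    base, cand = by_value[baseline_mb], by_value[candidate_mb]
    return {
        "speedup": base["averageDuration"] / cand["averageDuration"],
        "cost_ratio": cand["averagePrice"] / base["averagePrice"],
    }


print(fastest(stats)["value"])   # → 2048
print(cheapest(stats)["value"])  # → 512
print(compare(stats, baseline_mb=512, candidate_mb=2048))
```

For this sample, 2048 MB is the fastest setting (matching `output.power` under the "speed" strategy), while 512 MB is the cheapest. Moving from 512 MB to 2048 MB is roughly 1.6 times faster at about 2.5 times the per-invocation cost, which is the kind of trade-off the "balanced" strategy is meant to weigh.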

Visualizing the output

The output.stateMachine.visualization element contains a URL of the form https://lambda-power-tuning.show/#encodeddata that opens a chart of the results.

The visualization UI is also open source: https://github.com/matteo-ronchetti/aws-lambda-power-tuning-ui

Optimizing Serverless REST APIs with AWS Lambda Power Tuning

Sabarish Sathasivan