AWS SnapStart - Part 23 Measuring cold and warm starts with Java 17 using asynchronous HTTP clients
Vadym Kazulkin
Posted on June 27, 2024
Introduction
In the previous parts, we did many measurements with AWS Lambda using the Java 17 runtime, with and without AWS SnapStart, and additionally with SnapStart and priming of the DynamoDB invocation:
- cold starts using different deployment artifact sizes
- cold starts and deployment time using different Lambda memory settings
- warm starts using different Lambda memory settings
- cold and warm starts using different compilation options
- cold and warm starts using different synchronous HTTP clients
In this article, we'll add another dimension to our Java 17 measurements: the choice of the asynchronous HTTP Client implementation. AWS's own offering, the asynchronous CRT HTTP client, has been generally available since February 2023.
I will also compare it with the same measurements for Java 21 already performed in the article Measuring cold and warm starts with Java 21 using different asynchronous HTTP clients.
Measuring cold and warm starts with Java 17 using asynchronous HTTP clients
In our experiment, we'll re-use the application introduced in part 8 and rewrite it to use an asynchronous HTTP client. You can find the application code here. There are basically 2 Lambda functions, which both respond to API Gateway requests and retrieve a product from DynamoDB by the id received from the API Gateway. One Lambda function, GetProductByIdWithPureJava17AsyncLambda, can be used with and without SnapStart, and the second one, GetProductByIdWithPureJava17AsyncLambdaAndPriming, uses SnapStart and DynamoDB request invocation priming. We give both Lambda functions 1024 MB of memory.
There are 2 asynchronous HTTP Client implementations available in the AWS SDK for Java:
- NettyNioAsync (Default)
- AWS CRT (asynchronous)
This is also the order in which the SDK looks up and sets the asynchronous HTTP Client found in the classpath.
Let's figure out how to configure such an asynchronous HTTP Client. There are 2 places to do it: pom.xml and DynamoProductDao.
Let's consider 2 scenarios:
Scenario 1) NettyNioAsync HTTP Client. Its configuration looks like this:
In pom.xml the only enabled HTTP Client dependency has to be:
<dependency>
    <groupId>software.amazon.awssdk</groupId>
    <artifactId>netty-nio-client</artifactId>
</dependency>
In DynamoProductDao, the DynamoDbAsyncClient should be created like this:
DynamoDbAsyncClient.builder()
    .region(Region.EU_CENTRAL_1)
    .httpClient(NettyNioAsyncHttpClient.create())
    .overrideConfiguration(ClientOverrideConfiguration.builder()
        .build())
    .build();
Scenario 2) AWS CRT asynchronous HTTP Client. Its configuration looks like this:
In pom.xml the only enabled HTTP Client dependency has to be:
<dependency>
    <groupId>software.amazon.awssdk</groupId>
    <artifactId>aws-crt-client</artifactId>
</dependency>
In DynamoProductDao, the DynamoDbAsyncClient should be created like this:
DynamoDbAsyncClient.builder()
    .region(Region.EU_CENTRAL_1)
    .httpClient(AwsCrtAsyncHttpClient.create())
    .overrideConfiguration(ClientOverrideConfiguration.builder()
        .build())
    .build();
For the sake of simplicity, we create all asynchronous HTTP Clients with their default settings. Of course, there is potential for optimization by figuring out the right settings.
Using the DynamoDbAsyncClient means that we'll be using the asynchronous programming model, so the invocation of getItem returns a CompletableFuture. This is the code to retrieve the item itself (see the application code for the complete version):
CompletableFuture<GetItemResponse> getItemResponseAsync =
    dynamoDbClient.getItem(GetItemRequest.builder()
        .key(Map.of("PK", AttributeValue.builder().s(id).build()))
        .tableName(PRODUCT_TABLE_NAME)
        .build());
// join() blocks until the asynchronous call has completed
GetItemResponse getItemResponse = getItemResponseAsync.join();
if (getItemResponse.hasItem()) {
    return Optional.of(ProductMapper.productFromDynamoDB(getItemResponse.item()));
} else {
    return Optional.empty();
}
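To illustrate the difference between blocking on the future with join() and composing on it, here is a minimal stand-alone sketch. It does not use the AWS SDK: fakeGetItem and the in-memory TABLE are hypothetical stand-ins for the asynchronous getItem call and the DynamoDB table, so treat it as an illustration of the CompletableFuture model rather than as the application's code.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.CompletableFuture;

// Stand-alone sketch: fakeGetItem is a hypothetical stand-in for
// dynamoDbClient.getItem(...), so no AWS SDK is needed to run it.
final class AsyncLookupSketch {

    // Hypothetical in-memory "table" replacing DynamoDB.
    private static final Map<String, String> TABLE = Map.of("1", "Print 10x13");

    // Completes on another thread, similar in shape to the SDK's async call.
    static CompletableFuture<Optional<String>> fakeGetItem(String id) {
        return CompletableFuture.supplyAsync(() -> Optional.ofNullable(TABLE.get(id)));
    }

    // Blocking style, as in the snippet above: join() waits for the result.
    static Optional<String> getProductBlocking(String id) {
        return fakeGetItem(id).join();
    }

    // Non-blocking style: the mapping runs once the future completes.
    static CompletableFuture<String> getProductNameAsync(String id) {
        return fakeGetItem(id).thenApply(item -> item.orElse("not found"));
    }

    public static void main(String[] args) {
        System.out.println(getProductBlocking("1"));
        System.out.println(getProductNameAsync("42").join());
    }
}
```

A Lambda handler that returns its response synchronously still has to wait for the result at some point, which is why the snippet above calls join() directly.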
The results of the experiment below are based on reproducing more than 100 cold and approximately 100,000 warm starts, with the experiment running for approximately 1 hour. For it (and for the experiments from my previous article) I used the load test tool hey, but you can use whatever tool you want, like Serverless-artillery or Postman. I ran these experiments for both scenarios, each with 2 different compilation options in template.yaml:
- no options (tiered compilation will take place)
- JAVA_TOOL_OPTIONS: "-XX:+TieredCompilation -XX:TieredStopAtLevel=1" (client compilation without profiling)
We found out in the article Measuring cold and warm starts with Java 17 using different compilation options that both of them gave us the lowest cold and warm start times. We also got good results with the "-XX:+TieredCompilation -XX:TieredStopAtLevel=2" compilation option, but I haven't done any measurements with this option yet.
Let's look into the results of our measurements. In the tables, the "c" columns show cold start times and the "w" columns show warm start times.
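The tables below report percentiles (p50 to p99.9) and the maximum. As a reminder of what these columns mean, here is a minimal sketch of a nearest-rank percentile calculation. It is only an illustration: the sample durations are made up, and this is not necessarily the method that produced the numbers in this article.

```java
import java.util.Arrays;

// Sketch of a nearest-rank percentile over a set of measured durations.
final class PercentileSketch {

    // Returns the smallest sample such that at least p percent
    // of all samples are less than or equal to it (0 < p <= 100).
    static double percentile(double[] samples, double p) {
        if (samples.length == 0 || p <= 0 || p > 100) {
            throw new IllegalArgumentException("need samples and 0 < p <= 100");
        }
        double[] sorted = samples.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p * sorted.length / 100.0);
        return sorted[rank - 1];
    }

    public static void main(String[] args) {
        // Hypothetical warm start durations in ms, including one outlier.
        double[] durations = {5.6, 6.1, 6.5, 7.0, 7.9, 9.4, 24.3, 59.1, 6.3, 2475.7};
        System.out.println("p50 = " + percentile(durations, 50));
        System.out.println("p90 = " + percentile(durations, 90));
        System.out.println("max = " + percentile(durations, 100));
    }
}
```

A single slow invocation barely moves the median but dominates the maximum, which is why the "w max" values in the tables can be far above the "w p99" values.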
Cold and warm start time with compilation option "tiered compilation" without SnapStart enabled in ms:
Scenario (HTTP Client) | c p50 | c p75 | c p90 | c p99 | c p99.9 | c max | w p50 | w p75 | w p90 | w p99 | w p99.9 | w max |
---|---|---|---|---|---|---|---|---|---|---|---|---|
NettyNioAsync | 3760.75 | 3800.16 | 3898.23 | 4101.46 | 4254.09 | 4410.89 | 6.51 | 7.51 | 9.38 | 24.30 | 59.11 | 2475.66 |
AWS CRT | 2313.42 | 2346.89 | 2399.7 | 2502.56 | 2670.43 | 2812.78 | 5.68 | 6.45 | 7.69 | 20.33 | 69.90 | 975.35 |
Cold and warm start time with compilation option "-XX:+TieredCompilation -XX:TieredStopAtLevel=1" (client compilation without profiling) without SnapStart enabled in ms:
Scenario (HTTP Client) | c p50 | c p75 | c p90 | c p99 | c p99.9 | c max | w p50 | w p75 | w p90 | w p99 | w p99.9 | w max |
---|---|---|---|---|---|---|---|---|---|---|---|---|
NettyNioAsync | 3708.13 | 3773.56 | 3812.51 | 3854.03 | 4019.23 | 4198.23 | 6.21 | 7.16 | 8.80 | 22.81 | 57.27 | 2377.48 |
AWS CRT | 2331.25 | 2377.14 | 2451.72 | 2598.25 | 2756.01 | 2934.43 | 5.73 | 6.51 | 8.00 | 21.07 | 72.66 | 1033.18 |
Cold and warm start time with compilation option "tiered compilation" with SnapStart enabled without Priming in ms:
Scenario (HTTP Client) | c p50 | c p75 | c p90 | c p99 | c p99.9 | c max | w p50 | w p75 | w p90 | w p99 | w p99.9 | w max |
---|---|---|---|---|---|---|---|---|---|---|---|---|
NettyNioAsync | 2324.19 | 2380.61 | 2625.60 | 2864.13 | 2892.90 | 2895.29 | 6.72 | 7.87 | 9.99 | 26.31 | 1683.66 | 1991.13 |
AWS CRT | 1206.47 | 1348.03 | 1613.74 | 1716.90 | 1778.03 | 1779.76 | 5.73 | 6.51 | 8.00 | 22.45 | 692.16 | 997.82 |
Cold and warm start time with compilation option "-XX:+TieredCompilation -XX:TieredStopAtLevel=1" (client compilation without profiling) with SnapStart enabled without Priming in ms:
Scenario (HTTP Client) | c p50 | c p75 | c p90 | c p99 | c p99.9 | c max | w p50 | w p75 | w p90 | w p99 | w p99.9 | w max |
---|---|---|---|---|---|---|---|---|---|---|---|---|
NettyNioAsync | 2260.04 | 2338.17 | 2586.53 | 2847.01 | 2972.03 | 2972.72 | 6.51 | 7.63 | 9.53 | 25.09 | 1657.15 | 2132.46 |
AWS CRT | 1225.92 | 1306.90 | 1618.58 | 1846.86 | 1856.11 | 1857.26 | 5.64 | 6.40 | 7.87 | 22.09 | 703.24 | 1069.55 |
Cold and warm start time with compilation option "tiered compilation" with SnapStart enabled and with DynamoDB invocation Priming in ms:
Scenario (HTTP Client) | c p50 | c p75 | c p90 | c p99 | c p99.9 | c max | w p50 | w p75 | w p90 | w p99 | w p99.9 | w max |
---|---|---|---|---|---|---|---|---|---|---|---|---|
NettyNioAsync | 744.49 | 821.10 | 996.80 | 1130.58 | 1255.68 | 1256.49 | 6.21 | 7.16 | 8.94 | 23.17 | 158.16 | 351.03 |
AWS CRT | 677.05 | 731.94 | 983.93 | 1279.75 | 1282.32 | 1283.5 | 5.82 | 6.72 | 8.26 | 23.92 | 171.22 | 1169.44 |
Cold and warm start time with compilation option "-XX:+TieredCompilation -XX:TieredStopAtLevel=1" (client compilation without profiling) with SnapStart enabled and with DynamoDB invocation Priming in ms:
Scenario (HTTP Client) | c p50 | c p75 | c p90 | c p99 | c p99.9 | c max | w p50 | w p75 | w p90 | w p99 | w p99.9 | w max |
---|---|---|---|---|---|---|---|---|---|---|---|---|
NettyNioAsync | 697.66 | 747.47 | 967.35 | 1137.38 | 1338.63 | 1339.04 | 6.41 | 7.51 | 9.38 | 23.54 | 155.67 | 224.87 |
AWS CRT | 694.18 | 779.51 | 1017.94 | 1234.52 | 1243.19 | 1243.38 | 5.64 | 6.41 | 7.87 | 21.40 | 171.22 | 891.36 |
Conclusion
Our measurements revealed that the values for "tiered compilation" and "-XX:+TieredCompilation -XX:TieredStopAtLevel=1" (client compilation without profiling) are close to each other. We observed the same with Java 21.
In terms of the HTTP Client choice, the AWS CRT Async HTTP Client outperformed the NettyNio Async HTTP Client by far for cold start times and was also faster for warm start times at most percentiles. The only exception was SnapStart enabled with priming, where the results were quite close. We observed the same with Java 21.
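To put a number on this, the relative reduction of the cold start p50 can be computed from the "tiered compilation" tables above. The following is a small sketch; the helper name reductionPercent is my own, and the input values are copied from the tables.

```java
// Sketch: relative cold start reduction of AWS CRT vs. NettyNioAsync,
// using the "c p50" values of the "tiered compilation" tables.
final class ColdStartComparison {

    // Percentage by which candidateMs is lower than baselineMs.
    static double reductionPercent(double baselineMs, double candidateMs) {
        return (baselineMs - candidateMs) / baselineMs * 100.0;
    }

    public static void main(String[] args) {
        System.out.printf("without SnapStart:          %.1f%%%n",
                reductionPercent(3760.75, 2313.42));
        System.out.printf("SnapStart without priming:  %.1f%%%n",
                reductionPercent(2324.19, 1206.47));
        System.out.printf("SnapStart with priming:     %.1f%%%n",
                reductionPercent(744.49, 677.05));
    }
}
```

For the median cold start, this gives a reduction of roughly 38% without SnapStart, roughly 48% with SnapStart but without priming, and only about 9% with priming.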
Comparing Java 17 and Java 21 individually, we see lower cold starts for Java 21 in the cases where SnapStart is not enabled, and where it is enabled but priming is not applied. If priming is applied, the cold starts for Java 17 and Java 21 are very close to each other.
Warm start times for Java 17 and Java 21 are very close to each other for all use cases, with some deviations in both directions for the higher percentiles, which might depend on the experiment.
To see the full measurements for Java 21 please read my article Measuring cold and warm starts with Java 21 using different asynchronous HTTP clients.
Can we reduce the cold start a bit further? In the "Conclusion" section of the previous article, Measuring cold and warm starts with Java 17 using synchronous HTTP clients, we described how to reduce the deployment artifact size, and therefore the cold start time, for the AWS CRT synchronous HTTP Client. The same can be applied to the asynchronous use case. This looks especially promising: for the AWS CRT client, we can define a classifier (e.g. linux-x86_64) in our POM file to pick only the relevant binary for our platform and reduce the size of the package. See here for the detailed explanation. In this article, I measured the cold and warm starts only with the uber-jar containing binaries for all platforms, so please set the classifier and re-measure for your platform. Be aware that currently not all platforms/architectures, like aarch_64, support SnapStart.
The choice of HTTP Client is not only about minimizing cold and warm starts. The decision is much more complex and also depends on the functionality of the HTTP Client implementation and its settings, like whether it supports HTTP/2. AWS has published a decision tree showing which HTTP client to choose depending on these criteria.