Test Observability for AWS Lambda with Grafana Tempo and OpenTelemetry Layers
Oscar Reyes
Posted on June 26, 2024
I got great feedback on my Pulitzer award-winning blog post, "Testing AWS Lambda & Serverless with OpenTelemetry". The community wanted a guide on using the official OpenTelemetry Lambda layers instead of a custom TypeScript wrapper.
I decided to write this follow-up, but to spice it up a little, today I'm using Grafana Cloud, which has become one of my favorite tools! We use it extensively at Tracetest for our internal tracing, metrics, profiling, and overall observability.
See the full code for the example app you'll build in the GitHub repo, here.
OpenTelemetry Lambda Layers
With a decade of development experience, one thing I've learned is that no-code solutions help save time and delegate maintenance and implementation to a third party. It becomes even better when it's free and from the OpenTelemetry community!
There are two different layers we will use today:
- The Node.js auto-instrumentation for AWS Lambda enables tracing for your functions without writing a single line of code, as described in the official OpenTelemetry docs, here and on GitHub, here.
- The OpenTelemetry collector AWS Lambda layer enables the setup to be 100% serverless without any need to maintain infrastructure yourself. You still need to pay for it though.
Grafana Cloud
Grafana Cloud has become a staple tool to store everything related to observability under one umbrella. It allows integration with different tools like Prometheus for metrics or Loki for logs.
In this case, I'll use Tempo, a well-known tracing backend where you store the OpenTelemetry spans generated by the Lambda functions.
Trace-based testing everywhere and for everyone!
Trace-based testing involves running validations against the telemetry data generated by the distributed system's instrumented services.
Tracetest, as an observability-enabled testing tool for Cloud Native architectures, leverages these distributed traces as part of testing, providing better visibility and testability to run trace-based tests.
The Service under Test
Who said Pokemon? We truly love them at Tracetest, so today we have a new way of playing with the PokeAPI!
Using the Serverless Framework, I'll guide you through implementing a Lambda function that sends a request to the PokeAPI to grab Pokemon data by id and then stores it in a DynamoDB table.
Nothing fancy, but this will be enough to demonstrate how powerful instrumenting your Serverless functions and adding trace-based testing on top can be!
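To make the flow concrete, here is a minimal sketch of what such a handler's core logic could look like. It is not the code from the example repo: the `Pokemon` shape, the `importPokemon` signature, and the injected `fetchJson` and `saveItem` functions are all hypothetical, chosen so the sketch runs without the AWS SDK or network access. Only the PokeAPI URL format comes from the test definition later in this post.

```typescript
// Hypothetical shapes; the real handler in the example repo may differ.
type Pokemon = { id: number; name: string };
type FetchJson = (url: string) => Promise<unknown>;
type SaveItem = (item: Pokemon) => Promise<void>;

// URL format taken from the test spec later in this post.
const pokeApiUrl = (id: number): string =>
  `https://pokeapi.co/api/v2/pokemon/${id}`;

// Step 1: fetch the Pokemon from the PokeAPI.
// Step 2: store the trimmed-down record (DynamoDB in the real app).
async function importPokemon(
  id: number,
  fetchJson: FetchJson,
  saveItem: SaveItem
): Promise<Pokemon> {
  const raw = (await fetchJson(pokeApiUrl(id))) as { id: number; name: string };
  const pokemon: Pokemon = { id: raw.id, name: raw.name };
  await saveItem(pokemon);
  return pokemon;
}

// Usage with in-memory stubs instead of the PokeAPI and DynamoDB.
const store: Pokemon[] = [];
importPokemon(
  25,
  async () => ({ id: 25, name: "pikachu" }),
  async (item) => {
    store.push(item);
  }
).then((pokemon) => console.log(`stored ${pokemon.name}`));
```

Passing the HTTP and storage calls in as functions is only a convenience for this sketch. With the Lambda layers described below, the real outbound HTTP and DynamoDB calls are the ones that get traced.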
Requirements
Tracetest Account
- Sign up to app.tracetest.io or follow the get started docs.
- Create an environment.
- Select Application is publicly accessible to get access to the environment's Tracetest Cloud Agent endpoint.
- Select Tempo as the tracing backend.
- Fill in the details of your Grafana Cloud Tempo instance by using the HTTP integration. Check out the tracing backend resource definition, here.
- Test the connection and save it to finish the process.
AWS
- Have access to an AWS Account.
- Install and configure the AWS CLI.
- Use a role that is allowed to provision the required resources.
What are the steps to run it myself?
If you want to jump straight ahead and run this example yourself, follow the steps below.
First, clone the Tracetest repo.
git clone https://github.com/kubeshop/tracetest.git
cd tracetest/examples/quick-start-serverless-layers
Then, follow the instructions to run the deployment and the trace-based tests:
- Copy the .env.template file to .env.
- Fill the TRACETEST_API_TOKEN value with the one generated for your Tracetest environment.
- Set the Tracetest tracing backend to Tempo. Fill in the details of your Grafana Cloud Tempo instance by using the HTTP integration, including a header that looks like authorization: Basic <base 64 encoded>. The value should be the base64 encoding of username:token. Follow this guide to learn how. Also, check out this tracing backend resource definition. You can apply it with the Tracetest CLI like this: tracetest apply datastore -f ./tracetest-tracing-backend.yaml
- Fill the authorization header in the collector.yaml file from your Grafana Tempo setup. It should be base64-encoded with the format username:token. Follow this guide to learn how.
- Run npm i.
- Run the Serverless Framework deployment with npm run deploy. Use the API Gateway endpoint from the output in your test below.
- Run the trace-based tests with npm test https://<api-gateway-id>.execute-api.us-east-1.amazonaws.com
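The authorization header value used in the steps above is standard HTTP Basic auth: the string username:token, base64-encoded and prefixed with "Basic ". As a small sketch, this is one way to produce it in Node.js. The helper name `basicAuthHeader` and the sample credentials are made up for illustration; use the username and token from your own Grafana Cloud Tempo setup.

```typescript
// Hypothetical helper: builds the value for the `authorization` header
// from a username and an access token.
function basicAuthHeader(username: string, token: string): string {
  const encoded = Buffer.from(`${username}:${token}`, "utf8").toString("base64");
  return `Basic ${encoded}`;
}

// Placeholder credentials, not real ones.
console.log(basicAuthHeader("user", "pass")); // prints "Basic dXNlcjpwYXNz"
```

The same value goes into both the Tracetest tracing backend definition and the collector.yaml file.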
Now, let's dive into the nitty-gritty details.
The Observability Setup
Instrumenting a Lambda function is easier than ever. Depending on your AWS region, add the ARNs of the OpenTelemetry Collector layer and the Node.js tracer layer.
# serverless.yaml
functions:
  api:
    # Handler and events definition
    handler: src/handler.importPokemon
    events:
      - httpApi:
          path: /import
          method: post
    # ARN of the layers
    layers:
      - arn:aws:lambda:us-east-1:184161586896:layer:opentelemetry-nodejs-0_6_0:1
      - arn:aws:lambda:us-east-1:184161586896:layer:opentelemetry-collector-amd64-0_6_0:1
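Since the layer ARN embeds the region, deploying somewhere other than us-east-1 means swapping that segment. The sketch below shows the ARN structure using the values from the snippet above. The helper name `otelLayerArn` is hypothetical, and it assumes the same layer name and version are published in your target region, which you should confirm against the OpenTelemetry Lambda releases before relying on it.

```typescript
// Publisher account id as it appears in the ARNs used in this post.
const OTEL_LAYER_ACCOUNT = "184161586896";

// Hypothetical helper: assembles a Lambda layer ARN from its parts.
// Assumption: the layer name and version exist in the given region.
function otelLayerArn(region: string, layerName: string, version: number): string {
  return `arn:aws:lambda:${region}:${OTEL_LAYER_ACCOUNT}:layer:${layerName}:${version}`;
}

// Rebuilds the two ARNs from the serverless.yaml snippet for another region.
console.log(otelLayerArn("eu-west-1", "opentelemetry-nodejs-0_6_0", 1));
console.log(otelLayerArn("eu-west-1", "opentelemetry-collector-amd64-0_6_0", 1));
```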
Next, add a couple of environment variables to configure the start of the handler functions and the configuration for the OpenTelemetry collector.
# serverless.yaml
environment:
  OPENTELEMETRY_COLLECTOR_CONFIG_FILE: /var/task/collector.yaml
  AWS_LAMBDA_EXEC_WRAPPER: /opt/otel-handler
The opentelemetry-nodejs layer spins up the Node.js tracer, configures the supported auto-instrumentation libraries, and sets up the context propagators.
The opentelemetry-collector layer, in turn, spins up a version of the Collector that runs in the same execution context as the Lambda function, configured by the collector.yaml file.
# collector.yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: "0.0.0.0:4317"
      http:
        endpoint: "0.0.0.0:4318"
exporters:
  otlp:
    endpoint: tempo-us-central1.grafana.net:443
    headers:
      authorization: Basic <your base64 encoded token>
service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [otlp]
Easy peasy lemon squeezy, right? This is everything you need to do to start your observability journey!
For every trace, there should be a test!
With the observability setup in place, it's time to take it to the next level by running some trace-based tests. This is our test case:
- Execute an HTTP request against the import Pokemon service.
- This is a two-step process that includes a request to the PokeAPI to grab the Pokemon data.
- Then, it executes the required database operations to store the Pokemon data in DynamoDB.
What are the key parts we want to validate?
- Validate that the external service from the worker is called with the proper POKEMON_ID and returns 200.
- Validate that the duration of the DB operations is less than 100ms.
- Validate that the response from the initial API Gateway request is 200.
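Conceptually, the three validations above are predicates over the spans of a single trace. The sketch below spells them out against a simplified, made-up span shape. It is an illustration of the idea only, not how Tracetest evaluates selectors and assertions internally, and the `Span` type, function names, and sample trace are all hypothetical.

```typescript
// Hypothetical, simplified span shape for illustration only.
type Span = {
  name: string;
  type: "http" | "database" | "faas";
  durationMs: number;
  attributes: Record<string, string | number>;
};

// Every database span finished in under the given limit.
function databaseSpansUnder(spans: Span[], limitMs: number): boolean {
  return spans
    .filter((s) => s.type === "database")
    .every((s) => s.durationMs < limitMs);
}

// Every HTTP span reports the given status code.
function httpSpansHaveStatus(spans: Span[], status: number): boolean {
  return spans
    .filter((s) => s.type === "http")
    .every((s) => s.attributes["http.status_code"] === status);
}

// Some HTTP span called the PokeAPI with the expected Pokemon id.
function pokeApiCalledWith(spans: Span[], pokemonId: number): boolean {
  const expected = `https://pokeapi.co/api/v2/pokemon/${pokemonId}`;
  return spans.some((s) => s.type === "http" && s.attributes["http.url"] === expected);
}

// Made-up trace with the three kinds of spans this test cares about.
const sampleTrace: Span[] = [
  { name: "POST /import", type: "http", durationMs: 420, attributes: { "http.status_code": 200 } },
  {
    name: "GET",
    type: "http",
    durationMs: 180,
    attributes: { "http.status_code": 200, "http.url": "https://pokeapi.co/api/v2/pokemon/25" },
  },
  { name: "DynamoDB.PutItem", type: "database", durationMs: 35, attributes: {} },
];

console.log(
  databaseSpansUnder(sampleTrace, 100),
  httpSpansHaveStatus(sampleTrace, 200),
  pokeApiCalledWith(sampleTrace, 25)
);
```

In the real test, each of these checks is expressed as a selector plus an assertion in the test definition, and Tracetest runs them against the trace it fetches from Tempo.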
Running the Trace-Based Tests
To run the tests, we are using the @tracetest/client NPM package. It allows teams to enhance existing validation pipelines written in JavaScript or TypeScript by including trace-based tests in their toolset.
The code can be found in the tracetest.ts file.
import Tracetest from '@tracetest/client';
import { TestResource } from '@tracetest/client/dist/modules/openapi-client';
import { config } from 'dotenv';

// Load TRACETEST_API_TOKEN from the .env file.
config();

const { TRACETEST_API_TOKEN = '' } = process.env;

// The API Gateway URL is the first CLI argument; only its origin is kept.
const [raw = ''] = process.argv.slice(2);
let url = '';

try {
  url = new URL(raw).origin;
} catch (error) {
  console.error(
    'The API Gateway URL is required as an argument. i.e: `npm test https://75yj353nn7.execute-api.us-east-1.amazonaws.com`'
  );
  process.exit(1);
}

// Test definition: the HTTP trigger plus the trace-based assertions.
const definition: TestResource = {
  type: 'Test',
  spec: {
    id: 'ZV1G3v2IR',
    name: 'Serverless: Import Pokemon',
    trigger: {
      type: 'http',
      httpRequest: {
        method: 'POST',
        url: '${var:ENDPOINT}/import',
        body: '{"id": "${var:POKEMON_ID}"}\n',
        headers: [
          {
            key: 'Content-Type',
            value: 'application/json',
          },
        ],
      },
    },
    specs: [
      {
        selector: 'span[tracetest.span.type="database"]',
        name: 'All Database Spans: Processing time is less than 100ms',
        assertions: ['attr:tracetest.span.duration < 100ms'],
      },
      {
        selector: 'span[tracetest.span.type="http"]',
        name: 'All HTTP Spans: Status code is 200',
        assertions: ['attr:http.status_code = 200'],
      },
      {
        selector:
          'span[name="tracetest-serverless-dev-api"] span[tracetest.span.type="http" name="GET" http.method="GET"]',
        name: 'The request matches the pokemon Id',
        assertions: ['attr:http.url = "https://pokeapi.co/api/v2/pokemon/${var:POKEMON_ID}"'],
      },
    ],
  },
};

const main = async () => {
  const tracetest = await Tracetest(TRACETEST_API_TOKEN);
  const test = await tracetest.newTest(definition);

  // Run the test with the endpoint and a random Pokemon id between 1 and 100.
  await tracetest.runTest(test, {
    variables: [
      {
        key: 'ENDPOINT',
        value: url.trim(),
      },
      {
        key: 'POKEMON_ID',
        value: `${Math.floor(Math.random() * 100) + 1}`,
      },
    ],
  });

  console.log(await tracetest.getSummary());
};

main();
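Two small details in the script above are easy to miss, so here they are in isolation. This is only a sketch: the helper names `toOrigin` and `randomPokemonId` are hypothetical and do not appear in the example repo, and the hostname is a placeholder. The expressions inside them are the ones the script uses.

```typescript
// `new URL(raw).origin` keeps scheme and host and drops any path, so passing
// the endpoint with or without `/import` yields the same ENDPOINT value.
// An empty or malformed argument makes `new URL` throw, which is what the
// script's try/catch relies on to print the usage hint.
function toOrigin(raw: string): string {
  return new URL(raw).origin;
}

// Same expression as in the script: an integer from 1 to 100 inclusive.
function randomPokemonId(): number {
  return Math.floor(Math.random() * 100) + 1;
}

console.log(toOrigin("https://example.execute-api.us-east-1.amazonaws.com/import"));
// prints "https://example.execute-api.us-east-1.amazonaws.com"
console.log(randomPokemonId());
```

Because the path is stripped, the test definition can always append `/import` to `${var:ENDPOINT}` without producing a doubled path.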
Get True Test Observability
Make sure to apply the Tempo tracing backend in Tracetest. Create your Basic auth token, and use this resource file for reference. View the tracetest-tracing-backend.yaml resource file on GitHub, here.
type: DataStore
spec:
  id: tempo-cloud
  name: Tempo
  type: tempo
  tempo:
    type: http
    http:
      url: https://tempo-us-central1.grafana.net/tempo
      headers:
        authorization: Basic <base 64 encoded>
      tls: {}
Apply the resource with the Tracetest CLI.
tracetest config -t TRACETEST_API_TOKEN
tracetest apply datastore -f ./tracetest-tracing-backend.yaml
Or, add it manually in the Tracetest Web UI.
With everything set up and the trace-based tests executed against the PokeAPI, we can now view the complete results.
Run the test with the command below.
npm test https://<api-gateway-id>.execute-api.us-east-1.amazonaws.com
Follow the links provided in the npm test command output to find the full results, which include the generated trace and the test specs validation results.
[Output]
> tracetest-serverless@1.0.0 test
> ENDPOINT="$(sls info --verbose | grep HttpApiUrl | sed s/HttpApiUrl\:\ //g)" ts-node tracetest.ts https://<api-gateway-id>.execute-api.us-east-1.amazonaws.com/import
Run Group: #618f9cda-a87e-4e35-a9f4-10cfbc6f570f (https://app.tracetest.io/organizations/ttorg_ced62e34638d965e/environments/ttenv_a613d93805243f83/run/618f9cda-a87e-4e35-a9f4-10cfbc6f570f)
Failed: 0
Succeed: 1
Pending: 0
Runs:
✔ Serverless: Import Pokemon (https://app.tracetest.io/organizations/ttorg_ced62e34638d965e/environments/ttenv_a613d93805243f83/test/ZV1G3v2IR/run/22) - trace id: d111b18ca75fb6dbf170b66d963363f9
Find the trace in Grafana Cloud Tempo
The full list of spans generated by the AWS Lambda function can be found in your Tempo instance. These are the same spans displayed in the Tracetest App after it fetches them from Tempo.
Join the demo organization where you can start playing around with the Serverless example with no setup!
From the Tracetest test run view, you can view the list of spans generated by the Lambda function, their attributes, and the test spec results, which validate the key points.
Key Takeaways
Simplified Observability with OpenTelemetry Lambda Layers
In this post, I've highlighted how using OpenTelemetry Lambda layers allows for automatic tracing without additional code, making it easier than ever to set up observability for your Serverless applications.
Powerful Integration with Grafana Cloud
Grafana Cloud has become an essential tool in our observability toolkit. By leveraging Grafana Tempo for tracing, we can store and analyze OpenTelemetry spans effectively, showcasing the seamless integration and its benefits.
Enhanced Trace-Based Testing with Tracetest
Tracetest is a game-changer for trace-based testing. By validating telemetry data from our instrumented services, it provides unparalleled visibility and testability, empowering us to ensure our distributed systems perform as expected.
Would you like to learn more about Tracetest and what it brings to the table? Check the docs and try it out today by signing up for free!
Also, please feel free to join our Slack community, give Tracetest a star on GitHub, or schedule a time to chat 1:1.