Making a Totally Free Uptime Monitor using a Worker Runtime and OpenTelemetry
Grunet
Posted on June 3, 2024
Table of Contents
- What is an Uptime Monitor and When to Use One?
- Traditional Options
- Using a Worker Runtime and OpenTelemetry
- Takeaway
What is an Uptime Monitor and When to Use One?
An uptime monitor is a tool that periodically (e.g. every minute) checks your application or API to gauge if it’s up and healthy.
If you have true observability and are using SLOs effectively, you probably don’t need one. But if you’re not at that level yet, an uptime monitor can be a valuable source of information about the reliability of your application or API.
Traditional Options
There are a number of ways to run an uptime monitor. For example:
- Running a cron job on a server/VM and using bash, curl, and webhooks
- Setting up an Eventbridge cron with Container/Lambda targets and webhooks
- Paying for a 3rd party service (e.g. Pingdom)
Each of these comes with its own downsides, though:
- Maintenance (e.g. security patching, keeping away from end-of-life states)
- Complexity (e.g. setting up IaC, CI/CD)
- Cost
Is there an option that avoids these downsides?
Using a Worker Runtime and OpenTelemetry
I contend there is: a worker runtime combined with OpenTelemetry.
The High-Level Solution
At a high level, the solution works as follows:
- Use a cron from a worker runtime
- Have the worker hit the application or API endpoint
- Gather instrumentation about the network call with OpenTelemetry
- Send that OpenTelemetry instrumentation to an observability backend
- Use the observability backend to alert on unhealthy traffic
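To make the middle steps concrete, here is a minimal sketch of a single check written as plain JavaScript. The `checkEndpoint` helper, its injectable `fetchImpl` parameter, and the 10-second timeout are assumptions made for illustration and are not part of the setup described in this post. Treating any status of 400 or above as unhealthy mirrors the alert condition used later on.

```javascript
// Hypothetical helper (not from the original post): performs one uptime
// check and classifies the outcome. `fetchImpl` is injectable so the sketch
// can be exercised without touching the network.
async function checkEndpoint(url, fetchImpl = fetch, timeoutMs = 10000) {
  const startedAt = Date.now();
  try {
    // Abort the request if the endpoint does not answer within timeoutMs.
    const response = await fetchImpl(url, {
      signal: AbortSignal.timeout(timeoutMs),
    });
    return {
      healthy: response.status < 400,
      status: response.status,
      durationMs: Date.now() - startedAt,
    };
  } catch (error) {
    // Network failures and timeouts never produce a status code,
    // so they are reported as unhealthy with a null status.
    return {
      healthy: false,
      status: null,
      durationMs: Date.now() - startedAt,
      error: String(error),
    };
  }
}
```

One thing this sketch highlights: a request that fails outright has no status code at all, so an alert keyed only on status codes may be worth pairing with one that looks for errored or missing traces.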
The High-Level Setup Steps
These steps will use Cloudflare Workers for the worker runtime, but something similar can be done with Deno Deploy as well.
- Create a free Cloudflare account
- Create a worker with the following code and the Node.js compatibility flag

    // instrument() wraps the handler with OpenTelemetry tracing
    import { instrument } from '@microlabs/otel-cf-workers'

    const handler = {
      // Runs on every cron tick and requests the monitored endpoint
      async scheduled(event, env, ctx) {
        await fetch(env.ENDPOINT_TO_MONITOR)
      }
    }

    // Exports the resulting traces to Honeycomb
    const config = (env, _trigger) => {
      return {
        exporter: {
          url: 'https://api.honeycomb.io/v1/traces',
          headers: { 'x-honeycomb-team': env.HONEYCOMB_API_KEY },
        },
        service: { name: env.ENDPOINT_NAME },
      }
    }

    export default instrument(handler, config)

- Add an environment variable named “ENDPOINT_TO_MONITOR” with the endpoint to check, and another named “ENDPOINT_NAME” with a friendly name for the endpoint
- In a free Honeycomb account, create an environment named “Uptime Monitors” and create an ingest key
- Back in Cloudflare, copy that ingest key into a Cloudflare Workers secret named “HONEYCOMB_API_KEY”
- Add a cron of “* * * * *” (every minute) to the worker
- Confirm that traces are appearing every minute in Honeycomb
- In Honeycomb, create a trigger (alert) based on the query

    COUNT > 0 where http.response.status_code >= 400

- Route the trigger’s notifications as needed (e.g. to Slack)
You should now have a functioning uptime monitor for your endpoint.
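If you would rather keep the Cloudflare settings in a file than click through the dashboard, they could be sketched in a `wrangler.toml` roughly like this. This assumes you deploy with Cloudflare's Wrangler CLI, which the steps above do not require. The worker name, entry point path, compatibility date, and example endpoint values are placeholders, not values from this post.

```toml
# Sketch only: name, main, compatibility_date, and the [vars] values
# are placeholders chosen for illustration.
name = "uptime-monitor"
main = "src/index.js"
compatibility_date = "2024-06-01"

# The Node.js compatibility flag mentioned in the setup steps
compatibility_flags = ["nodejs_compat"]

# Run the scheduled handler every minute
[triggers]
crons = ["* * * * *"]

# Plain-text environment variables read by the worker
[vars]
ENDPOINT_TO_MONITOR = "https://example.com/health"
ENDPOINT_NAME = "example-health"

# HONEYCOMB_API_KEY is a secret and should not live in this file;
# it can be set with: wrangler secret put HONEYCOMB_API_KEY
```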
Comparison to the Other Options
Compared to the other options outlined before, this solution offers:
- Minimal maintenance (just a single npm package and its dependencies to monitor for security vulnerabilities)
- Minimal complexity (just the steps outlined above)
- No cost (the usage falls well within the Cloudflare Workers and Honeycomb free tiers)
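The free-tier claim can be sanity-checked with some quick arithmetic. The sketch below rests on several assumptions, so verify them against the current pricing pages: a Workers free-tier limit of 100,000 requests per day, a Honeycomb free-tier limit of 20 million events per month, every scheduled run counting as one request, and each run emitting about two spans. The `estimateUsage` function is hypothetical and exists only for this illustration.

```javascript
// Assumed free-tier limits; check current pricing pages before relying on them.
const ASSUMED_WORKER_REQUESTS_PER_DAY = 100_000;
const ASSUMED_HONEYCOMB_EVENTS_PER_MONTH = 20_000_000;

// Hypothetical estimator: a "* * * * *" cron fires once per minute per monitor.
function estimateUsage({ monitors, spansPerRun, daysPerMonth }) {
  const invocationsPerDay = monitors * 24 * 60;
  const eventsPerMonth = invocationsPerDay * spansPerRun * daysPerMonth;
  return {
    invocationsPerDay,
    eventsPerMonth,
    withinAssumedLimits:
      invocationsPerDay <= ASSUMED_WORKER_REQUESTS_PER_DAY &&
      eventsPerMonth <= ASSUMED_HONEYCOMB_EVENTS_PER_MONTH,
  };
}

// One monitor, an assumed two spans per run, over a 31-day month.
const usage = estimateUsage({ monitors: 1, spansPerRun: 2, daysPerMonth: 31 });
console.log(usage);
// → { invocationsPerDay: 1440, eventsPerMonth: 89280, withinAssumedLimits: true }
```

Under these assumptions a single monitor uses under 2% of the daily request allowance and under 1% of the monthly event allowance, which leaves room for a number of additional endpoints.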
Takeaway
Paying for an uptime monitor service is probably preferable to this (if you’re able to).
The real takeaway is that worker runtimes are a newer form of compute with a cost model you can take advantage of in situations like this one.