Prettifying HealthChecks
Russ Hammett
Posted on December 18, 2020
Previously I wrote about creating Health Checks for Microsoft Orleans, but the response was too minimal. In this post we’ll see about prettifying that output!
In the previous post we learned a bit about health checks, how to create them, and view their “health” from the perspective of Microsoft Orleans. The end result was a single word response of “Healthy”, “Degraded”, or “Unhealthy”; not very exciting stuff.
In this post, I’d like to quickly go over how you’d go about not only reporting on the “overarching status”, but giving details on the individual health checks that make up that overarching status.
(Note: I had an issue open for this with a hacktoberfest label, and I did get a PR for it. I wanted to go a slightly different route in the end, though the PR’s version was integrated into master for some time.)
The Health check documentation does go into some detail about how to accomplish this health check prettifying, but I’m not a huge fan of “manually” writing out JSON; instead I opted for an anonymous object.
Prettifying Property Potentials
A few new things I want to report on from the response to the health check GET endpoint:
- Health Check Name
- Health Check Description
- Individual Health Check Status
- Some additional information specific to the health check that describes what makes the health check return “Degraded” or “Unhealthy”
Luckily all of this information can be made available to us via the HealthReport generated as a part of the health check.
Health Check Response Writer
We’re going to introduce a new method that writes a custom response for our health check endpoint. From startup, we’ll want to provide a custom ResponseWriter within MapHealthChecks:
app.UseEndpoints(endpoints =>
{
    endpoints.MapHealthChecks(
        "/health",
        new HealthCheckOptions
        {
            AllowCachingResponses = false,
            // Hand the report to our own writer instead of the default one-word response
            ResponseWriter = HealthCheckResponseWriter.WriteResponse
        })
        .WithMetadata(new AllowAnonymousAttribute());
    endpoints.MapControllers();
});
Where the referenced HealthCheckResponseWriter is a new static class we’ll be introducing next.
The ResponseWriter expects a method with the following signature:
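On HealthCheckOptions, ResponseWriter is typed as Func<HttpContext, HealthReport, Task>, so any method that takes those two arguments and returns a Task should fit. As a minimal sketch (PlainTextWriter is a hypothetical name used only for illustration, not something from the project), a compatible method could look like this:

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Http;
using Microsoft.Extensions.Diagnostics.HealthChecks;

// Hypothetical class, for illustration only.
internal static class PlainTextWriter
{
    // Assigning the method to the delegate type shows it has the expected shape.
    public static readonly Func<HttpContext, HealthReport, Task> AsDelegate = WriteResponse;

    public static Task WriteResponse(HttpContext context, HealthReport report)
    {
        // Writes only the overall status, e.g. "Healthy".
        context.Response.ContentType = "text/plain";
        return context.Response.WriteAsync(report.Status.ToString());
    }
}
```

This one does nothing more than write the overall status, but it shows the two inputs a writer has to work with.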
You’ll notice above the method receives an HttpContext as well as a HealthReport. This HealthReport will make available to us several pieces of data that we can report on, specific to each individual health check.
As for our actual response writer implementation, here is the original that was merged into master from L-Dogg’s PR:
private static Task WriteResponse(HttpContext context, HealthReport result)
{
    context.Response.ContentType = "application/json; charset=utf-8";
    var options = new JsonWriterOptions
    {
        Indented = true
    };
    using var stream = new MemoryStream();
    using (var writer = new Utf8JsonWriter(stream, options))
    {
        writer.WriteStartObject();
        // Overall status of the whole report
        writer.WriteString("status", result.Status.ToString());
        writer.WriteStartObject("results");
        // One nested object per individual health check
        foreach (var entry in result.Entries)
        {
            writer.WriteStartObject(entry.Key);
            writer.WriteString("status", entry.Value.Status.ToString());
            writer.WriteString("description", entry.Value.Description);
            writer.WriteStartObject("data");
            foreach (var item in entry.Value.Data)
            {
                writer.WritePropertyName(item.Key);
                JsonSerializer.Serialize(
                    writer, item.Value, item.Value?.GetType() ?? typeof(object));
            }
            writer.WriteEndObject(); // data
            writer.WriteEndObject(); // entry
        }
        writer.WriteEndObject(); // results
        writer.WriteEndObject(); // root
    }
    var json = Encoding.UTF8.GetString(stream.ToArray());
    return context.Response.WriteAsync(json);
}
The above definitely works, but I’m not huge on writing the JSON “manually” (if that makes sense). I wanted to write another blog post on this anyway, as I already had a branch going (and didn’t actually expect a PR :O), so here’s my solution, which leans on Json.NET’s JsonConvert:
internal static class HealthCheckResponseWriter
{
    public static Task WriteResponse(HttpContext context, HealthReport healthReport)
    {
        context.Response.ContentType = "application/json; charset=utf-8";
        var result = JsonConvert.SerializeObject(new
        {
            status = healthReport.Status.ToString(),
            // One element per individual health check
            details = healthReport.Entries.Select(e => new
            {
                key = e.Key,
                description = e.Value.Description,
                status = e.Value.Status.ToString(),
                data = e.Value.Data
            })
        }, Formatting.Indented);
        return context.Response.WriteAsync(result);
    }
}
I find it a bit more concise working with the anonymous object.
Health Check updates
We’re not currently generating “data” information from the health checks that the HealthCheckResponseWriter would be able to make use of, so let’s take a look at what we could do there.
My intention for the “data” property of the anonymous object is to describe what would make the specific health check return “Degraded” or “Unhealthy”. Anything aside from those two statuses can be assumed to be “Healthy”.
If you recall, we already built thresholds into the health checks to represent the degraded and unhealthy statuses. Now we just need to make those thresholds available to the health report.
Taking a look at the HealthCheckResult class, you’ll see that its static factory methods take in an optional IReadOnlyDictionary<string, object> data = null, which happens to be the “data” member we made sure to return from our WriteResponse method in the previous section of the post.
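As a minimal sketch of what passing that parameter looks like, here is one of the static factory methods on HealthCheckResult being called with a data dictionary. The HealthCheckResultExample class and the sample description are made up for illustration; only HealthCheckResult.Degraded and its data parameter come from the health checks package.

```csharp
using System.Collections.Generic;
using Microsoft.Extensions.Diagnostics.HealthChecks;

// Hypothetical class, for illustration only.
internal static class HealthCheckResultExample
{
    public static HealthCheckResult Build()
    {
        // A plain Dictionary works here because it implements IReadOnlyDictionary.
        var data = new Dictionary<string, object>
        {
            { "Degraded Threshold", 70f }
        };

        // "data" is optional, so it is passed by name to skip the exception argument.
        return HealthCheckResult.Degraded("CPU utilization is degraded at 75.00%.", data: data);
    }
}
```

Whatever goes into that dictionary comes back out on the Data property of the matching entry in the HealthReport.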
We will make use of this IReadOnlyDictionary to provide our “threshold” information on a per-grain basis. I will be putting this threshold information into both the CPU and Memory grains, but just as an example here’s what one of those will look like:
[StatelessWorker(1)]
public class CpuHealthCheckGrain : Grain, ICpuHealthCheckGrain
{
    private const float UnhealthyThreshold = 90;
    private const float DegradedThreshold = 70;

    // Threshold information reported back with every result
    private readonly ReadOnlyDictionary<string, object> HealthCheckData = new ReadOnlyDictionary<string, object>(
        new Dictionary<string, object>()
        {
            { "Unhealthy Threshold", UnhealthyThreshold},
            { "Degraded Threshold", DegradedThreshold}
        });

    private readonly IHostEnvironmentStatistics _hostEnvironmentStatistics;

    public CpuHealthCheckGrain(IHostEnvironmentStatistics hostEnvironmentStatistics)
    {
        _hostEnvironmentStatistics = hostEnvironmentStatistics;
    }

    public Task<HealthCheckResult> CheckHealthAsync(HealthCheckContext context, CancellationToken cancellationToken = new CancellationToken())
    {
        if (_hostEnvironmentStatistics.CpuUsage == null)
        {
            return Task.FromResult(HealthCheckResult.Unhealthy("Could not determine CPU usage.", data: HealthCheckData));
        }
        if (_hostEnvironmentStatistics.CpuUsage > UnhealthyThreshold)
        {
            return Task.FromResult(HealthCheckResult.Unhealthy(
                $"CPU utilization is unhealthy at {_hostEnvironmentStatistics.CpuUsage:0.00}%.", data: HealthCheckData));
        }
        if (_hostEnvironmentStatistics.CpuUsage > DegradedThreshold)
        {
            return Task.FromResult(HealthCheckResult.Degraded(
                $"CPU utilization is degraded at {_hostEnvironmentStatistics.CpuUsage:0.00}%.", data: HealthCheckData));
        }
        return Task.FromResult(HealthCheckResult.Healthy(
            $"CPU utilization is healthy at {_hostEnvironmentStatistics.CpuUsage:0.00}%.", data: HealthCheckData));
    }
}
You should notice in the above that we introduce a ReadOnlyDictionary with the thresholds for degraded and unhealthy, then pass that ReadOnlyDictionary to the data parameter of the static methods within HealthCheckResult.
Testing it out
The only thing left to do is test it out! You may have seen the cover image which contained spoilers, but just to wrap things up, here’s what it looks like when hitting the “/health” endpoint after our changes:
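If you’d rather poke at the output without spinning anything up, here is a minimal sketch that feeds a hand-built HealthReport through a writer of the same shape as the one above and reads the body back. The HealthCheckResponseWriterDemo class, the "cpu" entry name, and the 12.34% figure are all made up for illustration; the real entry names come from however the checks were registered.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Http;
using Microsoft.Extensions.Diagnostics.HealthChecks;
using Newtonsoft.Json;

// Hypothetical harness, for illustration only.
internal static class HealthCheckResponseWriterDemo
{
    // Copy of the anonymous-object writer from earlier in the post.
    private static Task WriteResponse(HttpContext context, HealthReport healthReport)
    {
        context.Response.ContentType = "application/json; charset=utf-8";
        var result = JsonConvert.SerializeObject(new
        {
            status = healthReport.Status.ToString(),
            details = healthReport.Entries.Select(e => new
            {
                key = e.Key,
                description = e.Value.Description,
                status = e.Value.Status.ToString(),
                data = e.Value.Data
            })
        }, Formatting.Indented);
        return context.Response.WriteAsync(result);
    }

    public static async Task<string> RenderSampleAsync()
    {
        var data = new Dictionary<string, object>
        {
            { "Unhealthy Threshold", 90f },
            { "Degraded Threshold", 70f }
        };

        // "cpu" is an assumed registration name, not taken from the project.
        var entries = new Dictionary<string, HealthReportEntry>
        {
            ["cpu"] = new HealthReportEntry(
                HealthStatus.Healthy,
                "CPU utilization is healthy at 12.34%.",
                TimeSpan.Zero,
                null,
                data)
        };
        var report = new HealthReport(entries, TimeSpan.Zero);

        // Capture the response body in memory instead of sending it anywhere.
        var context = new DefaultHttpContext();
        context.Response.Body = new MemoryStream();
        await WriteResponse(context, report);

        context.Response.Body.Position = 0;
        using var reader = new StreamReader(context.Response.Body);
        return await reader.ReadToEndAsync();
    }
}
```

Calling RenderSampleAsync() should give back an indented JSON document with a top-level status and a single details entry carrying the two thresholds.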