NodeJS API Circuit Breaker Pattern
Khaing Khant Htun
Posted on May 16, 2022
In our increasingly interconnected software environments, calls to remote resources can fail for a variety of reasons: unreliable connections, transient or permanent problems with the remote service, timeouts because the service is very busy, and so on. This can lead to chains or cascades of errors. Every additional call to the failed service is likely to return yet another error, so we waste our own CPU and other computing resources on requests that are bound to fail.
For example, if I were to use external data in my app (such as data for countries and towns), I would have to use a third-party API, since unless I work as a data collector myself, I have no practical way of updating or maintaining such information. First, my front-end (A) calls my back-end API (B) for the data, which in turn makes a request to the resource API (C). If there is an error in the resource API (C), it returns an error, which any sane back-end would handle gracefully. But let's say the requests from the front-end (A) to the back-end (B) keep coming, so we call the failing API (C) repeatedly, which consumes our server resources and only returns more errors. In that case we can stop calling the API for a while, just as faulty wiring in a household trips the circuit breaker and breaks the circuit.
This is a rough description of the above scenario. Circuit breakers matter even more when a request travels through many layers of a service invocation chain, where a failure in a service at the tail can cause quite a long error cascade.
How it works (Circuit Breaker for Dummies? I guess)
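It's important to note here that a circuit breaker is essentially a state machine with 3 states: Open, Half-Open and Closed. This article names the states from the caller's point of view: "Open" means requests flow through normally and "Closed" means they are blocked. Most references, including the Azure pattern description, follow the electrical convention, in which these two names are swapped.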
A usual circuit breaker implementation, on the first failed request, starts what is called an "Open Timer" (usually about 10 seconds) and changes the breaker state from "Open" to "Half-Open". In this "Half-Open" state, the breaker counts both the failed and the successful requests. If the number of failed requests reaches a predefined threshold during the "Open Timer" period, the breaker calculates the percentage of failed requests, i.e. (failed / (failed + success)) * 100, and checks whether that percentage also reaches its threshold. If it does, the breaker changes state from "Half-Open" to "Closed".
In this "Closed" state, the breaker does not make any remote calls when requested; it just fails, or perhaps returns a predefined response. The "Closed" state lasts as long as the "Closed Timer" (which is also usually a few seconds). After the "Closed Timer" ends, the breaker lets one call through to the remote resource to see whether it still fails or now succeeds. If the call still responds with an error, the "Closed Timer" is reset and the breaker remains in the "Closed" state. If it is successful, the breaker changes to the "Open" state, and operation continues normally.
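The transitions described above can be sketched as a small lookup table. This is only an illustration, using this article's state names; the event names and the nextState() helper are hypothetical and are not part of the breaker class built later. The "window ended" transition reflects how the implementation below re-opens the breaker once the "Open Timer" expires without tripping.

```javascript
// "Opened" = calls flow normally, "Closed" = calls are blocked (this article's naming).
const State = { OPENED: "Opened", HALF: "Half", CLOSED: "Closed" };

// Hypothetical event names, introduced only for this illustration.
const Event = {
  FAILURE: "failure", // a remote call failed
  WINDOW_ENDED_OK: "windowEndedOk", // "Open Timer" expired without tripping
  THRESHOLDS_REACHED: "thresholdsReached", // fail count and percentage both too high
  PROBE_SUCCEEDED: "probeSucceeded", // first call after "Closed Timer" succeeded
  PROBE_FAILED: "probeFailed", // first call after "Closed Timer" failed
};

const transitions = {
  [State.OPENED]: { [Event.FAILURE]: State.HALF },
  [State.HALF]: {
    [Event.WINDOW_ENDED_OK]: State.OPENED,
    [Event.THRESHOLDS_REACHED]: State.CLOSED,
  },
  [State.CLOSED]: {
    [Event.PROBE_SUCCEEDED]: State.OPENED,
    [Event.PROBE_FAILED]: State.CLOSED,
  },
};

// Returns the next state, or the current state when the event changes nothing.
function nextState(state, event) {
  return transitions[state]?.[event] ?? state;
}

console.log(nextState(State.OPENED, Event.FAILURE)); // Half
console.log(nextState(State.HALF, Event.THRESHOLDS_REACHED)); // Closed
console.log(nextState(State.CLOSED, Event.PROBE_SUCCEEDED)); // Opened
```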
Example Implementation in JavaScript
I would like to demonstrate an example implementation in JavaScript even though in most projects, you'll probably be better off using fully implemented packages like Opossum.
First, start a Node.js project in a new folder. Of course, we need Node.js and npm installed first. If you don't have them, check out Node's official website.
npm init -y
We are going to use an object-oriented approach to build a simple circuit breaker. Create a file called circuit-breaker.js in the project root.
First, in the file, define the states that our circuit breaker can be in. We'll just use a simple object mapping, though for bigger, real-world projects I would recommend TypeScript, since its type definitions suit a circuit breaker implementation well.
const CircuitBreakerState = {
  OPENED: "Opened",
  CLOSED: "Closed",
  HALF: "Half",
};
Next, create the main circuit breaker class -
class CircuitBreaker {
  // Circuit Breaker Options
  options = {};
  // Customizable request call which will return a promise
  request;
  // Breaker state
  state = CircuitBreakerState.OPENED;

  // The constructor accepts a request call that we will be wrapping our breaker around
  constructor(request, options = {}) {
    this.request = request;
    this.options = {
      openBreakerTimeout: options.openBreakerTimeout || 10000,
      closedBreakerTimeout: options.closedBreakerTimeout || 5000,
      minimunFailedRequestsAllowed:
        options.minimunFailedRequestsAllowed || 2,
      percentageFailedRequestsAllowed:
        options.percentageFailedRequestsAllowed || 50,
    };
  }

  // ...more below...
}
We first declare our class with the state (one of the 3 possible breaker states), options (predefined breaker timeouts and thresholds) and request properties. The constructor accepts a request function, which we assume to be asynchronous, and we are going to wrap a circuit breaker around this call.
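To see how the defaults resolve, here is a minimal sketch that mirrors the constructor's `options.x || default` logic in a standalone helper. The resolveOptions() function and the DEFAULTS object are hypothetical names used only for this illustration; the values are the ones from the constructor above.

```javascript
// Same default values as the constructor above.
const DEFAULTS = {
  openBreakerTimeout: 10000,
  closedBreakerTimeout: 5000,
  minimunFailedRequestsAllowed: 2,
  percentageFailedRequestsAllowed: 50,
};

// Hypothetical helper that applies `||` per option, like the constructor does.
function resolveOptions(options = {}) {
  const resolved = {};
  for (const key of Object.keys(DEFAULTS)) {
    resolved[key] = options[key] || DEFAULTS[key];
  }
  return resolved;
}

// Only closedBreakerTimeout is overridden; the other three keep their defaults.
console.log(resolveOptions({ closedBreakerTimeout: 2000 }));

// Prints 2, because `||` treats 0 as "not provided".
console.log(
  resolveOptions({ minimunFailedRequestsAllowed: 0 }).minimunFailedRequestsAllowed
);
```

One consequence worth knowing: because `||` falls back on any falsy value, an explicit 0 cannot be passed for any option. If that matters for your use case, the nullish coalescing operator `??` would keep 0 and only fall back on undefined or null.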
Next, we are going to implement the method called fire(), which will be the main method that will put our breaker to work. Before that, declare properties that we'll be using to dynamically keep track of the breaker status.
// inside CircuitBreaker class
// dynamic breaker parameters
successCount = 0;
failCount = 0;

// This is the timer that will keep track of when the "closed timer" ends,
// allowing a call to go through to check the remote status
allowNextRequestAt = undefined;

// This is the timer to keep track of the end of the "open timer",
// where the half state "finishes"
finishHalfStateAt = undefined;
// inside CircuitBreaker class
async fire(requestArgs) {
  if (
    this.state === CircuitBreakerState.CLOSED &&
    Date.now() < this.allowNextRequestAt
  ) {
    throw new Error("Breaker Closed! Try again later.");
  }

  try {
    const response = await this.request(requestArgs);
    return this.success(response);
  } catch (e) {
    // Update the breaker statistics, then rethrow so that the caller's
    // .catch() actually sees the failure instead of a resolved promise
    this.fail(e);
    throw e;
  }
}
In the fire() method, if the breaker is in the "Closed" state and the "Closed Timer" hasn't ended, the remote call is not made and an error is thrown instead. We could replace the error with a predefined response or behavior.
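As a rough sketch of the "predefined response" idea, the caller can wrap fire() and substitute a fallback value whenever it rejects. The fireWithFallback() helper, the stub breaker and the fallback payload below are all hypothetical; they only assume an object with an async fire() method like the one above.

```javascript
// Hypothetical helper: resolve with a fallback value when the breaker rejects.
async function fireWithFallback(breaker, fallbackValue, requestArgs) {
  try {
    return await breaker.fire(requestArgs);
  } catch (e) {
    // The breaker refused or the call failed: serve the predefined response
    return fallbackValue;
  }
}

// Stand-in object that behaves like a breaker in the "Closed" state.
const closedBreakerStub = {
  async fire() {
    throw new Error("Breaker Closed! Try again later.");
  },
};

// Logs the fallback object instead of surfacing the error.
fireWithFallback(closedBreakerStub, { countries: [], stale: true }).then(
  (result) => console.log(result)
);
```

Keeping the fallback outside the breaker class leaves fire() simple, and lets each caller decide what a sensible default looks like.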
If the call is allowed, our request() function that calls the remote service is invoked, and one of two important methods, success() or fail(), is called depending on whether request() succeeds or fails. Let's implement these methods, which are the core of the breaker's usefulness.
// inside CircuitBreaker class...
resetCountersAndTimer() {
  this.successCount = 0;
  this.failCount = 0;
  this.finishHalfStateAt = undefined;
}

success(response) {
  if (this.state === CircuitBreakerState.HALF) {
    this.successCount++;
    // Is the "Open Timer" over?
    if (Date.now() >= this.finishHalfStateAt) {
      this.resetCountersAndTimer();
      this.state = CircuitBreakerState.OPENED;
    }
  }
  // The first successful call after the "Closed Timer"
  if (this.state === CircuitBreakerState.CLOSED) {
    this.state = CircuitBreakerState.OPENED;
    this.resetCountersAndTimer();
  }
  return response;
}
If the request succeeds and the breaker state is "Half-Open", which means we are still tracking statistics, we increment successCount. In this state, we also check whether the "Open Timer" is over; if it is, we reset the timer and counts and re-open the breaker for normal activity.
If the breaker is "Closed", we change it to "Open" and reset the counters. This call is the one made right after the "Closed Timer" expired (remember that fire() does not allow calls during the closed timer), so a successful response means the service can be used again.
The next and final method of our breaker is fail(), which is invoked when the remote call fails -
// inside CircuitBreaker class
fail(e) {
  if (this.state === CircuitBreakerState.CLOSED) {
    this.allowNextRequestAt =
      Date.now() + this.options.closedBreakerTimeout;
    return e;
  }

  if (this.state === CircuitBreakerState.OPENED) {
    this.state = CircuitBreakerState.HALF;
    this.failCount++;
    this.finishHalfStateAt =
      Date.now() + this.options.openBreakerTimeout;
    return e;
  }

  if (this.state === CircuitBreakerState.HALF) {
    this.failCount++;

    if (Date.now() > this.finishHalfStateAt) {
      this.resetCountersAndTimer();
      this.failCount = 1;
      this.finishHalfStateAt =
        Date.now() + this.options.openBreakerTimeout;
      return e;
    }

    if (this.failCount >= this.options.minimunFailedRequestsAllowed) {
      const percentageFail =
        (this.failCount / (this.failCount + this.successCount)) *
        100;
      if (
        percentageFail >=
        this.options.percentageFailedRequestsAllowed
      ) {
        this.state = CircuitBreakerState.CLOSED;
        this.resetCountersAndTimer();
        this.allowNextRequestAt =
          Date.now() + this.options.closedBreakerTimeout;
        return e;
      }

      // if count is exceeded but not percentage
      this.resetCountersAndTimer();
      this.failCount = 1;
      this.finishHalfStateAt =
        Date.now() + this.options.openBreakerTimeout;
      return e;
    }
    return e;
  }
}
If the request fails, the fail() method checks the breaker's current state and acts accordingly. If it is "Closed" (which means this is the first call allowed after the "Closed Timer"), the breaker remains in the "Closed" state (because we're failing!) and resets the "Closed Timer" (with the default options, another 5 seconds in the "Closed" state).
If the breaker is in the "Open" state, this is the first remote call that failed, so the sensible thing to do is to start our failure tracking window. Therefore, we start the failure count, change the breaker state to "Half-Open" and set the "Open Timer".
If the breaker is in the "Half-Open" state, we are already tracking statistics. We first increment our fail count. If the "Open Timer" has expired, then since this request failed, we discard the previous statistics and start another "Open Timer" tracking window. If not, we are still within the "Open Timer" window, so we check whether the fail count has reached our predefined threshold, and if it has, we calculate the fail percentage. Here, either of 2 things can happen. First, both the fail count and the percentage reach their thresholds, meaning it's time to close our breaker to prevent further failed requests. Second, the fail count reaches the threshold but the percentage does not, in which case we reset the tracking statistics, restart the "Open Timer", and stay in the "Half-Open" state.
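The two-threshold check can be tried out in isolation. The shouldCloseBreaker() helper below is a hypothetical extraction of the "Half-Open" branch of fail(), written only to make the arithmetic easy to follow; it uses the default thresholds from the constructor.

```javascript
// Hypothetical helper mirroring the threshold check in fail() for the "Half" state.
function shouldCloseBreaker(failCount, successCount, options) {
  // First gate: enough failures in the current window?
  if (failCount < options.minimunFailedRequestsAllowed) return false;
  // Second gate: is the share of failures high enough?
  const percentageFail = (failCount / (failCount + successCount)) * 100;
  return percentageFail >= options.percentageFailedRequestsAllowed;
}

const options = {
  minimunFailedRequestsAllowed: 2,
  percentageFailedRequestsAllowed: 50,
};

console.log(shouldCloseBreaker(1, 0, options)); // false: only 1 failure, below the count threshold
console.log(shouldCloseBreaker(2, 1, options)); // true: 2 of 3 requests failed (about 66.7%)
console.log(shouldCloseBreaker(2, 3, options)); // false: 2 of 5 requests failed (40%)
```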
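Let's test the breaker with a mock API call to a small server that we'll set up. First, create a file called index.js and fill it with the code below, where we'll fire our call. Because index.js loads the class with require("./circuit-breaker"), circuit-breaker.js must export it, so add module.exports = { CircuitBreaker }; at the end of that file. We'll also install axios in our project to make a quick GET request.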
npm install axios
// index.js
const axios = require("axios");
const { CircuitBreaker } = require("./circuit-breaker");

const fetchRequest = (req) => {
  return axios.get("http://localhost:8080");
};

const breaker = new CircuitBreaker(fetchRequest);

setInterval(
  () =>
    breaker
      .fire()
      // axios resolves with a response object; the body is in res.data
      .then((res) => console.log("Response : " + res.data))
      .catch((e) => console.error("Error : " + e.message)),
  1000
);
We'll make an async GET call to a webserver at localhost:8080 at 1 second intervals. Notice how we've wrapped our remote call with the CircuitBreaker's fire() method.
We don't have a server yet, so we can't run index.js. Let's quickly mock up a small server: create server.js. We'll just use Node's http module for our basic server. It will randomly respond with either success (200 status code) or failure (500 status code).
const http = require("http");

// tweak this to change the error frequency
const errorRate = 0.3;

http.createServer(function (req, res) {
  if (Math.random() > errorRate) {
    res.writeHead(200);
    res.write("Success");
  } else {
    res.writeHead(500);
    res.write("Failed");
  }
  res.end();
}).listen(8080, () => console.log("Server listening at Port 8080"));
Open a new terminal and run -
node server.js
Once the server is listening, open another terminal and run -
node index.js
The output will be a mix of "Response : ..." and "Error : ..." lines.
Now we are making a call to the server every second, and the server randomly fails our requests. We can also see the breaker working as expected: it closes after reaching the thresholds and re-opens after the set "Closed Timer" if the next call succeeds.
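Because the server fails at random, a run is hard to reproduce. One possible way to exercise the breaker deterministically, without server.js or axios, is to hand it a request function that follows a fixed script. The makeScriptedRequest() helper below is hypothetical and not part of the original setup; it only relies on the breaker accepting any async function as its request.

```javascript
// Hypothetical helper: builds an async function that succeeds (true) or
// fails (false) according to a fixed script, then repeats the last outcome.
function makeScriptedRequest(outcomes) {
  let index = 0;
  return async function scriptedRequest() {
    const ok = outcomes[Math.min(index, outcomes.length - 1)];
    index++;
    if (!ok) {
      throw new Error("Scripted failure");
    }
    return "Success";
  };
}

// One success, then failures from the second call onwards.
const request = makeScriptedRequest([true, false]);

request()
  .then((res) => console.log("Response : " + res)) // first call succeeds
  .then(() => request()) // second call follows the script and fails
  .catch((e) => console.error("Error : " + e.message));
```

Passing such a function to the constructor, for example new CircuitBreaker(makeScriptedRequest([true, false, false])), should let you step through the state changes in a predictable order.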
Now that we have a basic functioning circuit breaker class, we can wrap not only API requests like this one, but also other remote calls and IO calls where we expect failures can occur.
References -
I used the explanation of the circuit breaker pattern from Azure Architecture Cloud Design Patterns to study for this article and as a reference.
I've also drawn heavily on the implementation in Vladimir Topolev's article on the Node.js circuit breaker pattern. I give him my sincere credit.