Non-atomic increments in NodeJS or how I found a vulnerability in express-brute package.
Roman Voloboev
Posted on April 18, 2019
TLDR: Use ExpressBruteFlexible to migrate from the vulnerable express-brute package.
My aim is to provide a unified package, rate-limiter-flexible, to manage expiring increments with flexible options and API, so that any task related to counting events with expiration can be done with one tool.
Several months ago I was looking across GitHub for useful features. There are some good packages with a similar purpose, so I went through their features and issues. Sometimes open and even closed issues contain interesting ideas. express-brute has several open issues.
Check twice. And then once again.
An orange warning light with a distinctive sound switched on when I read the ticket title: "global bruteforce count is not updating on more than 1000 concurrent requests".
Once all the 1000 requests are completed, the count in the express brute store doesn’t gets increased to more than 150 ever.
I checked the number of downloads of express-brute on npm. The number was not small: more than 20k downloads per week. The issue had been created more than 2 years earlier. "Ok, I trust those users", I thought, and closed the browser tab. Several days later I opened that ticket again and decided to test it on my own.
Increment atomically. Especially in an asynchronous environment.
I want you to understand more about the express-brute package. It counts the number of requests and then, depending on options, either allows a request or prohibits requests for some number of seconds. The most important option is freeRetries: it limits the number of allowed requests. If a developer sets 5, it should count 5 requests, then allow the 6th and stop the 7th, 8th, etc. during some time window. It counts requests by user name or by user name and IP pair. This way it protects against brute-forcing passwords.
You should also know that express-brute implements a get/set approach to counting events. It can store data in several popular databases. Here is the process:
- Get counter data from a store on request.
- Check some logic, check limits, compare expiration and current dates, etc.
- Set new counter data depending on results from the second step.
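The three steps above can be sketched roughly as follows. This is a minimal illustration, not express-brute's actual code: `createStore`, `countRequest` and the 10 ms latency are hypothetical names and values made up for the example, an in-memory `Map` stands in for a real database, expiration handling is left out, and the `freeRetries` parameter simply plays the role of the limit option described above.

```javascript
// Hypothetical in-memory store with artificial latency, standing in for a database.
function createStore(latencyMs = 10) {
  const data = new Map();
  const delay = () => new Promise((resolve) => setTimeout(resolve, latencyMs));
  return {
    async get(key) {
      await delay();
      return data.get(key) || 0;
    },
    async set(key, value) {
      await delay();
      data.set(key, value);
    },
  };
}

// Get/set counting: read the counter, apply the limit logic, write it back.
async function countRequest(store, key, freeRetries) {
  const count = await store.get(key);    // 1. Get counter data from the store
  const allowed = count <= freeRetries;  // 2. Check the limit (expiration omitted)
  await store.set(key, count + 1);       // 3. Set the new counter data
  return { allowed, count: count + 1 };
}

// Requests that arrive strictly one after another are counted correctly.
async function sequentialDemo() {
  const store = createStore();
  const results = [];
  for (let i = 0; i < 8; i++) {
    results.push(await countRequest(store, 'user:alice', 5));
  }
  return results.map((r) => r.allowed);
}

sequentialDemo().then((allowed) => console.log(allowed));
```

When each request waits for the previous one, the first six are allowed and the seventh and eighth are stopped, which matches the behaviour described for a limit of 5. Nothing in this pattern protects the counter between the Get and the Set, though.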
You probably already get it. If our application processes 1000 concurrent requests, some requests are not going to be counted, because a Set operation overwrites previous Sets. That makes it clear why somebody sees 150 instead of 1000 in a store! The slower the database, the more requests can be made invisibly. The more threads or processes in an application, the more Set queries get overwritten.
But that is not all. The NodeJS event loop makes it even more vulnerable. Let's see what happens with one NodeJS process:
- A Get query is sent to a store, but the result is not received yet. The I/O callback is queued at the event-loop level. It may stay in that queue for more than one event-loop tick, waiting for a result from the store. More requests to Get data from the store may arrive during that time. Those I/O callbacks are queued too.
- Let's say the first Get takes 10ms. Now our NodeJS process is ready to do the math with the results. But it also gets nine other Get results for requests made during that 10ms time window. And all these Get results have the same counter value, ready to be incremented and Set.
- The math is done. It is brilliant. The counter is incremented. Set queries are sent to the store. The same value is set 10 times in a row. 1 counted instead of 10.
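The scenario above can be reproduced in a single NodeJS process with a small simulation. It is only a sketch under stated assumptions: the store is a hypothetical in-memory object where every call takes about 10 ms, and `createStore`, `countRequest` and `concurrentDemo` are names invented for this example, not part of express-brute.

```javascript
// Hypothetical in-memory store; every call takes about 10 ms, like a network round trip.
function createStore(latencyMs = 10) {
  const data = new Map();
  const delay = () => new Promise((resolve) => setTimeout(resolve, latencyMs));
  return {
    async get(key) {
      await delay();
      return data.get(key) || 0;
    },
    async set(key, value) {
      await delay();
      data.set(key, value);
    },
  };
}

// Non-atomic increment: the stored value can change between the Get and the Set.
async function countRequest(store, key) {
  const count = await store.get(key);
  await store.set(key, count + 1);
}

// Fire ten requests without waiting for each other, then read the counter.
async function concurrentDemo() {
  const store = createStore();
  const requests = Array.from({ length: 10 }, () => countRequest(store, 'user:alice'));
  await Promise.all(requests);
  return store.get('user:alice');
}

concurrentDemo().then((count) => console.log(`counted ${count} of 10 requests`));
```

In this simulation all ten Gets resolve before the first Set lands, so every request writes the value 1 and the store ends up with 1 instead of 10. With a real database the exact number depends on timing, but the lost updates have the same cause.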
Interested in consequences?
Stop theory, give us real numbers.
First of all, I reproduced it locally. But local tests are not amazing. They do not reflect the real asynchronous web world. "Ok, let's try something interesting and real", I thought. And I discovered that the Ghost open-source project uses express-brute. I was excited to run experiments on their services. No harm, honestly.
The recipe is quite simple:
- Load the event loop with some amount of requests. It should be slow enough to have long I/O queues. I launched a small tool to make 1000 requests per second.
- Instantly try 1000 passwords.
I was using mobile internet from another continent and a laptop with eight CPU cores. I was able to make 14 password tries instead of 5. (Edit: I was actually able to make 216 tries instead of 5 later.) "Phew, it is nothing, Roman", you may think. It allows making about 5 more in 10 minutes. Then again 5 in 10 minutes, then 5 in 20 minutes, etc. with default Ghost settings. That is about 60 tries during the first day from one laptop over mobile internet with huge latency. 1000 computers would make 60000 password tries per day.
10 minutes is the default minimum delay in the Ghost project. The default minimum delay set by express-brute is 500 milliseconds and the maximum delay is 15 minutes, with 2 free tries. I didn't test it, but it would allow about 500 password tries per day from one computer. It is not safe! Especially if this attack is part of a bigger plan.
It is important not only for banks
Users tend to use the same password across several services. If you think your application is not interesting to hackers, you may be wrong. Hackers can use the weak security of one service to increase the probability of a successful attack on another service.
We do not have free time to fix it!
I made it possible to migrate in a couple of minutes. There is the ExpressBruteFlexible middleware. It has the same logic, options and methods, but it works with atomic increments, built on top of the rate-limiter-flexible package.
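To show why atomic increments close the gap, here is a minimal sketch of the idea. It is not the internals of rate-limiter-flexible: `createAtomicStore`, `incr` and `atomicDemo` are hypothetical names for this example, and the in-memory `Map` only imitates a store that performs the whole read-modify-write as one operation on its side, the way the Redis `INCR` command does.

```javascript
// Hypothetical store whose incr() does the read-modify-write in one step on the store side.
function createAtomicStore(latencyMs = 10) {
  const data = new Map();
  const delay = () => new Promise((resolve) => setTimeout(resolve, latencyMs));
  return {
    async incr(key) {
      await delay();                          // simulated network latency
      const next = (data.get(key) || 0) + 1;  // read and write run synchronously here,
      data.set(key, next);                    // so no other request can interleave
      return next;
    },
  };
}

// Ten concurrent requests, each asking the store to increment the counter.
async function atomicDemo() {
  const store = createAtomicStore();
  const results = await Promise.all(
    Array.from({ length: 10 }, () => store.incr('user:alice'))
  );
  return Math.max(...results);
}

atomicDemo().then((count) => console.log(`counted ${count} of 10 requests`));
```

Because the application never holds a stale copy of the counter, every one of the ten concurrent requests is counted and the final value is 10.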
It is simple to migrate.
If you have any questions or stories to tell, I'd be glad to chat or listen!