MongoDB Native Driver vs Mongoose: Performance Benchmarks
Jscrambler
Posted on December 18, 2020
The time has come to put the native driver and Mongoose to the test and benchmark how each one performs.
Mongoose is a huge help with MongoDB and offers a bunch of useful features in Node. But is it the best choice for performance-sensitive code? In this take, we’ll use Apache Benchmark (`ab`) to measure both data access strategies.
## Set Up
We will use Express to make the benchmarks a bit more realistic, since it’s one of the fastest frameworks. Only the relevant code is shown here, but feel free to check out the entire repo on GitHub.
With the native driver, this POST endpoint creates a new resource:
```javascript
nativeApp.post('/', async (req, res) => {
  const data = await req.db.native.insertOne({
    number: req.body.number,
    lastUpdated: new Date()
  })
  res.set('Location', '/' + data.ops[0]._id)
  res.status(201).send(data.ops[0])
})
```
Note there is a `req.db` object available, which ties into a native database collection:
```javascript
nativeApp.use((req, res, next) => {
  req.db = {}
  req.db.native = nativeApp.get('db').collection('native')
  next()
})
```
This `use` function is middleware in Express. Remember that it intercepts every request and hooks the database to the `req` object.
For Mongoose, we have similar middleware that does this:
```javascript
mongooseApp.use((req, res, next) => {
  req.db = {
    mongoose: mongooseConn.model(
      'Mongoose',
      new Schema({number: Number, lastUpdated: Date}),
      'mongoose')
  }
  next()
})
```
Note the use of a `Schema` that defines individual fields in the collection. If you’re coming from SQL, think of a table as a collection and a column as a field.
The POST endpoint for Mongoose looks like this:
```javascript
mongooseApp.post('/', async (req, res) => {
  const data = await req.db.mongoose.create({
    number: req.body.number,
    lastUpdated: new Date()
  })
  res.set('Location', '/' + data.id)
  res.status(201).send(data)
})
```
This endpoint uses the REST-style HTTP status code 201 to respond with the new resource. It is also a good idea to set a `Location` header with the URL and an id. This makes the document easier to find in subsequent requests.
To take MongoDB out of the picture as much as possible, be sure to set the `poolSize` to 1 in the connection object. This makes the database less efficient but puts more pressure on the API itself. The goal is not to benchmark the database, but the API and the different strategies in its data layer.
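As a rough sketch of that setting, the option name depends on the driver version. The helper below is hypothetical (`singleConnectionOptions` is not from the repo) and assumes the 3.x driver and Mongoose 5.x call the option `poolSize`, while 4.x drivers and later call it `maxPoolSize`.

```javascript
// Hypothetical helper, not from the article's repo: builds connection options
// that limit the pool to a single connection.
// Assumption: driver 3.x / Mongoose 5.x name the option `poolSize`;
// driver 4.x and later (Mongoose 6+) name it `maxPoolSize`.
function singleConnectionOptions(driverMajorVersion) {
  if (driverMajorVersion >= 4) {
    return { maxPoolSize: 1 }
  }
  return { poolSize: 1, useNewUrlParser: true, useUnifiedTopology: true }
}

// The result would be passed as the second argument to
// MongoClient.connect(uri, options) or mongoose.createConnection(uri, options).
console.log(singleConnectionOptions(3))
```

With a single connection, requests queue up behind each other in the data layer, which is what makes differences between the two strategies easier to see.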
To fire requests to this API, use curl and a separate port for each strategy:
```shell script
curl -i -H "Content-Type:application/json" -d "{\"number\":42}" http://localhost:3001/
curl -i -H "Content-Type:application/json" -d "{\"number\":42}" http://localhost:3002/
```
From this point forward, assume port `3001` has the native driver strategy. Port `3002` is for the Mongoose data access strategy.
## Read Performance
The native driver has the following GET endpoint:
```javascript
nativeApp.get('/:id', async (req, res) => {
  const doc = await req.db.native.findOne({_id: new ObjectId(req.params.id)})
  res.send(doc)
})
```
For Mongoose, this gets a single document:
```javascript
mongooseApp.get('/:id', async (req, res) => {
  const doc = await req.db.mongoose.findById(req.params.id).lean()
  res.send(doc)
})
```
Note the code in Mongoose is easier to work with. We put `lean` at the end of the query to make this as efficient as possible. This prevents Mongoose from hydrating the entire object model, since the endpoint does not need that functionality. To get a good performance measurement, try benchmarking with and without the `lean` option in the query.
To fire requests to both endpoints in Apache Benchmark:
```shell script
ab -n 150 -c 4 -H "Content-Type:application/json" http://localhost:3001/5fa548f96a69652a4c80e70d
ab -n 150 -c 4 -H "Content-Type:application/json" http://localhost:3002/5fa5492d6a69652a4c80e70e
```
A couple of `ab` arguments to note: `-n` is the total number of requests and `-c` is the number of concurrent requests. A decent-sized developer box has around 8 logical cores. Setting the concurrency to 4 uses up half the cores and leaves resources free for the API, the database, and other programs. Setting it much higher mostly benchmarks how the CPU schedules competing work rather than the API itself, so results might be inconclusive.
## Write Performance
For Mongoose, create a PUT endpoint that updates a single document:
```javascript
mongooseApp.put('/:id', async (req, res) => {
  const { number } = req.body
  const data = await req.db.mongoose.findById(req.params.id)
  data.number = number
  data.lastUpdated = new Date()
  res.send(await data.save())
})
```
The native driver can do this succinctly:
```javascript
nativeApp.put('/:id', async (req, res) => {
  const { number } = req.body
  const data = await req.db.native.findOneAndUpdate(
    {_id: new ObjectId(req.params.id)},
    {$set: {number: number}, $currentDate: {lastUpdated: true}},
    {returnOriginal: false})
  res.send(data.value)
})
```
Mongoose has a similar `findOneAndUpdate` method that is less expensive but also has fewer features. When doing benchmarks, it is better to stick to worst-case scenarios. This means including all the features available to make a more informed decision. Doing a `find` then a `save` in Mongoose comes with change tracking and other desirable features that are not available in the native driver.
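For comparison, the cheaper single-call variant in Mongoose might look like the sketch below. It is not part of the benchmark, and `updateNumber` is a hypothetical helper. It assumes `Model` is a compiled Mongoose model such as `req.db.mongoose`.

```javascript
// Hypothetical helper, not part of the benchmarked code.
// Assumption: `Model` is a compiled Mongoose model (e.g. req.db.mongoose).
function updateNumber(Model, id, number) {
  return Model.findOneAndUpdate(
    { _id: id },
    { $set: { number, lastUpdated: new Date() } },
    { new: true } // return the updated document instead of the original
  ).lean() // skip hydration; the result is a plain object
}

// Possible use inside a route handler:
//   res.send(await updateNumber(req.db.mongoose, req.params.id, req.body.number))
```

This trades away the change tracking that comes with `find` then `save`, and by default it also skips schema validators unless `runValidators` is set.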
To benchmark these endpoints in Apache Benchmark:
```shell script
ab -n 150 -c 4 -T "application/json" -u .putdata http://localhost:3001/5fa548f96a69652a4c80e70d
ab -n 150 -c 4 -T "application/json" -u .putdata http://localhost:3002/5fa5492d6a69652a4c80e70e
```
Be sure to create a `.putdata` file with the following:
```json
{"number":42}
```
Both endpoints update the `lastUpdated` timestamp field in the document. This is to bust any Mongoose/MongoDB cache that optimizes performance, forcing the database and data access layer to do actual work.
## Results and Conclusion
Drumroll please, below are the results:
| READS | Native | Mongoose |
|---|---|---|
| Throughput | 1200 #/sec | 583 #/sec |
| Avg Request | 0.83 ms | 1.71 ms |

| WRITES | Native | Mongoose |
|---|---|---|
| Throughput | 1128 #/sec | 384 #/sec |
| Avg Request | 0.89 ms | 2.60 ms |
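The two rows in each table appear to be related: the average request time is roughly the inverse of the throughput. The short sketch below recomputes it from the throughput figures and also derives the native-to-Mongoose ratio. It assumes the "Avg Request" row is the mean across all concurrent requests, and because the throughput values are rounded, the recomputed times agree with the tables only to within rounding.

```javascript
// Throughput figures copied from the results tables (requests per second).
const throughput = {
  reads: { native: 1200, mongoose: 583 },
  writes: { native: 1128, mongoose: 384 }
}

// Assumption: average time per request (ms) is the inverse of throughput.
const msPerRequest = (requestsPerSecond) => 1000 / requestsPerSecond

for (const [kind, { native, mongoose }] of Object.entries(throughput)) {
  console.log(
    kind,
    'native:', msPerRequest(native).toFixed(2), 'ms,',
    'mongoose:', msPerRequest(mongoose).toFixed(2), 'ms,',
    'ratio:', (native / mongoose).toFixed(2) + 'x'
  )
}
```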
Overall, the native driver is around 2x faster than Mongoose on reads and closer to 3x faster on writes. Because the native driver uses `findOneAndUpdate`, its read and write results are nearly identical. The `findOneAndUpdate` in Mongoose performs identically to `findById` with the `lean` option. Mongoose takes a ding with `save`, but this comes with more features. Removing `lean`, which brings hydration back, does not make a difference because the document object is small.
With these results, one takeaway is to be mindful of performance when choosing to use Mongoose. There is no real reason to give up the native driver when adopting Mongoose, because the two are also useful in unison. For performance-sensitive code, it is best to use the native driver. For feature-rich endpoints that are less performance-sensitive, it is okay to use Mongoose.
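One way the two can work in unison is to reach the native collection through a Mongoose model on a hot path. The sketch below is an assumption-laden illustration rather than code from the benchmark: `findOneNative` is a hypothetical helper, `Model` is assumed to be a compiled Mongoose model on an open connection, and `objectId` must already be an ObjectId instance because calls on `Model.collection` bypass Mongoose's schema casting.

```javascript
// Hypothetical helper, not from the article's repo.
// Assumptions: `Model` is a compiled Mongoose model whose connection is open,
// and `objectId` is already an ObjectId (no Mongoose casting happens here).
function findOneNative(Model, objectId) {
  // Model.collection exposes the underlying native driver collection.
  return Model.collection.findOne({ _id: objectId })
}

// Possible use inside a route handler (ObjectId imported from the driver):
//   res.send(await findOneNative(req.db.mongoose, new ObjectId(req.params.id)))
```

The rest of the application can keep using the model's regular methods where schema features matter more than raw speed.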
Originally published on the Jscrambler Blog by Camilo Reyes.