MongoDB Native Driver vs Mongoose: Performance Benchmarks

Jscrambler

Posted on December 18, 2020

The time has come to put the native driver and Mongoose to the test and benchmark how each one performs.

Mongoose is a huge help with MongoDB and offers a bunch of useful features in Node. But is it the best choice for performance-sensitive code? In this take, we’ll use Apache Benchmark (`ab`) to measure both data access strategies.

## Set Up

We will use Express to make the benchmarks a bit more realistic, since it’s one of the fastest Node web frameworks. Only the relevant code is shown here, but feel free to check out the entire repo on GitHub.

With the native driver, this POST endpoint creates a new resource:

```javascript
nativeApp.post('/', async (req, res) => {
  const data = await req.db.native.insertOne({
    number: req.body.number,
    lastUpdated: new Date()
  })
  // `ops` holds the inserted documents (result shape of the 3.x driver)
  res.set('Location', '/' + data.ops[0]._id)
  res.status(201).send(data.ops[0])
})
```

Note there is a req.db object available, which ties into a native database collection:

```javascript
nativeApp.use((req, res, next) => {
  req.db = {}
  req.db.native = nativeApp.get('db').collection('native')
  next()
})
```

The function passed to `use` is Express middleware. It runs on every request and attaches the database collection to the `req` object.
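To make the middleware idea concrete, here is a minimal sketch of a middleware chain in plain JavaScript. It is a simplified stand-in and not how Express is actually implemented; `runMiddleware`, `fakeCollection`, `attachDb`, and `handler` are hypothetical names used only for illustration.

```javascript
// Simplified stand-in for a middleware chain (not Express itself).
// Each middleware receives (req, res, next) and calls next() to continue.
function runMiddleware (middlewares, req, res) {
  let index = 0
  const next = () => {
    const middleware = middlewares[index++]
    if (middleware) middleware(req, res, next)
  }
  next()
  return req
}

// Hypothetical object standing in for a real MongoDB collection.
const fakeCollection = { name: 'native' }

// Same shape as the middleware above: attach the collection, then continue.
const attachDb = (req, res, next) => {
  req.db = { native: fakeCollection }
  next()
}

// A later handler can rely on req.db being present.
const handler = (req, res) => {
  res.body = 'using collection ' + req.db.native.name
}

const req = {}
const res = {}
runMiddleware([attachDb, handler], req, res)
console.log(res.body)
```

Because `attachDb` runs before `handler`, every handler further down the chain can read `req.db` without knowing how the collection was obtained.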

For Mongoose, we have similar middleware that does this:

```javascript
mongooseApp.use((req, res, next) => {
  req.db = {mongoose: mongooseConn.model(
    'Mongoose',
    new Schema({number: Number, lastUpdated: Date}),
    'mongoose')}
  next()
})
```

Note the use of a Schema that defines individual fields in the collection. If you’re coming from SQL, think of a table as a collection and a column as a field.

The POST endpoint for Mongoose looks like this:

```javascript
mongooseApp.post('/', async (req, res) => {
  const data = await req.db.mongoose.create({
    number: req.body.number,
    lastUpdated: new Date()
  })
  res.set('Location', '/' + data.id)
  res.status(201).send(data)
})
```

This endpoint follows REST conventions and responds with HTTP status code 201 along with the new resource. It is also a good idea to set a Location header containing the URL and id of the new document, which makes the document easier for clients to find in subsequent requests.

To keep MongoDB itself out of the picture as much as possible, be sure to set the `poolSize` to 1 in the connection options. This makes the database less efficient but puts more pressure on the API itself. The goal is not to benchmark the database but the API, using different strategies in the data layer.
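As a rough sketch of that setting, the options object below assumes the 3.x native driver and Mongoose 5, where the pool option is named `poolSize`. The `connectionOptions` name and the connection URLs in the comments are placeholders and are not taken from the repo.

```javascript
// Hypothetical options object; `poolSize` is the pool option name in the
// 3.x native driver and Mongoose 5 (assumed versions for this article).
const connectionOptions = {
  poolSize: 1, // a single pooled connection per strategy
  useNewUrlParser: true,
  useUnifiedTopology: true
}

// Assumed usage (URLs are placeholders):
// const client = await MongoClient.connect('mongodb://localhost:27017', connectionOptions)
// const mongooseConn = mongoose.createConnection('mongodb://localhost:27017/test', connectionOptions)

console.log(connectionOptions)
```

Sharing one options object between both strategies keeps the comparison fair, since each side is limited to the same single connection.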

To fire requests to this API, use curl and a separate port for each strategy:

```shell
curl -i -H "Content-Type:application/json" -d "{\"number\":42}" http://localhost:3001/
curl -i -H "Content-Type:application/json" -d "{\"number\":42}" http://localhost:3002/
```

From this point forward, assume port `3001` has the native driver strategy. Port `3002` is for the Mongoose data access strategy.

## Read Performance

The native driver has the following GET endpoint:



```javascript
// ObjectId is exported by the mongodb package
nativeApp.get('/:id', async (req, res) => {
  const doc = await req.db.native.findOne({_id: new ObjectId(req.params.id)})
  res.send(doc)
})
```

For Mongoose, this gets a single document:

```javascript
mongooseApp.get('/:id', async (req, res) => {
  const doc = await req.db.mongoose.findById(req.params.id).lean()
  res.send(doc)
})
```

Note the code in Mongoose is easier to work with. We put `lean` at the end of the query to make this as efficient as possible. This makes Mongoose return plain JavaScript objects instead of hydrating full Mongoose documents, which this endpoint does not need. To get a good performance measurement, try benchmarking with and without the `lean` option in the query.

To fire requests to both endpoints in Apache Benchmark:

```shell
ab -n 150 -c 4 -H "Content-Type:application/json" http://localhost:3001/5fa548f96a69652a4c80e70d
ab -n 150 -c 4 -H "Content-Type:application/json" http://localhost:3002/5fa5492d6a69652a4c80e70e
```

A couple of `ab` arguments to note: the `-n` parameter is the total number of requests and `-c` is the number of concurrent requests. A decent-sized developer box has around 8 logical cores. Setting the concurrent count to 4 chews up half the cores and frees up resources for the API, database, and other programs. Setting the concurrent count much higher mostly measures how the CPU schedules competing work rather than the API itself, so results might be inconclusive.
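The half-the-cores rule of thumb can be sketched in a few lines of Node. The helper names `suggestConcurrency` and `buildAbCommand` are made up for this illustration; only `os.cpus()` is a real Node API.

```javascript
const os = require('os')

// Use roughly half of the logical cores for ab, never fewer than one.
function suggestConcurrency (logicalCores) {
  return Math.max(1, Math.floor(logicalCores / 2))
}

// Build the ab command line for a request count, concurrency, and URL.
function buildAbCommand (requests, concurrency, url) {
  return `ab -n ${requests} -c ${concurrency} ${url}`
}

// For a box with 8 logical cores, as described above.
console.log(buildAbCommand(150, suggestConcurrency(8), 'http://localhost:3001/'))

// For the machine this script runs on.
console.log(suggestConcurrency(os.cpus().length))
```

Deriving `-c` from the core count keeps the load generator from competing too heavily with the API and the database on the same machine.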

## Write Performance

For Mongoose, create a PUT endpoint that updates a single document:



```javascript
mongooseApp.put('/:id', async (req, res) => {
  const { number } = req.body
  const data = await req.db.mongoose.findById(req.params.id)
  data.number = number
  data.lastUpdated = new Date()
  res.send(await data.save())
})
```

The native driver can do this succinctly:

```javascript
nativeApp.put('/:id', async (req, res) => {
  const { number } = req.body
  const data = await req.db.native.findOneAndUpdate(
    {_id: new ObjectId(req.params.id)},
    {$set: {number: number}, $currentDate: {lastUpdated: true}},
    // returnOriginal: false returns the updated document (3.x driver option)
    {returnOriginal: false})
  res.send(data.value)
})
```

Mongoose has a similar `findOneAndUpdate` method that is less expensive but also has fewer features. When doing benchmarks, it is better to stick to worst-case scenarios, which means including all the available features to make a more informed decision. Doing a find and then a save in Mongoose comes with change tracking and other desirable features that are not available in the native driver.

To benchmark these endpoints in Apache Benchmark:

```shell
ab -n 150 -c 4 -T "application/json" -u .putdata http://localhost:3001/5fa548f96a69652a4c80e70d
ab -n 150 -c 4 -T "application/json" -u .putdata http://localhost:3002/5fa5492d6a69652a4c80e70e
```

Be sure to create a `.putdata` file with the following:



```json
{"number":42}
```

Both endpoints update the `lastUpdated` timestamp field in the document. This busts any Mongoose/MongoDB cache that might optimize performance and forces the database and data access layer to do actual work.

## Results and Conclusion

Drumroll please, below are the results:

| READS | Native | Mongoose |
| --- | --- | --- |
| Throughput | 1200 #/sec | 583 #/sec |
| Avg Request | 0.83 ms | 1.71 ms |

| WRITES | Native | Mongoose |
| --- | --- | --- |
| Throughput | 1128 #/sec | 384 #/sec |
| Avg Request | 0.89 ms | 2.60 ms |

Overall, the native driver is around 2x faster than Mongoose for reads and nearly 3x faster for writes. Because the native driver uses `findOneAndUpdate`, its read and write results are nearly identical. The `findOneAndUpdate` in Mongoose performs identically to `findById` with the `lean` option. Mongoose takes a ding with `save`, but this comes with more features. Dropping `lean`, so that documents are hydrated, does not make a difference because the document object is small.
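The ratios behind that comparison can be worked out from the table. The sketch below copies the throughput figures and assumes the Avg Request row is simply the reciprocal of throughput, which is what the numbers suggest; `results`, `avgMs`, and `speedup` are names chosen for this illustration.

```javascript
// Throughput figures copied from the results table (requests per second).
const results = {
  reads: { native: 1200, mongoose: 583 },
  writes: { native: 1128, mongoose: 384 }
}

// Mean time per request in ms, taken as the reciprocal of throughput.
const avgMs = (perSecond) => 1000 / perSecond

// How many times faster the native driver is for a given operation.
const speedup = ({ native, mongoose }) => native / mongoose

for (const [operation, row] of Object.entries(results)) {
  console.log(
    operation,
    'native', avgMs(row.native).toFixed(2), 'ms,',
    'mongoose', avgMs(row.mongoose).toFixed(2), 'ms,',
    'speedup', speedup(row).toFixed(2) + 'x'
  )
}
```

This works out to a speedup of about 2.06x for reads and 2.94x for writes, and the computed request times match the table to within rounding.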

With these results, one takeaway is to be mindful of performance when choosing to use Mongoose. There is no real reason to treat the native driver and Mongoose as mutually exclusive, because they are also useful in unison. For performance-sensitive code, it is best to use the native driver. For feature-rich endpoints that are less performance-sensitive, it is okay to use Mongoose.


Originally published on the Jscrambler Blog by Camilo Reyes.
