Dynamically optimising and caching images via a Node.js microservice
Matt Inamdar
Posted on April 9, 2022
At Health Place we are pushing towards a media-heavy platform filled with images and videos to make it more engaging for our users, giving them a better idea of the support listings they may be interested in.
Along with this comes the big challenge of keeping things fast and efficient.
In this article we'll go through: where we started along with the problems we were facing, the ideal solution we had in mind, and finally the solution we ended up with.
TL;DR: We built a microservice in Node.js to serve our images with dynamic optimisation provided via query string params. We then placed this behind a CDN to cache indefinitely.
The problem
Each listing on Health Place has support for a variety of images, not limited to:
- Listing logo
- Providing organisation logo
- Listing gallery (there can be many images in here!)
- Location image (a listing can have many locations)
- Soon listings will also have a wide banner image
Let's start by discussing how these images are provided to us, and where they're stored.
All images are uploaded via our admin web app, which is where organisations can log in and manage their listings on the site.
This all goes through our primary PHP-based API, which in turn uploads the image to Amazon S3, our cloud storage service of choice.
Initially we had endpoints provided by that same API to serve the images. The API would need to download the image from S3 each time and return it. This quickly became an issue because PHP handles each request synchronously: the worker serving an image was blocked for the whole download and could not handle any other request until the image had been returned.
There was another issue too. The uploaded images were almost never optimised. They were often large in both resolution and file size, making them a poor fit for consumption via the frontend web app.
To counter this, we implemented image optimisation at upload time by pushing a job to optimise each image once it had been uploaded. This worked, but it introduced ever-growing complexity, and the time came to consider moving this logic out of PHP entirely...
The ideal solution
The ideal plan was to have a completely separate microservice that is responsible for the following:
- Uploading images (behind auth)
- Streaming images
- API to confirm image exists
- Dynamically optimising images via query string parameters
- CDN to cache images and optimised versions
The PHP API would then return a field in the JSON response, telling the client to go to the Images Microservice to get the image, e.g.:
{
  "name": "Listing name",
  "logo_url": "https://images.healthplace.io/image-id"
}
The client could then append some query string parameters to dynamically request a version optimised for its specific use case:
<img src="https://images.healthplace.io/image-id?format=jpg&quality=80&height=250" />
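As a rough sketch of how a client might build such a URL, here is a small helper. The function name buildImageUrl and the ImageOptions type are hypothetical; only the format, quality and height parameters come from the example above.

```typescript
// Hypothetical option names, mirroring the query string parameters
// used in the example URL above.
interface ImageOptions {
  format?: string;
  quality?: number;
  height?: number;
}

// Appends each provided option to the image URL as a query string parameter.
function buildImageUrl(baseUrl: string, options: ImageOptions = {}): string {
  const url = new URL(baseUrl);
  for (const [name, value] of Object.entries(options)) {
    if (value !== undefined) {
      url.searchParams.set(name, String(value));
    }
  }
  return url.toString();
}

const logoUrl = buildImageUrl("https://images.healthplace.io/image-id", {
  format: "jpg",
  quality: 80,
  height: 250,
});
// logoUrl is "https://images.healthplace.io/image-id?format=jpg&quality=80&height=250"
```

Keeping the parameters in a consistent order on the client also helps later on, since identical URLs are easier for a cache to match.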
Finally, this should be placed behind a CDN resulting in only the first request having to wait for the microservice to download the image from S3 and optimise it. All subsequent requests for that image with those exact optimisations would be returned immediately from the CDN.
This greatly simplifies our workflow as now an image would only need to be uploaded once in its raw and unoptimised state. All optimisations are then achieved dynamically at time of use.
The implementation
First, a quick note:
In our first iteration, we've managed to push the streaming, optimisation, and caching logic to a newly created Images Microservice. However, the uploading of new images and persistence to S3 is still achieved through our main API. Our next iteration will push this logic into the Images Microservice.
So what did we do?
First, we created a standard express app using TypeScript (nothing special here). Then we pulled in this extremely useful package called express-sharp that wraps sharp, a Node.js image manipulation library, in an express middleware.
We then set up the middleware to listen to any route that began with /_/, which would use the S3 adapter to pull the image from S3. This allows /_/my-image-id to be passed to the adapter with a key of my-image-id, correlating to the file path in the S3 bucket.
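To make the route-to-key mapping concrete, here is a minimal sketch of the idea. In our setup the middleware does this for us, so the storageKeyFromPath function below is purely illustrative and not part of express-sharp.

```typescript
// The route prefix the middleware is mounted on.
const ROUTE_PREFIX = "/_/";

// Illustrative only: strips the route prefix to get the storage key,
// or returns undefined when the path is not an image route.
function storageKeyFromPath(requestPath: string): string | undefined {
  if (!requestPath.startsWith(ROUTE_PREFIX)) {
    return undefined;
  }
  const key = requestPath.slice(ROUTE_PREFIX.length);
  return key.length > 0 ? key : undefined;
}

const key = storageKeyFromPath("/_/my-image-id");
// key is "my-image-id"
```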
There's also a TypeScript interface provided to write your own adapters which we utilised.
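To give a feel for what a custom adapter can look like, here is a sketch that reads images from a local directory. The ImageAdapterLike interface is our assumption about the general shape (take a key, resolve to the image bytes or to undefined when nothing is found), so check the express-sharp typings for the real interface before relying on it. The LocalFileAdapter class name is hypothetical.

```typescript
import { readFile } from "node:fs/promises";
import { resolve, sep } from "node:path";

// Assumed adapter shape, declared locally for illustration.
// It is not imported from express-sharp.
interface ImageAdapterLike {
  fetch(id: string): Promise<Buffer | undefined>;
}

// Hypothetical adapter that serves images from a directory on disk.
class LocalFileAdapter implements ImageAdapterLike {
  private readonly root: string;

  constructor(root: string) {
    this.root = resolve(root);
  }

  async fetch(id: string): Promise<Buffer | undefined> {
    const filePath = resolve(this.root, id);
    // Refuse keys that would escape the root directory, e.g. "../secrets".
    if (!filePath.startsWith(this.root + sep)) {
      return undefined;
    }
    try {
      return await readFile(filePath);
    } catch (error) {
      // A missing file is reported as "no image"; other errors are rethrown.
      if ((error as { code?: string }).code === "ENOENT") {
        return undefined;
      }
      throw error;
    }
  }
}
```

Returning undefined for a missing image, rather than throwing, lets the calling code decide how to respond (for example with a 404).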
Query string parameter based optimisations are provided out of the box, so no need to do anything fancy here!
We then provided two domains for the microservice:
- Origin: https://images-api.healthplace.io
- CDN: https://cdn.images-api.healthplace.io
The CDN is set up to forward requests to the origin domain on cache misses, and the CDN domain is the one used in our API responses. As part of the CDN config, we include the query string optimisation parameters in the cache key, so each optimised variant is cached separately and repeat requests for the same variant are served straight from the cache.
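In practice the cache key is configured in the CDN's own settings rather than in code, but the idea can be sketched as a small function. The parameter list below is an assumption based on the example URL earlier in this article, and cacheKeyFor is a hypothetical name.

```typescript
// Parameters assumed to change the rendered image, kept in a fixed order.
const CACHE_KEY_PARAMS = ["format", "height", "quality"];

// Builds a cache key from the path plus only the parameters that affect
// the image, so unrelated parameters and parameter order do not cause
// needless cache misses.
function cacheKeyFor(requestUrl: string): string {
  const url = new URL(requestUrl);
  const kept = CACHE_KEY_PARAMS
    .filter((name) => url.searchParams.has(name))
    .map((name) => `${name}=${url.searchParams.get(name)}`);
  return kept.length > 0 ? `${url.pathname}?${kept.join("&")}` : url.pathname;
}

const keyA = cacheKeyFor(
  "https://cdn.images-api.healthplace.io/_/my-image-id?quality=80&format=jpg"
);
const keyB = cacheKeyFor(
  "https://cdn.images-api.healthplace.io/_/my-image-id?format=jpg&quality=80&ref=home"
);
// keyA and keyB are both "/_/my-image-id?format=jpg&quality=80"
```

Two requests for the same variant then share one cache entry, while a request with a different height or quality gets its own.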
We now have a fully working Images Microservice!
We also added further endpoints and adapters to account for other useful situations, such as returning SVGs and files from the filesystem.
Building upon this, we'd like to support uploading images directly to this microservice, so that our main API only needs to accept the corresponding IDs. The Images Microservice could then provide an endpoint for the main API to check that an image ID exists. There's also scope to dynamically add watermarks and all sorts of other manipulations!
But that's all I have for now!
Get in touch
If you have any questions, comment below, and I'll get back to you.
And if you think you'd like to work somewhere like Health Place, then message me for a chat at matt@healthplace.io.
Photo by Warren Umoh on Unsplash