Robert Marshall
Posted on November 7, 2022
This article was originally posted (and is more up to date) at https://robertmarshall.dev/blog/use-yoast-sitemap-with-next-js-and-headless-wordpress/
There are many perfectly valid ways of creating sitemaps for Next JS, especially when using WordPress as a headless CMS.
You could generate the XML yourself by grabbing all the URLs with custom code, or you could use a plugin like next-sitemap to do the hard work for you.
These are both good options, but only if you do not want to use ISR (Incremental Static Regeneration) and are happy to rebuild the whole site on each content change.
TLDR
Create a lightweight proxy that fetches the Yoast sitemap data, serves it from your Next JS domain, and rewrites any internal URLs to that same domain. The full solution is described below.
The Problem (brief)
The clients I work with and I tend to build sites that are very content heavy, with lots of images. On top of this, content gets published regularly, and we need the sitemap to be updated after each publish, edit or deletion of content.
This can be managed in a number of ways. We could:
Rebuild the whole site with SSG
Rebuild the whole site using SSG every time content changes. This would generate a clean and up-to-date sitemap.
Unfortunately this isn’t a great solution. The build times will get longer as the content grows. We would see builds failing as Vercel/Netlify (your build tool of choice) timed out. It is also overkill, and not at all efficient, to rebuild a whole site for one piece of content.
Run a pre-build script
This idea initially appealed to me. It shouldn’t be too much of a nightmare. Should be as simple as:
- Run a query before the whole ISR build kicks off.
- Grab all the site data with a GraphQL query.
- Build out a new XML file.
- Push to the host.
My concern with this is that when hosting Next JS with Vercel/Netlify etc., the directory the file needs to go in may not exist yet, especially if this is the first time the site has been built. We need to know where the file is to be sent, and there may be permissions issues. It is a rabbit hole I can't be bothered with.
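For comparison, the "build out a new XML file" step in that list is the easy part. Below is a minimal sketch of it, assuming the GraphQL query has already returned a list of entries. The helper names and the `{ url, lastModified }` entry shape are hypothetical, chosen only for illustration.

```javascript
// Escape the five characters that are not allowed raw inside XML text.
function escapeXml(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&apos;');
}

// Hypothetical helper: turn a list of page entries into sitemap XML.
// The entry shape ({ url, lastModified }) is an assumption.
function buildSitemapXml(entries) {
  const urls = entries
    .map(({ url, lastModified }) => {
      const lastmod = lastModified
        ? `<lastmod>${escapeXml(lastModified)}</lastmod>`
        : '';
      return `  <url><loc>${escapeXml(url)}</loc>${lastmod}</url>`;
    })
    .join('\n');

  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    urls,
    '</urlset>',
  ].join('\n');
}
```

Generating the string is straightforward. Getting the resulting file onto the host reliably is where the concerns above come in.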
The Solution
Use the Yoast sitemap files.
Every headless WordPress site I build generally has Yoast installed. It makes handling a page's SEO far simpler, and content teams have usually worked with it before.
Yoast also generates its own sitemap XML. This is created as ‘sitemap_index.xml’, and requests for ‘sitemap.xml’ are redirected to it. So why not use this?
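To make the shape of that data concrete, here is a small sketch of a sitemap index in the format defined by the sitemaps.org protocol, along with a helper that lists the child sitemaps it points to. The sample domain, the file names and the `extractLocs` helper are placeholders for illustration, not output copied from a real Yoast install.

```javascript
// Sample data: a trimmed sitemap index. Domain and file names are placeholders.
const sampleIndex = `<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap><loc>https://my-wordpressdomain.com/post-sitemap.xml</loc></sitemap>
  <sitemap><loc>https://my-wordpressdomain.com/page-sitemap.xml</loc></sitemap>
</sitemapindex>`;

// Hypothetical helper: pull every <loc> value out of a sitemap document.
function extractLocs(xml) {
  return Array.from(xml.matchAll(/<loc>\s*([^<]+?)\s*<\/loc>/g), (m) => m[1]);
}

console.log(extractLocs(sampleIndex));
```

Every URL in the index points at the WordPress domain, which is why the content cannot simply be copied over as is.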
This solution was inspired by how the Patronage bubs-next starter handles their sitemaps.
Create a Sitemap XML Template
The first step is to create an XML template that our Yoast sitemap content can be injected into.
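Download the XML file from this GitHub gist and add it to your /public folder. The proxy code later in this article rewrites the stylesheet path to /sitemap-template.xsl, so the file needs to be served under that name.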
XML Template: https://gist.github.com/robmarshall/caafd8a4328db0e58cf721a307af64e1
Creating a Proxy
The next step in using the Yoast sitemap from a headless WordPress install inside of Next JS is to create a proxy.
The purpose of this is to get the sitemap file contents from WordPress and serve it on your domain.
Create the API file
Create a file named sitemap-proxy.js within /pages/api/. This creates a server side function that we can use via ‘/api/sitemap-proxy’.
Install a Package
For the proxy to work, you will need to install the isomorphic-unfetch package.
This is a helpful wrapper for fetch that works on both the client and the server.
To install: npm i isomorphic-unfetch
Add the proxy code
Add the code below to your sitemap-proxy.js file.
import fetch from 'isomorphic-unfetch';

const WORDPRESS_URL = process.env.NEXT_PUBLIC_WORDPRESS_DOMAIN; // e.g. 'https://my-wordpressdomain.com'
const FRONTEND_URL = process.env.NEXT_PUBLIC_FRONTEND_DOMAIN; // e.g. 'https://localhost:3000'

// Escape regex special characters so the URL (including its dots) is matched literally.
const escapeRegExp = (text) => text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');

// Global regex search allows replacing all URLs.
const HOSTNAME_REGEX = new RegExp(escapeRegExp(WORDPRESS_URL), 'g');

// Matches the protocol-relative path to the sitemap stylesheet.
const SITEMAP_XSL_REGEX = /\/\/(.*)sitemap-template\.xsl/g;

// Coerce the first argument to a string and, if a pattern and a
// replacement are supplied, apply the replacement.
function replace(...args) {
  const string = `${args[0]}`;
  return args.length < 3 ? string : string.replace(args[1], args[2]);
}

export default async function proxy(req, res) {
  let content;
  let contentType;

  // Get the page that was requested. The manual option allows us to process redirects
  // manually (if we get a redirect), so the next step of this function can work.
  const upstreamRes = await fetch(`${WORDPRESS_URL}${req.url}`, {
    redirect: 'manual'
  });

  // Check for redirects.
  // This allows for any internal WordPress redirect. For example /sitemap.xml to /sitemap_index.xml.
  if (upstreamRes.status > 300 && upstreamRes.status < 310) {
    const location = upstreamRes.headers.get('location');
    const locationURL = new URL(location, upstreamRes.url);

    // Follow once only if on a WordPress domain.
    if (locationURL.href.includes(WORDPRESS_URL)) {
      const response2 = await fetch(locationURL.href, {
        redirect: 'manual'
      });
      content = await response2.text();
      contentType = response2.headers.get('content-type');
    } else {
      // The redirect points away from WordPress, so refuse to follow it.
      throw new Error(
        `abort proxy to non wordpress target ${locationURL.href} to avoid redirect loops`
      );
    }
  } else {
    // There are no redirects, get original response text.
    content = await upstreamRes.text();
    contentType = upstreamRes.headers.get('content-type');
  }

  // If the current URL includes 'sitemap'.
  if (req.url.includes('sitemap')) {
    // Replace every WordPress URL inside the XML content
    // with the current site URL.
    content = replace(content, HOSTNAME_REGEX, FRONTEND_URL);

    // Change sitemap xsl file path to local.
    content = replace(content, SITEMAP_XSL_REGEX, '/sitemap-template.xsl');
  }

  // Fall back to an XML content type if the upstream response did not send one.
  res.setHeader('Content-Type', contentType || 'text/xml');
  res.setHeader('Cache-Control', 'max-age=60');
  res.send(content);
}
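Two parts of the proxy are easier to reason about when pulled out as small pure functions: the host rewrite and the check on where a redirect points. The sketch below is one possible way to write them. The names `rewriteHost` and `isSameOrigin` are hypothetical, and the origin comparison is a stricter alternative to the `includes` check used in the proxy, not a description of what the proxy does.

```javascript
// Escape regex special characters so a URL is matched literally
// (otherwise the dots in a domain would match any character).
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: swap every occurrence of the WordPress URL
// for the front-end URL. A function replacer is used so that any `$`
// in the front-end URL is not treated as a replacement pattern.
function rewriteHost(xml, wordpressUrl, frontendUrl) {
  const pattern = new RegExp(escapeRegExp(wordpressUrl), 'g');
  return xml.replace(pattern, () => frontendUrl);
}

// Hypothetical helper: true only when `target` shares the scheme,
// host and port of the WordPress URL.
function isSameOrigin(target, wordpressUrl) {
  return new URL(target).origin === new URL(wordpressUrl).origin;
}

console.log(
  rewriteHost(
    '<loc>https://wp.example.com/post-sitemap.xml</loc>',
    'https://wp.example.com',
    'https://www.example.com'
  )
);
```

Comparing origins avoids accepting a redirect target that merely contains the WordPress URL somewhere in its query string. It will also reject a redirect between the www and non-www variants of the same site, so whether it is the right trade-off depends on your setup.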
Add the Rewrites
The final step is to make sure that Next JS knows which URLs to pass through the proxy. To do this, add a rewrites function to next.config.js.
This will look like:
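module.exports = {
  // ...any existing config options stay here.
  async rewrites() {
    return [
      {
        // Matches URLs that end in sitemap.xml, e.g. /post-sitemap.xml
        source: '/(.*)sitemap.xml',
        destination: '/api/sitemap-proxy'
      },
      {
        // Matches URLs that start with sitemap, e.g. /sitemap_index.xml
        source: '/sitemap(.*).xml',
        destination: '/api/sitemap-proxy'
      }
    ];
  }
};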
This function should be added within module.exports.
This targets any URL with sitemap in the name and passes it through the proxy.
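As a rough way to see which paths the two sources cover, they can be approximated with plain regular expressions. This is only a sketch: Next JS compiles rewrite sources with path-to-regexp, and the hand-written patterns and the `isProxied` helper below are assumptions that may differ from the real matcher in edge cases.

```javascript
// Hand-written approximations of the two rewrite sources.
const SOURCES = [
  /^\/(.*)sitemap\.xml$/, // '/(.*)sitemap.xml'
  /^\/sitemap(.*)\.xml$/, // '/sitemap(.*).xml'
];

// Hypothetical helper: would this path be sent to the proxy?
function isProxied(pathname) {
  return SOURCES.some((pattern) => pattern.test(pathname));
}

['/sitemap.xml', '/post-sitemap.xml', '/sitemap_index.xml', '/about'].forEach(
  (path) => console.log(path, isProxied(path))
);
```

Under these approximations the first three paths are proxied and /about is left alone.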
Now if you spin your site up and navigate to ‘/sitemap.xml’ you should be shown the new, proxied sitemap from Yoast.
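One quick sanity check is to confirm that the proxied response no longer mentions the WordPress domain. A minimal sketch of that check is below. The `countOccurrences` helper is hypothetical, and the sample string stands in for a response body you would fetch from your own site.

```javascript
// Hypothetical helper: count how many times `needle` appears in `text`.
function countOccurrences(text, needle) {
  if (!needle) return 0; // an empty needle would otherwise match everywhere
  return text.split(needle).length - 1;
}

// Stand-in for the body returned by your own /sitemap.xml route.
const proxiedBody =
  '<sitemap><loc>https://www.example.com/post-sitemap.xml</loc></sitemap>';

console.log(countOccurrences(proxiedBody, 'https://wp.example.com'));
```

A count of zero suggests every WordPress URL was rewritten. Anything higher points to URLs the proxy missed, for example ones using a different protocol or subdomain.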
Hopefully this helped you, and if you have any questions you can reach me at: @robertmars