Docusaurus: Understanding the sitemap feature
Omar Hussein
Posted on November 3, 2023
Docusaurus, the popular static site generator, simplifies the process of building documentation websites. Among its powerful features is the sitemap generator, a vital component for enhancing a site's search engine optimization (SEO).
Exploring the Code
1. isNoIndexMetaRoute
Function:
This function is crucial for determining whether a specific route should be excluded from the sitemap. It checks the HTML head of a page for a meta tag with the name "robots" and the content "noindex". If found, the route is excluded from the sitemap. This ensures that pages marked as 'noindex' in their meta tags are not included in the generated sitemap.
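To make the check concrete, here is a minimal sketch of the idea in TypeScript. It is not the Docusaurus source: the MetaTag and HeadByRoute shapes are hypothetical simplifications of the real head data, and treating any content value that contains "noindex" as a match is an assumption of the sketch.

```typescript
// Hypothetical, simplified shape: the attributes of one <meta> tag.
type MetaTag = { name?: string; content?: string };

// Hypothetical shape: each route path maps to the meta tags found in its <head>.
type HeadByRoute = Record<string, MetaTag[]>;

function isNoIndexMetaRoute(head: HeadByRoute, route: string): boolean {
  // A route with no recorded head content has no robots tag at all.
  const metaTags = head[route] ?? [];
  return metaTags.some(
    (tag) =>
      tag.name?.toLowerCase() === "robots" &&
      // Assumption: "noindex" may sit next to other directives, e.g. "noindex, nofollow".
      (tag.content ?? "").toLowerCase().includes("noindex"),
  );
}

// Example data, invented for illustration.
const head: HeadByRoute = {
  "/docs/intro": [{ name: "description", content: "Getting started" }],
  "/docs/draft": [{ name: "robots", content: "noindex, nofollow" }],
};
```

With this data, isNoIndexMetaRoute(head, "/docs/draft") returns true, so that route would be left out, while "/docs/intro" stays in.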
2. createSitemap
Function:
Input Parameters: The function takes several parameters, including siteConfig (Docusaurus configuration), routesPaths (an array of route paths), head (HTML head content for each route), and options (plugin options for the sitemap).
Exclusion Logic: The function filters out routes that should not be included in the sitemap. Routes ending with 404.html are excluded, as are routes matching patterns specified in ignorePatterns from the options. Additionally, routes with 'noindex' meta tags, identified by isNoIndexMetaRoute, are excluded.
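The three exclusion rules can be sketched as a single filter. This is an illustration under assumptions, not the plugin's code: filterRoutes and ExclusionOptions are hypothetical names, the ignore patterns are plain regular expressions rather than the patterns the real option accepts, and the noindex check is passed in as a callback.

```typescript
type ExclusionOptions = {
  // Stand-in for ignorePatterns; this sketch uses RegExp for simplicity.
  ignorePatterns: RegExp[];
  // Stand-in for the isNoIndexMetaRoute check.
  isNoIndex: (route: string) => boolean;
};

function filterRoutes(routesPaths: string[], options: ExclusionOptions): string[] {
  return routesPaths.filter((route) => {
    // Rule 1: never list the 404 page.
    if (route.endsWith("404.html")) return false;
    // Rule 2: drop anything matching a user-supplied ignore pattern.
    if (options.ignorePatterns.some((pattern) => pattern.test(route))) return false;
    // Rule 3: drop pages that ask search engines not to index them.
    if (options.isNoIndex(route)) return false;
    return true;
  });
}

// Example data, invented for illustration.
const noIndexRoutes = new Set(["/docs/draft"]);
const included = filterRoutes(
  ["/", "/404.html", "/docs/intro", "/docs/draft", "/tags/seo"],
  {
    ignorePatterns: [/^\/tags\//],
    isNoIndex: (route) => noIndexRoutes.has(route),
  },
);
// included is ["/", "/docs/intro"]
```

Each rule returns early, so a route is tested against the cheaper checks before the noindex lookup runs.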
Sitemap Construction: This process uses a popular npm package by the obvious name of sitemap. The included routes are formatted with appropriate trailing slashes and base URLs. The function constructs a sitemap using the SitemapStream class and writes the formatted routes into the sitemap.
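The formatting step can be sketched without the sitemap package. The two helpers below are hypothetical and deliberately simple: they assume a trailingSlash setting that is true, false, or undefined, and they ignore query strings and hashes, which a real implementation would need to handle.

```typescript
// true adds a trailing slash, false removes it, undefined leaves the path untouched.
function applyTrailingSlashSketch(path: string, trailingSlash: boolean | undefined): string {
  // The root path is always left as "/".
  if (trailingSlash === undefined || path === "/") return path;
  const hasSlash = path.endsWith("/");
  if (trailingSlash) return hasSlash ? path : `${path}/`;
  return hasSlash ? path.slice(0, -1) : path;
}

function toAbsoluteUrl(siteUrl: string, path: string): string {
  // Strip trailing slashes from the site URL to avoid "//" in the result.
  return `${siteUrl.replace(/\/+$/, "")}${path}`;
}

// Example values, invented for illustration.
const withSlash = toAbsoluteUrl("https://example.com/", applyTrailingSlashSketch("/docs/intro", true));
// withSlash is "https://example.com/docs/intro/"
const withoutSlash = toAbsoluteUrl("https://example.com", applyTrailingSlashSketch("/docs/intro/", false));
// withoutSlash is "https://example.com/docs/intro"
```

The resulting absolute URLs are what would then be written, one entry per route, into the sitemap stream.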
3. Integration and Usage:
In Docusaurus, this function belongs to the sitemap plugin (@docusaurus/plugin-sitemap), which runs it as part of the site build. It generates a structured sitemap for the entire documentation website, improving its discoverability by search engines.
Understanding the Flow
Initialization: The function begins by initializing necessary variables and checking if the site's URL is provided in the configuration. If not, an error is thrown to ensure the URL is properly set.
Exclusion Checks: The function checks each route against the exclusion criteria. A route that matches any of them (404 page, ignore patterns, or 'noindex' meta tag) is left out of the sitemap.
Sitemap Generation: Valid routes are formatted, including trailing slashes and base URLs, and added to the sitemap stream. The sitemap stream is then converted to a string and returned as the final output.
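Put together, the flow looks roughly like the sketch below. To keep it self-contained, it builds the XML string by hand instead of using SitemapStream, so it shows the shape of the flow rather than the real implementation. SiteConfigSketch, createSitemapSketch, and the isExcluded callback are hypothetical names.

```typescript
type SiteConfigSketch = { url?: string };

// Escape the characters that are not allowed as-is inside XML text.
function escapeXml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

function createSitemapSketch(
  siteConfig: SiteConfigSketch,
  routesPaths: string[],
  isExcluded: (route: string) => boolean,
): string {
  // Initialization: refuse to build a sitemap without a site URL.
  if (!siteConfig.url) {
    throw new Error("The site URL must be set in the configuration.");
  }
  const hostname = siteConfig.url.replace(/\/+$/, "");

  // Exclusion checks, then formatting of each remaining route.
  const entries = routesPaths
    .filter((route) => !isExcluded(route))
    .map((route) => `  <url><loc>${escapeXml(hostname + route)}</loc></url>`);

  // Sitemap generation: return the whole document as one string.
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ...entries,
    "</urlset>",
  ].join("\n");
}
```

Calling createSitemapSketch({ url: "https://example.com" }, ["/", "/404.html"], (route) => route.endsWith("404.html")) yields a urlset with a single entry for the home page, and calling it with an empty config throws.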
Conclusion
Understanding the inner workings of the Docusaurus sitemap generator gives valuable insight into how SEO-friendly sitemaps are created for static websites. Examining the logic and flow of the code has helped me understand how to integrate the same approach into my own SSG application, which I will talk about in my next post!