Visual and HTML Testing for Static Sites

abahgat

Alessandro Bahgat

Posted on October 1, 2019


Over a year ago I switched from having my site hosted on a CMS to having it built statically and served as a collection of static pages. I have been extremely happy with the end result for all these months -- the site is very easy to update and effortless to maintain -- but I just made a few changes that made my experience even better.

Why test Static Sites

Even for sites as simple as this, it is surprisingly easy to make breaking changes without realizing it. Over the time I have been maintaining abahgat.com, I ended up accidentally introducing bugs more than a few times. Here are a few examples of things I ran into:

  • broken links -- by default, Hugo does not validate any of the links in the content I am editing, which means that I have to be careful and make sure all URLs and paths are valid
  • incorrect theme configuration -- the more complex the theme I am using is, the more configuration options it will offer. The more options I have to configure, the more likely I am to make mistakes.
  • bugs in theme customizations -- Hugo is great at letting you override and customize theme templates. However, this is another source of potential issues.
  • bugs in the theme code itself -- No software is perfect, and any theme I might be using can have its own bugs and edge cases. This might be especially true for you if you are actively developing your own theme or you frequently update it to the most recent version available.

Most of the issues above still affected me when I was hosting my site on WordPress (I did break links and styling every now and then), but one advantage of working with a statically generated site is that we can leverage many of the tools available to web developers to catch issues early (and potentially block deploys if any issues are detected). So I set out to find what options I had to improve my workflow, so that I could make changes with more confidence that I wouldn't accidentally break my site.

What can be tested

Based on the list above, I knew I was looking to set up tests to detect, in order of priority, problems such as:

  1. broken internal links
  2. invalid or malformed HTML
  3. issues with layout or presentation
  4. invalid RSS feed entries

Thankfully, I was able to find a way to cover most of these.

Testing HTML with html-proofer

Covering the first two items on the list was fairly straightforward with html-proofer.

Provided you have Ruby installed, you can get html-proofer as a gem via the command below

gem install html-proofer

and then run it via

htmlproofer --extension .html ./public

This will scan the ./public directory for any files with an .html extension and output a report listing any issues with the markup in those files.

When I first ran it on my site, I got a pretty good list of actionable warnings. The messages are fairly specific and easy to understand, as you can tell by looking at the snippet below:

- ./public/author/abahgat/index.html
  *  356:11: ERROR: Opening and ending tag mismatch: section and div (line 356)
- ./public/author/index.html
  *  356:11: ERROR: Opening and ending tag mismatch: section and div (line 356)
- ./public/blog/index.html
  *  829:2157: ERROR: Unexpected end tag : p (line 829)
- ./public/blog/maps-for-public-transport-users/index.html
  *  internally linking to uploads/2009/01/p-480-320-0e6ac38d-252e-47fa-be79-0ae974dad8d2.jpeg, which does not exist (line 476)
     <a href="uploads/2009/01/p-480-320-0e6ac38d-252e-47fa-be79-0ae974dad8d2.jpeg"><img class="size-full wp-image-364 aligncenter" src="/img/wp-uploads/2009/01/p-480-320-0e6ac38d-252e-47fa-be79-0ae974dad8d2.jpeg" alt="" width="200" height="300"></a>
- ./public/blog/page/2/index.html
  *  linking to internal hash #broken-priorites that does not exist (line 1456)
     <a href="#broken-priorites">The way priorities are managed is broken</a>
  *  linking to internal hash #duplicates that does not exist (line 1453)
     <a href="#duplicates">Lots of issues are duplicates</a>
  *  linking to internal hash #missing-info that does not exist (line 1455)
     <a href="#missing-info">Bug reports do not include enough information</a>
  *  linking to internal hash #processes that does not exist (line 1454)
     <a href="#processes">The system imposes over-engineered processes</a>
  *  linking to internal hash #tracker-misuse that does not exist (line 1452)
     <a href="#tracker-misuse">The issue tracking system is misused</a>

Even with default settings, html-proofer is able to catch most of the issues I was interested in detecting: the list above features a good mix of problems caused by invalid links in my Markdown sources, errors due to how I was misusing my template and bugs in the template I was using.

Fixing the issues required a combination of updating a few broken links, cleaning up the Markdown sources for my site, submitting a few bugs and Pull Requests against the theme I am using.

Overall, all the issues flagged made sense and were worth fixing.

Visual Testing with Percy

As useful as html-proofer is, it does not help catch layout and presentation issues that are not caused by invalid markup. I have had good experiences with visual testing and review at work, and I was interested in using screenshots to detect layout issues and catch any unintended presentational changes on my own site too.

I cared about this because upgrading my Hugo theme sometimes involves non-trivial changes that could go wrong (despite George, the author, keeping really good change logs).

Also, I wanted to make customizations to the theme, and having testing in place is the only way I know to make sure I don't inadvertently break anything: since I will not review every single page manually every time I make layout changes, having a way to be warned about any differences is very valuable.

I ended up settling on Percy, a tool that was clearly designed first and foremost for testing dynamic web applications but also offered an option to test static sites via a command line program.

The main idea behind a snapshot testing system is to keep a set of approved snapshots ("goldens"), capture a new set of snapshots upon change and flag any differences for review. Changes can be either intended (in which case the screenshot is approved and becomes the new golden) or accidental (in which case they are flagged as regressions and expected to be fixed before pushing a new version).
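The golden-comparison loop described above can be sketched in a few lines of shell. The directories and files below are hypothetical stand-ins for real screenshots; Percy does the same thing with actual rendered pages and a review UI:

```shell
#!/bin/sh
# Minimal sketch of the golden/snapshot loop: compare freshly captured
# "snapshots" (plain files standing in for screenshots) against the
# approved goldens and report anything that differs. All names here are
# made up for illustration.
workdir=$(mktemp -d)
mkdir -p "$workdir/goldens" "$workdir/new"

printf 'layout v1' > "$workdir/goldens/home.html"
printf 'layout v1' > "$workdir/new/home.html"      # unchanged page
printf 'layout v1' > "$workdir/goldens/about.html"
printf 'layout v2' > "$workdir/new/about.html"     # this page changed

changed=""
for golden in "$workdir"/goldens/*; do
  page=$(basename "$golden")
  # cmp -s exits 0 only when the two files are byte-identical
  if ! cmp -s "$golden" "$workdir/new/$page"; then
    changed="$changed $page"
    echo "DIFF: $page needs review"
  fi
done

rm -rf "$workdir"
```

Approving a change would then amount to copying the new snapshot over the golden; flagged differences with no intended change are regressions to fix.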

Example screenshot highlighting differences introduced by a specific commit.

Percy offers a nice interface to highlight any difference between snapshots and can be easily integrated with GitHub and other source control systems to make approving any updated snapshots part of the code review process.

Percy runs as a service, so you will need to create an account with them before being able to use it. Once you have done that you can try it by following the instructions on their documentation page and running the following command on your site (where ./public is a directory containing your static pages):

npx percy snapshot ./public

Running tests on every change via CI services

Unlike the HTML tests, which test a specific version of your site in isolation, the value of snapshot testing lies in comparing your site against a previously approved set of snapshots, which need to be kept up to date.

I then configured a simple workflow with CircleCI, having it build my site with Hugo, run html-proofer on the generated sources, grab a fresh set of screenshots on every change and flag any differences for review.

From what I could tell, many other CI services can be configured to do the same; I ended up choosing CircleCI because I thought its Docker-based setup worked better for what I was trying to do and I had little trouble finding Docker images suitable for running the steps in my workflow.

Below is the resulting configuration:

version: 2.1

orbs:
  hugo: circleci/hugo@0.3

jobs:
  snapshot:
    docker:
      - image: buildkite/puppeteer:v1.15.0
    steps:
      - attach_workspace:
          at: .
      - run: npm install percy
      - run: PERCY_TOKEN=$PERCY_TOKEN npx percy snapshot ./public

workflows:
  main:
    jobs:
      - hugo/build:
          version: "0.55.6"
          html-proofer: true
      - snapshot:
          requires:
            - hugo/build

The first section sets up the build with Hugo via an Orb (Orbs are CircleCI's reusable, shareable packages of configuration) that also runs html-proofer tests on the resulting build.

The snapshot task installs percy via npm and then invokes it on the directory containing the pages generated in the previous step. It runs on the buildkite/puppeteer Docker image, which comes with most of Percy's package dependencies already installed.

Note:
There seems to be a Docker image maintained by Percy, but I could not get it to work. I suspect it is because it ships with an old version of the percy command, but I did not investigate this further.

With this configuration, every commit and Pull Request will trigger a Hugo build, run your site through html-proofer and capture a new set of snapshots. If any visual differences are detected, they can be inspected and approved via Percy's web interface.

GitHub will show the latest status of your tests on every commit and Pull Request.

Note that there is no deploy workflow since I configured Netlify to automatically publish a new version of my site whenever I push to the master branch.

Tweaking the setup

If you got to this point, your configuration will feature sensible defaults and help you catch a number of issues caused by your own mistakes or introduced by the theme upstream.

There are a few opportunities to make the setup more efficient, but they require making changes to the CircleCI configuration above, since the Orb we used does not expose a good way to pass flags to tweak either the build or the test steps. (This might be fixed by the time you read this.)

You can click here to see a CircleCI configuration file that you can further customize based on the sections below.

Here are some of the tweaks you might consider implementing.

Test pages with a future publish date and drafts

Hugo allows you to mark pages as drafts or to set a publish date in the future (for scheduled content). Neither kind of page will be built by default in your deploy workflow, but you might want to include them when running your tests, so that content passes validation even while it is being edited (as opposed to being surprised by unexpected errors just when you thought you were ready to publish).

You can do this by passing the -D and -F flags to the hugo command during the build step.
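If you replace the Orb's build job with a hand-rolled one so you can pass these flags, the build job might look like the sketch below. The cibuilds/hugo image tag and the job wiring are assumptions for illustration, not part of the original setup:

```yaml
jobs:
  build:
    docker:
      - image: cibuilds/hugo:0.55   # assumed image; any image with Hugo works
    steps:
      - checkout
      - run: hugo -D -F -d ./public # -D builds drafts, -F future-dated pages
      - persist_to_workspace:       # hand the generated pages to later jobs
          root: .
          paths:
            - public
```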

Consider enabling minification

If you build your site with minification enabled when deploying, you have a decision to make:

  • if you enable minification only in the deploy workflow (and leave it disabled for testing), the version of the site you test will not be identical to the version you publish. This might hide subtle bugs that would not be easy to track down (such as this one).
  • if, on the other hand, you also enable minification when testing, debugging issues flagged by html-proofer and percy might be slightly harder, since the resulting source code will be more difficult to read.

I do not have a firm recommendation here. I am currently working with the latter setup and it has been working fine so far, but isolating the cause of an issue is slightly harder this way.

If you want to try this, you need to pass --minify to the hugo command during the build step.

Skip redundant screenshots

Just like we avoid multiple redundant unit tests that cover the same behavior, in most cases it is not necessary to take screenshots of pages that use the same template and have very similar content.

For example, if part of your site is a blog that features tags and categories (in Hugo, this would apply to any taxonomy), you will not need to take screenshots of every individual tag page: you won't get much value out of them, since they all look the same. They will rather be a burden to maintain (should your theme ever change, you'd have many more -- very similar -- screenshots to approve).

You can probably make a similar case for directory pages (say, if you have 40 pages of articles, the screenshots for the second to thirty-ninth pages are likely going to be the same). There could be value in testing the first and last pages separately, since you'd imagine they would have a different configuration for the next/previous navigation elements, but that is up to you.

Thankfully, the percy command offers a way to manually exclude certain paths from being considered when grabbing screenshots. The syntax for that argument expects globs, which can take some trial and error to get right.

In case it helps, here is a configuration that has worked reasonably well for me so far:

npx percy snapshot ./public -i \
  'categories/!(coding|coding/**)/*.html',\
  'tags/!(amsterdam|amsterdam/**)/*.html',\
  'blog/page/!(1|2)/*.html'

The command above excludes all categories but one (Coding) and all tags but one (Amsterdam). It also ignores any page beyond the second in the /blog directory.
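To take some of the trial and error out of writing these patterns, you can try them against a few sample paths locally before a full run. This sketch assumes bash with extended globs enabled, which approximates how the -i patterns match; treat it as a sanity check rather than percy's exact matcher:

```shell
#!/bin/bash
# Try an exclusion glob against sample paths before handing it to percy.
# extglob enables the !(pattern) syntax used in the -i argument above.
shopt -s extglob

match=""
for path in tags/amsterdam/index.html tags/travel/index.html; do
  case "$path" in
    tags/!(amsterdam|amsterdam/**)/*.html)
      echo "excluded: $path"; match="$match X" ;;
    *)
      echo "kept:     $path"; match="$match K" ;;
  esac
done
```

Here the amsterdam tag page falls through to "kept" (it would still be screenshotted) while every other tag page matches the exclusion.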

Capture screenshots less frequently

I have yet to run into this limitation but I could see how, if your site is very large and/or if you commit very frequently, you may be concerned about exceeding Percy's free quota (5000 screenshots/month).

I have not had to handle this in any special way so far, but here are a few options:

  • Percy grabs screenshots of each page on your site in both Chrome and Firefox to ensure your site behaves well across browsers. You may decide you are comfortable with the risk of leaving smaller issues undetected and grab screenshots in only one of the two. This means you will consume half as many snapshots every time you run visual tests.
  • Percy will also test your site at a couple of different viewport sizes. This is helpful to ensure your site works well on desktop and mobile devices. Again, you may be comfortable with running tests in just one configuration in order to cut resource consumption in half.
  • You may configure your CircleCI workflow to toggle the snapshot step manually and run it only when you have meaningful changes to test (e.g. if you are adding new content or upgrading your theme). If you do this, you still want to refresh your screenshots from master fairly often; otherwise you might find yourself with visual diffs that cover so many changes at once that they are no longer informative. And if you run this very infrequently, you might as well just run the percy command locally.
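For the last option, CircleCI's approval jobs offer a way to hold the snapshot step until you trigger it manually from the web interface. A sketch, reusing the job names from the configuration above (the "hold" job name is arbitrary):

```yaml
workflows:
  main:
    jobs:
      - hugo/build:
          version: "0.55.6"
          html-proofer: true
      # Manual gate: the snapshot job only runs once "hold" is
      # approved in the CircleCI web interface.
      - hold:
          type: approval
          requires:
            - hugo/build
      - snapshot:
          requires:
            - hold
```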

Realistically, for most personal sites, you can likely go a long way with the free quota. If you are considering this for a large corporate site, I would rather pay for a higher tier and get more snapshots than try too hard to capture fewer and end up with a less informative workflow.

Tests are even more valuable if you are a theme developer

If you are developing a theme that others are going to use, testing this way is likely to be even more impactful: you can save yourself quite a bit of time by having a way to catch issues before you ship a new version instead of relying on your users to report problems they run into after they upgrade.

You can apply most of the suggestions above by making sure that you have an example site (the Academic theme I use is great for this) that exercises most of the features in your theme, especially the ones that are not enabled by default. This would also likely reduce the time you spend manually inspecting your pages to make sure they still render as expected.

Conclusion

This has been a great opportunity to learn about the tools available out there (I will definitely consider Percy for the next app I build in my own time) and how much they can help even with sites that are statically generated.

I have accomplished most of the goals I had in mind when I started playing with this. There is one item left open for future investigation (namely, a way to ensure the RSS feed for my site is valid and well-formed), but the CircleCI workflow I set up gives me a good foundation I can extend to cover more tests.


This post was originally posted on my personal website, abahgat.com, where I write about software, design and human factors.

If you enjoyed this post, you can be notified of more by signing up for this newsletter or following me on Twitter.
