How to check for broken links using Selenium Webdriver on Node.js (automated testing)

ADS-BNE

Posted on May 13, 2024

Cover photo by Miguel Á. Padriñán

So you're building automated tests using Node, Cucumber JS, and Selenium Webdriver (I have written more about this here)? Here's a way to test for pesky broken links.

One of the easiest ways to check for broken links is to simply read their status codes: 200 ✅ good, 301 or 404 ❌ bad.
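As a rough sketch, that rule can be written as a small helper. The function name `classifyStatus` and the three category labels are hypothetical, not part of Selenium or xhr2. Reporting redirects as their own category is an assumption, so you can decide whether a 301 should fail your test or only be flagged.

```javascript
// Hypothetical helper: sort an HTTP status code into a link-health category.
// Assumption: any 2xx counts as healthy, 3xx is reported separately as a
// redirect, and everything else (4xx, 5xx, or no status at all) is broken.
function classifyStatus(status) {
    if (status >= 200 && status < 300) {
        return 'ok';
    }
    if (status >= 300 && status < 400) {
        return 'redirect';
    }
    return 'broken';
}
```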

However, Selenium Webdriver does not seem to have an out-of-the-box way to read status codes. No matter: since we're writing these tests on Node, we can use an npm package, xhr2. xhr2 implements the browser's XMLHttpRequest API for Node, so we can use it to send requests to a server and read the responses. In this case we are simply asking it for the server's response status code when requesting a document at a given URL or path.

Install xhr2 into your Node project the normal way: npm i xhr2

Now, if you're using Cucumber.js like me, you can write your test's .feature file. In mine I'm going to cycle through some given pages and check that all links within my site's footer are working (i.e., returning 200 status codes).

check-footer.feature

Feature: Check footer links

    Scenario Outline: Check for broken links in the footer section on these pages
        Given I am checking the footer on the '<page>' page
        Then there should be no broken links on '<page>'

        Examples: 
        | page                          |
        | about-us                      |
        | contact-us                    |
        | products                      |

Now, in my steps file:

checkFooterSteps.js

const { Given, Then, After, setDefaultTimeout } = require('@cucumber/cucumber');
const assert = require('assert');
const { Builder, By, until } = require('selenium-webdriver');
// Don't forget to include xhr2.
const XMLHttpRequest = require('xhr2');

// Allow each step up to 60 seconds before Cucumber times it out.
setDefaultTimeout(60 * 1000);

// Uses Firefox to load each page listed in the .feature file.
Given('I am checking the footer on the {string} page', async function (string) {
    this.driver = await new Builder()
        .forBrowser('firefox')
        .build();

    // Load the page first, then wait for the logo to confirm it has rendered.
    await this.driver.get('https://www.your-site.com/' + string);
    await this.driver.wait(until.elementLocated(By.className('logo-image')));
});

// Close the browser after each scenario so Firefox instances don't pile up.
After(async function () {
    if (this.driver) {
        await this.driver.quit();
    }
});

// For each page find all <a> tags from within the .footer element.
// Get each <a> tag's href value and store in urlArr array.
// For each value in urlArr run it through the checkLink() function.
// Use assert() to check returned status value is 200.
Then('there should be no broken links on {string}', async function (string) {
    const urlArr = [];
    const footerLinks = await this.driver.findElements(By.css('.footer a'));
    for (const link of footerLinks) {
        urlArr.push(await link.getAttribute('href'));
    }

    if (urlArr.length < 1) {
        console.log(`Could not find any links on ${string} page`);
    } else {
        for (const url of urlArr) {
            const respStatus = await checkLink(url);
            // Include the URL in the message so a failure names the broken link.
            assert.strictEqual(respStatus, 200, `${url} returned ${respStatus}`);
        }
    }
});

// xhr2 link checker function.
// Resolves with the status code for 2xx/3xx responses; rejects otherwise.
function checkLink(url) {
    return new Promise((resolve, reject) => {
        const xhr = new XMLHttpRequest();
        // HEAD asks for the headers only, so the page body is not downloaded.
        xhr.open('HEAD', url, true);

        xhr.onload = () => {
            if (xhr.status >= 200 && xhr.status < 400) {
                resolve(xhr.status);
            } else {
                reject(new Error(`${url} returned HTTP status code ${xhr.status}`));
            }
        };

        xhr.onerror = () => {
            reject(new Error(`Network error or URL not reachable: ${url}`));
        };

        xhr.send();
    });
}


What this is doing is:

  • looping through each page I've listed in the .feature file
  • for each page, gathering a list of href attribute values from each <a> tag within the .footer element
  • running each URL through the checkLink() function, which resolves with the HTTP status code (or rejects if the request fails or the status is 400 or above)
  • using Node's assert module to check that the returned value is 200.
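One gap in the steps above: footers often contain mailto: or tel: links, anchors with no href at all, and the same URL more than once. A minimal sketch of a filter you could run over urlArr before calling checkLink() might look like this. The name `filterCheckableUrls` is hypothetical, and it assumes the href values are absolute URLs, which is what Selenium typically returns for href.

```javascript
// Hypothetical helper: keep only unique http(s) URLs worth requesting.
// Assumption: hrefs are absolute URLs; anything that does not parse is skipped.
function filterCheckableUrls(hrefs) {
    const seen = new Set();
    for (const href of hrefs) {
        if (!href) {
            continue; // <a> tags without an href give null or an empty string
        }
        let parsed;
        try {
            parsed = new URL(href);
        } catch {
            continue; // not a parseable absolute URL
        }
        if (parsed.protocol !== 'http:' && parsed.protocol !== 'https:') {
            continue; // skips mailto:, tel:, javascript: and similar schemes
        }
        parsed.hash = ''; // "#top" and friends point at the same document
        seen.add(parsed.href);
    }
    return [...seen];
}
```

In the Then step you would then loop over `filterCheckableUrls(urlArr)` instead of `urlArr`.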

Note, this JS code probably still needs to be improved and optimised.
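One possible improvement, sketched under a few assumptions: the loop above checks links one at a time and stops at the first failure. The hypothetical `findBrokenLinks` helper below starts all checks at once with Promise.allSettled and returns every failing link, so a single test run reports the full list. Its `checkFn` parameter is assumed to behave like checkLink() above: a function that takes a URL and returns a promise that resolves with a status code or rejects on failure.

```javascript
// Hypothetical helper: check every URL concurrently and collect all failures
// instead of stopping at the first one.
// Assumption: checkFn(url) returns a promise that resolves with a status code
// and rejects when the request fails, as checkLink() does.
async function findBrokenLinks(urls, checkFn) {
    const results = await Promise.allSettled(urls.map((url) => checkFn(url)));
    const broken = [];
    results.forEach((result, i) => {
        if (result.status === 'rejected') {
            broken.push({ url: urls[i], reason: String(result.reason) });
        } else if (result.value !== 200) {
            broken.push({ url: urls[i], reason: `HTTP status code ${result.value}` });
        }
    });
    return broken;
}
```

Inside the Then step this could be used as `const broken = await findBrokenLinks(urlArr, checkLink);` followed by `assert.deepStrictEqual(broken, []);`, so the assertion message lists every broken link. Firing all requests at once may be too aggressive for a long list of links, so treat this as a starting point.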
