SEO Analyzer — a library for finding SEO issues
Mad Devs
Posted on October 13, 2021
Hi there!
Today, I’d like to tell you about one solution to a very common problem in team development, which eventually resulted in a whole npm package.
And as you might have guessed, we'll talk about SEO Analyzer, a tool that helps catch SEO flaws at various stages of development (and, of course, maintain a good relationship with SEO specialists 😊).
Introduction
As a matter of fact, the development of this tool started when we began running into SEO problems over and over again. Each new production release brought new issues that were essentially the same old ones. Our relationship with the SEO specialists began to fall apart: there were quarrels, yelling on calls, threatening messages in private, and other unpleasant things.
Finally, we decided to figure it out and ended up with a handy and useful tool, which we’ll talk about further.
Why do you need SEO Analyzer?
The main task is to analyze the DOM tree to detect SEO issues.
Many may ask, “What’s wrong with Lighthouse?”
Lighthouse is a multifunctional and sometimes excessive tool that you don't always want to use on a small project.
SEO Analyzer is a lightweight plugin aimed at a specific task: to keep your project valid and friendly to search engine crawlers by detecting flaws on your website pages.
If it’s important for you to get to the top on Google or any other search engine, you cannot do without this tool.
Benefits
- Easy setup;
- Runs against SPA applications;
- Runs against SSG and SSR applications;
- Runs in GitHub, GitLab, pre-push hooks, or anywhere else;
- Nine ready-made rules covering the most popular checks;
- Support for adding your own rules;
- Several options for outputting the result.
Installing the package
Let’s follow the link that will redirect us to the analyzer page on the npm website.
To the right, above the metadata, you can copy the command to install the package.
Let’s go to the project and install the library there.
npm i seo-analyzer
It’s pretty lightweight, so the installation will be instantaneous.
Setup
Next, let’s move on to configuring the package.
The first thing to do is to decide where in the project the analyzer's entry script will live. In my project, I placed the file in the root and named it seo-analyzer.js. You can do the same.
Let’s open the file and add the necessary functionality to it.
For example, suppose we're developing the site as an SPA (single-page application), where the markup is rendered by JavaScript. As we know, this complicates parsing: the DOM tree only becomes available after the JavaScript code has run. In this case, the settings should be as follows:
const SeoAnalyzer = require('seo-analyzer');

new SeoAnalyzer()
  .ignoreUrls(['/404'])
  .inputSpaFolder('/dist', 3000)
  .addRule('noMoreThanOneH1TagRule')
  .outputConsole();
Let’s go step by step.
At the beginning of the file, import the analyzer script, then create a new instance and start configuring:
- .ignoreUrls(['/404']) — a list of page URLs to ignore during parsing. You can specify any number of pages.
- .inputSpaFolder('/dist', 3000) — specifies the folder where the final HTML pages are collected and the port on which a server will be started to parse and process those pages.
- .addRule('noMoreThanOneH1TagRule') — choose rules from the list of nine ready-made ones and add them for processing.
- .outputConsole() — if we are not going to process the error report further, output to the console is the simplest option.
These settings are enough; SEO Analyzer is now ready to validate your pages.
To launch it, run the following command in the terminal:
node seo-analyzer.js
The analyzer will print a report to the console listing any issues it finds.
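For convenience, you can also wire it up as an npm script; a minimal sketch (the "seo" script name here is just an example):

{
  "scripts": {
    "seo": "node seo-analyzer.js"
  }
}

After that, npm run seo launches the same check.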
Available methods
I’ll divide the list of methods into several parts so that their order in the chain is clear. The first in line are the methods for ignoring files, folders, and links.
Use them depending on your input data (see below).
- ignoreFiles(['/dist/404.html']): takes a list of files to exclude from the analysis.
- ignoreFolders(['/dist/test']): takes a list of folders to exclude from the analysis.
- ignoreUrls(['/404', '/login']): takes a list of URLs to exclude from the analysis.
Next in the chain are the input methods. They must be placed after the ignore methods.
- inputFiles(['/dist/index.html']): takes a list of files to process.
- inputFolders(['/dist']): takes a list of folders; every HTML file found in them will be analyzed.
- inputSpaFolder('/dist', 3000): takes two parameters: the folder with the final production files and the port on which a server will be started for HTML parsing.
Next is the method for adding ready-made or custom rules.
- addRule('titleLengthRule', { ... }): takes two parameters: the first is the name of a ready-made rule (as a string) or a custom rule (as a function); the second is an options object for ready-made rules.
And the last group comprises the methods for outputting the result.
- outputJson(json => {}): takes a callback function that receives the result as JSON data.
- outputObject(obj => {}): likewise takes a callback function, which receives the result as a JS object.
- outputConsole(): doesn't take any parameters; it simply prints the result to the console. If the result contains errors, this method terminates the process in the terminal.
The outputConsole() method must be used at the very end of the chain: since it can terminate the process, any methods placed after it would never run.
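To see how these methods fit together, here's a minimal sketch of a config for a statically built site; the /dist paths and the chosen rule are just examples:

const SeoAnalyzer = require('seo-analyzer');

new SeoAnalyzer()
  // Exclude the error page from the analysis
  .ignoreFiles(['/dist/404.html'])
  // Analyze every HTML file found in the build output
  .inputFolders(['/dist'])
  .addRule('titleLengthRule', { min: 10, max: 50 })
  // Receive the report as JSON for further processing
  .outputJson(json => console.log(json));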
List of ready-made rules
For a quick start of the analyzer, I’ve prepared nine of the most popular rules, which should be enough for a basic check. Let me tell you about them in more detail.
To add a rule to the chain, we need the addRule() method. It takes two parameters:
- The name of a ready-made rule (as a string) or a custom rule (as a function).
- Parameters. They are only needed for ready-made rules, since there is no other way to configure them.
After selecting the desired rule, we just need to add it to the chain, between the input methods and the output methods, like this:
.inputSpaFolder(...)
.addRule('titleLengthRule', { ... }) <----
.outputConsole(...)
To avoid breaking the chain of handlers, you must follow this order when adding the methods.
Now, let’s look at the entire list of the ready-made rules.
Title length rule
.addRule('titleLengthRule', { min: 10, max: 50 })
Checks the length of the <title> tag. Accepts two parameters:
- min: minimum title length.
- max: maximum title length.
H1-H6 tags rule
.addRule('hTagsRule')
Checks that h1-h6 headings on the page are nested in the correct order.
For example, here is a variant with a non-valid arrangement:
<h1>
- <h3>
- - <h4>
- <h2>
According to the rules, an h3 header must be placed after an h2 tag, like this:
<h1>
- <h2>
- - <h3>
- <h2>
In this case, there should be no problem.
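In real markup, a valid heading structure might look roughly like this (the heading text is placeholder content):

<h1>Page title</h1>
<h2>First section</h2>
<h3>Subsection</h3>
<h2>Second section</h2>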
No more than one H1 tag rule
.addRule('noMoreThanOneH1TagRule')
Checks the number of h1 tags on the page. There must be only one h1 tag.
img tag with alt attribute rule
.addRule('imgTagWithAltAttritubeRule')
Checks if all img tags have an alt="..." attribute.
a tag with rel attribute rule
.addRule('aTagWithRelAttritubeRule')
Checks if all a tags have a rel="..." attribute.
No too many strong tags rule
.addRule('noTooManyStrongTagsRule', { threshold: 2 })
Checks the number of strong tags on the page. Accepts one parameter:
- threshold: maximum number of strong tags allowed on the page.
Meta base rule
.addRule('metaBaseRule', { list: ['description', 'viewport'] })
Checks if the page contains the specified base meta tags. Accepts one parameter:
- list: a list of required meta tags on the page.
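For example, with the list above, a <head> like this would satisfy the rule (the content values are placeholders):

<head>
  <meta name="description" content="Short page description">
  <meta name="viewport" content="width=device-width, initial-scale=1">
</head>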
Meta social rule
.addRule('metaSocialRule', {
  properties: [
    'og:url',
    'og:type',
    'og:site_name',
    'og:title',
    'og:description',
    'og:image',
    'og:image:width',
    'og:image:height',
    'twitter:card',
    'twitter:text:title',
    'twitter:description',
    'twitter:image:src',
    'twitter:url'
  ],
})
Checks if the page contains the specified social meta tags. Accepts one parameter:
- properties: a list of required meta tags on the page.
Canonical link rule
.addRule('canonicalLinkRule')
Checks if a canonical link exists on the page.
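In practice, this means the page's <head> should contain a tag like the following (the URL is a placeholder):

<link rel="canonical" href="https://example.com/some-page/">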
Adding a custom rule
If you don’t have enough ready-made rules for basic validation, you can easily add your own.
Basically, a custom rule is just a function that takes a DOM tree. This is what we are going to work with.
The rule must be wrapped in a Promise so that the rest of the chain can wait for it to complete.
Let’s write our own rule. It will be simple and will only check if there are paragraphs on the page. Let’s add this code:
function customRule(dom) {
  return new Promise((resolve, reject) => {
    // The DOM is available here just as in the browser
    const paragraph = dom.window.document.querySelector('p');
    if (paragraph) {
      // An empty string means the check passed with no issues
      resolve('');
    } else {
      // The rejection message becomes the error shown in the report
      reject('No <p> tags found');
    }
  });
}
As an argument, we receive the DOM, which we can work with just as we do in the browser; that is, the window object is available to us.
Once your rule is ready, you can add it to the chain and check it out.
.addRule(customRule)
As a result, if there are no paragraphs on the page, we'll get the error "No <p> tags found" in the console.
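Putting it all together, a config that combines a ready-made rule with our custom one might look like this (a sketch reusing the SPA setup from earlier):

const SeoAnalyzer = require('seo-analyzer');

function customRule(dom) {
  return new Promise((resolve, reject) => {
    const paragraph = dom.window.document.querySelector('p');
    paragraph ? resolve('') : reject('No <p> tags found');
  });
}

new SeoAnalyzer()
  .ignoreUrls(['/404'])
  .inputSpaFolder('/dist', 3000)
  .addRule('noMoreThanOneH1TagRule')
  // Custom rules are passed by function reference, not by name
  .addRule(customRule)
  .outputConsole();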
Running in CI/CD
Running SEO Analyzer in CI/CD is needed only to catch SEO flaws while new changes are being prepared for staging or production. If SEO problems are found when a pull request is built, the pipeline will fail. This tells you something is wrong with the changes and that they need to be fixed.
For example, let's run the analyzer in GitHub Actions. This is very easy to do; let's make sure by looking at the code below:
name: Seo Analyzer CI

on: [pull_request]

jobs:
  build:
    runs-on: ubuntu-latest

    strategy:
      matrix:
        node-version: [14.x]

    steps:
      - uses: actions/checkout@v2
      - name: Use Node.js ${{ matrix.node-version }}
        uses: actions/setup-node@v1
        with:
          node-version: ${{ matrix.node-version }}
      - run: npm install
      - run: npm run build
        env:
          CI: true
      - run: node ./seo-analyzer.js
As I said, there is nothing complicated. We just need to configure the project build command and then run the file with the analyzer script, which we configured above.
In the root of the project, create a .github folder and a workflows folder inside it. In the workflows folder, create the seo-analyzer.yml file and put the code above into it. Once the changes are pushed to GitHub, the action will start and SEO Analyzer will run.
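The benefits list also mentions GitLab; a roughly equivalent job in .gitlab-ci.yml might look like this (a sketch, not taken from the package docs):

seo-analyzer:
  image: node:14
  rules:
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"
  script:
    - npm install
    - npm run build
    - node ./seo-analyzer.js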
Running in pre-push or pre-commit
To prevent invalid changes from being sent to the server, I suggest that you configure the analyzer to run on a pre-push hook.
This will allow you to check the validity of the changes each time they are sent to the server. Thus, the changes will only be sent if there are no errors.
We’ll need the husky package for the setup.
Let’s install it.
npm install husky --save-dev
The settings for this plugin must be added to the package.json file. You can also create a separate file, but it’s not all that important.
{
  ...
  "husky": {
    "hooks": {
      "pre-push": "npm run build && node seo-analyzer.js"
    }
  }
  ...
}
Now, before changes are pushed to the server, the analyzer will run and your changes will be checked.
Conclusion
It's very important to have a high SEO score, as it determines the traffic to your site and, accordingly, your income. Tools such as SEO Analyzer help maintain these indicators in a project. Don't neglect it; use it and be happy!
I hope you’ll find SEO Analyzer a useful tool.
Link to GitHub. Link to the npm package.
Thanks!
Previously published at maddevs.io/blog.