SEO Part1: Help Search Engine Understand Our Website
Andrew
Posted on August 13, 2020
This is the second article in this SEO series. In this article, I will explain how to help the search engine understand our website better.
Search engine robots analyze our website very differently from human visitors. The most important criterion for a good website is still the user experience, but because the search engine has to automate its analysis, it relies on a lot of extra information to understand our website. It's important to make sure that this extra information matches our content. Otherwise, the search engine might get confused about the content we want to provide to our users.
Now, let's dive into the details.
Agenda
- robots.txt
- sitemap
- URL
- HTTP status code
- meta tag
- link tag
- Semantic HTML tags
- HTML tags attributes
- Structured Data
- Tools
robots.txt
robots.txt is a guideline for search engine robots. It's used to keep crawlers out of the parts of our website that we don't want them to crawl.
If we provide the wrong setting, we might accidentally block the search engine and keep our website from showing up on the SERP (Search Engine Results Page).
We can provide a robots.txt file for the entire website.
User-agent: *
Disallow:
Or we can use a meta tag to specify the setting for a specific page.
<meta name="robots" content="noindex">
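Since a wrong rule can block the search engine by accident, it can help to test the rules before deploying them. The following is a minimal sketch using Python's standard urllib.robotparser module; the helper name can_crawl and the sample rule sets are assumptions made for illustration.

```python
from urllib.robotparser import RobotFileParser


def can_crawl(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Sample rule sets (illustrative only).
ALLOW_ALL = "User-agent: *\nDisallow:\n"            # same as the example above
BLOCK_ALL = "User-agent: *\nDisallow: /\n"          # the accidental "block everything"
BLOCK_ADMIN = "User-agent: *\nDisallow: /admin/\n"  # keep crawlers out of one folder

if __name__ == "__main__":
    for name, rules in [("allow all", ALLOW_ALL), ("block all", BLOCK_ALL)]:
        print(name, can_crawl(rules, "https://www.oahehc.com/blog"))
```

An empty Disallow: value allows everything, while Disallow: / blocks the whole website, so a single character is the difference between the two.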
sitemap
We can provide a sitemap to notify the search engine about the pages we want it to index.
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://www.oahehc.com</loc>
    <priority>0.90</priority>
    <changefreq>always</changefreq>
    <lastmod>2020-07-18T12:59:15.983Z</lastmod>
  </url>
...
</urlset>
Once we finish our sitemap, we can submit it to the search engine. Taking Google as an example, we can send an HTTP request or submit it in google search console.
http://www.google.com/ping?sitemap=<complete_url_of_sitemap>
Another thing worth mentioning is that the search engine has a limited budget when it crawls a website. So if we have tens of thousands of pages on our website, we can put the most important pages first or set priority properly.
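For a large website, the sitemap is usually generated rather than written by hand. Here is a minimal sketch with Python's standard xml.etree.ElementTree that writes the most important pages first; the function name build_sitemap and the (url, priority) input shape are assumptions, not part of any sitemap tooling.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages: list[tuple[str, float]]) -> str:
    """Build sitemap XML from (url, priority) pairs, highest priority first."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, priority in sorted(pages, key=lambda page: page[1], reverse=True):
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "priority").text = f"{priority:.2f}"
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    print(build_sitemap([
        ("https://www.oahehc.com/about", 0.5),
        ("https://www.oahehc.com", 0.9),
    ]))
```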
URL
The URL is another criterion that the search engine might reference. Keeping our URLs in a clean directory structure not only makes them easier to manage, but also makes them easier for the search engine to understand.
If possible, include keywords and context in the URL so that it matches our website content.
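One common way to get keywords into the URL is to derive a slug from the page title. The sketch below assumes a hypothetical /category/title URL layout; slugify and article_url are illustrative helpers, not functions from a library.

```python
import re
import unicodedata


def slugify(text: str) -> str:
    """Turn arbitrary text into a lowercase, hyphen-separated URL segment."""
    # Drop accents and any other non-ASCII characters.
    ascii_text = unicodedata.normalize("NFKD", text).encode("ascii", "ignore").decode("ascii")
    # Collapse every run of non-alphanumeric characters into a single hyphen.
    return re.sub(r"[^a-z0-9]+", "-", ascii_text.lower()).strip("-")


def article_url(base: str, category: str, title: str) -> str:
    """Build a clean URL of the form <base>/<category>/<title>."""
    return f"{base.rstrip('/')}/{slugify(category)}/{slugify(title)}"


if __name__ == "__main__":
    print(article_url("https://www.oahehc.com/", "Blog", "Semantic HTML & SEO!"))
```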
HTTP status code
Now that the search engine has our URL, it can send a request and get the content. But before it starts analyzing the content, it will check the HTTP status code that our server responds with. Providing the correct status code can prevent the search engine from getting confused. There are two common mistakes:
- 301 (permanent redirect) vs 302 (temporary redirect)
  - replacing an old website should use 301
  - if we have different URLs for desktop and mobile websites and redirect our users based on their device, we should use 302
- the 404 page should return 404
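The rules above can be sketched as a small decision function. This is only an illustration, not a real server: the routing tables, the respond function, and the is_mobile flag are all assumptions.

```python
from http import HTTPStatus
from typing import Optional, Tuple

# Hypothetical routing data, for illustration only.
MOVED_PERMANENTLY = {"/old-blog": "/blog"}  # old path -> new path
KNOWN_PAGES = {"/", "/blog"}
MOBILE_HOST = "https://m.oahehc.com.tw"


def respond(path: str, is_mobile: bool) -> Tuple[HTTPStatus, Optional[str]]:
    """Return (status code, redirect target or None) for a request."""
    if path in MOVED_PERMANENTLY:
        # The old page is gone for good: permanent redirect.
        return HTTPStatus.MOVED_PERMANENTLY, MOVED_PERMANENTLY[path]
    if path not in KNOWN_PAGES:
        # A "not found" page must not answer with 200.
        return HTTPStatus.NOT_FOUND, None
    if is_mobile:
        # Device-based redirect: the desktop URL is still valid, so use 302.
        return HTTPStatus.FOUND, MOBILE_HOST + path
    return HTTPStatus.OK, None


if __name__ == "__main__":
    print(respond("/old-blog", is_mobile=False))
    print(respond("/blog", is_mobile=True))
    print(respond("/missing", is_mobile=False))
```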
meta tag
In this step, the search engine already has the content of our website. It will check the HTML head first.
title and description
<title>title</title>
<meta name="description" content="description">
The search engine reads the title and description to get a basic understanding of our website, so it's important to provide proper information. For example:
(a) avoid stop words
(b) they should match the website content
(c) include the keywords, but don't overuse them
(d) a page with multiple tabs should provide a different title/description for each tab
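Point (d) is easy to miss, so here is a minimal sketch that reads the title and description from each page and reports pages that reuse an earlier pair. It uses Python's standard html.parser; the names HeadInfo, read_head, and duplicated_heads, and the sample pages, are assumptions made for illustration.

```python
from html.parser import HTMLParser


class HeadInfo(HTMLParser):
    """Collect the <title> text and the description meta tag of a page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and attrs.get("name") == "description":
            self.description = attrs.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def read_head(html: str) -> tuple[str, str]:
    """Return (title, description) for one HTML document."""
    info = HeadInfo()
    info.feed(html)
    info.close()
    return info.title.strip(), info.description.strip()


def duplicated_heads(pages: dict[str, str]) -> list[str]:
    """Return URLs whose (title, description) pair was used by an earlier page."""
    seen, duplicates = set(), []
    for url, html in pages.items():
        head = read_head(html)
        if head in seen:
            duplicates.append(url)
        seen.add(head)
    return duplicates
```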
social media
If we want to optimize how our website appears on social media, we should provide the proper meta tags. Taking Facebook as an example, we should include the following information inside the head tag.
<meta property="og:url" content="https://www.oahehc.com.tw" />
<meta property="og:type" content="article" />
<meta property="og:title" content="title" />
<meta property="og:description" content="description" />
<meta property="og:image" content="https://cdn.oahehc.com.tw/logo.png" />
And don't forget to use their tool to make sure everything works as we expect.
FB Sharing Debugger
link tag
It's important to prevent duplicate content on our website because it makes it hard for the search engine to decide which page it should provide to its users. Therefore, another piece of information we might need to provide in the head tag is link.
If we have different URLs for desktop and mobile websites, we can provide a canonical link & an alternate link to tell the search engine to treat them as the same page. The tags below go on the desktop page; the mobile page should have a canonical link pointing back to the desktop URL.
<link rel="canonical" href="https://www.oahehc.com.tw">
<link rel="alternate" href="https://m.oahehc.com.tw" media="only screen and (max-width: 640px)">
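A quick check for these tags could look like the sketch below. It only verifies that a page declares exactly one canonical URL; LinkCollector and link_problems are illustrative names, and the rule set is an assumption rather than a complete audit.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect href values of <link> tags, grouped by their rel attribute."""

    def __init__(self):
        super().__init__()
        self.links = {}  # rel -> list of hrefs

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attrs = dict(attrs)
        rel, href = attrs.get("rel"), attrs.get("href")
        if rel and href:
            self.links.setdefault(rel.lower(), []).append(href)


def link_problems(html: str) -> list[str]:
    """Return a list of problems with the canonical link of one page."""
    collector = LinkCollector()
    collector.feed(html)
    canonical = collector.links.get("canonical", [])
    problems = []
    if not canonical:
        problems.append("missing canonical link")
    elif len(set(canonical)) > 1:
        problems.append("conflicting canonical links")
    return problems
```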
Semantic HTML tags
<div>
  <div>1. xxx</div>
  <div>2. ooo</div>
  <div>3. ...</div>
</div>

<ol>
  <li>xxx</li>
  <li>ooo</li>
  <li>...</li>
</ol>
With proper CSS styling, the above two examples will look exactly the same to the human eye, but they are quite different from the search engine's perspective.
There are a few basic guidelines about how to choose HTML tags properly:
- Use h1 ~ h6 for headings, and we should only have one h1 tag on each page
- Structure the page with semantic elements like header, main, aside, footer, section, nav, ...
- Use table/th/tr/td for tables
- Use ul/ol/li for lists
- Don't mix up button & a
Using semantic HTML tags not only helps the search engine understand our webpage better; sometimes we might gain an extra bonus from it.
Google search provides a featured snippet if there is a proper answer for the search question. If our content is summarized in a table or list, it will have a higher chance of being chosen as one of them.
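The "only one h1" guideline is simple to verify automatically. The following is a minimal sketch with Python's standard html.parser; HeadingCounter, heading_levels, and has_single_h1 are hypothetical helper names.

```python
from html.parser import HTMLParser


class HeadingCounter(HTMLParser):
    """Record the level of every h1..h6 tag in document order."""

    def __init__(self):
        super().__init__()
        self.levels = []

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag names, so "H1" arrives as "h1".
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self.levels.append(int(tag[1]))


def heading_levels(html: str) -> list[int]:
    counter = HeadingCounter()
    counter.feed(html)
    return counter.levels


def has_single_h1(html: str) -> bool:
    """True when the page contains exactly one h1 tag."""
    return heading_levels(html).count(1) == 1
```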
HTML tags attributes
Besides choosing proper HTML tags, sometimes we have to provide extra attributes to add more information.
For example, adding alt to the img tag to explain the image not only helps the search engine understand it better, but also makes our website friendlier for people who use a screen reader to browse it.
<img src="https://cdn.oahehc.com.tw/dog.png" alt="dog" />
When we use an image as the content of a hyperlink, adding a title attribute to provide extra information is also a good practice.
<a href="./dog" title="dog list">
<img src="https://cdn.oahehc.com.tw/dog.png" alt="dog" />
</a>
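Missing alt attributes can be found with a short script. This sketch flags only img tags where the attribute is absent, on the assumption that an empty alt="" is intentional for purely decorative images; ImageAudit and images_missing_alt are illustrative names.

```python
from html.parser import HTMLParser


class ImageAudit(HTMLParser):
    """Collect the src of every <img> tag that has no alt attribute at all."""

    def __init__(self):
        super().__init__()
        self.missing_alt = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            attrs = dict(attrs)
            if "alt" not in attrs:
                self.missing_alt.append(attrs.get("src") or "")


def images_missing_alt(html: str) -> list[str]:
    audit = ImageAudit()
    audit.feed(html)
    return audit.missing_alt
```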
Structured Data
If we want to provide more information for the search engine, we can add structured-data to our website.
There are a few different categories of structured-data that Google will display as richer features in search results.
We can check this article for more details - Explore the search gallery.
To add structured-data to our website, we can add it directly into the HTML tags:
<ol itemscope itemtype="http://schema.org/BreadcrumbList">
  <li itemprop="itemListElement" itemscope
      itemtype="http://schema.org/ListItem">
    <a itemprop="item" href="https://example.com/dresses">
      <span itemprop="name">Dresses</span></a>
    <meta itemprop="position" content="1" />
  </li>
  <li itemprop="itemListElement" itemscope
      itemtype="http://schema.org/ListItem">
    <a itemprop="item" href="https://example.com/dresses/real">
      <span itemprop="name">Real Dresses</span></a>
    <meta itemprop="position" content="2" />
  </li>
</ol>
Or we can create a script tag and set all the information in JSON format:
<script type="application/ld+json">
{
  "@context": "http://schema.org",
  "@type": "BreadcrumbList",
  "itemListElement": [
    {
      "@type": "ListItem",
      "position": 1,
      "item": {
        "@id": "https://example.com/dresses",
        "name": "Dresses"
      }
    },
    {
      "@type": "ListItem",
      "position": 2,
      "item": {
        "@id": "https://example.com/dresses/real",
        "name": "Real Dresses"
      }
    }
  ]
}
</script>
Once we finish the structured data, don't forget to test it and make sure the format is correct - Structured Data Testing Tool.
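Writing this JSON by hand gets error-prone as the breadcrumb grows, so it is often generated from data. Below is a minimal sketch that produces the same BreadcrumbList shape as the example above; breadcrumb_json_ld and its (name, url) input are assumptions made for illustration.

```python
import json


def breadcrumb_json_ld(crumbs: list[tuple[str, str]]) -> str:
    """Render (name, url) pairs as a BreadcrumbList JSON-LD script tag."""
    data = {
        "@context": "http://schema.org",
        "@type": "BreadcrumbList",
        "itemListElement": [
            {
                "@type": "ListItem",
                "position": position,  # positions start at 1
                "item": {"@id": url, "name": name},
            }
            for position, (name, url) in enumerate(crumbs, start=1)
        ],
    }
    # Escape "</" so a name can never close the script tag early;
    # "<\/" is still valid JSON and decodes back to "</".
    body = json.dumps(data, indent=2).replace("</", "<\\/")
    return '<script type="application/ld+json">\n' + body + "\n</script>"


if __name__ == "__main__":
    print(breadcrumb_json_ld([
        ("Dresses", "https://example.com/dresses"),
        ("Real Dresses", "https://example.com/dresses/real"),
    ]))
```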
Tools
You might feel overwhelmed after seeing so many extra tasks we have to deal with. Luckily, there are tools that can help us.
ESLint
ESLint not only helps us prevent syntax errors; with a proper extension added, it can also point out missing attributes on our HTML tags.
lighthouse
lighthouse is a built-in feature in Chrome that we can use to identify common problems on our website. For now, we can focus on Accessibility & SEO. We just need to click Generate report, then lighthouse will let us know how to fix the problems on our website.
google search console
Once we publish our website, we should definitely start using google search console. It doesn't just provide information to measure search traffic; it also points out all the problems that Google finds when it tries to crawl our website. Fixing those problems can help the search engine understand our website better.
Conclusion
Now we know how to make our website easier for the search engine to understand. In the next article, I will focus on the most important question - how to improve the user experience for our real users - SEO Part2: Improve User Experience & SEO