Search Engine Tags, Attributes, Commands and Suggestions: Robots.txt and rel=“canonical”

Part 1

When I prompted some friends for their questions to help me write this blog, one of my favorite responses was, “First I would have to understand what the heck you’re writing about to begin with!” Duly noted. And I’ve noticed in my work with companies large and small that there’s a big knowledge gap when it comes to the following:

  • Robots.txt disallow
  • rel=“canonical”
  • 301 redirect
  • “nofollow” & “noindex”

Some top questions I’ve heard about these tags and attributes are:

  • Which are best for SEO?
  • What’s the difference between a 301 and a 302 redirect?
  • Why is my page still in search when I removed it from my site?
  • Where on my page do I put the tag?
  • What are these things?

They’re actually really important signals that tell a search engine what to do with your site’s content, and if you haven’t used any of these on your website, I’m fairly confident you have some issues, namely duplicate content and 404 errors. Let’s break these down Classroom 101 style. In this post, I’ll cover robots.txt files and rel=“canonical” tags.

  1. Robots.txt Disallow

Pronounced “robots dot text,” a robots.txt file tells a search engine which parts of your site it should not crawl.

Why wouldn’t you want something to be crawled? One reason might be that you have content on your site that just isn’t useful to users, and you don’t want it to be found. Note that you should not use a robots.txt file to hide “bad” content from search engines: the file is public, and anyone can read it by adding /robots.txt to your domain. I set up a robots.txt file for our website because our CMS was creating pages in an /uploads folder that were being crawled and indexed, cluttering our web presence. I excluded this folder from being crawled using a robots.txt file. Ta-da! Problem solved.

To create a robots.txt file, look to Webmaster Tools if you have an account for your website (you should!). If you use WordPress, you can download the Yoast SEO plugin and create a file easily.

It will look something like this:

User-agent: *
Disallow: /oldfile/
Disallow: /dumbfile/thisparticularfile.htm
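If you want to sanity-check rules like these before publishing them, one option is Python’s standard-library urllib.robotparser module. The sketch below is only an illustration: the example.com domain, the is_crawlable helper, and the test paths are my own placeholders, and real crawlers may treat edge cases differently than this parser does.

```python
from urllib.robotparser import RobotFileParser

# The sample rules from this post, kept together as one group
# (no blank lines between User-agent and its Disallow lines).
RULES = """\
User-agent: *
Disallow: /oldfile/
Disallow: /dumbfile/thisparticularfile.htm
"""


def is_crawlable(path, user_agent="*"):
    """Return True if the sample rules let `user_agent` crawl `path`."""
    parser = RobotFileParser()
    parser.parse(RULES.splitlines())
    # example.com is a placeholder domain, not a site from this post.
    return parser.can_fetch(user_agent, "http://example.com" + path)


if __name__ == "__main__":
    sample_paths = [
        "/",
        "/oldfile/page.html",
        "/dumbfile/thisparticularfile.htm",
        "/dumbfile/other.htm",
    ]
    for path in sample_paths:
        status = "crawl" if is_crawlable(path) else "blocked"
        print(path, "->", status)
```

Notice that the second Disallow line blocks one specific file, so other files in /dumbfile/ stay crawlable.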

  2. rel=“canonical”

Pronounced “Rel Canonical,” and sometimes referred to just as “a canonical,” this tag tells a search engine which version of a URL it should index in search results. If you don’t tell a search engine which version to use, it will choose one on its own, and search engines should never be left to their own devices.

Let’s look at Home Depot. Here’s a URL for a bathroom sink faucet I found by navigating through their site:

http://www.homedepot.com/p/Glacier-Bay-Constructor-4-in-Centerset-2-Handle-Low-Arc-Bathroom-Faucet-in-Chrome-7032E-B6101/202043756?N=5yc1vZc8d3

Here’s the same faucet with a different URL (reachable from a Google search result page):

http://www.homedepot.com/p/Glacier-Bay-Constructor-4-in-Centerset-2-Handle-Low-Arc-Bathroom-Faucet-in-Chrome-7032E-B6101/202043756
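To see how close these two URLs really are, here is a small Python sketch that drops the query string (everything after the “?”) and compares what is left. It assumes the N parameter only records how I navigated to the page and does not change the content, which matches what I saw but is not something I can confirm about how the site works. The strip_query helper is a name I made up for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_query(url):
    """Drop the query string and fragment; keep scheme, host, and path."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


# The two faucet URLs quoted in this post.
NAVIGATED = (
    "http://www.homedepot.com/p/Glacier-Bay-Constructor-4-in-Centerset-"
    "2-Handle-Low-Arc-Bathroom-Faucet-in-Chrome-7032E-B6101/202043756"
    "?N=5yc1vZc8d3"
)
FROM_SEARCH = (
    "http://www.homedepot.com/p/Glacier-Bay-Constructor-4-in-Centerset-"
    "2-Handle-Low-Arc-Bathroom-Faucet-in-Chrome-7032E-B6101/202043756"
)

if __name__ == "__main__":
    # Both URLs reduce to the same address once the query string is gone.
    print(strip_query(NAVIGATED) == FROM_SEARCH)
```

A search engine sees these as two separate addresses unless it is told which one is preferred.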

Or let’s try this. A popular Greenville restaurant is Pomegranate on Main. Because they have not specified which homepage URL is preferred, they have at least two live versions:

Other homepages might have even more versions:

Why is that a problem? Because as people link to your homepage, they might link to different URLs, spreading out value that should be consolidated to one page.

You could do a 301 redirect here (I’ll talk about that next), or you can set up a canonical tag. Basically, as Google explains it:

Add a <link> element with the attribute rel=“canonical” to the <head> section of the non-preferred pages:

<link rel="canonical" href="http://usethisurl.com" />
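Once the tag is in place, it is worth checking that a page actually declares the canonical URL you expect. The following is a minimal Python sketch using the standard-library html.parser module. The CanonicalFinder class, the find_canonical helper, and the sample page are hypothetical names and content for illustration, not part of any SEO tool.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Remember the href of the first <link rel="canonical"> tag seen."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        # Only look at <link> tags, and stop after the first match.
        if tag != "link" or self.canonical is not None:
            return
        attr_map = dict(attrs)
        rel_values = (attr_map.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attr_map.get("href")


def find_canonical(html):
    """Return the canonical URL declared in `html`, or None if absent."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


# A made-up page that uses the tag from this post.
SAMPLE_PAGE = """
<html>
  <head>
    <title>Sample page</title>
    <link rel="canonical" href="http://usethisurl.com" />
  </head>
  <body>Hello</body>
</html>
"""

if __name__ == "__main__":
    print(find_canonical(SAMPLE_PAGE))
```

One practical caution: type the tag with straight quotes. If a word processor swaps in curly quotes, a parser like this one will likely not recognize the attribute values.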

Stay tuned for more in Part 2! I’ll be covering 301 redirects and the “nofollow” and “noindex” directives.

Laura Lee – Account Manager
