Carpe Diem: Get a ‘Cleaner View’ with Advanced Rel=Canonical Practices


“Canonicalization” may sound pretty scary to some of you, right? It may be even more so for those who haven’t figured out how to use it properly yet. The good news is that it’s a simple fix once you know what you are fixing. Before diving in, though, we should probably get through the basics first.

What is Rel=canonical?

For those of you who are ‘rel-confused’, rel="canonical" is a value of the rel attribute on an HTML <link> element. The element goes in the <head> section of a web page and helps deal with duplicate or very similar content by identifying your preferred URL for search engines. In short, rel=canonical tells search engines which of several look-alike URLs is the ‘kosher’ one (no Canon law required).
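To see what a crawler looks for, here is a minimal Python sketch that pulls the canonical URL out of a page's markup. The class and function names (CanonicalFinder, find_canonical) and the sample page are made up for illustration; real crawlers are far more involved.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of every <link rel="canonical"> element in a page."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr_map = dict(attrs)
        # rel may hold several space-separated tokens; compare them case-insensitively
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        if "canonical" in rel_tokens and attr_map.get("href"):
            self.canonicals.append(attr_map["href"])


def find_canonical(html):
    """Return the first canonical URL declared in the markup, or None."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonicals[0] if finder.canonicals else None


# Hypothetical page used only for this example
page = """
<html>
  <head>
    <title>Blue widgets</title>
    <link rel="canonical" href="http://www.example.com/widgets/blue/">
  </head>
  <body>...</body>
</html>
"""
print(find_canonical(page))
```

If a page declares no canonical link at all, find_canonical returns None, which is a quick way to spot templates that are missing the tag.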

Difference between Canonicalization and a 301 Redirect:

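Though canonicalization and a 301 redirect both serve the same purpose, consolidating duplicate URLs, there is a difference between the two.
A 301 redirect permanently sends visitors and crawlers from the old URL to the new URL, so the old URL no longer serves any content. Canonicalization is the method of choosing the best URL out of several candidates: every version stays reachable, and the canonical tag only tells search engines which one you prefer.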
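The difference is easiest to see in the responses themselves. The following Python sketch models each response as a (status, headers, body) tuple; the function names and URLs are hypothetical and no real web framework is involved.

```python
from html import escape


def permanent_redirect(new_url):
    """A 301 response: the old URL stops serving content and points elsewhere."""
    return 301, {"Location": new_url}, ""


def page_with_canonical(preferred_url, title):
    """A 200 response: the page still loads, but names its preferred URL."""
    head = (
        f"<title>{escape(title)}</title>"
        f'<link rel="canonical" href="{escape(preferred_url, quote=True)}">'
    )
    body = f"<html><head>{head}</head><body>...</body></html>"
    return 200, {"Content-Type": "text/html; charset=utf-8"}, body


status, headers, _ = permanent_redirect("http://www.example.com/new-page/")
print(status, headers["Location"])

status, _, body = page_with_canonical("http://www.example.com/widgets/", "Widgets")
print(status, 'rel="canonical"' in body)
```

In the first case the visitor never sees the old page; in the second the visitor sees the page as usual and only the search engine acts on the hint.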

Reasons Why Rel=canonical HTTP Headers are Being Used Sparingly:

There are a lot of reasons why the rel=canonical Link HTTP header has not gained much traction in the SEO industry. Some of those reasons are as follows:

  • SEOs are heavily focused on traditional URL consolidation for text/html content types.
  • A canonical HTTP header is more difficult to implement than a link HTML tag.
  • Its implementation may require additional access where privileges may be limited.
  • Implementation may also require additional server modules to be enabled or installed.
  • A rel=canonical HTTP header can easily create server errors if it is not handled correctly.

Benefits of Rel=canonical:

  • It runs faster.
  • It presents fewer maintenance headaches.
  • It places a smaller load on server and bandwidth resources.

Four Easy Ways to Fix a Nasty Problem:

For those of you born without a left brain, here are four easy ways to fix the canonicalization problem (of your website, not your brain.)

  • Don’t use tracking IDs in internal site navigation. A lot of sites add stuff like ‘?source=blog’ to their navigation links so that their analytics reports can track user movement within, to and from the website. Instead, learn to use the referrer and navigation path reports in your web analytics.
  • Don’t use tracking IDs in organic links from other sites. If you get a link on another site and want it to help with your SEO, don’t put a tracking ID in that, either.
  • Be careful with pagination. Many sites have pagination, where visitors can click on 1, 2, 3 etc. to jump to later pages in search results, product lists or articles. That’s fine, but make sure that each page has a single URL.
  • Exclude ‘e-mail a friend’ pages. Most content management systems that offer an ‘e-mail a friend’ option send the user to a unique page for every article, each with the same form and content. Use robots.txt and the meta robots tag to exclude these pages from search engine crawls.
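If tracking IDs have already leaked into your links, a small clean-up routine can show what the canonical form of each URL should be. This is a rough Python sketch; the TRACKING_PARAMS list is an assumption and should be replaced with whatever parameters your own site actually uses.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed list of tracking parameters; adjust it to your own site
TRACKING_PARAMS = {"source", "utm_source", "utm_medium", "utm_campaign"}


def strip_tracking(url):
    """Return the URL with known tracking parameters removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in TRACKING_PARAMS
    ]
    # Rebuild the URL with only the parameters that change the page content
    return urlunsplit(parts._replace(query=urlencode(kept)))


print(strip_tracking("http://www.example.com/blog/?source=blog&page=2"))
```

Running a list of internal links through a function like this makes it easy to spot which ones differ from their clean version and need fixing in the templates.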

Avoid Dupes and Normalize URLs:

Duplicate title tags

Recently, Googlebot has gotten a lot ‘cleverer’ and is following many more parameterized links on websites. The upshot is that the same page can be crawled under several URLs, all carrying the same title. Just like duplicate meta descriptions, duplicate title tags affect the relevance of the results shown by search engines. Your page could even be considered duplicate content. Of course, this would be very detrimental in terms of SEO and will hurt the rankings your pages earn for the keywords users search for.

You can find duplicate title tags with Google Webmaster Tools. Log in to Google Webmaster Tools with your user ID and password, select Optimization in the left sidebar of the dashboard, and click the HTML Improvements link. The page shows how many duplicate title tags search engine crawlers found while crawling your website or blog. Click the Duplicate title tags link to see which posts share a title.
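If you have your own crawl data, the same check is easy to approximate. The Python sketch below assumes you already have a list of (URL, title) pairs from some crawler or export; the function name and sample data are illustrative only.

```python
from collections import defaultdict


def find_duplicate_titles(pages):
    """Group URLs by normalized title and keep titles used more than once.

    `pages` is an iterable of (url, title) pairs.
    """
    by_title = defaultdict(list)
    for url, title in pages:
        # Collapse whitespace and ignore case so trivial variations still match
        key = " ".join(title.split()).lower()
        by_title[key].append(url)
    return {title: urls for title, urls in by_title.items() if len(urls) > 1}


# Hypothetical crawl results
crawl = [
    ("http://www.example.com/widgets/", "Blue Widgets"),
    ("http://www.example.com/widgets/?sort=price", "Blue Widgets"),
    ("http://www.example.com/about/", "About Us"),
]
for title, urls in find_duplicate_titles(crawl).items():
    print(title, urls)
```

Here the parameterized URL shows up next to its clean twin, which is exactly the kind of pair a canonical tag is meant to consolidate.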

Rewrite rules for IIS (Internet Information Services):

To perform this, you need administrative access to the IIS Server Application. If you do not have this, then you can do the redirects in ASP or PHP.

<rule name="pdf processor" stopProcessing="true">
    <!-- Hand every request for a .pdf file to pdf.php -->
    <match url="^(.+)\.pdf$" ignoreCase="false" />
    <action type="Rewrite" url="/pdf.php?file={R:1}" />
</rule>

Also:


<rule name="Redirect to www" patternSyntax="Wildcard" stopProcessing="true">
  <match url="*" />
  <conditions>
    <!-- Only fire for the bare domain, not for www.domain.com -->
    <add input="{HTTP_HOST}" pattern="domain.com" />
  </conditions>
  <action type="Redirect" url="http://www.domain.com/{R:0}" redirectType="Permanent" />
</rule>

PHP Canonical Redirect: (non-www to www)


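<?php
// Redirect any host that does not already start with "www." to the www host
if (substr($_SERVER['HTTP_HOST'], 0, 4) !== 'www.') {
    header('HTTP/1.1 301 Moved Permanently');
    header('Location: http://www.' . $_SERVER['HTTP_HOST']
        . $_SERVER['REQUEST_URI']);
    exit; // stop here so no page output follows the redirect
}
?>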

Redirects & Canonical Redirects for Active Server Pages (ASP)

Single Page Redirect:
This is placed on the page you are redirecting and is triggered when someone tries to access the page.


<%@ Language=VBScript %>
<%
' Send a permanent redirect and stop processing the rest of the page
Response.Status = "301 Moved Permanently"
Response.AddHeader "Location", "http://www.domain.com/"
Response.End
%>

Apache Server Canonical Redirect: (non-www to www)

This should always be done, as some search engines do not properly understand the difference between the www and non-www versions of a domain and may count them as duplicate pages. Additionally, if you link to domain.com/index.html or domain.com/index.php, the search engine may treat that as yet another copy of your home page. Google may be the most popular search engine, but it can still be a bit archaic in its handling of canonical issues and redirects.

Make all your links absolute (full URL), at least those to your home page, so any link to your home page should always point to “http://www.yourdomain.com/”. Be sure to include the trailing slash: if you leave it off, some servers will issue a redirect to the trailing-slash URL, and this will be a 302 redirect.

ALWAYS:

<a href="http://www.yourdomain.com/">

NEVER:

<a href="http://www.yourdomain.com/index.html"> – Not direct to the domain
<a href="http://www.yourdomain.com"> – No trailing slash
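The rules above can be sketched as one normalization function, useful for checking the links in your templates. This is a rough Python illustration: the INDEX_FILES list and the function name are assumptions, and the function blindly prefixes "www." to any other host, so it only suits a site with a single www host.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumed default-document names; match these to your own server configuration
INDEX_FILES = ("index.html", "index.htm", "index.php")


def canonical_url(url):
    """Normalize a URL to the www host, without index files, with a root slash."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if not host.startswith("www."):
        host = "www." + host
    path = parts.path or "/"  # a bare domain gets the trailing slash
    for name in INDEX_FILES:
        if path.endswith("/" + name):
            path = path[: -len(name)]  # keep the slash, drop the file name
            break
    return urlunsplit((parts.scheme, host, path, parts.query, parts.fragment))


for link in (
    "http://yourdomain.com",
    "http://www.yourdomain.com/index.html",
    "http://www.yourdomain.com/",
):
    print(canonical_url(link))
```

All three sample links come out as http://www.yourdomain.com/, which is the single form every internal link should use.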

ASP Canonical Redirect: (non-www to www)
This code should be placed in an include file or any script that is executed for every page on the site before the page output begins. If your programmers are using ASP.NET and/or they are not using a global file, you need to have them address this issue.

<%
' Redirect to the www host when the requested host name lacks "www"
If InStr(Request.ServerVariables("SERVER_NAME"), "www") = 0 Then
  Response.Status = "301 Moved Permanently"
  Response.AddHeader "Location", "http://www." _
    & Request.ServerVariables("HTTP_HOST") _
    & Request.ServerVariables("SCRIPT_NAME")
  Response.End
End If
%>

Rel=canonical Tag for non-text/html types:

# Requires mod_headers to be enabled
<Files "file.pdf">
   Header add Link "<http://www.example.com/page.html>; rel=\"canonical\""
</Files>

I would recommend using the traditional link rel=canonical tag for web pages, but if you have PDF files, you may consider using this header for non-text/html types.
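If your files are served by an application rather than straight from the web server, the same header can be built in code. The Python sketch below only constructs the header value; the function name and the headers dictionary are illustrative and not tied to any particular framework.

```python
def canonical_link_header(preferred_url):
    """Build the value of an HTTP Link header that declares a canonical URL."""
    # Characters that would break the header syntax must be percent-encoded first
    if any(ch in preferred_url for ch in '<>" \r\n'):
        raise ValueError("URL must be percent-encoded before use in a header")
    return f'<{preferred_url}>; rel="canonical"'


# Hypothetical response headers for a PDF that duplicates an HTML page
headers = {
    "Content-Type": "application/pdf",
    "Link": canonical_link_header("http://www.example.com/page.html"),
}
print(headers["Link"])
```

Note the angle brackets around the URL and the quoted rel value; leaving either out is a common reason the header gets ignored.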

Can you use Rel=”canonical” and rel=prev/rel=next together?

Yes, you can use rel=canonical and rel=next/rel=prev together. Bing also supports rel=prev/rel=next, but given prior statements made by the Bing team regarding the use of self-referential canonical tags, I don’t think using them together is a good idea from the Bing point of view.
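As a rough illustration of how the three tags can sit side by side, here is a Python sketch that generates them for one page of a paginated series. It assumes a hypothetical URL scheme (page 1 at the base URL, later pages at ?page=N) and assumes each page names itself as canonical rather than pointing back to page 1.

```python
def pagination_links(base_url, page, last_page):
    """Return the <link> tags for one page of a paginated series.

    Assumes a hypothetical URL scheme: page 1 lives at base_url and later
    pages at base_url + "?page=N".
    """
    def url_for(n):
        return base_url if n == 1 else f"{base_url}?page={n}"

    # Each page names itself as canonical; prev/next describe the sequence
    tags = [f'<link rel="canonical" href="{url_for(page)}">']
    if page > 1:
        tags.append(f'<link rel="prev" href="{url_for(page - 1)}">')
    if page < last_page:
        tags.append(f'<link rel="next" href="{url_for(page + 1)}">')
    return tags


for tag in pagination_links("http://www.example.com/products/", 2, 3):
    print(tag)
```

The first page gets no rel=prev and the last page gets no rel=next, so the series has a clear start and end.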

Ending Note:

When working with HTTP headers and .htaccess, a single mistake can bring an entire website down. Don’t try this directly on the live server; implement it in a test environment first.

If you do not have a testing environment, one way to take precautions and minimize potential downtime from .htaccess edits is simply to keep a backup copy of the file before making any edits. Keep it in a handy place so that if anything goes wrong, you can replace the broken file with your latest functional backup copy.
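The backup habit is easy to automate. Here is a minimal Python sketch, assuming you can run scripts where the file lives; the function names are made up, and the demo works on a throwaway copy in a temporary directory rather than on a real .htaccess.

```python
import shutil
import tempfile
import time
from pathlib import Path


def backup_file(path):
    """Copy a file next to itself with a timestamp suffix; return the copy's path."""
    source = Path(path)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    target = source.with_name(f"{source.name}.{stamp}.bak")
    shutil.copy2(source, target)  # copy2 also preserves modification times
    return target


def restore_file(backup, path):
    """Overwrite the live file with a previously saved backup."""
    shutil.copy2(backup, path)


# Demo on a temporary stand-in for .htaccess
with tempfile.TemporaryDirectory() as tmp:
    htaccess = Path(tmp) / ".htaccess"
    htaccess.write_text("RewriteEngine On\n")
    saved = backup_file(htaccess)           # 1. back up before editing
    htaccess.write_text("RewriteEngin On")  # 2. a typo that would break the site
    restore_file(saved, htaccess)           # 3. roll back to the working copy
    print(htaccess.read_text().strip())
```

Because the backup name carries a timestamp, repeated edits leave a trail of working copies instead of overwriting the only good one.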

Author bio

Roman Viliavin, vice CEO at Promodo SEO Company

Unconventional thinker and candidate master of chess. Roman has been working in the field of search engine optimization since 2005 and is the moving spirit of the company. He is a participant and speaker at all major events in the SEO business. Roman has successfully completed dozens of projects and gladly shares his experience with the SEO community via articles and various online and offline publications. Follow Roman on Twitter and Facebook.
