Big news! Last week, Google, Yahoo and MSN announced a new tag that allows you to specify the canonical URL for any page on your site. If you’re not sure what that means, let me explain in real-life terms.
McClatchy Corporation owns dozens of newspapers across the country. Many of the daily papers also have smaller weekly or niche papers. All of those newspapers have a presence on the web. This allows their markets to share content in a way that makes sense from a business point of view. Let’s look at one scenario. The Miami Herald is a hot spot for hurricane information because they are in the path of so many, but also because the National Hurricane Center is in their city.
Miami may “publish” their hurricane story into multiple sections. They know that it will live in their hurricane/storm section, but because the storm is coming right towards them, they could also publish the exact same content into “Breaking News”, “Local News” and “Weather”. Are you seeing the problem yet?
Now Miami has the exact same story at four different URLs, one for each section it was published into. That’s a problem.
Thankfully, we can now fix this problem, and many similar ones.
<link rel="canonical" href="http://www.herald.com/storm/story/228361.html" />
Ahhhh, that’s better. Putting a tag like the one above on each of those pages tells the search engines where the “real” page for that story is. Now when the search engines come across the exact same content in various sections, they can understand that it is just a copy of the original. And Google also says, “Additional URL properties, like PageRank and related signals, are transferred as well.”
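To make that concrete, here is a minimal sketch of how a CMS template helper might emit the same tag on every copy of the story. The function name and the three non-storm section URLs are made up for illustration; only the storm URL comes from the example above.

```python
from html import escape


def canonical_link_tag(canonical_url: str) -> str:
    """Build the <link rel="canonical"> tag that goes in a page's <head>."""
    # Escape the URL so characters like & or " cannot break the attribute.
    return f'<link rel="canonical" href="{escape(canonical_url, quote=True)}" />'


# The "real" home of the story (taken from the example above).
CANONICAL_URL = "http://www.herald.com/storm/story/228361.html"

# Hypothetical URLs where the same story was also published.
SECTION_URLS = [
    CANONICAL_URL,
    "http://www.herald.com/breaking_news/story/228361.html",
    "http://www.herald.com/local/story/228361.html",
    "http://www.herald.com/weather/story/228361.html",
]

# Every copy, including the original, carries the identical tag.
tags_by_page = {url: canonical_link_tag(CANONICAL_URL) for url in SECTION_URLS}

if __name__ == "__main__":
    for page, tag in tags_by_page.items():
        print(page, "->", tag)
```

The point of the sketch is simply that the tag is a property of the story, not of the section: whichever URL renders the page, the href stays the same.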
The last part I was hoping for was a canonical URL on a completely different domain, so that when Miami shares its hurricane coverage with Raleigh’s News & Observer we could further reduce duplicate content, but no luck on that front. Google explains:
Can this link tag be used to suggest a canonical URL on a completely different domain?
No. To migrate to a completely different domain, permanent (301) redirects are more appropriate. Google currently will take canonicalization suggestions into account across subdomains (or within a domain), but not across domains. So site owners can suggest www.example.com vs. example.com vs. help.example.com, but not example.com vs. example-widgets.com.
This is very good news: stuff like “print version” pages or URLs with additional parameters can be herded to one URL.
Let the CMS hacking begin!
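As a starting point, here is a rough sketch of two helpers a CMS might use: one that drops the query string and fragment from a URL to get a candidate canonical, and one that checks the candidate stays within the same domain or its subdomains, per Google’s answer above. The function names and the `site_domain` argument are my own inventions, and stripping every parameter is only safe when parameters never select different content (as with the path-based story URLs here).

```python
from urllib.parse import urlsplit, urlunsplit


def strip_parameters(url: str) -> str:
    """Return the URL without its query string or fragment."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


def on_site(url: str, site_domain: str) -> bool:
    """True if the URL's host is site_domain or one of its subdomains."""
    host = (urlsplit(url).hostname or "").lower()
    site_domain = site_domain.lower()
    return host == site_domain or host.endswith("." + site_domain)


def usable_canonical(page_url: str, candidate: str, site_domain: str) -> bool:
    """Assumed rule: page and canonical must both live under site_domain."""
    return on_site(page_url, site_domain) and on_site(candidate, site_domain)


if __name__ == "__main__":
    # Hypothetical "print version" URL with extra tracking parameters.
    messy = "http://www.herald.com/storm/story/228361.html?print=1&ref=rss"
    print(strip_parameters(messy))
    print(usable_canonical("http://help.example.com/a",
                           "http://www.example.com/a", "example.com"))
    print(usable_canonical("http://example.com/a",
                           "http://example-widgets.com/a", "example.com"))
```

Passing the site’s domain in by hand sidesteps the harder question of working out what counts as “the same domain” for hosts like `something.co.uk`, which this sketch does not attempt.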
A well-documented post over at SEOmoz has lots of good information, and fantastically bad robot drawings to help.