There are two types of sitemaps, and both are important for optimizing your site for search.
Matt Cutts from Google explains which one Google likes better.
HTML Sitemaps
The first type of sitemap is a standard HTML sitemap. It is sometimes a standalone page and sometimes tucked into the footer of a page. Either way, it is fully visible to users and offers another way for both humans and search robots to reach sections of your site.
On a large site, an HTML sitemap will not have a link to every page, but will highlight the most important sections as shortcuts to the good stuff.
Google suggests that your site make an HTML sitemap available. You don’t really want the sitemap page itself indexed by the search engines, so add <meta name="robots" content="noindex,follow"> to it. That keeps the page out of the index while still letting the robots follow its links.
XML Sitemaps
The second type of sitemap is an XML sitemap. XML sitemaps are not intended for human consumption; they are strictly for the search engines. There are a few different types that you can use.
Your XML sitemap can carry specific information that helps the search engines understand which pages are important. The most basic type of XML sitemap can be used for any page on your site. It follows this format:
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.example.com/</loc>
    <lastmod>2005-01-01</lastmod>
    <changefreq>monthly</changefreq>
    <priority>0.8</priority>
  </url>
</urlset>
The only required tag is <loc>, which is the location (URL) of the page. The other tags let you supply more information: <lastmod> is the date the file was last modified, <changefreq> is how often the file changes (usually you use either <lastmod> or <changefreq>, not both), and <priority> is how important the page is relative to the rest of your site, on a scale from 0.0 to 1.0. For example, your homepage might be 1.0, major sections 0.9, important sections 0.8, run-of-the-mill pages 0.5, and so on.
When building out your XML sitemap, you can include every single page of your site, but you don’t have to. The search engines will use your sitemap as a guide, and may stop processing URLs at some point if they feel they are no longer getting quality results. So order it from most important to least important if possible.
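If you generate your sitemap with a script, the ordering advice above is easy to apply. Here is a minimal Python sketch, assuming you already have a list of pages with their priorities. The build_sitemap function and the dictionary keys are made up for illustration and are not part of any Google tool. Pages without a priority are sorted as 0.5, which is the default the Sitemap protocol assigns.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Return a basic XML sitemap as a string.

    `pages` is a hypothetical list of dicts: 'loc' is required;
    'lastmod', 'changefreq' and 'priority' are optional.
    """
    # Most important pages first; a missing priority is treated as 0.5.
    ordered = sorted(pages, key=lambda p: p.get("priority", 0.5), reverse=True)

    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page in ordered:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = page["loc"]
        # Only emit the optional tags that were actually supplied.
        for tag in ("lastmod", "changefreq", "priority"):
            if tag in page:
                ET.SubElement(url, tag).text = str(page[tag])
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    print(build_sitemap([
        {"loc": "http://www.example.com/about", "priority": 0.5},
        {"loc": "http://www.example.com/", "priority": 1.0, "lastmod": "2005-01-01"},
    ]))
```

The output has no <?xml ...?> declaration, so add that line yourself before saving the file.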
Googler John Mueller responded to a question about timing with the following:
The Sitemaps processing pipeline can be delayed depending on its overall load at any given time. The pipeline prioritizes Sitemaps to be processed based on the best use of Google’s resources, dependent on multiple factors. For example sites hosted on servers with slow response times or sites that do not meet the Webmaster Guidelines may experience delays. It generally shouldn’t cause you great alarm.
Google also reminds you, “Please note that submitting a Sitemap doesn’t guarantee that all pages of your site will be crawled or included in our search results.”
News Specific Sitemaps
Google allows you to submit your freshest news to their spiders via a news sitemap. Recently Google updated the information you can have in your news sitemap to make it even better!
A News Sitemap uses the Sitemap protocol, with additional News-specific tags as defined below. Here is an example of a News Sitemap entry using News-specific tags:
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:n="http://www.google.com/schemas/sitemap-news/0.9">
  <url>
    <loc>http://www.example.org/business/article55.html</loc>
    <n:news>
      <n:publication>
        <n:name>The Example Times</n:name>
        <n:language>en</n:language>
      </n:publication>
      <n:access>subscription</n:access>
      <n:genres>pressrelease, blog</n:genres>
      <n:publication_date>2008-12-23</n:publication_date>
      <n:title>Companies A, B in Merger Talks</n:title>
      <n:keywords>business, merger, acquisition, A, B</n:keywords>
      <n:stock_tickers>NASDAQ:A, NASDAQ:B</n:stock_tickers>
    </n:news>
  </url>
</urlset>
A News Sitemap lists only news articles that have been published on your site within the past two days. You shouldn’t include older articles. You can remove articles older than two days from the News sitemap, and they will still remain in the News index for the regular 30-day period.
A News Sitemap must contain a publication date for each article, which refers to the date that the article first appeared on your site. Please make sure not to set this to the “current” time. Plenty of other things can go wrong too; the common errors are explained in more detail in Google’s Webmaster Help Forum.
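If a script rebuilds your News Sitemap, the two-day rule comes down to a simple date filter. The sketch below is only an illustration: the recent_articles function and the 'published' key are assumed names, and each article is assumed to carry a timezone-aware datetime for when it first appeared on your site.

```python
from datetime import datetime, timedelta, timezone


def recent_articles(articles, now=None, max_age=timedelta(days=2)):
    """Keep only the articles first published within `max_age` of `now`.

    `articles` is a hypothetical list of dicts whose 'published' value is a
    timezone-aware datetime (the original publication time, not "now").
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - max_age
    # Drop anything older than the cutoff, and anything dated in the future.
    return [a for a in articles if cutoff <= a["published"] <= now]


if __name__ == "__main__":
    now = datetime(2008, 12, 24, 12, 0, tzinfo=timezone.utc)
    articles = [
        {"loc": "http://www.example.org/business/article55.html",
         "published": datetime(2008, 12, 23, tzinfo=timezone.utc)},
        {"loc": "http://www.example.org/business/article12.html",
         "published": datetime(2008, 12, 20, tzinfo=timezone.utc)},
    ]
    for article in recent_articles(articles, now=now):
        print(article["loc"])
```

Passing now explicitly makes the filter easy to test; in production you would normally leave it out and let it default to the current UTC time.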
Video Sitemaps
Google can crawl the following video file types: .mpg, .mpeg, .mp4, .mov, .wmv, .asf, .avi, .ra, .ram, .rm, .flv. All files must be accessible via HTTP. Metafiles that require a download of the source via streaming protocols are not supported at this time.
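Before adding a video URL to your sitemap, you can run a quick check against the list above. This is a rough Python sketch, not an official validator: is_crawlable_video is a made-up helper, it looks only at the URL (it never fetches the file), and accepting https alongside http is my assumption.

```python
import posixpath
from urllib.parse import urlparse

# File types taken from the list above.
SUPPORTED_EXTENSIONS = {
    ".mpg", ".mpeg", ".mp4", ".mov", ".wmv", ".asf",
    ".avi", ".ra", ".ram", ".rm", ".flv",
}


def is_crawlable_video(url):
    """Return True if the URL uses HTTP(S) and ends in a listed file type."""
    parsed = urlparse(url)
    # Streaming protocols such as rtsp:// are rejected here.
    if parsed.scheme not in ("http", "https"):
        return False
    # Look at the path only, so query strings do not confuse the check.
    extension = posixpath.splitext(parsed.path)[1].lower()
    return extension in SUPPORTED_EXTENSIONS


if __name__ == "__main__":
    for candidate in (
        "http://www.site.com/video123.flv",
        "rtsp://www.site.com/video123.rm",
        "http://www.site.com/videoplayer.swf?video=123",
    ):
        print(candidate, is_crawlable_video(candidate))
```

A player URL such as the .swf one fails this check on purpose: it belongs in <video:player_loc>, not in <video:content_loc>.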
Below is an example video sitemap entry:
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:video="http://www.google.com/schemas/sitemap-video/1.1">
  <url>
    <loc>http://www.example.com/videos/some_video_landing_page.html</loc>
    <video:video>
      <video:content_loc>http://www.site.com/video123.flv</video:content_loc>
      <video:player_loc allow_embed="yes">http://www.site.com/videoplayer.swf?video=123</video:player_loc>
      <video:thumbnail_loc>http://www.example.com/thumbs/123.jpg</video:thumbnail_loc>
      <video:title>Grilling steaks for summer</video:title>
      <video:description>Get perfectly done steaks every time</video:description>
      <video:rating>4.2</video:rating>
      <video:view_count>12345</video:view_count>
      <video:publication_date>2007-11-05T19:20:30+08:00</video:publication_date>
      <video:expiration_date>2009-11-05T19:20:30+08:00</video:expiration_date>
      <video:tag>steak</video:tag>
      <video:tag>meat</video:tag>
      <video:tag>summer</video:tag>
      <video:category>Grilling</video:category>
      <video:family_friendly>yes</video:family_friendly>
      <video:duration>600</video:duration>
    </video:video>
  </url>
</urlset>
Carlos Duplar says
Nice article, Phil. What are the best practices about pinging Google with new videos and news sitemaps? AFAIK they should be rebuilt, republished and pinged whenever a new video/news item is published, right?
Regards, Carlos