Basic tutorial on sitemaps (Part 1)
Do you have a website?
Do you want a way of informing search engines like Google, Yahoo and MSN about your website?
Have you heard something about Sitemaps?
Sitemaps allow website owners to tell certain search engines about the pages on their site. Unfortunately, a sitemap doesn't guarantee that the pages will be indexed, nor does it promise more traffic. However, it is an easy way of at least letting the search engines know the pages exist.
Sitemaps are written in XML, a markup language similar to HTML, so even a novice can use it. In fact, there are a number of free tools available online that will build the XML document for you. I will list these further down the document.
So what does a sitemap look like? Below is a very simple example:
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.example.com/</loc>
    <lastmod>2005-01-01</lastmod>
    <changefreq>monthly</changefreq>
    <priority>0.8</priority>
  </url>
</urlset>
The first two lines and the last line are standard. Between them, each page on your site needs its own <url> entry. The tags you can use are listed in the table below.
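To make the "one entry per page" idea concrete, here is a minimal sketch of a sitemap for a small three-page site. The page addresses are made up for illustration, and each entry uses only the required <loc> tag.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <!-- One <url> entry per page; these example.com addresses are placeholders -->
  <url>
    <loc>http://www.example.com/</loc>
  </url>
  <url>
    <loc>http://www.example.com/about.html</loc>
  </url>
  <url>
    <loc>http://www.example.com/contact.html</loc>
  </url>
</urlset>
```

Adding another page to the sitemap is just a matter of adding another <url> block before the closing </urlset> tag.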
| Tag | Required? | Description |
|---|---|---|
| <urlset> | required | Encapsulates the file and references the current protocol standard. |
| <url> | required | Parent tag for each URL entry. The remaining tags are children of this tag. |
| <loc> | required | URL of the page. This URL must begin with the protocol (such as http) and end with a trailing slash, if your web server requires it. This value must be less than 2,048 characters. |
| <lastmod> | optional | The date of last modification of the file. This date should be in W3C Datetime format. This format allows you to omit the time portion, if desired, and use YYYY-MM-DD. Note that this tag is separate from the If-Modified-Since (304) header the server can return, and search engines may use the information from both sources differently. |
| <changefreq> | optional | How frequently the page is likely to change. This value provides general information to search engines and may not correlate exactly to how often they crawl the page. Valid values are: always, hourly, daily, weekly, monthly, yearly, never. The value "always" should be used to describe documents that change each time they are accessed. The value "never" should be used to describe archived URLs. Please note that the value of this tag is considered a hint and not a command. Even though search engine crawlers may consider this information when making decisions, they may crawl pages marked "hourly" less frequently than that, and they may crawl pages marked "yearly" more frequently than that. Crawlers may periodically crawl pages marked "never" so that they can handle unexpected changes to those pages. |
| <priority> | optional | The priority of this URL relative to other URLs on your site. Valid values range from 0.0 to 1.0. This value does not affect how your pages are compared to pages on other sites; it only lets the search engines know which pages you deem most important for the crawlers. The default priority of a page is 0.5. Please note that the priority you assign to a page is not likely to influence the position of your URLs in a search engine's result pages. Search engines may use this information when selecting between URLs on the same site, so you can use this tag to increase the likelihood that your most important pages are present in a search index. Also, please note that assigning a high priority to all of the URLs on your site is not likely to help you. Since the priority is relative, it is only used to select between URLs on your site. |
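As a rough illustration of how the optional tags might be combined, the sketch below describes three pages with different update habits. The page addresses, dates and priority values are invented for the example; choose values that reflect your own site.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <!-- Home page: changes often and matters most, so it gets the highest priority.
       lastmod here uses the full W3C Datetime form, including time and offset. -->
  <url>
    <loc>http://www.example.com/</loc>
    <lastmod>2005-01-01T18:30:00+00:00</lastmod>
    <changefreq>daily</changefreq>
    <priority>1.0</priority>
  </url>
  <!-- Regular page: date-only lastmod; priority is left out, so the default of 0.5 applies -->
  <url>
    <loc>http://www.example.com/news.html</loc>
    <lastmod>2005-01-01</lastmod>
    <changefreq>weekly</changefreq>
  </url>
  <!-- Archived page: not expected to change again, and less important than the rest -->
  <url>
    <loc>http://www.example.com/archive/2004.html</loc>
    <changefreq>never</changefreq>
    <priority>0.2</priority>
  </url>
</urlset>
```

Because priority is relative, the spread of values (1.0, the default 0.5, and 0.2) is what carries meaning here; giving every page 1.0 would tell the search engines nothing.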
If you don't fancy doing this yourself, you can use one of the many tools that will do it for you. Google has a list of links to both online and offline tools at Google Code.
OK, so you now have your sitemap. Part 2 will go through the options you have for submitting it to the search engines.
For more detailed documentation on sitemaps visit sitemaps.org.