XML

Summary

This lesson started out with a somewhat unlikely topic: the optimization of web sites with respect to search engine web crawlers. Of course, XML factored heavily into the discussion because it forms the basis of Google's Sitemaps service, which allows you to seed Google's web crawler with information about the pages on your web site. You not only learned how Google Sitemaps works and how it can help you, but you also learned how the XML-based Sitemap protocol language is structured. From there, you created a Sitemap by hand and then learned how to validate it and submit it to Google for processing. After you learned how to create a Sitemap the hard way, I shared with you a much easier technique that involves generating Sitemap documents automatically using online tools. Hopefully you've left this lesson with a practical web trick up your sleeve to try out on your own web sites.

Q&A

Q.

I still don't quite understand why having a web site crawled more frequently doesn't automatically improve its search ranking. What gives?

A.

Having your web pages crawled more frequently doesn't directly affect the search ranking of the pages, at least not in terms of improving an existing ranking. This is because more frequent crawling simply means that the content of the pages is reindexed more often. The content of the pages is what determines the search ranking, not how recently or accurately that content was indexed. So, having your pages crawled more frequently should result in your pages matching up more accurately with searches, but not necessarily in a higher search ranking.

Q.

Does Google Sitemaps support any document formats other than the Sitemap protocol?

A.

Yes. Three other document formats are supported by Google Sitemaps: OAI-PMH (Open Archives Initiative Protocol for Metadata Harvesting), RSS (Really Simple Syndication)/Atom, and plain text. None of these formats is recommended over the Sitemap protocol unless you already happen to use one of them to describe or syndicate your site. And if you happen to have a text file with a list of URLs for the pages in your site, you should consider using a tool such as Google's Sitemap Generator to convert it to the Sitemap protocol.
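
If you're curious what such a conversion involves, here is a minimal sketch in Python. It is not how Google's Sitemap Generator works; it simply wraps each URL from a plain text list in the url and loc tags of a Sitemap document. The function name urls_to_sitemap and the sample URLs are made up for illustration, and the namespace shown is the one published at sitemaps.org, which may differ from the namespace your Sitemap documents currently use.

```python
import xml.etree.ElementTree as ET

# Assumption: the sitemaps.org 0.9 namespace; swap in the namespace
# your own Sitemap documents use if it differs.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def urls_to_sitemap(lines):
    """Build a Sitemap document (as a string) from an iterable of URL lines.

    Hypothetical helper for illustration: one URL per line, blank lines skipped.
    """
    urlset = ET.Element("urlset", {"xmlns": SITEMAP_NS})
    for line in lines:
        url = line.strip()
        if not url:
            continue  # ignore blank lines in the text file
        url_elem = ET.SubElement(urlset, "url")
        # ElementTree escapes special characters such as & in the URL text
        ET.SubElement(url_elem, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


# Sample input standing in for the lines of a plain text URL list
sample = [
    "http://www.example.com/\n",
    "\n",
    "http://www.example.com/about.html\n",
]
sitemap_xml = urls_to_sitemap(sample)
print(sitemap_xml)
```

To use this with a real file, you could pass an open file object in place of the sample list, because iterating over a file yields one line at a time. The sketch emits only the required loc tag for each page; optional tags such as lastmod, changefreq, and priority would have to come from somewhere other than a bare list of URLs.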

Q.

How do I know that Google has successfully used my Sitemap to crawl the pages on my web site?

A.

You don't. In fact, Google makes no promises in regard to how a Sitemap improves the crawling of your pages. However, there is no downside to using a Sitemap, meaning that you only stand to gain by Google potentially crawling your pages more regularly and thereby improving the accuracy of search results that are related to your web content.