Sitemaps Autodiscovery With the Robots.txt File
Ever since the beginning of the internet and search engines, the robots.txt file has been how website owners and their webmasters tell search engine crawlers such as Googlebot which pages and content should be ignored and left out of search results.
This remained the situation for many years until Google created Google Sitemaps. (This was later renamed the XML Sitemaps Protocol as other search engines joined.)
New functionality called Sitemaps Autodiscovery was added to the robots.txt file format that makes it possible to point search engines to your XML sitemaps. Once a search engine bot has downloaded and read the robots.txt file, it can automatically discover and retrieve the XML sitemap files located on the website.
Note: In this tutorial we are creating sitemaps with our own tool, A1 Sitemap Generator.
Submitting Your XML Sitemap
When Google first announced their sitemaps, it was necessary to create and verify a Google Webmaster Tools account associated with the website containing the sitemap file. In addition, you had to submit the sitemap files manually through their web interface. Now, instead of doing this manually for each search engine, you can essentially submit your sitemaps simply by updating them.
When it has finished scanning your website and building the XML sitemap, the sitemapper software can also create a robots.txt file containing the correct and full path to your XML sitemap.
To include support for XML sitemap autodiscovery in the robots.txt file, all you need to do is add the fully qualified XML sitemap file path, as shown below:
Sample robots.txt File for XML Sitemaps Autodiscovery
If you have created a standard sitemap file, reference it like this:
Sitemap: http://www.example.com/sitemap.xml
If you have created a sitemap index file, you can also reference that:
Sitemap: http://www.example.com/sitemap-index.xml
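For reference, a sitemap index file is itself a small XML file that lists the individual sitemap files. A minimal sketch following the sitemaps.org schema (the file names and date are illustrative):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <!-- One <sitemap> entry per sitemap file; <lastmod> is optional -->
  <sitemap>
    <loc>http://www.example.com/sitemap-1.xml</loc>
    <lastmod>2011-01-01</lastmod>
  </sitemap>
  <sitemap>
    <loc>http://www.example.com/sitemap-2.xml</loc>
  </sitemap>
</sitemapindex>
```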
Manual XML Sitemaps Submission
There are still valid reasons why submitting your sitemaps manually the first time can be a good idea. One such reason is to get started using the different webmaster tools provided by the search engines.
Cross Submit Sitemaps for Multiple Websites
In the beginning, and for a long time afterwards, it was not possible to submit sitemaps for a website unless they were hosted on the same domain as that website. However, some search engines now support new ways of managing sitemaps across multiple sites and domains. The requirement is that you verify ownership of all the websites involved, in Google Webmaster Tools or the equivalent tool for each search engine:
- Sitemaps protocol: cross-submit and manage sitemaps for multiple domains using robots.txt.
- Google: supports more website verification methods than the sitemaps protocol defines.
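As a sketch of how cross submission works under the sitemaps protocol (the host names below are hypothetical): the robots.txt file of the site whose URLs are listed simply references a sitemap hosted on another domain, and that reference itself serves as proof that the site owner authorizes it:

```
# robots.txt served from http://www.example.com/robots.txt
# The referenced sitemap lives on a different host, but it may list
# http://www.example.com/ URLs because this reference authorizes it.
Sitemap: http://sitemaps.example-host.net/example-com-sitemap.xml
```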
Ping Search Engines When Updating XML Sitemaps
After the initial submission of your website XML sitemap file, here are the steps to take whenever you update your website and sitemap:
- Rebuild the XML sitemap so it reflects the updated website content.
- Upload the new sitemap file to your website, overwriting the old one.
- Ping the search engines so they know to re-fetch the sitemap.
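The ping itself is just an HTTP GET request against a search engine's ping endpoint, with your sitemap URL passed as a query parameter. A minimal Python sketch (the Google endpoint shown is the historically documented one and is used here purely as an illustration):

```python
from urllib.parse import urlencode

def build_ping_url(ping_endpoint, sitemap_url):
    """Build the GET URL that asks a search engine to re-fetch a sitemap."""
    # The sitemap URL must be percent-encoded when used as a query value.
    return ping_endpoint + "?" + urlencode({"sitemap": sitemap_url})

url = build_ping_url("https://www.google.com/ping",
                     "http://www.example.com/sitemap.xml")
print(url)
# Performing the actual ping is then a plain HTTP GET of this URL,
# e.g. urllib.request.urlopen(url) -- not done here to keep the
# example network-free.
```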