
Adding Pages to robots.txt Takes Time to Work
Mon, 17 Dec 2007 13:15:04 by Kerry Dye

I wrote a while ago about ways to exclude parts of your site from the search engines. In the section about robots.txt removal I noted that the response was not instant, but I didn't elaborate. Having observed the response to some of my own robots.txt changes over the last few months, I thought I would revisit the subject.

In Google, the bottom line on removals is that a page won't get removed until the spider revisits it. On a small site that is visited often, this happens really quickly. However, on a larger site with many pages, it may be months before the spider revisits a given page.

You might have thought that it would work in a different way - that if you added a directory to the robots.txt file, then the first time that robots.txt file was downloaded, Google would go "Aha!" and remove from its index anything that matched that rule. But that isn't what happens; it is done on a page by page basis as the spider finds the page, which is then matched to the rules in the robots.txt.
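To see which of your URLs a new rule will catch as the spider works through them, you can run the same page-by-page match yourself. The following is a minimal sketch using Python's standard urllib.robotparser module. The robots.txt content, the /print-versions/ directory and the example.com URLs are assumptions made up for illustration, and the standard-library parser is not guaranteed to match Google's own rule handling in every edge case.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the directory name is an assumption.
ROBOTS_TXT = """\
User-agent: *
Disallow: /print-versions/
"""


def blocked_urls(robots_txt, urls, user_agent="Googlebot"):
    """Return the URLs that the given robots.txt rules disallow."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # Each URL is checked individually against the rules, which mirrors
    # the page-by-page matching described above.
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


# Hypothetical URLs used only to exercise the function.
urls = [
    "http://www.example.com/products/widget.html",
    "http://www.example.com/print-versions/widget.html",
]
blocked = blocked_urls(ROBOTS_TXT, urls)
print(blocked)
```

Running this over a list of your own URLs gives a rough idea of how many pages are waiting for a revisit before they drop out.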

I have had a lot of SEO success with removing "low value" pages, such as near-duplicates, from Google. Because removal happens page by page, however, the effects take different amounts of time to show. With a small site the results are almost immediate: within days the site can race up the rankings with its new, more relevant page selection. With larger sites of tens of thousands of pages, the result is far more gradual, because each page is revisited less often and the removal process is much slower.

Is there a solution? Google Webmaster Tools allows you to submit removal requests for URLs that you want removed, but each one has to be entered by hand, which is time-consuming for more than a handful of pages (ask my colleague Pete, who removed nearly 300 URLs for a client). However, this is the only quick way to do it, and even then it takes a couple of days to be implemented.

If the pages you want removed are deep links, low down the site's crawling hierarchy, one way to speed things up is to provide a sitemap-like page listing those links, linked temporarily from your home page and removed once it has done its job.
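A temporary link page like this is easy to generate from a list of URLs. Here is a rough sketch in Python; the function name build_link_page, the page title and the example.com URLs are all hypothetical, and you would need to save the output somewhere your home page can link to.

```python
from html import escape


def build_link_page(urls, title="Temporary crawl list"):
    """Build a plain HTML page with one link per URL."""
    items = "\n".join(
        # escape() keeps characters such as & and " valid inside the markup.
        f'    <li><a href="{escape(url, quote=True)}">{escape(url)}</a></li>'
        for url in urls
    )
    return (
        "<html>\n"
        f"<head><title>{escape(title)}</title></head>\n"
        "<body>\n"
        "  <ul>\n"
        f"{items}\n"
        "  </ul>\n"
        "</body>\n"
        "</html>\n"
    )


# Hypothetical deep links used only for illustration.
page = build_link_page([
    "http://www.example.com/archive/2006/page-one.html",
    "http://www.example.com/search?x=1&y=2",
])
print(page)
```

Once the spider has been through the listed pages, delete the page and the home page link to it.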

The final option is just patience, something that search engine optimisers are quite good at. Eventually the pages will be removed, your site should climb the results, and the rankings of the remaining pages should improve as a result.



Kerry Dye
Campaign Delivery Manager

