Search Engine Spiders

All search engines use the same method of generating listings: search engine spiders. These are automated programs that trawl through billions of webpages in search of data for inclusion in the search engine's database. They also follow links and check the validity of the data. Spiders are also known as crawlers, agents, bots and robots.

If you review your weblogs you will find out which spiders have been visiting. Whilst spiders are good for getting listed, not all of them are friendly. Some will download the whole site for various nefarious purposes, not least to plagiarise the data. This can have a detrimental effect on your bandwidth if your site uses a lot of images. Two such spiders are Teleport Pro and WebStripper. If you discover them in your weblogs then someone has been attempting to strip your site off the server. The good news is that there is a simple method of discouraging this: if you haven't already, you can implement the Robots Exclusion Standard. Note that the standard is advisory, so it only stops agents that choose to honour it. To implement exclusions, you need a robots.txt file in your site root directory.
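As a rough illustration of the log review described above, the following Python sketch counts the user-agent strings in a web log and picks out the two agents named here. It assumes the log uses the common "combined" format, where the user agent is the last double-quoted field on each line; the function names and the watch list are hypothetical and would need adapting to your own server's logs.

```python
import re
from collections import Counter

# Assumption: "combined" log format, where the user agent is the
# last double-quoted field on the line.
UA_PATTERN = re.compile(r'"([^"]*)"\s*$')

# Hypothetical watch list, taken from the two agents named in the article.
SUSPECT_AGENTS = ("Teleport Pro", "WebStripper")


def count_user_agents(lines):
    """Count how many requests each user-agent string made."""
    counts = Counter()
    for line in lines:
        match = UA_PATTERN.search(line)
        if match:  # lines without a trailing quoted field are skipped
            counts[match.group(1)] += 1
    return counts


def find_suspects(counts, suspects=SUSPECT_AGENTS):
    """Return only the agents whose name contains a watch-list entry."""
    return {
        agent: hits
        for agent, hits in counts.items()
        if any(name.lower() in agent.lower() for name in suspects)
    }
```

To use it on a real log you might call `count_user_agents(open("access.log"))`, where `access.log` stands in for whatever file your host provides, and pass the result to `find_suspects`.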

If you don't have a robots.txt file then create one in Notepad or any other plain-text editor. The default is:

# robots.txt for http://www.yoursite.com

User-agent: *
Disallow:

This allows all spiders to search the site. To exclude a spider, add the following lines:

User-agent: NameOfAgent
Disallow: /

Enter the name of the agent exactly as it appeared in your reports or logs, e.g. Teleport Pro/1.29, and give each agent its own separate entry.
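Before uploading the file it can be worth checking that the rules read the way you intend. The sketch below uses Python's standard `urllib.robotparser` module on a sample robots.txt that blocks the two agents named earlier and allows everything else. The helper `is_allowed` and the sample file contents are illustrative assumptions. Python's parser appears to compare only the part of the agent name before any slash, so the version number is left off the `User-agent` lines here; other robots may match differently, and a passing check only shows how a well-behaved parser reads the file, not that a given spider will obey it.

```python
from urllib.robotparser import RobotFileParser

# Sample file for illustration: one record per excluded agent,
# then the default record that allows everyone else.
ROBOTS_TXT = """\
# robots.txt for http://www.yoursite.com

User-agent: Teleport Pro
Disallow: /

User-agent: WebStripper
Disallow: /

User-agent: *
Disallow:
"""


def is_allowed(robots_txt, agent, url):
    """Return True if the given agent may fetch url under these rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)
```

For example, `is_allowed(ROBOTS_TXT, "Teleport Pro/1.29", "http://www.yoursite.com/index.html")` asks whether that agent may fetch the page under the sample rules.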
