The robots exclusion standard, also known as the robots exclusion protocol or simply robots.txt, is a standard used by websites to communicate with web crawlers and other web robots. The standard specifies how to inform a web robot which areas of the website should not be processed or scanned. Robots are often used by search engines to categorize websites. Not all robots cooperate with the standard; email harvesters, spambots, and malware robots that scan for security vulnerabilities may even start with the portions of the website they have been told to stay out of. The standard is different from, but can be used in conjunction with, Sitemaps, a robot inclusion standard for websites.
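As an informal sketch of how a cooperating crawler can honor the standard, the snippet below parses a small robots.txt with Python's standard urllib.robotparser module and checks whether individual URLs may be fetched. The rule contents, the domain example.com, and the crawler name "ExampleBot" are illustrative assumptions rather than details from any real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: all crawlers are asked to stay out of
# /private/ but may fetch everything else.
robots_txt = """\
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# A cooperating crawler checks each URL against the rules before fetching it.
print(parser.can_fetch("ExampleBot", "https://example.com/private/page.html"))  # False
print(parser.can_fetch("ExampleBot", "https://example.com/index.html"))         # True
```

In practice a crawler would typically load the file from the site itself (for example via RobotFileParser.set_url() and read()); parsing an inline string here simply keeps the example self-contained.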