Introduction
Robots.txt gives instructions to web robots, such as search engine crawlers, about which folders they are allowed or not allowed to crawl. On most blogs, some folders are irrelevant for search engines to index, such as the admin folder or the images folder. This is where robots.txt comes in. Several web robots may crawl your blog, such as those from Google, Google Images, Alexa, AltaVista, Inktomi and WebCrawler.
A blog owner can create a robots.txt file and put it in the root (top-level) folder of the website. This asks well-behaved search engines not to index certain folders. For example, if you don’t want any web robot to index your pages, this is how it goes.
User-agent: *
Disallow: /
“User-agent: *” – This section applies to all web robots.
“Disallow: /” – This means that web robots should not visit any page on this website.
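If you want to check how a robot would read these two lines, Python's standard urllib.robotparser module can be used for a quick test. The snippet below is only a rough sketch: the helper names robots_url and is_allowed, the robot name "AnyBot" and the example.com addresses are made up for illustration.

```python
from urllib.parse import urlsplit, urlunsplit
from urllib.robotparser import RobotFileParser

# The block-all rules from the example above.
BLOCK_ALL = """\
User-agent: *
Disallow: /
"""

def robots_url(page_url: str) -> str:
    """Return the root-level robots.txt address for the site hosting page_url."""
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))

def is_allowed(rules: str, agent: str, url: str) -> bool:
    """Check whether `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(agent, url)

page = "http://example.com/blog/post.html"  # hypothetical blog page
print(robots_url(page))                      # where robots look for the file
print(is_allowed(BLOCK_ALL, "AnyBot", page)) # False: every robot is blocked
```

Whatever page a robot wants to visit, it looks for robots.txt at the top level of the site, which is why the file has to sit in the root folder.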
Details
Using robots.txt, you can keep search engines from indexing unwanted folders in your blog directory. Here is an example of how you can do it.
User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/
Take note that you need a separate line for every folder you want to exclude from search engines. For example, you can’t put “Disallow: /cgi-bin/ /tmp/” on a single line.
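One way to see why each folder needs its own line is to feed both forms to Python's standard urllib.robotparser module. This is just a sketch under a few assumptions: "ExampleBot", the example.com address and the sample paths are invented, and other robots may handle the malformed line differently.

```python
from urllib.robotparser import RobotFileParser

# One Disallow line per folder, as recommended above.
SEPARATE = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/
"""

# Both folders squeezed onto one line (the form to avoid).
COMBINED = """\
User-agent: *
Disallow: /cgi-bin/ /tmp/
"""

def blocked_paths(rules: str, paths: list[str]) -> list[str]:
    """Return the paths that a hypothetical 'ExampleBot' may not fetch."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return [p for p in paths
            if not parser.can_fetch("ExampleBot", "http://example.com" + p)]

PATHS = ["/cgi-bin/form.pl", "/tmp/cache.html", "/about.html"]
print(blocked_paths(SEPARATE, PATHS))  # the two excluded folders
print(blocked_paths(COMBINED, PATHS))  # nothing is blocked
```

With this parser the combined line ends up blocking neither folder, because it is read as one single path that happens to contain a space.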
Here are a few other examples of how you can use robots.txt.
To exclude all robots from the entire server
User-agent: *
Disallow: /
To exclude a single robot (here Alexa's crawler, which identifies itself as ia_archiver)
User-agent: ia_archiver
Disallow: /
To allow a single robot (here Google's crawler, Googlebot)
User-agent: Googlebot
Disallow:

User-agent: *
Disallow: /
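The "allow a single robot" case is the least obvious one, so here is a small check with Python's standard urllib.robotparser module. Treat it as a sketch: it assumes Googlebot (the name Google's crawler goes by) is the robot being let in, and "SomeOtherBot" and the example.com address are invented for the test.

```python
from urllib.robotparser import RobotFileParser

# An empty Disallow lets Googlebot in; everyone else is shut out.
ALLOW_ONE = """\
User-agent: Googlebot
Disallow:

User-agent: *
Disallow: /
"""

def can_visit(agent: str, url: str = "http://example.com/post.html") -> bool:
    """Return True if `agent` may fetch `url` under the ALLOW_ONE rules."""
    parser = RobotFileParser()
    parser.parse(ALLOW_ONE.splitlines())
    return parser.can_fetch(agent, url)

for agent in ("Googlebot", "SomeOtherBot"):
    print(agent, can_visit(agent))
```

The named robot matches its own section first, so the catch-all "*" section only applies to the robots that were not named.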
These are a few examples of how to create and use robots.txt files. If you have any ideas or feedback, please use the comment box below to share them.