Meta Robots and robots.txt
# September 7, 2013 at 12:49 pm
Please excuse my ignorance, but I still have some uncertainties regarding the robots.txt file.
Consider the following server structure:
```
public_html/
    .htaccess        (file)
    403.shtml        (file)
    cgi-bin/         (folder)
    domain2.com/     (folder)
    domain3.com/     (folder)
    domain4.com/     (folder)
    images/          (folder)
    index.html       (file)
    robots.txt       (file)
    subdomain/       (folder)
```
Consider the following robots.txt file:
```
User-agent: *
Disallow: /cgi-bin/
Disallow: /images/
Disallow: /domain3.com/something.js
Disallow: /subdomain/
Disallow:
```
Now some questions:
1.) Will the above robots.txt cover all of the domains, the subdomain, and the subdirectories? Or is it necessary to have a separate robots.txt in each domain folder?
2.) According to the above robots.txt, all directories and files may be crawled except cgi-bin, images, something.js, and subdomain. Is the final empty `Disallow:` line needed, or does everything not explicitly disallowed default to being crawlable?
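One way to sanity-check how a standards-compliant crawler would interpret these rules is Python's standard-library `urllib.robotparser`. The sketch below feeds it the robots.txt from the example (the test paths are hypothetical, chosen to exercise each rule):

```python
import urllib.robotparser

# The robots.txt rules from the example above.
rules = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /images/
Disallow: /domain3.com/something.js
Disallow: /subdomain/
Disallow:
"""

rp = urllib.robotparser.RobotFileParser()
rp.parse(rules.splitlines())

# Hypothetical paths to probe each rule; can_fetch() reports whether
# a crawler matching User-agent "*" may fetch the given path.
for path in ["/index.html",
             "/cgi-bin/test.cgi",
             "/images/logo.png",
             "/domain3.com/something.js",
             "/subdomain/page.html"]:
    verdict = "allowed" if rp.can_fetch("*", path) else "disallowed"
    print(f"{path} -> {verdict}")
```

Running this shows that the trailing empty `Disallow:` (which means "disallow nothing") does not change the outcome: paths under the listed prefixes are blocked, and everything else, such as `/index.html`, remains crawlable.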
All help is welcome… Thank you.
P.S. I’ve read the documentation at robotstxt.org, but it didn’t seem to cover some things.