Meta Robots and robots.txt

  • # January 17, 2013 at 5:54 am

    Is robots.txt needed if one uses the meta robots tag, and vice versa? Also, does robots.txt reside only in the root directory, or does each sub-directory that hosts an individual site also need its own, assuming the files contain the same permissions?

    # January 17, 2013 at 9:30 am

    You only need to set up your robots.txt once, in the root folder. In that file you then choose which folders inside the root to allow or disallow.
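A minimal sketch of what "the root folder" means from a crawler's point of view, assuming the usual Robots Exclusion Protocol behavior of requesting `/robots.txt` at the root of each host. The helper name `robots_txt_url` and the `example.com` hosts are made up for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url: str) -> str:
    """Return the robots.txt URL that governs page_url.

    Assumption: the crawler keeps the scheme and host of the page
    and replaces the path with /robots.txt (hypothetical helper).
    """
    parts = urlsplit(page_url)
    # Keep scheme and host; drop the path, query, and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


if __name__ == "__main__":
    for url in (
        "http://example.com/images/photo.png",
        "http://sub.example.com/page.html",
    ):
        print(url, "->", robots_txt_url(url))
```

Under that assumption, a page on a different host name (a subdomain or a second domain) maps to a different robots.txt URL, even if the files live in folders on the same server.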

    # September 7, 2013 at 12:49 pm

    Please excuse my ignorance, but I still have some uncertainties regarding the robots.txt file.

    Consider the following server structure:

    public_html
    
        .htaccess (file)
        403.shtml (file)
        cgi-bin (folder)
        domain2.com (folder)
        domain3.com (folder)
        domain4.com (folder)
        images (folder)
        index.html (file)
        robots.txt (file)
        subdomain (folder)
    

    Consider the following robots.txt file:

    User-agent: *
    Disallow: /cgi-bin/
    Disallow: /images/
    Disallow: /domain3.com/something.js
    Disallow: /subdomain/
    Disallow:
    

    Now some questions:

    1.) Will the above robots.txt account for all the domains, subdomain, and subdirectories? Or is it necessary to have a robots.txt in each domain folder?

    2.) According to the above robots.txt, all directories and files are crawled with the exception of cgi-bin, images, something.js, and subdomain. Is the last empty Disallow: statement needed, or will that default to crawl if not specified?
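One way to explore question 2 is to feed the file above to a parser and compare the results with and without the trailing empty `Disallow:` line. The sketch below uses Python's standard-library `urllib.robotparser`. It only shows how that one parser reads the rules, and real crawlers may behave differently. The test paths and the helper `allowed_paths` are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt from the post, including the trailing empty Disallow.
RULES = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /images/
Disallow: /domain3.com/something.js
Disallow: /subdomain/
Disallow:
"""

# Same rules with the last (empty) Disallow line removed.
RULES_WITHOUT_EMPTY = "\n".join(RULES.splitlines()[:-1]) + "\n"

# Hypothetical paths chosen to exercise each rule.
PATHS = [
    "/cgi-bin/script.cgi",
    "/images/logo.png",
    "/domain3.com/something.js",
    "/domain3.com/index.html",
    "/subdomain/page.html",
    "/index.html",
]


def allowed_paths(robots_txt: str, paths: list[str]) -> dict[str, bool]:
    """Map each path to whether a generic crawler ("*") may fetch it."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {path: parser.can_fetch("*", path) for path in paths}


with_empty = allowed_paths(RULES, PATHS)
without_empty = allowed_paths(RULES_WITHOUT_EMPTY, PATHS)

if __name__ == "__main__":
    for path in PATHS:
        print(f"{path:30} with={with_empty[path]} without={without_empty[path]}")
```

With this parser, both versions of the file give the same answer for every path listed, which suggests the empty `Disallow:` is redundant here. That is an observation about this parser only, not a guarantee for every crawler.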

    Any experts out there with some insight?
    @paulie_d
    @theDoc
    @senff
    @traq

    All help is welcome…Thank you.

    P.S. I’ve read the documentation at robotstxt.org, but it didn’t seem to cover some things.
