The forums ran from 2008-2020 and are now closed and viewable here as an archive.

How to remove subdirectories using mod_rewrite in .htaccess

Viewing 15 posts - 1 through 15 (of 26 total)
  • #170247
    ewisely
    Participant

    Below is my directory structure

    Root (example.com)/
                       index.htm
                       contact.htm
                       privacy.htm
                       disclaimer.htm
                       cat/
                           play/
                                fun.htm
                           rest/
                                sleep.htm
    

    I managed to remove the file extension and add a trailing slash with:

    RewriteEngine on
    RewriteCond %{REQUEST_URI} !(\.[a-zA-Z0-9]{1,5}|/)$
    RewriteRule (.*)$ /$1/ [R,L]
    

    But I also want to make it in such a way that when people go to http://www.example.com/fun/ they’re able to access http://www.example.com/cat/play/fun.htm without redirecting, which means, in the address bar it still shows http://www.example.com/fun/.

    I know I can use the direct approach like:

    RewriteRule ^fun/$ /cat/play/fun.htm [L]
    RewriteRule ^sleep/$ /cat/rest/sleep.htm [L]
    

    But I’ll be adding more files to these 2 subdirectories (/cat/play/ and /cat/rest/), so I was wondering if there’s a single rewrite rule to perform the rewrites for these files instead of having to enter 100 rewrite rules for 100 files under those 2 subdirectories. Please enlighten.

    Appreciate your help.

    #170274
    __
    Participant

    Use a backreference, like in your first example*.

    RewriteRule ^([A-Za-z]+)/$ /cat/play/$1.html [L]
    

    * Though that example doesn’t “remove file extensions,” as you claim…? Perhaps I misunderstood you.

    #170507
    ewisely
    Participant

    Thanks for your input, traq. But if I were to do what you suggest, I would need to create every rewrite rule for each file. I hope to create just a nifty rewrite rule for multiple files.

    #170509
    __
    Participant

    But if I were to do what you suggest, I would need to create every rewrite rule for each file.

    …no

    The example I gave above matches the URL against alphabetic characters followed by a slash, and rewrites it to the /cat/play directory with a .html extension. That’s what I gathered you wanted, from your description:

    when people go to http://www.example.com/fun/ they’re able to access http://www.example.com/cat/play/fun.htm without redirecting, which means, in the address bar it still shows http://www.example.com/fun/.

    This example meets that description, to the letter. Did you try it out?

    If you want to rewrite to different directories (e.g., play or rest, as in your third code example), but without any way to distinguish which directory from the URL, then yes: you will have to make a new rule for every file. mod_rewrite cannot “guess” what you want it to do.

    You might consider adding an extra part to the url, for example:

    RewriteRule ^play/([A-Za-z]+)/$ /cat/play/$1.html [L]
    RewriteRule ^rest/([A-Za-z]+)/$ /cat/rest/$1.html [L]
    

    If there is something else I am not understanding about your question, please describe it further.

    #170833
    ewisely
    Participant

    Hi traq,

    RewriteRule ^([A-Za-z]+)/$ /cat/play/$1.html [L]
    

    Yup, I tried your example and that works for a single file. Like you said,

    mod_rewrite cannot “guess” what you want it to do

    So, I guess I just have to create every rule for each file if I were to place them in different subdirectories. I will also try out your example and see how well it works for my case:

    RewriteRule ^play/([A-Za-z]+)/$ /cat/play/$1.html [L]
    RewriteRule ^rest/([A-Za-z]+)/$ /cat/rest/$1.html [L]
    

    Thank you very much for your tips and effort in replying. Appreciate that a lot. :D

    #170837
    __
    Participant

    Yup, I tried your example and that works for a single file.

    Maybe I am misunderstanding what you are saying. Take these URLs, for example:

    example.com/fun/
    example.com/exercise/
    example.com/competition/
    example.com/games/
    

    You do not need to create new rules for each of these URLs. The single rule:

    RewriteRule ^([A-Za-z]+)/$ /cat/play/$1.html [L]
    

    Will handle all of them.

    Now, if you throw in some other URLs:

    example.com/sleep/
    example.com/bedtime/
    

    …mod_rewrite won’t know that they are meant to go to the “rest” directory. It will send them to the “play” directory because that’s the only one it knows about.

    If there is only one parameter in the URL, then that’s the only one mod_rewrite can make decisions based on.

    If you add a “category” parameter, then it could make that decision:

    example.com/play/fun/
    example.com/play/exercise/
    example.com/play/competition/
    example.com/play/games/
    example.com/rest/sleep/
    example.com/rest/bedtime/
    

    with

    RewriteRule ^play/([A-Za-z]+)/$ /cat/play/$1.html [L]
    RewriteRule ^rest/([A-Za-z]+)/$ /cat/rest/$1.html [L]
    

    Are we talking about the same thing, or am I mixed up about your question?
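
    If the category does appear in the URL, the two rules above could also be folded into one by capturing the directory name as well. This is only a sketch, assuming the .htaccess file sits in the document root and that play and rest are the only categories; the alternation group is an addition here, not something posted in the thread:

    ```apache
    RewriteEngine On

    # $1 = category ("play" or "rest"), $2 = page name (letters only)
    #   /play/fun/   -> /cat/play/fun.html
    #   /rest/sleep/ -> /cat/rest/sleep.html
    RewriteRule ^(play|rest)/([A-Za-z]+)/$ /cat/$1/$2.html [L]
    ```

    Adding another category would then mean extending the group rather than writing another rule.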

    #170944
    ewisely
    Participant

    Hi traq,

    RewriteRule ^([A-Za-z]+)/$ /cat/play/$1.html [L]
    

    This example works for all files under the same category. Sorry, that was my mistake: I had only created one file for testing at first. When I created more files under a single category, your code worked perfectly. Yup, you got what I want. I completely understand what you mean. Thank you so much for your patience in explaining.

    By the way, I’ve stumbled upon a code like this:

    RewriteCond %{REQUEST_URI} !(\.[a-zA-Z0-9]{1,5}|/)$
    

    Could you explain to me what this line of code is exactly doing? Appreciate that.

    #170969
    __
    Participant

    By the way, I’ve stumbled upon a code like this:
    RewriteCond %{REQUEST_URI} !(\.[a-zA-Z0-9]{1,5}|/)$

    A RewriteCond is a conditional statement. It’s like an if: if the condition is true, then the following rule will be applied.

    This one compares the URI that the user requested (the Apache variable %{REQUEST_URI}) to a pattern.

    ! means “not”. Basically, it means we are looking for URIs that do not match this pattern, instead of ones that do.

    \. means a literal dot (.)
    (The backslash is an escape character: in a regular expression, . means “any character.” Escaping it makes it match an actual dot.)

    [a-zA-Z0-9]{1,5} means 1, 2, 3, 4, or 5 alphanumeric characters.

    | means “or”: so, we’re matching either the first part of the pattern, or the next.

    / is just a literal slash.

    $ is an “end-of-line” anchor. Effectively, it means that nothing else can come between the previous pattern and the end of the URI.

    Put all together, this means we are looking for a URI that “does not end with a dot followed by 1-5 alphanumeric characters or a slash.” My guess is that it’s looking for urls that (1) are not directories and (2) do not include file extensions.

    Why is another issue; this pattern seems vague enough that it probably has false results at times. Is it something that you’re trying to use? If so, what for?
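
    As a rough illustration of the reading above, here is the condition next to the redirect rule from the first post, with comments showing how a few sample URIs would fare. The sample paths (including /archive/v1.2) are made up for illustration, and the sketch assumes the .htaccess file is in the document root:

    ```apache
    RewriteEngine On

    # The condition is true only when the URI does NOT end in ".ext" or "/":
    #   /rest/bedtime   -> true  (rule runs, redirects to /rest/bedtime/)
    #   /rest/bedtime/  -> false (already ends in a slash)
    #   /style.css      -> false (ends in a dot plus 3 alphanumerics)
    #   /archive/v1.2   -> false (".2" looks like an extension, so no slash is added)
    RewriteCond %{REQUEST_URI} !(\.[a-zA-Z0-9]{1,5}|/)$
    RewriteRule ^(.*)$ /$1/ [R,L]
    ```

    The last case is the kind of false result mentioned above: a page name that happens to contain a dot near the end is treated as if it were a file.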

    #170974
    ewisely
    Participant

    Thank you for your super clear explanation. I totally get what you mean.

    Oh, I’m thinking of using the code to add a trailing slash in case someone types in, for instance, example.com/rest/bedtime

    So, instead of returning a 404 Not Found error, it will append a forward slash so that the URL becomes example.com/rest/bedtime/, which will return the content of example.com/cat/rest/bedtime.html

    Since you said it may produce false results at times, I wonder if you have a better solution for that?

    Thanks once again.

    #170976
    __
    Participant

    Simple solutions are the best ones:
    I would just make the trailing slash optional in the first place.

    RewriteRule ^([A-Za-z]+)/?$ /cat/play/$1.html [L]
    

    This way, you will match both the example.com/keyword/ and example.com/keyword forms.

    #170980
    ewisely
    Participant

    Thanks for your suggestion. But example.com/keyword/ and example.com/keyword will be considered as 2 different pages in the eyes of Google and the PageRank assigned to them will be different too.

    I still prefer to standardize my url to all having a trailing slash. So, how do I go about doing that if I don’t want to use this RewriteCond %{REQUEST_URI} !(\.[a-zA-Z0-9]{1,5}|/)$ since I’m afraid it may produce false results, as you said?

    By the way, I went over to look for the difference between REQUEST_URI and REQUEST_FILENAME, http://httpd.apache.org/ doesn’t explain well. Other sites simply parrot what it says. I would like to hear from you. Thanks.

    #170982
    __
    Participant

    example.com/keyword/ and example.com/keyword will be considered as 2 different pages in the eyes of Google…

    You can solve this by including a canonical meta tag on the page (this is good practice anyway).

    I still prefer to standardize my url to all having a trailing slash.

    Note that your rule would not do so. Rewriting a URL does not change what appears in the address bar unless you specifically tell the browser to re-request the URL (i.e., by specifying an external domain, or by using the R flag).

    To illustrate, assume the user types “example.com/some/url” in their address bar.

    • case #1:
    RewriteRule ^(.*)$ /myPage.html [L]
    
    # apache rewrites the url to "example.com/myPage.html".
    # the user sees "myPage.html",
    # but address bar *still reads* "example.com/some/url".
    
    • case #2:
    RewriteRule ^(.*)$ http://other.example.com/theirPage.html [L]
    
    # apache rewrites the url to "other.example.com/theirPage.html".
    # because this is on a different host, apache sends a Redirect header
    # (by default, "302 Found") to the user's browser.
    # the user's browser will request the rewritten url,
    # and the address bar will then read "other.example.com/theirPage.html".
    
    • case #3:
    RewriteRule ^(.*)$ /myPage.html [L,R=302]
    
    # apache rewrites the url exactly as in case #1,
    # but because you used the [R] flag, it will also send a Redirect header as in case #2.
    # the user's browser will request the rewritten url,
    # and the address bar will then read "example.com/myPage.html".
    

    To be clear, I’m not suggesting you force the user to redirect. There’s no need. Rewrite internally, include a canonical reference in your html, and save everyone the trouble of turning one request into two.
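
    For anyone who does want a single trailing-slash form in the address bar and accepts the extra request, the redirect from case #3 can be paired with an internal rewrite. A minimal sketch, assuming the .htaccess file is in the document root and page names contain only letters:

    ```apache
    RewriteEngine On

    # 1. No trailing slash: ask the browser to re-request with one.
    #    example.com/fun  -> 301 redirect -> example.com/fun/
    RewriteRule ^([A-Za-z]+)$ /$1/ [R=301,L]

    # 2. Trailing slash present: rewrite internally, address bar unchanged.
    #    example.com/fun/ -> serves /cat/play/fun.html
    RewriteRule ^([A-Za-z]+)/$ /cat/play/$1.html [L]
    ```

    Neither pattern matches the rewritten path cat/play/fun.html, so the rules should not loop. While testing, R=302 may be the safer choice, since browsers tend to cache 301 responses.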

    I went over to look for the difference between REQUEST_URI and REQUEST_FILENAME, http://httpd.apache.org/ doesn’t explain well.

    Do you understand the difference between a URI (or URL, which is a type of URI) and a filename?

    A URI is like a name or nickname for some resource (which might be a file, or might not). It defines what you want.

    A filename is the actual name of a file on your computer. It is like an address on your hard disk — it defines where something is.

    Many times, on websites, a URI will map directly to a particular file in a very predictable 1:1 manner. But it does not have to, and they are certainly not the same thing.

    Even in this simple usage, the two will be different: for example, the URI directory/file.html does not map to the filename /directory/file.html. It maps relative to your site’s document root, so the actual filename will be something like /users/ewisely/webserver/public_html/directory/file.html.

    #170984
    ewisely
    Participant

    Alright, I’ll go for the canonical solution. Thanks.

    Sorry, I still don’t get what you mean regarding that uri and filename things.

    So, when do I use REQUEST_FILENAME or REQUEST_URI?

    #171018
    ewisely
    Participant

    I changed my directory structure to:

    Root (example.com)/
                             index.html
                             contact.html
                             privacy.html
                             disclaimer.html
                             routine/
                                             play-games.html
                                             do-exercise.html
                                             eat-food.html
    

    I decided not to use RewriteCond %{REQUEST_URI} !(\.[a-zA-Z0-9]{1,5}|/)$ anymore, so how should I rewrite my code to perform these:

    Address bar shows: example.com/play-games or example.com/play-games/
    User sees: example.com/routine/play-games.html

    Address bar shows: example.com/eat-food or example.com/eat-food/
    User sees: example.com/routine/eat-food.html

    etc…

    As I’m still confused with when and how to use REQUEST_FILENAME and REQUEST_URI, I don’t know what sort of RewriteCond I should put in.

    RewriteCond ?????
    RewriteRule ^(.*)/?$ /routine/$1.html [L]
    

    And I would like to perform these as well:

    Address bar shows: example.com/contact or example.com/contact/
    User sees: example.com/contact.html

    Address bar shows: example.com/privacy or example.com/privacy/
    User sees: example.com/privacy.html

    I know I have to do something extra but I just don’t know what conditions I should include. Please advise.

    Appreciate your help in this. Thanks.

    #171022
    __
    Participant

    …I’m still confused with when and how to use REQUEST_FILENAME and REQUEST_URI

    Well, there’s not much to it besides the fact that they are different things. Use REQUEST_URI when you want to compare a pattern to the URI. Use REQUEST_FILENAME when you want the filename.

    URI stands for “Uniform Resource Identifier.” It’s what you use to tell a website what you want to see.

    A filename is …a filename. It’s the path to the actual file on your computer.

    A URI might locate an actual file. For example, /file.html maps directly to file.html in your website’s document root. This is called a URL (“Uniform Resource Locator”).

    A URI might also identify some resource that is not necessarily a file. (For example, here on css-tricks, the URI /forums/topic/how-to-remove-subdirectories-using-mod_rewrite-in-htaccess/ doesn’t identify a file on a server: it identifies a group of records in a database.)

    But even a URL (like /file.html) is not a filename. (At least, I would hope it’s not: you shouldn’t have html files in your system’s root directory.) The file that the URL locates is in your server’s document root directory. So, if your document root is something like /users/ewisely/webserver/public_html, then the filename that the URL /file.html locates would be /users/ewisely/webserver/public_html/file.html.
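
    To make the "when do I use which" question a little more concrete, here is a small sketch that uses both variables side by side. It assumes the .htaccess file is in the document root and reuses the /rest/ URLs and the example paths from earlier in the thread; it is an illustration, not a recommendation for this particular site:

    ```apache
    RewriteEngine On

    # REQUEST_URI is the URL path the visitor asked for, e.g. "/file.html".
    # REQUEST_FILENAME is the filesystem path Apache mapped the request to,
    # e.g. "/users/ewisely/webserver/public_html/file.html".

    # Only act on requests whose URL path starts with /rest/ ...
    RewriteCond %{REQUEST_URI} ^/rest/
    # ... and that do not already map to an existing file (-f) or directory (-d).
    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteCond %{REQUEST_FILENAME} !-d
    RewriteRule ^rest/([A-Za-z]+)/?$ /cat/rest/$1.html [L]
    ```

    In short, patterns about what was asked for go against REQUEST_URI, while file tests such as -f and -d only make sense against a filesystem path like REQUEST_FILENAME.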

    how should I rewrite my code to perform these…

    Well, this brings us back to your original situation: Apache has no way to know which directory to look in to get the file, because the pattern is the same for each case.

    (if your routine/ files will always have a dash in the name, and there will never be other files with dashes in the name, then you might be able to write a rule to recognize that. But that’s a big “if,” and I would not recommend this approach.)

    You could add another part to the URI that indicates which directory to look in, or you can write a specific rule for each case.
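
    One further option, not raised in the thread, is to let Apache check which file actually exists instead of guessing from the URL. The sketch below assumes the .htaccess file is in the document root, that %{DOCUMENT_ROOT} reflects the real filesystem path on the host (not always the case on shared hosting), and that page names use only letters and hyphens:

    ```apache
    RewriteEngine On

    # $1 in a RewriteCond refers to the group captured by the RewriteRule below it.

    # example.com/play-games or example.com/play-games/
    #   -> /routine/play-games.html, but only if that file exists
    RewriteCond %{DOCUMENT_ROOT}/routine/$1.html -f
    RewriteRule ^([A-Za-z-]+)/?$ /routine/$1.html [L]

    # example.com/contact or example.com/contact/
    #   -> /contact.html, but only if that file exists
    RewriteCond %{DOCUMENT_ROOT}/$1.html -f
    RewriteRule ^([A-Za-z-]+)/?$ /$1.html [L]
    ```

    The trade-off is an extra file lookup per request, and if the same name exists in both places the first matching rule wins.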
