Not a day goes by when I don’t see a 404 error in JournalXtra’s access logs for missing image files. Today I rose to the challenge and conquered them. It took a couple of goes because I kept forgetting to clear my browser cache while adjusting .htaccess directives. But anyway, after figuring out perhaps 30 different ways to do this, I’ve finally managed to strip trailing slashes from backlink URLs for images.
The Situation
JournalXtra has hundreds of images within its pages. It is produced with WordPress hosted on an Apache server. Apache likes directory URLs to have trailing slashes and it likes file URLs to not have trailing slashes. Some people (or poorly written software) like to stick trailing slashes at the end of every URL whether it leads to a directory or a file. For example:
http://example.com/directory/ tells Apache to load the index file in the directory named “directory” if an index file has been set or to list the contents of the directory if directory browsing is enabled.
http://example.com/file.png tells Apache to load the file “file.png”.
http://example.com/file.png/ tells Apache to load the directory named “file.png”. Pay attention to the final “trailing” slash at the end of the URL.
When Apache meets a URL with a trailing slash it looks for a directory, not a file. So when a person or a bot requests a file, such as an image, with a trailing slash at the end of its URL, Apache can’t find the “directory” and returns a 404 error.
When people link to images and put a trailing slash at the end of the URL, Apache sticks its finger up at anyone following the link and tells them the image doesn’t exist.
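If you want to see how often this happens on your own site, a rough way to pull those requests out of an access log might look like the sketch below. It assumes the log uses Apache's Common or Combined Log Format; the function name, the regular expressions, and the sample log lines are all made up for illustration and are not taken from JournalXtra's logs.

```python
import re

# Matches the request and status fields of a Common/Combined Log Format line,
# e.g. ... "GET /images/logo.png/ HTTP/1.1" 404 ...
REQUEST_RE = re.compile(r'"(?:GET|HEAD) (\S+) HTTP/[\d.]+" (\d{3})')

# An image extension followed by one or more trailing slashes.
SLASHED_IMAGE_RE = re.compile(r'\.(?:png|jpe?g|gif)/+$', re.IGNORECASE)


def slashed_image_404s(log_lines):
    """Return request paths that got a 404 and look like image URLs with a trailing slash."""
    hits = []
    for line in log_lines:
        match = REQUEST_RE.search(line)
        if not match:
            continue  # not a GET/HEAD request line we recognise
        path, status = match.groups()
        path = path.split("?", 1)[0]  # ignore any query string
        if status == "404" and SLASHED_IMAGE_RE.search(path):
            hits.append(path)
    return hits


# Hypothetical sample lines, for illustration only.
sample_log = [
    '203.0.113.5 - - [10/Oct/2012:13:55:36 +0000] "GET /images/logo.png/ HTTP/1.1" 404 512 "-" "ExampleBot/1.0"',
    '203.0.113.5 - - [10/Oct/2012:13:55:37 +0000] "GET /images/logo.png HTTP/1.1" 200 2326 "-" "ExampleBot/1.0"',
    '203.0.113.9 - - [10/Oct/2012:13:56:01 +0000] "GET /missing-page/ HTTP/1.1" 404 512 "-" "Mozilla/5.0"',
]

print(slashed_image_404s(sample_log))  # ['/images/logo.png/']
```

Only the first sample line is reported: the second was served normally and the third is a 404 for something that isn't an image.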
The Solution
Assassinate anyone who links to files as though they are directories. In the meantime, until it’s legal and morally acceptable to do that, stick this in your .htaccess file:
RewriteEngine on
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_URI} /$ [NC]
RewriteCond %{REQUEST_FILENAME} .*\.(png|jpg|gif|jpeg)$ [NC]
RewriteRule ^(.*)/$ $1 [R=301,L]
The above rewrite conditions check whether the requested URL points to an image file. If the URL ends with a trailing slash and the requested file is an image file, the rewrite rule redirects the visitor to the same URL without the trailing slash.
How It Works
Line by Line:
RewriteEngine on
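Switches on Apache’s rewrite engine (the mod_rewrite module itself must already be loaded by the server). This line is only required if the engine has not already been switched on earlier in the file.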
RewriteCond %{REQUEST_FILENAME} !-d
Checks the requested document is not a directory. If it is not, Apache moves to the next line.
RewriteCond %{REQUEST_URI} /$ [NC]
Checks the requested URI has a trailing slash to be removed. If it does, Apache moves to the next line.
RewriteCond %{REQUEST_FILENAME} .*\.(png|jpg|gif|jpeg)$ [NC]
Checks that the requested document ends with any of .png, .jpg, .gif, or .jpeg. The flag [NC] tells Apache to ignore character case, i.e. upper and lower case characters are treated equally. If you need to add other file types, place them within the parentheses and separate them with a pipe, |.
RewriteRule ^(.*)/$ $1 [R=301,L]
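The rewrite rule tells Apache to rewrite the whole URL but without the trailing slash. ^ means “the start” of the line, $ means “the end” of the line, (.*)/$ means “capture everything but the trailing slash at the end”, $1 means “return everything caught by the parentheses” (i.e. the URL without the trailing slash). The flags in square brackets finish the job: R=301 sends the visitor a permanent redirect to the rewritten URL and L tells Apache to stop processing any further rewrite rules for this request.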
With this snippet placed toward the top of your .htaccess file, any request for one of the stated image file types will always load that image file whether it has a trailing slash or not, provided it exists.
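To sanity check the logic without touching a live server, the three conditions and the rule can be modelled in a few lines of Python. This is only a rough sketch, not how Apache works internally: the function name redirect_target and the is_directory argument are my own inventions, the function works on the full request path with its leading slash (which is not quite what a RewriteRule sees inside .htaccess), and it assumes that by the time the conditions run, REQUEST_FILENAME is the path without the trailing slash.

```python
import re

# Third RewriteCond: the extension test, with re.IGNORECASE standing in for [NC].
IMAGE_RE = re.compile(r'.*\.(png|jpg|gif|jpeg)$', re.IGNORECASE)

# RewriteRule pattern: capture everything except the final slash.
RULE_RE = re.compile(r'^(.*)/$')


def redirect_target(url_path, is_directory=False):
    """Return the path a 301 redirect would point to, or None if the rule does not apply.

    is_directory stands in for the "!-d" test, which Apache performs
    against the real filesystem.
    """
    if is_directory:                  # RewriteCond %{REQUEST_FILENAME} !-d
        return None
    if not url_path.endswith("/"):    # RewriteCond %{REQUEST_URI} /$
        return None
    # Assumption: the filename Apache tests has no trailing slash.
    filename = url_path[:-1]
    if not IMAGE_RE.search(filename):  # RewriteCond ... \.(png|jpg|gif|jpeg)$ [NC]
        return None
    return RULE_RE.match(url_path).group(1)  # $1 in the RewriteRule


print(redirect_target("/images/logo.png/"))   # /images/logo.png
print(redirect_target("/images/LOGO.JPG/"))   # /images/LOGO.JPG
print(redirect_target("/images/logo.png"))    # None
print(redirect_target("/blog/"))              # None
```

Adding another file type to the model is the same edit as in the .htaccess version: put it inside the parentheses in IMAGE_RE, separated by a pipe.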
Over to You
Do you have any tips or tricks for getting rid of redirections or do you know a way the above snippet can be improved? Comment please :)