I was talking to my friend Steve at Carrot-Top Industries this afternoon about rewriting URLs because of a hosting move. He is a savvy web marketer but was unfamiliar with the power of using an .htaccess file to fix 404 errors.
In layman’s terms, the .htaccess file allows you to override the server-wide Apache directives for just your stuff.
In Steve’s case, he wanted to do some simple redirections from old URLs that no longer exist to existing pages. Here’s how to do that.
- Create a new file in your root directory named .htaccess (yes, the file name starts with a dot).
- Make sure the rewrite rules are wrapped within a <IfModule mod_rewrite.c></IfModule> tag. That way, if mod_rewrite isn’t loaded on the server, Apache skips the rules instead of throwing a 500 error, and the other directives in the file keep working.
- Understand that the file is read from top to bottom, so putting things in the wrong order can make very strange things happen.
- Understand the flags you can use to impact processing, like [L] for last and [R=301] for a permanent redirect.
- Try learning regular expressions if you dare, since they make redirecting via patterns much easier.
- Save the file and test every change right away. As soon as the file is saved, the redirections are live. There is no lag.
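As a quick sketch of the ordering and flag points above, here is a minimal file with two rules. The paths (/old-section/, /new-section/, special.html, landing.php) are made up for illustration, so substitute your own.

```apache
<IfModule mod_rewrite.c>
RewriteEngine On

# Specific rule first. [R=301] sends a permanent redirect and
# [L] stops processing further rewrite rules for this request.
RewriteRule ^old-section/special\.html$ /landing.php [R=301,L]

# Broader pattern second: everything else under /old-section/
# goes to the same path under /new-section/.
RewriteRule ^old-section/(.*)$ /new-section/$1 [R=301,L]
</IfModule>
```

If the two rules were swapped, the broad pattern would match /old-section/special.html first and the specific redirect would never fire. That is the kind of "very strange thing" that ordering can cause.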
Let’s look at some examples to make this easier.
Say you want to move the “About Us” page from /about.html to /about-us.php without breaking links that may be pointing at the old URL.
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteRule ^about\.html$ /about-us.php [R=301,L]
</IfModule>
The redirect above is about as simple as it gets. The [R=301] flag indicates that it’s a permanent redirect, and thus Google will continue to flow PageRank to the “new” page. With a plain [R] flag and no status code, requests that come in get a 302 (temporary) redirect, which means Google still thinks the “old” page should be getting any PageRank.
You can also use the following syntax:
<IfModule mod_rewrite.c>
RewriteEngine On
Redirect permanent /about.html /about-us.php
</IfModule>
Both versions do the same thing. The second one uses the Redirect directive, which comes from mod_alias rather than mod_rewrite, and it just seems a little easier to read.
Here’s one that I had to use recently when I stripped the dates out of the URLs on this blog. Although I am doing this one using the Redirection Plugin for WordPress, I’ll still explain it as if it were in the .htaccess file.
<IfModule mod_rewrite.c>
RewriteEngine On
RedirectMatch 301 ^/(\d+)/(\d+)/(.*)$ /$3
</IfModule>
The one above only does its redirect if the request matches the first part. That first part is a regular expression looking for the following pattern: a slash, a run of digits, another slash, another run of digits, another slash, and finally anything. In this case it was /year/month/title-of-post and I needed to strip out the date part.
Using regular expressions means I can pick and choose what I want. The redirect above uses three pairs of parentheses, and the /$3 at the end says: take whatever gets caught in the third pair and use that as the new URL.
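To see how the captured groups map onto a real request, here is a version of the pattern with the \d digit escapes and the ^ and $ anchors written out, traced by hand in the comments. The sample URL /2010/05/title-of-post is made up for illustration.

```apache
# Request:  /2010/05/title-of-post
#   $1 = 2010            (first group, the year)
#   $2 = 05              (second group, the month)
#   $3 = title-of-post   (third group, everything after the date)
# Result:   301 redirect to /title-of-post
RedirectMatch 301 ^/(\d+)/(\d+)/(.*)$ /$3
```

Keep in mind that this pattern catches any URL that starts with two all-digit segments, so it only suits a site where nothing else lives under paths like that.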
There’s a lot more you can do with .htaccess, and more complicated rewrites, but these two will handle the majority of the rewrites you’ll need.
I’ll stress 3 things: “test, test, test”. It’s easy to break stuff and completely bring down your site, so always test every new rewrite. If you break it, just go back to the previous version and replace it. You did back up the file, right?
One final word about .htaccess if you’re using WordPress. WordPress writes its own rules into the .htaccess file. Don’t write inside the WordPress-specific rewrites, because your changes will get overwritten. Here’s what mine looks like:
# BEGIN WordPress
RewriteEngine On
RewriteBase /
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
# END WordPress
RewriteEngine On
more yada yada yada
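WordPress looks for the # BEGIN WordPress and # END WordPress markers and only rewrites what sits between them, so keep your own stuff outside of them. Mine sits under theirs. One caveat: a RewriteRule redirect for a URL that no longer exists may need to go above the WordPress block instead, because the WordPress catch-all rule ends in [L] and grabs those requests first.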
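If a rewrite placed under the WordPress block never seems to fire, one thing to try is moving it above the block, since the WordPress rule sends every request for a missing file to /index.php and stops there. A rough sketch of that layout, reusing the /about.html example from earlier:

```apache
# Custom redirects: kept outside the WordPress markers so
# WordPress does not overwrite them.
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteRule ^about\.html$ /about-us.php [R=301,L]
</IfModule>

# BEGIN WordPress
# (WordPress-managed rules live here; leave them alone)
# END WordPress
```

Either way, test the old URL right after saving to confirm the redirect actually happens.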
I’ve been looking for a document like this one.
I started to fix my 404 URLs with Apache and the Webmaster Tools options.
Thank you sir 🙂
Thanks for sharing this valuable information
I’m trying to get rid of several pesky 404 errors that keep appearing in Screaming Frog (it’s an SEO spider tool). These are all nonsense links for some reason, and I just don’t see how to tell whatever is generating these links to destroy or ignore them.
Would it be possible to do so in a similar way to a 301 redirect as you’ve mentioned above, by writing to the .htaccess file?