Enabling Gzip compression on your website

on 19th November
  • gzip
  • http compression
  • gunzip
  • .htaccess
  • compression
  • perl
First of all, what is Gzip? Gzip is a compression utility that has gained in popularity in recent years for a few reasons: notably, it's open source software and the algorithms it uses are free from patents. This might not appear to be a big deal, but if you remember the licence fees that were demanded for the patented LZW algorithm (used by the older Unix compress utility and the GIF image format), you'll probably appreciate it that bit more.

Gzip has increasingly become a standard employed by Linux/Apache system administrators to compress their static content and reduce the number of bytes that they need to transfer on each HTTP request. What does this mean for the end-user? Well, primarily it means that your site loads much faster for your customers. For text-heavy files, Gzip HTTP compression can result in file size reductions of up to 90%.

This is especially important if you're targeting customers in rural locations where broadband connections aren't as prevalent. We have one particular client who operates in the West of Ireland; after implementing gzip compression, they increased their 'time on site' metric by 60%, a clear sign that customers are finding their website easier to use. We attribute this to snappier loading times.

How does this all work? Well, a customer visits a website using a supported browser (any of the last few incarnations of the most popular browsers, apart from Safari, which has only implemented it in its most recent release). When they request a page on your website, their browser sends an Accept-Encoding: gzip header along with the request. This tells our server that the browser can handle gzipped content, so we can feel free to fire some down the line.

Our server then sends back the gzipped content (in our setup, the contents of filename.ext.gz) and tells the client browser that the response is compressed by including a Content-Encoding: gzip header. The compressed file is significantly smaller than the original, so it arrives in a fraction of the time that the original would have taken. The client decompresses it back to its original form and displays or loads it.
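To make the exchange concrete, here is a minimal sketch of both sides of the conversation in Python. It is only an illustration of the header handshake, not how Apache or any browser implements it; the function names (accepts_gzip, respond, read_body) are made up for this example, and the header parsing is deliberately simplified (it ignores quality values such as gzip;q=0).

```python
import gzip


def accepts_gzip(accept_encoding):
    """Return True if the Accept-Encoding header value lists gzip.

    Simplified: quality values (e.g. "gzip;q=0") are ignored.
    """
    codings = [token.split(";")[0].strip().lower()
               for token in accept_encoding.split(",")]
    return "gzip" in codings


def respond(request_headers, body):
    """Server side: return (response_headers, payload) for a request."""
    if accepts_gzip(request_headers.get("Accept-Encoding", "")):
        return {"Content-Encoding": "gzip"}, gzip.compress(body)
    return {}, body


def read_body(response_headers, payload):
    """Client side: undo whatever encoding the server declared."""
    if response_headers.get("Content-Encoding") == "gzip":
        return gzip.decompress(payload)
    return payload


# A made-up stylesheet standing in for a real static file.
css = b"body { margin: 0; padding: 0; }\n" * 200
headers, payload = respond({"Accept-Encoding": "gzip, deflate"}, css)
print(headers, len(css), "bytes before,", len(payload), "bytes on the wire")
```

A client that sends no Accept-Encoding header gets the original bytes back with no Content-Encoding header, which is the fallback behaviour we want for browsers that can't handle gzip.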

What sort of content is suitable for gzipping? We recommend that clients don't compress items on the fly (as this takes up valuable server time on every request) but instead compress suitable files periodically, ideally whenever they're updated by developers or designers in the company. It's also important to note that some file formats are already compressed. Image formats are a good example of this. A .jpg or .jpeg file is already compressed quite well, so gzipping it again saves very little and isn't worth the work involved. Because of this, we recommend compressing files that don't change very often and which have lots of whitespace and repetition in them. CSS files and JavaScript files are our prime candidates.
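The difference between text and already-compressed data is easy to see for yourself. The sketch below is an assumption-laden illustration: the "CSS" is a single made-up rule repeated many times (far more repetitive than a real stylesheet, so the saving is exaggerated), and random bytes stand in for JPEG data because, like a compressed image, they contain no patterns left for gzip to exploit.

```python
import gzip
import os


def savings(data):
    """Percentage by which gzip shrinks data (negative if it grows)."""
    return 100.0 * (1 - len(gzip.compress(data)) / len(data))


# Made-up, highly repetitive stylesheet content.
css_like = b".menu li a:hover { color: #336699; text-decoration: underline; }\n" * 500

# Random bytes as a stand-in for an already-compressed image.
jpeg_like = os.urandom(len(css_like))

print(f"CSS-like text:  {savings(css_like):.1f}% smaller")
print(f"JPEG-like data: {savings(jpeg_like):.1f}% smaller")
```

Running this against a few of your own files is a quick way to decide which ones are worth compressing.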

Ok, let's get into it: how do I enable HTTP compression on my website? We have a two-step process:

1. Shell script performs the compression
2. Apache .htaccess file serves compressed content if available

Let's have a look at our shell script, which is actually written in Perl. In our script, we have a few hard-coded directory paths which point to our static gzippable content; in practice these are the CSS and JS folders on our server. Here's the code, which we save to a file called compress_static.pl.


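#!/usr/bin/perl
use strict;
use warnings;

# Recursively gzip every .js and .css file under $dir,
# writing the result alongside the original as <name>.gz.
sub compress
{
    my ($dir) = @_;

    # A lexical handle, so recursive calls don't clobber each other.
    opendir(my $dh, $dir) or die "Cannot open $dir: $!";
    my @entries = readdir($dh);
    closedir($dh);

    foreach my $entry (@entries)
    {
        next if $entry eq '.' || $entry eq '..';
        my $file = "$dir/$entry";
        if (-d $file)
        {
            compress($file);
        }
        elsif ($file =~ /\.(js|css)$/)
        {
            # quotemeta guards against spaces and shell metacharacters.
            my $quoted = quotemeta($file);
            system("gzip -c $quoted > $quoted.gz") == 0
                or warn "gzip failed for $file\n";
        }
    }
}

compress('/path/to/my/css/files');
compress('/path/to/my/js/files');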


The code is quite straightforward: it whips through the directories that are specified at the end of the file (and their subdirectories) looking for .js or .css files and, when it finds one, it gzips it up and writes the output to filename.gz alongside the original.

After this script is executed with the command perl ./compress_static.pl, check your CSS and JavaScript directories and you'll see that you now have one .gz file for every .js or .css file in the directory. This is your compressed content and it's now ready to be served to your website visitors.

In your .htaccess file, add in the following lines:


AddEncoding gzip .gz
<FilesMatch .*\.js\.gz$>
ForceType text/javascript
</FilesMatch>
<FilesMatch .*\.css\.gz$>
ForceType text/css
</FilesMatch>
# The rewrite rules need mod_rewrite switched on
# (harmless if it is already enabled elsewhere in this file)
RewriteEngine On
RewriteCond %{HTTP:Accept-Encoding} (gzip.*)
RewriteCond %{HTTP_USER_AGENT} !Safari
RewriteCond %{REQUEST_FILENAME} !^.+\.gz$
RewriteCond %{REQUEST_FILENAME}.gz -f
RewriteRule ^(.+) $1.gz


We'll just run through these .htaccess directives quickly in case they conflict with anything else you have in there. The AddEncoding line says, ok, recognise files with a .gz extension as gzip compressed files. The two FilesMatch blocks just indicate to the client that a .js.gz is ultimately a JavaScript file and a .css.gz is ultimately a CSS file.

Then we have 4 conditions which must be satisfied for us to serve up the gzipped content. The first is that the client must accept gzipping (as discussed earlier). The second is that we'll ignore Safari users. It's a cold cold world out there and until Safari reliably implements gzip support, we're not going to take the time to make a special case for it.

The next line, RewriteCond %{REQUEST_FILENAME} !^.+\.gz$, says: don't rewrite the request if it is for a file which already has a .gz extension, i.e. don't get stuck requesting myfile.gz.gz.gz.gz etc. The final condition line just confirms that we actually have some gzipped content to serve up. If all of those conditions are true, the RewriteRule rewrites the request to the gzipped content and hey presto, we're done. You are now serving gzipped content to your visitors.
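If you find the rewrite syntax hard to read, the same decision can be written out as an ordinary function. This is only a sketch of the logic of the four conditions and the rule, not of how mod_rewrite works internally; the function name gzipped_target and the example paths are invented for illustration.

```python
import os


def gzipped_target(accept_encoding, user_agent, filename, exists=os.path.isfile):
    """Return the .gz path to serve, or None to serve the file unchanged.

    `exists` is injectable so the logic can be tried without real files.
    """
    if "gzip" not in accept_encoding:   # condition 1: client accepts gzip
        return None
    if "Safari" in user_agent:          # condition 2: skip Safari
        return None
    if filename.endswith(".gz"):        # condition 3: not already a .gz request
        return None
    if not exists(filename + ".gz"):    # condition 4: compressed copy exists
        return None
    return filename + ".gz"             # the rule: append .gz


# Pretend only this one compressed file is on disk (hypothetical path).
on_disk = lambda path: path == "/var/www/js/site.js.gz"
print(gzipped_target("gzip, deflate", "Mozilla/5.0 Firefox/3.0",
                     "/var/www/js/site.js", on_disk))
```

One thing this makes obvious: the Safari check is a plain substring match, and Chrome's user agent string also contains the word Safari, so Chrome users will receive uncompressed content as well.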

Even though everything has presumably gone smoothly, it's best to check that your compression is actually working. Here's a handy tool which can test HTTP compression. Just enter the URL of one of your normal JavaScript or CSS files and it will tell you whether it's compressed or not. Note that you don't need to request the *.js.gz file directly; you can just request *.js and the *.js.gz content will be served automatically.
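You can also run the same check yourself with a few lines of Python. The sketch below requests a URL the way a gzip-capable browser would and reports whether the server answered with Content-Encoding: gzip. The function name check_compression is our own, and the urlopen parameter exists only so the function can be exercised without a network connection.

```python
import urllib.request


def check_compression(url, urlopen=urllib.request.urlopen):
    """Return (is_gzipped, bytes_on_the_wire) for the given URL."""
    request = urllib.request.Request(url, headers={"Accept-Encoding": "gzip"})
    with urlopen(request, timeout=10) as response:
        encoding = response.headers.get("Content-Encoding", "")
        # urllib does not decompress responses, so this is the transfer size.
        body = response.read()
    return encoding.lower() == "gzip", len(body)
```

For example, check_compression("http://www.example.com/css/style.css") (a placeholder URL; substitute one of your own files) should return True as its first value once the .htaccess rules are in place.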

That's it, and remember: the search engines will love you for speeding up the internet! You must also remember to re-run compress_static.pl every time your static content changes. It's easy enough to set up a script which triggers this automatically, or you can just run the Perl script manually.
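Forgetting to re-run the script means visitors keep getting an old compressed copy, so it's worth having a way to spot that. Here is one possible approach, sketched in Python: walk a directory and list every .js or .css file whose .gz copy is missing or older than the source. The function name stale_files is hypothetical, and comparing modification times is an assumption about how you want to detect changes.

```python
import os


def stale_files(root, extensions=(".js", ".css")):
    """List static files whose .gz copy is missing or older than the source."""
    stale = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(extensions):
                continue
            source = os.path.join(dirpath, name)
            compressed = source + ".gz"
            if (not os.path.exists(compressed)
                    or os.path.getmtime(compressed) < os.path.getmtime(source)):
                stale.append(source)
    return sorted(stale)
```

If stale_files('/path/to/my/css/files') returns anything, it's time to run compress_static.pl again.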

As with all articles on Celtic Productions, this article is protected by international copyright laws. It may be linked to (we are of course most grateful of links to our articles), however, it may never be reproduced without the prior express permission of its owners, Celtic Productions.