Web Server Distribution Performance Script

What does the distribution performance perl script do ?

The script will collect the time in seconds your web server took to distribute an object to a user from the web logs. In essence, we count the number of items successfully served in 0.1 seconds, then in 0.2 seconds and so on. The times are then tallied and graphed on the command line in ASCII format. ASCII output was chosen because it can be seen in a basic xterm over ssh and does not need x-forwarding or graphical libraries.

Web server distribution time is the full transaction time it took from the moment the client requested the data to the time when our server completed the transaction. Note that the remote client can influence this result by the speed of their connection or the speed of their system. For example, sending a 100KB picture to a user over a modem connection is going to take a lot longer to complete then a user with a fast FIBER connection. The same analogy is true for an old Pentium 4 rending a page from an SSL site compared to a faster quad-core gaming machine.

What we are looking for is the average amount of time it takes to serve data to our real world users. And "real world" is the key here. We do not care about synthetic benchmarks or how fast our web developers load the page on the local LAN. It also does not matter that our web server is optimized and running on the latest hardware with a gigabit internet connection if the majority of our users have modems connected to Commodore 64's. Our users connection speeds are just as important to web page distribution as our's is. This concern is not only important so a user has a good experience on the site, but also to rank higher in search engines like Google, Bing or Yahoo. If a user sees your site as slow they may go somewhere else or just avoid you; this is bad for you and it is especially bad if you are a business. Search bots like fast sites too. Google, for example, uses the speed your webserver can serve data in its PageRank calculations. So all things being equal, a site who serves data faster is going to rank higher in the search results.

This perl script was made so we at calomel.org could check the serving speeds our our pages. We are passionate about security, but we also want to make sure people can load our pages quickly. We do this by keeping track of the average amount of time every object on our site is served per 24 hour period.

The script is very flexible and can show you how fast all objects are sent or only HTML pages or how about how fast Googlebot can access your pictures. These are just a few of the options you have.

What does the output of the script look like ?

This is the output of the script without any arguments. It shows the times it took to distribute all files successfully to all client ips. The script is only looking for code 200 responses in the web server logs, The file types are any object which was successfully transferred to the client like html, css, png, jpg or even the favicon.ico file.

Times on the X-axis are in tenths of a second and represent the total time it took to complete the client request. We limited the X-axis to a maximum of 9.0 seconds as this is the point of diminishing returns; if objects on your site take more then just a few seconds to load people will probably just avoid your site. The Y-axis is the amount of objects served within that time and it is dynamically scaled.

In the example, we see below that approximately 826 items were delivered within the 0.3 second mark and 323 were loaded within the 1.0 second mark. This is a good example of what you want your graph to look like. The majority of your traffic should be on the left side of the graph and the values should drop off quickly. You will have some clients who take an especially long time to complete the request, but those requests should be the exception to the rule.

user@machine:  ./calomel_http_log_distribution_performance.pl

   .:.  Calomel Webserver Distribution Performance Statistics

         Log lines: 10000, Search string(s):   
       ________________________________________________________________________________________________
  1042 .|..............................................................................................
       .|..............................................................................................
   970 .|..............................................................................................
       .|..............................................................................................
   898 _|______________________________________________________________________________________________
       .||.............................................................................................
   826 .||.............................................................................................
       .|||............................................................................................
   754 .|||............................................................................................
       _|||____________________________________________________________________________________________
   682 .||||...........................................................................................
       .||||...........................................................................................
   610 .||||...........................................................................................
       .||||...........................................................................................
   539 _||||___________________________________________________________________________________________
       .||||..||.......................................................................................
   467 .||||..||.......................................................................................
       .||||.|||.......................................................................................
   395 .||||||||.......................................................................................
       _|||||||||______________________________________________________________________________________
   323 .|||||||||......................................................................................
       .||||||||||.....................................................................................
   251 .||||||||||.....................................................................................
       .|||||||||||....................................................................................
   179 _||||||||||||___________________________________________________________________________________
       .||||||||||||...................................................................................
   107 .|||||||||||||.|................................................................................
       ||||||||||||||||..|.............................................................................
    35 ||||||||||||||||||||....||......................................................................
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||_|||||||||||||
 Time: |^^^^|^^^^|^^^^|^^^^|^^^^|^^^^|^^^^|^^^^|^^^^|^^^^|^^^^|^^^^|^^^^|^^^^|^^^^|^^^^|^^^^|^^^^|^^^^
       0   0.5  1.0  1.5  2.0  2.5  3.0  3.5  4.0  4.5  4.6  5.0  5.5  6.0  6.5  7.0  7.5  8.0  8.5  9.0

The script's USER defined variables:

Lets go through a few of the user defined variables explaining what they are for and what settings you may want to set them for. Remember that every site is going to be different so you will want to use the DEBUG option and modify them accordingly. Lets look at some options, then the script itself and then an example of how to run the script.

my $log is the full path to the web server log file you want to watch. The format of the log file is standard Apache "common" or "combined". The best part about having you server use a standard logging format is this script can be used with Apache, Nginx, Lighttpd or even thttpd. What the script is looking for is an ip address of the remote client as the first string of the line with a space separating it from the rest of the line. For example, this is Googlebot accessing a site, "66.249.72.6 calomel.org - ... 0.152" with the processing time as the last string on the line.

my $debug_mode is the ability of the script to print out each of the steps to the console. This will allow the script to print out the times of every log line it finds. You really only need this for testing.

Getting the script running

To get the script running you first need to copy and paste the following Perl code below. We are going to call the script calomel_http_log_distribution_performance.pl, but you can call it anything you like.

We have tried to clearly comment every line and logic path in the script. Because of this the script might look at little long. We clearly commented the script so, if you wanted to, you could modify the script to better meet the needs of your environment.

Once you have the script in a file and executable you need to edit the user defined options as stated in the section above. Then, take a look at the section after the code block for a discussion on how to run the script.

#!/usr/bin/perl 

use strict;
use warnings;

## Calomel.org .:. https://calomel.org
##   name     : calomel_http_log_distribution_performance.pl
##   version  : 0.01

##   usage: ./calomel_http_log_distribution_performance.pl log_lines regex1 regex2 regex3

# description: the script will parse the logs and collect the web server
# distribution times at the end of the log line. It will then count the number
# of requests completed per time frame in 0.1 second increments. The last step
# is to display the results in an ASCII graph on the command line. 

## which log file do you want to watch?
  my $log = "/var/log/nginx/access.log";

## user defined number of log lines to tail 
  my $lines = 10000;
  $lines = $ARGV[0] if ((defined$ARGV[0]) && ($ARGV[0] =~ /^[+-]?\d+$/));

## declair the user defined search string #1
  my $search1 = "";
  $search1 = $ARGV[1] if defined($ARGV[1]);

## declair the user defined search string #2
  my $search2 = "";
  $search2 = $ARGV[2] if defined($ARGV[2]);

## declair the user defined search string #3
  my $search3 = "";
  $search3 = $ARGV[3] if defined($ARGV[3]);

## do you want to debug the scripts output ? on=1 and off=0
  my $debug_mode = 0;

## declair some internal variables and the hash of the time values
  my ( $distime );
  my %seconds = ();

## clear the environment and set our path
$ENV{ENV} ="";
$ENV{PATH} = "/bin:/usr/bin:/usr/local/bin";

## open the log file. we are using the system binary tail which is smart enough
## to follow rotating logs. We could have used File::Tail, but tail is easier.
  open(LOG,"tail -$lines $log |") || die "Failed!\n";

  while(<LOG>) {
       ## process log line if it reports a successful file distribution and if it includes
       ## the optional user defined search string.
       if ( ($_ =~ m/( 200 )/) && ($_ =~ m/$search1/) && ($_ =~ m/$search2/) && ($_ =~ m/$search3/) ) {

         ## extract the distribution time and round to tenths of a second. If the last value
         ## of the log line is NOT a number an error will print for every bad line.
          $distime = sprintf("%.1f",(split ' ')[-1]);

         ## initialize the hits value to avoid warning messages.
         $seconds{ $distime }{ 'hits' } = 0 if not defined $seconds{ $distime }{ 'hits' };

         ## increment the hits counter for every time value.
          $seconds{ $distime }{ 'hits' } = $seconds{ $distime }->{ 'hits' } + 1;

         ## DEBUG: show detailed output
         if ( $debug_mode == 1 ) {
           print $_;
           print "search1: $search1 search2: $search2 search3: $search3  dist time: $distime  hits: $seconds{ $distime }{ 'hits' }\n";
         }
       }
  }

## declair some constants for the ASCII graph.
use constant MAX     => 50;
use constant Height  => 30;
use constant Indent  => 7;
use constant Periods => 96;

## Enter the data collected from the logs into a Data array we can graph from
my @Data;
my $Element = 0;
my $TenthSec = 0.0;
while($Element < Periods)
   {
   $Data[$Element] = 0;
   $Data[$Element] = $seconds{ sprintf("%.1f",$TenthSec) }{ 'hits' } if length($seconds{  sprintf("%.1f",$TenthSec) }{ 'hits' } //= '');
   $TenthSec=$TenthSec + 0.1;
   $Element++;;
   }

## Print out the ASCII graph by calling the ASCII_Graph method.
print "\n   .:.  Calomel Webserver Distribution Performance Statistics\n\n";
print "         Log lines: $lines, Search string(s): $search1 $search2 $search3\n";
print(ASCII_Graph(Height, Indent, Periods, @Data));
print "\n\n\n";

## This is the method to make the ASCII graph from the Data array. Special 
## thanks to the Perl Monks site for the examples. This is an array of 96 values
## which represent 0.0 to 9.0 seconds. Though you will have requests which take
## longer then 9 seconds to serve, this should be exception. Google for example
## use page load times in their Google Pagerank, so if your site takes longer then
## a few seconds to load you will not be ranked very high compared to your competition.
sub ASCII_Graph
{
   my ($Height, $Indent, $Periods, @Data) = @_;
   my $HighestValue = 0;
   my @Rows = ();
   for my $Period (0 .. $Periods - 1) { $HighestValue = $HighestValue > $Data[$Period] ? $HighestValue : $Data[$Period]; }
   my $Scale = $HighestValue > $Height ? ( $HighestValue / $Height ) : 1;
   for my $Row (0 .. $Height) {
      if($Row % 2) { $Rows[$Row] = sprintf("%" . ($Indent - 1) ."d ", $Row * $Scale) . ($Row % 5 == 0 ? '_' : '.') x $Periods; }
      else { $Rows[$Row] = sprintf("%" . ($Indent - 1) ."s ", ' ') . ($Row % 5 == 0 ? '_' : '.') x $Periods; }
      for my $Period (0 .. $Periods - 1) {
         if ($Data[$Period] / $Scale > $Row) {
            substr($Rows[$Row], $Period + $Indent, 1) = '|';
            }
         }
      }

   return(join( "\n", reverse( @Rows ), ' Time: ' . '|^^^^' x ( $Periods / 5 ), ' ' x $Indent . '0   0.5  1.0  1.5  2.0  2.5  3.0  3.5  4.0  4.5  4.6  5.0  5.5  6.0  6.5  7.0  7.5  8.0  8.5  9.0'));
}
#### EOF ####

Want more speed? Make sure to also check out the Network Speed and Performance Guide. With a little time and understanding you could easily double your firewalls throughput.

Getting the web server logs to report processing times

Once you have the script in place and executable we need to setup your web server log lines with the request time. The following Nginx log line shows Googlebot accessing our home page. If you look at the very last value on the line you will see the floating point value, "0.152". This is 0.152 seconds or 152ms which is the time it took our web server to serve the page to Googlebot and complete the transaction. Our script is looking for such a number at the end of your log lines. The format of the rest of the log line is not important, just that the time be the last value.

66.249.67.164 calomel.org - [01/Sep/2011:01:01:01 -0400] "GET / HTTP/1.1" 200 2619 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" AES256-SHA 0.152

Nginx has a log line variable called "$request_time" which we can use. The following shows the "$request_time" at the end of the log format declaration in our example nginx.conf file. This is the same log format we use in our Nginx "how to" - Fast and Secure Web Server example.

  log_format  main  '$remote_addr $host $remote_user [$time_local] "$request" $status $body_bytes_sent "$http_referer" "$http_user_agent" $ssl_cipher $request_time';

Apache has a log line variable called "%D" which is the request time is microseconds. Just add this to the end of your LogFormat line in the apache2.conf .

LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\" %D" combined

How to run the script and test the settings:

Once you have the webserver logging in the correct format for a while you can run the script. The script can be run without arguments, like we did in the first example at the top of the page. The perl script will look at a default 10,000 log lines for all code 200 client requests and graph the results.

The perl script will also take some optional arguments. The first argument is the amount of log lines you want to query. The second, third and forth optional arguments are regular expression strings you may want to filter the log lines with. For example, we could look at 40,000 log lines and only ask for those lines serving HTML files to Googlebot. Here is the output example from our logs. Notice the script will report the options you passed to it at the top of the graph. From the results seen here, we sent Google Bot any requested HTML page in an average of 0.3 seconds. This might seem a bit high, but calomel.org is an SSL only site which encrypts all of our pages with at least the Advanced Encryption Standard (AES) at 256 bit. Non-SSL based sites should average at the 0.1 second mark.

user@machine:  ./calomel_http_log_distribution_performance.pl 40000 "html " "Googlebot"

   .:.  Calomel Webserver Distribution Statistics

         Log lines: 40000, Search string(s): html  Googlebot 
       ________________________________________________________________________________________________
    48 ...|............................................................................................
       ...|............................................................................................
    45 ...|............................................................................................
       ...|............................................................................................
    41 ___|____________________________________________________________________________________________
       ...|............................................................................................
    38 ...|............................................................................................
       ...|............................................................................................
    35 ...|............................................................................................
       ___|____________________________________________________________________________________________
    31 ...|............................................................................................
       ...|............................................................................................
    28 ...|............................................................................................
       ...|............................................................................................
    25 ___|____________________________________________________________________________________________
       ...|............................................................................................
    21 ...|............................................................................................
       ...|............................................................................................
    18 ...|............................................................................................
       ___|____________________________________________________________________________________________
    15 ...||...........................................................................................
       ...||...........................................................................................
    11 ...||...........................................................................................
       ...||...........................................................................................
     8 ___||___________________________________________________________________________________________
       ...|||..........................................................................................
     5 ...|||..........................................................................................
       ...|||..........................................................................................
     1 ...|||..........................................................................................
       __|||||_________________________________________________________________________________________
 Time: |^^^^|^^^^|^^^^|^^^^|^^^^|^^^^|^^^^|^^^^|^^^^|^^^^|^^^^|^^^^|^^^^|^^^^|^^^^|^^^^|^^^^|^^^^|^^^^
       0   0.5  1.0  1.5  2.0  2.5  3.0  3.5  4.0  4.5  4.6  5.0  5.5  6.0  6.5  7.0  7.5  8.0  8.5  9.0

So what can I do to get my site to serve data faster ?

Now that you have a tool to figure out how fast you are actually serving data, it is time to optimize. Please take a look at our page titled, Webserver Optimization and Saving Bandwidth which we offer as a general guide. The page covers picture optimization and compression as well as good layout methods.

Can you give some more examples of what we can query ?

Here are some examples of items we like to query the logs for.

Look in the last 5,500 lines for any JPG's Google bot has downloaded. We also specified Google's common ip block of 66.249/16 because many clients falsely claim to be Googlebot to get around security rules.

./calomel_http_log_distribution_performance.pl 5500 ".jpg " "Googlebot" "^66.249"

Query the log for the default 10,000 lines by using empty double quotes. Then search for clients who use Linux.

./calomel_http_log_distribution_performance.pl "" "Linux"

Search the last 100,000 lines for client access times, but only for September 1st and only from Internet Explorer 8.0 clients.

./calomel_http_log_distribution_performance.pl 100000 "01/Sep" "MSIE 8.0"

Show the graph for the last 9,000 log lines looking for clients who accessed our Apple Touch icon.

./calomel_http_log_distribution_performance.pl 9000 "apple-touch-icon"

How fast did clients in the last 900,000 log lines from the ip address set 77.7.70.167/24 access our GIF pictures using the Opera browser. Since we are using regular expressions we prefix the ip address with a carrot "^" to signify the ip is to start at the beginning of the line.

./calomel_http_log_distribution_performance.pl 900000 "^77.7.70.167" ".gif " "Opera"

If you run a https site then look at access speeds depending on encryption type. What is the access speed of the last 10,000 clients connecting to us using the Advanced Encryption Standard 256bit (AES256-SHA) SSL encryption scheme. How about those clients negotiating with our site using Elliptic curve Diffie Hellman and AES 256bit encryption (ECDHE-RSA-AES256-SHA) ?

./calomel_http_log_distribution_performance.pl 10000 "AES256-SHA"
./calomel_http_log_distribution_performance.pl 10000 "ECDHE-RSA-AES256-SHA"

Questions?

How can I compare two or more graphs ?You can roughly compare two, three or more graphs using the command line a simple while loop. We say roughly because the scale on the Y-axis is probably going to be different between the graphs; but you can get the idea.

Lets say we want to flip between a graph of Linux clients and MSIE clients. First we output the graph of both searches, linux and msie, and save them to two different files. For example, we query for the last 10,000 (remember empty quotes for the the default amount of lines) and search for the OS string.

./calomel_http_log_distribution_performance.pl "" "Linux" > linux_output 
./calomel_http_log_distribution_performance.pl "" "MSIE" > msie_output

Now, we are just going to cat both files to the xterm with a one second delay between each file's output. This will give the illusion that we are flipping between the linux_output and msie_output files every second. Just execute the following line on the command line (CLI). When you are ready to stop just hit Ctrl-c to break out of the loop.

while true; do clear;cat linux_output; sleep 1; clear; cat msie_output; sleep 1; done

If you wanted to compare more graphs just output the graph to a file and add the output file to the loop. Nice and easy. Plus you can use this tool over a remote ssh connection since it requires minimal graphics. ASCII graphs really are nice in this regard.

How much RAM and CPU time does the script use ?The script is efficient. Depending on how many log lines you are going through it should take just a second or two to complete the graph. Memory usage should only be a few megabytes.