Benchmarking HTTP performance

Deploying Rails applications is a subject that tends to provoke heated discussions and many misunderstandings. That’s why I decided to try different deployment strategies and check for myself how they perform.

To compare configurations in any reasonable way, it is crucial to measure their performance. The most common metric is the number of requests processed per second (RPS). This metric, and many others, can be measured with HTTP benchmarking tools such as ab and httperf.

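The first tool, ab, comes bundled with Apache and is very easy to use, so it is a good place to start. You specify the total number of requests to perform (-n) and the number of concurrent requests (-c). If you like, you can also cap the total time spent benchmarking, in seconds (-t). Note that -t limits the whole run, not the wait for a single response; recent versions of ab have a separate -s option for the socket timeout, which is the one to use if you want to model the fact that real users won’t wait more than a few seconds for a page to load.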

For example, to issue 1000 requests with a concurrency of 100 you might run the following (the URL must include a path, so don’t forget the trailing slash):

% ab -n 1000 -c 100 http://www.example.com/

httperf is a slightly more complex tool with more features. The most important are the ability to issue multiple requests per connection (the --num-calls command line option) and support for replaying sessions that imitate real use cases. The tool is also believed to be more robust and to give more reliable results. Basic usage might look like this:

% httperf --server www.example.com --num-conns 1000 \
          --num-calls 10 --rate 10

This will open 1000 connections at a rate of ten new connections per second (and no more), sending ten requests over each connection before it is closed, so the total number of requests is 10000. Keep the distinction between connections and requests in mind, otherwise the results are easy to misread. Another tricky part is the actual meaning of the --rate option. The rate is not the number of simultaneous connections at a given time (like concurrency in ab), but the number of new connections opened per second. This means the measured RPS cannot exceed the rate multiplied by the number of requests per connection: with the settings above, at most 10 × 10 = 100 RPS. So httperf has to be run several times with increasing rates to find the saturation point of the server.[1]
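The arithmetic above, and the idea of repeating the run at increasing rates, can be sketched in a few lines of Python. This is only an illustration: the function names and the list of rates are made up for the example, the script builds httperf command lines without running anything, and it uses the full option spellings --num-conns and --num-calls.

```python
def total_requests(num_conns, num_calls):
    """Requests sent in one run: connections times requests per connection."""
    return num_conns * num_calls


def rps_ceiling(rate, num_calls):
    """Highest RPS a run can report: new connections per second times
    requests per connection."""
    return rate * num_calls


def sweep_commands(server, rates, num_conns=1000, num_calls=10):
    """Build one httperf argument list per connection rate (nothing is run)."""
    return [
        ["httperf", "--server", server,
         "--num-conns", str(num_conns),
         "--num-calls", str(num_calls),
         "--rate", str(rate)]
        for rate in rates
    ]


if __name__ == "__main__":
    print(total_requests(1000, 10))  # requests in the example run
    print(rps_ceiling(10, 10))       # RPS ceiling of the example run
    # The rates below are arbitrary; pick values that bracket your server.
    for argv in sweep_commands("www.example.com", [10, 20, 40, 80]):
        print(" ".join(argv))
```

Each argument list could be handed to subprocess.run to execute the sweep for real. The saturation point is roughly where the reported RPS stops following the ceiling computed for that rate.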

When benchmarking HTTP performance, don’t just accept the first results blindly. Think for a minute about what you are actually measuring. Check the status of the replies: if most requests fail, something is wrong; if you are getting 3xx redirects, you should probably test the URL the redirects point to instead. If many requests time out, the concurrency you requested might be too high.
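A check like this is easy to automate. The sketch below parses the summary that ab prints at the end of a run and flags suspicious results. It assumes the summary contains lines such as "Complete requests:", "Failed requests:", "Non-2xx responses:" and "Requests per second:"; the sample text is invented for illustration and is not a real measurement, and the 1% tolerance is an arbitrary threshold.

```python
import re

# Invented sample in the assumed layout of ab's summary, NOT real results.
SAMPLE = """\
Complete requests:      1000
Failed requests:        12
Non-2xx responses:      950
Requests per second:    812.40 [#/sec] (mean)
"""

FIELDS = {
    "complete": r"^Complete requests:\s+(\d+)",
    "failed": r"^Failed requests:\s+(\d+)",
    "non_2xx": r"^Non-2xx responses:\s+(\d+)",
    "rps": r"^Requests per second:\s+([\d.]+)",
}


def parse_ab_summary(text):
    """Extract the counters of interest; a missing line is treated as 0."""
    summary = {}
    for name, pattern in FIELDS.items():
        match = re.search(pattern, text, re.MULTILINE)
        summary[name] = float(match.group(1)) if match else 0.0
    return summary


def suspicious(summary, tolerance=0.01):
    """Return human-readable warnings; tolerance is a hypothetical threshold."""
    complete = summary["complete"]
    if complete == 0:
        return ["no completed requests"]
    notes = []
    if summary["failed"] / complete > tolerance:
        notes.append("many failed requests")
    if summary["non_2xx"] / complete > tolerance:
        notes.append("many non-2xx replies (redirects or errors)")
    return notes


if __name__ == "__main__":
    result = parse_ab_summary(SAMPLE)
    print(result["rps"], suspicious(result))
```

In practice the text would come from the captured output of an ab run rather than from a constant. An RPS figure that arrives together with any warning is probably not measuring what you intended.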

Never run such tests from a desktop machine far away from the server. In a perfect world you would run the benchmark from an independent machine in the same network segment as the server, and make sure the network is not saturated during the test. If you have to run the tests on the same machine as the server, remember that the load caused by the test itself can skew the results (in my experience ab causes considerably less load than httperf).

Finally, consider what the URL you provided points to. If it is a static page or file, you can easily achieve thousands of RPS, as performance is bounded mostly by disk operations. On the other hand, if you measure a dynamic page that runs multiple SQL queries you might get very low results, as the database will be the bottleneck. Many recommend benchmarking a simple dynamic “hello world” application that doesn’t communicate with the database. But if you want to measure the performance of the application rather than the web server, you can measure and compare different URLs.

In my benchmarks I found that three Mongrel instances load-balanced by Pound are about 10-20% slower than three static[2] FastCGI processes running under a vanilla Apache installation. This is probably because the front-end server communicates with the Mongrels over TCP connections, which are considerably slower than the UNIX sockets used by FastCGI. On the other hand, this architecture makes scaling Mongrels easier, because one load balancer can proxy requests to multiple machines.

It looks like there are reasonable arguments for both strategies, and I find it a bit surprising that the whole Rails community is voting against FastCGI, calling it a legacy solution. It’s true that FastCGI can be tricky to set up correctly, but at the end of the day it performs better, and there are other benchmarks showing similar results (as shown on this chart).

[1] More information on good HTTP benchmarking practices and the usage of httperf can be found in the Linux HTTP Benchmarking HOWTO.

[2] Never use dynamic FastCGI processes in production. Dynamic processes are killed when unused, and due to timing issues users can get internal server errors. Moreover, every request assigned to a fresh process is delayed, as it has to wait for the new process to boot.
