Scaling WordPress
Blogging, Web Development August 13th, 2007 - 11,072 views
WordPress seems to have a bad reputation when it comes to scalability. Maybe it’s deserved, since a default WordPress installation doesn’t really scale well. But making WordPress scale isn’t hard. I recently hit the Digg home page and got roughly 70,000 pageviews in under 12 hours. Another post hit the home page later the same day, and another 10,000 clickthroughs followed. As a result, I’ve been asked by a few people how I managed to keep my site up under that sort of stress. Honestly, I haven’t done anything that fancy. But for future reference I figured I’d document my configuration, and let people in on one trick that saved my butt.
First, I run WP-Cache. I started using WP-Cache after my first Digg experience and, for me, the performance improvements far outweigh the compatibility issues that may arise. If you’re a real performance junkie you may be interested in a simple hack to enable gzip with WP-Cache.
If you have some content that must be fully dynamic (like the pageview counters that I recently added here under the post title) you can take advantage of the mfunc functionality built into WP-Cache. To include some dynamic code in a cached page, use the following syntax:
<!--mfunc function_name('parameter'); -->
<?php function_name('parameter'); ?>
<!--/mfunc-->
Caching a page, then selectively adding dynamic functionality where it’s necessary will drastically reduce the load on your server.
This site is hosted on a dedicated server, along with another project I’m working on. A second server handles the database back-end for both sites. This setup is capable of handling a tremendous amount of traffic. The bottleneck, surprisingly enough, is the front-end web server (not the database).
In retrospect, it probably would have been wise to upgrade the front-end server to at least 1GB of RAM (it currently has only 512MB). With Apache processes running upwards of 20MB each, RAM limits the number of child processes that can be spawned before the system starts thrashing. And since an Apache process can only handle one connection at a time, the number of Apache child processes places an upper bound on the number of concurrent requests the server can handle. Additional requests are backlogged, and when the backlog builds up, system performance suffers.
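As a rough back-of-the-envelope check of the numbers above, the sketch below estimates how many Apache children fit in RAM. The 128MB set aside for the OS and other daemons is an assumed figure, not something measured on this server, and the function name is made up for illustration.

```python
def max_apache_children(total_ram_mb, per_child_mb, reserved_mb=128):
    """Estimate how many Apache child processes fit in RAM without swapping.

    reserved_mb is an assumed allowance for the OS and other daemons.
    """
    if per_child_mb <= 0:
        raise ValueError("per_child_mb must be positive")
    usable_mb = max(total_ram_mb - reserved_mb, 0)
    return usable_mb // per_child_mb


# Compare the current 512MB box with a 1GB upgrade at 20MB per child.
for ram_mb in (512, 1024):
    children = max_apache_children(ram_mb, per_child_mb=20)
    print(f"{ram_mb}MB RAM: about {children} concurrent requests")
```

Under those assumptions this works out to roughly 19 children at 512MB and 44 at 1GB, which is the sort of figure you would compare against Apache's MaxClients setting.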
I’ve been experimenting with using mod_backhand in combination with Amazon EC2 to offload some traffic from my primary web server to an elastic compute cloud during periods of heavy traffic. My success with Amazon S3 has fueled my interest in EC2, but I’m still skeptical of EC2’s ability to scale rapidly enough to handle traffic spikes (it takes a minute or two to provision an EC2 server; in that time, a front page story on Digg could send upwards of 1,000 page requests your way).
Finally, my little trick. If you’re not running WP-Cache, and one of your posts is about to go viral, try caching the post manually. It’s a simple process:
http://immike.net/blog/08/13/post-title would be something like /var/www/blog/08/13/post-title). Name the static copy of the post index.html and store it in your newly created directory.
If you’re using the standard WordPress rewrite rules (the ones WordPress auto-generates), static HTML files will override dynamically generated content. Thus, as soon as the static copy of your post is in place, Apache will start serving it instead of passing requests off to WordPress. Even on modest hardware Apache can handle hundreds of requests for static content per second, so this trick should keep your server up through the storm. Once things calm down, simply remove the files/directories and WordPress will take over once again.
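The manual caching trick can be sketched in a few lines of Python. This is only an illustration: it assumes the document root is something like /var/www, that you have already saved the rendered HTML of the post, and the function names are hypothetical.

```python
from pathlib import Path
from urllib.parse import urlparse


def static_copy_path(permalink, doc_root):
    """Map a post permalink to the index.html path Apache would serve."""
    url_path = urlparse(permalink).path.strip("/")
    return Path(doc_root) / url_path / "index.html"


def cache_post_manually(permalink, html, doc_root):
    """Write a static copy of an already-rendered post into the doc root."""
    target = static_copy_path(permalink, doc_root)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(html, encoding="utf-8")
    return target


def uncache_post(permalink, doc_root):
    """Remove the static copy so WordPress serves the post again."""
    target = static_copy_path(permalink, doc_root)
    target.unlink()
    # rmdir() raises if the directory still contains other files.
    target.parent.rmdir()
```

For the example permalink above, static_copy_path("http://immike.net/blog/08/13/post-title", "/var/www") points at /var/www/blog/08/13/post-title/index.html. Only the innermost directory is removed on cleanup; any parent directories created along the way are left in place.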

August 13th, 2007 at 10:18 pm
I like the last trick. :-)
Dzamir Says:August 14th, 2007 at 5:10 am
The last trick is very tricky! :P
Richard Crowley Says:August 14th, 2007 at 3:59 pm
So you create a static copy of the post, that’s nice and fast. But when people comment, how do you regenerate the HTML to reflect this? I wrote a blog engine long ago that generated static files every time a post changed (for example, when a comment was added). However I’d imagine that at load this would be nearly as expensive as a database call to get the post.
mike Says:August 14th, 2007 at 4:19 pm
That’s basically what WP-Cache does. It creates a static copy of the page (and some metadata) for each “logged in” user of the system. When a comment is posted it automatically invalidates the cache.
The problem is, under heavy load the server can become unresponsive when the cache is invalidated. During one period of heavy load my server was handling around 30 requests per second. Page generation took around one second under normal load. So when the cache was invalidated 30 requests came in before the cache was restored. This, of course, slowed down page generation so it took longer than one second. It took around 10 minutes before the server could catch up.
Another approach is to generate a new copy of the page when a comment is submitted/approved. The atomicity of the OS level copy command could be relied on to put the updated copy in place without ever having to dynamically generate a page. The problem with this approach is that WP serves slightly dynamic pages to users who have commented on the blog before (or anyone else who is logged in, for that matter). So serving a single static copy of the page screws things up, and creating a cached copy for every person who has commented on the site is not feasible.
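One way to get the atomic swap described above is to write the new page to a temporary file and rename it over the old one. Strictly speaking it is the rename, not a copy, that is atomic on POSIX filesystems, and only when both files live on the same filesystem. A minimal sketch, with a hypothetical function name:

```python
import os
import tempfile


def publish_atomically(html, target_path):
    """Replace target_path with new content so readers never see a partial file."""
    directory = os.path.dirname(os.path.abspath(target_path))
    # Create the temp file next to the target so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            tmp.write(html)
        # mkstemp creates files readable only by the owner; open them up
        # so the web server user can read the page (assumed to be desired).
        os.chmod(tmp_path, 0o644)
        os.replace(tmp_path, target_path)  # atomic rename on POSIX
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

This does nothing about the per-user variations mentioned above; it only covers swapping a single static copy into place.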
The approach WP-Cache takes is really ideal because it takes into account temporal locality. Some sort of locking mechanism that backlogs requests while a new page is being generated might help. What’s really needed, though, is more aggressive caching integrated with WordPress at a lower level.
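To make the locking idea concrete, here is a small in-process sketch in which only one caller regenerates a missing page while the rest wait and then reuse the result. It is not how WP-Cache works; the class name is made up, and a real setup with many Apache processes would need a lock shared across processes (a file lock, for instance) rather than a thread lock.

```python
import threading


class SingleFlightCache:
    """Cache where only one caller regenerates a missing page at a time."""

    def __init__(self, generate):
        self._generate = generate      # function: key -> rendered page
        self._pages = {}
        self._lock = threading.Lock()

    def get(self, key):
        page = self._pages.get(key)
        if page is not None:
            return page
        with self._lock:
            # Re-check: another thread may have filled the cache while we waited.
            page = self._pages.get(key)
            if page is None:
                page = self._generate(key)
                self._pages[key] = page
            return page

    def invalidate(self, key):
        with self._lock:
            self._pages.pop(key, None)
```

With 30 requests arriving during a one-second regeneration, this would generate the page once instead of 30 times. A single global lock keeps the sketch short, at the cost of serializing regeneration of unrelated pages.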
Anyways… to answer your first question, re: comments w/ the manual caching approach: it’s a problem. Comments will be stored because they’re submitted to a special page, but users won’t receive any notification that their comment was received/needs to be moderated. Regenerating the page with new comments is a pain in the butt: you have to remove the temporary static file and allow WordPress to generate a new one. The simple answer is that this is a hack. It’s suitable for an emergency, but I wouldn’t recommend relying on it under normal circumstances.
Paul Stamatiou Says:August 14th, 2007 at 4:38 pm
I edited my WP-Cache to gzip my files as well. It will put some extra strain on page generation but I think the faster loading times and less bandwidth use make up for it.
Here’s what you’d need to do:
Turn off gzip in the Options.
Edit wp-cache-phase1.php and add this line:
if ( extension_loaded('zlib') ) ob_start('ob_gzhandler');
before this line:
foreach ($meta->headers as $header) {
header($header);
Then edit /wp-content/advanced-cache.php to add:
if ( extension_loaded('zlib') ) ob_start('ob_gzhandler');
Before this line:
foreach ($meta->headers as $header) {
Also, probably not a good idea to copy and paste from this comment because of little issues that could have been avoided if I took the time to use html entities
Paul Stamatiou Says:August 14th, 2007 at 4:39 pm
ah I’m a goober, you already mentioned this above.
mike Says:August 14th, 2007 at 4:44 pm
Yea b/c I knew that if I didn’t you would =p. It’s a nice little hack though, should be an option for WP-Cache.
Ronald Heft Says:August 14th, 2007 at 8:55 pm
Paul (and others interested in gzip), instead of doing it that way, I would recommend following this guide for caching gzip pages.
What you’re currently doing is gzipping the contents of the page on each load. This takes away the advantage of using a cache. The method I linked to caches the gzip output (as well as a non-gzip version in case someone doesn’t support gzipping), speeding up page loads even more.
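The difference can be illustrated with a short sketch: compress once when the page is cached, keep both variants, and pick one per request. This is not the code from the linked guide or from WP-Cache; the function names are hypothetical and the Accept-Encoding parsing is deliberately simplified (it ignores q-values).

```python
import gzip


def build_cache_entry(html):
    """Compress a rendered page once, at cache time, and keep both variants."""
    raw = html.encode("utf-8")
    return {"identity": raw, "gzip": gzip.compress(raw)}


def choose_body(entry, accept_encoding):
    """Pick the stored variant based on the request's Accept-Encoding header."""
    encodings = [part.split(";")[0].strip().lower()
                 for part in accept_encoding.split(",")]
    # Vary tells downstream caches that the response depends on this header.
    headers = {"Vary": "Accept-Encoding"}
    if "gzip" in encodings:
        headers["Content-Encoding"] = "gzip"
        return entry["gzip"], headers
    return entry["identity"], headers
```

The compression cost is paid once per cache fill rather than once per request, which is the point being made above.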
links for 2007-08-17 « Mogore, une femme en or Says:August 17th, 2007 at 11:17 am
[…] Scaling WordPress I’m Mike (tags: wordpress programming php howto billet aout2007 lang:en) […]
Arie Says:August 18th, 2007 at 4:35 am
How about installing Squid in front of Apache?
I’ve heard that Squid can be used as a caching accelerator, but I haven’t seen it in action.
mike Says:August 18th, 2007 at 1:14 pm
Squid is an awesome reverse proxy (and normal proxy), and it works well for a lot of applications. Haven’t tried it with WordPress though b/c it would break a lot of functionality that I want. Some parts of my blog are fully dynamic, and Squid would break that stuff.
hvm Says:August 20th, 2007 at 6:01 am
agree, wp reputation on scaling is very bad. and the last trick sort of cute :Þ
Callum Jones Says:August 28th, 2007 at 6:08 am
For the last trick it would be easier to edit the .htaccess to run a ModRewrite trick
PG Says:November 28th, 2007 at 1:01 am
Those of you who manage your own servers would do well to consider eAccelerator (eaccelerator.net - one “L”). We use it in-house for our critical projects, including a very busy independent music torrent tracker. It caches all php scripts in a configurable amount of RAM and it seriously speeds up PHP/MySQL sites and enables the server to handle huge loads.
The nice thing about it is it will cache ALL PHP scripts from ALL applications on your server, and it does so at a higher level than if you are using something like wp-cache, though it is fully compatible with application caching such as used by Joomla or WP. eAccelerator also will dump unused or low-traffic scripts out of RAM to make room for new requests.
I’ve used it quite successfully on a server that runs high-traffic phpbb’s, Joomla’s and WP’s concurrently, along with fairly heavy MP3 downloads. It’s a dual Xeon with 4GB RAM running CentOS 5, so it can handle quite a load.
I think some folks have had bad luck with it due to not understanding how to install or configure it… a little t&e/r&d goes a long way…