Caching Makes Your Brain Explode

Posted by Craig Ambrose on November 13, 2007 at 04:20 AM

I’ve been spending a lot of time recently trying to make boxedup.com scale. Before I started, I’d watched the right screencasts, read the right books, and I thought I knew what had to be done to speed up Rails applications when the need arose.

Boy, was I wrong.

A quick look at the three methods of caching Rails pages reveals that page caching is of no use to a site which insists on displaying the current user on all pages (as most of them seem to). Next up is action caching, which does let me execute before and after filters, allowing me to handle the logged-in user, but it caches the entire rendered action, including the layout, so once again I can’t display the currently logged-in user. There are possibly some ways around this, but since action caching is really just a specialised form of fragment caching, let’s talk about that.

Fragment Caching

Fragment caching does work. In fact, my first attempts at it benchmarked so well in my simplistic “load this page 100 times in httperf” tests that I dived in head first. The books on this subject, particularly the pragprog one, give the impression that this is pretty straightforward. It’s not. There are some massive gotchas that will bring even a fairly low-traffic site to its knees if you don’t watch out for them.

It’s All About Expiry

You can’t consider caching without thinking about cache expiry. In Rails, this is typically done with cache sweepers. For fragment caching, the sweepers call the expire_fragment method. This can take a string, which matches the fragment name exactly, or it can take a regular expression.

Gotcha #1, Don’t Use expire_fragment With A Regex

First up, this doesn’t work with memcache anyway; it only works with the file system cache. There’s nothing that wrong with the file system cache. Reading from it is faster than rendering a template. Expiring from it, however, is pretty slow. Expiring from it using a regex is absolutely appalling. The reason why is better explained in this article by Adam Doppelt.
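
To get a feel for why the regex form hurts, here’s a minimal sketch in plain Ruby. It is not how Rails implements expiry; ToyFragmentStore and its method names are made up for illustration, and a Hash stands in for the cache directory. The point is only that an exact-name expiry is a single lookup, while a regex expiry has to test every key in the store.

```ruby
# Toy stand-in for a fragment store (hypothetical, not the Rails API).
class ToyFragmentStore
  attr_reader :keys_examined

  def initialize
    @fragments = {}
    @keys_examined = 0
  end

  def write(key, html)
    @fragments[key] = html
  end

  def size
    @fragments.size
  end

  # Exact-name expiry: one lookup, no matter how big the store is.
  def expire_exact(key)
    @keys_examined += 1
    @fragments.delete(key)
  end

  # Regex expiry: every stored key has to be tested against the pattern.
  def expire_matching(pattern)
    @fragments.keys.each do |key|
      @keys_examined += 1
      @fragments.delete(key) if key =~ pattern
    end
  end
end

store = ToyFragmentStore.new
1000.times { |i| store.write("activity_stream/#{i}", "<ul>...</ul>") }

store.expire_exact("activity_stream/7")
exact_cost = store.keys_examined               # 1

store.expire_matching(/\Aactivity_stream\/4\d\z/)
regex_cost = store.keys_examined - exact_cost  # 999, one test per remaining key
```

On a real file system store each of those tests also involves walking the cache directory, so the gap is wider than this toy suggests.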

So, if you can’t expire it with a regex, that leaves you the following options for expiry:

  • Time based expiry. There are some plugins that add this feature to the file system store. Memcache gives it to you for free, and if you’re relying on this heavily, I’d use memcache.
  • Being in one of those good situations where the number of possible fragments is known, and you can expire them each explicitly. This didn’t work for me in some of the critical areas that I needed to cache.
  • Storing a list (in the database) of the caches that you built up which need to be expired if a certain thing is changed.
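
That last option can be sketched roughly like this. It’s an illustration only: FragmentRegistry is a hypothetical class, the keys are invented, and an in-memory Hash stands in for the database table you would really use.

```ruby
require "set"

# Hypothetical registry: remembers which fragment keys depend on which record.
# A Hash stands in for the database table mentioned in the text.
class FragmentRegistry
  def initialize
    @keys_by_record = Hash.new { |hash, record| hash[record] = Set.new }
  end

  # Call this whenever a fragment is written.
  def record(record_id, fragment_key)
    @keys_by_record[record_id] << fragment_key
  end

  # Returns the keys to expire when the record changes, and forgets them.
  def keys_to_expire(record_id)
    (@keys_by_record.delete(record_id) || Set.new).to_a
  end
end

registry = FragmentRegistry.new
registry.record("user/42", "profile/42/sidebar")
registry.record("user/42", "activity_stream/42/page/1")
registry.record("user/7",  "profile/7/sidebar")

# Each of these would then be passed to expire_fragment as an exact string.
stale = registry.keys_to_expire("user/42")
```

The cost is an extra write every time a fragment is built, which is part of why I didn’t end up loving this approach.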

Don’t Expire, Just Render it Obsolete

This article so far doesn’t really capture how much pain this stuff has caused me, and I’ll try and cover some other points in other articles. For now, let’s jump straight to the good bit.

I’ve read a lot of articles on caching, but this is the best, go read it:

The Secret To Memcached – by Tobias Lütke

Tobias also struggled with expiry, and his solution is to take advantage of the fact that if you’re using memcache, then you can never cache too many items. The oldest ones get pushed out when you run out of space.
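
If the “oldest ones get pushed out” behaviour is new to you, here’s a toy version of the idea in plain Ruby. TinyLruCache is hypothetical and much simpler than what memcached really does internally; it only shows that a stale entry nobody asks for any more eventually falls off the end without anyone expiring it.

```ruby
# Toy cache with a fixed capacity (hypothetical; memcached's real eviction
# is more involved than this).
class TinyLruCache
  def initialize(capacity)
    @capacity = capacity
    @entries = {}   # Ruby hashes keep insertion order; first key = least recently used
  end

  def read(key)
    return nil unless @entries.key?(key)
    value = @entries.delete(key)
    @entries[key] = value   # re-insert to mark as most recently used
  end

  def write(key, value)
    @entries.delete(key)
    @entries[key] = value
    # Over capacity: drop the least recently used entry.
    @entries.delete(@entries.keys.first) if @entries.size > @capacity
    value
  end

  def keys
    @entries.keys
  end
end

cache = TinyLruCache.new(2)
cache.write("activity_stream/100/7", "old stream")
cache.write("activity_stream/101/7", "new stream")
cache.write("profile/7", "sidebar")   # over capacity: the obsolete stream is dropped
```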

So, here’s my first bit of advice. If you’re building a real site, go straight to memcache. If you’re not building a site for big traffic, don’t cache, just optimise any really stupid queries that are giving you trouble. If you’re using memcache, be sure to run monit too.

So, we’re running memcache, and we don’t want to expire our fragments. Instead, try and find fragment keys that don’t need to be expired, because they will be replaced if the data changes.

The one I’ve just implemented was a stream of recent activity, much like Facebook. Each little type of activity had a different template, and the rendering of this took up a lot of time. Fetching the data was also non-trivial. However, if I wanted to expire a cache of the activity stream, then I’d need to do so any time something occurred on the site that triggered an activity for this user.


<% cache ["activity_stream", @latest_activity.id, @user.id].join("/") do %>
... render the activities
<% end %>

There’s the code. The real example had a few more parameters, but you can see here that the magic is in the fact that I used @latest_activity.id as part of the key. I’m still having to query that from the database, but it’s pretty simple to do, and all I really need is one little integer, instead of all the activities and their associated objects. If a new activity is created for this user, then this id will have changed, and so I’ll be asking for a different cache key.
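
Outside of a Rails view, the same trick can be sketched in plain Ruby. ToyCache and activity_stream_key are made-up names for illustration, a Hash stands in for memcache, and joining the key parts with a “/” separator is my assumption for the sketch rather than anything Rails requires.

```ruby
# Hypothetical stand-in for a memcache-backed fragment cache.
class ToyCache
  attr_reader :renders

  def initialize
    @store = {}
    @renders = 0
  end

  # Returns the cached fragment, or runs the block once and stores the result.
  def fetch(key)
    return @store[key] if @store.key?(key)
    @renders += 1
    @store[key] = yield
  end
end

# Builds the key from the newest activity id and the user id (illustrative names).
def activity_stream_key(latest_activity_id, user_id)
  ["activity_stream", latest_activity_id, user_id].join("/")
end

cache = ToyCache.new
user_id = 7

first  = cache.fetch(activity_stream_key(100, user_id)) { "stream up to activity 100" }
second = cache.fetch(activity_stream_key(100, user_id)) { "never rendered" }
# A new activity (id 101) changes the key, so the block runs again.
third  = cache.fetch(activity_stream_key(101, user_id)) { "stream up to activity 101" }
```

The old entry under activity id 100 is never touched again. Nothing expires it; it just sits there until the cache needs the space.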

Benchmarks for this are looking really good. I’ll let you know how it goes in the wild, but I’m not expecting too many problems as most of my previous troubles have been to do with expiry, and this code doesn’t need any expiry. No sweepers, no regular expressions. It’s simple, and it scales.