I sold my cow and all I got were these url helpers
Posted by Craig Ambrose on April 20, 2008 at 11:19 PM
In rails applications, we link to other pages in our application by generating a url which maps to a particular controller (class) and action (method) using a rule which we call a route. Back in rails 1.0, we would do something like this:
<%= link_to 'Edit User', {:controller => 'users', :action => 'edit', :id => @user} %>
This is not an article on routing for dummies; I presume you already know this stuff. However, I want to recap why we do this, in case anyone has forgotten the reasons.
To give a point of comparison, let’s assume that there were no routes in rails, and that requests were directed to a particular place based on the default rule of “:controller/:action/:id?:other_params”. A link might look like this:
<%= link_to 'Edit User', "/users/edit/#{@user.to_param}" %>
That’s actually shorter than the above. Clearly brevity isn’t the main goal here. So what are the goals of customisable routes?
Human (and SEO) Friendly URLS
If we need further parameters, we don’t want to introduce a question mark into the url. We want it to keep looking like a directory structure. If we are creating a new user inside group 5, we want a url like /groups/5/users/new, instead of /users/new?group_id=5. This goal is probably not one you have forgotten, so let’s jump on to the next one.
A Single Point of Change For Url Mappings
It’s a well known bad smell in any piece of software if changing your mind about one simple concept requires you to make changes all through the code (Martin Fowler calls this “Shotgun Surgery”). If our client says, “can you change all references to ‘users’ in the urls to saying ‘people’ instead”, or “can you prefix all admin urls with /admin” then we would expect to be able to do so without too much trouble.
The beautiful thing about routing in rails is that the routes control both the generation and the parsing of urls. Back when I wrote PHP apps, I had code to parse the urls (big ugly case statements) and in some cases I had code to generate the urls, but I never bothered to create one simple system for doing both.
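To make the “one rule, both directions” idea concrete, here’s a toy sketch in plain Ruby. SimpleRoute is a hypothetical class invented for this illustration, not how the rails router is implemented; it only shows a single pattern being used both to build a url from a hash and to parse a url back into a hash.

```ruby
# Toy illustration only: SimpleRoute is a hypothetical class, not Rails' router.
class SimpleRoute
  def initialize(pattern, defaults = {})
    @segments = pattern.split('/').reject { |s| s.empty? }
    @defaults = defaults
  end

  # Parse a path such as "/people/edit/5" into a params hash, or nil if no match.
  def recognize(path)
    parts = path.split('/').reject { |s| s.empty? }
    return nil unless parts.length == @segments.length
    params = @defaults.dup
    @segments.zip(parts).each do |segment, part|
      if segment.start_with?(':')
        params[segment[1..-1].to_sym] = part
      elsif segment != part
        return nil
      end
    end
    params
  end

  # Build a path from a params hash, or nil if this route does not apply.
  def generate(params)
    return nil unless @defaults.all? { |key, value| params[key] == value }
    parts = @segments.map do |segment|
      if segment.start_with?(':')
        value = params[segment[1..-1].to_sym]
        return nil if value.nil?
        value.to_s
      else
        segment
      end
    end
    '/' + parts.join('/')
  end
end

# Changing 'users' to 'people' in the url means changing only this one line.
route = SimpleRoute.new('/people/:action/:id', :controller => 'users')
puts route.generate(:controller => 'users', :action => 'edit', :id => 5)
p route.recognize('/people/edit/5')
```

The view keeps asking for the users controller and the controller keeps receiving the same params; only the pattern string in between changes.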
So, with those two goals in mind, let’s travel back to the present and look at resource routes in rails 2. When we create a route with map.resources, a bunch of special helper methods are also created. This allows us to replace our initial example with:
<%= link_to 'Edit User', edit_user_path(@user) %>
Let’s look at the pros and cons of that.
Firstly, it’s much shorter. The hard coded string was a fair bit shorter too, so we know that brevity isn’t always the main goal, but short is generally not a bad thing.
It’s a little bit more english-like, in that it contains fewer symbols. However, this also means that it is less semantic. It’s easy to learn how to read urls that are specified as a hash of controller, action and params. I know how to read the resource helpers too, but there are a few different rules to learn in order to parse them mentally, and I find it takes new rails programmers a little while to figure them out.
It’s overloadable. Since it’s a method, we can declare a helper of the same name and do something completely different. This can be handy, although in practice it’s a bit dangerous since there are other ways to declare the route, so you’re not guaranteed to intercept all calls. Also, it would then behave in a way that comes as something of a surprise to our programmer, who has worked so hard to figure out how the restful routing helpers work. The principle of least surprise is worth considering.
By and large, I’m still kind of in favour of the new notation at this stage. When I first learnt it, I thought, “wow, that looks much nicer”, and that feeling is a very important argument in its favour. We’ve gotten a lot with this new functionality. I just want to mention the one feature we sold off without even noticing.
When generating a url we are now directly linking the view code to a single routing rule.
“WTF?”, I hear you say.
Haven’t we already established that the rails routing system decouples url generation from the controller code that it maps to, allowing us to configure the interface between them in one place (routes.rb)? Why am I now saying that I’ve linked the url generation in my view to a single route and lost my ability to vary it at will?
Let’s say that we wanted to map all uses of the user edit action to a totally new url. It could look like anything we wanted. Previously, we had the power to do so because our request to generate a url just gave the keys and values, our action just accepted key/value parameters, and the string we used for the url in between was totally up to the routes.
Now we’re using a method unique to this action to generate the url. By calling edit_user_path(@user), we’re not actually giving up the flexibility to decide what that method does, but if we wanted to make it map to anything other than the edit action on the users controller, nested inside no other resources, then we’d be violating all the conventions that we’d built up in order to understand the use of these helpers.
So, if we want to do something like move this action to a different resource, we find that we need to go through and use a new set of helper methods for all the links. Since we need to change each link to do this, we’re really not much better off than if we’d used hard coded strings in the links.
If you wanted to rename the users resource to ‘people’, it’s quite a tricky operation. I’ve done it many times, and without foolproof refactoring tools you need to search and replace for strings like user_ and look at each method call to see if it’s a url helper which should be renamed to edit_person_path or similar.
Recently, I’ve been experimenting with going back to expressing things as a hash. Also, in doing so I follow a set of conventions.
- Always put spaces either side of the => operator.
- Always use symbols for the keys.
- Always use single quotes as string delimiters for the values, if they are string literals.
This gives much more precise things to search and replace for. Let’s say that I want to rename the users resource to people. I can search for all strings matching :controller => 'users'.
I’m not necessarily saying that this approach is the best, but I think that we should all consider that the goal here is simplicity. The simplest code isn’t necessarily the shortest. The simplest code is the easiest to read, the easiest to learn its full meaning, and the easiest to change. When we got all excited about named url helper methods, I’m not sure it was at all clear how much we were giving up in return.
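The search and replace itself could even be scripted. Here’s a minimal sketch in plain Ruby; rename_controller is a made-up helper for illustration, and it only works because the conventions above make the text to match completely predictable.

```ruby
# Hypothetical helper: rewrites the :controller value in view source text.
# It relies on the conventions above (spaces around =>, single-quoted values).
def rename_controller(source, old_name, new_name)
  pattern = /:controller => '#{Regexp.escape(old_name)}'/
  source.gsub(pattern, ":controller => '#{new_name}'")
end

view = "<%= link_to 'Edit User', {:controller => 'users', :action => 'edit', :id => @user} %>"
puts rename_controller(view, 'users', 'people')
```

Note that the link text 'Edit User' and the @user variable are left alone, which is exactly the precision that searching for user_ doesn’t give you.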
Using Model factories with RSpec
Posted by Craig Ambrose on February 23, 2008 at 10:12 PM
I had a lot of questions and concerns about RSpec when I first made the switch a few months ago. I’d seen a few talks on BDD in general, and watched the peepcode screencasts on rspec, and I could see that BDD tended to encourage a fairly good style of testing. I managed to find answers for most of my questions, so here’s a little list of the challenges and resolutions that I faced when I made the switch.
When I first started on rails, like a good little agile developer, I mocked out everything for my unit tests and ensured that I never actually hit the database and thus never required fixtures. I used the mocha library predominantly, but after a month or so of this I had gotten totally tangled up.
The problem was that active record objects are a bit like an iceberg where only a part of them is visible in the code, and the rest is lurking below the murky waters of your database. My tests were missing heaps of bugs caused by queries generating invalid SQL, or by reliance upon properties which may or may not exist depending on the database schema, and were also far too complicated as they required lines and lines of setup code to handle all the wacky ways in which a model could interact with a database.
My solution eventually was to give in, and test things the way DHH does. Not particularly elegant, certainly not “correct” in terms of what I’d been taught regarding how to test, but it did work. The upside of these tests which often integrated a number of layers together was that I got great coverage. The downside was that I’d often get a raft of test failures if I broke a single line of code, and some things were just a bit too much effort to test and tended to motivate me not to bother. Also, the dependence on fixtures meant that as the test situations got more complex, I’d have to add more fixtures, and in doing so break other tests.
When moving to RSpec, I threw all this away, and started again on mocking everything out. RSpec helped out with a few nice extra methods, like mock_model, which can sensibly provide a mock ActiveRecord object that has the bare essentials already (like the id and to_param methods). This time I did better, but I still got a little tangled up, and I found it really hard to test methods where I was querying for records. With a mock based approach, all I could test was that I’d sent the query that I expected, not that it was actually valid SQL or that it would fetch any sensible results.
The resolution, as it turns out, lies somewhere in between.
I use mocking pretty heavily still, particularly when I’m creating or updating records. But, I also often want to deal with real data, particularly when it’s being fetched rather than created. Fixtures were a bad idea, because they created a great big data set that had to be valid for all tests. There are some plugins to provide different sets of fixtures for different scenarios, but I think that having the data used for a test in a different file is also downright confusing. Setting up for each test needs to be easy.
The solution for me was the use of a model factory, which is a pattern in some use amongst my colleagues at Cogent. I’d left the worlds of C# and C++ with a bit of a dislike of factories, being a pattern that, like dependency injection, seemed to add a great deal of complexity to a problem that is much more easily solved in a dynamic language. However, a simple model factory for testing is a different kettle of fish.
Let’s say I’m testing that I can’t create a user if their email address already exists. If I was using fixtures, I’d create a user with the email address I was going to try out, but that would affect all other tests. If I created a user directly in the test then I’d need to update my test each time I changed the information needed to create a valid user. Instead, a model factory makes it look like this:
describe User, "when being created" do
  it "should require email address to be unique" do
    model_factory.user(:email => "craig@craigambrose.com")
    user = User.create(:email => "craig@craigambrose.com")
    user.should_not be_valid
    user.errors.on(:email).should == "must be unique"
  end
end
The user method on model_factory creates and returns a valid user. It can be called as many times as I want, and it will always return a valid user. If User contains fields that must be unique, then the data I use to populate it contains integers which increment each time. The factory methods also take a hash of options which can be used instead of these default values, which I use whenever I want to set an attribute that I care about in the test. This way, if I change what is required to create a user, all I need to change is the model factory, not every single test.
I don’t use the model factory all the time. I still make heavy use of mock_model. For each case, I try and determine which method will yield the simplest test that actually forces me to write the code that is needed, rather than just appearing to give coverage over the lines of code. Sometimes I use some mocking and some concrete objects in the same test.
I haven’t provided the code for the model_factory itself. It’s very straightforward and I’ll leave it as an exercise for the reader. Drop me a line if you have a particularly clever implementation that you’d like to share or are interested in seeing some of my code.
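For anyone who wants a starting point, here’s one possible shape for it. This is only a sketch under assumptions, not my actual code: ModelFactory, its default attributes, and the Struct standing in for the ActiveRecord model are all invented so the example runs on its own. In a real app the user method would presumably call something like User.create! with the merged attributes.

```ruby
# Stand-in for an ActiveRecord model so the sketch runs on its own.
User = Struct.new(:login, :email)

# Hypothetical factory: the class name and default attributes are assumptions.
class ModelFactory
  def initialize
    @counter = 0
  end

  # Returns a user with unique default values; anything in overrides wins.
  def user(overrides = {})
    n = next_number
    attributes = {
      :login => "user#{n}",
      :email => "user#{n}@example.com"
    }.merge(overrides)
    User.new(attributes[:login], attributes[:email])
  end

  private

  # Incrementing integer used to keep unique fields unique.
  def next_number
    @counter += 1
  end
end

# Memoised accessor, so specs can simply call model_factory.user.
def model_factory
  @model_factory ||= ModelFactory.new
end

first = model_factory.user
second = model_factory.user(:email => 'craig@craigambrose.com')
puts first.email
puts second.email
```

The defaults live in one place, so a new required attribute means one new line in the factory rather than an edit to every spec.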
Storing Images from remote URLs
Posted by Craig Ambrose on February 22, 2008 at 12:17 AM
Here’s a handy little modification to attachment_fu to allow the model to have its file data set from a remote url. I’ve been doing this for a while but recently altered it to compute the mime-type from the downloaded file, rather than just basing it off the extension. This is because some web sites host files (such as images) at urls without a file extension.
As this is a monkey patch for attachment_fu, I follow Chris Wanstrath’s evil twin convention and put it in a plugin called attachment_fu_hacks.
This code is also dependent on the mimetype-fu plugin. Mimetype-fu calculates the mime type of a file using the *nix “file” command, rather than using file extensions as the mime-types gem does. If you’re using OSX, then this won’t work unless you change the two occurrences of file -bir in mimetype_fu.rb to file -br --mime (which is compatible with both OSX and linux). I’ve submitted that change to the author so hopefully it will be incorporated into future versions.
respond_to.email, or how to handle incoming emails in rails RESTfully
Posted by Craig Ambrose on February 09, 2008 at 04:49 AM
There’s a bunch of information around on how to handle incoming emails with your rails application, in particular the wiki page, but I have some concerns with the methods that are being suggested, and in this article I present an alternative which I’ve been trying out and I really like.
Handling incoming email is, in essence, very simple. All you need to do is get the email, which is a big chunk of text, parse it with a ruby email class, such as TMail (which is used by ActionMailer), and perform some action. If you’re only handling a few specific addresses, it might be best to fetch the email via POP3, and I’ve done that before using a daemon to regularly poll the pop account.
POP3 is not a viable solution if you want to handle all email for a certain domain. At this point, we probably want to talk about SMTP.
A Very Short Guide to SMTP
Simple Mail Transfer Protocol is pretty damn cool if you ask me. It’s dead simple: basically the client can only say “hello, here’s an email from X to Y”. Just like HTTP, it’s fully push based. There’s no polling, emails get pushed across the internet. Just like HTTP, it has a whole stack of response codes which are of course appropriate to trying to send an email, rather than talk to a web resource.
Using Postfix Mail Filters to Call Ruby
Postfix is a common open source SMTP server. Before I looked at it, it was big and scary. After a few hours of expert help, I wonder what seemed so complicated. One of the basic ways that we can use postfix to push mail to our rails app is by specifying a command line script which gets executed whenever postfix gets an email. This is the first option presented on the rails wiki, and they suggest using a script which calls the receive method of one of your ActionMailer classes.
My Concern
If we’re going to use ActionMailer to parse an email, and then presumably fire off a bunch of ActiveRecord code to make changes to your database as a result, clearly we’re loading the entire rails stack. Every time we get an email we’re loading the entire rails stack. This seems like how we handled web requests back in the day when there was only mod_cgi. No shared resources between requests, a big performance hit for loading all of rails and then getting rid of it each time, and the concern that we can only handle as many incoming emails as we have RAM on our server as the rails code takes up a bunch of memory.
What I Want
I don’t want to have to worry about the resources I need to scale my email server; I already do that with my application servers. I want to handle emails in a way that re-uses an in-memory copy of the rails classes and can be scaled in a predictable way.
That sounds a lot like a mongrel cluster.
We all have one of those already, right? So why not handle incoming mail over HTTP? It’s dead easy, it scales well, and the result is really Rails-ish.
respond_to.email
I was hoping to get a plugin out of this. It’d be so handy that people would queue for miles to download it. The trouble is, it’s actually not even enough code to bother, it’s only about three lines of ruby and the same of postfix config. So, let’s call this a pattern. I’ll describe how to do it, and you can all run off and do it yourself.
Step One: Install Postfix
Install postfix on one of your servers. For any sizable rails site, I like to have a little VPS just for daemons, cron jobs, scripts, and the mail server, to keep it separate from all the web stuff. On ubuntu, this was as easy as “sudo apt-get install postfix”. For the default configuration type, I chose “internet site”.
Step Two: Setup Your MX Record
For mail to start arriving at your mail server, you need to add an MX record to your DNS which points at the hostname of your server. Depending on your host, you probably have a web interface to do this, and it’s probably dead easy.
The Magic Script!
Create a file called mail_handler.rb, and pop it somewhere in your rails project. I created a /bin directory for it. Don’t use a rake task, the goal here is not to load in any unnecessary stuff. Here’s the contents.
#!/usr/bin/ruby
require 'net/http'
require 'uri'
Net::HTTP.post_form URI.parse('http://www.craigambrose.com/emails'), { "email" => STDIN.read }
If ruby is somewhere else on your machine, change the line at the top to be correct (try “which ruby” on that machine to see where it is). I’ve chosen to hardcode in the url that I want to post the email to so that I don’t have to load any other files. If you have more deployment environments to worry about, you might want to put the target url in a yml file and parse it here. Just don’t load your rails environment file, that’s the whole point of this.
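If you do go the yml route, the lookup can stay tiny. The following is only a sketch: the file layout (one url per environment name), the staging host, and the target_url helper are all assumptions made up for this example. It uses nothing but ruby’s bundled yaml library, so it still avoids loading rails.

```ruby
require 'yaml'

# Hypothetical layout: one target URL per environment name.
# In practice this text would be read from a yml file next to the script.
SAMPLE_CONFIG = <<-YAML
production: http://www.craigambrose.com/emails
staging: http://staging.example.com/emails
YAML

# Look up the URL to post to, failing loudly if the environment is unknown.
def target_url(yaml_text, environment)
  urls = YAML.load(yaml_text)
  urls.fetch(environment) { raise ArgumentError, "no mail target for #{environment}" }
end

puts target_url(SAMPLE_CONFIG, 'staging')
```

Failing loudly on an unknown environment seems safer than silently posting mail to the wrong app.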
Configuring Postfix to Call the Script
In this example, the domain that I want to handle email for is “craigambrose.com”. Everywhere you see this, replace it with your own domain name. Most of the commands below need root access.
In /etc/postfix/main.cf
mydestination = localhost.localdomain, localhost, craigambrose.com
virtual_maps = hash:/etc/postfix/virtual
alias_maps = hash:/etc/aliases
In /etc/postfix/virtual (this is a file, you may need to create this)
@craigambrose.com rails_mailer
The above says to redirect any address at craigambrose.com to the alias “rails_mailer”, which I’ll create next. You could run multiple rails apps off the same server by giving them all unique aliases. On the left, you can use a regular expression to match addresses if you only want to match some of them.
To apply this change to virtuals, run:
postmap /etc/postfix/virtual
In /etc/aliases
rails_mailer: "|/var/www/apps/craigambrose/current/bin/mail_handler.rb"
That’s the alias we created on the left. On the right is the path to my script, change as necessary. The pipe character before the script path means “the following is a shell command, not an email address”.
To apply this change to aliases, run:
postalias /etc/aliases
To apply the main configuration changes to postfix, run:
/etc/init.d/postfix reload
Testing the Setup
I should be able to send an email now to “someaddress@craigambrose.com”. To see it get processed by postfix, we might want to watch the postfix info log:
tail -f /var/log/mail.info
When the mail is processed, you should see a line like:
to=<someaddress@craigambrose.com>, orig_to=<root>, relay=local, delay=2, status=sent (delivered to command: /var/www/apps/craigambrose/current/mail_handler.rb)
Then, go peek at your rails app logs. You should see that the mail has been passed through by the script. Even if you haven’t written an action to handle it yet, the log entry should be there.
Troubleshooting
If you didn’t see the correct line in your postfix logs, then perhaps there’s a problem with your DNS setup. You could try talking to postfix directly. Mail servers listen on port 25, and you can telnet into them and speak directly. Try “telnet YOUR_SERVER_IP 25” and then try typing in what the client says in the sample SMTP communication on wikipedia, with the example address changed to the domain that you want to test. If that works, but sending email didn’t, you’ll need to investigate your DNS setup.
Handling the Rails Action
The target url I put in my mail script was http://www.craigambrose.com/emails, so the mail is going to get POSTed to that resource. With normal rails resource routes, that means that we’re expecting to handle the email in the create action of the EmailsController. That seems very sensible to me. My script puts the unparsed email into params[:email].
To parse it with TMail, all you need to do is:
require 'tmail'
email = TMail::Mail.parse(params[:email])
Alternatively you could pass it to the “receive” method of any ActionMailer derived class, which does the above automatically.
I’ve had some reports that TMail is both a little slow, and also not quite up to parsing all the possible ways that an email might be encoded in the big bad world. That’s a subject for another blog post.
Final Performance Note
When postfix is calling your script, it makes sure that only a certain number of calls are occurring concurrently. The default is 20, which seems pretty good to me. If you’d like to tweak this, use the following setting in main.cf (and don’t forget to reload postfix afterwards).
default_destination_concurrency_limit = 30
Acknowledgments
Setting up servers is not my area of expertise. Many thanks to Andrew Snow of Octopus for the postfix help and Pete Yandell for sharing some of the lessons learned on his great mailing list site 9cays.
Leaving the beaten track with RSpec
Posted by Craig Ambrose on December 05, 2007 at 07:16 AM
I’ve spent today learning RSpec. To get my head around it, I decided to spec a controller action which I’d already written and tested with Test::Unit. It’s a pretty basic CRUD action, requiring login and ownership of the record.
Here’s the spec and test code side by side:
I’m not sure which one was easier. The RSpec version is certainly a lot longer. Having said that, the Test::Unit one uses fixtures, which aren’t included in the dump.
In RSpec’s favour, the specifications run much faster (as they don’t use the database), and they also better decouple the controller from the model, as it’s a more true “unit” test. Having said that, I believe that the RSpec version might actually be a little bit more fragile, as it would fail if the code under test changed to perform the same operation using different methods (such as calling ActiveRecord create, instead of new and save).
Also, some of the things I specified in terms of expectations are actually order dependent, which is not captured in my specs, so if I deliberately messed up the order of some of the code, I might get false positives with my specifications.
For the record, here’s the code under test.
before_filter :login_required, :only => [:new, :create, :edit, :update]
before_filter :load_user
before_filter :current_user_can_edit_user, :only => [:new, :create, :edit, :update]
before_filter :load_user_base_tabs
def create
  @photo_set = PhotoSet.new(params[:photo_set])
  @photo_set.profile = @profile
  success = @photo_set.save
  respond_to do |format|
    format.html do
      if success
        redirect_to user_photo_set_path(@user, @photo_set)
      else
        render :action => 'new'
      end
    end
  end
end
Image cropping with Mini Magick and attachment_fu
Posted by Craig Ambrose on December 03, 2007 at 01:07 AM
As I mentioned in my last post, I’ve recently switched a lot of my code from RMagick to Mini Magick. The API provided by mini magick isn’t quite as nice, and you have to jump through a few hoops to get it to do what you want, so I thought I’d post an example.
Whenever clients talk about image thumbnailing, what they actually want is for the image to be cropped square, and then resized down to thumbnail size. They never, ever, want you to provide a thumbnail that isn’t square, or to scale the image out of proportion in order to stretch it into a square shape. It’s always crop and scale.
Strangely, this isn’t what attachment_fu does out of the box, with either rmagick or mini magick. Others have discussed this with regard to rmagick, but I couldn’t find a good mini-magick solution that worked, so here’s what I’ve come up with.
Replace the resize_image method in attachment_fu’s mini_magick_processor.rb file with the following:
def resize_image(img, size)
  size = size.first if size.is_a?(Array) && size.length == 1
  if size.is_a?(Fixnum) || (size.is_a?(Array) && size.first.is_a?(Fixnum))
    if size.is_a?(Fixnum)
      resize_and_crop(img, size)
    else
      size[0] == size[1] ? resize_and_crop(img, size[0]) : img.resize(size.join('x'))
    end
  else
    img.resize(size.to_s)
  end
  self.temp_path = img
end
def resize_and_crop(image, square_size)
  if image[:width] < image[:height]
    shave_off = ((image[:height] - image[:width])/2).round
    image.shave("0x#{shave_off}")
  elsif image[:width] > image[:height]
    shave_off = ((image[:width] - image[:height])/2).round
    image.shave("#{shave_off}x0")
  end
  image.resize("#{square_size}x#{square_size}")
  return image
end
The resize_image method is changed a little to ensure that resize_and_crop is called if both requested dimensions are the same. This occurs if you specify it as a single number, or as an array of two numbers that are the same.
e.g. :thumb => [80,80]
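If the shave arithmetic in resize_and_crop is hard to picture, here it is pulled out on its own. This is just an illustration: shave_geometry is a made-up helper that is not part of attachment_fu or mini magick, and it only builds the geometry string rather than touching an image.

```ruby
# Hypothetical helper mirroring the arithmetic in resize_and_crop.
# Returns the geometry string to pass to shave, or nil if already square.
def shave_geometry(width, height)
  if width < height
    "0x#{(height - width) / 2}"
  elsif width > height
    "#{(width - height) / 2}x0"
  end
end

puts shave_geometry(640, 480)  # landscape: trim the left and right edges
puts shave_geometry(480, 640)  # portrait: trim the top and bottom edges
```

Because the division is on integers, an odd difference between width and height leaves the image one pixel off square before the resize.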
Image management that will scale
Posted by Craig Ambrose on November 27, 2007 at 02:59 AM
There is a lot of conflicting information around about handling user uploaded images in rails applications. I’ve done it a number of different ways, and the good news is that it’s not too hard to move from one system to another. However, dealing with scaling issues is a pain and it’s nice to get it right first go. So, here are some problems that I’ve encountered recently, along with some solutions.
Files Per Directory Limit
Depending on which OS you use for hosting, you’ve probably got a limit to the number of files (or directories) you can put inside a given directory. It’s usually about 32,000. While this seems like a long way off, if your site accepts user content then hopefully this will eventually become a problem for you. There have been various talks and articles written about different hashing systems for file names, but it’s worth mentioning that this is basically a solved problem, and you shouldn’t have to tackle it yourself.
If you’re still using file_column, as I am for a few things, then this one might bite you. The simplest solution, I think, is to migrate to attachment_fu. The file system store for attachment_fu implements file name based hashing, and the s3 and database stores don’t suffer from the problem at all. Also, the way in which attachment_fu handles pluggable storage classes means that you could also slip in your own custom storage system later without having to change the way that you use attachment_fu in your models.
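To give a feel for how this kind of scheme keeps directories small, here’s a rough sketch of id based partitioning. The partitioned_path helper below is written for this example, and whether it matches attachment_fu’s own implementation exactly is an assumption on my part.

```ruby
# Hypothetical helper: split a zero-padded id into two directory levels.
def partitioned_path(id, filename)
  segments = ('%08d' % id).scan(/..../)
  File.join(*(segments + [filename]))
end

puts partitioned_path(123, 'avatar.jpg')
puts partitioned_path(4567890, 'avatar.jpg')
```

With ids below one hundred million, no directory in this layout ever gets more than 10,000 entries, which sits comfortably under the limit mentioned above.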
If you’re thinking of making the switch, here’s an article I wrote on migrating from file column to attachment_fu.
RMagick Memory Leaks
RMagick is really handy, and so just about every rails image handling tutorial on the internet recommends its use. I’m using it all over the place. My advice to you is: don’t ever do this. It turns out that RMagick leaks memory every time it manipulates an image. I haven’t measured the amount myself, but I’m told it’s quite a bit. Certainly I’ve been having resource consumption problems with scripts using RMagick heavily. So, say goodbye to it.
DHH recommended just using the image magick binaries manually. That’s basically a good idea, but a slightly easier way of doing that is to use the mini_magick gem. Mini magick provides a ruby API, but under the hood it just calls the image magick command line tools. Attachment_fu comes with a mini magick processor, so you can just add “:processor => :MiniMagick” to your call to has_attachment and you’re in business. Khamsouk Souvanlasy wrote a good tutorial on using mini_magick with acts_as_attachment.
Cropping
The one thing I noticed in using attachment_fu instead of file column is that file column resizes images and crops them nicely, the way that you would expect. By default, attachment_fu tends to stretch them. This has been covered better by other people, so I just want to mention it because cropping is almost certainly what you want, and until Rick fixes it, I’d suggest making a small change to the plugin yourself. There are a number of articles on the subject, but I think the best one is probably over at toolman tim’s blog.
Don’t just go with Tim’s solution though, have a look at the comments, and you will find options for the different image processors. I used “labrat’s” suggested fix for mini magick (paste here)
Amazon S3
Amazon S3 appears to be a great solution for handling user generated images, and I’m starting to use it a fair bit. One word of warning however, is that I’ve already started to encounter an occasional communication error with amazon, as discussed in this thread and I don’t yet know how serious it is or how easily fixed. I’ll post some more on this subject when I’m better informed.
Caching Makes Your Brain Explode
Posted by Craig Ambrose on November 13, 2007 at 04:20 AM
I’ve been spending a lot of time recently trying to make boxedup.com scale. Before I started, I’d watched the right screen-casts, read the right books, and I thought I knew what had to be done to speed up rails applications when the need arose.
Boy, was I wrong.
A quick look at the three methods of caching rails pages reveals that page caching is of no use to a site which insists on displaying the current user on all pages (as most of them seem to). Next up is action caching, which does let me execute before and after filters, allowing me to handle the logged in user, but caches the entire rendered action, including the layout, so once again I can’t display the currently logged in user. There are possibly some ways around this, but since action caching is really just a specialised form of fragment caching, let’s talk about that.
Fragment Caching
Fragment caching does work. In fact, my first attempts at it benchmarked so well in my simplistic “load this page 100 times in httperf” tests that I dived in head first. The books on this subject, particularly the pragprog one, give the impression that this is pretty straightforward. It’s not. There are some massive gotchas that will bring even a fairly low traffic site to its knees if you don’t watch out for them.
It’s All About Expiry
You can’t consider caching without thinking about cache expiry. In rails, this is typically done with cache sweepers. For fragment caching, the sweepers call the expire_fragment method. This can take a string, which matches the fragment name exactly, or it can take a regular expression.
Gotcha #1, Don’t Use expire_fragment With A Regex
First up, this doesn’t work with memcache anyway, it only works with the file system cache. There’s nothing that wrong with the file system cache. Reading from it is faster than rendering a template. Expiring from it, however, is pretty slow. Expiring from it using a regex is absolutely appalling. The reason why is better explained in this article by Adam Doppelt.
So, if you can’t expire it with a regex, that leaves you the following options for expiry:
- Time based expiry. There are some plugins that add this feature to the file system store. Memcache gives it to you for free, and if you’re relying on this heavily, I’d use memcache.
- Being in one of those good situations where the number of possible fragments is known, and you can expire them each explicitly. This didn’t work for me in some of the critical areas that I needed to cache.
- Storing a list (in the database) of the caches that you built up which need to be expired if a certain thing is changed.
Don’t Expire, Just Render it Obsolete
This article so far doesn’t really capture how much pain this stuff has caused me, and I’ll try and cover some other points in other articles. For now, let’s jump straight to the good bit.
I’ve read a lot of articles on caching, but this is the best, go read it:
The Secret To Memcached – by Tobias Lütke
Tobias also struggled with expiry, and his solution is to take advantage of the fact that if you’re using memcache, then you can never cache too many items. The oldest ones get pushed out when you run out of space.
So, here’s my first bit of advice. If you’re building a real site, go straight to memcache. If you’re not building a site for big traffic, don’t cache, just optimise any really stupid queries that are giving you trouble. If you’re using memcache, be sure to run monit too.
So, we’re running memcache, and we don’t want to expire our fragments. Instead, try and find fragment keys that don’t need to be expired, because they will be replaced if the data changes.
The one I’ve just implemented was a stream of recent activity, much like facebook. Each little type of activity had a different template, and the rendering of this took up a lot of time. Fetching the data was also non-trivial. However, if I wanted to expire a cache of the activity stream, then I’d need to do so anytime something occurred on the site that triggered an activity for this user.
<% cache ["activity_stream", @latest_activity.id, @user.id].join("/") do %>
... render the activities
<% end %>
There’s the code. The real example had a few more parameters, but you can see here that the magic is in the fact that I used @latest_activity.id as part of the key. I’m still having to query that from the database, but it’s pretty simple to do, and all I really need is one little integer, instead of all the activities and their associated objects. If a new activity is created for this user, then this id will have changed, and so I’ll be asking for a different cache key.
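To see why no expiry is needed, here is a minimal plain Ruby sketch of the same idea. It makes some assumptions: the class name is made up, a Hash stands in for memcache (and unlike memcache it never evicts anything), and the rendered HTML is a placeholder. The key parts are joined with a separator so that, say, activity 12 for user 3 can’t collide with activity 1 for user 23.

```ruby
# Hypothetical stand-in for the fragment cache around the activity stream.
class ActivityStreamCache
  attr_reader :render_count

  def initialize(store = {})
    @store = store          # stand-in for memcache
    @render_count = 0       # how many times the expensive render ran
  end

  # The newest activity id is part of the key, so new activity means a new key.
  def key_for(latest_activity_id, user_id)
    ["activity_stream", latest_activity_id, user_id].join("/")
  end

  def fetch(latest_activity_id, user_id)
    @store[key_for(latest_activity_id, user_id)] ||= begin
      @render_count += 1
      "<ul>activities for user #{user_id} up to activity #{latest_activity_id}</ul>"
    end
  end
end

cache = ActivityStreamCache.new
cache.fetch(41, 7)   # renders and stores the fragment
cache.fetch(41, 7)   # same key, served from the store
cache.fetch(42, 7)   # a new activity exists, so a different key is rendered
```

The old entry for activity 41 is never deleted here. With memcache it would simply age out once space is needed.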
Benchmarks for this are looking really good. I’ll let you know how it goes in the wild, but I’m not expecting too many problems as most of my previous troubles have been to do with expiry, and this code doesn’t need any expiry. No sweepers, no regular expressions. It’s simple, and it scales.
Migrating from file_column to attachment_fu
Posted by Craig Ambrose on September 09, 2007 at 03:25 AM
About a year ago, file_column was one of the most popular plugins for storing files, particularly images, in rails applications. These days, the most popular plugin is Rick Olson’s attachment_fu.
The main advantage of attachment_fu is its ability to store the files either on the file system, in binary fields in the database, or on amazon s3. The pluggable nature of the code also makes it fairly easy to support some other storage service.
I’ve avoided moving over because file_column actually provides more comprehensive image manipulation features, but there comes a time in the life-cycle of most applications where file system storage just doesn’t cut it in a multi-server environment.
There are already tutorials on using attachment_fu, so I’m presuming that you already know how to use it. I’m just going to help you make the switch. Let’s start with some code, here’s my migration for moving across the data:
class CreateProfilePhotosFromFileColumn < ActiveRecord::Migration
def self.up
for profile in Profile.find(:all)
image_filename = select_value "SELECT image FROM profiles WHERE id = #{profile.id}"
unless image_filename.blank?
image_path = RAILS_ROOT + "/public/system/profile/image/#{profile.id}/#{image_filename}"
image_file = File.open(image_path, 'r')
photo = ProfilePhoto.new(:profile_id => profile.id)
photo.set_from_file(image_file)
photo.save!
end
end
end

def self.down
execute "DELETE FROM profile_photos"
end
end
In this example, I previously used file column in the field called “image” of my Profile model. Now, I have a new model called ProfilePhoto, which belongs_to Profile.
Although I’m happy to loop through all profiles using regular active record finders, note that I didn’t use the image method of profile. I know that after this works, I’m about to remove everything to do with file column, and so to play it safe, I use select_value to fetch the file column image name directly from the database. This is ugly, but good policy in general for producing migrations that keep working after the code changes.
The other trick here is the call to “set_from_file”. This method doesn’t exist in attachment_fu, and was the first (of several) glaring omissions that I noticed in this plugin. To make this migration work, you’ll need to make a few changes to attachment_fu.rb yourself.
The following goes inside the InstanceMethods module:
def set_from_file(source_file)
source_file_extension = File.extname(source_file.path).reverse.chomp('.').reverse
source_file_name = File.basename(source_file.path)
self.content_type = self.class.mime_type_from_extension(source_file_extension)
self.filename = source_file_name
self.temp_data = source_file.read
end
The following goes inside the ClassMethods module:
def mime_type_from_extension(extension)
MIME::Types.type_for(extension).first.simplified
end
And the following goes at the top of the file:
require 'mime/types'
You’ll also need to “gem install mime-types”, although you will already have this if you installed the amazon s3 library.
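If the extension juggling in set_from_file looks cryptic, this little standalone sketch shows the same steps in plain Ruby. Note the assumptions: the lookup table and both method names are made up for illustration, and the table stands in for the mime-types gem so that the snippet runs without it.

```ruby
# Hypothetical lookup table standing in for the mime-types gem.
MIME_BY_EXTENSION = {
  "jpg"  => "image/jpeg",
  "jpeg" => "image/jpeg",
  "png"  => "image/png",
  "gif"  => "image/gif"
}

# Strip the leading dot that File.extname returns and normalise the case.
# A name with no extension gives an empty string.
def extension_of(path)
  File.extname(path).sub(/\A\./, "").downcase
end

# Returns nil for extensions that are not in the table.
def mime_type_for(path)
  MIME_BY_EXTENSION[extension_of(path)]
end

mime_type_for("/public/system/profile/image/5/me.JPG")
```

A nil result here corresponds to the case the real code doesn’t handle, where the mime lookup finds nothing and the call chain blows up.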
This code is not hugely error tolerant. It presumes that all your records with file columns contain valid image files that are going to be accepted by your attachment_fu model. It also assumes that you set up attachment_fu correctly of course.
If it works, I would then add further migrations to remove the old file_column field from the Profile model, and to remove the file_column files themselves from the hard disk.
You’ll probably find the set_from_file method to be a valuable addition to attachment_fu for other purposes too. Our applications often receive their data in ways other than just HTTP post, and being able to save a file object seems like a pretty obvious addition.
In Response to Dave Thomas' RailsConf 07 Keynote
Posted by Craig Ambrose on May 25, 2007 at 11:37 PM
Conference keynotes often introduce plenty of new ideas and challenges to a development community, but they are a bit of a one way dialogue if we don’t have a think about them and share our views. In particular Dave Thomas, of the Pragmatic Programmers, is not shy about trying to inspire programmers to take new directions, and I always look forward to his keynotes.

For RailsConf 07, Dave’s talk centered around the idea of “cargo cults”. Cargo cults were popular on Pacific island nations, who emulated the activities of the Americans during the war, to the point of building runways, control towers, and trying to summon planes with coconut headphones, in order to make the cargo come. The metaphor here is that we, as rails developers, need to stop doing things just because they have worked for others in the past, and start being truly creative.
While it’s hard to disagree with a call to be creative, I’m going to go out on a limb and say that I don’t think this is a particularly good direction for the Rails community to take at present.
Ruby on Rails has several strengths that make it quite unique. Some of these relate to the software itself, but these are minor compared to the strengths of the Rails community. There have been plenty of other open source ecosystems with a strong community, so what makes the Rails one different to, for example, the python community?
My belief, if you’ll excuse me for stretching David’s metaphor perhaps a little too far, is that it’s our cargo cults. The rails community demonstrates a particular unity of purpose. While we are all out building our own crazy web applications, the opportunities for re-use of code and design ideas are particularly high. If you ask three python programmers how to build web applications, you’ll get three different answers. If you ask three ruby programmers, statistically speaking, you’re liable just to hear about Rails three times.
This goes further than just the dominance of Rails within the Ruby community. Even within Rails, we have our cargo cults. REST is a good example, and one mentioned during David’s talk. Sure, there’s life beyond REST, but the benefit of picking such an approach as a community, and sticking to it (at least for the next year or two), is staggering. If I build a RESTful application, it’s no big deal, but if Rails developers everywhere build them, then suddenly there’s an ecosystem of useful applications that can talk to each other in such a practical and simple way that it makes SOAP developers weep.
Not only do we have our cargo cults, but we also have our high priests. The whole REST thing happened because at RailsConf 06, DHH said that we should do it. This year, he was a little more reserved, but was still quite happy to go out on a limb and pick some particular technologies which were more right than others. One slide of his talk listed the friends and allies of the rails community. Our friends, we are told, include OpenID and Atom. There’s no reason why the Ruby on Rails framework needs to be particularly associated with either of these technologies, and from a technical perspective, it probably won’t be. But DHH says that they are our friends. Suddenly, OpenID support is appearing in Rails apps all over the place. This is an amazing result which you can normally only see happen within a hierarchical organisation. Here we are seeing unity of purpose within a very anarchistic open source community.
So I say, investigating new ideas is fine, but be aware that there is great benefit to be gained by following the path that we’re all traveling on together. People with different ideas from the Rails core team tend to write plugins, and if people think that the ideas are good enough, they eventually gain traction. Just take a look at the adoption of rspec within the rails community, even though it is quite different from the default testing system.
You can think of this as a cargo cult if you like, or laugh and say that rails developers have swallowed too much of their own hype. However, taking a common path that represents the combination of a range of views without simply all going our separate ways is a technique that works very well in nature.
Amongst humans, we call this technique consensus.
Migrating from Typo to Simplelog
Posted by Craig Ambrose on May 17, 2007 at 11:19 PM
I’ve moved this blog from Typo to Simplelog, both rails based blogging systems. Normally I don’t do this sort of meta-blogging, but I thought I’d share a few tips about the migration, and some comments on Simplelog.
Installation
Like the other rails blogs, simplelog has some installation instructions for how to get the simplelog files on your server, and like all rails developers, I promptly ignored them, and setup a copy of simplelog in a subversion repository of my own and deployed to my server using capistrano. This makes deployment and upgrading much easier, as I can test new simplelog versions on my local machine first, and then roll out updates using cap deploy as normal.
Migrating your Data
There don’t seem to be any nice scripts for migrating data from Typo to Simplelog. There are some scripts for exporting Typo data into Movable Type or WordPress format, and you could probably hook together two or three different scripts to eventually get from Typo to Simplelog, but I couldn’t figure out a simple way to do that without also installing a PHP app, and that just seemed way too complex, so I thought I’d roll my own.
The bad news I’m afraid, is that I haven’t bothered to sufficiently generalise it to make it work for everyone, but if you want to do this too, then you can use my code as a starting point and refine it for your needs.
When you want to script some little task in rails, the tool for the job is of course rake. I started working on a little rake task inside my typo installation, to export the data in a format I could use for simplelog. I started by comparing both database schemas, and seeing how I could generate SQL to use with simplelog.
It turns out that both these databases are non-trivial. They use single-table inheritance, and they also both have cache columns full of keywords and search text and so forth. The active record classes in both apps have the code to do this, so the best way to convert your data is to leverage that code. You could write a ruby app that loads both model layers and connects to both databases, but that didn’t seem too trivial, so what I ended up doing was creating a typo rake export task that generated a simplelog rake import task.
The Code
Here’s the code, it only copies the data I was interested in (articles and comments), and it does so with a few specific values (such as a fixed author id). It also, on large databases, will hit some id conflicts as it does so. To fix this, run it multiple times until the auto-numbered ids become bigger than the ids in your old database (or solve that problem properly). Also, it’s butt ugly, remember, I just wanted to do this in an hour and never see it again.
Before running this script, make sure that you have setup your simplelog database (with the migrations), and created a user (mine had id 2, which is hardcoded in the script). Also, set the default markup format in simplelog to be whatever you used in typo (I was using textile).
def q(value)
value = '' if value.nil?
if value.is_a? String
escaped = value.gsub(/\\/, '\\\\').gsub(/\r\n/, "\n").gsub(/'/, "\\\\'").gsub(/\n/, '\' + "\\\\n" + \'')
return "'#{escaped}'"
end
value
end
task :typo_export => [:environment] do
file = File.new("/path_to_craigs_simplelog/lib/tasks/import_from_typo.rake", 'w')
file << "task :import_from_typo => [:environment] do \n"
file << " require 'find'\n"
file << " require 'post'\n"
file << " require 'preference'\n"
file << " require 'application'\n"
file << " Post.delete_all\n"
file << " update_posts = \"\"\n"
for article in Article.find(:all)
file << " new_record = Post.create(:author_id => 2, :created_at => '#{article.created_at.to_s(:db)}'.to_time, :modified_at => '#{article.updated_at.to_s(:db)}'.to_time, :permalink => #{q article.permalink}, :title => #{q article.title}, :body_raw => #{q article.body}, :comment_status => 1, :text_filter => #{q 'textile'})\n"
file << " update_posts += \"UPDATE posts SET id = #{article.id} WHERE id = \#{new_record.id}; \"\n\n"
end
file << " Post.connection.execute update_posts\n\n"
file << "\n\n"
file << " Comment.delete_all\n"
file << " update_comments = \"\"\n"
for comment in Comment.find(:all)
file << " new_record = Comment.create(:post_id => #{comment.article.id}, :name => #{q comment.author}, :email => #{q (comment.email.blank? ? 'anon@noemail.com' : comment.email)}, :subject => '', :body_raw => #{q comment.body}, :ip => #{q comment.ip}, :is_approved => 1, :url => #{q comment.url})\n"
file << " update_comments += \"UPDATE comments SET id = #{comment.id}, is_approved = 1 WHERE id = \#{new_record.id}; \"\n\n"
end
file << " Post.connection.execute update_comments\n\n"
file << "end\n"
file.close
end
Once this is run, you can go to the simplelog installation and run rake import_from_typo. If it doesn’t work, figure out what went wrong, modify the script, and repeat. Please notice that this is a destructive import (note the calls to “delete_all”).
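The fiddly part of generating one script from another is the quoting, which is what my q method is for. As a point of comparison, here is a small sketch of the same “write Ruby source from data” trick that leans on String#inspect to do the escaping. The records and variable names are made up for illustration, and it builds an array literal rather than ActiveRecord calls so that it runs on its own.

```ruby
# Hypothetical records standing in for Typo articles.
articles = [
  { :id => 7, :title => "It's alive", :body => "line one\nline \"two\" with a \\ backslash" },
  { :id => 9, :title => "Plain",      :body => "nothing special" }
]

# Build one line of Ruby source per record, letting inspect take care of
# quotes, newlines and backslashes.
lines = articles.map do |article|
  "  { :id => #{article[:id]}, " \
    ":title => #{article[:title].inspect}, " \
    ":body => #{article[:body].inspect} }"
end
generated_source = "[\n" + lines.join(",\n") + "\n]\n"

# Evaluating the generated source rebuilds the original values.
imported = eval(generated_source)
```

In the real task the generated source is written to a .rake file and run later inside the other application, rather than being evaluated straight away.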
This is a long post already, so I’ll post comments on simplelog next time.
How Do You Freeze Your Rails Version?
Posted by Craig Ambrose on May 15, 2007 at 03:45 AM
The first time a Ruby on Rails release introduced a few backwards incompatible changes, it caused a bit of an uproar (the one I’m thinking of was 1.0.1, which broke Typo, amongst other things). Suddenly, everyone realised that we were writing web applications that weren’t just dependent on Ruby on Rails, they were dependent on a particular version of Rails.
So, these days we all know that we need to lock our Rails applications into using a set version. There are two ways of doing this, and each has its pros and cons.
[1] Freezing Rails in the Vendor Directory
If your application finds a copy of Rails in the ./vendor/rails directory, then it will use that instead of whatever rails gems are installed on the system. This is really handy, and it’s the method that I initially adopted for all my sites when the rails 1.0.1 problem occurred. At first, I was copying the Rails files into my project subversion repositories, but I quickly learned that the easier way to do it was with subversion externals.
From the root of your rails application, execute:
svn propedit svn:externals vendor/
The subversion externals properties from that directory will now be editable in your default editor. Each line in this property represents a link to an external repository, with the name of the local folder to export to first, then the repository URL. So, for example, to use rails version 1.2.3, we would use the following:
rails http://svn.rubyonrails.org/rails/tags/rel_1-2-3/
You can find the URL of that tag, or any other rails tag or branch (or the trunk itself) by browsing the rails repository.
The upside of this method, is that your application is safe to deploy on almost any machine with the correct ruby stack installed, even if the rails gems are not present.
The downside, is that rails is pretty large, and every time you do a svn checkout, it has to grab it all. In particular this slows down your cap deploy command. If you’re using deprec, it can really slow down your cap setup command too, because the file permissions have to be set on all those folders.
[2] Specifying the Rails Version in environment.rb
This is the accepted method now, and is done by default in all new rails applications. To lock in that same version number, all you need to do is add the following to your environment.rb, if it isn’t already present:
RAILS_GEM_VERSION = '1.2.3' unless defined? RAILS_GEM_VERSION
This means that rails will load the gem for rails 1.2.3, if it exists. If it doesn’t exist, it will throw an error, rather than run your application. Remember that gem doesn’t remove old versions when it installs new ones, so even if a newer version of rails is installed on the server, if the correct version was there once, it should stay there.
These days, I’m moving my sites over to using this method, for the reasons of hard disk space and speed that I mentioned above.
The only downside is, I need to make sure that the right version of the gem is installed. However, most applications have other gem dependencies too, apart from just rails.
How do you ensure that the required gems are installed on a new server? If you’re logging in to the server manually and installing the gems, have a think about automation. Surely this is the province of the cap setup task. It might seem easy to do it now, but don’t forget that you might have to do it again when you move servers in six months time. Also, when that happens, you might be in a hurry. You might also be migrating to a multi-server cluster.
Don’t wait, automate now. Whack all your setup needs into cap tasks. Consider using before or after filters on the cap setup action. Please note that if you’re using deprec, it already adds these filters, so you’ll need to use an action called after_after_setup. This problem is fixed in capistrano 2, which should be out as a stable release very soon, but until then, you can do it the hard way.
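As a starting point for that automation, here is a rough sketch of the check itself in plain Ruby, using the version classes that ship with rubygems. The gem list and the method name are made up for illustration. How you collect the installed versions from the server, and how you run the resulting commands from a cap task, is left up to you.

```ruby
require 'rubygems'

# Hypothetical list of gems and the exact versions the application needs.
REQUIRED_GEMS = { "rails" => "1.2.3", "mongrel" => "1.0.1" }

# Given the versions found on a server (gem name => list of version strings),
# return the `gem install` commands needed to fill the gaps.
def missing_gem_commands(required, installed)
  missing = required.reject do |name, version|
    wanted = Gem::Requirement.new("= #{version}")
    Array(installed[name]).any? { |found| wanted.satisfied_by?(Gem::Version.new(found)) }
  end
  missing.map { |name, version| "gem install #{name} --version #{version}" }
end

# A server that has two rails versions but no mongrel at all.
missing_gem_commands(REQUIRED_GEMS, "rails" => ["1.1.6", "1.2.3"])
```

Because gem keeps old versions around, a newer rails on the server doesn’t count as satisfying the requirement here. Only the exact version does.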
Securing your Mongrel Processes with Deprec
Posted by Craig Ambrose on May 08, 2007 at 10:06 PM
For a long time, on many of my Rails apps, I’ve been letting my Mongrel processes run as the root user. Any sysadmin worth her salt will tell you that this is a bad idea, because any exploits in your web application will give the malicious user root access to your machine, and also any bugs in your application have the potential to destroy all sorts of important data. For example, try adding <% rm_rf '/' %> to one of your rails templates and you’ll see what I mean (that’s sarcasm by the way, don’t do it).
So, we need to run those mongrel processes with a user with fewer privileges. Our mongrel user should be able to read our entire rails application, and only be able to write to those relevant directories, such as logs, cache, sessions, pids and so on.
This security problem was a significant issue for my current client, and so we hired Mike Bailey (the guy who writes the deprec gem), to ensure that deprec has a system for handling this properly. It’s in the latest deprec release, and here’s how it works.
Deprec 1.7
Deprec has had a whole swathe of small releases in the last couple of weeks. Mike is back from his adventures in India, and busy fixing bugs and adding new features. By the time you read this, the release number might be even bigger, but these instructions on how to apply it to old sites will probably still be valid. For new sites, you don’t really have to worry about this stuff.
Before Upgrading
Make sure that your gems are up to date on the server.
Why do you need to do this? Well it turns out that mongrel 1.0.1 came out recently, and at the moment, deprec can’t correctly answer the questions that gem will ask about which version to use when upgrading. If you hit this problem, or want to avoid it, log into your server(s) and run sudo gem update now. You can decide which version of mongrel you want (probably 2 – ruby) and then deprec won’t have any troubles.
On an app that’s been around for a while you might also want to run cap cleanup, if you don’t already do this every now and then to remove old releases on the server. This will make the following tasks run much faster.
Applying the New Groups and Permissions
Remember the cap setup task that sets up your initial directory structure for an application? Deprec also adds to that task to ensure that file permissions are correct, and the mongrel cluster is set up. This task is re-runnable. It doesn’t hurt to run it as often as you want, although it is pretty slow. By running it with the new deprec release, you can setup the right file permissions to use mongrel as a non-root user.
Please note that the last step performed by the setup task is to create the database. This currently fails, as the database already exists. This bug will get fixed, but it doesn’t matter, just ignore the error.
Roll Out a New Version
To get your app rolled out with the new permissions, run cap deploy, even if it hasn’t changed since the last release.
Consider Updating Your Rails Stack
If you setup your machine with deprec some time ago, as I did, then there will be a few bugs. In particular there was an old bug where correct initialisation scripts for mongrel and apache were not added to the default runlevel, and they didn’t start on launch. To make sure that your system is configured as per the latest deprec, run: cap install_rails_stack. This is also always re-runnable to bring your system up to date.
Summary
So, for upgrading an old server to the latest deprec, my recommendations are:
# first, login to the server and run "sudo gem update"
cap cleanup
cap setup
cap deploy
cap install_rails_stack
cap restart_apache
For setting up a new server, instructions vary slightly according to your host, but for an Ubuntu 6.06 server at slicehost.com, I use the following process:
export HOSTS=your.hostname.com
cap setup_admin_account_as_root
cap setup_ssh_keys
cap install_rails_stack
cap setup
cap deploy_with_migrations
cap restart_apache
A Rake Task to Grab Data From Your Production Site
Posted by Craig Ambrose on April 17, 2007 at 10:17 PM
When you’re fixing bugs that occur in a production rails application, the first step is often to reproduce them locally in development mode. Every now and then, I find that I want to make my local development site run with the same data as my live production site, so here’s a little rake task to make it easier.
desc "Make local development site look like the live site"
task :sync do
host = 'craigambrose.com'
path = '/var/www/apps/craigambrose/current'
db_config = YAML.load_file('config/database.yml')
system "ssh #{host} \"mysqldump -u #{db_config['production']["username"]} -p#{db_config['production']["password"] } -Q --add-drop-table -O add-locks=FALSE -O lock-tables=FALSE --ignore-table=#{db_config['production']["database"]}.sessions #{db_config['production']["database"]} > ~/dump.sql\""
system "rsync -az --progress #{host}:~/dump.sql ./db/production_data.sql"
system "mysql -u #{db_config['development']["username"]} -p#{db_config['development']["password"]} #{db_config['development']["database"]} < ./db/production_data.sql"
system "rsync -az --progress #{host}:#{path}/public/system/ ./public/system"
end
Installation
Rake tasks can go in any file ending in .rake inside your lib/tasks directory (within a rails application). You might put this in a file specific to your application, like craigambrose.rake. You can also namespace rake tasks by putting them inside a block like:
namespace :craigambrose do
# your tasks go here
end
With my task, you’ll need to set the value of two local variables, host and path. Or, perhaps you can think of a way to read those from your deploy.rb?
Usage
If you’ve namespaced it like I have, then it’ll be something like:
rake craigambrose:sync
Note that this uses a lot of shell commands, and basically assumes that you are developing on a unix box of some sort. You can delete the rsync for /public/system if you don’t store any files there.
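One weakness of the sync task is that the database settings are pasted straight into a shell string, so a password containing a space or a quote will break it. Here’s a hedged sketch of building the dump command with Ruby’s standard shellwords library instead. The method name and the sample settings are made up, and I’ve left out the two -O options for brevity.

```ruby
require 'shellwords'

# Build the mysqldump command for one environment's database settings.
# `config` is the kind of hash that YAML.load_file('config/database.yml')
# gives you for a single environment.
def mysqldump_command(config)
  [
    "mysqldump",
    "-u", config["username"],
    "-p#{config['password']}",
    "-Q", "--add-drop-table",
    "--ignore-table=#{config['database']}.sessions",
    config["database"]
  ].shelljoin   # escapes each argument for the shell and joins with spaces
end

# Made-up settings, with a password that would break naive interpolation.
mysqldump_command("username" => "craig", "password" => "s3 cret", "database" => "blog_production")
```

Since the command in my task is itself wrapped in quotes for ssh, you’d still need to think about that outer layer of quoting when plugging this in.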
Feed Fetcher: A plugin for finding RSS feeds from URLs
Posted by Craig Ambrose on March 27, 2007 at 08:52 AM
I’ve got a site with a few blog aggregation features, and when users type in their blog URL to add to the site, they don’t seem to know the difference between the blog html page URL and the blog RSS feed URL.
So, I wanted the site to work regardless of which one they used, by finding RSS resource links from the HTML page (if that’s what they enter). This is what Google Reader and all the other nice looking blog readers do.
It’s not earth shatteringly exciting code, but it’s good enough to re-use, and it’s fully tested, so I’ve made it into a plugin. Now there’s no excuse for offering a sloppy user experience for subscribing to feeds.
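To give a feel for what “finding RSS resource links from the HTML page” means, here is a deliberately simplified sketch. It is not the plugin’s implementation: the method name is made up, and it picks out link tags with regular expressions, which is fine for an illustration but not a substitute for a real HTML parser.

```ruby
require 'uri'

# Return the absolute URLs of any RSS or Atom feeds advertised by
# <link rel="alternate"> tags in the page.
def discover_feed_urls(html, page_url)
  html.scan(/<link\b[^>]*>/i).map { |tag|
    # Collect the tag's attributes into a hash with lower-case names.
    attrs = {}
    tag.scan(/([\w-]+)\s*=\s*["']([^"']*)["']/) { |name, value| attrs[name.downcase] = value }

    next unless attrs["rel"].to_s.downcase.split.include?("alternate")
    next unless attrs["type"].to_s =~ %r{\Aapplication/(rss|atom)\+xml\z}i
    next if attrs["href"].to_s.empty?

    # Feed links are often relative, so resolve them against the page URL.
    URI.join(page_url, attrs["href"]).to_s
  }.compact
end
```

If the user typed in the feed URL itself, a search like this finds nothing in the response, which is one reason to check whether the response is already a feed before looking for link tags.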
Installation
script/plugin install svn://rubyforge.org/var/svn/ambroseplugins/feed_fetcher
Usage
The main function that you’ll be using is the class method get_feed_source on the FeedFetcher class.
If successful, it’ll return a FeedSource object, that you can interrogate for the feed details or
the items in the current feed. If it fails, it’ll throw an exception that you can use to determine
the problem and return a message to the user.
Here’s the code that I use to call the feed fetcher from within my Blog model. If it works, I’m
setting a couple of instance members on self.
begin
result = FeedFetcher::FeedFetcher.get_feed_source(site_url)
if result
self.feed_url = result.url
self.title = result.title
end
rescue FeedFetcher::NoFeedForPageError
@feed_error = "Sorry, we couldn't find a feed for this URL. Your blog needs to have a RSS feed facility for us to use it on Playful Bent."
rescue FeedFetcher::PageFeedError
@feed_error = "Your blog has a RSS feed, which is great. However, it doesn't work for us right now, which is less great. Sorry, this won't work."
rescue FeedFetcher::PageDoesntExistError
@feed_error = "Are you sure you typed that right? We just tried to fetch that URL and we couldn't find anything there at all."
end
More Info
I keep a static page on this plugin here. You can check back there for the current information at any point.
In Support of Kathy Sierra
Posted by Craig Ambrose on March 26, 2007 at 10:05 PM
This is the first time I’ve written a post here that doesn’t have a chunk of code or a programming technique explained. However, it’s important enough to break my stride for a moment.
If you’re into rails, or part of the web2.0 crowd, then you are most certainly indebted to the work that Kathy Sierra has done.
Death Threats Against Bloggers
This shit is not ok.
Go have a read and offer support and encouragement.
Redbox Has Moved
Posted by Craig Ambrose on March 25, 2007 at 11:53 PM
I’ve been having svn troubles over the last few days, and people haven’t been able to download Redbox, or my Associated List plugin. I’ve moved them both to rubyforge, which should be much more reliable. New locations at:
Redbox
svn://rubyforge.org/var/svn/ambroseplugins/redbox
Associated List
svn://rubyforge.org/var/svn/ambroseplugins/associated_list
I’ll try and filter these through to all the appropriate places. In the meantime, you can update your plugin sources for a given project using:
script/plugin source svn://rubyforge.org/var/svn/ambroseplugins/
(from within a rails application base directory)
The Dangers of Removing Duplication
Posted by Craig Ambrose on March 16, 2007 at 08:32 PM
Now there is a controversial title.
I’m not trying to say that it’s dangerous to remove duplication in your code, merely that there are dangers that you might face when doing so. Duplication in code is a bad smell. It’s a reminder to us that we need to refactor it in some way in order to improve the design and make the duplication go away.
Sometimes, we get so caught up in the idea that removing the duplication is the only goal that we forget about the fact that this process is actually supposed to drive good object oriented design. Looking back at rails code that I’ve written over the last year, the place where I’m seeing this more than any other is in helpers.
Rails helper methods are places to extract logic from the view. They are tremendously handy, and a nice easy place to stick logic that appears in several different view templates, or contains so much conditional logic that it would make our rhtml templates unreadable. However, they are also just procedures. I find myself slipping into the trap of using procedural programming techniques to remove duplication here, rather than good design practices. Sometimes you can spot this because you find yourself starting to add a lot of arguments to your helper methods.
I’m going to share with you an example of some of my bad code. No programmer likes to do this, but I don’t feel too bad about it because I’ve just removed this method from a project, so I can at least say I did write this bad code, but now I’ve improved it.
Some Code in Need of Refactoring
def editable_section_if(condition, name, resource_name = nil, form_url = nil)
rollover_link_name = "rollover_link_#{name}"
display_div = "display_#{name}"
edit_div = "edit_#{name}"
partial_prefix = resource_name.nil? ? '' : "#{resource_name}/"
result = ""
if condition
result << "<div class=\"editable_text\" onmouseover=\"Element.show('#{rollover_link_name}')\" onmouseout=\"Element.hide('#{rollover_link_name}')\" id=\"#{display_div}\">"
result << '<div class="rollover_link_bar">'
result << link_to_function("Click to Edit", "Element.hide('#{display_div}'); Element.show('#{edit_div}')", {:class => "rollover_link", :id => rollover_link_name, :style => "display: none;"})
result << ' </div>'
result << render(:partial => "#{partial_prefix}display_#{name}")
result << '</div>'
result << "<div id=\"#{edit_div}\" style=\"display: none\">"
form_url = {:action => "update_#{name}", :id => params[:id]} if form_url.nil?
form_url.merge!(:controller => resource_name) unless resource_name.nil?
result << form_remote_tag(:url => form_url, :method => :put, :update => name, :loading => "Element.show('spinner_#{name}')", :loaded => "Element.hide('spinner_#{name}')", :html => {:class => 'inline'})
result << render(:partial => "#{partial_prefix}edit_#{name}")
result << "<div class=\"editable_form_buttons\">"
result << submit_tag("Save")
result << " or "
result << link_to_function("Cancel", "Element.hide('#{edit_div}'); Element.show('#{display_div}')")
result << "</div>"
result << '</form>'
result << '</div>'
else
result << render(:partial => "#{partial_prefix}display_#{name}")
end
result
end
About the Code
This is a helper method that generates an inline form, which remains hidden in the page. The div displaying the data is loaded from a partial, as are the contents of the form, and the edit and cancel links toggle between them with javascript. The submit button submits the form with ajax. The method also takes a condition that determines whether the data is editable at all; if not, it just displays the data, not the form.
This approach is fairly standard once you realise that you want inline editing of more than just one field, which is a bit more complex than the built-in rails inline editor helpers can handle.
So what are the bad smells here?
First up is the really heavy string manipulation. I find myself doing this in helpers a fair bit. I probably started writing this in a rhtml template and then moved it into a helper when I needed to re-use it. The ERb of a rhtml template is more suitable for something which is so html heavy. All these string append operations, and bits of html everywhere, make this method really hard to read. If I was missing a closing html tag, would I be able to easily tell? Do you find yourself writing helpers like this, or is it just me? Perhaps it should be a partial.
Secondly, the last two parameters were actually added later. They sound important, but they have default values of nil. They are actually an attempt to make this system compatible with REST. They don’t entirely succeed, and this is an ugly function to try and make work in this situation.
Thirdly, this method is too long. Some programmers believe that methods of this length are just fine. If you are one such, please don’t ever bother to send me your resume.
I have already refactored this code, and I like the way it’s coming along, so keep an eye out for another blog post soon where I’ll suggest some better ways of doing this.
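As a rough illustration of the first smell, here is a minimal sketch of the display half of the markup kept in a template instead of a chain of string appends. It uses plain Ruby and the standard erb library rather than a Rails partial, and the template, the method name editable_display and its arguments are all made up for this example. It is not the refactoring mentioned above.

```ruby
require 'erb'

# Hypothetical template holding the display half of the markup.
# With the html in one place, a missing closing tag is easy to spot.
DISPLAY_TEMPLATE = ERB.new(<<-HTML)
<div class="editable_text" id="display_<%= name %>">
  <div class="rollover_link_bar"><%= link %></div>
  <%= body %>
</div>
HTML

# name, link and body are illustrative arguments, not the original helper's API.
def editable_display(name, link, body)
  DISPLAY_TEMPLATE.result(binding)
end

puts editable_display('bio', '<a href="#">Click to Edit</a>', '<p>Hello</p>')
```

In a rails app the same idea would normally be a partial, with the helper reduced to choosing which partial to render.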
Pagination Using AJAX
Posted by Craig Ambrose on March 12, 2007 at 08:35 AM
The rails pagination helpers automatically give us a set of links to load different pages of a result set. Occasionally, you might not want to load these other pages of results as full page loads. There’s no reason why you can’t call an index action using AJAX, and have it respond with some javascript to update the list.
If you’re going to do this, you’ll need a different pagination helper. Since I’m such a nice guy, I’ve already written it for you.
The Code
def ajax_pagination_links(paginator, options={}, html_options={})
  name = options[:name] || ActionView::Helpers::PaginationHelper::DEFAULT_OPTIONS[:name]
  url_params = (options[:params] || ActionView::Helpers::PaginationHelper::DEFAULT_OPTIONS[:params]).clone

  links = pagination_links_each(paginator, options) do |n|
    url_params[name] = n
    link_to_remote(n.to_s, {:url => url_params, :method => :get, :loading => "Element.update('#{name}_loading_number', #{n}); Element.hide('#{name}_links'); Element.show('#{name}_loading')", :complete => "Element.show('#{name}_links'); Element.hide('#{name}_loading')"}, html_options)
  end
  loading_number = content_tag('span', '', :id => "#{name}_loading_number")
  loading_spinner = image_tag 'spinner.gif'
  loading = content_tag('div', "...loading page " + loading_number + ' ' + loading_spinner, :id => "#{name}_loading", :style => "display:none", :class => 'loading_text')
  links_tag = content_tag('div', links, :id => "#{name}_links")
  links_tag + loading
end
Usage
Simply whack that method in a relevant helper module, and use it as in the following example:
<%= ajax_pagination_links @message_pages, :name => 'message_pages' %>
You only really need to specify the name option if there is going to be more than one of these in the current DOM, in which case you would otherwise hit problems with multiple elements sharing the same ID.
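To see why the name matters, here is a small plain-Ruby sketch of the element ids that the helper derives from it. The method pagination_dom_ids is hypothetical and exists only for this illustration, and the default of 'page' is my assumption about what the rails pagination default is.

```ruby
# Illustration only: the ids the helper builds from the :name option.
# The default of 'page' is an assumption about the rails default.
def pagination_dom_ids(name = 'page')
  {
    :links          => "#{name}_links",
    :loading        => "#{name}_loading",
    :loading_number => "#{name}_loading_number"
  }
end

# Two paginators with the same name would share ids; distinct names do not.
puts pagination_dom_ids('message_pages').values.join(', ')
puts pagination_dom_ids('comment_pages').values.join(', ')
```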
Also, I’ve referenced a spinner image. You may want to provide that, or pull the image tag bit out of the code.
Finally, I’ve added classes to things if you want to style it. I think it looks good with:
div.loading_text
{
color: #666;
}
Loading Parts of a Page in the Background
Posted by Craig Ambrose on March 05, 2007 at 10:24 PM
Occasionally I find that one of my apps ends up with “the page from hell”. The design seems to require that a certain page pull in all sorts of information from different places, and perhaps some of it isn’t even viewed straight away, it’s just being loaded into the background so that the user can access it instantly.
I had such a page, which had a growing number of tab pages that I wanted to load without any delay at all. I was starting to get worried about the page load time, but perhaps even worse was the fact that the controller loading this page was becoming quite messy, since some of the tabs clearly involved other resources.
The following helper module might be of some use to you. It defines a helper method called call_remote_queue, which executes a sequence of queued ajax requests.
The Code
module RemoteQueueHelper
  def call_remote_queue(urls = [], spinner_id = nil)
    javascript_tag remote_queue_js(urls.reverse, spinner_id)
  end

  protected

  def remote_queue_js(urls, spinner_id)
    this_url = urls.pop
    this_options = remote_url_options(this_url)
    return nil unless this_url

    if urls.empty?
      this_options[:complete] = "Element.hide('#{spinner_id}')"
    else
      this_options[:complete] = remote_queue_js(urls, spinner_id) unless urls.empty?
    end
    remote_function(this_options)
  end

  def remote_url_options(url)
    if url.is_a? Hash
      result = url
    else
      result = {}
      result[:url] = url
      result[:method] = :get
    end
    result
  end
end
Usage
<%= call_remote_queue [user_stories_path(@user), user_dares_path(@user)], 'area_tabbar_spinner' %>
In my case, the urls in question used rjs templates to add the tab page to the list of tab pages. The second, optional parameter is the id of a spinner element to hide when the chain of loading is complete.
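The chaining works by nesting each request inside the completion callback of the one before it. Here is a plain-Ruby sketch of that recursion which runs outside rails. The fake_remote_function stand-in and the request(...) and hide(...) strings it produces are placeholders I made up; the real remote_function helper emits Prototype code whose exact text differs.

```ruby
# Stand-in for rails' remote_function, for illustration only.
def fake_remote_function(options)
  callback = options[:complete] ? ", then: function() { #{options[:complete]} }" : ""
  "request('#{options[:url]}'#{callback})"
end

# Each url's completion callback is the javascript for the rest of the queue;
# the last one hides the spinner instead.
def queue_js(urls, spinner_id)
  first, *rest = urls
  return nil unless first
  complete = rest.empty? ? "hide('#{spinner_id}')" : queue_js(rest, spinner_id)
  fake_remote_function(:url => first, :complete => complete)
end

puts queue_js(['/users/1/stories', '/users/1/dares'], 'spinner')
```

The second request only appears inside the first one's callback, so the requests run one after another rather than all at once.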
Example
This was used on the user profile page of an adult social networking site over at Playful Bent. This is an adult site, so use your own judgement as to whether to view it at work.
Here is a link to a profile which demonstrates this code, and doesn’t display any particularly naughty pictures to annoy your boss (but no guarantees about five minutes from now I’m afraid, it’s user content).
A Rake Task for Database Backups
Posted by Craig Ambrose on March 01, 2007 at 12:54 AM
I wrote this little rake task for a client yesterday, and it seems handy enough that it’s worth disseminating. It doesn’t, however, seem to justify a plugin as yet.
Before doing this, I did search through the lists of rails plugins to see if there was a solution for backing up databases. I was fairly surprised to find that there wasn’t. Yes, to some extent this is our hosting provider’s responsibility. However, they often charge to retrieve backups, and might not give us the fine-grained control that we are after.
To use this task, add the following code to a rakefile somewhere. I created a file called backup.rake, and put it inside /lib/tasks.
The Code
require 'fileutils'

namespace :db do
  desc "Backup the database to a file. Options: DIR=base_dir RAILS_ENV=production MAX=20"
  task :backup => [:environment] do
    datestamp = Time.now.strftime("%Y-%m-%d_%H-%M-%S")
    base_path = ENV["DIR"] || "db"
    backup_base = File.join(base_path, 'backup')
    backup_folder = File.join(backup_base, datestamp)
    backup_file = File.join(backup_folder, "#{RAILS_ENV}_dump.sql.gz")
    FileUtils.mkdir_p(backup_folder)

    db_config = ActiveRecord::Base.configurations[RAILS_ENV]
    sh "mysqldump -u #{db_config['username']} -p#{db_config['password']} -Q --add-drop-table -O add-locks=FALSE -O lock-tables=FALSE #{db_config['database']} | gzip -c > #{backup_file}"

    # Datestamped folder names sort chronologically, so the newest comes first.
    dir = Dir.new(backup_base)
    all_backups = (dir.entries - ['.', '..']).sort.reverse
    puts "Created backup: #{backup_file}"

    # Apply the default before converting: nil.to_i is 0, and 0 is truthy in ruby.
    max_backups = (ENV["MAX"] || 20).to_i
    unwanted_backups = all_backups[max_backups..-1] || []
    for unwanted_backup in unwanted_backups
      FileUtils.rm_rf(File.join(backup_base, unwanted_backup))
      puts "deleted #{unwanted_backup}"
    end
    puts "Deleted #{unwanted_backups.length} backups, #{all_backups.length - unwanted_backups.length} backups available"
  end
end
Usage
Here’s how the new backup rake task works. You call it with:
rake db:backup
Options
It has a few possible options. I’ll give an example first, then explain:
rake db:backup DIR=/home/sugarstats RAILS_ENV=production MAX=10
The DIR option specifies the base directory for the backups. I create a “backup” directory inside that, so you don’t need to include that in your path. This actually defaults to “db” inside the application, but you will want to override it with the option above.
RAILS_ENV should be set to production. It is probably development by default.
MAX is the number of backups to keep. It defaults to 20, which I think is a nice safe number. Once this number of backups is exceeded, older backups are deleted. There’s no way to turn this off, other than removing that bit of code.
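The pruning rule is easy to try out on its own. Below is a minimal sketch in plain Ruby, with no rake or database involved. The method names max_backups and backups_to_delete are invented for this example and are not part of the task.

```ruby
# Apply the default before converting: nil.to_i is 0, and 0 is truthy in ruby,
# so `env["MAX"].to_i || 20` would never fall back to 20.
def max_backups(env)
  (env["MAX"] || 20).to_i
end

# Folder names are datestamps (YYYY-MM-DD_HH-MM-SS), so a plain string sort
# is also a chronological sort. Keep the newest `max`, return the rest.
def backups_to_delete(names, max)
  names.sort.reverse[max..-1] || []
end

names = ['2007-02-27_01-00-00', '2007-03-01_01-00-00', '2007-02-28_01-00-00']
puts backups_to_delete(names, 2).inspect
```

With three backups and a limit of two, only the oldest folder is returned for deletion.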
In Production
You’ll want to call this from cron. Like all rake tasks, it expects to be run from the application directory, so the cron task needs to change the directory first. I’ve found that "cd /app_dir && rake db:backup" works well, but you should test that your cron task is working. I’m always getting cron tasks wrong.
Credits
This code was paid for by Marston Alfred, over at Sugarstats and is used in production for that site. Sugarstats is a website to help diabetics manage their sugar levels, activities, and medication, and is well worth a look (even if I do say so myself).
Web 2.0 and Spam
Posted by Craig Ambrose on February 07, 2007 at 04:00 PM
Everyone is creating a web2.0 startup at the moment, fueled with all sorts of exciting ideas about AJAX, active white space, long tail business plans, new styles of collaborative marketing…
oh, and they’re sending people spam
I’ve received several pieces of spam from companies recently about what look like really good web2 products. The first time, I felt sympathetic, and told them that it was spam, and pointed out how this clashed with the type of marketing that they wanted to use. I’ve heard responses like “oh, it’s not spam because our product is really good and you’re going to love it”. WTF?
An email is spam if it:
- Is unsolicited
- Is sent in bulk
Furthermore, bulk emails, even if solicited, need to have an automated unsubscribe option.
That’s a pretty widespread definition. It’s also (in simple words) basically the definition used by the Australian anti-spam legislation. I can’t speak for other countries (New Zealand, where I am now, appears to have no such legislation).
The email I got this week, that prompted this post, was from stickytag.com. It’s a website for collaborative writing and moving around of sticky notes. Sounds web2.0 of course, and they even say “While we followed many of the design and development principles of the Web 2.0 buzz…”
And yet they sent me an email that is clearly not personally addressed to me, is sent to a range of undisclosed recipients, contains no unsubscribe information, contains a massive image file and a great wad of marketing-speak, and they presume that this will make me think well of them? Interestingly, its title contains the words “Blogger Release”.
This clue would tend to indicate that they think that by being a blogger, I have thus solicited their email, given that its subject is tenuously related to what I write about.
WTF? That’s like saying that by writing a poem about life after death I’m asking to be contacted by religious evangelists.
Anyway, I’ll stop ranting. The two lessons learned here are:
1. If you’re going to play the web2.0 game, don’t fuck it up with something like this. Market the new way, by teaching, not by abusing your users. Go read Kathy Sierra and Seth Godin and Signal vs Noise and come back when you’ve figured it out.
2. Don’t use Sticky Tag (.com). We can’t afford to tolerate web businesses who behave as badly as Viagra salesmen, because it will eventually make all of us look bad. Boycott businesses which spam you. There will be another ten people with the same product within a few months anyway.
Freelancing on Rails
Posted by Craig Ambrose on December 12, 2006 at 03:05 AM
I’ve just released the second episode of my podcast, with ideas and discussion for freelance computer programmers (particularly ruby on rails ones). I figure two episodes on schedule means that it’s serious enough that it’s worth me blogging about it.
If you haven’t already had a listen, have a look over at http://www.craigambrose.com/podcasts for all the details. The first episode was a bit longer, but I’ve settled into a ten minute format, with a fairly snappy coverage of a particular topic each episode.
Scalable Rails Deployment: Part 3, Automation
Posted by Craig Ambrose on December 02, 2006 at 01:09 AM
This is part 3 of an ongoing series of articles on Ruby on Rails deployment. It is useful by itself, or you can start reading from Part 1.
Doing Things the Easy Way
In previous articles, I’ve walked through the steps involved in installing the rails stack on a new server or VPS. After learning how to do this, and realising that this was something that I would be needing to do a lot, it became time to automate the process. Luckily for me, I have clever friends, so I’ve latched onto the new deprec gem from Mike Bailey. Deprec, or Deployment Recipes, is a library of capistrano tasks for building a new server or VPS from scratch, assuming (for the moment) that it’s a stock Ubuntu server installation.
Words cannot convey how amazingly cool this is. I can now build myself a new VPS in ten minutes, with only a few lines of shell script executed on my local machine. Plus, since I’m using Slice Host, they provide a nice little web interface for destroying and re-creating my VPS, so I’ve been merrily destroying it and rebuilding it with much glee, just because I can.
Slicehost Specific Setup
There is one step to this setup process that is specific to slicehost. This is because slicehost isn’t quite a stock ubuntu installation. Normally, ubuntu server installs with no root account. Instead, you create a user account during setup, which is given sudo access. Slicehost has chosen instead to deploy what is otherwise a normal ubuntu server installation, except that you have a root account, and no user with sudo. This is fine, but to make it work with deprec, we have to create that non-root user ourselves.
So, ssh into your new slicehost VPS as root, and create a user as follows. If your name isn’t craig, you’ll want to pick something else of course.
useradd -m craig
passwd craig
# you will be prompted for craig's new password, type it now
visudo
# add craig to sudoers list, described below
The -m in the useradd command above ensures that the new user will be given a home directory. The visudo command opens the sudoers list in an editor (pico by default) and you can just copy the line that’s already in there, using the new username. For example, mine looks like this: “craig ALL=(ALL) ALL”.
Now, we have a user as normal, and we can build our VPS with deprec.
Installing deprec
Deprec is a gem that you install on your development machine. Once installed, the deprec_dotfiles command creates a .caprc config file in your home directory, and patches capistrano to load this config file. Future versions of capistrano will include this change. This means that you can run some cap commands outside the context of a particular rails app. If you don’t want to do this, you don’t have to, but all the commands below will have to be executed from the root of a rails application on your development machine. The .caprc file also gives you the convenience of being able to add any capistrano helper tasks that you might find useful.
sudo gem install deprec -y # installs what you need
deprec_dotfiles # patches capistrano + creates your ~/.caprc
cap show_tasks # now you have deprec tasks included
Using deprec
From the deprec usage instructions, found here:
cd /path/to/railsapp
deprec --apply-to . --name projectname --domain www.projectname.com
# edit config/deploy.rb to put in details of your subversion repository
cap setup_ssh_keys # copy out your public keys
cap install_rails_stack
cap setup
cap deploy_with_migrations
cap restart_apache
It really is that simple. Note that you use the "deprec --apply-to" command, instead of the usual "cap --apply-to". This actually does the same thing, but it creates a deploy.rb file in your rails app that’s got a few more configuration options in it that capistrano knows about. If you already use capistrano for your app, be sure to rename your existing deploy.rb file first or deprec will stomp on it.
The rest of the cap tasks above do all the things that I’ve described in this series of articles. You end up with a properly setup server, using your ssh keys for login, and your application deployed and running using mongrel and apache load balancing. I have my apps up and running using this code with no extra effort, and I’m very pleased with the performance.
At the moment, if you have other dependencies, such as rmagick, you’ll have to install these yourself.
Coming Soon in Deprec
I’ve been helping out a bit with the deprec source code, and I’m pretty pleased with the way it’s coming along. The next release of the gem reduces the number of commands needed in the above description, and also includes tasks to automate installing rmagick, to automate the slicehost step above, and to install postfix for handling SMTP email.
The next release of deprec (1.2) will also be fully re-runnable, so that you’ll be able to run it over the top of your existing deprec installed site as many times as you want, to keep your config up to date with the latest best practices.
Questions about deprec should be directed to Mike Bailey, over at his blog: http://codemode.blogspot.com
Oh, btw, here’s my Technorati Profile.
Copying Files Between Models With file_column
Posted by Craig Ambrose on November 28, 2006 at 10:12 PM
I’m using the file_column plugin a lot. It lets you treat a file as an attribute on one of your models, and all you need to do to store it is provide a string column in your database table.
A lot of people have expressed concern that file_column doesn’t seem to have been worked on by its author at all this year. This may be so, but don’t let that deter you: it is still currently the simplest and best way to handle images in rails applications.
One of the first snags that I’ve hit with file_column so far took me a few hours to nut out today, so I’d like to share it with you, gentle reader, in the hope that it might save you the same effort. I had a model object, which contained a file_column (and thus, potentially, a file), and I wanted to clone it. As a result of this, I wanted the new object to have its own separate copy of the file.
If you just naively copy the file_column (in my case, called “image”, on a model called “gift”), then file_column warns you that you’re just giving it a string, and thus you probably forgot a multipart declaration on your form. As you can see, file_column is pretty focussed on uploading files, not copying them between models.
Enough waffle, let’s jump straight to my final clone method:
def clone
result = Gift.new(:name => name, :price => price, :description => description)
unless image.nil?
result.image = File.open(image)
FileUtils.cp(image, result.image)
end
result
end
The main bit of magic here is that I’m passing file_column (on the new object) a file object, rather than just a string. The file_column plugin has some basic support for this, but it’s pretty basic indeed. It seems that it can manage to copy the file out of the temp location into the final destination when I actually save the new model (not done inside clone), but it doesn’t copy it into the temp dir correctly (that normally happens on update). So, I put the copy in myself, and it all works.
You may not want to duplicate your images on clone, but obviously this same technique is useful for any copying of file_column files between models, of the same or different model class.
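Stripped of file_column itself, the underlying idea is just that the copy gets its own file on disk rather than a second reference to the same path. Here is a minimal sketch of that in plain Ruby. PlainGift, copy_with_file and the dest_dir argument are stand-ins invented for this example; a real Gift would be an ActiveRecord model and file_column would choose the destination.

```ruby
require 'fileutils'

# Stand-in for the model, for illustration only.
PlainGift = Struct.new(:name, :image)

# Returns a new object whose image path points at a separate copy of the file.
def copy_with_file(gift, dest_dir)
  copy = PlainGift.new(gift.name, nil)
  unless gift.image.nil?
    copy.image = File.join(dest_dir, File.basename(gift.image))
    FileUtils.cp(gift.image, copy.image)
  end
  copy
end
```

Deleting or replacing the original’s file then leaves the copy untouched, which is the property the clone method above is after.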
When I get some “free” time (heheh), I’d like to do a bit of work on file_column and fix this so that the hack isn’t needed, and perhaps look at some other design improvements. I’ve been through just about every line of its code this year on various projects, and while overall I’m greatly impressed, I think there’s more work that could be done to improve it.
Scalable Rails Deployment: Intermission
Posted by Craig Ambrose on November 23, 2006 at 09:54 PM
Welcome back. This series is going through the process of setting up a scalable production VPS for a ruby on rails app, from a non-sysadmin perspective. The first post is here.
Since my last post, I’ve been doing a number of different things with my server in an attempt to find the best configuration and share it with you. Before I move forward with the specifics of that, I wanted to take a brief moment to share some of the things I’ve learnt while experimenting with this.
Load Balancing is Fun
I haven’t talked about it much yet, but you probably already know that for serious rails deployment, we’re talking about multiple instances of the Mongrel web-server, and some sort of software load balancer to pass the HTTP requests through to them.
I’ve tried out two different load balancers on my ‘experimentation server’. The first was Pound, which was nice because its configuration file really is just so damn simple (mainly because it does nothing but load balance). On the downside, the Ubuntu packages for it were too old to actually work nicely, so I had to compile it. Then, I had to ensure that it restarted on reboot (I wrote my own init.d script). This isn’t a bad solution, but Pound doesn’t serve static content, and Mongrel isn’t well suited to it either, so it typically requires the installation of another web server as well, such as Apache or Lighttpd.
I also tried out using Apache2, with mod_proxy_balancer. At this point, I have a confession to make:
I was scared of apache configuration files.
There, I’ve said it. I’ve used apache for years, for static sites as well as PHP, and Rails, and it’s always worked fine. However, I always felt that I needed to stay ‘on the beaten path’ of the config that my linux distro provided for me, and look closely at other people’s recipes before considering anything but the simplest change to my apache configuration.
Now you may well think that I’m a bit silly, but I’m well pleased to now be free of this notion. Apache configuration is very straightforward. I think it’s made a bit complicated by distros including config files full of default settings which I don’t really need. Standard configuration of apache is simple, virtual hosts are simple, and mod_proxy_balancer is also straightforward to configure and seems to be rock solid to use.
So, I’m going to stick with the apache option, and in my next post on this subject, I’ll be talking about that setup in more detail.
RAM is Your Biggest Cost for Small Sites
VPS prices are heavily dependent on how much RAM you get. This is because servers can only have so much RAM, and so the amount of available RAM tends to determine how many VPSs can fit on a single physical server.
I’m told, by people in the know, that this year we’re starting to see physical serv
