Image management that will scale

Posted by Craig Ambrose on November 27, 2007 at 02:59 AM

There is a lot of conflicting information around about handling user-uploaded images in Rails applications. I’ve done it a number of different ways, and the good news is that it’s not too hard to move from one system to another. However, dealing with scaling issues is a pain, and it’s nice to get it right the first time. So, here are some problems that I’ve encountered recently, along with some solutions.

Files Per Directory Limit

Depending on which OS you use for hosting, you’ve probably got a limit to the number of files (or directories) you can put inside a given directory. It’s usually about 32,000. While this seems like a long way off, if your site accepts user content then hopefully this will eventually become a problem for you. There have been various talks and articles written about different hashing systems for file names, but it’s worth mentioning that this is basically a solved problem, and you shouldn’t have to tackle it yourself.

If you’re still using file_column, as I am for a few things, then this one might bite you. The simplest solution, I think, is to migrate to attachment_fu. The file system store for attachment_fu implements file-name-based hashing, and the S3 and database stores don’t suffer from the problem at all. Also, because attachment_fu handles storage through pluggable classes, you could slip in your own custom storage system later without having to change the way that you use attachment_fu in your models.
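To make the idea concrete, here is a minimal sketch of one way to spread files across nested directories, keyed on a numeric record id. The helper name partitioned_path and the 8-digit, two-level layout are assumptions chosen for illustration; this is not a copy of attachment_fu's code.

```ruby
# Hypothetical helper: map a numeric record id to two levels of
# directories, so no directory ever holds more than 10,000 subdirectories.
def partitioned_path(id, filename)
  # The 8-digit layout only works for ids below 100,000,000.
  raise ArgumentError, "id out of range" unless (0...100_000_000).cover?(id)

  # Zero-pad to 8 digits, then split into two 4-digit directory names.
  parts = format("%08d", id).scan(/\d{4}/)
  File.join(*parts, filename)
end

puts partitioned_path(123, "avatar.png")    # 0000/0123/avatar.png
puts partitioned_path(45_678, "photo.jpg")  # 0004/5678/photo.jpg
```

With this layout, the 32,000-entry limit is never approached at any level, at the cost of one extra directory lookup per file.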

If you’re thinking of making the switch, here’s an article I wrote on migrating from file_column to attachment_fu.

RMagick Memory Leaks

RMagick is really handy, and so just about every Rails image handling tutorial on the internet recommends its use. I’m using it all over the place. My advice to you is: don’t ever do this. It turns out that RMagick leaks memory every time it manipulates an image. I haven’t measured the amount myself, but I’m told it’s quite a bit. Certainly I’ve been having resource consumption problems with scripts that use RMagick heavily. So, say goodbye to it.

DHH recommended just using the ImageMagick binaries manually. That’s basically a good idea, but a slightly easier way of doing that is to use the mini_magick gem. mini_magick provides a Ruby API, but under the hood it just calls the ImageMagick command line tools. attachment_fu comes with a mini_magick processor, so you can just add “:processor => :MiniMagick” to your call to has_attachment and you’re in business. Khamsouk Souvanlasy wrote a good tutorial on using mini_magick with acts_as_attachment.
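For a sense of what calling the binaries manually involves, here is a rough sketch that shells out to ImageMagick's convert tool with its -resize option. The helper names resize_command and resize_image are made up for this example, and it assumes convert is installed and on the PATH; it is not how mini_magick itself is implemented.

```ruby
# Hypothetical helper: build the argument list for ImageMagick's `convert`.
# Returning an array (rather than one string) lets `system` run the tool
# without a shell, so odd characters in file names are not interpreted.
def resize_command(source, destination, geometry)
  ["convert", source, "-resize", geometry, destination]
end

# Hypothetical helper: run the command in a separate process.
# Returns true on success, false or nil if convert fails or is missing.
def resize_image(source, destination, geometry)
  system(*resize_command(source, destination, geometry))
end

p resize_command("in.jpg", "thumb.jpg", "100x100")
```

Because the work happens in a separate short-lived process, any memory it uses is handed back to the OS when that process exits, which is the appeal of this approach over an in-process library.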

Cropping

The one thing I noticed in using attachment_fu instead of file_column is that file_column resizes images and crops them nicely, the way that you would expect. By default, attachment_fu tends to stretch them. This has been covered better by other people, so I just want to mention it because cropping is almost certainly what you want, and until Rick fixes it, I’d suggest making a small change to the plugin yourself. There are a number of articles on the subject, but I think the best one is probably over at toolman tim’s blog.

Don’t just go with Tim’s solution though; have a look at the comments, and you will find options for the different image processors. I used “labrat’s” suggested fix for mini_magick (paste here).
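The arithmetic behind crop-instead-of-stretch is simple enough to sketch. The helper below is hypothetical (it is not Tim's or labrat's patch): it scales the image until it covers the target box, then works out a centered crop, and returns both as ImageMagick-style geometry strings.

```ruby
# Hypothetical helper: given source and target sizes in pixels, return the
# size to scale to and the geometry of a centered crop.
def fill_and_crop(src_w, src_h, dst_w, dst_h)
  # Use the larger ratio so the scaled image covers the whole target.
  # Rational keeps the arithmetic exact (no floating point rounding).
  scale = [Rational(dst_w, src_w), Rational(dst_h, src_h)].max
  scaled_w = (src_w * scale).ceil
  scaled_h = (src_h * scale).ceil
  {
    resize: "#{scaled_w}x#{scaled_h}",
    crop: "#{dst_w}x#{dst_h}+#{(scaled_w - dst_w) / 2}+#{(scaled_h - dst_h) / 2}"
  }
end

p fill_and_crop(800, 600, 100, 100)
```

For an 800x600 source and a 100x100 thumbnail, this scales to 134x100 and crops with an x offset of 17, trimming the sides rather than squashing the picture.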

Amazon S3

Amazon S3 appears to be a great solution for handling user generated images, and I’m starting to use it a fair bit. One word of warning, however: I’ve already started to encounter an occasional communication error with Amazon, as discussed in this thread, and I don’t yet know how serious it is or how easily fixed. I’ll post some more on this subject when I’m better informed.

Comments

Hampton Catlin just posted some attachment_fu gotchas, which include a few possible fixes to Amazon S3 problems. Might help.

The URL to the post, stripped from my last comment, is:
http://hamptoncatlin.com/2007/attachment_fu-gotchas

Tekin

I had to stick with file_column because I was making heavy use of the dynamic resizing url_for_image_column feature, but came a cropper to the per-directory limit problem. Here’s my quick and dirty patch to do attachment_fu style organising of files to overcome this.

http://pastie.caboo.se/122479

Timothy Hunter

A lot of the behavior people call memory leaks in RMagick is addressed by this note: http://rubyforge.org/forum/forum.php?thread_id=1374&forum_id=1618, which explains how to mitigate the problem by running GC.start occasionally.

With those tips in mind, if you know of any memory leaks in RMagick I’d really love a chance to fix them. Please send me a script that reproduces the leak using only Ruby and RMagick. No Rails, attachment_fu, scruffy, Gruff, etc. You can send the script directly to rmagick AT rubyforge DOT org or open a bug in the RMagick bug tracker on RubyForge.

If you can’t reproduce these leaks in such a script, please let us know that, too. Thanks!
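As an illustration only (this is not taken from the linked note), running GC.start occasionally inside a batch loop might look like the sketch below. The helper name each_with_periodic_gc and the interval of 10 are assumptions, and the block body stands in for whatever RMagick work is being done.

```ruby
# Hypothetical helper: yield each item and force a garbage collection
# after every `every` items. Returns how many times GC.start was called.
def each_with_periodic_gc(items, every: 10)
  gc_runs = 0
  items.each_with_index do |item, index|
    yield item
    if ((index + 1) % every).zero?
      GC.start
      gc_runs += 1
    end
  end
  gc_runs
end

# Stand-in workload: 25 items, collecting after the 10th and 20th.
runs = each_with_periodic_gc((1..25).to_a, every: 10) { |n| n * 2 }
puts runs
```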

Hi Tim,

My information on memory leaks in RMagick is mostly second-hand, such as recommendations from my sysadmins. From my end, all I’m seeing is that RAM is the critical resource for my app, and that I have much more trouble managing it effectively than I do CPU cycles. The problem that you’re referring to in your thread above is the main problem that I’m concerned about. In my case, however, I’m not dealing with nice simple scripts that manipulate a bunch of images in a loop.

For example, if we use RMagick to perform a few transformations on uploaded images in the Mongrel process, at what point does Mongrel garbage collect the memory being used? Adding even 10 MB of memory usage to my Mongrels on average would be quite bad.

I like RMagick, and certainly fixing any memory usage problems seems like a good idea, and one that people (myself included) should help out with, but I also found that switching to mini_magick added pretty solid predictability to the process of manipulating images and didn’t really have any downsides. Having said that, I’ve tried, in the past, some fairly complex things with RMagick, and I’m yet to do the same with mini_magick, so I’m not sure how that will go.

And, thanks for the reply Tim, always good to get informed views on the subject.
