A Rake Task for Database Backups

Posted by Craig Ambrose on March 01, 2007 at 12:54 AM

I wrote this little rake task for a client yesterday, and it seems handy enough that it’s worth disseminating. It doesn’t, however, seem to justify a plugin as yet.

Before doing this, I did search through the lists of rails plugins to see if there was a solution for backup up databases. I was fairly surprised to find that there wasn’t. Yes, to some extent this is our hosting provider’s responsibility. However, they often charge to retrieve backups, and might not give us the fine grained control that we are after.

To use this task, add the following code to a rakefile somewhere. I created a file called backup.rake, and put it inside /lib/tasks.

The Code


require 'find'namespace :db do desc "Backup the database to a file. Options: DIR=base_dir RAILS_ENV=production MAX=20"
task :backup => [:environment] do
datestamp = Time.now.strftime("%Y-%m-%d_%H-%M-%S") base_path = ENV["DIR"] || "db"
backup_base = File.join(base_path, 'backup')
backup_folder = File.join(backup_base, datestamp)
backup_file = File.join(backup_folder, "#{RAILS_ENV}_dump.sql.gz") File.makedirs(backup_folder) db_config = ActiveRecord::Base.configurations[RAILS_ENV] sh "mysqldump -u #{db_config['username']} -p#{db_config['password']} -Q —add-drop-table -O add-locks=FALSE -O lock-tables=FALSE #{db_config['database']} | gzip -c > #{backup_file}" dir = Dir.new(backup_base)
all_backups = dir.entries[2..-1].sort.reverse
puts "Created backup: #{backup_file}" max_backups = ENV["MAX"].to_i || 20
unwanted_backups = all_backups[max_backups..-1] || []
for unwanted_backup in unwanted_backups
FileUtils.rm_rf(File.join(backup_base, unwanted_backup))
puts "deleted #{unwanted_backup}"
end
puts "Deleted #{unwanted_backups.length} backups, #{all_backups.length - unwanted_backups.length} backups available"
endend

Usage

Here’s how the new backup rake task works. You call it with:

rake db:backup

Options

It has a few possible options, I’ll give an example first, then explain:

rake db:backup DIR=/home/sugarstats RAILS_ENV=production MAX=10

The DIR option specifies the base directory for the backups. I create a “backup” directory inside that, so you don’t need to include that on your path. This actually defaults to “db” inside the application, but you will want to overwrite it with the option above.

RAILS_ENV should be set to production. It is probably development by default.

MAX is the number of backups to keep. It defaults to 20, which I think is a nice safe number. Once this number of backups is exceeded, older backups are deleted. There’s no way to turn this off, other than removing that bit of code.

In Production

You’ll want to call this from cron. Like all rake tasks, it’s expecting to be run from the application directory, so the cron task needs to change the directory first. I’ve found that "cd /app_dir && rake db:migrate" works well, but you should test your cron task is working. I’m always getting cron tasks wrong.

Credits

This code was paid for by Marston Alfred, over at Sugarstats and is used in production for that site. Sugarstats is a website to help diabetics manage their sugar levels, activities, and medication, and is well worth a look (even if I do say so myself).

Tags: (none)
Hierarchy: previous, next

Comments

There are 23 comments on this post. Post yours →

Ben Askins

Nice one Craig – I have a need for this on a client site who’s using a windows production server. I’ll let you know I go with it.

Works like a charm :-)

Excellent work fella… using this with cron could one day be a lifesaver! Here’s a similar idea for backing up Rails data and then uploading it to Amazon S3.

The line:
sets max_backups to 0 if you don’t supply MAX because
This caused the script to delete all backups.

>> nil.to_i
=> 0

Changing it to:

max_backups = ENV["MAX"] || 20
unwanted_backups = all_backups[max_backups.to_i..-1] || []

seems to work.

Other than that, this works great! Thanks!

This might be obvious but it took me a couple minutes to find out. In the script posted the code isn’t formatted quite right and its missing some returns. I had to go through and format the code correctly before rake would recognize it. Just a tip for anyone running into syntax errors.

Another formatting bit worth noting:

In the line:

sh "mysqldump -u #{db_config['username']} -p#{db_config['password']} -Q —add-drop-table -O add-locks=FALSE -O lock-tables=FALSE #{db_config['database']} | gzip -c > #{backup_file}"

It seems that Wordpress or whatever this blog is running on has converted two hyphens into an em dash. Nice for text, not so nice for code.

Just convert them back into double dashes, and stuff works.

Thanks for this post!

:D

Craig

Well spotted Patrick. The blog is running on SImplelog. I’ll see if I can find a way to turn that “feature” off. :)

rense Vanderhoek

Great script. I added -h#{db_config[‘host’]} to the sh-line :)

I have detected a little bug with this task and it’s that nil.to_i returns 0 (at least in my ruby version, 1.8.5), so the expression ENV[“MAX”] || 20 will always return 0 in case no MAX value was given and that causes that all backups are deleted. I changed it to max_backups = (ENV[“MAX”] || 20).to_i and now it works fine (I know that converting an integer to an integer again is quite redundant)

Well spotted dagi3d and Tom, I’ll have to pop that into the script. :)

I’ve spent a little time tweaking this script so it plays nicely.

It includes the above mentioned fixes for the backup delete problem, the missing carriage returns and the double dash issue.

I’ve used—opt instead of the original mysqlbackup command line options, which should automatically choose the optimal dump options.

I’ve also put in a little hack to remove the -p option in case you’re testing locally on a user account with no password – the default MySQL set-up.

http://pastie.caboo.se/95708

Windows users go here to get gzip, without which the script will fail. You will need to add the gzip directory (probably C:\Program Files\GnuWin32\bin) to your system path, then re-open your shell.

http://sourceforge.net/project/showfiles.php?group_id=23617

MySQL ONLY

Is there a database agnostic way to do this?

Good point Alex. I believe such a thing would be possible. Possibly as a two part process using the existing rake tasks to dump the schema, and then ANSI compliant SQL for reading and writing data.

However, frankly, backup is probably best served by your database specific tools. I’m not advocating mysql specifically, if I was using postgres then I probably would have backed up via a postgres specific method.

piyush

when i try this code i got the syntax error as unexpected tIDENTIFIER, expecting $end

solution ?...

piyush, the problem is at the end:

endend

Put a carriage return between the two ends.

Also, read the above comments for other bugs and a more robust code download.

Script works like a charm, but another delete problem is still present:

all_backups = dir.entries[2..-1].sort.reverse

The above code removes ’.’ and ’..’ from the entries list… However on my system, for some reason, these entries don’t appear as 1st and 2nd items, but somewhere in the middle of the array! (So it would delete all your backups and possibly more)
Instead use this code:

all_backups = (dir.entries – [’.’, ’..’]).sort.reverse

Furthermore, although nobody will be stupid enough to run anything like this as root, it’s still not quite elegant to use rm_rf (although short in code)... It’s better to iterate the dirs under backup_base, iterate the files under each of these dirs and delete them individually (and also the dirs containing them)..

Finally: you delete the directories containing the *.sql.gz.. but what if you want to keep different counts of backups for different environments? (i.e. perhaps only delete files for the current RAILS_ENV, and delete backupdirs only if empty)

Slight typo in your blog post.

“I’ve found that “cd /app_dir && rake db:migrate” works well, but you should test your cron task is working.”

I think you mean

cd /app_dir && rake db:backup

:)

This doesn’t work for me … I used the code from:

http://pastie.caboo.se/95708

and get this result:

$ rake db:backup
(in …)
*
* config.breakpoint_server has been deprecated and has no effect. *
*

mysqldump -u —opt | gzip -c > db/backup/2008-04-20_21-44-26/development_dump.sql.gz
Created backup: db/backup/2008-04-20_21-44-26/development_dump.sql.gz
Deleted 0 backups, 1 backups available

Actually, already for some time now, I am trying to figure out if and how the
ActiveRecord::Base.configurations[RAILS_ENV]
is or should work in Rails 2.0.2 …

E.g. with RAILS_ENV equal to ‘development’ I see this:

$ ./script/console
Loading development environment (Rails 2.0.2)
*
* config.breakpoint_server has been deprecated and has no effect. *
*

>> db_config = ActiveRecord::Base.configurations[RAILS_ENV]
=> “backup_production”
>> db_config[‘username’].to_s
=> ””
>> RAILS_ENV
=> “development”
>> db_config[‘database’].to_s
=> ””

Is it possible that the method
ActiveRecord::Base.configurations
is undocumented or changed recently?

Thanks,

Peter Vandenabeele
peter AT vandenabeele DOT com

Ah, I found it … I used this alias in config/database.yml

development: backup_production
backup_production:
adapter: mysql
...

and as a result this function:

ActiveRecord::Base.configurations[RAILS_ENV]

did not return the expected hash of database connection parameters, but only the name of the alias (“backup_production”).

If change my config/database.yml to

development:
adapter: mysql
...

it works as expected. Could this be a problem in the way the aliasing in the database.yml file is interpreted ?

Thanks,

Peter

Original script had quite a few problems for me… I made it work (great):

http://pastie.caboo.se/199267

require ‘find’

namespace :db do

desc "Backup the database to a file. Options: DIR=base_dir RAILS_ENV=production MAX=20"
task :backup => [:environment] do
datestamp = Time.now.strftime("%Y-%m-%d_%H-%M-%S")
base_path = ENV["DIR"] || "db"
backup_base = File.join(base_path, 'backup')
backup_folder = File.join(backup_base, datestamp)
backup_file = File.join(backup_folder, "#{RAILS_ENV}_dump.sql.gz")
FileUtils.mkdir_p(backup_folder)
db_config = ActiveRecord::Base.configurations[RAILS_ENV]
pass = ''
pass = '-p' + db_config['password'] if db_config['password']
sh "mysqldump -u #{db_config['username']} #{pass} #{db_config['database']} -Q —add-drop-table -O add-locks=FALSE -O lock-tables=FALSE | gzip -c > #{backup_file}"
dir = Dir.new(backup_base)
all_backups = dir.entries[2..-1].sort.reverse
puts "Created backup: #{backup_file}"
max_backups = (ENV["MAX"] || 20).to_i
puts max_backups
unwanted_backups = all_backups[max_backups..-1] || []
for unwanted_backup in unwanted_backups
FileUtils.rm_rf(File.join(backup_base, unwanted_backup))
puts "deleted #{unwanted_backup}"
end
puts "Deleted #{unwanted_backups.length} backups, #{all_backups.length - unwanted_backups.length} backups available"
end

end

swards

I was getting an error:

rake aborted!
private method `makedirs’ called for File:Class

So I added require ‘ftools’

It looks like this worked. More info on ftools is here:

http://www.ruby-doc.org/stdlib/libdoc/ftools/rdoc/classes/File.html

Why is require ‘find’ here?

I seem to have successfully created a backup. Thanks for this code. But I will not know how successful it was until I’ve restored from the backup… so…

Any tips on how to restore? It appears that Rails only supports decompressing strings, and I know I have binary data in my tables. It’s not clear from the mysql manual that mysqlimport will handle the output from mysqldump. I’m out of my depth if shell commands are required. How have others done this?

kevin

-O,—set-variable=name
Change the value of a variable. Please note that this option is deprecated; you can set variables directly with:
—variable-name=value.

Post a comment

Required fields in bold.