I have a 5G mysqldump which takes 30 minutes to restore from backup.  That means that when the database reaches 50G, it should take 30×10=5 hours to restore.  Right?  Wrong.

Mysqldump recovery time is not linear: bigger tables, and tables with more indexes, take disproportionately longer to restore.

If I restore from a raw backup (LVM snapshot, xtrabackup, InnoDB hot backup), it is very easy to model how much longer recovery will take: the work is mostly copying the data files back into place, so the time scales roughly linearly with the size of the data.

I can tell progress with mysqldump by monitoring the rate at which SHOW GLOBAL STATUS LIKE 'Handler_write'; increases, and comparing it to my knowledge of roughly how many rows are in each table.  But progress != a magic number like “17 minutes”.  Not unless I do a lot of complex modeling.
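A minimal sketch of that kind of monitoring, assuming a local mysql client with credentials already configured (the 10-second interval is arbitrary):

# Sample Handler_write while the restore runs; the difference between two
# samples is the approximate number of rows inserted per interval.
while true; do
  mysql -N -B -e "SHOW GLOBAL STATUS LIKE 'Handler_write'"
  sleep 10
done

Dividing the rows still to be loaded by that rate gives a rough per-table estimate, which is exactly the kind of modeling described above.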

I am not saying a 5 hour recovery is good or bad.  What I am saying is that knowing the remaining time is very important during disaster recovery.  Being able to say “we’ll be back at 2PM” is much better than saying “we’ll be back between 1PM and 4PM… maybe”.

Comments
Gwen Shapira

Responsible DBAs test restores from their backups on a regular basis.
Say, every 2 months. During such a check, it is useful to record how long the restore took, for exactly this purpose. When management asks “how much longer will this take?” I can look at my charts and say “20 more minutes, captain”.

It is still good to be able to calculate the remaining time, but maybe not as critical.

Nikolay

I have a huge MySQL server with lots of MyISAM/Maria databases. Can someone elaborate on how we can do a non-blocking backup, even if we end up backing up some dirty (mid-transaction) data?

Gerry

I wouldn’t favor depending solely on mysqldump for backups unless you have some other HA considerations in place (e.g. master-master). That way you can work on the restore while the service is still online.

My $.02
G

Rolando Edwards

There are two alternatives, one using mysqldump and one not.

Option 1 (using mysqldump)

Write a customized backup script:
1) Extract the names of all databases (except mysql and information_schema)
2) FLUSH TABLES WITH READ LOCK;
3) Record the log file and position from SHOW MASTER STATUS;
4) Run a mysqldump of each database in parallel as a background process

Apply this in reverse when restoring: load all the mysqldumps back in parallel.
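A minimal sketch of the backup half of those steps, assuming client credentials are already configured and a /backups directory exists (both illustrative); error handling is omitted:

# 1) User databases, excluding the system schemas.
DBS=$(mysql -N -B -e "SHOW DATABASES" | grep -vE '^(mysql|information_schema)$')

# 2) Take the global read lock in a session that stays open, and record the
#    binary log coordinates while the lock is held.
mkfifo /tmp/ftwrl.fifo
( echo "FLUSH TABLES WITH READ LOCK;"
  echo "SHOW MASTER STATUS;"
  cat /tmp/ftwrl.fifo ) | mysql -N -B -n > /backups/master-status.txt &

# Wait until the coordinates appear, which means the lock is in place.
until [ -s /backups/master-status.txt ]; do sleep 1; done

# 3) Dump each database in parallel as a background process.
for db in $DBS; do
  mysqldump --opt "$db" | gzip > "/backups/$db.sql.gz" &
done
wait

# 4) Release the lock by ending the locking session.
echo "UNLOCK TABLES;" > /tmp/ftwrl.fifo
rm /tmp/ftwrl.fifo

In practice you would cap the number of dumps running at once (he uses 7 or 32 at a time) rather than launching all of them together.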

Example: I have a client at my DB/Web hosting company with 650+ databases (450+ GB) in one MySQL instance.
mysqldump databases one by one: 7 hours
mysqldump in parallel, 7 databases at a time: 3.5 hours

I have performed 3 full restorations in this client’s lifetime with LogicWorks

All 650+ DBs (450+ GB), 10.5 hours, loading 32 mysqldumps in parallel at a time

I am sure 5 GB should present no challenge if you dump all databases in parallel. If it is all one database, then dump all the individual tables in parallel (which I have also scripted for other DB hosting clients).

Option 2 (not using mysqldump)

Maatkit’s very own toolset has Parallel Dump and Parallel Restore, using SELECT … INTO OUTFILE and LOAD DATA INFILE under the hood.

http://www.maatkit.org/doc/mk-parallel-dump.html
http://www.maatkit.org/doc/mk-parallel-restore.html

Patrick Casey

If you’re just running InnoDB, you can run mysqldump with the --single-transaction flag, which will get you a consistent logical backup. It’ll take a long time on a larger database, but it will give you a snapshot in a non-blocking manner.
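For example (the --master-data option for recording replication coordinates and the output file name are illustrative choices):

# Non-blocking, consistent logical backup of an all-InnoDB server.
# --single-transaction takes one consistent snapshot instead of locking tables;
# --master-data=2 records the binlog coordinates as a comment in the dump.
mysqldump --single-transaction --master-data=2 --all-databases > full-backup.sql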

If you have a non-transactional engine in your mix (MyISAM, I’m looking at you), then there really is no way to get a “hot” consistent dump. You can issue FLUSH TABLES WITH READ LOCK, but to do this right you have to hold the lock until your dump completes. A read lock is just that, a lock, and that means no new writes as long as the backup is in flight.

If you are on some form of snapshot-aware storage you can execute:

Flush tables with read lock
SNAP
Release lock

This will get you a consistent snapshot even on MyISAM. The problem is that the FLUSH can take an unpredictable amount of time, because it must wait for all in-flight queries to complete. That means it’s impossible to guarantee a “no downtime backup” via this method: you’re locked up for flush time plus snapshot time.

If you’re pure InnoDB, I generally trust its recovery engine, so you can snapshot it without the read lock. This is a crash-consistent copy, and you’ll have to go through crash recovery when you restore from that backup. There are horror stories on the internet of this kind of snapshot coming up corrupt, but I will say I’ve personally never had a problem with it.

Baron Schwartz

Very often there is no way at all to make any kind of dump & load work, period. It is not uncommon for me to see clients whose dumps would take days or weeks to load a single table. At such sizes, dumps are completely unusable. One of the main problems is building indexes.

Lucas

I really like mysqldump, but it has some compatibility problems between different versions of MySQL. Maybe it’s better now.

Mark Wilden

Don’t update indexes for each record added. Drop them before the load, then recreate them after.
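A hedged sketch of that approach, assuming the table already exists and the dump contains data only (table, index, and file names are made up for illustration):

# Drop the secondary index before the bulk load...
mysql mydb -e "ALTER TABLE orders DROP INDEX idx_customer"
# ...load the data-only dump (e.g. one taken with --no-create-info)...
mysql mydb < orders-data.sql
# ...then rebuild the index in a single pass afterwards.
mysql mydb -e "ALTER TABLE orders ADD INDEX idx_customer (customer_id)"

This only helps with secondary indexes; the clustered primary key is still maintained row by row during the load.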

Mark Wilden

Actually, I was just making a general comment – not necessarily referring specifically to Percona plugins.

However, I don’t see how it could be slower to create the index all at once than to do it one record at a time.

Morgan Tocker

@Mark: It’s in the manual:

“In MySQL versions up to 5.0, adding or dropping an index on a table with existing data can be very slow if the table has many rows. The CREATE INDEX and DROP INDEX commands work by creating a new, empty table defined with the requested set of indexes. It then copies the existing rows to the new table one-by-one, updating the indexes as it goes.”

http://dev.mysql.com/doc/innodb-plugin/1.0/en/innodb-create-index-overview.html

The text says up to 5.0, but it also applies to 5.1 if you don’t use the InnoDB plugin.

Mark Wilden

Wow, I had no idea – that’s pretty incredible. I guess it’s for the benefit of the clustered index, but still.

Thanks for the information.

Blagomir Ivanov

In my opinion, there are rare cases where you have to restore the entire database. I have made an entire restore only a few times in my life. In most cases there is just one broken table, or a few. The best solution for me was to write a script that takes all the databases and all the table names in those databases and archives them (with mysqldump) in separate files.
At the end I have:
db.table1
db.table2
…
db.tableX

Don’t know if it’s the best, but it’s working for me 🙂
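A minimal sketch of such a script, assuming client credentials are already configured and a /backups directory exists (both illustrative):

# Dump every table of every user database into its own db.table file.
for db in $(mysql -N -B -e "SHOW DATABASES" | grep -vE '^(mysql|information_schema)$'); do
  for tbl in $(mysql -N -B -e "SHOW TABLES FROM \`$db\`"); do
    mysqldump "$db" "$tbl" | gzip > "/backups/$db.$tbl.sql.gz"
  done
done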

Patrick Casey

Blagomir,

We use a similar technique for a subset of our logical backups and it works well.
One challenge folks should be aware of, though, is that if you do not explicitly dump all tables within one transaction, each table will dump out at a different point in time, so they won’t necessarily be consistent with each other.

Example, if you have:
Food
Meals (contains a foreign key to food)

And you dump Food at 12:51 PM and Meals at 1:22 PM, then it’s possible there will be a reference in Meals to a piece of food which didn’t exist at 12:51 PM.

Depending on the data model this may or may not be an issue for any given app, but it’s an important constraint to be aware of if you’re considering going down this road.

Andy

Use LVM for your filesystems and place MySQL on an LVM partition.

You can then lock all tables and snapshot the partition – only a few seconds of lock time.
Then copy the snapshot to another partition or device.

You now have a full copy of the raw data, so a restore takes only the time needed to copy it back into place; no reloading of dumps.
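Done by hand, the snapshot-and-copy part looks roughly like this (volume group, volume names, snapshot size, and destination are all illustrative; the FLUSH TABLES WITH READ LOCK must be held in a separate, still-open mysql session while lvcreate runs):

# With the read lock held elsewhere, take the snapshot (a few seconds)...
lvcreate --snapshot --size 10G --name mysql_snap /dev/vg0/mysql
# ...release the lock (UNLOCK TABLES in the locking session), then copy the
# snapshot off at leisure without blocking the running server.
mkdir -p /mnt/mysql_snap
mount /dev/vg0/mysql_snap /mnt/mysql_snap
rsync -a /mnt/mysql_snap/ backuphost:/backups/mysql/
umount /mnt/mysql_snap
lvremove -f /dev/vg0/mysql_snap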

Here is a really nice tool to do it all – http://www.lenzg.net/mylvmbackup/

Baron Schwartz

Agree, except for “only a few seconds.” http://www.mysqlperformanceblog.com/2010/04/24/how-fast-is-flush-tables-with-read-lock/

There are lots of articles on this blog about LVM backups.

Patrick Casey

Andy, another challenge I’ve run into with LVM snapshots is that they really load up the disk subsystem.
While I have a snapshot mounted I’m doubling my write IOPS.
To use the snapshot as a backup, I have to copy it off the main system and onto a backup device, which means I have to read hundreds of GB of data, which again loads up my I/O subsystem at the same time I’m paying the double-write penalty.

We’ve found in our environment that SAN snapshots work great (very low cost), but LVM snapshots are too intrusive on the production servers for regular use, so we typically snapshot the slaves nightly and the masters weekly; doing an LVM snapshot on a busy database will grind it down too much to tolerate.

Andy

Yes, LVM does have the issue of extra writes, so doing the snapshots off a slave helps.

Marc

What I’m really missing here is a “this is the way to do it” answer with a full example. I’m no SQL management expert but find myself having to set one up regardless.
I’m looking for the “works in 99% of cases” solution, but it almost seems like the answer is “there’s no good solution”, which is kind of weird for such a popular database system.

Ilan Hazan

Another argument for not using mysqldump for recovery is that restoring a dumped table on the MySQL master can lead to serious replication delay.
To overcome the replication delay caused by restoring the dump on the master, you need to spread out the massive inserts. This can be done with the MySQL SLEEP() function.
See “As restoring a dump table into the MySQL master – you better get some sleep”.
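A hedged example of spreading out the load while replaying a dump on the master (dump.sql and mydb are placeholders, the every-1000-statements threshold and one-second pause are arbitrary, and it assumes one row per INSERT, i.e. a dump taken with --skip-extended-insert):

# Inject a SELECT SLEEP(1) after every 1000 INSERT statements so the
# slaves get a chance to catch up while the dump is replayed.
awk '{ print } /^INSERT/ { if (++n % 1000 == 0) print "SELECT SLEEP(1);" }' dump.sql | mysql mydb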

Mark Wilden

I do see in http://dev.mysql.com/doc/innodb-plugin/1.0/en/innodb-create-index-overview.html that in 5.1, creating a *secondary* index is faster with ADD INDEX than inserting one-by-one.

Simon J Mudd

@Andy, I have been doing LVM snapshots for cloning and backups. Indeed it works well and is pretty easy to setup. However, I have seen considerable performance problems on CentOS 6 with xfs which were not apparent in CentOS 5, especially with filesystems of a few hundred GB or more. This can make using this method rather more cumbersome that you would expect.