May 22, 2007

Wishes for mysqldump

Posted by peter

Dealing with dumping and recovery of large and partially corrupted database I’ve got couple of feature ideas for mysqldump or similar tool and appropriate import tool

Dump in parallel single thread dump is not efficient of course especially on systems with multiple CPUs and disks. It is lesser issue in recovery case because import takes most of the time anyway but will be handy for some backup needs.

Dump each table in its own file This is much more convenient compared to single say 100GB sql file allowing to prioritize data load and if load fails for any reason you can easily restart it. Also it allows to prioritize data load if you’ve got to load many tables which have different priority for your application.

Safe Dump Dumping corrupted Innodb tables you will have some of the them crashing Innodb which breaks mysqldump process. It would be great for load script to record in the logs such table was not dumped successfully, wait for MySQL to complete restart and continue going with other tables.

Parallel restore This is absolutely required if time is the issue as serious systems may perform much better in such case.

As usually when need arises there is no time to implement solid solution which will work for a lot of people so we get by with few quick and dirty shell scripts to do this.

Innodb crash recovery update

Posted by peter

I have not had a serious Innodb corruptions for a while, typically even if it happened it was some simple table related corruption which was easy to fix on table level. In couple of cases during last year when it was more than that we had backups and binary logs which means it was easier to recover from backup and replay binary logs.

This time I have a challenge to play it hard way because backup is in special form which will take a while to recover. It also should be nice exercise in patience because database is over 1TB in size.

One bug I already reported makes me worry. If it is global bug it should have been Innodb recovery show stopper while it goes back so many releases (5.0.33 surely still has it).

Lets see what else we run into.

One minor “practicality” I should mention is using –socket=/tmp/mysqlx.sock –port=3307 or something similar to make sure MySQL is isolated from all scripts which may bother it for the time of recovery. For complex systems it may be very hard to ensure no one touches MySQL using other ways.

Magic Innodb Recovery self healing

Posted by peter

We have certain type of the table corrupting with Innodb, as it is limited to only one particular index on one particular table type it is likely to be Innodb bug but Heikki currently could not find what could be causing it.

Happily as we have data stored in many tables of same format rather than one monster table these rare corruptions did not cause too much problems to us - as it was not clustered index we always could repair table by running ALTER TABLE with limited impact to production system (as only one table of about a hundred will be locked)

So everything kind of worked (of course we still hoped new MySQL release will have this Innodb bug fixed) until today we got MySQL to crash during recovery process right after 17% of log records were applied.

Such corruption is one of the worst onces, because you can’t really recover data on per table basics.

Also if MySQL crashes during recovery lower values of innodb_force_recovery normally would not work. Reading Documentation one may think you need to set it as far as 6 (SRV_FORCE_NO_LOG_REDO) if crash happens during redo stage of log recovery. As it is very nasty option I wanted to check if this is really required by trying lower recovery settings.

Happily I was able to restart MySQL with innodb_force_recovery=4 (SRV_FORCE_NO_IBUF_MERGE). So in my case probably Insert buffer merge was affecting corrupted page rather than log replay itself so I got lucky.

My initial plan was to now dump potentially corrupted tables (in this particular corruption instance Innodb did not print table name in question), drop tables, restart Innodb without innodb_force_recovery and load them back. But just yesterday one of the customers told me he had Innodb magically healing itself after he was able to complete recovery with innodb_force_recovery=4. So I decided to check if I’m to get lucky second time today.

And Indeed Restarting Innodb with innodb_force_recovery=0 allowed it to start normally and I could rebuild affected tables my
“normal way”.

So when dealing with Innodb recovery for bad corruptions you may want to:

- Start with lower innodb_force_recovery settings and see which one allows you to start.

- After recovery succeeds you can try restarting with lower recovery setting as it may allow you to recover more data or avoid full dump and restore.

Another trick which helped me in previous instances, but not this one is to do full system restart before attempting recovery. In few instances problems were caused by Kernel bug, inconsistence in OS cache or RAID cache or something else which made problem to disappear after system restart.