August 2, 2014

Innodb crash recovery update

I have not had a serious Innodb corruption for a while; typically, even when it happened, it was some simple table-related corruption which was easy to fix at the table level. In a couple of cases during the last year when it was more than that, we had backups and binary logs, which meant it was easier to recover from the backup and replay the binary logs.

This time I have a challenge and get to play it the hard way, because the backup is in a special form which will take a while to restore. It should also be a nice exercise in patience, because the database is over 1TB in size.

One bug I have already reported makes me worry. If it were a general bug it would have been an Innodb recovery show stopper, even though it goes back many releases (5.0.33 surely still has it).

Let's see what else we run into.

One minor “practicality” I should mention is using --socket=/tmp/mysqlx.sock --port=3307 or something similar to make sure MySQL is isolated from all scripts which may bother it for the duration of the recovery. For complex systems it may be very hard to ensure nothing touches MySQL through other means.
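For example, a minimal sketch of what I mean (the socket path and port are the same placeholders as above; mysqld_safe is just one way to start the server, and the exact startup method depends on your setup):

    # Start the recovery instance on a non-standard socket and port
    mysqld_safe --socket=/tmp/mysqlx.sock --port=3307 &
    # Connect to it explicitly through that socket
    mysql --socket=/tmp/mysqlx.sock

Applications and scripts configured for the standard socket and port 3306 will simply fail to connect while the recovery is running.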

About Peter Zaitsev

Peter managed the High Performance Group within MySQL until 2006, when he founded Percona. Peter has a Master's Degree in Computer Science and is an expert in database kernels, computer hardware, and application scaling.

Comments

  1. Toby says:

    Were you able to find the root cause of the corruption?

    Software or hardware? (If the latter, Solaris’ ZFS may be prophylactic.)

  2. peter says:

    Toby,

This is an interesting one. As I wrote in the previous post, most likely it is an Innodb bug, because the corruption was always happening in the same index in the same set of tables, which is unlikely for a hardware bug. Normally it was OK and easily recoverable, but this time the page corrupted by the crash was touched by the insert buffer merge process, which made this crash terminal :)
