August 2, 2008

Recovery beyond data restore

Posted by peter

Quite frequently I see customers looking at recovery as on ability to restore data from backup which can be far from being enough to restore the whole system to operating state, especially for complex systems.

Instead of looking just at data restore process you better look at the whole process which is required to bring system to the working state, including data consistency requirements and times. This has to be considered for different data loss scenarios which may happen.
[read more...]

Troubleshooting Relay Log Corruption in MySQL

Posted by peter

Have you ever seen the replication stopped with message like this:

Last_Error: Could not parse relay log event entry. The possible reasons are: the master’s binary log is corrupted (you can check this by running ‘mysqlbinlog’ on the binary log), the slave’s relay log is corrupted (you can check this by running ‘mysqlbinlog’ on the relay log), a network problem, or a bug in the master’s or slave’s MySQL code. If you want to check the master’s binary log or slave’s relay log, you will be able to know their names by issuing ‘SHOW SLAVE STATUS’ on this slave.

This is relay relay log corruption and you can check details in the MySQL Error log file. The error message describes few reasons and indeed because there is little validation (ie no checksums) in the replication there are multiple reasons for bad event to show up in relay logs.
[read more...]