MySQL 5.1 introduces row-based binary logging. In fact, the default binary logging format in GA versions of MySQL 5.1 is ‘MIXED’ STATEMENT*;   The binlog_format  variable can still be changed per sessions which means it is possible that some of your binary log entries will be written in a row-based fashion instead of the actual statement which changed data, even when the global setting on the master is to write binary logs in statement mode.   The row-based format does offer advantages particularly if triggers or stored procedures are used, or if non-deterministic functions like RAND() are used in DML statements.

A statement based replication slave can get out of sync with the master fairly easily, especially if data is changed on the slave.   It is possible for a statement to execute successfully on a slave even if the data is not 100% in sync, so MySQL doesn’t know anything is wrong.  This isn’t the case with row-based replication.  Only actual changes are written to the database with row-based logs.  Take as an example a DELETE statement which does not modify any rows.  No binary logging will be performed by this statement in row-based logs, but the DELETE will be written to a statement based binary log.  Since row-based logs only contain information about changed rows, a replication slave which does not find a row referenced in a row-based log will immediately return an error: HA_ERR_KEY_NOT_FOUND.

This is probably best shown by example.  First, let’s create a test table on a MySQL master database and populate it with some data.

Insert a single “seed” row into the table:

INSERT … SELECT a few times to get some dummy data:

Verify that we have a reasonable amount of data to play with:

My test setup includes a MySQL slave.  After the above setup completes, I am going to change data on the slave by truncating t1.  After truncation:

Now, lets make a modification on the master:

On the slave, SHOW SLAVE STATUS now reports an error (many fields omitted):

This error message is telling us that a row that was modified on the master can not be found on the slave:HA_ERR_KEY_NOT_FOUND.  It doesn’t tell us what row it couldn’t find though.  In our simple example, we know that all the rows are missing, and the one in particular that the server is trying to find is c1=1, but how can we find that out if we don’t know the conditions under which replication went out of sync?

To determine exactly what row is having an issue, we can use the ‘mysqlbinlog’ utility in combination with the slave relay log.  Examining the output of SHOW SLAVE STATUS, we can see that the slave is currently executing at relay log file: mysql_sandbox25162-relay-bin.000002 at position: 340718.  MySQL 5.1 ships with a new version of mysqlbinlog that includes new options that make decoding RBR statements possible.  These options are “–base64-output” and “-v”.  For debugging row-based logs, I suggest ‘–base64-output=decode-rows -v”.  This omits any base64 output and instead replaces the output with semi-valid SQL statements.  I say semi-valid because the statements do not include column names.  Column names are not included in the binary log, only positions.

mysqlbinlog mysql_sandbox25162-relay-bin.000005 –base64-output=decode-rows -v > decoded.log

Examine the decoded binary log with the ‘less’ utility, and search forward for the value 340718:

As you can see, the RBR entry has been decoded as an update statement.  The OLD value (the row that is missing) is represented in the WHERE clause.  The binary log does not carry column names, so each column is represented by the column position prefixed with @.

There you have it.  You can now find out exactly which row is missing and begin your investigation as to why it is missing. You may have to do a bit more leg work if many rows were modified during a transaction.

You should always set read_only = true on your MySQL slaves to prevent accidental changes in data from happening!

*edit*
The MySQL 5.1 default binary logging format changed back to STATEMENT in MySQL 5.1.29.

14 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments
Giuseppe Maxia

MySQL 5.1 default binlog_format in GA has always been STATEMENT. The main reason for this choice is that we wanted to avoid compatibility issues.
Notice that 5.1 has been GA since version 5.1.30.
I must say that your example is quite unsettling, as it is based on a human mistake that puts the slave in a non recoverable state.
See also this post, where the enhancement of mysqlbinlog was announced.
http://datacharmer.blogspot.com/2008/09/dsclosing-binlog-entries-with-row-based.html

Peter Zaitsev

Justin,

With Statement Level replication MySQL 5.1 to 5.0 works for some kinds of applications. It is not guaranteed to work in all cases of course but it does for many simple MySQL use cases in practice.
This is in fact important case because it allows to get extra assurance for MySQL 5.0 -> 5.1 migrations – you can keep MySQL 5.0 slave for some time after promoting Master to MySQL 5.1 to have something to roll back to.

Peter Zaitsev

Justin,

On Compatibility – I think this is why there is MySQL and Drizzle. MySQL has a lot of weight with bad choices over many years which can’t be easily broken because this unsettles a lot of customers who relay on these behaviors.

Also people do not often know they relay on any side effects. They just upgrade Master and Slave to MySQL 5.1 and just expect things work as before. If it breaks this is serious issue.
I think moving back to STATEMENT by default in MySQL 5.0 was a good choice. Breaking existing applications is a lot more serious issue than improving things (which row level replication does)

Kim Carlsen

One problem I have experienced with ROW based binary logging, is that mysql can make log entries that get too huge to be parsed through mysql client.

An example is, if you delete all rows in a huge table
DELETE * FROM huge_table;

This will be formulated as one BINLOG ” entry in the binary log, with a delete statement for each row in the table.
As a result you will get the error ‘ERROR 1153 (08S01) at line 1074175: Got a packet bigger than ‘max_allowed_packet’ bytes’ when trying to apply
this log to a database (mysqlbinlog bin-log.000001 | mysql -uroot)

You can increase the max_allowed_packet to a maximum of 1GB, but you will still be able to get larger entries in the log.

Justin Swanhart

Hi Kim,

I wanted to let you know that the next release of mysqlbinlog should include a –max-allowed-packet option instead of hard coding it to 1G. This should resolve your problem with mysqldump.

Hi All.
I using MIXED binlog format. And having problem with simple delete query.
DELETE FROM temp.tenp WHERE id = 10;
This query must delete 3000000 rows from my innodb table.
But slave came to out of sync. (like in comment#2 by Justin Swanhart were hi used STATEMENT binlog format)

But if I’ve change binlog format to ROW based, I’ve got error like in comment#8 by Kim Carlsen …..

What I must to do ? Some configuratuin tuning?
do not delete a lot of rows in one query ?
How much rows I can delete per query if I using MIXED binlog format ?

Trung Ly

I’m experiencing a very strange situation. I set binlog_format = “ROW” in a console applying for my current session only

mysql> show variables like ‘%binlog_format%’;
+—————+——-+
| Variable_name | Value |
+—————+——-+
| binlog_format | ROW |
+—————+——-+

and ran an update query to change data of 1 row. The change only affected on the specific column(s) which determined in the update query. At this moment, I’m so confused about the accuracy of MySQL replication.

For your info
* MySQL Master: MySQL Cluster 5.1.56-ndb-7.1.17-cluster-gpl-log
* MySQL Master: binlog_format = MIXED
* MySQL Slave: MySQL 5.1.56-ndb-7.1.17-cluster-gpl
* MySQL Slave: SET GLOBAL SLAVE_EXEC_MODE = ‘IDEMPOTENT’;

Are there bugs on this version?

Raghunandan

While we are on this topic, is it possible to replicate only inserts that happen on master to slave? I know its not there in 5.1 but in 5.5 is there any option to replicate DML of choice? I need this for a thing I am experimenting and wanted to know if either MySQL or Percona has any solution.

Thanks and Regards,
Raghu

arganzheng

Hi, you mistaken the row-based replication and statement-based replication. It is the RBR with slave_exec_mode=strict will cause the HA_ERR_KEY_NOT_FOUND problem. check it for yourself: http://dev.mysql.com/doc/refman/5.1/en/replication-rbr-usage.html

mani

Thanks man, I will check it immediately