Recently I was tasked with investigating slippage between master and slave in a standard replication setup.

The client was using Maatkit’s mk-table-checksum to check his slave data was indeed a fair copy of that of the master.

He could then examine the checksum.checksum table and see all was well, however there were various tables with different crc values.

So, now I needed to find out what was updating the table. Here is where tools like mysqlsla and Maatkit’s mk-log-parser come into their own as they both allow you to quickly parse the binary log files, extracting the relevant statements.

Check out http://hackmysql.com/mysqlsla_filters for how to filter by statement.

Looking through the binary logs I could see this table is actually an audit table for changes to the Foo table. The trail is kept using two triggers on that table.

So whats the problem with that?, well there is a situation where two overlapping transactions updating the Foo table can be reordered once serialized on the slave.

Here is an example:

I recreated the tables and triggers, populating the Foo table with a handful of rows and then ran the following.

Here is the update trigger:

    Transaction 1 starts and updates.

    Transaction 2 starts, updates and commits.

    Transaction 1 commits last.

Now when these statements get run on the slave they will be serialized, thus changing the order of the inserts made by the trigger. The Foo_History table is now out of sync.

Master:

Slave:

As you can see from the above, the updates were performed in a different order, with the inserts being assigned a different Foo_History_Id. This is because the statements are written to the binary log in commit order.

10 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments
Gregory Haase

I think you’re subject line might be a bit misleading. I mean, replication is technically behaving exactly as it should be. If MVCC rules are correct and we do not have dirty reads, then from a client perspective, the order of events on the slave is more correct.

Depending on how the replica will be used, this might be fine. From a historical perspective, the order can’t truly conflict or you’d have gotten a deadlock in the above process. So historical analysis shouldn’t be broken. If your replica exists for emergency backup/restore, then having a few lines out of order shouldn’t cause any problems.

One obvious way around this is to find a way not to have an auto-incremented key on the history table. If you can make a unique, combined key as primary key, it should be the same regardless of the insert order. If that can’t be done, then consider skipping a primary key altogether – maybe the history table isn’t accessed enough for this to be a performance issue. If you can’t do either of those, then I would suggest that you disable the crc check on this table and just have mk-table-checksum do a row count.

Baron Schwartz

Good points all. Another major thing here is that the client doesn’t want to have any checks showing “orange lights” that are to be ignored, because that leads to bad things — people get numb to any sort of warnings that the slave has different data.

I think this table probably doesn’t need a primary key. I had pondered how to avoid the problem but there’s no candidate key at all in the data (Ewen didn’t show the real client data), so if there’s to be a PK it has to be a surrogate.

Gregory Haase

I guess this is one of those situations where not being able to see the full story can lead to all kinds of incorrect assumptions. For instance, when I saw there were deletions on the auto_increment value I immediately jumped to the conclusion that the application is deleting audit data, which would be a major major no no.

Dmitry Lenev

Curious what is engine of “Foo_History” table? If it is the same engine as for “Foo” table then I would say that you might have encountered a bug worth reporting…

Jesper Krogh

I’m sorry, but can you clarify how OLD.Foo_History_ID comes from? Looking at you trigger definition, it looks like it is comes from the table being updated, but from the result it looks like an auto_increment column on the history table.

If the former is the case, then they should be the same on the master and the slave.

If the latter is the case, then even without replication there might be a problem if the updated table also has an auto_increment column (which is my impression is not an issue here), as I believe MySQL doesn’t support having two auto_increments in the same “statement” (including triggers).

Of course, one should remember that the exact same issue exists, if the master is reloaded using the binlog.

Dmitry Lenev

Hello Ewen!

I think that actually Jesper is right. In case of statement based replication there is also a restriction on number of auto-increment columns used by statement as whole.
Please see http://bugs.mysql.com/bug.php?id=19630. Using more than one auto-increment column in statement should be fine for row based or mixed mode replication though.

Best regards!

Jeremy

Very similar to what we are doing today, although order doesn’t matter since our application orders by timestamp. Giuseppe Maxia posted something recently on Revision Engine (http://www.ddengine.org/versioneng) and we started looking at this to gain logging without writing scripts to build triggers on every table.