July 31, 2014

Why audit logging with triggers in MySQL is bad for replication

Recently I was tasked with investigating slippage between master and slave in a standard replication setup.

The client was using Maatkit’s mk-table-checksum to check his slave data was indeed a fair copy of that of the master.

He could then examine the checksum.checksum table and see all was well, however there were various tables with different crc values.

So, now I needed to find out what was updating the table. Here is where tools like mysqlsla and Maatkit’s mk-log-parser come into their own as they both allow you to quickly parse the binary log files, extracting the relevant statements.

Check out http://hackmysql.com/mysqlsla_filters for how to filter by statement.

Looking through the binary logs I could see this table is actually an audit table for changes to the Foo table. The trail is kept using two triggers on that table.

So whats the problem with that?, well there is a situation where two overlapping transactions updating the Foo table can be reordered once serialized on the slave.

Here is an example:

I recreated the tables and triggers, populating the Foo table with a handful of rows and then ran the following.

Here is the update trigger:

    Transaction 1 starts and updates.

    Transaction 2 starts, updates and commits.

    Transaction 1 commits last.

Now when these statements get run on the slave they will be serialized, thus changing the order of the inserts made by the trigger. The Foo_History table is now out of sync.

Master:

Slave:

As you can see from the above, the updates were performed in a different order, with the inserts being assigned a different Foo_History_Id. This is because the statements are written to the binary log in commit order.

About Ewen Fortune

Ewen has extensive background in networking and is MySQL Cluster certified. He is a former Percona employee.

Comments

  1. I think you’re subject line might be a bit misleading. I mean, replication is technically behaving exactly as it should be. If MVCC rules are correct and we do not have dirty reads, then from a client perspective, the order of events on the slave is more correct.

    Depending on how the replica will be used, this might be fine. From a historical perspective, the order can’t truly conflict or you’d have gotten a deadlock in the above process. So historical analysis shouldn’t be broken. If your replica exists for emergency backup/restore, then having a few lines out of order shouldn’t cause any problems.

    One obvious way around this is to find a way not to have an auto-incremented key on the history table. If you can make a unique, combined key as primary key, it should be the same regardless of the insert order. If that can’t be done, then consider skipping a primary key altogether – maybe the history table isn’t accessed enough for this to be a performance issue. If you can’t do either of those, then I would suggest that you disable the crc check on this table and just have mk-table-checksum do a row count.

  2. Ewen says:

    Hi Gregory,

    I admit I had problems inventing a title for this post, the truth is that there is slightly more to the story. I also saw some statements in the binary logs that were deleting rows on the value of the autoincrement column, therefore deleting different row data. You are correct in stating that replication is behaving as expected, however from both the point of view of the developers and the system architect things are a little different. They are both expecting the slave to contain the same data as the master.

  3. Good points all. Another major thing here is that the client doesn’t want to have any checks showing “orange lights” that are to be ignored, because that leads to bad things — people get numb to any sort of warnings that the slave has different data.

    I think this table probably doesn’t need a primary key. I had pondered how to avoid the problem but there’s no candidate key at all in the data (Ewen didn’t show the real client data), so if there’s to be a PK it has to be a surrogate.

  4. I guess this is one of those situations where not being able to see the full story can lead to all kinds of incorrect assumptions. For instance, when I saw there were deletions on the auto_increment value I immediately jumped to the conclusion that the application is deleting audit data, which would be a major major no no.

  5. Dmitry Lenev says:

    Curious what is engine of “Foo_History” table? If it is the same engine as for “Foo” table then I would say that you might have encountered a bug worth reporting…

  6. Ewen says:

    Dmitry,

    Both Tables in this case were InnoDB.

  7. Jesper Krogh says:

    I’m sorry, but can you clarify how OLD.Foo_History_ID comes from? Looking at you trigger definition, it looks like it is comes from the table being updated, but from the result it looks like an auto_increment column on the history table.

    If the former is the case, then they should be the same on the master and the slave.

    If the latter is the case, then even without replication there might be a problem if the updated table also has an auto_increment column (which is my impression is not an issue here), as I believe MySQL doesn’t support having two auto_increments in the same “statement” (including triggers).

    Of course, one should remember that the exact same issue exists, if the master is reloaded using the binlog.

  8. Ewen says:

    Hi Jesper,

    Its the latter, but the restriction is one autoincrement per table not per statement.

  9. Dmitry Lenev says:

    Hello Ewen!

    I think that actually Jesper is right. In case of statement based replication there is also a restriction on number of auto-increment columns used by statement as whole.
    Please see http://bugs.mysql.com/bug.php?id=19630. Using more than one auto-increment column in statement should be fine for row based or mixed mode replication though.

    Best regards!

  10. Jeremy says:

    Very similar to what we are doing today, although order doesn’t matter since our application orders by timestamp. Giuseppe Maxia posted something recently on Revision Engine (http://www.ddengine.org/versioneng) and we started looking at this to gain logging without writing scripts to build triggers on every table.

Speak Your Mind

*