April 23, 2014

Follow these basics when migrating to Percona XtraDB Cluster for MySQL

Percona XtraDB ClusterGalera/Percona XtraDB Cluster (PXC) for MySQL is a hot thing right now and some users jump right in without enough testing. Consequently, they’re more likely to either suffer failure or issues that prevent them from moving forward. If you are thinking of migrating your workload to Percona XtraDB Cluster, make sure to go through these basics.

log_slave_updates is REQUIRED

You need to have log_slave_updates enabled on the cluster node acting as async slave for replicated events from the async master to be applied to the other nodes, that is if you have more than one PXC node. This is because before Galera can create writesets for the replicated events, binlog events must be generated for the transactions first. Under normal async replication, an event will not be written to the slave’s binary unless log_slave_updates is enabled, this is similar to Percona XtraDB Cluster in that if you want an async event replicated to the whole cluster you have to have the same enabled.

MyISAM in PXC May Lead to Inconsistencies and May Not Even Work!

MyISAM tables are supported within Percona XtraDB Cluster, however, MyISAM has only basic support, primarily because the storage engine is non-transactional and so PXC cannot guarantee the data will remain consistent within the cluster. Also, at the time of this writing, from async stream, MyISAM is not being replicated at all which I reported on this bug. This would be a showstopper for anyone who wants to, but still have MyISAM tables. You can still try by filtering MyISAM tables though if you can leave them behind. Lastly, once that bug above is fixed, and you still have MyISAM tables you wish to keep running under PXC, wsrep_replicate_myisam allows you to do so. However, if you can, you should consider moving to InnoDB altogether. There are very few reasons to stay with MyISAM nowadays i.e. if you have FULLTEXT you simply cannot replace in short term.

Control Your Auto-Incrementing Columns

PXC/Galera controls auto-incrementing values internally within the cluster, this is to avoid collisions when INSERTs are happening on not only a single node. However, this may work differently when replicating from an async master, for example like the one described on these two bugs. Galera use writesets to replicate cluster events to the other nodes, in essence these are RBR events, plus a few additional structures used for certification. Having said that, it would be good if your async master can use ROW based binlog format as well to achieve better consistency, if you have an async master <= 5.0 though, you can workaround this by turning off wsrep_auto_increment_control from the Percona XtraDB Cluster nodes as workaround. Note that with the latter, make sure to not forget turning the feature back on when you switch to the new cluster especially if you are planning to write on multiple nodes.

Have PRIMARY KEYS

If you still have tables without PRIMARY KEYs, then its time to make one for them. Galera does not work well with those and even if there is basic support when wsrep_certify_nonPK is enabled, you can still hit issues like when automatic creation of primary keys for use during certification becomes non-deterministic. Although the previous bug has been fixed on latest release (5.5.30-23.7.4), table without PK imposes an additional overhead, and because cluster performance is somewhat dependent on the slowest node – this overhead can easily become visible on the whole cluster and your async replication being affected.

Be Prepared for some Latency

PXC can take workloads, however not just any workload – it shines with small transactions but not with big ones. If you are consistently running overnight reporting jobs and getting them through the replication stream expect some replication lag. This is because because synchronous replication inside PXC has an additional overhead, this means the SQL_THREAD will not be able to execute events as fast, on top of that, other factors affecting async replication like if your workload is CPU or IO bound. Peter wrote some good details about it here.

If you have encountered any other issues replicating to Percona XtraDB Cluster, I’d like to hear your thoughts and experience on the comments :)

About Jervin Real

Jervin is a member of Percona's Rapid Response Consulting team. When you come to Percona for consulting, chances are he'll be greeting you first. His primary role is to make sure customer issues are handled efficiently and professionally. Jervin joined Percona in May 2010.

Comments

  1. Mrten says:

    It’s a pity that the mysql databases with all the privileges are still MyISAM…

  2. Jervin Real says:

    Mrten,

    Galera/PXC handles the privileges differently even if these tables are MyISAM – simply do not make direct modification to them instead use the usual commands i.e. GRANT, DROP USER, etc

  3. Mrten says:

    I know, but when you’re moving to PXC and have a metric sh*tload of grants (on tables, on columns, …) that need an extra IP because you move from 127.0.0.1 to 192.168.x.x it’s a whole lot simpler (and by extension, safer) to run (CREATE TABLE AS SELECT *, UPDATE host, INSERT) a few times than trying to recreate the grants.

    I don’t know whether it’s even possible or wheter Oracle is planning to move the mysql db to InnoDB tables (try googling for that), but it would be a nice addition…

    Thanks for the log-slave-updates tip though, hadn’t heard that one before.

  4. Manish says:

    We recently migrated to Percona DB on our local machines and when 5-6 user submit some form at a time some miss-happening occur and due to this some of our records not saved properly. Is this because of Percona? as earlier it use to work fine on SQL. does it not support concurrent submission of a form (multiple user saving same form at a time).

  5. Jervin Real says:

    Manish,

    What kind of “mis-happening”? I’d suggest posting any error logs you get from the MySQL server via our forums – http://www.perconaforums.com/

  6. Manish says:

    Hi Jervin,

    thanks for the prompt reply. we do not get any error as such, it is just that the data are not getting updated in all the tables and only few tables are getting updated which was never the case when we were on SQL.
    so i guess the problem is due to Percona.
    Please help us in this regards.

    Note: i checked the same process on our Development server which is still running on MYSQL and it worked fine.

  7. Ricardo says:

    I having a hard time with percona xtradb cluster. The cluster has 3 nodes, and is up and running (all tip top according to wsrep status indicators). It did the SST perfectly fine from one node to the other 2. BUT every time i do an update in a table, the write is not propagated to the other nodes, even though ddl statements such as table creation are! any idea of what’s going on? i turned all sorts of logging on and i keep seeing:
    “Skipping empty log_xid” messages in error log, but not affecting the tables im testing so not sure this is related..

  8. Jervin Real says:

    @Ricardo,

    What kind of tables? Can you create a new thread in the forums and post your configuration and wsrep_* status there.

  9. Ricardo says:
  10. Ricardo says:

    @Jervin
    can you take a look at that thread – any idea of what’s going on? is there any more debug i can do? thanks

  11. Pat says:

    Is log_slave_updates still required? If so it should be added to the installation guide: http://www.percona.com/doc/percona-xtradb-cluster/5.5/howtos/cenots_howto.html

  12. Jervin Real says:

    @Pat,

    log_slave_updates is only required it your cluster is replicating asynchronously from a non-member of the cluster.

Speak Your Mind

*