Based on discussions with several clients, we are strongly considering implementing a limited form of parallel replication. Single-threaded replication is one of the most severe limitations in the MySQL server.

We have a brief outline of the ideas at this wiki blueprint. So far, the “binlog order” idea is the only one that looks workable. It also leaves more room for future development, so we could later lift some of the restrictions. We will impose those restrictions initially to keep things simple and to make sure we can actually get something working for our clients.

We’re trying to help the users who are hurt the most by single-threaded replication, and whose situation can be improved the most with a (relatively) simple implementation. In particular, that means heavily sharded applications, SaaS applications (see also our related whitepaper), and hosting providers. More generally, it means users who have a lot of independent work happening in short transactions that can be parallelized safely on the replica.

If you are interested in hiring us to implement this limited form of parallel replication for your own use, please contact us to discuss. We need to get more customers involved, to share the cost of this large and ambitious project.

3 Comments
Robert Hodges

Welcome to the party. 😉 I believe you saw my article on Tungsten work in this area (http://scale-out-blog.blogspot.com/2010/10/parallel-replication-on-mysql-report.html). We are in testing with our own customers now, but look forward to friendly competition from the geniuses at Percona. We may want to think about some shared benchmarks for SaaS / ISP use as we both have strong interests in this area.

Kristian Nielsen

Very good to see Percona starting to think about this problem.

I think the restrictions in the blueprint
(https://www.percona.com/docs/wiki/percona-server:blueprints:parallel-replication)
are a very good start. I would recommend going with the second approach
labelled “binlog order”. I will try to summarise my reasons for this:

The “binlog order” approach has the best potential for future development. A
fundamental limitation of the MySQL binary log is that it serialises
transactions on the master side. This means that we discard information about
independence of transactions, which is needed to implement parallel
application on the slave. In the Percona approach, you start with a simple way
to re-introduce the needed information, independent database updates. And with
the “binlog order” approach, if we later enhance the binary log to preserve
more such information (or provide such information in a different way), it is
possible to extend the approach to have fewer restrictions, e.g. parallel
application of independent transactions within a single database.

In the “full independence” approach, I think it is fundamentally harder to
extend it later; it seems we will be stuck with consistency (and
serialisation) within the database, inconsistency (and parallelisation)
between databases.

Also, I think it is best to avoid the complex mix of states in the “full
independence” approach. We need replication state to become _more_ crash
resistant, not less! Also, row-based idempotent replication does not solve the
problem when there are DDL statements involved.

I think the “binlog order” approach still has good opportunities for
parallelisation. Only the actual commit operation needs to be
serialised; the transactions themselves can freely run in parallel in each
database/SQL thread (and the commit operation is cheap, especially if it can
be combined with one of the patches that implement/fix group commit).
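
To make the idea concrete, here is a minimal Python sketch of "apply in parallel, commit in binlog order". It is only an illustration under stated assumptions, not the blueprint's design or any MySQL/Percona API: the names `CommitOrderer`, `worker`, and `replicate` are hypothetical, each database gets one worker thread, and a "commit" is simulated by appending to a list.

```python
import queue
import threading
import time


class CommitOrderer:
    """Hypothetical helper: lets workers commit only in binlog sequence order."""

    def __init__(self):
        self._cond = threading.Condition()
        self._next_seq = 0  # sequence number of the next transaction allowed to commit

    def commit_in_order(self, seq, commit_fn):
        with self._cond:
            # Block until every earlier transaction in the binlog has committed.
            self._cond.wait_for(lambda: self._next_seq == seq)
            commit_fn()  # the serialised (and ideally cheap) part
            self._next_seq += 1
            self._cond.notify_all()


def worker(txn_queue, orderer, committed):
    """One worker per database: apply freely, then commit in binlog order."""
    while True:
        item = txn_queue.get()
        if item is None:  # sentinel: no more transactions for this database
            break
        seq, db, apply_fn = item
        apply_fn()  # runs in parallel with the other databases' workers
        orderer.commit_in_order(seq, lambda: committed.append((seq, db)))


def replicate(binlog):
    """binlog is a list of (database, apply_fn) pairs in master commit order."""
    orderer = CommitOrderer()
    committed = []  # simulated commit log
    queues, threads = {}, []
    for seq, (db, apply_fn) in enumerate(binlog):
        if db not in queues:
            queues[db] = queue.Queue()
            t = threading.Thread(target=worker, args=(queues[db], orderer, committed))
            t.start()
            threads.append(t)
        queues[db].put((seq, db, apply_fn))
    for q in queues.values():
        q.put(None)
    for t in threads:
        t.join()
    return committed


if __name__ == "__main__":
    demo = [
        ("shard_a", lambda: time.sleep(0.05)),  # slow transaction
        ("shard_b", lambda: None),
        ("shard_a", lambda: None),
        ("shard_c", lambda: None),
    ]
    print(replicate(demo))
```

Because each per-database queue is filled in binlog order, the transaction with the lowest uncommitted sequence number is always at the head of some queue, so the sketch cannot deadlock. The commit log always comes out in binlog order, even though `shard_b` and `shard_c` finish applying before the slow `shard_a` transaction.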

The drawback that a slow transaction in one database / SQL thread will delay
replication of short transactions in another remains valid though. I am not
sure to what extent this will also reduce opportunities for parallelisation;
it seems to me that as such delays occur on the slave, more binlog stream will
become pending from the IO thread, increasing the opportunities for
parallelisation and helping the slave catch up. But highly skewed workloads,
with some databases having many small updates and others having few large
updates, might benefit significantly from the “full independence” approach.
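
The trade-off can be illustrated with a small back-of-the-envelope simulation in Python. This is a simplified model, not a measurement: the function `commit_times` and the workload are hypothetical, there is one worker per database, and a worker is assumed to stay blocked until its transaction has committed.

```python
def commit_times(binlog, ordered):
    """Return the commit time of each transaction in a toy model.

    binlog  -- list of (database, duration) pairs in master commit order
    ordered -- True for "binlog order", False for "full independence"
    """
    db_free = {}       # time at which each database's worker becomes free
    last_commit = 0.0  # commit time of the previous transaction in the binlog
    times = []
    for db, duration in binlog:
        finish = db_free.get(db, 0.0) + duration
        # In "binlog order" mode a transaction cannot commit before its predecessor.
        commit = max(finish, last_commit) if ordered else finish
        db_free[db] = commit
        last_commit = commit
        times.append(commit)
    return times


# Skewed workload: one large update, then two small ones in another database.
workload = [("big", 10.0), ("small", 1.0), ("small", 1.0)]
print(commit_times(workload, ordered=True))   # [10.0, 10.0, 11.0]
print(commit_times(workload, ordered=False))  # [10.0, 1.0, 2.0]
```

In this model the small transactions wait behind the large one under "binlog order", while "full independence" lets them commit immediately. The model ignores the catch-up effect described above, where a growing backlog of binlog events exposes more work to run in parallel.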