I just found a post by Kevin in which he criticizes the Master-Master approach, finding a Master with many slaves more optimal. There is surely room for master-N-slaves systems, but I find Master-Master replication a much better approach in many cases.

Kevin writes, “It requires extra hardware thats sitting in standby and not being used (more money and higher capital expenditures)”, and I’m surprised: why would it? Typically you can use both nodes for most of the reads, and this is in fact the use pattern MMM was designed for.

“There’s only one machine to act as the standby master. If you have 10 slaves you should be able to fail five times and still be operational.” This is a valid consideration, but honestly, for most applications one standby is enough unless you’re using unreliable hardware.
Typically other risks of downtime are higher; what about an application error that deletes or trashes all the data and gets replicated to all the slaves?

When you compare master-master to multiple slaves, you should compare for the same number of servers. For example, if we have 6 boxes we can use 1 master and 5 slaves, or 3 master-master pairs. In the latter case each master-master node gets 1/3rd of the database size and about 1/3rd of the traffic.
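
To make the comparison concrete, here is a back-of-the-envelope sketch (plain Python; the database size and write rate are made-up numbers for illustration) of what lands on each node in the two layouts:

```python
# Back-of-the-envelope comparison for the 6-box example above.
# DB size and write rate are illustrative placeholders.
DB_SIZE_GB = 600
WRITES_PER_SEC = 3000

# 1 master + 5 slaves: every node stores ALL the data, and
# replication delivers ALL the writes to every slave.
data_per_node_m5s = DB_SIZE_GB
writes_per_node_m5s = WRITES_PER_SEC

# 3 master-master pairs with the data split three ways:
# each node stores and replays only its pair's third.
data_per_node_mm = DB_SIZE_GB / 3
writes_per_node_mm = WRITES_PER_SEC / 3

print(f"master-5-slaves: {data_per_node_m5s} GB and {writes_per_node_m5s} writes/s per node")
print(f"3 MM pairs:      {data_per_node_mm:.0f} GB and {writes_per_node_mm:.0f} writes/s per node")
```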

The benefits of Master-Master replication are the following:

It is simpler: especially if you just write to one node, failover and recovery are rather easy. Even if everything is automated, simpler things mean fewer software bugs.
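
As a conceptual sketch of why that failover is simple (this is not MMM's actual implementation; hostnames are made up): both nodes already replicate from each other, so failover is just moving the writer role, with no replication reconfiguration on the survivor.

```python
# Conceptual single-writer failover in a master-master pair.
# Not MMM's actual code; hostnames are placeholders.
class MasterMasterPair:
    def __init__(self, node_a: str, node_b: str):
        self.nodes = [node_a, node_b]
        self.active = 0  # index of the node currently taking writes

    def writer(self) -> str:
        return self.nodes[self.active]

    def failover(self) -> str:
        # The co-master is already replicating from the failed node,
        # so it can take over writes as soon as it has caught up.
        self.active = 1 - self.active
        return self.writer()

pair = MasterMasterPair("db1.example.com", "db2.example.com")
print("writes go to:", pair.writer())
print("after failover:", pair.failover())
```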

Handling write load: If your application is write intensive, a master-N-slave configuration will be saturated much faster because every node has to handle the full write load. Especially taking into account that MySQL replication is single threaded, it might not be long before it is unable to keep up.
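
Since the single replication thread is the usual choke point, lag is the first thing to watch. A minimal monitoring sketch (assuming the pymysql driver; the host and credentials are placeholders):

```python
# Minimal replication-lag check. Assumes the pymysql driver;
# host, user and password are placeholders.
import pymysql

def replication_lag(host):
    conn = pymysql.connect(host=host, user="monitor", password="secret",
                           cursorclass=pymysql.cursors.DictCursor)
    try:
        with conn.cursor() as cur:
            cur.execute("SHOW SLAVE STATUS")
            status = cur.fetchone()
            # None means replication is stopped or this node is not a slave.
            return status["Seconds_Behind_Master"] if status else None
    finally:
        conn.close()

lag = replication_lag("db2.example.com")
print("replica lag:", "unknown" if lag is None else f"{lag}s")
```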

Waste of cache memory: If you have the same data on all the slaves, you will likely have the same data cached in their database caches. You can partially improve this with load partitioning, but it still will not be perfect; for example, all of the write load has to go to all nodes, pulling the corresponding data into each cache. In our example, if you have 16GB boxes with, say, 12GB allocated to the MySQL database caches, you get 12GB of effective cache in the master-N-slave configuration compared to 36GB of effective cache across 3 master-master pairs.
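
Spelled out with those numbers (a trivial sketch; it assumes the two caches within a pair largely duplicate each other, while each pair caches a distinct third of the data):

```python
# Effective cache arithmetic for the example above:
# six 16GB boxes, 12GB each allocated to the MySQL caches.
CACHE_PER_BOX_GB = 12
PAIRS = 3

# master-N-slave: all nodes cache the same hot data,
# so the effective cache is one node's worth.
effective_cache_m5s = CACHE_PER_BOX_GB            # 12 GB

# master-master pairs: each pair caches a different third,
# so the effective cache is one node's worth per pair.
effective_cache_mm = PAIRS * CACHE_PER_BOX_GB     # 36 GB

print(effective_cache_m5s, "GB vs", effective_cache_mm, "GB")
```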

Waste of disk: Disk is cheap, but for IO-bound workloads you may need a fast disk array, which is not so cheap, so having less data to deal with becomes important.

More time to clone: If replication breaks, a master-N-slave setup may need more time to re-clone (or to restore the database from backup) compared to the smaller datasets of multiple master-master pairs.

I agree, however, that if you have a small database (compared to memory size) with insignificant write load, a master-N-slave configuration may indeed be better, as it keeps the application simple by having all data on the same server.

The customer for which the MMM cluster was implemented has about 600GB per node, and some other installations have similar sizes.
MMM supports a master-with-multiple-slaves configuration as well; this, however, was not the main focus.

7 Comments

Dathan Pattishall

“When you compare master-master to multiple slaves, you should compare for the same number of servers. For example, if we have 6 boxes we can use 1 master and 5 slaves, or 3 master-master pairs. In the latter case each master-master node gets 1/3rd of the database size and about 1/3rd of the traffic.”

I don’t know if I am reading this right, but is 1/3 of the data spread across each master-master pair? If so, wouldn’t the application need to know where the data is in order to access the correct master-master pair? The solution doesn’t look like it abstracts the data layer that much.
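
In other words, the application would need something like this hypothetical lookup (a sketch; the hostnames and hashing scheme are made up):

```python
# Hypothetical application-side routing across 3 master-master pairs.
# Hostnames and the hashing scheme are made up for illustration.
PAIRS = [
    ("db1a.example.com", "db1b.example.com"),
    ("db2a.example.com", "db2b.example.com"),
    ("db3a.example.com", "db3b.example.com"),
]

def pair_for_user(user_id):
    # The application must know the partitioning scheme to find
    # the master-master pair that holds a given record.
    return PAIRS[user_id % len(PAIRS)]

print(pair_for_user(12345))
```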

Other than the question above:

Overall I would say this is a great way for many people to get an HA solution without building it directly into their app.

Jeremy Cole

Hi Peter,

I completely agree with you except for one point: If you send reads to the “inactive” master, you are lying to yourself about your true capacity. If one of the masters fails, you don’t have enough read capacity to handle your load.

For this reason, I always suggest that customers use master-master, but scale by adding more partitions, rather than using the “slave” for reads.
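
Put another way, if reads go to both co-masters, each node must run at no more than half its capacity or the survivor cannot absorb a failover. A quick illustrative check (numbers made up):

```python
# Failover headroom check for a master-master pair serving reads
# from both nodes. All numbers are illustrative.
total_reads_per_sec = 8000
node_capacity = 10000   # reads/s one node can sustain

load_per_node = total_reads_per_sec / 2    # normal operation: 4000 each
load_after_failover = total_reads_per_sec  # survivor takes it all: 8000

print("survives failover:", load_after_failover <= node_capacity)
```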

Regards,

Jeremy

Kevin Burton

““There’s only one machine to act as the standby master. If you have 10 slaves you should be able to fail five times and still be operational.” This is a valid consideration, but honestly, for most applications one standby is enough unless you’re using unreliable hardware.”

No... actually, it’s not.

Let’s take Technorati for example. They have three copies of everything. This way, if a machine fails at night, you still have a redundant copy.

If it’s 1AM I want to be able to go back to sleep. 🙂

“Typically other risks of downtime are higher; what about an application error that deletes or trashes all the data and gets replicated to all the slaves?”

Sure… but this is a non sequitur, because we’re dealing with master promotion design here.

“Handling write load: If your application is write intensive, a master-N-slave configuration will be saturated much faster because every node has to handle the full write load. Especially taking into account that MySQL replication is single threaded, it might not be long before it is unable to keep up.”

This is a tangential discussion, isn’t it? We shard and partition our data, so we spread writes that way. Each shard has a single master and two slaves that can be promoted to master.

We can just create as many shards as we need to handle the writes.
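
Promotion itself is only a few statements once you’ve picked the most up-to-date slave; a rough sketch (hosts, credentials, and binlog coordinates are placeholders; in practice the coordinates come from SHOW MASTER STATUS on the promoted node):

```python
# Rough sketch of promoting a slave to master. Hosts, credentials
# and binlog coordinates are placeholders; pick the most up-to-date
# slave and read its coordinates from SHOW MASTER STATUS.
import pymysql

def run(host, statements):
    conn = pymysql.connect(host=host, user="admin", password="secret")
    try:
        with conn.cursor() as cur:
            for stmt in statements:
                cur.execute(stmt)
    finally:
        conn.close()

# On the slave being promoted: stop replicating, allow writes.
run("db-slave1.example.com", ["STOP SLAVE", "SET GLOBAL read_only = 0"])

# On the remaining slave: re-point replication at the new master.
run("db-slave2.example.com", [
    "STOP SLAVE",
    "CHANGE MASTER TO MASTER_HOST='db-slave1.example.com', "
    "MASTER_LOG_FILE='mysql-bin.000042', MASTER_LOG_POS=12345",
    "START SLAVE",
])
```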

“Waste of cache memory: If you have the same data on all the slaves, you will likely have the same data cached in their database caches. You can partially improve this with load partitioning, but it still will not be perfect; for example, all of the write load has to go to all nodes, pulling the corresponding data into each cache. In our example, if you have 16GB boxes with, say, 12GB allocated to the MySQL database caches, you get 12GB of effective cache in the master-N-slave configuration compared to 36GB of effective cache across 3 master-master pairs.”

Again, this seems like a tangential issue, and one your design suffers from as well. Isn’t the data replayed on your slaves in MMM?

It seems like we’re comparing apples to oranges here.

I’m not arguing that partitioning your data isn’t important. It is. I’m just saying that there’s an easy way to use a slave as a master anytime you want.

I didn’t grok from your original post that MMM has sharding (which it seems it does now), so I’ll have to play with it. I was just arguing that master-to-master replication isn’t necessary.

Thanks for posting this stuff!

Kevin

gueast

Another great guide on this subject I’ve found…

http://www.dancryer.com/2010/01/mysql-circular-replication

This is part 1 of a three-post series:

– MySQL Load-Balanced Cluster Guide – Part 1 – setting up the servers themselves and configuring MySQL replication.

– MySQL Load-Balanced Cluster Guide – Part 2 – set up a script to monitor the status of your MySQL cluster nodes, which we’ll use in the next guide to set up our proxy.

– MySQL Load-Balanced Cluster Guide – Part 3 – setting up the load balancer with HAProxy, using the monitoring scripts