July 25, 2014

Getting use of Slave in MySQL Replication

MySQL Replication is asynchronous which causes problems if you would like to use MySQL Slave as it can contain stale data. It is true delay is often insignificant but in times of heavy load or in case you was running some heavy queries on the master which not take time to replicate to the slave replication lag can be significant. Also even very small lag can cause the problems – for example you’ve posted comment on the blog and on next page reload you do not see it as it was read from the slave millisecond later…. this is something you would not like to happen.

I’ll list some techniques here which I found to be helpful for offloading load to the slave without causing application to be have crazy. The same approach can be used in Master-Master replication in Active-Passive mode, just think about passive node as a slave.

Query type based distribution Elect some queries as “time critical” and other queries as “non time critical” in terms of requirements regarding reading stale data. run time critical queries on the master and non time critical on the slave, or balance them if you have underloaded master. For example search queries or reporting queries are very good candidates. The simplified version of this approach is using Slave for reporting while Master for OLTP queries. This solution works well if there are enough queries which can be non real time.

Session based approach You mainly care about real-time behavior for users which participate on the site, like they’ve added comment and it does not show up. If other visitors see a posts few seconds after it was created by an author they would not even notice, so you can use session based persistence and map users which are passive to the slave while if user had written something you can make him to read from the master for certain period of time. It is best to use this technique with assigning visitor to the dedicated slave for the duration of session which is both good from performance standpoint as he is likely to touch the same data and because all slaves may have different replication lags and you do not want visitors to see objects disappearing on page reloads. This approach is cool but can be a bit complicated and it also does not work well if you’re doing writes for passive visitors which when directly visible to them.

Change time based approach If your operations are specific to particular “objects” – blogs, movies, profiles, threads in forums etc you can track when object was modified last time. For example when new post was added to the blog, or edited. Now for object which were not changed for significant amount of time you can do reads from the slave and for objects which just were updated you can read them from the master. “Last updated” time is always to be queried from the master and better cached in something like memcache. This technique especially works great with caching things in memcache as it means actively updated and read stuff will be read from memcache and old stuff can be read from the slaves, reducing master load and improving scalability by use of simple replication. If using this approach you should chose granularity properly or use different ones – for example you may operate on blog level or on post level – checking when post was last modified/commented and using slaves to work with inactive posts.

Obviously there are other methods you can use for loading the slave, I just feel these free are relatively simple to use and usable for wide variety of applications. Whichever method you’re using you should not forget to monitor if replication is running properly and if replication lag is within the limit. It may be good idea to exclude slaves from the reads if lag is high or adjust behavior to make sure you only read slaves if you can afford reading data which is that stale.

About Peter Zaitsev

Peter managed the High Performance Group within MySQL until 2006, when he founded Percona. Peter has a Master's Degree in Computer Science and is an expert in database kernels, computer hardware, and application scaling.

Comments

  1. Scott MacVicar says:

    A mixed approach is ideal, we personally use all three of these.

    Another use for a slave is fulltext searching, we have a text table that’s InnoDB on the master and MyISAM on the slave. Though if you can Sphinx is another good approach if you want to skip MyISAM completely.

  2. peter says:

    Thanks Scott,

    Yes sure. You can mix them and do some other things. For large systems I would not only use master-slave replication without any partitions anyway.

    Using set of slaves for MyISAM full text search indexes is good idea if you’re stuck with MySQL Full Text search. Frequently you can configure them to replicate only full text search table(s) so you get better cache usage on them.

  3. I use another concept: On master, I issue SHOW MASTER STATUS after write, save File and Position to session variable and if client wants to read written data, I issue DO MASTER_POS_WAIT() on the slave. So master is exclusively used for writting and slave for writting.

    With this concept, it could take long for the slave to catch up. Thus instead of MASTER_POS_WAIT(), there could be SHOW SLAVE STATUS and if (Master_Log_File, Read_Master_Log_Pos) is lower than session variables, data would be read from master, instead from slave.

  4. peter says:

    Jakub,

    Regarding MASTER_POS_WAIT() you’re right it may take too long for slave to catch up so it is scary.

    Regarding other approach – this should work for loads which have very light amount of writes as otherwise Master and Slave will be rarely fully in sync. The other thing is – Read_Master_Log_Pos is where the IO thread is positioned, while you need SQL to look at SQL threads as this is one which really applies updates, so you should use Exec_Master_Log_Pos if you want to check if Master and Slave are in sync.

  5. Yeah….. don’t use MASTER_POS_WAIT()…. implement it in code in the client.

    MPW uses up a connection that could otherwise be used for another real query….

  6. Thomas says:

    Then use a timeout on MASTER_POS_WAIT en fetch data from master if a timeout occurs.
    Not the best solution, but it ensures slaves are used as much as possible.

    This can be very useful if you divide the tables in a database to difference slaves. Using the slave as much as possible greatly increases the chance most data is in cached in memory whereas the same query could be really, really slow on the master because it might not have much of it in cache.

  7. James Day says:

    Jakub, going to the master if the slave is lagged is very dangerous. Consider what happens if an ALTER TABLE is replicated: all slaves lag at the same time, the applications then mount an accidental denial of service attack on the master and take the site down. Slow queries or broken queries can also cause this problem. It’s good to track but switching to a different slave or giving up and providing out of date information for non-critical queries (and critical had better be limited!) is also necessary to handle fault cases.

  8. peter says:

    James there are many different circumstances out where. You may have just one slave in which case it might not overload master too badly. There are also different cases about reading stale data – in some cases you can use it if you have no choice in others it is absolutely critical data is reasonably recent.

    The thing is even non-critical queries might allow only certain delay – can you really use the slave if it is lagging by a week ? (And I’ve seen even worse cases).

    Designing how slaves are usage you should of course take your normal lag spikes in account – if you have nightly job which does 10min update queries – this will be normal lag time for you which does not cause switching.

    Regarding Alter Table – if it is long, in most cases the good idea is to run it on the slaves separately from the master and exclude it from replication.

  9. Dmitry says:

    Exists any ways to create synced replication without delays or lags with mysql ?

  10. peter says:

    Dmitry,

    Check Google Semi-Synchronous replication patches.

  11. Dmitry says:

    Thanks, Peter. Will test it, cause delays sometimes 5 seconds before in slave appears this row.

  12. peter says:

    Dmitry,

    If replication is that delayed you need to tune it first as semi synchronous replication will cral very slow in such case.

  13. Dmitry says:

    That delay become random and once per 2-3 days, now we researching about.

    Thankx.

  14. Jojo says:

    Quick question. Have you tried replicating from 5.1 (master) to 5.0.xx (slave)? I’m planning to do a downgrade of our existing db due to some problems I’m experiencing on 5.1. I’m planning to create a replica first then cutover to the replicas so as to minimize downtime.

    Looking forward to your reply. Thanks.

  15. peter says:

    Not yet to be honest. I did not try 5.1 -> 5.0 Generally to be safe you better have Slaves at same or higher revision than the master.

Speak Your Mind

*