Running a MySQL slave is a common, everyday task, and taking backups from a slave is an often-recommended practice. However, the current state of MySQL replication makes restoring a slave tricky (if possible at all). The main problem is that the InnoDB transaction state and the replication state are not synchronized. If, at backup time, you can execute the SHOW SLAVE STATUS command, you can get reliable information about the current state, but some solutions do not allow that. Take, for example, the Sun Storage 7410, which provides storage via NFS and lets you take ZFS snapshots without any knowledge of what kind of data you are storing there. What makes the situation worse is that the files holding replication state (relay-log.info, master.info) are not synced to disk after each update, and, worse still, in the NFS case they can sit in the client-side OS/NFS cache for a long time. As a solution we could patch the server to execute fsync() for these files after each write, but I can't predict how much of a performance penalty we would see; I expect it would be very significant.
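When SHOW SLAVE STATUS is available, the fields worth recording alongside a backup are the master binlog coordinates of the last executed event. A minimal sketch (column names as in MySQL 5.0/5.1):

```sql
-- Run on the slave just before taking the snapshot, ideally with the
-- SQL thread stopped so the position is stable while you record it.
STOP SLAVE SQL_THREAD;
SHOW SLAVE STATUS\G
-- The columns of interest are:
--   Relay_Master_Log_File  -- master binlog file of the last executed event
--   Exec_Master_Log_Pos    -- position within that file
START SLAVE SQL_THREAD;
```

With a ZFS-snapshot setup like the 7410, there is no opportunity to run this, which is exactly the gap described above.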

Our idea is not new; it was taken from the TransactionalReplication patch http://code.google.com/p/google-mysql-tools/wiki/TransactionalReplication and bug report http://bugs.mysql.com/bug.php?id=34058. Basically, we want to store the replication state in the InnoDB transaction log file; that way we can always see which replication position the last executed transaction corresponds to. Of course, this works only if you write exclusively to the InnoDB storage engine; the case of a mix of storage engines is much more complex, and I do not see an easy way to solve it.

So we propose the overwrite_relay_log_info extension for the XtraDB storage engine. The name comes from the fact that XtraDB will try to rewrite relay-log.info with the recovered replication position; at the very least this information is available in the error-log output, so you can repoint your slave to the correct position by executing the CHANGE MASTER command.
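If you take the position from the error log rather than relying on the rewritten relay-log.info, repointing the slave looks like this (the host and coordinates below are placeholders, not output from the patch):

```sql
STOP SLAVE;
CHANGE MASTER TO
  MASTER_LOG_FILE = 'mysql-bin.000123',  -- binlog file reported in the error log
  MASTER_LOG_POS  = 456789;              -- position reported in the error log
START SLAVE;
```

Host, user, and password settings are preserved from the previous configuration, so only the coordinates need to change.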

Currently the patch is available on Launchpad at lp:~percona-dev/percona-xtradb/overwrite-relay-log-info, and after testing it will go into the main XtraDB tree. If there is interest, the patch will be ported to the 5.0 and 5.1 trees. Some information is available on this documentation page.

15 Comments
Shlomi Noach

Very nice!!

Domas Mituzas

it is my favorite feature from google patch 😉

Alex P.

hi, i have a ‘Master’ DB replicating to 3 ‘Slaves’ (with InnoDB engine) and it’s a bit difficult to restore when a slave stops replicating… if I’m not wrong, this “patch/feature” will help me restore replication, right?
thanks in advance and sorry for my ‘not very good’ english 🙂
btw, great mysql blog.

Alex P.

hi. Sometimes I get (on a slave) the “Duplicate key entry” error (i see this in the ‘show slave status’ Last_error), always with an insert query. Don’t know exactly why, but it just happened on 1 of 3 slaves, and the others were fine and running normally. With MyISAM you just need “load data from master”, but with InnoDB we have to download and run a backup of the Master DB and it takes some minutes….. i have this problem around 1 time every 3 months but im looking for a better solution..
thanks again.

Gil

Alex, I have run into similar duplicate key errors in the past. It turns out the slave I was using to take backups had slightly different data than the master. You may want to do a consistency check using Maatkit just to be sure.

Vadim

Alex P, Gil

Actually this patch can fix that issue.

The problem is that when your slave crashes, or after some other failure, relay-log.info contains stale information, and when you start the slave it tries to repeat transactions that were already executed (and you get the Duplicate key error).

Our patch tries to fix that by updating relay-log.info to the real last executed transaction.

However, a Duplicate key error may be related to other issues as well, so I can’t say for sure that our patch is the solution.

Alex P.

yes, i know the “duplicate key error” may be related to different issues (sometimes it’s an isp problem, manual inserts, our slaves use dyndns and an update of the service sometimes makes it crash, etc). The way i resolve the crash is making a backup of the master, sending this backup to the slave (via sftp), then executing this backup, deleting the ‘relay’ files and restarting the mysql service… after that the slave is running ok again. I know this “solution” is not very good (what do you think?), so im trying to find another one; that’s why i think this patch/feature will help me.

Ernesto Vargas

Alex P,

You could also do:

STOP SLAVE; SET GLOBAL SQL_SLAVE_SKIP_COUNTER=1; START SLAVE;

So replication will skip that Dup Key error and continue with the next statement in the replication stream.

Alex P.

thank you. i will try it and find if can solve the problem.

🙂

Aric

Also, in my.cnf you can skip replication errors.

slave-skip-errors = all

Baron Schwartz

These “solutions” make replication LESS reliable and MORE likely to have further problems! FIX the problem, don’t hide it.

Shlomi Noach

@Aric,

Skipping errors by specifying “slave-skip-errors = all” in my.cnf is quite dangerous, since you will not even be aware of replication issues; with Ernesto’s solution, you are at least forced into manual intervention, which may supply you with enlightenment about the problem’s origins.

Instead, your replicated data will keep on moving away from master’s data.

Robin

Hello,everybody.
Does MySQL have a feature to customize the slave port? I see my slave’s port is random.
Thank you!

James Parks

Your slave port is random because you’re looking at its ephemeral port. The slave does not listen for incoming connections from the master, but is rather a client of the master.

I imagine you were asking this question because you were trying to figure out which ports to allow through your firewall. If you allow all established TCP connections, you’ll solve this problem. In iptables, it would look something like this:

-A INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT

I know it’s been a long time since you posted, but thought someone else might find this useful.