Running a MySQL slave is a common, everyday task, and taking backups from a slave is an often-recommended practice. However, the current state of MySQL replication makes restoring a slave tricky (if possible at all). The main problem is that the InnoDB transaction state and the replication state are not synchronized. If, at backup time, you can execute the SHOW SLAVE STATUS command, you can get reliable information about the current state, but some solutions do not allow that. Take, for example, the Sun Storage 7410, which provides storage via NFS and lets you take ZFS snapshots without any knowledge of what kind of data you are storing there. What makes the situation worse is that the files holding replication state (relay-log.info, master.info) are not synced to disk after each update, and, worse still, in the NFS case they can sit in the client-side OS/NFS cache for a long time. As a solution we could patch the server to execute fsync() for these files after each write, but I can't predict how much of a performance penalty we would see; I expect it would be very significant.
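When SHOW SLAVE STATUS is available, the fields worth recording alongside a backup are the master binlog coordinates of the last executed event. A minimal sketch (column names as in MySQL 5.0/5.1):

```sql
-- Run on the slave just before taking the snapshot, ideally with the
-- SQL thread stopped so the position is stable while you record it.
STOP SLAVE SQL_THREAD;
SHOW SLAVE STATUS\G
-- The columns of interest are:
--   Relay_Master_Log_File  -- master binlog file of the last executed event
--   Exec_Master_Log_Pos    -- position within that file
START SLAVE SQL_THREAD;
```

With a ZFS-snapshot setup like the 7410, there is no opportunity to run this, which is exactly the gap described above.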

Our idea is not new; it was taken from the TransactionalReplication patch http://code.google.com/p/google-mysql-tools/wiki/TransactionalReplication and bug report http://bugs.mysql.com/bug.php?id=34058. Basically, we want to store the replication state in the InnoDB transaction log file; that way we can always see which replication position the last executed transaction corresponds to. Of course, this works only if you write exclusively to the InnoDB storage engine; the case of a mix of storage engines is much more complex, and I do not see an easy way to solve it.

So we propose the overwrite_relay_log_info extension for the XtraDB storage engine. The name comes from the fact that XtraDB will try to rewrite relay-log.info with the recovered replication position; at the very least this information is available in the error-log output, so you can repoint your slave to the correct position by executing the CHANGE MASTER command.
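If you take the position from the error log rather than relying on the rewritten relay-log.info, repointing the slave looks like this (the host and coordinates below are placeholders, not output from the patch):

```sql
STOP SLAVE;
CHANGE MASTER TO
  MASTER_LOG_FILE = 'mysql-bin.000123',  -- binlog file reported in the error log
  MASTER_LOG_POS  = 456789;              -- position reported in the error log
START SLAVE;
```

Host, user, and password settings are preserved from the previous configuration, so only the coordinates need to change.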

Currently the patch is available on Launchpad at lp:~percona-dev/percona-xtradb/overwrite-relay-log-info, and after testing it will go into the main XtraDB tree. If there is interest, the patch will be ported to the 5.0 and 5.1 trees. Some information is available on this documentation page.

15 Comments
Shlomi Noach

Very nice!!

Domas Mituzas

it is my favorite feature from google patch 😉

Alex P.

hi, i have a ‘Master’ DB replicating to 3 ‘Slaves’ (with InnoDB engine) and it’s a bit difficult to restore when a slave stops replicating… if I’m not wrong, this “patch/feature” will help me restore replication, right?
thanks in advance and sorry for my ‘not very good’ english 🙂
btw, great mysql blog.

Alex P.

hi. Sometimes I get (on a slave) the “Duplicate key entry” error (i see this in the ‘show slave status’ Last_error), always with an insert query. Don’t know exactly why, but it just happened on 1 of 3 slaves, and the others were fine and running normally. With MyISAM you just need “load data from master”, but with InnoDB we have to download and run a backup of the Master DB and it takes some minutes….. i have this problem around 1 time every 3 months but im looking for a better solution..
thanks again.

Gil

Alex, I have run into similar duplicate key errors in the past. It turns out the slave I was using to take backups had slightly different data than the master. You may want to do a consistency check using Maatkit just to be sure.

Vadim

Alex P, Gil

Actually this patch can fix that issue.

The problem is that when your slave crashes, or after some other failure, relay-log.info contains stale information, and when you start the slave it tries to repeat transactions that were already executed (and you get the Duplicate key error).

Our patch tries to fix that by updating relay-log.info to the real last executed transaction.

However, a Duplicate key error may be related to other issues as well, so I can’t say for sure that our patch is the solution.

Alex P.

yes, i know the “duplicate key error” may be related to different issues (sometimes it’s an isp problem, manual inserts, our slaves use dyndns and an update of the service sometimes makes it crash, etc). The way i resolve the crash is making a backup of the master, sending this backup to the slave (via sftp), then executing this backup, deleting the ‘relay’ files and restarting the mysql service… after that the slave is running ok again. I know this “solution” is not very good (what do you think?), so im trying to find another one; that’s why i think this patch/feature will help me.

Ernesto Vargas

Alex P,

You could also do:

STOP SLAVE; SET GLOBAL SQL_SLAVE_SKIP_COUNTER=1; START SLAVE;

So replication will skip that Dup Key error and continue with the next statement in the replication stream.

Alex P.

thank you. i will try it and find if can solve the problem.

🙂

Aric

Also, in my.cnf you can skip replication errors.

slave-skip-errors = all

Baron Schwartz

These “solutions” make replication LESS reliable and MORE likely to have further problems! FIX the problem, don’t hide it.

Shlomi Noach

@Aric,

Skipping errors by specifying “slave-skip-errors = all” in my.cnf is quite dangerous, since you will not even be aware of replication issues; with Ernesto’s solution, you are at least forced into manual intervention, which may supply you with enlightenment about the problem’s origins.

Instead, your replicated data will keep on moving away from master’s data.

Robin

Hello,everybody.
Does MySQL have a feature to customize the slave port? I see my slave’s port is random.
Thank you!

James Parks

Your slave port is random because you’re looking at its ephemeral port. The slave does not listen for incoming connections from the master, but is rather a client of the master.

I imagine you were asking this question because you were trying to figure out which ports to allow through your firewall. If you allow all established TCP connections, you’ll solve this problem. In iptables, it would look something like this:

-A INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT

I know it’s been a long time since you posted, but thought someone else might find this useful.