February 9, 2012

Confusing MySQL Replication Error Message

I have already written about some confusing MySQL error messages; here is one more:

080603 20:53:10 [Note] Slave: connected to master 'repl@host.com:3306',replication resumed in log 'master-bin.003676' at position 444286437
080603 20:53:10 [Note] Slave: received end packet from server, apparent master shutdown:
080603 20:53:10 [Note] Slave I/O thread: Failed reading log event, reconnecting to retry, log 'master-bin.003676' position 444292333
080603 20:53:10 [Note] Slave: connected to master 'repl@host.com:3306',replication resumed in log 'master-bin.003676' at position 444292333
080603 20:53:10 [Note] Slave: received end packet from server, apparent master shutdown:
080603 20:53:10 [Note] Slave I/O thread: Failed reading log event, reconnecting to retry, log 'master-bin.003676' position 444294573
080603 20:53:10 [Note] Slave: connected to master 'repl@host.com:3306',replication resumed in log 'master-bin.003676' at position 444294573
080603 20:53:10 [Note] Slave: received end packet from server, apparent master shutdown:
080603 20:53:10 [Note] Slave I/O thread: Failed reading log event, reconnecting to retry, log 'master-bin.003676' position 444298239
080603 20:53:10 [Note] Slave: connected to master 'repl@host.com:3306',replication resumed in log 'master-bin.003676' at position 444298239

After setting up a new slave server, I found the error log flooded with messages like these, and nothing in them hints at what is wrong.

In fact, the problem in this case was that, because of a configuration error, two slave servers had the same server-id.
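A quick way to catch this kind of misconfiguration is to collect the server-id of every server in the replication setup (for example by running SELECT @@server_id on each one) and look for collisions. Here is a minimal sketch of that check in Python. The host names and the inventory dictionary are hypothetical, and gathering the values from your servers is left out.

```python
from collections import defaultdict


def find_duplicate_server_ids(server_ids):
    """Return {server_id: [hosts]} for every id used by more than one host.

    `server_ids` maps a host name to the server-id configured on it.
    """
    hosts_by_id = defaultdict(list)
    for host, server_id in server_ids.items():
        hosts_by_id[server_id].append(host)
    return {
        server_id: sorted(hosts)
        for server_id, hosts in hosts_by_id.items()
        if len(hosts) > 1
    }


# Hypothetical inventory: two slaves were set up from the same my.cnf.
inventory = {"master": 1, "slave-a": 2, "slave-b": 2}
print(find_duplicate_server_ids(inventory))  # prints {2: ['slave-a', 'slave-b']}
```

An empty result means every server-id in the inventory is unique.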

Seriously, the master can clearly see the problem here: two servers with the same server-id are connected and replicating. It should report that to the slave instead of just sending an end packet.

At the very least, it would be nice for the message to include a possible reason, which MySQL already does in many other cases.
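Until the server gives such a hint itself, a small script can scan the slave's error log for the pattern and suggest the likely cause. This is only a rough sketch: the matched strings are taken from the log excerpt above, while the function names, the threshold of three cycles and the wording of the hint are my own assumptions. A duplicate server-id is also only one possible cause of this pattern.

```python
import re

# Message fragments copied from the log excerpt in this post.
END_PACKET = re.compile(r"received end packet from server, apparent master shutdown")
RECONNECT = re.compile(r"Failed reading log event, reconnecting to retry")


def count_reconnect_cycles(lines):
    """Count 'end packet' notes that are immediately followed by a reconnect note."""
    cycles = 0
    previous_was_end_packet = False
    for line in lines:
        if previous_was_end_packet and RECONNECT.search(line):
            cycles += 1
        previous_was_end_packet = bool(END_PACKET.search(line))
    return cycles


def hint_for(lines, threshold=3):
    """Return a hint string if the log shows at least `threshold` cycles, else None."""
    if count_reconnect_cycles(lines) >= threshold:
        return ("Repeated end-packet/reconnect cycles: "
                "check whether another slave uses the same server-id.")
    return None


# Shortened sample lines in the same shape as the excerpt above.
sample = [
    "080603 20:53:10 [Note] Slave: received end packet from server, apparent master shutdown:",
    "080603 20:53:10 [Note] Slave I/O thread: Failed reading log event, reconnecting to retry, log 'master-bin.003676' position 444292333",
] * 3
print(hint_for(sample))
```

To run it against a real log, pass the open file object (or any iterable of lines) to hint_for instead of the sample list.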

I’ve now filed it as a bug.

About Peter Zaitsev

Peter managed the High Performance Group within MySQL until 2006, when he founded Percona. Peter has a Master's Degree in Computer Science and is an expert in database kernels, computer hardware, and application scaling.

Comments

  1. Ernesto Vargas says:

    One day it happened to me, and it took me almost an hour to figure that out.

    Moving forward, I always use a base my.cnf that I copy to every other server, and the first thing I do is increase the server-id.

    Could MySQL just use the server name instead of a numeric value?

  2. peter says:

    Ernesto,

    I think a lot of people have a script or process to set server-id, but things tend to break from time to time, and it is great to get a good error message.

    Server-id is stored in the binary log and is an integer, which is why a server name would not work: too much overhead.

  3. Cosimo says:

    Great post! I just got this error, and all the other related pages I could find were completely off-track.

    Anyway, of the two slaves with the same server-id, one lagged behind for no apparent reason (of course, now I know why), and the other I just stopped. Is the former slave ever going to catch up with the master (Seconds_behind_master = 0) or not?

  4. yingkuan says:

    Thanks a whole bunch! Got the error in one of our clusters. Saved me a lot of time.
    Definitely misleading and confusing. Almost got our network guy killed.
    It really looks like an intermittent network problem: the slave keeps catching up with the master but throws errors every second.

  5. yingkuan says:

    btw, our version is 5.0.84-64

  6. Sean Scullion says:

    Amazing! Thank you! :D

  7. Harsha says:

    Guys,

    I get exactly the same error, but the server-ids are unique.

    How do I get out of this?

    thanks,
    Harsha

  8. Harsha says:

    Additional info:

    Master is 5.1.48 [server-id 1]
    1st Slave is 5.1.50 [server-id 2] and
    2nd slave is 5.1.48 [server-id 3]

    Does anyone see an incompatibility issue?

    Note: replication is working, but the slave keeps logging the confusing error message.

  9. Guano says:

    I’ve created 2 extra slaves with the same server-id as the original slave. After some time, one of them caught up with the master and the other started falling further behind; that is when I noticed the I/O message in the logs.
    Should I rebuild the slaves from scratch (200 GB of data) or can I just change the server-id and restart?
