August 21, 2014

read_buffer_size can break your replication

There are some variables that can affect the replication behavior and sometimes cause some big troubles. In this post I’m going to talk about read_buffer_size and how this variable together with max_allowed_packet can break your replication.

The setup is a master-master replication with the following values:

max_allowed_packet = 32M
read_buffer_size = 100M

To break the replication I’m going to load the 4 million rows with LOAD DATA INFILE:

MasterA (test) > LOAD DATA INFILE '/tmp/data' INTO TABLE t;
Query OK, 4510080 rows affected (26.89 sec)

After some time the SHOW SLAVE STATUS on MasterA gives us this output:

Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'log event entry exceeded max_allowed_packet; Increase max_allowed_packet on master; the first event 'mysql-bin.000002' at 74416925, the last event read from './mysql-bin.000004' at 171, the last byte read from './mysql-bin.000004' at 190.'

Very strange! We have loaded the data on MasterA and now it has the replication broken with a error on max_allowed_packet. The next step is to check the binary logs of both masters.

MasterA:

masterA> mysqlbinlog data/mysql-bin.000004 | grep block_len
#Begin_load_query: file_id: 1 block_len: 33554432
#Append_block: file_id: 1 block_len: 33554432
#Append_block: file_id: 1 block_len: 33554432
#Append_block: file_id: 1 block_len: 4194304
#Append_block: file_id: 1 block_len: 33554432
#Append_block: file_id: 1 block_len: 10420608

No block is larger than the max_allowed_packet (33554432).

MasterB:

masterB> mysqlbinlog data/mysql-bin.000004 | grep block_len
#Begin_load_query: file_id: 1 block_len: 33555308
#Append_block: file_id: 1 block_len: 33555308
#Append_block: file_id: 1 block_len: 33555308
#Append_block: file_id: 1 block_len: 4191676
#Append_block: file_id: 1 block_len: 33555308
#Append_block: file_id: 1 block_len: 10419732

Do you see the difference? 33555308 is larger than the max_allowed_packet size 33554432 so the Master2 has written the some blocks 876 bytes larger than the safe value. Then, MasterA tries to read again the binary log from MasterB and the replication breaks because packets are too large. No, the replicate_same_server_id is not enabled :)

What is the relation between read_buffer_size and this bug?

Again, an example is best than words. These are the new values:

max_allowed_packet = 32M
read_buffer_size = 16M

We run the LOAD DATA INFILE again and now this is the output on both servers:

#Begin_load_query: file_id: 1 block_len: 16777216
#Append_block: file_id: 1 block_len: 16777216
#Append_block: file_id: 1 block_len: 16777216
#Append_block: file_id: 1 block_len: 16777216
#Append_block: file_id: 1 block_len: 16777216
#Append_block: file_id: 1 block_len: 16777216
#Append_block: file_id: 1 block_len: 16777216
#Append_block: file_id: 1 block_len: 16777216
#Append_block: file_id: 1 block_len: 14614912

The maximum size of the data blocks are based on the read_buffer_size value so they are never larger than the max_allowed_packet :)

Therefore, a read_buffer_size value larger than max_allowed_packet can break your replication while importing data to MySQL. This bug affects from 5.0.x to the latest 5.5.25 release and the easiest workaround is to not have a read_buffer_size larger than max_allowed_packet. The bug 30435 seems not to be really solved.

And remember, big values on read_buffer_size will not increase your performance.

About Miguel Angel Nieto

Miguel joined Percona in October 2011. He has worked as a System Administrator for a Free Software consultant and in the supporting area of the biggest hosting company in Spain. His current focus is improving MySQL and helping the community of Free Software to grow. Miguel's roles inside Percona are Senior Support Engineer and Manager of EMEA Support Team.

Comments

  1. Nice catch Miguel,

    Looks like the minimum of max_allowed_packet and read_buffer_size is attempted to be used but because of header or something else the event in replication log ends up being larger than max_allowed_packet. I’m wondering why does it take so long to fix.

    It is also worth to note such large read_buffer_size is usually not a good idea.

    What exactly 5.5 release did you test it ? Couple of years from now people will be finding this blog post and “latest 5.5 release” will not mean the same thing to them.

  2. vishnu rao says:

    cool find.

    very curious to know if –log-slave-updates was enabled on master2 ?
    also, could this happen with normal insert statements instead of load data infile ?

  3. Justin Swanhart says:

    Hi vishnu,

    In order for this to happen, –log-slave-updates had to be enabled on the master2, otherwise master2 would not have rewritten the data packets into its binary log, to then read back by master1.

  4. liang xie says:

    cool ! we observed the same issue as well : )

  5. Peter, good one :) I’ve updated the post. It was 5.5.25.

  6. ruochen says:

    the maximum of max_allowed_packet is 1GB in mysql document,i set this values in my.cnf as usually,could you tell me the risk of it.thanks.

  7. zhang says:

    hi Miguel…

    I couldn’t find the comment about “block_len” in my binary log.

    I also did testing following your above two variables. But there are some differences. I just set up the simple master-slave replication on MySQL 5.6.5-m8. Then to run load the size of more than 1GB datafile on master, the replication is still working.

    As we know, read_buffer_size is only related to the queries on MyISAM. In my point actually, the variable is used on the storage level. but replication is working on the mysqld level. so I am puzzled.

  8. Mark Calagan posted this http://bugs.mysql.com/bug.php?id=60926 what has now unfortunately been marked as private.

    There is a summary here: http://code.google.com/p/sqlyog/issues/detail?id=1772

    Is it same, related or different?

  9. Actually Mark’s own Blog is still available even though the bug report he posted is not:
    http://www.facebook.com/note.php?note_id=10150236650815933

  10. Valeriy Kravchuk says:

    http://bugs.mysql.com/bug.php?id=60926 is not private any more…

  11. ruochen: I don’t see any problem of using a large max_allowed_packet value, it only sets a limit on the packet that you are going to send being 1GB is the limit of the protocol.

    zhang: I didn’t tried it with 5.6. anyway, the problem comes when the slave writes the chunks to its own binary log (log_slave_updates) so if you have a master master A-B-A or a chain replication A-B-C then the problem starts after the first slave.

  12. Manohar says:

    This error may also occur when you are try to read a log position on master binlog which doesn’t exist. So this is also a case to consider.

Speak Your Mind

*