Just yesterday I wrote about the math of automatic failover; today I’ll share my thoughts about what makes MySQL failover different from many other components and why the asynchronous nature of the standard replication solution causes problems with it. Let’s first think about the properties of the simple components we fail over: web servers, application servers, etc. We [...]
The Math of Automated Failover
A number of people have recently blogged about MySQL automated failover, prompted by the production incident that GitHub disclosed. Here is my take on it. When we look at systems providing high availability, we can identify 2 cases of the system breaking down. The first is when the system itself has a bug or limitation that does not [...]
How to lag a slave behind to avoid a disaster
MySQL Replication is useful and easy to set up. It is used for very different purposes, for example: splitting reads and writes; running data mining or reporting processes on the slaves; disaster recovery. It is important to mention that a replication server is not a backup by itself. A mistake on the master, for example a DROP DATABASE [...]
Percona XtraDB Cluster: Failure Scenarios with only 2 nodes
During the design period of a new cluster, it is always advised to have at least 3 nodes (this is the case with PXC, but it’s also the same with PRM). But why, and what are the risks? The goal of having more than 2 nodes, in fact an odd number is recommended in [...]
Percona XtraDB Cluster reference architecture with HaProxy
This post is a step-by-step guide to setting up Percona XtraDB Cluster (PXC) in a virtualized test sandbox. I used Amazon EC2 micro instances, but the content here is applicable to any kind of virtualization technology (for example VirtualBox). The goal is to give step-by-step instructions, so the setup process is understandable and [...]
Comparing Percona XtraDB Cluster with Semi-Sync replication Cross-WAN
I have a customer who is considering Percona XtraDB Cluster (PXC) in a two-colo WAN environment. They wanted me to run a test comparing PXC against semi-synchronous replication to see how they stack up against each other. Test Environment: The test environment included AWS EC2 nodes in US-East and US-West (Oregon). The ping RTT latency [...]
read_buffer_size can break your replication
There are some variables that can affect replication behavior and sometimes cause serious trouble. In this post I’m going to talk about read_buffer_size and how this variable, together with max_allowed_packet, can break your replication. The setup is a master-master replication with the following values: max_allowed_packet = 32M, read_buffer_size = 100M. To break the [...]
Announcement of Percona XtraDB Cluster 5.5.20 GA release
I am excited to announce the availability of the GA release of our new product, Percona XtraDB Cluster. Percona XtraDB Cluster is a High Availability and Scalability solution for MySQL users and is based on Percona Server 5.5.20. With this release we make clustering very easy and affordable for everyone. You can convert your existing [...]
Actively monitoring replication connectivity with MySQL’s heartbeat
Until MySQL 5.5, the only variable used to identify a network connectivity problem between master and slave was slave-net-timeout. This variable specifies the number of seconds to wait for more binary log events from the master before aborting the connection and establishing it again. With a default value of 3600 this has been a historically [...]
Statement based replication with Stored Functions, Triggers and Events
Statement-based replication writes the queries that modify data to the binary log, either to replicate them on the slave or to use them for point-in-time recovery (PITR). Here we will see how MySQL behaves when it needs to log “not usual” queries like Events, Functions, Stored Procedures, Local Variables, etc. We’ll [...]

