May 23, 2013

Minimizing Downtime from Lengthy AWS Outages

Well, it happened again…  Another lengthy EBS outage in the US-East region impacted several sites across the net.  While failures like this are rare, they can be quite costly and translate into headaches for the operations team when impact production systems for any length of time.  At Percona, we routinely help clients architect and deploy [...]

Automation: A case for synchronous replication

Just yesterday I wrote about math of automatic failover today I’ll share my thoughts about what makes MySQL failover different from many other components and why asynchronous nature of standard replication solution is causing problems with it. Lets first think about properties of simple components we fail over – web servers, application servers etc. We [...]

How to lag a slave behind to avoid a disaster

MySQL Replication is useful and easy to setup. It is used for very different purposes. For example: split read and writes run data mining or reporting processes on them disaster recovery Is important to mention that a replication server is not a backup by itself. A mistake on the master, for example a DROP DATABASE [...]

Percona XtraDB Cluster: Failure Scenarios with only 2 nodes

During the design period of a new cluster, it is always advised to have at least 3 nodes (this is the case with PXC but it’s also the same with PRM). But why and what are the risks ? The goal of having more than 2 nodes, in fact an odd number is recommended in [...]

New variable slave_max_allowed_packet for slave servers

One month ago I wrote about how a big read_buffer_size could break the replication. The bug is not solved but now there is an official workaround to ease this problem using a new configuration variable: slave_max_allowed_packet This new variable will be available in 5.1.64, 5.5.26, and 5.6.6 and can establish a different limit on the [...]

Faster Point In Time Recovery with LVM2 Snaphots and Binary Logs

LVM snapshots is one powerful way of taking a consistent backup of your MySQL databases – but did you know that you can now restore directly from a snapshot (and binary logs for point in time recovery) in case of that ‘Oops’ moment? Let me show you quickly how. This howto assumes that you already [...]

How to Monitor MySQL with Percona’s Nagios Plugins

In this post, I’ll cover the new MySQL monitoring plugins we created for Nagios, and explain their features and intended purpose. I want to add a little context. What problem were we trying to solve with these plugins? Why yet another set of MySQL monitoring plugins? The typical problem with Nagios monitoring (and indeed with [...]

STOP: DELETE IGNORE on Tables with Foreign Keys Can Break Replication

DELETE IGNORE suppresses errors and downgrades them as warnings, if you are not aware how IGNORE behaves on tables with FOREIGN KEYs, you could be in for a surprise. Let’s take a table with data as example, column c1 on table t2 references column c1 on table t1 – both columns have identical set of rows for [...]

Actively monitoring replication connectivity with MySQL’s heartbeat

Until MySQL 5.5 the only variable used to identify a network connectivity problem between Master and Slave was slave-net-timeout. This variable specifies the number of seconds to wait for more Binary Logs events from the master before abort the connection and establish it again. With a default value of 3600 this has been a historically [...]

Statement based replication with Stored Functions, Triggers and Events

Statement based replication writes the queries that modify data in the Binary Log to replicate them on the slave or to use it as a PITR recovery. Here we will see what is the behavior of the MySQL when it needs to log “not usual” queries like Events, Functions, Stored Procedures, Local Variables, etc. We’ll [...]