<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>MySQL Performance Blog &#187; replication</title>
	<atom:link href="http://www.mysqlperformanceblog.com/category/replication/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.mysqlperformanceblog.com</link>
	<description>Everything about MySQL Performance</description>
	<lastBuildDate>Sat, 21 Nov 2009 03:11:18 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.4</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Finding your MySQL High-Availability solution – Replication</title>
		<link>http://www.mysqlperformanceblog.com/2009/11/13/finding-your-mysql-high-availability-solution-%e2%80%93-replication/</link>
		<comments>http://www.mysqlperformanceblog.com/2009/11/13/finding-your-mysql-high-availability-solution-%e2%80%93-replication/#comments</comments>
		<pubDate>Fri, 13 Nov 2009 20:22:36 +0000</pubDate>
		<dc:creator>yves</dc:creator>
				<category><![CDATA[High Availability]]></category>
		<category><![CDATA[mysql]]></category>
		<category><![CDATA[replication]]></category>

		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/?p=1607</guid>
		<description><![CDATA[In the last two blog posts about High Availability for MySQL, we introduced definitions and provided a list of questions that you need to ask yourself before choosing an HA solution.  In this new post, we will cover the most popular HA solution for MySQL: replication.  
High Availability solution [...]]]></description>
			<content:encoded><![CDATA[<p>In the last two blog posts about High Availability for MySQL, we introduced <a href="http://www.mysqlperformanceblog.com/2009/10/09/finding-your-mysql-high-availability-solution-the-definitions/">definitions</a> and provided a list of <a href="http://www.mysqlperformanceblog.com/2009/10/16/finding-your-mysql-high-availability-solution-%E2%80%93-the-questions/">questions</a> that you need to ask yourself before choosing an HA solution.  In this new post, we will cover the most popular HA solution for MySQL: replication.</p>
<h3>High Availability solution for MySQL: Replication</h3>
<p>This HA solution is the easiest to implement and to manage.  You basically need to set up MySQL replication between a master and one or more slaves.  Upon failure of the master, one of the slaves is manually promoted to the master role and replication on the other slaves is re-pointed to the new master.  This solution works well with all the MySQL storage engines, including MyISAM (NDB is a special case, discussed later), but it suffers from the limitations of MySQL replication.  The main limitation, in terms of HA, is the asynchronous design of MySQL replication, which does not allow the master to be sure the slave has been updated before it returns from a <em>commit</em> statement.  There is a window of time in which a fully committed transaction may not yet have been pushed to the slave(s); if the master fails during that window, the transaction is lost.  Many large websites that are fine with some data loss rely on replication for HA and for read scaling.  </p>
<p>In addition to hardware failure, the level of availability of this solution is affected by the availability of the MySQL replication link between the servers.  Replication often breaks for various reasons, and while replication is broken, there is no High Availability.  The availability of this solution is also affected by how far the slaves were behind the master when the outage occurred.  So, if you want a good level of availability, you need a good monitoring and alerting system to react quickly to replication issues, and you need a rather small write load so that the slaves do not lag too far behind the master.  To maximize the level of availability, recovery should be automatic.</p>
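<p>As a rough illustration of the kind of check a monitoring system could run against each slave, here is a minimal Python sketch.  It assumes the caller has already fetched one row of SHOW SLAVE STATUS as a dictionary of strings; the function name <code>replication_alert</code> and the 30-second default threshold are assumptions made for this example, not recommendations.</p>

```python
from typing import Mapping, Optional


def replication_alert(status: Mapping[str, Optional[str]],
                      max_lag_seconds: int = 30) -> Optional[str]:
    """Return an alert message for one slave, or None when it looks healthy.

    `status` is assumed to be one row of SHOW SLAVE STATUS, keyed by column
    name. The threshold is a hypothetical default chosen for this sketch.
    """
    if status.get("Slave_IO_Running") != "Yes":
        return "replication broken: I/O thread is not running"
    if status.get("Slave_SQL_Running") != "Yes":
        return "replication broken: SQL thread is not running"
    lag = status.get("Seconds_Behind_Master")
    if lag is None:
        # The server reports NULL when it cannot compute the lag.
        return "replication lag unknown"
    if int(lag) > max_lag_seconds:
        return "slave is %s seconds behind the master" % lag
    return None


if __name__ == "__main__":
    lagging = {
        "Slave_IO_Running": "Yes",
        "Slave_SQL_Running": "Yes",
        "Seconds_Behind_Master": "120",
    }
    print(replication_alert(lagging))
```

<p>A real setup would feed the returned message into whatever alerting channel is already in place; the point is only that both broken replication and excessive lag need to raise an alarm.</p>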
<p>Apart from its simplicity, an HA solution based on replication has many interesting properties; no wonder it is so popular.  First, if the application is well designed and has specific database handles for read and write operations, this HA solution can scale read operations to a high level.  Using the slaves for reads has a second interesting side effect: the caches of the slaves are hot, so failing over to a slave means no degraded performance while caches warm up.  Finally, it is well known that with MySQL, altering a table means recreating the whole table, and it is a blocking operation.  Altering a large table may take many hours.  The trick here is to run the alter table on a slave; once it is done, we let the slave catch up with the master using the new table schema, fail over to the slave, and repeat the alter table on the other server.  These online schema changes are easier when a master-to-master topology is used.  </p>
<p>The following figure summarizes the simplest HA architecture using MySQL replication.  All writes go to the master while reads are spread between the master and the slave.  Upon failure of the master, replication is stopped on the slave and all traffic is redirected to the slave, which now handles reads and writes.</p>
<p><img src="http://www.mysqlperformanceblog.com/wp-content/uploads/2009/11/replication1.png" alt="HA replication" title="HA replication" width="524" height="322" class="alignnone size-full wp-image-1699" /></p>
<table cellpadding="1" cellspacing="1" summary="" border="1">
<tr>
<td bgcolor="#efefef"><b>Pros</b></td>
<td bgcolor="#efefef"><b>Cons</b></td>
</tr>
<tr>
<td>Simple</td>
<td>Variable level of availability (98-99.9+%)</td>
</tr>
<tr>
<td>Inexpensive</td>
<td>Not suitable for high write loads</td>
</tr>
<tr>
<td>All the servers can be used, no idle standby</td>
<td>Read scaling only if application splits reads from writes</td>
</tr>
<tr>
<td>Supports MyISAM</td>
<td>Can lose data</td>
</tr>
<tr>
<td>Caches on failover slave are not cold</td>
<td></td>
</tr>
<tr>
<td>Online schema changes</td>
<td></td>
</tr>
<tr>
<td>Low impact backups</td>
<td></td>
</tr>
</table>
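<p>To put the 98-99.9+% range from the table in perspective, the availability level can be converted into downtime per year.  The short Python sketch below does only that arithmetic; the function name is made up for this example and leap years are ignored.</p>

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years


def downtime_hours_per_year(availability_percent: float) -> float:
    """Hours per year a service is down at the given availability level."""
    return HOURS_PER_YEAR * (100.0 - availability_percent) / 100.0


if __name__ == "__main__":
    for level in (98.0, 99.0, 99.9):
        hours = downtime_hours_per_year(level)
        print("%.1f%% available: about %.1f hours of downtime per year" % (level, hours))
```

<p>At 98% this is roughly 175 hours, or about a week per year, while 99.9% is under 9 hours per year, which is why the quality of monitoring and the speed of failover matter so much for this solution.</p>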
<h3>Automatic failover with replication</h3>
<p>I already mentioned that for best HA levels, failover or recovery should be automatic.  There are tools to manage automatic failover  with replication like MMM, Flipper and Tungsten.  Here, I will quickly describe the most popular one, MMM.</p>
<p>With MMM, you need to add a separate server, the Manager, which, as the name implies, manages the availability of the MySQL service.  A high availability solution based on MMM requires at least 2 MySQL servers configured in a Master to Master topology.  Additional slaves can also be added.  An MMM agent runs on each MySQL server and is used to perform OS-level operations.  The principle of operation of MMM is based on VIPs.  There is one write VIP, where write operations are sent, and as many read VIPs as there are MySQL servers.  For the write VIP, MMM monitors the state of the current master and, upon failure, tries to kill all the connections to the failing server and transfers the write VIP to the other master.  For the read VIPs, MMM monitors the state of the slaves and removes the read VIP of a slave if it has failed or is lagging behind the master by more than a defined threshold.  One of the main limitations of MMM is its lack of fencing capability.  It is important to stop all the connections to the failing master, and if that server is not responding, maybe because of a network problem, a STONITH device must be used to fence it.  I am far from being an expert with MMM (other guys on my team are way better than me), but I heard that the MMM v1 code base had some deficiencies.  MMM v2 is a complete rewrite that addresses some of the shortcomings of v1. Walter Heck from OpenQuery gave an excellent <a href="http://forge.mysql.com/wiki/Dual_Master_Setups_With_MMM">webinar</a> on it recently.</p>
<p>The architecture of a highly available setup using MMM and Master-Master replication is presented in the figure below. Apart from the minimum requirement of two MySQL servers replicating each other, there is a third server, called the manager, that controls both MySQL servers through an agent running on each server. The manager controls and monitors the state of the replication and assigns virtual IPs for specific roles.  There is one VIP where write operations are sent and two or more VIPs where read operations are sent.  If replication on one of the MySQL servers lags too far behind, its read VIP is moved to another server.</p>
<p><img src="http://www.mysqlperformanceblog.com/wp-content/uploads/2009/11/master-master.png" alt="master-master" title="master-master" width="528" height="537" class="alignnone size-full wp-image-1700" /></p>
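<p>The VIP handling described above can be sketched as a toy model in Python.  This is not MMM's code and does not use its configuration format; the <code>Server</code> class, the <code>assign_vips</code> function, the VIP names and the 60-second lag threshold are all assumptions made for this illustration.  It also leaves out fencing, which, as noted, MMM does not provide.</p>

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Server:
    name: str
    alive: bool
    lag_seconds: int
    can_be_master: bool = True


def assign_vips(servers: List[Server], current_master: str, read_vips: List[str],
                write_vip: str = "write-vip", max_lag_seconds: int = 60) -> Dict[str, str]:
    """Map each VIP to a server name (toy model of the behaviour described above)."""
    by_name = {s.name: s for s in servers}
    assignment: Dict[str, str] = {}

    # Write VIP: stays on the current master while it is alive,
    # otherwise moves to another live server that may act as master.
    master = by_name.get(current_master)
    if master is None or not master.alive:
        candidates = [s for s in servers
                      if s.alive and s.can_be_master and s.name != current_master]
        master = candidates[0] if candidates else None
    if master is not None:
        assignment[write_vip] = master.name

    # Read VIPs: only servers that are alive and within the lag threshold
    # may hold one; VIPs are spread round-robin over the eligible servers.
    eligible = [s for s in servers if s.alive and max_lag_seconds >= s.lag_seconds]
    for i, vip in enumerate(read_vips):
        if eligible:
            assignment[vip] = eligible[i % len(eligible)].name
    return assignment


if __name__ == "__main__":
    pair = [Server("db1", alive=False, lag_seconds=0), Server("db2", alive=True, lag_seconds=0)]
    print(assign_vips(pair, "db1", ["read-vip-1", "read-vip-2"]))
```

<p>In the example, db1 has failed, so the write VIP and both read VIPs end up on db2, which matches the failover behaviour shown in the figure.</p>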
<p>In conclusion, replication can be used in many cases to build effective and scalable highly available solutions, but it has some limitations.  In my next blog post, I&#8217;ll present another HA solution built around Heartbeat and DRBD.</p>
    <hr noshade style="margin:0;height:1px" />
    <p>Entry posted by yves |
      <a href="http://www.mysqlperformanceblog.com/2009/11/13/finding-your-mysql-high-availability-solution-%e2%80%93-replication/#comments">No comment</a></p>
    ]]></content:encoded>
			<wfw:commentRss>http://www.mysqlperformanceblog.com/2009/11/13/finding-your-mysql-high-availability-solution-%e2%80%93-replication/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>State of the art: Galera &#8211; synchronous replication for InnoDB</title>
		<link>http://www.mysqlperformanceblog.com/2009/10/27/state-of-the-art-galera-synchronous-replication-for-innodb/</link>
		<comments>http://www.mysqlperformanceblog.com/2009/10/27/state-of-the-art-galera-synchronous-replication-for-innodb/#comments</comments>
		<pubDate>Tue, 27 Oct 2009 15:08:58 +0000</pubDate>
		<dc:creator>Vadim</dc:creator>
				<category><![CDATA[High Availability]]></category>
		<category><![CDATA[Innodb]]></category>
		<category><![CDATA[mysql]]></category>
		<category><![CDATA[replication]]></category>
		<category><![CDATA[xtradb]]></category>

		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/?p=1556</guid>
		<description><![CDATA[I first heard about Galera at Percona Performance Conference 2009, where Seppo Jaakola was presenting &#8220;Galera: Multi-Master Synchronous MySQL Replication Clusters&#8221;. I was impressed, as I personally had always wanted this for InnoDB, but we had it at the bottom of our list of plans, as it is very hard to implement properly.
The idea by itself [...]]]></description>
			<content:encoded><![CDATA[<p>I first heard about Galera at Percona Performance Conference 2009, where Seppo Jaakola was presenting <a href="http://www.percona.com/ppc2009/PPC2009_galera.pdf">&#8220;Galera: Multi-Master Synchronous MySQL Replication Clusters&#8221;</a>. I was impressed, as I personally had always wanted this for InnoDB, but we had it at the bottom of our list of plans, as it is very hard to implement properly.<br />
The idea itself is not new: I remember synchronous replication being announced for SolidDB at MySQL UC 2007, but the product was later killed by IBM.</p>
<p>For a long time after PPC 2009, the available version was mysql-galera-0.6, which had a serious flaw: to set up a new node you had to take down the whole cluster. All this time Codership (the company that develops Galera) was working on the 0.7 release, which introduces node propagation while keeping the cluster online. You can play with the 0.7pre release yourself: <a href="http://www.codership.com/en/downloads/galera">MySQL/Galera Release 0.7pre</a>. </p>
<p>In the current version, propagation is done by mysqldump from one of the nodes (the &#8220;donor&#8221;). In the next release Codership is going to support LVM snapshots and xtrabackup, which will make the setup of a new node even easier. The current annoyance I see is that if you shut down one node for a short period of time for quick maintenance, then after it starts, the node has to load a whole mysqldump as if it were a new, empty node. I hope the Codership guys will address this too.<br />
Another thing I miss for now is support for the InnoDB-plugin, which as we know performs much better than standard InnoDB &reg;.</p>
<p>So what is so interesting about Galera? A couple of things:</p>
<p>- High Availability. Any of the N standby nodes is available immediately when the main node fails. Galera is a serious contender for inclusion in the list Yves posted recently, <a href="http://www.mysqlperformanceblog.com/2009/10/16/finding-your-mysql-high-availability-solution-%e2%80%93-the-questions/">http://www.mysqlperformanceblog.com/2009/10/16/finding-your-mysql-high-availability-solution-%e2%80%93-the-questions/</a>. I am not sure how many nines it will provide <img src='http://www.mysqlperformanceblog.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> , but the effort for test setup and deployment should be comparable to an MMM setup.</p>
<p>- Scale Writes. Galera allows you to write to any of the N nodes and automatically propagates the changes to the other nodes. It sounds too ideal, and there is a drawback &#8211; as the number of nodes you write to increases, your transaction rollback rate may increase, especially if you are working on the same dataset. You can find some results on <a href="http://www.codership.com/en/content/benchmarking-write-scalability">Codership&#8217;s page</a>,  and I am going to run my own benchmarks as well. You can also see from the benchmark that communication overhead may be significant for short writes.</p>
<p>- Scale Reads. This can be done with regular replication, but with synchronous replication your &#8220;slave nodes&#8221; are in the same state; there is no &#8220;slave behind&#8221;. When you read from any slave, you read current data. It also has a serious drawback, though &#8211; the cluster is only as fast as the &#8220;weakest&#8221; node in the chain. So if one node gets overloaded and its performance degrades, the same happens to the whole cluster.</p>
<p>- Heterogeneous-database replication. It is not here yet, and I do not know what&#8217;s on Codership&#8217;s roadmap, but the group manager protocol in Galera is database independent, so it&#8217;s only a matter of database drivers. For InnoDB it is currently a set of patches, and I think it is quite possible to do the same for Postgres. So a MySQL-Postgres cluster setup is not so far off <img src='http://www.mysqlperformanceblog.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
<p>On the <a href="http://www.codership.com/en/company/about">&#8220;Company page&#8221; Codership says</a> their goal is &#8220;to promote and exploit the latest developments in computer science to produce fast and scalable synchronous replication solution that &#8220;just works&#8221; for databases and similar applications&#8221;, which I think they are succeeding at. Implementing a fast, scalable, and working group communication and transaction manager is an art.</p>
<p>For now I would not put the 0.7 release into production, but you may seriously consider playing with it in a test environment and reporting bugs to the Codership team; they are very responsive.<br />
I am waiting for the next releases and looking forward to integrating it with XtraDB.</p>
    <hr noshade style="margin:0;height:1px" />
    <p>Entry posted by Vadim |
      <a href="http://www.mysqlperformanceblog.com/2009/10/27/state-of-the-art-galera-synchronous-replication-for-innodb/#comments">6 comments</a></p>
    ]]></content:encoded>
			<wfw:commentRss>http://www.mysqlperformanceblog.com/2009/10/27/state-of-the-art-galera-synchronous-replication-for-innodb/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>Why MySQL&#8217;s binlog-do-db option is dangerous</title>
		<link>http://www.mysqlperformanceblog.com/2009/05/14/why-mysqls-binlog-do-db-option-is-dangerous/</link>
		<comments>http://www.mysqlperformanceblog.com/2009/05/14/why-mysqls-binlog-do-db-option-is-dangerous/#comments</comments>
		<pubDate>Thu, 14 May 2009 13:01:03 +0000</pubDate>
		<dc:creator>Baron Schwartz</dc:creator>
				<category><![CDATA[replication]]></category>
		<category><![CDATA[tips]]></category>

		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/?p=689</guid>
		<description><![CDATA[I see a lot of people filtering replication with binlog-do-db, binlog-ignore-db, replicate-do-db, and replicate-ignore-db.  Although there are uses for these, they are dangerous and in my opinion, they are overused.  For many cases, there's a safer alternative.
The danger is simple: they don't work the way you think they do.  Consider the following [...]]]></description>
			<content:encoded><![CDATA[<p>I see a lot of people filtering replication with binlog-do-db, binlog-ignore-db, replicate-do-db, and replicate-ignore-db.  Although there are uses for these, they are dangerous and in my opinion, they are overused.  For many cases, there's a safer alternative.</p>
<p>The danger is simple: they don't work the way you think they do.  Consider the following scenario: you set binlog-ignore-db to "garbage" so data in the garbage database (which doesn't exist on the slave) isn't replicated.  (I'll come back to this in a second, so if you already see the problem, don't rush to the comment form.)</p>
<p>Now you do the following:</p>
<div class="syntax_hilite"><span class="langName">CODE:</span>
<div id="code-2">
<div class="code">
<ol>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">$ mysql</div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">mysql&gt; delete from garbage.<span style="">junk</span>;</div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">mysql&gt; use garbage;</div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">mysql&gt; update production.<span style="">users</span> set disabled = <span style="color:#800000;color:#800000;">1</span> where user = <span style="color:#CC0000;">"root"</span>; </div>
</li>
</ol>
</div>
</div>
</div>
<p></p>
<p>You just broke replication, twice.  Once, because your slave is going to execute the first query and there's no such table "garbage.junk" on the slave.  The second time, <em>silently</em>, because the update to production.users isn't replicated, so now the root user isn't disabled on the slave.</p>
<p>Why?  Because binlog-ignore-db doesn't do what you think.  The phrase I used earlier, "data in the garbage database isn't replicated," is a fallacy.  That's not what it does.  In fact, it <em>filters out binary logging for statements issued from connections whose default database is "garbage."</em>  In other words, filtering is not based on the contents of the query -- it is based on what database you USE.</p>
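<p>As a rough mental model of this rule (a sketch, not MySQL's actual code), the logging decision can be written as a function of the connection's default database only.  The function name <code>is_written_to_binlog</code> is hypothetical, and the sketch assumes statement-based logging and covers only binlog-ignore-db.</p>

```python
from typing import Iterable, Optional


def is_written_to_binlog(default_db: Optional[str], ignore_dbs: Iterable[str]) -> bool:
    """Model of binlog-ignore-db under statement-based logging.

    Only the connection's default database (set with USE) is checked.
    The text of the statement is never inspected.
    """
    return default_db not in set(ignore_dbs)


if __name__ == "__main__":
    # (default database of the connection, statement) pairs from the example above.
    session = [
        (None, "delete from garbage.junk"),
        ("garbage", "update production.users set disabled = 1 where user = 'root'"),
    ]
    for default_db, statement in session:
        logged = is_written_to_binlog(default_db, ["garbage"])
        print("%-8s %s" % ("LOGGED" if logged else "SKIPPED", statement))
```

<p>The delete is logged because no default database was selected, and the update is skipped because the default database was garbage, even though it touches production.users.</p>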
<p>The other configuration options I mentioned work similarly.  The binlog-do-db and binlog-ignore-db statements are particularly dangerous because they keep statements from ever being written to the binary log, which means you can't use the binary log for point-in-time recovery of your data from a backup.</p>
<p>In a carefully controlled environment, these options can have benefits, but I won't talk about that here.  (We covered that in <a href="http://www.amazon.com/dp/0596101716?tag=perinc-20">our book</a>.)</p>
<p>The safer alternative is to configure filters on the slave, with options that actually operate on the tables mentioned in the query itself.  These are replicate-wild-* options.  For example, the safer way to avoid replicating data in the garbage database is to configure replicate-wild-ignore-table=garbage.%.  There are still edge cases where that won't work, but it works in more cases and has fewer gotchas.</p>
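<p>To contrast this with the default-database rule, here is a simplified Python model of how a replicate-wild-ignore-table pattern is matched against the table a statement actually touches.  It is only a sketch: the function name is made up, it handles one table at a time, and it ignores backslash escapes and the other edge cases described in the manual.</p>

```python
import re
from typing import Iterable


def table_is_ignored(qualified_table: str, wild_ignore_patterns: Iterable[str]) -> bool:
    """Return True when `db.table` matches any wild-ignore pattern.

    Simplified model: `%` matches any run of characters and `_` matches a
    single character, as in LIKE. Backslash escapes are not handled.
    """
    for pattern in wild_ignore_patterns:
        regex = "".join(
            ".*" if ch == "%" else "." if ch == "_" else re.escape(ch)
            for ch in pattern
        )
        if re.fullmatch(regex, qualified_table):
            return True
    return False


if __name__ == "__main__":
    patterns = ["garbage.%"]
    for table in ("garbage.junk", "production.users"):
        action = "ignored" if table_is_ignored(table, patterns) else "replicated"
        print(table, "is", action, "on the slave")
```

<p>Here the decision depends on the table named in the statement, not on which database the connection happened to USE, so the update to production.users is replicated regardless of the default database.</p>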
<p>If you are confused, you should read the <a href="http://dev.mysql.com/doc/refman/5.1/en/replication-rules.html">replication rules section of the manual</a> until you know it by heart <img src='http://www.mysqlperformanceblog.com/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> </p>
    <hr noshade style="margin:0;height:1px" />
    <p>Entry posted by Baron Schwartz |
      <a href="http://www.mysqlperformanceblog.com/2009/05/14/why-mysqls-binlog-do-db-option-is-dangerous/#comments">10 comments</a></p>
    ]]></content:encoded>
			<wfw:commentRss>http://www.mysqlperformanceblog.com/2009/05/14/why-mysqls-binlog-do-db-option-is-dangerous/feed/</wfw:commentRss>
		<slash:comments>10</slash:comments>
		</item>
		<item>
		<title>Faster MySQL failover with SELECT mirroring</title>
		<link>http://www.mysqlperformanceblog.com/2009/02/01/fast-mysql-master-master-failover-with-select-mirroring/</link>
		<comments>http://www.mysqlperformanceblog.com/2009/02/01/fast-mysql-master-master-failover-with-select-mirroring/#comments</comments>
		<pubDate>Sun, 01 Feb 2009 14:13:27 +0000</pubDate>
		<dc:creator>Baron Schwartz</dc:creator>
				<category><![CDATA[High Availability]]></category>
		<category><![CDATA[replication]]></category>
		<category><![CDATA[tools]]></category>

		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/?p=584</guid>
		<description><![CDATA[One of my favorite MySQL configurations for high availability is master-master replication, which is just like normal master-slave replication except that you can fail over in both directions.  Aside from MySQL Cluster, which is more special-purpose, this is probably the best general-purpose way to get fast failover and a bunch of other benefits (non-blocking [...]]]></description>
			<content:encoded><![CDATA[<p>One of my favorite MySQL configurations for high availability is master-master replication, which is just like normal master-slave replication except that you can fail over in both directions.  Aside from MySQL Cluster, which is more special-purpose, this is probably the best general-purpose way to get fast failover and a bunch of other benefits (<a href="http://www.mysqlperformanceblog.com/2007/10/29/hacking-to-make-alter-table-online-for-certain-changes/">non-blocking ALTER TABLE</a>, for example).</p>
<p>The benefit is that you have another server with all the same data, up and running, ready to serve queries.  In theory, it's a truly <em>hot</em> standby (stay with me -- that's not really guaranteed).  You don't get this with shared storage or DRBD, although those provide stronger guarantees against data loss if mysqld crashes.  And you can use the standby (passive) master for serving some SELECT queries, taking backups, etc as usual.  However, if you do this <em>you actually compromise your high-availability plan</em> a little, because you can mask the lack of capacity that will result when one of the servers is down and you have to rely on just one server to keep everything on its feet.</p>
<p>If you need really high availability, you can't load the pair of servers more than a single server can handle.  (You can always use the passive server for non-essential needs -- it doesn't have to be completely dead weight.)  As a result, some people choose to <em>make the passive server truly passive, handling none of the application's queries</em>.  It just sits there replicating and doing nothing else.</p>
<p>The problem is that the passive server's caches start to get skewed to handle the write workload from replication, and not the read workload it will have to handle if there's a planned or unplanned failover.  This isn't a big problem on small systems, but with buffer pools in the dozens of gigabytes (which is arguably "small" these days), it starts to matter a lot.  Warming up a system so it's actually responsive can take hours.  As a result, the passive master isn't truly hot anymore.  It needs to handle the workload it's supposed to be ready to take over.  If you fail over to it, it might perform very badly -- get unresponsive, cause tons of I/O, etc.  In reality, it can be completely unusable for a long time.</p>
<p>To measure how much this really matters, I did some tests for a customer who was having troubles with this type of scenario.  I used <a href="http://www.maatkit.org/doc/mk-query-digest.html">mk-query-digest</a> (with some new features) to watch the traffic on the active master and replay SELECT queries against the passive one.  I timed the results and ran them through the analysis part of mk-query-digest.  A simple key lookup ran in tens of milliseconds on the active master, but executed for up to dozens of seconds on the passive one.</p>
<p>After a couple of hours of handling SELECT traffic, these same queries were responding nicely on the passive master, too.</p>
<p>Is that all?  "Buffer pool warmed up, performance is better, case closed!"  No.  This isn't as simple as it sounds on the surface.  There are two things happening and both are important to understand.</p>
<p>The first, most obvious phenomenon is that the buffer pool gets skewed to handle the write workload. Since we're running <a href="http://www.percona.com/percona-lab.html">Percona's patched server</a>, we can actually <a href="http://www.percona.com/docs/wiki/patches:innodb_show_bp">measure what's in the buffer pool</a>.  I measured the active master's buffer pool with the following query:</p>
<div class="syntax_hilite"><span class="langName">SQL:</span>
<div id="sql-5">
<div class="sql">
<ol>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color: #993333; font-weight: bold;">SELECT</span> table_schema, table_name, page_type, count<span style="color:#006600; font-weight:bold;">&#40;</span>*<span style="color:#006600; font-weight:bold;">&#41;</span></div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color: #993333; font-weight: bold;">FROM</span> information_schema.innodb_buffer_pool_content <span style="color: #993333; font-weight: bold;">GROUP</span> <span style="color: #993333; font-weight: bold;">BY</span> <span style="color: #cc66cc;color:#800000;">1</span>, <span style="color: #cc66cc;color:#800000;">2</span>, <span style="color: #cc66cc;color:#800000;">3</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color: #993333; font-weight: bold;">INTO</span> <span style="color: #993333; font-weight: bold;">OUTFILE</span> <span style="color: #ff0000;">'/tmp/buffer-pool-contents.txt'</span>; </div>
</li>
</ol>
</div>
</div>
</div>
<p></p>
<p>I loaded this file into a table on my laptop with LOAD DATA INFILE and kept it for later.  I did the same on the passive master.  Then I used mk-query-digest to watch the traffic on the active master:</p>
<div class="syntax_hilite"><span class="langName">CODE:</span>
<div id="code-6">
<div class="code">
<ol>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">mk-query-digest --processlist h=active \</div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp; --filter <span style="color:#CC0000;">'$event-&gt;{arg} =~ m/^SELECT/i'</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp; --execute h=passive </div>
</li>
</ol>
</div>
</div>
</div>
<p></p>
<p>After a bit I CTRL-C'ed it and it printed out the analysis of the time taken to run the queries against the passive master.  I restarted it and after a few hours of this I did the same thing; the query timings were dramatically better now.  Then I just let it keep running without any aggregation options to avoid any overhead of storing and analyzing queries.  (I added --mirror and --daemonize options so it can run in the background and follow along when the passive/active roles switch.)</p>
<p>After a day or so of doing this, I re-sampled the buffer pool contents on the passive server.  With all three samples stored in tables on my laptop, I wrote a query against these three sets of stats to find the top tables on the active server and left-join those against the tables on the passive server, with both a mixed workload from my mirrored SELECT statements and with the "pure" replication workload.  I totaled the pages up into gigabytes.  Here's the result:</p>
<table border="1">
<tr>
<th> db_table               </th>
<th> active (GB) </th>
<th> passive + SELECT (GB) </th>
<th> passive (GB) </th>
</tr>
<tr>
<td> site.benefits          </td>
<td> 8.30 </td>
<td> 5.73 </td>
<td> 1.32 </td>
</tr>
<tr>
<td> .                      </td>
<td> 3.13 </td>
<td> 0.94 </td>
<td> 0.50 </td>
</tr>
<tr>
<td> site.user_actions      </td>
<td> 2.55 </td>
<td> 4.09 </td>
<td> 6.29 </td>
</tr>
<tr>
<td> site.user_achievements </td>
<td> 1.36 </td>
<td> 1.20 </td>
<td> 0.35 </td>
</tr>
<tr>
<td> site.clicks            </td>
<td> 1.26 </td>
<td> 3.05 </td>
<td> 5.13 </td>
</tr>
<tr>
<td> site.actions_finished  </td>
<td> 1.14 </td>
<td> 0.46 </td>
<td> 0.74 </td>
</tr>
<tr>
<td> site.ratings           </td>
<td> 0.91 </td>
<td> 0.89 </td>
<td> 0.48 </td>
</tr>
</table>
<p>The difference is clear.  The buffer pool contains over 8G of data for the site.benefits table on the active master, but if you just put a replication workload on the server, that falls to 1.32G.  Other tables are similar.  The mixed workload with some SELECT queries mirrored is somewhere between the two.</p>
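<p>For readers who want to reproduce the comparison, here is a minimal sketch of the arithmetic in Python.  It is not the query I actually ran.  It assumes InnoDB's default 16 KB page size, and the function names and the dictionaries of page counts are hypothetical stand-ins for the three exported samples.</p>

```python
# Assumption: InnoDB's default 16 KB page size; adjust if your server differs.
PAGE_SIZE_BYTES = 16 * 1024


def pages_to_gb(pages):
    """Convert a count of buffer pool pages to gigabytes (2**30 bytes)."""
    return pages * PAGE_SIZE_BYTES / 2**30


def compare_samples(active, mixed, passive, top_n=10):
    """Left-join the top tables on the active server against the other samples.

    Each argument maps a "db.table" name to a page count (hypothetical inputs,
    for example loaded from the exported buffer pool files).
    """
    top = sorted(active, key=active.get, reverse=True)[:top_n]
    return [
        (
            name,
            round(pages_to_gb(active[name]), 2),
            round(pages_to_gb(mixed.get(name, 0)), 2),
            round(pages_to_gb(passive.get(name, 0)), 2),
        )
        for name in top
    ]
```

<p>Tables missing from a passive sample show up as 0 rather than disappearing, which is the point of the left join.</p>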
<p>One thing we don't know is which pages are in the pool. Same table, same size of data doesn't mean same buffer pool contents.  An insert-only workload will probably fill the buffer pool with the most recent data; a mixed workload will usually have some different hot spot or mixture of hot spots, so it'll bring different parts of the table into memory.</p>
<p>So that's the first thing that's happening.  The second is the insert buffer.  Notice the pages with no database or table name -- the second row in the table above.  Those are a mixture of things, but it's overwhelmingly the insert buffer.</p>
<p>As <a href="http://www.mysqlperformanceblog.com/2009/01/13/some-little-known-facts-about-innodb-insert-buffer/">Peter explained in his recent post on the insert buffer</a>, the other thing the SELECTs do is keep the insert buffer in a production steady-state.  The SELECTs force the buffered records to be merged, and a lot more of the pages from the insert buffer are in the buffer pool, not on disk.  So it's not just the buffer pool that gets skewed with a write-only workload!  The insert buffer can also cause terrible performance.  In this particular case, there are some subtleties about exactly what's happening that I'm still investigating and may write more about later.</p>
<p>So what can we conclude from this?  Simply this: if you have a standby server that's not under a realistic workload, you won't be able to get good performance after a failover.  You need to use some technique to mirror the read-only workload to the passive server.  It doesn't have to be the tools I used -- it could be MySQL Proxy or a TCP sniffer or anything else.  But if you need fast failover, you need some way to at least partially emulate a production workload on the standby machine.</p>
<p>PS: I see <a href="http://scale-out-blog.blogspot.com/2009/02/simple-ha-with-postgresql-point-in-time.html">Robert Hodges just published an article on warm standby for PostgreSQL</a>.  Link love for interested readers.</p>
    <hr noshade style="margin:0;height:1px" />
    <p>Entry posted by Baron Schwartz |
      <a href="http://www.mysqlperformanceblog.com/2009/02/01/fast-mysql-master-master-failover-with-select-mirroring/#comments">9 comments</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.mysqlperformanceblog.com/2009/02/01/fast-mysql-master-master-failover-with-select-mirroring/feed/</wfw:commentRss>
		<slash:comments>9</slash:comments>
		</item>
		<item>
		<title>High-Performance Click Analysis with MySQL</title>
		<link>http://www.mysqlperformanceblog.com/2008/12/22/high-performance-click-analysis-with-mysql/</link>
		<comments>http://www.mysqlperformanceblog.com/2008/12/22/high-performance-click-analysis-with-mysql/#comments</comments>
		<pubDate>Tue, 23 Dec 2008 03:48:17 +0000</pubDate>
		<dc:creator>Baron Schwartz</dc:creator>
				<category><![CDATA[ideas]]></category>
		<category><![CDATA[optimizer]]></category>
		<category><![CDATA[replication]]></category>
		<category><![CDATA[tips]]></category>
		<category><![CDATA[xtradb]]></category>

		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/?p=562</guid>
		<description><![CDATA[We have a lot of customers who do click analysis, site analytics, search engine marketing, online advertising, user behavior analysis, and many similar types of work.  The first thing these have in common is that they're generally some kind of loggable event.
The next characteristic of a lot of these systems (real or planned) is the [...]]]></description>
			<content:encoded><![CDATA[<p>We have a lot of customers who do click analysis, site analytics, search engine marketing, online advertising, user behavior analysis, and many similar types of work.  The first thing these have in common is that they're generally some kind of loggable event.</p>
<p>The next characteristic of a lot of these systems (real or planned) is the desire for "real-time" analysis.  Our customers often want their systems to provide the freshest data to their own clients, with no delays.</p>
<p>Finally, the analysis is usually multi-dimensional.  The typical user wants to be able to generate summaries and reports in many different ways on demand, often to support the functionality of the application as well as to provide reports to their clients.  Clicks by day, by customer, top ads by clicks, top ads by click-through ratio, and so on for dozens of different types of slicing and dicing.</p>
<p>And as a result, one of the most common questions we hear is how to build high-performance systems to do this work.  Let's see some ways you can build the functionality you need and get the performance you need.  Because I've built two such systems to manage online ads through Google Adwords, Yahoo, MSN and others, it's easy and familiar for me to use the example of search engine marketing.  I'll do that throughout this article.</p>
<p><strong>Requirements</strong></p>
<p>The words "need" and "want" are different.  Do you really need atomic-level data?  Do you really need real-time reporting?  If you do, the problem is much more expensive to solve.</p>
<p>Start with the granularity of your data.  What data do you need to make your business run?  If you can't get access to the time of day of every click on every ad, will it hamper your ability to measure the ad's value?  Is it enough to know how many times the ad was clicked each day?  If so, you can roll all those events up into a per-day table.</p>
<p>Next, let's look at "real-time."  None of the big three (Google, Yahoo, MSN) provided real-time reporting the last time I was involved with them (and I suspect this is still true).  It's too expensive.  Consider your users' expectations.  For most applications I've been involved with, day-old data is adequate, and users don't expect real-time.  The trick here is that when you start out, real-time is possible because your data is small.  "Hey, we do real-time reporting.  Google doesn't even do that!  We're better!" Then you get popular <img src='http://www.mysqlperformanceblog.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />   And if you've promoted your better-ness in the meantime, you might have to do some awkward backpedaling with customers, who now expect real-time data.  The database giveth, and the database taketh away.</p>
<p>Finally, you should think a lot about how you need to query the data.  It is a hard question to answer, and sometimes I've seen it evolve over time, especially as the growing data size forces it to.  This goes back to what data you really need to make your business run.  Anything else is gravy.  If there are nice-to-haves, consider not building them in.  <a href="http://blip.tv/file/1356502">Listen to some talks by 37Signals</a> if you need inspiration to toss things out.  Define the types of queries you absolutely have to have, if possible, and note the ways and types of aggregation (by-ad by-day, for example).</p>
<p>Sometimes I ask a customer "what kinds of queries do you have to run?" and they say "we can't decide, so we want to just store everything." If you can't decide yet, then don't store everything in the database.  Instead, store the source data in some fashion that you can reload later, such as flat files, and build support in the database for one or two capabilities you absolutely need now; then add the rest later, reloading the data if needed.</p>
<p><strong>Aggregate</strong></p>
<p>Aggregation is absolutely key for most people.  There are special cases, and there are ways to do general-purpose work without aggregating (see the section below on technologies), but if you're doing this with vanilla MySQL, you will need to aggregate your data.</p>
<p>What you want to do is aggregate in ways that optimize the most expensive things you'll do.  And then, you might super-aggregate too.  For example, if you aggregate by day and then you do a lot of queries over 365-day ranges for year-over-year analysis, aggregate again by month.  Then write your queries to use the most aggregated data possible to save work.</p>
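<p>As a rough illustration of super-aggregation, here is a small Python sketch that rolls per-day rows up into per-month rows.  The row shape (day, ad, clicks) and the function name are assumptions for the example; in practice you would more likely do this with a query that writes into a monthly table.</p>

```python
from collections import defaultdict
from datetime import date


def rollup_by_month(daily_rows):
    """Sum (day, ad, clicks) rows into ((year, month), ad) totals."""
    monthly = defaultdict(int)
    for day, ad, clicks in daily_rows:
        monthly[(day.year, day.month), ad] += clicks
    return dict(monthly)
```

<p>A year-over-year query can then read 12 monthly rows per ad instead of 365 daily ones.</p>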
<p>Avoid operations that update huge chunks of aggregated data at once.  Among other things, you'll make replication lag badly.  More about this later.</p>
<p>Another way to say "aggregate" is to say "pre-compute."  If you have time-critical queries for your app to do its work, can you do the work ahead of time so the result is ready to fetch when needed?  This might or might not be aggregation.</p>
<p><strong>Denormalize</strong></p>
<p>Pre-computing and careful denormalization need to go together.  Figure out what other types of data you'll need in those aggregate tables, and include columns to support these queries.  But beware of denormalizing with character data; try to make your rows fixed-length.</p>
<p>One reason denormalization is important is that nested-loop joins on large data sets are very expensive.  If MySQL supported sort-merge or hash joins, you'd have other possibilities, but it doesn't, so you want to build your aggregate tables to avoid joins.</p>
<p><strong>Watch Data Types</strong></p>
<p>Does your ad ID look like "8a4dabde-1c82-102c-ab13-0019b984eacd" and is it stored in a VARCHAR(36)?  When tables get big, every byte matters a lot.  Use the smallest data types you can, the simplest character sets you can, and watch out for NULLable columns.  Use smallint unsigned or tinyint unsigned if you can.  You can save very large amounts of space.  Choose primary keys very carefully, especially with InnoDB tables -- don't use GUIDs.  Which brings me to my next point:</p>
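<p>To put numbers on "every byte matters," here is a back-of-the-envelope sketch in Python.  It only compares the width of one copy of the key; the row count in the usage note is a made-up figure, and real storage also depends on row format, length bytes, and how many indexes repeat the key.</p>

```python
import uuid

ad_id = "8a4dabde-1c82-102c-ab13-0019b984eacd"

# Character data stored for the ID as text (a VARCHAR also adds a length byte).
as_text_bytes = len(ad_id.encode("ascii"))
# The same value packed as raw bytes.
as_binary_bytes = len(uuid.UUID(ad_id).bytes)
# A MySQL INT UNSIGNED surrogate key is 4 bytes.
as_int_unsigned_bytes = 4


def key_bytes_saved(rows, wide, narrow):
    """Bytes saved in one copy of the key across `rows` rows."""
    return rows * (wide - narrow)
```

<p>For a hypothetical 100 million rows, <code>key_bytes_saved(100_000_000, as_text_bytes, as_int_unsigned_bytes)</code> comes to 3.2 billion bytes, and that is before counting the copy of the primary key that InnoDB keeps in every secondary index.</p>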
<p><strong>Use InnoDB</strong></p>
<p>Assuming that you will use the stock MySQL server, InnoDB is usually your best bet.  (Actually, <a href="https://launchpad.net/percona-xtradb">XtraDB</a> might be very interesting for you, but I digress).  Due to the cost of repairing huge MyISAM tables and taking downtime, I would not use MyISAM for anything but read-only tables when things get big.  And even if it's read-only, there's still another reason to use InnoDB/XtraDB tables...</p>
<p><strong>Optimize For I/O</strong></p>
<p>It is pretty much inevitable: if you do this kind of data processing in MySQL, you're going to end up heavily I/O bound.  Listen to any of the talks at past MySQL conferences from people who have built systems like yours, and there's a fair chance they will talk about how hard they have to work on I/O capacity.</p>
<p>What does this have to do with InnoDB?  Data clustering. InnoDB's primary keys define the physical order rows are stored in.  That lets you choose which rows are stored close to each other, which is very beneficial in many cases.  Especially on huge tables, it lets you scan portions of a table instead of the whole table if you a) choose your aggregation to match the order of your common queries and b) choose your primary key correctly.</p>
<p>Let's go back to the ad-by-day table.  If you query date ranges most of the time, you should define the primary key as (day, ad).  Don't use an auto-increment primary key, and don't put ad first.  If you put ad first, then you're going to scan the whole table to query for information about yesterday.  If you put day first, then yesterday will all be stored physically together (within the page -- the pages themselves may be widely separated, but that's another matter).</p>
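<p>Here is a toy model of that clustering effect in Python.  A sorted list stands in for the clustered index, and the days and ad IDs are invented.  It is only meant to show where one day's rows land under each key order, not how InnoDB lays out pages.</p>

```python
from bisect import bisect_left, bisect_right
from datetime import date

# Hypothetical rows: 3 days x 4 ads.
days = [date(2008, 12, 20), date(2008, 12, 21), date(2008, 12, 22)]
ads = [1, 2, 3, 4]

# Rows kept in primary-key order, the way a clustered index stores them.
by_day_ad = sorted((d, a) for d in days for a in ads)
by_ad_day = sorted((a, d) for d in days for a in ads)


def positions_for_day(keys, day, day_index):
    """Return the positions of all rows for `day` in the given key order."""
    return [i for i, key in enumerate(keys) if key[day_index] == day]


def range_for_day(keys, day):
    """With day as the leading column, a day's rows form one contiguous range."""
    lo = bisect_left(keys, (day,))
    hi = bisect_right(keys, (day, float("inf")))
    return lo, hi
```

<p>With (day, ad), the rows for one day sit at consecutive positions, so a range lookup finds them without touching anything else.  With (ad, day), the same rows are spread through the whole list, one per ad.</p>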
<p><strong>Don't Store Non-Aggregated Data</strong></p>
<p>I've been talking a lot about aggregated data.  What do you do with the non-aggregated data?  My answer is usually simple: just don't store it in the database.  Instead, pre-aggregate.  Suppose your data is coming from some Apache log or similar source.  Write a script to rip through the file and parse it 10k lines at a time, aggregating as it goes.  When each chunk is done, make it write out a CSV file and import that with LOAD DATA INFILE.  Keep those big fat log files out of the database.  The database is usually the most expensive and hardest-to-scale component in your system -- don't waste resources.</p>
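<p>A script along those lines might look like the following Python sketch.  The log format (a day and an ad ID separated by a tab) and all the function names are assumptions for illustration; a real Apache log needs a real parser.</p>

```python
import csv
import io
from collections import Counter
from itertools import islice


def parse_line(line):
    """Hypothetical log format: a day and an ad id separated by a tab."""
    day, ad = line.rstrip("\n").split("\t")
    return day, int(ad)


def aggregate_chunks(lines, chunk_size=10_000):
    """Yield one Counter of clicks per (day, ad) for each chunk of lines."""
    lines = iter(lines)
    while True:
        chunk = list(islice(lines, chunk_size))
        if not chunk:
            return
        yield Counter(parse_line(line) for line in chunk)


def to_csv(counts):
    """Render (day, ad, clicks) rows as CSV text for LOAD DATA INFILE."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for (day, ad), clicks in sorted(counts.items()):
        writer.writerow([day, ad, clicks])
    return out.getvalue()
```

<p>Two caveats.  LOAD DATA INFILE expects tab-separated fields by default, so a comma-separated file needs FIELDS TERMINATED BY ','.  And because each chunk is aggregated on its own, the same (day, ad) pair can appear in more than one file, so the load step has to add to existing rows rather than assume each key is new.</p>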
<p>Another benefit of this is the chance to parallelize.  As you know, MySQL doesn't do intra-query parallelization, so ETL jobs written to rely on SQL tend to get really bogged down.  In contrast, moving the processing outside the database lets you parallelize trivially.</p>
<p>If you need to analyze the non-aggregated data, you can store it on the filesystem and write custom scripts to do special-purpose tasks on it.  Storing a little meta-data about each file can help a lot.  Store the ranges of values for various attributes, for example; or the presence or absence of values.  You can put these into the database in a little meta-table.  Then your script can figure out which files it can ignore.  What we're doing here starts to look like a hillbilly version of Infobright, which I'll talk about later.</p>
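<p>The meta-table idea can be sketched in a few lines of Python.  The <code>FileMeta</code> fields and the file names in the usage note are hypothetical; the only point is that a script can rule out whole files from their stored value ranges before opening any of them.</p>

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class FileMeta:
    """Hypothetical meta-table row describing one raw log file."""
    path: str
    min_day: date
    max_day: date


def files_to_scan(meta_rows, start, end):
    """Keep only files whose day range overlaps the range start..end."""
    return [
        m.path
        for m in meta_rows
        if m.max_day >= start and end >= m.min_day
    ]
```

<p>Given metadata for, say, <code>clicks-2008-11.log</code> and <code>clicks-2008-12.log</code>, a question about mid-December only has to read the second file.</p>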
<p>Alternatively, you can store the atomic data as CSV files and use the CSV engine so you have an SQL interface to it (the meta-tables are still a valid approach here!).  This is an easy way to bypass the hard-to-scale database server for the initial insertion, because you can write CSV files with any programming language.  Naturally, CSV files don't store as compactly on disk as [Compressed] MyISAM or Archive.</p>
<p>These are just some ideas I'm throwing around -- the point is to think outside the box, even to think of things that seem "less advanced" than using a database.</p>
<p><strong>Sharding and Partitioning</strong></p>
<p>Sharding is inevitable if your write workload exceeds the capacity of a single server (or if you're using replication, the capacity of a single slave).  Sharding can also help you avoid massive tables that are too big to maintain.  If you know you'll get there eventually, plan for it in advance, because it changes the lifecycle of your application.</p>
<p>What about partitioning in MySQL 5.1?  I know there are some cases when it can help a lot, and we've proven that with our customers.  But you still have to think about how to avoid enormous tables that are hard to maintain, back up, and restore.  And the partitioning functionality is not done yet and not fully integrated into the server, so I expect to find a lot more bugs and annoyances.  There are already inconvenient limitations on some key parts of partitioning, such as maintenance and repair commands, that essentially negate the benefits of partitioning for those operations.  And finally, it doesn't save you from the downtime caused by ALTER TABLE -- a typical reason to think about master-master with failover and failback for maintenance.  As with anything, it's a cost-benefit equation.  What are your priorities?  Choose the solution that meets them.</p>
<p><strong>Be Careful With Data Integrity</strong></p>
<p>When you're storing several levels of aggregation, and there's denormalization, you need to be scrupulous about data cleanliness, because it's really hard to fix things up later.  If your data is coming from a partner site, and you upload bad data there, you'll be getting bad data back for a long time.  And every time you have some incremental job to update the aggregates, you're exposed to that bad data again.</p>
<p>Any inconsistencies in the atomic data tend to get magnified as it gets aggregated, because you suddenly have a single row created from many rows, and if the many rows don't match completely, the single one doesn't know what data should live in it.  And this only gets harder to resolve as you get more levels of aggregations.</p>
<p><strong>Watch Out For The Long Tail</strong></p>
<p>People talk about the long tail and how you can focus on optimizing the short head.  It's the classic 80-20 rule.  Maybe 80% of your ad impressions are on 20% of your ads!  Hooray!  But don't forget that if you're aggregating per-day, an ad that gets a million impressions takes one row, and an ad that gets one impression takes exactly the same: one row.  An impression per day becomes a fixed overhead of storage size.  So, you actually have as many rows as you have unique ads per day.  Viewed this way, suddenly you start to hate the ads that occasionally get an impression.  They're so wasteful!</p>
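<p>A quick Python sketch makes the two viewpoints concrete.  The function names and the threshold of 10 impressions for "tail" ads are arbitrary assumptions for illustration.</p>

```python
def daily_row_count(impressions_by_ad):
    """Rows added to a per-day aggregate: one per ad with any impressions."""
    return sum(1 for n in impressions_by_ad.values() if n > 0)


def tail_share_of_rows(impressions_by_ad, tail_threshold=10):
    """Fraction of the day's rows that come from ads at or below the threshold."""
    active = [n for n in impressions_by_ad.values() if n > 0]
    tail = [n for n in active if tail_threshold >= n]
    return len(tail) / len(active) if active else 0.0
```

<p>With one ad at a million impressions and two ads at one impression each, the tail is a rounding error in traffic but two thirds of the rows.  For capacity planning, the row count is the number that matters.</p>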
<p>It's easy to flip back and forth between viewpoints on this and get distracted into making a mistake.  Watch out when you do your capacity planning.  Don't get fooled into calculating the wrong thing.</p>
<p><strong>Be Creative With Table Structures</strong></p>
<p>Suppose you have some yes/no fact about an ad impression, such as whether it was a blue ad (whatever that means.)  You start out with this:</p>
<div class="syntax_hilite"><span class="langName">SQL:</span>
<div id="sql-9">
<div class="sql">
<ol>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color: #993333; font-weight: bold;">CREATE</span> <span style="color: #993333; font-weight: bold;">TABLE</span> ads_by_day_by_blueness <span style="color:#006600; font-weight:bold;">&#40;</span></div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp; day date <span style="color: #993333; font-weight: bold;">NOT</span> <span style="color: #993333; font-weight: bold;">NULL</span>,</div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp; ad int <span style="color: #993333; font-weight: bold;">UNSIGNED</span> <span style="color: #993333; font-weight: bold;">NOT</span> <span style="color: #993333; font-weight: bold;">NULL</span>,</div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp; is_blue tinyint <span style="color: #993333; font-weight: bold;">UNSIGNED</span> <span style="color: #993333; font-weight: bold;">NOT</span> <span style="color: #993333; font-weight: bold;">NULL</span>,</div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp; clicks int <span style="color: #993333; font-weight: bold;">UNSIGNED</span> <span style="color: #993333; font-weight: bold;">NOT</span> <span style="color: #993333; font-weight: bold;">NULL</span>,</div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp; impressions int <span style="color: #993333; font-weight: bold;">UNSIGNED</span> <span style="color: #993333; font-weight: bold;">NOT</span> <span style="color: #993333; font-weight: bold;">NULL</span>,</div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">....</div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp; <span style="color: #993333; font-weight: bold;">PRIMARY</span> <span style="color: #993333; font-weight: bold;">KEY</span><span style="color:#006600; font-weight:bold;">&#40;</span>day, ad, is_blue<span style="color:#006600; font-weight:bold;">&#41;</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color:#006600; font-weight:bold;">&#41;</span>; </div>
</li>
</ol>
</div>
</div>
</div>
<p></p>
<p>What can we improve here?  Especially assuming that there are indexes other than the primary key, we can shrink the primary key's width:</p>
<div class="syntax_hilite"><span class="langName">SQL:</span>
<div id="sql-10">
<div class="sql">
<ol>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color: #993333; font-weight: bold;">CREATE</span> <span style="color: #993333; font-weight: bold;">TABLE</span> ads_by_day_by_blueness <span style="color:#006600; font-weight:bold;">&#40;</span></div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp; day date <span style="color: #993333; font-weight: bold;">NOT</span> <span style="color: #993333; font-weight: bold;">NULL</span>,</div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp; ad int <span style="color: #993333; font-weight: bold;">UNSIGNED</span> <span style="color: #993333; font-weight: bold;">NOT</span> <span style="color: #993333; font-weight: bold;">NULL</span>,</div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp; clicks int <span style="color: #993333; font-weight: bold;">UNSIGNED</span> <span style="color: #993333; font-weight: bold;">NOT</span> <span style="color: #993333; font-weight: bold;">NULL</span>,</div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp; impressions int <span style="color: #993333; font-weight: bold;">UNSIGNED</span> <span style="color: #993333; font-weight: bold;">NOT</span> <span style="color: #993333; font-weight: bold;">NULL</span>,</div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp; blue_clicks int <span style="color: #993333; font-weight: bold;">UNSIGNED</span> <span style="color: #993333; font-weight: bold;">NOT</span> <span style="color: #993333; font-weight: bold;">NULL</span>,</div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp; blue_impressions int <span style="color: #993333; font-weight: bold;">UNSIGNED</span> <span style="color: #993333; font-weight: bold;">NOT</span> <span style="color: #993333; font-weight: bold;">NULL</span>,</div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">....</div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp; <span style="color: #993333; font-weight: bold;">PRIMARY</span> <span style="color: #993333; font-weight: bold;">KEY</span><span style="color:#006600; font-weight:bold;">&#40;</span>day, ad<span style="color:#006600; font-weight:bold;">&#41;</span></div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color:#006600; font-weight:bold;">&#41;</span>; </div>
</li>
</ol>
</div>
</div>
</div>
<p></p>
<p>There are a couple of ways to handle this now.  You can have the clicks column record the total, and the blue_clicks column record only blue clicks; to find out non-blue clicks you subtract one from the other.  Or you can have the blue clicks and non-blue clicks stored, and to get the totals you add them.</p>
<p>Did this gain us anything?  We dropped one column, and we just moved those other values around to store them "next to, in the same row" instead of "below, in the next row."  So we're storing all the same data, right?</p>
<p>Logically, yes; physically, no.  Those values that we pivoted up beside their neighbors will share a set of primary key columns.  And not only will every index be a little narrower, the table will now contain only half as many rows.  That will make the indexes less than half the size.  In real life this technique often makes the table+index much less than half the size.  You have to write slightly more complex queries, but that's often justified by a large reduction in table size.</p>
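<p>To make the pivot concrete, here is a small Python sketch that folds rows from the first table layout into the second.  It uses the "clicks holds the total" variant, and the function name and tuple layout are assumptions for the example.</p>

```python
from collections import defaultdict


def pivot_blueness(rows):
    """Fold (day, ad, is_blue, clicks, impressions) rows into one row per (day, ad).

    Returns {(day, ad): [clicks, impressions, blue_clicks, blue_impressions]},
    where clicks and impressions are totals across blue and non-blue.
    """
    wide = defaultdict(lambda: [0, 0, 0, 0])
    for day, ad, is_blue, clicks, impressions in rows:
        totals = wide[(day, ad)]
        totals[0] += clicks
        totals[1] += impressions
        if is_blue:
            totals[2] += clicks
            totals[3] += impressions
    return dict(wide)
```

<p>Two input rows for the same day and ad, one blue and one not, come out as a single row.  Non-blue clicks are recovered by subtracting blue_clicks from clicks.</p>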
<p>I sort of stumbled upon this idea one day. I have no idea what this technique might be called, so I call it dog-earing the table (somehow the image of putting columns next to each other makes me think of putting cards next to each other and shoving).</p>
<p><strong>Archive</strong></p>
<p>If you don't need data anymore, move it away or get rid of it.  I wrote a <a href="http://www.xaprb.com/blog/2007/06/13/archive-strategies-for-oltp-servers-part-1/">three-part article on data archiving</a> on my own blog a while back.  The benefits of purging and archiving data can be dramatic.</p>
<p><strong>Take It Easy On Replication</strong></p>
<p>Building aggregated tables is hard work for the database server.  If you do it on the master with INSERT..SELECT queries, it will propagate to the slaves and it'll be hard work there too, assuming you use statement-based replication.</p>
<p>You can save that work by either using MySQL 5.1's row-based replication, or in MySQL 5.0 and earlier, doing the work on a slave, then piping the results back up to the master with LOAD DATA INFILE, which kind of emulates row-based replication in a way.</p>
<p>When you're updating big aggregate tables, don't work with giant chunks of them at once.  If there's any possible way, do it in manageable bits.  A day at a time, for example.</p>
<p>There are a lot of other ways you can make replication faster.  I wrote a lot about this in our book, which is linked from the sidebar above.</p>
<p><strong>Don't Assume Traditional Methods Will Save You</strong></p>
<p>What you're really doing here is building a data warehouse.  So you may think you should use traditional DW methods, like star schemas.  The problem is that MySQL doesn't tend to perform well on a data warehousing workload.  The nested-loop joins are not all that fast on big joins; the query optimizer can sometimes pick bad plans when you have a lot of joins between fact and dimension tables, and so on.  With careful tweaking, many of these things can be overcome, but how much time do you have?  And the gains are simply limited by some of MySQL's weaknesses in some cases.</p>
<p>Not only that, but star schemas are not intended to be fast.  The star schema is essentially "I admit defeat and accept table scans as a fact of life."  <a href="http://www.mysqlperformanceblog.com/2008/04/28/the-mysql-optimizer-the-os-cache-and-sequential-versus-random-io/">Table scans can be better than the alternative</a>, if the alternatives are limited, but they're still not what you need unless you're okay with long queries that read a lot of rows -- MySQL can't handle too many of those at once.</p>
<p>Aside from star schemas, another tactic I see people try a lot is to build "flexible schemas" with tables that contain name-value pairs or something similar.  The thought is that you can make the application believe it has a custom table, which is really constructed behind the scenes from the name-value tables in a complex query with many joins.  I have never seen this approach scale well.</p>
<p><strong>Use The Best Technologies You Can</strong></p>
<p>MySQL is not the end-all and be-all.  If you're familiar with it and it can serve you reasonably well, it's fine to use it for things that it's not 100% optimal for.  But if the costs of doing that are going to outweigh the costs of using another solution, then look at other solutions.</p>
<p>One that holds promise is <a href="http://www.infobright.org/">Infobright</a>.  While I have not evaluated their technology in depth, I think it merits a good look.  I had the chance at <a href="http://www.opensqlcamp.org/">OpenSQL Camp</a> to talk to Alex Esterkin and see him present on it, and based on that exposure, I think they are doing a lot of things right.  When I know enough to have a real opinion (or when other Percona people get to it before I do!) you'll see results on this blog.</p>
<p>Another is <a href="http://www.kickfire.com/">Kickfire</a> -- also something I have not had a chance to properly evaluate.  And there are others, and there will continue to be more.  Finally, <a href="http://www.postgresql.org/">PostgreSQL</a> is clearly better for some workloads out-of-the-box than MySQL is, especially for more complex queries.  Percona is not tied to MySQL, although we're most famous for our knowledge about it.  When another tool is the right one, we use it.</p>
<p>Have you thought about using something besides a database?  You have your choice of buzzwords these days.  Hadoop is a big one.  But beware of falling into the trap of brute-forcing a solution that really needs to be solved with intelligent engineering, instead of massive resources.</p>
<p><strong>Conclusion</strong></p>
<p>This article has been an overview of some of the tactics I've used to successfully scale large click-processing and other types of event-analysis databases.  In some cases I've been able to avoid sharding for a long time and run on many fewer disk drives with much less memory, or even with 10-15x fewer servers.  Clever application design, and a holistic approach, are absolutely necessary.  You can't look to the database to solve everything -- you have to give it all the help you can.  Hopefully it's useful to you, too!</p>
    <hr noshade style="margin:0;height:1px" />
    <p>Entry posted by Baron Schwartz |
      <a href="http://www.mysqlperformanceblog.com/2008/12/22/high-performance-click-analysis-with-mysql/#comments">9 comments</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.mysqlperformanceblog.com/2008/12/22/high-performance-click-analysis-with-mysql/feed/</wfw:commentRss>
		<slash:comments>9</slash:comments>
		</item>
		<item>
		<title>Thanks Giving Challenge:  How to detect replication context</title>
		<link>http://www.mysqlperformanceblog.com/2008/11/26/thanks-giving-challenge-how-to-detect-replication-context/</link>
		<comments>http://www.mysqlperformanceblog.com/2008/11/26/thanks-giving-challenge-how-to-detect-replication-context/#comments</comments>
		<pubDate>Thu, 27 Nov 2008 01:31:40 +0000</pubDate>
		<dc:creator>peter</dc:creator>
				<category><![CDATA[replication]]></category>

		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/?p=546</guid>
		<description><![CDATA[Happy Thanksgiving, and a little holiday challenge for you.
Say you have a trigger on the slave which you would like to work differently, depending on whether the update is executed via the replication thread or by updating the table locally.   This can be helpful for example for auditing updates which were done directly instead of coming from [...]]]></description>
			<content:encoded><![CDATA[<p>Happy Thanksgiving, and a little holiday challenge for you.<br />
Say you have a trigger on the slave which you would like to work differently, depending on whether the update is executed via the replication thread or by updating the table locally.   This can be helpful for example for auditing updates which were done directly instead of coming from the master, and in some other cases.<br />
Suggest how you would do it by commenting <img src='http://www.mysqlperformanceblog.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
    <hr noshade style="margin:0;height:1px" />
    <p>Entry posted by peter |
      <a href="http://www.mysqlperformanceblog.com/2008/11/26/thanks-giving-challenge-how-to-detect-replication-context/#comments">12 comments</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.mysqlperformanceblog.com/2008/11/26/thanks-giving-challenge-how-to-detect-replication-context/feed/</wfw:commentRss>
		<slash:comments>12</slash:comments>
		</item>
		<item>
		<title>Three ways to know when a MySQL slave is about to start lagging</title>
		<link>http://www.mysqlperformanceblog.com/2008/10/08/three-ways-to-know-when-a-mysql-slave-is-about-to-start-lagging/</link>
		<comments>http://www.mysqlperformanceblog.com/2008/10/08/three-ways-to-know-when-a-mysql-slave-is-about-to-start-lagging/#comments</comments>
		<pubDate>Thu, 09 Oct 2008 00:56:06 +0000</pubDate>
		<dc:creator>Baron Schwartz</dc:creator>
				<category><![CDATA[replication]]></category>
		<category><![CDATA[tips]]></category>

		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/?p=511</guid>
		<description><![CDATA[The trouble with slave lag is that you often can't see it coming.  Especially if the slave's load is pretty uniform, a slave that's at 90% of its capacity to keep up with the master can be indistinguishable from one that's at 5% of its capacity.
So how can you tell when your slave is [...]]]></description>
			<content:encoded><![CDATA[<p>The trouble with slave lag is that you often can't see it coming.  Especially if the slave's load is pretty uniform, a slave that's at 90% of its capacity to keep up with the master can be indistinguishable from one that's at 5% of its capacity.</p>
<p>So how can you tell when your slave is nearing its capacity to keep up with the master?  Here are three ways:</p>
<p>One: watch for spikes of lag.  If you have Cacti (and these <a href="http://code.google.com/p/mysql-cacti-templates/">Cacti templates for MySQL</a>) you can see this in the graphs.  If the graphs start to get a little bumpy, you can assume that the iceberg is floating higher and higher in the water, so to speak.  (Hopefully that's not too strange a metaphor.)  As the slave's routine work gets closer and closer to its capacity, you'll see these spikes get bigger and "wider".  The front-side of the spike will always be less than a 45-degree angle in ordinary operation[1] but the back-side, when the slave is catching up after lagging behind, will become a gentler and gentler slope.</p>
<p>Two: deliberately make a slave fall behind, then see how fast it can catch up.  This is sort of related to Method One.  The goal here is to explicitly see how steep the backside of that slope is.  If you stop a slave for an hour, then start it again and it catches up in one hour, it is running at 1/2 of its capacity.  (In case that's confusing: if you stop it at noon and restart it at 1:00, and it's caught up again at 2:00, it played all statements from 12:00 to 2:00 in 1 hour, so it went at 2x speed.)</p>
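<p>The arithmetic in Method Two can be written down as a small calculation.  This is only a sketch: the function name is made up for illustration, and it assumes the write load on the master stays roughly steady during the test.</p>

```python
def slave_utilization(stopped_seconds, catchup_seconds):
    """Fraction of the slave's capacity used by the normal replication load.

    While catching up, the slave replays (stopped + catchup) seconds of
    master activity in only catchup seconds of wall-clock time.
    """
    if not catchup_seconds > 0:
        raise ValueError("catch-up time must be positive")
    replayed_seconds = stopped_seconds + catchup_seconds
    return catchup_seconds / replayed_seconds


# Stopped at noon, restarted at 1:00, caught up again at 2:00.
print(slave_utilization(3600, 3600))  # 0.5
```

<p>The closer the result gets to 1.0, the less headroom the slave has left.</p>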
<p>Three: measure it more scientifically.  Use <a href="http://www.percona.com/percona-lab.html">our patched server</a>, which gives you a USER_STATISTICS table.</p>
<div class="syntax_hilite"><span class="langName">SQL:</span>
<div id="sql-12">
<div class="sql">
<ol>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">mysql&gt; <span style="color: #993333; font-weight: bold;">SELECT</span> * <span style="color: #993333; font-weight: bold;">FROM</span> INFORMATION_SCHEMA.USER_STATISTICS <span style="color: #993333; font-weight: bold;">WHERE</span> USER=<span style="color: #ff0000;">'#mysql_system#'</span>\G</div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">*************************** <span style="color: #cc66cc;color:#800000;">1</span>. row ***************************</div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; USER: <span style="color: #808080; font-style: italic;">#mysql_system#</span></div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp; &nbsp; &nbsp;TOTAL_CONNECTIONS: <span style="color: #cc66cc;color:#800000;">1</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">CONCURRENT_CONNECTIONS: <span style="color: #cc66cc;color:#800000;">2</span></div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp; &nbsp; &nbsp; &nbsp; CONNECTED_TIME: <span style="color: #cc66cc;color:#800000;">46188</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;BUSY_TIME: <span style="color: #cc66cc;color:#800000;">719</span></div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; ROWS_FETCHED: <span style="color: #cc66cc;color:#800000;">0</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; ROWS_UPDATED: <span style="color: #cc66cc;color:#800000;">1882292</span></div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp; &nbsp; &nbsp; &nbsp;SELECT_COMMANDS: <span style="color: #cc66cc;color:#800000;">0</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp; &nbsp; &nbsp; &nbsp;UPDATE_COMMANDS: <span style="color: #cc66cc;color:#800000;">580431</span></div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp; &nbsp; &nbsp; &nbsp; OTHER_COMMANDS: <span style="color: #cc66cc;color:#800000;">338857</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp; &nbsp;COMMIT_TRANSACTIONS: <span style="color: #cc66cc;color:#800000;">1016571</span></div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp;ROLLBACK_TRANSACTIONS: <span style="color: #cc66cc;color:#800000;">0</span> </div>
</li>
</ol>
</div>
</div>
</div>
<p></p>
<p>You can compare the BUSY_TIME to one-half the CONNECTED_TIME (because there are two replication threads on the slave) to see how much of the time the slave thread was actively processing statements.  If the slave threads are always running, you can just use the server's uptime instead.</p>
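<p>As a rough illustration, here is that comparison applied to the sample output above.  The helper name is hypothetical; the only inputs are the BUSY_TIME and CONNECTED_TIME values shown in the listing.</p>

```python
def slave_busy_fraction(busy_time, connected_time, threads=2):
    """Share of time the replication threads spent actively working.

    CONNECTED_TIME is summed over both replication threads, so it is
    divided by the thread count before comparing it with BUSY_TIME.
    """
    per_thread_connected = connected_time / threads
    return busy_time / per_thread_connected


fraction = slave_busy_fraction(busy_time=719, connected_time=46188)
print(f"{fraction:.1%}")  # 3.1%
```

<p>By this measure the slave in the example is busy about 3% of the time, so it is nowhere near its capacity.</p>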
<p>[1] There are cases where this isn't true, especially if you're monitoring Seconds_behind_master instead of using <a href="http://www.maatkit.org/">mk-heartbeat</a>, which is immune to this anomaly.</p>
    <hr noshade style="margin:0;height:1px" />
    <p>Entry posted by Baron Schwartz |
      <a href="http://www.mysqlperformanceblog.com/2008/10/08/three-ways-to-know-when-a-mysql-slave-is-about-to-start-lagging/#comments">8 comments</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.mysqlperformanceblog.com/2008/10/08/three-ways-to-know-when-a-mysql-slave-is-about-to-start-lagging/feed/</wfw:commentRss>
		<slash:comments>8</slash:comments>
		</item>
		<item>
		<title>Fighting MySQL Replication Lag</title>
		<link>http://www.mysqlperformanceblog.com/2008/09/22/fighting-mysql-replication-lag/</link>
		<comments>http://www.mysqlperformanceblog.com/2008/09/22/fighting-mysql-replication-lag/#comments</comments>
		<pubDate>Tue, 23 Sep 2008 04:16:46 +0000</pubDate>
		<dc:creator>peter</dc:creator>
				<category><![CDATA[production]]></category>
		<category><![CDATA[replication]]></category>

		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/?p=507</guid>
		<description><![CDATA[The problem of MySQL Replication being unable to catch up is quite common in the MySQL world and in fact I already wrote about it.   There are many aspects of managing MySQL replication lag such as using proper hardware and configuring it properly.  In this post I will just look at a couple of query [...]]]></description>
			<content:encoded><![CDATA[<p>The problem of MySQL Replication being unable to catch up is quite common in the MySQL world and in fact I <a href="http://www.mysqlperformanceblog.com/2007/10/12/managing-slave-lag-with-mysql-replication/">already wrote</a> about it.   There are many aspects of managing MySQL replication lag such as using proper hardware and configuring it properly.  In this post I will just look at a couple of query design mistakes which are low hanging fruit when troubleshooting MySQL Replication Lag.</p>
<p>The first fact you absolutely need to remember is that MySQL Replication is single threaded, which means that any long running write query clogs the replication stream, and the small and fast updates which follow it in the MySQL binary log can't proceed.    It is also about more than just queries - if you're using explicit transactions, all updates from the transaction are buffered together and then dumped to the binary log as one big chunk, which can't be interleaved with any other query execution. So if you have a transaction containing millions of simple updates instead of one large update in order to help MySQL replication lag, it is not going to work.</p>
<p>This brings us to <strong>rule number one</strong>  - if you care about replication latency you must not have any long running updates, whether single queries or transactions containing multiple update queries which add up to a long time.     I would keep the maximum query length at about 1/5th of the maximum replication lag you're ready to tolerate.  So  if you want your replica to be no more than 1 minute behind, keep the longest update query to 10 sec or so.   This is of course a rule of thumb: depending on differences in master/slave configuration, their load and concurrency, you  may need to keep the ratio higher or allow a bit longer queries. </p>
<p>What should you do if you need to update a lot of rows ? Use Query Chopping - this can mean running update/delete with LIMIT in a loop, controlling the maximum number of values per batch in a multiple row insert statement, or fetching the data you're planning to update/delete and running multiple queries to modify it (see example below)</p>
<p>This brings us to yet another rule for smart replication - do not make the slave do more work than it needs to do. It is already crippled by having to do all of this in a single thread - do not make it even harder.  If considerable effort is needed to select rows for modification - split it up and have separate select and update queries.  In such a case the slave will only need to run the UPDATE.<br />
Example:</p>
<div class="syntax_hilite"><span class="langName">SQL:</span>
<div id="sql-15">
<div class="sql">
<ol>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color: #993333; font-weight: bold;">UPDATE</span> posts <span style="color: #993333; font-weight: bold;">SET</span>&nbsp; spam=<span style="color: #cc66cc;color:#800000;">1</span> <span style="color: #993333; font-weight: bold;">WHERE</span> body <span style="color: #993333; font-weight: bold;">LIKE</span> <span style="color: #ff0000;">"%cheap rolex%"</span>; </div>
</li>
</ol>
</div>
</div>
</div>
<p></p>
<p>This query will perform a full table scan in MySQL 5.0 (even if there are no spam posts), which will load the slave significantly.   You can replace it with:</p>
<div class="syntax_hilite"><span class="langName">SQL:</span>
<div id="sql-16">
<div class="sql">
<ol>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color: #993333; font-weight: bold;">SELECT</span> id <span style="color: #993333; font-weight: bold;">FROM</span> posts <span style="color: #993333; font-weight: bold;">WHERE</span>&nbsp; body <span style="color: #993333; font-weight: bold;">LIKE</span> <span style="color: #ff0000;">"%cheap rolex%"</span>;</div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp;</div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color: #993333; font-weight: bold;">UPDATE</span> posts <span style="color: #993333; font-weight: bold;">SET</span> spam=<span style="color: #cc66cc;color:#800000;">1</span> <span style="color: #993333; font-weight: bold;">WHERE</span> id <span style="color: #993333; font-weight: bold;">IN</span> <span style="color:#006600; font-weight:bold;">&#40;</span>list of ids<span style="color:#006600; font-weight:bold;">&#41;</span> </div>
</li>
</ol>
</div>
</div>
</div>
<p></p>
<p>If many ids could be matched in the first place, you should also use query chopping and run the update in chunks if the application allows it. </p>
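<p>A minimal sketch of the select-then-update-in-chunks pattern follows.  It uses Python's built-in sqlite3 module purely as a stand-in so that the example runs on its own; with MySQL you would use your own connector, and the placeholder style may differ.  The function names and the chunk size are assumptions for illustration.</p>

```python
import sqlite3


def chunks(items, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def mark_spam(conn, pattern, chunk_size=500):
    """Find matching ids with a SELECT, then UPDATE by primary key in chunks."""
    # The expensive scan is a plain SELECT, so it is not replicated.
    ids = [row[0] for row in conn.execute(
        "SELECT id FROM posts WHERE body LIKE ?", (pattern,))]
    for batch in chunks(ids, chunk_size):
        placeholders = ",".join("?" * len(batch))
        conn.execute(
            f"UPDATE posts SET spam=1 WHERE id IN ({placeholders})", batch)
        conn.commit()  # one short transaction per chunk
    return len(ids)


# Tiny in-memory table, for demonstration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE posts (id INTEGER PRIMARY KEY, body TEXT, spam INTEGER DEFAULT 0)")
conn.executemany(
    "INSERT INTO posts (body) VALUES (?)",
    [("buy cheap rolex now",), ("hello world",), ("cheap rolex!!",)])
print(mark_spam(conn, "%cheap rolex%", chunk_size=1))  # 2
```

<p>Each chunk becomes its own short UPDATE by primary key, which is the only part the slave has to execute.</p>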
<p>In MySQL 5.1 with row-based replication the selection process will not run on the slave, but it will not do the chopping for you.</p>
<p>This trick works well not only for full table scan updates, but in general for cases where many more rows are examined than modified.</p>
<p>The <strong>next common mistake</strong> is using INSERT ... SELECT - which is similar to what I just described but can be much worse, as the SELECT may end up being an extremely complicated query.  It is best to avoid INSERT ... SELECT going through replication in 5.0 for many reasons (locking, long query time, waste of execution on slave).    Piping data through the application is the best solution in many cases and is quite easy - it is trivial to write a function which takes a SELECT query and the table in which to store its result set, and to use it in your application whenever you need this functionality.</p>
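<p>Such a helper could look roughly like the sketch below.  Again sqlite3 is only a stand-in to keep the example self-contained, and the function name, table names and batch size are made up for illustration.  For simplicity it fetches the whole result set into memory first; a very large result would need a streaming read instead.</p>

```python
import sqlite3


def insert_from_select(conn, select_sql, target_table, batch_size=500):
    """Run a SELECT, then insert its rows into target_table in small batches."""
    cursor = conn.execute(select_sql)
    columns = [col[0] for col in cursor.description]
    rows = cursor.fetchall()
    insert_sql = "INSERT INTO {} ({}) VALUES ({})".format(
        target_table, ", ".join(columns), ", ".join("?" * len(columns)))
    for start in range(0, len(rows), batch_size):
        # Only these plain INSERTs would travel through replication.
        conn.executemany(insert_sql, rows[start:start + batch_size])
        conn.commit()
    return len(rows)


# Tiny in-memory tables, for demonstration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE clicks (day TEXT, ad INTEGER)")
conn.execute("CREATE TABLE daily_clicks (day TEXT, ad INTEGER, clicks INTEGER)")
conn.executemany(
    "INSERT INTO clicks VALUES (?, ?)",
    [("2008-09-22", 1), ("2008-09-22", 1), ("2008-09-23", 2)])
copied = insert_from_select(
    conn,
    "SELECT day, ad, COUNT(*) AS clicks FROM clicks GROUP BY day, ad",
    "daily_clicks")
print(copied)  # 2
```

<p>The column names of the SELECT must match the columns of the target table, which is why the aggregate is aliased in the example.</p>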
<p>Finally <strong>you should not overload your replication</strong> - quite typically I see replication lagging when batch jobs are running.  These can load the master significantly during their run time and make it impossible for the slave to run the same load through a single thread.   The solution in many cases is to simply space it out and slow down your batch job (such as by adding sleep calls) to ensure there is enough breathing room for the replication thread.</p>
<p>You can also have controlled execution of batch jobs - this is when they check slave lag every so often and pause if it becomes too large. This is a bit more complicated approach, but it saves you from running around and adjusting your sleep behavior to keep the progress fast enough and at the same time keep replication from lagging.</p>
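<p>A controlled batch job of this kind might be sketched as follows.  Everything here is an assumption for illustration: how you measure lag (for example a heartbeat table or Seconds_Behind_Master) is hidden behind the get_lag_seconds callback, and the thresholds are arbitrary.</p>

```python
import time


def run_throttled(batches, apply_batch, get_lag_seconds,
                  max_lag=30, pause=5, sleep=time.sleep):
    """Apply batches one by one, pausing while the slave lags too far behind."""
    for batch in batches:
        # Wait until the slave has caught up enough before adding more work.
        while get_lag_seconds() > max_lag:
            sleep(pause)
        apply_batch(batch)


# Simulated run: the lag readings below are made up, and sleeping is
# replaced by recording the pause so the example finishes instantly.
readings = iter([0, 45, 40, 10, 0])
applied, pauses = [], []
run_throttled(["a", "b", "c"], applied.append,
              get_lag_seconds=lambda: next(readings),
              sleep=pauses.append)
print(applied, pauses)  # ['a', 'b', 'c'] [5, 5]
```

<p>The job slows itself down only when the slave actually falls behind, so there is no fixed sleep value to keep re-tuning.</p>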
<p>In many bad cases of replication lag I've seen, simply following these simple rules would have avoided a lot of problems, and often saved massive hardware purchases or development efforts based on the assumption that MySQL replication can't possibly keep up any more.</p>
    <hr noshade style="margin:0;height:1px" />
    <p>Entry posted by peter |
      <a href="http://www.mysqlperformanceblog.com/2008/09/22/fighting-mysql-replication-lag/#comments">4 comments</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.mysqlperformanceblog.com/2008/09/22/fighting-mysql-replication-lag/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Recovery beyond data restore</title>
		<link>http://www.mysqlperformanceblog.com/2008/08/02/recovery-beyond-data-restore/</link>
		<comments>http://www.mysqlperformanceblog.com/2008/08/02/recovery-beyond-data-restore/#comments</comments>
		<pubDate>Sun, 03 Aug 2008 06:34:07 +0000</pubDate>
		<dc:creator>peter</dc:creator>
				<category><![CDATA[production]]></category>
		<category><![CDATA[replication]]></category>

		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/?p=456</guid>
		<description><![CDATA[Quite frequently I see customers looking at recovery as on ability to restore data from backup which can be far from being enough to restore the whole system to operating state, especially for complex systems.
Instead of looking just at data restore process you better look at the whole process which is required to bring system [...]]]></description>
			<content:encoded><![CDATA[<p>Quite frequently I see customers looking at recovery as on ability to restore data from backup which can be far from being enough to restore the whole system to operating state, especially for complex systems.</p>
<p>Instead of looking just at the data restore process, you had better look at the whole process required to bring the system back to a working state, including data consistency requirements and timing.  This has to be considered for each of the different data loss scenarios which may happen. </p>
<p>Let us look at a simple example - a master with a 1TB database replicating to 50 servers in 5 different data centers via a single replication relay server in each.   Forget the single point of failure for a second and just think about what problems we may have to deal with.  </p>
<p>First let's look at the master.  What may happen to it?   The master can have a <strong>soft crash</strong>, in which case it will be unavailable for some time but we can get all the data back... or sort of.  In practice you have to be very careful, such as using <strong>sync_binlog=1</strong> and <strong>innodb_flush_log_at_trx_commit=1</strong> and using only Innodb tables, to be OK in most cases. There are still some edge cases, such as modifying metadata stored in MyISAM tables, which can get the master out of sync with the slaves in case of a soft crash.   Unless you hit one of these rare cases, the slaves should be able to continue after the master is back online.</p>
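<p>As a rough illustration of the settings check above, here is a minimal Python sketch. The helper name and the idea of feeding it a dict built from SHOW VARIABLES output are assumptions made for this example, not part of any MySQL tool.</p>

```python
# Hypothetical helper: report settings that weaken crash safety on a master.
# `variables` is assumed to be a dict built from SHOW VARIABLES output.
REQUIRED = {
    "sync_binlog": "1",
    "innodb_flush_log_at_trx_commit": "1",
}


def crash_safety_warnings(variables):
    warnings = []
    for name, expected in REQUIRED.items():
        actual = variables.get(name)
        # A missing variable is reported too, since we cannot confirm it is safe.
        if str(actual) != expected:
            warnings.append("%s is %r, expected %s" % (name, actual, expected))
    return warnings


print(crash_safety_warnings({"sync_binlog": "0",
                             "innodb_flush_log_at_trx_commit": "1"}))
```

<p>This only covers the two variables named above; it says nothing about which storage engine each table uses, which you would have to check separately.</p>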
<p>Do you have to wait for the master to recover? This is where your <strong>data consistency requirements</strong> come into play.  Remember that replication is asynchronous, so whenever you switch to a slave after a master failure you may lose transactions. Google's semi-synchronous replication patches can help with this... but they are not yet in stock MySQL. Yet another way is using DRBD to get a standby MySQL server, or at least synchronously replicated master binary logs.    If you can't lose a single transaction, you can't simply switch to the slave.  </p>
<p>What if you can?  The switch to a slave is not very easy in this case either - the slaves can all be at different positions on the master, and you need to pick the most up-to-date one to promote. Plus you need to recompute the positions as they should be on the promoted slave, and that slave should have <strong>--log-slave-updates</strong> enabled so it somewhat has a copy of the master's logs. In many cases I've seen people skip that and simply point the slaves to the starting position of the promoted master - this is dangerous because you risk all the slaves being inconsistent with each other, plus if a server was seriously behind you risk major inconsistency because its relay logs will be lost if you just re-point the slave.  So at the very least you should wait for a slave to process all its relay logs before re-pointing it. </p>
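<p>Picking the most up-to-date slave can be sketched as a comparison of executed master coordinates. This is a minimal Python sketch under stated assumptions: every slave replicates directly from the same master, each has already applied all of its relay logs, and the host names and status values are made up for illustration. The field names are the ones SHOW SLAVE STATUS reports.</p>

```python
import re


def binlog_coordinate(status):
    """Return (log sequence number, position) the slave has executed up to."""
    log_file = status["Relay_Master_Log_File"]
    # Binary log names end in a numeric extension, e.g. mysql-bin.000042.
    sequence = int(re.search(r"\.(\d+)$", log_file).group(1))
    return (sequence, int(status["Exec_Master_Log_Pos"]))


def most_up_to_date(slaves):
    """`slaves` maps a host name to its SHOW SLAVE STATUS row (as a dict)."""
    return max(slaves, key=lambda host: binlog_coordinate(slaves[host]))


# Hypothetical status rows for three slaves of the same master.
slaves = {
    "slave-a": {"Relay_Master_Log_File": "mysql-bin.000041", "Exec_Master_Log_Pos": 98231},
    "slave-b": {"Relay_Master_Log_File": "mysql-bin.000042", "Exec_Master_Log_Pos": 1204},
    "slave-c": {"Relay_Master_Log_File": "mysql-bin.000042", "Exec_Master_Log_Pos": 877},
}
print(most_up_to_date(slaves))  # slave-b
```

<p>With relay servers in between, as in the example topology, the coordinates refer to different masters and cannot be compared this way without translating them first.</p>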
<p>Interestingly enough, Google has a solution for us again, which comes as "log mirroring" patches that make sure the slaves have a copy of the logs as they are on the master. </p>
<p>Now what do you do in case of a <strong>hard crash</strong>?   This is when the data on the master is lost, for example through a RAID or disk failure.  It can also be things like Innodb corruption, or a soft crash which you can't recover from promptly enough.  </p>
<p>In this case you would most typically plan recovery by switching to a slave (as described) or to a standby server via DRBD or SAN.   </p>
<p>As you can see, we have not mentioned recovering from backup so far. It will be needed in the worst case of data loss, which is <strong>trashing the data, which then reaches all slaves via replication</strong>.   This can be caused by a user or application error or a security breach.</p>
<p>What choices do you have in this case?  Your main options are using a backup or a slave with delayed replication (which you could already have set up with <strong>mk-slave-delay</strong> from <a href="http://www.maatkit.org">Maatkit</a>).  </p>
<p>A delayed slave is especially helpful if the application can operate with just a master, as in this case you can switch very quickly (just skipping the bad statements and catching up). </p>
<p>The main challenge in such a failure is that you have many trashed copies to deal with.  If you have just one or several small tables corrupted you can reload them.  One option is to reload them on the master (and they will be replicated down to all slaves); the faster option, however (especially if you have many tiers of replication), is to bring all slaves to the same point in time and load the data locally with <strong>SQL_LOG_BIN=0</strong> set for the session. </p>
<p>If a large portion of the data is trashed you may need to recover the full database on all slaves, which for large data sets is best done in binary mode.   Such a global recovery can also put very high stress on your network and backup storage and take a lot of time.  It may also be extremely difficult to get a large backup over a long-distance network in a timely fashion, meaning it is best to have a local backup (and a delayed slave, if you use one) in each data center you have. </p>
<p>The complexity of recovery is another "liability" of a complex replication tree setup.  By contrast, sharded master-master pairs (or a master with a few slaves) are much easier to deal with.</p>
<p>When recovering data with replication you always have to keep replication positions in mind. For example, if you recover the master you need to recover the slaves to a matching snapshot - either it has to be the same state (which is hard to manage) or you need to make sure you know the position on the master to which the backup corresponds.  This becomes more complex if you have a complex replication hierarchy, as a slave only knows its position on its own master, not on the "root" master. </p>
<p><strong>Note </strong>there are also some solutions based on the "Continuous Data Protection" class of backup which can be very helpful for going back in time with your data. One of the vendors offering a solution for MySQL is <a href="http://www.r1soft.com/">R1Soft</a>, though I have not had a chance to look at it in detail.</p>
<p><strong>What about slave loss? </strong>  Slave loss is normally less of a problem. You can re-clone the slave from the master or another slave, or restore it from backup. So this is just a question of having decent capacity planning (such as being able to shut off 2 slaves and still operate normally), having LVM set up if you want to avoid shutting off a slave or master to clone the data, and making sure the logs on the master go far enough back that you can restore from several backup generations and do point-in-time recovery. </p>
<p>Timing the recovery is also important. Especially in write-intensive environments it may take many days to catch up from weekly backups by replaying binary logs, so make sure to time it properly.</p>
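<p>To get a feel for the numbers, here is a back-of-the-envelope Python sketch. It assumes a simplified model that is not from any MySQL tool: the restored server applies binary logs at a constant multiple of the rate at which the master produces them, and the master keeps writing while the slave catches up. In T days the slave applies replay_speed * T days of log and has to cover the backlog plus the T days written meanwhile, so T = backlog / (replay_speed - 1).</p>

```python
def catch_up_days(backlog_days, replay_speed):
    """Estimate wall-clock days needed to catch up with the master.

    backlog_days: age of the backup, in days of binary log to replay.
    replay_speed: days of binary log applied per wall-clock day (assumed constant).
    """
    if not replay_speed > 1:
        raise ValueError("replay must outpace the master's writes to ever catch up")
    return backlog_days / (replay_speed - 1)


# A week-old backup replayed at 3x the master's write rate.
print(catch_up_days(7, 3))  # 3.5
```

<p>In a write-intensive setup where replay is barely faster than the master's write rate, the denominator approaches zero and the estimate grows quickly, which is why measuring your real replay speed matters.</p>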
<p>Real-life environments can be even more complicated - one may use partial replication, replication to a different storage engine, or add some tables beyond the tables which are being replicated, all of which has to be accounted for in the process of replication.</p>
<p>It is also worth noting that beyond these 3 main recovery scenarios there are a number of other cases which you have to deal with (which can often be resolved by doing recovery by one of these 3 protocols, but where you can also take a shortcut) - for example, you may have master or relay binary log corruption, the master or a slave running out of space, a slave crashing (and losing its position on the master), or replication breakage (or running out of sync) due to MySQL bugs or wrong use. </p>
<p>Interestingly enough, very few people have their data recovery practices ironed out well enough to answer how they would handle at least these 3 data loss cases for <strong>each</strong> of the servers they have deployed.  Even fewer have gone beyond theory and have tested the processes or have regular testing in place. </p>
    <hr noshade style="margin:0;height:1px" />
    <p>Entry posted by peter |
      <a href="http://www.mysqlperformanceblog.com/2008/08/02/recovery-beyond-data-restore/#comments">11 comments</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.mysqlperformanceblog.com/2008/08/02/recovery-beyond-data-restore/feed/</wfw:commentRss>
		<slash:comments>11</slash:comments>
		</item>
		<item>
		<title>Troubleshooting Relay Log Corruption in MySQL</title>
		<link>http://www.mysqlperformanceblog.com/2008/08/02/troubleshooting-relay-log-corruption-in-mysql/</link>
		<comments>http://www.mysqlperformanceblog.com/2008/08/02/troubleshooting-relay-log-corruption-in-mysql/#comments</comments>
		<pubDate>Sun, 03 Aug 2008 03:56:09 +0000</pubDate>
		<dc:creator>peter</dc:creator>
				<category><![CDATA[replication]]></category>

		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/?p=455</guid>
		<description><![CDATA[Have you ever seen replication stop with a message like this:

 Last_Error: Could not parse relay log event entry. The possible reasons are: the master's binary log is corrupted (you can check this by running 'mysqlbinlog' on the binary log), the slave's relay log is corrupted (you can check this by running 'mysqlbinlog' on the [...]]]></description>
			<content:encoded><![CDATA[<p>Have you ever seen replication stop with a message like this:</p>
<blockquote><p>
 Last_Error: Could not parse relay log event entry. The possible reasons are: the master's binary log is corrupted (you can check this by running 'mysqlbinlog' on the binary log), the slave's relay log is corrupted (you can check this by running 'mysqlbinlog' on the relay log), a network problem, or a bug in the master's or slave's MySQL code. If you want to check the master's binary log or slave's relay log, you will be able to know their names by issuing 'SHOW SLAVE STATUS' on this slave.
</p></blockquote>
<p>This is relay log corruption, and you can check the details in the MySQL error log file.  The error message lists a few possible reasons, and indeed, because there is little validation (i.e. no checksums) in replication, there are multiple ways for a bad event to show up in the relay logs.</p>
<p>Really this is only one of various error messages you could see if the relay log is corrupted. If there is garbage in the relay logs you could also see malformed queries (with some junk), complaints about an event being too big, and so on. </p>
<p>If relay logs are corrupted it is surely worth checking what could have caused it - it could be the network (especially if replicating over unreliable long-distance networks), MySQL bugs on the master or slave, hardware problems and a few others. In any case it is worth investigating. </p>
<p>Investigating is what you do later, but how do you fix the problem first?  The important question you need answered is: are the logs corrupted on the master?   If the logs on the master are OK you can just run <strong>SHOW SLAVE STATUS</strong> on the slave experiencing the error and use CHANGE MASTER TO to re-point replication to Relay_Master_Log_File:Exec_Master_Log_Pos:</p>
<div class="igBar"><span id="lsql-18"><a href="#" onclick="javascript:showPlainTxt('sql-18'); return false;">PLAIN TEXT</a></span></div>
<div class="syntax_hilite"><span class="langName">SQL:</span>
<div id="sql-18">
<div class="sql">
<ol>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">localhost:<span style="color:#006600; font-weight:bold;">&#40;</span>none<span style="color:#006600; font-weight:bold;">&#41;</span>&gt; slave stop;</div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">Query OK, <span style="color: #cc66cc;color:#800000;">0</span> rows affected <span style="color:#006600; font-weight:bold;">&#40;</span><span style="color: #cc66cc;color:#800000;">0</span>.<span style="color: #cc66cc;color:#800000;">00</span> sec<span style="color:#006600; font-weight:bold;">&#41;</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp;</div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">localhost:<span style="color:#006600; font-weight:bold;">&#40;</span>none<span style="color:#006600; font-weight:bold;">&#41;</span>&gt; <span style="color: #993333; font-weight: bold;">CHANGE</span> master <span style="color: #993333; font-weight: bold;">TO</span> master_log_file=Relay_Master_Log_File,master_log_pos=Exec_Master_Log_Pos;</div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">Query OK, <span style="color: #cc66cc;color:#800000;">0</span> rows affected <span style="color:#006600; font-weight:bold;">&#40;</span><span style="color: #cc66cc;color:#800000;">1</span>.<span style="color: #cc66cc;color:#800000;">16</span> sec<span style="color:#006600; font-weight:bold;">&#41;</span></div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp;</div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">localhost:<span style="color:#006600; font-weight:bold;">&#40;</span>none<span style="color:#006600; font-weight:bold;">&#41;</span>&gt; slave start;</div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">Query OK, <span style="color: #cc66cc;color:#800000;">0</span> rows affected <span style="color:#006600; font-weight:bold;">&#40;</span><span style="color: #cc66cc;color:#800000;">0</span>.<span style="color: #cc66cc;color:#800000;">00</span> sec<span style="color:#006600; font-weight:bold;">&#41;</span> </div>
</li>
</ol>
</div>
</div>
</div>
<p></p>
<p>This will purge the existing relay logs and re-fetch all events which have not been executed yet.   Before running this command, make sure your master is operational and still has all the logs needed to re-fetch the events. </p>
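<p>In the session above, Relay_Master_Log_File and Exec_Master_Log_Pos are placeholders for the values reported by SHOW SLAVE STATUS. Here is a minimal Python sketch of filling them in. The function name and the sample status values are made up for illustration, and the statement is only printed, not executed.</p>

```python
def repoint_statement(status):
    """Build a CHANGE MASTER TO statement from a SHOW SLAVE STATUS row (as a dict)."""
    log_file = status["Relay_Master_Log_File"]
    log_pos = int(status["Exec_Master_Log_Pos"])
    # Refuse anything that would need escaping inside the quoted file name.
    if "'" in log_file or "\\" in log_file:
        raise ValueError("unexpected character in log file name: %r" % log_file)
    return "CHANGE MASTER TO MASTER_LOG_FILE='%s', MASTER_LOG_POS=%d;" % (log_file, log_pos)


# Hypothetical values as they might appear in SHOW SLAVE STATUS.
status = {"Relay_Master_Log_File": "mysql-bin.000042", "Exec_Master_Log_Pos": "1204"}
print(repoint_statement(status))
# CHANGE MASTER TO MASTER_LOG_FILE='mysql-bin.000042', MASTER_LOG_POS=1204;
```

<p>Note that the file name is a quoted string and the position a plain integer. Using the executed position (Exec_Master_Log_Pos) rather than the read position is what makes the slave re-fetch exactly the events it has not applied yet.</p>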
<p>How would you know if the logs are OK on the master?  Well, in this case there were probably 5 other slaves which did not have the problem - which means the master is most likely OK.  In any case there is little harm in trying to restart from the same position - if the logs are bad on the master you would get the same error message again and can continue with the investigation.</p>
<p><strong>What if logs on the master are corrupted? </strong>  In this case you have a couple of choices (and you also potentially have multiple slaves to deal with).  You can use <strong>mysqlbinlog</strong>  (or your favorite hex editor if mysqlbinlog does not work) to find the start of the next event and potentially recover the "corrupted" event so it can be manually executed on the slaves. </p>
<p>Skipping an event makes master and slave potentially inconsistent, and you should assess the risks: depending on the application (and on the number of events which were corrupted) you may want to let replication continue from the new position or resync the slaves to the master.</p>
<p><strong>How can you recover the slave</strong>?   As all slaves are likely to be affected in this case you can't clone another slave.  You also can't use the classical method of recovery from backup - because you would need the master's binary logs to roll forward, and they are corrupted.  You can either re-clone the data from the master (this is where LVM or similar techniques can help you a lot), or skip the bad events as described and then use <a href="http://www.maatkit.org/">Maatkit</a> <strong>mk-table-checksum</strong> to check which tables are out of sync and then use <strong>mk-table-sync</strong> to resync them.</p>
<p>The last method works particularly well if you can afford to run for a while with slaves which are a bit out of sync, which is quite often better than having just the master available (and also carrying the extra load of data being copied from it).  </p>
    <hr noshade style="margin:0;height:1px" />
    <p>Entry posted by peter |
      <a href="http://www.mysqlperformanceblog.com/2008/08/02/troubleshooting-relay-log-corruption-in-mysql/#comments">3 comments</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.mysqlperformanceblog.com/2008/08/02/troubleshooting-relay-log-corruption-in-mysql/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
	</channel>
</rss>
