<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>MySQL Performance Blog&#187; Benchmarks</title>
	<atom:link href="http://www.mysqlperformanceblog.com/category/benchmarks/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.mysqlperformanceblog.com</link>
	<description>Percona&#039;s MySQL &#38; InnoDB performance and scalability blog</description>
	<lastBuildDate>Sat, 11 Feb 2012 00:45:44 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
		<item>
		<title>Benchmarks of new  innodb_flush_neighbor_pages</title>
		<link>http://www.mysqlperformanceblog.com/2012/01/17/benchmarks-of-new-innodb_flush_neighbor_pages/</link>
		<comments>http://www.mysqlperformanceblog.com/2012/01/17/benchmarks-of-new-innodb_flush_neighbor_pages/#comments</comments>
		<pubDate>Wed, 18 Jan 2012 05:27:54 +0000</pubDate>
		<dc:creator>Vadim Tkachenko</dc:creator>
				<category><![CDATA[Benchmarks]]></category>
		<category><![CDATA[Insight for DBAs]]></category>
		<category><![CDATA[MySQL]]></category>

		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/?p=8256</guid>
		<description><![CDATA[In our recent release of Percona Server 5.5.19 we introduced a new setting, innodb_flush_neighbor_pages=cont. It is our attempt to deal with the problem of InnoDB flushing. There is also a second fix, for what we think is a bug in InnoDB, where it blocks queries when that is not needed (I will refer to [...]]]></description>
			<content:encoded><![CDATA[<p>In our recent release of <a href="http://www.mysqlperformanceblog.com/2012/01/13/announcing-percona-server-5-5-19-24-0/">Percona Server 5.5.19</a> we introduced a new setting, <code>innodb_flush_neighbor_pages=cont</code>.<br />
It is our attempt to deal with the problem of <a href="http://www.mysqlperformanceblog.com/2011/09/18/disaster-mysql-5-5-flushing/">InnoDB flushing</a>.</p>
<p><span id="more-8256"></span></p>
<p>There is also a second fix, for what we think is a bug in InnoDB, where it blocks queries when that is not needed (I will refer to it as the &#8220;sync fix&#8221;). In this post, however, I will focus on <strong>innodb_flush_neighbor_pages</strong>.</p>
<p>By default InnoDB flushes so-called neighbor pages, which are not really neighbors.<br />
Say we want to flush page P. InnoDB looks at an area of 128 pages around page P and flushes all the dirty pages in that area. To illustrate, say we have an area of memory like this: <code>...D...D...D....P....D....D...D....D</code> where each dot is a page that does not need flushing, each “D” is a dirty page that InnoDB will flush, and P is our page.<br />
As a result, instead of performing 1 random write, InnoDB performs 8 random writes.<br />
This is quite far from the original intention, which was to flush as many pages as possible in a single sequential write.</p>
<p>So we added a new <code>innodb_flush_neighbor_pages=cont</code> method. With it, only truly sequential writes are performed.<br />
That is, in the case <code>...D...D...D..DDDPD....D....D...D....D</code> only the following pages will be flushed:<br />
<code>...D...D...D..FFFFF....D....D...D....D</code> (marked as &#8220;F&#8221;)</p>
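<p>The difference between the modes can be sketched in a few lines of Python. This is only a toy model of the description above, not InnoDB code: the <code>pages_to_flush</code> function and the string representation of the area are made up for illustration.</p>
<pre>
def pages_to_flush(area, mode):
    """Return the indexes of the pages written when page P is flushed.

    area is a string model of the neighbor area: "." is a clean page,
    "D" is a dirty page and "P" is the page being flushed.
    """
    target = area.index("P")
    if mode == "none":
        return [target]
    if mode == "area":
        # Every dirty page in the area is written, wherever it sits.
        return [i for i, page in enumerate(area) if page in "DP"]
    if mode == "cont":
        # Only the unbroken run of dirty pages touching P is written.
        first = target
        while first != 0 and area[first - 1] == "D":
            first -= 1
        last = target
        while last + 1 != len(area) and area[last + 1] == "D":
            last += 1
        return list(range(first, last + 1))
    raise ValueError("unknown mode: " + mode)


area = "...D...D...D..DDDPD....D....D...D....D"
print(len(pages_to_flush(area, "area")), len(pages_to_flush(area, "cont")))
</pre>
<p>For the second example above, &#8220;area&#8221; selects 12 pages scattered over the area, while &#8220;cont&#8221; selects only the 5 adjacent ones.</p>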
<p>Besides &#8220;cont&#8221;, in Percona Server 5.5.19 <code>innodb_flush_neighbor_pages</code> also accepts the values &#8220;area&#8221; (default) and &#8220;none&#8221; (recommended for SSD).</p>
<p>What kind of effect does it have? Let&#8217;s run some benchmarks.</p>
<p>We repeated the same benchmark I ran in <a href="http://www.mysqlperformanceblog.com/2011/09/18/disaster-mysql-5-5-flushing/">Disaster MySQL 5.5 flushing</a>, but now we used two servers: <a href="http://www.percona.com/docs/wiki/benchmark:hardware:cisco_ucs_c250">Cisco UCS C250</a> and <a href="http://www.percona.com/docs/wiki/benchmark:hardware:hp_proliant_dl380">HP ProLiant DL380 G6</a></p>
<p>First, the results from the HP ProLiant.</p>
<p>Throughput graph:<br />
<a href="http://www.mysqlperformanceblog.com/wp-content/uploads/2012/01/hppro.res_.thrp_.png"><img src="http://www.mysqlperformanceblog.com/wp-content/uploads/2012/01/hppro.res_.thrp_.png" alt="" title="hppro.res.thrp" width="600" height="400" class="aligncenter size-full wp-image-8269" /></a></p>
<p>Response time graph (the y-axis has a logarithmic scale):<br />
<a href="http://www.mysqlperformanceblog.com/wp-content/uploads/2012/01/hppro.res_.resp_.png"><img src="http://www.mysqlperformanceblog.com/wp-content/uploads/2012/01/hppro.res_.resp_.png" alt="" title="hppro.res.resp" width="600" height="400" class="aligncenter size-full wp-image-8268" /></a></p>
<p>As you can see, with &#8220;cont&#8221; we are able to get a stable line. And even with the default innodb_flush_neighbor_pages, Percona Server has smaller dips than MySQL.</p>
<p>To show the effect of the &#8220;sync fix&#8221;, let&#8217;s compare Percona Server 5.5.18 (without the fix) and 5.5.19 (with the fix).</p>
<p><a href="http://www.mysqlperformanceblog.com/wp-content/uploads/2012/01/hppro.sync_.res_.thrp_.png"><img src="http://www.mysqlperformanceblog.com/wp-content/uploads/2012/01/hppro.sync_.res_.thrp_.png" alt="" title="hppro.sync.res.thrp" width="600" height="400" class="aligncenter size-full wp-image-8270" /></a></p>
<p>You can see that the fix keeps queries running in cases where before there was a &#8220;hard&#8221; stop<br />
with no transactions processed.</p>
<p>The previous result may give you the impression that &#8220;cont&#8221; guarantees a stable line, but unfortunately this is not always the case.</p>
<p>Here are the results (throughput and response time) from the Cisco UCS C250 server:</p>
<p><a href="http://www.mysqlperformanceblog.com/wp-content/uploads/2012/01/cisco.res_.thrp_.png"><img src="http://www.mysqlperformanceblog.com/wp-content/uploads/2012/01/cisco.res_.thrp_.png" alt="" title="cisco.res.thrp" width="600" height="400" class="aligncenter size-full wp-image-8267" /></a></p>
<p><a href="http://www.mysqlperformanceblog.com/wp-content/uploads/2012/01/cisco.res_.resp_.png"><img src="http://www.mysqlperformanceblog.com/wp-content/uploads/2012/01/cisco.res_.resp_.png" alt="" title="cisco.res.resp" width="600" height="400" class="aligncenter size-full wp-image-8266" /></a></p>
<p>As you can see, on this server we have longer and deeper periods when MySQL is stuck in flushing, and in such cases<br />
 <code>innodb_flush_neighbor_pages=cont</code> only relieves the problem rather than completely solving it.<br />
That, I believe, is still better than a complete stop for a significant amount of time.</p>
<p>The raw results, scripts and different CPU/IO metrics are available from our <a href="http://bazaar.launchpad.net/~vadim-tk/percona-benchmark-result/cisco-hppro-flushing/files">Benchmarks Launchpad</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.mysqlperformanceblog.com/2012/01/17/benchmarks-of-new-innodb_flush_neighbor_pages/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>Percona testing:  Quick test clusters with kewpie!</title>
		<link>http://www.mysqlperformanceblog.com/2012/01/13/percona-testing-quick-test-clusters-with-kewpie/</link>
		<comments>http://www.mysqlperformanceblog.com/2012/01/13/percona-testing-quick-test-clusters-with-kewpie/#comments</comments>
		<pubDate>Fri, 13 Jan 2012 18:27:48 +0000</pubDate>
		<dc:creator>patrick.crews</dc:creator>
				<category><![CDATA[Benchmarks]]></category>
		<category><![CDATA[MySQL]]></category>

		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/?p=8212</guid>
		<description><![CDATA[The announcement of Percona XtraDB Cluster seems to have generated a fair bit of interest : ) Although the documentation contains more formal instructions for setting up a test cluster, I wanted to share a quick way to set up an ad-hoc cluster on a single machine to help people play with this (imho) rather [...]]]></description>
			<content:encoded><![CDATA[<p>The <a href="http://www.mysqlperformanceblog.com/2012/01/09/announcement-of-percona-xtradb-cluster-alpha-release/">announcement of Percona XtraDB Cluster</a> seems to have generated a fair bit of interest : )</p>
<p>Although the <a href="http://www.percona.com/doc/percona-xtradb-cluster/index.html">documentation</a> contains more formal <a href="http://www.percona.com/doc/percona-xtradb-cluster/index.html#how-to">instructions for setting up a test cluster</a>, I wanted to share a quick way to set up an ad-hoc cluster on a single machine to help people play with this (imho) rather amazing bit of software.</p>
<p>To do this, you will need kewpie (PXC will have <a href="https://code.launchpad.net/~patrick-crews/percona-xtradb-cluster/qp-update">kewpie in-tree</a> soon)<br />
cd basedir;<br />
bzr branch lp:kewpie</p>
<p>Edit the file kewpie.py like so:</p>
<pre>
=== modified file 'kewpie.py'
--- kewpie.py    2012-01-09 21:17:09 +0000
+++ kewpie.py    2012-01-11 18:32:17 +0000
@@ -49,9 +49,9 @@ from lib.test_mgmt.execution_management
# We base / look for a lot of things based on the location of
# the kewpie.py file
qp_rootdir = os.path.dirname(os.path.abspath(sys.argv[0]))
-#project_name = 'percona-xtradb-cluster'
+project_name = 'percona-xtradb-cluster'
#project_name = 'xtrabackup'
-project_name = None
+#project_name = None
defaults = get_defaults(qp_rootdir,project_name)
variables = test_run_options.parse_qp_options(defaults)
variables['qp_root'] = qp_rootdir
</pre>
<p>Or you may branch kewpie anywhere and simply pass the appropriate --basedir and --wsrep-provider-path options and use --default-server-type=galera</p>
<p>* A default location of /usr/lib/galera/libgalera_smm.so is assumed</p>
<p>To get your cluster, run the tests with --start-and-exit:<br />
./kewpie.py --start-and-exit<br />
This will start up 3 nodes and join them into a cluster:</p>
<pre>
percona-xtradb-cluster/kewpie$ ./kewpie.py --start-and-exit
Setting --no-secure-file-priv=True for randgen usage...
20120113-125552 INFO Using --no-shm, will not link workdir to shm
20120113-125552 INFO Using mysql source tree:
20120113-125552 INFO basedir: /percona-xtradb-cluster
20120113-125552 INFO clientbindir: /percona-xtradb-cluster/client
20120113-125552 INFO testdir: /percona-xtradb-cluster/kewpie
20120113-125552 INFO server_version: 5.5.17
20120113-125552 INFO server_compile_os: Linux
20120113-125552 INFO server_platform: x86_64
20120113-125552 INFO server_comment: (Source distribution wsrep_22.3.r3683)
20120113-125552 INFO Using default-storage-engine: innodb
20120113-125552 INFO Using testing mode: native
20120113-125552 INFO Processing test suites...
20120113-125552 INFO Found 35 test(s) for execution
20120113-125552 INFO Creating 1 bot(s)
20120113-125604 INFO Taking clean db snapshot...
20120113-125610 INFO Taking clean db snapshot...
20120113-125616 INFO Taking clean db snapshot...
20120113-125621 INFO bot0 server:
20120113-125621 INFO NAME: s0
20120113-125621 INFO MASTER_PORT: 9317
20120113-125621 INFO GALERA_LISTEN_PORT: 9318
20120113-125621 INFO GALERA_RECV_PORT: 9319
20120113-125621 INFO SOCKET_FILE: /percona-xtradb-cluster/kewpie/workdir/bot0/var_s0/my.sock
20120113-125621 INFO VARDIR: /percona-xtradb-cluster/kewpie/workdir/bot0/var_s0
20120113-125621 INFO STATUS: 1
20120113-125621 INFO bot0 server:
20120113-125621 INFO NAME: s1
20120113-125621 INFO MASTER_PORT: 9320
20120113-125621 INFO GALERA_LISTEN_PORT: 9321
20120113-125621 INFO GALERA_RECV_PORT: 9322
20120113-125621 INFO SOCKET_FILE: /percona-xtradb-cluster/kewpie/workdir/bot0/var_s1/my.sock
20120113-125621 INFO VARDIR: /percona-xtradb-cluster/kewpie/workdir/bot0/var_s1
20120113-125621 INFO STATUS: 1
20120113-125621 INFO bot0 server:
20120113-125621 INFO NAME: s2
20120113-125621 INFO MASTER_PORT: 9323
20120113-125621 INFO GALERA_LISTEN_PORT: 9324
20120113-125621 INFO GALERA_RECV_PORT: 9325
20120113-125621 INFO SOCKET_FILE: /percona-xtradb-cluster/kewpie/workdir/bot0/var_s2/my.sock
20120113-125621 INFO VARDIR: /percona-xtradb-cluster/kewpie/workdir/bot0/var_s2
20120113-125621 INFO STATUS: 1
20120113-125621 INFO User specified --start-and-exit.  kewpie.py exiting and leaving servers running...
</pre>
<p>Now for some play:</p>
<pre>
$ mysql -uroot --protocol=tcp --port=9317 test
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 6
Server version: 5.5.17-log Source distribution wsrep_22.3.r3683

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql&gt; create table t1 (a int not null auto_increment, primary key(a));
Query OK, 0 rows affected (0.11 sec)

mysql&gt; insert into t1 values (),(),(),(),();
Query OK, 5 rows affected (0.06 sec)
Records: 5  Duplicates: 0  Warnings: 0

mysql&gt; select * from t1;
+----+
| a  |
+----+
|  1 |
|  4 |
|  7 |
| 10 |
| 13 |
+----+
5 rows in set (0.00 sec)

mysql&gt; exit;
Bye
$ mysql -uroot --protocol=tcp --port=9320 test
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 8
Server version: 5.5.17-log Source distribution wsrep_22.3.r3683

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql&gt; select * from t1;
+----+
| a  |
+----+
|  1 |
|  4 |
|  7 |
| 10 |
| 13 |
+----+
5 rows in set (0.00 sec)

mysql&gt; exit
Bye
$ mysql -uroot --protocol=tcp --port=9323 test
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 8
Server version: 5.5.17-log Source distribution wsrep_22.3.r3683

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql&gt; select * from t1;
+----+
| a  |
+----+
|  1 |
|  4 |
|  7 |
| 10 |
| 13 |
+----+
5 rows in set (0.00 sec)

mysql&gt; exit
Bye
</pre>
<p>Should you wish to alter the number of nodes or their configuration, you can edit the percona_tests/cluster_basic/suite_config.py file:</p>
<pre>
server_requirements = [[],[],[]]
server_requests = {'join_cluster':[(0,1), (0,2)]}
servers = []
</pre>
<p>Each '[]' in the server_requirements list is a server.  You can add new servers by adding a new list.  If you want specific options, put them into the list representing the server:<br />
[['--innodb-file-per-table']]</p>
<p>You will need to add an entry into the server_requests dictionary as well.  If you added a new node and want it in the cluster you would simply change it as follows:<br />
server_requests = {'join_cluster':[(0,1), (0,2), (0,3)]}</p>
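<p>Put together, a four-node version of suite_config.py might look like the sketch below. It only combines the two changes just described; giving the fourth server --innodb-file-per-table is an arbitrary choice for illustration, and I am assuming each (0, n) pair means &#8220;join server n to server 0&#8221;, as the default file suggests.</p>
<pre>
# Four servers; the last one gets a specific option (illustrative choice).
server_requirements = [[], [], [], ['--innodb-file-per-table']]
# Join servers 1, 2 and 3 to server 0 (assumed meaning of each pair).
server_requests = {'join_cluster': [(0, 1), (0, 2), (0, 3)]}
servers = []
</pre>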
<p>When you are done, you may use --mode=cleanup to kill off any servers:</p>
<pre>
./kewpie.py --mode=cleanup
Setting --no-secure-file-priv=True for randgen usage...
Setting --start-dirty=True for cleanup mode...
20120113-132229 INFO Using --start-dirty, not attempting to touch directories
20120113-132229 INFO Using mysql source tree:
20120113-132229 INFO basedir: /percona-xtradb-cluster
20120113-132229 INFO clientbindir: /percona-xtradb-cluster/client
20120113-132229 INFO testdir: /percona-xtradb-cluster/kewpie
20120113-132229 INFO server_version: 5.5.17
20120113-132229 INFO server_compile_os: Linux
20120113-132229 INFO server_platform: x86_64
20120113-132229 INFO server_comment: (Source distribution wsrep_22.3.r3683)
20120113-132229 INFO Using default-storage-engine: innodb
20120113-132229 INFO Using testing mode: cleanup
20120113-132229 INFO Killing pid 17040 from /percona-xtradb-cluster/kewpie/workdir/bot0/var_s0/run/my.pid
20120113-132229 INFO Killing pid 17096 from /percona-xtradb-cluster/kewpie/workdir/bot0/var_s2/run/my.pid
20120113-132229 INFO Killing pid 17070 from /percona-xtradb-cluster/kewpie/workdir/bot0/var_s1/run/my.pid
20120113-132229 INFO Stopping all running servers...
</pre>
<p>Alternatively, you can just let the tests run to ensure some basic functionality.  I&#8217;ll be writing more about these tests and other testing efforts soon, but I wanted to help people get started with their own explorations.</p>
<p>Happy testing and I hope you dig Percona XtraDB Cluster as much as we do : )</p>
]]></content:encoded>
			<wfw:commentRss>http://www.mysqlperformanceblog.com/2012/01/13/percona-testing-quick-test-clusters-with-kewpie/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>kernel_mutex problem cont. Or triple your throughput</title>
		<link>http://www.mysqlperformanceblog.com/2011/12/02/kernel_mutex-problem-cont-or-triple-your-throughput/</link>
		<comments>http://www.mysqlperformanceblog.com/2011/12/02/kernel_mutex-problem-cont-or-triple-your-throughput/#comments</comments>
		<pubDate>Sat, 03 Dec 2011 00:41:40 +0000</pubDate>
		<dc:creator>Vadim Tkachenko</dc:creator>
				<category><![CDATA[Benchmarks]]></category>
		<category><![CDATA[MySQL]]></category>

		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/?p=7688</guid>
		<description><![CDATA[This is a follow-up to my previous post on the kernel_mutex problem. First, I may have an explanation of why the performance degrades so significantly and why innodb_sync_spin_loops may fix it. Second, if that is correct (or not, but we can try anyway), then playing with innodb_thread_concurrency may also help. So I ran some benchmarks with [...]]]></description>
			<content:encoded><![CDATA[<p>This is a follow-up to my previous post on the <a href="http://www.mysqlperformanceblog.com/2011/12/02/kernel_mutex-problem-or-double-throughput-with-single-variable/">kernel_mutex problem</a>.</p>
<p>First, I may have an explanation of why the performance degrades so significantly and why <strong>innodb_sync_spin_loops</strong> may fix it.<br />
Second, if that is correct (or not, but we can try anyway), then playing with <strong>innodb_thread_concurrency</strong> may also help. So I ran some benchmarks with innodb_thread_concurrency.<br />
<span id="more-7688"></span></p>
<p>My explanation of the performance degradation is the following:<br />
InnoDB still uses a rather strange mutex implementation based on sync_arrays (hello, 1990s), and I do not know of a good reason why it has not been replaced yet.<br />
Sync_array internally uses a pthread_cond_wait / pthread_cond_broadcast construction, and on a pthread_cond_broadcast call all the threads competing for the mutex wake up and start racing.<br />
This effect is called the <a href="http://en.wikipedia.org/wiki/Thundering_herd_problem">thundering herd</a>.</p>
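<p>To make the pattern concrete, here is a small Python sketch of a broadcast wake-up. It is not InnoDB code and says nothing about how sync_array really behaves; it only illustrates the general effect, with <code>notify_all</code> standing in for pthread_cond_broadcast. The function name <code>broadcast_demo</code> and the bookkeeping dictionary are invented for the example.</p>
<pre>
import threading
import time


def broadcast_demo(n_waiters):
    """Wake n_waiters threads with one broadcast; return (woken, winners)."""
    cond = threading.Condition()
    state = {"waiting": 0, "free": False, "woken": 0, "winners": 0}

    def waiter():
        with cond:
            state["waiting"] += 1
            cond.wait()                # stands in for pthread_cond_wait
            state["woken"] += 1        # every waiter wakes up and races
            if state["free"]:
                state["free"] = False  # only the first one gets the resource
                state["winners"] += 1

    threads = [threading.Thread(target=waiter) for _ in range(n_waiters)]
    for t in threads:
        t.start()
    # Poll until every thread is parked inside cond.wait().
    while True:
        with cond:
            if state["waiting"] == n_waiters:
                state["free"] = True
                cond.notify_all()      # stands in for pthread_cond_broadcast
                break
        time.sleep(0.001)
    for t in threads:
        t.join()
    return state["woken"], state["winners"]


print(broadcast_demo(16))
</pre>
<p>All 16 threads are woken, but only one of them finds the resource free; the other 15 wake-ups are wasted work, and that waste grows with the number of waiting threads.</p>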
<p>Davi Arnaut <a href="http://www.mysqlperformanceblog.com/2011/12/02/kernel_mutex-problem-or-double-throughput-with-single-variable/comment-page-1/#comment-850254 ">does not agree with me</a>, and I do not <a href="http://www.mysqlperformanceblog.com/2011/12/02/kernel_mutex-problem-or-double-throughput-with-single-variable/comment-page-1/#comment-850280">agree with him</a> either. This is a healthy discussion, and it is possible only because InnoDB is still Open Source and we can all check the source code. If the problem were in the closed extension <a href="http://dev.mysql.com/doc/refman/5.5/en/thread-pool-plugin.html">Thread Pool</a> I could not participate in it.</p>
<p>We will probably argue more on that topic, but that does not stop us from trying different values of<br />
<strong>innodb_thread_concurrency</strong> (0 by default, which means no restriction).</p>
<p>This variable has had a complicated history. Once it was a solution for <a href="http://bugs.mysql.com/bug.php?id=15815">poor InnoDB scalability</a>, then its default value changed, and then it was even <a href="http://mysqlha.blogspot.com/2010/03/do-we-still-need-innodbthreadconcurrenc.html">named useless</a>.</p>
<p>Here are the results for the same workload as in the previous post, with 256 threads and<br />
innodb_thread_concurrency=0,4,8,16,32,64:</p>
<p><a href="http://www.mysqlperformanceblog.com/wp-content/uploads/2011/12/sysbench-concur.png"><img class="aligncenter size-full wp-image-7689" title="sysbench-concur" src="http://www.mysqlperformanceblog.com/wp-content/uploads/2011/12/sysbench-concur.png" alt="" width="600" height="400" /></a></p>
<table border="1">
<tbody>
<tr>
<th>innodb_thread_concurrency</th>
<th>Throughput, q/s</th>
</tr>
<tr>
<td align="right">0</td>
<td align="right">68369.02</td>
</tr>
<tr>
<td align="right">4</td>
<td align="right">137999.96</td>
</tr>
<tr>
<td align="right">8</td>
<td align="right">194537.48</td>
</tr>
<tr>
<td align="right">16</td>
<td align="right">161985.59</td>
</tr>
<tr>
<td align="right">32</td>
<td align="right">158296.21</td>
</tr>
<tr>
<td align="right">64</td>
<td align="right">153889.72</td>
</tr>
</tbody>
</table>
<p>Wow, this is something. I expected an improvement, but not almost 3x (194537 ÷ 68369 = 2.8).<br />
The best throughput is with <strong>innodb_thread_concurrency=8</strong>.</p>
<p>So now let&#8217;s compare the results for innodb_thread_concurrency=0 vs 8 across the whole range of threads:</p>
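<p>The ratio can be checked directly from the table above. The snippet below is just that arithmetic in Python, with the numbers copied from the table; the variable names are mine.</p>
<pre>
# Throughput at 256 threads, keyed by innodb_thread_concurrency.
throughput = {0: 68369.02, 4: 137999.96, 8: 194537.48,
              16: 161985.59, 32: 158296.21, 64: 153889.72}

baseline = throughput[0]
# The setting with the highest throughput.
best = max(throughput, key=throughput.get)
# Speedup of every setting relative to innodb_thread_concurrency=0.
speedup = {c: round(t / baseline, 2) for c, t in throughput.items()}

print(best, speedup[best])
</pre>
<p>Every non-zero setting I tried at least doubles the throughput of the unrestricted case, and 8 gives the largest gain.</p>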
<p><a href="http://www.mysqlperformanceblog.com/wp-content/uploads/2011/12/sysbench-threads-conc.png"><img class="aligncenter size-full wp-image-7696" title="sysbench-threads-conc" src="http://www.mysqlperformanceblog.com/wp-content/uploads/2011/12/sysbench-threads-conc.png" alt="" width="600" height="400" /></a></p>
<table border="1">
<tbody>
<tr>
<th>Threads</th>
<th>innodb_thread_concurrency=0, q/s</th>
<th>innodb_thread_concurrency=8, q/s</th>
</tr>
<tr>
<td align="right">1</td>
<td align="right">11178.34</td>
<td align="right"></td>
</tr>
<tr>
<td align="right">2</td>
<td align="right">27741.06</td>
<td align="right"></td>
</tr>
<tr>
<td align="right">4</td>
<td align="right">53364.52</td>
<td align="right"></td>
</tr>
<tr>
<td align="right">8</td>
<td align="right">92546.73</td>
<td align="right">88046.72</td>
</tr>
<tr>
<td align="right">16</td>
<td align="right">144619.58</td>
<td align="right">141781.00</td>
</tr>
<tr>
<td align="right">32</td>
<td align="right">164884.03</td>
<td align="right">168360.95</td>
</tr>
<tr>
<td align="right">64</td>
<td align="right">154235.73</td>
<td align="right">186167.15</td>
</tr>
<tr>
<td align="right">128</td>
<td align="right">147456.33</td>
<td align="right">199260.97</td>
</tr>
<tr>
<td align="right">256</td>
<td align="right">68369.02</td>
<td align="right">194357.78</td>
</tr>
<tr>
<td align="right">512</td>
<td align="right">40509.67</td>
<td align="right">194639.51</td>
</tr>
<tr>
<td align="right">1024</td>
<td align="right">22166.94</td>
<td align="right">183524.16</td>
</tr>
</tbody>
</table>
<p>So <strong>innodb_thread_concurrency</strong> is even more helpful than innodb_sync_spin_loops, and gives stable results even with 1024 threads. It is too early to call it useless, and it is worth experimenting with.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.mysqlperformanceblog.com/2011/12/02/kernel_mutex-problem-cont-or-triple-your-throughput/feed/</wfw:commentRss>
		<slash:comments>10</slash:comments>
		</item>
		<item>
		<title>kernel_mutex problem. Or double throughput with single variable</title>
		<link>http://www.mysqlperformanceblog.com/2011/12/02/kernel_mutex-problem-or-double-throughput-with-single-variable/</link>
		<comments>http://www.mysqlperformanceblog.com/2011/12/02/kernel_mutex-problem-or-double-throughput-with-single-variable/#comments</comments>
		<pubDate>Fri, 02 Dec 2011 18:00:21 +0000</pubDate>
		<dc:creator>Vadim Tkachenko</dc:creator>
				<category><![CDATA[Benchmarks]]></category>
		<category><![CDATA[Insight for DBAs]]></category>
		<category><![CDATA[MySQL]]></category>

		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/?p=7629</guid>
		<description><![CDATA[The problem with kernel_mutex in MySQL 5.1 and MySQL 5.5 is known: Bug report. In fact, MySQL 5.6 has some fixes that are supposed to provide a solution, but MySQL 5.6 still has a long way to go before production, and it is also not clear whether the problem is really fixed. Meanwhile, the problem with kernel_mutex [...]]]></description>
			<content:encoded><![CDATA[<p>The <strong>problem with kernel_mutex</strong> in MySQL 5.1 and MySQL 5.5 is known: <a href="http://bugs.mysql.com/bug.php?id=54982">Bug report</a>. In fact, MySQL 5.6 has some fixes that are supposed to provide a solution, but MySQL 5.6 still has a long way to go before production, and it is also not clear whether the problem is really fixed.</p>
<p>Meanwhile, the kernel_mutex problem is becoming more common: I had three customer cases related to performance drops during the last month.</p>
<p>So what can be done about it? Let&#8217;s run some benchmarks.</p>
<p><span id="more-7629"></span></p>
<p>But first, some theory. InnoDB takes <strong>kernel_mutex</strong> when it starts or stops a transaction, and when InnoDB starts a transaction it usually loops through <strong>ALL active transactions</strong> while holding <strong>kernel_mutex</strong>. So, to see kernel_mutex in action, we need many concurrent but short transactions.</p>
<p>For this we will use sysbench running only simple primary-key SELECT queries against 48 tables, 5,000,000 rows each.</p>
<p>The hardware is a <a href="http://www.percona.com/docs/wiki/benchmark:hardware:cisco_ucs_c250">Cisco UCS C250</a> server. The workload is <strong>read-only and fully in memory</strong>.</p>
<p>Here are the results for different thread counts (against Percona Server 5.5.17):</p>
<p><a href="http://www.mysqlperformanceblog.com/wp-content/uploads/2011/12/sysbench-base.png"><img class="aligncenter size-full wp-image-7658" title="sysbench-base" src="http://www.mysqlperformanceblog.com/wp-content/uploads/2011/12/sysbench-base.png" alt="" width="600" height="400" /></a></p>
<table border="1">
<tbody>
<tr>
<th>Threads</th>
<th>Throughput, q/s</th>
</tr>
<tr>
<td align="right">1</td>
<td align="right">11178.34</td>
</tr>
<tr>
<td align="right">2</td>
<td align="right">27741.06</td>
</tr>
<tr>
<td align="right">4</td>
<td align="right">53364.52</td>
</tr>
<tr>
<td align="right">8</td>
<td align="right">92546.73</td>
</tr>
<tr>
<td align="right">16</td>
<td align="right">144619.58</td>
</tr>
<tr>
<td align="right">32</td>
<td align="right">164884.03</td>
</tr>
<tr>
<td align="right">64</td>
<td align="right">154235.73</td>
</tr>
<tr>
<td align="right">128</td>
<td align="right">147456.33</td>
</tr>
<tr>
<td align="right">256</td>
<td align="right">68369.02</td>
</tr>
<tr>
<td align="right">512</td>
<td align="right">40509.67</td>
</tr>
<tr>
<td align="right">1024</td>
<td align="right">22166.94</td>
</tr>
</tbody>
</table>
<p>The peak throughput is <strong>164884 q/s</strong> at 32 threads, and it declines to <strong>68369 q/s</strong> at 256 threads, which is a <strong>2.4x</strong> drop.</p>
<p>The reason, as you may guess, is kernel_mutex. How can you see it? It is easy. In the output of <code>SHOW ENGINE INNODB STATUS\G</code> you will see a lot of lines like:</p>
<pre>--Thread 140370743510784 has waited at trx0trx.c line 1184 for 0.0000 seconds the semaphore:
Mutex at 0x2b0ccc8 '&amp;kernel_mutex', lock var 1
waiters flag 0
--Thread 140370752542464 has waited at trx0trx.c line 1772 for 0.0000 seconds the semaphore:
Mutex at 0x2b0ccc8 '&amp;kernel_mutex', lock var 1
waiters flag 0
--Thread 140088222295808 has waited at trx0trx.c line 1184 for 0.0000 seconds the semaphore:
Mutex at 0x2b0ccc8 '&amp;kernel_mutex', lock var 1
waiters flag 0
--Thread 140370746922752 has waited at trx0trx.c line 1184 for 0.0000 seconds the semaphore:
Mutex at 0x2b0ccc8 '&amp;kernel_mutex', lock var 1
waiters flag 0
--Thread 140088223500032 has waited at trx0trx.c line 1184 for 0.0000 seconds the semaphore:
Mutex at 0x2b0ccc8 '&amp;kernel_mutex', lock var 1
waiters flag 0
--Thread 140088231528192 has waited at trx0trx.c line 795 for 0.0000 seconds the semaphore:
Mutex at 0x2b0ccc8 '&amp;kernel_mutex', lock var 1
waiters flag 0
...</pre>
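<p>If you want numbers rather than eyeballing the output, the waits can be counted per source location. The Python sketch below assumes the two-line layout shown above (a &#8220;has waited at&#8221; line followed by a line naming the mutex); the function name <code>kernel_mutex_waits</code> is made up, and the text to parse is whatever you captured from the status command.</p>
<pre>
import re
from collections import Counter

# Matches e.g. "has waited at trx0trx.c line 1184".
WAIT_RE = re.compile(r"has waited at (\S+) line (\d+)")


def kernel_mutex_waits(status_text):
    """Count kernel_mutex waits per (file, line) in the status text."""
    counts = Counter()
    location = None
    for line in status_text.splitlines():
        match = WAIT_RE.search(line)
        if match:
            # Remember where the waiting thread is blocked.
            location = (match.group(1), int(match.group(2)))
        elif location is not None and "kernel_mutex" in line:
            # The line after the wait names the mutex being waited for.
            counts[location] += 1
            location = None
    return counts
</pre>
<p>Calling <code>kernel_mutex_waits(text).most_common()</code> on a captured status dump then lists the hottest call sites first.</p>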
<p>This problem is actually quite serious. In real workloads I have seen this happen with fewer than 256 threads, and not all production systems can tolerate a 2x drop in throughput at peak times.</p>
<p><strong>So what can be done about it?</strong></p>
<p>As a first attempt, let&#8217;s recall that kernel_mutex (like all InnoDB mutexes) has complex handling with spin loops, and there are two variables that affect the spin loops: <strong>innodb_sync_spin_loops</strong> and <strong>innodb_spin_wait_delay</strong>. I actually think that tuning a system with these variables is closer to <a href="http://eclecticarksageadvice.blogspot.com/2011/05/shamanic-drumming.html">dancing with a drum</a> than to the scientific method, but when nothing else helps, why not try?</p>
<p>Here we vary <strong>innodb_sync_spin_loops</strong> from 0 to 100 (default is 30):</p>
<p><a href="http://www.mysqlperformanceblog.com/wp-content/uploads/2011/12/sysbench-spinloops.png"><img class="aligncenter size-full wp-image-7651" title="sysbench-spinloops" src="http://www.mysqlperformanceblog.com/wp-content/uploads/2011/12/sysbench-spinloops.png" alt="" width="600" height="400" /></a></p>
<table border="1">
<tbody>
<tr>
<th>Threads</th>
<th>Throughput</th>
<th>NA</th>
</tr>
<tr>
<td align="right">1</td>
<td align="right">11178.34</td>
<td></td>
</tr>
<tr>
<td align="right">2</td>
<td align="right">27741.06</td>
<td></td>
</tr>
<tr>
<td align="right">4</td>
<td align="right">53364.52</td>
<td></td>
</tr>
<tr>
<td align="right">8</td>
<td align="right">92546.73</td>
<td></td>
</tr>
<tr>
<td align="right">16</td>
<td align="right">144619.58</td>
<td></td>
</tr>
<tr>
<td align="right">32</td>
<td align="right">164884.03</td>
<td></td>
</tr>
<tr>
<td align="right">64</td>
<td align="right">154235.73</td>
<td></td>
</tr>
<tr>
<td align="right">128</td>
<td align="right">147456.33</td>
<td></td>
</tr>
<tr>
<td align="right">256</td>
<td align="right">68369.02</td>
<td></td>
</tr>
<tr>
<td align="right">512</td>
<td align="right">40509.67</td>
<td></td>
</tr>
<tr>
<td align="right">1024</td>
<td align="right">22166.94</td>
<td></td>
</tr>
</tbody>
</table>
<p>I was surprised to see that with <strong>innodb_sync_spin_loops</strong>=100 we can improve to <strong>145324</strong> q/s , almost to peak throughput from first experiment.</p>
<p>With <strong>innodb_sync_spin_loops</strong>=100 the <strong>kernel_mutex</strong> is still the main point of contention, but InnoDB tries to prevent the current thread from pausing, and that seems helping.</p>
<p>Further experiments showed that 100 is not enough for 512 threads, and it should be increased to 200.</p>
<p>So here are the final results with <strong>innodb_sync_spin_loops</strong>=200 for 1-1024 threads.</p>
<p><a href="http://www.mysqlperformanceblog.com/wp-content/uploads/2011/12/sysbench-base-spin.png"><img class="aligncenter size-full wp-image-7680" title="sysbench-base-spin" src="http://www.mysqlperformanceblog.com/wp-content/uploads/2011/12/sysbench-base-spin.png" alt="" width="600" height="400" /></a></p>
<table border="1">
<tbody>
<tr>
<th>Threads</th>
<th>Throughput</th>
<th>Throughput spin 200</th>
</tr>
<tr>
<td align="right">1</td>
<td align="right">11178.34</td>
<td align="right">11288.42</td>
</tr>
<tr>
<td align="right">2</td>
<td align="right">27741.06</td>
<td align="right">28387.62</td>
</tr>
<tr>
<td align="right">4</td>
<td align="right">53364.52</td>
<td align="right">53575.52</td>
</tr>
<tr>
<td align="right">8</td>
<td align="right">92546.73</td>
<td align="right">92184.65</td>
</tr>
<tr>
<td align="right">16</td>
<td align="right">144619.58</td>
<td align="right">143688.91</td>
</tr>
<tr>
<td align="right">32</td>
<td align="right">164884.03</td>
<td align="right">164392.94</td>
</tr>
<tr>
<td align="right">64</td>
<td align="right">154235.73</td>
<td align="right">154022.57</td>
</tr>
<tr>
<td align="right">128</td>
<td align="right">147456.33</td>
<td align="right">152280.84</td>
</tr>
<tr>
<td align="right">256</td>
<td align="right">68369.02</td>
<td align="right">150089.31</td>
</tr>
<tr>
<td align="right">512</td>
<td align="right">40509.67</td>
<td align="right">127680.65</td>
</tr>
<tr>
<td align="right">1024</td>
<td align="right">22166.94</td>
<td align="right">61507.08</td>
</tr>
</tbody>
</table>
<p>So by playing with this variable we can double throughput, up to the level we see with 32-64 threads.<br />
I cannot really explain how it works internally, but I wanted to show one possible way<br />
to deal with the problem when you are hit by kernel_mutex contention.</p>
<p>As further directions, I want to try limiting <strong>innodb_thread_concurrency</strong> and binding mysqld to fewer CPUs; it will also be interesting to see whether MySQL 5.6.3 really fixes this problem.</p>
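<p>To put numbers on the word "double", here is a small Python sketch that computes the ratio between the two throughput columns for the high-concurrency rows. The figures are copied from the table above; the <code>results</code> dictionary and the <code>speedup</code> helper are names made up for this illustration.</p>

```python
# Throughput (q/s) copied from the table above:
# threads -> (default run, run with innodb_sync_spin_loops=200)
results = {
    128: (147456.33, 152280.84),
    256: (68369.02, 150089.31),
    512: (40509.67, 127680.65),
    1024: (22166.94, 61507.08),
}


def speedup(threads):
    """Ratio of the tuned throughput to the default throughput."""
    base, tuned = results[threads]
    return tuned / base


for threads in sorted(results):
    print("%5d threads: %.2fx" % (threads, speedup(threads)))
```

<p>For these rows the ratio comes out at roughly 1.03x, 2.2x, 3.2x and 2.8x, so the gain only appears once the default setting starts to collapse at 256 threads and above.</p>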
]]></content:encoded>
			<wfw:commentRss>http://www.mysqlperformanceblog.com/2011/12/02/kernel_mutex-problem-or-double-throughput-with-single-variable/feed/</wfw:commentRss>
		<slash:comments>17</slash:comments>
		</item>
		<item>
		<title>Virident FlashMAX MLC in tpcc-mysql workload</title>
		<link>http://www.mysqlperformanceblog.com/2011/11/29/virident-flashmax-mlc-in-tpcc-mysql-workload/</link>
		<comments>http://www.mysqlperformanceblog.com/2011/11/29/virident-flashmax-mlc-in-tpcc-mysql-workload/#comments</comments>
		<pubDate>Tue, 29 Nov 2011 17:38:54 +0000</pubDate>
		<dc:creator>Vadim Tkachenko</dc:creator>
				<category><![CDATA[Benchmarks]]></category>
		<category><![CDATA[Hardware and Storage]]></category>
		<category><![CDATA[MySQL]]></category>

		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/?p=7589</guid>
		<description><![CDATA[As I mentioned in previous post on Virident FlashMAX MLC, beside sysbench benchmark, I also run tpcc-mysql (to compare performance Virident FlashMAX vs Fusion-io ioDrive Duo) The report with results is there: http://www.percona.com/files/white-papers/virident-mlc-tpcc.pdf The graphical result for tpcc-mysql 5000W: My conclusions from this benchmark: Virident FlashMAX provides stability of performance and reveals a denser throughput. [...]]]></description>
			<content:encoded><![CDATA[<p>As I mentioned in my previous post on <a href="http://www.mysqlperformanceblog.com/2011/11/10/review-of-virident-flashmax-mlc/">Virident FlashMAX MLC</a>, besides the sysbench benchmark, I also ran tpcc-mysql (to compare the performance of Virident FlashMAX vs Fusion-io ioDrive Duo).</p>
<p>The report with results is here: <a href="http://www.percona.com/files/white-papers/virident-mlc-tpcc.pdf">http://www.percona.com/files/white-papers/virident-mlc-tpcc.pdf</a></p>
<p><span id="more-7589"></span></p>
<p>The graphical result for tpcc-mysql 5000W:</p>
<p><a href="http://www.mysqlperformanceblog.com/wp-content/uploads/2011/11/virident-tpcc-5000.png"><img src="http://www.mysqlperformanceblog.com/wp-content/uploads/2011/11/virident-tpcc-5000.png" alt="" title="virident-tpcc-5000" width="600" height="400" class="aligncenter size-full wp-image-7591" /></a></p>
<p><strong>My conclusions from this benchmark:</strong></p>
<ul>
<li>Virident FlashMAX provides stable performance and shows denser, less scattered throughput.</li>
<li>In addition to stability, in many cases there is also better throughput in MySQL (up to 40%) using the Virident FlashMAX card.
</li>
</ul>
<p><strong>DISCLOSURE</strong>: This benchmark was done as part of our consulting practice, for which we were compensated by Virident. However, this benchmark was run independently of Virident, and reflects our opinion of this product.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.mysqlperformanceblog.com/2011/11/29/virident-flashmax-mlc-in-tpcc-mysql-workload/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
		</item>
		<item>
		<title>Fishing with dynamite, brought to you by the randgen and dbqp</title>
		<link>http://www.mysqlperformanceblog.com/2011/11/16/fishing-with-dynamite-brought-to-you-by-the-randgen-and-dbqp/</link>
		<comments>http://www.mysqlperformanceblog.com/2011/11/16/fishing-with-dynamite-brought-to-you-by-the-randgen-and-dbqp/#comments</comments>
		<pubDate>Wed, 16 Nov 2011 11:37:47 +0000</pubDate>
		<dc:creator>patrick.crews</dc:creator>
				<category><![CDATA[Benchmarks]]></category>
		<category><![CDATA[MySQL]]></category>

		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/?p=7348</guid>
		<description><![CDATA[I tend to speak highly of the random query generator as a testing tool and thought I would share a story that shows how it can really shine. At our recent dev team meeting, we spent approximately 30 minutes of hack time to produce test cases for 3 rather hard to duplicate bugs. Of course, [...]]]></description>
			<content:encoded><![CDATA[<p>I tend to speak highly of the <a href="https://launchpad.net/randgen">random query generator</a> as a testing tool and thought I would share a story that shows how it can really shine. At our recent dev team meeting, we spent approximately 30 minutes of hack time to produce test cases for 3 rather hard to duplicate bugs. Of course, I would also like to think that the way we have packaged our randgen tests into unittest format for <a href="https://launchpad.net/dbqp">dbqp</a> played some small part, but I might be mildly biased.</p>
<p>The best description of the randgen&#8217;s power comes courtesy of <a href="http://www.linuxjedi.co.uk/">Andrew Hutchings</a> &#8211; &#8220;<a href="http://en.wikipedia.org/wiki/Blast_fishing">fishing with dynamite</a>&#8221;. This is a very apt metaphor for how the tool works &#8211; it can be quite effective for stressing a server and finding bugs, but it can also be quite messy, possibly even fatal if one is careless. ; ) However, I am not writing this to share any horror stories, but glorious tales of bug hunting!</p>
<p>The randgen uses yacc-style <a href="http://forge.mysql.com/wiki/RandomQueryGeneratorGrammar">grammar files</a> that define a realm of possible queries (provided you did it right&#8230;the zen of grammar writing is a topic for another day). Doing this allows us to produce high volumes of queries that are hopefully interesting (see previous comment about grammar-writing-zen).</p>
<p>It takes a certain amount of care to produce a grammar that is useful and interesting, but the gamble is that this effort will produce more interesting effects on the database than the hand-written queries that could be produced in similar time. This is especially useful when you aren&#8217;t quite sure where a problem is and are just trying to see what shakes out under a certain type of stress.  Another win is that a well-crafted grammar can be used for a variety of scenarios.  The transactional grammars that were originally written for testing Drizzle&#8217;s replication system have been reused many times (including for two of these bugs!)</p>
<p>This brings us to our first bug:<br />
<a href="https://bugs.launchpad.net/percona-server/+bug/758788"> mysql process crashes after setting innodb_dict_size</a></p>
<p>The basics of this were that the server was crashing under load when <a href="http://www.percona.com/doc/percona-server/5.5/management/innodb_dict_size_limit.html?id=percona-server:features:innodb_dict_size_limit&amp;redirect=1#">innodb_dict_size_limit</a> was set to a smaller value. In order to simulate the situation, <a href="http://www.flamingspork.com/blog/">Stewart</a> suggested we use a transactional load against a large number of tables. We were able to make this happen in 4 easy steps:<br />
1) Create a test case module that we can execute. All of the randgen test cases are structured similarly, so all we had to do was copy an existing test case and tweak our server options and randgen command line as needed.</p>
<p>2) Make an altered copy of the general, percona.zz gendata file. This file is used by the randgen to determine the number, composition, and population of any test tables we want to use and generate them for us. As the original reporter indicated they had a fair number of tables, we used:</p>
<pre>
$tables = {
rows =&gt; [1..50],
partitions =&gt; [ undef ]
};
</pre>
<p>The value in the &#8216;rows&#8217; section tells the data generator to produce 50 tables, with sizes from 1 row to 50 rows.</p>
<p>3) Specify the server options. We wanted the server to hit similar limits as the original bug reporter, but we were working on a smaller scale.<br />
To make this happen, we set the following options in the test case:</p>
<pre>
server_requirements = [["--innodb-dict-size-limit=200k --table-open-cache=10"]]
</pre>
<p>Granted, these are insanely small values, but this is a test and we&#8217;re trying to do horrible things to the server ; )</p>
<p>4) Set up our test_* method in our testcase class. This is all we need to specify in our test case:</p>
<pre>
def test_bug758788(self):
    # Run a basic transactional randgen load against the many small tables
    test_cmd = ("./gentest.pl "
                "--gendata=conf/percona/innodb_dict_size_limit.zz "
                "--grammar=conf/percona/translog_concurrent1.yy "
                "--queries=1000 "
                "--threads=1")
    retcode, output = execute_randgen(test_cmd, test_executor, servers)
    # A non-zero return code (e.g. the server crashed) fails the test
    self.assertEqual(retcode, 0, msg=output)
</pre>
<p>The test is simply to ensure that the server remains up and running under a basic transactional load.</p>
<p>From there, we only need to use the following command to execute the test:<br />
./dbqp.py --default-server-type=mysql --basedir=/path/to/Percona-Server --suite=randgen_basic innodbDictSizeLimit_test<br />
This enabled us to reproduce the crash within 5 seconds.</p>
<p>The reason I think this is interesting is that we were unable to duplicate this bug otherwise. The combination of the randgen&#8217;s power and dbqp&#8217;s organization helped us knock this out with about 15 minutes of tinkering.</p>
<p>Once we had a bead on this bug, we went on to try a couple of other bugs:</p>
<p><a href="https://bugs.launchpad.net/percona-server/+bug/856404"> Crash when query_cache_strip_comments enabled</a></p>
<p>For this one, we only modified the grammar file to include this as a possible WHERE clause for SELECT queries:</p>
<pre>
WHERE X . char_field_name != 'If you need to translate Views labels into other languages, consider installing the &lt;a href=\" !path\"&gt;Internationalization&lt;/a&gt; package\'s Views translation module.'
</pre>
<p>The test value was taken from the original bug report.<br />
Similar creation of a test case file + modifications resulted in another easily reproduced crash.<br />
I will admit that there may be other ways to go about hitting that particular bug, but we *were* practicing with new tools and playing with dynamite can be quite exhilarating ; )</p>
<p><a href="https://bugs.launchpad.net/percona-xtrabackup/+bug/826632"> parallel option breaks backups and restores</a></p>
<p>For this bug, we needed to ensure that the server used --innodb_file_per_table and that we used <a href="http://www.percona.com/software/percona-xtrabackup/">Xtrabackup</a>&#8217;s <a href="http://www.percona.com/doc/percona-xtrabackup/innobackupex/parallel_copy_ibk.html">--parallel</a> option. I also wanted to create multiple schemas, which we did via a little randgen / python magic:</p>
<pre>
# populate our server with a test bed
test_cmd = "./gentest.pl --gendata=conf/percona/bug826632.zz "
retcode, output = execute_randgen(test_cmd, test_executor, servers)
# create additional schemas for backup
schema_basename='test'
for i in range(6):
    schema = schema_basename+str(i)
    query = "CREATE SCHEMA %s" %(schema)
    retcode, result_set = execute_query(query, master_server)
    self.assertEqual(retcode, 0, msg=result_set)
    retcode, output = execute_randgen(test_cmd, test_executor, servers, schema)
</pre>
<p>This gave us 7 schemas, all with 100 tables per schema (with rows 1-100). From here we take a backup with --parallel=50 and then try to restore it. These are basically the same steps we use in our basic_test from the xtrabackup suite. We just copied and modified the test case to suit our needs for this bug. With this setup, we get a crash / failure during the prepare phase of the backup. Interestingly this only happens with this number of tables, schemas, and --parallel threads.</p>
<p>Not too shabby for about 30 minutes of hacking + explaining things, if I do say so myself. One of the biggest difficulties in fixing bugs comes from being able to recreate them reliably and easily. Between the randgen&#8217;s brutal ability to produce test data and queries and dbqp&#8217;s efficient test organization, we are now able to quickly produce complicated test scenarios and reproduce more bugs so our amazing dev team can fix them into oblivion : )</p>
]]></content:encoded>
			<wfw:commentRss>http://www.mysqlperformanceblog.com/2011/11/16/fishing-with-dynamite-brought-to-you-by-the-randgen-and-dbqp/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>dbqp and Xtrabackup testing</title>
		<link>http://www.mysqlperformanceblog.com/2011/11/16/dbqp-and-xtrabackup-testing/</link>
		<comments>http://www.mysqlperformanceblog.com/2011/11/16/dbqp-and-xtrabackup-testing/#comments</comments>
		<pubDate>Wed, 16 Nov 2011 11:36:03 +0000</pubDate>
		<dc:creator>patrick.crews</dc:creator>
				<category><![CDATA[Benchmarks]]></category>

		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/?p=7344</guid>
		<description><![CDATA[So I’m back from the Percona dev team’s recent meeting.  While there, we spent a fair bit of time discussing Xtrabackup development.  One of our challenges is that as we add richer features to the tool, we need equivalent testing capabilities.  However, it seems a constant in the MySQL world that available QA tools often [...]]]></description>
			<content:encoded><![CDATA[<div>
<p>So I’m back from the <a href="http://www.percona.com/">Percona</a> dev team’s recent meeting.  While there, we spent a fair bit of time discussing <a href="http://www.percona.com/software/percona-xtrabackup/">Xtrabackup</a> development.  One of our challenges is that as we add richer features to the tool, we need equivalent testing capabilities.  However, it seems a constant in the MySQL world that available QA tools often leave something to be desired.  The <a href="https://launchpad.net/randgen">randgen</a> is a literal wonder-tool for database testing, but it is also occasionally frustrating / doesn’t scratch every testing itch.  It is based on technology SQL Server was using in 1998 (MySQL began using it in ~2007, IIRC).  So this is no knock, it is merely meant to be an example of a poor QA engineer’s frustrations ; )  While the current <a href="http://www.percona.com/software/percona-xtrabackup/">Xtrabackup</a> test suite is commendable, it also has its limitations. Enter the flexible, adaptable, and expressive answer: <a href="http://www.wc220.com/?category_name=dbqp">dbqp</a>.</p>
<p>One of my demos at the dev meeting was showing how we can set up tests for Xtrabackup using the unittest paradigm.  While this sounds fancy, basically, we take advantage of <a href="http://docs.python.org/library/unittest.html">Python’s unittest</a> and write classes that use their code.  The biggest bit <a href="http://docs.drizzle.org/testing/dbqp.html">dbqp</a> does is search the specified server code (to make sure we have everything we should), allocate and manage servers as requested by the test cases, and do some reporting and management of the test cases.  As the tool matures, I will be striving to let more of the work be done by unittest code rather than things I have written : )</p>
<p>To return to my main point, we now have two basic tests of xtrabackup:</p>
<h4>Basic test of backup + restore:</h4>
<ol>
<li>Populate server</li>
<li>Take a validation snapshot (mysqldump)</li>
<li>Take the backup (via innobackupex)</li>
<li>Clean datadir</li>
<li>Restore from backup</li>
<li>Take restored state snapshot and compare to original state</li>
</ol>
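<p>The last step of the basic test, comparing the restored state with the original snapshot, could be sketched in plain Python 3 roughly as follows. This is only an illustration and not the actual dbqp test code: the <code>snapshot_diff</code> helper, the class name and the two in-line dump strings are hypothetical stand-ins for the mysqldump output taken before the backup and after the restore.</p>

```python
import difflib
import unittest


def snapshot_diff(original, restored):
    """Return unified-diff lines between two dump texts (empty if identical)."""
    return list(difflib.unified_diff(original.splitlines(),
                                     restored.splitlines(),
                                     "original", "restored", lineterm=""))


class SnapshotCompareExample(unittest.TestCase):
    def test_restored_state_matches(self):
        # Stand-ins for the dumps taken before the backup and after the restore
        original = "CREATE TABLE t1 (a INT);\nINSERT INTO t1 VALUES (1);\n"
        restored = "CREATE TABLE t1 (a INT);\nINSERT INTO t1 VALUES (1);\n"
        diff = snapshot_diff(original, restored)
        # On failure the diff itself becomes the assertion message
        self.assertEqual(diff, [], msg="\n".join(diff))
```

<p>Passing the diff as the assertion message means a failed comparison reports exactly which lines differ, rather than just that something went wrong.</p>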
<h4>Slave setup</h4>
<ol>
<li>Similar to our basic test except we create a slave from the backup, replicating from the backed up server.</li>
<li>After the initial setup, we ensure replication is set up ok, then we do additional work on the master and compare master and slave states</li>
</ol>
<p>One of the great things about this is that we have the <a href="http://docs.python.org/library/unittest.html#assert-methods">magic of assertions</a>.  We can insert them at any point of the test we feel like validating and the test will fail with useful output at that stage.  The backup didn’t take correctly?  No point going through any other steps — FAIL! : )  The assertion methods just make it easy to express what behavior we are looking for.  We want the innobackupex prepare call to run without error?<br />
<a href="http://www.youtube.com/watch?v=W45DRy7M1no">Boom goes the dynamite!</a>:</p>
<pre>
# prepare our backup
cmd = ("%s --apply-log --no-timestamp --use-memory=500M "
       "--ibbackup=%s %s" % (innobackupex, xtrabackup, backup_path))
retcode, output = execute_cmd(cmd, output_path, exec_path, True)
self.assertEqual(retcode, 0, msg=output)
</pre>
<p>From these basic tests, it will be easy to craft more complex test cases.  Creating the slave test was simply a matter of adapting the initial basic test case slightly.  Our plans include: *heavy* crash testing of both xtrabackup and the server, enhancing / expanding replication tests by creating heavy randgen loads against the master during backup and slave setup, and other assorted crimes against database software.  We will also be porting the existing test suite to use dbqp entirely…who knows, we may even start working on Windows one day ; )</p>
<p>These tests are by no means the be-all-end-all, but I think they do represent an interesting step forward.  We can now write actual, <a href="http://xkcd.com/353/">honest-to-goodness Python code</a> to test the server.  On top of that, we can make use of the included unittest module to give us all sorts of assertive goodness to express what we are looking for.  We will need to and plan to refine things as time moves forward, but at the moment, we are able to do some cool testing tricks that weren’t easily do-able before.</p>
<p>If you’d like to try these tests out, you will need the following:<br />
* <a href="https://launchpad.net/dbqp">dbqp</a> (bzr branch lp:dbqp)<br />
* <a href="http://search.cpan.org/%7Ecapttofu/DBD-mysql-4.018/lib/DBD/mysql.pm">DBD::mysql</a> installed (the tests use the randgen and this is required…hey, it is a WONDER-tool!) : )<br />
* <a href="http://www.percona.com/doc/percona-xtrabackup/innobackupex/innobackupex_script.html">Innobackupex</a>, a MySQL / Percona server and the appropriate xtrabackup binary.</p>
<p>The tests live in dbqp/percona_tests/xtrabackup_basic and are named basic_test.py and slave_test.py, respectively.</p>
<p>To run them:<br />
$./dbqp.py --suite=xtrabackup_basic --basedir=/path/to/mysql --xtrabackup-path=/mah/path --innobackupex-path=/mah/other/path --default-server-type=mysql --no-shm</p>
<p>Some next steps for dbqp include:<br />
1)  Improved docs<br />
2)  Merging into the Percona Server trees<br />
3)  Setting up test jobs in Jenkins (crashme / sqlbench / randgen)<br />
4)  Other assorted awesomeness</p>
<p>Naturally, this testing goodness will also find its way into <a href="http://www.drizzle.org/">Drizzle</a> (which currently has a <a href="http://blog.drizzle.org/2011/10/25/fremont-beta-2011-10-28-has-been-released/">7.1 beta out</a>).  We definitely need to see some Xtrabackup test cases for <a href="http://www.flamingspork.com/blog/2011/04/01/online-non-blocking-backup-for-drizzle-with-xtrabackup/">Drizzle’s version of the tool</a> (mwa ha ha!) &gt;: )</p>
</div>
]]></content:encoded>
			<wfw:commentRss>http://www.mysqlperformanceblog.com/2011/11/16/dbqp-and-xtrabackup-testing/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Side load may massively impact your MySQL Performance</title>
		<link>http://www.mysqlperformanceblog.com/2011/11/13/side-load-may-massively-impact-your-mysql-performance/</link>
		<comments>http://www.mysqlperformanceblog.com/2011/11/13/side-load-may-massively-impact-your-mysql-performance/#comments</comments>
		<pubDate>Sun, 13 Nov 2011 23:40:11 +0000</pubDate>
		<dc:creator>Peter Zaitsev</dc:creator>
				<category><![CDATA[Benchmarks]]></category>
		<category><![CDATA[Insight for DBAs]]></category>
		<category><![CDATA[MySQL]]></category>

		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/?p=7354</guid>
		<description><![CDATA[When we&#8217;re looking at benchmarks we typically run some stable workload and we run it in isolation &#8211; nothing else is happening on the system. This is not however how things happen in real world when we have significant variance in the load and many things can be happening concurrently. It is very typical to [...]]]></description>
			<content:encoded><![CDATA[<p>When we&#8217;re looking at benchmarks we typically run some stable workload and we run it in isolation &#8211; nothing else is happening on the system.  This is not, however, how things happen in the real world, where we have significant variance in the load and many things can be happening concurrently. </p>
<p>It is very typical to hear complaints about MySQL interactive performance &#8211; serving simple standard web traffic is drastically impacted when some heavy queries are run in the background or a backup is done with <strong>mysqldump</strong> &#8211; a lot more than you would expect from simple resource competition.   I finally found some time to look further into this problem and see what can be done to remedy it.<br />
<span id="more-7354"></span></p>
<p>We designed the benchmark the following way &#8211; there is a small table (200MB) which completely fits in the InnoDB Buffer Pool (512MB).  We also have a larger table (4GB) which does not fit in the buffer pool.  We&#8217;re running uniform sysbench OLTP on the small table and mysqldump on the second table.  First we run the tests individually and then concurrently.</p>
<p>In a perfect world, what we would like to see is performance staying about the same when we run the tests concurrently, because sysbench should run completely in memory and use a lot of CPU resources but no disk IO, while mysqldump should have relatively little CPU needs and be bound by disk.  Also these are just 2 &#8220;threads&#8221; running on a 4 core system so there should be plenty of CPU to spare.</p>
<p>We&#8217;re using Percona Server 5.5.15 for this test with buffer pool size of 512MB and innodb_flush_method=O_DIRECT </p>
<p>Test Setup:</p>
<blockquote><p>
[root@localhost msb_ps_5_5_15]# sysbench --test=oltp --db-driver=mysql --mysql-host=localhost --mysql-table-engine=innodb --mysql-db=test --oltp-table-name=md_cache_test_small --oltp-table-size=1100000 --mysql-user=msandbox --mysql-password=msandbox --mysql-socket=/tmp/mysql_sandbox5516.sock prepare</p>
<p>[root@localhost msb_ps_5_5_15]# sysbench --test=oltp --db-driver=mysql --mysql-host=localhost --mysql-table-engine=innodb --mysql-db=test --oltp-table-name=md_cache_test_big --oltp-table-size=17600000 --mysql-user=msandbox --mysql-password=msandbox --mysql-socket=/tmp/mysql_sandbox5516.sock prepare
</p></blockquote>
<p>Running sysbench and mysqldump. Note that we run them in a loop to see how the result stabilizes.</p>
<blockquote><p>
[root@localhost msb_ps_5_5_15]# sysbench --test=oltp --db-driver=mysql --num-threads=1 --max-requests=0 --oltp-dist-type=uniform --max-time=180 --oltp-read-only --mysql-host=localhost --mysql-table-engine=innodb --mysql-db=test --oltp-table-name=md_cache_test_small --oltp-table-size=1100000 --mysql-user=msandbox --mysql-password=msandbox --mysql-socket=/tmp/mysql_sandbox5516.sock run</p>
<p>[root@localhost msb_ps_5_5_15]# time mysqldump --defaults-file=my.sandbox.cnf test md_cache_test_big > /dev/null
</p></blockquote>
<p>Baseline Run:<br />
When we run the tests individually, sysbench gives about <strong>330 req/sec</strong> and mysqldump for the large table completes in about <strong>95 seconds</strong>.<br />
If we run them concurrently, after the system reaches steady state we get about <strong>2 req/sec</strong> and mysqldump takes about <strong>180 seconds</strong>.</p>
<p>Yes, you read that right.  Performance of sysbench OLTP on the small table drops <strong>more than 150 times</strong> when a heavy mysqldump is running concurrently.  mysqldump itself also slows down<br />
about 2x.</p>
<p>What is going on here?  To understand it we should take a look at the buffer pool contents. </p>
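<p>As a quick sanity check of those two factors, the arithmetic can be written out in a few lines of Python. The numbers are the rounded figures quoted above; the variable names are made up for this sketch.</p>

```python
# Figures quoted above (rounded, as reported in the text)
sysbench_alone = 330.0       # req/sec, sysbench running on its own
sysbench_concurrent = 2.0    # req/sec, with mysqldump running
dump_alone = 95.0            # seconds, mysqldump on its own
dump_concurrent = 180.0      # seconds, with sysbench running

sysbench_slowdown = sysbench_alone / sysbench_concurrent
dump_slowdown = dump_concurrent / dump_alone

print("sysbench slowdown: %.0fx" % sysbench_slowdown)
print("mysqldump slowdown: %.1fx" % dump_slowdown)
```

<p>With these inputs sysbench slows down by a factor of 165 and mysqldump by about 1.9.</p>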
<blockquote><p>
mysql [localhost] {msandbox} (information_schema) >  select concat_ws('.', t.schema, t.name, i.name) as index_name, sum(data_size)/1024/1024 as data_size_mb from innodb_sys_tables as t inner join innodb_sys_indexes as i using(table_id) inner join innodb_buffer_pool_pages_index as p using(index_id) where t.schema='test' group by i.index_id \G</p>
<p>INDEX_NAME				DATA_SIZE_MB<br />
test.md_cache_test_small.PRIMARY	216.31397057<br />
test.md_cache_test_small.k	2.66948509<br />
test.md_cache_test_big.PRIMARY	250.76164627</p>
<p>&#8230;..</p>
<p>INDEX_NAME				DATA_SIZE_MB<br />
test.md_cache_test_small.PRIMARY	12.10487175<br />
test.md_cache_test_big.PRIMARY	457.70432472
</p></blockquote>
<p>When we&#8217;re running sysbench OLTP on its own, the primary key of the table fits completely in the buffer pool.  However, when mysqldump is run concurrently it reads so many pages from disk that it pushes most of the smaller table out of the buffer pool, with only 12MB remaining. This makes the workload extremely IO bound, hence the drop in performance. </p>
<p>The performance of mysqldump is impacted too because we now have 2 threads competing for what is a single hard drive on this test system. </p>
<p>It is worth noting that MySQL actually uses midpoint insertion for its <a href="http://dev.mysql.com/doc/refman/5.5/en/innodb-buffer-pool.html">buffer pool replacement policy</a>. Unfortunately, by default it is configured in a way that makes it quite useless.  The blocks are indeed first placed at the head of the &#8220;old&#8221; sublist, which means they should not push out any hot data in the &#8220;young&#8221; sublist. However, when you&#8217;re doing mysqldump (or running some complex batch job query) you are likely going to have multiple accesses to the data on the same page before being done with it for good. Because there are several accesses, the page gets moved to the young sublist almost immediately, placing high pressure on the buffer pool. </p>
<p>There is an ingenious feature to deal with this problem, though; you just have to enable it separately.  The variable <strong>innodb_old_blocks_time</strong> specifies the number of milliseconds which need to pass after the first access before a page can be moved to the young sublist.  In typical cases like mysqldump, all accesses to the majority of pages will be concentrated within a very small period of time, so setting innodb_old_blocks_time to some non-zero value will prevent important data from being pushed out of the buffer pool.</p>
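<p>To make the mechanism more concrete, here is a toy Python 3 model of midpoint insertion with a promotion delay. It is a deliberately simplified sketch and not InnoDB&#8217;s actual implementation: the class and function names, the 100-page pool, the 60-page hot set and the ratio of four scanned pages per OLTP access are all assumptions picked for the illustration. Only the 37% old sublist share mirrors the default of innodb_old_blocks_pct.</p>

```python
from collections import OrderedDict


class ToyBufferPool(object):
    """Toy midpoint-insertion LRU. A simplification, not InnoDB's real code."""

    def __init__(self, capacity, old_pct, old_blocks_time_ms):
        self.capacity = capacity
        self.young_max = capacity - capacity * old_pct // 100
        self.old_blocks_time_ms = old_blocks_time_ms
        # Both sublists map page -> time of first access; the last item is the head.
        self.young = OrderedDict()
        self.old = OrderedDict()

    def access(self, page, now_ms):
        """Touch a page; return True on a buffer pool hit, False on a miss."""
        if page in self.young:
            self.young.move_to_end(page)
            return True
        if page in self.old:
            first_access = self.old[page]
            # Promote only if the page has aged enough since its first access.
            if now_ms - first_access >= self.old_blocks_time_ms:
                del self.old[page]
                self.young[page] = first_access
                self._shrink()
            return True
        self.old[page] = now_ms  # a page read from disk enters the old sublist
        self._shrink()
        return False

    def _shrink(self):
        # Keep the young sublist within its share by demoting its tail.
        while len(self.young) > self.young_max:
            page, first_access = self.young.popitem(last=False)
            self.old[page] = first_access
        # Evict from the tail of the old sublist when the pool is over capacity.
        while len(self.young) + len(self.old) > self.capacity:
            self.old.popitem(last=False)


def hot_hit_ratio(old_blocks_time_ms, hot_pages=60, steps=2000):
    """Hit ratio of the hot set while a big scan runs alongside it."""
    pool = ToyBufferPool(capacity=100, old_pct=37,
                         old_blocks_time_ms=old_blocks_time_ms)
    # Warm-up: touch the hot set twice, 2 seconds apart, so it gets promoted.
    for t in (0, 2000):
        for page in range(hot_pages):
            pool.access(("hot", page), t)
    now = 4000
    hits = 0
    for n in range(steps):
        # The dump touches each big-table page twice in quick succession...
        for j in range(4):
            pool.access(("scan", 4 * n + j), now)
            pool.access(("scan", 4 * n + j), now + 1)
        # ...while the OLTP workload keeps cycling over the small hot set.
        hits += pool.access(("hot", n % hot_pages), now + 2)
        now += 10
    return hits / float(steps)


for setting in (0, 1000):
    print("old_blocks_time=%d ms: hot hit ratio %.2f"
          % (setting, hot_hit_ratio(setting)))
```

<p>In this toy run every hot access is a hit with a delay of 1000 ms, because scanned pages never age enough to leave the old sublist. With a delay of 0 each scanned page is promoted on its second touch and the hot set is almost completely evicted.</p>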
<p>Let&#8217;s repeat the benchmark with <strong>innodb_old_blocks_time=1000</strong>, which corresponds to 1 sec.</p>
<p>Run separately, sysbench gives about <strong>330 req/sec</strong> and mysqldump takes <strong>about 95 seconds</strong>, which is the same as before.  Note that we ran the test on a virtualized system in this case, so we would not be able to measure small variances in performance reliably.</p>
<p>Running sysbench and mysqldump concurrently gives about <strong>325 req/sec</strong> for sysbench and some <strong>100 seconds for mysqldump</strong>, which is a dramatic improvement of over<br />
150x for sysbench, with results now in line with what you would expect.</p>
<p>Let&#8217;s see what is going on with the buffer pool contents:</p>
<blockquote><p>
INDEX_NAME				DATA_SIZE_MB<br />
test.md_cache_test_small.PRIMARY	216.35031509<br />
test.md_cache_test_small.k	0.13414192<br />
test.md_cache_test_big.PRIMARY	253.21095276<br />
test.md_cache_test_big.k	0.01491451</p>
<p>&#8230;..<br />
INDEX_NAME				DATA_SIZE_MB<br />
test.md_cache_test_small.PRIMARY	216.35031509<br />
test.md_cache_test_big.PRIMARY	253.19661140<br />
test.md_cache_test_big.k	0.01491451
</p></blockquote>
<p>As you can see, the small table&#8217;s PRIMARY KEY (which is what the benchmark uses) is now not pushed out of the buffer pool at all. </p>
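<p>The difference between the two runs can also be expressed as the share of the small table&#8217;s primary key that stays cached while mysqldump runs. The short Python sketch below uses the DATA_SIZE_MB values from the two listings in this post; the <code>runs</code> dictionary and the <code>retained_pct</code> helper are hypothetical names used only here.</p>

```python
# DATA_SIZE_MB of test.md_cache_test_small.PRIMARY, copied from the listings:
# setting -> (sysbench alone, sysbench with mysqldump running)
runs = {
    "innodb_old_blocks_time=0 (default)": (216.31397057, 12.10487175),
    "innodb_old_blocks_time=1000": (216.35031509, 216.35031509),
}


def retained_pct(before_mb, during_mb):
    """Share of the index still cached while mysqldump runs, in percent."""
    return 100.0 * during_mb / before_mb


for name in sorted(runs):
    print("%s: %.1f%% retained" % (name, retained_pct(*runs[name])))
```

<p>With the default setting only about 5.6% of the index survives the dump, while with the 1000 ms delay it stays fully cached.</p>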
<p>For advanced tuning you might also look into changing how the buffer pool is split into young and old sublists via the <strong>innodb_old_blocks_pct</strong> variable, though we did not need to do it in this case.</p>
<p>I&#8217;m not sure if there are any bad side effects from setting innodb_old_blocks_time to a non-zero value; if not, I would strongly suggest changing the default from zero in MySQL 5.6, as it would offer a much better &#8220;out of the box&#8221; user experience.</p>
<p><strong>Summary</strong><br />
As we can see, in the default configuration MySQL has a buffer pool that can easily be washed away by large table scans or heavy batch jobs.  If this happens, a workload which is normally in memory becomes disk IO bound, which can slow it down more than 100 times. The solution is rather easy, though: setting innodb_old_blocks_time to 1000 or another meaningful number is a simple remedy for this problem. </p>
<p>I want to thank <a href="http://www.percona.com/about-us/our-team/ovais-tariq/">Ovais Tariq</a> for doing a lot of heavy lifting running benchmarks for this post. </p>
]]></content:encoded>
			<wfw:commentRss>http://www.mysqlperformanceblog.com/2011/11/13/side-load-may-massively-impact-your-mysql-performance/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>Review of Virident FlashMAX MLC cards</title>
		<link>http://www.mysqlperformanceblog.com/2011/11/10/review-of-virident-flashmax-mlc/</link>
		<comments>http://www.mysqlperformanceblog.com/2011/11/10/review-of-virident-flashmax-mlc/#comments</comments>
		<pubDate>Thu, 10 Nov 2011 10:58:24 +0000</pubDate>
		<dc:creator>Vadim Tkachenko</dc:creator>
				<category><![CDATA[Benchmarks]]></category>
		<category><![CDATA[Hardware and Storage]]></category>
		<category><![CDATA[MySQL]]></category>

		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/?p=7296</guid>
		<description><![CDATA[I have been following Virident for a long time (e.g. http://www.mysqlperformanceblog.com/2010/06/15/virident-tachion-new-player-on-flash-pci-e-cards-market/). They have great PCIe Flash cards based on SLC NAND. I always thought that Virident needed to come up with an MLC card, and I am happy to see they have finally done so. At Virident&#8217;s request, I performed an evaluation of their MLC [...]]]></description>
			<content:encoded><![CDATA[<p>I have been following <a href="http://www.virident.com/">Virident</a> for a long time (e.g. <a href="http://www.mysqlperformanceblog.com/2010/06/15/virident-tachion-new-player-on-flash-pci-e-cards-market/">http://www.mysqlperformanceblog.com/2010/06/15/virident-tachion-new-player-on-flash-pci-e-cards-market/</a>). They have great PCIe Flash cards based on SLC NAND.<br />
I always thought that Virident needed to come up with an MLC card, and I am happy to see they have finally done so.</p>
<p>At Virident&#8217;s request, I performed an evaluation of their MLC card to assess how it handles MySQL workload. As I am very satisfied with the results, I wish to share my findings in this post.<br />
<span id="more-7296"></span></p>
<p>But first, I wish to offer an overview of the card.</p>
<p><a href="http://www.mysqlperformanceblog.com/wp-content/uploads/2011/11/FlashMAX-Product-Photo.jpg"><img class="aligncenter size-medium wp-image-7314" title="FlashMAX Product Photo" src="http://www.mysqlperformanceblog.com/wp-content/uploads/2011/11/FlashMAX-Product-Photo-260x300.jpg" alt="" width="260" height="300" /></a></p>
<p>Virident FlashMAX cards are available in 1TB and 1.4TB usable capacities (the model names are M1000 and M1400).<br />
Both sizes are already available to end users.<br />
I evaluated the M1400 (1.4TB) model, which I discuss below.</p>
<p>Because Virident has competition in the SSD market, they have stated their goals to distinguish themselves from their competitors:</p>
<ul>
<li>Stability of performance: that is, minimizing variations in throughput.</li>
<li>Better response times: This is very important for database performance and I appreciate that Virident has made this a priority.</li>
<li>Performance at full capacity: As we know, SSD-based cards have special characteristics; the throughput declines when space utilization increases. Virident’s design/programming minimizes this decline.</li>
<li>RAID5 on the card: The card comes with RAID5 support on the card to give better protection.</li>
</ul>
<p>To deal with throughput decline, all Flash cards have reserved space. The 1.4TB card that I have internally holds 2TB worth of space.</p>
<p>This additional space is used for two purposes:</p>
<ol>
<li>To amortize write-intensive workloads, by using additional space.</li>
<li>To have replacements for failed MLC modules. When one MLC module fails, it is marked as unused, and gets replaced by one from the pool of reserved modules.</li>
</ol>
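<p>The reserved share follows directly from the two figures above. A quick back-of-the-envelope check, treating the 2TB and 1.4TB numbers as exact:</p>
<pre class="example">-- Raw capacity 2.0 TB, usable capacity 1.4 TB
SELECT 2.0 - 1.4               AS reserved_tb,
       (2.0 - 1.4) / 2.0 * 100 AS reserved_pct_of_raw;</pre>
<p>That is 0.6TB, or 30% of the raw flash, held back as reserve.</p>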
<p>Internally, Virident uses 25nm Intel NAND Flash MLC modules, the same technology that Intel uses for the Intel SSD 320 drives. 25nm modules allow greater capacity: physically, you can place more GBs into a given area. The drawback is that 25nm has worse read and write latencies compared to previous generations. I have yet to determine how this affects MySQL workloads.</p>
<p>Virident has provided the following price list:</p>
<ul>
<li>M1000 (1000GB Usable) &#8211; $13,000</li>
<li>M1400 (1400GB Usable) &#8211; $18,200</li>
<li>This amounts to <strong>$13/GB</strong></li>
</ul>
<p>Second, it is important to compare the performance of Virident FlashMAX MLC with available competing solutions.<br />
It is fair to say Fusion-io ioDrive Duo 1.28TB MLC is the most well-known and most advanced competitor in the market.<br />
I had a chance to run a head-to-head comparison of sysbench and tpcc-mysql workloads between the FlashMAX 1.4TB and the ioDrive Duo 1.28TB.</p>
<p>It is important to highlight that Fusion-io ioDrive Duo is based on 34nm NAND technology, which is a full generation behind the 25nm NAND. However at this point, I have no access to Fusion-io ioDrive2, which is based on 25nm NAND.<br />
Another important factor is that ioDrive Duo is actually two cards visible in the OS, and the user needs to use a software RAID. For Virident all 1400GB shows up as one single drive so no software RAID is necessary.</p>
<p>To compare performance, I ran sysbench oltp and tpcc-mysql benchmarks. I present the results for sysbench oltp below (the full report is linked at the end of this post), and the results for tpcc-mysql will follow in a separate post.</p>
<p>For sysbench, I used our multi-tables sysbench implementation with 256 tables and 10,000,000 rows each. This is a total of around 630GB of data, which allows one to adequately fill both cards in comparison.</p>
<p>The hardware and software used in the benchmarks:</p>
<ul>
<li>Server: Cisco UCS C250, running Oracle Linux 6.1 and Percona Server 5.5.15</li>
<li>Client: HP ProLiant DL380 G6, sysbench v5</li>
</ul>
<p>Of course, our Percona Server was optimized for Flash cards, with variations for two settings.<br />
I tested combinations of <strong>innodb_buffer_pool_size</strong>=120GB, 174GB and <strong>innodb_flush_log_at_trx_commit</strong>=1, 2.</p>
<p>The results in this post are for the case <strong>innodb_buffer_pool_size</strong>=174GB and <strong>innodb_flush_log_at_trx_commit</strong>=1.</p>
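<p>For reference, a small sketch of how these two settings could be inspected on a running server. In MySQL 5.5 <code>innodb_buffer_pool_size</code> can only be set at startup, while <code>innodb_flush_log_at_trx_commit</code> can be changed at runtime:</p>
<pre class="example">-- Buffer pool size in GB (set in the configuration file, not at runtime)
SELECT @@global.innodb_buffer_pool_size / 1024 / 1024 / 1024 AS buffer_pool_gb;

-- 1 = flush the log to disk at every commit (the case reported here);
-- 2 = write at commit, flush to disk about once per second
SET GLOBAL innodb_flush_log_at_trx_commit = 1;
SELECT @@global.innodb_flush_log_at_trx_commit;</pre>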
<p>As in all my recent benchmarks, I use long runs of 1 hour each with measurements every 10 seconds. This methodology allows me to observe trends and the stability of the performance on graphs.</p>
<p>The first graph shows throughput in transactions per second for different numbers of user threads (more is better). More concentrated dots represent less variance and better stability of throughput.<br />
<a href="http://www.mysqlperformanceblog.com/wp-content/uploads/2011/11/sysbench-thrp-174GB-trx1.png"><img class="aligncenter size-full wp-image-7310" title="sysbench-thrp-174GB-trx1" src="http://www.mysqlperformanceblog.com/wp-content/uploads/2011/11/sysbench-thrp-174GB-trx1.png" alt="" width="600" height="400" /></a></p>
<p>The same results in tabular format. For throughput, I use the median of the measurements over the last 1800 seconds of each run:</p>
<table border="1">
<tbody>
<tr>
<th></th>
<th>Card / Threads / tps</th>
<th>1</th>
<th>2</th>
<th>4</th>
<th>8</th>
<th>16</th>
<th>32</th>
<th>64</th>
<th>128</th>
<th>256</th>
<th>512</th>
<th>1024</th>
</tr>
<tr>
<td align="right">1</td>
<td>Fusion-io ioDrive Duo</td>
<td align="right">83.00</td>
<td align="right">177.00</td>
<td align="right">322.00</td>
<td align="right">523.00</td>
<td align="right">644.00</td>
<td align="right">740.00</td>
<td align="right">801.00</td>
<td align="right">798.00</td>
<td align="right">761.00</td>
<td align="right">784.00</td>
<td align="right">162.00</td>
</tr>
<tr>
<td align="right">2</td>
<td>Virident FlashMAX</td>
<td align="right">96.00</td>
<td align="right">179.00</td>
<td align="right">357.00</td>
<td align="right">607.00</td>
<td align="right">821.00</td>
<td align="right">975.00</td>
<td align="right">1083.00</td>
<td align="right">1156.00</td>
<td align="right">1064.00</td>
<td align="right">1091.00</td>
<td align="right">465.00</td>
</tr>
</tbody>
</table>
<p>To examine in detail how throughput varies, we took the 32-thread runs and plotted a timeline graph for each card:<br />
<a href="http://www.mysqlperformanceblog.com/wp-content/uploads/2011/11/sysbench-stab-174-trx1.png"><img class="aligncenter size-full wp-image-7311" title="sysbench-stab-174-trx1" src="http://www.mysqlperformanceblog.com/wp-content/uploads/2011/11/sysbench-stab-174-trx1.png" alt="" width="600" height="400" /></a></p>
<p>As you can see, with Virident FlashMAX we have a pretty stable line at around <strong>975 tps</strong>, while the Fusion-io ioDrive Duo varies in the range of <strong>700-800</strong> tps.</p>
<p>My conclusions are as follows:</p>
<ul>
<li>It is great to see another player in the MLC Flash card market.</li>
<li>It is also great that Virident focuses on stability of performance as a competitive advantage.</li>
<li>Besides stability, we also see better throughput in MySQL with the Virident FlashMAX card at every thread count. At 32-64 threads the advantage is about 32-35% (975 vs. 740 tps and 1083 vs. 801 tps).</li>
</ul>
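<p>The relative advantage can be recomputed from the medians in the table above. A quick sketch for 32, 64 and 128 threads:</p>
<pre class="example">-- (FlashMAX tps / ioDrive Duo tps - 1) * 100
SELECT ROUND((975  / 740 - 1) * 100) AS pct_32_threads,
       ROUND((1083 / 801 - 1) * 100) AS pct_64_threads,
       ROUND((1156 / 798 - 1) * 100) AS pct_128_threads;</pre>
<p>This works out to roughly 32%, 35% and 45% respectively.</p>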
<p><strong>DISCLOSURE</strong>: This review was done as part of our consulting practice, for which we were compensated by Virident. However, this review was written independently of Virident, and reflects our opinion of this product.</p>
<p>The full report is available <a href="http://www.percona.com/redir/files/white-papers/virident-mlc-sysbench.pdf">here</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.mysqlperformanceblog.com/2011/11/10/review-of-virident-flashmax-mlc/feed/</wfw:commentRss>
		<slash:comments>8</slash:comments>
		</item>
		<item>
		<title>Improved InnoDB fast index creation</title>
		<link>http://www.mysqlperformanceblog.com/2011/11/06/improved-innodb-fast-index-creation/</link>
		<comments>http://www.mysqlperformanceblog.com/2011/11/06/improved-innodb-fast-index-creation/#comments</comments>
		<pubDate>Mon, 07 Nov 2011 06:42:00 +0000</pubDate>
		<dc:creator>Alexey Kopytov</dc:creator>
				<category><![CDATA[Benchmarks]]></category>
		<category><![CDATA[Insight for DBAs]]></category>
		<category><![CDATA[MySQL]]></category>

		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/?p=7255</guid>
		<description><![CDATA[One of the serious limitations in the fast index creation feature introduced in the InnoDB plugin is that it only works when indexes are explicitly created using ALTER TABLE or CREATE INDEX. Peter has already blogged about it before, here I&#8217;ll just briefly reiterate other cases that might benefit from that feature: when ALTER TABLE [...]]]></description>
			<content:encoded><![CDATA[<p>One of the serious limitations in the <a href="http://dev.mysql.com/doc/refman/5.5/en/innodb-create-index.html">fast index creation</a> feature introduced in the InnoDB plugin is that it only works when indexes are explicitly created using <code>ALTER TABLE</code> or <code>CREATE INDEX</code>. Peter has already <a href="http://www.mysqlperformanceblog.com/2010/12/09/thinking-about-running-optimize-on-your-innodb-table-stop/">blogged</a> about it before, here I&#8217;ll just briefly reiterate other cases that might benefit from that feature:</p>
<ul>
<li>when <code>ALTER TABLE</code> does require copying the data into a temporary table, secondary indexes are updated by inserting individual rows rather than sorting;</li>
<li><code>OPTIMIZE TABLE</code> could be faster if secondary indexes were temporarily dropped and then recreated using fast index creation;</li>
<li>dumps produced by <em>mysqldump</em> first create tables with all secondary indexes and then load the data, which is also inefficient.</li>
</ul>
<p>Percona Server as of versions 5.1.56 and 5.5.11 allows utilizing fast index creation for all of the above cases, which can potentially speed them up greatly. This feature is controlled by the <code>expand_fast_index_creation</code> system variable which is OFF by default.</p>
<p>Let&#8217;s look at each of the above cases in more detail.</p>
<p>&nbsp;</p>
<div id="outline-container-1" class="outline-2">
<h2 id="sec-1"><code>ALTER TABLE</code></h2>
<div id="text-1" class="outline-text-2">
<p>By temporarily dropping secondary indexes from the new table before copying the data, and then recreating them later, <code>ALTER TABLE</code> can take advantage of the fast index creation feature even when it has to copy the entire table.</p>
<p>To illustrate this, I have performed a number of simple benchmarks. Let’s start with a table containing 4 million rows and one secondary key:</p>
<pre class="example">mysql&gt; CREATE TABLE t(id INT AUTO_INCREMENT PRIMARY KEY, c FLOAT) ENGINE=InnoDB;
Query OK, 0 rows affected (0.00 sec)

mysql&gt; INSERT INTO t(c) VALUES (RAND());
Query OK, 1 row affected (0.00 sec)

mysql&gt; INSERT INTO t(c) SELECT RAND() FROM t;
Query OK, 1 row affected (0.00 sec)
Records: 1  Duplicates: 0  Warnings: 0

. . .

mysql&gt; INSERT INTO t(c) SELECT RAND() FROM t;
Query OK, 2097152 rows affected (10.11 sec)
Records: 2097152  Duplicates: 0  Warnings: 0

mysql&gt; ALTER TABLE t ADD KEY (c);
Query OK, 0 rows affected (18.56 sec)
Records: 0  Duplicates: 0  Warnings: 0</pre>
<p>Let’s trigger a table rebuild by adding a new column and see what execution time is like when the default method is used:</p>
<pre class="example">mysql&gt; SET profiling=1;
Query OK, 0 rows affected (0.00 sec)

mysql&gt; ALTER TABLE t ADD COLUMN v VARCHAR(1);
Query OK, 4194304 rows affected (1 min 1.97 sec)
Records: 4194304  Duplicates: 0  Warnings: 0

mysql&gt; SHOW PROFILE;
+------------------------------+-----------+
| Status                       | Duration  |
+------------------------------+-----------+
| starting                     |  0.000054 |
| checking permissions         |  0.000004 |
| checking permissions         |  0.000004 |
| init                         |  0.000008 |
| Opening tables               |  0.000118 |
| System lock                  |  0.000007 |
| setup                        |  0.000027 |
| creating table               |  0.002255 |
| After create                 |  0.000050 |
| copy to tmp table            | 61.816063 |
| rename result table          |  0.161528 |
| end                          |  0.000007 |
| Waiting for query cache lock |  0.000002 |
| end                          |  0.000007 |
| query end                    |  0.000003 |
| closing tables               |  0.000008 |
| freeing items                |  0.000021 |
| cleaning up                  |  0.000003 |
+------------------------------+-----------+
18 rows in set (0.00 sec)</pre>
<p>Now let’s see how performance is affected when turning <code>expand_fast_index_creation</code> on. Here and in later examples I’m extending the <code>VARCHAR</code> column to trigger table rebuilds without affecting the table size.</p>
<pre class="example">mysql&gt; SET expand_fast_index_creation=ON;
Query OK, 0 rows affected (0.00 sec)

mysql&gt; ALTER TABLE t MODIFY v VARCHAR(2);
Query OK, 4194304 rows affected (36.07 sec)
Records: 4194304  Duplicates: 0  Warnings: 0

mysql&gt; SHOW PROFILE;
+------------------------------+-----------+
| Status                       | Duration  |
+------------------------------+-----------+
| starting                     |  0.000054 |
| checking permissions         |  0.000004 |
| checking permissions         |  0.000005 |
| init                         |  0.000010 |
| Opening tables               |  0.000027 |
| System lock                  |  0.000008 |
| setup                        |  0.000040 |
| creating table               |  0.002176 |
| After create                 |  0.000058 |
| copy to tmp table            | 18.083490 |
| restoring secondary keys     | 17.824109 |
| rename result table          |  0.162041 |
| end                          |  0.000008 |
| Waiting for query cache lock |  0.000002 |
| end                          |  0.000007 |
| query end                    |  0.000003 |
| closing tables               |  0.000008 |
| freeing items                |  0.000019 |
| cleaning up                  |  0.000003 |
+------------------------------+-----------+
19 rows in set (0.00 sec)</pre>
<p>As seen from the <code>SHOW PROFILE</code> output, copying the data to a temporary table without updating indexes took 18 seconds, and about the same time was spent on rebuilding the index using fast index creation. So we have 36 seconds in total which is about 1.7 times faster than updating indexes by insertion.</p>
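<p>When only a couple of stages matter, the same numbers can be pulled out of <code>INFORMATION_SCHEMA.PROFILING</code> instead of reading the full <code>SHOW PROFILE</code> listing. A minimal sketch; the query ID 2 is a placeholder that has to be looked up with <code>SHOW PROFILES</code> first:</p>
<pre class="example">-- List profiled statements together with their query IDs
SHOW PROFILES;

-- Show only the two stages that dominate the ALTER TABLE
SELECT STATE, DURATION
  FROM INFORMATION_SCHEMA.PROFILING
 WHERE QUERY_ID = 2   -- placeholder, take the real ID from SHOW PROFILES
   AND STATE IN ('copy to tmp table', 'restoring secondary keys')
 ORDER BY SEQ;</pre>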
<p>Let’s see if having more secondary indexes in the table makes any difference:</p>
<pre class="example">mysql&gt; SET expand_fast_index_creation=OFF;
Query OK, 0 rows affected (0.00 sec)

mysql&gt; ALTER TABLE t ADD KEY (c), ADD KEY(c);
Query OK, 0 rows affected (36.42 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql&gt; ALTER TABLE t MODIFY v VARCHAR(3);
Query OK, 4194304 rows affected (3 min 4.87 sec)
Records: 4194304  Duplicates: 0  Warnings: 0

mysql&gt; SHOW PROFILE;
+------------------------------+------------+
| Status                       | Duration   |
+------------------------------+------------+
. . .
| copy to tmp table            | 184.694432 |
. . .
+------------------------------+------------+
18 rows in set (0.00 sec)

mysql&gt; SET expand_fast_index_creation=ON;
Query OK, 0 rows affected (0.00 sec)

mysql&gt; ALTER TABLE t MODIFY v VARCHAR(4);
Query OK, 4194304 rows affected (1 min 11.12 sec)
Records: 4194304  Duplicates: 0  Warnings: 0

mysql&gt; SHOW PROFILE;
+------------------------------+-----------+
| Status                       | Duration  |
+------------------------------+-----------+
. . .
| copy to tmp table            | 18.396514 |
| restoring secondary keys     | 52.567644 |
. . .
+------------------------------+-----------+
19 rows in set (0.00 sec)</pre>
<p>So with 3 secondary indexes <code>expand_fast_index_creation</code> gave us a 2.6x speedup.</p>
<p>Also note that unlike the default method, where the execution time is proportional to the number of indexes, with fast index creation the time required to copy the data to a temporary table is constant. The reason is that when using merge sort, InnoDB has to scan the clustered index only once, even though the actual sorting is done separately for each index.</p>
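<p>Conceptually, the optimization behaves much like doing the following by hand. This is only an illustration of the idea, not what the server executes internally, and it assumes the three indexes got MySQL&#8217;s automatically generated names <code>c</code>, <code>c_2</code> and <code>c_3</code>:</p>
<pre class="example">-- 1. Drop the secondary indexes so the copy only maintains the clustered index
ALTER TABLE t DROP KEY c, DROP KEY c_2, DROP KEY c_3;

-- 2. Rebuild the table (the column change forces a full copy)
ALTER TABLE t MODIFY v VARCHAR(4);

-- 3. Recreate all secondary indexes in one statement using fast index creation
ALTER TABLE t ADD KEY c (c), ADD KEY c_2 (c), ADD KEY c_3 (c);</pre>
<p>Adding the indexes in a single statement matters for the reason given above: the clustered index is then scanned once for all of them.</p>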
<p>The above has a couple of important implications:</p>
<ul>
<li>when the data does not fit in the buffer pool, fast index creation provides even better performance as compared to the default method, because it does not have to do random disk seeks to fetch secondary index pages to the buffer pool. A benchmark is worth a thousand words, so let’s repeat the last test with <code>innodb_buffer_pool_size</code> set to approximately 1/10th of the dataset:
<pre class="example">mysql&gt; SET expand_fast_index_creation=OFF;
Query OK, 0 rows affected (0.00 sec)

mysql&gt; ALTER TABLE t MODIFY v VARCHAR(7);
Query OK, 4194304 rows affected (9 min 15.08 sec)
Records: 4194304  Duplicates: 0  Warnings: 0

mysql&gt; SET expand_fast_index_creation=ON;
Query OK, 0 rows affected (0.00 sec)

mysql&gt; ALTER TABLE t MODIFY v VARCHAR(8);
Query OK, 4194304 rows affected (1 min 13.69 sec)
Records: 4194304  Duplicates: 0  Warnings: 0

mysql&gt; SHOW PROFILE;
+------------------------------+-----------+
| Status                       | Duration  |
+------------------------------+-----------+
. . .
| copy to tmp table            | 19.805849 |
| restoring secondary keys     | 53.885502 |
. . .
+------------------------------+-----------+
19 rows in set (0.00 sec)</pre>
<p>So, as expected, a small buffer pool had a huge impact on <code>ALTER TABLE</code> with the optimization disabled, and practically no effect on the optimized case, which resulted in a roughly 7.5x speedup (9 min 15 sec versus 1 min 14 sec).</p></li>
<li>having tmpdir on fast storage is essential for <code>expand_fast_index_creation</code>, because temporary files for merge-sorting are created in tmpdir. The constant “copy to tmp table” part will not be affected by a slow tmpdir, but rebuilding the indexes will obviously take longer.</li>
</ul>
<p>Another important thing that is worth mentioning is fragmentation. Fast index creation results in much less fragmented indexes because records are inserted in the correct order into sequentially allocated pages after merge-sorting. So besides optimizing DDL directly, <code>expand_fast_index_creation</code> may also optimize index access for subsequent DML statements. In my test setup I got about 178 MB index size after fast index creation as reported by <code>SHOW TABLE STATUS</code> versus 265 MB index size with the optimization disabled.</p>
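<p>The index size can also be read from <code>INFORMATION_SCHEMA.TABLES</code>, which is handy for comparing before and after a rebuild. A small sketch, assuming the table lives in the <code>test</code> schema as in the examples here:</p>
<pre class="example">SELECT TABLE_NAME,
       DATA_LENGTH  / 1024 / 1024 AS data_mb,
       INDEX_LENGTH / 1024 / 1024 AS index_mb
  FROM INFORMATION_SCHEMA.TABLES
 WHERE TABLE_SCHEMA = 'test'
   AND TABLE_NAME   = 't';</pre>
<p>For InnoDB these figures are the same ones that <code>SHOW TABLE STATUS</code> reports as <code>Data_length</code> and <code>Index_length</code>.</p>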
</div>
</div>
<div id="outline-container-2" class="outline-2">
<h2 id="sec-2"><code>OPTIMIZE TABLE</code></h2>
<div id="text-2" class="outline-text-2">
<p><code>OPTIMIZE TABLE</code> is mapped to <code>ALTER TABLE ... ENGINE=InnoDB</code> for InnoDB tables and thus, is just a special case of the previous one:</p>
<pre class="example">mysql&gt; SET expand_fast_index_creation=OFF;
Query OK, 0 rows affected (0.00 sec)

mysql&gt; OPTIMIZE TABLE t;
+--------+----------+----------+-------------------------------------------------------------------+
| Table  | Op       | Msg_type | Msg_text                                                          |
+--------+----------+----------+-------------------------------------------------------------------+
| test.t | optimize | note     | Table does not support optimize, doing recreate + analyze instead |
| test.t | optimize | status   | OK                                                                |
+--------+----------+----------+-------------------------------------------------------------------+
2 rows in set (2 min 57.65 sec)

mysql&gt; SHOW TABLE STATUS LIKE 't'\G
*************************** 1. row ***************************
           Name: t
         Engine: InnoDB
        Version: 10
     Row_format: Compact
           Rows: 4195067
 Avg_row_length: 29
    Data_length: 125452288
Max_data_length: 0
   Index_length: 278839296
      Data_free: 1838153728
 Auto_increment: 4587468
    Create_time: 2011-11-06 10:01:18
    Update_time: NULL
     Check_time: NULL
      Collation: latin1_swedish_ci
       Checksum: NULL
 Create_options:
        Comment:
1 row in set (0.01 sec)

mysql&gt; SET expand_fast_index_creation=ON;
Query OK, 0 rows affected (0.00 sec)

mysql&gt; OPTIMIZE TABLE t;
+--------+----------+----------+-------------------------------------------------------------------+
| Table  | Op       | Msg_type | Msg_text                                                          |
+--------+----------+----------+-------------------------------------------------------------------+
| test.t | optimize | note     | Table does not support optimize, doing recreate + analyze instead |
| test.t | optimize | status   | OK                                                                |
+--------+----------+----------+-------------------------------------------------------------------+
2 rows in set (1 min 12.19 sec)

mysql&gt; SHOW TABLE STATUS LIKE 't'\G
*************************** 1. row ***************************
           Name: t
         Engine: InnoDB
        Version: 10
     Row_format: Compact
           Rows: 4195067
 Avg_row_length: 29
    Data_length: 125452288
Max_data_length: 0
   Index_length: 187465728
      Data_free: 1930428416
 Auto_increment: 4587468
    Create_time: 2011-11-06 10:04:10
    Update_time: NULL
     Check_time: NULL
      Collation: latin1_swedish_ci
       Checksum: NULL
 Create_options:
        Comment:
1 row in set (0.01 sec)</pre>
</div>
</div>
<div id="outline-container-3" class="outline-2">
<h2 id="sec-3"><em>mysqldump</em></h2>
<div id="text-3" class="outline-text-2">
<p>Quoting <a href="http://dev.mysql.com/doc/refman/5.5/en/innodb-create-index-overview.html">the fast index creation chapter</a> in the MySQL manual:</p>
<blockquote><p>“… you can generally speed the overall process of creating and loading<br />
an indexed table by creating the table with only the clustered index,<br />
and adding the secondary indexes after the data is loaded.”</p></blockquote>
<p><em>mysqldump</em> in Percona Server supports the new option <code>--innodb-optimize-keys</code> which does just that, i.e. it tries to optimize dumps of InnoDB tables by first creating the table with only the clustered index and adding the secondary indexes after the data dump when possible (see <strong>Caveats</strong> below).</p>
<p>Let’s compare the restore time for a regular dump with a dump created with <code>--innodb-optimize-keys</code> (the <code>test</code> database contained only the table I used in my previous examples):</p>
<pre class="example">$ mysqldump -uroot test &gt; dump_unoptimized.sql
$ mysqldump -uroot test --innodb-optimize-keys &gt; dump_optimized.sql

$ time mysql -uroot test &lt; dump_unoptimized.sql 

real    2m52.785s
user    0m3.179s
sys     0m0.069s

$ time mysql -uroot test &lt; dump_optimized.sql 

real    1m20.958s
user    0m3.204s
sys     0m0.062s</pre>
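<p>To make the difference concrete, here is a simplified sketch of the shape such an optimized dump takes for the test table. It is written by hand for illustration and is not actual <em>mysqldump</em> output; the index names and the sample rows are assumptions:</p>
<pre class="example">-- Table is created with the clustered index only
CREATE TABLE t (
  id INT NOT NULL AUTO_INCREMENT,
  c FLOAT DEFAULT NULL,
  v VARCHAR(8) DEFAULT NULL,
  PRIMARY KEY (id)
) ENGINE=InnoDB;

-- Data is loaded while there are no secondary indexes to maintain
INSERT INTO t (id, c, v) VALUES (1, 0.25, NULL), (2, 0.75, NULL);

-- Secondary indexes are added afterwards via fast index creation
ALTER TABLE t ADD KEY c (c), ADD KEY c_2 (c), ADD KEY c_3 (c);</pre>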
</div>
</div>
<div id="outline-container-4" class="outline-2">
<h2 id="sec-4">Caveats:</h2>
<div id="text-4" class="outline-text-2">
<p>As I mentioned previously, InnoDB fast index creation uses temporary files in <code>tmpdir</code> for all indexes being created. So make sure you have enough <code>tmpdir</code> space when using <code>expand_fast_index_creation</code>. It is a session variable, so you can temporarily switch it off if you are short on <code>tmpdir</code> space and/or don’t want this optimization to be used for a specific table.</p>
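<p>A short sketch of how this could look in practice. The table name <code>big_table</code> and the column change are hypothetical:</p>
<pre class="example">-- Where the merge-sort temporary files will be written
SHOW VARIABLES LIKE 'tmpdir';

-- Skip the optimization for one statement in this session only
SET SESSION expand_fast_index_creation = OFF;
ALTER TABLE big_table MODIFY v VARCHAR(16);
SET SESSION expand_fast_index_creation = ON;</pre>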
<p>There are also a number of cases where this optimization is not applicable:</p>
<ul>
<li><code>UNIQUE</code> indexes in <code>ALTER TABLE</code> are ignored to enforce uniqueness where necessary when copying the data to a temporary table;</li>
<li><code>ALTER TABLE</code> and <code>OPTIMIZE TABLE</code> always process tables containing foreign keys as if <code>expand_fast_index_creation</code> is <code>OFF</code> to avoid dropping keys that are part of a <code>FOREIGN KEY</code> constraint;</li>
<li><code>mysqldump --innodb-optimize-keys</code> ignores foreign keys because InnoDB requires a full table rebuild on foreign key changes. So adding them back with a separate <code>ALTER TABLE</code> after restoring the data from a dump would actually make the restore slower;</li>
<li><code>mysqldump --innodb-optimize-keys</code> ignores indexes on <code>AUTO_INCREMENT</code> columns, because they must be indexed, so it is impossible to temporarily drop the corresponding index;</li>
<li><code>mysqldump --innodb-optimize-keys</code> ignores the first <code>UNIQUE</code> index on non-nullable columns when the table has no <code>PRIMARY KEY</code> defined, because in this case InnoDB picks such an index as the clustered one.</li>
</ul>
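<p>Since tables with foreign keys are handled specially, it can be useful to know in advance which tables are affected. One possible way to list them, assuming the <code>test</code> schema:</p>
<pre class="example">-- Tables in the schema that define at least one FOREIGN KEY constraint
SELECT DISTINCT TABLE_NAME
  FROM INFORMATION_SCHEMA.TABLE_CONSTRAINTS
 WHERE TABLE_SCHEMA    = 'test'
   AND CONSTRAINT_TYPE = 'FOREIGN KEY';</pre>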
</div>
</div>
<div id="outline-container-5" class="outline-2">
<h2 id="sec-5">References:</h2>
<div id="text-5" class="outline-text-2">
<ul>
<li><a href="http://www.mysqlperformanceblog.com/2010/12/09/thinking-about-running-optimize-on-your-innodb-table-stop/">Peter&#8217;s post</a></li>
<li><a href="http://bugs.mysql.com/bug.php?id=57583">MySQL bug #57583</a></li>
<li><a href="http://bugs.mysql.com/bug.php?id=49120">MySQL bug #49120</a></li>
<li><a href="http://www.percona.com/doc/percona-server/5.5/management/innodb_fast_index_creation.html">Fast Index Creation page in Percona Server documentation</a></li>
</ul>
</div>
</div>
]]></content:encoded>
			<wfw:commentRss>http://www.mysqlperformanceblog.com/2011/11/06/improved-innodb-fast-index-creation/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
	</channel>
</rss>

