<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>MySQL Performance Blog &#187; benchmarks</title>
	<atom:link href="http://www.mysqlperformanceblog.com/category/benchmarks/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.mysqlperformanceblog.com</link>
	<description>Everything about MySQL Performance</description>
	<lastBuildDate>Sat, 21 Nov 2009 03:11:18 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.4</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>table_cache negative scalability</title>
		<link>http://www.mysqlperformanceblog.com/2009/11/16/table_cache-negative-scalability/</link>
		<comments>http://www.mysqlperformanceblog.com/2009/11/16/table_cache-negative-scalability/#comments</comments>
		<pubDate>Tue, 17 Nov 2009 02:18:00 +0000</pubDate>
		<dc:creator>peter</dc:creator>
				<category><![CDATA[benchmarks]]></category>
		<category><![CDATA[myisam]]></category>

		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/?p=1754</guid>
		<description><![CDATA[A couple of months ago there was a post by FreshBooks on getting great performance improvements by lowering the table_cache variable. So I decided to investigate what is really happening here.
The &#8220;common sense&#8221; approach to tuning caches is to make them as large as you can if you have enough resources (such as memory). [...]]]></description>
			<content:encoded><![CDATA[<p>A couple of months ago there was a <a href="http://www.freshbooks.com/blog/2008/09/09/now-were-flying/">post</a> by FreshBooks on getting great performance improvements by lowering the <strong>table_cache</strong> variable. So I decided to investigate what is really happening here.</p>
<p>The &#8220;common sense&#8221; approach to tuning caches is to make them as large as you can if you have enough resources (such as memory). With MySQL, however, common sense does not always work &#8211; we&#8217;ve seen performance issues with a large <strong>query_cache_size</strong>, and increasing <a href="http://www.mysqlperformanceblog.com/2007/08/18/how-fast-can-you-sort-data-with-mysql/">sort_buffer_size</a> or <a href="http://www.mysqlperformanceblog.com/2007/09/17/mysql-what-read_buffer_size-value-is-optimal/">read_buffer_size</a> may not give you better performance either. I found this also applies to <a href="http://www.mysqlperformanceblog.com/2006/06/06/are-larger-buffers-always-better/">some other buffers</a>.</p>
<p>Even with previous experience of surprising behavior, I did not expect such a <strong>table_cache</strong> issue &#8211; LRU cache management is a classic, and there are scalable algorithms to deal with it. I would have expected Monty to implement one of them.</p>
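<p>For reference, here is a minimal sketch of what a &#8220;scalable&#8221; LRU looks like: lookup, insert and eviction all cost the same regardless of cache size. This is Python for illustration only, not MySQL&#8217;s table cache code; the names <code>LRUCache</code> and <code>scan_misses</code> are made up for this example.</p>

```python
from collections import OrderedDict


class LRUCache:
    """Minimal LRU cache with O(1) lookup, insert and eviction."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()  # least recently used entry first
        self.misses = 0

    def get(self, key, loader):
        if key in self.entries:
            # Hit: mark the entry as most recently used.
            self.entries.move_to_end(key)
            return self.entries[key]
        # Miss: evict the least recently used entry if the cache is full,
        # then load and remember the new value.
        self.misses += 1
        if len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)
        value = loader(key)
        self.entries[key] = value
        return value


def scan_misses(capacity, n_tables, passes=2):
    """Access tables 0..n_tables-1 in order, `passes` times; return the miss count."""
    cache = LRUCache(capacity)
    for _ in range(passes):
        for n in range(n_tables):
            cache.get(n, lambda k: "table%d" % k)
    return cache.misses
```

<p>With a sequential scan over more tables than the cache holds, every access is a miss whatever the capacity, which is exactly the access pattern used in the test below. The point of the sketch is that the cost of each miss should not grow with the cache size.</p>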
<p>For the test I created 100,000 empty tables, each containing a single integer column and no indexes, and then ran <strong>SELECT * FROM tableN</strong> in a loop. Each table is accessed only once per run, so on any run but the first each access requires replacing a table in the table cache based on LRU logic.<br />
<a href="http://mysqlsandbox.net/">MySQL Sandbox</a> helped me to test this with different servers easily.</p>
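<p>A rough sketch of how such a test can be generated, assuming tables named <code>table1</code> to <code>tableN</code> with a single column called <code>id</code> (the column name and the exact naming are my assumptions, they are not stated above). The functions only produce SQL text, so the output can be fed to any client.</p>

```python
def create_statements(n_tables):
    """Yield CREATE TABLE statements for empty single-column MyISAM tables."""
    for n in range(1, n_tables + 1):
        # One integer column, no indexes (column name `id` is an assumption).
        yield "CREATE TABLE table%d (id INT) ENGINE=MyISAM;" % n


def select_statements(n_tables):
    """Yield one full-table SELECT per table, in creation order."""
    for n in range(1, n_tables + 1):
        yield "SELECT * FROM table%d;" % n
```

<p>Writing <code>"\n".join(select_statements(100000))</code> to a file and timing how long the client takes to run it gives the tables/sec numbers discussed below.</p>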
<p>I ran the test on CentOS 5.3, Xeon E5405, 16GB RAM and an EXT3 file system on a SATA hard drive.</p>
<p><strong>MySQL 5.0.85</strong> Created 100,000 tables in around 3 min 40 sec, which is about <strong>450 tables/sec</strong> &#8211; this indicates that &#8220;fsync&#8221; is lying on this test system, as the default sync_frm option is used.</p>
<p>With the default <strong>table_cache=64</strong>, accessing all tables takes 12 sec, which is almost <strong>8500 tables/sec</strong>, a great speed. We can note significant writes to the disk during this read-only benchmark. Why? Because for MyISAM tables the table header has to be modified each time the table is opened. In this case the performance was so great because the data for all 100,000 tables (the first block of the index) was placed close together on disk and was fully cached, which made updates to the headers very fast. In production systems with table headers not in the OS cache you will often see significantly lower numbers &#8211; 100 or less.</p>
<p>With a significantly larger <strong>table_cache=16384</strong> (and an appropriately adjusted number of open files) the same operation takes 660 seconds, which is <strong>151 tables/sec</strong>, around 50 times slower. Wow. That is quite a slowdown. We can see the load becomes very CPU bound in this case, and it looks like some of the table_cache algorithms do not scale well.</p>
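<p>The arithmetic behind these rates is easy to check. The sketch below recomputes them from the rounded run times quoted above (12 and 660 seconds for 100,000 tables), so the results differ slightly from the rates measured during the test.</p>

```python
def tables_per_sec(n_tables, seconds):
    """Average number of tables opened per second over a full scan."""
    return n_tables / seconds


N_TABLES = 100000
small_cache = tables_per_sec(N_TABLES, 12)    # table_cache=64
large_cache = tables_per_sec(N_TABLES, 660)   # table_cache=16384
slowdown = small_cache / large_cache

print("small cache: %.0f tables/sec" % small_cache)
print("large cache: %.1f tables/sec" % large_cache)
print("slowdown:    %.0fx" % slowdown)
```

<p>From the rounded timings the slowdown works out to about 55x, in line with the &#8220;around 50 times&#8221; figure.</p>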
<p>The absolute numbers are also very interesting &#8211; 151 tables/sec is not that bad if you look at it as an absolute number. So if you tune the table cache in the &#8220;normal&#8221; case and are able to bring your miss rate (<strong>opened_tables</strong>) down to 10/sec or less by using a large <strong>table_cache</strong>, you should do so. However, if you have so many tables that you still see 100+ misses/sec, while your data (at least the table headers) is well cached so the cost of a table cache miss is not very high, you may be better off with a significantly reduced table cache size.</p>
<p>The next step for me was to see if the problem was fixed in MySQL 5.1 &#8211; in this version table_cache was significantly redone and split into <strong>table_open_cache</strong> and <strong>table_definition_cache</strong>, and I assumed the behavior might be different as well.</p>
<p><strong>MySQL 5.1.40</strong><br />
I started testing with default <strong>table_open_cache=64</strong> and <strong>table_definition_cache=256</strong> &#8211; the read took about 12 seconds very close to MySQL 5.0.85.<br />
As I increased <strong>table_definition_cache</strong> to 16384 the result remained the same, so this variable is not causing the bottleneck. However, increasing <strong>table_open_cache</strong> to 16384 causes the scan to take about 780 sec, which is a bit worse than MySQL 5.0.85. So the problem is not fixed in MySQL 5.1; let&#8217;s see how MySQL 5.4 behaves.</p>
<p><strong>MySQL 5.4.2</strong><br />
MySQL 5.4.2 has a higher default <strong>table_open_cache</strong>, so I took it down to 64 to compare apples to apples. It performs the same as MySQL 5.0 and MySQL 5.1 with a small table cache.<br />
With <strong>table_open_cache</strong> increased to 16384 the test took 750 seconds, so the problem exists in MySQL 5.4 as well.</p>
<p>So the problem is real and it is not fixed even in the performance-focused MySQL 5.4. As we can see, large table_cache (or table_open_cache) values can indeed cause significant performance problems. Interestingly enough, InnoDB has a very similar task of managing its own cache of file descriptors (set by <strong>innodb_open_files</strong>). As time allows I should test whether Heikki knows how to implement LRU properly so that it does not have problems with a large number. We&#8217;ll see.</p>
    <hr noshade style="margin:0;height:1px" />
    <p>Entry posted by peter |
      <a href="http://www.mysqlperformanceblog.com/2009/11/16/table_cache-negative-scalability/#comments">13 comments</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.mysqlperformanceblog.com/2009/11/16/table_cache-negative-scalability/feed/</wfw:commentRss>
		<slash:comments>13</slash:comments>
		</item>
		<item>
		<title>Tokyo Tyrant -The Extras Part III : Write Bottleneck</title>
		<link>http://www.mysqlperformanceblog.com/2009/11/12/tokyo-tyrant-%e2%80%93-the-extras-part-iii-write-bottleneck/</link>
		<comments>http://www.mysqlperformanceblog.com/2009/11/12/tokyo-tyrant-%e2%80%93-the-extras-part-iii-write-bottleneck/#comments</comments>
		<pubDate>Thu, 12 Nov 2009 14:00:39 +0000</pubDate>
		<dc:creator>matt</dc:creator>
				<category><![CDATA[NOSQL]]></category>
		<category><![CDATA[benchmarks]]></category>
		<category><![CDATA[mysql]]></category>
		<category><![CDATA[tips]]></category>
		<category><![CDATA[tools]]></category>

		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/?p=1669</guid>
		<description><![CDATA[This is part 3 of my Tyrant extra&#8217;s, part 1 focused on durability, part 2 focused on the perceived performance wall.
#3.  Tokyo Cabinet Can have only a single writer thread, bottlenecking performance
When writing an application using Tokyo Cabinet only one connection can be opened as a “writer”  while the rest are readers.  Tyrant allows for [...]]]></description>
			<content:encoded><![CDATA[<p>This is part 3 of my Tyrant extras; part 1 focused on durability, part 2 focused on the perceived performance wall.</p>
<p>#3.  Tokyo Cabinet can have only a single writer thread, bottlenecking performance</p>
<p>When writing an application using Tokyo Cabinet only one connection can be opened as a “writer” while the rest are readers. Tyrant allows multiple “writes” to be sent in from multiple applications, but it still single threads them when writing out to disk. If you run several threads all just inserting into Tyrant you will see Tyrant hit 100% CPU on 1 core, and your writes will start to peter out quickly.</p>
<p><img class="aligncenter size-full wp-image-1670" title="Single Threaded Writes" src="http://www.mysqlperformanceblog.com/wp-content/uploads/2009/11/TC_PART_2_html_m5c0e618c.gif" alt="Single Threaded Writes" width="676" height="276" /></p>
<p>In my tests when I was not disk bound (FS cache writes) I was able to complete 4 million inserts in a little over 91 seconds using 8 threads, averaging 43896.98 inserts per second. Moving to 10 threads doing the same 4 million inserts, I completed the test in 96 seconds and averaged 41649.42 inserts per second. Compare this to 4 million rows using 4 threads, which averaged 40933.86, and you start to see that around 40K inserts per second is the most this particular server is capable of (single threaded). Hopefully this is something that may be fixed internally in the near future. Until then you may consider breaking up your data into multiple tables, each with their own cache. This limit is per TC DB so this should work. I had an idea about using the memcached client to distribute the data across multiple TC database files in the back end. This should work, I just need to test it <img src='http://www.mysqlperformanceblog.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
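<p>To make the idea of spreading writes over several TC database files concrete, here is a small untested-in-production sketch of client-side key hashing. It is not the memcached client&#8217;s own distribution algorithm; <code>shard_for</code> and <code>partition</code> are hypothetical helper names, and CRC32 is simply a convenient stable hash.</p>

```python
import zlib


def shard_for(key, n_shards):
    """Map a key to a shard index in [0, n_shards) using a stable CRC32 hash."""
    return zlib.crc32(key.encode("utf-8")) % n_shards


def partition(keys, n_shards):
    """Group keys by the shard (i.e. TC database file) they would be written to."""
    shards = [[] for _ in range(n_shards)]
    for key in keys:
        shards[shard_for(key, n_shards)].append(key)
    return shards
```

<p>Each shard would be its own database with its own single writer, so N shards could in principle give N writer threads. Whether the aggregate rate actually scales that way is exactly what still needs testing.</p>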
<p>Ever notice how as my multi-part posts go on they get shorter and shorter:)  This will be the last Tyrant related post for a little bit.  The 4th &amp; 5th posts were supposed to deal with replication and scaling&#8230; this may take a little while.  Thanks for reading!</p>
    <hr noshade style="margin:0;height:1px" />
    <p>Entry posted by matt |
      <a href="http://www.mysqlperformanceblog.com/2009/11/12/tokyo-tyrant-%e2%80%93-the-extras-part-iii-write-bottleneck/#comments">3 comments</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.mysqlperformanceblog.com/2009/11/12/tokyo-tyrant-%e2%80%93-the-extras-part-iii-write-bottleneck/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Tokyo Tyrant &#8211; The Extras Part II :  The Performance Wall</title>
		<link>http://www.mysqlperformanceblog.com/2009/11/11/tokyo-tyrant-the-extras-part-ii-the-performance-wall/</link>
		<comments>http://www.mysqlperformanceblog.com/2009/11/11/tokyo-tyrant-the-extras-part-ii-the-performance-wall/#comments</comments>
		<pubDate>Wed, 11 Nov 2009 15:00:41 +0000</pubDate>
		<dc:creator>matt</dc:creator>
				<category><![CDATA[NOSQL]]></category>
		<category><![CDATA[benchmarks]]></category>
		<category><![CDATA[mysql]]></category>
		<category><![CDATA[tips]]></category>
		<category><![CDATA[tools]]></category>

		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/?p=1662</guid>
		<description><![CDATA[Continuing my look at Tokyo Tyrant/Cabinet and addressing some of the concerns I have seen people have brought up this is post #2.
#2.  As your data grows does  Tokyo Cabinet slow down?
Yes your performance can degrade. One obvious performance decrease with a larger dataset  is you start to increase the likelihood that your data no [...]]]></description>
			<content:encoded><![CDATA[<p>Continuing my look at Tokyo Tyrant/Cabinet and addressing some of the concerns I have seen people bring up, this is post #2.</p>
<p>#2.  As your data grows does  Tokyo Cabinet slow down?</p>
<p>Yes, your performance can degrade. One obvious cause of a performance decrease with a larger dataset is the increased likelihood that your data no longer fits into memory. This decreases the number of memory operations and trades them for more expensive disk-based operations. However fast an application is, once you read off disk as opposed to memory, performance is going to drop off substantially. One of the more difficult things to test with Tyrant is disk-bound performance. The FS cache can make it seem like Tyrant will still scream with only a small amount of memory. Once your data set is larger than that, people start to claim they hit the performance “wall”.</p>
<p>In order to help test this I went ahead and mounted the FS holding my data files with the sync option, which effectively disables the FS cache. This should help show the real performance of the hash engine. Here performance dips substantially, as expected:</p>
<p><img class="aligncenter size-full wp-image-1663" title="FS Mounted As Sync" src="http://www.mysqlperformanceblog.com/wp-content/uploads/2009/11/TC_PART_2_html_144b51c.gif" alt="FS Mounted As Sync" width="610" height="332" /></p>
<p>Look at the IO rate:<br />
NoSync:  31 MB/s<br />
Sync:  3.2 MB/s</p>
<p>As one would expect, the IO goes crazy when the drive is mounted with the sync option, hitting 99% IO wait. The interesting thing here is that we are actually bottlenecking on writes and not reads. Without the FS cache to buffer the writes, when we need to remove data from memory we have to rely on the internal Tyrant cache, and when that is exhausted we really have to write to disk, not the FS cache. Now Tyrant starts to take on the same characteristics as your classic DB, where the bigger the buffer pool the faster the performance:</p>
<p><img class="aligncenter size-full wp-image-1664" title="Difference Memory Sizes for Tyrant" src="http://www.mysqlperformanceblog.com/wp-content/uploads/2009/11/TC_PART_2_html_1bdc78f4.gif" alt="Difference Memory Sizes for Tyrant" width="585" height="372" /><br />
Even here the performance drop-off once you exhaust memory is relative. The focus here should be the drop-off versus other solutions with the same configuration, not the drop-off versus a completely cached version. In this case ask yourself: given similar datasets and similar memory requirements, what is the performance? Take the above sync test: when I use 256M of memory and run my test with writes going directly to disk I hit 964 TPS, while in previous MySQL tests the same setup (256M BP) netted ~160 TPS. So roughly a 6x improvement, all things being equal. Of course this is a far drop-off from the 13K I was getting when everything was effectively in the file system cache or in memory, but 6x is still a very solid improvement.</p>
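<p>The ratios behind that comparison can be checked quickly. The inputs are the figures quoted above; the MySQL number is approximate and 13K is taken as 13000, so treat the output as ballpark values.</p>

```python
tyrant_tps = 964      # Tyrant, 256M of memory, FS mounted with sync
mysql_tps = 160       # MySQL, 256M buffer pool (approximate figure)
cached_tps = 13000    # Tyrant with data effectively in memory / FS cache

speedup = tyrant_tps / mysql_tps      # Tyrant vs MySQL, both disk bound
dropoff = cached_tps / tyrant_tps     # cached vs disk-bound Tyrant

print("%.1fx faster than MySQL, %.1fx slower than fully cached" % (speedup, dropoff))
```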
<p>Next up is looking at Tyrant&#8217;s and Cabinet&#8217;s write bottleneck.</p>
    <hr noshade style="margin:0;height:1px" />
    <p>Entry posted by matt |
      <a href="http://www.mysqlperformanceblog.com/2009/11/11/tokyo-tyrant-the-extras-part-ii-the-performance-wall/#comments">3 comments</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.mysqlperformanceblog.com/2009/11/11/tokyo-tyrant-the-extras-part-ii-the-performance-wall/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Tokyo Tyrant &#8211; The Extras Part I :  Is it Durable?</title>
		<link>http://www.mysqlperformanceblog.com/2009/11/10/tokyo-tyrant-the-extras-part-i-is-it-durable/</link>
		<comments>http://www.mysqlperformanceblog.com/2009/11/10/tokyo-tyrant-the-extras-part-i-is-it-durable/#comments</comments>
		<pubDate>Tue, 10 Nov 2009 14:00:11 +0000</pubDate>
		<dc:creator>matt</dc:creator>
				<category><![CDATA[NOSQL]]></category>
		<category><![CDATA[benchmarks]]></category>
		<category><![CDATA[mysql]]></category>
		<category><![CDATA[tools]]></category>

		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/?p=1650</guid>
		<description><![CDATA[You know how in addition to the main movie you have extras on the DVD.  Extra commentary, bloopers, extra scenes, etc? Well welcome the Tyrant extras.  With my previous blog posts I was trying to set-up a case for looking at NOSQL tools, and not meant to be a decision making tool.  Each solution has [...]]]></description>
			<content:encoded><![CDATA[<p>You know how in addition to the main movie you have extras on the DVD? Extra commentary, bloopers, extra scenes, etc.? Well, welcome to the Tyrant extras. With my <a href="http://www.mysqlperformanceblog.com/2009/10/19/mysql_memcached_tyrant_part3/">previous blog posts</a> I was trying to set up a case for looking at NOSQL tools; they were not meant to be a decision-making tool. Each solution has pros and cons that will impact how well the technology works for you. Based on some of the comments and questions on the other posts, I thought I would put together a little more detail on some of the deficiencies and strengths of Tokyo Tyrant.</p>
<p>#1.  How durable is Tokyo Tyrant?</p>
<p>Well, I went ahead and built a quick script that just inserted data into a TC table (an id and a timestamp) and did a kill -9 on the server in the middle of it.</p>
<p style="padding-left: 30px;"><code>Insert:<br />
159796,1256131127.17329<br />
159797,1256131127.17338<br />
159798,1256131127.17345<br />
159799,1256131127.17355<br />
put error: recv error<br />
159800,1256131127.17364</code></p>
<p>Here we failed at a time of 1256131127.17355 , before the next record was inserted.</p>
<p>After bringing the server up from a crash:</p>
<p style="padding-left: 30px;"><code>159795,1256131127.1732<br />
159796,1256131127.17329<br />
159797,1256131127.17338<br />
159798,1256131127.17345<br />
159799,1256131127.17355</code></p>
<p>All the records are still there. So we are good, right? Looking in the code, Tokyo Cabinet actually utilizes memory mapped files. I personally have not used mmapped files, so feel free to correct me if you know better than I do. Using mmap here and performing a kill -9 seems to preserve the changes in memory, while powering down the server does not:</p>
<p style="padding-left: 30px;"><code>163,1257780699.10123<br />
164,1257780699.35172<br />
165,1257780699.60209<br />
166,1257780699.85246</code></p>
<p>insert yanking of power cord here&#8230; gives us Post crash data of:</p>
<p style="padding-left: 30px;"><code>142,1257780693.84303<br />
143,1257780694.09345</code></p>
<p>So we basically lost 5 to 6 seconds of data.</p>
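<p>The size of the gap can be read straight off the two listings. This small sketch assumes that record 166 was the last one written before the power was pulled and that record 143 was the last one that survived, which is what the excerpts above show.</p>

```python
# (id, timestamp) pairs copied from the listings above.
last_written = (166, 1257780699.85246)     # last insert before pulling the cord
last_recovered = (143, 1257780694.09345)   # last record present after the crash

lost_records = last_written[0] - last_recovered[0]
lost_seconds = last_written[1] - last_recovered[1]

print("lost %d records covering %.2f seconds" % (lost_records, lost_seconds))
```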
<p>Looking at the Tyrant &amp; Cabinet  documentation you will see mention of a  SYNC command which they say does the following:</p>
<p>“The function `tcrdbsync&#8217; is used in order to synchronize updated contents of a remote database object with the file and the device.”</p>
<p>Let&#8217;s dig a little deeper into the code and see what&#8217;s going on:</p>
<p style="padding-left: 30px;"><code>/* Synchronize updated contents of a hash database object with the file and the device. */<br />
bool tchdbsync(TCHDB *hdb){<br />
assert(hdb);<br />
if(!HDBLOCKMETHOD(hdb, true)) return false;<br />
if(hdb-&gt;fd &lt; 0 || !(hdb-&gt;omode &amp; HDBOWRITER) || hdb-&gt;tran){<br />
tchdbsetecode(hdb, TCEINVALID, __FILE__, __LINE__, __func__);<br />
HDBUNLOCKMETHOD(hdb);<br />
return false;<br />
}<br />
if(hdb-&gt;async &amp;&amp; !tchdbflushdrp(hdb)){<br />
HDBUNLOCKMETHOD(hdb);<br />
return false;<br />
}<br />
bool rv = tchdbmemsync(hdb, true);<br />
HDBUNLOCKMETHOD(hdb);<br />
return rv;<br />
}</code></p>
<p>It first checks whether the file descriptor for the database is less than 0, whether you are not operating as a writer, or whether a transaction is open&#8230; in any of those cases it errors. Then it checks if you are running in async IO mode. If you are running async it flushes the records from the delayed record pool. If you are running async and you do not flush your records, then you are at the mercy of Tokyo Cabinet, or your application, to call one of the numerous operations that flush the delayed record pool (i.e. all regular sync operations like tchdbput will flush it). I did not test with async; in fact, to the best of my knowledge it does not look like Tyrant supports async, even though Cabinet does. This means the meat of the sync command coming from Tyrant is tchdbmemsync.</p>
<p style="padding-left: 30px;"><code>/* Synchronize updating contents on memory of a hash database object. */<br />
bool tchdbmemsync(TCHDB *hdb, bool phys){<br />
assert(hdb);<br />
if(hdb-&gt;fd &lt; 0 || !(hdb-&gt;omode &amp; HDBOWRITER)){<br />
tchdbsetecode(hdb, TCEINVALID, __FILE__, __LINE__, __func__);<br />
return false;<br />
}<br />
bool err = false;<br />
char hbuf[HDBHEADSIZ];<br />
tchdbdumpmeta(hdb, hbuf);<br />
memcpy(hdb-&gt;map, hbuf, HDBOPAQUEOFF);<br />
if(phys){<br />
size_t xmsiz = (hdb-&gt;xmsiz &gt; hdb-&gt;msiz) ? hdb-&gt;xmsiz : hdb-&gt;msiz;<br />
if(msync(hdb-&gt;map, xmsiz, MS_SYNC) == -1){<br />
tchdbsetecode(hdb, TCEMMAP, __FILE__, __LINE__, __func__);<br />
err = true;<br />
}<br />
if(fsync(hdb-&gt;fd) == -1){<br />
tchdbsetecode(hdb, TCESYNC, __FILE__, __LINE__, __func__);<br />
err = true;<br />
}<br />
}<br />
return !err;<br />
}</code></p>
<p>Here you see the call to msync.  What does msync do?  The man page says:</p>
<p>“The msync() function writes all modified data to permanent storage locations, if any, in those whole pages containing any part of the address space of the process starting at address addr and continuing for len bytes.”</p>
<p>Basically, in the Tokyo Tyrant context msync will flush all the changes made to a memory mapped object to disk. This msync is crucial, as you cannot guarantee data ever makes it to disk if it&#8217;s not called. (more below)</p>
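<p>For readers who, like me, have not worked with mmapped files much, here is a small Python sketch of the same write-then-msync pattern. It is only an analogy for what tchdbmemsync does, not Tokyo Cabinet code: on Unix, Python&#8217;s <code>mmap.flush()</code> calls msync(), and <code>os.fsync()</code> mirrors the fsync() call that follows it. The function names are made up for this example.</p>

```python
import mmap
import os
import tempfile


def write_via_mmap(path, payload, sync=True):
    """Write payload at the start of an existing file through a memory mapping."""
    with open(path, "r+b") as f:
        with mmap.mmap(f.fileno(), 0) as m:
            m[:len(payload)] = payload      # only changes the mapped pages in memory
            if sync:
                m.flush()                   # msync() on Unix
                os.fsync(f.fileno())        # then fsync(), as tchdbmemsync does


def demo():
    """Write a record through mmap into a temporary file and read it back."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"\0" * 4096)          # mmap needs a non-empty file
        os.close(fd)
        write_via_mmap(path, b"166,1257780699.85246")
        with open(path, "rb") as f:
            return f.read(20)
    finally:
        os.remove(path)
```

<p>Note that reading the file back would succeed even with <code>sync=False</code>, because the dirty pages still sit in the OS page cache. That is why a kill -9 looks safe: the difference only shows up when the machine itself goes down.</p>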
<p>The tchdbmemsync function is the only place I saw calling msync. What calls  tchdbmemsync?</p>
<p style="padding-left: 30px;">tchdbmemsync Called via:<br />
<code>tchdboptimize<br />
tchdbsync<br />
tchdbtranbegin<br />
tchdbtrancommit<br />
tchdbtranabort<br />
tchdbcloseimpl<br />
tchdbcopyimpl</code></p>
<p>The commands that will indirectly call an msync are: running the optimize command, calling a sync directly, closing a connection to the db, or starting, committing, or aborting a transaction. Note that a transaction in TC is actually a global transaction and locks all write operations (used for maintenance). What is missing here is a scheduled call to msync. I looked and traced back the calls from Tyrant into Cabinet and could not find anything that is called automatically.</p>
<p>The documentation on msync actually says that without calling msync there is no guarantee of the data making it to disk. This implies that it may eventually get written without a direct msync call (when you purge/LRU old data from memory). Testing this theory I crashed my server several times and found that data written out to disk without calling msync was very flaky indeed. I had anywhere from 5 to 60 seconds of data missing post crash.</p>
<p>This means for durability you really need to directly call the sync command. In my previous post someone pointed out a flaw in this approach, saying that they had seen that calling a sync after writes ruined performance. Looking at the code you can see why calling a sync after each write can severely degrade performance. Before I explain, let&#8217;s look at the performance hit:</p>
<p style="text-align: center;"><img class="aligncenter size-full wp-image-1680" title="Sync After every Call" src="http://www.mysqlperformanceblog.com/wp-content/uploads/2009/11/TC_PART_2_html_m33bce6bb1.gif" alt="Sync After every Call" width="621" height="325" /></p>
<p>Saying there is a performance hit here is an understatement. The reason for this, however, is really how msync works and how it&#8217;s used in Tokyo Cabinet. In a sense it is implemented as a global sync, not a record sync, i.e. all changes to the underlying database are flushed at once. So instead of syncing the record you just changed, all of the changed records in the DB will be flushed and synced. In order to perform this operation a lock is required, which blocks other SYNC calls. So if you have 32 threads, you could have 1 sync running and 31 others blocked. This means calling a sync after every call is going to severely degrade performance.</p>
<p>So what can we do to make Cabinet more durable? Well, the best option in my opinion is to steal a trick from InnoDB:</p>
<p>We can easily write a script that calls a background sync every second (i.e. like innodb_flush_log_at_trx_commit = 0/2). I have tested this and I see almost 0 impact on my gaming benchmark from when this is running to when it is not.</p>
<p style="text-align: center;"><img class="size-full wp-image-1655 aligncenter" title="Once a Second Sync" src="http://www.mysqlperformanceblog.com/wp-content/uploads/2009/11/TC_PART_2_html_m16b3bee4.gif" alt="Once a Second Sync" width="642" height="304" /></p>
<p>You can write this and cron the script or TTSERVER actually provides you a method to call functions periodically:</p>
<p><code>-ext path : specify the script language extension file.<br />
-extpc name period : specify the function name and the calling period of a periodic command.</code></p>
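<p>If you would rather drive the sync from the application side, the once-a-second idea can be sketched as a small background thread. This is a generic Python sketch, not the ttserver extension mechanism shown above: <code>PeriodicSync</code> is a made-up helper, and <code>sync_fn</code> is a placeholder for whatever call your client library uses to issue Tyrant&#8217;s sync command.</p>

```python
import threading


class PeriodicSync:
    """Call sync_fn every `period` seconds on a background thread."""

    def __init__(self, sync_fn, period=1.0):
        self.sync_fn = sync_fn          # hypothetical: issues the sync command
        self.period = period
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def _run(self):
        # Event.wait returns False on timeout and True once stop() is called,
        # so the loop syncs once per period until it is asked to stop.
        while not self._stop.wait(self.period):
            self.sync_fn()

    def start(self):
        self._thread.start()

    def stop(self):
        self._stop.set()
        self._thread.join()
```

<p>Because only this one thread ever calls sync, the writers never queue up behind each other&#8217;s sync calls; the trade-off is that up to one period of writes can be lost on a power failure.</p>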
<p>Now while I did not see a drop in my benchmark, heavy write operations will see a drop in performance&#8230; for instance with 8 threads simply updating/inserting data I saw this:</p>
<p style="text-align: center;"><img class="size-full wp-image-1656 aligncenter" title="heavy insert sync once a second" src="http://www.mysqlperformanceblog.com/wp-content/uploads/2009/11/TC_PART_2_html_m458df6c3.gif" alt="heavy insert sync once a second" width="641" height="320" /></p>
<p>Ouch, a 2X hit.  But you can configure the frequency of the sync  up or down as needed to ensure you have the proper recovery -vs- performance setting.</p>
    <hr noshade style="margin:0;height:1px" />
    <p>Entry posted by matt |
      <a href="http://www.mysqlperformanceblog.com/2009/11/10/tokyo-tyrant-the-extras-part-i-is-it-durable/#comments">3 comments</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.mysqlperformanceblog.com/2009/11/10/tokyo-tyrant-the-extras-part-i-is-it-durable/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Air traffic queries in MyISAM and Tokutek (TokuDB)</title>
		<link>http://www.mysqlperformanceblog.com/2009/11/05/air-traffic-queries-in-myisam-and-tokutek-tokudb/</link>
		<comments>http://www.mysqlperformanceblog.com/2009/11/05/air-traffic-queries-in-myisam-and-tokutek-tokudb/#comments</comments>
		<pubDate>Fri, 06 Nov 2009 06:21:03 +0000</pubDate>
		<dc:creator>Vadim</dc:creator>
				<category><![CDATA[OLAP]]></category>
		<category><![CDATA[benchmarks]]></category>
		<category><![CDATA[dw]]></category>
		<category><![CDATA[mysql]]></category>

		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/?p=1641</guid>
		<description><![CDATA[This is the next post in the series
Analyzing air traffic performance with InfoBright and MonetDB
Air traffic queries in LucidDB
Air traffic queries in InfiniDB: early alpha
Let me explain the reason for choosing these engines.  After the initial three posts I am often asked "What is the baseline? Can we compare results with standard MySQL engines?". So there [...]]]></description>
			<content:encoded><![CDATA[<p>This is the next post in the series:<br />
<a href="http://www.mysqlperformanceblog.com/2009/10/02/analyzing-air-traffic-performance-with-infobright-and-monetdb/">Analyzing air traffic performance with InfoBright and MonetDB</a><br />
<a href="http://www.mysqlperformanceblog.com/2009/10/26/air-traffic-queries-in-luciddb/">Air traffic queries in LucidDB</a><br />
<a href="http://www.mysqlperformanceblog.com/2009/11/02/air-traffic-queries-in-infinidb-early-alpha/">Air traffic queries in InfiniDB: early alpha</a></p>
<p>Let me explain the reason for choosing these engines.  After the initial three posts I am often asked "What is the baseline? Can we compare results with standard MySQL engines?". So here comes MyISAM, as a base point to see how much better the column-oriented analytic engines are here.</p>
<p>However, take into account that for MyISAM we need to choose proper indexes to execute queries effectively, and indexes come with pain: loading data gets slower, and designing proper indexes takes additional research, especially when the MySQL optimizer is not smart about picking the best one.</p>
<p>The really nice thing about MonetDB, InfoBright and InfiniDB is that they do not need indexes, so you do not have to worry about maintaining them and picking the best one. I am not sure about LucidDB: I was told indexes are needed, but creating a new index was really fast even on the full database, so I guess they are not B-Tree indexes. This reflection on indexes turned me in the direction of TokuDB.</p>
<p>What is so special about TokuDB? There are two things. First, its indexes have a special structure and are "cheap"; by "cheap" I mean that the maintenance cost is constant and independent of data size. With regular B-Tree indexes the cost grows exponentially with data size (Bradley Kuszmaul from Tokutek will correct me if I am wrong in this statement). Second, TokuDB uses compression, so I expect the loaded data to be smaller and queries to need fewer IO operations.</p>
<p>So what indexes do we need for the queries? As a reminder, the schema is available in this post:<br />
<a href="http://www.mysqlperformanceblog.com/2009/10/02/analyzing-air-traffic-performance-with-infobright-and-monetdb/">http://www.mysqlperformanceblog.com/2009/10/02/analyzing-air-traffic-performance-with-infobright-and-monetdb/</a>, and<br />
the queries are on the "Queries" sheet of my summary <a href="https://spreadsheets.google.com/a/percona.com/ccc?key=0AjsVX7AnrCYwdERIZFVqakRrcXplM0g0UktaUkRwenc&#038;hl=en#">Spreadsheet</a>.</p>
<p>With Bradley's help we chose the following indexes:</p>
<div class="igBar"><span id="lcode-2"><a href="#" onclick="javascript:showPlainTxt('code-2'); return false;">PLAIN TEXT</a></span></div>
<div class="syntax_hilite"><span class="langName">CODE:</span>
<div id="code-2">
<div class="code">
<ol>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">KEY `Year` <span style="color:#006600; font-weight:bold;">&#40;</span>`Year`,`Month`<span style="color:#006600; font-weight:bold;">&#41;</span>,</div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp; KEY `Year_2` <span style="color:#006600; font-weight:bold;">&#40;</span>`Year`,`DayOfWeek`<span style="color:#006600; font-weight:bold;">&#41;</span>,</div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp; KEY `DayOfWeek` <span style="color:#006600; font-weight:bold;">&#40;</span>`DayOfWeek`,`Year`,`DepDelay`<span style="color:#006600; font-weight:bold;">&#41;</span>,</div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp; KEY `DestCityName` <span style="color:#006600; font-weight:bold;">&#40;</span>`DestCityName`,`OriginCityName`,`Year`<span style="color:#006600; font-weight:bold;">&#41;</span>,</div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp; KEY `Year_3` <span style="color:#006600; font-weight:bold;">&#40;</span>`Year`,`DestCityName`,`OriginCityName`<span style="color:#006600; font-weight:bold;">&#41;</span>,</div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp; KEY `Year_4` <span style="color:#006600; font-weight:bold;">&#40;</span>`Year`,`Carrier`,`DepDelay`<span style="color:#006600; font-weight:bold;">&#41;</span>,</div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp; KEY `Origin` <span style="color:#006600; font-weight:bold;">&#40;</span>`Origin`,`Year`,`DepDelay`<span style="color:#006600; font-weight:bold;">&#41;</span> </div>
</li>
</ol>
</div>
</div>
</div>
<p></p>
<p>I measured load time for both MyISAM and TokuDB, loading into an empty table with the indexes already created.</p>
<p>Load time for MyISAM: <strong>16608 sec</strong><br />
For TokuDB: <strong>19131 sec</strong></p>
<p>Datasize (including indexes)</p>
<p>MyISAM: <strong>36.7GB</strong><br />
TokuDB: <strong>6.7GB</strong></p>
<p>I am a bit surprised that TokuDB is slower at loading data, but my guess is that it is related to compression, and I expect that with a bigger amount of data TokuDB will be faster than MyISAM.</p>
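<p>To put those numbers in proportion, here is a quick back-of-the-envelope calculation in Python. The variable names are mine; the figures are the ones measured above.</p>

```python
# Figures taken from the measurements above.
myisam_load_sec = 16608
tokudb_load_sec = 19131
myisam_size_gb = 36.7
tokudb_size_gb = 6.7

# How much longer the TokuDB load took, as a percentage of the MyISAM load.
load_overhead_pct = (tokudb_load_sec - myisam_load_sec) / myisam_load_sec * 100

# How many times smaller the TokuDB data (including indexes) is on disk.
size_ratio = myisam_size_gb / tokudb_size_gb

print(f"TokuDB load took {load_overhead_pct:.1f}% longer")
print(f"TokuDB data is {size_ratio:.1f}x smaller")
```

<p>So the TokuDB load was roughly 15% slower, for data that ends up about 5.5 times smaller on disk.</p>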
<p>Now to the queries. Bradley pointed out to me that query Q5 <code>SELECT t.carrier, c, c2, c*1000/c2 as c3 FROM (SELECT carrier,<br />
count(*) AS c FROM ontime WHERE DepDelay>10 AND Year=2007 GROUP BY<br />
carrier) t JOIN (SELECT carrier, count(*) AS c2 FROM ontime WHERE<br />
Year=2007 GROUP BY carrier) t2 ON (t.Carrier=t2.Carrier) ORDER BY c3</code> can be rewritten as<br />
<code>SELECT carrier,totalflights,ndelayed,ndelayed*1000/totalflights as c3 FROM (SELECT carrier,count(*) as totalflights,sum(if(depdelay>10,1,0)) as ndelayed from ontime where year=2007 group by carrier) t order by c3 desc;</code> (I call it query Q5i).</p>
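<p>To see why the rewrite gives the same numbers, here is a small sketch that runs both forms on a handful of made-up rows. It uses SQLite through Python's <code>sqlite3</code> module rather than MySQL, so some details are assumptions of the sketch: <code>CASE WHEN</code> stands in for MySQL's <code>IF()</code>, the columns of Q5 are reordered to line up with Q5i, the <code>ORDER BY</code> clauses are dropped, and the sample data is invented.</p>

```python
import sqlite3

# Tiny made-up sample: (Carrier, Year, DepDelay). Not real air traffic data.
rows = [
    ("AA", 2007, 5), ("AA", 2007, 15), ("AA", 2007, 30), ("AA", 2007, 0),
    ("UA", 2007, 12), ("UA", 2007, 3),
    ("DL", 2007, 45), ("DL", 2006, 50),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ontime (Carrier TEXT, Year INTEGER, DepDelay INTEGER)")
conn.executemany("INSERT INTO ontime VALUES (?, ?, ?)", rows)

# Q5: two passes over the table, joined by carrier.
# Columns are listed as (carrier, total, delayed, c3) to match Q5i.
q5 = """
SELECT t.Carrier, c2, c, c*1000/c2 AS c3
FROM (SELECT Carrier, count(*) AS c FROM ontime
      WHERE DepDelay > 10 AND Year = 2007 GROUP BY Carrier) t
JOIN (SELECT Carrier, count(*) AS c2 FROM ontime
      WHERE Year = 2007 GROUP BY Carrier) t2 ON (t.Carrier = t2.Carrier)
"""

# Q5i: one pass with a conditional sum. CASE WHEN stands in for MySQL's IF().
q5i = """
SELECT Carrier, totalflights, ndelayed, ndelayed*1000/totalflights AS c3
FROM (SELECT Carrier, count(*) AS totalflights,
             sum(CASE WHEN DepDelay > 10 THEN 1 ELSE 0 END) AS ndelayed
      FROM ontime WHERE Year = 2007 GROUP BY Carrier) t
"""

# Sort in Python so the comparison ignores result ordering.
q5_rows = sorted(conn.execute(q5).fetchall())
q5i_rows = sorted(conn.execute(q5i).fetchall())
print(q5_rows == q5i_rows)
```

<p>One caveat: Q5 uses an inner join, so a carrier with no delayed flights in 2007 drops out of Q5 but appears in Q5i with <code>ndelayed</code> equal to 0. The two forms agree only when every carrier has at least one delayed flight, as in this sample. Q5i also scans the table once instead of twice, which fits its lower time in the results.</p>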
<p>Summary of query execution times (in seconds; lower is better):</p>
<table border=1>
<tr>
<td>Query</td>
<td>MyISAM</td>
<td>TokuDB</td>
</tr>
<tr>
<td>Q0</td>
<td>72.84</td>
<td>50.25</td>
</tr>
<tr>
<td>Q1</td>
<td>61.03</td>
<td>55.01</td>
</tr>
<tr>
<td>Q2</td>
<td>98.12</td>
<td>58.36</td>
</tr>
<tr>
<td>Q3</td>
<td>123.04</td>
<td>66.87</td>
</tr>
<tr>
<td>Q4</td>
<td>6.92</td>
<td>6.91</td>
</tr>
<tr>
<td>Q5</td>
<td>13.61</td>
<td>11.86</td>
</tr>
<tr>
<td>Q5i</td>
<td>7.68</td>
<td>6.96</td>
</tr>
<tr>
<td>Q6</td>
<td>123.84</td>
<td>69.03</td>
</tr>
<tr>
<td>Q7</td>
<td>187.22</td>
<td>159.62</td>
</tr>
<tr>
<td>Q8 (1y)</td>
<td>8.75</td>
<td>7.59</td>
</tr>
<tr>
<td>Q8 (2y)</td>
<td>102.17</td>
<td>64.95</td>
</tr>
<tr>
<td>Q8 (3y)</td>
<td>104.7</td>
<td>69.76</td>
</tr>
<tr>
<td>Q8 (4y)</td>
<td>107.05</td>
<td>70.46</td>
</tr>
<tr>
<td>Q8 (10y)</td>
<td>119.54</td>
<td>84.64</td>
</tr>
<tr>
<td>Q9</td>
<td>69.05</td>
<td>47.67</td>
</tr>
</table>
<p>For reference I used 5.1.36-Tokutek-2.1.0 for both MyISAM and TokuDB tests.</p>
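<p>For a quick read of the table, the ratio of MyISAM time to TokuDB time can be computed per query. The snippet below is only a convenience sketch in Python; the numbers are copied from the table above and the variable names are mine.</p>

```python
# Execution times in seconds, copied from the table above: (MyISAM, TokuDB).
times = {
    "Q0": (72.84, 50.25),
    "Q1": (61.03, 55.01),
    "Q2": (98.12, 58.36),
    "Q3": (123.04, 66.87),
    "Q4": (6.92, 6.91),
    "Q5": (13.61, 11.86),
    "Q5i": (7.68, 6.96),
    "Q6": (123.84, 69.03),
    "Q7": (187.22, 159.62),
    "Q8 (1y)": (8.75, 7.59),
    "Q8 (2y)": (102.17, 64.95),
    "Q8 (3y)": (104.7, 69.76),
    "Q8 (4y)": (107.05, 70.46),
    "Q8 (10y)": (119.54, 84.64),
    "Q9": (69.05, 47.67),
}

# A ratio above 1.0 means TokuDB was faster than MyISAM on that query.
speedup = {q: myisam / tokudb for q, (myisam, tokudb) in times.items()}

for q, s in speedup.items():
    print(f"{q:9s} {s:.2f}x")

best = max(speedup, key=speedup.get)
worst = min(speedup, key=speedup.get)
print("largest gain:", best, "smallest gain:", worst)
```

<p>On these numbers the ratio ranges from about 1.0 (Q4, essentially a tie) to about 1.8 (Q3).</p>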
<p>And if you are interested in comparing MyISAM with the previously tested engines (times in seconds):</p>
<table border=1>
<tr>
<td>Query</td>
<td>MyISAM</td>
<td>MonetDB</td>
<td>InfoBright</td>
<td>LucidDB</td>
<td>InfiniDB</td>
</tr>
<tr>
<td>Q0</td>
<td>72.84</td>
<td>29.9</td>
<td>4.19</td>
<td>103.21</td>
<td>NA</td>
</tr>
<tr>
<td>Q1</td>
<td>61.03</td>
<td>7.9</td>
<td>12.13</td>
<td>49.17</td>
<td>6.79</td>
</tr>
<tr>
<td>Q2</td>
<td>98.12</td>
<td>0.9</td>
<td>6.73</td>
<td>27.13</td>
<td>4.59</td>
</tr>
<tr>
<td>Q3</td>
<td>123.04</td>
<td>1.7</td>
<td>7.29</td>
<td>27.66</td>
<td>4.96</td>
</tr>
<tr>
<td>Q4</td>
<td>6.92</td>
<td>0.27</td>
<td>0.99</td>
<td>2.34</td>
<td>0.75</td>
</tr>
<tr>
<td>Q5</td>
<td>13.61</td>
<td>0.5</td>
<td>2.92</td>
<td>7.35</td>
<td>NA</td>
</tr>
<tr>
<td>Q6</td>
<td>123.84</td>
<td>12.5</td>
<td>21.83</td>
<td>78.42</td>
<td>NA</td>
</tr>
<tr>
<td>Q7</td>
<td>187.22</td>
<td>27.9</td>
<td>8.59</td>
<td>106.37</td>
<td>NA</td>
</tr>
<tr>
<td>Q8 (1y)</td>
<td>8.75</td>
<td>0.55</td>
<td>1.74</td>
<td>6.76</td>
<td>8.13</td>
</tr>
<tr>
<td>Q8 (2y)</td>
<td>102.17</td>
<td>1.1</td>
<td>3.68</td>
<td>28.82</td>
<td>16.54</td>
</tr>
<tr>
<td>Q8 (3y)</td>
<td>104.7</td>
<td>1.69</td>
<td>5.44</td>
<td>35.37</td>
<td>24.46</td>
</tr>
<tr>
<td>Q8 (4y)</td>
<td>107.05</td>
<td>2.12</td>
<td>7.22</td>
<td>41.66</td>
<td>32.49</td>
</tr>
<tr>
<td>Q8 (10y)</td>
<td>119.54</td>
<td>29.14</td>
<td>17.42</td>
<td>72.67</td>
<td>70.35</td>
</tr>
<tr>
<td>Q9</td>
<td>69.05</td>
<td>6.3</td>
<td>0.31</td>
<td>76.12</td>
<td>9.54</td>
</tr>
</table>
<p>All the results are available in the <a href="https://spreadsheets.google.com/a/percona.com/ccc?key=0AjsVX7AnrCYwdERIZFVqakRrcXplM0g0UktaUkRwenc&#038;hl=en#">summary Spreadsheet</a></p>
<p>I deliberately did not put TokuDB in the same table as the analytics-oriented databases, to highlight that TokuDB is a general-purpose OLTP engine.<br />
As you can see, it does at least as well as MyISAM on every query (Q4 is essentially a tie).</p>
    <hr noshade style="margin:0;height:1px" />
    <p>Entry posted by Vadim |
      <a href="http://www.mysqlperformanceblog.com/2009/11/05/air-traffic-queries-in-myisam-and-tokutek-tokudb/#comments">25 comments</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.mysqlperformanceblog.com/2009/11/05/air-traffic-queries-in-myisam-and-tokutek-tokudb/feed/</wfw:commentRss>
		<slash:comments>25</slash:comments>
		</item>
		<item>
		<title>Air traffic queries in InfiniDB: early alpha</title>
		<link>http://www.mysqlperformanceblog.com/2009/11/02/air-traffic-queries-in-infinidb-early-alpha/</link>
		<comments>http://www.mysqlperformanceblog.com/2009/11/02/air-traffic-queries-in-infinidb-early-alpha/#comments</comments>
		<pubDate>Mon, 02 Nov 2009 21:29:28 +0000</pubDate>
		<dc:creator>Vadim</dc:creator>
				<category><![CDATA[OLAP]]></category>
		<category><![CDATA[benchmarks]]></category>
		<category><![CDATA[dw]]></category>
		<category><![CDATA[mysql]]></category>

		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/?p=1593</guid>
		<description><![CDATA[As Calpont announced availability of InfiniDB I surely couldn't miss a chance to compare it with previously tested databases in the same environment.
See my previous posts on this topic:
Analyzing air traffic performance with InfoBright and MonetDB
Air traffic queries in LucidDB
I could not run all queries against InfiniDB and I met some hiccups during my experiment, [...]]]></description>
			<content:encoded><![CDATA[<p>As Calpont announced availability of <a href="http://infinidb.org/">InfiniDB</a> I surely couldn't miss a chance to compare it with previously tested databases in the same environment.<br />
See my previous posts on this topic:<br />
<a href="http://www.mysqlperformanceblog.com/2009/10/02/analyzing-air-traffic-performance-with-infobright-and-monetdb/">Analyzing air traffic performance with InfoBright and MonetDB</a><br />
<a href="http://www.mysqlperformanceblog.com/2009/10/26/air-traffic-queries-in-luciddb/">Air traffic queries in LucidDB</a></p>
<p>I could not run all queries against InfiniDB and I hit some hiccups during my experiment, so it was a less smooth experience than with the other databases.</p>
<p>So let's go through the same steps:</p>
<p><strong>Load data</strong></p>
<p>InfiniDB supports MySQL's <code>LOAD DATA</code> statement and its own <code>colxml / cpimport</code> utilities. As <code>LOAD DATA</code> is more familiar to me, I started with that; however, after I issued LOAD DATA on a 180MB file (for the 1st month of 1989), it very soon caused extensive swapping (my box has 4GB of RAM) and the statement failed with<br />
<code>ERROR 1 (HY000) at line 1: CAL0001: Insert Failed:  St9bad_alloc</code></p>
<p>Alright, <code>colxml / cpimport</code> was more successful; however, its syntax is less flexible than that of <code>LOAD DATA</code>, so I had to transform the input files into a format that <code>cpimport</code> could understand.</p>
<p>Total load time was <strong>9747 sec</strong> or  <strong>2.7h</strong> (not counting the time spent on transforming the files)</p>
<p>I put summary data on load time, datasize and query time into a <a href="https://spreadsheets.google.com/ccc?key=0AjsVX7AnrCYwdERIZFVqakRrcXplM0g0UktaUkRwenc&#038;hl=en">Google Spreadsheet</a> so you can easily compare with previous results. There are separate sheets for queries, datasize and load time.</p>
<p><strong>Datasize</strong></p>
<p>The size of the database after loading is another confusing point. The InfiniDB data directory has a complex structure like</p>
<div class="igBar"><span id="lcode-7"><a href="#" onclick="javascript:showPlainTxt('code-7'); return false;">PLAIN TEXT</a></span></div>
<div class="syntax_hilite"><span class="langName">CODE:</span>
<div id="code-7">
<div class="code">
<ol>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">./<span style="color:#800000;color:#800000;">000</span>.<span style="">dir</span>/<span style="color:#800000;color:#800000;">000</span>.<span style="">dir</span>/<span style="color:#800000;color:#800000;">003</span>.<span style="">dir</span>/<span style="color:#800000;color:#800000;">233</span>.<span style="">dir</span></div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">./<span style="color:#800000;color:#800000;">000</span>.<span style="">dir</span>/<span style="color:#800000;color:#800000;">000</span>.<span style="">dir</span>/<span style="color:#800000;color:#800000;">003</span>.<span style="">dir</span>/<span style="color:#800000;color:#800000;">233</span>.<span style="">dir</span>/<span style="color:#800000;color:#800000;">000</span>.<span style="">dir</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">./<span style="color:#800000;color:#800000;">000</span>.<span style="">dir</span>/<span style="color:#800000;color:#800000;">000</span>.<span style="">dir</span>/<span style="color:#800000;color:#800000;">003</span>.<span style="">dir</span>/<span style="color:#800000;color:#800000;">233</span>.<span style="">dir</span>/<span style="color:#800000;color:#800000;">000</span>.<span style="">dir</span>/FILE000.<span style="">cdf</span></div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">./<span style="color:#800000;color:#800000;">000</span>.<span style="">dir</span>/<span style="color:#800000;color:#800000;">000</span>.<span style="">dir</span>/<span style="color:#800000;color:#800000;">003</span>.<span style="">dir</span>/<span style="color:#800000;color:#800000;">241</span>.<span style="">dir</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">./<span style="color:#800000;color:#800000;">000</span>.<span style="">dir</span>/<span style="color:#800000;color:#800000;">000</span>.<span style="">dir</span>/<span style="color:#800000;color:#800000;">003</span>.<span style="">dir</span>/<span style="color:#800000;color:#800000;">241</span>.<span style="">dir</span>/<span style="color:#800000;color:#800000;">000</span>.<span style="">dir</span></div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">./<span style="color:#800000;color:#800000;">000</span>.<span style="">dir</span>/<span style="color:#800000;color:#800000;">000</span>.<span style="">dir</span>/<span style="color:#800000;color:#800000;">003</span>.<span style="">dir</span>/<span style="color:#800000;color:#800000;">241</span>.<span style="">dir</span>/<span style="color:#800000;color:#800000;">000</span>.<span style="">dir</span>/FILE000.<span style="">cdf</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">./<span style="color:#800000;color:#800000;">000</span>.<span style="">dir</span>/<span style="color:#800000;color:#800000;">000</span>.<span style="">dir</span>/<span style="color:#800000;color:#800000;">003</span>.<span style="">dir</span>/<span style="color:#800000;color:#800000;">238</span>.<span style="">dir</span></div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">./<span style="color:#800000;color:#800000;">000</span>.<span style="">dir</span>/<span style="color:#800000;color:#800000;">000</span>.<span style="">dir</span>/<span style="color:#800000;color:#800000;">003</span>.<span style="">dir</span>/<span style="color:#800000;color:#800000;">238</span>.<span style="">dir</span>/<span style="color:#800000;color:#800000;">000</span>.<span style="">dir</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">./<span style="color:#800000;color:#800000;">000</span>.<span style="">dir</span>/<span style="color:#800000;color:#800000;">000</span>.<span style="">dir</span>/<span style="color:#800000;color:#800000;">003</span>.<span style="">dir</span>/<span style="color:#800000;color:#800000;">238</span>.<span style="">dir</span>/<span style="color:#800000;color:#800000;">000</span>.<span style="">dir</span>/FILE000.<span style="">cdf</span></div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">./<span style="color:#800000;color:#800000;">000</span>.<span style="">dir</span>/<span style="color:#800000;color:#800000;">000</span>.<span style="">dir</span>/<span style="color:#800000;color:#800000;">003</span>.<span style="">dir</span>/<span style="color:#800000;color:#800000;">235</span>.<span style="">dir</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">./<span style="color:#800000;color:#800000;">000</span>.<span style="">dir</span>/<span style="color:#800000;color:#800000;">000</span>.<span style="">dir</span>/<span style="color:#800000;color:#800000;">003</span>.<span style="">dir</span>/<span style="color:#800000;color:#800000;">235</span>.<span style="">dir</span>/<span style="color:#800000;color:#800000;">000</span>.<span style="">dir</span></div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">./<span style="color:#800000;color:#800000;">000</span>.<span style="">dir</span>/<span style="color:#800000;color:#800000;">000</span>.<span style="">dir</span>/<span style="color:#800000;color:#800000;">003</span>.<span style="">dir</span>/<span style="color:#800000;color:#800000;">235</span>.<span style="">dir</span>/<span style="color:#800000;color:#800000;">000</span>.<span style="">dir</span>/FILE000.<span style="">cdf</span> </div>
</li>
</ol>
</div>
</div>
</div>
<p></p>
<p>so it's hard to say which files belong to which table. But after the load, the size of 000.dir is <strong>114G</strong>, which is twice as big as the original data files. <strong>SHOW TABLE STATUS</strong> does not really help here; it shows</p>
<div class="igBar"><span id="lcode-8"><a href="#" onclick="javascript:showPlainTxt('code-8'); return false;">PLAIN TEXT</a></span></div>
<div class="syntax_hilite"><span class="langName">CODE:</span>
<div id="code-8">
<div class="code">
<ol>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">Name: ontime</div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp;Engine: InfiniDB</div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp; &nbsp; &nbsp; &nbsp; Version: <span style="color:#800000;color:#800000;">10</span></div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp; &nbsp; &nbsp;Row_format: Dynamic</div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;Rows: <span style="color:#800000;color:#800000;">2000</span></div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp;Avg_row_length: <span style="color:#800000;color:#800000;">0</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp; &nbsp; Data_length: <span style="color:#800000;color:#800000;">0</span></div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">Max_data_length: <span style="color:#800000;color:#800000;">0</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp; &nbsp;Index_length: <span style="color:#800000;color:#800000;">0</span></div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp; &nbsp; &nbsp; Data_free: <span style="color:#800000;color:#800000;">0</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp;Auto_increment: NULL</div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp; &nbsp; Create_time: NULL</div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp; &nbsp; Update_time: NULL</div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp; &nbsp; &nbsp;Check_time: NULL</div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp; &nbsp; &nbsp; Collation: latin1_swedish_ci</div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp; &nbsp; &nbsp; &nbsp;Checksum: NULL</div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp;Create_options: </div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp; &nbsp; &nbsp; &nbsp; Comment: </div>
</li>
</ol>
</div>
</div>
</div>
<p>
with totally misleading information.</p>
<p>So I put <strong>114GB</strong> as the size of data after load, until someone shows me how to get the real size and explains what takes so much space.</p>
<p><strong>Queries</strong></p>
<p>First, the count-star query <code>SELECT count(*) FROM ontime</code> took <strong>2.67 sec</strong>, which shows that InfiniDB does not store a record counter but calculates the count pretty fast.</p>
<p>Q0:<br />
<code>select avg(c1) from (select year,month,count(*) as c1 from ontime group by YEAR,month) t;</code></p>
<p>Another bump: on this query InfiniDB complains<br />
<code><br />
ERROR 138 (HY000):<br />
The query includes syntax that is not supported by InfiniDB. Use 'show warnings;' to get more information. Review the Calpont InfiniDB Syntax guide for additional information on supported distributed syntax or consider changing the InfiniDB Operating Mode (infinidb_vtable_mode).<br />
mysql> show warnings;<br />
+-------+------+------------------------------------------------------------+<br />
| Level | Code | Message                                                    |<br />
+-------+------+------------------------------------------------------------+<br />
| Error | 9999 | Subselect in From clause is not supported in this release. |<br />
+-------+------+------------------------------------------------------------+<br />
</code></p>
<p>OK, so InfiniDB does not support derived tables (subselects in the FROM clause), which is a big limitation from my point of view.<br />
As a workaround I tried to create a temporary table, but got another error:</p>
<div class="igBar"><span id="lcode-9"><a href="#" onclick="javascript:showPlainTxt('code-9'); return false;">PLAIN TEXT</a></span></div>
<div class="syntax_hilite"><span class="langName">CODE:</span>
<div id="code-9">
<div class="code">
<ol>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">mysql&gt; create temporary table tq2 as <span style="color:#006600; font-weight:bold;">&#40;</span>select Year,Month,count<span style="color:#006600; font-weight:bold;">&#40;</span>*<span style="color:#006600; font-weight:bold;">&#41;</span> as c1 from ontime group by Year, Month<span style="color:#006600; font-weight:bold;">&#41;</span>;</div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">ERROR <span style="color:#800000;color:#800000;">122</span> <span style="color:#006600; font-weight:bold;">&#40;</span>HY000<span style="color:#006600; font-weight:bold;">&#41;</span>: Cannot open table handle for ontime. </div>
</li>
</ol>
</div>
</div>
</div>
<p></p>
<p>As the error message suggests, I set <code>infinidb_vtable_mode = 2</code>, which is:</p>
<div class="igBar"><span id="lcode-10"><a href="#" onclick="javascript:showPlainTxt('code-10'); return false;">PLAIN TEXT</a></span></div>
<div class="syntax_hilite"><span class="langName">CODE:</span>
<div id="code-10">
<div class="code">
<ol>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color:#800000;color:#800000;">2</span><span style="color:#006600; font-weight:bold;">&#41;</span> auto-switch mode: InfiniDB will attempt to process the query internally, if it </div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">cannot, it will automatically switch the query to run in row-by-row mode. </div>
</li>
</ol>
</div>
</div>
</div>
<p></p>
<p>but the query then took <strong>667 sec</strong>,</p>
<p>so I skip queries Q5, Q6 and Q7, which are also based on derived tables, as not supported by InfiniDB.</p>
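<p>For Q0 in particular, the derived table is not strictly needed: the average of the per-group counts equals the total number of rows divided by the number of distinct (Year, Month) groups. Below is a minimal sketch of that rewrite. It uses Python's built-in <code>sqlite3</code> module and a six-row toy table as a stand-in, so it only demonstrates that the two forms agree; it was not run against InfiniDB, and whether InfiniDB accepts <code>count(DISTINCT ...)</code> over an expression is an assumption to verify. The <code>Year * 100 + Month</code> encoding is an illustrative choice and assumes neither column is NULL.</p>
<pre>
```python
import sqlite3

# Toy stand-in for the ontime table; only the two grouping columns are needed.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ontime (Year INTEGER, Month INTEGER)")

# 3 flights in 2007-01, 1 flight in 2007-02, 2 flights in 2008-01.
rows = [(2007, 1)] * 3 + [(2007, 2)] + [(2008, 1)] * 2
conn.executemany("INSERT INTO ontime VALUES (?, ?)", rows)

# Original Q0 shape: average of per-(Year, Month) counts via a derived table.
derived = conn.execute(
    "SELECT avg(c1) FROM (SELECT Year, Month, count(*) AS c1 "
    "FROM ontime GROUP BY Year, Month) t"
).fetchone()[0]

# Rewrite without a derived table: total rows / number of distinct groups.
# Year * 100 + Month encodes each (Year, Month) pair as a single value.
flat = conn.execute(
    "SELECT count(*) * 1.0 / count(DISTINCT Year * 100 + Month) FROM ontime"
).fetchone()[0]

print(derived, flat)
```
</pre>
<p>On the toy data both queries return 2.0 (six rows in three groups). Q5, Q6 and Q7 join two derived tables, so this particular shortcut does not carry over to them as is.</p>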
<p>Other queries (again, see the comparison with other engines in the <a href="https://spreadsheets.google.com/ccc?key=0AjsVX7AnrCYwdERIZFVqakRrcXplM0g0UktaUkRwenc&#038;hl=en">Google Spreadsheet</a> or in the summary table at the bottom):</p>
<p>Query Q1:<br />
<code>mysql> SELECT DayOfWeek, count(*) AS c FROM ontime WHERE Year BETWEEN 2000 AND 2008 GROUP BY DayOfWeek ORDER BY c DESC;</code><br />
7 rows in set (<strong>6.79 sec</strong>)</p>
<p>Query Q2:<br />
<code>mysql> SELECT DayOfWeek, count(*) AS c FROM ontime WHERE DepDelay>10 AND Year BETWEEN 2000 AND 2008 GROUP BY DayOfWeek ORDER BY c DESC;<br />
</code><br />
7 rows in set (<strong>4.59 sec</strong>)</p>
<p>Query Q3:<br />
<code>SELECT Origin, count(*) AS c FROM ontime WHERE DepDelay>10 AND Year BETWEEN 2000 AND 2008 GROUP BY Origin ORDER BY c DESC LIMIT 10;<br />
</code><br />
<strong>4.96 sec</strong></p>
<p>Query Q4:<br />
<code>mysql> SELECT Carrier, count(*) FROM ontime WHERE DepDelay > 10 AND YearD=2007 GROUP BY Carrier ORDER BY 2 DESC;<br />
</code></p>
<p>I had another surprise with this query: after 15 min it had not returned results. I checked the system and it was totally idle, but the query was stuck. I killed the query and restarted mysqld, but could not connect to mysqld anymore. In the process list I saw that InfiniDB had started a couple of external processes: <code>ExeMgr, DDLProc, PrimProc, controllernode fg, workernode DBRM_Worker1 fg</code>, which cooperate with each other using IPC shared memory and semaphores. To clean up the system I rebooted the server, and only after that was mysqld able to start.</p>
<p>After that, query Q4 took <strong>0.75 sec</strong>.</p>
<p>Queries Q5-Q7 skipped.</p>
<p>Query Q8:</p>
<p><code>SELECT DestCityName, COUNT( DISTINCT OriginCityName) FROM ontime WHERE YearD BETWEEN 2008 and 2008 GROUP BY DestCityName ORDER BY 2 DESC LIMIT 10;<br />
</code></p>
<p>And times for InfiniDB:</p>
<p><strong>1y:  8.13 sec<br />
2y:  16.54 sec<br />
3y:  24.46 sec<br />
4y: 32.49 sec<br />
10y: 1 min 10.35 sec</strong></p>
<p>Query Q9:<br />
<code>select Year ,count(*) as c1 from ontime group by Year;<br />
</code><br />
Time: <strong>9.54 sec</strong></p>
<p>Ok, so here is the summary table with query times (in sec, less is better):</p>
<table border=1>
<tr>
<td>Query</td>
<td>MonetDB</td>
<td>InfoBright</td>
<td>LucidDB</td>
<td>InfiniDB</td>
</tr>
<tr>
<td>Q0</td>
<td>29.9</td>
<td><strong>4.19</strong></td>
<td>103.21</td>
<td>NA</td>
</tr>
<tr>
<td>Q1</td>
<td>7.9</td>
<td>12.13</td>
<td>49.17</td>
<td><strong>6.79</strong></td>
</tr>
<tr>
<td>Q2</td>
<td><strong>0.9</strong></td>
<td>6.73</td>
<td>27.13</td>
<td>4.59</td>
</tr>
<tr>
<td>Q3</td>
<td><strong>1.7</strong></td>
<td>7.29</td>
<td>27.66</td>
<td>4.96</td>
</tr>
<tr>
<td>Q4</td>
<td><strong>0.27</strong></td>
<td>0.99</td>
<td>2.34</td>
<td>0.75</td>
</tr>
<tr>
<td>Q5</td>
<td><strong>0.5</strong></td>
<td>2.92</td>
<td>7.35</td>
<td>NA</td>
</tr>
<tr>
<td>Q6</td>
<td><strong>12.5</strong></td>
<td>21.83</td>
<td>78.42</td>
<td>NA</td>
</tr>
<tr>
<td>Q7</td>
<td>27.9</td>
<td><strong>8.59</strong></td>
<td>106.37</td>
<td>NA</td>
</tr>
<tr>
<td>Q8 (1y)</td>
<td><strong>0.55</strong></td>
<td>1.74</td>
<td>6.76</td>
<td>8.13</td>
</tr>
<tr>
<td>Q8 (2y)</td>
<td><strong>1.1</strong></td>
<td>3.68</td>
<td>28.82</td>
<td>16.54</td>
</tr>
<tr>
<td>Q8 (3y)</td>
<td><strong>1.69</strong></td>
<td>5.44</td>
<td>35.37</td>
<td>24.46</td>
</tr>
<tr>
<td>Q8 (4y)</td>
<td><strong>2.12</strong></td>
<td>7.22</td>
<td>41.66</td>
<td>32.49</td>
</tr>
<tr>
<td>Q8 (10y)</td>
<td>29.14</td>
<td><strong>17.42</strong></td>
<td>72.67</td>
<td>70.35</td>
</tr>
<tr>
<td>Q9</td>
<td>6.3</td>
<td><strong>0.31</strong></td>
<td>76.12</td>
<td>9.54</td>
</tr>
</table>
<p><strong>Conclusions</strong></p>
<ul>
<li>InfiniDB server version shows <code>Server version: 5.1.39-community InfiniDB Community Edition 0.9.4.0-5-alpha (GPL)</code>, so I consider it an alpha release, and it is doing OK for an alpha. I will wait for a more stable release for further tests, as it took a good amount of time to deal with different glitches.</li>
<li>InfiniDB shows really good times for the queries it can handle, quite often better than InfoBright.</li>
<li>Inability to handle derived tables is a significant drawback for me; I hope it will be fixed.</li>
</ul>
    <hr noshade style="margin:0;height:1px" />
    <p>Entry posted by Vadim |
      <a href="http://www.mysqlperformanceblog.com/2009/11/02/air-traffic-queries-in-infinidb-early-alpha/#comments">18 comments</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.mysqlperformanceblog.com/2009/11/02/air-traffic-queries-in-infinidb-early-alpha/feed/</wfw:commentRss>
		<slash:comments>18</slash:comments>
		</item>
		<item>
		<title>Air traffic queries in LucidDB</title>
		<link>http://www.mysqlperformanceblog.com/2009/10/26/air-traffic-queries-in-luciddb/</link>
		<comments>http://www.mysqlperformanceblog.com/2009/10/26/air-traffic-queries-in-luciddb/#comments</comments>
		<pubDate>Mon, 26 Oct 2009 17:10:31 +0000</pubDate>
		<dc:creator>Vadim</dc:creator>
				<category><![CDATA[OLAP]]></category>
		<category><![CDATA[benchmarks]]></category>
		<category><![CDATA[dw]]></category>
		<category><![CDATA[mysql]]></category>

		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/?p=1537</guid>
		<description><![CDATA[After my first post Analyzing air traffic performance with InfoBright and MonetDB where I was not able to finish task with LucidDB, John Sichi contacted me with help to setup. You can see instruction how to load data on LucidDB Wiki page
You can find the description of benchmark in original post, there I will show [...]]]></description>
			<content:encoded><![CDATA[<p>After my first post, <a href="http://www.mysqlperformanceblog.com/2009/10/02/analyzing-air-traffic-performance-with-infobright-and-monetdb/">Analyzing air traffic performance with InfoBright and MonetDB</a>, where I was not able to finish the task with LucidDB, John Sichi contacted me and helped with the setup. You can see instructions on how to load the data on the <a href="http://pub.eigenbase.org/wiki/LucidDbOtp">LucidDB Wiki page</a>.</p>
<p>You can find the description of the benchmark in the original post; here I will show the numbers I have for LucidDB vs the previous systems.</p>
<p><strong>Load time</strong><br />
Loading the data into LucidDB in a single thread took 15273 sec, or <strong>4.24h</strong>. Unlike the other systems, LucidDB supports multi-threaded load; with concurrency 2 (as I have only 2 cores on that box), the load time is 9955 sec, or <strong>2.76h</strong>. For comparison,<br />
for InfoBright the load time is <strong>2.45h</strong> and for MonetDB it is <strong>2.6h</strong>.</p>
<p><strong>Data size</strong><br />
Another interesting metric is the data size after load. In LucidDB the db file after load takes <strong>9.3GB</strong>.<br />
<strong>UPDATE 27-Oct-2009</strong> From the metadata table, the actual size of the data is <strong>4.5GB</strong>; the 9.3GB is the size of the physical file db.dat, which probably was not truncated after several loads of data.</p>
<p>For InfoBright it is <strong>1.6GB</strong>, and for MonetDB it is <strong>65GB</strong>. Obviously LucidDB uses some compression, but it is not as aggressive as in the InfoBright case. As the original dataset is 55GB, the compression rate for LucidDB is about <strong>1:12</strong>.</p>
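<p>The compression figures follow from dividing the 55GB source size by the size after load. A small sketch of that arithmetic in Python, using only the sizes quoted in this post (the function and dictionary names are illustrative):</p>
<pre>
```python
ORIGINAL_GB = 55.0  # size of the source dataset, as stated in the post

# Sizes after load in GB, as reported in the post.
sizes_gb = {
    "LucidDB (metadata)": 4.5,
    "LucidDB (db.dat file)": 9.3,
    "InfoBright": 1.6,
    "MonetDB": 65.0,
}


def compression_ratio(original_gb, stored_gb):
    """Return how many GB of source data are held per GB on disk."""
    return original_gb / stored_gb


ratios = {
    name: round(compression_ratio(ORIGINAL_GB, size), 1)
    for name, size in sizes_gb.items()
}

for name, ratio in ratios.items():
    print(f"{name}: 1:{ratio}")
```
</pre>
<p>The 4.5GB metadata figure gives about 1:12.2 for LucidDB and the 1.6GB figure gives about 1:34.4 for InfoBright, while MonetDB at 65GB is larger than the source data.</p>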
<p><strong>Query times</strong></p>
<p>Here is the list of queries and times for all systems.</p>
<p>- Lame query "count star"<br />
LucidDB:<br />
<code>SELECT count(*) FROM otp."ontime";<br />
</code>1 row selected (55.165 seconds)</p>
<p>Both InfoBright and MonetDB returned the result immediately.<br />
It seems LucidDB has to scan the whole table to get the result.</p>
<p>- Q0:<br />
<code>select avg(c1) from (select "Year","Month",count(*) as c1 from otp."ontime" group by "Year","Month") t;</code><br />
LucidDB: <strong>103.205 seconds</strong><br />
InfoBright: <strong>4.19 sec</strong><br />
MonetDB: <strong>29.9 sec</strong></p>
<p>- Q1:<br />
<code>SELECT "DayOfWeek", count(*) AS c FROM OTP."ontime" WHERE "Year" BETWEEN 2000 AND 2008 GROUP BY "DayOfWeek" ORDER BY c DESC;</code><br />
LucidDB: <strong>49.17 seconds</strong><br />
InfoBright: <strong>12.13 sec</strong><br />
MonetDB: <strong>7.9 sec</strong></p>
<p>- Q2:<br />
<code>SELECT "DayOfWeek", count(*) AS c FROM otp."ontime" WHERE "DepDelay">10 AND "Year" BETWEEN 2000 AND 2008 GROUP BY "DayOfWeek" ORDER BY c DESC;</code><br />
LucidDB: <strong>27.131 seconds</strong><br />
InfoBright: <strong>6.37 sec</strong><br />
MonetDB: <strong>0.9 sec</strong></p>
<p>- Q3:<br />
<code>!set rowlimit 10<br />
SELECT "Origin", count(*) AS c FROM otp."ontime" WHERE "DepDelay">10 AND "Year" BETWEEN 2000 AND 2008 GROUP BY "Origin" ORDER BY c DESC;</code><br />
LucidDB: <strong>27.664 seconds</strong><br />
InfoBright: <strong>7.29 sec</strong><br />
MonetDB: <strong>1.7 sec</strong></p>
<p>- Q4:<br />
<code>SELECT "Carrier", count(*) FROM otp."ontime" WHERE "DepDelay">10 AND "Year"=2007 GROUP BY "Carrier" ORDER BY 2 DESC;</code><br />
LucidDB: <strong>2.338 seconds</strong><br />
InfoBright: <strong>0.99 sec</strong><br />
MonetDB: <strong>0.27 sec</strong></p>
<p>- Q5:<br />
<code>SELECT t."Carrier", c, c2, c*1000/c2 as c3 FROM (SELECT "Carrier", count(*) AS c FROM OTP."ontime" WHERE "DepDelay">10 AND "Year"=2007 GROUP BY "Carrier") t JOIN (SELECT "Carrier", count(*) AS c2 FROM OTP."ontime" WHERE "Year"=2007 GROUP BY "Carrier") t2 ON (t."Carrier"=t2."Carrier") ORDER BY c3 DESC;</code><br />
LucidDB: <strong>7.351 seconds</strong><br />
InfoBright: <strong>2.92 sec</strong><br />
MonetDB: <strong>0.5 sec</strong></p>
<p>- Q6:<br />
<code>SELECT t."Carrier", c, c2, c*1000/c2 as c3 FROM (SELECT "Carrier", count(*) AS c FROM OTP."ontime" WHERE "DepDelay">10 AND "Year" BETWEEN 2000 AND 2008 GROUP BY "Carrier") t JOIN (SELECT "Carrier", count(*) AS c2 FROM OTP."ontime" WHERE "Year" BETWEEN 2000 AND 2008 GROUP BY "Carrier") t2 ON (t."Carrier"=t2."Carrier") ORDER BY c3 DESC;</code><br />
LucidDB: <strong>78.423 seconds</strong><br />
InfoBright: <strong>21.83 sec</strong><br />
MonetDB: <strong>12.5 sec</strong></p>
<p>- Q7:<br />
<code>SELECT t."Year", c1/c2 FROM (select "Year", count(*)*1000 as c1 from OTP."ontime" WHERE "DepDelay">10 GROUP BY "Year") t JOIN (select "Year", count(*) as c2 from OTP."ontime" GROUP BY "Year") t2 ON (t."Year"=t2."Year");</code><br />
LucidDB: <strong>106.374 seconds</strong><br />
InfoBright: <strong>8.59 sec</strong><br />
MonetDB: <strong>27.9 sec</strong></p>
<p>- Q8:<br />
<code>SELECT "DestCityName", COUNT( DISTINCT "OriginCityName") FROM "ontime" WHERE "Year" BETWEEN 2008 and 2008 GROUP BY "DestCityName" ORDER BY 2 DESC;</code></p>
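<table border=1>
<tr>
<td>Years</td>
<td>LucidDB</td>
<td>InfoBright</td>
<td>MonetDB</td>
</tr>
<tr>
<td>1y</td>
<td>6.76s</td>
<td>1.74s</td>
<td>0.55s</td>
</tr>
<tr>
<td>2y</td>
<td>28.82s</td>
<td>3.68s</td>
<td>1.10s</td>
</tr>
<tr>
<td>3y</td>
<td>35.37s</td>
<td>5.44s</td>
<td>1.69s</td>
</tr>
<tr>
<td>4y</td>
<td>41.66s</td>
<td>7.22s</td>
<td>2.12s</td>
</tr>
<tr>
<td>10y</td>
<td>72.67s</td>
<td>17.42s</td>
<td>29.14s</td>
</tr>
</table>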
<p>- Q9:<br />
<code>select "Year" ,count(*) as c1 from "ontime" group by "Year";</code><br />
LucidDB: <strong>76.121 seconds</strong><br />
InfoBright: <strong>0.31 sec</strong><br />
MonetDB: <strong>6.3 sec</strong></p>
<p>As you see, LucidDB is not showing the best results. However, on the good side, LucidDB is very feature-rich, with full support of DML statements. The ETL features are also very impressive: you can extract, filter and transform external data (there is even access to MySQL via a JDBC driver) just in SQL queries (compare with the single LOAD DATA statement in InfoBright ICE edition). Also, I am not much into Java, but as I understand it LucidDB can be easily integrated with Java applications, which is important if your development is Java based.</p>
<p>It is worth mentioning that in LucidDB a single query execution takes 100% of user time on a single CPU, which may signal that there is some low-hanging fruit for optimization. OProfile can show clear places to fix.</p>
    <hr noshade style="margin:0;height:1px" />
    <p>Entry posted by Vadim |
      <a href="http://www.mysqlperformanceblog.com/2009/10/26/air-traffic-queries-in-luciddb/#comments">10 comments</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.mysqlperformanceblog.com/2009/10/26/air-traffic-queries-in-luciddb/feed/</wfw:commentRss>
		<slash:comments>10</slash:comments>
		</item>
		<item>
		<title>MySQL-Memcached or NOSQL Tokyo Tyrant &#8211; part 3</title>
		<link>http://www.mysqlperformanceblog.com/2009/10/19/mysql_memcached_tyrant_part3/</link>
		<comments>http://www.mysqlperformanceblog.com/2009/10/19/mysql_memcached_tyrant_part3/#comments</comments>
		<pubDate>Mon, 19 Oct 2009 16:00:43 +0000</pubDate>
		<dc:creator>matt</dc:creator>
				<category><![CDATA[NOSQL]]></category>
		<category><![CDATA[benchmarks]]></category>
		<category><![CDATA[mysql]]></category>
		<category><![CDATA[tips]]></category>
		<category><![CDATA[tuning]]></category>

		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/?p=1435</guid>
		<description><![CDATA[This is part 3 of our series.  In part 1 we talked about boosting performance with memcached on top of MySQL, in Part 2 we talked about running 100% outside the data with memcached, and now in Part 3 we are going to look at a possible solution to free you from the database.  The [...]]]></description>
			<content:encoded><![CDATA[<p>This is part 3 of our series.  In <a href="http://www.mysqlperformanceblog.com/2009/10/15/mysql-memcached-or-nosql-tokyo-tyrant-part-1/">part 1</a> we talked about boosting performance with memcached on top of MySQL, in <a href="http://www.mysqlperformanceblog.com/2009/10/16/mysql_memcached_tyrant_part2/">Part 2</a> we talked about running 100% outside the database with memcached, and now in Part 3 we are going to look at a possible solution to free you from the database.  The solution I am going to discuss here is Tokyo Cabinet and Tyrant.</p>
<p>I am not going to give you a primer or tutorial on Tyrant and Cabinet; there are plenty of these out there already.  Instead I want to see what sort of performance we can get compared to MySQL and Memcached, and later on other NoSQL solutions.  Tokyo actually supports several types of databases: there are hash databases, which are very similar to memcached; a table database, which is similar to your classic database tables where you can add a where clause and search individual columns; and a ton more "database options" beyond just those two.  Again, my goal is not to make this a Tokyo Tyrant tutorial but rather to show one potential role it can play.</p>
<p>More details can be found here:<br />
<a href="http://1978th.net/tokyotyrant/"> http://1978th.net/tokyotyrant/</a><br />
<a href="http://1978th.net/tokyocabinet/"> http://1978th.net/tokyocabinet/</a></p>
<p>So if we can get performance similar to memcached with Tokyo Tyrant when using disk based hash tables, it would be a compelling replacement for our application here.  It should provide the same interface and the same access we saw in memcached, but with disk persistence. So let's look at the numbers:</p>
<p><img class="aligncenter size-full wp-image-1519" title="Tyrant -vs- memcached" src="http://www.mysqlperformanceblog.com/wp-content/uploads/2009/10/more_tc_numbers_html_m31feebd3.jpg" alt="Tyrant -vs- memcached" width="662" height="354" /></p>
<p>Tyrant's disk based hash was almost 2x faster than combining memcached and MySQL, and about 20% slower than the all memory memcached approach.  So for this particular application I would have been much better off not storing my data in MySQL and instead looking outside the database for an answer.  Now sure, there are other reasons you may want to keep data in the database... but I am trying to get you to think about your application and whether those reasons are really valid.  Helping clients pick the right solution is one of the things we do here at Percona.  If an application requires a database, great, but if there is a better solution we want to suggest it.  It's our goal to make your application perform optimally.</p>
<p>Finally, one concern you have to have is the scalability of your storage solution.  As load, number of threads, and data size increase, how does performance change?  One knock on Tokyo -vs- Memcached is that Tokyo is not distributed by default.  Now that's not to say we could not shard it based on a hash, or even build an API with the capability built in ( or use the memcached clients, which works! )...  but native support is lacking.  It does support replication, which could make for some rather interesting architectures in the future.</p>
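<p>Sharding by a hash of the key, as mentioned above, can be sketched in a few lines. This is only an illustration in Python: the server list is a placeholder, <code>shard_for</code> is a hypothetical helper, and CRC32 with a modulo is just one simple choice of hash, not what any particular client library does.</p>
<pre>
```python
import zlib

# Placeholder Tyrant endpoints; host names and ports are made up for the example.
SERVERS = ["tyrant-a:1978", "tyrant-b:1978", "tyrant-c:1978"]


def shard_for(key, servers=SERVERS):
    """Map a key to one server; the same key always lands on the same server."""
    digest = zlib.crc32(key.encode("utf-8"))
    return servers[digest % len(servers)]


for key in ("user:1", "user:2", "user:3"):
    print(key, shard_for(key))
```
</pre>
<p>With plain modulo hashing, changing the number of servers remaps most keys, which is one reason many memcached clients offer consistent hashing instead.</p>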
<p>So let's look at some scalability benchmarks. My server resources are rather limited, but I thought I should try throwing more threads and work at the server until it hit its limit and fell over dead.  It's interesting to see the number of transactions that occur with a given number of threads.  Let's look at some of these:</p>
<p><img class="size-full wp-image-1426" title="Threads &amp; Application performance" src="http://www.mysqlperformanceblog.com/wp-content/uploads/2009/10/db_right_option_html_m52d249fc.gif" alt="Tyrant/MySQL/Memcached Thread Benchmark performance" width="663" height="291" /></p>
<p>As expected the smaller buffer pool struggled ( why a smaller buffer pool?  This simulates a much larger data set.  A BP of 256M with 1GB of data can give similar performance to 20GB of data and a 5GB BP ).  So with a 256M BP and 4GB of memcached we were well off the numbers we hit with a 4GB BP + 4GB of memcached ( which is expected ).   Adding more threads, even up to 128, increased overall throughput, but my load average on the server hit 40 and my CPU was pegged across the board.  Also interesting is that I started to hit bottlenecks in MySQL/InnoDB when I had enough memory but increased the threads from 64 to 128.  As time permits I should revisit this and look at increased datasets, and look for areas where Tyrant may stumble a bit.</p>
<p>Bottom line: given a specific application and data pattern, sometimes a relational database is not the appropriate place for storing data.  A tool like Tokyo Tyrant may not be for everyone or every application, but neither is a relational database.  Before building your next application, try to understand whether an RDBMS is really needed or not.</p>
<p>How did I do these tests:</p>
<p>The above numbers were run with 32 threads. Tyrant was started with 8 threads and 128M of memory, memcached ( version 1.4 ) was started with 16 threads, and MySQL was 5.1 with XtraDB.  Each environment had 2 tables, each with 2 million rows.  The data was identical. memcached and Tyrant stored a comma delimited string to represent the row.   MySQL was running with 256M allocated to the InnoDB buffer pool unless otherwise noted.</p>
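<p>Storing a row as a comma delimited string, as described above, might look like the sketch below. The post does not say how fields were quoted or escaped, so using Python's <code>csv</code> module here is an assumption, and the example row is made up.</p>
<pre>
```python
import csv
import io


def row_to_value(row):
    """Serialize one row (a sequence of fields) into a comma delimited string."""
    buffer = io.StringIO()
    csv.writer(buffer).writerow(row)
    # The writer appends a line terminator; drop it so only the row remains.
    return buffer.getvalue().rstrip("\r\n")


def value_to_row(value):
    """Parse a stored string back into a list of string fields."""
    return next(csv.reader([value]))


value = row_to_value([42, "Smith, John", "2009-10-19"])
print(value)
print(value_to_row(value))
```
</pre>
<p>Fields come back as strings, so numeric columns need converting on read, and a field that itself contains a comma is quoted by the writer.</p>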
<p>What's next?  Well, next I am going to try and continue this series by exploring and benchmarking other NOSQL options and comparing them to database based solutions.  I think showing the performance of a couple of different Tokyo database formats would also be interesting.  What other solutions are people interested in?  I know I have gotten a lot of requests for Cassandra numbers, but what else?  Drop a comment and let me know!</p>
    <hr noshade style="margin:0;height:1px" />
    <p>Entry posted by matt |
      <a href="http://www.mysqlperformanceblog.com/2009/10/19/mysql_memcached_tyrant_part3/#comments">24 comments</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.mysqlperformanceblog.com/2009/10/19/mysql_memcached_tyrant_part3/feed/</wfw:commentRss>
		<slash:comments>24</slash:comments>
		</item>
		<item>
		<title>MySQL-Memcached or NOSQL Tokyo Tyrant &#8211; part 2</title>
		<link>http://www.mysqlperformanceblog.com/2009/10/16/mysql_memcached_tyrant_part2/</link>
		<comments>http://www.mysqlperformanceblog.com/2009/10/16/mysql_memcached_tyrant_part2/#comments</comments>
		<pubDate>Fri, 16 Oct 2009 16:00:41 +0000</pubDate>
		<dc:creator>matt</dc:creator>
				<category><![CDATA[NOSQL]]></category>
		<category><![CDATA[benchmarks]]></category>
		<category><![CDATA[memcached]]></category>
		<category><![CDATA[mysql]]></category>
		<category><![CDATA[tips]]></category>

		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/?p=1433</guid>
		<description><![CDATA[Part 1 of our series set-up our "test"  application and looked at boosting performance of the application by buffer MySQL with memcached.  Our test application is simple and requires only 3 basic operations per transaction 2 reads and 1 write.  Using memcached combined with MySQL we ended up nearly getting a 10X performance boost from [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.mysqlperformanceblog.com/2009/10/15/mysql-memcached-or-nosql-tokyo-tyrant-part-1/">Part 1</a> of our series set up our "test" application and looked at boosting performance of the application by buffering MySQL with memcached.  Our test application is simple and requires only 3 basic operations per transaction: 2 reads and 1 write.  Using memcached combined with MySQL we ended up getting nearly a 10X performance boost for the application.  Now we are going to look at what we could achieve if we did not have to write to the database at all.  So let's look at what happens if we push everything, including writes, into memcached.</p>
<p><img class="size-full wp-image-1427" title="Benchmarks if everything is in memcached" src="http://www.mysqlperformanceblog.com/wp-content/uploads/2009/10/db_right_option_html_m62d9ce6b.gif" alt="Benchmarks if everything is in memcached" width="641" height="335" /></p>
<p>Wow, that's shockingly fast, isn't it! I guess being completely in memory helps for this app.  What is very interesting is that accessing 100% of the data in memcached gives very similar numbers to accessing 100% of the data in memory in the DB ( part 1 benchmarked a 4GB BP as being able to handle 7K TPS )... something is not 100% right here.  It stands to reason that memcached should be faster for this application than the DB.  It's just doing two gets via key and 1 set.  So why the similar numbers?</p>
<p>Well, glad you asked.  It's the API.  The API in this case was Cache::Memcached; by switching to Cache::Memcached::Fast, look what happens:</p>
<p><img class="size-full wp-image-1428" title="Memcached API - Fast" src="http://www.mysqlperformanceblog.com/wp-content/uploads/2009/10/db_right_option_html_m74e8a5a8.gif" alt="Memcached API - Fast" width="704" height="317" /></p>
<p>That is a nice jump in performance!</p>
<p>Using Memcached::Fast was kind of a mixed bag when looking at the benchmarks for mixing MySQL and Memcached in my tests:</p>
<p><img class="size-full wp-image-1421" title="Memcached Api's" src="http://www.mysqlperformanceblog.com/wp-content/uploads/2009/10/db_right_option_html_2ff88cd9.gif" alt="Sometimes Api changes can make a huge difference" width="667" height="340" /></p>
<p>In this case I think the Fast API was slower when working with MySQL with a 256M BP because the slower returns from memcached acted as a bottleneck to thin the demands on MySQL to write data, smoothing out the workload.  When we eliminate this bottleneck with the Fast API, MySQL gets hammered.  This type of thing happens a lot.  For example, an application is CPU bound, so you add more processing power, but then you hit the disks harder and now you're disk bound.</p>
<p>A couple of good things to remember here:  #1 resolving one bottleneck can open another bottleneck that is much worse.  #2 not all APIs are created equal.  Additionally, the configuration and setup that works well on one system may not work well on another.  Because of this, people often leave lots of performance on the table.  Don't just trust that your current API or config is optimal; test and make sure it fits your application.</p>
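<p>One way to test rather than trust is to run the same transaction loop against each client and compare throughput. The sketch below is in Python (the Cache::Memcached modules named above are Perl), and <code>DictCache</code> is an in-memory stand-in, not a real memcached client; any object with <code>get</code> and <code>set</code> methods could be passed in its place. The key names and the value written are made up; only the two-gets-and-one-set shape comes from this series.</p>
<pre>
```python
import time


class DictCache:
    """In-memory stand-in for a cache client exposing get and set."""

    def __init__(self):
        self.store = {}

    def get(self, key):
        return self.store.get(key)

    def set(self, key, value):
        self.store[key] = value


def run_transaction(cache, i):
    """One transaction in the shape described in the series: two gets, one set."""
    first = cache.get(f"a:{i}")
    second = cache.get(f"b:{i}")
    cache.set(f"a:{i}", f"{first},{second}")


def transactions_per_second(cache, count=10000):
    """Run count transactions over 100 rotating keys and return the rate."""
    start = time.perf_counter()
    for i in range(count):
        run_transaction(cache, i % 100)
    elapsed = time.perf_counter() - start
    return count / elapsed


print(f"{transactions_per_second(DictCache()):.0f} transactions/sec")
```
</pre>
<p>Running the same function against two client objects on the same machine gives a like-for-like comparison of the client layer alone; it says nothing about how the backend behind it will cope with the extra load.</p>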
<p>So adding Memcached on top of MySQL for our test application can significantly boost performance. But notice that if we were running 100% in memcached and could cut out MySQL, we could get 2.5x more performance over a mixed solution and 100X over just stock MySQL.  As the number of writes against the database increases, this gap will increase.  So let's ditch the database!  But wait!  You need the DB for persistence, right?</p>
<p>It depends.  A database may not be the best fit for every application.  There are several “NOSQL” solutions out in the open source space that can give you some of the ease of Memcached but with the persistence most people use their database for.   Each application is different, and understanding the application's requirements is key to picking an appropriate solution.   I am going to look at several database alternatives over the next few months.  I need to start somewhere, so I decided to start with Tokyo Tyrant and Cabinet.    So stop in next time for part 3 of this series, where we will focus on running the same tests against Tokyo Tyrant.</p>
    <hr noshade style="margin:0;height:1px" />
    <p>Entry posted by matt |
      <a href="http://www.mysqlperformanceblog.com/2009/10/16/mysql_memcached_tyrant_part2/#comments">7 comments</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.mysqlperformanceblog.com/2009/10/16/mysql_memcached_tyrant_part2/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
		</item>
		<item>
		<title>MySQL-Memcached or NOSQL Tokyo Tyrant &#8211; part 1</title>
		<link>http://www.mysqlperformanceblog.com/2009/10/15/mysql-memcached-or-nosql-tokyo-tyrant-part-1/</link>
		<comments>http://www.mysqlperformanceblog.com/2009/10/15/mysql-memcached-or-nosql-tokyo-tyrant-part-1/#comments</comments>
		<pubDate>Thu, 15 Oct 2009 18:24:30 +0000</pubDate>
		<dc:creator>matt</dc:creator>
				<category><![CDATA[NOSQL]]></category>
		<category><![CDATA[benchmarks]]></category>
		<category><![CDATA[memcached]]></category>
		<category><![CDATA[mysql]]></category>
		<category><![CDATA[tips]]></category>

		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/?p=1419</guid>
		<description><![CDATA[All too often people force themselves into using a database like MySQL with no thought as to whether it's the best solution to their problem. Why?  Because their other applications use it, so why not the new application?  Over the past couple of months I have been doing a ton of work for clients who [...]]]></description>
			<content:encoded><![CDATA[<p>All too often people force themselves into using a database like MySQL with no thought as to whether it's the best solution to their problem. Why?  Because their other applications use it, so why not the new application?  Over the past couple of months I have been doing a ton of work for clients who use their database like most people use memcached.  Look up a row based on a key, update the data in the row, stuff the row back in the database.  Rinse and repeat.  Sure, these setups vary sometimes, throwing in a “lookup” via username, or even the rare count.  But for the most part they are designed to be simple.</p>
<p>A classic example is a simple online game.  An online game may only require that the application retrieve a single record from the database.  The record may contain all the vital stats for the game, be updated, and be stuffed back into the database.  You would be surprised how many people use this type of system; I run into this type of application frequently.  Keeping it simple ensures that the application is generally lean and mean and performs well.  The issue is that even this simple design can start to have problems as the data size increases and you blow through your available memory.  Is there a better architecture?  Is there a way to get more scalability out of your database?  Is the database even the best place for this data?</p>
<p>I decided to walk through setting up a very simple application that does what I have seen many clients do.  Using this application I can then compare using MySQL to using MySQL + Memcached, and then to other solutions like Tokyo Tyrant or Cassandra.   My application does the following:</p>
<p>A.)  Read a row from a database based on an integer-based primary key<br />
B.)  Update data from that row and replace the stored contents on disk<br />
C.)  Use the data from that row to look up a row in another table based on a text field ( called email address ).</p>
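<p>To make those three steps concrete, here is a minimal sketch of one “transaction”.  It is not the benchmark code: sqlite3 stands in for MySQL so the example runs anywhere, the table and column names ( players, profiles, score, nickname ) are made up for illustration, and it loads 1,000 rows instead of 5M.</p>

```python
import random
import sqlite3

# sqlite3 is only a stand-in for MySQL; all table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE players (id INTEGER PRIMARY KEY, email TEXT, score INTEGER);
    CREATE TABLE profiles (email TEXT PRIMARY KEY, nickname TEXT);
""")
rows = [(i, f"user{i}@example.com", 0) for i in range(1, 1001)]
conn.executemany("INSERT INTO players VALUES (?, ?, ?)", rows)
conn.executemany("INSERT INTO profiles VALUES (?, ?)",
                 [(email, f"nick{i}") for i, email, _ in rows])
conn.commit()


def run_transaction(conn, player_id):
    # A) read a row by its integer primary key
    email, score = conn.execute(
        "SELECT email, score FROM players WHERE id = ?", (player_id,)).fetchone()
    # B) update data from that row and write it back
    conn.execute("UPDATE players SET score = ? WHERE id = ?", (score + 1, player_id))
    conn.commit()
    # C) use data from that row to look up a row in another table by email
    (nickname,) = conn.execute(
        "SELECT nickname FROM profiles WHERE email = ?", (email,)).fetchone()
    return score + 1, nickname


print(run_transaction(conn, random.randint(1, 1000)))
```

<p>Each call is two reads and one write, which is the mix the rest of this post keeps coming back to.</p>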
<p>Seems simple enough, right?  My two tables each contain 5M rows of data.  Let's see what happens:</p>
<p><img class="size-full wp-image-1420" title="DB Fits into Memory" src="http://www.mysqlperformanceblog.com/wp-content/uploads/2009/10/db_right_option_html_1f7313bd.gif" alt="DB Fits into Memory" width="619" height="353" /></p>
<p><img class="size-full wp-image-1429" title="Chart of numbers" src="http://www.mysqlperformanceblog.com/wp-content/uploads/2009/10/db_right_option_html_m767bebf7.gif" alt="Chart of TPS for benchmark application" width="257" height="69" /></p>
<p>You can see a dramatic drop-off in performance as my data falls out of memory.  That's not cool, is it?  After all, database sizes tend to grow and very rarely shrink, which leads to a challenge faced by almost everyone: how do you maintain your performance in the face of increasing data size?</p>
<p>Here is where people start to scratch their heads.  They naturally assume they need to scale more: we need more memory!   If performance sucks, we must need more.  So here come the bigger boxes, the read-only slaves,  the complex sharding systems, the discussions on cluster, more memcached.  We need to cover up the database's inefficiencies to ensure that our application scales.</p>
<p>The problem is that for some applications we are fixing symptoms, not the problem itself.  No matter how much you want it to fit,  some things may not work (like the Godfather 3).    The issue is that people assume data storage has to be in the database.  “It's data, it needs to go into the database” is often the battle cry.   But hold on to your hat,  I am going to shock you.  For some applications, putting your data in the database is silly.  Yes, the guy who blogs on bigdbahead.com and is writing this on the mysqlperformanceblog is saying you may not want to use a database.  Heresy, I know!  But many of us already accept storing data ( at least temporarily ) outside the DB.  Think memcached.</p>
<p>Almost everyone loves memcached: it's simple, fast, and just works.  When your dataset exceeds your memory limitations or the database simply cannot keep up any more, this solution can really boost performance.  I know you're thinking my simple key lookup should really benefit from memcached. So let's try it!  I took the simple app I created, which reads two rows and updates one of them, and changed it to read from memcached if available, remove the cached entry on update, and read from the DB only when required.  I tested with a memcached size of 1GB, 2GB, and 4GB.  For these tests I left InnoDB with a 256M buffer pool, or roughly 9% of the total data in memory.</p>
<p>Let's look at the 1GB setting:</p>
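<p>The caching pattern itself is small enough to sketch.  This is not the benchmark code: plain dicts stand in for both MySQL and memcached, and the class and method names are invented for the example.  It only shows the “read from cache if available, remove on update, go to the DB on a miss” logic.</p>

```python
class ReadThroughStore:
    """Read from the cache if available, remove on update, hit the DB only on a miss."""

    def __init__(self, db, cache):
        self.db = db          # dict standing in for MySQL (assumption for the sketch)
        self.cache = cache    # dict standing in for memcached (assumption for the sketch)
        self.db_reads = 0     # how many reads fell through to the database

    def get(self, key):
        value = self.cache.get(key)
        if value is None:                 # cache miss: fall back to the database
            self.db_reads += 1
            value = self.db[key]
            self.cache[key] = value       # populate the cache for the next reader
        return value

    def update(self, key, value):
        self.db[key] = value              # the database stays the source of truth
        self.cache.pop(key, None)         # remove on update so no stale copy is served


store = ReadThroughStore(db={1: "alice@example.com"}, cache={})
store.get(1)                              # miss, read from the db
store.get(1)                              # hit, served from the cache
store.update(1, "bob@example.com")        # write to the db, drop the cached copy
print(store.get(1), store.db_reads)       # prints: bob@example.com 2
```

<p>Note that every update still goes to the database and forces the next read of that key back to the database too, so a write-heavy workload gets less out of this pattern than a read-heavy one.</p>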
<div class="mceTemp mceIEcenter">
<dl id="attachment_1425" class="wp-caption aligncenter" style="width: 636px;">
<dt class="wp-caption-dt"><img class="size-full wp-image-1425" title="Ensure you have enough memory " src="http://www.mysqlperformanceblog.com/wp-content/uploads/2009/10/db_right_option_html_m5daa4d5f.gif" alt="Ensure you have enough memory for memcached" width="626" height="320" /> </dt>
</dl>
</div>
<p>What, a performance regression?  But we threw more memory at it!!   How can that be?</p>
<p>Memcached is not a cure-all.  I have talked to many clients who say “we will just throw memcached at it”.   Sometimes an app will scream, other times it won't... and yet others require lots and lots of memory allocated to memcached to be successful.    This application selects a random # between 1 and 2 Million and looks up the result via that key.  It then uses data from that random row to look up a second piece of information via email address.  Because the entire dataset is about 4GB and only 1GB is in memcached, I keep pushing data out of memcached to make room for new records I am reading from the database. Remember, memcached needs repeatability to be helpful.   I am still getting a really solid # of hits in memcached, but the # of writes in MySQL coupled with the still large # of reads takes its toll.  Another place where I have seen this kill clients is in apps that do some sequential scanning and do not have enough memory for memcached.  For instance, if you have 1,000,000 rows of data, but enough memory to only store 500,000 rows... sequentially accessing this data will destroy the use of cache:</p>
<p>get record 1, miss, read from disk, cache record 1<br />
...<br />
get record 500,001, miss, read from disk, expunge record 1, cache record 500,001<br />
...<br />
get record 1, miss, read from disk, expunge record 500,001, cache record 1</p>
<p>You keep overwriting the cache before you can use it.  So here the complexity of adding memcached hurts us, because the cache is not actually buying us anything.</p>
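<p>You can reproduce this sequential-scan effect with a toy simulation.  The sketch below assumes plain least-recently-used eviction, which is a simplification of what memcached really does, and scales the example down to 1,000 rows with room for 500 so it runs instantly.  The function name and parameters are made up for the illustration.</p>

```python
from collections import OrderedDict


def scan_hit_rate(num_rows, cache_rows, passes):
    """Sequentially scan num_rows keys through an LRU cache holding cache_rows entries."""
    cache = OrderedDict()
    hits = lookups = 0
    for _ in range(passes):
        for key in range(1, num_rows + 1):
            lookups += 1
            if key in cache:
                hits += 1
                cache.move_to_end(key)         # mark as most recently used
            else:
                cache[key] = True              # "read from disk", then cache it
                if len(cache) > cache_rows:
                    cache.popitem(last=False)  # expunge the least recently used row
    return hits / lookups


# Cache holds half the rows: every lookup misses, on every pass.
print(scan_hit_rate(num_rows=1000, cache_rows=500, passes=3))   # 0.0
# Cache holds all the rows: only the first pass misses.
print(scan_hit_rate(num_rows=1000, cache_rows=1000, passes=3))
```

<p>With half the rows cached, the hit rate is not 50%, it is zero, because each row is expunged before the scan comes back around to it.</p>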
<p>Now bumping this up to 2GB actually makes the TPS jump around a lot, sometimes hitting 400 or 500 TPS and other times hitting as high as 1800 TPS.  My guess is the movement is caused by the random #'s being generated and simply the luck of the draw.</p>
<p>Finally, let's look at what happens when we have 4GB of memory allocated to memcached ( full dataset fits ):</p>
<p><img class="size-full wp-image-1423" title="Transactions with and without Memcached" src="http://www.mysqlperformanceblog.com/wp-content/uploads/2009/10/db_right_option_html_m2dc4c2b4.gif" alt="Transactions with and without Memcached" width="625" height="331" /></p>
<p>Here you can see that our “transactions” per second for this app increased almost 10X by using memcached.  The TPS I get here varies from 1100 TPS to just under 2000 TPS, with the average around 1400 TPS.   I think we would all be very happy if we could get a 10X performance boost from our application.</p>
<p>But wouldn't it be great if we could get more?  I mean our reads are going pretty fast, but our writes leave a lot to be desired:</p>
<div class="mceTemp mceIEcenter"><img class="size-full wp-image-1424" title="Read -vs- write times " src="http://www.mysqlperformanceblog.com/wp-content/uploads/2009/10/db_right_option_html_m4d18b08e.gif" alt="Read -vs- write times with memcached + mysql mixed" width="648" height="286" /></div>
<p>Over 17 ms to do an update.  Wouldn't it be great to just eliminate all the updates as well?  What sort of throughput would we get?   I will show you in part 2, which will talk about performance in a 100% pure memcached environment. Part 3 will focus on these same benchmarks in Tokyo Tyrant.</p>
    <hr noshade style="margin:0;height:1px" />
    <p>Entry posted by matt |
      <a href="http://www.mysqlperformanceblog.com/2009/10/15/mysql-memcached-or-nosql-tokyo-tyrant-part-1/#comments">13 comments</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.mysqlperformanceblog.com/2009/10/15/mysql-memcached-or-nosql-tokyo-tyrant-part-1/feed/</wfw:commentRss>
		<slash:comments>13</slash:comments>
		</item>
	</channel>
</rss>
