<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Tokyo Tyrant &#8211; The Extras Part I :  Is it Durable?</title>
	<atom:link href="http://www.mysqlperformanceblog.com/2009/11/10/tokyo-tyrant-the-extras-part-i-is-it-durable/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.mysqlperformanceblog.com/2009/11/10/tokyo-tyrant-the-extras-part-i-is-it-durable/</link>
	<description>Percona&#039;s MySQL &#38; InnoDB performance and scalability blog</description>
	<lastBuildDate>Sat, 11 Feb 2012 16:45:54 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
	<item>
		<title>By: leebert</title>
		<link>http://www.mysqlperformanceblog.com/2009/11/10/tokyo-tyrant-the-extras-part-i-is-it-durable/comment-page-1/#comment-771995</link>
		<dc:creator>leebert</dc:creator>
		<pubDate>Fri, 20 Aug 2010 14:54:00 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/?p=1650#comment-771995</guid>
		<description>If ACID conformance is crucial then sync&#039;ing once per second won&#039;t be sufficient; hardware tuning is the last alternative here.

If TC has transaction logging (Berkeley DB has it) you could isolate the tranlog file to a separate &quot;fast&quot; spindle &amp; do an fsync against just that tranlog. The tranlog then plays the actual commits against the main tables. 

Another thing to consider is the hotspot in the filesystem (at the end of the data tables or tranlog). Also tuning your I/O buffers &amp; cache to actually be *smaller* might help, avoiding the pitfall of big deferred writes or having to make manual sync/fsync calls from your application. 

Likewise if you can push the table&#039;s indices off to a different set of spindles, that&#039;s yet more I/O wait averted by hardware-based tuning.

As for indices, if the DBM (TC) allows for using clustered PK indices (esp. with tunable fillfactor) the table itself can be tuned so that  write hotspots - in conjunction with any striped RAID - are more likely to spread out across the IO system &amp; reduce IO wait. The table bloats more quickly but the performance gains are readily had.

Basically all the tuning tricks for any RDBMS (like DB/2) apply to a file-based DBMS as well. The difference is that with a file-based DBMS you have to tune the OS more in keeping with a database machine. Think: AS/400 or S390. One can even schedule such tuning parameters, so that the OS buffers/caches are tuned at certain times of the day/month/year for write-intensive periods, flushing dirty pages quickly &amp; regularly in the background. VLDB systems are often tuned accordingly.

Another trick is the appropriate use of RAID. RAID5 brings parity-write overhead (pure striping RAID0 doesn&#039;t). RAID1+0 to the rescue - the hotspots can spread out across an LVM on RAID10 - the writes will be appreciably faster &amp; if you mirror twice the odds are much lower that readers will contend on disk heads against a big writer. Although RAID10 brings 2x (or 3x) overhead in terms of disk usage, if it&#039;s reliability &amp; speed you want then hardware to the rescue....

As for speed, the type of IO hardware also plays a role here - SSA is going to be faster &amp; more reliable than SCSI b/c SSA runs on a loop. There&#039;s less I/O contention for starters &amp; a failed drive on an SSA loop won&#039;t disrupt the loop. 

Also a good HDD controller would have a cache battery that ensures buffers are flushed if there&#039;s a power interruption. The beauty of this is that you can tune your buffers &amp; cache down near the size of the controller&#039;s own cache &amp; know your OS is keeping the data pumped at the pace of the IO controller&#039;s ability to safely perform writes. 

This requires understanding what parameters to set in sysctl, but it ain&#039;t rocket science either....</description>
		<content:encoded><![CDATA[<p>If ACID conformance is crucial then sync&#8217;ing once per second won&#8217;t be sufficient; hardware tuning is the last alternative here.</p>
<p>If TC has transaction logging (Berkeley DB has it) you could isolate the tranlog file to a separate &#8220;fast&#8221; spindle &amp; do an fsync against just that tranlog. The tranlog then plays the actual commits against the main tables. </p>
<p>Another thing to consider is the hotspot in the filesystem (at the end of the data tables or tranlog). Also tuning your I/O buffers &amp; cache to actually be *smaller* might help, avoiding the pitfall of big deferred writes or having to make manual sync/fsync calls from your application. </p>
<p>Likewise if you can push the table&#8217;s indices off to a different set of spindles, that&#8217;s yet more I/O wait averted by hardware-based tuning.</p>
<p>As for indices, if the DBM (TC) allows for using clustered PK indices (esp. with tunable fillfactor) the table itself can be tuned so that  write hotspots &#8211; in conjunction with any striped RAID &#8211; are more likely to spread out across the IO system &amp; reduce IO wait. The table bloats more quickly but the performance gains are readily had.</p>
<p>Basically all the tuning tricks for any RDBMS (like DB/2) apply to a file-based DBMS as well. The difference is that with a file-based DBMS you have to tune the OS more in keeping with a database machine. Think: AS/400 or S390. One can even schedule such tuning parameters, so that the OS buffers/caches are tuned at certain times of the day/month/year for write-intensive periods, flushing dirty pages quickly &amp; regularly in the background. VLDB systems are often tuned accordingly.</p>
<p>Another trick is the appropriate use of RAID. RAID5 brings parity-write overhead (pure striping RAID0 doesn&#8217;t). RAID1+0 to the rescue &#8211; the hotspots can spread out across an LVM on RAID10 &#8211; the writes will be appreciably faster &amp; if you mirror twice the odds are much lower that readers will contend on disk heads against a big writer. Although RAID10 brings 2x (or 3x) overhead in terms of disk usage, if it&#8217;s reliability &amp; speed you want then hardware to the rescue&#8230;.</p>
<p>As for speed, the type of IO hardware also plays a role here &#8211; SSA is going to be faster &amp; more reliable than SCSI b/c SSA runs on a loop. There&#8217;s less I/O contention for starters &amp; a failed drive on an SSA loop won&#8217;t disrupt the loop. </p>
<p>Also a good HDD controller would have a cache battery that ensures buffers are flushed if there&#8217;s a power interruption. The beauty of this is that you can tune your buffers &amp; cache down near the size of the controller&#8217;s own cache &amp; know your OS is keeping the data pumped at the pace of the IO controller&#8217;s ability to safely perform writes. </p>
<p>This requires understanding what parameters to set in sysctl, but it ain&#8217;t rocket science either&#8230;.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Mark</title>
		<link>http://www.mysqlperformanceblog.com/2009/11/10/tokyo-tyrant-the-extras-part-i-is-it-durable/comment-page-1/#comment-728017</link>
		<dc:creator>Mark</dc:creator>
		<pubDate>Sun, 21 Feb 2010 20:43:03 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/?p=1650#comment-728017</guid>
		<description>FYI: TokyoTyrant / LUA doesn&#039;t provide a mechanism to call sync() or any of the other required methods, except optimize, but that sounds like something that shouldn&#039;t be called each second on a live database...
Source: http://1978th.net/tokyotyrant/spex.html#luaext

Perl does, but using -ext requires a LUA extension.
I&#039;m referencing the latest docs. Did I miss something?

Thanks.</description>
		<content:encoded><![CDATA[<p>FYI: TokyoTyrant / LUA doesn&#8217;t provide a mechanism to call sync() or any of the other required methods, except optimize, but that sounds like something that shouldn&#8217;t be called each second on a live database&#8230;<br />
Source: <a href="http://1978th.net/tokyotyrant/spex.html#luaext" rel="nofollow">http://1978th.net/tokyotyrant/spex.html#luaext</a></p>
<p>Perl does, but using -ext requires a LUA extension.<br />
I&#8217;m referencing the latest docs. Did I miss something?</p>
<p>Thanks.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Nicolas</title>
		<link>http://www.mysqlperformanceblog.com/2009/11/10/tokyo-tyrant-the-extras-part-i-is-it-durable/comment-page-1/#comment-677071</link>
		<dc:creator>Nicolas</dc:creator>
		<pubDate>Sat, 14 Nov 2009 09:42:19 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/?p=1650#comment-677071</guid>
		<description>Very good post!  I stopped testing TT after getting timeouts when calling sync.  However, I was calling it every 5 minutes.  I will now try the 1-second sync to see if response time stays &lt; 10 ms</description>
		<content:encoded><![CDATA[<p>Very good post!  I stopped testing TT after getting timeouts when calling sync.  However, I was calling it every 5 minutes.  I will now try the 1-second sync to see if response time stays &lt; 10 ms</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Toru Maesaka</title>
		<link>http://www.mysqlperformanceblog.com/2009/11/10/tokyo-tyrant-the-extras-part-i-is-it-durable/comment-page-1/#comment-675506</link>
		<dc:creator>Toru Maesaka</dc:creator>
		<pubDate>Wed, 11 Nov 2009 06:24:28 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/?p=1650#comment-675506</guid>
		<description>Hi!

Awesome post and analysis of TC/TT. Mikio wrote his thoughts about this matter on his blog. Thought you&#039;d find it interesting.

http://1978th.net/tech-en/promenade.cgi?id=6

Cheers,
Toru</description>
		<content:encoded><![CDATA[<p>Hi!</p>
<p>Awesome post and analysis of TC/TT. Mikio wrote his thoughts about this matter on his blog. Thought you&#8217;d find it interesting.</p>
<p><a href="http://1978th.net/tech-en/promenade.cgi?id=6" rel="nofollow">http://1978th.net/tech-en/promenade.cgi?id=6</a></p>
<p>Cheers,<br />
Toru</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Alexis</title>
		<link>http://www.mysqlperformanceblog.com/2009/11/10/tokyo-tyrant-the-extras-part-i-is-it-durable/comment-page-1/#comment-675292</link>
		<dc:creator>Alexis</dc:creator>
		<pubDate>Tue, 10 Nov 2009 20:19:32 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/?p=1650#comment-675292</guid>
		<description>A minor comment: the &quot;sync&quot; and &quot;nosync&quot; colors change from one chart to the next; this is confusing.</description>
		<content:encoded><![CDATA[<p>A minor comment: the &#8220;sync&#8221; and &#8220;nosync&#8221; colors change from one chart to the next; this is confusing.</p>
]]></content:encoded>
	</item>
</channel>
</rss>

