<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Just do the math!</title>
	<atom:link href="http://www.mysqlperformanceblog.com/2009/07/06/just-do-the-math/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.mysqlperformanceblog.com/2009/07/06/just-do-the-math/</link>
	<description>Everything about MySQL Performance</description>
	<lastBuildDate>Sat, 21 Nov 2009 05:23:57 -0800</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.4</generator>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<item>
		<title>By: Apachez</title>
		<link>http://www.mysqlperformanceblog.com/2009/07/06/just-do-the-math/comment-page-1/#comment-613786</link>
		<dc:creator>Apachez</dc:creator>
		<pubDate>Sat, 11 Jul 2009 22:27:24 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/?p=715#comment-613786</guid>
		<description>For example, MyISAM stores the number of rows as a single value, so (unlike InnoDB) no table scan is needed to find out how many rows a table contains.

Have there been any talks of optimizing similar things for MySQL 6.0 or so (like a storage-independent aggregation)?

If we take the example in this blog entry, it would be nice if MySQL did the aggregation on its own in the index, keeping the count of each unique page entry, so the select in the example would take milliseconds instead of minutes or hours. Then I, as a user of a MySQL database, would only need to care about throwing in the data (like from the Apache log in real time). Compare that to today, where I would most likely need to take care of the aggregation on my own, with 2 inserts per log line instead of just 1 (one to the large log db and one to the aggregated db).

If I&#039;m not mistaken, MSSQL has done similar things for a while now.</description>
		<content:encoded><![CDATA[<p>For example, MyISAM stores the number of rows as a single value, so (unlike InnoDB) no table scan is needed to find out how many rows a table contains.</p>
<p>Have there been any talks of optimizing similar things for MySQL 6.0 or so (like a storage-independent aggregation)?</p>
<p>If we take the example in this blog entry, it would be nice if MySQL did the aggregation on its own in the index, keeping the count of each unique page entry, so the select in the example would take milliseconds instead of minutes or hours. Then I, as a user of a MySQL database, would only need to care about throwing in the data (like from the Apache log in real time). Compare that to today, where I would most likely need to take care of the aggregation on my own, with 2 inserts per log line instead of just 1 (one to the large log db and one to the aggregated db).</p>
<p>If I&#8217;m not mistaken, MSSQL has done similar things for a while now.</p>
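<p>The do-it-yourself approach with two writes per log line could be sketched roughly as follows. This is only an illustration: an in-memory SQLite database stands in for MySQL, and the table names (access_log, page_hits), the column names, and the record_hit helper are all made up for the example.</p>

```python
import sqlite3

# In-memory SQLite stands in for MySQL here; all table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE access_log (page TEXT, hit_time TEXT)")
conn.execute("CREATE TABLE page_hits (page TEXT PRIMARY KEY, hits INTEGER NOT NULL)")


def record_hit(page, hit_time):
    """Write one log line: a row in the raw log plus a counter bump in the aggregate."""
    with conn:  # both tables are updated in the same transaction
        conn.execute(
            "INSERT INTO access_log (page, hit_time) VALUES (?, ?)", (page, hit_time)
        )
        # Create the counter row if it is missing, then increment it.
        conn.execute(
            "INSERT OR IGNORE INTO page_hits (page, hits) VALUES (?, 0)", (page,)
        )
        conn.execute("UPDATE page_hits SET hits = hits + 1 WHERE page = ?", (page,))


for page in ["/index.html", "/about.html", "/index.html"]:
    record_hit(page, "2009-07-11 22:27:24")

# Reading a count touches one small row per page instead of scanning the whole log.
hits = dict(conn.execute("SELECT page, hits FROM page_hits"))
```

<p>In MySQL itself the counter update would more likely be a single INSERT ... ON DUPLICATE KEY UPDATE statement; the two-statement form above is just what works in plain SQLite.</p>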
]]></content:encoded>
	</item>
	<item>
		<title>By: Brian</title>
		<link>http://www.mysqlperformanceblog.com/2009/07/06/just-do-the-math/comment-page-1/#comment-609930</link>
		<dc:creator>Brian</dc:creator>
		<pubDate>Wed, 08 Jul 2009 15:07:01 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/?p=715#comment-609930</guid>
		<description>You would be surprised how controversial &quot;doing the math&quot; is.  To most people, the database is supposed to magically compensate for whatever you throw at it, regardless of how it&#039;s configured or how your data model is designed.  Saying otherwise is, well, &quot;being mean&quot;.</description>
		<content:encoded><![CDATA[<p>You would be surprised how controversial &#8220;doing the math&#8221; is.  To most people, the database is supposed to magically compensate for whatever you throw at it, regardless of how it&#8217;s configured or how your data model is designed.  Saying otherwise is, well, &#8220;being mean&#8221;.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: peter</title>
		<link>http://www.mysqlperformanceblog.com/2009/07/06/just-do-the-math/comment-page-1/#comment-609136</link>
		<dc:creator>peter</dc:creator>
		<pubDate>Wed, 08 Jul 2009 00:06:17 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/?p=715#comment-609136</guid>
		<description>Pat,

There are always some surprises.  If you got 10% more data and queries started taking 4x as long, there is a good chance something is wrong.  Maybe the execution plan has changed?
It is, however, also possible to have large drops in performance because of the working set.  Consider, for example, a table with uniform random access: it has a 100% hit ratio when it fits in memory vs. a 10% miss ratio if it grows 10%.  You can access data in memory at a rate of 500,000 rows/sec, while on disk it is 200 rows/sec.  With such numbers a 10% miss rate will cause 500K -&gt; 2K rows/sec, which is a 250 _times_ performance loss.</description>
		<content:encoded><![CDATA[<p>Pat,</p>
<p>There are always some surprises.  If you got 10% more data and queries started taking 4x as long, there is a good chance something is wrong.  Maybe the execution plan has changed?<br />
It is, however, also possible to have large drops in performance because of the working set.  Consider, for example, a table with uniform random access: it has a 100% hit ratio when it fits in memory vs. a 10% miss ratio if it grows 10%.  You can access data in memory at a rate of 500,000 rows/sec, while on disk it is 200 rows/sec.  With such numbers a 10% miss rate will cause 500K -&gt; 2K rows/sec, which is a 250 _times_ performance loss.</p>
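<p>The arithmetic behind those numbers can be checked with a small sketch. It assumes the average time per row is simply the weighted sum of the in-memory and on-disk access times; the function name and its defaults are illustrative, taken from the figures above.</p>

```python
def effective_rows_per_sec(miss_ratio, mem_rate=500_000.0, disk_rate=200.0):
    """Average throughput when a fraction of row accesses miss memory and hit disk."""
    # Average seconds per row: hits are served at mem_rate, misses at disk_rate.
    seconds_per_row = (1.0 - miss_ratio) / mem_rate + miss_ratio / disk_rate
    return 1.0 / seconds_per_row


all_in_memory = effective_rows_per_sec(0.0)   # everything fits in memory
ten_pct_miss = effective_rows_per_sec(0.10)   # 10% of accesses go to disk
slowdown = all_in_memory / ten_pct_miss

print(f"{ten_pct_miss:.0f} rows/sec, {slowdown:.0f}x slower")
```

<p>With a 10% miss ratio the disk term dominates almost completely, which is why throughput lands near 2K rows/sec, roughly 250 times lower than the all-in-memory case.</p>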
]]></content:encoded>
	</item>
	<item>
		<title>By: pat</title>
		<link>http://www.mysqlperformanceblog.com/2009/07/06/just-do-the-math/comment-page-1/#comment-608774</link>
		<dc:creator>pat</dc:creator>
		<pubDate>Tue, 07 Jul 2009 13:43:33 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/?p=715#comment-608774</guid>
		<description>There are some non-linear points to worry about too, though, aren&#039;t there? I&#039;d expect to see non-linear timing changes when my data:

1) grew large enough that it didn&#039;t fit in the buffer cache anymore
2) grew large enough that it didn&#039;t fit in the sort buffers anymore and had to spill to disk

For that vast &quot;middle&quot; size of data you can get the more pernicious performance problems of &quot;it was fine last week, and it&#039;s only 10% larger this week, but the queries are all taking 4x as long&quot;.</description>
		<content:encoded><![CDATA[<p>There are some non-linear points to worry about too, though, aren&#8217;t there? I&#8217;d expect to see non-linear timing changes when my data:</p>
<p>1) grew large enough that it didn&#8217;t fit in the buffer cache anymore<br />
2) grew large enough that it didn&#8217;t fit in the sort buffers anymore and had to spill to disk</p>
<p>For that vast &#8220;middle&#8221; size of data you can get the more pernicious performance problems of &#8220;it was fine last week, and it&#8217;s only 10% larger this week, but the queries are all taking 4x as long&#8221;.</p>
]]></content:encoded>
	</item>
</channel>
</rss>
