<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: A common problem when optimizing COUNT()</title>
	<atom:link href="http://www.mysqlperformanceblog.com/2008/09/20/a-common-problem-when-optimizing-count/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.mysqlperformanceblog.com/2008/09/20/a-common-problem-when-optimizing-count/</link>
	<description>Percona&#039;s MySQL &#38; InnoDB performance and scalability blog</description>
	<lastBuildDate>Sat, 11 Feb 2012 16:45:54 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
	<item>
		<title>By: victori</title>
		<link>http://www.mysqlperformanceblog.com/2008/09/20/a-common-problem-when-optimizing-count/comment-page-1/#comment-396207</link>
		<dc:creator>victori</dc:creator>
		<pubDate>Sat, 29 Nov 2008 19:08:41 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/?p=503#comment-396207</guid>
		<description>@2. Pat

count(*) can be fast on PostgreSQL if you read the estimated tuple count instead of counting the actual rows. Yes, the number returned isn&#039;t exact, but it works great for pagination. When was the last time a user went to page 10,000? Only the first few pages are relevant.

select reltuples from pg_class where relname=&#039;&#039;;

Replace  ...    

Just one of the few tricks I use to get the most out of our database.</description>
		<content:encoded><![CDATA[<p>@2. Pat</p>
<p>count(*) can be fast on PostgreSQL if you read the estimated tuple count instead of counting the actual rows. Yes, the number returned isn&#8217;t exact, but it works great for pagination. When was the last time a user went to page 10,000? Only the first few pages are relevant.</p>
<p>select reltuples from pg_class where relname='';</p>
<p>Replace  &#8230;    </p>
<p>Just one of the few tricks I use to get the most out of our database.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Rob Wultsch</title>
		<link>http://www.mysqlperformanceblog.com/2008/09/20/a-common-problem-when-optimizing-count/comment-page-1/#comment-357869</link>
		<dc:creator>Rob Wultsch</dc:creator>
		<pubDate>Thu, 25 Sep 2008 19:40:23 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/?p=503#comment-357869</guid>
		<description>Heck, knowing where a SQL query is coming from is half the battle. With the popularity of people doing stuff like:
SELECT $fields
FROM $table $join
$where
$order_by

figuring out context is a real pain in the backside.  grep is much less useful with such queries. I wish people would add magic constants like __FILE__ and __LINE__ if they are going to dynamically build the queries...</description>
		<content:encoded><![CDATA[<p>Heck, knowing where a SQL query is coming from is half the battle. With the popularity of people doing stuff like:<br />
SELECT $fields<br />
FROM $table $join<br />
$where<br />
$order_by</p>
<p>figuring out context is a real pain in the backside.  grep is much less useful with such queries. I wish people would add magic constants like __FILE__ and __LINE__ if they are going to dynamically build the queries&#8230;</p>
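<p>A minimal sketch of that idea in Python, assuming a hypothetical helper named tag_query (the names, table, and filter below are illustrative and not from the comment). It prefixes a dynamically built statement with an SQL comment naming the calling file and line, which plays the role of __FILE__ and __LINE__:</p>

```python
import inspect
import os


def tag_query(sql):
    """Prefix sql with a comment naming the caller's file and line (hypothetical helper)."""
    frame = inspect.currentframe()
    caller = frame.f_back if frame is not None else None
    if caller is None:
        origin = "unknown"
    else:
        origin = f"{os.path.basename(caller.f_code.co_filename)}:{caller.f_lineno}"
    # Make sure the origin text cannot close the SQL comment early.
    origin = origin.replace("*/", "")
    return f"/* {origin} */ {sql}"


# Illustrative pieces of a dynamically assembled query.
fields, table, where = "COUNT(*)", "orders", "WHERE status = 'open'"
query = tag_query(f"SELECT {fields} FROM {table} {where}")
print(query)
```

<p>Whether the comment survives all the way to the server log depends on the client and driver in use, so treat this as a starting point rather than a guarantee.</p>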
]]></content:encoded>
	</item>
	<item>
		<title>By: Baron Schwartz</title>
		<link>http://www.mysqlperformanceblog.com/2008/09/20/a-common-problem-when-optimizing-count/comment-page-1/#comment-357831</link>
		<dc:creator>Baron Schwartz</dc:creator>
		<pubDate>Thu, 25 Sep 2008 14:27:37 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/?p=503#comment-357831</guid>
		<description>Yes!  Comments are great for that.  And they can absolutely clarify whether it&#039;s a mistake or intentional.  But most of the customers I work with don&#039;t have comments in their SQL.</description>
		<content:encoded><![CDATA[<p>Yes!  Comments are great for that.  And they can absolutely clarify whether it&#8217;s a mistake or intentional.  But most of the customers I work with don&#8217;t have comments in their SQL.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: amit</title>
		<link>http://www.mysqlperformanceblog.com/2008/09/20/a-common-problem-when-optimizing-count/comment-page-1/#comment-357829</link>
		<dc:creator>amit</dc:creator>
		<pubDate>Thu, 25 Sep 2008 14:25:32 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/?p=503#comment-357829</guid>
		<description>&lt;blockquote&gt;The real question is what did the user intend when writing the query?&lt;/blockquote&gt;
comments can be effectively used for that, no? :)</description>
		<content:encoded><![CDATA[<blockquote><p>The real question is what did the user intend when writing the query?</p></blockquote>
<p>comments can be effectively used for that, no? <img src='http://www.mysqlperformanceblog.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Kailash Badu</title>
		<link>http://www.mysqlperformanceblog.com/2008/09/20/a-common-problem-when-optimizing-count/comment-page-1/#comment-357286</link>
		<dc:creator>Kailash Badu</dc:creator>
		<pubDate>Tue, 23 Sep 2008 05:59:20 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/?p=503#comment-357286</guid>
		<description>If COUNT(col1) is not better (in terms of performance) than COUNT(*), I bet it&#039;s not worse either. I also don&#039;t see why COUNT(col1) is less readable than COUNT(*) when I can easily make out that the developer intends to find the number of rows because col1 is the PK. OK, I have no grudge against COUNT(*). It&#039;s fine in itself, but there is no point in preferring it over COUNT(PK), even more so in making a massive, system-wide alteration from COUNT(PK) to COUNT(*).

The argument that COUNT(*) is better because it&#039;s immune to changes in table schema doesn&#039;t hold water, because if a column name can change, so can a table name, thus invalidating SELECT COUNT(*) FROM table_name;</description>
		<content:encoded><![CDATA[<p>If COUNT(col1) is not better (in terms of performance) than COUNT(*), I bet it&#8217;s not worse either. I also don&#8217;t see why COUNT(col1) is less readable than COUNT(*) when I can easily make out that the developer intends to find the number of rows because col1 is the PK. OK, I have no grudge against COUNT(*). It&#8217;s fine in itself, but there is no point in preferring it over COUNT(PK), even more so in making a massive, system-wide alteration from COUNT(PK) to COUNT(*).</p>
<p>The argument that COUNT(*) is better because it&#8217;s immune to changes in table schema doesn&#8217;t hold water, because if a column name can change, so can a table name, thus invalidating SELECT COUNT(*) FROM table_name;</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Eric</title>
		<link>http://www.mysqlperformanceblog.com/2008/09/20/a-common-problem-when-optimizing-count/comment-page-1/#comment-357206</link>
		<dc:creator>Eric</dc:creator>
		<pubDate>Mon, 22 Sep 2008 23:00:02 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/?p=503#comment-357206</guid>
		<description>I just found a case of #3 in my code which works as intended, but it was easily rewritten to use count(*).  The count(*) version seems to be slightly faster, particularly since the application didn&#039;t care that the zero rows disappeared and was happy enough to not loop over them.</description>
		<content:encoded><![CDATA[<p>I just found a case of #3 in my code which works as intended, but it was easily rewritten to use count(*).  The count(*) version seems to be slightly faster, particularly since the application didn&#8217;t care that the zero rows disappeared and was happy enough to not loop over them.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Baron Schwartz</title>
		<link>http://www.mysqlperformanceblog.com/2008/09/20/a-common-problem-when-optimizing-count/comment-page-1/#comment-357184</link>
		<dc:creator>Baron Schwartz</dc:creator>
		<pubDate>Mon, 22 Sep 2008 20:50:21 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/?p=503#comment-357184</guid>
		<description>Kailash, the real question is not what is faster.  The real question is what did the user intend when writing the query?

To address the off-topic ;-) The hope that COUNT(col1) is faster than COUNT(*) is false indeed.  Take a look at this:

create table test(
  pk char(100) not null,
  col1 tinyint not null,
  primary key(pk),
  key(col1)
) engine = innodb;

Now what do you think MySQL will do with various types of COUNT() queries?  Try putting a few rows into the table, then using EXPLAIN with COUNT(*), COUNT(pk) and COUNT(col1).  You might be surprised.  Convert to MyISAM and try again.  Make col1 nullable and try again.

Now imagine the table has a primary key on (col1, col2, ... colN).  Is COUNT(col1) a good idea?  Did the user mean COUNT(DISTINCT col1) or was it someone trying to outsmart the optimizer?  Are you still convinced that it&#039;s just as good if not better?  Assuming the user wanted to count rows and not values, in the best case it may perform as well as COUNT(*) but never better.  What happens when you alter the table&#039;s schema?

The optimizer needs choices, and the consultant who is trying to find out why the choices are forbidden needs information.  So the real question still remains, what does the SQL mean?  To return to your statement &quot;just as good as COUNT(*), if not better,&quot; what metric of goodness are you using?  In performance it can never be better; in optimizability/understandability/maintainability, COUNT(*) always wins when you&#039;re trying to count rows.

The person who commented about LEFT OUTER JOIN is right on the money; that is the most common use I&#039;ve seen for actually counting values instead of rows.</description>
		<content:encoded><![CDATA[<p>Kailash, the real question is not what is faster.  The real question is what did the user intend when writing the query?</p>
<p>To address the off-topic <img src='http://www.mysqlperformanceblog.com/wp-includes/images/smilies/icon_wink.gif' alt=';-)' class='wp-smiley' />  The hope that COUNT(col1) is faster than COUNT(*) is false indeed.  Take a look at this:</p>
<p>create table test(<br />
  pk char(100) not null,<br />
  col1 tinyint not null,<br />
  primary key(pk),<br />
  key(col1)<br />
) engine = innodb;</p>
<p>Now what do you think MySQL will do with various types of COUNT() queries?  Try putting a few rows into the table, then using EXPLAIN with COUNT(*), COUNT(pk) and COUNT(col1).  You might be surprised.  Convert to MyISAM and try again.  Make col1 nullable and try again.</p>
<p>Now imagine the table has a primary key on (col1, col2, &#8230; colN).  Is COUNT(col1) a good idea?  Did the user mean COUNT(DISTINCT col1) or was it someone trying to outsmart the optimizer?  Are you still convinced that it&#8217;s just as good if not better?  Assuming the user wanted to count rows and not values, in the best case it may perform as well as COUNT(*) but never better.  What happens when you alter the table&#8217;s schema?</p>
<p>The optimizer needs choices, and the consultant who is trying to find out why the choices are forbidden needs information.  So the real question still remains, what does the SQL mean?  To return to your statement &#8220;just as good as COUNT(*), if not better,&#8221; what metric of goodness are you using?  In performance it can never be better; in optimizability/understandability/maintainability, COUNT(*) always wins when you&#8217;re trying to count rows.</p>
<p>The person who commented about LEFT OUTER JOIN is right on the money; that is the most common use I&#8217;ve seen for actually counting values instead of rows.</p>
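<p>A minimal sketch of the rows-versus-values distinction, using Python&#8217;s built-in sqlite3 module as a stand-in for MySQL (an assumption made only so the example is self-contained; the authors and posts tables are hypothetical). It shows the semantics only and says nothing about performance or about what EXPLAIN reports on MySQL:</p>

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
    INSERT INTO authors VALUES (1, 'ann'), (2, 'bob');
    INSERT INTO posts VALUES (10, 1, 'first'), (11, 1, NULL);
""")

# COUNT(*) counts rows; COUNT(title) counts only the non-NULL titles.
rows, titles = conn.execute(
    "SELECT COUNT(*), COUNT(title) FROM posts"
).fetchone()

# With a LEFT OUTER JOIN, an author without posts still yields one joined row,
# so COUNT(*) reports 1 while COUNT(p.id) reports 0 for that author.
per_author = conn.execute("""
    SELECT a.name, COUNT(*) AS joined_rows, COUNT(p.id) AS post_count
    FROM authors AS a
    LEFT OUTER JOIN posts AS p ON p.author_id = a.id
    GROUP BY a.id, a.name
    ORDER BY a.name
""").fetchall()

print(rows, titles)
print(per_author)
```

<p>Here counting a column is the intended behavior, which is exactly why COUNT(col) on a plain row count leaves the reader guessing about intent.</p>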
]]></content:encoded>
	</item>
	<item>
		<title>By: Kailash Badu</title>
		<link>http://www.mysqlperformanceblog.com/2008/09/20/a-common-problem-when-optimizing-count/comment-page-1/#comment-357167</link>
		<dc:creator>Kailash Badu</dc:creator>
		<pubDate>Mon, 22 Sep 2008 20:26:29 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/?p=503#comment-357167</guid>
		<description>OK. Why would anyone want to use &#039;SELECT COUNT(col1)&#039; instead of &#039;SELECT COUNT(*)&#039; for finding the number of rows? Probably because the person believes that performing a count on a single column is faster than performing it on the entire column set (the spooky star sign has a horrible reputation for being slow in normal SELECT queries, although it might or might not apply in this particular case). Now, if I do use &#039;SELECT COUNT(col1)&#039; to find the number of rows in the (possibly false) hope that it&#039;s faster than COUNT(*), I would at least make sure that col1 is a primary key. No NULL values for col1 (PK) means COUNT(col1) would work just as good as COUNT(*), if not better.

The real question should be: is COUNT(col1) faster than COUNT(*)?</description>
		<content:encoded><![CDATA[<p>OK. Why would anyone want to use &#8216;SELECT COUNT(col1)&#8217; instead of &#8216;SELECT COUNT(*)&#8217; for finding the number of rows? Probably because the person believes that performing a count on a single column is faster than performing it on the entire column set (the spooky star sign has a horrible reputation for being slow in normal SELECT queries, although it might or might not apply in this particular case). Now, if I do use &#8216;SELECT COUNT(col1)&#8217; to find the number of rows in the (possibly false) hope that it&#8217;s faster than COUNT(*), I would at least make sure that col1 is a primary key. No NULL values for col1 (PK) means COUNT(col1) would work just as good as COUNT(*), if not better.</p>
<p>The real question should be: is COUNT(col1) faster than COUNT(*)?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: idont</title>
		<link>http://www.mysqlperformanceblog.com/2008/09/20/a-common-problem-when-optimizing-count/comment-page-1/#comment-357051</link>
		<dc:creator>idont</dc:creator>
		<pubDate>Sun, 21 Sep 2008 20:45:30 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/?p=503#comment-357051</guid>
		<description>What I usually do:

SELECT COUNT(1) FROM mylargetable WHERE ....

I do not know if it is a good idea but that way there is no doubt about the result.</description>
		<content:encoded><![CDATA[<p>What I usually do:</p>
<p>SELECT COUNT(1) FROM mylargetable WHERE &#8230;.</p>
<p>I do not know if it is a good idea but that way there is no doubt about the result.</p>
]]></content:encoded>
	</item>
</channel>
</rss>

