<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	>
<channel>
	<title>Comments on: Should MySQL Extend GROUP BY Syntax ?</title>
	<atom:link href="http://www.mysqlperformanceblog.com/2007/08/17/should-mysql-extend-group-by-syntax/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.mysqlperformanceblog.com/2007/08/17/should-mysql-extend-group-by-syntax/</link>
	<description>Everything about MySQL Performance</description>
	<pubDate>Tue, 02 Dec 2008 20:00:36 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.5.1</generator>
		<item>
		<title>By: peter</title>
		<link>http://www.mysqlperformanceblog.com/2007/08/17/should-mysql-extend-group-by-syntax/#comment-157242</link>
		<dc:creator>peter</dc:creator>
		<pubDate>Sun, 19 Aug 2007 09:05:25 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/2007/08/17/should-mysql-extend-group-by-syntax/#comment-157242</guid>
		<description>Roland,

Your article was already quoted, and I replied :)

"Functionally dependent" is really just a more elaborate explanation of which columns are allowed with GROUP BY. The basics remain the same: you can have a column in the select list if there is only one value of that column in each group.

I'm speaking about a different case, where there are many column values for each group and we just want to define which one is picked.

For example, when grouping by Country in our example we may also include CountryCode, because there is only one for each country. However, there are multiple Cities for each Country, and GROUP BY does not let you choose which one is selected (or even select it at all in stricter databases).</description>
		<content:encoded><![CDATA[<p>Roland,</p>
<p>Your article was already quoted, and I replied <img src='http://www.mysqlperformanceblog.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
<p>&#8220;Functionally dependent&#8221; is really just a more elaborate explanation of which columns are allowed with GROUP BY. The basics remain the same: you can have a column in the select list if there is only one value of that column in each group.</p>
<p>I&#8217;m speaking about a different case, where there are many column values for each group and we just want to define which one is picked.</p>
<p>For example, when grouping by Country in our example we may also include CountryCode, because there is only one for each country. However, there are multiple Cities for each Country, and GROUP BY does not let you choose which one is selected (or even select it at all in stricter databases).</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Roland Bouman</title>
		<link>http://www.mysqlperformanceblog.com/2007/08/17/should-mysql-extend-group-by-syntax/#comment-157017</link>
		<dc:creator>Roland Bouman</dc:creator>
		<pubDate>Sat, 18 Aug 2007 23:30:30 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/2007/08/17/should-mysql-extend-group-by-syntax/#comment-157017</guid>
		<description>Hmm, it seems all terms enclosed in angle brackets got removed. The quotes again:

First Quote, GROUP BY according to SQL92:

"If T is a grouped table, then each [column reference] in each [value expression] that references a column of T shall reference a grouping column or be specified within a [set function specification]."

Second Quote, GROUP BY according to SQL1999 and SQL2003:

"If T is a grouped table, then let G be the set of grouping columns of T. In each [value expression] contained in [select list], each column reference that references a column of T shall reference some column C that is functionally dependent on G or shall be contained in an aggregated argument of a [set function specification] whose aggregation query is QS."</description>
		<content:encoded><![CDATA[<p>Hmm, it seems all terms enclosed in angle brackets got removed. The quotes again:</p>
<p>First Quote, GROUP BY according to SQL92:</p>
<p>&#8220;If T is a grouped table, then each [column reference] in each [value expression] that references a column of T shall reference a grouping column or be specified within a [set function specification].&#8221;</p>
<p>Second Quote, GROUP BY according to SQL1999 and SQL2003:</p>
<p>&#8220;If T is a grouped table, then let G be the set of grouping columns of T. In each [value expression] contained in [select list], each column reference that references a column of T shall reference some column C that is functionally dependent on G or shall be contained in an aggregated argument of a [set function specification] whose aggregation query is QS.&#8221;</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Roland Bouman</title>
		<link>http://www.mysqlperformanceblog.com/2007/08/17/should-mysql-extend-group-by-syntax/#comment-157015</link>
		<dc:creator>Roland Bouman</dc:creator>
		<pubDate>Sat, 18 Aug 2007 23:27:11 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/2007/08/17/should-mysql-extend-group-by-syntax/#comment-157015</guid>
		<description>Hi Peter,

"In fact ANSI SQL even forbids you to select columns which are not aggregates or part of group by because result in this case is not defined."

Well, it depends on which version of ANSI SQL you are referring to. SQL92 says:

"If T is a grouped table, then each  in each  that references a column of T shall reference a grouping column or be specified within a ."

This indeed requires us to always include all non-aggregated columns in the GROUP BY clause.

But SQL1999 and SQL2003 have a more sophisticated view on the matter:

"If T is a grouped table, then let G be the set of grouping columns of T. In each  contained in , each column reference that references a column of T shall reference some column C that is functionally dependent on G or shall be contained in an aggregated argument of a  whose aggregation query is QS."

This just says that you are allowed to select non-aggregated columns that are not in the GROUP BY clause, as long as they are completely determined by the GROUP BY columns; in other words, when there is exactly one distinct value per combination of values in the GROUP BY clause.
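
To make "exactly one distinct value per combination" concrete, here is a minimal Python sketch. The helper name is_determined_by and the sample rows are made up for illustration, and it only checks the data at hand; as far as I understand, the standard itself reasons from declared keys and constraints rather than from the rows that happen to be in the table.

```python
# Hypothetical rows with Country, CountryCode and City columns (made-up values).
ROWS = [
    {"Country": "Xland", "CountryCode": "XL", "City": "Avon"},
    {"Country": "Xland", "CountryCode": "XL", "City": "Brook"},
    {"Country": "Yland", "CountryCode": "YL", "City": "Cliff"},
]


def is_determined_by(rows, group_cols, col):
    """Return True if col has exactly one distinct value per combination of group_cols."""
    seen = {}
    for row in rows:
        key = tuple(row[g] for g in group_cols)
        # setdefault stores the first value seen for this group and returns it afterwards.
        if seen.setdefault(key, row[col]) != row[col]:
            return False
    return True


print(is_determined_by(ROWS, ["Country"], "CountryCode"))
print(is_determined_by(ROWS, ["Country"], "City"))
```

With these rows the first call prints True (one code per country) and the second prints False (Xland has two cities).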

(http://rpbouman.blogspot.com/2007/05/debunking-group-by-myths.html)

kind regards, 

Roland Bouman</description>
		<content:encoded><![CDATA[<p>Hi Peter,</p>
<p>&#8220;In fact ANSI SQL even forbids you to select columns which are not aggregates or part of group by because result in this case is not defined.&#8221;</p>
<p>Well, it depends on which version of ANSI SQL you are referring to. SQL92 says:</p>
<p>&#8220;If T is a grouped table, then each  in each  that references a column of T shall reference a grouping column or be specified within a .&#8221;</p>
<p>This indeed requires us to always include all non-aggregated columns in the GROUP BY clause.</p>
<p>But SQL1999 and SQL2003 have a more sophisticated view on the matter:</p>
<p>&#8220;If T is a grouped table, then let G be the set of grouping columns of T. In each  contained in , each column reference that references a column of T shall reference some column C that is functionally dependent on G or shall be contained in an aggregated argument of a  whose aggregation query is QS.&#8221;</p>
<p>This just says that you are allowed to select non-aggregated columns that are not in the GROUP BY clause, as long as they are completely determined by the GROUP BY columns; in other words, when there is exactly one distinct value per combination of values in the GROUP BY clause.</p>
<p>(http://rpbouman.blogspot.com/2007/05/debunking-group-by-myths.html)</p>
<p>kind regards, </p>
<p>Roland Bouman</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: David Grant</title>
		<link>http://www.mysqlperformanceblog.com/2007/08/17/should-mysql-extend-group-by-syntax/#comment-156910</link>
		<dc:creator>David Grant</dc:creator>
		<pubDate>Sat, 18 Aug 2007 15:03:14 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/2007/08/17/should-mysql-extend-group-by-syntax/#comment-156910</guid>
		<description>I was just wishing for this feature the other day.</description>
		<content:encoded><![CDATA[<p>I was just wishing for this feature the other day.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: peter</title>
		<link>http://www.mysqlperformanceblog.com/2007/08/17/should-mysql-extend-group-by-syntax/#comment-156869</link>
		<dc:creator>peter</dc:creator>
		<pubDate>Sat, 18 Aug 2007 10:51:03 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/2007/08/17/should-mysql-extend-group-by-syntax/#comment-156869</guid>
		<description>Mark,

Analytical functions are quite helpful, but they are generally designed for a different purpose. As you can see, they compute aggregates, but they run over the result set rather than the grouped result set (where grouping ensures only one row is left for each group).

Or maybe I just do not see how to get the result set I'm looking for with these functions without too many complications :)

It would surely be helpful to get them in MySQL.</description>
		<content:encoded><![CDATA[<p>Mark,</p>
<p>Analytical functions are quite helpful, but they are generally designed for a different purpose. As you can see, they compute aggregates, but they run over the result set rather than the grouped result set (where grouping ensures only one row is left for each group).</p>
<p>Or maybe I just do not see how to get the result set I&#8217;m looking for with these functions without too many complications <img src='http://www.mysqlperformanceblog.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
<p>It would surely be helpful to get them in MySQL.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: peter</title>
		<link>http://www.mysqlperformanceblog.com/2007/08/17/should-mysql-extend-group-by-syntax/#comment-156855</link>
		<dc:creator>peter</dc:creator>
		<pubDate>Sat, 18 Aug 2007 10:01:38 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/2007/08/17/should-mysql-extend-group-by-syntax/#comment-156855</guid>
		<description>No,

Sorting the table is a dangerous trap: if you change the query a bit, e.g. add a range condition that uses an index to access the rows, you will get a different plan and not the result you're expecting.

Regarding DISTINCT: it applies to the row rather than the column. It basically says to remove duplicate rows.

If you made it take an argument, so that it read as "remove duplicate values from this column", you would have the same problem as with GROUP BY: which values should you see for the other columns? This would be hard to define independently of the query execution plan.</description>
		<content:encoded><![CDATA[<p>No,</p>
<p>Sorting the table is a dangerous trap: if you change the query a bit, e.g. add a range condition that uses an index to access the rows, you will get a different plan and not the result you&#8217;re expecting.</p>
<p>Regarding DISTINCT: it applies to the row rather than the column. It basically says to remove duplicate rows.</p>
<p>If you made it take an argument, so that it read as &#8220;remove duplicate values from this column&#8221;, you would have the same problem as with GROUP BY: which values should you see for the other columns? This would be hard to define independently of the query execution plan.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: peter</title>
		<link>http://www.mysqlperformanceblog.com/2007/08/17/should-mysql-extend-group-by-syntax/#comment-156854</link>
		<dc:creator>peter</dc:creator>
		<pubDate>Sat, 18 Aug 2007 09:58:12 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/2007/08/17/should-mysql-extend-group-by-syntax/#comment-156854</guid>
		<description>Olexandr,

Yes, this is a simplification; there are other bits and pieces which are allowed. But generally, if a selected column is not dependent on the GROUP BY column (in this case, City is not dependent on Country), it either would not be allowed or you could see surprising results in the City column.

That exact article gives good examples and a bit more explanation.</description>
		<content:encoded><![CDATA[<p>Olexandr,</p>
<p>Yes, this is a simplification; there are other bits and pieces which are allowed. But generally, if a selected column is not dependent on the GROUP BY column (in this case, City is not dependent on Country), it either would not be allowed or you could see surprising results in the City column.</p>
<p>That exact article gives good examples and a bit more explanation.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Olexandr Melnyk</title>
		<link>http://www.mysqlperformanceblog.com/2007/08/17/should-mysql-extend-group-by-syntax/#comment-156851</link>
		<dc:creator>Olexandr Melnyk</dc:creator>
		<pubDate>Sat, 18 Aug 2007 09:52:36 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/2007/08/17/should-mysql-extend-group-by-syntax/#comment-156851</guid>
		<description>&#62; in fact ANSI SQL even forbids you to select columns 
&#62; which are not aggregates or part of group by

ANSI SQL doesn't require non-aggregated selected columns to be present in GROUP BY in all cases. Here is some elaboration on this:

http://www.oreillynet.com/databases/blog/2007/05/debunking_group_by_myths.html</description>
		<content:encoded><![CDATA[<p>&gt; in fact ANSI SQL even forbids you to select columns<br />
&gt; which are not aggregates or part of group by</p>
<p>ANSI SQL doesn&#8217;t require non-aggregated selected columns to be present in GROUP BY in all cases. Here is some elaboration on this:</p>
<p><a href="http://www.oreillynet.com/databases/blog/2007/05/debunking_group_by_myths.html" rel="nofollow">http://www.oreillynet.com/databases/blog/2007/05/debunking_group_by_myths.html</a></p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Mark Callaghan</title>
		<link>http://www.mysqlperformanceblog.com/2007/08/17/should-mysql-extend-group-by-syntax/#comment-156785</link>
		<dc:creator>Mark Callaghan</dc:creator>
		<pubDate>Sat, 18 Aug 2007 04:34:36 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/2007/08/17/should-mysql-extend-group-by-syntax/#comment-156785</guid>
		<description>Would window functions provide what you want? Oracle has them (http://orafaq.com/node/55). I thought they were part of a standard.</description>
		<content:encoded><![CDATA[<p>Would window functions provide what you want? Oracle has them (http://orafaq.com/node/55). I thought they were part of a standard.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: No</title>
		<link>http://www.mysqlperformanceblog.com/2007/08/17/should-mysql-extend-group-by-syntax/#comment-156661</link>
		<dc:creator>No</dc:creator>
		<pubDate>Fri, 17 Aug 2007 21:25:21 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/2007/08/17/should-mysql-extend-group-by-syntax/#comment-156661</guid>
		<description>Great post. Luckily for me, I can use "ALTER TABLE tbl ORDER BY col;" because my data is nearly static and just one GROUP BY query needs a "GROUPORDER".

The issue as I see it, though, is that those queries aren't actually GROUP BY queries, but queries where you want the best-matching row for a certain WHERE.

Basically MySQL should handle those queries like this:

-Sort by Population (descending)
-Scan through the table and
a. if we already took a row for that country, trash it
b. otherwise take the city

This can even fully run on indices!

Something like

SELECT DISTINCT( Country ) City, Population, Country FROM City ORDER BY Country, Population DESC

This would pick the city with the highest population in each country, and sort them by country.
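
As a rough sketch of the sort-and-scan steps above (not how MySQL actually executes anything), here is the same idea in Python, assuming a small in-memory SQLite table with the same City, Population and Country columns. The function name top_city_per_country and the rows are made up for illustration.

```python
import sqlite3

# Hypothetical sample rows: (City, Population, Country), all values made up.
ROWS = [
    ("Avon", 300, "Xland"),
    ("Brook", 900, "Xland"),
    ("Cliff", 500, "Yland"),
    ("Dale", 200, "Yland"),
]


def top_city_per_country(rows):
    """Sort by population descending, then keep the first row seen per country."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE City (City TEXT, Population INTEGER, Country TEXT)")
    conn.executemany("INSERT INTO City VALUES (?, ?, ?)", rows)
    picked = {}
    query = "SELECT City, Population, Country FROM City ORDER BY Population DESC"
    for city, population, country in conn.execute(query):
        # Step a: a row for this country was already taken, so trash this one.
        if country in picked:
            continue
        # Step b: the first row seen for a country is its most populous city.
        picked[country] = (city, population, country)
    conn.close()
    # Final ordering by country, matching the ORDER BY of the query above.
    return [picked[c] for c in sorted(picked)]


print(top_city_per_country(ROWS))
```

Because the scan sees rows in descending population order, the first row kept for each country is the one wanted, and everything after it for that country is dropped.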

**Sadly enough DISTINCT doesn't take any arguments.**

I never understood why it doesn't. Makes no sense to me.</description>
		<content:encoded><![CDATA[<p>Great post. Luckily for me, I can use &#8220;ALTER TABLE tbl ORDER BY col;&#8221; because my data is nearly static and just one GROUP BY query needs a &#8220;GROUPORDER&#8221;.</p>
<p>The issue as I see it, though, is that those queries aren&#8217;t actually GROUP BY queries, but queries where you want the best-matching row for a certain WHERE.</p>
<p>Basically MySQL should handle those queries like this:</p>
<p>-Sort by Population (descending)<br />
-Scan through the table and<br />
a. if we already took a row for that country, trash it<br />
b. otherwise take the city</p>
<p>This can even fully run on indices!</p>
<p>Something like</p>
<p>SELECT DISTINCT( Country ) City, Population, Country FROM City ORDER BY Country, Population DESC</p>
<p>This would pick the city with the highest population in each country, and sort them by country.</p>
<p>**Sadly enough DISTINCT doesn&#8217;t take any arguments.**</p>
<p>I never understood why it doesn&#8217;t. Makes no sense to me.</p>
]]></content:encoded>
	</item>
</channel>
</rss>
