<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: A rule of thumb for choosing column order in indexes</title>
	<atom:link href="http://www.mysqlperformanceblog.com/2009/06/05/a-rule-of-thumb-for-choosing-column-order-in-indexes/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.mysqlperformanceblog.com/2009/06/05/a-rule-of-thumb-for-choosing-column-order-in-indexes/</link>
	<description>Everything about MySQL Performance</description>
	<lastBuildDate>Sat, 21 Nov 2009 05:23:57 -0800</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.4</generator>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<item>
		<title>By: John Larsen</title>
		<link>http://www.mysqlperformanceblog.com/2009/06/05/a-rule-of-thumb-for-choosing-column-order-in-indexes/comment-page-1/#comment-586212</link>
		<dc:creator>John Larsen</dc:creator>
		<pubDate>Mon, 15 Jun 2009 16:28:51 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/?p=663#comment-586212</guid>
		<description>I ran into an interesting multi key issue last week, wondering if anyone had any suggestions.
Table sent_items looked like
id int(10) autoincrement unsigned
sender_id int(11) unsigned
receiver_id int(11) unsigned
item_id tinyint(3)
sent_time int(11)  (stored unix timestamp, stamped by application not server)
in InnoDB
This issue started popping up somewhere around 20 million rows in the table, possibly earlier though we didn&#039;t notice.
There was a single key on receiver_id and a combo key on (sender_id, sent_time) due to a frequent query
to get the 50 most recently received items.
The average user had sent about a dozen items.
select * from sent_items where sender_id = &#039;12345678&#039; limit 50;
would give 12 rows in 0.5 seconds under heavy server load.
select * from sent_items where sender_id = &#039;12345678&#039; ORDER BY sent_time DESC limit 50;
would give the same 12 rows in 5 full seconds under heavy load.
Both queries would be nearly instant under light server load.
A query on receiver_id, such as
select * from sent_items where receiver_id = &#039;12345678&#039; ORDER BY sent_time DESC limit 50;
took about 0.5 seconds under load, with or without the ORDER BY.
Now the sender_id has a very different cardinality than the receiver_id and records are generally grouped by a sender as a person can send many items to a number of different receivers in an instant.
I&#039;m going to try changing the multi-column key to a single-column key later, but these are very strange results. Does anyone have any ideas?</description>
		<content:encoded><![CDATA[<p>I ran into an interesting multi key issue last week, wondering if anyone had any suggestions.<br />
Table sent_items looked like<br />
id int(10) autoincrement unsigned<br />
sender_id int(11) unsigned<br />
receiver_id int(11) unsigned<br />
item_id tinyint(3)<br />
sent_time int(11)  (stored unix timestamp, stamped by application not server)<br />
in InnoDB<br />
This issue started popping up somewhere around 20 million rows in the table, possibly earlier though we didn&#8217;t notice.<br />
There was a single key on receiver_id and a combo key on (sender_id, sent_time) due to a frequent query<br />
to get the 50 most recently received items.<br />
The average user had sent about a dozen items.<br />
select * from sent_items where sender_id = '12345678' limit 50;<br />
would give 12 rows in 0.5 seconds under heavy server load.<br />
select * from sent_items where sender_id = '12345678' ORDER BY sent_time DESC limit 50;<br />
would give the same 12 rows in 5 full seconds under heavy load.<br />
Both queries would be nearly instant under light server load.<br />
A query on receiver_id, such as<br />
select * from sent_items where receiver_id = '12345678' ORDER BY sent_time DESC limit 50;<br />
took about 0.5 seconds under load, with or without the ORDER BY.<br />
Now the sender_id has a very different cardinality than the receiver_id and records are generally grouped by a sender as a person can send many items to a number of different receivers in an instant.<br />
I&#8217;m going to try changing the multi-column key to a single-column key later, but these are very strange results. Does anyone have any ideas?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Rob Wultsch</title>
		<link>http://www.mysqlperformanceblog.com/2009/06/05/a-rule-of-thumb-for-choosing-column-order-in-indexes/comment-page-1/#comment-580920</link>
		<dc:creator>Rob Wultsch</dc:creator>
		<pubDate>Wed, 10 Jun 2009 08:30:14 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/?p=663#comment-580920</guid>
		<description>For whatever it is worth, I regret my comment. It is easy to be a critic (which is a hair away from what I did: asking questions I knew were hard to answer).

This technique is very good for beginners and a very good option for dealing with an unfamiliar DBMS. One way or another, seat time and an understanding of whatever optimizer you are dealing with cannot be easily replaced by a single blog post. Understanding the optimizer is really key, and that cannot be easily summarized.

It is a pipe dream for many (most?) databases to see a process list in which queries hit all portions of the tables evenly. Much more likely is getting hit with many queries of the same form (possibly exactly the same) on a table that changes just enough for the query cache to be a bad idea. In this case the above technique, with some knowledge of the nuances, is I think very useful...

Thank you for your time.</description>
		<content:encoded><![CDATA[<p>For whatever it is worth, I regret my comment. It is easy to be a critic (which is a hair away from what I did: asking questions I knew were hard to answer).</p>
<p>This technique is very good for beginners and a very good option for dealing with an unfamiliar DBMS. One way or another, seat time and an understanding of whatever optimizer you are dealing with cannot be easily replaced by a single blog post. Understanding the optimizer is really key, and that cannot be easily summarized.</p>
<p>It is a pipe dream for many (most?) databases to see a process list in which queries hit all portions of the tables evenly. Much more likely is getting hit with many queries of the same form (possibly exactly the same) on a table that changes just enough for the query cache to be a bad idea. In this case the above technique, with some knowledge of the nuances, is I think very useful&#8230;</p>
<p>Thank you for your time.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Baron Schwartz</title>
		<link>http://www.mysqlperformanceblog.com/2009/06/05/a-rule-of-thumb-for-choosing-column-order-in-indexes/comment-page-1/#comment-579986</link>
		<dc:creator>Baron Schwartz</dc:creator>
		<pubDate>Tue, 09 Jun 2009 12:46:37 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/?p=663#comment-579986</guid>
		<description>Right.  That&#039;s another half-written blog post from long ago, and as people are asking about all the nuances around this I&#039;m regretting trying to simplify things too much.  There are a dozen things to think about, though -- cardinality has skew, queries use certain values more often than others, etc etc.  Ultimately you really need statistical information about the values in the table and the values that are used in WHERE clauses.  In the example, the query came from a real application that queries for twitter/waiting FAR more often than anything else, by the way.</description>
		<content:encoded><![CDATA[<p>Right.  That&#8217;s another half-written blog post from long ago, and as people are asking about all the nuances around this I&#8217;m regretting trying to simplify things too much.  There are a dozen things to think about, though &#8212; cardinality has skew, queries use certain values more often than others, etc etc.  Ultimately you really need statistical information about the values in the table and the values that are used in WHERE clauses.  In the example, the query came from a real application that queries for twitter/waiting FAR more often than anything else, by the way.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Stefan Stubbe</title>
		<link>http://www.mysqlperformanceblog.com/2009/06/05/a-rule-of-thumb-for-choosing-column-order-in-indexes/comment-page-1/#comment-579974</link>
		<dc:creator>Stefan Stubbe</dc:creator>
		<pubDate>Tue, 09 Jun 2009 12:34:51 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/?p=663#comment-579974</guid>
		<description>Wouldn&#039;t it be better to check the cardinality of the columns instead of the selectivity of specific values?
When you want to select rows with a different STATUS, a different column order in the index might be better.</description>
		<content:encoded><![CDATA[<p>Wouldn&#8217;t it be better to check the cardinality of the columns instead of the selectivity of specific values?<br />
When you want to select rows with a different STATUS, a different column order in the index might be better.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: ibrahim oguz</title>
		<link>http://www.mysqlperformanceblog.com/2009/06/05/a-rule-of-thumb-for-choosing-column-order-in-indexes/comment-page-1/#comment-578645</link>
		<dc:creator>ibrahim oguz</dc:creator>
		<pubDate>Mon, 08 Jun 2009 09:25:08 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/?p=663#comment-578645</guid>
		<description>Is it possible to use two servers, one for inserting data and the other for reporting (running SELECT queries)? The first server would replicate to the second, but the first would have no indexes while the second would have them.
I ask because indexes slow down inserting, updating, and deleting data, but speed up selecting it.
How can I reorganize the indexes when replicating data?

Thanks</description>
		<content:encoded><![CDATA[<p>Is it possible to use two servers, one for inserting data and the other for reporting (running SELECT queries)? The first server would replicate to the second, but the first would have no indexes while the second would have them.<br />
I ask because indexes slow down inserting, updating, and deleting data, but speed up selecting it.<br />
How can I reorganize the indexes when replicating data?</p>
<p>Thanks</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: peter</title>
		<link>http://www.mysqlperformanceblog.com/2009/06/05/a-rule-of-thumb-for-choosing-column-order-in-indexes/comment-page-1/#comment-578196</link>
		<dc:creator>peter</dc:creator>
		<pubDate>Mon, 08 Jun 2009 00:29:43 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/?p=663#comment-578196</guid>
		<description>Baron,

I think it also makes sense to look at things in cost/benefit terms. Extending the index makes it larger. In your case, adding an extra column that halves the number of rows to be traversed for the query probably makes sense. In other cases, when the gain is only 10%, it does not.

Another thing that often makes sense to consider is covering-index potential, particularly when trying to target several query patterns with a single index.

A covering index slows down queries that can&#039;t use it as a covering index, because it is longer, but it dramatically speeds up queries that can, particularly in I/O-bound scenarios.</description>
		<content:encoded><![CDATA[<p>Baron,</p>
<p>I think it also makes sense to look at things in cost/benefit terms. Extending the index makes it larger. In your case, adding an extra column that halves the number of rows to be traversed for the query probably makes sense. In other cases, when the gain is only 10%, it does not.</p>
<p>Another thing that often makes sense to consider is covering-index potential, particularly when trying to target several query patterns with a single index.</p>
<p>A covering index slows down queries that can&#8217;t use it as a covering index, because it is longer, but it dramatically speeds up queries that can, particularly in I/O-bound scenarios.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Baron Schwartz</title>
		<link>http://www.mysqlperformanceblog.com/2009/06/05/a-rule-of-thumb-for-choosing-column-order-in-indexes/comment-page-1/#comment-578140</link>
		<dc:creator>Baron Schwartz</dc:creator>
		<pubDate>Sun, 07 Jun 2009 23:58:55 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/?p=663#comment-578140</guid>
		<description>Rob, it&#039;s a good point about the size of the column.  This was meant to be kind of an all-things-being-equal example.  I didn&#039;t want to get into all the subtleties of range criteria.  That&#039;s a big mess to explain clearly :)</description>
		<content:encoded><![CDATA[<p>Rob, it&#8217;s a good point about the size of the column.  This was meant to be kind of an all-things-being-equal example.  I didn&#8217;t want to get into all the subtleties of range criteria.  That&#8217;s a big mess to explain clearly <img src='http://www.mysqlperformanceblog.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Anthony Linton</title>
		<link>http://www.mysqlperformanceblog.com/2009/06/05/a-rule-of-thumb-for-choosing-column-order-in-indexes/comment-page-1/#comment-577708</link>
		<dc:creator>Anthony Linton</dc:creator>
		<pubDate>Sun, 07 Jun 2009 12:53:28 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/?p=663#comment-577708</guid>
		<description>Thanks for the post, a great help. A lot of the recent ones have seemed to be at a more complex level, which isn&#039;t so useful to me =)</description>
		<content:encoded><![CDATA[<p>Thanks for the post, a great help. A lot of the recent ones have seemed to be at a more complex level, which isn&#8217;t so useful to me =)</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Rob Wultsch</title>
		<link>http://www.mysqlperformanceblog.com/2009/06/05/a-rule-of-thumb-for-choosing-column-order-in-indexes/comment-page-1/#comment-577498</link>
		<dc:creator>Rob Wultsch</dc:creator>
		<pubDate>Sun, 07 Jun 2009 08:26:35 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/?p=663#comment-577498</guid>
		<description>I have done similar things in the past though I have mostly looked at cardinality as I assumed this would be a better general use case.  (note: I think this is a foolish assumption when most queries are looking for the same values...)

I have wondered whether the size of the columns should be taken into account. For example, let&#039;s say an int is slightly less selective than a varchar(50). I would think that the smaller datatype should be somewhat preferred, as a smaller index would be created and, I imagine, traversed more quickly.

Also, I think it would be worth noting the effect of range-type restrictions on queries that might make use of composite indexes, for users who are unfamiliar with them.

Any thoughts?</description>
		<content:encoded><![CDATA[<p>I have done similar things in the past though I have mostly looked at cardinality as I assumed this would be a better general use case.  (note: I think this is a foolish assumption when most queries are looking for the same values&#8230;)</p>
<p>I have wondered whether the size of the columns should be taken into account. For example, let&#8217;s say an int is slightly less selective than a varchar(50). I would think that the smaller datatype should be somewhat preferred, as a smaller index would be created and, I imagine, traversed more quickly.</p>
<p>Also, I think it would be worth noting the effect of range-type restrictions on queries that might make use of composite indexes, for users who are unfamiliar with them.</p>
<p>Any thoughts?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Roman Petrichev</title>
		<link>http://www.mysqlperformanceblog.com/2009/06/05/a-rule-of-thumb-for-choosing-column-order-in-indexes/comment-page-1/#comment-575743</link>
		<dc:creator>Roman Petrichev</dc:creator>
		<pubDate>Fri, 05 Jun 2009 19:12:11 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/?p=663#comment-575743</guid>
		<description>Live data can change dramatically within a few minutes, and a compound index that previously looked efficient can easily become quite inefficient for the same query on the changed data set. So this technique is more suitable for immutable tables, IMHO.</description>
		<content:encoded><![CDATA[<p>Live data can change dramatically within a few minutes, and a compound index that previously looked efficient can easily become quite inefficient for the same query on the changed data set. So this technique is more suitable for immutable tables, IMHO.</p>
]]></content:encoded>
	</item>
</channel>
</rss>
