<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	>
<channel>
	<title>Comments on: Database access Optimization in Web Applications.</title>
	<atom:link href="http://www.mysqlperformanceblog.com/2006/09/12/database-access-optimization-in-web-applications/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.mysqlperformanceblog.com/2006/09/12/database-access-optimization-in-web-applications/</link>
	<description>Everything about MySQL Performance</description>
	<pubDate>Tue, 02 Dec 2008 12:34:10 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.5.1</generator>
		<item>
		<title>By: steve Mac</title>
		<link>http://www.mysqlperformanceblog.com/2006/09/12/database-access-optimization-in-web-applications/#comment-171962</link>
		<dc:creator>steve Mac</dc:creator>
		<pubDate>Tue, 25 Sep 2007 05:02:48 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/2006/09/12/database-access-optimization-in-web-applications/#comment-171962</guid>
		<description>I applaud the effort. I use summary tables to limit the number of rows analyzed. Peter, thank you very much for sharing such quick and straightforward information so openly =). I am bookmarking this too.</description>
		<content:encoded><![CDATA[<p>I applaud the effort. I use summary tables to limit the number of rows analyzed. Peter, thank you very much for sharing such quick and straightforward information so openly =). I am bookmarking this too.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Tim</title>
		<link>http://www.mysqlperformanceblog.com/2006/09/12/database-access-optimization-in-web-applications/#comment-2818</link>
		<dc:creator>Tim</dc:creator>
		<pubDate>Tue, 26 Sep 2006 23:54:56 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/2006/09/12/database-access-optimization-in-web-applications/#comment-2818</guid>
		<description>Great tip.  I'm bookmarking this one.  I think (hope?) we may need this soon.</description>
		<content:encoded><![CDATA[<p>Great tip.  I&#8217;m bookmarking this one.  I think (hope?) we may need this soon.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Alexey</title>
		<link>http://www.mysqlperformanceblog.com/2006/09/12/database-access-optimization-in-web-applications/#comment-2663</link>
		<dc:creator>Alexey</dc:creator>
		<pubDate>Tue, 19 Sep 2006 18:32:28 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/2006/09/12/database-access-optimization-in-web-applications/#comment-2663</guid>
		<description>&#62; Sometimes you would see 100 stories selected so one random of them will be displayed and similar things with filtering on application level.

Another very common bad practice is to do something like SELECT * FROM stories ORDER BY RAND() LIMIT 1;
Most programmers seem to do it, and I'm not really sure that it's faster than transferring all rows (from the query cache, usually) to the app, and randomly picking one of them.</description>
		<content:encoded><![CDATA[<p>&gt; Sometimes you would see 100 stories selected so one random of them will be displayed and similar things with filtering on application level.</p>
<p>Another very common bad practice is to do something like SELECT * FROM stories ORDER BY RAND() LIMIT 1;<br />
Most programmers seem to do it, and I&#8217;m not really sure that it&#8217;s faster than transferring all rows (from the query cache, usually) to the app, and randomly picking one of them.</p>
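<p>A minimal sketch of a third option, under loud assumptions: Python's sqlite3 module stands in for MySQL, and the stories schema and the pick_random_story helper are hypothetical. It counts the rows, chooses a random offset in the application, and fetches only that one row, so nothing is sorted randomly and only one row crosses the wire. Whether this beats ORDER BY RAND() on a given MySQL setup would still need benchmarking, since the server has to step over the skipped rows.</p>

```python
import random
import sqlite3


def pick_random_story(conn, rng=random):
    """Return one uniformly random (id, title) row, or None if the table is empty."""
    (total,) = conn.execute("SELECT COUNT(*) FROM stories").fetchone()
    if total == 0:
        return None
    # Choose the position in the application instead of sorting randomly in SQL.
    offset = rng.randrange(total)
    return conn.execute(
        "SELECT id, title FROM stories ORDER BY id LIMIT 1 OFFSET ?", (offset,)
    ).fetchone()


# Hypothetical demo data: 100 stories in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stories (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO stories (title) VALUES (?)",
    [("story %d" % n,) for n in range(1, 101)],
)
story = pick_random_story(conn)
print(story)
```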
]]></content:encoded>
	</item>
	<item>
		<title>By: peter</title>
		<link>http://www.mysqlperformanceblog.com/2006/09/12/database-access-optimization-in-web-applications/#comment-2636</link>
		<dc:creator>peter</dc:creator>
		<pubDate>Mon, 18 Sep 2006 20:01:18 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/2006/09/12/database-access-optimization-in-web-applications/#comment-2636</guid>
		<description>Felix,

Yes. MySQL does not have materialized views, but summary tables are a great way to limit the number of rows analyzed. This is one of the most common techniques: if you need a query that analyzes (aggregates) 1,000,000 rows to return 10, you're likely missing pre-created summary tables. They can be maintained with triggers or manually (sometimes you can do it much faster manually).</description>
		<content:encoded><![CDATA[<p>Felix,</p>
<p>Yes. MySQL does not have materialized views, but summary tables are a great way to limit the number of rows analyzed. This is one of the most common techniques: if you need a query that analyzes (aggregates) 1,000,000 rows to return 10, you&#8217;re likely missing pre-created summary tables. They can be maintained with triggers or manually (sometimes you can do it much faster manually).</p>
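<p>A minimal sketch of the trigger variant, assuming Python's sqlite3 module as a stand-in for MySQL (trigger syntax differs between the two). The page_views and story_view_counts tables and the trigger name are hypothetical, chosen only for illustration.</p>

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE page_views (id INTEGER PRIMARY KEY, story_id INTEGER NOT NULL);
CREATE TABLE story_view_counts (story_id INTEGER PRIMARY KEY, views INTEGER NOT NULL);

-- Keep the summary row current on every insert into the detail table.
CREATE TRIGGER page_views_after_insert AFTER INSERT ON page_views
BEGIN
    INSERT OR IGNORE INTO story_view_counts (story_id, views) VALUES (NEW.story_id, 0);
    UPDATE story_view_counts SET views = views + 1 WHERE story_id = NEW.story_id;
END;
""")

conn.executemany(
    "INSERT INTO page_views (story_id) VALUES (?)",
    [(1,), (1,), (2,), (1,)],
)

# Reading the summary touches one row per story instead of every page view.
summary = conn.execute(
    "SELECT story_id, views FROM story_view_counts ORDER BY story_id"
).fetchall()

# The same numbers computed the expensive way, for comparison.
aggregated = conn.execute(
    "SELECT story_id, COUNT(*) FROM page_views GROUP BY story_id ORDER BY story_id"
).fetchall()
print(summary, aggregated)
```

<p>The manual variant would drop the trigger and rebuild or incrementally update story_view_counts from a periodic job, which trades freshness for cheaper writes.</p>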
]]></content:encoded>
	</item>
	<item>
		<title>By: peter</title>
		<link>http://www.mysqlperformanceblog.com/2006/09/12/database-access-optimization-in-web-applications/#comment-2635</link>
		<dc:creator>peter</dc:creator>
		<pubDate>Mon, 18 Sep 2006 19:56:50 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/2006/09/12/database-access-optimization-in-web-applications/#comment-2635</guid>
		<description>Pedantic,

You seem to agree that in many cases it makes a difference. Again, I'm not claiming it is always going to increase your performance dramatically. But if you select many columns which you do not use, there is a chance things can be optimized. Sometimes the gain is only a few percent; sometimes it may be dramatic, e.g. when you get index coverage for some tables. 

The example of 1 million rows of 100 floats is of course simply to show that there is a difference and that string conversion is not free. Especially with connectors written natively in interpreted languages, it may slow things down dramatically. We've seen a difference of 5+ times compared to prepared statements. Prepared statements do help, but there is still overhead for sending extra columns. 

As for permutations of queries that select only the columns you need... Of course I do not mean that. In fact, performance is not the only thing you typically care about. I frequently make decisions which are, e.g., 10% slower but give simpler, clearer code. 

Of the optimizations mentioned, this is the least important one, but I thought it was still worth having in the list; if I had left it out, someone else might have pointed it out :)

Thanks for comments :)</description>
		<content:encoded><![CDATA[<p>Pedantic,</p>
<p>You seem to agree that in many cases it makes a difference. Again, I&#8217;m not claiming it is always going to increase your performance dramatically. But if you select many columns which you do not use, there is a chance things can be optimized. Sometimes the gain is only a few percent; sometimes it may be dramatic, e.g. when you get index coverage for some tables. </p>
<p>The example of 1 million rows of 100 floats is of course simply to show that there is a difference and that string conversion is not free. Especially with connectors written natively in interpreted languages, it may slow things down dramatically. We&#8217;ve seen a difference of 5+ times compared to prepared statements. Prepared statements do help, but there is still overhead for sending extra columns. </p>
<p>As for permutations of queries that select only the columns you need&#8230; Of course I do not mean that. In fact, performance is not the only thing you typically care about. I frequently make decisions which are, e.g., 10% slower but give simpler, clearer code. </p>
<p>Of the optimizations mentioned, this is the least important one, but I thought it was still worth having in the list; if I had left it out, someone else might have pointed it out <img src='http://www.mysqlperformanceblog.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
<p>Thanks for comments <img src='http://www.mysqlperformanceblog.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /></p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Felix</title>
		<link>http://www.mysqlperformanceblog.com/2006/09/12/database-access-optimization-in-web-applications/#comment-2619</link>
		<dc:creator>Felix</dc:creator>
		<pubDate>Mon, 18 Sep 2006 00:06:53 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/2006/09/12/database-access-optimization-in-web-applications/#comment-2619</guid>
		<description>I don't think views can be used very usefully for summary data in MySQL as of yet, although that would be a really awesome thing to have on the list. I guess it can be done with triggers in a somewhat sane way, or else it's up to application logic to update the summary data tables or fields.</description>
		<content:encoded><![CDATA[<p>I don&#8217;t think views can be used very usefully for summary data in MySQL as of yet, although that would be a really awesome thing to have on the list. I guess it can be done with triggers in a somewhat sane way, or else it&#8217;s up to application logic to update the summary data tables or fields.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: pedantic</title>
		<link>http://www.mysqlperformanceblog.com/2006/09/12/database-access-optimization-in-web-applications/#comment-2615</link>
		<dc:creator>pedantic</dc:creator>
		<pubDate>Sun, 17 Sep 2006 21:20:16 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/2006/09/12/database-access-optimization-in-web-applications/#comment-2615</guid>
		<description>Shame on me for not being clearer.
I wasn't arguing that SELECT * is ideal practice (I never, ever use it).  I was just using that for shorthand for "SELECT all columns from table" with all columns enumerated.  The point of my response was that limiting the "projection" of a query has very little benefit (if any) in terms of performance, which is what your blog had claimed.
Now, in response to your claims:
Remember, you titled your entry "Database Optimizations for Web Applications"
emphasis on Web.
So your example of selecting 1 million rows of 100 floats each is irrelevant, since if you're returning 1 million rows to a web browser you've got bigger problems.
Index coverage is certainly a valid point--granted.
As far as string conversion costs go, this is completely obviated if you are using prepared statements.
It is true that returning more data over the wire is generally more expensive than returning less (who could argue with that?), but I'm guided by Pareto--spend your time and effort on things that make a difference.  And I would venture to say that in the vast majority of cases, by limiting your projection you will achieve at most negligible benefits.
And my objection is more of an architectural/design one--as an OO programmer everything is NOT just a big associative array, and if you litter your code with all different permutations of which fields you might need (which is very likely to change), you've created a big mess for very little benefit.  Yet very many software shops continue to do such things.
I just wanted to offer a different view since some might tend to take what you've written here as canon (which btw, I generally do :)</description>
		<content:encoded><![CDATA[<p>Shame on me for not being clearer.<br />
I wasn&#8217;t arguing that SELECT * is ideal practice (I never, ever use it).  I was just using that for shorthand for &#8220;SELECT all columns from table&#8221; with all columns enumerated.  The point of my response was that limiting the &#8220;projection&#8221; of a query has very little benefit (if any) in terms of performance, which is what your blog had claimed.<br />
Now, in response to your claims:<br />
Remember, you titled your entry &#8220;Database Optimizations for Web Applications&#8221;<br />
emphasis on Web.<br />
So your example of selecting 1 million rows of 100 floats each is irrelevant, since if you&#8217;re returning 1 million rows to a web browser you&#8217;ve got bigger problems.<br />
Index coverage is certainly a valid point&#8211;granted.<br />
As far as string conversion costs go, this is completely obviated if you are using prepared statements.<br />
It is true that returning more data over the wire is generally more expensive than returning less (who could argue with that?), but I&#8217;m guided by Pareto&#8211;spend your time and effort on things that make a difference.  And I would venture to say that in the vast majority of cases, by limiting your projection you will achieve at most negligible benefits.<br />
And my objection is more of an architectural/design one&#8211;as an OO programmer everything is NOT just a big associative array, and if you litter your code with all different permutations of which fields you might need (which is very likely to change), you&#8217;ve created a big mess for very little benefit.  Yet very many software shops continue to do such things.<br />
I just wanted to offer a different view since some might tend to take what you&#8217;ve written here as canon (which btw, I generally do <img src='http://www.mysqlperformanceblog.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /></p>
]]></content:encoded>
	</item>
	<item>
		<title>By: peter</title>
		<link>http://www.mysqlperformanceblog.com/2006/09/12/database-access-optimization-in-web-applications/#comment-2611</link>
		<dc:creator>peter</dc:creator>
		<pubDate>Sun, 17 Sep 2006 20:23:41 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/2006/09/12/database-access-optimization-in-web-applications/#comment-2611</guid>
		<description>Pedantic, 

There are cases when SELECT * FROM table WHERE has limited overhead.   There is always overhead (unless you list all fields) but it can be too small to be noticed.  

There are a lot of factors besides BLOB columns:

- Query can be index covered if you select limited number of columns.
- Selecting all columns may require more network traffic and more packets. 
- Conversion to string and back becomes expensive for a large number of values. 
...

Try benchmarking it for example with 1.000.000 row result set having 100 float columns, vs selecting only couple of them :)

So instead of wondering whether it would slow things down in your case, I would simply avoid it and select only the columns you need. 

There are other reasons SELECT * is bad practice besides performance: for example, referring to columns by offset may break if a column is inserted between other columns by ALTER TABLE; it may also break if a column is added and a JOIN now returns duplicate column names, etc. 

Sometimes you want to present all data from the table and do not want to think about which data is going to be used in a particular case. That might be reasonable in a particular case, but it does not mean it would not be slower.

As for other factors which could be added to the list: of course. I write a great deal about query optimization in this blog. This list is meant to flag things that are implemented suboptimally; it does not mean that an application which passes this checklist will never have problems :)</description>
		<content:encoded><![CDATA[<p>Pedantic, </p>
<p>There are cases when SELECT * FROM table WHERE has limited overhead.   There is always overhead (unless you list all fields) but it can be too small to be noticed.  </p>
<p>There are a lot of factors besides BLOB columns:</p>
<p>- Query can be index covered if you select limited number of columns.<br />
- Selecting all columns may require more network traffic and more packets.<br />
- Conversion to string and back becomes expensive for a large number of values.<br />
&#8230;</p>
<p>Try benchmarking it for example with 1.000.000 row result set having 100 float columns, vs selecting only couple of them <img src='http://www.mysqlperformanceblog.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
<p>So instead of wondering whether it would slow things down in your case, I would simply avoid it and select only the columns you need. </p>
<p>There are other reasons SELECT * is bad practice besides performance: for example, referring to columns by offset may break if a column is inserted between other columns by ALTER TABLE; it may also break if a column is added and a JOIN now returns duplicate column names, etc. </p>
<p>Sometimes you want to present all data from the table and do not want to think about which data is going to be used in a particular case. That might be reasonable in a particular case, but it does not mean it would not be slower.</p>
<p>As for other factors which could be added to the list: of course. I write a great deal about query optimization in this blog. This list is meant to flag things that are implemented suboptimally; it does not mean that an application which passes this checklist will never have problems <img src='http://www.mysqlperformanceblog.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /></p>
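<p>The index-coverage point can be checked with a small sketch. It assumes Python's sqlite3 module as a stand-in for MySQL (where EXPLAIN reports "Using index" in the Extra column instead); the users table, the index name, and the plan helper are hypothetical.</p>

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT, bio TEXT)"
)
conn.execute("CREATE INDEX idx_users_email_name ON users (email, name)")


def plan(sql):
    """Return SQLite's query plan for sql as a single string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, ("a@example.com",)).fetchall()
    # The last column of each plan row holds the human-readable detail.
    return " | ".join(row[-1] for row in rows)


# Only indexed columns are requested, so the index alone can answer the query.
narrow_plan = plan("SELECT name FROM users WHERE email = ?")

# SELECT * also needs bio, so every match costs an extra lookup in the table.
wide_plan = plan("SELECT * FROM users WHERE email = ?")

print(narrow_plan)
print(wide_plan)
```

<p>On a SQLite build the first plan should mention a covering index and the second should not, which is the same distinction as an index-covered query in MySQL.</p>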
]]></content:encoded>
	</item>
	<item>
		<title>By: pedantic</title>
		<link>http://www.mysqlperformanceblog.com/2006/09/12/database-access-optimization-in-web-applications/#comment-2598</link>
		<dc:creator>pedantic</dc:creator>
		<pubDate>Sun, 17 Sep 2006 04:16:41 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/2006/09/12/database-access-optimization-in-web-applications/#comment-2598</guid>
		<description>I agree with most of what you've written, but I have to object to what I consider common FUD:
the comment about only selecting those columns that are actually required for some "display"
This generally has zero impact on the performance of a query of the SELECT * FROM table where primary_key=x  (with usual caveats about BLOBs and the like).
It is relatively easy to quickly enumerate literally dozens of factors (as you did mention, what really matters is the # of rows that query processing READS) that impact the performance of a query and that one would have to go at the bottom of the list.  And this can easily be benchmarked--I have seen far too many ill-informed developers littering their code with attempts to "gain" this performance benchmark.
Frankly, I'm surprised to see you propagate it.</description>
		<content:encoded><![CDATA[<p>I agree with most of what you&#8217;ve written, but I have to object to what I consider common FUD:<br />
the comment about only selecting those columns that are actually required for some &#8220;display&#8221;<br />
This generally has zero impact on the performance of a query of the SELECT * FROM table where primary_key=x  (with usual caveats about BLOBs and the like).<br />
It is relatively easy to quickly enumerate literally dozens of factors (as you did mention, what really matters is the # of rows that query processing READS) that impact the performance of a query and that one would have to go at the bottom of the list.  And this can easily be benchmarked&#8211;I have seen far too many ill-informed developers littering their code with attempts to &#8220;gain&#8221; this performance benchmark.<br />
Frankly, I&#8217;m surprised to see you propagate it.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Dmitri Mikhailov</title>
		<link>http://www.mysqlperformanceblog.com/2006/09/12/database-access-optimization-in-web-applications/#comment-2548</link>
		<dc:creator>Dmitri Mikhailov</dc:creator>
		<pubDate>Fri, 15 Sep 2006 14:16:58 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/2006/09/12/database-access-optimization-in-web-applications/#comment-2548</guid>
		<description>To gather application-to-database round-trip statistics: http://forge.mysql.com/snippets/view.php?id=15</description>
		<content:encoded><![CDATA[<p>To gather application-to-database round-trip statistics: <a href="http://forge.mysql.com/snippets/view.php?id=15" rel="nofollow">http://forge.mysql.com/snippets/view.php?id=15</a></p>
]]></content:encoded>
	</item>
</channel>
</rss>
