<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: &#8220;Shard early, shard often&#8221;</title>
	<atom:link href="http://www.mysqlperformanceblog.com/2009/11/16/shard-early-shard-often/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.mysqlperformanceblog.com/2009/11/16/shard-early-shard-often/</link>
	<description>Percona&#039;s MySQL &#38; InnoDB performance and scalability blog</description>
	<lastBuildDate>Sat, 11 Feb 2012 16:45:54 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
	<item>
		<title>By: Morgan Tocker</title>
		<link>http://www.mysqlperformanceblog.com/2009/11/16/shard-early-shard-often/comment-page-1/#comment-681575</link>
		<dc:creator>Morgan Tocker</dc:creator>
		<pubDate>Sat, 21 Nov 2009 19:51:29 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/?p=1730#comment-681575</guid>
		<description>Peter - Yes, having multiple machines certainly is helpful in A/B testing changes, but at the simplest level tuning queries with mk-query-digest is fairly safe unless you go too far with indexes on a write-heavy workload.

There&#039;s nothing wrong with planning to shard early - just hold off implementing it while you can.  As Arjen wrote, users can surprise you - the components you thought wouldn&#039;t grow exponentially do.</description>
		<content:encoded><![CDATA[<p>Peter &#8211; Yes, having multiple machines certainly is helpful in A/B testing changes, but at the simplest level tuning queries with mk-query-digest is fairly safe unless you go too far with indexes on a write-heavy workload.</p>
<p>There&#8217;s nothing wrong with planning to shard early &#8211; just hold off implementing it while you can.  As Arjen wrote, users can surprise you &#8211; the components you thought wouldn&#8217;t grow exponentially do.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Morgan Tocker</title>
		<link>http://www.mysqlperformanceblog.com/2009/11/16/shard-early-shard-often/comment-page-1/#comment-678773</link>
		<dc:creator>Morgan Tocker</dc:creator>
		<pubDate>Tue, 17 Nov 2009 17:36:47 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/?p=1730#comment-678773</guid>
		<description>Michael and Arjen: You raise good points about not introducing functionality that could block sharding if you&#039;ve established you need to shard at some point.  That can be a case where prolonging pain causes more pain - and I&#039;ve certainly seen that happen.

My main message here is really in response to a repeating pattern on systems I&#039;ve had to work on.  If you are using a pretty standard box with 1-2 spindles and 4-8G of RAM, there are a lot more opportunities open to you before sharding.  Most of the items in Mark&#039;s list (comment #3) were not available 2-3 years ago.</description>
		<content:encoded><![CDATA[<p>Michael and Arjen: You raise good points about not introducing functionality that could block sharding if you&#8217;ve established you need to shard at some point.  That can be a case where prolonging pain causes more pain &#8211; and I&#8217;ve certainly seen that happen.</p>
<p>My main message here is really in response to a repeating pattern on systems I&#8217;ve had to work on.  If you are using a pretty standard box with 1-2 spindles and 4-8G of RAM, there are a lot more opportunities open to you before sharding.  Most of the items in Mark&#8217;s list (comment #3) were not available 2-3 years ago.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Peter Sankauskas</title>
		<link>http://www.mysqlperformanceblog.com/2009/11/16/shard-early-shard-often/comment-page-1/#comment-678770</link>
		<dc:creator>Peter Sankauskas</dc:creator>
		<pubDate>Tue, 17 Nov 2009 17:31:33 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/?p=1730#comment-678770</guid>
		<description>Your first bullet point is mostly true... definitely nothing replaces good data design with effective use of indexes. However, there are some reasons why you want to shard (or at least build for it) from the start. Your second bullet point is true, but misses the advantage. Having multiple servers allows you to experiment with different options and, if done properly, lets you choose the best optimization sooner. Depending on how the sharding (or sharding combined with replication) is done, you may also be able to create an index (which locks a table) without bringing the application down to do it.

One more point that is somewhat overlooked is stress. It is much easier to think about and plan out a good sharding model when you do not have to deal with an application that has performance problems, all while the boss wants to add more features. If you do leave sharding for later, make sure everyone on the team knows about the potential technical debt and that it is accounted for. You don&#039;t want to have to rush into sharding under pressure/stress.</description>
		<content:encoded><![CDATA[<p>Your first bullet point is mostly true&#8230; definitely nothing replaces good data design with effective use of indexes. However, there are some reasons why you want to shard (or at least build for it) from the start. Your second bullet point is true, but misses the advantage. Having multiple servers allows you to experiment with different options and, if done properly, lets you choose the best optimization sooner. Depending on how the sharding (or sharding combined with replication) is done, you may also be able to create an index (which locks a table) without bringing the application down to do it.</p>
<p>One more point that is somewhat overlooked is stress. It is much easier to think about and plan out a good sharding model when you do not have to deal with an application that has performance problems, all while the boss wants to add more features. If you do leave sharding for later, make sure everyone on the team knows about the potential technical debt and that it is accounted for. You don&#8217;t want to have to rush into sharding under pressure/stress.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Mark Callaghan</title>
		<link>http://www.mysqlperformanceblog.com/2009/11/16/shard-early-shard-often/comment-page-1/#comment-678475</link>
		<dc:creator>Mark Callaghan</dc:creator>
		<pubDate>Tue, 17 Nov 2009 05:19:17 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/?p=1730#comment-678475</guid>
		<description>Several things can delay the need to shard: affordable RAM, affordable flash storage, the InnoDB plugin or XtraDB, and smart people, including expert consultants. Hopefully all of these are given proper consideration. Sharding is usually much easier than re-sharding (splitting) data on an overloaded shard. The plan to shard must include a plan to reshard.</description>
		<content:encoded><![CDATA[<p>Several things can delay the need to shard: affordable RAM, affordable flash storage, the InnoDB plugin or XtraDB, and smart people, including expert consultants. Hopefully all of these are given proper consideration. Sharding is usually much easier than re-sharding (splitting) data on an overloaded shard. The plan to shard must include a plan to reshard.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Michael</title>
		<link>http://www.mysqlperformanceblog.com/2009/11/16/shard-early-shard-often/comment-page-1/#comment-678445</link>
		<dc:creator>Michael</dc:creator>
		<pubDate>Tue, 17 Nov 2009 03:47:40 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/?p=1730#comment-678445</guid>
		<description>I agree that sharding simply for the sake of it can create more problems and too much work early on when resources are scarce. However, there are at least three good reasons to shard early -- or at least run your dev environments sharded, even if you have a single &quot;shard&quot; in production.

1) It&#039;s too tempting to produce joins or otherwise assume all data is in the same place, especially if you&#039;re using an ORM layer (developers tend not to look at the resulting queries too closely). This makes it difficult to shard later (or vertically partition), and almost impossible to do it quickly.

2) The tools you use for your data access layer may not handle sharding correctly, or may need significant changes, or may result in different idioms. This also makes it difficult to shard later, and will involve significant retraining.

3) You may expect significant traffic bumps as your company gains publicity. Forget the cost for a moment; it&#039;s simply less work to add DB servers into the mix than to upgrade to one with more RAM/CPU/disk or whatever it is you need.

But, truthfully, the best investment a new company can make is to hire smart people with experience in both scaling systems and optimizing performance. :)</description>
		<content:encoded><![CDATA[<p>I agree that sharding simply for the sake of it can create more problems and too much work early on when resources are scarce. However, there are at least three good reasons to shard early &#8212; or at least run your dev environments sharded, even if you have a single &#8220;shard&#8221; in production.</p>
<p>1) It&#8217;s too tempting to produce joins or otherwise assume all data is in the same place, especially if you&#8217;re using an ORM layer (developers tend not to look at the resulting queries too closely). This makes it difficult to shard later (or vertically partition), and almost impossible to do it quickly.</p>
<p>2) The tools you use for your data access layer may not handle sharding correctly, or may need significant changes, or may result in different idioms. This also makes it difficult to shard later, and will involve significant retraining.</p>
<p>3) You may expect significant traffic bumps as your company gains publicity. Forget the cost for a moment; it&#8217;s simply less work to add DB servers into the mix than to upgrade to one with more RAM/CPU/disk or whatever it is you need.</p>
<p>But, truthfully, the best investment a new company can make is to hire smart people with experience in both scaling systems and optimizing performance. <img src='http://www.mysqlperformanceblog.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Arjen Lentz</title>
		<link>http://www.mysqlperformanceblog.com/2009/11/16/shard-early-shard-often/comment-page-1/#comment-678403</link>
		<dc:creator>Arjen Lentz</dc:creator>
		<pubDate>Mon, 16 Nov 2009 23:48:34 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/?p=1730#comment-678403</guid>
		<description>It&#039;s good to think about sharding from early on, that is - keep it in mind.
Doing it early on tends to also be a bad idea because you don&#039;t yet know where the nasties will be when things grow.

For instance, originally independent components (good candidates for functional sharding!) may end up needing to be tightly integrated (joins) to deliver what users end up doing with your system.</description>
		<content:encoded><![CDATA[<p>It&#8217;s good to think about sharding from early on, that is &#8211; keep it in mind.<br />
Doing it early on tends to also be a bad idea because you don&#8217;t yet know where the nasties will be when things grow.</p>
<p>For instance, originally independent components (good candidates for functional sharding!) may end up needing to be tightly integrated (joins) to deliver what users end up doing with your system.</p>
]]></content:encoded>
	</item>
</channel>
</rss>

