February 27, 2007

System size and Performance Optimization

Posted by peter |

Performance optimization is never ending story, you can virtually always find something else to optimize but while on generic system, not tuned by expert you often can get significant performance increase in the matter of hours further performance improvements become more and more time consuming and expensive and gains smaller.

This observation does not only apply to MySQL but to systems running other databases or no database at all as well.

So how does System size come in play here ? Well it is system size and scale what may define how much performance you need to squeeze out, and also how your scaling is set up.

For smaller systems lets say running couple of MySQL servers running another couple can be cheaper than serious application changes and this can be good way to go if you’re not expecting major growth.

This however implies two things – first you can scale system by adding more servers, because scaling up and purchasing more and more powerful servers will not tall you far, especially as MySQL Server may have trouble scaling itself. The second thing – it should be throughput rather than latency problems – for example if your search is taking 10 seconds for the single user on idle seconds you need to fix it and simply adding more boxes may not be enough.

The growth expectation is always interesting trade off. Working with startups We often ask them if they would like me to help them to design architecture which is very easy to implement or which would be efficient and scalable ? Typically we settle on some balance but it does not change the fact – scalable architectures may be more complex to implement. This of course does not mean all simple architectures scale badly – for many tasks simple architectures can be good enough, but it should be right simple architectures. In some cases you might wish to implement simple architecture which you’ll have to redo in the 6 months – just to show proof of concept, see first happy customers and raise investment or fell assured this project is worth to spend savings on. In others you prefer to do things as good as you can from the start. I’m not saying “right” as experience gained during the project development and new features typically require some changes.

If system will be huge from the start, like if Yahoo develops new service and will link it from the front page, or if it is expected to grow quickly it is worth to spend time and money implementing things as optimal as possible. If you’re using 50 5000$ database servers to run the system even 10% performance increase mean 25K$ savings in hardware cost alone not to mention rack space cost, power cost, staff costs and other components of TCO. If you’re MySQL Customer you can also save on support and possibly Enterprise Subscription by keeping number of servers low.

How do we approach it in our practice ? We figure out application size and ask about expected growth, to help to access if minor face lifting such as MySQL Server settings tuning, changing indexes and queries is enough to deliver performance required in cost efficient way, or if we should do some more complex things such as major schema changes and architecture overhaul, adding/changing caching infrastructure and other things.

Of course it is not black and white, typically there are many alternatives and we typically can offer list of items which can be done to the system together with implementation complexity and expected performance impact.

February 25, 2007

Pitfalls of converting to InnoDB

Posted by Vadim |

We often recommend to our clients to convert their current database from MyISAM tables to InnoDB.
The transfer by itself in most cases is almost plain, however the application can be broken by new unexpected errors
1205 (ER_LOCK_WAIT_TIMEOUT)

Lock wait timeout expired. Transaction was rolled back.

1213 (ER_LOCK_DEADLOCK)

Transaction deadlock. You should rerun the transaction.
[read more...]

February 20, 2007

MySQL Consulting – Being on your own

Posted by peter |

About half a year have passed since me and Vadim have left MySQL to do MySQL Consulting on our own. Bunch of people have been wondering about our experiences so I thought it would be worth to share it here.
[read more...]

February 19, 2007

MySQL Error control changes

Posted by peter |

In MySQL 5.0 mainly error control was improved, such as strict mode was added to change famous MySQL behavior of cutting too large strings, too big numbers and allowing you to use dates such as February 31st.

In one case however reverse change was done – in regards to storage engine initialization. Previously if you start MySQL and Innodb storage engine fails to initialize (ie you resized log file but forgot to delete old ones) MySQL Server simply would not start. In recent MySQL 5.0 series however it will continue loading and simply have Innodb storage engine disabled.

This can cause numerous problems especially if you got use to old behavior and do not check MySQL logs but simply check it is started. It especially hurts if you have only some tables in Innodb so you might not notice part of your application does not function. Also monitoring often monitors MySQL is up and running and will not query all tables are accessible not catching such errors. If MySQL is started with Innodb tables disabled accesses to Innodb tables will simply result in errors.

The other way I’ve seen it beating up people is performing Innodb conversion…. By default MySQL substitutes storage engines so if Innodb is disabled MyISAM is used, so I’ve seen people thinking they have converted things to Innodb while they really did not because it was substituted back to MyISAM.

I do not think this change is good one but I’d expect this is something we’ve got to live with – with new “Plugable Storage Engine” concept the importance of storage engines is decreased – if one of these storage engine can’t start who cares ? I do because it has my data in it.

P.S One thing I’m happy tables from storage engines which failed to initialize are not automatically replaced with empty MyISAM tables of the same structure :)

February 16, 2007

Using index for ORDER BY vs restricting number of rows.

Posted by peter |

One interesting problem with MySQL Optimizer I frequently run into is making poor decision when it comes to choosing between using index for ORDER BY or using index for restriction.

Consider we’re running web site which sell goods, goods may be from different categories, different sellers different locations which can be filtered on, and there are also bunch of fields which sorting can be performed on such as seller, price, date added etc.

Such configuration often causes serious challenge choosing proper index configuration as it is hard to add all combinations of restrictions and order by to be fully indexed.

An extra problem comes from the fact MySQL prefers when it is possible to use index for further restriction and than using file sort, rather than using index for sorting and doing non-index based filtering for further restrictions. Here is example:
[read more...]

February 14, 2007

Beware: key_buffer_size larger than 4G does not work

Posted by peter |

I was working with customer today which has MySQL on a system with some 64GB or RAM running MyISAM, so they set key_buffer_size to 16G… and every few days MySQL crashes.

Why ? Because key_buffer_size over 4GB in size is not really supported (checked with latest and greatest MySQL Enterprise 5.0.34). It works just fine until you have less than 4GB worth of key cache used and then it would crash.

If you do not know about this limitation it may be quite complicated to figure out what the problem is as MySQL will be crashing on random statements associated to random tables.

What I find interesting is this bug was known for years and not only fix was not made available but even maximum key_buffer_size was not restricted to safe limit. Former would be close to changing single line of code with variable values limits, might be a bit more than that but it should not be that bad.

Getting use of Slave in MySQL Replication

Posted by peter |

MySQL Replication is asynchronous which causes problems if you would like to use MySQL Slave as it can contain stale data. It is true delay is often insignificant but in times of heavy load or in case you was running some heavy queries on the master which not take time to replicate to the slave replication lag can be significant. Also even very small lag can cause the problems – for example you’ve posted comment on the blog and on next page reload you do not see it as it was read from the slave millisecond later…. this is something you would not like to happen.

I’ll list some techniques here which I found to be helpful for offloading load to the slave without causing application to be have crazy. The same approach can be used in Master-Master replication in Active-Passive mode, just think about passive node as a slave.
[read more...]

February 11, 2007

Content delivery system design mistakes

Posted by peter |

This week I helped dealing with performance problems (part MySQL related and part related to LAMP in general) of system which does quite a bit of content delivery, serving file downloads and images – something a lot of web sites need to do these days. There were quite a bit of mistakes in design for this one which I though worth to note, adding some issues seen in other systems.

Note this list applies to static content distribution, dynamic content has some of its own issues which need different treatment.

DNS TTL Settings The system was using DNS based load balancing, using something like img23.domain.com to serve some of the images. I’m not big fan of purely DNS based load balancing and HA but it works if configured well. In this case however the problem was zero TTL set in DNS configuration. This obviously adds latency especially for “aggregate” pages which may require images to be pulled from 10 different image servers.

Keep Alive In my previous post I wrote you often do not need keep alive for dynamic pages (there are also exceptions) but you really should have Keep Alive enabled while serving images. It especially hurts not to have one if 30 thumbnails are loaded per page if you do not have one.
[read more...]

February 8, 2007

Debugging sleeping connections with MySQL

Posted by peter |

Have you ever seen connection in the SHOW PROCESSLIST output which is in “Sleep” state for a long time and you have no idea why this would happen ?

I see if frequently with web applications and it is often indication of trouble. Not only it means you may run out of MySQL connections quicker than you expected but it also frequently indicates serious problems in the application. If you do not use persistent connections and you have connection in Sleep stage for 600 seconds what could it be ? It may mean some of your pages take that long to generate (or might be the code simply gets into the tight loop and page never gets generated) it also could mean some of external Web Services are slow or not available and you’re not dealing with timeouts properly. Or may be you have several connections to MySQL server and right now running query which takes that long ? In any case it is something frequently worth looking at.

[read more...]

February 5, 2007

Why do you need many apache children ?

Posted by peter |

I already wrote kind of about same topic a while ago and now interesting real life case makes me to write again :)

Most Web applications we’re working with have single tier web architecture, meaning there is just single set of apache servers server requests and nothing else – no dedicated server for static content, no squid in front nothing else. This architecture is frequently used even for medium size web sites which have millions of page views per day.

Typically single Apache server in this configuration will have rather high MaxClients settings (in hundreds) and would argue web site performance suffers if the value is decreased, only few however understand why they need MaxClients to be set to some high number.
[read more...]