Which Linux distribution for a MySQL database server? A specific point of view.
One of the more common questions I get asked is which Linux distribution I would use for a MySQL database server. Bearing the responsibility for someone else’s success means I should advise something that is stable, reliable, easy to manage and has plenty of resources available online. It should also allow running MySQL without too [...]
Using any general purpose computer as a special purpose SIMD computer
Often, from a computing perspective, one must run a function on a large amount of input. Frequently the same function must be run on many pieces of input, and this is a very expensive process unless the work can be done in parallel. Shard-Query introduces set-based processing, which on the surface appears [...]
Shard-Query turbo charges Infobright community edition (ICE)
Shard-Query is an open source toolkit which helps improve the performance of queries against a MySQL database by distributing the work over multiple machines and/or multiple cores. This is similar to the divide-and-conquer approach that Hive takes in combination with Hadoop. Shard-Query applies a clever approach to parallelism which allows it to [...]
Should we give a MySQL Query Cache a second chance?
Over the last few years I’ve been advising more people to disable the Query Cache than to enable it. It can cause contention problems as well as stalls, and due to coarse invalidation it is not as efficient as it could be. These problems are, however, mostly due to the neglect the Query Cache has received over almost 10 years, with very [...]
InnoDB undo segment size and transaction isolation
You might know that if you have long-running transactions, you risk accumulating a lot of “garbage” in the undo segment, which can cause performance degradation as well as increased disk space usage. Long transactions can also be bad for other reasons, such as taking row-level locks which will prevent other transactions from executing, [...]
Lost InnoDB tables, XFS and binary grep
Before I start a story about the data recovery case I worked on yesterday, here’s a quick tip – having a database backup does not mean you can restore from it. Always verify your backup can be used to restore the database! If not automatically, do this manually, at least once a month. No, seriously [...]
Percona’s Commitments to MySQL Users
You probably saw the Twitter storm over Oracle’s pricing changes and InnoDB in the last few days. The fear about Oracle removing InnoDB from the free version of MySQL was baseless; it was just a misunderstanding. Still, in the years since MySQL was acquired by Sun, and then by Oracle, many MySQL users [...]
Baron Schwartz interviewed on WebPulp.tv
There’s an interview with Baron Schwartz (that’s me) on WebPulp.tv. Topics include the history of Percona’s software such as Percona Server (our version of the MySQL database server) and XtraBackup, what we do at Percona, what tools we use to do it, how to think logically about performance optimization, what ugly surprises happen when you [...]
Cache Miss Storm
I recently worked on a problem in which a rather low MySQL load (probably 5% CPU usage and close to zero IO) would spike to hundreds of threads running at the same time, causing an intense utilization spike and leaving the server very unresponsive for anywhere from half a minute to ten minutes until everything [...]
Data mart or data warehouse?
This is part two in my six-part series on business intelligence, with a focus on OLAP analysis.
Part 1 – Intro to OLAP
Identifying the differences between a data warehouse and a data mart. (this post)
Introduction to MDX and the kind of SQL which a ROLAP tool must generate to answer those queries. [...]

