One of the serious limitations in the fast index creation feature introduced in the InnoDB plugin is that it only works when indexes are explicitly created using ALTER TABLE or CREATE INDEX. Peter has already blogged about it before, here I’ll just briefly reiterate other cases that might benefit from that feature: when ALTER TABLE [...]
When Does InnoDB Update Table Statistics? (And When It Can Bite)
An InnoDB table statistics is used for JOIN optimizations and helping the MySQL optimizer choose the appropriate index for a query. If a table’s statistics or index cardinality becomes outdated, you might see queries which previously performed well suddenly show up on slow query log until InnoDB again updates the statistics. But when does InnoDB [...]
Preprocessing Data
There are many ways of improving response times for users. There are some people that spend a lot of time, energy and money on trying to have the application respond as fast as possible at the time when the users made the request. Those people may miss out on an opportunity to do some or [...]
Using any general purpose computer as a special purpose SIMD computer
Often times, from a computing perspective, one must run a function on a large amount of input. Often times, the same function must be run on many pieces of input, and this is a very expensive process unless the work can be done in parallel. Shard-Query introduces set based processing, which on the surface appears [...]
Distributed Set Processing with Shard-Query
Can Shard-Query scale to 20 nodes? Peter asked this question in comments to to my previous Shard-Query benchmark. Actually he asked if it could scale to 50, but testing 20 was all I could due to to EC2 and time limits. I think the results at 20 nodes are very useful to understand the performance: [...]
Flexviews – part 3 – improving query performance using materialized views
Combating “data drift” In my first post in this series, I described materialized views (MVs). An MV is essentially a cached result set at one point in time. The contents of the MV will become incorrect (out of sync) when the underlying data changes. This loss of synchronization is sometimes called drift. This is conceptually [...]
Using Flexviews – part one, introduction to materialized views
If you know me, then you probably have heard of Flexviews. If not, then it might not be familiar to you. I’m giving a talk on it at the MySQL 2011 CE, and I figured I should blog about it before then. For those unfamiliar, Flexviews enables you to create and maintain incrementally refreshable materialized [...]
MySQL 5.5.8 and Percona Server: being adaptive
As we can see, MySQL 5.5.8 comes with great improvements and scalability fixes. Adding up all the new features, you have a great release. However, there is one area I want to touch on in this post. At Percona, we consider it important not only to have the best peak performance, but also stable and predictable performance. I refer you [...]
MySQL Partitioning – can save you or kill you
I wanted for a while to write about using MySQL Partitioning for Performance Optimization and I just got a relevant customer case to illustrate it. First you need to understand how partitions work internally. Partitions are on the low level are separate table. This means when you’re doing lookup by partitioned key you will look [...]
Caching could be the last thing you want to do
I recently had a run-in with a very popular PHP ecommerce package which makes me want to voice a recurring mistake I see in how many web applications are architected. What is that mistake? The ecommerce package I was working with depended on caching. Out of the box it couldn’t serve 10 pages/second unless I [...]

