When examining MySQL configuration, we quite often want to know how various buffer sizes are used. This matters because some buffers (sort_buffer_size, for example) are allocated to their full size as soon as they are needed, but others are effectively a “max size” and the corresponding buffers are allocated only as big as needed [...]
Extending Index for Innodb tables can hurt performance in a surprising way
One schema optimization we often do is extending an index when there are queries that can use more key parts. Typically this is a safe operation: unless the index length increases dramatically, queries that can use the index can also use a prefix of the new index, can't they? It turns out there are special cases when this is not [...]
Multi Column indexes vs Index Merge
A mistake I commonly see among MySQL users is in how indexes are created. Quite commonly, people just index individual columns as they are referenced in the WHERE clause, thinking this is the optimal indexing strategy. For example, if I had something like AGE=18 AND STATE='CA', they would create 2 separate indexes on AGE and STATE [...]
Statistics of InnoDB tables and indexes available in xtrabackup
If you ever wondered how big this or that index in InnoDB is … you had to calculate it yourself by multiplying the row size (which, I should add, is harder in the case of a VARCHAR, since you need to estimate the average length) by the record count. And it still would be quite [...]
High-Performance Click Analysis with MySQL
We have a lot of customers who do click analysis, site analytics, search engine marketing, online advertising, user behavior analysis, and many similar types of work. The first thing these have in common is that they’re generally some kind of loggable event. The next characteristic of a lot of these systems (real or planned) is [...]
Picking datatype for STATUS fields
Quite commonly, applications need some kind of “status” field: the status of an order (“new”, “confirmed”, “in production”, “shipped”), the status of a job, a message, etc. People handle them in a variety of ways, often without giving enough thought to the choice, which can cause problems later. Perhaps worst, though quite [...]
The MySQL optimizer, the OS cache, and sequential versus random I/O
In my post on estimating query completion time, I wrote about how I measured the performance on a join between a few tables in a typical star schema data warehousing scenario. In short, a query that could take several days to run with one join order takes an hour with another, and the optimizer chose [...]
Speeding up GROUP BY if you want approximate results
While doing performance analysis today, I wanted to count how many hits come to the pages that get more than a couple of visits per day. We had SQL logs in the database, so it was a pretty simple query:
select sum(cnt) from (select count(*) cnt from performance_log_080306 group by page having cnt>2) pv;
Unfortunately, this query ran for over half an hour, badly overloaded the server, and I had to kill [...]
Finding out largest tables on MySQL Server
Finding the largest tables on a MySQL instance is a no-brainer in MySQL 5.0+ thanks to Information Schema, but I still wanted to post the little query I use for the purpose so I can easily find it later; plus, it is quite handy in the way it presents information:
What exactly is read_rnd_buffer_size
Looking for documentation on read_rnd_buffer_size, you would find descriptions such as “The read_rnd_buffer_size is used after a sort, when reading rows in sorted order. If you use many queries with ORDER BY, upping this can improve performance”, which is cool, but it does not really tell you how exactly read_rnd_buffer_size works as well as which [...]

