May 25, 2013

Benchmarking Percona Server TokuDB vs InnoDB

After compiling Percona Server with TokuDB, of course I wanted to compare InnoDB performance vs TokuDB. I have a particular workload I’m interested in testing – it is an insert-intensive workload (which is TokuDB’s strong suit) with some roll-up aggregation, which should produce updates in-place (I will use INSERT .. ON DUPLICATE KEY UPDATE statements [...]

MySQL 5.6 vs MySQL 5.5 and the Star Schema Benchmark

So far most of the benchmarks posted about MySQL 5.6 use the sysbench OLTP workload.  I wanted to test a set of queries which, unlike sysbench, utilize joins.  I also wanted an easily reproducible set of data which is more rich than the simple sysbench table.  The Star Schema Benchmark (SSB) seems ideal for this. [...]

InnoDB Full-text Search in MySQL 5.6 (part 1)

I’ve never been a very big fan of MyISAM; I would argue that in most situations, any possible advantages to using MyISAM are far outweighed by the potential disadvantages and the strengths of InnoDB. However, up until MySQL 5.6, MyISAM was the only storage engine with support for full-text search (FTS). And I’ve encountered many [...]

Logging Deadlock errors

The principal source of information for InnoDB diagnostics is the output of SHOW ENGINE INNODB STATUS but there are some sections that are not very useful. For example, LATEST DETECTED DEADLOCK only shows, as the name implies, the latest error detected. If you have 100 deadlocks per minute you will be able to see only [...]

MySQL Indexing Best Practices: Webinar Questions Followup

I had a lot of questions on my MySQL Indexing: Best Practices Webinar (both recording and slides are available now) We had lots of questions. I did not have time to answer some and others are better answered in writing anyway. Q: One developer on our team wants to replace longish (25-30) indexed varchars with [...]

Using any general purpose computer as a special purpose SIMD computer

Often times, from a computing perspective, one must run a function on a large amount of input. Often times, the same function must be run on many pieces of input, and this is a very expensive process unless the work can be done in parallel. Shard-Query introduces set based processing, which on the surface appears [...]

Connecting orphaned .ibd files

There are two ways InnoDB can organize tablespaces. First is when all data, indexes and system buffers are stored in a single tablespace. This is typicaly one or several ibdata files. A well known innodb_file_per_table option brings the second one. Tables and system areas are split into different files. Usually system tablespace is located in [...]

Flexviews – part 3 – improving query performance using materialized views

Combating “data drift” In my first post in this series, I described materialized views (MVs). An MV is essentially a cached result set at one point in time. The contents of the MV will become incorrect (out of sync) when the underlying data changes. This loss of synchronization is sometimes called drift. This is conceptually [...]

MySQL Partitioning – can save you or kill you

I wanted for a while to write about using MySQL Partitioning for Performance Optimization and I just got a relevant customer case to illustrate it. First you need to understand how partitions work internally. Partitions are on the low level are separate table. This means when you’re doing lookup by partitioned key you will look [...]

Statistics of InnoDB tables and indexes available in xtrabackup

If you ever wondered how big is that or another index in InnoDB … you had to calculate it yourself by multiplying size of row (which I should add is harder in the case of a VARCHAR – since you need to estimate average length) on count of records. And it still would be quite [...]