April 16, 2014

Percona XtraDB Cluster (PXC): what about GRA_*.log files ?

How easy is it to identify and debug Percona XtraDB Cluster replication problem ? If you are using PXC, you may have already seen in your datadirectory several log files starting with GRA_ Those files correspond to a replication failure. That means the slave thread was not able to apply one transaction. For each of […]

Distributed Set Processing with Shard-Query

Can Shard-Query scale to 20 nodes? Peter asked this question in comments to to my previous Shard-Query benchmark. Actually he asked if it could scale to 50, but testing 20 was all I could due to to EC2 and time limits. I think the results at 20 nodes are very useful to understand the performance: […]

Shard-Query EC2 images available

Infobright and InnoDB AMI images are now available There are now demonstration AMI images for Shard-Query. Each image comes pre-loaded with the data used in the previous Shard-Query blog post. The data in the each image is split into 20 “shards”. This blog post will refer to an EC2 instances as a node from here […]

Ultimate MySQL variable and status reference list

I am constantly referring to the amazing MySQL manual, especially the option and variable reference table. But just as frequently, I want to look up blog posts on variables, or look for content in the Percona documentation or forums. So I present to you what is now my newest Firefox toolbar bookmark: an option and […]

Why you don’t want to shard.

Note: This blog post is part 1 of 4 on building our training workshop.

The Percona training workshop will not cover sharding. If you follow our blog, you’ll notice we don’t talk much about the subject; in some cases it makes sense, but in many we’ve seen that it causes architectures to be prematurely complicated.

So let me state it: You don’t want to shard.

Optimize everything else first, and then if performance still isn’t good enough, it’s time to take a very bitter medicine. The reason you need to shard basically comes down to one of these two reasons

How much overhead is caused by on disk temporary tables

As you might know while running GROUP BY and some other kinds of queries MySQL needs to create temporary tables, which can be created in memory, using MEMORY storage engine or can be created on disk as MYISAM tables. Which one will be used depends on the allowed tmp_table_size and also by the data which […]

MySQL MyISAM Active Active Clustering – looking for trouble ?

Reading last few days worth of planet MySQL and commenting on some entries as you can see. The post by Oli catches my attention. There is also PDF with more details available Oli is saying you can use MySQL with Active Active Clustering and MyISAM tables if you follow certain rules like enabling external locking, […]


Many people asked me to publish a walk through SHOW INNODB STATUS output, showing what you can learn from SHOW INNODB STATUS output and how to use this info to improve MySQL Performance. To start with basics SHOW INNODB STATUS is command which prints out a lot of internal Innodb performance counters, statistics, information about […]

Using MyISAM in production

There were recently number of posts about MyISAM, for example Arjen wrote pretty nice article about MyISAM features so I thought I would share my own view on using MyISAM in production. For me it is not only about table locks. Table locks is only one of MyISAM limitations you need to consider using it […]

MySQL Server Variables – SQL layer or Storage Engine specific.

MySQL Server has tons of variables which may be adjusted to change behavior or for performance purposes. They are documented in the manual as well as on new page Jay has created. Still I see constant confusion out where which of variables apply to storage engines only and which are used on SQL layer and […]