April 20, 2014

Distributed Set Processing with Shard-Query

Can Shard-Query scale to 20 nodes? Peter asked this question in comments to to my previous Shard-Query benchmark. Actually he asked if it could scale to 50, but testing 20 was all I could due to to EC2 and time limits. I think the results at 20 nodes are very useful to understand the performance: […]

How (not) to find unused indexes

I’ve seen a few people link to an INFORMATION_SCHEMA query to be able to find any indexes that have low cardinality, in an effort to find out what indexes should be removed.  This method is flawed – here’s the first reason why:

The cardinality of status index is woeful, but provided that the application […]

Gathering queries from a server with Maatkit and tcpdump

For the last couple of months, we’ve been quietly developing a MySQL protocol parser for Maatkit. It isn’t an implementation of the protocol: it’s an observer of the protocol. This lets us gather queries from servers that don’t have a slow query log enabled, at very high time resolution. With this new functionality, it becomes […]

Announce: Front End Performance Optimization

I guess many of you know us and so our company for MySQL related services. It is true this is majority of our business at this point but it is far from everything. Our goal in reality is to help people to build and operate quality systems, typically web sites, which means we help customers […]

Using MMM to ALTER huge tables

Few months ago, I wrote about a faster way to do certain table modifications online. It works well when all you want is to remove auto_increment or change ENUM values. When it comes to changes that really require table to be rebuilt – adding/dropping columns or indexes, changing data type, converting data to different character […]

Are PHP persistent connections evil ?

As you probably know PHP “mysql” extension supported persistent connections but they were disabled in new “mysqli” extension, which is probably one of the reasons some people delay migration to this extension. The reason behind using persistent connections is of course reducing number of connects which are rather expensive, even though they are much faster […]