Overview Profiling, analyzing and then fixing queries is likely the most oft-repeated part of a job of a DBA and one that keeps evolving, as new features are added to the application new queries pop up that need to be analyzed and fixed. And there are not too many tools out there that can make [...]
Identifying the load with the help of pt-query-digest and Percona Server
Distributed Set Processing with Shard-Query
Can Shard-Query scale to 20 nodes? Peter asked this question in comments to to my previous Shard-Query benchmark. Actually he asked if it could scale to 50, but testing 20 was all I could due to to EC2 and time limits. I think the results at 20 nodes are very useful to understand the performance: [...]
Shard-Query turbo charges Infobright community edition (ICE)
Shard-Query is an open source tool kit which helps improve the performance of queries against a MySQL database by distributing the work over multiple machines and/or multiple cores. This is similar to the divide and conquer approach that Hive takes in combination with Hadoop. Shard-Query applies a clever approach to parallelism which allows it to [...]
Flexviews – part 3 – improving query performance using materialized views
Combating “data drift” In my first post in this series, I described materialized views (MVs). An MV is essentially a cached result set at one point in time. The contents of the MV will become incorrect (out of sync) when the underlying data changes. This loss of synchronization is sometimes called drift. This is conceptually [...]
Using Flexviews – part two, change data capture
In my previous post I introduced materialized view concepts. This post begins with an introduction to change data capture technology and describes some of the ways in which it can be leveraged for your benefit. This is followed by a description of FlexCDC, the change data capture tool included with Flexviews. It continues with an [...]
Ultimate MySQL variable and status reference list
I am constantly referring to the amazing MySQL manual, especially the option and variable reference table. But just as frequently, I want to look up blog posts on variables, or look for content in the Percona documentation or forums. So I present to you what is now my newest Firefox toolbar bookmark: an option and [...]
High availability for MySQL on Amazon EC2 – Part 2 – Setting up the initial instances
This post is the second of a series that started here. The first step to build the HA solution is to create two working instances, configure them to be EBS based and create a security group for them. A third instance, the client, will be discussed in part 7. Since this will be a proof [...]
Star Schema Bechmark: InfoBright, InfiniDB and LucidDB
In my previous rounds with DataWarehouse oriented engines I used single table without joins, and with small (as for DW) datasize (see http://www.mysqlperformanceblog.com/2009/10/02/analyzing-air-traffic-performance-with-infobright-and-monetdb/, http://www.mysqlperformanceblog.com/2009/10/26/air-traffic-queries-in-luciddb/, http://www.mysqlperformanceblog.com/2009/11/02/air-traffic-queries-in-infinidb-early-alpha/). Addressing these issues, I took Star Schema Benchmark, which is TPC-H modification, and tried run queries against InfoBright, InfiniDB, LucidDB and MonetDB. I did not get results for MonetDB, will [...]
Upgrading MySQL
Upgrading MySQL Server is a very interesting task as you can approach it with so much different “depth”. For some this is 15 minutes job for others it is many month projects. Why is that ? Performing MySQL upgrade two things should normally worry you. It is Regressions – functionality regressions when what you’ve been [...]
A micro-benchmark of stored routines in MySQL
Ever wondered how fast stored routines are in MySQL? I just ran a quick micro-benchmark to compare the speed of a stored function against a “roughly equivalent” subquery. The idea — and there may be shortcomings that are poisoning the results here, your comments welcome — is to see how fast the SQL procedure code [...]

