As part of Percona Remote DBA for MySQL service we recognize that reliable backups are one of the most important things we can bring to the table. In my experience handling emergencies, the single worst thing that can happen is finding out you don’t have backups available when some sort of data loss or catastrophic [...]
Knowing what pt-online-schema-change will do
pt-online-schema-change is simple to use, but internally it is complex. Baron’s webinar about pt-online-schema-change hinted at several of the tool’s complexities. Consequently, users often want to know before making changes what pt-online-schema-change will do when it runs. The tool has two options to help answer this question: –dry-run and –print. When ran with –dry-run and –print, pt-online-schema-change changes nothing [...]
A case for MariaDB’s Hash Joins
MariaDB 5.3/5.5 has introduced a new join type “Hash Joins” which is an implementation of a Classic Block-based Hash Join Algorithm. In this post we will see what the Hash Join is, how it works and for what types of queries would it be the right choice. I will show the results of executing benchmarks [...]
InnoDB’s gap locks
One of the most important features of InnoDB is the row level locking. This feature provides better concurrency under heavy write load but needs additional precautions to avoid phantom reads and to get a consistent Statement based replication. To accomplish that, row level locking databases also acquire gap locks. What is a Phantom Read A [...]
Identifying the load with the help of pt-query-digest and Percona Server
Overview Profiling, analyzing and then fixing queries is likely the most oft-repeated part of a job of a DBA and one that keeps evolving, as new features are added to the application new queries pop up that need to be analyzed and fixed. And there are not too many tools out there that can make [...]
SELECT UNION Results INTO OUTFILE
Here’s a quick tip I know some of us has overlooked at some point. When doing SELECT … UNION SELECT, where do you put the the INTO OUTFILE clause? On the first SELECT, on the last or somewhere else? The manual has the answer here, to quote: Only the last SELECT statement can use INTO [...]
Using any general purpose computer as a special purpose SIMD computer
Often times, from a computing perspective, one must run a function on a large amount of input. Often times, the same function must be run on many pieces of input, and this is a very expensive process unless the work can be done in parallel. Shard-Query introduces set based processing, which on the surface appears [...]
Distributed Set Processing with Shard-Query
Can Shard-Query scale to 20 nodes? Peter asked this question in comments to to my previous Shard-Query benchmark. Actually he asked if it could scale to 50, but testing 20 was all I could due to to EC2 and time limits. I think the results at 20 nodes are very useful to understand the performance: [...]
Moving Subtrees in Closure Table Hierarchies
Many software developers find they need to store hierarchical data, such as threaded comments, personnel org charts, or nested bill-of-materials. Sometimes it’s tricky to do this in SQL and still run efficient queries against the data. I’ll be presenting a webinar for Percona on February 28 at 9am PST. I’ll describe several solutions for storing [...]
Moving from MyISAM to Innodb or XtraDB. Basics
I do not know if it is because we’re hosting a free webinar on migrating MyISAM to Innodb or some other reason but recently I see a lot of questions about migration from MyISAM to Innodb. Webinar will cover the process in a lot more details though I would like to go over basics in [...]

