May 25, 2013

MySQL Backup tools used by Percona Remote DBA for MySQL

As part of Percona Remote DBA for MySQL service we recognize that reliable backups are one of the most important things we can bring to the table. In my experience handling emergencies, the single worst thing that can happen is finding out you don’t have backups available when some sort of data loss or catastrophic [...]

The Optimization That (Often) Isn’t: Index Merge Intersection

Prior to version 5.0, MySQL could only use one index per table in a given query without any exceptions; folks that didn’t understand this limitation would often have tables with lots of single-column indexes on columns which commonly appeared in their WHERE clauses, and they’d wonder why the EXPLAIN plan for a given SELECT would [...]

MySQL Indexing Best Practices: Webinar Questions Followup

I had a lot of questions on my MySQL Indexing: Best Practices Webinar (both recording and slides are available now) We had lots of questions. I did not have time to answer some and others are better answered in writing anyway. Q: One developer on our team wants to replace longish (25-30) indexed varchars with [...]

A case for MariaDB’s Hash Joins

MariaDB 5.3/5.5 has introduced a new join type “Hash Joins” which is an implementation of a Classic Block-based Hash Join Algorithm. In this post we will see what the Hash Join is, how it works and for what types of queries would it be the right choice. I will show the results of executing benchmarks [...]

Index Condition Pushdown in MySQL 5.6 and MariaDB 5.5 and its performance impact

I have been working with Peter in preparation for the talk comparing the optimizer enhancements in MySQL 5.6 and MariaDB 5.5. We are taking a look at and benchmarking optimizer enhancements one by one. So in the same way this blog post is aimed at a new optimizer enhancement Index Condition Pushdown (ICP). Its available [...]

Using any general purpose computer as a special purpose SIMD computer

Often times, from a computing perspective, one must run a function on a large amount of input. Often times, the same function must be run on many pieces of input, and this is a very expensive process unless the work can be done in parallel. Shard-Query introduces set based processing, which on the surface appears [...]

Shard-Query turbo charges Infobright community edition (ICE)

Shard-Query is an open source tool kit which helps improve the performance of queries against a MySQL database by distributing the work over multiple machines and/or multiple cores. This is similar to the divide and conquer approach that Hive takes in combination with Hadoop. Shard-Query applies a clever approach to parallelism which allows it to [...]

Flexviews – part 3 – improving query performance using materialized views

Combating “data drift” In my first post in this series, I described materialized views (MVs). An MV is essentially a cached result set at one point in time. The contents of the MV will become incorrect (out of sync) when the underlying data changes. This loss of synchronization is sometimes called drift. This is conceptually [...]

A workaround for the performance problems of TEMPTABLE views

MySQL supports two different algorithms for views: the MERGE algorithm and the TEMPTABLE algorithm. These two algorithms differ greatly. A view which uses the MERGE algorithm can merge filter conditions into the view query itself. This has significant performance advantages over TEMPTABLE views. A view which uses the TEMPTABLE algorithm will have to compute the [...]

Debugging problems with row based replication

MySQL 5.1 introduces row based binary logging. In fact, the default binary logging format in GA versions of MySQL 5.1 is ‘MIXED’ STATEMENT*;   The binlog_format  variable can still be changed per sessions which means it is possible that some of your binary log entries will be written in a row-based fashion instead of the [...]