May 18, 2013

Investigating MySQL Replication Latency in Percona XtraDB Cluster

I was curious to check how Percona XtraDB Cluster behaves when it comes to MySQL replication latency — or better yet, call it data propagation latency. It was interesting to see whenever I can get stale data reads from other cluster nodes after write performed to some specific node. To test it I wrote quite a [...]

InnoDB Full-text Search in MySQL 5.6 (part 1)

I’ve never been a very big fan of MyISAM; I would argue that in most situations, any possible advantages to using MyISAM are far outweighed by the potential disadvantages and the strengths of InnoDB. However, up until MySQL 5.6, MyISAM was the only storage engine with support for full-text search (FTS). And I’ve encountered many [...]

Read/Write Splitting with PHP Webinar Questions Followup

Today I gave a presentation on “Read/Write Splitting with PHP” for Percona Webinars.  If you missed it, you can still register to view the recording and my slides. Thanks to everyone who attended, and especially to folks who asked the great questions.  I answered as many as I could during the session, but here are [...]

Quickly finding unused indexes (and estimating their size)

I had a customer recently who needed to reduce their database size on disk quickly without a lot of messy schema redesign and application recoding.  They didn’t want to drop any actual data, and their index usage was fairly high, so we decided to look for unused indexes that could be removed. Collecting data It’s [...]

A case for MariaDB’s Hash Joins

MariaDB 5.3/5.5 has introduced a new join type “Hash Joins” which is an implementation of a Classic Block-based Hash Join Algorithm. In this post we will see what the Hash Join is, how it works and for what types of queries would it be the right choice. I will show the results of executing benchmarks [...]

Avoiding auto-increment holes on InnoDB with INSERT IGNORE

Are you using InnoDB tables on MySQL version 5.1.22 or newer? If so, you probably have gaps in your auto-increment columns. A simple INSERT IGNORE query creates gaps for every ignored insert, but this is undocumented behaviour. This documentation bug is already submitted. Firstly, we will start with a simple question. Why do we have [...]

Fishing with dynamite, brought to you by the randgen and dbqp

I tend to speak highly of the random query generator as a testing tool and thought I would share a story that shows how it can really shine. At our recent dev team meeting, we spent approximately 30 minutes of hack time to produce test cases for 3 rather hard to duplicate bugs. Of course, [...]

Shard-Query turbo charges Infobright community edition (ICE)

Shard-Query is an open source tool kit which helps improve the performance of queries against a MySQL database by distributing the work over multiple machines and/or multiple cores. This is similar to the divide and conquer approach that Hive takes in combination with Hadoop. Shard-Query applies a clever approach to parallelism which allows it to [...]

MySQL caching methods and tips

“The least expensive query is the query you never run.” Data access is expensive for your application. It often requires CPU, network and disk access, all of which can take a lot of time. Using less computing resources, particularly in the cloud, results in decreased overall operational costs, so caches provide real value by avoiding [...]

Flexviews – part 3 – improving query performance using materialized views

Combating “data drift” In my first post in this series, I described materialized views (MVs). An MV is essentially a cached result set at one point in time. The contents of the MV will become incorrect (out of sync) when the underlying data changes. This loss of synchronization is sometimes called drift. This is conceptually [...]