April 19, 2014

Using GROUP BY WITH ROLLUP for Reporting Performance Optimization

Quite typical query for reporting applications is to find top X values. If you analyze Web Site logs you would look at most popular web pages or search engine keywords which bring you most of the traffic. If you’re looking at ecommerce reporting you may be interested in best selling product or top sales people. […]

Using the new spatial functions in MySQL 5.6 for geo-enabled applications

Geo-enabled (or location enabled) applications are very common nowadays and many of them use MySQL. The common tasks for such applications are: Find all points of interests (i.e. coffee shops) around (i.e. a 10 mile radius) the given location (latitude and longitude). For example we want to show this to a user of the mobile […]

Percona Server 5.5.18-23.0: Now with Group Commit!

Percona is glad to announce the release of Percona Server 5.5.18-23.0 on December 19th, 2011 (Downloads are available here and from the Percona Software Repositories). Based on MySQL 5.5.18, including all the bug fixes in it, Percona Server 5.5.18-23.0 is now the current stable release in the 5.5 series. All of Percona ‘s software is open-source and free, all the details of the release can […]

Testing the Group Commit Fix

As you may know, Kristian Nielsen made a fix for the Group Commit Problem which we many times wrote about. The fix came into MariaDB 5.3 and Mark Callaghan tested it recently . We ported this patch to Percona Server (it is not in the main branch yet), and here are the results of my […]

Using any general purpose computer as a special purpose SIMD computer

Often times, from a computing perspective, one must run a function on a large amount of input. Often times, the same function must be run on many pieces of input, and this is a very expensive process unless the work can be done in parallel. Shard-Query introduces set based processing, which on the surface appears […]

Flexviews – part 3 – improving query performance using materialized views

Combating “data drift” In my first post in this series, I described materialized views (MVs). An MV is essentially a cached result set at one point in time. The contents of the MV will become incorrect (out of sync) when the underlying data changes. This loss of synchronization is sometimes called drift. This is conceptually […]

Using Flexviews – part one, introduction to materialized views

If you know me, then you probably have heard of Flexviews. If not, then it might not be familiar to you. I’m giving a talk on it at the MySQL 2011 CE, and I figured I should blog about it before then. For those unfamiliar, Flexviews enables you to create and maintain incrementally refreshable materialized […]

Pacemaker, please meet NDB Cluster or using Pacemaker/Heartbeat to start a NDB Cluster

Customers have always asked me to make NDB Cluster starts automatically upon startup of the servers. For the ones who know NDB Cluster, it is tricky to make it starts automatically. I know at least 2 sets of scripts to manage NDB startup, ndb-initializer and from Johan configurator www.severalnines.com. If all the nodes come up […]

How to generate per-database traffic statistics using mk-query-digest

We often encounter customers who have partitioned their applications among a number of databases within the same instance of MySQL (think application service providers who have a separate database per customer organization … or wordpress-mu type of apps). For example, take the following single MySQL instance with multiple (identical) databases:

Using flow control functions for performance monitoring queries

I’m not big fan on flow control functions like IF or CASE used in MySQL Queries as they are often abused used to create queries which are poorly readable as well as can hardly be optimized well by MySQL Optimizer. One way I find IF statement very useful is computing multiple aggregates over different set […]