April 19, 2014

Checking the subset sum set problem with set processing

Hi, Here is an easy way to run the subset sum check from SQL, which you can then distribute with Shard-Query:

Notice there is no 16 in the list. We did not pass the check. There are enough 15s though. The distinct value count for each item in the output set, must at least […]

Read/Write Splitting with PHP Webinar Questions Followup

Today I gave a presentation on “Read/Write Splitting with PHP” for Percona Webinars.  If you missed it, you can still register to view the recording and my slides. Thanks to everyone who attended, and especially to folks who asked the great questions.  I answered as many as I could during the session, but here are […]

A case for MariaDB’s Hash Joins

MariaDB 5.3/5.5 has introduced a new join type “Hash Joins” which is an implementation of a Classic Block-based Hash Join Algorithm. In this post we will see what the Hash Join is, how it works and for what types of queries would it be the right choice. I will show the results of executing benchmarks […]

Identifying the load with the help of pt-query-digest and Percona Server

Overview Profiling, analyzing and then fixing queries is likely the most oft-repeated part of a job of a DBA and one that keeps evolving, as new features are added to the application new queries pop up that need to be analyzed and fixed. And there are not too many tools out there that can make […]

Distributed Set Processing with Shard-Query

Can Shard-Query scale to 20 nodes? Peter asked this question in comments to to my previous Shard-Query benchmark. Actually he asked if it could scale to 50, but testing 20 was all I could due to to EC2 and time limits. I think the results at 20 nodes are very useful to understand the performance: […]

Using Flexviews – part one, introduction to materialized views

If you know me, then you probably have heard of Flexviews. If not, then it might not be familiar to you. I’m giving a talk on it at the MySQL 2011 CE, and I figured I should blog about it before then. For those unfamiliar, Flexviews enables you to create and maintain incrementally refreshable materialized […]

Caching could be the last thing you want to do

I recently had a run-in with a very popular PHP ecommerce package which makes me want to voice a recurring mistake I see in how many web applications are architected. What is that mistake? The ecommerce package I was working with depended on caching.  Out of the box it couldn’t serve 10 pages/second unless I […]

Quick comparison of MyISAM, Infobright, and MonetDB

Recently I was doing a little work for a client who has MyISAM tables with many columns (the same one Peter wrote about recently). The client’s performance is suffering in part because of the number of columns, which is over 200. The queries are generally pretty simple (sums of columns), but they’re ad-hoc (can access […]

A rule of thumb for choosing column order in indexes

I wanted to share a little rule of thumb I sometimes use to decide which columns should come first in an index. This is not specific to MySQL, it’s generally applicable to any database server with b-tree indexes. And there are a bunch of subtleties, but I will also ignore those for the sake of […]

Goal driven performance optimization

When your goal is to optimize application performance it is very important to understand what goal do you really have. If you do not have a good understanding of the goal your performance optimization effort may well still bring its results but you may waste a lot of time before you reach same results as […]