April 21, 2014

Schema changes – what’s new in MySQL 5.6?

Among many of the improvements you can enjoy in MySQL 5.6, there is one that addresses a huge operational problem that most DBAs and System Administrators encounter in their life: schema changes. While it is usually not a problem for small tables or those in early stages of product life cycle, schema changes become a […]

Troubleshooting MySQL Memory Usage

One of the most painful troubleshooting tasks with MySQL is troubleshooting memory usage. The problem usually starts like this – you have configured MySQL to use reasonable global buffers, such as innodb_buffer_size, key_buffer_size etc, you have reasonable amount of connections but yet MySQL takes much more memory than you would expect, causing swapping or other […]

Apache PHP MySQL and Runaway Scripts

Sometimes due to programming error or due to very complex query you can get your PHP script running too long, well after user stopped waiting for the page to render and went browsing other sites. Looking at Server-Status I’ve seen scripts executing for hours sometimes which is obviously the problem – they take Apache Slot, […]

To SQL_CALC_FOUND_ROWS or not to SQL_CALC_FOUND_ROWS?

When we optimize clients’ SQL queries I pretty often see a queries with SQL_CALC_FOUND_ROWS option used. Many people think, that it is faster to use this option than run two separate queries: one – to get a result set, another – to count total number of rows. In this post I’ll try to check, is […]

How fast can you sort data with MySQL ?

I took the same table as I used for MySQL Group by Performance Tests to see how much MySQL can sort 1.000.000 rows, or rather return top 10 rows from sorted result set which is the most typical way sorting is used in practice. I tested full table scan of the table completes in 0.22 […]

PERFORMANCE_SCHEMA vs Slow Query Log

A couple of weeks ago, shortly after Vadim wrote about Percona Cloud Tools and using Slow Query Log to capture the data, Mark Leith asked why don’t we just use Performance Schema instead? This is an interesting question and I think it deserves its own blog post to talk about. First, I would say main […]

Is Synchronous Replication right for your app?

I talk with lot of people who are really interested in Percona XtraDB Cluster (PXC) and mostly they are interested in PXC as a high-availability solution.  But, what they tend not to think too much about is if moving from async to synchronous replication is right for their application or not. Facts about Galera replication […]

The case for getting rid of duplicate “sets”

The most useful feature of the relational database is that it allows us to easily process data in sets, which can be much faster than processing it serially. When the relational database was first implemented, write-ahead-logging and other technologies did not exist. This made it difficult to implement the database in a way that matched […]

Distributed Set Processing with Shard-Query

Can Shard-Query scale to 20 nodes? Peter asked this question in comments to to my previous Shard-Query benchmark. Actually he asked if it could scale to 50, but testing 20 was all I could due to to EC2 and time limits. I think the results at 20 nodes are very useful to understand the performance: […]

Flexviews – part 3 – improving query performance using materialized views

Combating “data drift” In my first post in this series, I described materialized views (MVs). An MV is essentially a cached result set at one point in time. The contents of the MV will become incorrect (out of sync) when the underlying data changes. This loss of synchronization is sometimes called drift. This is conceptually […]