April 24, 2014

Wrong GROUP BY makes your queries fragile

This is probably well known issue for everyone having some MySQL experience or experience with any other SQL database. Still I see this problem in many production applications so it is worth to mention it, especially as it is connected to MySQL Performance. No it might not affect MySQL Performance per say but it limits […]

Followup questions to ‘What’s new in Percona XtraDB Cluster 5.6′ webinar

Thanks to all who attended my webinar yesterday.  The slides and recording are available on the webinar’s page.  I was a bit overwhelmed with the amount of questions that came in and I’ll try to answer them the best I can here. Q: Does Percona XtraDB Cluster support writing to multiple master? Yes, it does.  However, […]

Automatic replication relaying in Galera 3.x (available with PXC 5.6)

A decade ago MySQL folks were in love with the concept of a relay slave for MySQL high availability across data centers.  A relay is a single slave in a remote data center that receives replication from the global master and, in turn, replicates to all the other local slaves in that data center.  This saved […]

Limited disk space? Compact backups with Percona Xtrabackup 2.1

One very interesting feature, “Compact Backup,” is introduced in Percona XtraBackup 2.1. You can run “compact backups” with the  –compact option, which is very useful for those who have limited disk space to keep the database backup. Now let’s first understand how it works. When we are using –compact option with Innobackupex, it will omit the secondary […]

Virident vCache vs. FlashCache: Part 2

This is the second part in a two-part series comparing Virident’s vCache to FlashCache. The first part was focused on usability and feature comparison; in this post, we’ll look at some sysbench test results. Disclosure: The research and testing conducted for this post were sponsored by Virident. First, some background information. All tests were conducted […]

Percona XtraDB Cluster: Failure Scenarios with only 2 nodes

During the design period of a new cluster, it is always advised to have at least 3 nodes (this is the case with PXC but it’s also the same with PRM). But why and what are the risks ? The goal of having more than 2 nodes, in fact an odd number is recommended in […]

Benchmarking single-row insert performance on Amazon EC2

I have been working for a customer benchmarking insert performance on Amazon EC2, and I have some interesting results that I wanted to share. I used a nice and effective tool iiBench which has been developed by Tokutek. Though the “1 billion row insert challenge” for which this tool was originally built is long over, […]

Aligning IO on a hard disk RAID – the Benchmarks

In the first part of this article I have showed how I align IO, now I want to share results of the benchmark that I have been running to see how much benefit can we get from a proper IO alignment on a 4-disk RAID1+0 with 64k stripe element. I haven’t been running any benchmarks […]

Using any general purpose computer as a special purpose SIMD computer

Often times, from a computing perspective, one must run a function on a large amount of input. Often times, the same function must be run on many pieces of input, and this is a very expensive process unless the work can be done in parallel. Shard-Query introduces set based processing, which on the surface appears […]

Distributed Set Processing with Shard-Query

Can Shard-Query scale to 20 nodes? Peter asked this question in comments to to my previous Shard-Query benchmark. Actually he asked if it could scale to 50, but testing 20 was all I could due to to EC2 and time limits. I think the results at 20 nodes are very useful to understand the performance: […]