April 20, 2014

A case for MariaDB’s Hash Joins

MariaDB 5.3/5.5 introduced a new join type, “Hash Joins”, which is an implementation of the classic block-based hash join algorithm. In this post we will see what the hash join is, how it works, and for what types of queries it would be the right choice. I will show the results of executing benchmarks […]

New distribution of random generator for sysbench – Zipf

Sysbench has three distributions for random numbers: uniform, special and gaussian. I mostly use uniform and special, and I feel that neither fully reflects my needs when I run benchmarks. Uniform is stupidly simple: for a table with 1 million rows, each row gets an equal number of hits. This barely reflects a real system, […]

Introducing new type of benchmark

Traditionally, most benchmarks focus on throughput. We have all gotten used to that, and in fact in our benchmarks, sysbench and tpcc-mysql, the final result also represents throughput (transactions per second in sysbench; NewOrder transactions per minute in tpcc-mysql). However, as Mark Callaghan mentioned in comments, response time is way more important […]

Percona XtraDB Cluster Feature 2: Multi-Master replication

This is about the second great feature you get with Percona XtraDB Cluster: Multi-Master replication. It is recommended that you get familiar with the general architecture of the cluster, described in the previous post. By Multi-Master I mean the ability to write to any node in your cluster and not worry that eventually you […]

Making the impossible: 3 nodes intercontinental replication

In this post I want to show the new possibilities that open up with Percona XtraDB Cluster. We will create a 3-node cluster with nodes on different continents (Europe, USA, Japan), and each node will accept write queries. Well, you could theoretically create a 3-node traditional MySQL ring replication, but this is not what you want to […]

kernel_mutex problem cont. Or triple your throughput

This is a follow-up to my previous post on the kernel_mutex problem. First, I may have an explanation for why the performance degrades so significantly and why innodb_sync_spin_loops may fix it. Second, if that is correct (or not, but we can try anyway), then playing with innodb_thread_concurrency may also help. So I ran some benchmarks with […]

Fishing with dynamite, brought to you by the randgen and dbqp

I tend to speak highly of the random query generator as a testing tool and thought I would share a story that shows how it can really shine. At our recent dev team meeting, we spent approximately 30 minutes of hack time to produce test cases for 3 rather hard-to-duplicate bugs. Of course, […]

MLC SSD card lifetime and write amplification

As MLC-based SSD cards are gaining popularity, there is also a rising concern about how long they can survive. As we know, an MLC NAND module can handle 5,000-10,000 erase cycles, after which it becomes unusable. And obviously an SSD card based on MLC NAND has a limited lifetime. There are a lot of misconceptions and […]

Flexviews is a working scalable database transactional memory example

http://Flexvie.ws fully implements a method for creating materialized views for MySQL data sets. The tool is for MySQL, but the methods are database agnostic. A materialized view is an analogue of software transactional memory. You can think of this as database transactional memory, or as database state distributed over time, but in an easy way […]

The case for getting rid of duplicate “sets”

The most useful feature of the relational database is that it allows us to easily process data in sets, which can be much faster than processing it serially. When the relational database was first implemented, write-ahead-logging and other technologies did not exist. This made it difficult to implement the database in a way that matched […]