April 17, 2014

InnoDB scalability issues due to tables without primary keys

Each day there is probably work done to improve performance of the InnoDB storage engine and remove bottlenecks and scalability issues. Hence there was another one I wanted to highlight: Scalability issues due to tables without primary keys This scalability issue is caused by the usage of tables without primary keys. This issue typically shows […]

TokuDB vs InnoDB in timeseries INSERT benchmark

This post is a continuation of my research of TokuDB’s  storage engine to understand if it is suitable for timeseries workloads. While inserting LOAD DATA INFILE into an empty table shows great results for TokuDB, what’s more interesting is seeing some realistic workloads. So this time let’s take a look at the INSERT benchmark.

Edge-case behavior of INSERT…ODKU

A few weeks back, I was working on a customer issue wherein they were observing database performance that dropped through the floor (to the point of an outage) roughly every 4 weeks or so. Nothing special about the environment, the hardware, or the queries; really, the majority of the database was a single table with […]

Benchmarking single-row insert performance on Amazon EC2

I have been working for a customer benchmarking insert performance on Amazon EC2, and I have some interesting results that I wanted to share. I used a nice and effective tool iiBench which has been developed by Tokutek. Though the “1 billion row insert challenge” for which this tool was originally built is long over, […]

Avoiding auto-increment holes on InnoDB with INSERT IGNORE

Are you using InnoDB tables on MySQL version 5.1.22 or newer? If so, you probably have gaps in your auto-increment columns. A simple INSERT IGNORE query creates gaps for every ignored insert, but this is undocumented behaviour. This documentation bug is already submitted. Firstly, we will start with a simple question. Why do we have […]

Side load may massively impact your MySQL Performance

When we’re looking at benchmarks we typically run some stable workload and we run it in isolation – nothing else is happening on the system. This is not however how things happen in real world when we have significant variance in the load and many things can be happening concurrently. It is very typical to […]

Flexviews – part 3 – improving query performance using materialized views

Combating “data drift” In my first post in this series, I described materialized views (MVs). An MV is essentially a cached result set at one point in time. The contents of the MV will become incorrect (out of sync) when the underlying data changes. This loss of synchronization is sometimes called drift. This is conceptually […]

Instrumentation and the cost of Foreign Keys

I occasionally get in to light arguments healthy discussions with students about whether or not to use Foreign Key constraints on InnoDB tables.  My standard response has always been: “it depends on how much of a tradeoff you are willing to make for performance. In some situations the cost can be considerable”. .. that’s when […]

A workaround for the performance problems of TEMPTABLE views

MySQL supports two different algorithms for views: the MERGE algorithm and the TEMPTABLE algorithm. These two algorithms differ greatly. A view which uses the MERGE algorithm can merge filter conditions into the view query itself. This has significant performance advantages over TEMPTABLE views. A view which uses the TEMPTABLE algorithm will have to compute the […]

High-Performance Click Analysis with MySQL

We have a lot of customers who do click analysis, site analytics, search engine marketing, online advertising, user behavior analysis, and many similar types of work.  The first thing these have in common is that they’re generally some kind of loggable event. The next characteristic of a lot of these systems (real or planned) is […]