April 23, 2014

MySQL Indexing Best Practices: Webinar Questions Followup

I had a lot of questions on my MySQL Indexing: Best Practices Webinar (both recording and slides are available now) We had lots of questions. I did not have time to answer some and others are better answered in writing anyway. Q: One developer on our team wants to replace longish (25-30) indexed varchars with […]

A case for MariaDB’s Hash Joins

MariaDB 5.3/5.5 has introduced a new join type “Hash Joins” which is an implementation of a Classic Block-based Hash Join Algorithm. In this post we will see what the Hash Join is, how it works and for what types of queries would it be the right choice. I will show the results of executing benchmarks […]

Fishing with dynamite, brought to you by the randgen and dbqp

I tend to speak highly of the random query generator as a testing tool and thought I would share a story that shows how it can really shine. At our recent dev team meeting, we spent approximately 30 minutes of hack time to produce test cases for 3 rather hard to duplicate bugs. Of course, […]

Side load may massively impact your MySQL Performance

When we’re looking at benchmarks we typically run some stable workload and we run it in isolation – nothing else is happening on the system. This is not however how things happen in real world when we have significant variance in the load and many things can be happening concurrently. It is very typical to […]

Using any general purpose computer as a special purpose SIMD computer

Often times, from a computing perspective, one must run a function on a large amount of input. Often times, the same function must be run on many pieces of input, and this is a very expensive process unless the work can be done in parallel. Shard-Query introduces set based processing, which on the surface appears […]

Shard-Query EC2 images available

Infobright and InnoDB AMI images are now available There are now demonstration AMI images for Shard-Query. Each image comes pre-loaded with the data used in the previous Shard-Query blog post. The data in the each image is split into 20 “shards”. This blog post will refer to an EC2 instances as a node from here […]

Flexviews – part 3 – improving query performance using materialized views

Combating “data drift” In my first post in this series, I described materialized views (MVs). An MV is essentially a cached result set at one point in time. The contents of the MV will become incorrect (out of sync) when the underlying data changes. This loss of synchronization is sometimes called drift. This is conceptually […]

MySQL Partitioning – can save you or kill you

I wanted for a while to write about using MySQL Partitioning for Performance Optimization and I just got a relevant customer case to illustrate it. First you need to understand how partitions work internally. Partitions are on the low level are separate table. This means when you’re doing lookup by partitioned key you will look […]

Shard-Query adds parallelism to queries

Preamble: On performance, workload and scalability: MySQL has always been focused on OLTP workloads. In fact, both Percona Server and MySQL 5.5.7rc have numerous performance improvements which benefit workloads that have high concurrency. Typical OLTP workloads feature numerous clients (perhaps hundreds or thousands) each reading and writing small chunks of data. The recent improvements to […]

New OLAP Wikistat benchmark: Introduction and call for feedbacks

I’ve seen my posts on Ontime Air traffic and Star Schema Benchmark got a lot of interest (links: http://www.mysqlperformanceblog.com/2010/01/07/star-schema-bechmark-infobright-infinidb-and-luciddb/ http://www.mysqlperformanceblog.com/2009/10/02/analyzing-air-traffic-performance-with-infobright-and-monetdb/ http://www.mysqlperformanceblog.com/2009/10/26/air-traffic-queries-in-luciddb/ http://www.mysqlperformanceblog.com/2009/11/02/air-traffic-queries-in-infinidb-early-alpha/ ). However benchmarks by itself did not cover all cases I would want, so I was thinking about better scenario. The biggest problem is to get real big enough dataset, and I thank […]