July 29, 2014

Big Data with MySQL and Hadoop at MySQL Connect 2013

I will be talking about Big Data with MySQL and Hadoop at MySQL Connect 2013 (Sept. 21-22) in San Francisco as well as at Percona University at Washington, DC (September 12, 2013). Apache Hadoop is a very popular Big Data solution and we can nowadays easily integrate it with MySQL. I will start with a brief introduction […]

Data compression in InnoDB for text and blob fields

Have you wanted to compress only certain types of columns in a table while leaving other columns uncompressed? While working on a customer case this week I saw an interesting problem where a table had many heavily utilized TEXT fields with some read queries exceeding 500MB (!!), and stored in a 100GB table. In this […]

Data mart or data warehouse?

This is part two in my six part series on business intelligence, with a focus on OLAP analysis. Part 1 – Intro to OLAP Identifying the differences between a data warehouse and a data mart. (this post) Introduction to MDX and the kind of SQL which a ROLAP tool must generate to answer those queries. […]

Blob Storage in Innodb

I’m running in this misconception second time in a week or so, so it is time to blog about it. How blobs are stored in Innodb ? This depends on 3 factors. Blob size; Full row size and Innodb row format. But before we look into how BLOBs are really stored lets see what misconception […]

InnoDB, InnoDB-plugin vs XtraDB on fast storage

To continue fun with FusionIO cards, I wanted to check how MySQL / InnoDB performs here. For benchmark I took MySQL 5.1.42 with built-in InnoDB, InnoDB-plugin 1.0.6, and XtraDB 1.0.6-9 ( InnoDB with Percona patches). As benchmark engine I used tpcc-mysql with 1000 warehouses ( which gives around 90GB of data + indexes) on my […]

Detailed review of Tokutek storage engine

(Note: Review was done as part of our consulting practice, but is totally independent and fully reflects our opinion) I had a chance to take look TokuDB (the name of the Tokutek storage engine), and run some benchmarks. Tuning of TokuDB is much easier than InnoDB, there only few parameters to change, and actually out-of-box […]

Picking datatype for STATUS fields

Quite commonly in the applications you would need to use some kind of “status” field – status of order – “new”, “confirmed”, “in production”, “shipped” status of job, message etc. People use variety of ways to handle them often without giving enough thought to the choice which can cause problems later. Perhaps worst, though quite […]

How fast can MySQL Process Data

Reading Barons post about Kickfire Appliance and of course talking to them directly I learned a lot in their product is about beating data processing limitations of current systems. This raises valid question how fast can MySQL process (filter) data using it current architecture ? I decided to test the most simple case – what […]

Monty unviels Maria and starts Blogging

This weekend we’re hearing great news from Michael “Monty” Widenius – one of the Fathers of MySQL. Monty finally found a time to create his own blog with very descriptive name Monty Says. At the same time Monty finally announces Maria – the MyISAM successor storage engine he has been working for last few years. […]

Working with large data sets in MySQL

What does working with large data sets in mySQL teach you ? Of course you have to learn a lot about query optimization, art of building summary tables and tricks of executing queries exactly as you want. I already wrote about development and configuration side of the problem so I will not go to details […]