May 25, 2013

Identifying the load with the help of pt-query-digest and Percona Server

Overview Profiling, analyzing and then fixing queries is likely the most oft-repeated part of a job of a DBA and one that keeps evolving, as new features are added to the application new queries pop up that need to be analyzed and fixed. And there are not too many tools out there that can make [...]

Using any general purpose computer as a special purpose SIMD computer

Often times, from a computing perspective, one must run a function on a large amount of input. Often times, the same function must be run on many pieces of input, and this is a very expensive process unless the work can be done in parallel. Shard-Query introduces set based processing, which on the surface appears [...]

Performance Schema tables stats

My previous benchmark on Performance Schema was mainly in memory workload and against single tables. Now after adding multi-tables support to sysbench, it is interesting to see what statistic we can get from workload that produces some disk IO. So let’s run sysbench against 100 tables, each 5000000 rows (~1.2G ) and buffer pool 30G. [...]

Using Flexviews – part two, change data capture

In my previous post I introduced materialized view concepts. This post begins with an introduction to change data capture technology and describes some of the ways in which it can be leveraged for your benefit. This is followed by a description of FlexCDC, the change data capture tool included with Flexviews. It continues with an [...]

Analyzing air traffic performance with InfoBright and MonetDB

Accidentally me and Baron played with InfoBright (see http://www.mysqlperformanceblog.com/2009/09/29/quick-comparison-of-myisam-infobright-and-monetdb/) this week. And following Baron’s example I also run the same load against MonetDB. Reading comments to Baron’s post I tied to load the same data to LucidDB, but I was not successful in this. I tried to analyze a bigger dataset and I took public [...]

Statistics of InnoDB tables and indexes available in xtrabackup

If you ever wondered how big is that or another index in InnoDB … you had to calculate it yourself by multiplying size of row (which I should add is harder in the case of a VARCHAR – since you need to estimate average length) on count of records. And it still would be quite [...]

MVCC: Transaction IDs, Log Sequence numbers and Snapshots

MySQL Storage Engines implementing Multi Version Concurrency Control have several internal identifiers related to MVCC. I see a lot of people being confused what they are and why they are needed so I decided to take a time to explain it a bit. This is general explanation, it does not corresponds to Innodb in particular [...]

Can Innodb Read-Ahead reduce read performance ?

I ran into pretty interesting behavior today. We needed to dump and reload large database and we had pretty good IO subsystem so we started number of mysqldump processes in parallel. Unlike in other case when we did load in parallel, dump in parallel did not increase IO rate significantly and we could still see [...]

MySQL: Followup on UNION for query optimization, Query profiling

Few days ago I wrote an article about using UNION to implement loose index scan. First I should mention double IN also works same way so you do not have to use the union. So changing query to:

So as you see there are really different types of ranges in MySQL. IN range allows [...]

Choosing proper innodb_log_file_size

If you’re doing significant amount of writes to Innodb tables decent size of innodb_log_file_size is important for MySQL Performance. However setting it too large will increase recovery time, so in case of MySQL crash or power failure it may take long time before MySQL Server is operational again. So how to find the optimal combination [...]