Percona is glad to announce the release of Percona XtraBackup 1.6.3 on 22 September, 2011 (Downloads are available here and from the Percona Software Repositories). This release consists purely of bug fixes and is the current stable release of Percona XtraBackup. If the innodb_file_per_table server option is being used and DDL operations (TRUNCATE TABLE, DROP/CREATE the_same_table or ALTER statements) on InnoDB tables are [...]
Preprocessing Data
There are many ways of improving response times for users. Some people spend a lot of time, energy and money trying to make the application respond as fast as possible at the moment the user makes the request. Those people may miss out on an opportunity to do some or [...]
The case for getting rid of duplicate “sets”
The most useful feature of the relational database is that it allows us to easily process data in sets, which can be much faster than processing it serially. When the relational database was first implemented, write-ahead logging and other technologies did not exist. This made it difficult to implement the database in a way that matched [...]
Checking the subset sum set problem with set processing
Hi, here is an easy way to run the subset sum check from SQL, which you can then distribute with Shard-Query:
    CREATE TABLE `the list` (
      `id` bigint(20) NOT NULL AUTO_INCREMENT,
      `val` bigint(20) NOT NULL DEFAULT '0',
      PRIMARY KEY (`id`),
      KEY `id` (`id`)
    ) ENGINE=MyISAM;

    SELECT val as `val`, COUNT(DISTINCT (id)) as `cd`
    FROM test.data as d
    WHERE val in (-2,-3,-10,15,15,16)
    GROUP BY val;

    +-----+----------+----------+
    | val | cd       | CNT      |
    +-----+----------+----------+
    | -10 |        1 |        1 |
    |  -3 |        1 |        1 |
    |  -2 |        1 |        1 |
    |  15 | 35417088 | 35417088 |
    +-----+----------+----------+
    5 rows in set (40.20 sec)
Notice there is no 16 in the list, so we did not pass the check. There are enough 15s, though. The distinct value count for each item in the output set must at least [...]
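The comparison described above can be sketched outside the database as well. The excerpt is cut off before it states the full rule, so this sketch assumes the rule is that each requested value must occur in the table at least as many times as it appears in the requested list. The function name `missing_values` and the hard-coded counts are illustrative and not part of the original post.

```python
from collections import Counter


def missing_values(requested, counts):
    """Return {value: shortfall} for each requested value the table cannot supply.

    Assumption: a value passes when its distinct-id count is at least the
    number of times it appears in the requested list.
    """
    needed = Counter(requested)
    return {
        value: times - counts.get(value, 0)
        for value, times in needed.items()
        if counts.get(value, 0) < times
    }


# The IN (...) list from the query, and the `cd` column from its output.
requested = [-2, -3, -10, 15, 15, 16]
counts = {-10: 1, -3: 1, -2: 1, 15: 35417088}

print(missing_values(requested, counts))  # {16: 1}
```

An empty result would mean the check passes. Here 16 is reported as short by one, while the two requested 15s are covered.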
Using any general purpose computer as a special purpose SIMD computer
From a computing perspective, one must often run a function on a large amount of input. Frequently the same function must be run on many pieces of input, and this is a very expensive process unless the work can be done in parallel. Shard-Query introduces set-based processing, which on the surface appears [...]
Distributed set processing performance analysis with ICE 3.5.2pl1 at 20 nodes.
Demonstrating distributed set processing performance. Shard-Query + ICE scales very well up to at least 20 nodes. This post is a detailed performance analysis of what I’ve coined “distributed set processing”. Please also read this post’s “sister post” which describes the distributed set processing technique. Also, remember that Percona can help you get up and [...]
Distributed Set Processing with Shard-Query
Can Shard-Query scale to 20 nodes? Peter asked this question in comments to my previous Shard-Query benchmark. Actually he asked if it could scale to 50, but testing 20 was all I could do due to EC2 and time limits. I think the results at 20 nodes are very useful to understand the performance: [...]
Shard-Query EC2 images available
Infobright and InnoDB AMI images are now available. There are now demonstration AMI images for Shard-Query. Each image comes pre-loaded with the data used in the previous Shard-Query blog post. The data in each image is split into 20 “shards”. This blog post will refer to an EC2 instance as a node from here [...]
Shard-Query turbo charges Infobright community edition (ICE)
Shard-Query is an open source tool kit which helps improve the performance of queries against a MySQL database by distributing the work over multiple machines and/or multiple cores. This is similar to the divide and conquer approach that Hive takes in combination with Hadoop. Shard-Query applies a clever approach to parallelism which allows it to [...]
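As a rough illustration of the divide-and-conquer idea, the sketch below runs the same grouped count on several shards in parallel and merges the partial results. It is not Shard-Query's actual implementation. The shards are plain in-memory lists, and the names `partial_count` and `distributed_count` are hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def partial_count(shard_rows):
    """Per-shard work: the equivalent of SELECT val, COUNT(*) ... GROUP BY val."""
    return Counter(shard_rows)


def distributed_count(shards, workers=4):
    """Run the per-shard aggregate in parallel, then merge the partial counts."""
    total = Counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for partial in pool.map(partial_count, shards):
            total.update(partial)  # Counter.update adds counts together
    return total


# Three hypothetical shards, each holding part of the `val` column.
shards = [[15, 15, -2], [15, -3], [-10, 15]]
print(dict(distributed_count(shards)))
```

COUNT(*) is used here because per-shard counts can simply be added together. An aggregate such as COUNT(DISTINCT ...) only merges this way if no value can appear on more than one shard.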
Multiple purge threads in Percona Server 5.1.56 and MySQL 5.6.2
One of InnoDB's duties, as an MVCC-implementing storage engine, is to get rid of (purge) the old versions of records as they become obsolete. In MySQL 5.1 this is done by the master InnoDB thread. Since then, InnoDB has been moving towards parallelized purge: in MySQL 5.5 there is an option to have a [...]

