April 24, 2014

Distributed Set Processing with Shard-Query

Can Shard-Query scale to 20 nodes? Peter asked this question in comments to to my previous Shard-Query benchmark. Actually he asked if it could scale to 50, but testing 20 was all I could due to to EC2 and time limits. I think the results at 20 nodes are very useful to understand the performance: […]

Shard-Query turbo charges Infobright community edition (ICE)

Shard-Query is an open source tool kit which helps improve the performance of queries against a MySQL database by distributing the work over multiple machines and/or multiple cores. This is similar to the divide and conquer approach that Hive takes in combination with Hadoop. Shard-Query applies a clever approach to parallelism which allows it to […]

New OLAP Wikistat benchmark: Introduction and call for feedbacks

I’ve seen my posts on Ontime Air traffic and Star Schema Benchmark got a lot of interest (links: http://www.mysqlperformanceblog.com/2010/01/07/star-schema-bechmark-infobright-infinidb-and-luciddb/ http://www.mysqlperformanceblog.com/2009/10/02/analyzing-air-traffic-performance-with-infobright-and-monetdb/ http://www.mysqlperformanceblog.com/2009/10/26/air-traffic-queries-in-luciddb/ http://www.mysqlperformanceblog.com/2009/11/02/air-traffic-queries-in-infinidb-early-alpha/ ). However benchmarks by itself did not cover all cases I would want, so I was thinking about better scenario. The biggest problem is to get real big enough dataset, and I thank […]

Maatkit Now Supports Memcached

Have you ever wondered how optimized your Memcached installation is? There is a common misconception that one doesn’t have to think too deeply about Memcached performance, but that is not true. If your setup is inefficient, you could: Burn Memory Waste Network Round-Trips Store Keys That Never Get Retrieved Have a Low Cache Hit Ratio […]

Detailed review of Tokutek storage engine

(Note: Review was done as part of our consulting practice, but is totally independent and fully reflects our opinion) I had a chance to take look TokuDB (the name of the Tokutek storage engine), and run some benchmarks. Tuning of TokuDB is much easier than InnoDB, there only few parameters to change, and actually out-of-box […]

XtraDB storage engine release 1.0.2-3 (Spring edition) codename Sapporo

Today we announce release 1.0.2-3 of our XtraDB storage engine. Here is a list of enhancements: Move to MySQL 5.1.31 Scalability fix — ability to use several rollback segments Increasing the number of rseg may be helpful for CPU scale of write-intentional workloads. See benchmark results. Scalability fix — replaced page_hash mutex to page_hash read-write […]

Using Apache Hadoop and Impala together with MySQL for data analysis

Apache Hadoop is commonly used for data analysis. It is fast for data loads and scalable. In a previous post I showed how to integrate MySQL with Hadoop. In this post I will show how to export a table from  MySQL to Hadoop, load the data to Cloudera Impala (columnar format) and run a reporting […]

Q&A: Common (but deadly) MySQL Development Mistakes

On Wednesday I gave a presentation on “How to Avoid Common (but Deadly) MySQL Development Mistakes” for Percona MySQL Webinars. If you missed it, you can still register to view the recording and my slides. Thanks to everyone who attended, and especially to folks who asked the great questions. I answered as many as we had time […]

Increasing slow query performance with the parallel query execution

MySQL and Scaling-up (using more powerful hardware) was always a hot topic. Originally MySQL did not scale well with multiple CPUs; there were times when InnoDB performed poorer with more  CPU cores than with less CPU cores. MySQL 5.6 can scale significantly better; however there is still 1 big limitation: 1 SQL query will eventually use only […]

Impact of memory allocators on MySQL performance

MySQL server intensively uses dynamic memory allocation so a good choice of memory allocator is quite important for the proper utilization of CPU/RAM resources. Efficient memory allocator should help to improve scalability, increase throughput and keep memory footprint under the control. In this post I’m going to check impact of several memory allocators on the […]