May 24, 2013

Why MySQL Performance at Low Concurrency is Important

A few weeks ago I wrote about “MySQL Performance at High Concurrency” and why it is important, which was followed up by Vadim’s post on ThreadPool in Percona Server providing some great illustration on the topic. This time I want to target an opposite question: why MySQL performance at low concurrency is important for you. [...]

Accessing Percona XtraDB Cluster nodes in parallel from PHP using MySQL asynchronous queries

This post is followup to Peter’s recent post, “Investigating MySQL Replication Latency in Percona XtraDB Cluster,” in which a question was raised as to whether we can measure latency to all nodes at the same time. It is an interesting question: If we have N nodes, can we send queries to nodes to be executed in [...]

Sphinx search performance optimization: multi-threaded search

Queries in MySQL, Sphinx and many other database or search engines are typically single-threaded. That is when you issue a single query on your brand new r910 with 32 CPU cores and 16 disks, the maximum that is going to be used to process this query at any given point is 1 CPU core and [...]

Load management Techniques for MySQL

One of the very frequent cases with performance problems with MySQL is what they happen every so often or certain times. Investigating them we find out what the cause is some batch jobs, reports and other non response time critical activities are overloading the system causing user experience to degrade. The first thing you need [...]

Is your MySQL Application having Busy IO by Oracle Measures ?

Preparing Choosing Storage Systems for MySQL talk for Percona Live in Washington,DC I ran into great paper called Sane SAN 2010 by James Morle from Scale Abilities – and Oracle consulting company. It is worth to read for variety of reason yet for this post I wanted to mention what James calls “Busy” Oracle database [...]

Using any general purpose computer as a special purpose SIMD computer

Often times, from a computing perspective, one must run a function on a large amount of input. Often times, the same function must be run on many pieces of input, and this is a very expensive process unless the work can be done in parallel. Shard-Query introduces set based processing, which on the surface appears [...]

Distributed Set Processing with Shard-Query

Can Shard-Query scale to 20 nodes? Peter asked this question in comments to to my previous Shard-Query benchmark. Actually he asked if it could scale to 50, but testing 20 was all I could due to to EC2 and time limits. I think the results at 20 nodes are very useful to understand the performance: [...]

Cache Miss Storm

I worked on the problem recently which showed itself as rather low MySQL load (probably 5% CPU usage and close to zero IO) would spike to have hundreds instances of threads running at the same time, causing intense utilization spike and server very unresponsive for anywhere from half a minute to ten minutes until everything [...]

Data mart or data warehouse?

This is part two in my six part series on business intelligence, with a focus on OLAP analysis. Part 1 – Intro to OLAP Identifying the differences between a data warehouse and a data mart. (this post) Introduction to MDX and the kind of SQL which a ROLAP tool must generate to answer those queries. [...]

Scaling: Consider both Size and Load

So lets imagine you have the server handling 100.000 user accounts. You can see the CPU,IO and Network usage is below 10% of capacity – does it mean you can count on server being able to handle 1.000.000 of accounts ? Not really, and there are few reasons why, I’ll name most important of them: [...]