I had a customer recently who needed to reduce their database size on disk quickly without a lot of messy schema redesign and application recoding. They didn’t want to drop any actual data, and their index usage was fairly high, so we decided to look for unused indexes that could be removed. Collecting data It’s [...]
Identifying the load with the help of pt-query-digest and Percona Server
Overview Profiling, analyzing and then fixing queries is likely the most oft-repeated part of a job of a DBA and one that keeps evolving, as new features are added to the application new queries pop up that need to be analyzed and fixed. And there are not too many tools out there that can make [...]
Distributed Set Processing with Shard-Query
Can Shard-Query scale to 20 nodes? Peter asked this question in comments to to my previous Shard-Query benchmark. Actually he asked if it could scale to 50, but testing 20 was all I could due to to EC2 and time limits. I think the results at 20 nodes are very useful to understand the performance: [...]
Moving Subtrees in Closure Table Hierarchies
Many software developers find they need to store hierarchical data, such as threaded comments, personnel org charts, or nested bill-of-materials. Sometimes it’s tricky to do this in SQL and still run efficient queries against the data. I’ll be presenting a webinar for Percona on February 28 at 9am PST. I’ll describe several solutions for storing [...]
Multi Column indexes vs Index Merge
The mistake I commonly see among MySQL users is how indexes are created. Quite commonly people just index individual columns as they are referenced in where clause thinking this is the optimal indexing strategy. For example if I would have something like AGE=18 AND STATE=’CA’ they would create 2 separate indexes on AGE and STATE [...]
How adding another table to JOIN can improve performance ?
JOINs are expensive and it most typical the fewer tables (for the same database) you join the better performance you will get. As for any rules there are however exceptions The one I’m speaking about comes from the issue with MySQL optimizer stopping using further index key parts as soon as there is a range [...]
MySQL: Followup on UNION for query optimization, Query profiling
Few days ago I wrote an article about using UNION to implement loose index scan. First I should mention double IN also works same way so you do not have to use the union. So changing query to:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 | mysql> SELECT sql_no_cache name FROM people WHERE age in(18,19,20) AND zip IN (12345,12346, 12347); +----------------------------------+ | name | +----------------------------------+ | ed4481336eb9adca222fd404fa15658e | | 888ba838661aff00bbbce114a2a22423 | +----------------------------------+ 2 rows in set (0.00 sec) mysql> explain SELECT sql_no_cache name FROM people WHERE age in(18,19,20) AND zip IN (12345,12346, 12347); +----+-------------+--------+-------+---------------+------+---------+------+------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+--------+-------+---------------+------+---------+------+------+-------------+ | 1 | SIMPLE | people | range | age | age | 4 | NULL | 9 | Using where | +----+-------------+--------+-------+---------------+------+---------+------+------+-------------+ 1 row in set (0.00 sec) |
So as you see there are really different types of ranges in MySQL. IN range allows [...]
MySQL Query Cache
MySQL has a great feature called “Query Cache” which is quite helpful for MySQL Performance optimization tasks but there are number of things you need to know. First let me clarify what MySQL Query Cache is – I’ve seen number of people being confused, thinking MySQL Query Cache is the same as Oracle Query Cache [...]

