I had a customer recently who needed to reduce their database size on disk quickly without a lot of messy schema redesign and application recoding. They didn’t want to drop any actual data, and their index usage was fairly high, so we decided to look for unused indexes that could be removed. Collecting data It’s [...]
Find unused indexes
I wrote one week ago about how to find duplicate indexes. This time we’ll learn how to find unused indexes to continue improving our schema and the overall performance. There are different possibilites and we’ll explore the two most common here. User Statistics from Percona Server and pt-index-usage. User Statistics User Statistics is an improvement [...]
Percona Server 5.1.55-12.6
Percona Server version 5.1.55-12.6 is now available for download. It is now the current stable release version. Changes Fixed compiler warnings in both the core server and in XtraDB. (Alxey Kopytov, Yasufumi Kinoshita) Bugs Fixed Bug #602047 – The ROWS_READ columns of TABLE_STATISTICS and INDEX_STATISTICS were not properly updated when a query involved index lookups [...]
Percona Server and XtraBackup weekly news, February 5th
I decided to try a series of blog posts keeping people informed about what’s happening in Percona Server and Percona XtraBackup once a week. I’ll try to digest things, but it turns out to be hard — I want to provide details and links for everything, but then it isn’t really a digest anymore, so [...]
Introducing percona-patches for 5.1
Our patches for 5.0 have attracted significant interest. You can read about SecondLife’s experience here, as well as what Flickr had to say on their blog. The main improvements come in both performance gains and improvements to diagnostics (such as the improvements to the slow log output, and INDEX_STATISTICS). Despite having many requests to port [...]
How (not) to find unused indexes
I’ve seen a few people link to an INFORMATION_SCHEMA query to be able to find any indexes that have low cardinality, in an effort to find out what indexes should be removed. This method is flawed – here’s the first reason why:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 | CREATE TABLE `sales` ( `id` int(11) NOT NULL AUTO_INCREMENT, `customer_id` int(11) DEFAULT NULL, `status` enum('archived','active') DEFAULT NULL, PRIMARY KEY (`id`), KEY `status` (`status`) ) ENGINE=MyISAM AUTO_INCREMENT=65691 DEFAULT CHARSET=latin1; mysql> SELECT count(*), status FROM sales GROUP by status; +----------+---------+ | count(*) | status | +----------+---------+ |   65536 | archived | |     154 | active | +----------+---------+ 2 rows in set (0.17 sec) mysql> EXPLAIN SELECT * FROM sales WHERE status='active'; # query 1 +----+-------------+-------+------+---------------+--------+---------+-------+------+-------------+ | id | select_type | table | type | possible_keys | key   | key_len | ref  | rows | Extra      | +----+-------------+-------+------+---------------+--------+---------+-------+------+-------------+ | 1 | SIMPLE     | sales | ref | status       | status | 2      | const | 196 | Using where | +----+-------------+-------+------+---------------+--------+---------+-------+------+-------------+ 1 row in set (0.06 sec) mysql> EXPLAIN SELECT * FROM sales WHERE status='archived'; # query 2 +----+-------------+-------+------+---------------+------+---------+------+-------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra      | +----+-------------+-------+------+---------------+------+---------+------+-------+-------------+ | 1 | SIMPLE     | sales | ALL | status       | NULL | NULL   | NULL | 65690 | Using where | +----+-------------+-------+------+---------------+------+---------+------+-------+-------------+ 1 row in set (0.01 sec) |
The cardinality of status index is woeful, but provided that the application [...]
check-unused-keys: A tool to interact with INDEX_STATISTICS
With the growing adoption of Google’s User Statistics Patch**, the need for supporting scripts has become clear. To that end, we’ve created check-unused-keys, a Perl script to provide a nicer interface than directly querying the INFORMATION_SCHEMA database.
Dropping unused indexes
Vadim wrote some time ago about how to find unused indexes with single query. I was working on the system today and found hundreds of unused indexes on dozens of tables so just dropping indexes manually did not look fun. So I extended Vadim’s query to generate ALTER TABLE statements automatically. I also made it [...]
Unused indexes by single query
Usually unused indexes are devil, they waste diskspace, cache, they make INSERT / DELETE / UPDATE operations slower and what makes them worse – it is hard to find them. But now ( with userstatsV2.patch) you can find all unused indexes (since last restart of mysqld) by single query
1 2 3 4 5 6 | SELECT DISTINCT s.TABLE_SCHEMA, s.TABLE_NAME, s.INDEX_NAME FROM information_schema.statistics `s` LEFT JOIN information_schema.index_statistics INDXS ON (s.TABLE_SCHEMA = INDXS.TABLE_SCHEMA AND s.TABLE_NAME=INDXS.TABLE_NAME AND s.INDEX_NAME=INDXS.INDEX_NAME) WHERE INDXS.TABLE_SCHEMA IS NULL; |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 | +--------------+---------------------------+-----------------+ | TABLE_SCHEMA | TABLE_NAME | INDEX_NAME | +--------------+---------------------------+-----------------+ | art100 | article100 | ext_key | | art100 | article100 | site_id | | art100 | article100 | hash | | art100 | article100 | forum_id_2 | | art100 | article100 | published | | art100 | article100 | inserted | | art100 | article100 | site_id_2 | | art100 | author100 | PRIMARY | | art100 | author100 | site_id | ... +--------------+---------------------------+-----------------+ 1150 rows in set (1 min 44.23 sec) |
As you see query [...]
Google’s user_statistics V2 port and changes
Recently Google published V2 release of patches, one of them user_statistics we use in our releases. New features are quite interesting so we decided to port it to fresh releases of MySQL. Features includes: New statistics per user (Cpu_time, Bytes_received, Bytes_sent, etc) New command SHOW CLIENT_STATISTICS, which shows statistics per client’s hostname, not per user [...]

