Find unused indexes

I wrote one week ago about how to find duplicate indexes. This time we’ll learn how to find unused indexes to continue improving our schema and the overall performance. There are different possibilites and we’ll explore the two most common here. User Statistics from Percona Server and pt-index-usage.

User Statistics

User Statistics is an improvement on Percona Server that adds some tables to Information Schema with useful information to understand the server activity and identify the source of the load. Those tables store statistics from our users, tables, indexes and so on, information very helpful to have a clear picture of our users are doing in the database.

To accomplish our task of finding unused indexes we’re going to use INFORMATION_SCHEMA.INDEX_STATISTICS but first we should enable User Statistics:

mysql> SET GLOBAL userstat=on;

After all this hard work we have the statistics plugin gathering information. I usually enable it during load peaks to get realistic information about what is happening in the server. To generate some load on the database I’m going to use the benchmark software tpcc-mysql.

After some runs we can start to analyze the information on INDEX_STATISTICS table.

mysql> SELECT * FROM INDEX_STATISTICS;
+--------------+------------+--------------+-----------+
| TABLE_SCHEMA | TABLE_NAME | INDEX_NAME   | ROWS_READ |
+--------------+------------+--------------+-----------+
| tpcc         | orders     | PRIMARY      |     59472 |
| tpcc         | customer   | PRIMARY      |    128652 |
| tpcc         | stock      | PRIMARY      |   1128357 |
| tpcc         | district   | PRIMARY      |    116107 |
| tpcc         | item       | PRIMARY      |    282870 |
| tpcc         | warehouse  | PRIMARY      |     84959 |
| tpcc         | orders     | idx_orders   |      2832 |
| tpcc         | customer   | idx_customer |    175240 |
| tpcc         | new_orders | PRIMARY      |     56202 |
| tpcc         | order_line | PRIMARY      |   1164682 |
+--------------+------------+--------------+-----------+

mysql> SELECT * FROM INDEX_STATISTICS;

+--------------+------------+--------------+-----------+

+--------------+------------+--------------+-----------+

+--------------+------------+--------------+-----------+

These are the index used during the benchmark. Let’s compare it with the data on INNODB_INDEX_STATS to get the list of unused indexes:

mysql> SELECT INNODB_INDEX_STATS.TABLE_NAME, INNODB_INDEX_STATS.INDEX_NAME from INNODB_INDEX_STATS WHERE CONCAT(INNODB_INDEX_STATS.index_name, INNODB_INDEX_STATS.table_name)NOT IN(SELECT CONCAT(index_statistics.index_name, index_statistics.table_name) FROM index_statistics) AND INNODB_INDEX_STATS.TABLE_SCHEMA='tpcc' AND INNODB_INDEX_STATS.INDEX_NAME<>'GEN_CLUST_INDEX';
+------------+-------------------+
| TABLE_NAME | INDEX_NAME        |
+------------+-------------------+
| stock      | fkey_stock_2      |
| order_line | fkey_order_line_2 |
| history    | fkey_history_1    |
| history    | fkey_history_2    |
+------------+-------------------+

mysql> SELECT INNODB_INDEX_STATS.TABLE_NAME, INNODB_INDEX_STATS.INDEX_NAME from INNODB_INDEX_STATS WHERE CONCAT(INNODB_INDEX_STATS.index_name, INNODB_INDEX_STATS.table_name)NOT IN(SELECT CONCAT(index_statistics.index_name, index_statistics.table_name) FROM index_statistics) AND INNODB_INDEX_STATS.TABLE_SCHEMA='tpcc' AND INNODB_INDEX_STATS.INDEX_NAME<>'GEN_CLUST_INDEX';

+------------+-------------------+

| TABLE_NAME | INDEX_NAME |

+------------+-------------------+

| stock | fkey_stock_2 |

| order_line | fkey_order_line_2 |

| history | fkey_history_1 |

| history | fkey_history_2 |

+------------+-------------------+

Here we see one of the limitations on User Statistics, it can only track direct lookups on index keys but not lookups done during constraint checks. That’s the reason FK are marked as unused so that means all the other indexes has been used during the benchmark.

The second limitation is with partitioned tables, you can’t get index statistics usage from those tables.

pt-index-usage

The previous example only works with Percona Server so with stock MySQL we should try another approach.

Percona Toolkit has a tool to check the usage of our index from the slow query log. The concept is easy to understand. pt-index-usage reads the slow query log and execute every query with EXPLAIN to ask MySQL which indexes would it use. At the end of the process you’ll have an output with a list of not used indexes. This is an example:

root@debian:/var/log# pt-index-usage slow.log
slow.log:  11% 03:58 remain
slow.log:  21% 03:43 remain
slow.log:  32% 03:09 remain
[...]
ALTER TABLE `tpcc`.`order_line` DROP KEY `fkey_order_line_2`; -- type:non-unique
ALTER TABLE `tpcc`.`orders` DROP KEY `idx_orders`; -- type:non-unique
ALTER TABLE `tpcc`.`stock` DROP KEY `fkey_stock_2`; -- type:non-unique

root@debian:/var/log# pt-index-usage slow.log

slow.log: 11% 03:58 remain

slow.log: 21% 03:43 remain

slow.log: 32% 03:09 remain

[...]

ALTER TABLE `tpcc`.`order_line` DROP KEY `fkey_order_line_2`; -- type:non-unique

ALTER TABLE `tpcc`.`orders` DROP KEY `idx_orders`; -- type:non-unique

ALTER TABLE `tpcc`.`stock` DROP KEY `fkey_stock_2`; -- type:non-unique

If you compare these results with the previous example, there are some differences and these are some of the reasons:

pt-index-usage uses EXPLAIN and User Statistics number of rows read for every index. So we’re comparing an estimation with real data.

EXPLAIN, as I said before, is an estimation from the optimizer and sometimes the real query can use a different execution plan.

pt-index-usage tries to convert non-SELECT queries to SELECT queries and is not always possible to get a perfect conversion so there can be some differences.

Be cautious with Unique indexes, because maybe they are not used for index lookups but your application may need them to avoid duplicates on the column.

Take in account that depending on the size of the log the server would need lot of process time and cpu power so I recommend you to not use in the production server. Run it on a slave or a testing environment with an similar data estructure and size.

Conclusion

We’ve learnt how to find unused indexes using different approaches and we’ve also learnt that we should’t follow tool’s suggestion literally. These tools are here to help us and after that is our job to check and benchmark the changes on our testing environment to measure the impact on performance. You won’t need to drop all such indexes but it gives you an idea of which of them you should evaluate. Good tools to help us on that evaluation are pt-log-player and pt-upgrade.

It’s worth to mention the importance of taking a long enough period of logs or statistics to analyze all relevant workload. In some cases there are processes that run once a day or a week and we should have that information in our analysis report.

5 Comments

Oldest

Newest Most Voted

Inline Feedbacks

View all comments

Vad Gor

8 years ago

Fix word “estructure”, please

Short Sorceress

7 years ago

SET GLOBAL userstat=on;
ERROR 1193 (HY000): Unknown system variable ‘userstat’
on mysql 5.5

Prasanth

Reply to Short Sorceress

7 years ago

“The previous example only works with Percona Server so with stock MySQL we should try another approach.” Do you have MySQL with Percona server?

Vojtech Kurka

6 years ago

Updated query that works on 5.7:

SELECT DISTINCT iis.database_name, iis.table_name, iis.index_name
FROM mysql.innodb_index_stats iis
LEFT JOIN information_schema.index_statistics istat ON istat.TABLE_SCHEMA = iis.database_name AND istat.TABLE_NAME = iis.table_name AND istat.INDEX_NAME = iis.index_name
WHERE istat.TABLE_SCHEMA IS NULL
AND iis.index_name’PRIMARY’

sudhanshu sirohia

5 years ago

For Mysql 5.6 and later
SELECT object_schema,
object_name,
index_name
FROM performance_schema.table_io_waits_summary_by_index_usage
WHERE index_name IS NOT NULL
AND count_star = 0
ORDER BY object_schema, object_name;

MySQL 5.7
End of Life

Compare Percona to Leading Database Solutions

Software
Downloads

Product
Documentation

Resource Hub

Financial Services

Driving Database Success

Percona Blog

Percona Community Hub

Percona Events Hub

About Percona

Percona in the News

Our Customers

Our Partners

Careers

Contact Us

Find unused indexes

User Statistics

pt-index-usage

Conclusion

Related

Related Blog Articles

RECOMMENDED ARTICLES

Valkey/Redis: Sets and Sorted Sets

Securing Your MySQL Database: Essential Best Practices

Troubleshooting PostgreSQL on Kubernetes With Coroot

MOST POPULAR ARTICLES

Auditing login attempts in MySQL

Deploy Django on Kubernetes With Percona Operator for PostgreSQL

MySQL “Got an error reading communication packet”

MySQL 5.7 End of Life

Compare Percona to Leading Database Solutions

Software Downloads

Product Documentation

Resource Hub

Financial Services

Driving Database Success

Percona Blog

Percona Community Hub

Percona Events Hub

About Percona

Percona in the News

Our Customers

Our Partners

Careers

Contact Us

Find unused indexes

User Statistics

pt-index-usage

Conclusion

Related

Share This Post!

Want to get weekly updates listing the latest blog posts?

Related Blog Articles

RECOMMENDED ARTICLES

Valkey/Redis: Sets and Sorted Sets

Securing Your MySQL Database: Essential Best Practices

Troubleshooting PostgreSQL on Kubernetes With Coroot

MOST POPULAR ARTICLES

Auditing login attempts in MySQL

Deploy Django on Kubernetes With Percona Operator for PostgreSQL

MySQL “Got an error reading communication packet”

MySQL 5.7
End of Life

Software
Downloads

Product
Documentation