April 21, 2014

How (not) to find unused indexes

I’ve seen a few people link to an INFORMATION_SCHEMA query to be able to find any indexes that have low cardinality, in an effort to find out what indexes should be removed.  This method is flawed – here’s the first reason why:

The cardinality of status index is woeful, but provided that the application is always only sending query 1 to MySQL it’s actually a pretty good index!  It’s not always like this, but there are a lot of cases where applications have good selectivity with some queries despite what cardinality shows.

Not convinced?  Here’s reason number two:

The index on query 3 had high cardinality but should not be used since too many countries have a population greater than 1000.  An automated search for low cardinality indexes wouldn’t have revealed it’s uselessness.  For range scans, it’s very easy to lead yourself into a trap where your index can not filter out enough rows to be effective.  I see this a lot in consulting issues where customers have queries that use a BETWEEN on a date, but the window of time it is searching in is too wide.

Side Note: In some texts you’ll see people quote the numbers “20-30%” as the minimum amount of rows you have to filter down to for an index to be useful (that is, eliminate 70-80% of rows).  It’s not quite correct to quote this as an exact percentage, since this value is not fixed in MySQL and can be a much wider window (15-60%) depending on the circumstances.  In this case, MySQL flipped from tablescan to index at about 34%.

How am I supposed to find unused indexes then?
You really have to run queries against your server – there is no other way.  From there, there’s a helpful patch in 5.0-percona called INDEX_STATISTICS that can then show you which indexes were touched and which were not.

If you are not running a patched server, then the alternative is to either use a proxy that checks EXPLAIN information (like QUAN) or set your slow query log to zero microseconds (5.1 feature) and then find someway to parse and EXPLAIN all results, then subtract the indexes that were mentioned from all indexes known.  There’s an old tool called mysqlidxchx which should be able to do this.

About Morgan Tocker

Morgan is the Director of Training at Percona. He was formerly a Technical Instructor for MySQL and Sun Microsystems. He has also previously worked in the MySQL Support Team, and provided DRBD support.

Comments

  1. Sudheer says:

    Dear Morgan,

    how to add INDEX_STATISTICS on mysql server. I am using mysql 5.1 on Linux server. and my schema contains 90% unused index. Please guide me how to delete unused index?

    Regards
    Sudheer G

  2. Ryan Lowe says:

    Morgan, don’t forget that we have check-unused-keys … a perl script that uses Userstats to show unused keys.

    – Ryan Lowe

Speak Your Mind

*