July 26, 2014

Getting MySQL to use full key length

There is one bug, or “missing feature” in MySQL Optimizer which may give you hard time causing performance problems which may be hard to track down, it is using only part of the index when full index can be used or using shorter index while there is longer index available. The last item is yet another good reason for removing redundant indexes

Here is example from NNSEEK database:

Where group_id key is defined as KEY group_id (group_id,parent_id,published)

As you can see MySQL selects to use “ref” access only using two first key parts from the index and it assumes there is only one row to be matched by index. In fact there are 2000 matching rows for these constants (parent_id=0 is special value which matches significant amount of rows).

I think optimizer should be fixed to always expand and use “range” lookup at least for simple ranges. Even if further restrictions by next key part will only barely increase selectivity it anyway does not cost much more as basically the same data is traversed anyway.

So how do you force MySQL optimizer to use full length in this case ? ANALYZE TABLE does not help, at least in this case. However looks like as one of its side effects FORCE INDEX actually forces index to be used to largest extent possible:

As you can see in this case estimate becomes closer to the truth and this query runs about 50 times faster and easy on buffer pool (we do not have many un-needed rows accessed causing their pages to be loaded replacing valuable data from cache).

MySQL could look at the stats and adjust them appropriately – the plan for second query is accessing subset of rows from the first plan so estimation for number of rows for the first query should not be smaller. Stats and B-Tree dives can give information which is quite far away from the real number so it is worth at least make it consistent in this respect.

The other example I mentioned in the start of this post which I often have trouble with (just do not have example today) is using different index, for example, having indexes (A) and (A,B) for query A=Const and B>Const we can see MySQL selecting (A) index instead of (A,B) or even worse it can can select something like (A,D) using only first keypart of that index.

To fix this problem I guess another statistics alignment should take place – if you have several keys which start with prefix A and you use only this prefix for lookup – estimate number of rows should be the same for all indexes.

About Peter Zaitsev

Peter managed the High Performance Group within MySQL until 2006, when he founded Percona. Peter has a Master's Degree in Computer Science and is an expert in database kernels, computer hardware, and application scaling.

Comments

  1. Alla says:

    Peter – are you aware of any improvements for this issue in 5.5?

  2. Peter, this ought to work. The EXPLAIN should be like in the second case whether FORCE INDEX clause is present or not.
    The “use range(index) instead of ref(the-same-index)” heuristics has been in the code since long time. IIRC it had some bug but we’ve fixed quite some time ago.
    Could you please post the table DDL and data sample, or file a bug? I’m interested to look.

  3. peter says:

    Sergey,

    This case is from MySQL 4.1.22 so it might be fixed in later versions.
    In later versions it might be fixed but in older you need to use workaround as described.

    Regarding choosing wrong key like (A,D) instead of (A,B) I’m confident the bug happens in MySQL 5.0 I’ll get you the test case as soon as we see it again.

  4. araki says:

    typo: “worse it can can select something”

  5. chad says:

    This seems to still be a weakness of MySQL. I am running 5.0.32 enterprise and I have the same experience. I am working with MySQL support who really have not been much help up to this point. The workaround mentioned by Peter above does indeed work for me too, and cuts my query time from about 120 seconds to 4 seconds. Hopefully MySQL will address this soon.

    mysql> select version();
    +—————————+
    | version() |
    +—————————+
    | 5.0.32-enterprise-gpl-log |
    +—————————+
    1 row in set (0.06 sec)

    mysql> explain SELECT SUM(IF(FACILITY=’failedsell’,1,0)), SUM(IF(FACILITY=’faile
    dbuy’,1,0)) FROM XCORE.LOGS WHERE PARTITIONID=1 AND TIME>=1171515600 AND TIME explain SELECT SUM(IF(FACILITY=’failedsell’,1,0)), SUM(IF(FACILITY=’faile
    dbuy’,1,0)) FROM XCORE.LOGS AS LOGS FORCE INDEX(PARTITIONIDTIME) WHERE PARTITION
    ID=1 AND TIME>=1171515600 AND TIME

Speak Your Mind

*