3 ways MySQL uses indexes

I often see people confuse different ways MySQL can use indexing, getting wrong ideas on what query performance they should expect. There are 3 main ways how MySQL can use the indexes for query execution, which are not mutually exclusive, in fact some queries will use indexes for all 3 purposes listed here.

Using index to find rows The main purpose of the index is to find rows quickly – without scanning whole data set. This is most typical reason index gets added on the first place. Most popular index type in MySQL – BTREE can speed up equality and prefix range matches. So if you have index on (A,B) This index can be used to lookup rows for WHERE clauses like A=5 ; A BETWEEN 5 AND 10 ; A=5 AND B BETWEEN 5 AND 10 it however will NOT be able to help lookup rows for B BETWEEN 5 AND 10 predicate because it is not index prefix. It is important to look at key_len column in explain plan to see how many index parts are actually used for row lookup. Very common problem I see is multi column indexes which are used but only to their short prefix which is not very selective. A lot of this mistakes come from missing one very important MySQL limitation – once MySQL runs into the interval range it will not use any further index parts. If you have A BETWEEN 5 AND 10 AND B=5 for the same index MySQL will use the index… but it will only use A prefix for row lookups and scan whole A BETWEEN 5 AND 10 range. It is interesting to note this limitation only applies to interval ranges – for enumerated ranges MySQL will use both key parts. Hence if you change this predicate to A IN (5,6,7,8,9,10) AND B=5 you will quite likely see improved query performance. Beware however of large nested enumerated ranges they are very hard on the optimizer. This just describes how MySQL uses single index – there are more complex rules of how indexes will be used if you look at multiple indexes usage with “index merge”

Using Index to Sort Data Another great benefit of BTREE index is – it allows to retrieve data in sorted form hence avoiding external sort process for executing of queries which require sorting. Using index for sorting often comes together with using index to find rows, however it can also be used just for sort for example if you’re just using ORDER BY without and where clauses on the table. In such case you would see “Index” type in explain which correspond to scanning (potentially) complete table in the index order. It is very important to understand in which conditions index can be used to sort data together with restricting amount of rows. Looking at the same index (A,B) things like ORDER BY A ; ORDER BY A,B ; ORDER BY A DESC, B DESC will be able to use full index for sorting (note MySQL may not select to use index for sort if you sort full table without a limit). However ORDER BY B or ORDER BY A, B DESC will not be able to use index because requested order does not line up with the order of data in BTREE. If you have both restriction and sorting things like this would work A=5 ORDER BY B ; A=5 ORDER BY B DESC; A>5 ORDER BY A ; A>5 ORDER BY A,B ; A>5 ORDER BY A DESC which again can be easily visualized as scanning a range in BTREE. Things like this however would not work A>5 ORDER BY B , A>5 ORDER BY A,B DESC or A IN (3,4) ORDER BY B – in these cases getting data in sorting form would require a bit more than simple range scan in the BTREE and MySQL decides to pass it on. There are some workarounds you can use though.

Using index to read data Some storage engines (MyISAM and Innodb included) can also use index to read the data, hence avoiding to read the row data itself. This is not simply savings of having 2 reads per index entry instead of one but it can save IO orders of magnitude in some cases – Indexes are sorted (at least on the page boundary) so doing index range scan you typically get many index entries from the same page but the rows itself can be scattered across many pages requiring potentially a lot of IOs. On top of that if you just need access to couple of columns
index can be simply much smaller than the data which is one of the reason covering indexes help to speed up queries even if data is in memory. If MySQL is only reading index and not accessing rows you will see “using index” in EXPLAIN output.

These are the main “core” use for indexes. You can also see others like using index for group by but I think they can be pretty much looked as one of these 3 ways described.

30 Comments

Oldest

Newest Most Voted

Inline Feedbacks

View all comments

gmouse

14 years ago

>> Things like this however would not work [..] A>5 ORDER BY A,B

whai?

gmouse

14 years ago

My reaction disappeared?

>> Things like this however would not work [..] A>5 ORDER BY A,B
Wai not?

gmouse

14 years ago

>> Things like this however would not work [..] A>5 ORDER BY A,B

Why not?

Peter Zaitsev

Author

14 years ago

gmouse,

Thanks for spotting the typo. This indeed will work. I wanted to say A>5 ORDER BY A, B DESC will not work.

Shlomi Noach

14 years ago

Peter,

Perhaps it’s worth mentioning that GROUP BY & DISTINCT operations utilize sorting internally, hence fall into category #2.

Peter Zaitsev

Author

14 years ago

Shlomi,

Right. I tried to keep the post short enough not covering all details. Note Group by can be done using sort using temporary table or using skip-scan (which is more #1 than #2)

gmouse

14 years ago

It would be nice to see a larger post or article which fully describes how indices are used, also for joins. I still encounter too many people who are clueless, and it would be nice if I could provide them with a full article.

zerkms

14 years ago

“So if you have index on (A,B) This index can be used to lookup rows for WHERE clauses like A=5 ; A BETWEEN 5 AND 10 A=5 AND B BETWEEN 5 AND 10”

The last part of sentense isn’t clearly right: for expression A BETWEEN 5 AND 10 A=5 AND B BETWEEN 5 AND 10, only the left part of A+B index will be used. i think this fact should be noticed or sample changed to be more “simple”.

or i’m getting wrong?

zerkms

14 years ago

sorry, i was not enought intent. please delete my 2 posts :-[

Peter Zaitsev

Author

14 years ago

Zerkms,

This is actually 2 expressions one A BETWEEN 5 AND 10 and second one A=5 AND B BETWEEN 5 AND 10. Sorry for layout it is not always clear. Indeed as I speak later in the post you should look at key_len to see how many key parts actually used to lookup rows. in cases like A BETWEEN 5 AND 10 AND B BETWEEN 5 AND 10 the index will be used but really only for first part of the predicate using only (A) key prefix.

zerkms

14 years ago

“I speak later in the post you should look at key_len to see how many key parts actually used to lookup rows”
In this case i hate mysql and like mssql, because mssql show what (and even how!!) index parts were used.

Peter Zaitsev

Author

14 years ago

Right,

I think key_len is rather confusing, especially as it is length in bytes rather than key parts. This makes this complicated because of different length of different data types. For example if you have key (A,B) and see key_len=8 it may be using full index with INT fields or only A prefix for BIGINT column.

Jeff

14 years ago

I had been working on a performance problem for a while and this article triggered a new way for me to think about it and ultimately helped me resolve it. Thanks so much for this article in particular but also all of your articles. Some really great stuff!

Peter Zaitsev

Author

14 years ago

Jeff,

Thanks. Great to hear our articles are are helpful for you.

jay

13 years ago

Quote: “once MySQL runs into the interval range it will not use any further index parts.”
WRONG!

Triple range lookup:

EXPLAIN SELECT *
FROM tree
WHERE 1
AND tree_type = “2220”
AND date_from = “0000-00-00”
AND date_to > “2010-06-01”
AND date_to <= "9999-12-31"
AND tree_vector LIKE "0001%"

The WHOLE key is used.
KEY type_from_to_vector (tree_type,date_to,date_from,tree_vector)

more in Slovak language :
http://kyberia.sk/id/5396760

In app, I'm already using multiple range lookups with speeds in miliseconds.
It's maybe difficult to make it work the first time, to make the correct index, and the correct WHERE, but it's worth it.

Peter Zaitsev

Author

13 years ago

Jay,

In your case you have 4 clauses. 2 of them are EQ clauses and clauses on date_to and tree_vector are ranges. This query should use 3 key parts of this index hence only one range. If you think it is different can you explain why do you think so ?

jay

13 years ago

Someone who aproved my comment changed the query. (propably symbols “more” and “less” made problems)
Look at the link, at the end of the article is the correct query.

date_from and date_to are true ranges.
And vector LIKE “0001%” is changed by optimizer to another range – vector >= “0001” AND vector < "0002".

Peter Zaitsev

Author

13 years ago

Jay,

So the question remains. How do you KNOW all key parts are actually used ?

jay

13 years ago

sizes of parts of the key in Bytes:
tree_type = 2B
tree_vector = 200B (ascii char)
date_from = 3B
date_to = 3B

From Explain in the article:
type : range
possible_keys : type_vector_to_from,type_vector_from_to,type_from_…
key : type_from_to_vector
key_len : 208

testing on different keys (also in the article) confirms the key lenght.

jay

13 years ago

that you see more of it, I created 2 new indexes (on production data, not testing or random).
(sizes are different in production data. Type is int (4Bytes) not smallint and vector is varchar 255+2)

KEY ty_fr (tree_type,date_from),
KEY ty_fr_to (tree_type,date_from,date_to),
KEY ty_fr_to_ve_de (tree_type,date_from,date_to,tree_vector,tree_depth), – the real key used in app

FROM tree FORCE INDEX (ty_fr) :
key ty_fr
key_len 7
rows 11410

FROM tree FORCE INDEX (ty_fr_to) :
key ty_fr_to
key_len 10
rows 8492

FROM tree FORCE INDEX (ty_fr_to_ve_de) :
key ty_fr_to_ve_de
key_len 267
rows 7128

Raja Chakraborty

14 years ago

Hi Peter,

Thanks for your blog.
Can you explain me please more brief on mysql indexing.
Like how it’s work and what happen when query execute for index.

waiting for reply.

Thanks
Raja

Raja Chakraborty

14 years ago

Hi Peter,

Thanks for your blog.
Can you explain me please more brief on mysql indexing.

Like how it’s work and what happen when query execute for index.

Thanks
Raja

Pavel Cibulka

12 years ago

Hi Peter,
I’ve been thinking about your example with index (A,B) and WHERE B BETWEEN 5 AND 10. What about rewrite it as WHERE A>=0 AND B BETWEEN 5 AND 10? It is similar to the example in your book (AND sex IN(‘m’, ‘f’)).

This will use index. It would be ok, if A have few different values. But could become slow, if A has many different values.

Peter Zaitsev

Author

12 years ago

Pavel,

It is not the same as sex in (‘M’,’F’) is enumerated range while A>=0 is not. MySQL would only use full index when leading parts are enumerated as soon as it is non enumerated range (as > < BETWEEN etc) this will be last key part to use.

Pavel Cibulka

12 years ago

Peter,

thanks for response. Percona blog is really great source of knowledge with many helpful people. I hope we will see 3rd edition of “High Performance MySQL” book soon.

Phan Cuong Thinh

11 years ago

Hi Peter,

I have a question.
I have a table ‘t_table1’ include 3 fields :

field1 tinyint(1) unsigned default NULL,
field2 tinyint(1) unsigned default NULL,
field3 tinyint(1) unsigned NOT NULL default ‘0’,

and a Index:
KEY t_table1_index1 (field1,field2,field3),

When I run this SQL1:
EXPLAIN
SELECT *
FROM table1 AS c
WHERE
c.field1 = 1
AND
c.field2 = 0
AND
c.field3 = 0

Then is show:

Select type: Simple
tyle: All
possible key: t_table1_index1
key: NULL
key_len: NULL
rows: 1042
extra: Using where

I think it say that my index useless in this case.

But when I run this SQL2:
EXPLAIN
SELECT *
FROM table1 AS c
WHERE
c.field1 = 1
AND
c.field2 = 1
AND
c.field3 = 1

it show:

Select type: Simple
tyle: ref
possible key: t_table1_index1
key: t_table1_index1
key_len: 5
ref: const, const, const
rows: 1
extra: Using where

This case it used my index.
So please explain for me:
1. why SQL1 can not use index ?
2. with SQL1, how can i edit index or rewrite SQL to performing more quickly ?

Thanks !

Amit Shah

10 years ago

hi Peter,

What is the impact of non-key (non-clustered) indexed column when updated in innodb table of size 50G every day with 1000 rows.

Thanks.

Joey

9 years ago

@Peter

“””
EXPLAIN SELECT *
FROM tree
WHERE 1
AND tree_type = “2220”
AND date_from = “0000-00-00″
AND date_to > “2010-06-01″
AND date_to <= "9999-12-31"
AND tree_vector LIKE "0001%"

The WHOLE key is used.
KEY type_from_to_vector (tree_type,date_to,date_from,tree_vector)
"""

In this example, you said "This query should use 3 key parts of this index hence only one range. "
I thought this query should use 2 key parts, because date_to is a range and "once MySQL runs into the interval range it will not use any further index parts. ", only ( tree_type,date_to) is used.
Am I wrong?

Ruud H.G. van Tol

7 years ago

A 4th way: for the UNIQUE constraint.

Nano Zone

6 years ago

hi peter, it would be better with example by active mysql code

MySQL 5.7
End of Life

Compare Percona to Leading Database Solutions

Software
Downloads

Product
Documentation

Resource Hub

Financial Services

Driving Database Success

Percona Blog

Percona Community Hub

Percona Events Hub

About Percona

Percona in the News

Our Customers

Our Partners

Careers

Contact Us

3 ways MySQL uses indexes

Related

Related Blog Articles

RECOMMENDED ARTICLES

Valkey/Redis: Sets and Sorted Sets

Hello World… Hello Valkey! Let’s Get Started!

Valkey/Redis Configurations and Persistent Setting of the Key Parameters

MOST POPULAR ARTICLES

Auditing login attempts in MySQL

Deploy Django on Kubernetes With Percona Operator for PostgreSQL

MySQL “Got an error reading communication packet”

MySQL 5.7 End of Life

Compare Percona to Leading Database Solutions

Software Downloads

Product Documentation

Resource Hub

Financial Services

Driving Database Success

Percona Blog

Percona Community Hub

Percona Events Hub

About Percona

Percona in the News

Our Customers

Our Partners

Careers

Contact Us

3 ways MySQL uses indexes

Related

Share This Post!

Want to get weekly updates listing the latest blog posts?

Related Blog Articles

RECOMMENDED ARTICLES

Valkey/Redis: Sets and Sorted Sets

Hello World… Hello Valkey! Let’s Get Started!

Valkey/Redis Configurations and Persistent Setting of the Key Parameters

MOST POPULAR ARTICLES

Auditing login attempts in MySQL

Deploy Django on Kubernetes With Percona Operator for PostgreSQL

MySQL “Got an error reading communication packet”

MySQL 5.7
End of Life

Software
Downloads

Product
Documentation