August 27, 2014

What does Handler_read_rnd mean?

MySQL’s SHOW STATUS command has two counters that are often confusing and result in “what does that mean?” questions:

  1. Handler_read_rnd
  2. Handler_read_rnd_next

As I understand it, there is some historical context to the choice of names here, hearkening back to before I was involved with MySQL, way back when it was a wrapper around ISAM tables — or even earlier, I don’t know. (Unireg, anyone?) In any case, these two variables deserve a little explanation.

Both counters indicate the number of times the corresponding storage engine API function has been called. In olden times, the storage engine API was called the handler API, which is why the variables begin with Handler_.

Handler_read_rnd counts the number of times the handler::rnd_pos() method is called. This method fetches a row from a table based on a “fixed position,” i.e. a random-read. What this actually means varies between storage engines. For MyISAM, position really means a byte offset from the beginning of the file. For InnoDB, it means to read a row based on a primary key value.

Handler_read_rnd_next is incremented when handler::rnd_next() is called. This is basically a cursor operation: read the “next” row in the table. The operation advances the cursor position so the next time it’s called, you get the next row.

The naming convention probably doesn’t make as much sense today as it did historically. Monty told me that “rnd” meant “random,” but now I can’t find a reference to that conversation. In any case, one of the operations is basically a random read (if you think of tuples as having a natural order, which historically they did), and the other is just a read-next operation; there’s nothing random about it, really. I would welcome any corrections from those who know more about this than I do.

Why would either of these be called? Usually Handler_read_rnd is called when a sort operation gathers a list of tuples and their “position” values, sorts the tuples by some criterion, and then traverses the sorted list, using the position to fetch each one. This is quite likely to result in retrieving rows from random points in the table, although that might not actually result in random IO if the data is all in memory. Handler_read_rnd_next is usually called for each row of a full or partial table scan.

PS: I’m being a bit vague about the sort-and-fetch algorithm; there are really two algorithms, and they are smarter than I’m giving them credit for. But that’s a topic for another blog post.

About Baron Schwartz

Baron is the lead author of High Performance MySQL.
He is a former Percona employee.

Comments

  1. MLBR says:

    Cool. Thanks for the interpretation.

  2. peter says:

    Baron,

    For full table scans it is clear for me I get one read_rnd with starting position and when do read_rnd_next until end of table. However what I’ll see in case of sorting ?

    Will sorting 100 rows table cause 100 read_rnd calls (considering row pointer only sort algorithm is used) ?

  3. Peter,

    On a MyISAM table with 59 rows, SELECT 1 FROM tbl gives these counters:

    | Handler_read_first | 1 |
    | Handler_read_next | 59 |

    Whereas adding ORDER BY RAND() adds “using temporary; using filesort” to the EXPLAIN, and gives the following:

    | Handler_read_first | 1 |
    | Handler_read_next | 59 |
    | Handler_read_rnd | 59 |
    | Handler_read_rnd_next | 60 |

  4. guillaume says:

    Baron, yes rnd means random, see this thread: http://lists.mysql.com/internals/37451

  5. linuxer says:

    yup, @guillaume

  6. Thanks Guillaume!

  7. peter says:

    Baron,

    How did you define your table ? read_first and read_next calls should correspond to index scan not full table scan.

  8. Right, it probably did scan an index for the query — good catch, I didn’t notice. I’ll try again with a new table without any indexes.

  9. OK so I made a copy of mysql.user (as a random choice of data to use), dropped all indexes, and inserted 5000 rows into it, then ran the same queries as above. SELECT user FROM user ORDER BY RAND() shows the following:

    | Handler_read_rnd | 5000 |
    | Handler_read_rnd_next | 10002 |
    | Handler_write | 5014 |

    So, essentially “scan 5000 rows, insert 5000 rows into temp table, sort them with a scan, read them out with a scan with random access”. The extra 14 rows in Handler_write should be due to SHOW STATUS.

    SELECT user FROM user gives

    | Handler_read_rnd_next | 5001 |

    SELECT 1 FROM user is not a good query to use — it ends up not accessing the table ;)

  10. Sasha Pachev says:

    Is there a way to have a partial table scan? SELECT * FROM t1 WHERE cond is a full scan. All of the rows are still scanned even though only the ones for which cond is true will be returned.

    Actually, there is now that I’ve thought some. If we add LIMIT the scan can terminate early. Note however that something like LIMIT N, M does not make it start after N! We start at the first record, and throw away the first N. A common optimization mistake in paging applications.

  11. Sasha, yes that should do it, and if I understand correctly should even remove the extra EOF call to Handler_read_rnd_next.

  12. Some more thoughts. If the table has N rows and you do a full scan, Handler_read_rnd_next is incremented by N+1. The reason for the extra 1 is the EOF test mentioned by Baron. The query execution engine does not know if there are any more rows to read unless it calls handler::read_rnd_next(). So the last call is always unsuccessful, but it does increment the counter.

    Now here is something interesting I just found (tested on version 5.1.48):

    mysql> create table t1(n int);
    Query OK, 0 rows affected (0.00 sec)

    mysql> insert into t1 values (1),(2),(3);
    Query OK, 3 rows affected (0.00 sec)
    Records: 3 Duplicates: 0 Warnings: 0

    mysql> flush status;
    Query OK, 0 rows affected (0.00 sec)

    mysql> show status like ‘Handler_read_rnd_next’ ;
    +———————–+——-+
    | Variable_name | Value |
    +———————–+——-+
    | Handler_read_rnd_next | 0 |
    +———————–+——-+
    1 row in set (0.00 sec)

    mysql> select * from t1;
    +——+
    | n |
    +——+
    | 1 |
    | 2 |
    | 3 |
    +——+
    3 rows in set (0.00 sec)

    mysql> show status like ‘Handler_read_rnd_next’ ;
    +———————–+——-+
    | Variable_name | Value |
    +———————–+——-+
    | Handler_read_rnd_next | 4 |
    +———————–+——-+
    1 row in set (0.00 sec)

    mysql> flush status;
    Query OK, 0 rows affected (0.00 sec)

    mysql> select * from t1;
    +——+
    | n |
    +——+
    | 1 |
    | 2 |
    | 3 |
    +——+
    3 rows in set (0.00 sec)

    mysql> show status like ‘Handler_read_rnd_next’ ;
    +———————–+——-+
    | Variable_name | Value |
    +———————–+——-+
    | Handler_read_rnd_next | 0 |
    +———————–+——-+
    1 row in set (0.00 sec)

    mysql> select * from t1;
    +——+
    | n |
    +——+
    | 1 |
    | 2 |
    | 3 |
    +——+
    3 rows in set (0.00 sec)

    mysql> show status like ‘Handler_read_rnd_next’ ;
    +———————–+——-+
    | Variable_name | Value |
    +———————–+——-+
    | Handler_read_rnd_next | 0 |
    +———————–+——-+
    1 row in set (0.01 sec)

    mysql> flush query cache;
    Query OK, 0 rows affected (0.00 sec)

    mysql> show status like ‘Handler_read_rnd_next’ ;
    +———————–+——-+
    | Variable_name | Value |
    +———————–+——-+
    | Handler_read_rnd_next | 0 |
    +———————–+——-+
    1 row in set (0.00 sec)

    mysql> flush tables;
    Query OK, 0 rows affected (0.00 sec)

    mysql> select * from t1;
    +——+
    | n |
    +——+
    | 1 |
    | 2 |
    | 3 |
    +——+
    3 rows in set (0.00 sec)

    mysql> show status like ‘Handler_read_rnd_next’ ;
    +———————–+——-+
    | Variable_name | Value |
    +———————–+——-+
    | Handler_read_rnd_next | 4 |
    +———————–+——-+
    1 row in set (0.00 sec)

    Handler_read_rnd_next was not incremented the second time you ran the query even if you flushed the query cache. However, if you did not flush the query cache, but flushed the tables, it did get incremented again.

    So I tested it on a table with 150 rows. This time the second execution did increment the counter without flushing the table or the query cache.

    Then I tried inserting rows into the small table. Inserting a row followed by SELECT * does increment Handler_read_rnd_next. However, the second time around it is not incremented! With 5 rows it was still not incremented when re-running the query.

    Time to study the source code to see what is up. Will post once I know.

  13. One correction – it is hanlder::rnd_next() not hanlder::read_rnd_next().

  14. Monty once said that the source code is the ultimate documentation. I think I found it – in sql/records.cc in init_read_record() :

    DBUG_PRINT(“info”,(“using rr_sequential”));
    info->read_record=rr_sequential;
    table->file->ha_rnd_init(1);
    /* We can use record cache if we don’t update dynamic length tables */
    if (!table->no_cache &&
    (use_record_cache > 0 ||
    (int) table->reginfo.lock_type s->db_options_in_use & HA_OPTION_PACK_RECORD) ||
    (use_record_cache file->ha_table_flags() & HA_NOT_DELETE_WITH_CACHE))))
    (void) table->file->extra_opt(HA_EXTRA_CACHE,
    thd->variables.read_buff_size);

    Now it just needs to be translated into human language. Will post once I get it figured out.

  15. The call to handler::extra_opt() with HA_EXTRA_CACHE enables the record cache, otherwise known as read buffer, and controlled by read_buffer_size.

    One more odd twist to this. The original table with 150 records that was not getting cached was mysql.user. I made its exact copy with CREATE TABLE LIKE followed by INSERT INTO SELECT. Scanning the copy the second time does not increment Handler_read_rnd_next. However, scanning the original mysql.user does. So it is not the size of the table, but something else, which is most likely the fact that it is a privileges table.

    One mystery that I do not understand, though, is how incrementing Handler_read_rnd_next is avoided on a MyISAM table without the query cache. I searched through the code and do not see how it is possible. rr_sequential does this:

    int rr_sequential(READ_RECORD *info)
    {
    int tmp;
    while ((tmp=info->file->rnd_next(info->record)))
    {
    /*
    rnd_next can return RECORD_DELETED for MyISAM when one thread is
    reading and another deleting without locks.
    */
    if (info->thd->killed || (tmp != HA_ERR_RECORD_DELETED))
    {
    tmp= rr_handle_error(info, tmp);
    break;
    }
    }
    return tmp;
    }

    ha_myisam::rnd_next() increments the counter first thing, and rr_sequential calls ha_myisam::rnd_next() first thing. So it should get incremented whether the record is in the record cache or not.

Speak Your Mind

*