July 30, 2014

Index lock and adaptive search – next two biggest InnoDB problems

Running many benchmarks on fast storage (FusionIO, SSDs) and multi-cores CPUs system I constantly face two contention problems.

So I suspect it’s going to be next biggest issues to make InnoDB scaling on high-end system.

This is also reason why in benchmarks I posted previously CPU usage is only about 50%, leaving other 50% in idle state.

First problem is index->lock mutex.
InnoDB uses single mutex per index, so when you run mixed read / write queries, InnoDB locks index for write operation and thus keeps all selects waiting when update/insert is done. This is implemented in this way because write operation may cause B-Tree page split, and InnoDB needs to move records between pages to finish operation. It is getting even worse when for write you need to perform some additional IO to bring page into buffer_pool.

What could be done there internally: there is B-Tree lock free or only page-level lock algorithms, so operation does not need to block whole B-Tree.
From end user point, to fight with this problem, you may need to partition (manually or using 5.1 partitions) table with big index into couple smaller table. It’s ugly, but it can help while main problem is not solved.

Second problem is adaptive_search index.
It appears when you have some scanning by secondary key select queries and write queries at the same time.
InnoDB again uses single global mutex for adaptive_search (single mutex for ALL table and ALL indexes), so write query blocks ALL select queries.
Usually first action is to disable adaptive_search (it is possible via global variable), but it rarely helps actually. With disabled adaptive index InnoDB needs to perform much more operations reading secondary keys.

How it can be solved internally: I think some hashing algorithms may be applied to not lock select queries. We may look how to implement it.

Until that InnoDB basically can’t utilize powerful hardware.
For example even in IO intensive load I am getting the same result
for single FusionIO card and for two FusionIO cards coupled in RAID0 (which theoretically doubles througput).

About Vadim Tkachenko

Vadim leads Percona's development group, which produces Percona Clould Tools, the Percona Server, Percona XraDB Cluster and Percona XtraBackup. He is an expert in solid-state storage, and has helped many hardware and software providers succeed in the MySQL market.

Comments

  1. Tobias Petry says:

    Your explanation of the lock mutex sounds like the MyIsam implementation: Block all reads when there’s some data manipulation. But this can’t be true or? This would be as worse as MyIsam and would not fit into the concept of row-level-locking.

  2. Vadim, the adaptive hash index is sometimes even the leading problem on standard non-plugin InnoDB in MySQL 5.0.x. I just saw a case of that today, on a server with 24 cores.

    It seems to me that the global structures are inevitably going to become a global problem inside InnoDB. Adaptive hash index is one, but I think the insert buffer, undo logs, and probably almost everything else in the global tablespace is another. As I look into our crystal ball, I see us going from bottleneck to bottleneck :-) Will we eventually end up with a storage engine that has nothing shared? Everything per-table or per-index?

  3. Vadim says:

    Baron,

    You are right, after fixing one bottleneck we will face another, and it will continue as long we have shread global structures.

    So.. the solution to run 128 mysql instances on 128 cores does not look that bad ;)

  4. With regard to index->mutex problem:
    Another possible solution would be to manage overflow pages to the index. So that not all writes to the index are immediately affected in the B-Tree, but rather written — sequentially — to index-overflow-pages, which are later applied in batch jobs.
    Index lookups will need to iterate both these structures.

  5. artemg says:

    have you already tried/tested to bind several mysql instances to different cores/sockets on the same box and run them in master->slave configuration?
    does it scale good enough, or there are other replicataion-specific bottlnecks?

  6. Vadim says:

    Tobias,

    InnoDB has too many internal global structures which are blocked by single writer.
    To add to alredy named index->lock and adaptive_search lock, there is single rollback segment and single lock on intrenal data dictionary.
    It is not as bad as for MyISAM table level lock, but named issues do not allow to scale on multi-cpu/cores systems.

  7. Vadim says:

    artemg,

    We did not test this setup yet.
    Actually Gallera replication looks prefferable for me in this regard,
    but I need yet to understand whan kind of latency we can have on simple transactions on
    single box.

  8. peter says:

    Vadim,

    Is it Mutex which protecting Index lock or is it RWLOCK ? I see no reason why readers should block other readers in either of them ?

  9. sky000 says:

    Not only both read and write query using same secondary index will happen this problem,but also just only read query too.

    like this:

    http://bugs.mysql.com/bug.php?id=51543

  10. mutex_pizza says:

    his is bad, all Select queries are waiting for the lock, and there was only one update uses secondary key and one insert query.
    ./Percona-Server-5.5.16-rel22.0/storage/innobase/btr/btr0cur.c
    ===================================================
    545 if (latch_mode == BTR_MODIFY_TREE) {
    546 mtr_x_lock(dict_index_get_lock(index), mtr); ==> writer aqcuire execlusive lock
    547
    548 } else if (latch_mode == BTR_CONT_MODIFY_TREE) {
    549 /* Do nothing */
    550 ut_ad(mtr_memo_contains(mtr, dict_index_get_lock(index),
    551 MTR_MEMO_X_LOCK));
    552 } else {
    553 mtr_s_lock(dict_index_get_lock(index), mtr); ===> all other queries on that table blocked here, including readers
    554 }

    ===========================================================
    1 Mutex at 0x2d07c648 ‘&trx_doublewrite->mutex’
    2 lock on RW-latch at 0x2d881108 ‘&new_index->lock’
    212 Mutex at 0x2d0777a8 ‘&log_sys->mutex’
    803 lock on RW-latch at 0x2d87c708 ‘&new_index->lock’

  11. mutex_pizza says:

    I am sorry, I meant this is bad, Does this apply to unique secondary key too?

  12. mutex_pizza says:

    I am sorry, I meant this is bad, Does this apply to unique secondary keys too?

Speak Your Mind

*