April 16, 2014

kernel_mutex problem cont. Or triple your throughput

This is a follow-up to my previous post on the kernel_mutex problem.

First, I may have an explanation for why the performance degrades so significantly and why innodb_sync_spin_loops may fix it.
Second, if that explanation is correct (or even if it is not, we can try anyway), then playing with innodb_thread_concurrency may also help. So I ran some benchmarks with different innodb_thread_concurrency values.

My explanation for the performance degradation is the following:
InnoDB still uses a strange mutex implementation based on sync_arrays (hello, 1990s); I do not know of a good reason why it has not been replaced yet.
Internally, sync_array uses a pthread_cond_wait / pthread_cond_broadcast construction, and on each pthread_cond_broadcast call all threads competing for the mutex wake up and start racing for it.
This effect is known as the thundering herd.
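To make the cost of a broadcast concrete, here is a minimal sketch in Python. It is not InnoDB code: threading.Condition stands in for the pthread condition variable, and count_wakeups, freed and settled are names made up for this illustration. It hands a "mutex" to N waiters one at a time and counts how many times waiters are woken in total, first with a broadcast on every release and then with a single signal.

```python
import threading


def count_wakeups(num_waiters, broadcast):
    """Hand a 'mutex' to each waiter in turn; return the total number of wakeups."""
    lock = threading.Lock()
    freed = threading.Condition(lock)    # signalled when the "mutex" is released
    settled = threading.Condition(lock)  # lets the releaser wait for a quiet state
    state = {"free": False, "sleeping": 0, "wakeups": 0}

    def waiter():
        with lock:
            while not state["free"]:
                state["sleeping"] += 1
                settled.notify()
                freed.wait()
                state["wakeups"] += 1    # woken up, now re-check the flag
            state["free"] = False        # this waiter wins the "mutex"
            settled.notify()

    threads = [threading.Thread(target=waiter) for _ in range(num_waiters)]
    for t in threads:
        t.start()

    with lock:
        for remaining in range(num_waiters, 0, -1):
            # Release only once the previous winner is done and every loser
            # is asleep again, so that the count is reproducible.
            settled.wait_for(
                lambda: not state["free"] and state["sleeping"] == remaining
            )
            state["free"] = True
            if broadcast:
                state["sleeping"] = 0    # everybody wakes up and races
                freed.notify_all()
            else:
                state["sleeping"] -= 1   # exactly one waiter wakes up
                freed.notify()

    for t in threads:
        t.join()
    return state["wakeups"]


if __name__ == "__main__":
    for n in (8, 64, 256):
        print(n, count_wakeups(n, broadcast=True), count_wakeups(n, broadcast=False))
```

With a broadcast, every release wakes all remaining waiters and all but one go back to sleep, so the number of wakeups grows as N(N+1)/2. With a single signal it grows as N. Whether this is really what hurts InnoDB is exactly the point under discussion.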

Davi Arnaut does not agree with me, and I do not agree with him either. This is a healthy discussion, and it is possible only because InnoDB is still open source and we can all check the source code. If the problem were in the closed Thread Pool extension, I could not participate in it.

We will probably argue more on that topic, but that does not stop us from trying different values of innodb_thread_concurrency (0 by default, which means no restriction).
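As a rough mental model only, innodb_thread_concurrency can be pictured as a gate that admits at most N threads at a time, with 0 meaning no gate at all. The sketch below is an assumption-laden illustration in Python, not how InnoDB implements it: run_workers and its parameters are hypothetical names, and a semaphore stands in for InnoDB's own admission logic.

```python
import threading
import time


def run_workers(num_workers, limit):
    """Run workers through a gated section; return the peak number inside at once.

    limit=0 mimics the default (no restriction); any other value caps the
    number of workers allowed inside at the same time.
    """
    gate = threading.BoundedSemaphore(limit) if limit > 0 else None
    lock = threading.Lock()
    stats = {"inside": 0, "peak": 0}

    def worker():
        if gate is not None:
            gate.acquire()               # wait outside until a slot is free
        try:
            with lock:
                stats["inside"] += 1
                stats["peak"] = max(stats["peak"], stats["inside"])
            time.sleep(0.001)            # stand-in for real work inside the engine
            with lock:
                stats["inside"] -= 1
        finally:
            if gate is not None:
                gate.release()

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return stats["peak"]


if __name__ == "__main__":
    print("limit=8:", run_workers(256, 8))
    print("limit=0:", run_workers(256, 0))
```

Threads held at the gate do not compete for the mutexes behind it, which is why a small limit could reduce contention.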

This variable has had a complicated history. It was once the solution for poor InnoDB scalability, then its default value changed, and later it was even called useless.

Here are the results for the same workload as in the previous post, with 256 threads and innodb_thread_concurrency=0,4,8,16,32,64:

innodb_thread_concurrency    Throughput
 0                            68369.02
 4                           137999.96
 8                           194537.48
16                           161985.59
32                           158296.21
64                           153889.72

Wow, this is something. I expected an improvement, but not almost 3x (194537 ÷ 68369 = 2.8).
The best throughput is with innodb_thread_concurrency=8.
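The arithmetic is easy to check. The short Python snippet below uses only the throughput numbers from the table above (the variable names are mine) and prints the speedup of each setting relative to innodb_thread_concurrency=0.

```python
# Throughput at 256 threads, keyed by innodb_thread_concurrency (from the table).
results = {
    0: 68369.02,
    4: 137999.96,
    8: 194537.48,
    16: 161985.59,
    32: 158296.21,
    64: 153889.72,
}

baseline = results[0]
best = max(results, key=results.get)    # setting with the highest throughput

for setting, throughput in results.items():
    marker = "  <- best" if setting == best else ""
    print(f"{setting:>2}  {throughput:>10.2f}  {throughput / baseline:.2f}x{marker}")
```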

So now let’s compare the results for innodb_thread_concurrency = 0 vs 8 across the whole range of thread counts:

Threads    innodb concurrency=0    innodb concurrency=8
   1              11178.34
   2              27741.06
   4              53364.52
   8              92546.73                88046.72
  16             144619.58               141781.00
  32             164884.03               168360.95
  64             154235.73               186167.15
 128             147456.33               199260.97
 256              68369.02               194357.78
 512              40509.67               194639.51
1024              22166.94               183524.16

So innodb_thread_concurrency is even more helpful than innodb_sync_spin_loops, and it gives a stable result even with 1024 threads. It is too early to call it useless, and it is worth experimenting with.
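One way to put a number on "stable" is to compare the throughput at 1024 threads with the peak throughput of the same configuration. The Python snippet below does this using only rows from the comparison table that have both values; the helper name retained_at_max_threads is made up for this illustration.

```python
# threads -> (throughput with concurrency=0, throughput with concurrency=8)
rows = {
    8: (92546.73, 88046.72),
    16: (144619.58, 141781.00),
    32: (164884.03, 168360.95),
    64: (154235.73, 186167.15),
    128: (147456.33, 199260.97),
    256: (68369.02, 194357.78),
    512: (40509.67, 194639.51),
    1024: (22166.94, 183524.16),
}


def retained_at_max_threads(column):
    """Fraction of the peak throughput still delivered at the highest thread count."""
    values = {threads: pair[column] for threads, pair in rows.items()}
    return values[max(values)] / max(values.values())


unlimited = retained_at_max_threads(0)   # innodb_thread_concurrency=0
limited = retained_at_max_threads(1)     # innodb_thread_concurrency=8

print(f"concurrency=0 keeps {unlimited:.0%} of its peak at 1024 threads")
print(f"concurrency=8 keeps {limited:.0%} of its peak at 1024 threads")
```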


About Vadim Tkachenko

Vadim leads Percona's development group, which produces the Percona Server and Percona XtraBackup. He is an expert in solid-state storage, and has helped many hardware and software providers succeed in the MySQL market.

Comments

  1. Wlad says:

    I think the problem is about to be correctly identified. Maybe it is not kernel_mutex that hurts Innodb. Maybe it is sync_array (protected by own lock) that hurts. All that stuff, the atomic looping with dirty read on the variable, the ut_delay with its fake math just not to access the variable, avoiding entering sync_array mutex as long as possible, pthread conditions and wakeups. At some point, all this needs to be replaced with a single pthread_mutex_timedlock() – (I believe timed is needed to handle deadlocks) on systems that support timed mutex locks, and ported to systems that do not support it.

  2. Mark Callaghan says:

    innodb_thread_concurrency=8 is my favorite way to guarantee that you don’t get more than 8 pending disk operations (ignoring purge, ibuf merges and readahead). I know you aren’t promoting it as a great solution because for workloads that want to do a lot of disk IO and busy nice storage subsystems, it really is a good idea to send more concurrent operations to the disks.

  3. Dave Juntgen says:

    @Vadim – thanks for investigating, really good info, did you run the thread concurrency test with the increased spin loops set to 200?

    @Mark – in a read intensive setup (98% reads), would setting the innodb_thread_concurrency=8 be necessarily a bad thing? What are the trade offs?

  4. Mark,

    Yeah. innodb_thread_concurrency is especially hard to tune on a mixed workload: when the load is completely CPU bound some of the time you want it relatively low, but there are also heavy batch jobs which are IO bound and would benefit from a much higher innodb_thread_concurrency.

    The “right” solution would be some form of IO-aware thread scheduling for the whole of MySQL, not just InnoDB, where you can schedule something else to run when a thread is about to block on disk/network IO, locks, etc.

  5. Mark Callaghan says:

    I hope the community implements the thread pool API for MySQL 5.6 with something that is aware of disk and network IO.

  6. Dimitri says:

    Vadim,

    It all depends on the contention you have: setting innodb_thread_concurrency will either help or not help at all. In the case of kernel_mutex it was still OK. In the case of some others, not at all. See the analysis I posted last year: http://dimitrik.free.fr/blog/archives/2010/11/mysql-performance-55-and-innodb-thread-concurrency.html

    Rgds,
    -Dimitri

  7. Mark Callaghan says:

    Wlad – I think you are right about using timed mutexes to replace the sync array. With that the sync array won’t be needed as each waiting thread can do its own checks for “waiting too long” and “missed wakeup” after each timeout. We have done prototypes for this a couple of times and the results were usually good, but CPU/mutex bound workloads are not the common case for me, so I will wait for someone else to implement it for real.

  8. todd says:

    Thread pools use the same broadcast mechanism to unblock threads off a semaphore when new work is available, so I’m not sure that would help much.

  9. Interesting to see it hitting optimum at 8 considering that the box has 24 logical threads (12 physical cores). What does this imply ? Is it hitting some software bottleneck (sync_array, mutex herding etc) or a hardware one — numa/cache contention etc.
    I don’t see hardware becoming a bottleneck considering it has both RAID and Fusion-io card, what I/O scheduler was used for this — default CFQ or deadline ? Also was the filesystem XFS ?

    Regarding innodb_thread_concurrency being 0, I can see that there will be lot of thrashing/cpu-stealing etc going on, leading to reduced throughput.

  10. Will says:

    We recently tried some of this tuning to get rid of some contention that we are having. sync_spin_loop changes made no difference, and decreasing innodb_thread_concurrency to 16 or under actually caused our site to crash. So obviously this stuff is work-load dependent.

  11. Tinel Barb says:

    I think the way we understand innodb_thread_concurrency is wrong.
    i admit, the results presented in this article are nice, but we should understand that innodb_thread_concurrency is not cpu nor disk I/O bound!
    As long as the threads have not entered the execution pool they are not working at all, thus the cpu and disk I/O are not affected.
    This leads to the conclusion that innodb_thread_concurrency is setting up a stand-by pool before the thread gets access to execution.
    Therefore, buffered threads according innodb_thread_concurrency are not competing for mutexes.
    So, the only settings that affect execution pool and performance (in terms of cpu and disk I/O performance) are:
    - innodb_read_io_threads
    - innodb_write_io_threads
    - innodb_commit_concurrency
    - innodb_thread_sleep_delay
    - innodb_concurrency_tickets
    - innodb_sync_spin_loops
    - innodb_spin_wait_delay
    Having this in mind, and trying to find the most stable configuration for my workload (knowing statistically how many threads try to enter simultaneously the execution queue), I’ve come to conclusion that I have to basically tune:
    • how many threads gets to execution – with regards of cpu and disks number
    • how the threads go to execution – by tuning innodb_sync_spin_loops, innodb_spin_wait_delay and innodb_concurrency_tickets
    According to mysql site, I have come to the conclusion that I need to give the threads the chance to wait more to be granted to the execution pool BEFORE entering the sleep state (which will put them in FIFO pool, but with performance decrease), by relaxing innodb_sync_spin_loops and innodb_spin_wait_delay!
    In fact, I have slightly increased innodb_sync_spin_loops to 80 (default 30!) and reduced innodb_spin_wait_delay to 5 (default 6!).
    I set the innodb_thread_concurrency pool to 32.
    The result is a major decrease for all mutex competitions, therefore I obtain an increase in stability.
    I have great respect for Percona’s programmers (they are an inspiration for me), so I’d really appreciate their opinion, maybe conducting a series of tests with this “theory”.
    Keep up, Percona team, with this great blogging site!

  12. Tinel Barb says:

    I have to mention that my values for “how many threads gets to execution” were set by tuning:
    - innodb_read_io_threads = 2 #number of innodb_buffer_pool_instances
    - innodb_write_io_threads = 10 #8 cores + 2 disks
    - innodb_commit_concurrency = 2 #number of innodb_buffer_pool_instances
    I know that the number “2” may lead to bottlenecks, but in reality the stability is so great: the value of OS Waits is under 1e-4% and “Spin rounds per wait” is below 30 on all types of mutexes, far from the value of 80 allocated. All these gave a performance boost.
    I’d like to hear cons or pros. Thanks!
