July 25, 2014

Helgrinding MySQL with InnoDB for Synchronisation Errors, Fun and Profit

It is no secret that bugs related to multithreading–deadlocks, data races, starvations etc–have a big impact on application’s stability and are at the same time hard to find due to their nondeterministic nature.  Any tool that makes finding such bugs easier, preferably before anybody is aware of their existence, is very welcome.

Enter the Helgrind tool of the Valgrind dynamic analyzer suite. Now, Valgrind does not need much introduction, especially to the MySQL server developers. The Valgrind Memcheck tool, which is synonymous with Valgrind itself, is relatively widely used for the MySQL server development: there is support for it in the MySQL Test Framework, DBQP, and the server source is properly annotated to get the most out of Memcheck. Not everything is perfect though: the manual is only partially correct in how to properly build the server to use it  (bug 61587). We at Percona regularly use Memcheck for our development and upstream testing with useful results (bug 61986, bug 61985, …).

Helgrind does not enjoy the same level of support and I think it’s a shame. What I have to found to be useful in the daily work is to have the ability to run a selected MTR testcase under Helgrind. (For the time being, running the whole MySQL testsuite is not very practical due to 100x slowdown imposed by Helgrind).  For this purpose, I have patched mysql-test-run.pl to accept a new option: –helgrind.  The patch is a copy-paste-modify of the already existing Callgrind support and is not going to win any Cleanest Feature of the Year awards, but hey, this is MTR we are talking about. For a proper implementation for our purposes, Patrick has added preliminary Helgrind support to DBQP.

For 5.1:

mtr-helgrind-51.patch

For 5.5:

mtr-helgrind-55.patch

Now let’s see what kind of goodies does Helgrind find on MySQL server. Let’s take a recent 5.1 bzr version (pre-5.1.61) and a single test, innodb_plugin.innodb_bug53674. The test is chosen for no particular reason except that it is a quick-running one.

The race here is between the reads and writes of mutex_t::waiters in mutex_get_waiters and mutex_set_waiters. Interestingly, there are comments next to it that say /* Here we assume that the [read|write] of a single word from memory is atomic */.  While it is true that machine native word stores/loads are technically atomic, i.e. you cannot observe any intermediate state, it is likely that if other CPU cores will read this same variable, they will get an old value for quite some time after the store. Now of course it is extremely unfair to say about InnoDB developers that they do not understand parallelism. They do. The stacktraces above are incomplete due to inlining, but it’s not hard to find out that mutex_get_waiters is called from mutex_exit and there we find a comment:

So, this race is accounted for. Let’s go on.

This is a race between srv_monitor_thread and srv_error_monitor_thread InnoDB threads storing to srv_last_monitor_time. It’s hard for me to think of any harm that may happen here due to the way this variable is used. Moving on.

This one shows that accesses to the fields of buf_pool->stat are unprotected. These are buffer pool statistics counters: number of read, written, evicted, etc. pages. The more interesting race is not the one show but rather between different threads bumping the counters at the same time. As a result, some of the stores might be lost, with the counter values slightly smaller than they should be in the end.

This one seems to suggest an unprotected access to the current LSN (log_sys->lsn), with a small wrinkle: the accesses are actually protected. log_write_low asserts that it holds the log system mutex, and log_get_lsn acquires it..

Just like with buffer pool stats, the I/O stats might be slightly smaller than they should be.

The RW lock spin stats join the company of the buffer pool and I/O stats.

All previous errors were data races, and here we see a new kind of error: lock order violation, thus a potential deadlock. The locks in question are LOCK_global_system_variables and LOCK_system_variables_hash. Deadlock is possible when one connection issues SHOW VARIABLES or equivalent while another connection is just starting up (i.e. its THD is being initialized). I have reported it as bug 63203. Interestingly Valeriy Kravchuk has confirmed it by finding another (related) lock order violation: between LOCK_system_variables_hash and LOCK_plugin.

I have only shown and analyzed a small part of all Helgrind-reported data races. The remaining ones either show the same issues, show similar issues (other InnoDB stat counters), show non-issues or my analysis errors (like the log_sys->lsn “race” above) or I simply haven’t analyzed them (yes, there is remains a lot of the last kind). What would I do about all this? If I had my way, I’d use atomic access primitives in InnoDB for the benign and minor cases too–if only to declare an intent and bullet-proof the code against future changes that might make the issues not-so-minor. Additionally, I’d consider backporting and using MySQL 5.5 atomic operation primitives with proper memory barriers, so that there is no need for workarounds in mutex_get_waiters/mutex_set_waiters above and similar cases. And then Helgrind would be even more useful for daily development and automated testing with a clean log by default :)

About Laurynas Biveinis

Laurynas is a software engineer and Percona Server lead whose primary interest is InnoDB performance. In the past he worked in industry, interned in Google as a compiler software engineer, as well as academia where he researched physical database indexes, including large-scale spatial models of the brain.

Comments

  1. tobi says:

    A clogged Helgrind log makes continuous checking a nasty exercise. For the same reason many projects choose to compile warnings as errors so everyone is forced to mark them off.

  2. Interesting. Related to unprotected variables, after looking at the code, I have noticed that many global variables are unprotected even though they can be dynamically updated. Some of them have a validate/update hooks which ensure to lock the access with mutex etc but still many of them aren’t. The one closest I can think of is innodb thread concurrency variable. It is a dynamic one but still is unprotected by any updates from main thread.

Speak Your Mind

*