Are you running MySQL on Debian or Ubuntu with InnoDB? You might want to disable /etc/mysql/debian-start. When you run /etc/init.d/mysql start it runs this script, which runs mysqlcheck, which can destroy performance.
It can happen on a server with MyISAM tables, if there are enough tables, but it is far worse on InnoDB. There are a few reasons why this happens — access to open an InnoDB table is serialized by a mutex, for one thing, and the mysqlcheck script opens all tables. One at a time.
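In rough terms, the work it kicks off boils down to one mysqlcheck pass over everything (a paraphrase, not the verbatim Debian script; the real one wraps the call in its check_for_crashed_tables function):

```shell
# A paraphrase of what /etc/mysql/debian-start effectively triggers on
# every start: one mysqlcheck pass over every table in every database.
# Shown as a string here rather than executed.
cmd="mysqlcheck --defaults-file=/etc/mysql/debian.cnf --all-databases --fast"
echo "$cmd"
```

With thousands of tables, each one of those checks opens a table, and on InnoDB that open is serialized behind the dictionary mutex.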
It’s pretty easy to get into a “perfect storm” scenario. For example, I’m working with one client right now who has a hosted multi-tenant application that keeps each customer in its own database. So they have a lot of databases and a lot of tables. And they’re running on Amazon EC2 with 8G of RAM and EBS storage, which is slower than typical directly-attached server-grade RAID storage. Since they have a lot of tables, InnoDB uses over 3.5G of memory for its data dictionary (the subject for another post — we’re working on a fix) and so we can’t make the buffer pool as large as we’d like.
To avoid physical I/O all the time we need to get some reasonable amount of data into the buffer pool. But we have to do this without death-by-swapping, which would be extremely slow on this machine, so we need to stop the buffer pool and the OS cache from competing. My chosen strategy for this was to set innodb_flush_method=O_DIRECT. We could also tune the OS, but in my experience that’s not as effective when you’re really pushing to get memory into the buffer pool. Remember we have 3.5G of memory less to play with, solely due to the data dictionary.
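The relevant my.cnf fragment looked something like the following (the exact values here are illustrative for this particular 8G box, not a general recommendation):

```ini
[mysqld]
# O_DIRECT bypasses the OS page cache, so the buffer pool and the OS
# don't hold two copies of the same data and compete for memory.
innodb_flush_method     = O_DIRECT
# Sized to leave headroom for the ~3.5G data dictionary plus the OS.
innodb_buffer_pool_size = 3G
```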
But this strategy will only reduce physical reads if the buffer pool follows a typical access pattern. That is, some of the data is in your working set and will stay in the buffer pool, some small part of it will move in and out of the buffer pool, and some won’t be needed.
And that’s where the Debian startup script breaks down entirely, because it doesn’t follow this pattern. It’s going to open every table, whether user queries require it or not. On big servers I’ve seen it literally run for days (or longer). In the meantime, it’ll interfere with everything else going on. Look what happens:
mysql> show processlist;
+------+------------------+----------------+-----------------------
| Id   | User             | State          | Info
+------+------------------+----------------+-----------------------
|    7 | debian-sys-maint | NULL           | CHECK TABLE tableA...
|  739 | user             |                | NULL
| 4776 | user             |                | NULL
| 6318 | user             | Sending data   | insert into tableB...
| 6322 | user             | update         | insert into
| 6327 | user             |                | NULL
| 6328 | user             | statistics     | select ...
| 6334 | user             | statistics     | select ...
| 6337 | user             |                | NULL
| 6340 | user             | Sending data   | select ...
| 6342 | user             | statistics     | select ...
| 6344 | user             |                | NULL
| 6345 | user             | Updating       | update ...
| 6346 | user             | Sorting result | insert ...
| 6351 | user             |                | NULL
| 6355 | user             |                | NULL
| 6356 | user             | statistics     | select ...
| 6357 | user             | statistics     | select ...
| 6358 | user             | Sending data   | select ...
| 6359 | user             | statistics     | select ...
| 6360 | user             |                | NULL
| 6361 | user             |                | NULL
+------+------------------+----------------+-----------------------
Notice all those processes in ‘statistics’ status. Why is that happening? Look at SHOW INNODB STATUS:
=====================================
090128  8:29:03 INNODB MONITOR OUTPUT
=====================================
Per second averages calculated from the last 15 seconds
----------
SEMAPHORES
----------
OS WAIT ARRAY INFO: reservation count 39125236, signal count 13530611
--Thread 1161714000 has waited at row0sel.c line 3326 for 0.00 seconds the semaphore:
S-lock on RW-latch at 0x2aaaae0b70b8 created in file btr0sea.c line 139
a writer (thread id 1158064464) has reserved it in mode exclusive
number of readers 0, waiters flag 1
Last time read locked in file btr0sea.c line 746
Last time write locked in file btr0sea.c line 1624
--Thread 1164011856 has waited at row0sel.c line 3326 for 0.00 seconds the semaphore:
S-lock on RW-latch at 0x2aaaae0b70b8 created in file btr0sea.c line 139
a writer (thread id 1158064464) has reserved it in mode exclusive
number of readers 0, waiters flag 1
Last time read locked in file btr0sea.c line 746
Last time write locked in file btr0sea.c line 1624
--Thread 1164822864 has waited at row0sel.c line 3326 for 0.00 seconds the semaphore:
S-lock on RW-latch at 0x2aaaae0b70b8 created in file btr0sea.c line 139
a writer (thread id 1158064464) has reserved it in mode exclusive
number of readers 0, waiters flag 1
Last time read locked in file btr0sea.c line 746
Last time write locked in file btr0sea.c line 1624
--Thread 1161849168 has waited at row0sel.c line 3326 for 0.00 seconds the semaphore:
S-lock on RW-latch at 0x2aaaae0b70b8 created in file btr0sea.c line 139
a writer (thread id 1158064464) has reserved it in mode exclusive
number of readers 0, waiters flag 1
Last time read locked in file btr0sea.c line 746
Last time write locked in file btr0sea.c line 1624
--Thread 1163336016 has waited at btr0sea.c line 1529 for 0.00 seconds the semaphore:
X-lock on RW-latch at 0x2aaaae0b70b8 created in file btr0sea.c line 139
a writer (thread id 1158064464) has reserved it in mode exclusive
number of readers 0, waiters flag 1
Last time read locked in file btr0sea.c line 746
Last time write locked in file btr0sea.c line 1624
--Thread 1159956816 has waited at btr0sea.c line 1127 for 0.00 seconds the semaphore:
S-lock on RW-latch at 0x2aaaae0b70b8 created in file btr0sea.c line 139
a writer (thread id 1158064464) has reserved it in mode exclusive
number of readers 0, waiters flag 1
Last time read locked in file btr0sea.c line 746
Last time write locked in file btr0sea.c line 1624
--Thread 1157658960 has waited at btr0sea.c line 746 for 0.00 seconds the semaphore:
S-lock on RW-latch at 0x2aaaae0b70b8 created in file btr0sea.c line 139
a writer (thread id 1158064464) has reserved it in mode exclusive
number of readers 0, waiters flag 1
Last time read locked in file btr0sea.c line 746
Last time write locked in file btr0sea.c line 1624
Mutex spin waits 0, rounds 5023577, OS waits 24953
RW-shared spins 34364070, OS waits 33800501; RW-excl spins 5756394, OS waits 5297208
Everyone is waiting on that latch, and they are all waiting for thread 1158064464, which has reserved it in exclusive mode. If you hunt through the TRANSACTIONS section, you can see the OS thread IDs, and that one is the debian-sys-maint thread. You also see the other threads:
---TRANSACTION 0 228527423, ACTIVE 0 sec, process no 30034, OS thread id 1159956816
starting index read, thread declared inside InnoDB 500
mysql tables in use 7, locked 0
MySQL thread id 6424, query id 1579718 10.255.106.47 user statistics
And correlating the thread ID back to the semaphores, you see thread 1159956816 is waiting for the semaphore.
Notice that this is effectively a global lock. The debian-sys-maint thread is not touching the same tables as the other queries; it’s just touching the same internal structures (here, the RW-latch created in btr0sea.c — InnoDB’s adaptive hash index latch). So a user working on table A can interfere with a user who wants access to table B.
The real solution is to disable this startup process. It’s not even needed for InnoDB. Sooner or later you’ll find yourself fighting with it. You can just put “exit 0;” at the top.
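A sketch of that change, done here on a scratch copy of the file (on a real server you’d edit /etc/mysql/debian-start in place; this assumes GNU sed):

```shell
# Stand-in for the real script, so this is safe to run anywhere:
printf '#!/bin/bash\necho pretending to run mysqlcheck\n' > ./debian-start
# Insert "exit 0" right after the shebang; nothing below it ever runs.
sed -i '1a exit 0' ./debian-start
bash ./debian-start   # now a harmless no-op, prints nothing
```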
The solution I chose in this case?
mysql> KILL 7;
Immediately afterward everything cleared up.
Very interesting. I’ve noticed how long this process takes, but usually just accepted it as a fact of life for data consistency, similar to fsck. I have killed it off in the past though when I really needed the server to start running immediately.
It may be worth forwarding this along to the Debian MySQL package’s bug page, as I’m sure they’d take action with the documentation provided here, and this is certainly something squarely in their hands as opposed to upstream: http://bugs.debian.org/cgi-bin/pkgreport.cgi?pkg=mysql-server;dist=unstable
Nice find though!
Note that this shouldn’t be a problem with MySQL from Debian lenny, as it ships with init scripts that will not run CHECK TABLE on non-MyISAM tables.
Actually we only comment the call of “check_for_crashed_tables;” in the Debian script.
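That change can be sketched with one sed command (shown on a stand-in file here; the real target is /etc/mysql/debian-start):

```shell
# Stand-in file containing the offending call:
printf '%s\n' '#!/bin/bash' '  check_for_crashed_tables;' > ./debian-start
# Comment out just that call, leaving the rest of the script intact:
sed -i 's/^\([[:space:]]*check_for_crashed_tables;\)/#\1/' ./debian-start
```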
I have also used the “KILL 7” approach in the past and found that another thread is often opened up immediately with a “CHECK TABLE” on the next table in the database. I needed to keep killing threads until it gets to the end. [Ubuntu 8.04 Hardy]
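If you do go the KILL route, a loop along these lines saves re-typing KILL by hand (a hypothetical helper, not from the post; it assumes MySQL 5.1+, where INFORMATION_SCHEMA.PROCESSLIST exists):

```shell
# Hypothetical helper: keep killing the maintenance thread's CHECK TABLE
# until mysqlcheck gives up and moves on or exits.
kill_check_threads() {
    local id
    while id=$(mysql -N -B -e "SELECT id FROM information_schema.processlist WHERE user='debian-sys-maint' AND info LIKE 'CHECK TABLE%' LIMIT 1") \
          && [ -n "$id" ]; do
        mysql -e "KILL $id"
        sleep 1
    done
}
```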
My first action after installing mysql on Debian is to comment the check_for_crashed call (actually not really me, but Puppet is doing it for me).
Yep, the stock MySQL Debian init script sucks so badly that it’s almost mandatory for every sysadmin out there with a real MySQL installation to fine-tune it. It’s almost as bad that the maintainer scripts in the .deb packages stop/start MySQL without asking. There are situations in which you want to update the binary quickly but you want a controlled restart.
Vide, just drop a mail to the Debian BTS with your suggestions and patches. It’s too late to change things for lenny, but we can talk about your suggestions and patches for squeeze.
Btw, I’m using the stock init scripts in “real mysql installations” (50+ servers), and they are working just fine.
Changing the script isn’t enough to avoid the problems with the data dictionary, BTW. Just touching the table causes InnoDB to allocate space for it, which it will never release again until it’s restarted. And you have to touch the table to find out its engine, so you can’t know a priori which tables not to touch. Any scripts that query the INFORMATION_SCHEMA upon startup are also a very bad idea on installations with lots of tables.
I still advise to disable the script, period, unless you know you want it enabled.
Norbert, I don’t think it’s the init script Vide meant; it’s the install/deinstall. I agree — it shouldn’t try to start/stop automatically. This is not a mistake unique to Debian 🙂
Baron,
In reality it is not clear to me why it’s done this way at all.
If you want tables to be checked and repaired on startup, myisam_recover is the decent option. If this causes problems because MySQL starts to check and repair a lot of tables at the same time, blocking all connections, maybe MyISAM is simply not the right choice for you.
Running the check in the background is the broken middle ground – you still have some queries run against corrupted MyISAM tables, which can cause wrong query results or further corruption, and you can still get MySQL stalled or running out of connections when a lot of connections block waiting for some large, commonly accessed table which is being checked.
I agree 100%. One thing I didn’t say — there might be a reason for a custom startup script — but it’s not this. Just don’t put mysql_upgrade in it (that only needs to be run *one time*!), and don’t put mysqlcheck in it either. But you might put in a script to warm up your indexes or something like that. Something that’s too complex to put in the --init-file option.
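A warm-up along those lines might look like this (just a sketch; app_db, orders, and customer_id are made-up names, and it assumes a secondary index exists on customer_id):

```shell
# Hypothetical warm-up (placeholder names): read the hot indexes once so
# their pages land in the buffer pool, instead of CHECK TABLE-ing everything.
warm_up() {
    mysql app_db -e "SELECT COUNT(*) FROM orders"            # drags in the clustered index
    mysql app_db -e "SELECT COUNT(customer_id) FROM orders"  # drags in a secondary index
}
```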
Yet another brain damaged Debian “improvement”.
Memo to debian package maintainers: You are *package maintainers*. If you want to change the way MySQL *behaves*, SUBMIT A PATCH LIKE EVERYONE ELSE, IF FOR NO OTHER REASON THAN TO HAVE IT REFLECTED IN THE OFFICIAL DOCUMENTATION EVERYONE (except a few of us crusty old farts who don’t trust anyone or anything we don’t build ourselves) RELIES UPON.
Seriously…
g, you don’t need to cry, as ch wrote above the behaviour was already changed about a year ago.