Are you running MySQL on Debian or Ubuntu with InnoDB? You might want to disable /etc/mysql/debian-start. When you run /etc/init.d/mysql start it runs this script, which runs mysqlcheck, which can destroy performance.
It can happen on a server with MyISAM tables, if there are enough tables, but it is far worse on InnoDB. There are a few reasons why this happens — access to open an InnoDB table is serialized by a mutex, for one thing, and the mysqlcheck script opens all tables. One at a time.
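In rough terms, the work it kicks off boils down to one mysqlcheck pass over everything (a paraphrase, not the verbatim Debian script; the real one wraps the call in its check_for_crashed_tables function):

```shell
# A paraphrase of what /etc/mysql/debian-start effectively triggers on
# every start: one mysqlcheck pass over every table in every database.
# Shown as a string here rather than executed.
cmd="mysqlcheck --defaults-file=/etc/mysql/debian.cnf --all-databases --fast"
echo "$cmd"
```

With thousands of tables, each one of those checks opens a table, and on InnoDB that open is serialized behind the dictionary mutex.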
It’s pretty easy to get into a “perfect storm” scenario. For example, I’m working with one client right now who has a hosted multi-tenant application that keeps each customer in its own database. So they have a lot of databases and a lot of tables. And they’re running on Amazon EC2 with 8G of RAM and EBS storage, which is slower than typical directly-attached server-grade RAID storage. Since they have a lot of tables, InnoDB uses over 3.5G of memory for its data dictionary (the subject for another post — we’re working on a fix) and so we can’t make the buffer pool as large as we’d like.
To avoid physical I/O all the time we need to get some reasonable amount of data into the buffer pool. But we have to do this without death-by-swapping, which would be extremely slow on this machine, so we need to stop the buffer pool and the OS cache from competing. My chosen strategy for this was to set innodb_flush_method=O_DIRECT. We could also tune the OS, but in my experience that’s not as effective when you’re really pushing to get memory into the buffer pool. Remember we have 3.5G of memory less to play with, solely due to the data dictionary.
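The relevant my.cnf fragment looked something like the following (the exact values here are illustrative for this particular 8G box, not a general recommendation):

```ini
[mysqld]
# O_DIRECT bypasses the OS page cache, so the buffer pool and the OS
# don't hold two copies of the same data and compete for memory.
innodb_flush_method     = O_DIRECT
# Sized to leave headroom for the ~3.5G data dictionary plus the OS.
innodb_buffer_pool_size = 3G
```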
But this strategy will only reduce physical reads if the buffer pool follows a typical access pattern. That is, some of the data is in your working set and will stay in the buffer pool, some small part of it will move in and out of the buffer pool, and some won’t be needed.
And that’s where the Debian startup script breaks down entirely, because it doesn’t follow this pattern. It’s going to open every table, whether user queries require it or not. On big servers I’ve seen it literally run for days (or longer). In the meantime, it’ll interfere with everything else going on. Look what happens:
mysql> show processlist;
+------+------------------+----------------+-----------------------
| Id   | User             | State          | Info
+------+------------------+----------------+-----------------------
|    7 | debian-sys-maint | NULL           | CHECK TABLE tableA...
|  739 | user             |                | NULL
| 4776 | user             |                | NULL
| 6318 | user             | Sending data   | insert into tableB...
| 6322 | user             | update         | insert into
| 6327 | user             |                | NULL
| 6328 | user             | statistics     | select ...
| 6334 | user             | statistics     | select ...
| 6337 | user             |                | NULL
| 6340 | user             | Sending data   | select ...
| 6342 | user             | statistics     | select ...
| 6344 | user             |                | NULL
| 6345 | user             | Updating       | update ...
| 6346 | user             | Sorting result | insert ...
| 6351 | user             |                | NULL
| 6355 | user             |                | NULL
| 6356 | user             | statistics     | select ...
| 6357 | user             | statistics     | select ...
| 6358 | user             | Sending data   | select ...
| 6359 | user             | statistics     | select ...
| 6360 | user             |                | NULL
| 6361 | user             |                | NULL
+------+------------------+----------------+-----------------------
Notice all those processes in ‘statistics’ status. Why is that happening? Look at SHOW INNODB STATUS:
=====================================
090128  8:29:03 INNODB MONITOR OUTPUT
=====================================
Per second averages calculated from the last 15 seconds
----------
SEMAPHORES
----------
OS WAIT ARRAY INFO: reservation count 39125236, signal count 13530611
--Thread 1161714000 has waited at row0sel.c line 3326 for 0.00 seconds the semaphore:
S-lock on RW-latch at 0x2aaaae0b70b8 created in file btr0sea.c line 139
a writer (thread id 1158064464) has reserved it in mode exclusive
number of readers 0, waiters flag 1
Last time read locked in file btr0sea.c line 746
Last time write locked in file btr0sea.c line 1624
--Thread 1164011856 has waited at row0sel.c line 3326 for 0.00 seconds the semaphore:
S-lock on RW-latch at 0x2aaaae0b70b8 created in file btr0sea.c line 139
a writer (thread id 1158064464) has reserved it in mode exclusive
number of readers 0, waiters flag 1
Last time read locked in file btr0sea.c line 746
Last time write locked in file btr0sea.c line 1624
--Thread 1164822864 has waited at row0sel.c line 3326 for 0.00 seconds the semaphore:
S-lock on RW-latch at 0x2aaaae0b70b8 created in file btr0sea.c line 139
a writer (thread id 1158064464) has reserved it in mode exclusive
number of readers 0, waiters flag 1
Last time read locked in file btr0sea.c line 746
Last time write locked in file btr0sea.c line 1624
--Thread 1161849168 has waited at row0sel.c line 3326 for 0.00 seconds the semaphore:
S-lock on RW-latch at 0x2aaaae0b70b8 created in file btr0sea.c line 139
a writer (thread id 1158064464) has reserved it in mode exclusive
number of readers 0, waiters flag 1
Last time read locked in file btr0sea.c line 746
Last time write locked in file btr0sea.c line 1624
--Thread 1163336016 has waited at btr0sea.c line 1529 for 0.00 seconds the semaphore:
X-lock on RW-latch at 0x2aaaae0b70b8 created in file btr0sea.c line 139
a writer (thread id 1158064464) has reserved it in mode exclusive
number of readers 0, waiters flag 1
Last time read locked in file btr0sea.c line 746
Last time write locked in file btr0sea.c line 1624
--Thread 1159956816 has waited at btr0sea.c line 1127 for 0.00 seconds the semaphore:
S-lock on RW-latch at 0x2aaaae0b70b8 created in file btr0sea.c line 139
a writer (thread id 1158064464) has reserved it in mode exclusive
number of readers 0, waiters flag 1
Last time read locked in file btr0sea.c line 746
Last time write locked in file btr0sea.c line 1624
--Thread 1157658960 has waited at btr0sea.c line 746 for 0.00 seconds the semaphore:
S-lock on RW-latch at 0x2aaaae0b70b8 created in file btr0sea.c line 139
a writer (thread id 1158064464) has reserved it in mode exclusive
number of readers 0, waiters flag 1
Last time read locked in file btr0sea.c line 746
Last time write locked in file btr0sea.c line 1624
Mutex spin waits 0, rounds 5023577, OS waits 24953
RW-shared spins 34364070, OS waits 33800501; RW-excl spins 5756394, OS waits 5297208
Everyone is waiting on that latch, and they are all waiting for thread 1158064464, which has reserved it in exclusive mode. If you hunt through the TRANSACTIONS section, you can see the OS thread IDs, and that one is the debian-sys-maint thread. You also see the other threads:
---TRANSACTION 0 228527423, ACTIVE 0 sec, process no 30034, OS thread id 1159956816
starting index read, thread declared inside InnoDB 500
mysql tables in use 7, locked 0
MySQL thread id 6424, query id 1579718 10.255.106.47 user statistics
And correlating the thread ID back to the semaphores, you see thread 1159956816 is waiting for the semaphore.
Notice that this is effectively a global lock. The debian-sys-maint thread is not touching the same tables as the other queries; it’s just touching the same internal structures (here, the RW-latch created in btr0sea.c — InnoDB’s adaptive hash index latch). So a user working on table A can interfere with a user who wants access to table B.
The real solution is to disable this startup process. It’s not even needed for InnoDB. Sooner or later you’ll find yourself fighting with it. You can just put “exit 0;” at the top.
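A sketch of that change, done here on a scratch copy of the file (on a real server you’d edit /etc/mysql/debian-start in place; this assumes GNU sed):

```shell
# Stand-in for the real script, so this is safe to run anywhere:
printf '#!/bin/bash\necho pretending to run mysqlcheck\n' > ./debian-start
# Insert "exit 0" right after the shebang; nothing below it ever runs.
sed -i '1a exit 0' ./debian-start
bash ./debian-start   # now a harmless no-op, prints nothing
```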
The solution I chose in this case?
mysql> KILL 7;
Immediately afterward everything cleared up.
Very interesting. I’ve noticed how long this process takes, but usually just accepted it as a fact of life for data consistency, similar to fsck. I have killed it off in the past though when I really needed the server to start running immediately.
It may be worth forwarding this along to the Debian MySQL package’s bug page, as I’m sure they’d take action with the documentation provided here, and this is certainly something squarely in their hands as opposed to upstream: http://bugs.debian.org/cgi-bin/pkgreport.cgi?pkg=mysql-server;dist=unstable
Nice find though!
Note that this shouldn’t be a problem with MySQL from Debian lenny, as it ships with init scripts that will not run CHECK TABLE on non-MyISAM tables.
Actually we only comment the call of “check_for_crashed_tables;” in the Debian script.
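That change can be sketched with one sed command (shown on a stand-in file here; the real target is /etc/mysql/debian-start):

```shell
# Stand-in file containing the offending call:
printf '%s\n' '#!/bin/bash' '  check_for_crashed_tables;' > ./debian-start
# Comment out just that call, leaving the rest of the script intact:
sed -i 's/^\([[:space:]]*check_for_crashed_tables;\)/#\1/' ./debian-start
```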
I have also used the “KILL 7” approach in the past and found that another thread is often opened up immediately with a “CHECK TABLE” on the next table in the database. I needed to keep killing threads until it gets to the end. [Ubuntu 8.04 Hardy]
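If you do go the KILL route, a loop along these lines saves re-typing KILL by hand (a hypothetical helper, not from the post; it assumes MySQL 5.1+, where INFORMATION_SCHEMA.PROCESSLIST exists):

```shell
# Hypothetical helper: keep killing the maintenance thread's CHECK TABLE
# until mysqlcheck gives up and moves on or exits.
kill_check_threads() {
    local id
    while id=$(mysql -N -B -e "SELECT id FROM information_schema.processlist WHERE user='debian-sys-maint' AND info LIKE 'CHECK TABLE%' LIMIT 1") \
          && [ -n "$id" ]; do
        mysql -e "KILL $id"
        sleep 1
    done
}
```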
My first action after installing mysql on Debian is to comment the check_for_crashed call (actually not really me, but Puppet is doing it for me).
Yep, the stock MySQL Debian init script sucks so badly that it’s almost mandatory for every sysadmin out there with a real MySQL installation to fine-tune it. It’s almost as bad that the maintainer scripts in the .deb packages stop/start MySQL without asking. There are situations in which you want to update the binary quickly but you want a controlled restart.
Vide, just drop a mail to the Debian BTS with your suggestions and patches. It’s too late to change things for lenny, but we can talk about your suggestions and patches for squeeze.
Btw, I’m using the stock init scripts in “real mysql installations” (50+ servers), and they are working just fine.
Changing the script isn’t enough to avoid the problems with the data dictionary, BTW. Just touching the table causes InnoDB to allocate space for it, which it will never release again until it’s restarted. And you have to touch the table to find out its engine, so you can’t know a priori which tables not to touch. Any scripts that query the INFORMATION_SCHEMA upon startup are also a very bad idea on installations with lots of tables.
I still advise to disable the script, period, unless you know you want it enabled.
Norbert, I don’t think it’s the init script Vide meant; it’s the install/deinstall. I agree — it shouldn’t try to start/stop automatically. This is not a mistake unique to Debian 🙂
Baron,
In reality it is not clear to me why it’s done this way at all.
If you want tables to be checked and repaired on startup, myisam_recover is the decent option. If this causes problems because MySQL starts to check and repair a lot of tables at the same time, blocking all connections, maybe MyISAM is simply not the right choice for you.
Running the check in the background is the broken middle ground – you still have some queries run against corrupted MyISAM tables, which can cause wrong query results or further corruption, and you can still get MySQL stalled or running out of connections when a lot of connections block waiting for some large, commonly accessed table which is being checked.
I agree 100%. One thing I didn’t say — there might be a reason for a custom startup script — but it’s not this. Just don’t put mysql_upgrade in it (that only needs to be run *one time*!), and don’t put mysqlcheck in it either. But you might put in a script to warm up your indexes or something like that. Something that’s too complex to put in the --init-file option.
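A warm-up along those lines might look like this (just a sketch; app_db, orders, and customer_id are made-up names, and it assumes a secondary index exists on customer_id):

```shell
# Hypothetical warm-up (placeholder names): read the hot indexes once so
# their pages land in the buffer pool, instead of CHECK TABLE-ing everything.
warm_up() {
    mysql app_db -e "SELECT COUNT(*) FROM orders"            # drags in the clustered index
    mysql app_db -e "SELECT COUNT(customer_id) FROM orders"  # drags in a secondary index
}
```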
Yet another brain damaged Debian “improvement”.
Memo to debian package maintainers: You are *package maintainers*. If you want to change the way MySQL *behaves*, SUBMIT A PATCH LIKE EVERYONE ELSE, IF FOR NO OTHER REASON THAN TO HAVE IT REFLECTED IN THE OFFICIAL DOCUMENTATION EVERYONE (except a few of us crusty old farts who don’t trust anyone or anything we don’t build ourselves) RELIES UPON.
Seriously…
g, you don’t need to cry, as ch wrote above the behaviour was already changed about a year ago.