Sometimes a MySQL server running InnoDB takes a long time to shut down. The usual culprit is flushing dirty pages from the buffer pool: pages that have been modified in memory but not yet written to disk.

If you kill the server before it finishes this process, it will just go through the recovery phase on startup, which can be even slower in stock InnoDB than the shutdown process, for a variety of reasons.

One way to decrease the shutdown time is to pre-flush the dirty pages, like this:
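
SET GLOBAL innodb_max_dirty_pages_pct = 0;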

Now run the following command:
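
mysqladmin ext -i10 | grep dirty

This prints the Innodb_buffer_pool_pages_dirty counter every ten seconds.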

And wait until it approaches zero. (If the server is being actively used, it won’t get to zero.)

Once it’s pretty low, you can perform the shutdown and there’ll be a lot less unfinished work to do, so the server should shut down more quickly.

22 Comments
ihanick

Thanks, amazing advice!

Mark Callaghan

Baron – you can update this article to include a link to a Percona binary that makes this faster.

It is one thing to ask InnoDB to lower the percentage of dirty buffers; it is another thing for InnoDB to actually do that on a busy server.

If writes are slow on your server, then you really need the Percona patches or the features from the v3 Google patch to make the background writes use the available I/O capacity of the server. The relevant changes include innodb_io_capacity, more background I/O threads, and a few others.

Writes to files in the OS buffer cache are fast. Writes may be slow when:
* using innodb_flush_method=O_DIRECT
* disabling SATA writeback cache
* not using a HW RAID disk cache
* using NFS or some other remote storage service that doesn’t buffer writes in the OS buffer cache
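
For illustration, a my.cnf fragment along these lines is the sort of thing those patches enable (a sketch only; innodb_io_capacity is the option named above, while the thread settings assume an InnoDB-plugin or Percona build, so check the exact names against your build):

[mysqld]
# Allow background flushing to use more of the disk's IOPS
innodb_io_capacity = 1000
# More background I/O threads for reads and writes
innodb_read_io_threads = 8
innodb_write_io_threads = 8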

Kevin Burton

We made this a standard part of /etc/init.d/mysql stop.

It just sets this and waits for MySQL to finish writing and then allows the stop to happen.

The main problem is that the MySQL control port is closed, so there's no way to figure out what MySQL is doing while you're waiting…
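
In rough outline the stop hook does something like this (a sketch only; the polling interval and the "low enough" threshold are placeholder values, and it assumes the mysql client can log in without a password prompt):

#!/bin/sh
# Ask InnoDB to start flushing dirty pages aggressively
mysql -e "SET GLOBAL innodb_max_dirty_pages_pct = 0;"
# Poll the dirty-page counter until it is low enough
while true; do
    dirty=$(mysqladmin ext | grep Innodb_buffer_pool_pages_dirty | awk '{print $4}')
    echo "Dirty pages remaining: $dirty"
    [ "$dirty" -lt 100 ] && break
    sleep 10
done
# ...then let the normal mysqld shutdown proceed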

Simon Mudd

You mention InnoDB. There's also a very nasty option that can affect MyISAM users, especially if you use MyISAM for data warehouse work (smaller memory and disk footprint than InnoDB).

This is something taken from our wiki, and discovered after discussing in detail with the MySQL support team.

delay_key_write=ON http://dev.mysql.com/doc/refman/5.0/en/server-options.html#option_mysqld_delay-key-write

This global configuration variable can be set (delay_key_write=ALL) to stop writes from the MySQL key buffer going to disk. We've only used it on slave servers, not the masters.

This option sounds great, but DO NOT USE IT. It has the very nasty effect of leaving the key data for MyISAM tables in memory, never written to disk. When you want to shut down the server, it is forced to do a FLUSH TABLES, and this can take a long time. On the XXX servers we've seen this take between 20 and 50 minutes.

NOTE: These are boxes running the MyISAM engine exclusively, with 32GB of RAM (key_buffer 12GB).

So it's something to remember if you have a server set up this way.
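
If you're not sure whether a server is configured like this, it's cheap to check before planning a shutdown:

show global variables like 'delay_key_write';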

pat

Is this really shutting down my server any faster in wall clock time?

Without setting the max_dirty_pages flag to zero, say it takes me 15 minutes to shut down.

If I do set it, doesn’t it just take me 14 minutes for dirty pages to drop to zero and 1 minute for the shutdown?

Seems like it might make the shutdown *command* take less time, but the actual elapsed time from the moment you decide to shut down the server to the moment mysqld terminates isn't decreasing, is it?

Kevin Burton

pat,

If you're new to MySQL and your DB is taking 12 minutes to shut down, and the port is closed, one might assume that the DB is broken and kill -9 it.

Our script actually prints that it's flushing the buffer pool, along with percentage-complete numbers.

Coolyou

Kevin, we were about to write one of these scripts ourselves until we realized that we may have an issue with figuring out how far we are with the flushing. Could you please provide us all with the script? It may prove very useful 🙂

Pat, the whole idea of this is to keep the application online for more of that time, rather than having MySQL refuse all connections for the full 15 minutes.

However, I have noticed that MySQL performance decreases dramatically while innodb_max_dirty_pages_pct = 0 is in effect. That's not unexpected, considering that dirty pages are immediately flushed to disk. But those with a busy site should weigh this limitation: the option may even increase the time required to shut down, because of the increased load from the application itself and the longer time it takes for the application to respond.
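
If the shutdown ends up being postponed, it helps to note the current value first so it can be put back (the 90 below is only a placeholder; use whatever value your server was running with):

show global variables like 'innodb_max_dirty_pages_pct';
-- ...later, once the shutdown is off the table, restore it
set global innodb_max_dirty_pages_pct = 90;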

Kevin Burton

Hey Baron….

Here’s the script we’re using:

http://pastebin.com/f5ec2f70d

If we were to write it again, it would almost certainly be written in Python.

Kevin

Kevin Burton

Another note… we have three replicas of everything.

When we restart an InnoDB box it is taken out of production and does not serve requests.

Then we flush the innodb buffer pools, then restart it.

The script I linked to above is just added as a one-liner to our /etc/init.d/mysql script… if InnoDB isn't running on the box, it isn't run.

About half of our cluster is on MyISAM without InnoDB enabled, so this would break otherwise.

Wagner Bianchi

This is a very good idea.
So nice.

Kayra Otaner

Even after all the updating threads shut down, you'll see the Innodb_buffer_pool_pages_dirty count approaching 0 but sometimes still increasing. This is because of InnoDB insert buffer merges, which you can observe using 'show engine innodb status\G'. Only when all the merges are complete will the pages_dirty count reach 0 and stay at zero. To calculate how many merges still need to be done:
Under the "INSERT BUFFER AND ADAPTIVE HASH INDEX" section, the difference between 'inserts' and 'merged recs' shows how many are left. Example:

————————————-
INSERT BUFFER AND ADAPTIVE HASH INDEX
————————————-
Ibuf: size 4185, free list len 3748, seg size 7934,
65171060 inserts, 65171060 merged recs, 10552858 merges
Hash table size 7055701, used cells 32410, node heap has 33 buffer(s)

65171060 – 65171060 = 0

This means that Innodb_buffer_pool_pages_dirty should be 0 and stay at 0. Please note this calculation works most of the time, but sometimes the 'inserts' - 'merged recs' value goes negative.
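
A quick way to watch just that line without paging through the whole status output (assuming shell access to the mysql client):

mysql -e "show engine innodb status\G" | grep "merged recs"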

kerja keras

Great tips, very helpful. Thanks for sharing.

Coway Zhou

I tried it once, but it never worked. I lost patience and did a kill -9. Of course, when the db was started, it spent a long time in recovery. At least it showed some progress in recovery.

Wagner Bianchi

Take care when issuing a KILL -9 command against mysqld. If MySQL is handling writes on any MyISAM tables, you'll end up with tables that need to be repaired.


Andy Blyler

You can also execute the following SQL query to get the current count of the dirty pool pages, instead of using mysqladmin:

show global status like 'Innodb_buffer_pool_pages_dirty';

sysadmin

Unfortunately, setting innodb_max_dirty_pages_pct = 0 had no effect in my case:

[root@db12 ~]# mysqladmin ext -i10 | grep dirty
| Innodb_buffer_pool_pages_dirty | 2958 |
| Innodb_buffer_pool_pages_dirty | 2979 |
| Innodb_buffer_pool_pages_dirty | 2970 |
# at this moment dirty_pages_pct was set to zero
| Innodb_buffer_pool_pages_dirty | 3034 |
| Innodb_buffer_pool_pages_dirty | 3042 |
| Innodb_buffer_pool_pages_dirty | 3071 |
| Innodb_buffer_pool_pages_dirty | 3165 |
| Innodb_buffer_pool_pages_dirty | 3161 |
| Innodb_buffer_pool_pages_dirty | 3144 |
| Innodb_buffer_pool_pages_dirty | 3014 |
| Innodb_buffer_pool_pages_dirty | 3025 |
| Innodb_buffer_pool_pages_dirty | 3046 |
| Innodb_buffer_pool_pages_dirty | 2973 |

Strangely, Innodb_buffer_pool_pages_dirty even increased instead of dropping to zero. Are those values (several thousand) too small for this technique to work? innodb_buffer_pool_size is 25G, the server has 32G of RAM, and the database is near 20G (SQL dump). Even a few hours after setting dirty_pages_pct to zero, the Innodb_buffer_pool_pages_dirty numbers are still the same. And MySQL is also using several gigabytes of swap…

Dean Herbert

This seems to have no effect on Percona 5.6.20. Hovering around 105k dirty pages on a mostly idle server.

Fedge

sysadmin, when the server's swapping, performance is generally so bad that only rebooting can fix it (by temporarily working around the swapping until all your memory fills up again). You may be surprised at the performance you can get by reducing the buffer pool size enough to keep the server from swapping. Even if you think the buffer pool will be too small, in my experience nothing helps once the server starts swapping.
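
For example, on the 32G box described above with a 25G pool, something in this direction may behave better (the exact figure is a guess and depends on what else runs on the machine):

# my.cnf: leave headroom for the OS, connections, and per-session buffers
innodb_buffer_pool_size = 20G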