We recently released XtraDB-9, and while we did not highlight it in the announcement, the standout feature of this release is the ability to save and restore the InnoDB buffer pool.
The idea is not new; it was originally developed by Jeremy Cole some time ago (sorry, I do not have the link at hand), and now we have implemented it in XtraDB.

Why would we need to save and restore the contents of the buffer pool?
There are several reasons.
First, it’s not rare for modern servers to have 32GB+ of RAM, with an InnoDB buffer_pool of 26GB or more allocated. When you restart such a server, it can take a long time to populate the cache with useful data before you can bring it back to serve production load. It’s not rare to see a maintenance cycle take two or more hours, mainly because the slave needs to catch up with the master and to warm the cache.
In the case of a server crash it is even worse: you need to wait a possibly long time for InnoDB recovery (we have a patch for that too; in that post you can see InnoDB recovery took 1h to complete) and only after that can you warm the caches.

Second, it is useful for some HA setups, like DRBD, where, in the case of a failover, you need to start the passive instance cold.

So let’s see what results we have.
Details about the patch are available at https://www.percona.com/docs/wiki/percona-xtradb:patch:innodb_lru_dump_restore (Yasufumi calls it LRU dump/restore, because he thinks of the buffer pool as an LRU list, which is how it is organized internally).

To save the buffer pool you execute a dump command, and to restore it a restore command; the dump is written to, and read back from, the file ib_lru_dump in your database directory (a sketch of the commands follows).
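Roughly, the statements look like the following (a sketch only: the patch exposes them as INFORMATION_SCHEMA pseudo-tables; XTRA_LRU_RESTORE is the name that comes up in the comments below, while the dump counterpart XTRA_LRU_DUMP is assumed here, so check the wiki page above for the exact syntax):

-- Save the list of pages currently in the buffer pool to ib_lru_dump
-- in the datadir (table name assumed; see the wiki page for the exact syntax).
SELECT * FROM INFORMATION_SCHEMA.XTRA_LRU_DUMP;

-- Read ib_lru_dump back and fetch the listed pages into the buffer pool.
SELECT * FROM INFORMATION_SCHEMA.XTRA_LRU_RESTORE;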

You may want to sort ib_lru_dump in the order of pages within tablespaces, so that the RESTORE is performed in the most sequential way possible. A small python script to sort ib_lru_dump is available in our Launchpad branch.
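For illustration, here is a minimal sketch of such a sorter. It assumes each record in ib_lru_dump is a pair of big-endian 32-bit integers (space id followed by page number); verify that against the patch source before relying on it:

#!/usr/bin/env python
# Sketch of a sorter for ib_lru_dump.
# Assumption: each record is 8 bytes, a big-endian 32-bit space id
# followed by a big-endian 32-bit page number; check the patch source
# for the real on-disk layout.
import struct
import sys

RECORD = struct.Struct('>II')  # (space_id, page_no)

def sort_dump(src, dst):
    with open(src, 'rb') as f:
        data = f.read()
    # Split into fixed-size records, ignoring a possible trailing partial one.
    n = len(data) // RECORD.size
    records = [RECORD.unpack_from(data, i * RECORD.size) for i in range(n)]
    # Sort by (space_id, page_no) so the restore reads each tablespace
    # as sequentially as possible.
    records.sort()
    with open(dst, 'wb') as f:
        for rec in records:
            f.write(RECORD.pack(*rec))

if __name__ == '__main__':
    sort_dump(sys.argv[1], sys.argv[2])

Run it as "python sort_lru_dump.py ib_lru_dump ib_lru_dump.sorted" (the script name is just an example) and put the sorted file in place of the original before issuing the restore.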

I made a small tpcc benchmark to show the effect of a restored buffer_pool (the conditions of the benchmark are the same as in my runs on fast storage, and I used RAID10 to store the InnoDB files).
The first run (xtradb cold) I made just after a restart, and I ran it for 1h.
After that I saved the buffer_pool, restarted mysqld, restored the buffer_pool (it took about 4 min to load 26GB worth of data), and ran tpcc again (xtradb warm).

Here are the graphical results (in New Transactions Per 10 sec; more is better):

[Figure: tpcc_1000w]

As you see, in the cold run it took 1500-1800 sec to enter stable mode, while in the warm run it happened almost from the start. There was a period of unstable results, but it did not affect the ability to serve the load.

You are welcome to test it; it is available in the XtraDB-9 release and also in MariaDB 5.1.41-RC.

20 Comments
Harrison Fisk

How long does it take to warm the cache using conventional means, such as full table scans/full index scans on this hardware? And how does the TPC compare after doing that for 4 minutes?

Any idea what caused the big dip in results at 1800 for the cold mode?

Are there any locks or other side effects when taking the snapshot (besides 26G of disk writes in a non-synchronous manner)?

Peter Zaitsev

Vadim,

This is very cool. Some questions and suggestions, though:

1) I think the Python script to sort the file is ugly. It is much better if it is done on its own
2) It would be great to have the option to store the LRU dump on shutdown and load it on startup (probably in some background thread)
3) Looking at your graph there are a couple of questions – why does the graph with warmup get a peak, then go down 50%, and then pick up again? Shouldn’t warmup provide a uniform speedup over time? Also, why do we have such uneven performance looking at 10-second samples – did you use adaptive checkpointing in this case? Could it be something else which makes things so uneven?

Mark R

I’d like to see this made fully automatic – so it would automatically dump the LRU list periodically after a certain amount of uptime – say 24h and also at server shutdown if it had been running for long enough, and automatically load it on restart.

These could be tunables of course.

This would mean that most users could just forget about it and have cache warmup goodness happen by itself.

Kim

When you dump the buffer pool, what data is dumped? Is it just pointers to what rows etc. need to be loaded, or is it the actual data in the pool that is dumped?

I’m thinking, if you restore an old pool, will you risk getting invalid cached data, or is the buffer pool re-validated against what is stored in the database when it is loaded?

Baron Schwartz

Vadim, how does this interact with recovery? Does it work OK if you save the buffer pool contents, then crash the server, restart it, and restore the buffer pool? If not, then that might be a problem for the DRBD use case.

I think it should work fine, but maybe I’m wrong.

Vadim

Baron,

There is no reason why it would not work.

As I said, we store just pointers to pages (space_id, page_id),
and at the restore stage we just read the pages back by those pointers.
It does not matter if InnoDB crashed before, went through the recovery procedure, etc.

Tobias Petry

Vadim, this means we can make a buffer pool dump, run the server for 2 hours, crash it, and after restarting (and InnoDB’s recovery) and loading the buffer pool back into RAM we have a warm InnoDB instance with no stale data? And no other problems?

Vadim

Tobias,

Basically yes.
You do need to sort the dump of the buffer_pool, though, so it will be loaded sequentially.
Also, loading it back into RAM may take some time (4 min in my experiment), but it is faster than working with a cold cache.

Patrick Mulvany

Vadim, how is this patch affected by increases or decreases in the InnoDB buffer pool size parameter?
I would assume that changing this would not be a good idea.

Vojtech

Vadim, I just noticed that I cannot kill the XTRA_LRU_RESTORE query. I think it should check for ‘killed’ status at least every few seconds. Am I right?

serbaut

Patch against 5.1.47 for sorting before restore: http://gist.github.com/570107

Vadim

serbaut,

Great, thanks!
Can we use it under the BSD license?

serbaut

Yes, you can.

Will

Typo:

First, it’s not rate on modern servers to have 32GB+ of RAM
———————^ <should be rare, not rate.