In my previous posts I highlighted that one of the improvements in Percona Server is support for innodb_log_file_size > 4G. That test was done using Percona Server 5.5.7, but the same performance is expected for the InnoDB plugin and MySQL 5.5.

A valid question is how long recovery takes in this case, so let's test it. I took the same tpcc-mysql 1000W workload with 52GB and 144GB innodb_buffer_pool_size, with data located on a Virident tachIOn card, and killed mysqld after 30 minutes of work.

The recovery time after restart:

- 52GB innodb_buffer_pool_size: 7 min 51 sec
- 144GB innodb_buffer_pool_size: 9 min 23 sec
- 144GB innodb_buffer_pool_size, data stored on RAID10: 14 min 57 sec
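If you want to reproduce these measurements for your own workload, the error log already contains the needed timestamps: one near the start of crash recovery and one when the server becomes ready. Here is a minimal Python sketch that diffs them; it assumes the old-style "YYMMDD HH:MM:SS" stamps that 5.1/5.5-era servers write, and the marker strings should be verified against your own log.

```python
# Sketch: estimate crash-recovery time by diffing error-log timestamps.
# Assumes the old-style "YYMMDD HH:MM:SS" stamps written by 5.1/5.5;
# the marker strings below appear in real logs, but verify against yours.
import re
from datetime import datetime

TS = re.compile(r"^(\d{6}\s+\d{1,2}:\d{2}:\d{2})")

def parse_ts(line):
    m = TS.match(line)
    return datetime.strptime(m.group(1), "%y%m%d %H:%M:%S") if m else None

def recovery_seconds(log_lines,
                     start_marker="Database was not shutdown normally",
                     end_marker="ready for connections"):
    start = end = last_ts = None
    for line in log_lines:
        ts = parse_ts(line)
        if ts:
            last_ts = ts
        if start is None and start_marker in line:
            start = last_ts
        elif start is not None and end_marker in line:
            end = last_ts
            break
    return (end - start).total_seconds() if start and end else None

# Example with made-up timestamps in the 5.1/5.5 log format
log = [
    "101222 13:40:02  InnoDB: Database was not shutdown normally!",
    "InnoDB: Starting crash recovery.",
    "101222 13:49:25 [Note] mysqld: ready for connections.",
]
print(recovery_seconds(log))  # 563.0 (= 9 min 23 sec)
```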

I think this recovery time is an acceptable trade-off for the performance gain.

However, it should be taken into account if you use an HA solution like DRBD, as the recovery time effectively becomes your failover period: your system will be down for that long.

11 Comments
Rob Wultsch

I think it would be useful if the particular version of MySQL/Percona Server was noted. Something like:
“You can expect similar performance by using MySQL 5.1.46/Percona Server version X.Y.Z and the InnoDB plugin/XtraDB.”

Peter Zaitsev

Rob, Vadim, I would note MySQL does not support larger log files 🙂

15 minutes or so should be good enough for many workloads indeed. One interesting test, though, would be to measure performance and recovery time with different log file sizes.

How much would performance be impacted by using a 4GB log file (for example), and how much longer would recovery take? If performance is 5% lower and recovery is half as long, 4GB may be the better choice for a lot of workloads.

I would also mention one really needs to measure recovery time for their own setup – it is very much workload-dependent.

inaam

Vadim,

When measuring recovery times, it is more appropriate to consider the end of the redo apply phase as the end of recovery. After that the server is online, and the rollback of non-prepared transactions continues in the background. Strictly speaking, downtime (as viewed by a DBA) ends when you see the following messages in the database logs:

InnoDB: Apply batch completed
InnoDB: Last MySQL binlog file position 0 587, file name ./mysql-bin.000018

101222 13:49:25 [Note] mysqld: ready for connections.

Usually the rollback of non-prepared transactions is quick, spanning only a few seconds, but in the case of long-running transactions it can last many minutes.

Baron Schwartz

One of the areas we might look to optimize is rollback performance in general (not just during recovery). It is a worst-case scenario that can cause serious problems: someone decides to kill a long-running transaction after, say, 4 hours, it takes days to roll back, and there is no way to avoid it other than going through some really undesirable procedures with restarts and innodb_force_recovery.

Coway Zhou

Ten minutes for an 8GB log file is very good to me!

The log file size alone should not determine the recovery time. The number of transactions in the log file that need to be rolled forward or back determines it.

I tested InnoDB Hot Backup (innobase.com). It took 40 hours to apply a 13GB accumulated log file, since several hundred million inserts/updates needed to be applied.

Peter, do you know how many transactions were applied from your 8GB log file?

Coway Zhou

One thing I'd like to add: I evaluated the xtrabackup software too. It did recover faster than InnoDB Hot Backup 🙂

Unfortunately, it fails if partitions or any tables are dropped for legitimate purposes during the backup. Another bug: it fails when the file name is too long with partitioning.

Peter Zaitsev

Coway,

Note – redo logs do not really deal with transactions but with changes done to pages, and indeed the number of these changes and their locality will affect recovery time. Regarding stats – you can enable innodb_recovery_stats in Percona Server/XtraDB to see what operations are done during the recovery process.

Also note that in early 5.1 versions and below, InnoDB recovery performance was really bad due to some performance bugs; it has been dramatically improved since then.
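As a sketch of enabling the statistics Peter mentions – this option exists in Percona Server/XtraDB only, not stock MySQL, and the exact name and availability depend on your Percona Server version, so check its documentation:

```ini
# my.cnf fragment — Percona Server / XtraDB only (not stock MySQL).
# When enabled, statistics about the recovery phases (log scan, apply,
# rollback) are printed to the error log after crash recovery completes.
[mysqld]
innodb_recovery_stats = 1
```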

jackbillow

If you have an HA solution with DRBD, a large buffer pool and large log file size are not a good fit: they will increase failover time, and that cost is very high.

Yingkuan

One thing not very clear here is that the test results don't show how much of the 8GB redo log files was filled up in each case.

Depending on the redo generated per second (from SHOW ENGINE INNODB STATUS), the redo log files could be filled to different levels in these cases. It's unlikely that 8GB of log was generated within 30 minutes, so we can ignore log-file wraparound.

So it could be that only 2GB or 4GB of log needed to be recovered from this 8GB container. The higher the buffer pool, the higher the TPS, hence more log generated during the test. Could that be the reason the higher buffer pool test cases took longer to recover? If all tests had exactly the same amount of log generated, it's hard to explain why the higher buffer pool took longer to recover. Otherwise, should we just reduce the buffer pool to 1GB to get superb recovery time?
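That estimate can be made directly from two SHOW ENGINE INNODB STATUS samples: in the InnoDB plugin and 5.5 the "Log sequence number" is a byte offset into the redo stream, so the delta between two samples taken a known interval apart is the redo volume generated. A small sketch with made-up numbers:

```python
# Sketch: estimate redo volume from two "Log sequence number" values
# sampled from SHOW ENGINE INNODB STATUS a known interval apart.
# In the InnoDB plugin / 5.5 the LSN is a single byte offset, so the
# delta between samples is redo bytes written (numbers below are made up).

def redo_rate(lsn_start, lsn_end, interval_sec):
    """Redo bytes generated per second between two LSN samples."""
    return (lsn_end - lsn_start) / interval_sec

def redo_in_window(lsn_start, lsn_end, interval_sec, window_sec):
    """Extrapolate redo bytes generated over a longer window."""
    return redo_rate(lsn_start, lsn_end, interval_sec) * window_sec

# Hypothetical: 60 MiB of redo in a 60-second sample, 30-minute test run
rate = redo_rate(1_000_000_000, 1_062_914_560, 60)
total = redo_in_window(1_000_000_000, 1_062_914_560, 60, 30 * 60)
print(rate / 1024**2, "MiB/s")   # 1.0 MiB/s
print(total / 1024**3, "GiB")    # ~1.76 GiB of the 8GB log actually used
```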