Do you know that there are two limits about dirty (modified but not flushed to disk) blocks of InnoDB buffer pool? One is the limit of “amount”. The other is the limit of “age”.

— limit of “amount” —

As you know, buffer pool of InnoDB works as write-back cache of its datafiles. If the buffer pool is filled by dirty blocks, InnoDB cannot allocate new blocks without flushing the dirty blocks and the performance would get worse. This is the limit of dirty block “amount”. We can avoid this limit by setting ‘innodb_max_dirty_pages_pct’ smaller or setting the larger buffer pool size. We might be never at a loss about the limit.

The another limit we should understand is limit of dirty block “age”.

— limit of “age” —

As you know again, because InnoDB write the modifies of datafile to transaction log file synchronously, InnoDB is allowed to treat its buffer pool as write-back cache. The transaction log from the last checkpoint assures the latest-committed consistent data. So, the transaction log must contain the all transaction log from the last checkpoint.

Basically, InnoDB makes the checkpoint passively (it is called “fuzzy checkpoint”). InnoDB makes the point of time when the oldest dirty block occurred as the new checkpoint.

In short,
the dirty block older than the oldest transaction log is never allowed to exist.

This is the limit of dirty block “age”.

The max age is determined by the total size of the transaction log files, because InnoDB uses the transaction log circulately. But, the larger transaction log file may not solve the problem of dirty block “age”…

The main essence of the “age” limit problem might be clustered distribution of “age”. If there are huge number of similar aged dirty blocks and their age nears the max age, InnoDB flushes the dirty blocks with its best, but the oldest “age” of dirty blocks might not change and might reach the critical limit. Then InnoDB will pause until all of the old dirty blocks flushed…

I will show you the examples.
The workload is TPC-C like and the graphs show the change of throughput, write IO and checkpoint age with time.

<Normal>

The first is the case of normal (not patched) InnoDB.

Normal case

The intense write IO occur and InnoDB stalls when the checkpoint age reaches the red broken line. After that, InnoDB stalls periodically.

The steady flushing may be short, and there may be clustered distributed “age”. The clustered “age” may causes the clustered flushing and the next clustered “age”…

<innodb_max_dirty_pages_pct(native parameter) & innodb_io_capacity(patched parameter)>

The next graph is the case when I tried to restrain the checkpoint age growing by controlling steady flushing with only innodb_max_dirty_pages_pct and innodb_io_capacity.

Tuned by dirty_pages_pct

There are no sudden stalls, but many write IO and regression of the average throughput. The optimum setting may be very difficult by this method, because innodb_max_dirty_pages_pct and innodb_io_capacity are independent to the checkpoint age.

<innodb_adaptive_checkpoint(new patched parameter) & innodb_io_capacity>

The last is the case of using innodb_adaptive_checkpoint instead of innodb_max_dirty_pages_pct.

Tuned by adaptive_checkpoint

During the checkpoint age excesses the light-blue broken line, the steady flushing occurs. If the checkpoint age reaches the next orange line, the flushing becomes more strong. This is the effect of innodb_adaptive_checkpoint. And its strength can be controlled by innodb_io_capacity. (The optimum setting may not be difficult as the 2nd case.)

At first, the checkpoint age touch the red line once and the regression occurs. However after that, the periodical regression waves seem to be attenuated quickly. The clustered “age” may be flatted. There are no the regression of the average throughput.

innodb_adaptive_checkpoint will solve or soften such problems of dirty block “age” limit
with easier setting.

In addition, innodb_io_patches.patch also adds
max checkpoint age“, “modified age” and “checkpoint age
to SHOW INNODB STATUS and we can check them easily.

9 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments
Peter Zaitsev

Great result Yasufumi !

I guess even if average TPM is not that much different, affect 95 percentile response time is significant… and in practice the periodic checkpointing stalls are surely not acceptable for real applications.

There is still probably a bit potential for improvement for system to self adapt and try to even out writes so we do not get into 20% TPM dips/write spikes.

John Masters

How do you collect the information for the graphs above from mysql? Also, is there a tool to plot the data to visualize the performance bottlenecks like you show above? Appreciate your help in this regards as we are running into strange issues with Mysql where in the same query with (sql_no_cache) gets the results in 50seconds one time and it takes more than 600secs the other. I see no other processes running using “show processlist”. So I am assuming something to do with mysql system level bottlenecks and would like some tools to help diagnose

Thanks

http://mysqlha.blogspot.com/

This is great. Can you provide more details on what ‘adaptive checkpointing’ does or reference a description elsewhere?

Davy Campano

Yasufumi,

Is innodb_adaptive_checkpoint a setting that can be changed online, or does it require restart? Thanks

Davy

http://mysqlha.blogspot.com/

Yes, I would like to know what makes it adaptive? The original code flushes at most a fixed number of blocks per time interval as requested by a background IO thread. Have you added something to make the number of blocks to flush dynamic?

Mike

What are the units of the values reported for “max checkpoint age”, “modified age” and “checkpoint age” ?