As we can see, MySQL 5.5.8 comes with great improvements and scalability fixes. Adding up all the new features, you have a great release. However, there is one area I want to touch on in this post. At Percona, we consider it important not only to have the best peak performance, but also stable and predictable performance. I refer you to Peter’s post, Performance Optimization and Six Sigma.

In Percona Server (and actually even before that, in the percona-patches builds for 5.0), we added adaptive checkpoint algorithms, and later the InnoDB plugin included an implementation of “adaptive flushing”. This post shows how these algorithms compare to stock MySQL.

The post also answers the question of whether we are going to have releases of Percona Server/XtraDB based on the MySQL 5.5 code line. The answer: Yes, we are. My benchmarks here are based on Percona Server 5.5.7. (You can get the source code from lp:~percona-dev/percona-server/5.5.7, but it is very beta quality at the moment.)


For this post, I ran tpcc-mysql benchmarks on our Dell PowerEdge R900 box, using both a RAID10 array over 8 disks and a FusionIO 320GB MLC card.

First, the results for tpcc-mysql, 500w (around 50GB of data) on RAID10. I used innodb_buffer_pool_size=24G, innodb_log_file_size=2000M (innodb_log_files_in_group=2), and innodb_flush_log_at_trx_commit=2. Also, innodb_adaptive_flushing (ON) and innodb_adaptive_checkpoint (estimate) were left at their default values.
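For clarity, here is what those settings look like as a my.cnf fragment (only the options mentioned above are shown):

[mysqld]
innodb_buffer_pool_size        = 24G
innodb_log_file_size           = 2000M
innodb_log_files_in_group      = 2
innodb_flush_log_at_trx_commit = 2
# left at defaults:
# innodb_adaptive_flushing   = ON        (MySQL 5.5.8)
# innodb_adaptive_checkpoint = estimate  (Percona Server)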

The raw results, full config files, and scripts are in our Benchmarks Wiki.

The graphical result below shows the throughput on the server over 8 hours. (Yes, 8 hours, to show MySQL performance over a long time period. It is not a short, 5-minute exercise.)

Although it takes a decent amount of time for the Percona Server results to stabilize, MySQL 5.5.8 shows regular dips (3 times per hour) from 24900 NOTPM down to 17700 NOTPM (drops of around 30%).

Next is the second run, on the FusionIO card. Here I should say that we were not able to get stable results with the existing adaptive_checkpoint or adaptive_flushing algorithms. So, Yasufumi invested a lot of research time and came up with the new innodb_adaptive_checkpoint=”keep_average” method. This method requires setting innodb_flush_neighbor_pages=0 to disable flushing of neighbor pages (an option not available in MySQL 5.5.8). The problem with flushing neighbor pages is that it makes an exact calculation of how many pages were flushed impossible. The neighbor-flushing feature was created as an optimization for hard drives: InnoDB tries to combine as many pages as possible into a single sequential write, which means a single I/O may have a size of 32K, 64K, 96K, …, etc. That, again, makes it impossible to predict how many I/O operations a flush will need. Furthermore, this optimization is not needed for flash devices, such as FusionIO or Virident cards.

An additional optimization we have for SSDs is big log files. For this run, I used innodb_log_file_size=4G (innodb_log_files_in_group=2) for Percona Server, which gives 8GB of log files in total (MySQL 5.5.8 has a 4GB limit). In addition to increasing the log size, we added the option innodb_log_block_size, which allows changing the I/O block size for log files. The default is 512 bytes; in the test with FusionIO I used 4096 bytes, to align I/O with the FusionIO card's internal block size.

You can see that MySQL 5.5.8 has periodic drops here, too. The margin between Percona Server and MySQL is about 2500-2800 NOTPM (~15% difference).

MySQL 5.5.8 now supports several buffer pool instances, a feature that is supposed to fix the buffer pool scalability issue. Let’s see how MySQL performance changes for the last workload if we set innodb_buffer_pool_instances=8 or 16.
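For reference, enabling this in MySQL 5.5.8 only requires one extra line in my.cnf (shown here with 8 instances; the run with 16 was configured the same way):

innodb_buffer_pool_size      = 24G
innodb_buffer_pool_instances = 8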

As you see, having several buffer pools makes the dips deeper and longer. It seems that for Percona Server the best choice is innodb_buffer_pool_instances=1, as we implemented buffer pool scalability in a different way.

UPDATE
By request from a commenter, I have also added results with different innodb_io_capacity values for MySQL 5.5.8: 500 (which I used in the benchmarks above), 4000, and 20000.

As you can see, there is no improvement from a bigger innodb_io_capacity, which also concurs with my previous experience: with a bigger io_capacity you tend to get worse results.

For reference, the full config file used for the FusionIO benchmarks is in the Benchmarks Wiki along with the raw results; the key settings are summarized below.
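This is a sketch assembled only from the options discussed in this post, not the verbatim file:

[mysqld]
innodb_buffer_pool_size        = 24G
innodb_buffer_pool_instances   = 1            # default; see the note on buffer pool instances above
innodb_log_file_size           = 4G
innodb_log_files_in_group      = 2
innodb_flush_log_at_trx_commit = 2
innodb_io_capacity             = 500
# Percona Server / XtraDB options for flash storage (not available in MySQL 5.5.8):
innodb_adaptive_checkpoint     = keep_average
innodb_flush_neighbor_pages    = 0
innodb_log_block_size          = 4096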

(post edited by Fred Linhoss)

25 Comments
Andy

A few questions:

1) Comparing the 1st graph with the 2nd graph, you got around 25000 NOTPM on hard drives but less than 20000 NOTPM on FusionIO. I’d expect the results for FusionIO to be much higher than those for HDD, but that’s not the case here. Why is that?

2) The settings you used for FusionIO, such as innodb_flush_neighbor_pages=0: would they also work for a SATA SSD such as the Intel X25?

3) You used innodb_io_capacity=500 for FusionIO. That implies an IO capacity of 500 IOPS, right? But FusionIO can do much higher than that. Why didn’t you use a higher value of innodb_io_capacity?

Dimitri

Hi Vadim,

A few comments so far:

* any reason why innodb_io_capacity=500 ?.. – if I remember well, within XtraDB you’re not limiting flushing by I/O capacity, while 5.5.8 will respect the I/O capacity limit and not flush more pages/sec than this limit.. (we discussed this during your previous tests on 5.5.4)

So, may you replay your tests with, say, innodb_io_capacity=20000 ? (I have no idea how intensive your workload on flushing ;-))

* Regarding tests with Buffer Pool instances: did you monitor which mutexes became more hot??.. – I’m very curious because in my tests I’ve obtained completely opposite results.. – 16 instances are giving more than 50% better performance..

* The REDO log block size is a good point – 4K is the optimal I/O size for SSD! – however, I doubt it makes a huge difference while innodb_flush_log_at_trx_commit=2 is used.. So, to prove it you should present XtraDB results comparing innodb_log_block_size=4096 and 512, may you present them here?..

Very interesting results anyway – they raise many questions of all kinds 😉

Rgds,
-Dimitri

inaam

Vadim,

Thanks for the detailed post. Couple of comments/observations:
* I agree that adaptive_flushing needs more work. One of the things that we did in 5.5 is to make it more accurate. As you know, the master thread does the adaptive_flushing check every second in a loop. In the InnoDB plugin for 5.1, if the master thread ends up doing some flushing work in one iteration, it won’t sleep and will do the next iteration right away. That means it can do multiple batches in one second. In 5.5 we tidied it up a little and forced the master thread to sleep for ‘1 – time spent in flushing’ seconds. What that means is that adaptive_flushing will do a maximum of one innodb_io_capacity batch in one second. Seeing the pattern above and the capabilities of the system you are benchmarking on, I’d suggest setting innodb_io_capacity to something more realistic than 500. I’d say something like 4000 won’t be a bad idea. adaptive_flushing is heavily dependent on innodb_io_capacity, i.e., the batch size is capped by this value.
* Assuming your workload is much like dbt2, I don’t think purge should be a concern. But to keep the master thread from getting stuck in purge, maybe it is a good idea to use innodb_purge_threads = 1.
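For illustration, the two suggestions above map to a couple of my.cnf lines like these (4000 is only the suggested starting point; see the UPDATE in the post for results with 4000 and 20000):

innodb_io_capacity   = 4000
innodb_purge_threads = 1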

Andy

You set the log block size to 4K to align with FusionIO size. What about non-log IO such as IO for table and index data? Is it necessary to set block size for them too?

Andy

Vadim,

I thought all the doublewrite data is stored in ibdata1. Are you saying you can set it to be stored somewhere else? What is the name of the doublewrite file?

So tables and indexes should be stored in SSD, ibdata1 in RAID 10, and doublewrite file somewhere else other than ibdata1’s RAID 10? Can you clarify?

How should I use innodb_doublewrite_file=? Can you give an example?

Thanks.

Mark Callaghan

Vadim — is this a problem in Percona? That is, is it possible for threads handling user connections to request page writes when the async checkpoint limit has been reached? And if it does submit write requests, does it block for them to finish? That is the case for official InnoDB which means that foreground threads block on IO to be done by background threads and can add lots of pauses.

http://bugs.mysql.com/bug.php?id=55004

Dimitri

Vadim,

Thanks for update!

TPS drops are looking smaller with innodb_io_capacity=20000, but peak performance is lower, due to higher writes I suppose.. – and what will the picture be over 8 hours? (I was mainly curious about your 500w test, where during the first 4 hours you have TPS drops on XtraDB too..)

BTW, how many concurrent user sessions were used during your test?
and how many CPU/cores do you have on your server?

Where may I get the percona-5.5.7 “tarball” to test it on my server?.. (I saw you’ve mentioned an LP trunk and the code is “very beta”, so I suppose it’s changing most of the time; if you can provide the source or binary tarball used in your tests it’ll simplify many things)

Thank you!

Rgds,
-Dimitri

Hrvoje

I would be very interested to see it compared with the 5.1 version of Percona Server.

Vadim

Dimitri,

It is a 16-core system and I used 16 users.
I may run 16-hour tests, but it is all time consuming 🙂

I propose you wait until our patches are available for 5.5.8; we will provide a .tar.gz for that version.

Peter Zaitsev

Vadim,

Thanks for publishing the results. Do you think it is possible to do something so Percona Server stabilizes quickly? Does our Buffer Pool preload help here? I mean, if I were to use this option, would I end up in an almost stabilized state straight away, or is it different self-balancing stuff that is going on?

Re innodb_io_capacity: I think this option is “evil” to begin with. It would be much better to have the database balance itself to use the capacity of the storage it has available (probably self-tuning based on the IO performance it is getting), as there are many ways the IO capacity of the storage can vary a lot at runtime. Think shared SAN storage, LVM snapshots, or cloud, for example.

Liang Xie

Hi Vadim,

Thanks for sharing this amazing testing. Just one minor question: where can I get the above “tpcc-mysql” tool?

Best Regards,
-Liang

Seun Osewa

Any particular reason you used different data for the HDD and FusionIO tests? It’s frustrating because you robbed us of an opportunity to compare the performance of the two types of storage.

Dimitri

Vadim,

it’s a pity you’re presenting results for code I’m not able to test :-))
I’ve had a look at LP – there are only patches, without any info on what to do and how to apply them..
so, yes, I can only wait for your 5.5.8 tar.gz..

Regarding tpcc-mysql: there is no README on how to run it either :-))
I suppose that I have to compile tpcc_load and tpcc_start, then load the data, and then run a test…
May you please just give me 3 lines:
– how do I create the db schema
– how do I load the data (exact params)
– how do I run the test (exact params)

(when I think that you told me that dbSTRESS is hard to set up… ;-))

And the last one: may you finally present the results comparing 512-byte and 4K redo log block sizes?.. – you just said it improves performance on SSD, but without giving any numbers..

Thank you!

Rgds,
-Dimitri

Dhiraj

Hi Vadim,
I have a few questions regarding MySQL 5.5.8:
1.) Previous versions of MySQL had a size restriction of 4GB per table. Does MySQL 5.5.8 have any such restriction?

2.) How many schemas can we create in MySQL 5.5.8?

3.) Is there any limit on the number of columns a table can contain in MySQL 5.5.8?

Regards,
Dhiraj

Jason

When will Percona Server 5.1 be stable and available?