Amazon’s Relational Database Service (RDS) is a cloud-hosted MySQL solution. I’ve had some clients hitting performance limitations on standard EC2 servers with EBS volumes (see SSD versus EBS death match), and one of them wanted to evaluate RDS as a replacement. It is built on the same technologies, but the hardware and networking are supposed to be dedicated to RDS, not shared with the general usage of AWS as you get on normal EC2 servers with EBS.
I benchmarked the largest available RDS instance, which is listed as “High-Memory Quadruple Extra Large DB Instance: 68 GB of memory, 26 ECUs (8 virtual cores with 3.25 ECUs each), 64-bit platform, High I/O Capacity.” I used sysbench’s oltp benchmark, with 400,000,000 rows. This creates a table+index data size approximately twice as big as memory, so the workload should be somewhat IO-bound.
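For reference, the prepare phase amounts to a sysbench invocation along these lines (a sketch in the sysbench 0.4 style; the connection options shown are illustrative, not the actual ones used):

```shell
# "prepare" creates the default sbtest table and fills it with
# --oltp-table-size rows, single-threaded.
sysbench --test=oltp \
         --oltp-table-size=400000000 \
         --mysql-host=myinstance.rds.amazonaws.com \
         --mysql-user=benchuser \
         --mysql-password=... \
         prepare
```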
My goal for this benchmark is long-term performance, but as a side project, I thought it would be interesting to measure the single-threaded insert throughput as sysbench ran the “prepare” phase and filled the table with 400 million rows. Here is the chart of rows inserted per minute:
We can deduce a few things from this.
- The overall downward slope of the line is steady enough to show that we did not cross a dramatic memory-to-disk threshold, as famously happens in B-Tree inserts (see InnoDB vs TokuDB for example). This doesn’t mean that we weren’t IO-bound; it might only mean that we were IO-bound the whole time waiting on fsync operations. But we didn’t go from a solely in-memory bottleneck to solely on-disk.
- The insert performance is quite variable, more so than I would like to see. My intuition is that there are some severe I/O slowdowns.
- I should have gathered more statistics and finer-grained samples, say, every 5 seconds instead of every minute, and captured more data such as SHOW INNODB STATUS output. But I was on the client’s time and I wasn’t going to spend time redoing it; I did not see that it would really benefit them.
- Finally, a single-threaded insert workload is not very revealing. To understand the sustained write performance of an RDS instance, we need a multi-threaded long-term insert benchmark such as IIBench.
In the next post in this series, we will see how the Amazon RDS instance performed at various thread counts on the OLTP benchmark.
Update Vadim and Peter have rightly pointed out that I shouldn’t have published this result without being able to explain exactly what was happening on the server. I will reproduce this test, capture more measurements about what was going on, and post a follow-up before I continue with the actual sysbench benchmark results.
50K rows per second? Isn’t that kind of amazingly awesome? Does sysbench ‘prepare’ bulk insert or something? Even so.
The key figure I’d be interested in is write ops/sec. With innodb_flush_log_at_trx_commit = 1, how many update transactions can you commit per second, and what does iostat’s “w/s” say while you’re doing them? Can you even do those tests using RDS?
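On a self-managed server, both numbers could be sampled roughly like this (a sketch; as noted in the replies, RDS doesn’t give you OS access for the iostat half):

```shell
# In one terminal: per-second delta of committed transactions.
# -r reports relative (delta) values, -i 1 repeats every second.
mysqladmin -r -i 1 extended-status | grep -w Com_commit

# In another terminal: per-device stats every second;
# "w/s" is the write-operations-per-second column to watch.
iostat -x 1
```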
Jim, that is inserts per minute, not per second.
Jamie, RDS doesn’t provide access to iostat. You can only connect to the machine via mysql. I should have monitored the machine with CloudWatch to get better metrics, although it still isn’t as good as I’d like.
3M rows/minute = 50K/second
400M rows in ~135 minutes = ~50K/second
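The arithmetic in shell, for the record:

```shell
# 3M rows/minute expressed per second
echo $((3000000 / 60))           # prints 50000
# 400M rows over ~135 minutes, per second (integer arithmetic)
echo $((400000000 / 135 / 60))   # prints 49382, i.e. ~50K/s
```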
Jim — sorry, I replied too hastily and didn’t pay attention to your comment. You’re right.
Baron,
I should note that with inserts, the “working set” is not going to be the whole table. The auto-increment key inserts go at the “end” of the primary key, and the other keys in sysbench are also relatively clustered, so this should not be a random-I/O-bound workload; it should be mainly CPU-bound. If you want an I/O-bound workload, you have to have some keys which are random. For example, if you add SHA1(id) as a key, inserts would become I/O-bound at some point in the future.
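Peter’s suggestion could be sketched like this (hypothetical; assumes the default sysbench table name sbtest and a reachable mysql client):

```shell
mysql sbtest <<'SQL'
-- A SHA1-derived secondary key is effectively random with respect to
-- insert order, so its index pages are touched all over the tree;
-- once the index outgrows the buffer pool, inserts become I/O-bound.
ALTER TABLE sbtest ADD COLUMN id_hash CHAR(40);
UPDATE sbtest SET id_hash = SHA1(id);
ALTER TABLE sbtest ADD KEY (id_hash);
SQL
```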
Peter, yes, absolutely.
Have you looked at benchmarking the EC2 High Performance Cluster Compute instance type?
http://aws.amazon.com/ec2/hpc-applications/
Not available for RDS (yet), but this looks to be a good path for optimizing MySQL performance in the Amazon environment. App servers and replication slaves could all run on HPC instances in the same placement group.
Does anybody have any (recent) experience with highly parallel inserts into InnoDB?
I’m working on a project now where I have to write out several hundred million rows into an InnoDB database in the shortest possible time frame (long story as to why).
On my current hardware/software configuration (MySQL 5.0.51a/InnoDB), I seem to cap out something in the InnoDB kernel around 8 parallel inserting threads. Any more than that and I actually see (slightly) lower throughput.
I checked the obvious stuff:
- innodb_thread_concurrency is set to 32
- vmstat shows 6-7 cores spinning and about 1 core worth of I/O wait
- I have 8 idle CPU cores
- iostat shows I’m not capped (busy, but not pegged)
- Log files are quite large (2 GB)
I’ve read some posts (on this blog, I think) that older versions of InnoDB have concurrency issues around 8 cores and up, but frankly I’ve not seen that issue before with my typical, read-heavy workload. I was wondering if I’ve finally run up against some of the internal database limits that things like XtraDB are aiming to correct?
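One quick way to look for that kind of internal contention (a sketch; assumes the mysql client can reach the server):

```shell
# Print the SEMAPHORES section of the InnoDB monitor output, which
# reports mutex and rw-lock waits inside the InnoDB kernel; heavy
# spin/wait counts there point at internal concurrency limits
# rather than I/O or CPU saturation.
mysql -e "SHOW ENGINE INNODB STATUS\G" \
  | sed -n '/^SEMAPHORES/,/^TRANSACTIONS/p'
```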
Jonathan, I have not benchmarked on the HPC clusters, and have no immediate plans to do so. Part of the reason is the cost. You can rack up a steep bill pretty quickly doing these things. We usually wait until a client wants to know the answer and is willing to pay for it — that’s our business, after all. I don’t have budget allocated for doing it pro bono. That said, I’ve put my foot in my mouth with this half-done blog post on insert throughput, without measuring half the things I need to be able to explain the behavior. As a result, I’m going to have to eat a little humble pie and spin up another RDS instance on Percona’s dime to repeat it, and do it right this time.
Patrick, I can’t tell from your description what the problem you’re seeing is, but it sounds typical of a number of possibilities, and as a general rule, 5.0.51a is abysmally bad at those types of things compared to late 5.1 releases with the InnoDB plugin or 5.5 (or Percona Server). I would honestly not suggest trying to optimize 5.0.anything if there is any way at all to upgrade to something newer. Or at least use Percona Server 5.0 — we were able to backport a lot of the improvements (though not all of them).
Thanks Baron,
I need to look into whether or not it’s practical to upgrade. It would be easier, of course, if I could guarantee beforehand “yeah, verily, if we upgrade it will be 3.14159 times faster”, but, of course, I can’t :). I’ll have to lab it out and see what I find.
It’s one of those cases, though, where I’m honestly curious. Usually I’m pretty good at figuring out where the bottleneck is, and in this case I’m stumped: not I/O saturation, not CPU saturation, not an obvious database concurrency limit, all of which leaves me waving my hands and rambling on about OS wait states and various other speculative things.
Basically, I just like to understand how this stuff works, and I’m off my mental map on this one, so I was hoping somebody here had a hint or two :).
Patrick: do you really have 16 cores, or you have 8 cores with hyperthreading? Also, partitioning would help you if the table is large, index maintenance will be cheaper.
8 physical cores with hyperthreading; it’s a pair of Intel L5520s. I’m not showing CPU saturation, though: 35-40% user CPU, maybe 5-10% system, and then 50% or so idle. On other workloads (16 threads doing queries with large in-memory sorts), I can drive CPU saturation up to 100%, so I know it’s possible to use ’em all.
We’re actually already partitioning the table into 10 chunks, which seems to be close to optimal. More chunks actually slows us down in this use case.
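On MySQL 5.1 and later, a 10-way split like this can be done with native hash partitioning (hypothetical table and column names; on 5.0 the split would have to be done with separate tables):

```shell
mysql mydb <<'SQL'
-- Hash-partition the insert target so each partition maintains a
-- smaller, more cache-resident set of index pages, and concurrent
-- inserting threads contend less on any single index.
ALTER TABLE big_inserts
  PARTITION BY HASH(id)
  PARTITIONS 10;
SQL
```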
Patrick, I think “poor man’s profiler” is probably the tool for this situation.
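For reference, the “poor man’s profiler” is essentially this gdb one-liner (a sketch of the well-known script; run as root on the database host):

```shell
# Sample every thread's stack once via gdb, collapse each stack to a
# comma-separated list of function names, then count identical stacks.
# The most frequent stacks show where threads spend their time.
pid=$(pidof mysqld)
gdb -ex "set pagination 0" -ex "thread apply all bt" -batch -p "$pid" \
  | awk '
      BEGIN { s = "" }
      /^Thread/ { if (s != "") print s; s = "" }
      /^\#/     { s = (s == "") ? $4 : s "," $4 }
      END       { if (s != "") print s }
    ' \
  | sort | uniq -c | sort -rn | head
```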
Not sure if I should be embarrassed or not, but I’d never heard of that tool before. Think I’m going to take a crack at it; I’ll let folks know if I find anything interesting.
Patrick, there’s also a shiny version of it in Aspersa, along with nice docs on how to use.
Christian, I found some results that I didn’t understand, and delayed publishing more results until I did further benchmarking and validation. Unfortunately this ran me into the conference season, which is still ongoing, and will be followed immediately by a vacation. So it’s going to be a while, but I will finish and publish the results. Mind that this is an insert-only benchmark, and the other benchmarks you refer to sound like some other type of benchmark.
Just another thing… I just searched for similar benchmarks, but with the same “quadruple super cool server” they just handle up to 7,000 tpmC.
And you got it 1,000 times faster? Or did I miss something?
Hello Baron,
Pretty amazing benchmark results.
Can you estimate when part2 of this article (multithreaded benchmarking) will be available?
Would it even be possible to build a cluster of cloud instances? I’m looking for a solution which can handle approx. 800 million inserts per hour, so I would need 5 instances for that task.
I see no improvement on RDS with innodb_flush_log_at_trx_commit set to 2 or 0; my benchmarks have the same results regardless of the setting. Does RDS fool the flush in some way, or is the setting disabled somehow?