April 23, 2014

SSL Performance Overhead in MySQL

NOTE: This is part 1 of what will be a two-part series on the performance implications of using in-flight data encryption.

Some of you may recall my security webinar from back in mid-August; one of the follow-up questions that I was asked was about the performance impact of enabling SSL connections. My answer was 25%, based on some 2011 data that I had seen over on yaSSL’s website, but I included the caveat that it is workload-dependent, because the most expensive part of using SSL is establishing the connection. Not long thereafter, I received a request to conduct some more specific benchmarks surrounding SSL usage in MySQL, and today I’m going to show the results.

First, the testing environment. All tests were performed on an Intel Core i7-2600K 3.4GHz CPU (8 cores, HT included) with 32GB of RAM and CentOS 6.4. The disk subsystem is a 2-disk RAID-0 of Samsung 830 SSDs, although since we’re only concerned with measuring the overhead added by using SSL connections, we’ll only be conducting read-only tests with a dataset that fits completely in the buffer pool. The version of MySQL used for this experiment is Community Edition 5.6.13, and the testing tools are sysbench 0.5 and Perl. We conduct two tests, each one designed to simulate one of the most common MySQL usage patterns. The first is connection pooling, often seen in the Java world, where a small set of connections is established by, for example, the servlet container and then just passed around to the application as needed. The second is one-request-per-connection, typical in the LAMP world, where the script that displays a given page might connect to the database, run a couple of queries, and then disconnect.

Test 1: Connection Pool

For the first test, I ran sysbench in read-only mode at concurrency levels of 1, 2, 4, 8, 16, and 32 threads, first with no encryption and then with SSL enabled and key lengths of 1024, 2048, and 4096 bits. 8 sysbench tables were prepared, each containing 100,000 rows, resulting in a total data size of approximately 256MB. The size of my InnoDB buffer pool was 4GB, and before conducting each official measurement run, I ran a warm-up run to prime the buffer pool. Each official test run lasted 10 minutes; this might seem short, but unlike, say, a PCIe flash storage device, I would not expect the variable under observation to really change that much over time or need time to stabilize. The basic sysbench syntax used is shown below.

If you’re not familiar with sysbench, the important thing to know about it for our purposes is that it does not connect and disconnect after each query or after each transaction. It establishes N connections to the database (where N is the number of threads) and runs queries through them until the test is over. This behavior provides our connection-pool simulation. The assumption, given what we know about where SSL is the slowest, is that the performance penalty here should be the lowest. First, let’s look at raw throughput, measured in queries per second:

[Figure: sysbench throughput]
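To make the connection-pool behavior concrete, here is a minimal Python sketch of the pattern described above: each thread opens one connection and reuses it until the clock runs out. This is not sysbench itself. `FakeConnection`, `worker`, and `run_pooled` are hypothetical names, and the fake connection stands in for a real MySQL driver so that the sketch runs anywhere.

```python
import threading
import time


class FakeConnection:
    """Stand-in for a database connection (assumption: a real driver goes here)."""

    def query(self, sql):
        return 1

    def close(self):
        pass


def worker(connect, deadline, counts, index):
    """Open ONE connection, then run queries through it until the deadline."""
    conn = connect()  # connection setup cost is paid once per thread
    done = 0
    while time.monotonic() < deadline:
        conn.query("SELECT 1")
        done += 1
    conn.close()
    counts[index] = done


def run_pooled(connect, threads, seconds):
    """Run `threads` workers for `seconds` and return total queries per second."""
    deadline = time.monotonic() + seconds
    counts = [0] * threads
    pool = [
        threading.Thread(target=worker, args=(connect, deadline, counts, i))
        for i in range(threads)
    ]
    for t in pool:
        t.start()
    for t in pool:
        t.join()
    return sum(counts) / seconds


if __name__ == "__main__":
    print(f"{run_pooled(FakeConnection, threads=4, seconds=0.5):.0f} queries/sec")
```

The point of the structure is that `connect()` is called N times in total, no matter how many queries run, so any handshake cost is amortized over the whole test.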

The average throughput and standard deviation (both measured in queries per second) for each test configuration are shown below in tabular format:

| SSL key size | 1 thread | 2 threads | 4 threads | 8 threads | 16 threads | 32 threads |
|---|---|---|---|---|---|---|
| SSL OFF | 9250.18 (1005.82) | 18297.61 (689.22) | 33910.31 (446.02) | 50077.60 (1525.37) | 49844.49 (934.86) | 49651.09 (498.68) |
| 1024-bit | 2406.53 (288.53) | 4650.56 (558.58) | 9183.33 (1565.41) | 26007.11 (345.79) | 25959.61 (343.55) | 25913.69 (192.90) |
| 2048-bit | 2448.43 (290.02) | 4641.61 (510.91) | 8951.67 (1043.99) | 26143.25 (360.84) | 25872.10 (324.48) | 25764.48 (370.33) |
| 4096-bit | 2427.95 (289.00) | 4641.32 (547.57) | 8991.37 (1005.89) | 26058.09 (432.86) | 25990.13 (439.53) | 26041.27 (780.71) |

So, given that this is an 8-core machine and IO isn’t a factor, we would expect throughput to max out at 8 threads, so the levelling-off of performance is expected. What we also see is that it doesn’t seem to make much difference what key length is used, which is also largely expected. However, I definitely didn’t think the encryption overhead would be so high.
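To put a number on "so high", the short Python sketch below divides each SSL throughput average by the matching unencrypted average. The figures are copied from the table above; the names `QPS` and `relative_throughput` are just illustrative.

```python
# Mean throughput (queries/sec) from the table above, one value per thread count.
THREADS = [1, 2, 4, 8, 16, 32]
QPS = {
    "SSL OFF": [9250.18, 18297.61, 33910.31, 50077.60, 49844.49, 49651.09],
    "1024-bit": [2406.53, 4650.56, 9183.33, 26007.11, 25959.61, 25913.69],
    "2048-bit": [2448.43, 4641.61, 8951.67, 26143.25, 25872.10, 25764.48],
    "4096-bit": [2427.95, 4641.32, 8991.37, 26058.09, 25990.13, 26041.27],
}


def relative_throughput(config):
    """SSL throughput as a fraction of unencrypted throughput, per thread count."""
    baseline = QPS["SSL OFF"]
    return [ssl / plain for ssl, plain in zip(QPS[config], baseline)]


if __name__ == "__main__":
    for config in ("1024-bit", "2048-bit", "4096-bit"):
        fractions = relative_throughput(config)
        row = ", ".join(f"{t} thr: {f:.0%}" for t, f in zip(THREADS, fractions))
        print(f"{config}: {row}")
```

For every key length, SSL delivers roughly a quarter of the unencrypted throughput at 1 to 4 threads and a little over half of it at 8 threads and above.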

The next graph here is 95th-percentile latency from the same test:

[Figure: sysbench 95th-percentile response time]

And in tabular format, the raw numbers (average and standard deviation):

| SSL key size | 1 thread | 2 threads | 4 threads | 8 threads | 16 threads | 32 threads |
|---|---|---|---|---|---|---|
| SSL OFF | 1.882 (0.522) | 1.728 (0.167) | 1.764 (0.145) | 2.459 (0.523) | 6.616 (0.251) | 27.307 (0.817) |
| 1024-bit | 6.151 (0.241) | 6.442 (0.180) | 6.677 (0.289) | 4.535 (0.507) | 11.481 (1.403) | 37.152 (0.393) |
| 2048-bit | 6.083 (0.277) | 6.510 (0.081) | 6.693 (0.043) | 4.498 (0.503) | 11.222 (1.502) | 37.387 (0.393) |
| 4096-bit | 6.120 (0.268) | 6.454 (0.119) | 6.690 (0.043) | 4.571 (0.727) | 11.194 (1.395) | 37.26 (0.307) |

With the exception of 8 and 32 threads, the latency introduced by the use of SSL is constant at right around 5ms, regardless of the key length or the number of threads. I’m not surprised that there’s a large jump in latency at 32 threads, but I don’t have an immediate explanation for the improvement in the SSL latency numbers at 8 threads.
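The "right around 5ms" figure can be checked by subtracting the unencrypted latency from the SSL latency at each thread count. The Python sketch below does that with the averages from the latency table; `P95_MS` and `added_latency_ms` are illustrative names, and the values are taken to be milliseconds.

```python
# Mean 95th-percentile latency from the table above, one value per thread count.
THREADS = [1, 2, 4, 8, 16, 32]
P95_MS = {
    "SSL OFF": [1.882, 1.728, 1.764, 2.459, 6.616, 27.307],
    "1024-bit": [6.151, 6.442, 6.677, 4.535, 11.481, 37.152],
    "2048-bit": [6.083, 6.510, 6.693, 4.498, 11.222, 37.387],
    "4096-bit": [6.120, 6.454, 6.690, 4.571, 11.194, 37.26],
}


def added_latency_ms(config):
    """Latency added by SSL (SSL minus unencrypted), per thread count."""
    baseline = P95_MS["SSL OFF"]
    return [round(ssl - plain, 3) for ssl, plain in zip(P95_MS[config], baseline)]


if __name__ == "__main__":
    for config in ("1024-bit", "2048-bit", "4096-bit"):
        deltas = added_latency_ms(config)
        row = ", ".join(f"{t} thr: +{d:.2f}" for t, d in zip(THREADS, deltas))
        print(f"{config}: {row}")
```

With 1024-bit keys the added latency works out to about 4.3, 4.7, 4.9, 2.1, 4.9, and 9.8 ms at 1, 2, 4, 8, 16, and 32 threads, which shows the two outliers at 8 and 32 threads.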

Test 2: Connection Time

For the second test, I wrote a simple Perl script to just connect and disconnect from the database as fast as possible. We know that it’s the connection setup which is the slowest part of SSL, and the previous test already shows us roughly what we can expect for SSL encryption overhead for sending data once the connection has been established, so let’s see just how much overhead SSL adds to connection time. The basic script to do this is quite simple (non-SSL version shown):

As with test #1, I ran test #2 with no encryption and SSL encryption of 1024, 2048, and 4096 bits, and I conducted 10 trials of each configuration. Then I took the elapsed time for each test and converted it to connections per second. The graph below shows the results from each run:

[Figure: connection throughput]
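The measurement itself is simple enough to sketch. The test described here was written in Perl; the Python version below is only an illustration of the same idea, which is to connect and disconnect in a tight loop and then divide the iteration count by the elapsed time. `connections_per_second` and `FakeConnection` are hypothetical names, and the fake connection is a placeholder for a real MySQL driver's connect call.

```python
import time


class FakeConnection:
    """Placeholder so the sketch runs without a database (assumption)."""

    def close(self):
        pass


def connections_per_second(connect, iterations):
    """Open and close `iterations` connections back to back; return the rate."""
    start = time.perf_counter()
    for _ in range(iterations):
        conn = connect()  # with a real driver, this is where the SSL handshake happens
        conn.close()
    elapsed = time.perf_counter() - start
    return iterations / elapsed


if __name__ == "__main__":
    rate = connections_per_second(FakeConnection, iterations=1000)
    print(f"{rate:.2f} connections per second")
```

Passing the connect step in as a callable keeps the timing loop identical for the encrypted and unencrypted cases, so only the connection setup differs between runs.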

Here are the averages and standard deviations:

| Encryption | Average connections per second | Standard deviation |
|---|---|---|
| None | 2701.75 | 165.54 |
| 1024-bit | 77.04 | 6.14 |
| 2048-bit | 28.183 | 1.713 |
| 4096-bit | 5.45 | 0.015 |

Yes, that’s right, 4096-bit SSL connections are roughly 500 times slower to establish than unencrypted connections, which is close to 3 orders of magnitude. Really, the connection overhead for any level of SSL usage is quite high when compared to the unencrypted test, and it’s certainly much higher than my original quoted number of 25%.
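For a sense of scale, the Python sketch below turns the average connection rates into slowdown factors and into the time a single connection takes. The rates come from the results above (the 2048-bit value is rounded to two decimals); `slowdown` and `ms_per_connection` are illustrative names.

```python
import math

# Average connections per second from the connection-time test.
CONNECTIONS_PER_SECOND = {
    "None": 2701.75,
    "1024-bit": 77.04,
    "2048-bit": 28.18,  # rounded to two decimals
    "4096-bit": 5.45,
}


def slowdown(config):
    """How many times slower connection setup is than the unencrypted case."""
    return CONNECTIONS_PER_SECOND["None"] / CONNECTIONS_PER_SECOND[config]


def ms_per_connection(config):
    """Average wall-clock time for one connect/disconnect cycle, in ms."""
    return 1000.0 / CONNECTIONS_PER_SECOND[config]


if __name__ == "__main__":
    for config in CONNECTIONS_PER_SECOND:
        factor = slowdown(config)
        print(
            f"{config}: {ms_per_connection(config):.2f} ms per connection, "
            f"{factor:.1f}x slower, 10^{math.log10(factor):.2f}"
        )
```

That works out to roughly 0.37 ms per unencrypted connection versus about 13 ms, 35 ms, and 183 ms at 1024, 2048, and 4096 bits, or slowdowns of about 35x, 96x, and 496x.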

Analysis and Parting Thoughts

So, what do we take away from this? The first thing, of course, is that SSL overhead is a lot higher than 25%, particularly if your application uses anything close to the one-connection-per-request pattern. For a system which establishes and maintains long-running connections, the initial connection overhead becomes a non-factor, regardless of the encryption strength, but there’s still a rather large performance penalty compared to the unencrypted connection.

This leads directly into the second point, which is that connection pooling is by far a more efficient method of using SSL if your application can support it.
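A back-of-the-envelope model shows why. Assume, purely for illustration, that the time to serve one request is the connection setup time plus the number of queries times the per-query time, using the single-thread averages from the two tests. Nothing here was measured in combination, so treat this Python sketch as a rough estimate; `request_time_ms` and `ssl_penalty` are hypothetical names.

```python
# Single-thread averages from the two tests (1024-bit SSL vs. no encryption).
PLAIN = {"connects_per_sec": 2701.75, "queries_per_sec": 9250.18}
SSL_1024 = {"connects_per_sec": 77.04, "queries_per_sec": 2406.53}


def request_time_ms(profile, queries_per_connection):
    """Assumed model: one connection setup plus N queries, costs simply added."""
    connect_ms = 1000.0 / profile["connects_per_sec"]
    query_ms = 1000.0 / profile["queries_per_sec"]
    return connect_ms + queries_per_connection * query_ms


def ssl_penalty(queries_per_connection):
    """How many times slower the SSL request is under this model."""
    return request_time_ms(SSL_1024, queries_per_connection) / request_time_ms(
        PLAIN, queries_per_connection
    )


if __name__ == "__main__":
    for n in (2, 20, 200, 2000, 100000):
        print(f"{n:>6} queries per connection: {ssl_penalty(n):.1f}x slower with SSL")
```

Under these assumptions, a page that connects, runs two queries, and disconnects is more than 20 times slower with SSL, while a connection reused for many thousands of queries approaches the roughly 3.8x penalty seen in the sysbench test.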

But what if connection pooling isn’t an option, MySQL’s SSL performance is insufficient, and you still need full encryption of data in-flight? Run the encryption component of your system at a lower layer – a VPN with hardware crypto would be the fastest approach, but even something as simple as an SSH tunnel or OpenVPN *might* be faster than SSL within MySQL. I’ll be exploring some of these solutions in a follow-up post.

And finally… when in doubt, run your own benchmarks. I don’t have an explanation for why the yaSSL numbers are so different from these (maybe yaSSL is a faster SSL library than OpenSSL, or maybe they used a different cipher – if you’re curious, the original 25% number came from slides 56-58 of this presentation), but in any event, this does illustrate why it’s important to run tests on your own hardware and with your own workload when you’re interested in finding out how well something will perform rather than taking someone else’s word for it.

About Ernie Souhrada

Ernie joined Percona in April 2012 as a Senior Consultant. In his previous lives, he has been everything from a Perl/Java developer to a Linux sysadmin, a MySQL DBA to a Cisco network engineer, and a security auditor to an IT engineering manager, many of these things all at the same time. When not working on MySQL, he might be found on the ski slope, at a psytrance festival, or at the nearest sushi bar.

Comments

  1. RyanC says:

    Could you let us know what cipher suites were being used in your SSL connections? There’s a substantial performance difference between DES-CBC3-SHA and AES128-GCM-SHA256, for example, especially if both client and server support AES-NI acceleration (an i7-2600K should).

  2. Bill says:

    Would a VPN or stunnel therefore be a better solution for systems that make frequent MySQL connections? Because the SSL handshake is done once, and removed from the cost of repeated connections. The old lessons are still true, e.g. remove loop-invariant code from the loop.

  3. Peter says:

    Ernie,

    Wow, that is quite a lot more than I would think… I think what would be very interesting is also to run the test with very simple queries (lookup by primary key) vs streaming, i.e. SELECT * FROM sbtest; to see the overhead per query vs “streaming”

  4. Razvan says:

    Quick question: were sysbench and the server on the same machine?

  5. Baron says:

    The smoothed lines in the charts are making my nose twitch. I will put Tufte on your Christmas list :)

  6. Topher says:

    I concur with Bill and Ryan’s comments. The SSL key size is a little bit of a red herring in that it’s only used in the handshake at connection setup time. When using a connection pool, this penalty applies ideally once when the pool is set up… I’m sure there are reconnects and re-negotiations. The supporting algorithms that carry the SSL sessions would seem to be more important here if there isn’t any acceleration. Even with AES-NI I’m curious if the CPU can carry both DB threads and SSL termination effectively. AES128 is light and secure. If sniffing is low risk and performance is critical, I’d maybe consider RC4 despite some of its crypto shortcomings.

  7. Paul says:

    What is causing such a significant drop in throughput when using SSL?
    Granted there is a handshake to establish, which will obviously have some overhead. But surely this handshake overhead is almost identical to that used by HTTPS?

    Interestingly, when Google changed Gmail to require HTTPS in 2010 (instead of it being an option), they required no further hardware, servers, hosts, nothing. The CPU penalty was less than 1% across the board. Admittedly they used 1028 bit keys, but the graphs above show that strength is not a significant factor in handshake time.

    Is there something peculiar about MySQL’s implementation of SSL, which means it cannot be comparable to HTTPS?

  8. Ernie S says:

    To answer these in order…

    @Ryan: I did not specify a cipher during the configuration, so I got the default: DHE-RSA-AES256-SHA. I’m sure that the results would be different with different ciphers.

    @Bill: My thinking here is yes – that’s something I was planning to do for the second post.

    @Peter: At one thread, “SELECT * FROM sbtest” is 4x slower with encryption than without, regardless of key length, so that matches up pretty closely with the sysbench results.

    @Razvan: For the tests reported on in this post, yes, but I see where you’re going with that question. I have not tried the sysbench tests from a separate machine, but for the connection setup tests, the performance isn’t really any better. With no encryption, network speed becomes the limiting factor, and over a gigE link I can create about 930 connections per second. At 1024 bits, throughput drops to 72 cps, and it just gets worse from there.

    @Baron: I’m happy to accept Christmas presents of all kinds. :-)

    @Paul: My working theory at this point is that it has a lot to do with OpenSSL and not necessarily anything specific to MySQL. I have a copy of Percona Server 5.6 which, I think, has OpenSSL statically linked, and connection setup time is more than 2x faster at 1024 bits and almost 4x faster at 4096 bits. Choice of cipher, as Ryan mentioned, is also likely to play a large part. More investigation required – this is a fairly deep rabbit hole.

  9. Mark says:

    I am curious about connection create performance with/without SSL when there is more network latency between client and server. Any chance you will run such a test?

  10. Paul says:

    Hi Ernie,

    I agree in theory the results should be influenced primarily by OpenSSL rather than MySQL.
    If you could share your OpenSSL version (and operating system ideally, since RedHat backports things but keeps the version the same), that would be very useful. I’d like to see if I can replicate these tests myself. They are unexpected yet interesting.

    Thanks

  11. marc castrovinci says:

    Any idea if the same impact is seen with Galera using ssl certs for group communication?

  12. Ernie S says:

    @Mark – I hadn’t originally planned on running a test like that, but that is an interesting question. It wouldn’t take very long to set up and run, so I’ll put that on the list for the follow-up to this post.

    @Paul – CentOS 6.4 (kernel 2.6.32-358.18.1.el6.x86_64), and these are the versions of OpenSSL I have installed:
    openssl098e-0.9.8e-17.el6.centos.2.x86_64
    openssl-1.0.0-27.el6_4.2.x86_64
    And I used the binary tarball of MySQL 5.6.13 – not a CentOS/RHEL-specific RPM, although I would think (hope?) that it shouldn’t matter.

    @Marc – Another interesting question. I would assume that the performance hit for Galera would be similar to the performance hit for standard replication, which I would assume (yes, I know, a lot of assuming going on here) to be about the same as the performance hit for the connection-pool test. Time permitting, I’ll take a look into this, too.
