February 8, 2012

mysql-proxy, urgh performance and scalability ?

For one of our projects I needed a proxy solution, and since mysql-proxy is one of the well-known ones, it was the logical first choice. The obvious question that comes to mind is what performance penalty we pay for using mysql-proxy (version 0.7.1).
That is easy to test. (By the way, sysbench was recently pushed to Launchpad, see lp:sysbench, and Percona is going to be an active developer of this project and of its scripted benchmarks.)

I took lp:sysbench with the Lua script oltp_complex_ro and tested it with a range of connection counts. Here are the results (in transactions per second, more is better):

Threads  MySQL-5.0.77  MySQL-proxy+MySQL-5.0.77
      1        660.02                    349.86
      2       1158.66                    477.77
      4       1223.84                    485.21
      8       1224.22                    455.69
     16       1109.72                    441.55
     32       1059.23                    419.05
     64        909.98                    414.30
    128        882.46                    406.28
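
To make the penalty easier to read off the table, here is a minimal Python sketch that turns each row into a slowdown factor and a percentage of throughput lost. The figures are copied from the table; the names RESULTS, slowdown and throughput_loss_pct are made up for this illustration and are not part of sysbench or mysql-proxy.

```python
# Throughput in transactions per second, copied from the table above:
# threads -> (direct MySQL, through mysql-proxy).
RESULTS = {
    1: (660.02, 349.86),
    2: (1158.66, 477.77),
    4: (1223.84, 485.21),
    8: (1224.22, 455.69),
    16: (1109.72, 441.55),
    32: (1059.23, 419.05),
    64: (909.98, 414.30),
    128: (882.46, 406.28),
}


def slowdown(direct_tps, proxy_tps):
    """Return how many times lower the proxied throughput is."""
    return direct_tps / proxy_tps


def throughput_loss_pct(direct_tps, proxy_tps):
    """Return the share of throughput lost behind the proxy, in percent."""
    return (1.0 - proxy_tps / direct_tps) * 100.0


for threads, (direct, proxy) in RESULTS.items():
    print(f"{threads:>3} threads: {slowdown(direct, proxy):.2f}x slower, "
          f"{throughput_loss_pct(direct, proxy):.0f}% of throughput lost")
```

On these numbers the slowdown ranges from roughly 1.9x at 1 thread to roughly 2.7x at 8 threads.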

OK, now let’s look at response time (for 1 thread).

MySQL

    per-request statistics:
         min:                                  1.31ms
         avg:                                  1.51ms
         max:                                  5.30ms
         approx.  95 percentile:               1.56ms

Proxy+MySQL

    per-request statistics:
         min:                                  2.04ms
         avg:                                  2.86ms
         max:                                  6.44ms
         approx.  95 percentile:               4.30ms
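
The two sysbench outputs can be compared directly. The sketch below, in Python, computes how many milliseconds the proxy adds to each statistic and the corresponding ratio. The values are copied from the output above; the dictionary and function names are hypothetical and only serve this illustration.

```python
# Per-request statistics in milliseconds for 1 thread,
# copied from the two sysbench outputs above.
DIRECT_MS = {"min": 1.31, "avg": 1.51, "max": 5.30, "p95": 1.56}
PROXY_MS = {"min": 2.04, "avg": 2.86, "max": 6.44, "p95": 4.30}


def added_latency_ms(direct, proxy):
    """Return the extra milliseconds per request seen behind the proxy."""
    return {name: round(proxy[name] - direct[name], 2) for name in direct}


def latency_ratio(direct, proxy):
    """Return proxied latency divided by direct latency for each statistic."""
    return {name: proxy[name] / direct[name] for name in direct}


added = added_latency_ms(DIRECT_MS, PROXY_MS)
ratio = latency_ratio(DIRECT_MS, PROXY_MS)
for name in DIRECT_MS:
    print(f"{name}: +{added[name]:.2f} ms ({ratio[name]:.2f}x)")
```

For this run the proxy adds about 1.35 ms to the average request and about 2.74 ms at the 95th percentile.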

Well, I expected some penalty from using a proxy… but 2-3x, that’s overkill. It is worth considering if you want to run Query Analyzer with MySQL-proxy on your MySQL Enterprise setup.

There is an alternative, Dormando-proxy, which I also want to try, but it crashed under the sysbench load, so I have not been able to get any results yet.

About Vadim Tkachenko

Vadim leads Percona's development group, which produces the Percona Server and Percona XtraBackup. He is an expert in solid-state storage, and has helped many hardware and software providers succeed in the MySQL market.

Comments

  1. peter says:

    Vadim,

    It would be interesting to see what peak overhead you observe, i.e. if you’re looking at simple sysbench.

    A lot of web applications have a lot of very simple queries (close to single row lookup by key) and these will show maximum overhead.

    At the lower end of the overhead range, we could probably try streaming some large data.

    And well…. Compared to this the overhead we observe by using full slow query logging on server is negligible.

  2. Jan Kneschke says:

    Hi Vadim, please benchmark the current trunk. It has threaded network I/O (--event-threads=4). Then wait for the 0.9 release, which will remove the global Lua lock that is also limiting throughput. Without a Lua script, trunk should already show a huge difference.

  3. Kay Roepke says:

    Hi Vadim!

    As you’ve found, the single-threaded implementation of 0.7.1 has its limits. And depending on the workload yes the hit can be up to 75% (which matches quite nicely with worst-case benchmarks I have done in the past).

    As Peter points out: It is interesting to look at the overhead curve across different applications.
    My rule of thumb for 0.7 performance is: The shorter the query execution time, the higher the overhead will be. This is a direct consequence of 0.7 being single-threaded and event-based, because MySQL Proxy 0.7 can make progress whenever it waits for events, so naturally many short queries will mean that Proxy needs to do a lot of work for each query (because the result packets will arrive very shortly after the query was issued, decreasing the time it can spend on other queries).
    Lots of work in Lua scripts will also block progress on other connections in 0.7.

    For 0.8 (current trunk on Launchpad) the picture is different: The multithreading in it applies to all network communication, even when one thread is spending time in Lua (although in 0.8 only one thread can be in Lua-land at any given time – lifting this is scheduled for 0.9 as Jan says). Thus Proxy can make progress on up to as many connections as you have event-threads running. Connections can wander between threads, so on average you can have as many concurrent active connections being serviced as you have threads.

    We are ramping up regular scalability testing for it but are not quite there yet (waiting on some infrastructure to be ready). Once we are, we will be publishing results regularly.

    If you are interested, we’ve spent quite some time improving the code and architecture documentation in the current trunk (you’ll need doxygen/graphviz/mscgen). It explains the architecture clearly, we are hoping ;)
    In case you find any bugs, please report them at http://bugs.mysql.com in the Proxy component. If you have additional questions, we are on #mysql-proxy on freenode, as well.

    cheers,
    -k

  4. It’s worth noting that this is an improvement over the last time I remember Vadim trying out mysql-proxy. Maybe I’m wrong, but I remember it crashing so we couldn’t really benchmark it? Vadim, do you remember that, or am I telling lies?

  5. Michael Peters says:

    Have you looked at DBD::Gofer from the Perl world? It doesn’t support transactions but I’d be interested to see how it stacks up.

  6. erik says:

    What about using ha-proxy? It’s not an SQL-specific proxy, but it seems like it could balance the connections just as well as anything.

  7. Vadim says:

    Jan,

    I will try recent trunk.
    The current run was without Lua scripts, but I am actually looking to add some scripts, so I need Lua…

  8. Vadim says:

    Baron,

    It was a long time ago and with very early releases, so I do not remember all the details. I can’t say whether it was in the previous release.

  9. Vadim says:

    Erik,

    I did not try ha-proxy; it seems it does not support the functionality I need. I am looking to handle incoming queries in one specific way.

  10. Robert says:

    Hi Vadim,

    Try Tungsten Connector. (http://www.continuent.com/community/tungsten-connector) It’s written in Java and schedules threads efficiently across multiple cores.

    Also, what do you need Lua for? If you provide a use case we can look at how to get it for you. We have a lot of work afoot in this area, for example implementing session consistency load balancing.

    Cheers, Robert

  11. Robert says:

    BTW Vadim, it looks as if you are hitting the proxy math problem I talked about at your recent and highly esteemed “Performance is Everything” conference. (Slides: https://s3.amazonaws.com/extras.continuent.com/Tungsten-Proxy-Architectures-2009-04-22.pdf)

  12. mtkopone says:

    Hi,

    Informative post, but it wasn’t clear to me whether you used some kind of connection pooling. I.e., is the performance bad because of connection creation, or will the penalty affect each query run through an open connection?

  13. B Clark says:

    This is disappointing. So … what are we supposed to use or do for loadbalancing?

  14. K. Heraud says:

    Yes, good question…
    I have been advised not to use two databases in my web application just to split writes and reads (master / slaves). So I thought mysql-proxy could be great…

    If it leads to 2-3x overheads, is there another alternative to deal with that ?

  15. Mike says:

    I’m running MySQL 5.5 with MySQL Enterprise Manager 2.3.6

    I was wondering how many people use mysql proxy “in the middle” in order to get Query Analyzer functionality? This is really all I want it for. I’m at a crossroads, trying to decide whether, going forward, we want to start implementing mysql agent “in the middle” on all our production databases to get the Query Analyzer functionality. The main drawbacks I’ve found are the following:

    - Upgrading the agent requires downtime
    - Not able to use IP level access control on user accounts. Have to use 'webuser'@'%' for example.
    - Not able to see what servers users are logging in from since all logins are from the agent and are displayed as “localhost” in the processlist.
    - If agent crashes, access to database is lost.
    - Increased overhead.

    I’m just not sure it’s worth it!!

    Just wondering what others are doing. Is MySQL Agent used “in the middle” commonly in production databases?

    Thanks!
