Be careful with my findings: I appear to have compiled in debug mode, so I am redoing the benchmarks. An updated version is available here.

I recently worked with many customers who had large numbers of connections open in MySQL, and although I told them this was not optimal, I had no solid arguments to present. More than once I heard: it is not a problem, those connections are Sleeping! To remedy this situation, I decided to run a series of DBT2 benchmarks while varying the number of idle connections from 0 to 1500. I was expecting some influence, because those idle threads are present in internal MySQL list objects that need to be scanned, and because the socket select() call needs to build an FD bitmap after every network activity, with the size of the bitmap proportional to the number of active threads. What I found was not a small impact at all.

For my daily work I use two computers linked by a gigabit switch, so I decided to use them for benchmarking. On the desktop, I installed Percona-Server-11.2, which I configured with the following InnoDB settings:
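The settings block itself did not survive in this copy of the post. Based on the details given later (a single DBT2 warehouse that fits in the buffer pool, and innodb_flush_log_at_trx_commit=0), a plausible reconstruction might look like the following; every value except innodb_flush_log_at_trx_commit is a placeholder, not the author's actual configuration:

```ini
# Hypothetical reconstruction -- the original settings block was lost.
[mysqld]
innodb_buffer_pool_size = 1G          # the single DBT2 warehouse fits in memory
innodb_log_file_size = 256M
innodb_flush_log_at_trx_commit = 0    # ensure the disk is not being benchmarked
innodb_flush_method = O_DIRECT
max_connections = 2000                # headroom for up to 1500 idle connections
```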

The desktop runs Ubuntu 10.04 and has 8GB of RAM and a Core i5 CPU (dual core with HT enabled). Since my goal was to test concurrency, I used only one warehouse for DBT2, which easily fits inside the buffer pool. With innodb_flush_log_at_trx_commit=0, I ensured I was not benchmarking my disk.

On the laptop, I ran the following script:
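The script itself is missing from this copy. A hedged sketch of what it plausibly looked like, assuming only the details the post states (the make_conn.php helper, the hacked run_mysql.sh DBT2 driver, and idle counts ranging from 0 to 1500 — the specific counts below are placeholders):

```shell
#!/bin/sh
# Hypothetical sketch -- the original benchmark driver was not preserved.
# For each idle-connection count: open that many sleeping connections,
# run the DBT2 driver against the desktop, then drop the connections.
HOST=10.2.2.129
for idle in 0 20 50 100 250 500 1000 1500; do
    php make_conn.php $HOST $idle &    # background: holds $idle idle connections
    CONN_PID=$!
    sleep 10                           # let the connections settle
    ./run_mysql.sh ...                 # DBT2 run (original options not preserved)
    kill $CONN_PID
    wait
done
```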

where 10.2.2.129 is the IP address of the desktop. The script uses a PHP helper, make_conn.php, to generate the idle connections:
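The listing for make_conn.php is also missing from this copy. A minimal sketch of what such a script could look like, using the era-appropriate ext/mysql API; the credentials and the one-hour hold time are assumptions:

```php
<?php
// Hypothetical reconstruction -- the original make_conn.php was lost.
// Opens $argv[2] connections to host $argv[1] and leaves them idle; on
// the server they show up as "Sleep" threads in SHOW PROCESSLIST.
$host  = $argv[1];
$count = (int)$argv[2];
$conns = array();
for ($i = 0; $i < $count; $i++) {
    // The fourth argument (new_link = true) forces a distinct
    // connection instead of reusing the first one.
    $conns[$i] = mysql_connect($host, 'dbt2', 'dbt2pass', true);
}
sleep(3600); // hold the connections open while the benchmark runs
?>
```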

Before each benchmark, the database is reinitialized to ensure consistent results. During all the benchmarks, the CPU load on the laptop never went above 10%. I also hacked the “run_mysql.sh” script a bit to allow more processing threads (100) instead of the maximum of 20. This is required for another series of benchmarks I’ll present soon, this time looking at the number of active connections. For these benchmarks, I used pools of 4 connections, basically to match the number of computing threads the Core i5 provides. I know from experience that this is about the maximum DBT2 NOTPM for a small number of threads.

So, here are the results:

As you can see, the performance drop is shocking. At 1500 idle connections, throughput is 1.3% of what it is with 0 idle connections. Even with as few as 20 idle connections, the drop is already 40%. I verified that the laptop (running dbt2) was not slowed down by handling all those connections: generating the idle connections from a third box gave exactly the same results. The conclusion is quite straightforward: idle connections hurt performance a lot! I am curious where MySQL is spending its time; I plan to use profiling tools to identify the culprit but have not had time yet. I am afraid the same scalability issues affect active connections, although those are also subject to concurrency issues. With new servers having 24+ cores, this phenomenon seriously affects performance.

21 Comments
Hervé COMMOWICK

You can limit the maximum number of concurrent connections to the MySQL server and proactively kill idle connections with HAProxy; check this:
http://flavio.tordini.org/a-more-stable-mysql-with-haproxy/comment-page-1

Raine

It would be interesting to run this test on MariaDB with thread-pool-based connection handling (event based)…

Peter Zaitsev

Yves,

Great find! It has long been well known that active connections significantly limit performance, but it is good to have data showing idle connections are not free either.

thread-pool is one approach to this problem, though I believe even with 1 thread per connection the situation can be improved a lot by using epoll or similar approaches, which scale well with many inactive connections.

Hervé COMMOWICK

Really difficult to educate users in shared hosting…

tobi

This is an outrageous finding. Having this information in your consulting toolbox surely is a value-add. Looking forward to the profiler analysis.

Andy

This is great info. Thanks.

Would you ever port MariaDB’s thread pool to Percona Server?

Baron Schwartz

For those who are wondering about MariaDB’s pool-of-threads implementation, that is not an option for Percona Server currently. It creates more problems than it solves. Mark Callaghan has written some good material on its shortcomings. However, some solution is surely needed — this was one of the things I mentioned in http://www.mysqlperformanceblog.com/2010/10/27/mysql-limitations-part-4-one-thread-per-connection/ (thanks Yves for providing the proof I didn’t have at that time either!)

Vladislav Vaintroub

Yves, it looks like you’re benchmarking binaries compiled for debug (even with debug-full, as there is safemalloc inside).

Yoshinori Matsunobu

Hi Yves,

When I tested similar benchmarks a few months ago (read-only in-memory PK lookup queries from 30 persistent clients, plus 0–2000 sleeping clients; two 8-core Nehalem E5540 boxes, Gigabit Ethernet, CentOS 5.5, MySQL 5.1.latest), I didn’t find any noticeable performance difference. What if you test DBT-2 on modern servers rather than cheap desktops?

Kristian Nielsen

Yves, sorry, but your benchmark is totally bogus 🙁

As Vladislav Vaintroub said, you are testing a debug-compiled server with safemalloc enabled. This totally kills performance, as it uses a really expensive memory sanity check that has quadratic cost in the number of active malloc() allocations.

(Note that a “debug build” in MySQL is much more than just gcc -g; it enables lots of extra testing code that is very costly for performance. safemalloc is only one of them, but a particularly expensive one.)

You will have to re-check with a non-debug build.

Note, by the way, that there is no select() overhead for idle connections (with the default of no threadpool). The select() only watches the two server listening sockets (TCP and Unix socket); each idle connection has its own waiting call in its own thread. I am also curious whether you know of any particular lists of idle connections that MySQL needs to traverse?

svar

I can confirm from a client benchmark that reducing the minimum pool size from 500 to 20 gave us an improvement in average response time on a 24-core box. Killing the connections on the server side does not help, because the socket will still exist in the system; fixing the client code is what is needed. Funnel on Launchpad and Hyves’ contribution to mysql-proxy can also do the job with libevent.

Vladislav Vaintroub

Connections are not multiplexed inside normal MySQL, i.e. there is no select() or poll() waiting for activity on a set of connections; instead, each server thread is blocked in a recv() call as long as its client is idle.

Thread-per-connection wastes system resources (at least a thread, plus virtual memory for the thread’s stack). And if there are very many active connections, having as many threads as clients puts a load on the scheduler, and locks have a higher chance of colliding, leading to context switching.

Traditionally, scalable Unix server architectures were built around multiplexing with poll() and select() plus worker threads. This model is better than thread-per-connection, but neither poll() nor select() scaled well. This is the problem you’re talking about in the last comment, and this is what epoll, kevent, Solaris event ports, /dev/poll, Windows completion ports, and libevent are aimed at fixing.

But normal MySQL does not have problems with select()/poll() scalability, because the MySQL server does not use multiplexing at all.

Mark Grennan

Really good work sir.

I knew network connections were an issue. I did some research on network timeout settings and published it on my blog (http://www.mysqlfanboy.com/2010/05/mysql-network-connections/). I’m thinking of springing off your work and investigating them again. Most of my fellow DBAs told me MySQL had NO issue with network connections, and I toned back my blog post accordingly.

Here is what I was looking at:

max_connections = ? # number of simultaneous client connections allowed
max_user_connections = ? # number of connections a single user can make; 0 = unlimited
wait_timeout = ? # seconds to wait for activity on a non-interactive connection
interactive_timeout = ? # seconds to wait for activity on an interactive connection
connect_timeout = ? # seconds to wait for a handshake packet during connect
max_connect_errors = ? # host is blocked after this many failed connections; one good connect resets the count
max_allowed_packet = ? # maximum packet size — how big is your BLOB?

It will be interesting to see how these numbers change the performance.

LGB

Hmmm. I’m building a quite complicated, heavily used XML-RPC server which uses MySQL as its database. I noticed that establishing a MySQL connection on each request limits performance, so I use a pool of already-established MySQL connections; connections “unused” by one thread are reused later, so there are idle connections. Now I am wondering what I can do. Building a new MySQL connection on each XML-RPC request is expensive, and having too many idle connections can be expensive too, as far as I can see now. So what is the ideal solution — do I need to run benchmarks, varying the pool size, to find the optimum? That would be OK as long as the nature of the load on my XML-RPC server does not change. Hmmm…

mdani

I have the same issue with idle connections on MySQL.
In my case, I’m using PHP to connect to MySQL with mysql_pconnect() (the “p” is for persistent connection). When the problem hit, I simply replaced mysql_pconnect() with mysql_connect().

jitesh

Hi,
I’m using MySQL 5.1 on Windows, and I want to know how many slaves are possible per master in MySQL replication.
What is the maximum number of slaves in MySQL replication?

hongbin

Where is the make_conn.php script?