I already wrote about much the same topic a while ago, and now an interesting real-life case prompts me to write again 🙂

Most Web applications we’re working with have a single-tier web architecture, meaning there is just a single set of Apache servers serving requests and nothing else – no dedicated server for static content, no Squid in front, nothing else. This architecture is frequently used even for medium-size web sites which have millions of page views per day.

Typically a single Apache server in this configuration will have a rather high MaxClients setting (in the hundreds), and its owners will argue that web site performance suffers if the value is decreased; only a few, however, understand why they need MaxClients set to some high number.
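Just to illustrate (these numbers are made up, not a recommendation), the prefork section of httpd.conf on such a server often looks something like this:

    <IfModule prefork.c>
        StartServers         10
        MinSpareServers      10
        MaxSpareServers      50
        MaxClients          500
        MaxRequestsPerChild 4000
    </IfModule>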

First let’s talk about performance and concurrency. It is often assumed that higher concurrency is better; in fact, however, systems typically perform best with limited concurrency – high enough to saturate all resources, but not so high that scheduling and context-switching overhead becomes a serious problem. Depending on the application and hardware this “optimal” number can be different, and it is best to find it by benchmarking. If you want a ballpark figure, it can be something like 2*(Num_CPUs+Num_disks), sometimes less, sometimes more.
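For example, on a hypothetical box with 4 CPUs and 8 disks (numbers chosen purely for illustration) the ballpark would be:

    2 * (Num_CPUs + Num_disks) = 2 * (4 + 8) = 24 workers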

With optimal concurrency you get optimal throughput, which is the number of transactions per second. In benchmarks which perform operations in a loop as fast as possible, response time may not be at its best at that point (at least the max and 95th percentile), but in the real world it is usually OK: you’re not planning to run your web servers at the peak capacity they can handle, and that headroom is what usually keeps response time within limits.

Sometimes I see people use the following formula to compute the optimal number of children: “my page is generated in 1.0 seconds on average and I’d like to handle 100 req/sec, so I need 100 children to keep up with the load.” This formula looks right at first glance, but it is really wrong, because page generation time depends on concurrency. Average generation time at a concurrency of 4 will be quite different from that at a concurrency of 100. You can use this formula, but you need to account for response time growing as concurrency grows, rather than keeping it static.
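To make it concrete (with made-up numbers): the formula itself, children = req/sec * response time, is just Little’s law – the catch is that the response time you plug in has to be the one measured at that concurrency:

    needed_children = requests_per_sec * avg_response_time

    response time measured at concurrency 100:  100 req/sec * 1.0 sec = 100 children
    response time measured at concurrency ~30:  100 req/sec * 0.3 sec = 30 children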

So with Web applications you need a limited number of workers – I’d guess 20-30 – to get the best performance. This is of course if all your operations are local, meaning you only deal with your local network – database server, file servers, etc. If you’re querying external web services or doing any other kind of network IO, the situation is a bit different. So why do you need a large number of Apache children (assuming pre-fork mode) to get decent performance?

Network IO – this is actually the only valid reason. If you’re querying Amazon Web Services, for example, to display affiliate goods, and you have a timeout of 3 seconds, you had better have enough Apache children (FastCGI processes, etc.) to handle this worst case. So if you have 10 page views/sec you need at least 30 workers; otherwise, if Amazon (or your network connection to Amazon) slows down, the site may become inaccessible. Whether you want your visitors to wait for 3 seconds, or would rather use some caching or a lower timeout, is another story 🙂
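The worst-case sizing here is simply the request rate multiplied by the external timeout:

    worst_case_workers = page_views_per_sec * external_timeout = 10 * 3 = 30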

Handling Keep-Alive – a keep-alive connection in Apache keeps a child busy, and especially if you have KeepAliveTimeout set high it can consume a lot of them. For dynamic pages you typically do not need keep-alive enabled, and you’d better serve images from some other server anyway (possibly even Apache still, but configured differently).
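For example (illustrative httpd.conf values), a server handling only dynamic pages can turn keep-alive off entirely, or at least keep the timeout very short:

    KeepAlive Off
    # or, if keep-alive is needed:
    KeepAlive On
    KeepAliveTimeout 2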

Spoon-Feeding Clients – slow clients may take a long time to get the page back, and an Apache child is busy until the content is fully sent. This can take a lot of time if the client is slow, and it also makes it pretty easy to DOS a site by pretending to be a very slow client and starting a number of downloads. To avoid this problem, keep Timeout low, and better yet avoid having Apache talk to web clients directly – use something which will do the buffering and can spoon-feed clients without much overhead. The danger of spoon-feeding lies in the ever-changing nature of the Internet. A slowdown or packet loss may happen somewhere, and if you have a large number of users in that segment, the amount of spoon-feeding needed may skyrocket.
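For example (an illustrative value – the right number depends on your application), in httpd.conf:

    # down from the 300-second default
    Timeout 20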

Why are Apache children a problem at all?

Well, because they are fat and ugly. Seriously, with modern PHP applications each Apache child may require 64MB of memory, sometimes even more. Part of this memory is kept between requests, so a considerable amount of memory is used even if the child is serving a static page hit at the moment. Besides excessive memory use (which is an inefficient-resource-usage issue), you have other problems, such as requiring a lot of connections to the database; if the database slows down, all these hundreds of connections may start running queries at the same time, which does not help the database recover from the slowdown quickly. It also requires more work from the OS and is generally inefficient.
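A back-of-the-envelope illustration (ignoring pages shared between processes, which in practice reduce the real footprint):

    300 children * 64 MB/child = ~19 GB of memory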

Now a couple of war stories.

I decided to post this because over the weekend one Chinese bot almost took down the site we’re working with. That site is still low-traffic, and we thought 300 Apache children were fine for now, until we found time to configure things properly. But the bot came and started spidering the site. It was not behaving badly – it waited a few seconds between request submissions, and at any other time we would not have noticed it – but it was extremely slow at getting the data back. There were page downloads taking 10+ minutes in the Apache status. As it was accepting the data, just very slowly, the Apache Timeout was not triggered. Disabling the bot is easy; however, the fact that a single bot on a slow connection can affect things so dramatically is unsettling.

This reminds me of a problem we had years ago while I was with SpyLOG. Our data center connection went bad at the last mile, giving packet loss of about 1% – it looks like a small number, but it was enough to skyrocket the number of connections in all different states. The main problem at that time was different: it was Linux kernel 2.2.x, which had a problem with the SYN backlog being linearly scanned rather than hashed (so a lot of outstanding connections in the SYN_RECV state caused system overload), but other nasty things were happening as well.

13 Comments
Jay Pipes

Hi Peter!

Very interesting stuff. I was wondering if you could comment (either in this post or perhaps in another) about a question I get quite a bit in my seminars. I often get asked whether it is better to use persistent connections (specifically for MySQL) and my answer has been the following (based on my understanding of Apache):

If you are talking about web applications that serve up dynamic data, a persistent connection can actually hurt scalability because the database connection resource gets tied to the Apache child process and cannot be re-used until that specific child process is re-used itself. Since the connection cost for connecting to MySQL is relatively low compared to other databases, I usually recommend not using persistent connections with Apache web servers so that database connections are not consumed by processes that are not actively running. I recommend this partly because I know that MySQL itself keeps a thread_cache which implements a rudimentary connection caching mechanism and partly because of the process-attaching I described above.
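For instance (an illustrative php.ini setting, not the only way to do it), persistent connections can simply be switched off so that mysql_pconnect() falls back to regular connections:

    ; php.ini – disallow persistent MySQL connections
    mysql.allow_persistent = Off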

What’s your take on this? I’m interested to hear.

Cheers!

Jay

Jan Kneschke

Peter,

mpm-event will handle this better when it is stable. Until then there is always your good buddy lighttpd. 🙂

Jan

Maciek Dobrzanski

Hey!

You mentioned data caching, which is extremely important for dynamic content serving (making it not so dynamic ;)), but what I’m writing about is only closely related – HTTP acceleration software (Varnish, Squid, Oops, etc.). If a proxy server is forbidden to cache any data, it becomes nothing but forwarding software. But it does a much better job of handling connections than most regular HTTP servers (e.g. Apache) in terms of performance and resource utilization. Furthermore, it relieves the web server of any need to deal with slow clients or be affected by network-related problems, because the web server talks only to a locally based application (over the LAN or even the loopback interface, both fast enough). Thus the reverse proxy becomes something of a seawall, which protects all the resources – not only the underlying Apache, but possibly all the back-end software as well.

Alexey

mpm-event is targeted at the Win32 arch. On *nix there’s no big difference between threads and processes. All pages that remain unchanged after fork() don’t consume memory anyway; the main memory consumption comes from allocations on the stack and heap.
It’s pretty easy to test – launch 10 children of a mod_php-enabled Apache, then launch 255 children, and compare overall memory usage.

Using a light frontend for serving static content and buffering dynamic content is a must for a big project. Otherwise your server becomes too vulnerable to simple attacks – just open 255 connections, send the first line of the request (to overcome accept filters), and saturate all the slots. Light frontends have no trouble keeping tens of thousands of connections open.

Apache

I wrote an article about this a while ago…
http://www.devside.net/articles/apache-performance-tuning

The top of my list is RAM, using the right MPM, configuring the MPM for your situation, setting KeepAliveTimeout to 2 seconds, and separating static and dynamic content onto different servers using either Tux or lighttpd.

Usually, the biggest issue is the default MPM settings, which are almost always set too high. You want the server to utilize all available resources without crashing the system or making it unresponsive.

Peufeu

> spoon feeding

This is even worse when you read the PHP docs and come away believing you’re better off without output buffering. Then you start a transaction, emit some HTML, and the PHP page sits there waiting for the client to receive the data, all while holding its transaction locks.
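A simple mitigation (illustrative php.ini setting) is to enable output buffering, so the script can finish and release its locks instead of dribbling the page out to a slow client:

    ; php.ini – buffer the whole page in memory and send it when the script ends
    ; (a very large page can still block on the final write, but the window shrinks a lot)
    output_buffering = On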

Apache

How about Apache tuning – disabling modules that are not needed? Nice tips though!

Eric

Thanks very much, Peter.

Thanks to this article, I found the cause of our site slowing down even while the db server and web server were under light load.

The Timeout setting in httpd.conf was doing great harm, leaving many MySQL connections in the sleep state for a long time (up to wait_timeout).

The original Timeout was set to the default (300 seconds). Some terribly slow HTTP connections consumed most of that limit while being served by our web server, so there were many busy child processes even though the actual traffic was not overloading the server.

So the MySQL connections opened by those slow HTTP connections reached wait_timeout, and MySQL aborted them.

There were about 200 Aborted_clients per hour in the MySQL error log.

After lowering the Timeout directive in httpd.conf, everything seems to be OK.

MySQL connections now stay in the sleep state only for a short duration, and there are no aborted clients in the MySQL error log.

The site opens fast now.

I have some relationship with the chief architect of Naver (the no. 1 Korean search portal).

The Naver guru explained to me that the main bottleneck of a web server serving only PHP scripts is CPU, not RAM.

The best modern dual-CPU machines can handle at most 200 concurrent Apache child processes, so Naver prepares to add a new PHP-dedicated web server whenever concurrent Apache connections reach 100 at peak time.

The daily page views handled by a single PHP web server there are around 2 million.

So I think 100 concurrent Apache child processes, in a well-arranged server architecture (with roles separated across multiple servers – a separate static file server, a db server with highly optimized queries and schema), may be closer to optimal.

Anyway, thanks for your great articles.

I always benefit from them.

I had no experience in web development a year ago; however, I am now struggling to build a scalable and stable large web site (the no. 1 social networking site in China). In China it is really hard to find a guru on MySQL and optimization, because there are so many web sites cheating on traffic.

Many Chinese web sites do not have real traffic – they only cheat Alexa traffic and do evil. So there are few gurus on optimization and scalability.

Eric

Moreover, I have one question on the optimal number of concurrently active MySQL connections.

If all the queries required by each PHP script on a web site are optimized as well as they can be, what is the optimal number of concurrent MySQL connections under load?

If a single query is well optimized, it takes only about 0.005 seconds.

If so, one connection can run roughly 1/0.005 = 200 queries per second, so 100 concurrent MySQL connections could handle about 20,000 queries per second.

The front-end queries do not involve complex GROUP BY or extra filesorts at all.

Of course, the optimal number of concurrent MySQL connections may depend on the available RAM and the speed of the disks.

I just want to know a rough overall number, to predict when we should split the master db server into several.