Sometimes I see people thinking about buffers as “larger is always better”: if the “large” MySQL sample configuration is designed for 2GB of RAM and they happen to have 16GB, they simply multiply all or most values by ten and hope it will work well.

Obviously, it does not. The least of the problems is wasted memory: allocating a lot of memory for operations which do not need it. Allocating 10MB to sort 10 rows will not make the sort any faster than allocating 1MB; both are more than enough.

However, oversizing may not only waste memory; you may also see some queries actually perform slower, and not because the system starts to swap. Generally you want buffers and other values sized “just right”: working with smaller data structures improves cache locality, makes it easier for the OS to manage memory, and can provide quite unexpected improvements.

sort_buffer_size – I recently worked with a case which ran much faster with a 32K sort_buffer_size than with 32M. The problem was memory allocation: sorting was used in a correlated subquery which was executed hundreds of thousands of times, but only a few rows were sorted each time. A 32K block is allocated entirely in user space, while a 32M block requires an mmap() call and a lot of work with page tables. In this extreme case the performance difference was 9 seconds vs 8 minutes – about 50 times. I would not call it typical, but watch out. To some extent this also raises the question of why the sort buffer is allocated and freed for each query at all, when it is used by so many queries.

Here is an example, if someone wants to play.

Populate it with 100,000 rows, having about 100 rows with d=50 and some 10 distinct c values.
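The original table definition did not survive in the post; here is a minimal sketch of what such a demo setup could look like, assuming a MyISAM table with the columns c and d referenced above (the schema, table name, and populate procedure are my assumptions, not the original code):

```sql
-- Hypothetical demo table; the exact original schema is not shown in the post.
CREATE TABLE t1 (
  id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  c  INT NOT NULL,   -- ~10 distinct values
  d  INT NOT NULL    -- ~100 rows will have d = 50
) ENGINE=MyISAM;

-- Populate with 100,000 rows: i % 10 yields 10 distinct c values,
-- and i % 1000 = 50 for exactly 100 of the 100,000 rows.
DELIMITER //
CREATE PROCEDURE populate_t1()
BEGIN
  DECLARE i INT DEFAULT 0;
  WHILE i < 100000 DO
    INSERT INTO t1 (c, d) VALUES (i % 10, i % 1000);
    SET i = i + 1;
  END WHILE;
END //
DELIMITER ;
CALL populate_t1();

-- A query of the shape described: a correlated subquery executed once per
-- outer row, each time sorting only a handful of rows (illustrative only).
SELECT o.c,
       (SELECT i.d FROM t1 i WHERE i.c = o.c AND i.d = 50
        ORDER BY i.id DESC LIMIT 1) AS last_d50
FROM t1 o;
```

With a setup like this you can compare runtimes after `SET SESSION sort_buffer_size = 32*1024;` versus `32*1024*1024;` to observe the allocation effect described above.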

read_buffer_size and read_rnd_buffer_size – these are buffers used by MyISAM to perform reads in different scenarios. So should we go ahead and set them to 64M? Not really – this can result in nasty surprises with full-table-scan LIMIT queries, as well as in some other cases. If you run SELECT * FROM LARGETABLE LIMIT 5, MyISAM will allocate the buffer and populate it fully, while MySQL will read only 5 rows from the buffer and throw away the rest.
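Since these buffers are per-session variables, one way to avoid penalizing every query is to keep the global defaults modest and raise them only in the session that actually needs a big scan. A sketch (the sizes and the query are illustrative assumptions, not recommendations):

```sql
-- Raise the read buffers only for this session's large scan,
-- then drop back to the (modest) global defaults.
SET SESSION read_buffer_size     = 8*1024*1024;
SET SESSION read_rnd_buffer_size = 8*1024*1024;

SELECT COUNT(*) FROM LARGETABLE;   -- hypothetical large sequential scan

SET SESSION read_buffer_size     = DEFAULT;
SET SESSION read_rnd_buffer_size = DEFAULT;
```

Setting a session variable to DEFAULT restores the global value, so short LIMIT-style queries in other sessions keep their small, cheap buffers.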

These are just some examples; I’m quite sure there are others. For example, too large a table_cache may slow your opens and closes down, and too large a query_cache may cause long stalls on invalidation.
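Putting it together, a “just right” configuration keeps the per-session buffers small and grows only what the workload demonstrably needs. A sketch of such a my.cnf fragment (all numbers are illustrative assumptions, not tuned recommendations for any particular server):

```ini
[mysqld]
# Per-session buffers: allocated per query/scan/join, so keep them modest
sort_buffer_size     = 256K
read_buffer_size     = 128K
read_rnd_buffer_size = 256K
join_buffer_size     = 256K

# Shared structures: oversizing has its own costs, as noted above
table_cache          = 512    # too large slows table opens and closes
query_cache_size     = 32M    # too large causes long invalidation stalls
```

Individual sessions that genuinely benefit from larger buffers can raise them with SET SESSION rather than paying the cost globally.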

Configure it smart. Larger is not always better 🙂

5 Comments
Apachez

Do there exist any graphs/tables with benchmarks of different buffer sizes in MySQL 4.x/5.x, like the graphs for the mmap feature of 5.1 (or was it 5.2?) in this blog?

Because I have the impression that the “demo configs” which are included with MySQL are somewhat outdated (I’m thinking of the huge, large, etc. configs which are included with the MySQL installation for demonstration purposes), while my own tests (not that scientific, but still :P) show that a relation of:

sort_buffer_size = 8M
read_buffer_size = 2M
read_rnd_buffer_size = 32M
join_buffer_size = 8M

seems to be a good starting point, performance-wise, for the buffers (compared to the default values).

Or is the 32K example in this blog post just a special case of performing a count inside a subquery along with a count in the outer query (which is most likely not that common)?

Apachez

The benchmarks in my case were more manual, like in SQLyog and when running some of my applications (where in the end I could see the throughput of the queries).

What would be nice is if there existed some kind of “recommended” values when using MySQL Administrator.

I mean a specific page in MySQL Administrator which, based on the output from “SHOW STATUS” etc., could hint the admin at which values “might” be good to use for buffers and so on, from a theoretical point of view (there exist a couple of formulas regarding “recommended” settings in the manual).

I know flupps, for example, has a couple of “recommended” configs depending on the use of the server in question.

Fedge

Thanks for doing all of these benchmarks and actually sharing the results with everyone. You’ve saved me a lot of work: I can show this stuff to my sysadmins without having to do a bunch of benchmarking myself all the time to show them which settings I need. For a lot of these tests, though, the tables used are rather simplistic. Results might not be the same for matching against multiple columns with different values in the same query as they are for a table with one column. The only way to know is to look at the relevant MySQL source code or run your own benchmarks on your own tables. These are a great starting point, though; they get the basics out of the way, and it’s a big timesaver for everyone who needs to work with big databases.