August 1, 2014

Predicting Performance Improvements from a Memory Increase

One common question is how much performance improvement to expect from increasing memory, say from 16GB to 32GB. The benefit is very application dependent. If you have a working set of, say, 30GB with uniform data access, raising memory from 16GB to 32GB can improve performance by an order of magnitude, converting a very IO-bound load into a CPU-bound one. It is also quite possible to see limited gains: if your working set already fits in 16GB, you may not see any significant improvement from upgrading to 32GB. Interestingly enough, something similar can happen with a very large working set. For example, if your main queries do full table scans of a 100GB table, it does not matter whether you have 16GB or 32GB; the load is going to be heavily IO bound either way.

Interestingly enough, because of MySQL scalability issues it is also possible to see performance go down as you increase the buffer pool size. Threads which would otherwise be safely sleeping, waiting on IO completion, now find their data in the buffer pool, so they start to compete for hot latches and performance goes down.

Now back to the original question: how do we predict the benefit of increasing memory, and hence cache sizes? I typically start by looking at the type of load we're dealing with. If it is CPU bound and there is little IO wait, we typically do not expect to gain much by increasing memory.
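As a rough illustration of this first check, here is a minimal sketch that classifies a load from vmstat-style CPU percentages (us, sy, id, wa). The thresholds and the sample line are illustrative assumptions, not fixed rules; on a real system you would look at the actual vmstat output.

```python
def classify_load(us, sy, idle, wa):
    """Rough label from CPU user/system/idle/iowait percentages.
    Thresholds are illustrative, not universal rules."""
    if wa >= 30:
        return "IO bound"
    if us + sy >= 70:
        return "CPU bound"
    return "mixed / idle"

# Example: the last four CPU columns of a vmstat report (us sy id wa)
sample = "5 2 10 83"
us, sy, idle, wa = (int(x) for x in sample.split())
print(classify_load(us, sy, idle, wa))  # -> IO bound
```

A load with high "wa" and low "us"+"sy" is the case where adding memory is most likely to pay off.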

This, however, has to be watched carefully. Performance does not always stay the same, and the goal may not be to optimize average performance. There may be a heavy data processing batch job which is IO bound and runs too slowly (affecting other transactions), and increasing memory may be what solves that problem.

If the load is IO bound (high IO subsystem utilization, low CPU usage), you should think about how much CPU capacity is available. If your CPU is only 25% busy, you are unlikely to get more than a 4x improvement even if you eliminate all IO completely (and eliminating it completely is unlikely, because the IO path carries CPU overhead as well), so account for that.
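The 4x figure above is an Amdahl-style upper bound: if the CPU is busy a fraction f of the time, removing all waiting can at best shrink the run time to that busy fraction. A small sketch of the arithmetic:

```python
def max_speedup_from_io_elimination(cpu_busy_fraction):
    """Upper bound on speedup if all IO wait were eliminated and the
    workload became purely CPU bound: 1 / cpu_busy_fraction."""
    if not 0 < cpu_busy_fraction <= 1:
        raise ValueError("busy fraction must be in (0, 1]")
    return 1.0 / cpu_busy_fraction

print(max_speedup_from_io_elimination(0.25))  # -> 4.0 (the 25%-busy case)
```

The real gain will be lower, since serving pages from the buffer pool still costs some CPU.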

Besides pure CPU-based computation, you should account for locking. Consider, for example, a bunch of transactions all updating a single row in a table. With such a workload you would likely see no IO and a lot of idle CPU, not because of internal InnoDB limits but because of your application's logical serialization problems.

So what if we have a very IO-bound application without serialization issues (say, a reporting slave) showing 100% IO subsystem utilization at 5% CPU usage? This is where the true challenge begins, because MySQL has no tools to analyze the working set (we have per-query working set statistics in our patches, but that is not enough). We have a couple of ideas on how to do global working set profiling, but that will have to wait for now.

At this point I typically use intuition to guess how much data the application actually touches, to get some ballpark figure, and often that is enough.

If you would like to be more scientific, there are a couple of other things you can do. First, you can test by scaling down the data. If you have data for, say, 500,000 users on a 16GB server, cut it down to half of that and you will often come close to the performance a 32GB server would show. You do, however, have to be careful and understand how the data is used in your application. If you have data for 10 years and load data for only 5 years to compare performance, you may get misleading results if reports are typically run over the last few months. Basically, in such an exercise your goal is to load data so that the working set is half of the original, giving a cache fit similar to the larger system you're trying to compare to. Using this approach you should also be careful with your estimates and take the IO subsystem into account: even with the same cache hit ratio, more data and more load mean higher demands on IO subsystem performance.
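One way to cut the data in half without distorting the access pattern is to sample whole users deterministically, so each kept user's entire history stays together. This is a sketch under an assumed hashing scheme (the `in_half_sample` helper is hypothetical, not an existing tool):

```python
import hashlib

def in_half_sample(user_id):
    """Deterministically keep roughly 50% of users, so that all rows
    for a kept user survive together (hypothetical helper; the MD5
    scheme is an illustrative assumption)."""
    digest = hashlib.md5(str(user_id).encode()).hexdigest()
    return int(digest, 16) % 2 == 0

# Sanity check: over many users we keep close to half of them.
kept = sum(1 for u in range(100000) if in_half_sample(u))
print(kept)  # roughly 50000
```

Sampling by user (rather than, say, by date) preserves the shape of per-user queries, which matters for the cache-fit comparison described above.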

By far the best method is simply trying it, if you can afford it: get the memory upgrade and see how it affects performance. With many vendors you can get a memory upgrade, or a whole system, on a trial basis and return it if it does not fit your needs. This approach works especially well if you have many slaves (or many shards), in which case you can see performance or capacity improvements quite easily.

About Peter Zaitsev

Peter managed the High Performance Group within MySQL until 2006, when he founded Percona. Peter has a Master's Degree in Computer Science and is an expert in database kernels, computer hardware, and application scaling.

Comments

  1. Matic says:

    I presume increasing the physical memory size will not affect the performance of MySQL Cluster (NDB engine), since the entire dataset and indexes are already in memory?

  2. Ewen says:

    Matic,

    The NDB engine supports data on disk and in memory.
    http://dev.mysql.com/doc/refman/5.1/en/mysql-cluster-disk-data.html

    The important factor to consider when upgrading memory on an NDB Cluster is that the data nodes all need to run with the same memory specification, therefore any memory upgrade must occur across all data nodes.

  3. Matic says:

    Ewen, yes, but if all the tables and columns are memory based and the entire dataset can already fit into memory, will increasing the memory on all servers affect performance or not? Will the excess memory be used or wasted? Are there any benefits to having more memory installed than the size of the dataset? In that case, the number of replicas could be increased from 2 to 3 (for example). Would that speed up queries?

  4. Ewen says:

    Hi Matic,

    I am not as current as I would like to be with NDB Cluster, but certainly more memory would allow more resources for transaction and query handling. Look out for configuration parameters with a MaxNoOf prefix, like MaxNoOfConcurrentOperations. Increasing the number of replicas will not increase speed; however, increasing the number of nodes, and therefore fragments per partition, should increase speed.

  5. Gil says:

    Hi Peter,

    This is one thing I’ve always been curious about: in your post you mentioned “100% IO subsystem utilization.” Is this a theoretical number, or is it possible to determine the exact percentage of IO utilization on a Linux system? Is it something you can gather from the iostat command?

  6. peter says:

    Gil,

    It is tricky. If you have a single drive, 100% utilization as reported by iostat actually means the drive is 100% busy, i.e. there is at least one outstanding request all the time. However, even with a single drive, having multiple outstanding requests allows the drive to execute them in a better order, driving throughput higher than when it was 100% busy with only one request at a time.

    When you get to RAID it becomes very complicated, because now you need multiple requests in parallel to load the system, and they have to land on the right drives – otherwise some drives may have spare capacity while others are pegged.

    In practical terms I look at “wa” in vmstat, the iostat -dx utilization rate, as well as service time and average time in queue. When the IO system becomes overloaded, the average time in queue can increase dramatically.
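    To make the figures Peter mentions concrete, here is a small sketch that pulls %util and await out of one iostat -dx data line. The sample output and column positions follow the classic sysstat layout and are an assumption – the column order varies between sysstat versions, so match against your own header row.

    ```python
    # One header row and one device row, as printed by older iostat -dx
    header = "Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await svctm %util"
    line = "sda 0.00 3.50 120.00 15.00 4800.00 150.00 36.67 8.20 60.90 7.30 98.60"

    # Build a name -> value map, skipping the device-name column
    cols = header.split()
    vals = line.split()
    row = dict(zip(cols[1:], vals[1:]))

    await_ms = float(row["await"])   # average time in queue + service, ms
    util_pct = float(row["%util"])   # fraction of time device had requests
    print(util_pct, await_ms)  # -> 98.6 60.9
    ```

    A device near 100% util with a sharply rising await is the overload signature described above; util alone can be misleading on RAID for the reasons Peter gives.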
