April 16, 2014

MySQL Wish for 2013 – Better Memory Accounting

With Performance Schema improvements in MySQL 5.6 I think we’re in the good shape with insight on what is causing performance bottlenecks as well as where CPU resources are spent. (Performance Schema does not accounts CPU usage directly but it is something which can be relatively easily derived from wait and stage information). Where we’re still walking blind with MySQL is resource usage – specifically Memory Usage.

I can’t count how many time I had to scratch my head with system configured to consume only few GBs in global buffers growing to consume much more for some unknown needs leaving me puzzled whenever it is user variables, complex stored procedures temporary tables or something else. Not only such connection related allocations are invisible but many global allocations are poorly visible too. Sure, we know how much memory was allocated for Innodb Buffer Pool or Query Cache but amount of memory used for example for Table Cache can be only guessed.

Not only lack of proper memory accounting is operational problem, it is problem for QA too as it makes it very hard to ensure the memory consumption by various database operations is reasonable. For example if some INFORMATION_SCHEMA query were to take couple of GB of memory we might not find it reasonable yet it is not a “memory leak” as it is freed after operation is completed.

What I would like to see is similar to Performance Schema “wait names” we could get something similar in terms of memory allocation target, which could be something like “Prepared Statement Cache”, “Table Cache” etc. For global pools we need just global accounting, for connection basics accounting should be done per connection but aggregated globally. This way it would be possible to see for example which connection is using most memory. We could also aggregate per purpose to understand what memory is currently used for.

One thing to note with memory allocation is it is rather volatile – a lot of memory might be allocated for very short period of time and hence not very well visible. To assist catching these situations it is good idea to track not only current amount allocated but also maximum, which can be reset as needed.

If such functionality is implemented not only it can help DBA to configure memory allocation a lot more successful but I also believe similar to Performance Schema helping to find many performance bottleneck such profiling will help to find a lot of wasteful memory allocation and help to get MySQL on the memory diet. It would be easy to extend QA suite to check for peak memory usage after each test and hence find memory usage regression – when new code unexpectedly requires a lot more memory than previous variant.

If memory accounting is implemented it can be used to help restricting memory usage in addition to monitoring it which I’m sure hosting providers, MySQL as a Service vendors will especially appreciate as currently they are on complete mercy of their users not being able to limit memory usage by individual MySQL users or even track how much these users are using.

Now as MySQL 5.6 is getting close to be done I imagine the planning and feature design is long started for MySQL 5.7 (or whatever next release is going to be called) and I really hope something along those lines will be picked up for inclusion. If not perhaps this is opportunity for MariaDB to pave the road ? Whatever way I’m sure MySQL community will appreciate it.

Happy New 2013 Year everyone !

About Peter Zaitsev

Peter managed the High Performance Group within MySQL until 2006, when he founded Percona. Peter has a Master's Degree in Computer Science and is an expert in database kernels, computer hardware, and application scaling.

Comments

  1. Mark Leith says:
  2. Mark,

    Thanks. This looks like a great start covering a lot of what I’m looking for. I would note though design mentions MEMORY_POOL is “usually NULL” even though this is one of the most important pieces. Though I understand the challenge of going over the code and changing malloc/free calls to track the purpose for which memory allocation happens. This is the reason we did not tackle it in Percona Server as amount of changes to do it well would be way too large.

    Also accounting by Thread might not work very well considering you might have multiple connections sharing the same thread, though at the same time there are different system threads which also good to be tracked yet they do not have connection.

    Finally I guess INT in those specs is going to be BIGINT as INT is way to small for many allocation types

  3. Mark Leith says:

    Yea it could potentially turn in to a large patch (many of the P_S instrumentation patches turn out like that).. Not having a connection doesn’t matter for P_S either, since all background threads are instrumented as well.

    As Marc notes at the start of the worklog, this still needs further clean up of the spec, and we have to be careful of overhead concerns tracking every little malloc that could happen etc too.

    It’s certainly something we’ve talked over though, as the spec shows. :) Can’t guarantee exactly when it may get done, however.

    And yea, it would be a BIGINT, as with all other columns in P_S like this now.

  4. Mark,

    Yes I understand about background threads. What I mean is when multiple connection share the same thread because of pooling mode it is the most interesting to see information per connection not per thread.

    The overhead is always the challenge in instrumentation. I think you have achieved great results with Performance Schema – frankly I was expecting more overhead. For some small allocations MySQL allocates data on stack which might not need to be instrumented and even for standard heap allocations timing might not be very important which is the more expensive part I assume.

    I understand you can’t guaranty when it is going to be done but it is great to hear planning work is going on.

  5. When it comes to MySQL and memory usage, I’d like to see totally different approach to tuning the memory usage. There are dozens of knobs to configure the amount of memory for each particular need, but what I’d really want is to be able to say to MySQL: “Here’s 10GB of memory for you, use it as you see fit”. I don’t really want to specify that 8GB should be used for the innodb_buffer_pool_size, some memory for the sort and read buffers, some memory for the query cache, some memory for the temp tables etc etc. MySQL SHOULD know better where it needs the memory the most. If it doesn’t, it’s a bug which needs to be fixed. The memory allocation could also be dynamic, e.g. if there’s some operation which requires more memory, it could borrow that memory from, say, the buffer pool, and then reallocate the memory again when the memory-consuming operation is finished. I have learned the proper amounts of memory that should allocated to each particular functionality over the years, but I feel it has been a somewhat pointless excercise. I would much rather let MySQL figure out how it should use the resources allocated to it.

  6. Roel Van de Paar says:

Speak Your Mind

*