July 22, 2014

Why you should ignore MySQL’s key cache hit ratio

I have not caused a fist fight in a while, so it’s time to take off the gloves. I claim that somewhere around of 99% of advice about tuning MySQL’s key cache hit ratio is wrong, even when you hear it from experts. There are two major problems with the key buffer hit ratio, and a host of smaller ones. If you make some assumptions that are very hard to prove, there actually is a very limited use for the statistics from which the ratio is derived (but not the ratio itself, which you should ignore). Read on for the details.

In this article, I will use key buffer, key_buffer, and key cache interchangeably. However, I will be careful about the difference between “rate” and “ratio”. In this article, the key cache miss rate is defined as the number of misses per unit of time, with the units of operations per second. The key cache miss ratio is the ratio between reads from the disk and reads from the cache; it is a dimensionless number because the units are the same in the numerator and denominator, and they cancel out. The key cache hit ratio is simply one minus the miss ratio.

The key_buffer hit ratio

The metrics we’re interested in are all defined in terms of counters you can get from SHOW GLOBAL STATUS. I’ll start out by copying and pasting from the MySQL manual:

  • Key_read_requests

    The number of requests to read a key block from the cache.

  • Key_reads

    The number of physical reads of a key block from disk.

So far, so good. All of the above is mostly factual (more on this later). Here’s another fact from the manual:

The cache miss rate can be calculated as Key_reads/Key_read_requests. [Note: their use of 'rate' doesn't match mine; I would call this the 'miss ratio'].

The problem occurs when you start to assign any importance to this ratio. The MySQL manual doesn’t fall into this trap, but if you search Google for Key_read_requests, you will find lots of advice on “tuning by ratio,” including phrases such as “The ratio of key_reads to key_read_requests should be as low as possible, no more than 1:100″ or “your key cache hit ratio should be very high, ideally above 99.9%” or “if your key cache hit ratio is low, your key_buffer_size is too small.”

So here’s a summary of two pieces of bad advice:

  • Bad advice #1: you should care about the key cache hit ratio.
  • Bad advice #2: you should set your key_buffer_size according to this ratio.

Tuning by ratio is one of those things that is widely accepted because of “proof by repeated assertion,” but as you know, that doesn’t make it valid. Let’s see why the above two things are bad advice.

Problem 1: Ratios don’t show magnitude

Ratios of counters are virtually meaningless for two major reasons. The first is that ratios obscure magnitude. Look at the following and tell me what you think:

  • Server #1 has a key cache miss ratio of 23%
  • Server #2 has a key cache miss ratio of 0.001%

Which server is badly tuned? Maybe you’re objecting that you don’t know enough about the servers. OK, I’ll give you more information. Both servers are real production servers, with powerful hardware and heavy query workloads. Both have a key_buffer_size of 4GB. Now can you tell which server is badly tuned? — No, you can’t tell anything meaningful based on a ratio of counters, because the process of dividing one counter by the other to get the ratio has discarded vital information. You don’t know how many Key_reads and Key_read_requests those servers have done.

Just for fun, consider these fake but entirely possible scenarios: server #1 has 23 Key_reads and 100 Key_read_requests (23% miss ratio). Server #2 has one trillion Key_reads, and one hundred quatrillion Key_read_requests (1/100th of a percent). Given that information, which server is badly tuned? If you said “I still can’t tell,” maybe you want more information, so I’ll tell you that both servers are identically tuned, and they have identical data, hardware, and workload. Even that doesn’t help, though.

Problem 2: Counters don’t measure time

The reason you still can’t tell which server is badly tuned is because, even if you know the absolute numbers, you are missing the element of time, in two important ways. First, you don’t know how long of an interval I used to measure the statistics on those two servers. Maybe I measured the first server immediately after starting it, and that’s why its counters are so small. The second server has been online practically forever, and that’s why its counters are big. Let’s say this is the case. Now, you’ve got all the information you need to form an opinion, right? Instead of asking the same annoying question, let me ask it a different way: is either of these servers badly tuned?

There’s still not enough information — I hope you’re beginning to appreciate that tuning by ratio is a waste of time! The Oracle folks arrived at this conclusion a long time before the MySQL world started to come around. There are even tuning utilities (anti-tuning anti-utilities?) for Oracle, specifically designed to mock and frustrate those who would tune by ratio. They are capable of creating any buffer hit ratio the user desires by running silly queries that do nothing but cause buffer hits, skewing the result towards “this ratio looks great!”

The second kind of time information you’re lacking is how much time each buffer hit or miss takes. If you approach application performance optimization from the standpoint of response time measurements, which you should, you will eventually arrive at this question. “I have a query I know is slow and is a problem for my application. I have profiled it with SHOW STATUS and I know it causes a great many Key_reads to occur. How much of this query’s execution time is consumed by those operations? Should I try to reduce Key_reads?”

There is no way to know. All you get is counters — you don’t get the time elapsed. In technical terms, counters are surrogate measures. They are not helpful. And as Cary Millsap says, the unfortunate problem is that surrogate measures work sometimes, simply because there is sometimes a correlation (but not a cause) relationship between the counter events and the query’s execution time. Alas, that correlation fools us into thinking it’s a cause, and we optimize-by-surrogate-measure a time or two and it appears to work — so we turn into little Pavlovian DBAs and try to do that every time. It would be better if optimizing-by-counter never worked!

A partially valid use of Key_reads

There is a partially valid reason to examine Key_reads, assuming that we care about the number of physical reads that occur, because we know that disks are very slow relative to other parts of the computer. And here’s where I return to what I called “mostly factual” above, because Key_reads actually aren’t physical disk reads at all. If the requested block of data isn’t in the operating system’s cache, then a Key_read is a disk read — but if it is cached, then it’s just a system call. However, let’s make our first hard-to-prove assumption:

  • Hard-to-prove assumption #1: A Key_read might correspond to a physical disk read, maybe.

If we take that assumption as true, then what other reason might we have for caring about Key_reads? This assumption leads to “a cache miss is significantly slower than a cache hit,” which makes sense. If it were just as fast to do a Key_read as a Key_read_request, what use would the key buffer be anyway? Let’s trust MyISAM’s creators on this one, because they designed a cache hit to be faster than a miss.

  • Hard-to-prove assumption #2: A key cache miss is probably slower than a hit, maybe.

What else? Maybe this physical I/O operation is randomly positioned, which is a worst-case scenario for spinning disks. This is also very hard to prove, but seems reasonable based on the structure of a B-tree index, so let’s assume anyway:

  • Hard-to-prove assumption #3: A Key_read might cause a random I/O operation, maybe.

Now, given those assumptions, we can further assume the following:

  • It is good to minimize Key_reads because they are slow and cause random disk I/O.

Notice that we still don’t know anything about any relationship between Key_reads and the execution time of our query. All we can do is guess, like good Pavlovian DBAs, that there is a relationship. However, we can again reason that random I/O can cause collateral damage: if the disk head is seeking all over for random I/O, then other I/O (including non-random I/O) is likely to be impacted. If we manage to reduce Key_reads, we might make the database server faster overall, and perhaps the query of interest will accidentally get faster too, and we’ll get a treat.

There is one interesting question that we haven’t really addressed yet. How bad is bad? This is where we return to the notion of the key cache miss rate in units of operations per second. Given our assumed correlation between a Key_read and a random physical disk I/O, it is partially valid to say that we are going to get in trouble when Key_reads gets close to the number of random I/Os our disk can do. Here’s another formula for you:

Key_cache_miss_rate = Key_reads / Uptime

Note the conspicuous absence of Key_read_requests in the formula. The number of requests is absolutely irrelevant — who cares how often the key is requested? What’s relevant is that our assumed connection between Key_reads and random I/Os means that Key_reads/Uptime is assumed to be the same as “random I/Os per second.”

And now, I would finally like to show you something partially useful you can do with Key_reads:

This server is doing approximately 100 Key_reads every ten seconds, so we can assume Key_reads are causing about ten random I/Os per second. Compare that to what your disks are capable of, and draw your own conclusions about whether this is a performance problem. I know what I’d like: I’d like to ask the disk itself how much random I/O it’s doing. But alas, that’s virtually impossible on most systems I work on. So there you have it — yet another surrogate measure.

How to choose a key_buffer_size

Let’s recap. So far I’ve shown you the fallacy of tuning by ratio, and told you to ignore the ratio and in fact, ignore Key_read_requests altogether. I’ve explained that counters are a surrogate measure, but the fact that they’re easy to get and sometimes correlated with the true problem causes people to mistake counter analysis for a true performance optimization method. I’ve shown that if we make some assumptions that are hard to prove, we can compare Key_reads to the disk’s physical capacity for random I/O and get an idea of whether index I/O might be causing a performance problem.

But I haven’t shown you how to choose an appropriate key_buffer_size. Let’s look at that now.

This topic deserves an entire blog post, because there are many subtleties including the possibility of having multiple key caches. But I’ll give the simple version here. In my opinion, you should choose a key_buffer_size that is large enough to hold your working set — the index blocks that are frequently used. How large is that? This is yet another thing that’s really hard to measure, alas! So we need to either pick a surrogate, or pull a number out of thin air. Here are some suggestions that are about as good as any:

  1. Just set it really big and forget it. If you have enough memory, who cares. The memory isn’t allocated until it’s used — if you set it to 4GB, that doesn’t mean that 4GB is actually used. This is not as much of an abdication of responsibility as it might sound like on a machine that’s dedicated to MyISAM tables.
  2. Consider your mixture of storage engines (some InnoDB, some MyISAM, which is more important to you, etc) and choose an amount of memory based on how important those tables are to you, how big the indexes are on disk, and so on.
  3. Raise the key_buffer_size until, when the buffer is full, Key_reads/Uptime reduces to a number you’re comfortable with.
  4. Set key_buffer_size really big, and then measure its size as it fills up, in something like 1 minute intervals. Pull this into a spreadsheet and graph it as a curve. When the curve’s growth tapers off, pick that point on the curve and use it as a heuristic for how big your working set is. Set the key_buffer_size to that.

If the above methods shock you with their unscientific-ness, they shouldn’t. The reality is that this server setting is very subjective, and there is no good instrumentation in MySQL to guide your decisions. It is also not the be-all and end-all of MySQL performance, and people frequently obsess over it far out of proportion. But again, 99% of the advice I’ve seen is based on something much worse: a red herring that only sounds scientific and authoritative — the “key cache hit ratio.” This is a shame. When you are new to MySQL, trying to configure my.cnf, and you have heard guidance that seems so definite, mathematical, and authoritative, but still makes no sense, why wouldn’t you obsess over it?

What about InnoDB tuning?

You might be wondering, what about InnoDB tuning? What is the best way to choose an innodb_buffer_pool_size setting? This is a topic that deserves its own article too, but the short version is: ratio-based tuning is just as wrong for InnoDB as it is for MyISAM. Ratio-based tuning is invalid and wrong in general, not just for specific things. All of the above points (loss of magnitude, lack of timing information, etc) apply to all types of ratio-based and counter-based tuning techniques.

Summary

Major points in this article:

  • Counter ratios are meaningless as a performance analysis metric because they are a) ratios of b) counters.
    • You need to look at absolute magnitude, not ratios.
    • For performance analysis, you need to measure elapsed time, not just the number of times something happens.
  • In the absence of timing information, and if you trust the cache’s creator and assume that misses are more expensive than hits, then you care about cache misses, not cache hits.
  • A Key_read is not guaranteed to be a random physical I/O, but it might be. Gather Key_reads/Uptime over 10-second or 60-second intervals and compare the results to your IO system’s capabilities.
  • MySQL doesn’t have good instrumentation for scientifically choosing a key_buffer_size setting, but there are many unscientific approaches that are better than ratio-based tuning.
  • Counter ratios suck for everything, not just for MyISAM tuning.

I’ve had some heated arguments over these points, so I don’t expect the above to pass without controversy. But really, it’s time to stop with the bad advice about counter ratios. The sooner we do that, the sooner we can move on to better things.

About Baron Schwartz

Baron is the lead author of High Performance MySQL.
He is a former Percona employee.

Comments

  1. Baron,
    Excellent and mind provoking post.
    “Counter ratios suck for everything, not just for MyISAM tuning.” — it took me a very long time, and although I got good advice months ago, to realize that counter ratios actually say little, and still I’m falling for this.
    I would like to note, tough, that they are actually useful, when combined with at least either the numerator or denominator. The combination of an absolute number with ratio gives you the wider picture.

  2. Patrick Casey says:

    If you operate under the assumption that most people googling for “tune mysql” don’t have the background to understand the details of their IO stack and just want their web site to go faster, I don’t see any harm in giving them a simplified “rule of thumb” to tune by. Sure it’ll be wrong in various edge cases, but in the 80% case if you tune you key-cache by ratio you’ll get a better result than if you don’t tune it at all.

  3. What is an example of what it is useful for?

  4. Purely mathematically, when you have a/b and you have b, then you can deduce a. But I think I created a late hour, poor argument; I’ll drop it.

  5. Tom Hanlon says:

    Well since you are seeking controversy, I will bite. But before we engage in a fist fight, let me say .. Good post.

    We probably agree on this, but since you are being purposely provocative, I will as well.

    So let me say this..

    Suppose that Random IO for reading keys is our bottleneck. It is this issue that is slowing down our server.

    Suppose our Database needs to read 100,000 keys a day, total. The keys can either be in Cache or on disk.

    Now… Tune this server by allocating RAM that you have available for the key cache, and do your tuning in such a way as to _not_ improve the standard metric of ratio of key_read_requests to key_reads.

    Get back to me when you have screen shots to show you can go from poorly tuned to better tuned without affecting the ratio in a way that supports the standard argument.

    That said, I always look at the frequency of what I call “expensive operations” Key_reads, opened_tables, created_tmp_disk_tables more than the ratio of the “expensive operation” to the “cache hit”

    I must say that I like method 4, make it really large and measure growth over time to determin “working set” very nice.


    Tom

  6. Hi Baron

    Really good article.
    I think that I kind of noticed that a high Key Cache Hit percent wasn’t an absolute indicator when I saw that some of the important queries the site ran were still quite slow.
    So I relied on a good-ish cache hit percent and then went over my slowlog.

    I have to say that I will definitely use your formula of key_read/uptime.
    Again, really good article.

  7. Baron — counters and % indeed are not as good as “how many per time”. Take a look at mysqltuner 2.0 — https://launchpad.net/mysqltuner — I designed it so you can easily get the count, %, and also the rate of variables, and you can set the threshold based on the output.

    (“temp tables written to disk” is a better example of this — you can have 0.01% of temporary tables written to disk, but that might be 100 per second. The default mysqltuner config sets the threshold as >=1 per minute or >=1 per second…so for example, if you have 20% of temporary tables written to disk but that’s only 2 per hour, you’re not warning on anything).

  8. (3rd attempt at posting: think there’s a bug with your wordpress)
    I was making a late night poor argument. Purely mathematically, when you have a/b and b, then you can deduce a.

    Regardless, looking at read hit ratio and write hit ratio, I see good correlation to the number of read/write operations, but even better, a good correlation to the number of read/written rows. Not pure science, but gives some general view; what with the lack of exact counters for MyISAM rows (existent for InnoDB).

  9. Patrick, can you substantiate that assertion more? In my experience, “tuning by ratio” leads to wildly stupid results that consume large amounts of effort barking up the wrong tree. If you were able to calculate the true cost to the business of such “tuning,” I’d say most of those people would’ve been better off not touching their default my.cnf at all. So I’d flip things around: in 80% of cases they spend a ton more time on it than if they’d just wet a finger and poked it into the breeze: “I have 4GB of RAM and 1GB of indexes on disk, so 256MB of key cache might do OK.” And after spending that time they have no better result, so I’d say in 80% of cases, tuning by ratio is pure FAIL. Actually I’d say more than 80%, but I’m just choosing the reverse of your number.

  10. Shlomi, I think you are right — I think there is some Javascript problem with our comments. The responsible team has been alerted.

  11. Sheeri, I’ve actually been reading the source of mysqltuner.pl recently. For purposes of cross-linking, y’all check out http://www.pythian.com/news/9035/database-tuning-ratio-vs-rate/

  12. Baron — neat. Make sure to look at the launchpad version, and I can of course answer any questions you have. I guess you still haven’t turned on trackbacks, or the link would have appeared automatically.

    Also I would like to specifically state that I don’t think ratio is *meaningless* and shouldn’t be looked at. I agree that usually people are looking at ratio when they really want to look at rate….but that doesn’t mean rate is unimportant.

  13. Sheeri – trackbacks are on, but many are going to spam :( I usually do try and manually mark ‘not spam’ for the comments containing keywords (InnoDB/XtraDB/Percona/our names etc.). It’s a lot more difficult to do this with just the snippet provided in a trackback.

    Related to that, we are finding many more ‘spam’ false positives:
    http://twitter.com/Percona/statuses/9832854484

  14. Tom, great and very enjoyable counter-argument. You made me smile! But alas it is flawed.

    If “P implies Q,” then “not Q implies not P,” but it’s not true that “not P implies not Q.” That’s denying the antecedent. “It is raining, therefore the streets are wet” doesn’t guarantee that the streets are only wet when it’s raining. You are essentially arguing that if reducing the workload on the server reduces the cache miss ratio, then reducing the cache miss ratio will reduce the workload on the server. Not so — it’s a logical trap, and mixes correlation with cause. So I decline your challenge on the grounds that it is invalid :-)

    I counter-challenge you to a duel, using sledgehammers in six feet of water ;-)

  15. The inverse of this article, is what instrumentation should be added to MySQL?

  16. Steven, right. Good instrumentation requires a) carefully definitions of events that you want to measure b) counts of how many times that happens and c) timing information on the events. MySQL has some of a) and some of b) but nearly none of c). Perhaps the best source of timing information in stock GA builds of MySQL is in SHOW PROFILES. The Percona builds started their life years ago as a patch to add much more timing information into the slow query log. I want to get timing information on nearly everything that’s currently a counter. I am not sure yet whether the PERFORMANCE_SCHEMA will provide what’s needed for scientific performance optimization.

  17. romain says:

    Great article thanks !

    However quick question : what would be ,if any, the equivalent Innodb_* status variable to use against the uptime when using exclusively InnoDB, to correlate to IO subsystem performance ?

  18. You can find that in the docs. But InnoDB’s IO is largely done in the background and can rarely be traced directly to any user action with certainty.

  19. Correction: the write IO is largely done in the background. The read IO is even more complicated in my view — there is read-ahead, for example, and a mixture of foreground and background IO.

  20. rguinot says:

    Thanks!
    I guess we can then rely on system tools such as iostat or sar to tell if disk overall io is being saturated

  21. Andrew says:

    Thanks for the information. Interestingly, I decided to use suggestion 4: “Set key_buffer_size really big”.

    Well, I upped the size from 256MB to 512MB and interestingly, when I look at Memory Used through WebHost Manager, I expected the memory used to increase. It had been showing at about 44% prior to changing the key_buffer, but immediately after doubling key_buffer and restarting mysql, Memory Used has been showing no higher than 25%.

    Any idea why?

  22. I don’t know. You changed two things at once: you restarted MySQL, and you changed its configuration. If you want to isolate the change to just one thing in the future you should use “SET GLOBAL key_buffer_size=512 * 1024 * 1024″ next time.

  23. Cédric says:

    But if you retrieve a ratio every 15 minutes, you don’t need to know if the server has been restarted, and you have a more accurate ratio !

  24. Yes, it will be accurate and still meaningless :-)

  25. Gabek says:

    Hi,

    I am not an expert at mysql.

    Can you help me optimize my server?
    (Might give me a conf file example?) :)

    Thies are te informations I have about mysql:

    [Ed: deleted output of a tuning script]

  26. Gabek,

    Percona offers professional services for MySQL and would be glad to help you optimize your server on a commercial basis. If you would like free help, please try forum.percona.com instead of posting into the comments here.

  27. Akhthar says:

    I would like to disagree. Although I agree that it’s a waste of time to only look at ratio and be done with Key buffer, I think it would be helpful to understand the Efficiency of Key buffer as well as compare the same with multiple servers with only one condition that the Uptime should complete a time cycle (24 hours if you are not sure about the cycle) and should be equal for all. You can still get the per day stats, but you won’t get the accurate results unless you have same Uptime for all servers.

  28. share says:

    WOW just what I was searching for. Came here by searching for arbitrage

  29. J says:

    Claiming that hit ratio is a meaningless statistical value is just incorrect, and really quite childish of you. Sure, if people spend hours after hours fine tuning their systems soly basted on their hit ratio, and still doesn’t get improved performance, then it is a bad thing. But I am sure most people are not that stupid.

    The meaningful use of the hit ratio, in my view:
    - It can give a quick indication of the health of the cache
    - It can be used as a quick and simple comparison value when comparing different caches, even caches that are very different in their nature and usage

    Lets say for example that we have a system with some kind of performance problems, and the system contains 100 different caches. Lets also say that we have an easy way to get access to the hit ratio values for all these caches, but the other statistics are more troublesome to get to and have to be fetched manually, lets say it takes 5 minutes per cache. Now, if the hit ratio is meaningless, as you claim, then you have no choise, you have to go through the manual process for each cache to get all the statistics, and that would take 5 x 100 = 500 minutes, or about 1 days work. I, on the other hand, use the hit ratio information to get a short list of lets say 5 potential candidates (the ones with the lowest hit ratio), and focus on them. Based on pure mathematical probabillity, it is more likely that I can improve performance by concentrating on tweaking these 5 caches, then if I would have choosen 5 randomly selected caches, so chanses are that I might not need to look at any of the 95 remaining caches, and if that is the case I have saved several hours of work.

  30. Tinel Barb says:

    Baron, I totally agree with your article.
    Myself, I use a personal approach derived from experience.
    At first I have observed a lot of disk IO during backups and restores and a low throughput during queries in a heavy transactional environment.
    Along with other settings, “key_buffer_size”, “myisam_sort_buffer_size”, “max_heap_table_size” and “innodb_log_buffer_size” gave the greatest improvement regarding the disk IO operations. For instance, the backup was lowered by 30% and the restore was reduced by 50% (because of the operations on mysql schema, of course).

    Overall, I have analyzed the following indicators:
    - Created_tmp_disk_tables, Created_tmp_tables, Table_locks_immediate, Handler_read_next, Handler_read_rnd_next.
    - Key_blocks_used/(Key_blocks_used+Key_blocks_unused).
    - The number of largest keys from databases.
    - The slow query log – to catch the queries not on indexes.

    I was monitoring second indicator in a graphical manner (0.5 Hz read) as it shows if the key_cache is overwhelmed *sometimes* (which it did mostly during backups and restores and large joins), so I have raised accordingly to the SUM of the mostly used (in joins) large key from databases.
    I was monitoring first indicator in a graphical manner also to correlate the reads with the operations.
    I think the approach was good since the results were great.

    Also, giving the fact that I couldn’t influence the quality of the queries (meaning refining the queries to be always on indexes), so increasing step-by-step the “join_buffer_size” and “read_rnd_buffer_size” also gave a very good result regarding the creation of temporary tables during joins, especially the on-disk based temporary tables.

  31. Tinel Barb says:

    I don’t know how others look at the key_buffer, but this buffer is used only when needed, and the data is not retained in it as in CACHES.
    It’s a simply and regular *buffer* so “HITS” and “MISSES” are meaningless (answer to “J.”?), and the key-words should be “usefulness” and “sufficient”.
    Best regards!

  32. J says:

    @Tinel Barb:

    I have no opinion on the mysql specific parts of this discussion (like the key_buffer). I only commented on Baron’s dissing of hit ratio in general, when he called it meaningless.

Speak Your Mind

*