I recently worked on a problem that showed itself as a rather low MySQL load (roughly 5% CPU usage and close to zero I/O) that would spike to hundreds of threads running at the same time, causing an intense utilization spike and making the server very unresponsive for anywhere from half a minute to ten minutes until everything went back to normal. What was interesting is that the same query was taking a large portion of the slots in PROCESSLIST. I do not just mean queries with the same fingerprint, but literally the same query with the same constants.
What we observed was a cache miss storm – a situation which can happen with memcached (as in this case) as well as with the query cache. If you have an item which is expensive to generate but which gets a lot of hits in the cache, many clients can miss in the cache at once and all attempt to re-create the item, pushing the server into overload. Because a lot of requests are being processed in parallel, the response time for the initial request may be much longer than if it ran all by itself, increasing the time it takes the server to recover.
What do I mean by an expensive query in this case? It is a query whose request rate is too high to be served with 100% misses for a period of time. For example, if I have 100 accesses per second for a given cache object and it takes 500ms to populate it, it is still too expensive: during the 500ms it takes to populate the item, 50 more requests will have started (that is the average case; because of random arrivals the worst case is worse), which takes 25 seconds to work through (assuming there is just one execution unit). Because we normally have multiple cores and multiple drives it can be less than that, but it is enough to cause a hiccup of a few seconds, which is unacceptable for a lot of modern applications.
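To make that arithmetic concrete, here is the same back-of-the-envelope calculation as a few lines of Python (the numbers are the ones from the example above):

```python
# Back-of-the-envelope model of a cache miss storm.
# Numbers match the example above: 100 requests/sec for the item,
# 500 ms to regenerate it on a miss.
requests_per_sec = 100
populate_time_sec = 0.5

# Requests that arrive (and also miss) while the first regeneration runs:
misses_during_populate = requests_per_sec * populate_time_sec  # 50

# If each of them also runs the 500 ms query on a single execution
# unit, the backlog takes this long to drain:
backlog_sec = misses_during_populate * populate_time_sec  # 25 seconds

print(misses_during_populate, backlog_sec)
```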
How can you deal with this problem? Carefully watch frequently accessed cache items, as well as cache items which take a long time to generate on a cache miss. To find the first kind for memcached, you can use mk-query-digest to analyze which items are requested frequently – it can decode memcached wire traffic. For the second, you can add instrumentation to your application, or take a look at the MySQL slow query log – which is good enough if you populate each cache item with a single query.
Optimize the query if you can. This is a good thing to do in any case, but it may not be the whole solution. Some query patterns get slow over time as data size grows or the execution plan changes; some items can also become hot unexpectedly due to shifts in content interest or the launch of new features.
Use a smarter cache. Especially with memcached, it is you who decides how to populate the cache. There are a number of techniques you can use to avoid this problem, such as probabilistic invalidation; you can also put a special value in the cache to indicate the item is being updated right now, so callers know to wait rather than start populating it themselves. For the MySQL Query Cache, the solution would have been to make queries wait for the first query started to complete. Unfortunately this has not been implemented so far.
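A minimal sketch of those two techniques combined, with a plain dict standing in for a memcached client (the helper names, the `:updating` sentinel key, and the parameters are all invented for illustration; in real memcached you would claim the sentinel with the atomic add() command):

```python
import random
import time

cache = {}  # dict standing in for memcached: key -> (value, expires_at)

def cache_get(key):
    entry = cache.get(key)
    if entry is not None and entry[1] > time.time():
        return entry[0]
    return None

def cache_set(key, value, ttl):
    cache[key] = (value, time.time() + ttl)

def fetch(key, recompute, ttl=60.0, early=5.0):
    # Probabilistic early expiration: in the last `early` seconds of the
    # TTL each request has a growing chance of refreshing the item itself,
    # so the refresh rarely happens for every client at once.
    entry = cache.get(key)
    now = time.time()
    if entry is not None and now + early * random.random() < entry[1]:
        return entry[0]

    # Sentinel "lock": only the caller that claims the flag recomputes.
    # (With a dict this check is not atomic; real memcached's add() is.)
    flag = key + ":updating"
    if cache_get(flag) is None:
        cache_set(flag, 1, ttl=early)
        value = recompute()
        cache_set(key, value, ttl)
        cache.pop(flag, None)
        return value
    # Someone else is refreshing: serve the stale value if we have one.
    return entry[0] if entry is not None else recompute()
```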
Pre-populate the cache. In some cases you can't easily change how caching works, especially if it is built into the application; however, it may be easy enough to identify hot items and pre-populate them before they expire. So if an item expires in 15 minutes you can refresh it every 10 minutes, and you get basically no misses. This works best when a few hot cache entries cause the problem.
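One possible shape for such a refresher – a periodic job that rebuilds the hot items on a shorter interval than their TTL, so the items never actually expire (the item name, the `cache_set` helper, and the builder function are invented for illustration):

```python
import time
import threading

cache = {}  # dict standing in for a memcached client

def cache_set(key, value, ttl=900):           # 15-minute TTL, as above
    cache[key] = (value, time.time() + ttl)

# Hypothetical hot items and the (expensive) functions that build them.
HOT_ITEMS = {
    "front_page:user_count": lambda: 123456,  # e.g. SELECT COUNT(*) ...
}

def refresh_hot_items():
    # Re-set each hot item well before its TTL runs out, so readers
    # essentially never see a miss for these keys.
    for key, build in HOT_ITEMS.items():
        cache_set(key, build())

def run_refresher(interval=600):              # refresh every 10 minutes
    while True:
        refresh_hot_items()
        time.sleep(interval)

# In production this would be a cron job or a background thread, e.g.:
# threading.Thread(target=run_refresher, daemon=True).start()
```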
So if your server has seemingly random spikes of activity, check this out – a cache miss storm could be one of the possible causes.
I can’t resist adding some Google keywords for alternative ways to say “cache miss storm”: dogpile effect, thundering herd, cache stampede.
Baron,
Sometimes I’m bad at remembering terms, so I have to invent my own 🙂
You could also build a denormalized table every N minutes so falling back on SQL isn’t as big of a deal, and the stampede effect should be minimized.
peter,
another solution is — don’t use a single memcache for caching the same data; use multiple DB sources for each application, so different memcaches will “forget” the data at different moments.
Mike,
Yes. In this case it is somewhere in between. You can see it as query optimization by changing the schema to have a stored value, or you can consider such a table as another level of cache where you’re just generating the data before it expires 🙂
hlopetz,
This can work to reduce the problem, and I think the application I looked at did use multiple memcached servers. It is, however, just a question of scale. Consider a naive implementation – for example, showing the number of registered users on the front page, counted as count(*) from an InnoDB table. You can easily have 100 front page views per application server, all of them hitting this cache item, and if the number of users is large this count(*) query can take a lot of time, causing the problems.
Though there are a number of reasons why you might consider having multiple memcached instances, and reasons against it 🙂
Baron, please don’t confuse “Thundering herd” (which is miss storm) with “Dogpile” (which is, of course, a query pileup, that could be caused by many reasons!)
I think we have all just discovered Baron’s true calling is SEO 😉
It’s true, so true!
In Perl, the standard caching interface module CHI (http://search.cpan.org/perldoc?CHI) has built-in probabilistic expiration.
Cache Miss Storm. Nice term. But not news:
Bug #15044 Query Cache synchronisation for multiple identical threads
Date: 18 November 2005
http://bugs.mysql.com/bug.php?id=15044
Corresponding forums discussion, from 15 November 2005:
http://forums.mysql.com/read.php?24,54799,54799
Not far off half a decade this bug has been lying around!
James,
Yeah… this problem has been known for ages, and solutions have been proposed too. It is just a question of someone coming around to do it. Maybe we will get sponsorship to do it, or MariaDB will come around and implement it.
I’m not sure where I read this, but IMHO the best approach is something like:
On cache miss, verify if this is the first miss or if the database query has already been requested.
If the database query has been requested (i.e., this is not the first cache miss), serve stale/expired data from the cache, and be done with it.
If the database query has not been requested yet, run it, cache the result, and then return the new data. Of course you could opt to return stale/expired data even here, for the sake of first-miss response time, at the cost of async query complexity.
I suppose you could embed this second/shorter time-to-live inside the cached data itself. Chosen appropriately, in general it would only actually get used for high-usage cached items.
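A rough sketch of the scheme described in the steps above, with a dict standing in for the cache and a set standing in for the "query already requested" flag (all names invented; a real memcached deployment would set that flag with an atomic add()):

```python
import time

cache = {}          # key -> (value, soft_expiry); data is kept past its TTL
refreshing = set()  # keys whose database query is already in flight

def get(key, query_db, soft_ttl=60.0):
    entry = cache.get(key)
    now = time.time()
    if entry is not None and now < entry[1]:
        return entry[0]                 # fresh hit
    if entry is not None and key in refreshing:
        return entry[0]                 # expired, but a refresh is already
                                        # running: serve the stale value
    # First miss for this key: this caller runs the query; concurrent
    # callers take the stale-serving branch above until we replace it.
    refreshing.add(key)
    try:
        value = query_db()
        cache[key] = (value, now + soft_ttl)
        return value
    finally:
        refreshing.discard(key)
```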
P.S. Since the problem is caused by/at the cache layer, it seems sensible to solve it at the cache layer itself, instead of hacking the database to minimize the side-effects. But that’s my opinion of course.
Edmar,
Yes, you’re right – for the case where serving an expired cache entry is acceptable. This is something probabilistic expiration tries to achieve – starting the refresh some time before the item truly expires.
Note, however, that there are other cache policies – an item in the cache can be explicitly invalidated, in which case you have to generate the new one before responding.
Regarding the cache layer – the database fix suggestions mentioned apply to the MySQL Query Cache, which is a caching layer… but it is part of the database.
The Squid HTTP proxy cache’s first approach to this problem is to cache early: when seeing two identical requests it forwards/processes just one, and internally attaches the second as a hit on the first, waiting for its response.
There are also other models, such as background cache update, where the cache gets updated in the background if the requested cached item is about to expire,
and finally error override: serving the cached response even if it has already expired, when the cache cannot be updated (data source unavailable, etc.).
I have seen the same with my simple implementation of caching formatted MySQL results in a file. The solution is to first update the timestamp of the file (so that other threads serve the data from the file as fresh) and only after that run the query and save the new data to the file. Of course, if something goes wrong, we serve old data for another cache period.
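The touch-the-file-first idea above could be sketched like this (a hypothetical helper, not the commenter’s actual code; the TTL and JSON format are assumptions):

```python
import json
import os
import time

CACHE_TTL = 300  # seconds a cached file is considered fresh (assumed)

def cached_result(path, run_query):
    try:
        if time.time() - os.path.getmtime(path) < CACHE_TTL:
            with open(path) as f:
                return json.load(f)
    except OSError:
        pass  # file missing: fall through to (re)generate it
    # Bump the timestamp *before* running the query, so other threads
    # keep serving the old contents as fresh instead of piling on.
    if os.path.exists(path):
        os.utime(path, None)
    value = run_query()
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(value, f)
    os.replace(tmp, path)  # atomic swap so readers never see partial data
    return value
```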
I recently wrote a recursive spin lock for exactly this issue. Code is here (in Ruby, for Rails) http://gist.github.com/578303
It’s interesting to point out that this is in the memcached FAQ:
http://code.google.com/p/memcached/wiki/FAQ#Race_conditions_and_stale_data
I used Microsoft’s .NET internal locking implementation to get around this:
http://blogs.microsoft.co.il/blogs/moshel/archive/2009/07/11/cache.aspx
This function is used to execute other functions that do heavy work, such as heavy SQL queries.
I think what you really want is something like Mint Cache for Django (http://djangosnippets.org/snippets/155/). It basically stores the time of invalidation along with the cached object, so when the item becomes invalidated, the first request to hit it can repopulate the cache while other requests continue to use the current value until it is replaced.
I think it’s rare that cache expiry times require a strict deadline, so this provides a clean solution 🙂
I am not a Python coder, but that code really looks like it has a race condition and will still cause a rush to the database at high traffic levels.
Unfortunately, it does. But at the very least, it’s likely to generate far less traffic. The only time you’ll get multiple hits to the database is when a second request for the memcached key comes in before the first request has had a chance to record that it is performing the fetch. So at high traffic levels you may still get several hits to the database, but it should be significantly fewer than you would get otherwise, depending on how long the database query takes.
You’re making some assumptions that will not hold in lots of cases. This kind of code could be very expensive or even disastrous for a frequently accessed item that is expensive to generate. An atomic operation such as incr() is an easy way to make this work properly. Or just run a routine job to pre-refresh the item instead of letting the application code refresh it.
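The incr()-based lock suggested here could look roughly like the following. The `FakeMemcache` class below is only an in-memory stand-in for illustration; in real memcached, add() and incr() are atomic on the server, which is what makes the election safe (all other names and parameters are invented):

```python
import threading
import time

class FakeMemcache:
    """In-memory stand-in for a memcached client. In real memcached,
    add() and incr() are atomic server-side operations."""
    def __init__(self):
        self._data, self._lock = {}, threading.Lock()
    def add(self, key, value):            # set only if the key is absent
        with self._lock:
            if key in self._data:
                return False
            self._data[key] = value
            return True
    def incr(self, key):                  # returns None if the key is absent
        with self._lock:
            if key not in self._data:
                return None
            self._data[key] += 1
            return self._data[key]
    def get(self, key):
        return self._data.get(key)
    def set(self, key, value):
        self._data[key] = value
    def delete(self, key):
        self._data.pop(key, None)

mc = FakeMemcache()

def fetch(key, recompute, wait=0.05):
    value = mc.get(key)
    if value is not None:
        return value
    # Elect exactly one refresher: everyone add()s the counter, but only
    # the caller whose incr() returns 1 runs the expensive recomputation.
    mc.add(key + ":lock", 0)
    if mc.incr(key + ":lock") == 1:
        try:
            value = recompute()
            mc.set(key, value)
        finally:
            mc.delete(key + ":lock")
        return value
    time.sleep(wait)       # lost the race: wait briefly, then re-read
    return mc.get(key)
```

A plain add() on a lock key works too; incr() has the advantage of also telling you how many callers are queued behind the refresher.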
I agree that it might not solve the problem, but I fail to see how it makes the situation worse. Normally, the fetch would always occur for anyone accessing the cached item during the refresh interval. A technique such as this can significantly reduce the number of requests.
It won’t make the situation worse… sorry, I was not very clear in my meaning. The biggest danger here is that over time, as your load grows, you are somewhat shielded from seeing how expensive a bunch of cache misses would be, and then you get surprised. See http://www.mysqlperformanceblog.com/2010/09/23/more-on-dangers-of-the-caches/ for another look at this.
That aside, I believe that when possible (see my previous comment), it’s better to make code such as this always correct, rather than “incorrect but it doesn’t matter for some people.”
@Michael Mior
That code does have a problem; however, it could be corrected.
I posted to the memcached group.
http://groups.google.com/group/memcached/browse_thread/thread/d123857968b76087/4985b0b4fb91bdbc?lnk=gst&q=CAS+#4985b0b4fb91bdbc
Pseudo-code using PHP and memcached: http://gist.github.com/613772
You can use phplock (http://code.google.com/p/phplock/) to avoid this problem.
Wow this was useful. Take a look at our issue as reported on server fault: http://serverfault.com/questions/429013/mysql-server-hitting-100-unexpectedly-amazon-aws-rds
It may have a proper name, but the technique we used was very effective. Benchmark results are much better!
This blog post was referenced from a paper on optimal pre-expiration of cache items. http://www.vldb.org/pvldb/vol8/p886-vattani.pdf