A while back I did Cache Performance Comparison for LAMP Stack.
Looking at this data you can see memcached is about 5 times slower than APC, and that is with tests done on localhost – over the network the difference is going to be larger, even with the fastest network. Such latency can add up, especially if, being a lazy “P” developer, you request objects from the cache one by one rather than fetching all items you need for the page at once (not to mention this is not always possible).
So I wondered whether there is any way to use both of them at once, benefiting from the strengths of each.
APC cache (eAccelerator and other similar caches) is fast but it is not distributed, so if you have many web servers you waste cache memory and reduce the possible hit rate by caching things locally. Memcached is relatively slow but distributed, so you do not waste memory by caching the same item in a few places; it is also faster to warm up, as you need only one access to bring an item into the cache, not one access for each of your web servers.
The good thing, however, is you do not have to select one or the other – you can use both at the same time. APC is great for caching small but frequently accessed things which do not take too much memory. For example, if you store a list of states in the database you can cache it this way. For NNSEEK we can use it to cache the list of languages, top groups and many of the semi-static modules shown on group directory pages. Memcached is good for caching things which take a large amount of space combined and of which you only need to fetch a few per page. For example, search results may be a good candidate (assuming we want to cache them and want to cache them in memory).
Sometimes I add a third level of caching – disk based (database or file based) – to cache large or persistent objects which take a long time to generate and which would be too expensive to regenerate each time. This especially applies to data which comes over the network – web service results etc.
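The multi-level setup described above can be sketched as a read-through hierarchy. This is an illustrative sketch only: plain Python dicts stand in for APC, memcached and the disk store, where a real PHP implementation would call `apc_fetch`, the memcached client and a file/database layer instead.

```python
class TieredCache:
    """Sketch of a local -> distributed -> disk cache hierarchy.

    Dicts stand in for APC, memcached, and a disk store; only the
    lookup/promotion logic is the point here.
    """

    def __init__(self):
        self.levels = [{}, {}, {}]  # [local (APC), memcached, disk]

    def get(self, key, generate):
        for i, level in enumerate(self.levels):
            if key in level:
                # Promote the item into the faster levels above this one.
                for faster in self.levels[:i]:
                    faster[key] = level[key]
                return level[key]
        value = generate()  # missed on every level: build it once
        for level in self.levels:
            level[key] = value
        return value

cache = TieredCache()
states = cache.get("states", lambda: ["AL", "AK", "AZ"])  # generated once
states = cache.get("states", lambda: [])  # now served from the fastest level
```

The second `get` never calls its generator: the first call populated all three levels, so the list of states is served from the local level from then on.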
What do you think ?
Yes…. use a hierarchy of caches.
local (APC in this case ) -> memcached -> disk
If you’re using something like MogileFS or Tugela Cache (memcached with a Berkeley DB backend) you can even get a pretty fast disk backend if you fetch in parallel.
Note that they’ll all need LRU semantics since you don’t want to fill up your APC cache and run out of memory.
In Tailrank we only really have a few objects in memcached so we just use that and skip APC…
Our real boost is using Squid because we stick the home page in memory so it’s just waiting to be served when visitors hit the site.
Squid does the same thing in that it uses a hierarchy of memory and disk. I think 2G of disk and 400M of memory is what we’re using now.
Also……. if you’re going to use a hierarchy like this you NEED to get stats for your cache hit rates. If you’re allocating 500M to your APC cache but only seeing a 1% hit rate it probably makes more sense to use that 500M for something else. If you’re flying blind you never know whether you should use the memory elsewhere.
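The hit-rate accounting Kevin insists on is cheap to add. Here is a minimal sketch of a counting wrapper around one cache level (a dict stands in for the actual cache client; the class and method names are made up for illustration):

```python
class StatsCache:
    """Wrap a cache level and count hits/misses, so you can tell whether
    the memory allocated to this level is actually earning its keep."""

    def __init__(self):
        self.store, self.hits, self.misses = {}, 0, 0

    def get(self, key):
        if key in self.store:
            self.hits += 1
            return self.store[key]
        self.misses += 1
        return None

    def set(self, key, value):
        self.store[key] = value

    def hit_rate(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0

c = StatsCache()
c.get("a")           # miss
c.set("a", 1)
c.get("a")           # hit
print(c.hit_rate())  # 0.5
```

Sampling `hit_rate()` per level (and per object type) is what lets you decide, with numbers rather than guesses, whether that 500M belongs to APC or elsewhere.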
Kevin
Oh….. the main problem with disk based caches is you can only ever get about 100 transactions per second from the boxes. You can probably get about 8x this throughput if you use getMulti but this is essentially cheating because you can allow the disk to queue up reads and line them up….
Kevin
Not sure why you’d want to do this, but it’s not so hard to make APC distributed as well. The distribution of cache entries in Memcached is managed by the client API. Basically it is a very simple Distributed Hash Table (http://en.wikipedia.org/wiki/Distributed_hash_table). One could write a PHP class that takes care of this, and inserts/requests cache entries into/from the different nodes in the pool. It works like a web service between the webservers.
It allows you to use APC both for local caching, and distributed caching. The same method applies to file based or database based caching.
For illustration see http://r.schuil.googlepages.com/phpdsm.html
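The client-side distribution Robin describes boils down to hashing the key to pick a node, so every client maps the same key to the same server with no coordination. A sketch of the simplest scheme (hash modulo pool size, which early memcached clients used; consistent hashing handles node additions and removals more gracefully), with a made-up function name and example addresses:

```python
import hashlib

def node_for(key, nodes):
    """Pick the node responsible for a key: hash the key and take it
    modulo the pool size. Deterministic, so all clients agree."""
    digest = hashlib.md5(key.encode()).hexdigest()
    return nodes[int(digest, 16) % len(nodes)]

pool = ["10.0.0.1:11211", "10.0.0.2:11211", "10.0.0.3:11211"]
# Every client maps the same key to the same node:
assert node_for("user:42", pool) == node_for("user:42", pool)
```

The same routing function works whether each node's backing store is memcached, APC exposed over HTTP, or files on disk – which is Robin's point.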
Robin,
Your idea sounds pretty interesting, perhaps this is one I’ll try to implement.
Kevin, Thanks.
A hierarchy of caches is great of course. My main point was there might be a place for both APC and Memcached, even though one may think of them as being on the same layer, since they are both in-memory caches.
I left out Squid in this case – not creating pages at all and just serving them from the cache is best of course.
Speaking about stats – you always need them, as you can easily waste time by adding caches if your cache hit rate is too low.
Besides the hit rate, I also look at timing statistics – how long it takes to fetch an object of a given type from the cache versus producing it normally. This helps you to understand where to place the object – if it takes just 1ms to create, you probably do not want it in the disk cache; better to throw it away if it does not fit in memory, as creating it is faster than reading it from disk.
Timing information is also helpful for understanding where response time comes from, so you can see those 100 separate fetches from the cache are not cheap 🙂
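Gathering those per-object-type timings is straightforward. A minimal sketch (the helper name and the sample generator are invented for illustration):

```python
import time

def time_call(fn, *args):
    """Time a fetch-or-generate call; per-object-type stats built on this
    tell you whether an object deserves a slot in a slower cache level."""
    start = time.perf_counter()
    result = fn(*args)
    return result, (time.perf_counter() - start) * 1000.0  # milliseconds

def generate_cheap_object():
    return list(range(10))

value, elapsed_ms = time_call(generate_cheap_object)
# If elapsed_ms is around 1 ms, regenerating beats a disk-cache read,
# so this object type should never be demoted to the disk level.
```

Logging `(object_type, cache_ms, generate_ms)` tuples per request is enough to make the placement decision above with data rather than intuition.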
Robin,
I do not think it makes sense to add a client to use APC as a remote cache. Memcached is implemented pretty well and has a decent client API (it is a PHP extension, so it is faster than anything you would write in PHP itself) – APC is faster because it is local, not because memcached is slow – I think memcached gets close to being as fast as a remote cache can be.
Peter, that’s correct. However, for some reason, some people don’t feel like using Memcached (took me 3 months to convince my manager).
If you want a local cache, you could run a local instance of memcached as well and connect to it using a local socket.
Personally I prefer a setup where each webserver runs a single instance of memcached, acting as both a local and distributed cache. Very frequently accessed objects are fetched from the localhost’s instance, while less frequently accessed objects can be distributed among the webservers. You’d develop a wrapper class for the memcached api that allows you to define in which mode the cache should operate for a fetch/insert.
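Robin's wrapper idea – one memcached instance per web server, acting as local or distributed cache depending on the fetch mode – can be sketched like this. Dicts stand in for the per-host memcached instances, and the class, mode names and host names are invented for illustration:

```python
import hashlib

class DualModeCache:
    """Each web server runs one memcached instance. 'local' operations go
    to this host's instance; 'distributed' operations hash the key across
    the whole pool, as a memcached client would."""

    def __init__(self, local_host, pool):
        self.pool = pool
        self.local_host = local_host
        self.stores = {host: {} for host in pool}  # stand-ins for instances

    def _host(self, key, mode):
        if mode == "local":
            return self.local_host
        digest = int(hashlib.md5(key.encode()).hexdigest(), 16)
        return self.pool[digest % len(self.pool)]

    def set(self, key, value, mode="distributed"):
        self.stores[self._host(key, mode)][key] = value

    def get(self, key, mode="distributed"):
        return self.stores[self._host(key, mode)].get(key)

cache = DualModeCache("web1", ["web1", "web2", "web3"])
cache.set("hot-item", 1, mode="local")  # stays on this web server
cache.set("rare-item", 2)               # hashed across the pool
```

The caller decides per key which mode to use; the trade-off is that a "local" item is duplicated on every web server, while a "distributed" item costs a network hop.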
Robin,
It might be hard to convince people to use anything, especially if it is a technology they are not familiar with.
In the case you’re speaking about I would probably use small local APC cache plus distributed memcached, running on the same set of boxes. Memcached is fast but even run locally it is much slower than APC.
For most applications you would not notice much difference to be honest, as even 20,000 lookups per second means your pages will be lightning fast even with 100 cache lookups per page 🙂
If you have multiple webservers, then local APC caches aren’t going to cut it. Requests will be load balanced across various servers, so what was cached on one server will not be in the cache on another, and you’ll end up with more cache misses.
Memcached solves this problem since it is distributed.
Enzo,
Even if you have multiple web servers there are some things which you can cache in local APC – yes, they will be copied on all of your web nodes, but if their total size is, say, 50MB, why would you care?
Cache invalidation is another problem, which can be handled with TTLs and versions even in such an environment.
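The TTL-plus-versions trick mentioned above works because bumping a namespace version makes every key under it miss without touching the entries themselves – so it is safe even when the data is duplicated in each node's local APC. A sketch with invented class and namespace names:

```python
import time

class VersionedCache:
    """TTL plus version-based invalidation: keys embed a namespace
    version, so bumping the version instantly orphans all old entries."""

    def __init__(self):
        self.store = {}
        self.versions = {}

    def _key(self, ns, key):
        return f"{ns}:v{self.versions.get(ns, 0)}:{key}"

    def set(self, ns, key, value, ttl=300):
        self.store[self._key(ns, key)] = (value, time.time() + ttl)

    def get(self, ns, key):
        entry = self.store.get(self._key(ns, key))
        if entry is None or entry[1] < time.time():
            return None  # miss: version bumped, or TTL expired
        return entry[0]

    def invalidate(self, ns):
        self.versions[ns] = self.versions.get(ns, 0) + 1

c = VersionedCache()
c.set("states", "list", ["AL", "AK"])
c.invalidate("states")          # every node's copy is now stale
print(c.get("states", "list"))  # None
```

Orphaned entries are never deleted explicitly; they simply expire via their TTL, which is why the two mechanisms are used together.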
50MB?
Some of my developers have larger objects than that…
(external xml imports, so no way to shrink them either)
Memcached has a 1 megabyte item limit, so your 50MB objects are out of luck with memcached anyway.
I just laugh when I read this. I developed my own framework focused on advanced cache control methods and functionality. I use both APC and Memcached combined, each where it will be most effective. One is not really better than the other; they are designed to do different things. If you want to create something amazing, use them in tandem.
I do not have much experience with either of these technologies, but I have an interesting question about the performance of APC and Memcached. Consider a scenario where an application is running on 4 webservers behind a load balancer (which divides requests according to the load on each server). Each web server has 4GB of RAM.
Now what would be the better solution for implementing a caching strategy on the above servers?
1: Assigning 100MB to APC on each webserver, OR
2: Using a separate Memcached-enabled machine with 4GB of RAM.
“Now what would be the better solution for implementing a caching strategy on the above servers?
1: Assigning 100MB to APC on each webserver, OR
2: Using a separate Memcached-enabled machine with 4GB of RAM.”
Jhala saheb,
Assign 100 MB for APC to each webserver, then assign 1 GB of RAM for memcached on each server. It is not an either/or situation. APC caches PHP opcode, whereas memcached caches results from database queries. When you use both, you avoid repetitive PHP compilation as well as costly trips to the database. Use both, use them with nginx instead of Apache, and be prepared to be blown off your feet by the astonishing performance.
@Abhimanyu
As far as I know, APC is also used for user data cache 🙂 so you can store results of database queries also.
@Abhimanyu
Maybe you didn’t get my question properly. I am interested in the user data cache only as far as this thread is concerned. I know that I can use both pieces of software for effective data caching strategies. But what would be the better solution?
1GB of RAM on each web server for APC, OR a dedicated memcached-enabled server with 4GB of RAM? Either way I effectively get 4GB of RAM for caching user data; the question is just how to divide it.
A very important consideration people forget: APC is tied to your webserver, so if you restart Apache often to release memory (I do), your entire APC cache gets wiped. Memcached runs as a separate daemon, and whatever I store in memcached stays intact. I store data for long periods of time that gets accessed again and again.
Why do they use the words “store” and “cache” in the same sentence in the APC documentation? The APC variable store is useless because it doesn’t implement an LRU algorithm; TTL doesn’t do the trick.
Another good option for distributed caching is Redis:
http://redis.io/topics/partitioning
http://stackoverflow.com/questions/10558465/memcache-vs-redis
I used APC at first for both the opcode cache and the user cache, but the user cache created a high fragmentation rate. After a lot of parameter tuning (shm_size, ttl, ..) it only worked for a day or two; the fragmentation rate would go high by the third day although only 60% of the space was used.
Today I just use memcached to handle the user cache; now I can lower the APC shm_size to just enough for the opcode cache and give the rest to memcached.