A while back I did a Cache Performance Comparison for the LAMP Stack.

Looking at this data you can see memcached is about five times slower than APC, and that is with tests done on localhost – over the network the difference will be larger, even with the fastest network. Such latency can add up, especially if, being a lazy “P” developer, you request objects from the cache one by one rather than fetching all the items you need for the page at once (not that batching is always possible).

So I wondered whether there is a way to use both of them at once, benefiting from the strengths of each.

APC (eAccelerator and other similar caches) is fast, but it is not distributed, so if you have many web servers you waste cache memory and reduce the possible hit rate by caching things locally on each one. Memcached is relatively slow but distributed, so you do not waste memory caching the same item in several places; it is also faster to warm up, since a single access brings an item into the cache rather than one access per web server.

The good thing, however, is that you do not have to pick one or the other; you can use both at the same time. APC is great for caching small but frequently accessed items that do not take much memory. For example, if you store a list of states in the database, you can cache it this way. For NNSEEK we can use it to cache the list of languages, the top groups, and many of the semi-static modules shown on group directory pages. Memcached is good for caching things that take a large amount of space combined and of which you only need to fetch a few per page. For example, search results may be a good candidate (assuming we want to cache them, and want to cache them in memory).

Sometimes I add a third level of caching – disk based (database or file based) – for large or persistent objects that have a long lifetime and would be too expensive to regenerate each time. This especially applies to data that comes over the network, such as web service results.
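A read-through lookup across those tiers might look like the sketch below. The function name and TTLs are made up for illustration, and the `apc_fetch`/`apc_store` and `Memcached` calls assume the APC and memcached PECL extensions are loaded (under the newer APCu extension the functions are `apcu_fetch`/`apcu_store`).

```php
<?php
// Sketch of a read-through lookup across the tiers discussed above:
// local APC -> memcached -> regenerate (database, disk cache, ...).
// Function name and TTLs are illustrative.

function cache_get(string $key, Memcached $mc, callable $regenerate,
                   int $apcTtl = 60, int $mcTtl = 600)
{
    // Tier 1: local shared-memory cache -- fastest, but per-server.
    $value = apc_fetch($key, $hit);
    if ($hit) {
        return $value;
    }

    // Tier 2: memcached -- slower, but shared by all web servers.
    $value = $mc->get($key);
    if ($mc->getResultCode() === Memcached::RES_SUCCESS) {
        apc_store($key, $value, $apcTtl);   // repopulate the local tier
        return $value;
    }

    // Tier 3: regenerate the object and populate both memory tiers
    // on the way back up.
    $value = $regenerate($key);
    $mc->set($key, $value, $mcTtl);
    apc_store($key, $value, $apcTtl);
    return $value;
}
```

The short APC TTL relative to the memcached TTL keeps each server's local copy from drifting too far from the shared one.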

What do you think ?

21 Comments
Kevin Burton

Yes…. use a hierarchy of caches.

local (APC in this case ) -> memcached -> disk

If you’re using something like MogileFS or Tugela Cache (memcached with a Berkeley DB backend) you can even get a pretty fast disk backend if you fetch in parallel.

Note that they’ll all need LRU semantics since you don’t want to fill up your APC cache and run out of memory.

In Tailrank we only really have a few objects in memcached so we just use that and skip APC…

Our real boost is using Squid because we stick the home page in memory so it’s just waiting to be served when visitors hit the site.

It does the same thing where I use a hierarchy of memory and disk. I think 2G of disk and 400M of memory is what we’re using now.

Also……. if you’re going to use a hierarchy like this you NEED to get stats for your cache hit rates. If you’re allocating 500M to your APC cache but only seeing a 1% hit rate it probably just makes sense to use that 500M for something else. If you’re blind you never know if you should use the memory elsewhere.
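Both tiers expose the counters Kevin is talking about. A rough sketch, assuming the APC and memcached PECL extensions: the field names (`num_hits`/`num_misses`, `get_hits`/`get_misses`) are what `apc_cache_info()` and `Memcached::getStats()` return, while the helper names are made up.

```php
<?php
// Measuring hit rates for both cache tiers. Helper names are
// illustrative; the stats field names come from the extensions.

function hit_rate(int $hits, int $misses): float
{
    $total = $hits + $misses;
    return $total > 0 ? $hits / $total : 0.0;
}

function apc_user_cache_hit_rate(): float
{
    $info = apc_cache_info('user');          // apcu_cache_info() for APCu
    return hit_rate($info['num_hits'], $info['num_misses']);
}

function memcached_hit_rates(Memcached $mc): array
{
    $rates = array();
    foreach ($mc->getStats() as $server => $stats) {
        $rates[$server] = hit_rate($stats['get_hits'], $stats['get_misses']);
    }
    return $rates;                            // "host:port" => rate
}
```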

Kevin

Kevin Burton

Oh….. the main problem with disk-based caches is that you can only ever get about 100 transactions per second out of the boxes. You can probably get about 8x this throughput if you use getMulti, but this is essentially cheating because you allow the disk to queue up the reads and reorder them….
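The getMulti batching Kevin mentions looks roughly like this: one round trip for N keys instead of N serial gets, which is also what lets a disk-backed store queue and reorder the reads. The helper name is made up.

```php
<?php
// Fetch many keys in one memcached round trip instead of a get()
// loop. Returns the hits plus the keys that still need regenerating.

function fetch_many(Memcached $mc, array $keys): array
{
    $hits = $mc->getMulti($keys);             // single batched request
    if ($hits === false) {
        $hits = array();
    }
    $missing = array_values(array_diff($keys, array_keys($hits)));
    return array($hits, $missing);
}
```

The caller then regenerates only the `$missing` keys, rather than paying a round trip per key.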

Kevin

Robin

Not sure why you’d want to do this, but it’s not so hard to make APC distributed as well. The distribution of cache entries in memcached is managed by the client API. Basically it is a very simple Distributed Hash Table (http://en.wikipedia.org/wiki/Distributed_hash_table). One could write a PHP class that takes care of this, and inserts cache entries into, and requests them from, the different nodes in the pool. It works like a webservice, between the webservers.
It allows you to use APC both for local caching, and distributed caching. The same method applies to file based or database based caching.
For illustration see http://r.schuil.googlepages.com/phpdsm.html

Vadim Tkachenko

Robin,

Your idea sounds pretty interesting, perhaps this is one I’ll try to implement.

Robin

Peter, that’s correct. However, for some reason, some people don’t feel like using Memcached (took me 3 months to convince my manager).
If you want a local cache, you could run a local instance of memcached as well and connect to it using a local socket.

Personally I prefer a setup where each webserver runs a single instance of memcached, acting as both a local and distributed cache. Very frequently accessed objects are fetched from the localhost’s instance, while less frequently accessed objects can be distributed among the webservers. You’d develop a wrapper class for the memcached api that allows you to define in which mode the cache should operate for a fetch/insert.
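The wrapper Robin describes might be sketched like this: two ordinary Memcached instances, one bound only to the localhost daemon and one spanning the whole pool, with the caller choosing the mode per key. The class, constant, and method names are invented for illustration.

```php
<?php
// Sketch of a two-mode memcached wrapper: hot per-server objects go
// to the localhost instance, everything else is hashed across the
// pool. All names are illustrative.

class TwoModeCache
{
    const LOCAL       = 0;   // very frequently accessed, per-server
    const DISTRIBUTED = 1;   // hashed across all web servers

    private $local;
    private $pool;

    public function __construct(Memcached $local, Memcached $pool)
    {
        $this->local = $local;
        $this->pool  = $pool;
    }

    private function backend(int $mode): Memcached
    {
        return $mode === self::LOCAL ? $this->local : $this->pool;
    }

    public function get(string $key, int $mode)
    {
        return $this->backend($mode)->get($key);
    }

    public function set(string $key, $value, int $mode, int $ttl = 300): bool
    {
        return $this->backend($mode)->set($key, $value, $ttl);
    }
}
```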

Enzo

If you have multiple webservers, then local APC caches aren’t going to cut it. Requests will be load balanced across the various servers, so what was cached on one server will not be in the cache on another, and you’ll end up with more cache misses.

Memcached solves this problem since it is distributed.

perlchild

50MB?
Some of my developers have larger objects than that…
(external xml imports, so no way to shrink them either)

Stephen Johnston

Memcached has a 1-megabyte item limit, so your 50MB objects are out of luck with memcached anyway.
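A cheap guard for that limit (1MB by default; newer memcached servers can raise it with the `-I` option) might look like this. The function name and fallback behaviour are illustrative.

```php
<?php
// Guard against memcached's item size limit (1MB by default) before
// calling set(). Function name and fallback are illustrative.

function safe_set(Memcached $mc, string $key, $value, int $ttl = 600): bool
{
    $size = strlen(serialize($value));
    if ($size > 1024 * 1024) {
        // Too big for a memcached item: fall back to a disk/database
        // cache, split the object into smaller pieces, or skip caching.
        return false;
    }
    return $mc->set($key, $value, $ttl);
}
```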

Johnny

I just laugh when I read this. I developed my own framework focused on advanced cache control methods and functionality, and I use APC and memcached combined, each where it is most effective. One is not really better than the other; they are designed to do different things. If you want to create something amazing, use them in tandem.

Anirudh Zala

I do not have much experience with either of these technologies, but I have an interesting question about the performance of APC and memcached. Consider a scenario where an application runs on 4 webservers behind a load balancer (which divides requests according to the load on each server). Each web server has 4GB of RAM.

Now, what would be a better solution for implementing a caching strategy on the above servers?

1: Assigning 100MB to APC on each webserver, OR
2: Using a separate memcached machine with 4GB of RAM.

Abhimanyu

“Now, what would be a better solution for implementing a caching strategy on the above servers?

1: Assigning 100MB to APC on each webserver, OR
2: Using a separate memcached machine with 4GB of RAM.”

Jhala saheb,

Assign 100MB for APC to each webserver, then assign 1GB of RAM for memcached on each server. It is not an either/or situation: APC caches PHP opcodes, whereas memcached caches the results of database queries. When you use both, you avoid repetitive PHP compilation as well as costly connections to the database. Use both, use them with nginx instead of Apache, and be prepared to be blown off your feet by the astonishing performance.

Anirudh Zala

@Abhimanyu

As far as I know, APC also has a user data cache 🙂 so you can store the results of database queries in it as well.
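That user cache is a couple of function calls. A minimal example, using the legacy `apc_fetch`/`apc_store` names (APCu calls them `apcu_fetch`/`apcu_store`); the key, TTL, and the query callback are illustrative.

```php
<?php
// APC's user cache holding a query result. The callback stands in
// for the actual database query; key and TTL are illustrative.

function get_states(callable $queryStates): array
{
    $states = apc_fetch('states_list', $hit);
    if ($hit) {
        return $states;                       // served from shared memory
    }
    $states = $queryStates();                 // e.g. SELECT code, name FROM states
    apc_store('states_list', $states, 3600);  // cache for an hour
    return $states;
}
```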

Anirudh Zala

@Abhimanyu

Maybe you didn’t get my question. As far as this thread is concerned, I am interested in the user data cache only. I know I can use both pieces of software for effective data caching strategies, but which would be the better solution?

1GB of RAM for APC on each web server, OR a dedicated memcached server with 4GB of RAM? Effectively I want 4GB of RAM for caching user data; the question is just how to divide it.

Herman

A very important consideration people forget: APC is tied to your webserver, so if you restart Apache often to release memory (I do), your entire APC cache gets wiped. memcached runs as a separate daemon, so whatever I store in memcached stays intact. I store data for long periods of time that gets accessed again and again.

Mongo Park

Why do they use the words “store” and “cache” in the same sentence in the APC documentation? The APC variable store is useless because it doesn’t implement an LRU algorithm, and TTL doesn’t do the trick.

Ngonhan2k5

I used APC at first for both the opcode cache and the user cache, but the user cache created a high fragmentation rate. After a lot of parameter tuning (shm_size, ttl, …) it only worked for a day or two; by the third day the fragmentation rate would go high, even though only 60% of the space was used.
Today I just use memcached for the user cache. Now I can lower the APC shm_size to just enough for the opcode cache and give the rest to memcached.