On many web sites you will see a counter showing how many times a given object – a blog post, forum thread, image, movie etc. – was viewed. This is sometimes a handy feature, but it can be rather expensive from a performance point of view.

The nasty thing about counters, as they are implemented in the most trivial way, is that they convert read load into write load. Where you would previously simply fetch the object's information, you now not only fetch the data but also update the view counter.

For a smaller, single-system web site with no caching the overhead may well be insignificant – since you run the update against the same row which was just fetched, it does not cause any synchronous IO, and the background IO can be batched rather effectively.

As the system grows, however, and you implement some form of caching, e.g. memcache, you end up reducing the number of reads from the database dramatically, but the writes still have to happen at full speed. It also really hurts if you choose replication (rather than partitioning) as your scaling path – it becomes very easy to overload replication, especially as the slave has to serialize transactions which could be executed in parallel on the master.

So we have established that counters can be quite nasty – how can we deal with this problem?

I assume you really need to have these counters (in some cases there are more counters than there really need to be), and I also assume you would like to see semi-realtime updates for them, so we can't simply update them nightly using a batch job. What to do in this case?

First, separate the counters into a special counter table with, say, "id" and "counter" columns. Using a separate table for the updates already solves the problem in a lot of cases, and here is why. Assume instead that you keep the counter in the blogpost table, which has 2000-byte rows on average. 10 million posts result in a 20GB table, which may be too much to fit into the buffer pool (or other cache). The counter table I just described, on the other hand, has roughly 20-byte rows even for InnoDB, which makes it about 200MB in total – that easily fits in the buffer pool. Fitting the working set in the buffer is paramount: if you manage that, you will be able to do many thousands of updates per second with no problem (assuming log commits are not the bottleneck). Of course, the fact that you update short rows also helps.
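As a minimal sketch (the table name and id value are just for illustration), the counter table and the per-view increment could look like this:

-- One short row per object; the whole table fits the buffer pool easily.
CREATE TABLE blogpost_views (
    id      INT UNSIGNED NOT NULL PRIMARY KEY,
    counter INT UNSIGNED NOT NULL DEFAULT 0
) ENGINE=InnoDB;

-- Increment on each view; creates the row on the first view.
INSERT INTO blogpost_views (id, counter) VALUES (123, 1)
ON DUPLICATE KEY UPDATE counter = counter + 1;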

The next problem to deal with is log flushes. If you have innodb_flush_log_at_trx_commit=0 or 2, or if you use MyISAM tables, you will not have this problem; if you have innodb_flush_log_at_trx_commit=1, log flushes can well be a problem – even with a BBU, thousands of log flushes per second may cause significant overhead or may simply be the bottleneck. If you can reduce your durability requirements, that is the best thing to do; if not, you can either use a MyISAM table as the counter table (which can work well if you do not run into a table-lock bottleneck) or implement delayed table updates as explained below.
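If relaxing durability is an option, the change is a single setting – note it applies to the whole instance, not just the counter table. A sketch (you would normally also put it in my.cnf so it survives restarts):

-- Flush the InnoDB log roughly once per second instead of on every
-- commit; a crash may lose up to about a second of counter updates.
SET GLOBAL innodb_flush_log_at_trx_commit = 2;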

At this point we have made each individual update much cheaper, but the sheer volume of updates may still be very high and call for optimization. Thinking about this and some other problems, I find it unfortunate that memcache is a passive cache – it just caches stuff, and you can't easily tell it to perform a certain action (e.g. write back to the database) when an object is removed from the cache, or to do so periodically. If you're into systems programming you could perhaps implement a cache with such semantics, and it would handle real-time counters very efficiently, pushing only a few updates back to the MySQL server.

If you would rather use existing solutions, you can use memcache plus another MySQL instance (or simply a database which is not replicated) to log the updates into a HEAP/MyISAM/ARCHIVE table and aggregate them into the main database, for example, every hour. If your memcache is sized so that objects touched within the last hour are not evicted, this will always give you correct counters while being very efficient.
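A sketch of the logging side, assuming a hypothetical view_log table on the non-replicated instance:

-- Append-only log of raw views; MyISAM, HEAP or ARCHIVE all work here.
CREATE TABLE view_log (
    post_id   INT UNSIGNED NOT NULL,
    viewed_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP
) ENGINE=MyISAM;

-- Each page view becomes one cheap append instead of an update.
INSERT INTO view_log (post_id) VALUES (123);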

By doing these delayed updates you not only reduce the number of updates – if a popular object is viewed 1000 times within an hour, you still need only one update – but you can also apply all the updates from a chunk in a single transaction, saving significantly on log commits.
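For example, the hourly job could roll everything logged so far into the counter table in one transaction – a sketch reusing the hypothetical view_log and blogpost_views tables from above:

START TRANSACTION;

-- One update per object, no matter how many times it was viewed.
INSERT INTO blogpost_views (id, counter)
SELECT post_id, COUNT(*) FROM view_log GROUP BY post_id
ON DUPLICATE KEY UPDATE counter = counter + VALUES(counter);

COMMIT;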

I mentioned you can use MyISAM or HEAP tables for this purpose, so how do you deal with table locks, given that the script which processes the last hour's views has to run a rather complicated GROUP BY query? Sure it does, but it does not have to run it against the same table the inserts go to. The trick you can use here is the standard "shadow table" trick – use two tables: insert into one, and process and truncate the other. MySQL offers an atomic RENAME TABLE call which can be used to swap the two tables.
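A sketch of the swap, again with the hypothetical view_log table (view_log_shadow is its identical twin): inserts keep going to view_log while the job aggregates and empties the swapped-out copy.

-- Atomically exchange the live and shadow tables.
RENAME TABLE view_log        TO view_log_tmp,
             view_log_shadow TO view_log,
             view_log_tmp    TO view_log_shadow;

-- Run the GROUP BY aggregation against view_log_shadow (as in the
-- transaction above), then empty it so it is ready for the next swap.
TRUNCATE TABLE view_log_shadow;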

14 Comments
Haiken

Alternative to memcache: MCache (http://www.mohawksoft.org/?q=node/8), which offers access to in-memory data objects across multiple servers, and can commit objects to the filesystem or a MySQL database.

Brian Aker

Hi Peter!

Here are my thoughts:
http://krow.livejournal.com/532321.html

I think you should consider whether or not an exact count is required at all times.

Cheers,
-Brian

Robin

Hi Peter,

For my employer we have thousands of updates every N minutes (N=~5). We've encountered the problems you mention and came up with similar solutions (having a two-column table with only the foreign key and the count, and grouping the hits from the last five minutes by key to reduce the number of updates).
Another optimization is to group the update queries by their counts. For example, if you have 100 rows that need to be increased by 3 you could group these into one query like this: UPDATE table SET count = count + 3 WHERE id IN(…);
This greatly reduces the number of update queries.
During rush hours you may want to introduce a threshold: don't update the '+1' records but keep them (reinsert them) in your log table so you don't need to bother the counts table with these minor updates.
Also table partitioning based on the foreign key works well for better performance.

Best,

Robin

Robin

Not really contributing to the discussion, but maybe someone gets inspired to write a new storage engine 😉 : When implementing a ‘log’ table and a ‘counts’ table which is regenerated every so often it would be nice to have a ‘Constant DB’ (CDB, see: http://cr.yp.to/cdb.html) storage engine, wouldn’t it? It has limited features (write once, read many – no updates) but it provides fantastic read/write performance.
Shouldn't be too difficult to implement something like that, as cdb's API is similar to bdb. Of course update, replace & delete wouldn't be supported in such an engine, as it is 'constant'.

guillaume

The other idea is to use a dedicated database instance for view summaries. It works quite well. And after all you don’t re-read the value every time. Just use ON DUPLICATE KEY UPDATE instead!

Jay Pipes

Hi Peter!

You wrote: “When implementing a ‘log’ table and a ‘counts’ table which is regenerated every so often it would be nice to have a ‘Constant DB’ (CDB, see: http://cr.yp.to/cdb.html) storage engine, wouldn’t it? It has limited features (write once, read many – no updates) but it provides fantastic read/write performance.”

What about ARCHIVE? No updates, good read and write performance…

Cheers, and BTW, I 100% agree with everything you say in this article; I’ve been including this information in my slides for a while now.

-jay

sebcio

If your application is PHP you can use XCache and its API to provide a nice counter.
A small pseudo-code example can be found in its wiki:

<?php
if (!xcache_isset("count")) {
    xcache_set("count", load_count_from_mysql());
}
?>
This guest book has been visited <?php echo xcache_inc("count"); ?> times.

Sheeri

Don't forget INSERT DELAYED – that's a big win for batch inserting when you can approximate what "the count as of now" really is.

Hartym

Thanks for sharing, that's a very interesting post I'll use soon to optimise my pastebin.