Continuing my look at Tokyo Tyrant/Cabinet and the concerns I have seen people raise, this is post #2.

#2. As your data grows, does Tokyo Cabinet slow down?

Yes, your performance can degrade. One obvious cause with a larger dataset is the growing likelihood that your data no longer fits into memory. That trades cheap memory operations for more expensive disk-based operations. However fast an application is, once you read from disk instead of memory, performance is going to drop off substantially. One of the more difficult things to test with Tyrant is disk-bound performance. The FS cache can make Tyrant look like it screams even with a small amount of memory. Once the data set is larger than that cache, people start to report hitting the performance “wall”.
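To make the "no longer fits into memory" effect concrete, here is a rough back-of-the-envelope sketch. It assumes uniformly random key access, so the cache hit ratio is simply memory size divided by data size, and it uses made-up latencies (0.1 µs for a memory hit, 5 ms for a disk read). The function name and all numbers are illustrative assumptions, not measurements from Tyrant.

```python
def effective_latency_us(data_gb, memory_gb, mem_us=0.1, disk_us=5000.0):
    """Average access latency in microseconds under a simple hit/miss model.

    Assumes uniformly random access, so the fraction of requests served
    from memory is memory_gb / data_gb (capped at 1.0). The default
    latencies are hypothetical placeholders, not measured values.
    """
    hit_ratio = min(1.0, memory_gb / data_gb)
    return hit_ratio * mem_us + (1.0 - hit_ratio) * disk_us


if __name__ == "__main__":
    # Hold memory at 1 GB and let the dataset grow past it.
    for data_gb in (1, 2, 4, 8):
        latency = effective_latency_us(data_gb, memory_gb=1.0)
        print(f"data={data_gb} GB  avg latency ~ {latency:.2f} us")
```

Under these assumptions the average latency is dominated by disk as soon as even a modest share of requests miss memory, which is the shape of the "wall" people describe.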

To help test this I went ahead and mounted the FS holding my data files with the sync option, which effectively disables FS write caching. This should help show the real performance of the hash engine. Here performance dips substantially, as expected:

FS Mounted As Sync

Look at the IO rate:
NoSync:  31 MB/s
Sync:  3.2 MB/s
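For anyone who wants to see a similar gap on a small scale without remounting a filesystem, the sketch below compares buffered writes against writes opened with O_SYNC. This only approximates a sync mount for a single file, and it is not the benchmark used for the numbers above. The function name, block count, and block size are my own choices for illustration, and on a RAM-backed temp directory (tmpfs) the two results will look about the same.

```python
import os
import tempfile
import time


def write_throughput_mb_s(sync, blocks=64, block_size=64 * 1024):
    """Write blocks * block_size bytes to a temp file and return MB/s.

    With sync=True the file is opened with O_SYNC, so each write() waits
    for the data to reach the device instead of sitting in the FS cache.
    O_SYNC is POSIX-only; where it is missing we fall back to no flag.
    """
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC
    if sync:
        flags |= getattr(os, "O_SYNC", 0)
    payload = os.urandom(block_size)

    # mkstemp gives us a unique path; reopen it with the flags we want.
    tmp_fd, path = tempfile.mkstemp()
    os.close(tmp_fd)
    try:
        fd = os.open(path, flags)
        start = time.perf_counter()
        try:
            for _ in range(blocks):
                os.write(fd, payload)
        finally:
            os.close(fd)
        elapsed = max(time.perf_counter() - start, 1e-9)
    finally:
        os.unlink(path)
    return (blocks * block_size) / (1024 * 1024) / elapsed


if __name__ == "__main__":
    print(f"buffered: {write_throughput_mb_s(sync=False):.1f} MB/s")
    print(f"O_SYNC:   {write_throughput_mb_s(sync=True):.1f} MB/s")
```

The absolute numbers depend entirely on the disk and filesystem, so treat the output as a relative comparison only.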

As one would expect, the IO goes crazy when the drive is mounted with the sync option, hitting 99% IO wait. The interesting thing here is that we are actually bottlenecking on writes and not reads. Without the FS cache to buffer the writes, when we need to remove data from memory we have to rely on the internal Tyrant cache, and when that is exhausted we really have to write to disk, not to the FS cache. Now Tyrant starts to take on the same characteristics as your classic DB: the bigger the buffer pool, the faster the performance:

Different Memory Sizes for Tyrant
Even here the performance drop-off once you exhaust memory is relative. The focus should be the drop-off versus other solutions with the same configuration, not the drop-off versus a completely cached version. In this case, ask yourself: given similar datasets and similar memory requirements, what is the performance? Take the above sync test: when I use 256M of memory and run my test with writes going directly to disk I hit 964 TPS, while in previous MySQL tests the same setup (256M BP) netted ~160 TPS. So that is a better than 5x improvement (964 / 160 is roughly 6x), all things being equal. Of course this is a far drop-off from the 13K TPS I was getting when everything was effectively in the file system cache or in memory, but 5x or more is still a very solid improvement.
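The ratios behind that comparison are easy to check. The snippet below only restates figures quoted in this post (964 TPS, ~160 TPS, ~13K TPS, 31 MB/s, 3.2 MB/s); the variable names are mine.

```python
# Figures quoted in the post; ~160 and ~13K are approximate values.
tyrant_sync_tps = 964      # Tyrant, 256M memory, FS mounted sync
mysql_tps = 160            # MySQL, 256M buffer pool, same setup
tyrant_cached_tps = 13000  # Tyrant with everything in FS cache or memory
nosync_mb_s = 31.0         # IO rate without the sync mount option
sync_mb_s = 3.2            # IO rate with the sync mount option

tyrant_vs_mysql = tyrant_sync_tps / mysql_tps
cached_vs_sync = tyrant_cached_tps / tyrant_sync_tps
io_rate_drop = nosync_mb_s / sync_mb_s

print(f"Tyrant vs MySQL at 256M:    {tyrant_vs_mysql:.1f}x")
print(f"Cached vs sync Tyrant:      {cached_vs_sync:.1f}x")
print(f"NoSync vs Sync IO rate:     {io_rate_drop:.1f}x")
```

So Tyrant gives up roughly an order of magnitude once it is truly disk bound, yet still comes out several times ahead of the MySQL result at the same memory size.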

Next up is looking at Tyrant’s and Cabinet’s write bottleneck.

3 Comments
herodiade

This is strange because afaik most Linux filesystems flush dirty pages
every 5 seconds by default (commit kernel param) unless there is a sync
syscall or mount option. So with “NoSync: 31 MB/s” we should expect that
5*31 MB + a small overhead would be enough to keep all writes in dirty
buffers for a whole pdflush cycle, and then 512MB should be enough to
handle the load exactly as 1024MB would, no? Or… what
/proc/sys/vm/dirty_ratio value did you use for this test?

Though “I use 256M of memory and run my test with writes going directly to
disk” confuses me a lot; are you saying that the second test ran sync too?
(if so I don’t understand why memory size would impact results); or are you
measuring read perf in this second test? or mixed r+w with w being sync?

But I guess the question (“As your data grows does Tokyo Cabinet
slow down”) was probably more about read performance than write
performance, wasn’t it?

Maybe just testing reads (on different keys) after a cache purge
(like sync && echo 3 > /proc/sys/vm/drop_caches) would show the worst
case (similar to when only a very small proportion of the data fits in
kernel buffers and everything is retrieved from disk) for read perf?

kenny

Hello, there is a problem that has troubled me for a long time.
Why is the size of an InnoDB page 16k?

The system page size is 4k.
If the InnoDB page were also 4k, perhaps system IO could be reduced?
As we know, InnoDB writes at least a whole page whenever it writes.

But I think there must be some advantage to this setting.
Can you please tell me why, or how to find the answer?
Many thanks.