This is part 3 of my Tyrant extras. Part 1 focused on durability, and part 2 focused on the perceived performance wall.

#3.  Tokyo Cabinet can have only a single writer thread, bottlenecking performance

When writing an application using Tokyo Cabinet, only one connection can be opened as a “writer” while the rest are readers.  Tyrant allows multiple writes to be sent in from multiple applications, but it still single-threads them when writing out to disk.  If you run several threads that do nothing but insert into Tyrant, you will see Tyrant hit 100% CPU on one core, and your write throughput will top out quickly.

Single Threaded Writes

In my tests, when I was not disk bound (FS cache writes), I was able to complete 4 million inserts in a little over 91 seconds using 8 threads, averaging 43896.98 inserts per second.  Moving to 10 threads and doing the same 4 million inserts, I completed the test in 96 seconds and averaged 41649.42 inserts per second.  Compare this to 4 million rows using 4 threads, which averaged 40933.86 inserts per second, and you start to see that around 40K inserts per second is the most this particular server is capable of (single threaded): going from 4 to 8 to 10 threads barely moves the number.  Hopefully this is something that may be able to be fixed internally in the near future.  Until then you may consider breaking up your data into multiple tables, each with their own cache.  This limit is per TC DB, so this should work.  I had an idea about using the memcached client to distribute the data across multiple TC database files in the back end.  This should work, I just need to test it 🙂
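To make the idea of spreading writes across several TC databases a bit more concrete, here is a minimal sketch of client-side key hashing. It is untested against a real Tyrant setup: the `ShardedStore` class is a made-up name for illustration, and plain Python dicts stand in for what would be separate Tyrant servers, each with its own database file and its own writer.

```python
import zlib


class ShardedStore:
    """Route each key to one of several independent backends by hashing the key.

    Hypothetical helper: in a real deployment each backend would be a client
    connection to its own Tyrant server rather than an in-memory dict.
    """

    def __init__(self, backends):
        if not backends:
            raise ValueError("need at least one backend")
        self.backends = list(backends)

    def _pick(self, key):
        # crc32 gives the same answer in every process, unlike Python's
        # built-in hash() for strings, so all clients agree on the shard.
        index = zlib.crc32(key.encode("utf-8")) % len(self.backends)
        return self.backends[index]

    def put(self, key, value):
        self._pick(key)[key] = value

    def get(self, key, default=None):
        return self._pick(key).get(key, default)


# Four dicts stand in for four separate TC database files (an assumption).
shards = [dict() for _ in range(4)]
store = ShardedStore(shards)

for i in range(10000):
    store.put("key%d" % i, "value%d" % i)

print([len(s) for s in shards])  # how many keys landed on each shard
```

Since each shard would have its own single writer, inserts to different shards should no longer queue behind one thread. One caveat with plain modulo hashing is that changing the number of shards remaps most keys; memcached clients typically offer consistent hashing to reduce that, which is part of why reusing one is appealing here.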

Ever notice how as my multi-part posts go on they get shorter and shorter:)  This will be the last Tyrant related post for a little bit.  The 4th & 5th posts were supposed to deal with replication and scaling… this may take a little while.  Thanks for reading!

4 Comments
Sergi

Short but interesting nevertheless for all of us beginning with the “no-SQL” DBs

Uriel Katz

The reason for one thread is that there is one logical resource; having multiple threads writing to disk won't make it faster, it will make it slower.
The only way to make it faster is using more disks. I don't think that writing in TC is bounded by the CPU.

Richie Vos

I’m just starting to try out some no-SQL dbs (mongo currently). These are really interesting posts to read and exactly the sort of info I was looking for.

surplus ammunition

Nice post. I used to be checking constantly this blog and I am inspired!
Very helpful information particularly the ultimate section 🙂 I take care of such
information a lot. I used to be looking for this certain information for a long time.

Thanks and good luck.