We use Percona Server with the TokuDB engine extensively in Percona Cloud Tools, and we are gaining real operational experience with this engine. So I want to share some findings we came across, in the hope that they help someone working with TokuDB.

So, one problem I faced is that SELECT * FROM INFORMATION_SCHEMA.TABLES is quite slow when I have thousands of tables in TokuDB. How slow? For example…

This is very similar to what InnoDB faced a couple of years back. InnoDB solved it by adding the variable innodb_stats_on_metadata.

So what happens with TokuDB? There is an explanation from Rich Prohaska at Tokutek: “TokuDB has too much overhead for table opens. TokuDB does a calculation on the table when it is opened to see if it is empty. This calculation can be disabled when ‘tokudb_empty_scan=disabled’.”

So let’s see what we have with tokudb_empty_scan=disabled
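For reference, here is a minimal sketch of how the setting can be applied before re-running the query. It assumes that tokudb_empty_scan is a dynamic variable with both session and global scope in your build, so check the documentation for your version before relying on it.

```sql
-- Sketch only: assumes tokudb_empty_scan is dynamic (session and global scope).
-- Check the current value first.
SHOW VARIABLES LIKE 'tokudb_empty_scan';

-- Disable the empty-table calculation on table open for new connections...
SET GLOBAL tokudb_empty_scan = 'disabled';
-- ...and for the current session, which keeps its own copy of the value.
SET SESSION tokudb_empty_scan = 'disabled';

-- Re-run the slow query to compare timings.
SELECT * FROM INFORMATION_SCHEMA.TABLES;
```

To keep the setting across restarts, the same value would go into the [mysqld] section of the server configuration file.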

An impressive improvement, but still somewhat slow. Tokutek promises a fix to improve it in the next TokuDB 7.2 release.

4 Comments
Phil

Do you think we will see TokuDB support in Percona cluster? Are you guys just running master/slave for Percona Cloud Tools at the moment?

Rich Prohaska

Hello,
TokuDB 7.1.7 uses a linked list of all of the open fractal trees. In Vadim’s example, the list has thousands of entries. Whenever a fractal tree is opened, a full linked list search occurs. We now use a weight balanced tree to maintain the set of open fractal trees rather than a linked list, so the cost to open a fractal tree drops from O(n) to O(log n), where n is the current number of open fractal trees. This change will be shipped in the next version in a couple of months.

Peter Zaitsev

Rich,

Thanks for the explanation. I was just going to ask Vadim about the performance difference between InnoDB and TokuDB, but with the linked list issue I guess the difference will grow with the number of tables. I would note, though, that even 3 min sounds like a lot for such an operation.