We use Percona Server with the TokuDB engine extensively in Percona Cloud Tools, and we are gaining real operational experience with it. I want to share some of our findings in the hope that they help others working with TokuDB.
One problem I ran into is that SELECT * FROM INFORMATION_SCHEMA.TABLES is quite slow when there are thousands of TokuDB tables. How slow? For example:
select * from information_schema.tables limit 1000;
...
1000 rows in set (18 min 31.93 sec)
This is very similar to the problem InnoDB faced a couple of years back. InnoDB solved it by adding the variable innodb_stats_on_metadata.
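For comparison, this is roughly how the InnoDB setting is inspected and switched off on a running server. It is a minimal sketch that assumes a MySQL or Percona Server version where innodb_stats_on_metadata is available as a dynamic global variable and an account allowed to change global variables.

```sql
-- Show the current value of the InnoDB setting
SHOW GLOBAL VARIABLES LIKE 'innodb_stats_on_metadata';

-- Stop InnoDB from refreshing table statistics on metadata queries
-- such as SELECT ... FROM INFORMATION_SCHEMA.TABLES
SET GLOBAL innodb_stats_on_metadata = OFF;
```

To keep the setting across restarts, it would also need to go into the [mysqld] section of the server configuration file.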
So what happens with TokuDB? There is an explanation from Rich Prohaska at Tokutek: “Tokudb has too much overhead for table opens. TokuDB does a calculation on the table when it is opened to see if it is empty. This calculation can be disabled when ‘tokudb_empty_scan=disabled’.”
So let’s see what we have with tokudb_empty_scan=disabled
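For anyone who wants to repeat the test, the setting can be applied along these lines. This is only a sketch: it assumes the TokuDB plugin is loaded and that tokudb_empty_scan can be changed at runtime in your build. If it cannot, set it in the [mysqld] section of the configuration file and restart instead.

```sql
-- Check the current value (the variable exists only when TokuDB is loaded)
SHOW VARIABLES LIKE 'tokudb_empty_scan';

-- Assumption: the variable is dynamic in this build
SET GLOBAL tokudb_empty_scan = 'disabled';

-- Re-run the slow query from a new connection so it picks up the new value
SELECT * FROM information_schema.tables LIMIT 1000;
```

SET GLOBAL changes the default only for connections opened afterwards, which is why the query should be re-run from a fresh session.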
select * from information_schema.tables limit 1000;
...
1000 rows in set (3 min 4.59 sec)
An impressive improvement, but still somewhat slow. Tokutek promises a fix to improve it in the next TokuDB 7.2 release.
Will we see TokuDB support in Percona Cluster, do you think? Are you guys just running master/slave for Percona Cloud Tools at the moment?
Phil,
At this time we have no timeframe for supporting the TokuDB engine in Percona Cluster software.
It depends entirely on Tokutek, and we are waiting on their plans.
We are running a master-slave configuration in Percona Cloud Tools; there will be another post about our setup.
Hello,
TokuDB 7.1.7 uses a linked list of all of the open fractal trees. In Vadim’s example, the list has thousands of entries. Whenever a fractal tree is opened, a full linked list search occurs. We now use a weight-balanced tree to maintain the set of open fractal trees rather than a linked list, so the cost to open a fractal tree is reduced from O(n) to O(log n), where n is the current number of open fractal trees. This change will be shipped in the next version in a couple of months.
Rich,
Thanks for the explanation. I was just going to ask Vadim about the performance difference between InnoDB and TokuDB, but given the linked list issue I guess the difference will grow with the number of tables. I would note, though, that even 3 minutes sounds like a lot for such an operation.