I’ve been working with the Clustrix team for a long time on an evaluation of the Clustrix product, and this is a report on the performance characteristics of Clustrix under a tpcc-mysql workload.

I tested tpcc 5000W (~500GB of data in InnoDB) on Clustrix systems with 3, 6, and 9 nodes. To have a base for comparison, I also ran the same workload on an HP ProLiant DL380 G6 powered by a Fusion-io card, and on a SuperMicro server powered by 7 Intel SSD 320 drives (this server is equivalent to the hardware that Clustrix uses for its nodes).

The full report is available on our whitepapers page, and in this post I would like to highlight the most interesting points.

The chart comparing all systems (results are throughput per 10 seconds; more is better):

So my conclusions from this benchmark:

  • Clustrix shows very good scalability under a highly concurrent workload as nodes are added.
    In fact, throughput improves by more than 2 times (3 times) when doubling (tripling) the number of nodes. This is possible because Clustrix automatically distributes data across the new nodes, so the data/memory ratio decreases, which allows better throughput per node.
  • Clustrix is able to handle a workload as complex as tpcc, and automatically distributes load between nodes despite multi-statement transactions and foreign key relations.
  • For a workload with a small number of threads, Clustrix does not perform as well as the system with the Fusion-io card.
  • We should also take into account that Clustrix automatically provides high availability by maintaining redundant information on each node. The other systems in the comparison are not fault- or crash-tolerant.
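
As a rough illustration of the data/memory point above, here is a minimal Python sketch of how the ratio of stored data to RAM per node shrinks as nodes are added. It uses the ~500GB data size from this test and the 48GB of RAM per node mentioned in the comments below. The number of redundant copies (REPLICAS = 2) and the function name are assumptions made only for this example, not documented Clustrix settings, and the sketch says nothing about measured throughput.

```python
# Back-of-the-envelope sketch, not a model of Clustrix internals.
DATA_GB = 500          # approximate tpcc 5000W data size from the post
RAM_PER_NODE_GB = 48   # per-node RAM as stated in the comments
REPLICAS = 2           # ASSUMPTION: number of copies kept for redundancy


def data_to_memory_ratio(nodes, data_gb=DATA_GB,
                         ram_per_node_gb=RAM_PER_NODE_GB,
                         replicas=REPLICAS):
    """Return (data stored per node) / (RAM per node), assuming an even spread."""
    if nodes <= 0:
        raise ValueError("nodes must be positive")
    data_per_node_gb = data_gb * replicas / nodes
    return data_per_node_gb / ram_per_node_gb


if __name__ == "__main__":
    for n in (3, 6, 9):
        print(f"{n} nodes: data/memory ratio per node = {data_to_memory_ratio(n):.2f}")
```

Under these assumptions the ratio halves when the node count doubles, so a larger share of each node's data can stay in memory.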

So, looking at the results, Clustrix might not be your first choice for single-threaded or low-concurrency workloads from a performance point of view, but consider other factors such as high availability and transparent auto-rebalancing out of the box. For highly concurrent workloads, Clustrix provides great performance, and if you need better throughput, just add more nodes.

The other factor that would be interesting to compare, but which I did not cover in this research, is the total cost of the system. I need to ask Clustrix how the cost of 3-, 6-, and 9-node systems compares to the other systems in the comparison.

Standard disclaimer: this post is part of a paid evaluation we perform for Clustrix, but it is totally independent and fully reflects our opinion.


12 Comments
Andy

Interesting that at high concurrency 7 Intel 320s perform pretty much the same as Fusion-io.

7 320s cost about $2K. How much does that Fusion-io card cost – around $10K?

When you put 7 SSDs in RAID-0, did SATA become the bottleneck?

Mikael Ronstrom

Is a HW specification of the Clustrix systems available as well?

Aaron Passey

The Clustrix node is mostly off-the-shelf hardware with a couple of extra components in it. It has 7 Intel 320 drives, 48GB RAM, and dual 4-core Westmere processors. The extra components are InfiniBand for inter-node communication and an NVRAM for very low latency writes with guaranteed durability.

The hardware is important, of course, but the real magic is in the Clustrix software. The software is what allows it to seamlessly scale a single database to span multiple nodes.

marrtins

How does this compare to Galera replication?

Vadim

marrtins,

I am going to write a detailed blog post about Clustrix; in short, it is very different from Galera replication.

Galera is based on MySQL/InnoDB and is open source; each node contains a full copy of the data.

Clustrix is proprietary software that uses only the MySQL protocol; internally it is a completely different solution.
Clustrix does not keep a full copy of the data on each node, only a part of it, and it automatically rebalances load and data distribution.

marrtins

Oh, thanks for the insight! At a quick glance they looked very similar to me. Looking forward to the detailed article.

Tim Vaillancourt

Very interesting performance numbers. Can anyone comment on the overall stability of Clustrix today? I am basing this on second-hand knowledge from colleagues who tested it, but my understanding was that it still had many bugs to iron out. In the end we chose not to use it due to bugs. This was about a year ago, however.

Cheers,

Tim

Dan Pollack

We have had Clustrix in production for quite some time. It lives at the core of an internal production storage service. It performs well, is scalable, and is stable. Check out the white paper here – http://www.clustrix.com/uploads/documents/Clustrix_Use_Case_AOL.pdf
If you have questions, I’d be happy to talk about our experience with Clustrix.

Pawel Sidoryk

Hello,
I came across this post because I would like to learn more about Clustrix. The comparison of Clustrix to MySQL is very interesting, but I have doubts regarding one detail of your testing methodology. You used 4 clients to test Clustrix and only 1 client to test MySQL. Why? This is very strange, since I think that the client, not the MySQL server, was what actually got saturated in the MySQL case. Or maybe I am wrong and there was a reason to use 4 clients to test Clustrix and only 1 client to test MySQL?
Could you please give evidence that it was really the server that was saturated in the MySQL test and that the client was NOT saturated?

dbuzz007

Hello, and thanks for the detailed comparison. Can you provide the TPC-C repository source so that your benchmark can be repeated?