I have been following Virident for a long time (e.g. https://www.percona.com/blog/2010/06/15/virident-tachion-new-player-on-flash-pci-e-cards-market/). They have great PCIe Flash cards based on SLC NAND.
I always thought that Virident needed to come up with an MLC card, and I am happy to see they have finally done so.

At Virident’s request, I performed an evaluation of their MLC card to assess how it handles MySQL workloads. Since I am very satisfied with the results, I want to share my findings in this post.

But first, I wish to offer an overview of the card.

Virident FlashMax cards are available in 1TB and 1.4TB usable capacities (the model names are M1000 and M1400).
Both models are already shipping to end users.
I evaluated the M1400 (1.4TB) model, which I discuss below.

Since Virident has competition in the SSD market, they have stated the following goals to distinguish themselves from their competitors:

  • Stability of performance: that is, minimizing variations in throughput.
  • Better response times: this is very important for database performance, and I appreciate that Virident has made it a priority.
  • Performance at full capacity: as we know, SSD-based cards have a special characteristic; throughput declines as space utilization increases. Virident’s design minimizes this decline.
  • RAID5 on the card: built-in RAID5 support provides better data protection.

To deal with this throughput decline, all flash cards have reserved space. The 1.4TB card that I have internally holds 2TB worth of flash.

This additional space is used for two purposes:

      1. To absorb write-intensive workloads by spreading writes across the additional space.
      2. To provide replacements for failed MLC modules: when an MLC module fails, it is marked as unused and gets replaced by one from the pool of reserved modules.
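
To put rough numbers on this: with 2TB of raw flash behind 1.4TB of usable space, the M1400 keeps about 600GB, or 30% of its raw capacity, in reserve. A trivial check, using only the capacities quoted above:

    # Reserve on the M1400: raw capacity minus usable capacity, in GB
    echo $(( 2000 - 1400 ))                     # 600
    # Share of raw capacity held in reserve, in percent
    echo "scale=1; 600 * 100 / 2000" | bc       # 30.0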

Internally, Virident uses 25nm Intel MLC NAND flash modules, the same technology Intel uses for its SSD 320 drives. 25nm modules allow greater capacity: physically, you can place more GBs into a given area. The drawback is that 25nm has worse read and write latencies compared to previous generations; however, I have yet to determine how this affects MySQL workloads.

Virident has provided the following price list:

• M1000 (1000GB usable) – $13,000
• M1400 (1400GB usable) – $18,200

Both models work out to $13/GB.

It is also important to compare the performance of the Virident FlashMAX MLC with available competing solutions.
It is fair to say the Fusion-io ioDrive Duo 1.28TB MLC is the most well-known and most advanced competitor on the market.
I had a chance to run a head-to-head comparison of sysbench and tpcc-mysql workloads between the FlashMAX 1.4TB and the ioDrive Duo 1.28TB.

It is important to highlight that the Fusion-io ioDrive Duo is based on 34nm NAND technology, a full generation behind 25nm NAND; however, at this point I have no access to the Fusion-io ioDrive2, which is based on 25nm NAND.
Another important factor is that the ioDrive Duo actually appears as two cards in the OS, so the user needs to set up software RAID, as sketched below. The Virident card shows all 1400GB as one single drive, so no software RAID is necessary.
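
For reference, combining the two halves of a Duo into one striped volume with Linux software RAID looks roughly like this. This is a minimal sketch: the /dev/fioa and /dev/fiob device names and the filesystem choice are assumptions for illustration, not necessarily the exact setup used in these benchmarks.

    # Stripe the two halves of the ioDrive Duo into a single RAID0 device
    mdadm --create /dev/md0 --level=0 --raid-devices=2 /dev/fioa /dev/fiob
    # Put a filesystem on the striped device and mount it
    mkfs.xfs /dev/md0
    mount /dev/md0 /mnt/data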

To compare performance, I ran the sysbench oltp and tpcc-mysql benchmarks. Below I present the results for sysbench oltp (with the full report available later); the results for tpcc-mysql will follow in a separate post.

For sysbench, I used our multi-table sysbench implementation with 256 tables of 10,000,000 rows each. This is a total of around 630GB of data, which is enough to adequately fill both cards in the comparison.
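
Preparing the dataset looked roughly like the following. This is a sketch assuming the sysbench 0.5 multi-table OLTP Lua scripts; the exact script path and option names may differ between versions.

    # Create 256 sysbench tables of 10,000,000 rows each (~630GB in total)
    sysbench --test=oltp.lua \
             --oltp-tables-count=256 --oltp-table-size=10000000 \
             --mysql-user=root --mysql-db=sbtest \
             prepare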

The hardware and software used in the benchmarks:

    • Server: Cisco UCS C250, running Oracle Linux 6.1 and Percona Server 5.5.15
• Client: HP ProLiant DL380 G6, sysbench 0.5

Of course, our Percona Server was optimized for flash storage, and I varied two settings:
I tested combinations of innodb_buffer_pool_size = 120GB or 174GB and innodb_flush_log_at_trx_commit = 1 or 2.

The results in this post are for the case of innodb_buffer_pool_size=174GB and innodb_flush_log_at_trx_commit=1.
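
In my.cnf terms, the configuration reported here boils down to the following fragment (the rest of the flash-oriented tuning is omitted, and the /etc/my.cnf path is just illustrative):

    # Append the two settings under test to the server configuration
    cat >> /etc/my.cnf <<'EOF'
    [mysqld]
    innodb_buffer_pool_size        = 174G
    innodb_flush_log_at_trx_commit = 1
    EOF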

As in all my recent benchmarks, I use long runs of 1 hour each, with measurements every 10 seconds. This methodology allows me to observe trends and the stability of performance in the graphs.
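
In sysbench terms, each run looked roughly like this (again a sketch assuming sysbench 0.5, whose --report-interval option produces the 10-second measurements):

    # One 1-hour run at a given concurrency, reporting tps every 10 seconds
    sysbench --test=oltp.lua \
             --oltp-tables-count=256 --oltp-table-size=10000000 \
             --num-threads=32 \
             --max-time=3600 --max-requests=0 \
             --report-interval=10 \
             run > run_32.log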

The first graph shows throughput in transactions per second for different numbers of user threads (more is better). More concentrated dots represent less variance and better stability of throughput.

In tabular form (for throughput I use the median of the measurements from the last 1800 seconds of each run):

Card / Threads            1     2     4     8    16    32    64   128   256   512  1024
Fusion-io ioDrive Duo    83   177   322   523   644   740   801   798   761   784   162
Virident FlashMAX        96   179   357   607   821   975  1083  1156  1064  1091   465
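
For those curious how the median is extracted: with 10-second reports, the last 1800 seconds are the last 180 samples of a run log. One possible post-processing one-liner (a sketch, assuming output saved as in the run command above):

    # Median tps over the last 1800 seconds (180 samples at 10s intervals)
    grep -o 'tps: [0-9.]*' run_32.log | awk '{print $2}' \
        | tail -n 180 | sort -n \
        | awk '{v[NR] = $1} END {print v[int((NR + 1) / 2)]}'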

To examine in detail how throughput varies, I took the 32-thread runs and looked at the timeline graph for each card:

While you can see that Virident FlashMAX holds a fairly stable line around 975 tps, the Fusion-io ioDrive Duo varies between 700 and 800 tps.

My conclusions are as follows:

• It is great to see another player on the MLC flash card market.
• It is also great that Virident focuses on stability of performance as a competitive advantage.
• Besides stability, we also see better throughput in MySQL with the Virident FlashMAX card at every thread count. At 32-64 threads, Virident FlashMAX has roughly a 35-40% advantage.

DISCLOSURE: This review was done as part of our consulting practice, for which we were compensated by Virident. However, this review was written independently of Virident and reflects our own opinion of the product.

The full report is available here.


Comments
    Chris Connor

    SSD technology and performance is a rapidly changing landscape.
    A lot of these performance characteristics depend largely on the drivers/firmware available for the devices, with each firmware upgrade bringing in new features, speeds, etc.

    Can you comment on the applicable software and firmware levels of the devices tested?

    Also, you mentioned that the ioDrive Duo device shows up as two separate devices and requires software raid to create a contiguous space. Can you outline the steps you used to create this space, including any md/LVM steps and filesystem creation as far as blocksize/stride are concerned?

    Thanks!

    Peter Zaitsev

    Vadim,

    I wonder if the redundancy inside the cards is comparable in this case? I see you use raid=disabled here. Would Fusion-io, for example, preserve the data if one of the flash memory modules were to fail? What about Virident?

    Peter Zaitsev

    Vadim,

    ECC is Redundancy

    If I read this text correctly, it says there is ECC data on a redundant chip, which could mean that if a chip fails, data can be recovered from the rest of the chips plus the redundantly stored data.

    In general, data loss prevention strategies can be rather complicated these days, so it might be more interesting to look at the likelihood of data loss, which might be in the specs in some form.

    Frederick M'Cormack

    Indeed, the failure rate of a chip could be published, but it doesn’t sell the product, especially if the failure rate is measured in decades or in times outside the boundaries of usability of the product. Does Virident have a ‘replace bad chips on-the-fly’ capability?

    rj03hou

    Vadim,
    Did you run sysbench in complex mode?
    How do you gather the throughput metric every 10 seconds?
    “For sysbench, I used our multi-table sysbench implementation with 256 tables of 10,000,000 rows each.” Is the multi-table sysbench your internal version?

    Sergey Kulagin

    Vadim,
    Do you have any idea about the life span of these products? The manufacturer talks about reliability, but I haven’t found any numbers in terms of years or writes.
    Thanks!

    Bob

    I’ve used RAID-0 on the Fusion-io cards, including RAIDing both halves of a Duo. Although the cards have power-loss protection and will generally go into a read-only mode if there is a problem, including a driver failure, the caveat with RAID-0 is that you can end up with a corrupt volume even if no data loss occurs. If one of the striped drives is written and the other one isn’t for some operation, the result is a corrupt volume. Therefore, even with the protections on the card (parity, etc.), RAID-0 is not safe, because driver errors do occasionally happen even if the drives themselves rarely ever lose data. A failure to complete a write on RAID-0 corrupts the volume, even if no prior data was actually lost, because the stripes end up out of sync.

    Bob

    Actually, I’ve been using the Windows platform, not Linux. I think this issue applies to any RAID-0 volume regardless of hardware or software platform. If even one drive in a RAID-0 stripe set is unable to write its portion of the data stripe, the volume is hopelessly corrupted, even if the cause is something that has nothing to do with the actual drive integrity, i.e. power failure, controller error, disk timeout, etc. That is true even if the drive itself is still healthy and still has all of its data from before the failure.

    There may of course be hacks or tools to recover data from a RAID-0 and ignore the incomplete stripe, but that is my experience with Windows software RAID-0. Maybe some RAID-0 implementations are smart enough to ignore an incomplete stripe rather than mark the drive as unusable; I haven’t had time to research this in more detail.