As MLC-based SSD cards rise in popularity, there is also a rising concern about how long they can survive. As we know, an MLC NAND module can handle 5,000-10,000 erase cycles, after which it becomes unusable. So obviously an SSD card based on MLC NAND has a limited lifetime. There are a lot of misconceptions and misunderstandings about how long such a card can last, so I want to show some calculations to shed light on this question.

As a baseline I will take the Virident FlashMAX M1400 (1.4TB) card. Virident guarantees 15PB (PB as in petabytes) of writes on this card.
15PB sounds impressive, but how many years does it correspond to? Of course it depends on your workload, mainly on how write-intensive it is. But there are some facts that can help you estimate.

On Linux you can look into the /proc/diskstats file, which shows something like:

where 8492649856 is the number of sectors written since the reboot (a sector is 512 bytes).
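The raw output line did not survive formatting here, so as a minimal sketch, this is how you can pull the sectors-written field (the 10th per-device field, always in 512-byte sectors) out of a /proc/diskstats line. Only the 8492649856 value comes from the post; the device name and all other fields in the sample line are made up for illustration.

```python
# Minimal sketch: extract bytes written from a /proc/diskstats line.
# Fields per device: major, minor, name, reads, reads merged, sectors read,
# ms reading, writes, writes merged, sectors written, ms writing, ...
SECTOR_SIZE = 512  # /proc/diskstats always counts 512-byte sectors

def bytes_written(diskstats_line: str) -> int:
    fields = diskstats_line.split()
    sectors_written = int(fields[9])  # 10th field: sectors written
    return sectors_written * SECTOR_SIZE

# Illustrative line; only the sectors-written value is from the post.
sample = "252 0 vgca0 1200000 300 96000000 450000 9000000 1500 8492649856 780000 0 600000 1230000"
print(bytes_written(sample))  # 4348236726272 bytes written
```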

Now you might say that we can sample /proc/diskstats at 1-hour intervals, see how many bytes per hour we write, and calculate the potential lifetime that way.
That is only partially correct. There is a factor called Write Amplification, which is very well described on Wikipedia, but basically SSD cards, due to their internal organization, write more data than comes from the application.
Usually the write amplification is equal or very close to 1 (meaning there is no overhead) for sequential writes, and it reaches its maximum for fully random writes. That value can be 2-5 or more, and depends on many factors such as the used capacity and the space reserved for over-provisioning.

Basically it means you should look into the card's own statistics to get the exact number of bytes written.
For Virident FlashMAX, this is reported by the vgc-monitor tool.

Having this info, let's take a look at what lifetime we can expect under a tpcc-mysql workload.
I ran 32 user threads against a 5000W dataset (about 500GB of data on disk) for 1 hour.

After 1 hour, /proc/diskstats shows 984,442,441,728 bytes written, which is 984.44GB, while the Virident stats show 1,125,653,692,416 bytes written, which is 1,125.65GB.
This allows us to calculate the write amplification factor, which in our case is
1,125,653,692,416 / 984,442,441,728 = 1.143. That looks very decent, but remember we are using only 500GB out of 1400GB, and the factor will grow as we fill more space.
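The write amplification arithmetic above is just the device-level bytes divided by the host-level bytes; both measurements are the ones quoted in the post:

```python
# Write amplification = bytes the card physically wrote / bytes the host sent.
host_bytes   = 984_442_441_728     # from /proc/diskstats
device_bytes = 1_125_653_692_416   # from the Virident card statistics

write_amplification = device_bytes / host_bytes
print(round(write_amplification, 3))  # 1.143
```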

Please note we put quite an intensive write load on the card during this hour.
MySQL handled 25,000 updates/sec, 20,000 inserts/sec and 1,500 deletes/sec, which corresponds to
a write throughput of 273.45MB/sec from MySQL to disk.
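For the record, the throughput figure follows from the /proc/diskstats total over the hour; the 273.45MB/sec in the text works out if 1 MB is taken as 10^6 bytes, which is the assumption in this sketch:

```python
# Host-side write throughput over the 1-hour run, from the
# /proc/diskstats total quoted in the post (decimal MB = 10**6 bytes).
host_bytes = 984_442_441_728
seconds = 3600

mb_per_sec = host_bytes / seconds / 1_000_000
print(round(mb_per_sec, 2))  # 273.46 (the post truncates to 273.45)
```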

This lets us calculate the lifetime of the card if we ran such a workload 24/7 non-stop.
15PB (of total writes) / 1125.65GB (per hour) = 13,325.634 hours = 555.23 days = 1.52 years
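The same arithmetic, reproduced in decimal units (15PB taken as 15,000,000 GB, matching how the post's numbers work out):

```python
# Lifetime estimate: warranted writes divided by the measured
# device-level write rate, both in decimal gigabytes.
warranted_gb = 15_000_000   # 15PB of warranted writes
gb_per_hour  = 1125.65      # Virident stats for the 1-hour tpcc-mysql run

hours = warranted_gb / gb_per_hour
days  = hours / 24
years = days / 365

print(round(hours, 3), round(days, 2), round(years, 2))  # 13325.634 555.23 1.52
```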

That is, under a non-stop tpcc-mysql workload we can expect the card to last 1.52 years. However, in real production you do not have a uniform load every hour, so you may base your estimate on daily or weekly stats.

Unfortunately there is no easy way to predict this number until you run your workload on the SSD.
You can look into /proc/diskstats, but:
1. There is a write amplification factor, which you do not know in advance.
2. Throughput on a regular RAID is much lower than on an SSD, so you do not know what your throughput will be once you move the workload to the SSD.


8 Comments
Jake

I'd like to know what happens when the SSD starts failing due to hitting these limits. Does the whole thing stop working catastrophically, or something else? How does MySQL perform on a dying SSD?

Edmar

Absolutely great post, something I’ve been wondering about for some time. Thanks!

Do you have any idea how the end-of-life of such a card would manifest itself? A hard instantaneous failure, or maybe progressively severe performance degradation as less and less hidden reserve capacity (reserved MLC modules) is available?

Is there a way to measure/monitor failed MLC module count from vgc-monitor -d output?

Kep

Intel has the Solid-State Drive Toolbox, which shows drive health and estimated drive life remaining.
http://downloadcenter.intel.com/Detail_Desc.aspx?agr=Y&DwnldID=18455

Shirish Jamthe

Hello Jake, Edmar

Let me respond to your questions in two parts.
First let me introduce you to the output of vgc-monitor that Vadim is looking at for lifetime.
In my second comment I will describe how Virident defines ‘End of Life’ for Virident’s FlashMAX cards.

As you can see from the output below, we not only tell you the bytes written but also the remaining life as a percentage, so you can monitor it closely.

Shirish Jamthe

Hello Jake, Edmar

In my above post the formatting didn’t come through. But the key things you may want to focus on are
1. The drive status, which shows ‘GOOD’ here.
2. RAID status, it shows ‘enabled’ here.
3. Remaining Life : 99.60%
4. Writes as mentioned by Vadim.

As you may already know, all SSDs come with over-provisioning. As blocks start to go bad they are replaced from the reserve.
Virident has defined 'End of Life' as a threshold or watermark for the bad block count (or reserve capacity) such that the card's performance does not degrade from what it was spec'd at.

So it is definitely possible to use the card after the EOL of warrantied writes, especially if the application is read-mostly.
We have another watermark or threshold beyond which the card will not be able to sustain writes. At that point the card will go into 'READ ONLY' mode, as reflected by the card status, and you will still be able to recover data from it.

So in summary, there is no catastrophic failure after you have written beyond the warrantied writes. You will get a warning from the life-left reading in the monitoring tool, as well as the ability to continue for a while, so you can manage a timely replacement or migration.

I am happy to chat more on this if you have further questions. Again, I have described behavior specific to Virident SSDs, not generic SSDs.

-Shirish

Vojtech Kurka

Vadim: Yes, but you can watch the drive’s health using smartctl.

/usr/sbin/smartctl -a /dev/sdd

232 Available_Reservd_Space 0x0033   100   100   010    Pre-fail  Always       –       0
233 Media_Wearout_Indicator 0x0032   099   099   000    Old_age   Always       –       0

Attribute 225 (E1) = total host writes; multiply the raw value by 32MB.
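As a tiny sketch of the conversion Vojtech describes (the raw value here is hypothetical, and I am assuming 32MB means binary megabytes):

```python
# Convert the Intel SMART attribute 225 (E1) raw value to bytes:
# each raw unit represents 32MB of host writes, per the note above.
UNIT = 32 * 1024**2   # 32MB per raw unit (assumed binary megabytes)

raw_value = 100_000   # hypothetical E1 raw reading, not from the post
host_writes_bytes = raw_value * UNIT
print(host_writes_bytes)  # 3355443200000
```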

George

I suppose you could use pt-diskstats https://www.percona.com/doc/percona-toolkit/pt-diskstats.html in place of /proc/diskstats?