LVM read performance during snapshots

For the same customer for whom I am exploring ZFS for backups, the twin server uses regular LVM and XFS. On this twin, I have set up mylvmbackup for a more conservative backup approach. I quickly noticed some odd behavior: the backup was taking much longer than I expected. It is not the first time I have seen this, but here it was obvious. So I recorded some metrics during a backup: bi (blocks read in) from vmstat and the percentage of COW space used from lvs. COW space is the copy-on-write buffer LVM uses to keep the modified blocks as they were at the beginning of the snapshot. On each read, LVM must scan this list to verify that there's no newer version. Here are the other details about the backup:

Filesystem: 2TB, xfs
Snapsize: 60GB
Amount to backup: ~600GB
Backup tool: mylvmbackup
Compressor: pbzip2
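The two metrics mentioned above could be sampled with something like the following. This is only a minimal sketch, not the script used for this post: the function names are made up for illustration, and it assumes the `snap_percent` report field of `lvs` and the usual column header line printed by `vmstat`.

```python
import subprocess


def parse_vmstat_bi(vmstat_output):
    """Return the 'bi' value (blocks read in per second) from the last sample line."""
    rows = [line.split() for line in vmstat_output.strip().splitlines()]
    # The column header is the row that contains the literal token "bi".
    header = next(cols for cols in rows if "bi" in cols)
    return int(rows[-1][header.index("bi")])


def parse_lvs_percent(lvs_output):
    """Parse the single value printed by `lvs --noheadings -o snap_percent`."""
    # Some locales print a decimal comma, so normalize it first.
    return float(lvs_output.strip().replace(",", "."))


def sample(snapshot_lv):
    """Take one (bi, COW percent used) sample; snapshot_lv is e.g. 'vg/snapshot'."""
    # "vmstat 1 2" prints two samples; the first is the average since boot,
    # so only the second (last) line reflects current activity.
    vm = subprocess.run(["vmstat", "1", "2"],
                        capture_output=True, text=True, check=True).stdout
    lv = subprocess.run(["lvs", "--noheadings", "-o", "snap_percent", snapshot_lv],
                        capture_output=True, text=True, check=True).stdout
    return parse_vmstat_bi(vm), parse_lvs_percent(lv)
```

Calling `sample()` in a loop with a timestamp, while the backup runs, gives the two series to plot against each other. `lvs` normally needs root privileges.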

As the recorded metrics show, the processing of the COW space has a huge impact on read performance. For this database, the backup took 11 hours, but if I stop the slave and let it settle for 10 minutes so that the insert buffer is cleared, the backup takes a bit less than 3 hours. It could probably be faster still with a faster compressor, since the bottleneck is now the CPU overhead of pbzip2, with all cores at 100%.
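To put those two backup times in perspective, here is a rough back-of-the-envelope calculation of the average read rate. It assumes exactly 600GB (the post says about 600GB), decimal units, and rounds "a bit less than 3h" up to 3 hours, so the numbers are approximate.

```python
def avg_throughput_mb_s(size_gb, hours):
    """Average throughput in MB/s for reading size_gb gigabytes in the given hours."""
    return size_gb * 1000 / (hours * 3600)


# Slave running during the backup: roughly 15 MB/s on average.
with_writes = avg_throughput_mb_s(600, 11)

# Slave stopped and insert buffer cleared: roughly 56 MB/s on average.
without_writes = avg_throughput_mb_s(600, 3)

print(f"{with_writes:.1f} MB/s vs {without_writes:.1f} MB/s")
```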

So, for large filesystems, if you plan to use LVM snapshots, keep in mind that read performance degrades as the COW space fills up, and it might be a good idea to reduce the number of writes during the backup. You could also compress the backup in a second stage if you have the storage capacity.
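The two-stage idea can be sketched as follows. This is an illustration only and not how mylvmbackup works internally: the function names and paths are hypothetical, and Python's `tarfile` and `bz2` modules stand in for tar and pbzip2. The point is that the first stage only reads the snapshot, and the CPU-bound compression runs afterwards against the staged copy.

```python
import bz2
import shutil
import tarfile


def stage_one_archive(source_dir, tar_path):
    """Stage 1: read the mounted snapshot into a plain, uncompressed tar file."""
    with tarfile.open(tar_path, "w") as tar:
        tar.add(source_dir, arcname=".")


def stage_two_compress(tar_path, bz2_path):
    """Stage 2: compress the staged tar file; the snapshot is no longer read here."""
    with open(tar_path, "rb") as src, bz2.open(bz2_path, "wb") as dst:
        # Stream in 1MB pieces so the whole archive never has to fit in memory.
        shutil.copyfileobj(src, dst, length=1024 * 1024)
```

The cost is the extra storage for the uncompressed archive, which for this backup would be around 600GB.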

UPDATE: I ran a comparison on the twin server that runs ZFS, and I was able to pull data from the snapshot at about 120MB/s. The reason for this, I believe, is that LVM works at the block level and has no knowledge of files, while ZFS works at the file level and is able to perform read-ahead.

12 Comments


Chris Boulton

10 years ago

LVM snapshot performance is something I’ve wanted to look at for a while, and at the same time do a study of performance of snapshots taken using Idera’s HotCopy (http://www.idera.com/productssolutions/freetools/sblinuxhotcopy).

The biggest benefit of HCP is that it doesn’t require free space in a volume group to store the changed deltas while a snapshot is active – it uses a file on the file system. The downside is of course, it uses a proprietary kernel module.

john

10 years ago

So, how does it compare to ZFS, which always uses COW?

gebi

10 years ago

Yea, it would be _really_ interesting how ZFS compares to LVM snapshot based backup!

Dennis Jacobfeuerborn

10 years ago

Did you perform this benchmark with the old type snapshots or the new thinly provisioned ones? At least for writes the new implementation is dramatically faster than the old one as the COW is performed differently (and it has more features).

Rolf

10 years ago

Peter had a related writeup back in 2009 : http://www.mysqlperformanceblog.com/2009/02/05/disaster-lvm-performance-in-snapshot-mode/

Peter Zaitsev

Admin

10 years ago

There might be some misconception about what is happening here. Typically, right after you create an LVM snapshot, reads from the snapshot (assuming no other disk operations) are as fast as reads from the origin, both for sequential and random IO. However, if you leave the database running, read performance will degrade, especially for _sequential_ reads, where it can go down 10x or more. The main reason for that is not the scanning of the COW zone but rather the fact that a lot of blocks in the snapshot have to be recovered from various COW locations, so what is, say, a 1MB sequential read on the original partition might end up being multiple non-sequential IO operations.
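The effect described in this comment can be illustrated with a toy model. It is a deliberate simplification and not how LVM is implemented: it assumes every chunk overwritten on the origin gets its old contents copied to the next free slot of the COW area, in the order of the writes, and it counts how many contiguous physical runs a sequential read of the snapshot needs. The function name is made up for this sketch.

```python
import random


def physical_runs(num_chunks, modified_chunks):
    """Count the contiguous physical runs needed to read the snapshot sequentially.

    modified_chunks lists origin chunk numbers in the order they were overwritten;
    each one gets the next free slot in the COW area on its first write.
    """
    cow_slot = {}
    for chunk in modified_chunks:
        if chunk not in cow_slot:
            cow_slot[chunk] = len(cow_slot)

    runs = 0
    previous = None
    for chunk in range(num_chunks):
        # Modified chunks are read from the COW area, the others from the origin.
        if chunk in cow_slot:
            location = ("cow", cow_slot[chunk])
        else:
            location = ("origin", chunk)
        contiguous = (previous is not None
                      and location[0] == previous[0]
                      and location[1] == previous[1] + 1)
        if not contiguous:
            runs += 1
        previous = location
    return runs


# A fresh snapshot reads as one sequential run.
print(physical_runs(1000, []))

# Random writes to 10% of the chunks break the read into many separate runs.
random.seed(0)
print(physical_runs(1000, random.sample(range(1000), 100)))
```

On spinning drives each extra run is roughly one more seek, which is why sequential reads suffer the most.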

Yves Trudeau

Author

10 years ago

Sorry for my slow replies to the comments; I posted just before leaving for vacation. After some thought and discussion with Peter, the difference comes from the added random IO operations required to perform what would normally be a sequential read. I'll do a comparison with ZFS today.

Yves Trudeau

Author

10 years ago

Just ran a similar test on ZFS and I got about 120MB/s while reading over 400GB. This is quite good, especially considering there are 4 MySQL instances running on that server.

Peter Zaitsev

Admin

10 years ago

Yves,

Is it the same kind of volume: legacy drives, not SSDs?

Random updates could cause either sequential reads on the old data being much slower (snapshot) or on the currently active copy…. unless you can get some magic 🙂

Yves Trudeau

Author

10 years ago

Peter, these are all spinning drives (15k rpm), on both the LVM and ZFS servers. The difference is really a matter of read-ahead: if I disable the caching with “zfs set primarycache=metadata …”, the read performance falls by a factor of 60. There are approximately 60 drives in the ZFS pool.

Peter Zaitsev

Admin

10 years ago

Yves,

Got it. I did not realize there were so many drives. With such a large number of drives, read-ahead will indeed be the leading factor.
With 120MB/sec and 60 drives it is just 2MB/sec per drive, which is very reachable even with semi-random IO.

Eric FRANCKX

10 years ago

Hi,
if you use NetApp storage (NFS), don't you think it would go faster? Something like a few seconds (snapshot on WAFL).

Regards,

Eric
