While working on a performance optimization project (not directly MySQL related) we ran a test: write 4K chunks into 100 growing files in random order for some time, then check the read speed on the resulting files, compared to writing a single file sequentially and reading it back.
The performance difference was huge – we could read the single sequential file at 80MB/sec while the fragmented files delivered only about 2MB/sec, a 40x difference.
The test was done on EXT3, which apparently does not do a very good job of preventing file fragmentation when a large number of files grow concurrently.
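The test can be sketched roughly as follows. This is a hypothetical, scaled-down reconstruction (fewer and smaller files so it runs quickly), and note that on a warm buffer cache you would need to drop caches between the write and read phases to actually observe the fragmentation penalty:

```python
import os
import random
import time

# Scaled-down version of the test described above: interleave 4K appends
# across many files so the filesystem is forced to fragment them, then
# read everything back sequentially and time it.
NUM_FILES = 10
CHUNK = 4096
CHUNKS_PER_FILE = 64  # 256K per file in this scaled-down sketch

def write_interleaved(dirname):
    os.makedirs(dirname, exist_ok=True)
    files = [open(os.path.join(dirname, f"f{i}"), "wb")
             for i in range(NUM_FILES)]
    data = os.urandom(CHUNK)
    for _ in range(CHUNKS_PER_FILE):
        # append one 4K chunk to each file, in random order each round
        for f in random.sample(files, len(files)):
            f.write(data)
    for f in files:
        f.close()

def read_back(dirname):
    total = 0
    start = time.time()
    for name in os.listdir(dirname):
        with open(os.path.join(dirname, name), "rb") as f:
            while chunk := f.read(1 << 20):
                total += len(chunk)
    return total, time.time() - start

write_interleaved("fragtest")
nbytes, secs = read_back("fragtest")
print(f"read {nbytes} bytes in {secs:.3f}s")
```

On a real run you would scale the file count and sizes back up and drop the page cache (or reboot) before the read phase, as the original test presumably did.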
It would be interesting to see how other filesystems deal with this problem – XFS, for example, with its delayed allocation may do a better job.
I would also like to repeat the test with MySQL MyISAM tables to see how bad the difference is for MySQL, but I would expect something along those lines.
Interestingly enough, it should not be that hard to fix this issue – one could optionally preallocate MyISAM tables in chunks (say 1MB) so they get less fragmented. It would be interesting to benchmark how much such an approach would generally help.
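To illustrate the idea (this is not an actual MyISAM option, just a sketch): grow the file in fixed 1MB extents via `posix_fallocate` so the filesystem can hand out large contiguous allocations instead of one small extent per 4K append. Since `posix_fallocate` extends the on-disk size, the sketch tracks the logical end-of-data itself:

```python
import os

CHUNK = 1 << 20  # preallocate in 1MB extents (hypothetical knob)

class PreallocFile:
    """Append-only file that reserves disk space in large extents.

    posix_fallocate extends the file size, so we track the logical
    end-of-data ourselves and write with pwrite at that offset.
    """
    def __init__(self, path):
        self.fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
        self.logical = 0   # bytes actually written
        self.reserved = 0  # bytes preallocated on disk

    def append(self, data):
        end = self.logical + len(data)
        if end > self.reserved:
            # round up to the next extent boundary and reserve it
            self.reserved = -(-end // CHUNK) * CHUNK
            os.posix_fallocate(self.fd, 0, self.reserved)
        os.pwrite(self.fd, data, self.logical)
        self.logical = end

    def close(self):
        os.ftruncate(self.fd, self.logical)  # trim the unused tail
        os.close(self.fd)

f = PreallocFile("prealloc.dat")
for _ in range(10):
    f.append(b"x" * 4096)
f.close()
print(os.path.getsize("prealloc.dat"))  # 40960
```

Even when many such files grow concurrently, each preallocation gives the allocator a 1MB request to place, rather than hundreds of 4K ones.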
Until we have this feature, reduced fragmentation is one more benefit we get from batching. For example, instead of inserting rows one by one into a large number of tables, they can be buffered in memory (in the application or in an in-memory table) and flushed to the actual tables in bulk.
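The application-side buffering can be sketched like this (using sqlite3 as a stand-in for MySQL, with made-up table and column names): rows accumulate in memory and are flushed to the table in bulk, so the data file grows in fewer, larger writes.

```python
import sqlite3

BATCH = 1000  # illustrative flush threshold

class BufferedWriter:
    """Buffer rows in memory and flush them to the table in bulk."""
    def __init__(self, conn):
        self.conn = conn
        self.buf = []

    def insert(self, row):
        self.buf.append(row)
        if len(self.buf) >= BATCH:
            self.flush()

    def flush(self):
        if self.buf:
            self.conn.executemany(
                "INSERT INTO t(a, b) VALUES (?, ?)", self.buf)
            self.conn.commit()
            self.buf.clear()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t(a INTEGER, b TEXT)")
w = BufferedWriter(conn)
for i in range(2500):
    w.insert((i, "row"))
w.flush()  # remember the final partial batch
print(conn.execute("SELECT COUNT(*) FROM t").fetchone()[0])  # 2500
```

With MySQL the same pattern would use multi-row INSERT statements or an intermediate in-memory table flushed periodically with INSERT ... SELECT.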
I’ve been working with XFS a lot on storage servers and fragmentation has been a huge problem there too, even though XFS prides itself on not needing defragmentation (at least not as much as some other filesystems). Until fragmentation was sorted out, the load on the box would spike to 100+ under an otherwise reasonable workload. I’ve used xfs_fsr to defragment and ended up cronning it to run for 30 seconds every 15 minutes. I suspect the issue may be related to NFSv3 writing to it in relatively small chunks (as it’s a stateless protocol) rather than to XFS itself.
As for the benchmarks above, I’m curious to see a similar one with ReiserFS, if you find the time, and with ext4 when it comes out.
It would also be interesting to repeat the test with the files on those new solid state drives, to see how much of an improvement you could get.
Oh yes. SSDs reduce the seek penalty dramatically, so fragmentation should be much less of an issue for them.
This is a multi-terabyte data storage project, so it is not a candidate for SSDs yet.
Are you sure the speed difference is caused by fragmentation of the filesystem on disk, and not by the single large file being in the buffer cache? It might be interesting to use the following tools to see what’s in the buffer cache:
http://net.doit.wisc.edu/~plonka/fincore/
http://insights.oetiker.ch/linux/fadvise.html
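A related sanity check that needs no external tools: on Linux, `posix_fadvise` with `POSIX_FADV_DONTNEED` asks the kernel to drop a file's cached pages, so the next read has to come from disk. A minimal sketch (the file name is made up for illustration; Linux only honors the hint for clean pages, hence the fsync):

```python
import os

def drop_cache(path):
    """Ask the kernel to evict this file's pages from the buffer cache."""
    fd = os.open(path, os.O_RDONLY)
    try:
        os.fsync(fd)  # dirty pages are not dropped, so flush them first
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
    finally:
        os.close(fd)

# Write a 1MB probe file, then evict it before timing a read.
with open("probe.dat", "wb") as f:
    f.write(b"x" * (1 << 20))
drop_cache("probe.dat")  # the next read of probe.dat should hit disk
```

Timing reads before and after `drop_cache` would show how much of the 80MB/sec figure is the cache rather than the disk.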
I noticed this a long time ago. I’m fighting it using several measures: a. small tables are optimized periodically; b. big tables that are affected by fragmentation can be arranged as one file per filesystem; c. recently, MySQL 5.1 table partitioning made it possible to use SSD for the most recent (most used) data and disk for historical (less used) data. Index files are the most prone to fragmentation, since a single record inserted or updated can trigger updates on several indexes. No filesystem is immune to fragmentation, unfortunately; the only way to reduce it is to have table options that preallocate space in large chunks (like 1GB data, 1GB index).
Thanks tgabi,
If you have updates you may be suffering from internal fragmentation in addition to filesystem fragmentation – for example, a row may be stored in more than one piece in MySQL, or pages that are sequential in index order may end up in different locations, etc.
Good that you agree on the preallocation option, though I think even much smaller values such as 1MB or 4MB would make things much better.
Brian,
Thanks for the heads up – I’ve written another blog post on those tools.
I’m quite sure it was not caching, because the numbers in vmstat match these quite closely.
Be aware that if you use XFS/ReiserFS on PC hardware, you risk losing data in case of a power failure:
http://zork.net/~nick/mail/why-reiserfs-is-teh-sukc
Some time ago (2006) I noticed that even on a fully defragmented NTFS drive, newly created InnoDB data files got scattered around the whole partition.
http://jroller.com/dschneller/entry/strange_innodb_fragmentation
Daniel,
I’m not quite sure from your post – did you create the tablespace out of many files when the drive was already badly fragmented, or did it autoextend during some workload run?
In any case, we can learn that filesystems may not be as optimal as we would like them to be 🙂