While working on a performance optimization project (not directly MySQL related) we ran a test: create 100 files and keep appending 4K chunks to a randomly chosen file for some time, then compare the read speed of the resulting files to that of a file of the same size written sequentially.
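
For illustration, here is a minimal sketch of such a test in Python – the directory, file count, and sizes are arbitrary, and you need to drop the OS page cache between the write and read phases (for example via /proc/sys/vm/drop_caches) for the read timing to mean anything:

```python
import os
import random
import time

TEST_DIR = "/data/fragtest"   # example location
NFILES   = 100
CHUNK    = 4096               # 4K appends
PER_FILE = 16 * 1024 * 1024   # ~16MB of data per file on average

os.makedirs(TEST_DIR, exist_ok=True)
paths = [os.path.join(TEST_DIR, "f%03d" % i) for i in range(NFILES)]
fds = [os.open(p, os.O_CREAT | os.O_WRONLY | os.O_APPEND) for p in paths]

# Interleave growth: each 4K chunk goes to a randomly chosen file,
# so the files' blocks end up scattered across the disk.
chunk = b"\0" * CHUNK
for _ in range(NFILES * PER_FILE // CHUNK):
    os.write(random.choice(fds), chunk)
for fd in fds:
    os.close(fd)

def read_mb_per_sec(path):
    """Sequentially read the whole file, return throughput in MB/sec."""
    start, total = time.time(), 0
    with open(path, "rb") as f:
        while True:
            buf = f.read(1024 * 1024)
            if not buf:
                break
            total += len(buf)
    return total / (time.time() - start) / 1e6

# Compare against a file of the same size written in one pass.
seq_path = os.path.join(TEST_DIR, "sequential")
with open(seq_path, "wb") as f:
    f.write(b"\0" * PER_FILE)

# Drop the page cache here before timing the reads.
print("fragmented: %.1f MB/sec" % read_mb_per_sec(paths[0]))
print("sequential: %.1f MB/sec" % read_mb_per_sec(seq_path))
```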

The performance difference was huge – we could read the sequentially written file at 80MB/sec, while the fragmented files delivered only about 2MB/sec.

The test was done on EXT3, and it looks like it does not do a very good job of preventing file fragmentation for a large number of growing files.

It would be interesting to see how other filesystems deal with this problem – XFS, for example, may do a better job thanks to its delayed allocation.

I would also like to repeat the test with MySQL MyISAM tables to see how bad the difference is for MySQL, but I would expect results along the same lines.

Interestingly enough, it should not be that hard to fix this issue – one could optionally preallocate MyISAM tables in chunks (say 1MB) so they suffer less fragmentation. It would be worth benchmarking how much such an approach actually helps.
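
MyISAM has no such option, but the idea is easy to sketch at the filesystem level. Here is a hypothetical helper, assuming Linux and Python's os.posix_fallocate (note that on EXT3, which lacks a native fallocate call, glibc falls back to writing zero blocks):

```python
import os

PREALLOC = 1024 * 1024  # reserve space in 1MB chunks

def append_preallocated(fd, data, state):
    """Append data to fd, extending the file's reserved space in 1MB
    steps so the filesystem can allocate larger extents instead of
    one 4K block per write. `state` tracks logical vs. reserved size,
    e.g. {"size": 0, "reserved": 0}. Open fd with os.O_WRONLY |
    os.O_CREAT (no O_APPEND, which overrides pwrite offsets on Linux)."""
    end = state["size"] + len(data)
    if end > state["reserved"]:
        # Round the reservation up to the next 1MB boundary.
        state["reserved"] = (end + PREALLOC - 1) // PREALLOC * PREALLOC
        os.posix_fallocate(fd, 0, state["reserved"])
    os.pwrite(fd, data, state["size"])
    state["size"] = end
    # On close, trim the unused tail: os.ftruncate(fd, state["size"])
```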

Until we have this feature, reduced fragmentation is one more benefit we get from batching. For example, instead of inserting rows one by one into a large number of tables, rows can be buffered (in the application or in a MEMORY table) and flushed to the actual tables in bulk.
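
A minimal sketch of the application-side variant – the table names, row shape, and flush threshold are made up, and any DB-API style MySQL driver with executemany would work:

```python
class InsertBuffer:
    """Accumulate rows per table in application memory and flush them
    as one multi-row INSERT, so each table's data file grows in
    larger, less interleaved writes."""

    def __init__(self, cursor, flush_rows=1000):
        self.cursor = cursor
        self.flush_rows = flush_rows
        self.pending = {}          # table name -> list of row tuples

    def add(self, table, row):
        rows = self.pending.setdefault(table, [])
        rows.append(row)
        if len(rows) >= self.flush_rows:
            self.flush(table)

    def flush(self, table):
        rows = self.pending.pop(table, [])
        if rows:
            placeholders = ", ".join(["%s"] * len(rows[0]))
            self.cursor.executemany(
                "INSERT INTO %s VALUES (%s)" % (table, placeholders),
                rows)

# Usage: buf.add("hits_20090101", (ts, url, user_id)) on the hot path,
# then buf.flush(table) for each pending table at commit/shutdown.
```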

10 Comments
Artem Russakovskii

I’ve been working with XFS a lot on our storage servers, and fragmentation has been a huge problem with it too, even though XFS prides itself on not needing defragmentation (at least not as much as some other filesystems). Until fragmentation was sorted out, the load on the box would spike to 100+ under an otherwise reasonable workload. I’ve used xfs_fsr to defrag and ended up cronning it to run for 30 seconds every 15 minutes. I suspect the issue may be related to NFSv3 writing to it in relatively small chunks (as it’s a stateless protocol) rather than XFS itself.
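
(A crontab sketch of that setup – the device path is an example; xfs_fsr’s -t option caps the run time in seconds:)

```
# Run the XFS online defragmenter for at most 30 seconds,
# every 15 minutes:
*/15 * * * * /usr/sbin/xfs_fsr -t 30 /dev/sdb1
```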

As far as the benchmarks above, I’m curious to see a similar one with ReiserFS, if you find the time, and with ext4 when it comes out.

Bill

It would also be interesting to repeat the test with the files on those new solid state drives, to see how much of an improvement you could get.

brian

Are you sure the speed difference is caused by fragmentation of the filesystem on the disk, and not due to the single large file being in the buffer cache? It might be interesting to use the following to see what’s in the buffer cache:

http://net.doit.wisc.edu/~plonka/fincore/

http://insights.oetiker.ch/linux/fadvise.html
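
(Along those lines, one way to take the cache out of the equation before re-timing a read is to evict the file with fadvise – a minimal sketch on Linux, in the spirit of the second link:)

```python
import os

def evict_from_page_cache(path):
    """Ask the kernel to drop this file's cached pages so the next
    read is served from disk (best effort, Linux-specific)."""
    fd = os.open(path, os.O_RDONLY)
    try:
        os.fsync(fd)  # dirty pages cannot be dropped; flush first
        # Length 0 means "from offset to end of file".
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
    finally:
        os.close(fd)
```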

tgabi

I noticed this a long time ago. I’m fighting it with several measures:
a. small tables are optimized periodically;
b. big tables affected by fragmentation can be arranged as one file per filesystem;
c. recently, MySQL 5.1 table partitioning made it possible to keep the most recent (most used) data on SSD and historical (less used) data on disk.
Index files are the most prone to fragmentation, since a single inserted/updated record can trigger updates on several indices. Unfortunately, no filesystem is immune to fragmentation; the only way to reduce it is to have table options that pre-allocate space in large chunks (like 1GB data, 1GB index).

Markus

Be aware that if you use XFS/ReiserFS on PC hardware, you risk losing data in the event of a power failure:

http://zork.net/~nick/mail/why-reiserfs-is-teh-sukc

Daniel Schneller

Some time ago (2006) I noticed that even on a fully defragmented NTFS drive, newly created InnoDB data files got scattered around the whole partition:
http://jroller.com/dschneller/entry/strange_innodb_fragmentation