How to load large files safely into InnoDB with LOAD DATA INFILE
Recently I had a customer ask me about loading two huge files into InnoDB with LOAD DATA INFILE. The goal was to load this data on many servers without putting it into the binary log. While this is generally a fast way to load data (especially if you disable unique key checks and foreign key checks), I recommended against this. There are several problems with the very large transaction caused by the single statement. We didn't want to split the file into pieces for the load for various reasons. However, I found a way to load the single file in chunks as though it were many small files, which avoided splitting the file and let us load with many transactions instead of one huge transaction.
The smaller file is 4.1GB and has 260M lines in it; each row is just two bigints. The bigger file was about 20GB and had wider rows with textual data and about 60M lines (as I recall).
When InnoDB loads the file, it creates one big transaction with a lot of undo log entries. This has a lot of costs. To name a few:
- the big LOAD DATA INFILE clogs the binary log and slows replication down. If the load takes 4 hours on the master, it will cause the slave to fall 4 hours behind.
- lots of undo log entries collect in the tablespace. Not only from the load -- but from other transactions' changes too; the purge thread cannot purge them, so everything gets bloated and slow. Even simple SELECT queries might have to scan through lots of obsolete, but not-yet-purged, row versions. Later, the purge thread will have to clean these up. This is how you make InnoDB behave like PostgreSQL
- If the undo log space grows really big, it won't fit in the buffer pool and InnoDB essentially starts swapping between its buffer pool and the tablespace on disk.
Most seriously, if something should happen and the load needs to roll back, it will take a Very Long Time to do -- I hate to think how long. I'm sure it would be faster to just shut everything down and re-clone the machine from another, which takes about 10 or 12 hours. InnoDB is not optimized for rollbacks, it's optimized for transactions that succeed and commit. Rollback can take an order of magnitude longer to do.
For that reason, we decided to load the file in chunks of a million rows each. (InnoDB internally does operations such as ALTER TABLE in 10k row chunks, by the way; I chose 1M because the rows were small). But how to do this without splitting the file? The answer lies in the Unix fifo. I created a script that reads lines out of the huge file and prints them to a fifo. Then we could use LOAD DATA INFILE on the fifo. Every million lines, the script prints an EOF character to the fifo, closes it and removes it, then re-creates it and keeps printing more lines. If you 'cat' the fifo file, you get a million lines at a time from it. The code is pretty simple and I've included it in Maatkit just for fun. (It's unreleased as of yet, but you can get it with the following command: "wget http://www.maatkit.org/trunk/fifo").
So how did it work? Did it speed up the load?
Not appreciably. There actually was a tiny speedup, but it's statistically insignificant IMO. I tested this first on an otherwise idle machine with the same hardware as the production machines. First, I did it in one big 4.1GB transaction, then I did it 1 million rows at a time. Here's the CREATE TABLE:
-
CREATE TABLE load_test (
-
col1 bigint(20) NOT NULL,
-
col2 bigint(20) DEFAULT NULL,
-
KEY(col1),
-
KEY(col2)
-
) ENGINE=InnoDB DEFAULT CHARSET=utf8
Here's the result of loading the entire 4GB file in one chunk:
-
time mysql -e "set foreign_key_checks=0; set sql_log_bin=0; set unique_checks=0; load data local infile 'infile.txt' into table load_test fields terminated by '\t' lines terminated by '\n' (col1, col2);"
-
-
real 234m53.228s
-
user 0m1.098s
-
sys 0m5.959s
While this ran, I captured vmstat output every 5 seconds and logged it to a file; I also captured the output of "mysqladmin ext -ri5 | grep Handler_write" and logged that to a file.
To load the file in chunks, I split my screen session in two and then ran (approximately -- edited for clarity) the following in one terminal:
-
perl mk-fifo-split infile.txt --fifo /tmp/my-fifo --lines 1000000
And this in the other terminal:
-
while [ -e /tmp/my-fifo ]; do
-
mysql -e "..... same as above.... ";
-
sleep 1;
-
done
After I was done, I ran a quick Perl script on the vmstat and mysqladmin log files to grab out the disk activity and rows-per-second to see what the progress was. Here are some graphs. This one is the rows per second from mysqladmin, and the blocks written out per second from vmstat.

And this one is the bytes/sec from Cacti running against this machine. This is only the bytes out per second; for some reason Cacti didn't seem to be capturing the bytes in per second.

You can see how the curves are roughly logarithmic, which is what you should expect for B-Tree indexes. The two curves on the Cacti graph actually show both files being loaded. It might seem counter-intuitive, but the second (smaller) curve is actually the larger file. It has fewer rows and that's why it causes less I/O overall.
I also used 'time' to run the Perl fifo script, and it used a few minutes of CPU time during the loads. So not very much at all.
Some interesting things to note: the load was probably mostly CPU-bound. vmstat showed from 1% to 3% I/O wait during this time. (I didn't think to use iostat to see how much the device was actually used, so this isn't a scientific measurement of how much the load was really waiting for I/O). The single-file load showed about 1 or 2 percent higher I/O wait, and you can see the single-file load uses more blocks per row; I can only speculate that this is the undo log entries being written to disk. (Peter arrived at the same guess independently.)
Unfortunately I didn't think to log the "cool-down period" after the load ended. It would be fun to see that. Cacti seemed to show no cool-down period -- as soon as the load was done it looked like things went back to normal. I suspect that's not completely true, since the buffer pool must have been overly full with this table's data.
Next time I do something like this I'll try smaller chunks, such as 10k rows; and I'll try to collect more stats. It would also be interesting to try this on an I/O-bound server and see what the performance impact is, especially on other transactions running at the same time.
11 Comments











del.icio.us
digg
Hi Baron,
could you tell us the specification of the idle server where you did a single transaction test?
Thanks in advance,
Comment :: July 3, 2008 @ 1:39 pm
Baron,
Indeed after load is completed you should have significant portion of buffer pool dirty which should take some time to be flushed to the disk.
What I would also like to highlight is the slowdown in the log formula happens as data well fits in memory, otherwise you would see number of inserts/sec to drop off through the cliff
Comment :: July 3, 2008 @ 4:14 pm
Pedro,
It’s a client’s machine so I’m not quite sure all the details; but it’s an 8-core Intel Xeon L5535 @ 2GHz, 32GB RAM, RAID 10 on 15k SAS drives (I think).
Comment :: July 4, 2008 @ 8:35 am
Interesting article. However, isn’t “control-d signals EOF” only applicable to terminal devices? If it worked for binary files how could you ever write \x04 to a file?
Comment :: July 7, 2008 @ 12:35 am
I’m sure you are right Timo. I didn’t think it was a signal but I didn’t think much about it anyway!
Comment :: July 7, 2008 @ 5:14 am
I found your article and thought it is very interesting – Thanks.
As result of your test, do you have recommendation or method of efficiently loading very large file?
BTW – I am NOT heavy DB programmer and don’t know much about DB. If you don’t mine I would like to seek your advice and help.
I have about 120,000 rows – rec size 130 bytes with about 13 fields (Avg 15 GB), which need to be inserted into InnoDB table every min.
I am using LOAD command to accomplish this but in some occasion , the LOAD command takes longer than 1 min. When this happened, the following LOAD file get bigger and bigger and eventually, I get DB gone away error and the program abort.
Any suggestions.
Kye Lee
Comment :: July 25, 2008 @ 8:48 am
I would suggest breaking it into smaller pieces, but it sounds like you have other problems and need a completely different approach — perhaps the problem is that you even need these bulk loads. Beyond that, I won’t say; this is what we do for a living
Comment :: July 25, 2008 @ 4:29 pm
Please send me the private email with contact info.
Thanks
Kye Lee
Comment :: July 28, 2008 @ 6:11 am
Hi Kye,
Please use the Contact Us form on our website http://www.percona.com, as this goes into our ticketing system.
Comment :: July 28, 2008 @ 6:13 am
What’s the best way to load lots of large and small files for full text indexing? Which database engine is best suited for FTI? of large files?
Comment :: October 2, 2008 @ 2:24 pm
Only MyISAM supports full-text indexing in MySQL. If you have a lot of content to index (bigger than your available memory) and you need high performance, you probably need an external solution such as Sphinx or Lucene. Sphinx has a storage engine for easy integration with MySQL.
Comment :: October 3, 2008 @ 6:28 am