Concatenating MyISAM files

Recently, I was involved in migrating a large read-only InnoDB database to MyISAM (eventually packed). The only issue was that one of the tables held 5 TB of data and 23 billion rows. Not small… I calculated that something like insert into MyISAM_table… select * from Innodb_table… would take about 10 days. The bottleneck was clearly the lack of concurrency on the read side from InnoDB, followed by key management for MyISAM. The server has many dozens of drives, so adding concurrency was easy: from a script, I kicked off insertions into 16 identical MyISAM tables, each covering a distinct part of the source table. That was much faster and would complete within a day.
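The parallel load could be driven by a script along these lines. This is only a minimal sketch, not the script actually used: the integer key column `id`, the target table names `MyISAM_table_1` … `MyISAM_table_N`, and the idea of slicing by evenly sized id ranges are all assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Sketch: split an id range into N slices and emit one INSERT ... SELECT per slice.
# Assumptions (not from the original post): an integer key column `id`,
# source table Innodb_table, target tables MyISAM_table_1 .. MyISAM_table_N.

# slice_sql I N MIN MAX: print the statement for slice I (1-based) of N
# slices covering ids MIN..MAX inclusive.
slice_sql() {
    local i=$1 n=$2 min=$3 max=$4
    local span=$(( (max - min + n) / n ))   # ceiling of (max - min + 1) / n
    local lo=$(( min + (i - 1) * span ))
    local hi=$(( lo + span - 1 ))
    if [ "$hi" -gt "$max" ]; then
        hi=$max                             # the last slice may be shorter
    fi
    printf 'INSERT INTO MyISAM_table_%d SELECT * FROM Innodb_table WHERE id BETWEEN %d AND %d;\n' \
        "$i" "$lo" "$hi"
}

# all_slices N MIN MAX: print every slice, one statement per line.
all_slices() {
    local n=$1 min=$2 max=$3 i=1
    while [ "$i" -le "$n" ]; do
        slice_sql "$i" "$n" "$min" "$max"
        i=$(( i + 1 ))
    done
}
```

Each emitted statement would then be handed to its own mysql client session so that the slices run concurrently. Even slicing only balances the work if the ids are spread fairly uniformly over the range.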

Then, while the InnoDB extraction was running at a nice pace, I thought about the next phase. My first idea was simply to do insert into MyISAM_table select * from MyISAM_table1 and so on for the 16 tables. Since MyISAM data files are flat files, that should be faster, especially with the keys disabled. At that point I remembered, from earlier disaster-recovery work where a database directory had been wiped out, that MyISAM data files have no headers, which makes them difficult (read: almost impossible) to locate on a drive with tools like ext3grep. No headers… that means the first byte of a file is the first byte of the first row… So we should be able to concatenate these files. Let’s see.

mysql> create table test_concat(id int unsigned not null, primary key (id)) engine=myisam;
Query OK, 0 rows affected (0.01 sec)

mysql> create table test_concat_part like test_concat;
Query OK, 0 rows affected (0.01 sec)

mysql> insert into test_concat (id) value (1),(2),(3);
Query OK, 3 rows affected (0.00 sec)
Records: 3  Duplicates: 0  Warnings: 0

mysql> insert into test_concat_part (id) value (4),(5),(6);
Query OK, 3 rows affected (0.00 sec)
Records: 3  Duplicates: 0  Warnings: 0

mysql> flush tables;
Query OK, 0 rows affected (0.01 sec)

Then, at the shell command line:

root@django:/var/lib/mysql/test# ls
test_concat.frm  test_concat.MYD  test_concat.MYI  test_concat_part.frm  test_concat_part.MYD  test_concat_part.MYI
root@django:/var/lib/mysql/test# 
root@django:/var/lib/mysql/test# cat test_concat_part.MYD >> test_concat.MYD
root@django:/var/lib/mysql/test# myisamchk -rq test_concat
- check record delete-chain
- recovering (with sort) MyISAM-table 'test_concat'
Data records: 3
- Fixing index 1
Data records: 6

And then, back in mysql:

mysql> use test
Database changed
mysql> flush tables;
Query OK, 0 rows affected (0.00 sec)

mysql> select * from test_concat;
+----+
| id |
+----+
|  1 |
|  2 |
|  3 |
|  4 |
|  5 |
|  6 |
+----+
6 rows in set (0.00 sec)

So, yes, you can concatenate MyISAM files, even when multiple keys are defined. Not for everyday use but still pretty cool.
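For more than one part file, the cat step above generalizes to a loop. The following is a rough sketch under the same conditions as the test: every table has exactly the same definition, the tables have been flushed, and nothing else is touching the files. The function name `concat_myd` is made up for this example.

```shell
#!/usr/bin/env bash
# Sketch: append the data files of several part tables to a target table's .MYD.
# Assumes identical table definitions, flushed tables, and no concurrent access.

# concat_myd TARGET PART...: append each PART.MYD to TARGET.MYD, in order.
concat_myd() {
    local target=$1 part
    shift
    # Check every input first so a missing file cannot leave a half-built target.
    for part in "$@"; do
        if [ ! -f "$part.MYD" ]; then
            echo "missing data file: $part.MYD" >&2
            return 1
        fi
    done
    for part in "$@"; do
        cat "$part.MYD" >> "$target.MYD" || return 1
    done
}
```

Run from the database directory, `concat_myd test_concat test_concat_part` followed by `myisamchk -rq test_concat` reproduces the test above. The index file is not touched by the concatenation, which is why the myisamchk rebuild is still required afterwards.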

Addendum

Following Peter’s comment, I added varchar and deleted rows to the mix:

mysql> truncate table test_concat;
Query OK, 0 rows affected (0.00 sec)

mysql> truncate table test_concat_part;
Query OK, 0 rows affected (0.01 sec)

mysql> alter table test_concat add data varchar(10);
Query OK, 0 rows affected (0.01 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> alter table test_concat_part add data varchar(10);
Query OK, 0 rows affected (0.02 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> insert into test_concat (id,data) value (1,'one'),(2,'two'),(3,'three');
Query OK, 3 rows affected (0.00 sec)
Records: 3  Duplicates: 0  Warnings: 0

mysql> insert into test_concat_part (id,data) value (44,'todelete'),(4,'four'),(5,'five'),(6,'six');
Query OK, 4 rows affected (0.00 sec)
Records: 4  Duplicates: 0  Warnings: 0

mysql> delete from test_concat_part where id = 44;
Query OK, 1 row affected (0.00 sec)

mysql> flush tables;
Query OK, 0 rows affected (0.00 sec)

root@django:/var/lib/mysql/test# myisamchk -rq test_concat
- check record delete-chain
- recovering (with sort) MyISAM-table 'test_concat'
Data records: 3
- Fixing index 1
myisamchk: error: Couldn't fix table with quick recovery: Found wrong number of deleted records
myisamchk: error: Run recovery again without -q
MyISAM-table 'test_concat' is not fixed because of errors
Try fixing it by using the --safe-recover (-o), the --force (-f) option or by not using the --quick (-q) flag
root@django:/var/lib/mysql/test# myisamchk -r test_concat
- recovering (with sort) MyISAM-table 'test_concat'
Data records: 6
- Fixing index 1

mysql> select * from test_concat;
+----+-------+
| id | data  |
+----+-------+
|  1 | one   |
|  2 | two   |
|  3 | three |
|  4 | four  |
|  5 | five  |
|  6 | six   |
+----+-------+
6 rows in set (0.00 sec)

So varchar columns are supported without any issue, but deleted rows prevent the use of the quick option (-q) of myisamchk.
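In a script, that behavior suggests trying the quick rebuild first and falling back to a full recovery. The sketch below assumes that myisamchk exits with a non-zero status when the quick pass fails, as the error output above suggests. The function name `rebuild_indexes` and the `MYISAMCHK` override variable are invented for the example.

```shell
#!/usr/bin/env bash
# Sketch: rebuild indexes with quick recovery first, then fall back to a
# full recovery if the quick pass fails (for example because of deleted rows).
# MYISAMCHK is a hypothetical override so the command can be stubbed in tests.

# rebuild_indexes TABLE: print "quick" or "full" depending on which pass worked.
rebuild_indexes() {
    local table=$1
    local chk=${MYISAMCHK:-myisamchk}
    if "$chk" -rq "$table"; then
        echo "quick"
    elif "$chk" -r "$table"; then
        echo "full"
    else
        echo "recovery failed for $table" >&2
        return 1
    fi
}
```

The full recovery rewrites the data file, so on a multi-terabyte table it will cost far more time than the quick pass.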

5 Comments

Peter Zaitsev

Admin

11 years ago

Yves,

It works in the simple case. I wonder how it works if tables have deleted rows (wouldn’t the delete chain be all weird?), and what about the dynamic row format (VARCHAR, etc.): would it also work, or is it a fixed-row-format feature only?

Khalid

11 years ago

Wouldn’t it be safer to use something like Mydumper/Myloader which will do parallel dump/load.

Combine that with dropping the secondary indexes and it speeds up things.

Here is an article about Mydumper

http://2bits.com/backup/fast-parallel-mysql-backups-and-imports-mydumper.html

And here is a presentation on deleting the secondary indexes, though it is for InnoDB, but should work for MyISAM too

http://2bits.com/drupal-planet/presentation-huge-drupal-site-381-modules-174gb-mysql-database-and-200-million-row-tables.html#comment-1500

Brian Cavanagh

11 years ago

Yves, You sir are an evil genius. Not only for figuring out how to do this, but for all the support calls from when it is inappropriately applied. I mean this in only the greatest terms of respect 😉

Kedar

11 years ago

hmmm!! This is great to know but I wonder if it’d be better to have a MERGE table covering all MyISAMs instead!!

J Jorg

11 years ago

I have been using ‘myisamchk -rq’ to quickly drop, create, alter table indexes with great success, so long as the table does not have any deleted records. This can be done by copying the .frm and .MYI from a table with exact same column definitions, having different indexes, back onto the original table. This technique cuts the cpu/clock time by nearly 2/3rd compared to traditional SQL ‘ALTER TABLE x ADD/DROP INDEX’, which rewrites the whole .MYD.

CREATE TABLE z LIKE x; -- duplicate the structure of the original table
ALTER TABLE z DROP PRIMARY KEY, DROP INDEX my_idx, ADD PRIMARY KEY ( f1, f2, f3, … ); -- define new keys
LOCK TABLE x WRITE, z WRITE; -- safety first, lock the tables involved
FLUSH TABLE x, z; -- safety, allows use of cp and myisamchk while the database is hot
unix> cp z.frm x.frm # take new table definition
unix> cp z.MYI x.MYI # take empty index
unix> myisamchk -rqaS x # quickly rebuild the index
FLUSH TABLE x;
UNLOCK TABLES;

Note: ‘Lock Table x Write’ and ‘Flush Table x’ commands are required to safely run myisamchk ( or myisampack ) when the database is hot.

— J Jorg —
