The ARCHIVE Storage Engine for MySQL - does it do what you expect?

Sometimes there is a need to keep large amounts of old, rarely used data without investing too much in expensive storage. Very often such data doesn’t need to be updated anymore, or the intent is to leave it untouched. I sometimes wonder what I should really suggest to our Support customers.

For this purpose, the archive storage engine, added in MySQL 4.1.3, seems perfect: it provides excellent compression, and the only DML statement it allows is INSERT. However, does it really work as you would expect?

First of all, it has some serious limitations. Apart from the lack of support for DELETE, REPLACE and UPDATE statements (which may be acceptable for some needs), it does not allow indexes, with one exception: an auto_increment column can be defined as either a unique or a non-unique index. So a straightforward conversion of your tables to the archive engine will usually not be possible. See the list of features for reference.
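
As a rough illustration of those limits, here is a minimal sketch of a table definition that fits the engine. The table and column names (audit_log_archive, logged_at, message) are hypothetical and only serve the example; the single key sits on the auto_increment column.

```sql
-- Hypothetical archive table: the only index allowed is the one on the
-- auto_increment column, so no key is declared on logged_at.
CREATE TABLE audit_log_archive (
  id INT UNSIGNED NOT NULL AUTO_INCREMENT,
  logged_at DATETIME NOT NULL,
  message VARCHAR(255) NOT NULL,
  KEY (id)
) ENGINE=ARCHIVE;

-- Appending and reading rows is supported.
INSERT INTO audit_log_archive (logged_at, message)
VALUES ('2014-01-01 00:00:00', 'first archived event');

SELECT id, logged_at, message FROM audit_log_archive;

-- Changing or removing rows is expected to be rejected by the engine.
UPDATE audit_log_archive SET message = 'changed' WHERE id = 1;
DELETE FROM audit_log_archive WHERE id = 1;
```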

Unfortunately, it does not always work as the manual says, even within its described limitations. See the following very simple examples.

Problem I

Does the archive storage engine really ensure uniqueness for a primary or unique key?

mysql> CREATE TABLE `b` (
    ->   `id` int(11) NOT NULL AUTO_INCREMENT,
    ->   PRIMARY KEY (`id`)
    -> ) ENGINE=ARCHIVE;
Query OK, 0 rows affected (0.01 sec)

mysql> insert into b values (null),(null),(null),(null);
Query OK, 4 rows affected (0.01 sec)
Records: 4  Duplicates: 0  Warnings: 0

mysql> select * from b;
+----+
| id |
+----+
|  1 |
|  2 |
|  3 |
|  4 |
+----+
4 rows in set (0.01 sec)

mysql> repair table b;
+--------+--------+----------+----------+
| Table  | Op     | Msg_type | Msg_text |
+--------+--------+----------+----------+
| test.b | repair | status   | OK       |
+--------+--------+----------+----------+
1 row in set (0.00 sec)

mysql> insert into b values (null),(null);
Query OK, 2 row affected (0.00 sec)
Records: 2  Duplicates: 0  Warnings: 0

mysql> select * from b;
+----+
| id |
+----+
|  1 |
|  2 |
|  3 |
|  4 |
|  1 |
|  2 |
+----+
6 rows in set (0.01 sec)

mysql> show indexes from b\G
*************************** 1. row ***************************
        Table: b
   Non_unique: 0
     Key_name: PRIMARY
 Seq_in_index: 1
  Column_name: id
    Collation: NULL
  Cardinality: NULL
     Sub_part: NULL
       Packed: NULL
         Null: 
   Index_type: NONE
      Comment: 
Index_comment: 
1 row in set (0.00 sec)

That is really bad: a primary key column effectively allows duplicates! And another case exposing the same problem:

mysql> CREATE TABLE `c` ( `id` int(11) NOT NULL AUTO_INCREMENT, UNIQUE KEY (`id`) ) ENGINE=ARCHIVE;
Query OK, 0 rows affected (0.01 sec)

mysql> insert into c values (null),(null),(null);
Query OK, 3 rows affected (0.01 sec)
Records: 3  Duplicates: 0  Warnings: 0

mysql> select * from c;
+----+
| id |
+----+
|  1 |
|  2 |
|  3 |
+----+
3 rows in set (0.01 sec)

mysql> optimize table c;
+--------+----------+----------+----------+
| Table  | Op       | Msg_type | Msg_text |
+--------+----------+----------+----------+
| test.c | optimize | status   | OK       |
+--------+----------+----------+----------+
1 row in set (0.01 sec)

mysql> insert into c values (null);
Query OK, 1 row affected (0.00 sec)

mysql> select * from c;
+----+
| id |
+----+
|  1 |
|  2 |
|  3 |
|  1 |
+----+
4 rows in set (0.01 sec)

So even a simple optimize table command breaks uniqueness completely. Once we realize that such an operation has corrupted our data, we cannot easily move to a different engine without sacrificing uniqueness first:

mysql> alter table c engine=innodb;
ERROR 1062 (23000): ALTER TABLE causes auto_increment resequencing, resulting in duplicate entry '1' for key 'id'

mysql> alter table c drop key id;
ERROR 1075 (42000): Incorrect table definition; there can be only one auto column and it must be defined as a key

mysql> alter table c drop key id, add key(id);
Query OK, 4 rows affected (0.00 sec)
Records: 4  Duplicates: 0  Warnings: 0

mysql> alter table c engine=innodb;
Query OK, 4 rows affected (0.01 sec)
Records: 4  Duplicates: 0  Warnings: 0

There were already bug reports related to the auto_increment feature being broken, but I have filed a new, more specific bug report about this problem.
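
Before moving a table off the archive engine, it may help to check whether the supposedly unique column already holds duplicates. A minimal sketch against the c table from the example above; the cnt alias is only illustrative:

```sql
-- List every id value that occurs more than once in the archive table.
SELECT id, COUNT(*) AS cnt
FROM c
GROUP BY id
HAVING COUNT(*) > 1;
```

An empty result suggests the unique key can be kept during the conversion; any rows returned are the values that would trigger the duplicate entry error.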
————–

Problem II

Are we always able to alter a table to use the archive storage engine, even if its definition is theoretically supported? The auto_increment column causes trouble again…

mysql> select * from c;
+----+
| id |
+----+
|  1 |
|  2 |
|  3 |
+----+
3 rows in set (0.01 sec)

We have the same c table, again using the archive engine. We can change its engine to something different:

mysql> alter table c engine=innodb;
Query OK, 3 rows affected (0.02 sec)
Records: 3  Duplicates: 0  Warnings: 0

mysql> select * from c;
+----+
| id |
+----+
|  1 |
|  2 |
|  3 |
+----+
3 rows in set (0.00 sec)

But in some cases, we can’t set it back to archive!

mysql> alter table c engine=archive;
ERROR 1022 (23000): Can't write; duplicate key in table '#sql-1649_3'

There is an old bug report about that.
————–

Problem III

And yet another oddity around auto_increment values. It seems normal that a database lets us insert explicit values into auto_increment columns, even values lower than the current maximum, and all other engines (MyISAM, Memory and InnoDB) do that:

mysql> CREATE TABLE ai (a int auto_increment primary key) ENGINE=InnoDB;
Query OK, 0 rows affected (0.01 sec)

mysql> insert into ai values (10);
Query OK, 1 row affected (0.00 sec)

mysql> insert into ai values (1);
Query OK, 1 row affected (0.00 sec)

mysql> select * from ai;
+----+
| a  |
+----+
|  1 |
| 10 |
+----+
2 rows in set (0.00 sec)

But this is not the case for the Archive engine:

mysql> CREATE TABLE aa (a int auto_increment primary key) ENGINE=Archive;
Query OK, 0 rows affected (0.00 sec)

mysql> insert into aa values (10);
Query OK, 1 row affected (0.00 sec)

mysql> insert into aa values (1);
ERROR 1022 (23000): Can't write; duplicate key in table 'aa'

This undocumented behavior was reported here.
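
One practical consequence is that rows copied into an archive table probably need to arrive in ascending order of the auto_increment column. A minimal sketch, reusing the ai and aa tables from above and assuming aa is freshly created and empty; whether this is sufficient on a given MySQL version is an assumption to verify:

```sql
-- Copy rows from the InnoDB table into the archive table in ascending key
-- order, so that no explicit value is lower than one already stored.
INSERT INTO aa (a)
SELECT a FROM ai ORDER BY a;
```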

Summary

The archive storage engine provides very good compression and is available in all MySQL variants out of the box. However, it has serious limitations, and in some cases it behaves unreliably and not as expected.
