December 16, 2008

How much space does empty Innodb table take ?

Posted by peter

How much space would empty MyISAM table take ? Probably 8K for .frm file, 1KB for .MYI file and 0 for MYD file. .MYI file can be larger if you have many indexes.

How much space will Innodb take:

SQL:
  1. mysql> CREATE TABLE test_innodb(a int, b int) engine=innodb;
  2. Query OK, 0 rows affected (0.30 sec)

Check out files (using Innodb File Per Table)

-rw-rw---- 1 mysql mysql 8578 Dec 16 20:33 test_innodb.frm
-rw-rw---- 1 mysql mysql 98304 Dec 16 20:33 test_innodb.ibd

So we get about 100K and so about 10 times more for MyISAM. This is ignored space which needs to be allocated in main tablespace for Innodb data dictionary. But that one is pretty small.

This is the good reason to avoid having very small Innodb tables - they will take much more space than MyISAM.

So .ibd file we get in case of table having no indexes (besides clustered key) - takes 6*16K pages. I wonder why as much as 6 pages are required for start ?

If we add more indexes to this tables - each further index will take additional 16K page even if it contains no data. This is understandable - each index has to have at least one page allocated to it.

Now it is very interesting - SHOW TABLE STATUS does not seems to show everything:

SQL:
  1. CREATE TABLE `test_innodb` (
  2.   `i` int(10) UNSIGNED NOT NULL,
  3.   `c` char(100) DEFAULT NULL,
  4.   PRIMARY KEY  (`i`),
  5.   KEY `c` (`c`),
  6.   KEY `c_2` (`c`,`i`)
  7. ) ENGINE=InnoDB DEFAULT CHARSET=utf8
  8.  
  9. mysql> SHOW TABLE STATUS LIKE "test_innodb" \G
  10. *************************** 1. row ***************************
  11.            Name: test_innodb
  12.          Engine: InnoDB
  13.         Version: 10
  14.      Row_format: Compact
  15.            Rows: 0
  16.  Avg_row_length: 0
  17.     Data_length: 16384
  18. Max_data_length: 0
  19.    Index_length: 32768
  20.       Data_free: 0
  21.  AUTO_INCREMENT: NULL
  22.     Create_time: 2008-12-16 20:43:31
  23.     Update_time: NULL
  24.      Check_time: NULL
  25.       Collation: utf8_general_ci
  26.        Checksum: NULL
  27.  Create_options:
  28.         Comment: InnoDB free: 0 kB
  29. 1 row IN SET (0.00 sec)

Such table's .idb file takes 128K from the start while we only see 16K of data+32K of index, so another 5 pages are invisible. This tells me you can't use this information to reliably identify space tables take on disk, especially for large number of very small Innodb tables.

Also note amount of free space - even though pages contain no data they are not considered free.

Avg_row_length is another field which may need an explanation. This value is computed by dividing Data_Length (exact number) by number of rows (estimated number) which means this value is going to be changing back and forth and it would be very inaccurate for small tables. For example it will show 16K as average row size for table with one row:

SQL:
  1. mysql> SHOW TABLE STATUS LIKE "test_innodb" \G
  2. *************************** 1. row ***************************
  3.            Name: test_innodb
  4.          Engine: InnoDB
  5.         Version: 10
  6.      Row_format: Compact
  7.            Rows: 1
  8.  Avg_row_length: 16384
  9.     Data_length: 16384
  10. Max_data_length: 0
  11.    Index_length: 32768
  12.       Data_free: 0
  13.  AUTO_INCREMENT: NULL
  14.     Create_time: 2008-12-16 20:58:49
  15.     Update_time: NULL
  16.      Check_time: NULL
  17.       Collation: utf8_general_ci
  18.        Checksum: NULL
  19.  Create_options:
  20.         Comment: InnoDB free: 0 kB
  21. 1 row IN SET (0.00 sec)

Free Space for tables created in innodb_file_per_table mode is interesting question on its own.

As we populate table we will see Free space will remain at zero as Data_length is small:

SQL:
  1. mysql> SHOW TABLE STATUS LIKE "test_innodb" \G
  2. *************************** 1. row ***************************
  3.            Name: test_innodb
  4.          Engine: InnoDB
  5.         Version: 10
  6.      Row_format: Compact
  7.            Rows: 1069
  8.  Avg_row_length: 199
  9.     Data_length: 212992
  10. Max_data_length: 0
  11.    Index_length: 360448
  12.       Data_free: 0
  13.  AUTO_INCREMENT: NULL
  14.     Create_time: 2008-12-16 20:58:49
  15.     Update_time: NULL
  16.      Check_time: NULL
  17.       Collation: utf8_general_ci
  18.        Checksum: NULL
  19.  Create_options:
  20.         Comment: InnoDB free: 0 kB
  21. 1 row IN SET (0.00 sec)

When at certain point you will see Innodb Free space to become non zero:

SQL:
  1. mysql> SHOW TABLE STATUS LIKE "test_innodb" \G
  2. *************************** 1. row ***************************
  3.            Name: test_innodb
  4.          Engine: InnoDB
  5.         Version: 10
  6.      Row_format: Compact
  7.            Rows: 2669
  8.  Avg_row_length: 147
  9.     Data_length: 393216
  10. Max_data_length: 0
  11.    Index_length: 671744
  12.       Data_free: 0
  13.  AUTO_INCREMENT: NULL
  14.     Create_time: 2008-12-16 20:58:49
  15.     Update_time: NULL
  16.      Check_time: NULL
  17.       Collation: utf8_general_ci
  18.        Checksum: NULL
  19.  Create_options:
  20.         Comment: InnoDB free: 4096 kB
  21. 1 row IN SET (0.01 sec)

And the file size also jumps significantly (to 9MB):

-rw-rw---- 1 mysql mysql 8578 Dec 16 20:58 test_innodb.frm
-rw-rw---- 1 mysql mysql 9437184 Dec 16 21:06 test_innodb.ibd

If you do the math here you can see there is only about 1MB out of 9MB seen as Index_Length+Data_Length while another 4MB are visible in the "Innodb Free Space" and almost 5MB more is not visible at all.

This tells you it is not about tables which contain couple of rows which can take a lot of space in Innodb. Tables showing as using 1MB of Innodb Data can also really be taking almost 10 times more on the disk.

It is not quite clear to me what is happening here. According to documentation each index should get 2 segments one for non-leaf an one for leaf pages. However the space allocation should happen page by page until whole 32 pages allocated. In the case above no single segment should require more than 32 pages so It is surprising why all of them take 5MB (because 4MB are free)

What is clear however is what if some pages from segment are allocated it goes in the interesting space regarding space reporting - it will be gone from the free space, while only pages actually allocated will be shown in Data_Length and Index_Length fields.

Doing more tests with Inserts I can see Innodb seems to always try to keep at least 4MB free in the tablespace - populating table with more and more data I see free space never falls below 4MB while data file on disk continues to grow.

Finally it is worth to note if you're using innodb_file_per_table the per table tablespaces are not going to grow by innodb_autoextend_increment - instead file will grow by 1MB to 4MB increments. There is a bug reported about it.

As a Summary I should not the following:

  • Small Innodb tables will take more space on disk than one may anticipate.
  • Using innodb_file_per_table can cause significant space use overhead if small tables are used
  • The information in INFORMATION_SCHEMA can't be used to judge how much space table is really taking on disk.
Related posts: :Some little known facts about Innodb Insert Buffer::Efficient Boolean value storage for Innodb Tables::LVM Configuration mistake to Avoid:
 

5 Comments »

  1. Peter, very interesting,

    On large tables I have found that the difference is insignificant. There’s always these one or two very large tables which consume most disk space, and when considering disk limitations, they are the one being taken into account.

    Did you happen a lot on databases where there were many small tables, and which constituted the majority of disk space?

    Comment :: December 16, 2008 @ 11:17 pm

  2. Shlomi,

    For large tables the difference between reported size and physical size may be insignificant this is true.

    However there are very different applications around the world. There are applications which have hundreds of thousands and millions of tables. For these this difference is significant.

    Comment :: December 16, 2008 @ 11:29 pm

  3. 3. Pete

    Related question: what is the storage cost for sparse INNODb tables, for example tables with 20 columns where 18 often contain NULL values? Is space pre-allocated?

    Comment :: December 18, 2008 @ 7:22 am

  4. Pete,

    In the recent Innodb versions the NULL values are not simply not stored. Format will be something like column 1 value 2, column 10 value 3 skipping all 8 NULL columns in between.

    Comment :: December 18, 2008 @ 9:32 am

  5. We’ve been using Innodb tables and finding that the mix of federated and innodb makes any attempt at reporting on innodb via information_schema useless.

    Has anyone run across federated tables causing the inability to report anything useful from information_schema?
    What I have found is that *AnythinG* more complicated than a select table_name, table_schema, engine (no like statements, concat, or math on the fly will not work — see below for the concat ex)

    I have and wonder if there is a good work around for 5.1.30
    mysql Ver 14.14 Distrib 5.1.30, for unknown-linux-gnu (x86_64) using readline 5.1

    All these wonderful ideas to monitor growth seem to get whacked when I run specific queries. Some queries to information_schema are ok though so it isn’t consistently and completely broken.

    I’ve tried to dump the entire set of databases with triggers and functions (-R) to search for it and that won’t work even with –no-data and –force

    One, this db.acct_set table does not exist because there is no database db. I assume it could be defined anywhere.
    Two, the obscure error 1146 (select error) and 1431 (error on mysqldump) don’t get me very far

    mysql> SELECT concat(table_schema,’.',table_name),concat(round(table_rows/1000000,2),’M') rows,concat(round(data_length/(1024*1024*1024),2),’G') DATA,concat(round(index_length/(1024*1024*1024),2),’G') idx,concat(round((data_length+index_length)/(1024*1024*1024),2),’G') total_size,round(index_length/data_length,2) idxfrac FROM information_schema.TABLES ORDER BY data_length+index_length DESC LIMIT 10;
    ERROR 1431 (HY000): The foreign data source you are trying to reference does not exist. Data source error: error: 1146 ‘Table ‘db.acct_set’ doesn’t exist’

    mysql> desc db.acct_set;
    ERROR 1146 (42S02): Table ‘manhunt.v4_account_settings’ doesn’t exist

    Comment :: April 30, 2009 @ 12:15 pm

 



Subscribe without commenting