How much space would an empty MyISAM table take? Probably 8K for the .frm file, 1KB for the .MYI file and 0 for the .MYD file. The .MYI file can be larger if you have many indexes.
How much space will Innodb take?
mysql> create table test_innodb(a int, b int) engine=innodb;
Query OK, 0 rows affected (0.30 sec)
Check out the files (using Innodb file per table):
-rw-rw---- 1 mysql mysql 8578 Dec 16 20:33 test_innodb.frm
-rw-rw---- 1 mysql mysql 98304 Dec 16 20:33 test_innodb.ibd
So we get about 100K, which is about 10 times more than for MyISAM. This ignores the space which needs to be allocated in the main tablespace for the Innodb data dictionary, but that is pretty small.
This is a good reason to avoid having very small Innodb tables – they will take much more space than MyISAM.
So the .ibd file for a table with no indexes (besides the clustered key) takes 6*16K pages. I wonder why as many as 6 pages are required to start with?
If we add more indexes to this table, each further index will take an additional 16K page even if it contains no data. This is understandable – each index has to have at least one page allocated to it.
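The page counts observed above can be turned into a quick back-of-the-envelope estimate. This is only a sketch built on the numbers measured in this post (6 pages for a fresh file-per-table .ibd, plus one 16K page per extra index); the helper name `empty_ibd_bytes` is made up for illustration, and other Innodb versions or page sizes may well behave differently.

```python
PAGE_SIZE = 16 * 1024   # Innodb page size seen in this post (16K)
BASE_PAGES = 6          # pages observed in a fresh .ibd with only the clustered key


def empty_ibd_bytes(secondary_indexes: int) -> int:
    """Estimate the initial .ibd size of an empty table (hypothetical helper).

    Assumes one extra page per secondary index, as observed in this post.
    """
    if secondary_indexes < 0:
        raise ValueError("secondary_indexes must be non-negative")
    return (BASE_PAGES + secondary_indexes) * PAGE_SIZE


if __name__ == "__main__":
    for n in (0, 1, 2):
        print(n, "secondary indexes:", empty_ibd_bytes(n), "bytes")
```

With no secondary indexes this gives the 98304 bytes shown in the file listing above.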
Now it is very interesting – SHOW TABLE STATUS does not seem to show everything:
CREATE TABLE `test_innodb` (
  `i` int(10) unsigned NOT NULL,
  `c` char(100) default NULL,
  PRIMARY KEY (`i`),
  KEY `c` (`c`),
  KEY `c_2` (`c`,`i`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8

mysql> show table status like "test_innodb" \G
*************************** 1. row ***************************
           Name: test_innodb
         Engine: InnoDB
        Version: 10
     Row_format: Compact
           Rows: 0
 Avg_row_length: 0
    Data_length: 16384
Max_data_length: 0
   Index_length: 32768
      Data_free: 0
 Auto_increment: NULL
    Create_time: 2008-12-16 20:43:31
    Update_time: NULL
     Check_time: NULL
      Collation: utf8_general_ci
       Checksum: NULL
 Create_options:
        Comment: InnoDB free: 0 kB
1 row in set (0.00 sec)
This table’s .ibd file takes 128K from the start, while we only see 16K of data plus 32K of index, so another 5 pages are invisible. This tells me you can’t use this information to reliably identify the space tables take on disk, especially for a large number of very small Innodb tables.
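The "invisible pages" figure is simple arithmetic on the file size and the two reported lengths. A minimal sketch follows, assuming 16K pages; `hidden_pages` is an illustrative name, not anything provided by MySQL.

```python
PAGE_SIZE = 16 * 1024  # 16K pages, as in this post


def hidden_pages(file_bytes: int, data_length: int, index_length: int,
                 page_size: int = PAGE_SIZE) -> int:
    """Pages present in the .ibd file but not reported by SHOW TABLE STATUS."""
    reported = data_length + index_length
    return (file_bytes - reported) // page_size


if __name__ == "__main__":
    # 128K file, Data_length 16384, Index_length 32768 (values from the output above)
    print(hidden_pages(128 * 1024, 16384, 32768))
```

For the table above this works out to 8 pages on disk, 3 of them reported.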
Also note the amount of free space – even though the pages contain no data, they are not considered free.
Avg_row_length is another field which may need an explanation. This value is computed by dividing Data_length (an exact number) by the number of rows (an estimated number), which means the value will keep changing back and forth and will be very inaccurate for small tables. For example, it will show 16K as the average row size for a table with one row:
mysql> show table status like "test_innodb" \G
*************************** 1. row ***************************
           Name: test_innodb
         Engine: InnoDB
        Version: 10
     Row_format: Compact
           Rows: 1
 Avg_row_length: 16384
    Data_length: 16384
Max_data_length: 0
   Index_length: 32768
      Data_free: 0
 Auto_increment: NULL
    Create_time: 2008-12-16 20:58:49
    Update_time: NULL
     Check_time: NULL
      Collation: utf8_general_ci
       Checksum: NULL
 Create_options:
        Comment: InnoDB free: 0 kB
1 row in set (0.00 sec)
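The division described above is easy to reproduce. The sketch below assumes the result is truncated to an integer, which matches the outputs shown in this post but is not something I have confirmed in the server source; `avg_row_length` is a made-up helper name.

```python
def avg_row_length(data_length: int, estimated_rows: int) -> int:
    """Exact Data_length divided by the estimated row count.

    Integer truncation is an assumption that happens to match this post's outputs.
    """
    if estimated_rows <= 0:
        return 0  # the empty-table output in this post shows 0
    return data_length // estimated_rows


if __name__ == "__main__":
    # (Data_length, Rows) pairs taken from the SHOW TABLE STATUS outputs in this post
    samples = [(16384, 1), (212992, 1069), (393216, 2669)]
    for data_length, rows in samples:
        print(rows, "rows ->", avg_row_length(data_length, rows), "bytes per row")
```

Since Data_length grows in whole pages while the row estimate moves around, the result jumps noticeably on small tables.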
Free space for tables created in innodb_file_per_table mode is an interesting question on its own.
As we populate the table, we see that free space remains at zero while Data_length is small:
mysql> show table status like "test_innodb" \G
*************************** 1. row ***************************
           Name: test_innodb
         Engine: InnoDB
        Version: 10
     Row_format: Compact
           Rows: 1069
 Avg_row_length: 199
    Data_length: 212992
Max_data_length: 0
   Index_length: 360448
      Data_free: 0
 Auto_increment: NULL
    Create_time: 2008-12-16 20:58:49
    Update_time: NULL
     Check_time: NULL
      Collation: utf8_general_ci
       Checksum: NULL
 Create_options:
        Comment: InnoDB free: 0 kB
1 row in set (0.00 sec)
Then at a certain point you will see Innodb free space become non-zero:
mysql> show table status like "test_innodb" \G
*************************** 1. row ***************************
           Name: test_innodb
         Engine: InnoDB
        Version: 10
     Row_format: Compact
           Rows: 2669
 Avg_row_length: 147
    Data_length: 393216
Max_data_length: 0
   Index_length: 671744
      Data_free: 0
 Auto_increment: NULL
    Create_time: 2008-12-16 20:58:49
    Update_time: NULL
     Check_time: NULL
      Collation: utf8_general_ci
       Checksum: NULL
 Create_options:
        Comment: InnoDB free: 4096 kB
1 row in set (0.01 sec)
And the file size also jumps significantly (to 9MB):
-rw-rw---- 1 mysql mysql 8578 Dec 16 20:58 test_innodb.frm
-rw-rw---- 1 mysql mysql 9437184 Dec 16 21:06 test_innodb.ibd
If you do the math here, you can see that only about 1MB out of 9MB is shown as Index_length+Data_length, another 4MB is visible as “InnoDB free”, and the remaining roughly 4MB is not visible at all.
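For anyone who wants to redo the math, here is a small sketch that splits the file size into reported, free and unaccounted parts. The inputs are the values from the output and file listing above; the function name `space_breakdown` is just illustrative.

```python
MB = 1024 * 1024


def space_breakdown(file_bytes: int, data_length: int, index_length: int,
                    free_kb: int) -> dict:
    """Split an .ibd file size into reported, free and unaccounted bytes."""
    reported = data_length + index_length
    free = free_kb * 1024  # "InnoDB free" in the Comment field is given in kB
    unaccounted = file_bytes - reported - free
    return {"reported": reported, "free": free, "unaccounted": unaccounted}


if __name__ == "__main__":
    parts = space_breakdown(9437184, 393216, 671744, 4096)
    for name, size in parts.items():
        print(f"{name:12s} {size:>9d} bytes  ({size / MB:.2f} MB)")
```

By this accounting about 1MB is reported, 4MB is free and roughly 4MB is unaccounted for, so 5MB of the file is neither free nor reported as free.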
This tells you it is not only tables containing a couple of rows which can take a lot of extra space in Innodb. Tables showing as using 1MB of Innodb data can also really be taking almost 10 times more on disk.
It is not quite clear to me what is happening here. According to the documentation, each index should get 2 segments, one for non-leaf and one for leaf pages. However, space allocation should happen page by page until a whole 32 pages are allocated. In the case above no single segment should require more than 32 pages, so it is surprising that all of them together take 5MB (because 4MB are free).
What is clear, however, is that once some pages are allocated to a segment, the space ends up in an interesting state as far as space reporting goes: it is gone from the free space, while only the pages actually in use are shown in the Data_length and Index_length fields.
Doing more tests with inserts, I can see Innodb seems to always try to keep at least 4MB free in the tablespace – populating the table with more and more data, I see free space never falls below 4MB while the data file on disk continues to grow.
Finally, it is worth noting that if you’re using innodb_file_per_table, the per-table tablespaces are not going to grow by innodb_autoextend_increment – instead the file will grow in 1MB to 4MB increments. There is a bug reported about it.
As a summary I should note the following:
- Small Innodb tables will take more space on disk than one may anticipate.
- Using innodb_file_per_table can cause significant space overhead if small tables are used.
- The information in INFORMATION_SCHEMA can’t be used to judge how much space a table is really taking on disk.
Peter, very interesting,
On large tables I have found that the difference is insignificant. There are always one or two very large tables which consume most of the disk space, and when considering disk limitations, they are the ones being taken into account.
Have you often come across databases with many small tables which together made up the majority of the disk space?
Shlomi,
For large tables the difference between reported size and physical size may be insignificant, this is true.
However, there are very different applications around the world. There are applications which have hundreds of thousands or even millions of tables. For these the difference is significant.
Related question: what is the storage cost for sparse InnoDB tables, for example tables with 20 columns where 18 often contain NULL values? Is space pre-allocated?
Pete,
In recent Innodb versions NULL values are simply not stored. The format will be something like: column 1 value 2, column 10 value 3, skipping all 8 NULL columns in between.
We’ve been using Innodb tables and finding that the mix of federated and innodb makes any attempt at reporting on innodb via information_schema useless.
Has anyone run across federated tables causing the inability to report anything useful from information_schema?
What I have found is that *anything* more complicated than a select of table_name, table_schema, engine will not work (no LIKE conditions, CONCAT, or math on the fly; see below for the CONCAT example).
I have, and I wonder if there is a good workaround for 5.1.30:
mysql Ver 14.14 Distrib 5.1.30, for unknown-linux-gnu (x86_64) using readline 5.1
All these wonderful ideas to monitor growth seem to get whacked when I run specific queries. Some queries to information_schema are ok though so it isn’t consistently and completely broken.
I’ve tried to dump the entire set of databases with triggers and functions (-R) to search for it, and that won’t work even with --no-data and --force.
One, this db.acct_set table does not exist because there is no database db. I assume it could be defined anywhere.
Two, the obscure errors 1146 (select error) and 1431 (error on mysqldump) don’t get me very far.
mysql> SELECT concat(table_schema,'.',table_name),
    ->        concat(round(table_rows/1000000,2),'M') rows,
    ->        concat(round(data_length/(1024*1024*1024),2),'G') DATA,
    ->        concat(round(index_length/(1024*1024*1024),2),'G') idx,
    ->        concat(round((data_length+index_length)/(1024*1024*1024),2),'G') total_size,
    ->        round(index_length/data_length,2) idxfrac
    -> FROM information_schema.TABLES
    -> ORDER BY data_length+index_length DESC LIMIT 10;
ERROR 1431 (HY000): The foreign data source you are trying to reference does not exist. Data source error: error: 1146 'Table 'db.acct_set' doesn't exist'
mysql> desc db.acct_set;
ERROR 1146 (42S02): Table 'manhunt.v4_account_settings' doesn't exist
Peter, thanks for these explanations,
However, I’m facing an important issue on a large InnoDB table as well. For instance, I created a table with 1 BIGINT field, 7 DATE fields and 9 TINYINT fields (theoretically 38 bytes per row and no variable-length fields).
The table contains 30 partitions with 7 million rows each (210 million rows in total).
The information_schema returns an AVG_ROW_LENGTH of 77 bytes… and the estimated number of rows is correct.
How can we explain such a difference, from 38 to 77 bytes?
Bruno,
There is a lot of overhead in InnoDB.
* A 16KB block is never completely full. After a lot of churn, block splits lead to blocks being about 69% full on average.
* Due to MVCC, there can be extra copies of a row in the block.
* Each field (at least non-NULL fields) has 1 or 2 bytes of length, etc, even for fixed-length fields.
* Each row has (used to have) 29 bytes of overhead.
Your 77 bytes is actually less than what this info predicts.
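To put rough numbers on this, here is a sketch that combines only the figures quoted in this thread: the standard MySQL column sizes (BIGINT 8 bytes, DATE 3, TINYINT 1), 29 bytes of per-row overhead and 69% page fill. It deliberately ignores the per-field length bytes and extra MVCC row copies, so treat it as an illustration rather than a model of InnoDB; the names are made up.

```python
# Storage sizes of the column types in the table described above
COLUMN_BYTES = {"BIGINT": 8, "DATE": 3, "TINYINT": 1}
ROW_OVERHEAD = 29    # per-row overhead quoted in this thread
FILL_FACTOR = 0.69   # average page fill after churn, quoted in this thread


def payload_bytes(columns: dict) -> int:
    """Raw column data per row, e.g. {"BIGINT": 1, "DATE": 7, "TINYINT": 9}."""
    return sum(COLUMN_BYTES[col_type] * count for col_type, count in columns.items())


def estimated_bytes_per_row(columns: dict) -> float:
    """Payload plus row overhead, scaled up for partially filled pages."""
    return (payload_bytes(columns) + ROW_OVERHEAD) / FILL_FACTOR


if __name__ == "__main__":
    table = {"BIGINT": 1, "DATE": 7, "TINYINT": 9}
    print("payload:", payload_bytes(table), "bytes")
    print("estimate:", round(estimated_bytes_per_row(table)), "bytes per row")
```

The payload comes out at the 38 bytes mentioned in the question, and the estimate with overhead lands above the reported 77 bytes.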
Peter,
I have empirically determined that “free space” (of TABLE STATUS) does not include free space in the 16KB blocks. And I suspect that, for large tables, completely freed 16KB blocks are not counted as “free” until a 1MB extent (or whatever) is completely freed.
This makes it difficult to know when to OPTIMIZE a table (or REORGANIZE PARTITION) because one cannot see how much would be freed.
Peter, thank you for your explanations.
But now I am wondering, should I use InnoDB or stay with MyISAM?