or… “the case of Stewart recognizing parameters to the read() system call in strace output”.

Last week, a colleague asked a question:

I have an instance of MySQL with 100 tables and the table_definition_cache set to 1000. My understanding of this is that MySQL won’t revert to opening the FRM files to read the table definition, but we can see from strace:

So, why is this? It turns out that this triggered a memory for me from several years ago. I’ve since discovered the blog post in which I mention it: drop table fail (on the road to removing the FRM). That blog post is from 2008, almost three years ago to the day.

Since we completely reworked how metadata works in Drizzle, it has enabled us to do some truly wonderful things, including more in depth testing of the server. Amazingly enough, spin-offs from this work included being able to find out and then test that the ENUM limit of 65,535 has never been true (but now is in Drizzle), produce a CREATE TABLE statement that took over four minutes to execute and get a more complete view of how the Storage Engine API is called.

But back to what the above strace shows. In MySQL 5.5 you can find in sql/datadict.cc a function named dd_frm_type(). In MySQL 5.1, for some reason yet unknown to humans, it lives in sql/sql_view.cc as mysql_frm_type(). What this code snippet does is:

  • open the FRM
  • read 10 bytes (“header”)
  • check if it’s a view by doing a string compare for “TYPE=VIEW\n” being the first bytes of the FRM file. This is due to VIEWs being stored as the plain text of the SQL query inside the FRM file instead of the normal binary format FRM.
  • some legacy check for a generic table type (I think, I haven’t gone back into the deep history of the FRM file format to confirm)
  • return the fourth byte for the DB_TYPE. i.e. what storage engine it is.

We can ignore the upper limit on number of storage engines for MySQL and understanding the relationship between the range of numbers for dynamic assignment and what this means for on-disk compatibility of data directories is left as an exercise for the reader.

This code is called from several code paths in the server:

  • DROP TABLE
  • RENAME TABLE
  • DROP VIEW
  • open table
  • filling INFORMATION_SCHEMA tables (I think it is actually the TABLES table, but didn’t look closely)
An example of how this is used is that in the DROP TABLE code path, MySQL uses this magic byte to work out which Storage Engine to ask to drop the table. The main consequence of this bit of code is that MySQL may cause unnecessary disk IO for information it already has cached (often at least twice – in InnoDB itself and in the table_definition_cache).

Further reading:

8 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments
Peter Zaitsev

Stewart,

So how frequent does this opening frm will happen ? Will it happen only when the table_open_cache miss happens or will it happen on all table opens ?

I would point our opening and reading few bytes from .frm should come from OS cache as the working set is very small in most cases, yet it can get expensive with NFS

grace

Stewart,
does number of records in the table affects the RENAME function performance?

Davi Arnaut

Certain statements (such as truncate, drop table) should work even though the table is corrupted (and can’t be opened).

Raghavendra

Interesting. Now this may seem obvious, but is it not possible to cache this information in a cache akin to table_definition_cache or in that cache itself ?