August 29, 2014

When Does InnoDB Update Table Statistics? (And When It Can Bite)

An InnoDB table statistics is used for JOIN optimizations and helping the MySQL optimizer choose the appropriate index for a query. If a table’s statistics or index cardinality becomes outdated, you might see queries which previously performed well suddenly show up on slow query log until InnoDB again updates the statistics. But when does InnoDB perform the updates aside from the first opening of the table or manually running ANALYZE TABLE on it? The 2 instances below are documented from the MySQL and InnoDB plugin’s manual:

  1. Metadata commands like SHOW INDEX, SHOW TABLE STATUS and SHOW [FULL] TABLES (or their corresponding queries from INFORMATION_SCHEMA.TABLES and INFORMATION_SCHEMA.STATISTICS)
  2. When 1 / 16th of the table or 2Billion rows has been modified, whichever comes first. ./row/row0mysql.c:row_update_statistics_if_needed
If you or an application constantly executes SHOW [FULL] TABLES or SHOW TABLE STATUS on many a large tables, this can affect your server especially if the random index dives had to be read from disk. Here is an example of SHOW TABLE STATUS FROM schema taking away a couple of seconds on execution time.

As you can see db1 has about 294 tables and it took the server 12 seconds to update all the tables’ statistics.  Luckily, this can be controlled with Percona Server with the variable innodb_stats_auto_update (from MySQL Community 5.1.17 a similar variable exist called innodb_stats_on_metadata, for Percona Server where both variables exist, both should be 0 if you want to disable the feature). When set to 0, automatic updates to the table statistics (items 1 and 2 above) is disabled unless ANALYZE TABLE is ran or during first open of the table.

On my next post, I will discuss the effects of disabling the feature and how you or your application should compensate.

About Jervin Real

Jervin is a member of Percona's Rapid Response Consulting team. When you come to Percona for consulting, chances are he'll be greeting you first. His primary role is to make sure customer issues are handled efficiently and professionally. Jervin joined Percona in April 2010.

Comments

  1. Steve Jackson says:

    I vote for a innodb_stats_on_significant_change variable…

    Alternatively

    innodb_stats_on_row_ratio_modified=16 (i.e 1/16th)

    OR

    innodb_stats_on_rows_modified = 2000000000

    //Steve

  2. Holger Thiel says:

    Hello Jervin,

    I hope the sentence “When set to 0, …” refers to “innodb_stats_auto_update”. On my MySQL Community servers I always set “innodb_stats_on_metadata = 0″ as the information_schema is slow elsewise. And for investigation it is better to know what the old statistics are and then recalculate the new one.

    Regards,
    Holger

  3. Jervin Real says:

    @Holger,

    I meant, on Percona Server where both variables exist – innodb_stats_on_metadata is authoritative, if innodb_stats_auto_update=0, innodb_stats_on_metadata should also be 0 to disable the feature. I agree with you, knowing the previous statistics surely is useful especially when looking for a decent cardinality for your indexes.

    @Steve,

    Adding this options may disable the statistics update based on modified rows but not when using the metadata commands.

  4. Steve Jackson says:

    @Jervin

    Exactly…

    I would like to see stats updates disabled completely and then enabled for particular scenarios with things like innodb_stats_on_row_ratio_modified. I think most dba’s would prefer to be able to control when they update their index stats…

  5. Jonathan Valerio says:

    thanks for sharing. simple yet big help.

  6. Aaron Ingram says:

    You mentioned several metadata commands that will cause table statistics to be updated. Is that the complete list? I’d like to know if “describe” will update table stats.

  7. Carsten Dreesbach says:

    “On my next post, I will discuss the effects of disabling the feature and how you or your application should compensate.”

    Did you get a chance to write that post? I looked and couldn’t find it, would love to hear what you have to say about this.

  8. Same question as Carsten Dreesbach. Did you get chance to write followup article on this? Really interested to hear your opinions on that.

  9. Jervin Real says:

    Carsten, Amit,

    Thanks for the followup, I have not had the chance to do that yet, but will do really soon :-)

  10. Ben says:

    Is there a way to precisely update the statistics? I turned on the innodb_stats_auto_update, then execute ‘show table status’ several times. I already knew the correct row count is 34,000,000 for one particular table, but the ‘show table status’ gives me different results each time I ran it, in the same minute, getting 42,000,000, or 31,000,000, or 37,000,000 different answers.

    During this period, the db was idle, there was no any other read/write activity at all.

  11. Jervin Real says:

    @Ben,

    The InnoDB table statistics are just estimates, as such you cannot really get an accurate row count out of them like MyISAM.

  12. Carsten Dreesbach says:

    Hey Jervin, not to be a pest, but I would totally love to read a followup article on this. We’ve recently turned off these two variables and then ran into an issue with queries taking a very long time to execute, which was fixed after running an ANALYZE TABLES. So I’d love to hear other recommendations to keep stats up to date with these settings off in order to avoid running into situations like this.

    Thanks!

Speak Your Mind

*