Working on large upgrade of MySQL 4.1 running Innodb to MySQL 5.0 and doing oprofile analyzes we found very interesting issue of buf_get_latched_pages_number being responsible for most CPU usage. It did not look right.

The close look revealed this is the function which is used to compute number of latched pages in Innodb Buffer Pool, which is called when you run SHOW INNODB STATUS and SHOW STATUS. In this case for same of monitoring SHOW GLOBAL STATUS was run every couple of seconds being considered fast operation, as it is in MySQL 4.1

But why buf_get_latched_pages_number ? Because this function traverses whole buffer pool to compute number of latched pages, which is quite slow with buffer pools containing millions of pages. But even worse this is done while holding global buffer pool mutex.

Another function which we’ve seen being responsible for high CPU usage is ha_print_info which also may traverse large arrays.

But this is only one of the gotchas. The second issue we ran into is number of Threads_Running being much higher for MySQL 5.0 compared to MySQL 4.1 It was something like 2-3 for MySQL 4.1 vs 40-60 for MySQL 5.0
At the same time however there were virtually no disk IO and CPU being loaded less than 20% (out of 4 Cores) in all cases which points to some locks/latches being responsible for this high amount of threads.

It turns out this issue is side effect of the first issue described – if many queries are being run and buffer pool is large, a lot of queries pile up waiting on locked innodb buffer pool mutex, so SHOW STATUS shows high number of running threads.

This theory can be confirmed by the fact using mysqladmin processlist | grep -v Sleep shows much smaller value which is close in MySQL 5.0 and 4.1

Hopefully this issue will be fixed sooner or later (I would be even happy to give up Innodb_buffer_pool_pages_latched n SHOW STATUS output

21 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments
Paul Tuckfield

I’ve seen this exact effect in our environment, but never attempted to explain it. nice analysis as always.

I bet changing the function you pointed to will fix this problem, but it sure would be nice to be selective about what we’re looking at in some way. If “show variables like” did a more complete where clause implementation, this and future problems could be worked around.

If it makes you feel any better, i can tell you that this happens in oracle too when examining various statistics (oracle style, select * from v$whatever) under high traffic. It starts to feel like the heisenburg principle. But like I said above, the more detailed oracle stats that result in more disruptive high traffic serialization can generally be avoided since they’re not all under one command.

Bill

Thanks for finding this issue. Has this been reported to Mysql in a bug report? I don’t seem to be able to find it. I was thinking of using SHOW STATUS for innodb in 5.0, and I’d like to monitor the progress of this issue.

Ronald Bradford

SHOW STATUS can provide an alternative to LIKE ‘innodb%’ using WHERE. For example:

mysql> show global status where variable_name in (‘Com_insert’,’Innodb_buffer_pool_pages_latched’,’threads_running’);

The downside is you loose the wildcard capability, however you can string commands together with a like

mysql> show global status where variable_name like ‘innodb%’ or variable_name like ‘com%’;

mysql> select @@version;
+———–+
| @@version |
+———–+
| 5.0.37 |
+———–+

Ronald Bradford

Peter,

I’m not certain. I suspect you are correct, however I do need to know myself, I’ll add it to my long list to investigate just what’s happening under the covers (if only had more time)

Ronald

Vasil Dimov

Peter,

Can you change the body of buf_get_latched_pages_number() function to be something like

buf_get_latched_pages_number()
{
return(13);
}

and see how much things improve? While this will come up with wrong result (obviously) it will give a good hint if it is worth having the number of latched pages in a “ready to read” variable, without having to really implement it.

Thanks!

wow gold

Thanks for finding this issue. Has this been reported to Mysql in a bug report? I don’t seem to be able to find it. I was thinking of using SHOW STATUS for innodb in 5.0, and I’d like to monitor the progress of this issue.

Mark Callaghan

Wow. The bug is still there even in the 5.1.23 InnoDB plugin. I have filed a bug — http://bugs.mysql.com/bug.php?id=36600. I have changed my builds to not call buf_get_latched_pages_number. Otherwise, the CPU gets saturated when SHOW STATUS commands are run.

Vasil Dimov

Can someone give an estimate how much SHOW STATUS improves when buf_get_latched_pages_number() is disabled?

Thanks!

Heikki Tuuri

All,

we are going to remove the loop to calculate buf_get_latched_pages_number from the standard build of mysqld. We may keep it in the UNIV_DEBUG build.

Since SHOW INNODB STATUS\G prints all reserved latches, people can get the information, and in a much more detailed form, from SHOW INNODB STATUS\G.

###

Concerning the “disconnectedness of InnoDB developers” from the MySQL user base, InnoDB developers look at bugs.mysql.com every day. And consulting companies like Percona exist, who give us feedback :).

Regards,

Heikki

Vasil Dimov

Hmm,

Peter, do you mean SHOW STATUS instead of SHOW INNODB STATUS?
buf_get_latched_pages_number() is called when SHOW STATUS is executed, it is not called when SHOW INNODB STATUS is executed.

I have prepared a simple patch that removes the Innodb_buffer_pool_pages_latched variable from SHOW STATUS.

Vasil Dimov

Patch that removes the variable Innodb_buffer_pool_pages_latched from SHOW STATUS has been committed in InnoDB 5.0 and 5.1, follow the bug report: http://bugs.mysql.com/36600

Laine

Has this been addressed in the percona binaries?

Vadim Tkachenko

Laine,

Yes, and actually it is fixed in mysql binaries also (5.0.80+)