I was restarting MySQL on a box with 50,000 InnoDB tables and once again it took a couple of hours to reach decent performance because the “Opening Tables” stage was taking so long.

Part of the problem is that InnoDB updates statistics on each table open, which can be an expensive operation, but really this is just a good test case for a more general MySQL problem.

During the warmup process I noticed very low CPU usage and disk usage, with iowait around 25% (a 4-CPU box), which indicates table opens and stats updates are serialized rather than performed in parallel.

I checked with Heikki and he confirmed that MySQL has a global table cache mutex (LOCK_open) which is held for the whole open-table operation, so table opens are serialized.
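
A rough way to see this from the outside (not proof of LOCK_open contention, just a hint) is to count how many threads are sitting in the “Opening tables” state:

mysqladmin processlist | grep -c "Opening tables"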

Do not think, however, that this is only an InnoDB problem. I’ve seen similar problems with MyISAM – those tables also take a few IO operations to open and can take quite a while to close if there are unflushed key blocks, and we have no idea what other storage engines may do to perform a table open – some may need network operations, etc.

In general I think it is an extremely poor design choice to have global mutexes around anything that may require blocking physical IO or network operations, if you care about scalability. MySQL had the same problem with the key buffer, but that was fixed in MySQL 4.1; this one still remains.

I do not blame Monty – implementing the first MySQL version in 3 months, he had to take shortcuts and use the simplest solutions for many things. Now, years later and with 50+ developers, this should have been fixed.

Until this is fixed it is especially important to keep your table_cache large enough that table opens are rare (I use 1/sec as the number to worry about), and also to account for possibly limited performance while the table cache is being warmed up, in addition to all the other caches 🙂
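
A minimal way to check where you stand (assuming the mysqladmin client is on the box; the counters are the standard ones from SHOW STATUS):

mysqladmin extended-status | grep -E "Open_tables|Opened_tables|Uptime"
# Rough rule of thumb: Opened_tables divided by Uptime should stay well under
# 1/sec once the cache is warm; if not, raise table_cache (table_open_cache
# in 5.1+) in my.cnf and watch the counters again after a warmup period.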

25 Comments
Jay Pipes

Hi Peter!

Tell me, does the table_definition_cache/table_open_cache split in 5.1 resolve these issues, or is that a separate thing?

Cheers!

Jay

Frank Mash

Hey Peter,

Wassup man.

Why do you have 50,000 tables on one server?

Just curious 🙂
Frank

James Day

Frank, there are people who have more than 100,000 tables on a server. It’s one reason why the InnoDB data dictionary in RAM was made 50-70% smaller in 5.1 and why a couple of other InnoDB changes have been made in 5.0 and 5.1 to speed up handling of lots of tables.

We had a feature request recently from someone who wanted more than 32,000 databases on a file system that couldn’t handle that many subdirectories, so the MySQL database count was limited by that. There are no current plans to do this.

The InnoDB stats continue to be quite frustrating, since the sample size isn’t big enough to give consistent query results at larger table sizes and the overhead at opening is annoying. It’d be nice if it remembered the stats and refreshed occasionally, or perhaps added to them in the background during use.
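
Later InnoDB releases (the InnoDB Plugin in 5.1 and up) at least make the sample size tunable via innodb_stats_sample_pages; a rough sketch, assuming such a version:

mysql -e "SET GLOBAL innodb_stats_sample_pages = 64"
# If the variable is not dynamic in your build, set innodb_stats_sample_pages
# in my.cnf instead and restart.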

thomas

or just use a decent filesystem :).

Alexey

With 32,000 databases the real bottleneck would be the MySQL privilege system. It becomes a problem when you have 5,000 databases, and with 10,000 databases you’ll have to fix some things in the source – otherwise queries like SHOW DATABASES will use 100% of your CPU for a couple of minutes.
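
One partial mitigation, if it fits your setup, is the server’s skip-show-database option, which restricts SHOW DATABASES to users holding the SHOW DATABASES privilege:

# In my.cnf, under [mysqld]:
#   skip-show-database
# Verify after a restart:
mysql -e "SHOW VARIABLES LIKE 'skip_show_database'"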

Alexey Polyakov

Yes, I was speaking about the web hosting case. Companies that sell shared web hosting often place all of their customers’ DBs on dedicated boxes. Most of the DBs do not produce high load, so it’s OK to stuff thousands or even tens of thousands of databases onto a single server.
I’m not sure why one would want to create thousands of DBs belonging to a handful of users. 🙂

Regarding ‘show databases’ – you have to care about it in the web hosting case. phpMyAdmin calls it twice when you open its main page. 10 users actively working with phpMyAdmin can ruin the performance of an otherwise not-heavily-loaded server.

Alexey

Here’s a piece of profiling data from a live system running a lot of DBs without too much load (about 80 qps, select-heavy with a pretty good hit rate):
root@titanic [/var/lib/mysql]# ls -la|wc -l
24547
root@titanic [/var/lib/mysql]# opreport -l *mysqld|head
warning: /oprofile could not be found.
CPU: AMD64 processors, speed 2394.07 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Cycles outside of halt state) with a unit mask of 0x00 (No unit mask) count 100000
samples % image name symbol name
10174899 26.6171 libc-2.3.4.so strcmp
5460631 14.2848 mysqld acl_getroot(THD*, user_resources*, char const*, unsigned int)
2592977 6.7831 libc-2.3.4.so memcpy
1956617 5.1184 vmlinux-2.6.17.7 copy_user_generic_c
1797812 4.7030 mysqld Query_cache::insert_into_free_memory_list(Query_cache_block*)
1685928 4.4103 mysqld my_lengthsp_8bit
1066380 2.7896 mysqld anonymous symbol from section .plt

The first two functions are privilege-system related (strcmp is used for comparing logins, filenames etc.; MySQL uses its own functions for matching data). As you can see, those two functions alone consume more than 40% of CPU time. That’s after fixing the SHOW DATABASES problem; with SHOW DATABASES the number would be more like 95%. 🙂

Matt

If anyone is interested in improving the way MySQL deals with a ton of tables, we (WordPress.com) would happily sponsor that development. We have about 5 million tables today and the number is going up constantly. I’m reachable on my site.

Steven Roussey

The 32,000 limit is in ext2 and ext3 (still not fixed in ext4, although there is an old patch to raise it to 64,000). If you happen to be using ext as your filesystem with a lot of databases or files, you might try adding dir_index and then doing a filesystem tune. I don’t know what the speedup is; your mileage may vary.
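
For example (a sketch only – /dev/sdX is a placeholder, and the filesystem should be unmounted or mounted read-only first):

tune2fs -O dir_index /dev/sdX    # enable hashed directory indexes
e2fsck -fD /dev/sdX              # force a check and optimize existing directories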

Matt: why 5 million tables? We host 500,000 forums and don’t give each one its own set of tables – it was just too much work, too slow, and too complex.

Hakan

WordPress.com hosts a free blog system running WordPress Mu (multiuser wordpress) and it creates a new set of tables for every new blog.

/Hakan

Sysadm

We run a forum hosting community – around 3,500,000 MyISAM tables per server. Each forum gets its own database (~90 tables per database). We have a similar issue – when the server is overloaded, the status of all threads in “show processlist” is “Opening tables”. We have many PHP and system tweaks, but this problem seems hard to resolve without rewriting MySQL source code, and we don’t have enough knowledge to do that. It’s MySQL 4.0.

We have tried increasing table_cache, but it doesn’t make sense with so many tables in use – the best-performing setting for us is table_cache=0.

It is now hosted on Linux 2.6 & XFS, which supports more than 32K subdirectories in one directory. ReiserFS supports that too, but it’s terribly unstable under such load and breaks its own journal very often, and ReiserFS is not visibly faster than XFS for this.

I’m afraid of putting this many tables into one database – that causes worse performance problems, like hosting >10,000,000 files in one directory (3 files per table)…

art

Does MySQL open all InnoDB tables at startup, or only on the first query against a particular table?

Derek Organ

We work on the basis of a separate database for each new company account. The problem I’m starting to have with 5,000 databases is that the SHOW DATABASES and USE database commands are taking a long time to run and are showing up in my slow-query log. Are there any tips for making this go faster?

Mattias J

We have some 2000 databases with the same set of 250+ tables (both numbers constantly growing) = 500’000 tables and counting on the same server.

As we are trying to scale out, our main options seem to be either “manually” partitioning different databases onto different servers, or putting all the data into a single set of large tables in combination with built-in MySQL Partitioning.

But how does MySQL Partitioning affect table opening and the table cache?

Are there other options we should consider?

med amine

I am an engineering student and I am working on the subject of MySQL database sharding. Could you suggest books or links that would be useful for my work? Thank you in advance.

Sergei

And I was wondering whether MySQL could handle a few thousand tables. Compared to WordPress’s numbers, this is like a toy project 🙂

Andreas Bergman

Hi,

Do you know if this issue is fixed in 5.5?

We run a WP multisite with about 200k tables, and as the number of tables in the open table cache grows, query execution times increase, until eventually there are always 400 queries running, most of them in the opening/closing tables state. If I restart the process it works like a charm for a couple of hours, and then it starts all over again.

Any ideas?
Andreas