I’ve been working with an application that does a lot of CREATE and DROP TABLE for InnoDB tables, and we’ve discovered DROP TABLE can take a lot of time; while it runs, many other threads stall in the “Opening Tables” state. Also, contrary to my initial suspicion, benchmarking showed create/drop table to be CPU bound rather than IO bound.
I got an oprofile run and saw the following:
```
15753796 56.0725  no-vmlinux  no-vmlinux  /no-vmlinux
11834143 42.1213  mysqld      mysqld      buf_LRU_invalidate_tablespace
  168823  0.6009  mysql       mysql       completion_hash_update(hashtable*, char*, unsigned int, char*)
   53667  0.1910  oprofiled   oprofiled   /usr/bin/oprofiled
   42116  0.1499  mysqld      mysqld      buf_calc_page_new_checksum
   32107  0.1143  mysqld      mysqld      srv_release_threads
   14624  0.0521  mysqld      mysqld      srv_table_get_nth_slot
```
So we can see a disproportionate amount of time spent in the buf_LRU_invalidate_tablespace function, which did not sound right. The great thing about oprofile output is that you can usually Google a couple of the top functions and find bug reports or blog posts on the topic. This time it led me to this bug.
The problem is basically this: if you’re running with innodb_file_per_table=1, the tablespace is dropped when you run DROP TABLE, and InnoDB has to go through the LRU list and discard the pages that belong to that tablespace. This can take a lot of time with a large buffer pool. Worst of all, it is done while the table_cache lock is held, so no other queries can start.
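A rough sketch of how you might observe this (schema and table names here are made up for illustration, not from the original report):

```sql
-- With innodb_file_per_table=1 and a large, warm buffer pool,
-- dropping a heavily cached table stalls while InnoDB scans
-- the LRU list under the table_cache lock.
SET GLOBAL innodb_file_per_table = 1;
CREATE TABLE test.big_t (id INT PRIMARY KEY, pad CHAR(255)) ENGINE=InnoDB;
-- ... load enough rows that the table occupies a large share of the pool ...
DROP TABLE test.big_t;   -- this is where the stall happens
-- meanwhile, in another session:
SHOW PROCESSLIST;        -- other threads stuck in "Opening tables"
```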
This was both a shock and a case of “I’ve seen something like this before” – scanning large lists has been responsible for poor SHOW STATUS performance and poor crash recovery performance over the last couple of years, and here we go again.
It would be great to give InnoDB a good review and see where other such lists are hiding. Whenever you have a list, you really need to think about how large it can get. We’re already seeing half a TB of innodb_buffer_pool_size, and I expect it to approach 1TB at the high end in the next few years, which will bring the list of pages in the buffer pool close to 100 million entries.
So as a result of this bug – if you’re running with innodb_file_per_table, you will get microstalls when tables are dropped, and dropping a lot of tables in a batch can cause severe performance problems. The stalls can take more than a second with a very large buffer pool.
If you’re using InnoDB tables as transient tables, creating and dropping them frequently, you may want to avoid such a design or use innodb_file_per_table=0 for those tables (thankfully you can change this variable online in InnoDB Plugin, MySQL 5.5 and Percona Server).
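For example (the schema/table names are mine, just to illustrate the online change; the setting only affects tables created after it is changed):

```sql
-- innodb_file_per_table is dynamic in InnoDB Plugin, MySQL 5.5
-- and Percona Server, so it can be turned off just while the
-- transient tables are created:
SET GLOBAL innodb_file_per_table = 0;
CREATE TABLE scratch.work_queue (id INT PRIMARY KEY) ENGINE=InnoDB;  -- goes to the shared tablespace
SET GLOBAL innodb_file_per_table = 1;  -- restore for regular tables
```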
Yes – if you’re not using innodb_file_per_table you’re not affected, because in that case the tablespace does not need to be dropped, so pages in the buffer pool LRU do not need to be discarded. The raw performance of dropping tables that I measured on the test server was 10 times better with a single tablespace, though this will vary a lot depending on buffer pool size. Creating/dropping MyISAM tables was several times faster still.
One option that we have in the Facebook patch is innodb_background_drop_table. When this is enabled, it causes InnoDB to always drop tables in the background (normally it only does so in specific cases).
IIRC, it can potentially cause some problems if it is recreated immediately, but otherwise it prevents the lock_open problem (though it still wastes CPU for InnoDB as it does this in the background).
Subscribe to http://bugs.mysql.com/bug.php?id=51325
Also:
http://bugs.mysql.com/bug.php?id=56332
Thanks for the info Peter! Coincidentally we just discovered this issue a couple days ago. We were cleaning up a DB with one file per table and had to drop a 120G table. This was the result:
Query OK, 0 rows affected (3 hours 44 min 28.98 sec)
Almost 4 hours to drop a table! This was very shocking, as I’ve never experienced a drop table take more than a couple of seconds. The worst thing, as you mention, was that all queries to the DB were hung during this time, as it held that lock.
Do you happen to know a way around this for existing DBs? Are we basically not able to drop these large tables without taking a DB offline?
Thanks!
Robert,
I wonder if you are using ext3? The performance of dropping large tables is mostly bound by IO when using ext3. XFS can drop large files much more quickly.
We are using ext2 actually. One CPU was pegged at 100% while the drop was running and i/o was low.
There is another bug that could explain the slowness of dropping large tables with innodb_file_per_table:
http://bugs.mysql.com/bug.php?id=39939
It seems to be fixed in recent MySQL versions though.
Interesting find, I’m actually going to make a mental note to look at this.
I’ve been bitten before by a similar issue where InnoDB keeps a global lock on the database while it’s deleting the .ibd file from the file system (which, as others have mentioned, can take a remarkably long time on ext3).
Didn’t realize there was a CPU-bound phase where pruning the LRU list took this long. The engineer in me, though, is wondering how it could possibly take *that* long to work through the LRU list of blocks.
If we assume 16k blocks, a 16G buffer pool has 1M blocks.
If we take, I dunno, 100 CPU cycles per block to test for membership in the table being dropped, we’re still only looking at 100M CPU cycles, or maybe 0.03 seconds on a decent modern chip (or is my envelope math deeply flawed?).
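That arithmetic can be sanity-checked in one query (the ~3 GHz clock speed is just an assumption for “a decent modern chip”):

```sql
SELECT 16*1024*1024*1024 / 16384               AS pool_pages,  -- 1,048,576 pages
       (16*1024*1024*1024 / 16384) * 100       AS cpu_cycles,  -- ~10^8 cycles
       (16*1024*1024*1024 / 16384) * 100 / 3e9 AS seconds;     -- ~0.035 s at 3 GHz
```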
What is it doing internally that takes that long?
Patrick,
There is a bug for that: http://bugs.mysql.com/bug.php?id=41158. Thankfully it is fixed in MySQL 5.5. Also, you’d better stay away from ext3 – it is just nasty when it comes to large files/volumes.
Thanks Peter, I wasn’t actually aware that one was fixed. I didn’t have much luck getting folks here to switch to XFS just to speed up (rather infrequent) table drops, but we’ll be upgrading sooner or later :).
Out of curiosity, what are the concerns with EXT3? I’ve actually had pretty good luck with it … stable, fast “enough”. The “really slow deletes” issue is the only time I’ve actually questioned my choice of file systems.
I’m *far* from a file system expert though, so if you know something I don’t, I’d love to hear it.
Patrick,
Ext3 has slow deletes, and not just slow – they impact the performance of the whole system a lot. It has a read-write lock per inode, which does not allow as much concurrent IO (especially O_DIRECT) as one would like, and it also does not really like mixing fsync with metadata updates, as explained here: http://www.mysqlperformanceblog.com/2009/01/21/beware-ext3-and-sync-binlog-do-not-play-well-together/
ext4 looks promising in some benchmarks though it is not very commonly used in production yet.
Peter, thank you for pointing to my one-year-old bug report. It was a pain for us until we found the cause.
Did anyone test this behaviour on 5.5? I know the MySQL team fixed some things around lock_open, but I have not tested it yet.
HA! I just had this problem – I was dropping a database with approx 150 tables, some of them large InnoDB ones. The CPU hit the roof, so I killed the job. But I was still stuck with unneeded DBs holding lots of unneeded data.
So I wrote a script that truncated each table, dropped the table, and when finished dropped the DB. The CPU never climbed above 3 and the DB server remained usable for other databases. It did take 44 min. to complete this for 14 databases. My developer is gonna be very happy that these are now gone… the script was fast to create, too.
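A sketch of generating that kind of script from information_schema (the database name is hypothetical; run the generated statements one at a time so each operation stays small):

```sql
-- Emit TRUNCATE + DROP statements for every table in a database,
-- so each can be executed individually during quiet periods.
SELECT CONCAT('TRUNCATE TABLE `', table_schema, '`.`', table_name, '`; ',
              'DROP TABLE `', table_schema, '`.`', table_name, '`;') AS stmt
FROM information_schema.tables
WHERE table_schema = 'unneeded_db';
```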
Hi Peter,
We store a window of metric data and definitely see the effect of this problem. I’m wondering if we could work around it by reusing old tables rather than dropping and creating them. We’d have to truncate the old tables to clean them out. Any thoughts on the performance implications of truncating and reusing tables?
Thanks!
Jim: TRUNCATE in fact does a drop + create, so it won’t help. The only way is to delete all the data using DELETE statements, or to avoid innodb_file_per_table.
Jim,
One of the reasons for this blog post was to see whether this is a pain for many users, so we can decide if it makes sense to fix it in Percona Server and what the fix should do. There is a partial solution in the Facebook patch; I would like to see a complete solution, though. A relatively easy design for the fix would be not to scan the LRU and invalidate all pages from the removed tablespace, but simply let them be evicted via the normal LRU policy. It could later be extended with some background cleanup.
Jim,
With MySQL 5.5, if you don’t have foreign keys or triggers, a TRUNCATE will simply drop the table and recreate it under LOCK_open, which won’t help. Deleting all rows from the table could incur significant expense.
MySQL 5.5 does have metadata locking, which eliminates the holding of those mutexes, but invalidating buffer pool pages may still take some time. You would need to test it on your application to see if it improves performance for you. MySQL 5.5 still holds the kernel mutex during a drop, but this is fixed in trunk.
You might consider using RENAME TABLE to swap the table with an empty table, which should be a pretty fast operation. You could use a script to drop the old tables during a period of low activity.
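A sketch of that swap (table names are illustrative):

```sql
-- Swap an empty table in place of the full one; RENAME TABLE
-- performs the swap atomically and quickly.
CREATE TABLE metrics_new LIKE metrics;
RENAME TABLE metrics TO metrics_old, metrics_new TO metrics;
-- later, during a period of low activity:
DROP TABLE metrics_old;
```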
These stalls are definitely an issue for us. We frequently run servers shared by multiple clients with long hours of operation, which makes it difficult to find an acceptable time window in which to do the work without stalling everyone on the server while drops are performed.
This I can confirm, a LOT. It happens to me frequently on 5.5.8 – dropping a database can take almost forever.
See also:
http://www.mysqlperformanceblog.com/2011/04/20/drop-table-performance/
We fixed this problem in Percona Server, release 5.5.10-20.1 with the innodb_lazy_drop_table option.
Hi Peter,
I have some questions (and also poor English ^_^).
According to your post, when you set innodb_file_per_table=1 and then drop the table, the tablespace is dropped as well.
So why must the pages associated with that tablespace be discarded from the buffer pool too?
In my opinion, once the table is dropped, no statement will access the pages from that tablespace, so the useless pages in the buffer pool could be discarded when the buffer pool is full (or in other situations), not at the time of the DROP TABLE.
Repls,
Yes, you’re completely right. Yet the code is written in a way that scans the list and invalidates the pages, and changing this behavior is not as trivial as it may look.
Just curious, is there any workaround to avoid these locks? Would DELETEing all rows in small batches before DROP, or TRUNCATEing the table before DROP do any good? We have a fairly large table that we would like to drop without global locks kicking in…
Related question: maybe increasing innodb_buffer_pool_instances could avoid the long-running lock?
I am planning to permanently drop an InnoDB table 250GB in size, with innodb_file_per_table=ON. We are on Percona Server 5.6.29-76.2.
Does the LOCK_open (global lock) issue still exist in the 5.6 release?
Thanks in advance for the reply!
Has this problem returned with 5.7.18-14-log – Percona Server ?
I have innodb_file_per_table=1, and when tables are truncated the process list shows SYSTEM LOCK and each truncate takes several seconds.
All other queries are performing well.