InnoDB performance gotcha with larger queries

A couple of days ago I was looking for a way to improve update performance for an application, so I replaced single-value UPDATE statements with multi-value REPLACE statements (I also saw the same problem with INSERT ON DUPLICATE KEY UPDATE).

As I went from 1 value to 3 or 10 per batch, performance improved, especially in tests over the network. Going to batches of 100 and 1000 values, however, performance started to degrade badly: the process that took 45 seconds with single-value statements took over 12 minutes with 1000 values per batch. This was a big surprise for me, as I have used batching with great performance gains many times.
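For readers who have not used multi-value REPLACE, here is a minimal sketch of how such a batch might be assembled on the client side. The table name `counters`, the columns `id` and `val`, and the helper `build_replace_batch` are hypothetical and not from the application discussed here; real code would also need to escape string values.

```c
#include <stddef.h>
#include <stdio.h>

/* Write one multi-row REPLACE statement for n (id, val) pairs into buf.
   Returns the statement length, or 0 if n is 0 or the text does not fit
   in cap bytes. Table and column names are hypothetical. */
static size_t build_replace_batch(char *buf, size_t cap,
                                  const int *ids, const int *vals, size_t n)
{
    if (n == 0) {
        return 0;
    }
    int w = snprintf(buf, cap, "REPLACE INTO counters (id, val) VALUES ");
    if (w < 0 || (size_t)w >= cap) {
        return 0;
    }
    size_t len = (size_t)w;
    for (size_t i = 0; i < n; i++) {
        /* Rows after the first are separated by a comma. */
        w = snprintf(buf + len, cap - len, "%s(%d,%d)",
                     i == 0 ? "" : ",", ids[i], vals[i]);
        if (w < 0 || (size_t)w >= cap - len) {
            return 0;
        }
        len += (size_t)w;
    }
    return len;
}
```

Calling it with a batch size of 1, 10, 100 or 1000 gives the statement shapes compared in this post; the statement text grows linearly with the number of rows in the batch.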

I had two thoughts: either something was going on in the parser, or something nasty was going on at the InnoDB level. So I tried running the test with MyISAM tables instead. The process completed in 12 seconds with 1000 rows per batch versus 40 seconds with single-row statements. So with single-row statements MyISAM was about 10% faster (40 versus 45 seconds), which is quite expected, but with 1000 rows per batch the difference grew to 60 times (12 seconds versus over 12 minutes)!

I looked at the oprofile results:

samples  %        image name   app name     symbol name
312528   53.8091  mysqld       mysqld       dict_scan_to
150327   25.8823  libc-2.5.so  libc-2.5.so  __ctype_toupper_loc
71924    12.3834  mysqld       mysqld       .plt
18071    3.1113   no-vmlinux   no-vmlinux   (no symbols)

This shows a funky function, “dict_scan_to”, taking over 50% of the time. That is not the function you would expect to see on top for an update workload!

I asked Yasufumi, our InnoDB code ninja, to take a look at what it could be, and here is what he told me:

This may be a kind of performance bug in 5.0.
In row0sel.c, row_search_for_mysql(), at “PHASE 3:”,
InnoDB checks whether the SQL statement is a “SELECT” or not.
5.0 scans the SQL text each time:

    if (trx->isolation_level <= TRX_ISO_READ_COMMITTED
        && prebuilt->select_lock_type != LOCK_NONE
        && trx->mysql_query_str) {

        /* Scan the MySQL query string; check if SELECT is the first
        word there */
        ibool success;

        dict_accept(*trx->mysql_query_str, "SELECT", &success);

        if (success) {
            /* It is a plain locking SELECT and the isolation
            level is low: do not lock gaps */
            set_also_gap_locks = FALSE;
        }
    }

The “REPLACE 1000 rows” SQL doesn’t contain “SELECT”, so InnoDB scans the whole SQL text each time…
(* “INSERT” may not use this function? “UPDATE” may also be affected.)
On the other hand, current 5.1 at the same place has:

    if (trx->isolation_level <= TRX_ISO_READ_COMMITTED
        && prebuilt->select_lock_type != LOCK_NONE
        && trx->mysql_thd != NULL
        && thd_is_select(trx->mysql_thd)) {
        /* It is a plain locking SELECT and the isolation
        level is low: do not lock gaps */
        set_also_gap_locks = FALSE;
    }

which only looks up a flag in mysql_thd…

Scanning the statement for each update… that is a bummer! It is great to see, however, that it is fixed in MySQL 5.1.
It also explains why the problem took a while to uncover: the most typical giant statement we use is INSERT, which does not use this function and so does not hit the bug.
INSERT ON DUPLICATE KEY UPDATE, REPLACE and UPDATE should all be affected, though I have not tested this carefully.
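To see why the cost explodes with batch size, here is a rough model of the behavior described above. It is a simplified stand-in, not the real InnoDB implementation: `scan_for_keyword` and `total_scan_cost` are hypothetical names, and the model ignores the quote handling a real scanner would need. The scan stops at once when the statement starts with SELECT, but walks the entire text when SELECT never appears, and that walk is repeated for every row.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Walk sql from the start until `keyword` (given in upper case) is found,
   comparing case-insensitively. Returns the number of start positions
   examined: 1 if the statement begins with the keyword, the full length
   if the keyword never appears. */
static size_t scan_for_keyword(const char *sql, const char *keyword)
{
    size_t klen = strlen(keyword);
    size_t examined = 0;
    for (const char *p = sql; *p != '\0'; p++) {
        examined++;
        size_t i = 0;
        while (i < klen && p[i] != '\0'
               && toupper((unsigned char)p[i]) == keyword[i]) {
            i++;
        }
        if (i == klen) {
            break; /* keyword found, stop scanning */
        }
    }
    return examined;
}

/* Total positions examined if the scan is repeated once per row. */
static size_t total_scan_cost(const char *sql, size_t rows)
{
    return rows * scan_for_keyword(sql, "SELECT");
}
```

Because the statement length itself grows with the number of rows, the total work in this model grows roughly with the square of the batch size. As an illustration with made-up numbers, 1000 rows at about 20 bytes each is a statement of about 20 KB, so one batch examines about 20 million characters.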

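There is one more interesting thing with this InnoDB code: the scan is only run if trx->isolation_level is READ-COMMITTED or lower. With the isolation level at REPEATABLE-READ the scan is skipped, and the process took 15 seconds instead of 12 minutes. What a change!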

5 Comments

Arjen Lentz

14 years ago

So why was the isolation level lowered in this case, can you tell?

And when do you see a measurable gain from lowering the isolation level? As I understand from previous info, lowering the isolation level in InnoDB tends not to actually save any time in real terms.

Peter Zaitsev

Author

14 years ago

Arjen,

I do not know. Sometimes it is just assumed it would help. Plus it actually does help a little bit: as you can see from the code in the example, you can have fewer locks if you’re running at a low isolation level.

I also checked single-row replacement statements, and there was a performance gain of about 5% for READ-UNCOMMITTED.
I think this is not enough to bother with, but I also remember some use cases with larger gains.

Bhushan Uparkar

14 years ago

Could you please provide details about the MySQL 5.1 version in which this got fixed?

Peter Zaitsev

Author

14 years ago

Bhushan,

I do not know the exact version, but it is surely fixed in recent 5.1 versions. If you’re looking for us to dig into the revision history and find in which version this change was made, we can do it on a consulting basis.

Chetan Ahuja

14 years ago

I just trawled through the source and it’s been fixed somewhere between 5.1.30 and 5.1.37. That is to say, I didn’t find the “thd_is_select(trx->mysql_thd)” clause in the 5.1.30 codebase but did find it in the 5.1.37 codebase. And using vtune profiles, we even saw the actual dict_scan function that was dominating in the 5.1.30 codebase disappear from the top 10 functions in the 5.1.37 codebase. Though unlike your success with REPEATABLE-READ, we didn’t see any jump in our actual query throughput (and yes, we even tried the REPEATABLE-READ isolation level thing with 5.1.37 just in case… no cigar).

Chetan
