Couple of days ago I was looking for a way to improve update performance for the application and I was replacing single value UPDATE with multiple value REPLACE (though I also saw the same problem with INSERT ON DUPLICATE KEY UPDATE)

As I went from 1 value to 3 or 10 in the batch performance improved, especially for network tests, however going to batches of 100 and 1000 values performance started to degrade badly – the process which was taking 45 seconds with single value statements was taking over 12 minutes with 1000 values in a batch. This was a big surprise for me as I’ve used batching with great performance gains a lot of times.

I had 2 thoughts – either something is going on with parser or it is something nasty going on on Innodb level, so I tried running the test with MyISAM tables instead. The process completed in 12 seconds for 1000 rows in a batch vs 40 seconds for single row statements. So with single statement MyISAM was about 10% faster, which is quite expected but with 1000 rows in the batch the difference grew to 60 times !

I looked at the oprofile results:

samples % image name app name symbol name
312528 53.8091 mysqld mysqld dict_scan_to
150327 25.8823 libc-2.5.so libc-2.5.so __ctype_toupper_loc
71924 12.3834 mysqld mysqld .plt
18071 3.1113 no-vmlinux no-vmlinux (no symbols)

Which shows some funky function “dict_scan_to” taking over 50% of the time – not the function you would expect to see on top for update workload !

I asked Yasufumi, our Innodb code ninja to take a look what it could be and here is what he tells me:

This may be kind of performance bug of 5.0.

At row0sel.c:row_search_for_mysql() “PHASE 3:”,

InnoDB judge whether the SQL is “SELECT” or not.

5.0 does scanning the SQL for each time.

if (trx->isolation_level select_lock_type != LOCK_NONE
&& trx->mysql_query_str) {

/* Scan the MySQL query string; check if SELECT is the first
word there */
ibool success;

dict_accept(*trx->mysql_query_str, “SELECT”, &success);

if (success) {
/* It is a plain locking SELECT and the isolation
level is low: do not lock gaps */

set_also_gap_locks = FALSE;
}
}

The “REPLACE 1000 rows” SQL doesn’t have “SELECT” so, InnoDB scan all of the SQL each times…
(* “INSERT” may not use this function? “UPDATE” may be also affected)

On the other hand, current 5.1 at the same place is,

if (trx->isolation_level select_lock_type != LOCK_NONE
&& trx->mysql_thd != NULL
&& thd_is_select(trx->mysql_thd)) {
/* It is a plain locking SELECT and the isolation
level is low: do not lock gaps */

set_also_gap_locks = FALSE;
}

looking up to the flag of mysql_thd….

Scanning the statement for each update… that is a bummer ! It is great to see however it is fixed in MySQL 5.1
It also explains why the problem took a while to uncover – the most typical statement of the giant size which we use is INSERT, and it does not use this function so it did not have a bug.
INSERT ON DUPLICATE KEY UPDATE, REPLACE, UPDATE all should be affected, though I have not tested this carefully.

There is one more interesting thing with this code – The scan is only run if trx->isolation_level 15 seconds instead of 12 minutes. What a change !

5 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments
Arjen Lentz

So why was the isolation level lowered in this case, can you tell?

And, when do you see measurable gain by lowering isolation level. As I understand from previous info, lowering the isolation level in InnoDB tends to not actaully save any time in real terms.

Bhushan Uparkar

Could you please provide details about the mysql 5.1 version in which this got fixed ?

Chetan Ahuja

I just trawled through the source and it’s been fixed somewhere between 5.1.30 and 5.1.37. That is to say, I didn’t find the ” thd_is_select(trx->mysql_thd)” clause in the 5.1.30 codebase but did find it in the 5.1.37 codebase. And using vtune profiles, we even saw the actual dict_scan function that was dominating in the 5.1.30 codebase, disappear from the top 10 functions in 5.1.37 codebase. Though unlike your success with REPEATABLE-READ, we didn’t see any jump in our actual query throughput (and yes, we even tried the REPEATABLE-READ isolation level thing with 5.1.37 just in case… no cigar).

Chetan