January 9, 2007

MySQL automatic data truncation can backfire

Posted by peter |

I had a fun case today.

There is set of cache tables which cache certain content in MyISAM tables and queries for these tables such as:

SQL:
  1. SELECT DATA FROM cache0003 WHERE `key`=2342526263 AND real_key='cp_140797_6460aad5d2e50d3e859e8649007686ac';

The "key" is CRC32 of the real key which is used to keep index size as small as possible so if we have a cache miss we can in most case learn it without going to the disk.

So far so good.

The problem I discovered however is some of these queries would take enormous amount of time while CRC32 conflicts are really rare.

Looking deep into the problem I found out PHP and MySQL are both to blame. PHP is to blame because in 32bit PHP version result of crc32() function was returned as signed integer, in 64bit build of same PHP version it became signed.

The system worked on 32bit platform initially so "key" column was defined as "int"

As it was migrated to 64bit platform we got unsigned 32bit values which did not fit in this column any more so MySQL was silently converting them to 2^32-1, in just about 50% of the cases. This one is kind of expected.

What was unexpected however is how MySQL executed select queries if key value would be out of signed int range.
Instead of simply telling "impossible where noticed" as we have value outside of rage of values which can possibly be in the database we have MySQL truncating this value to 2^32-1, then performing index ref lookup (traversing about half of the rows in pages as cardinality for this constant is low) and discarding all of them before no values matched supplied key value.

So beware, data truncation can backfire in a ways you might not ever expect :)

MySQL sources from development tree

Posted by Vadim |

I just want to write few tips how to get MySQL from development BitKeeper tree, as there are several tricks.
First of all you need a fresh free client - bk-client2.0 http://www.bitkeeper.com/Hosted.Downloading.html .
Once you get it, you can clone latest tree:
bkf clone bk://mysql.bkbits.net/mysql-5.0 mysql-5.0

New client supports "changes" command, so you can look latest changes
bkf changes in mysql-5.0 directory.
It may be useful to get sources related to specific release. MySQL build team marks tree by TAG
with each new release. What I have looking by existing TAGS:

bkf changes | grep TAG
TAG: clone-5.0.32-build
TAG: mysql-5.0.30
TAG: clone-5.0.30-build
TAG: mysql-4.1.22
TAG: clone-4.1.22-build
TAG: mysql-5.0.28
TAG: mysql-5.0.27
TAG: mysql-5.0.26

So, to get sources for 5.0.30 release you need to run:

bkf clone -rmysql-5.0.30 bk://mysql.bkbits.net/mysql-5.0 mysql-5.0.30

Latest Falcon release is available on
bk://mysql.bkbits.net/mysql-5.2-falcon

Full list of available tree you can see here
http://mysql.bkbits.com/