I had a fun case today.

There is a set of cache tables that keep certain content in MyISAM tables, and queries against these tables such as:

The “key” is the CRC32 of the real key, which keeps the index as small as possible, so if we have a cache miss we can in most cases learn it without going to disk.

So far so good.

The problem I discovered, however, is that some of these queries would take an enormous amount of time even though CRC32 conflicts are really rare.

Looking deep into the problem I found out PHP and MySQL are both to blame. PHP is to blame because in the 32bit PHP version the result of the crc32() function was returned as a signed integer, while in the 64bit build of the same PHP version it became unsigned.
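To make the difference concrete, here is a minimal sketch in Python rather than PHP (Python's zlib.crc32 always returns the unsigned value). The helper names to_signed32 and to_unsigned32 are made up for this illustration; they only show how the same 32 bits read as a signed or an unsigned number.

```python
import zlib

def to_signed32(value):
    """Reinterpret an unsigned 32-bit value as a signed 32-bit integer."""
    value &= 0xFFFFFFFF
    return value - 2**32 if value >= 2**31 else value

def to_unsigned32(value):
    """Reinterpret a (possibly negative) 32-bit value as unsigned."""
    return value & 0xFFFFFFFF

checksum = zlib.crc32(b"some cache key")   # unsigned, 0 .. 2**32 - 1
signed = to_signed32(checksum)             # how a signed 32-bit result would read
assert to_unsigned32(signed) == checksum   # same bits, different interpretation
```

Any checksum with the top bit set is negative in the signed reading and larger than 2^31-1 in the unsigned one, which is where the two platforms start to disagree.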

The system initially ran on a 32bit platform, so the “key” column was defined as “int”.

As it was migrated to the 64bit platform we got unsigned 32bit values which did not fit in this column any more, so MySQL was silently converting them to 2^31-1 (the largest value a signed int can hold) in just about 50% of the cases. This one is kind of expected.
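A rough way to see the effect, again as a Python sketch rather than the real setup: clamp_to_int below imitates the silent clipping described above, and the "key-N" strings are hypothetical cache keys invented for the example.

```python
import zlib

INT_MIN, INT_MAX = -2**31, 2**31 - 1   # range of a signed MySQL INT column

def clamp_to_int(value):
    """Imitate silent clipping: out-of-range values go to the nearest bound."""
    return max(INT_MIN, min(INT_MAX, value))

# Hypothetical cache keys, hashed to unsigned values as the 64bit build reports them.
checksums = [zlib.crc32(f"key-{i}".encode()) for i in range(10000)]
stored = [clamp_to_int(c) for c in checksums]

clipped = sum(1 for c in checksums if c > INT_MAX)
print(f"{clipped / len(checksums):.1%} of keys were clipped to {INT_MAX}")
```

Every checksum with the top bit set ends up stored as the same value, so that one value accumulates a very large share of the rows.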

What was unexpected, however, is how MySQL executed select queries when the key value was outside the signed int range.
Instead of simply reporting “impossible where noticed”, since the value is outside the range of values which can possibly be in the database, MySQL truncates the value to 2^31-1, then performs an index ref lookup (traversing about half of the rows, as cardinality for this constant is low) and discards all of them because no rows match the supplied key value.
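The behaviour can be mimicked with a toy model. This is only a sketch of the steps as described above, not MySQL's actual executor; the lookup function and the sample data are assumptions made for illustration.

```python
INT_MAX = 2**31 - 1

def lookup(index_keys, probe):
    """Return (rows_examined, rows_returned) for a `key = probe` condition."""
    clamped = min(probe, INT_MAX)                          # probe clipped to the column range
    candidates = [k for k in index_keys if k == clamped]   # stands in for the index ref lookup
    matches = [k for k in candidates if k == probe]        # compare against the original value
    return len(candidates), len(matches)

# Half of the rows hold the clipped value, the rest hold ordinary small keys.
index_keys = [INT_MAX] * 5000 + list(range(5000))

print(lookup(index_keys, 3_000_000_000))   # out-of-range probe: many rows examined, none returned
print(lookup(index_keys, 42))              # in-range probe: one row examined, one returned
```

The out-of-range probe pays for every clipped row even though the answer is known to be empty before the index is touched.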

So beware, data truncation can backfire in ways you might never expect 🙂

8 Comments
Joseph Scott

This sounds like a great argument against ever truncating data. Having a DBMS store different data than what I provided it with is just wrong. Instead it should throw an error.

Thomas

Hmmmm, didn’t know crc32 returned an ‘unsigned 32 bit’ value on 64bit. It probably is a regular signed int, but since a signed 64bit int can store a much larger value, it won’t go negative.

Thanks for mentioning, am now recalculating all crc32 values in my mixed environment setup 🙂

BOLK

Use MD5 or SHA1 instead of CRC32.

Thomas

@ peter

As far as I checked, the problem is not the conversion to string, but the fact that 64bit PHP returns a 64 bit signed integer which won't wrap at the same position as a 32bit signed integer.
I’ve solved my problems by using sprintf('%u', crcvalue) when crcvalue is negative and then storing all crc32 values as unsigned integers in MySQL.