Inspiration for this post is courtesy of a friend and former colleague of mine, Greg Youngblood, who pinged me last week with an interesting MySQL puzzle. He was running Percona Server 5.5.21 with a table structure that looks something like this:
1 2 3 4 5 6 7 8 9 | CREATE TABLE foo ( id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY, uid INT UNSIGNED NOT NULL, update_time DATETIME NOT NULL, .... INDEX `uid` (uid, update_time), INDEX `bar` (some_other_columns) .... ) ENGINE=InnoDB; |
When he ran this query:
1 | SELECT MIN(update_time) FROM foo WHERE update_time IS NOT NULL AND update_time <> '0000-00-00 00:00:00'; |
The result came back as 2012-06-22 10:28:16. However, when he ran a slightly different query:
1 | SELECT MIN(t.update_time) FROM (SELECT uid, MIN(update_time) AS "update_time" FROM foo WHERE update_time IS NOT NULL AND update_time <> '0000-00-00 00:00:00' GROUP BY 1) t; |
The answer which came back was also completely different: 2011-08-22 11:27:27. This is the correct response, not the date from 2012. We considered a few ideas to try to diagnose the issue, one of which being to try to force MySQL to use a different index. Imagine our surprise when we tried a FORCE INDEX on (bar) or an IGNORE INDEX(uid) and we got a completely different answer: 2012-06-21 20:36:35.
So, we then tried the “obvious” solution:
1 | SELECT update_time FROM foo ORDER BY update_time ORDER BY update_time LIMIT 3; |
And the result?
1 2 3 | 0024-06-22 01:34:25 2011-08-22 11:27:27 2011-08-23 11:31:40 |
Aha! Bad data! We hypothesized that the MIN() function was doing some sort of internal conversion to a unix_timestamp/integer or a ctime; indeed, running SELECT MIN(UNIX_TIMESTAMP(update_time)) …. does give a result of zero, whereas doing a CAST() or a CONVERT() on the update_time field first (so that it would be treated explicitly as a string) produces the 0024-06-22 result.
The documentation for MySQL 5.5 states that the supported range for the DATETIME type is 1000-01-01 00:00:00 through 9999-12-31 23:59:59. It goes on to say that ‘For the DATE and DATETIME range descriptions, “supported” means that although earlier values might work, there is no guarantee.’ It turned out that an application had submitted the odd datetime, and because the value passed the date-format check, it was allowed into the database. Oooops.
What, then, can we learn from this?
- Unfortunately, setting sql_mode does not help you here. Although 0024-06-21 is technically outside the supported range for DATE/DATETIME, neither TRADITIONAL, STRICT_ALL_TABLES, or STRICT_TRANS_TABLES produces so much as a warning. Is this a bug or just documented behavior? I suppose that depends on your perspective.
- When the documentation says that there is “no guarantee that something will work”, just don’t risk it.
- Even if a piece of data matches the expected format, that does not mean that it’s a valid value. So, perhaps the most important lesson of all: Always, always range-check your input! Format-checking is not enough.
One final aside – I was able to duplicate this same issue with Percona Server 5.5.25a, but in my case, the behavior was even stranger. I created a table with a similar structure:
1 2 3 4 5 6 | CREATE TABLE `foo` ( `i` int(11) NOT NULL AUTO_INCREMENT, `update_date` datetime NOT NULL, PRIMARY KEY (`i`), KEY `i` (`i`,`update_date`) ) ENGINE=InnoDB |
and then inserted a few bogus rows with the 0024-06-21 date. I then proceeded to dump a few thousand random datetimes into this table via a simple Perl script. Originally, it looked like I wasn’t going to be able to duplicate the situation – running “SELECT MIN(update_time) FROM foo” on my table produced what might arguably be considered the correct answer (1058-11-06 00:00:00), which was the minimum datetime value greater than the minimum supported datetime of 1000-01-01 00:00:00. But then I decided to move my bogus rows around by changing their PK, and this is what happened:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 | Before the move: (root@localhost) [test]> select * from foo order by update_date limit 5; +-------+---------------------+ | i | update_date | +-------+---------------------+ | 1 | 0024-06-21 10:35:55 | | 2 | 0024-06-21 10:35:55 | | 3 | 0024-06-21 10:35:55 | | 4 | 0024-06-21 10:35:55 | | 1159 | 1058-11-06 00:00:00 | +-------+---------------------+ (root@localhost) [test]> select min(update_date) from foo; +---------------------+ | min(update_date) | +---------------------+ | 1058-11-06 00:00:00 | +---------------------+ (root@localhost) [test]> update foo SET i=i+100000 where i<5; Query OK, 4 rows affected (0.00 sec) Rows matched: 4 Changed: 4 Warnings: 0 After the move: (root@localhost) [test]> select min(update_date) from foo; +---------------------+ | min(update_date) | +---------------------+ | 2024-06-21 10:35:55 | +---------------------+ (root@localhost) [test]> select update_date FROM foo order by update_date LIMIT 5; +---------------------+ | update_date | +---------------------+ | 0024-06-21 10:35:55 | | 0024-06-21 10:35:55 | | 0024-06-21 10:35:55 | | 0024-06-21 10:35:55 | | 1058-11-06 00:00:00 | +---------------------+ |
Very bizarre. On one hand, it seems like MySQL is using date windowing for two-digit dates (this would explain 2024-06-21), but on the other hand, it hasn’t actually changed any data in those DATETIME values, and now it’s definitely returning the wrong answer. Always range check your input!
I’ve found that bizarre values tend to be on the extremes, e.g. all-zero dates or dates like you mentioned. Probably not too hard to find them with a LIMIT 1 query, ordering by DESC too to find 9999-99-99. Maybe a Percona Toolkit tool could iterate through all tables, look for indexed date/time columns, and check the first and last to see if there’s garbage?
Out of curiosity I’ve tried your test case on MariaDB-5.5.
And I’ve got none of this weird behavior, it always consistently gave me “0024-06-21 10:35:55” as the minimum datetime value.
Wow. It looks like there are still skeletons in the closets in the edge cases.
If MariaDB is cleaning these up it is great !
The weird behavior is not seen the latest MySQL 5.6 Development Milestone Releases either.
Looks similar to http://bugs.mysql.com/bug.php?id=61100