April 24, 2014

utf8 data on latin1 tables: converting to utf8 without downtime or double encoding

Here’s a problem some or most of us have encountered. You have a latin1 table defined like below, and your application is storing utf8 data to the column on a latin1 connection. Obviously, double encoding occurs. Now your development team decided to use utf8 everywhere, but during the process you can only have as little […]

Fixing column encoding mess in MySQL

Just had an interesting issue with an encoding mess on a column containing non-ASCII (Russian) text. The solution was not immediately obvious so I decided it’s worth sharing. The column (actually the whole table) was created with DEFAULT CHARSET cp1251. Most of the data was in proper cp1251 national encoding indeed. However, because of web […]

PHP vs. BIGINT vs. float conversion caveat

Sometimes you need to work with big numbers in PHP (gulp). For example, sometimes 32-bit identifiers are not enough and you have to use BIGINT 64-bit ids; e.g. if you are encoding additional information like the server ID into high bits of the ID. I had already written about the mess that 64-bit integers are […]

Using CHAR keys for joins, how much is the overhead ?

I prefer to use Integers for joins whenever possible and today I worked with client which used character keys, in my opinion without a big need. I told them this is suboptimal but was challenged with rightful question about the difference. I did not know so I decided to benchmark. The results below are for […]