December 20, 2007

What is the longest part of Innodb Recovery Process ?

Posted by peter

In MySQL 4.1 and above the longest part of recovery after crash for Innodb tables could be UNDO stage - it was happening in foreground and was basically unbound - if you have large enough transaction which needed to be undone this could take long hours.
REDO stage on other hand always could be regulated by size of your Innodb log files so you could have it as large as you like. Read more about it here.

Since MySQL 5.0 the UNDO stage is running in background so it still can be the longest but would not keep server completely unusable (some limitations still apply though).
[read more...]

Large result sets vs. compression protocol

Posted by shodan

mysql_connect() function in PHP’s MySQL interface (which for reference maps to mysql_real_connect() function in MySQL C API) has a $client_flags parameter since PHP 4.3.0. This parameter is barely known and almost always overlooked but in some cases it could provide a nice boost to your application.

There’s a number of different flags that can be used. We’re interested in a specific one, MYSQL_CLIENT_COMPRESS. This flag tells the client application to enable compression in the network protocol when talking to mysqld. It reduces network traffic but at the cost of some CPU time: server has to compress the data and client has to decompress it. So there’s little sense in using it if your Web application is on the same host as the database.

When the database is on a dedicated server then compression essentially means trading CPU time (on both server and client) for network time. Obviously, if the network is fast enough, the benefit in network time will not outweight the loss in CPU time. The question is, where exactly does the border lie?

It turns out that 100 Mbit link (with 1.4 ms round-trip time) is not fast enough. Oleksandr Typlynski, one of the Sphinx users, has conducted a benchmark, indexing 600 MB of data over 100 Mbit link. The data was textual and compressed well, reducing traffic more than 3 times. With compression, total indexing time reduced to 87 sec from 127 sec. That’s almost 1.5x improvement in total run time. MySQL query time improvement is even greater. On the other hand 1 Gbit link was fast enough; and total run time was 1.2x times worse with compression.

The bottom line: if you’re fetching big result sets to the client, and client and MySQL are on different boxes, and the connection is 100 Mbit, consider using compression. It’s a matter of adding one extra magic constant to your application, but the benefit might be pretty big.

December 19, 2007

MVCC: Transaction IDs, Log Sequence numbers and Snapshots

Posted by peter

MySQL Storage Engines implementing Multi Version Concurrency Control have several internal identifiers related to MVCC. I see a lot of people being confused what they are and why they are needed so I decided to take a time to explain it a bit. This is general explanation, it does not corresponds to Innodb in particular and some implementation can be different but I hope this will let you to understand MVCC a bit better.
[read more...]

December 18, 2007

Fixing column encoding mess in MySQL

Posted by shodan

Just had an interesting issue with an encoding mess on a column containing non-ASCII (Russian) text. The solution was not immediately obvious so I decided it's worth sharing.

The column (actually the whole table) was created with DEFAULT CHARSET cp1251. Most of the data was in proper cp1251 national encoding indeed. However, because of web application failure to properly set the encoding, some of the rows were actually in UTF-8. That needed to be fixed.

Simply using CONVERT(column USING xxx) did not work because MySQL treated the source data as if it was in cp1251. One obvious solution would be to write a throwaway PHP script which would SET NAMES cp1251, pull the offending rows (they'd come out in UTF-8), iconv() them to proper cp1251, and UPDATE them with new values.

However it's possible to fix the issue within MySQL. The trick is to tell it to treat the string coming from the table as binary, and then do charset conversion:

CODE:
  1. UPDATE table SET column=CONVERT(CONVERT(CONVERT(column USING binary) USING utf8) USING cp1251) WHERE id=123;

This can be further simplified. After 2nd conversion the result is in UTF-8, and this time MySQL knows that it's UTF-8 as well. So it will perform conversion to per-table charset automatically, and the 3rd explicit CONVERT can be omitted.

CODE:
  1. UPDATE table SET column=CONVERT(CONVERT(column USING binary) USING utf8) WHERE id=123;

The same trick could be applied to fix notorious KOI8-R vs CP1251 encoding issues in Russian, and other national SBCS encoding vs UTF-8 issues, I suppose. But, of course, ideally you'd always properly set the encoding in the Web application and avoid these issues at all.

December 16, 2007

Slides from Egypt talks

Posted by Vadim

I published slides from my two talks from our Egypt company meeting.
First one is about Sphinx search engine. The originally presentation was prepared by Andrew Aksenoff, author of Sphinx for HighLoad.ru conference, I translated it on English for our meeting.
And second about Mercurial version system, the tool we decided to use for our internal and web projects.

December 12, 2007

Back from company meeting in Hurghada, Egypt

Posted by peter

Percona team and families
So we're back from the company meeting which took place last week in Hurghada, Egypt. It was a great opportunity to meet a lot of people for the first time in person to work together and have some fun.
[read more...]

December 9, 2007

Be careful rotating MySQL logs

Posted by peter

If you enable logging of all queries as "slow queries" using the patch or MySQL 5.1 you can get log file to grow huge. Same may happen with general log file. In some cases we've got log file sizes of 100G or more which may need to be cleaned up.

Here is some danger waiting for you at least on typical Linux systems - If you follow most simple process - delete log file you do not need and run "FLUSH LOGS" so file is recreated and space reclaimed you can get into serious trouble. File close operation will when perform deletion which can be quite a long process, depending on filesystem and log file size - we've seen it to take 10 minutes or more in some cases. When log file is reopened MySQL will be practically unavailable causing unanticipated downtime.

The workaround for this problem is very simple - instead of deleting MySQL log file - rename it, call "FLUSH LOGS" which will be instant as it will not involve complex delete operation and when you can delete the log file you no more need.

It can be also good idea to hook up log rotate to take care of MySQL logs so you would not need to cleanup them manually. In some cases it is already setup if you use MySQL distribution supplied by OS vendor.


This page was found by: site:www.mysqlperfor... mysql uncompress utf...