June 29, 2007

Top 5 Wishes for MySQL

Posted by peter

About a week ago Marten send me email pointing to his article published on Jays Blog (Come on Marten, it is time for you to get your own blog). I should have replied much earlier but only found time to do that now. So here is my list

1. Be Pluggable

Unlike many OpenSource projects MySQL was single chunk of code and for years the only way you could officially extend it was using UDFs which was very limited. Compare this with other OpenSource projects such as PostgreSQL (plugable indexes etc), Apache, PHP or Linux Kernel. Yes in MySQL 5.1 the situation is changes - now there are plugable storage engines (something even PostgreSQL does not have) as well as Full Text Search parses but there is very long way to go before you could do any significant functionality ourside of storage engines as plugins.

Two aspects why I’m especially interested about it is Sphinx which is currently implemented as storage engine for MySQL but this role does not suite it well as well for our performance optimization works - we would like to get much more stats from MySQL and it would be great if we could write these as plugins instead of patches.

2. Be Scalable

I’d like to see MySQL Scaling to large amount of CPUs for all storage engines all operation systems and all workloads. Well this may be too much of the wish but at least it should happen for most common cases. Innodb scaling issues are the most common problem (and patches currently available only fix part of it) but far from being only problem. There are fair amount of issues with MyISAM and generally on MySQL level which needs to be taken care of. Many of these issues were known for years. The good thing is now with multi-cores coming even on laptops these issues can’t be under carpet any more and they are getting attention and gradually being fixed.

3 Be Distributed
One of the most common buzzwords used together with MySQL is Scale Out, but to say the truth MySQL does not provide much of the functionality for scale out itself, besides its fast and simple replication. You still would normally need to manually chop your data to the multiple databases and handle data merging on application together with high availability and all kinds of problems. The dream for many MySQLers is to be able to get bunch of boxes and use them same as single MySQL server with improved performance and high availability. Of course there is MySQL Cluster but it does not get where quite yet having very particular performance properties and operating requirements. In fact I do not think MySQL Cluster would ever be that thing as it provides too strong guarantees. It is same as with MySQL Replication - the reason it is fast and has low performance overhead on the master is at large extent because it is asynchronous. To offer great performance on distributed systems, especially if they are connected via not very fast network you need to come up with different consistence guaranty and semantics… and believe me most of Web applications could live with it. They anyway read stale data from the slave or memcached manually relaxing data consistence requirements.

But honestly most parts are OK - over time you develop practices and tool set to deal with multiple nodes nicely even if you do not have much support from MySQL to do so. What is really bad is Parallel query. Reporting with group by over 10.000.000 of rows in real time is one painful thing to do requiring complex application programming to do efficiently in distributed manner. Of course there are summary tables and other tricks we use when such need arises but we still can’t deny - MySQL does not handle complex queries well, being able to use only one CPU effectively on single node and no true federation support.

4. Be Solid
MySQL has very good stability track record, with exception of MySQL 5.0 it was typically safe to run it in production well before it was stable (I remember we had successful production on 3.23 alpha). However many of new releases score pretty poor on the scale of being Solid.

By by being Solid I mean Solid functionally complete features which work as uses intuitively expect them to work instead of having a lot of little known documented (sometimes) limitations. I’m saying about Prepared Statements in MySQL 4.1 which came without query cache support but even worse not all of statements could be prepared not to mention other issues. I’m saying about SubSelects which were added in MySQL 4.1 but you still could not run (and still can’t) most of your Oracle/DB2/MS SQL Queries on MySQL because subqueries optimizer weakness. I’m saying about XA which were released in MySQL 5.0 without consistent work with MySQL close Partner - Innobase to ensure Group Commit still works in 5.0 and so we do not get major regressions. I’m saying about Stored Procedures which were released without proper development and debugging support. These are just few examples.

Of course you could say “release early release often” is a good strategy and at large extent I would agree but MySQL is not released often any more. There is relatively long release cycle with great deal of planning and it looks like no features go in stable release just the day before it is released. With such release management I would expect MySQL to try to be a lot more consistent as such approach means if something is inconsistent it stays where for very long time. For example Query Cache support for Prepared Statements is only fixed in MySQL 5.1 and Most of other items are still open.

Well may be this is remainder of wild first VC-backed years of MySQL and not things are getting straightened up. I know on a lot of listed issues the work is going on and it just takes time to get them all fixed.

5. Do not forget about the roots.
MySQL has gained its popularity as database for Web but web users do not pay much because they are OK with GPL and often have smart guys inside who can figure things out and even patch MySQL if needed. So MySQL went to chase Enterprise market and develop enterprise features pushing back a lot of features which Web users would like. These features are often small, typically not SQL Standard compatible but help you if you’re web developer a lot. In my Early years in MySQL I have created more than hundred of such little projects suggested by MySQL customers and myself based on long MySQL usage for Web. Later I stopped adding these because I have not seen much traction for these tasks as well as of similar tasks created by other guys.
These would include support for query timeouts, parallel query execution, CONNECT BY support and various performance related features.

I’m not sure if MySQL got back to spend significant resources on taking care of Web guys needs but at least we have now Community tree and Web guys can take care of themselves if they want to. Jeremy Cole’s patches were accepted, our patches and I guess Google patches are in the queue which is very good sign. There is still fair amount of work to do to straighten up this process so good patches can quickly appear in community tree. Once this is done I think amount of contributions will increase a lot.

June 26, 2007

Can Innodb Read-Ahead reduce read performance ?

Posted by peter

I ran into pretty interesting behavior today. We needed to dump and reload large database and we had pretty good IO subsystem so we started number of mysqldump processes in parallel. Unlike in other case when we did load in parallel, dump in parallel did not increase IO rate significantly and we could still see just about 1.5 average outstanding IO requests to the disk.

Lets look at SHOW INNODB STATUS:

——–
FILE I/O
——–
I/O thread 0 state: waiting for i/o request (insert buffer thread)
I/O thread 1 state: waiting for i/o request (log thread)
I/O thread 2 state: doing file i/o (read thread) ev set
I/O thread 3 state: waiting for i/o request (write thread)
Pending normal aio reads: 256, aio writes: 0,
ibuf aio reads: 0, log i/o’s: 0, sync i/o’s: 0
Pending flushes (fsync) log: 0; buffer pool: 0
112084404 OS file reads, 29836003 OS file writes, 2038246 OS fsyncs
1 pending preads, 0 pending pwrites
286.27 reads/s, 82658 avg bytes/read, 2.96 writes/s, 0.71 fsyncs/s
————————————-
INSERT BUFFER AND ADAPTIVE HASH INDEX
————————————-
Ibuf: size 108931, free list len 64619, seg size 173551,
56290428 inserts, 59538912 merged recs, 3876269 merges
Hash table size 25499819, used cells 1303661, node heap has 1974 buffer(s)
867.08 hash searches/s, 532.42 non-hash searches/s

LOG

Log sequence number 586 10447960
Log flushed up to 586 10447960
Last checkpoint at 586 10264850
0 pending log writes, 0 pending chkp writes
1338030 log i/o’s done, 0.47 log i/o’s/second
———————-
BUFFER POOL AND MEMORY
———————-
Total memory allocated 13917841802; in additional pool allocated 1046272
Buffer pool size 786432
Free buffers 0
Database pages 784458
Modified db pages 361
Pending reads 261
Pending writes: LRU 0, flush list 0, single page 0
Pages read 344728059, created 2867101, written 37374308
1445.95 reads/s, 0.00 creates/s, 3.35 writes/s
Buffer pool hit rate 933 / 1000
————–
ROW OPERATIONS
————–
8 queries inside InnoDB, 3 queries in queue
12 read views open inside InnoDB
Main thread process no. 3956, id 1157658976, state: sleeping
Number of rows inserted 60790248, updated 11571576, deleted 0, read 63850963520
0.00 inserts/s, 0.00 updates/s, 0.00 deletes/s, 24860.96 reads/s

The stats are rather interesting. There are 8 queries inside Innodb and we can see only one pending pread with 256 Pending normal aio reads.

Pending normal aio reads are in fact counted in pages rather than distinct IO requests, so in this case it is likely these are 4 read-ahead requests, 64 pages each.

What seems to be happening (I’m using “seems” as there are no detailed enough stats available) is once thread has issued read-ahead request it has to wait for it to complete. It can’t simply go and read the next page it needs, if the same page is already being read via read-ahead. Now as we’re dumping tables which were inserted in PK order we have relatively sequential IO and so read-ahead is constantly triggered which makes all thread to wait on single read-ahead thread which is unable to keep up.

This does not explain the whole story to me, such as I’d expect read-aheads to complete faster giving much more than 30MB/sec from this RAID system. Also as average read size is some 80KB we have a lot of non read ahead requests happening as well, at least 10 times more than number of read-ahead requests. May be it is poor read-ahead performance plus multiple threads doing scattered O_DIRECT (so no OS read-ahead) single page reads which hurt performance ?

I guess we really should try Mark Callaghan patches to see what happens if we allow multiple read-ahead threads.

I also can’t wait for Christoffer Hall’s work of Linux AIO support to be merged to the mainline.

This also reminds me about other work Christoffer was doing (I was adviseor in this project from MySQL AB side) - to improve how Innodb read-ahead works by having more overlap between requests so thread would rarely need to stall unable to perform any quick IO and unable to progress until scheduled read ahead can be completed. Hopefully this also will be looked at one day.

June 25, 2007

Does Slow query log logs all slow queries ?

Posted by peter

One may think one may use MySQL Slow query log to log all slow queries to catch problematic queries or for audit purposes. In fact however not all the queries are logged. I already mentioned mysql slave queries are not logged to slow query log and it looks like I was wrong connecting it just with replication.

In reality it is not the fact the thread is replication thread causes queries to be omitted from slow query log but the fact thread uses SET TIMESTAMP functionality. If you do this within normal connection result would be the same.

Why is this happening ? I guess it happens because of the way the code is structured. During query start current timestamp is stored to special value which will be used for the query execution by all NOW() calls internal timestamp assignments etc. At certain point someone thought it is nice idea to be able to go back to future or to the past and SET TIMESTAMP was implemented, which instead of patching bunch of function simply writes different timestamp to this variable. When query execution is completed, normally another time() is used to get current time and time difference between start and end time is used to check if query should be logged as slow query. This obviously does not work if start time can be in the future or in the past so slow query logging had to be disabled for such queries.

What does it mean ? Simply what you can’t really trust standard slow query log - some queries may be omitted from it. Also if someone would like to hide complex queries from you it can be easily done by using SET TIMESTAMP statement before running the query.

June 18, 2007

Using CHAR keys for joins, how much is the overhead ?

Posted by peter

I prefer to use Integers for joins whenever possible and today I worked with client which used character keys, in my opinion without a big need. I told them this is suboptimal but was challenged with rightful question about the difference. I did not know so I decided to benchmark.

The results below are for MySQL 5.1.18 using MyISAM and Innodb tables. This time unlike other benchmarks I decided to do Join not on primary key and have query to read data for both tables. If the query would be index covering I would expect us to see different ratio. The query I use here is constructed to stress out join code while avoid sending data to the client Do not try to find any good meaning for query or schema. For joins which fetch just few rows difference is likely to be less as the join code itself is likely to be responsible for less portion of response time.

[read more...]

June 16, 2007

MySQL 5.1 Command line client improvements

Posted by peter

Just found this little handy feature today:

SQL:
  1. mysql> INSERT INTO c SELECT rand()*1000, sha1(rand()) FROM c;
  2. Query aborted BY Ctrl+C
  3. ERROR 1317 (70100): Query execution was interrupted

So now if you press CTRL-C MySQL Command Line Client will not exit but will terminate query being executed. This is great as so many queries assumed CTRL-C would also abort running query in previous MySQL versions and I've seen many monstrous queries left running without noticing just because people would think CTRL-C terminated the query. Now it finally really does.

Great Job! I wish I would see more of such relatively simple but very handy improvements :)

Innodb Second Start prevention bug ?

Posted by peter

I filed the following bug today. Basically it looks like in certain conditions Innodb may be started twice in the parallel on the same data files which can hurt database badly. It is true it does not seems to be 100% repeatable but I've seen it multiple times in different environments. I'm not sure if this is some platform which has issues with locking, Some race condition which allows to copies to start or anything else.

In any case you'we been warned - be careful and ensure MySQL is not already started when you're thinking to start one. Now when
MySQL starts to listen to the socket after Innodb Recovery is completed it is very easy to make a mistake and think MySQL is not running at all when recovery is in progress.

June 15, 2007

Commodity Hardware, Commodity Software and Commodity People

Posted by peter

In the previous post I mentioned not all architectures and solutions work for Commodity People, and people seems to agree with me.
Number of vendors would claim they are in Commodity Software or Hardware business but few would probably mention they are doing it for Commodity People, because few people would like to be called commodity - each of us would like to rightfully think he is special and unique.

Thinking more about the topic I think being "Commodity People Friendly" is one of important properties for commodity products. Look for example at Dell HP or Whitebox x86 servers, they are not only cheaper but they are also easier to use than Mini Computer systems from IBM. Directly attached storage is more simple to use than SAN, MySQL is more simple to use than Oracle or DB2, PHP is more simple than Java.

Even for the same Vendors you can find commodity products are designed to use by commodity people - they tend to be more failsafe and easy to use than higher end ones. Look for example at LinkSys home routers, by CISCO and compare it to traditional IOS based series. HP is another example Vendor being on commodity and high end markets.

I believe at large extent MySQL gained its popularity due to these properties of being simple to use and forgiving to the errors. Look at early MySQL versions - if you insert too long data it would just strip it but continue working, you can copy (MyISAM) table while server is running and it would typically work. Even if table is corrupted MySQL still would run just giving you errors sometimes. No transactions means you do not have to deal with deadlocks or learn isolation modes. Perfect solution for absolute beginner.

Now from queries design MySQL would support a lot of functions (which you can always lookup in the manual) but it did not support any complex constructs like subqueries or views so you rarely would scratch your head thinking what is this query you peer has written suppose to do.

Of course you could tell me who cares about "commodity people" - we would hire smartest guy out where who can compute optimal join combination for any 15 way join in their head not to mention understanding all ups and downs of database management system. This sounds nice in theory but there are not so many smart guys out where, and if they are they may be pretty expensive so you can't have many of them in the team. In many MySQL Projects there are not even dedicated MySQL people and Web developers simply use MySQL as they fit suitable and one of them assumes DBA roles and installs MySQL and chases developer if they write too bad queries (in a good case). In a bad case you may find MySQL which just happened to come with OS and queries which no one ever run EXPLAIN for which just happen to work anyway because of very small database size.

Now, for a smaller projects even if you happened to have smart MySQL guy you might not be in a better shape as your business may be at risk if he leaves and no one else is able to understand his smart ways. This can be much worse than having commodity solutions used which may not be optimal but which everyone in the team understand and is able to support if needed.

What scares me in MySQL Development is what is is quickly leaving this Commodity Space in terms of overall feature complexity. MySQL 5.2 will give you many storage engines to play with (many with transactions an some with clustering) with support of partitioning stored procedures views and a lot of stuff. Think how much freedom evil smarty has to design something which would be hard for other people to understand and support. Interesting enough the fact MySQL is Open Source puts it in a worse shape than Oracle and other systems here - typically you get charged more for Advanced features but with MySQL everything is free out where to try so there is no financial barriers stopping you from shooting yourself in the foot.

MySQL Skills also would likely loose portability. If you look at MySQL 4.0 you could simply ask if person knows MySQL or does not, in the new versions it becomes possible for someone to be an expert in one area and familiar with one design approach but not with the other. Someone could be great MySQL Expert but have no experience with MySQL Cluster, other may be good with MySQL Cluster but have no idea how to write storage engines or optimize for Falcon storage engine. Relatively simple 1000 page book which would cover pretty much all of MySQL features in version 4.0 becomes a book shelf for new MySQL versions.

I can't say this is any unusual development. I think it is natural for software to chase features, because this is where customers are leading the product ("implement this and we'll buy it") but this is also why old products become feature overkill, slow and complicated. I'm not in positions to complain though, increasing complication will mean more services will come to us for Consulting Services

Do I see MySQL replaced in its space any time soon ? I do not think so. I do not think we need yet another SQL database because SQL language itself is way to complicated and outdated, plus it is not expressive enough for many modern application needs. I would expect solutions to be developed which operate on more flexible data structures, handle distributed semantics of web better and are expressive in a different form. I know we already had a false start in this area with XML databases but this is not unusual for technology to rethink itself and gain the market from the second attempt.

Some interesting developments we kind of have in this are is of course famous Google BigTable or FaceBook API, I know few other companies have their special "database" interfaces which run on top of MySQL or other databases. Other interesting development is Scalable Blob Streaming. This projects starts with retrieving data from storage engine using HTTP Protocol and I expect it would not be long before other operations would follow.

One thing I was thinking a lot is all these great Storage Engines - do they all really need MySQL to run ? At this point all Open Source storage engines out where are for MySQL but why could not one develop smaller and lighter top part ? The same way as you can run PHP as part of Apache but you can hook it up to bunch of other web servers as well, such as lighttpd.

In fact in this area MySQL Cluster, which was always best isolated of MySQL source leads the pack - there are bunch of interfaces to talk to MySQL cluster as PHP session storage or as a REST Web Service, which are all rather interesting.

One more interesting development I expect we might see is more active use of remote storage services. Amazon has S3 service which deals with files, I bet similar service could be designed for many of data store applications, especially for specific needs when for example large amounts of data need to be analyzed which requires large amount of hardware to offer quick response time, but only for short time frame.

Anyway it is hard to predict the future so it would be fun to watch how things develop.

June 12, 2007

RAID and Scale Out Discussions

Posted by peter

Just found this wonderful summary of articles by Jeremy and wanted to give some of my thoughts on the topic.

First lets speak about death of the RAID. I think this is far from the case especially if you consider Software RAID here.

For many workloads you would like to get RAID just for the sake of BBU. As Jeremy mentioned RAID is cheap these days if you buy right one and can offer substantial improvement for write intense workload by safe write buffering and write merging.

Performance is another story. RAID is usually easiest way to get extra performance from your IO subsystem. Spreading the database among say 10 commodity boxes is often expensive for existing applications, even for new applications it will affect development time and complexity and it is well possible it might be no guys inside the company skilled enough to work with distributed systems.

I think to many considerations are being made thinking about commodity hardware and really smart people, while in most cases tasks have to be solved with commodity hardware and commodity people.

RAID can be indeed slower than direct disks. For example for BoardReader now indexing almost billion of forum posts we decided to go ahead with raw drives for search index because it is far faster than using software RAID. But this is only because Sphinx allows using highly parallel architectures which make it possible. The boxes we use indeed even have RAID available which is used for OS hard drive.

Indeed using RAID for OS partition (software) is what we use even for low end boxes. This has zero cost and offer redundancy in case of hard drive failure.

Now regarding infrastructure. When being called for Consulting I often ask how large system is needs to growth and how much effort people would like to spend on the infrastructure because this defines a lot what type of hardware and in which fashion can be used. You can build infrastructure which would use crappy boxes as redundant array of inexpensive servers, basically moving redundancy one way up from the disk. It however would be rather expensive and can be only justified with extreme hardware needs. In fact even Google as far as I remember uses decent servers with RAID for their MySQL installations.

I surely agree with Jeremy on Commodity does not mean Crapy thing. There is usually a sweet spot when you can get well performing hardware with good price/performance. We use a lot of Dell PowerEdge 2950 with 6 hard drives and 16GB of RAM.
With other vendors I see people using 2*2 Opterons with 32GB of RAM and 10-14 hard drives. Both solutions have comparable price/performance, though former depends on your application quite a lot.

I think MySQL is pretty well supported by matching hardware these days. As it does not use many CPUs and many hard drives very efficiently in many cases so going much higher does not buy you much anyway.

Not all the needs are satisfied that well by vendors. Dell for example for some reason failed to provide cheap 32GB of memory option even with their new Opteron servers. They also do not provide high volume internal storage option. Ie find me 12 or more hot swap SATA hard drives in the case in Dell range, while you can by appropriate cases from Supermicro and a lot of other vendors. I think this is where technical possibilities clash with sales needs - there is more revenue up-selling to external directly attach storage or SAN

MySQL Stored Procedures problems and use practices

Posted by peter

To be honest I'm not a big fan of Stored Procedures, At least not in the form they are currently implemented in MySQL 5.0

Only SQL as a Language Which is ancient ugly for algorithmic programming and slow. It is also forces you to use a lot of foreign constructs to "original" MySQL style - to process data via cursors, handle error via Handlers etc. If you spent last 10 years writing Stored Procedures for Oracle or DB2 it may be cool and convenient for you, but not for me :)

Lack of Debugging I like to be able to debug software, if not full blown debugger I'd like to have things like echo and var_dump. Due to the context of execution these are not easy though. Of course you can code a little helper Debug Storage Procedure which will log some information in MySQL table but it is not convenient enough.

Bad Parser Error Messages MySQL Parser is in general far from perfect when it comes to error handling. "You have an error in your SQL syntax ... near '' at line 1" is not very helpful even when dealing with large queries but for Stored Procedures that is the real issue. So you have to overcome few road blocks even before you start fighting with debugging.

No Profiling tools If you're interested in Performance you need a way to profile what inside stored procedures is taking the time, which is not something readily available. Even if you look at most simple and typically most time consuming part of Stored Procedure execution - running of SQL queries - you do not get these logged, instead slow query log will contain full stored procedure calls.

On the other hand Stored Procedures Indeed can help to Improve MySQL Performance. For DBT2 benchmarks we've tried a while back MySQL 5.0 was about 10% slower than 4.1 without stored procedures but was 20% faster if Stored Procedures are used. And this is of course not the limit. You may also have other reasons to use Stored Procedures besides performance and these can be valid.

What I tend to do if Stored Procedures are helpful for Performance reasons is to have two code versions, one using stored procedures and other doing same thing using direct statements. This allows to debug and profile most of the things comfortably and works pretty well especially if you keep your stored procedures simple so you do not add much bugs converting code from your language of choice to SQL.

There is one more thing you should beware with Stored Procedures is to put a lot of computational load in them. Not only the language is slow at it but you're also loading CPU on your Database Server, which is typically more expensive to scale than Application/Web Servers. It is OK however to do some simple math if it can help you to avoid sending large result set back to the client as in this case server may need to do more work to send it back than Stored Procedure to process it.

Stored Procedures also have number of performance gotchas which I've been running into production but have not taken a time to research into details. So just beware.

I also hope other time MySQL and third parties will develop tools and extensions to target many of the problems mentioned.

June 6, 2007

MySQL 4 to MySQL 5 Upgrade performance regressions

Posted by peter

This week I already had two serious performance regression cases when upgrading from MySQL 4.0 and 4.1 to MySQL 5.0. By serious I mean several times performance difference not just 5-10% you often see for simple queries due to generally fatter code.

The problem in both cases was MySQL 5.0 broken group commit bug.

First I should note I am extremely unhappy how MySQL handled this problem. While working for MySQL we spotted this problem early in MySQL 5.0 release cycle as it was introduced and reported it to everyone we could inside the company - this was over 2 years ago. Few months later I created a bug for this issue to get more public attention to the problem and giving extra motivation to MySQL to fix it. Few months later I blogged about this problem with more performance results but as we can see the bug is still in Verified stage and there is no indication any work is going to have it fixed.

I can agree this may be fundamental issue which is not easy to fix, But why is not it mentioned in MySQL 4.1 to 5.0 upgrade notes ? ?

Furthermore if there were no good ideas how to make XA to work with group commit why would not you keep old working code path if XA is disabled ? Many customers do not flush binary log anyway and use single transactional storage engine so they do not care anyway.

Anyway. Enough complains. We have this problem and we have to live with it, most likely MySQL 5.0 and 5.1 would not get any fixes for this problem, so lets see who is affected, how to check you're affected and how to fix it.

Who is affected ? The good thing is only cheap guys which care about their data are typically affected, meaning you have to have innodb_flush_logs_at_trx_commit=1 so transactions are truly durable. You have to have log-bin enabled to get replication or point in time recovery, but at the same time you should not have hardware RAID or have one without battery backed up cache (BBU unit). I guess this is one of the reason why this bug did not get so much traction inside MySQL - because paying customers would normally have enough money to get BBU unit which is great for performance anyway. Of course you also have to have plenty of concurrent transactions so group commit would trigger in MySQL 4.0 and large number of transactions in total so serializing them would make MySQL unable to keep up. Disks can do 80-150 single page fsyncs per second to get you an idea bout number.

How to spot you're affected This one is interesting. If you have update prevailing load you would see very strange behavior on 5.0 of MySQL being slow but few queries being "inside innodb" and potentially even empty queue. This is because bottleneck happens in commit phase which is not counted as "inside innodb". I wish there would be some stats for number of queries waiting to be committed but there is not one easily readable. You can see it from other symptoms though. You would see queries in "SHOW PROCESSLIST" stuck in "end" stage or have "commit" queries in the processlist for multiple statement transactions. Looking at SHOW INNODB STATUS you would notice large amount of log writes and fsyncs per second which matches your hard drive capacity. Plus you would normally see single outstanding log write all the time. There are other ways you can spot the problem as well but these are probably most obvious and easy to use.

How to fix the problem ? Assuming getting back to MySQL 4.1 is not the option you can do one of 3 things to get some of your performance back. XA support has its overhead anyway so you may not get same performance as with MySQL 4.1

  1. Disable Binary Logging This allows to get group commit back but obviously you loose point in time recovery and replication.
  2. Use innodb_flush_log_at_trx_commit=2 This is probably best solution. In many cases this would be good change to do with MySQL 4 also because 100% durable transactions are not required anyway and it would allow to get some extra speed. It is often left at default value without a good reason.
  3. Get BBU If you can't use any of first two workarounds you better to get battery backed up cache unit and make sure you set your RAID cache policy to "write back". One of the customers I worked with indeed had battery backed up cache on their system... it just was in "write through" cache policy, so was basically disabled. Note getting BBU is often good idea anyway so you can use this together with other workarounds. Also it is worth to mention BBU does not fix the problem but dramatically raises number of update transactions per second needed to trigger the problem. Without BBU 200 per sec may be well enough with BBU you may only see it if you have 2000 update transactions per second or so, which few people reach.

The other way to approach the problem is of course to work on the application - in large amount of cases the problem happens in case there are away to many updates outside of transactions in auto_commit mode. Wrap them in transactions and reduce number of commits if you can. It is great optimization idea anyway.

Also please do not read this post as MySQL 5.0 is junk in terms of performance and you should stay on MySQL 4.1 until MySQL has taken it away from you as already happened with MySQL 4.0. MySQL 5.0 can offer substantial performance improvements in variety of cases as well as has other benefits. This is simply important regression which you better to know about.


This page was found by: "explain process of ...