September 30, 2009

Admin notice: comments again broken

Posted by Vadim

(This note should not go to PlanetMySQL; sorry if it does.)
We are again having a problem with comments on blog posts. It is a technical problem: we upgraded some components, and the upgrade seems to have affected the captcha. We are working on a fix.
We are sorry for the inconvenience, and thank you for the reports about the problem.

September 29, 2009

Quick comparison of MyISAM, Infobright, and MonetDB

Posted by Baron Schwartz

Recently I was doing a little work for a client who has MyISAM tables with many columns (the same one Peter wrote about recently). The client’s performance is suffering in part because of the number of columns, which is over 200. The queries are generally pretty simple (sums of columns), but they’re ad-hoc (can access any columns) and it seems tailor-made for a column-oriented database.

I decided it was time to actually give Infobright a try. They have an open-source community edition, which is crippled but not enough to matter for this test. The “Knowledge Grid” architecture seems ideal for the types of queries the client runs. But hey, why not also try MonetDB, another open-source column-oriented database I’ve been meaning to take a look at?

[read more...]

September 28, 2009

How number of columns affects performance ?

Posted by peter

It is well understood that tables with long rows tend to be slower than tables with short rows. I wanted to check whether row length is the only thing that matters, or whether the number of columns we have to work with also plays an important role. I was interested in peak row processing speed, so I looked at a full table scan in the case where the data fits completely in the OS cache. I created three tables: the first contains a single tinyint column, which is almost the shortest type possible (CHAR(0) could take less space); the second has one tinyint column and one char(99) column; and the third has 100 tinyint columns. The latter two tables have the same row length, but the number of columns differs by a factor of 50. Finally, I created a fourth table, which also has 100 columns, but one of them is a VARCHAR, which causes the row format to be dynamic.

[read more...]

Why InnoDB index cardinality varies strangely

Posted by Baron Schwartz

This is a very old draft, from early 2007 in fact. At that time I started to look into something interesting with the index cardinality statistics reported by InnoDB tables. The cardinality varies because it's derived from estimates, and I know a decent amount about that. The interesting thing I wanted to look into was why the cardinality varies in a particular pattern.

Here I'll grab a bunch of cardinality estimates from sakila.film on MySQL 5.0.45 and put them into a file:

CODE:
  baron@kanga:~$ while true; do mysql sakila -N -e 'show index from film' | head -n 2 | tail -n 1 | awk '{print $7}'; done > sizes

After a while I cancel it and then sort and aggregate them with counts:

CODE:
  baron@kanga:~$ sort sizes | uniq -c
  157 1022
  156 1024
  156 1058
  156 1059
  156 1131
  313 951
  312 952
  312 953

Look at the distribution of the counts. The weighted average of these is 1000.53, so it's close to the truth (1000 rows). But five of the eight distinct estimates appear only about half as often as the other three; it looks like the random choice of which statistic to use is not evenly distributed.
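As a rough check of that weighted average, here is a minimal sketch in shell. The `weighted_mean` function is a hypothetical helper, not something from the original session; it assumes input in the "count value" layout that `uniq -c` prints, and the here-document simply repeats the counts listed above.

```shell
# Hypothetical helper (not part of the original session): reads
# "count value" pairs, as printed by `uniq -c`, from standard input
# and prints the count-weighted mean with two decimal places.
weighted_mean() {
  awk '{ n += $1; s += $1 * $2 }
       END { if (n > 0) printf "%.2f\n", s / n }'
}

# The counts and cardinality estimates listed above.
weighted_mean <<'EOF'
157 1022
156 1024
156 1058
156 1059
156 1131
313 951
312 952
312 953
EOF
```

With a live capture, the same function could be fed directly by `sort sizes | uniq -c | weighted_mean`.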

I mentioned this to Heikki and he pondered it for a bit -- but neither of us really figured out what was going on. I know the code superficially, but not as well as he or Yasufumi or others do; and I was not able to find a cause.

More recently I saw that I'm not the only one who has noticed oddities in the random number generation. I waited. And indeed, the fixes for that bug seem to have fixed the skew in the statistics. Case solved, and all I had to do was wait. Truly, laziness is a virtue.

September 25, 2009

InnoDB/XtraDB Training in New York City!

Posted by Morgan Tocker

Our Santa Clara/San Francisco training went great - 100% of survey respondents said they would recommend the same course to a friend.  I'm pleased to announce that such an opportunity exists - our next training location will be New York City on October 30, 2009.

We've booked a training venue in the financial district of Manhattan (90 Broad Street New York, NY 10004), and it seems like a great opportunity to switch from using hotels to teaching in real classrooms.  This means that every student will have a (Linux) computer provided, and the instructor will have a whiteboard to be able to scribble.

Some other changes:

  • The start time will now be 9:30 AM. Thanks to those who gave feedback - this seemed to be one of the biggest concerns. The finish time will be around 5 PM.
  • With the class size going down, we have a little freedom to tweak our format.  One of those changes is that we plan to increase time for exercises in class just a little bit more.

Tickets are $450/student, and we're continuing our early bird special (a copy of High Performance MySQL 2nd Ed).  Interested in signing up?  Registration is open!

September 24, 2009

Speaking at Highload.ru

Posted by Morgan Tocker

This is a quick announcement to say that I'll be speaking at HighLoad++ this year (October 12-14 in Moscow).  I'll be presenting on a few topics:

  • MySQL Performance Tuning (Conference Session)
  • Quick Wins with Third Party Patches for MySQL (Conference Session)
  • Performance Optimization for MySQL with InnoDB and XtraDB * (Full day class)

This will mark my first trip to Russia - and oh boy am I excited.  I'm taking a few days of vacation afterward so I can tour around Saint Petersburg.  Want to say hello?  Let me know at morgan-at-percona-dot-com!

* Yes, this is the same as our InnoDB course we taught last week in Santa Clara and San Francisco.  More venues are coming in the next couple of days - wait for another blog post!

September 20, 2009

Guidance for MySQL Optimizer Developers

Posted by peter

I spend a large portion of my life working on MySQL performance optimization, so the MySQL optimizer is quite important to me. For probably the last 10 years I have chased first Monty and later Igor with optimizer complaints and suggestions. Here are some general ideas which I think can help make the optimizer in MySQL, MariaDB or Drizzle better.
[read more...]

September 19, 2009

Multi Column indexes vs Index Merge

Posted by peter

A mistake I commonly see among MySQL users is in how indexes are created. Quite commonly, people just index individual columns as they are referenced in the WHERE clause, thinking this is the optimal indexing strategy. For example, given a condition like AGE=18 AND STATE='CA', they would create two separate indexes on the AGE and STATE columns.

The better strategy is often to have a combined multi-column index on (AGE,STATE). Let's see why this is the case.
[read more...]

September 16, 2009

How to generate per-database traffic statistics using mk-query-digest

Posted by Ryan Lowe

We often encounter customers who have partitioned their applications among a number of databases within the same instance of MySQL (think application service providers who have a separate database per customer organization ... or wordpress-mu type of apps). For example, take the following single MySQL instance with multiple (identical) databases:
[read more...]

September 15, 2009

Which adaptive should we use?

Posted by Yasufumi

As you may know, InnoDB has two limits on unflushed modified blocks in the buffer pool. One comes from the physical size of the buffer pool. The other is the age of the oldest modified block, which is bounded by the capacity of the transaction log files.

Under a heavy update workload, the modification ages of many blocks are clustered together. To reduce the maximum modification age, InnoDB then needs to flush many of those blocks in a short time if they have not been flushed at all. The resulting flushing storm seriously affects performance.

We suggested the "adaptive_checkpoint" option, which flushes constantly, to avoid such a flushing storm. And finally, the newest InnoDB Plugin 1.0.4 has a similar new option, "adaptive_flushing", built in.

Let's check the adaptive flushing options in this post.
[read more...]