June 29, 2009

A few more ideas for InnoDB features

Posted by Vadim

As you can see, MySQL is doing great with InnoDB performance improvements, so we decided to concentrate more on additional InnoDB features that will make a difference.

Besides the ideas I put forward before at http://www.mysqlperformanceblog.com/2009/03/30/my-hot-list-for-next-innodb-features/ (one of them, moving InnoDB tables between servers, is currently under development), we have a few more:

- Stick some InnoDB tables / indexes in the buffer pool, or set priorities for InnoDB tables. That means tables with a higher priority will have a better chance of staying in the buffer pool than tables with a lower priority. Link to blueprint: https://blueprints.launchpad.net/percona-patches/+spec/lru-priority-patch

- Separate the LRU list into several lists. This would let us emulate several buffer pools, with the ability to keep different tables in different buffer pools, and would also decrease contention on the buffer pool. Link: https://blueprints.launchpad.net/percona-patches/+spec/multiple-lru-patch

- We are looking to include Waffle Grid in XtraDB releases, with some additional features like caching the buffer pool on SSD.
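
To make the second idea more concrete, here is a toy sketch of a cache split into several independent LRU lists, with tables assigned to lists. It is only an illustration of the concept: the class name, method names and capacities are hypothetical, and it says nothing about how InnoDB or the proposed patches would actually implement this.

```python
from collections import OrderedDict


class PartitionedLRU:
    """Toy page cache split into several independent LRU lists.

    Hypothetical illustration only; not InnoDB's real data structures.
    """

    def __init__(self, capacities):
        # One OrderedDict per LRU list; insertion order doubles as recency order.
        self.capacities = list(capacities)
        self.lists = [OrderedDict() for _ in self.capacities]
        self.table_to_list = {}  # explicit table -> list assignment

    def assign(self, table, list_no):
        """Bind a table to one LRU list (stand-in for a per-table setting)."""
        self.table_to_list[table] = list_no

    def access(self, table, page_no):
        """Touch a page; return True on a hit, False on a miss."""
        n = self.table_to_list.get(table, 0)  # unassigned tables use list 0
        lru = self.lists[n]
        key = (table, page_no)
        if key in lru:
            lru.move_to_end(key)  # mark as most recently used
            return True
        lru[key] = None
        if len(lru) > self.capacities[n]:
            lru.popitem(last=False)  # evict the least recently used page
        return False


pool = PartitionedLRU(capacities=[2, 2])
pool.assign("hot_table", 1)

pool.access("hot_table", 1)  # first touch is a miss and loads the page
# A scan over another table churns list 0 only.
for page in range(100):
    pool.access("big_log_table", page)

print(pool.access("hot_table", 1))  # True: the scan did not evict the hot page
```

Because each list evicts on its own, a large scan of one table cannot push out pages that live in another list, which is the "several buffer pools" effect described above.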

If these ideas are interesting to you and you want to support them, contact us.

March 25, 2009

Adjusting Innodb for Memory resident workload

Posted by peter

As larger and larger amounts of memory become common (512GB is something you can fit into a relatively commodity server these days), many customers choose to build their applications so that all or most of their database (frequently Innodb) fits into memory.

If all tables fit in the Innodb buffer pool, read performance will be quite good; however, writes will still suffer because Innodb does a lot of random IO during fuzzy checkpoint operations, which often becomes the bottleneck. This problem makes some customers who are not concerned with persistence run Innodb off a RAM drive
[read more...]

December 22, 2008

High-Performance Click Analysis with MySQL

Posted by Baron Schwartz

We have a lot of customers who do click analysis, site analytics, search engine marketing, online advertising, user behavior analysis, and many similar types of work.  The first thing these have in common is that they’re generally some kind of loggable event.

The next characteristic of a lot of these systems (real or planned) is the desire for “real-time” analysis.  Our customers often want their systems to provide the freshest data to their own clients, with no delays.

Finally, the analysis is usually multi-dimensional.  The typical user wants to be able to generate summaries and reports in many different ways on demand, often to support the functionality of the application as well as to provide reports to their clients.  Clicks by day, by customer, top ads by clicks, top ads by click-through ratio, and so on for dozens of different types of slicing and dicing.

And as a result, one of the most common questions we hear is how to build high-performance systems to do this work. Let’s see some ways you can build the functionality you need and get the performance you need. Because I’ve built two such systems to manage online ads through Google Adwords, Yahoo, MSN and others, it’s easy and familiar for me to use the example of search engine marketing. I’ll do that throughout this article.

[read more...]

December 14, 2008

SHOW OPEN TABLES – what is in your table cache

Posted by peter

One command that few people realize exists is SHOW OPEN TABLES. It allows you to examine which tables you have open right now:

SQL:
  mysql> SHOW OPEN TABLES FROM test;
  +----------+-------+--------+-------------+
  | Database | Table | In_use | Name_locked |
  +----------+-------+--------+-------------+
  | test     | a     |      3 |           0 |
  +----------+-------+--------+-------------+
  1 row in set (0.00 sec)

[read more...]

November 10, 2008

Thoughts on Innodb Incremental Backups

Posted by peter

For normal Innodb "hot" backups we use LVM or other snapshot-based technologies with pretty good success. However, incremental backups remain a problem.

First, why do you need incremental backups at all? Why not just take full backups daily? The answer is space: if you want to keep several generations to be able to restore to, having a huge number of full copies of a large database is not efficient, especially if it only changes by a couple of percent per day.

The solution MySQL offers, using the binary log, works in theory but is not very useful in practice because it may take way too long to catch up using the binary log. Even if you have very light updates and can replay a full day of updates within an hour, it will take over 24 hours to cover a month's worth of binary logs... and quite typically you would have much higher update traffic.
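
The arithmetic behind that estimate can be written out as a small sketch. The one-hour-per-day figure comes from the text; the 30-day month and the heavier six-hour workload are assumptions added purely for illustration.

```python
def replay_hours(days_of_logs, replay_hours_per_day):
    """Hours needed to replay `days_of_logs` days of binary logs."""
    return days_of_logs * replay_hours_per_day


# From the text: one day of light updates replays in about one hour.
# Assumption: a "month" of binary logs means 30 days.
light = replay_hours(30, 1.0)
print(light)  # 30.0 hours, already more than a full day

# Hypothetical heavier workload: one day of updates takes 6 hours to replay.
heavy = replay_hours(30, 6.0)
print(heavy / 24)  # 7.5 days
```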
[read more...]

November 6, 2008

Living with backups

Posted by Maciej Dobrzanski

Everyone does backups. Usually it’s some nightly batch job that just dumps all MySQL tables into a text file or simply copies the binary files from the data directory to a safe location. Obviously both ways involve much more complex operations than my last sentence suggests, but that is not important right now. Either way the data is out and ready to save someone’s life (or job at least). Unfortunately, taking a backup does not come free of cost. On the contrary, it’s more like running very heavy queries against each table in the database when mysqldump is used, or reading a lot of data when copying physical files, so the price may actually be rather high. And the more effectively the server resources are utilized, the more that becomes a problem.
[read more...]

September 8, 2008

Development plans

Posted by Vadim

We gathered our ideas for MySQL improvements on this page: http://www.percona.com/percona-lab/dev-plan.html
and we are going to implement some of them.
My favorite one is making InnoDB .ibd files (those created with --innodb-file-per-table=1) movable from one server to another; however, it is rather challenging.
Probably the next patch we want to integrate is Google's smp-fix or Yasufumi's rw-locks (we are going to test both first).

On this page http://www.percona.com/percona-lab.html you can find links to our current binaries and patches.

July 20, 2008

Missing Data – rows used to generate result set

Posted by peter

As Baron writes, it is not the number of rows returned by the query but the number of rows accessed by the query that will most likely define query performance. Of course, not all row accesses are created equal (for example, row accesses in a full table scan may be much faster than row accesses through random index lookups in the same table), but this is still a very valuable data point for optimizing a query.
[read more...]

June 23, 2008

Neat tricks for the MySQL command-line pager

Posted by Baron Schwartz

How many of you use the mysql command-line client?  And did you know about the pager command you can give it?  It's pretty useful.  It tells mysql to pipe the output of your commands through the specified program before displaying it to you.

Here's the most basic thing I can think of to do with it: use it as a pager.  (It's scary how predictable I am sometimes, isn't it?)

[read more...]

April 18, 2008

Idea: Couple of more string types

Posted by peter

MySQL has a lot of string data types - CHAR, VARCHAR, BLOB, TEXT, ENUM and a bunch of variants such as VARBINARY - but I think it is not enough :)

I would also like to see a HEXCHAR type that could efficiently store hex strings, such as those returned by MD5() and SHA1(). With a little modification it could work for UUID() as well (it adds some dashes). Currently it is quite inconvenient to deal with strings like that in MySQL. Either you store them as strings and waste space, or you store them as binary and then either live with unreadable strings in the table or add UNHEX() everywhere, which also adds overhead.
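
A quick way to see the space difference is to compare the hex text with its packed binary form. The sketch below uses Python's hashlib and uuid modules as stand-ins for MD5(), SHA1(), UUID() and UNHEX(); it only illustrates the sizes involved and is not a proposal for how a HEXCHAR type would be implemented.

```python
import hashlib
import uuid

# Hex text, as MD5() would return it, versus the packed bytes UNHEX() would give.
md5_hex = hashlib.md5(b"hello").hexdigest()
md5_raw = bytes.fromhex(md5_hex)
print(len(md5_hex), len(md5_raw))  # 32 16

sha1_hex = hashlib.sha1(b"hello").hexdigest()
sha1_raw = bytes.fromhex(sha1_hex)
print(len(sha1_hex), len(sha1_raw))  # 40 20

# A UUID string carries four dashes on top of 32 hex digits.
# uuid4() is used here only as a stand-in with the same text layout as UUID().
uuid_text = str(uuid.uuid4())
uuid_raw = bytes.fromhex(uuid_text.replace("-", ""))
print(len(uuid_text), len(uuid_raw))  # 36 16

# Packing is lossless: the hex digits can always be recovered.
print(uuid_raw.hex() == uuid_text.replace("-", ""))  # True
```

In each case the packed form takes half the space of the hex digits, at the price of converting on every read and write.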
[read more...]