April 23, 2014

Performance gotcha of MySQL memory tables

One performance gotcha with MEMORY tables you might know about comes from the fact it is the only MySQL storage engine which defaults to HASH index type by default, instead of BTREE which makes indexes unusable for prefix matches or range lookups. This is however not performance gotcha I’m going to write about. There is one more thing you should be aware which again comes from the fact MEMORY tables use HASH indexes by default.

I’ve created rather similar test table:

and populated it with 1.000.000 rows ALL of them having same value for c column.

Now I’m performing random deletes by primary key (DELETE FROM test WHERE id=N) and get just about 80 deletes per second while converting table to MyISAM I get about 600 deletes per second and about 4000 deletes per second for Innodb (with log flush at transaction commit disabled)

The results might sound to be to bad to be true, but there is a reason. HASH index stores list of matching values for each hash value. In this case the key value is the same for all rows so they end up as very long value list for single hash value.

When you perform key lookup it is not the problem as you normally would need to fetch all values anyway. It however becomes a huge problem when delete (or key update) operation needs to be performed. In this case the value to be deleted needs to be removed from the list which requires list to be scanned.

This problem only happens if you have many values for single key value, if you do not, it is not the problem as list traversed will not be very long. Another thing you may find unusual is – the key cardinality affects performance even when this key is not directly used for lookup.

Changing the key c to be BTREE we instantly get about 15000 of deletes per second which is quite a gain.

You may also be surprised to see MyISAM being significantly slower than Innodb – in this case the table is small and fully fits in Innodb Buffer Pool and log flush on transaction commit was also disabled, meaning Innodb could delay writes and do all the changes directly in memory. MyISAM however had to perform number of random writes to the key file and data file. They of course were buffered by operation system, but it can’t do it as efficiently as Innodb does.

About Peter Zaitsev

Peter managed the High Performance Group within MySQL until 2006, when he founded Percona. Peter has a Master's Degree in Computer Science and is an expert in database kernels, computer hardware, and application scaling.

Comments

  1. Coway says:

    Peter,

    There is one problem in your test: the data is not representative. The cardinality for column ‘c’ is 1. Thus, there are too many duplicates for the same hash value. Under this situation, B-tree is much better than hash.

    If you randomly generate values for c, and increase the selectivity for column ‘c’, for the search with equal condition, hash should have better performance.

  2. peter says:

    Coway,

    There is nothing as generally representative case. In this case I was to show the particular behavior of HASH indexes with a lot of rows matching the same value. In other cases it depends differently of course.

    Note in this case it is not the search which is the problem but delete which is very expensive…

  3. Jake Drew says:

    I would love to know your thoughts on the performance of a Engine=Memory table with BTREE used for lookups by primary key when compared to the same values in a C# Dictionary? Do you expect that the performance would be worse, similar, better?

Speak Your Mind

*