September 29, 2009

Quick comparison of MyISAM, Infobright, and MonetDB

Posted by Baron Schwartz |

Recently I was doing a little work for a client who has MyISAM tables with many columns (the same one Peter wrote about recently). The client’s performance is suffering in part because of the number of columns, which is over 200. The queries are generally pretty simple (sums of columns), but they’re ad-hoc (can access any columns) and it seems tailor-made for a column-oriented database.

I decided it was time to actually give Infobright a try. They have an open-source community edition, which is crippled but not enough to matter for this test. The “Knowledge Grid” architecture seems ideal for the types of queries the client runs. But hey, why not also try MonetDB, another open-source column-oriented database I’ve been meaning to take a look at?

[read more...]

September 20, 2009

Guidance for MySQL Optimizer Developers

Posted by peter |

I spend large portion of my life working on MySQL Performance Optimization and so MySQL Optimizer is quite important to me. For probably last 10 years I chased first Monty and later Igor with Optimizer complains and suggestions. Here are some general ideas which I think can help to make optimizer in MySQL, MariaDB or Drizzle better.
[read more...]

September 19, 2009

Multi Column indexes vs Index Merge

Posted by peter |

The mistake I commonly see among MySQL users is how indexes are created. Quite commonly people just index individual columns as they are referenced in where clause thinking this is the optimal indexing strategy. For example if I would have something like AGE=18 AND STATE=’CA’ they would create 2 separate indexes on AGE and STATE columns.

The better strategy is often to have combined multi-column index on (AGE,STATE). Lets see why it is the case.
[read more...]

August 16, 2009

A micro-benchmark of stored routines in MySQL

Posted by Baron Schwartz |

Ever wondered how fast stored routines are in MySQL? I just ran a quick micro-benchmark to compare the speed of a stored function against a “roughly equivalent” subquery. The idea — and there may be shortcomings that are poisoning the results here, your comments welcome — is to see how fast the SQL procedure code is at doing basically the same thing the subquery code does natively (so to speak).

[read more...]

March 5, 2009

What does Using filesort mean in MySQL?

Posted by Baron Schwartz |

If you were interviewing to work at Percona, and I asked you “what does Using filesort mean in EXPLAIN,” what would you say?

I have asked this question in a bunch of interviews so far, with smart people, and not one person has gotten it right. So I consider it to be a bad interview question, and I’m going to put the answer here. If anyone gets it wrong from now on, I know they don’t read this blog!

[read more...]

January 23, 2009

Optimizing repeated subexpressions in MySQL

Posted by Baron Schwartz |

How smart is the MySQL optimizer? If it sees an expression repeated many times, does it realize they’re all the same and not calculate the result for each of them?

I had a specific case where I needed to find out for sure, so I made a little benchmark. The query looks something like this:

[read more...]

December 22, 2008

High-Performance Click Analysis with MySQL

Posted by Baron Schwartz |

We have a lot of customers who do click analysis, site analytics, search engine marketing, online advertising, user behavior analysis, and many similar types of work.  The first thing these have in common is that they’re generally some kind of loggable event.

The next characteristic of a lot of these systems (real or planned) is the desire for “real-time” analysis.  Our customers often want their systems to provide the freshest data to their own clients, with no delays.

Finally, the analysis is usually multi-dimensional.  The typical user wants to be able to generate summaries and reports in many different ways on demand, often to support the functionality of the application as well as to provide reports to their clients.  Clicks by day, by customer, top ads by clicks, top ads by click-through ratio, and so on for dozens of different types of slicing and dicing.

And as a result, one of the most common questions we hear is how to build high-performance systems to do this work. Let’s see some ways you can build the functionality you need and get the performance you need. Because I’ve built two such systems to manage online ads through Google Adwords, Yahoo, MSN and others, it’s easy and familiar for me to use the example of search engine marketing. I’ll do that throughout this article.

[read more...]

October 31, 2008

How expensive is a WHERE clause in MySQL?

Posted by Baron Schwartz |

This is a fun question I’ve been wanting to test for some time.  How much overhead does a trivial WHERE clause add to a MySQL query?  To find out, I set my InnoDB buffer pool to 256MB and created a table that’s large enough to test, but small enough to fit wholly in memory:

[read more...]

August 1, 2008

How adding another table to JOIN can improve performance ?

Posted by peter |

JOINs are expensive and it most typical the fewer tables (for the same database) you join the better performance you will get. As for any rules there are however exceptions :)

The one I’m speaking about comes from the issue with MySQL optimizer stopping using further index key parts as soon as there is a range clause on the previous key part. So if you have INDEX(A,B) and have a where clause A BETWEEN 5 and 10 AND B=6 only the first part (A) of the index will be used which can be seriously affect performance. Of course in this example you can use index (B,A) but there are many similar cases when it is not possible.
[read more...]

April 28, 2008

The MySQL optimizer, the OS cache, and sequential versus random I/O

Posted by Baron Schwartz |

In my post on estimating query completion time, I wrote about how I measured the performance on a join between a few tables in a typical star schema data warehousing scenario.

In short, a query that could take several days to run with one join order takes an hour with another, and the optimizer chose the poorer of the two join orders. Why is one join order so much slower than the other, and why did the optimizer not choose the faster one? That's what this post is about.

Let's start with the MySQL query optimizer. The optimizer tries to choose the best join order based on its cost metric; it tries to estimate the cost for a query, then choose the query plan that has the lowest cost. The unit of cost for the MySQL query optimizer is a single random 4k data page read. In general, it's a pretty good metric, but it has one major weakness: the server doesn't know whether a read will be satisfied from the operating system cache, or whether it'll have to go to disk. (This distinction is abstracted away by the storage engine; the optimizer doesn't know how a given storage engine stores its data).

I'll try to omit the details and keep this clean. Let's take a look at the tables.

SQL:
  1. mysql> SHOW TABLE STATUS LIKE 'fact'\G
  2. *************************** 1. row ***************************
  3. Name: fact
  4. Engine: MyISAM
  5. Rows: 147045493
  6. Avg_row_length: 117
  7. Data_length: 17217646764
  8. Index_length: 11993816064
  9.  
  10. mysql> SHOW TABLE STATUS LIKE 'dim1'\G
  11. *************************** 1. row ***************************
  12. Name: dim1
  13. Engine: MyISAM
  14. Rows: 453193
  15. Avg_row_length: 122
  16. Data_length: 55605116
  17. Index_length: 93812736
  18.  
  19. mysql> SHOW TABLE STATUS LIKE 'dim2'\G
  20. *************************** 1. row ***************************
  21. Name: dim2
  22. Engine: MyISAM
  23. Rows: 811
  24. Avg_row_length: 105
  25. Data_length: 85368
  26. Index_length: 154624

It's a big fact table and two fairly small dimension tables, which is normal. Here is the query:

SQL:
  1. SELECT fact.col1, min(fact.col2) AS min_col2
  2. FROM fact, dim1, dim2
  3. WHERE fact.col4 = dim1.col4
  4. AND dim1.col3 <> 'hello world'
  5. AND dim2.col5 = 1
  6. AND fact.dim2_id = dim2.dim2_id
  7. AND fact.col2> some_const
  8. GROUP BY fact.col1

There are indexes on all the columns in all the ways you'd expect: all the dimension columns are indexed on every table, and there's a separate index on every column in the WHERE clause. Here's the query plan initially.

SQL:
  1. *************************** 1. row ***************************
  2. TABLE: dim1
  3. type: range
  4. key_len: 195
  5. rows: 18790
  6. Extra: USING WHERE; USING TEMPORARY; USING filesort
  7. *************************** 2. row ***************************
  8. TABLE: fact
  9. type: ref
  10. key_len: 4
  11. rows: 606
  12. Extra: USING WHERE
  13. *************************** 3. row ***************************
  14. TABLE: dim2
  15. type: eq_ref
  16. key_len: 2
  17. rows: 1
  18. Extra: USING WHERE

This query will run for days and never complete. No one ever let it finish to see how long it would run.

How do I know it will run for days? Here's my train of thought:

  • It's performing index lookups into the fact table, which is big.
  • An index lookup is a random I/O.
  • A modern disk can do about 100 random I/O's per second, as a rule of thumb.
  • If you do the math with the rows column in EXPLAIN, you realize that this equates to about 18790 * 606 = 11386740 I/O operations, assuming that the indexes are fully in memory.
  • When you divide this by 100 I/O operations per second, and then divide that by 86400 seconds in a day, you get about 2.6 days.

Ouch! That's slow.

Now let's look at the alternative: table-scan the fact table, and do index lookups in the two dimension tables. MySQL doesn't want to choose this join order, so we'll force it with STRAIGHT_JOIN:

SQL:
  1. EXPLAIN SELECT STRAIGHT_JOIN  ....
  2. +-------+-----------+-----------+---------------------------------+
  3. | TABLE | type      | rows      | Extra                           |
  4. +-------+-----------+-----------+---------------------------------+
  5. | fact  | ALL       | 147367284 | USING TEMPORARY; USING filesort |
  6. | dim1  | eq_ref    | 1         | USING WHERE                     |
  7. | dim2  | eq_ref    | 1         | USING WHERE                     |
  8. +-------+-----------+-----------+---------------------------------+

As we saw in the previous post, which I linked at the top of this post, we can scan the fact table in less than an hour. And it turns out that joining to the dimension tables doesn't slow the query perceptibly, because these tables are small and they stay in memory, in the OS cache. (They don't get evicted from memory by the cache's LRU policy, because they are frequently used -- once per row in the fact table. The LRU policy evicts old blocks from the fact table instead, which is perfect -- these blocks are used only once and not needed again, so they can be replaced).

The difference between the two queries -- 55 minutes and 2.6 days -- is basically the difference between scanning data sequentially on disk and random disk I/O.

So now you know why one join order is faster than the other. But why didn't the optimizer know this, too? The optimizer does know that random access is slower than sequential access, but it doesn't know that the dimension tables will stay in memory, and this is an important distinction.

Let's put ourselves into the mindset of the optimizer. We'll assume that every join to the dimension tables will go to disk instead of being read from cache. Now the STRAIGHT_JOIN becomes a table scan of about 313 sequential reads (150 million rows / 117 bytes per row / 4096 bytes per read), plus about 150 million random I/Os for the first dimension table, plus 150 million random I/Os for the second dimension table. That's 300 million random I/O operations.

In contrast, the optimizer chose a plan that it thought would cause only 11.3 million random I/O operations.

The optimizer was being smart, given its lack of knowledge about the OS cache. This is why an expert is sometimes needed to provide the missing information. If the MySQL optimizer were right and each of these had to go to disk, our STRAIGHT_JOIN plan would take more than a month to complete! Good thing we know the difference between cache and disk!