July 28, 2014

Optimizing repeated subexpressions in MySQL

How smart is the MySQL optimizer? If it sees an expression repeated many times, does it realize they’re all the same and not calculate the result for each of them? I had a specific case where I needed to find out for sure, so I made a little benchmark. The query looks something like this:

High-Performance Click Analysis with MySQL

We have a lot of customers who do click analysis, site analytics, search engine marketing, online advertising, user behavior analysis, and many similar types of work.  The first thing these have in common is that they’re generally some kind of loggable event. The next characteristic of a lot of these systems (real or planned) is […]

How expensive is a WHERE clause in MySQL?

This is a fun question I’ve been wanting to test for some time.  How much overhead does a trivial WHERE clause add to a MySQL query?  To find out, I set my InnoDB buffer pool to 256MB and created a table that’s large enough to test, but small enough to fit wholly in memory:

How adding another table to JOIN can improve performance ?

JOINs are expensive and it most typical the fewer tables (for the same database) you join the better performance you will get. As for any rules there are however exceptions The one I’m speaking about comes from the issue with MySQL optimizer stopping using further index key parts as soon as there is a range […]

The MySQL optimizer, the OS cache, and sequential versus random I/O

In my post on estimating query completion time, I wrote about how I measured the performance on a join between a few tables in a typical star schema data warehousing scenario. In short, a query that could take several days to run with one join order takes an hour with another, and the optimizer chose […]

How to estimate query completion time in MySQL

Have you ever run a query in MySQL and wondered how long it’ll take to complete? Many people have had this experience. It’s not a big deal until the query has been running for an hour. Or a day and a half. Just when IS that query going to finish, anyway? There are actually a […]

MySQL 6.0 vs 5.1 in TPC-H queries

Last week I played with queries from TPC-H benchmarks, particularly comparing MySQL 6.0.4-alpha with 5.1. MySQL 6.0 is interesting here, as there is a lot of new changes in optimizer, which should affect execution plan of TPC-H queries. In reality only two queries (from 22) have significantly better execution time (about them in next posts), […]

MySQL Performance – eliminating ORDER BY function

One of the first rules you would learn about MySQL Performance Optimization is to avoid using functions when comparing constants or order by. Ie use indexed_col=N is good. function(indexed_col)=N is bad because MySQL Typically will be unable to use index on the column even if function is very simple such as arithmetic operation. Same can […]

UNION vs UNION ALL Performance

When I was comparing performance of UNION vs MySQL 5.0 index merge algorithm Sinisa pointed out I should be using UNION ALL instead of simple UNION in my benchmarks, and he was right. Numbers would be different but it should not change general point of having optimization of moving LIMIT inside of union clause being […]

Possible optimization for sort_merge and UNION ORDER BY LIMIT

Every so often you need to perform sort results retrieved from MySQL when your WHERE clause goes beyound col=const values which would allow MySQL to still use second portion of the index for the order by. Ranges as well as IN lists make this optimization impossible, not even speaking about index merge optimization. Lets look […]