August 1, 2014

A micro-benchmark of stored routines in MySQL

Ever wondered how fast stored routines are in MySQL? I just ran a quick micro-benchmark to compare the speed of a stored function against a “roughly equivalent” subquery. The idea — and there may be shortcomings that are poisoning the results here, your comments welcome — is to see how fast the SQL procedure code is at doing basically the same thing the subquery code does natively (so to speak).

Before we go further, I want to make sure you know that the queries I’m writing here are deliberately mis-optimized to force a bad execution plan. You should never use IN() subqueries the way I do, at least not in MySQL 5.1 and earlier.

I loaded the World sample database and cooked up this query:

This pretty consistently runs in just about 1/4th of a second. If you look at the abridged explain plan below, you’ll see the query is doing a table scan against the first query, and then executing the subquery for each row:

Now I took the subquery and basically rewrote it as a stored function.

Now the query can be rewritten as this:

If we explain it, we get output similar to the first table shown above, but the further two rows are not shown. The query can’t be optimized to use indexes, and the stored function is opaque to the optimizer. This is why I purposefully wrote the subquery badly in the first query! (If you think of a better way to compare apples and uhm, apples… please comment).

The poorly-optimized-subquery portion of the query essentially happens inside that function now.

And it’s four times slower, consistently, and that’s all I wanted to show here. Thanks for reading.

About Baron Schwartz

Baron is the lead author of High Performance MySQL.
He is a former Percona employee.

Comments

  1. Morgan Tocker says:

    @Baron – I tried testing this 1 year ago with fibonacci:
    http://mtocker.livejournal.com/45222.html

    My test wasn’t apples to apples either, but I found that writing stuff in PHP was about ten times faster. If you add in that MySQL stored procedures don’t have a debugger, it’s a tough sell to write pages and pages of business logic in them.

  2. Josh Davis says:

    The stored routine does a COUNT(), unlike the original query. I think you’d be closer to the way the original query works with something like:

    DELIMITER //
    DROP FUNCTION IF EXISTS speaks_english//
    CREATE FUNCTION speaks_english(c char(3)) returns integer deterministic
    begin
    RETURN (SELECT 1 FROM Country AS co INNER JOIN CountryLanguage AS cl ON cl.CountryCode = co.Code WHERE lower(cl.LANGUAGE) = ‘English’ AND co.Code = c LIMIT 1);
    end//
    DELIMITER ;

    On my machine it’s still 10 times slower than the subquery, though.

  3. Mark Leith says:

    Well, let’s try a proper stored procedure.. :)

    I’ll actually mimic what the statement was trying to do – find the population of countries that speak a given language, use a cursor to get the rows for the countries that speak the language (the inner query), and then loop over it:

    CREATE PROCEDURE speaks_language (IN in_language VARCHAR(30))
    BEGIN

    DECLARE v_countryCode CHAR(3);
    DECLARE v_countrySpeaksEnglish BIGINT DEFAULT 0;
    DECLARE v_sumSpeaksEnglish BIGINT DEFAULT 0;
    DECLARE done INT DEFAULT 0;

    DECLARE countries CURSOR FOR
    SELECT DISTINCT co.Code FROM Country AS co
    INNER JOIN CountryLanguage AS cl ON cl.CountryCode = co.Code
    WHERE cl.Language = in_language;
    DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = 1;

    OPEN countries;

    REPEAT
    FETCH countries INTO v_countryCode;
    IF NOT done THEN
    SELECT sql_no_cache sum(ci.Population) INTO v_countrySpeaksEnglish
    FROM City AS ci WHERE CountryCode = v_countryCode;
    SET v_sumSpeaksEnglish = v_sumSpeaksEnglish + IFNULL(v_countrySpeaksEnglish, 0);
    END IF;
    UNTIL done END REPEAT;

    CLOSE countries;

    SELECT v_sumSpeaksEnglish;

    END//

    Here’s the results on my system, including the original query:

    mysql> CALL speaks_language(‘English’)//
    +——————–+
    | v_sumSpeaksEnglish |
    +——————–+
    | 237134840 |
    +——————–+
    1 row in set (0.04 sec)

    Query OK, 0 rows affected (0.04 sec)

    mysql> CALL speaks_language(‘English’)//
    +——————–+
    | v_sumSpeaksEnglish |
    +——————–+
    | 237134840 |
    +——————–+
    1 row in set (0.04 sec)

    Query OK, 0 rows affected (0.04 sec)

    mysql> CALL speaks_language(‘English’)//
    +——————–+
    | v_sumSpeaksEnglish |
    +——————–+
    | 237134840 |
    +——————–+
    1 row in set (0.05 sec)

    Query OK, 0 rows affected (0.05 sec)

    mysql> SELECT sql_no_cache sum(ci.Population) FROM City AS ci WHERE CountryCode IN ( SELECT DISTINCT co.Code FROM Country AS co INNER JOIN CountryLanguage AS cl ON cl.CountryCode = co.Code WHERE lower(cl.LANGUAGE) = ‘English’);//+——————–+| sum(ci.Population) |+——————–+
    | 237134840 |
    +——————–+
    1 row in set (0.05 sec)

    mysql> SELECT sql_no_cache sum(ci.Population) FROM City AS ci WHERE CountryCode IN ( SELECT DISTINCT co.Code FROM Country AS co INNER JOIN CountryLanguage AS cl ON cl.CountryCode = co.Code WHERE lower(cl.LANGUAGE) = ‘English’);//
    +——————–+
    | sum(ci.Population) |
    +——————–+
    | 237134840 |
    +——————–+
    1 row in set (0.06 sec)

    mysql> SELECT sql_no_cache sum(ci.Population) FROM City AS ci WHERE CountryCode IN ( SELECT DISTINCT co.Code FROM Country AS co INNER JOIN CountryLanguage AS cl ON cl.CountryCode = co.Code WHERE lower(cl.LANGUAGE) = ‘English’);//
    +——————–+
    | sum(ci.Population) |
    +——————–+
    | 237134840 |
    +——————–+
    1 row in set (0.05 sec)

  4. Mark Leith says:

    I should note of course, that I understand that this flips how the statements are run, but it *is* an apples to apples comparison, as it achieves exactly what the statement wanted to achieve, yet does it with different logic (as SPs are intended). :)

  5. Mark, thanks. I kind of created this to test a specific client scenario, so I was really focused on the cost of a stored function as a WHERE filter (as opposed to some other options that would roughly approximate the subquery). I am not sure the client could use a SP in its entirety. But I appreciate your rewrite, and I’m guessing the client does too!

  6. Tim Little says:

    It occurs to me (and everyone please feel free to correct me without mercy…

    But doesn’t the other_table retrieval occur just once per query :
    SELECT some_stuff FROM some_table
    WHERE a_field IN
    ( SELECT other_stuff FROM other_table WHERE logic_goes_here )

    While my contrasting concern is that the WHERE speaks_english is issued once per row, isn’t it? as :
    SELECT sql_no_cache sum(ci.Population) FROM City AS ci WHERE speaks_english(CountryCode)> 0;

    Am I missing something, am I correct?

  7. Tim, actually that’s not correct, MySQL 5.1 and earlier don’t execute subqueries from the inside out as you expect.

Speak Your Mind

*