April 18, 2014

Optimizing slow web pages with mk-query-digest

I don’t use many tools in my consulting practice, but the ones I do use, I try to know as well as I can. I’ve been using mk-query-digest for almost as long as it has existed, yet it continues to surprise me in ways I couldn’t have imagined. This time I’d like to share a quick tip on how mk-query-digest lets you slice your data in a completely different way than it does by default.

Disclaimer: this only works when persistent connections or connection pools aren’t used, and it is only accurate when a single MySQL connection is used during the execution of a request.

If you are seeking to reduce the load on the database server and [as a result] improve response time for some random user requests, you are usually interested in the queries that consume the most MySQL time, and that’s how mk-query-digest groups and orders data by default. Fixing the top 10 queries on the list will indeed most likely reduce the load and improve response time for some requests. But what if some pages are still slow to load because of time spent in the database, and you either can’t or don’t want to profile or debug the application to figure out what’s happening under the hood?

That sounds like something I was working on today – I had a slow query log (captured with long_query_time=0 and all the eXtra benefits of the Percona slow query log patch), and I knew some particular pages were taking minutes to load – that’s exactly what the customer asked me to focus on. So instead of using mk-query-digest to list the top slowest queries, I asked it to list the top slowest sessions:
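The original command didn’t survive here, but based on mk-query-digest’s `--group-by` and `--order-by` options it would have looked something like this sketch (the log filename is a placeholder; grouping by Thread_id assumes the slow log records that attribute, which the Percona slow query log patch provides):

```shell
# Sketch: group events by connection (Thread_id) instead of by query
# fingerprint, and order the resulting "sessions" by total query time.
# slow.log is a placeholder filename.
mk-query-digest --group-by Thread_id --order-by Query_time:sum slow.log
```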

Spot on – the session I needed to focus on was right at the top. And what do you know, 519 queries were run during that session, which took 148 seconds overall:

The stats here are aggregated over all queries, which is great, but I still need to figure out which queries were run. I could use mk-log-player to split out all the sessions, but unfortunately mk-log-player would not retain all the other useful information – not even query timing. Instead, I’ve used mk-query-digest:
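The invocation itself is missing from this copy of the post; a sketch using mk-query-digest’s `--filter` option would be (12345 is a hypothetical Thread_id standing in for the session found above):

```shell
# Sketch: report only on events from the one interesting connection.
# 12345 is a hypothetical Thread_id value.
mk-query-digest --filter '$event->{Thread_id} == 12345' slow.log
```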

Now I know exactly what needs to be fixed first to make the greatest impact on this page’s response time. I can also convert that into a slow query log listing all the queries executed during this session, in the order they were executed:
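Again the command is lost here, but mk-query-digest’s `--no-report` and `--print` options do exactly this – a sketch, with the same hypothetical Thread_id:

```shell
# Sketch: --no-report suppresses the aggregate report, while --print writes
# the matching events back out in slow query log format, in their original
# order. Thread_id 12345 is a hypothetical value.
mk-query-digest --no-report --print \
  --filter '$event->{Thread_id} == 12345' slow.log > session-12345.log
```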

Pretty cool, isn’t it? Sure, it would be even better if mk-query-digest could do a nested group-by and an order-by within each group so I could avoid the extra step, but even better than that would be if it optimized the queries altogether! Unfortunately mk-query-digest won’t do that for you – but then there’s mk-query-advisor ;)

About Aurimas Mikalauskas

Aurimas joined Percona in 2006, a few months after Peter and Vadim founded the company. His primary focus is on high performance, but he also specializes in full text search, high availability, content caching techniques and MySQL data recovery.

Comments

  1. Aurimas Mikalauskas says:

    Mike, I just looked at the source files (the src.rpm) and they indeed contain separate patches that you can apply. The patch you want here is microslow_innodb.patch. I would, however, recommend using a more recent version of MySQL; in fact, without modifications, the patch available in the mysql-5.0.92 src.rpm file will fail at two hunks against mysql-5.0.77, while on vanilla mysql-5.0.92 it applies cleanly. Mind that if you want to apply a few patches (which I highly recommend if you’re using InnoDB, especially the IO patch, split_buf_pool_mutex and the recovery patch), you must apply them in a certain order, which you can find in the percona-sql-50.spec file. Hope this helps.

  2. Mike Todd says:

    Thanks for the reply Aurimas. We would need it separately, yes — we are running MySQL 5.0.77, not Percona Server. Our server hosts won’t support Percona Server (even though it’s identical to MySQL for their purposes), so we have to use “vanilla” MySQL. But they would be okay with this patch I think. I found some old documentation that said the patch could be applied to non-Percona MySQL, but I understand if that’s no longer the case.

  3. Aurimas Mikalauskas says:

    Mike, sorry if I have confused you. The functionality you need is available in any version of Percona Server. Some older versions may not have Thread_id yet, but I can’t tell you exactly which versions don’t. If you upgrade to Percona Server 5.0.92-b23, you should definitely have the functionality you need. If you only need the slow query patch, I can take a look to see how it can be extracted, if it’s still available separately.

  4. Mike Todd says:

    Err… slow query log, not sow query log. Accursed fingers! :)

  5. Mike Todd says:

    You mention the sow log query patch which allows for logging in microseconds, but I can only find a version of that from 2006. We are running MySQL 5.0.77 — do you have a version of the patch that’s compatible with that, and can you link me to it if so?

  6. Aurimas Mikalauskas says:

    Thanks, Justin. Indeed you can overcome the shortcomings of the approach I used in that case. The good thing about grouping by thread_id though is that you get a fairly good picture of how long certain page requests are spending in MySQL without the need to modify or even look at the application. If you’re serious about instrumenting your app, then you can’t really rely on that as there’s a lot more than just MySQL calls you’re interested in and this is when Instrumentation-for-PHP becomes irreplaceable.

  7. Justin Swanhart says:

    mqd is awesome. Here is another trick that you can use if you can modify the SQL app:

    You can use a query wrapper in your application that appends SQL comments to your SQL statements. These comments can include the file/function/line that the SQL was called in. If you construct this in a regular way, then you can use mqd’s --embedded-attributes option to parse those comments and aggregate by page or page function.

    My reference instrumentation project (Instrumentation-for-PHP) adds those comments to SQL statements. It also assigns a unique ‘request_id’ to every web request, and places this into the comments too. So you can extract the SQL for a single session, even if connection pooling or a load balancer is used. Finally, it also stuffs the session_id into the SQL as well.

    Both of these are useful using --filter because embedded attributes are processed before filters are applied:

    The query comment consists of key/value pairs:
    SELECT 1;
    -- File: index.php Line: 118 Function: fullCachePage request_id: ABC session_id: XYZ

    mk-query-digest can then filter by web page:

    mk-query-digest /var/lib/mysql/slow --embedded-attributes '^-- [^\n]+','(\w+): ([^\t]+)' --filter '$event->{File} && $event->{File} =~ m/index/'
