April 25, 2014

Get Me Some Query Logs!

One of my favorite tools in the Percona Toolkit is pt-query-digest.  This tool is indispensable for identifying your top SQL queries, and analyzing which queries are accounting for your database load.

But the report you get from pt-query-digest is only as good as the log of queries you give it as input.  You need a large enough sample of query logs, collected over a period of time when you have representative traffic on your database.

You also need the log to include all the queries, not just those that take more than N seconds.  The reason is that some queries are individually quick, and would not be logged if you set the long_query_time configuration variable to 1 or more seconds.  You want that threshold to be 0 seconds while you’re collecting logs.

However, activating such high-volume query log collection can be costly.  Every statement executed on your  database will cause file I/O, even when the query itself was served out of your buffer pool memory.  That’s a lot of overhead, so we need to be careful about how and when we collect logs, and for how long we leave that running.

I’ve put together a simple shell script to help automate this.  I have given it the functional but unimaginative name full-slow-log.

The script configures full logging, then sleeps for a number of seconds to allow queries to be collected in the logs.  After it finishes sleeping, or if you interrupt the script, the script restores log configuration back to the values they started with.

  • -v is for verbose output.
  • -s seconds allows you to specify the number of seconds to sleep.  The default is 5 seconds, which is probably too short for most sites, but the value is chosen to be as low impact as possible if you forget to give another value.
  • -c config allows you to specify a MySQL config file other than $HOME/.my.cnf, so you can store host, user, and password.

Here’s an example of running it with verbose output:

Notice that the script also redirects the slow query log to a new file, with a filename based on the timestamp.  This is so you have a distinct file that contains only the specific time range of logs you collected.

The restoration of settings is in a “trap” which is a shell scripting feature that serves as both an exit handler and signal handler.  So if you interrupt the script before it’s done, you have some assurance that it will do the right thing to restore settings anyway.

My full-slow-log script is now available on a Github project (along with a few other experimental scripts I have written).  See https://github.com/billkarwin/bk-tools

I hope you find this script a useful complement to my upcoming talks at the Percona Live  MySQL Conference in London, UK on December 3-4 2012:

If you can make it to London in December, we’d love to see you there!  If not, look for future Percona Live conferences.

Tweet the link to this blog for a chance to win a free full conference pass. Make sure to use hashtag #perconalive! Winner will be chosen at the end of the day. 

About Bill Karwin

Bill Karwin has been a software professional for over 20 years. He's helped thousands of developers with SQL technology. Bill authored the book "SQL Antipatterns," collecting frequent blunders and showing better solutions.

Comments

  1. Joffrey says:

    Hi Bill,

    thanks for an automation tool to make this task easier !

    In the meantime, I’m wondering why you do use FLUSH LOGS, which also flushes the binary logs by example, while disabling/enabling the slow query log would have the wanted effect (close/open the log file), with less side consequences.
    Since 5.5.3, MySQL also supports “FLUSH SLOW LOGS” (or mysqladmin flush-slow-log)

    Joffrey

  2. lefred says:

    +1 for Joffrey’s comment, and what could worry me a bit more is that it also causes InnoDB to flush its logs to disk and perform a checkpoint.

    But this is a nice script I will often use ;)

    Thanks Bill !

    For info on Percona Server I use these settings to populate slow query log:

    set global log_slow_sp_statements=1;
    set global log_slow_verbosity=”full”;
    set global long_query_time=0;
    set global slow_query_log_timestamp_always=1;
    set global slow_query_log_timestamp_precision=”microsecond”;
    set global slow_query_log_use_global_control=”all”;

  3. Thanks for the suggestions Joffrey and leFred! I’ve added issues in my github project and I’ll get around to making those changes in the future.

  4. JohnShep says:

    tried it but got error below, any clues appreciated :-)

    ./full-slow-log -v
    Discovering slow_query_log=1
    Discovering slow_query_log_file=/var/log/mysqld_slow.log
    Discovering long_query_time=1.000000
    Setting long_query_time=0
    ERROR 1231 (42000) at line 1: Variable ‘slow_query_log_file’ can’t be set to the value of ‘/var/log/mysqld_slow.log-full-20121123202725′
    Error occurred while trying: Setting slow_query_log_file=/var/log/mysqld_slow.log-full-20121123202725

  5. JohnShep, what version of MySQL are you using?

    The error you describe resembles this bug report:
    http://bugs.mysql.com/bug.php?id=32748

    The fix is reportedly in MySQL 5.1.25.

  6. Frieder says:

    Finally a script for it – thanks! :)

  7. JohnShep says:

    Hi Bill,
    thanks, I am using – Server version: 5.5.28-MariaDB-mariadb1~hardy-log
    John

  8. Joseph Drozdik says:

    Noob question: Isn’t slow_query_log also a session variable? In our environment we have apps that stay connected to the database for hours at a time. If you set global slow_query_log=0 won’t active sessions ignore the change? Is there some way to force them to take up the change without killing them?

  9. Joseph Drozdik says:

    Sigh: of course i was talking about long_query_time=0

  10. Joseph,

    You’re right, in stock MySQL, changes to the global variable long_query_time do not affect active sessions. They would need to reconnect.

    For that reason, Percona Server has features to make existing sessions use the global variables.

    http://www.percona.com/doc/percona-server/5.5/diagnostics/slow_extended_55.html#slow_query_log_use_global_control

  11. erin says:

    Bill -
    What about RDS w/o slow queries build? I’ve got to use processlist and I don’t seem to be gathering all the queries?

    I’m using something like this:
    pt-query-digest –noreport –processlist h=$DBHOST,D=$DB –user $DBUSER –password $DBPWD –run-time=600 –print –filter ‘$event->{fingerprint} =~ m/^select/’ > $NOW-querydigest-AWS.log

    Is the fingerprint incorrect? or is this just a limitation with processlist?

    WIsh I could use tcpdump!

    erin in SF

  12. Hi Erin!

    RDS is a problem, no doubt.

    Gathering queries from processlist is usually inadequate because pqd only polls 10 times per second. If you have queries that are quick, they might easily slip through in between the polling intervals, or else even if the polling catches them, it will only catch them once and fail to get a precise measurement of their duration.

    If I recall correctly, RDS does allow you to log slow queries to a table. This has a lot of I/O overhead but you can do it. Pqd doesn’t know how to gather logs from the mysql.slow_query log, but you can export the data into a format that’s compatible with file-based slow query log. In fact, I have such an export script in my “bk-tools” github repo, the same site where I posted my full-slow-log script. The script to export the slow-query log table is called “export-slow-log-table”.

    Another drawback of the table-based slow query log, in addition to its high overhead, is that it truncates the values for query execution time, and stores them as an integer. So all your quick queries will show up as taking 0 seconds.

    The conclusion is that RDS is very convenient as a MySQL appliance, but it has some INconvenient limitations with regards to performance analysis and tuning. I don’t recommend RDS for anything more demanding than a hobby WordPress site.

  13. Erin says:

    Hmmm I thought I was missing queries….. In fact I think I’m missing a LOT of queries.

    Alas, I can’t turn on slow queries for RDS. I just want the queries and don’t care about execution time as I’m gonna replay them else where. :)

    While RDS is very convenient, I’m not impressed…..

    Thanks!

  14. I’ve made a new commit to the full-slow-log tool to support:

    - Query logging options in Percona Server and MariaDB.
    - Gentler log redirection, without doing FLUSH LOGS.
    - New “-f file” option to specify the location of the redirected slow-query log.

    I’ll write some documentation soon too.

    JohnShep, I believe the problem you described is that your slow-query log is in a directory that the mysql uid cannot normally write to, for example /var/log. Therefore mysqld can’t dynamically create a new log file in that directory. You can use the new -f option to specify a different location, or else you can reconfigure your MySQL instance to store logs under /var/log/mysql or something, and make that directory writeable by the mysql uid.

Speak Your Mind

*