February 11, 2012

How to generate per-database traffic statistics using mk-query-digest

We often encounter customers who have partitioned their applications among a number of databases within the same instance of MySQL (think application service providers who have a separate database per customer organization … or wordpress-mu type of apps). For example, take the following single MySQL instance with multiple (identical) databases:

SHOW DATABASES;
+----------+
| Database |
+----------+
| db1      |
| db2      |
| db3      |
| db4      |
| mysql    |
+----------+

Separating the data in this manner is a great setup for being able to scale by simply migrating a subset of the databases to a different physical host when the existing host begins to get overloaded. But MySQL doesn’t allow us to examine statistics on a per-database basis.

Enter Maatkit.

There is an often-ignored gem in Maatkit‘s mk-query-digest, and that is the –group-by argument. This can be used to aggregate information by tables, hosts, users, or databases (full documentation is available via perldoc).

%> perl mk-query-digest --limit 100% --group-by db slow.log
...
# Rank Query ID Response time Calls R/Call Item
# ==== ======== ============= ===== ====== ====
#    1 0x       6000 60.0%    6000  0.5124 db3
#    2 0x       2000 20.0%    2000  0.0112 db1
#    3 0x       1500 15.0%    1500  0.1665 db2
#    4 0x        500  5.0%     500  0.0022 db4

So here, we can see that the majority (60%, to be exact) of execution time is spent in db3. If the server is reaching it’s capacity and the next most useful performance optimization is to migrate a database to a different server, you know exactly which database to move (db3) and how much room that will give you on the original host (60% growth) and on the new host (40% growth), which may have a direct bearing on your hardware selection.

Let Baron know how awesome you think this is by getting him a gift from his Amazon Wish List!

About Ryan Lowe

Ryan is a Principal Consultant and team manager at Percona. He has experience with many database technologies in industries such as health care, telecommunications, and social networking.

Speak Your Mind

*