For the last couple of months, we’ve been quietly developing a MySQL protocol parser for Maatkit. It isn’t an implementation of the protocol: it’s an observer of the protocol. This lets us gather queries from servers that don’t have a slow query log enabled, at very high time resolution.

With this new functionality, it becomes possible for mk-query-digest to stand on the sidelines and watch queries fly by over TCP. It is only an observer on the sidelines: it is NOT a man in the middle like mysql-proxy, so it has basically zero impact on the running server (tcpdump is very efficient) and zero impact on the query latency. There are some unique challenges to watching an entire server’s traffic, but we’ve found ways to solve those. Some of them are harder than others, such as making sense of a conversation when you start listening in the middle. In general, it’s working very well. We can gather just about every bit of information about queries that mysql-proxy can, making this a viable way to monitor servers without the disadvantages of a proxy. In theory, we can gather ALL the same information, but in practice we are going for the 95% case.

As always with Maatkit, this has minimal dependencies. It doesn’t require any Net::Pcap or other modules from CPAN. It’s written in pure Perl, and it parses the output of tcpdump, rather than watching the network traffic directly. This might sound useless, but it’s not. It means you can go tcpdump some traffic on a machine without Perl installed, and copy it to another machine for analysis, or send it via email to your friendly consultant, or do any of a number of other things. Decoupling things is very helpful sometimes.

Let’s see how to gather queries and do something useful with them. I’ll just watch the queries on a sandbox server on my laptop, and print out the profile synopsis so you can see how it works.

I run a few queries, quit, and cancel tcpdump. Now I’ve got a file and I’m ready to analyze it. Let’s see:

I’m kind of showing off the summary profile here to illustrate that you can get really compact results to see what’s going on inside your server. What do you suppose that one query was that took a tenth of a second? We can find out.

Indeed, it’s no surprise the query took a tenth of a second to execute, and now you see where “SELECT dual” comes from.

Notice that it is inspecting the protocol enough to see the flags set in the protocol, indicating the warning count, error count, rows affected, and whether no index or no good index was available. Look at the top of the report — what is up with the 12% of queries that say No_index_used? If we increase –limit a bit, we can see

I did not know that SHOW DATABASES sets the “no index used” flag, did you? Now we both do!

This is just a brief introduction to what the protocol parser can do. Of course, in real life it’s much more useful than just seeing a query or two — it has all the power of mk-query-digest for filtering, aggregating, printing and so forth.

23 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments
Rob Wultsch

This will be very useful for those of us that wish we could turn on the general query log on at will on any version of MySQL 🙂

Thank you in advance.

Justin Swanhart

Awesome beyond words.

Thanks guys.

Brice Figureau

This is purely awesome.
It’s ‘The Tool’ I needed for so long but never had time to write!
Many thanks!

Jan Kneschke

How do you handle packet drops ?

LenZ

Hi,

excellent news, this sounds like a very useful tool – thank you for developing it!

But allow me to comment on one of your statements above (emphasis mine): “For the last couple of *months*, we’ve been *quietly developing* a MySQL protocol parser for Maatkit.” I guess in the light of Peter’s recent rant (http://www.mysqlperformanceblog.com/2009/05/21/open-development-vs-making-a-big-splash/) you should be careful with such statements 🙂

Just a thought,

LenZ

Mark Leith

Pretty cool! tcpdump seems to be a common thing that people are looking at lately – I’ve seen other demos of this kind of thing in the past few months too (as well as DTrace – which gives oh oh oh so much more, my only wish is that it was available on other platforms)..

I’ll also give the nod to “practice what you preach” too though.. 😉 For shame! 🙂

Jan Kneschke

I started a pcap plugin for MySQL Proxy a while ago and stopped it after I ran into a few tricky problems:

* pcap_open_live() may have loose packets on high load (underneat is usually a ring-buffer)
* you get a tcp/ip stream. Packets can/will be out of order, duplicated, fragemented, …
* you need a proper, userland TCP implementation
* you have to run as root if you want to use pcap_open_live()

Without a proper TCP userland lib it, it is hard to make the decoding robust enough that you would try to run it in the wild.
The decoding the MySQL Protocol afterwards is the easy part.

erin

Hey THANKS! I’ve become a recent fan of mk-query-digest but I’ve been using it with processlist. Just yesterday I was exploring tcpdump and wondering how to use mk-query-digest with it. Now I can just start using it!! yippeee.

You can thank Ryan for a bunch of SF-MySQL meetup folks for a sudden interest in mk-query-digest.

thanks & keep up the good work.
erin

LenZ

Baron, it was not my intention to offend you or imply that any of the work was done in secret. *I* certainly know this was not the case (as I follow the mailing lists), but to the uninitiated a comment like yours *could* be misinterpreted like it was. Hence my suggestion to be careful with formulating it as you did. Sorry if my comment sounded misleading or offensive.

Nick Le Mouton

I’m trying to run tcpdump on my local dev server, but as far as I can see no packets are going through interface lo. Although mysql is connecting to localhost via PHP, I think it’s trying to use the unix socket rather than tcp/ip. Is there anyway to sniff packets from the socket to use the same query analysis tool?

Jeremy Zawodny

Nick:

If you change from localhost to using the IP address of your box, it’ll work (assuming the MySQL grants allow for this).

Nick Le Mouton

Thanks Jeremy that did it 🙂

Mark Callaghan

Wow! Someone just showed me some output from this and it is really, really nice. This is a very good week – first the faster recovery patch and now this.

horse

Why always remind me that the packets don’t come from mysql server?

the command on server(192.168.84.22):
tcpdump host 192.168.84.22 and 192.168.84.146 -x -n -q -tttt > tcp.dmp

I run a few querie on client(192.168.84.146), quit, and cancel tcpdump,Now I’ve got a file(tcp.dmp) and I’m ready to analyze it

$ mk-query-digest –type=tcpdump –limit 1 tcp.dmp

result:
Packet is not to or from MySQL server: $VAR1 = {
ack => 3333564724,
complete => 1,
data => ‘110000000373656c656374202a2066726f6d207070’,
data_len => 21,
dgram_len => 61,
dst_host => ‘192.168.84.22’,
dst_port => ‘4333’,
ip_hlen => 5,
pos_in_log => 13193,
raw_packet => ‘2009-08-31 14:26:13.120420 IP 192.168.84.146.2557 > 192.168.84.22.4333: tcp 21
0x0000: 4500 003d d341 4000 4006 3d80 c0a8 5492
0x0010: c0a8 5416 09fd 10ed 23c2 198d c6b2 2934
0x0020: 5018 f884 6145 0000 1100 0000 0373 656c
0x0030: 6563 7420 2a20 6672 6f6d 2070 70’,
seq => 599923085,
src_host => ‘192.168.84.146’,
src_port => ‘2557’,
tcp_hlen => 5,
ts => ‘2009-08-31 14:26:13.120420’
};

why ???Please help me ~~~

Ronald Bradford

I was unable to get this functional on a CentOS 5.3 server. I tried -s 65535, 32767 and even 0, all with the same error.

tcpdump: invalid snaplen 65535 # number being what was passed.

Command used was.

$ /usr/sbin/tcpdump -i eth1 port 3306 -s 65535 -x -n -q -tttt | /home/rbradfor/maatkit-4790/bin/mk-query-digest –type tcpdump

JoC

When running pt-query-digest with –processlist, the #Time attribute is always returned as “# Time: 2012-02-23T15:09:00”
where there’s an extra T in the middle making the –since and –until option unusable.

Is there anyway to get around this?