If you are planning to upgrade or make any configuration change on your MySQL database, the first piece of advice is usually the same: benchmark it first.
How should we run that benchmark? People usually reach for generic tools like sysbench, tpcc or mysqlslap. These are good for measuring how many transactions per second a database can handle, but they know nothing about your workload, data set or queries: they just run random queries against randomly generated data.
The best way to run a benchmark is to replay the load of your production server against a different database server, and this is where Percona Playback can help us. Percona Playback is a new tool that replays load captured on the production server against another server. It can replay queries from tcpdump captures or from slow query logs. With this tool you can measure how a database upgrade, a my.cnf change or a schema change affects the overall performance of your application.
In this example I’m going to show how to measure the impact of my.cnf changes by replaying a slow query log. The server starts with an empty my.cnf, so all parameters are set to their default values. We have two servers, PROD and DEV. PROD is the database that is in production, where our customers are working. DEV is where we are going to test all the changes and replay the load. Both servers should hold an identical copy of the data: either make DEV a slave of PROD, or take an xtrabackup copy just before capturing the data.
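One way to get that identical copy is with Percona XtraBackup. This is only a sketch, not a full restore procedure: the user, password, backup path and timestamped directory name below are all placeholders.

```shell
# On PROD: take a consistent binary backup just before the capture
# (credentials and destination are placeholders).
innobackupex --user=root --password=secret /backups/

# Copy the resulting timestamped directory to DEV, prepare it, and
# point DEV's datadir at it (directory name is a placeholder):
innobackupex --apply-log /backups/2013-01-01_00-00-00/
```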
Important: don’t run the benchmark on the production server, because Percona Playback executes both SELECT and DML queries. There is no --read-only parameter yet.
So now we need to enable the slow query log on the PROD server. It should be enabled during peak time so we capture valuable information:
SET GLOBAL slow_query_log=1;
SET GLOBAL long_query_time=0;
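Once the capture window is over, it is worth putting the settings back so the log does not keep growing. A sketch via the mysql client; the values are assumptions (10 seconds is the server default for long_query_time):

```shell
# After capturing, stop logging every query and disable the slow log
# (10 is the default long_query_time; adjust to your own setting).
mysql -u root -e "SET GLOBAL long_query_time=10; SET GLOBAL slow_query_log=0;"
```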
With long_query_time=0 we are logging every query executed on the database. After capturing enough data and copying the slow query log to the DEV server, we can replay the load:
percona-playback --mysql-host=127.0.0.1 --mysql-user=root --mysql-schema=my_db --query-log-file=slow.log

Executed 429467 queries
Spent 00:00:59.668649 executing queries versus an expected 00:00:47.085301 time.
98966 queries were quicker than expected, 330501 were slower
A total of 51 queries had errors.
Expected 372462 rows, got 371851 (a difference of 611)
Number of queries where number of rows differed: 324.
That’s the report of the tool. It needed 59 seconds to execute a load that originally took 47 seconds on the production server: 98966 queries were faster and 330501 were slower. We also see that some queries had errors. Those errors are printed on stderr, so you can redirect them and review them later. The typical reason for an error is that a query returned a different number of rows than it did in production. For example:
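A minimal sketch of collecting and inspecting those errors. The sample errors.log written below stands in for the stderr of a real run, redirected with something like `percona-playback ... 2> errors.log`:

```shell
# Stand-in for redirected stderr from a real replay
# (the two lines mimic Percona Playback's error format):
cat > errors.log <<'EOF'
Connection 43 Rows Sent: 210 != expected 211 for query: SELECT ...
Connection 7 Rows Sent: 14 != expected 15 for query: SELECT ...
EOF

# How many of the logged errors are row-count mismatches?
grep -c 'Rows Sent' errors.log
```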
Connection 43 Rows Sent: 210 != expected 211 for query: SELECT DISTINCT ol_i_id FROM order_line WHERE ol_w_id = 1 AND ol_d_id = 10 AND ol_o_id < 3415 AND ol_o_id >= (3415 - 20);
Why do some queries return a different number of rows? Usually because the data set is not exactly the same on the PROD and DEV servers. In the report above, 324 queries out of 429467 returned a different row count.
Let’s modify my.cnf and see whether we get any improvement. Take into account that the data set on DEV has been modified by the previous replay, so you should restore it before the next benchmark. I’m going to add 1GB of buffer pool (innodb_buffer_pool_size=1G) and run the benchmark again:
Executed 429467 queries
Spent 00:00:59.856656 executing queries versus an expected 00:00:47.085301 time.
98730 queries were quicker than expected, 330737 were slower
A total of 51 queries had errors.
Expected 372462 rows, got 371851 (a difference of 611)
Number of queries where number of rows differed: 324.
Not many changes; the result is very similar to the previous report. Why is there no improvement? Because I haven’t warmed up the buffer pool. When benchmarking, it is usually a good idea to read the data first so the buffer pool is warm. This blog post is not a benchmark per se, just an overview of Percona Playback, so we’re going to ignore that. Second try: add 2GB of InnoDB log files (innodb_log_file_size=1G, innodb_log_files_in_group=2):
Executed 429467 queries
Spent 00:00:56.772552 executing queries versus an expected 00:00:47.085301 time.
99959 queries were quicker than expected, 329508 were slower
A total of 51 queries had errors.
Expected 372462 rows, got 371851 (a difference of 611)
Number of queries where number of rows differed: 324.
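One caveat about that change: on the MySQL versions current at the time of writing, innodb_log_file_size cannot simply be changed in my.cnf. The server must be shut down cleanly and the old redo log files moved aside so they are recreated at the new size on startup. A sketch, with paths and service name as assumptions for a typical Linux install:

```shell
# Resizing the InnoDB redo logs requires a clean shutdown and
# removing the old log files (paths are assumptions):
service mysql stop
mv /var/lib/mysql/ib_logfile* /tmp/
service mysql start
```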
Now we see an improvement, but we can go further. Last change: flush logs every second instead of at every commit (innodb_flush_log_at_trx_commit=2):
Executed 429467 queries
Spent 00:00:51.605932 executing queries versus an expected 00:00:47.085301 time.
111833 queries were quicker than expected, 317634 were slower
A total of 51 queries had errors.
Expected 372462 rows, got 371851 (a difference of 611)
Number of queries where number of rows differed: 324.
Much better: 51 seconds, very close to the original 47.
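To put the four runs side by side, the report times can be turned into relative overhead against the 47.085301-second baseline. This is plain awk arithmetic on the figures quoted in the reports above:

```shell
# Overhead of each replay relative to the original capture time.
# Times (seconds) come from the four reports above, in order:
# defaults, +buffer pool, +log files, +trx_commit=2.
for t in 59.668649 59.856656 56.772552 51.605932; do
  awk -v t="$t" 'BEGIN { printf "%.1f%% slower than production\n", (t/47.085301 - 1) * 100 }'
done
```

This prints 26.7%, 27.1%, 20.6% and 9.6% respectively, which makes the progression across the three configuration changes easy to see.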
This is a simple example of how we can test changes or upgrades using real data and real queries from our application. The tool is under active development; if you find any bugs, we would appreciate a report on Launchpad. You can download the tool from: