April 16, 2014

MySQL Performance on Memory Appliance

Recently I had a chance to check out MySQL performance on a “Memory Appliance” by Violin Memory, which can be used as an extremely high-speed storage system.

I helped Violin Memory optimize MySQL for a customer workload, and I also had a chance to do some benchmarks on my own. The test box was a 2×Quad-Core Xeon running CentOS 5, using the ext2 filesystem and the SysBench tool.

Using 16K read sizes (matching the InnoDB page size), I could get 1.2GB/sec (80K req/sec) for reads and about 500MB/sec for writes with 16 concurrent threads. Things scaled well, and with 256 threads I got even slightly better performance.
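
As a quick sanity check on those numbers (my arithmetic, not benchmark output) – 1.2GB/sec divided by the 16KB request size lands right around the quoted request rate:

```shell
# 1.2 GB/sec in bytes (12 * 2^30 / 10), divided by the 16KB (16384-byte) request size
echo $(( 12 * 1024 * 1024 * 1024 / 10 / 16384 ))   # prints 78643, i.e. ~80K req/sec
```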

Interestingly enough, utilization in iostat never went over a few percent, and the load was mostly CPU bound.

Next I went on to test MySQL. My goal was to generate as much IO as possible, so I used SysBench to build a 300-million-row table and ran primary key lookups with a uniform distribution, some 90% of which would need to do IO to get their data from disk.
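
For reference, the setup looked roughly like this – a sketch from memory, using the sysbench 0.4 OLTP mode; connection options are omitted and exact option values are illustrative:

```shell
# Prepare a large table; at 300M rows most lookups will miss the buffer pool
sysbench --test=oltp --mysql-table-engine=innodb \
  --oltp-table-size=300000000 prepare

# Uniform-distribution point selects only, no ranges, no transactions
sysbench --test=oltp --oltp-table-size=300000000 \
  --oltp-read-only=on --oltp-skip-trx=on --oltp-dist-type=uniform \
  --oltp-point-selects=1 --oltp-simple-ranges=0 --oltp-sum-ranges=0 \
  --oltp-order-ranges=0 --oltp-distinct-ranges=0 \
  --num-threads=8 run
```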

With InnoDB, results were quite poor – I got about 6,500 req/sec from one thread, which grew to 13,000 req/sec when 8 concurrent threads were used. Compare this to 20,000 queries/sec from a single thread when all data fits in the buffer pool, scaling to over 80,000 queries/sec with multiple threads.

So InnoDB took a very high hit even when supplied with a high-performance IO subsystem. And not only was performance poor with a single thread, it also did not scale well with increased concurrency, even though we had 8 cores at our disposal.

Unfortunately oprofile did not work on the box in question, so I could not investigate where exactly CPU was being wasted. My first obvious guess was InnoDB checksumming, and indeed disabling it (innodb_checksums=0) allowed me to get to 9,000 queries/sec from a single thread; however, throughput still peaked at 13,500 queries/sec with multiple threads, which corresponds to even worse scalability. Disabling the adaptive hash index took us to 14,500 req/sec, which was also a bit surprising – with a read-only workload you would expect the adaptive hash index to help, yet here it actually hurts performance.
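
For the record, those two settings in my.cnf form (a sketch; innodb_adaptive_hash_index as an ON/OFF variable requires MySQL 5.0.52 or later):

```ini
[mysqld]
# Skip page checksum verification on read – saves CPU, loses corruption detection
innodb_checksums = 0
# Disable the adaptive hash index – helped this read-only workload
innodb_adaptive_hash_index = 0
```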

My guess is InnoDB simply was not designed and tested for such conditions – the normal case is to allocate cache memory to the buffer pool so that IOs which miss it go to the drives directly, which means hundreds or maybe thousands of IOs per second for the whole system, and that allows wasting some CPU cycles handling them without taking a large performance hit.

I really hope this will be one of the items, besides CPU scaling, which the InnoDB team puts on their roadmap. With SSDs coming this will be important. It is also very easy to repeat and test – just run InnoDB from a RAM drive :)
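
The repro is cheap if you have enough RAM – something along these lines (run as root; paths and sizes are illustrative, and tmpfs loses everything on reboot, so this is strictly for benchmarking):

```shell
# Mount a tmpfs "RAM drive" and run mysqld with its datadir on it
mkdir -p /mnt/ramdisk
mount -t tmpfs -o size=16G tmpfs /mnt/ramdisk
cp -a /var/lib/mysql/. /mnt/ramdisk/
chown -R mysql:mysql /mnt/ramdisk
mysqld_safe --datadir=/mnt/ramdisk &
```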

Next I tested MyISAM, which is well known for its simplicity and thus for a fast execution path from getting data off the drive to sending it back to the client.

I tested MyISAM with the myisam_use_mmap option, which did not seem to give any significant performance benefit – it may even have reduced performance by a few percent. I also tried running with the key buffer on and off. Interestingly enough, disabling the key buffer actually worked better for this workload, as it avoids contention.
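
The MyISAM variants boiled down to toggling two settings – a my.cnf sketch (myisam_use_mmap requires MySQL 5.1+, and shrinking key_buffer_size to its minimum is one way to effectively take the key cache out of the picture):

```ini
[mysqld]
# mmap data files instead of pread/pwrite – no measurable win in this test
myisam_use_mmap = 1
# Shrink the key cache to near nothing so index reads go straight to the OS cache
key_buffer_size = 8
```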

In the best configuration I got about 13.5K queries/sec from MyISAM (already double InnoDB), which scaled to 70,000 queries/sec, peaking at 16 threads.

However, even with MyISAM we became CPU bound before we could reach the system's capacity – these 70K queries/sec generated just over 50K IOs/sec, while capacity was over 100K IOs/sec (more than with InnoDB, as these are smaller-sized reads).

These results reiterate one important thing about InnoDB – it just loves to have as much data in the buffer pool as possible for best performance.

Hopefully some time in the future I will get a chance to do more testing with this nice hardware, and maybe check out other storage engines and other workloads, or get some profiling results.

About Peter Zaitsev

Peter managed the High Performance Group within MySQL until 2006, when he founded Percona. Peter has a Master's Degree in Computer Science and is an expert in database kernels, computer hardware, and application scaling.

Comments

  1. Philip Stoev says:

    Can you test Falcon on this beast? A comparison with InnoDB would be much appreciated.

  2. peter says:

    Hopefully I will have a chance soon to test Falcon, PBXT, and Maria on this appliance. I’m curious myself.

  3. Yup, this is exactly what I saw with InnoDB + sysbench on SSD.

    The IO subsystem is so fast that the CPU becomes a bottleneck.

    MyISAM was 2x faster than InnoDB but in our situation did not become a bottleneck.

    I ran out of time on my benchmarks to load it into oprofile but I’m very curious as to what’s happening.

    BTW, a 16K page size is HUGE with a memory-based device.

    Hopefully Maria will be able to go down to 4K.

    I want to play with using 512-byte pages, as this would be optimal for SSD-based devices (I think).

    Kevin

  4. peter says:

    Kevin,

    Thanks for the confirmation. Maria has 8K pages by default and can do 4K; I’m not sure lower values work. Though note that to really benefit from smaller page sizes you need to use O_DIRECT or other measures to bypass the OS cache – otherwise the OS will still do 4K IOs.
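
    For InnoDB, for example, the usual way to get O_DIRECT is the flush method setting in my.cnf:

    ```ini
    [mysqld]
    # Open data files with O_DIRECT so reads/writes bypass the OS page cache
    innodb_flush_method = O_DIRECT
    ```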

  5. Ryan says:

    We were thinking of getting one of these just to improve upgrade times for customers (faster ALTER TABLE). It would be nice if you could test ALTER speed on these. In my testing I did notice that on some large tables with 30 indexes the system actually spent more time on CPU rebuilding tables than on IO, so maybe these wouldn’t be worth the investment. It would be good to see some numbers though.

    -Ryan

  6. Ryan.

    This is the same behavior we’re seeing on SSD. You can have super fast IO on MySQL but on InnoDB you become CPU bottlenecked.

    Hopefully this will be fixed soon.

  7. peter says:

    Ryan,

    Sure, if I get access again I can check it. By the way, they will be exhibiting at the MySQL Users Conference, and I’m sure if you’re serious about getting some of these you will be able to run your test case.

    Also, you may learn other interesting stuff at the conference to help with this problem. I just can’t share details yet :)

    BTW, we also did InnoDB profiling and patching, and we can get it to perform better in this case. So at least there is space for fixes here.

  8. peter says:

    Kevin,

    I think it is natural for any system – if IO becomes fast enough, the system becomes CPU bound. That by itself is not the problem; the problem is the performance of MySQL/InnoDB in this case – the penalty for the path involving IO is way too large.

  9. Peter.

    I totally agree that if you remove the IO subsystem limits then you’ll become CPU bound.

    However, InnoDB is pathological in this situation. MyISAM was about 2x faster and didn’t use any CPU…

    Actually, I think one would saturate the disk IO and/or SATA port IO before the CPU would become bottlenecked.

    Kevin

  10. Chris Nigh says:

    I am running a test with a very similar setup. We have an HP DL380 with 20GB RAM, dual socket, quad core 3.0GHz Xeons attached via a single interface card to a 500GB Violin. Our database(s) total roughly 350GB and we have multiple application servers hitting it to produce a virtually infinite database load.

    After initially seeing terrible numbers, we disabled the key_buffer and saw a huge improvement. As we added application servers we saw the database server increase in load until we seem to have reached its maximum throughput. I have tried several different database and application configurations and have not seen any real application TPS improvements.

    At maximum load, we seem to see that we are CPU bound, with high context switching and paging (not swap) activities.

    We are running multiple MySQL instances, so it is difficult to get the oprofile running for all instances, but I may do that next week.

    Overall, we are impressed that we have been able to consolidate an existing environment of many database servers running our data in ramdisk onto a single Violin with virtually the same performance.

    Details below:
    03:00:01 PM proc/s
    03:01:01 PM 0.22
    Average: 0.22

    03:00:01 PM cswch/s
    03:01:01 PM 101676.65
    Average: 101676.65

    03:00:01 PM CPU %user %nice %system %iowait %steal %idle
    03:01:01 PM all 27.15 0.00 16.21 1.44 0.00 55.21
    03:01:01 PM 0 38.58 0.00 22.03 1.28 0.00 38.11
    03:01:01 PM 1 20.00 0.00 11.68 2.08 0.00 66.24
    03:01:01 PM 2 21.48 0.00 11.05 0.67 0.00 66.81
    03:01:01 PM 3 21.82 0.00 13.44 0.85 0.00 63.89
    03:01:01 PM 4 26.78 0.00 14.61 1.22 0.00 57.39
    03:01:01 PM 5 40.65 0.00 31.89 3.60 0.00 23.86
    03:01:01 PM 6 24.12 0.00 12.30 0.90 0.00 62.68
    03:01:01 PM 7 23.75 0.00 12.63 0.92 0.00 62.70
    Average: all 27.15 0.00 16.21 1.44 0.00 55.21
    Average: 0 38.58 0.00 22.03 1.28 0.00 38.11
    Average: 1 20.00 0.00 11.68 2.08 0.00 66.24
    Average: 2 21.48 0.00 11.05 0.67 0.00 66.81
    Average: 3 21.82 0.00 13.44 0.85 0.00 63.89
    Average: 4 26.78 0.00 14.61 1.22 0.00 57.39
    Average: 5 40.65 0.00 31.89 3.60 0.00 23.86
    Average: 6 24.12 0.00 12.30 0.90 0.00 62.68
    Average: 7 23.75 0.00 12.63 0.92 0.00 62.70

    03:00:01 PM INTR intr/s
    03:01:01 PM sum 24732.29
    Average: sum 24732.29

    03:00:01 PM CPU i000/s i001/s i003/s i008/s i009/s i012/s i014/s i098/s i106/s i130/s i146/s i154/s i162/s i169/s i177/s i185/s
    03:01:01 PM 0 1000.15 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 7719.43 0.88 0.00 0.00 0.00
    03:01:01 PM 1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 18.33 0.00 0.00 0.00 0.00 0.00 0.00
    03:01:01 PM 2 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
    03:01:01 PM 3 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
    03:01:01 PM 4 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
    03:01:01 PM 5 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 15993.38 0.00 0.00 0.00 0.00 0.00
    03:01:01 PM 6 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
    03:01:01 PM 7 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
    Average: 0 1000.15 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 7719.43 0.88 0.00 0.00 0.00
    Average: 1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 18.33 0.00 0.00 0.00 0.00 0.00 0.00
    Average: 2 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
    Average: 3 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
    Average: 4 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
    Average: 5 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 15993.38 0.00 0.00 0.00 0.00 0.00
    Average: 6 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
    Average: 7 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00

    03:00:01 PM pswpin/s pswpout/s
    03:01:01 PM 0.00 0.00
    Average: 0.00 0.00

    03:00:01 PM tps rtps wtps bread/s bwrtn/s
    03:01:01 PM 16095.52 16087.10 8.42 157681.72 1011.03
    Average: 16095.52 16087.10 8.42 157681.72 1011.03

    03:00:01 PM frmpg/s bufpg/s campg/s
    03:01:01 PM 7.47 2.88 -18.95
    Average: 7.47 2.88 -18.95

    03:00:01 PM TTY rcvin/s xmtin/s framerr/s prtyerr/s brk/s ovrun/s
    03:01:01 PM 0 0.00 0.00 0.00 0.00 0.00 0.00
    03:01:01 PM 1 0.00 0.00 0.00 0.00 0.00 0.00
    Average: 0 0.00 0.00 0.00 0.00 0.00 0.00
    Average: 1 0.00 0.00 0.00 0.00 0.00 0.00

    03:00:01 PM IFACE rxpck/s txpck/s rxbyt/s txbyt/s rxcmp/s txcmp/s rxmcst/s
    03:01:01 PM lo 0.70 0.70 38.73 38.73 0.00 0.00 0.00
    03:01:01 PM eth0 9140.03 8010.61 6269830.21 7670767.72 0.00 0.00 0.00
    03:01:01 PM eth1 0.50 0.67 85.82 88.82 0.00 0.00 0.00
    03:01:01 PM sit0 0.00 0.00 0.00 0.00 0.00 0.00 0.00
    Average: lo 0.70 0.70 38.73 38.73 0.00 0.00 0.00
    Average: eth0 9140.03 8010.61 6269830.21 7670767.72 0.00 0.00 0.00
    Average: eth1 0.50 0.67 85.82 88.82 0.00 0.00 0.00
    Average: sit0 0.00 0.00 0.00 0.00 0.00 0.00 0.00

    03:00:01 PM IFACE rxerr/s txerr/s coll/s rxdrop/s txdrop/s txcarr/s rxfram/s rxfifo/s txfifo/s
    03:01:01 PM lo 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
    03:01:01 PM eth0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
    03:01:01 PM eth1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
    03:01:01 PM sit0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
    Average: lo 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
    Average: eth0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
    Average: eth1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
    Average: sit0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00

    03:00:01 PM call/s retrans/s read/s write/s access/s getatt/s
    03:01:01 PM 0.45 0.00 0.00 0.00 0.07 0.28
    Average: 0.45 0.00 0.00 0.00 0.07 0.28

    03:00:01 PM scall/s badcall/s packet/s udp/s tcp/s hit/s miss/s sread/s swrite/s saccess/s sgetatt/s
    03:01:01 PM 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
    Average: 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00

    03:00:01 PM pgpgin/s pgpgout/s fault/s majflt/s
    03:01:01 PM 78841.19 505.52 2327.63 0.38
    Average: 78841.19 505.52 2327.63 0.38

    03:00:01 PM kbmemfree kbmemused %memused kbbuffers kbcached kbswpfree kbswpused %swpused kbswpcad
    03:01:01 PM 103824 20447528 99.49 92568 16664180 8385768 152 0.00 0
    Average: 103824 20447528 99.49 92568 16664180 8385768 152 0.00 0

    03:00:01 PM dentunusd file-sz inode-sz super-sz %super-sz dquot-sz %dquot-sz rtsig-sz %rtsig-sz
    03:01:01 PM 5603 4080 8992 0 0.00 0 0.00 0 0.00
    Average: 5603 4080 8992 0 0.00 0 0.00 0 0.00

    03:00:01 PM totsck tcpsck udpsck rawsck ip-frag
    03:01:01 PM 1127 872 8 0 0
    Average: 1127 872 8 0 0

    03:00:01 PM runq-sz plist-sz ldavg-1 ldavg-5 ldavg-15
    03:01:01 PM 2 1143 5.11 5.56 5.75
    Average: 2 1143 5.11 5.56 5.75

  11. Chris Nigh says:

    I forgot to add: we are running RHEL5 and MySQL 5.0.53 on an ext2 file system. Our databases are all read-only.

  12. Chris Nigh says:

    5.0.51a
    sorry..

  13. francesco says:

    I am very interested in MySQL on a Memory Appliance. We have a database with 50 million records (emails) and 50 columns (about 100 bytes each). We need it to be fast (50k–100k requests/sec; 20k updates/sec). Is this possible with MySQL and a memory appliance?

  14. Pavel says:

    Here is OLTP test run on another memory storage solution:

    sysbench --test=oltp --oltp-table-size=1000000 --db-driver=mysql --mysql-table-engine=innodb --mysql-socket=/var/lib/mysql/mysql.sock --mysql-user=root prepare

    sysbench --num-threads=16 --max-requests=100000 --test=oltp --mysql-table-engine=innodb --db-driver=mysql --mysql-socket=/var/lib/mysql/mysql.sock --mysql-user=root run

    Doing OLTP test.
    Running mixed OLTP test
    Using Special distribution (12 iterations, 1 pct of values are returned in 75 pct cases)
    Using "BEGIN" for starting transactions
    Using auto_inc on the id column
    Maximum number of requests for OLTP test is limited to 100000
    Threads started!
    Done.

    OLTP test statistics:
    queries performed:
    read: 1400224
    write: 500046
    other: 200018
    total: 2100288
    transactions: 100002 (1394.15 per sec.)
    deadlocks: 14 (0.20 per sec.)
    read/write requests: 1900270 (26491.99 per sec.)
    other operations: 200018 (2788.49 per sec.)

    Test execution summary:
    total time: 71.7300s
    total number of events: 100002
    total time taken by event execution: 1145.6317
    per-request statistics:
    min: 0.0027s
    avg: 0.0115s
    max: 0.0979s
    approx. 95 percentile: 0.0313s

    Threads fairness:
    events (avg/stddev): 6250.1250/66.05
    execution time (avg/stddev): 71.6020/0.01

  15. Pavel says:

    and the same test on MD3000 15K RPM HDs RAID 10

    sysbench --test=oltp --mysql-table-engine=innodb --oltp-table-size=1000000 --mysql-socket=/tmp/mysql.sock --mysql-db=test --mysql-user=root --mysql-password= prepare
    sysbench --num-threads=16 --max-requests=100000 --test=oltp --mysql-table-engine=innodb --oltp-table-size=1000000 --mysql-socket=/tmp/mysql.sock --mysql-db=test --mysql-user=root --mysql-password= run

    Running the test with following options:
    Number of threads: 16

    Doing OLTP test.
    Running mixed OLTP test
    Using Special distribution (12 iterations, 1 pct of values are returned in 75 pct cases)
    Using "BEGIN" for starting transactions
    Using auto_inc on the id column
    Maximum number of requests for OLTP test is limited to 100000
    Threads started!
    Done.

    OLTP test statistics:
    queries performed:
    read: 1400014
    write: 500005
    other: 200002
    total: 2100021
    transactions: 100001 (332.53 per sec.)
    deadlocks: 0 (0.00 per sec.)
    read/write requests: 1900019 (6318.06 per sec.)
    other operations: 200002 (665.06 per sec.)

    Test execution summary:
    total time: 300.7284s
    total number of events: 100001
    total time taken by event execution: 4809.0591
    per-request statistics:
    min: 0.0035s
    avg: 0.0481s
    max: 1.0817s
    approx. 95 percentile: 0.1054s

    Threads fairness:
    events (avg/stddev): 6250.0625/90.30
    execution time (avg/stddev): 300.5662/0.01

    SEE THE DIFFERENCE ???

  16. Hey Pavel.

    Thanks for the numbers.

    Can you say what memory device it is?

  17. Pavel says:

    I’m sorry Kevin, but unfortunately I cannot disclose this info since we do not own the device and I would have to ask permission from the manufacturer :-/

  18. peter says:

    Pavel,

    What about raw IO capacity?

    Also, in your test I see you’re using just a 1M-row table – if InnoDB is well configured, this should be an in-memory workload.

  19. Pavel says:

    If you’re talking about IOPS – that would be 600K.

  20. peter says:

    I see. 1394.15 sysbench transactions per second is not a lot for a device with 600K IOs/sec capacity.
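
    Some quick arithmetic on those numbers (queries are of course not 1:1 with IOs, but it shows the scale):

    ```shell
    # 1900270 read/write requests over 100002 transactions, in 71.73 seconds
    echo $(( 1900270 / 100002 ))       # 19 requests per transaction
    echo $(( 1900270 * 100 / 7173 ))   # ~26.5K requests/sec – under 5% of 600K IOPS
    ```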

  21. pave says:

    Sucks for the device, I guess :)

    But I’m just showing the difference between Dell’s MD3000 with 15K rpm HDs in RAID 10 vs. a RAM-based solution.
