July 28, 2014

MyISAM concurrent insert

Arjen posted a good note about MyISAM concurrent-insert features, though I should mention concurrent-insert can be cause of
scalablity and peformance problems on SMP boxes, especially on queries processing range-queries. The reason of problems is POSIX read-write locks, which are used to protect key_buffer from concurrent changes and called for each processed row. More info you can get from my UC2006 talk, in short on main platforms (Linux, Solaris, Windows) rw-locks have a bad implementation and too many calls cause waste of CPU in user-space (Solaris) or kernel-space (Linux).

Some results for MyISAM table, query

, id – primary key.
Boxes: Sun V40z, Solaris 10, 4 x Dual Core Opteron @ 2.2GHz (8 logical cpu), 16GB of RAM, StorEdge 3310
and Quadxeon, RedHat AS 3, 2.4.21-15.Elsmp, 4 x Intel(R) XEON(TM) MP CPU 2.00GHz, 4GB of RAM, SATA RAID 10
The results in queries per sec (more is better)

threadsQuadxeon
with enabled concurrent-insert
Quadxeon
–skip-concurrent-insert
Sun V40z
with enabled concurrent-insert
Sun V40z
–skip-concurrent-insert
144.0864.8261.06129.13
232.63123.3352.63244.03
424.95176.6220.03463.62
819.92206.8112.34483.47
1619.73208.6612.3428.35
3219.77212.8312.25445.66

So if you are using range-queries and doing INSERT not often than disabling concurrent-insert with –skip-concurrent-insert can improve MyISAM performance.

As I said the reason of bad scalability is rw-locks and currently MySQL developers are working on CPU-depended rw-locks implementstation, this will be available in MySQL 5.1 – 5.2

About Vadim Tkachenko

Vadim leads Percona's development group, which produces Percona Clould Tools, the Percona Server, Percona XraDB Cluster and Percona XtraBackup. He is an expert in solid-state storage, and has helped many hardware and software providers succeed in the MySQL market.

Comments

  1. Jay Pipes says:

    Hi Vadim!

    Great info. Could you by any chance give us a heads up on what the “tipping point” is where you see scalability issues with concurrent insert? Is there a specific % read/write where MyISAM starts to have trouble, or does the situation normally occur when there is a huge range condition like the one above (20000 rows)?

    Cheers!

    Jay

  2. Vadim says:

    Jay,

    It happens even with only SELECT queries.
    The problem is pthread_rwlock_rdlock is called for each scanned rows – that is the more rows we are processing the problem
    is worse.
    I don’t have exactly number of rows – but I saw the same problem for 10.000 rows.
    Good idea for benchmarks to check when the problem popups – I will try it soon.

  3. Apachez says:

    How does the “concurrent-insert = 2″ feature described by Arjen affect the outcome of the above benchmark ?

  4. Vadim says:

    I did not test concurrent-insert = 2, but looking at source code – it changes SELECTs code in no way. So the results will be the same.

  5. citrin says:

    Which soft was used in this benchmark?

  6. Vadim says:

    Citrin,

    I usually use sysbench (http://sysbench.sf.net).

  7. Jerry says:

    What is the negative impact to turn the concurrent_insert off? Will it make insert waits much longer? Is there any bug or issue related to turn off this default feature?
    Will turning off the feature help queries which have no insert on the same table at the same time?

    My system has update:Insert:Select ratio is 3:1.5:1 so the db is write more than read.

    Thanks.

  8. Vadim says:

    Jerry,

    Well…
    There are several points:
    1. Possible negative effect – your INSERTS will be a bit longer. I don’t know any bugs with disabled concurrent insert.
    2. If you use only point selects, or selects that retieve only several rows – I don’t think concurrent=off will help you much – perhaps otherwise.
    3. if you use select retrieves many rows – you definitely should use concurrent=off
    4. if your database is more write that read – then I again don’t think concurrent is important for you – most time
    tables are in WRITE lock.
    5. But better way – test it – for example you can take mysql with patched query log and analyze execution time of queries with concurrent insert = off and on.

  9. Marc Pinel says:

    Hi man,

    i have concurrent-insert=2 set….how would the positively impact the performance? i have a torrent tracker (legal content ofcourse) with heavy php and mysql visits daily…and some spikes.

  10. Bandara says:

    Hi,

    Any one know how to run mySQL from the visual studio 2008. I tried by building the mySQL source code and it worked fine. But problem is how to run it. So if any one can provide step by step procedure its great.

  11. Richard says:

    OK, I’ve never mucked much with this setting, or ever actually, but the stock site I manage is about 170 gig on drive, daily does millions of inserts, updates, selects, and a few 1000 deletes. Some of the larger tables are well over 100,000,000 rows of data, that are part of the daily changes. I have noticed since more and more data is being written, changed, I’m getting slow query issues reading and writing. Do you think if I were to disabled concurrent insert this would help or worsen the select from time issues (the tables are Primary Key index ID,Date, and indexed Date, with the Select statements always:

    Select from table where ID = ###### Order by Date

    with sometimes

    AND Date >= 20100101 and Date <= 20110101 (for ranges of between specific dates)

Speak Your Mind

*