August 12, 2008

Beware of MyISAM Key Cache mutex contention

Posted by peter

Today I was working with the client loading data to MyISAM tables at very high rate. Hundreds of millions rows are loaded daily into single MySQL instance with bursts up to 100K of records/sec which need to be inserted (in the table with few indexes). It was good not all records had to go to the same table and so in theory using multiple thread to do inserts in multiple tables would not be bound by table locks and would be able to use multiple cores efficiently to get good insert rate… or so it seemed.

In reality inserting in parallel into different tables when indexes fit in memory results in very bad contention causing hundreds of thousands of context switches per second with far less work done than one would hope. As we already discussed
MyISAM key cache has serious mutex contention issue as there is global mutex which is held for the time of key_block being copied from key_cache to the thread local space.

Happily MyISAM allows you to create multiple key caches
. We use “keycache per table” this case and caused number of context switches to drop almost tenfold and performance almost doubled.

Another interesting result was - adding PACK_KEYS=0 to most actively updated tables almost doubled throughput again (and I would expect to see it more than double if all tables would be converted with this option. This is of course because some of indexes on the tables were on CHAR/VARCHAR columns. Honestly I did not expect so large difference for Inserts. I have already wrote about very large impact on joins this option has but I did not expect modification of packed keys would be so much more expensive.

After doing such changes we got insert rate to MySQL close to 200K rows/sec using standard multi value inserts which is pretty good number for indexing tables, especially considering application was doing some updates along the way too.

Related posts: :Using Multiple Key Caches for MyISAM Scalability::Beware large Query_Cache sizes::MyISAM Scalability and Innodb, Falcon Benchmarks:
 

9 Comments »

  1. what hardware is that on?? how many disks?

    Comment :: August 13, 2008 @ 2:28 pm

  2. This was 2 quad core Xeon CPUs

    The disks - this was RAID array but it does not really matter as it was fully in memory workload.

    Comment :: August 13, 2008 @ 7:56 pm

  3. 3. yingkuan

    Peter,
    why this is fully in memory workload since you said the inserts are on MYISAM tables?

    Comment :: August 14, 2008 @ 12:09 am

  4. Hi Peter,
    We also met such issue that when multiple MyISAM tables are included, the performance is worse than only one MyISAM table. And even if we add cache the table index, things does not change. Could you please help to see what our problem is? Thanks a lot!

    The cpu of hardware is 2.2 GHz, and the memory is 1G.

    All the MyISAM tables are in the same structure, the data for them are also similar. Each table has 2M records.
    The following is my MyISAM table information:

    user_id | int(11)
    call_id | int(11)
    call_direct | tinyint(4)
    my_number | varchar(255)
    start_time | datetime
    end_time | datetime

    The index for the table is: user_id+call_id

    We tested to read data from the MyISAM tables with one thread and two threads. table_cache=495, Open_tables=120, Opened_tables=133.
    1. Two threads and one table, the average response time is about 143ms
    Thread1: access table t1
    Thread2: access table t1
    procs memory swap io system cpu
    r b swpd free buff cache si so bi bo in cs us sy id wa
    1 2 32296 8656 64572 815228 0 0 1788 0 684 692 8 4 0 88
    0 2 32296 8648 64572 815508 0 0 1792 0 678 566 2 2 0 96

    2. Two thread and two tables, the average response time is about 338ms.
    Thread1: access table t1
    Thread2: access table t2
    r b swpd free buff cache si so bi bo in cs us sy id wa
    2 2 30640 8764 64448 820976 0 0 1340 0 539 422 1 3 0 96
    2 2 30756 8652 64448 821040 4 116 1244 116 546 406 3 1 0 96

    3. One thread and two tables, the average response time is about 137ms.

    Thread1: access table t1 and table t2 alternately

    4. One thread and one table, the average response time is 77ms.

    Thread1: only access table t1

    5. We tried use two thread to access MyISAM table and Innodb table, there is no such affect. Thread1 read data from t1, thread2 read data from innodb table test, the response time for thread1 is also about 80ms.

    Thread1: access table t1
    Thread2: access table test (innodb)

    After that, we tried cache the table index by following, but things does not change:
    mysql> CACHE INDEX user_call_log_entry_1, user_call_log_entry_2, user_call_log_entry_3,user_call_log_entry_4 IN call_log_cache;
    +———————————+———-+———-+
    | Table | Op | Msg_type | Msg_text |
    +——————————–+———-+———-+
    | lps.user_1 | assign_to_keycache | status | OK |
    | lps.user_2 | assign_to_keycache | status | OK |
    | lps.user_3 | assign_to_keycache | status | OK |
    | lps.user_4 | assign_to_keycache | status | OK |
    +———————————+———-+———-+
    4 rows in set (0.02 sec)

    mysql> LOAD INDEX INTO CACHE lps.user_1,lps.user_2,lps.user_3,lps.user_4 IGNORE LEAVES;
    +————+————–+———-+———-+
    | Table | Op | Msg_type | Msg_text |
    +————+————–+———-+———-+
    | lps.user_1 | preload_keys | status | OK |
    | lps.user_2 | preload_keys | status | OK |
    | lps.user_3 | preload_keys | status | OK |
    | lps.user_4 | preload_keys | status | OK |
    +————+————–+———-+———-+
    4 rows in set (2 min 1.85 sec)

    Comment :: August 14, 2008 @ 1:40 am

  5. Yingkuan,

    In memory meaning tables were small enough to fit in memory completely.

    Comment :: August 14, 2008 @ 1:25 pm

  6. 6. yingkuan

    Hi Peter,
    Correct me if I am wrong.
    As I understand the MYISAM will only cache KEY in memory, when you do insert the actual data will write to table MYD file on disk directly. So in that sense, you can’t achieve full memory workload, correct?

    Comment :: August 14, 2008 @ 2:39 pm

  7. huilingwei,

    It is hard to comment on such issue for me without looking at in details. One thing I should note - the “wa” is very high in your test this means this is IO bound case which is very different story.

    Comment :: August 14, 2008 @ 4:01 pm

  8. MYD modification is buffered in OS cache and written to the hard drive asynchronously so if you have enough memory there are no IO related waits though writes do happen of course.

    Comment :: August 14, 2008 @ 4:16 pm

  9. 9. huilingwei

    Hi Peter,
    Thank you very much!

    My server memory is 1G, but the size of the data file for these ten small tables are 1.7G(ten files and each 170M), and the index files are about 400M (ten files and each 40M). From the mysql manual, mysql only cache index, and the data will use the OS cache. If there is any possible that the memory of my server is two low to cache all the DB data. So when I tried to query from multiple tables, the OS need switch between the data files, and cause the performance get worse.

    Today I also tested that reduce the data stored in DB, each table has 40M records, and the size for all the data files are about 200M. And during my testing, there is no difference for the response time no matter I access one tables or access 10 tables alternately. This time, the wa is always 10.
    r b swpd free buff cache si so bi bo in cs us sy id wa
    2 0 70508 146860 24260 653572 0 0 0 0 212 2167 62 38 0 0

    Thanks a lot!

    Comment :: August 15, 2008 @ 1:46 am

 



Subscribe without commenting