July 30, 2014

Differences between READ-COMMITTED and REPEATABLE-READ transaction isolation levels

As an instructor with Percona I’m sometimes asked about the differences between the READ COMMITTED and REPEATABLE READ transaction isolation levels. There are a few differences, and they are all related to locking.

Extra locking (not gap locking)
It is important to remember that InnoDB actually locks index entries, not rows. During the execution of a statement InnoDB must lock every entry in the index that it traverses to find the rows it is modifying. It must do this to prevent deadlocks and maintain the isolation level.

If you run an UPDATE that is not well indexed you will lock many rows:
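The original post shows the session listing inline; a sketch of the kind of session that demonstrates this, assuming an employees table with an unindexed store_id column (the names come from the example below, but the UPDATE itself and the store_id values are illustrative guesses), is:

```sql
-- REPEATABLE READ is InnoDB's default isolation level
START TRANSACTION;

-- store_id is not indexed, so InnoDB scans (and locks)
-- every clustered-index entry while looking for matches
UPDATE employees SET store_id = 2 WHERE store_id = 5;

-- with the transaction still open, inspect the lock state
SHOW ENGINE INNODB STATUS\G
-- the TRANSACTIONS section reports something like:
--   ... 218786 row lock(s), undo log entries 1
```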

In the employees table, the column store_id is not indexed. Notice that the UPDATE has completed running (we are now running SHOW ENGINE INNODB STATUS), yet we are holding 218786 row locks and only one undo entry. This means that only one row was changed, but we are still holding locks on every entry that was scanned. The heap size represents the amount of memory that has been allocated for locks.

In REPEATABLE READ every lock acquired during a transaction is held for the duration of the transaction.

In READ COMMITTED the locks that did not match the scan are released after the STATEMENT completes.

Here is the UPDATE statement repeated under READ COMMITTED:
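The post’s listing is not reproduced here; a plausible sketch (the UPDATE statement and the reported counts are illustrative) is:

```sql
SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;
START TRANSACTION;

UPDATE employees SET store_id = 2 WHERE store_id = 5;

SHOW ENGINE INNODB STATUS\G
-- the TRANSACTIONS section now reports something like:
--   ... 1 row lock(s), undo log entries 1
-- (the lock heap size is unchanged: InnoDB keeps the allocated memory)
```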

You’ll notice that the heap size is the same, but we are now holding only one lock. In all transaction isolation levels InnoDB creates locks over every index entry scanned. The difference between the levels is that once the statement completes in READ COMMITTED mode, the locks are released for the entries that did not match the scan. Note that InnoDB does not immediately release the heap memory after releasing the locks, so the heap size is the same as that in REPEATABLE READ, but the number of locks held is lower (only one).

This means that in READ COMMITTED, once the UPDATE statement completes, other transactions are free to update rows that (in REPEATABLE READ) they would not have been able to update.


Consistent read views
In REPEATABLE READ, a ‘read view’ ( trx_no does not see trx_id >= ABC, sees < ABC ) is created at the start of the transaction, and this read view (a consistent snapshot, in Oracle terms) is held open for the duration of the transaction. If you execute a SELECT statement at 5AM, come back to the still-open transaction at 5PM, and run the same SELECT, you will see the exact same result set that you saw at 5AM. This is called MVCC (multi-version concurrency control) and it is accomplished using row versioning and UNDO information.
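The 5AM/5PM behavior can be sketched with two sessions (the table and column names here are illustrative, not from the original post):

```sql
-- Session A (REPEATABLE READ, the default):
START TRANSACTION;
SELECT COUNT(*) FROM orders;   -- 5AM: suppose this returns 1000

-- Session B, some time later:
INSERT INTO orders (id) VALUES (1001);
COMMIT;

-- Session A, at 5PM, same open transaction:
SELECT COUNT(*) FROM orders;   -- still returns 1000: same read view
COMMIT;
SELECT COUNT(*) FROM orders;   -- new snapshot: now sees 1001 rows
```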

In REPEATABLE READ InnoDB also creates gap locks for range scans.
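The range-scan statement the next paragraph refers to is of this general shape (`some_table` appears later in the post; the exact statement is a guess):

```sql
-- REPEATABLE READ: the range scan takes next-key (gap) locks
START TRANSACTION;
SELECT * FROM some_table WHERE id > 100 FOR UPDATE;
-- the gap above id = 100 is now locked: other transactions cannot
-- insert rows with id > 100 until this transaction commits or rolls back
```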

The above SELECT … FOR UPDATE will create a gap lock that will prevent any rows with id > 100 from being inserted into the table until the transaction rolls back or commits.

In the same transaction, if the SELECT … FOR UPDATE is run at 5AM, and an UPDATE is run at 5PM (“UPDATE some_table WHERE id > 100”), then the UPDATE will change the same rows that the SELECT … FOR UPDATE locked at 5AM. There is no possibility of changing additional rows, because the gap after 100 was previously locked.


Non-repeatable reads (read committed)
In READ COMMITTED, a read view is created at the start of each statement. This means that a SELECT made at 5AM may show different results from the same SELECT run at 5PM, even in the same transaction. This is because in READ COMMITTED the read view lasts only as long as each statement execution. As a result, consecutive executions of the same statement may show different results; this is a non-repeatable read.

When the difference consists of new rows inserted by other transactions between the two reads, it is called the ‘phantom row’ problem.

In addition, in READ COMMITTED gap locks are never created. Since there is no gap lock, the example SELECT … FOR UPDATE above will not prevent insertions of new rows into the table by other transactions. Thus, locking rows with SELECT … FOR UPDATE (i.e. “WHERE id > 100”) and subsequently updating rows with “WHERE id > 100” (even in the same transaction) may result in more rows being updated than were earlier locked. This is because new rows may have been inserted between the two statements, since no gap lock was created for the SELECT … FOR UPDATE.
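The same two-session pattern under READ COMMITTED can be sketched like this (the table and column names are illustrative):

```sql
-- Session A (READ COMMITTED):
SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;
START TRANSACTION;
SELECT * FROM some_table WHERE id > 100 FOR UPDATE;  -- locks matching rows only, no gap lock

-- Session B: the insert is NOT blocked, because no gap lock exists
INSERT INTO some_table (id) VALUES (150);
COMMIT;

-- Session A: this now also updates the row Session B just inserted
UPDATE some_table SET padding = 'x' WHERE id > 100;  -- 'padding' is an illustrative column
COMMIT;
```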

About Justin Swanhart

Comments

  1. As a side note, if binary logging is enabled, READ-COMMITTED only behaves as you describe when using MIXED or RBR. When using SBR, in order to guarantee correct ordering in the binary log, a gap lock is still taken.

  2. I accidentally posted before editing my comment…

    What I meant to say is that, for *certain* operations, particularly INSERT … SELECT, the locking benefits of READ COMMITTED are only realized when also using MIXED or RBR replication formats because of the need to enforce correct ordering in the binary log for consistent execution on slaves.

  3. hassan usmani says:

    NEED HELP!!!

    if (isset($_POST['go'])) {

    $status = $_POST['status'];
    if ($status == 4){
    $status = '';
    }

    if ($topic == 100 && $subtopic == '' && $status == '') {
    $queryString = "select * from tblquiz where userid='$_SESSION[numericuserid]' LIMIT $start, $limit";
    $countString = "select * from tblquiz where userid='$_SESSION[numericuserid]'";
    } // NOTE : THIS QUERY IS WORKING PERFECT

    elseif ($topic !== '' && $subtopic !== '' && $status !== '') {
    $queryString = "select * from tblquiz where qstatus=$status AND qtopic='$topic[topicname]' AND subtopic='$_POST[subtopic]' AND userid='$_SESSION[numericuserid]' LIMIT $start, $limit";
    $countString = "select * from tblquiz where qstatus=$status AND qtopic='$topic[topicname]' AND subtopic='$_POST[subtopic]' AND userid='$_SESSION[numericuserid]'";
    } // NOTE : THIS QUERY IS WORKING PERFECT

    elseif ($topic !== '' && $subtopic !== '' && $status == '') {
    $queryString = "select * from tblquiz where qtopic='$topic[topicname]' AND subtopic='$_POST[subtopic]' AND userid='$_SESSION[numericuserid]' LIMIT $start, $limit";
    $countString = "select * from tblquiz where qtopic='$topic[topicname]' AND subtopic='$_POST[subtopic]' AND userid='$_SESSION[numericuserid]'";
    } // NOTE : THIS QUERY IS WORKING PERFECT

    elseif ($topic == 100 && $subtopic == '' && $status !== '') {
    $queryString = "select * from tblquiz where qstatus= $status AND userid='$_SESSION[numericuserid]' LIMIT $start, $limit";
    $countString = "select * from tblquiz where qstatus= $status AND userid='$_SESSION[numericuserid]' ";
    } // NOTE : THIS QUERY IS NOT WORKING
    }

  4. reco says:

    Nice. The implementation of SELECT … FOR UPDATE in Oracle Database (11gR2) is not the same as in MySQL 5.0 for the READ COMMITTED isolation level.
    SELECT … FOR UPDATE in MySQL does not lock the rows of interest to the update (I set isolation level = READ COMMITTED in MySQL for 2 sessions updating the same table, with autocommit=0). But in Oracle this does happen …

  5. raul says:

    Hi Justin. Reading through your post I guess you can answer this question for me.
    1) I read a row off an InnoDB table.
    2) The data is then displayed onscreen.
    3) The user decides not to change anything but anyway hits the “commit to disk” button.

    In order to avoid a pointless commit I’m planning to compare the old data and the new data on the client, but it’s an awful lot of work.

    My question is: can MySQL ignore the commit request by comparing the data it read before and realizing it hasn’t changed at all? I’ll surely appreciate your answer.

  6. Justin,

    The READ COMMITTED phantom issue only occurs within the same transaction when you read the data at two different times, between which a different transaction could have altered (inserted, updated, deleted) that data.

    One solution to this issue is to dump the data you need from the lookup table(s) (only the rows and columns you need) into a temporary table, and use that copy to perform the work in your transaction.

    No matter what happens to the lookup table(s) involved in the transaction, your data won’t be affected, and you prevent locking at the same time.

    By using read committed, you can capture data into temp table without creating potential locks. After that, you’re free to go.
    If you need a fresher copy of the data, you can always grab it along the way.

    The key is to dump only what you need and hopefully use memory tables.
