As an instructor with Percona I’m sometimes asked about the differences between the READ COMMITTED and REPEATABLE READ transaction isolation levels. There are a few differences between READ-COMMITTED and REPEATABLE-READ, and they are all related to locking.
Extra locking (not gap locking)
It is important to remember that InnoDB actually locks index entries, not rows. During the execution of a statement InnoDB must lock every entry in the index that it traverses to find the rows it is modifying. It must do this to prevent deadlocks and maintain the isolation level.
If you run an UPDATE that is not well indexed you will lock many rows:
update employees set store_id = 0 where store_id = 1; ---TRANSACTION 1EAB04, ACTIVE 7 sec 633 lock struct(s), heap size 96696, 218786 row lock(s), undo log entries 1 MySQL thread id 4, OS thread handle 0x7f8dfc35d700, query id 47 localhost root show engine innodb status
In the employees table, the column `store_id` is not indexed. Notice that the UPDATE has completed running (we are now running SHOW ENGINE …) but we are holding 218786 row locks and only one undo entry. This means that only one row was changed, but we are still holding extra locks. The heap size represents the amount of memory that has been allocated for locks.
In REPEATABLE READ every lock acquired during a transaction is held for the duration of the transaction.
In READ COMMITTED the locks that did not match the scan are released after the STATEMENT completes.
Here is the UPDATE statement repeated under READ COMMITTED:
---TRANSACTION 1EAB06, ACTIVE 11 sec 631 lock struct(s), heap size 96696, 1 row lock(s), undo log entries 1 MySQL thread id 4, OS thread handle 0x7f8dfc35d700, query id 62 localhost root show engine innodb status
You’ll notice that the heap size is the same, but we are now holding only one lock. In all transaction isolation levels InnoDB creates locks over every index entry scanned. The difference between the levels is that once the statement completes in READ COMMITTED mode, the locks are released for the entries that did not match the scan. Note that InnoDB does not immediately release the heap memory back after releasing the locks, so the heap size is the same as as that in REPEATABLE READ, but the number of locks held is lower (only one).
This means that in READ COMMITTED other transactions are free to update rows that they would not have been able to update (in REPEATABLE READ) once the UPDATE statement completes.
Consistent read views
In REPEATBLE READ, a ‘read view’ ( trx_no does not see trx_id >= ABC, sees < ABB ) is created at the start of the transaction, and this read view (consistent snapshot in Oracle terms) is held open for the duration of the transaction. If you execute a SELECT statement at 5AM, and come back in an open transaction at 5PM, when you run the same SELECT, then you will see the exact same resultset that you saw at 5AM. This is called MVCC (multiple version concurrency control) and it is accomplished using row versioning and UNDO information.
In REPEATABLE READ InnoDB also creates gap locks for range scans.
select * from some_table where id > 100 FOR UPDATE;
The above update will create a gap lock that will prevent any rows with id > 100 from being inserted into the table until the transaction rolls back or commits.
In the same transaction, if the SELECT … FOR UPDATE is run at 5AM, and an UPDATE is run at 5PM (“UPDATE some_table where id > 100″) then the UPDATE will change the same rows that SELECT FOR UPDATE locked at 5AM. There is no possibility of changing additional rows, because the gap after 100 was previously locked.
Non-repeatable reads (read committed)
In READ COMMITTED, a read view is created at the start of each statement. This means that a SELECT made at 5AM may show different results from the same SELECT run at 5PM, even in the same transaction. This is because in READ COMMITTED the read view for the transaction lasts only as long as each statement execution. As a result, consecutive executions of the same statement may show different results.
This is called the ‘phantom row’ problem.
In addition, in READ COMMITTED gap locks are never created. Since there is no gap lock, the example SELECT .. FOR UPDATE above will not prevent insertions of new rows into the table by other transactions. Thus, locking rows with SELECT … FOR UPDATE (ie “where id> 100″) and subsequently updating rows with “where id> 100″ (even in the same transaction) may result in more rows being updated than were earlier locked. This is because new rows may have been inserted in the table between the statements since there was no gap lock created for the SELECT … FOR UPDATE.