SQL Server In-Memory Multi-Version Concurrency Control

The estimated reading time for this post is 7 minutes

In the previous blog post regarding SQL Server In-Memory Internals, I wrote an overview of how In-Memory OLTP engine controls the concurrency without acquiring Lock and Latches, and how it handles the transactions. In this blog post I would like to write more in depth regarding concurrency control in this new SQL Server engine.

Basically the structure of the In-Memory Optimized rows reflects the engine behavior regarding concurrency control, lets take a look into the row structure.

inmemrowstructure

As above figure, the in-memory row structure is totally different from disk-base table row structure, it even does not use 8KB pages to store the data, instead it is allocating and storing data once it is needed unlike disk-based tables to prepare the data pages in buffer pool then read/write the data rows.

The row header consists of the following fields:-

  • Begin_Ts: It reflects the timestamp of the ‘Insert’ once the transaction is committed.
  • End_Ts: It reflects the timestamp of the ‘Delete’ once the transaction is committed.
  • StmtId: It is to avoid Halloween attack by keeping the unique ‘StmtId’, which is created the row.
  • IdxLinkCount: Represents the number of indexes.
  • Padding: Not yet used, extra bytes for future features.
  • Index Pointer: It is C language pointer, and it points to the index and next row.

The payload section of the row contains all the data columns and index keys, the payload format is vary and depends on the table schema and In-Memory Native Compiler generates DLLs (C Language) to describe the payload format for transactional operations.

As I mentioned above, SQL Server In-Memory OLTP does not need to acquire Lock or Latch to provide concurrency, instead it does row versioning by creating a new version of the row during modifications. In-Memory storage engine uses Begin_Ts and End_Ts fields to determine which row needs to be visible for the specific transaction.

The In-Memory storage engine totally is not aware of ‘Versioning’ concept and it only sees multiple rows that are connected to each other by index linkage.

Every database that supports Memory-Optimized tables has the following internal counters in order to support Multi-Version Concurrency Control (MVCC):-

  • Transaction-ID: This is incremental unique global value for In-Memory OLTP transactions and restarts once the SQL Server instance restarts.
  • Global Transaction Timestamp: This holds the unique ‘Commit’ timestamp and SQL Server increment it for every new transaction. SQL Server instance restart does not affect this counter.

SQL Server keeps track of each active transaction in an internal, global transaction table. When a transaction starts, SQL Server increments the Transaction-ID counter, and assigns a unique transaction ID to the transaction. When the transaction issues a COMMIT, SQL Server generates a commit timestamp, which it stores in the internal table initially, and then writes the value to the header of affected rows, once it validates the transaction and hardens the changes to the log.

inmemtrans

SQL Server also stores in the internal table a system timestamp that indicates the transaction’s start time; it is actually the timestamp for the last committed transaction at the time this transaction started and indicates when the transaction began, relative to the serialization order of the database. More than one transaction can have the same start timestamp. This start timestamp is never used anywhere else, and is only in the internal table while the transaction is active.

Now, I would like to write in more detail how SQL Server In-Memory OLTP storage engine does row versioning and handles the transactions. It is worth taking a preview here of a simple data modification example. Imagine we have two rows as the following in our memory-optimized table:-

inmemrow1

Now, we assume a transaction with Tx1 transaction Id and the timestamp of 90 is going to do the following operations:-

  1. Delete <John, London>
  2. Update <Hamid, Tehran> to <Hamid, Kuala Lumpur>
  3. Insert <James, New York>

Lets see how the row version evolve during regular-processing, validation and post-processing or commit phases and how SQL Server controls which row version needs to be visible for the active concurrent transactions.

inmemrow2

During regular-processing, SQL Server places the transaction id (Tx1) for End_Ts for deleted rows and for Begin_Ts for Inserted rows. At same time SQL Server uses an extra bit flag to represent that is transaction id instead of timestamp value.

For the delete operation, SQL Server locates the row via using an index and places the transaction id into the End_Ts field to mark it as deleted, and the update operation occurs as an atomic operation consists of two separate operations as insert followed by delete. At the end, our transaction COMMIT and SQL server generates commit timestamp at 120 (For example) and stores this value into the Global Transactions Table, this timestamp represents that transaction logically completed but not yet write into the row headers, because SQL Server not yet validates the transaction and also not yet written the transaction into the log file and SQL Server will not acknowledge the commit to the user until validation completes.

 SQL Server will optimistically assume that the transaction will actually commit, and makes the row available to other transactions as soon it receives the COMMIT.

Regular-Processing

Lets make it more interesting, imagine there is another transaction with transaction id Tx2 and wants to read the data before SQL Server validates the Tx1.

In Disk-based tables, SQL Server does pessimistic concurrency and holds Lock on data to avoid reader to read the data that is updating by writer. In contrast, thanks to the MVCC model, the same reader of a memory-optimized table simply continues processing. let’s assume Tx2 starts at timestamp 100, after Tx1 started but before Tx1 issues  a COMMIT. Tx2 will read the <John, London> row, and find that End-Ts contains a Transaction-ID, Tx1, indicating that the row may have been deleted. Tx2 locates  Tx1 in the Global Transaction Table and checks its status to determine if the deletion  of <John, London> is complete. In our example, Tx1 is still active and so Tx2 can access the <Hamid, Tehran> row.
When it accesses the <Hamid, Kuala Lumpur> row version, Tx2 will see the transaction id in Begin-Ts, check the Global Transaction Table, find Tx1 is still active, follow the index pointer in the header back to the previous row version, and instead return the row version <Hamid, Tehran>. Likewise, Tx2 will not return the row <James, New York>.

Now different scenario, lets assume that Tx2 is started at timestamp 121 after Tx1 Committed but SQL Server not yet validates Tx1. Base on our assumption, Tx2 can read data row that has Begin_Ts less than equal 121 and End_Ts more than 121, or infinite.

inmemrow3

Since SQL Server has yet to validate transaction Tx1, it registers a Commit Dependency between Tx2 and Tx1. This means that SQL Server will not complete validation of Tx2, nor acknowledge the commit of Tx2 to the user, until it completes validation of Tx1.

 Any blocking waits that arise from the resolution of commit dependencies will be minimal. Of course, a long-running transaction in an OLTP system is still going to cause some blocking, although never lock-related blocking.

Now imagine Tx2, starting at timestamp 100, instead of just reading the rows,  wants to update <Hamid, Tehran> to <Hamid, London>. At this stage, the End-Ts value for  the <Hamid, Tehran> row will contain a transaction ID and this, since SQL Server optimistically assumes that Tx1 will commit, immediately indicates a write-write conflict, and Tx2 will abort immediately.

Validation

Once the transaction is committed, SQL Server generates the commit timestamp in Global Transaction Table and lead the transaction enters the validation phase. In this phase SQL Server detects the potential transaction isolation level violations, for example:  Tx1 was accessing the memory-optimized table in REPEATABLE READ isolation. It reads a row value and then Tx2 updates that row value, which it can do because SQL Server acquires no locks in the MVCC model, and issues a commit before Tx1 commits. When Tx1 enters the validation phase, it will fail the validation check and SQL Server will abort the transaction. If there are no violations, SQL Server proceeds with other actions that will culminate in guaranteeing the durability of the transaction.

The following list is the summarized processes that SQL Server does during validation phase:-

  1. Validate the changes made by a transaction
  2. Wait for any commit dependencies to resolve
  3. Write the transaction to disk
  4. Mark the transaction commit as validated
  5. Clear dependencies

Post-Processing or Commit Phase

At this phase, SQL Server writes the commit timestamp into the row header of involved rows in the transaction, based on our example, the rows will be look like the following:-

inmemrow3

Conclusion

The SQL Server In-Memory OLTP engine supports true optimistic concurrency by using MVCC model and Bw-Tree row structure, based on in-memory row versioning. MVCC means that SQL Server can avoid read-write conflicts without the need for any locking or latching, and will raise write-write conflicts immediately, rather than after a delay.

Hamid J. Fard

I am SQL Server Data Platform Expert with more than 9 years’ of professional experience, I am currently Microsoft Certified Master: SQL Server 2008, Microsoft Certified Solutions Master: Charter-Data Platform, Microsoft Data Platform MVP and CIW Database Design Specialist. I also do Pig, Hive and Sqoop development regarding Hadoop Big Data platform. After a few years of being a production database administrator I jumped into the role of Data Platform Expert. Being a consultant allows me to work directly with customers to help solve questions regarding database issues for SQL Server.

More Posts

Follow Me:
FacebookLinkedIn


Leave a Comment

Be the First to Comment!

Notify of
avatar
1000
wpDiscuz