SQL Server AlwaysOn Availability Group Performance Internals

The estimated reading time for this post is 4 minutes

One of our customer requested us to showcase the SQL Server 2016/2017 AlwaysOn Availability Group performance increments in Synchronous replica scenarios, then I thought it is a good topic to write a blog post about. I have prepared a SQL Server AlwaysOn Availability Group on two different Virtual Machines (VM) as the following figure with 8vCores and 8GB memory each.

SQL Server 2016/2017 AOAG performance increment is on the following sections:-

  • Log Bytes Transportation Layer.
  • Parallel Redo Thread Per Replica Database.

Each of the above enhancements will be explained in this blog. Prior start experimenting, I would like to make you familiar with the testing environment. Each VM been installed with SQL Server 2017 and SQL Server 2014 Enterprise edition instance and AdventureWorks database has been restored and added as replica database into each Availability Group.

 

The testing is simple as BULK INSERT operation by the following TSQL script on AdventureWorks:-

USE AdventureWorks;
GO
WHILE(1=1) BEGIN
INSERT INTO [dbo].[DatabaseLog]
SELECT TOP 100 [PostTime]
      ,[DatabaseUser]
      ,[Event]
      ,[SCHEMA]
      ,[OBJECT]
      ,[TSQL]
      ,[XmlEvent]
  FROM [AdventureWorks].[dbo].[DatabaseLog] WITH (NOLOCK);
END
The above test script will be run by Seven (7) BATCH files to produce more transaction log records during our testing. The first test will be against SQL Server 2014 and then SQL Server 2017. Meanwhile, I did monitor the following performance counters to perform some comparison between SQL Server versions:-
  • Databases\Log Bytes Flushed/sec
  • Availability Replica\Bytes Sent to Replica/sec

The above performance counter shows that SQL Server 2014 performs log transport compression to reduce the packet counts transfer between Availability Group replicas. In our test case, the compression is 6.6X. SQL Server 2014 log record compression operation can lead to increase the transaction latency in primary replica.

Let’s run the same test against the SQL Server 2017, meanwhile I monitor the mentioned performance counters to make the comparison.

As you observe from the above performance counter graph, SQL Server 2017 does not perform log transport compression anymore by default in Synchronous mode. This decreases the transaction latency in primary replica. The log transport compression can be enabled by T9592. Keep in mind that log stream/transport compression is enabled by default in Asynchronous mode of replica and it can be disabled by T1462.

Compression is disabled by default for Automatic Seeding to reduce the CPU usage, you may turn in on by T9567 trace flag.

Beside the above enhancements in SQL Server AlwaysOn Availability Group, there is one more major enhancement that is quite game changer and makes every replica to be sync at almost close to real-time. In SQL Server 2016/2017 the Redo log is parallel thread unlike prior versions. The following text been snapshot from ERRORLOG file.

Redo operation uses a pool of threads for redo which is limited to 100 threads (by default). The number of threads that redo operation uses per replica database is up to half of the number of cores or maximum 16 threads per replica database. If the total number of required threads for a single instance exceeds 100, SQL Server uses a single redo thread for every remaining database. Each primary replica uses 1 Log Capture thread for each primary database. In addition, it uses 1 Log Send thread for each secondary database. Log send and Redo threads are released after ~15 seconds of inactivity. Remember that a backup on a secondary replica holds a thread on the primary replica for the duration of the backup operation.

In-Memory OLTP feature causes that redo operation become single-threaded. In fact, you don’t even have to use in-memory. It checks to see if an in-memory filegroup exists, and if it does, it operates redo serially.

The following script been used to capture the number of active redo threads on secondary replica:-

SELECT @@SERVERNAME;
GO
SELECT (COUNT(DISTINCT(os_thread_id)) -1) [Redo Threads] FROM #Temp;
SELECT DISTINCT Sysdatetime() AS [TIME],R.session_id,R.command,DB_NAME(R.database_id) [DBName],R.status,T.task_state , TH.thread_address, TH.os_thread_id 
INTO #temp
FROM sys.dm_exec_requests R 
INNER join sys.dm_os_tasks T ON T.session_id = R.session_id
INNER join sys.dm_os_threads TH ON TH.worker_address = T.worker_address
WHERE R.command = 'DB STARTUP'
 
 
WHILE(1=1) BEGIN
INSERT INTO #temp
 
SELECT DISTINCT Sysdatetime() AS [TIME],R.session_id,R.command,DB_NAME(R.database_id) [DBName],R.status,T.task_state , TH.thread_address, TH.os_thread_id 
FROM sys.dm_exec_requests R 
INNER join sys.dm_os_tasks T ON T.session_id = R.session_id
INNER join sys.dm_os_threads TH ON TH.worker_address = T.worker_address
WHERE R.command = 'DB STARTUP'
END
 
 
SELECT @@SERVERNAME;
GO
SELECT (COUNT(DISTINCT(os_thread_id)) -1) [Redo Threads] FROM #Temp;

The above figure shows that SQL Server 2014 is using single thread and SQL Server 2017 is using multi thread for redo operations. Hope this blog post was interesting to you. 

Hamid J. Fard

I am SQL Server Data Platform Expert with more than 9 years’ of professional experience, I am currently Microsoft Certified Master: SQL Server 2008, Microsoft Certified Solutions Master: Charter-Data Platform, Microsoft Data Platform MVP and CIW Database Design Specialist. I also do Pig, Hive and Sqoop development regarding Hadoop Big Data platform. After a few years of being a production database administrator I jumped into the role of Data Platform Expert. Being a consultant allows me to work directly with customers to help solve questions regarding database issues for SQL Server.

More Posts

Follow Me:
FacebookLinkedIn