Windows Server 2016 Storage Replica technology enables synchronous replication of volumes between servers or clusters for disaster recovery. It also enables you to use asynchronous replication to create failover clusters that span two sites, with all nodes staying in sync.
Storage Replica supports synchronous and asynchronous replication:
- Synchronous replication mirrors data within a low-latency network site with crash-consistent volumes to ensure zero data loss at the file-system level during a failure.
- Asynchronous replication mirrors data across sites beyond metropolitan ranges over network links with higher latencies, but without a guarantee that both sites have identical copies of the data at the time of a failure.
This blog post shows how to implement a SQL Server multi-subnet failover cluster using the Windows Server 2016 Storage Replica feature. I build a SQL Server AlwaysOn Failover Cluster Instance (Active-Passive) on each site.
The following figure shows the overall architecture of the implementation:-
Let's Implement It!
So far you know what Storage Replica in Windows Server 2016 is. In this section I am going to implement the above SQL Server Multi-Subnet FCI by using Hyper-V, PowerShell and SQL Server. I assume that all the servers are clustered accordingly and that SQL Server is installed and configured based on best practices.
The minimum requirements to implement a Stretch Cluster are as follows:-
- Active Directory Domain Services forest (does not need to run Windows Server 2016).
- Two servers with Windows Server 2016 installed.
- Two sets of storage, using SAS JBODs, fibre channel SAN, iSCSI target, or local SCSI/SATA storage. The storage should contain a mix of HDD and SSD media. You will make each storage set available only to each of the servers, with no shared access.
- Each set of storage must allow creation of at least two virtual disks, one for replicated data and one for logs. The physical storage must have the same sector sizes on all the data disks. The physical storage must have the same sector sizes on all the log disks.
- At least one Ethernet/TCP connection on each server for synchronous replication, but preferably RDMA.
- Appropriate firewall and router rules to allow ICMP, SMB (port 445, plus 5445 for SMB Direct) and WS-MAN (port 5985) bi-directional traffic between all nodes.
- A network between servers with enough bandwidth to contain your IO write workload and an average of ≤5 ms round trip latency, for synchronous replication. Asynchronous replication does not have a latency recommendation.
- The replicated storage cannot be located on the drive containing the Windows operating system folder.
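A minimal sketch of preparing the nodes for the requirements above. The node names `KL-Node1` and `TEH-Node1` are hypothetical placeholders, and the feature list assumes a file-server based test setup; adjust both to your environment.

```powershell
# Hypothetical node names; replace with the servers in your environment.
$nodes = 'KL-Node1', 'TEH-Node1'

# Install Storage Replica and Failover Clustering on every node.
# -Restart reboots each node if the installation requires it.
foreach ($node in $nodes) {
    Install-WindowsFeature -ComputerName $node `
        -Name Storage-Replica, Failover-Clustering, FS-FileServer `
        -IncludeManagementTools -Restart
}

# Run this part once the nodes are back online:
# check that SMB (445) and WS-MAN (5985) are reachable on each node.
foreach ($node in $nodes) {
    foreach ($port in 445, 5985) {
        Test-NetConnection -ComputerName $node -Port $port |
            Select-Object ComputerName, RemotePort, TcpTestSucceeded
    }
}
```

Run the port check from each node against the others, since the traffic has to flow in both directions.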
Before enabling and configuring Storage Replica, we need to check whether our environment meets the minimum requirements. The following PowerShell command tests the environment for 3 minutes:-
Test-SRTopology -SourceComputerName KL-Storage.Fard.com -SourceVolumeName D: -SourceLogVolumeName E: -DestinationComputerName TEH-Storage.Fard.com -DestinationVolumeName D: -DestinationLogVolumeName E: -DurationInMinutes 3 -ResultPath C:\SRTEST
The following chart shows “Destination Data Disk Initial Sync Performance”:-
The following table shows the “Replication Write IO Latency”:-
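The chart and table come from the HTML report that Test-SRTopology writes into the result folder. A small sketch for opening the newest report, assuming the result path `C:\SRTEST` used in the command above:

```powershell
# Open the most recent HTML report written by Test-SRTopology.
# Assumes the report was written to C:\SRTEST (the -ResultPath used above).
Get-ChildItem -Path C:\SRTEST -Filter *.html |
    Sort-Object LastWriteTime |
    Select-Object -Last 1 |
    Invoke-Item
```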
To enable Storage Replica in a stretch cluster, we need to go through the following steps:-
Step 1: Create Windows Failover Cluster with all nodes.
Step 2: Add the source disks to “Cluster Shared Volumes” and configure “Replication” on the same CSV disks.
Step 3: Install SQL Server Failover Cluster Instance on Windows Failover Cluster.
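Steps 1 and 2 can also be done from PowerShell instead of Failover Cluster Manager. The following is only a sketch: the cluster name, node names, IP addresses, disk name, replication group names and volume paths are all hypothetical placeholders, not the values used in this lab.

```powershell
# All names, addresses and paths below are hypothetical placeholders.
$nodes = 'KL-Node1', 'KL-Node2', 'TEH-Node1', 'TEH-Node2'

# Step 1: validate the nodes, then create the cluster with one static IP per subnet.
Test-Cluster -Node $nodes
New-Cluster -Name 'SR-Cluster' -Node $nodes -StaticAddress '10.10.1.50', '10.20.1.50'

# Step 2: add the source data disk to Cluster Shared Volumes...
Add-ClusterSharedVolume -Name 'Cluster Disk 1'

# ...and replicate that CSV to the matching disk on the other site.
New-SRPartnership -SourceComputerName 'KL-Node1' -SourceRGName 'rg01' `
    -SourceVolumeName 'C:\ClusterStorage\Volume1' -SourceLogVolumeName 'E:' `
    -DestinationComputerName 'TEH-Node1' -DestinationRGName 'rg02' `
    -DestinationVolumeName 'D:' -DestinationLogVolumeName 'E:'

# Check the partnership and how much data is still left to copy.
Get-SRPartnership
(Get-SRGroup).Replicas |
    Select-Object DataVolume, ReplicationStatus, NumOfBytesRemaining
```

Step 3 is done with the regular SQL Server setup program once the replicated disk is available to the cluster.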
SQL Server AlwaysOn FCI – Failover Test
Once both sites are synchronized through the Storage Replica feature, we need to test SQL Server failover. To do so, I am going to shut down the [Kuala Lumpur – Malaysia] site and start the SQL Server role in the [Tehran – Iran] site. Let’s shut down the entire “Kuala Lumpur – Malaysia” site by turning off the VMs, as the following figure shows:-
Let’s start the SQL Server role in the [Tehran – Iran] site by using Failover Cluster Manager, as shown in the following figures:-
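The same check can be sketched in PowerShell, run from a surviving node. The role name `SQL Server (MSSQLSERVER)` is an assumption (it is the usual name for a default instance); use Get-ClusterGroup to find the real one in your cluster.

```powershell
# See which nodes are still up and where each role currently lives.
Get-ClusterNode  | Select-Object Name, State
Get-ClusterGroup | Select-Object Name, OwnerNode, State

# Bring the SQL Server role online on a surviving node.
# The role name is an assumption; replace it with the name shown above.
Start-ClusterGroup -Name 'SQL Server (MSSQLSERVER)'

# Confirm which side is now the replication source.
Get-SRPartnership | Select-Object SourceComputerName, DestinationComputerName
```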
How Storage Replica (Synchronous) Works!
Why Windows Server 2016 Storage Replica?
Storage Replica offers new disaster recovery and preparedness capabilities in Windows Server 2016. For the first time, Windows Server offers the peace of mind of zero data loss, with the ability to synchronously protect data on different racks, floors, buildings, campuses, counties, and cities. After a disaster strikes, all data will exist elsewhere without any possibility of loss. The same applies before a disaster strikes; Storage Replica offers you the ability to switch workloads to safe locations prior to catastrophes when granted a few moments’ warning – again, with no data loss.
Storage Replica allows more efficient use of multiple datacenters. By stretching clusters or replicating clusters, workloads can be run in multiple datacenters for quicker data access by local proximity users and applications, as well as better load distribution and use of compute resources. If a disaster takes one datacenter offline, you can move its typical workloads to the other site temporarily.
Storage Replica also provides the following features:-
- Zero data loss, block-level replication. With synchronous replication, there is no possibility of data loss. With block-level replication, there is no possibility of file locking.
- Guest and host. All capabilities of Storage Replica are exposed in both virtualized guest and host-based deployments. This means guests can replicate their data volumes even if running on non-Windows virtualization platforms or in public clouds, as long as using Windows Server 2016 in the guest.
- SMB3-based. Storage Replica uses the proven and mature technology of SMB 3, first released in Windows Server 2012. This means all of SMB’s advanced characteristics – such as multichannel and SMB Direct support on RoCE, iWARP, and InfiniBand RDMA network cards – are available to Storage Replica.
- Security. Unlike many vendors’ products, Storage Replica has industry-leading security technology baked in. This includes packet signing, AES-128-GCM full data encryption, support for Intel AES-NI encryption acceleration, and pre-authentication integrity man-in-the-middle attack prevention. Storage Replica utilizes Kerberos AES256 for all authentication between nodes.
- High performance initial sync. Storage Replica supports seeded initial sync, where a subset of data already exists on a target from older copies, backups, or shipped drives. Initial replication will only copy the differing blocks, potentially shortening initial sync time and preventing data from using up limited bandwidth. Storage Replica’s block checksum calculation and aggregation mean that initial sync performance is limited only by the speed of the storage and network.
- Consistency groups. Write ordering guarantees that applications such as Microsoft SQL Server can write to multiple replicated volumes and know the data will write on the destination server sequentially.
- Network Constraint. Storage Replica can be limited to individual networks by server and by replicated volumes, in order to provide application, backup, and management software bandwidth.
- Thin provisioning. Thin provisioning in Storage Spaces and SAN devices is supported, in order to provide near-instantaneous initial replication times under many circumstances.
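Two of the features above, replication mode and network constraint, can be set on an existing partnership from PowerShell. This is a sketch only: the computer names, replication group names and interface indexes are hypothetical placeholders.

```powershell
# Hypothetical names; replace with your own partnership details.
$partnership = @{
    SourceComputerName      = 'KL-Node1'
    SourceRGName            = 'rg01'
    DestinationComputerName = 'TEH-Node1'
    DestinationRGName       = 'rg02'
}

# Switch the partnership to asynchronous replication,
# for example when the link between sites has high latency.
Set-SRPartnership @partnership -ReplicationMode Asynchronous

# List the network interfaces to find the InterfaceIndex values to use.
Get-NetIPConfiguration | Select-Object InterfaceAlias, InterfaceIndex

# Restrict replication traffic to specific interfaces (indexes are examples).
Set-SRNetworkConstraint @partnership -SourceNWInterface 2 -DestinationNWInterface 3

# Review the constraint that is now in place.
Get-SRNetworkConstraint
```

For seeded initial sync, New-SRPartnership accepts a `-Seeded` switch when the partnership is first created.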