SQL Server Always On Setup Guide

Why Choose Always On Availability Groups?

Always On Availability Groups (AG) are Microsoft’s native high‑availability and disaster‑recovery (HADR) solution for SQL Server. They provide:

  • Automatic failover with synchronous‑commit replicas.
  • Read‑only routing for reporting workloads.
  • Granular failover at the database level, not the instance level.
  • Support for up to eight secondary replicas (SQL Server 2019+).

For DBAs accustomed to Oracle Data Guard or PostgreSQL streaming replication, AGs feel familiar yet bring unique Windows‑centric capabilities such as Windows Server Failover Clustering (WSFC) and integrated listener support.

Prerequisites Overview

1. SQL Server Edition & Version

Enterprise edition is required for full AG functionality (basic AGs are available in Standard edition from SQL Server 2016 SP1 onward). Ensure all nodes run the same major version and build.

2. Windows Server Failover Clustering (WSFC)

AGs rely on a WSFC for quorum, node health detection, and resource management. Install the Failover-Clustering feature on each node and configure a cluster with a dedicated IP address.

3. Network & Storage Requirements

  • Separate subnets for data replication (private) and client access (public) are recommended.
  • Use high‑throughput, low‑latency links (10 GbE or better).
  • Do not place database files on shared storage; each replica must have local storage (e.g., SAN LUNs or direct‑attached NVMe).

4. Domain & Permissions

All nodes must be members of the same Active Directory domain. The SQL service account requires Log on as a service and Log on as a batch job rights, plus permission to create and manage the AG listener DNS record.

Step‑by‑Step Configuration

2.1 Create the WSFC Cluster


# PowerShell – create a new cluster
Import-Module FailoverClusters
New-Cluster -Name SQLAGCluster -Node Node01,Node02 -StaticAddress 10.0.0.100 -NoStorage

2.2 Enable Always On Feature on Each Instance

Open SQL Server Configuration Manager → SQL Server Services → Right‑click the instance → Properties → “Always On High Availability” → Check “Enable Always On”. Restart the service.

2.3 Prepare Databases for Replication

  1. Set the recovery model to FULL.
  2. Take a full backup and a subsequent transaction‑log backup.
  3. Restore the backup on each secondary with NORECOVERY.

-- Primary
ALTER DATABASE MyDB SET RECOVERY FULL;
BACKUP DATABASE MyDB TO DISK='\\backup\MyDB_Full.bak';
BACKUP LOG MyDB TO DISK='\\backup\MyDB_Log.trn';
— Secondary
RESTORE DATABASE MyDB FROM DISK=’\\backup\MyDB_Full.bak’ WITH NORECOVERY;
RESTORE LOG MyDB FROM DISK=’\\backup\MyDB_Log.trn’ WITH NORECOVERY;

2.4 Create the Availability Group

Use SQL Server Management Studio (SSMS) or T‑SQL. The example below creates an AG named AG_Sales with one synchronous‑commit secondary.


CREATE AVAILABILITY GROUP AG_Sales
FOR DATABASE MyDB
REPLICA ON
N'Node01' WITH (
ENDPOINT_URL = N'tcp://Node01.domain.com:5022',
AVAILABILITY_MODE = SYNCHRONOUS_COMMIT,
FAILOVER_MODE = AUTOMATIC,
BACKUP_PRIORITY = 1,
PRIMARY_ROLE (ALLOW_CONNECTIONS = ALL),
SECONDARY_ROLE (ALLOW_CONNECTIONS = READ_ONLY)
),
N'Node02' WITH (
ENDPOINT_URL = N'tcp://Node02.domain.com:5022',
AVAILABILITY_MODE = SYNCHRONOUS_COMMIT,
FAILOVER_MODE = AUTOMATIC,
BACKUP_PRIORITY = 2,
PRIMARY_ROLE (ALLOW_CONNECTIONS = ALL),
SECONDARY_ROLE (ALLOW_CONNECTIONS = READ_ONLY)
);

2.5 Add the Listener

The listener is a virtual network name that clients use to connect. It abstracts the underlying replica IP.


ALTER AVAILABILITY GROUP AG_Sales
ADD LISTENER N'AG_Sales_Listener' (
WITH IP (('10.0.0.150', 1433))
);

2.6 Join Secondary Replicas

After the primary AG is created, join each secondary:


ALTER AVAILABILITY GROUP AG_Sales JOIN;

2.7 Verify the Configuration


SELECT ag.name, ar.replica_server_name, ar.role_desc, ar.availability_mode_desc,
ar.failover_mode_desc, dr.synchronization_state_desc
FROM sys.availability_groups ag
JOIN sys.availability_replicas ar ON ag.group_id = ar.group_id
JOIN sys.dm_hadr_database_replica_states dr ON ar.replica_id = dr.replica_id;

Best Practices for Production Deployments

  • Quorum Configuration: Use a Node and Disk Majority model, or add a File Share Witness on a separate server to avoid split‑brain scenarios.
  • Endpoint Security: Secure the AG endpoint with a certificate and enforce encryption (TLS).
  • Backup Strategy: Perform backups on the primary replica by default; however, you can offload backups to a secondary to reduce I/O pressure. This mirrors the concept of RMAN backup automation in Oracle.
  • Performance Tuning: Monitor log_send_rate, redo_queue_size, and replica_latency via DMVs such as sys.dm_hadr_database_replica_states. Adjust max_transmission_size and network QoS if needed.
  • Read‑Only Routing: Define routing lists to balance reporting queries across secondary replicas.
  • Maintenance Mode: Use ALTER AVAILABILITY GROUP … SET (AUTOMATIC_SEEDING = OFF) before applying patches, then re‑seed as needed.

Failover Testing

Regularly test both planned and unplanned failovers to ensure SLA compliance.

Planned Manual Failover


ALTER AVAILABILITY GROUP AG_Sales FAILOVER;

Observe the role transition in the SSMS dashboard or via the DMV query shown earlier.

Simulating Unplanned Failover

Stop the SQL service on the primary node or disconnect its network adapter. The WSFC will detect the failure and promote the secondary automatically (if configured for automatic failover).

Monitoring & Alerting

Integrate AG health checks with your existing monitoring stack (e.g., System Center Operations Manager, Prometheus exporters, or third‑party tools). Key metrics include:

  • Replica health (online/offline)
  • Data synchronization state
  • Listener connection counts
  • Redo latency (ms)

Comparative Note: Oracle DBA & PostgreSQL

While Oracle Data Guard uses redo transport and apply, AGs rely on log send/redo apply via the endpoint. PostgreSQL’s streaming replication is analogous but lacks automatic listener routing and integrated read‑only load balancing out‑of‑the‑box. Understanding these parallels helps multi‑platform DBAs design consistent HADR policies across heterogeneous environments.

Common Pitfalls and How to Avoid Them

  1. Mismatched SQL Server builds: Even minor patch differences prevent AG creation. Use SELECT @@VERSION to verify.
  2. Incorrect quorum: A two‑node cluster without a witness can lose quorum. Always add an odd number of voting members.
  3. Endpoint port conflicts: Default is 5022; ensure the port is open and not used by other services.
  4. Backup on secondary without proper permissions: The SQL service account on the secondary must have write access to the backup location.
  5. Network latency > 2 seconds: Synchronous commit may degrade performance; consider asynchronous mode for geo‑distributed replicas.

Automation Opportunities

Leverage PowerShell DSC, Azure Automation, or Terraform to provision clusters, endpoints, and AG objects. Below is a minimal PowerShell snippet to add a new database to an existing AG:


$agName = 'AG_Sales'
$dbName = 'Orders'
Add-SqlAvailabilityDatabase -Path 'SQLSERVER:\SQL\Node01\DEFAULT' `
-AvailabilityGroup $agName -Database $dbName

Conclusion

SQL Server Always On Availability Groups provide a robust, flexible platform for high availability and disaster recovery. By following the prerequisites, configuring WSFC correctly, and adhering to best‑practice guidelines, DBAs can achieve near‑zero RPO/RTO, offload reporting workloads, and maintain a consistent backup strategy comparable to Oracle’s RMAN automation or PostgreSQL’s streaming replication.

Stay ahead of the curve—subscribe to our newsletter for the latest DBA insights, and follow Person‑IT on LinkedIn for weekly tips on performance tuning, backup automation, and multi‑platform database management.

Leave A Comment

All fields marked with an asterisk (*) are required

plugins premium WordPress