Step‑by‑step guide for Oracle DBAs on configuring Data Guard for disaster recovery, covering prerequisites, primary and standby setup, redo transport, role switchover, and performance tuning.
Why Oracle Data Guard Matters for Disaster Recovery
In today’s 24/7 environment, database downtime translates directly into revenue loss. Oracle Data Guard provides a proven, automated solution for high availability and disaster recovery (DR). It creates one or more physical or logical standby databases that maintain a synchronized copy of the primary. In the event of a hardware failure, natural disaster, or planned maintenance, the standby can be activated with minimal data loss. For Oracle DBAs, Data Guard is a cornerstone of a resilient architecture, complementing RMAN backup automation and ASM storage management.
Prerequisites and Planning
Infrastructure Checklist
- Two (or more) Oracle Database installations of the same release and patch level.
- Network connectivity with low latency (ideally < 20 ms) between primary and standby sites.
- Shared storage optional – if used, configure Oracle Automatic Storage Management (ASM) on both nodes.
- Operating system user accounts with identical
oracle
UID/GID. - Time synchronization via NTP.
Licensing Considerations
Data Guard is part of the Oracle Enterprise Edition option set. Verify that the Data Guard option and any required Active Data Guard licenses are in place before proceeding.
High‑Level Architecture
Data Guard consists of a primary database, one or more standby databases, and a redo transport mechanism. The primary generates redo, which is shipped to the standby via either synchronous (maximum protection) or asynchronous (maximum performance) modes. The standby applies redo using either physical recovery (Physical Standby) or SQL apply (Logical Standby). The following diagram illustrates a typical configuration:
Primary DB (Primary) → Redo Transport → Standby DB (Physical/Logical)
Choosing between physical and logical standby depends on your need for read‑only reporting (logical) versus exact block‑level fidelity (physical).
Configuring the Primary Database
1. Enable ARCHIVELOG Mode
The primary must run in ARCHIVELOG mode to generate redo archives for transport.
SQL> SHUTDOWN IMMEDIATE;
SQL> STARTUP MOUNT;
SQL> ALTER DATABASE ARCHIVELOG;
SQL> ALTER DATABASE OPEN;
SQL> ALTER SYSTEM SET LOG_ARCHIVE_DEST_1='LOCATION=USE_DB_RECOVERY_FILE_DEST' SCOPE=SPFILE;
SQL> ALTER SYSTEM SET LOG_ARCHIVE_DEST_STATE_1=ENABLE SCOPE=SPFILE;
2. Configure Redo Destination
Define a network destination for redo transport. The example below uses asynchronous transport for maximum performance.
SQL> ALTER SYSTEM SET LOG_ARCHIVE_DEST_2='SERVICE=standby_db LGWR ASYNC VALID_FOR=(ALL_LOGFILES,ALL_ROLES) DB_UNIQUE_NAME=STANDBY' SCOPE=SPFILE;
SQL> ALTER SYSTEM SET LOG_ARCHIVE_DEST_STATE_2=ENABLE SCOPE=SPFILE;
3. Verify ASM Disk Groups (Optional)
If you use ASM, ensure both primary and standby share compatible disk group names and attributes.
SQL> SELECT NAME, TYPE, TOTAL_MB FROM V$ASM_DISKGROUP;
Preparing the Standby Database
1. Duplicate the Primary
The quickest way to create a standby is with the RMAN DUPLICATE
command. This copies datafiles, control files, and redo logs while preserving the primary’s configuration.
RMAN> DUPLICATE TARGET DATABASE FOR STANDBY FROM ACTIVE DATABASE
DORECOVER SPFILE
NOFILENAMECHECK;
After duplication, rename the DB_UNIQUE_NAME on the standby:
SQL> ALTER SYSTEM SET DB_UNIQUE_NAME='STANDBY' SCOPE=SPFILE;
2. Configure Redo Apply
On the standby, enable managed recovery to apply incoming redo automatically.
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT;
3. Set Up the Listener
Ensure the standby listener is started and registers the standby service name used in LOG_ARCHIVE_DEST_2
.
LSNRCTL> START;
Managing Redo Transport and Apply
Data Guard Broker Overview
The Data Guard Broker (dgmgrl
) simplifies configuration, monitoring, and role transitions. It abstracts the underlying redo transport settings and provides a single point of control.
Creating a Broker Configuration
$ dgmgrl sys/password@primary_db
DGMGRL> CREATE CONFIGURATION dg_config AS PRIMARY DATABASE IS primary_db CONNECT IDENTIFIER IS primary_db;
DGMGRL> ADD DATABASE standby_db AS CONNECT IDENTIFIER IS standby_db MAINTAINED AS PHYSICAL;
DGMGRL> ENABLE CONFIGURATION;
Verifying Status
DGMGRL> SHOW CONFIGURATION;
DGMGRL> SHOW DATABASE VERBOSE standby_db;
The output should indicate Transport: ACTIVE
and Apply: APPLYING
. If any status shows DISABLED
or ERROR
, consult the Broker logs located in $ORACLE_BASE/diag/rdbms/*/trace
.
Switchover and Failover Procedures
Planned Switchover
A switchover is a controlled role reversal with zero data loss.
$ dgmgrl sys/password@primary_db
DGMGRL> SWITCHOVER TO standby_db;
After the command completes, the former standby becomes the primary. Verify the new primary’s open mode and redo apply status.
Unplanned Failover
If the primary crashes, perform a fast‑failover using the Broker.
$ dgmgrl sys/password@standby_db
DGMGRL> FAILOVER TO standby_db;
In environments without the Broker, you can manually issue:
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE FINISH;
SQL> ALTER DATABASE OPEN;
Then re‑establish the original primary as a new standby using the duplication steps.
Monitoring, Performance Tuning, and Best Practices
Key Metrics to Watch
- Lag – measured by
V$DATAGUARD_STATS
(e.g.,apply_lag
,transport_lag
). - Redo Rate –
V$SYSSTAT
columnredo generated
. - Network Throughput – ensure bandwidth matches redo volume (typical 10‑30 GB/hr for OLTP).
- ASM I/O – monitor
V$ASM_DISKSTAT
for latency spikes.
Tuning Tips
- Transport Mode: Use
SYNC
for zero data loss if the network latency is low; otherwiseASYNC
for higher throughput. - Redo Log Size: Larger redo logs reduce frequency of log switches, lowering transport overhead.
- ASM Striping: Increase the number of disks per group to spread I/O across more spindles or SSDs.
- RMAN Parallelism: When backing up the primary, set
PARALLELISM
to match CPU cores to avoid backup‑induced lag. - Archive Log Destination: Point
LOG_ARCHIVE_DEST_1
to a fast local disk andLOG_ARCHIVE_DEST_2
to the standby network path.
Integration with RMAN Backup Automation
Data Guard works hand‑in‑hand with RMAN. Schedule regular full and incremental backups on the standby to offload I/O from the primary.
RMAN> CONFIGURE CONTROLFILE AUTOBACKUP ON;
RMAN> BACKUP INCREMENTAL LEVEL 1 DATABASE TAG 'STANDBY_INC';
RMAN> BACKUP ARCHIVELOG ALL DELETE INPUT;
Because the standby is read‑only, you can run these jobs at any time without affecting primary performance.
Common Pitfalls and How to Avoid Them
- Mismatched Patch Levels: Always apply the same PSU or patch set to both primary and standby before enabling Data Guard.
- Incorrect DB_UNIQUE_NAME: The primary and standby must have distinct
DB_UNIQUE_NAME
values; duplication scripts often overlook this. - Firewall Blocking Ports: Verify that TCP ports 1521 (listener) and 1555 (if using standby redo transport) are open both ways.
- Insufficient Redo Log Size: Small redo logs cause frequent log switches, increasing network traffic and potential lag.
- Missing OS Time Sync: Desynchronised clocks cause
ORA‑00257
errors during log apply.
Putting It All Together – A Sample End‑to‑End Workflow
- Validate hardware, OS, and network prerequisites.
- Enable ARCHIVELOG mode on the primary and configure
LOG_ARCHIVE_DEST_2
. - Run RMAN
DUPLICATE
to create the standby. - Start managed recovery on the standby.
- Initialize Data Guard Broker and enable the configuration.
- Set desired transport mode (SYNC/ASYNC) based on RPO requirements.
- Configure regular RMAN backup jobs on the standby.
- Implement monitoring alerts for
apply_lag
andtransport_lag
. - Test switchover and failover procedures quarterly.
Following this checklist ensures that your Oracle environment meets industry‑standard disaster recovery objectives while keeping operational overhead low.
Conclusion
Oracle Data Guard remains the gold standard for DR in Oracle‑centric enterprises. By combining robust redo transport, automated role management via the Broker, and seamless integration with RMAN and ASM, DBAs can deliver near‑zero RPO and RTO. The steps outlined above provide a repeatable, auditable process that aligns with compliance frameworks and supports future scaling—whether you add logical standbys for reporting or expand to a multi‑site architecture.
Next Steps
Ready to fortify your database environment? Subscribe to our newsletter for deeper dives into performance tuning, backup automation, and cross‑platform replication (SQL Server, PostgreSQL). Follow us on LinkedIn for real‑time updates and community discussions.