To protect from site failures or human error, an additional precaution needs to be taken in a RAC environment to provide disaster recoverability. Oracle Enterprise Edition has offered a feature called Oracle Data Guard since version 7.3 to address this requirement.
In Oracle 7 and 8/8i, this feature was referred to as a standby database. The principle behind the technology is simple yet effective: an identical copy of the live (or primary) database (AKA the standby database) is instantiated in a remote data center. The standby database is constantly in the state of media recovery, unless opened for read-only access. Without the Active Data Guard option introduced in Oracle 11.1, there is an additional caveat you need to be aware of: while the database is opened in read-only mode, it doesn’t apply changes received from the primary database.
While not a problem per se, the fact that changes are not being applied can extend the time needed to transition to the standby database. This is because additional archived redo logs have to be applied unless you are willing to incur data loss. Without Active Data Guard, the database has to be in mount state for managed recovery to work. The mount state prevents users, except for those with sysdba privilege, from connecting to the database. Any attempt to do so will result in an ORA-1033 “Oracle initialization or shutdown in progress” message.
When the standby feature was introduced in Oracle 7.3, maintaining a standby database was a highly manual process: the database administrator was in charge of transferring archived redo logs generated on the primary database to the standby site using utilities such as rcp or ftp (rsync). Once the logs were on the standby site, the standby database had to be placed in recovery mode. The only possible action the administrator could take was to activate the standby database in order for it to assume the primary role. This process where the DBA copied logs was referred to as manual recovery.
Beginning with Oracle 8i, the standby database uses managed recovery to stay in sync with the primary database. Using Oracle Net*8 communication, the primary database ships changes to the standby database, which are subsequently applied to the data files to keep the systems in sync. The application of changes can be delayed to protect the system from the aforementioned user errors. A standby database can also used for reporting or backing up data; this removes some of the load from the primary database.
A further milestone was reached with Oracle 9i, which introduced the logical standby database and graceful switchover operations. It was also in Oracle 9i that the standby database feature was renamed to Data Guard. Users of Data Guard were also given another choice for transmitting redo information to the standby. In addition to the archiver, which traditionally shipped information to the standby database after an online redo log was archived, the log writer process could be used to perform the same task. Standby redo logs were introduced as the counterpart to the primary database’s online redo logs.
Instead of having to wait for a complete archived redo log to arrive, the redo stream could be written into a standby redo log, thus reducing the risk of data loss. Oracle 9i Database also introduced the Data Guard broker with support for Enterprise Manager, as well as a command-line tool to simplify the setup and management of standby databases.
Another noteworthy evolution came with Oracle 10g, when the Real Time Apply feature was integrated into the database kernel. Using standby redo logs on the standby database server, the redo stream arriving on the destination could be applied to the standby database immediately, without having to wait for the standby redo log to be archived and applied. This further reduces the possibility of data loss.
Below Figure illustrates the concepts for Oracle 11g, where redo generated by user activity on the primary database is transported via the Log Network Server (LNS0) process—not the log writer, as in previous versions—to the standby database’s Remote File Server (RFS) process. In turn, the RFS process writes the redo stream into standby redo logs. The managed recovery process (MRP0) on the standby database applies the new information as soon as it arrives. Once filled, the standby redo log is archived by one of the standby database’s archiver processes.
Provision of support for data corruption has been introduced in Oracle 11g Release 1. Setting the new db_lost_write_protect parameter helps prevent lost writes, and Data Guard will also try re-fetching corrupt blocks detected during the redo apply to a physical standby database, and vice versa. This is referred to as Automatic Block Media Recovery.
With Oracle 11g Release 2, the previous limit of 10 archive log destinations (both local and remote) has been lifted, and up to 30 standby databases can now form part of a Data Guard configuration. A cascaded standby database configuration in which a standby database passes on redo to another standby database is not possible (read: supported) if the primary database is a RAC 11.2 database.
Types of Standby Database
You can choose from the following four different types of standby databases:
• Physical standby database
• Snapshot standby database
• Logical standby database
• Transient logical standby database
Each of these will be covered in more detail in the sections that follow.
Physical Standby Database
The physical standby database was the first standby database option available. In all aspects, a physical standby database is an identical bit-for-bit copy of the primary database. All schema structures, database users, and segments are identical down to the block level.
A physical standby database is kept in sync with production through the application of redo (referred to as “redo apply”). The process employed is the same one that a database administrator would use to recover a live database after a media failure. Apart from disaster recovery scenarios, a standby database can be used for reporting and backup. With a little bit of manual effort, a physical standby database can also be used for testing hot fixes on production systems. You do this by activating it for read-write access after having taken it out of the Data Guard configuration. Once the testing of the hot fix on like-for-like production data is complete, the Flashback Database feature can be used to flash the database back to the point in time prior to its activation, followed by its conversion back to a physical standby. The downside to this procedure is that, while the database is open for read-write access, it won’t receive any archived logs from the (real) primary database. This can cause a lot of network traffic once the database has been converted back to a physical standby.
Snapshot Standby Database
The snapshot standby database achieves the exact same result as the physical standby database opened read-write for testing, as just described. However, the snapshot database doesn’t require that the administrator worry about the fine print and all the details. The snapshot standby will receive archived logs from production, significantly reducing the overhead during gap resolution. However, the archived redo logs received from production aren’t applied until after the snapshot standby database has been converted back to a physical standby database, so the time it takes for the standby database to get back in sync with production is proportional to the amount of redo to be applied.
When upgrading a database with a standby database(s) in place, the redo transport mechanism will ensure that the dictionary changes are propagated to all destinations for as long as the catalog upgrade script is executing on the primary site. This is true for both physical and snapshot standby database configurations. All you need to do is ensure that the Oracle binaries on the standby servers exactly match the binaries on the primary database.
You can have both single-instance standby databases and multi-node RAC systems for your disaster recovery solution. However, bear in mind that your standby database must to be able to cope with the workload in a disaster recovery situation. You are well advised to use identical hardware for both production and standby environments. If you use identical hardware, and your standby database is a RAC database as well, then all instances can receive redo from the primary database, thereby spreading the load. However, only one instance can apply redo.
Logical Standby Database
A logical standby database differs from a physical standby database in that it is not an exact 1:1 copy of database, but it is then converted for read-write access. At this stage, the primary database and logical standby deviate. Physical (and snapshot) standby databases stay synchronized through the application of redo logs. However, a logical standby database stays synchronized by having it execute all the same SQL statements as the primary database. This mechanism is often referred to as SQL Apply.
Internally, SQL Apply uses the log miner feature to extract SQL statements from the redo stream. It then applies SQL statements rather than redo to the standby database. Therefore, a logical standby database has the same data structure as the primary database, but the physical representation of the data in the database is likely to be different. There are also some restrictions as to which data types are supported on the primary database, and this list keeps growing from release to release.
Another big difference between a physical and logical standby database is the fact that a logical standby database is open read-write while it still receives changes from production. A logical standby database is unlikely to be used for disaster recovery purposes. Its main purpose is to provide an environment in which reporting can be offloaded from production as changes from the live system are fed into the database. This provides a high degree of data accuracy. The fact that the logical standby database is open read-write means that additional data structures such as indexes and materialized views can be created to speed up queries that would otherwise be too expensive to maintain on the primary database.
Finally, logical standby databases can be used as part of the process to upgrade primary databases to newer releases or to apply patch sets to a system with almost no downtime. This little used technique is referred to as a rolling upgrade in the Oracle documentation. The transient logical standby database discussed in the next section is what you really want to use in cases where you want to apply a rolling upgrade in your own environment.
Transient Logical Standby Database
Oracle has recognized that few businesses are willing to set up a logical standby database only for the rolling upgrade of an Oracle database. Setting up a logical standby database is not a trivial task, and maintaining a logical standby database requires close monitoring to check that all transactions have been applied. For this reason, Oracle 11g Release 1 provides the capability to transiently convert a physical standby database into a logical standby database for the duration of a rolling upgrade. After the upgrade, the logical standby database is converted back to its original role as a physical standby database.