A month ago, I saw this article published on the AWS architecture blog:
Disaster Recovery for Oracle Database on Amazon EC2 with Fast-Start Failover
I love seeing people suggesting Oracle Data Guard Fast-Start Failover for high availability. Nevertheless, there are a few problems with the architecture and steps proposed in the article.
I sent my comments via Disqus on the AWS blogging platform, but after a month, my comment was rejected, and the blog content hasn’t changed.
For this reason, I have no other place to post my comments but here…
- The link to the setup procedure is from 2009.
We have official documentation that we keep up to date. The Fast-Start Failover part:
https://docs.oracle.com/en/database/oracle/oracle-database/19/dgbkr/using-data-guard-broker-to-manage-switchovers-failovers.html#GUID-D26D79F2-0093-4C0E-98CD-224A5C8CBFA4
and the Best Practices guide:
https://docs.oracle.com/en/database/oracle/oracle-database/19/haovw/oracle-data-guard-best-practices.html#GUID-C3A78B07-6584-4380-8D53-E5B831A5894C
- The part about cascading standbys references a step-by-step guide from an external blog written many years ago for 11gR2.
- The DBMS_SERVICE doc is from 12cR1, while other links are from 21c doc or 19c doc. As of today, most implement 19c. That’s probably the version to use.
https://docs.oracle.com/en/database/oracle/oracle-database/19/arpls/DBMS_SERVICE.html#GUID-C11449DC-EEDE-4BB8-9D2C-0A45198C1928
- The steps used to create the database service do not include any HA property, which will make most efforts useless (see Table 153-6 in the link above).
- The article talks about TAF, but no steps exist to configure it. We haven’t recommended TAF since 12c anyway. Today (19c), the recommendation is TAC (Transparent Application Continuity).
https://www.oracle.com/docs/tech/application-checklist-for-continuous-availability-for-maa.pdf
- But, most important, TAF (or Oracle connectivity in general) does NOT require a host IP change! There is no need to change the DNS when using the recommended connection string with multiple address_lists.
- Some RedoRoutes examples are not correct. In this video I explain how they work and how to set them up:
https://www.youtube.com/watch?v=huG8JPu_s4Q
- The diagram shows the master observer together with the standby database, which is a bad practice. I explain why and how here:
https://www.youtube.com/watch?v=e81UPLfnLi0
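To illustrate the point about connection strings with multiple address_lists, here is a minimal Python sketch that assembles such a connect descriptor, with one ADDRESS_LIST per site. The host names, the service name, and the timeout and retry values are hypothetical placeholders chosen for the example; take the actual recommended values from the MAA checklist linked above.

```python
from typing import Sequence


def address_list(host: str, port: int = 1521) -> str:
    """Return one ADDRESS_LIST block pointing at a single listener or SCAN name."""
    return (
        "(ADDRESS_LIST=(LOAD_BALANCE=on)"
        f"(ADDRESS=(PROTOCOL=TCP)(HOST={host})(PORT={port})))"
    )


def connect_descriptor(hosts: Sequence[str], service_name: str, port: int = 1521) -> str:
    """Build a connect descriptor with one ADDRESS_LIST per site.

    The client tries the address lists in order and retries, so it reaches
    whichever site currently offers the service. No DNS change is involved.
    The timeout/retry numbers below are illustrative assumptions.
    """
    if not hosts:
        raise ValueError("at least one host is required")
    lists = "".join(address_list(h, port) for h in hosts)
    return (
        "(DESCRIPTION="
        "(CONNECT_TIMEOUT=90)(RETRY_COUNT=50)(RETRY_DELAY=3)"
        "(TRANSPORT_CONNECT_TIMEOUT=3)"
        f"{lists}"
        f"(CONNECT_DATA=(SERVICE_NAME={service_name})))"
    )


# Hypothetical names: one entry per Data Guard site.
descriptor = connect_descriptor(
    ["primary-scan.example.com", "standby-scan.example.com"],
    "sales_rw",
)
print(descriptor)
```

This only works as intended if the service is role-based, that is, started only on the database that currently holds the primary role, so that the listener at the standby site refuses the connection and the client moves on to the next address list.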
The central message is:
If you need to implement a complex architecture using a software solution, make sure that the practices suggested by the partner/integrator/3rd party match the ones from the software vendor. In the case of Oracle Data Guard, Oracle knows better 😉
Cheers
—
Ludovico
I agree with a lot of what you say here, with one big exception: Oracle DataGuard does not (yet?) provide “virtual IP” or VIP functionality for application clients which do not support Oracle’s TNS specifications, most notably older JDBC thin drivers, or those not integrated with the Oracle client software.
If the JDBC client only recognizes the older connect-string format of either…
jdbc:oracle:thin:@IP-hostname-or-address:port:SID
jdbc:oracle:thin:@//IP-hostname-or-address:port/service
…or if it is not possible to change the embedded connect-string, then the client cannot follow the primary database after a DataGuard failover or switchover without a DNS change.
Ironically, Oracle RAC has a “virtual IP” which is also known as the SCAN (a.k.a. Single Client Access Name).
For Oracle database deployments in Azure cloud, Microsoft recommends employing an Azure load balancer as a “virtual IP”. The load balancer is enabled by means of a simple script running on the VM on which a database resides, which manages a designated network port (e.g. port 63000) based on the value of the column DATABASE_ROLE in V$DATABASE. If DATABASE_ROLE has the value of “PRIMARY”, then the designated network port is held open, and the Azure load balancer monitoring that network port directs TCP traffic to that VM. If DATABASE_ROLE has any other value, or if the database instance is down, then the network port is closed, causing the Azure load balancer to not direct traffic there. This simple mechanism permits access by any and all programs to the DataGuard primary, wherever it is located.
The lack of “virtual IP” with DataGuard is a gaping functionality gap, but is easily resolved once identified. The “oraupdown.sh” script (referred to above) can be downloaded from GitHub at “https://github.com/tigormanmsft/oraupdown/tree/main”.
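The port-toggling mechanism described above can be sketched in a few lines of Python. This is not the oraupdown.sh script, only an illustration of the idea under stated assumptions: the `HealthPort` class and `should_accept_traffic` function are hypothetical names, and the role string is passed in by the caller, where a real script would read it with `SELECT database_role FROM v$database`.

```python
import socket
from typing import Optional


def should_accept_traffic(database_role: Optional[str]) -> bool:
    """True only when the local database reports the PRIMARY role.

    None stands for "instance down or role could not be queried".
    """
    return database_role is not None and database_role.strip().upper() == "PRIMARY"


class HealthPort:
    """Holds a TCP port open while the local database is the primary."""

    def __init__(self, port: int, host: str = "127.0.0.1") -> None:
        # Loopback keeps the sketch harmless; a real probe target would
        # have to listen on an address the load balancer can reach.
        self.host = host
        self.port = port
        self._sock: Optional[socket.socket] = None

    @property
    def is_open(self) -> bool:
        return self._sock is not None

    def sync(self, database_role: Optional[str]) -> None:
        """Open or close the port so it matches the current database role."""
        if should_accept_traffic(database_role):
            if self._sock is None:
                sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
                sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
                sock.bind((self.host, self.port))
                sock.listen()
                self._sock = sock
        elif self._sock is not None:
            self._sock.close()
            self._sock = None
```

A monitoring loop would call `sync()` every few seconds with the freshly queried role; the load balancer's health probe then sees the port open only on the VM hosting the primary.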
Hi Tim,
I’m not 100% sure about the exact versions, but //host:port/service has been supported since Oracle 8 or 8i (26+ years ago), and the full descriptor (DESCRIPTION=…) since 9i.
That’s true also for the JDBC drivers, with or without using TNS name resolution. So the technology gap is not in Data Guard, but in the applications that have not evolved over the last 25 years.
Over the last few years I have encountered a few such applications. I would never recommend doing a lift-and-shift to the cloud without modernizing the app and updating versions and drivers.
We also have solutions for the desperate cases: Oracle Connection Manager and Global Data Services. But that should be the last resort.
Changing the DNS or having a clueless load balancer in front of Data Guard… good luck with that!