Data Guard 26ai – #1: Faster role transitions

This post is part of a blog series.
I’ve already blogged about it on the official Oracle MAA Blog, but it bears repeating.

Role transitions (switchover, failover) are much faster in Oracle Data Guard 26ai.

Depending on the configuration and workload, they can be up to five times faster! No changes to the application code or configuration: you get this improvement out of the box.

Here’s an example of two identical configurations using 19.29 and 23.26.1, one PDB, and no application services (basically, an empty database):

Switchover in 19.29

Total: ~44 seconds

Switchover in 23.26.1

Total: < 20 seconds

😎

Ludovico

Principal Product Manager at Oracle
Ludovico is a member of the Oracle Database High Availability (HA), Scalability & Maximum Availability Architecture (MAA) Product Management team in Oracle. He focuses on Oracle Data Guard, Flashback technologies, and Cloud MAA.

4 thoughts on “Data Guard 26ai – #1: Faster role transitions”

    • Hi Chandan, as I mention in the referenced link:
      The faster transitions are the result of several bug fixes and code improvements that happened over the first few release updates of Oracle Database and Oracle Grid Infrastructure 23ai.

      * Parallelization of independent tasks.
      * Removal of unnecessary sleep routines or repeated checks.
      * Optimizations at the Database level (RAC, Multitenant, Recovery).
      * Optimizations at the Clusterware level.
      Some optimizations have been backported to Oracle Database 19.27 and are visible when used in conjunction with Oracle Grid Infrastructure 23ai.

  1. Interesting. Note, however, that the key indicator is the time to open the new primary. The time afterwards (stopping and opening the former primary) is irrelevant for the business/application. From the moment the new primary is open, your application can continue.
    Also, it is great that Oracle improves this feature, but if you do a switchover in a DB System in OCI via the web GUI, you lose tens of seconds (minutes?) before the switchover is actually started (due to the slow dcs framework in the background).

    • Good observations, Geert.
      The real indicator is how much time passes for an application between draining (or getting disconnected) and reconnecting. This is what the business should care about.
      As you say, the application can often reconnect to the new primary before the new standby starts recovering again.

      Regarding OCI, the same applies. If I click on “Switchover”, it might take, let’s say, an additional 30 seconds to start the actual switchover (I don’t know the actual numbers), and it might take a couple of minutes at the end to show the “green light” in the console, but the actual downtime perceived by the application will be the same.
      If you need to stop the application before the switchover, that’s another story. In that case, either you work on making the client configuration resilient to a switchover, or you can trigger the switchover out of band (not using the cloud control plane).
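
      To make the indicator discussed in this thread concrete (the time between an application losing its connection and reconnecting), here is a minimal Python sketch. It is not an Oracle tool: `poll` and `longest_outage` are hypothetical helper names, and `probe` is an assumption standing in for whatever your application does to check the database (for example, opening a session through the application service and running a trivial query). The sketch only records success or failure over time and reports the longest gap between two successful probes that had failures in between.

```python
import time
from typing import Callable, List, Tuple

# One observation: (timestamp in seconds, did the probe succeed?)
Sample = Tuple[float, bool]


def poll(probe: Callable[[], bool],
         duration_s: float,
         interval_s: float,
         clock: Callable[[], float] = time.monotonic,
         sleep: Callable[[float], None] = time.sleep) -> List[Sample]:
    """Call probe() repeatedly for duration_s seconds and record each outcome.

    probe is a hypothetical callable supplied by the caller; an exception
    raised by it is counted as a failed attempt.
    """
    samples: List[Sample] = []
    deadline = clock() + duration_s
    while clock() < deadline:
        started = clock()
        try:
            ok = bool(probe())
        except Exception:
            ok = False
        samples.append((started, ok))
        sleep(interval_s)
    return samples


def longest_outage(samples: List[Sample]) -> float:
    """Return the longest time between the last success before a run of
    failures and the first success after it.

    Failure runs that start before any success, or that never end within
    the samples, are ignored because their length cannot be bounded.
    """
    longest = 0.0
    last_ok = None
    in_outage = False
    for ts, ok in samples:
        if ok:
            if in_outage and last_ok is not None:
                longest = max(longest, ts - last_ok)
            last_ok = ts
            in_outage = False
        else:
            in_outage = True
    return longest
```

      Run `poll` from the application host while the switchover takes place, then pass the samples to `longest_outage`. The result is an upper bound on the perceived downtime, accurate to roughly one polling interval, and it is independent of how long the control plane takes before or after the role transition.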
