Data Guard, Easy Connect and the Observer for multiple configurations

EZConnect

One of the challenges of automation in bin Oracle Environments is dealing with tnsnames.ora files.
These files might grow big and are sometimes hard to distribute/maintain properly.
The worst is when manual modifications are needed: manual operations, if not made carefully, can screw up the connection to the databases.
The best solution is always using LDAP naming resolution. I have seen customers using OID, OUD, Active Directory, openldapd, all with a great level of control and automation. However, some customer don’t have/want this possibility and keep relying on TNS naming resolution.
When Data Guard (and eventually RAC) are in place, the tnsnames.ora gets filled by entries for the DGConnectIdentifiers and StaticConnectIdentifier. If I add the observer, an additional entry is required to access the dbname_CFG service created by the Fast Start Failover.

Actually, all these entries are not required if I use Easy Connect.

My friend Franck Pachot wrote a couple of nice blog posts about Easy Connect while working with me at CERN:
https://medium.com/@FranckPachot/19c-easy-connect-e0c3b77968d7

https://medium.com/@FranckPachot/19c-ezconnect-and-wallet-easy-connect-and-external-password-file-8e326bb8c9f5

Basic Data Guard configuration

The basic configuration with Data Guard is quite simple to achieve with Easy Connect. In this examples I have:
– The primary database TOOLCDB1_SITE1
– The duplicated database for standby TOOLCDB1_SITE2

After setting up the static registration (no Grid Infrastructure in my lab):

and copying the passwordfile, the configuration can be created with:

That’s it.

Now, if I want to have the configuration observed, I need to activate the Fast Start Failover:

With just two databases, FastStartFailoverTarget is not explicitly needed, but I usually do it as other databases might be added to the configuration in the future.
After that, the broker complains that FSFO is enabled but there is no observer yet:

 

Observer for multiple configurations

This feature has been introduced in 12.2 but it is still not widely used.
Before 12.2, the Observer was a foreground process: the DBAs had to start it in a wrapper script executed with nohup in order to keep it live.
Since 12.2, the observer can run as a background process as far as there is a valid wallet for the connection to the databases.
Also, 12.2 introduced the capability of starting multiple configurations with a single dgmgrl command: “START OBSERVING”.

For more information about it, you can check the documentation here:
https://docs.oracle.com/en/database/oracle/oracle-database/19/dgbkr/using-data-guard-broker-to-manage-switchovers-failovers.html#GUID-BC513CDB-1E06-4EB3-9FE1-E1331E15E492

How to set it up with Easy Connect?

First, I need a wallet. And here comes the first compromise:
Having a single dgmgrl session to start all my configurations means that I have a single wallet for all the databases that I want to observe.
Fair enough, all the DBs (CDBs?) are managed by the same team in this case.
If I have only observers on my host I can easily point to the wallet from my central sqlnet.ora:

Otherwise I need to create a separate TNS_ADMIN for my observer management environment.
Then, I create the wallet:

Now I need to add the connection descriptors.

Which connection descriptors do I need?
The Observer uses the DGConnectIdentifier to keep observing the databases, but needs a connection to both of them using the TOOLCDB1_CFG service (unless I specify something different with the broker configuration property ConfigurationWideServiceName) to connect to the configuration and get the DGConnectIdentifier information. Again, you can check it in the doc. or the note Oracle 12.2 – Simplified OBSERVER Management for Multiple Fast-Start Failover Configurations (Doc ID 2285891.1)

So I need to specify three secrets for three connection descriptors:

The first one will be used for the initial connection. The other two to observe the Primary and Standby.
I need to be careful that the first EZConnect descriptor matches EXACTLY what I put in observer.ora (see next step) and the last two match my DGConnectIdentifier (unless I specify something different with ObserverConnectIdentifier), otherwise I will get some errors and the observer will not observe correctly (or will not start at all).

The dgmgrl needs then a file named observer.ora.
$ORACLE_BASE/admin/observers or the central TNS_ADMIN would be good locations, but what if I have observers that must be started from multiple Oracle Homes?
In that case, having a observer.ora in $ORACLE_HOME/network/admin (or $ORACLE_BASE/homes/{OHNAME}/network/admin/ if Read-Only Oracle Home is enabled) would be a better solution: in this case I would need to start one session per Oracle Home

The content of my observer.ora must be something like:

This is the example for my configuration, but I can put as many (CONFIG=…) as I want in order to observe multiple configurations.
Then, if everything is configured properly, I can start all the observers with a single command:

Troubleshooting

If the observer does not work, sometimes it is not easy to understand the cause.

  • Has SYSDG been granted to SYSDG user? Is SYSDG account unlocked?
  • Does sqlnet.ora contain the correct wallet location?
  • Is the wallet accessible in autologin?
  • Are the entries in the wallet correct? (check with “sqlplus /@connstring as sysdg”)

Missing pieces

Here, a few features that I think would be a nice addition in the future:

  • Awareness for the ORACLE_HOME to be used for each observer
  • Possibility to specify a different TNS_ADMIN per observer (different wallets)
  • Integration with Grid Infrastructure (srvctl add observer…) and support for multiple observers

Ludovico

Script to check Data Guard status from SQL

In a previous blog post I have explained how to get the basic configuration from x$drc and display something like:

There are other possibilities, by using the DBMS_DRS PL/SQL package.

The package is quite rich. In order to get more details, I use CHECK_CONNECT to check the connectivity to the member databases:

example:

In the first case I get no exceptions, that means that the database is reachable using the DGConnectIdentifier specified in the configuration (‘TOOLCDB1_SITE2’ is my database name in the configuration, it is NOT a TNS entry. I use EZConnect in my lab).

In the second case I specify a database that is not in the configuration.

In the third case, it looks like the database is down (no service), or the DGConnectIdentifier is not correct.

 

GET_PROPERTY_OBJ is useful to get e single property of a database/instance:

Example:

Here I have, for the primary (the object_id from x$drc), a TransportLagThreshold of 30 seconds.

DO_CONTROL does a specific check and returns a document with the results:

The problem is… what’s the format for indoc?

To get the correct format, I have enabled sql trace to get the executions, with bind variables, of the dgmgrl commands. It happens that the input format is XML and the output format is HTML.

This is how you can get the LogXptStatus, for example:

 

The big script

So I said… why not trying to have a comprehensive SQL script to check a few vital statuses of Data Guard?

This is the script that came out:

Of course, it is not perfect (many checks missing: FSFO readiness, observer checks, etc.), but it is good enough for base monitoring. Also, it’s faster than a normal shell+dgmgrl script.

Output on a Primary database:

Output on a standby database:

In case of errors (e.g. standby listener stopped), I would get:

So easy to spot the error and use a shell wrapper to grep ^ERROR or similar.

Be careful, the script is not RAC aware, and it lacks some checks, so you might want to reuse it and extend it to fit your exact configuration.

Hope you like it!

Ludovico