rhpctl addnode gihome: specify HUB or LEAF when adding new nodes to a Flex Cluster

I have a customer trying to add a new node to a cluster using Fleet Patching and Provisioning.

The error in the command output is not very friendly:

The “RHPHELP_preNodeAddVal” might already give an idea of the cause: something related to the “cluvfy stage -pre nodeadd” evaluation that we normally do when adding a node by hand. FPP does not really run cluvfy, but it calls the same primitives cluvfy is based on.

In FPP, when the error does not give any useful information, this is the flow to follow:

  • use “rhpctl query audit” to get the date and time of the failing operation
  • open the “rhpserver.log.0” and look for the operation log in that time frame
  • get the UID of the operation e.g., in the following line it is “1556344143”:

  • Isolate the log for the operation: grep $UID rhpserver.log.0 > $UID.log
  • Locate the trace file of the rhphelper remote execution:

  • Find the root cause in the rhphelper trace:

In this case, the target cluster is a Flex Cluster, so the command must be run specifying the node_role.

The documentation is not clear (we will fix it soon):

node_role must be specified for Flex Clusters, and it must be either HUB or LEAF.

After using the correct command line, the command succeeded.

HTH

Ludovico

Changing FPP temporary directory (/tmp in noexec and other issues)

When using FPP, you might experience the following error (PRVF-7546):

This is often related to the filesystem /tmp that has the “noexec” option:

Although it is tempting to just remount the filesystem with “exec”, you might be in this situation because your systems are configured to adhere to the STIG recommendations:

The noexec option must be added to the /tmp partition (https://www.stigviewer.com/stig/red_hat_enterprise_linux_6/2016-12-16/finding/V-57569)

FPP 19.9 contains fix 30885598 that allows specifying the temporary location for FPP operations:

After that, the operation should run smoothly:

HTH

Ludo

Oracle Fleet Patching and Provisioning (FPP): My new role as PM and a brand new series of blog posts

It’s been 6 years since I’ve tried FPP for the first time (formerly Rapid Home Provisioning, or RHP).

Rapid Home Provisioning

FPP was still young and lacking many features at that time, but it already changed the way I’ve worked during the next years. I embraced the out of place patching, developed some basic scripts to install Oracle Homes, and sought automation and standardization at all costs:

Oracle Home Management – part 7: Putting all together

When 18c came with the FPP local-mode automaton, I have implemented it for the Grid Infrastructure patching strategy at CERN:

Oracle Grid Infrastructure 18c patching part 3: Executing out-of-place patching with the local-mode automaton

And discovered that meanwhile, FPP did giant steps, with many new features and fixes for quite a few usability and performance problems.

Last year, when joining the Oracle Database High Availability (HA), Scalability, and Maximum Availability Architecture (MAA) Product Management Team at Oracle, I took (among others) the Product Manager role for FPP.

Becoming an Oracle employee after 20 years of working with Oracle technology is a big leap. It allows me to understand how big the company is, and how collaborative and friendly the Oracle employees are (Yes, I was used to marketing nonsense, insisting salesmen, and unfriendly license auditors. This is slowly changing with Oracle embracing the Cloud, but it is still a fresh wound for many customers. Expect this to change even more! Regarding me… I’ll be the same I’ve always been 🙂 ).

Now I have daily meetings with big customers (bigger than the ones I have ever had in the past), development teams, other product managers, Oracle consultants, and community experts. My primary goal is to make the product better, increasing its adoption, and helping customers having the best experience with it. This includes testing the product myself, writing specs, presentations, videos, collecting feedback from the customers, tracking bugs, and manage escalations.

I am a Product Manager for other products as well, but I have to admit that FPP is the product that takes most of my Product Manager time. Why?

I will give a few reasons in my next blog post(s).

Ludo

The fear of (availability) loss is a path to the dark side.

I have been a DBA/consultant for customers and big production environments for over twenty years. I have explained more or less my career path in this blog post.

Database (and application) high availability has always been one of my favorite areas. Over the years I have become a high availability expert (my many blog posts are there to confirm it) and I have spent a lot of time building, troubleshooting, teaching, presenting, advocating these gems of technology that are RAC, Data Guard, Application Continuity and the many other products that are part of the Oracle Maximum Availability Architecture solution. Customers fear downtime, and I have always been with them on that. But in my case, it looks like Yoda’s famous quote worked well for me (in a good way):

I’ll be joining the Oracle Maximum Availability Architecture Product Management team as MAA Product Manager (or rather Cloud MAA, I will not explain here ;-))  next November.

(for those who are not familiar with the joke, the “Dark Side” is how we often refer to the Oracle employees in the Oracle Community 😉 )

I remember just like if it was yesterday that I was presenting some Data Guard 12c new features in front of a big audience at Collaborate 2014. There I have met two incredible people that were part of the MAA Product Management team: Larry Carpenter and Markus Michalewicz. Larry has been a great source of inspiration to improve my seniority and ease of presenting in front of the public, while Markus has become a friend over the years in addition of being one of the most influent persons in my professional network.

Now I have got the opportunity to join that team, and I feel like it’s the most natural change to do in my career.

And because I imagine some of you will have some questions, there are some answers to questions I’ve been frequently asked so far:

  • MAA PM does not mean becoming team lead or supervising other colleagues, I’ll be a “regular” PM
  • I will stay in Switzerland and work remotely from here
  • I will stay in “the conference circus” and keep presenting as soon the COVID-19 situation will allow to do so
  • Yes, I was VERY happy in Trivadis and it will always have a special place in my heart
  • Yep, that means no ACE Director award anymore 😉

Exciting times ahead! 🙂

Data Guard, Easy Connect and the Observer for multiple configurations

EZConnect

One of the challenges of automation in bin Oracle Environments is dealing with tnsnames.ora files.
These files might grow big and are sometimes hard to distribute/maintain properly.
The worst is when manual modifications are needed: manual operations, if not made carefully, can screw up the connection to the databases.
The best solution is always using LDAP naming resolution. I have seen customers using OID, OUD, Active Directory, openldapd, all with a great level of control and automation. However, some customer don’t have/want this possibility and keep relying on TNS naming resolution.
When Data Guard (and eventually RAC) are in place, the tnsnames.ora gets filled by entries for the DGConnectIdentifiers and StaticConnectIdentifier. If I add the observer, an additional entry is required to access the dbname_CFG service created by the Fast Start Failover.

Actually, all these entries are not required if I use Easy Connect.

My friend Franck Pachot wrote a couple of nice blog posts about Easy Connect while working with me at CERN:
https://medium.com/@FranckPachot/19c-easy-connect-e0c3b77968d7

https://medium.com/@FranckPachot/19c-ezconnect-and-wallet-easy-connect-and-external-password-file-8e326bb8c9f5

Basic Data Guard configuration

The basic configuration with Data Guard is quite simple to achieve with Easy Connect. In this examples I have:
– The primary database TOOLCDB1_SITE1
– The duplicated database for standby TOOLCDB1_SITE2

After setting up the static registration (no Grid Infrastructure in my lab):

and copying the passwordfile, the configuration can be created with:

That’s it.

Now, if I want to have the configuration observed, I need to activate the Fast Start Failover:

With just two databases, FastStartFailoverTarget is not explicitly needed, but I usually do it as other databases might be added to the configuration in the future.
After that, the broker complains that FSFO is enabled but there is no observer yet:

 

Observer for multiple configurations

This feature has been introduced in 12.2 but it is still not widely used.
Before 12.2, the Observer was a foreground process: the DBAs had to start it in a wrapper script executed with nohup in order to keep it live.
Since 12.2, the observer can run as a background process as far as there is a valid wallet for the connection to the databases.
Also, 12.2 introduced the capability of starting multiple configurations with a single dgmgrl command: “START OBSERVING”.

For more information about it, you can check the documentation here:
https://docs.oracle.com/en/database/oracle/oracle-database/19/dgbkr/using-data-guard-broker-to-manage-switchovers-failovers.html#GUID-BC513CDB-1E06-4EB3-9FE1-E1331E15E492

How to set it up with Easy Connect?

First, I need a wallet. And here comes the first compromise:
Having a single dgmgrl session to start all my configurations means that I have a single wallet for all the databases that I want to observe.
Fair enough, all the DBs (CDBs?) are managed by the same team in this case.
If I have only observers on my host I can easily point to the wallet from my central sqlnet.ora:

Otherwise I need to create a separate TNS_ADMIN for my observer management environment.
Then, I create the wallet:

Now I need to add the connection descriptors.

Which connection descriptors do I need?
The Observer uses the DGConnectIdentifier to keep observing the databases, but needs a connection to both of them using the TOOLCDB1_CFG service (unless I specify something different with the broker configuration property ConfigurationWideServiceName) to connect to the configuration and get the DGConnectIdentifier information. Again, you can check it in the doc. or the note Oracle 12.2 – Simplified OBSERVER Management for Multiple Fast-Start Failover Configurations (Doc ID 2285891.1)

So I need to specify three secrets for three connection descriptors:

The first one will be used for the initial connection. The other two to observe the Primary and Standby.
I need to be careful that the first EZConnect descriptor matches EXACTLY what I put in observer.ora (see next step) and the last two match my DGConnectIdentifier (unless I specify something different with ObserverConnectIdentifier), otherwise I will get some errors and the observer will not observe correctly (or will not start at all).

The dgmgrl needs then a file named observer.ora.
$ORACLE_BASE/admin/observers or the central TNS_ADMIN would be good locations, but what if I have observers that must be started from multiple Oracle Homes?
In that case, having a observer.ora in $ORACLE_HOME/network/admin (or $ORACLE_BASE/homes/{OHNAME}/network/admin/ if Read-Only Oracle Home is enabled) would be a better solution: in this case I would need to start one session per Oracle Home

The content of my observer.ora must be something like:

This is the example for my configuration, but I can put as many (CONFIG=…) as I want in order to observe multiple configurations.
Then, if everything is configured properly, I can start all the observers with a single command:

Troubleshooting

If the observer does not work, sometimes it is not easy to understand the cause.

  • Has SYSDG been granted to SYSDG user? Is SYSDG account unlocked?
  • Does sqlnet.ora contain the correct wallet location?
  • Is the wallet accessible in autologin?
  • Are the entries in the wallet correct? (check with “sqlplus /@connstring as sysdg”)

Missing pieces

Here, a few features that I think would be a nice addition in the future:

  • Awareness for the ORACLE_HOME to be used for each observer
  • Possibility to specify a different TNS_ADMIN per observer (different wallets)
  • Integration with Grid Infrastructure (srvctl add observer…) and support for multiple observers

Ludovico

Script to check Data Guard status from SQL

In a previous blog post I have explained how to get the basic configuration from x$drc and display something like:

There are other possibilities, by using the DBMS_DRS PL/SQL package.

The package is quite rich. In order to get more details, I use CHECK_CONNECT to check the connectivity to the member databases:

example:

In the first case I get no exceptions, that means that the database is reachable using the DGConnectIdentifier specified in the configuration (‘TOOLCDB1_SITE2’ is my database name in the configuration, it is NOT a TNS entry. I use EZConnect in my lab).

In the second case I specify a database that is not in the configuration.

In the third case, it looks like the database is down (no service), or the DGConnectIdentifier is not correct.

 

GET_PROPERTY_OBJ is useful to get e single property of a database/instance:

Example:

Here I have, for the primary (the object_id from x$drc), a TransportLagThreshold of 30 seconds.

DO_CONTROL does a specific check and returns a document with the results:

The problem is… what’s the format for indoc?

To get the correct format, I have enabled sql trace to get the executions, with bind variables, of the dgmgrl commands. It happens that the input format is XML and the output format is HTML.

This is how you can get the LogXptStatus, for example:

 

The big script

So I said… why not trying to have a comprehensive SQL script to check a few vital statuses of Data Guard?

This is the script that came out:

Of course, it is not perfect (many checks missing: FSFO readiness, observer checks, etc.), but it is good enough for base monitoring. Also, it’s faster than a normal shell+dgmgrl script.

Output on a Primary database:

Output on a standby database:

In case of errors (e.g. standby listener stopped), I would get:

So easy to spot the error and use a shell wrapper to grep ^ERROR or similar.

Be careful, the script is not RAC aware, and it lacks some checks, so you might want to reuse it and extend it to fit your exact configuration.

Hope you like it!

Ludovico

Real-Time Cascade Standby Container Databases without Oracle Managed Files

OK, the title might not be the best… I just would like to add more detail to content you can already find in other blogs (E.g. this nice one from Philippe Fierens http://pfierens.blogspot.com/2020/04/19c-data-guard-series-part-iii-adding.html).

I have this Cascade Standby configuration:

Years ago I wrote this whitepaper about cascaded standbys:
https://fr.slideshare.net/ludovicocaldara/2014-603-caldarappr
While it is still relevant for non-CDBs, things have changed with Multitenant architecture.

In my config, the Oracle Database version is 19.7 and the databases are actually CDBs. No Grid Infrastructure, non-OMF datafiles.
It is important to highlight that a lot of things have changed since 12.1. And because 19c is the LTS version now, it does not make sense to try anything older.

First, I just want to make sure that my standbys are aligned.

Primary:

1st Standby alert log:

2nd Standby alert log:

Then, I create a pluggable database (from PDB$SEED):

On the first standby I get:

On the second:

So, yeah, not having OMF might get you some warnings like: WARNING: File being created with same name as in Primary
But it is good to know that the cascade standby deals well with new PDBs.

Of course, this is not of big interest as I know that the problem with Multitenant comes from CLONING PDBs from either local or remote PDBs in read-write mode.

So let’s try a relocate from another CDB:

This is what I get on the first standby:

and this is on the cascaded standby:

So absolutely the same behavior between the two levels of standby.
According to the documentation: https://docs.oracle.com/en/database/oracle/oracle-database/19/sqlrf/CREATE-PLUGGABLE-DATABASE.html#GUID-F2DBA8DD-EEA8-4BB7-A07F-78DC04DB1FFC
I quote what is specified for the parameter STANDBYS={ALL|NONE|…}:
“If you include a PDB in a standby CDB, then during standby recovery the standby CDB will search for the data files for the PDB. If the data files are not found, then standby recovery will stop and you must copy the data files to the correct location before you can restart recovery.”

“Specify ALL to include the new PDB in all standby CDBs. This is the default.”

Specify NONE to exclude the new PDB from all standby CDBs. When a PDB is excluded from all standby CDBs, the PDB’s data files are unnamed and marked offline on all of the standby CDBs. Standby recovery will not stop if the data files for the PDB are not found on the standby. […]”

So, in order to avoid the MRP to crash, I should have included STANDBYS=NONE
But the documentation is not up to date, because in my case the PDB is skipped automatically and the recovery process DOES NOT STOP:

However, the recovery is marked ENABLED for the PDB on the standby, while usind STANDBYS=NONE it would have been DISABLED.

So, another difference with the doc who states:
“You can enable a PDB on a standby CDB after it was excluded on that standby CDB by copying the data files to the correct location, bringing the PDB online, and marking it as enabled for recovery.”

This reflects the findings of Philippe Fierens in his blog (http://pfierens.blogspot.com/2020/04/19c-data-guard-series-part-iii-adding.html).

This behavior has been introduced probably between 12.2 and 19c, but I could not manage to find exactly when, as it is not explicitly stated in the documentation.
However, I remember well that in 12.1.0.2, the MRP process was crashing.

In my configuration, not on purpose, but interesting for this article, the first standby has the very same directory structure, while the cascaded standby has not.

In any case, there is a potentially big problem for all the customers implementing Multitenant on Data Guard:

With the old behaviour (MRP crashing), it was easy to spot when a PDB was cloned online into a primary database, because a simple dgmgrl “show configuration” whould have displayed a warning because of the increasing lag (following the MRP crash).

With the current behavior, the MRP keeps recovering and the “show configuration” displays “SUCCESS” despite there is a PDB not copied on the standby (thus not protected).

Indeed, this is what I get after the clone:

I can see that the Data Guard Broker is completely silent about the missing PDB. So I might think my PDB is protected while it is not!

I actually have to add a check on the standby DBs to check if I have any missing datafiles:

This check should be implemented and put under monitoring (custom metrics in OEM?)

The missing PDB is easy to spot once I know that I have to do it. However, for each PDB to recover (I might have many!), I have to prepare the rename of datafiles and creation of directory (do not forget I am using non-OMF here).

Now, the datafile names on the standby got changed to …/UNNAMEDnnnnn.

So I have to get the original ones from the primary database and do the same replace that db_file_name_convert would do:

and put this in a rman script (this will be for the second standby, the first has the same name so same PATH):

Then, I need to stop the recovery, start it and stopping again, put the datafiles online and finally restart the recover.
These are the same steps used my Philippe in his blog post, just adapted to my taste 🙂

For the second part, I use this HEREDOC to online all offline datafiles:

and finally:

Now, I do not have anymore any datafiles offline on the standby:

I will not publish the steps for the second standby, they are exactly the same (same output as well).

At the end, for me it is important to highlight that monitoring the OFFLINE datafiles on the standby becomes a crucial point to guarantee the health of Data Guard in Multitenant. Relying on the Broker status or “PDB recovery disabled” is not enough.

On the bright side, it is nice to see that Cascade Standby configurations do not introduce any variation, so cascaded standbys can be threated the same as “direct” standby databases.

HTH

Ludovico

How to get the Data Guard broker configuration from a SQL query?

Everybody knows that you can get the Data Guard configuration in dgmgrl with the command:

but few know that this information is actually available in x$drc:

The table design is not very friendly, because it is a mix of ATTRIBUTE/VALUE pairs (hence many rows per object) and specific columns.

So, in order to get something usable to show the databases and their status, the only solution is to make use of PIVOT().

To get it in a friendly format, I recommend using SQLcl and settting

This is what I get on a simple two databases configuration (primary/standby):

and with a real-time cascade standby:

(interesting to note the leading :RTC_ON in the ship_to attribute).

Although it is much easier to get this information from DGMGRL, get it programmatically is more sure/flexible using the SQL interface, as you know what you want to get, no matter how the dgmgrl syntax changes.

Looking forward to have the REST APIs in a future version of Data Guard 🙂

Ludovico

Parameter REMOTE_LISTENER pointing to a TNS alias? Beware of how it registers.

On an Oracle Database instance, if I set:

The instance tries to resolve the cluster-scan name to detect if it is a SCAN address.
So, after it solves, it stores all the addresses it gets and registers to them.
I can check which addresses there are with this query:

In this case, the instance registers to the three addresses discovered, which is OK: all three SCAN listeners will get service updates from the instance.

But if I have this TNS alias:

and I set:

I get:

the result is that the instance registers only at the first IP fot from the DNS, leaving the other SCANs without the service registration and thus random

This is in my opinion quite annoying, as my goal here was to have all the DBs set with:

in order to facilitate changes of ports, database migrations from different clusters, clones, etc.

So the solution is either to revert to the syntax “cluster-scan:port”, or specifying explicitly all the endpoints in the address list:

I am sure it is “working as designed”, but I wonder if it could be an enhancement to have the address expended fully also in case of TNS alias….
Or… do you know any way to do it from a TNS alias without having the full IP list?

Cheers

Ludo

FPP local-mode: Steps to remove/add node from a cluster if RHP fails to move gihome

I am getting more and more experience with patching clusters with the local-mode automaton. The whole process would be very complex, but the local-mode automaton makes it really easy.

I have had nevertheless a couple of clusters where the process did not work:

#1: The very first cluster that I installed in 18c

This cluster has “kind of failed” patching the first node. Actually, the rhpctl command exited with an error:

But actually, the helper kept running and configured everything properly:

The cluster was OK on the first node, with the correct patch level. The second node, however, was failing with:

I am not sure about the cause, but let’s assume it is irrelevant for the moment.

#2: A cluster with new GI home not properly linked with RAC

This was another funny case, where the first node patched successfully, but the second one failed upgrading in the middle of the process with a java NullPointer exception. We did a few bad tries of prePatch and postPatch to solve, but after that the second node of the cluster was in an inconsistent state: in ROLLING_UPGRADE mode and not possible to patch anymore.

Common solution: removing the node from the cluster and adding it back

In both cases we were in the following situation:

  • one node was successfully patched to 18.6
  • one node was not patched and was not possible to patch it anymore (at least without heavy interventions)

So, for me, the easiest solution has been removing the failing node and adding it back with the new patched version.

Steps to remove the node

Although the steps are described here: https://docs.oracle.com/en/database/oracle/oracle-database/18/cwadd/adding-and-deleting-cluster-nodes.html#GUID-8ADA9667-EC27-4EF9-9F34-C8F65A757F2A, there are a few differences that I will highlight:

Stop of the cluster:

The actual procedure to remove a node asks to deconfigure the databases and managed homes from the active cluster version. But as we manage our homes with golden images, we do not need this; we rather want to keep all the entries in the OCR so that when we add it back, everything is in place.

Once stopped the CRS, we have deinstalled the CRS home on the failing node:

This  complained about the CRS that was down, but it continued and ask for this script to be executed:

We’ve got errors also for this script, but the remove process was OK afterall.

Then, from the surviving node:

Adding the node back

From the surviving node, we ran gridSetup.sh and followed the steps to ad the node.

Wait before running root.sh.

In our case, we have originally installed the cluster starting with a SW_ONLY install. This type of installation keeps some leftovers in the configuration files that prevent the root.sh from configuring the cluster…we have had to modify rootconfig.sh:

then, after running root.sh and the config tools, everything was back as before removing the node form the cluster.

For one of the clusters , both nodes were at the same patch level, but the cluster was still in ROLLING_PATCH mode. So we have had to do a

Ludo