DBA survival BLOG

DBA stuff and Oracle Data Guard

Oracle Fleet Patching and Provisioning (FPP): My new role as PM and a brand new series of blog posts

Posted on May 4, 2021 by Ludovico

It has been six years since I first tried FPP (formerly Rapid Home Provisioning, or RHP).

Rapid Home Provisioning

FPP was still young and lacking many features at that time, but it had already changed the way I worked in the following years. I embraced out-of-place patching, developed some basic scripts to install Oracle Homes, and sought automation and standardization at all costs:

Oracle Home Management – part 7: Putting all together

When 18c came with the FPP local-mode automaton, I implemented it for the Grid Infrastructure patching strategy at CERN:

Oracle Grid Infrastructure 18c patching part 3: Executing out-of-place patching with the local-mode automaton

And I discovered that, meanwhile, FPP had taken giant steps, with many new features and fixes for quite a few usability and performance problems.

Last year, when I joined the Oracle Database High Availability (HA), Scalability, and Maximum Availability Architecture (MAA) Product Management Team at Oracle, I took on (among others) the Product Manager role for FPP.

Becoming an Oracle employee after 20 years of working with Oracle technology is a big leap. It lets me see how big the company is, and how collaborative and friendly Oracle employees are. (Yes, I was used to marketing nonsense, pushy salespeople, and unfriendly license auditors. This is slowly changing as Oracle embraces the Cloud, but it is still a fresh wound for many customers. Expect this to change even more! As for me… I’ll be the same as I’ve always been 🙂 )

Now I have daily meetings with big customers (bigger than any I have had in the past), development teams, other product managers, Oracle consultants, and community experts. My primary goal is to make the product better, increase its adoption, and help customers have the best experience with it. This includes testing the product myself, writing specs, preparing presentations and videos, collecting feedback from customers, tracking bugs, and managing escalations.

I am a Product Manager for other products as well, but I have to admit that FPP is the product that takes most of my Product Manager time. Why?

I will give a few reasons in my next blog post(s).

—

Ludo

Migrating Oracle RAC from SuSE to OEL (or RHEL) live

Posted on November 10, 2015 by Ludovico

I have a customer that needs to migrate its Oracle RAC cluster from SuSE to OEL.

I know, I know, there is a paper from Dell and Oracle named:

How Dell Migrated from SUSE Linux to Oracle Linux

It explains how Dell migrated its many RAC clusters from SuSE to OEL. The problem is that they used a different strategy:

– back up the configuration of the nodes
– then, for each node, one at a time:
  – stop the node
  – reinstall the OS
  – restore the configuration and the Oracle binaries
  – relink
  – restart

What I want to achieve instead, because the customer will also migrate to new hardware, is:

– add one OEL node to the SuSE cluster as a new node
– remove one SuSE node from the now-mixed cluster
– install/restore/relink the RDBMS software (RAC) on the new node
– move the RAC instances to the new node (taking care NOT to run more than the number of licensed nodes/CPUs at any time)
– repeat for the remaining nodes
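The plan above can be sketched as a dry-run script that only prints the steps for each old/new node pair, in the order listed. It is a minimal sketch, not a procedure to run as-is: the `plan_migration` function and the `sles02`/`rhel02` names are hypothetical, and nothing here touches the cluster.

```shell
#!/bin/sh
# Dry-run planner: prints the rolling-replacement steps listed above
# for every "old:new" node pair. It executes nothing on the cluster.

plan_migration() {
    # $1: space-separated list of pairs, e.g. "sles01:rhel01 sles02:rhel02"
    for pair in $1; do
        old=${pair%%:*}   # text before the colon: SuSE node to retire
        new=${pair##*:}   # text after the colon: OEL node to add
        echo "add ${new} to the cluster"
        echo "remove ${old} from the cluster"
        echo "install/restore/relink the RDBMS software on ${new}"
        echo "move the RAC instances to ${new} (stay within the licensed nodes/CPUs)"
    done
}

# sles02/rhel02 are hypothetical names used only for illustration.
plan_migration "sles01:rhel01 sles02:rhel02"
```

Printing the plan first makes it easy to review, for each pair, the point at which the number of running instances has to be checked against the licence.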

To test this migration path, I’ve set up a SINGLE NODE cluster (if it works for one node, it will work for two or more).

oracle@sles01:~> crsctl stat res -t
--------------------------------------------------------------------------------
Name           Target  State        Server                   State details
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.DATA.dg
               ONLINE  ONLINE       sles01                   STABLE
ora.LISTENER.lsnr
               ONLINE  ONLINE       sles01                   STABLE
ora.asm
               ONLINE  ONLINE       sles01                   Started,STABLE
ora.net1.network
               ONLINE  ONLINE       sles01                   STABLE
ora.ons
               ONLINE  ONLINE       sles01                   STABLE
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       sles01                   STABLE
ora.cvu
      1        ONLINE  ONLINE       sles01                   STABLE
ora.oc4j
      1        OFFLINE OFFLINE                               STABLE
ora.scan1.vip
      1        ONLINE  ONLINE       sles01                   STABLE
ora.sles01.vip
      1        ONLINE  ONLINE       sles01                   STABLE
--------------------------------------------------------------------------------
oracle@sles01:~> cat /etc/issue

Welcome to SUSE Linux Enterprise Server 11 SP4  (x86_64) - Kernel \r (\l).

I have to set up the new node addition carefully, mostly as I would for a traditional node addition:

– Add the new IP addresses (public, private, VIP) to the DNS/hosts
– Install the new OEL server
– Keep the same users and groups (uid, gid, etc.)
– Verify the network connectivity and set up SSH equivalence
– Check that the multicast connection is OK
– Add the storage, configure persistent naming (udev), and verify that the disks (major, minor, names) are exactly the same
– The network cards must also be exactly the same
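The "same users and groups" item can be checked by comparing the output of `id oracle` from both nodes. The sketch below is an illustration only: the `normalize_id` and `same_identity` helpers are hypothetical names, and the group IDs in the usage note are made up (only UID 1101 appears in the cluvfy output of this post). The comparison ignores the order in which supplementary groups are listed.

```shell
#!/bin/sh
# Compare two `id <user>` outputs, ignoring the order of supplementary groups.

normalize_id() {
    # Put each uid=/gid=/group entry on its own line, drop the "groups=" prefix,
    # and sort so that the listing order does not matter.
    printf '%s\n' "$1" | tr ' ,' '\n\n' | sed 's/^groups=//' | sort
}

same_identity() {
    # Succeeds (exit status 0) only when both outputs describe the same IDs.
    a=$(normalize_id "$1")
    b=$(normalize_id "$2")
    [ "$a" = "$b" ]
}
```

Assuming SSH equivalence is already in place, it could be used as `same_identity "$(ssh sles01 id oracle)" "$(ssh rhel01 id oracle)" || echo "uid/gid mismatch"`.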

Once the new host is ready, cluvfy stage -pre nodeadd will likely fail due to:

– Kernel release mismatch
– Package mismatch

Here’s an example of output:

oracle@sles01:~> cluvfy stage -pre nodeadd -n rhel01

Performing pre-checks for node addition

Checking node reachability...
Node reachability check passed from node "sles01"


Checking user equivalence...
User equivalence check passed for user "oracle"
Package existence check passed for "cvuqdisk"

Checking CRS integrity...

CRS integrity check passed

Clusterware version consistency passed.

Checking shared resources...

Checking CRS home location...
Location check passed for: "/u01/app/12.1.0/grid"
Shared resources check for node addition passed


Checking node connectivity...

Checking hosts config file...

Verification of the hosts config file successful

Check: Node connectivity using interfaces on subnet "192.168.56.0"
Node connectivity passed for subnet "192.168.56.0" with node(s) sles01,rhel01
TCP connectivity check passed for subnet "192.168.56.0"


Check: Node connectivity using interfaces on subnet "172.16.100.0"
Node connectivity passed for subnet "172.16.100.0" with node(s) rhel01,sles01
TCP connectivity check passed for subnet "172.16.100.0"

Checking subnet mask consistency...
Subnet mask consistency check passed for subnet "192.168.56.0".
Subnet mask consistency check passed for subnet "172.16.100.0".
Subnet mask consistency check passed.

Node connectivity check passed

Checking multicast communication...

Checking subnet "172.16.100.0" for multicast communication with multicast group "224.0.0.251"...
Check of subnet "172.16.100.0" for multicast communication with multicast group "224.0.0.251" passed.

Check of multicast communication passed.
Total memory check passed
Available memory check passed
Swap space check passed
Free disk space check passed for "sles01:/usr,sles01:/var,sles01:/etc,sles01:/u01/app/12.1.0/grid,sles01:/sbin,sles01:/tmp"
Free disk space check passed for "rhel01:/usr,rhel01:/var,rhel01:/etc,rhel01:/u01/app/12.1.0/grid,rhel01:/sbin,rhel01:/tmp"
Check for multiple users with UID value 1101 passed
User existence check passed for "oracle"
Run level check passed
Hard limits check passed for "maximum open file descriptors"
Soft limits check passed for "maximum open file descriptors"
Hard limits check passed for "maximum user processes"
Soft limits check passed for "maximum user processes"
System architecture check passed

WARNING:
PRVF-7524 : Kernel version is not consistent across all the nodes.
Kernel version = "3.0.101-63-default" found on nodes: sles01.
Kernel version = "3.8.13-16.2.1.el6uek.x86_64" found on nodes: rhel01.
Kernel version check passed
Kernel parameter check passed for "semmsl"
Kernel parameter check passed for "semmns"
Kernel parameter check passed for "semopm"
Kernel parameter check passed for "semmni"
Kernel parameter check passed for "shmmax"
Kernel parameter check passed for "shmmni"
Kernel parameter check passed for "shmall"
Kernel parameter check passed for "file-max"
Kernel parameter check passed for "ip_local_port_range"
Kernel parameter check passed for "rmem_default"
Kernel parameter check passed for "rmem_max"
Kernel parameter check passed for "wmem_default"
Kernel parameter check passed for "wmem_max"
Kernel parameter check passed for "aio-max-nr"
Package existence check passed for "make"
Package existence check passed for "libaio"
Package existence check passed for "binutils"
Package existence check passed for "gcc(x86_64)"
Package existence check passed for "gcc-c++(x86_64)"
Package existence check passed for "glibc"
Package existence check passed for "glibc-devel"
Package existence check passed for "ksh"
Package existence check passed for "libaio-devel"
Package existence check failed for "libstdc++33"
Check failed on nodes:
        rhel01
Package existence check failed for "libstdc++43-devel"
Check failed on nodes:
        rhel01
Package existence check passed for "libstdc++-devel(x86_64)"
Package existence check failed for "libstdc++46"
Check failed on nodes:
        rhel01
Package existence check failed for "libgcc46"
Check failed on nodes:
        rhel01
Package existence check passed for "sysstat"
Package existence check failed for "libcap1"
Check failed on nodes:
        rhel01
Package existence check failed for "nfs-kernel-server"
Check failed on nodes:
        rhel01
Check for multiple users with UID value 0 passed
Current group ID check passed

Starting check for consistency of primary group of root user

Check for consistency of root user's primary group passed
Group existence check passed for "asmadmin"
Group existence check passed for "asmoper"
Group existence check passed for "asmdba"

Checking ASMLib configuration.
Check for ASMLib configuration passed.

Checking OCR integrity...

OCR integrity check passed

Checking Oracle Cluster Voting Disk configuration...

Oracle Cluster Voting Disk configuration check passed
Time zone consistency check passed

Starting Clock synchronization checks using Network Time Protocol(NTP)...

NTP Configuration file check started...
No NTP Daemons or Services were found to be running

Clock synchronization check using Network Time Protocol(NTP) passed


User "oracle" is not part of "root" group. Check passed
Checking integrity of file "/etc/resolv.conf" across nodes

"domain" and "search" entries do not coexist in any  "/etc/resolv.conf" file
All nodes have same "search" order defined in file "/etc/resolv.conf"
PRVF-5636 : The DNS response time for an unreachable node exceeded "15000" ms on following nodes: sles01,rhel01

Check for integrity of file "/etc/resolv.conf" failed


Checking integrity of name service switch configuration file "/etc/nsswitch.conf" ...
Check for integrity of name service switch configuration file "/etc/nsswitch.conf" passed


Pre-check for node addition was unsuccessful on all the nodes.

So the question is not whether the check succeeds (it will not), but what fails.
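One way to see what fails beyond the expected SuSE/OEL differences is to filter the report. This is a rough heuristic, not an Oracle tool: the `unexpected_cluvfy_failures` function name is hypothetical, and the patterns are taken only from the output shown above (PRVF-7524 for the kernel mismatch, "Package existence check failed" for the packages), so other cluvfy versions may need different patterns.

```shell
#!/bin/sh
# Read a cluvfy report on stdin and print only the problem lines that are NOT
# the expected kernel/package mismatches between SuSE and OEL.

unexpected_cluvfy_failures() {
    # Keep PRVF-nnnn messages and any line mentioning "failed", then drop the
    # expected kernel warning, the package failures, and their follow-up line.
    grep -E 'PRVF-[0-9]+|failed' |
        grep -v -E 'PRVF-7524|Package existence check failed|Check failed on nodes:'
}
```

Typical use would be `cluvfy stage -pre nodeadd -n rhel01 | unexpected_cluvfy_failures`. On the report above, it would leave the PRVF-5636 DNS message and the /etc/resolv.conf integrity failure.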

Solving every problem that is not related to the SuSE-OEL difference is crucial, because addnode.sh will fail on the same errors. To get past the expected mismatches, I need to run it with the -ignorePrereq and -ignoreSysPrereqs switches. Let’s see how it works:

oracle@sles01:/u01/app/12.1.0/grid/addnode> ./addnode.sh -silent "CLUSTER_NEW_NODES={rhel01}" "CLUSTER_NEW_VIRTUAL_HOSTNAMES={rhel01-vip}" -ignorePrereq -ignoreSysPrereqs
Starting Oracle Universal Installer...

Checking Temp space: must be greater than 120 MB.   Actual 27479 MB    Passed
Checking swap space: must be greater than 150 MB.   Actual 2032 MB    Passed

Prepare Configuration in progress.

Prepare Configuration successful.
..................................................   9% Done.
You can find the log of this install session at:
 /u01/app/oraInventory/logs/addNodeActions2015-11-09_09-57-16PM.log

Instantiate files in progress.

Instantiate files successful.
..................................................   15% Done.

Copying files to node in progress.

Copying files to node successful.
..................................................   79% Done.

Saving cluster inventory in progress.
..................................................   87% Done.

Saving cluster inventory successful.
The Cluster Node Addition of /u01/app/12.1.0/grid was successful.
Please check '/tmp/silentInstall.log' for more details.

As a root user, execute the following script(s):
        1. /u01/app/oraInventory/orainstRoot.sh
        2. /u01/app/12.1.0/grid/root.sh

Execute /u01/app/oraInventory/orainstRoot.sh on the following nodes:
[rhel01]
Execute /u01/app/12.1.0/grid/root.sh on the following nodes:
[rhel01]

The scripts can be executed in parallel on all the nodes. If there are any policy managed databases managed by cluster, proceed with the addnode procedure without executing the root.sh script. Ensure that root.sh script is executed after all the policy managed databases managed by clusterware are extended to the new nodes.
..........
Update Inventory in progress.
..................................................   100% Done.

Update Inventory successful.
Successfully Setup Software.

Then, as instructed by addnode.sh, I run root.sh and expect it to work:

[oracle@rhel01 install]$ sudo /u01/app/12.1.0/grid/root.sh
Performing root user operation for Oracle 12c

The following environment variables are set as:
    ORACLE_OWNER= oracle
    ORACLE_HOME=  /u01/app/12.1.0/grid
   Copying dbhome to /usr/local/bin ...
   Copying oraenv to /usr/local/bin ...
   Copying coraenv to /usr/local/bin ...

Entries will be added to the /etc/oratab file as needed by
Database Configuration Assistant when a database is created
Finished running generic part of root script.
Now product-specific root actions will be performed.
Relinking oracle with rac_on option
Using configuration parameter file: /u01/app/12.1.0/grid/crs/install/crsconfig_params
2015/11/09 23:18:42 CLSRSC-363: User ignored prerequisites during installation

OLR initialization - successful
2015/11/09 23:19:08 CLSRSC-330: Adding Clusterware entries to file 'oracle-ohasd.conf'

CRS-4133: Oracle High Availability Services has been stopped.
CRS-4123: Oracle High Availability Services has been started.
CRS-4133: Oracle High Availability Services has been stopped.
CRS-4123: Oracle High Availability Services has been started.
CRS-4133: Oracle High Availability Services has been stopped.
CRS-4123: Starting Oracle High Availability Services-managed resources
CRS-2672: Attempting to start 'ora.mdnsd' on 'rhel01'
CRS-2672: Attempting to start 'ora.evmd' on 'rhel01'
CRS-2676: Start of 'ora.mdnsd' on 'rhel01' succeeded
CRS-2676: Start of 'ora.evmd' on 'rhel01' succeeded
CRS-2672: Attempting to start 'ora.gpnpd' on 'rhel01'
CRS-2676: Start of 'ora.gpnpd' on 'rhel01' succeeded
CRS-2672: Attempting to start 'ora.gipcd' on 'rhel01'
CRS-2676: Start of 'ora.gipcd' on 'rhel01' succeeded
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'rhel01'
CRS-2676: Start of 'ora.cssdmonitor' on 'rhel01' succeeded
CRS-2672: Attempting to start 'ora.cssd' on 'rhel01'
CRS-2672: Attempting to start 'ora.diskmon' on 'rhel01'
CRS-2676: Start of 'ora.diskmon' on 'rhel01' succeeded
CRS-2789: Cannot stop resource 'ora.diskmon' as it is not running on server 'rhel01'
CRS-2676: Start of 'ora.cssd' on 'rhel01' succeeded
CRS-2672: Attempting to start 'ora.cluster_interconnect.haip' on 'rhel01'
CRS-2672: Attempting to start 'ora.ctssd' on 'rhel01'
CRS-2676: Start of 'ora.ctssd' on 'rhel01' succeeded
CRS-2676: Start of 'ora.cluster_interconnect.haip' on 'rhel01' succeeded
CRS-2672: Attempting to start 'ora.asm' on 'rhel01'
CRS-2676: Start of 'ora.asm' on 'rhel01' succeeded
CRS-2672: Attempting to start 'ora.storage' on 'rhel01'
CRS-2676: Start of 'ora.storage' on 'rhel01' succeeded
CRS-2672: Attempting to start 'ora.crsd' on 'rhel01'
CRS-2676: Start of 'ora.crsd' on 'rhel01' succeeded
CRS-6017: Processing resource auto-start for servers: rhel01
CRS-2672: Attempting to start 'ora.ons' on 'rhel01'
CRS-2676: Start of 'ora.ons' on 'rhel01' succeeded
CRS-6016: Resource auto-start has completed for server rhel01
CRS-6024: Completed start of Oracle Cluster Ready Services-managed resources
CRS-4123: Oracle High Availability Services has been started.
2015/11/09 23:22:06 CLSRSC-343: Successfully started Oracle clusterware stack

clscfg: EXISTING configuration version 5 detected.
clscfg: version 5 is 12c Release 1.
Successfully accumulated necessary OCR keys.
Creating OCR keys for user 'root', privgrp 'root'..
Operation successful.
Preparing packages for installation...
cvuqdisk-1.0.9-1
2015/11/09 23:22:23 CLSRSC-325: Configure Oracle Grid Infrastructure for a Cluster ... succeeded

Bingo! Let’s check if everything is up and running:

[oracle@rhel01 ~]$ /u01/app/12.1.0/grid/bin/crsctl stat res -t
--------------------------------------------------------------------------------
Name           Target  State        Server                   State details
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.DATA.dg
               ONLINE  ONLINE       rhel01                   STABLE
               ONLINE  ONLINE       sles01                   STABLE
ora.LISTENER.lsnr
               ONLINE  ONLINE       rhel01                   STABLE
               ONLINE  ONLINE       sles01                   STABLE
ora.asm
               ONLINE  ONLINE       rhel01                   Started,STABLE
               ONLINE  ONLINE       sles01                   Started,STABLE
ora.net1.network
               ONLINE  ONLINE       rhel01                   STABLE
               ONLINE  ONLINE       sles01                   STABLE
ora.ons
               ONLINE  ONLINE       rhel01                   STABLE
               ONLINE  ONLINE       sles01                   STABLE
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       sles01                   STABLE
ora.cvu
      1        ONLINE  ONLINE       sles01                   STABLE
ora.oc4j
      1        OFFLINE OFFLINE                               STABLE
ora.rhel01.vip
      1        ONLINE  ONLINE       rhel01                   STABLE
ora.scan1.vip
      1        ONLINE  ONLINE       sles01                   STABLE
ora.sles01.vip
      1        ONLINE  ONLINE       sles01                   STABLE
--------------------------------------------------------------------------------

[oracle@rhel01 ~]$ olsnodes -s
sles01  Active
rhel01  Active

[oracle@rhel01 ~]$ ssh rhel01 uname -r
3.8.13-16.2.1.el6uek.x86_64
[oracle@rhel01 ~]$ ssh sles01 uname -r
3.0.101-63-default

[oracle@rhel01 ~]$ ssh rhel01 cat /etc/redhat-release
Red Hat Enterprise Linux Server release 6.5 (Santiago)
[oracle@rhel01 ~]$ ssh sles01 cat /etc/issue
Welcome to SUSE Linux Enterprise Server 11 SP4  (x86_64) - Kernel \r (\l).

So yes, it works, but remember that it’s not a supported long-term configuration.

In my case I expect to migrate the whole cluster from SLES to OEL in one day.

NOTE: using OEL6 as the new target is easy because the interface names do not change. OEL7 introduces a new interface naming scheme: if you need to migrate without cluster downtime, you need to set up the new OEL7 nodes so that they keep the old interface names, following this post: http://ask.xmodulo.com/change-network-interface-name-centos7.html

Otherwise, you need to configure a new interface name for the cluster with oifcfg.

HTH

—

Ludovico

It’s confirmed. Standard Edition and Standard Edition One are dead.

Posted on August 3, 2015 by Ludovico

The first rumors came on July 3rd, 2015.

After many years of existence, Standard Edition and Standard Edition One will no longer be part of the Oracle Database Edition portfolio.

The short history

Standard Edition has long been the “stepbrother” of Enterprise Edition, with fewer features and no options, but cheaper than EE. I can’t remember when SE was first released. It was before the 2000s, I guess.

In 2003, Oracle released 10gR1. Many new features were released for EE only, but:

– RAC was included as part of Standard Edition

– Standard Edition One was released, with an even lower price and “almost” the same features as Standard Edition.

For a few years, customers had the possibility to get huge savings (but many compromises) by choosing the cheaper editions.

SE ONE: just two sockets, but with today’s 18-core processors, the possibility to run Oracle on 36 cores (or more?) for less than 12k of licenses.

SE: up to four sockets and the possibility to run on either 72-core servers or a RAC composed of a total of 72 cores (max 4 nodes) for less than the price of a 4-core Enterprise Edition deployment.

In 2014, for the first time, Oracle released a new Database version (12.1.0.2) where Standard Edition and SE One were not immediately available.

For months, customers asked: “When will the Oracle 12.1.0.2 SE be available?”

Now the big announcement: SE and SE One will no longer exist. With 12.1.0.2, there’s a new Edition: Oracle Database Standard Edition 2.

You can find more information here:

Some highlights

– SE One will no longer exist

– SE is replaced by SE Two that has a limitation of 2 sockets

– SE Two still has the RAC feature, with a maximum of two single-socket servers.

– Customers with SE on 4 socket nodes (or clusters) will need to migrate to 2 socket nodes (or clusters)

– Customers with SE One should definitely be prepared to spend some money to upgrade to SE Two, which comes at the same price as the old Standard Edition ($17,500 per socket).

– The minimum number of NUP licenses when licensing per named user has been increased to 10 (it was 5 with SE and SE One).

– Each SE2 Database can run max 16 user threads (in RAC, max 8 per instance). This is limited by the database Resource Manager. It does not prevent customers from using all the cores, in case they want to deploy many databases per server.
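
To put the figures from the list above side by side, here is a tiny back-of-the-envelope calculation. It is only a sketch: it uses the list price and limits quoted in this post (no support fees, no discounts), and the variable and function names are mine.

```shell
# Figures quoted in the list above (list price only, USD)
price_per_socket=17500   # SE2 price per socket
max_sockets=2            # SE2 socket limit (single server, or whole RAC cluster)
threads_per_db=16        # user-thread cap for a single-instance database
threads_per_rac_inst=8   # user-thread cap per instance in RAC
rac_nodes=2              # two single-socket servers

# Highest licence cost for one SE2 server or cluster
se2_max_license_cost() { echo $(( price_per_socket * max_sockets )); }

# User threads available to one RAC database across both instances
se2_rac_threads() { echo $(( threads_per_rac_inst * rac_nodes )); }

echo "Max SE2 licence cost per server or cluster: \$$(se2_max_license_cost)"
echo "User threads per RAC database: $(se2_rac_threads) (single instance: $threads_per_db)"
```

With these numbers, a RAC database gets no more user threads than a single-instance one; the cap is simply split across the two instances.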

So, in the end, less scalability for the same price tag.

Other bloggers have already written about the behaviour of SE2. The best blog post is IMO from Franck Pachot. http://blog.dbi-services.com/oracle-standard-edition-two/

Cheers

—

Ludo

It’s time to Collaborate again!!

Posted on March 27, 2015 by Ludovico

In a little more than a couple of weeks, the great Collaborate conference will start again.

My agenda will be quite packed again, as speaker, panelist and workshop organizer:

Date/Time	Event
08/04/2015 3:15 pm - 4:15 pm	Oracle RAC, Data Guard, and Pluggable Databases: When MAA Meets Oracle Multitenant IOUG Collaborate 15, Las Vegas NV
08/04/2015 4:30 pm - 5:30 pm	Panel: Nothing to BLOG About - Think Again IOUG Collaborate 15, Las Vegas NV
12/04/2015 9:00 am - 4:00 pm	RAC Attack! 12c IOUG Collaborate 15, Las Vegas NV
15/04/2015 5:30 pm - 6:00 pm	IOUG RAC SIG Meeting IOUG Collaborate 15, Las Vegas NV

RAC Attack! 12c

This technical workshop and networking event (never forget it’s a project created several years ago thanks to an intuition of Jeremy Schneider) has proven to be one of the best, longest-lived projects in the Oracle Community. It certainly boosted my Community involvement up to becoming an Oracle ACE. This year I’m coordinating the organization of the workshop: it’s a double satisfaction and it will certainly be a lot of fun again. Did I say that it’s already fully booked? I’ve already blogged about it (and about what the lucky participants will get) here.

Oracle RAC, Data Guard, and Pluggable Databases: When MAA Meets Oracle Multitenant

One of my favorite presentations. I’ve already presented it at OOW14 and UKOUG Tech14, but it’s still a very new topic for most people, even the most experienced DBAs. You’ll learn how Multitenant, RAC and Data Guard work together. Expect colorful architecture diagrams and a live demo! You can read more about it in this post.

Panel: Nothing to BLOG About – Think Again

My friend Michael Abbey (Pythian) invited me to participate in his panel about blogging. It’s my first time as panelist, so I’m very excited!

IOUG RAC SIG Meeting

Missing this great networking event is not an option! I’m organizing this session as RAC SIG board member (Thanks to the IOUG for this opportunity!). We’ll focus on Real Application Clusters role in the private cloud and infrastructure optimization. We’ll have many special guests, including Oracle RAC PM Markus Michalewicz, Oracle QoS PM Mark Scardina and Oracle ASM PM James Williams.

Can you ever miss it???

A good Trivadis representative!!

This year I’m not going to Las Vegas alone. My Trivadis colleague Markus Flechtner, one of the most expert RAC technologists I have had the chance to know, will also come and present a session about RAC diagnostics:

615: RAC Clinics- Starring Dr. ORACHK, Dr CHM and Dr. TFA

Mon. April 13| 9:15 AM – 10:15 AM | Room Palm D

If you speak German you can follow his nice blog: http://oracle.markusflechtner.de/

Looking forward to meeting you there

—

Ludovico

Moving Clusterware Interconnect from single NIC/Bond to HAIP

Posted on March 23, 2015 by Ludovico

Very recently I had to move a customer’s RAC private interconnect from bonding to HAIP to get the benefit of both NICs.

So I would like to recap here the steps you would follow if you need to do the same.

In this example I’ll switch from a single-NIC interconnect (eth1) rather than from a bond configuration, so if you are familiar with the RAC Attack! environment you can try to put everything in place on your own.

First, you need to plan the new network configuration in advance, keeping in mind that there are a couple of important restrictions:

Your interconnect interface naming must be uniform on all nodes in the cluster. The interconnect uses the interface name in its configuration and it doesn’t support different names on different hosts
You must bind the different private interconnect interfaces in different subnets (see Note: 1481481.1 – 11gR2 CSS Terminates/Node Eviction After Unplugging one Network Cable in Redundant Interconnect Environment if you need an explanation)
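
The second restriction is easy to check before touching the cluster. Here is a minimal sketch in plain shell, assuming IPv4 addresses and dotted netmasks; the function names `ip_to_int` and `same_subnet` are my own and not part of any Oracle tool.

```shell
# Convert a dotted IPv4 address (e.g. 172.16.101.51) to a 32-bit integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1            # split the address into its four octets
    IFS=$old_ifs
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# Return 0 if both addresses fall in the same subnet for the given netmask.
# Usage: same_subnet <ip1> <ip2> <netmask>
same_subnet() {
    ip1=$(ip_to_int "$1")
    ip2=$(ip_to_int "$2")
    mask=$(ip_to_int "$3")
    [ $(( ip1 & mask )) -eq $(( ip2 & mask )) ]
}

# Example with two private addresses on the same node, one per interface
if same_subnet 172.16.101.51 172.16.102.61 255.255.255.0; then
    echo "same subnet: not valid for HAIP"
else
    echo "different subnets: OK"
fi
```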

Implementation

The RAC Attack book uses one interface per node for the interconnect (eth1, using network 172.16.100.0)

To make things a little more complex, we’ll not use the eth1 in the new HAIP configuration, so we’ll test also the deletion of the old interface.

What you need to do is add two new interfaces (host-only in your VirtualBox) and configure them as eth2 and eth3, e.g. in networks 172.16.101.0 and 172.16.102.0:

eth2      Link encap:Ethernet  HWaddr 08:00:27:32:76:DD
          inet addr:172.16.101.51  Bcast:172.16.101.255  Mask:255.255.255.0
          inet6 addr: fe80::a00:27ff:fe32:76dd/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:29 errors:0 dropped:0 overruns:0 frame:0
          TX packets:25 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:2044 (1.9 KiB)  TX bytes:1714 (1.6 KiB)

eth3      Link encap:Ethernet  HWaddr 08:00:27:2E:05:4B
          inet addr:172.16.102.61  Bcast:172.16.102.255  Mask:255.255.255.0
          inet6 addr: fe80::a00:27ff:fe2e:54b/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:19 errors:0 dropped:0 overruns:0 frame:0
          TX packets:12 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:1140 (1.1 KiB)  TX bytes:720 (720.0 b)

modify /var/named/racattack in order to use the new addresses (RAC doesn’t care about logical names, it’s just for our convenience):

collabn1 A 192.168.78.51
collabn1-vip A 192.168.78.61
collabn1-priv A 172.16.100.51
collabn1-priv1 A 172.16.101.51
collabn1-priv2 A 172.16.102.61
collabn2 A 192.168.78.52
collabn2-vip A 192.168.78.62
collabn2-priv A 172.16.100.52
collabn2-priv1 A 172.16.101.52
collabn2-priv2 A 172.16.102.62

add also the reverse lookup in in-addr.arpa:

51.101.16.172 PTR collabn1-priv1.racattack.
61.102.16.172 PTR collabn1-priv2.racattack.
52.101.16.172 PTR collabn2-priv1.racattack.
62.102.16.172 PTR collabn2-priv2.racattack.
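
Reverse records are easy to get wrong by hand, so it can help to derive them from the forward addresses. A small sketch, assuming IPv4 and the `racattack` domain used in this lab; the helper name `ptr_record` is mine.

```shell
# Print a reverse-lookup line for one host, in the same layout used above.
# Usage: ptr_record <ipv4-address> <short-hostname>
ptr_record() {
    ip=$1
    host=$2
    old_ifs=$IFS
    IFS=.
    set -- $ip           # split the address into its four octets
    IFS=$old_ifs
    # Octets are written in reverse order, as in-addr.arpa requires
    echo "$4.$3.$2.$1 PTR $host.racattack."
}

# Addresses taken from the forward zone of this lab
ptr_record 172.16.101.51 collabn1-priv1
ptr_record 172.16.102.61 collabn1-priv2
ptr_record 172.16.101.52 collabn2-priv1
ptr_record 172.16.102.62 collabn2-priv2
```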

restart named on the first node and check that both nodes can ping all the names correctly:

[root@collabn1 named]# ping collabn2-priv1
PING collabn2-priv1.racattack (172.16.101.52) 56(84) bytes of data.
64 bytes from 172.16.101.52: icmp_seq=1 ttl=64 time=1.27 ms
64 bytes from 172.16.101.52: icmp_seq=2 ttl=64 time=0.396 ms
^C
--- collabn2-priv1.racattack ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1293ms
rtt min/avg/max/mdev = 0.396/0.835/1.275/0.440 ms
[root@collabn1 named]# ping collabn2-priv2
PING collabn2-priv2.racattack (172.16.102.62) 56(84) bytes of data.
64 bytes from 172.16.102.62: icmp_seq=1 ttl=64 time=0.924 ms
64 bytes from 172.16.102.62: icmp_seq=2 ttl=64 time=0.251 ms
^C
--- collabn2-priv2.racattack ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1480ms
rtt min/avg/max/mdev = 0.251/0.587/0.924/0.337 ms
[root@collabn1 named]# ping collabn1-priv2
PING collabn1-priv2.racattack (172.16.102.61) 56(84) bytes of data.
64 bytes from 172.16.102.61: icmp_seq=1 ttl=64 time=0.019 ms
64 bytes from 172.16.102.61: icmp_seq=2 ttl=64 time=0.032 ms
^C
--- collabn1-priv2.racattack ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1240ms
rtt min/avg/max/mdev = 0.019/0.025/0.032/0.008 ms
[root@collabn1 named]# ping collabn1-priv1
PING collabn1-priv1.racattack (172.16.101.51) 56(84) bytes of data.
64 bytes from 172.16.101.51: icmp_seq=1 ttl=64 time=0.017 ms
64 bytes from 172.16.101.51: icmp_seq=2 ttl=64 time=0.060 ms
^C
--- collabn1-priv1.racattack ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1224ms
rtt min/avg/max/mdev = 0.017/0.038/0.060/0.022 ms

check the nodes that compose the cluster:

[root@collabn1 network-scripts]# olsnodes -s
collabn1 Active
collabn2 Active

on all nodes, make a copy of the gpnp profile.xml (just in case, the oifcfg tool does the copy automatically)

$ cd $GRID_HOME/gpnp/`hostname`/profiles/peer/
$ cp -p profile.xml profile.xml.bk

List the available networks:

[root@collabn1 bin]# ./oifcfg iflist -p -n
eth0 192.168.78.0 PRIVATE 255.255.255.0
eth1 172.16.100.0 PRIVATE 255.255.255.0
eth1 169.254.0.0 UNKNOWN 255.255.0.0
eth2 172.16.101.0 PRIVATE 255.255.255.0
eth3 172.16.102.0 PRIVATE 255.255.255.0

Get the current ip configuration for the interconnect:

[root@collabn1 bin]# ./oifcfg getif
eth0 192.168.78.0 global public
eth1 172.16.100.0 global cluster_interconnect

on one node only, set the new interconnect interfaces:

[root@collabn1 network-scripts]# oifcfg setif -global eth2/172.16.101.0:cluster_interconnect
[root@collabn1 network-scripts]# oifcfg setif -global eth3/172.16.102.0:cluster_interconnect
[root@collabn1 network-scripts]# oifcfg getif
eth0 192.168.78.0 global public
eth1 172.16.100.0 global cluster_interconnect
eth2 172.16.101.0 global cluster_interconnect
eth3 172.16.102.0 global cluster_interconnect

check that the other nodes have received the new configuration:

[root@collabn2 bin]# ./oifcfg getif
eth0 192.168.78.0 global public
eth1 172.16.100.0 global cluster_interconnect
eth2 172.16.101.0 global cluster_interconnect
eth3 172.16.102.0 global cluster_interconnect
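
Comparing the listings by eye works for two nodes; with more nodes it is handier to reduce the `oifcfg getif` output to the interconnect entries and diff the result between nodes. A minimal sketch, assuming the four-column output format shown above; `interconnect_ifs` is a helper name of my own, and the sample input is just the listing from this post.

```shell
# Keep only the interfaces registered as cluster_interconnect.
# Reads `oifcfg getif` output on stdin, prints "interface subnet" per line.
interconnect_ifs() {
    awk '$NF ~ /cluster_interconnect/ { print $1, $2 }'
}

# Sample input copied from the output above; on a real node you would run:
#   oifcfg getif | interconnect_ifs
interconnect_ifs <<'EOF'
eth0 192.168.78.0 global public
eth1 172.16.100.0 global cluster_interconnect
eth2 172.16.101.0 global cluster_interconnect
eth3 172.16.102.0 global cluster_interconnect
EOF
```

If the filtered output is identical on every node, the new configuration has been propagated.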

Before deleting the old interface, it would be sensible to stop your cluster resources (in some cases, one of the nodes may be evicted). In any case, the cluster must be restarted completely in order to get the new interfaces working.

Note: having three interfaces in a HAIP interconnect works perfectly well; HAIP works with 2 to 4 interfaces. I’m showing how to delete eth1 just for information!! 🙂

[root@collabn1 network-scripts]# oifcfg delif -global eth1/172.16.100.0
[root@collabn1 network-scripts]# oifcfg getif
eth0 192.168.78.0 global public
eth2 172.16.101.0 global cluster_interconnect
eth3 172.16.102.0 global cluster_interconnect

on all nodes, shutdown the CRS:

[root@collabn1 network-scripts]# crsctl stop crs
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'collabn1'
...

Now you can disable the old interface:

[root@collabn1 network-scripts]# ifdown eth1

and set the parameter ONBOOT=no inside the configuration script of the eth1 interface.
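
If you prefer not to edit the file by hand, the change can be scripted. This is only a sketch: the function name `disable_onboot` is mine, and the path in the usage comment assumes the usual RHEL/OEL 6 location of the interface scripts.

```shell
# Set ONBOOT=no in an interface configuration file,
# keeping a .bak copy of the original next to it.
# Usage: disable_onboot <path-to-ifcfg-file>
disable_onboot() {
    cfg=$1
    cp -p "$cfg" "$cfg.bak"
    if grep -q '^ONBOOT=' "$cfg"; then
        # Rewrite the existing ONBOOT line, leave everything else untouched
        sed 's/^ONBOOT=.*/ONBOOT=no/' "$cfg.bak" > "$cfg"
    else
        # No ONBOOT line yet: append one
        echo 'ONBOOT=no' >> "$cfg"
    fi
}

# Usage, as root (assumed path):
#   disable_onboot /etc/sysconfig/network-scripts/ifcfg-eth1
```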

Start the cluster again:

[root@collabn1 network-scripts]# crsctl start crs

And check that the resources are up & running:

# crsctl stat res -t
--------------------------------------------------------------------------------
NAME TARGET STATE SERVER STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.DATA.dg
ONLINE ONLINE collabn1
ONLINE ONLINE collabn2
ora.LISTENER.lsnr
ONLINE ONLINE collabn1
ONLINE ONLINE collabn2
ora.asm
ONLINE ONLINE collabn1 Started
ONLINE ONLINE collabn2 Started
ora.gsd
OFFLINE OFFLINE collabn1
OFFLINE OFFLINE collabn2
ora.net1.network
ONLINE ONLINE collabn1
ONLINE ONLINE collabn2
ora.ons
ONLINE ONLINE collabn1
ONLINE ONLINE collabn2
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
1 ONLINE ONLINE collabn2
ora.LISTENER_SCAN2.lsnr
1 ONLINE ONLINE collabn1
ora.LISTENER_SCAN3.lsnr
1 ONLINE ONLINE collabn1
ora.collabn1.vip
1 ONLINE ONLINE collabn1
ora.collabn2.vip
1 ONLINE ONLINE collabn2
ora.cvu
1 ONLINE ONLINE collabn1
ora.oc4j
1 ONLINE ONLINE collabn1
ora.orcl.db
1 ONLINE ONLINE collabn1 Open
2 ONLINE ONLINE collabn2 Open
ora.scan1.vip
1 ONLINE ONLINE collabn2
ora.scan2.vip
1 ONLINE ONLINE collabn1
ora.scan3.vip
1 ONLINE ONLINE collabn1

Testing the high availability

Disconnect cable from one of the two interfaces (virtually if you’re in virtualbox 🙂 )

Pay attention to the NO-CARRIER status (on eth2 in this example):

# ip l
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 08:00:27:07:33:94 brd ff:ff:ff:ff:ff:ff
3: eth1: <BROADCAST,MULTICAST> mtu 1500 qdisc pfifo_fast state DOWN qlen 1000
link/ether 08:00:27:7f:b4:88 brd ff:ff:ff:ff:ff:ff
4: eth2: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast state DOWN qlen 1000
link/ether 08:00:27:51:1d:78 brd ff:ff:ff:ff:ff:ff
5: eth3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 08:00:27:39:86:f2 brd ff:ff:ff:ff:ff:ff
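
When a node has many interfaces, picking out the unplugged one by eye gets tedious. A small sketch that filters the `ip l` output for the NO-CARRIER flag; the helper name `no_carrier_ifs` is mine and the sample input is taken from the listing above.

```shell
# Print the names of the interfaces that report NO-CARRIER.
# Reads `ip link` output on stdin.
no_carrier_ifs() {
    # Fields are separated by ": ", so $2 is the interface name
    awk -F': ' '/<[^>]*NO-CARRIER/ { print $2 }'
}

# Sample input from the listing above; on a real node you would run:
#   ip l | no_carrier_ifs
no_carrier_ifs <<'EOF'
4: eth2: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast state DOWN qlen 1000
link/ether 08:00:27:51:1d:78 brd ff:ff:ff:ff:ff:ff
5: eth3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 08:00:27:39:86:f2 brd ff:ff:ff:ff:ff:ff
EOF
```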

check that the CRS is still up & running:

# crsctl stat res -t
--------------------------------------------------------------------------------
NAME TARGET STATE SERVER STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.DATA.dg
ONLINE ONLINE collabn1
ONLINE ONLINE collabn2
ora.LISTENER.lsnr
ONLINE ONLINE collabn1
ONLINE ONLINE collabn2
ora.asm
ONLINE ONLINE collabn1 Started
ONLINE ONLINE collabn2 Started
ora.gsd
OFFLINE OFFLINE collabn1
OFFLINE OFFLINE collabn2
ora.net1.network
ONLINE ONLINE collabn1
ONLINE ONLINE collabn2
ora.ons
ONLINE ONLINE collabn1
ONLINE ONLINE collabn2
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
1 ONLINE ONLINE collabn2
ora.LISTENER_SCAN2.lsnr
1 ONLINE ONLINE collabn1
ora.LISTENER_SCAN3.lsnr
1 ONLINE ONLINE collabn1
ora.collabn1.vip
1 ONLINE ONLINE collabn1
ora.collabn2.vip
1 ONLINE ONLINE collabn2
ora.cvu
1 ONLINE ONLINE collabn1
ora.oc4j
1 ONLINE ONLINE collabn1
ora.orcl.db
1 ONLINE ONLINE collabn1 Open
2 ONLINE ONLINE collabn2 Open
ora.scan1.vip
1 ONLINE ONLINE collabn2
ora.scan2.vip
1 ONLINE ONLINE collabn1
ora.scan3.vip
1 ONLINE ONLINE collabn1

The virtual interface eth2:1 has failed over to the second interface as eth3:2:

eth3:1    Link encap:Ethernet  HWaddr 08:00:27:39:86:F2
          inet addr:169.254.185.134  Bcast:169.254.255.255  Mask:255.255.128.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1

eth3:2    Link encap:Ethernet  HWaddr 08:00:27:39:86:F2
          inet addr:169.254.104.52  Bcast:169.254.127.255  Mask:255.255.128.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1

After the cable is reconnected, the virtual interface is back on eth2:

eth2:1 Link encap:Ethernet HWaddr 08:00:27:51:1D:78
inet addr:169.254.104.52 Bcast:169.254.127.255 Mask:255.255.128.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
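
To follow the failover and the failback without scrolling through the full ifconfig output, you can reduce it to the HAIP addresses. A minimal sketch, assuming the classic `ifconfig` layout shown in this post and the 169.254.x.x link-local range used by HAIP; `haip_addrs` is a name of my own.

```shell
# Print "virtual-interface address" for every 169.254.x.x address.
# Reads `ifconfig` output on stdin.
haip_addrs() {
    awk '
        /Link encap/ { ifname = $1 }        # first line of an interface stanza
        /inet addr:169\.254\./ {
            split($2, a, ":")               # $2 looks like "addr:169.254.x.y"
            print ifname, a[2]
        }'
}

# Sample input: the state during the failover; on a real node you would run:
#   ifconfig | haip_addrs
haip_addrs <<'EOF'
eth3:1    Link encap:Ethernet  HWaddr 08:00:27:39:86:F2
          inet addr:169.254.185.134  Bcast:169.254.255.255  Mask:255.255.128.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1

eth3:2    Link encap:Ethernet  HWaddr 08:00:27:39:86:F2
          inet addr:169.254.104.52  Bcast:169.254.127.255  Mask:255.255.128.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
EOF
```

Running it before, during and after the cable test shows the same address moving between the physical interfaces.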

Further information

For this post I’ve used RAC version 11.2, but RAC 12c uses the very same procedure.

You can discover more about HAIP here:

http://docs.oracle.com/cd/E11882_01/server.112/e10803/config_cw.htm#HABPT5279

And here about how to set it up (besides this post!):

https://docs.oracle.com/cd/E11882_01/rac.112/e41959/admin.htm#CWADD90980

https://docs.oracle.com/cd/E11882_01/rac.112/e41959/oifcfg.htm#BCGGEFEI

Cheers

—

Ludo

RAC Attack at IOUG Collaborate 2015

Posted on March 6, 2015 by Ludovico

Once again this year the RAC Attack will be a pre-conference workshop at Collaborate.

Whether you’re a sysadmin, a developer or a DBA, I’m sure you will really enjoy this workshop. Why?

First, you get the opportunity to install a RAC 12c using Virtualbox on your laptop and get coached by many RAC experts, Oracle ACEs and ACE Directors, OCMs and famous bloggers and technologists.

If you’ve never installed it, it will be very challenging because you get your hands on network components, shared disks, udev, DNS, Virtual Machine cloning, OS install and so on, while being the super-user (root) of your own cluster!! If you’re a developer, you can then start developing your applications by testing the failover features of RAC and their scalability by checking for global cache wait events.

If you’re already used to RAC, this year we have not one or two, but three deals for you:

Try the semi-automated RAC installation using Vagrant: you’ll be able to have your RAC up and running in minutes and concentrate on advanced features.
Implement advanced labs such as Flex Cluster and Flex ASM or Policy Managed Databases, and discover Hub and Leaf nodes, Server Pools and other features
Ask the ninjas to show you other advanced scenarios or just discuss about other RAC related topics

Isn’t that enough?

Participants who complete at least the Linux install (the very first stage of the workshop) will get an OTN-sponsored T-shirt of the event, with the brand-new RAC SIG logo (the image is purely indicative, the actual design may change):

Still not enough?

We’ll have free pizza (at lunch) and beer (in the afternoon), again sponsored by the Oracle Technology Network. Can’t believe it? Look at a few images from last year’s edition:

20140407_121101

RACAttackC14LV

Check the pre-conference workshops on the IOUG Collaborate 15 website and don’t forget to fulfill the requirements before attending the workshop:

To participate in the workshop, participants need to bring their own laptop. Recommended specification: a) any 64 bit OS that supports Oracle Virtual Box b) 8GB RAM, 45GB free HDD space, SSD recommended.

Important: it’s required to pre-download Oracle Database 12c and Oracle Grid Infrastructure 12c for Linux x86-64 from the Oracle Website http://tinyurl.com/rac12c-dl (four files: linuxamd64_12c_database_1of2.zip linuxamd64_12c_database_2of2.zip linuxamd64_12c_grid_1of2.zip linuxamd64_12c_grid_2of2.zip). Due to license restrictions it’s not possible to distribute Oracle software.

Looking forward to meeting you there!!!

—

Ludovico

Oracle RAC, Oracle Data Guard, and Pluggable Databases: When MAA Meets Oracle Multitenant (OOW14)

Posted on October 2, 2014 by Ludovico

Here you can find the material related to my session at Oracle Open World 2014. I’m sorry I’m late in publishing it, but I challenge you to find spare time during Oracle Open World! It’s the busiest week of the year! (Hard Work, Hard Play)

Slides

Demo 1 video

Demo 2 video

Demo 1 script

clear

function pause () {
	echo
	read -p "$*"
	echo
}

tnsping cdbatl

pause "next... status db"
clear
echo \$ srvctl status database -db CDBATL

srvctl status database -db CDBATL

pause "next... status pdb"

clear

sqlplus sys/racattack@cdbatl as sysdba <<EOF
	set echo on
	select INST_ID, CON_ID, name, OPEN_MODE
	 from gv\$pdbs
	  where con_id!=2
	  order by name, inst_id;
	exit
EOF

pause "next... add singleton service"

clear

###  add service MAAZAPP SINGLETON
cmd="srvctl add service -db CDBATL -service  maazapp -serverpool CDBPOOL -cardinality singleton -role primary -failovertype select -failovermethod basic -policy automatic -failoverdelay 2 -failoverretry 180 -pdb maaz"
echo \$ $cmd
eval $cmd

pause "next... start service"
clear

cmd="srvctl start service -db CDBATL -service maazapp -instance CDBATL_1"
echo \$ $cmd
eval $cmd

pause "next... status pdb"

clear

sqlplus sys/racattack@cdbatl as sysdba <<EOF
	set echo on
	select INST_ID, CON_ID, name, OPEN_MODE
	 from gv\$pdbs
	  where con_id!=2
	  order by name, inst_id;
	exit
EOF


cmd="srvctl status database -db cdbatl"
echo
echo \$ $cmd
eval $cmd

cmd="srvctl status service -service maazapp -db cdbatl"
echo
echo \$ $cmd
output=`$cmd`
echo $output

current=`echo $output | awk '{print $NF}'`
if [ "$current" == "raca01" ] ; then
	target="raca02"
else
	target="raca01"
fi


pause "pause... please launch demo1_client.sh"
pause "next... relocate service from $current to $target"

clear

cmd="srvctl relocate service -db CDBATL -service maazapp -currentnode $current -targetnode $target"
echo \$ $cmd
eval $cmd


pause "next... status pdb"

clear

sqlplus sys/racattack@cdbatl as sysdba <<EOF
	set echo on
	select INST_ID, CON_ID, name, OPEN_MODE
	 from gv\$pdbs
	  where con_id!=2
	  order by name, inst_id;
	exit
EOF

pause "next... close pdb immediate on old inst"

clear

sqlplus sys/racattack@cdbatl as sysdba <<EOF
	set echo on
 alter pluggable database maaz close immediate instances=('CDBATL_1');
	exit
EOF

pause "next... status pdb"

clear

sqlplus sys/racattack@cdbatl as sysdba <<EOF
	set echo on
	select INST_ID, CON_ID, name, OPEN_MODE
	 from gv\$pdbs
	  where con_id!=2
	  order by name, inst_id;
	exit
EOF

pause "next... modify service to uniform"

clear

cmd="srvctl modify service -db CDBATL -service maazapp -cardinality uniform"
echo \$ $cmd
eval $cmd

pause "next... status pdb"
clear
sqlplus sys/racattack@cdbatl as sysdba <<EOF
	set echo on
	select INST_ID, CON_ID, name, OPEN_MODE
	 from gv\$pdbs
	  where con_id!=2
	  order by name, inst_id;
	exit
EOF

echo
cmd="srvctl status service -service maazapp -db cdbatl"
echo \$ $cmd
eval $cmd 

exit

Demo 2 script

txtblk='\e[0;30m' # Black - Regular
txtred='\e[0;31m' # Red
txtgrn='\e[0;32m' # Green
txtrst='\e[0m'    # Text Reset

function echop () {
	echo
	echo -e "${txtgrn}$*${txtrst}"
}

function echos () {
	echo
	echo -e "${txtred}$*${txtrst}"
}

function pause() {
	echo
	read -p "$*"
	echo
}

clear

echop "Status of the PRIMARY DATABASE"
sqlplus sys/racattack@cdbatl as sysdba <<EOF
	set echo on
	select db_unique_name, database_role from v\$database;
	select inst_id, con_id, name,open_mode from gv\$pdbs where con_id!=2 order by con_id, inst_id;
	exit
EOF

pause "next... standby status"
clear
echos "Status of the STANDBY DATABASE"

sqlplus sys/racattack@cdbgva as sysdba <<EOF
	set echo on
	select db_unique_name, database_role from v\$database;
	select open_mode from v\$database;
	select inst_id, con_id, name,open_mode from gv\$pdbs where con_id!=2 order by con_id, inst_id;
	exit
EOF



pause "next... dgmgrl status"
clear
echos "Data Guard configuration and status of the STANDBY database"
dgmgrl <<EOF
connect sys/racattack
show configuration;
show database 'CDBGVA';
exit
EOF


pause "please do tail -f on the apply instance"
pause "next... create new pdb ludo on primary "
clear

echop "Create new pluggable database on the primary: "
echop "create pluggable database ludo admin user ludoadmin identified by ludoadmin;"
sqlplus sys/racattack@cdbatl as sysdba <<EOF
	set echo on
	create pluggable database ludo admin user ludoadmin identified by ludoadmin;

	select inst_id, con_id, name,open_mode from gv\$pdbs where con_id!=2 order by con_id, inst_id;
	exit
EOF

pause "next... create service for primary on both clusters"
clear

echop "Create service for primary ROLE on the primary cluster (CDBATL) via SSH"

cmd="srvctl add service -db CDBATL -service  ludoapp -serverpool CDBPOOL -cardinality singleton -role primary -failovertype select -failovermethod basic -policy automatic -failoverdelay 1 -failoverretry 180 -pdb ludo"
echo "\$ ssh raca01 $cmd"
ssh raca01 ". /home/oracle/.bash_profile ; $cmd"

echos "Create service for primary ROLE on the standby cluster (CDBGVA)"

cmd="srvctl add service -db CDBGVA -service  ludoapp -serverpool CDBPOOL -cardinality singleton -role primary -failovertype select -failovermethod basic -policy automatic -failoverdelay 1 -failoverretry 180 -pdb ludo"
echo "\$ $cmd"
eval $cmd


pause "next... start service on primary"
clear
echop "Starting service on the primary via SSH"
cmd="srvctl start service -db CDBATL -service  ludoapp"
echo "\$ ssh raca01 $cmd"
ssh raca01 ". /home/oracle/.bash_profile ; $cmd"



pause "next... create read only service for physical standby on both clusters"
clear

echop "Creating temporarily the readonly service for PRIMARY ROLE on the primary cluster (CDBATL) via SSH"
cmd="srvctl add service -db CDBATL -service  ludoread -serverpool CDBPOOL -cardinality singleton -role primary -failovertype select -failovermethod basic -policy automatic -failoverdelay 1 -failoverretry 180 -pdb ludo"
echo "\$ ssh raca01 $cmd"
ssh raca01 ". /home/oracle/.bash_profile ; $cmd"

echop "Starting the readonly service for PRIMARY ROLE on the primary cluster (CDBATL) via SSH"
cmd="srvctl start service -db CDBATL -service  ludoread"
echo "\$ ssh raca01 $cmd"
ssh raca01 ". /home/oracle/.bash_profile ; $cmd"

echop "Modifying the readonly service from PRIMARY ROLE to PHYSICAL STANDBY on the primary cluster (CDBATL) via SSH"
cmd="srvctl modify service -db CDBATL -service  ludoread -role physical_standby -pdb ludo"
echo "\$ ssh raca01 $cmd"
ssh raca01 ". /home/oracle/.bash_profile ; $cmd"


echos "Creating the readonly service for PHYSICAL STANDBY on the standby cluster (CDBGVA)"
cmd="srvctl add service -db CDBGVA -service  ludoread -serverpool CDBPOOL -cardinality singleton -role physical_standby -failovertype select -failovermethod basic -policy automatic -failoverdelay 1 -failoverretry 180 -pdb ludo"
echo "\$ $cmd"
eval $cmd

pause "next... start read only service"

clear
echos "Starting the readonly service on the standby cluster"

cmd="srvctl start service -db CDBGVA -service  ludoread"
echo "\$ $cmd"
eval $cmd


pause "next... standby status"
clear

echos "Standby status"
sqlplus sys/racattack@cdbgva as sysdba <<EOF
	select db_unique_name, database_role from v\$database;
	select open_mode from v\$database;
	select inst_id, con_id, name,open_mode from gv\$pdbs where con_id!=2 order by con_id, inst_id;
	exit
EOF

pause "please connect to the RW service"

pause "next... dgmgrl status and validate"
clear

echos "Validate Standby database"

dgmgrl <<EOF
connect sys/racattack
show configuration;
validate database 'CDBGVA';
exit
EOF

pause "next... switchover to CDBGVA"
clear

echos "Switchover to CDBGVA! (it takes a while)"
dgmgrl <<EOF
connect sys/racattack
switchover to 'CDBGVA';
exit
EOF

There’s one slide describing the procedure for cloning a PDB using the STANDBYS clause. Oracle released a note while I was preparing my slides (one month ago) and I wasn’t aware of it, so you may also want to check out this note on MOS:

Making Use of the STANDBYS=NONE Feature with Oracle Multitenant (Doc ID 1916648.1)

UPDATE: I’ve blogged about it in a more recent post: Tales from the Demo Grounds part 2: cloning a PDB with ASM and Data Guard (no ADG)

UPDATE 2: I’ve written another blog post about these topics: Cloning a PDB with ASM and Data Guard (no ADG) without network transfer

Cheers!

—

Ludovico

Speaker and Ninja at Collaborate14 – #C14LV

Posted on February 4, 2014 by Ludovico

This year I will have the honor of presenting at Collaborate14, from April 7th to 11th. First of all, many thanks to Trivadis, which has kindly agreed to send me to the conference.

My session (#603):
Oracle Data Guard 12c: Real-Time Cascade, Far Sync Instances and other goodies
has been accepted, so if you plan to attend Collaborate, I will be glad to see you there!
My paper and presentation are ready, but I’ll wait until after the conference before publishing them. Meanwhile, you can get a little sneak peek of my live demo (I’ll cut something, somewhere, and my new SSD should reduce the elapsed time, so I have to record it again on the new hardware to get correct timings 🙂 ). There’s no audio, since it’s supposed to be my failover demo if I have problems during my session.

Part I

Part II

I’ve submitted another abstract about Policy Managed Databases, but it has been put on the waiting list, on the assumption that Data Guard has a lot more users and that interest in the new Data Guard 12c features will be higher than in PMDBs, which are rarely used in production environments (and I’m sad about it; get in touch if you want to know more about this great technology).

RAC Attack 12c!

I’ll be organizing the RAC Attack again, along with Seth Miller, Yury Velikanov and Kamran Agayev. Sharing this exciting role with an Oracle ACE and two ACE Directors makes me proud of what I’m doing, but more than this, I’m happy to repeat another exciting experience like I had at OOW13.

This year RAC Attack will be an official pre-conference workshop. We have been contacted directly by the IOUG, and we’re making improvements. We’ll install RAC 12c, discuss advanced topics, have a lot of fun, drink a beer together and jump a lot! 🙂

Other mentors at the workshop will be Leighton Nelson, Maaz Anjum, Biju Thomas. You should know them already, so join us!

And don’t forget to register before February 12th to take advantage of the early bird discount!

—
Ludovico

How many Oracle instances can be consolidated on a single server?

Posted on November 6, 2013 by Ludovico

According to the Exadata consolidation guide, this is what you can consolidate on Oracle’s specialized hardware:

NOTE: The maximum number of database instances per cluster is 512 for Oracle 11g Release 1 and higher. An upper limit of 128 database instances per X2-2 or X3-2 database node and 256 database instances per X2-8 or X3-8 database node is recommended. The actual number of database instances per database node or cluster depends on application workload and their corresponding system resource consumption.

But how many instances are actually being consolidated by DBAs from all around the world?

I asked the Twitter community

I sent this tweet a couple of weeks ago and I would like to consolidate some replies into a single blog post.

who has done more than this on a single server? $ ps -eaf | grep ora_pmon | wc -l 77 #oracle #consolidation

— Ludovico Caldara (@ludovicocaldara) October 25, 2013

My customer’s environment, however, was NOT a production one. In production they have 45.

Some replies…

@CacheFlush @kevinclosson @ludovicocaldara I know one customer with 58

— Bjoern Rost (@brost) October 25, 2013

@ludovicocaldara I have more 🙂

— Wissem El Khlifi (@orawiss) October 25, 2013

@ludovicocaldara ay cant believe it, think they removed some DBs, i have now 73. Ok u win 🙁

— Wissem El Khlifi (@orawiss) October 25, 2013

Wissem scores 73 on a production system with 1TB of memory!

@ludovicocaldara about 150 TB of storage, 1TB memory, 40 Cores …

— Wissem El Khlifi (@orawiss) October 25, 2013

@ludovicocaldara we have at least 4 servers like this , big ones! 2 for prods …

— Wissem El Khlifi (@orawiss) October 25, 2013

Chris rightly suggests trying the new 12c consolidation features:

.@ludovicocaldara a good playground for either threaded_execution=true or multitenant 😉

— Christian Antognini (@ChrisAntognini) October 25, 2013

Kevin, great expert that he is, has already experimented with a 100-instance environment:

.@ludovicocaldara up to 100 instances/host when researching for this paper: http://t.co/sIAuvnDJ2D Proof point stayed much lower though

— Kevin (@kevinclosson) October 25, 2013

But Bertrand impresses with his numbers!

@ludovicocaldara Will give you the right number on Monday but I would say around 120 🙂

— Bertrand Drouvot (@BertrandDrouvot) October 25, 2013

@ludovicocaldara Here is the exact number dude: ps -ef | grep -c ora_smon 131

— Bertrand Drouvot (@BertrandDrouvot) October 28, 2013

@BertrandDrouvot @ludovicocaldara damn your the winner, i can’t higher then 118 but this customer is still busy migrating more.

— Klaas-Jan Jongsma (@futureveterans) October 28, 2013

@ludovicocaldara capacity planning is a big challenge here with currently around 1800 databases and this is constantly increasing…

— Bertrand Drouvot (@BertrandDrouvot) October 25, 2013

@ludovicocaldara @orawiss 1TB is our default as of a few months 🙂

— Bertrand Drouvot (@BertrandDrouvot) October 25, 2013

Intel platform with 1TB of RAM = Xeon E7, suggests Kevin:

. @BertrandDrouvot @ludovicocaldara @orawiss so you standardize on Xeon E7 ?

— Kevin (@kevinclosson) October 25, 2013

@kevinclosson how did you guess ? 🙂 @ludovicocaldara @orawiss

— Bertrand Drouvot (@BertrandDrouvot) October 25, 2013

. @BertrandDrouvot @ludovicocaldara @orawiss because there are no 1TB EP offerings… at least not until after Haswell. Ivy EX does 12TB 🙂

— Kevin (@kevinclosson) October 25, 2013

Flashdba has seen 87 instances on a single host, although on a multi-node RAC: still huge and complex!

@ludovicocaldara @BertrandDrouvot @orawiss When I worked for Oracle I saw 87 instances on single a node – and it was six node RAC

— flashdba (@flashdba) October 25, 2013

@ludovicocaldara @BertrandDrouvot @orawiss It was a production system too – and yes, it was a nightmare!

— flashdba (@flashdba) October 25, 2013

@kevinclosson @ludovicocaldara @BertrandDrouvot @orawiss The cluster had around 120 DBs, not every database had an instance on every node

— flashdba (@flashdba) October 25, 2013

@kevinclosson @ludovicocaldara @BertrandDrouvot @orawiss The result was between 50 and 87 instances running per node #complicated

— flashdba (@flashdba) October 25, 2013

Conclusion

Does this thread of tweets answer the question? Are you planning to consolidate your Oracle environment? If you have questions about how to plan your consolidation, don’t hesitate to get in touch! 🙂

—

Ludo

Oracle RAC and Policy Managed Databases

Posted on July 10, 2013 by Ludovico

Some weeks ago I commented on a good post by Martin Bach (@MartinDBA on Twitter, make sure to follow him!)

http://martincarstenbach.wordpress.com/2013/06/17/an-introduction-to-policy-managed-databases-in-11-2-rac/

What I’ve realized is that Policy Managed Databases are not widely used, there is a lot of misunderstanding about how they work, and there are some concerns about implementing them in production.

My current employer Trivadis (@Trivadis, make sure to call us if your database needs a health check :-)) uses PMDs as a best practice, so it’s worth spending a few words on them, isn’t it?

Why Policy Managed Databases?

PMDs are an efficient way to manage and consolidate several databases and services with the least effort. They rely on Server Pools. Server pools are used to physically partition a big cluster into smaller groups of servers. Each pool has three main properties:

A minimum number of servers required to compose the group
A maximum number of servers
A priority that makes a server pool more important than others

If the cluster loses a server, the following rules apply:

If a pool has fewer than min servers, a server is moved from a pool that has more than min servers, starting with the one with the lowest priority.
If a pool has fewer than min servers and no other pool has more than min servers, the server is moved from the pool with the lowest priority.
Pools with higher priority may give servers to pools with lower priority if the min server property is honored.

This means that if a serverpool has the greatest priority, all other server pools can be reduced to satisfy the number of min servers.

Generally speaking, when creating a policy managed database (it can be an existing one, of course!) it is assigned to a server pool rather than to a single server. The pool is seen as an abstract resource on which you can put workload.

If you read the definition of Cloud Computing given by the NIST (http://csrc.nist.gov/publications/nistpubs/800-145/SP800-145.pdf) you’ll find something similar:

Cloud computing is a model for enabling ubiquitous, convenient, on-demand network access to a shared
pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that
can be rapidly provisioned and released with minimal management effort or service provider interaction

There are some major benefits in using policy managed databases (that’s solely my opinion):

PMD instances are created/removed automatically. This means that you can add and remove nodes to/from the server pools or the whole cluster, and the underlying databases will be expanded or shrunk following the new topology.
Server Pools (which are the base for PMDs) allow you to give different priorities to different groups of servers. This means that, if correctly configured, you can lose several physical nodes without impacting your most critical applications and without reconfiguring the instances.
PMDs are the base for Quality of Service management, an 11gR2 feature that does cluster-wide resource management to achieve predictable performance on critical applications/transactions. QoS is a really advanced topic so I warn you: do not use it without appropriate knowledge. Again, Trivadis has deep knowledge of it so you may want to contact us for a consulting service (and why not, perhaps I’ll try to blog about it in the future).
RAC One Node databases (RONDs?) can work alongside PMDs to avoid instance proliferation for non-critical applications.
Oracle is pushing it to achieve maximum flexibility for the Cloud, so it’s a trendy technology that’s cool to implement!
I’ll find some other reasons, for sure! 🙂

What changes in real-life DB administration?

Well, the concept of a fixed Server -> Instance relation disappears, so at the very beginning you’ll have to be prepared for something dynamic (but once configured, things don’t change often).

As Martin pointed out in his blog, you’ll need to configure server pools and think about pools of resources rather than individual configuration items.

The spfile doesn’t contain any information related to specific instances, so the parameters must be database-wide.

The oratab will contain only the dbname, not the instance name, and the dbname is present in the oratab regardless of which server pool the server belongs to.

+ASM1:/oracle/grid/11.2.0.3:N           # line added by Agent
PMU:/oracle/db/11.2.0.3:N               # line added by Agent
TST:/oracle/db/11.2.0.3:N               # line added by Agent

Your scripts should take care of this.
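As a minimal sketch of what “taking care of this” can look like, the hypothetical helper below (`oratab_home` is my own name, not an Oracle tool) looks up the Oracle Home by database name and leaves the instance name to be resolved separately. It assumes the colon-separated oratab layout shown above and /etc/oratab as the default location:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the Oracle Home registered for a database name.
#   $1 = database name (with PMDs the oratab holds PMU, not PMU_1)
#   $2 = oratab path (assumed default: /etc/oratab)
oratab_home() {
    local dbname="$1" oratab="${2:-/etc/oratab}"
    # Strip trailing comments, then match the first field exactly.
    sed 's/#.*//' "$oratab" | awk -F: -v db="$dbname" '$1 == db { print $2; exit }'
}

# Demo against a temporary copy of the entries shown above.
sample=$(mktemp)
cat > "$sample" <<'EOF'
+ASM1:/oracle/grid/11.2.0.3:N           # line added by Agent
PMU:/oracle/db/11.2.0.3:N               # line added by Agent
TST:/oracle/db/11.2.0.3:N               # line added by Agent
EOF
oratab_home PMU "$sample"      # database name: prints /oracle/db/11.2.0.3
oratab_home PMU_1 "$sample"    # instance name: no match, prints nothing
rm -f "$sample"
```

The exact match on the first field is deliberate: a script that still passes an instance name gets an empty result instead of a wrong home.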

Also, when connecting to your database, you should rely on services and access your database remotely rather than trying to figure out where the instances are running. But if you really need it you can get it:

# srvctl status database -d PMU
Instance PMU_4 is running on node node2
Instance PMU_2 is running on node node3
Instance PMU_3 is running on node node4
Instance PMU_5 is running on node node6
Instance PMU_1 is running on node node7
Instance PMU_6 is running on node node8
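
If a script does need the local instance name, one option is to parse that output. This is a minimal sketch, assuming the line format shown above stays stable; `instance_on_node` is a hypothetical helper name:

```shell
#!/usr/bin/env bash
# Hypothetical helper: read "srvctl status database -d <db>" output on stdin
# and print the instance running on the node given as $1 (nothing if none).
instance_on_node() {
    awk -v node="$1" '$1 == "Instance" && $NF == node { print $2 }'
}

# Demo with part of the output shown above.
instance_on_node node4 <<'EOF'
Instance PMU_4 is running on node node2
Instance PMU_2 is running on node node3
Instance PMU_3 is running on node node4
EOF
# prints: PMU_3
```

On a cluster node the function would be fed with something like `srvctl status database -d PMU | instance_on_node "$(uname -n)"`, assuming `uname -n` returns the same short node name that the clusterware reports.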

An approach for the crontab: sooner or later every DBA needs to schedule tasks with cron. Since a RAC has multiple nodes, you don’t want to run the same script many times, but rather choose which node will execute it.

My personal approach (every DBA has a personal preference) is to check where the instance with cardinality 1 is running and match it against the current node, e.g.:

# [ `crsctl stat res ora.tst.db -k 1 | grep STATE=ONLINE | awk '{print $NF}'` == `uname -n` ]
# echo $?
0

# [ `crsctl stat res ora.tst.db -k 1 | grep STATE=ONLINE | awk '{print $NF}'` == `uname -n` ]
# echo $?
1

In the example, TST_1 is running on node1, so the first evaluation, run on node1, returns TRUE. The second evaluation is run on node2, so it returns FALSE.

This trick can be used to have an identical crontab on every server and to choose at run time whether the local server is the preferred one to run tasks for the specified database.
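
The check can be wrapped in a small function so that crontab entries stay readable. This is only a sketch: the function names are hypothetical, and the comparison is split from the `crsctl` lookup so that an empty answer (database down, resource not found) is handled explicitly instead of breaking the test expression:

```shell
#!/usr/bin/env bash
# Succeeds only when the node hosting instance number 1 is the local node.
#   $1 = node reported by the clusterware, $2 = local node name
# An empty $1 counts as "not here".
is_preferred_node() {
    [ -n "$1" ] && [ "$1" = "$2" ]
}

# Same pipeline as in the text; it needs a Grid Infrastructure environment,
# so it is only defined here and not called in the demo.
node_of_first_instance() {
    crsctl stat res "$1" -k 1 | grep STATE=ONLINE | awk '{print $NF}'
}

# Demo with literal node names instead of live clusterware output.
is_preferred_node node1 node1 && echo "node1: run the job"
is_preferred_node node1 node2 || echo "node2: skip the job"
```

A scheduled script could then start with `is_preferred_node "$(node_of_first_instance ora.tst.db)" "$(uname -n)" || exit 0`, so the same crontab line is harmless on every node that does not host the first instance.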

A proof of concept with Policy Managed Databases

My good colleague Jacques Kostic has given me access to an enterprise-grade private lab so I can show you some “live operations”.

Let’s start with the actual topology: it’s an 8-node stretched RAC with ASM diskgroups with failgroups on the remote site.

This should be enough to show you some capabilities of server pools.

The Generic and Free server pools

After a clean installation, you’ll end up with two default server pools:

The Generic one will contain all non-PMDs (if you use only PMDs it will be empty). The Free one will own the “spare” servers, i.e. those left over when all server pools have reached their maximum size and don’t require more servers.

New server pools

Currently, the cluster I’m working on has two serverpools already defined (PMU and TST):

(the node assignment in the graphic is not relevant here).

They have been created with a command like this one:

# srvctl add serverpool -g PMU -l 5 -u 6 -i 3

# srvctl add serverpool -g TST -l 2 -u 3 -i 2

“srvctl -h ” is a good starting point to have a quick reference of the syntax.

You can check the status with:

# srvctl status serverpool
Server pool name: Free
Active servers count: 0
Server pool name: Generic
Active servers count: 0
Server pool name: PMU
Active servers count: 6
Server pool name: TST
Active servers count: 2

and the configuration:

# srvctl config serverpool
Server pool name: Free
Importance: 0, Min: 0, Max: -1
Candidate server names:
Server pool name: Generic
Importance: 0, Min: 0, Max: -1
Candidate server names:
Server pool name: PMU
Importance: 3, Min: 5, Max: 6
Candidate server names:
Server pool name: TST
Importance: 2, Min: 2, Max: 3
Candidate server names:

Modifying the configuration of serverpools

In this scenario, PMU is too big. The sum of the minimum sizes is 2+5=7 nodes, so I have only one server that can be used for another server pool without falling below the minimum number of nodes.
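
The same arithmetic can be scripted from the configuration output. This is a minimal sketch, assuming the “Importance: x, Min: y, Max: z” line format shown above; `sum_pool_min` is a hypothetical helper name and the node count of 8 comes from this lab:

```shell
#!/usr/bin/env bash
# Hypothetical helper: sum the "Min:" values found in the output of
# "srvctl config serverpool" (read from stdin).
sum_pool_min() {
    awk -F'[:,]' '/Min:/ { sum += $4 } END { print sum + 0 }'
}

# Demo with the configuration shown above; 8 is the size of this cluster.
total_nodes=8
config='Server pool name: Free
Importance: 0, Min: 0, Max: -1
Server pool name: Generic
Importance: 0, Min: 0, Max: -1
Server pool name: PMU
Importance: 3, Min: 5, Max: 6
Server pool name: TST
Importance: 2, Min: 2, Max: 3'
min_sum=$(printf '%s\n' "$config" | sum_pool_min)
echo "reserved by minimums: $min_sum, spare: $((total_nodes - min_sum))"
# prints: reserved by minimums: 7, spare: 1
```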

I want to make room for another server pool composed of two or three nodes, so I reduce the minimum size of the serverpool PMU:

# srvctl modify serverpool -g PMU -l 3

Notice that PMU maxsize is still 6, so I don’t have free servers yet.

# srvctl status database -d PMU
Instance PMU_4 is running on node node2
Instance PMU_2 is running on node node3
Instance PMU_3 is running on node node4
Instance PMU_5 is running on node node6
Instance PMU_1 is running on node node7
Instance PMU_6 is running on node node8

So, if I try to create another serverpool I’m warned that some resources can be taken offline:

# srvctl add serverpool -g LUDO -l 2 -u 3 -i 1
PRCS-1009 : Failed to create server pool LUDO
PRCR-1071 : Failed to register or update server pool ora.LUDO
CRS-2736: The operation requires stopping resource 'ora.pmu.db' on server 'node8'
CRS-2736: The operation requires stopping resource 'ora.pmu.db' on server 'node3'
CRS-2737: Unable to register server pool 'ora.LUDO' as this will affect running resources, but the force option was not specified

The clusterware proposes to stop two instances of the pmu database, because the PMU serverpool can now shrink from 6 servers towards its new minimum of 3, but I have to confirm the operation with the -f flag.

Modifying the serverpool layout can take time if resources have to be started/stopped.

# srvctl status serverpool
Server pool name: Free
Active servers count: 0
Server pool name: Generic
Active servers count: 0
Server pool name: LUDO
Active servers count: 2
Server pool name: PMU
Active servers count: 4
Server pool name: TST
Active servers count: 2

My new serverpool ends up with only two nodes, because I’ve set its importance to 1 (PMU keeps the extra server, as it has an importance of 3).
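To see why the layout comes out as PMU 4, TST 2, LUDO 2, here is a small Python model of the placement. It is a simplification and not Oracle code: it assumes servers are handed out in two passes, minimums first and then maximums, always starting from the most important pool. The Pool class and the allocate function are hypothetical names, and the Free and Generic pools are left out.

```python
from dataclasses import dataclass


@dataclass
class Pool:
    name: str
    min_size: int
    max_size: int
    importance: int


def allocate(pools, servers):
    """Simplified placement: minimums first, then maximums, by descending importance."""
    ordered = sorted(pools, key=lambda p: p.importance, reverse=True)
    counts = {p.name: 0 for p in pools}
    # Pass 1: satisfy each pool's minimum, most important pool first.
    for p in ordered:
        give = min(p.min_size, servers)
        counts[p.name] += give
        servers -= give
    # Pass 2: hand out what is left, up to each pool's maximum.
    for p in ordered:
        give = min(p.max_size - counts[p.name], servers)
        counts[p.name] += give
        servers -= give
    return counts


# Pool settings used in this lab, after lowering the PMU minimum to 3.
pools = [
    Pool("PMU", min_size=3, max_size=6, importance=3),
    Pool("TST", min_size=2, max_size=3, importance=2),
    Pool("LUDO", min_size=2, max_size=3, importance=1),
]

print(allocate(pools, 8))  # {'PMU': 4, 'TST': 2, 'LUDO': 2}
```

The minimums take 7 of the 8 servers, and the single server left over goes to PMU because it is the most important pool that is still below its maximum.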

Inviting RAC One Node databases to the party

Now that I have some room on my new serverpool, I can start creating new databases.

With PMD I can add two types of databases: RAC or RACONENODE. Depending on the choice, I’ll have a database running on ALL NODES OF THE SERVER POOL or on ONE NODE ONLY. This is a kind of limitation in my opinion, and I hope Oracle will improve it in the near future: it would be great to specify the cardinality at database level as well.

Creating a RAC One DB is as simple as selecting two radio buttons in the dbca “standard” procedure:

The Server Pool can be created, or you can specify an existing one (as in this lab):

I’ve created two new RAC One Node databases:

DB LUDO (service PRISM :-))
DB VICO (service CHEERS)

I’ve ended up with something like this:

--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
ora.ludo.db   <<<<< RAC ONE
      1        ONLINE  ONLINE       node8                    Open
ora.ludo.prism.svc
      1        ONLINE  ONLINE       node8
ora.pmu.db
      1        ONLINE  ONLINE       node7                    Open
      2        ONLINE  ONLINE       node4                    Open
      3        ONLINE  ONLINE       node5                    Open
      4        ONLINE  ONLINE       node6                    Open
ora.tst.db
      1        ONLINE  ONLINE       node1                    Open
      2        ONLINE  ONLINE       node2                    Open
ora.vico.cheers.svc
      1        ONLINE  ONLINE       node3
ora.vico.db  <<<< RAC ONE
      1        ONLINE  ONLINE       node3                    Open

That can be represented with this picture:

RAC One Node databases can be managed as always with online relocation (is it still called O-Motion?)

Losing the nodes

In this situation, what happens if I lose (stop) one node?

# crsctl stop cluster -n node8
CRS-2673: Attempting to stop 'ora.crsd' on 'node8'
CRS-2790: Starting shutdown of Cluster Ready Services-managed resources on 'node8'
CRS-2673: Attempting to stop 'ora.LISTENER.lsnr' on 'node8'
CRS-2673: Attempting to stop 'ora.ludo.prism.svc' on 'node8'
CRS-2677: Stop of 'ora.ludo.prism.svc' on 'node8' succeeded
CRS-2677: Stop of 'ora.LISTENER.lsnr' on 'node8' succeeded
CRS-2673: Attempting to stop 'ora.node8.vip' on 'node8'
CRS-2677: Stop of 'ora.node8.vip' on 'node8' succeeded
CRS-2672: Attempting to start 'ora.node8.vip' on 'node4'
CRS-2676: Start of 'ora.node8.vip' on 'node4' succeeded
CRS-2673: Attempting to stop 'ora.ludo.db' on 'node8'
CRS-2677: Stop of 'ora.ludo.db' on 'node8' succeeded
CRS-2672: Attempting to start 'ora.ludo.db' on 'node3'
CRS-2676: Start of 'ora.ludo.db' on 'node3' succeeded
CRS-2672: Attempting to start 'ora.ludo.prism.svc' on 'node3'
CRS-2676: Start of 'ora.ludo.prism.svc' on 'node3' succeeded
CRS-2673: Attempting to stop 'ora.GRID.dg' on 'node8'
CRS-2673: Attempting to stop 'ora.DATA.dg' on 'node8'
CRS-2673: Attempting to stop 'ora.FRA.dg' on 'node8'
CRS-2673: Attempting to stop 'ora.RECO.dg' on 'node8'
CRS-2677: Stop of 'ora.DATA.dg' on 'node8' succeeded
CRS-2677: Stop of 'ora.FRA.dg' on 'node8' succeeded
CRS-2677: Stop of 'ora.RECO.dg' on 'node8' succeeded
CRS-2677: Stop of 'ora.GRID.dg' on 'node8' succeeded
CRS-2673: Attempting to stop 'ora.asm' on 'node8'
CRS-2677: Stop of 'ora.asm' on 'node8' succeeded
CRS-2673: Attempting to stop 'ora.ons' on 'node8'
CRS-2677: Stop of 'ora.ons' on 'node8' succeeded
CRS-2673: Attempting to stop 'ora.net1.network' on 'node8'
CRS-2677: Stop of 'ora.net1.network' on 'node8' succeeded
CRS-2792: Shutdown of Cluster Ready Services-managed resources on 'node8' has completed
CRS-2677: Stop of 'ora.crsd' on 'node8' succeeded
CRS-2673: Attempting to stop 'ora.ctssd' on 'node8'
CRS-2673: Attempting to stop 'ora.evmd' on 'node8'
CRS-2673: Attempting to stop 'ora.asm' on 'node8'
CRS-2677: Stop of 'ora.evmd' on 'node8' succeeded
CRS-2677: Stop of 'ora.asm' on 'node8' succeeded
CRS-2673: Attempting to stop 'ora.cluster_interconnect.haip' on 'node8'
CRS-2677: Stop of 'ora.cluster_interconnect.haip' on 'node8' succeeded
CRS-2677: Stop of 'ora.ctssd' on 'node8' succeeded
CRS-2673: Attempting to stop 'ora.cssd' on 'node8'
CRS-2677: Stop of 'ora.cssd' on 'node8' succeeded

The node belonged to the pool LUDO; however, this is the situation right after:

# srvctl status serverpool
Server pool name: Free
Active servers count: 0
Server pool name: Generic
Active servers count: 0
Server pool name: LUDO
Active servers count: 2
Server pool name: PMU
Active servers count: 3
Server pool name: TST
Active servers count: 2

A server has been taken from the pool PMU and given to the pool LUDO. This is because PMU had one more server than its minimum requirement.

If I now lose one node at a time, I’ll have the following situation:

1 node lost: PMU 3, TST 2, LUDO 2
2 nodes lost: PMU 3, TST 2, LUDO 1 (PMU is already at its minimum and has a higher importance, so LUDO is penalized because it has the lowest importance)
3 nodes lost: PMU 3, TST 2, LUDO 0 (as LUDO has the lowest importance)
4 nodes lost: PMU 3, TST 1, LUDO 0
5 nodes lost: PMU 3, TST 0, LUDO 0

So my hyper-super-critical application will still have three nodes, and plenty of resources to run on, even after multiple physical failures, because it sits in the server pool with the highest importance and a minimum size of 3.
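The sequence of node losses above can be replayed with a toy model in Python. The rules in it are assumptions made for the sketch, not a description of the real Clusterware algorithm: when a pool drops below its minimum, it first takes a server from any pool that holds more than its own minimum, and otherwise from the least important pool with a lower importance than its own. The names POOLS, pick_donor and lose_server are hypothetical, and which pool loses a node at each step is chosen arbitrarily.

```python
POOLS = {
    # name: (min_size, importance), values used in this lab
    "PMU": (3, 3),
    "TST": (2, 2),
    "LUDO": (2, 1),
}


def pick_donor(counts, needy):
    """Choose the pool that gives up a server, or None (assumed, simplified rules)."""
    needy_importance = POOLS[needy][1]
    others = [name for name in POOLS if name != needy and counts[name] > 0]
    by_importance = sorted(others, key=lambda name: POOLS[name][1])
    # Rule 1: a pool holding more than its minimum can spare a server.
    for name in by_importance:
        if counts[name] > POOLS[name][0]:
            return name
    # Rule 2: otherwise the least important pool below the needy one pays.
    for name in by_importance:
        if POOLS[name][1] < needy_importance:
            return name
    return None


def lose_server(counts, pool):
    """Return the new layout after `pool` loses one server."""
    if counts[pool] == 0:
        raise ValueError(f"pool {pool} has no server left to lose")
    counts = dict(counts)
    counts[pool] -= 1
    if counts[pool] < POOLS[pool][0]:
        donor = pick_donor(counts, pool)
        if donor is not None:
            counts[donor] -= 1
            counts[pool] += 1
    return counts


layout = {"PMU": 4, "TST": 2, "LUDO": 2}
# First failure is node8 (pool LUDO); the later ones are arbitrary picks.
for failed in ["LUDO", "PMU", "PMU", "PMU", "PMU"]:
    layout = lose_server(layout, failed)
    print(layout)
```

Under these assumptions the five printed layouts follow the same counts as the list: PMU stays at 3 throughout, while LUDO and then TST are drained.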

What I would ask Santa if I’m on the Nice List (and if Santa works at Redwood Shores)

Dear Santa, I would like:

To create databases with node cardinality, to have for example 2 instances in a 3-node server pool
Server Pools that are aware of the physical location when I use stretched clusters, so I could end up always with “at least one active instance per site”.

Think about it 😉

—

Ludovico