Posts Tagged ‘RAC’

10gR2 RAC hangs and “KSV master wait”

Friday, June 3rd, 2011

We recently migrated a customer’s 10gR2 RAC on AIX6.1 from GPFS+HACMP to a “basic” Clusterware with datafiles over ASM.
After (many) problems related to various installation bugs (the list of requirements for AIX is very long, incomplete and requires many one-off patches to complete), we had a problem during an import of a new schema: the import hung with no apparent wait events. We found that the event it was waiting for was classified as ‘Idle’:

SQL> SELECT sid, username, STATUS, event, wait_class, program FROM gv$session;
 
 SID USERNAME   STATUS   EVENT                WAIT_CLASS PROGRAM
---- ---------- -------- -------------------- ---------- ----------------------------------------
...
 135 SYS        ACTIVE   KSV master wait      Idle       imp@trndcsaixdb1 (TNS V1-V3)
...

The on ASM instance:

SQL> @wait10g
 
SID USERNAME   MACHINE         PROGRAM              EVENT                  SEQ#           P1  S_IN_WAIT STATE               STATUS
---- ---------- --------------- -------------------- -------------------- ------ ------------ ---------- ------------------- --------
201 SYS        trndcsaixdb1    oracle@trndcsaixdb1  enq: FA - access file      6   1178664965        744 WAITING             ACTIVE

The problem was related to datafile resize (we use autoextend) and according to MOS, we were encountering a bug:

Bug 11712836: RESIZING DATAFILE HUNG WAITING FOR KSV MASTER WAIT IN RAC

Shutting down one instance solved the problem. Now we have to avoid autoextend……. We never encountered this bug in many 10.2.0.4 rac installations.

Dataguard check script for Real Application Clusters (MAA)

Friday, December 31st, 2010

Two years after my posts:
Quick Oracle Dataguard check script and More about Dataguard and how to check it I faced a whole new Dataguard between two Oracle Real Application Clusters, aka Oracle Maximum Availability Architecture (MAA).

This enviromnent is relying on Windows OS. Don’t know how this could be called “availability” but here we are. I revisited my scripts in a quick and very dirty way. Please consider that I did copy and paste to check the alignment once per thread, but it should be improved with some kind of iteration to check each thread in a more structured fashion.

#!D:\oracle\product\10.2.0\db_1\perl\5.8.3\bin\MSWin32-x86-multi-thread\perl.exe -w
use DBI;
use DBD::Oracle qw(:ora_session_modes);
# DB connection #
my $prod  = "prod";
my $stby = "stby";
my $prodh;
unless ($prodh = DBI->connect('dbi:Oracle:'.$prod, 
    'sys', 'strongpwd', 
    {PrintError=>0, AutoCommit => 0,
    ora_session_mode => ORA_SYSDBA}))  {
print "Error connecting to DB: $DBI::errstr\n";
exit(1);
}
$prodh->{RaiseError}=1;
 
my $stbyh;
unless ($stbyh = DBI->connect('dbi:Oracle:'.$stby,
    'sys', 'strongpwd',
    {PrintError=>0, AutoCommit => 0,
    ora_session_mode => ORA_SYSDBA}))  {
print "Error connecting to DB: $DBI::errstr\n";
$prodh->disconnect;
exit(1);
}
$stbyh->{RaiseError}=1;
 
my $sth;
### query stdby MRP0
$sth = $stbyh->prepare( <<EOSQL );
select thread#, SEQUENCE#, BLOCK#
    from gv\$managed_standby 
    where process='MRP0'
EOSQL
$sth->execute();
my ($mrpthread, $mrpsequence, $mrpblock) = $sth->fetchrow_array();
$sth->finish();
 
### query stdby RFS
$sth = $stbyh->prepare( <<EOSQL );
select thread#, SEQUENCE#, BLOCK#
    from gv\$managed_standby 
    where process='RFS' and client_process='LGWR' order by thread#
EOSQL
$sth->execute();
my ($rfsthread1, $rfssequence1, $rfsblock1) = $sth->fetchrow_array();
my ($rfsthread2, $rfssequence2, $rfsblock2) = $sth->fetchrow_array();
$sth->finish();
 
### query prod
$sth = $prodh->prepare( <<EOSQL );
select thread#, SEQUENCE#, BLOCK#
    from gv\$managed_standby
    where process='LNS' order by thread#
EOSQL
$sth->execute();
my ($pthread1, $psequence1, $pblock1) = $sth->fetchrow_array();
my ($pthread2, $psequence2, $pblock2) = $sth->fetchrow_array();
$sth->finish();
 
 
printf ("ENVIRONM  Thread Sequence   Block\n");
printf ("--------- ------ ---------- ----------\n");
printf ("PROD     LNS1  1 %10d %10d\n", $psequence1, $pblock1);
printf ("STANDBY  RFS1  1 %10d %10d\n", $rfssequence1, $rfsblock1);
printf ("PROD     LSN2  2 %10d %10d\n", $psequence2, $pblock2);
printf ("STANDBY  RFS2  2 %10d %10d\n", $rfssequence2, $rfsblock2);
printf ("STANDBY  MRP0  %d %10d %10d\n", $mrpthread, $mrpsequence, $mrpblock);
 
my $psequence;
my $pblock;
if ( $mrpthread == 1 ) {
$psequence=$psequence1;
$pblock=$pblock1;
} else {
$psequence=$psequence2;
$pblock=$pblock2;
}
 
$sth = $stbyh->prepare( <<EOSQL );
select nvl(sum(blocks),0)
+ $pblock - $mrpblock as BLOCK_GAP
from gv\$archived_log
where thread#=$mrpthread and sequence#
between $mrpsequence and $psequence
EOSQL
$sth->execute();
my ($mrpblockgap) = $sth->fetchrow_array();
$sth->finish();
 
$sth = $stbyh->prepare( <<EOSQL );
select nvl(sum(blocks),0)
+ $pblock1 - $rfsblock1 as BLOCK_GAP
from gv\$archived_log
where thread#=1 and sequence#
between $rfssequence1 and $psequence1
EOSQL
$sth->execute();
my ($rfsblockgap1) = $sth->fetchrow_array();
$sth->finish();
 
$sth = $stbyh->prepare( <<EOSQL );
select nvl(sum(blocks),0)
+ $pblock2 - $rfsblock2 as BLOCK_GAP
from gv\$archived_log
where thread#=2 and sequence#
between $rfssequence2 and $psequence2
EOSQL
$sth->execute();
my ($rfsblockgap2) = $sth->fetchrow_array();
$sth->finish();
printf ("\n\n%-10d blocks gap in TRANSMISSION\n", $rfsblockgap1+$rfsblockgap2);
printf ("%-10d blocks gap in APPLY (MRP0)\n", $mrpblockgap);
 
$stbyh->disconnect;
$prodh->disconnect;

Please foreward me every improvement you implement over my code: it would be nice to post it here.

Oracle RAC Standard Edition to achieve low cost and high performance

Friday, November 28th, 2008

I finished today to create a new production environment based on 2 Linux serverX86_64 and running Oracle RAC 10gR2. (I know, there is 11g right now, but I’m a conservative!)
Wheeew, I just spent a couple of hours applying all the recommended patches!
We choosed 2 nodes with a maximum of 2 multi-core processors each one so we can license Standard Edition instead of Enterprise Edition. 64bits addressing allow us to allocate many gigabytes of SGA. I’m starting with 5Gb but I think we’ll need more. And a set of 6x300Gb 15krpms disks (it can be expanded with more disks and more shelves).
This configuration keeps low the total cost of ownership but achieves best performance.
Due to disks layout, costs and needed usable storage, we had to configure one huge RAID5 on the SAN with multi-path. I decided anyway to create 2 ASM disk groups (ASM is mandatory for Standard Edition RAC), one for the DB, the second one for the recovery area. With spare disks we should have enough availability and even if it’s a RAID5 I saw good write performances (>150M/s).

Welcome new RAC, I hope we’ll feel good together!