DBA survival BLOG

DBA stuff and Oracle Data Guard

Oracle Database 12c finally out!! First impressions

Posted on June 25, 2013 by Ludovico

After a long, long wait, Oracle finally announced the availability of its new-generation database. Looking at the new features, I think it will take me several months to learn them all. The impressive number of changes reminds me of release 10gR1, and I’m not surprised that Oracle has waited so long. Still, I bet we’ll find a huge number of bugs in the first release. As always, we’ll surely need to wait for the first patch set before going to production.

Does ‘c’ stand for cloud?

While Oracle has developed this release with the cloud in mind, the first word that comes to my mind is “consolidation”. The much-advertised new feature, Pluggable Databases (aka the Oracle Multitenant option), will be the dream of every datacenter manager, along with CloneDB (well, it was somehow already available in 11.2.0.2) and ASM Thin_provisioned diskgroups.

But yes, it’s definitely the best for clouds

Other features like Flex ASM, Flex Cluster, several new security features, cross-platform backups… let us imagine how far we can go in building private, multi-tenant clouds.

First steps, what changes with a typical installation

The process for a traditional standalone DB+ASM installation is the same as in 11gR2: you need to install the Grid Infrastructure first (and then take advantage of the Oracle Restart feature), and the Database software afterwards.

The installation documentation is complete as always, and it keeps getting bigger as the Grid Infrastructure capabilities grow.

To meet most installation prerequisites, Oracle has once again prepared an RPM that does the dirty work:

oracle-rdbms-server-12cR1-preinstall-1.0-3.el6.x86_64.rpm

Oracle suggests using Ksplice and also explicitly recommends the deadline I/O scheduler (it has long been a best practice, but I can’t remember it being officially documented).

The splash screen has become more “red”, making the installation process a more colorful experience. 😉

Once the GI is installed, the Database installation asks for several new OS groups: OSBACKUPDBA, OSDGDBA, OSKMDBA. This gives you more possibilities to split administration duties; not specifying them leads to the “old behavior”.

You can decide to use an ACFS filesystem for both the installation AND the database files (with some exceptions, e.g. Windows servers). So you can take advantage of the ACFS snapshot features for your data, provided that the performance is acceptable (I’ll try to test and blog more about this). You can use the copy-on-write feature to provide writable snapshot copies by directly embedding a special syntax inside the “create pluggable database” command. Unfortunately, Oracle has decided to deliver pluggable databases as an extra-cost option. :-/

Database creation with DBCA is even easier: there is an option for a very default installation that, as you can guess, uses templates with all options installed by default.

But the hot topic is that you can create it as a “Container Database”. This is done by appending the keywords “enable pluggable database;” at the end of the create database command. The process then puts all the required bricks in place (creation of the pdb$seed database and so on). I’ll cover the topic in separate posts because it’s really the biggest new feature.

You can still use advanced mode to have the “old style” database creation, where you can customize your database.

If you choose to only generate the scripts and run them manually (that’s my habit), you’ll notice that the SQL scripts are no longer run directly within the open SQL*Plus session: they’re run from a Perl script that basically suppresses all output to the terminal, giving the impression of a cleaner installation. IMO this is better only as long as everything runs fine.

host perl /u01/app/oracle/product/12.1.0/rdbms/admin/catcon.pl -l /u01/app/oracle/admin/CDBTEST/scripts -b catalog /u01/app/oracle/product/12.1.0/rdbms/admin/catalog.sql;

Finally, I get something familiar, but with a brand new release number! 🙂

[oracle@luc12c01 ~]$ sqlplus sys/*****@classic as sysdba

SQL*Plus: Release 12.1.0.1.0 Production on Thu May 9 22:36:27 2013

Copyright (c) 1982, 2013, Oracle. All rights reserved.

Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production
With the Partitioning, Automatic Storage Management, OLAP, Advanced Analytics
and Real Application Testing options

Stay tuned, I’ll write soon about some really interesting features of the new Oracle Database 12c!

Cheers

—

Ludo

Generating graphs massively from Windows Performance Counters logs

Posted on April 23, 2013 by Ludovico

Windows Performance Monitor is an invaluable tool when you don’t have external enterprise monitoring tools and you need to face performance problems, whether you have a web/application server, a mail server or a database server.

What I personally don’t like about it is what you get in terms of graphing. If you schedule and collect a large number of performance metrics, you will likely get lost adding and removing them in the graphical interface.

What I did a long time ago (and did again recently, after my old laptop was stolen 🙁 ) was to prepare a PHP script that parses the resulting CSV file and automatically generates one graph for each metric it finds.

Unfortunately, most of the Windows sysadmins among you won’t like that I did this on a Linux box, but I guess you can use my script if you install PHP inside Cygwin. The other tool you need is rrdtool; again, I use it massively for my graphing needs.

How to collect your data

Basically, you need to create a Data Collector within Performance Monitor that generates a log file. You can directly specify a CSV file (Log format: Comma separated) or generate a BLG file and convert it later (Log format: Binary). System dumps are not used, so if you use the standard Performance template, you can remove them from your collection.

Remember that the more counters you collect, the longer the graph generation will take. The script does not run in parallel, so it uses only one core. Roughly:

(Time to complete) = (Num Counters) * (Num Samples) * (Speed factor)

Here (Speed factor) depends on both the CPU speed and the disk speed, because of the huge number of syncs required to update several thousand files. I’ve tried to reduce the number of rrdtool update calls by queuing several values in a single command line and noticed a significant performance improvement, but I know it’s not enough.
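The batching idea can be sketched as follows. This is only an illustration in Python, not the PHP script itself: the function name `batched_update_commands` and the batch size of 50 are assumptions. The only thing it relies on from rrdtool is that `rrdtool update` accepts several `timestamp:value` arguments in a single invocation.

```python
from typing import Iterable, List, Tuple


def batched_update_commands(
    rrd_path: str,
    samples: Iterable[Tuple[int, float]],
    batch_size: int = 50,  # assumed value, tune it for your system
) -> List[List[str]]:
    """Group (timestamp, value) samples into as few 'rrdtool update' calls as possible."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    commands: List[List[str]] = []
    batch: List[str] = []
    for timestamp, value in samples:
        batch.append(f"{timestamp}:{value}")
        if len(batch) == batch_size:
            commands.append(["rrdtool", "update", rrd_path, *batch])
            batch = []
    if batch:  # flush the last, partially filled batch
        commands.append(["rrdtool", "update", rrd_path, *batch])
    return commands


# Made-up data: 120 samples, 5 seconds apart
demo = [(1366721762 + 5 * i, float(i)) for i in range(120)]
print(len(batched_update_commands("demo.rrd", demo)), "commands instead of", len(demo))
```

Each returned list can be handed to `subprocess.run`. With one process and one file sync per batch instead of one per sample, the (Speed factor) term should shrink accordingly.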

Converting a BLG (binary) log into a CSV log

Just use the relog tool:

C:\PerfLogs\Admin\Perftest\LUDO_20130423-000002> relog "Performance Counter.blg" -f csv -o "Performance Counter.csv"

Input
----------------
File(s):
     Performance Counter.blg (Binary)

Begin:    23.4.2013 14:56:02
End:      23.4.2013 15:33:37
Samples:  452

100.00%

Output
----------------
File:     Performance Counter.csv

Begin:    23.4.2013 14:56:02
End:      23.4.2013 15:33:37
Samples:  452

The command completed successfully.

Generating the graphs

Transfer the CSV file to the box where PHP and rrdtool are configured, then run:

[root@lucrac01 temp]# php process_l.php PerformanceCounter.csv

453--Creating rrd /root/temp/LUDO/IPv4/Datagrams_Received_Delivered_sec.rrd

Creating rrd /root/temp/LUDO/IPv4/Datagrams_Received_Unknown_Protocol.rrd
Creating rrd /root/temp/LUDO/IPv4/Fragmented_Datagrams_sec.rrd
Creating rrd /root/temp/LUDO/IPv4/Datagrams_sec.rrd

...

Creating rrd /root/temp/LUDO/Memory/Pages_Input_sec.rrd
Creating rrd /root/temp/LUDO/Memory/Pool_Paged_Resident_Bytes.rrd
Creating rrd /root/temp/LUDO/Memory/Write_Copies_sec.rrd

...

Creating rrd /root/temp/LUDO/PhysicalDisk_2_E__/Avg._Disk_sec_Transfer.rrd
Creating rrd /root/temp/LUDO/PhysicalDisk_1_D__/Avg._Disk_sec_Transfer.rrd
Creating rrd /root/temp/LUDO/PhysicalDisk_0_C__/Avg._Disk_sec_Transfer.rrd
----......

1.Generating Graph: /root/temp/LUDO/IPv4/Datagrams_Received_Delivered_sec.png
rrdtool graph /root/temp/LUDO/IPv4/Datagrams_Received_Delivered_sec.png --start "1366721762" --end "1366724017" --width 453 DEF:ds0=/root/temp/LUDO/IPv4/Datagrams_Received_Delivered_sec.rrd:value:LAST:step=5 LINE1:ds0#0000FF:"IPv4\Datagrams Received Delivered/sec" VDEF:ds0max=ds0,MAXIMUM VDEF:ds0avg=ds0,AVERAGE VDEF:ds0min=ds0,MINIMUM COMMENT:" " COMMENT:" Maximum " GPRINT:ds0max:"%6.2lf" COMMENT:" Average " GPRINT:ds0avg:"%6.2lf" COMMENT:" Minimum " GPRINT:ds0min:"%6.2lf"
534x177
2.Generating Graph: /root/temp/LUDO/IPv4/Datagrams_Received_Unknown_Protocol.png
rrdtool graph /root/temp/LUDO/IPv4/Datagrams_Received_Unknown_Protocol.png --start "1366721762" --end "1366724017" --width 453 DEF:ds0=/root/temp/LUDO/IPv4/Datagrams_Received_Unknown_Protocol.rrd:value:LAST:step=5 LINE1:ds0#0000FF:"IPv4\Datagrams Received Unknown Protocol" VDEF:ds0max=ds0,MAXIMUM VDEF:ds0avg=ds0,AVERAGE VDEF:ds0min=ds0,MINIMUM COMMENT:" " COMMENT:" Maximum " GPRINT:ds0max:"%6.2lf" COMMENT:" Average " GPRINT:ds0avg:"%6.2lf" COMMENT:" Minimum " GPRINT:ds0min:"%6.2lf"
534x177
...

Now it’s done!

The script generates a folder named after the server (LUDO in my example) and a subfolder for each class of counters (as you see them in Performance Monitor).

Inside each folder you will have a PNG (and an rrd) for each metric.
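A possible way to picture the mapping from a counter name to its folder and file is sketched below in Python. It is not taken from the PHP script: the function name `counter_to_path` and the sanitizing rule are assumptions, inferred only from names such as `PhysicalDisk_2_E__/Avg._Disk_sec_Transfer` in the output, and it assumes the CSV header uses Performance Monitor's usual `\\SERVER\Class(instance)\Metric` form.

```python
import re
from typing import Tuple

# Assumption: everything except letters, digits, '.' and '_' becomes '_'.
_UNSAFE = re.compile(r"[^A-Za-z0-9._]")


def counter_to_path(counter: str) -> Tuple[str, ...]:
    r"""Split '\\SERVER\Class\Metric' into sanitized (server, class, metric) names."""
    server, counter_class, metric = counter.lstrip("\\").split("\\", 2)
    return tuple(_UNSAFE.sub("_", part) for part in (server, counter_class, metric))


print("/".join(counter_to_path(r"\\LUDO\IPv4\Datagrams Received Delivered/sec")))
```

Joining the three parts with `/` and appending `.rrd` or `.png` gives paths shaped like the ones in the listing.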

Important: the RRD files are generated with a single round-robin archive whose size equals the number of samples. If you want the rrd to store your historical data, you’ll need to modify the script. Also, the width of the graph is the same as the number of samples (for best readability), but limited to 1000 to avoid huge images.
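As a rough sketch of what "a single round-robin archive sized to the number of samples" means in rrdtool terms, the Python helpers below build such a create command and compute the capped graph width. The data source name `value`, the `LAST` consolidation function and the 5-second step are read from the graph commands in the output; the `GAUGE` type, the heartbeat and the helper names are assumptions, so the real script may differ.

```python
from typing import List


def create_command(rrd_path: str, first_ts: int, samples: int, step: int = 5) -> List[str]:
    """Build an 'rrdtool create' call with one archive sized to the sample count."""
    return [
        "rrdtool", "create", rrd_path,
        "--start", str(first_ts - 1),      # must be earlier than the first update
        "--step", str(step),
        f"DS:value:GAUGE:{step * 2}:U:U",  # GAUGE type and heartbeat are assumptions
        f"RRA:LAST:0.5:1:{samples}",       # one archive, one row per sample
    ]


def graph_width(samples: int, limit: int = 1000) -> int:
    """Graph width in pixels: one pixel per sample, capped at the limit."""
    return min(samples, limit)


print(" ".join(create_command("demo.rrd", 1366721762, 452)))
```

To keep history across several log files, the place to change would be the `RRA` definition (more rows, or additional archives with a coarser resolution).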

Future Improvements

It would be nice to have a prepared set of standard graphs with multiple metrics (e.g. CPU user, system and idle together) and additional lines like regressions…

Download the script: process_l_php.txt and rename it with a .php extension.

Hope you’ll find it useful!

Cheers

Ludo

ORA-00600 and user identified by values ''

Posted on April 5, 2012 by Ludovico

With release 10.2.0.5 it was possible to do this:

SQL> select * from v$version;

BANNER
----------------------------------------------------------------
Oracle Database 10g Release 10.2.0.5.0 - 64bit Production
...

SQL> create user foo identified by values '';

User created.

With 11.2.0.3 an ORA-00600 is raised.

SQL> select * from v$version;

BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
...

sipes1 SQL> create user foo identified by values '';

create user foo identified by values ''
*
ERROR at line 1:
ORA-00600: internal error code, arguments: [kzsviver:1], [], [], [], [], [], [], [], [], [], [], []

Mass datafile resizing

Posted on March 10, 2012 by Ludovico

Recently I needed to extend many datafiles on a database with more than 500 tablespaces, because a lot of tablespaces were reaching the critical threshold.
Autoextend was not an option due to a bug I encountered on 10gR2 RAC on ASM and AIX.

The solution was the following script: it generates statements to resize the datafiles of tablespaces whose usage is over a defined threshold (the “80” in the where clause), bringing the percentage down below another defined threshold (the “75” in the select clause).

SELECT 'alter database datafile '''||f.file_name||
''' resize '||round(ceil(bytes/1024/1024/75*t.pct_used+50),-2)||'M;'
 FROM
   dba_data_files f, (
   SELECT  a.tablespace_name tablespace_name, a.total mb_total, nvl(b.free,0) mb_free,
          round((a.total-nvl(b.free,0))*100/a.total) pct_used
  FROM    (SELECT tablespace_name,round(sum(bytes)/1024/1024) free, max(bytes) maxfree
          FROM dba_free_space GROUP BY tablespace_name) b,
          (SELECT tablespace_name,decode(round(sum(bytes)/1024/1024),0,1,round(sum(bytes)/1024/1024)) total
          FROM dba_data_files GROUP BY tablespace_name) a
  WHERE a.tablespace_name = b.tablespace_name (+)) t  WHERE t.tablespace_name=f.tablespace_name
  AND t.pct_used>80;
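To make the sizing arithmetic of the query easier to follow, here is a small Python sketch of the same expression. The function name `new_size_mb` is made up for illustration, and exact fractions are used instead of Oracle's NUMBER arithmetic, so treat it as an approximation of what the database computes.

```python
import math
from fractions import Fraction


def new_size_mb(file_bytes: int, pct_used: int, target_pct: int = 75) -> int:
    """Mirror round(ceil(bytes/1024/1024/75*pct_used+50),-2) from the query."""
    grown = Fraction(file_bytes * pct_used, 1024 * 1024 * target_pct) + 50
    # Oracle's ROUND(n, -2) rounds halves up, unlike Python's round()
    return (math.ceil(grown) + 50) // 100 * 100


print(new_size_mb(2000 * 1024 * 1024, 85))  # prints 2300
```

For a 2000 MB datafile in a tablespace that is 85% full, the file grows to 2300 MB. If that file is as full as its tablespace, about 1700 MB of it are used, which is roughly 74% of the new size and therefore below the 75% target.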

Before extending, it’s possible to show how much space this mass resizing requires:

 SELECT sum(mb_new-mb_old) FROM (
  SELECT t.tablespace_name, f.file_name, bytes/1024/1024 mb_old, round(ceil(bytes/1024/1024/75*t.pct_used+50),-2) mb_new FROM
 dba_data_files f, (
 SELECT  a.tablespace_name tablespace_name, a.total mb_total, nvl(b.free,0) mb_free,
        round((a.total-nvl(b.free,0))*100/a.total) pct_used
FROM    (SELECT tablespace_name,round(sum(bytes)/1024/1024) free, max(bytes) maxfree
        FROM dba_free_space GROUP BY tablespace_name) b,
        (SELECT tablespace_name,decode(round(sum(bytes)/1024/1024),0,1,round(sum(bytes)/1024/1024)) total
        FROM dba_data_files GROUP BY tablespace_name) a
WHERE a.tablespace_name = b.tablespace_name (+)) t  WHERE t.tablespace_name=f.tablespace_name
AND t.pct_used>80);

Dog eat Dog… Oracle deletes itself by mistake!

Posted on July 21, 2011 by Ludovico

While implementing the backup on a new DB inherited from a customer, I scheduled our standard “type disk” backup procedure through RMAN, on Windows.
The morning after, I saw that the “delete obsolete” had tried to delete ALL CURRENT DATAFILES!!

RMAN retention policy will be applied to the command
RMAN retention policy is set to redundancy 1
allocated channel: ORA_DISK_1
channel ORA_DISK_1: sid=29 devtype=DISK
Deleting the following obsolete backups and copies:
Type                 Key    Completion Time    Filename/Handle
-------------------- ------ ------------------ --------------------
Backup Set           917    28-JUN-11
...
Backup Set           927    29-JUN-11
  Backup Piece       1005   29-JUN-11          H:\ORACLE\BACKUP\ORAPERSP\RMAN\SPFILEBCK_20110629
Datafile Copy        14     29-NOV-10          E:\ORACLE\ORADATA\ORAPERSP\INDX01.DBF
Datafile Copy        16     29-NOV-10          E:\ORACLE\ORADATA\ORAPERSP\TOOLS01.DBF
Datafile Copy        17     29-NOV-10          E:\ORACLE\ORADATA\ORAPERSP\USERS01.DBF
Datafile Copy        18     29-NOV-10          E:\ORACLE\ORADATA\ORAPERSP\DRSYS01.DBF
Datafile Copy        19     29-NOV-10          E:\ORACLE\ORADATA\ORAPERSP\EXAMPLE01.DBF
Datafile Copy        20     29-NOV-10          E:\ORACLE\ORADATA\ORAPERSP\ODM01.DBF
Datafile Copy        21     29-NOV-10          E:\ORACLE\ORADATA\ORAPERSP\XDB01.DBF
Datafile Copy        22     29-NOV-10          E:\ORACLE\ORADATA\ORAPERSP\CWMLITE01.DBF
Datafile Copy        23     29-NOV-10          E:\ORACLE\ORADATA\ORAPERSP\TBLDATI01.ORA
Datafile Copy        24     29-NOV-10          E:\ORACLE\ORADATA\ORAPERSP\TBLINDEX01.ORA
Datafile Copy        25     29-NOV-10          E:\ORACLE\ORADATA\ORAPERSP\OEM_REPOSITORY1.ORA
Datafile Copy        26     29-NOV-10          E:\ORACLE\ORADATA\ORAPERSP\SYSTEM01.DBF
Datafile Copy        27     29-NOV-10          E:\ORACLE\ORADATA\ORAPERSP\UNDOTBS01.DBF
deleted backup piece
...
deleted backup piece
backup piece handle=H:\ORACLE\BACKUP\ORAPERSP\RMAN\C-2220366420-20110628-02 recid=990 stamp=755031582
deleted backup piece
backup piece handle=H:\ORACLE\BACKUP\ORAPERSP\RMAN\C-2220366420-20110629-00 recid=1002 stamp=755130872
deleted backup piece
backup piece handle=H:\ORACLE\BACKUP\ORAPERSP\RMAN\CTL_20110629 recid=1004 stamp=755130883
deleted backup piece
backup piece handle=H:\ORACLE\BACKUP\ORAPERSP\RMAN\SPFILEBCK_20110629 recid=1005 stamp=755130885
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03009: failure of delete command on ORA_DISK_1 channel at 06/29/2011 22:34:55
ORA-19584: file E:\ORACLE\ORADATA\ORAPERSP\INDX01.DBF already in use
Recovery Manager complete.

That’s because all current datafiles were registered in the recovery catalog as backup copies. With a retention policy of redundancy 1, all of them were considered obsolete! But since it’s Windows, a delete command doesn’t delete datafiles while they are in use. What would have happened on Unix? We were just lucky!

Then we had to uncatalog all copies.

RMAN> list copy;


specification does not match any archive log in the recovery catalog

List of Datafile Copies
Key     File S Completion Time      Ckp SCN     Ckp Time             Name
------- ---- - -------------------- ----------- -------------------- ----
26      1    X 29-NOV-10            18535127593 29-NOV-10            E:\ORACLE\ORADATA\ORAPERSP\SYSTEM01.DBF
27      2    X 29-NOV-10            18535127762 29-NOV-10            E:\ORACLE\ORADATA\ORAPERSP\UNDOTBS01.DBF
14      3    X 29-NOV-10            18535122625 29-NOV-10            E:\ORACLE\ORADATA\ORAPERSP\INDX01.DBF
16      4    X 29-NOV-10            18535123721 29-NOV-10            E:\ORACLE\ORADATA\ORAPERSP\TOOLS01.DBF
17      5    X 29-NOV-10            18535124423 29-NOV-10            E:\ORACLE\ORADATA\ORAPERSP\USERS01.DBF
18      6    X 29-NOV-10            18535124439 29-NOV-10            E:\ORACLE\ORADATA\ORAPERSP\DRSYS01.DBF
19      7    X 29-NOV-10            18535124453 29-NOV-10            E:\ORACLE\ORADATA\ORAPERSP\EXAMPLE01.DBF
20      8    X 29-NOV-10            18535124554 29-NOV-10            E:\ORACLE\ORADATA\ORAPERSP\ODM01.DBF
21      9    X 29-NOV-10            18535125790 29-NOV-10            E:\ORACLE\ORADATA\ORAPERSP\XDB01.DBF
22      10   X 29-NOV-10            18535125874 29-NOV-10            E:\ORACLE\ORADATA\ORAPERSP\CWMLITE01.DBF
23      11   X 29-NOV-10            18535125887 29-NOV-10            E:\ORACLE\ORADATA\ORAPERSP\TBLDATI01.ORA
24      12   X 29-NOV-10            18535126750 29-NOV-10            E:\ORACLE\ORADATA\ORAPERSP\TBLINDEX01.ORA
25      13   X 29-NOV-10            18535127211 29-NOV-10            E:\ORACLE\ORADATA\ORAPERSP\OEM_REPOSITORY1.ORA

RMAN> change copy of datafile 1..N uncatalog;

uncataloged datafile copy
datafile copy filename=E:\ORACLE\ORADATA\ORAPERSP\INDX01.DBF recid=14 stamp=736336991
Uncataloged 1 objects
...

until no current datafile was reported as “obsolete”!

RMAN> report obsolete;

RMAN retention policy will be applied to the command
RMAN retention policy is set to redundancy 1
no obsolete backups found

Lesson learned: never schedule delete obsolete without actually checking what could be deleted!
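One way to automate that check is sketched below in Python: before scheduling a delete, scan the text of `report obsolete` for datafile copies whose names are also live datafiles. This is only an illustration under stated assumptions: the function name is hypothetical, RMAN is assumed to produce English messages where obsolete image copies appear as `Datafile Copy` rows, and the list of live datafiles is assumed to have been fetched separately (for example from `v$datafile`).

```python
from typing import Iterable, List


def obsolete_live_datafiles(report: str, live_datafiles: Iterable[str]) -> List[str]:
    """Return live datafiles that an RMAN obsolete report lists as datafile copies."""
    live = {path.lower() for path in live_datafiles}  # Windows paths: ignore case
    hits: List[str] = []
    for line in report.splitlines():
        fields = line.split()
        # Assumed layout: "Datafile Copy <key> <completion date> <file name>"
        if len(fields) >= 5 and fields[:2] == ["Datafile", "Copy"]:
            name = " ".join(fields[4:])
            if name.lower() in live:
                hits.append(name)
    return hits


sample = "Datafile Copy        14     29-NOV-10          E:\\ORACLE\\ORADATA\\ORAPERSP\\INDX01.DBF"
print(obsolete_live_datafiles(sample, ["E:\\ORACLE\\ORADATA\\ORAPERSP\\INDX01.DBF"]))
```

If the returned list is not empty, the scheduled job should stop and alert someone instead of running `delete obsolete`.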

10gR2 RAC hangs and “KSV master wait”

Posted on June 3, 2011 by Ludovico

We recently migrated a customer’s 10gR2 RAC on AIX6.1 from GPFS+HACMP to a “basic” Clusterware with datafiles over ASM.
After (many) problems related to various installation bugs (the list of requirements for AIX is very long, incomplete and requires many one-off patches to complete), we had a problem during an import of a new schema: the import hung with no apparent wait events. We found that the event it was waiting for was classified as ‘Idle’:

SQL> select sid, username, status, event, wait_class, program from gv$session;

 SID USERNAME   STATUS   EVENT                WAIT_CLASS PROGRAM
---- ---------- -------- -------------------- ---------- ----------------------------------------
...
 135 SYS        ACTIVE   KSV master wait      Idle       imp@trndcsaixdb1 (TNS V1-V3)
...

Then, on the ASM instance:

SQL> @wait10g

SID USERNAME   MACHINE         PROGRAM              EVENT                  SEQ#           P1  S_IN_WAIT STATE               STATUS
---- ---------- --------------- -------------------- -------------------- ------ ------------ ---------- ------------------- --------
201 SYS        trndcsaixdb1    oracle@trndcsaixdb1  enq: FA - access file      6   1178664965        744 WAITING             ACTIVE

The problem was related to datafile resize (we use autoextend) and according to MOS, we were encountering a bug:

Bug 11712836: RESIZING DATAFILE HUNG WAITING FOR KSV MASTER WAIT IN RAC

Shutting down one instance solved the problem. Now we have to avoid autoextend… We had never encountered this bug in many 10.2.0.4 RAC installations.

Dataguard check script for Real Application Clusters (MAA)

Posted on December 31, 2010 by Ludovico

Two years after my posts “Quick Oracle Dataguard check script” and “More about Dataguard and how to check it”, I faced a whole new Data Guard configuration between two Oracle Real Application Clusters, aka Oracle Maximum Availability Architecture (MAA).

This environment relies on Windows. I don’t know how this could be called “availability”, but here we are. I revisited my scripts in a quick and very dirty way. Please note that I simply copied and pasted the code to check the alignment once per thread; it should be improved with some kind of iteration to check each thread in a more structured fashion.

#!D:\oracle\product\10.2.0\db_1\perl\5.8.3\bin\MSWin32-x86-multi-thread\perl.exe -w
use DBI;
use DBD::Oracle qw(:ora_session_modes);

# DB connection: TNS aliases of the primary and of the standby
my $prod = "prod";
my $stby = "stby";

# Connect to the primary as SYSDBA
my $prodh;
unless ($prodh = DBI->connect('dbi:Oracle:'.$prod,
    'sys', 'strongpwd',
    {PrintError=>0, AutoCommit => 0,
    ora_session_mode => ORA_SYSDBA}))  {
    print "Error connecting to DB: $DBI::errstr\n";
    exit(1);
}
$prodh->{RaiseError}=1;

# Connect to the standby as SYSDBA
my $stbyh;
unless ($stbyh = DBI->connect('dbi:Oracle:'.$stby,
    'sys', 'strongpwd',
    {PrintError=>0, AutoCommit => 0,
    ora_session_mode => ORA_SYSDBA}))  {
    print "Error connecting to DB: $DBI::errstr\n";
    $prodh->disconnect;
    exit(1);
}
$stbyh->{RaiseError}=1;

my $sth;
### query stdby MRP0 (managed recovery: what is being applied)
$sth = $stbyh->prepare( <<EOSQL );
select thread#, SEQUENCE#, BLOCK#
    from gv\$managed_standby
    where process='MRP0'
EOSQL
$sth->execute();
my ($mrpthread, $mrpsequence, $mrpblock) = $sth->fetchrow_array();
$sth->finish();

### query stdby RFS (what is being received), one row per thread
$sth = $stbyh->prepare( <<EOSQL );
select thread#, SEQUENCE#, BLOCK#
    from gv\$managed_standby
    where process='RFS' and client_process='LGWR' order by thread#
EOSQL
$sth->execute();
my ($rfsthread1, $rfssequence1, $rfsblock1) = $sth->fetchrow_array();
my ($rfsthread2, $rfssequence2, $rfsblock2) = $sth->fetchrow_array();
$sth->finish();

### query prod LNS (what is being sent), one row per thread
$sth = $prodh->prepare( <<EOSQL );
select thread#, SEQUENCE#, BLOCK#
    from gv\$managed_standby
    where process='LNS' order by thread#
EOSQL
$sth->execute();
my ($pthread1, $psequence1, $pblock1) = $sth->fetchrow_array();
my ($pthread2, $psequence2, $pblock2) = $sth->fetchrow_array();
$sth->finish();

printf ("ENVIRONM  Thread Sequence   Block\n");
printf ("--------- ------ ---------- ----------\n");
printf ("PROD     LNS1  1 %10d %10d\n", $psequence1, $pblock1);
printf ("STANDBY  RFS1  1 %10d %10d\n", $rfssequence1, $rfsblock1);
printf ("PROD     LNS2  2 %10d %10d\n", $psequence2, $pblock2);
printf ("STANDBY  RFS2  2 %10d %10d\n", $rfssequence2, $rfsblock2);
printf ("STANDBY  MRP0  %d %10d %10d\n", $mrpthread, $mrpsequence, $mrpblock);

# Compare MRP0 with the primary thread it is currently applying
my $psequence;
my $pblock;
if ( $mrpthread == 1 ) {
    $psequence=$psequence1;
    $pblock=$pblock1;
} else {
    $psequence=$psequence2;
    $pblock=$pblock2;
}

# Apply gap: archived blocks between MRP0 and the primary position
$sth = $stbyh->prepare( <<EOSQL );
select nvl(sum(blocks),0)
+ $pblock - $mrpblock as BLOCK_GAP
from gv\$archived_log
where thread#=$mrpthread and sequence#
between $mrpsequence and $psequence
EOSQL
$sth->execute();
my ($mrpblockgap) = $sth->fetchrow_array();
$sth->finish();

# Transmission gap, thread 1
$sth = $stbyh->prepare( <<EOSQL );
select nvl(sum(blocks),0)
+ $pblock1 - $rfsblock1 as BLOCK_GAP
from gv\$archived_log
where thread#=1 and sequence#
between $rfssequence1 and $psequence1
EOSQL
$sth->execute();
my ($rfsblockgap1) = $sth->fetchrow_array();
$sth->finish();

# Transmission gap, thread 2
$sth = $stbyh->prepare( <<EOSQL );
select nvl(sum(blocks),0)
+ $pblock2 - $rfsblock2 as BLOCK_GAP
from gv\$archived_log
where thread#=2 and sequence#
between $rfssequence2 and $psequence2
EOSQL
$sth->execute();
my ($rfsblockgap2) = $sth->fetchrow_array();
$sth->finish();
printf ("\n\n%-10d blocks gap in TRANSMISSION\n", $rfsblockgap1+$rfsblockgap2);
printf ("%-10d blocks gap in APPLY (MRP0)\n", $mrpblockgap);

$stbyh->disconnect;
$prodh->disconnect;

100

101

102

103

104

105

106

107

108

109

110

111

112

113

114

115

116

#!D:\oracle\product\10.2.0\db_1\perl\5.8.3\bin\MSWin32-x86-multi-thread\perl.exe -w

use DBI;

use DBD::Oracle qw(:ora_session_modes);

# DB connection #

my $prod = "prod";

my $stby = "stby";

my $prodh;

unless ($prodh = DBI->connect('dbi:Oracle:'.$prod,

'sys', 'strongpwd',

{PrintError=>0, AutoCommit => 0,

ora_session_mode => ORA_SYSDBA})) {

print "Error connecting to DB: $DBI::errstr\n";

exit(1);

}

$prodh->{RaiseError}=1;

my $stbyh;

unless ($stbyh = DBI->connect('dbi:Oracle:'.$stby,

'sys', 'strongpwd',

{PrintError=>0, AutoCommit => 0,

ora_session_mode => ORA_SYSDBA})) {

print "Error connecting to DB: $DBI::errstr\n";

$prodh->disconnect;

exit(1);

}

$stbyh->{RaiseError}=1;

my $sth;

### query stdby MRP0

$sth = $stbyh->prepare( <<EOSQL );

select thread#, SEQUENCE#, BLOCK#

from gv\$managed_standby

where process='MRP0'

EOSQL

$sth->execute();

my ($mrpthread, $mrpsequence, $mrpblock) = $sth->fetchrow_array();

$sth->finish();

### query stdby RFS

$sth = $stbyh->prepare( <<EOSQL );

select thread#, SEQUENCE#, BLOCK#

from gv\$managed_standby

where process='RFS' and client_process='LGWR' order by thread#

EOSQL

$sth->execute();

my ($rfsthread1, $rfssequence1, $rfsblock1) = $sth->fetchrow_array();

my ($rfsthread2, $rfssequence2, $rfsblock2) = $sth->fetchrow_array();

$sth->finish();

### query prod

$sth = $prodh->prepare( <<EOSQL );

select thread#, SEQUENCE#, BLOCK#

from gv\$managed_standby

where process='LNS' order by thread#

EOSQL

$sth->execute();

my ($pthread1, $psequence1, $pblock1) = $sth->fetchrow_array();

my ($pthread2, $psequence2, $pblock2) = $sth->fetchrow_array();

$sth->finish();

printf ("ENVIRONM Thread Sequence Block\n");

printf ("--------- ------ ---------- ----------\n");

printf ("PROD LNS1 1 %10d %10d\n", $psequence1, $pblock1);

printf ("STANDBY RFS1 1 %10d %10d\n", $rfssequence1, $rfsblock1);

printf ("PROD LSN2 2 %10d %10d\n", $psequence2, $pblock2);

printf ("STANDBY RFS2 2 %10d %10d\n", $rfssequence2, $rfsblock2);

printf ("STANDBY MRP0 %d %10d %10d\n", $mrpthread, $mrpsequence, $mrpblock);

my $psequence;

my $pblock;

if ( $mrpthread == 1 ) {

$psequence=$psequence1;

$pblock=$pblock1;

} else {

$psequence=$psequence2;

$pblock=$pblock2;

}

$sth = $stbyh->prepare( <<EOSQL );

select nvl(sum(blocks),0)

+ $pblock - $mrpblock as BLOCK_GAP

from gv\$archived_log

where thread#=$mrpthread and sequence#

between $mrpsequence and $psequence

EOSQL

$sth->execute();

my ($mrpblockgap) = $sth->fetchrow_array();

$sth->finish();

$sth = $stbyh->prepare( <<EOSQL );

select nvl(sum(blocks),0)

+ $pblock1 - $rfsblock1 as BLOCK_GAP

from gv\$archived_log

where thread#=1 and sequence#

between $rfssequence1 and $psequence1

EOSQL

$sth->execute();

my ($rfsblockgap1) = $sth->fetchrow_array();

$sth->finish();

$sth = $stbyh->prepare( <<EOSQL );

select nvl(sum(blocks),0)

+ $pblock2 - $rfsblock2 as BLOCK_GAP

from gv\$archived_log

where thread#=2 and sequence#

between $rfssequence2 and $psequence2

EOSQL

$sth->execute();

my ($rfsblockgap2) = $sth->fetchrow_array();

$sth->finish();

printf ("\n\n%-10d blocks gap in TRANSMISSION\n", $rfsblockgap1+$rfsblockgap2);

printf ("%-10d blocks gap in APPLY (MRP0)\n", $mrpblockgap);

$stbyh->disconnect;

$prodh->disconnect;
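
To make the arithmetic of the BLOCK_GAP queries concrete, here is a minimal shell sketch with made-up numbers. The `block_gap` helper, the sequence numbers and the block counts are illustrative assumptions, not values from a real system: the helper only mirrors the SQL expression `nvl(sum(blocks),0) + $pblock - $mrpblock`.

```shell
# block_gap ARCHIVED_BLOCKS PROD_BLOCK STBY_BLOCK
# Mirrors the SQL expression: nvl(sum(blocks),0) + $pblock - $mrpblock
block_gap() {
    echo $(( $1 + $2 - $3 ))
}

# Hypothetical situation: MRP0 is applying sequence 100 at block 500,
# the primary is writing sequence 102 at block 300, and the archived
# logs 100 and 101 hold 2000 blocks each.
archived=$(( 2000 + 2000 ))
block_gap "$archived" 300 500
```

With these numbers the sketch prints 3800: the two archived logs in full, plus what the primary has already written in its current log, minus what the standby has already applied in the log it is working on.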

Please forward me any improvement you make to my code: it would be nice to post it here.

Oracle capacity planning with RRDTOOL

Posted on May 25, 2009 by Ludovico

RRDize everything, chapter 2

Oracle Database Server has the most powerful system catalog, which lets you query almost any aspect of an Oracle instance.
You can query many v$ fixed views at regular intervals and populate many RRD files through rrdtool: space usage, wait events, system statistics and so on…

Since release 10.1 Oracle has included the Automatic Workload Repository (AWR), a more refined version of good old Statspack.
Whether you use AWR or Statspack, you can rely on their views to collect data for your RRDs.

If you are administering a new instance and you haven’t collected its statistics so far, you can query, for example, the DBA_HIST_BG_EVENT_SUMMARY view to gather all the AWR data about wait events. The historical views are also useful for collecting data once a week, rather than querying the fixed views every few minutes and doing the hard work twice (you and AWR).

The whole process of gathering performance data and updating the rrd files can be summarized in the following steps:

– connect to the database
– query the AWR views
– check whether the rrd file exists, or create it
– build an rrdtool update command
– execute it to update the rrd file

The fewer rrdtool update commands you execute, the better the whole process performs.
Do it in a language you are comfortable with and that easily supports connection descriptors.

Since I’m very comfortable with php, I did it this way.

This is a very basic script that works well for me and performs well:

#!/usr/bin/php -f
<?php

define('WD','/opt/oracle/awr');
$cs         = $_SERVER['argv'][1];
$user       = 'mymonitoruser';
$pass       = 'mystrongpassword'; 

/* open a new connection */
$ds = oci_connect($user, $pass, $cs)
        or die ("Cannot connect to Oracle Database ".$cs."\n");

/* setting client nls environment */
$sql = "alter session set nls_timestamp_format='MM/DD/YY HH24:MI'";
$stmt = oci_parse($ds, $sql);
oci_execute($stmt);
oci_free_statement($stmt);                                         

/* create directory that will contain rrds (if not exists) */
if(!file_exists(WD.'/'.$cs))
                mkdir(WD.'/'.$cs);
if(!file_exists(WD.'/'.$cs.'/wait'))
                mkdir(WD.'/'.$cs.'/wait');                   

/* function to create new RRDs */
function createRRD($name, $interval, $cs) {
        $hb = $interval*5; //heartbeat
        $cmd="rrdtool create ".WD."/".$cs."/wait/${name}.rrd -s ".$interval." \
                -b \"now -3month\" DS:waits:DERIVE:$hb:0:U \
                DS:mswaited:DERIVE:$hb:0:U \
                RRA:AVERAGE:0.5:1:1440 RRA:AVERAGE:0.5:30:336 \
                RRA:AVERAGE:0.5:120:372 RRA:AVERAGE:0.5:720:730 \
                RRA:MIN:0.5:1:1440 RRA:MIN:0.5:30:336 \
                RRA:MIN:0.5:120:372 RRA:MIN:0.5:720:730 \
                RRA:MAX:0.5:1:1440 RRA:MAX:0.5:30:336 \
                RRA:MAX:0.5:120:372 RRA:MAX:0.5:720:730 \
                RRA:LAST:0.5:1:1440";
        //print $cmd."\n";
        return passthru($cmd);
}                                                                              

/* take the snapshot frequency from dba_hist_wr_control
 to create the RRD with correct heartbeat value */
$sql = 'select extract(hour from snap_interval)*3600 +
extract(minute from snap_interval)*60 as SEED from DBA_HIST_WR_CONTROL';
$stmt = oci_parse($ds, $sql);
oci_execute($stmt);
$row = oci_fetch_assoc($stmt);
$interval = $row['SEED'];
unset($row);
oci_free_statement($stmt);                                              

/* statement definition that will collect
 all snapshots for a certain wait event with more than
 a certain amount of time waited.
 Gathering ALL EVENTS could be time consuming and useless.
 I fetch rows ordered by event_name rather
 than by date because I can update many values
 into the same rrd with very few rrdupdate commands
*/
$sql = 'select s.END_INTERVAL_TIME END_INTERVAL_TIME,
    g.EVENT_NAME, g.WAIT_CLASS, g.TOTAL_WAITS,
    round(g.TIME_WAITED_MICRO/1000) MS
  from DBA_HIST_SNAPSHOT s,
   dba_hist_bg_event_summary g,
   v$instance i
 where s.SNAP_ID=g.SNAP_ID and g.wait_class!=\'Idle\'
  and g.TIME_WAITED_MICRO>100000
  and s.instance_number=i.instance_number
  and s.instance_number=g.instance_number
 order by 2,1';                                      

/* default prefetch size (148) matches default snapshot retention (24hx7dd) */
$stmt = oci_parse($ds, $sql);
oci_set_prefetch($stmt, 148);
oci_execute($stmt);

$i=0;
$oldevent="";
while ($row = oci_fetch_assoc($stmt)) {
        if ($oldevent != $row['EVENT_NAME']) {
                //NEW EVENT DETECTED: WILL START A NEW UPDATE CMD
                if ($i != 0 && !empty($cmd)) {
                        /* not the first occurrence,
                         I bet there's something in my buffer */
                        passthru($cmd);
                }
                $cleanName = preg_replace ("([^[:alnum:]_-])","_",$row['EVENT_NAME']);
                // if there is no rrd for this event, I create a new one
                if (!file_exists(WD."/".$cs."/wait/${cleanName}.rrd")) {
                        createRRD($cleanName, $interval, $cs);
                }
                /*
                * I initialize a new update command. This string act as a buffer: I append many
                * values to be updated so I'll update many values in a single command line:
                * less forks of rrdtool and less file opens: the whole update process has an
                * enormous improvement.
                */
                $precmd="rrdtool update ".WD."/".$cs."/wait/${cleanName}.rrd ";
                $lastcmd="rrdtool info ".WD."/".$cs."/wait/${cleanName}.rrd".
                        "| grep last_update | awk '{print \$NF}'";
                $last=trim(`$lastcmd`);
                printf ("%s - %s - last: %d\n", $row['EVENT_NAME'], $cleanName, $last);
                $i=0;
                $cmd=$precmd;
                $oldevent=$row['EVENT_NAME'];
        }
        $time=strtotime($row['END_INTERVAL_TIME']);
        //print "time: ".$time."  last: ".$last."\n";
        if ( $time > $last ) {
                $cmd.=" ".$time.":".$row['TOTAL_WAITS'].":".$row['MS'];
                $i++;
        }
        if ($i >= 40) {
                // when I reach 40 values per commandline I force
                // the update: next loop will reinitialize a new commandline.
                passthru($cmd);
                $cmd=$precmd;
                $i=0;
        }
        unset($row);

}
if ($i != 0) {
        /* one more update pending in my buffer */
        passthru($cmd);
}
oci_free_statement($stmt);
oci_close($ds);
?>
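
The buffering logic of the script above (append samples to one command line, flush every N samples, flush the remainder at the end) can be sketched in a few lines of shell. This is only a dry-run illustration under stated assumptions: the `batch_updates` function, the `demo.rrd` file name and the sample values are hypothetical, and the commands are printed rather than executed.

```shell
# batch_updates RRD_FILE BATCH_SIZE
# Reads one "timestamp:waits:ms" sample per line from stdin and prints one
# "rrdtool update" command per BATCH_SIZE samples (dry run: nothing is executed).
batch_updates() {
    rrd=$1
    max=$2
    n=0
    args=""
    while IFS= read -r sample; do
        [ -n "$sample" ] || continue
        args="$args $sample"
        n=$((n + 1))
        if [ "$n" -ge "$max" ]; then
            echo "rrdtool update $rrd$args"
            args=""
            n=0
        fi
    done
    # flush whatever is left in the buffer
    if [ "$n" -gt 0 ]; then
        echo "rrdtool update $rrd$args"
    fi
}

# Five made-up samples in batches of two: three commands are printed.
printf '%s\n' 1243250000:10:100 1243253600:12:130 1243257200:15:170 \
    1243260800:15:171 1243264400:20:250 | batch_updates demo.rrd 2
```

With a batch size of 40, as in the PHP script, a week of hourly snapshots for one event needs only a handful of rrdtool invocations instead of one per snapshot.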


Depending on how many different wait events you have, you’ll have a certain number of rrd files:

# ls -l
total 3864
-rw-r--r-- 1 ludovico ludovico 165304 May 25 15:00 Streams_AQ__enqueue_blocked_on_low_memory.rrd
-rw-r--r-- 1 ludovico ludovico 165304 May 20 08:18 buffer_busy_waits.rrd
-rw-r--r-- 1 ludovico ludovico 165304 May 25 15:00 control_file_parallel_write.rrd
-rw-r--r-- 1 ludovico ludovico 165304 May 25 15:00 control_file_sequential_read.rrd
-rw-r--r-- 1 ludovico ludovico 165304 Apr 30 10:12 cursor__pin_S_wait_on_X.rrd
-rw-r--r-- 1 ludovico ludovico 165304 May 25 15:00 db_file_scattered_read.rrd
-rw-r--r-- 1 ludovico ludovico 165304 May 25 15:00 db_file_sequential_read.rrd
-rw-r--r-- 1 ludovico ludovico 165304 May 25 15:00 events_in_waitclass_Other.rrd
-rw-r--r-- 1 ludovico ludovico 165304 May 25 15:00 latch__cache_buffers_chains.rrd
-rw-r--r-- 1 ludovico ludovico 165304 May 25 15:00 latch__library_cache.rrd
-rw-r--r-- 1 ludovico ludovico 165304 May 11 13:22 latch__library_cache_lock.rrd
-rw-r--r-- 1 ludovico ludovico 165304 May 20 08:18 latch__redo_writing.rrd
-rw-r--r-- 1 ludovico ludovico 165304 May 25 15:00 latch__row_cache_objects.rrd
-rw-r--r-- 1 ludovico ludovico 165304 May 25 15:00 latch__shared_pool.rrd
-rw-r--r-- 1 ludovico ludovico 165304 May 25 15:00 library_cache_load_lock.rrd
-rw-r--r-- 1 ludovico ludovico 165304 Apr 15 13:17 library_cache_lock.rrd
-rw-r--r-- 1 ludovico ludovico 165304 May 25 15:00 log_buffer_space.rrd
-rw-r--r-- 1 ludovico ludovico 165304 May 25 15:00 log_file_parallel_write.rrd
-rw-r--r-- 1 ludovico ludovico 165304 May 25 15:00 log_file_sequential_read.rrd
-rw-r--r-- 1 ludovico ludovico 165304 May 25 15:00 log_file_single_write.rrd
-rw-r--r-- 1 ludovico ludovico 165304 May 25 15:00 log_file_switch_completion.rrd
-rw-r--r-- 1 ludovico ludovico 165304 May 11 13:22 log_file_sync.rrd
-rw-r--r-- 1 ludovico ludovico 165304 May 25 15:00 os_thread_startup.rrd


As you can see, they are not so big…
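
The size in the listing can be roughly cross-checked against the RRA definitions in the `rrdtool create` command of the script. The sketch below is plain arithmetic and assumes that each stored value takes 8 bytes (a double), which is an assumption about the on-disk format and not something stated in this post.

```shell
# Rows per RRA, taken from the "rrdtool create" command in the script above.
avg_rows=$((1440 + 336 + 372 + 730))   # AVERAGE archives
min_rows=$avg_rows                     # MIN archives use the same row counts
max_rows=$avg_rows                     # MAX archives use the same row counts
last_rows=1440                         # single LAST archive
total_rows=$((avg_rows + min_rows + max_rows + last_rows))

ds_count=2        # waits and mswaited
bytes_per_value=8 # assumption: values are stored as 8-byte doubles

data_bytes=$((total_rows * ds_count * bytes_per_value))
echo "rows=$total_rows data_bytes=$data_bytes"
echo "remaining_bytes=$((165304 - data_bytes))"
```

Under that assumption the archives account for 161184 of the 165304 bytes, and the remaining 4120 bytes would be header and bookkeeping data. The size is fixed at creation time, so the files do not grow as new snapshots are loaded.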

Once you have your data in rrd files, it’s quite simple to script even complex plots with several data sources. Everything depends on the results you want.
This script stacks all my wait events for a given instance: it takes the directory containing all the rrds as its first argument and the number of hours to plot as its second argument:

cs=$1
hours=${2:-148}

eventlist=`ls $cs/wait/*rrd`

colors[1]="#000000"
colors[2]="#000055"
colors[3]="#0000aa"
colors[4]="#0000ff"
colors[5]="#550055"
colors[6]="#aa00aa"
colors[7]="#ff00ff"
colors[8]="#550000"
colors[9]="#aa0000"
colors[10]="#ff0000"
colors[11]="#555500"
colors[12]="#aaaa00"
colors[13]="#ffff00"
colors[14]="#005500"
colors[15]="#00aa00"
colors[16]="#00ff00"
colors[17]="#005555"
colors[18]="#00aaaa"
colors[19]="#00ffff"
colors[20]="#555555"
colors[21]="#aaaaaa"

i=0

for event in $eventlist ; do
        if [ $i -eq 0 ] ; then
                end=`rrdtool info $event | grep last_update | awk '{print $NF}'`
                end=`rrdtool info $cs/wait/control_file_parallel_write.rrd | grep last_update | awk '{print $NF}'`
                cmd="rrdtool graph - -s end-${hours}hours -e $end  -v \"milliseconds waited\" -l 0 -w 640 -h 240 -t \"$cs WAIT PROFILE\""
                i=$(($i+1))
        fi
        color=${colors[$i]}
        echo $color
        evname=`basename $event | sed -e s/\.rrd\$//`
        cmd="$cmd  DEF:$evname=$event:mswaited:AVERAGE"
        cmd="$cmd  AREA:${evname}${color}:"$evname":STACK"
        i=$(($i+1))
        if [ $i -eq 20 ] ; then
                i=1
        fi
done
        cmd="$cmd  |display /dev/input"
        echo $cmd
        eval $cmd
exit
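
To see in isolation how each rrd file turns into a DEF/AREA pair with a cycling colour, here is a minimal sketch that only prints the arguments, one per line, without calling rrdtool. The `build_graph_args` function and the three-colour palette are illustrative assumptions; the script above uses a longer palette and concatenates everything into a single string.

```shell
# build_graph_args FILE...
# Prints one rrdtool graph argument per line (DEF, then AREA) for every
# RRD file given, cycling through a small colour palette.
build_graph_args() {
    palette="#000000 #000055 #0000aa"   # shortened, illustrative palette
    i=0
    for rrd in "$@"; do
        name=$(basename "$rrd" .rrd)
        # pick colour number (i mod 3) + 1 from the palette
        idx=$(( i % 3 + 1 ))
        color=$(echo "$palette" | cut -d ' ' -f "$idx")
        echo "DEF:$name=$rrd:mswaited:AVERAGE"
        echo "AREA:$name$color:$name:STACK"
        i=$((i + 1))
    done
}

build_graph_args mydb/wait/log_file_sync.rrd mydb/wait/buffer_busy_waits.rrd
```

Printing one argument per line makes the list easy to inspect before it is handed to rrdtool graph, which helps when a plot comes out with an unexpected colour or a missing event.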


The resulting command is very long:

rrdtool graph - -s end-148hours -e 1243252800 \
 -v "milliseconds waited" -l 0 -w 640 -h 240 -t "mydb WAIT PROFILE"\
 DEF:Streams_AQ__enqueue_blocked_on_low_memory=mydb/wait/Streams_AQ__enqueue_blocked_on_low_memory.rrd:mswaited:AVERAGE \
 AREA:Streams_AQ__enqueue_blocked_on_low_memory#000000:Streams_AQ__enqueue_blocked_on_low_memory:STACK\
 DEF:buffer_busy_waits=mydb/wait/buffer_busy_waits.rrd:mswaited:AVERAGE \
 AREA:buffer_busy_waits#000055:buffer_busy_waits:STACK\
 DEF:control_file_parallel_write=mydb/wait/control_file_parallel_write.rrd:mswaited:AVERAGE \
 AREA:control_file_parallel_write#0000aa:control_file_parallel_write:STACK\
 DEF:control_file_sequential_read=mydb/wait/control_file_sequential_read.rrd:mswaited:AVERAGE \
 AREA:control_file_sequential_read#0000ff:control_file_sequential_read:STACK\
 DEF:cursor__pin_S_wait_on_X=mydb/wait/cursor__pin_S_wait_on_X.rrd:mswaited:AVERAGE \
 AREA:cursor__pin_S_wait_on_X#550055:cursor__pin_S_wait_on_X:STACK\
 DEF:db_file_scattered_read=mydb/wait/db_file_scattered_read.rrd:mswaited:AVERAGE \
 AREA:db_file_scattered_read#aa00aa:db_file_scattered_read:STACK\
 DEF:db_file_sequential_read=mydb/wait/db_file_sequential_read.rrd:mswaited:AVERAGE \
 AREA:db_file_sequential_read#ff00ff:db_file_sequential_read:STACK\
 DEF:events_in_waitclass_Other=mydb/wait/events_in_waitclass_Other.rrd:mswaited:AVERAGE \
 AREA:events_in_waitclass_Other#550000:events_in_waitclass_Other:STACK\
 DEF:latch__cache_buffers_chains=mydb/wait/latch__cache_buffers_chains.rrd:mswaited:AVERAGE \
 AREA:latch__cache_buffers_chains#aa0000:latch__cache_buffers_chains:STACK\
 DEF:latch__library_cache=mydb/wait/latch__library_cache.rrd:mswaited:AVERAGE \
 AREA:latch__library_cache#ff0000:latch__library_cache:STACK\
 DEF:latch__library_cache_lock=mydb/wait/latch__library_cache_lock.rrd:mswaited:AVERAGE \
 AREA:latch__library_cache_lock#555500:latch__library_cache_lock:STACK\
 DEF:latch__redo_writing=mydb/wait/latch__redo_writing.rrd:mswaited:AVERAGE \
 AREA:latch__redo_writing#aaaa00:latch__redo_writing:STACK\
 DEF:latch__row_cache_objects=mydb/wait/latch__row_cache_objects.rrd:mswaited:AVERAGE \
 AREA:latch__row_cache_objects#ffff00:latch__row_cache_objects:STACK\
 DEF:latch__shared_pool=mydb/wait/latch__shared_pool.rrd:mswaited:AVERAGE \
 AREA:latch__shared_pool#005500:latch__shared_pool:STACK\
 DEF:library_cache_load_lock=mydb/wait/library_cache_load_lock.rrd:mswaited:AVERAGE \
 AREA:library_cache_load_lock#00aa00:library_cache_load_lock:STACK\
 DEF:library_cache_lock=mydb/wait/library_cache_lock.rrd:mswaited:AVERAGE \
 AREA:library_cache_lock#00ff00:library_cache_lock:STACK\
 DEF:log_buffer_space=mydb/wait/log_buffer_space.rrd:mswaited:AVERAGE \
 AREA:log_buffer_space#005555:log_buffer_space:STACK\
 DEF:log_file_parallel_write=mydb/wait/log_file_parallel_write.rrd:mswaited:AVERAGE \
 AREA:log_file_parallel_write#00aaaa:log_file_parallel_write:STACK\
 DEF:log_file_sequential_read=mydb/wait/log_file_sequential_read.rrd:mswaited:AVERAGE \
 AREA:log_file_sequential_read#00ffff:log_file_sequential_read:STACK\
 DEF:log_file_single_write=mydb/wait/log_file_single_write.rrd:mswaited:AVERAGE \
 AREA:log_file_single_write#000000:log_file_single_write:STACK\
 DEF:log_file_switch_completion=mydb/wait/log_file_switch_completion.rrd:mswaited:AVERAGE \
 AREA:log_file_switch_completion#000055:log_file_switch_completion:STACK\
 DEF:log_file_sync=mydb/wait/log_file_sync.rrd:mswaited:AVERAGE \
 AREA:log_file_sync#0000aa:log_file_sync:STACK\
 DEF:os_thread_startup=mydb/wait/os_thread_startup.rrd:mswaited:AVERAGE \
 AREA:os_thread_startup#0000ff:os_thread_startup:STACK |display /dev/input


This is the resulting graph:

OHHHHHHHHHHHH COOOOL!!!
😉

Any comment is appreciated! thanks

How to collect Oracle Application Server performance data with DMS and RRDtool

Posted on March 2, 2009 by Ludovico

RRDize everything, chapter 1

If you are managing some Application Server deployments, you have probably wondered how to check and collect performance data.
As stated in the documentation, you can gather performance metrics with the dmstool utility.
AFAIK, this can be done from release 9.0.2 upwards, but I’m afraid DMS will not work on WebLogic.

Basically, you need an external server that acts as collector (it could be a server in the Oracle AS farm as well): copy the dms.jar library from an Oracle AS installation to your collector and use it as you would use dmstool:

java -jar dms.jar [dmstool options]

There are three basic methods to get data:

Get all metrics at once:

java -jar dms.jar -dump -a "youraddress://..." [format=xml]

Get only the interesting metrics:

java -jar dms.jar -a "youraddress://..." metric metric ...

Get metrics included into specific DMS tables:

java -jar dms.jar -a "youraddress://..." -table table table ...

What youraddress:// is depends on the component you are connecting to:

opmn://asserver:6003
http://asserver:7200/dms0/Spy
ajp13://asserver:3301/dmsoc4j/Spy


If you are trying to connect to the OHS (Apache), be careful to allow remote access from the collector by editing the dms.conf file.

Now that you can query DMS data, you need to store it somewhere.
Personally, I made a first attempt with dmstool -dump format=xml. I wrote a parser in PHP with the SimpleXML extension and did a lot of inserts into a MySQL database. After a few months, the data collected from tens of servers was too much to be maintained…
To avoid maintaining a DWH-grade database I investigated and found RRDTool. Now I wonder how I ever lived without it!

I then wrote a parser in awk that parses the output of the dms.jar call and invokes an rrdtool update command.
I always use the dms.jar -table command. The output always has the same format:

###SOF

Mon Mar 02 17:01:19 CET 2009

---------------
TABLE1_Name
---------------

record1_metric1.name:     value       units
record1_metric2.name:     value       units
....

record2_metric1.name:     value       units
record2_metric2.name:     value       units
....

---
TABLE2_Name
---

record1_metric1.name:     value       units
record1_metric2.name:     value       units
....

record2_metric1.name:     value       units
record2_metric2.name:     value       units
....

##EOF


So I wrote an awk file that works for me.
Use it this way:

 java -jar dms.jar ... | awk -f parse_output.awk

####################
# parse_output.awk #
####################

#function pl() replaces all non alphanumeric occurrences with an underscore
function pl(input) {
        return gensub("[^[:alnum:]_-]","_","G",input);
}

# function get_rrd_path() returns a path where the rrd files should be placed
# I should rewrite a new path for each dms table... I'll skip many of them
function get_rrd_path() {
        if (table == "mod_oc4j_destination_metrics")
                return sprintf("%s/%s/%s/%s.rrd", record["Host"],
                    pl(table), pl(record["Name.value"]), pl(var) );
        if (table == "mod_oc4j_mount_pt_metrics")
                return sprintf("%s/%s/%s/%s/%s.rrd", record["Host"],
                    pl(table), pl(record["Destination.value"]), pl(record["Name.value"]), pl(var) );
        if (table == "ohs_server")
                return sprintf("%s/%s/%s.rrd", record["Host"], pl(table), pl(var) );
        if (table == "JVM")
                return sprintf("%s/%s/%s/%s.rrd", record["Host"],
                    pl(table), pl(record["Process"]), pl(var) );
        if (table == "opmn_process")
                return sprintf("%s/%s/%s/%s/%s/%s/%s/%s.rrd", record["Host"], pl(table),
                  pl(record["iasInstance.value"]), pl(record["opmn_ias_component"]),
                  pl(record["opmn_process_type"]),pl(record["opmn_process_set"]),
                  pl(record["Name"]), pl(var) );

        return sprintf("%s/%s/%s.rrd", record["Host"], pl(table), pl(var) );
}
# function process_record actually does the dirty work of invoking the update script
function process_record() {
        #every record has a timeStamp.ts metric that I should use to update my rrd
        # awk strings are 1-indexed: keep the first 10 characters (epoch seconds)
        ts=substr(record["timeStamp.ts"],1,10);
        for ( var in record ) {
        if ( var != "timeStamp.ts" && record[var] ~ /^[[:digit:]]+$/ ) {
            if ( var ~ /\.(count|completed|time)$/ ) {
                dstype="DERIVE";
            } else {
                if ( var == "responseSize.value" ) {
                    dstype="DERIVE";
                } else {
                    dstype="GAUGE";
                }
            }
            rrdFile=sprintf("/path_to_data/%s",get_rrd_path());
            #### update_metric_rrd is a shell script listed below!!!!!
            cmd=sprintf("/path_to_scripts/update_metric_rrd %s %s %d %d",
                rrdFile,dstype,ts,record[var]);
            system(cmd);
            }
        }
}

# parse_record() populates an hash array
# with all metrics belonging to the table record
function parse_record() {
    #print "RRRR -  START OF RECORD (table " table ")"
    delete record
    while ( ! /^$/ ) {
        # I'm parsing the record as far I'm in this while statement
        # the array hash is the name of the dms metric basename.
        # $1 is the metric name but I have to trim the final ":"
        key=substr($1,1,length($1)-1)
        record[key]=$2
        getline
    }
    # this function is included in funcions.awk:
    # I invoke it to process the record I've just parsed
    process_record();
}
BEGIN {
    # as long as started is 0, I haven't reached the first table yet
    started=0
}

#MAIN
{
    # I skip the first lines until I reach the first table
    if (started==0) {
        while ( ! /^---/ ) {
           # give up if the input ends before any table shows up
           if ( (getline) <= 0 ) exit
        }
        started=1
    }

    # looking for the next occurrence of a table
    # all tables start with:
    # ----------
    # table_name
    # ----------
    if ( /^---/ ) {
        # first table reached: the next row is my table name,
        # then I reach again a dashed line -----
        getline table
        getline trash
        #print ""
        #print "##########################"
        print "  TABLE " table
        #print "##########################"
        next
    }

    if ( ! /^$/ ) {
        # reached a non-empty line: empty lines only mark the end of a record or the end of a table,
        # and a new table is already handled by the previous "if" statement, so I'm starting a new record.
        parse_record()
    }

}

END {
}
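
As a side note, the two decisions that process_record() takes for each metric (which rrd data source type to use, and how to turn the DMS timestamp into something rrdtool accepts) can be sketched in plain bash as below. This is only an illustration: the helper names ds_type_for and epoch_seconds are hypothetical, the sample metric names are made up, and the 13-digit millisecond timestamp is an assumption based on the script keeping just the first 10 characters.

```shell
#!/bin/bash
# Hypothetical helper: mirrors the DERIVE/GAUGE choice made in process_record().
# Metrics ending in .count, .completed or .time (plus responseSize.value)
# are treated as ever-growing counters; everything else is a gauge.
ds_type_for() {
    local metric=$1
    case "$metric" in
        *.count|*.completed|*.time|responseSize.value) echo "DERIVE" ;;
        *) echo "GAUGE" ;;
    esac
}

# Assumption: timeStamp.ts is expressed in epoch milliseconds (13 digits).
# rrdtool wants epoch seconds, so only the first 10 digits are kept.
epoch_seconds() {
    local ts=$1
    echo "${ts:0:10}"
}

# Sample calls (metric names are hypothetical)
ds_type_for "request.completed"
ds_type_for "busyChildren.value"
epoch_seconds "1234567890123"
```

Counters stored as DERIVE are graphed as a rate per second, which is why the request graph further down is labelled "Requests Completed/sec".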

And this is the code for update_metric_rrd:

#!/bin/bash
# usage: update_metric_rrd <rrd file> <DS type> <timestamp> <value>
RRDFILE=$1
DSTYPE=$2
TS=$3
VALUE=$4

# first attempt: if it fails, create the rrd file (when missing) and retry once
rrdtool update "$RRDFILE" "${TS}:${VALUE}"

if [ $? -ne 0 ] ; then
        DIR=$(dirname "$RRDFILE")

        [ -d "$DIR" ] || mkdir -p "$DIR"
        [ -f "$RRDFILE" ] || rrdtool create "$RRDFILE" -b "now-1month" -s 1800 \
                "DS:metric:${DSTYPE}:7200:0:U" \
                RRA:AVERAGE:0.5:1:672 \
                RRA:AVERAGE:0.5:4:1080 \
                RRA:AVERAGE:0.5:12:1460 \
                RRA:AVERAGE:0.5:48:1095 \
                RRA:MAX:0.5:4:1080 \
                RRA:MAX:0.5:12:1460 \
                RRA:MAX:0.5:48:1095 \
                RRA:LAST:0.5:1:672
        rrdtool update "$RRDFILE" "${TS}:${VALUE}"
fi
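
To read the RRA definitions above: each archive keeps "rows" consolidated points, and every point covers "steps" primary steps of 1800 seconds. The small sketch below works out how many days each archive retains; rra_days is a hypothetical helper written for this note, not part of the script.

```shell
#!/bin/bash
# Retention of one RRA in days:
#   step (seconds) * steps per row * rows / 86400 seconds per day
rra_days() {
    local step=$1 steps=$2 rows=$3
    echo $(( step * steps * rows / 86400 ))
}

# The four steps:rows pairs used by update_metric_rrd, with -s 1800
for rra in "1 672" "4 1080" "12 1460" "48 1095"; do
    read -r steps rows <<< "$rra"
    printf '%2d steps x %4d rows = %4d days\n' \
        "$steps" "$rows" "$(rra_days 1800 "$steps" "$rows")"
done
```

So the archives hold roughly two weeks at 30-minute resolution, three months at 2 hours, one year at 6 hours and three years at one point per day.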

Once you have all your rrd files populated, it’s easy to script automatic reporting. You would probably want a graph with the request count served by your Apache cluster, along with its linear regression:

rrdtool graph - -s "end-${hours}hours" -e "$end" \
        -v "Requests Completed/sec" \
        -w 640 -h 240 --slope-mode \
        -t "HTTP Requests for www.ludovicocaldara.net" \
        DEF:1request_completed=/data/wwwserver1/ohs_server/request_completed.rrd:metric:AVERAGE \
        DEF:2request_completed=/data/wwwserver2/ohs_server/request_completed.rrd:metric:AVERAGE \
        CDEF:request_completed=1request_completed,2request_completed,+ \
        VDEF:slope=request_completed,LSLSLOPE \
        VDEF:lslint=request_completed,LSLINT \
        CDEF:reg=request_completed,POP,slope,COUNT,*,lslint,+ \
        LINE1:reg#666666:"Regression" \
        AREA:1request_completed#4040AA:"wwwserver1" \
        AREA:2request_completed#6666FF:"wwwserver2":STACK \
        > mygraph.png

This is the result:
[Graph: OHS request completed]
OHHHHHHHHHHHH!!!! COOL!!!!
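
For the curious, the "Regression" line of the graph command is a straight line y = slope * x + intercept fitted with least squares on the summed request rate. The sketch below shows the same kind of fit on a plain list of numbers. It is only an illustration: lsl_fit is a hypothetical helper, and taking x as the 1-based position of each sample is an assumption of this sketch that may not match rrdtool's internal indexing.

```shell
#!/bin/bash
# Least-squares fit y = slope*x + intercept, with x = 1..n (sample position).
# Reads one value per line on stdin and prints "slope intercept".
lsl_fit() {
    awk '
        { n++; sx += n; sy += $1; sxx += n*n; sxy += n*$1 }
        END {
            # at least two points are needed to fit a line
            if (n < 2) { print "nan nan"; exit }
            slope = (n*sxy - sx*sy) / (n*sxx - sx*sx)
            printf "%.2f %.2f\n", slope, (sy - slope*sx) / n
        }'
}

# Values growing by 2 at every sample: slope 2, intercept 0
printf '%s\n' 2 4 6 8 | lsl_fit
```

A positive slope over a long enough time window is the hint that the request rate is growing, which is the whole point of drawing it for capacity planning.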

That’s all for DMS capacity planning. Stay tuned, more about rrdtool is coming!

More about Dataguard and how to check it

Posted on February 6, 2009 by Ludovico

After my post Quick Oracle Dataguard check script, I have a few considerations to add.
To check the gap of the redo stream applied by the MRP0 process, it’s enough to replace this query in the perl script I posted:

 select SEQUENCE#, BLOCK# from v\$managed_standby
        where process='RFS' and client_process='LGWR'

with this new one:

 select SEQUENCE#, BLOCK# from v\$managed_standby
        where process='MRP0'

To check this, one condition must be met: real-time apply has to be enabled (possibly with the NODELAY clause specified in your recover statement). Check it with this query:

SELECT RECOVERY_MODE FROM V$ARCHIVE_DEST_STATUS;

It should be “MANAGED REAL TIME APPLY”.
Without real-time apply, your MRP0 process waits until a new archived log is available, so even with the redo transport mode set to LGWR you’ll wait for the standby log to be completed. The gap of your applied redo stream will then be at least one sequence#.

With transport mode set to LGWR and real-time apply the output of the perl script is similar to this one:

# ./checkDataGuard.sh
PROD   :       1230      20631
STANDBY:       1230      20613
18         blocks gap

The whole gap between your primary and standby database should be LOW.
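
The comparison behind that output can be sketched as follows. This is not the code of the perl script: dg_gap is a hypothetical helper, and reporting the difference in sequences when the two sides are not on the same sequence# is an assumption made for this example.

```shell
#!/bin/bash
# Hypothetical helper: compares the (SEQUENCE#, BLOCK#) pair read on the
# primary with the one read on the standby.
dg_gap() {
    local p_seq=$1 p_blk=$2 s_seq=$3 s_blk=$4
    if [ "$p_seq" -eq "$s_seq" ]; then
        # same log sequence: the gap is a matter of redo blocks
        echo "$(( p_blk - s_blk )) blocks gap"
    else
        # standby still on an older sequence: block numbers are not comparable
        echo "$(( p_seq - s_seq )) sequences gap"
    fi
}

# Values taken from the sample output above
dg_gap 1230 20631 1230 20613
```

With the values shown earlier, this prints the same "18 blocks gap" figure.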